Technical Information

Requirements

  • Web application: fully client-side and platform-agnostic

  • Persistent database: stored locally in IndexedDB, in a path such as %LocalAppData%\Google\Chrome\User Data\Default\IndexedDB

  • Input: edX metadata files, edX daily log files in gzip format, as obtained from the institution’s data czar, who in turn obtains everything from edX

  • Outputs: CSV files (one for each table: sessions, video_interactions, quiz_sessions, etc.), quick indicators and metrics, and insight graphs
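
As a rough illustration of the input and output requirements above, the following minimal sketch parses the text of one already decompressed daily log file (one JSON object per line) and serializes the resulting rows to CSV. The function names parseLogLines, csvField and toCsv are hypothetical, the sample events are invented for the example, and gzip decompression is assumed to have happened before this step; this is not the application's actual code.

```javascript
// Parse the text of one decompressed daily log file.
// Assumption: each non-empty line holds one JSON event object.
function parseLogLines(text) {
  const events = [];
  for (const line of text.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue;
    try {
      events.push(JSON.parse(trimmed));
    } catch (err) {
      // Skip lines that are not valid JSON instead of aborting the whole file.
    }
  }
  return events;
}

// Quote a CSV field when it contains a comma, a double quote, or a line break.
function csvField(value) {
  const s = value === null || value === undefined ? "" : String(value);
  return /[",\r\n]/.test(s) ? '"' + s.replace(/"/g, '""') + '"' : s;
}

// Serialize an array of row objects into CSV text with a header line.
function toCsv(rows, columns) {
  const lines = [columns.map(csvField).join(",")];
  for (const row of rows) {
    lines.push(columns.map((c) => csvField(row[c])).join(","));
  }
  return lines.join("\n");
}

// Invented sample input: two valid events and one malformed line.
const sample = [
  '{"username": "alice", "event_type": "play_video", "time": "2019-01-01T10:00:00"}',
  "not json",
  '{"username": "bob", "event_type": "pause_video", "time": "2019-01-01T10:05:00"}',
].join("\n");

const events = parseLogLines(sample);
const csv = toCsv(events, ["username", "event_type", "time"]);
console.log(csv);
```

Skipping malformed lines rather than throwing is a design choice for this sketch: a single truncated line does not stop the rest of the file from being processed.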

Development Choices/Challenges

  • Website deployment

    • Currently GitHub Pages
  • Uploading files to browser using FileReader

    • In gzip format, as obtained from Data Czar/edX
  • Local processing

    • Efficient? Good enough!

    • JS vs. Python: translate or transpile? Definitely translate to JS (Python-in-the-browser was more than 200 times slower!)

  • Translation Script: Python vs JavaScript

    • Winner: Use Transcrypt or Javascripthon to transform the translation module from Python to JavaScript, then correct, adapt and add I/O modules

    • Runner-up: Use Brython to run the translation module in Python in the browser and handle I/O with JavaScript

    • Not necessary: Write from scratch in JavaScript

  • Database Tools

    • JsStore + SQLWeb + IndexedDB: SQLWeb parses regular SQL queries into API calls for JsStore, which is an SQL-focused wrapper for IndexedDB, a local key-value store available in most modern browsers. This might be problematic when adding new rows from logs, and for the metrics queries (especially JOINs) already written in SQL.
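
To show why JOINs need extra care on top of a key-value store, here is a minimal sketch of an inner join done by hand over two arrays of rows, as they might be read back from two object stores. It uses plain JavaScript only and does not use the JsStore or SQLWeb APIs. The hashJoin helper, the course_learner_id key and the sample rows are all hypothetical.

```javascript
// Inner join of two row arrays on a shared key, using a Map as the index.
// Assumption: rows are plain objects that both carry the join key.
function hashJoin(left, right, key) {
  // Index the right-hand rows by key so each left row is matched in O(1).
  const index = new Map();
  for (const row of right) {
    if (!index.has(row[key])) index.set(row[key], []);
    index.get(row[key]).push(row);
  }
  // Emit one merged row per matching pair; left rows without a match are dropped.
  const joined = [];
  for (const l of left) {
    for (const r of index.get(l[key]) || []) {
      joined.push({ ...r, ...l });
    }
  }
  return joined;
}

// Invented sample rows standing in for two of the tables.
const sessions = [
  { course_learner_id: "c1_u1", duration: 120 },
  { course_learner_id: "c1_u2", duration: 30 },
];
const videoInteractions = [
  { course_learner_id: "c1_u1", video_id: "v1" },
  { course_learner_id: "c1_u1", video_id: "v2" },
  { course_learner_id: "c1_u3", video_id: "v1" },
];

const joined = hashJoin(sessions, videoInteractions, "course_learner_id");
console.log(joined);
```

Every join in an existing SQL metrics query would need either this kind of in-memory step or equivalent support from the wrapper library, which is the concern raised in the Database Tools item above.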

Existing Alternatives for edX Analytics

  • edX Analytics and Insights: Proprietary pipeline for the edX analytics dashboard and edX Insights

  • Event log parser pyedx: Written in Python, it parses the edX log file into CEDE (Center for Digital Education) format JSON, used by the École Polytechnique Fédérale de Lausanne (EPFL)

  • edX Student and Course Analytics and Visualization Pipeline: Set of R scripts that process edX Data Package files to create tables and learning trajectory visualizations, from the Cyberinfrastructure for Network Science Center (CNS) at Indiana University

  • edX Tools: Set of (easy, medium and advanced) tools created by various universities and research institutions for different purposes, from RSS generators to data parsers into SQL or MongoDB (some look old or obsolete).