Artificial Intelligence: A Case for Optimism

In this essay, I advance three claims:

  1. It is theoretically possible for us to engineer an AGI and superintelligence.
  2. However, it is extremely unlikely that we will create an AGI in the foreseeable future.
  3. The proliferation of narrow AI applications will cause significant structural changes to the global economy long before AGI is feasible. While it’s difficult to predict whether narrow AI’s impact will be net positive or net negative for society, there is cause for optimism.

Read more

Staggered vs. All-At-Once Content Release in Massive Open Online Courses: Evaluating a Natural Experiment

This paper, based on a chapter of my undergraduate thesis, was accepted and published at Learning at Scale 2015. I was lucky enough to get to travel to Vancouver and present my findings at the conference.

The paper addresses two research questions:

  1. Does releasing all content at the beginning of the course (rather than sequentially) lead to more variation in student progress and less “ontrackness” (as measured by the course’s recommended schedule)? Short answer: Yes, but under both content release strategies the vast majority of students proceed at individualized and off-track paces.

  2. Are there benefits to staying on-track? Short answer: Staying on-track has a modest positive correlation with certification, but such modest benefits to staying on-track must be weighed against the benefits of allowing students flexibility in how they move through the course. Releasing content upfront and all-at-once appears to be a viable strategy for MOOC designers.

I owe a huge thank you to Justin Reich, without whom this project would never have gotten off the ground.

Read the paper

HarvardX & MITx Working Papers

During my junior year at Harvard, I started working with the HarvardX and MITx research teams to clean and analyze data generated by students taking courses on edX, an open-source platform for massive open online courses (MOOCs).

I contributed to the HarvardX-Tools repo with a set of Python scripts to extract, parse, and sanitize data from edX clickstream tracking logs. I was also a co-author on the first 15 HarvardX and MITx Working Papers, which report general statistics and early findings from Harvard and MIT’s first wave of MOOCs on the edX platform (Fall 2012 to Summer 2013).

All in all, I got very comfortable with Python, NumPy, pandas, Matplotlib, and scikit-learn. I had tons of fun and learned a lot from the experience, and I can’t thank the HarvardX and MITx research groups enough for welcoming me.

Read the summary report

Making Sense of MOOCs: A Reconceptualization of HarvardX Courses and Their Students

My 118-page undergraduate thesis argues that massive open online courses (MOOCs) cannot be properly understood using conventional educational metrics and definitions, and that students, instructors, university leaders, and policymakers must be wary in viewing this new technology through the lenses of the past.

The thesis contextualizes MOOCs within a rich history of distance learning and open courseware, and proposes new “reconceptualizations” of retention and asynchronicity in MOOCs that can help us evaluate their efficacy and worth. As an empirical study, it draws on unique datasets derived from edX clickstream event logs for six early HarvardX courses.

Building and testing these datasets was the most labor-intensive part of this study: clickstream logs were large (~10 GB per course), non-standardized (cue lots of regex), and not well documented (at the time). Thus, I spent most of my time munging away in IPython notebooks, as well as digging through edX’s source code to verify which student actions correspond to which events.
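To give a flavor of that kind of munging, here is a minimal sketch in Python. It is not the thesis code. It assumes each log line is a JSON object with `username` and `event_type` fields, and that some event types are raw course URLs that need collapsing with a regex. The URL pattern, the `url:` label, and the function names are all hypothetical and chosen only for illustration.

```python
import json
import re
from collections import Counter

# Hypothetical pattern: treat event types shaped like
# "/courses/<org>/<course>/<run>/<rest>" as page requests.
COURSE_URL = re.compile(r"^/courses/[^/]+/[^/]+/[^/]+/(?P<rest>.*)$")


def normalize_event_type(event_type):
    """Collapse course-specific URLs into a generic label (assumed scheme)."""
    match = COURSE_URL.match(event_type)
    if match:
        first_segment = match.group("rest").split("/", 1)[0]
        return "url:" + (first_segment or "root")
    return event_type


def count_events(lines):
    """Count (username, normalized event type) pairs, skipping bad lines.

    Returns the counts and the number of lines that were skipped because
    they were not valid JSON objects or lacked a username or event type.
    """
    counts = Counter()
    skipped = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            skipped += 1
            continue
        if not isinstance(record, dict):
            skipped += 1
            continue
        user = record.get("username")
        event_type = record.get("event_type")
        if not user or not event_type:
            skipped += 1
            continue
        counts[(user, normalize_event_type(event_type))] += 1
    return counts, skipped
```

Because `count_events` accepts any iterable of strings, a multi-gigabyte log can be passed in as an open file handle and processed one line at a time rather than loaded into memory.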

Read the paper