This is the project proposal for the final project. Proposals should be for projects that combine one or more of the following three forms:

  1. Implement a non-trivial data analysis method for a problem of your choosing.
  2. Design a data science algorithm and rigorously analyze it.
  3. Prove a theorem about an existing data science method.

The project proposal should be 2-3 pages long and typeset in LaTeX in an appropriate conference format (e.g., the IEEE conference template).

Note: there should be interesting mathematical content in your project. This means that I expect you to do more than take publicly available data and throw it into an existing machine learning pipeline! Although those kinds of projects are perfectly adequate at some conferences, they are not suitable for this class.

I encourage you to take on an ambitious project. If you propose an ambitious project that doesn't work out, you can either (1) fall back to doing an interesting synthesis/review of your chosen topic, or (2) write it up as a negative result, describing why the approach you tried failed and what you might do differently next time. I will not penalize you if an ambitious project fails, so long as you put in a good-faith effort, learned something from the process, and can articulate it in paper/presentation form.

To help you out, I've also included some examples of appropriate projects below. Note that if you use one of my project ideas, I will expect to be a co-author on any publications after the class ends. More generally, if I contribute to the project (e.g., by giving significant amounts of advice), I expect to be a co-author. However, if you do everything yourself, you will of course get sole credit.

Project ideas