First things first:

  • A list of background music. Link.
  • Sketching. Link.


Data Science




Quick Shortcuts

IPython notes for learning

Lots of quick & interesting slides

Data Scientist Workbench:

It’s a free, all-in-one environment for anyone who wants to do data analysis. The Data Scientist Workbench includes:

  • OpenRefine to clean up messy data.
  • Jupyter notebooks supporting Python, R, and Scala (with access to Apache Spark for Big Data processing).
  • Apache Zeppelin notebooks.
  • RStudio in your browser.

QuickSlides on NLTP – Natural Language Text Processing


Kaggle Tips:


Related reading:

Part 1 of this blog post series: Orientation

Part 2b: Ranking and regression metrics

Part 3: Validation and offline testing

Part 4: Hyperparameter tuning

Part 5: A/B testing

Tom Fawcett’s 2006 Pattern Recognition Letters paper, “An Introduction to ROC Analysis.”

Chapter 7 of Data Science for Business discusses expected value as a classification metric, especially for skewed data sets.
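
To make the expected value idea concrete, here is a minimal Python sketch. It is not taken from the book: the function name, the confusion-matrix counts, and the payoff figures are all made up for illustration. The idea is to weight each confusion-matrix cell by how often it occurs and by what that outcome is worth, then sum.

```python
def expected_value(confusion, payoffs):
    """Average payoff per instance: sum over cells of P(cell) * value(cell)."""
    total = sum(confusion.values())
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return sum(count / total * payoffs[cell] for cell, count in confusion.items())


def accuracy(confusion):
    """Fraction of instances classified correctly."""
    return (confusion["tp"] + confusion["tn"]) / sum(confusion.values())


# Hypothetical skewed data set: 100 positives among 10,000 instances.
# Classifier A flags some instances; classifier B predicts "negative" for everything.
classifier_a = {"tp": 60, "fn": 40, "fp": 300, "tn": 9600}
classifier_b = {"tp": 0, "fn": 100, "fp": 0, "tn": 9900}

# Hypothetical payoffs: a caught positive is worth 100, a false alarm costs 1.
payoffs = {"tp": 100.0, "fn": 0.0, "fp": -1.0, "tn": 0.0}

for name, cm in [("A", classifier_a), ("B", classifier_b)]:
    print(name, "accuracy:", accuracy(cm), "expected value:", expected_value(cm, payoffs))
```

With these invented numbers, classifier B has the higher accuracy but never earns anything, while classifier A has the higher expected value. That gap is the reason accuracy alone can mislead on skewed data.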

Research Articles

Note: This post was updated on April 16, 2015. Thanks to @aatallah for demystifying the origin of the name “ROC curve,” and to Joe McCarthy for the helpful references.

PDF/Slides Generator for presentations