Links

First things first:

  • A list of background music. Link.
  • Sketching. Link.

 

Data Science

Git

Python

Statistics

Quick Shortcuts

IPython notes for learning

Lots of quick & interesting slides

Data Scientist Workbench:

It’s a free all-in-one solution for people interested in performing data analysis. The Data Scientist Workbench includes:

  • OpenRefine to clean up messy data.
  • Jupyter notebooks supporting Python, R, and Scala (with access to Apache Spark for Big Data processing).
  • Apache Zeppelin notebooks.
  • RStudio in your browser.

https://my.datascientistworkbench.com/

QuickSlides on NLTP – Natural Language Text Processing

 

Kaggle Tips:

 

Related reading:

Part 1 of this blog post series: Orientation

Part 2b: Ranking and regression metrics

Part 3: Validation and offline testing

Part 4: Hyperparameter tuning

Part 5: A/B testing

Tom Fawcett’s 2006 Pattern Recognition Letters paper, “An Introduction to ROC Analysis.”

Chapter 7 of Data Science for Business discusses expected value as a classification metric, especially for skewed data sets.
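
As a rough illustration of the expected value idea mentioned above, here is a minimal sketch. It assumes the usual definition: weight each confusion-matrix cell's value by how often that cell occurs, then sum. The function name, the counts, and the per-cell values are all made up for this example and are not taken from the book.

```python
def expected_value(counts, values):
    """Average value per decision.

    counts[i][j] is the number of cases with actual class i and predicted
    class j; values[i][j] is the benefit (positive) or cost (negative)
    assigned to that outcome. Both layouts are assumptions of this sketch.
    """
    total = sum(sum(row) for row in counts)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return sum(
        (count / total) * value
        for count_row, value_row in zip(counts, values)
        for count, value in zip(count_row, value_row)
    )


# Hypothetical skewed data set: 60 positives out of 1000 cases.
# Rows are actual (positive, negative); columns are predicted (positive, negative).
counts = [[50, 10],
          [5, 935]]
# Hypothetical values: a true positive earns 99, a false positive costs 1.
values = [[99.0, 0.0],
          [-1.0, 0.0]]

print(round(expected_value(counts, values), 3))
```

With skewed classes, a classifier that always predicts the majority class can score high on accuracy while earning nothing under this metric, which is why weighting outcomes by value can be more informative.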

Research Articles

Note: This post was updated on April 16, 2015. Thanks to @aatallah for demystifying the origin of the name “ROC curve,” and to Joe McCarthy for the helpful references.

PDF/Slides Generator for presentations