# Data Science resources

This is a list of free online resources for learning data science (programming, machine learning, and statistics), inspired by many similar lists, for example this one by Carl Anderson and this one by Yu Wu. It includes textbooks (published online or available as a free PDF), YouTube videos, and online courses (MOOCs). As always with such lists, your mileage may vary.

### Introductory

James, Witten, Hastie & Tibshirani (2013) “An Introduction to Statistical Learning, with Applications in R” Springer.

Thomas (2018) “Mathematics for Machine Learning”

Irizarry (2019) “Introduction to Data Science: Data Analysis and Prediction Algorithms with R”

Welling (2010) “A First Encounter with Machine Learning”

Daumé III (2017) “A Course in Machine Learning”

### R Programming

Wickham & Grolemund (2017) “R for Data Science: Import, Tidy, Transform, Visualize, and Model Data” O’Reilly.

Wickham (2nd ed., 2019) “Advanced R” Chapman & Hall/CRC Press.

Wickham (2nd ed., 2015) “ggplot2: Elegant Graphics for Data Analysis”

Lovelace, Nowosad & Muenchow (2019) “Geocomputation with R” CRC Press.

### Python Programming

Downey (2nd ed., 2014) “Think Stats: Exploratory Data Analysis in Python” O’Reilly.

Adhikari & DeNero “Computational and Inferential Thinking: The Foundations of Data Science”

Sklearn basics (Jupyter notebook)

Plotting and Visualization in Python (Jupyter notebook)

### More Advanced

Hastie, Tibshirani & Friedman (2nd ed., 2009) “The Elements of Statistical Learning: Data Mining, Inference, and Prediction” Springer.

Goodfellow, Bengio & Courville (2016) “Deep Learning” MIT Press.

Peyré (2019) “Mathematical Foundations of Data Sciences”

McElreath (2015; 2nd ed. 2020) “Statistical Rethinking: A Bayesian Course with Examples in R and Stan” Chapman & Hall/CRC Press. Lecture videos are available on YouTube.

Wikle, Zammit-Mangion & Cressie (2019) “Spatio-Temporal Statistics with R” Chapman & Hall/CRC Press.

Collins II (2003) “Fundamental Numerical Methods and Data Analysis”

Leskovec, Rajaraman & Ullman (3rd ed., 2020) “Mining of Massive Datasets” Cambridge University Press.

Hyndman & Athanasopoulos (2nd ed., 2018) “Forecasting: Principles and Practice” OTexts.

Blitzstein & Hwang (2nd ed., 2019) “Introduction to Probability” CRC Press.

Petersen & Pedersen (2012) “The Matrix Cookbook”

### Courses

fast.ai (Jeremy Howard & Rachel Thomas)

Deep Learning Specialization (Andrew Ng, Coursera)

Intro to Hadoop and MapReduce (Udacity)

Statistical Learning (Trevor Hastie & Rob Tibshirani, Stanford Online)

Linear Algebra (Gilbert Strang, MIT OCW)