This is a collection of interesting sites and useful references on the broader topics of data science and machine learning. I’m creating it partly to keep track of the things I find as I delve into ML, but also as a resource for others who might be interested.
General Resources
- https://datajobs.com/data-science-repo/
- “A central knowledge resource for data scientists/analytics experts”
- links to tons of material organized by topic
- Awesome Data Science
- “… a repository to learn and apply towards solving real world problems”
- Machine Learning Mastery
- a plethora of amazing material. Easy to read. Just go and read!
Online Courses
- Andrew Ng’s Machine Learning course on Coursera
- I’m currently going through this course. The materials are excellent, and Ng explains the topics clearly. He uses Octave throughout the course rather than R or Python. The course is fairly advanced in the sense that you need to be comfortable with math.
- Data Science and Machine Learning with Python – Hands On – Udemy
- I’m going through this course as well. For me, the material is somewhat basic. However, the tutorials and such are in Python, which is something I’m not too familiar with. If you’re totally new to ML, this could be a great place to start.
Tools
Languages
- R Project
- R Tutorials by Dr. William King
- Introduction to R
Courses with Material Online
Hadoop Cluster on Raspberry Pis
Out of an interest in exploring Hadoop and the related technologies, I’m planning to build a cluster using the nifty Raspberry Pi computers. Here are some tutorials and how-tos I’ve collected as guidance.
- The $300 Raspi Hadoop Cluster
- Raspberry Pi 2 Hadoop 2 Cluster
- A Raspberry Pi Hadoop Cluster with Apache Spark on YARN: Big Data 101
- Create an Enclosure for a 6-Node Raspberry Pi Cluster
Data Sets by Category
Regression
Classification
Time Series
- Beijing PM2.5 Data Set – This data set reports the weather and the pollution level each hour for five years at the US embassy in Beijing, China. Used in example here.
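
For an hourly data set like this one, a common first step is to build a proper timestamp index and aggregate to daily values. The following is a minimal sketch in Python with pandas, not a definitive loader. It assumes the CSV has separate `year`, `month`, `day`, and `hour` columns plus a `pm2.5` and a `TEMP` column; those column names, the `load_hourly` helper, and the sample rows (which are made up purely for illustration) are all assumptions, so check them against the actual file before relying on this.

```python
import io

import pandas as pd

# Hypothetical sample rows in the assumed layout. The values are invented
# for illustration and are NOT taken from the real data set.
SAMPLE_CSV = """year,month,day,hour,pm2.5,TEMP
2010,1,1,0,,-11.0
2010,1,1,1,,-12.0
2010,1,2,0,129.0,-4.0
2010,1,2,1,148.0,-4.0
"""


def load_hourly(source):
    """Read hourly records and index them by a single timestamp column."""
    df = pd.read_csv(source)
    # pandas can assemble datetimes from year/month/day/hour columns.
    df["timestamp"] = pd.to_datetime(df[["year", "month", "day", "hour"]])
    return df.set_index("timestamp").drop(columns=["year", "month", "day", "hour"])


hourly = load_hourly(io.StringIO(SAMPLE_CSV))

# Aggregate hourly readings into daily means; missing readings are skipped,
# and a day with no readings at all stays NaN.
daily = hourly.resample("D").mean()
print(daily)
```

To try this on the real file, pass its path to `load_hourly` instead of the in-memory sample. Daily means are only one choice of aggregation; a time-series model may need the hourly values, with missing pollution readings handled explicitly.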