Data Science

R, Python, Machine Learning and AI


  • CS109 Data Science: Such an excellent course! It is a general yet fairly advanced course that introduces students to many data science topics, including web scraping, statistical inference, machine learning (regression, regularization, classification, SVM, decision trees, ensemble methods), Bayesian statistics, big data (MapReduce, Spark) and visualization (Tableau). Homework from previous years is available for practice.
  • The Analytics Edge on edX: Strongly recommended for anyone who wants to learn R programming! It is filled with case studies and tons of exercises for practice.
  • Data Science Specialization on Coursera: Not recommended for beginners. It is a good introduction to the concepts in data science, but there are not enough exercises, and some of the courses are difficult to follow without some background. Xing wrote excellent R Markdown course notes for this specialization that everyone planning to start it should check out.
  • Introduction to Statistical Learning by Trevor Hastie and Rob Tibshirani
  • Machine Learning by Andrew Ng on Coursera: One of the best courses ever! He explains machine learning concepts in an elegant, easy-to-understand way. It would be better if this course were taught in Python, though. I know Octave is excellent at matrix manipulation and more research-oriented, but I’m not sure I will ever use it.
  • Machine Learning Specialization on Coursera: Great course! It is taught by two Amazon professors. In the first course of this specialization, they explain all the concepts (regression, classification, clustering and recommender systems) in a light way. There are not many hardcore algorithms yet; according to their agenda, those come in the following courses.



Advanced Topics