Data Science Course: R Basics (My Harvard edX Experience)
Last month, I enrolled in the first of nine courses in Harvard University's Professional Certificate in Data Science via edX. I completed the course in 3 weeks with a passing grade and received a verified certificate. This is my experience.
First, I should note that while the course information stated that no prior programming experience was needed, I went in with some experience. That said, this course teaches you how to use the R programming language, which I had never used before.
I initially enrolled just to audit the course, so I could get a quick introduction to R and see if it was something I might like. I have found over the years that, when it comes to technology tools and platforms, the best thing to do is plenty of research and a free test run before committing to a purchase. I went through the first module, did some internet research on R, and decided to upgrade to the verified track. I believe it was being offered at a lower rate than usual at the time (about $37).
Per the syllabus,
[The program] covers concepts such as probability, inference, regression and machine learning and develops skill sets such as R programming, data wrangling with dplyr, data visualization with ggplot2, file organization with Unix, version control with GitHub, and reproducible document preparation with RStudio.
This all sounded like fun to me!
As for graded assignments, there are nine hands-on programming exercises using DataCamp, and four sets of quiz exercises on edX. Together these make up 100% of your grade, and you must finish the class with at least 70% to earn the verified certificate. (The final set of exercises is available to verified learners only.)
This first course in the program covered the following content:
Section 1: R Basics, Functions, and Data Types
Get started with R and learn about R's functions and data types.
Section 2: Vectors and Sorting
Learn to operate on vectors and use advanced functions such as sorting.
Section 3: Indexing, Data Manipulation, and Plots
Learn to wrangle, analyze and visualize data.
Section 4: Programming Basics
Learn to use general programming features like 'if-else' and 'for loop' commands, and write your own functions to perform various operations on datasets.
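To give a feel for what the Section 4 topics look like in R, here is a minimal sketch. It is not taken from the course materials, and the function names `sum_to_n` and `describe_size` (and the cutoff of 50) are made up purely for illustration.

```r
# Hypothetical example: write your own function that uses a for loop.
# Adds up the integers 1, 2, ..., n.
sum_to_n <- function(n) {
  total <- 0
  for (i in seq_len(n)) {   # seq_len(n) is 1..n, and empty when n is 0
    total <- total + i
  }
  total                     # the last expression is the return value
}

# Hypothetical example: an if-else inside a function.
# The cutoff of 50 is arbitrary, chosen only for the demo.
describe_size <- function(n) {
  if (sum_to_n(n) > 50) {
    "large"
  } else {
    "small"
  }
}

sum_to_n(10)       # 55
describe_size(10)  # "large"
describe_size(3)   # "small"
```

Note the use of `<-` for assignment: nothing is stored unless you assign it to a name, and a function returns the value of its last expression.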
The first section was pretty simple (even for a noob) and took just a few hours to get through. It entailed downloading RStudio and learning some of the most basic concepts and terminology. There were a number of video lectures and some textbook reading. Once you've gone through all the section content, a link directs you to DataCamp to complete the interactive exercises for grading.
I did really well on the first assessment and proceeded to the next section. I completed Sections 1 and 2 in less than a week. Sections 3 and 4 took a bit longer because I found them more challenging (and because I didn't log in for like 6 days straight).
So here are my key takeaways from the course.
I thought the instructor, Rafael Irizarry, Professor of Biostatistics at the Harvard T.H. Chan School of Public Health, was very effective. He spoke clearly, succinctly, and slowly enough that even a complete beginner would be able to follow along.
Downloading R (free, from CRAN) and RStudio (the free desktop version), and using DataCamp, was simple thanks to the detailed instructions and the user-friendly design.
While Professor Irizarry makes the content sound easy, there were moments where I was confused, and it was anything but easy. One of the hardest parts, for me, was knowing which dataset I was supposed to be working in, and whether or not I needed to assign something to a variable.
"Sorting" and "indexing" gave me the most trouble. In contrast, "vectors", "vector arithmetic", and "for loops" were the easiest for me. (I have no idea if that says anything about me or not.)
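For readers who haven't seen these terms, here is a small sketch of the difference between sorting and indexing, next to the vector arithmetic that felt easier. The `scores` vector is made-up data, and this is not necessarily how the course presents the material; it only uses base R functions.

```r
# Made-up data for illustration: a named numeric vector.
scores <- c(ana = 72, ben = 95, cal = 60, dee = 88)

# Sorting: sort() returns the values, order() returns the positions.
sort(scores)                   # 60 72 88 95 (cal, ana, dee, ben)
order(scores)                  # 3 1 4 2
names(scores)[order(scores)]   # "cal" "ana" "dee" "ben"
rank(scores)                   # 2 4 1 3 (ana, ben, cal, dee)

# Indexing: a logical vector picks out the entries that are TRUE.
passed <- scores >= 70         # nothing is saved unless you assign it
scores[passed]                 # 72 95 88 (ana, ben, dee)
which(passed)                  # 1 2 4
names(scores)[which.max(scores)]  # "ben"

# Vector arithmetic: the operation applies to every element at once.
scores / 100                   # 0.72 0.95 0.60 0.88
```

The part that tends to trip people up is that `order()` gives back positions rather than values, so its result is meant to be used as an index into the same vector or into a related one.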
The videos were short enough to be easily digested, but long enough to cover the subject thoroughly. I did not have to do a whole lot of textbook review in order to complete the quizzes; I referenced the text just a couple of times. The videos follow pretty much the same script as the text, so if you are the type of person who prefers to learn by reading, the free textbook will work fine.
By the end of the course, I didn't hate R or RStudio, so I may continue on with the next course in the certificate. I am still undecided. Why, you ask? See below.
I definitely enjoyed the course; however, I am not sure of its value. I chose it because it's Harvard, and with that, I expected a certain level of rigor, even if it is a MOOC. It just seemed too easy, and I can't help but feel like I could have learnt the same stuff elsewhere on the internet. In fact, I know I could have. Yet it was only about $40. And honestly, I have taught myself how to use so many tools, platforms, and software applications over the years that I just didn't feel like it this time. I wanted to be taught. If that's what you are looking for, then I recommend this course for learning about data science, R, and RStudio. It got the job done. I know there are other courses out there, so do your due diligence and choose the program or course that is right for you. The beauty of MOOCs is that you get to test drive them for free.