Learn the basic components of building and applying prediction functions with an emphasis on practical applications. This is the eighth course in the Johns Hopkins Data Science Specialization.
Prediction and machine learning are among the most common tasks performed by data scientists and data analysts. This course covers the basic components of building and applying prediction functions, with an emphasis on practical applications. It provides basic grounding in concepts such as training and test sets, overfitting, and error rates. It also introduces a range of model-based and algorithmic machine learning methods, including regression, classification trees, Naive Bayes, and random forests. The course covers the complete process of building prediction functions, including data collection, feature creation, algorithms, and evaluation.
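As a rough illustration of training and test sets, overfitting, and error rates, here is a minimal sketch in Python using only the standard library. It is not course material. The synthetic data, the 20% label noise, the 75/25 split, and the helper names (`make_data`, `train_test_split`, `predict_1nn`, `error_rate`) are all assumptions chosen for the example. A one-nearest-neighbor predictor memorizes its training data, so its training error is zero by construction, while its error on held-out test data is typically higher.

```python
import random


def make_data(n, rng):
    """Synthetic data (an assumption for this sketch): the label is 1 when
    x > 0.5, and about 20% of labels are flipped to act as noise."""
    data = []
    for _ in range(n):
        x = rng.random()
        label = int(x > 0.5)
        if rng.random() < 0.2:
            label = 1 - label
        data.append((x, label))
    return data


def train_test_split(data, test_fraction, rng):
    """Shuffle a copy of the data and hold out a fraction as the test set."""
    shuffled = data[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def predict_1nn(train, x):
    """Predict the label of the closest training point (1-nearest neighbor)."""
    nearest = min(train, key=lambda point: abs(point[0] - x))
    return nearest[1]


def error_rate(train, data):
    """Fraction of points in `data` that the predictor gets wrong."""
    wrong = sum(1 for x, label in data if predict_1nn(train, x) != label)
    return wrong / len(data)


rng = random.Random(0)  # fixed seed so the run is repeatable
data = make_data(200, rng)
train, test = train_test_split(data, 0.25, rng)

train_error = error_rate(train, train)  # each point is its own nearest neighbor
test_error = error_rate(train, test)    # estimated on data the model never saw

print(f"training error: {train_error:.2f}")
print(f"test error:     {test_error:.2f}")
```

The gap between the two numbers is the practical signature of overfitting: the training error alone says little about how a predictor will perform on new data, which is why the error rate is estimated on a held-out test set.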
Syllabus
Upon completion of this course, you will understand the components of a machine learning algorithm, know how to apply multiple basic machine learning tools, and be able to use these tools to build and evaluate predictors on real data.
Recommended Background
The Data Scientist’s Toolbox,
R Programming,
Regression Models, and
Exploratory Data Analysis
Course Format
Weekly lecture videos and quizzes and a final project that will be both objectively assessed and peer graded.
As part of this class you will be required to set up a GitHub account. GitHub is a tool for collaborative code sharing and editing. During this course and other courses in the Specialization, you will submit links to files you have publicly placed in your GitHub account as part of peer evaluation. If you are concerned about preserving your anonymity, you will need to set up an anonymous GitHub account and be careful not to include any information you do not want made available to peer evaluators.
FAQ
Will there be more Data Science Specialization sessions after December 2015?
Yes, the specialization is moving to the new Coursera platform in January 2016.
Will my current Data Science Specialization progress carry over to the new platform?
Yes, the certificates you earned in the current platform will still be valid after the move to the new platform in January 2016.
How do the courses in the Data Science Specialization depend on each other?
We have created a handy course dependency chart to help you see how the nine courses in the specialization depend on each other.
Will I get a Statement of Accomplishment after completing this class?
Free statements of accomplishment are not offered in this course. If you are not enrolled in Signature Track, participation and performance documentation will be reported on your Accomplishments page, but you will not receive a signed statement of accomplishment.
What resources will I need for this class?
Students must have an active GitHub account and the latest version of R and RStudio installed.
How does this course fit into the Data Science Specialization?
This is the eighth course in the sequence. Although it isn't a requirement, we recommend that you first take The Data Scientist’s Toolbox, R Programming, Regression Models, and Exploratory Data Analysis.