Natural Language Processing

Michael Collins, Columbia University

Have you ever wondered how to build a system that automatically translates between languages? Or a system that can understand natural language instructions from a human? This class will cover the fundamentals of mathematical and computational models of language, and the application of these models to key problems in natural language processing.

Natural language processing (NLP) deals with the application of computational models to text or speech data. Application areas within NLP include automatic (machine) translation between languages; dialogue systems, which allow a human to interact with a machine using natural language; and information extraction, where the goal is to transform unstructured text into structured (database) representations that can be searched and browsed in flexible ways. NLP technologies are having a dramatic impact on the way people interact with computers, on the way people interact with each other through the use of language, and on the way people access the vast amount of linguistic data now in electronic form. From a scientific viewpoint, NLP involves fundamental questions of how to structure formal models (for example statistical models) of natural language phenomena, and of how to design algorithms that implement these models.

In this course you will study mathematical and computational models of language, and the application of these models to key problems in natural language processing. The course focuses on machine learning methods, which are widely used in modern NLP systems: we will cover formalisms such as hidden Markov models, probabilistic context-free grammars, log-linear models, and statistical models for machine translation. The curriculum closely follows a course currently taught by Professor Collins at Columbia University, and previously taught at MIT.


Topics covered include:

1. Language modeling.
2. Hidden Markov models, and tagging problems.
3. Probabilistic context-free grammars, and the parsing problem.
4. Statistical approaches to machine translation.
5. Log-linear models, and their application to NLP problems.
6. Unsupervised and semi-supervised learning in NLP.

Recommended Background

You should have a basic knowledge of probability (e.g., random variables and independence assumptions), a basic knowledge of algorithms, and a basic knowledge of calculus (e.g., how to differentiate simple functions).

Suggested Readings

The course will be largely self-contained, with comprehensive lecture notes posted together with the lectures.

Course Format

The class will consist of lecture videos, which are broken into small chunks, usually between eight and twelve minutes each. Some of these may contain integrated quiz questions. There will also be standalone quizzes that are not part of video lectures, and programming assignments.


  • Will I get a statement of accomplishment after completing this class?

    Yes. Students who successfully complete the class will receive a statement of accomplishment signed by the instructor.

  • How much programming background is needed for the course?

    The class will include programming assignments, so some programming background will be helpful.

  • February 24, 2013, 10 weeks
  • Language: English


