LING 131a: Introduction to NLP with Python

Brandeis University, Fall 2019

Course information

Class hours Tu,Fr 12:30-1:50pm, Abelson 131
Instructor and TAs Marc Verhagen, Kelley Lynch, Jingxuan Tu, Congzhu Lin, Linxuan Yang and Kuan-Yu Chen
Textbook Natural Language Processing with Python, by Steven Bird, Ewan Klein and Edward Loper. 2009. O'Reilly Media, Inc.

This is an introductory graduate-level course on the computer processing of natural language text with the Python programming language. Python has become one of the most popular programming languages in Natural Language Processing (NLP) because it has built-in data structures that allow natural language text to be manipulated with ease and elegance. Python has a relatively gentle learning curve compared with other high-level programming languages such as Java, and beginners in Python can build up their programming skills fairly quickly. In addition, a large number of Python modules (such as the NLTK module) already exist for language processing purposes, so linguistically oriented Python programmers can start to write practically useful programs within a relatively short period of time. Students are discouraged, however, from becoming overly reliant on third-party modules, so that they learn to write code tailored to their own specific needs. The key to being successful in the course is to get your hands dirty and write a lot of code. By taking this course you have shown a commitment to becoming proficient in programming. If you have never written any code before, you may need to adapt to a new learning style that is practice-oriented rather than reading-oriented.

Learning objectives:

Prerequisites. You can take this class if one of the following holds:

Basically, to do well in this course it helps if you have some programming background and know some linguistics. When in doubt, contact the instructor.

Office hours and contact information

We have office hour slots throughout the week. You can also contact any one of us to make appointments outside of those hours.

Marc Verhagen Tu 10:30-12:00, Fr 11:00-12:00 Volen 216 marc (at) cs (dot) brandeis (dot) edu
Kelley Lynch Tu,Fr 14:00-15:00 Volen 111 kelleylynch (at) gmail (dot) com
Jingxuan Tu Mo 9:00-11:00 Volen 110 jxtu (at) brandeis (dot) edu
Congzhu Lin Mo,We 3:30-4:30 Vertica lounge (inside or just outside) linc (at) brandeis (dot) edu
Linxuan Yang We,Th 10:00-11:00 Vertica lounge (inside or just outside) linxuany (at) brandeis (dot) edu
Kuan-Yu Chen Th 13:00-14:00, 15:30-16:30 Vertica lounge (inside or just outside) chenky (at) brandeis (dot) edu

Tentative Schedule

date topic readings assignments and quizzes
Aug 30 Introduction
Sep 3 Getting started with Python and NLTK assignment 0 - install Python and NLTK
Sep 6 Python basics chapter 1
Sep 10 Python for NLP assignment 1 - Python exercises
Sep 13 Text, words and corpora chapter 2
Sep 17 Word lists and WordNet
Sep 20 Unix and Git assignment 2 - text analysis
Sep 24 Regular expressions chapter 3 quiz 1
Sep 27 Unicode (guest lecture by Kyeongmin Rim)
Oct 1 Rosh Hashana, no class
Oct 4 Logistics and Python topics
Oct 8 Finite state automata assignment 3 - regular expressions
Oct 11 Regular expressions and tokenization
Oct 15 Brandeis Monday, no class
Oct 18 Miscellaneous quiz 2
Oct 22 Language models and POS Tagging chapter 5
Oct 25 Language models and POS Tagging
Oct 29 Language models and POS Tagging assignment 4 - tagging
Nov 1 Miscellaneous quiz 3
Nov 5 Languorem
Nov 8 Classification chapter 6
Nov 12 Classification
Nov 15 Machine Learning Packages assignment 5 - classification
Nov 19 Dynamic programming and edit distance quiz 4
Nov 22 Working with trees chapter 8
Nov 26 Working with trees chapter 9 project proposals
Nov 29 Thanksgiving, no class
Dec 3 Ethics and NLP
Dec 6 Class presentations
Dec 10 Class presentations