Brandeis University, Fall 2019
Class hours | Tu,Fr 12:30-1:50pm, Abelson 131 |
Instructor and TAs | Marc Verhagen, Kelley Lynch, Jingxuan Tu, Congzhu Lin, Linxuan Yang and Kuan-Yu Chen |
Textbook | Natural Language Processing with Python, by Steven Bird, Ewan Klein and Edward Loper. 2009. O'Reilly Media, Inc. |
This is an introductory graduate-level course on the computer processing of natural language text with the Python programming language. Python has become one of the most popular programming languages in Natural Language Processing (NLP) because its built-in data structures allow natural language text to be manipulated with ease and elegance. Python has a relatively short learning curve compared with other high-level programming languages such as Java, and beginners can build up their programming skills fairly quickly. In addition, a large number of Python modules for language processing (such as NLTK) already exist, so linguistically oriented Python programmers can start writing practically useful programs within a relatively short period of time. Students are discouraged, however, from becoming overly reliant on third-party modules, so that they learn to write code tailored to their own specific needs. The key to success in the course is to get your hands dirty and write a lot of code. By taking this course you have shown a commitment to becoming proficient in programming. If you have never written any code before, you may need to adapt to a new learning style that is practice-oriented rather than reading-oriented.
Learning objectives:
Prerequisites. You can take this class if one of the following holds:
Basically, to do well in this course it helps if you have some programming background and know some linguistics. When in doubt, contact the instructor.
We have office hour slots throughout the week. You can also contact any one of us to make appointments outside of those hours.
Marc Verhagen | Tu 10:30-12:00, Fr 11:00-12:00 | Volen 216 | marc (at) cs (dot) brandeis (dot) edu |
Kelley Lynch | Tu,Fr 14:00-15:00 | Volen 111 | kelleylynch (at) gmail (dot) com |
Jingxuan Tu | Mo 9:00-11:00 | Volen 110 | jxtu (at) brandeis (dot) edu |
Congzhu Lin | Mo,We 3:30-4:30 | Vertica lounge (inside or just outside) | linc (at) brandeis (dot) edu |
Linxuan Yang | We,Th 10:00-11:00 | Vertica lounge (inside or just outside) | linxuany (at) brandeis (dot) edu |
Kuan-Yu Chen | Th 13:00-14:00, 15:30-16:30 | Vertica lounge (inside or just outside) | chenky (at) brandeis (dot) edu |
date | topic | readings | assignments and quizzes |
Aug 30 | Introduction | ||
Sep 3 | Getting started with Python and NLTK | | assignment 0 - install Python and NLTK |
Sep 6 | Python basics | chapter 1 | |
Sep 10 | Python for NLP | | assignment 1 - Python exercises |
Sep 13 | Text, words and corpora | chapter 2 | |
Sep 17 | Word lists and WordNet | ||
Sep 20 | Unix and Git | | assignment 2 - text analysis |
Sep 24 | Regular expressions | chapter 3 | quiz 1 |
Sep 27 | Unicode (guest lecture by Kyeongmin Rim) | ||
Oct 1 | Rosh Hashana, no class | ||
Oct 4 | Logistics and Python topics | ||
Oct 8 | Finite state automata | | assignment 3 - regular expressions |
Oct 11 | Regular expressions and tokenization | ||
Oct 15 | Brandeis Monday, no class | ||
Oct 18 | Miscellaneous | | quiz 2 |
Oct 22 | Language models and POS Tagging | chapter 5 | |
Oct 25 | Language models and POS Tagging | ||
Oct 29 | Language models and POS Tagging | | assignment 4 - tagging |
Nov 1 | Miscellaneous | | quiz 3 |
Nov 5 | Languorem | ||
Nov 8 | Classification | chapter 6 | |
Nov 12 | Classification | ||
Nov 15 | Machine Learning Packages | | assignment 5 - classification |
Nov 19 | Dynamic programming and edit distance | | quiz 4 |
Nov 22 | Working with trees | chapter 8 | |
Nov 26 | Working with trees | chapter 9 | project proposals |
Nov 29 | Thanksgiving, no class | ||
Dec 3 | Ethics and NLP | ||
Dec 6 | Class presentations | ||
Dec 10 | Class presentations | ||