Computational Psycholinguistics: Fall 2019

Lecture: Tuesdays and Thursdays, 1:30-2:45 pm

Lab: Fridays, 3:45-5:00 pm

Krieger 134A

Instructor

Tal Linzen

tal.linzen@jhu.edu

Office hour: Krieger 243, by appointment only (sign-up spreadsheet)

Teaching assistant

Tom McCoy

tom.mccoy@jhu.edu

Office hour: Wed 4:30-5:30 pm, Krieger 141

Course description

How do we understand and produce sentences in a language we speak? How do we acquire the knowledge that underlies this ability? Computational psycholinguistics seeks to address these questions using a combination of two approaches: computational models, which aim to replicate the processes that take place in the human mind; and human experiments, which are designed to test those models. The perspective we will take in this class is that models and experimental paradigms from psycholinguistics not only advance our understanding of human cognition, but can also help us advance artificial intelligence and language technologies. While research in computational psycholinguistics spans all levels of linguistic structure, from speech to discourse, the focus of this class will be at the level of the sentence (syntax and semantics).

At the end of this class, you are expected to be able to:

Prerequisites: I will assume you're familiar with probability theory (e.g., Bayes' law) and are comfortable with Python programming. I will also assume familiarity with basic concepts in linguistics. Experience with neural networks would be helpful but isn't essential.

Course organization

Lab: The class will be accompanied by weekly lab sessions led by the teaching assistant. The goals of the lab are to reinforce the linguistic, mathematical and computational concepts covered in the lecture, and to provide a hands-on technical introduction to the software tools that are essential for successful completion of the homework assignments and class project. All students are expected to enroll in the lab; exceptions will be granted by the professor on a case-by-case basis (for example, to students who can demonstrate existing research experience in computational linguistics / NLP).

Office hours: If you'd like to attend my office hour, please sign up for a slot on this spreadsheet; do not show up without an appointment. To maximize access to office hours, the timing of the office hour may change from week to week (if there is sufficient demand). Please let me know if you're unable to attend my office hour due to a conflict and I'll schedule it at a different time the following week. My office hour is most appropriate for conceptual questions about course material and computational psycholinguistics more generally; technical issues and questions about the homework are best discussed in the lab section or the TA's office hour.

Interacting with the instructors: If you have a question about the material, please ask it in class or in the lab, attend one of our office hours, or post the question on Piazza. We will only use email to communicate about personal or confidential matters.

Anxiety, stress and mental health: If you are struggling with anxiety, stress, depression or other mental health related concerns, please consider visiting the JHU Counseling Center. If you are concerned about a friend, please encourage that person to seek out their services. The Counseling Center is located at 3003 North Charles Street in Suite S-200 and can be reached at 410-516-8278 and online.

Disability services: Any student with a disability who may need accommodations in this class should obtain an accommodation letter from Student Disability Services, studentdisabilityservices@jhu.edu, 385 Garland, (410) 516-4720. Please bring it to our attention as early as possible so that we can do our best to accommodate your needs.

Course requirements

Your responsibilities for the course are:

Attendance: Students are expected to attend all of the meetings of the class. We will occasionally check attendance. Please email the TA in advance if you need to miss a meeting for religious, health or any other valid reason. Do not come to class if you're sick; you do not need to bring a doctor's note, but again, do email the TA in advance to let us know you'll be missing class. Repeated unexplained absence will have consequences beyond the participation grade and may result in failure in the class.

Participation: You are expected to engage in class discussion: ask questions, make comments and answer the instructors' questions. Make sure not to dominate the discussion, however: give all of the students space to participate.

Reactions to the readings: As a component of the participation grade, each student will be expected to post one short question or comment on the readings to Piazza every week. The reactions to a reading are due before the class in which the reading is discussed. Your reactions are expected to demonstrate that you have read and thought about the article. You can skip up to three weeks without penalty; after that, every missed reaction will be penalized with a single point. Reactions, or comments on other students' reactions, that are particularly thoughtful will be rewarded with an extra credit point, up to a maximum of 3 points.

Class presentation: Each student—possibly in teams, depending on enrollment—will be expected to give a 45-minute presentation of one of the papers on the syllabus that present original research (that is, not a textbook chapter). While all students will have read the paper, the presentation should not assume that—it should be self-contained. Students are expected to present the theoretical motivation for the study, discuss relevant background (concisely), present the methodology and results of the study, and discuss its limitations. The time allotted to the presentation includes class discussion.

Homework assignments: There will be six homework assignments. The assignments will have technical components, which will involve implementing computational models discussed in class, as well as short essay questions related to the readings. All homework assignments will be posted on Thursday after class and will be due the next Thursday before class.

Homework late days: You have a budget of ten late days to be used at your discretion over the course of the semester, for any reason (e.g., illness); you do not need to ask for permission to use them. Use your late days wisely: once the budget has been exhausted, any late assignments will receive at most half of the possible points. Late days may only be used for homework assignments. They may not be used for reactions to the readings, for the class project, or for the intermediate deadlines for the class project.

Laptop policy: Cognitive scientists have found that laptop use in the classroom can lead to lower test scores:

Ravizza, S. M., Uitvlugt, M. G., & Fenn, K. M. (2017). Logged in and zoned out: How laptop Internet use relates to classroom learning. Psychological Science, 28(2), 171–180.

See also the New York Times opinion piece "Laptops Are Great. But Not During a Lecture or a Meeting."

We recommend that you avoid using your laptop in class, except for activities that are directly related to the class (e.g., following a Jupyter notebook in lab sessions).

Piazza: We will be using a Piazza site to make announcements and answer questions. All enrolled students should have received an invitation to join the Piazza site. Alternatively, you can add yourself to the site.

Readings: There is no required textbook. All of the readings will be available on this website. Many of the readings are from the draft third edition of Jurafsky and Martin's textbook Speech and Language Processing. Page numbers and chapters refer to the September 23, 2018 version. An optional resource to supplement the readings is Jacob Eisenstein's new NLP textbook (also work in progress).

Class project

You will be expected to write a final paper reporting on an original research project in the area of computational psycholinguistics. The expected scope and ambition of the project will vary by section: graduate students are expected to write a report that could be submitted to a conference, while undergraduate projects can be more modest in scope. Teams of up to two students are allowed, though team projects are expected to be more ambitious. You will work with us throughout the semester to ensure that your project meets our expectations.

The timeline for the project is as follows (all deadlines are by 6 pm Eastern time):

Proposal: The proposal should be up to two pages long (including references). It should include the following components:

Final report: The final report should be up to six pages including references. The report is expected to include the following content (not necessarily as distinct parts):

Format: Please use the Cognitive Science Society (CogSci) LaTeX template for both the proposal and the final report. Overleaf's Learn LaTeX in 30 minutes tutorial may be helpful to students who haven't used LaTeX before.

Ethics policy

The strength of the university depends on academic and personal integrity. In this course, you must be honest and truthful. Ethical violations include cheating on exams, plagiarism, reuse of assignments, improper use of the Internet and electronic devices, unauthorized collaboration, alteration of graded assignments, forgery and falsification, lying, facilitating academic dishonesty, and unfair competition. Please report any ethics violations you witness to the instructor. You may consult the associate dean of student affairs and/or the chairman of the Ethics Board beforehand. See also the Guide on Academic Ethics for Undergraduates and the Ethics Board Web site. In particular:

Do not cheat. You are encouraged to talk with other students about the content of the course, and to use written material (lecture slides, the readings, external websites, newspaper/magazine stories, and so on) as sources, but your written work must be original to you, with the exception of short quotes that are clearly indicated as such (see next paragraph).

Do not plagiarize. If you quote directly from a book or other resource, please indicate this with quotes ("...") and a parenthesized citation after the quoted material; in any case, do not quote extensively from other sources. If you are simply paraphrasing a portion of a resource, leave off the quotes but keep the citation. Use a simple format for citations, for example: "human language syntax is not regular (Chomsky 1957: pages xxx-xxx)".

Course outline

The topics and readings may change during the semester, depending on our rate of progress and interests.

Weeks 1-2 (Sep 3, 5, 10, 12): Introduction, experimental methods in psycholinguistics, and probabilistic prediction

Week 3 (Sep 17, 19): Knowledge of grammar

Week 4 (Sep 24, 26): Human parsing

Weeks 5-6 (Oct 1, 3, 8): Computational models of parsing

Weeks 6-7 (Oct 10, 15): Word vector representations

Weeks 7-8 (Oct 17, 22, 24): Syntax in neural network language models

Week 9 (Oct 29, 31): Neural network modeling of the acquisition of syntactic transformations

Week 10 (Nov 5, 7): Pragmatics as inference

Week 11 (Nov 12, 14): Information, communication and the noisy channel

Week 12 (Nov 19): Syntactic priming and adaptation

Week 12 (Nov 21): Memory and sentence processing

Week 13 (Dec 3, 5): Class project presentations

Grading

Extra credit: There will be no individual extra credit opportunities.

Undergraduate grade composition:

Graduate grade composition:

Letter grades: We will use the following key to assign letter grades:

Number    Letter
97–100    A+
93–96     A
90–92     A-
87–89     B+
83–86     B
80–82     B-
77–79     C+
73–76     C
70–72     C-
60–69     D
0–59      F
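
For students who want to check their own standing, the key above can be read as a simple lookup. The following is a minimal Python sketch, not an official grading tool: the names LOWER_BOUNDS, LETTERS and letter_grade are hypothetical, and the function assumes the final score has already been reduced to a whole number between 0 and 100, since the key does not say how fractional scores are handled.

```python
from bisect import bisect_right

# Lower bound of each band in the key above, in ascending order.
# These names are illustrative and not part of any course software.
LOWER_BOUNDS = [0, 60, 70, 73, 77, 80, 83, 87, 90, 93, 97]
LETTERS = ["F", "D", "C-", "C", "C+", "B-", "B", "B+", "A-", "A", "A+"]


def letter_grade(score: int) -> str:
    """Return the letter for a whole-number score between 0 and 100.

    Assumption: fractional scores are not handled here, because the
    key does not specify a rounding rule.
    """
    if not isinstance(score, int) or not 0 <= score <= 100:
        raise ValueError("score must be a whole number from 0 to 100")
    # bisect_right counts the lower bounds that are <= score;
    # subtracting 1 gives the index of the band the score falls in.
    return LETTERS[bisect_right(LOWER_BOUNDS, score) - 1]


if __name__ == "__main__":
    for s in (100, 92, 86, 69, 59):
        print(s, letter_grade(s))
```

Storing only the lower bound of each band keeps the lookup table in one-to-one correspondence with the rows of the key, so any change to the key requires editing a single pair of lists.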