Blind test sets

We have released the blind test sets for English and Chinese. They are in the same format as the training and development sets described below.

Training and development datasets

Participants must fill out the registration form and the license agreement form to obtain the full dataset for the task, which requires permission from the Linguistic Data Consortium (LDC). Once you have formally registered as a participant in the shared task and emailed the agreement to LDC (ldc@ldc.upenn.edu), you will receive instructions for downloading the dataset.

GitHub Repo

The shared task has a GitHub repo. It hosts many files that are required for developing a system. If you find a bug or have other analytic or utility code to share with the group, you are more than welcome to make a pull request. We will also help edit and document the code if you do not have time for that.

Other resources

The resources that are allowed for the closed track in English are listed below:

The resources that are allowed for the closed track in Chinese are still limited. We provide the following: