Final Project Idea: Event-based Entity Chronicler

Here are a few thoughts to get you going. You are free, and encouraged, to improvise.

The idea of this project is to use TARSQI temporal parsing components and embed them in an application for the end user. The Entity Chronicler allows the user to select a document set and a named entity and then to track the named entity throughout the collection. This tracking occurs via all the events that the entity is an argument of.
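As a rough illustration of what "tracking an entity via its events" could mean in code, here is a minimal sketch. The `Event` record, its field names, and the `events_for_entity` helper are hypothetical and not part of the TARSQI toolkit. The sketch assumes that argument linking has already produced, for each event, the entities that serve as its arguments.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Event:
    """Hypothetical event record; not a TARSQI data structure."""
    doc_id: str                 # document the event was found in
    text: str                   # the event word, e.g. "uprising"
    arguments: Tuple[str, ...]  # entities linked to the event as arguments


def events_for_entity(events: List[Event], entity: str) -> List[Event]:
    """Return every event, across all documents, that has `entity` as an argument."""
    return [event for event in events if entity in event.arguments]


# Invented sample data, for illustration only.
sample_events = [
    Event("doc1", "acquired", ("Acme Corp", "Widget Inc")),
    Event("doc2", "resigned", ("Jane Doe",)),
    Event("doc3", "sued", ("Widget Inc", "Acme Corp")),
]
acme_events = events_for_entity(sample_events, "Acme Corp")
```

Matching here is by exact string, which sidesteps cross-document entity coreference; a real chronicler would need at least some name normalization.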

The interface could look like this:

This interface is just an example; you need not go this fancy, especially with respect to the graphical representation. But there are a few necessary components:

  1. First there is a set of documents. For simplicity, you can use a fixed set of documents. This set should have at least a half dozen entities that each occur in at least three of the documents. We think that about a dozen documents may be enough.
  2. Some way to select an entity from the list of entities in the document set.
  3. Some way to represent the temporal relations between the events that the entity is an argument of. This could simply be an ordered list of tuples like <1994 INCLUDES uprising>.
  4. A list of the events with their arguments.
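The requirements on the document set and the tuple representation could be checked and produced along the following lines. This is only a sketch: the function names `trackable_entities` and `format_relation`, and the dictionary that maps document ids to entity names, are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def trackable_entities(doc_entities: Dict[str, Iterable[str]],
                       min_docs: int = 3) -> List[str]:
    """Return the entities that occur in at least `min_docs` documents.

    `doc_entities` maps a document id to the entity names found in it
    (a hypothetical output format of the named entity recognizer).
    """
    docs_per_entity = defaultdict(set)
    for doc_id, entities in doc_entities.items():
        for entity in entities:
            docs_per_entity[entity].add(doc_id)
    return sorted(entity for entity, docs in docs_per_entity.items()
                  if len(docs) >= min_docs)


def format_relation(source: str, relation: str, target: str) -> str:
    """Render one temporal relation in the <source RELATION target> style."""
    return f"<{source} {relation} {target}>"


# Invented sample data, for illustration only.
sample_docs = {
    "doc1": ["Acme Corp", "Jane Doe"],
    "doc2": ["Acme Corp"],
    "doc3": ["Acme Corp", "Jane Doe"],
}
candidates = trackable_entities(sample_docs)
```

A document set meets the guideline above when `trackable_entities` returns at least six names for it.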

Here is a schema that indicates what processing steps are needed to build an entity chronicle:

Some more details on these components:

  1. Pre-processing. Use the TARSQI toolkit for this.
  2. Document selection. This can be a manual step; there is no need to plan for free input.
  3. Document-level processing. Named entity recognition and argument linking are necessary; anaphora resolution would give you extra credit.
  4. Cross-document processing. Entity coreference and event coreference are known hard problems, so we don't expect these to be solved during the project. Luckily, an entity chronicler can be created without these two modules. Timex coreference, however, is essential: your project should include a component that figures out that 1998 in one document is before 1999 in another document.
  5. Chronicler environment. Most of your effort should be on the creation of the interface. In most cases, you won't need temporal closure.
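For the cross-document timex step, one possible starting point is to compare normalized time values directly. The sketch below assumes that each time expression already carries a normalized value of the form YYYY, YYYY-MM, or YYYY-MM-DD (as in the `value` attribute of a TimeML TIMEX3 tag). Other value types, such as durations, week numbers, or vague references, are not handled. The function names are hypothetical; the relation labels follow TimeML naming.

```python
import calendar
from datetime import date
from typing import Tuple


def timex_interval(value: str) -> Tuple[date, date]:
    """Map a YYYY, YYYY-MM or YYYY-MM-DD value to its (first day, last day)."""
    parts = [int(part) for part in value.split("-")]
    if len(parts) == 1:
        year = parts[0]
        return date(year, 1, 1), date(year, 12, 31)
    if len(parts) == 2:
        year, month = parts
        last_day = calendar.monthrange(year, month)[1]
        return date(year, month, 1), date(year, month, last_day)
    if len(parts) == 3:
        year, month, day = parts
        return date(year, month, day), date(year, month, day)
    raise ValueError(f"unsupported timex value: {value!r}")


def timex_relation(a: str, b: str) -> str:
    """Return the temporal relation that holds from timex value a to value b."""
    a_start, a_end = timex_interval(a)
    b_start, b_end = timex_interval(b)
    if a_end < b_start:
        return "BEFORE"
    if b_end < a_start:
        return "AFTER"
    if (a_start, a_end) == (b_start, b_end):
        return "SIMULTANEOUS"
    if a_start <= b_start and b_end <= a_end:
        return "INCLUDES"
    if b_start <= a_start and a_end <= b_end:
        return "IS_INCLUDED"
    # Calendar years, months and days are always nested or disjoint,
    # so this is only a safeguard.
    return "UNKNOWN"
```

With this, `timex_relation("1998", "1999")` yields `BEFORE` regardless of which documents the two values come from, and `timex_relation("1994", "1994-03")` yields `INCLUDES`. Because the comparison works on absolute values, it needs no temporal closure.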