JBS: Summer 2015

Late Policy

Assignments will be accepted late ONLY if an extension has been agreed upon. You must email Professor Meteer the day before the assignment is due with the reason you need an extension. Later requests will only be accepted in emergency circumstances.

Blog Assignment 1: Speech application review

Due: Review due by the start of class June 4th; comments due by June 6th (I’ll set up a forum).

  • Select a speech application and try it out. Remember what Cory said and look for something you care about.
  • Identify what worked and what didn't, how easy it was to use, how useful it was.
  • Write a short review describing the application (functionality, platform) and how well it works (usefulness, limitations). Provide the information on how to access the application so that others can try it.
  • Post it to the class blog (you’ll get a message to respond to) by Thursday June 4th.
  • Read the reviews from other classmates and comment on how each app compares to the app you reviewed, whether you've used anything similar, or ask for more information about the app or its performance.
  • Make at least one comment by Saturday June 6th.

Blog Assignment 2: Due Monday (6/29) before class

Evaluate 2 Text to Speech (TTS) engines, paying particular attention to how well each would work for your group project.

First, describe the uses of TTS in your application and create a set of requirements for the TTS engine. (Does it need to read? Does it need to say numbers, email addresses, names? How much variability in the spoken phrases will there be?)

Next, choose 2 TTS engines and evaluate them with respect to your requirements. You should both try out some phrases and look on their websites to see what tools are available and how much control you have.

Here are a few links. I'm sure there are more out there.

  • Acapela group: http://www.acapela-group.com
  • Festival: http://www.cstr.ed.ac.uk/projects/festival/onlinedemo.html
  • Cepstral: http://cepstral.com/demos/
  • Ivona: http://www.ivona.com/en/
  • Neospeech: http://www.neospeech.com/?gclid=CjwKEAjwh6SsBRCYrKHF7J3NjicSJACUxAh7zyibyeKG_ja0BTk2azPjME7Iecej-3_aw7jC8uxhnRoCiXXw_wcB

Submit a description of the requirements you've identified, how well each engine meets the requirements, and what tests you did or information you found to support your conclusions.

Group Interface Usability Testing: Monday morning

Design a usability test for specific points in your interface.

  • Show your subject the home screen and ask them what they would do.
  • Give your subject a persona and scenario (e.g. "You are working in a study group with four other students. Order a couple of pizzas.") and ask them to perform a specific task. (Have a few persona/scenario/task sets prepared.)
  • Ask them for general feedback.

Make sure someone in your group is the scribe to note what they do and say.

Group speech input/output presentation: Monday afternoon

On Monday, I would like each group to summarize its findings, ideas, and strategies for speech in your applications. You'll have about 45 minutes in class to pull all of the information together into a presentation, and then each group will present. Here are the questions I would like your presentation to answer:

1. What are the uses of speech input in your application? Give example utterances.

2. Discuss the offline regression tests of the corpus you've created (e.g. SCLITE results: .sys and .pra files) and what they tell you about what will be hard/easy for your app.

3. Describe how you will do NLP (e.g. if wit.ai: lists, intentions & entities, and performance given recognizer output; if other, how you go from words to actions). Discuss any limitations you are hitting.

4. Describe your ideas/strategies for error correction when the recognizer doesn't understand or the NLP doesn't do the right thing.

5. Describe your ideas for speech output.
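
For question 3, if your group is not using a service like wit.ai, one very simple way to go from words to actions is keyword spotting over the recognizer output. The sketch below is only an illustration for the pizza-ordering scenario: the function name parse_order and the word lists are hypothetical, not part of any tool we've used, and your own app will need its own vocabulary and a more careful approach.

```python
import re

# Hypothetical vocabulary for a pizza-ordering app; replace with your own lists.
SIZES = {"small", "medium", "large"}
TOPPINGS = {"pepperoni", "mushrooms", "onions", "sausage", "olives"}
NUMBERS = {"one": 1, "two": 2, "three": 3, "four": 4, "couple": 2}


def parse_order(utterance):
    """Map a recognized utterance to an intent plus entities by keyword spotting."""
    words = re.findall(r"[a-z']+", utterance.lower())
    # The intent is "order_pizza" only if the utterance mentions pizza at all.
    intent = "order_pizza" if {"pizza", "pizzas"} & set(words) else "unknown"
    return {
        "intent": intent,
        # First number word wins; assume one pizza if no quantity was spoken.
        "quantity": next((NUMBERS[w] for w in words if w in NUMBERS), 1),
        "size": next((w for w in words if w in SIZES), None),
        "toppings": [w for w in words if w in TOPPINGS],
    }


print(parse_order("Order two large pizzas with pepperoni and mushrooms"))
```

Notice how brittle this is: if the recognizer returns "pepper only" instead of "pepperoni", the topping is silently lost. That kind of failure is exactly what questions 3 and 4 are asking you to think about.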

Catching up on Reading and Quizzes: Wednesday morning

We've gotten caught up in the doing, but understanding the research behind it is also important. For our last class, we're going to be discussing the papers that have been assigned, both in small groups and as a class. You'll then be given a take home set of quizzes to summarize what you've learned, which will be due Friday, July 10. Details on that will follow.

For Wednesday (July 1), read the following papers (most of which have been in the syllabus already). Focus on the concepts. Don't get bogged down in the details of the research.

Optional:

  • Which ASR should I choose for my dialog system?
  • Tuning Sphinx to Outperform Google's Speech

Programming Assignment: Speech recognition offline evaluation

GOAL: Evaluate the performance of 3 speech recognizers

  • AT&T Mashup Speech Server
  • CMU Pocketsphinx
  • Google Chrome

Compare word error rates and details of what words the recognizer got right and wrong on audio you create, as well as audio created by the rest of the class.
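
To make "word error rate" concrete: it is the number of word substitutions, deletions, and insertions needed to turn the reference transcript into the recognizer's output, divided by the number of words in the reference. Here is a minimal sketch of that calculation. The function name word_error_rate is my own, and the lowercasing and whitespace splitting are simplifying assumptions; this is not a replacement for the scoring tools used in the assignment, which also give you the per-word alignment.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


reference = "i would like a large pepperoni pizza"
hypothesis = "i would like large pepper only pizza"
print(f"WER: {word_error_rate(reference, hypothesis):.3f}")
```

In this example the recognizer dropped "a" and turned "pepperoni" into "pepper only", which costs three edits against a seven-word reference. Note that WER can exceed 1.0 when the hypothesis contains many inserted words.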

Application: Ordering a pizza.

Details on the assignment can be found here

More info on the various tools you will be using is in the GoogleDocs Useful Tools folder