Brandeis CS 146a

ASSIGNMENT 4: September 24 to September 28, 2007

For Class, Tuesday, September 25, 2007


Read "Flash: An Efficient and Portable Web Server", by Pai, Druschel, and Zwaenepoel (the paper is not included in your packet, but is available on the web). This is a very well-written paper that considers the impact of server structure on web server performance.

Your assignment includes answering the following questions:

1. Give a specific example where Flash accepts higher latency in exchange for higher throughput. Why is this worthwhile?
2. Why are Flash and SPED close for small data sets? Why does Flash beat SPED for large data sets?

For Lecture Material:

Read Chapter 6 (Performance) from S&K. Copies available in the COSI office.

For Discussion, Friday, September 28, 2007

Read "MapReduce" (paper #8), by Dean and Ghemawat. This paper is more recent than the Flash paper you read for the previous class and, unlike the single-node "Flash", "MapReduce" is concerned with a system consisting of multiple nodes. The paper describes a novel high-performance system design developed at Google for a specialized model of computation. Your reading assignment questions, therefore, focus on performance.

1. What are the two main reasons to execute the map and reduce functions in parallel on multiple machines?

2. Give examples of the use of batching and explain the specific performance benefit achieved.

3. How do the authors evaluate their system performance? What are "Input", "Output" and "Shuffle"? How do stragglers impact performance?

CS 146a Assignment 4, issued 9/21/07