Assignment 2 - CS120a
Due Wednesday, February 23, 2:00 PM
-----------------------------------

1. Consider the 10 requests to the Brandeis CS web server below. These
   requests were extracted from the web access log and supplemented with
   an additional parameter that indicates the service demand time
   required of the disk to complete each request.

   No.  Requestor        Request                                        File Size (Bytes)  D_Disk (msec)
   -----------------------------------------------------------------------------------------------------
     1  129.64.160.68    GET /~tim/Classes/Spr00/CS155/Notes/ HTTP/1.0               1874             12
     2  209.210.203.43   GET /~mikeb/images/dreambg.gif HTTP/1.1                    15448             68
     3  209.210.203.43   GET /~mikeb/images/ciawww.gif HTTP/1.1                     35959            200
     4  208.222.98.155   GET /~mikeb/images/ciaww.gif HTTP/1.1                      78623            346
     5  209.185.253.175  GET /~tim/Courses/1997/CS2a/Quizes/quiz17.gif              78479            345
     6  209.245.141.164  GET /~suresh/cs11/HW/hw3.html HTTP/1.1                      1766              9
     7  209.245.141.164  GET /~suresh/cs11/HW/HW3.class HTTP/1.1                     2261             15
     8  12.79.222.154    GET /~cs21b/files/hw1.html HTTP/1.0                        12071             54
     9  12.79.222.154    GET /~cs21b/files/submit.html HTTP/1.0                     14050             57
    10  151.197.17.36    GET /~paulb/CoreLex/corelex.html HTTP/1.1                   6198             28

   a. Use the clustering algorithm discussed in class to cluster the 10
      requests above into 4 clusters (very small, small, medium and large
      requests), according to their file sizes and disk service demands.
      Make sure to scale parameter values using their z-scores before
      running the clustering algorithm. Recall that the standard
      deviation for any set of values V = {v1, ..., vn} with mean value M
      is defined as follows:

         stdev (V) = sqrt (average ({sqr (v1 - M), sqr (v2 - M), ..., sqr (vn - M)}))

      Indicate which requests belong in which clusters, as well as the
      average z-score and *raw* values for each parameter.
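
   The z-score scaling step in part (a) can be sketched in Python as
   follows. This is only an illustration of the standard deviation
   formula quoted above (dividing by n, not n - 1); it does not implement
   the clustering algorithm discussed in class. The names `requests` and
   `z_scores` are made up for this example.

```python
from math import sqrt

# (file size in bytes, disk service demand in msec) for requests 1..10,
# copied from the table in problem 1.
requests = [
    (1874, 12), (15448, 68), (35959, 200), (78623, 346), (78479, 345),
    (1766, 9), (2261, 15), (12071, 54), (14050, 57), (6198, 28),
]


def z_scores(values):
    """Scale values to z-scores using the population standard deviation."""
    mean = sum(values) / len(values)
    stdev = sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / stdev for v in values]


size_z = z_scores([size for size, _ in requests])
disk_z = z_scores([disk for _, disk in requests])

# One line per request: request number, size z-score, disk z-score.
for number, (sz, dz) in enumerate(zip(size_z, disk_z), start=1):
    print(f"{number:2d}  size z = {sz:+.2f}  disk z = {dz:+.2f}")
```

   The resulting (size z, disk z) pairs are the points that would be
   handed to the clustering algorithm.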
   Solution
   --------
   Category     Includes  Avg Size z-score  Avg Raw Size (Bytes)  Avg Disk z-score  Avg Raw Disk (msec)
   ----------------------------------------------------------------------------------------------------
   Very Small:  1,6,7                -0.79                  1967             -0.80                   12
   Small:       2,8,9,10             -0.45              11941.75             -0.48                51.75
   Medium:      3                     0.39                 35959              0.68                  200
   Large:       4,5                   1.88                 78551              1.82                345.5

   b. Suppose that we know that the response times for

         very small requests = e (s, io)    (s = size, io = num io's)
         small requests      = f (s, io)
         medium requests     = g (s, io)
         large requests      = h (s, io)

      A query processor must predict the expected response time for a
      file request of size n bytes that is determined to require m msec
      of disk service demand. How would you determine whether its
      expected response time would be e (n, m); f (n, m); g (n, m); or
      h (n, m)?

   Solution
   --------
   Compute the z-scores of n and m, using the means and standard
   deviations of the 10 original requests. Find the cluster centroid
   closest to the resulting point to determine the class of the request,
   then use the response time formula for that class.

2. Suppose requests arrive at a network queue for a T1 Line (1.5 Mbps)
   at a rate of 2000 packets / second, and that the average length of a
   packet is 515 bits. What is the expected throughput, response time
   and average population of the queue?

   Solution
   --------
   Assume an infinite population, a fixed service rate and an unlimited
   queue.

   We are given:
      lambda = 2000 p/sec

      Service time (S):            S  = 515 / (1.5 * 1024 * 1024) = .33 msec = .00033 sec
      Service rate (mu):           mu = 1 / .00033 = roughly 3000 p/sec
      Idle rate (p0):              p0 = 1 - (lambda / mu) = 1 - (2000 / 3000) = 1/3
      Utilization of Network (U):  U  = (1 - p0) = 2/3

   Then we have:
      X = lambda (by FEA) = 2000 p/sec
      N = sum (1 <= i <= inf) (i * p_i) = U / (1 - U) = 2
      R = N / X = 2 / 2000 = .001 sec = 1 msec

3. A small business has two outside lines for its telephones. Calls come
   in at a rate that is slower than the rate at which calls are
   processed, and yet measurements show that 1 out of every 7 incoming
   calls (roughly 14%) gets a busy signal.
   How many outside lines should be added to reduce the number of
   incoming calls that get a busy signal to 1 in 511 (less than .2%)?

   Solution
   --------
   For this problem, we can assume a limited queue (of size 2) with
   infinite population and fixed service rate.

   Let x = lambda/mu

   We know that

                       (1 - x)
      p_Busy = p_2 = ----------- * x^2  =  1 / 7
                      (1 - x^3)

   Therefore, solving for x, we have

      1 - x^3 = 7x^2 - 7x^3

   which implies

      6x^3 - 7x^2 + 1 = 0

   Note that

      6x^3 - 7x^2 + 1 = (3x + 1) (x - 1) (2x - 1)

   Given that 0 <= x < 1, it must be the case that x = 1/2.

   Therefore, to solve the problem, we must find n s.t.

                     (1 - 1/2)
      1 / 511 = ------------------- * (1/2)^n
                 (1 - (1/2)^{n+1})

                    (1/2)^{n+1}
              = -------------------
                 (1 - (1/2)^{n+1})

      1 - (1/2)^{n+1} = 511 * (1/2)^{n+1}
                    1 = 512 * (1/2)^{n+1}
              1 / 512 = (1/2)^{n+1}
                    n = 8

   Thus, we need a total of 8 lines, i.e. 6 lines must be added to the
   existing 2.

4. Consider the DB Server of Example 9.3 of your text, and discussed in
   class.

   a. What is the expected response time when there are 50 requests in
      the system? (Express your answer in msec.)
   b. Which of the CPU or Disk is the bottleneck in the DB server?
      Justify your answer by considering how independently replacing
      each resource with a faster version affects the maximum throughput
      of the system.
   c. What is the maximum possible throughput (expressed to 4 decimal
      places) of the system, assuming that each request requires 15 msec
      of CPU time? Explain how you got your answer.
   d. Assuming the CPU is left as it is, what is the minimum disk speed
      (measured in transfer rate (KBps)) required to achieve 80% of the
      throughput you calculated in (c), with as few as 20 requests in
      the system? For this question, you can assume that each record
      read requires reading a block (2048 bytes) from disk.

   Solution
   --------
   a. 2250 msec or 2.25 sec
   b. The disk is the bottleneck. Doubling disk speed results in doubling
      maximum throughput. Doubling CPU speed has negligible effect on
      maximum throughput.
   c. Maximum throughput (achieved when the service demand for disk is 0)
      is 1 / 15 = 0.0667 requests per msec.
   d.
      The service demand for the disk must be roughly 18.7 msec to achieve
      80% of 0.0667 (0.053). This means that each record read must take
      18.7 / 5 = 3.73 msec or less, requiring a disk that can read
      2048 bytes / 3.73 msec, or roughly 536 KB/sec.
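
The arithmetic in the solutions to problems 2, 3 and 4(d) can be
double-checked with a short Python sketch like the one below. It only
evaluates the formulas already stated in those solutions. The function
names are invented for this example, and the KB/sec conversion assumes
1 KB = 1024 bytes, which is what reproduces the 536 KB/sec figure.

```python
from fractions import Fraction


def mm1_metrics(arrival_rate, service_rate):
    """Return (utilization, mean population, response time) for a single
    queue with infinite population, fixed service rate, unlimited queue."""
    utilization = arrival_rate / service_rate
    population = utilization / (1 - utilization)
    response_time = population / arrival_rate  # R = N / X
    return utilization, population, response_time


def busy_probability(x, lines):
    """p_n = (1 - x) * x^n / (1 - x^(n+1)), the formula used in problem 3.
    Only valid for x != 1."""
    return (1 - x) * x**lines / (1 - x**(lines + 1))


def lines_needed(x, target):
    """Smallest number of lines whose busy probability is <= target.
    Assumes 0 < x < 1 and target > 0, so the loop terminates."""
    lines = 1
    while busy_probability(x, lines) > target:
        lines += 1
    return lines


def transfer_rate_kbps(block_bytes, read_msec):
    """Transfer rate in KB/sec, assuming 1 KB = 1024 bytes."""
    return block_bytes / read_msec * 1000 / 1024


# Problem 2, using the rounded service rate of 3000 p/sec from the solution.
u, n, r = mm1_metrics(2000, 3000)
print(u, n, r)

# Problem 3, with exact fractions to avoid rounding at the 1/511 boundary.
x = Fraction(1, 2)
print(busy_probability(x, 2))             # 1/7
print(lines_needed(x, Fraction(1, 511)))  # 8

# Problem 4(d), using the 3.73 msec per-read time from the solution.
print(round(transfer_rate_kbps(2048, 3.73)))  # 536
```

Exact fractions are used for problem 3 because the target probability
is met with equality at n = 8, where floating-point rounding could
otherwise tip the comparison either way.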