CS-350 - Fundamentals of Computing Systems
Homework Assignment #2 - EVAL
Due on September 26, 2024 — Late deadline: September 28, 2024 EoD at 11:59 pm
EVAL Problem 1
In this EVAL assignment we will start to explore how to discover characteristics of incoming workload, we will dive into how to measure the overheads of our systems, and we will also study the properties how queues evolve in response to varying traffic conditions.
a) First thing first, let’s try to make sense out of the workload that is coming from the client. You might recall that by invoking the client with various values of the -a and -s parameters significantly impacts the load seen by your server. Now it is time to reverse-engineer the characteristics of that traffic. Start with the following to collect the report of 1,000 packets handled at the server:
./server_mt 2222 & ./client -a 6 -s 10 -n 1000 2222
Now, isolate only the lengths of the requests as they are sent from the client. With that, produce a plot of the distribution of the request lengths you have collected. The distribution plot should have on the x-axis a set of time bins, e.g., from 0 (included) to 0.005 (excluded), from 0.005 (included) to 0.010 (excluded), and so on in steps of 0.005 increments. Given each transaction, look at its length. If it is in the range between 0.005 and 0.010 seconds, it falls in the second bin; if it is in the range between 0.010 and 0.015, it falls in the third bin and so on.
On the y-axis, plot how many requests fall in each bin! But do not plot the raw count. Rather, normalize that value by the total number of requests you are plotting. In this case, 1,000. Hooray! You have produced a distribution plot.
b) By using the same procedure used in the previous part, produce a distribution plot of the inter-arrival time between any two subsequent requests. Say that request R0 is sent (look at the sent timestamp!) arrives at t0 = 10s and R1 is sent at t1 = 15s, then the inter-arrival time between them is t
iat0,1 = (t1−t0).
Compute all the 999 inter-arrival times you have observed and plot their distribution just like you did above, except that this time you will normalize by 999.
c) Time to reverse-engineer things! Let’s start from the distribution of request lengths. Use your favorite programming language to generate 10,000 samples from the following theoretical distributions:
(1) A Normal distribution with mean 1/10 and standard deviation 1;
(2) An Exponential distribution with mean 1/10;
(3) A uniform. distribution with mean 1/10.
Plot these distribution together (on the same plot, just different lines) with the distribution you previ-ously acquired from your server run. Thus, your plot should have a total of 4 different lines (I suggest having lines instead of bars for this plot) in it. Then, comment on which ones of the curves matches more closely with the experimental data. If there is one of them that matches remarkably close, you have successfully reverse-engineered the characteristics of your input load!
d) Do the same with the inter-arrival times. But this time, compare it with the following three references:
(1) A Normal distribution with mean 1/6 and standard deviation 1;
(2) An Exponential distribution with mean 1/6;
(3) A uniform. distribution with mean 1/6.
Produce the comparison plot and comment on the match between the experimental and theoretical curves. At this point, can you tell me what the -a and -s parameters control, exactly?
EVAL Problem 2
In this EVAL problem, we will start to study the relationship between utilization, throughput, and queues in response to changing load conditions.
a) First thing first, learn how to take a good queue size average. A good queue average measurement should consider the amount of time the queue remains in a certain state, i.e., it should be a timed average of the queue length.
Let us make an example. Say that your queue at t = 0 has 3 elements in it, and it stays that way until time t = 9 sec, at which point the queue becomes empty. If you do not consider time, the average size would be q = (3 + 0)/2 = 1.5. But this is incorrect: most of the time we see the queue with 3 elements, and only at the end with 0. So an average of 1.5 seems wrong.
The right way to take the average is by weighting the queue state by the time it stays in that state. Thus, the right way to calculate q in our example is q = 3 · (10/9) + 0 · (10/1) = 2.7. You can see how this is a better average of queue size over a 10 seconds time window starting from time t = 0.
Now, measure the queue length for the case where your queue-enabled server is invoked with the following parameters:
./server_q 2222 & ./client -a 14 -s 15 -n 1000 2222
Use the queue snapshots produced by the worker thread to measure the queue size as it was observed after each request was observed. Use the time elapsed between two subsequent queue snapshots to weigh that size towards the total average.
b) Now let us repeat the computation of the queue size average as in Q?? but this time sweep through the -a parameter passed to the server. In particular, run the first experiment for a value of 1; then a second time with a value of 2; and so on until and including the case where the value is 15. Thus, you will run 15 experiments in total. This might take a while, so try to automate the runs and dump the results into a file for later analysis.
By reusing what you learned in hw1, also extract utilization and response time averages from each of the 15 experiments. Now, you should have three sets of 15 values each: (1) utilization, (2) average response time, (3) average queue length.
Finally, produce a plot that depicts the trend of the average response time (on the y-axis as line 1) and average queue size (on the y-axis as line 2) as a function of the server utilization x-axis. What relationship do you discover between how response time and queue length averages evolve as a result of increasing utilization?
c) By looking at the plot produced above, can you conclude that there is some fixed proportional rela-tionship between queue length and response time? Is there something in the theory covered so far capable of modeling this relationship?