Performance evaluation: closed systems.

* Consider a closed-loop system with N users, that is, the multiprogramming level is MPL = N. Each user submits a request, waits for the response, thinks for some time Z, and then submits another request. The total turnaround time for a request is the sum of the response time at the server and the think time.

E[T] = E[R] + E[Z]. 

Note that turnaround time and response time were the same in open loop systems, but now they are different.

* Suppose there is only one user and a simple single server. Assume no think time. Then the user is constantly keeping the system occupied, so X = mu and rho = 1.

* However, if there is think time, the server is idle while the user thinks, so more users are required to saturate the system. If N is small, users mostly fill the gaps left by the think time of other users: there is little or no waiting in queue and the response time stays close to the service demand, but the system is not fully utilized and throughput is low. If N is large and the system is fully utilized, users queue behind each other and response time is high. So, unlike in open systems, throughput and response time are interrelated. What, then, is the ideal number of users that just about saturates the system?

* Let us now compute the ideal number of users to saturate a closed system. Assume that the closed system has many subcomponents, each with its own service demand Di. The total service demand is D = sum(Di). The slowest component, or the bottleneck component, has the largest service demand, Dmax = max(Di).

* Now, if there is only one user, the bottleneck component is busy for Dmax out of every D + E[Z] units of time. So, intuitively, N* = (D + E[Z]) / Dmax users will keep the bottleneck fully occupied and lead to full utilization. What happens with fewer than N* users? The bottleneck is not fully used. What happens with more than N* users? The bottleneck cannot go any faster, so users queue up behind each other at the bottleneck, leading to higher delays. Let us now try to plot how throughput X and response time R vary with the number of users.
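
The saturation point is a one-line calculation. A minimal sketch follows; the function name `saturation_users` and the numbers in the usage line are hypothetical, chosen only for illustration.

```python
def saturation_users(demands, think_time):
    """Return N* = (D + E[Z]) / Dmax for per-component service demands."""
    total_demand = sum(demands)      # D
    bottleneck = max(demands)        # Dmax
    return (total_demand + think_time) / bottleneck

# Hypothetical numbers: two components with demands 1 s and 3 s, think time 8 s.
print(saturation_users([1.0, 3.0], 8.0))   # (4 + 8) / 3 = 4.0
```

The result need not be an integer; in that case the knee falls between two whole numbers of users.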

* Little's law for closed systems

N = X E[T] or E[T] = N / X

X = N / (E[R] + E[Z]) 

(Recall that in open systems X was set by the arrival rate and did not depend on T. Here, for a fixed N, X and T are inversely related.)

* Now, for large values of N, the system cannot serve requests faster than the slowest component, so X <= 1/Dmax

So, from Little's law, E[R] >= N Dmax - E[Z]

* For small values of N, the response time cannot be smaller than D, so E[R] >= D

So, X <= N/( D + E[Z])

That is, throughput increases almost linearly with N initially. But as queueing delays start to build up, throughput flattens out. Response time is almost constant initially, but increases as N increases and queues build up.

* X <= min (1/Dmax, N/( D + E[Z]) )

  E[R] >= max( D, N Dmax - E[Z] )
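
The two bounds can be evaluated directly for any N. The sketch below is only an illustration of the formulas above; the function names and the sample demands (1 s and 3 s, think time 8 s) are hypothetical.

```python
def throughput_upper_bound(n, demands, think_time):
    """X <= min(1/Dmax, N / (D + E[Z]))."""
    d, d_max = sum(demands), max(demands)
    return min(1.0 / d_max, n / (d + think_time))

def response_lower_bound(n, demands, think_time):
    """E[R] >= max(D, N * Dmax - E[Z])."""
    d, d_max = sum(demands), max(demands)
    return max(d, n * d_max - think_time)

# Hypothetical system: demands of 1 s and 3 s, think time 8 s.
for n in (1, 2, 4, 8):
    x = throughput_upper_bound(n, [1.0, 3.0], 8.0)
    r = response_lower_bound(n, [1.0, 3.0], 8.0)
    print(f"N={n}: X <= {x:.3f} req/s, E[R] >= {r:.1f} s")
```

Below the knee the first term of each bound is the binding one; above it the bottleneck term takes over.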
 
* Visualize the plots of X and R as a function of N, and identify where N* would be located (at the knee of the curves).

* The fact that the saturation throughput only depends on Dmax means that one cannot increase throughput by speeding up other components. Reducing other Di will improve response time, but not throughput, which depends only on Dmax.

* Let's use the following example. A system has a disk and CPU. For every request, the disk takes Dmax=5 sec to serve the request and CPU takes 2 sec. So, D = 5+2=7 sec. Each request has a think time of Z = 18 seconds. 

For N = 1, throughput = 1 req / 25 sec = 0.04 req/s

For N = 2, the two users rarely get in each other's way, so about 2 requests complete every ~25 sec, i.e., about 0.08 req/s.

N* = 25/5 = 5 users.

For large N, one cannot get more than 1/5 = 0.2 req/s through the system.
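
These numbers can be checked with a small deterministic simulation. The sketch below makes assumptions the example does not state: each request visits the CPU and then the disk, each component serves requests in FCFS order, service and think times are exactly constant, and all users issue their first request at time 0. The name `simulate_closed` is hypothetical.

```python
import heapq

def simulate_closed(n_users, demands, think_time, horizon):
    """Simulate a closed system whose stations are visited in order.
    Returns (throughput in req/s, mean response time in s)."""
    free_at = [0.0] * len(demands)                 # when each station next becomes idle
    issued_heap = [(0.0, u) for u in range(n_users)]  # (time request is issued, user id)
    heapq.heapify(issued_heap)
    completed, total_response = 0, 0.0
    while True:
        issued, user = heapq.heappop(issued_heap)  # earliest pending request
        t = issued
        for i, d in enumerate(demands):
            start = max(t, free_at[i])             # wait if the station is busy
            t = start + d
            free_at[i] = t
        if t > horizon:                            # completions are in FCFS order,
            break                                  # so all later ones are past the horizon too
        completed += 1
        total_response += t - issued
        heapq.heappush(issued_heap, (t + think_time, user))  # think, then return
    return completed / horizon, total_response / completed

# CPU demand 2 s, disk demand 5 s, think time 18 s, as in the example above.
for n in (1, 2, 5, 10):
    x, r = simulate_closed(n, [2.0, 5.0], 18.0, horizon=2500.0)
    print(f"N={n}: X = {x:.3f} req/s, R = {r:.1f} s")
```

Under these assumptions, throughput should come out near 0.04 and 0.08 req/s for N = 1 and 2 and level off near 0.2 req/s from N = 5 onward, while response time stays near 7 sec up to N = 5 and grows beyond it.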

* Closed-loop testing of a system: run multiple experiments, gradually increasing the number of concurrent users. As the number of users increases, throughput should flatten out, response time should start increasing, and utilization of the bottleneck should increase. Ideally, the capacity you identify should be similar to the one found with open-loop testing.

* How to write a closed loop load generator? Have multiple threads, each emulating a single user's behavior: send a request, wait for response, think if needed, and start over again. 
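
A minimal sketch of such a load generator, using one Python thread per emulated user. The names `run_closed_loop` and `fake_request` are hypothetical, and `fake_request` is only a stand-in (a lock plus a sleep, emulating a single server with a fixed service time) for whatever call actually sends a request and waits for its response.

```python
import threading
import time

def run_closed_loop(n_users, send_request, think_time, duration):
    """Run n_users emulated users for about `duration` seconds.
    Returns (throughput in req/s, mean response time in s)."""
    response_times = []
    lock = threading.Lock()
    t0 = time.monotonic()
    deadline = t0 + duration

    def user():
        while time.monotonic() < deadline:
            start = time.monotonic()
            send_request()                         # blocks until the response arrives
            elapsed = time.monotonic() - start
            with lock:
                response_times.append(elapsed)
            time.sleep(think_time)                 # think before the next request

    threads = [threading.Thread(target=user) for _ in range(n_users)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    wall = time.monotonic() - t0                   # actual length of the run
    count = len(response_times)
    mean_r = sum(response_times) / count if count else float("nan")
    return count / wall, mean_r

_server = threading.Lock()                         # hypothetical single server

def fake_request(service_time=0.01):
    with _server:                                  # one request served at a time
        time.sleep(service_time)

if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        x, r = run_closed_loop(n, fake_request, think_time=0.02, duration=0.5)
        print(f"N={n}: X = {x:.1f} req/s, R = {r * 1000:.1f} ms")
```

Sweeping the number of users as in the usage loop gives the throughput and response time curves for the system under test. With real requests, each user thread would also need to handle errors and timeouts.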

* The calculation for optimal number of users to saturate a closed loop system can also be used to find the ideal window size to saturate a network path (bandwidth delay product), the ideal number of threads to saturate a CPU when each thread blocks for some time, and so on. The intuition developed here is very useful to think about systems and perform back-of-the-envelope calculations.

* For example, suppose a thread of an application has a CPU burst of 2 seconds and blocks for 10 seconds for I/O before being ready for CPU. In this case, how many threads are required to saturate the CPU? We have D = Dmax = 2 and Z = 10, so N* = 12/2 = 6.

* Consider a sender sending packets to a receiver, with acks, on a network path. We need to compute how many packets the sender should keep outstanding so that the bottleneck link is fully utilized. One can think of the bottleneck link of the network path as the bottleneck resource. If the bandwidth of the bottleneck link is R bits/sec (here R denotes bandwidth, not response time), then Dmax = 1/R seconds per bit. Once a bit is served by the bottleneck, it incurs a delay equal to the network RTT before it can appear at the bottleneck again, so the RTT plays the role of the think time. Applying the formula with D = Dmax = 1/R and Z = RTT gives N* = (1/R + RTT) * R, which is approximately R * RTT bits. So the ideal window size at the sender = R * RTT = the bandwidth delay product that you would have studied in a networking course.
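
As a back-of-the-envelope sketch, the window can be converted from bits to packets. The function name `window_size_packets` and the path parameters below (10 Mbit/s, 100 ms RTT, 1500-byte packets) are hypothetical.

```python
import math

def window_size_packets(bandwidth_bps, rtt_s, packet_bytes):
    """Packets to keep outstanding so the bottleneck link stays busy."""
    bdp_bits = bandwidth_bps * rtt_s               # bandwidth-delay product in bits
    return math.ceil(bdp_bits / (8 * packet_bytes))

# Hypothetical path: 10 Mbit/s bottleneck, 100 ms RTT, 1500-byte packets.
print(window_size_packets(10_000_000, 0.1, 1500))  # 1,000,000 bits / 12,000 bits per packet -> 84
```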