CS744 Course Project
The goal of the project is to build a real computer system to apply
the system design and engineering principles learnt in class. The
project will run in multiple phases throughout the semester. Below are
tentative deadlines for the various phases. Phase 0 does not carry any
grade, and each of Phase 1 through 4 will account for ~10% of your
grade (a total of 40% for the project).
- Phase 0: Aug 6.
- Phase 1: Sep 3.
- Phase 2: Oct 9.
- Phase 3: Nov 5.
- Phase 4: demo and presentations Nov 6 onwards.
Phase 0: Project Proposal, defining the system, forming teams.
In phase 0, you will finalize the system you will build, and pick your
teammates. While you are free to pick a system of your choice to
build, the following constraints will ensure that your system meets a
certain minimum level of complexity.
- Your system must be a multi-tier client-server system, with more
than one tier on the server side. That is, a client generates requests
to a frontend, which in turn consults one or more backend tiers to
process the request, and returns a response back to the client.
- There should be more than one type of request processed by the
system, and each type of request must involve different processing
logic at the various server tiers. The processing of a request can go
back and forth across the multiple server tiers before a reply is
returned to the client.
- The clients can request some service from the servers, or can use
the server to exchange messages or information with other
clients. That is, client-to-client communication (if desired) must
happen via the server.
- You must use TCP as the transport protocol between the client and
the front end server, and preferably between the front end and the
backend tiers as well.
- The server processing should be stateful, i.e., some
information/state must be stored and retrieved (either in memory or on
secondary storage) to process each request. The state can be stored at
one or more tiers in the servers.
- You can use an interactive client (e.g., one that requires the user
to type something in the terminal) in the initial stages of the
project. However, the client application must be amenable to
generating requests programmatically, for testing the performance of
your system in later stages of the project (see the description of
Phase 2 below). So, keep this in mind when you pick your system.
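To make the multi-tier constraint concrete, here is a minimal sketch of the required request flow in Python: a client sends a TCP request to a frontend, which consults a backend tier (holding the state) over a second TCP connection and relays the reply. Everything here (the ports, the toy key-value state) is illustrative, not a prescribed design.

```python
import socket
import threading

# Port 0 lets the OS pick free ports; a real system would use fixed ones.
backend_srv = socket.create_server(("127.0.0.1", 0))
frontend_srv = socket.create_server(("127.0.0.1", 0))
backend_addr = backend_srv.getsockname()
frontend_addr = frontend_srv.getsockname()

def backend():
    # Backend tier: holds the state (here, a tiny in-memory table).
    state = {b"hello": b"world"}
    conn, _ = backend_srv.accept()
    with conn:
        key = conn.recv(1024)
        conn.sendall(state.get(key, b"?"))

def frontend():
    # Frontend tier: accepts the client's request over TCP, consults
    # the backend tier over a second TCP connection, relays the reply.
    conn, _ = frontend_srv.accept()
    with conn:
        request = conn.recv(1024)
        with socket.create_connection(backend_addr) as be:
            be.sendall(request)
            reply = be.recv(1024)
        conn.sendall(reply)

threading.Thread(target=backend, daemon=True).start()
threading.Thread(target=frontend, daemon=True).start()

# Client: one request/response against the frontend tier.
with socket.create_connection(frontend_addr) as c:
    c.sendall(b"hello")
    response = c.recv(1024)
print(response)  # b'world'
```

Your system will of course need multiple request types, concurrent clients, and persistent handling loops; this only shows the client-to-frontend-to-backend shape.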
Some examples of systems you can build:
- A file server with a backend authentication database. When a client
connects to the server, it must first send a request to
authenticate/login. The frontend server requests a username and
password and checks with the backend database to authenticate the
user. The client can then request files from the frontend server,
which are served from the server's memory or disk.
- A web server with a front end proxy cache. When a client requests
a webpage, a front end proxy cache checks if a cached copy of the file
is available. In case of a cache miss, the proxy must go to the
backend and fetch the webpage. Otherwise, the file is served from the
front end's memory or disk.
- A file/web server with a front end load balancer. The front end
receives the request, fetches the file from one of the backend
servers, and replies to the client.
- A simple ticket/room reservation system, where the client issues
requests to check availability or book tickets, the backend database
holds the reservation data, and the frontend server fetches
information from the backend and replies to the reservation request.
- A key-value store, with a stateless frontend, and the key-value
pairs distributed across one or more backend servers. The frontend server
redirects the request to one of the backends depending on the value of
the key.
- A chat/messenger application. A user logs in to the front end and
sends/receives messages through a client side application. Multiple
users connected to the front end can send and receive messages to each
other using their usernames or some other identifier. The server front
end authenticates users and forwards messages for those users that
are online, while the backend server stores messages for users that
are offline. The front end must retrieve offline messages from the
backend when the user logs in.
- A simple gaming application, where the user logs in and sends
commands/actions via the client side application, and the front end
handles these requests. The users of a game can also communicate with
each other via the front end. The backend can be used to keep scores
or other such bookkeeping information.
Of course, the examples above are only suggestions, and you are
free to pick anything that is not on the list as well. We only require
that your system look realistic, and have a reasonable amount of
complexity to make your project interesting. However, there is no need
to make your system compatible with existing standard application
protocols. For example, if you are building a web server, you can use
your own simplified HTTP header formats at the client and server, and
your server need not be compatible with real life HTTP clients.
Submission instructions We will circulate an online form to
you, where you must fill in your team information and a brief
description of the system you propose to build. Your project team must
have no more than 2 members.
Your description must contain the following information:
- The application you propose to build: what the clients and servers
do, what types of requests the client sends, how the requests are
processed by the multiple backend servers, and so on.
- The architecture of the client and servers (single threaded/multithreaded/event-driven and so on).
- The interface between the user (you) and the client application and how requests will be generated by your client to the servers.
- The programming languages/frameworks you plan to use.
Please limit your description to 2-3 paragraphs, but be as specific
as possible in your proposal, so that we get a clear picture of what
you plan to do. If we have any feedback and need you to modify your
system in any way, we will get in touch with you within a week.
Note that you are not bound by this proposal, and you may tweak
(but, hopefully, not fully overhaul) your system design in the later phases
as you notice things to fix while coding. The goal of the initial
proposal is only to get you started with thinking about the design,
and to provide early feedback where needed.
Note that submission of proposals by the deadline for Phase 0 is
mandatory. This phase of the project carries no weightage in terms of
grades.
Phase 1: A simple prototype
In phase 1, you will build a
simple prototype of your system. You will build the client and the
multiple server tiers. Your servers must be able to handle multiple
concurrent clients.
You will need to make a few design choices in this phase, like the
architecture (multiprocess / multithreaded with master-worker
configuration / event-based) to use at the server tiers. Your client
can be an interactive program with as fancy a user interface as you
like, in order to showcase all the features of your system. The goal
of this phase is correctness: multiple concurrent clients should be
able to send requests and obtain responses correctly from the server.
Submission instructions You must create a tar gzipped file, with the filename being the roll numbers of all the team members separated by an underscore ("_"). The tarball should contain the following:
- All code/scripts for the client, server, and other components in your system.
- A report (PDF, though .txt is also fine) describing the high-level architecture of your system, the various request types and what to expect for each request, and other such details. Use figures where appropriate to describe your system architecture.
- A readme.txt describing the process to install and run your code. You must mention any additional software that must be installed to run your code. Please provide clear instructions to compile, run, and test your code.
Your project submission will be graded in a 15-minute demo by the TAs. You will need to set your system up using your submitted code on our lab machines (running Ubuntu 16.04) during the demo. Please provide enough scripts (e.g., to load data into your database) to ensure that the system can be set up quickly in the given demo slot. If you are using a database like MySQL, make sure that the name of the database and the login credentials can be changed easily to avoid conflicts with other teams.
We will grade phase 1 of your project on three aspects: the clarity of your report (the functionality of the system should be clearly described), the correct working of your system in handling each type of request defined, and the ability of your system to handle multiple requests concurrently.
Phase 2: Load testing
In phase 2, you will perform a load test of your system. You will
first write a multithreaded load generator that simulates multiple
clients (in place of a single client that sends one request at a time)
to bombard your server tiers with multiple requests in parallel. Your
simulated clients in the load generator should not have any
interactivity (e.g., soliciting user input, or pausing to think between
requests), but must issue requests to the server in an automated
fashion. Your load test can be an open loop or a closed loop test. The
parameters to your load generator must be the number of concurrent
users (in the case of closed loop) or the rate of requests per second
(in the case of open loop). You may also have other inputs to the load
generator like the server IP, port, and the duration over which the
load test should be performed. Here are some tips on how you can run a load test correctly.
The outcome of your load test should be a characterization of your
server's performance (throughput and response time) as a function of
increasing levels of load, and an identification of the saturation
throughput of your server and the bottleneck resource at
saturation. You may also profile your code to verify that the
bottleneck you observe is indeed justified.
- If you are performing a closed loop test, you may design your
load generator as follows: to simulate N concurrent users, use
N threads in your load generator. Each of the N threads will
emulate one user by issuing one request, waiting for it to complete,
and issuing the next request immediately afterwards. It is advisable
not to use any think time between requests, in order to effectively
load the server.
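The closed loop scheme above can be sketched in a few lines of Python; the `send_request` stub below is a placeholder for your system's actual client logic (e.g., one TCP request/response against your frontend).

```python
import threading
import time

def send_request():
    # Placeholder for one request/response against your frontend
    # (e.g., connect over TCP, send the request, wait for the reply).
    time.sleep(0.001)

def emulated_user(duration_s, results, lock):
    # One closed loop user: issue a request, wait for it to complete,
    # then immediately issue the next one -- no think time.
    deadline = time.monotonic() + duration_s
    done = 0
    while time.monotonic() < deadline:
        send_request()
        done += 1
    with lock:
        results[0] += done

def run_closed_loop(n_users, duration_s):
    # N threads simulate N concurrent users; the shared counter
    # yields the completed-requests-per-second throughput.
    results, lock = [0], threading.Lock()
    threads = [threading.Thread(target=emulated_user,
                                args=(duration_s, results, lock))
               for _ in range(n_users)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results[0] / duration_s

throughput = run_closed_loop(n_users=4, duration_s=0.5)
print(round(throughput))
```

Sweeping `n_users` upward across separate runs then gives you the load axis for your throughput and response-time graphs.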
- If you are performing an open loop test, you may use a timer that
goes off periodically (e.g., every 1/lambda seconds to generate
requests at the rate of lambda) and send a request when the timer goes
off. You will
have to take care of a few subtle issues in an open loop
generator. For example, if your rate of request generation is 10,000
req/s, then your timer must go off once every 100
microseconds. However, software timers may not work at such a fine
granularity. For example, if you set a timer to go off in 100
microseconds, the kernel may schedule your process only after 1000
microseconds. Your code must be able to detect such cases and
compensate for it, say, by issuing multiple requests together over
multiple timer intervals at a time. As a result of such issues, you
will need to carefully test your code to check that the rate at which
requests are being generated indeed matches your configured
rate. Further, you must use a large enough pool of threads (Little's
law should tell you if your pool of threads is sufficient or not) or
an event-driven design to handle multiple open connections in the load
generator. (Just a word of caution that open loop load generators are
much harder to get right than closed loop load generators, so if you
do choose this option, please be careful and meticulous in your
implementation.)
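One way to implement the catch-up logic described above is to track the ideal next-issue time and, on every wakeup, issue every request whose due time has already passed. A rough sketch under that assumption, with a no-op request so it is self-contained:

```python
import time

def open_loop(rate_per_s, duration_s, issue):
    # Issue requests at rate_per_s on average. If a sleep overshoots
    # (coarse software timers, kernel scheduling delay), compensate by
    # issuing every request whose due time has already passed.
    interval = 1.0 / rate_per_s
    next_due = time.monotonic()
    deadline = next_due + duration_s
    issued = 0
    while time.monotonic() < deadline:
        now = time.monotonic()
        while next_due <= now:  # catch up after a late wakeup
            issue()
            issued += 1
            next_due += interval
        time.sleep(max(0.0, next_due - time.monotonic()))
    return issued

# Little's law sanity check: sustaining lambda req/s with mean response
# time R needs about lambda * R requests outstanding at once, so size
# your thread pool (or open-connection pool) accordingly.
n = open_loop(rate_per_s=2000, duration_s=0.5, issue=lambda: None)
print(n)  # roughly rate_per_s * duration_s = 1000
```

Comparing the returned count against `rate_per_s * duration_s` is exactly the "does my achieved rate match my configured rate" check the text asks for.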
- If your system handles multiple types of requests (e.g., storing
and fetching files), your load generator can issue multiple types of
requests in a certain predefined ratio (e.g., 50% store requests and
50% fetch requests). For example, in a closed loop emulator, one
emulated user can issue multiple types of requests one after the
other. Otherwise, your load generator can issue only one type of
request at a time, and perform the load test only for that type of
traffic. For example, you can store a set of files before the load
test at the server, and perform only fetch requests during your load
test. Either option is fine, as long as you run the load test
correctly for that option.
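If you choose the mixed-traffic option, the per-request draw against a predefined ratio can be as simple as the following sketch (the request-type names are hypothetical placeholders):

```python
import random

MIX = {"store": 0.5, "fetch": 0.5}  # predefined ratio of request types

def pick_request_type():
    # Draw one request type according to the configured mix.
    r, acc = random.random(), 0.0
    for kind, frac in MIX.items():
        acc += frac
        if r < acc:
            return kind
    return kind  # guard against floating-point round-off

# Sanity check: over many draws the counts should track the ratio.
counts = {kind: 0 for kind in MIX}
for _ in range(10000):
    counts[pick_request_type()] += 1
print(counts)  # roughly 50/50
```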
- In your load test, you must gradually increase the load on your
server either by increasing the number of concurrent users (closed
loop) or the arrival rate of requests (open loop). You must run
separate experiments for each value of load; do not change the load in
the middle of an experiment. For a given value of load, you must let
your experiment run for a long period of time (say, 5 or 10 minutes),
and measure the following parameters of your system in steady state
(that is, once they reach steady values): the average throughput of
your server (the number of requests/sec being successfully completed,
averaged over the entire experiment) and the response time of the
server (the average time taken for a request to complete, averaged
over all completed requests). You may measure these values within your
load generator itself, by keeping suitable counters. Ideally, you
should test enough values of load to see the throughput of the system
increase and eventually flatten out when it reaches saturation.
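The two metrics can be accumulated inside the load generator with a pair of counters, as sketched below (single-threaded for brevity; guard the counters with a lock if multiple client threads share one `Stats` object):

```python
import time

class Stats:
    # Counters kept inside the load generator: number of completed
    # requests and total latency, from which the steady-state averages
    # are computed at the end of the run.
    def __init__(self):
        self.completed = 0
        self.total_latency = 0.0

    def timed(self, request_fn):
        # Time one request end to end and update the counters.
        start = time.monotonic()
        request_fn()
        self.total_latency += time.monotonic() - start
        self.completed += 1

    def report(self, duration_s):
        throughput = self.completed / duration_s            # req/s
        avg_response = self.total_latency / self.completed  # seconds
        return throughput, avg_response

stats = Stats()
for _ in range(100):
    stats.timed(lambda: time.sleep(0.001))  # stand-in for a real request
tput, resp = stats.report(duration_s=1.0)
print(tput, resp)
```

Discarding the warm-up portion of the run before reading the counters gives a cleaner steady-state measurement.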
- It is recommended that you use two separate machines for this phase
of the project: one to host the load generator and the other to host
the server tiers. You may use different virtual machines (VMs) on a
single physical machine as well. Or, you may pin processes to CPUs
using the 'taskset' command to assign dedicated CPU cores to the load
generator and servers. Unless there is a separation of resources in
some form between the load generator and the server, you cannot be
sure that you are saturating your server.
- When you run the load test, you must be careful to ensure
that the load generator itself is not the bottleneck, and that the
load generator is able to effectively generate load at the specified
rate. You may allocate more CPU resources to the load generator than
to your actual server in order to ensure that your load test is not
limited by the capabilities of the load generator. Note that you can
truly identify the capacity of a system only if you saturate it
effectively with the maximum load that it can handle, so it is very
important to ensure that the load generator itself is not the
bottleneck.
- Once your system hits saturation, measure the utilization of
various resources on the server (CPU, disk, network) to identify what
the bottleneck resource is. Ideally, your system should be limited by
one of these resources on one of the server tiers. You can measure
utilizations by using commands like top, iostat, netstat, etc. If you
find that you are limited by some resource on your load generator, you
must allocate more resources to your load generator to ensure that it
is not the bottleneck. If you find that your server throughput is
flattening out, but you do not see any resource at the server being
heavily utilized, you must check for some software bottleneck in your
system. Perhaps your server doesn't have enough threads to saturate
the CPU or some system parameters need to be tuned to improve
performance. Or maybe you are printing a lot of output to the screen
causing the server CPU to spend most of its time waiting on I/O
(always disable printing to screen and other such unnecessary
activities during a load test). You must investigate such issues and
repeat your load test. Your load test should conclude when your server
has hit its capacity at a certain load level due to a hardware
bottleneck.
- If your system uses the disk, be mindful of the disk buffer cache. If your requests access only a small number of files, those files may almost always be in the disk buffer cache during a load test, so the disk never becomes the bottleneck. Interpret your results accordingly.
Submission instructions For phase 2 of your project, you must
submit all your code and scripts on Moodle, much like in phase 1, to
enable us to conduct a demo and see your code working. In addition to
your code, you are required to submit a report containing the
following. This phase of your project will be graded by reading your
report (primarily) and by conducting a demo and/or reading your code.
- A brief description of your load generator and how you generated load.
- Graphs of the average throughput and average response time of
your system with increasing values of load. Label your axes and
explain your graphs clearly in the report.
- An identification of the bottleneck in your system at saturation,
and the capacity of your system at saturation.
Phase 3: Evaluating design alternatives and optimizing system performance.
Note: some of the teams could not finish phase 2 correctly. For
such teams, please do a thorough job of load testing itself as part of
phase 3. Teams that finished phase 2 successfully can go ahead and
read the rest of the guidelines for phase 3.
In this phase, you will work on one of two things: you can either
apply some of the techniques learnt in class to optimize the
performance of your system, or you can reimplement your system using
an alternate design and evaluate multiple design choices. Some
possible ideas are listed below.
- If you found that the CPU of some server component was the
performance bottleneck during phase 2, you can first profile your code
to understand which functions are consuming CPU time, what the cache
hit rates are, and so on. You can then try to apply some of the
optimizations learnt in class to reduce cache misses. Or, if you use
many locks in your code, you can check if performance improves by
moving to simple lockfree designs. If you do malloc/free many times
and your code is spending a lot of time in these functions, you can
try out different memory allocators more suited to your workload to
see if performance improves.
- If you find that the bottleneck is the
disk or network, you can change your server design to be limited by
the CPU (e.g., read files from memory and not from disk) in order to
shift the bottleneck to the CPU, after which you can apply one of
the above optimizations. Alternately, if the bottleneck is the network, you can
experiment with some high performance network stacks. If the
bottleneck is the disk, you can try out various new file systems that
are optimized for your workload. It is highly recommended that you
profile your system before optimizing it, in order to correctly
identify the part to be optimized.
- One other option for this phase of the project is to compare
multiple design alternatives in your original server design. For
example, if you have implemented a multithreaded blocking server, you
may try out an event-based non-blocking design and compare its
performance. You can evaluate any alternate design of your choice.
- Another aspect of your system to evaluate would be the multicore
scalability of your design. Does your system performance scale
linearly as you increase the number of cores given to the bottleneck
component? If it does not scale, can you profile the code and identify
the limiting factor for scaling? Can you fix this issue and make it
scale?
- If the bottleneck in your system is a third-party component like
mysql, it may be difficult for you to change that code to optimize
your system. In such cases, you can consider replacing such components
with other third party alternatives to see if performance
improves. For example, can your system be adapted to use a key-value
store like Redis instead of a SQL database? If so, does it improve
performance?
- You can also try out a simple distributed design of your
bottleneck component. For example, if you are using a database that is
the bottleneck in your load test, you can split your state across two
databases and partition your requests between them. Does this partitioning
lead to improvement in performance proportional to the extra
components you have added? For example, if you split your database
across two machines, do you get a twofold increase in throughput?
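A sketch of the key-space split, assuming requests are keyed by a string (the backend names below are placeholders): use a stable hash so a given key always lands on the same partition.

```python
import hashlib

BACKENDS = ["db0", "db1"]  # placeholder names for the two databases

def partition_for(key: str) -> str:
    # Use a stable hash (not Python's per-process randomized hash())
    # so a key maps to the same backend across runs and machines.
    digest = hashlib.sha256(key.encode()).digest()
    return BACKENDS[digest[0] % len(BACKENDS)]

print(partition_for("alice") == partition_for("alice"))  # deterministic
spread = {partition_for(f"key{i}") for i in range(100)}
print(sorted(spread))  # both partitions receive traffic
```

Your frontend (or load generator) would route each request to `partition_for(key)`; the interesting measurement is whether the two partitions together sustain roughly double the single-database saturation throughput.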
Whichever path you choose, you must perform a load test on your
optimized/alternate design much like in phase 2, and compare the
performance of the two versions of your design. You must explain the
difference in performance observed, and justify why the performance
changed or didn't change.
Note that the most important thing in this phase of the project is
to think creatively about your system and try out new ideas learnt in
class. Please do not be afraid of a negative result (no performance
improvement). It is ok to try something new and not find any
performance improvement, as long as you understand and can justify why
performance didn't improve.
Submission instructions You must submit your code and a
report (describing your optimizations and their impact) on Moodle.
This phase of the project will be primarily graded via demo. You
will showcase your entire system that you have built over the three
phases. The TAs will ask you to run and demonstrate the performance of
your system to verify that you have successfully built your system,
optimized it (to the best of your abilities), and tested its
performance. Please submit all code, scripts, installation
instructions, and other details of your system to enable the TAs to
fully evaluate your system.
Phase 4: Final presentation and viva
You will not be writing any new code in the final phase of your
project. The goal of this phase is to test that you have actually
understood the systems concepts involved in your project, and to
verify that all team members have contributed constructively to the
project.
In the final phase of your project, you will prepare a short
presentation describing your complete system (the basic design, any
optimizations you have done, your load generator, and so on) and your
results (initial performance, performance after optimizations, and so
on). You should prepare no more than 10 slides, and your presentation
should be under 10 minutes. You must sign up for a slot to make your
presentation. After the presentation, each member of the team will be
asked about their contribution to the project, and questioned on the
components they have contributed to.
Submission instructions A form will be circulated to sign up
for presentation slots.