Introduction

* What is a computer system? An application running over an operating system (OS), which in turn runs on machine hardware (CPU, memory, I/O). A distributed system can run on many machines, with coordination between the applications on the multiple machines. The OS acts as middleware between the application and the hardware.

* Let's take an example of a simple system. A user writes a program, say, a web server. The compiler translates the high-level program into instructions that the underlying hardware can follow (each hardware architecture, like Intel x86, has its own instruction set). When the application is started, the compiled program runs as a process on the hardware. The OS reads the executable from the disk, creates a memory image for the process in main memory, and schedules the process to run. (Real applications may consist of more than one process.)

* When the process is scheduled to run on a CPU, the CPU's program counter points to instructions in the memory image of the process, and the application begins execution. The CPU loads an instruction from memory, decodes it, and executes it. Some instructions load values from memory into CPU registers, some perform calculations on the values in the registers, and some store values back into memory. The CPU also has caches of recently used instructions and data (L1, L2, L3 caches) to avoid an expensive memory access on every instruction. Modern CPUs have multiple execution cores, and execute many instructions in parallel in a pipelined fashion on each core.

* Periodically, the application process's execution may be halted, and the OS takes over. When does this happen?
  - System calls (software interrupts): the application requires some service from the OS, e.g., accessing I/O devices like the disk or network card.
  - Traps: the application process generated an error, requiring OS intervention.
  - Interrupts: an external device generates an event that requires some processing from the OS.

* A process normally executes application code in user mode. When a system call/trap/interrupt occurs, the CPU moves to kernel mode. The OS saves the context of the process (various CPU registers and the program counter), services the interrupt in kernel mode, then restores the context and resumes execution of the process in user mode. The CPU runs certain privileged instructions only in kernel mode, ensuring protection and isolation among user processes. The OS can also periodically context switch from one process to another, restoring the context of a different process after an interrupt. Thus, a very simplified view of a computer system consists of one or more application processes executing on hardware, with periodic switches to kernel mode to handle interrupts.

* System calls form the API between the application and the OS. Modern operating systems have system calls to create processes and threads, allocate memory to these processes, enable communication and synchronization between processes, enable processes to access the network, disk, and other external devices, and so on. The POSIX API provides a standardized set of system calls across operating systems. Applications can invoke system calls directly, or via an intermediate library (e.g., libc).

* Applications typically have multiple processes (or threads). Why? Because some system calls block, i.e., the OS cannot return to the same process immediately. In such cases, the OS context switches to another process or thread, and the application can still make progress. Thus, computer systems consist of multiple processes (or threads) running concurrently.

* What will we study in this course?
  - Processes, threads, and how they are composed to form an application. At a high level, real applications consist of one or more processes/threads that interact with each other.
We will study various multithreaded/multiprocess application architectures.
  - How do you measure the performance of an application? We will understand concepts like throughput, latency, bottleneck resource, and the capacity of an application.
  - Optimizing the placement and access of data in memory, thereby reducing cache misses, paging-related overheads, etc.
  - Optimizing the performance of an application running on multicore systems. When multiple threads and processes running on multiple cores contend for certain common resources (e.g., the same memory location), performance degrades. We will study techniques to scale application performance on multiple cores.
  - Optimizing network access. Network link speeds are increasing, and the OS and applications need to be better designed to communicate at higher speeds. We will study various techniques to enable high-speed networking.
  - Time permitting, we will study other optimizations, like dynamic memory allocation algorithms in modern systems, newer file system designs, and so on.
  - Finally, we will study how distributed systems work (only the basic principles; a detailed treatment is a full course by itself).

* Overview of the course project.