Socket Programming: synchronous and event-driven I/O.
* Important socket system calls: socket (to create a socket), bind (to bind to an IP address/port number), listen (to start receiving incoming connections on a socket), connect (to connect to a remote IP/port), accept (to accept a new connection), and read and write.
* Socket communication happens between a client and a server. The server socket binds to a well-known IP address/port with the bind system call, whereas the client can use any available port on its machine. The server then listens for new connections. The connect and accept calls establish a connection between the client and server via the TCP handshake. The accept call returns at the server with a new file descriptor, i.e., the TCP server has a separate file descriptor for every client connection, in addition to the main listening socket file descriptor. This new file descriptor is used to read from and write to a specific client. (See the TCP server sketch after this list.)
* Communication over UDP sockets does not require the connect/accept system calls. Once sockets are created, data can be sent and received via read/write (or sendto/recvfrom) system calls. In the case of UDP, communication with multiple clients happens over the same socket. (See the UDP sketch after this list.)
* What happens on a write system call to a socket? Every socket has a pair of buffers (memory pages) to store sent and received data. On a write system call, data is copied into the send buffer, TCP/IP/Ethernet headers are added, and the packet is queued to be sent at the device driver. The write call returns after this processing; it does not block until data reaches the other endpoint. In the case of TCP sockets, data is kept in the send buffer until an ACK is received (for reliability); no such thing is done for UDP.
* On a read system call, the receive buffer is checked for data. If no data is present, the read system call blocks and returns when data is available. When data arrives on the network interface card (NIC), an interrupt is raised. While servicing the interrupt, the OS performs TCP/IP processing, copies the packet to the correct receive buffer, and unblocks the process. When the process resumes, it copies the packet from the receive buffer into its application data structures and proceeds to process it.
* The concept of DMA: modern NICs and disks copy data to a predetermined memory page via direct memory access (DMA) before raising interrupts, so that interrupt processing can finish quickly without spending time on data copies. An interrupt is still raised, but it is handled quickly because the packet copy has already been done.
* Top halves and bottom halves: interrupt processing in Linux is split into two parts: the top half does the bare minimum, while the bottom half does the extra work like TCP/IP processing. This split is to ensure minimal disruption to running processes due to interrupts.
* Now, the accept and read system calls block if no new connections/data are available. How can an application handle network I/O from multiple sockets if reading on any one socket can block? One way is to use multiple threads to handle separate clients and block in each of them. For example, a multithreaded server can have one main process listening for new connections via accept. For every newly accepted connection, the file descriptor of the new connection is passed to a thread, which blocks on reads and services that connection. This multithreaded or multiprocess architecture with a master process and multiple worker threads is quite commonly used. (See the multithreaded sketch after this list.)
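As a concrete illustration of the call sequence above, here is a minimal TCP echo server sketch in C. Error handling is omitted for brevity, and port 8080 is an arbitrary example. The client side mirrors this with socket followed by connect to the server's IP/port, after which it uses read/write on the same descriptor.

```c
/* Minimal TCP echo server: socket -> bind -> listen -> accept -> read/write.
 * Error handling omitted; port 8080 is an arbitrary example. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void) {
    int listen_fd = socket(AF_INET, SOCK_STREAM, 0);   /* TCP socket */

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);          /* any local IP */
    addr.sin_port = htons(8080);                       /* well-known port */

    bind(listen_fd, (struct sockaddr *)&addr, sizeof(addr));
    listen(listen_fd, 10);                             /* backlog of 10 */

    for (;;) {
        /* accept returns a NEW fd for this client; listen_fd stays open */
        int conn_fd = accept(listen_fd, NULL, NULL);

        char buf[1024];
        ssize_t n;
        while ((n = read(conn_fd, buf, sizeof(buf))) > 0)
            write(conn_fd, buf, n);                    /* echo back */
        close(conn_fd);
    }
}
```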
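For comparison, a minimal UDP server sketch: there is no listen/accept, and a single socket serves all clients. The recvfrom/sendto variants of read/write are used here because a UDP server needs each datagram's sender address in order to reply (plain read/write work once a UDP socket is connect()ed to a fixed peer). Again, error handling is omitted and the port is an arbitrary example.

```c
/* Minimal UDP server: no listen/accept/connect; one socket serves all
 * clients. recvfrom/sendto report the peer's address per datagram. */
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void) {
    int fd = socket(AF_INET, SOCK_DGRAM, 0);           /* UDP socket */

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(8080);                       /* arbitrary example port */
    bind(fd, (struct sockaddr *)&addr, sizeof(addr));

    for (;;) {
        char buf[1024];
        struct sockaddr_in peer;
        socklen_t peer_len = sizeof(peer);
        /* one socket, many clients: each datagram carries its sender */
        ssize_t n = recvfrom(fd, buf, sizeof(buf), 0,
                             (struct sockaddr *)&peer, &peer_len);
        if (n > 0)
            sendto(fd, buf, n, 0, (struct sockaddr *)&peer, peer_len);
    }
}
```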
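And a sketch of the one-thread-per-connection architecture from the last bullet; the serve_client worker and the echo logic are illustrative placeholders for real request processing. Compile with -pthread. Note how the main thread blocks only in accept, while each worker blocks in read for its own client.

```c
/* One-thread-per-connection sketch: main thread blocks in accept,
 * each worker blocks in read for its own client. Errors not handled. */
#include <netinet/in.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

static void *serve_client(void *arg) {
    int conn_fd = *(int *)arg;
    free(arg);
    char buf[1024];
    ssize_t n;
    while ((n = read(conn_fd, buf, sizeof(buf))) > 0)  /* blocks per client */
        write(conn_fd, buf, n);
    close(conn_fd);
    return NULL;
}

int main(void) {
    int listen_fd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(8080);
    bind(listen_fd, (struct sockaddr *)&addr, sizeof(addr));
    listen(listen_fd, 10);

    for (;;) {
        /* heap-allocate the fd so each thread gets its own copy */
        int *fd = malloc(sizeof(int));
        *fd = accept(listen_fd, NULL, NULL);           /* main thread blocks here */
        pthread_t tid;
        pthread_create(&tid, NULL, serve_client, fd);
        pthread_detach(tid);                           /* reap automatically */
    }
}
```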
* Alternatives to multithreaded servers with blocking system calls:
  - All sockets can be set to non-blocking mode (using socket options), and the application can periodically read from all sockets to see which has data. When set to non-blocking mode, read on a socket returns even if there is no data. However, this method wastes a lot of CPU cycles. (See the non-blocking sketch below.)
  - An event-driven I/O mechanism like select/epoll can be used. In such mechanisms, the kernel monitors multiple sockets and returns when there is an event (e.g., a packet received) on any socket. Based on what the event is, functions or "callbacks" are invoked to handle it. (See the epoll sketch below.)
* Event-driven I/O is also called asynchronous I/O, because sending a request for I/O and receiving the reply happen at different times. For example, the application writes to a socket and does other things before the reply comes back.
* Advantages of event-driven I/O: a single process/thread can handle multiple clients, as there is no blocking. Further, since each event is processed to completion without being interrupted by another thread sharing the same data, race conditions are less of a problem.
* Disadvantages of event-driven I/O: writing asynchronous/event-driven code is hard, as the processing of a given request is spread across multiple places. One needs to think in terms of callbacks for each event and split processing across multiple events. In some sense, when a blocking system call is used, the user stack holds state across the context switch, and this state is reloaded when the process is ready to run. In the case of async I/O, the state that would have been on the stack (local variables, etc.) must be passed around manually between callback functions, complicating the code.
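A sketch of the non-blocking/polling approach from the first sub-bullet, assuming the fds array of client sockets is populated elsewhere. With O_NONBLOCK set, read returns -1 with errno set to EAGAIN/EWOULDBLOCK instead of blocking, which is exactly why this loop burns CPU spinning over idle sockets.

```c
/* Non-blocking polling: fcntl sets O_NONBLOCK, after which read() returns
 * -1 with errno == EAGAIN/EWOULDBLOCK instead of blocking. The fds array
 * is assumed to be filled in elsewhere. */
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

void set_nonblocking(int fd) {
    int flags = fcntl(fd, F_GETFL, 0);
    fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* busy-polling loop over a set of client fds -- wastes CPU when idle */
void poll_loop(int *fds, int nfds) {
    char buf[1024];
    for (;;) {
        for (int i = 0; i < nfds; i++) {
            ssize_t n = read(fds[i], buf, sizeof(buf));
            if (n > 0) {
                /* ... process buf ... */
            } else if (n == 0) {
                /* peer closed the connection */
            } else if (errno != EAGAIN && errno != EWOULDBLOCK) {
                /* real error on fds[i]; EAGAIN just means no data yet */
            }
        }
    }
}
```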
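Finally, a sketch of the event-driven alternative using Linux epoll: one thread monitors the listening socket and all client sockets, and the kernel reports which descriptors are ready. The conn_state struct and handle_data callback are illustrative names, not a standard API; conn_state stands in for exactly the state that blocking code would have kept on its stack and that event-driven code must carry between callbacks by hand, as the last bullet above describes.

```c
/* Event loop with epoll: the only blocking point is epoll_wait. */
#include <netinet/in.h>
#include <stdlib.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

#define MAX_EVENTS 64

struct conn_state {            /* per-connection state passed between callbacks */
    int fd;
    size_t bytes_seen;         /* e.g., progress of a partially read request */
};

static void handle_data(struct conn_state *c) {
    char buf[1024];
    ssize_t n = read(c->fd, buf, sizeof(buf));  /* won't block: fd is readable */
    if (n <= 0) { close(c->fd); free(c); return; }
    c->bytes_seen += n;
    write(c->fd, buf, n);      /* echo back */
}

int main(void) {
    int listen_fd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(8080);
    bind(listen_fd, (struct sockaddr *)&addr, sizeof(addr));
    listen(listen_fd, 10);

    int epfd = epoll_create1(0);
    struct epoll_event ev = { .events = EPOLLIN, .data.ptr = NULL };
    epoll_ctl(epfd, EPOLL_CTL_ADD, listen_fd, &ev);   /* NULL marks the listener */

    for (;;) {
        struct epoll_event events[MAX_EVENTS];
        int n = epoll_wait(epfd, events, MAX_EVENTS, -1);  /* blocks here only */
        for (int i = 0; i < n; i++) {
            if (events[i].data.ptr == NULL) {              /* new connection */
                int conn_fd = accept(listen_fd, NULL, NULL);
                struct conn_state *c = malloc(sizeof(*c));
                c->fd = conn_fd;
                c->bytes_seen = 0;
                struct epoll_event cev = { .events = EPOLLIN, .data.ptr = c };
                epoll_ctl(epfd, EPOLL_CTL_ADD, conn_fd, &cev);
            } else {
                handle_data(events[i].data.ptr);           /* readable client */
            }
        }
    }
}
```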