Software Defined Networking
===========================

Outline
- Key ideas of traditional networks vs SDN, history
- Ethane - the motivation
- OpenFlow - the interface
- Onix - SDN controllers
- Applications - B4 by Google
=================================================

* To understand what's new about this trend of software defined networking or SDN, let's first see what traditional networking looks like:
  - Distributed state. The state about the network (connectivity, links) and policy (which user should talk to whom, how to configure firewalls, access control lists, etc.) is distributed. This was a conscious decision, made for scalability. No single node has a global view of the network, or a view of all name bindings (e.g., DNS name to IP, IP to MAC).
  - Integrated control and forwarding planes: routers do route computation (control plane) as well as forward packets based on the decisions made by the control plane (data plane).
  - Most networking is done in hardware (think switches, routers). Any new networking functionality has to be built with hardware support. For example, if you wanted to implement a new QoS scheduler, you had to contact a switch vendor, who had to build a custom ASIC for the functionality. So building new features was a hard process, leading to slow change in networking protocols.

* In contrast, here is what software defined networking or SDN means (at a high level):
  - Global visibility of the entire network state and name bindings in a central location, called the controller. The controller has all inputs about the network, as well as the policy that needs to be implemented for all users.
  - Separation of the control plane and the forwarding plane. The controller, which has a global view of the network, performs route computation and pushes the forwarding tables to the switches. The forwarding itself is done by the switches.
  - The switches are fairly simple, and expose a common API. The software controller can set the state in the switches to implement any fancy idea. So all the innovation happens in software at the controller, and the switches just carry out the controller's instructions, without doing anything smart on their own.

* SDN is a major buzzword among researchers as well as in the industry today. The reference provided below about the history of SDN tells the story of how very similar ideas were proposed many years in the past as well. So what's the reason for all this buzz now? We will walk through the ideas in the order they were developed.

* The latest interest in SDN started with a research idea described in the "Ethane" paper below. Ethane was envisioned as a system to manage enterprise networks (say, the network inside an organization like IITB), and not as a replacement for any networking protocol globally.
  - Ethane says that all users, switches, and other network elements in an enterprise should be managed by a central controller. The controller has global visibility of all network connections and policy. For example, any switch that starts up checks in with the controller. Every user authenticates with the controller. The controller knows all DNS and ARP bindings. The controller manages DHCP and other such services as well.
  - In addition to network state, policy is also centralized. In current networks, policy is expressed as a bunch of firewall rules at different locations, over things like IP addresses which can change. In Ethane, policy is specified at the controller, over high-level names.
  - The controller computes shortest paths based on global knowledge. The controller also computes rules based on policy (e.g., firewall rules). When the first packet of a flow arrives, the first switch that gets the packet doesn't know what to do with it, so it sends the packet to the controller. The controller computes the shortest path, the applicable policy, etc. for that flow, installs this state in all switches along the path, and returns the packet. All subsequent packets of that flow follow these rules. [Note that a controller may choose not to install flow table rules for some traffic, and instead receive every packet of that flow, for example, DHCP packets.]
  - Switches need not have complicated forwarding logic, just "flow tables". That is, the switches match packets by some fields in the packet headers, and perform simple actions (forward along port X, drop, etc.) for packets that match a certain flow. Switches also maintain statistics about flows that the controller can use.
  - Ethane proposed a simple implementation, a policy language, etc. Ethane was a precursor to the latest wave of SDN research. (A small sketch of this reactive flow setup follows below.)
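* To make the reactive flow setup concrete, here is a minimal sketch in Python. This is not the actual Ethane implementation; all names here (EthaneController, handle_packet_in, the flow-table representation) are hypothetical, and real rules would also carry output ports, priorities, and timeouts.

    # A minimal sketch of Ethane-style reactive flow setup -- hypothetical
    # names, not the actual Ethane implementation. The controller holds the
    # global view: topology, user locations, and policy over high-level names.
    from collections import deque

    class EthaneController:
        def __init__(self, topology, policy):
            self.topology = topology      # switch id -> set of neighbor switch ids
            self.policy = policy          # policy(src_user, dst_user) -> bool
            self.location = {}            # user -> switch id (learned at authentication)
            self.flow_tables = {sw: [] for sw in topology}   # installed (match, action) rules

        def shortest_path(self, src, dst):
            # BFS is enough for a sketch: the controller has the whole graph.
            parent, queue = {src: None}, deque([src])
            while queue:
                sw = queue.popleft()
                if sw == dst:
                    path = []
                    while sw is not None:
                        path.append(sw)
                        sw = parent[sw]
                    return path[::-1]
                for nbr in self.topology[sw] - parent.keys():
                    parent[nbr] = sw
                    queue.append(nbr)
            return None

        def handle_packet_in(self, flow_key, src_user, dst_user):
            # First packet of a flow: check policy, compute path, install rules.
            if not self.policy(src_user, dst_user):
                return "drop"             # denied by policy; nothing installed
            path = self.shortest_path(self.location[src_user], self.location[dst_user])
            for sw in path:
                self.flow_tables[sw].append((flow_key, "forward"))
            return "forward"              # subsequent packets match in the switches

    # Usage: two users on a three-switch line topology, permissive policy.
    topo = {"s1": {"s2"}, "s2": {"s1", "s3"}, "s3": {"s2"}}
    ctl = EthaneController(topo, policy=lambda a, b: True)
    ctl.location = {"alice": "s1", "bob": "s3"}
    print(ctl.handle_packet_in(("10.0.0.1", "10.0.0.2", 80), "alice", "bob"))

  Note how the controller, not the switch, makes every decision: the switch's only jobs are to report the table miss and to apply the installed rules.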
* Once the basic ideas took hold, researchers wanted to standardize this interface between the controller and the switches. The main motivation was that, if switches exposed a simple interface such as flow tables, then researchers could try and test out various ideas (like Ethane) in existing large campus networks. This led to the definition of the "OpenFlow" protocol, and the concept of an OpenFlow switch.
  - Most switches already have flow tables to store some of their internal state. These flow tables are built from TCAMs (ternary content addressable memories). A CAM matches an exact bit pattern in a packet (usually the headers), while a TCAM can match bit patterns with wildcards. So a TCAM can match a certain pattern in packet headers, and can be used to implement flow tables.
  - A flow table has a pattern to match on the packet headers (input port, VLAN ID, layer 2 and layer 3 source and destination addresses, TCP or UDP ports, etc.). For packets that match a certain header pattern, some actions are taken (drop, forward on a certain port, etc.). Flow tables can also maintain statistics about how many packets matched a certain pattern.
  - An OpenFlow switch is one that exposes these flow tables to external entities to configure. A controller configures these entries using the OpenFlow protocol. An OpenFlow switch has a secure communication channel with the controller, over which it gets instructions on how to configure its flow tables, and reports statistics. By standardizing this interface, one can use any switch that supports OpenFlow with any controller that speaks OpenFlow.
  - The initial vision was that if all campus switches supported OpenFlow, researchers could run their own ideas on certain flows by configuring the flow tables using OpenFlow.

* The idea of OpenFlow was embraced by a few switch vendors quite easily. Switch vendors had already begun to expose some APIs, and adding support for OpenFlow was not a big deal. It was around this time that switches were being built from "merchant silicon", i.e., chipsets from standard chipmakers. That is, switch makers were assembling switches from components made by chipset makers, instead of custom building the entire switch. Therefore, opening up some interfaces was easier.

* There is a lot of work on standardizing OpenFlow, and several versions have come up. Later versions of OpenFlow match on more header fields, and have more complicated actions associated with them (metering or rate limiting flows, pushing and popping MPLS labels, etc.). There can also be a series of flow tables that are checked in order, to express priority among rules. (A small sketch of the match-action lookup follows below.)
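* Here is a small runnable sketch of the flow-table abstraction just described: TCAM-style wildcard matching, rule priorities, per-rule counters, and a table miss that punts to the controller. The structure is hypothetical and much simplified compared to real OpenFlow messages and hardware TCAMs.

    # A sketch of a single flow table with TCAM-style wildcard matching --
    # hypothetical structure for illustration, not the OpenFlow wire format.
    WILDCARD = None   # a "don't care" field, like the x bits of a TCAM

    class FlowEntry:
        def __init__(self, match, action, priority=0):
            self.match = match          # dict: header field -> required value (or WILDCARD)
            self.action = action        # e.g., ("forward", port) or ("drop",)
            self.priority = priority
            self.packet_count = 0       # per-rule statistics, readable by the controller

        def matches(self, pkt):
            # A WILDCARD field matches anything; others must match exactly.
            return all(v == WILDCARD or pkt.get(f) == v for f, v in self.match.items())

    class FlowTable:
        def __init__(self):
            self.entries = []

        def install(self, entry):
            # What the controller does over the OpenFlow channel.
            self.entries.append(entry)
            self.entries.sort(key=lambda e: -e.priority)   # highest priority first

        def lookup(self, pkt):
            for entry in self.entries:
                if entry.matches(pkt):
                    entry.packet_count += 1
                    return entry.action
            return ("send_to_controller",)   # table miss: punt, as in Ethane

    # Usage: block telnet (TCP port 23), forward the rest of this subnet on port 2.
    table = FlowTable()
    table.install(FlowEntry({"tcp_dst": 23}, ("drop",), priority=10))
    table.install(FlowEntry({"ip_dst_prefix": "10.1", "tcp_dst": WILDCARD}, ("forward", 2), priority=1))
    print(table.lookup({"ip_dst_prefix": "10.1", "tcp_dst": 23}))   # ('drop',)
    print(table.lookup({"ip_dst_prefix": "10.1", "tcp_dst": 80}))   # ('forward', 2)

  A hardware TCAM does all installed comparisons in parallel in one lookup; the loop here is only a software stand-in for that behavior.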
* Note that SDN is not all about OpenFlow. OpenFlow is only one protocol that can be used to implement the idea of SDN. Several such standardized ways to configure switches from software can and will exist.

* Has switch design gotten simpler with OpenFlow? Not really. Switches need to support the complicated OpenFlow operations, as well as work with legacy IP forwarding (for backward compatibility). In fact, an OpenFlow controller can simply direct some flows through the "regular" switch path, by setting the action to "go through the regular switch path". Most switches are such "hybrid" switches, and not pure OpenFlow switches. So switches haven't really become simpler, but they have become more customizable. That is, implementing new functionality does not need new hardware, just smart logic at the controller and configuration via OpenFlow.

* Now we come to the controller. Initially, the controller was just one PC; this worked for small prototypes like Ethane. However, researchers have worked on distributed controller frameworks, to make the controller more scalable. The references list one such controller design, "Onix". Note that several open source controllers are being developed today as well.

* What is an SDN controller? A controller maintains the global state of the network. It exposes this network state over its "northbound" API, so that applications can be built using this state to configure the network. Once an application decides how the network should be configured, the controller translates these instructions into OpenFlow commands on its "southbound" interface, and configures the OpenFlow switches accordingly. You can think of it as a network operating system, one that abstracts out the details of low-level switches and enables the development of high-level applications.

* The main ideas of the Onix controller:
  - Onix maintains a "Network Information Base" or NIB. The NIB has several entities like nodes, ports, forwarding tables, etc. Onix builds the NIB based on events in the network (a switch is installed, a host is connected, etc.). Applications can decide what information should be imported into / exported from the NIB.
  - Applications built on top of Onix read and modify state in the NIB. Whenever any change happens to the NIB (say, a new forwarding table entry is added), Onix translates this change into OpenFlow commands for the relevant switches and does the low-level configuration.
  - Onix has mechanisms for scalability. For example, the network view can be partitioned across multiple Onix instances.
  - Onix also guarantees consistency of the network state across multiple replicas. Applications can choose strong consistency semantics for important data (using replicated state machines) or weak semantics (using a DHT) for not-so-important data.
  - In the end, Onix does not provide any SDN logic by itself, but enables easy development of SDN applications. For example, an idea like Ethane can be easily implemented on top of Onix.

* Now, all these ideas about SDN are only useful if there are good applications that do things that were hard to do in hardware before, but can be done easily using SDN. While Ethane is one such example, Google has also come up with a good use case for SDN. Google has built their WAN (wide area network), which connects several datacenters, using SDN concepts. The resulting system is called B4; see the references for the paper.

* Main ideas of B4:
  - Google has a WAN carrying lots of data. Applications are elastic; they care about bandwidth, not latency.
  - Need to implement traffic engineering (TE) for optimal utilization of link bandwidth. Google wanted centralized traffic engineering, which is feasible due to the small number of sites.
  - Packet loss on WANs is bad (RTTs are high, and TCP doesn't get high throughput with losses), so commercial WAN switches have deep buffers and are expensive. Google did not want to use such expensive deep-buffered switches; instead, it wanted to build simple switches from "merchant" silicon, and avoid loss by carefully doing traffic engineering and load balancing.
  - So Google built their own switches based on a fat-tree topology, by assembling smaller chipsets. Each switch ran an OpenFlow agent. Each site had an OpenFlow controller built on top of Onix. Finally, there was a central TE server that was the application on top of the OpenFlow controllers.
  - The sites ran BGP between them. BGP routing tables were converted to OpenFlow rules by the controller. So the network was a hybrid (not pure) OpenFlow network. BGP still worked in a distributed fashion, exchanging messages between sites. These messages were relayed from the switches to the controller, which ran the BGP daemon and performed the route calculations.
  - Centralized traffic engineering decides how to route large flows. TE was implemented using IP-in-IP tunnels, and the tunnels were set up using OpenFlow. TE works alongside normal routing: if TE is disabled, the normal paths computed by BGP are used.
  - Note that this seems like a more efficient alternative to distributed MPLS-based TE, where labels are distributed by BGP etc. (A sketch of such centralized bandwidth allocation follows below.)
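* To illustrate what a centralized TE application can do with a global view, here is a much-simplified sketch. This is not B4's actual TE algorithm (which allocates bandwidth max-min fairly using per-application bandwidth functions); it just greedily splits each site-to-site demand across precomputed tunnels subject to link capacities, the kind of computation only a central server with global knowledge can do. All names and numbers here are made up.

    # A simplified sketch of centralized traffic engineering -- illustration
    # only, not B4's actual algorithm. With a global view of link capacities,
    # the TE server splits each site-to-site demand across precomputed tunnels
    # (paths); the result would then be installed as IP-in-IP tunnel rules
    # via OpenFlow.

    def allocate(demands, tunnels, capacity):
        """demands: {flow: Gbps needed}; tunnels: {flow: [path, ...]} where a
        path is a list of links; capacity: {link: Gbps}.
        Returns {flow: [(path, Gbps), ...]}."""
        residual = dict(capacity)
        allocation = {flow: [] for flow in demands}
        for flow, demand in demands.items():
            for path in tunnels[flow]:
                if demand <= 0:
                    break
                # A tunnel can carry at most its bottleneck residual capacity.
                bottleneck = min(residual[link] for link in path)
                share = min(demand, bottleneck)
                if share > 0:
                    allocation[flow].append((path, share))
                    for link in path:
                        residual[link] -= share
                    demand -= share
            if demand > 0:
                print(flow, ":", demand, "Gbps unsatisfied")  # would trigger rate limiting
        return allocation

    # Usage: three sites A, B, C; every link has 10 Gbps capacity.
    capacity = {("A", "B"): 10, ("B", "C"): 10, ("A", "C"): 10}
    tunnels = {
        "A->C": [[("A", "C")], [("A", "B"), ("B", "C")]],   # direct path, then via B
        "B->C": [[("B", "C")]],
    }
    demands = {"A->C": 15, "B->C": 5}
    print(allocate(demands, tunnels, capacity))

  In this toy run, the 15 Gbps A->C demand gets 10 Gbps on the direct link and the remaining 5 Gbps via B, which is exactly the kind of multi-tunnel splitting that distributed shortest-path routing alone would not do.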
* Some lessons from building and using B4:
  - The communication between the controller and the OpenFlow agent at the switch can become the bottleneck. For example, when failures happen and lots of messages are exchanged, the channel between the controller and the switches gets clogged, and OpenFlow commands cannot be delivered in time.
  - The OpenFlow agent that configures the switch must be well designed (using multithreading, etc.) so that it can configure several linecards in parallel.

* The reference on the history of SDN describes how several such ideas were proposed in the past. First, we had active networks, which proposed configuring even the datapath from software. Then came work on centralizing the control plane in simple ways. The latest SDN movement has been more popular than all these earlier efforts, for several reasons: support from switch vendors and industry, good use cases, etc. We will see some use cases that are driving SDN in the next lecture.

* Further reading:
  - "Ethane: Taking Control of the Enterprise", Casado et al. Proposes the idea of centralized management of an enterprise network. A precursor to the idea of SDN.
  - "OpenFlow: Enabling Innovation in Campus Networks", McKeown et al. Proposes the OpenFlow protocol to control switches in a campus, to enable research.
  - "Onix: A Distributed Control Platform for Large-scale Production Networks", Koponen et al. Describes the design of a distributed SDN controller framework.
  - "B4: Experience with a Globally-Deployed Software Defined WAN", Jain et al. (Google). Describes their SDN-based WAN design. A practical application of SDN ideas.
  - "The Road to SDN: An Intellectual History of Programmable Networks", Feamster et al. Describes the history of SDN-like ideas over the last 25 years.