MAC: Switching, VLANs, MPLS
============================

* In the last lecture, we saw that shared broadcast links such as an Ethernet LAN have a fundamental scaling limit. Solution: switched Ethernet. That is, each host connects to a switch via an Ethernet cable.

* What is an Ethernet switch? When multiple hosts connect to the various ports of a switch, the switch forwards a frame only to the output port that leads to the destination (unlike an Ethernet hub, which forwards to all ports). How does a switch learn this information? When node A sends a frame through a port numbered "i" on the switch, the switch learns that the MAC address of A is reachable via port "i". So the next time a frame arrives destined to A's MAC address, the switch forwards the frame only on port "i". When a switch sees a destination MAC address for which it has no mapping yet, it has to flood the frame on all ports other than the one the frame arrived on.

* What is the difference between a bridge and a switch? A bridge is a device that connects two, or a small number of, LANs. A bridge performs the same functions as a switch (it learns which MAC addresses are on which ports, and forwards only on the right ports). So a switch is just a multiport bridge.

* If the topology of switches / bridges in the network has a loop, broadcast frames will keep circling forever. To avoid this problem, switches in a LAN run the spanning tree protocol (STP). That is, switches use a distributed protocol to construct a spanning tree over all the switches, with the switch of the smallest ID (say, MAC address) as the root of the tree. Only ports of a switch that are on the spanning tree are active, and switches do not send or receive data frames over the inactive ports. Switches re-run STP to construct a new tree as switches fail or the topology changes.

* Why have switch topologies with loops in the first place? For fault tolerance: if one link fails, another can be used.
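The learning behaviour of a switch described above can be sketched in a few lines of Python. This is only a toy model under simplifying assumptions: the class and method names are made up for illustration, MAC addresses are shortened strings, and details of real switches such as aging out old entries or handling the broadcast address are left out.

```python
from typing import Dict, List


class LearningSwitch:
    """Toy model of a learning Ethernet switch (illustrative names only)."""

    def __init__(self, num_ports: int) -> None:
        self.ports: List[int] = list(range(num_ports))
        # Forwarding table: MAC address -> port it was last seen on.
        self.mac_table: Dict[str, int] = {}

    def receive(self, in_port: int, src_mac: str, dst_mac: str) -> List[int]:
        """Process one frame and return the list of output ports."""
        # Learn: the sender is reachable via the port the frame arrived on.
        self.mac_table[src_mac] = in_port

        out_port = self.mac_table.get(dst_mac)
        if out_port is None:
            # Unknown destination: flood on every port except the incoming one.
            return [p for p in self.ports if p != in_port]
        if out_port == in_port:
            # Destination sits behind the same port as the sender: drop.
            return []
        return [out_port]


switch = LearningSwitch(num_ports=4)
print(switch.receive(0, "aa:aa", "bb:bb"))  # B unknown, so flood: [1, 2, 3]
print(switch.receive(1, "bb:bb", "aa:aa"))  # A was learned on port 0: [0]
print(switch.receive(0, "aa:aa", "bb:bb"))  # B was learned on port 1: [1]
```

Note that the table is filled from source addresses only; the destination address is used purely for lookup. This is why the first frame to a host that has not yet sent anything must be flooded.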
* Details on STP: initially the designated root bridge sends special messages (bridge protocol data units, or BPDUs) on all its ports. These BPDUs are passed around the network, and every switch tries to identify the shortest path to the root. When a BPDU is received on a port, the cost of that port is added to the path cost before the BPDU is forwarded on other ports. The cost of a port is configured based on factors such as its link speed. When a switch receives BPDUs with different path costs to the root on different ports, it picks the port with the lowest cost as its forwarding port to the root, and blocks the ports on which it received higher-cost BPDUs. During normal operation, the blocked ports are not used to send or receive data (they still handle periodic BPDUs). When links fail or the topology changes, blocked ports may be put back into use.

* Switches (layer 2) vs. routers (layer 3):

  Switches are better because
  - Switches need no configuration (they learn which ports to forward on automatically). Routers need to be configured with which prefixes they own etc. So switches are better for quickly interconnecting nodes.
  - Switched networks can be heterogeneous (in link types etc.) and carry any network-layer protocol, unlike IP routers, which only work with IP.
  - Switches perform a simple lookup on fixed-length MAC addresses, and layer 2 protocols are also simple (learning bridges etc.). So they are cheaper than IP routers, which perform longest prefix match and run complicated routing protocols.

  Routers are better because
  - Switches require lots of broadcasts (first frame to an unknown destination, ARP etc.), and need to store per-MAC-address state in their forwarding tables. As such, they don't scale well. Routers scale much better as they can aggregate IP addresses into prefixes, and need not flood.
  - Switched paths are not guaranteed to be optimal (for example, even if a shorter path exists, frames may not take it due to spanning tree restrictions), whereas intra-domain IP routing uses shortest paths with respect to the configured link costs.

* In conclusion, small networks are usually built as switched LANs, and larger networks have IP routers and perform layer 3 routing.

* Brief discussion of WiFi link layer: note that an access point can act either as a layer 2 switch or as a layer 3 router (and NAT too). What role it plays depends on the configuration.

* We have seen that switched LANs occasionally have to broadcast frames over the entire domain, leading to scalability issues. One solution: virtual LANs (VLANs). When a switch is configured with VLANs, broadcast frames are exchanged only between hosts of a particular VLAN, and not across the entire domain. In other words, a VLAN redefines the broadcast domain of a node. VLANs can be configured as port-based (certain ports of a switch are assigned to certain VLANs), or MAC address-based (certain MAC addresses are mapped to certain VLANs). A special VLAN tag is added to frames by the first switch, so that all switches know which VLAN a frame belongs to.

* ATM and virtual circuit switching. ATM is a parallel network stack to layers 2/3 of IP. When it was first developed, ATM was considered a serious competitor to packet-switched IP networks. ATM is based on the concept of virtual circuit switching. When A wants to start a flow to destination D, it sends a setup message that is routed along the correct path to D (say, via ATM switches B and C). Once the virtual circuit setup is complete, every switch along the path holds state on how to forward data of this circuit. Every subsequent packet only carries a virtual circuit identifier (VCI), and intermediate switches forward data using this VCI. ATM uses fixed-length 53-byte frames called "cells".

* Note that a VCI need not have global scope (i.e., be unique across the network).
  VCI in ATM only has "link local" scope. That is, only the flows over a given link need to have unique VCIs. So, when a node sends a cell of a new circuit on a link for the first time, it picks a VCI that is not in use on that link. The receiving switch notes the incoming VCI, then picks an outgoing link and a VCI that is free on that link. So, switches in ATM maintain a table mapping (incoming link, incoming VCI) to (outgoing link, outgoing VCI). Whenever a cell arrives on a link, an ATM switch looks up the entry corresponding to the incoming link and VCI, finds the outgoing link for that cell, and rewrites the cell's VCI to the outgoing VCI.

* ATM was considered to be better than datagram-based IP networks because of fixed-length VCIs and cells, leading to bounded latencies, and the benefit of setting up circuits to provide guaranteed service. However, the best-effort Internet performed reasonably well, and ATM lost out to IP.

* MPLS (multiprotocol label switching) is a more recent technology that borrows some aspects from ATM. MPLS is not based on circuit switching, and is designed to work with packet switching and IP datagrams. However, it modifies the forwarding logic of IP datagrams. At some point, it was thought that IP lookup algorithms based on longest prefix match would be too slow to forward data on high-speed links, and that a fixed-length label lookup was needed. This was the initial motivation for MPLS.

* MPLS works similarly to ATM in the forwarding path. There is no connection setup. When an IP datagram arrives into an MPLS-enabled network, the first MPLS edge router adds a label to the packet. Subsequent MPLS routers perform label switching, much like ATM switches do with cells. That is, all MPLS routers along the path maintain a mapping from incoming link and label to outgoing link and label, and swap labels on packets. So MPLS routers are also called label switching routers (LSRs). Note that ATM switches can be easily reused as LSRs.

* Where is the MPLS label added? MPLS has a 20-bit label in a 4-byte header.
  This header is usually placed between the layer 2 and IP headers, so MPLS is considered layer 2.5.

* While the initial motivation for MPLS wasn't strong (IP lookup algorithms became fast enough), MPLS has found other uses. Some of these are listed below.
  - Traffic engineering. MPLS can be used to "pin" different IP flows to the same destination onto different label switched paths (LSPs), to distribute load evenly and perform traffic engineering in ISP backbone networks. Example: the "fish topology", where packets to the same destination arriving from different sources can be pinned to different paths.
  - MPLS can be used for fast reroute, that is, to set up alternate paths in advance so that traffic can be switched over on a link failure before the routing protocol reconverges and finds an alternate path.
  - MPLS can be used to build layer 2 and layer 3 VPNs. The concept is similar to IP-in-IP tunneling. That is, to connect two private networks, the IP datagram in the private space is encapsulated with an MPLS header and tunneled to the other end point, after which the MPLS header is removed. However, MPLS-based VPNs are more efficient because the MPLS header (4 bytes) is smaller than the IP header overhead (20 bytes) of IP-in-IP tunneling.
  - In general, MPLS has found many uses because the labels can be repurposed to mean different things and serve different purposes. It can be used at any point where simple destination IP-based forwarding is not good enough.

* How are MPLS labels distributed? That is, once a packet with an MPLS label arrives at an MPLS router, how does the router decide the outgoing port/label? It depends on the purpose of using MPLS. For simple destination-based forwarding, labels can be announced as part of the routing protocol. For example, a downstream router can tell an upstream router which label to use on packets that it should forward to a certain destination. For traffic engineering, the network operator can decide optimal routes and assign labels. For VPN services, labels are distributed with BGP.
  So distribution of labels requires some routing protocol, much like normal IP.

* What are the alternatives to MPLS for traffic engineering? OSPF ECMP (Equal Cost Multi Path) can take advantage of multiple equal-cost paths to a destination and split traffic among them. ECMP works fine for the most part, except when there is large asymmetry in the network traffic patterns / link capacities etc. For example, when a large flow starts on one of the paths, splitting the remaining traffic equally over all paths may not be the best idea.
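To see why ECMP can struggle with asymmetric traffic, here is a small Python sketch. It assumes that the split is done per flow by hashing a flow identifier, which is a common way to implement ECMP but is not stated in the notes above. The path names, addresses and flow sizes are all made up for illustration.

```python
import hashlib
from collections import Counter
from typing import Dict, List, Tuple

# A flow is identified here by (src IP, dst IP, src port, dst port); this
# choice of fields is an assumption made for illustration.
Flow = Tuple[str, str, int, int]


def pick_path(flow: Flow, paths: List[str]) -> str:
    """Map a flow to one of the equal-cost paths by hashing its identifier."""
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(paths)
    return paths[index]


paths = ["via-R1", "via-R2"]  # hypothetical equal-cost next hops

# Flow sizes in bytes: one hundred small flows and one very large one.
flows: Dict[Flow, int] = {
    ("10.0.0.1", "10.0.1.1", 5000 + i, 80): 1_000 for i in range(100)
}
flows[("10.0.0.2", "10.0.1.1", 6000, 80)] = 10_000_000

load: Counter = Counter()
for flow, size in flows.items():
    load[pick_path(flow, paths)] += size

print(dict(load))  # bytes carried per path
```

The small flows spread roughly evenly over the two paths, but whichever path the large flow hashes to carries almost all of the bytes. The split is even in number of flows, not in load, which is the kind of situation where explicitly pinning flows to paths with MPLS can do better.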