Networking in VMware Cloud Director (vCD) is the most complex and least understood component of its architecture. One of the networking options it introduced is vCD Network Isolation (vCDNI). In this blog I will try to give you a better understanding of vCDNI and how it works technically.
vCDNI is an overlay networking scheme that builds on top of an existing Ethernet network to provide isolated networking for Virtual Machines. An overlay network is a virtual network of devices or connections that rides on top of an existing networking infrastructure. An example would be a data connection that rides on top of a voice connection in a telephone network.
The scheme used for vCDNI is a MAC-in-MAC encapsulation that allows for many isolated networks, up to 16 million in vCD 1.0, to be created and isolated on top of one Ethernet network.
If you are familiar with Lab Manager's networking, vCDNI is based on the same technology used in the Host Spanning Private Networks found in Lab Manager 4. See LabManager4_Whats_New.pdf for details on this. VM traffic encapsulation is done in the fast data path by a vCDNI dvfilter operating in the vmkernel. No action is required inside the VM for this. Lab Manager required a service VM to be present in the control path, whereas vCD does not require such a VM.
vCDNI networks are implemented as portgroups in vSphere, with each network assigned a unique network ID, also known as a Fence ID. In vCenter the portgroup would have a name like:
The breakdown of the portgroup name is:
- VC1055813345: vCenter ID
- DVS2: Distributed switch number
- CM1: vCD Instance
- V99: Transport VLAN
- F12: Network (Fence) ID
- Emca Internet: vCD Network Name
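The breakdown above can be sketched as a small token decoder. This is a minimal illustration only: it assumes the name has already been split into its individual tokens (the separator used in the real portgroup name is not covered here), and `describe_tokens` and `FIELD_NAMES` are hypothetical names introduced for the example.

```python
import re

# Prefix-to-meaning mapping taken from the breakdown above.
FIELD_NAMES = {
    "VC": "vCenter ID",
    "DVS": "Distributed switch number",
    "CM": "vCD Instance",
    "V": "Transport VLAN",
    "F": "Network (Fence) ID",
}

# "VC" is listed before "V" so the longer prefix is tried first.
TOKEN = re.compile(r"^(VC|DVS|CM|V|F)(\d+)$")


def describe_tokens(tokens):
    """Map already-split portgroup name tokens to their meanings."""
    fields = {}
    for token in tokens:
        match = TOKEN.match(token)
        if match is None:
            raise ValueError(f"unrecognized token: {token!r}")
        prefix, number = match.groups()
        fields[FIELD_NAMES[prefix]] = int(number)
    return fields


if __name__ == "__main__":
    for name, value in describe_tokens(["VC1055813345", "DVS2", "CM1", "V99", "F12"]).items():
        print(f"{name}: {value}")
```

The trailing vCD network name ("Emca Internet" in the example) is free text, so it is left out of the decoder.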
vCD assigns a unique Network ID to each vCDNI network and uses a different vSphere ephemeral portgroup for each network. With an ephemeral portgroup, a virtual port is created and assigned to a Virtual Machine when it is powered on, and is deleted when it is powered off. An ephemeral portgroup has no limit on the number of ports it can contain, other than the vCenter supported limit. As such, the practical upper bound for concurrent vCDNI networks is the maximum number of ephemeral portgroups supported in vCenter, currently 1016 in vSphere 4.1.
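To put these numbers side by side, here is a rough back-of-the-envelope sketch. The 24-bit width for the Network ID is my assumption; it is merely consistent with the "16 million" figure and is not a documented field size. The 12-bit VLAN ID comes from standard 802.1Q tagging.

```python
# Standard 802.1Q: 12-bit VLAN ID, with 0 and 4095 reserved.
usable_vlans = 2**12 - 2

# ASSUMPTION: a 24-bit Network (Fence) ID, which would match "up to 16 million".
assumed_fence_id_bits = 24
theoretical_networks = 2**assumed_fence_id_bits

# Ephemeral portgroup limit quoted above for vSphere 4.1.
ephemeral_portgroup_limit = 1016

# Each vCDNI network uses one ephemeral portgroup, so the smaller number wins.
practical_networks = min(theoretical_networks, ephemeral_portgroup_limit)

print(usable_vlans, theoretical_networks, practical_networks)
```

In other words, the ID space is not the bottleneck; the portgroup limit in vCenter is.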
The isolation inherent in vCDNI is achieved by re-encapsulating the original MAC frames from Virtual Machines in a vCDNI frame to create a new MAC-in-MAC frame. Part of the new vCDNI header is a network ID that identifies which isolation network the packet belongs to. The final frame looks like this:
The vCDNI MAC header has the assigned virtual MAC addresses of the destination ESX(i) server and identifying information of the source ESX(i) server corresponding to the destination and source Virtual Machines respectively.
The vCDNI data header contains vCDNI protocol specific data such as sequence and version numbers and vCDNI Network ID.
The following example depicts the vCDNI packet flow:
- Virtual Machine sends packet out of its virtual NIC
- dvfilter adds vCDNI MAC and Data headers to create the overlay vCDNI network
- DVS adds the transport VLAN tags to the packet, if needed, and passes it on to the physical NIC
- The physical NIC on the destination ESX server receives the packet
- DVS strips off the transport VLAN tags and passes it up to the dvfilter
- dvfilter strips the vCDNI protocol data and passes the packet on to the destination VM
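The encapsulation and decapsulation steps in the flow above can be sketched in a few lines of Python. This is an illustration under loud assumptions, not the actual wire format: the field layout (a 14-byte outer MAC header plus a 10-byte data header), the field widths, and the EtherType value are all hypothetical, chosen only so that the added header totals 24 bytes. The transport VLAN tagging done by the DVS is left out.

```python
import struct
from typing import Tuple

# HYPOTHETICAL layout, chosen only to total 24 bytes of overhead.
OUTER_MAC = struct.Struct("!6s6sH")  # destination host MAC, source host MAC, EtherType
DATA_HDR = struct.Struct("!BBII")    # version, flags, sequence number, network ID
VCDNI_OVERHEAD = OUTER_MAC.size + DATA_HDR.size  # 14 + 10 = 24

# Placeholder only: an IEEE local experimental EtherType, NOT the real vCDNI value.
PLACEHOLDER_ETHERTYPE = 0x88B5


def encapsulate(inner_frame: bytes, dst_host: bytes, src_host: bytes,
                network_id: int, sequence: int) -> bytes:
    """Wrap the VM's original frame in an outer MAC header and a data header."""
    outer = OUTER_MAC.pack(dst_host, src_host, PLACEHOLDER_ETHERTYPE)
    data = DATA_HDR.pack(1, 0, sequence, network_id)
    return outer + data + inner_frame


def decapsulate(frame: bytes) -> Tuple[int, bytes]:
    """Strip the overlay headers; return the network ID and the original frame."""
    _dst, _src, ethertype = OUTER_MAC.unpack_from(frame, 0)
    if ethertype != PLACEHOLDER_ETHERTYPE:
        raise ValueError("not an overlay frame")
    _version, _flags, _seq, network_id = DATA_HDR.unpack_from(frame, OUTER_MAC.size)
    return network_id, frame[VCDNI_OVERHEAD:]


if __name__ == "__main__":
    vm_frame = bytes(60)  # stand-in for the VM's original Ethernet frame
    wire = encapsulate(vm_frame, b"\x00\x50\x56\x00\x00\x02",
                       b"\x00\x50\x56\x00\x00\x01", network_id=12, sequence=1)
    fence_id, original = decapsulate(wire)
    print(len(wire) - len(vm_frame), fence_id, original == vm_frame)
```

The point of the sketch is that the VM's frame travels untouched inside the outer frame, and the receiving host uses the network ID to decide which isolated network the inner frame belongs to.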
The added vCDNI header increases the maximum size of the frame by 24 bytes. To ensure that there is no fragmentation due to oversized frames, the Ethernet Maximum Transmission Unit (MTU) needs to be increased by 24 bytes from its default of 1500 bytes to 1524 bytes. The MTU increase needs to be implemented end to end. This MTU change needs to be done in three places:
- At the vCD level, by changing the Network Pool MTU
- At the vCenter level, by modifying the advanced properties of the DVS hosting the Network pool
- At the physical switches, depending on the switch implementation, by increasing the MTU of the switch or ports used by the DVS hosting the network pool
a. On a Cisco Catalyst switch
Catalyst_Switch(config)# system mtu 1524
b. On a Cisco Native IOS switch
NativeIOS_Switch(config)# interface gigabitEthernet 1/1
NativeIOS_Switch(config-if)# mtu 1524
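Because the MTU has to be raised end to end, it can help to check every hop against the required value. Here is a minimal sketch of that arithmetic; the hop names and the `undersized_hops` helper are hypothetical, and the MTU values would have to be collected by hand or by your own tooling.

```python
BASE_MTU = 1500       # default Ethernet MTU
VCDNI_OVERHEAD = 24   # bytes added by the vCDNI header
REQUIRED_MTU = BASE_MTU + VCDNI_OVERHEAD  # 1524


def undersized_hops(path_mtus):
    """Return the names of hops whose MTU is too small for vCDNI frames."""
    return sorted(name for name, mtu in path_mtus.items() if mtu < REQUIRED_MTU)


if __name__ == "__main__":
    # Example values only: the physical switch was left at its default.
    path = {
        "vCD network pool": 1524,
        "DVS": 1524,
        "physical switch": 1500,
    }
    print(REQUIRED_MTU, undersized_hops(path))
```

Any hop reported by the check still needs its MTU raised before the overlay traffic can pass without fragmentation.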
vCDNI is the optimal option in any environment where you need to create Virtual Machine networks without consuming VLANs in the process. I would recommend that vCDNI network pools back all Organization and vApp networks. (See my previous blog on the definitions). This would allow for the definition of thousands of networks without the actual consumption of VLANs, other than for the transport network.
By implementing the vCD dynamic portgroups on an “Organization” DVS and the static portgroups on a “Provider” DVS you will be able to isolate the dynamic portgroups to just the “Organization” DVS and make the MTU changes only on the infrastructure supporting the “Organization” DVS. Doing this will also allow you to maximize the number of ports you can create in one vCenter as the “Organization” DVS will use ephemeral portgroups and the “Provider” DVS will use dynamic, preferably, or static portgroups.
The vCDNI transport network needs to be an isolated set of switches or a dedicated non-routed VLAN. For practical reasons a dedicated non-routed VLAN will scale much better in a large vCD deployment.
The argument behind the isolated switches or non-routed VLAN recommendation is that vCDNI does not encrypt its packets and just overlays the different isolated networks on top of the transport network. If a physical machine, or Virtual Machine for that matter, has access to the transport network, it would be trivial to disassemble the packets, as well as inject packets into any of the isolated networks. This is not unique to vCDNI though, as Ethernet packets are not encrypted either, making it trivial to spoof Ethernet packets if you have access to the transport network. The same applies to VLANs by the way, as access to the VLAN transport network (Native VLAN of the trunk) will allow you to inject packets into any VLAN allowed on that trunk.
If you are not using vCDNI backed pools, I would strongly recommend that you isolate all vCD networks by using unique VLANs for each network to ensure tenant isolation.
It is very important to understand that one of the biggest risks to the virtual environment is misconfiguration, not the technology. Thus you need strong audit controls to ensure that you avoid misconfiguration, either accidental or malicious.
The picture below shows logically where in the traffic flow the dvfilter is inserted
To get per host vCDNI statistics use:
This utility can be used to display:
- Configured networks and their MTUs
- Active ports and their port IDs
- Switch state including inside and outside MAC addresses
- Port statistics on a port ID basis
For help with the command, type it with no options.
This blog would not have been possible without the technical assistance of Anupam Dalal and the feedback of Michael Haines.