Jan 01 2009

Nexus 1000V with FCoE CNA and VMWare ESX 4.0 deployment diagram

Published by Brad Hedlund at 5:00 pm under Featured

What does the virtual data center environment look like when you have a CNA (Converged Network Adapter) installed in an ESX 4.0 server running the Cisco Nexus 1000V virtual switch?  I decided to draw it out in a diagram and came up with this:

Nexus 1000v with FCoE CNA with VMWare ESX 4.0


Some key observations of importance:

  • The version of ESX running here is ESX 4.0 (not yet released)
  • The Nexus 1000V software on the physical server acts like a line card of a modular switch, described as a VEM (virtual ethernet module)
  • The Nexus 1000V VEM is a direct replacement of the VMWare vSwitch function
  • The Nexus 1000V VSM (virtual supervisor module) acts like the supervisor engine of a modular switch
  • One Nexus 1000V VSM instance manages a single ESX cluster of up to 64 physical servers
  • The form factor of Nexus 1000V VSM can be a physical appliance or a virtual machine
  • The network administrator manages the Cisco Nexus 1000V (from the VSM) as a single distributed virtual switch for the entire ESX cluster
  • Each virtual machine connects to its own Virtual Ethernet (vEthernet) port on the Nexus 1000V providing the network administrator traffic visibility and policy control on a per virtual machine basis.  Virtual machines can now be managed like physical servers in terms of their network connectivity.
  • In this diagram VM1 connects to interface VEth1 on the Nexus 1000v and keeps the same VEth1 interface even when it is moved by VMotion or powered up on a different physical server.
  • The VMKernel vmknic interface also connects to Nexus 1000V on a Virtual Ethernet port, ‘interface vEth 2’ for example (not shown here).  The same goes for the Service Console vswif interface, which also connects to Nexus 1000V on a Virtual Ethernet port (not shown here).
  • The network administrator defines Port Profiles which are a collection of network configuration settings such as the VLAN, any access lists, QoS policies, or traffic monitoring such as NetFlow or SPAN.
  • In this example Port Profile BLUE could be defining access to VLAN 10 and enabling NetFlow traffic monitoring, for example.
  • Once enabled, Port Profiles are dynamically pushed to VMWare Virtual Center and show up as Port Groups to be selected by the VMWare administrator
  • The VMWare administrator creates virtual machines and assigns them to Port Groups as he/she has always done.  By selecting VM1 to connect to Port Group BLUE in Virtual Center, the VMWare administrator has effectively connected VM1 to VLAN 10, along with any other security, monitoring, or QoS policies defined in Port Profile BLUE by the network administrator
  • The network administrator does not configure the virtual machine interfaces directly.  Rather, all configuration settings for virtual machines are made with Port Profiles (configured globally), and it’s the VMWare administrator who picks which virtual machines are attached to which Port Profile.  Once this happens the virtual machine is dynamically assigned a unique Virtual Ethernet interface (e.g. ‘int vEth 10’) and inherits the configuration settings from the chosen Port Profile.
  • The VMWare administrator no longer needs to manage multiple vSwitch configurations, and no longer needs to associate physical NICs to a vSwitch.
  • The VMWare administrator associates physical NICs to Nexus 1000v, allowing the network administrator to begin defining the network configuration and policies.
  • The Nexus 1000v VSM is for control plane functions only and does not participate in forwarding traffic.
  • If the Nexus 1000v VSM goes down it does not disrupt traffic between physical servers and virtual machines.
  • If an ESX host reboots or is added to the network, the Nexus 1000v VSM must be accessible.
  • The Nexus 1000v VSM can be deployed redundantly, with a standby VSM ready to take over in case of failure.
  • The ESX server has a 10GE connection to a physical lossless Ethernet switch that supports Data Center Ethernet (DCE) and Fibre Channel over Ethernet (FCoE), such as the Cisco Nexus 5000.
  • The Cisco Nexus 5000 provides lossless Ethernet services for the FCoE traffic received from the CNA.  If the Nexus 5000 buffers reach a high threshold, a per-CoS pause signal (Priority Flow Control, an extension of the 802.3x pause mechanism) with the CoS equal to FCoE will be sent to the CNA.  This per-CoS pause signal tells the CNA to pause the FCoE traffic only, not the other TCP/IP traffic that is tolerant to loss.  The default CoS setting for FCoE is CoS 3.  When the Nexus 5000 buffers reach a low threshold, a similar un-pause signal is sent to the CNA.  The per-CoS pause provides the same functionality as FC buffer credits, controlling throughput based on the network’s ability to carry the traffic reliably.
  • The CNA and Cisco Nexus 5000 also support 802.1Qaz CoS based bandwidth management which allows the network administrator to provide bandwidth guarantees to different types of traffic.  For example, the VMotion vmkernel traffic could be given a minimum guaranteed bandwidth of 10% (1GE), and so on.
  • Fibre Channel HBAs are not needed in the physical server, as the Fibre Channel connectivity is supplied by a Fibre Channel chip on the CNA from either Emulex or Qlogic (your choice)
  • Individual Ethernet NICs are not needed in the physical server, as the Ethernet connectivity is supplied by an Ethernet chip on the CNA from either Intel or Broadcom.
  • The single CNA appears to the ESX Hypervisor as two separate I/O cards, one Ethernet NIC card, and one Fibre Channel HBA.
  • The ESX Hypervisor uses a standard Emulex or Qlogic driver to operate what it sees as the Fibre Channel HBA.
  • The ESX Hypervisor uses a standard Intel ethernet driver to operate what it sees as the Ethernet NIC.
  • VMWare’s ESX 3.5 Update 2 Hardware Compatibility List contains support for the Emulex CNA and Qlogic CNA.
  • ESX 4.0 is not required to use CNAs and FCoE.  FCoE can be deployed today with ESX 3.5 Update 2.
  • ESX 4.0 is required for Nexus 1000V.
  • CNAs are not required for Nexus 1000V.
  • The Nexus 1000v has no knowledge of FCoE and does not need to support FCoE because FCoE is of no concern to the Nexus 1000V deployment.  To illustrate, consider that the Nexus 1000v operates no differently in a traditional server that has individual FC HBAs and individual Ethernet NICs.  The Nexus 1000V uses the services of the Ethernet chip on the CNA and is unaware that the CNA is also providing FC services to the ESX host.  Additionally, the virtual machine has no knowledge of FCoE.
  • The Menlo ASIC on the CNA guarantees 4Gbps of bandwidth to the FC chip.  If the FC chip is not using the bandwidth, all 10GE bandwidth will be available to the Ethernet chip.
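
The Port Profile workflow in the notes above (network administrator defines a profile, it shows up as a Port Group, the VMWare administrator attaches a VM, and the VM receives its own vEth interface that follows it across hosts) can be sketched in a few lines of Python. This is only a toy model to make the division of labor concrete. The class names, method names, and host names (`DistributedSwitch`, `publish`, `attach`, `vmotion`, `esx-a`, `esx-b`) are all hypothetical and do not correspond to any Cisco or VMWare API.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class PortProfile:
    """Network settings defined once by the network administrator."""
    name: str
    vlan: int
    netflow: bool = False


@dataclass
class VEthPort:
    """A virtual Ethernet interface bound to one VM, not to one host."""
    number: int
    vm: str
    profile: PortProfile
    host: str


class DistributedSwitch:
    """Toy stand-in for the cluster-wide view held by the VSM (hypothetical)."""

    def __init__(self):
        self.port_groups = {}            # port group name -> PortProfile
        self.ports = {}                  # VM name -> VEthPort
        self._next_veth = count(1)       # vEth numbers handed out in order

    def publish(self, profile):
        # Network admin enables a profile; it becomes a selectable port group.
        self.port_groups[profile.name] = profile

    def attach(self, vm, port_group, host):
        # VMWare admin picks a port group; the VM gets its own vEth port
        # that inherits every setting from the matching port profile.
        profile = self.port_groups[port_group]
        port = VEthPort(next(self._next_veth), vm, profile, host)
        self.ports[vm] = port
        return port

    def vmotion(self, vm, new_host):
        # Only the host changes; the vEth number and policy stay with the VM.
        self.ports[vm].host = new_host
        return self.ports[vm]


switch = DistributedSwitch()
switch.publish(PortProfile("BLUE", vlan=10, netflow=True))
vm1 = switch.attach("VM1", "BLUE", host="esx-a")
moved = switch.vmotion("VM1", "esx-b")
print(f"vEth{moved.number} vlan={moved.profile.vlan} host={moved.host}")
```

The point of the model is that the policy lives in the profile and the interface identity lives with the VM, so neither changes when the VM lands on a different physical server.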

More notes to come… please check back again later for updates.  (Updated diagram and notes 1/7/09)
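
The bandwidth notes above (a guaranteed minimum per traffic class, with unused bandwidth made available to the other class) can be worked through with a small Python sketch. It is a simplified model and not how the Menlo ASIC or 802.1Qaz scheduling is actually implemented. The function name `share_link` is hypothetical, the 6 Gbps Ethernet guarantee is an assumption made so the guarantees add up to the 10GE link, and handing out spare bandwidth class by class in order is a simplification.

```python
def share_link(capacity, guarantees, demand):
    """Split link capacity (Gbps) between traffic classes.

    Each class first receives up to its guaranteed minimum; bandwidth a class
    does not use is then handed to classes that still want more.
    """
    # Step 1: every class gets its demand, capped at its guarantee.
    alloc = {c: min(demand[c], guarantees.get(c, 0)) for c in demand}
    spare = capacity - sum(alloc.values())
    # Step 2: hand spare bandwidth to classes with unmet demand, in turn.
    for c in demand:
        extra = min(spare, demand[c] - alloc[c])
        alloc[c] += extra
        spare -= extra
    return alloc


# 4 Gbps guaranteed to FC (from the post); 6 Gbps for Ethernet is assumed.
GUARANTEES = {"fc": 4, "ethernet": 6}

idle_fc = share_link(10, GUARANTEES, {"fc": 0, "ethernet": 10})
busy_fc = share_link(10, GUARANTEES, {"fc": 8, "ethernet": 10})
light_fc = share_link(10, GUARANTEES, {"fc": 2, "ethernet": 10})
print(idle_fc, busy_fc, light_fc)
```

With FC idle, Ethernet gets the full 10 Gbps. With both classes saturated, FC is held to its 4 Gbps guarantee and Ethernet to the remaining 6. With FC using only 2 Gbps, Ethernet picks up the difference and gets 8.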

Related Posts:
Cisco UCS and Nexus 1000V design diagram with Palo adapter
Cisco UCS and VMWare vSwitch design with Cisco 10GE Virtual Adapter

UPDATE: Jointly developed Cisco and VMWare white paper comparing the VMWare vSwitch to Cisco Nexus 1000V

###


79 Responses to “Nexus 1000V with FCoE CNA and VMWare ESX 4.0 deployment diagram”

  1. Brad Hedlund says:

    Joe,
    Thanks for the input and thoughtful commentary.

    “VLAN/VRF separation is OK for Internal; Physical separation is required for Internet facing.”

    Totally understand. And with such a policy I would expect internet facing VMs to exist on physically isolated internet facing servers. The manner in which a switch provides isolation at the VLAN/VRF level is no less secure than how a hypervisor provides isolation between VMs. That is why I chuckle when folks mix these VMs on one server but at the same time insist on separate NICs and switches for these VMs.

    “your information on the evolving capability for virtual NICs (NIV through PCI SR-IOV) is a move in the right direction, but I can already achieve this today through Xsigo and HP Virtual Connect”

    NIV is not an apples-to-apples comparison with Virtual Connect. NIV will take what HP currently does with Virtual Connect a step further in being able to provide hundreds of virtual NICs per physical NIC. This will allow a VM to connect directly to its own virtual NIC on the physical NIC, bypassing the hypervisor altogether for network I/O, resulting in I/O performance rivaling bare metal servers.

    “CNAs and (Cisco) DCE is a significant risk of having to rip-and-replace many server cards as the solutions evolve over the next 6-18 months.”

    I’m not suggesting customers rip and replace their existing infrastructure. A unified fabric is something you build from scratch in a new green field data center, or gradually evolve into at an existing facility as servers are life cycled.

    I totally understand your frustration with trying to piece all these various parts together.
    Wouldn’t it be great if you could buy one complete computing system with the virtualization and unified fabric components built in from day one? ;-)

  2. Dwight says:

    I don’t know if this is related to the big March announcement, but some interesting rumors about Cisco blade servers at:

    http://www.channelregister.co.uk/2009/01/20/cisco_california_rumors/

    If Cisco gets into the blade server business, people also have to remember the entire management and deployment infrastructure that goes with servers. HP, IBM and Dell all have had a decade or longer to build up their management and deployment tools.

  3. Brad Hedlund says:

    Dwight,
    Cisco doesn’t enter a new market to provide a me-too product that’s just more of the same. If Cisco does enter the blade server market, expect to see a whole new approach to systems management that completely changes the current paradigm that has existed for over a decade, as you pointed out.

  4. Margaret Sellards says:

    “NIV is not an apples-to-apples comparison with Virtual Connect. NIV will take what HP currently does with Virtual Connect a step further in being able to provide hundreds of virtual NICs per physical NIC. This will allow a VM to connect directly to its own virtual NIC on the physical NIC, bypassing the hypervisor altogether for network I/O, resulting in I/O performance rivaling bare metal servers.”

    Sounds sort of like HP’s Flex 10 to me. But, from my understanding, Flex10 is actually HW, so while it’s virtual, it’s HW virtual..if that makes sense.

  5. Brad Hedlund says:

    Margaret,
    With Flex-10 a virtual machine cannot connect directly to Flex-10, you will still need a hypervisor switch. Flex-10 does not provide visibility and policy control to each individual virtual machine.
    The advantage of NIV is that each individual virtual machine will be able to connect directly to a hardware based switch (Nexus 5000), alleviating the hypervisor CPU from processing network I/O, thereby allowing the hypervisor to host more virtual machines and thus gaining better consolidation ratios. Not to mention better network I/O performance.

  6. [...] at the InternetExpert.org blog there is a post by Brad Hedlund revealing a bit more about the Nexus 1000V and FCoE in a VMware ESX 4.0 environment. As I mentioned in a previous post I’m not in a position to [...]

  7. Gary says:

    Hypothetical configuration: Small branch office, four ESX hosts (typical rackmount servers) with iSCSI storage, Nexus 1000v on each host. VMotion and DRS are in use.

    Does using a Nexus 5000 as the next upstream switch buy me anything in terms of ESX network port management? I realize if I was in a CNA environment having a Nexus 5000 upstream would be important.

    Given the 5000 switches are relatively expensive for a branch office and I don’t need 10Gb Ethernet, I was hoping no VM networking functionality would be lost by using a ‘typical’ workgroup Cisco ethernet switch instead.

  8. Brad Hedlund says:

    Gary,
    That would be fine. No VM networking functionality is lost by connecting your ESX hosts with Nexus 1000V to a “typical” Cisco ethernet switch. Just keep in mind you will need to have a Nexus 1000V VSM (either physical or virtual) at each small branch office. Some day it may be possible for a central VSM to manage remote VEMs across a WAN link, but for now the VSM and its VEMs are assumed to exist within the same local network.

  9. SAN guy says:

    In response to Ryan Hicks’ question above (1/29) about blade server switches for HP Blade Servers:

    There will be a blade server switch released in 2009 for HP that will do 10Gb CEE/FCoE and will also be 8Gb FC capable…

  10. Spidern says:

    Great news today :)

    After spending a few hours gobbling down 48 DIMM slots and various interconnects and so on, I’m left with one very basic question…

    When Cisco says it’s an open partnership, and RedHat is in there, does that mean we will have a Nexus1KV_Xen3.3_source.tar.gz or is it going to be a Nexus_KVM_1.0.rpm?

    Obviously if anyone wants Oracle, Citrix, RedHat AND the open source community to join the unified fabric, _something_ needs to be opensourced here somewhere, right ?

    VMWare OEM or not, big clusters don’t run VMWare…

  11. Brad Hedlund says:

    Spidern,
    Keep in mind the N1KV has nothing to do with a unified fabric (no dependencies there).
    All that is needed for the linux community to adopt unified fabric is the appropriate device drivers for the CNA (which already exist).

  12. Brad Hedlund says:

    Check out this white paper jointly developed by Cisco and VMWare that compares the features and benefits of VMWare’s vSwitch and Cisco Nexus 1000V.

    http://www.vmware.com/files/pdf/technology/cisco_vmware_virtualizing_the_datacenter.pdf

  13. sc says:

    Interesting reading, quality posts.

    A quick question, in the initial list of ‘observations of importance’ you have

    “The ESX server has a 10GE connection to a physical lossless Ethernet switch that supports Data Center Ethernet (DCE) and Fibre Channel over Ethernet (FCoE), such as the Cisco Nexus 5000.”

    Reading the rest of the posts I did not pick out details of exactly how the above connectivity is to be provided. Perhaps I missed it or perhaps not?

    Is the solution Pass Thru modules 1-1 mapping?

    Or is there some new Nexus blade switch card in pipeline, say an x000 with 10G internal and 10G uplink?

  14. Brad Hedlund says:

    SC,
    The solution shown in the diagram is currently possible with rack mount servers, and (very soon) will be available with the Cisco Unified Computing System.

    Nexus blade switches for HP, IBM, and Dell blade centers are certainly possible; however, the necessary agreements do not exist yet.

    I have no specific knowledge of the status, but I suspect negotiations are underway with mixed results between the three. Somehow I get the feeling the agreements with HP are a little more challenging than the others (unfortunately) …

    It will certainly be interesting this year to see if (and how) HP adopts a unified fabric offering for their customer base.

    I suspect HP will likely dismiss unified fabric as not really important for the data center …

    I will thoroughly enjoy watching how this transpires in the coming months…

  15. Dwight says:

    My sources from HP say a Nexus switch for BladeSystem should be out in 2010. That’s also when they will support a unified fabric.

  16. Brad Hedlund says:

    Dwight,
    That is consistent with what I am hearing as well. This sounds promising for HP BladeSystem customers.

  17. [...] Distributed Switch back-up files Distributed Virtual Port Groups and Distributed Virtual Uplinks Nexus 1000V with FCoE CNA and VMWare ESX 4.0 deployment diagram Author: esiebert7625 Categories: vSphere Links Tags: Networking Comments (0) Trackbacks [...]

  18. Gio says:

    All these questions about HP support or other blade center servers vendors of NX1000V have been answered by Cisco. CISCO IS GOING TO SELL THEIR OWN BLADE SERVERS AND RACK SERVERS.

    So what other FCoE avenue is there for HP and IBM? BROCADE….

  19. Brad Hedlund says:

    “So what other FCoE avenue is there for HP and IBM? BROCADE….”

    You are making the assumption there will be no Nexus blade switch for IBM and HP ;-)

    Cheers,
    Brad

  20. shinelite says:

    Are you kidding? All of the techno-babble in the world doesn’t hide very significant facts:

    1 – VNTAG = 0 backward compatibility. Kidding right?
    Without VNTAGs enabled = The frame will simply be discarded. Great for IT shops!
    2 – VEPA is backward compatible! You don’t mention it? Why?
    In fact not a single person mentions it on this blog. Sounds like a resounding blast of ignorance!
    3 – Enabling VNTAGS = All other switches are broken unless they are Cisco with VNTags enabled.
    4 – This is Cisco’s way of trying to regain their huge losses in the edge switch market. Simpleton response by a networking company. Don’t make it better. Rip it all out and put new switches everywhere and since Cisco decided to push the VNTAG standard then nobody else will immediately implement it as a standard.
    5 – Cisco continues to make 60 points on every switch sold. OUCH!
    Moore’s law down the tubes!
    6 – This is a Network only view. What about the existing investments in Storage, Servers, WAN accelerators and the like.
    7 – FCoE – Even with NEW standards to support lossless Enet is still slower than Fibre Channel.
    8 – Cisco promises 2to1 cable reduction! Are you kidding? I am already at a 4to1 reduction with my solution! I will go backwards with FCoE!

    Your Network only view is pitiful. But I am sure the Cisco Kool-Aid drinkers will lap it up….

    Brad, you are obviously a Cisco Stockholder. All this really adds up to is mo’ money for Cisco and no real progress for IT.

  21. Brad Hedlund says:

    “1 – VNTAG = 0 backward compatibility. Kidding right?”

    Wrong. A switch that supports VNTag (such as Nexus) works perfectly fine with switches or NICs that do not have VNTag support.

    “2 – VEPA is backward compatible! You don’t mention it? Why?”

    A switch with VNTag support is backward compatible too, so what’s your point?

    “3 – Enabling VNTAGS = All other switches are broken unless they are Cisco with VNTags enabled.”

    Ouch, you’re 0-3. Wrong again! VNTags are link by link specific (NIC-to-Switch, or Fabric_Extender-to-Switch) and are not forwarded upstream to other switches. So, enabling VNTag between a Server NIC and a switch means other switches continue to work just fine.

    “4 – This is Cisco’s way of trying to regain their huge losses in the edge switch market. Simpleton response by a networking company. Don’t make it better. Rip it all out and put new switches everywhere and since Cisco decided to push the VNTAG standard then nobody else will immediately implement it as a standard.”

    As stated above your “Rip it all out” comment is derived from being uninformed, so no need to go there again. As for “Cisco decided to push the VNTag standard”, one point of clarification: VMWare is also pushing VNTag as a standard. :-)

    “6 – This is a Network only view. What about the existing investments in Storage, Servers, WAN accelerators and the like.”

    Where does it say that everything in the Data Center needs to be ripped out? The suggestion that implementing Cisco Nexus switches requires trashing all other investments in the Data Center is either rooted in ignorance, the intent to mislead, or both.

    “7 – FCoE – Even with NEW standards to support lossless Enet is still slower than Fibre Channel.”

    Wow, wrong again. Gen2 CNAs from Emulex/Qlogic, and even Cisco’s adapter in UCS, can all forward Fibre Channel at 10Gbps.

    “8 – Cisco promises 2to1 cable reduction! Are you kidding? I am already at a 4to1 reduction with my solution! I will go backwards with FCoE!”

    Today a server can have a single adapter, linked to single switch, with a single cable providing 10Gbps Eth and FC. Can your solution do that? Yeah, didn’t think so.

    “Brad, you are obviously a Cisco Stockholder”

    You’re right. Even better, I’m also a Cisco employee, and I clearly state that for every reader to see in the “About the Author” section at the top of every page.

    “Your Network only view is pitiful. But I am sure the Cisco Kool-Aid drinkers will lap it up….”

    Note to readers: This juvenile post was made by an employee of Hewlett-Packard.

    I should point out that some other comments in this article were made by HP employees in a mature discussion of facts and philosophical debate. Something I expect from a large and successful company like HP. This poster, unfortunately, did not represent HP very well.

    I always welcome a spirited debate with adults … let me know when you are ready for that.

    Cheers,
    Brad

  22. Andras Tudos says:

    You mention that a FCoE initiator is available for the Intel Oplin 10GE NICs. I was searching for that quite a lot, but could not find anything useful. There is the open-sourced code on open-fcoe.org, but I need a driver for Windows Server 2008 or VMWare ESX. I’ve seen that Microsoft has launched a FCoE logo program and they will issue the first certificates in 2010 only. VMWare lists only the first generation CNAs on its HCL, which have standalone HBA and 10GE chipsets on them.

    Do I have an Intel FCoE option today or the only candidates are the hybrid adapters from Qlogic or Emulex? Are their second generation CNAs already shipping?

  23. Brad Hedlund says:

    “Do I have an Intel FCoE option today or the only candidates are the hybrid adapters from Qlogic or Emulex? Are their second generation CNAs already shipping?”

    The Intel Oplin 10GE adapter has software FCoE capabilities and broad operating system support is forthcoming. As of right now, the Qlogic and Emulex CNAs are your certified and supported options for FCoE. AFAIK, the Gen2 CNAs from Emulex and Qlogic should be shipping this summer (very soon).

  24. LordInfidel says:

    I’m in InfoSec (financial services) and have been a long time cisco user.

    Just purchased the 5010 lab bundle and wanted to point out some gotchas to anyone reading.

    Gotcha #1: Make sure you order the FCoE Bundle. You need the FCoE card AND the FCoE license (which costs 12K). If you order the ethernet bundle, you just bought a pretty white switch.

    Gotcha #2: If, like me, you are running a NON Cisco UCS Blade Center, MAKE sure that your vendor makes a CNA mezzanine card to replace the ethernet mezzanine card.

    Let’s just say that gotcha #2 somehow never made it into the initial architecture discussions and I now have really pretty 39K boat anchors (yes they are being returned unless cisco and my blade center vendor both say that a mezzanine card is shipping out in the next month and I can have them).

    So for all of you out there who were really hyped on 10GigE FCoE back to the SAN and BladeCenter, ask the fine print questions, because as I have found out, the reps barely know at this point.

    They are still working out the kinks I guess with a new product. (and I like my cisco reps)

    NOW- on a side note, I did go to the labs to play with the N7K and N5K, and it is pretty cool. So I will most likely revisit it again once I can actually use them in the manner that they were designed for.

    LordInfidel

  25. [...] Combined with the ever growing capacity of CPU and RAM in servers this will result in VM host monsters. But how are all these new technologies going to integrate with one another? Thankfully Brad Hedlund, a Consulting System Engineer with Cisco and CCIE, has written an article to explain this in detail. You can read about it here. [...]

  26. Robert Reid says:

    Does the Menlo ASIC in the CNA limit the reach of any media (fiber or copper) to 50 meters when using PFC?

  27. Brad Hedlund says:

    Robert,

    The distance limitation for lossless traffic (FCoE) on the Nexus 5000 is 300m.

    Cheers,
    Brad
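
PFC, raised in the question above, is the per-CoS pause behavior described in the post's notes: pause the FCoE class when the switch buffer hits a high threshold, and un-pause it at a low threshold. A minimal Python sketch of that hysteresis follows. The class name `IngressQueue` and the threshold values (8 and 2 frames) are hypothetical and chosen only for illustration; real thresholds are set by the switch hardware and configuration.

```python
FCOE_COS = 3  # default CoS for FCoE, per the post


class IngressQueue:
    """Toy per-CoS switch buffer with high/low pause thresholds."""

    def __init__(self, cos, high, low):
        self.cos = cos
        self.high = high      # pause the sender at or above this depth
        self.low = low        # un-pause the sender at or below this depth
        self.depth = 0
        self.paused = False

    def enqueue(self, frames):
        self.depth += frames
        return self._signal()

    def drain(self, frames):
        self.depth = max(0, self.depth - frames)
        return self._signal()

    def _signal(self):
        # Return the pause/un-pause message to send, or None if no change.
        if not self.paused and self.depth >= self.high:
            self.paused = True
            return ("PAUSE", self.cos)
        if self.paused and self.depth <= self.low:
            self.paused = False
            return ("UNPAUSE", self.cos)
        return None


queue = IngressQueue(FCOE_COS, high=8, low=2)
events = [queue.enqueue(5), queue.enqueue(4), queue.drain(3), queue.drain(5)]
print(events)
```

Because every message carries the CoS value, only the FCoE class is held back; traffic in other classes on the same link is not affected by the pause.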

  28. Brian Shepherd says:

    Do I have to use Nexus 1000V in order to see vm to vm traffic inside an ESX host? Can I get away with buying the 5010 and making my intra ESX traffic have to go through it without the 1000V for sniffing traffic? I don’t want to pay for the Enterprise Plus license…

    SHEP

  29. Brad Hedlund says:

    Brian,

    How would you force all VM to VM traffic on the same ESX through a Nexus 5000? The reality is, any VMs on the same ESX Host, on the same VLAN, will directly communicate via the vSwitch and you will not see that traffic at the Nexus 5000.

    Cheers,
    Brad
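
The rule in the reply above can be stated as a one-line predicate. The Python sketch below is only an illustration: the `VM` record and the `visible_upstream` function are hypothetical names, and it assumes that traffic between different VLANs on the same host is routed by a device outside the host.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VM:
    name: str
    host: str
    vlan: int


def visible_upstream(src, dst):
    """Return True if a frame from src to dst crosses the physical switch.

    Assumption for this sketch: frames between VMs on the same host and the
    same VLAN are switched inside the host and never leave it, while traffic
    between VLANs is routed by a device outside the host.
    """
    return not (src.host == dst.host and src.vlan == dst.vlan)


vm1 = VM("VM1", host="esx-a", vlan=10)
vm2 = VM("VM2", host="esx-a", vlan=10)
vm3 = VM("VM3", host="esx-b", vlan=10)
vm4 = VM("VM4", host="esx-a", vlan=20)

print(visible_upstream(vm1, vm2))  # same host, same VLAN
print(visible_upstream(vm1, vm3))  # different hosts
print(visible_upstream(vm1, vm4))  # same host, different VLANs
```

Only the first pair stays inside the host, which is why a sniffer on the upstream switch never sees that traffic.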
