IP-based storage networks leveraging the NVMe protocol can provide a cost-effective, high-performance connectivity alternative to traditional Fibre Channel SANs. While ecosystem support, such as VMware, is maturing, the tools required to set up and manage NVMe-based networks lack the automation and integration available with Fibre Channel. Providing admins with similar IP-based automation tools is a key requirement to drive adoption and simplify management at scale. The introduction of Dell’s SmartFabric Storage Software (SFSS) addresses these challenges. Deploying storage with NVMe over IP using SFSS provides the automation needed to significantly simplify and standardize the management of these deployments. Leveraging SFSS will help accelerate adoption of this new technology and maximize IT’s ability to leverage the cost and performance benefits of NVMe over IP.
One of the biggest storage-related announcements from this year’s VMworld user conference is new support for NVMe over IP to connect external storage. NVMe over IP is a relatively new block-based storage protocol. Block-based connectivity over specialized Fibre Channel networks has been available for almost three decades. It’s considered the industry standard for building storage networks where high performance and ultra-scale are required. Block-based connectivity over standard IP networks using iSCSI has also been available for years. iSCSI is often used as a lower-cost alternative to Fibre Channel for connecting external storage over a network. The tradeoffs of iSCSI include lower network bandwidth, protocol overhead, and queuing limits that impact throughput. Scaling complexities and restrictions also make it more difficult to support a storage network with hundreds of apps over thousands of ports.
The introduction of NVMe over IP addresses many of these limits. It provides Fibre Channel-like performance and scale combined with the cost benefits of standardized IP networking.
NVMe over TCP versus NVMe over RDMA/RoCE
Within NVMe over IP, there are two transports currently available: NVMe over RDMA and NVMe over TCP. RDMA is short for Remote Direct Memory Access; over Ethernet it is commonly deployed as RoCE (RDMA over Converged Ethernet). It allows data to be moved directly between the memory of physically separate servers. It provides low latency and high bandwidth and is typically only used for specific workloads that need ultra-high network performance. The downside to RDMA/RoCE is that admins have to deal with the complexity of setup and tuning, as well as the lack of standardization across vendors, technologies, and implementations. It is also an expensive networking option that requires 40Gb/s or 100Gb/s switches, network adapters, and fibre cables, which is why it is typically used only for specialized apps.
TCP is short for Transmission Control Protocol and provides a standard way to communicate and pass commands across networks. It’s well defined and universal across IP networks. It supports 10Gb/s and 25Gb/s networks, which are more commonplace and affordable. From a storage perspective, NVMe over TCP provides a configuration style that most users are familiar with and already use for typical IP network traffic. It can support zoning similar to Fibre Channel zoning and, based on user feedback, is the preferred transport for storage due to its maturity and commonality.
SmartFabric Storage Software for NVMe over TCP to Automate and Simplify
SmartFabric Storage Software (SFSS) enables an end-to-end automated and integrated NVMe over IP fabric connecting servers and storage targets using TCP. In a nutshell, it is a Dell-developed software tool that simplifies and automates the end-to-end configuration, setup, and management of servers and storage arrays that are connected on an IP fabric. It implements the standard NVMe calls to allow servers and storage arrays to be discovered and added to an IP fabric automatically, without manual administration.
There are two different ways to connect and use NVMe/TCP with SFSS: the Direct Discovery option and the Centralized Discovery option. The Direct Discovery option is suited to smaller deployments where admins only have a few servers and a single target. From each server, you perform a discovery against the storage array, which returns a list of subsystems that the server can connect to. This is fine for a small environment, but once there are many servers and targets, it becomes error prone and difficult to scale.
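To make the scaling concern concrete, here is a rough back-of-the-envelope sketch. It assumes that Direct Discovery needs one manually configured discovery entry per server-target pair, and that a centralized approach needs one entry per end point; both counts are simplifying assumptions for illustration, and the function names are hypothetical rather than part of SFSS.

```python
def direct_discovery_entries(servers: int, targets: int) -> int:
    """Assumption: every server is pointed at every target by hand."""
    return servers * targets


def centralized_discovery_entries(servers: int, targets: int) -> int:
    """Assumption: every end point only needs to reach one central service."""
    return servers + targets


# Compare a small lab, a mid-size site, and a large deployment.
for servers, targets in [(3, 1), (50, 8), (400, 20)]:
    direct = direct_discovery_entries(servers, targets)
    central = centralized_discovery_entries(servers, targets)
    print(f"{servers} servers, {targets} targets: "
          f"direct={direct}, centralized={central}")
```

With a handful of servers and a single target the two counts are nearly identical, but the per-pair count grows multiplicatively as either side of the fabric grows.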
That is where the Centralized Discovery option comes into play. A Centralized Discovery Controller (CDC) creates a catalogue of server and storage array elements and stores them in a name server database. It keeps track of every device that logs into it by storing the entries inside the database. Admins are then able to control access between servers and storage arrays through methods like zoning.
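The catalogue idea can be pictured as a small keyed table. The following is a minimal in-memory sketch, not SFSS's actual data model: the `Endpoint` and `NameServer` classes, the `role` labels, and the example NQNs and addresses are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Endpoint:
    nqn: str      # NVMe Qualified Name identifying the end point
    role: str     # "host" or "subsystem" (hypothetical labels)
    address: str  # IP address the end point logged in from


@dataclass
class NameServer:
    # Maps NQN -> Endpoint; one entry per device that has logged in.
    entries: dict = field(default_factory=dict)

    def register(self, endpoint: Endpoint) -> None:
        """Record (or refresh) an end point when it logs in."""
        self.entries[endpoint.nqn] = endpoint

    def list_by_role(self, role: str) -> list:
        """Return the registered NQNs with the given role, sorted."""
        return sorted(e.nqn for e in self.entries.values() if e.role == role)


ns = NameServer()
ns.register(Endpoint("nqn.2014-08.com.example:host-a", "host", "192.0.2.10"))
ns.register(Endpoint("nqn.2014-08.com.example:array-1", "subsystem", "192.0.2.50"))
print(ns.list_by_role("host"))
```

Keying the table by NQN means a repeated login from the same device refreshes its entry instead of creating a duplicate.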
Once servers and storage are stored in the name server database, these end points can be grouped and managed via zoning to control which servers have access to which storage. In addition to aggregating the catalogue of servers and storage, IP fabric events can be monitored and reported. These include Asynchronous Event Request (AER) events, which allow admins to subscribe to state change notifications from end points, and Asynchronous Event Notification (AEN) events, which allow admins to send notifications to end points for state changes. Installation of SFSS is made simple by containerizing the software stack inside a virtual machine and deploying it in the ESXi environment.
SFSS Makes IP SANs Kind of Like Fibre Channel SANs…
SFSS includes several capabilities that are similar to Fibre Channel. Here’s a look at a few key ones. End Point Registration Service allows NVMe/TCP end points such as servers and storage arrays to register with the CDC. This is very similar to a Fibre Channel host logging in to a Fibre Channel switch and populating the FC switch name server database with the host WWN. In the case of SFSS, a server has a host NQN (NVMe Qualified Name) that needs to be registered in the name server. Storage arrays also need to register and populate their information in the name server.
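For readers more familiar with WWNs, it may help to see what an NQN looks like. The sketch below builds a UUID-based host NQN using the `nqn.2014-08.org.nvmexpress:uuid:` prefix and the 223-byte length limit defined by the NVMe specification. The helper names are hypothetical, and the pattern check is deliberately loose, a sanity check rather than a full validator.

```python
import re
import uuid
from typing import Optional

UUID_NQN_PREFIX = "nqn.2014-08.org.nvmexpress:uuid:"
MAX_NQN_BYTES = 223  # maximum NQN length in the NVMe specification

# Loose shape check: "nqn.", a yyyy-mm date, a reversed domain, optional ":suffix".
_NQN_PATTERN = re.compile(r"^nqn\.\d{4}-\d{2}\.[A-Za-z0-9.-]+(:.+)?$")


def make_host_nqn(host_uuid: Optional[uuid.UUID] = None) -> str:
    """Build a UUID-based NQN; a random UUID is used if none is given."""
    return UUID_NQN_PREFIX + str(host_uuid or uuid.uuid4())


def looks_like_nqn(value: str) -> bool:
    """Sanity-check the length and overall shape of an NQN string."""
    fits = len(value.encode("utf-8")) <= MAX_NQN_BYTES
    return fits and bool(_NQN_PATTERN.match(value))


fixed = uuid.UUID("12345678-1234-5678-1234-567812345678")
print(make_host_nqn(fixed))
print(looks_like_nqn(make_host_nqn()))
```

In practice the host operating system or hypervisor usually generates and stores the host NQN itself; the point here is only to show the shape of the identifier that gets registered.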
End Point Query Service within SFSS controls which servers and storage arrays can communicate with each other. A server can automatically query the CDC to find out which storage arrays it can communicate with. Similarly, a storage array queries the CDC to find out which servers are allowed to communicate with it.
SFSS also provides Zone Service to control access at a group level. Servers and storage can be grouped into zones via soft zoning. Each zone defines the set of servers and storage targets that can communicate with each other over the network. The zoning service is similar to Fibre Channel zoning as well.
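The interaction between zoning and the query service can be sketched as a set lookup: a query from an end point only returns peers that share at least one zone with it. This is an illustrative model under that assumption, not SFSS's implementation; the `Zone` class, the function names, and the NQNs are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Zone:
    name: str
    hosts: set = field(default_factory=set)       # host NQNs in the zone
    subsystems: set = field(default_factory=set)  # storage NQNs in the zone


def visible_subsystems(host_nqn: str, zones: list) -> list:
    """Storage NQNs that share at least one zone with the given host."""
    visible = set()
    for zone in zones:
        if host_nqn in zone.hosts:
            visible |= zone.subsystems
    return sorted(visible)


def visible_hosts(subsystem_nqn: str, zones: list) -> list:
    """Host NQNs that share at least one zone with the given subsystem."""
    visible = set()
    for zone in zones:
        if subsystem_nqn in zone.subsystems:
            visible |= zone.hosts
    return sorted(visible)


zones = [
    Zone("prod", {"host-a", "host-b"}, {"array-1"}),
    Zone("test", {"host-c"}, {"array-2"}),
]
print(visible_subsystems("host-a", zones))
print(visible_hosts("array-2", zones))
```

An end point that belongs to no zone gets an empty answer, which mirrors the default-deny behavior admins typically expect from Fibre Channel zoning.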
In addition, SFSS provides Asynchronous Notifications to enable alerting for status changes. This includes Asynchronous Event Request (AER) events and Asynchronous Event Notification (AEN) events. The automatic alerting is very similar to a Fibre Channel Registered State Change Notification (RSCN).
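The subscribe-then-notify pattern behind this alerting can be sketched with plain callbacks. This is a conceptual model only, not the NVMe wire protocol or the SFSS API: the `ChangeNotifier` class, its methods, and the end point names are hypothetical.

```python
class ChangeNotifier:
    """Tracks which end points asked to be told about state changes."""

    def __init__(self):
        self._subscribers = {}  # end point name -> callback

    def subscribe(self, endpoint: str, callback) -> None:
        """Roughly the role of an event request: register interest."""
        self._subscribers[endpoint] = callback

    def zone_changed(self, zone_name: str, members: set) -> list:
        """Roughly the role of an event notification: tell zone members
        that subscribed, and return the names that were notified."""
        notified = []
        for endpoint in sorted(members):
            callback = self._subscribers.get(endpoint)
            if callback is not None:
                callback(zone_name)
                notified.append(endpoint)
        return notified


received = []
notifier = ChangeNotifier()
notifier.subscribe("host-a", lambda zone: received.append(("host-a", zone)))
notifier.subscribe("host-c", lambda zone: received.append(("host-c", zone)))

# A change to the "prod" zone only reaches subscribed members of that zone.
print(notifier.zone_changed("prod", {"host-a", "host-b"}))
print(received)
```

Only end points that both subscribed and belong to the changed zone are notified, so a change in one zone does not disturb the rest of the fabric.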
SFSS Supported Platforms and End Points
Dell’s NVMe IP SAN solutions consist of NVMe-based products across Dell servers, storage, and networking. Here’s the list of what’s supported to build an end-to-end, fully tested, and integrated IP SAN:
- SmartFabric Storage Software (SFSS) is a stand-alone software solution packaged as a VM enabling an end-to-end automated and integrated NVMe solution running TCP over an IP fabric. The key functions of SFSS are policy-driven to help automate NVMe based storage service discovery, end-point registration, connectivity, and zoning services.
- PowerEdge servers running VMware ESXi 7.0U3 have expanded their support for NVMe to include NVMe over TCP. This new feature maps NVMe onto the TCP protocol to enable the transfer of data and NVMe commands between a host server and a target storage device.
- The latest PowerStore 2.1 release introduces support for NVMe/TCP to allow hosts to access storage systems across a network fabric using the NVMe protocol. Existing 25Gb/s IP ports can be used to connect to NVMe over TCP hosts. All that’s required is the latest version of PowerStoreOS. Support for PowerFlex and PowerMax is planned for next year.
- PowerSwitch S & Z series switches provide a high-performance Ethernet switching fabric that serves as an IP SAN interconnect for NVMe/TCP servers and storage. SmartFabric Services, a component of the SmartFabric OS10 operating system, is now integrated with PowerStore and ESXi and offers a turnkey IP SAN for NVMe/TCP.
- OpenManage Network Integration (OMNI) offers centralized management for multiple SFSS instances and integrates with VMware vCenter to provide a single pane of management through vCenter.
An End-to-End NVMe IP SAN Solution From Dell Technologies
IT organizations are interested in storage connectivity solutions that not only meet their current infrastructure demands but also provide the foundation to meet future demands. NVMe-based IP networks provide new storage interconnect solutions that help scale efficiently while reducing costs by leveraging standards-based IP networking technologies. With SFSS, these new connectivity solutions also incorporate automation capabilities much like those of existing Fibre Channel. The result is a modern, Dell-developed end-to-end storage solution that is tested and validated across compute, storage, and networking platforms.