ONTAP Select
Product Architecture and Best Practices
Tudor Pascu, NetApp
June 2017 | TR-4517
TABLE OF CONTENTS
1 Introduction
1.1 Software-Defined Infrastructure
1.5 ONTAP Select Evaluation software versus running ONTAP Select in Evaluation mode
5.2 Schedule-driven SnapMirror relationships periodically replicate the data from the remote office to a single consolidated engineered storage array, located in the main data center
7 Performance
LIST OF TABLES
Table 1) ONTAP Select versus Data ONTAP Edge.
Table 2) ONTAP Select virtual machine properties.
Table 3) ONTAP Select virtual machine properties.
Table 4) ONTAP Select 9.0 versus ONTAP Select 9.1 versus ONTAP Select 9.2.
Table 5) Internal versus external network quick reference.
Table 6) Network configuration support matrix.
Table 7) ONTAP Deploy vs. ONTAP Select support matrix.
Table 8) Performance results for a 4-node ONTAP Select Standard cluster and a 4-node ONTAP Select Premium cluster.
Table 9) Performance results for a single-node ONTAP Select Standard cluster on an AF VSAN datastore.
LIST OF FIGURES
Figure 1) Server LUN configuration with only RAID-managed spindles.
Figure 2) Server LUN configuration on mixed RAID/non-RAID system.
Figure 3) Virtual disk to physical disk mapping.
Figure 4) Incoming writes to ONTAP Select VM.
Figure 5) Two-node ONTAP Select cluster with remote Mediator and using local attached storage.
Figure 6) Four-node ONTAP Select cluster using local attached storage.
Figure 7) ONTAP Select mirrored aggregate.
Figure 8) ONTAP Select write path workflow.
Figure 9) HA heart-beating in a 4-node cluster: steady state.
Figure 10) ONTAP Select installation VM placement.
Figure 11) ONTAP Select multinode network configuration.
Figure 12) Network configuration of a multinode ONTAP Select VM.
Figure 13) Network configuration of single-node ONTAP Select VM.
Figure 14) Port group configurations using a standard vSwitch.
Figure 15) Link aggregation group properties when using LACP.
Figure 16) Port group configurations using a distributed vSwitch with LACP enabled.
Figure 17) Network configuration using shared physical switch.
Software-Defined Storage
The shift toward software-defined infrastructures may be having its greatest impact in an area that has traditionally been one of the least affected by the virtualization movement: storage. Software-only solutions that separate storage management services from the physical hardware are becoming more commonplace. This is especially evident within private cloud environments: enterprise-class, service-oriented architectures designed from the ground up around software-defined principles. Many of these environments are being built on commodity hardware: white box servers with locally attached storage, with software controlling the placement and management of user data.
This trend is also evident in the emergence of hyperconverged infrastructures (HCIs), a building-block style of IT design based on the premise of bundling compute, storage, and networking services. The rapid adoption of hyperconverged solutions over the past several years has highlighted the desire for simplicity and flexibility. However, as companies decide to replace enterprise-class storage arrays with a more customized, build-your-own model, assembling storage management solutions on top of home-grown components, a new set of problems emerges.
In a commodity world, where data lives fragmented across silos of direct-attached storage, data mobility
and data management become complex problems that need to be solved. This is where NetApp can help.
• Compression: Data ONTAP Edge – No; ONTAP Select – Yes.
• Hardware platform support: Data ONTAP Edge – select families within qualified server vendors; ONTAP Select – wider support for major vendor offerings that meet minimum criteria.
Hardware Requirements
ONTAP Select requires that the hosting physical server meet the following minimum requirements:
• Intel Xeon E5-26xx v3 (Haswell) CPU or greater: 6 cores (4 for ONTAP Select, 2 for the OS)
• 24GB RAM (16GB for ONTAP Select, 8GB for the OS)
• Minimum of two 1Gb NIC ports for single-node clusters, four 1Gb NIC ports for two-node clusters, and two 10GbE NIC ports (four recommended) for four-node clusters
The ONTAP Select 9.1 Premium license supports both the Standard VM (minimum requirements above) and the Premium VM. The ONTAP Select Premium VM reserves 8 cores and 64GB of RAM; therefore, the server minimum requirements should be adjusted accordingly.
For locally attached storage (DAS), the following requirements also apply:
• 8–24 internal disks (SAS)
• 8–24 internal disks (SAS, NL-SAS, or SATA; ONTAP Select 9.1 and later)
• 4–24 SSDs (ONTAP Select 9.1 Premium and later)
• Hardware RAID controller with 512MB writeback cache and 12Gb/s of throughput
Additionally, support for the OnCommand management suite is included. This includes most tooling used to manage NetApp FAS arrays, such as OnCommand Unified Manager (OCUM), OnCommand Insight (OCI), Workflow Automation (WFA), and SnapCenter®. Use of SnapCenter, SnapManager, or SnapDrive with ONTAP Select requires server-based licenses.
Consult the IMT for a complete list of supported management applications.
Note that the following ONTAP features are not supported by ONTAP Select:
• Interface groups (IFGRPs)
• Service Processor
• Hardware-centric features such as MetroCluster, Fibre Channel (FC/FCoE), and full disk encryption
(FDE)
• SnapLock®
• NSE Drives
• FabricPools
ONTAP Select 9.1 and 9.2 provide storage efficiency options that are similar to the storage efficiency options present on FAS and AFF arrays. Both ONTAP Select 9.1 and 9.2 support SSD media; however, there are significant differences in default behaviors between these releases, as well as between ONTAP Select Premium with SSD media and AFF arrays.
Please note that ONTAP Select vNAS deployments using All Flash VSAN or all flash arrays should follow
the best practices for ONTAP Select with non-SSD DAS storage.
ONTAP Deploy 2.4 adds an additional configuration check during ONTAP Select cluster setup that asks the user to confirm that the DAS storage is of SSD type. ONTAP Deploy enforces this check during setup, as well as during storage add operations. In other words, after an ONTAP Select Premium VM is configured for SSD storage, only local (DAS) SSD media can be added to that VM. There are a number of reasons for this, including the fact that ONTAP Select supports neither multiple RAID controllers nor mixed media types on the same RAID controller. This enforcement also ensures that the SSD-appropriate storage efficiency options cannot be enabled on HDD-based datastores.
Note that, unlike an AFF array, which automatically enables its inline storage efficiency policies, configuring an ONTAP Select 9.2 Premium system with the SSD feature during cluster setup does not automatically enable inline storage efficiencies inside ONTAP Select; it simply makes this functionality available for use later, at the time of volume creation. In other words, the client may enable inline storage efficiencies, on a volume-by-volume basis, for each volume provisioned on an ONTAP Select 9.2 Premium system with SSD media.
Table 2 summarizes the various Storage Efficiency options available and recommended depending on the
ONTAP Select version and media type:
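For reference, once a volume has been created on an ONTAP Select 9.2 Premium system with SSD media, inline efficiencies can be enabled per volume from the ONTAP CLI. The following is a minimal sketch; the SVM and volume names are placeholders, and exact option names can vary slightly between ONTAP releases:
volume efficiency on -vserver svm1 -volume vol1
volume efficiency modify -vserver svm1 -volume vol1 -inline-compression true -inline-dedupe true
The same settings can also be applied from System Manager at volume creation time.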
2 Architecture Overview
ONTAP Select is clustered Data ONTAP deployed as a virtual machine, providing storage management services on a virtualized commodity server.
The ONTAP Select product can be deployed two different ways:
• Non-HA (single node). The single-node version of ONTAP Select is well suited for storage infrastructures that provide their own storage resiliency, such as VSAN datastores or external arrays that offer data protection at the array layer. The single-node Select cluster can also be used for remote and branch offices where the data is protected by replication to a core location.
• High availability (multi-node). The multi-node version of the solution uses two or four ONTAP Select
nodes and adds support for high availability and clustered Data ONTAP non-disruptive operations, all
within a shared-nothing environment.
When choosing a solution, resiliency requirements, environment restrictions, and cost factors should be
taken into consideration. Although both versions run clustered Data ONTAP and support many of the same
core features, the multi-node solution provides high availability and supports non-disruptive operations, a
core value proposition for clustered Data ONTAP.
Note: The single-node and multi-node versions of ONTAP Select are deployment options, not separate
products. Although the multi-node solution requires the purchase of additional node licenses, both
share the same product model, FDvM300.
This section provides a detailed analysis of the various aspects of the system architecture for both the
single-node and multi-node solutions while highlighting important differences between the two variants.
SCSI controllers: 4
Serial ports: 2 network serial ports (Select 9.0 and 9.1 only)
Note: The core dump disk partition is separate from the system boot disk. Because the core file size is
directly related to the amount of memory allocated to the ONTAP instance, this allows NetApp to
support larger-sized memory instances in the future without requiring a redesign of the system boot
disk.
Note: The serial ports were removed from the ONTAP Select 9.2 VM, which allows ONTAP Select 9.2 to be installed on any vSphere license. Prior to ONTAP Select 9.2, only the vSphere Enterprise and Enterprise Plus licenses were supported.
Starting with ONTAP Select 9.2, the ONTAP console is accessible only through the virtual machine video console tab in the vSphere client.
Table 4 shows the differences between the ONTAP Select 9.0, 9.1, and 9.2 releases.
Table 4) ONTAP Select 9.0 versus ONTAP Select 9.1 versus ONTAP Select 9.2.
• CPU/memory: ONTAP Select 9.0 – 4 vCPUs/16GB RAM; ONTAP Select 9.1 and 9.2 – 4 vCPUs/16GB or 8 vCPUs/64GB (the 8 vCPU/64GB configuration requires a Premium license).
• Disk type: ONTAP Select 9.0 – SAS only; ONTAP Select 9.1 and 9.2 – SAS, NL-SAS, SATA, or SSD (SSD requires a Premium license).
• Maximum number of disks: 24 for all releases.
• vSphere license requirements: ONTAP Select 9.0 and 9.1 – Enterprise / Enterprise Plus; ONTAP Select 9.2 – all vSphere licenses are supported.
• Select cluster size: ONTAP Select 9.0 and 9.1 – single node or 4 nodes; ONTAP Select 9.2 – single node, 2 nodes (requires ONTAP Deploy 2.4 at minimum), or 4 nodes.
When using locally attached storage (DAS), ONTAP Select makes use of the hardware RAID controller cache to achieve a significant increase in write performance. Additionally, when using locally attached storage, certain restrictions apply to the ONTAP Select virtual machine.
Specifically:
• Only one ONTAP Select VM can reside on a single server.
• The ONTAP Select VM may not be migrated or vMotioned to another server. This includes Storage vMotion of the ONTAP Select VM.
• vSphere Fault Tolerance (FT) is not supported.
The following limitations should be considered when installing a single node ONTAP Select cluster on a
VSAN type datastore:
• Only one ONTAP Select node per VSAN / ESX host is supported. Multiple single node Select
clusters can share a VSAN datastore as long as they are installed on separate VSAN hosts.
• The ONTAP Deploy auto-discovery and re-host operations require that all ESX hosts be managed
by the same vCenter.
• A VMware HA or vMotion operation can result in a situation where two ONTAP Select VMs reside
on the same ESX host. This configuration is not currently supported and ONTAP Deploy 2.4 will
be unable to re-establish management connectivity to the ONTAP Select VM until that VM is moved
to another ESX host.
The following best practices should be considered when installing a single node Select cluster on an
external array type datastore:
• vSphere 6.0 Update 1, 2, or 3 is supported. An Enterprise license is required for versions prior to ONTAP Select 9.2 and ONTAP Deploy 2.4; all vSphere licenses are supported starting with ONTAP Select 9.2 and ONTAP Deploy 2.4.
• FC/FCoE/iSCSI and NFS are supported protocols for the connectivity between the ESX host and
the external array.
• Hybrid arrays and All Flash arrays are supported with both ONTAP Select Standard and Premium.
• Array side storage efficiency policies are supported.
The following limitations should be considered when installing a single node Select cluster on an external
array type datastore:
• VVols are not supported.
• Only one ONTAP Select node per ESX host is supported. Multiple single node ONTAP Select
clusters can share an external array datastore as long as they are installed on separate ESX hosts.
• The ONTAP Deploy auto-discovery and re-host operations require that all ESX hosts be managed
by the same vCenter.
• A VMware HA or vMotion operation can result in a situation where two ONTAP Select VMs reside
on the same ESX host. This configuration is not currently supported and ONTAP Deploy 2.4 will
be unable to re-establish management connectivity to the ONTAP Select VM until that VM is moved
to another ESX host.
NetApp FAS, SolidFire, and E-Series arrays are supported as long as they are on the VMware HCL. We recommend following the NetApp and VMware vSphere Storage Best Practices documentation for the respective array.
RAID Mode
Many RAID controllers support up to three modes of operation, each representing a significant difference
in the data path taken by write requests. These are:
• Writethrough. All incoming I/O requests are written to the RAID controller cache and then
immediately flushed to disk before acknowledging the request back to the host.
• Writearound. All incoming I/O requests are written directly to disk, circumventing the RAID
controller cache.
• Writeback. All incoming I/O requests are written directly to the controller cache and immediately
acknowledged back to the host. Data blocks are flushed to disk asynchronously using the controller.
Writeback mode offers the shortest data path, with I/O acknowledgement occurring immediately after the
blocks enter cache, and thus lower latency and higher throughput for mixed read/write workloads. However,
without the presence of a BBU or nonvolatile flash technology, when operating in this mode, users run the
risk of losing data should the system incur a power failure.
Because ONTAP Select requires the presence of a battery backup or flash unit, we can be confident that
cached blocks are flushed to disk in the event of this type of failure. For this reason, it is a requirement that
the RAID controller be configured in writeback mode.
Best Practice
The server RAID controller should be configured to operate in writeback mode. If write workload
performance issues are seen, check the controller settings and make sure that writethrough or
writearound is not enabled.
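As an illustration only, on a Broadcom/LSI-based controller the write-cache policy can typically be checked and changed with the storcli utility; the controller and virtual drive IDs below are placeholders, and the exact syntax depends on the RAID vendor and tool version:
storcli /c0/v0 show all
storcli /c0/v0 set wrcache=wb
Consult the RAID controller vendor's documentation for the equivalent commands in other management tools.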
Provisioning the OS LUNs from the same RAID group as ONTAP Select allows the hypervisor OS (and any client VMs that are also provisioned from that storage) to benefit from RAID protection, preventing a single-drive failure from bringing down the entire system.
Best Practice
If the physical server contains a single RAID controller managing all locally attached disks, we
recommend creating a separate LUN for the server OS and one or more LUNs for ONTAP Select. In the
event of boot disk corruption, this allows the administrator to recreate the OS LUN without affecting
ONTAP Select.
Multiple LUNs
There are two cases where single–RAID group / single-LUN configurations must change. When using NL-
SAS or SATA drives, the RAID group size must not exceed 12 drives. Additionally, when a single LUN
becomes larger than the underlying hypervisor storage limits (either individual file system extent maximum
size or total storage pool maximum size), then the underlying physical storage must be broken up into
multiple LUNs to allow for successful file system creation.
Best Practice
ONTAP Select receives no performance benefits by increasing the number of LUNs within a RAID group.
Multiple LUNs should only be used in order to follow best practices for SATA / NL-SAS configurations or
to bypass hypervisor file system limitations.
Best Practice
Similar to creating multiple LUNs, ONTAP Select receives no performance benefits by increasing the
number of virtual disks used by the system.
Best Practice
Because the RAID controller cache is used to store all incoming block changes and not only those
targeted toward the NVRAM partition, when choosing a RAID controller, select one with the largest cache
available. A larger cache allows for less frequent disk flushing and an increase in performance of the
ONTAP Select VM, the hypervisor, and any compute VMs collocated on the server.
Figure 5) Two-node ONTAP Select cluster with remote Mediator and using local attached storage.
Note: The 4-node ONTAP Select cluster is composed of two HA pairs. The 2-node ONTAP Select cluster is composed of one HA pair and a Mediator. Within each HA pair, data aggregates on each cluster node are synchronously mirrored, and in the event of a failover there is no loss of data.
In the situation in which the ONTAP Deploy VM acting as a Mediator is temporarily, or potentially permanently, unavailable, a secondary ONTAP Deploy VM (minimum version 2.4) can be used to restore the 2-node cluster quorum. This results in a configuration in which the new ONTAP Deploy VM is unable to manage the ONTAP Select nodes, but it successfully participates in the cluster quorum algorithm. The communication between the ONTAP Select nodes and the ONTAP Deploy VM is done using the iSCSI protocol. The ONTAP Select node management IP address is the initiator, and the ONTAP Deploy VM IP address is the target. The ONTAP Deploy hosted mailbox disks are automatically created and masked to the proper ONTAP Select node management IP addresses at the time of the 2-node cluster creation. The entire configuration is performed automatically during setup, and no further administrative action is required. The ONTAP Deploy instance creating the cluster is the default Mediator for that cluster.
An administrative action is required if the original Mediator location needs to be changed. It is possible to
recover a cluster quorum even if the original ONTAP Deploy VM is completely lost. However, we
recommend that a backup of the ONTAP Deploy database be done after every 2-node cluster is
instantiated.
For a complete list of steps required to configure a new Mediator location, please refer to the “ONTAP
Select 9 Installation and Cluster Deployment Guide”.
Synchronous Replication
The Data ONTAP HA model is built on the notion of HA partners. As explained earlier, ONTAP Select
extends this architecture into the non-shared commodity server world by using the RAID SyncMirror
functionality that is present in clustered Data ONTAP to replicate data blocks between cluster nodes,
providing two copies of user data spread across an HA pair.
Note: This product is not intended to be an MCC-style disaster recovery replacement and cannot be used
as a stretch cluster. Cluster network and replication traffic occurs using link-local IP addresses and
requires a low-latency, high-throughput network. As a result, spreading out cluster nodes across
long distances is not supported.
Note: When an ONTAP Select cluster is deployed, all virtual disks present on the system are auto assigned
to the correct plex, requiring no additional step from the user with respect to disk assignment. This prevents
the accidental assignment of disks to an incorrect plex and makes sure of optimal mirror disk configuration.
Although the existence of the mirrored aggregate is needed to provide an up-to-date (RPO 0) copy of the primary aggregate, care should be taken that the primary aggregate does not run low on free space. A low-space condition in the primary aggregate may cause ONTAP to delete the common Snapshot® copy used as the baseline for storage giveback. Although this works as designed in order to accommodate client writes, the lack of a common Snapshot copy on failback requires the ONTAP Select node to do a full baseline from the mirrored aggregate. This operation can take a significant amount of time in a shared-nothing environment.
A good threshold for monitoring aggregate space utilization is 85%.
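One simple way to track this from the ONTAP CLI is sketched below; field names can vary slightly between ONTAP releases:
storage aggregate show -fields percent-used,availsize
OnCommand Unified Manager can also be used to alert on aggregate capacity thresholds.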
Disk Heart-beating
Although the ONTAP Select HA architecture leverages many of the code paths used by the traditional FAS
arrays, some exceptions exist. One of these exceptions is in the implementation of disk-based heart-
beating, a non network-based method of communication used by cluster nodes to prevent network isolation
from causing split-brain behavior. Split brain is the result of cluster partitioning, typically caused by network
failures, whereby each side believes the other is down and attempts to take over cluster resources.
Enterprise-class HA implementations must gracefully handle this type of scenario, and Data ONTAP does
this through a customized disk-based method of heart-beating. This is the job of the HA mailbox, a location
on physical storage that is used by cluster nodes to pass heart-beat messages. This helps the cluster
determine connectivity and therefore define quorum in the event of a failover.
On FAS arrays, which use a shared-storage HA architecture, Data ONTAP resolves split-brain issues
through:
• SCSI persistent reservations
• Persistent HA metadata
• HA state sent over HA interconnect
However, within the shared-nothing architecture of an ONTAP Select cluster, a node is only able to “see”
its own local storage and not that of the HA partner. Therefore, when network partitioning isolates each side
of an HA pair, the preceding methods of determining cluster quorum and failover behavior are unavailable.
Although the existing method of split-brain detection and avoidance cannot be used, a method of mediation
is still required, one that fits within the constraints of a shared-nothing environment. ONTAP Select extends
the existing mailbox infrastructure further, allowing it to act as a method of mediation in the event of network
partitioning. Because shared storage is unavailable, mediation is accomplished through access to the
mailbox disks over network-attached storage. These disks are spread throughout the cluster, including the
Mediator in a 2-node cluster, using the iSCSI protocol, so intelligent failover decisions can be made by a
cluster node based on access to these disks. If a node is able to access the mailbox disks of other nodes
outside of its HA partner, it is likely up and healthy.
Note: The mailbox architecture and disk-based heart-beating method of resolving cluster quorum and
split-brain issues are the reasons the multi-node variant of ONTAP Select requires either four
separate nodes or a mediator for a 2-node cluster.
HA Mailbox Posting
The HA mailbox architecture uses a message “post” model. At repeated intervals, cluster nodes post
messages to all other mailbox disks across the cluster, including the Mediator, stating that the node is up
and running. Within a healthy cluster, at any given point in time, a single mailbox disk on a cluster node has
messages posted from all other cluster nodes.
Attached to each Select cluster node is a virtual disk that is used specifically for shared mailbox access.
This disk is referred to as the mediator mailbox disk, because its main function is to act as a method of
cluster mediation in the event of node failures or network partitioning. This mailbox disk contains partitions
for each cluster node and is mounted over an iSCSI network by other Select cluster nodes. Periodically,
these nodes post health status to the appropriate partition of the mailbox disk. Using network-accessible
mailbox disks spread throughout the cluster allows us to infer node health through a reachability matrix. For
example, if cluster nodes A and B can post to the mailbox of cluster node D, but not to that of node C, the remaining cluster nodes can infer that node C is down or isolated from the network.
HA Heart-beating
Like NetApp’s FAS platforms, ONTAP Select periodically sends HA heartbeat messages over the HA
interconnect. Within the ONTAP Select cluster, this is done over a TCP/IP network connection that exists
between HA partners. Additionally, disk-based heartbeat messages are passed to all HA mailbox disks,
including mediator mailbox disks. These messages are passed every few seconds and read back
periodically. The frequency with which these are sent/received allows the ONTAP Select cluster to detect
HA failure events within approximately 15 seconds, the same window available on FAS platforms. When
heartbeat messages are no longer being read, a failover event is triggered.
Figure 8 illustrates the process of sending and receiving heartbeat messages over the HA interconnect and
mediator disks from the perspective of a single ONTAP Select cluster node, node C. Note that network
heartbeats are sent over the HA interconnect to the HA partner, node D, while disk heartbeats use mailbox
disks across all cluster nodes, A, B, C, and D.
Deploy Upgrades
The Deploy utility can be upgraded separately from the Select cluster. Similarly, the Select cluster can be upgraded separately from the Deploy utility. See the Upgrade section for the Deploy and Select interoperability matrix.
Server Preparation
Although ONTAP Deploy provides the user with functionality that allows for configuration of portions of the
underlying physical server, there are several requirements that must be met before attempting to manage
the server. This can be thought of as a manual preparation phase, because many of the steps are difficult
to orchestrate through automation. This preparation phase involves the following:
• For local storage, the RAID controller and attached local storage are configured, and RAID groups and LUNs have been provisioned.
• For VSAN or external array hosted datastores, ensure that the configurations are supported by the VMware HCL and follow the specific vendor best practices.
• Physical network connectivity to the server is verified.
- For external arrays, the network resiliency, speed, and throughput are critical to the performance of the ONTAP Select VM.
• The hypervisor is installed.
• Virtual networking constructs (vSwitches/port groups) are configured.
Note: After the ONTAP Select cluster has been deployed, the appropriate ONTAP management tooling
should be used to configure SVMs, LIFs, volumes, and so on. ONTAP Deploy does not provide
this functionality.
The ONTAP Deploy utility and ONTAP Select software are bundled together into a single virtual machine,
which is then made available as a .OVA file for vSphere. The bits are available from the NetApp Support
site, from this link:
http://mysupport.netapp.com/NOW/cgi-bin/software
This installation VM runs the Debian Linux OS and has the following properties:
• 2 vCPUs
• 4GB RAM
• 40GB virtual disk
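As one option, the OVA can be deployed from a workstation with VMware's ovftool, as sketched below; the file name, credentials, datastore, and inventory path are placeholders, and the vSphere client can be used instead:
ovftool --acceptAllEulas --name=ONTAPdeploy --datastore=datastore1 ONTAPdeploy.ova "vi://administrator@vcenter.example.com/DC1/host/Cluster1/"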
VM Placement
The ONTAP Select installation VM can be placed on any virtualized server in the customer environment.
For 4-node clusters, the ONTAP Deploy VM can be collocated on the same host as an ONTAP Select
instance or on a separate virtualized server. For 2-node clusters, where the ONTAP Deploy VM is also the
cluster Mediator, the collocation model is NOT supported because it would become a cluster single point of
failure (SPOF).
The ONTAP Deploy VM can be installed in the same datacenter as the ONTAP Select cluster, or it can be
centrally deployed in a core datacenter. The only requirement is that there exists network connectivity
between the ONTAP Deploy VM and the targeted ESX host as well as the future ONTAP Select cluster
management IP address. Note that creating an ONTAP Select cluster over the WAN may take considerably longer, because copying the ONTAP Select binary files depends on the latency and bandwidth available between datacenters. Deploying a 2-node ONTAP Select cluster is supported on a WAN network whose maximum latency and minimum bandwidth can support the Mediator service traffic (minimum throughput 5Mb/s, maximum latency 500ms RTT).
The following figure shows these deployment options.
Note: Collocating the ONTAP Deploy VM and one of the ONTAP Select instances is NOT supported for
2-node clusters.
Best Practice
To eliminate the possibility of having multiple Deploy instances assign duplicate MAC addresses, one Deploy instance per L2 network should be used to manage existing Select clusters/nodes or to create new ones.
Note: Each ONTAP Deploy instance can generate up to 64,000 unique MAC addresses. Each ONTAP
Select node consumes four MAC addresses for its internal communication network schema. Each
Deploy instance is also limited to managing 100 Select clusters and 400 hosts (a host is equivalent
to one hypervisor server).
For 2-node clusters, the ONTAP Deploy VM that creates the cluster is also the default Mediator, and it requires no further configuration. However, it is absolutely critical that the Mediator service be continuously available to ensure proper functioning of the storage failover capabilities. For configurations where network latency, bandwidth, or other infrastructure issues require repositioning the Mediator service closer to the ONTAP Select 2-node cluster, another ONTAP Deploy VM can be used to host the Mediator mailboxes temporarily or permanently.
Best Practice
The ONTAP Select 2-node cluster should be carefully monitored for EMS messages indicating that the
storage failover is disabled. These messages indicate a loss of connectivity to the Mediator service and
should be rectified immediately.
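As a starting point, the failover state and recent EMS activity can be checked from the ONTAP CLI as sketched below; exact event message names vary by release, so treat the filter as an example:
storage failover show
event log show -severity ERROR
The storage failover show output should report that takeover is possible on both nodes; if it does not, connectivity to the Mediator service should be investigated.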
After the installation, ONTAP Deploy can be used to complement the other NetApp management tools for
troubleshooting purposes.
The ONTAP Deploy command line interface provides options for troubleshooting that are not available in
the GUI. Most commands include a “show” option. This allows you to gather information about the
environment.
The ONTAP Deploy logs can contain valuable information to help troubleshoot cluster setup issues. The ONTAP Deploy GUI and command line interfaces allow you to generate an AutoSupport bundle containing the ONTAP Deploy logs. The GUI also allows you to download the bundle for immediate inspection.
Finally, the Deploy GUI can be used to invoke node-specific AutoSupport bundles.
Each ONTAP Select virtual machine contains six virtual network adapters, presented to Data ONTAP as a
set of six network ports, e0a through e0f. Although ONTAP treats these adapters as physical NICs, they
are in fact virtual and map to a set of physical interfaces through a virtualized network layer. As a result,
each hosting server does not require six physical network ports.
Note: Adding virtual network adapters to the ONTAP Select VM is not supported.
These ports are preconfigured to provide the following services:
• e0a, e0b. Data and management LIFs
• e0c, e0d. Cluster network LIFs
• e0e. RAID SyncMirror (RSM)
• e0f. HA interconnect
Ports e0a and e0b reside on the external network. Although ports e0c–e0f perform several different
functions, collectively they compose the internal Select network. When making network design decisions,
these ports should be placed on a single L2 network. There is no need to separate these virtual adapters
across different networks.
The relationship between these ports and the underlying physical adapters can be seen in Figure 11, which
depicts one ONTAP Select cluster node on the ESX hypervisor.
Segregating internal and external traffic across different physical NICs ensures that we are not introducing
latencies into the system due to insufficient access to network resources. Additionally, aggregation through
NIC teaming makes sure that failure of a single network adapter does not prevent the ONTAP Select cluster
node from accessing the respective network.
LIF Assignment
With the introduction of IPspaces, Data ONTAP port roles have been deprecated. Similar to FAS arrays,
ONTAP Select clusters contain both a default and cluster IPspace. By placing network ports e0a and e0b
into the default IPspace and ports e0c and e0d into the cluster IPspace, we have essentially walled off
those ports from hosting LIFs that do not belong. The remaining ports within the ONTAP Select cluster are
consumed through the automatic assignment of interfaces providing internal services and not exposed
through the ONTAP shell, as is the case with the RSM and HA interconnect interfaces.
Note: Not all LIFs are visible through the ONTAP command shell. The HA interconnect and RSM
interfaces are hidden from ONTAP and used internally to provide their respective services.
The network ports/LIFs are explained in further detail in the following sections.
HA Interconnect (e0f)
NetApp FAS arrays use specialized hardware to pass information between HA pairs in an ONTAP cluster.
Software-defined environments, however, do not tend to have this type of equipment available (such as
Infiniband or iWARP devices), so an alternate solution is needed. Although several possibilities were
considered, ONTAP requirements placed on the interconnect transport required that this functionality be
emulated in software. As a result, within an ONTAP Select cluster, the functionality of the HA interconnect
(traditionally provided by hardware) has been designed into the OS, using Ethernet as a transport
mechanism.
Each ONTAP Select node is configured with an HA interconnect port, e0f. This port hosts the HA
interconnect network interface, which is responsible for two primary functions:
• Mirroring the contents of NVRAM between HA pairs
• Sending/receiving HA status information and network heartbeat messages between HA pairs
HA interconnect traffic flows through this network port using a single network interface by layering RDMA
frames within Ethernet packets. Similar to RSM, neither the physical port nor the hosted network interface
is visible to users from either the ONTAP CLI or management tooling. As a result, the IP address of this
interface cannot be modified, and the state of the port cannot be changed.
Note: This network port requires the use of jumbo frames (9000 MTU).
Note that NIC teaming is still required, though two adapters are sufficient for a single-node cluster.
LIF Assignment
As explained in the multinode LIF assignment section of this document, IPspaces are used by ONTAP
Select to keep cluster network traffic separate from data and management traffic. The single-node variant
of this platform does not contain a cluster network; therefore, no ports are present in the cluster IPspace.
Note: Cluster and node management LIFs are automatically created during ONTAP Select cluster setup.
The remaining LIFs may be created post deployment.
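As a reference, a data LIF can be created after deployment from the ONTAP CLI as sketched below; the SVM, LIF, node, port, and address values are placeholders:
network interface create -vserver svm1 -lif data1 -role data -data-protocol nfs -home-node <select-node> -home-port e0a -address 192.168.10.50 -netmask 255.255.255.0
The same operation can be performed from System Manager or other ONTAP management tooling.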
Note: Using the Select internal network for traffic other than Select cluster traffic, such as application or
management traffic, is not supported. There can be no other VMs or hosts on the ONTAP internal
VLAN.
Network packets traversing the internal network must be on a dedicated VLAN tagged layer-2 network.
This can be accomplished by one of the following:
• Assigning a VLAN-tagged port group to the internal virtual NICs (e0c–e0f)
• Using the native VLAN provided by the upstream switch where the native VLAN is not used for any
other traffic
DHCP support: No for both the internal and the external network.
Starting with Deploy 2.2, the internal network in a multinode cluster can be validated using the network connectivity checker functionality, which can be invoked from the Deploy command line interface using the 'network connectivity-check start' command.
The output of the test can be viewed using the 'network connectivity-check show --run-id X' command.
This tool is only useful for troubleshooting the internal network in a multinode Select cluster. It should not be used to troubleshoot single-node clusters or client-side connectivity issues.
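For example, from the Deploy command line interface (the run ID is assigned per invocation; "1" below is illustrative):
network connectivity-check start
network connectivity-check show --run-id 1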
NIC Aggregation
To make sure that the internal and external networks have both the necessary bandwidth and resiliency
characteristics required to provide high performance and fault tolerance, physical network adapter
aggregation is used. This is a requirement on both the internal and external networks of the ONTAP Select
cluster and provides the ONTAP Select cluster with two major benefits:
• Isolation from a single physical port failure
• Increased throughput
NIC aggregation allows the ONTAP Select instance to balance network traffic across two physical ports.
LACP-enabled port channels are only supported with distributed vSwitches.
Best Practice
In the event that a NIC has multiple ASICs, select one network port from each ASIC when building
network aggregation constructs through NIC teaming for the internal and external networks.
Configuration 1: two or more 10Gb physical ports with a distributed vSwitch
• Environment: 2 or more 10Gb physical ports; distributed vSwitch; the physical uplink switch supports LACP and a 9,000 MTU size on all ports.
• Configuration: a single LACP channel with all ports. The internal network uses a port group with VST to add VLAN tagging. The external network uses a separate port group; VST and VGT are supported. LACP mode is set to ACTIVE on both the ESX and the physical switches; the LACP timer should be set to FAST (1 second) on the port channel interfaces and on the VMNICs. VMware recommends that STP be set to Portfast on the switch ports connected to the ESXi hosts.
• Load balancing: the load-balancing policy at the port group level is "route based on IP hash", with "source and destination IP address and TCP/UDP port and VLAN" on the link aggregation group (LAG).
Configuration 2: standard vSwitch, mixed port speeds, or no end-to-end 9,000 MTU support
• Environment: 2 x 10Gb ports and 2 x 1Gb ports, or 9,000 MTU is not supported on all physical ports or switch ports, or a standard vSwitch is used.
• Configuration: do not use any LACP channels. The internal network must use a port group with at least two 10Gb ports and MTU 9,000; 1Gb ports and ports that do not support 9,000 MTU should be used for the external network. The external network uses a separate port group containing all the ports; the ACTIVE ports are the ports that are not used for the internal network, and the STANDBY ports are the internal network ports. All the ports must be owned by the same vSwitch, and the MTU setting on the vSwitch must be set to 9,000.
• Load balancing: the load-balancing policy at the port group level is "route based on originating virtual port ID." VMware recommends that STP be set to Portfast on the switch ports connected to the ESXi hosts.
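For illustration, the vSwitch MTU and internal port group VLAN tagging described above can be applied on a standard vSwitch with esxcli as sketched below; the vSwitch name, port group name, and VLAN ID are placeholders:
esxcli network vswitch standard set --vswitch-name=vSwitch0 --mtu=9000
esxcli network vswitch standard portgroup set --portgroup-name="ONTAP-Internal" --vlan-id=10
Equivalent settings can be made through the vSphere client; for a distributed vSwitch, the MTU, LAG, and load-balancing policies are configured at the vCenter level instead.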
Because the performance of the ONTAP Select VM is tied directly to the characteristics of the underlying
hardware, increasing the throughput to the VM by selecting 10Gb-capable NICs results in a higher
performing cluster and a better overall user experience. When cost or form factor prevents the user from
designing a system with four 10Gb NICs, two 10Gb NICs can be used.
See Figure 20 for an example of a configuration where LACP is used and Figure 21 for a configuration without LACP.
Best Practice
To make sure of optimal load balancing across both the internal and the external ONTAP Select
networks, the load-balancing policy of “Route based on originating virtual port” should be used.
Figure 13 shows the configuration of a standard vSwitch and the two port groups responsible for handling
internal and external communication services for the ONTAP Select cluster.
Note that the external network may use the internal network vmnics in the case of a network outage, but
the opposite may not always be the case, depending on the vmnic properties for speed and MTU size.
Figure 16) Port group configurations using a distributed vSwitch with LACP enabled.
Best Practice
NetApp recommends that the LACP mode be set to ACTIVE on both the ESX and the physical switches.
Furthermore, the LACP timer should be set to FAST (1 second) on the portchannel interfaces and on the
VMNICs.
When using a distributed vSwitch with LACP, we recommend configuring the load-balancing policy to "Route based on IP hash" on the port group and "Source and destination IP address and TCP/UDP port and VLAN" on the link aggregation group (LAG).
VMware recommends that STP be set to Portfast on the switch ports connected to the ESXi hosts. Not
setting STP to Portfast on the switch ports may affect ONTAP Select's ability to tolerate uplink failures.
Note: In this configuration, the shared switch becomes a single point of failure. If possible, multiple
switches should be used to prevent a physical hardware failure from causing a cluster network
outage.
Best Practice
When sufficient hardware is available, NetApp recommends using the following multi-switch
configuration, due to the added protection against physical switch failures.
Figure 19 shows the second scenario, where traffic is tagged by the ONTAP VM using virtual VLAN ports
that are placed into separate broadcast domains. In this example, virtual ports e0a-10/e0b-10 and e0a-
20/e0b-20 are placed on top of VM ports e0a and e0b, allowing the network tagging to be done directly
within ONTAP, rather than at the vSwitch layer. Management and data LIFs are placed on these virtual
ports, allowing further L2 subdivision within a single VM port. The cluster VLAN (VLAN ID 30) is still tagged
at the port group.
Note: This style of configuration is especially desirable when using multiple IPspaces. Group VLAN ports
into separate custom IPspaces if further logical isolation and multitenancy are desired.
If data traffic spans multiple layer 2 networks (and the use of VLAN ports is required) or when using
multiple IPspaces, VGT should be used.
Best Practice
In an environment where conditions prevent the server from being fitted with four 10Gb NIC cards, two 1Gb NICs can be used for the external ONTAP network.
Four 1Gb ports can be used for internal traffic in 2-node ONTAP Select clusters.
Figures 20 through 22 depict various ways in which to configure the network on a physical server with four physical NIC ports, depending on whether a distributed switch is used or whether all four ports are 10Gb.
For 2-node ONTAP Select clusters, Figures 20 and 21 are also supported with four 1Gb ports.
Note that in all cases VLAN tagging for internal network traffic is done by the port group (VLAN 10). External
traffic, however, is untagged by the port group and instead is tagged by the upstream switch, using the
native VLAN tag (VLAN 20). This is only intended to highlight one possible way of implementing layer 2
tagging within an ONTAP Select cluster. Like the ONTAP internal port group, a static VLAN ID could also
be assigned to the external network. Implementing tagging at the VM layer and not at the vSwitch does
have one added benefit, however. Similar to FAS systems, ONTAP Select allows the use of multiple
IPspaces and VLAN tagging in its support for multitenancy implementations. In order for this functionality
to be available to the ONTAP Select administrator, VLAN tagging should be done at the VM level.
Implementing the tagging within a virtual machine is a process known as virtual guest tagging (VGT). Using
VGT with ONTAP Select, rather than implementing VLAN tagging through the port group or physical switch,
allows data, management, and replication traffic to be further split across multiple layer 2 networks.
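A sketch of the ONTAP side of a VGT configuration is shown below; the node name, VLAN IDs, and IPspace name are placeholders:
network port vlan create -node select-node1 -vlan-name e0a-10
network port vlan create -node select-node1 -vlan-name e0b-10
network ipspace create -ipspace tenant1
network port broadcast-domain create -ipspace tenant1 -broadcast-domain bd-vlan10 -mtu 1500 -ports select-node1:e0a-10,select-node1:e0b-10
Data and management LIFs can then be created on the VLAN ports within the appropriate IPspace, as described earlier.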
5 Use Cases
6 Upgrading
This section contains important information concerning the maintenance of various aspects of an ONTAP
Select cluster. It is possible to upgrade ONTAP Select and ONTAP Deploy independent of each other. The
following table describes the support matrix for Select and Deploy:
ONTAP Deploy will only manage Select clusters that it has deployed. Currently there is no functionality to
“discover” Select clusters installed using another instance of Deploy. We recommend backing up the Deploy
configuration every time a new Select cluster has been deployed. Restoring the Deploy database allows a
new Deploy instance to manage Select clusters installed using another Deploy VM. However, care should
be taken to ensure that one cluster is not managed by multiple Deploy instances.
Best Practice
NetApp recommends that the Deploy database be backed up on a regular basis as well as every time a
configuration change is made and before any upgrade.
Our first step is to assign the disks to the proper cluster node and plex. To accomplish this, use the following
steps (in this example, we’re using a newly installed ONTAP Select cluster with two 100GB data disks per
node):
From the ONTAP CLI, run the following command:
disk show –fields location,aggregate,owner
The “location” field lists the ONTAP Select cluster node that has a physical connection to the backing
VMDK. This is the owning node.
a. From here we can see that node "sdota" has two unassigned data disks physically connected: NET-1.2 and NET-1.3.
b. We can also see that node “sdotb” has two unassigned data disks physically connected: NET-2.3
and NET-2.4.
To create an aggregate on node sdota, assign a local disk to the storage pool 0 (another term for plex) and
a mirror disk to storage pool 1. Remember that the mirror disk must be contributed by the HA partner,
in this case sdotb, so we’ll use disk NET-2.4.
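A sketch of the corresponding assignment commands is shown below (disk names follow the example output above; pool 0 holds the local plex and pool 1 the mirrored plex):
storage disk assign -disk NET-1.2 -owner sdota -pool 0
storage disk assign -disk NET-2.4 -owner sdota -pool 1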
Now that disks have been assigned to the correct plex (pool), our next step is to create the aggregate.
Note: This step may also be performed using System Manager.
To build the aggregate, issue the following command:
aggregate create -aggregate <aggr-name> -diskcount 2 -mirror true -node
<ontap-node>
mycluster::> aggregate create -aggregate data_aggr1 -diskcount 2 -mirror true -node sdota
(storage aggregate create)
Info: The layout for aggregate "data_aggr1" on node "sdota" would be:
First Plex
Second Plex
Note: From this point, SVMs, volumes, LIFs, and protocol configuration can be done through System
Manager (or the ONTAP CLI) using the same set of procedures you would use to configure these
on a FAS.
7 Performance
The following performance numbers are intended to be used as a rough estimate of the performance of a
Select cluster and are not a performance guarantee. The performance of an ONTAP Select cluster can
vary considerably due to the characteristics of the underlying hardware and configuration. The following
numbers should be used solely as a guide.
Reference Platform
Client hardware:
• 4 NFSv3 IBM 3650 clients
Config info:
• 1500 MTU for data path between clients and Select cluster
• No storage efficiency features in use (compression, dedupe, Snapshot copies, SnapMirror, and so on)
Results
Table 8 shows the throughput measured against read/write workloads on four-node ONTAP Select Standard and Premium clusters. The ONTAP Select Premium cluster used SSD media. Performance measurements were taken with the SIO load-generating tool, using the configuration defined earlier. Further details are provided for each test scenario later in this section.
Table 8) Performance results for a 4-node ONTAP Select Standard cluster and a 4-node ONTAP Select
Premium cluster.
[Chart accompanying Table 8: throughput per node (MB/s) for sequential write, random write, sequential read, and random read workloads, comparing the 4-core/16GB Standard VM with the 8-core/64GB Premium VM. The Premium VM shows gains ranging from roughly +50% to +749% depending on the workload.]
Sequential Read
Details:
• SIO direct I/O enabled
• 1 data NIC
• 1 data aggregate (1TB)
64 volumes, 64 SIO procs/threads
32 volumes per node (64 total)
1 SIO proc per volume, 1 SIO thread per file
1 file per volume; files 12GB each
Files pre-created using mkfile
Sequential Write
Details:
• SIO direct I/O enabled
• 1 data NIC
• 1 data aggregate (1TB)
64 volumes, 128 SIO procs/threads
32 volumes per node (64 total)
2 SIO procs per volume, 1 SIO thread per file
2 files per volume; files are 30720MB each
Using 100% sequential 64KiB I/Os, each thread writes through each file sequentially from beginning to end.
Each measurement lasts for 300 seconds. Tests are purposefully sized so that the I/O never wraps within
a given file. Performance measurements are designed to force I/O to disk.
Random Read
Details:
• SIO direct I/O enabled
• 1 data NIC
• 1 data aggregate (1TB)
64 volumes, 64 SIO procs, 512 threads
32 volumes per node (64 total)
64 SIO procs in total, each with 8 threads
1 SIO proc per volume, 8 threads per file
1 file per volume; files are 8192MB each
Files precreated using mkfile
Using 100% random 4KiB I/Os, each thread randomly reads through each file. Each measurement lasts for
300 seconds. Performance measurements are designed to force I/O from disk.
Random Write
Details:
• SIO direct I/O enabled
• 1 data NIC
• 1 data aggregate (1TB)
64 volumes, 64 SIO procs, 512 threads
32 volumes per node (64 total)
64 SIO procs, each with 8 threads
1 SIO proc per volume, 8 threads per file
1 file per volume; files are 8192MB each
Using 100% random 4KiB I/Os, each thread randomly writes through each file. Each measurement lasts
for 300 seconds. Performance measurements are designed to force I/O to disk.
Reference Platform
ONTAP Select 9.2 (Standard) hardware (per node; 4-node AF VSAN cluster):
• Dell R630
- Intel(R) Xeon(R) CPU E5-2660 v4 @ 2.00GHz
- 2 sockets, 14 CPUs per socket
- 56 logical CPUs (HT enabled)
- 256GB RAM
- ESXi version: VMware ESXi 6.0.0 build-3620759
• VSAN datastore, drives per host:
- INTEL SSDSC2BX40 – 372GB for the cache tier
- 4 x INTEL SSDSC2BX01 – 1.46TB for the capacity tier
Client hardware:
- 1 NFSv3 - Debian Linux VM deployed on the same vSAN cluster.
- 80 GB workload distributed equally across 4 NFS volumes / mounts.
- No storage efficiency features in use.
- Separate 10Gb networks for NFS data traffic and vSAN internal traffic.
- 1500 MTU for NFS interfaces and 9000 MTU for the VSAN interface.
- Block size: 4K for the random workload and 64K for the sequential workload.
Results
Table 9 shows the throughput measured against the read/write workloads on a single-node Select Standard
cluster running on an All Flash VSAN datastore. Performance measurements were taken using the FIO
load-generating tool.
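As an illustration, a random read workload similar to the one described can be generated with fio as sketched below; the mount path, file size, and job count are placeholders rather than the exact test harness used:
fio --name=randread --directory=/mnt/select_nfs1 --ioengine=libaio --direct=1 --rw=randread --bs=4k --size=8g --numjobs=8 --time_based --runtime=300 --group_reporting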
Table 9) Performance results for a single-node ONTAP Select Standard cluster on an AF VSAN datastore.
[Chart accompanying Table 9: throughput per node (MB/s) for sequential read, sequential write, random read, and random write workloads on a single-node ONTAP Select Standard cluster on an AF VSAN datastore, compared with ONTAP Select 9.0 Standard with DAS (SAS).]
Version History
Version Date Document Version History
Version 1.0 June 15, 2016 Initial version
Version 1.1 August 15, 2016 Updated the networking sections 2.5 and 5
Version 1.2 December 22, 2016 Added support for ONTAP Select 9.1 and OVF evaluation
method.
Consolidated the Networking Section.
Consolidated the Deploy Section.
Version 1.3 March 20, 2017 Added support for ONTAP Deploy 2.3, external array and VSAN.
Added support for SATA and NL-SAS along with datastore size
considerations for larger capacity media.
Added IOPS metric to performance table.
Added network checker for Internal Network troubleshooting.
Version 1.41 June, 2017 Added support for ONTAP Deploy 2.4, ONTAP Select 9.2, and 2-
node clusters.
Added VSAN performance information.