
OceanStor Dorado 2000 6.1.6
Technical White Paper

Issue 01
Date 2023-08-01

HUAWEI TECHNOLOGIES CO., LTD.


Copyright © Huawei Technologies Co., Ltd. 2023. All rights reserved.
No part of this document may be reproduced or transmitted in any form or by any means without prior
written consent of Huawei Technologies Co., Ltd.

Trademarks and Permissions

and other Huawei trademarks are trademarks of Huawei Technologies Co., Ltd.
All other trademarks and trade names mentioned in this document are the property of their respective
holders.

Notice
The purchased products, services and features are stipulated by the contract made between Huawei and
the customer. All or part of the products, services and features described in this document may not be
within the purchase scope or the usage scope. Unless otherwise specified in the contract, all statements,
information, and recommendations in this document are provided "AS IS" without warranties, guarantees
or representations of any kind, either express or implied.

The information in this document is subject to change without notice. Every effort has been made in the
preparation of this document to ensure accuracy of the contents, but all statements, information, and
recommendations in this document do not constitute a warranty of any kind, express or implied.

Huawei Technologies Co., Ltd.


Address: Huawei Industrial Base
Bantian, Longgang
Shenzhen 518129
People's Republic of China

Website: https://www.huawei.com
Email: [email protected]


Contents

1 Executive Summary................................................................................................................. 1
2 Overview....................................................................................................................................2
2.1 Product Portfolio...................................................................................................................................................................... 2
2.2 Customer Benefits................................................................................................................................................................... 3

3 Hardware Architecture........................................................................................................... 5
3.1 Hardware Description............................................................................................................................................................ 5
3.1.1 Controller Enclosure............................................................................................................................................................ 5
3.1.2 Disk Enclosure....................................................................................................................................................................... 7
3.1.2.1 SAS Disk Enclosure........................................................................................................................................................... 7
3.1.3 HSSD Disk Unit..................................................................................................................................................................... 8
3.1.4 Huawei-developed Chips................................................................................................................................................ 10
3.1.5 Power Consumption and Heat Dissipation...............................................................................................................11

4 Software Architecture.......................................................................................................... 15
4.1 Symmetric Active-Active SAN Architecture.................................................................................................................. 15
4.1.1 Global Load Balancing..................................................................................................................................................... 15
4.2 Global Cache.......................................................................................................................................................................... 16
4.3 RAID 2.0+................................................................................................................................................................................. 17
4.4 FlashLink®................................................................................................................................................................................ 17
4.4.1 Intelligent Multi-Core Technology............................................................................................................................... 18
4.4.2 ROW Full-Stripe Write..................................................................................................................................................... 19
4.4.3 Multistreaming................................................................................................................................................................... 21
4.4.4 End-to-End I/O Priority....................................................................................................................................................24
4.4.5 Intelligent Analysis............................................................................................................................................................ 26
4.5 Rich Software Features....................................................................................................................................................... 27

5 Smart Series Features...........................................................................................................28


5.1 SmartDedupe and SmartCompression (Data Reduction)....................................................................................... 28
5.1.1 Compression........................................................................................................................................................................ 29
5.1.1.1 Data Compression..........................................................................................................................................................29
5.1.1.2 Data Compaction........................................................................................................................................................... 31
5.2 SmartQoS (Intelligent Quality of Service Control)................................................................................................... 31
5.2.1 Functions.............................................................................................................................................................................. 32
5.2.1.1 Upper Limit Control.......................................................................................................................................................32


5.2.1.2 Lower Limit Guarantee................................................................................................................................................ 34


5.2.2 Policy Management.......................................................................................................................................................... 35
5.2.2.1 Hierarchical Management...........................................................................................................................................35
5.2.2.2 Objective Distribution................................................................................................................................................... 37
5.2.2.3 Recommended Configuration.................................................................................................................................... 37
5.3 SmartMigration (Intelligent Data Migration)............................................................................................................. 38
5.4 SmartThin (Intelligent Thin Provisioning).................................................................................................................... 39
5.5 SmartErase (Data Destruction)........................................................................................................................................ 40
5.6 SmartMulti-Tenant (Multi-tenancy)............................................................................................................................... 41

6 Hyper Series Features...........................................................................................................43


6.1 HyperSnap (Snapshot)........................................................................................................................................................ 43
6.1.1 HyperSnap for SAN (Snapshot for SAN)................................................................................................................... 43
6.1.1.1 Basic Principles................................................................................................................................................................ 43
6.1.1.2 Cascading Snapshot...................................................................................................................................................... 46
6.1.1.3 Snapshot Consistency Group......................................................................................................................................46
6.2 HyperCDP (Continuous Data Protection)..................................................................................................................... 47
6.3 HyperClone (Clone)............................................................................................................................................................. 50
6.3.1 HyperClone for SAN (Clone for SAN).........................................................................................................................50
6.3.1.1 Data Synchronization................................................................................................................................................... 50
6.3.1.2 Reverse Synchronization.............................................................................................................................................. 51
6.3.1.3 Immediately Available Clone LUNs..........................................................................................................................52
6.3.1.4 HyperClone Consistency Group................................................................................................................................. 53
6.3.1.5 Cascading Clone Pairs...................................................................................................................................................53
6.4 HyperReplication (Remote Replication)........................................................................................................................54
6.4.1 HyperReplication for SAN (Remote Replication for SAN)...................................................................................54
6.4.1.1 HyperReplication/S (Synchronous Remote Replication).................................................................................. 55
6.4.1.2 HyperReplication/A (Asynchronous Remote Replication)............................................................................... 55
6.4.1.3 Technical Highlights...................................................................................................................................................... 56
6.5 HyperMetro (Active-Active Deployment)..................................................................................................................... 57
6.5.1 HyperMetro for SAN......................................................................................................................................................... 57
6.5.1.1 Read and Write Processes........................................................................................................................................... 57
6.5.1.2 HyperMetro Consistency Group................................................................................................................................ 59
6.5.2 HyperMetro Technical Features.................................................................................................................................... 59
6.5.2.1 Gateway-free Active-Active Solution.......................................................................................................................59
6.5.2.2 Parallel Access................................................................................................................................................................. 60
6.5.2.3 Reliable Arbitration....................................................................................................................................................... 60
6.5.2.4 Strong Scalability........................................................................................................................................................... 61
6.5.2.5 High Performance.......................................................................................................................................................... 61
6.6 Geo-Redundancy (Multi-DC)............................................................................................................................................ 63
6.6.1 3DC (Geo-Redundancy).................................................................................................................................................. 63
6.6.2 4DC (Geo-Redundancy).................................................................................................................................................. 65
6.7 HyperEncryption (Array Encryption).............................................................................................................................. 67


7 System-level Reliability Design..........................................................................................71


7.1 Data Reliability...................................................................................................................................................................... 71
7.1.1 Cache Data Reliability...................................................................................................................................................... 72
7.1.1.1 Multiple Cache Copies.................................................................................................................................................. 72
7.1.1.2 Power Failure Protection............................................................................................................................................. 73
7.1.2 Persistent Data Reliability.............................................................................................................................................. 73
7.1.2.1 Intra-disk RAID................................................................................................................................................................ 73
7.1.2.2 RAID 2.0+.......................................................................................................................................................................... 74
7.1.2.2.1 Disk-level redundancy RAID.................................................................................................................................... 74
7.1.2.3 Dynamic Reconstruction.............................................................................................................................................. 74
7.1.3 Data Reliability on I/O Paths......................................................................................................................................... 75
7.1.3.1 End-to-end PI.................................................................................................................................................................. 75
7.1.3.2 Matrix Verification......................................................................................................................................................... 76
7.2 Service Availability................................................................................................................................................................ 77
7.2.1 Interface Module and Link Redundancy....................................................................................................................77
7.2.2 Controller Redundancy.................................................................................................................................................... 78
7.2.3 Storage Media Redundancy........................................................................................................................................... 78
7.2.3.1 Fast Isolation of Disk Faults....................................................................................................................................... 78
7.2.3.2 Disk Redundancy............................................................................................................................................................ 78
7.2.4 Array-level Redundancy.................................................................................................................................................. 78

8 System Performance Design............................................................................................... 80


8.1 Front-end Network Optimization................................................................................................................................... 82
8.2 CPU Computing Optimization.......................................................................................................................................... 83
8.3 Back-end Network Optimization..................................................................................................................................... 83

9 System Serviceability Design.............................................................................................. 85


9.1 System Management........................................................................................................................................................... 85
9.1.1 DeviceManager.................................................................................................................................................................. 85
9.1.1.1 Storage Space Management...................................................................................................................................... 86
9.1.1.1.1 Flexible Storage Pool Management..................................................................................................................... 90
9.1.1.1.2 Mappings Between LUNs and Hosts....................................................................................................................91
9.1.1.1.3 Automatic Host Detection....................................................................................................................................... 91
9.1.1.1.4 Quick Configuration Wizard................................................................................................................................... 92
9.1.1.2 Data Protection Management................................................................................................................................... 93
9.1.1.2.1 Data Protection Based on Protection Groups................................................................................................... 96
9.1.1.2.2 Flexible Use of LUN Groups and Protection Groups...................................................................................... 97
9.1.1.2.3 Capacity Expansion of LUN Groups or Protection Groups........................................................................... 98
9.1.1.2.4 Configuration on One Device for Cross-Device Data Protection............................................................... 98
9.1.1.3 Configuration Task...................................................................................................................................................... 100
9.1.1.4 Fault Management......................................................................................................................................................101
9.1.1.4.1 Monitoring Status of Hardware Devices.......................................................................................................... 101
9.1.1.4.2 Alarm and Event Monitoring............................................................................................................................... 101
9.1.1.5 Performance and Capacity Management........................................................................................................... 102


9.1.1.5.1 Built-In Performance Data Collection and Analysis Capabilities............................................................. 103


9.1.1.5.2 Independent Data Storage Space....................................................................................................................... 103
9.1.1.5.3 Performance Threshold Alarm............................................................................................................................. 104
9.1.1.5.4 Scheduled Report..................................................................................................................................................... 104
9.1.2 CLI......................................................................................................................................................................................... 105
9.1.3 RESTful APIs...................................................................................................................................................................... 105
9.1.4 SNMP...................................................................................................................................................................................105
9.1.5 SMI-S................................................................................................................................................................................... 105
9.1.6 Tools.....................................................................................................................................................................................105
9.2 Non-Disruptive Upgrade (NDU)................................................................................................................................... 105
9.3 Device Lifecycle Management....................................................................................................................................... 107
9.3.1 Replacing a Disk Enclosure.......................................................................................................................................... 107

10 System Security Design................................................................................................... 108


10.1 Software Integrity Protection....................................................................................................................................... 108
10.2 Secure Boot........................................................................................................................................................................ 109

11 Intelligent Storage Design..............................................................................................110


11.1 Intelligent Cloud Management................................................................................................................................... 110
11.1.1 Scope of Information to Be Collected................................................................................................................... 111
11.1.2 Intelligent Fault Reporting........................................................................................................................................ 112
11.1.3 Capacity Prediction...................................................................................................................................................... 112
11.1.4 Disk Health Prediction................................................................................................................................................ 114
11.1.5 Device Health Evaluation...........................................................................................................................................115
11.1.6 Performance Fluctuation Analysis.......................................................................................................................... 116
11.1.7 Performance Exception Detection...........................................................................................................................117
11.1.8 Performance Bottleneck Analysis............................................................................................................................ 118

12 Ecosystem Compatibility................................................................................................. 120


12.1 Data Plane Ecosystem Compatibility........................................................................................................................ 120
12.1.1 Host Operating System.............................................................................................................................................. 120
12.1.2 Host Virtualization System........................................................................................................................................ 120
12.1.3 Host Cluster Software................................................................................................................................................. 121
12.1.4 Database Software....................................................................................................................................................... 121
12.1.5 Storage Gateway...........................................................................................................................................................121
12.1.6 Storage Network........................................................................................................................................................... 121
12.2 Management and Control Plane Ecosystem Compatibility............................................................................... 121
12.2.1 Backup Software........................................................................................................................................................... 121
12.2.2 Network Management Software............................................................................................................................ 121
12.2.3 OpenStack Integration................................................................................................................................................ 122
12.2.4 Container Platform Integration............................................................................................................................... 122

13 More Information............................................................................................................. 123


14 Feedback............................................................................................................................. 124


15 Acronyms and Abbreviations......................................................................................... 125


1 Executive Summary

The Huawei OceanStor Dorado 2000 all-flash storage system is a new-generation, entry-level all-flash product with a rich set of enterprise-class value-added features. Leveraging a cloud-oriented storage operating system, a powerful new hardware platform, and intelligent management software, it delivers industry-leading functionality, performance, efficiency, reliability, and ease of use. It meets the data storage requirements of applications such as virtualization, database online transaction processing (OLTP), and online analytical processing (OLAP) for governments, operators, and small- and medium-sized enterprises (SMEs), ensuring service continuity and data security while providing cost-effective storage services.

This document describes the product positioning, hardware and software architecture, and features of OceanStor Dorado 2000, and highlights its unique advantages.


2 Overview

This chapter introduces the product portfolio of the OceanStor Dorado 2000 all-flash storage system and the unique benefits it brings to customers.
2.1 Product Portfolio
2.2 Customer Benefits

2.1 Product Portfolio


Huawei OceanStor Dorado 2000 all-flash storage system

Figure 2-1 Appearance of the OceanStor Dorado 2000

For details about the product models and specifications, see the product
specifications list.


2.2 Customer Benefits


In the past few years, the explosive growth of data and the drive to extract value from it have spurred innovation in IT systems, and in storage devices in particular.

● The primary storage medium has shifted from hard disk drives (HDDs) to solid-state drives (SSDs).
● The end-to-end access latency of storage systems has dropped from around 10 ms to 1 ms or less.
● Requirements on data storage are evolving from traditional high reliability and performance to intelligent high reliability and performance.

Business growth and technical innovation place higher demands on the design and construction of customers' IT infrastructure, and choosing a proper storage system is a crucial part of building a modern IT infrastructure. Stable storage performance and high reliability are the basis for building an intelligent and scalable IT system. Efficient storage is a key factor in reducing IT system costs, while efficient data flow, intelligent O&M, non-disruptive upgrades, and long-term supply assurance are crucial for long-term IT system evolution and development.

Guided by the concept of "Data + Intelligence", OceanStor Dorado 2000 redefines the storage architecture. With core software and hardware technologies, OceanStor Dorado implements intelligent, native all-flash storage that provides stable, high-performance, and highly reliable services and meets the storage requirements of intelligent IT systems.

● Using core technologies to achieve efficient and high-performance storage
– End-to-end performance is stable and high, with low latency and large throughput.
– The storage system supports 32 Gbit/s Fibre Channel and 25 Gbit/s
RDMA for front-end interconnection, cluster switching, and back-end
interconnection, and can offload specific tasks to hardware to free CPU
resources, ensuring low latency and high bandwidth.
– The core processing unit of the storage system incorporates multiple processors and cores. With key software designs such as balanced distribution, a lock-free design, and high scalability, a storage system can scale to as many as 1,000 cores to ensure efficient service processing.
– FlashLink® enables close collaboration between controllers and SSDs. The
use of multistreaming, full-stripe write (RAID 2.0+), garbage collection,
and read/write priority control effectively reduces write amplification on
SSDs, making the most of the SSD performance throughout the lifecycle.
– The chip learns and analyzes the workload to identify long-interval
sequential flows and data associations, greatly improving the data
prefetch capability of the cache and achieving ultra-low latency.
● Innovative SmartMatrix architecture providing highly reliable and stable
storage services


– The storage systems employ a new-generation hardware platform and an ultra-stable SmartMatrix full-mesh architecture to enhance data reliability and service continuity, ensuring always-on storage services.
– In terms of data reliability, the storage systems provide end-to-end data
redundancy protection, validity check, and recovery mechanisms. Data
protection measures at various levels can be used on demand, such as
multiple cache copies, data protection across controller enclosures, RAID,
data integrity protection, snapshots, remote replication, and active-active
DC solution. The system checks data integrity and rectifies any error in
the end-to-end data read and write processes to prevent unexpected
recoverable faults (such as bit reversal) in hardware. This effectively
prevents data error spreading when devices are in an uncontrollable and
intermittent sub-health state, ensuring data security.
– In terms of service continuity, the system accurately monitors the device
health status to quickly identify faults. If a fault is detected, the system
isolates and attempts to rectify the fault through redundancy takeover. If
the fault is rectified, the involved component continues providing services.
If the fault fails to be rectified, an alarm is reported to prompt users to
replace the faulty component. These data reliability and service
continuity assurance techniques enable OceanStor Dorado storage
systems to tolerate failure of controllers without service interruption, and
allow non-disruptive software upgrades.
● Intelligent architecture realizing intelligent storage services
As enterprise business grows, a single storage system will carry multiple service systems with varied requirements on performance, capacity, data protection, and cost. This challenges storage reliability, performance, and capacity, as well as customers' overall IT planning and management. OceanStor Dorado 2000 balances the
capacity and performance for hybrid application models in which various
services and workloads share storage resources. The system provides a default
intelligent configuration management mode and supports user-defined
configuration management modes to configure multiple devices on a single
page based on the service logic. Customers can choose from the configuration
modes as required. The system also supports report management to help
customers grasp the health status, performance, and capacity of devices in
real time and predict future trends. The reports can be used as references for
service planning and adjustment. In adaptive mode, the system uses the
optimal configuration. In the future, the system will continuously learn the
workloads and make adjustments accordingly to optimize performance and
handle periodic or burst service peaks. For sub-healthy or worn parts, the
system adjusts their parameters to guarantee stability and prolong the service
life of the devices after long-term operation.


3 Hardware Architecture

The OceanStor Dorado 2000 all-flash storage system uses the SmartMatrix dual-controller redundancy architecture. The two controllers are connected by Huawei-developed 4 x 10GE RDMA high-speed buses that serve as cache mirroring channels. Disks in a controller enclosure or disk enclosure are connected to both controllers through dual ports. Backup battery units (BBUs) supply power in the event of a power failure so that cache data can be flushed to coffer disks and protected.
3.1 Hardware Description

3.1 Hardware Description

3.1.1 Controller Enclosure


The Huawei OceanStor Dorado 2000 controller enclosure is a 2 U, dual-controller, 25-slot enclosure with integrated disks. It adopts a symmetric active-active architecture and uses field replaceable units (FRUs) for key modules, supporting redundancy and online replacement of faulty modules. Each controller supports two pluggable interface modules, two onboard GE ports, and one 10GE port. Details are as follows:

● Front end:
Front-end interface modules include 4-port 8 Gbit/s, 16 Gbit/s, and 32 Gbit/s
Fibre Channel interface modules, 4-port 10GE and 25GE interface modules, as
well as 4-port GE (electrical) interface modules.
● Scale-out:
Scale-out interface modules are 4-port 25 Gbit/s RDMA interface modules.
● Back-end:
Back-end interface modules include 4-port 12 Gbit/s SAS interface modules
(that connect to SAS disk enclosures).

The two controllers in a controller enclosure of OceanStor Dorado 2000 are interconnected through RDMA mirror channels, and two controller enclosures can be directly connected through the scale-out interface modules. The SAS product model provides onboard SAS back-end ports to connect to SAS disk enclosures.


Figure 3-1 Device hardware

Figure 3-2 Front view

Figure 3-3 Rear view

OceanStor Dorado 2000 uses a symmetric active-active architecture and a 2 U controller enclosure that integrates disks and controllers. The two controllers work in load balancing mode in normal situations and take over services for each other in fault scenarios. The two controllers are interconnected by RDMA mirror channels. Figure 3-4 shows the logical architecture. Figure 3-5 shows the controller appearance.


Figure 3-4 Logical architecture

Figure 3-5 Controller appearance

3.1.2 Disk Enclosure


OceanStor Dorado 2000 supports SAS 3.0 standard interfaces and can be cascaded
with SAS disk enclosures.

3.1.2.1 SAS Disk Enclosure


The SAS disk enclosure uses the SAS 3.0 protocol and supports 25 SAS SSDs. A
controller enclosure connects to a SAS disk enclosure through the onboard SAS
ports or SAS interface modules.

Figure 3-6 Front view of a 2 U 25-slot SAS disk enclosure


Figure 3-7 Rear view of a 2 U 25-slot SAS disk enclosure

3.1.3 HSSD Disk Unit


OceanStor Dorado 2000 uses Huawei-developed SSDs (HSSDs) to maximize
system performance. HSSDs work perfectly with storage software to provide an
optimal experience across various service scenarios. An SSD consists of a control
unit and a storage unit. The control unit contains an SSD controller, host interface,
and dynamic random access memory (DRAM) module. The storage unit mainly
contains NAND flash chips.
Blocks and pages are the basic units for reading and writing data in the NAND
flash.
● A block is the smallest erasure unit and generally consists of multiple pages.
● A page is the smallest programming and read unit. Its size is usually 16 KB.
Operations on NAND flash include erase, program, and read. Program and read operations work on pages, while erase operations work on whole blocks. A page can be programmed only after the block containing it has been erased, so before erasing a block the system must migrate any valid data in it to new storage space. This process is called garbage collection (GC). Each program and erase of a block counts as one program/erase (P/E) cycle, and NAND flash tolerates only a limited number of P/E cycles. If a block experiences more P/E cycles than others, it wears out more quickly. To ensure reliability and performance, HSSDs leverage the following technologies.

Wear Leveling
The SSD controller uses software algorithms to monitor and balance the P/E cycles
on blocks in the NAND flash. This prevents over-used blocks from failing and
extends the service life of the NAND flash.
HSSDs support both dynamic and static wear leveling. Dynamic wear leveling
enables the SSD to write data preferentially to less-worn blocks to balance P/E
cycles. Static wear leveling allows the SSD to periodically detect blocks with fewer
P/E cycles and reclaim their data, ensuring that blocks storing cold data can
participate in wear leveling. HSSDs combine the two solutions to ensure wear
leveling.
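A minimal sketch of the two wear-leveling policies described above is shown below, assuming each block is tracked only by its P/E count and free state; the threshold and data model are invented for illustration:

# Wear-leveling sketch; thresholds and block bookkeeping are illustrative only.
def pick_block_for_write(blocks):
    """Dynamic wear leveling: route new writes to the least-worn free block."""
    free = [b for b in blocks if b["free"]]
    return min(free, key=lambda b: b["pe_cycles"])

def static_wear_leveling_candidates(blocks, gap=100):
    """Static wear leveling: find lightly worn blocks holding cold data so that
    their contents can be relocated and the blocks rejoin the free pool."""
    avg = sum(b["pe_cycles"] for b in blocks) / len(blocks)
    return [b for b in blocks if not b["free"] and avg - b["pe_cycles"] > gap]

blocks = [{"id": 0, "pe_cycles": 900, "free": False},
          {"id": 1, "pe_cycles": 850, "free": True},
          {"id": 2, "pe_cycles": 300, "free": True},
          {"id": 3, "pe_cycles": 120, "free": False}]   # cold data, rarely erased
print(pick_block_for_write(blocks)["id"])               # 2: least-worn free block
print([b["id"] for b in static_wear_leveling_candidates(blocks)])  # [3]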

Bad Block Management


Unqualified blocks may occur when the NAND flash is manufactured or used,
which are labeled as bad blocks. HSSDs identify bad blocks according to the P/E cycles, error type, and error frequency of the NAND flash. If a bad block exists, the
SSD recovers the data on the bad block by using the Exclusive-OR (XOR) parity
data between the NAND flash memories, and saves it to a new block. HSSDs have
reserved space to replace bad blocks, ensuring sufficient available capacity and
user data security.

Data Redundancy Protection


HSSDs use multiple redundancy check methods to protect user data against bit flips, corruption, and loss. Error correction code (ECC) and cyclic redundancy check (CRC) protect data in the SSD's DRAM; low-density parity check (LDPC) and CRC protect data on NAND flash pages; and XOR redundancy between NAND flash memories prevents data loss caused by flash failures.

Figure 3-8 Data redundancy protection

LDPC uses linear codes defined by the check matrix to check and correct errors.
When data is written to pages on the NAND flash, the system calculates the LDPC
verification information and writes it to the pages with the user data. When data
is read from the pages, LDPC verifies and corrects the data.

HSSDs use a built-in XOR redundancy mechanism to implement redundancy protection between flash chips. If a flash chip becomes faulty (page failure, block failure, die failure, or full chip failure), parity data is used to recover the data on the faulty blocks, preventing data loss.
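The short Python example below illustrates the principle of XOR parity recovery across flash chips; the stripe layout and chip count are assumptions made for illustration and do not reflect the actual on-SSD RAID format:

# XOR parity across chips: any single lost chunk can be rebuilt from the rest.
from functools import reduce

def xor_parity(chunks):
    """Byte-wise XOR of equal-length chunks stored on different flash chips."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*chunks))

data = [b"\x11\x22", b"\x33\x44", b"\x55\x66"]   # user data on three chips
parity = xor_parity(data)                        # parity stored on a fourth chip

# Chip 1 fails: XOR of the surviving chunks and the parity rebuilds its data.
rebuilt = xor_parity([data[0], data[2], parity])
assert rebuilt == data[1]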

Background Inspection

After data has been stored in NAND flash for a long time, data errors may occur due to read interference, write interference, or random failures. HSSDs periodically read data from the NAND flash, check for bit changes, and rewrite data with bit changes to new pages. This detects and handles risks in advance, effectively preventing data loss and improving data security and reliability.


SAS SSD Module


A SAS SSD module consists of a handle and a 2.5-inch SSD, as shown in Figure
3-9.

Figure 3-9 SAS SSD appearance

3.1.4 Huawei-developed Chips


Through long-term investment and accumulation in chip development, Huawei has developed key chips for its storage systems, such as the front-end interface chip (Hi1822), the Kunpeng 920 processor, the SSD controller chip, and the baseboard management controller (BMC) chip (Hi1710). All of them are integrated into OceanStor Dorado 2000.


● Interface module chip


Hi182x (IOC) is a Huawei-developed storage interface chip. It integrates
multiple protocol interfaces, such as Ethernet interfaces of 100 Gbit/s, 40
Gbit/s, 25 Gbit/s, and 10 Gbit/s, and Fibre Channel interfaces of 32 Gbit/s, 16
Gbit/s, and 8 Gbit/s. Its high interface density, rich protocol types, and flexible
ports create unique values for storage.
● Kunpeng 920 chip
The Kunpeng 920 chip is independently developed by Huawei. It features high
performance, throughput, and energy efficiency to meet diversified computing
requirements of data centers. It can be widely used in big data and distributed
storage applications. The Kunpeng 920 chipset supports various protocols
such as DDR4, PCIe 4.0, SAS 3.0, and 100 Gbit/s RDMA to meet the
requirements of a wide range of scenarios.
● SSD controller chip
Huawei-developed SSDs use the latest-generation enterprise-class controller
chip, which supports SAS 3.0 and PCIe 4.0 ports. The controller chip features
high performance and low power consumption. The controller uses enhanced
ECC and built-in RAID technologies to extend the SSD service life to meet
enterprise-level reliability requirements. In addition, this chip supports the
latest DDR4, 12 Gbit/s SAS, and 8 Gbit/s PCIe rates as well as Flash
Translation Layer (FTL) hardware acceleration to provide stable performance
at a low latency for enterprise applications.
● BMC chip
Hi1710 is a BMC chip consisting of the A9 CPU, 8051 co-processor, sensor
circuits, control circuits, and interface circuits. It supports the Intelligent
Platform Management Interface (IPMI), which monitors and controls the
hardware components of the storage system, including system power control,
controller monitoring, interface module monitoring, power supply and BBU
management, and fan monitoring.

3.1.5 Power Consumption and Heat Dissipation


OceanStor Dorado 2000 uses the following designs to meet the requirements for
energy conservation and environment protection:

● Processor with a high energy efficiency ratio


● Power module with the industry's highest conversion efficiency


● Proportional-integral-derivative (PID) fan speed adjustment algorithm for high heat dissipation efficiency
● Staggered power-on design, avoiding peak load on the power supply

The energy-efficient design reduces power supply and heat dissipation costs. The
industry-leading heat dissipation technology and customized NVMe SSDs increase
the SSD density in an enclosure by 44% (a 2 U enclosure can house up to 36
NVMe SSDs).

Processor with High Energy Efficiency Ratio


The differences between the Arm and x86 platforms are mainly in the internal
design of the chip, including the instruction set, pipeline, core distribution, cache,
memory, and I/O control. x86 uses the complex instruction set computer (CISC) to
gain higher performance by increasing the processor complexity. The x86
instruction set has developed from multimedia extensions (MMX) to Streaming
SIMD Extensions (SSE) and advanced vector extensions (AVX). Arm uses the
reduced instruction set computer (RISC), which greatly simplifies the architecture
and retains only the necessary instructions. This simplifies the processor and
achieves a higher energy efficiency in a smaller size. Arm supports 64-bit
operations, 32-bit instructions, 64-bit registers, and 64-bit addressing capability.
The in-depth collaboration between hardware and software improves performance
and the multi-processor architecture supports performance scalability, providing
great advantages in energy efficiency.

Huawei uses the highly integrated SoC chip. The cache coherence bus allows
symmetrical multiprocessing (SMP) of up to four processors. Each processor
provides various types of I/O ports, storage controllers, and storage acceleration
engines.

● I/O ports and storage controllers: Up to 16 x SAS 3.0 ports, 40 x PCIe 4.0
ports, and 2 x 100GE ports
● Storage acceleration engines: RAID engine, SEC engine, and ZIP compression
engine
The processor integrates these ports and engines to reduce the number of
peripheral chips, simplifying the system design and reducing power
consumption.

Efficient Power Module


OceanStor Dorado 2000 uses 80 Plus Platinum and Titanium power modules,
which provide up to 94% power conversion efficiency and 98% power factor when
the power load is 50%, reducing power loss. The Titanium power module can even
reach 96% conversion efficiency when the load is 50% and 90% conversion
efficiency when the load is light, minimizing power loss. The power modules have
passed the 80 Plus certification (the certificate can be provided).

80 Plus efficiency requirements (power conversion efficiency at 230 V input)

80 Plus Type        Load 10%    Load 20%    Load 50%    Load 100%
80 Plus Bronze      ---         81%         85%         81%
80 Plus Silver      ---         85%         89%         85%
80 Plus Gold        ---         88%         92%         88%
80 Plus Platinum    ---         90%         94%         91%
80 Plus Titanium    90%         94%         96%         91%

High-Voltage DC Power Input


OceanStor Dorado 2000 supports high-voltage direct current (HVDC) or AC/DC
hybrid power input for better power supply reliability. This also reduces the UPS
footprint and the equipment room construction and maintenance costs. The HVDC
power supply cuts down the intermediate processes of power conversion,
improving the power efficiency by 15% to 25%. This significantly saves electricity
fees for large data centers every year. In comparison, when low-voltage DC (12
V/48 V) is supplied to high-power devices, thick cables must be used to increase
the current, which causes trouble in cable layout. This problem is solved when
HVDC is used.

PID Fan Speed Adjustment Algorithm


OceanStor Dorado 2000 uses the proportional-integral-derivative (PID) algorithm to adjust the fan speed, which addresses problems such as slow fan speed response, high fan power consumption, large speed fluctuations, and loud noise. The PID algorithm allows the system to adjust the fan speed quickly, save energy, and reduce noise.

● The PID algorithm increases energy efficiency by 4% to 9% and prevents fan speed fluctuation.
● The PID algorithm increases the fan response speed by 22% to 53% and significantly reduces noise.
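For illustration, a textbook discrete PID loop for fan speed control is sketched below; the gains, setpoint, and temperature readings are invented values, not Huawei's actual tuning:

# Textbook PID fan-speed loop; all constants here are illustrative only.
class FanPID:
    def __init__(self, kp, ki, kd, setpoint):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.setpoint = setpoint       # target sensor temperature (degC)
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, measured_temp, dt=1.0):
        error = measured_temp - self.setpoint          # positive when too hot
        self.integral += error * dt                    # accumulated error (I term)
        derivative = (error - self.prev_error) / dt    # rate of change (D term)
        self.prev_error = error
        # Returned value is a fan-speed adjustment, e.g. in percent of full speed.
        return self.kp * error + self.ki * self.integral + self.kd * derivative

pid = FanPID(kp=2.0, ki=0.1, kd=0.5, setpoint=45.0)
for temp in (44.0, 47.0, 50.0, 48.0):                  # simulated sensor readings
    print(round(pid.update(temp), 2))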

Staggered Power-On
Staggered power-on prevents the electrical surge that would occur when multiple
devices are powered on simultaneously, eliminating the risk on the power supply
of the equipment room.

Deduplication and Compression


Deduplication and compression are commonly used data reduction techniques. The system compares incoming data blocks and stores only one copy of duplicate blocks, saving storage space. Because less storage capacity is required, system power consumption is reduced.
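As a simple illustration of the principle, the sketch below deduplicates incoming blocks by fingerprint; the 8 KB block size and SHA-256 fingerprint are assumptions made for this example, not the product's actual deduplication design:

# Fingerprint-based deduplication sketch; granularity and hash are assumed.
import hashlib

unique_blocks = {}   # fingerprint -> stored data (one physical copy per content)
lun_mapping = {}     # (lun, lba) -> fingerprint of the block stored there

def write_block(lun, lba, data):
    fp = hashlib.sha256(data).hexdigest()
    if fp not in unique_blocks:
        unique_blocks[fp] = data       # new content consumes space
    lun_mapping[(lun, lba)] = fp       # duplicates add only a mapping entry

write_block("LUN1", 0, b"A" * 8192)
write_block("LUN2", 7, b"A" * 8192)    # identical content is not stored again
print(len(unique_blocks))              # 1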


Energy Conservation Certification


The product has passed the RoHS energy efficiency certification.


4 Software Architecture

OceanStor Dorado 2000 uses a symmetric active-active software architecture to implement load balancing and FlashLink® to optimize the system for SSD characteristics, fully utilizing all-flash performance. OceanStor Dorado 2000 shares the unified software architecture of the OceanStor Dorado series, allowing interconnection with other models in the series.
4.1 Symmetric Active-Active SAN Architecture
4.2 Global Cache
4.3 RAID 2.0+
4.4 FlashLink®
4.5 Rich Software Features

4.1 Symmetric Active-Active SAN Architecture


In an asymmetrical logical unit access (ALUA) architecture, each LUN is owned by a specific controller, and customers must plan the owning controllers of LUNs to balance load. In practice, however, ALUA architectures struggle to balance load on live networks because service pressure varies from LUN to LUN and, for a given LUN, over time. OceanStor Dorado 2000 uses a symmetric active-active software architecture. Load balancing algorithms balance the read and write requests received by each controller. With the global cache, LUNs have no ownership, and each controller processes the read and write requests it receives (in an ALUA architecture, read and write requests to a LUN must be processed by the LUN's owning controller), achieving load balancing among controllers. RAID 2.0+ evenly distributes data to all disks in a storage pool, balancing disk loads.

4.1.1 Global Load Balancing


OceanStor Dorado 2000 hashes the logical block address (LBA) of each host read/write request to determine the controller that processes the request. Huawei UltraPath multipathing software, the FIMs, and the controllers negotiate the same hash method and parameters to distribute read and write requests intelligently. UltraPath and the FIMs work together to deliver a read/write request directly to the optimal processing controller, avoiding forwarding between controllers. If UltraPath and FIMs are not used, a controller that receives a host request forwards it to the appropriate processing controller based on the hash of the request's LBA, so host requests remain balanced between controllers.

UltraPath sends I/O requests to an interface module on the optimal controller, and the interface module delivers the requests directly to that controller for processing, so I/O requests do not need to be forwarded between controllers. If UltraPath is unavailable, a controller forwards received I/O requests to the corresponding controller based on the hash of the I/O LBA, still achieving load balancing among controllers but at the cost of inter-controller forwarding. You are therefore advised to install UltraPath.
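A minimal sketch of the idea is shown below: the host side and the array compute the same hash over the LUN and LBA so that a request lands directly on the controller that will process it. The slice size, hash function, and two-controller layout are assumptions made for this example, not the negotiated parameters used by the product:

# Distributing I/O by hashing the request's LBA; all parameters are assumed.
import zlib

SLICE_SECTORS = 64 * 2048            # e.g. 64 MB slices in 512-byte sectors
CONTROLLERS = ["Controller A", "Controller B"]

def owning_controller(lun_id, lba):
    slice_index = lba // SLICE_SECTORS
    key = f"{lun_id}:{slice_index}".encode()
    return CONTROLLERS[zlib.crc32(key) % len(CONTROLLERS)]

# The host-side multipathing software and the array compute the same function,
# so the request can be sent over a path to the right controller directly.
print(owning_controller(lun_id=3, lba=1_000_000))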

4.2 Global Cache


On OceanStor Dorado 2000, the caches of all controllers constitute a global cache.
Data on each LUN is evenly distributed to the caches of all controllers and
processed by them. In this way, LUNs do not have any ownership.

Figure 4-1 Read and write processes


1. The multipathing software distributes a host read/write request to a front-end I/O module. If UltraPath is used, UltraPath determines the controller that processes the read/write request based on the load balancing algorithm and preferentially selects the front-end link to the corresponding I/O module.
2. The front-end I/O module sends the received read/write request to the
corresponding controller.
3. For a write request, the global cache stores the data on the local node and
mirrors the data to the mirror node. For a read request, the system checks
whether data is found in the cache. If yes, the system performs step 4; if no,
the system performs step 5.
4. The storage system returns a write acknowledgement or read data to the
host.
5. The system reads data from the storage pool, and returns it to the cache and
then to the host.
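As an illustration of the steps above, the following Python sketch models the write path (cache locally, mirror to the peer, then acknowledge) and the read path (cache hit versus read from the storage pool); all class and function names are hypothetical.

```python
# Illustrative model of the global-cache read/write flow described above (not product code).

class ControllerCache:
    def __init__(self):
        self.local = {}     # data cached on this controller
        self.mirror = {}    # copy of the peer controller's cached data

class StorageSystem:
    def __init__(self):
        self.controllers = [ControllerCache(), ControllerCache()]
        self.pool = {}      # backing storage pool (SSDs)

    def write(self, ctrl_id: int, lba: int, data: bytes) -> str:
        """Write path: cache locally, mirror to the peer, then acknowledge the host."""
        local = self.controllers[ctrl_id]
        peer = self.controllers[(ctrl_id + 1) % len(self.controllers)]
        local.local[lba] = data
        peer.mirror[lba] = data           # mirrored copy protects against a controller failure
        return "write acknowledged"

    def read(self, ctrl_id: int, lba: int) -> bytes:
        """Read path: return cached data on a hit, otherwise read from the pool."""
        cache = self.controllers[ctrl_id]
        if lba in cache.local:
            return cache.local[lba]       # cache hit
        data = self.pool.get(lba, b"")    # cache miss: read from the storage pool
        cache.local[lba] = data           # populate the cache before returning to the host
        return data

system = StorageSystem()
system.pool[100] = b"old-data"
print(system.write(0, 200, b"new-data"))
print(system.read(1, 100))
```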

4.3 RAID 2.0+


If data is not evenly stored on SSDs, some heavily loaded SSDs may become the system bottleneck. OceanStor Dorado 2000 uses RAID 2.0+ to divide SSDs at a fine granularity and evenly distribute the data of each LUN across all SSDs, balancing load among disks. The storage system implements RAID 2.0+ as follows (see the sketch after Figure 4-2):
● Multiple SSDs form a storage pool.
● Each SSD is divided into fixed-size chunks (typically 4 MB per chunk) to facilitate logical space management.
● Chunks from different SSDs constitute a chunk group based on the customer-configured RAID policy.
● A chunk group is further divided into grains (typically 8 KB per grain), which are the smallest allocation units for volumes.

Figure 4-2 RAID 2.0+ mapping
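The following Python sketch illustrates the chunk, chunk group, and grain layout listed above; the sizes follow the text, while the data structures and helper names are simplified assumptions.

```python
# Minimal sketch of the RAID 2.0+ space layout described above (illustrative structures only).

CHUNK_SIZE = 4 * 1024 * 1024    # each SSD is divided into fixed-size chunks (typically 4 MB)
GRAIN_SIZE = 8 * 1024           # a chunk group is divided into grains (typically 8 KB)

def build_chunk_group(ssd_ids, data_columns, parity_columns):
    """Pick one chunk from each of (data_columns + parity_columns) different SSDs."""
    width = data_columns + parity_columns
    assert len(ssd_ids) >= width, "a chunk group needs chunks from different SSDs"
    return [{"ssd": ssd, "chunk": 0} for ssd in ssd_ids[:width]]   # chunk 0 of each SSD, for brevity

def grains_in_chunk_group(data_columns):
    """Number of user-addressable grains in one chunk group (parity excluded)."""
    return data_columns * CHUNK_SIZE // GRAIN_SIZE

# Example: a RAID 6 (4+2) chunk group built from a pool of 8 SSDs.
ckg = build_chunk_group(ssd_ids=list(range(8)), data_columns=4, parity_columns=2)
print(len(ckg), "chunks in the chunk group")
print(grains_in_chunk_group(4), "grains of user space per chunk group")
```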

4.4 FlashLink®
FlashLink® associates storage controllers with SSDs by using a series of
technologies for flash media, ensuring performance of flash storage. FlashLink®
provides the following key technologies designed for flash media: intelligent multi-
core technology, large-block sequential write technology, multi-stream partitioning
technology, smart disk enclosure offload, end-to-end I/O priority, and end-to-end
NVMe. It ensures stable low latency and high IOPS of OceanStor Dorado 2000.

4.4.1 Intelligent Multi-Core Technology


A storage system needs to provide powerful compute capabilities to deliver high
throughput and low latency as well as support more value-added features, such as
data deduplication and compression. On OceanStor Dorado 2000, each controller has four high-performance CPUs, providing up to 192 cores, which is among the highest CPU and core counts per controller in the industry and offers powerful compute capability. The intelligent multi-core technology brings the CPUs' compute capabilities into full play, allowing performance to increase linearly with the number of CPUs. It consists of CPU grouping, service grouping, and lock-free design between cores.

CPU Grouping
In a multi-CPU architecture, that is, a non-uniform memory access (NUMA)
architecture, each CPU can access either a local or remote memory. Accessing the
local memory involves a lower delay and less overhead than accessing a remote
one. OceanStor Dorado 2000 considers each CPU in a controller and its local
memory as a CPU group. A CPU group processes received host read/write requests
from end to end if possible. This eliminates the overhead of communication across
CPUs and accessing the remote memory. CPU grouping enables each CPU group
to process different host read/write requests, allowing performance to increase
linearly with the number of CPUs.

Figure 4-3 CPU grouping

Service Grouping
CPU grouping enables different CPUs to process different read/write requests.
However, various service processes running in each CPU compete for CPU cores,
causing conflicts and hindering the linear increase of performance. OceanStor
Dorado 2000 classifies services into front-end interface service, global cache
service, back-end interface service, and storage pool service groups, and allocates
dedicated CPU cores to each service group. In this way, different service groups
run on different CPU cores. For example, if a CPU has 48 cores, each of the front-
end interface service, global cache service, back-end interface service, and storage
pool service uses 12 cores. The system dynamically adjusts the number of cores
allocated to each service group based on the service load. Service grouping
prevents CPU resource contention and conflicts between service groups.
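A simplified Python sketch of the core partitioning described above follows; the group names come from the text, while the proportional rebalancing rule is a hypothetical illustration of dynamic adjustment by load.

```python
# Illustrative sketch of service grouping: dedicated CPU cores per service group,
# with a simple proportional rebalance by load (the rebalance rule is an assumption).

TOTAL_CORES = 48
GROUPS = ["front-end", "global cache", "back-end", "storage pool"]

def allocate_cores(load_by_group):
    """Split the cores among service groups in proportion to their current load."""
    total_load = sum(load_by_group.values()) or 1
    allocation = {g: max(1, TOTAL_CORES * load_by_group.get(g, 0) // total_load) for g in GROUPS}
    # Give any cores lost to rounding to the most loaded group.
    busiest = max(GROUPS, key=lambda g: load_by_group.get(g, 0))
    allocation[busiest] += TOTAL_CORES - sum(allocation.values())
    return allocation

# Evenly loaded system: each of the four groups gets 12 of the 48 cores, as in the example above.
print(allocate_cores({"front-end": 25, "global cache": 25, "back-end": 25, "storage pool": 25}))
# Skewed load: the storage pool group receives more cores.
print(allocate_cores({"front-end": 10, "global cache": 20, "back-end": 10, "storage pool": 60}))
```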

Lock-free Design Between Cores


In a service group, each core uses an independent data organization structure to
process service logic. This prevents the CPU cores in a service group from accessing
the same data structure, and implements lock-free design between CPU cores. The
following figure shows lock-free design between CPU cores. In the example, CPU
cores 24 to 35 are allocated to the storage pool service group and run only the
storage pool service logic. In the storage pool service group, services are allocated
to different cores, which use independent data organizations to prevent lock
conflicts between the cores.

Figure 4-4 Lock-free design between CPU cores

The CPU grouping, service grouping, and lock-free technologies enable system
performance to increase linearly with the number of controllers, CPUs, and CPU
cores.

4.4.2 ROW Full-Stripe Write


Flash chips on SSDs can be erased only a limited number of times. In traditional RAID overwrite (write-in-place) mode, hot data on an SSD is continuously rewritten, and the flash chips that store it wear out quickly. OceanStor Dorado 2000 uses redirect-on-write (ROW) full-stripe write for both new data writes and rewrites of existing data. Each write is directed to newly allocated space, balancing the number of erase cycles across all flash chips. This avoids the RAID write penalty caused by reading, verifying, and modifying data in the traditional RAID write process, and greatly reduces the overhead on controller CPUs and the read/write load on SSDs during writes. Compared with traditional RAID write-in-place, ROW full-stripe write enables various RAID levels to achieve high performance.


Figure 4-5 ROW large-block sequential write

In the above figure, the system uses RAID 6 (4+2) and writes new data blocks 1, 2,
3, and 4 to modify existing data. In traditional overwrite mode, a storage system
must modify every chunk group where these blocks reside. For example, when
writing data block 3 to CKG2, the system must first read the original data block d
and the parity data P and Q. Then it calculates new parity data P' and Q', and
writes P', Q', and data block 3 to CKG2. In ROW full-stripe write, the system uses
the data blocks 1, 2, 3, and 4 to calculate P and Q and writes them to a new
chunk group. Then it modifies the logical block addressing (LBA) pointer to point
to the new chunk group. During this process there is no need to read any existing
data.
For traditional RAID, for example, RAID 6, when D0 is changed, the system must
first read D0, P, and Q, and then write new nD0, nP, and nQ. Therefore, both the
read and write amplifications are 3. Generally, the read and write amplification of
small random I/Os in traditional RAID (xD+yP) is y+1.
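The following worked illustration in Python reproduces the y+1 relationship for traditional RAID and the (x+y)/x full-stripe figure for ROW, matching the values in Tables 4-1 and 4-2; it is a reading aid, not product code.

```python
# Worked illustration of the write-amplification figures used in Tables 4-1 and 4-2.

def traditional_raid_random_small(parity_columns):
    """Modify one data block in place: read old data + parity, then write new data + parity."""
    read_amp = 1 + parity_columns     # y + 1
    write_amp = 1 + parity_columns    # y + 1
    return read_amp, write_amp

def row_full_stripe(data_columns, parity_columns):
    """ROW full-stripe write: no reads of old data; amplification is (x + y) / x."""
    read_amp = 0
    write_amp = (data_columns + parity_columns) / data_columns
    return read_amp, write_amp

print(traditional_raid_random_small(parity_columns=2))     # RAID 6: (3, 3)
print(row_full_stripe(data_columns=22, parity_columns=2))  # RAID 6 (22D+2P): (0, ~1.09)
print(row_full_stripe(data_columns=21, parity_columns=3))  # RAID-TP (21D+3P): (0, ~1.14)
```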

Figure 4-6 Write amplification of traditional RAID 6

The following table lists the write amplification statistics of various traditional
RAID levels.


Table 4-1 Write amplification of traditional RAID levels


RAID Level                                     Write Amplification of    Read Amplification of     Write Amplification of
                                               Random Small I/Os         Random Small I/Os         Sequential I/Os
RAID 5 (7D+1P)                                 2                         2                         1.14 (8/7)
RAID 6 (14D+2P)                                3                         3                         1.14 (16/14)
RAID-TP (not available in traditional RAID)    -                         -                         -

On OceanStor Dorado 2000, RAID 6 uses 22D+2P and RAID-TP uses 21D+3P, where D indicates data columns and P, Q, and R indicate parity columns. The following table lists the write amplification on OceanStor Dorado 2000 when these RAID levels use ROW full-stripe write.

Table 4-2 Write amplification in ROW large-block sequential write


RAID Level           Write Amplification of    Read Amplification of     Write Amplification of
                     Random Small I/Os         Random Small I/Os         Sequential I/Os
RAID 6 (22D+2P)      1.09 (24/22)              0                         1.09
RAID-TP (21D+3P)     1.14 (24/21)              0                         1.14

4.4.3 Multistreaming
SSDs use NAND flash. The following figure shows the logical structure. Each SSD
consists of multiple NAND flash chips, each of which contains multiple blocks.
Each block further contains multiple pages (4 KB or 8 KB). The blocks in NAND
flash chips must be erased before being written. Before erasing data in a block,
the system must migrate valid data in the block, which causes write amplification
in the SSD.
The multistreaming technology classifies data and stores different types of data in
different blocks. There is a high probability that data of the same type is valid or
garbage data at the same time. This technology reduces the amount of data to be
migrated during block erasure and minimizes write amplification on SSDs,
improving the performance and service life of SSDs.


Figure 4-7 Logical structure of an SSD

In an ideal situation, garbage collection would expect all data in a block to be invalid so that the whole block could be erased without data migration. This would minimize write amplification, but it is rarely the case. Different data in a storage system is usually updated more or less frequently, which is referred to as hot and cold data. For example, metadata (hot) is updated more frequently and is more likely to become garbage than user data (cold). The multistreaming technology enables SSDs and controllers to work together to store hot and cold data in different blocks. This increases the possibility that all data in a block is invalid, reducing the valid data to be migrated during garbage collection and improving SSD performance and reliability. Figure 4-8 shows data migration for garbage collection before separation of hot and cold data, in which a large amount of data needs to be migrated. Figure 4-9 shows data migration for garbage collection after separation of hot and cold data, in which less data needs to be migrated.


Figure 4-8 Data migration for garbage collection before separation of hot and
cold data


Figure 4-9 Data migration for garbage collection after separation of hot and cold
data

OceanStor Dorado 2000 separates hot and cold data into three types: metadata, newly written data, and valid data that must be migrated for garbage collection. Metadata is the hottest data. Newly written data is migrated by garbage collection if it remains unmodified for a long period. The migrated data has the lowest probability of being changed and is thus considered the coldest. Data separation reduces SSD write amplification, greatly streamlines garbage collection, and reduces the number of erasures on SSDs, extending SSD service life.
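A minimal Python sketch of how the three data streams described above could be tagged so that each lands in its own set of SSD blocks follows; the stream identifiers and request fields are hypothetical.

```python
# Illustrative classification of writes into the three streams described above
# (metadata, newly written user data, garbage-collection migrated data).

from enum import Enum

class Stream(Enum):
    METADATA = 0        # hottest: updated most frequently
    NEW_WRITE = 1       # warm: new host writes
    GC_MIGRATED = 2     # coldest: valid data moved by garbage collection

def classify(write):
    """Pick the stream for a write request; each stream maps to its own set of SSD blocks."""
    if write.get("is_metadata"):
        return Stream.METADATA
    if write.get("from_garbage_collection"):
        return Stream.GC_MIGRATED
    return Stream.NEW_WRITE

writes = [
    {"lba": 10, "is_metadata": True},
    {"lba": 20},
    {"lba": 30, "from_garbage_collection": True},
]
for w in writes:
    print(w["lba"], "->", classify(w).name)
```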

4.4.4 End-to-End I/O Priority


To ensure stable latency for specific types of I/Os, OceanStor Dorado 2000
controllers label each I/O with a priority according to its type. Based on the
priorities, the system controls I/Os in terms of CPU scheduling, resource
scheduling, and queuing. Specifically, upon reception of I/Os, SSDs will check their
priorities and process higher-priority I/Os first. OceanStor Dorado 2000 classifies
I/Os into the following five types:

● Data read/write I/Os


● Advanced feature I/Os
● Reconstructing I/Os
● Cache flushing I/Os
● Garbage collection I/Os

The five types of I/Os are assigned priorities in descending order, as shown in the
following figure. Based on I/O priority control, the system achieves optimal
internal and external I/O response.
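The following Python sketch shows a priority queue over the five I/O types listed above; the numeric priority values and the queueing policy are assumptions used only to illustrate that higher-priority I/Os are served first.

```python
# Illustrative priority scheduler for the five I/O types listed above
# (priority values and queueing policy are assumptions for this sketch).

import heapq
import itertools

PRIORITY = {               # smaller value = served first
    "data read/write": 0,
    "advanced feature": 1,
    "reconstruction": 2,
    "cache flushing": 3,
    "garbage collection": 4,
}

_counter = itertools.count()  # tie-breaker keeps FIFO order within the same priority
queue = []

def submit(io_type, description):
    heapq.heappush(queue, (PRIORITY[io_type], next(_counter), description))

def next_io():
    return heapq.heappop(queue)[2]

submit("garbage collection", "GC block relocation")
submit("data read/write", "host read, LBA 0x1000")
submit("cache flushing", "flush dirty pages")
print(next_io())   # the host read is served first despite being submitted later
```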


Figure 4-10 End-to-End I/O Priority

On the left side in the preceding figure, various I/Os have the same priority and
contend for resources. With I/O priority control, system resources are allocated by
I/O priorities.

On each disk, in addition to assigning priorities to I/Os, the system also allows
high-priority read requests to interrupt ongoing write and erase operations. When
a host writes data to a storage system, a write success is returned to the host after
the data is written to the global cache. When a host reads data from a storage
system, data needs to be read from SSD disks if the cache is not hit. In this case,
the disk read latency directly affects the read latency of the host. OceanStor
Dorado 2000 is equipped with the latest generation of SSDs and uses the read first
technology to ensure a stable latency. Generally, there are three operations on the
flash media of an SSD: read, write, and erase. The erase latency is 5 ms to 15 ms,
the write latency is 2 ms to 4 ms, and the read latency ranges from dozens of μs
to 100 μs. When a flash chip is performing a write or an erase operation, a read
operation must wait until the current operation is finished, which causes a great
jitter in read latency.

Figure 4-11 Read first on SSDs

As shown in the preceding figure, if a read request with a higher priority is detected during an erase operation, the system cancels the current operation and preferentially processes the read request. This greatly reduces the read latency on SSDs.


4.4.5 Intelligent Analysis


The intelligent learning framework is built on the powerful computing capability provided by the intelligent chip to continuously learn service load characteristics and device health status. The intelligent read cache accurately predicts service models to improve the read cache hit ratio and ensure high system performance under complex service models. Intelligent quality of service (QoS) identifies and classifies different system loads, and suppresses the traffic of non-critical services to guarantee stable running of critical services. Enhanced data reduction performs inline or post-process deduplication according to data models, and chooses proper reduction algorithms for specific data models to achieve the optimal reduction ratio and performance.

The intelligent read cache uses deep learning to identify and prefetch service flows
based on space, time, and semantics, improving the read cache hit ratio.

● From the perspective of space, there are sequential flows (accessing adjacent
LBAs in sequence), simple interval flows (with regular interval between two
access operations), and complex interval flows (with no obvious regular
interval between two access operations).
● From the perspective of time, hotspots exist (centralized access to one area in
a short period of time).
● From the perspective of semantics, there is a semantic pattern (the accessed
data is logically related).

Complex interval flows and semantic pattern flows require intelligent chips.

Figure 4-12 Service flow patterns

Table 4-3 Identification methods for various service flow patterns

Pattern                  Compute Resource    Identification Method                                Flow Identification Accuracy
Sequential flow          CPU                 Statistical learning                                 100%
Hotspots                 CPU                 Statistical learning (MQ)                            100%
Interval flow            CPU                 Statistical learning                                 100%
Complex interval flow    Intelligent chip    Machine learning (Bayesian network, EM, LZ77...)     98%
Semantic pattern         Intelligent chip    Deep learning (including CNN and RNN)                95%

4.5 Rich Software Features


OceanStor Dorado 2000 provides the Smart series software for efficiency
improvement and Hyper series for data protection, implementing data
management throughout the lifecycle.
● The Smart series software includes SmartThin, SmartCompression, SmartQoS,
SmartErase, and SmartMulti-Tenant for SAN, which improve storage efficiency
and reduce the TCO.
● The Hyper software series includes HyperSnap for SAN, HyperCDP for SAN,
HyperClone for SAN, HyperReplication for SAN, HyperVault, HyperMetro for
SAN, 3DC for SAN, and HyperEncryption, which provide disaster recovery, data
backup, and security functions.


5 Smart Series Features

The OceanStor Dorado 2000 all-flash storage system provides various Smart series
software, including SmartThin and SmartCompression for better space utilization,
SmartQoS and SmartMulti-Tenant for SAN for better performance and improved
service quality, and SmartErase for system lifecycle and security management.
5.1 SmartDedupe and SmartCompression (Data Reduction)
5.2 SmartQoS (Intelligent Quality of Service Control)
5.3 SmartMigration (Intelligent Data Migration)
5.4 SmartThin (Intelligent Thin Provisioning)
5.5 SmartErase (Data Destruction)
5.6 SmartMulti-Tenant (Multi-tenancy)

5.1 SmartDedupe and SmartCompression (Data Reduction)

SmartDedupe and SmartCompression of OceanStor Dorado 2000 adaptively deduplicate and compress user data based on its characteristics. This section describes the working principles of deduplication and compression. Adaptive deduplication and compression maximize the data reduction ratio. Figure 5-1 shows the system processing flow.


Figure 5-1 Adaptive deduplication and compression process

(Figure legend) Adaptive dedupe algorithm: distributes requests based on data characteristics. Opportunity table: saves similar fingerprints. Fingerprint table: saves fingerprints of data that has been deduplicated.

1. When a user writes data, the adaptive deduplication algorithm identifies data
suitable for inline deduplication based on data characteristics and directly
performs inline deduplication.
2. The adaptive deduplication algorithm identifies data suitable for similarity-
based deduplication based on data characteristics, calculates similar
fingerprints (SFPs), and adds the SFPs to the similarity-based deduplication
opportunity table. Then the system compresses the user data, writes the
compressed data to the storage pool, and returns a success message.
3. The background deduplication task finds similar data in the opportunity table
and reads the data from disks for similarity-based deduplication. After the
deduplication is complete, the fingerprint table is updated.
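The following Python sketch mirrors the three steps above at a very high level; the fingerprint functions, table structures, and compression call are simplified assumptions, not the product's algorithms.

```python
# Simplified model of the adaptive deduplication and compression flow described above.

import hashlib
import zlib

fingerprint_table = {}     # full fingerprint -> location of deduplicated data
opportunity_table = {}     # similar fingerprint (SFP) -> candidate locations for background dedupe
storage_pool = {}

def full_fingerprint(block: bytes) -> str:
    return hashlib.sha256(block).hexdigest()

def similar_fingerprint(block: bytes) -> str:
    return hashlib.sha256(block[:64]).hexdigest()   # toy SFP: hash of a sample of the block

def write_block(lba: int, block: bytes):
    fp = full_fingerprint(block)
    if fp in fingerprint_table:
        return ("inline dedupe", fingerprint_table[fp])            # step 1: duplicate found inline
    opportunity_table.setdefault(similar_fingerprint(block), []).append(lba)   # step 2: record SFP
    storage_pool[lba] = zlib.compress(block)                        # step 2: compress and store
    fingerprint_table[fp] = lba
    return ("stored", lba)

def background_dedupe():
    """Step 3: the background task revisits entries with matching SFPs for similarity-based dedupe."""
    return {sfp: lbas for sfp, lbas in opportunity_table.items() if len(lbas) > 1}

print(write_block(0, b"A" * 8192))
print(write_block(1, b"A" * 8192))     # an identical block is deduplicated inline
print(background_dedupe())
```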

5.1.1 Compression
SmartCompression involves two processes. Input data blocks are first compressed
to smaller sizes using a compression algorithm and then compacted before being
written to disks.

5.1.1.1 Data Compression


This section describes how compression is implemented.

Preprocessing
Before data compression, OceanStor Dorado 2000 uses Huawei's preprocessing algorithm to identify, based on the data format, data blocks that are difficult for the general compression algorithm to compress. It then rearranges the data to be compressed to achieve a higher compression ratio.
After preprocessing, the data to be compressed is divided into two parts:


● For the part that is difficult to compress, the system compresses it using the
dedicated compression algorithm.
● For the other part, the system uses the general compression algorithm to
compress it.

Dedicated Compression
Based on the preprocessing result, OceanStor Dorado 2000 uses a proprietary
compression algorithm developed by Huawei to compress the data that is difficult
to compress.
This dedicated compression algorithm uses special coding rules to compress the
data without adding the metadata. It features high performance and does not
affect read and write operations. Figure 5-2 explains preprocessing and dedicated
compression.

Figure 5-2 Preprocessing and dedicated compression

General Compression
OceanStor Dorado 2000 provides multi-level general compression algorithms. To
balance the compression ratio and performance, it uses both inline and post-
process compression algorithms.
When data is written to the pool for the first time, the data is compressed using
the inline compression algorithm before being written to the pool. Then post-
process deduplication is performed and the deduplicated data is compressed again
using the post-process compression algorithm for a higher reduction ratio.
● Inline compression
The inline compression technology provided by OceanStor Dorado 2000 uses
the compression algorithm developed by Huawei, which uses LZ matching
and Huffman entropy encoding. This provides a compression ratio 20% higher
than that of the LZ4 algorithm while providing the same performance.
● Post-process compression
On OceanStor Dorado 2000, post-process compression works with similarity-
based deduplication and is implemented using the deep compression
algorithm developed by Huawei. The deep compression algorithm uses
stronger matching rules and supports redirection compression for databases
and combining compression for data blocks, providing a higher compression
ratio.


5.1.1.2 Data Compaction


This section describes how data compaction is implemented.
Compressed user data is aligned by byte and then compacted to reduce the waste
of physical space. This provides a higher reduction ratio than the 1 KB alignment
granularity generally used in the industry.
As shown in Figure 5-3, byte-level alignment saves 2 KB physical space compared
with 1 KB granularity alignment, improving the compression ratio.

Figure 5-3 Alignment of compressed data by byte

5.2 SmartQoS (Intelligent Quality of Service Control)


SmartQoS dynamically allocates storage system resources to meet the
performance objectives of applications. You can set upper and lower limits on
IOPS, bandwidth, or response latency for specific applications. Based on the limits,
SmartQoS can accurately limit performance of these applications, preventing them
from contending for storage resources with critical applications.
SmartQoS uses object-specific I/O priority scheduling and I/O traffic control
(including upper limit control and minimum performance guarantee) to guarantee
the service quality.


Table 5-1 SmartQoS policy and controlled objects


Function                                                 Controlled Object                                   Configuration Item
Upper limit control (including burst traffic control)    SAN: LUN, snapshot, LUN group, host, and vStore     IOPS and bandwidth
Lower limit guarantee                                     SAN: LUN, snapshot, and LUN group                   IOPS, bandwidth, and maximum latency

5.2.1 Functions
SmartQoS supports upper limit control and minimum performance guarantee.
Upper limit control prevents traffic of some services from affecting the normal
running of other services. Minimum performance guarantee is mainly used to
guarantee resources allocated to critical services, especially latency-sensitive
services, thereby ensuring their performance.

5.2.1.1 Upper Limit Control


I/O traffic control is implemented based on a user-configured SmartQoS policy
that contains SAN resource objects (LUNs, snapshots, LUN groups, hosts, and
vStores) to limit their IOPS and bandwidth. This prevents some specific
applications from affecting the performance of other services due to heavy burst
traffic.
I/O traffic control is implemented by I/O queue management, token allocation,
and dequeue control.
After an upper limit objective is set for a SmartQoS policy, the system allocates
tokens based on the objective to control traffic. If the objective is to limit IOPS, an
I/O is converted to a number of normalized 8 KB I/Os and a token is allocated to
each of the 8 KB I/Os. If bandwidth is limited, a token is allocated to each byte.
I/O queue management allocates storage resources by tokens. The more tokens
owned by an I/O queue, the more resources will be allocated to that queue.
Figure 5-4 explains the implementation process.


Figure 5-4 Upper limit control

1. The application servers send I/O requests to the target I/O queues.
2. The storage system converts the traffic control objective into a number of
tokens. You can set upper limit objectives for low-priority objects to guarantee
sufficient resources for high-priority objects.
3. The storage system processes the I/Os in the queues by tokens.
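A minimal token-bucket sketch consistent with the description above follows: I/Os are normalized to 8 KB units when the objective limits IOPS, and to bytes when it limits bandwidth. The refill rule and class names are assumptions for illustration.

```python
# Illustrative token-bucket limiter for SmartQoS upper limit control (not product code).

import math

class UpperLimit:
    def __init__(self, objective, value_per_second):
        assert objective in ("iops", "bandwidth")
        self.objective = objective
        self.rate = value_per_second        # tokens added per second
        self.tokens = value_per_second      # start with one second's worth of tokens

    def tokens_needed(self, io_size_bytes):
        if self.objective == "iops":
            return math.ceil(io_size_bytes / 8192)   # one token per normalized 8 KB I/O
        return io_size_bytes                         # bandwidth limit: one token per byte

    def refill(self, seconds):
        self.tokens = min(self.rate, self.tokens + self.rate * seconds)

    def admit(self, io_size_bytes):
        need = self.tokens_needed(io_size_bytes)
        if self.tokens >= need:
            self.tokens -= need
            return True
        return False          # the I/O waits in its queue until enough tokens are available

limit = UpperLimit("iops", value_per_second=1000)
print(limit.admit(64 * 1024))   # a 64 KB I/O consumes 8 tokens (8 normalized 8 KB I/Os)
```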

Burst Traffic Control Management


For latency-sensitive services, you can allow them to exceed the upper limit for a
specific period of time. SmartQoS supports burst traffic control management to
specify the burst IOPS, bandwidth, and duration for the controlled objects.
The system accumulates the unused resources during off-peak hours and
consumes them during traffic bursts to break the upper limit for a short period of
time. To achieve this, the long-term average traffic of the service should be below
the upper limit.

Figure 5-5 Traffic burst


1. If the traffic of an object does not reach the upper limit in a past period of
time, the service traffic of the object can temporarily exceed the upper limit
set by SmartQoS in a future period of time as long as the system is not
overloaded. This meets the requirements of the burst service traffic during
peak hours. The maximum duration and ratio of a burst are configurable.
2. Burst traffic control is implemented by accumulating burst durations. If the
traffic of a LUN, snapshot, LUN group, or host is below the upper limit in a
second, the system accumulates this second for the burst duration. When the
service load surges, performance can break the upper limit to reach the
specified burst limit for a duration accumulated earlier (this duration will not
exceed the maximum value specified).
3. When the accumulated duration or the specified maximum duration is
reached, the performance drops below the upper limit.

5.2.1.2 Lower Limit Guarantee


The lower limit guarantee function is dedicated to ensuring critical services. It
takes effect for all service objects (such as LUNs, snapshots, and LUN groups) in a
system to release resources for objects whose lower limit objectives are not
fulfilled. Therefore, this function is applicable to only a few critical services. If this
function is configured for all services, the quality of all services cannot be
guaranteed.
If a controlled object does not reach the lower limit, the system evaluates the load
on all controlled objects in the system. For heavily loaded objects, the system
suppresses their traffic based on the lower limit until sufficient resources are
released for all objects in the system to reach the lower limit. For objects that
have reached the lower limit but are not heavily loaded, the system prevents burst
traffic on these objects. For LUNs that do not reach the lower limit, the system
does not limit their traffic and allows them to preempt resources from heavily
loaded LUNs.


Figure 5-6 Lower limit guarantee for LUNs

Latency assurance is to prioritize requests of objects that have latency objectives. If latency assurance is not achieved, the latency objectives are converted into traffic control objectives and guaranteed in the same way as lower limits.

5.2.2 Policy Management

5.2.2.1 Hierarchical Management


SmartQoS supports both common and hierarchical policies. The details are as
follows:
● A common policy contains only controlled objects. It controls the traffic from
a single application to the controlled objects. For example, VDI application
startup causes a temporary boot storm. You can configure a common policy
during the startup to prevent the boot storm from affecting other services.
● A hierarchical policy can contain common policies. It controls the traffic when
there are multiple applications running in the system. For example, in a
VMware environment, the customer wants to control the upper limit of a
specific VMDK on a VM. To do this, the customer can set a hierarchical policy
for the VM and a common policy for the VMDK.


NOTE

● The types of objects in a SmartQoS policy are classified by LUNs/snapshots, LUN groups,
hosts, and vStores. A common policy can have only one type of object.
● A hierarchical policy can have multiple common policies, each of which can contain a
different type of object.
● Each LUN or snapshot can be added to only one SmartQoS policy.
● Each LUN group can be added to only one SmartQoS policy.
● Each host can be added to only one SmartQoS policy.
● Each vStore can be added to only one SmartQoS policy whose owner is the system.
● Each vStore can be added to one SmartQoS policy whose owner is the system and
one SmartQoS policy whose owner is the vStore at the same time.

The following figures show the relationship between the common and hierarchical
policies (using LUNs as an example).

Figure 5-7 Adding 1 to 512 LUNs or snapshots to each policy


Figure 5-8 Adding a LUN group to each policy

For details about the policy specifications, see the product specifications list.

5.2.2.2 Objective Distribution


All objects in a SmartQoS policy share the upper limit objectives. The SmartQoS module periodically collects the performance and requirement statistics of all objects in a traffic control policy and distributes the traffic control objective to each object.
Currently, an optimized weighted max-min fairness algorithm is used for objective distribution. It determines the traffic control objective for each object based on the policy's overall objective (including the upper and lower limits) and each object's resource requirement. The algorithm preferentially meets the requirements of objects. The requirement refers to the number of requests received by an object, including both successful and rejected requests. The remaining resources are then distributed to each object based on the object's weight. In addition, a filtering mechanism is used to ensure a relatively stable objective for each object.
Note:
You can add each LUN or snapshot to a traffic control policy separately, or add
LUNs or snapshots to a LUN group and then add the LUN group to a traffic
control policy. When a LUN or snapshot is added to multiple traffic control
policies, the smallest value among the upper limits takes effect for the LUN or
snapshot.
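The following Python sketch illustrates a plain (unoptimized) weighted max-min fairness distribution under a shared upper limit; the optimized algorithm and filtering mechanism used by the product are not reproduced here, and the function names are illustrative.

```python
# Illustrative weighted max-min fairness: distribute a policy's shared upper limit
# among objects according to their demand and weight (a simplified textbook version).

def weighted_max_min(total_limit, demands, weights):
    """demands/weights: dict of object -> requested IOPS / weight."""
    allocation = {obj: 0.0 for obj in demands}
    active = set(demands)
    remaining = float(total_limit)
    while active and remaining > 1e-9:
        weight_sum = sum(weights[o] for o in active)
        fully_served = set()
        for obj in active:
            share = remaining * weights[obj] / weight_sum
            grant = min(share, demands[obj] - allocation[obj])
            allocation[obj] += grant
            if allocation[obj] >= demands[obj] - 1e-9:
                fully_served.add(obj)
        remaining = total_limit - sum(allocation.values())
        if not fully_served:
            break                     # all remaining demand exceeds the fair share
        active -= fully_served
    return allocation

# Three LUNs share a 10,000 IOPS upper limit; the small demand is satisfied first,
# and the remainder is split between the heavier consumers by weight.
print(weighted_max_min(10000, {"lun1": 2000, "lun2": 9000, "lun3": 9000},
                       {"lun1": 1, "lun2": 1, "lun3": 2}))
```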

5.2.2.3 Recommended Configuration

Upper Limit Configuration


● In a multi-tenant scenario, you can configure different hierarchical policies for
different tenants to ensure that the resources occupied by a single tenant do
not exceed the limit. Common policies can be configured for different services
of a single tenant to specify the upper and lower performance limits for these
services.
● When hybrid service loads are carried, you can set upper limits for non-critical
services, especially services with great fluctuations in loads, and set lower
limits for critical and latency-sensitive services.
● You are advised to configure a burst policy for service loads that are not
evenly distributed and are sensitive to latency.

Lower Limit Configuration


For a few mission-critical services in the system, you can configure a lower limit policy to set the minimum IOPS, minimum bandwidth, and maximum response latency. When system resources are insufficient, the lower limit policy limits the performance of non-critical services to ensure the quality of the critical services. If the lower limit policy fails to take effect due to insufficient system resources, the system generates an alarm, and you can adjust the lower limit policy based on the alarm information. Ensure that the overall performance of the services configured with the lower limit policy does not exceed 50% of system performance. If the lower limit policy is applied to all services, it cannot guarantee any of them.

5.3 SmartMigration (Intelligent Data Migration)


OceanStor Dorado 2000 uses SmartMigration for intelligent data migration based
on LUNs. Data on a source LUN can be completely migrated to a target LUN
without interrupting ongoing services. SmartMigration also supports data
migration between a Huawei storage system and a compatible heterogeneous
storage system.

When the system receives new data during migration, it writes the new data to
both the source and target LUNs simultaneously and records data change logs
(DCLs) to ensure data consistency. After the migration is complete, the source and
target LUNs exchange information to allow the target LUN to take over services.

SmartMigration is implemented in two stages: data synchronization and LUN information exchange.

Data Synchronization
1. Before the migration, you must configure the source and target LUNs.
2. When migration starts, the source LUN replicates data to the target LUN in
the background.
3. During migration, the host can still access the source LUN. When the host
writes data to the source LUN, the request is recorded in a log. The log
contains the address information instead of the specific data.
4. The system writes the incoming data to both the source and target LUNs.
– The system waits for the write response from the source and target LUNs.
If writing to both LUNs is successful, the system deletes the log. If writing
to either LUN fails, the system retains the log and replicates the data
again in the next synchronization.


– The system returns the write result of the source LUN to the host.
5. The system performs the preceding operations until all data is replicated to
the target LUN.
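The following Python sketch models the dual-write and data change log (DCL) behavior described in the steps above; the structures are hypothetical, and note that the real DCL records addresses rather than data, as stated in the text.

```python
# Illustrative model of SmartMigration dual-write with a data change log (DCL).

source_lun = {}
target_lun = {}
dcl = set()            # addresses whose dual-write has not yet been confirmed on both LUNs

def host_write(lba, data, target_write_ok=True):
    """During migration, new writes go to both LUNs; the DCL entry is kept on any failure."""
    dcl.add(lba)                       # record the address before writing
    source_lun[lba] = data             # write to the source LUN
    if target_write_ok:
        target_lun[lba] = data         # write to the target LUN
        dcl.discard(lba)               # both writes succeeded: drop the log entry
    return "ack to host (source LUN write result)"

def resynchronize():
    """The next synchronization replicates only the addresses still recorded in the DCL."""
    for lba in sorted(dcl):
        target_lun[lba] = source_lun[lba]
    dcl.clear()

host_write(1, b"a")                          # dual write succeeds, no DCL entry remains
host_write(2, b"b", target_write_ok=False)   # target write fails, address 2 stays in the DCL
resynchronize()
print(sorted(target_lun))                    # [1, 2] after resynchronization
```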

LUN Information Exchange


After data replication is complete, the source LUN and target LUN exchange
information. In the information exchange, source and target LUN IDs and WWNs
remain unchanged but the data volume IDs of the source LUN and target LUN are
exchanged. This creates a new mapping relationship between the source LUN ID
and target volume ID. After the exchange, the host can still identify the source
LUN using the source LUN ID but accesses physical space of the data volume
corresponding to the target LUN.
SmartMigration can meet the following requirements:
● Storage system upgrade by working with SmartVirtualization. SmartMigration
works with SmartVirtualization to migrate data from legacy storage systems
(from Huawei or other vendors) to new Huawei storage systems to improve
service performance and data reliability.
● Data migration for capacity, performance, and reliability adjustments. For
example, a LUN can be migrated from one storage pool to another.

Value-Added Configurations for Target LUNs


The duration of data migration across devices varies from hours to weeks or even
months, depending on the amount of data to be migrated. Cross-array remote
replication and HyperMetro features can be configured for target LUNs to
replicate data across data centers during migration for disaster recovery. This
ensures immediate data protection upon completion of the migration, improving
solution reliability.
During data migration, data on the target LUNs is incomplete. Therefore, the DR
site cannot take over services before the migration and DR data synchronization
are complete.

5.4 SmartThin (Intelligent Thin Provisioning)


OceanStor Dorado supports thin provisioning, which enables the storage systems
to allocate storage resources on demand. SmartThin does not allocate all capacity
in advance, but presents a virtual storage capacity larger than the physical storage
capacity. This allows you to see a larger storage capacity than the actual storage
capacity. When you begin to use the storage, SmartThin provides only the required
space. If the storage space is about to be used up, SmartThin triggers storage
resource pool expansion to add more space. The expansion process is transparent
to users and causes no system downtime.
The following figure shows the benefits of SmartThin (using LUNs as an example).


Figure 5-9 Comparison between the thin LUN and traditional LUN

5.5 SmartErase (Data Destruction)


If a disk is no longer used in its original scenario, data on the disk is not needed. If
the data is not properly processed, unauthorized users may use the residual data
to reconstruct the original data, causing information leakage risks. Therefore, data
on disks must be thoroughly erased to ensure data security.

The data destruction function is implemented based on disks and is not specific to
service types. Data destruction complies with DoD standards.

During data erasure, the storage system sends one or more SANITIZE commands
(defined by the SCSI or NVMe protocol) to the disk according to the data erasure
security standard. After receiving the command, the disk returns a success
message. At the same time, the data erasure task is executed in the background.
The storage system periodically queries the task progress until it ends. All data,
including data in the over provisioning (OP) space, will be erased, and the data
cannot be restored.

Security standards for data erasure include:

● DoD 5220.22-M (E)


This standard suggests a software method to erase data from writable
storage media, including three overwrites:
a. Uses a 1-byte character to overwrite all addresses.
b. Uses one's complement of the character to overwrite all addresses.
c. Uses a random number to overwrite all addresses.
● DoD 5220.22-M (ECE)
This standard is an extended version of the DoD 5220.22-M (E). It runs the
DoD 5220.22-M (E) twice and uses a random number once to overwrite all
addresses.


a. Uses the DoD 5220.22-M (E) standard to overwrite all addresses for three
times.
b. Uses a random number to overwrite all addresses once.
c. Uses the DoD 5220.22-M (E) standard to overwrite all addresses for three
times.
● VSITR
The VSITR data sanitization method was originally defined by the German
Federal Office for Information Security and is implemented in the following
way:
a. Write zero bytes (0x00).
b. Write high bytes (0xFF).
c. Write zero bytes (0x00).
d. Write high bytes (0xFF).
e. Write zero bytes (0x00).
f. Write high bytes (0xFF).
g. Write pseudo random bytes.
● Customization
In the custom overwrite mode, the system uses the custom value that is a 1-
byte hexadecimal number starting with r or 0x. A maximum of three
parameters separated by commas (,) can be entered. The system uses the
data to overwrite all the addresses of the disk for specified times. You can set
the times of overwriting the disk from 1 to 15. The default value is 1.
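As a reading aid, the following Python sketch generates the overwrite passes of the DoD 5220.22-M (E) and (ECE) patterns described above for a toy buffer; it only illustrates the pass pattern, not the SANITIZE-command-based implementation the array actually uses.

```python
# Illustrative generation of the DoD 5220.22-M overwrite passes described above.
# The real array issues SCSI/NVMe SANITIZE commands; this only shows the pass pattern.

import os

def dod_e_passes(character: int, length: int):
    """Return the three overwrite buffers: character, its one's complement, then random data."""
    pass1 = bytes([character]) * length                 # a. 1-byte character over all addresses
    pass2 = bytes([character ^ 0xFF]) * length          # b. one's complement of that character
    pass3 = os.urandom(length)                          # c. random data over all addresses
    return [pass1, pass2, pass3]

def dod_ece_passes(character: int, length: int):
    """ECE variant: run the (E) passes, one random pass, then the (E) passes again."""
    return dod_e_passes(character, length) + [os.urandom(length)] + dod_e_passes(character, length)

passes = dod_ece_passes(0x55, length=16)
print(len(passes), "overwrite passes")       # 7 passes in total for DoD 5220.22-M (ECE)
```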

5.6 SmartMulti-Tenant (Multi-tenancy)


With improved service capability, a single storage system is carrying more and
more customer service systems. In this case, customers want to isolate these
service applications. OceanStor Dorado 2000 uses SmartMulti-Tenant to isolate
services.

SmartMulti-Tenant isolates logical resources among vStores, including services and networks. Users cannot access data across vStores, ensuring security isolation.

● Service isolation: Each vStore has its own storage services and user access
authentication. Users can access the services through the logical interfaces
(LIFs) of the vStore.
● Network isolation: VLANs and LIFs separate the networks among vStores,
preventing unauthorized access of storage resources. For SAN services, users
can specify Fibre Channel ports available to vStores to implement network
isolation because the Fibre Channel protocol supports point-to-point
communication.

Service Isolation
SmartMulti-Tenant isolates NAS services by vStore. For SAN services, different
vStores cannot access LUNs of each other, ensuring data isolation by vStore.


NOTE

vStores only isolate service resources and do not isolate storage pools. SAN isolation will be
available in later versions. For details, see the product roadmap. Storage pool isolation is
implemented using the multi-pool technology. That is, customers can plan and configure
multiple pools on a device to isolate storage space.

Network Isolation
vStores' network resources are managed using LIFs, implementing port
virtualization, management, and isolation as well as flexible and secure use of
resources.
For SAN services, users can specify Fibre Channel ports available to vStores to
implement network isolation.


6 Hyper Series Features

To meet customers' requirements for local protection and remote disaster recovery, OceanStor Dorado 2000 provides the Hyper series software. HyperSnap and HyperCDP help you recover from local logic errors. HyperClone creates a complete local data copy; data integrity of the parent object has no impact on the data integrity of the clone object, isolating fault domains. HyperReplication and 3DC are used to implement remote disaster recovery. HyperMetro not only ensures service continuity but also provides disaster recovery capabilities.
6.1 HyperSnap (Snapshot)
6.2 HyperCDP (Continuous Data Protection)
6.3 HyperClone (Clone)
6.4 HyperReplication (Remote Replication)
6.5 HyperMetro (Active-Active Deployment)
6.6 Geo-Redundancy (Multi-DC)
6.7 HyperEncryption (Array Encryption)

6.1 HyperSnap (Snapshot)


The snapshot feature of OceanStor Dorado 2000 is called HyperSnap.

6.1.1 HyperSnap for SAN (Snapshot for SAN)


This section describes the technical principles and key functions of HyperSnap for
SAN.

6.1.1.1 Basic Principles


OceanStor Dorado 2000 provides readable and writable snapshots for SAN
services. A snapshot must be separately mapped to a host. This section details the
time point (TP) technology, which is crucial to HyperSnap.


TP
OceanStor Dorado 2000 uses the multi-TP technology to implement basic data
protection features. All local and remote data protection features use this
technology to obtain data copies and ensure consistency.
This technology adds a TP attribute to LUNs. When a snapshot is created for a LUN, the value of the TP attribute of the source LUN is incremented, while the TP attribute of the snapshot retains the original TP at the time of snapshot creation.
In the example of Figure 6-1, the current TP of the source LUN is T0 and the user creates a snapshot for the source LUN.

Figure 6-1 Snapshot principles

Because a snapshot is created at T0, the TP attribute of the source LUN changes
from T0 to T1. The TP attribute of the snapshot is T0. When the source LUN is
read, data at T1 is read; when the snapshot is read, data at T0 is read (ABCDEF in
this example).

Reading and Writing a Snapshot


The host reads and writes data on the source LUN at the latest TP, or reads and
writes data on the snapshot at the original TP, as shown in Figure 6-2.


Figure 6-2 Reading and writing a snapshot

● Reading the source LUN


After a snapshot is created for the source LUN, the TP of the source LUN is
updated from T0 to T1. The read requests on the source LUN will read the
data within the [T0, T1] time range on the source LUN (from the latest data
to the old data according to mapping entries in the mapping table). This will
not cause any new performance overhead.
● Reading the snapshot
The latest TP of the snapshot is T0. When the snapshot is read, the data at T0
is returned if the mapping table of the snapshot is not empty. If the mapping
table of the snapshot is empty, a TP redirection is triggered and the data at
T0 of the source LUN is read.
● Writing the source LUN
When new data is written to the source LUN, the write requests carry the
latest T1. The system uses the logical address of the new data and T1 as the
key, and uses the address where the new data is stored in the SSD storage
pool as the key value.
● Writing the snapshot
When new data is written to the snapshot, the write requests carry the latest
T0 of the snapshot. The system uses the logical address of the new data and
T0 as the key, and uses the address where new data is stored in the SSD
storage pool as the value.

Because the read and write requests on the source LUN or snapshot carry
corresponding TPs, the metadata can be quickly located, minimizing the impact on
performance.
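A minimal Python sketch of the (LBA, TP) keyed mapping described above follows: reads search from the requested TP back to older TPs, and writes are tagged with the object's current TP. All names and the linear search are illustrative simplifications.

```python
# Illustrative (LBA, TP) mapping table for HyperSnap, following the read/write rules above.

mapping = {}          # (lba, tp) -> location of the data in the SSD storage pool

def write(lba, tp, location):
    """Writes to the source LUN carry its latest TP; writes to a snapshot carry the snapshot's TP."""
    mapping[(lba, tp)] = location

def read(lba, tp):
    """Search from the requested TP back to older TPs until a mapping entry is found."""
    for t in range(tp, -1, -1):
        if (lba, t) in mapping:
            return mapping[(lba, t)]
    return None        # unwritten address

write(lba=0, tp=0, location="pool:block-A")   # data written before the snapshot (TP 0)
# A snapshot is created: the source LUN's TP becomes 1, the snapshot keeps TP 0.
write(lba=0, tp=1, location="pool:block-B")   # new write to the source LUN after the snapshot

print(read(lba=0, tp=1))   # source LUN read: pool:block-B (latest data)
print(read(lba=0, tp=0))   # snapshot read: pool:block-A (data at the snapshot point in time)
```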

6.1.1.2 Cascading Snapshot


To protect writable snapshots, you can use cascading snapshots. This chapter
describes cascading snapshots.

A cascading snapshot is a child snapshot created for a parent snapshot. On OceanStor Dorado 2000, HyperSnap supports up to eight levels of cascading snapshots.
Cascading snapshots support cross-level rollback. Multi-level cascading snapshots that share a source LUN can be used to roll back each other regardless of their cascading levels. In Figure 6-3, Snapshot1 is created for the source LUN at 9:00, and Snapshot1.Snapshot0 is a cascading snapshot of Snapshot1 created at 10:00. The system can roll back the source LUN using Snapshot1.Snapshot0 or Snapshot1, or roll back Snapshot1 using Snapshot1.Snapshot0.

Figure 6-3 Cascading snapshot and cross-level rollback

6.1.1.3 Snapshot Consistency Group


HyperSnap supports snapshot consistency groups. For LUNs that are dependent on
each other, you can create a snapshot consistency group for these LUNs to ensure
data consistency. For example, the data files, configuration files, and logs of an
Oracle database are usually saved on different LUNs. Snapshots for these LUNs
must be created at the same time to guarantee that the snapshot data is
consistent in time.


Figure 6-4 Working principles of the snapshot consistency group

1. Create a LUN protection group and add LUNs to it.


NOTE

A maximum of 4096 LUNs can be added to a LUN protection group.


2. Create a snapshot consistency group for the protection group. The snapshots
in the snapshot consistency group have the same TP.

6.2 HyperCDP (Continuous Data Protection)


HyperCDP creates high-density snapshots on a storage system to provide
continuous data protection (CDP). This section provides an overview of this
feature.
Misoperations and virus attacks may cause data damage. Continuous data
protection is to create snapshots at a short interval to help customers restore data.
HyperCDP provides CDP for LUNs. HyperCDP is based on the lossless snapshot
technology (TP and ROW). Each HyperCDP object matches a time point of the
source LUN. Figure 6-5 illustrates how HyperCDP is implemented.


Figure 6-5 HyperCDP principles

HyperCDP Schedule
You can specify HyperCDP schedules by day, week, month, or specific interval,
meeting different backup requirements.

Table 6-1 HyperCDP schedule


Schedule Type: Fixed period
● If you set the schedule by second, HyperCDP objects will be created every 10 seconds by default.
● If you set the interval by minute, the HyperCDP schedule is executed every 1 minute by default.
● If you set the interval by hour, the HyperCDP schedule is executed every 1 hour by default.
NOTE: SAN: A single LUN supports a maximum of 60,000 HyperCDP objects. The system supports a maximum of 2,000,000 HyperCDP objects.

Schedule Type: Execute daily
Set the point in time every day for the storage system to create HyperCDP snapshots. The value ranges from 00:00 to 23:59 at the local time zone.
NOTE: The number of retained HyperCDP objects ranges from 1 to 256.

Schedule Type: Execute weekly
Set the point in time and day in a week for the storage system to create HyperCDP snapshots. The value ranges from 00:00 Sunday to 23:59 Saturday at the local time zone.
NOTE: The number of retained HyperCDP objects ranges from 1 to 256.

Schedule Type: Execute monthly
Set the point in time and day in a month for the storage system to create HyperCDP snapshots. The value ranges from 00:00 Day 1 to 23:59 Day 31 (or the last day) at the local time zone.
NOTE: The number of retained HyperCDP objects ranges from 1 to 256.

Intensive and Persistent Data Protection


A single LUN supports 60,000 HyperCDP objects. The minimum interval is 3
seconds. A single file system supports a maximum of 4096 HyperCDP objects. The
minimum interval is 15 seconds. You can configure the retention policy of
HyperCDP in the scheduling policy.

Consistency Group
In database applications, the data, configuration files, and logs are usually saved
on different LUNs. The HyperCDP consistency group ensures that data in the
group is consistent in time between these LUNs during restoration.

Secure Snapshot
In financial, securities, or bank applications, HyperCDP objects are configured to
back up critical data for long term retention. To prevent HyperCDP objects from
deletion, you can set a retention period. The HyperCDP objects cannot be deleted
within the retention period. After the period expires, they can be deleted manually
or automatically.
● You can create secure snapshots for individual LUNs and secure snapshot CGs
for LUN groups. The retention period ranges from 1 day to 20 years. You can
configure whether to automatically delete the snapshot upon expiration.
● You can change HyperCDP objects to secure snapshots and HyperCDP CGs to
secure snapshot CGs. The retention period ranges from 1 day to 20 years. You
can configure whether to automatically delete the snapshot upon expiration.
● You can adjust the retention period and the automatic deletion policy for
secure snapshots and secure snapshot CGs. The retention period can be
extended but cannot be shortened.
● You can create schedule policies for secure snapshots.


NOTE

● Secure snapshots cannot be deleted within the retention period. Given that the retention
period cannot be shortened, exercise caution when configuring it.
● Even if Protection Data Auto Deletion is enabled, secure snapshots will not be deleted
if the used capacity of the storage pool reaches the Capacity Used Up Alarm
Threshold or the protection capacity of the storage pool reaches the upper threshold.
● Secure snapshots have an independent clock free from the system time change. The
clock is updated once every minute. After the system is shut down, the clock stops.

6.3 HyperClone (Clone)


On OceanStor Dorado 2000, HyperClone allows the system to create a complete
data copy of the source LUN's or file system's data on the target LUN or file
system. The target LUN or file system can be an existing one or automatically
created when the clone pair is created. The source and target LUNs or file systems
that form a clone pair must have the same capacity. The target LUN or file system
can either be empty or have existing data. If the target LUN or file system has
data, the data will be overwritten by the source LUN or file system of HyperClone.
Data access of the clone LUN or file system is independent from that of the source
LUN or file system. That is, changes to one LUN or file system do not affect the
data of the other LUN or file system.

6.3.1 HyperClone for SAN (Clone for SAN)


After a clone LUN is created, it shares the same data with the source LUN. The
data read/write model is the same as that of the snapshot LUN and source LUN.
Users can split a clone LUN to start background data replication. The data
synchronization status does not affect the read/write status of the target LUN. The
target LUN can be read and written immediately without waiting for the
background replication to complete. HyperClone for SAN supports incremental
synchronization and reverse synchronization. You can create a HyperClone
consistency group using a LUN protection group to protect data consistency on a
group of source LUNs.

6.3.1.1 Data Synchronization


When a clone pair starts synchronization, the system generates an instant
snapshot for the source LUN, and then synchronizes the snapshot data to the
target LUN. Any subsequent write operations are recorded in the DCL. When
synchronization is performed again, the system compares the data of the source
and target LUNs, and only synchronizes the differential data to the target LUN.
The data written to the target LUN between the two synchronizations will be
overwritten. To retain the existing data on the target LUN, you can create a
snapshot for it before synchronization.
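A minimal sketch of this differential synchronization, assuming the DCL is modeled as a set of changed block addresses and LUNs as dictionaries (illustrative only, not the product's internal interfaces):

def synchronize(source_snapshot: dict, target: dict, dcl: set) -> None:
    """Copy only the blocks recorded in the DCL from the source-LUN snapshot
    to the target LUN, then clear the DCL for the next cycle."""
    for lba in sorted(dcl):
        target[lba] = source_snapshot.get(lba)
    dcl.clear()

# First synchronization: the DCL covers all allocated blocks (full copy).
# Later synchronizations: only blocks written since the previous sync are in
# the DCL, so only the differential data is copied, and any data written to
# the target in the meantime is overwritten (hence the advice to snapshot
# the target first if its existing data must be kept).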


Figure 6-6 Data synchronization from the source LUN to the target LUN

6.3.1.2 Reverse Synchronization


If the source LUN is damaged, data on the target LUN can be reversely
synchronized to the source LUN. Both full and incremental reverse
synchronizations are supported. When reverse synchronization starts, the system
generates a snapshot for the target LUN and synchronizes the snapshot data to
the source LUN. For incremental reverse synchronization, the system compares the
data of the source and target LUNs, and only synchronizes the differential data.

Figure 6-7 Reverse synchronization from the target LUN to the source LUN


6.3.1.3 Immediately Available Clone LUNs


The HyperClone data synchronization status can be Synchronizing, Sync paused, Unsynchronized, or Normal. In each status, read and write I/Os on the source and target LUNs are processed differently.
1. When HyperClone is in the normal or unsynchronized state:
The host reads and writes the source or target LUN directly.

Figure 6-8 Reads and writes when HyperClone is in the normal or unsynchronized state

2. When HyperClone is in the synchronizing or paused state:


The host reads and writes the source LUN directly.
For read operations on the target LUN, if the requested data is found on the
target LUN (the data has been synchronized), the host reads the data from
the target LUN. If the requested data is not found on the target LUN (the
data has not been synchronized), the host reads the data from the snapshot
of the source LUN.
For write operations on the target LUN, if a data block has been synchronized
before the new data is written, the system overwrites this block. If a data
block has not been synchronized, the system writes the new data to this block
and stops synchronizing the source LUN's data to it. This ensures that the
target LUN can be read and written before the synchronization is complete.


Figure 6-9 Reads and writes when HyperClone is in the synchronizing or paused state
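The redirection rules above can be illustrated with a minimal sketch of target-LUN I/O handling while synchronization is in progress (data structures and names are illustrative only):

def read_target(lba, target: dict, source_snapshot: dict, present_on_target: set):
    # If the block has been synchronized (or newly written), read it from the
    # target LUN; otherwise redirect the read to the snapshot of the source LUN.
    if lba in present_on_target:
        return target[lba]
    return source_snapshot.get(lba)

def write_target(lba, data, target: dict, present_on_target: set, skip_sync: set):
    # New host data always lands on the target LUN; the block is then excluded
    # from background synchronization so that the host write is not overwritten.
    target[lba] = data
    present_on_target.add(lba)
    skip_sync.add(lba)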

6.3.1.4 HyperClone Consistency Group


HyperClone allows you to create a consistency group for a LUN protection group.
A HyperClone consistency group contains multiple clone pairs. When you
synchronize or reversely synchronize a consistency group, data on all of its
member LUNs is always at a consistent point in time, ensuring data integrity and
availability.
A HyperClone consistency group supports a maximum of 4096 members.

6.3.1.5 Cascading Clone Pairs


After data has been synchronized to the target LUN of a clone pair, you can create
another clone pair for the target LUN by using it as a new source LUN, as shown
in Figure 6-10.


Figure 6-10 Cascading clone pairs

HyperClone has no restriction on the cascading levels.

6.4 HyperReplication (Remote Replication)

6.4.1 HyperReplication for SAN (Remote Replication for SAN)


Huawei HyperReplication is a remote replication feature running on Huawei OceanStor Dorado. HyperReplication supports synchronous or asynchronous data replication between OceanStor Dorado storage systems and can be used to build intra-city or remote DR solutions.

HyperReplication supports the following two modes:

● Synchronous remote replication. Data on the primary LUN is synchronized to the secondary LUN in real time. No data is lost if a disaster occurs. However, production service performance is affected by the data transfer latency.
● Asynchronous remote replication. Data on the primary LUN is periodically synchronized to the secondary LUN. Production service performance is not affected by the data transfer latency. However, some data may be lost if a disaster occurs.

HyperReplication provides the storage system-based consistency group function for synchronous and asynchronous remote replication to ensure the crash consistency of cross-LUN applications in disaster recovery replication. The consistency group function protects the dependency of host write I/Os across multiple LUNs, ensuring data consistency between secondary LUNs.

HyperReplication enables data to be replicated using Fibre Channel and IP networks. Data can be transferred between the primary and secondary storage systems using Fibre Channel or IP links.


6.4.1.1 HyperReplication/S (Synchronous Remote Replication)


HyperReplication/S supports short-distance disaster recovery of LUN data and applies to intra-city DR scenarios that require an RPO of zero. It concurrently writes each host write I/O to both the primary and secondary LUNs of the remote replication pair and returns a write success acknowledgement to the host only after the data has been written to both LUNs. Therefore, the RPO is zero.

Figure 6-11 Working principles

Description:
1. The production storage system receives a write request from the host.
HyperReplication logs the address information instead of data content.
2. The data of the write request is written to both the primary and secondary
LUNs. If LUNs are in the write-back state, a write result will be returned after
the data is written to the cache.
3. HyperReplication waits for the data write results from the primary and
secondary LUNs. If the data has been successfully written to the primary and
secondary LUNs, HyperReplication deletes the log. Otherwise,
HyperReplication retains the log and enters the interrupted state. The data
will be replicated in the next synchronization.
4. HyperReplication returns the data write result. The data write result of the
primary LUN prevails.
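The four steps above can be summarized in a minimal sketch (the LUN objects and their write() method are illustrative assumptions, not the product API):

def hyper_replication_s_write(lba, data, primary_lun, secondary_lun, log: set) -> bool:
    log.add(lba)                                  # 1. log the address only, not the data
    ok_primary = primary_lun.write(lba, data)     # 2. dual write to primary and secondary
    ok_secondary = secondary_lun.write(lba, data)
    if ok_primary and ok_secondary:
        log.discard(lba)                          # 3. both writes succeeded: drop the log entry
    # Otherwise the log entry is kept, the pair enters the interrupted state,
    # and the block is replicated during the next synchronization.
    return ok_primary                             # 4. the primary LUN's write result prevails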

6.4.1.2 HyperReplication/A (Asynchronous Remote Replication)


HyperReplication/A of OceanStor Dorado 2000 adopts the multi-time-segment
caching technology (patent number: PCT/CN2013/080203). The working principle
of the technology is as follows:
1. After an asynchronous remote replication relationship is established between
a primary LUN at the production site and a secondary LUN at the DR site, an
initial synchronization is implemented by default to copy all data from the
primary LUN to the secondary LUN.
2. When the initial synchronization is complete, the data status of the secondary
LUN becomes Consistent (data on the secondary LUN is a copy of data on
the primary LUN at a certain point in time in the past). Then I/Os are
processed as shown in the following figure.


Figure 6-12 Working principles of HyperReplication/A

Description:

1. When an asynchronous remote replication task is started, snapshots are generated for the primary and secondary LUNs and the snapshots' time points (TPs) are updated. (The primary snapshot TP is X and is updated to X+1. The secondary snapshot TP is Y and is updated to Y+1.)
2. New data from the host is stored in the primary LUN cache using TP X+1.
3. The host receives a write success.
4. Data at X is directly replicated to the secondary LUN at Y+1 based on the DCL.
5. The primary and secondary LUNs write the received data to disks. After the
synchronization is complete, the data at the latest TP Y+1 on the secondary
LUN is the data at the TP X on the primary LUN.
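A minimal sketch of one HyperReplication/A cycle, with dictionaries standing in for LUNs and a set for the DCL (names and data structures are illustrative assumptions only):

def replication_cycle(primary_lun: dict, secondary_lun: dict, dcl: set) -> None:
    # Take a snapshot of the primary LUN at TP X; new host writes then go to
    # TP X+1 and are tracked in a fresh DCL for the next cycle.
    snapshot_x = dict(primary_lun)
    changed = set(dcl)
    dcl.clear()
    # Replicate only the blocks changed since the previous cycle to the
    # secondary LUN (its new TP Y+1).
    for lba in sorted(changed):
        secondary_lun[lba] = snapshot_x.get(lba)
    # After the cycle, the secondary LUN at TP Y+1 matches the primary at TP X.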

6.4.1.3 Technical Highlights


● Load balancing
When a remote replication task is started for a LUN, the task is executed by
all controllers concurrently. The workload of the task is distributed to all
controllers based on the data layout, improving the replication bandwidth and
reducing the impact on front-end services.
● Data compression
In asynchronous remote replication, both Fibre Channel and IP links support
data compression by using the fast LZ4 algorithm, which can be enabled or
disabled as required. Data compression reduces the bandwidth required by
asynchronous remote replication. In the testing of an OLTP application with
100 Mbit/s bandwidth, data compression saves half of the bandwidth.
● Quick response to host requests
In asynchronous remote replication, after a host writes data to the primary
LUN at the primary site, the primary site immediately returns a write success
to the host before the data is written to the secondary LUN. In addition, data
is synchronized in the background, which does not affect access to the
primary LUN. HyperReplication/A does not synchronize incremental data from
the primary LUN to the secondary LUN in real time. Therefore, the amount of
data loss depends on the synchronization interval (ranging from 3 seconds to
1440 minutes; 30 seconds by default), which can be specified based on site
requirements.


● Splitting, switchover of primary and secondary LUNs, and rapid fault recovery
HyperReplication supports splitting, synchronization, primary/secondary
switchover, and recovery after disconnection. Disaster recovery tests and
service switchover are supported in various fault scenarios.
● Consistency group
Consistency groups apply to databases. Multiple LUNs, such as log LUNs and
data LUNs, can be added to a consistency group so that data on these LUNs is
from a consistent time in the case of periodic synchronization or fault. This
facilitates data recovery at the application layer.
● Support for fan-in/fan-out
HyperReplication supports data replication from 64 storage devices to one
storage device for central backup (64:1 fan-in/fan-out), greatly reducing the
disaster recovery cost.
● Support for various types of replication links
HyperReplication supports both Fibre Channel and IP replication links. In
synchronous replication scenarios, Fibre Channel (FastWrite) or RDMA is
recommended for replication links.
● Cross-array snapshot data synchronization
User snapshots of the primary LUN can be synchronized to the secondary LUN
in sequence. This function is recommended when host data consistency is
required.

6.5 HyperMetro (Active-Active Deployment)


HyperMetro, an array-level active-active technology provided by OceanStor
Dorado, allows two LUNs from separate storage systems to maintain real-time
data consistency and to be accessible to hosts.

HyperMetro supports both Fibre Channel and IP networking. The two storage systems in a HyperMetro deployment can be at two locations within 300 km of each other, such as in the same equipment room or in the same city. It is recommended that the quorum server be deployed at a third site.

If one storage system fails, hosts automatically choose the paths to the other
storage system for service access. If the replication links between the storage
systems fail, only one storage system can be accessed by hosts, which is
determined by the arbitration mechanism of HyperMetro.

6.5.1 HyperMetro for SAN

6.5.1.1 Read and Write Processes

Read Process
When a LUN receives a read request, the storage system reads its local cache and
returns the requested data to the application.


Write Process
When a LUN receives a write request, mutual exclusion is performed for parallel
access paths. After the write permission is obtained, the requested data is written
to the caches of both the local and remote LUNs of the HyperMetro pair. After the
write operation is complete at both ends, a write success is returned to the
application. Figure 6-13 illustrates the write process of HyperMetro.

Figure 6-13 SAN HyperMetro write process

1. The host delivers a write request.


2. Storage system A receives the request.
3. Storage system A uses the local optimistic lock to apply for write permission in the I/O address range corresponding to the HyperMetro pair. After the write permission is obtained, the HyperMetro pair records the request in a log. The log contains only the address information (no data content) and is kept in memory space with power failure protection for better reliability.
4. The system writes the request to the caches of both the local and remote
LUNs separately. When receiving a write request, the remote LUN also needs
to apply for the write permission in the I/O address range corresponding to
the HyperMetro pair. Data can be written to the cache only after the remote
LUN obtains the write permission.
5. The LUNs at both ends report the write results.
6. The HyperMetro pair returns a write success acknowledgement to the host.


Optimistic Lock
In the active-active storage systems, read and write operations are concurrently
performed at both sites. If hosts deliver read and write requests to the same LBA
of a LUN simultaneously, data at both sites must be consistent at the storage
layer.
In the conventional solution, a cross-site distributed lock service is required. When
a site receives a data write request, the site applies for a lock from the lock server.
If the lock service is at the peer site, data can be written to the two sites only after
a cross-site lock is obtained.
Because hosts in the upper-layer application cluster seldom send write requests to
the same LBA concurrently, HyperMetro uses the optimized optimistic lock. If no
lock conflict exists, HyperMetro directly initiates a write request to the peer
storage system. After the data is written, a lock is added to the local storage
system. When hosts concurrently access data at the same LBA, the storage system
can also convert concurrent access requests into serial queuing requests to ensure
data consistency between the two sites. In this solution, the cross-site lock server
is not required, which reduces the architecture complexity. This solution also
allows direct dual write operations when there is no conflict. It eliminates the
interaction with the lock server (or even the remote lock server) and improves
performance.
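A heavily simplified sketch of this optimistic-lock dual write, using a per-range lock on the local storage system only (the real feature works on LBA ranges with conflict queuing; the objects and names here are illustrative assumptions):

import threading

local_range_locks: dict = {}
lock_table_guard = threading.Lock()

def hypermetro_write(lba_range, data, local_lun, remote_lun) -> None:
    with lock_table_guard:
        range_lock = local_range_locks.setdefault(lba_range, threading.Lock())
    with range_lock:
        # No cross-site lock server is consulted: the write is sent to the
        # remote system directly, and conflicting writes to the same range
        # on this system simply serialize on the local lock.
        remote_lun.write(lba_range, data)
        local_lun.write(lba_range, data)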

Cross-Site Bad Block Repair


Disks may have bad blocks due to abnormalities such as power failure. If
repairable bad blocks fail to be repaired by the local end, HyperMetro
automatically obtains data from the remote end to repair them, which further
enhances the system reliability.

6.5.1.2 HyperMetro Consistency Group


HyperMetro provides and manages services by pair or consistency group.
A consistency group contains multiple HyperMetro pairs. It ensures data
consistency between multiple associated LUNs on the storage systems.
When you split or synchronize a consistency group, all HyperMetro pairs in the
group are split or synchronized at the same time. If a link fault occurs, all member
pairs are interrupted simultaneously. After the fault is rectified, data
synchronization is implemented for all pairs to ensure data availability.

6.5.2 HyperMetro Technical Features

6.5.2.1 Gateway-free Active-Active Solution


This section describes the gateway-free active-active solution implemented by
OceanStor Dorado 2000 and its characteristics.
The HyperMetro deployment groups two storage systems into a cross-site cluster
without any additional virtual gateway. This solution has a simplified architecture
and is well compatible with value-added storage features. It delivers the following
values to customers:


● Reduced gateway-related fault points and enhanced solution reliability


● Quicker I/O response (Latency caused by gateway forwarding is eliminated
because I/Os are not forwarded by gateways.)
● Superb compatibility with existing storage features. HyperMetro can work
with other Smart- and Hyper-series features on OceanStor Dorado 2000 to
deliver a wide range of data protection and DR solutions.
● Simplified network and easier maintenance

6.5.2.2 Parallel Access


This section describes the parallel access of the HyperMetro system of OceanStor
Dorado 2000.
HyperMetro delivers active-active service capabilities on two storage systems.
Data is synchronized in real time between the HyperMetro service objects on both
storage systems, and both storage systems process read and write I/Os from
application servers to provide non-differentiated parallel active-active access. If
either storage system fails, services are seamlessly switched to the other system
without interrupting service access.
In comparison with the active-passive mode, the active-active solution fully utilizes
computing resources, reduces communication between storage systems, and
shortens I/O paths, thereby ensuring better access performance and faster failover.
Figure 6-14 compares the active-passive and active-active solutions.

Figure 6-14 Active-passive and active-active storage architectures

6.5.2.3 Reliable Arbitration


If links between two HyperMetro storage systems are disconnected, real-time
mirroring will be unavailable to the storage systems and only one system can
continue providing services. To ensure data consistency, HyperMetro uses the
arbitration mechanism to determine which storage system will continue providing
services.


HyperMetro supports arbitration by pair or consistency group. If services running on multiple pairs are mutually dependent, you can add the pairs into a consistency group. After arbitration, only one storage system provides services.

Arbitration Modes
HyperMetro provides the following arbitration modes:
● Static priority mode
The static priority mode is used when no quorum server is deployed. You can
set either side of a HyperMetro pair or consistency group as the preferred site
and the other side the non-preferred site. If heartbeats between the storage
systems are interrupted, the preferred site wins the arbitration.
● Quorum server mode
In quorum server mode, when heartbeats between the storage systems are
lost, each of them sends an arbitration request to the quorum server, and only
the winner continues providing services. You can set one site as the preferred
site, which takes precedence in arbitration.

Automatic Switch of Arbitration Modes


If all quorum servers fail or their links to storage systems fail but the heartbeats
between the storage systems are normal, the system automatically switches to the
static priority mode.
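A simplified decision sketch of the two arbitration modes and the automatic switch described above (inputs are illustrative booleans, not actual system interfaces):

def keep_serving(heartbeat_ok: bool, quorum_mode: bool, quorum_reachable: bool,
                 is_preferred_site: bool, won_quorum_arbitration: bool) -> bool:
    """Return True if this storage system should continue providing services."""
    if heartbeat_ok:
        return True                         # replication links are fine, both sites serve
    if quorum_mode and quorum_reachable:
        return won_quorum_arbitration       # quorum server mode
    return is_preferred_site                # static priority mode (or quorum unavailable)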

6.5.2.4 Strong Scalability


This section describes the strong scalability of HyperMetro.
HyperMetro can work with other Smart- and Hyper-series features on OceanStor
Dorado to provide various data protection and DR solutions.
● Online expansion to active-active storage systems
HyperMetro is paired with Huawei UltraPath multipathing software that
supports LUN aggregation and shields physical differences at the storage
layer. When users want to expand a single storage system to active-active
storage systems, UltraPath can seamlessly take over the new storage system
and HyperMetro member LUNs for online expansion.
● Compatibility with existing features
HyperMetro can be used together with existing features such as
HyperReplication, HyperSnap, HyperClone, SmartThin, and SmartCompression.

6.5.2.5 High Performance


A series of optimization designs are used to enhance HyperMetro performance.

FastWrite
HyperMetro uses FastWrite to optimize data transmission over the replication links
between storage systems. With SCSI's First Burst Enabled function, the data
transmission interactions involved in a data write process are reduced by half. In a
standard SCSI process, transmission of a write I/O undergoes multiple interactions between two ends, such as write command delivery, write allocation completion,
data write, and write execution status return. FastWrite optimizes the write I/O
interaction process by combining command delivery and data transfer and
canceling write completion acknowledgement. This reduces the interactions
involved in writing data across sites by half. Figure 6-15 shows the FastWrite
design.

Figure 6-15 Transmission protocol optimization
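As a rough illustration, assuming an RTT of about 1 ms over a 100 km link, a standard SCSI write that needs two cross-site round trips adds about 2 ms of transmission latency, while a FastWrite write that needs only one round trip adds about 1 ms.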

Optimized Cross-Site Access


For active-active services, the distance between two sites is essential to I/O access
performance. By working with OceanStor UltraPath, HyperMetro for SAN provides
two I/O access policies for users.
● Load balancing mode
This mode is mainly used when HyperMetro storage systems are deployed in
the same DC. In these scenarios, both storage systems deliver almost the
same access performance to a host. To maximize resource usage, host I/Os
are delivered in slices to both storage systems.
● Preferred storage system mode
This mode is mainly used when the HyperMetro storage systems are deployed
in different DCs over distance. In these scenarios, cross-DC access will increase
the latency. If the link distance between the DCs is 100 km, the round-trip
time (RTT) is approximately 1 ms. Reducing cross-DC communication
improves I/O performance.

Optimal Path
You are advised to use a fully interconnected network for Fibre Channel and IP
replication links between OceanStor Dorado 2000 storage systems. That is, each
controller pair of the two storage systems has a direct logical link with another
controller pair.
For read and write I/O requests sent between storage systems, a storage system
searches for the optimal path to the peer storage system based on the balancing
policy of the service object, and preferentially sends I/O requests over a link that
directly connects to the owning node on the peer storage system. This reduces the cross-controller forwarding latency within storage nodes and improves end-to-end performance.

Figure 6-16 Fully interconnected network between storage systems

6.6 Geo-Redundancy (Multi-DC)


This section describes the multi-DC networks supported by OceanStor Dorado
2000 and their features.

6.6.1 3DC (Geo-Redundancy)


The 3DC solution involves a production center, an intra-city DR center, and a
remote DR center. Data of the production center is synchronously replicated to the
intra-city DR center, and is asynchronously replicated to the remote DR center. The
intra-city DR center has the same service processing capability as the production
center. Applications can be switched to the intra-city DR center without any data
loss, achieving zero RPO and second-level RTO. If a low-probability large-scale
disaster, such as an earthquake, occurs and causes both the production center and
intra-city DR center to be unavailable, applications can be switched to the remote
DR center. Based on routine DR drills, applications can be recovered in the remote
DR center within the tolerable time to ensure service continuity and RPO within
seconds.

Compared with the solutions where either only an intra-city DR center or a remote
DR center is deployed, the geo-redundant 3DC DR solution can cope with larger-
scale disasters by combining their advantages. In this way, the DR system can
efficiently respond to both small-scale regional disasters and large-scale natural
disasters to prevent loss of service data as far as possible and provide superior
RPO and RTO. Therefore, the geo-redundant 3DC solution has been widely used.

Flexible networks of 3DC are supported using HyperMetro, HyperReplication/S (synchronous remote replication), and HyperReplication/A (asynchronous remote replication), including:

● Cascading network topology with HyperMetro and HyperReplication/A


● Parallel network topology with HyperMetro and HyperReplication/A
● Ring network topology (DR star) with HyperMetro and HyperReplication/A
● Cascading network topology with HyperReplication/S and HyperReplication/A
● Parallel network topology with HyperReplication/S and HyperReplication/A
● Ring network topology (DR Star) with HyperReplication/S and
HyperReplication/A


● Cascading network topology with only HyperReplication/A
● Parallel network topology with only HyperReplication/A
Figure 6-17 shows the cascading, parallel, and ring networking topologies.

Figure 6-17 3DC network topologies

The 3DC solution features cost effectiveness, elastic scalability, robust reliability,
high efficiency, and enhanced security.
● Interoperability among entry-level, mid-range, and high-end storage systems
Huawei storage systems of different performance levels can be used in one
DR solution via remote replication. Customers can choose proper storage
systems based on project conditions, reducing more than 50% of investment
in DR construction. However, the two storage systems configured with HyperMetro for active-active protection must be of the same model.
● Elastic scalability
The active-active DC solution can be upgraded online to a geo-redundant 3DC solution for higher reliability.


● Diverse networking modes


Multiple networking modes, such as cascading and parallel topologies (with
HyperReplication/A and HyperMetro, HyperReplication/A and
HyperReplication/S, and HyperReplication/A and HyperReplication/A) and ring
topologies (with HyperReplication/A and HyperMetro, and HyperReplication/A
and HyperReplication/S), are supported to build a DR system that is most
suitable for the production environment and improve the cost-effectiveness of
the DR system.
● Consistency protection
The DR system provides application-based data consistency protection,
strengthening application reliability.
● Visualization
DR can be displayed in a topology and end-to-end real-time monitoring is
supported, greatly simplifying maintenance.
● One-click operations
The solution supports one-click testing, switchover, and recovery, and
automated scripts of service clusters are used to replace manual operations,
remarkably improving the work efficiency and DR success rate of the DR
system.
● Security
User authentication, encrypted data transfer, and service isolation between
storage systems ensure security of the entire DR system.

6.6.2 4DC (Geo-Redundancy)


This section describes Huawei SAN 4DC networking and features.
The Huawei geo-redundant 4DC DR solution includes one local production center,
one intra-city production center, one remote DR center, and one remote intra-city
DR center. The two active-active storage systems in the production centers provide
the SAN active-active function. Data in the production centers is periodically and
asynchronously replicated to the remote DR center. The remote DR center mirrors
data to the remote intra-city DR center in real time. The intra-city production
center has the same service processing capability as the local production center.
Applications can be switched to the intra-city production center without any data
loss to ensure service continuity. In case of low-probability large-scale disasters,
such as earthquakes, which cause both the intra-city DR center and production
center to be unavailable, applications can be failed over to the remote DR center
and remote intra-city DR center to ensure service continuity and high protection
level. Based on routine DR drills, applications can be recovered in the remote DR
center and remote intra-city DR center within the tolerable time limit to ensure
service continuity. However, a small amount of data may be lost during remote
recovery. If a disaster occurs in a remote DC, the remaining remote DC can still
provide services.

Compared with the solutions that include only one intra-city DR center or include
one intra-city DR center and one remote DR center, the geo-redundant 4DC DR
solution combines their advantages to continue providing services even if three
DCs in two cities fail and to address disasters that affect broader areas. If a
disaster that affects a small area or a natural disaster that affects a large area
occurs, the solution responds quickly to prevent data loss to the maximum extent


and achieve smaller recovery point objective (RPO) and recovery time objective
(RTO).

OceanStor supports the 4DC solution consisting of HyperMetro, HyperReplication/A, and HyperMetro.

The process of handling write I/O requests for the HyperMetro + HyperReplication/A + HyperMetro solution is as follows:
1. The production host delivers a write request to the HyperMetro LUN.
2. The LUN writes data to the HyperMetro data LUNs in both local and intra-city
production centers. A write success message is returned to the host.
3. When the synchronization period of the asynchronous replication from the
intra-city production center to the remote DR center starts, storage system B
generates a snapshot of LUN 1 at the point in time (such as t1) and notifies
storage system C in the remote DR center of generating a snapshot of LUN 2
at the point in time (such as t2). In the background, the differential data
between storage system B's LUN 1 at t1 and storage system C's LUN 2 at t2 is
periodically synchronized. Data on LUN 2 of storage system C in the remote
DR center is mirrored to LUN 2 of storage system D in the remote intra-city
DR center in real time.
4. If data synchronization fails for the asynchronous replication and data on
LUN2 in storage system C in the remote DR center must be used to run
services, set LUN2 on storage system C to the writable state (this requires that
the HyperMetro relationship between remote DCs C and D be suspended and
storage system C take over services). After LUN2 on storage system C is set to
the writable state, the system starts a task in the background to roll back
data to t2 to ensure data availability on storage system C. After the rollback
task is complete, start data synchronization for the HyperMetro pair between
storage systems C and D. After the data synchronization is complete, the
HyperMetro pair works in active-active mode. (If the storage system in a DC is faulty, the remaining DC can take over services. Before the standby
HyperMetro pair between storage systems C and D works in active-active
mode, storage system D cannot provide services independently.)

The 4DC DR solution features cost effectiveness, elastic scalability, robust


reliability, and enhanced security.

● Multi-DC and multi-copy


Multiple DCs are built and four copies are created to provide multi-region
multi-copy DR protection, further improving the data protection level.
● Elastic scalability
An active-active or geo-redundant 3DC DR architecture can be smoothly
upgraded to the geo-redundant 4DC DR architecture without interrupting
production services.
● Multi-level protection
The geo-redundant 4DC DR architecture ensures that remote applications can
still run in HA mode for a long time if both DCs in a city are faulty. The local
production center and remote DR center can back up each other, achieving
two service centers in two cities, service continuity, and high reliability.
● Consistency protection
The DR system provides application-based data consistency protection,
strengthening application reliability.
● Security
User authentication, encrypted data transfer, and service isolation between
storage systems ensure security of the entire DR system.

6.7 HyperEncryption (Array Encryption)


OceanStor Dorado 2000 supports data encryption. The storage system implements
array encryption to ensure data security.

Internal Key Manager


The internal key manager is the storage system's built-in key management system.
It generates, updates, backs up, restores, and destroys keys, and provides
hierarchical key protection. The internal key manager is easy to deploy, configure,
and manage. You are advised to use the internal key manager if security
certification by the cryptographic module is not required and the key management
system is only used by the storage systems in a data center.

External Key Manager


The standard KMIP+TLS protocol is used to support interconnection with the
external key manager. The external key manager is recommended if FIPS 140-2
certification is required or multiple systems in a data center require centralized key
management. The external key manager supports key generation, update,
destruction, backup, and restoration. Two external key managers can be deployed,
which synchronize keys in real time for enhanced reliability.


Array Encryption Principles


Array encryption of the storage system uses the built-in encryption engine of the
controller processor to implement encryption and decryption. The independent
built-in encryption engine leverages the encryption algorithm of Arm hardware to
offload encryption workloads. The encryption and decryption algorithms of the
storage system are offloaded to the hardware for execution, without involving
software. During data encryption on OceanStor Dorado 2000, the block device
management subsystem generates a data encryption key (DEK) on each disk, and
the key manager provides an authentication key (AK). The AK is used to encrypt
the DEK. After service I/Os are delivered, encryption and decryption are offloaded
to the built-in encryption engine for execution. The encryption engine supports the
AES-256-XTS and SM4-128-XTS (only for the Chinese mainland) algorithms. The
algorithm used by the key manager must match that used by the encryption
engine. The following figure shows the topology with an internal key manager (as
an example):


Figure 6-18 Built-in encryption engine

● Data encryption: After the built-in encryption engine is enabled, the block
device management subsystem uses the built-in encryption engine to encrypt
and decrypt data with the DEK during data writes and reads.
When the storage system receives a write request, the built-in encryption
engine encrypts the plaintext data, and the block device management
subsystem then writes the encrypted data into storage media.
When the storage system receives a read request, the block device
management subsystem reads the encrypted data, which is then decrypted by
the built-in encryption engine into plaintext.
● Data destruction: When the external key manager is used, data can be
destroyed by destroying corresponding keys.


● AK update: AKs must be updated regularly to prevent cracking or leakage. The system re-encrypts and stores the DEKs based on the new AK (AK2) delivered by the key manager. The update can be performed periodically (every year) or manually.
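A minimal sketch of the AK/DEK hierarchy described above, using the Python cryptography package's AES key wrap. It only illustrates the concept of an AK protecting a per-disk DEK and is not the product's key manager implementation:

import os
from cryptography.hazmat.primitives.keywrap import aes_key_wrap, aes_key_unwrap

ak = os.urandom(32)                   # authentication key provided by the key manager
dek = os.urandom(32)                  # per-disk data encryption key

wrapped_dek = aes_key_wrap(ak, dek)   # only the wrapped (encrypted) DEK is persisted

# After an AK update, only the DEK needs to be re-wrapped with the new AK;
# the user data encrypted with the DEK stays as it is.
new_ak = os.urandom(32)
rewrapped_dek = aes_key_wrap(new_ak, aes_key_unwrap(ak, wrapped_dek))
assert aes_key_unwrap(new_ak, rewrapped_dek) == dek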


7 System-level Reliability Design

OceanStor Dorado 2000 provides up to 99.9999% availability via data and service
availability designs. Powered by the DR solution, the system availability can reach
99.99999%.
7.1 Data Reliability
7.2 Service Availability

7.1 Data Reliability


Data written by hosts to OceanStor Dorado 2000 goes through three processes: caching, persistence on disks, and transmission along I/O paths. The following describes the data reliability measures for each of these processes.


Figure 7-1 Data reliability overview

7.1.1 Cache Data Reliability


To improve the speed of writing data, OceanStor Dorado 2000 provides the write
cache mechanism. That is, after data is written to the memory cache of the
controller and mirrored to the peer controller, a success message is returned to the
host and then cache data is destaged to disks in the background.

User data stored in the controller memory may be lost if the system is powered
off or the controller is faulty. To prevent the data loss, the system provides
multiple cache copies across controllers and power failure protection to ensure
data reliability.

7.1.1.1 Multiple Cache Copies


The mid-range devices support two copies, ensuring that write cache data is not
lost and services are not interrupted in the event of a controller failure. As shown
in the following figure, write data is mirrored to controller B when being cached to
controller A. This ensures that data is not lost if either controller A or controller B
fails.


Figure 7-2 Multiple cache copies

7.1.1.2 Power Failure Protection


OceanStor Dorado 2000 has built-in BBUs. If a power outage occurs in the system,
BBUs in controllers provide extra power for moving the cache data in the memory
to the coffer. After the power supply is recovered, the storage systems restore the
cache data in the coffer to the memory during startup to prevent data loss.
The process of moving cache data to the coffer upon a power failure is implemented by the underlying system and does not rely on upper-layer software. Therefore, it is not affected by services, further improving user data reliability.

7.1.2 Persistent Data Reliability


OceanStor Dorado 2000 uses the intra-disk RAID technology to ensure disk-level data reliability and prevent data loss. The RAID 2.0+ technology and dynamic reconstruction ensure system-level data reliability: as long as the number of faulty disks does not exceed the number of redundant disks, no data is lost and the redundancy level does not decrease.

7.1.2.1 Intra-disk RAID


In addition to complete disk faults, regional damage may occur on the chips used for storing data. This is called a silent failure (bad block). Bad blocks do not cause the entire disk to fail, but they cause data access failures on the disk. Routine bad block scanning can detect silently corrupted data in advance and repair it. However, disk scanning consumes a large amount of resources, so the scanning speed must be throttled to avoid affecting foreground services. As a result, when the disk capacity and quantity are large, it can take weeks or even months to scan all disks. In addition, if a bad block and a disk failure both occur in the interval between two scans, the data may fail to be recovered.
Based on bad block scanning, OceanStor Dorado 2000 uses Huawei SSDs (HSSDs) to provide the intra-disk RAID feature as an additional safeguard against silent failures. Specifically, RAID 4 groups are created for data on an SSD at the granularity of dies to implement redundancy, so the failure of a single die does not cause any data loss on the SSD.

Figure 7-3 Intra-disk RAID on an SSD

7.1.2.2 RAID 2.0+


In a conventional RAID storage system that builds RAID groups from fixed physical disks, the LUNs or file systems used by users are carved out of the RAID groups. Because each LUN or file system is accessed at a different frequency, disks in some RAID groups are busy and become hotspots, while disks in other RAID groups cannot share the workload even if they are idle. In addition, a disk that works longer than others sees its failure rate rise sharply and may fail sooner than other disks. Therefore, hot disks in conventional RAID storage systems are at risk of being overloaded.
With RAID 2.0+, OceanStor Dorado 2000 divides each SSD into fixed-size chunks
(CKs, generally 4 MB) and forms chunk groups (CKGs) using these CKs based on
the configured RAID level. RAID 2.0+ has the following advantages over traditional
RAID:
● Balanced service loads for zero hotspots. Data is evenly distributed to all disks in a storage resource pool, eliminating hotspot disks and lowering the disk failure rate.
● Quick reconstruction for a lowered data loss risk. If a disk is faulty, its data
will be reconstructed to all the other disks in the resource pool. This many-to-
many reconstruction is rapidly implemented, significantly shortening the non-
redundancy period of data.
● All member disks participate in reconstruction. All member disks in a storage
resource pool participate in reconstruction, so each disk only needs to
reconstruct a small amount of data. The reconstruction process does not
affect upper-layer applications.
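A minimal sketch of this space organization, with disks cut into fixed-size chunks (CKs) and chunk groups (CKGs) formed from CKs on different disks (sizes, names, and the selection logic are illustrative only):

CHUNK_MB = 4

def carve_chunks(disk_count: int, disk_capacity_mb: int) -> list:
    """Return (disk_id, chunk_id) tuples for every CK in the storage pool."""
    chunks_per_disk = disk_capacity_mb // CHUNK_MB
    return [(d, c) for d in range(disk_count) for c in range(chunks_per_disk)]

def form_ckg(free_chunks: list, width: int) -> list:
    """Pick one free CK from each of `width` different disks (e.g. N+M for the RAID level)."""
    ckg, used_disks = [], set()
    for disk_id, chunk_id in free_chunks:
        if disk_id not in used_disks:
            ckg.append((disk_id, chunk_id))
            used_disks.add(disk_id)
            if len(ckg) == width:
                break
    return ckg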

7.1.2.2.1 Disk-level redundancy RAID

7.1.2.3 Dynamic Reconstruction


If the number of available disks in a storage pool is fewer than N+M (due to consecutive disk failures or disk replacement), the reconstruction cannot be performed and user data redundancy cannot be ensured. To cope with the
preceding problems, OceanStor Dorado 2000 uses dynamic reconstruction by
reducing the number of data columns. If the total number of available disks in a
storage pool is less than the number of RAID member disks, the system retains the
number of parity columns (M) and reduces the number of data columns (N)
during reconstruction. After the reconstruction is complete, the number of
member disks in the RAID group decreases, but the RAID redundancy level
remains unchanged.
After the faulty disks are replaced, the system increases the number of data
columns (N) based on the number of available disks in the storage pool, and new
data will be written to the new N+M columns. Data that has been written during
the fault will also be converted into the new N+M columns.
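For example, assume a pool whose chunk groups are built as RAID 6 with 8 data columns and 2 parity columns (8+2). If disk failures leave only 9 usable disks, reconstruction can proceed by writing 7+2 chunk groups: one data column fewer, but still two parity columns, so the pool keeps tolerating two simultaneous disk failures. Once the faulty disks are replaced, new data is written as 8+2 again and the 7+2 data is converted back.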

NOTE

Dynamic reconstruction reduces the total available capacity of the system. If multiple disks
are faulty, handle the disk faults in time and pay attention to the storage pool usage.

7.1.3 Data Reliability on I/O Paths


During data transmission within a storage system, data passes through multiple
components over various channels and undergoes complex software processing.
Any problem during this process may cause data errors. If such errors cannot be
detected immediately, error data can be written to persistent disks, calculated
internally, or returned to the host, causing service exceptions.
To resolve the preceding problems, OceanStor Dorado 2000 uses the end-to-end
Protection Information (PI) function to detect and correct data errors (internal
changes to data) on the transmission path. The matrix verification function
ensures that changes to the whole data block (the whole data block is overwritten
by old data or other data) can be detected. The preceding measures ensure data
reliability on I/O paths.

7.1.3.1 End-to-end PI
OceanStor Dorado 2000 supports ANSI T10 PI. Upon reception of data from a
host, the storage system inserts an 8-byte PI field to every 512 bytes of data
before performing internal processing.
After data is written to disks, the disks verify the PI fields of the data to detect any
change to the data between reception and flushing to the disks. In the following
figure, the green block indicates that a PI is inserted to the data. The blue blocks
indicate that a PI is calculated for the 512-byte data and compared with the saved
PI to verify data correctness.


Figure 7-4 End-to-end PI

When the host reads data, the disks verify the data to prevent changes to the
data. If any error occurs, the disks notify the upper-layer controller software, which
then recovers the data by using RAID. To prevent errors on the path between the
disks and the front end of the storage system, the storage system verifies the data
again before returning it to the host. If any error occurs, the storage system
recovers the data using RAID to ensure end-to-end data reliability from the front
end to the back end.
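A minimal sketch of per-sector protection information: an 8-byte PI appended to each 512-byte sector, with a 2-byte guard CRC over the sector. The CRC-16 polynomial used below (0x8BB7) is the one commonly associated with T10 DIF; the layout and code are illustrative, not the storage system's exact on-media format:

def crc16_t10dif(data: bytes) -> int:
    # Bitwise CRC-16 with polynomial 0x8BB7, initial value 0, no reflection.
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x8BB7) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

def add_pi(sector: bytes, ref_tag: int) -> bytes:
    assert len(sector) == 512
    guard = crc16_t10dif(sector)
    # 2-byte guard (CRC) + 2-byte application tag (zero here) + 4-byte reference tag
    return sector + guard.to_bytes(2, "big") + b"\x00\x00" + ref_tag.to_bytes(4, "big")

def verify_pi(protected: bytes) -> bool:
    sector, pi = protected[:512], protected[512:]
    return crc16_t10dif(sector) == int.from_bytes(pi[:2], "big")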

7.1.3.2 Matrix Verification


Because the internal structure of disks is complex or the read path is long
(involving multiple hardware components), various errors may occur due to
software defects. For example, a write success is returned but the data fails to be
written to disks; data B is returned when data A is read (read offset); or data that
should be written to address A is actually written to address B (write offset). Once
such errors occur, the PI check of the data is passed. If the data is still used, the
incorrect data (such as old data) may be returned to the host.

Figure 7-5 Matrix verification


OceanStor Dorado 2000 provides matrix verification to cope with the write failure,
read offset, and write offset that may occur on disks. In the preceding figure, each
piece of data consists of 512-byte user data and 8-byte PI. Two bytes of the PI are
used for cyclic redundancy check (CRC) to ensure reliability of the 512-byte data
horizontally (protection point 1). The CRC bytes in 16 PI sectors are extracted to
calculate the checksum, which is then saved in a metadata node. If offset occurs in
a single or multiple pieces of data (512+8), the checksum of the 16 pieces of data
is also changed and becomes inconsistent with that saved in the metadata. This
ensures data reliability vertically. After detecting data damage, the storage system
uses RAID redundancy to recover the data. This is matrix verification.
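A minimal sketch of the matrix idea above: each sector already carries its own horizontal CRC in the PI, and a vertical checksum over the CRCs of 16 sectors is stored in metadata, so a lost write or an offset read/write changes the vertical checksum even though every individual sector still looks self-consistent. The checksum below is illustrative only:

def vertical_checksum(sector_crcs: list) -> int:
    # Combine the 2-byte CRCs of the 16 sectors in a grain into one value.
    assert len(sector_crcs) == 16
    return sum(sector_crcs) & 0xFFFF

def matrix_check(sector_crcs: list, stored_checksum: int) -> bool:
    # A mismatch means the data was silently replaced or offset; the storage
    # system then recovers the grain from RAID redundancy.
    return vertical_checksum(sector_crcs) == stored_checksum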

7.2 Service Availability


The storage system provides multiple redundancy protection mechanisms for the
entire path from the host to the storage system. That is, when a single point of
failure occurs on the interface module or link (1), controller (2), and storage
media (3) that I/Os pass through, redundant components and fault tolerance
measures can be used to ensure that services are not interrupted. In an active-
active scenario, the peer storage system can take over services even if a single
storage system fails (4) without interrupting host services.

Figure 7-6 System multiple redundancy protection design

7.2.1 Interface Module and Link Redundancy


OceanStor Dorado 2000 supports full redundancy. Interface module and link
redundancy is provided for the front end for interconnection with the host, the
back end for connecting disks, and the communication between controllers. In the
event of a controller fault or replacement, the other controller takes over services.
New I/Os are delivered to the takeover controller to ensure service continuity.


7.2.2 Controller Redundancy


OceanStor Dorado 2000 provides redundant controllers to ensure reliability. In
typical scenarios, cache data is stored on the current controller and the copy of
cache data is stored on another controller. If a controller fails, services can be
switched to the controller to which the cache data copy belongs, ensuring service
continuity.

7.2.3 Storage Media Redundancy


OceanStor Dorado 2000 not only ensures high reliability of a single disk, but also
uses the multi-disk redundancy capability to ensure service availability if a single
disk is faulty. That is, disk faults or sub-health is detected in a timely manner by
using algorithms, and faulty disks are isolated in a timely manner to avoid long-
term impact on services. Then, data of the faulty disk is recovered by using the
redundancy technology. In this case, services can be continuously provided.

7.2.3.1 Fast Isolation of Disk Faults


When disks are running properly, OceanStor Dorado 2000 monitors the in-position and reset signals. If a disk is removed or faulty, the storage system isolates it. New I/Os are written to other disks, and read requests are served by reconstructing the data with RAID so that a read success is still returned to the host.

In addition, continuous long-term operation wears disks and increases the chance of particle failures. As a result, disks respond more slowly to I/Os, which can affect services. Therefore, slow disks are detected and isolated in a timely manner so that they cannot further affect services.

A model that compares the average I/O service time of disks is built for OceanStor
Dorado 2000 based on common features of disks, including the disk type, interface
type, and owning disk domain. With this model, slow disks can be detected and
isolated within a short period of time, shortening the time when host services are
affected by slow disks.
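A minimal sketch in the spirit of that model: compare each disk's average I/O service time with the median of its peers in the same disk domain and flag clear outliers (the threshold and interfaces are illustrative only):

from statistics import median

def find_slow_disks(avg_service_time_ms: dict, outlier_ratio: float = 3.0) -> list:
    """avg_service_time_ms maps disk_id -> average I/O service time in milliseconds."""
    baseline = median(avg_service_time_ms.values())
    return [disk_id for disk_id, t in avg_service_time_ms.items()
            if t > outlier_ratio * baseline]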

7.2.3.2 Disk Redundancy


OceanStor Dorado 2000 supports multiple RAID configuration modes, which
ensure service continuity in the event of disk failures. The storage systems can
tolerate the simultaneous failure of three disks in the storage pool at most,
ensuring zero data loss without service interruption.

● RAID 5 uses the EC-1 algorithm and generates one copy of parity data for
each stripe. The failure of one disk is allowed.
● RAID 6 uses the EC-2 algorithm and generates two copies of parity data for
each stripe. The simultaneous failure of two disks is allowed.
● RAID-TP uses the EC-3 algorithm and generates three copies of parity data for
each stripe. The simultaneous failure of three disks is allowed.

7.2.4 Array-level Redundancy


In addition to providing intra-array high availability protection for services
requiring high reliability, OceanStor Dorado 2000 provides array-level (site) active-active protection to ensure service continuity in case of a power failure or a disaster such as an earthquake or fire.
HyperMetro, an array-level active-active technology provided by OceanStor
Dorado 2000, supports both Fibre Channel and IP networking. The two storage
systems in a HyperMetro deployment can be in the same equipment room, in the
same city, or in two locations 300 km away from each other.
Two LUNs from two storage arrays maintain real-time data consistency and are
accessible to hosts. If one storage array fails, hosts will automatically choose the
path to the other storage array for service access. If only one storage array can be
accessed by hosts due to failures of the links between storage arrays, the
arbitration mechanism determines which storage array continues to provide
services. The quorum server is deployed at a third place to determine which
storage array continues to provide services when the link between two storage
arrays is disconnected.


8 System Performance Design

OceanStor Dorado 2000 uses a brand-new hardware design and optimizes the I/O path across multipathing software, front-end and back-end networks, and CPUs to provide high IOPS and low latency for customers. Table 8-1 describes the key performance designs along the I/O delivery process from the host to SSDs and the problems and pain points they address.

Table 8-1 OceanStor Dorado key performance design (columns: I/O process, challenge, key design point, performance design principles)

● Host path selection
Challenge: SAN: after a read/write request is delivered to a controller, forwarding the request again increases the CPU overhead and latency.
Key design point: Global load balancing
Principles: The path selection mode of UltraPath is changed from the load balancing mode to a mode in which UltraPath negotiates with controllers and delivers I/Os to the controllers that eventually process them, achieving global load balancing.

● Front end
Challenge: The native Ethernet protocol has multiple layers, resulting in high latency.
Key design point: Direct TCP/IP Offloading Engine (DTOE), a technology optimized by Huawei
Principles: I/Os bypass the kernel mode to reduce cross-mode overheads. Protocols are offloaded to hardware, reducing CPU usage.

● Front end
Challenge: The system scheduling latency needs to be reduced.
Key design point: Round robin scheduling driven by Fibre Channel, iSCSI, and switching networks
Principles: The working thread periodically checks the receiving queue of the interface module in polling mode to reduce the latency caused by waking up the working thread upon a request. Load balancing of front-end, mirroring, and back-end networks is supported, fully utilizing CPU capabilities.

● Controller
Challenge: Making full use of the computing capability of multi-core CPUs
Key design point: Intelligent multi-core technology
Principles: I/Os are distributed among CPUs by CPU group to reduce the latency of cross-CPU scheduling. A CPU is divided into different zones based on services to reduce service interference. The service partition is designed to be lock-free to reduce lock conflicts.

● Controller
Challenge: Load imbalance between CPU groups in different scenarios
Key design point: Global load balancing
Principles: Tasks with high-density computing overheads are scheduled in load balancing mode among service groups in a CPU.

● Back end
Challenge: Write amplification causes short SSD lifetime and low performance.
Key design points: Multistreaming; ROW full-stripe write
Principles: Hot and cold data is separated to reduce the write penalty within disks. The ROW full-stripe write design reduces random write amplification.

● Back end
Challenge: The background erase and write operations of SSDs affect the foreground read latency.
Key design point: Read first on SSDs
Principles: Collaboration of SSD hardware and software improves the read I/O priority and reduces the read latency.

Issue 01 (2023-08-01) Copyright © Huawei Technologies Co., Ltd. 81


OceanStor Dorado 2000 6.1.6 Technical White Paper
OceanStor Dorado 2000 6.1.6 Technical White Paper 8 System Performance Design

8.1 Front-end Network Optimization


8.2 CPU Computing Optimization
8.3 Back-end Network Optimization

8.1 Front-end Network Optimization


Front-end network optimization mainly refers to the optimization of latency
between applications and storage devices, including the optimization of the path
selection algorithm of the multipathing software on the server side, protocol
offloading optimization in iSCSI scenarios, and scheduling optimization in
common scenarios.

Protocol Offloading
For iSCSI over standard NICs, the performance bottleneck lies in the long I/O path, because the overhead of TCP and IP processing is extremely high. Huawei uses a highly optimized user-mode iSCSI protocol stack and DTOE to offload TCP and IP, as shown in Figure 8-1.

Figure 8-1 DTOE technology

If controllers use traditional NICs, the network protocol stacks processed by the controllers have deep layers. As a result, an interrupt is triggered every time a data packet is processed, causing high CPU overhead.
With the TOE technology, NICs offload the TCP and IP protocols, and an interrupt is triggered only after a complete piece of data has been processed, which significantly reduces the interrupt overhead. However, some drivers still run in kernel mode, so latency is still incurred by system calls and thread switchover between user mode and kernel mode.
The DTOE technology adopted by OceanStor Dorado 2000 offloads iSCSI TCP data
paths to NICs. The transport layer processing, including the DIF check function, is
offloaded to the NIC microcode, eliminating the CPU overhead. In addition, the
working thread in the system checks the receiving queue of the interface module
periodically in polling mode. If there is a request, the thread processes the request immediately. This reduces the latency overhead caused by waking up the working thread after a request is received.
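The polling model can be pictured with a small sketch. The following Python example is illustrative only; the queue and thread names are assumptions, and a real implementation runs inside the storage operating system rather than in Python.

    # Illustrative polling loop: a worker thread polls the interface module's
    # receive queue instead of sleeping and being woken by an interrupt,
    # removing the wake-up latency from the I/O path.

    import queue
    import threading
    import time

    rx_queue: "queue.Queue[str]" = queue.Queue()

    def handle(request: str) -> None:
        print(f"processed {request}")

    def polling_worker(stop: threading.Event) -> None:
        while not stop.is_set():
            try:
                request = rx_queue.get_nowait()   # poll, never block
            except queue.Empty:
                continue                          # check again immediately
            handle(request)

    stop = threading.Event()
    worker = threading.Thread(target=polling_worker, args=(stop,), daemon=True)
    worker.start()
    rx_queue.put("iscsi-read-4k")
    time.sleep(0.1)
    stop.set()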

8.2 CPU Computing Optimization


Intelligent Multi-core Technology
OceanStor Dorado 2000 uses high-performance processors. Each controller
contains more CPUs and cores than any other controller in the industry.
For the symmetric multiprocessor (SMP), the biggest challenge is to keep the
system performance growing linearly as the number of CPUs increases. The SMP
system has the following two key problems:
1. The more CPUs there are, the higher the inter-CPU communication overhead and the more memory accesses cross CPUs.
2. The more cores there are, the more likely conflicts caused by program mutual exclusion become, and the longer it takes to handle them.
OceanStor Dorado 2000 uses the intelligent multi-core technology to allow
performance to increase linearly with the number of CPUs. The key optimization
technologies for the key problems are as follows:
1. The CPU grouping and distribution technologies are used among CPUs. Each
I/O is scheduled within one CPU during the process of being delivered to the
controllers by the multipathing software and arriving at the back-end disk
enclosures. In addition, memory is allocated on the current memory channel,
minimizing the communication overhead between CPUs.
2. CPU cores within a CPU are grouped based on service attributes. The front-
end, back-end, and node interconnection networks are scheduled in separate
CPU core groups. The I/O stack for processing a task is scheduled only within
one CPU core group, which effectively controls the conflict impact and
processing overhead, improves the scheduling efficiency, and accelerates the
processing of the I/O stack.
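The grouping idea can be sketched as follows. The Python example is illustrative only; the number of CPU groups, the core partition, and the hashing rule are assumptions made for the example, not the product's scheduler.

    # Illustrative sketch: each I/O is pinned to one CPU group so that its
    # whole stack runs on that CPU, and cores inside a CPU are partitioned by
    # service type so that different services do not interfere.

    CPU_GROUPS = 2                       # example: two CPUs per controller
    CORE_GROUPS = {                      # example partition of one CPU's cores
        "front_end": range(0, 8),
        "back_end": range(8, 16),
        "mirroring": range(16, 24),
    }

    def cpu_group_for(lun_id: int, lba: int) -> int:
        """Pin an I/O to one CPU group to avoid cross-CPU scheduling."""
        return hash((lun_id, lba // (1 << 20))) % CPU_GROUPS

    def core_group_for(service: str) -> range:
        """Each service type runs only in its own core group."""
        return CORE_GROUPS[service]

    print(cpu_group_for(lun_id=7, lba=0x1234567))
    print(list(core_group_for("front_end")))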

Dynamic Load Balancing


The traditional CPU grouping technology can solve the conflict domain problem of each service, but it also brings the problem of unbalanced resource usage between CPU groups in different scenarios. The dynamic load balancing technology of
OceanStor Dorado 2000 defines differentiated scheduling policies based on the
computing overheads of tasks. In this way, tasks are balanced among the CPU
core groups. High-density computing tasks are distinguished from common
density computing tasks and are used as scheduling units for load balancing
among groups. This prevents scheduled tasks from interfering with each other,
improving task execution efficiency.

8.3 Back-end Network Optimization


Multistreaming
OceanStor Dorado 2000 uses the multistreaming technology. The SSD driver works
with the controller software to effectively distinguish data with different change frequencies and store the data in different blocks. For example, metadata (hot data) and user data (warm data) are stored in different blocks. This increases the probability that data in the blocks becomes invalid at the same time, reduces the amount of valid data to be migrated during GC, and improves SSD access performance and the service life.
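The routing decision behind multistreaming can be sketched briefly. The Python example is illustrative; the stream IDs and categories are assumptions for the example, not the actual stream layout negotiated between the controller and the SSD.

    # Illustrative stream routing: data with different expected update rates
    # is written to different SSD streams (blocks), so whole blocks tend to
    # become invalid together and less valid data is moved during GC.

    STREAMS = {"metadata_hot": 0, "user_warm": 1, "sequential_cold": 2}

    def pick_stream(is_metadata: bool, sequential: bool) -> int:
        if is_metadata:
            return STREAMS["metadata_hot"]
        if sequential:
            return STREAMS["sequential_cold"]
        return STREAMS["user_warm"]

    print(pick_stream(is_metadata=True, sequential=False))   # -> 0
    print(pick_stream(is_metadata=False, sequential=True))   # -> 2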

ROW Full-Stripe Write


OceanStor Dorado 2000 uses ROW full-stripe write, which writes all new data to new blocks. This avoids the write amplification caused by the data reads and parity calculation in a traditional RAID write process and eliminates the consistency risk that arises when multiple blocks must be changed simultaneously. In addition, it effectively reduces the CPU
overhead of the storage controller and the read/write workload on SSDs during
the write process, and simplifies the processing logic for a better error tolerance
capability. Compared with the traditional Write In Place mode, the ROW full-stripe
write mode delivers higher performance and fault tolerance efficiency in random
write scenarios.
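A back-of-the-envelope comparison illustrates the benefit. The figures below assume a RAID 6 layout with 8 data columns and are provided only as an example; they are not measured values for the product.

    # Disk operations needed to service one small random write (illustrative).

    def write_in_place_ops() -> int:
        """Read-modify-write on RAID 6: read old data and both old parities,
        then write new data and both new parities."""
        return 3 + 3

    def row_writes_per_host_write(data_columns: int,
                                  parity_columns: int = 2) -> float:
        """ROW aggregates new data into a full stripe, so parity is written
        once per stripe and amortized over all data columns."""
        return (data_columns + parity_columns) / data_columns

    print(write_in_place_ops())           # 6 disk operations per host write
    print(row_writes_per_host_write(8))   # 1.25 disk writes per host write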

Read First on SSDs


OceanStor Dorado 2000 uses the latest generation of SSDs, reducing the average
read latency by over 50 μs. Generally, there are three types of operations on the
flash media of an SSD: read, write, and erase. The erase latency is 5 ms to 15 ms,
the write latency is 2 ms to 4 ms, and the read latency ranges from dozens of μs
to 100 μs. When a flash chip is performing a write or an erase operation, a read
operation must wait until the current operation is finished, which causes a great
jitter in read latency. By using the read first on SSDs technology, if a read request
with a higher priority is detected during an erase or write operation, the system
cancels the current operation and preferentially processes the read request. This
greatly reduces the read latency on SSDs.


9 System Serviceability Design

This chapter describes how to manage the OceanStor Dorado 2000 storage system through various interfaces (including DeviceManager, the CLI, RESTful APIs, SNMP, and SMI-S), how to manage the device lifecycle through FlashEver, and the upgrade mode that is transparent to hosts.
9.1 System Management
9.2 Non-Disruptive Upgrade (NDU)
9.3 Device Lifecycle Management

9.1 System Management


OceanStor Dorado 2000 provides device management interfaces and integrated northbound management interfaces. Device management interfaces include a graphical management interface (DeviceManager) and a command-line interface (CLI). Northbound interfaces are mainly RESTful interfaces, supporting SNMP, evaluation tools, and third-party network management plug-ins. For details, see https://info.support.huawei.com/storage/comp/#/home.

9.1.1 DeviceManager
DeviceManager is a built-in HTML5-based management system for OceanStor Dorado 2000. It provides a wizard-based GUI for efficient management. Users can enter https://<storage management IP address>:8088/ in a browser to access DeviceManager. On DeviceManager, you can perform almost all required configurations. The following figure shows the DeviceManager login page.


Figure 9-1 DeviceManager login page

You can use the following functions on DeviceManager:

● Storage space management: This includes storage pool management, LUN


management, and mapping between LUNs and hosts.
● Data protection management: LUN data is protected using snapshot, clone,
backup, replication, and active-active.
● Configuration task: Background tasks for complex configuration operations
are provided to trace the procedure of the configuration process.
● Fault management: The status of storage devices and of the management units on storage devices is monitored. If faults occur, alarms are generated and troubleshooting suggestions and guidance are provided.
● Performance and capacity management: The performance and capacity of storage devices are monitored in real time. You can view the collected historical performance and capacity data and analyze associated performance data.
● Security management: DeviceManager supports the management of users,
roles, permissions, certificates, and keys.

DeviceManager uses a new UI design to provide a simple interactive interface. Users can complete configuration tasks in only a few operations, which improves user experience.

9.1.1.1 Storage Space Management

Flexible Storage Pool Management


OceanStor Dorado manages storage space by using storage pools. A storage pool
consists of multiple SSDs and can be divided into multiple LUNs for hosts to use.
Users can use only one storage pool to manage all the space or divide multiple
storage pools.

● One storage pool for the entire storage system


This is the simplest division method. Users only need to create one storage
pool using all disks during system initialization.
● Multiple storage pools to isolate applications


If you want to use multiple storage pools to manage space and isolate fault
domains of different applications, you can manually create storage pools at
any time in either of the following ways:
– Specify the number of disks used to create a storage pool. The system
automatically selects qualified disks to create a storage pool.
– Manually select specific disks to create a storage pool.

Mappings Between LUNs and Hosts


To facilitate LUN management and provide storage volumes for hosts, OceanStor
Dorado defines the following types of management objects:

Object Function

LUN A storage volume that can be accessed by hosts

LUN group A LUN group consists of multiple LUNs. If the data of an


application comes from multiple LUNs, these LUNs can be
added to a LUN group. Operations on the LUN group apply
to all LUNs in the LUN group.

Host A host that can access the storage system. The host can be
a physical host or a VM.

Host group A host group consists of multiple hosts. If an application is


deployed on a cluster consisting of multiple hosts and these
hosts access the data volumes of the application
simultaneously, you can create a host group for these hosts.

DeviceManager provides multiple simple and flexible mechanisms. Whether the application is simple or complex, a proper mapping scheme is available for users.
● Mappings between LUN groups and host groups
If an application has multiple LUNs and is deployed on a cluster consisting of
multiple hosts, you are advised to manage the LUNs using a LUN group,
manage the hosts using a host group, and create mappings between the LUN
group and host group.
● Mappings between the LUN group and host
If an application has multiple LUNs and is deployed on only one host, you are
advised to manage these LUNs using a LUN group and create the mapping
between the LUN group and the host.
● Mapping between the LUN and host
If an application uses only one LUN and is deployed only on one host or you
are not used to using LUN groups, create the mapping between the LUN and
host.
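The relationships among these objects can be summarized in a small data-model sketch. The Python example is illustrative only; the class and attribute names are assumptions and do not reflect the product's internal object model.

    # Minimal data-model sketch of LUNs, LUN groups, hosts, host groups, and
    # the mapping between them.

    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class LUN:
        name: str

    @dataclass
    class Host:
        name: str

    @dataclass
    class LUNGroup:
        name: str
        luns: List[LUN] = field(default_factory=list)

    @dataclass
    class HostGroup:
        name: str
        hosts: List[Host] = field(default_factory=list)

    @dataclass
    class Mapping:
        source: object   # a LUN or a LUNGroup
        target: object   # a Host or a HostGroup

    # Example: one application with two LUNs deployed on a two-node cluster.
    lg = LUNGroup("app1_lg", [LUN("app1_data"), LUN("app1_log")])
    hg = HostGroup("app1_cluster", [Host("node1"), Host("node2")])
    mapping = Mapping(lg, hg)
    print(f"{mapping.source.name} -> {mapping.target.name}")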

Automatic Host Detection


In addition to manually creating a host using the WWN or IQN, hosts can be
automatically created and scanned if Huawei's multipathing software UltraPath is
installed.


As shown in the following figure, after the physical network connection between
the host and the storage device is set up, you can click Scan for Host on the host
management page. The system scans for all hosts connected to the storage device,
identifies their WWNs or IQNs, and automatically creates hosts. If a host has
multiple WWNs or IQNs, the system can automatically identify them as belonging to one host.
When a large number of hosts exist, managing their WWNs or IQNs is time-
consuming. The automatic host detection function simplifies the management. For
details about the operation requirements and environment requirements for
automatic host detection, see the online help provided by OceanStor Dorado.

Figure 9-2 Automatic host detection

Quick Configuration Wizard


OceanStor Dorado provides an initial configuration wizard for comprehensive
scenarios, simplifying the configuration process from unpacking to use. According
to the configuration sequence, the configuration items are as follows:
● Basic device information, license, time, DNS, and alarm notification

● Storage pool creation


OceanStor Dorado manages storage space by using storage pools. A storage
pool consists of multiple SSDs and can be divided into multiple LUNs for hosts
to use. You can use one storage pool to manage all of the space or create
multiple storage pools.
– Only one storage pool on the storage system
This is the simplest method. You only need to create one storage pool
with all disks during system initialization.


– Multiple storage pools to isolate applications


If you want to use multiple storage pools to manage space and isolate
fault domains of different applications, you can manually create storage
pools at any time in either of the following ways. You can specify the
number of disks used to create a storage pool, and the system
automatically selects the disks that meet the requirements. Alternatively,
you can manually select specific disks to create a storage pool.

● Resource allocation
The wizard allows for automatic discovery of hosts using Huawei UltraPath
and NAS domain authentication configuration, and provides links for service
provisioning. The entire process is streamlined, helping users get started
quickly.

Rapid Provisioning of NAS Resources Based on Existing Objects


● Creating a file system using a template
Use the selected file system to automatically fill in parameters except the
name.


● Creating a share using a template


Use the selected share to automatically fill in all parameters.

9.1.1.1.1 Flexible Storage Pool Management


OceanStor Dorado 2000 manages storage space by using storage pools. A storage
pool consists of multiple SSDs and can be divided into multiple LUNs for hosts to
use. You can use one storage pool to manage all of the space or create multiple
storage pools.

● One storage pool for the entire storage system


This is the simplest division method. Users only need to create one storage
pool using all disks during system initialization.
● Multiple storage pools to isolate applications
If you want to use multiple storage pools to manage space and isolate fault
domains of different applications, you can manually create storage pools at
any time in either of the following ways. You can specify the number of disks
used to create a storage pool, and the system automatically selects the disks
that meet the requirements. Alternatively, you can manually select specific
disks to create a storage pool.


9.1.1.1.2 Mappings Between LUNs and Hosts


To facilitate LUN management and provide storage volumes for hosts, OceanStor
Dorado 2000 defines the following types of management objects:

Object Function

LUN A storage volume that can be accessed by hosts.

LUN group A LUN group consists of multiple LUNs. If the data of an


application comes from multiple LUNs, these LUNs can be
added to a LUN group. Operations on the LUN group apply to
all LUNs in the LUN group.

Host A host that can access the storage system. The host can be a
physical host or a VM.

Host group A host group consists of multiple hosts. If an application is


deployed on a cluster consisting of multiple hosts and these
hosts access the data volumes of the application
simultaneously, you can create a host group for these hosts.

DeviceManager provides multiple simple and flexible mechanisms. Whether the application is simple or complex, a proper mapping scheme is available for users.
● Mappings between LUN groups and host groups
If an application has multiple LUNs and is deployed on a cluster consisting of
multiple hosts, you are advised to manage the LUNs using a LUN group,
manage the hosts using a host group, and create mappings between the LUN
group and host group.
● Mappings between the LUN group and host
If an application has multiple LUNs and is deployed on only one host, you are
advised to manage these LUNs using a LUN group and create the mapping
between the LUN group and the host.
● Mapping between the LUN and host
If an application uses only one LUN and is deployed only on one host or you
are not used to using LUN groups, you are advised to create the mapping
between the LUN and host.

9.1.1.1.3 Automatic Host Detection


In addition to manually creating a host using the WWN or IQN, hosts can be
automatically created and scanned if Huawei's multipathing software UltraPath is
installed.
As shown in the following figure, after the physical network connection between
the host and the storage device is set up, you can click Scan for Host on the host
management page. The system scans for all hosts connected to the storage device,
identifies their WWNs or IQNs, and automatically creates hosts. If a host has
multiple WWNs or IQNs, the system can automatically identify them as belonging to one host.
When a large number of hosts exist, managing their WWNs or IQNs is time-consuming. The automatic host detection function simplifies the management. For details about the operation requirements and environment requirements for automatic host detection, see the online help provided by OceanStor Dorado 2000.

Figure 9-3 Automatic host detection

9.1.1.1.4 Quick Configuration Wizard


OceanStor Dorado 2000 provides an initial configuration wizard for comprehensive
scenarios, simplifying the configuration process from unpacking to use. According
to the configuration sequence, the configuration items are as follows:
● Basic device information, license, time, DNS, and alarm notification

● Storage pool creation


OceanStor Dorado 2000 manages storage space by using storage pools. A
storage pool consists of multiple SSDs and can be divided into multiple LUNs
for hosts to use. You can use one storage pool to manage all of the space or
create multiple storage pools.
– Only one storage pool on the storage system
This is the simplest division method. Users only need to create a storage
pool using all disks during system initialization.
– Multiple storage pools to isolate applications
If you want to use multiple storage pools to manage space and isolate
fault domains of different applications, you can manually create storage
pools at any time in either of the following ways. You can specify the
number of disks used to create a storage pool, and the system
automatically selects the disks that meet the requirements. Alternatively,
you can manually select specific disks to create a storage pool.


● Resource allocation
The wizard allows for automatic discovery of hosts using Huawei UltraPath
and provides links for service provisioning. The entire process is streamlined,
helping users get started quickly.

9.1.1.2 Data Protection Management

Data Protection Based on Protection Groups


If an application has multiple LUNs, data protection for the application means protecting its LUNs simultaneously and ensuring their data consistency. OceanStor Dorado introduces protection groups to protect the LUNs of an application in groups.
A protection group consists of multiple LUNs. If you implement data protection on
a protection group, such as creating snapshots, the operation applies to all LUNs
in the protection group and ensures data consistency between LUNs.
The following features support batch LUN data protection based on the protection
group:
● Protection group-based snapshot
When you create a snapshot for a protection group, a snapshot is created for
each LUN in the protection group. A snapshot consistency group is
automatically created and the snapshots are added to the snapshot
consistency group.
● Protection group-based clone
When you create a clone for a protection group, a clone is created for each
LUN in the protection group. A clone consistency group is automatically
created and the clone LUNs are added to the clone consistency group.
● Protection group-based remote replication
When you create remote replication for a protection group, a remote
replication pair relationship is established for each LUN in the protection
group. A remote replication consistency group is automatically created and
the remote replication pairs are added to the remote replication consistency
group.
● Protection group-based HyperMetro
When you configure HyperMetro for a protection group, a HyperMetro pair
relationship is established for each LUN in the protection group. A
HyperMetro consistency group is automatically created and the HyperMetro
pairs are added to the HyperMetro consistency group.
Like protecting an independent LUN, the protection group allows users to
configure protection for multiple LUNs of an application. Users do not need to
configure LUNs separately. Managing a batch of LUNs is as simple as managing
one LUN and data consistency is ensured at the same time.
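The fan-out behavior of a protection-group operation can be sketched as follows. The Python example is illustrative only; the function, object, and naming conventions are assumptions made for the example, not the product's implementation.

    # Illustrative sketch: one snapshot operation on a protection group
    # produces one snapshot per member LUN, and all snapshots are collected
    # into a consistency group so they reflect the same point in time.

    import time
    from typing import Dict, List

    def snapshot_protection_group(group_name: str,
                                  member_luns: List[str]) -> Dict[str, object]:
        timestamp = time.strftime("%Y%m%d_%H%M%S")
        snapshots = [f"{lun}_snap_{timestamp}" for lun in member_luns]
        return {
            "consistency_group": f"{group_name}_cg_{timestamp}",
            "snapshots": snapshots,   # one snapshot per LUN, same instant
        }

    result = snapshot_protection_group("erp_pg", ["erp_data", "erp_log"])
    print(result["consistency_group"])
    print(result["snapshots"])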


Flexible Use of LUN Groups and Protection Groups


As previously mentioned, OceanStor Dorado uses LUN groups and protection
groups to manage volumes of applications in batches. The following table
describes the application scenarios.

● LUN group
Application scenario: Mapping
Relationship with applications: 1. Manages all volumes of an application. 2. Manages all volumes of a host (involving multiple applications).

● Protection group
Application scenario: Snapshot, clone, replication, and active-active
Relationship with applications: Manages all volumes of an application.

● Using a LUN group only to manage LUNs of an application


Most recommended method: Use a LUN group to manage LUNs of an
application. In this way, LUNs are mapped and protected on the basis of LUN
groups.
This management model is simple. The system will automatically create a
unique protection group and implement various types of data protection
based on the protection group.
This method makes the management of multiple LUNs as simple as the
management of only one LUN.
● Using LUN groups to manage volumes of hosts and protection groups to
manage volumes of applications
This method is usually used to manage LUNs that belong to different
applications which are deployed on the same host or host group.
In such conditions, it is not appropriate to create the LUN group as a
protection group, because data volumes of some applications may not need
to be protected.
Therefore, you can select the specified LUNs of the LUN group to create a
protection group and perform protection operations for the protection group.
● Using LUN groups or protection groups separately
If the data protection feature is not enabled, only LUN groups are available
for you.
If LUN group-based mappings are unnecessary, you can add LUNs of an
application to a protection group to keep them consistent.

Capacity Expansion of LUN Groups or Protection Groups


For an application running out of storage space, you can add LUNs to expand the
capacity. If the LUNs are managed in a LUN group or protection group, they
inherit the mappings and protection settings of the group.

● Automatic creation of mappings


After new LUNs are added to a LUN group, these LUNs share the mappings of
the LUN group. Hosts accessible to the LUN group are also accessible to the
LUNs.
● Automatic configuration of data protection
After new LUNs are added to a LUN group that is included in a protection
group, the system will automatically append the protection settings of the
protection group to the LUNs, such as creating clones, replication pairs, and
HyperMetro pairs for the LUNs, and adding the created objects to their
respective consistency groups.
If new LUNs are added to a protection group, the system will append the
protection settings of the protection group to the LUNs.
Similarly, if LUNs are removed from a LUN group or protection group, their
mappings will be deleted, and they will be removed from consistency groups
(their existing replication pairs and HyperMetro pairs will be retained and can
be manually deleted).

Configuration on One Device for Cross-Device Data Protection


When cross-device data protection (such as replication and HyperMetro) is enabled, data protection configuration only needs to be performed on one device. For example, as shown in Figure 9-4, if you want to replicate the protection group (composed of LUN 1 and LUN 2) at site 1 in Shanghai to site 2 in Beijing, you only need to configure protection settings at site 1, without logging in to site 2.

Figure 9-4 Configurations on one device

If you use one protection group to manage all LUNs of an application and have
enabled protection for the protection group, the system automatically creates
target volumes for all of the LUNs, pairs the sources with targets, and creates a
consistency group. Currently, DeviceManager allows the following settings to be
configured on one device:

● Remote replication


After you select the protection group you want to replicate to a remote
device, DeviceManager does the following automatically:
– Creates a target protection group on the target device.
– Creates target LUNs.
– Pairs each source LUN with target LUN.
– Creates a replication consistency group.
– Adds the pairs to the replication consistency group.
● HyperMetro
After you select the protection group for which you want to enable active-
active protection, DeviceManager does the following automatically:
– Creates a target protection group on the target device.
– Creates target LUNs.
– Pairs each source LUN with target LUN.
– Creates a HyperMetro consistency group.
– Adds the pairs to the HyperMetro consistency group.
● DR Star
After you select the protection group for which you want to enable DR Star
protection, DeviceManager does the following automatically:
– Creates target protection groups on the two target devices.
– Creates target LUNs.
– Pairs each source LUN with the target LUN and forms DR Star.
– Creates three consistency groups to form DR Star.
– Adds the replication or HyperMetro pairs to the corresponding
consistency group.
DeviceManager uses data links between devices to transmit management
commands (see Figure 9-4). You can directly use this feature with no need to
configure network devices.
For security purposes, you must be authorized by the target device and enter the
user name and password of the target device on the primary device.

9.1.1.2.1 Data Protection Based on Protection Groups


If an application has multiple LUNs, data protection for the application means protecting its LUNs simultaneously and ensuring their data consistency. In this case, protection groups are introduced to protect the LUNs of an application.
A protection group consists of multiple LUNs. If you implement data protection on
a protection group, such as creating snapshots, the operation applies to all LUNs
in the protection group and ensures data consistency between LUNs. The
following features support batch LUN data protection based on the protection
group:
● Protection group-based snapshot
When you create a snapshot for a protection group, a snapshot is created for
each LUN in the protection group. A snapshot consistency group is
automatically created and the snapshots are added to the snapshot
consistency group.


● Protection group-based clone


When you create a clone for a protection group, a clone is created for each
LUN in the protection group. A clone consistency group is automatically
created and the clone LUNs are added to the clone consistency group.
● Protection group-based remote replication
When you create remote replication for a protection group, a remote
replication pair relationship is established for each LUN in the protection
group. A remote replication consistency group is automatically created and
the remote replication pairs are added to the remote replication consistency
group.
● Protection group-based HyperMetro
When you configure HyperMetro for a protection group, a HyperMetro pair
relationship is established for each LUN in the protection group. A
HyperMetro consistency group is automatically created and the HyperMetro
pairs are added to the HyperMetro consistency group.

Like protecting an independent LUN, the protection group allows users to


configure protection for multiple LUNs of an application. Users do not need to
configure LUNs separately. Managing a batch of LUNs is as simple as managing
one LUN and data consistency is ensured at the same time.

9.1.1.2.2 Flexible Use of LUN Groups and Protection Groups


As previously mentioned, LUN groups and protection groups are used to manage
volumes of applications in batches. The following table describes the application
scenarios.

● LUN group
Application scenario: Mapping
Relationship with applications: Manages all volumes of an application, or manages all volumes of a host (involving multiple applications).

● Protection group
Application scenario: Snapshot, clone, replication, and active-active
Relationship with applications: Manages all volumes of an application.

● Using LUN groups to manage LUNs of an application


This is the most recommended method. If LUN groups are used to manage
LUNs of an application, LUNs are mapped and protected on the basis of LUN
groups.
This management model is simple. The system will automatically create a
unique protection group and implement various types of data protection
based on the protection group.
This method makes the management of multiple LUNs as simple as the
management of only one LUN.
● Using LUN groups to manage volumes of hosts and protection groups to
manage volumes of applications


This method is usually used to manage LUNs that belong to different


applications which are deployed on the same host or host group.
In such conditions, it is not appropriate to create the LUN group as a
protection group, because data volumes of some applications may not need
to be protected. Therefore, you can select the specified LUNs of the LUN
group to create a protection group and perform protection operations for the
protection group.
● Using LUN groups or protection groups separately
If the data protection feature is not enabled, only LUN groups are available
for you.
If LUN group-based mappings are unnecessary, you can add LUNs of an
application to a protection group to keep them consistent.

9.1.1.2.3 Capacity Expansion of LUN Groups or Protection Groups


For an application running out of storage space, you can add LUNs to expand the
capacity. If the LUNs are managed in a LUN group or protection group, they
inherit the mappings and protection settings of the group.
● Automatic creation of mappings
After new LUNs are added to a LUN group, these LUNs share the mappings of
the LUN group. Hosts accessible to the LUN group are also accessible to the
LUNs.
● Automatic configuration of data protection
After new LUNs are added to a LUN group that is included in a protection
group, the system will automatically append the protection settings of the
protection group to the LUNs, such as creating clones, replication pairs, and
HyperMetro pairs for the LUNs, and adding the created objects to their
respective consistency groups.
If new LUNs are added to a protection group, the system will append the
protection settings of the protection group to the LUNs.
Similarly, if LUNs are removed from a LUN group or protection group, their
mappings will be deleted, and they will be removed from consistency groups
(their existing replication pairs and HyperMetro pairs will be retained and can
be manually deleted).

9.1.1.2.4 Configuration on One Device for Cross-Device Data Protection


When cross-device data protection (such as replication and HyperMetro) is
enabled, data protection configuration only needs to be performed on one device.
For example, as shown in the following figure, if you want to replicate the
protection group (composed of LUN 1 and LUN 2) at site 1 in Shanghai to site 2 in
Beijing, you only need to configure protection settings at site 1, without logging in
to site 2.


Figure 9-5 Configurations on one device

If you use one protection group to manage all LUNs of an application and have
enabled protection for the protection group, the system automatically creates
target volumes for all of the LUNs, pairs the sources with targets, and creates a
consistency group. Currently, DeviceManager allows the following settings to be
configured on one device:

● Remote replication
After you select the protection group you want to replicate to a remote
device, DeviceManager does the following automatically:
a. Creates a target protection group on the target device.
b. Creates target LUNs.
c. Pairs each source LUN with the target LUN.
d. Creates a replication consistency group.
e. Adds the pairs to the replication consistency group.
● HyperMetro
After you select the protection group for which you want to enable active-
active protection, DeviceManager does the following automatically:
a. Creates a target protection group on the target device.
b. Creates target LUNs.
c. Pairs each source LUN with the target LUN.
d. Creates a HyperMetro consistency group.
e. Adds the pairs to the HyperMetro consistency group.
● DR Star
After you select the protection group for which you want to enable DR Star
protection, DeviceManager does the following automatically:


a. Creates target protection groups on the two target devices.


b. Creates target LUNs.
c. Pairs each source LUN with the target LUN and forms DR Star.
d. Creates three consistency groups to form DR Star.
e. Adds the replication or HyperMetro pairs to the corresponding
consistency group.

DeviceManager uses data links between devices to transmit management


commands (see Figure 9-5). You can directly use this feature with no need to
configure network devices.

For security purposes, you must be authorized by the target device and enter the
user name and password of the target device on the primary device.

9.1.1.3 Configuration Task


DeviceManager automatically starts a background configuration task after you submit a complex configuration operation. The task runs in the background on the storage system, so you can perform other operations while the previous one is still being processed.

Background configuration tasks also apply to LUN group-based mappings,


protection group-based protection, and cross-device protection. Suppose that a
user needs to configure protection for a LUN group that has hundreds of LUNs. It
takes a long time and hundreds of operations to complete the configuration.
However, the background configuration task helps the user complete the
configuration in the background, freeing the user from long-term waiting.

● Task steps and progress


A complex task usually contains multiple executable steps. You can use
DeviceManager to check in which step the task is executed and view the
overall task execution progress (%).
● Task execution in the background
After a configuration task is submitted, the task is executed in the
background. You can close the DeviceManager page without waiting for the
task to complete. You can also submit multiple configuration tasks. If the
resources on which the tasks depend do not conflict, the tasks are
automatically executed in sequence in the background.
● Resuming tasks from the breakpoint
If the system is powered off unexpectedly during the task execution, the task
will be resumed at the breakpoint after the system is restarted and all
subsequent steps will be performed.
● Retrying failed tasks manually
If an exception occurs during the execution of a step, the task is automatically
interrupted and the cause is displayed. For example, a task cannot be
executed because the storage space is insufficient during LUN creation. In this
case, you can manually rectify the fault that causes task interruption, and
then manually restart the task. The task will then continue from the failure
point.
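The resumable behavior of background configuration tasks can be sketched with a simple checkpointed runner. The Python example is illustrative only; the checkpoint file, step names, and function signatures are assumptions made for the example.

    # Illustrative sketch: steps run in order, the index of the last completed
    # step is persisted, and a retry (or a restart after a power failure)
    # continues from the failure point instead of from the beginning.

    import json
    from pathlib import Path
    from typing import Callable, List, Tuple

    CHECKPOINT = Path("task_checkpoint.json")

    def run_task(steps: List[Tuple[str, Callable[[], None]]]) -> None:
        done = 0
        if CHECKPOINT.exists():
            done = json.loads(CHECKPOINT.read_text())["completed_steps"]
        for index, (name, action) in enumerate(steps):
            if index < done:
                continue                      # step already completed earlier
            print(f"step {index + 1}/{len(steps)}: {name}")
            action()                          # may raise -> task interrupted
            CHECKPOINT.write_text(json.dumps({"completed_steps": index + 1}))
        CHECKPOINT.unlink()                   # finished, clear the checkpoint

    run_task([
        ("create LUNs", lambda: None),
        ("add LUNs to the LUN group", lambda: None),
        ("create the mapping", lambda: None),
    ])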


9.1.1.4 Fault Management

9.1.1.4.1 Monitoring Status of Hardware Devices


This function provides hardware views in a what-you-see-is-what-you-get manner and uses colored digits to make the status of different hardware components easier to distinguish. Figure 9-6 shows the status statistics of hardware components.

Figure 9-6 Inventory management

You can further navigate through a specific device frame and its hardware
components, and view the device frame in the device hardware view. In the
hardware view, you can monitor the real-time status of each hardware component
and learn the physical locations of specific hardware components (such as ports
and SSDs), facilitating hardware maintenance.
You can query the real-time health status of the SSDs, ports, interface modules,
fans, power modules, BBUs, controllers, and disk enclosures. In addition, disks and
ports support performance monitoring. You can refer to 9.1.1.5 Performance and
Capacity Management for more details.

9.1.1.4.2 Alarm and Event Monitoring


This function provides you with real-time fault monitoring information. If a system
fault occurs, the fault is pushed to the home page of DeviceManager in real time.
Figure 9-7 shows an example.


Figure 9-7 Alarm notification

A dedicated page is available for you to view information about all alarms and
events and also provides you with troubleshooting suggestions.

You can also receive alarm and event notifications through syslog, email, and SMS
(a dedicated SMS modem is required). You can configure multiple email addresses
or mobile phone numbers to receive notifications.

9.1.1.5 Performance and Capacity Management


Performance data collection and analysis are essential to daily device
maintenance. Because the performance data volume is large and analyzing the
data consumes many system resources, an extra server is often required for
installing dedicated performance data collection and analysis software, making
performance management complex.
OceanStor Dorado 2000 has a built-in performance and capacity data collection
and analysis component that is ready for use. The component is specially designed
to consume minimal system resources.


9.1.1.5.1 Built-In Performance Data Collection and Analysis Capabilities


OceanStor Dorado 2000 has built-in performance collection and analysis software.
You do not need to install the software separately. You can collect, store, and query historical performance and capacity data for up to the past three years, or specify a shorter retention period as required. Performance data of
multiple specific objects can be collected and analyzed, such as controllers, ports,
disks, hosts, host groups, LUNs, LUN groups, file systems, remote replication, and
replication links. Multiple performance indicators can be monitored for each
object, such as bandwidth, IOPS, average I/O response time, and usage.

Different objects and performance indicators can be displayed in the same view to
help you analyze performance issues. You can specify the desired performance
indicators to analyze the top or bottom objects, so that you can locate overloaded
objects more efficiently and tune performance more precisely. The following figure
shows performance monitoring. For details about monitoring objects, monitoring
indicators, and performance analysis functions, see the online help.

9.1.1.5.2 Independent Data Storage Space


To store collected performance and capacity data, a dedicated storage space is
required. DeviceManager provides a dedicated configuration page for users to
select a storage pool for storing data.

You can also customize the data retention period. The maximum retention period
is three years. DeviceManager automatically calculates the required storage space based on your selection and allocates the required storage space in the storage pool.

9.1.1.5.3 Performance Threshold Alarm


This function allows you to configure threshold alarms for objects such as
controllers, ports, LUNs, and replication. The alarm threshold, flapping period, and
alarm severity can be customized.
Different storage resources carry different types of services. Therefore, common
threshold alarms may not meet requirements. Performance management allows
you to set threshold rules for specified objects to ensure high accuracy for
threshold alarms.
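The effect of the flapping period can be sketched as follows. The Python example is illustrative only; the sample values and the sampling model are assumptions made for the example.

    # Illustrative sketch: an alarm is raised only if the indicator stays
    # above the threshold for the whole flapping period, so short spikes do
    # not generate alarms.

    from typing import List

    def should_alarm(samples: List[float], threshold: float,
                     flapping_samples: int) -> bool:
        """Alarm only if the last `flapping_samples` samples all exceed the
        threshold."""
        if len(samples) < flapping_samples:
            return False
        return all(value > threshold for value in samples[-flapping_samples:])

    latency_ms = [0.4, 0.5, 2.1, 2.3, 2.2]   # sampled average I/O latency
    print(should_alarm(latency_ms, threshold=2.0, flapping_samples=3))  # True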

9.1.1.5.4 Scheduled Report


Performance and capacity reports for specific objects can be generated
periodically. Users can learn about the performance and capacity usage of storage
devices.

Reports can be generated by day, week, or month. You can set the time when the
reports are generated, the time when the reports take effect, and the retention
duration of the reports. You can select a report file format. Currently, the PDF and
CSV formats are supported.
You can select the objects for which you want to generate a performance report.
All the objects for which you want to collect performance statistics can be
included in the report. You can also select the performance indicators to be
displayed in the report. The capacity report collects statistics on the capacity of
the entire system and storage pools.
You can create multiple report tasks. Each report task can be configured with its
own parameters. The system automatically generates reports according to the task
requirements.

Issue 01 (2023-08-01) Copyright © Huawei Technologies Co., Ltd. 104


OceanStor Dorado 2000 6.1.6 Technical White Paper
OceanStor Dorado 2000 6.1.6 Technical White Paper 9 System Serviceability Design

9.1.2 CLI
The command-line interface (CLI) allows administrators and other system users to manage and maintain the storage system. It is based on the Secure Shell (SSH) protocol and supports key-based SSH access.

9.1.3 RESTful APIs


The RESTful APIs of OceanStor Dorado 2000 allow system automation, development, query, and allocation over HTTPS interfaces. With the RESTful APIs, you can use third-party applications to control and manage arrays and develop flexible management solutions for OceanStor Dorado 2000.
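As a rough illustration of how such automation looks from a client, the following Python sketch drives the storage system over HTTPS with a generic HTTP library. The URL paths, payload fields, and authentication flow shown here are placeholders invented for the example; the actual resource paths, parameters, and login procedure are defined in the product's REST API reference.

    # Illustrative REST client sketch (placeholder paths and fields only).

    import requests

    BASE = "https://<storage_management_ip>:8088"   # placeholder address

    session = requests.Session()
    session.verify = False            # lab sketch only; use trusted CA
                                      # certificates in production

    # 1. Log in and keep the returned credentials for later calls.
    login = session.post(f"{BASE}/example/sessions",
                         json={"username": "admin", "password": "***"})
    login.raise_for_status()

    # 2. Query LUNs.
    luns = session.get(f"{BASE}/example/luns").json()
    print(luns)

    # 3. Create a LUN.
    session.post(f"{BASE}/example/luns",
                 json={"name": "app1_data", "capacity_gb": 100})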

9.1.4 SNMP
The storage system reports alarms and events through SNMP traps.

9.1.5 SMI-S
The Storage Management Initiative Specification (SMI-S) interface is a storage
standard management interface developed and maintained by the Storage
Networking Industry Association (SNIA). Many storage vendors participate in
defining and implementing SMI-S. The SMI-S interface is used to configure
storage hardware and services. Storage management software can use this
interface to manage storage devices and perform standard management tasks,
such as viewing storage hardware, storage resources, and alarm information.
OceanStor Dorado 2000 supports SMI-S 1.5.0, 1.6.0, and 1.6.1.

9.1.6 Tools
OceanStor Dorado 2000 provides diversified tools for pre-sales assessment
(eDesigner) and post-sales delivery (SmartKit). These tools effectively help deploy,
monitor, analyze, and maintain OceanStor Dorado 2000.

9.2 Non-Disruptive Upgrade (NDU)


A storage system upgrade usually requires controller restarts, and services need to be switched over between controllers to ensure service continuity. This not only greatly affects service performance but also requires a large amount of host information to be collected to evaluate potential upgrade compatibility risks. Host information collection requires host accounts and involves a large amount of information, some of which cannot be evaluated automatically and demands manual intervention, adding to the complexity of an upgrade.

OceanStor Dorado 2000 provides an upgrade mode that does not require
controller restart. During an upgrade, controllers are not restarted, front-end links
are not switched, and services are not affected. In addition, the performance is
restored quickly after the upgrade. During an upgrade, the links between the
storage system and host are not interrupted and no link switchover is performed,
eliminating the need for collecting host information and avoiding compatibility
risks caused by path switchover.

Issue 01 (2023-08-01) Copyright © Huawei Technologies Co., Ltd. 105


OceanStor Dorado 2000 6.1.6 Technical White Paper
OceanStor Dorado 2000 6.1.6 Technical White Paper 9 System Serviceability Design

Component-based Upgrade
A storage system can be divided into four kinds of components: physical hardware, driver firmware, the operating system, and storage software. Each kind is upgraded in a different way to complete the system upgrade in OceanStor Dorado 2000, as shown in Figure 9-8.

Figure 9-8 Component-based upgrade

● Physical hardware does not need to be upgraded.


● The driver firmware, including the BIOS, CPLD, and firmware of interface
modules, supports hot upgrade without restarting controllers.
● The operating system is upgraded by installing hot patches.
● Storage software runs as user-mode processes. The old processes are stopped and new processes are started with the upgraded code, which completes within seconds.

The component-based upgrade and front-end connection keepalive techniques


eliminate the need for controller restart and front-end link switchover, leaving
host services not affected.

Connection Keepalive
The daemon process keeps the links (Fibre Channel and iSCSI links) between the
controller and hosts connected during service process startup. The restart of
service processes takes a very short time (1 to 2 seconds). In this way, I/Os issued
by the host do not time out and the host is unaware of the restart.

Zero Performance Loss and Short Upgrade Time


OceanStor Dorado does not involve service switchover. Therefore, an upgrade has nearly no impact on host performance, and performance can recover to 100% within 2 seconds. The upgrade process does not demand controller restart or link switchover, so you do not need to collect host information or perform compatibility evaluation. The end-to-end upgrade process (from importing packages to upgrade completion) takes less than 30 minutes, and typically about 10 minutes. The upgrade of the I/O handling process takes only 10 seconds, affecting services for only that period. This ensures that the upgrade has minimal impact on the storage system.

9.3 Device Lifecycle Management

9.3.1 Replacing a Disk Enclosure


You can add a new disk enclosure when the disk lifecycle expires. Then the storage
system migrates data from the to-be-replaced disk enclosure to the new disk
enclosure without affecting host and service running.

Step 1 Connect a new disk enclosure to the controller enclosure. Disks installed on the
disk enclosure can be disks of various new media or disks with larger capacity.
Step 2 Data is migrated in the background from the to-be-replaced disk enclosure to the
new one. This is transparent to users and hosts.
Step 3 After data has been migrated, remove the old disk enclosure.

----End


10 System Security Design

Storage security must be safeguarded by technical measures. Data integrity,


confidentiality, and availability must be monitored. Secure boot and access
permission control as well as security policies based on specific security threats of
storage devices and networks further enhance system security. All these measures
prevent unauthorized access to storage resources and data. Storage security
consists of device security, network security, service security, and management
security. This chapter describes the software integrity protection, secure boot, and
data encryption capabilities related to system security. The digital signature
technology ensures that the product package (including the upgrade package)
developed by Huawei is not tampered with during device installation and upgrade.
The secure boot technology guarantees that the startup components are verified
during the startup of storage devices to prevent the startup files from being
tampered with. The disk encryption feature is used to protect data stored on disks
and prevent data loss caused by disk loss.
10.1 Software Integrity Protection
10.2 Secure Boot

10.1 Software Integrity Protection


Figure 10-1 Software integrity protection

The product software package, upgrade package, and firmware package may be tampered with while they await onsite installation by R&D engineers. The digital signature of the product software package is used to protect the integrity of the upgrade package used during product development and onsite deployment.
software package uses an internal digital signature and a product package digital
signature. After the software package is sent to the customer over the network,
the upgrade module of the storage system verifies the digital signature and
performs the upgrade only after the verification is successful. This ensures the
integrity and uniqueness of the upgrade package and internal software modules.
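The verification step can be sketched in a few lines. The Python example is illustrative only: it uses the third-party cryptography package and a plain RSA/SHA-256 check to show the idea, and the function names and key-handling details are assumptions, not the product's implementation.

    # Illustrative signature check: the upgrade proceeds only if the package
    # signature verifies against a trusted public key.

    from cryptography.exceptions import InvalidSignature
    from cryptography.hazmat.primitives import hashes, serialization
    from cryptography.hazmat.primitives.asymmetric import padding

    def package_is_trusted(package_bytes: bytes, signature: bytes,
                           public_key_pem: bytes) -> bool:
        public_key = serialization.load_pem_public_key(public_key_pem)
        try:
            public_key.verify(signature, package_bytes,
                              padding.PKCS1v15(), hashes.SHA256())
            return True
        except InvalidSignature:
            return False

    # if not package_is_trusted(pkg, sig, trusted_key):
    #     abort the upgrade and report a verification failure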

10.2 Secure Boot


Figure 10-2 Secure boot

After the device is powered on, the initial startup module starts and verification is
performed level by level. If the verification is successful, the device starts. Digital
signatures are used to verify firmware integrity to prevent firmware and operating
systems from being tampered with. The related technologies are as follows:
● The root of trust (RoT) is integrated into the CPU to prevent software and
physical attacks, providing the highest level of security in the industry.
● Software integrity is ensured by two levels of digital signatures (root key +
level-2 key) and software uniqueness is ensured by digital certificates.
● The RSA 2048/4096 algorithm is used, which has the top security level in the
industry.
● Level-2 keys (code signing keys) can be revoked.
● The built-in RoT of the CPU can prevent malicious tampering, such as
tampering of flash firmware outside the ARM and replacement of the system
disk.


11 Intelligent Storage Design

11.1 Intelligent Cloud Management

11.1 Intelligent Cloud Management


In traditional service support mode, technical support personnel provide services
manually. Faults may not be detected in a timely manner and information may
not be delivered correctly. To resolve these problems, Huawei provides the
DME IQ cloud intelligent management system (DME IQ for short), built on a
cloud-native architecture. With the customer's authorization, device alarms and logs are sent
to DME IQ at a scheduled time every day. Based on artificial intelligence
technologies, DME IQ implements intelligent fault reporting, real-time health
analysis, and intelligent fault prevention to identify potential risks, automatically
locate faults, and provide troubleshooting solutions, minimizing device running
risks and reducing operation costs.

Figure 11-1 Typical DME IQ networking

DME IQ enables the client to work with the cloud system.

● The client is deployed on the customer side.


The DME IQ client is installed, or DME IQ is enabled on DeviceManager, to connect to
the DME IQ cloud system. Alarm information about customer devices is
collected and sent to the Huawei cloud system in a timely manner.
● The cloud system is deployed in the Huawei remote support center.


DME IQ receives device alarms from the customer client around the clock,
automatically reports problems to the Huawei remote support center, and
creates the corresponding service request (SR). Huawei service engineers then assist
the customer in resolving the problem promptly.

DME IQ has the following advantages:


● It provides a self-service O&M system for customers, aiming to deliver precise,
personalized information services.
● Customers can use a web browser to access DME IQ and view device information
anytime, anywhere.
● High data security and reliability are ensured. Secure information transmission
is provided and DME IQ can access the customer's system only after being
authorized by the customer.
● It provides 24/7 secure, reliable, and proactive O&M services. SRs can be
automatically created.
● Based on Huawei Cloud, the DME IQ cloud system drives IT O&M activities
through big data analytics and artificial intelligence (AI) technologies to
identify faults in advance, reduce O&M difficulties, and improve O&M
efficiency.

11.1.1 Scope of Information to Be Collected


With the authorization of customers, Huawei storage systems can be connected to
DME IQ through a network so that their O&M data is collected periodically, providing
a full picture of storage O&M activities. The O&M data includes performance data,
configuration information, alarm information, system logs, and disk information.

Table 11-1 Scope of Information to Be Collected

Data Type | Description | Interval of Data Upload
Performance data | A .txt file in the JSON format | Uploaded automatically. The new performance data is uploaded to the DME IQ cloud system every 5 minutes.
Configuration information | A .txt file | Uploaded automatically. The configuration is uploaded to the DME IQ cloud system once a day.
Alarm information | HTTPS message | Uploaded automatically. The new device alarm messages are uploaded to the DME IQ cloud system every 30 seconds.
System logs | Email | Uploaded automatically. The new alarm messages are uploaded to the DME IQ cloud system every 5 minutes.
System logs | A .tgz file | Manually uploaded by Huawei technical support personnel on the DME IQ cloud system. In the current version, all system logs, or system logs in the latest one hour, latest two hours, latest 24 hours, or a specific time period, can be uploaded.
Hardware information | A .txt file | Uploaded automatically. The disk information is uploaded to the DME IQ cloud system once a day.

11.1.2 Intelligent Fault Reporting


DME IQ provides 24/7 health reporting. If a device fails, DME IQ is automatically
notified. Traditional fault reporting mechanisms struggle to cover all
scenarios and suffer from problems such as false alarms and missed alarms.
DME IQ provides 24/7 active monitoring for customer device alarms. The alarms
generated by the device are reported to DME IQ. Based on the fault feature model
library of global devices, DME IQ performs automatic alarm masking to filter
redundant alarms, improving the accuracy and efficiency of alarm handling. Based
on the service level, DME IQ can automatically create an SR and send it to the
corresponding Huawei engineer for handling. At the same time, DME IQ
notifies the customer of the problem through the pre-agreed channel (email by
default) to facilitate troubleshooting.

Figure 11-2 Working principles of intelligent fault reporting
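
The filtering step can be sketched as follows: alarms that a known fault pattern marks as derived from a root cause are masked, and a service request is raised only for the remaining alarms. The alarm identifiers, masking rules, and reporting logic are hypothetical and are not DME IQ's real fault feature library.

```
# Hypothetical sketch of alarm masking: suppress alarms derived from a known
# root cause, then raise an SR only for the remaining (root) alarms.
from dataclasses import dataclass

@dataclass
class Alarm:
    alarm_id: str
    severity: str
    source: str

# Illustrative masking rules from a fault feature library.
MASKED_BY_ROOT = {
    "LINK_DOWN": {"PATH_DEGRADED", "IO_LATENCY_HIGH"},
}

def filter_alarms(alarms: list[Alarm]) -> list[Alarm]:
    present = {a.alarm_id for a in alarms}
    masked = set()
    for root, derived in MASKED_BY_ROOT.items():
        if root in present:
            masked |= derived
    return [a for a in alarms if a.alarm_id not in masked]

def report(alarms: list[Alarm]) -> None:
    for alarm in filter_alarms(alarms):
        print(f"Create SR for {alarm.alarm_id} ({alarm.severity}) on {alarm.source}")

report([Alarm("LINK_DOWN", "major", "controller A"),
        Alarm("PATH_DEGRADED", "warning", "controller A")])
```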

11.1.3 Capacity Prediction


System capacity changes are affected by multiple factors. The traditional single
prediction algorithm cannot ensure the accuracy of prediction results. DME IQ
ensures the rationality and accuracy of prediction results from the following
aspects:
● DME IQ uses multiple prediction model clusters for online prediction, outputs
the prediction results of multiple models, and then selects the best prediction
results based on the selection rules recommended by online prediction. At the
same time, based on historical capacity data, DME IQ periodically trains and
verifies itself against linear trends, recognizable patterns, periodicity, and local
changes to optimize the model parameters, ensuring that the optimal prediction
algorithm is selected.
● The DME IQ prediction algorithm model can accurately identify various
factors that affect capacity changes, for example, sudden capacity increase
and decrease caused by major events, irregular trend caused by capacity
reclamation of existing services, and capacity hops caused by new service
rollout. In this way, the system capacity consumption can be predicted more
accurately.

DME IQ selects the best prediction model using the intelligent algorithm, and
predicts the capacity consumption in the next 12 months. Based on the capacity
prediction algorithm, DME IQ provides the overloaded resource warning, capacity
expansion suggestions for existing services, and annual capacity planning
functions for customers.

Figure 11-3 Working principles of capacity prediction

Responsibilities of each component:

● Data source collection: collects configurations, performance indicators, and
alarm information to reduce the interference of multiple factors on machine
learning and training results.
● Feature extraction: uses algorithms to transform and extract features
automatically.
● Historical database warehouse: stores historical capacity data of the latest
year.
● Online training
– Uses a large number of samples for training to obtain the measurement
indicator statistics predicted by each model and output the model
selection rules.
– For the current historical data, performs iteration for a limited number of
times to optimize the model algorithm.
● Machine learning model library: includes ARIMA, fbprophet, and linear
prediction models.
● Online prediction: performs prediction online using the optimized models and
outputs the prediction results of multiple models and mean absolute
percentage error (MAPE) values of measurement indexes.


The MAPE is computed as MAPE = (1/n) × Σ |At − Ft| / At × 100%, where At
indicates the actual capacity at time point t, Ft indicates the predicted capacity,
and n is the number of prediction points.
● Best model selection: weights the model statistics of online training and
results of online prediction, and selects the optimal prediction results.
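
As a minimal illustration of MAPE-based model selection, the sketch below fits two toy forecasting models to historical capacity data, scores each on a hold-out window, and keeps the one with the lowest MAPE. The data and models are placeholders; DME IQ's actual model library (ARIMA, fbprophet, and others) and selection rules are more elaborate.

```
# Illustrative sketch: score candidate capacity-prediction models by MAPE on a
# hold-out window and keep the best one. Data and models are placeholders only.
def mape(actual, predicted):
    return sum(abs(a - f) / a for a, f in zip(actual, predicted)) / len(actual) * 100

history = [50, 52, 55, 57, 60, 62, 65, 67]          # used capacity (TB), oldest first
train, holdout = history[:-3], history[-3:]

def linear_forecast(train, steps):
    slope = (train[-1] - train[0]) / (len(train) - 1)
    return [train[-1] + slope * (i + 1) for i in range(steps)]

def naive_forecast(train, steps):
    return [train[-1]] * steps

candidates = {"linear": linear_forecast, "naive": naive_forecast}
scores = {name: mape(holdout, model(train, len(holdout)))
          for name, model in candidates.items()}
best = min(scores, key=scores.get)
print(f"Best model: {best} (MAPE = {scores[best]:.1f}%)")
```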

11.1.4 Disk Health Prediction


Disks are the basis of a storage system. Although various redundancy technologies
are used in storage systems, they tolerate the failure of a limited number of disks
while ensuring service running. For example, RAID 5 allows only one disk to fail.
When two disks fail, the storage system stops providing services to ensure data
reliability. Disks are the largest category of consumables in a storage system, so
disk service life is a major concern for many users. SSDs are electronic
components with limited built-in indicators for service life prediction. In addition, the
number of read and write requests delivered by services varies from day to day, which
further complicates disk service life prediction.

You can collect the Self-Monitoring, Analysis and Reporting Technology
(S.M.A.R.T.) information and I/O link information of SSDs, as well as reliability
indicators of SSDs, and feed such information into hundreds of disk failure
prediction models to achieve accurate SSD service life prediction. Intelligent
algorithms predict SSD risks so that failed SSDs can be detected and risky
SSDs replaced in advance, preventing faults and improving system reliability.

Figure 11-4 Working principles of disk health prediction

● Data collection


Disk vendors provide S.M.A.R.T. data for their disks. The S.M.A.R.T. data
indicates the running status of the disks and can help predict risky disks to a
certain extent. However, S.M.A.R.T. data alone cannot guarantee accurate
prediction results. Intelligent technologies are therefore used to dynamically
analyze changes in disk S.M.A.R.T. data, fluctuations in performance indicators,
and disk logs, delivering more accurate prediction results.
● S.M.A.R.T. data
For SSDs, the interfaces provide SCSI log page information that records the
current disk status and performance indicators, such as the grown defect list,
non-medium errors, and read/write/verify uncorrected errors.
● Performance indicators
Workload information such as the average I/O size distribution per minute,
IOPS, bandwidth, and number of bytes processed per day, and performance
indicators such as the latency and average service time
● Disk log
I/O error codes collected by Huawei storage systems, DIF errors, degradation
errors, slow disk information, slow disk cycles, and disk service life
● Feature extraction
Based on massive amounts of historical big data, feature transformation and
feature extraction are automatically performed by using algorithms.
● Analysis platform
– Online training: performs trainings based on the model algorithms, and
performs iteration for a limited number of times to optimize the model
algorithms.
– Machine learning model library: disk failure prediction model.
– Online prediction: uses the optimized training models to predict disk
failures.
● Prediction result
Massive amounts of SSD data are tested and verified so that SSD service life
can be accurately predicted based on the disk failure prediction model.
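
A minimal sketch of the prediction step, assuming S.M.A.R.T., performance, and log data have already been reduced to a feature table per disk: a random-forest classifier (an illustrative stand-in for the disk failure prediction model library) is trained on historical samples and then scores disks currently in service. Feature names and values are made up.

```
# Hypothetical sketch: train a classifier on extracted disk features and predict
# failure risk for disks currently in service. Feature names are illustrative.
from sklearn.ensemble import RandomForestClassifier

# Each row: [grown_defects, uncorrected_errors, avg_latency_ms, wear_percent]
training_features = [
    [0,  0, 0.3, 12],
    [1,  0, 0.4, 35],
    [9,  4, 2.1, 88],
    [14, 7, 3.5, 93],
]
training_labels = [0, 0, 1, 1]        # 1 = disk failed within the observation window

model = RandomForestClassifier(n_estimators=50, random_state=0)
model.fit(training_features, training_labels)

in_service = [[2, 1, 0.6, 47], [11, 5, 2.8, 90]]
risk = model.predict_proba(in_service)[:, 1]
for disk, p in zip(("disk-3", "disk-17"), risk):
    print(f"{disk}: failure risk {p:.2f}")
```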

11.1.5 Device Health Evaluation


Generally, after a device goes online, the customer performs inspections to prevent
device risks. This approach has two disadvantages:

● Inspection frequency
Inspections are performed monthly or quarterly. As a result, customers cannot
detect problems in a timely manner.
● Inspection depth
The customer can only check whether the current device is faulty. The system,
hardware, configuration, capacity, and performance risks are not analyzed.

You can evaluate device health in real time from the system, hardware,
configuration, capacity, and performance dimensions, detect potential risks, and
display device running status based on health scores. In addition, solutions are
provided for customers to prevent risks.


Figure 11-5 Device health evaluation details
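
The scoring idea can be sketched as a weighted aggregation over the five dimensions; the weights, sub-scores, and risk threshold below are purely illustrative and are not DME IQ's actual scoring rules.

```
# Illustrative weighted health score over the five evaluation dimensions.
# Weights, sub-scores, and the risk threshold are made-up numbers.
weights = {"system": 0.25, "hardware": 0.25, "configuration": 0.15,
           "capacity": 0.15, "performance": 0.20}
scores = {"system": 95, "hardware": 88, "configuration": 100,
          "capacity": 72, "performance": 90}

health = sum(weights[d] * scores[d] for d in weights)
print(f"Overall device health score: {health:.1f}/100")
for dimension, score in scores.items():
    if score < 80:
        print(f"Risk detected in the {dimension} dimension (score {score})")
```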

11.1.6 Performance Fluctuation Analysis


Periodic service operations (such as scheduled snapshot) or temporary changes
(such as online upgrade, capacity expansion, and parts replacement) should be
performed during off-peak hours to avoid affecting online services. In the past,
O&M personnel estimated the proper time window based on experience or on
performance indicators from a past period of time.
Based on historical device performance data, you can analyze performance
fluctuations from the aspects of load, IOPS, bandwidth, and latency. Users can
view the service period rules from the four dimensions and select a proper time
window to perform periodic operations on services (such as scheduled snapshot)
or temporary service changes (such as online upgrade, capacity expansion, and
spare parts replacement) to prevent impacts on services during peak hours.


Figure 11-6 Working principles of weekly performance fluctuation analysis

DME IQ calculates the performance indicator values of each hour from Monday to
Sunday based on the device performance data in the past four weeks. For
example, the IOPS is calculated as follows: Sum the IOPS recorded for each hour
slot over the past four weeks and then divide the sum by 4 to obtain the weekly
performance fluctuation pattern. For daily and monthly calculations, the methods
are similar.
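
The weekly averaging described above can be written down directly: for each (weekday, hour) slot, the IOPS samples from the past four weeks are summed and divided by 4. The sample data below is synthetic.

```
# Sketch of the weekly fluctuation calculation: average the IOPS observed in each
# (weekday, hour) slot over the past four weeks. Sample data is synthetic.
from collections import defaultdict
from datetime import datetime, timedelta

start = datetime(2023, 7, 3)          # a Monday
samples = [(start + timedelta(hours=h), 10000 + (h % 24) * 300)
           for h in range(4 * 7 * 24)]          # four weeks of hourly IOPS samples

totals = defaultdict(float)
for ts, iops in samples:
    totals[(ts.weekday(), ts.hour)] += iops

# Each (weekday, hour) slot has exactly four samples, one per week.
weekly_pattern = {slot: total / 4 for slot, total in totals.items()}
print("Average IOPS, Monday 09:00:", weekly_pattern[(0, 9)])
```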

Users can view the service performance statistics by day, week, or month as
required, as shown in the following figure.

Figure 11-7 Weekly performance fluctuation

11.1.7 Performance Exception Detection


The biggest concern of enterprises is whether services can run smoothly. However,
because performance problems are complicated and difficult to identify and
solve in advance, they are often not detected until they worsen, affecting the
services and causing losses to enterprises. The performance exception detection
function addresses this. For service latency, a deep learning algorithm is used to
learn service characteristics from historical performance data. Combining these service
characteristics with industry and Huawei expertise, DME IQ builds device
performance profiles that reveal exceptions in real time, precisely locate faults, and
provide rectification suggestions.


Figure 11-8 Working principles of performance exception detection
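
As a greatly simplified stand-in for the deep-learning approach, the sketch below learns a latency baseline from recent history and flags samples that deviate beyond a fixed number of standard deviations. The threshold and data are illustrative assumptions only.

```
# Simplified stand-in for exception detection: flag latency samples far outside
# the baseline learned from recent history. Threshold and data are illustrative.
import statistics

baseline_latency_ms = [0.42, 0.45, 0.44, 0.43, 0.46, 0.44, 0.45, 0.43]
mean = statistics.mean(baseline_latency_ms)
stdev = statistics.stdev(baseline_latency_ms)

def is_exception(sample_ms: float, n_sigma: float = 3.0) -> bool:
    return abs(sample_ms - mean) > n_sigma * stdev

for sample in (0.46, 0.44, 1.90):
    status = "EXCEPTION" if is_exception(sample) else "normal"
    print(f"latency {sample:.2f} ms -> {status}")
```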

11.1.8 Performance Bottleneck Analysis


After a service goes online, O&M personnel are concerned about whether the
device has high performance pressure and whether the service can run properly.
Due to complicated factors that affect device performance, such as hardware
configuration, software configuration, service type, and performance data,
multiple performance indicators need to be compared and analyzed concurrently
and manually. Therefore, performance pressure evaluation and performance
optimization become big challenges.

Performance bottleneck analysis covers device configurations and performance
data. Device performance pressure is automatically evaluated to provide clear
overall device loads and loads on each component based on Huawei expertise,
identify performance bottlenecks, and provide optimization suggestions.
Customers can make adjustments based on these suggestions to resolve the
bottleneck and ensure stable service running.

Figure 11-9 Working principles of performance bottleneck analysis

Users can view the overall device loads and loads on each component, as shown
in Figure 11-10.


Figure 11-10 Performance bottleneck analysis


12 Ecosystem Compatibility

IT ecosystem infrastructure includes hardware (servers, network devices, and
storage devices) and software (virtualization systems, operating systems, cluster
software, and management and control software), and these software and
hardware products must be compatible with each other. As core infrastructure,
OceanStor Dorado 2000 all-flash storage is compatible with IT software and
hardware products of different vendors, types, and versions.

OceanStor Dorado 2000 all-flash storage supports a large number of scenarios
where different software and hardware products are combined. For details, visit
https://info.support.huawei.com/storage/comp/#/oceanstor-dorado.
12.1 Data Plane Ecosystem Compatibility
12.2 Management and Control Plane Ecosystem Compatibility

12.1 Data Plane Ecosystem Compatibility

12.1.1 Host Operating System


OceanStor Dorado 2000 is compatible with mainstream host operating systems in
the industry (including IBM AIX, HP-UX, Solaris, Red Hat Enterprise Linux, SuSE
Linux Enterprise Server, Oracle Enterprise Linux, Windows Server, NeoKylin, Kylin
(Tianjin), Hunan Kylin, Deepin, Linx-TECH Rocky, and Redflag Linux) and
multipathing software for operating systems (including embedded multipathing
software of the host operating system, Huawei UltraPath, and third-party
multipathing software Veritas DMP). In addition, OceanStor Dorado 2000 provides
active-active storage solutions for mainstream host operating systems.

12.1.2 Host Virtualization System


OceanStor Dorado 2000 is compatible with mainstream host virtualization systems
in the industry (including VMware ESXi, Microsoft Hyper-V, XenServer, Red Hat
RHV, IBM PowerVM (VIOS), Huawei FusionCompute, and NeoKylin advanced
server operating system (OS) (virtualization version)) and multipathing software
(including embedded multipathing software of host virtualization systems and
Huawei UltraPath). In addition, it provides active-active storage solutions.


OceanStor Dorado 2000 also supports various VMware features, including VAAI,
VASA, SRM, vSphere Web Client Plug-in, vRealize Operations, and vRealize
Orchestrator. It is deeply integrated with VMware, providing customers with
comprehensive storage services in VMware virtualization environments.

12.1.3 Host Cluster Software


OceanStor Dorado 2000 all-flash storage supports various host cluster software,
including IBM PowerHA/HACMP, IBM GPFS, IBM DB2 PureScale, HPE ServiceGuard,
Oracle SUN Cluster, Oracle RAC, Windows Server Failover Clustering, and Red Hat
Cluster Suite, providing reliable shared storage services for host services in cluster
scenarios.

12.1.4 Database Software


OceanStor Dorado 2000 supports various database software to meet customers'
requirements for different service applications, including Oracle, DB2, SQL Server,
SAP, GaussDB, Dameng, GBase, and Kingbase.

12.1.5 Storage Gateway


OceanStor Dorado 2000 can be taken over by third-party storage gateways,
including third-party hardware storage gateways (IBM SVC and EMC Vplex), third-
party software storage gateways (DataCore SANsymphony-V and FalconStor CDP/
NSS), and gateways of third-party storage product HDS VSP series that support
heterogeneous storage.

12.1.6 Storage Network


OceanStor Dorado 2000 all-flash storage supports protocols, such as FC, iSCSI, and
TCP/IP, and is compatible with mainstream FC switches and directors (including
Brocade and Cisco), standard Ethernet switches, and mainstream FC HBAs and
standard Ethernet NICs.

12.2 Management and Control Plane Ecosystem Compatibility

12.2.1 Backup Software


OceanStor Dorado 2000 all-flash storage supports mainstream backup software in
the industry and snapshot-based backup solutions, improving backup efficiency,
saving host resources, and ensuring data security. Mainstream third-party backup
software includes IBM TSM, Veeam, Veritas NBU, EISOO, and SCUTECH.

12.2.2 Network Management Software


OceanStor Dorado 2000 all-flash storage supports mainstream network
management protocols and standards (including SNMP, RESTful, and SMI-S), as
well as management software (including IBM Spectrum Control, SolarWinds
Storage Resource Monitor, Microsoft System Center Virtual Machine Manager
(SCVMM), HPE Operations Manager, and BMC Atrium Discovery) in the industry.


In addition, it provides unified O&M management for customer data centers,
reducing O&M costs.

12.2.3 OpenStack Integration


OceanStor Dorado 2000 launches the latest OpenStack Cinder Driver in the
OpenStack community. Vendors of commercial OpenStack versions can obtain and
integrate OpenStack Cinder Driver, allowing their products to support OceanStor
Dorado. In addition, OceanStor Dorado 2000 supports
commercial versions of OpenStack such as Huawei FusionSphere OpenStack, Red
Hat OpenStack Platform, Mirantis OpenStack, and EasyStack.

12.2.4 Container Platform Integration


OceanStor Dorado 2000 releases CSI Plugin and FlexVolume Plugin, and supports
mainstream container management platforms such as Kubernetes and Red Hat
OpenShift.
OceanStor Dorado 2000 releases CDR Plugin to provide backup and recovery
capabilities for containerized applications, ensuring data security of mission-
critical services.


13 More Information

More Information
You can obtain more information about OceanStor Dorado 2000 at the following
sites:
Huawei continuously collects requirements from key customers in major
industries and summarizes typical high-performance storage application
scenarios and the challenges these customers face. This helps Huawei provide
best practices that are tested and verified together with application suppliers.
For best practices, visit the following website:
http://storage.huawei.com/index.html?lang=en
You can also visit Huawei's official website to obtain more information about
Huawei storage:
https://e.huawei.com/en/
For after-sales support, visit the Huawei technical support website:
https://support.huawei.com/enterprise/en/index.html
For pre-sales support, visit the following website:
https://e.huawei.com/en/how-to-buy/contact-us
You can also contact your local Huawei office:
For contact information of the local office, visit http://e.huawei.com.


14 Feedback

Huawei welcomes your suggestions for improving our documentation. If you have
comments, send your feedback to [email protected].
Your suggestions will be seriously considered and we will make necessary changes
to the document in the next release.


15 Acronyms and Abbreviations

Acronym or Abbreviation    Full Spelling

FRU Field Replaceable Unit

FlashLink® FlashLink®

CK Chunk

CKG Chunk Group

DIF Data Integrity Field

RDMA Remote Direct Memory Access

FC Fibre Channel

FTL Flash Translation Layer

GC Garbage Collection

SSD Solid State Disk

LUN Logical Unit Number

OLAP On-Line Analytical Processing

OLTP On-Line Transaction Processing

OP Over-Provisioning

RAID Redundant Array of Independent Disks

RAID-TP Redundant Array of Independent Disks-Triple Parity

SAS Serial Attached SCSI

SCSI Small Computer System Interface

T10 PI T10 Protection Information


VDI Virtual Desktop Infrastructure

VSI Virtual Server Infrastructure

WA Write amplification

Wear Leveling Wear Leveling

TCO Total Cost of Ownership

DC Data Center

DCL Data Change Log

TP Time Point

GUI Graphical User Interface

CLI Command Line Interface

eDevLun External device LUN

FIM front-end interconnect I/O module

SCM storage class memory

FRU field replaceable unit

PI Protection Information

SFP Similar FingerPrint

DTOE Direct TCP/IP Offloading Engine

NUMA non-uniform memory access

ROW redirect-on-write

PID Proportional Integral Derivative

SMP symmetrical multiprocessor system

CSI Container Storage Interface

DCI Data Center Interconnect
