Huawei Servers Troubleshooting 16
Troubleshooting
Issue 16
Date 2019-11-15
and other Huawei trademarks are trademarks of Huawei Technologies Co., Ltd.
All other trademarks and trade names mentioned in this document are the property of their respective
holders.
Notice
The purchased products, services and features are stipulated by the contract made between Huawei and the
customer. All or part of the products, services and features described in this document may not be within the
purchase scope or the usage scope. Unless otherwise specified in the contract, all statements, information,
and recommendations in this document are provided "AS IS" without warranties, guarantees or
representations of any kind, either express or implied.
The information in this document is subject to change without notice. Every effort has been made in the
preparation of this document to ensure accuracy of the contents, but all statements, information, and
recommendations in this document do not constitute a warranty of any kind, express or implied.
Website: https://e.huawei.com
Overview
This document describes how to collect logs, diagnose faults, upgrade software, perform
preventive maintenance and common operations, and collect the information required to
troubleshoot Huawei E9000, E6000, X6000, X8000, X6800, rack, and G Series
Heterogeneous servers.
Intended Audience
This document is intended for:
Symbol Conventions
The symbols that may be found in this document are defined as follows.
Symbol Description
Change History
Issue Date Description
Contents
8 Common Operations
8.1 Obtaining a Product SN
8.2 Using iMana 200 to Collect Information in Batches
8.3 Using iBMC to Collect Information in Batches
8.4 Using the MM910 WebUI to Collect Information in Batches (for Versions Earlier Than U54 2.20)
8.5 Using the MM910 WebUI to Collect Information in Batches (for U54 2.20 or Later)
8.6 Using the FusionDirector WebUI to Collect Information in Batches
8.7 Using the MM510 CLI to Collect Information (FusionServer G5500)
8.8 Logging In to the iMana 200 WebUI
8.9 Logging In to the iBMC WebUI
8.10 Logging In to the Web Tools of the MX510
8.11 Logging In to the MM910 WebUI
8.12 Logging In to the FusionDirector WebUI
8.13 Logging In to the MM510 CLI
8.14 Logging In to the RMC CLI
8.15 Logging In to a Server Over a Network Port by Using PuTTY
8.16 Logging In to a Server Over a Serial Port by Using PuTTY
8.17 Logging In to a Compute Node, Passthrough Module, or Switch Module by Using the SOL Function of the MM910
8.18 Logging In to a Compute Node, Passthrough Module, or Switch Module by Using the SOL Function of the MM920/MM921
8.19 Using WinSCP to Transfer Files
1 Safety Instructions
General Instructions
l Comply with all local laws and regulations when installing the hardware. These Safety
Instructions are only a supplement.
l Observe the instructions that accompany all "DANGER", "WARNING", "CAUTION",
and "NOTE" symbols in this document. Follow them in conjunction with these Safety
Instructions.
l Observe all safety instructions provided on the device labels when installing hardware.
Follow them in conjunction with these Safety Instructions.
l Operations involving high voltages or moving equipment must be performed by
authorized, qualified personnel.
l Take protective measures against radio interference before operating the device in
residential areas.
Personal Safety
l Only personnel certified or authorized by Huawei are allowed to install equipment or its
components.
l Discontinue any dangerous operations and take protective measures. Report anything
that could cause personal injury or equipment damage to a project supervisor.
l Do not move devices or install cabinets and power cables in hazardous weather
conditions.
l The average weight carried by a person cannot exceed the maximum acceptable weight
of lift (MAWL) allowed by local safety regulations. Before moving a device, check the
maximum device weight and arrange required personnel.
l Wear clean protective gloves, ESD clothing, a protective hat, and protective shoes, as
shown in Figure 1-1.
l Before touching devices, put on antistatic clothing and ESD gloves, and remove
conductive objects such as watches and jewelry, as shown in Figure 1-2.
l Exercise caution when using tools that could cause personal injury.
l Use a stacker when lifting hardware above shoulder height.
l Avoid any contact with high-voltage cables.
l Ensure that the device is properly grounded before powering it on.
l Do not use a ladder alone.
l Do not look into optical ports without eye protection.
Equipment Safety
l Use dedicated power cables to ensure equipment and personal safety.
l Use power cables only for dedicated devices.
l When moving a device, hold the handles or bottom of the device. Do not hold the handle
of the installed module, such as a power module, fan module, drive, or mainboard.
l Connect the power cables to separate power distribution units (PDUs) for active/standby
operation.
Transportation Precautions
l The logistics company engaged to transport the equipment must be reliable and comply
with international standards for transporting electronics. Ensure that the equipment being
transported is always kept upright. Take necessary precautions to prevent collisions,
corrosion, package damage, damp conditions and pollution.
l Transport the equipment in its original packaging.
l If original packages are not used, package heavy, bulky items (such as chassis and
compute nodes) and fragile components (such as PCIe GPUs and SSDs and optical
modules) separately.
Use the Intelligent Computing Compatibility Checker to search for components supported by the
compute nodes or servers.
l Power off all equipment before transportation. Do not transport hazardous materials.
To reduce the risk of personal injury, comply with local regulations with regard to the
maximum weight one person is permitted to carry.
Table 1-1 lists the maximum weight each person is permitted to carry by standards
organization.
2 Troubleshooting Process
Troubleshooting is the process of finding the cause of a fault and rectifying it. The guiding
principle is to narrow down the range of possible causes, which reduces troubleshooting
complexity and makes it easier to identify the root cause.
Figure 2-1 shows the recommended troubleshooting process.
Step                             Description
3 Preparing for Troubleshooting  Prepare the manuals and tools required for fault diagnosis and rectification.
9.1 Obtaining Technical Support  If a fault is difficult to locate or rectify after you refer to documents, contact Huawei technical support.
Scenarios
This section describes how to prepare for troubleshooting.
Essential Materials
Table 3-1 lists the materials that you must read before routine maintenance for Huawei
servers.
User Guide
  Describes the server structure, specifications, and installation method. Each Huawei server has a user guide or maintenance and service guide.
  1. Log in to the Support > Intelligent Servers or Support > AI Computing Platform page.
  2. Choose a server model to access the product page.
  3. On the Documentation tab page, choose Operation & Maintenance.
  4. View the required user guide or maintenance and service guide.
Equipment Room Management Regulations
  Describes the regulations for equipment room management and routine maintenance.
  Comply with the customer's equipment room management regulations during onsite maintenance.
Software Tools
Table 3-2 lists the software tools required for routine maintenance of Huawei servers.
FusionServer Tools 2.0 SmartKit
  See the FusionServer Tools 2.0 SmartKit User Guide.
  Used for new site deployment and delivery, troubleshooting, and firmware upgrade. Download link: FusionServer Tools
PuTTY
  All Huawei servers of all versions
  Third-party tool used for remote access. You can obtain the tool from the Internet.
WinSCP
  All Huawei servers of all versions
  Third-party tool used for file transfer for iMana 200/iBMC or the management module. You can obtain the tool from the Internet.
WFTPD
  All Huawei servers of all versions
  Third-party tool used for file transfer for the Ethernet switching plane of a switch module. You can obtain the tool from the Internet.
CoreFTPServer/mini-sftp-server
  All Huawei servers of all versions
  Third-party tools used for file transfer for the FC switching plane of a switch module. You can obtain the tool from the Internet.
Hardware Tools
Table 3-3 lists the hardware tools required for routine maintenance of Huawei servers.
Tool                        Description
Floating nut hook           Used to guide floating nuts to the holes in the mounting bars of a rack.
ESD wrist strap             Used to prevent ESD damage when you touch or operate devices or components.
Serial cable                Used to connect the serial port on the server. The serial port is usually a DB9 or RJ45 port.
Thermometer and hygrometer  Used to measure the equipment room temperature and relative humidity.
4 Collecting Information
OS and Service Software Version
  Example: SLES 11 SP1 64-bit or Oracle 10.2. (Consider the fault symptom to determine whether to collect the OS and service software versions.)
Action Before Fault Occurrence
  Example: BIOS settings configuration, memory capacity expansion, network settings modification.
Action and Result After Fault Occurrence (Optional)
  Example: After the power cable is disconnected and then reconnected, the fault persists.
  After the DVD-ROM is replaced, the fault persists.
...
NOTICE
Table 4-2 describes the methods for collecting logs of different OSs.
OS Collection Method
Linux For details, see the FusionServer Tools 2.0 SmartKit User Guide.
VMware l If the purple screen of death (PSOD) does not occur, perform the following
steps:
1. Log in to the ESX server console as the root user.
2. Run the vm-support command to collect all VMware logs.
3. After logs are collected, check that a log file in the esxsupport-YYYY-
[email protected] format is generated in the /var/tmp
directory.
l If the PSOD occurs and the customer retains the site environment, perform
the following steps:
1. Capture a screenshot of the PSOD or take a photo to save the displayed
information.
2. Press Alt+F12 to switch to forcible memory information output mode,
and press Alt+PageUp/Alt+PageDown to capture screenshots and
photos. Ensure that screenshots and photos of the last several screens are
captured after the PSOD occurs.
3. Hot-restart the system, and run the vm-support command to collect all
VMware logs.
4. After logs are collected, check that a log file in the esxsupport-YYYY-
[email protected] format is generated in the /var/tmp
directory.
l If the PSOD occurs and the customer hot-restarts the system, run vm-
support to collect all of the VMware logs and check that a log file in the
[email protected] format is generated in
the /var/tmp directory.
FreeBSD Log in to the OS CLI over SSH and copy all files in /var/log/.
Copy the messages file and all files prefixed with messages (for example,
messages.0) in /var/log/ before copying other files.
Solaris Log in to the OS CLI over SSH and copy all files in the /var/log/ directory
and /var/adm/ directory.
Copy the syslog file and all files prefixed with syslog (for example, syslog.0)
in /var/log/, and copy the messages file and files prefixed with messages (for
example, messages.0) in /var/adm/ before copying other files.
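The copy order described for FreeBSD and Solaris (priority log files first, then everything else) can be sketched as follows. This is a minimal illustration only: the function name copy_order and the sample file list are hypothetical, and the sketch only sorts path strings; it does not copy anything.

```python
from pathlib import PurePosixPath

# (directory, file name prefix) pairs that must be copied first, per the table above.
FREEBSD_PRIORITY = [("/var/log", "messages")]
SOLARIS_PRIORITY = [("/var/log", "syslog"), ("/var/adm", "messages")]


def copy_order(paths, priority):
    """Return paths with priority log files first; relative order is otherwise kept."""
    def is_priority(path):
        p = PurePosixPath(path)
        return any(
            str(p.parent) == directory and p.name.startswith(prefix)
            for directory, prefix in priority
        )

    first = [p for p in paths if is_priority(p)]
    rest = [p for p in paths if not is_priority(p)]
    return first + rest


# Hypothetical file list gathered from a Solaris host.
files = [
    "/var/log/authlog",
    "/var/log/syslog.0",
    "/var/adm/wtmpx",
    "/var/adm/messages",
    "/var/log/syslog",
]
ordered = copy_order(files, SOLARIS_PRIORITY)
```

Copying the priority files first reduces the chance that they are rotated or overwritten while the remaining files are being transferred.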
NOTICE
You can use one of the following methods to collect hardware logs:
l Use SmartKit to collect server hardware information in batches. For details about the
supported servers and operations, see section "Using SmartKit > Collecting Server Logs"
in the FusionServer Tools 2.0 SmartKit User Guide.
l Use iBMC to collect hardware logs of a single server. For details, see 8.3 Using iBMC
to Collect Information in Batches.
l Use iMana 200/iBMC to collect hardware logs. For details, see 8.2 Using iMana 200
to Collect Information in Batches or 8.3 Using iBMC to Collect Information in
Batches.
l Use SmartKit to collect hardware logs and Windows/Linux logs. For details, see the
FusionServer Tools 2.0 SmartKit User Guide.
Procedure
Step 1 Connect the Ethernet port of the PC to the management network ports of the active and
standby MM910 modules over the LAN. Figure 4-1 shows the network connection.
NOTICE
l The MGMT port on the MM910 panel is the management network port.
l If the active MM910 MGMT port has been connected to the network by using a network
cable and the client needs to be directly connected to the MM910, do not directly
disconnect the network cable from the active MM910 MGMT port. Otherwise, an active/
standby MM910 switchover will be triggered, which may cause network interruption. You
are advised to connect the client to the active MM910 STACK port in the chassis by using
a network cable. If the active MM910 STACK port has been connected to the MGMT port
in another chassis, use an idle active MM910 STACK port in another chassis.
l In V2.25 and earlier versions, the MM910 MGMT port is accessed by the external network through
the 2X and 3X switch modules by default. In this case, do not connect the MM910 MGMT port and
the switch module network ports to the same network. Otherwise, a network storm will occur and
the network connection will be interrupted.
You are advised to run the smmset -d outportmode -v 1 command on the CLI to provide the
MM910 MGMT port for the external network.
l In V2.26 and later versions, the MM910 MGMT port is provided as the default management
network port for the external network.
Step 2 Use an SSH tool and the MM910 floating IP address to connect to the MM910 CLI.
For details about how to use PuTTY for SSH login, see 8.15 Logging In to a Server Over a
Network Port by Using PuTTY.
Step 3 to Step 5 configure the IP address and routing information for the management network port of
the Ethernet switching plane. If the IP address and routing information of the management network port
have been configured, skip Step 3 to Step 5.
Step 3 (Optional) Run the following command to query the IP address of the management network
port of the Ethernet switching plane:
smmget -l swiN:fruM -d swipcontrol
The parameters are described as follows:
l N indicates the slot number of the switch module. The value range is 1 to 4, mapping to
logical slot numbers 1E, 2X, 3X, and 4E from left to right on the panel respectively.
l M indicates the ID of the switching plane. The value for the Ethernet switching plane is
2.
For stacked switching planes, configure the gateway only for the master switching plane.
----End
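As an illustration of how the N and M parameters of the smmget query fit together, the following sketch builds the command string for a given slot. The helper name smmget_swipcontrol is hypothetical; only the command syntax, the slot range, and the slot-name mapping come from this section.

```python
# Slot number -> logical slot name, from left to right on the panel.
SLOT_NAMES = {1: "1E", 2: "2X", 3: "3X", 4: "4E"}

# ID of the Ethernet switching plane (the M parameter).
ETHERNET_PLANE_ID = 2


def smmget_swipcontrol(slot):
    """Build the query command for the Ethernet switching plane in the given slot."""
    if slot not in SLOT_NAMES:
        raise ValueError(f"slot must be one of {sorted(SLOT_NAMES)}, got {slot!r}")
    return f"smmget -l swi{slot}:fru{ETHERNET_PLANE_ID} -d swipcontrol"


command = smmget_swipcontrol(3)  # switch module in logical slot 3X
```

The resulting string is meant to be pasted into the MM910 CLI session; the sketch does not open any connection itself.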
Prerequisites
l The switch modules have been powered on.
l For logging in to the Ethernet switching plane over SSH, the default username is root
and the default password is Huawei12#$.
l By default, the MM910 username is root and the password is Huawei12#$.
l You are familiar with the parameters required for this operation.
Procedure
Step 1 Connect the PC to the Ethernet switching plane.
For details, see 4.4.1.1 Connecting a PC to the Ethernet Switching Plane.
Step 2 Log in to the CLI of the Ethernet switching plane by using the SOL function of the MM910.
For details about SOL login, see 8.17 Logging In to a Compute Node, Passthrough
Module, or Switch Module by Using the SOL Function of the MM910.
Step 3 Run the following command to query the version of the Ethernet switching plane:
display version
l Information similar to the following is displayed:
BoardName : CX910
CPLD Version : 003
PCB Version : VER.A
Bootrom Version : 008
Creation Time : Sep 17 2012, 09:53:25
Backup Bootrom Version : 008
Creation Time : Sep 17 2012, 09:53:25
Switch Version : 1.1.0.200.3
Creation Time : Oct 17 2012, 17:10:28
Backup Switch Version : 1.1.0.200.3
FC BoardName : UNKNOWN
FC PCB Version : UNKNOWN
If the command output contains Software Version, the software version is V8.
----End
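The version check in Step 3 can be sketched as a simple text test on saved command output. The function name is hypothetical, and the sketch deliberately reports only "V8" or "not V8", because this section states the rule for V8 only.

```python
def is_v8_switching_plane(display_version_output):
    """Return True if the saved 'display version' output indicates a V8 platform.

    Rule taken from this section: the output contains "Software Version".
    """
    return "Software Version" in display_version_output


# Excerpt of the sample output shown above (it has no "Software Version" field).
sample = """BoardName : CX910
CPLD Version : 003
Switch Version : 1.1.0.200.3
"""
result = is_v8_switching_plane(sample)
```

For the sample above the check returns False, so the V8 collection procedure would not apply to that module.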
4.4.2.3 Using the V5 Switch Module CLI to Collect Ethernet Switching Plane
Information
Operation Scenario
Use the E9000 server switch module CLI of the V5 platform to collect Ethernet switching
plane information, including:
l Logs
l Debugging information
l Trap information
For details about how to query the Ethernet switching plane version, see 4.4.1.2 Querying the
Software Version of the Ethernet Switching Plane.
Prerequisites
Conditions
Data
The default username of the switching plane is root, and the default password is Huawei12#$.
You can query and set IP addresses of all modules. For details, see 8.11 Logging In to the MM910
WebUI.
l For the MM910 versions earlier than (U54) 2.20, choose System Management > Network
Management > xx > IP addresses.
l For the MM910 (U54) 2.20 or later, choose Chassis Settings > Network Settings > xx.
Software Tools
wftpd32.exe: used to transfer files between different platforms, for example, from a PC to a
switch module. This tool is a free third-party tool. You can obtain it from the Internet.
Procedure
Step 1 Configure the FTP server.
For details about the configuration operations, see 8.20 Configuring an FTP Server.
Step 2 Configure the IP address of the management network port.
1. After logging in to the switch module by using a serial port or the SOL function, run the
following commands on the switching plane CLI to query and set the IP address of the
management network port so that the switch module can properly communicate with the
FTP server:
Skip this step if you log in to the switch module by using a network port.
<Fabric>system-view
[Fabric]interface MEth 0/0/1
[Fabric-MEth0/0/1]ip address 192.168.100.123 24
[Fabric-MEth0/0/1]display this
#
interface MEth0/0/1
ip address 192.168.100.123 255.255.255.0
#
return
[Fabric-MEth0/0/1]quit
[Fabric]quit
2. If the configured IP address and the FTP server address are not on the same network
segment, run the following command on the HMM CLI to configure a gateway for the
switching plane:
smmset -l swiN:fruM -d route -v targetvalue maskvalue gatewayvalue
The parameters are described as follows:
– N indicates the slot number of the switch module. The value range is 1 to 4,
mapping to logical slot numbers 1E, 2X, 3X, and 4E from left to right on the panel
respectively.
– M indicates the ID of the switching plane. The value for the Ethernet switching
plane is 2.
– targetvalue: indicates the target network segment IP address of the switching plane.
– maskvalue: indicates the subnet mask of the switching plane.
– gatewayvalue: indicates the gateway IP address of the switching plane.
For example, if the gateway IP address is 192.168.112.1, run the following command:
smmset -l swi3:fru2 -d route -v 0.0.0.0 0.0.0.0 192.168.112.1
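The decision in this step (a gateway is needed only when the switch module and the FTP server are on different network segments) can be sketched with Python's standard ipaddress module. The helper names are hypothetical; the command syntax and the example addresses are the ones used in this section.

```python
import ipaddress


def needs_gateway(switch_ip, prefix_len, ftp_server_ip):
    """Return True if the FTP server is outside the switch module's network segment."""
    network = ipaddress.ip_interface(f"{switch_ip}/{prefix_len}").network
    return ipaddress.ip_address(ftp_server_ip) not in network


def smmset_route(slot, gateway, target="0.0.0.0", mask="0.0.0.0", plane_id=2):
    """Build the HMM CLI command that configures a gateway for a switching plane."""
    if slot not in (1, 2, 3, 4):
        raise ValueError("slot must be in the range 1 to 4")
    for value in (target, mask, gateway):
        ipaddress.IPv4Address(value)  # raises ValueError on a malformed address
    return f"smmset -l swi{slot}:fru{plane_id} -d route -v {target} {mask} {gateway}"


# 192.168.100.123/24 is the management address set above; 200.1.1.126 is the
# FTP server address used in the login example of this procedure.
if needs_gateway("192.168.100.123", 24, "200.1.1.126"):
    command = smmset_route(3, "192.168.112.1")
```

Validating the addresses before building the command catches typing errors early, before anything is entered on the HMM CLI.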
Step 3 Obtain the log information.
1. Run the following command to collect logs.
<Fabric>display diagnostic-information diag-info.txt
Now saving the diagnostic information to the device
<Fabric>save logfile
Save log file successfully.
<Fabric>dir flashvx:/logfile/
Directory of flashvx:/logfile/
  Idx  Attr  Size(Byte)  Date         Time(LMT)  FileName
    0  -rw-   2,939,200  Apr 01 2000  23:55:02   log.dblg
    1  -rw-      95,988  Jan 07 2014  19:16:00   2014-01-07.19-13-54.log.zip
    2  -rw-     172,081  Jan 07 2014  21:35:14   2014-01-07.21-31-56.log.zip
    3  -rw-   2,716,484  Jan 23 2014  01:35:24   log.log
    4  -rw-   4,589,648  Jan 17 2014  12:30:48   2000-04-01.23-55-08.dblg
3. Enter the IP address, username, and password to log in to the FTP server. In the
following example, the FTP server address is 200.1.1.126 and the username is root.
<Fabric>ftp 200.1.1.126
Trying 200.1.1.126 ...
Press CTRL+K to abort
Connected to 200.1.1.126.
220 WFTPD 2.0 service (by Texas Imperial Software) ready for new user
User(200.1.1.126 none):root
331 Give me your password, please
Enter password:
230 Logged in successfull
[ftp]
The IP address of the FTP server is configured by the user and is on the same network segment as
the management IP address of the switch module.
4. Set the FTP transfer mode to binary.
[ftp]binary
5. Obtain the log file.
[ftp]put flash:/diag-info.txt
200 PORT command okay
150 "F:\diag-info.txt" file ready to receive in IMAGE / Binary mode
226 Transfer finished successfully.
FTP: 148848 byte(s) sent in 0.280 second(s) 531.60Kbyte(s)/sec.
[ftp]lcd flashVX:/logfile
The current local directory is flashVX:/logfile.
[ftp]mput *
Error: The file name . is invalid.
Error: The file name .. is invalid.
[ftp]quit
----End
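To confirm on the PC that every log file listed by the dir command has arrived on the FTP server, the listing can be parsed into file names and sizes. The sketch below is an illustration under one assumption: each entry of the saved dir output occupies a single line. The function name parse_dir_listing is hypothetical.

```python
import re

# One entry: index, attributes, size with thousands separators, date, time, file name.
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+([\d,]+)\s+(\w{3}\s+\d{2}\s+\d{4})\s+"
    r"(\d{2}:\d{2}:\d{2})\s+(\S+)\s*$"
)


def parse_dir_listing(text):
    """Return (file name, size in bytes) pairs for each entry line in dir output."""
    entries = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            size = int(match.group(3).replace(",", ""))
            entries.append((match.group(6), size))
    return entries


listing = """Directory of flashvx:/logfile/
Idx Attr Size(Byte) Date Time(LMT) FileName
0 -rw- 2,939,200 Apr 01 2000 23:55:02 log.dblg
1 -rw- 95,988 Jan 07 2014 19:16:00 2014-01-07.19-13-54.log.zip
"""
entries = parse_dir_listing(listing)
```

The parsed names and sizes can then be compared with the contents of the FTP directory on the PC.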
4.4.2.4 Using the V8 Switch Module CLI to Collect Ethernet Switching Plane
Information
Operation Scenario
Use the E9000 server switch module CLI of the V8 platform to collect Ethernet switching
plane information, including:
l Logs
l Debugging information
l Trap information
For details about how to query the Ethernet switching plane version, see 4.4.1.2 Querying the
Software Version of the Ethernet Switching Plane.
Prerequisites
Conditions
Data
The default username of the switching plane is root, and the default password is Huawei12#$.
You can query and set IP addresses of all modules. For details, see 8.11 Logging In to the MM910
WebUI.
l For the MM910 versions earlier than (U54) 2.20, choose System Management > Network
Management > xx > IP addresses.
l For the MM910 (U54) 2.20 or later, choose Chassis Settings > Network Settings > xx.
Software Tools
wftpd32.exe: used to transfer files between different platforms, for example, from a PC to a
switch module. wftpd32.exe is a free third-party tool. You can obtain it from the Internet.
Procedure
Step 1 Configure the FTP server.
Step 2 After logging in through the serial port or SOL function, run the following commands on the
Ethernet switching plane CLI to check whether the management network port IP address has
been configured.
Skip this step if you log in to the switch module by using a network port.
<HUAWEI>system-view
[~HUAWEI]interface MEth 0/0/0
[~HUAWEI-MEth0/0/0]display this
Step 3 (Optional) After logging in to the switch module by using a serial port or the SOL function,
run the following commands on the Ethernet switching plane CLI to query and set the IP
address of the management network port so that the switch module can properly communicate
with the FTP server:
Skip this step if you log in to the switch module by using a network port.
<HUAWEI>system-view
[~HUAWEI-MEth0/0/0]quit
[~HUAWEI]quit
Step 4 Obtain the log information.
1. View the log file system.
<HUAWEI>system-view
Enter system view, return user view with return command.
[~HUAWEI]diagnose
Warning: Enter diagnose view, return user view by pressing Ctrl+Z.
Info: The diagnose view is used to debug system hardware and software. Misuse
of some commands in this view will affect system performance. Therefore, use
these commands with the guidance of Huawei engineers.
[~HUAWEI-diagnose]return
<HUAWEI>save logfile
Info: Save logfile successfully.
<HUAWEI>dir
Directory of flash:/
vrpcfg.zip
1,048,576 KB total (367,972 KB free)
<HUAWEI>dir logfile/
Directory of flash:/logfile/
--------------------------------------------------------------------------------
2 Standby dcd2-fcf8-5600 100 CX910 2X/300
Role specifies the switch module role. The value can be Master, Standby, or Slave,
indicating the primary switch module, standby switch module, and slave switch module
respectively. Bay in Bay/Chassis indicates the switch module slot number.
3. Obtain the log file.
<HUAWEI>ftp 192.168.100.122
Trying 192.168.100.122 ...
Press CTRL+K to abort
Connected to 192.168.100.122.
220 WFTPD 2.0 service (by Texas Imperial Software) ready for new user
User(192.168.100.122:(none)):huawei
331 Give me your password, please
Enter password:
230 Logged in successfully
[ftp]binary
200 Type is Image (Binary)
# On the FTP server, create a log receiving directory for the master switch module in the
stack. In this example, the number 3 in swi3 indicates the stack ID (same as the slot
number) of the master switch module. (If the switch modules are not stacked, create a
log receiving directory for the current switch module. The number 3 in swi3 indicates the
slot number of the current switch module.)
[ftp]mkdir swi3
[ftp]cd swi3
[ftp]put flash:/diag-info.txt
200 Port command successful.
150 Opening data connection for diag-info.txt.
/ 100% [***********]
226 File received ok
FTP: 1756870 byte(s) send in 0.308 second(s) 5570.431Kbyte(s)/sec.
[ftp]mput flash:/logfile/*
200 Port command successful.
150 Opening data connection for diag.log.
/ 100% [***********]
226 File received ok
[ftp]cd ..
# On the FTP server, create a log receiving directory for the standby or slave switch
module in the stack. In this example, the number 2 in swi2 indicates the stack ID (same
as the slot number) of the standby or slave switch module. (If the switch modules are not
stacked, log in to each switch module and repeat the preceding log collection procedure.)
[ftp]mkdir swi2
[ftp]cd swi2
[ftp]mput 2#flash:/logfile/*
[ftp]cd ..
[ftp]quit
221 Windows FTP Server (WFTPD, by Texas Imperial Software) says goodbye
<HUAWEI>
– When you use the mput command in the FTP CLI, 2#flash:/ indicates the flash root directory
of the switch module with the stack ID 2. You can obtain the stack ID and role information by
using the display stack command.
– The flash root directory of the master switch module in a stack is flash:/.
– If multiple switch modules are displayed after running the display stack command, obtain the
log file of each switch module in the logfile directory.
4. View the log file in the FTP directory on the PC.
----End
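The per-member FTP commands used in Step 4 follow a fixed pattern, which the following sketch generates from the stack IDs and roles reported by the display stack command. The function name and the input format are hypothetical. The sketch mirrors the example above, in which diag-info.txt is uploaded only from the master switch module.

```python
def ftp_collection_commands(members):
    """Build FTP CLI commands for each stack member.

    members: list of (stack_id, role) tuples, where role is "Master",
    "Standby", or "Slave" as shown by the display stack command.
    """
    commands = []
    for stack_id, role in members:
        # The master's flash root is flash:/; other members use <stack ID>#flash:/.
        root = "flash:/" if role == "Master" else f"{stack_id}#flash:/"
        commands.append(f"mkdir swi{stack_id}")
        commands.append(f"cd swi{stack_id}")
        if role == "Master":
            commands.append("put flash:/diag-info.txt")
        commands.append(f"mput {root}logfile/*")
        commands.append("cd ..")
    return commands


commands = ftp_collection_commands([(3, "Master"), (2, "Standby")])
```

Generating the list from the display stack output helps ensure that no stack member is skipped when a stack contains several switch modules.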
4.4.2.5 Using the Web Tools Page of a Switch Module to Collect FC Switching
Plane Information (MX510)
Operation Scenario
Use the Web Tools page of a switch module (MX510) to collect information about the FC
switching plane.
Prerequisites
Conditions
l The connection between the management IP address of the FC switch module and the
server IP address is normal.
l You have logged in to the MX510 Web Tools page. For details, see 8.10
Logging In to the Web Tools of the MX510.
Data
IP address 192.168.1.100
For exporting the dump_support log file, the username is images, and the default password
is Huawei12#$.
Procedure
Step 1 On Web Tools, choose Switch > Download Support File, as shown in Figure 4-2.
Step 2 Select the directory for storing the log file, and click Start.
The log file download starts. If "Support file saved" is displayed in the Status area, the log
file has been successfully exported, as shown in Figure 4-3.
----End
4.4.2.6 Using the Switch Module CLI to Collect FC Switching Plane Information
(MX510)
Operation Scenario
Use the CLI of a switch module (MX510) to collect FC switching plane information.
Prerequisites
Conditions
l The PC has been connected to the management network port of the server by using a
network cable.
l You have obtained mini-sftp-server.exe.
If the MX510 firmware version is earlier than 9.8.2.6.0, you can use the FTP tool WFTPD to collect
information. For details, see 8.20 Configuring an FTP Server.
Data
IP address 192.168.1.100
The default username of the switching plane is admin, and the default password is
Huawei12#$.
Software Tools
mini-sftp-server.exe: used to transfer files between different platforms, for example, from a
switch module to a PC. mini-sftp-server.exe is a free third-party tool. You can obtain it from
the Internet.
Procedure
Step 1 Configure an SFTP server.
For details about how to access the FC switching plane CLI, see 8.15 Logging In to a Server
Over a Network Port by Using PuTTY or 8.17 Logging In to a Compute Node,
Passthrough Module, or Switch Module by Using the SOL Function of the MM910.
----End
4.4.2.7 Using the Switch Module CLI to Collect FC Switching Plane Information
(MX210/MX220)
Operation Scenario
Use the CLI of a switch module (MX210/MX220) to collect FC switching plane information.
This section applies to the CX210, CX220, CX912, and CX916. The FC switching planes of
the CX210 and CX912 are the MX210, and those of the CX220 and CX916 are the MX220.
Prerequisites
Conditions
l The PC has been connected to the management network port of the server by using a
network cable.
l You have obtained mini-sftp-server.exe.
Data
IP address 10.77.77.77
The default username of the switching plane is admin, and the default password is
Huawei12#$.
Software Tools
mini-sftp-server.exe: used to transfer files between different platforms, for example, from a
switch module to a PC. mini-sftp-server.exe is a free third-party tool. You can obtain it from
the Internet.
Procedure
Step 1 Configure an SFTP server.
For details about how to access the FC switching plane CLI, see 8.15 Logging In to a Server
Over a Network Port by Using PuTTY or 8.17 Logging In to a Compute Node,
Passthrough Module, or Switch Module by Using the SOL Function of the MM910.
Step 3 Run the ipaddrset command to set the management IP address and then run the ipaddrshow
command to check whether the IP address is correct.
l IPv4
FC_SW:admin> ipaddrset
Ethernet IP Address [10.77.77.77]:10.32.53.47
Ethernet Subnetmask [255.255.255.0]:255.255.240.0
Fibre Channel IP Addresss [none]:
Fibre Channel Subnetmask [none]:
Gateway IP Address [0.0.0.0]:10.32.48.1
DHCP [Off]:
IP address is being changed...Done.
FC_SW:admin> ipaddrshow
Ethernet IP Address: 10.32.53.47
Ethernet Subnetmask: 255.255.240.0
Fibre Channel IP Addresss: none
Fibre Channel Subnetmask: none
Gateway IP Address 10.32.48.1
DHCP: Off
l IPv6
FC_SW:admin> ipaddrset -ipv6 --add fd00:60:69bc:82:205:33ff:fed7:f6fe/64
IP address is being changed...Done.
FC_SW:admin> ipaddrshow
SWITCH
Ethernet IP Address: 10.20.24.55
Ethernet Subnetmask: 255.255.240.0
Gateway IP Address: 10.20.16.1
DHCP: Off
IPv6 Autoconfiguration Enabled: No
Local IPv6 Addresses:
static fd00:60:69bc:82:205:33ff:fed7:f6fe/64 preferred
IPv6 Gateways: fe80:21b:3dff:fe0b:7800 fe80:21b:edff:fe0b:2400
The current environment uses IPv4 addresses. You do not need to set the IPv6 address.
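Before entering values at the ipaddrset prompts, it can help to confirm that the gateway lies inside the subnet defined by the IP address and subnet mask. The sketch below uses Python's standard ipaddress module; the function name is hypothetical, and the addresses are the ones from the IPv4 example above.

```python
import ipaddress


def gateway_is_reachable(address, netmask, gateway):
    """Return True if the gateway is in the same subnet as the management address."""
    interface = ipaddress.IPv4Interface(f"{address}/{netmask}")
    gateway_ip = ipaddress.IPv4Address(gateway)
    # The gateway must be in the subnet and must differ from the module's own address.
    return gateway_ip in interface.network and gateway_ip != interface.ip


ok = gateway_is_reachable("10.32.53.47", "255.255.240.0", "10.32.48.1")
```

With the mask 255.255.240.0, the address 10.32.53.47 belongs to the network 10.32.48.0/20, which contains the gateway 10.32.48.1. With a mask of 255.255.255.0, the same gateway would fall outside the subnet.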
Step 2 Choose Compute > Hardware > Add Device > Add Online to add the chassis of the
MM920/MM921.
Step 6 Select the switch modules whose logs you want to export and click OK.
After the task is complete, decompress the downloaded package to obtain switch module logs.
----End
Collecting QLogic HBA logs has no adverse impact on services. The following describes how
to collect QLogic HBA logs on mainstream OSs. You can download the scripts from the
official QLogic support website.
NOTICE
FusionServer Tools Toolkit and Smart Provisioning can be used only after services on the
server are stopped. Notify the customer to back up data before using the tools.
E9000 For details, see the FusionServer Pro E9000 Server V100R001
HMM Alarm Handling.
To check switch module alarms, run the following commands on
the Ethernet switching plane:
l display trapbuffer
l display alarm active
l display alarm history
NOTE
For details about how to log in to the Ethernet switching plane of a switch
module, see 8.15 Logging In to a Server Over a Network Port by Using
PuTTY, 8.16 Logging In to a Server Over a Serial Port by Using
PuTTY, or 8.17 Logging In to a Compute Node, Passthrough Module,
or Switch Module by Using the SOL Function of the MM910.
E6000 For details, see the E6000 Server V100R002 Alarm Reference.
Rack server For details, see the FusionServer Pro Rack Server iBMC Alarm
Handling.
X6000 For details, see the FusionServer Pro X6000 Server iBMC (Earlier
than V250) Alarm Handling or X6000 Server Alarm Handling
(iMana 200).
X8000 For details, see the X8000 Server V100R001 Alarm Reference.
X6800 For details, see the FusionServer Pro X6800 Server iBMC (Earlier
than V250) Alarm Handling.
G2500 For details, see the G2500 Server 1.0.0 iBMC Alarm Handling.
FusionServer G5500 For details, see the FusionServer G5500 Server Alarm Handling.
Atlas 800 AI server (model 3010) For details, see the Atlas 800 AI Server (Model 3010) iBMC Alarm
Handling.
Process
Figure 5-2 shows the process for checking the indicators.
Step 2 View iMana 200 or iBMC system event logs (SELs) to locate faults.
Steady green, blinking green, or off    Steady yellow    The drive is faulty.    Log in to the iMana 200 or iBMC and use FusionServer Tools Toolkit or Smart Provisioning to check for any drive faults.
The NVMe PCIe SSD indicators are available only on high-density servers, specific rack servers
(RH1288 V3, RH2288 V3, RH2288H V3, RH5288 V3, RH5885 V3, RH5885H V3, and RH8100 V3),
and CH225 V3 compute nodes in the E9000 blade server.
Steady green or off    Steady yellow    The NVMe PCIe SSD is faulty.    Reseat the NVMe PCIe SSD. If the problem persists, replace the SSD.
----End
Indicators Available Only on the RH5885 V2, RH5885 V3, and RH5885H V3
Table 5-9 Module indicators on the RH5885 V2, RH5885 V3, and RH5885H V3
Indicator: Memory riser fault indicator
Status: Steady red
Meaning: A DIMM on the memory riser is faulty.
Diagnosis: Locate the faulty DIMM according to the DIMM fault locating indicator, and replace the faulty DIMM with a spare one.

Indicator: DIMM fault locating indicator
Status: Steady red
Meaning: The DIMM is faulty.
Diagnosis: After you remove the memory riser and hold down the DIMM fault locating button, the indicator of the faulty DIMM turns on.

Indicator: Diagnostic panel on the RH5885 V2 server
Status: Steady green
Meaning: A fault alarm is generated for the server component.
Diagnosis: For details, see 2.5.1 "Components on the Front Panel" and 2.5.2 "Indicators and Buttons" in the RH5885 V2 Server (8S) V100R001C02 User Guide.

Indicator: Fault diagnosis panel on the RH5885 V3 server
Status: Steady red
Meaning: A fault alarm is generated for the server component.
Diagnosis: For details, see 2.4 "Indicators and Buttons" in the RH5885 V3 Server V100R003 User Guide.

Indicator: RH8100 V3 fan module indicator
Status: Steady green
Meaning: The fan module hardware or backplane is faulty or the fan module software is performing an online upgrade. (An online upgrade takes about 3 minutes.)
Diagnosis: Check whether the fan module hardware or backplane is faulty and whether the fan module software is performing an online upgrade.

Indicator: Memory riser ATTN indicator
Status: Steady yellow
Meaning: The hot insertion or removal operation has failed.
Diagnosis: Check whether services can be migrated or stopped. After services are stopped, power off and then power on the server.
l If the indicator is off, attempt to hot-swap the memory riser again. If hot swap fails again, replace the memory riser and DIMMs on it.
l If the indicator is steady yellow, replace the memory riser and DIMMs on it.
A switch module
that cannot be
stacked is operating
properly.
A switch module
that cannot be
stacked is being
powered on.
RH8100 V3 (8P) One CPU in the CPU1 Dual system mode (one
socket PSU in any slot)
RH8100 V3 (dual-system One CPU in the CPU1 slot Dual system primary 4P
primary 4P) (one PSU in any slot)
One memory board in slot
1
RH8100 V3 (dual system One CPU in CPU5 slot Dual system secondary
secondary 4P) 4P (one PSU in any slot)
One memory board in slot
9
l If a fault can be located using logs or tools, see "Handling Procedure". If a fault needs to be rectified
quickly onsite, see "Quick Recovery Method".
l For more fault symptoms and solutions, see the Intelligent Computing Case Library. The
Intelligent Computing Product Case Query Assistant is available only to Huawei partners and
Huawei engineers.
Fault Symptom: A PSU is faulty (the PSU has no power output and the health indicator is blinking red).
Handling Procedure:
1. Check the PSU indicator and record any alarms on the iMana 200 or iBMC WebUI. For details, see 5.5 Checking Indicators to Locate Faults.
NOTE
l For E9000 servers, record alarms on the MM910 WebUI.
2. Check whether an "AC lost" alarm is generated.
l If yes, check that the power cable is connected properly and that the PDU is supplying power properly.
l If no, go to 3.
3. Replace the PSU with a spare PSU and check whether the fault is rectified.
l If yes, no further action is required.
l If no, go to 4.
4. Replace the PSU backplane or replace the mainboard if no PSU backplane is configured. Check whether the fault is rectified.
l If yes, no further action is required.
l If no, contact Huawei technical support.
Quick Recovery Method:
1. Check whether the current configuration has sufficient power supplies.
l If yes, services are not affected.
l If no, contact Huawei technical support.
2. Replace the faulty PSU with a spare PSU. Do not install the faulty PSU into a server again.
Fault Symptom: The rack server or Atlas 800 AI server (model 3010) has no power. (All indicators are off.)
Handling Procedure:
1. Check whether the external power supply to the rack server is normal.
l If yes, go to 2.
l If no, resolve this issue.
2. Replace the PSU with a normal one and check whether the fault is rectified.
l If yes, no further action is required.
l If no, go to 3.
3. Replace the mainboard and PSU backplane and check whether the fault is rectified.
l If yes, no further action is required.
l If no, contact Huawei technical support.
Quick Recovery Method:
Follow the handling procedure to replace any faulty modules.
Fault Symptom: The chassis where a blade server or a high-density server is located has no power.
Handling Procedure:
1. Check whether the external power supply to the chassis is normal or whether a power overload has occurred.
2. Remove all compute nodes, switch modules, management modules and fan modules, label them with the slot numbers, and check whether their power connectors are normal.
3. Remove all PSUs, install the PSUs back one at a time in ascending order by slot number (ensure that only one PSU is installed at the same time), and check whether the chassis can be connected to the power source. If the chassis cannot be connected to the power source no matter which PSU is installed, replace the chassis.
4. If the chassis cannot be connected to the power source after a PSU is installed, replace the PSU.
5. After verifying that the chassis and PSUs can be connected to the power source, install only one PSU. Then install the switch modules, compute nodes, fan modules and management modules one at a time in ascending order by slot number, and check whether the module can be connected to the power source.
6. After the fault is rectified, install the switch modules, compute nodes, fan modules and management modules back into their original slots.
Quick Recovery Method:
Follow the handling procedure to replace any faulty modules.
Fault Symptom: The chassis of a blade server or high-density server has power but a compute node or server node does not.
Handling Procedure:
1. Remove the compute node or server node, and check whether its power connector is damaged.
l If yes, replace the compute node or server node mainboard or replace the chassis.
l If no, go to 2.
2. Do not install the faulty compute node or server node into a server again. Install a spare component when available.
Quick Recovery Method:
1. Remove the faulty compute node or server node. Check whether other compute nodes or server nodes work properly. (Do not install the node into a server again.)
l If yes, services are not affected.
l If no, contact Huawei technical support.
2. Follow the handling procedure to replace any faulty modules.
l If a fault can be located using logs or tools, see "Handling Procedure". If a fault needs to be
rectified quickly onsite, see "Quick Recovery Method".
l For more fault symptoms and solutions, see the Intelligent Computing Case Library. The
Intelligent Computing Product Case Query Assistant is available only to Huawei partners and
Huawei engineers.
2. If the KVM connection is abnormal, you are advised to use the Independent Remote
Console for login.
Fault Symptom: The KVM is inaccessible.
Handling Procedure:
1. Use a third-party tool, such as PuTTY, to run the telnet IP address:8208 command to check whether the KVM port is normal. The default port number is 8208. Log in to the iMana 200 or iBMC WebUI, choose Configuration > Services, and check the VMM parameter to obtain the actual port number. If Telnet access is unavailable, use a PC to directly connect to iMana 200 or iBMC for troubleshooting.
2. Clear all browser and Java cache and close all browsers. Then re-log in to iMana 200 or iBMC.
3. Adjust the Java security level to medium or lower, or add the KVM address to the Java exception sites.
4. Check the OS and browser versions on the client. Firefox 23.0 is recommended. For details about the operating environment requirements, see the iMana 200 or iBMC help document.
Quick Recovery Method:
1. Follow the handling procedure to replace any faulty modules.
2. Restart iMana 200/iBMC and replace the local PC.
3. Connect the management network port to the local PC directly instead of through a switching network.
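When no Telnet client is available on the maintenance PC, the KVM port check can be approximated with a plain TCP connection test. The following is a minimal sketch using the Python standard library. The function name tcp_port_open and the 3-second timeout are assumptions, and 8208 is only the default port stated above, so use the actual port shown on the WebUI if it has been changed.

```python
import socket


def tcp_port_open(host: str, port: int = 8208, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False


# Example call (replace the address with the iMana 200 or iBMC management IP):
#   tcp_port_open("192.0.2.10")
```

A False result only shows that the port is unreachable from this PC. It does not distinguish between a disabled service, a firewall on the path, and a network fault.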
l If a fault can be located using logs or tools, see "Handling Procedure". If a fault needs to be rectified
quickly onsite, see "Quick Recovery Method".
l For more fault symptoms and solutions, see the Intelligent Computing Case Library. The
Intelligent Computing Product Case Query Assistant is available only to Huawei partners and
Huawei engineers.
Fault Symptom: The server fails to enter the standby mode after it powers on. (The power indicator is blinking yellow for over 5 minutes.)
Handling Procedure:
1. View serial port logs to determine whether the iMana 200 or iBMC has been repeatedly reset.
If the iMana 200 or iBMC has been repeatedly reset, the logs repeatedly record the following information:
### JFFS2 load complete: 1107083 bytes loaded to 0x8b000000
## Booting kernel from Legacy Image at 8a000000 ...
Image Name: linux-2.6.34
Image Type: ARM Linux Kernel Image (uncompressed)
Data Size: 1511292 Bytes = 1.4 MiB
Load Address: 86008000
Entry Point: 86008000
Verifying Checksum ... OK
## Loading init Ramdisk from Legacy Image at 8b000000 ...
Image Name: Ramdisk Image
Image Type: ARM Linux RAMDisk Image (uncompressed)
Data Size: 1107019 Bytes = 1.1 MiB
Load Address: 00000000
Entry Point: 00000000
Verifying Checksum ... OK
Loading Kernel Image ... OK
OK
Starting kernel ...
NOTE
l The CH140 and CH140 V3 compute nodes of the E9000 do not provide any serial ports. Directly ping the IP address of the iMana 200 or iBMC. If the ping tests occasionally or always fail, use the quick recovery method. If the problem persists, contact Huawei technical support.
l During the iMana 200 or iBMC startup process, the serial port on a server is used by the iMana 200 or iBMC by default. After the startup is complete, the serial port is switched to the system serial port.
2. Contact Huawei technical support to query a case or replace the mainboard.
Quick Recovery Method:
For a rack server or Atlas 800 AI server (model 3010), perform the following operations:
1. Power off the server, remove and reinstall the power cables, power on the server, and check whether the iMana 200 or iBMC is functioning correctly.
l If yes, upgrade iMana 200 or iBMC by using software of its current version or a later version.
l If no, check the iMana 200 or iBMC version. If the version is 1.91 or later, go to 2; otherwise, go to 3.
2. Keep the power cables removed and add a jumper cap to the Clear_BMC_PW pin on the mainboard to attempt to restore the default settings of the iMana 200 or iBMC. Then reconnect power cables.
3. Replace the mainboard or BMC board.
For an E9000 server, perform the following operations:
1. Remove and reinstall the compute node and check whether the iMana 200 or iBMC is functioning correctly.
l If yes, upgrade the iMana 200 or iBMC by using software of its current version or a later version.
l If no, check the iMana 200 or iBMC version. If the version is 1.91 or later, go to 2; otherwise, go to 3.
2. Keep the compute node removed and add a jumper cap to the Clear_BMC_PW pin on the mainboard to attempt to restore the default settings of iMana 200 or iBMC. Then reinstall the compute node.
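When a saved serial port log is long, counting how often the boot sequence appears is a quick way to judge whether the iMana 200 or iBMC is being reset repeatedly. The following is a minimal sketch. It treats each "Starting kernel ..." line from the log excerpt as one boot. The function names and the threshold of two boots are assumptions for illustration, not values defined by Huawei.

```python
BOOT_MARKER = "Starting kernel ..."  # line taken from the serial log excerpt


def count_bmc_boots(log_text: str) -> int:
    """Count how many times the boot marker appears in a serial port log."""
    return sum(1 for line in log_text.splitlines() if BOOT_MARKER in line)


def looks_like_reset_loop(log_text: str, threshold: int = 2) -> bool:
    """Return True if the log shows at least `threshold` boots (assumed value)."""
    return count_bmc_boots(log_text) >= threshold


sample = "Verifying Checksum ... OK\nStarting kernel ...\n" * 3
print(count_bmc_boots(sample), looks_like_reset_loop(sample))
```

The count is meaningful only for a log captured over a single power-on period. A log that spans several deliberate restarts would also contain multiple boot markers.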
Fault Symptom: A server in standby mode cannot power on. (The power indicator is steady yellow.)
Handling Procedure:
1. Collect iMana 200 or iBMC logs, and query the complex programmable logic device (CPLD) register to determine whether the power supply link to the mainboard has failed.
2. Check whether the mainboard (with integrated CPUs) and DIMMs are installed properly.
Quick Recovery Method:
1. Remove the external PCIe devices such as NICs and FC HBAs. Then check whether the fault is rectified.
l If yes, no further action is required.
l If no, go to 2.
2. Retain only the minimum server configuration (a single CPU, a single mainboard, and a single DIMM). Then check whether the fault is rectified.
l If yes, no further action is required.
l If no, go to 3.
3. Check whether the CPUs, mainboard, and memory modules are faulty, and replace the faulty components.
Fault Symptom: A server powers off immediately when powered on.
Handling Procedure:
1. Collect iMana 200 or iBMC logs, and query the CPLD register to determine whether the power supply link to the mainboard has failed.
NOTE
For an E9000 server, you are advised to use the MM910 for one-click log collection.
2. Check the power supply unit (PSU) backplane and the mainboard.
Quick Recovery Method:
1. Check all external power supplies, including the PDUs, PSUs, and power cables. Replace any faulty components and check whether the fault is rectified.
l If yes, no further action is required.
l If no, go to 2.
2. Replace the mainboard or PSU backplane.
Fault Symptom: The message "no signal" is displayed immediately after the server powers on.
Handling Procedure:
1. Collect iMana 200 or iBMC logs, and query the CPLD register to determine whether the power supply link to the mainboard has failed.
NOTE
For an E9000 server, you are advised to use the MM910 for one-click log collection.
2. Set the printing level for debugging the BIOS with the iMana 200 or iBMC CLI, restart the server, and save system serial port logs. When the fault recurs, collect iMana 200 or iBMC logs and download the .bin file of the BIOS.

Fault Symptom: The server repeatedly powers on and then powers off.
Handling Procedure:
1. Enable the video recording function on the iMana 200 or iBMC WebUI.
2. Set the printing level for debugging the BIOS with the iMana 200 or iBMC CLI, restart the server, and save system serial port logs. When the fault recurs, collect iMana 200 or iBMC logs and download the .bin file of the BIOS.
3. Restore the default BIOS settings, and check whether the server operates properly.
l If yes, modify the BIOS parameters on the OS side based on actual requirements.
l If no, collect iMana 200 or iBMC logs and download the .bin file of the BIOS. For details, see the iBMC User Guide of the corresponding version.
NOTE
For an E9000 server, you are advised to use the MM910 for one-click log collection.

Quick Recovery Method (for the two preceding fault symptoms):
1. Run the ipmcset -d clearcmos command to clear the CMOS. Then check whether the fault is rectified.
l If yes, no further action is required.
l If no, go to 2.
NOTICE
Running the ipmcset -d clearcmos command will restore the BIOS defaults. Exercise caution when running this command.
2. Upgrade the iMana 200 or iBMC, and the BIOS. Then check whether the fault is rectified.
l If yes, no further action is required.
l If no, go to 3.
3. Remove the external devices, including the PCIe cards and HBAs. Then check whether the fault is rectified.
l If yes, no further action is required.
l If no, go to 4.
4. Retain only the minimum server configuration (a single CPU, a single mainboard, and a single DIMM). Then check whether the fault is rectified.
l If yes, no further action is required.
l If no, go to 5.
5. Check whether the CPUs, mainboard, and memory modules are faulty, and replace the faulty components.
Fault Symptom: RAID self-check is suspended.
Handling Procedure:
1. Capture the current iMana 200/iBMC KVM or local KVM screen.
2. Collect iMana 200 or iBMC logs.
Quick Recovery Method:
1. If a RAID controller card firmware error exists, replace the RAID controller card, supercapacitor, or BBU. Then check whether the fault is rectified.
l If yes, no further action is required.
l If no, go to 2.
2. Check whether the drives, drive backplane, and SAS cables are faulty.
l If yes, replace faulty components.
l If no, go to 3.
3. If the RAID array is offline, import it again. Then check whether the fault is rectified.
l If yes, no further action is required.
l If no, go to 4.
4. If the BBU or supercapacitor runs out of power, follow the instructions shown in the displayed messages to keep the server running. After the server runs for 30 minutes, check the BBU or supercapacitor status. If the BBU or supercapacitor is abnormal, replace it.
Fault Symptom: NIC Preboot Execution Environment (PXE) has failed.
Handling Procedure:
1. Check whether the NIC supports PXE.
2. Check the BIOS PXE configuration. Ensure that the NIC PXE function and NIC UMC function are enabled. To enable the NIC PXE function, press Ctrl+S.
3. Check the NIC.
4. Check the PXE network environment on the service side.
Quick Recovery Method:
Follow the handling procedure.
l If a fault can be located using logs or tools, see "Handling Procedure". If a fault needs to be rectified
quickly onsite, see "Quick Recovery Method".
l For more fault symptoms and solutions, see the Intelligent Computing Case Library. The
Intelligent Computing Product Case Query Assistant is available only to Huawei partners and
Huawei engineers.
Fault Symptom: The memory capacity detected by the system is less than the configured memory capacity.
Handling Procedure:
1. Check whether the DIMMs are compatible with the server by using the Intelligent Computing Compatibility Checker.
l If yes, go to 2.
l If no, replace the DIMM with a compatible model specified by the Intelligent Computing Compatibility Checker.
2. Check whether memory mirroring has been enabled in the BIOS.
l If yes, the memory capacity is reduced by 50% due to the memory mirroring function. You can disable the function in the BIOS. If the problem persists, go to 3.
l If no, go to 3.
3. Check whether the DIMM installation positions meet configuration rules.
l If yes, go to 4.
l If no, reinstall the DIMMs in correct slots according to the configuration rules.
4. Check whether a "DIMM configuration error" alarm is generated by iBMC.
l If yes, replace the faulty DIMM. For details, see 5.3 Handling Alarms.
l If no, go to 5.
5. Check whether any DIMM slots are abnormal. If a DIMM slot is abnormal, replace the mainboard.
Quick Recovery Method:
1. If the iBMC generates the "DIMMxxx Configuration Error" alarm, replace the related DIMM.
2. If the DIMM status displayed in iBMC or the OS is abnormal (unidentified or faulty), replace the faulty DIMMs.
3. If memory mirroring or memory rank sparing is configured in the BIOS, the total available memory capacity is less than the configured physical memory capacity.
4. If the DIMMs do not comply with the DIMM installation rules, use Huawei Server Product Memory Configuration Assistant to reinstall the DIMMs.
5. If DIMM installation slots are faulty, replace the mainboard.
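The memory mirroring check can be turned into simple arithmetic: with mirroring enabled, the usable capacity is expected to be half of the installed capacity, as stated above. The following is a minimal sketch. The function names and the 1 GiB tolerance are assumptions. Rank sparing is deliberately not modelled because this document does not state how much capacity it reserves.

```python
def expected_usable_gib(installed_gib: float, mirroring_enabled: bool) -> float:
    """Expected usable memory: halved when memory mirroring is enabled."""
    return installed_gib / 2 if mirroring_enabled else installed_gib


def capacity_gap_explained(installed_gib: float, detected_gib: float,
                           mirroring_enabled: bool,
                           tolerance_gib: float = 1.0) -> bool:
    """Return True if the detected capacity matches the expected capacity."""
    expected = expected_usable_gib(installed_gib, mirroring_enabled)
    return abs(expected - detected_gib) <= tolerance_gib


# 256 GiB installed, 128 GiB detected, mirroring enabled: explained by mirroring.
print(capacity_gap_explained(256, 128, mirroring_enabled=True))
```

If the function returns False, the shortfall is not explained by mirroring alone, and the remaining checks on DIMM installation positions, alarms, and slots still apply.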
l If a fault can be located using logs or tools, see "Handling Procedure". If a fault needs to be rectified
quickly onsite, see "Quick Recovery Method".
l For more fault symptoms and solutions, see the Intelligent Computing Case Library. The
Intelligent Computing Product Case Query Assistant is available only to Huawei partners and
Huawei engineers.
Fault Symptom: A "Disk Fault" alarm is reported to iMana 200 or iBMC.
Handling Procedure:
1. If the drive is in a RAID array and the RAID array is not functioning correctly, troubleshoot the RAID array.
2. If the server has stopped, use Smart Provisioning to inspect the server hardware. If the server is operating, replace the drive.
3. If the fault persists, insert the new drive into the slot that you suspect to be faulty to check whether that slot is faulty.
NOTE
For RAID controller cards that support out-of-band management, if a hard drive is in the Unconfigured Good (Foreign) state, an iBMC alarm will be generated but the fault indicator will be off.
Quick Recovery Method:
1. If the faulty drive is not in a RAID array (except drives in passthrough mode), the drive cannot be used and needs to be replaced. It is recommended that you configure RAID for all drives and then deploy the redundant services.
2. Back up the data of redundant RAID arrays to avoid data loss.
3. Follow the handling procedure to replace any faulty modules.
Fault Symptom: A RAID controller card fails to identify one or more drives.
Handling Procedure:
1. Power off the server, swap the drive that cannot be identified with a normal drive, and power on the server to check whether the drive is faulty.
l If the fault is caused by the drive, replace the drive.
l If the fault is caused by the drive slot, check whether SAS cables are connected properly to all SAS ports on the drive backplane. For details, see the server user guide.
l If the fault persists, go to 2.
2. Replace the RAID controller card first, the SAS cables second, and the drive backplane third.
Quick Recovery Method:
1. If the redundant RAID array fails or no RAID array is configured, the related drive partitions are unavailable.
2. Move the unidentified drives or all drives in the RAID array to a standby server. Ensure that you retain their order during this process and attempt to back up data.
3. Follow the handling procedure to replace any faulty modules.
Note: If a fault occurs on the RH2288A V2 server, check whether the cable connecting the
mainboard to the power adapter board is connected properly. Figure 5-3 shows the cable
connection.
l If a fault can be located using logs or tools, see "Handling Procedure". If a fault needs to be rectified
quickly onsite, see "Quick Recovery Method".
l For more fault symptoms and solutions, see the Intelligent Computing Case Library. The
Intelligent Computing Product Case Query Assistant is available only to Huawei partners and
Huawei engineers.
Fault Symptom: A network port is invisible.
Handling Procedure:
1. Ensure that the NIC type, NIC driver, OS, BIOS version, and iMana 200 or iBMC version on the server or compute node are compatible.
l If the OS compatibility is not specified by the Intelligent Computing Compatibility Checker, contact the technical support team of the OS vendor to resolve the problem.
NOTE
You are advised to use compatible OSs specified by the Intelligent Computing Compatibility Checker.
l If the NIC driver version is incompatible, upgrade the driver before continuing.
2. To check whether the PCI device of the NIC is visible, run the lspci | grep -i eth* command in Linux (or equivalent in other operating systems) and observe the response.
l If yes, go to 4.
l If no, go to 3.
3. If the PCI device is invisible, perform the following steps:
a. Check the logical topology of the NIC. If the NIC PCI bus does not have a CPU, screw-in PCI cards connected to the bus are invisible.
b. Power the iMana 200 or iBMC off and then on. Check whether the fault persists.
c. Insert the NIC you suspect to be faulty into another slot, and a normal NIC into the slot you suspect to be faulty. Then check which of these causes the fault.
4. If the PCI device is visible but its network port is invisible, the driver cannot be loaded. To rectify the fault, perform the following steps:
a. Run the ifconfig ethN up command in Linux (or
Quick Recovery Method:
1. If a visible NIC port becomes invisible when the server is running, and services can be interrupted, power the server off and on. If the fault persists, go to 2.
2. Insert the NIC into another PCIe slot and check whether the fault is rectified.
l If the NIC is causing the fault, replace the NIC.
l If the PCIe slot is causing the fault, replace the mainboard.
Fault Symptom: A communication error occurs on a network port.
Handling Procedure:
1. Check whether the network cable is connected properly to the network port.
2. Ensure that the NIC type, NIC driver, OS, BIOS version, and iMana 200 or iBMC version meet the compatibility requirements of the server or compute node. If the NIC driver is incompatible, upgrade the driver before continuing.
3. To check whether the network ports are up, run the ifconfig ethN up command in Linux (the command may vary in different OSs). To check whether IP addresses are set for the required network ports, run the ethtool ethN command.
4. Run the ethtool -p ethN command in Linux (the command may vary in other OSs) to check whether the information in the network port configuration file of the rack server or Atlas 800 AI server (model 3010) is consistent with the actual physical network ports, and check whether the network port status indicators are on and whether the network ports on the switch are up.
NOTE
The ethtool -p ethN command applies only to plug-in PCIe cards.
5. Check whether the network ports on the compute node and switch module are up. For details, see E9000 Blade Server Mezzanine Module-Switch Module Interface Mapping Tool.
6. Check the settings of IP addresses, gateway addresses, VLANs, bondings, and uplink switch network ports.
7. Collect OS logs.
Quick Recovery Method:
1. Use the ping command to check whether the server or other servers on the network have network faults.
l If the fault occurs on more than one server, check whether the external switching network is normal.
l If the fault occurs only on one server, go to 2.
2. Check the indicator to see the NIC port status. If the indicator is off, swap the optical module, optical cable, and uplink switch port of the faulty NIC port with those of a normal NIC port to check whether any of these components is faulty. Then replace any faulty components.
3. If the NIC is causing the fault, restart the server when interruption will not affect services, and check whether the communication is normal. If the fault persists, power the server off and on. If the fault still persists, replace the NIC.
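When the output of the ethtool ethN command has been saved to a file or pasted from a console, the link state and negotiated speed can be extracted programmatically. The following is a minimal sketch that parses text in the usual "Key: value" layout printed by ethtool on Linux. The function names and the sample text are illustrative assumptions, and the sample is not output captured from a Huawei server.

```python
def parse_ethtool(output: str) -> dict:
    """Collect single-line 'Key: value' fields from `ethtool ethN` output."""
    fields = {}
    for line in output.splitlines():
        key, sep, value = line.strip().partition(":")
        # Skip headings such as 'Settings for eth0:' that carry no value.
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields


def link_is_up(output: str) -> bool:
    """Return True if ethtool reports 'Link detected: yes'."""
    return parse_ethtool(output).get("Link detected") == "yes"


sample = "Settings for eth0:\n\tSpeed: 10000Mb/s\n\tDuplex: Full\n\tLink detected: yes\n"
print(parse_ethtool(sample)["Speed"], link_is_up(sample))
```

A port that reports no link here should be compared with the port status indicator and the state of the peer switch port, as described in the handling procedure.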
Fault Symptom: A packet error or packet loss occurs on a network port.
Handling Procedure:
1. Ensure that the NIC type, NIC driver, OS, BIOS version, and iMana 200 or iBMC version meet the compatibility requirements of the server or compute node. If the NIC driver is incompatible, upgrade the driver before continuing.
2. Check whether the number of packet losses and errors on the network port keeps increasing. If there is no continuous increase, ignore this error.
3. Insert the NIC that you suspect to be faulty into another slot, and insert a normal NIC into the slot that you suspect to be faulty. Then, check which of these is causing the fault.
4. Connect the suspicious network cable to a normal server, connect a normal network cable to the suspicious server, and check whether the fault is caused by the suspicious network cable.
5. Switch the service traffic from the network port that you suspect to be faulty to a different network port. Then, check whether the fault is caused by the network port.
6. To check parameters regarding the packet error or loss, run the ethtool -S ethN command in Linux (or similar in other operating systems).
7. Collect OS logs.
Quick Recovery Method:
1. Check whether the packet loss occurs only on a single server. Run the ethtool -S ethN command to check the packet loss type and run the top command to check the system resource usage (software interrupts, CPU usage, and memory usage) and NIC traffic.
2. When you have the customer's permission to interrupt services, connect a PC to the port and check for packet loss. Connect the PC to other working ports, and check optical modules, optical cables, and uplink switches. Then, replace or adjust components based on the actual situation.
3. If the NIC is causing the fault, restart the server when interruption will not affect services, and check whether the communication is normal. If the fault persists, power the server off and on. If the fault still persists, replace the NIC.
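To decide whether packet losses and errors are still increasing, two snapshots of the ethtool -S ethN output taken some minutes apart can be compared. The following is a minimal sketch. Counter names differ between NIC drivers, so the keyword list used to pick out error-type counters is an assumption, and the sample counters are invented for illustration.

```python
ERROR_KEYWORDS = ("err", "drop", "discard", "miss")  # assumed naming patterns


def parse_stats(output: str) -> dict:
    """Parse 'name: value' counter lines from `ethtool -S ethN` output."""
    stats = {}
    for line in output.splitlines():
        name, sep, value = line.strip().rpartition(":")
        value = value.strip()
        # The 'NIC statistics:' heading has no numeric value and is skipped.
        if sep and value.isdigit():
            stats[name.strip()] = int(value)
    return stats


def growing_error_counters(before: str, after: str) -> dict:
    """Return error-type counters that increased between two snapshots."""
    old, new = parse_stats(before), parse_stats(after)
    return {
        name: new[name] - old.get(name, 0)
        for name in new
        if any(k in name.lower() for k in ERROR_KEYWORDS)
        and new[name] > old.get(name, 0)
    }


first = "NIC statistics:\n rx_packets: 100\n rx_errors: 5\n rx_dropped: 10\n"
second = "NIC statistics:\n rx_packets: 900\n rx_errors: 5\n rx_dropped: 12\n"
print(growing_error_counters(first, second))
```

An empty result corresponds to the "no continuous increase" case above, in which the error can be ignored. A non-empty result names the counters to investigate first.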
For more fault symptoms and solutions, see the Intelligent Computing Case Library. The Intelligent
Computing Product Case Query Assistant is available only to Huawei partners and Huawei engineers.
Fault Symptom: The storage device fails to identify the host World Wide Port Name (WWPN).
Handling Procedure:
1. Connect to the switch and run the brocade: switchshow command to query port connection status.
2. If the switch fails to obtain the host WWPN, the host bus adapter (HBA) cannot register with the switch. In this case, do as follows:
a. Check that the HBA and the processor connected to the PCIe bus are installed properly.
b. (Optional) Check the mapping between the HBAs and switch modules for E9000 and E6000 servers.
c. Check FC links between the HBA and the switch by checking the optical cable connections and the optical module power. If E9000 servers are used, check the HBA work mode.
d. Ensure that the lpfc driver and firmware matching the E9000 are installed.
e. If multiple switches are connected, check whether the switch connection mode (AG or TR) is correct.
f. Collect the OS message logs and check lpfc driver information for faults.
g. Collect log information of the switches.
3. If the HBA is successfully registered with the switch and the switch obtains the host WWPN, but the storage device cannot identify the host WWPN, rectify the fault as follows:
a. Check the FC links (optical cables and modules) between the switch and the storage device.
b. Check whether the HBA and the storage ports are in the same zone.
c. Check whether the zone configurations are the same for switches from the same vendor.
d. Collect the OS message logs and check lpfc driver information for faults.
e. Collect the log information of switches.
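The zone check can be reasoned about as set membership: the HBA port and the storage port can communicate only if at least one zone contains both WWPNs. The following is a minimal sketch of that check. The zone names, the WWPNs, and the dictionary layout are hypothetical examples, and collecting the real zone membership from the switch is outside the scope of the sketch.

```python
from typing import Dict, List, Set


def shared_zones(zones: Dict[str, Set[str]], hba_wwpn: str,
                 storage_wwpn: str) -> List[str]:
    """Return the names of zones that contain both WWPNs (case-insensitive)."""
    wanted = {hba_wwpn.lower(), storage_wwpn.lower()}
    return sorted(
        name for name, members in zones.items()
        if wanted <= {m.lower() for m in members}
    )


# Hypothetical zone membership, entered by hand from the switch configuration.
zones = {
    "zone_host1": {"10:00:00:00:C9:AA:BB:01", "50:00:00:00:00:00:00:A1"},
    "zone_host2": {"10:00:00:00:c9:aa:bb:02", "50:00:00:00:00:00:00:a1"},
}
print(shared_zones(zones, "10:00:00:00:c9:aa:bb:01", "50:00:00:00:00:00:00:a1"))
```

An empty list means that no zone contains both ports, which is consistent with the storage device failing to identify the host WWPN.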
Fault Symptom: The storage device has identified the HBA WWPN, but LUNs cannot be mapped to the host.
Handling Procedure:
1. Check whether the lpfc driver and firmware matching the E9000 have been installed.
2. Collect the OS message logs and check lpfc driver information for faults.
3. Collect log information of the switches.
4. If no faults are identified, faults may exist on the storage device or OS SCSI application layer. Contact the OS or storage device vendor.
Fault Symptom: Some multipath links of LUNs are down.
Handling Procedure:
1. Ensure that the installed lpfc driver and firmware match the E9000.
2. Check for error codes on FC links between the HBA and the storage device.
3. Collect the OS message log and check lpfc and multipath driver information for faults.
4. Collect log information of the switches.
5. Contact the OS multipath driver vendor or storage device vendor.
Fault Symptom: Poor data read/write performance of LUNs
Handling Procedure:
1. Check whether the installed lpfc driver and firmware match the E9000.
2. Check for error codes on FC links between the HBA and the storage device.
3. Run the iostat command on the host to query the I/O delay and concurrent I/O operations.
4. Collect the OS message log and check the lpfc driver information and the I/O queue depth configured for the HBA driver.
5. Perform drive performance tests (read and write 100 GB and 100 MB files).
6. Contact storage analysis engineers.
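A rough sequential throughput figure for the drive performance test can be obtained by timing the write and read of a file on the LUN under test. The following is a minimal sketch using the Python standard library. The function name, the 1 MiB block size, and the example path are assumptions. The read figure may be inflated by the OS page cache, so treat the results as a coarse comparison between hosts or paths rather than as a benchmark.

```python
import os
import time


def measure_throughput(path: str, size_mib: int, block_mib: int = 1):
    """Write then read a file of size_mib MiB; return (write, read) in MiB/s.

    size_mib is assumed to be a multiple of block_mib.
    """
    block = os.urandom(block_mib * 1024 * 1024)

    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(size_mib // block_mib):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())  # push data to the device before stopping the clock
    write_seconds = time.perf_counter() - start

    start = time.perf_counter()
    with open(path, "rb") as f:
        while f.read(len(block)):
            pass
    read_seconds = time.perf_counter() - start

    return size_mib / write_seconds, size_mib / read_seconds


# Example call for a 100 MB test file on a hypothetical mount point:
#   measure_throughput("/mnt/lun0/testfile.bin", 100)
```

Running the same test on a host or path that performs normally gives a baseline for judging whether the suspect LUN is slow.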
Table 5-15 Quick recovery methods and handling procedures of FC controller faults
Fault Symptom Quick Recovery Method
Fault Symptom: Storage services are affected but HBA links are normal.
Quick Recovery Method:
1. Migrate all services, and safely power off the server. Next, remove and reinstall the compute node, and power on the server. Then, check whether the fault is rectified.
l If yes, no further action is required.
l If no, contact the storage vendor for quick fault recovery.
2. Before contacting Huawei technical support, it is recommended that you migrate services and collect switch module logs, OS logs, LLD networking information, and device time differences.
Storage LUN performance issues
1. Check for FC link error codes on the FC switch module. If error codes
exist, run the porterrshow command and determine the cause of the fault based
on the port mapping relationships.
l If any links between the switch modules and the external
switches are faulty, remove and reconnect the optical
cables and modules. If a link is still faulty and spare
components are available, replace any related optical
cables and modules and try again.
l If a link between an HBA and switch module is faulty,
move the compute node to a working slot to check
whether the fault is caused by the HBA, switch module, or
backplane. Replace any faulty modules as required.
2. Clear the error code count history, observe the error codes for
10 minutes, test the performance, and contact the storage
vendor for quick fault recovery.
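The "clear the counters, then observe for 10 minutes" step can be expressed as a comparison of two counter snapshots. The following is a minimal sketch. It does not parse real porterrshow output; the port names and counter names (crc_err, enc_out) are hypothetical placeholders that you would fill in from the switch output.

```python
def counter_deltas(before, after):
    """Return {port: {counter: increase}} for counters that grew between snapshots."""
    deltas = {}
    for port, counters in after.items():
        base = before.get(port, {})
        grown = {name: value - base.get(name, 0)
                 for name, value in counters.items()
                 if value > base.get(name, 0)}
        if grown:
            deltas[port] = grown
    return deltas


# Hypothetical snapshots taken at the start and end of the observation period.
before = {"port0": {"crc_err": 0, "enc_out": 3},
          "port1": {"crc_err": 0, "enc_out": 0}}
after = {"port0": {"crc_err": 12, "enc_out": 3},
         "port1": {"crc_err": 0, "enc_out": 0}}

print(counter_deltas(before, after))  # {'port0': {'crc_err': 12}}
```

Ports that appear in the result are the ones to map back to optical cables, modules, or HBAs.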
For more fault symptoms and solutions, see the Intelligent Computing Case Library. The Intelligent
Computing Product Case Query Assistant is available only to Huawei partners and Huawei engineers.
A switch module fails to start. After logging in to the switch module over
SOL, the SOL screen displays the following: Can not get config file from
smm. Begin reboot ....
1. Switch between active and standby MM910s and check whether the switch
module can start normally.
l If yes, no further action is required.
l If no, go to 2.
2. Restart the baseboard management controller (BMC) of the switch module and
check whether the switch module can be started properly.
l If yes, no further action is required.
l If no, go to 3.
3. Upgrade the switch module software to the latest version. For details, see
the "Upgrading Software by Using U-Boot" section in the "Common Operations"
chapter of the E9000 Server V100R001 Upgrade Guide.
A switch module fails to start. After logging in to the switch module over
SOL, the SOL screen displays the following: Ensure that the optical fibers or
cables are inserted on the same ports on the panel after the board
replacement. During system startup, do not power off or remove the board. To
continue the startup, press Y:.
1. If services are running, connect the network cable or the optical cable to
the switch module and press Y to continue.
2. If no services are running, press Y to continue.
After logging in to a switch module over SOL, the SOL screen shows Critical
Error! and only the meth port can be displayed by running display interface.
Upgrade the switch module software to a specified version or the latest
version depending on the displayed message.
A port is Up but no traffic passes through the port.
1. On the interface view, run the following commands to check whether the
fault is rectified:
[~HUAWEI]interface 10ge 1/17/1
[~HUAWEI-10ge 1/17/1]restart
l If yes, no further action is required.
l If no, go to 2.
2. Run the reboot command to restart the switch
module.
Incorrect packets are generated (running the display interface command shows
that the value of Total Error in the Input area is not zero and keeps
increasing).
Run the display interface command and check CRC and Symbols.
1. If the values of CRC and Symbols are not zero, perform the following
operations:
l Ensure that the optical cables are connected
properly to the faulty switch module and the
device it is directly connected to.
l Check whether any optical cables are
damaged.
l Check whether the optical modules of the
faulty switch module and the device it is
directly connected to are working properly.
l If there is a transmission device between the
switch module and its connected device,
check the transmission device gateway for
alarms.
2. If the values of CRC and Symbols are zero, run
the reboot command to restart the switch
module.
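The decision logic for incorrect packets can be summarized in a few lines. The following is a minimal sketch that takes counter values you have already read from the display interface output. The function name and the returned strings are illustrative, and the sketch assumes that either counter being non-zero is enough to trigger the link checks, which the procedure does not state explicitly.

```python
def triage_input_errors(total_error_samples, crc, symbols):
    """Suggest the next action from successive Total Error readings and CRC/Symbols."""
    # The symptom requires Total Error to keep increasing across readings.
    growing = len(total_error_samples) >= 2 and all(
        later > earlier
        for earlier, later in zip(total_error_samples, total_error_samples[1:]))
    if not growing:
        return "symptom not confirmed: Total Error is not increasing"
    # Assumption: a non-zero value in either counter points to the physical link.
    if crc != 0 or symbols != 0:
        return "check optical cables, optical modules, and any transmission device"
    return "restart the switch module with the reboot command"


print(triage_input_errors([10, 25, 40], crc=3, symbols=0))
print(triage_input_errors([10, 25, 40], crc=0, symbols=0))
```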
5.6.9 OS Faults
OS Installation Faults
Diagnose and rectify faults related to OS installation depending on the symptoms.
For more fault symptoms and solutions, see the Intelligent Computing Case Library. The Intelligent
Computing Product Case Query Assistant is available only to Huawei partners and Huawei engineers.
Drive identification issue
1. Ensure that the target drive is identified by the RAID controller, and use
the Intelligent Computing Compatibility Checker to check whether the target
drive is compatible with the server. Then check the BIOS to see whether the
target storage devices, including SATADOMs, microSD cards, and built-in USB
flash drives, are identified.
2. Check the RAID controller card model and determine whether to
configure RAID (LSI SAS1078, LSI SAS2108, LSI SAS2208, LSI
SAS3008, LSI SAS2308, LSI SAS3108, Avago SAS 3408, Avago
SAS 3416iMR, Avago SAS 3416IT, Avago SAS 3508, Software
RAID).
NOTE
The V5 server or Atlas 800 AI server (model 3010) supports OS installation
on the drive that is managed by the standard RAID controller card.
3. Check the RAID array properties to ensure that the boot drive and
the target drive are the same or in the same RAID array.
4. Set the BIOS mode to UEFI if the drive capacity is over 2 TB.
NOTE
V1 and V3 servers do not support UEFI mode.
5. Check whether the drive is a 4K drive.
6. Check whether the loaded RAID controller card driver is correct.
7. Format the drive or reconfigure the RAID array.
OS Faults
If you have confirmed that faults are not caused by other factors, diagnose them as follows:
The server is suspended or restarted.
Disable C state, P state, T state, and ASPM in the BIOS and ensure that the
server functions correctly.
The OS version does not support CPUs of the current platform.
Table 6-1 lists the software and firmware to be upgraded and reference documents of TaiShan
servers.
FusionServer G5500: BIOS, HMM, and iBMC
Atlas 800 AI server (model 3010): iBMC, BIOS, and LCD
l For details, see the upgrade guide.
To obtain the upgrade guide, perform the following steps:
1. Log in to the Support > AI Computing Platform page.
2. Choose a server model to access the product page.
3. On the Documentation tab page, choose Installation & Upgrade > Upgrade
Guide.
4. View the required upgrade guide.
l To obtain the upgrade package, perform the following steps:
1. Log in to the Support > AI Computing Platform page.
2. Choose a server model to access the product page.
3. Click the Software Download tab.
4. Select the latest patch version.
5. Download the required upgrade package.
7 Preventive Maintenance
NOTICE
Take protective measures to prevent ESD damage and any other damage to servers during
preventive maintenance.
7.1.1 Precautions
Familiarize yourself with the security icons listed in Table 7-1 before preventive maintenance
to reduce the chance of injury to yourself or damage to the equipment. These security icons
will be on some server components.
Icon Description
Indicates that this device can cause personal injury or can fail to operate
properly if it is not externally grounded. Each end of a ground cable
should be connected to a different device, and the devices must be
connected to ground points.
Indicates that this device can cause personal injury or can fail to operate
properly if it is not internally grounded. Each end of a ground cable
should be connected to different device components, and the device must
be connected to a ground point.
To prevent any damage to the cables, take the following precautions before inspecting the
cable layout:
7.2.1 Precautions
l Obtain the customer's consent before inspecting servers. Do not modify server
configuration or power on/power off servers before obtaining written consent from the
customer.
l Before inspecting servers, obtain the iMana 200 or iBMC IP address, MM910 IP
address, and password of the root user for each server to be inspected. After inspecting
servers, advise the customer to change the password of the root user as soon as possible.
l Supports inspection for rack servers, high-density servers, blade servers, KunLun
servers, and Atlas servers, and allows users to export inspection reports.
l Supports inspection for mainstream OSs including SLES, RHEL, CentOS, VMware,
Ubuntu, and Windows, and allows users to export inspection reports.
l Supports batch log collection for BMC and blade server management modules, and
supports SLES, RHEL, and CentOS mainstream versions.
l Supports batch upgrade for BMC, BIOS, CPLD, and Smart Provisioning firmware of
rack servers, high-density servers, blade servers, KunLun servers, and Atlas servers.
l Supports firmware bundle upgrade by using the E9000 active management module.
l Supports batch configuration for PSUs, BIOSs, BMCs, and RAID controller cards of
rack servers, high-density servers, blade servers, KunLun servers, and Atlas servers.
l Supports batch configuration for E9000 management modules.
Inspection and log collection do not modify data, collect service data, or affect services. The
collection scripts and files are deleted when the operation is complete.
For details about the supported server models and detailed inspection operations, see the
FusionServer Tools 2.0 SmartKit User Guide.
Prerequisites
You can log in to the iBMC WebUI.
Procedure
Step 1 Log in to the iBMC WebUI. For details, see 8.9 Logging In to the iBMC WebUI.
Step 3 View the status of hardware, including drives, DIMMs, and sensors.
1. On the menu bar of the iBMC WebUI, choose Information.
2. In the navigation tree, choose System Info. On the right panel, click the Storage tab and
view hardware status information.
3. In the navigation tree, choose Real-Time Monitoring to view the CPU usage, memory
usage, and air intake vent temperature.
– The RH5885 V3, RH5885H V3, and RH8100 V3 do not support display of the CPU usage and
memory usage.
– After iBMA 2.0 is installed and started on the server OS, the CPU usage is obtained from the
iBMA 2.0 and the CPU usage data is the same as the data collected on the OS.
– If iBMA 2.0 is not installed on the server OS or iBMA 2.0 has not completely started, the CPU
usage data is obtained from the Intel Management Engine (ME). The CPU usage is the average
compute usage per second of all CPU cores calculated by the CPU internal module.
– If iBMA 2.0 is not installed on the server OS, obtain the latest iBMA user guide and software
package, and install iBMA 2.0 by referring to the user guide.
4. In the navigation tree, choose Sensor Info to view the status of sensors.
----End
Customer Name
Time of Inspection
Inspected By                Phone Number
Service Hotline             Enterprise China Region: 4008229999
Inspecting Servers
View the inspection report generated by SmartKit to check server health status. An item has
passed the inspection if the value of Result for the item is OK in the report.
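The pass criterion above (Result is OK) can be applied to an exported report programmatically. The following is a minimal sketch that assumes the report has been exported or converted to CSV with "Item" and "Result" columns. That layout is an assumption for illustration; check the actual SmartKit report format before adapting it.

```python
import csv
import io


def failed_items(report_csv):
    """Return the names of inspection items whose Result is not OK."""
    reader = csv.DictReader(io.StringIO(report_csv))
    return [row["Item"] for row in reader if row["Result"].strip() != "OK"]


# Hypothetical report content; the item names are placeholders.
SAMPLE = "Item,Result\nFan status,OK\nPSU status,Warning\nDrive status,OK\n"

print(failed_items(SAMPLE))  # ['PSU status']
```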
Inspected By    Phone Number    Date
Inspected By    Phone Number    Date
8 Common Operations
Check the first two digits of the product SN before reading the following information.
l If the first two digits of the product SN are 02 or 03, see Figure 8-1.
No. Description
1 SN ID (two characters).
Obtaining a Product SN
Use one of the following methods to obtain a product SN:
l Use SmartKit.
Use the server inspection function of SmartKit to obtain ESNs in batches. For details
about the product SN, see "Asset Inspection Information" > "Board SN" in the inspection
report.
l View the product label.
A product label is attached to each Huawei server. You can view the product label to
obtain its ESN. The product label position varies with the Huawei server model. For
details, see the user guide of a specific server.
– Figure 8-3 shows the product SN of a rack server.
– Figure 8-4 shows the product SN of an Atlas 800 AI server (model 3010).
– Figure 8-5 shows the product SN of an X6800. In Figure 8-5, (1) is the product
label of the server, and (2) is the product label of a server node.
– Figure 8-6 shows the product SN of an E9000. In Figure 8-6, (1) is the product
label of the server, and (2) is the product label of a compute node.
The product labels of switch modules and MM910s are on their ejector levers.
l Use the iMana 200 WebUI.
a. Log in to the iMana 200 WebUI. For details, see 8.8 Logging In to the iMana 200
WebUI.
b. On the Overview page, view the product SN of the server. See Figure 8-7.
a. Log in to the iBMC WebUI. For details, see 8.9 Logging In to the iBMC WebUI.
b. Choose Information > Information Summary/Overview/Summary. (The menu
varies depending on software versions.) View the product SN of the server. See
Figure 8-8.
This method applies only to E9000 servers whose MM910 version is (U54) 2.20 or later.
a. Log in to the MM910 WebUI. For details, see 8.11 Logging In to the MM910
WebUI.
b. Choose Chassis Information > Manufacturing Information and view the product
SN of the server. See Figure 8-9.
c. Choose Chassis Information > Compute Node Slot Number > Manufacturing
Information and view the SN of the compute node, as shown in Figure 8-10.
l This method applies only to E9000 servers whose management module is the MM920/
MM921.
l Before the operations, add the MM920/MM921 to FusionDirector.
a. Log in to the FusionDirector WebUI. For details, see 8.12 Logging In to the
FusionDirector WebUI.
e. Click the Device tab and click Server, Management Module, and Switch Module
respectively to view the SNs of the compute node, management module, and switch
module, as shown in Figure 8-12.
Procedure
Step 1 Use PuTTY to log in to the server. For details, see 8.15 Logging In to a Server Over a
Network Port by Using PuTTY or 8.17 Logging In to a Compute Node, Passthrough
Module, or Switch Module by Using the SOL Function of the MM910.
Step 2 On the iMana 200 CLI, run the imtool command (for versions earlier than 7.01) or the
ipmcset -t maintenance -d imtool command (for 7.01 and later versions). Information
similar to the following is displayed:
root@BMC:/#ipmcset -t maintenance -d imtool
tar: removing leading '/' from member names
Tar result information success.
iMana:/->
Step 3 Use a cross-platform file transfer tool to connect to the iMana 200 IP address.
In this document, WinSCP is used as the cross-platform file transfer tool. For details, see 8.19
Using WinSCP to Transfer Files.
Step 4 Download the tar.gz package in the /tmp directory on iMana 200 to a directory on the local
PC. See Figure 8-13.
----End
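The version-dependent command in Step 2 can be selected automatically when collecting from many servers. The following is a minimal sketch, assuming the iMana 200 version is available as a "major.minor" string such as "7.01". The function name and the version string format are assumptions.

```python
def imana_collect_command(version):
    """Return the iMana 200 CLI command for log collection for a given version."""
    major, minor = (int(part) for part in version.split(".")[:2])
    # Versions earlier than 7.01 use the standalone imtool command.
    if (major, minor) < (7, 1):
        return "imtool"
    return "ipmcset -t maintenance -d imtool"


print(imana_collect_command("6.95"))  # imtool
print(imana_collect_command("7.01"))  # ipmcset -t maintenance -d imtool
```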
Table 8-1 One-click information collection by the iBMC for each server
Server Series One-Click Information Description
Collection
E6000 N/A
X8000
X6800
Procedure
Step 1 Log in to the iBMC WebUI. For details, see 8.9 Logging In to the iBMC WebUI.
Step 2 Choose Information > Overview > Shortcuts > One-Click Info Collection, as shown in
Figure 8-14.
----End
Procedure
Step 1 Log in to the MM910 WebUI. For details, see 8.11 Logging In to the MM910 WebUI.
Step 2 Choose System Management on the menu bar, choose SEL Information in the navigation
tree, and click the SMM tab and then the One touch collect tab.
Step 3 On the log collection page, choose Collect All > Start.
Log collection takes about 20 minutes. When log collection is complete, a log file named
one_touch_info_all.tar.gz is displayed in the File Name area.
Step 4 Click the log file name and download it to the local PC as prompted.
For MM910 earlier than (U54) 2.20, you need to collect logs of both the active and standby HMMs.
----End
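After the download, it can be useful to confirm that the archive is readable before sending it to technical support. The following is a minimal sketch using the Python standard library. The demo builds a small stand-in archive in a temporary directory so that the example is self-contained; the member name logs/demo.log is a placeholder and does not reflect the real archive layout.

```python
import io
import os
import tarfile
import tempfile


def list_log_archive(path):
    """Return (member name, size in bytes) for each regular file in a .tar.gz archive."""
    with tarfile.open(path, "r:gz") as archive:
        return [(m.name, m.size) for m in archive.getmembers() if m.isfile()]


# Demo with a stand-in archive; replace the path with the downloaded file.
with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "one_touch_info_all.tar.gz")
    payload = b"demo log line\n"
    with tarfile.open(demo, "w:gz") as archive:
        info = tarfile.TarInfo("logs/demo.log")
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    listing = list_log_archive(demo)

print(listing)  # [('logs/demo.log', 14)]
```

If tarfile raises an error when opening the file, the download is probably incomplete and should be repeated.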
Procedure
Step 1 Log in to the MM910 WebUI. For details, see 8.11 Logging In to the MM910 WebUI.
Step 2 Choose System Management > Information Collection, and set log collection parameters.
l Select MM for Collected from.
l Select One-click full collection for Collected content.
Log collection takes about 20 minutes. When log collection is complete, a log file named
one_touch_info_all.tar.gz is displayed in the File Name area.
Step 4 In the dialog box displayed, download the log file to the local PC as prompted. (In some
browsers, the log file is automatically saved in the default directory.)
----End
Prerequisites
The MM920 or MM921 has been managed by FusionDirector.
Procedure
Step 1 Log in to the FusionDirector WebUI. For details, see 8.12 Logging In to the FusionDirector
WebUI.
Step 2 Choose Menu > Alarms and Logs > Log. The Log page is displayed.
Step 3 Click Collect Log. In the displayed dialog box, click OK.
The Task area is displayed on the right of the page, showing the progress and status of the log
collecting task.
Step 4 Click Export Log to export the log information to a local directory.
----End
Use the MM510 CLI to collect information about the MM510 and heterogeneous nodes in batches. To
collect information about the server, MM510, and heterogeneous nodes in batches, use the iBMC. For
details, see 8.3 Using iBMC to Collect Information in Batches.
Prerequisites
You have logged in to the CLI of the MM510. For details, see 8.13 Logging In to the
MM510 CLI.
Example
# One-click information collection
iBMC:/->ipmcget -d diaginfo
Download diagnose info to /tmp/ successfully.
Prerequisites
Conditions
If the remote control function is required, ensure that the OS, browser, and Java Runtime
Environment (JRE) of the required versions have been installed on the local PC. Table 8-2
shows the system configuration requirements of the local PC.
l The local PC is properly connected to the iMana 200 management network port on the
server by using a network cable.
l The IP addresses of the local PC and the iMana 200 management network port are on the
same network segment.
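Whether two addresses are on the same network segment can be verified before cabling. The following is a minimal sketch using the Python standard ipaddress module. The PC addresses and the subnet mask are example values, not defaults from this document.

```python
import ipaddress


def same_segment(pc_ip, bmc_ip, netmask):
    """Return True if both IPv4 addresses fall in the same network for the given mask."""
    pc_net = ipaddress.ip_interface(f"{pc_ip}/{netmask}").network
    bmc_net = ipaddress.ip_interface(f"{bmc_ip}/{netmask}").network
    return pc_net == bmc_net


print(same_segment("192.168.2.10", "192.168.2.100", "255.255.255.0"))  # True
print(same_segment("192.168.3.10", "192.168.2.100", "255.255.255.0"))  # False
```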
OS Software Version
If the JRE does not meet requirements, download and install a proper Java version.
Data
Table 8-3 lists the required data before you log in to the iMana 200 WebUI.
User login information:
l User name: Username for logging in to the iMana 200 WebUI. Example value: root
l Password: User password for logging in to the iMana 200 WebUI. Example value:
Huawei12#$
NOTE
The default iMana 200 user is root. The root user belongs to the administrator group. The
default password is Huawei12#$.
Procedure
Step 1 Connect the local PC to the iMana 200 management network port on the server by using a
crossover cable or twisted pair cable.
Figure 8-15 shows the network diagram.
l If the message "There is a problem with this website's security certificate" is displayed, click
Continue to this website (not recommended).
l If the Security Alert dialog box indicating a certificate error is displayed, click Yes.
Step 5 On the iMana 200 login page, enter the username and password.
The user account will be locked after five consecutive login failures caused by incorrect passwords. If
your user account is locked, log in again 5 minutes later.
You can click Reset to clear the information entered on the User Login page.
----End
Prerequisites
Conditions
Before using the remote control function, ensure that the OS, browser, and Java Runtime
Environment (JRE) of the required versions have been installed on the local PC. Table 8-4
lists the required software versions.
l The local PC is connected to the iBMC management network port on the server by using
a network cable.
l The IP addresses of the local PC and the iBMC management network port are on the
same network segment.
OS Software Version
Data
Table 8-5 lists the required data before you log in to the iBMC WebUI.
User login information:
l User name: Username for logging in to the iBMC WebUI. Example value: root
l Password: Password for logging in to the iBMC WebUI. Example value: Huawei12#$
NOTE
The default username for logging in to the iBMC WebUI of V2 & V3 servers is root, and
the default password is Huawei12#$.
The default username for logging in to the iBMC WebUI of V5 servers or Atlas 800 AI
servers (model 3010) is Administrator, and the default password is Admin@9000.
Procedure
Step 1 Connect the local PC to the iBMC management network port on the server by using a
crossover cable or twisted pair cable.
Figure 8-17 shows the network diagram.
Step 3 In the address box, enter the IP address of the server iBMC management network port (for
example, https://192.168.2.100) and press Enter.
l If the message "There is a problem with this website's security certificate" is displayed, click
Continue to this website (not recommended).
l If the Security Alert dialog box indicating a certificate error is displayed, click Yes.
Step 4 On the login page, enter the username and password for logging in to the iBMC WebUI.
The user account will be locked after five consecutive login failures with wrong passwords. If your user
account is locked, log in again 5 minutes later.
----End
Data
The following data is required:
l IP address of the server to be connected
l User name for logging in to the server to be connected. The default username is admin.
l User password for logging in to the server to be connected. The default user password is
Huawei12#$.
Tool
JRE: third-party free software. You can obtain it from the Internet. JRE 1.8 or later is
required.
Procedure
Step 1 Connect a client (for example, a local PC) to the management network port of the
management module by using a network cable.
Step 2 In the displayed security alert dialog box, click Allow to allow web access.
Step 3 In the displayed security alert dialog box, select Do not block this program.
Step 4 In the address box of the PC browser, enter https://IP address of the FC switching plane and
press Enter.
The login dialog box is displayed, as shown in Figure 8-19.
Step 5 Enter the username and password, and click Add Fabric.
----End
l The user account will be locked if an incorrect password is entered five consecutive times. The
user account is automatically unlocked after 5 minutes and cannot be forcibly unlocked. If you
attempt to enter a password again within 5 minutes, the lock duration is reset to 5 minutes
regardless of whether the entered password is correct.
l The WebUI of the standby MM910 (displayed as "This is the standby MM.") does not display
component installation status. After logging in to the WebUI of the standby MM910, you can view
the status of the active MM910 and perform the following operations for the standby MM910: Set
the DHCP parameters and a static IP address, set and query the thresholds and hysteresis of
threshold sensors, collect system operating information, and upgrade the management software. To
perform other operations, log in to the WebUI of the active MM910.
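The lockout behavior described in the notes above (five consecutive failures, a 5-minute lock, and a timer reset on any attempt during the lock) can be modeled as a small state machine. The following is a minimal sketch for reasoning about retry timing in scripts. The class and method names are illustrative, and the model is inferred from the text rather than taken from the MM910 implementation.

```python
class LockoutModel:
    MAX_FAILURES = 5        # consecutive wrong passwords before the lock
    LOCK_SECONDS = 5 * 60   # lock duration in seconds

    def __init__(self):
        self.failures = 0
        self.locked_until = None

    def attempt(self, password_ok, now):
        """Return True if a login at time `now` (in seconds) succeeds."""
        if self.locked_until is not None:
            if now < self.locked_until:
                # Any attempt during the lock restarts the 5-minute timer,
                # even if the password is correct.
                self.locked_until = now + self.LOCK_SECONDS
                return False
            self.locked_until = None
            self.failures = 0
        if password_ok:
            self.failures = 0
            return True
        self.failures += 1
        if self.failures >= self.MAX_FAILURES:
            self.locked_until = now + self.LOCK_SECONDS
        return False


model = LockoutModel()
for t in range(5):
    model.attempt(False, now=t)           # five wrong passwords lock the account
print(model.attempt(True, now=100))       # False: locked, timer reset to t=400
print(model.attempt(True, now=400))       # True: lock expired
```

The practical consequence is that an automated tool should wait the full 5 minutes after a lock before retrying, because early retries extend the lock.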
Data
You have obtained the following data:
l Username for logging in to the server to be connected. The default username is root.
l User password for logging in to the server to be connected. The default user password is
Huawei12#$.
Procedure
Step 1 Connect the Ethernet port on the local PC to the MGMT ports on the active and standby
MM910s over the local area network (LAN).
NOTICE
If the active MM910 MGMT port has been connected to the network by using a network
cable and the client needs to be directly connected to the MM910, do not directly disconnect
the network cable from the active MM910 MGMT port that has been connected to the
network. Otherwise, an active/standby MM910 switchover will be triggered, which may cause
network interruption. You are advised to connect the client to the active MM910 STACK port
in the chassis by using a network cable. If the active MM910 STACK port in the chassis has
been connected to the MGMT port in another chassis, use an idle active MM910 STACK port
in another chassis.
l In V2.25 and earlier versions, the MM910 MGMT port is accessed by the external network through
the 2X and 3X switch modules by default. In this case, do not connect the MM910 MGMT port and
the switch module network ports to the same network. Otherwise, a network storm will occur and
the network connection will be interrupted.
To use the MGMT port on the MM910 panel as the management network port for connecting to an
external network, run the smmset -d outportmode -v 1 command on the CLI.
l In V2.26 and later versions, the MM910 MGMT port is provided as the default management
network port for the external network.
Step 2 Set the IP address and subnet mask or route information for the local PC so that the local PC
can communicate with the MM910 properly.
Step 3 On the menu bar of Internet Explorer, choose Tools > Internet Options.
The Internet Options dialog box is displayed.
This section uses a PC running Windows 7 and Internet Explorer 8.0 as an example.
Step 8 Open Internet Explorer, enter https://MM910 floating IP address in the address box, and press
Enter.
For example, enter https://10.85.4.77 in the address box.
"There is a problem with this website's security certificate" is displayed.
Step 9 Click Continue to this website (not recommended).
The page for logging in to the HMM WebUI is displayed.
Step 10 Set the parameters. See Figure 8-22 and Figure 8-23.
l Language: Select English.
l User name: Enter the username for login. The default username is root.
l Password: Enter the user password for login. The default password is Huawei12#$.
l Login To: Select This Machine/computer in most cases. Select LDAP if the system
manages domain users by using an active directory (AD) server.
Figure 8-22 Logging in to the HMM WebUI (MM910 (U54) 2.20 or later)
Figure 8-23 Logging in to the HMM WebUI (MM910 earlier than (U54) 2.20)
----End
Prerequisites
Conditions
l Google Chrome 55 or later is required for logging in to FusionDirector.
l You have obtained the IP address, username, and password of FusionDirector.
The default username of the FusionDirector WebUI is Administrator, and the password
is Admin@9000.
l If you log in as an LDAP domain user, ensure that the LDAP server communicates with
FusionDirector properly, the LDAP function has been enabled on FusionDirector, and
the LDAP server and user group information has been configured.
l If you use the DNS domain name to log in, ensure that the DNS server communicates
with FusionDirector properly and the domain name and DNS server are configured on
FusionDirector.
Precautions
l FusionDirector supports a maximum of 100 concurrent users.
l The default timeout interval of FusionDirector is 30 minutes. If you do not perform any
operation on the WebUI within 30 minutes, the account is automatically logged out. You
need to enter the username and password to log in again.
l If the number of login failures caused by incorrect user names and passwords reaches the
value specified in the system security policy, the account is automatically locked. When
the lockout duration reaches the value specified in the security policy, the user is
automatically unlocked.
l To ensure system security, change the default password upon the first login and change
the password periodically.
Procedure
Step 1 Connect the Ethernet port of the PC to a management network port of the active or standby
MM920/MM921 over the LAN.
The 10GE optical port and MGMT port on the MM920/MM921 panel are management
network ports. This section uses the MGMT port as an example.
Figure 8-26 shows the network connections.
Step 2 Set an IP address and a subnet mask or add route information for the PC so that the PC can
communicate with FusionDirector.
Step 3 Open the browser, enter https://ipaddr in the address box, and press Enter.
l ipaddr indicates the address used to access the FusionDirector WebUI. It can be in either of the
following formats:
– IPv4 address in dotted-decimal format XXX.XXX.XXX.XXX.
– Fully qualified domain name (FQDN) of FusionDirector.
l The browser may display a message indicating that the website has a security certificate error. Ignore
this error and continue the login if the IP address is correct.
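The two accepted forms of ipaddr (dotted-decimal IPv4 address or FQDN) can be validated before a URL is built, for example in a script that opens several WebUIs. The following is a minimal sketch using the Python standard library. The function name, the host name fd.example.com, and the simplified FQDN rules (at least two labels, last label not purely numeric) are assumptions for illustration.

```python
import ipaddress
import re

# One DNS label: letters, digits, and hyphens, not starting or ending with a hyphen.
_LABEL = re.compile(r"^(?!-)[A-Za-z0-9-]{1,63}(?<!-)$")


def webui_url(ipaddr):
    """Return the HTTPS URL for an IPv4 address or FQDN, or raise ValueError."""
    try:
        ipaddress.IPv4Address(ipaddr)
    except ValueError:
        labels = ipaddr.rstrip(".").split(".")
        valid = (len(labels) >= 2
                 and all(_LABEL.match(label) for label in labels)
                 and not labels[-1].isdigit())
        if not valid:
            raise ValueError(f"not an IPv4 address or FQDN: {ipaddr!r}")
    return f"https://{ipaddr}"


print(webui_url("10.85.4.77"))       # https://10.85.4.77
print(webui_url("fd.example.com"))   # https://fd.example.com
```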
Password Specifies the password of the user. For security purposes, change the
password periodically.
l If the username or password is incorrect, you need to enter a verification code in the second login
attempt. If the verification code is not clear, click to refresh the verification code.
l If you enter an incorrect password three consecutive times, the account will be locked for 5
minutes. If the account is locked, try again later or contact the administrator.
----End
Prerequisites
When logging in to the HMM CLI, ensure that:
l If you log in to the CLI over SSH, a maximum of five concurrent users are supported.
l To log in to the CLI over the network port, you must connect the network port on the
configuration terminal to the network port on the server by using a network cable, and
ensure that the IP addresses of the two network ports are on the same network segment.
l To log in to the CLI over the serial port, you must connect the serial ports of the terminal
and the server by using a serial cable.
Login Method
l Login over SSH
l Login over the local serial port
l The HMM provides one default user Administrator, and the default password is on the
product nameplate.
l The system locks a user account if the user enters an incorrect password five consecutive
times. The user is automatically unlocked 5 minutes later, or an administrator can unlock the
user on the CLI.
l For security purposes, change the initial password after the first login and change your
password periodically.
The method for logging in to the CMC CLI over SSH varies according to the client operating
system:
At the initial startup of the HMM, wait for about 3 minutes before you log in to the CLI.
l If the client uses Windows:
a. Download and install the SSH client communication tool.
b. Connect the client to the management network port on the server.
c. Enter the IP address, username, and password of the management network port on
the client communication tool.
– Stop bits: 1
– Flow control: None
Figure 8-29 lists the parameters to be specified.
l SSH
SSH provides secure remote login and other secure network services over an insecure
network.
To log in to the RMC CLI over SSH, connect a PC to the RMC management network
port by using a network cable.
l Login over the local serial port
Prerequisites
The RMC is operating properly.
Data
l IP address of the RMC management network port. The default IP address is
192.168.2.100.
l RMC user names and passwords
The RMC provides four default users:
– User root (default password: Huawei12#$)
– User admin (default password: Huawei12#$)
– User operator (default password: Huawei12#$)
– User taobao (default password: Huawei12#$)
Tool
A terminal tool (for example, PuTTY) has been installed on the PC. PuTTY is third-party free
software. PuTTY 0.60 or later is required for login over a serial port.
Document
For details about the RMC, see the X8000 Server RMC Command Reference.
----End
Step 4 In the Host Name (or IP address) text box, enter the IP address of the RMC management
network port.
Step 5 Click Open.
The PuTTY window is displayed, prompting "login as:" for you to enter a user name.
Step 6 Enter a user name and password.
After login, the RMC command prompt root@RMC:/ is displayed.
----End
The server in this section can be a management module, compute node, or switching plane.
Prerequisites
Conditions
The PC and the MM910/MM920/MM921 management network port have been connected by
using a network cable.
Data
You have obtained the following data:
l IP address of the server to be connected
l User name and password for logging in to the server to be connected
Software Tools
PuTTY.exe (third-party software)
Procedure
Step 1 Set an IP address and a subnet mask or add route information for the PC so that the PC can
properly communicate with the server.
You can run the ping Server IP address command on the PC CLI to check the
communication between the PC and the server.
Step 2 Double-click PuTTY.exe.
The PuTTY Configuration window is displayed, as shown in Figure 8-32.
Configure Host Name and Saved Sessions, and click Save. You can double-click the saved record
under Saved Sessions to log in to the server the next time.
Step 4 (Optional) If the Backspace key does not delete characters on the CLI after you log in
to the Ethernet plane by using PuTTY, choose Terminal > Keyboard, and select
Control-H under The Backspace key, as shown in Figure 8-33.
The PuTTY window is displayed, prompting "login as:" for you to enter a user name.
l If this is your first login to the server, the PuTTY Security Alert dialog box is displayed. Click Yes
to proceed.
l If an incorrect user name or password is entered, you must set up a new PuTTY session.
If the login is successful, the server host name is displayed on the left of the prompt.
----End
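Step 1 of this procedure asks you to set an IP address and subnet mask on the PC so that it can communicate with the server. When no route is added, the two addresses must fall in the same subnet. The sketch below illustrates that check with the Python standard library; the function name same_subnet and the addresses in the usage note are hypothetical.

```python
import ipaddress


def same_subnet(pc_ip: str, subnet_mask: str, server_ip: str) -> bool:
    """Return True if server_ip lies in the subnet defined by pc_ip and subnet_mask."""
    # strict=False lets us pass a host address instead of a network address.
    network = ipaddress.ip_network(f"{pc_ip}/{subnet_mask}", strict=False)
    return ipaddress.ip_address(server_ip) in network
```

For example, same_subnet("192.168.2.10", "255.255.255.0", "192.168.2.100") returns True, whereas a PC at 192.168.3.10 with the same mask would need route information to reach that server.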
By default, the server serial port is the OS serial port. For details about how to redirect the server serial
port, see "Querying and Redirecting the Serial Port (serialdir)" in the iBMC User Guide.
Scenarios
Use PuTTY to log in to the server over a serial port in either of the following scenarios:
The server in this section can be a management module, compute node, or switching plane.
Prerequisites
Conditions
Data
You have obtained the user name and password for logging in to the server to be connected.
Software Tools
PuTTY.exe (third-party software). PuTTY 0.60 or later is required for login over a serial port.
Procedure
Step 1 Double-click PuTTY.exe.
Step 2 In the navigation tree on the left, choose Connection > Serial.
In COMN, N indicates the serial port number, and the value is an integer.
----End
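The serial line name entered in PuTTY has the form COMN, where N is an integer serial port number. As a small illustration of that naming rule, the sketch below validates a name and extracts N. The function name serial_port_number is hypothetical, and accepting lowercase input is a convenience assumption rather than something the source specifies.

```python
import re

# "COM" followed by one or more decimal digits, for example COM1 or COM12.
_COM_PATTERN = re.compile(r"COM([0-9]+)", re.IGNORECASE)


def serial_port_number(name: str) -> int:
    """Return N from a serial line name of the form COMN, or raise ValueError."""
    match = _COM_PATTERN.fullmatch(name.strip())
    if match is None:
        raise ValueError(f"not a COMN serial line name: {name!r}")
    return int(match.group(1))
```

For example, serial_port_number("COM3") returns 3, and a name such as "COMX" raises ValueError.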
Prerequisites
Conditions
l You have logged in to the MM910 CLI by using the floating IP address of the MM910.
l There is no jumper cap over the pins on the mainboard of the compute node, passthrough
module, or switch module.
Data
l User name and password for logging in to the management module. The default user
name of the MM910 is root, and the default password is Huawei12#$.
l User name and password for logging in to the compute node to be connected. The default
user name is root, and the password is Huawei12#$.
l Password for logging in to the passthrough module or switch module to be connected.
The default password is Huawei12#$.
Procedure
Step 1 Use an SSH tool and the floating IP address of the MM910 to log in to the MM910 CLI.
In this document, PuTTY is used as the SSH tool. For details, see 8.15 Logging In to a
Server Over a Network Port by Using PuTTY.
telnet 0 1101
*=====================================================================*
* Welcome to SMM SOL Server *
* Please log in with SMM account and password. *
*=====================================================================*
user name:
NOTICE
If you need to disconnect the service terminal or server power after logging in to the SOL
screen, exit the SOL screen first. Otherwise, re-logging in to the SOL screen will fail.
*===========================================================================================================
please input the SOL Blade1~Blade16(1 ~ 16), Blade1A~Blade16A(17 ~ 32), Swi1~Swi4(33 ~ 36) and COM#(n)
press Ctrl+R to return
*===========================================================================================================
Blade1~Blade16(1 ~ 16)
Blade1A~Blade16A(17 ~ 32)
Swi1~Swi4(33 ~ 36)
Please input your choice:
Or
1 SYS COM
2 BMC COM
Or
1 systemcom
2 BMCcom
l If you enter a switch module slot number, the following serial port information is
displayed:
1 BMCcom
2 fabriccom
3 basecom
4 FCcom
Or
1 BMCcom
2 fabriccom
Or
1 BMCcom
2 fabriccom
3 basecom
l If you enter a passthrough module slot number, the following serial port information is
displayed:
1 BMCcom
Step 5 Enter the value representing the serial port to be connected, and press Enter.
The serial port screen is displayed. On this screen, you can perform operations such as
configuration and query.
You can press Ctrl+R once to return to the slot number selection screen shown in Step 3,
or press Ctrl+R twice to exit the SOL screen.
----End
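The SOL menu above maps numeric choices to targets: 1 to 16 for Blade1 to Blade16, 17 to 32 for Blade1A to Blade16A, and 33 to 36 for Swi1 to Swi4. The following sketch spells out that mapping. It assumes the numbering is sequential within each range, which the menu implies but does not state outright; the function name sol_target_name is hypothetical.

```python
def sol_target_name(choice: int) -> str:
    """Translate a numeric SOL menu choice into the target name shown in the menu."""
    if 1 <= choice <= 16:
        return f"Blade{choice}"          # Blade1 ~ Blade16
    if 17 <= choice <= 32:
        return f"Blade{choice - 16}A"    # Blade1A ~ Blade16A
    if 33 <= choice <= 36:
        return f"Swi{choice - 32}"       # Swi1 ~ Swi4
    raise ValueError(f"choice must be in the range 1 to 36, got {choice}")
```

Under that assumption, entering 17 selects Blade1A and entering 36 selects Swi4.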
Prerequisites
Conditions
l You have logged in to the MM920/MM921 CLI by using the floating IP address of the
MM920/MM921.
l There is no jumper cap over the pins on the mainboard of the compute node, passthrough
module, or switch module.
Data
l Username and password for logging in to the management module. The default
username and password of the MM920/MM921 are Administrator and Admin@9000
respectively.
l Username and password for logging in to the compute node to be connected. The default
username and password are Administrator and Admin@9000 respectively.
l Password for logging in to the passthrough module or switch module to be connected.
The default password is Huawei12#$.
Procedure
Step 1 Use an SSH tool and the floating IP address of the MM920/MM921 to log in to the CLI.
In this document, PuTTY is used as the SSH tool. For details, see 8.15 Logging In to a
Server Over a Network Port by Using PuTTY.
Step 2 Run the ipmcget -l bladeN -t SOL -d cominfo or ipmcget -l swiN -t SOL -d cominfo
command to query the SOL port information of the compute node, passthrough module, or
switch module.
Step 3 Run the ipmcset -l bladeN -t sol -d activate -v com_value or ipmcset -l swiN -t sol -d
activate -v com_value command to enter the serial port input interface.
----End
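In Steps 2 and 3, N in bladeN or swiN is the slot number and com_value is the serial port value to activate. The sketch below only assembles the two command strings exactly as they are written in this procedure, so that the placeholders are easier to see; it does not run anything on the management module. The helper names are hypothetical, and the com_value to use is assumed to come from the cominfo output of Step 2.

```python
def _sol_location(kind: str, slot: int) -> str:
    """Build the -l argument, for example blade3 or swi2."""
    if kind not in ("blade", "swi"):
        raise ValueError("kind must be 'blade' or 'swi'")
    if slot < 1:
        raise ValueError("slot number must be a positive integer")
    return f"{kind}{slot}"


def cominfo_command(kind: str, slot: int) -> str:
    """Command text from Step 2: query the SOL port information."""
    return f"ipmcget -l {_sol_location(kind, slot)} -t SOL -d cominfo"


def activate_command(kind: str, slot: int, com_value: str) -> str:
    """Command text from Step 3: enter the serial port input interface."""
    return f"ipmcset -l {_sol_location(kind, slot)} -t sol -d activate -v {com_value}"
```

For example, cominfo_command("blade", 3) yields ipmcget -l blade3 -t SOL -d cominfo, and activate_command("swi", 2, "1") yields ipmcset -l swi2 -t sol -d activate -v 1.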
Prerequisites
Conditions
The Secure File Transfer Protocol (SFTP) service has been enabled on the destination device.
Data
You have obtained the following data:
l IP address of the server to be connected
l User name and password for logging in to the server to be connected
Software Tools
WinSCP.exe (third-party free software)
Procedure
Step 1 Open the WinSCP folder, and double-click WinSCP.exe.
The WinSCP Login dialog box is displayed, as shown in Figure 8-35.
l If a private key file is not selected at the first login, the warning message "Continue connecting and
add host key to cache" is displayed. Click Yes. The WinSCP file transfer window is displayed.
l On Windows 7, C:\Users\Administrator\Documents on the local PC is opened in the left pane,
and /root on the server is opened in the right pane by default.
Step 4 In the left and right panes, create, delete, or copy folders in specific directories as required.
----End
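The login data used in this procedure (protocol, user name, and server IP address) can also be written as a single SFTP session URL. The sketch below builds such a URL under two assumptions that are not taken from this document: that the SFTP service uses the standard port 22 unless stated otherwise, and that the client in use accepts a URL of this form. The function name sftp_session_url is hypothetical, and the password is deliberately left out so that it is entered interactively.

```python
from urllib.parse import quote


def sftp_session_url(user: str, host: str, port: int = 22) -> str:
    """Return an sftp:// URL for the given user and host, without a password."""
    # Percent-encode the user name so characters such as '@' stay unambiguous.
    encoded_user = quote(user, safe="")
    # Omit the port when it is the assumed default of 22.
    port_part = "" if port == 22 else f":{port}"
    return f"sftp://{encoded_user}@{host}{port_part}/"
```

For example, sftp_session_url("root", "192.168.2.100") returns sftp://root@192.168.2.100/, where the address is a placeholder for the server to be connected.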
Prerequisites
l A PC is connected to the server by using a serial cable.
l WFTPD has been installed.
Software Tools
wftpd32.exe: used to transfer files between different platforms, for example, from a PC to a
switching plane of a switch module. wftpd32.exe is a free third-party tool. You can obtain it
from the Internet.
Procedure
Step 1 Double-click wftpd32.exe.
Step 3 Select all check boxes except Winsock Calls, and click OK.
Step 8 Select vxworks from the User Name combo box, and enter the upgrade file directory (for
example, D:\FTP) in the Home Directory text box. See Figure 8-37.
----End
Prerequisites
The SFTP service has been enabled on the destination device.
Software Tools
mini-sftp-server.exe (free software)
Procedure
Step 1 Double-click mini-sftp-server.exe.
The Core FTP mini-sftp-server dialog box is displayed, as shown in Figure 8-38.
----End
9 Other Resources
News
For notices about product life cycles, warnings, and updates, visit Support > Bulletins >
Product Bulletins > Life Cycle Notices.
Cases
To learn about server applications, visit the Intelligent Computing Case Library.
The Intelligent Computing Case Library is available only to Huawei engineers and partners.
Huawei Server Power Calculator: Used to calculate server power consumption with
different configurations. Visit Huawei Server Power Calculator.