How To Replace the M10-4/M10-4S/M12-1/M12-2/M12-2S DISK [VCAP]

Asset ID:	1-71-1527298.1
Update Date:	2017-08-09
Keywords:

Solution Type Technical Instruction Sure

Solution 1527298.1 : How To Replace the M10-4/M10-4S/M12-1/M12-2/M12-2S DISK [VCAP]

Applies to:

Fujitsu M10-4S - Version All Versions to All Versions [Release All Releases]
Fujitsu M10-4 - Version All Versions to All Versions [Release All Releases]
Fujitsu SPARC M12-2 - Version All Versions to All Versions [Release All Releases]
Fujitsu SPARC M12-2S - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.

Goal

CAP PROBLEM OVERVIEW: Replace the internal DISK
********************************************************************************
To report errors or request improvements on this procedure,
please go to http://support.us.oracle.com and put a comment on Doc ID:1527298.1
********************************************************************************

Solution

DISPATCH INSTRUCTIONS

WHAT SKILLS ARE REQUIRED?: No special skills required, Customer Replaceable Unit (CRU) procedure

TASK COMPLEXITY: 0

Time Estimate: 15 minutes

REMOVAL/REPLACEMENT INSTRUCTIONS:

WHAT STATE SHOULD THE SYSTEM BE IN TO BE READY TO PERFORM THE RESOLUTION ACTIVITY?:

The DISK can be replaced using the Active/Hot (with some limitations), Inactive/Hot, Inactive/Cold, or System Stopped method.
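Note: The two limitations stated in this procedure (boot disks need a redundant configuration for Active/Hot, and Inactive/Cold applies only to building block configurations) can be sketched as a small selection helper. This is an illustrative sketch only; the function name and its parameters are hypothetical and are not part of any Oracle or Fujitsu tool, and other site-specific limitations may apply.

```python
from typing import List


def allowed_methods(is_boot_disk: bool, redundant: bool, building_block: bool) -> List[str]:
    """Return the maintenance methods this procedure permits (hypothetical helper)."""
    methods = []
    # Active/Hot: a boot disk qualifies only when it is redundantly configured.
    if not is_boot_disk or redundant:
        methods.append("Active/Hot")
    methods.append("Inactive/Hot")
    # Inactive/Cold: stated as available only in building block configurations.
    if building_block:
        methods.append("Inactive/Cold")
    methods.append("System Stopped")
    return methods


if __name__ == "__main__":
    # Non-redundant boot disk in a single cabinet: Active/Hot is excluded.
    print(allowed_methods(is_boot_disk=True, redundant=False, building_block=False))
```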

WHAT ACTIONS ARE REQUIRED?:

Enable the DISK unit for replacement. (Follow either the Active/Hot, Inactive/Hot, Inactive/Cold, or System Stopped method below.)

-- Active/Hot -- NOTE: Boot disk devices can only be maintained with the Active/Hot method if they are redundantly configured.

HW RAID -

Operate only after checking the failed internal disk. See "Configuring Hardware RAID" in the SPARC M10 Systems System Operation and Administration Guide for details.

Follow the links from the M10 Documentation page in the Oracle System Handbook to the Admin Guide.

SW RAID -

For details, see the manual for the software in use.

Note: Detailed SVM instructions can be found at http://docs.oracle.com/cd/E19253-01/816-4520/troubleshoottasks-33506/index.html.

Note: Detailed ZFS instructions can be found in Document 1002753.1

NO SW or HW RAID Used -

1. Display the Oracle Solaris super-user prompt (#).

2. Execute the cfgadm (1M) command to check the status of the internal disk.

# cfgadm -a

3. Stop the applications from using the internal disk.

4. Execute the cfgadm (1M) command to disconnect the internal disk to be maintained from the system. (Ap_Id is from the cfgadm -a output from previous step).

# cfgadm -c unconfigure Ap_Id

5. Execute the cfgadm (1M) command to blink the CHECK LED of the internal disk to be maintained.

# cfgadm -x led=fault,mode=blink Ap_Id

6. Execute the cfgadm (1M) command to confirm that the internal disk to be maintained is disconnected - (The disconnected internal disk is displayed as "unconfigured").

# cfgadm -a

7. If the internal disk to be maintained is in an "unconfigured" state, it is ready to be removed. See the Removing the internal DISK section below.
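Note: The check in step 6 can be sketched as a small parser over saved cfgadm -a output. This is a minimal sketch, assuming the usual whitespace-separated columns (Ap_Id, Type, Receptacle, Occupant, Condition); the sample text and the Ap_Id values in it are made up for illustration, and the function names are hypothetical.

```python
# Illustrative sample only; real Ap_Id values on your system will differ.
SAMPLE = """\
Ap_Id                          Type         Receptacle   Occupant     Condition
c2                             scsi-sas     connected    configured   unknown
c2::dsk/c2t0d0                 disk         connected    unconfigured unknown
c3::dsk/c3t1d0                 disk         connected    configured   unknown
"""


def occupant_states(cfgadm_output):
    """Map each Ap_Id to the value of its Occupant column."""
    states = {}
    for line in cfgadm_output.splitlines():
        fields = line.split()
        # Skip blank lines, short lines, and the column header.
        if len(fields) < 4 or fields[0] == "Ap_Id":
            continue
        states[fields[0]] = fields[3]
    return states


def ready_to_remove(cfgadm_output, ap_id):
    """True only when the given Ap_Id is listed as unconfigured."""
    return occupant_states(cfgadm_output).get(ap_id) == "unconfigured"


if __name__ == "__main__":
    print(ready_to_remove(SAMPLE, "c2::dsk/c2t0d0"))
```

An Ap_Id that does not appear in the output is treated as not ready, so a mistyped Ap_Id does not produce a false confirmation.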

-- Inactive/Hot --

1. In a SINGLE cabinet configuration

The inactive/hot maintenance procedure in a single cabinet configuration is the same as that for stopping the system, except that there is no need to remove the power cords.

2. In a Building Block (BB) configuration

a. Power off the maintenance-target physical partition

1. Log in to the XSCF shell.

2. Execute the switchscf command to switch the master XSCF to the standby XSCF. (Perform this when the cabinet to be maintained works as a master cabinet.)

XSCF> switchscf -t Standby

3. Execute the showpcl command to confirm the operation condition of the physical partition.

XSCF> showpcl -a -v

4. Execute the showboards command to confirm the physical partition to be powered off.

XSCF> showboards -a

5. Stop the guest domain on the physical partition under maintenance. - See "Shutting down logical domain" in the SPARC M10 Systems System Operation and Administration Guide.

6. Stop the control domain on the physical partition under maintenance. - See "3.2.4 Powering off physical partitions individually" in the SPARC M10 Systems System Operation and Administration Guide.

7. Execute the showpparstatus command to confirm that the power of the physical partition is turned off.

XSCF> showpparstatus -a

8. Switch the mode switches of the master cabinet and cabinets whose XSCFs are in the standby state to the Service mode.

(For building block configuration, change the mode switches of BB-ID#00 and #01 to the Service mode).
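Note: The confirmation in step 7 can be sketched as a check over captured showpparstatus -a output. This is a sketch under assumptions: the two-column layout (PPAR-ID, PPAR Status) and the "Powered Off" status text in the sample are assumed for illustration, the function names are hypothetical, and the real output should be checked against the XSCF reference manual for your firmware.

```python
# Illustrative sample only; the layout is an assumption.
SAMPLE = """\
PPAR-ID         PPAR Status
00              Powered Off
01              Running
"""


def ppar_status(output):
    """Map each numeric PPAR-ID to its status text."""
    statuses = {}
    for line in output.splitlines():
        parts = line.split(None, 1)
        # Data rows start with a numeric PPAR-ID; the header row does not.
        if len(parts) == 2 and parts[0].isdigit():
            statuses[int(parts[0])] = parts[1].strip()
    return statuses


def is_powered_off(output, ppar_id):
    """True only when the given PPAR is listed with the assumed 'Powered Off' text."""
    return ppar_status(output).get(ppar_id) == "Powered Off"


if __name__ == "__main__":
    print(ppar_status(SAMPLE))
```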

-- Inactive/Cold -- NOTE: Inactive/cold maintenance can be performed only in building block configurations.

1. In a SINGLE cabinet configuration

(The inactive/cold maintenance procedure in single cabinet configuration is the same as that for stopping the system.)

2. In a Building Block (BB) configuration

a. Power off the maintenance-target physical partition

1. Log in to the XSCF shell.

2. Execute the switchscf command to switch the master XSCF to the standby XSCF. (Perform this when the cabinet to be maintained works as a master cabinet.)

XSCF> switchscf -t Standby

3. Execute the showpcl command to confirm the operation condition of the physical partition.

XSCF> showpcl -a -v

4. Execute the showboards command to confirm the physical partition to be powered off.

XSCF> showboards -a

5. Stop the guest domain on the physical partition under maintenance. - See "Shutting down logical domain" in the SPARC M10 Systems System Operation and Administration Guide.

6. Stop the control domain on the physical partition under maintenance. - See "Powering off physical partitions individually" in the SPARC M10 Systems System Operation and Administration Guide.

7. Execute the showpparstatus command to confirm that the power of the physical partition is turned off.

XSCF> showpparstatus -a

8. Switch the mode switches of the master cabinet and cabinets whose XSCFs are in the standby state to the Service mode.

(For building block configuration, change the mode switches of BB-ID#00 and #01 to the Service mode).

b. Execute the replacefru command to release the maintenance-target cabinet from the system.

XSCF> replacefru -

a. Select the maintenance-target cabinet by specifying it with a numeric key.

b. Select the maintenance-target FRU by specifying it with a numeric key.

c. Select the faulty FRU by specifying it with a numeric key.

d. After confirming that the selected FRU is displayed, enter [r].

e. Confirm that the CHECK LED of FRU is blinking. (You can now start the FRU maintenance work.)

c. Remove all power cords from the PSU backplane unit of the maintenance-target cabinet.

-- System Stopped -- (Stop)

1. In a SINGLE cabinet configuration

a. Stop the entire system

1. Display the Oracle Solaris super-user prompt (#).

2. Stop the guest domains on all physical partitions. - See "Shutting down logical domain" in the SPARC M10 Systems System Operation and Administration Guide.

3. Stop the control domains on all physical partitions to stop the system. - See "Stopping system with XSCF command" or "Stopping system from operation panel."

a. Switch the mode switch on the operation panel to the Service mode.

b. Log in to the XSCF shell.

c. Execute the poweroff command, or press the power switch on the operation panel for 4 seconds or more. (After the poweroff command is executed, the following processes are performed.)

XSCF> poweroff -a

■ Oracle Solaris is completely shut down.
■ The system stops and enters the standby mode. (The power of the XSCF unit continues to be on.)

d. Check that the POWER LED on the operation panel is off.

4. Open the door to the rack

b. Remove all power cords from the PSU backplane unit of the maintenance-target cabinet.

2. In a Building Block (BB) configuration

a. Stop the entire system

1. Display the Oracle Solaris super-user prompt (#).

2. Stop the guest domains on all physical partitions. - See "Shutting down logical domain" in the SPARC M10 Systems System Operation and Administration Guide.

3. Stop the control domains on all physical partitions to stop the system. - See "Stopping system with XSCF command" or "Stopping system from operation panel."

a. Switch the mode switch on the operation panel to the Service mode.

b. Log in to the XSCF shell.

c. Execute the poweroff command, or press the power switch on the operation panel for 4 seconds or more. (After the poweroff command is executed, the following processes are performed.)

XSCF> poweroff -a

■ Oracle Solaris is completely shut down.
■ The system stops and enters the standby mode. (The power of the XSCF unit continues to be on.)

d. Check that the POWER LED on the operation panel is off.

4. Open the door to the rack

b. Execute the replacefru command to release the maintenance-target cabinet from the system.

XSCF> replacefru -

a. Select the maintenance-target cabinet by specifying it with a numeric key.

b. Select the maintenance-target FRU by specifying it with a numeric key.

c. Select the faulty FRU by specifying it with a numeric key.

d. After confirming that the selected FRU is displayed, enter [r].

e. Confirm that the CHECK LED of FRU is blinking. (You can now start the FRU maintenance work.)

c. Remove all power cords from the PSU backplane unit of the maintenance-target cabinet.

Removing the internal DISK

Remove the front cover.
Push the knob of the internal disk to unlock and raise the lever at a 45 degree angle.
Hold the lever and pull out the internal disk by 2 to 3 cm (0.8-1.2 in.) forward.
Carefully remove the internal disk from the slot and set it on a conductive mat. - NOTE: If reducing the number of internal disks, be sure to fit a dummy filler unit in the empty disk slot.

Installing the internal DISK -

NOTE: If expanding the number of internal disks, be sure to remove the dummy filler unit from the slot into which the new disk will be inserted.

Open the lever of the disk and hold the disk.
Carefully insert the disk into the slot. (Do NOT force).
Close the lever and be sure the disk locks into place.
Install the front cover.

Reassembling the Server (Follow either the Active/Hot or Inactive/Hot or Inactive/Cold or System Stopped methods).

-- Active/Hot --

HW RAID -

Operate only after checking the failed internal disk. See "Configuring Hardware RAID" in the SPARC M10 Systems System Operation and Administration Guide for details.

Follow the links from the M10 Documentation page in the Oracle System Handbook to the Admin Guide.

SW RAID -

For details, see the manual for the software in use.

Note: Detailed SVM instructions can be found at http://docs.oracle.com/cd/E19253-01/816-4520/troubleshoottasks-33506/index.html.

Note: Detailed ZFS instructions can be found in Document 1002753.1

NO SW or HW RAID Used -

1. Display the Oracle Solaris super-user prompt (#).

2. Execute the cfgadm (1M) command to check the status of the installed internal disk and get the Ap_Id.

# cfgadm -a

3. Execute the cfgadm (1M) command to connect the internal disk that was just maintained to the system. (Ap_Id is from the cfgadm -a output from the previous step.)

# cfgadm -c configure Ap_Id

4. Execute the cfgadm (1M) command to confirm that the internal disk just installed is now connected - (The installed internal disk should now be displayed as "configured").

# cfgadm -a

5. If the internal disk just installed is now in a "configured" state, the disk is available for applications.
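Note: The Active/Hot command sequences for a disk with no SW or HW RAID can be collected into a dry-run planner that prints the commands for review before anyone types them. This is a minimal sketch; the function names and the example Ap_Id are hypothetical, and only cfgadm invocations already shown in this procedure are used. Manual steps such as stopping the applications that use the disk are not represented.

```python
import shlex


def disk_commands(ap_id, phase):
    """Return the cfgadm commands for one phase, in the order given in this procedure."""
    if phase == "remove":
        return [
            ["cfgadm", "-a"],                                 # check status, find Ap_Id
            ["cfgadm", "-c", "unconfigure", ap_id],           # disconnect the disk
            ["cfgadm", "-x", "led=fault,mode=blink", ap_id],  # blink the CHECK LED
            ["cfgadm", "-a"],                                 # confirm "unconfigured"
        ]
    if phase == "install":
        return [
            ["cfgadm", "-a"],                                 # check status, find Ap_Id
            ["cfgadm", "-c", "configure", ap_id],             # connect the disk
            ["cfgadm", "-a"],                                 # confirm "configured"
        ]
    raise ValueError("phase must be 'remove' or 'install'")


def render(commands):
    """Format the commands as lines at the super-user prompt, without running them."""
    return "\n".join("# " + " ".join(shlex.quote(arg) for arg in argv) for argv in commands)


if __name__ == "__main__":
    # Hypothetical Ap_Id; take the real one from the cfgadm -a output.
    print(render(disk_commands("c2::dsk/c2t0d0", "install")))
```

The planner only prints text. Nothing is executed, which keeps the operator in control of each step.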

-- Inactive/Hot --

1. In a SINGLE cabinet configuration

a. Start the entire system

1. Check that the XSCF STANDBY LED on the operation panel is on.

2. Log in to the XSCF shell.

3. Switch the mode switch on the operation panel to the Locked mode.

4. Execute the showstatus command to confirm that there is no problem with the FRU after the maintenance.

XSCF> showstatus

5. Execute the showhardconf command to check the hardware configuration and the status of each component.

XSCF> showhardconf

6. Execute the poweron command. (After a short time, following processes are performed.) - See "Starting System" in the SPARC M10 Systems System Operation and Administration Guide.

XSCF> poweron -a

■ The POWER LED on the operation panel comes on.
■ Power-on self-test (POST; self diagnosis when powering on) is executed.
■ Then, the system is started.

7. Close the door of the rack.

2. In a Building Block (BB) configuration

a. Power on the maintenance-target physical partition

1. Switch the mode switches of the master cabinet and cabinets whose XSCFs are in the standby state to the Locked mode

2. Log in to the XSCF shell.

3. Execute the showstatus command to confirm that there is no problem with the FRU after the maintenance

XSCF> showstatus

4. Execute the showhardconf command to check the hardware configuration and the status of each component.

XSCF> showhardconf

5. Execute the switchscf command to switch the standby XSCF to the master XSCF. (Perform this step if the SPARC M10-4S you maintained was operating as the master cabinet.)

XSCF> switchscf -t Active

6. Execute the poweron command to power on the stopped physical partition.

XSCF> poweron -p ppar_id

-- Inactive/Cold -- NOTE: Inactive/cold maintenance can be performed only in building block configurations.

(The inactive/cold maintenance procedure in single cabinet configuration is the same as that for stopping the system.)

1. In a SINGLE cabinet configuration

1. Connect all power cords to the PSU backplane unit.

2. Execute the testsb command to confirm that the maintenance-target internal disk is normally recognized.

XSCF> testsb -y -p 00-0

(This tests the system boards and backplane by running probe-scsi-all, which displays the results so that you can verify the disks are seen.)

3. Start the entire system.

a. Check that the XSCF STANDBY LED on the operation panel is on.

b. Log in to the XSCF shell.

c. Switch the mode switch on the operation panel to the Locked mode.

d. Execute the showstatus command to confirm that there is no problem with the FRU after the maintenance.

XSCF> showstatus

e. Execute the showhardconf command to check the hardware configuration and the status of each component.

XSCF> showhardconf

f. Execute the poweron command. (After a short time, following processes are performed.) - See "Starting System" in the SPARC M10 Systems System Operation and Administration Guide.

XSCF> poweron -a

■ The POWER LED on the operation panel comes on.
■ Power-on self-test (POST; self diagnosis when powering on) is executed.
■ Then, the system is started.

4. Close the door of the rack.

2. In a Building Block (BB) configuration

1. Connect all power cords to the PSU backplane unit of the maintenance-target cabinet.

2. Return to the operation of the XSCF firmware replacefru command to confirm that the cabinet is incorporated into the system.

a. After the maintenance work of the target FRU, enter [f].

b. Check that the status shows normal ("Normal") after diagnosis, and input [F]. (displayed on the console within the replacefru process window.)

c. Input [c] to return to the screen to select the FRU.

3. Execute the testsb command to confirm that the maintenance-target internal disk is normally recognized

XSCF> testsb -y -p 00-0

(This tests the system boards and backplane by running probe-scsi-all, which displays the results so that you can verify the disks are seen.)

4. Power on the maintenance-target physical partition.

a. Switch the mode switches of the master cabinet and cabinets whose XSCFs are in the standby state to the Locked mode.

(For building block configuration, change the mode switches of BB-ID#00 and #01 to the Locked mode.)

b. Execute the showstatus command to confirm that there is no problem with the FRU after the maintenance

XSCF> showstatus

c. Execute the showhardconf command to check the hardware configuration and the status of each component.

XSCF> showhardconf

d. Execute the switchscf command to switch the standby XSCF to the master XSCF. (Perform this step if the SPARC M10-4S you maintained was operating as the master cabinet.)

XSCF> switchscf -t Active

e. Execute the poweron command to power on the stopped physical partition.

XSCF> poweron -p ppar_id

-- System Stopped -- (Start)

1. In a SINGLE cabinet configuration

1. Connect all power cords to the PSU backplane unit.

2. Execute the testsb command to confirm that the maintenance-target internal disk is normally recognized.

XSCF> testsb -y -p 00-0

(This tests the system boards and backplane by running probe-scsi-all, which displays the results so that you can verify the disks are seen.)

3. Start the entire system.

a. Check that the XSCF STANDBY LED on the operation panel is on.

b. Log in to the XSCF shell.

c. Switch the mode switch on the operation panel to the Locked mode.

d. Execute the showstatus command to confirm that there is no problem with the FRU after the maintenance.

XSCF> showstatus

e. Execute the showhardconf command to check the hardware configuration and the status of each component.

XSCF> showhardconf

f. Execute the poweron command. (After a short time, following processes are performed.) - See "Starting System" in the SPARC M10 Systems System Operation and Administration Guide.

XSCF> poweron -a

■ The POWER LED on the operation panel comes on.
■ Power-on self-test (POST; self diagnosis when powering on) is executed.
■ Then, the system is started.

4. Close the door of the rack.

2. In a Building Block (BB) configuration

1. Connect all power cords to the PSU backplane unit.

2. Return to the operation of the XSCF firmware replacefru command to confirm that the cabinet is incorporated into the system.

a. After the maintenance work of the target FRU, enter [f].

b. Check that the status shows normal ("Normal") after diagnosis, and input [F]. (displayed on the console within the replacefru process window.)

c. Input [c] to return to the screen to select the FRU.

3. Execute the testsb command to confirm that the maintenance-target internal disk is normally recognized.

XSCF> testsb -y -p 00-0

(This tests the system boards and backplane by running probe-scsi-all, which displays the results so that you can verify the disks are seen.)

4. Start the entire system.

a. Check that the XSCF STANDBY LED on the operation panel is on.

b. Switch the mode switch on the operation panel to the Locked mode.

c. Execute the showstatus command to confirm that there is no problem with the FRU after the maintenance.

XSCF> showstatus

d. Execute the showhardconf command to check the hardware configuration and the status of each component.

XSCF> showhardconf

e. Execute the poweron command. (After a short time, following processes are performed.) - See "Starting System" in the SPARC M10 Systems System Operation and Administration Guide.

XSCF> poweron -a

■ The POWER LED on the operation panel comes on.
■ Power-on self-test (POST; self diagnosis when powering on) is executed.
■ Then, the system is started.

5. Close the door of the rack.

OBTAIN CUSTOMER ACCEPTANCE

WHAT ACTION NEEDS TO BE TAKEN TO RETURN THE SYSTEM TO AN OPERATIONAL STATE: Post work required to bring the system back online

Due to Bug 17451091 in Solaris 10 and Bug 15606570 in Solaris 11, FMA disk errors may not be cleared when the disk is replaced. This will result in replay errors.

Once the disk has been replaced, check FMA to make sure the error was cleared:

# fmadm faulty

If the disk is still showing in error, clear the error using the UUID shown for that fault:

# fmadm acquit <UUID>
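Note: Finding the UUID to pass to fmadm acquit can be sketched as a pattern match over saved fmadm faulty output. This is an illustrative sketch only; the sample text (including its event ID and message ID) is made up, the function name is hypothetical, and the helper only lists candidate commands for review. Confirm that each UUID belongs to the replaced disk before acquitting it.

```python
import re

# Matches the standard 8-4-4-4-12 hexadecimal UUID layout.
UUID_RE = re.compile(
    r"\b[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\b",
    re.IGNORECASE,
)

# Illustrative sample only; not captured from a real system.
SAMPLE = """\
--------------- ------------------------------------  -------------- ---------
TIME            EVENT-ID                              MSG-ID         SEVERITY
--------------- ------------------------------------  -------------- ---------
Aug 09 10:15:22 3f2a9c1e-7b4d-4e8a-9c55-0a1b2c3d4e5f  DISK-8000-0X   Major
"""


def acquit_commands(fmadm_faulty_output):
    """List one 'fmadm acquit <UUID>' candidate per distinct UUID, in order of appearance."""
    seen = []
    for uuid in UUID_RE.findall(fmadm_faulty_output):
        if uuid not in seen:
            seen.append(uuid)
    return ["fmadm acquit " + uuid for uuid in seen]


if __name__ == "__main__":
    for command in acquit_commands(SAMPLE):
        print("# " + command)
```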

REFERENCE INFORMATION:

M10 Documentation Link: http://docs.oracle.com/cd/E38160_01/index.html
M12 Documentation Link: http://docs.oracle.com/cd/E86029_01/index.html

NOTE:1526831.1 - M10 CRU / FRU Replacement Methods

References

<NOTE:1002753.1> - How to Replace a Drive in Solaris[TM] ZFS
