IBM-AUSTRIA - PC-HW-Support 30 Aug 1999
Recovery Procedures When HSP is not Present at Time of Failure
For the IBM SCSI-2 Fast/Wide PCI-Bus RAID Adapter and IBM Fast/Wide Streaming
Adapter/A, use the following instructions.
One DDD Drive, No OFL
Follow these steps to bring the DDD drive back to the ONL state if the following items are true:
- Only one drive is marked DDD and the rest are ONL.
- There are no drives with an OFL status.
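The two conditions above can be checked mechanically once the drive states have been read from the RAID administration log or utility. The following is a minimal Python sketch, assuming the states have been copied by hand into a list of status codes; the function name and the list representation are illustrative and not part of any IBM tool.

```python
def one_ddd_procedure_applies(states):
    """Return True when exactly one drive is DDD and every other drive is ONL.

    states: list of status codes such as "ONL", "DDD", or "OFL"
            (hypothetical representation, one entry per drive).
    """
    ddd_count = sum(1 for s in states if s == "DDD")
    others_online = all(s == "ONL" for s in states if s != "DDD")
    # An OFL drive makes others_online False, so it also rules the procedure out.
    return ddd_count == 1 and others_online
```

For example, `one_ddd_procedure_applies(["ONL", "DDD", "ONL"])` is true, while any list containing an OFL drive or a second DDD drive is rejected.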
Once the conditions above are verified through either the RAID administration log or the RAID
administration utility, perform the following steps to bring the DDD drive back to ONL status.
1. If the drive has never been marked DDD, proceed to step 3 to software replace the drive
using the RAID Administration Program or Netfinity RAID Manager.
NOTE: Refer to the 'Software Replace vs. Physical Replace' section in this manual to
understand the differences between software and physical replacement.
2. If the drive has been marked DDD before, proceed to step 7.
3. With a RAID-1 or RAID-5 array, the operating system will be functional. Use either
Netfinity or the RAID administration utility within the operating system to bring the
drive back to ONL status. With the RAID administration utility, open the Options
menu and select Rebuild Drive.
4. When you see the prompt to select the DDD drive, highlight the drive you just
replaced and press Enter.
5. The RAID adapter issues a start unit command to the drive. You receive a message
confirming that the drive is starting. The drive then begins the rebuild process. Once
the drive completes this process, the drive's status changes to ONL.
6. If you see an 'Error in starting drive' message, reinsert the cables, hard drive, etc., to
verify there is a good connection, then go to step 3. If the error persists, go to step 7.
7. Physically replace the hard drive in the DDD bay with a new one of the same or
greater capacity and go to step 3.
8. If the error still occurs with a known good hard file, then troubleshoot to determine if
the cable, backplane, RAID adapter, etc., is defective.
NOTE: The RAID adapter should not be replaced unless Hard Errors are reported under
Drive Information in the RAID Administration Options menu or Netfinity RAID
Manager.
Once you have replaced the defective part so that there is a good connection between
the adapter and hard drive, go to step 3.
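The branching in the one-DDD procedure above can be read as an escalation ladder: start with a software replace (or a physical replace if the drive has been marked DDD before) and move one rung up each time the 'Error in starting drive' message appears. The following Python sketch models that reading. The ladder, the function name, and the `failed_starts` counter are illustrative assumptions, not part of the RAID utilities.

```python
# Escalation ladder for a single DDD drive, in the order the procedure describes.
# The wording of each rung paraphrases the procedure; the list itself is a modeling choice.
LADDER = [
    "software replace: select Rebuild Drive",
    "reinsert cables and drive, then select Rebuild Drive again",
    "physically replace the drive, then select Rebuild Drive",
    "troubleshoot cable, backplane, and RAID adapter",
]


def next_action(previously_marked_ddd, failed_starts=0):
    """Return the next action for one DDD drive.

    previously_marked_ddd: True if this drive has been marked DDD before.
    failed_starts: how many times 'Error in starting drive' has been seen so far.
    """
    # A drive that was marked DDD before skips straight to physical replacement.
    start_rung = 2 if previously_marked_ddd else 0
    # Each failed start escalates one rung; the last rung is troubleshooting.
    rung = min(start_rung + failed_starts, len(LADDER) - 1)
    return LADDER[rung]
```

Under this model, a drive that has never been marked DDD and has failed to start twice is due for physical replacement, which matches the case where the error persists after the cables have been reinserted.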
Two DDD Drives, No OFL
In this case, with no hot-spare drive defined, the server has most likely trapped (under OS/2
and NT) or the volume has been dismounted (under NetWare). To attempt recovery,
examine the RAID log generated by the RAID administration utility and follow the
steps below:
1. Boot to the RAID configuration utility for your RAID adapter.
2. Select Replace Drive. Highlight the drive marked DDD last by the RAID adapter
and press Enter. The drive spins up and changes from DDD to ONL status.
IF YOU USE THE WRONG ORDER WHEN YOU SELECT SET
DEVICE STATE TO CHANGE A DRIVE'S STATE TO ONL, DATA CORRUPTION
RESULTS. SEE THE NOTE BELOW TO DETERMINE THE LAST DRIVE MARKED DDD
BY THE RAID ADAPTER.
NOTE: Refer to the 'Using and Understanding the RAID Administration Log' section of
this document for details on obtaining and interpreting the RAID log. If only one
drive is recorded in the RAID log because the RAID adapter was not able to log the
defunct drive before the operating system went down, then the last drive that went
defunct is the drive that is not recorded in the RAID log. If two drives are recorded in
the RAID log, then the last drive to go defunct is the second drive listed in the log, the
drive with the most recent time stamp.
3. If the drive has been marked DDD before, proceed to step 9.
4. Proceed to step 5 to software replace the remaining DDD drive using the RAID
Administration Program or Netfinity RAID Manager.
NOTE: Refer to the 'Software Replace vs. Physical Replace' section in this manual to
understand the differences between software and physical replacement.
5. With a RAID-1 or RAID-5 array, the operating system will be functional. Use either
Netfinity or the RAID administration utility within the operating system to bring the
drive back to ONL status. With the RAID administration utility, open the Options
menu and select Rebuild Drive.
6. When you see the prompt to select the DDD drive, highlight the drive you just
replaced and press Enter.
7. The RAID adapter issues a start unit command to the drive. You receive a message
confirming that the drive is starting. The drive then begins the rebuild process. Once
the drive completes this process, the drive's status changes to ONL.
8. If you see an 'Error in starting drive' message, reinsert the cables, hard drive, etc., to
verify there is a good connection, then go to step 5. If the error persists, go to step 9.
9. Physically replace the hard drive in the DDD bay with a new one of the same or
greater capacity and go to step 5.
10. If the error still occurs with a known good hard file, then troubleshoot to determine if
the cable, backplane, RAID adapter, etc., is defective.
NOTE: The RAID adapter should not be replaced unless Hard Errors are reported under
Drive Information in the RAID Administration Options menu or Netfinity RAID
Manager.
Once you have replaced the defective part so that there is a good connection between
the adapter and hard drive, go to step 5.
11. If software replacement brings all drives back ONL and makes the system operational,
carefully inspect all cables, etc. to ensure that the cable or backplane is not defective.
Check all backplane connectors and ensure that the backplane is not bowed. When
multiple drives are marked defunct, it is often the communication channel (cable or
backplane) that is the cause of the failure. If the backplane is bowed, drives and
backplane connectors may not seat properly, causing a bad connection.
Also, with hot-swap drives that are removed frequently, connectors could become
damaged if too much force is exerted.
12. If the rebuild completes successfully, then perform the following steps to ensure that
all drives are good:
Run non-destructive RAID diagnostics individually on each drive. Run the diagnostics
individually to ensure that you do not have more than one drive that can become
defunct at a time. If a drive does become DDD, physically replace that drive and run a
rebuild process on the new drive. This verifies that all defective drives are removed
from the system, if any exist.
If the rebuild process fails, then perform the following steps:
- Exit to the RAID Main Menu.
- Select Drive Information and view the error counters for each of the hard
files to find out which drive had errors. Refer to
'First Actions to be Performed on Service Call With DDD Drives'
for descriptions of the various errors and the appropriate action.
- If the errors occur on the drive being rebuilt, then physically replace this drive
and select Rebuild again. The drive's status changes from DDD to RBL and
the rebuild process begins. If this process completes successfully, go to Step 5.
If it still fails the rebuild, then verify that the drives that are being rebuilt from do not
have any errors. If they have no errors, then you should be able to rebuild the data.
Check cable connections to the drive being rebuilt. It is possible that you replaced a
defective drive with another defective drive.
- If a backup configuration is available, restore the backup configuration.
- If a backup configuration is not available, write down the information you
can retrieve by selecting the View Configuration option. Delete the array
and manually create it to match this configuration information. Perform
this step carefully: if you deviate in any way from the original
configuration, you will lose all data.
NOTE: Do not Initialize this logical drive.
- Have all users verify their personal files to ensure their data is good. Keep
in mind that some files may be corrupt due to rebuild errors.
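The NOTE in the two-DDD procedure above gives a rule for finding the last drive marked DDD from the RAID log: if only one of the two drives was logged, the unlogged drive failed last; if both were logged, the one with the most recent time stamp failed last. The following Python sketch applies that rule. It assumes the log has already been read by hand into `(drive_id, timestamp)` pairs with one entry per drive; the drive identifiers, the function name, and this data layout are hypothetical, and the sketch does not parse the real RAID log format.

```python
from datetime import datetime


def last_defunct_drive(ddd_drives, log_entries):
    """Return the drive that was marked DDD last, for the two-DDD case.

    ddd_drives: the two drive identifiers currently marked DDD.
    log_entries: (drive_id, timestamp) pairs taken from the RAID log.
    """
    ddd = set(ddd_drives)
    if len(ddd) != 2:
        raise ValueError("this rule covers exactly two DDD drives")
    # Keep only log entries that belong to the two DDD drives.
    logged = {drive: ts for drive, ts in log_entries if drive in ddd}
    unlogged = ddd - set(logged)
    if len(logged) == 1:
        # The adapter could not log the second failure: the unlogged drive went last.
        return next(iter(unlogged))
    if len(logged) == 2:
        # Both failures were logged: the most recent time stamp went last.
        return max(logged, key=logged.get)
    raise ValueError("neither DDD drive appears in the log; order is unknown")


# Illustrative data only: bay names and times are made up.
example_log = [("bay1", datetime(1999, 8, 30, 10, 0))]
example_last = last_defunct_drive({"bay1", "bay3"}, example_log)
```

Here `example_last` is `"bay3"`, because only `bay1` was recorded before the operating system went down. The sketch raises an error rather than guessing when the log gives no ordering, since setting drives ONL in the wrong order corrupts data.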
More than 2 DDD Drives, No OFL
To attempt to recover, perform the following:
1. View the RAID log and write down the order in which the drives went defunct.
2. Boot to the RAID Configuration Diskette and select View Configuration. Make sure
that the template contains the correct information for the status of all drives, not just
those listed in the RAID log.
3. Using the RAID configuration utility, select Replace Drive and choose a DDD drive
not listed in the RAID log. Change the state of this drive to ONL. Repeat this step
until the only DDD drives remaining are those indicated in the RAID log.
IF YOU USE THE WRONG ORDER WHEN YOU SELECT SET
DEVICE STATE TO CHANGE DRIVES' STATES TO ONL, DATA
CORRUPTION RESULTS. ENSURE THAT YOU ONLY CHANGE DEVICE
STATES TO ONL OF DRIVES NOT LISTED AS DDD IN THE RAID LOG. THE
FIRST DRIVE THAT WENT DEFUNCT REQUIRES REBUILDING, SO IT MUST
BE REPLACED LAST.
NOTE: Refer to the 'Using and Understanding the RAID Administration Log' section
of this document for details on obtaining and interpreting the RAID log. Refer to the
'Software Replace vs. Physical Replace' section in this manual to understand the
differences between software and physical replacement.
4. Follow the same procedure used to recover from two DDD drives, as outlined in the
previous section.
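The ordering rule for more than two DDD drives can be summarized as: set ONL every DDD drive that is not in the RAID log, then handle the logged drives from the most recently failed back to the earliest, and rebuild the first drive that went defunct last of all. The following Python sketch turns that rule into a written plan for review before touching the adapter. It assumes the log has been transcribed by hand into `(drive_id, timestamp)` pairs with one entry per drive; the names and data layout are hypothetical, and applying the rule to more than two logged drives is an extrapolation from the warning above rather than something the procedure states.

```python
from datetime import datetime


def recovery_plan(ddd_drives, log_entries):
    """Return (set_online_order, drive_to_rebuild) for several DDD drives.

    ddd_drives: identifiers of all drives currently marked DDD.
    log_entries: (drive_id, timestamp) pairs taken from the RAID log.
    """
    ddd = set(ddd_drives)
    logged = {drive: ts for drive, ts in log_entries if drive in ddd}
    if not logged:
        raise ValueError("no DDD drive appears in the log; first failure is unknown")
    # Drives missing from the log failed after logging stopped: set them ONL first.
    unlogged = sorted(ddd - set(logged))
    # Logged drives, most recent failure first; the earliest failure comes last.
    newest_first = sorted(logged, key=logged.get, reverse=True)
    set_online_order = unlogged + newest_first[:-1]
    drive_to_rebuild = newest_first[-1]
    return set_online_order, drive_to_rebuild


# Illustrative data only: bay names and times are made up.
example_log = [
    ("bay2", datetime(1999, 8, 30, 10, 0)),
    ("bay4", datetime(1999, 8, 30, 10, 5)),
]
example_plan = recovery_plan({"bay1", "bay2", "bay3", "bay4"}, example_log)
```

With this data, `bay1` and `bay3` are set ONL first because they are not in the log, `bay4` follows as the later logged failure, and `bay2`, the first drive to go defunct, is the one to rebuild.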