IBM-AUSTRIA - PC-HW-Support    30 Aug 1999

Recovery Procedures When HSP is not Present at Time of Failure





For the IBM SCSI-2 Fast/Wide PCI-Bus RAID Adapter and IBM Fast/Wide Streaming Adapter/A, use the following instructions.

One DDD Drive, No OFL 

Follow these steps to bring the DDD drive back to the ONL state if the following items are true:



Once the conditions above are verified through either the RAID administration log or the RAID administration utility, perform the following steps to bring the DDD drive back to ONL status.
  1.  If the drive has never been marked DDD before, proceed to step 3 to software replace the drive using the RAID Administration Program or Netfinity RAID Manager.

    NOTE: Refer to the 'Software Replace vs. Physical Replace' section in this manual to understand the differences between software and physical replacement.

  2.  If the drive has been marked DDD before, proceed to step 7.
  3.  With a RAID-1 or RAID-5 array, the operating system will be functional. Use either Netfinity or the RAID administration utility within the operating system to bring the drive back to ONL status. With the RAID administration utility, open the Options menu and select Rebuild Drive.
  4.  When you see the prompt to select the DDD drive, highlight the drive you just replaced and press Enter.
  5.  The RAID adapter issues a start unit command to the drive. You receive a message confirming that the drive is starting. The drive then begins the rebuild process. Once the drive completes this process, the drive's status changes to ONL.
  6.  If you see an 'Error in starting drive' message, reinsert the cables, hard drive, etc., to verify there is a good connection, then go to step 3. If the error persists, go to step 7.
  7.  Physically replace the hard drive in the DDD bay with a new one of the same or greater capacity and go to step 3.
  8.  If the error still occurs with a known-good hard drive, then troubleshoot to determine if the cable, backplane, RAID adapter, etc., is defective.

    NOTE: The RAID adapter should not be replaced unless hard errors are reported under Drive Information in the RAID Administration Options menu or Netfinity RAID Manager.

     Once you have replaced the defective part so that there is a good connection between the adapter and hard drive, go to step 3.
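
The escalation order in the steps above can be summarized as a small decision sketch. This is only an illustration of the written procedure, not a tool supplied with the adapter; the names Action, first_action, and action_after_start_error are hypothetical, and the step numbers in the comments refer to the list above.

```python
from enum import Enum


class Action(Enum):
    SOFTWARE_REPLACE = "software replace via Rebuild Drive (steps 3-5)"
    RESEAT = "reinsert cables and drive, then retry from step 3 (step 6)"
    PHYSICAL_REPLACE = "physically replace the drive, then retry from step 3 (step 7)"
    TROUBLESHOOT_PATH = "troubleshoot cable, backplane, RAID adapter (step 8)"


def first_action(marked_ddd_before: bool) -> Action:
    """Steps 1 and 2: a drive that was marked DDD before is physically replaced."""
    if marked_ddd_before:
        return Action.PHYSICAL_REPLACE
    return Action.SOFTWARE_REPLACE


def action_after_start_error(reseated: bool, physically_replaced: bool) -> Action:
    """What to try next after an 'Error in starting drive' message."""
    if not reseated:
        return Action.RESEAT              # step 6, first occurrence
    if not physically_replaced:
        return Action.PHYSICAL_REPLACE    # step 6, error persists
    return Action.TROUBLESHOOT_PATH       # step 8, known-good drive still fails
```

For example, first_action(False) selects the software replace path, and action_after_start_error(True, True) points at the cable, backplane, or adapter.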


Two DDD Drives, No OFL 

In this case, with no hot-spare drive defined, the server most likely trapped (under OS/2 and NT) or the volume was dismounted (under NetWare). To attempt to resolve this scenario, examine the RAID log generated by the RAID Administration Utility and follow the steps below:

  1.  Boot to the RAID configuration utility for your RAID adapter.
  2.  Select Replace Drive. Highlight the drive marked DDD last by the RAID adapter and press Enter. The drive spins up and changes from DDD to ONL status.

      IF YOU USE THE WRONG ORDER WHEN YOU SELECT SET DEVICE STATE TO CHANGE A DRIVE'S STATE TO ONL, DATA CORRUPTION RESULTS. SEE THE NOTE BELOW TO DETERMINE THE LAST DRIVE MARKED DDD BY THE RAID ADAPTER.

    NOTE: Refer to the 'Using and Understanding the RAID Administration Log' section of this document for details on obtaining and interpreting the RAID log. If only one drive is recorded in the RAID log because the RAID adapter was not able to log the defunct drive before the operating system went down, then the last drive that went defunct is the drive that is not recorded in the RAID log. If two drives are recorded in the RAID log, then the last drive to go defunct is the second drive listed in the log, that is, the drive with the most recent time stamp.

  3.  If the drive has been marked DDD before, proceed to step 8.
  4.  Proceed to step 5 to software replace the remaining DDD drive using the RAID Administration Program or Netfinity RAID Manager.

    NOTE: Refer to the 'Software Replace vs. Physical Replace' section in this manual to understand the differences between software and physical replacement.

  5.  With a RAID-1 or RAID-5 array, the operating system will be functional. Use either Netfinity or the RAID administration utility within the operating system to bring the drive back to ONL status. With the RAID administration utility, open the Options menu and select Rebuild Drive.
  6.  When you see the prompt to select the DDD drive, highlight the drive you just replaced and press Enter.
  7.  The RAID adapter issues a start unit command to the drive. You receive a message confirming that the drive is starting. The drive then begins the rebuild process. Once the drive completes this process, the drive's status changes to ONL.
  8.  If you see an 'Error in starting drive' message, reinsert the cables, hard drive, etc., to verify there is a good connection, then go to step 5. If the error persists, go to step 9.
  9.  Physically replace the hard drive in the DDD bay with a new one of the same or greater capacity and go to step 5.
  10.  If the error still occurs with a known-good hard drive, then troubleshoot to determine if the cable, backplane, RAID adapter, etc., is defective.

    NOTE: The RAID adapter should not be replaced unless hard errors are reported under Drive Information in the RAID Administration Options menu or Netfinity RAID Manager.

     Once you have replaced the defective part so that there is a good connection between the adapter and hard drive, go to step 3.

  11.  If software replacement brings all drives back ONL and makes the system operational, carefully inspect all cables, etc., to ensure that neither the cable nor the backplane is defective. Check all backplane connectors and ensure that the backplane is not bowed. When multiple drives are marked defunct, the communication channel (cable or backplane) is often the cause of the failure. If the backplane is bowed, drives and backplane connectors may not seat properly, causing a bad connection. Also, with hot-swap drives that are removed frequently, connectors can become damaged if too much force is exerted.
  12.  If the rebuild completes successfully, then perform the following steps to ensure that all drives are good:

     Run non-destructive RAID diagnostics individually on each drive. Run the diagnostics individually to ensure that you do not have more than one drive that can become defunct at a time. If a drive does become DDD, physically replace that drive and run a rebuild process on the new drive. This verifies that all defective drives, if any exist, are removed from the system.

     If the REBUILD process fails, then perform the following steps:

    1.  Exit to the RAID Main Menu.
    2.  Select Drive Information and view the error counters for each of the hard drives to find out which drive had errors. Refer to 'First Actions to be Performed on Service Call With DDD Drives' for descriptions of the various errors and the appropriate action.
    3.  If the errors occur on the drive being rebuilt, then physically replace this drive and select Rebuild again. The drive's status changes from DDD to RBL and the rebuild process begins. If this process completes successfully, go to Step 5.

     If the rebuild still fails, then verify that the drives being rebuilt from do not have any errors. If they have no errors, then you should be able to rebuild the data. Check the cable connections to the drive being rebuilt. It is possible that you replaced a defective drive with another defective drive.

  13.  If a backup configuration is available, restore the backup configuration.
  14.  If a backup configuration is not available, write down the information you can retrieve by selecting the View Configuration option. Delete the array and manually create it to match this configuration information. Perform this step carefully, because if you deviate in any way from the original configuration, you will lose all data.

    NOTE: Do not initialize this logical drive.

  15.  Have all users verify their personal files to ensure their data is good. Keep in mind that some files may be corrupt due to rebuild errors.
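
The rule in the NOTE under step 2 for finding the last drive marked DDD can be written out as a short sketch. It assumes the RAID log has already been transcribed by hand into (drive, time stamp) pairs; LogEntry, last_defunct, and the drive labels are hypothetical names used only for illustration, and nothing here reads the adapter's actual log format.

```python
from datetime import datetime
from typing import NamedTuple


class LogEntry(NamedTuple):
    drive: str                # hypothetical label, e.g. "bay 3"
    marked_ddd_at: datetime   # time stamp copied from the RAID log


def last_defunct(ddd_drives: set[str], log: list[LogEntry]) -> str:
    """Return the drive that went defunct last when exactly two drives are DDD."""
    if len(ddd_drives) != 2:
        raise ValueError("this rule covers exactly two DDD drives")
    entries = [e for e in log if e.drive in ddd_drives]
    logged = {e.drive for e in entries}
    if len(logged) == 1:
        # Only one drive reached the log before the operating system went
        # down, so the drive missing from the log failed last.
        (unlogged,) = ddd_drives - logged
        return unlogged
    if len(logged) == 2:
        # Both drives are logged: the most recent time stamp failed last.
        return max(entries, key=lambda e: e.marked_ddd_at).drive
    raise ValueError("neither DDD drive appears in the RAID log")
```

The drive returned is the one to select with Replace Drive in step 2; the other DDD drive is the one that is rebuilt afterward.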


More Than Two DDD Drives, No OFL

To attempt to recover, perform the following:

  1.  View the RAID log and write down the order in which the drives went defunct.
  2.  Boot to the RAID Configuration Diskette and select View Configuration. Make sure that the template contains the correct information for the status of all drives, not just those listed in the RAID log.
  3.  Using the RAID configuration utility, select Replace Drive and choose a DDD drive not listed in the RAID log. Change the state of this drive to ONL. Perform this step until the only DDD drives remaining are those indicated in the RAID log.

      IF YOU USE THE WRONG ORDER WHEN YOU SELECT SET DEVICE STATE TO CHANGE DRIVES' STATES TO ONL, DATA CORRUPTION RESULTS. ENSURE THAT YOU CHANGE TO ONL ONLY THE DEVICE STATES OF DRIVES NOT LISTED AS DDD IN THE RAID LOG. THE FIRST DRIVE THAT WENT DEFUNCT REQUIRES REBUILDING, SO IT MUST BE REPLACED LAST.

    NOTE: Refer to the 'Using and Understanding the RAID Administration Log' section of this document for details on obtaining and interpreting the RAID log. Refer to the 'Software Replace vs. Physical Replace' section in this manual to understand the differences between software and physical replacement.

  4.  Follow the same procedure used to recover from two DDD drives, as outlined in the previous section.
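
The ordering described in steps 1 through 4 can be sketched as follows, as a planning aid worked out on paper before touching the configuration utility. The function plan_recovery and the drive labels are hypothetical, and the sketch assumes the defunct order has been copied by hand from the RAID log, oldest first.

```python
def plan_recovery(ddd_drives: set[str], log_order: list[str]) -> tuple[list[str], list[str]]:
    """Split DDD drives into the two groups the procedure handles in turn.

    log_order lists drives in the order the RAID log shows them going
    defunct, oldest first (step 1).
    Returns (set_onl_first, remaining):
      set_onl_first: DDD drives absent from the log, set to ONL in step 3.
      remaining: logged drives, last-defunct first, so the first drive
                 that went defunct comes last and is the one rebuilt.
    """
    # Keep only logged drives that are actually DDD, dropping repeat entries.
    logged = list(dict.fromkeys(d for d in log_order if d in ddd_drives))
    unlogged = sorted(ddd_drives - set(logged))
    return unlogged, list(reversed(logged))
```

With four DDD drives A, B, C, and D and a log showing C and then A going defunct, plan_recovery({"A", "B", "C", "D"}, ["C", "A"]) lists B and D to be set ONL first, then A, with C left for last.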

