SCSI problems... advice would be welcome...

  • SCSI problems... advice would be welcome...

    Hello,

    Yesterday, my system started acting up very weirdly. My first hard disk (Quantum Atlas 10K) made the seek noises it makes when spinning up (a few specific clicks and beep-like sounds).

    It eventually led to a blue screen (it flashed by very fast, but I thought I read something about "pagefile operation"). Now the system boots, SMART gives no errors, and the disks have been scanned for errors (the Quantum even at boot by XP!).

    The driver is (see signature for config):
    Adaptec AIC-7902-Ultra320 SCSI
    Adaptec
    02/12/2002
    1.3.0.0
    not digitally signed
    In the system log, I have numerous entries (id11, grouped in 6 at a time; id15, less frequently, but when it occurs it is listed 20-30 times in a row, for both disks).
    Here are the details of the log-entries:

    Code:
    adpu320
    id 11: The driver detected a controller error on \Device\Scsi\adpu3201.
    0000: 0f 00 10 00 01 00 68 00   ......h.
    0008: 00 00 00 00 0b 00 04 c0   .......À
    0010: 24 50 00 c1 00 00 00 00   $P.Á....
    0018: 46 01 00 00 00 00 00 00   F.......
    0020: 00 00 00 00 00 00 00 00   ........
    0028: 00 00 00 00 01 00 00 00   ........
    0030: 00 00 00 00 05 00 00 00   ........
    disk
    id 15: The device, \Device\Harddisk0\D, is not ready for access yet.
    0000: 0f 00 68 00 01 00 b6 00   ..h...¶.
    0008: 00 00 00 00 0f 00 04 c0   .......À
    0010: 04 01 00 00 9d 00 00 c0   ....?..À
    0018: 00 00 00 00 00 00 00 00   ........
    0020: 00 00 00 00 00 00 00 00   ........
    0028: f2 06 07 00 00 00 00 00   ò.......
    0030: ff ff ff ff 00 00 00 00   ÿÿÿÿ....
    0038: 40 00 00 0a 00 00 05 00   @.......
    0040: 05 20 06 12 08 01 20 00   . .... .
    0048: 00 00 00 00 0a 00 00 00   ........
    0050: 00 00 00 00 f8 54 c2 85   ....øTÂ?
    0058: 00 00 00 00 c8 e1 b8 85   ....Èá¸?
    0060: 00 80 3d 86 00 00 00 00   .?=?....
    0068: 1e 00 00 00 00 00 00 00   ........
    0070: 00 00 00 00 00 00 00 00   ........
    0078: 00 00 00 00 00 00 00 00   ........
    0080: 00 00 00 00 00 00 00 00   ........
    0088: 00 00 00 00 00 00 00 00   ........
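
The first bytes of each dump can be read as a fixed header. The following is a minimal sketch, assuming the event data follows the Windows `IO_ERROR_LOG_PACKET` layout (an assumption; nothing in the log itself confirms it). The function name `decode_header` and the small status table are illustrative only.

```python
import struct

# First 24 bytes of each event's data, copied from the dumps above.
ID11_HEX = "0f 00 10 00 01 00 68 00  00 00 00 00 0b 00 04 c0  24 50 00 c1 00 00 00 00"
ID15_HEX = "0f 00 68 00 01 00 b6 00  00 00 00 00 0f 00 04 c0  04 01 00 00 9d 00 00 c0"

# Assumed little-endian layout (IO_ERROR_LOG_PACKET header):
#   B  major function code    B  retry count
#   H  dump data size         H  number of strings
#   H  string offset          H  event category
#   2x padding                I  error code
#   I  unique error value     I  final status
HEADER = struct.Struct("<BBHHHH2xIII")

# Only the status values seen in these two dumps; extend as needed.
KNOWN_STATUS = {
    0x00000000: "STATUS_SUCCESS",
    0xC000009D: "STATUS_DEVICE_NOT_CONNECTED",
}


def decode_header(hex_text):
    """Decode the assumed 24-byte header of an event's data bytes."""
    raw = bytes.fromhex("".join(hex_text.split()))
    (major, retries, dump_size, n_strings, str_offset,
     category, error_code, unique_value, final_status) = HEADER.unpack_from(raw)
    return {
        "major_function": major,
        "retry_count": retries,
        "event_id": error_code & 0xFFFF,   # low 16 bits of the error code
        "severity": error_code >> 30,      # 3 means "error"
        "unique_value": unique_value,
        "final_status": final_status,
        "final_status_name": KNOWN_STATUS.get(final_status, "unknown"),
    }


for label, dump in (("id 11", ID11_HEX), ("id 15", ID15_HEX)):
    info = decode_header(dump)
    print(label, info["event_id"], hex(info["final_status"]),
          info["final_status_name"])
```

Under that assumption, the low 16 bits of the error code match the event ids shown (11 and 15), and the id 15 entry carries a final status of 0xC000009D, which the NTSTATUS tables list as STATUS_DEVICE_NOT_CONNECTED.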
    This Microsoft support article has got me worried.


    I'm now looking for newer drivers, but there don't seem to be any more recent ones...
    I also have contacted SuperMicro tech support...

    Any other suggestions ?


    Jörg
    pixar
    Dream as if you'll live forever. Live as if you'll die tomorrow. (James Dean)

  • #2
    Check your cables?

    unplug/replug?



    • #3
      Tried both things; I have a second cable + terminator, but the problem persists. Plugging/unplugging doesn't seem to help...

      Currently, I get the event entries upon boot, and occasionally while the system is running. Fortunately without any consequences, but I don't trust the system anymore. I want to know where these messages come from, prevent the cause from occurring, and have a "clean" log file when there are no issues.


      Jörg


      • #4
        Me votes for a failing hard drive. Have you done any checks on it?
        Chief Lemon Buyer no more Linux sucks but not as much
        Weather nut and sad git.

        My Weather Page



        • #5
          Yes, I ran scan software and SMART diagnostics. The IBM supports SMART self-tests and passes them; the Quantum does not support self-tests, but reports as OK. Besides, it shouldn't occur when the "failing" drive is disconnected, right?

          The event-log entries were most likely present as soon as I moved my disks to this controller (they came from an Adaptec 2940UW). I then contacted SuperMicro regarding this, but they didn't respond. As there were no problems at that time, I decided (stupidly enough?) to ignore these log entries. For what it is worth, all hardware is still covered by warranty (drives: 5 years; the mainboard is quite recent).

          I also checked the directions here:

          If both internal and external SCSI devices are attached, make sure that the last device on each SCSI chain is terminated, and make sure that intermediate devices are not terminated.
          Check.
          If there is only a single SCSI chain (either all internal or external), make sure that the last device of the SCSI chain is terminated and that the SCSI controller itself is terminated. This is usually a BIOS setting.
          Check.
          Check for loose or poor-quality SCSI cabling. A long chain of cables with mixed internal and external cabling can degrade the signal. A SCSI specification that allows for a long distance assumes that the cabling allows no leakage or interference. The allowable reality is generally a shorter distance. External cables that are six feet long or longer should be replaced with three-foot cables.
          2 different scsi-cables and terminators yield the same issues; cable length is well within limits.
          Take note of when the event messages were recorded, and try to determine whether the messages coincide with certain processing schedules (such as backups) or heavy disk processing. This might pinpoint the device that is causing the errors.
          No particular action is undertaken at the time of the entries.
          The tendency of drives to have these types of problems under heavy stress is often due to slow microprocessors. In a multitasking environment, the processor may not be fast enough to process all the input/output (I/O) commands that arrive almost simultaneously.
          Problem also occurs when system is close to idle.
          Slow down the transfer rate settings if timeouts are associated with tape drives; using a 5-mbs transfer rate usually cures the timeouts.
          No tape drives.
          Simplify the SCSI/IDE chain by removing devices. If you suspect that a particular device is causing the problem, move that device to another controller. If the behavior follows the device, replace the device.
          Simplifying doesn't help (only thing that has not been replaced is the onboard controller).
          Check the revisions of the SCSI controller BIOS and of device firmware, and obtain the latest revisions from the manufacturer. (There is a procedure for checking the model number and firmware revision later in this article.)
          I haven't found new firmware for either the controller or the drives.
          Check the version of SCSI device driver. The SCSI driver is located in the %SystemRoot%\System32\Drivers folder. Look at the version in the properties for the driver file. If the driver is not up-to-date, see whether the manufacturer has a newer version.
          No newer version (on Adaptec or on Supermicro)
          Remove any other controllers that might create bus contention issues.
          Not possible, bus is hardly fully loaded (no network, U320 is the only traffic-intensive device).
          See whether a low-level format performed by the SCSI controller resolves the event messages.
          The drives were not low-level formatted by me when I got them; they first were attached to a 2940UW (without any problems, so I didn't perform a low-level format back then)
          Could it be this? Of all the options, this is the only one that seems viable...
          Try substituting a different make or model of any suspect hardware.
          Not possible (onboard controller, no other ones available).
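
For the step about noting when the messages were recorded, a small script can make bursts of entries easier to spot. This is only a sketch: the timestamps below are hypothetical sample values (not taken from the actual log), and `group_bursts` and the 60-second gap are illustrative choices.

```python
from datetime import datetime, timedelta


def group_bursts(entries, gap=timedelta(seconds=60)):
    """Group (timestamp, event_id) pairs into bursts.

    A new burst starts whenever an entry is more than `gap` later
    than the previous entry.
    """
    bursts = []
    for ts, event_id in sorted(entries):
        if bursts and ts - bursts[-1][-1][0] <= gap:
            bursts[-1].append((ts, event_id))
        else:
            bursts.append([(ts, event_id)])
    return bursts


# Hypothetical sample data, for illustration only.
sample = [
    (datetime(2003, 7, 9, 8, 0, 1), 11),
    (datetime(2003, 7, 9, 8, 0, 2), 11),
    (datetime(2003, 7, 9, 8, 0, 4), 15),
    (datetime(2003, 7, 9, 14, 30, 0), 15),
    (datetime(2003, 7, 9, 14, 30, 5), 15),
]

for burst in group_bursts(sample):
    start = burst[0][0]
    ids = sorted({event_id for _, event_id in burst})
    print(start.isoformat(), "entries:", len(burst), "ids:", ids)
```

Comparing the burst start times against boot times, backups, or other scheduled jobs would show whether the entries line up with any particular activity.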


          Jörg
          Last edited by VJ; 10 July 2003, 02:46.


          • #6
            Go back to 2940 and see if the problem still occurs.


            • #7
              Now why didn't I think of that...
              (perhaps because the controller is in another computer, but that shouldn't be too much of a problem)
              Hopefully, XP will not complain about the controller with the bootdrive being changed...

              Will be able to try it after the weekend though...
              (nice thinking)

              Euhm, any ideas for both cases (i.e. if the problem no longer appears, and if the problem persists)?


              Jörg


              • #8
                It really sounds to me like a failing HDD. My Quantum Viking did this on a 2940UW. It worked fine afterwards, then did it again a couple of weeks later, and a week after that it failed. Did you try verifying the disk with the Adaptec controller BIOS? This has been known to find and fix errors for me. Other than that... Stupid question, but do you have your power management set to turn off the HDD after, say, 20 minutes? Just a couple of thoughts...
                "I dream of a better world where chickens can cross the road without having their motives questioned."



                • #9
                  I had a 9.1GB 7200RPM SCSI Quantum die on me, sounding pretty much as you described. Luckily they had 5-year warranties on them, and when I RMA'd it, I got a 9.1GB 10K RPM drive back.
                  It has worked fine since.
                  We have enough youth - What we need is a fountain of smart!


                  i7-920, 6GB DDR3-1600, HD4870X2, Dell 27" LCD



                  • #10
                    Byock:
                    Not a stupid idea, but I haven't set the drives to spin down.

                    Well, this Quantum is actually a replacement for another one (about 3 years ago), which started giving numerous SMART errors.

                    I will try verifying the media (and low-level formatting the drives), but I dreaded doing so, as I would have problems moving my data to other disks... Still, better safe than sorry...
                    The fact that it just started acting up on the new controller would then be merely a coincidence? Also, how could one failing drive result in event id 15 for both drives on this controller?


                    Jörg


                    • #11
                      Supermicro mailed back (pretty fast). They advised me to perform a BIOS update, and they mailed a new SCSI driver I should try.

                      If the problem persists, they advised me to disconnect my channel B (CD writer) and remove my Adaptec 2906 (to see if either of those caused the conflicts).


                      Jörg


                      • #12
                        Problem has been located!

                        I followed SuperMicro's instructions, but after removing the 2906, my U320 could not find my boot drive. I then started looking further, and it turns out there is too much clearance between the Quantum PCB and the disk. Pressing on the PCB gets the system working again; releasing it cuts power to the drive. I then wedged an object in to keep the PCB in place, and lo and behold: the system booted, all devices were connected, and there was no error entry in the system log.

                        So the Quantum is being sent back (for the second time; it was itself a replacement after my initial one got corrupt).


                        Jörg


                        • #13
                          bummer
                          If there's artificial intelligence, there's bound to be some artificial stupidity.

                          Jeremy Clarkson "806 brake horsepower..and that on that limp wrist faerie liquid the Americans call petrol, if you run it on the more explosive jungle juice we have in Europe you'd be getting 850 brake horsepower..."



                          • #14
                            Yeah... it means I have to completely re-install my system (and I just had it configured with all the software I needed). Either way, better like this (I got to copy all the necessary files) than to experience a total failure in the midst of something...

                            Odd though, that the system was working for about 3 months without any problem besides the messages in the log files. I must say, the disk access seemed a lot faster and more responsive when I had it running with the PCB held in place by an object. So I'm guessing the power connector is perhaps not the only one to suffer from disturbances.

                            Hope it is covered by warranty; either way, I'm going to purchase an additional drive (never hurts): waiting for stuff to return usually takes some time (last time it took 3 weeks).


                            Jörg


                            • #15
                              ...interesting how simple things end up and how complicated and exhausting the paths are to find the end.

                              So often I find myself forced to learn things I don't even want to know about, but am richer for it in the end.

                              Richer, but not so sure I need the scare and pulled out hair...

                              How can you possibly take anything seriously?
                              Who cares?
