[TAG] Diagnosing SATA problems

Neil Youngman ny at youngman.org.uk
Tue Jan 23 00:23:58 MSK 2007


I've been having problems with my SATA disk for some time, and I've moved back 
to working off my old IDE disk while I investigate. I'm assuming the problem 
is hardware, but I don't have any spare hardware to swap in to prove the 
point. I think my next step is to buy another SATA controller and swap that, 
but first I thought I'd see if the gang's collective wisdom had any pointers 
to offer.

First off, here's an extract from /var/log/messages, taken while I was trying 
to copy a 1.4GB file onto the SATA disk.

Jan 22 15:52:57 tsr2 kernel: ata2: hard resetting port
Jan 22 15:52:57 tsr2 kernel: ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Jan 22 15:52:57 tsr2 kernel: ata2.00: configured for UDMA/100
Jan 22 15:52:57 tsr2 kernel: ata2: EH complete
Jan 22 15:52:57 tsr2 kernel: SCSI device sda: 390721968 512-byte hdwr sectors (200050 MB)
Jan 22 15:52:57 tsr2 kernel: sda: Write Protect is off
Jan 22 15:52:57 tsr2 kernel: SCSI device sda: drive cache: write back
Jan 22 15:52:58 tsr2 kernel: ata2: hard resetting port
Jan 22 15:52:59 tsr2 kernel: ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Jan 22 15:52:59 tsr2 kernel: ata2.00: configured for UDMA/100
Jan 22 15:52:59 tsr2 kernel: ata2: EH complete
Jan 22 15:52:59 tsr2 kernel: SCSI device sda: 390721968 512-byte hdwr sectors (200050 MB)
Jan 22 15:52:59 tsr2 kernel: sda: Write Protect is off
Jan 22 15:52:59 tsr2 kernel: SCSI device sda: drive cache: write back
Jan 22 15:52:59 tsr2 kernel: ata2: hard resetting port
Jan 22 15:52:59 tsr2 kernel: ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Jan 22 15:52:59 tsr2 kernel: ata2.00: configured for UDMA/100
Jan 22 15:52:59 tsr2 kernel: ata2: EH complete
Jan 22 15:52:59 tsr2 kernel: SCSI device sda: 390721968 512-byte hdwr sectors (200050 MB)
Jan 22 15:52:59 tsr2 kernel: sda: Write Protect is off
Jan 22 15:52:59 tsr2 kernel: SCSI device sda: drive cache: write back
Jan 22 15:53:01 tsr2 kernel: ata2.00: limiting speed to UDMA/66
Jan 22 15:53:01 tsr2 kernel: ata2: hard resetting port
Jan 22 15:53:02 tsr2 kernel: ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Jan 22 15:53:02 tsr2 kernel: ata2.00: configured for UDMA/66
Jan 22 15:53:02 tsr2 kernel: ata2: EH complete
Jan 22 15:53:02 tsr2 kernel: SCSI device sda: 390721968 512-byte hdwr sectors (200050 MB)
Jan 22 15:53:02 tsr2 kernel: sda: Write Protect is off
Jan 22 15:53:02 tsr2 kernel: SCSI device sda: drive cache: write back
Jan 22 15:53:04 tsr2 kernel: ata2.00: limiting speed to UDMA/44
Jan 22 15:53:04 tsr2 kernel: ata2: hard resetting port
Jan 22 15:53:05 tsr2 kernel: ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Jan 22 15:53:05 tsr2 kernel: ata2.00: configured for UDMA/44
Jan 22 15:53:05 tsr2 kernel: ata2: EH complete
Jan 22 15:53:05 tsr2 kernel: SCSI device sda: 390721968 512-byte hdwr sectors (200050 MB)
Jan 22 15:53:05 tsr2 kernel: sda: Write Protect is off
Jan 22 15:53:05 tsr2 kernel: SCSI device sda: drive cache: write back
Jan 22 15:53:09 tsr2 kernel: ata2.00: limiting speed to UDMA/33
Jan 22 15:53:09 tsr2 kernel: ata2: hard resetting port
Jan 22 15:53:10 tsr2 kernel: ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Jan 22 15:53:10 tsr2 kernel: ata2.00: configured for UDMA/33
Jan 22 15:53:10 tsr2 kernel: ata2: EH complete
Jan 22 15:53:10 tsr2 kernel: SCSI device sda: 390721968 512-byte hdwr sectors (200050 MB)
Jan 22 15:53:10 tsr2 kernel: sda: Write Protect is off
Jan 22 15:53:10 tsr2 kernel: SCSI device sda: drive cache: write back
Jan 22 15:53:11 tsr2 kernel: ata2.00: limiting speed to UDMA/25
Jan 22 15:53:11 tsr2 kernel: ata2: hard resetting port
Jan 22 15:53:12 tsr2 kernel: ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Jan 22 15:53:12 tsr2 kernel: ata2.00: configured for UDMA/25
Jan 22 15:53:12 tsr2 kernel: ata2: EH complete
Jan 22 15:53:12 tsr2 kernel: SCSI device sda: 390721968 512-byte hdwr sectors (200050 MB)
Jan 22 15:53:12 tsr2 kernel: sda: Write Protect is off
Jan 22 15:53:12 tsr2 kernel: SCSI device sda: drive cache: write back
Jan 22 15:53:16 tsr2 kernel: ata2.00: limiting speed to UDMA/16
Jan 22 15:53:16 tsr2 kernel: ata2: hard resetting port
Jan 22 15:53:17 tsr2 kernel: ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Jan 22 15:53:17 tsr2 kernel: ata2.00: configured for UDMA/16
Jan 22 15:53:17 tsr2 kernel: ata2: EH complete
Jan 22 15:53:17 tsr2 kernel: SCSI device sda: 390721968 512-byte hdwr sectors (200050 MB)
Jan 22 15:53:17 tsr2 kernel: sda: Write Protect is off
Jan 22 15:53:17 tsr2 kernel: SCSI device sda: drive cache: write back
Jan 22 15:53:18 tsr2 kernel: ata2.00: limiting speed to PIO4
Jan 22 15:53:18 tsr2 kernel: ata2: hard resetting port
Jan 22 15:53:19 tsr2 kernel: ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Jan 22 15:53:19 tsr2 kernel: ata2.00: configured for PIO4
Jan 22 15:53:19 tsr2 kernel: ata2: EH complete

At this point 46MB has been copied and the machine is effectively hung.
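
The pattern in that extract (repeated hard resets, with the transfer mode 
stepped down each time) is easier to see in summary form. What follows is a 
rough Python sketch for pulling it out of a syslog file. The function name and 
the regular expressions are my own, written to match only the message formats 
shown above; other kernel versions may word these messages differently.

```python
import re
from collections import Counter

# Patterns written against the libata messages in the extract above.
# Assumption: other kernels may phrase these lines differently.
RESET_RE = re.compile(r"kernel: (ata\d+): hard resetting port")
LIMIT_RE = re.compile(r"kernel: (ata\d+\.\d+): limiting speed to (\S+)")


def summarise(lines):
    """Return (hard resets per port, ordered list of (device, mode) step-downs)."""
    resets = Counter()
    step_downs = []
    for line in lines:
        m = RESET_RE.search(line)
        if m:
            resets[m.group(1)] += 1
            continue
        m = LIMIT_RE.search(line)
        if m:
            step_downs.append((m.group(1), m.group(2)))
    return resets, step_downs


# A few lines copied from the extract, used as sample input.
SAMPLE = [
    "Jan 22 15:52:59 tsr2 kernel: ata2: hard resetting port",
    "Jan 22 15:53:01 tsr2 kernel: ata2.00: limiting speed to UDMA/66",
    "Jan 22 15:53:01 tsr2 kernel: ata2: hard resetting port",
    "Jan 22 15:53:18 tsr2 kernel: ata2.00: limiting speed to PIO4",
]

if __name__ == "__main__":
    resets, step_downs = summarise(SAMPLE)
    for port, count in resets.items():
        print(f"{port}: {count} hard resets")
    for device, mode in step_downs:
        print(f"{device}: limited to {mode}")
```

To run it on a real log, pass an open file instead of SAMPLE, for example 
summarise(open("/var/log/messages")). Fed the full extract above, it should 
count nine hard resets on ata2 and six step-downs, ending at PIO4.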

Once I realised I had a problem, I naturally installed smartmontools, and this 
is what smartctl tells me:

# smartctl -d ata -l selftest /dev/sda
smartctl version 5.36 [i686-pc-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%      1685         -
# 2  Short offline       Completed without error       00%      1671         -
# 3  Short offline       Completed without error       00%      1667         -
# 4  Extended offline    Completed without error       00%      1661         -
# 5  Short offline       Completed without error       00%      1643         -
# 6  Short offline       Completed without error       00%      1628         -
# 7  Short offline       Completed without error       00%      1613         -
# 8  Short offline       Completed without error       00%      1598         -
# 9  Short offline       Completed without error       00%      1583         -
#10  Short offline       Completed without error       00%      1569         -
#11  Short offline       Completed without error       00%      1555         -
#12  Extended offline    Completed without error       00%      1549         -
#13  Short offline       Completed without error       00%      1538         -
#14  Extended offline    Completed without error       00%      1523         -

# smartctl -d ata -l error /dev/sda
smartctl version 5.36 [i686-pc-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART Error Log Version: 1
No Errors Logged

# 

The lack of any errors suggests to me that the problem is not with the disk, 
hence the thought that I should replace the controller. Is this a reasonable 
conclusion from the data available?
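
For what it's worth, the check I did by eye on the self-test log can be 
sketched in Python as below. This is only an illustration: the regular 
expression is written against the column layout smartctl 5.36 printed above, 
the function name is made up, and the failing entry in the sample is invented 
to exercise the code, not taken from my disk.

```python
import re

# One self-test log entry: number, description, status, remaining %,
# lifetime hours, LBA of first error. Columns are separated by two or
# more spaces in the output shown above (an assumption about layout).
ENTRY_RE = re.compile(
    r"^#\s*(\d+)\s+(.+?)\s{2,}(.+?)\s{2,}(\d+)%\s+(\d+)\s+(\S+)\s*$"
)


def check_selftest_log(text):
    """Return (entries parsed, list of entries that did not complete cleanly)."""
    parsed = 0
    failures = []
    for line in text.splitlines():
        m = ENTRY_RE.match(line)
        if not m:
            continue  # headers, shell prompts and blank lines land here
        parsed += 1
        if m.group(3) != "Completed without error":
            failures.append((int(m.group(1)), m.group(2), m.group(3)))
    return parsed, failures


# The first entry is invented for illustration; the other two are from the log.
SAMPLE = """\
# 1  Short offline       Completed: read failure       90%      1700         12345
# 4  Extended offline    Completed without error       00%      1661         -
#10  Short offline       Completed without error       00%      1569         -
"""

if __name__ == "__main__":
    parsed, failures = check_selftest_log(SAMPLE)
    print(f"{parsed} entries parsed, {len(failures)} not clean")
    for number, description, status in failures:
        print(f"  #{number} {description}: {status}")
```

Returning the number of parsed entries alongside the failures guards against 
the case where the layout has changed and nothing matches, which would 
otherwise look like a clean result.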

I have tried reseating the controller card and cables, and moving the SATA 
cable to the secondary port on the controller.

Is there anything else I should be trying?

Neil Youngman



