S.M.A.R.T. Errors - Self-Monitoring, Analysis, and Reporting Technology
Predictable Failures: These result from mechanical wear and the gradual degradation of storage surfaces. S.M.A.R.T. monitoring can be used to determine when such failures are becoming more likely.
Unpredictable Failures: These happen suddenly and without warning. They can occur for a range of reasons, such as an electrical component becoming faulty and causing a sudden mechanical failure (more commonly a result of bad handling of the disk drive).
Each drive manufacturer defines a set of attributes and then sets, for each one, a threshold that the attribute should not pass under normal operation. Each attribute also has a raw value, whose meaning is entirely up to the drive manufacturer.
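The threshold check described above can be sketched in Python as follows. This is a minimal illustration, not any vendor's implementation: the `SmartAttribute` class, its field names, and the sample numbers are hypothetical, and the sketch assumes the common convention that an attribute counts as failed when its normalized value is at or below the manufacturer's threshold.

```python
from dataclasses import dataclass


@dataclass
class SmartAttribute:
    """Hypothetical container for one attribute as reported by a drive."""
    attr_id: int      # attribute ID, e.g. 5 for Reallocated Sectors Count
    name: str
    value: int        # normalized value set by the drive firmware
    threshold: int    # manufacturer-defined threshold
    raw: int          # raw value; its meaning is up to the manufacturer


def has_failed(attr: SmartAttribute) -> bool:
    """Assumed convention: a value at or below the threshold counts as failed."""
    return attr.value <= attr.threshold


def failed_attributes(attrs):
    """Return the names of all attributes that have crossed their threshold."""
    return [a.name for a in attrs if has_failed(a)]


# Sample numbers are made up for illustration only.
sample = [
    SmartAttribute(5, "Reallocated Sectors Count", value=100, threshold=36, raw=0),
    SmartAttribute(10, "Spin Retry Count", value=90, threshold=97, raw=12),
]
print(failed_attributes(sample))  # ['Spin Retry Count']
```

Note that the check uses only the normalized value and the threshold; the raw value is carried along but not interpreted, since its meaning differs between manufacturers.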
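Known ATA S.M.A.R.T. attributes. Drive manufacturers may not make all of these available, but just a select few (normally 20 are chosen). Each entry below gives the attribute ID in decimal, the ID in hexadecimal, the attribute name, and a description.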
| 1 | 01 | Read Error Rate | Indicates the rate of hardware read errors that occurred when reading data from a disk surface. The raw value has a different structure for different vendors and is often not meaningful as a decimal number. |
| 2 | 02 | Throughput Performance | Overall (general) throughput performance of a hard disk drive. If the value of this attribute is decreasing, there is a high probability that there is a problem with the disk. |
| 3 | 03 | Spin-Up Time | Average time of spindle spin-up, from zero RPM to fully operational (in milliseconds). |
| 4 | 04 | Start/Stop Count | A tally of spindle start/stop cycles. The spindle turns on, and hence the count is increased, both when the hard disk is turned on after having been turned entirely off (disconnected from the power source) and when the hard disk returns from sleep mode. |
| 5 | 05 | Reallocated Sectors Count | Count of reallocated sectors. When the hard drive finds a read/write/verification error, it marks the sector as "reallocated" and transfers the data to a special reserved area (spare area). This process is also known as remapping, and "reallocated" sectors are called remaps. This is why, on modern hard disks, "bad blocks" cannot be found while testing the surface: all bad blocks are hidden in reallocated sectors. However, as the number of reallocated sectors increases, the read/write speed tends to decrease. The raw value normally represents a count of the bad sectors that have been found and remapped. Thus, the higher the raw value, the more sectors the drive has had to reallocate. |
| 6 | 06 | Read Channel Margin | Margin of a channel while reading data. The function of this attribute is not specified. |
| 7 | 07 | Seek Error Rate | Rate of seek errors of the magnetic heads. If there is a partial failure in the mechanical positioning system, seek errors will arise. Such a failure may be due to numerous factors, such as damage to a servo or thermal widening of the hard disk. The raw value has a different structure for different vendors and is often not meaningful as a decimal number. |
| 8 | 08 | Seek Time Performance | Average performance of seek operations of the magnetic heads. If this attribute is decreasing, it is a sign of problems in the mechanical subsystem. |
| 9 | 09 | Power-On Hours (POH) | Count of hours in the power-on state. The raw value of this attribute shows the total count of hours (or minutes, or seconds, depending on the manufacturer) in the power-on state. |
| 10 | 0A | Spin Retry Count | Count of spin-start retries. This attribute stores the total count of spin-start attempts made to reach fully operational speed (under the condition that the first attempt was unsuccessful). An increase in this attribute value is a sign of problems in the hard disk mechanical subsystem. |
| 11 | 0B | Recalibration Retries (or Calibration_Retry_Count) | Indicates the number of times recalibration was requested (under the condition that the first attempt was unsuccessful). An increase in this attribute value is a sign of problems in the hard disk mechanical subsystem. |
| 12 | 0C | Power Cycle Count | Indicates the count of full hard disk power on/off cycles. |
| 13 | 0D | Soft Read Error Rate | Uncorrected read errors reported to the operating system. |
| 183 | B7 | SATA Downshift Error Count | Western Digital and Samsung attribute. |
| 184 | B8 | End-to-End Error | This attribute is part of HP's SMART IV technology. It means that after transfer through the cache RAM data buffer, the parity data between the host and the hard drive did not match. |
| 185 | B9 | Head Stability | Western Digital attribute. |
| 186 | BA | Induced Op-Vibration Detection | Western Digital attribute. |
| 187 | BB | Reported Uncorrectable Errors | The number of errors that could not be recovered using hardware ECC (see attribute 195). |
| 188 | BC | Command Timeout | The number of operations aborted due to HDD timeout. Normally this attribute value should be zero; a value far above zero most likely indicates a serious problem with the power supply or an oxidized data cable. |
| 189 | BD | High Fly Writes | HDD producers implement a Fly Height Monitor that attempts to provide additional protection for write operations by detecting when a recording head is flying outside its normal operating range. If an unsafe fly height condition is encountered, the write process is stopped, and the information is rewritten or reallocated to a safe region of the hard drive. This attribute indicates the count of these errors detected over the lifetime of the drive. This feature is implemented in most modern Seagate drives and some of Western Digital’s drives, beginning with the WD Enterprise WDE18300 and WDE9180 Ultra2 SCSI hard drives, and will be included on all future WD Enterprise products. |
| 190 | BE | Airflow Temperature (WDC) | Airflow temperature on Western Digital hard drives (same as Temperature [C2], but the current value is 50 less for some models; marked as obsolete). |
| 190 | BE | Temperature Difference from 100 | Value is equal to (100 - temp. °C), allowing the manufacturer to set a minimum threshold that corresponds to a maximum temperature. (Seagate only?) Verified present on: Seagate ST910021AS; Seagate ST9120823ASG (under name "Airflow Temperature Cel", 2008-10-06); Seagate ST3802110A (2007-02-13); Seagate ST980825AS (2007-04-05); Seagate ST3320620AS (2007-04-23); Seagate ST3500641AS (2007-06-12); Seagate ST3250824AS (2007-08-07); Seagate ST3250620AS; Seagate ST31000340AS (2008-02-05); Seagate ST31000333AS (2008-11-24); Seagate ST3160211AS (2008-06-12); Seagate ST3320620AS (2008-06-12); Seagate ST3400620AS (2008-06-12); Seagate ST3750330AS (2009-07-06); Seagate ST3500418AS (2010-04-03); Samsung HD501LJ (under name "Airflow Temperature", 2008-03-02); Samsung HD753LJ (under name "Airflow Temperature", 2008-07-15). |
| 191 | BF | G-Sense Error Rate | The number of errors resulting from externally induced shock and vibration. |
| 192 | C0 | Power-Off Retract Count (or Emergency Retract Cycle Count, Fujitsu) | Number of times the heads are loaded off the media. Heads can be unloaded without actually powering off. |
| 193 | C1 | Load Cycle Count (or Load/Unload Cycle Count, Fujitsu) | Count of load/unload cycles into the head landing zone position. The typical lifetime rating for laptop (2.5-in) hard drives is 300,000 to 600,000 load cycles. Some laptop drives are programmed to unload the heads whenever there has not been any activity for about five seconds. Many Linux installations write to the filesystem a few times a minute in the background. As a result, there may be 100 or more load cycles per hour, and the load cycle rating may be exceeded in less than a year. |
| 194 | C2 | Temperature | Current internal temperature. |
| 195 | C3 | Hardware ECC Recovered | The raw value has a different structure for different vendors and is often not meaningful as a decimal number. |
| 196 | C4 | Reallocation Event Count | Count of remap operations. The raw value of this attribute shows the total number of attempts to transfer data from reallocated sectors to a spare area. Both successful and unsuccessful attempts are counted. |
| 197 | C5 | Current Pending Sector Count | Number of "unstable" sectors (waiting to be remapped because of read errors). If an unstable sector is subsequently written or read successfully, this value is decreased and the sector is not remapped. Read errors on a sector will not remap the sector (since it might be readable later); instead, the drive firmware remembers that the sector needs to be remapped, and remaps it the next time it is written. |
| 198 | C6 | Uncorrectable Sector Count (or Off-Line Scan Uncorrectable Sector Count, Fujitsu) | The total number of uncorrectable errors when reading/writing a sector. A rise in the value of this attribute indicates defects of the disk surface and/or problems in the mechanical subsystem. |
| 199 | C7 | UltraDMA CRC Error Count | The number of errors in data transfer via the interface cable as determined by ICRC (Interface Cyclic Redundancy Check). |
| 200 | C8 | Multi-Zone Error Rate | The number of errors found when writing a sector. The higher the value, the worse the disk's mechanical condition is. |
| 200 | C8 | Write Error Rate (Fujitsu) | The total number of errors when writing a sector. |
| 201 | C9 | Soft Read Error Rate | Number of off-track errors. |
| 202 | CA | Data Address Mark Errors | Number of Data Address Mark errors (or vendor-specific). |
| 203 | CB | Run Out Cancel | Number of ECC errors. |
| 204 | CC | Soft ECC Correction | Number of errors corrected by software ECC. |
| 205 | CD | Thermal Asperity Rate (TAR) | Number of errors due to high temperature. |
| 206 | CE | Flying Height | Height of the heads above the disk surface. A flying height that is too low increases the chances of a head crash, while a flying height that is too high increases the chances of a read/write error. |
| 207 | CF | Spin High Current | Amount of surge current used to spin up the drive. |
| 208 | D0 | Spin Buzz | Number of buzz routines needed to spin up the drive due to insufficient power. |
| 209 | D1 | Offline Seek Performance | Drive’s seek performance during its internal tests. |
| 211 | D3 | Vibration During Write | Vibration during write. |
| 212 | D4 | Shock During Write | Shock during write. |
| 220 | DC | Disk Shift | Distance the disk has shifted relative to the spindle (usually due to shock or temperature). Unit of measure is unknown. |
| 221 | DD | G-Sense Error Rate | The number of errors resulting from externally induced shock and vibration. |
| 222 | DE | Loaded Hours | Time spent operating under data load (movement of the magnetic head armature). |
| 223 | DF | Load/Unload Retry Count | Number of times the head changes position. |
| 224 | E0 | Load Friction | Resistance caused by friction in mechanical parts while operating. |
| 225 | E1 | Load/Unload Cycle Count | Total number of load cycles. |
| 226 | E2 | Load 'In'-time | Total time of loading on the magnetic heads actuator (time not spent in the parking area). |
| 227 | E3 | Torque Amplification Count | Number of attempts to compensate for platter speed variations. |
| 228 | E4 | Power-Off Retract Cycle | The number of times the magnetic armature was retracted automatically as a result of cutting power. |
| 230 | E6 | GMR Head Amplitude | Amplitude of "thrashing" (distance of repetitive forward/reverse head motion). |
| 231 | E7 | Temperature | Drive temperature. |
| 240 | F0 | Head Flying Hours | Time while the head is positioning. |
| 240 | F0 | Transfer Error Rate (Fujitsu) | Counts the number of times the link is reset during a data transfer. |
| 241 | F1 | Total LBAs Written | Total LBAs written. |
| 242 | F2 | Total LBAs Read | Total LBAs read. Some S.M.A.R.T. utilities report a negative number for the raw value because they treat it as a 32-bit quantity when in reality it has 48 bits. |
| 250 | FA | Read Error Retry Rate | Number of errors while reading from a disk. |
| 254 | FE | Free Fall Protection | Number of "Free Fall Events" detected. |
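Several descriptions in the table involve simple arithmetic on attribute values. The Python sketch below illustrates three of them: recovering a temperature from the "Temperature Difference from 100" variant of attribute 190, estimating how long a drive takes to reach its load cycle rating (attribute 193), and reading a 48-bit raw value that a 32-bit reader would show as negative (attribute 242). All function names and sample numbers are hypothetical, and the little-endian byte order for the six raw bytes is an assumption, since the raw layout is up to the manufacturer.

```python
def temperature_from_difference(value: int) -> int:
    """Attribute 190 variant: value = 100 - temp (°C), so temp = 100 - value."""
    return 100 - value


def hours_to_reach_rating(rated_cycles: int, cycles_per_hour: float) -> float:
    """Powered-on hours until the load cycle rating is reached at a steady rate."""
    return rated_cycles / cycles_per_hour


def raw48_to_int(raw: bytes) -> int:
    """Read six raw bytes as one unsigned 48-bit integer (assumed little-endian)."""
    if len(raw) != 6:
        raise ValueError("expected exactly 6 raw bytes")
    return int.from_bytes(raw, "little")


def as_signed_32(value: int) -> int:
    """Show what a reader sees if it keeps only 32 bits and treats them as signed."""
    low = value & 0xFFFFFFFF
    return low - (1 << 32) if low >= (1 << 31) else low


print(temperature_from_difference(62))            # 38
print(hours_to_reach_rating(300_000, 100) / 24)   # 125.0 (days)
print(hours_to_reach_rating(600_000, 100) / 24)   # 250.0 (days)

lbas = 3_000_000_000                 # made-up counter value above 2**31
raw = lbas.to_bytes(6, "little")     # six raw bytes, as a drive might store them
print(raw48_to_int(raw))             # 3000000000
print(as_signed_32(lbas))            # -1294967296
```

At 100 load cycles per hour, the 300,000 to 600,000 cycle range works out to roughly 125 to 250 days of continuous operation, which is consistent with the "less than a year" figure given for Load Cycle Count.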
Disclaimers
Information in this document is provided in connection with integrated products we sell at eAegis. No license, express or implied, by estoppel or otherwise, to any eAegis, Inc. intellectual property rights is granted by this document. Except as provided in eAegis’s Terms and Conditions of Sale for such products, eAegis, Inc. assumes no liability whatsoever, and eAegis, Inc. disclaims any express or implied warranty relating to the sale and/or use of eAegis products, including liability or warranties relating to fitness for a particular purpose, merchantability, or infringement of any patent, copyright, or other eAegis intellectual property right. eAegis products are not intended for use in medical, life-saving, or life-sustaining applications. Copyright © eAegis, Inc. 2010.
© Copyright 2010 eAegis, Inc. All rights reserved.
Article posted July 2010.