========================
libATA Developer's Guide
========================

:Author: Jeff Garzik

Introduction
============

libATA is a library used inside the Linux kernel to support ATA host
controllers and devices. libATA provides an ATA driver API, class
transports for ATA and ATAPI devices, and SCSI<->ATA translation for ATA
devices according to the T10 SAT specification.

This Guide documents the libATA driver API, library functions, library
internals, and a couple sample ATA low-level drivers.

libata Driver API
=================

:c:type:`struct ata_port_operations <ata_port_operations>`
is defined for every low-level libata
hardware driver, and it controls how the low-level driver interfaces
with the ATA and SCSI layers.

FIS-based drivers will hook into the system with ``->qc_prep()`` and
``->qc_issue()`` high-level hooks. Hardware which behaves in a manner
similar to PCI IDE hardware may utilize several generic helpers,
defining at a bare minimum the bus I/O addresses of the ATA shadow
register blocks.

:c:type:`struct ata_port_operations <ata_port_operations>`
----------------------------------------------------------

Post-IDENTIFY device configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

::

    void (*dev_config) (struct ata_port *, struct ata_device *);


Called after IDENTIFY [PACKET] DEVICE is issued to each device found.
Typically used to apply device-specific fixups prior to issue of SET
FEATURES - XFER MODE, and prior to operation.

This entry may be specified as NULL in ata_port_operations.

Set PIO/DMA mode
~~~~~~~~~~~~~~~~

::

    void (*set_piomode) (struct ata_port *, struct ata_device *);
    void (*set_dmamode) (struct ata_port *, struct ata_device *);
    void (*post_set_mode) (struct ata_port *);
    unsigned int (*mode_filter) (struct ata_port *, struct ata_device *, unsigned int);


Hooks called prior to the issue of SET FEATURES - XFER MODE command. The
optional ``->mode_filter()`` hook is called when libata has built a mask of
the possible modes. This is passed to the ``->mode_filter()`` function
which should return a mask of valid modes after filtering those
unsuitable due to hardware limits. It is not valid to use this interface
to add modes.

``dev->pio_mode`` and ``dev->dma_mode`` are guaranteed to be valid when
``->set_piomode()`` and when ``->set_dmamode()`` is called. The timings for
any other drive sharing the cable will also be valid at this point. That
is, the library records the decisions for the modes of each drive on a
channel before it attempts to set any of them.

``->post_set_mode()`` is called unconditionally, after the SET FEATURES -
XFER MODE command completes successfully.

``->set_piomode()`` is always called (if present), but ``->set_dmamode()``
is only called if DMA is possible.

Taskfile read/write
~~~~~~~~~~~~~~~~~~~

::

    void (*sff_tf_load) (struct ata_port *ap, struct ata_taskfile *tf);
    void (*sff_tf_read) (struct ata_port *ap, struct ata_taskfile *tf);


``->sff_tf_load()`` is called to load the given taskfile into hardware
registers / DMA buffers. ``->sff_tf_read()`` is called to read the hardware
registers / DMA buffers, to obtain the current set of taskfile register
values. Most drivers for taskfile-based hardware (PIO or MMIO) use
:c:func:`ata_sff_tf_load` and :c:func:`ata_sff_tf_read` for these hooks.
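
The ``->mode_filter()`` contract described under "Set PIO/DMA mode" (remove
unsuitable modes, never add any) can be illustrated with a small user-space
sketch. Everything named ``DEMO_*`` or ``demo_*`` below is hypothetical: the
bit layout is invented for the example and is not libata's real mode mask, the
function takes a plain cable flag instead of the ``ata_port`` and
``ata_device`` arguments of the real hook, and the assumed hardware limit is
that UDMA modes above UDMA2 need an 80-wire cable.

```c
#include <assert.h>

/* Hypothetical mode bits for this sketch only (not libata's mask layout). */
#define DEMO_PIO4   (1u << 0)
#define DEMO_MWDMA2 (1u << 1)
#define DEMO_UDMA2  (1u << 2)
#define DEMO_UDMA4  (1u << 3)
#define DEMO_UDMA5  (1u << 4)

/* Modes this sketch assumes are unusable on a 40-wire cable. */
#define DEMO_NEEDS_80WIRE (DEMO_UDMA4 | DEMO_UDMA5)

/*
 * Return the subset of 'mask' that the (hypothetical) hardware can use.
 * A filter may only clear bits; it must never set a bit that was not
 * already present in the mask it was given.
 */
static unsigned int demo_mode_filter(unsigned int mask, int has_80wire_cable)
{
    unsigned int allowed = mask;

    if (!has_80wire_cable)
        allowed &= ~DEMO_NEEDS_80WIRE;

    /* Sanity check of the "no added modes" rule. */
    assert((allowed & ~mask) == 0);
    return allowed;
}
```

With a 40-wire cable the fast UDMA bits are dropped and the slower modes pass
through unchanged; with an 80-wire cable the mask is returned as given.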

PIO data read/write
~~~~~~~~~~~~~~~~~~~

::

    void (*sff_data_xfer) (struct ata_device *, unsigned char *, unsigned int, int);


All bmdma-style drivers must implement this hook. This is the low-level
operation that actually copies the data bytes during a PIO data
transfer. Typically the driver will choose one of
:c:func:`ata_sff_data_xfer`, or :c:func:`ata_sff_data_xfer32`.

ATA command execute
~~~~~~~~~~~~~~~~~~~

::

    void (*sff_exec_command)(struct ata_port *ap, struct ata_taskfile *tf);


Causes an ATA command, previously loaded with ``->sff_tf_load()``, to be
initiated in hardware. Most drivers for taskfile-based hardware use
:c:func:`ata_sff_exec_command` for this hook.

Per-cmd ATAPI DMA capabilities filter
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

::

    int (*check_atapi_dma) (struct ata_queued_cmd *qc);


Allow low-level driver to filter ATA PACKET commands, returning a status
indicating whether or not it is OK to use DMA for the supplied PACKET
command.

This hook may be specified as NULL, in which case libata will assume
that ATAPI DMA can be supported.

Read specific ATA shadow registers
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

::

    u8   (*sff_check_status)(struct ata_port *ap);
    u8   (*sff_check_altstatus)(struct ata_port *ap);


Reads the Status/AltStatus ATA shadow register from hardware. On some
hardware, reading the Status register has the side effect of clearing
the interrupt condition. Most drivers for taskfile-based hardware use
:c:func:`ata_sff_check_status` for this hook.

Write specific ATA shadow register
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

::

    void (*sff_set_devctl)(struct ata_port *ap, u8 ctl);


Write the device control ATA shadow register to the hardware. Most
drivers don't need to define this.

Select ATA device on bus
~~~~~~~~~~~~~~~~~~~~~~~~

::

    void (*sff_dev_select)(struct ata_port *ap, unsigned int device);


Issues the low-level hardware command(s) that cause one of N hardware
devices to be considered 'selected' (active and available for use) on
the ATA bus. This generally has no meaning on FIS-based devices.

Most drivers for taskfile-based hardware use :c:func:`ata_sff_dev_select` for
this hook.

Private tuning method
~~~~~~~~~~~~~~~~~~~~~

::

    void (*set_mode) (struct ata_port *ap);


By default libata performs drive and controller tuning in accordance
with the ATA timing rules and also applies blacklists and cable limits.
Some controllers need special handling and have custom tuning rules,
typically RAID controllers that use ATA commands but do not actually do
drive timing.

    **Warning**

    This hook should not be used to replace the standard controller
    tuning logic when a controller has quirks. Replacing the default
    tuning logic in that case would bypass handling for drive and bridge
    quirks that may be important to data reliability. If a controller
    needs to filter the mode selection it should use the mode_filter
    hook instead.

Control PCI IDE BMDMA engine
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

::

    void (*bmdma_setup) (struct ata_queued_cmd *qc);
    void (*bmdma_start) (struct ata_queued_cmd *qc);
    void (*bmdma_stop) (struct ata_port *ap);
    u8   (*bmdma_status) (struct ata_port *ap);


When setting up an IDE BMDMA transaction, these hooks arm
(``->bmdma_setup``), fire (``->bmdma_start``), and halt (``->bmdma_stop``) the
hardware's DMA engine. ``->bmdma_status`` is used to read the standard PCI
IDE DMA Status register.

These hooks are typically either no-ops, or simply not implemented, in
FIS-based drivers.

Most legacy IDE drivers use :c:func:`ata_bmdma_setup` for the
:c:func:`bmdma_setup` hook. :c:func:`ata_bmdma_setup` will write the pointer
to the PRD table to the IDE PRD Table Address register, enable DMA in the DMA
Command register, and call :c:func:`exec_command` to begin the transfer.

Most legacy IDE drivers use :c:func:`ata_bmdma_start` for the
:c:func:`bmdma_start` hook. :c:func:`ata_bmdma_start` will write the
ATA_DMA_START flag to the DMA Command register.

Many legacy IDE drivers use :c:func:`ata_bmdma_stop` for the
:c:func:`bmdma_stop` hook. :c:func:`ata_bmdma_stop` clears the ATA_DMA_START
flag in the DMA command register.

Many legacy IDE drivers use :c:func:`ata_bmdma_status` as the
:c:func:`bmdma_status` hook.

High-level taskfile hooks
~~~~~~~~~~~~~~~~~~~~~~~~~

::

    enum ata_completion_errors (*qc_prep) (struct ata_queued_cmd *qc);
    int (*qc_issue) (struct ata_queued_cmd *qc);


Higher-level hooks, these two hooks can potentially supersede several of
the above taskfile/DMA engine hooks. ``->qc_prep`` is called after the
buffers have been DMA-mapped, and is typically used to populate the
hardware's DMA scatter-gather table. Some drivers use the standard
:c:func:`ata_bmdma_qc_prep` and :c:func:`ata_bmdma_dumb_qc_prep` helper
functions, but more advanced drivers roll their own.

``->qc_issue`` is used to make a command active, once the hardware and S/G
tables have been prepared. IDE BMDMA drivers use the helper function
:c:func:`ata_sff_qc_issue` for taskfile protocol-based dispatch. More
advanced drivers implement their own ``->qc_issue``.

:c:func:`ata_sff_qc_issue` calls ``->sff_tf_load()``, ``->bmdma_setup()``, and
``->bmdma_start()`` as necessary to initiate a transfer.
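
The way a protocol-based ``->qc_issue`` strings the lower-level hooks together
can be sketched in user space with a table of function pointers. This is only
an illustration of the ordering described above, not the body of
:c:func:`ata_sff_qc_issue`: all ``demo_*`` names are hypothetical, only two
protocols are modelled, and each hook merely appends a letter to a trace string
(``L`` load taskfile, ``S`` BMDMA setup, ``X`` execute command, ``G`` BMDMA
start). Following the description of :c:func:`ata_bmdma_setup`, the setup hook
is assumed to issue the command itself.

```c
#include <assert.h>
#include <string.h>

enum demo_prot { DEMO_PROT_NODATA, DEMO_PROT_DMA };

struct demo_port;

/* Hypothetical, much reduced stand-in for struct ata_port_operations. */
struct demo_ops {
    void (*tf_load)(struct demo_port *ap);
    void (*exec_command)(struct demo_port *ap);
    void (*bmdma_setup)(struct demo_port *ap);
    void (*bmdma_start)(struct demo_port *ap);
};

struct demo_port {
    const struct demo_ops *ops;
    char trace[16];              /* order in which hooks were called */
};

static void demo_note(struct demo_port *ap, const char *step)
{
    strncat(ap->trace, step, sizeof(ap->trace) - strlen(ap->trace) - 1);
}

static void demo_tf_load(struct demo_port *ap)     { demo_note(ap, "L"); }
static void demo_exec(struct demo_port *ap)        { demo_note(ap, "X"); }
static void demo_bmdma_start(struct demo_port *ap) { demo_note(ap, "G"); }

/* Arms the DMA engine, then issues the command (assumption, see lead-in). */
static void demo_bmdma_setup(struct demo_port *ap)
{
    demo_note(ap, "S");
    ap->ops->exec_command(ap);
}

static const struct demo_ops demo_sff_ops = {
    demo_tf_load, demo_exec, demo_bmdma_setup, demo_bmdma_start,
};

static struct demo_port demo_ap = { &demo_sff_ops, "" };

/* Dispatch on protocol and return the resulting call trace. */
static const char *demo_qc_issue(struct demo_port *ap, enum demo_prot prot)
{
    ap->trace[0] = '\0';
    ap->ops->tf_load(ap);
    if (prot == DEMO_PROT_DMA) {
        ap->ops->bmdma_setup(ap);
        ap->ops->bmdma_start(ap);
    } else {
        ap->ops->exec_command(ap);
    }
    return ap->trace;
}
```

A driver for more capable hardware would replace ``demo_qc_issue`` wholesale,
which is the sense in which the high-level hooks supersede the individual
taskfile and BMDMA hooks.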

Exception and probe handling (EH)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

::

    void (*freeze) (struct ata_port *ap);
    void (*thaw) (struct ata_port *ap);


:c:func:`ata_port_freeze` is called when HSM violations or some other
condition disrupts normal operation of the port. A frozen port is not
allowed to perform any operation until the port is thawed, which usually
follows a successful reset.

The optional ``->freeze()`` callback can be used for freezing the port
hardware-wise (e.g. mask interrupt and stop DMA engine). If a port
cannot be frozen hardware-wise, the interrupt handler must ack and clear
interrupts unconditionally while the port is frozen.

The optional ``->thaw()`` callback is called to perform the opposite of
``->freeze()``: prepare the port for normal operation once again. Unmask
interrupts, start DMA engine, etc.

::

    void (*error_handler) (struct ata_port *ap);


``->error_handler()`` is a driver's hook into probe, hotplug, and recovery
and other exceptional conditions. The primary responsibility of an
implementation is to call :c:func:`ata_std_error_handler`.

:c:func:`ata_std_error_handler` will perform a standard error handling sequence
to resurrect failed devices, detach lost devices and add new devices (if any).
This function will call the various reset operations for a port, as needed.
These operations are as follows.

* The 'prereset' operation (which may be NULL) is called during an EH reset,
  before any other action is taken.

* The 'postreset' hook (which may be NULL) is called after the EH reset is
  performed.

* Based on existing conditions, severity of the problem, and hardware
  capabilities, either the 'softreset' operation or the 'hardreset' operation
  will be called to perform the low-level EH reset. If both operations are
  defined, 'hardreset' is preferred and used.
  If neither is defined, no low-level reset
  is performed and EH assumes that an ATA class device is connected through the
  link.

::

    void (*post_internal_cmd) (struct ata_queued_cmd *qc);


Perform any hardware-specific actions necessary to finish processing
after executing a probe-time or EH-time command via
:c:func:`ata_exec_internal`.

Hardware interrupt handling
~~~~~~~~~~~~~~~~~~~~~~~~~~~

::

    irqreturn_t (*irq_handler)(int, void *, struct pt_regs *);
    void (*irq_clear) (struct ata_port *);


``->irq_handler`` is the interrupt handling routine registered with the
system, by libata. ``->irq_clear`` is called during probe just before the
interrupt handler is registered, to be sure hardware is quiet.

The second argument, dev_instance, should be cast to a pointer to
:c:type:`struct ata_host_set <ata_host_set>`.

Most legacy IDE drivers use :c:func:`ata_sff_interrupt` for the irq_handler
hook, which scans all ports in the host_set, determines which queued
command was active (if any), and calls ata_sff_host_intr(ap,qc).

Most legacy IDE drivers use :c:func:`ata_sff_irq_clear` for the
:c:func:`irq_clear` hook, which simply clears the interrupt and error flags
in the DMA status register.

SATA phy read/write
~~~~~~~~~~~~~~~~~~~

::

    int (*scr_read) (struct ata_port *ap, unsigned int sc_reg,
             u32 *val);
    int (*scr_write) (struct ata_port *ap, unsigned int sc_reg,
                       u32 val);


Read and write standard SATA phy registers.
sc_reg is one of SCR_STATUS, SCR_CONTROL, SCR_ERROR, or SCR_ACTIVE.

Init and shutdown
~~~~~~~~~~~~~~~~~

::

    int (*port_start) (struct ata_port *ap);
    void (*port_stop) (struct ata_port *ap);
    void (*host_stop) (struct ata_host_set *host_set);


``->port_start()`` is called just after the data structures for each port
are initialized.
Typically this is used to allocate per-port DMA buffers /
tables / rings, enable DMA engines, and similar tasks. Some drivers also
use this entry point as a chance to allocate driver-private memory for
``ap->private_data``.

Many drivers use :c:func:`ata_port_start` as this hook or call it from their
own :c:func:`port_start` hooks. :c:func:`ata_port_start` allocates space for
a legacy IDE PRD table and returns.

``->port_stop()`` is called before ``->host_stop()``. Its sole function is to
release DMA/memory resources, now that they are no longer actively being
used. Many drivers also free driver-private data from port at this time.

``->host_stop()`` is called after all ``->port_stop()`` calls have completed.
The hook must finalize hardware shutdown, release DMA and other
resources, etc. This hook may be specified as NULL, in which case it is
not called.

Error handling
==============

This chapter describes how errors are handled under libata. Readers are
advised to read SCSI EH (Documentation/scsi/scsi_eh.rst) and ATA
exceptions doc first.

Origins of commands
-------------------

In libata, a command is represented with
:c:type:`struct ata_queued_cmd <ata_queued_cmd>` or qc.
qc's are preallocated during port initialization and repetitively used
for command executions. Currently only one qc is allocated per port but
the yet-to-be-merged NCQ branch allocates one for each tag and maps each qc
to NCQ tag 1-to-1.

libata commands can originate from two sources - libata itself and SCSI
midlayer. libata internal commands are used for initialization and error
handling. All normal blk requests and commands for SCSI emulation are
passed as SCSI commands through queuecommand callback of SCSI host
template.
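
The preallocate-and-reuse scheme for qc's, with one qc per tag, can be pictured
as a per-port bitmap in which a set bit means "this tag's qc is in use". The
following user-space sketch is an assumption-laden illustration rather than
libata code: ``demo_port``, ``demo_qc_alloc`` and ``demo_qc_free`` are
hypothetical names and the 32-tag limit is chosen for the example. Only the
idea of tracking allocation with bits in ``ap->qactive`` is taken from this
guide.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_MAX_TAGS 32          /* assumed queue depth for the sketch */

struct demo_port {
    uint32_t qactive;             /* bit N set: qc for tag N is in use */
};

static struct demo_port demo_ap = { 0 };

/* Claim the lowest free tag; return it, or -1 if every qc is busy. */
static int demo_qc_alloc(struct demo_port *ap)
{
    for (int tag = 0; tag < DEMO_MAX_TAGS; tag++) {
        uint32_t bit = UINT32_C(1) << tag;

        if (!(ap->qactive & bit)) {
            ap->qactive |= bit;
            return tag;
        }
    }
    return -1;
}

/* Release a tag by clearing its bit so the same qc can be reused. */
static void demo_qc_free(struct demo_port *ap, int tag)
{
    assert(tag >= 0 && tag < DEMO_MAX_TAGS);
    ap->qactive &= ~(UINT32_C(1) << tag);
}
```

Because a qc is identified by its tag, freeing tag 0 and allocating again hands
back the same slot, which is the "repetitively used" behaviour described above.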

How commands are issued
-----------------------

Internal commands
    Once allocated, the qc's taskfile is initialized for the command to be
    executed. qc currently has two mechanisms to notify completion. One
    is via ``qc->complete_fn()`` callback and the other is completion
    ``qc->waiting``. ``qc->complete_fn()`` callback is the asynchronous path
    used by normal SCSI translated commands and ``qc->waiting`` is the
    synchronous (issuer sleeps in process context) path used by internal
    commands.

    Once initialization is complete, host_set lock is acquired and the
    qc is issued.

SCSI commands
    All libata drivers use :c:func:`ata_scsi_queuecmd` as
    ``hostt->queuecommand`` callback. scmds can either be simulated or
    translated. No qc is involved in processing a simulated scmd. The
    result is computed right away and the scmd is completed.

    ``qc->complete_fn()`` callback is used for completion notification. ATA
    commands use :c:func:`ata_scsi_qc_complete` while ATAPI commands use
    :c:func:`atapi_qc_complete`. Both functions end up calling ``qc->scsidone``
    to notify upper layer when the qc is finished. After translation is
    completed, the qc is issued with :c:func:`ata_qc_issue`.

    Note that SCSI midlayer invokes hostt->queuecommand while holding
    host_set lock, so all above occur while holding host_set lock.

How commands are processed
--------------------------

Depending on which protocol and which controller are used, commands are
processed differently. For the purpose of discussion, a controller which
uses taskfile interface and all standard callbacks is assumed.

Currently 6 ATA command protocols are used. They can be sorted into the
following four categories according to how they are processed.

ATA NO DATA or DMA
    ATA_PROT_NODATA and ATA_PROT_DMA fall into this category.
    These
    types of commands don't require any software intervention once
    issued. Device will raise interrupt on completion.

ATA PIO
    ATA_PROT_PIO is in this category. libata currently implements PIO
    with polling. ATA_NIEN bit is set to turn off interrupt and
    pio_task on ata_wq performs polling and IO.

ATAPI NODATA or DMA
    ATA_PROT_ATAPI_NODATA and ATA_PROT_ATAPI_DMA are in this
    category. packet_task is used to poll BSY bit after issuing PACKET
    command. Once BSY is turned off by the device, packet_task
    transfers CDB and hands off processing to interrupt handler.

ATAPI PIO
    ATA_PROT_ATAPI is in this category. ATA_NIEN bit is set and, as
    in ATAPI NODATA or DMA, packet_task submits cdb. However, after
    submitting cdb, further processing (data transfer) is handed off to
    pio_task.

How commands are completed
--------------------------

Once issued, all qc's are either completed with :c:func:`ata_qc_complete` or
time out. For commands which are handled by interrupts,
:c:func:`ata_host_intr` invokes :c:func:`ata_qc_complete`, and, for PIO tasks,
pio_task invokes :c:func:`ata_qc_complete`. In error cases, packet_task may
also complete commands.

:c:func:`ata_qc_complete` does the following.

1. DMA memory is unmapped.

2. ATA_QCFLAG_ACTIVE is cleared from qc->flags.

3. :c:expr:`qc->complete_fn` callback is invoked. If the return value of the
   callback is not zero, completion is short-circuited and
   :c:func:`ata_qc_complete` returns.

4. :c:func:`__ata_qc_complete` is called, which does

   1. ``qc->flags`` is cleared to zero.

   2. ``ap->active_tag`` and ``qc->tag`` are poisoned.

   3. ``qc->waiting`` is cleared & completed (in that order).

   4. qc is deallocated by clearing appropriate bit in ``ap->qactive``.

So, it basically notifies upper layer and deallocates qc.
One exception
is the short-circuit path in #3, which is used by :c:func:`atapi_qc_complete`.

For all non-ATAPI commands, whether they fail or not, almost the same
code path is taken and very little error handling takes place. A qc is
completed with success status if it succeeded, with failed status
otherwise.

However, failed ATAPI commands require more handling as REQUEST SENSE is
needed to acquire sense data. If an ATAPI command fails,
:c:func:`ata_qc_complete` is invoked with error status, which in turn invokes
:c:func:`atapi_qc_complete` via ``qc->complete_fn()`` callback.

This makes :c:func:`atapi_qc_complete` set ``scmd->result`` to
SAM_STAT_CHECK_CONDITION, complete the scmd and return 1. As the
sense data is empty but ``scmd->result`` is CHECK CONDITION, SCSI midlayer
will invoke EH for the scmd, and returning 1 makes :c:func:`ata_qc_complete`
return without deallocating the qc. This leads us to
:c:func:`ata_scsi_error` with a partially completed qc.

:c:func:`ata_scsi_error`
------------------------

:c:func:`ata_scsi_error` is the current ``transportt->eh_strategy_handler()``
for libata. As discussed above, this will be entered in two cases -
timeout and ATAPI error completion. This function will check if a qc is active
and has not failed yet. Such a qc will be marked with AC_ERR_TIMEOUT such that
EH will know to handle it later. Then it calls low level libata driver's
:c:func:`error_handler` callback.

When the :c:func:`error_handler` callback is invoked it stops BMDMA and
completes the qc. Note that as we're currently in EH, we cannot call
scsi_done. As described in SCSI EH doc, a recovered scmd should be
either retried with :c:func:`scsi_queue_insert` or finished with
:c:func:`scsi_finish_command`. Here, we override ``qc->scsidone`` with
:c:func:`scsi_finish_command` and call :c:func:`ata_qc_complete`.
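
As background for the partially completed qc that :c:func:`ata_scsi_error`
receives, the completion steps listed under "How commands are completed",
including the short-circuit on a non-zero ``complete_fn`` return, can be
modelled in a few lines of user-space C. The structure and every ``demo_*``
name are hypothetical stand-ins; the real :c:func:`ata_qc_complete` does
considerably more.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_QCFLAG_ACTIVE 0x1u   /* stand-in for ATA_QCFLAG_ACTIVE */

struct demo_qc {
    unsigned int flags;
    int dma_mapped;               /* 1 while buffers are DMA-mapped */
    int allocated;                /* 1 while the qc belongs to a command */
    int (*complete_fn)(struct demo_qc *qc);
};

/* Stand-in for __ata_qc_complete(): clear flags and release the qc. */
static void demo_qc_release(struct demo_qc *qc)
{
    qc->flags = 0;
    qc->allocated = 0;
}

static void demo_qc_complete(struct demo_qc *qc)
{
    assert(qc != NULL && qc->complete_fn != NULL);

    qc->dma_mapped = 0;                   /* 1. unmap DMA memory */
    qc->flags &= ~DEMO_QCFLAG_ACTIVE;     /* 2. clear the active flag */
    if (qc->complete_fn(qc) != 0)         /* 3. notify the upper layer */
        return;                           /*    short circuit: keep the qc */
    demo_qc_release(qc);                  /* 4. deallocate */
}

/* Normal completion: let the qc be released. */
static int demo_complete_ok(struct demo_qc *qc)
{
    (void)qc;
    return 0;
}

/* Failed ATAPI command: return 1 so the qc survives for EH. */
static int demo_complete_atapi_failed(struct demo_qc *qc)
{
    (void)qc;
    return 1;
}

/* Run one completion and report whether the qc is still allocated. */
static int demo_allocated_after(int (*fn)(struct demo_qc *))
{
    struct demo_qc qc = { DEMO_QCFLAG_ACTIVE, 1, 1, fn };

    demo_qc_complete(&qc);
    return qc.allocated;
}
```

In the failed-ATAPI case the qc is no longer active or DMA-mapped but is still
allocated, which is the state EH later finishes off by releasing it explicitly.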

If EH is invoked due to a failed ATAPI qc, the qc here is completed but
not deallocated. The purpose of this half-completion is to use the qc as
place holder to make EH code reach this place. This is a bit hackish,
but it works.

Once control reaches here, the qc is deallocated by invoking
:c:func:`__ata_qc_complete` explicitly. Then, internal qc for REQUEST SENSE
is issued. Once sense data is acquired, scmd is finished by directly
invoking :c:func:`scsi_finish_command` on the scmd. Note that as we already
have completed and deallocated the qc which was associated with the
scmd, we don't need to/cannot call :c:func:`ata_qc_complete` again.

Problems with the current EH
----------------------------

- Error representation is too crude. Currently any and all error
  conditions are represented with ATA STATUS and ERROR registers.
  Errors which aren't ATA device errors are treated as ATA device
  errors by setting ATA_ERR bit. Better error descriptor which can
  properly represent ATA and other errors/exceptions is needed.

- When handling timeouts, no action is taken to make device forget
  about the timed out command and ready for new commands.

- EH handling via :c:func:`ata_scsi_error` is not properly protected from
  usual command processing. On EH entrance, the device is not in
  quiescent state. Timed out commands may succeed or fail any time.
  pio_task and atapi_task may still be running.

- Too weak error recovery. Devices / controllers causing HSM mismatch
  errors and other errors quite often require reset to return to known
  state. Also, advanced error handling is necessary to support features
  like NCQ and hotplug.

- ATA errors are directly handled in the interrupt handler and PIO
  errors in pio_task. This is problematic for advanced error handling
  for the following reasons.

  First, advanced error handling often requires context and internal qc
  execution.

  Second, even a simple failure (say, CRC error) needs information
  gathering and could trigger complex error handling (say, resetting &
  reconfiguring). Having multiple code paths to gather information,
  enter EH and trigger actions makes life painful.

  Third, scattered EH code makes implementing low level drivers
  difficult. Low level drivers override libata callbacks. If EH is
  scattered over several places, each affected callback should perform
  its part of error handling. This can be error prone and painful.

libata Library
==============

.. kernel-doc:: drivers/ata/libata-core.c
   :export:

libata Core Internals
=====================

.. kernel-doc:: drivers/ata/libata-core.c
   :internal:

.. kernel-doc:: drivers/ata/libata-eh.c

libata SCSI translation/emulation
=================================

.. kernel-doc:: drivers/ata/libata-scsi.c
   :export:

.. kernel-doc:: drivers/ata/libata-scsi.c
   :internal:

ATA errors and exceptions
=========================

This chapter tries to identify what error/exception conditions exist for
ATA/ATAPI devices and describe how they should be handled in an
implementation-neutral way.

The term 'error' is used to describe conditions where either an explicit
error condition is reported from device or a command has timed out.

The term 'exception' is either used to describe exceptional conditions
which are not errors (say, power or hotplug events), or to describe both
errors and non-error exceptional conditions. Where explicit distinction
between error and exception is necessary, the term 'non-error exception'
is used.

Exception categories
--------------------

Exceptions are described primarily with respect to legacy taskfile + bus
master IDE interface.
If a controller provides another, better mechanism
for error reporting, mapping those into categories described below
shouldn't be difficult.

In the following sections, two recovery actions - reset and
reconfiguring transport - are mentioned. These are described further in
`EH recovery actions <#exrec>`__.

HSM violation
~~~~~~~~~~~~~

This error is indicated when STATUS value doesn't match HSM requirement
while issuing or executing any ATA/ATAPI command.

- ATA_STATUS doesn't contain !BSY && DRDY && !DRQ while trying to
  issue a command.

- !BSY && !DRQ during PIO data transfer.

- DRQ on command completion.

- !BSY && ERR after CDB transfer starts but before the last byte of CDB
  is transferred. ATA/ATAPI standard states that "The device shall not
  terminate the PACKET command with an error before the last byte of
  the command packet has been written" in the error outputs description
  of PACKET command and the state diagram doesn't include such
  transitions.

In these cases, HSM is violated and not much information regarding the
error can be acquired from STATUS or ERROR register. IOW, this error can
be anything - driver bug, faulty device, controller and/or cable.

As HSM is violated, reset is necessary to restore known state.
Reconfiguring transport for lower speed might be helpful too as
transmission errors sometimes cause this kind of error.

ATA/ATAPI device error (non-NCQ / non-CHECK CONDITION)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

These are errors detected and reported by ATA/ATAPI devices indicating
device problems. For this type of error, STATUS and ERROR register
values are valid and describe the error condition. Note that some ATA bus
errors are detected by ATA/ATAPI devices and reported using the same
mechanism as device errors. Those cases are described later in this
section.

For ATA commands, errors of this type are indicated by !BSY && ERR
during command execution and on completion.

For ATAPI commands,

- !BSY && ERR && ABRT right after issuing PACKET indicates that PACKET
  command is not supported and falls in this category.

- !BSY && ERR(==CHK) && !ABRT after the last byte of CDB is transferred
  indicates CHECK CONDITION and doesn't fall in this category.

- !BSY && ERR(==CHK) && ABRT after the last byte of CDB is transferred
  \*probably\* indicates CHECK CONDITION and doesn't fall in this
  category.

Of errors detected as above, the following are not ATA/ATAPI device
errors but ATA bus errors and should be handled according to
`ATA bus error <#excatATAbusErr>`__.

CRC error during data transfer
    This is indicated by ICRC bit in the ERROR register and means that
    corruption occurred during data transfer. Up to ATA/ATAPI-7, the
    standard specifies that this bit is only applicable to UDMA
    transfers but ATA/ATAPI-8 draft revision 1f says that the bit may be
    applicable to multiword DMA and PIO.

ABRT error during data transfer or on completion
    Up to ATA/ATAPI-7, the standard specifies that ABRT could be set on
    ICRC errors and on cases where a device is not able to complete a
    command. Combined with the fact that MWDMA and PIO transfer errors
    aren't allowed to use ICRC bit up to ATA/ATAPI-7, it seems to imply
    that ABRT bit alone could indicate transfer errors.

    However, ATA/ATAPI-8 draft revision 1f removes the part that ICRC
    errors can turn on ABRT. So, this is kind of gray area. Some
    heuristics are needed here.

ATA/ATAPI device errors can be further categorized as follows.

Media errors
    This is indicated by UNC bit in the ERROR register.
    ATA devices
    report UNC error only after a certain number of retries cannot
    recover the data, so there's nothing much else to do other than
    notifying upper layer.

    READ and WRITE commands report CHS or LBA of the first failed sector
    but ATA/ATAPI standard specifies that the amount of transferred data
    on error completion is indeterminate, so we cannot assume that
    sectors preceding the failed sector have been transferred and thus
    cannot complete those sectors successfully as SCSI does.

Media changed / media change requested error
    <<TODO: fill here>>

Address error
    This is indicated by IDNF bit in the ERROR register. Report to upper
    layer.

Other errors
    This can be invalid command or parameter indicated by ABRT ERROR bit
    or some other error condition. Note that ABRT bit can indicate a lot
    of things including ICRC and Address errors. Heuristics needed.

Depending on commands, not all STATUS/ERROR bits are applicable. These
non-applicable bits are marked with "na" in the output descriptions but
up to ATA/ATAPI-7 no definition of "na" can be found. However,
ATA/ATAPI-8 draft revision 1f describes "N/A" as follows.

    3.2.3.3a N/A
        A keyword the indicates a field has no defined value in this
        standard and should not be checked by the host or device. N/A
        fields should be cleared to zero.

So, it seems reasonable to assume that "na" bits are cleared to zero by
devices and thus need no explicit masking.

ATAPI device CHECK CONDITION
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

ATAPI device CHECK CONDITION error is indicated by set CHK bit (ERR bit)
in the STATUS register after the last byte of CDB is transferred for a
PACKET command. For this kind of error, sense data should be acquired
to gather information regarding the error. REQUEST SENSE packet command
should be used to acquire sense data.
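
The ATAPI rules from the last three sections (device error right after PACKET,
HSM violation while the CDB is being written, CHECK CONDITION once the CDB is
complete) can be condensed into a small classifier. This is a user-space sketch
under stated assumptions: the ``DEMO_*`` bit values follow the ATA STATUS and
ERROR register layout, while the phase enum, the result enum and
``demo_classify_atapi`` are hypothetical names introduced only for
illustration. The case of ERR without ABRT right after PACKET is not covered by
the text above and is deliberately left unclassified.

```c
#include <assert.h>
#include <stdint.h>

/* STATUS and ERROR register bits, per the ATA register layout. */
#define DEMO_ST_BSY  0x80u
#define DEMO_ST_ERR  0x01u        /* named CHK for PACKET commands */
#define DEMO_ER_ABRT 0x04u

/* Hypothetical progress markers for a PACKET command. */
enum demo_phase {
    DEMO_PACKET_ISSUED,           /* PACKET written, CDB not yet started */
    DEMO_CDB_IN_FLIGHT,           /* CDB transfer started, not finished */
    DEMO_CDB_DONE                 /* last byte of the CDB transferred */
};

enum demo_class {
    DEMO_BUSY,                    /* BSY set: nothing to classify yet */
    DEMO_NO_ERROR,
    DEMO_DEVICE_ERROR,            /* PACKET command not supported */
    DEMO_HSM_VIOLATION,
    DEMO_CHECK_CONDITION,         /* follow up with REQUEST SENSE */
    DEMO_UNCLASSIFIED
};

static enum demo_class demo_classify_atapi(uint8_t status, uint8_t error,
                                           enum demo_phase phase)
{
    if (status & DEMO_ST_BSY)
        return DEMO_BUSY;
    if (!(status & DEMO_ST_ERR))
        return DEMO_NO_ERROR;

    switch (phase) {
    case DEMO_PACKET_ISSUED:
        /* !BSY && ERR && ABRT: device does not support PACKET. */
        return (error & DEMO_ER_ABRT) ? DEMO_DEVICE_ERROR
                                      : DEMO_UNCLASSIFIED;
    case DEMO_CDB_IN_FLIGHT:
        /* An error before the CDB is complete violates the HSM. */
        return DEMO_HSM_VIOLATION;
    case DEMO_CDB_DONE:
        /* With or without ABRT this is (probably) CHECK CONDITION. */
        return DEMO_CHECK_CONDITION;
    }
    return DEMO_UNCLASSIFIED;
}
```

Keeping the decision in one function mirrors the point made earlier about
avoiding multiple code paths for information gathering; a real implementation
would also have to fold in the bus-error heuristics for ICRC and ABRT.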

Once sense data is acquired, errors of this type can be handled
similarly to other SCSI errors. Note that sense data may indicate ATA
bus error (e.g. Sense Key 04h HARDWARE ERROR && ASC/ASCQ 47h/00h SCSI
PARITY ERROR). In such cases, the error should be considered as an ATA
bus error and handled according to `ATA bus error <#excatATAbusErr>`__.

ATA device error (NCQ)
~~~~~~~~~~~~~~~~~~~~~~

NCQ command error is indicated by cleared BSY and set ERR bit during NCQ
command phase (one or more NCQ commands outstanding). Although STATUS
and ERROR registers will contain valid values describing the error, READ
LOG EXT is required to clear the error condition, determine which
command has failed and acquire more information.

READ LOG EXT Log Page 10h reports which tag has failed and taskfile
register values describing the error. With this information the failed
command can be handled as a normal ATA command error as in
`ATA/ATAPI device error (non-NCQ / non-CHECK CONDITION) <#excatDevErr>`__
and all other in-flight commands must be retried. Note that this retry
should not be counted - it's likely that commands retried this way would
have completed normally if it were not for the failed command.

Note that ATA bus errors can be reported as ATA device NCQ errors. This
should be handled as described in `ATA bus error <#excatATAbusErr>`__.

If READ LOG EXT Log Page 10h fails or reports NQ, we're thoroughly
screwed. This condition should be treated according to
`HSM violation <#excatHSMviolation>`__.

ATA bus error
~~~~~~~~~~~~~

ATA bus error means that data corruption occurred during transmission
over ATA bus (SATA or PATA). Errors of this type can be indicated by

- ICRC or ABRT error as described in
  `ATA/ATAPI device error (non-NCQ / non-CHECK CONDITION) <#excatDevErr>`__.

-  Controller-specific error completion with error information
   indicating transmission error.

-  On some controllers, command timeout. In this case, there may be a
   mechanism to determine that the timeout is due to transmission error.

-  Unknown/random errors, timeouts and all sorts of weirdities.

As described above, transmission errors can cause a wide variety of
symptoms ranging from device ICRC errors to random device lockups, and,
in many cases, there is no way to tell whether an error condition is due
to a transmission error or not; therefore, it's necessary to employ some
kind of heuristic when dealing with errors and timeouts. For example,
encountering repeated ABRT errors for a command known to be supported is
likely to indicate an ATA bus error.

Once it's determined that ATA bus errors have possibly occurred,
lowering the ATA bus transmission speed is one of the actions which may
alleviate the problem. See `Reconfigure transport <#exrecReconf>`__ for
more information.

PCI bus error
~~~~~~~~~~~~~

Data corruption or other failures during transmission over PCI (or other
system bus). For standard BMDMA, this is indicated by the Error bit in
the BMDMA Status register. This type of error must be logged as it
indicates something is very wrong with the system. Resetting the host
controller is recommended.

Late completion
~~~~~~~~~~~~~~~

This occurs when a timeout occurs and the timeout handler finds out that
the timed out command has completed successfully or with error. This is
usually caused by lost interrupts. This type of error must be logged.
Resetting the host controller is recommended.

Unknown error (timeout)
~~~~~~~~~~~~~~~~~~~~~~~

This is when a timeout occurs and the command is still processing or the
host and device are in an unknown state. When this occurs, HSM could be in
any valid or invalid state.
To bring the device to a known state and make
it forget about the timed out command, resetting is necessary. The timed
out command may be retried.

Timeouts can also be caused by transmission errors. Refer to
`ATA bus error <#excatATAbusErr>`__ for more details.

Hotplug and power management exceptions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

<<TODO: fill here>>

EH recovery actions
-------------------

This section discusses several important recovery actions.

Clearing error condition
~~~~~~~~~~~~~~~~~~~~~~~~

Many controllers require their error registers to be cleared by the
error handler. Different controllers may have different requirements.

For SATA, it's strongly recommended to clear at least the SError register
during error handling.

Reset
~~~~~

During EH, resetting is necessary in the following cases.

-  HSM is in unknown or invalid state

-  HBA is in unknown or invalid state

-  EH needs to make HBA/device forget about in-flight commands

-  HBA/device behaves weirdly

Resetting during EH might be a good idea regardless of the error
condition to improve EH robustness. Whether to reset the HBA, the device
or both depends on the situation, but the following scheme is
recommended.

-  When it's known that the HBA is in ready state but the ATA/ATAPI
   device is in unknown state, reset only the device.

-  If the HBA is in unknown state, reset both the HBA and the device.

HBA resetting is implementation specific. For a controller complying with
taskfile/BMDMA PCI IDE, stopping the active DMA transaction may be
sufficient if and only if BMDMA state is the only HBA context. But even
controllers that mostly comply with taskfile/BMDMA PCI IDE may have
implementation-specific requirements and mechanisms to reset themselves.
This must be addressed by specific drivers.

OTOH, the ATA/ATAPI standard describes in detail ways to reset ATA/ATAPI
devices.

PATA hardware reset
    This is a hardware-initiated device reset signalled by asserting the
    PATA RESET- signal. There is no standard way to initiate a hardware
    reset from software, although some hardware provides registers that
    allow the driver to directly tweak the RESET- signal.

Software reset
    This is achieved by turning the CONTROL SRST bit on for at least 5us.
    Both PATA and SATA support it but, in the case of SATA, this may
    require controller-specific support as the second Register FIS to
    clear SRST should be transmitted while the BSY bit is still set. Note
    that on PATA, this resets both master and slave devices on a channel.

EXECUTE DEVICE DIAGNOSTIC command
    Although the ATA/ATAPI standard doesn't describe it exactly, EDD
    implies some level of resetting, possibly similar to that of a
    software reset. The host-side EDD protocol can be handled with normal
    command processing and most SATA controllers should be able to handle
    EDDs just like other commands. As with software reset, EDD affects
    both devices on a PATA bus.

    Although EDD does reset devices, it doesn't suit error handling as
    EDD cannot be issued while BSY is set and it's unclear how it will
    act when the device is in an unknown/weird state.

ATAPI DEVICE RESET command
    This is very similar to software reset except that the reset can be
    restricted to the selected device without affecting the other device
    sharing the cable.

SATA phy reset
    This is the preferred way of resetting a SATA device. In effect,
    it's identical to PATA hardware reset. Note that this can be done
    with the standard SCR Control register. As such, it's usually easier
    to implement than software reset.

One more thing to consider when resetting devices is that resetting
clears certain configuration parameters and they need to be set to their
previous or newly adjusted values after reset.

The affected parameters are:

-  CHS set up with INITIALIZE DEVICE PARAMETERS (seldom used)

-  Parameters set with SET FEATURES including transfer mode setting

-  Block count set with SET MULTIPLE MODE

-  Other parameters (SET MAX, MEDIA LOCK...)

The ATA/ATAPI standard specifies that some parameters must be maintained
across hardware or software reset, but doesn't strictly specify all of
them. Always reconfiguring the needed parameters after reset is required
for robustness. Note that this also applies when resuming from deep sleep
(power-off).

Also, the ATA/ATAPI standard requires that IDENTIFY DEVICE / IDENTIFY
PACKET DEVICE be issued after any configuration parameter is updated or
after a hardware reset, and that the result be used for further
operation. The OS driver is required to implement a revalidation
mechanism to support this.

Reconfigure transport
~~~~~~~~~~~~~~~~~~~~~

For both PATA and SATA, a lot of corners are cut for cheap connectors,
cables or controllers and it's quite common to see a high transmission
error rate. This can be mitigated by lowering the transmission speed.

The following is a possible scheme Jeff Garzik suggested.

    If more than $N (3?) transmission errors happen in 15 minutes,

    -  if SATA, decrease SATA PHY speed. if speed cannot be decreased,

    -  decrease UDMA xfer speed. if at UDMA0, switch to PIO4,

    -  decrease PIO xfer speed. if at PIO3, complain, but continue

ata_piix Internals
===================

.. kernel-doc:: drivers/ata/ata_piix.c
   :internal:

sata_sil Internals
===================

.. kernel-doc:: drivers/ata/sata_sil.c
   :internal:

Thanks
======

The bulk of the ATA knowledge comes thanks to long conversations with
Andre Hedrick (www.linux-ide.org), and long hours pondering the ATA and
SCSI specifications.

Thanks to Alan Cox for pointing out similarities between SATA and SCSI,
and in general for motivation to hack on libata.

libata's device detection method, ata_pio_devchk, and in general all
the early probing was based on extensive study of Hale Landis's
probe/reset code in his ATADRVR driver (www.ata-atapi.com).