==========================
Bulk Register Access (BRA)
==========================

Conventions
-----------

Capitalized words used in this documentation are intentional and refer
to concepts of the SoundWire 1.x specification.

Introduction
------------

The SoundWire 1.x specification provides a mechanism to speed up
command/control transfers by reclaiming parts of the audio
bandwidth. The Bulk Register Access (BRA) protocol is a standard
solution based on the Bulk Payload Transport (BPT) definitions.

The regular control channel uses Column 0 and can only send/retrieve
one byte per frame with write/read commands. With a typical 48 kHz
frame rate, only 48 kB/s can be transferred.

The optional Bulk Register Access capability can transmit up to 12
Mbits/s and reduce transfer times by more than an order of magnitude,
but has multiple design constraints:

  (1) Each frame can only support a read or a write transfer, with a
      10-byte overhead per frame (header and footer response).

  (2) The reads/writes SHALL be from/to contiguous register addresses
      in the same frame. A fragmented register space decreases the
      efficiency of the protocol by requiring multiple BRA transfers
      scheduled in different frames.

  (3) The targeted Peripheral device SHALL support the optional Data
      Port 0, and likewise the Manager SHALL expose audio-like Ports
      to insert BRA packets in the audio payload using the concepts of
      Sample Interval, HSTART, HSTOP, etc.

  (4) The BRA transport efficiency depends on the available
      bandwidth. If there are no ongoing audio transfers, the entire
      frame minus Column 0 can be reclaimed for BRA. The frame shape
      also impacts efficiency: since Column 0 cannot be used for
      BPT/BRA, the frame should rely on a large number of columns and
      minimize the number of rows. The bus clock should be as high as
      possible.

  (5) The number of bits transferred per frame SHALL be a multiple of
      8 bits. Padding bits SHALL be inserted if necessary at the end
      of the data.

  (6) The regular read/write commands can be issued in parallel with
      BRA transfers. This is convenient, e.g. to handle alerts or jack
      detection, or to change the volume during firmware download.
      Accessing the same address with the two independent protocols
      must however be avoided, since the behavior is undefined.

  (7) Some implementations may not be capable of handling the
      bandwidth of the BRA protocol, e.g. in the case of a slow I2C
      bus behind the SoundWire IP. In this case, the transfers may
      need to be spaced in time or flow-controlled.

  (8) Each BRA packet SHALL be marked as 'Active' when valid data is
      to be transmitted. This allows software to allocate a BRA
      stream but not transmit/discard data while it processes the
      results or prepares the next batch of data, or while the
      Peripheral deals with the previous transfer. In addition, a BRA
      transfer can be started early, before the data is ready.

  (9) Up to 470 bytes may be transmitted per frame.

  (10) The address is represented with 32 bits and does not rely on
       the paging registers used for the regular command/control
       protocol in Column 0.


Error checking
--------------

Firmware download is one of the key use cases of the Bulk Register
Access protocol. To make sure the binary data integrity is not
compromised by transmission or programming errors, each BRA packet
provides:

  (1) A CRC on the 7-byte header. This CRC helps the Peripheral Device
      check if it is addressed and set the start address and number of
      bytes. The Peripheral Device provides a response in Byte 7.

  (2) A CRC on the data block (header excluded). This CRC is
      transmitted as the last-but-one byte in the packet, prior to the
      footer response.
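
The byte positions described above can be illustrated with a small
user-space sketch. It assumes that the header and its CRC occupy
Bytes 0 to 6, that the header response is Byte 7, and that the data
block is directly followed by the data CRC and the footer response,
which matches the 10-byte overhead quoted in the Introduction. The
'bra_layout' structure and the 'bra_compute_layout()' helper are
hypothetical names and are not part of the Linux SoundWire API.

.. code-block:: c

    #include <stddef.h>

    /* From the text: 10 bytes of overhead per frame. */
    #define BRA_OVERHEAD_BYTES          10u
    /* Assumption: Bytes 0..6 hold the header and its CRC. */
    #define BRA_HEADER_RESPONSE_OFFSET  7u
    #define BRA_DATA_OFFSET             8u

    struct bra_layout {
        size_t header_response; /* offset of the header response byte */
        size_t data;            /* offset of the first data byte */
        size_t data_crc;        /* offset of the data CRC (last-but-one) */
        size_t footer_response; /* offset of the footer response (last) */
        size_t total;           /* total packet size in bytes */
    };

    /* Compute the byte offsets of a packet carrying data_len data bytes. */
    static struct bra_layout bra_compute_layout(size_t data_len)
    {
        struct bra_layout l;

        l.header_response = BRA_HEADER_RESPONSE_OFFSET;
        l.data = BRA_DATA_OFFSET;
        l.data_crc = l.data + data_len;
        l.footer_response = l.data_crc + 1;
        l.total = data_len + BRA_OVERHEAD_BYTES;
        return l;
    }

With a 16-byte data block, this sketch places the data CRC at offset
24 and the footer response at offset 25, for a 26-byte packet.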

The header response can be one of:

  (a) Ack
  (b) Nak
  (c) Not Ready

The footer response can be one of:

  (1) Ack
  (2) Nak (CRC failure)
  (3) Good (operation completed)
  (4) Bad (operation failed)

Example frame
-------------

The example below is not to scale and makes simplifying assumptions
for clarity. The different chunks in the BRA packets are not required
to start on a new SoundWire Row, and the scale of data may vary.

::

    +---+--------------------------------------------+
    |   |                                            |
    |   |                 BRA HEADER                 |
    |   |                                            |
    |   +--------------------------------------------+
    | C |                 HEADER CRC                 |
    | O +--------------------------------------------+
    | M |              HEADER RESPONSE               |
    | M +--------------------------------------------+
    | A |                                            |
    | N |                                            |
    | D |                    DATA                    |
    |   |                                            |
    |   |                                            |
    |   |                                            |
    |   +--------------------------------------------+
    |   |                  DATA CRC                  |
    |   +--------------------------------------------+
    |   |              FOOTER RESPONSE               |
    +---+--------------------------------------------+


Assuming the frame uses N columns, the configuration shown above can
be programmed by setting the DP0 registers as:

    - HSTART = 1
    - HSTOP = N - 1
    - Sample Interval = N
    - WordLength = N - 1

Addressing restrictions
-----------------------

The Device Number specified in the Header follows the SoundWire
definitions, and broadcast and group addressing are permitted. For
now, the Linux implementation only allows for a single BPT transfer to
a single device at a time. This might be revisited at a later point as
an optimization to send the same firmware to multiple devices, but
this would only be beneficial for single-link solutions.

In the case of multiple Peripheral devices attached to different
Managers, broadcast and group addressing are not supported by the
SoundWire specification. Each device must be handled with separate BRA
streams, possibly in parallel, since the links are fully independent.

Unsupported features
--------------------

The Bulk Register Access specification provides a number of
capabilities that are not supported in known implementations, such as:

  (1) Transfers initiated by a Peripheral Device. The BRA Initiator is
      always the Manager Device.

  (2) Flow-control capabilities and retransmission based on the
      'Not Ready' header response. These require extra buffering in
      the SoundWire IP and are not implemented.

Bi-directional handling
-----------------------

The BRA protocol can handle writes as well as reads, and in each
packet the header and footer responses are provided by the Peripheral
Target device. On the Peripheral device, the BRA protocol is handled
by a single DP0 data port, and at the low level the bus ownership
will change for the header/footer responses as well as for the data
transmitted during a read.

On the host side, most implementations rely on a Port-like concept,
with two FIFOs consuming/generating data transfers in parallel
(Host->Peripheral and Peripheral->Host). The amount of data
consumed/produced by these FIFOs is not symmetrical; as a result,
hardware typically inserts markers to help software and hardware
interpret raw data.

Each packet will typically have:

  (1) a 'Start of Packet' indicator.

  (2) an 'End of Packet' indicator.

  (3) a packet identifier to correlate the data requested and
      transmitted, and the error status for each frame.

Hardware implementations can check errors at the frame level, and
retry a transfer in case of errors. However, as for the flow-control
case, this requires extra buffering and intelligence in the
hardware. The Linux support assumes that the entire transfer is
cancelled if a single error is detected in one of the responses.
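
The cancel-on-first-error policy described above could be sketched as
follows. The enumerations mirror the header and footer responses
listed in this document, but the names, the numeric values and the
'bra_check_responses()' helper are hypothetical: the actual encoding
is defined by the SoundWire specification. Treating both 'Ack' and
'Good' as success, and returning -EIO on failure, are assumptions made
for the purpose of the example.

.. code-block:: c

    #include <errno.h>
    #include <stddef.h>

    /* Hypothetical names and values, for illustration only. */
    enum bra_header_response {
        BRA_HEADER_ACK,
        BRA_HEADER_NAK,
        BRA_HEADER_NOT_READY,
    };

    enum bra_footer_response {
        BRA_FOOTER_ACK,
        BRA_FOOTER_NAK,   /* CRC failure */
        BRA_FOOTER_GOOD,  /* operation completed */
        BRA_FOOTER_BAD,   /* operation failed */
    };

    struct bra_frame_status {
        enum bra_header_response header;
        enum bra_footer_response footer;
    };

    /*
     * Return 0 if every frame succeeded, or -EIO as soon as one response
     * reports a problem. 'Not Ready' counts as an error here because
     * flow control and retransmission are not implemented.
     */
    static int bra_check_responses(const struct bra_frame_status *st,
                                   size_t n)
    {
        for (size_t i = 0; i < n; i++) {
            if (st[i].header != BRA_HEADER_ACK)
                return -EIO;
            if (st[i].footer == BRA_FOOTER_NAK ||
                st[i].footer == BRA_FOOTER_BAD)
                return -EIO;
        }
        return 0;
    }

A caller would run such a check once over the per-frame status of the
whole transfer and report a single return code, without attempting a
per-frame retry.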

Abstraction required
~~~~~~~~~~~~~~~~~~~~

There are no standard registers or mandatory implementation at the
Manager level, so the low-level BPT/BRA details must be hidden in
Manager-specific code. For example, the Cadence IP format is not
known to the codec drivers.

Likewise, codec drivers should not have to know the frame size. The
CRC computation and the handling of responses are done in helpers and
Manager-specific code.

The host BRA driver may also have restrictions on pages allocated for
DMA, or other host-DSP communication protocols. The codec driver
should not be aware of any of these restrictions, since it might be
reused in combination with different implementations of Manager IPs.

Concurrency between BRA and regular read/write
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The existing 'nread/nwrite' API already relies on a notion of start
address and number of bytes, so it would be possible to extend this
API with a 'hint' requesting BPT/BRA be used.

However, BRA transfers could be quite long, and the use of a single
mutex for regular read/write and BRA is a show-stopper. Independent
operation of the command/control and BRA transfers is a fundamental
requirement, e.g. to change the volume level with the existing regmap
interface while downloading firmware. The integration must however
ensure that there is no concurrent access to the same address with
the command/control protocol and the BRA protocol.

In addition, the 'sdw_msg' structure hard-codes support for 16-bit
addresses and paging registers, which are irrelevant for BPT/BRA
support based on native 32-bit addresses. A separate API with
'sdw_bpt_msg' makes more sense.
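
The requirement that the two protocols never touch the same address
at the same time could be checked along the following lines. This is
a minimal user-space sketch: the 'bpt_request' structure only borrows
the idea of a flat 32-bit start address and a byte count, and neither
it nor the two helpers reproduce the kernel 'sdw_msg' or 'sdw_bpt_msg'
definitions.

.. code-block:: c

    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical BRA request: flat 32-bit start address, no paging. */
    struct bpt_request {
        uint32_t addr; /* start address */
        uint32_t len;  /* number of contiguous bytes */
    };

    /* True if [a, a + a_len) and [b, b + b_len) share an address. */
    static bool ranges_overlap(uint32_t a, uint32_t a_len,
                               uint32_t b, uint32_t b_len)
    {
        /* 64-bit ends avoid wrap-around at the top of the address space */
        uint64_t a_end = (uint64_t)a + a_len;
        uint64_t b_end = (uint64_t)b + b_len;

        return a_len && b_len && a < b_end && b < a_end;
    }

    /* A regular read/write may proceed only if it avoids the BRA range. */
    static bool regular_access_allowed(const struct bpt_request *bra,
                                       uint32_t addr, uint32_t count)
    {
        return !ranges_overlap(bra->addr, bra->len, addr, count);
    }

With a BRA transfer covering 0x100 bytes from address 0x2000, a
one-byte regular access at 0x1fff or 0x2100 would be allowed, while
one at 0x2000 or 0x20ff would be rejected.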

One possible strategy to speed up all initialization tasks would be to
start a BRA transfer for firmware download, then deal with all the
"regular" read/writes in parallel with the command channel, and
finally wait for the BRA transfers to complete. This would allow for a
degree of overlap instead of a purely sequential solution. As such,
the BRA API must support async transfers and expose a separate wait
function.


Peripheral/bus interface
------------------------

The bus interface for BPT/BRA is made of two functions:

    - sdw_bpt_send_async(bpt_message)

      This function sends the data using the Manager
      implementation-defined capabilities (typically DMA or IPC
      protocol).

      Queueing is currently not supported; the caller needs to wait
      for completion of the requested transfer.

    - sdw_bpt_wait()

      This function waits for completion of the entire message
      provided by the codec driver in the 'send_async' stage.
      Intermediate status for smaller chunks will not be provided back
      to the codec driver; only a return code will be provided.

Regmap use
~~~~~~~~~~

Existing codec drivers rely on regmap to download firmware to
Peripherals. regmap exposes an async interface similar to the
send/wait API suggested above, so at a high level it would seem
natural to combine BRA and regmap. The regmap layer could check
whether BRA is available, and fall back to the regular read/write
command channel if it is not.

The regmap integration will be handled in a second step.

BRA stream model
----------------

For regular audio transfers, the machine driver exposes a dailink
connecting CPU DAI(s) and Codec DAI(s).

This model is not required for BRA support:

   (1) The SoundWire DAIs are mainly wrappers for SoundWire Data
       Ports, with possibly some analog or audio conversion
       capabilities bolted behind the Data Port. In the context of
       BRA, the DP0 is the destination. DP0 registers are standard and
       can be programmed blindly without knowing what Peripheral is
       connected to each link. In addition, if there are multiple
       Peripherals on a link and some of them do not support DP0, the
       write commands to program DP0 registers will generate harmless
       COMMAND_IGNORED responses that will be wired-ORed with
       responses from Peripherals which support DP0. In other words,
       the DP0 programming can be done with broadcast commands, and
       the information on the Target device can be added only in the
       BRA Header.

   (2) At the CPU level, the DAI concept is not useful for BRA; the
       machine driver will not create a dailink relying on DP0. The
       only concept that is needed is the notion of port.

   (3) The stream concept relies on a set of master_rt and slave_rt
       concepts. All of these entities represent ports and not DAIs.

   (4) With the assumption that a single BRA stream is used per link,
       that stream can connect master ports as well as all peripheral
       DP0 ports.

   (5) BRA transfers only make sense in the context of one
       Manager/Link, so the BRA stream handling does not rely on the
       concept of multi-link aggregation allowed by regular DAI links.

Audio DMA support
-----------------

Some DMAs, such as HDaudio, require an audio format field to be
set. This format is in turn used to define acceptable bursts. BPT/BRA
support is not fully compatible with these definitions in that the
format and bandwidth may vary between read and write commands.

In addition, on Intel HDaudio platforms the DMAs need to be
programmed with a PCM format matching the bandwidth of the BPT/BRA
transfer. The format is based on 192 kHz 32-bit samples, and the
number of channels varies to adjust the bandwidth. The notion of
channel is purely notional since the data is not typical audio
PCM. Programming such channels helps reserve enough bandwidth and
adjust FIFO sizes to avoid xruns.

Alignment requirements are currently not enforced at the core level
but at the platform level, e.g. for Intel the data sizes must be
multiples of 32 bytes.
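
As a worked example of the figures above, the sketch below derives a
notional channel count from a required bandwidth and checks the
32-byte size rule. Rounding the bandwidth up to a whole number of
192 kHz 32-bit channels is an assumption made for illustration; it is
not taken from the Intel driver, and the helper names are
hypothetical.

.. code-block:: c

    #include <stdbool.h>
    #include <stdint.h>

    #define PCM_RATE_HZ      192000u /* from the text: 192 kHz */
    #define PCM_SAMPLE_BITS  32u     /* from the text: 32-bit samples */
    #define INTEL_ALIGN      32u     /* from the text: multiples of 32 bytes */

    /* Bandwidth carried by one notional channel, in bits per second. */
    #define CHANNEL_BPS ((uint64_t)PCM_RATE_HZ * PCM_SAMPLE_BITS)

    /* Smallest channel count whose bandwidth covers bits_per_sec. */
    static unsigned int bpt_num_channels(uint64_t bits_per_sec)
    {
        return (unsigned int)((bits_per_sec + CHANNEL_BPS - 1) / CHANNEL_BPS);
    }

    /* True if the transfer size satisfies the 32-byte rule. */
    static bool bpt_size_is_aligned(uint32_t bytes)
    {
        return bytes % INTEL_ALIGN == 0;
    }

One notional channel carries 192000 * 32 = 6144000 bits per second,
so under this assumption a 12 Mbits/s transfer would be described
with two channels.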