Block driver correctness testing with ``blkverify``
===================================================

Introduction
------------

This document describes how to use the ``blkverify`` protocol to test that a block
driver is operating correctly.

It is difficult to test and debug block drivers against real guests. Often
processes inside the guest will crash because corrupt sectors were read as part
of the executable. Other times obscure errors are raised by a program inside
the guest. These issues are extremely hard to trace back to bugs in the block
driver.

``blkverify`` solves this problem by catching data corruption inside QEMU the first
time bad data is read and reporting the disk sector that is corrupted.

How it works
------------

The ``blkverify`` protocol has two child block devices, the "test" device and the
"raw" device. Read/write operations are mirrored to both devices so their
state should always be in sync.

The "raw" device is a raw image, a flat file, that has identical starting
contents to the "test" image. The idea is that the "raw" device will handle
read/write operations correctly and not corrupt data. It can be used as a
reference for comparison against the "test" device.
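
The mirroring scheme can be illustrated with a toy model. The sketch below is
not QEMU's implementation: the names ``VerifyingDevice``, ``MismatchError`` and
``SECTOR_SIZE`` are hypothetical, in-memory byte arrays stand in for the two
child devices, and a 512-byte sector size is assumed. It mirrors every write to
both images and, on a read, compares the two results and reports the first
sector that differs.

```python
SECTOR_SIZE = 512  # assumption: 512-byte sectors


class MismatchError(Exception):
    """Raised when the two images return different data for a read."""


class VerifyingDevice:
    """Toy model (hypothetical): mirror writes to two images, compare reads."""

    def __init__(self, raw: bytearray, test: bytearray):
        if len(raw) != len(test):
            raise ValueError("both images must have the same size")
        self.raw = raw    # reference image, assumed to be correct
        self.test = test  # image handled by the driver under test

    def write(self, offset: int, data: bytes) -> None:
        # Mirror the write so that both images stay in sync.
        self.raw[offset:offset + len(data)] = data
        self.test[offset:offset + len(data)] = data

    def read(self, offset: int, length: int) -> bytes:
        expected = bytes(self.raw[offset:offset + length])
        actual = bytes(self.test[offset:offset + length])
        # Report the first sector in which the two images disagree.
        for i, (e, a) in enumerate(zip(expected, actual)):
            if e != a:
                sector = (offset + i) // SECTOR_SIZE
                raise MismatchError(f"contents mismatch in sector {sector}")
        return expected


# Usage: the second sector of the test image is deliberately corrupted.
raw = bytearray(b"\xcd" * 1024)
test = bytearray(b"\xcd" * 512 + b"\x00" * 512)
dev = VerifyingDevice(raw, test)
dev.read(0, 512)  # first sector matches, data is returned
try:
    dev.read(0, 1024)
except MismatchError as err:
    print(err)  # contents mismatch in sector 1
```

Failing on the first differing read, rather than at the end of a test run, is
what lets the corrupted sector be tied to the operation that exposed it.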

After a mirrored read operation completes, ``blkverify`` will compare the data and
raise an error if it is not identical. This makes it possible to catch the
first instance where corrupt data is read.

Example
-------

Imagine raw.img has 0xcd repeated throughout its first sector::

    $ ./qemu-io -c 'read -v 0 512' raw.img
    00000000: cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd ................
    00000010: cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd ................
    [...]
    000001e0: cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd ................
    000001f0: cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd ................
    read 512/512 bytes at offset 0
    512.000000 bytes, 1 ops; 0.0000 sec (97.656 MiB/sec and 200000.0000 ops/sec)

And test.img is corrupt: its first sector is zeroed when it shouldn't be::

    $ ./qemu-io -c 'read -v 0 512' test.img
    00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    [...]
    000001e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    000001f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    read 512/512 bytes at offset 0
    512.000000 bytes, 1 ops; 0.0000 sec (81.380 MiB/sec and 166666.6667 ops/sec)

This error is caught by ``blkverify``::

    $ ./qemu-io -c 'read 0 512' blkverify:raw.img:test.img
    blkverify: read sector_num=0 nb_sectors=4 contents mismatch in sector 0

A more realistic scenario is verifying the installation of a guest OS::

    $ ./qemu-img create raw.img 16G
    $ ./qemu-img create -f qcow2 test.qcow2 16G
    $ ./qemu-system-x86_64 -cdrom debian.iso \
          -drive file=blkverify:raw.img:test.qcow2

If the installation is aborted when ``blkverify`` detects corruption, use ``qemu-io``
to explore the contents of the disk image at the sector in question.
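
As a rough aid for that last step, the sketch below converts a reported sector
number into the byte offset expected by the ``read -v`` command of ``qemu-io``
used earlier in this document. The helper name ``qemu_io_read_command`` is
hypothetical, and the 512-byte sector size is an assumption based on the
512-byte reads shown above; depending on the image, ``qemu-io`` may need
further options that are not shown here.

```python
import shlex
from typing import List

SECTOR_SIZE = 512  # assumption: reported sector numbers refer to 512-byte sectors


def qemu_io_read_command(image: str, sector_num: int, nb_sectors: int = 1) -> List[str]:
    """Build a qemu-io command line that hex-dumps the given sectors (hypothetical helper)."""
    offset = sector_num * SECTOR_SIZE
    length = nb_sectors * SECTOR_SIZE
    return ["./qemu-io", "-c", f"read -v {offset} {length}", image]


# Usage: dump sector 2048 of the test image.
print(shlex.join(qemu_io_read_command("test.qcow2", 2048)))
# prints: ./qemu-io -c 'read -v 1048576 512' test.qcow2
```

Running the same command against raw.img shows what the sector should have
contained.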