2 BTT - Block Translation Table
11 storage as traditional block devices. The block drivers for persistent memory
14 using stored energy in capacitors to complete in-flight block writes, or perhaps
15 in firmware. We don't have this luxury with persistent memory - if a write is in
16 progress, and we experience a power failure, the block will contain a mix of old
19 The Block Translation Table (BTT) provides atomic sector update semantics for
21 being torn can continue to do so. The BTT manifests itself as a stacked block
23 the heart of it, is an indirection table that re-maps all the blocks on the
37 next arena). The following depicts the "On-disk" metadata layout::
40 Backing Store +-------> Arena
41 +---------------+ | +------------------+
42 | | | | Arena info block |
43 | Arena 0 +---+ | 4K |
44 | 512G | +------------------+
46 +---------------+ | |
51 +---------------+ | |
57 +---------------+ +------------------+
62 +------------------+
66 +------------------+
67 | Info block copy |
69 +------------------+
77 --------------
80 block. Each map entry is 32 bits. The two most significant bits are special
81 flags, and the remaining form the internal block number.
86 31 - 30 Error and Zero flags - Used in the following way::
94 1 1 Normal Block – has valid postmap
97 29 - 0 Mappings to internal 'postmap' blocks
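The bit layout above can be illustrated with a small decoder. This is only a sketch: the names `decode_map_entry`, `is_normal` and `BLOCK_MASK` are hypothetical and are not taken from any driver source.

```python
# Illustrative decoder for a 32-bit map entry (all names are hypothetical).
BLOCK_MASK = (1 << 30) - 1  # bits 29..0 hold the postmap block number


def decode_map_entry(entry):
    """Split a 32-bit map entry into (bit 31, bit 30, postmap block number)."""
    if not 0 <= entry < (1 << 32):
        raise ValueError("map entry must fit in 32 bits")
    bit31 = (entry >> 31) & 1
    bit30 = (entry >> 30) & 1
    return bit31, bit30, entry & BLOCK_MASK


def is_normal(entry):
    """Both flag bits set: the entry carries a valid postmap block number."""
    bit31, bit30, _ = decode_map_entry(entry)
    return bit31 == 1 and bit30 == 1
```

For example, `decode_map_entry(0xC0000040)` splits into both flag bits set and block number 64.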
105 ABA Arena Block Address - Block offset/number within an arena
106 Premap ABA The block offset into an arena, which was decided upon by range
108 Postmap ABA The block number in the "Data Blocks" area obtained after
110 nfree The number of free blocks that are maintained at any given time.
111 This is the number of concurrent writes that can happen to the
118 worth of blocks that this arena contributes, this block is at 256G. Thus, the
119 premap ABA is 256G. We now refer to the map, and find out the mapping for block
120 'X' (256G) points to block 'Y', say '64'. Thus the postmap ABA is 64.
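The two-step translation in this example can be sketched as follows. The assumptions are loud ones: the 768G input is a hypothetical value chosen so that the result matches the 256G above, every arena is treated as contributing exactly 512G, and the map is modelled as a plain dictionary. The function names are illustrative only.

```python
GIB = 1 << 30
ARENA_SIZE = 512 * GIB  # arena size taken from the layout diagram


def split_external_offset(offset, arena_size=ARENA_SIZE):
    """Range-check an external offset into (arena index, premap ABA).

    Assumes every arena contributes exactly arena_size; metadata
    overhead is ignored in this sketch.
    """
    if offset < 0:
        raise ValueError("offset must be non-negative")
    return divmod(offset, arena_size)


def lookup_postmap(arena_map, premap_aba):
    """Consult the arena's map to turn a premap ABA into a postmap ABA."""
    return arena_map[premap_aba]


# Hypothetical input: an external offset of 768G lands 256G into arena 1.
arena, premap = split_external_offset(768 * GIB)
postmap = lookup_postmap({256 * GIB: 64}, premap)  # map says 'X' -> 64
```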
124 ---------------
127 i.e. every write goes to a "free" block. A running list of free blocks is
133 old_map The old postmap ABA - after 'this' write completes, this will be a
134 free block.
136 lba->postmap_aba mapping, but we log it here in case we have to
138 seq Sequence number to mark which of the 2 sections of this flog entry is
139 valid/newest. It cycles between 01->10->11->01 (binary) under normal
144 seq' alternate sequence number.
147 Each of the above fields is 32-bit, making one entry 32 bytes. Entries are also
151 b. writes the 'new' section such that the sequence number is written last.
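The sequence number cycle can be sketched as below. `next_seq` follows the 01->10->11->01 cycle stated above. `newest_section` is an inference from that cycle (the newer section is the one whose number is the successor of the other), not something this text spells out, and it assumes both sections already hold a number from the cycle.

```python
def next_seq(seq):
    """Advance the 2-bit sequence number: 01 -> 10 -> 11 -> 01."""
    if seq not in (1, 2, 3):
        raise ValueError("sequence number must be 1, 2 or 3")
    return 1 if seq == 3 else seq + 1


def newest_section(seq_a, seq_b):
    """Return 0 if section A is newer, 1 if section B is newer.

    Assumption: the newer section is the one whose sequence number
    directly follows the other's in the cycle.
    """
    if next_seq(seq_a) == seq_b:
        return 1
    if next_seq(seq_b) == seq_a:
        return 0
    raise ValueError("sequence numbers are not adjacent in the cycle")
```

Because the sequence number is written last, a section whose update was interrupted would still carry its previous number, so a comparison of this kind would keep pointing at the other, complete section.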
155 -----------------------
157 While 'nfree' describes the number of IOs an arena can process
158 concurrently, 'nlanes' is the number of IOs the BTT device as a whole can
163 A lane number is obtained at the start of any IO, and is used for indexing into
164 all the on-disk and in-memory data structures for the duration of the IO. If
165 there are more CPUs than the max number of available lanes, then lanes are
169 d. In-memory data structure: Read Tracking Table (RTT)
170 ------------------------------------------------------
173 writes. We can hit a condition where the writer thread grabs a free block to do
175 the reader consulted a map entry, and started reading the corresponding block. A
178 internal, postmap block that the reader is (still) reading has been inserted
180 grab this free block, and start writing to it, causing the reader to read
185 read is complete. Every writer thread, after grabbing a free block, checks the
186 RTT for its presence. If the postmap free block is in the RTT, it waits till the
190 e. In-memory data structure: map locks
191 --------------------------------------
210 -------------------------------
215 number). The reconstruction rules/steps are simple:
217 - Read map[log_entry.lba].
218 - If log_entry.new matches the map entry, then log_entry.old is free.
219 - If log_entry.new does not match the map entry, then log_entry.new is free.
220 (This case can only be caused by power-fails/unsafe shutdowns)
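The reconstruction rules above can be sketched as a short function. It assumes that `log_entries` already holds the newest valid section of each flog entry, and that map values have had their flag bits stripped; the dictionary keys `lba`, `old` and `new` mirror the `log_entry` fields used in the rules.

```python
def rebuild_free_list(block_map, log_entries):
    """Rebuild the list of free postmap blocks from the map and the flog."""
    free = []
    for entry in log_entries:
        current = block_map[entry["lba"]]
        if entry["new"] == current:
            # The map update completed, so the old block is free.
            free.append(entry["old"])
        else:
            # The map was never updated (power failure or unsafe
            # shutdown), so the new block is free.
            free.append(entry["new"])
    return free
```

With a map of `{5: 64, 9: 10}` and log entries `(lba=5, old=7, new=64)` and `(lba=9, old=10, new=11)`, the first entry frees block 7 and the second frees block 11.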
223 g. Summarizing - Read and Write flows
224 -------------------------------------
228 1. Convert external LBA to arena number + pre-map ABA
230 3. Read map to get the entry for this pre-map ABA
231 4. Enter post-map ABA into RTT[lane]
234 7. Read data from this block
235 8. Remove post-map ABA entry from RTT[lane]
240 1. Convert external LBA to Arena number + pre-map ABA
242 3. Use lane to index into in-memory free list and obtain a new block, next flog
243 index, next sequence number
244 4. Scan the RTT to check if free block is present, and spin/wait if it is.
245 5. Write data to this free block
246 6. Read map to get the existing post-map ABA entry for this pre-map ABA
248 8. Write new post-map ABA into map.
249 9. Write old post-map entry into the free list
250 10. Calculate next sequence number and write into the free list entry
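The write steps listed above can be walked through with a toy, single-arena, in-memory model. This is a sketch under several assumptions, not the driver's implementation: the `Arena` class and its fields are hypothetical, there is one lane per free block, lane locking and the flog write are left out, and a writer that finds its block in the RTT raises an error instead of waiting.

```python
class Arena:
    """Toy single-arena model of the BTT read and write flows."""

    def __init__(self, nblocks, nfree):
        self.data = [b""] * (nblocks + nfree)
        # Premap block i initially maps to postmap block i.
        self.map = list(range(nblocks))
        # The remaining nfree blocks start out free, one per lane.
        self.free = [{"block": nblocks + i, "seq": 1} for i in range(nfree)]
        self.rtt = [None] * nfree  # one RTT slot per lane

    def write(self, lane, premap, payload):
        slot = self.free[lane]              # step 3: free block for lane
        new_block = slot["block"]
        if new_block in self.rtt:           # step 4: a reader still uses it
            raise RuntimeError("block in RTT; a real writer would wait")
        self.data[new_block] = payload      # step 5: write to free block
        old_block = self.map[premap]        # step 6: existing postmap ABA
        self.map[premap] = new_block        # step 8: publish new mapping
        slot["block"] = old_block           # step 9: old block is now free
        slot["seq"] = 1 if slot["seq"] == 3 else slot["seq"] + 1  # step 10
        return old_block

    def read(self, lane, premap):
        block = self.map[premap]            # read the map entry
        self.rtt[lane] = block              # enter postmap ABA into RTT
        try:
            return self.data[block]         # read data from this block
        finally:
            self.rtt[lane] = None           # remove the RTT entry
```

Writing to premap block 2 on lane 0 of `Arena(4, 2)` places the data in postmap block 4, remaps block 2 to it, and returns the old postmap block 2 to lane 0's free list entry.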
261 - Info block checksum does not match (and recovering from the copy also fails)
262 - All internal available blocks are not uniquely and entirely addressed by the
264 - Rebuilding free list from the flog reveals missing/duplicate/impossible
266 - A map entry is out of bounds
269 only state using a flag in the info block.
281 ndctl create-namespace -f -e namespace0.0 -m sector -l 4k
283 See ndctl create-namespace --help for more options.