=======================
initramfs buffer format
=======================

Al Viro, H. Peter Anvin

With kernel 2.5.x, the old "initial ramdisk" protocol was complemented
with an "initial ramfs" protocol.  The initramfs content is passed
using the same memory buffer protocol used by initrd, but the content
is different.  The initramfs buffer contains an archive which is
expanded into a ramfs filesystem; this document details the initramfs
buffer format.

The initramfs buffer format is based around the "newc" or "crc" CPIO
formats, and can be created with the cpio(1) utility.  The cpio
archive can be compressed using gzip(1), or any other algorithm provided
via CONFIG_DECOMPRESS_*.  One valid version of an initramfs buffer is
thus a single .cpio.gz file.

The full format of the initramfs buffer is defined by the following
grammar, where::

	*	is used to indicate "0 or more occurrences of"
	(|)	indicates alternatives
	+	indicates concatenation
	GZIP()	indicates gzip compression of the operand
	BZIP2()	indicates bzip2 compression of the operand
	LZMA()	indicates lzma compression of the operand
	XZ()	indicates xz compression of the operand
	LZO()	indicates lzo compression of the operand
	LZ4()	indicates lz4 compression of the operand
	ZSTD()	indicates zstd compression of the operand
	ALGN(n)	means padding with null bytes to an n-byte boundary

	initramfs := ("\0" | cpio_archive | cpio_compressed_archive)*

	cpio_compressed_archive := (GZIP(cpio_archive) | BZIP2(cpio_archive)
		| LZMA(cpio_archive) | XZ(cpio_archive) | LZO(cpio_archive)
		| LZ4(cpio_archive) | ZSTD(cpio_archive))

	cpio_archive := cpio_file* + (<nothing> | cpio_trailer)

	cpio_file := ALGN(4) + cpio_header + filename + "\0" + ALGN(4) + data

	cpio_trailer := ALGN(4) + cpio_header + "TRAILER!!!\0" + ALGN(4)


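As a rough illustration of the cpio_file and cpio_trailer productions,
the following Python sketch assembles a small uncompressed cpio_archive.
The helper names (align4, header, append_file, append_trailer) are made
up for this example, and every header field other than c_mode, c_nlink,
c_filesize and c_namesize is simply left at zero; real tools such as
cpio(1) fill in more.

```python
import stat

MAGIC_NEWC = b"070701"


def align4(buf: bytearray) -> None:
    """ALGN(4): pad buf in place with NUL bytes to a 4-byte boundary."""
    buf += b"\0" * (-len(buf) % 4)


def header(name: bytes, mode: int, filesize: int) -> bytes:
    """Build a 110-byte newc cpio_header (hypothetical helper)."""
    fields = [
        0,              # c_ino
        mode,           # c_mode
        0, 0,           # c_uid, c_gid
        1,              # c_nlink
        0,              # c_mtime
        filesize,       # c_filesize
        0, 0, 0, 0,     # c_maj, c_min, c_rmaj, c_rmin
        len(name) + 1,  # c_namesize, including the final \0
        0,              # c_chksum (zero for magic 070701)
    ]
    # Each field is 8 hexadecimal ASCII digits, zero-padded on the left.
    return MAGIC_NEWC + b"".join(b"%08x" % f for f in fields)


def append_file(archive: bytearray, name: bytes, data: bytes, mode: int) -> None:
    """cpio_file := ALGN(4) + cpio_header + filename + "\\0" + ALGN(4) + data"""
    align4(archive)
    archive += header(name, mode, len(data))
    archive += name + b"\0"
    align4(archive)
    archive += data


def append_trailer(archive: bytearray) -> None:
    """cpio_trailer := ALGN(4) + cpio_header + "TRAILER!!!\\0" + ALGN(4)"""
    align4(archive)
    archive += header(b"TRAILER!!!", 0, 0)
    archive += b"TRAILER!!!\0"
    align4(archive)


archive = bytearray()
append_file(archive, b"init", b"hi\n", stat.S_IFREG | 0o755)
append_trailer(archive)
```

Note that the leading ALGN(4) of each production pads whatever came
before it, which is why the sketch aligns the buffer before writing
each header rather than after writing the data.
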
In human terms, the initramfs buffer contains a collection of
compressed and/or uncompressed cpio archives (in the "newc" or "crc"
formats); arbitrary amounts of zero bytes (for padding) can be added
between members.

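A minimal Python sketch of such a buffer follows.  It uses a
trailer-only archive as a stand-in for real content, and the helper
concat_members is a name invented for this example.  Padding each
member to a 4-byte boundary is a choice made here; the grammar permits
any number of zero bytes between members.

```python
import gzip

# A trailer-only newc archive, used purely as a placeholder member.
# Thirteen 8-digit hex fields follow the magic; only c_nlink (1) and
# c_namesize (0xb: "TRAILER!!!" plus the final NUL) are nonzero.
_fields = [0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 11, 0]
_hdr = b"070701" + b"".join(b"%08x" % f for f in _fields)
raw_archive = _hdr + b"TRAILER!!!\0"
raw_archive += b"\0" * (-len(raw_archive) % 4)


def concat_members(members, pad_to=4):
    """Join archive members, zero-padding after each one."""
    out = bytearray()
    for member in members:
        out += member
        out += b"\0" * (-len(out) % pad_to)
    return bytes(out)


# One gzip-compressed member followed by one uncompressed member.
initramfs = concat_members([gzip.compress(raw_archive), raw_archive])
```
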
The cpio "TRAILER!!!" entry (cpio end-of-archive) is optional, but is
not ignored; see "handling of hard links" below.

The structure of the cpio_header is as follows (all fields contain
hexadecimal ASCII numbers fully padded with '0' on the left to the
full width of the field, for example, the integer 4780 is represented
by the ASCII string "000012ac"):

============= ================== ==============================================
Field name    Field size         Meaning
============= ================== ==============================================
c_magic       6 bytes            The string "070701" or "070702"
c_ino         8 bytes            File inode number
c_mode        8 bytes            File mode and permissions
c_uid         8 bytes            File uid
c_gid         8 bytes            File gid
c_nlink       8 bytes            Number of links
c_mtime       8 bytes            Modification time
c_filesize    8 bytes            Size of data field
c_maj         8 bytes            Major part of file device number
c_min         8 bytes            Minor part of file device number
c_rmaj        8 bytes            Major part of device node reference
c_rmin        8 bytes            Minor part of device node reference
c_namesize    8 bytes            Length of filename, including final \0
c_chksum      8 bytes            Checksum of data field if c_magic is 070702;
                                 otherwise zero
============= ================== ==============================================

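To make the layout concrete, here is a small Python sketch that decodes
the 110-byte header (6 bytes of magic plus thirteen 8-byte fields) into
named integers.  The function name parse_header and the tuple FIELDS
are illustrative only, and the sketch does no validation beyond the
length and the magic.

```python
# Field names in the order given by the table, after c_magic.
FIELDS = (
    "c_ino", "c_mode", "c_uid", "c_gid", "c_nlink", "c_mtime",
    "c_filesize", "c_maj", "c_min", "c_rmaj", "c_rmin",
    "c_namesize", "c_chksum",
)
HEADER_LEN = 6 + 8 * len(FIELDS)  # 110 bytes


def parse_header(raw: bytes) -> dict:
    """Decode one cpio_header into a dict of field name -> value."""
    if len(raw) < HEADER_LEN:
        raise ValueError("short header")
    magic = raw[:6]
    if magic not in (b"070701", b"070702"):
        raise ValueError("not a newc/crc header")
    values = {"c_magic": magic.decode("ascii")}
    for i, name in enumerate(FIELDS):
        start = 6 + 8 * i
        # Each field is 8 hexadecimal ASCII digits.
        values[name] = int(raw[start:start + 8], 16)
    return values


# Encoding goes the other way: 4780 becomes "000012ac".
encoded = "%08x" % 4780

# A synthetic header whose thirteen fields hold the values 0..12.
sample = b"070701" + b"".join(b"%08x" % v for v in range(13))
parsed = parse_header(sample)
```
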
The c_mode field matches the contents of st_mode returned by stat(2)
on Linux, and encodes the file type and file permissions.

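For illustration, c_mode values can be composed from a file type and
permission bits.  The sketch below uses Python's stat module and
assumes its S_IF* constants carry the Linux values; the variable names
are arbitrary.

```python
import stat

# File type bits OR-ed with permission bits, as in st_mode.
regular_file = stat.S_IFREG | 0o755
directory = stat.S_IFDIR | 0o755
symlink = stat.S_IFLNK | 0o777

# Encoded as 8 hexadecimal ASCII digits, as stored in c_mode.
c_mode_regular = "%08x" % regular_file
c_mode_directory = "%08x" % directory
c_mode_symlink = "%08x" % symlink

# Decoding: recover the type and a human-readable permission string.
is_regular = stat.S_ISREG(int(c_mode_regular, 16))
readable = stat.filemode(int(c_mode_regular, 16))
```
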
c_mtime is ignored unless CONFIG_INITRAMFS_PRESERVE_MTIME=y is set.

The c_filesize should be zero for any file which is not a regular file
or symlink.

The c_chksum field contains a simple 32-bit unsigned sum of all the
bytes in the data field.  cpio(1) refers to this as "crc", which is
clearly incorrect (a cyclic redundancy check is a different and
significantly stronger integrity check); however, this is the
algorithm used.

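The checksum described above can be sketched in a few lines of Python.
The function name newc_checksum is invented for this example; the
comparison with zlib.crc32 is there only to show that the two
algorithms give different results.

```python
import zlib


def newc_checksum(data: bytes) -> int:
    """Sum of all data bytes, truncated to 32 bits."""
    return sum(data) & 0xFFFFFFFF


payload = b"abc"
chksum = newc_checksum(payload)   # 97 + 98 + 99
c_chksum = "%08x" % chksum        # as stored in the header
real_crc = zlib.crc32(payload)    # a true CRC-32, for contrast
```
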
If the filename is "TRAILER!!!" this is actually an end-of-archive
marker; the c_filesize for an end-of-archive marker must be zero.


Handling of hard links
======================

When a nondirectory with c_nlink > 1 is seen, the (c_maj,c_min,c_ino)
tuple is looked up in a tuple buffer.  If not found, it is entered in
the tuple buffer and the entry is created as usual; if found, a hard
link rather than a second copy of the file is created.  It is not
necessary (but permitted) to include a second copy of the file
contents; if the file contents are not included, the c_filesize field
should be set to zero to indicate no data section follows.  If data is
present, the previous instance of the file is overwritten; this allows
the data-carrying instance of a file to occur anywhere in the sequence
(GNU cpio is reported to attach the data to the last instance of a
file only).

c_filesize must not be zero for a symlink.

When a "TRAILER!!!" end-of-archive marker is seen, the tuple buffer is
reset.  This permits archives which are generated independently to be
concatenated.

To combine file data from different sources (without having to
regenerate the (c_maj,c_min,c_ino) fields), either of the following
techniques can be used:

a) Separate the different file data sources with a "TRAILER!!!"
   end-of-archive marker, or

b) Make sure c_nlink == 1 for all nondirectory entries.

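The tuple-buffer rules in this section can be modelled with a short
Python sketch.  This is a simplified model of the behaviour described
here, not the kernel's implementation; Entry and plan_unpack are
hypothetical names, and the sketch only records which action would be
taken for each entry.

```python
from collections import namedtuple

Entry = namedtuple("Entry", "name c_maj c_min c_ino c_nlink is_dir has_data")


def plan_unpack(entries):
    """Return the list of actions implied by the hard link rules."""
    tuple_buffer = {}  # (c_maj, c_min, c_ino) -> first name seen
    actions = []
    for e in entries:
        if e.name == "TRAILER!!!":
            # End-of-archive marker: reset the tuple buffer.
            tuple_buffer.clear()
            continue
        if not e.is_dir and e.c_nlink > 1:
            key = (e.c_maj, e.c_min, e.c_ino)
            if key in tuple_buffer:
                # Seen before: create a hard link instead of a copy.
                actions.append(("link", e.name, tuple_buffer[key]))
                if e.has_data:
                    # Data on a later instance overwrites the file.
                    actions.append(("overwrite", e.name))
                continue
            tuple_buffer[key] = e.name
        actions.append(("create", e.name))
    return actions


# Two archives from different sources that both use inode 1.
entries = [
    Entry("a", 0, 0, 1, 2, False, False),
    Entry("b", 0, 0, 1, 2, False, True),
    Entry("TRAILER!!!", 0, 0, 0, 1, False, False),
    Entry("c", 0, 0, 1, 2, False, True),
]
actions = plan_unpack(entries)
```

Because the trailer resets the buffer, "c" is created as a new file
rather than linked to "a", even though it reuses the same
(c_maj,c_min,c_ino) tuple.
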