===================================================
Using Coresight for Kernel panic and Watchdog reset
===================================================

Introduction
------------
This document describes how to use the Linux coresight trace support to
debug kernel panic and watchdog reset scenarios.

Coresight trace during Kernel panic
-----------------------------------
From the coresight driver point of view, addressing the kernel panic
situation has four main requirements.

a. Support for allocating trace buffer pages from a reserved memory area.
   A platform can advertise this using a new device tree property added to
   the relevant coresight nodes.

b. Support for stopping coresight blocks at the time of panic.

c. Saving the required metadata in the specified format.

d. Support for reading trace data captured at the time of panic.

Allocation of trace buffer pages from reserved RAM
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A new optional device tree property "memory-region" is added to the
Coresight TMC device nodes. It gives the base address and size of the
trace buffer.

Static allocation of trace buffers ensures that both the IOMMU enabled
and disabled cases are handled. Also, platforms that support persistent
RAM allow users to read trace data in the subsequent boot without
booting the crashdump kernel.
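
Whether a running system exposes this property can be checked from user
space. The following is a minimal sketch, assuming the live device tree is
visible under /sys/firmware/devicetree/base and that the TMC nodes carry the
"arm,coresight-tmc" compatible string. The function name
list_tmc_memory_regions is made up for illustration and is not part of any
kernel tooling.

```shell
#!/bin/sh
# Sketch: report which coresight TMC device tree nodes carry a
# "memory-region" property. The default root is an assumption; pass
# another directory as the first argument to inspect a different tree.
list_tmc_memory_regions() {
    dt_root="${1:-/sys/firmware/devicetree/base}"

    # Each device tree property is exposed as a file named after it.
    find -H "$dt_root" -type f -name compatible | sort |
    while read -r compat
    do
        # "compatible" holds NUL-separated strings; match one exactly.
        if tr '\0' '\n' < "$compat" | grep -qx 'arm,coresight-tmc'
        then
            node=$(dirname "$compat")
            if [ -e "$node/memory-region" ]
            then
                echo "$node: memory-region present"
            else
                echo "$node: memory-region absent"
            fi
        fi
    done
}
```

Calling list_tmc_memory_regions without arguments walks the live tree. A
node reported as "absent" would not be able to use the reserved buffer mode
described in this document.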

Note:
For ETR sink devices, this reserved region is used for both trace
capture and trace data retrieval.
For ETF sink devices, the internal SRAM is used for trace capture,
and the captured data is synced to the reserved region for retrieval.


Disabling coresight blocks at the time of panic
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
To avoid losing relevant trace data after a kernel panic, it is
desirable to stop the coresight blocks at the time of panic.

This can be achieved by configuring the comparator, CTI and sink
devices as below::

           Trigger on panic
    Comparator --->External out --->CTI -->External In---->ETR/ETF stop

Saving metadata at the time of kernel panic
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Coresight metadata covers all the additional data that is required for a
successful trace decode, beyond the trace data itself. This includes
the ETR/ETF/ETB register snapshot etc.

A new optional device tree property "memory-region" is added to
the ETR/ETF/ETB device nodes for this.

Reading trace data captured at the time of panic
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Trace data captured at the time of panic can be read from the rebooted
kernel or from the crashdump kernel using a special device file,
/dev/crash_tmc_xxx. This device file is created only when valid
crashdata is available.
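
The retrieval can be scripted. The sketch below is a minimal example,
assuming the crash device files follow the /dev/crash_tmc_xxx naming
described above. The function name dump_crash_traces, the default output
directory and the cstrace_<sink>.bin file naming are hypothetical choices
made for illustration.

```shell
#!/bin/sh
# Sketch: copy every crash_tmc_* device file found in dev_dir to out_dir.
# Returns non-zero when no crash device file exists, which is the case
# when no valid crashdata is available.
dump_crash_traces() {
    dev_dir="${1:-/dev}"
    out_dir="${2:-/trace}"
    found=0

    mkdir -p "$out_dir" || return 1

    for dev in "$dev_dir"/crash_tmc_*
    do
        # An unmatched glob stays literal, so test for existence.
        [ -e "$dev" ] || continue
        found=1
        name=$(basename "$dev")
        # cstrace_<sink>.bin is a hypothetical naming scheme.
        dd if="$dev" of="$out_dir/cstrace_${name#crash_}.bin" 2>/dev/null ||
            return 1
        echo "saved $name"
    done

    [ "$found" -eq 1 ]
}
```

With the defaults, a device file /dev/crash_tmc_etr0 would be saved as
/trace/cstrace_tmc_etr0.bin. The non-zero return value lets a boot script
skip decoding when the previous boot left no crashdata behind.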

General flow of trace capture and decode in case of kernel panic
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1. Enable source and sink on all the cores using the sysfs interface.
   ETR sinks should have trace buffers allocated from reserved memory,
   by selecting the "resrv" buffer mode from sysfs.

2. Run relevant tests.

3. On a kernel panic, all coresight blocks are disabled and the
   necessary metadata is synced by the kernel panic handler.

   The system eventually reboots or boots a crashdump kernel.

4. For platforms that support a crashdump kernel, raw trace data can be
   dumped using the coresight sysfs interface from the crashdump kernel
   itself. Persistent RAM is not a requirement in this case.

5. For platforms that support persistent RAM, trace data can be dumped
   using the coresight sysfs interface in the subsequent Linux boot.
   A crashdump kernel is not a requirement in this case. Persistent RAM
   ensures that trace data is intact across reboot.

Coresight trace during Watchdog reset
-------------------------------------
The main differences between addressing the watchdog reset and kernel
panic cases are listed below.

a. Saving the coresight metadata needs to be taken care of by the
   SCP (system control processor) firmware in the specified format,
   instead of the kernel.

b. The reserved memory region given by firmware for the trace buffer and
   metadata has to be in persistent RAM.
   Note: This is a requirement for the watchdog reset case but optional
   in the kernel panic case.

Watchdog reset can be supported only on platforms that meet the above
two requirements.

Sample commands for testing a Kernel panic case with ETR sink
-------------------------------------------------------------

1. Boot the Linux kernel with "crash_kexec_post_notifiers" added to the
   kernel bootargs. This is mandatory if the user would like to read the
   trace data from the crashdump kernel.

2. Enable the preloaded ETM configuration::

    #echo 1 > /sys/kernel/config/cs-syscfg/configurations/panicstop/enable

3. Configure CTI using sysfs interface::

    #./cti_setup.sh

    #cat cti_setup.sh


    cd /sys/bus/coresight/devices/

    ap_cti_config () {
      #ETM trig out[0] trigger to Channel 0
      echo 0 4 > channels/trigin_attach
    }

    etf_cti_config () {
      #ETF Flush in trigger from Channel 0
      echo 0 1 > channels/trigout_attach
      echo 1 > channels/trig_filter_enable
    }

    etr_cti_config () {
      #ETR Flush in from Channel 0
      echo 0 1 > channels/trigout_attach
      echo 1 > channels/trig_filter_enable
    }

    ctidevs=`find . -name "cti*"`

    for i in $ctidevs
    do
            cd $i

            connection=`find . -name "ete*"`
            if [ ! -z "$connection" ]
            then
                    echo "AP CTI config for $i"
                    ap_cti_config
            fi

            connection=`find . -name "tmc_etf*"`
            if [ ! -z "$connection" ]
            then
                    echo "ETF CTI config for $i"
                    etf_cti_config
            fi

            connection=`find . -name "tmc_etr*"`
            if [ ! -z "$connection" ]
            then
                    echo "ETR CTI config for $i"
                    etr_cti_config
            fi

            cd ..
    done

Note: CTI connections are SoC specific and hence the above script is
added just for reference.

4. Choose reserved buffer mode for ETR buffer::

    #echo "resrv" > /sys/bus/coresight/devices/tmc_etr0/buf_mode_preferred

5. Enable stop on flush trigger configuration::

    #echo 1 > /sys/bus/coresight/devices/tmc_etr0/stop_on_flush

6. Start Coresight tracing on cores 1 and 2 using sysfs interface

7. Run some application on core 1::

    #taskset -c 1 dd if=/dev/urandom of=/dev/null &

8. Invoke kernel panic on core 2::

    #echo 1 > /proc/sys/kernel/panic
    #taskset -c 2 echo c > /proc/sysrq-trigger

9. From rebooted kernel or crashdump kernel, read crashdata::

    #dd if=/dev/crash_tmc_etr0 of=/trace/cstrace.bin

10. Run opencsd decoder tools/scripts to generate the instruction trace.
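
Steps 1 and 4 to 6 above can be wrapped in small helpers. This is only a
sketch: "enable_sink" and "enable_source" are the usual coresight sysfs
attributes, while the function names and the source device names passed in
(for example ete1 and ete2) are assumptions that depend on the platform.

```shell
#!/bin/sh
# Sketch: succeed if the given flag appears on the kernel command line.
# The second argument defaults to /proc/cmdline.
has_bootarg() {
    tr ' ' '\n' < "${2:-/proc/cmdline}" | grep -qx "$1"
}

# Write a value to a sysfs-style attribute, failing if it is missing.
write_attr() {
    if [ ! -e "$1" ]
    then
        echo "missing attribute: $1" >&2
        return 1
    fi
    echo "$2" > "$1"
}

# start_panic_trace <devices_dir> <sink> <source>...
start_panic_trace() {
    devs="$1"
    sink="$2"
    shift 2

    write_attr "$devs/$sink/buf_mode_preferred" resrv || return 1  # step 4
    write_attr "$devs/$sink/stop_on_flush" 1 || return 1           # step 5
    write_attr "$devs/$sink/enable_sink" 1 || return 1             # step 6
    for src in "$@"
    do
        # Source names are platform specific; callers supply them.
        write_attr "$devs/$src/enable_source" 1 || return 1
    done
}
```

A possible invocation, with hypothetical source names, would be
"has_bootarg crash_kexec_post_notifiers" followed by
"start_panic_trace /sys/bus/coresight/devices tmc_etr0 ete1 ete2".
The sink is enabled before the sources so that trace has somewhere to go as
soon as a source starts.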

Sample instruction trace dump
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Core1 dump::

    A                                  etm4_enable_hw: ffff800008ae1dd4
    CONTEXT EL2                        etm4_enable_hw: ffff800008ae1dd4
    I                                  etm4_enable_hw: ffff800008ae1dd4:
    d503201f   nop
    I                                  etm4_enable_hw: ffff800008ae1dd8:
    d503201f   nop
    I                                  etm4_enable_hw: ffff800008ae1ddc:
    d503201f   nop
    I                                  etm4_enable_hw: ffff800008ae1de0:
    d503201f   nop
    I                                  etm4_enable_hw: ffff800008ae1de4:
    d503201f   nop
    I                                  etm4_enable_hw: ffff800008ae1de8:
    d503233f   paciasp
    I                                  etm4_enable_hw: ffff800008ae1dec:
    a9be7bfd   stp     x29, x30, [sp, #-32]!
    I                                  etm4_enable_hw: ffff800008ae1df0:
    910003fd   mov     x29, sp
    I                                  etm4_enable_hw: ffff800008ae1df4:
    a90153f3   stp     x19, x20, [sp, #16]
    I                                  etm4_enable_hw: ffff800008ae1df8:
    2a0003f4   mov     w20, w0
    I                                  etm4_enable_hw: ffff800008ae1dfc:
    900085b3   adrp    x19, ffff800009b95000 <reserved_mem+0xc48>
    I                                  etm4_enable_hw: ffff800008ae1e00:
    910f4273   add     x19, x19, #0x3d0
    I                                  etm4_enable_hw: ffff800008ae1e04:
    f8747a60   ldr     x0, [x19, x20, lsl #3]
    E                                  etm4_enable_hw: ffff800008ae1e08:
    b4000140   cbz     x0, ffff800008ae1e30 <etm4_starting_cpu+0x50>
    I    149.039572921                 etm4_enable_hw: ffff800008ae1e30:
    a94153f3   ldp     x19, x20, [sp, #16]
    I    149.039572921                 etm4_enable_hw: ffff800008ae1e34:
    52800000   mov     w0, #0x0                        // #0
    I    149.039572921                 etm4_enable_hw: ffff800008ae1e38:
    a8c27bfd   ldp     x29, x30, [sp], #32

    ..snip

        149.052324811           chacha_block_generic: ffff800008642d80:
    9100a3e0   add     x0,
    I    149.052324811           chacha_block_generic: ffff800008642d84:
    b86178a2   ldr     w2, [x5, x1, lsl #2]
    I    149.052324811           chacha_block_generic: ffff800008642d88:
    8b010803   add     x3, x0, x1, lsl #2
    I    149.052324811           chacha_block_generic: ffff800008642d8c:
    b85fc063   ldur    w3, [x3, #-4]
    I    149.052324811           chacha_block_generic: ffff800008642d90:
    0b030042   add     w2, w2, w3
    I    149.052324811           chacha_block_generic: ffff800008642d94:
    b8217882   str     w2, [x4, x1, lsl #2]
    I    149.052324811           chacha_block_generic: ffff800008642d98:
    91000421   add     x1, x1, #0x1
    I    149.052324811           chacha_block_generic: ffff800008642d9c:
    f100443f   cmp     x1, #0x11


Core 2 dump::

    A                                  etm4_enable_hw: ffff800008ae1dd4
    CONTEXT EL2                        etm4_enable_hw: ffff800008ae1dd4
    I                                  etm4_enable_hw: ffff800008ae1dd4:
    d503201f   nop
    I                                  etm4_enable_hw: ffff800008ae1dd8:
    d503201f   nop
    I                                  etm4_enable_hw: ffff800008ae1ddc:
    d503201f   nop
    I                                  etm4_enable_hw: ffff800008ae1de0:
    d503201f   nop
    I                                  etm4_enable_hw: ffff800008ae1de4:
    d503201f   nop
    I                                  etm4_enable_hw: ffff800008ae1de8:
    d503233f   paciasp
    I                                  etm4_enable_hw: ffff800008ae1dec:
    a9be7bfd   stp     x29, x30, [sp, #-32]!
    I                                  etm4_enable_hw: ffff800008ae1df0:
    910003fd   mov     x29, sp
    I                                  etm4_enable_hw: ffff800008ae1df4:
    a90153f3   stp     x19, x20, [sp, #16]
    I                                  etm4_enable_hw: ffff800008ae1df8:
    2a0003f4   mov     w20, w0
    I                                  etm4_enable_hw: ffff800008ae1dfc:
    900085b3   adrp    x19, ffff800009b95000 <reserved_mem+0xc48>
    I                                  etm4_enable_hw: ffff800008ae1e00:
    910f4273   add     x19, x19, #0x3d0
    I                                  etm4_enable_hw: ffff800008ae1e04:
    f8747a60   ldr     x0, [x19, x20, lsl #3]
    E                                  etm4_enable_hw: ffff800008ae1e08:
    b4000140   cbz     x0, ffff800008ae1e30 <etm4_starting_cpu+0x50>
    I    149.046243445                 etm4_enable_hw: ffff800008ae1e30:
    a94153f3   ldp     x19, x20, [sp, #16]
    I    149.046243445                 etm4_enable_hw: ffff800008ae1e34:
    52800000   mov     w0, #0x0                        // #0
    I    149.046243445                 etm4_enable_hw: ffff800008ae1e38:
    a8c27bfd   ldp     x29, x30, [sp], #32
    I    149.046243445                 etm4_enable_hw: ffff800008ae1e3c:
    d50323bf   autiasp
    E    149.046243445                 etm4_enable_hw: ffff800008ae1e40:
    d65f03c0   ret
    A                                ete_sysreg_write: ffff800008adfa18

    ..snip

    I     149.05422547                          panic: ffff800008096300:
    a90363f7   stp     x23, x24, [sp, #48]
    I     149.05422547                          panic: ffff800008096304:
    6b00003f   cmp     w1, w0
    I     149.05422547                          panic: ffff800008096308:
    3a411804   ccmn    w0, #0x1, #0x4, ne  // ne = any
    N     149.05422547                          panic: ffff80000809630c:
    540001e0   b.eq    ffff800008096348 <panic+0xe0>  // b.none
    I     149.05422547                          panic: ffff800008096310:
    f90023f9   str     x25, [sp, #64]
    E     149.05422547                          panic: ffff800008096314:
    97fe44ef   bl      ffff8000080276d0 <panic_smp_self_stop>
    A                                           panic: ffff80000809634c
    I     149.05422547                          panic: ffff80000809634c:
    910102d5   add     x21, x22, #0x40
    I     149.05422547                          panic: ffff800008096350:
    52800020   mov     w0, #0x1                        // #1
    E     149.05422547                          panic: ffff800008096354:
    94166b8b   bl      ffff800008631180 <bust_spinlocks>
    N    149.054225518                 bust_spinlocks: ffff800008631180:
    340000c0   cbz     w0, ffff800008631198 <bust_spinlocks+0x18>
    I    149.054225518                 bust_spinlocks: ffff800008631184:
    f000a321   adrp    x1, ffff800009a98000 <pbufs.0+0xbb8>
    I    149.054225518                 bust_spinlocks: ffff800008631188:
    b9405c20   ldr     w0, [x1, #92]
    I    149.054225518                 bust_spinlocks: ffff80000863118c:
    11000400   add     w0, w0, #0x1
    I    149.054225518                 bust_spinlocks: ffff800008631190:
    b9005c20   str     w0, [x1, #92]
    E    149.054225518                 bust_spinlocks: ffff800008631194:
    d65f03c0   ret
    A                                           panic: ffff800008096358

Perf based testing
------------------

Starting perf session
~~~~~~~~~~~~~~~~~~~~~
ETF::

    perf record -e cs_etm/panicstop,@tmc_etf1/ -C 1
    perf record -e cs_etm/panicstop,@tmc_etf2/ -C 2

ETR::

    perf record -e cs_etm/panicstop,@tmc_etr0/ -C 1,2

Reading trace data after panic
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The sysfs based method explained above can also be used to retrieve and
decode the trace data after the reboot that follows a kernel panic.
