.. SPDX-License-Identifier: GPL-2.0

================================================================================
Concurrent Modification and Execution of Instructions (CMODX) for RISC-V Linux
================================================================================

CMODX is a programming technique where a program executes instructions that were
modified by the program itself. Instruction storage and the instruction cache
(icache) are not guaranteed to be synchronized on RISC-V hardware. Therefore, the
program must enforce its own synchronization with the unprivileged fence.i
instruction.

CMODX in the Kernel Space
-------------------------

Dynamic ftrace
---------------------

Essentially, dynamic ftrace directs the control flow by inserting a function
call at each patchable function entry, and patches it dynamically at runtime to
enable or disable the redirection. In the case of RISC-V, 2 instructions,
AUIPC + JALR, are required to compose a function call. However, it is impossible
to patch 2 instructions and expect that a concurrent read-side executes them
without a race condition. Kernel preemption makes things even worse as it allows
the old state to persist across the patching process with stop_machine(). The
scheme described here makes atomic code patching possible in RISC-V ftrace.

In order to get rid of stop_machine() and run dynamic ftrace with full kernel
preemption, we partially initialize each patchable function entry at boot-time,
setting the first instruction to AUIPC, and the second to NOP. Now, atomic
patching is possible because the kernel only has to update one instruction.
According to Ziccif, as long as an instruction is naturally aligned, the ISA
guarantees an atomic update.

By fixing the first instruction, AUIPC, the range of the ftrace trampoline
is limited to +-2K from the predetermined target, ftrace_caller, due to the lack
of immediate encoding space in RISC-V.
To address the issue, we introduce
CALL_OPS, where 8-byte, naturally aligned metadata is added in front of each
patchable function. The metadata is resolved at the first trampoline, and
execution can then be directed to another custom trampoline.

CMODX in the User Space
-----------------------

Though fence.i is an unprivileged instruction, the default Linux ABI prohibits
the use of fence.i in userspace applications. At any point the scheduler may
migrate a task onto a new hart. If migration occurs after userspace has
synchronized the icache and instruction storage with fence.i, the icache on the
new hart will no longer be clean. This is because fence.i only affects the hart
that it is executed on. Thus, the hart that the task has been migrated to may
not have synchronized instruction storage and icache.

There are two ways to solve this problem: use the riscv_flush_icache() syscall,
or use the ``PR_RISCV_SET_ICACHE_FLUSH_CTX`` prctl() and emit fence.i in
userspace. The syscall performs a one-off icache flushing operation. The prctl
changes the Linux ABI to allow userspace to emit icache flushing operations.

As an aside, "deferred" icache flushes can sometimes be triggered in the kernel.
At the time of writing, this only occurs during the riscv_flush_icache() syscall
and when the kernel uses copy_to_user_page(). These deferred flushes happen only
when the memory map being used by a hart changes. If the prctl() context caused
an icache flush, this deferred icache flush will be skipped as it is redundant.
Therefore, there will be no additional flush when using the riscv_flush_icache()
syscall inside of the prctl() context.

prctl() Interface
---------------------

Call prctl() with ``PR_RISCV_SET_ICACHE_FLUSH_CTX`` as the first argument. The
remaining arguments will be delegated to the riscv_set_icache_flush_ctx
function detailed below.

.. kernel-doc:: arch/riscv/mm/cacheflush.c
    :identifiers: riscv_set_icache_flush_ctx

Example usage:

The following files are meant to be compiled and linked with each other. The
modify_instruction() function replaces an ``addi`` of zero with an ``addi`` of
one, causing the instruction sequence in get_value() to change from returning a
zero to returning a one.

cmodx.c::

    #include <stdio.h>
    #include <sys/prctl.h>

    extern int get_value();
    extern void modify_instruction();

    int main()
    {
        int value = get_value();
        printf("Value before cmodx: %d\n", value);

        // Call prctl before first fence.i is called inside modify_instruction
        prctl(PR_RISCV_SET_ICACHE_FLUSH_CTX, PR_RISCV_CTX_SW_FENCEI_ON, PR_RISCV_SCOPE_PER_PROCESS);
        modify_instruction();
        // Call prctl after final fence.i is called in process
        prctl(PR_RISCV_SET_ICACHE_FLUSH_CTX, PR_RISCV_CTX_SW_FENCEI_OFF, PR_RISCV_SCOPE_PER_PROCESS);

        value = get_value();
        printf("Value after cmodx: %d\n", value);
        return 0;
    }

cmodx.S::

    .option norvc

    .text
    .global modify_instruction
    modify_instruction:
    # Load the replacement instruction word
    lw a0, new_insn
    # Overwrite the instruction at old_insn with it
    lui a5,%hi(old_insn)
    sw a0,%lo(old_insn)(a5)
    # Synchronize this hart's instruction fetches with the store
    fence.i
    ret

    .section modifiable, "awx"
    .global get_value
    get_value:
    li a0, 0
    old_insn:
    addi a0, a0, 0
    ret

    .data
    new_insn:
    addi a0, a0, 1
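
To make concrete what the store in modify_instruction() writes, the sketch below
computes the 32-bit encodings of the two ``addi`` instructions from the example.
It is an illustration only: ``encode_addi()``, ``old_insn_word()``,
``new_insn_word()`` and ``REG_A0`` are hypothetical names that do not exist in
the kernel or in any library, and the code assumes the uncompressed I-type
encoding, which is what ``.option norvc`` selects in cmodx.S.

```c
#include <assert.h>
#include <stdint.h>

/* a0 is register x10 in the standard RISC-V calling convention. */
enum { REG_A0 = 10 };

/*
 * Hypothetical helper: build the 32-bit I-type encoding of
 * "addi rd, rs1, imm". Not a kernel or library function.
 */
uint32_t encode_addi(unsigned int rd, unsigned int rs1, int32_t imm)
{
	/* Register numbers are 5 bits, the immediate is 12-bit signed. */
	assert(rd < 32 && rs1 < 32);
	assert(imm >= -2048 && imm <= 2047);

	return (((uint32_t)imm & 0xfffu) << 20) |	/* imm[11:0] */
	       ((uint32_t)rs1 << 15) |			/* rs1 */
	       (0x0u << 12) |				/* funct3 = ADDI */
	       ((uint32_t)rd << 7) |			/* rd */
	       0x13u;					/* opcode = OP-IMM */
}

/* Word initially located at old_insn: addi a0, a0, 0 */
uint32_t old_insn_word(void)
{
	return encode_addi(REG_A0, REG_A0, 0);
}

/* Word stored by modify_instruction(): addi a0, a0, 1 */
uint32_t new_insn_word(void)
{
	return encode_addi(REG_A0, REG_A0, 1);
}
```

Under these assumptions the two words are 0x00050513 and 0x00150513, so they
differ only in bit 20, the lowest bit of the immediate field. A single 32-bit
store therefore replaces the whole instruction.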