- 04 6月, 2021 18 次提交
-
-
由 Cheng Jian 提交于
hulk inclusion category: feature bugzilla: 51919 CVE: NA ---------------------------------------- support livepatch without ftrace for x86_64 supported now: livepatch relocation when init_patch after load_module; instruction patched when enable; activeness function check; enforcing the patch stacking principle; x86_64 use variable length instruction, so there's no need to consider extra implementation for long jumps. Reviewed-by: NXie XiuQi <xiexiuqi@huawei.com> Signed-off-by: NCheng Jian <cj.chengjian@huawei.com> Signed-off-by: NLi Bin <huawei.libin@huawei.com> Tested-by: NYang ZuoTing <yangzuoting@huawei.com> Tested-by: NCheng Jian <cj.chengjian@huawei.com> Signed-off-by: Nzhangyi (F) <yi.zhang@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> Signed-off-by: NWang ShaoBo <bobo.shaobowang@huawei.com> Signed-off-by: NYe Weihua <yeweihua4@huawei.com> Reviewed-by: NYang Jihong <yangjihong1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> -
由 Dong Kai 提交于
hulk inclusion category: feature bugzilla: 51921 CVE: NA --------------------------- After commit d556e1be ("livepatch: Remove module_disable_ro() usage") and commit 0d9fbf78 ("module: Remove module_disable_ro()") and commit e6eff437 ("module: Make module_enable_ro() static again") merged, the module_disable_ro is removed and module_enable_ro is make static. It's ok for x86/ppc platform because the livepatch module relocation is done by text poke func which internally modify the text addr by remap to high virtaddr which has write permission. However for arm/arm64 platform, it's apply_relocate[_add] still directly modify the text code so we should change the module text permission before relocation. Otherwise it will lead to following problem: Unable to handle kernel write to read-only memory at virtual address ffff800008a95288 Mem abort info: ESR = 0x9600004f EC = 0x25: DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 Data abort info: ISV = 0, ISS = 0x0000004f CM = 0, WnR = 1 swapper pgtable: 4k pages, 48-bit VAs, pgdp=000000004133c000 [ffff800008a95288] pgd=00000000bdfff003, p4d=00000000bdfff003, pud=00000000bdffe003, pmd=0000000080ce7003, pte=0040000080d5d783 Internal error: Oops: 9600004f [#1] PREEMPT SMP Modules linked in: livepatch_testmod_drv(OK+) testmod_drv(O) CPU: 0 PID: 139 Comm: insmod Tainted: G O K 5.10.0-01131-gf6b4602e09b2-dirty #35 Hardware name: linux,dummy-virt (DT) pstate: 80000005 (Nzcv daif -PAN -UAO -TCO BTYPE=--) pc : reloc_insn_imm+0x54/0x78 lr : reloc_insn_imm+0x50/0x78 sp : ffff800011cf3910 ... Call trace: reloc_insn_imm+0x54/0x78 apply_relocate_add+0x464/0x680 klp_apply_section_relocs+0x11c/0x148 klp_enable_patch+0x338/0x998 patch_init+0x338/0x1000 [livepatch_testmod_drv] do_one_initcall+0x60/0x1d8 do_init_module+0x58/0x1e0 load_module+0x1fb4/0x2688 __do_sys_finit_module+0xc0/0x128 __arm64_sys_finit_module+0x20/0x30 do_el0_svc+0x84/0x1b0 el0_svc+0x14/0x20 el0_sync_handler+0x90/0xc8 el0_sync+0x158/0x180 Code: 2a0503e0 9ad42a73 97d6a499 91000673 (b90002a0) ---[ end trace 67dd2ef1203ed335 ]--- Though the permission change is not necessary to x86/ppc platform, consider that the jump_label_register api may modify the text code either, we just put the change handle here instead of putting it in arch-specific relocate. Besides, the jump_label_module_nb callback called in jump_label_register also maybe need motify the module code either, it sort and swap the jump entries if necessary. So just disable ro before jump_label handling and restore it back. Signed-off-by: NDong Kai <dongkai11@huawei.com> Signed-off-by: NYe Weihua <yeweihua4@huawei.com> Reviewed-by: NYang Jihong <yangjihong1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Cheng Jian 提交于
hulk inclusion category: bugfix bugzilla: 51923 CVE: NA --------------------------- When doing consistency stack checking, if we try to patch a function which has been patched already. We should check the new function(not the origin func) that is activeness currently, it's always the first entry in list func_node->func_stack. Example : module : origin livepatch v1 livepatch v2 func : old func A -[enable]=> new func A' -[enable]=> new func A'' check : A A' when we try to patch function A to new function A'' by livepatch v2, but the func A has already patched to function A' by livepatch v1, so function A' which provided in livepatch v1 is active in the stack instead of origin function A. Even if the long jump method is used, we jump to the new function A' using a call without LR, the origin function A will not appear in the stack. We must check the active function A' in consistency stack checking. Signed-off-by: NCheng Jian <cj.chengjian@huawei.com> Reviewed-By: NXie XiuQi <xiexiuqi@huawei.com> Signed-off-by: Nyangerkun <yangerkun@huawei.com> Signed-off-by: NDong Kai <dongkai11@huawei.com> Signed-off-by: NYe Weihua <yeweihua4@huawei.com> Reviewed-by: NYang Jihong <yangjihong1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Cheng Jian 提交于
euler inclusion category: feature bugzilla: 51923 CVE: N/A ---------------------------------------- The offset of the direct jump under ARM is 32M. Longer jumps are required to exceed this range. First-- long jump for relocations If the jump address exceeds the range in these relocation, it needs to be implemented with a long jump. but there is no function for us to modify its first LJMP_INSN_SIZE instructions like enable livepatch do, we should use module plts to store the information. so we need enough PLTS to store the symbol. The .klp.rela.objname.secname section store all symbols that required relocate by livepatch. For commit 425595a7 ("livepatch: reuse module loader code to write relocations") merged, load_module can create enough plt entries for livepatch by module_frob_arch_sections. However, the module loader only use rel section, this is will be fixed in the next commits and need adapter kpatch-build front-tools. Second-- long jump for call new function We modify several instructions from the beginning of the function to jump instructions, thus completing the jump from the old function to the new function. Unlike the relocation information, there is no plt sections to use here, so use the LDT instruction to complete the long jump using the LDT instruction. [PC+0]: ldr PC [PC+8] [PC+4]: nop [PC+8]: new_addr_to_jump Signed-off-by: NCheng Jian <cj.chengjian@huawei.com> Reviewed-by: NBin Li <huawei.libin@huawei.com> Signed-off-by: Nzhangyi (F) <yi.zhang@huawei.com> Signed-off-by: NDong Kai <dongkai11@huawei.com> Signed-off-by: NYe Weihua <yeweihua4@huawei.com> Reviewed-by: NYang Jihong <yangjihong1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Cheng Jian 提交于
euler inclusion category: bugfix bugzilla: 51923 CVE: NA --------------------------------- We through stack checking to ensure the consistency of livepatch. Task blocked in __switch_to when switch out, thread_saved_fs/pc store the FP and PC when switching, it can be useful when tracing blocked threads. For running task, __builtin_frame_address can be used, but it's difficult to backtracking the running task on other CPUs. Fortunately, all CPUs will stay in this function, the current's backtrace is so similar. so just backtracking the current on this CPU, skip the current of other CPUs. Signed-off-by: NCheng Jian <cj.chengjian@huawei.com> Reviewed-by: NXie XiuQi <xiexiuqi@huawei.com> Signed-off-by: Nzhangyi (F) <yi.zhang@huawei.com> Signed-off-by: NDong Kai <dongkai11@huawei.com> Signed-off-by: NYe Weihua <yeweihua4@huawei.com> Reviewed-by: NYang Jihong <yangjihong1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Li Bin 提交于
euler inclusion category: feature bugzilla: 51923 CVE: N/A ---------------------------------------- support livepatch without ftrace for ARM supported now: livepatch relocation when init_patch after load_module; instruction patched when enable; activeness function check; enforcing the patch stacking principle; unsupport now:(willn't fix it feature) long jump (both livepatch relocation and insn patched) module plts request by livepatch-relocation Because CONFIG_ARM_MODULE_PLTS will be not set in ARM, so we needn't long jump and livepatch plts. Signed-off-by: NCheng Jian <cj.chengjian@huawei.com> Signed-off-by: NLi Bin <huawei.libin@huawei.com> Tested-by: NCheng Jian <cj.chengjian@huawei.com> Tested-by: NWang Feng <wangfeng59@huawei.com> Tested-by: NLin DingYu <lindingyu@huawei.com> Reviewed-by: NXie XiuQi <xiexiuqi@huawei.com> Signed-off-by: Nzhangyi (F) <yi.zhang@huawei.com> Signed-off-by: NDong Kai <dongkai11@huawei.com> Signed-off-by: NYe Weihua <yeweihua4@huawei.com> Reviewed-by: NYang Jihong <yangjihong1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> -
由 Dong Kai 提交于
hulk inclusion category: feature bugzilla: 51923 CVE: NA --------------------------- In the older version of livepatch implementation without ftrace on arm, it use klp_relocs and do special relocation for klp syms. The kpatch-build front-tools use kpatch version to generate klp_relocs. After commit 7c8e2bdd ("livepatch: Apply vmlinux-specific KLP relocations early") and commit 425595a7 ("livepatch: reuse module loader code to write relocations"), the mainline klp relocation flow is always using ".klp.rela." section and kpatch-build front-tools use klp version to generate klp module. The default klp_apply_section_relocs is only for 64bit and modules with rela support. Because CONFIG_MODULES_USE_ELF_REL is set in arm, so we modify klp relocation to support 32bit and modules using rel. Also the kpatch-build front-tools should adapter to support this. Signed-off-by: NDong Kai <dongkai11@huawei.com> Signed-off-by: NYe Weihua <yeweihua4@huawei.com> Reviewed-by: NYang Jihong <yangjihong1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Dong Kai 提交于
hulk inclusion category: feature bugzilla: 51921 CVE: NA --------------------------- We are planning to add livepatch without ftrace support for arm in the next commit. However after commit 425595a7 ("livepatch: reuse module loader code to write relocations") merged, the klp relocations is done by apply_relocate function. The mod->arch.{core,init}.plt pointers were problematic for livepatch because they pointed within temporary section headers (provided by the module loader via info->sechdrs) that would be freed after module load. Here we take same modification based on commit c8ebf64e ("arm64/module: use plt section indices for relocations") to solve. Signed-off-by: NDong Kai <dongkai11@huawei.com> Signed-off-by: NYe Weihua <yeweihua4@huawei.com> Reviewed-by: NYang Jihong <yangjihong1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Cheng Jian 提交于
euler inclusion category: feature Bugzilla: 51921 CVE: N/A ---------------------------------------- Now, arm64 don't support DYNAMIC_FTRACE_WITH_REGS and RELIABLE_STACKTRACE. which the first is necessary to implement livepatch with ftrace and the second allow to implement per-task consistency. So. arm64 only support LIVEPATCH_WO_FTRACE and STOP_MACHINE_CONSISTENCY. but other architectures can work under LIVEPATCH_FTRACE with PER_TASK_CONSISTENCY. commit the depends to avoid incorrect configuration. Signed-off-by: NCheng Jian <cj.chengjian@huawei.com> Reviewed-by: NLi Bin <huawei.libin@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> Signed-off-by: NDong Kai <dongkai11@huawei.com> Signed-off-by: NYe Weihua <yeweihua4@huawei.com> Reviewed-by: NYang Jihong <yangjihong1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Cheng Jian 提交于
hulk inclusion category: bugfix bugzilla: 51921 CVE: NA --------------------------- When doing consistency stack checking, if we try to patch a function which has been patched already. We should check the new function(not the origin func) that is activeness currently, it's always the first entry in list func_node->func_stack. Example : module : origin livepatch v1 livepatch v2 func : old func A -[enable]=> new func A' -[enable]=> new func A'' check : A A' when we try to patch function A to new function A'' by livepatch v2, but the func A has already patched to function A' by livepatch v1, so function A' which provided in livepatch v1 is active in the stack instead of origin function A. Even if the long jump method is used, we jump to the new function A' using a call without LR, the origin function A will not appear in the stack. We must check the active function A' in consistency stack checking. Reviewed-By: NXie XiuQi <xiexiuqi@huawei.com> Signed-off-by: NCheng Jian <cj.chengjian@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> Signed-off-by: NWang ShaoBo <bobo.shaobowang@huawei.com> Signed-off-by: NYe Weihua <yeweihua4@huawei.com> Reviewed-by: NYang Jihong <yangjihong1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Cheng Jian 提交于
hulk inclusion category: bugfix bugzilla: 51921 CVE: NA ------------------------------------------------- We through stack checking to ensure the consistency of livepatch. Task blocked in __switch_to when switch out, thread_saved_fs/pc store the FP and PC when switching, it can be useful when tracing blocked threads. For running task, __builtin_frame_address can be used, but it's difficult to backtracking the running task on other CPUs. Fortunately, all CPUs will stay in this function, the current's backtrace is so similar. so just backtracking the current on this CPU, skip the current of other CPUs. Signed-off-by: NCheng Jian <cj.chengjian@huawei.com> Reviewed-by: NLi Bin <huawei.libin@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> Signed-off-by: NWang ShaoBo <bobo.shaobowang@huawei.com> Signed-off-by: NDong Kai <dongkai11@huawei.com> Signed-off-by: NYe Weihua <yeweihua4@huawei.com> Reviewed-by: NYang Jihong <yangjihong1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Cheng Jian 提交于
euler inclusion category: feature bugzilla: 51921 CVE: N/A ---------------------------------------- We need to modify the first 4 instructions of a livepatch function to complete the long jump if offset out of short-range. So it's important that this function must have more than 4 instructions, so we checked it when the livepatch module insmod. In fact, this corner case is highly unlikely to occur on arm64, but it's still an effective and meaningful check to avoid crash by doing this. Signed-off-by: NCheng Jian <cj.chengjian@huawei.com> Reviewed-by: NLi Bin <huawei.libin@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> Signed-off-by: NWang ShaoBo <bobo.shaobowang@huawei.com> Signed-off-by: NDong Kai <dongkai11@huawei.com> Signed-off-by: NYe Weihua <yeweihua4@huawei.com> Reviewed-by: NYang Jihong <yangjihong1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Cheng Jian 提交于
hulk inclusion category: feature bugzilla: 51921 CVE: N/A ---------------------------------------- support livepatch without ftrace for ARM64 supported now: livepatch relocation when init_patch after load_module; instruction patched when enable; activeness function check; enforcing the patch stacking principle; long jump (both livepatch relocation and insn patched) module plts request by livepatch-relocation Signed-off-by: NCheng Jian <cj.chengjian@huawei.com> Reviewed-by: NLi Bin <huawei.libin@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> Signed-off-by: NWang ShaoBo <bobo.shaobowang@huawei.com> Signed-off-by: NDong Kai <dongkai11@huawei.com> Signed-off-by: NYe Weihua <yeweihua4@huawei.com> Reviewed-by: NYang Jihong <yangjihong1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> -
由 Cheng Jian 提交于
hulk inclusion category: feature bugzilla: 51921 CVE: NA ----------------------------------------------- The kpatch-build processes the __jump_table special section, and only the jump_lable used by the changed functions will be included in __jump_table section, and the livepatch should process the tracepoint again after the dynamic relocation. NOTE: adding new tracepoints definition is not supported. Signed-off-by: NCheng Jian <cj.chengjian@huawei.com> Reviewed-by: NXie XiuQi <xiexiuqi@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> Signed-off-by: NWang ShaoBo <bobo.shaobowang@huawei.com> Signed-off-by: NDong Kai <dongkai11@huawei.com> Signed-off-by: NYe Weihua <yeweihua4@huawei.com> Reviewed-by: NYang Jihong <yangjihong1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Cheng Jian 提交于
euler inclusion category: feature bugzilla: 51921 CVE: N/A ---------------------------------------- The front-tools kpatch-build support load and unload hooks in the older version and already changed to use pre/post callbacks after commit 93862e38 ("livepatch: add (un)patch callbacks"). However, for livepatch based on stop machine consistency, this callbacks will be called within stop_machine context if we using it. This is dangerous because we can't known what the user will do in the callbacks. It may trigger system crash if using any function which internally might sleep. Here we use the old load/unload hooks to allow user-defined hooks. Although it's not good enough compared to pre/post callbacks, it can meets user needs to some extent. Of cource, this requires cooperation of kpatch-build tools. Signed-off-by: NCheng Jian <cj.chengjian@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> Signed-off-by: NWang ShaoBo <bobo.shaobowang@huawei.com> Signed-off-by: NDong Kai <dongkai11@huawei.com> Signed-off-by: NYe Weihua <yeweihua4@huawei.com> Reviewed-by: NYang Jihong <yangjihong1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Cheng Jian 提交于
euler inclusion category: feature bugzilla: 51921 CVE: N/A ---------------------------------------- In the previous version we forced the association between livepatch wo_ftrace and stop_machine. This is unwise and obviously confusing. commit d83a7cb3 ("livepatch: change to a per-task consistency model") introduce a PER-TASK consistency model. It's a hybrid of kGraft and kpatch: it uses kGraft's per-task consistency and syscall barrier switching combined with kpatch's stack trace switching. There are also a number of fallback options which make it quite flexible. So we split livepatch consistency for without ftrace to two model: [1] PER-TASK consistency model. per-task consistency and syscall barrier switching combined with kpatch's stack trace switching. [2] STOP-MACHINE consistency model. stop-machine consistency and kpatch's stack trace switching. Signed-off-by: NCheng Jian <cj.chengjian@huawei.com> Reviewed-by: NLi Bin <huawei.libin@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> Signed-off-by: NWang ShaoBo <bobo.shaobowang@huawei.com> Signed-off-by: NDong Kai <dongkai11@huawei.com> Signed-off-by: NYe Weihua <yeweihua4@huawei.com> Reviewed-by: NYang Jihong <yangjihong1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Cheng Jian 提交于
euler inclusion category: feature bugzilla: 51921 CVE: N/A ---------------------------------------- livepatch wo_ftrace and kprobe are in conflict, because kprobe may modify the instructions anywhere in the function. So it's dangerous to patched/unpatched an function when there are some kprobes registered on it. Restrict these situation. we should hold kprobe_mutex in klp_check_patch_kprobed, but it's static and can't export, so protect klp_check_patch_probe in stop_machine to avoid registing kprobes when patching. we do nothing for (un)register kprobes on the (old) function which has been patched. because there are sone engineers need this. certainly, it will not lead to hangs, but not recommended. Signed-off-by: NCheng Jian <cj.chengjian@huawei.com> Reviewed-by: NLi Bin <huawei.libin@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> Signed-off-by: NWang ShaoBo <bobo.shaobowang@huawei.com> Signed-off-by: NYe Weihua <yeweihua4@huawei.com> Reviewed-by: NYang Jihong <yangjihong1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Cheng Jian 提交于
euler inclusion category: feature bugzilla: 51921 CVE: NA ---------------------------------------- support for livepatch without ftrace mode new config for WO_FTRACE CONFIG_LIVEPATCH_WO_FTRACE=y CONFIG_LIVEPATCH_STACK=y Implements livepatch without ftrace by direct jump, we directly modify the first few instructions(usually one, but four for long jumps under ARM64) of the old function as jump instructions by stop_machine, so it will jump to the first address of the new function when livepatch enable KERNEL/MODULE call/bl A---------------old_A------------ | jump new_A----+--------| | | | | | | ----------------- | | | | livepatch_module------------- | | | | |new_A <--------------------+--------------------| | | | | |---------------------------| | .plt | | ......PLTS for livepatch | ----------------------------- something we need to consider under different architectures: 1. jump instruction 2. partial relocation in new function requires for livepatch. 3. long jumps may be required if the jump address exceeds the offset. both for livepatch relocation and livepatch enable. Signed-off-by: NCheng Jian <cj.chengjian@huawei.com> Reviewed-by: NLi Bin <huawei.libin@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> Signed-off-by: NWang ShaoBo <bobo.shaobowang@huawei.com> Signed-off-by: NDong Kai <dongkai11@huawei.com> Signed-off-by: NYe Weihua <yeweihua4@huawei.com> Reviewed-by: NYang Jihong <yangjihong1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
- 03 6月, 2021 22 次提交
-
-
由 Jean-Philippe Brucker 提交于
maillist inclusion category: feature bugzilla: 51855 CVE: NA Reference: https://jpbrucker.net/git/linux/commit/?h=sva/2021-03-01&id=b81eda9426104cf59867c1ccf6b147fc0727e08b --------------------------------------------- A bunch of sanity-checks. For development only, because it probably adds a large overhead to the fast path. The fault only comes from the IOMMU driver, which is obviously bug-free so this won't ever trigger. Signed-off-by: NJean-Philippe Brucker <jean-philippe@linaro.org> Signed-off-by: NLijun Fang <fanglijun3@huawei.com> Reviewed-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Jean-Philippe Brucker 提交于
maillist inclusion category: feature bugzilla: 51855 CVE: NA Reference: https://jpbrucker.net/git/linux/commit/?h=sva/2021-03-01&id=870541b34bfa7ba8d5846fcb3246e533e7492e20 --------------------------------------------- It's useful when debugging to have some trace events for SVA object allocation and freeing. Signed-off-by: NJean-Philippe Brucker <jean-philippe@linaro.org> Signed-off-by: NLijun Fang <fanglijun3@huawei.com> Reviewed-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Jean-Philippe Brucker 提交于
maillist inclusion category: feature bugzilla: 51855 CVE: NA Reference: https://jpbrucker.net/git/linux/commit/?h=sva/2021-03-01&id=836544e7c81361379e509eeca568d64f8f3dfbe2 --------------------------------------------- Add two new ioctls for VFIO containers. VFIO_IOMMU_BIND_PROCESS creates a bond between a container and a process address space, identified by a Process Address Space ID (PASID). Devices in the container append this PASID to DMA transactions in order to access the process' address space. The process page tables are shared with the IOMMU, and mechanisms such as PCI ATS/PRI are used to handle faults. VFIO_IOMMU_UNBIND_PROCESS removes a bond created with VFIO_IOMMU_BIND_PROCESS. This patch is only provided for testing. It isn't possible to implement SVA with vfio-pci, because the generic VFIO driver doesn't know how to perform device-specific methods for stopping the use of PASID. This could be achieved with vfio-mdev and a mediating driver that knows how to perform stop PASID. Signed-off-by: NJean-Philippe Brucker <jean-philippe@linaro.org> Signed-off-by: NLijun Fang <fanglijun3@huawei.com> Reviewed-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Jean-Philippe Brucker 提交于
maillist inclusion category: feature bugzilla: 51855 CVE: NA Reference: https://jpbrucker.net/git/linux/commit/?h=sva/2021-03-01&id=7984f27d63b2f56dcdc106be9484061b3a13df0b --------------------------------------------- VFIO works directly with IOMMU groups rather than devices. Even if groups ar still required to only contain one device in order to use SVA, add a set of helpers for VFIO. Signed-off-by: NJean-Philippe Brucker <jean-philippe@linaro.org> Signed-off-by: NLijun Fang <fanglijun3@huawei.com> Reviewed-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Jean-Philippe Brucker 提交于
maillist inclusion category: feature bugzilla: 51855 CVE: NA Reference: https://jpbrucker.net/git/linux/commit/?h=sva/2021-03-01&id=5207d639ca92f1e9aad02023fedaaafb3b91708d --------------------------------------------- In some cases releasing a mm bound to a device might invoke an exit handler, that takes a lock already held by the function calling mmput(). This is the case for VFIO, which needs to call mmput_async to avoid a deadlock. Other drivers using SVA might follow. Since they can be built as modules, export the mmput_async symbol. Signed-off-by: NJean-Philippe Brucker <jean-philippe@linaro.org> Signed-off-by: NLijun Fang <fanglijun3@huawei.com> Reviewed-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Jean-Philippe Brucker 提交于
maillist inclusion category: feature bugzilla: 51855 CVE: NA Reference: https://jpbrucker.net/git/linux/commit/?h=sva/2021-03-01&id=176b3454e53ffdd4a4f6dbc51f128f1f35a1b357 --------------------------------------------- Userspace drivers implemented with VFIO might want to bind sub-processes to their devices. In a VFIO ioctl, they provide a pid that is used to find a task and its mm. Since VFIO can be built as a module, export the find_get_task_by_vpid symbol. Signed-off-by: NJean-Philippe Brucker <jean-philippe@linaro.org> Signed-off-by: NLijun Fang <fanglijun3@huawei.com> Reviewed-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Jean-Philippe Brucker 提交于
maillist inclusion category: feature bugzilla: 51855 CVE: NA Reference: https://jpbrucker.net/git/linux/commit/?h=sva/2021-03-01&id=17c99c238dbfb453256f2a85a9f0c48dbc976814 --------------------------------------------- Some devices can access process address spaces directly. When creating such bond, to check that a process controlling the device is allowed to access the target address space, the device driver uses mm_access(). Since the drivers (in this case VFIO) can be built as a module, export the mm_access symbol. Signed-off-by: NJean-Philippe Brucker <jean-philippe@linaro.org> Signed-off-by: NLijun Fang <fanglijun3@huawei.com> Reviewed-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Jean-Philippe Brucker 提交于
maillist inclusion category: feature bugzilla: 51855 CVE: NA Reference: https://jpbrucker.net/git/linux/commit/?h=sva/2021-03-01&id=4411467daeff90c7c371cef6369e4bf8561fb00e --------------------------------------------- In commit a3a19592 ("iommu: Add APIs for multiple domains per device"), the IOMMU API gained the concept of auxiliary domains (AUXD), which allows to control the PASID-tagged address spaces of a device. With AUXD the PASID address space are not shared with the CPU, but are instead modified with iommu_map() and iommu_unmap() calls on auxiliary domains. Add auxiliary domain support to the SMMUv3 driver. Device drivers allocate an unmanaged IOMMU domain with iommu_domain_alloc(), and attach it to the device with iommu_aux_attach_domain(). The AUXD API is fairly permissive, and allows to attach an IOMMU domain in both normal and auxiliary mode at the same time - one device can be attached to the domain normally, and another device can be attached through one of its PASIDs. To avoid excessive complexity in the SMMU implementation we pose some restrictions on supported AUXD usage: * A domain is either in auxiliary mode or normal mode. And that state is sticky. Once detached the domain has to be re-attached in the same mode. * An auxiliary domain can have a single parent domain. Two devices can be attached to the same auxiliary domain only if they are attached to the same parent domain. In practice these shouldn't be problematic, since we have the same kind of restriction on normal domains and users have been able to cope so far: at the moment a domain cannot be attached to two devices behind different SMMUs. When VFIO puts two such devices in the same container, it simply falls back to allocating two separate IOMMU domains. Be careful with mixing ATS and PASID. PCIe does not provide a way to only invalidate non-PASID ATC entries, without also invalidating all PASID-tagged ATC entries in the same address range! Try to avoid using PASID and non-PASID contexts at the same time. FIXME: if a device is removed from the domain while we drop an auxiliary domain, there may be a problem. We remove the ctx desc, invalidate CD cache for all devices, then invaliate the ATC for all devices. But we drop the devices lock between the two invalidations so if a device is removed we might miss the ATC inval? Signed-off-by: NJean-Philippe Brucker <jean-philippe@linaro.org> Signed-off-by: NLijun Fang <fanglijun3@huawei.com> Reviewed-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Jacob Pan 提交于
maillist inclusion category: feature bugzilla: 51855 CVE: NA Reference: https://jpbrucker.net/git/linux/commit/?h=sva/2021-03-01&id=f6cfa2e3655787bc3f90c50aff7cc75d0746bdf3 --------------------------------------------- For performance and debugging purposes, these trace events help analyzing device faults that interact with IOMMU subsystem. E.g. IOMMU:0000:00:0a.0 type=2 reason=0 addr=0x00000000007ff000 pasid=1 group=1 last=0 prot=1 Signed-off-by: NJacob Pan <jacob.jun.pan@linux.intel.com> [JPB: removed invalidate event, that will be added later] Signed-off-by: NJean-Philippe Brucker <jean-philippe@linaro.org> Signed-off-by: NLijun Fang <fanglijun3@huawei.com> Reviewed-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Jacob Pan 提交于
maillist inclusion category: feature bugzilla: 51855 CVE: NA Reference: https://jpbrucker.net/git/linux/commit/?h=sva/2021-03-01&id=303996d99beb31a70149ab4a328bfc41af037e92 --------------------------------------------- For development only, trace I/O page faults and responses. Signed-off-by: NJacob Pan <jacob.jun.pan@linux.intel.com> [JPB: removed the invalidate trace event, that will be added later] Signed-off-by: NJean-Philippe Brucker <jean-philippe@linaro.org> Signed-off-by: NLijun Fang <fanglijun3@huawei.com> Reviewed-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Jacob Pan 提交于
maillist inclusion category: feature bugzilla: 51855 CVE: NA Reference: https://jpbrucker.net/git/linux/commit/?h=sva/2021-03-01&id=9a1e957fd072d0e993827e428367947d915da382 --------------------------------------------- When IO page faults are reported outside IOMMU subsystem, the page request handler may fail for various reasons. E.g. a guest received page requests but did not have a chance to run for a long time. The irresponsive behavior could hold off limited resources on the pending device. There can be hardware or credit based software solutions as suggested in the PCI ATS Ch-4. To provide a basic safety net this patch introduces a per device deferrable timer which monitors the longest pending page fault that requires a response. Proper action such as sending failure response code could be taken when timer expires but not included in this patch. We need to consider the life cycle of page groupd ID to prevent confusion with reused group ID by a device. For now, a warning message provides clue of such failure. Signed-off-by: NJacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: NAshok Raj <ashok.raj@intel.com> Signed-off-by: NLijun Fang <fanglijun3@huawei.com> Reviewed-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Jacob Pan 提交于
maillist inclusion category: feature bugzilla: 51855 CVE: NA Reference: https://jpbrucker.net/git/linux/commit/?h=sva/2021-03-01&id=2c56ef21daed0cd65da9b4907ce4b84c56ca7c1c --------------------------------------------- When an I/O page request is processed outside the IOMMU subsystem, response can be delayed or lost. Add a tunable setup parameter such that user can choose the timeout for IOMMU to track pending page requests. This timeout mechanism is a basic safety net which can be implemented in conjunction with credit based or device level page response exception handling. Signed-off-by: NJacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: NLijun Fang <fanglijun3@huawei.com> Reviewed-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Jacob Pan 提交于
maillist inclusion category: feature bugzilla: 51855 CVE: NA Reference: https://jpbrucker.net/git/linux/commit/?h=sva/2021-03-01&id=ff9d3ac0df919b780a81ccfa54b4dc2eb69175a8 --------------------------------------------- In virtualization use case, when a guest is assigned a PCI host device, protected by a virtual IOMMU on the guest, the physical IOMMU must be programmed to be consistent with the guest mappings. If the physical IOMMU supports two translation stages it makes sense to program guest mappings onto the first stage/level (ARM/Intel terminology) while the host owns the stage/level 2. In that case, it is mandated to trap on guest configuration settings and pass those to the physical iommu driver. This patch adds a new API to the iommu subsystem that allows to set/unset the pasid table information. A generic iommu_pasid_table_config struct is introduced in a new iommu.h uapi header. This is going to be used by the VFIO user API. Reviewed-by: NJean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by: NJean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by: NLiu, Yi L <yi.l.liu@linux.intel.com> Signed-off-by: NAshok Raj <ashok.raj@intel.com> Signed-off-by: NJacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: NEric Auger <eric.auger@redhat.com> Signed-off-by: NLijun Fang <fanglijun3@huawei.com> Reviewed-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Jean-Philippe Brucker 提交于
maillist inclusion category: feature bugzilla: 51855 CVE: NA Reference: https://jpbrucker.net/git/linux/commit/?h=sva/2021-03-01&id=25fe4fbaddf2401ab711d800e89452f41930f98b --------------------------------------------- The "pci=noats" kernel parameter disables PCIe ATS globally, and affects any ATS-capable IOMMU driver. So rather than adding Arm SMMUv3, which recently gained ATS support, to the list of relevant build options, simplify the noats description. Signed-off-by: NJean-Philippe Brucker <jean-philippe@linaro.org> Signed-off-by: NLijun Fang <fanglijun3@huawei.com> Reviewed-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Jean-Philippe Brucker 提交于
maillist inclusion category: feature bugzilla: 51855 CVE: NA Reference: https://jpbrucker.net/git/linux/commit/?h=sva/2021-03-01&id=0e93ed439f6957d7ff821107c16f8dd8e7df0a2c --------------------------------------------- Device-tree declare whether a PCI root-complex supports ATS by setting the "ats-supported" property. Copy this flag into device fwspec to let IOMMU drivers quickly check if they can enable ATS for a device. Signed-off-by: NJean-Philippe Brucker <jean-philippe@linaro.org> Signed-off-by: NLijun Fang <fanglijun3@huawei.com> Reviewed-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Jean-Philippe Brucker 提交于
maillist inclusion category: feature bugzilla: 51855 CVE: NA Reference: https://jpbrucker.net/git/linux/commit/?h=sva/2021-03-01&id=af3fee2e57e5ddd4f16cfad160e89cafddb0d629 --------------------------------------------- Declare that the host controller supports ATS, so the OS can enable it for ATS-capable PCIe endpoints. Signed-off-by: NJean-Philippe Brucker <jean-philippe@linaro.org> Signed-off-by: NLijun Fang <fanglijun3@huawei.com> Reviewed-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Jean-Philippe Brucker 提交于
maillist inclusion category: feature bugzilla: 51855 CVE: NA Reference: https://jpbrucker.net/git/linux/commit/?h=sva/2021-03-01&id=a914759f2c6939546c47e11942b690c1110738cf --------------------------------------------- Add a way for firmware to tell the OS that ATS is supported by the PCI root complex. An endpoint with ATS enabled may send Translation Requests and Translated Memory Requests, which look just like Normal Memory Requests with a non-zero AT field. So a root controller that ignores the AT field may simply forward the request to the IOMMU as a Normal Memory Request, which could end badly. In any case, the endpoint will be unusable. The ats-supported property allows the OS to only enable ATS in endpoints if the root controller can handle ATS requests. Only add the property to pcie-host-ecam-generic for the moment. For non-generic root controllers, availability of ATS can be inferred from the compatible string. Reviewed-by: NRob Herring <robh@kernel.org> Signed-off-by: NJean-Philippe Brucker <jean-philippe@linaro.org> Signed-off-by: NLijun Fang <fanglijun3@huawei.com> Reviewed-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Jean-Philippe Brucker 提交于
maillist inclusion category: feature bugzilla: 51855 CVE: NA Reference: https://jpbrucker.net/git/linux/commit/?h=sva/2021-03-01&id=b61d8bac171c221b3d14c03eba78338bfbbb8825 --------------------------------------------- When a device or driver misbehaves, it is possible to receive events much faster than we can print them out. Ratelimit the printing of events. Signed-off-by: NJean-Philippe Brucker <jean-philippe@linaro.org> Signed-off-by: NLijun Fang <fanglijun3@huawei.com> Reviewed-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Jean-Philippe Brucker 提交于
maillist inclusion category: feature bugzilla: 51855 CVE: NA Reference: https://jpbrucker.net/git/linux/commit/?h=sva/2021-03-01&id=9c63ea52c28effb4b797e2f8528a09958185514c --------------------------------------------- If the SMMU supports it and the kernel was built with HTTU support, enable hardware update of access and dirty flags. This is essential for shared page tables, to reduce the number of access faults on the fault queue. Normal DMA with io-pgtables doesn't currently use the access or dirty flags. We can enable HTTU even if CPUs don't support it, because the kernel always checks for HW dirty bit and updates the PTE flags atomically. Signed-off-by: NJean-Philippe Brucker <jean-philippe@linaro.org> Signed-off-by: NLijun Fang <fanglijun3@huawei.com> Reviewed-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Jean-Philippe Brucker 提交于
maillist inclusion category: feature bugzilla: 51855 CVE: NA Reference: https://jpbrucker.net/git/linux/commit/?h=sva/2021-03-01&id=027d704339ca0e1a89987eac24e60678f6844a56 --------------------------------------------- The SMMUv3 can handle invalidation targeted at TLB entries with shared ASIDs. If the implementation supports broadcast TLB maintenance, enable it and keep track of it in a feature bit. The SMMU will then be affected by inner-shareable TLB invalidations from other agents. A major side-effect of this change is that stage-2 translation contexts are now affected by all invalidations by VMID. VMIDs are all shared and the only ways to prevent over-invalidation, since the stage-2 page tables are not shared between CPU and SMMU, are to either disable BTM or allocate different VMIDs. This patch does not address the problem. Signed-off-by: NJean-Philippe Brucker <jean-philippe@linaro.org> Signed-off-by: NLijun Fang <fanglijun3@huawei.com> Reviewed-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Jean-Philippe Brucker 提交于
maillist inclusion category: feature bugzilla: 51855 CVE: NA Reference: https://jpbrucker.net/git/linux/commit/?h=sva/2021-03-01&id=b6805ab77b4ca7a844ce3b1c4584133e81e3f071 --------------------------------------------- For PCI devices that support it, enable the PRI capability and handle PRI Page Requests with the generic fault handler. It is enabled when device driver enables IOMMU_DEV_FEAT_SVA. Signed-off-by: NJean-Philippe Brucker <jean-philippe@linaro.org> Signed-off-by: NLijun Fang <fanglijun3@huawei.com> Reviewed-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Jean-Philippe Brucker 提交于
maillist inclusion category: feature bugzilla: 51855 CVE: NA Reference: https://jpbrucker.net/git/linux/commit/?h=sva/2021-03-01&id=8d0debddf237bd794dbb0bb8de89dc11387ede03 --------------------------------------------- The SMMUv3 driver uses pci_{enable,disable}_pri() and related functions. Export those functions to allow the driver to be built as a module. Acked-by: NBjorn Helgaas <bhelgaas@google.com> Reviewed-by: NKuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Signed-off-by: NJean-Philippe Brucker <jean-philippe@linaro.org> Signed-off-by: NLijun Fang <fanglijun3@huawei.com> Reviewed-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-