- 19 2月, 2016 1 次提交
-
-
由 Dave Hansen 提交于
We try to enforce protection keys in software the same way that we do in hardware. (See long example below). But, we only want to do this when accessing our *own* process's memory. If GDB set PKRU[6].AD=1 (disable access to PKEY 6), then tried to PTRACE_POKE a target process which just happened to have some mprotect_pkey(pkey=6) memory, we do *not* want to deny the debugger access to that memory. PKRU is fundamentally a thread-local structure and we do not want to enforce it on access to _another_ thread's data. This gets especially tricky when we have workqueues or other delayed-work mechanisms that might run in a random process's context. We can check that we only enforce pkeys when operating on our *own* mm, but delayed work gets performed when a random user context is active. We might end up with a situation where a delayed-work gup fails when running randomly under its "own" task but succeeds when running under another process. We want to avoid that. To avoid that, we use the new GUP flag: FOLL_REMOTE and add a fault flag: FAULT_FLAG_REMOTE. They indicate that we are walking an mm which is not guranteed to be the same as current->mm and should not be subject to protection key enforcement. Thanks to Jerome Glisse for pointing out this scenario. Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com> Reviewed-by: NThomas Gleixner <tglx@linutronix.de> Cc: Alexey Kardashevskiy <aik@ozlabs.ru> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Boaz Harrosh <boaz@plexistor.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Chinner <dchinner@redhat.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Gibson <david@gibson.dropbear.id.au> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Dominik Dingel <dingel@linux.vnet.ibm.com> Cc: Dominik Vogt <vogt@linux.vnet.ibm.com> Cc: Eric B Munson <emunson@akamai.com> Cc: Geliang Tang <geliangtang@163.com> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jan Kara <jack@suse.cz> Cc: Jason Low <jason.low2@hp.com> Cc: Jerome Marchand <jmarchan@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Laurent Dufour <ldufour@linux.vnet.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Matthew Wilcox <willy@linux.intel.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Mikulas Patocka <mpatocka@redhat.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: Shachar Raindel <raindel@mellanox.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Xie XiuQi <xiexiuqi@huawei.com> Cc: iommu@lists.linux-foundation.org Cc: linux-arch@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org Cc: linux-s390@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
- 14 12月, 2015 4 次提交
-
-
由 Julia Lawall 提交于
This mmu_notifier_ops structure is never modified, so declare it as const, like the other mmu_notifier_ops structures. Done with the help of Coccinelle. Signed-off-by: NJulia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: NJoerg Roedel <jroedel@suse.de>
-
由 Joerg Roedel 提交于
Get rid of the three error paths that look the same and move error handling to a single place. Reviewed-by: NJesse Barnes <jbarnes@virtuousgeek.org> Acked-By: NDavid Woodhouse <David.Woodhouse@intel.com> Signed-off-by: NJoerg Roedel <jroedel@suse.de>
-
由 Joerg Roedel 提交于
Instead of just checking for a write access, calculate the flags that are passed to handle_mm_fault() more precisly and use the pre-defined macros. Reviewed-by: NJesse Barnes <jbarnes@virtuousgeek.org> Acked-By: NDavid Woodhouse <David.Woodhouse@intel.com> Signed-off-by: NJoerg Roedel <jroedel@suse.de>
-
由 Joerg Roedel 提交于
The handle_mm_fault function expects the caller to do the access checks. Not doing so and calling the function with wrong permissions is a bug (catched by a BUG_ON). So fix this bug by adding proper access checking to the io page-fault code in the AMD IOMMUv2 driver. Reviewed-by: NJesse Barnes <jbarnes@virtuousgeek.org> Acked-By: NDavid Woodhouse <David.Woodhouse@intel.com> Signed-off-by: NJoerg Roedel <jroedel@suse.de>
-
- 15 10月, 2015 1 次提交
-
-
由 Jay Cornwall 提交于
handle_mm_fault indirectly triggers a BUG in do_numa_page when given a VMA without read/write/execute access. Check this condition in do_fault. do_fault -> handle_mm_fault -> handle_pte_fault -> do_numa_page mm/memory.c 3147 static int do_numa_page(struct mm_struct *mm, struct vm_area_struct *vma, .... 3159 /* A PROT_NONE fault should not end up here */ 3160 BUG_ON(!(vma->vm_flags & (VM_READ | VM_EXEC | VM_WRITE))); Signed-off-by: NJay Cornwall <jay@jcornwall.me> Cc: <stable@vger.kernel.org> # v4.1+ Signed-off-by: NJoerg Roedel <jroedel@suse.de>
-
- 14 8月, 2015 1 次提交
-
-
由 Joerg Roedel 提交于
Found by a coccicheck script. Signed-off-by: NJoerg Roedel <jroedel@suse.de>
-
- 30 7月, 2015 1 次提交
-
-
由 Joerg Roedel 提交于
Since the conversion to default domains the iommu_attach_device function only works for devices with their own group. But this isn't always true for current IOMMUv2 capable devices, so use iommu_attach_group instead. Signed-off-by: NJoerg Roedel <jroedel@suse.de>
-
- 04 5月, 2015 1 次提交
-
-
由 Oded Gabbay 提交于
This patch fixes a bug in put_pasid_state_wait that appeared in kernel 4.0 The bug is that pasid_state->count wasn't decremented before entering the wait_event. Thus, the condition in wait_event will never be true. The fix is to decrement (atomically) the pasid_state->count before the wait_event. Signed-off-by: NOded Gabbay <oded.gabbay@amd.com> Cc: stable@vger.kernel.org #v4.0 Signed-off-by: NJoerg Roedel <jroedel@suse.de>
-
- 04 3月, 2015 1 次提交
-
-
由 Dan Carpenter 提交于
"pasid_state->device_state" and "dev_state" are the same, but it's nicer to use dev_state consistently. Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Signed-off-by: NJoerg Roedel <jroedel@suse.de>
-
- 04 2月, 2015 3 次提交
-
-
由 Joerg Roedel 提交于
The AMD address is dead for a long time already, replace it with a working one. Signed-off-by: NJoerg Roedel <jroedel@suse.de>
-
由 Joerg Roedel 提交于
Now that I learned about possible spurious wakeups this place needs fixing too. Replace the self-coded sleep variant with the generic wait_event() helper. Signed-off-by: NJoerg Roedel <jroedel@suse.de>
-
由 Peter Zijlstra 提交于
put_device_state_wait() doesn't loop on the condition and a spurious wakeup will have it free the device state even though there might still be references out to it. Fix this by using 'normal' wait primitives. Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NJoerg Roedel <jroedel@suse.de>
-
- 14 12月, 2014 1 次提交
-
-
由 Jesse Barnes 提交于
This could be useful for debug in the future if we want to track major/minor faults more closely, and also avoids the put_page trick we used with gup. In order to do this, we also track the task struct in the PASID state structure. This lets us update the appropriate task stats after the fault has been handled, and may aid with debug in the future as well. Signed-off-by: NJesse Barnes <jbarnes@virtuousgeek.org> Tested-by: NOded Gabbay <oded.gabbay@amd.com> Cc: Joerg Roedel <jroedel@suse.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 12 11月, 2014 1 次提交
-
-
由 Oded Gabbay 提交于
This patch fixes a bug in the accounting of the device_state. In the current code, the device_state was put (decremented) too many times, which sometimes lead to the driver getting stuck permanently in put_device_state_wait(). That happen because the device_state->count would go below zero, which is never supposed to happen. The root cause is that the device_state was decremented in put_pasid_state() and put_pasid_state_wait() but also in all the functions that call those functions. Therefore, the device_state was decremented twice in each of these code paths. The fix is to decouple the device_state accounting from the pasid_state accounting - remove the call to put_device_state() from the put_pasid_state() and the put_pasid_state_wait()) Signed-off-by: NOded Gabbay <oded.gabbay@amd.com> Signed-off-by: NJoerg Roedel <jroedel@suse.de>
-
- 24 9月, 2014 1 次提交
-
-
由 Andres Lagar-Cavilla 提交于
1. We were calling clear_flush_young_notify in unmap_one, but we are within an mmu notifier invalidate range scope. The spte exists no more (due to range_start) and the accessed bit info has already been propagated (due to kvm_pfn_set_accessed). Simply call clear_flush_young. 2. We clear_flush_young on a primary MMU PMD, but this may be mapped as a collection of PTEs by the secondary MMU (e.g. during log-dirty). This required expanding the interface of the clear_flush_young mmu notifier, so a lot of code has been trivially touched. 3. In the absence of shadow_accessed_mask (e.g. EPT A bit), we emulate the access bit by blowing the spte. This requires proper synchronizing with MMU notifier consumers, like every other removal of spte's does. Signed-off-by: NAndres Lagar-Cavilla <andreslc@google.com> Acked-by: NRik van Riel <riel@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 30 7月, 2014 4 次提交
-
-
由 Joerg Roedel 提交于
amd_iommu_pasid_bind -> amd_iommu_bind_pasid Signed-off-by: NJoerg Roedel <jroedel@suse.de>
-
由 Joerg Roedel 提交于
The references to the device state are not dropped everywhere. This might cause a dead-lock in amd_iommu_free_device(). Fix it. Signed-off-by: NJoerg Roedel <jroedel@suse.de> Tested-by: NOded Gabbay <oded.gabbay@amd.com>
-
由 Joerg Roedel 提交于
All calls to this call-back are wrapped with mmu_notifer_invalidate_range_start()/end(), making this notifier pretty useless, so remove it. Signed-off-by: NJoerg Roedel <jroedel@suse.de> Tested-by: NOded Gabbay <oded.gabbay@amd.com>
-
由 Joerg Roedel 提交于
With calling te mmu_notifier_register function we hold a reference to the mm_struct that needs to be released in mmu_notifier_unregister. This is true even if the notifier was already unregistered from exit_mmap and the .release call-back has already run. So make sure we call mmu_notifier_unregister unconditionally in amd_iommu_unbind_pasid. Signed-off-by: NJoerg Roedel <jroedel@suse.de> Tested-by: NOded Gabbay <oded.gabbay@amd.com>
-
- 10 7月, 2014 9 次提交
-
-
由 Joerg Roedel 提交于
On the error path of amd_iommu_bind_pasid() we call mmu_notifier_unregister() for cleanup. This calls mn_release() which calls the users inv_ctx_cb function if one is available. Since the pasid is not set up yet there is nothing the user can to tear down in this call-back. So don't call inv_ctx_cb on the error path of amd_iommu_unbind_pasid() and make life of the users simpler. Signed-off-by: NJoerg Roedel <jroedel@suse.de> Tested-by: NOded Gabbay <Oded.Gabbay@amd.com>
-
由 Joerg Roedel 提交于
Since we are only caring about the lifetime of the mm_struct and not the task we can't safely keep a reference to it. The reference is also not needed anymore, so remove that code entirely. Signed-off-by: NJoerg Roedel <jroedel@suse.de> Tested-by: NOded Gabbay <Oded.Gabbay@amd.com>
-
由 Joerg Roedel 提交于
With mmu_notifiers we don't need to hold a reference to the mm_struct during the time the pasid is bound to a device. We can rely on the .mn_release call back to inform us when the mm_struct goes away. Signed-off-by: NJoerg Roedel <jroedel@suse.de> Tested-by: NOded Gabbay <Oded.Gabbay@amd.com>
-
由 Joerg Roedel 提交于
This is used to signal the ppr_notifer function that no more faults should be processes on this pasid_state. This way we can keep the pasid_state safely in the state-table so that it can be freed in the amd_iommu_unbind_pasid() function. This allows us to not hold a reference to the mm_struct during the whole pasid-binding-time. Signed-off-by: NJoerg Roedel <jroedel@suse.de> Tested-by: NOded Gabbay <Oded.Gabbay@amd.com>
-
由 Joerg Roedel 提交于
In case we are not able to allocate a fault structure a reference to the pasid_state will be leaked. Fix that by dropping the reference in the error path in case we hold one. Signed-off-by: NJoerg Roedel <jroedel@suse.de> Tested-by: NOded Gabbay <Oded.Gabbay@amd.com>
-
由 Joerg Roedel 提交于
Unbind_pasid is only called from mn_release which already has the pasid_state. Use this to simplify the unbind_pasid path and get rid of __unbind_pasid. Signed-off-by: NJoerg Roedel <jroedel@suse.de> Tested-by: NOded Gabbay <Oded.Gabbay@amd.com>
-
由 Joerg Roedel 提交于
The mmu_notifier state is part of pasid_state so it can't be freed in the mn_release path. Free the pasid_state after mmu_notifer_unregister has completed. Signed-off-by: NJoerg Roedel <jroedel@suse.de> Tested-by: NOded Gabbay <Oded.Gabbay@amd.com>
-
由 Joerg Roedel 提交于
This function is called only in the mn_release() path, so there is no need to unregister the mmu_notifer here. Signed-off-by: NJoerg Roedel <jroedel@suse.de> Tested-by: NOded Gabbay <Oded.Gabbay@amd.com>
-
由 Joerg Roedel 提交于
Fix typo in a comment. Signed-off-by: NJoerg Roedel <jroedel@suse.de>
-
- 09 7月, 2014 1 次提交
-
-
由 Alexey Skidanov 提交于
The pasid wasn't properly initialized before caling to invalid PPR calback Signed-off-by: NAlexey Skidanov <Alexey.Skidanov@amd.com> Signed-off-by: NOded Gabbay <oded.gabbay@amd.com> Signed-off-by: NJoerg Roedel <jroedel@suse.de>
-
- 20 6月, 2014 1 次提交
-
-
由 Joerg Roedel 提交于
Commit e79df31c introduced mmu_notifer_count to protect against parallel mmu_notifier_invalidate_range_start/end calls. The patch left a small race condition when invalidate_range_end() races with a new invalidate_range_start() the empty page-table may be reverted leading to stale TLB entries in the IOMMU and the device. Use a spin_lock instead of just an atomic variable to eliminate the race. Signed-off-by: NJoerg Roedel <jroedel@suse.de>
-
- 26 5月, 2014 5 次提交
-
-
由 Joerg Roedel 提交于
Add a counter to the pasid_state so that we do not restore the original page-table before all invalidate_range_start to invalidate_range_end sections have finished. Signed-off-by: NJoerg Roedel <jroedel@suse.de>
-
由 Joerg Roedel 提交于
This list was only used for the task_exit notifier function. Now that it is gone we can remove it. Signed-off-by: NJoerg Roedel <jroedel@suse.de> Tested-by: NJay Cornwall <Jay.Cornwall@amd.com>
-
由 Joerg Roedel 提交于
Since mmu_notifier call-backs can sleep (because they use SRCU now) we can use them to tear down PASID mappings. This allows us to finally remove the hack to use the task_exit notifier from oprofile to get notified when a process dies. Signed-off-by: NJoerg Roedel <jroedel@suse.de> Tested-by: NJay Cornwall <Jay.Cornwall@amd.com>
-
由 Joerg Roedel 提交于
The state_table consumes 512kb of memory and is only sparsly populated. Convert it into a list to save memory. There should be no measurable performance impact. Signed-off-by: NJoerg Roedel <jroedel@suse.de> Tested-by: NJay Cornwall <Jay.Cornwall@amd.com>
-
由 Joerg Roedel 提交于
This is a preparation for converting the state_table into a state_list. Signed-off-by: NJoerg Roedel <jroedel@suse.de> Tested-by: NJay Cornwall <Jay.Cornwall@amd.com>
-
- 13 5月, 2014 1 次提交
-
-
由 Jay Cornwall 提交于
get_user_pages requires caller to hold a read lock on mmap_sem. Signed-off-by: NJay Cornwall <jay.cornwall@amd.com> Signed-off-by: NSuravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: NJoerg Roedel <joro@8bytes.org>
-
- 10 11月, 2014 1 次提交
-
-
由 Oded Gabbay 提交于
This patch fixes a bug in the accounting of the device_state. In the current code, the device_state was put (decremented) too many times, which sometimes lead to the driver getting stuck permanently in put_device_state_wait(). That happen because the device_state->count would go below zero, which is never supposed to happen. The root cause is that the device_state was decremented in put_pasid_state() and put_pasid_state_wait() but also in all the functions that call those functions. Therefore, the device_state was decremented twice in each of these code paths. The fix is to decouple the device_state accounting from the pasid_state accounting - remove the call to put_device_state() from the put_pasid_state() and the put_pasid_state_wait()) Signed-off-by: NOded Gabbay <oded.gabbay@amd.com>
-
- 13 11月, 2014 1 次提交
-
-
由 Joerg Roedel 提交于
Make use of the new invalidate_range mmu_notifier call-back and remove the old logic of assigning an empty page-table between invalidate_range_start and invalidate_range_end. Signed-off-by: NJoerg Roedel <jroedel@suse.de> Tested-by: NOded Gabbay <oded.gabbay@amd.com> Reviewed-by: NAndrea Arcangeli <aarcange@redhat.com> Reviewed-by: NJérôme Glisse <jglisse@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Rik van Riel <riel@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Johannes Weiner <jweiner@redhat.com> Cc: Jay Cornwall <Jay.Cornwall@amd.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: David Woodhouse <dwmw2@infradead.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NOded Gabbay <oded.gabbay@amd.com>
-
- 24 7月, 2012 1 次提交
-
-
由 Masanari Iida 提交于
Correct spelling typo in debug messages and comments in drivers/iommu. Signed-off-by: NMasanari Iida <standby24x7@gmail.com> Signed-off-by: NJiri Kosina <jkosina@suse.cz>
-