- 24 2月, 2017 1 次提交
-
-
由 Peter Zijlstra 提交于
Linus asked to please make this real C code. And since size then isn't an issue what so ever anymore, remove the debug knob and make all WARN()s unconditional. Suggested-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dwindsor@gmail.com Cc: elena.reshetova@intel.com Cc: gregkh@linuxfoundation.org Cc: ishkamiel@gmail.com Cc: keescook@chromium.org Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
- 16 2月, 2017 1 次提交
-
-
由 David S. Miller 提交于
This reverts commits: 6a254780 9dbbfb0a 40137906 It's too risky to put in this late in the release cycle. We'll put these changes into the next merge window instead. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 14 2月, 2017 1 次提交
-
-
由 Herbert Xu 提交于
This patch adds code that handles GFP_ATOMIC kmalloc failure on insertion. As we cannot use vmalloc, we solve it by making our hash table nested. That is, we allocate single pages at each level and reach our desired table size by nesting them. When a nested table is created, only a single page is allocated at the top-level. Lower levels are allocated on demand during insertion. Therefore for each insertion to succeed, only two (non-consecutive) pages are needed. After a nested table is created, a rehash will be scheduled in order to switch to a vmalloced table as soon as possible. Also, the rehash code will never rehash into a nested table. If we detect a nested table during a rehash, the rehash will be aborted and a new rehash will be scheduled. Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 10 2月, 2017 3 次提交
-
-
由 Kees Cook 提交于
Currently CONFIG_TIMER_STATS exposes process information across namespaces: kernel/time/timer_list.c print_timer(): SEQ_printf(m, ", %s/%d", tmp, timer->start_pid); /proc/timer_list: #11: <0000000000000000>, hrtimer_wakeup, S:01, do_nanosleep, cron/2570 Given that the tracer can give the same information, this patch entirely removes CONFIG_TIMER_STATS. Suggested-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NKees Cook <keescook@chromium.org> Acked-by: NJohn Stultz <john.stultz@linaro.org> Cc: Nicolas Pitre <nicolas.pitre@linaro.org> Cc: linux-doc@vger.kernel.org Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Xing Gao <xgao01@email.wm.edu> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Jessica Frazelle <me@jessfraz.com> Cc: kernel-hardening@lists.openwall.com Cc: Nicolas Iooss <nicolas.iooss_linux@m4x.org> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Petr Mladek <pmladek@suse.com> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Tejun Heo <tj@kernel.org> Cc: Michal Marek <mmarek@suse.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Olof Johansson <olof@lixom.net> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: linux-api@vger.kernel.org Cc: Arjan van de Ven <arjan@linux.intel.com> Link: http://lkml.kernel.org/r/20170208192659.GA32582@beastSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Waiman Long 提交于
As suggested by Ingo, the debug_objects_alloc counter is now renamed to debug_objects_allocated with minor twist in comment and debug output. Signed-off-by: NWaiman Long <longman@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1486503630-1501-1-git-send-email-longman@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Peter Zijlstra 提交于
Provide refcount_t, an atomic_t like primitive built just for refcounting. It provides saturation semantics such that overflow becomes impossible and thereby 'spurious' use-after-free is avoided. Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
- 06 2月, 2017 1 次提交
-
-
由 Waiman Long 提交于
On a large SMP system with many CPUs, the global pool_lock may become a performance bottleneck as all the CPUs that need to allocate or free debug objects have to take the lock. That can sometimes cause soft lockups like: NMI watchdog: BUG: soft lockup - CPU#35 stuck for 22s! [rcuos/1:21] ... RIP: 0010:[<ffffffff817c216b>] [<ffffffff817c216b>] _raw_spin_unlock_irqrestore+0x3b/0x60 ... Call Trace: [<ffffffff813f40d1>] free_object+0x81/0xb0 [<ffffffff813f4f33>] debug_check_no_obj_freed+0x193/0x220 [<ffffffff81101a59>] ? trace_hardirqs_on_caller+0xf9/0x1c0 [<ffffffff81284996>] ? file_free_rcu+0x36/0x60 [<ffffffff81251712>] kmem_cache_free+0xd2/0x380 [<ffffffff81284960>] ? fput+0x90/0x90 [<ffffffff81284996>] file_free_rcu+0x36/0x60 [<ffffffff81124c23>] rcu_nocb_kthread+0x1b3/0x550 [<ffffffff81124b71>] ? rcu_nocb_kthread+0x101/0x550 [<ffffffff81124a70>] ? sync_exp_work_done.constprop.63+0x50/0x50 [<ffffffff810c59d1>] kthread+0x101/0x120 [<ffffffff81101a59>] ? trace_hardirqs_on_caller+0xf9/0x1c0 [<ffffffff817c2d32>] ret_from_fork+0x22/0x50 To reduce the amount of contention on the pool_lock, the actual kmem_cache_free() of the debug objects will be delayed if the pool_lock is busy. This will temporarily increase the amount of free objects available at the free pool when the system is busy. As a result, the number of kmem_cache allocation and freeing is reduced. To further reduce the lock operations free debug objects in batches of four. Signed-off-by: NWaiman Long <longman@redhat.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: "Du Changbin" <changbin.du@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jan Stancek <jstancek@redhat.com> Link: http://lkml.kernel.org/r/1483647425-4135-4-git-send-email-longman@redhat.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 04 2月, 2017 2 次提交
-
-
由 Waiman Long 提交于
On a large SMP systems with hundreds of CPUs, the current thresholds for allocating and freeing debug objects (256 and 1024 respectively) may not work well. This can cause a lot of needless calls to kmem_aloc() and kmem_free() on those systems. To alleviate this thrashing problem, the object freeing threshold is now increased to "1024 + # of CPUs * 32". Whereas the object allocation threshold is increased to "256 + # of CPUs * 4". That should make the debug objects subsystem scale better with the number of CPUs available in the system. Signed-off-by: NWaiman Long <longman@redhat.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: "Du Changbin" <changbin.du@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jan Stancek <jstancek@redhat.com> Link: http://lkml.kernel.org/r/1483647425-4135-3-git-send-email-longman@redhat.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Waiman Long 提交于
New debugfs stat counters are added to track the numbers of kmem_cache_alloc() and kmem_cache_free() function calls to get a sense of how the internal debug objects cache management is performing. Signed-off-by: NWaiman Long <longman@redhat.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: "Du Changbin" <changbin.du@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jan Stancek <jstancek@redhat.com> Link: http://lkml.kernel.org/r/1483647425-4135-2-git-send-email-longman@redhat.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 03 2月, 2017 1 次提交
-
-
由 Jason A. Donenfeld 提交于
The "half md4" transform should not be used by any new code. And fortunately, it's only used now by ext4. Since ext4 supports several hashing methods, at some point it might be desirable to move to something like SipHash. As an intermediate step, remove half md4 from cryptohash.h and lib, and make it just a local function in ext4's hash.c. There's precedent for doing this; the other function ext can use for its hashes -- TEA -- is also implemented in the same place. Also, by being a local function, this might allow gcc to perform some additional optimizations. Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: NAndreas Dilger <adilger@dilger.ca> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
- 25 1月, 2017 2 次提交
-
-
由 zhong jiang 提交于
Recently, I've found cases in which ioremap_page_range was used incorrectly, in external modules, leading to crashes. This can be partly attributed to the fact that ioremap_page_range is lower-level, with fewer protections, as compared to the other functions that an external module would typically call. Those include: ioremap_cache ioremap_nocache ioremap_prot ioremap_uc ioremap_wc ioremap_wt ...each of which wraps __ioremap_caller, which in turn provides a safer way to achieve the mapping. Therefore, stop EXPORT-ing ioremap_page_range. Link: http://lkml.kernel.org/r/1485173220-29010-1-git-send-email-zhongjiang@huawei.comSigned-off-by: Nzhong jiang <zhongjiang@huawei.com> Reviewed-by: NJohn Hubbard <jhubbard@nvidia.com> Suggested-by: NJohn Hubbard <jhubbard@nvidia.com> Acked-by: NMichal Hocko <mhocko@suse.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Matthew Wilcox 提交于
The newly introduced warning in radix_tree_free_nodes() was testing the wrong variable; it should have been 'old' instead of 'node'. Fixes: ea07b862 ("mm: workingset: fix use-after-free in shadow node shrinker") Link: http://lkml.kernel.org/r/20170118163746.GA32495@cmpxchg.orgSigned-off-by: NMatthew Wilcox <mawilcox@microsoft.com> Signed-off-by: NJohannes Weiner <hannes@cmpxchg.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 24 1月, 2017 1 次提交
-
-
由 Matt Fleming 提交于
While debugging a performance issue I needed to understand why RCU sofitrqs were firing so frequently. Unfortunately, the RCU callback tracepoints are hidden behind CONFIG_RCU_TRACE which defaults to off in the upstream kernel and is likely to also be disabled in enterprise distribution configs. Enable it by default for CONFIG_TREE_RCU. However, we must keep it disabled for tiny RCU, because it would otherwise pull in a large amount of code that would make tiny RCU less than tiny. I ran some file system metadata intensive workloads (git checkout, FS-Mark) on a variety of machines with this patch and saw no detectable change in performance. Cc: Mel Gorman <mgorman@techsingularity.net> Signed-off-by: NMatt Fleming <matt@codeblueprint.co.uk> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: NJosh Triplett <josh@joshtriplett.org>
-
- 20 1月, 2017 1 次提交
-
-
由 Geliang Tang 提交于
Signed-off-by: NGeliang Tang <geliangtang@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: John Stultz <john.stultz@linaro.org> Link: http://lkml.kernel.org/r/0d5cf199ac43792df0b6f7e2145545c30fa1dbbe.1482222135.git.geliangtang@gmail.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 16 1月, 2017 1 次提交
-
-
由 Nikita Yushchenko 提交于
Some drivers do depend on page mappings to be page aligned. Swiotlb already enforces such alignment for mappings greater than page, extend that to page-sized mappings as well. Without this fix, nvme hits BUG() in nvme_setup_prps(), because that routine assumes page-aligned mappings. Signed-off-by: NNikita Yushchenko <nikita.yoush@cogentembedded.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NKonrad Rzeszutek Wilk <konrad@kernel.org>
-
- 15 1月, 2017 1 次提交
-
-
由 Al Viro 提交于
The logics in pipe_advance() used to release all buffers past the new position failed in cases when the number of buffers to release was equal to pipe->buffers. If that happened, none of them had been released, leaving pipe full. Worse, it was trivial to trigger and we end up with pipe full of uninitialized pages. IOW, it's an infoleak. Cc: stable@vger.kernel.org # v4.9 Reported-by: N"Alan J. Wylie" <alan@wylie.me.uk> Tested-by: N"Alan J. Wylie" <alan@wylie.me.uk> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 14 1月, 2017 1 次提交
-
-
由 Chris Wilson 提交于
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Maarten Lankhorst <dev@mblankhorst.nl> Cc: Nicolai Hähnle <nhaehnle@gmail.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20161201114711.28697-4-chris@chris-wilson.co.ukSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 11 1月, 2017 1 次提交
-
-
由 Sudip Mukherjee 提交于
The build of frv allmodconfig was failing with the errors like: /tmp/cc0JSPc3.s: Assembler messages: /tmp/cc0JSPc3.s:1839: Error: symbol `.LSLT0' is already defined /tmp/cc0JSPc3.s:1842: Error: symbol `.LASLTP0' is already defined /tmp/cc0JSPc3.s:1969: Error: symbol `.LELTP0' is already defined /tmp/cc0JSPc3.s:1970: Error: symbol `.LELT0' is already defined Commit 866ced95 ("kbuild: Support split debug info v4") introduced splitting the debug info and keeping that in a separate file. Somehow, the frv-linux gcc did not like that and I am guessing that instead of splitting it started copying. The first report about this is at: https://lists.01.org/pipermail/kbuild-all/2015-July/010527.html. I will try and see if this can work with frv and if still fails I will open a bug report with gcc. But meanwhile this is the easiest option to solve build failure of frv. Fixes: 866ced95 ("kbuild: Support split debug info v4") Link: http://lkml.kernel.org/r/1482062348-5352-1-git-send-email-sudipm.mukherjee@gmail.comSigned-off-by: NSudip Mukherjee <sudip.mukherjee@codethink.co.uk> Reported-by: NFengguang Wu <fengguang.wu@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Howells <dhowells@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 08 1月, 2017 1 次提交
-
-
由 Johannes Weiner 提交于
Several people report seeing warnings about inconsistent radix tree nodes followed by crashes in the workingset code, which all looked like use-after-free access from the shadow node shrinker. Dave Jones managed to reproduce the issue with a debug patch applied, which confirmed that the radix tree shrinking indeed frees shadow nodes while they are still linked to the shadow LRU: WARNING: CPU: 2 PID: 53 at lib/radix-tree.c:643 delete_node+0x1e4/0x200 CPU: 2 PID: 53 Comm: kswapd0 Not tainted 4.10.0-rc2-think+ #3 Call Trace: delete_node+0x1e4/0x200 __radix_tree_delete_node+0xd/0x10 shadow_lru_isolate+0xe6/0x220 __list_lru_walk_one.isra.4+0x9b/0x190 list_lru_walk_one+0x23/0x30 scan_shadow_nodes+0x2e/0x40 shrink_slab.part.44+0x23d/0x5d0 shrink_node+0x22c/0x330 kswapd+0x392/0x8f0 This is the WARN_ON_ONCE(!list_empty(&node->private_list)) placed in the inlined radix_tree_shrink(). The problem is with 14b46879 ("mm: workingset: move shadow entry tracking to radix tree exceptional tracking"), which passes an update callback into the radix tree to link and unlink shadow leaf nodes when tree entries change, but forgot to pass the callback when reclaiming a shadow node. While the reclaimed shadow node itself is unlinked by the shrinker, its deletion from the tree can cause the left-most leaf node in the tree to be shrunk. If that happens to be a shadow node as well, we don't unlink it from the LRU as we should. Consider this tree, where the s are shadow entries: root->rnode | [0 n] | | [s ] [sssss] Now the shadow node shrinker reclaims the rightmost leaf node through the shadow node LRU: root->rnode | [0 ] | [s ] Because the parent of the deleted node is the first level below the root and has only one child in the left-most slot, the intermediate level is shrunk and the node containing the single shadow is put in its place: root->rnode | [s ] The shrinker again sees a single left-most slot in a first level node and thus decides to store the shadow in root->rnode directly and free the node - which is a leaf node on the shadow node LRU. root->rnode | s Without the update callback, the freed node remains on the shadow LRU, where it causes later shrinker runs to crash. Pass the node updater callback into __radix_tree_delete_node() in case the deletion causes the left-most branch in the tree to collapse too. Also add warnings when linked nodes are freed right away, rather than wait for the use-after-free when the list is scanned much later. Fixes: 14b46879 ("mm: workingset: move shadow entry tracking to radix tree exceptional tracking") Reported-by: NDave Chinner <david@fromorbit.com> Reported-by: NHugh Dickins <hughd@google.com> Reported-by: NAndrea Arcangeli <aarcange@redhat.com> Reported-and-tested-by: NDave Jones <davej@codemonkey.org.uk> Signed-off-by: NJohannes Weiner <hannes@cmpxchg.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Chris Leech <cleech@redhat.com> Cc: Lee Duncan <lduncan@suse.com> Cc: Jan Kara <jack@suse.cz> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Matthew Wilcox <mawilcox@linuxonhyperv.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 07 1月, 2017 1 次提交
-
-
由 Konrad Rzeszutek Wilk 提交于
So they can figure out what is the optimal number of pages that can be contingously stitched together without fear of bounce buffer. We also expose an mechanism for sub-users of SWIOTLB API, such as Xen-SWIOTLB to set the max segment value. And lastly if swiotlb=force is set (which mandates we bounce buffer everything) we set max_segment so at least we can bounce buffer one 4K page instead of a giant 512KB one for which we may not have space. Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reported-and-Tested-by: NJuergen Gross <jgross@suse.com>
-
- 26 12月, 2016 1 次提交
-
-
由 Thomas Gleixner 提交于
ktime is a union because the initial implementation stored the time in scalar nanoseconds on 64 bit machine and in a endianess optimized timespec variant for 32bit machines. The Y2038 cleanup removed the timespec variant and switched everything to scalar nanoseconds. The union remained, but become completely pointless. Get rid of the union and just keep ktime_t as simple typedef of type s64. The conversion was done with coccinelle and some manual mopping up. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org>
-
- 25 12月, 2016 2 次提交
-
-
由 Thomas Gleixner 提交于
hotcpu_notifier(), cpu_notifier(), __hotcpu_notifier(), __cpu_notifier(), register_hotcpu_notifier(), register_cpu_notifier(), __register_hotcpu_notifier(), __register_cpu_notifier(), unregister_hotcpu_notifier(), unregister_cpu_notifier(), __unregister_hotcpu_notifier(), __unregister_cpu_notifier() are unused now. Remove them and all related code. Remove also the now pointless cpu notifier error injection mechanism. The states can be executed step by step and error rollback is the same as cpu down, so any state transition can be tested w/o requiring the notifier error injection. Some CPU hotplug states are kept as they are (ab)used for hotplug state tracking. Signed-off-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: rt@linutronix.de Link: http://lkml.kernel.org/r/20161221192112.005642358@linutronix.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Linus Torvalds 提交于
This was entirely automated, using the script by Al: PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>' sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \ $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h) to do the replacement at the end of the merge window. Requested-by: NAl Viro <viro@zeniv.linux.org.uk> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 23 12月, 2016 1 次提交
-
-
由 Al Viro 提交于
Problem similar to ones dealt with in "fold checks into iterate_and_advance()" and followups, except that in this case we really want to do nothing when asked for zero-length operation - unlike zero-length iterate_and_advance(), zero-length iterate_all_kinds() has no side effects, and callers are simpler that way. That got exposed when copy_from_iter_full() had been used by tipc, which builds an msghdr with zero payload and (now) feeds it to a primitive based on iterate_all_kinds() instead of iterate_and_advance(). Reported-by: NJon Maloy <jon.maloy@ericsson.com> Tested-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 21 12月, 2016 1 次提交
-
-
由 Borislav Petkov 提交于
s/prink/printk/ Link: http://lkml.kernel.org/r/20161215170111.19075-1-bp@alien8.deSigned-off-by: NBorislav Petkov <bp@suse.de> Cc: Olof Johansson <olof@lixom.net> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 19 12月, 2016 2 次提交
-
-
由 Geert Uytterhoeven 提交于
On architectures like arm64, swiotlb is tied intimately to the core architecture DMA support. In addition, ZONE_DMA cannot be disabled. To aid debugging and catch devices not supporting DMA to memory outside the 32-bit address space, add a kernel command line option "swiotlb=noforce", which disables the use of bounce buffers. If specified, trying to map memory that cannot be used with DMA will fail, and a rate-limited warning will be printed. Note that io_tlb_nslabs is set to 1, which is the minimal supported value. Signed-off-by: NGeert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
由 Geert Uytterhoeven 提交于
Convert the flag swiotlb_force from an int to an enum, to prepare for the advent of more possible values. Suggested-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: NGeert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
- 16 12月, 2016 1 次提交
-
-
由 Matthew Wilcox 提交于
[ This resurrects commit 53855d10, which was reverted in 2b41226b. It depended on commit d544abd5 ("lib/radix-tree: Convert to hotplug state machine") so now it is correct to apply ] Patch "lib/radix-tree: Convert to hotplug state machine" breaks the test suite as it adds a call to cpuhp_setup_state_nocalls() which is not currently emulated in the test suite. Add it, and delete the emulation of the old CPU hotplug mechanism. Link: http://lkml.kernel.org/r/1480369871-5271-36-git-send-email-mawilcox@linuxonhyperv.comSigned-off-by: NMatthew Wilcox <mawilcox@microsoft.com> Tested-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 15 12月, 2016 12 次提交
-
-
由 Matthew Wilcox 提交于
radix_tree_join() was freeing nodes with a non-zero ->exceptional count, and radix_tree_split() wasn't zeroing ->exceptional when it allocated the new node. Fix this by making all callers of radix_tree_node_alloc() pass in the new counts (and some other always-initialised fields), which will prevent the problem recurring if in future we decide to do something similar. Link: http://lkml.kernel.org/r/1481667692-14500-3-git-send-email-mawilcox@linuxonhyperv.comSigned-off-by: NMatthew Wilcox <mawilcox@microsoft.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Matthew Wilcox 提交于
When replacing an entry with NULL, we need to delete any sibling entries. Also account deleting exceptional entries properly. Also fix a bug with radix_tree_iter_replace() where we would fail to remove entirely freed nodes. Also fix accounting bug when switching between normal and exceptional entries with replace_slot. Also add testcases for all these bugs. Link: http://lkml.kernel.org/r/1480369871-5271-61-git-send-email-mawilcox@linuxonhyperv.comSigned-off-by: NMatthew Wilcox <mawilcox@microsoft.com> Tested-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Matthew Wilcox 提交于
Calculate how many nodes we need to allocate to split an old_order entry into multiple entries, each of size new_order. The test suite checks that we allocated exactly the right number of nodes; neither too many (checked by rtp->nr == 0), nor too few (checked by comparing nr_allocated before and after the call to radix_tree_split()). Link: http://lkml.kernel.org/r/1480369871-5271-60-git-send-email-mawilcox@linuxonhyperv.comSigned-off-by: NMatthew Wilcox <willy@linux.intel.com> Tested-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Matthew Wilcox 提交于
This new function splits a larger multiorder entry into smaller entries (potentially multi-order entries). These entries are initialised to RADIX_TREE_RETRY to ensure that RCU walkers who see this state aren't confused. The caller should then call radix_tree_for_each_slot() and radix_tree_replace_slot() in order to turn these retry entries into the intended new entries. Tags are replicated from the original multiorder entry into each new entry. Link: http://lkml.kernel.org/r/1480369871-5271-59-git-send-email-mawilcox@linuxonhyperv.comSigned-off-by: NMatthew Wilcox <willy@linux.intel.com> Tested-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Matthew Wilcox 提交于
This new function allows for the replacement of many smaller entries in the radix tree with one larger multiorder entry. From the point of view of an RCU walker, they may see a mixture of the smaller entries and the large entry during the same walk, but they will never see NULL for an index which was populated before the join. Link: http://lkml.kernel.org/r/1480369871-5271-58-git-send-email-mawilcox@linuxonhyperv.comSigned-off-by: NMatthew Wilcox <willy@linux.intel.com> Tested-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Matthew Wilcox 提交于
This is an exceptionally complicated function with just one caller (tag_pages_for_writeback). We devote a large portion of the runtime of the test suite to testing this one function which has one caller. By introducing the new function radix_tree_iter_tag_set(), we can eliminate all of the complexity while keeping the performance. The caller can now use a fairly standard radix_tree_for_each() loop, and it doesn't need to worry about tricksy things like 'start' wrapping. The test suite continues to spend a large amount of time investigating this function, but now it's testing the underlying primitives such as radix_tree_iter_resume() and the radix_tree_for_each_tagged() iterator which are also used by other parts of the kernel. Link: http://lkml.kernel.org/r/1480369871-5271-57-git-send-email-mawilcox@linuxonhyperv.comSigned-off-by: NMatthew Wilcox <willy@infradead.org> Tested-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Matthew Wilcox 提交于
This rather complicated function can be better implemented as an iterator. It has only one caller, so move the functionality to the only place that needs it. Update the test suite to follow the same pattern. Link: http://lkml.kernel.org/r/1480369871-5271-56-git-send-email-mawilcox@linuxonhyperv.comSigned-off-by: NMatthew Wilcox <mawilcox@microsoft.com> Acked-by: NKonstantin Khlebnikov <koct9i@gmail.com> Tested-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Matthew Wilcox 提交于
This fixes several interlinked problems with the iterators in the presence of multiorder entries. 1. radix_tree_iter_next() would only advance by one slot, which would result in the iterators returning the same entry more than once if there were sibling entries. 2. radix_tree_next_slot() could return an internal pointer instead of a user pointer if a tagged multiorder entry was immediately followed by an entry of lower order. 3. radix_tree_next_slot() expanded to a lot more code than it used to when multiorder support was compiled in. And I wasn't comfortable with entry_to_node() being in a header file. Fixing radix_tree_iter_next() for the presence of sibling entries necessarily involves examining the contents of the radix tree, so we now need to pass 'slot' to radix_tree_iter_next(), and we need to change the calling convention so it is called *before* dropping the lock which protects the tree. Also rename it to radix_tree_iter_resume(), as some people thought it was necessary to call radix_tree_iter_next() each time around the loop. radix_tree_next_slot() becomes closer to how it looked before multiorder support was introduced. It only checks to see if the next entry in the chunk is a sibling entry or a pointer to a node; this should be rare enough that handling this case out of line is not a performance impact (and such impact is amortised by the fact that the entry we just processed was a multiorder entry). Also, radix_tree_next_slot() used to force a new chunk lookup for untagged entries, which is more expensive than the out of line sibling entry skipping. Link: http://lkml.kernel.org/r/1480369871-5271-55-git-send-email-mawilcox@linuxonhyperv.comSigned-off-by: NMatthew Wilcox <mawilcox@microsoft.com> Tested-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Matthew Wilcox 提交于
Print the indices of the entries as unsigned (instead of signed) integers and print the parent node of each entry to help navigate around larger trees where the layout is not quite so obvious. Print the indices covered by a node. Rearrange the order of fields printed so the indices and parents line up for each type of entry. Link: http://lkml.kernel.org/r/1480369871-5271-53-git-send-email-mawilcox@linuxonhyperv.comSigned-off-by: NMatthew Wilcox <willy@infradead.org> Tested-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Matthew Wilcox 提交于
Since this function is specialised to the radix tree, pass in the node and tag to calculate the address of the bitmap in radix_tree_find_next_bit() instead of the caller. Likewise, there is no need to pass in the size of the bitmap. Link: http://lkml.kernel.org/r/1480369871-5271-52-git-send-email-mawilcox@linuxonhyperv.comSigned-off-by: NMatthew Wilcox <willy@infradead.org> Tested-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Matthew Wilcox 提交于
Similar to node_tag_clear(), factor node_tag_set() out of radix_tree_range_tag_if_tagged(). Link: http://lkml.kernel.org/r/1480369871-5271-51-git-send-email-mawilcox@linuxonhyperv.comSigned-off-by: NMatthew Wilcox <willy@linux.intel.com> Tested-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Matthew Wilcox 提交于
I want to be able to reference node->parent after freeing node. Currently node->parent is in a union with rcu_head, so it is overwritten when the node is put on the RCU list. We know that private_list is not referenced after the node is freed, so it is safe for these two members to share space. Link: http://lkml.kernel.org/r/1480369871-5271-50-git-send-email-mawilcox@linuxonhyperv.comSigned-off-by: NMatthew Wilcox <willy@infradead.org> Tested-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-