- 16 11月, 2017 1 次提交
-
-
由 Mel Gorman 提交于
During truncation, the mapping has already been checked for shmem and dax so it's known that workingset_update_node is required. This patch avoids the checks on mapping for each page being truncated. In all other cases, a lookup helper is used to determine if workingset_update_node() needs to be called. The one danger is that the API is slightly harder to use as calling workingset_update_node directly without checking for dax or shmem mappings could lead to surprises. However, the API rarely needs to be used and hopefully the comment is enough to give people the hint. sparsetruncate (tiny) 4.14.0-rc4 4.14.0-rc4 oneirq-v1r1 pickhelper-v1r1 Min Time 141.00 ( 0.00%) 140.00 ( 0.71%) 1st-qrtle Time 142.00 ( 0.00%) 141.00 ( 0.70%) 2nd-qrtle Time 142.00 ( 0.00%) 142.00 ( 0.00%) 3rd-qrtle Time 143.00 ( 0.00%) 143.00 ( 0.00%) Max-90% Time 144.00 ( 0.00%) 144.00 ( 0.00%) Max-95% Time 147.00 ( 0.00%) 145.00 ( 1.36%) Max-99% Time 195.00 ( 0.00%) 191.00 ( 2.05%) Max Time 230.00 ( 0.00%) 205.00 ( 10.87%) Amean Time 144.37 ( 0.00%) 143.82 ( 0.38%) Stddev Time 10.44 ( 0.00%) 9.00 ( 13.74%) Coeff Time 7.23 ( 0.00%) 6.26 ( 13.41%) Best99%Amean Time 143.72 ( 0.00%) 143.34 ( 0.26%) Best95%Amean Time 142.37 ( 0.00%) 142.00 ( 0.26%) Best90%Amean Time 142.19 ( 0.00%) 141.85 ( 0.24%) Best75%Amean Time 141.92 ( 0.00%) 141.58 ( 0.24%) Best50%Amean Time 141.69 ( 0.00%) 141.31 ( 0.27%) Best25%Amean Time 141.38 ( 0.00%) 140.97 ( 0.29%) As you'd expect, the gain is marginal but it can be detected. The differences in bonnie are all within the noise which is not surprising given the impact on the microbenchmark. radix_tree_update_node_t is a callback for some radix operations that optionally passes in a private field. The only user of the callback is workingset_update_node and as it no longer requires a mapping, the private field is removed. Link: http://lkml.kernel.org/r/20171018075952.10627-3-mgorman@techsingularity.netSigned-off-by: NMel Gorman <mgorman@techsingularity.net> Acked-by: NJohannes Weiner <hannes@cmpxchg.org> Reviewed-by: NJan Kara <jack@suse.cz> Cc: Andi Kleen <ak@linux.intel.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 04 10月, 2017 1 次提交
-
-
由 Eric Biggers 提交于
idr_replace() returns the old value on success, not 0. Link: http://lkml.kernel.org/r/20170918162642.37511-1-ebiggers3@gmail.comSigned-off-by: NEric Biggers <ebiggers@google.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 14 9月, 2017 1 次提交
-
-
由 Eric Biggers 提交于
IDR only supports non-negative IDs. There used to be a 'WARN_ON_ONCE(id < 0)' in idr_replace(), but it was intentionally removed by commit 2e1c9b28 ("idr: remove WARN_ON_ONCE() on negative IDs"). Then it was added back by commit 0a835c4f ("Reimplement IDR and IDA using the radix tree"). However it seems that adding it back was a mistake, given that some users such as drm_gem_handle_delete() (DRM_IOCTL_GEM_CLOSE) pass in a value from userspace to idr_replace(), allowing the WARN_ON_ONCE to be triggered. drm_gem_handle_delete() actually just wants idr_replace() to return an error code if the ID is not allocated, including in the case where the ID is invalid (negative). So once again remove the bogus WARN_ON_ONCE(). This bug was found by syzkaller, which encountered the following warning: WARNING: CPU: 3 PID: 3008 at lib/idr.c:157 idr_replace+0x1d8/0x240 lib/idr.c:157 Kernel panic - not syncing: panic_on_warn set ... CPU: 3 PID: 3008 Comm: syzkaller218828 Not tainted 4.13.0-rc4-next-20170811 #2 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Call Trace: fixup_bug+0x40/0x90 arch/x86/kernel/traps.c:190 do_trap_no_signal arch/x86/kernel/traps.c:224 [inline] do_trap+0x260/0x390 arch/x86/kernel/traps.c:273 do_error_trap+0x120/0x390 arch/x86/kernel/traps.c:310 do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:323 invalid_op+0x1e/0x30 arch/x86/entry/entry_64.S:930 RIP: 0010:idr_replace+0x1d8/0x240 lib/idr.c:157 RSP: 0018:ffff8800394bf9f8 EFLAGS: 00010297 RAX: ffff88003c6c60c0 RBX: 1ffff10007297f43 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8800394bfa78 RBP: ffff8800394bfae0 R08: ffffffff82856487 R09: 0000000000000000 R10: ffff8800394bf9a8 R11: ffff88006c8bae28 R12: ffffffffffffffff R13: ffff8800394bfab8 R14: dffffc0000000000 R15: ffff8800394bfbc8 drm_gem_handle_delete+0x33/0xa0 drivers/gpu/drm/drm_gem.c:297 drm_gem_close_ioctl+0xa1/0xe0 drivers/gpu/drm/drm_gem.c:671 drm_ioctl_kernel+0x1e7/0x2e0 drivers/gpu/drm/drm_ioctl.c:729 drm_ioctl+0x72e/0xa50 drivers/gpu/drm/drm_ioctl.c:825 vfs_ioctl fs/ioctl.c:45 [inline] do_vfs_ioctl+0x1b1/0x1520 fs/ioctl.c:685 SYSC_ioctl fs/ioctl.c:700 [inline] SyS_ioctl+0x8f/0xc0 fs/ioctl.c:691 entry_SYSCALL_64_fastpath+0x1f/0xbe Here is a C reproducer: #include <fcntl.h> #include <stddef.h> #include <stdint.h> #include <sys/ioctl.h> #include <drm/drm.h> int main(void) { int cardfd = open("/dev/dri/card0", O_RDONLY); ioctl(cardfd, DRM_IOCTL_GEM_CLOSE, &(struct drm_gem_close) { .handle = -1 } ); } Link: http://lkml.kernel.org/r/20170906235306.20534-1-ebiggers3@gmail.com Fixes: 0a835c4f ("Reimplement IDR and IDA using the radix tree") Signed-off-by: NEric Biggers <ebiggers@google.com> Acked-by: NTejun Heo <tj@kernel.org> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: <stable@vger.kernel.org> [v4.11+] Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 31 8月, 2017 1 次提交
-
-
由 Chris Mi 提交于
The following new APIs are added: int idr_alloc_ext(struct idr *idr, void *ptr, unsigned long *index, unsigned long start, unsigned long end, gfp_t gfp); void *idr_remove_ext(struct idr *idr, unsigned long id); void *idr_find_ext(const struct idr *idr, unsigned long id); void *idr_replace_ext(struct idr *idr, void *ptr, unsigned long id); void *idr_get_next_ext(struct idr *idr, unsigned long *nextid); Signed-off-by: NChris Mi <chrism@mellanox.com> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 14 2月, 2017 4 次提交
-
-
由 Matthew Wilcox 提交于
Where we use the radix tree iteration macros, we need to annotate 'slot' with __rcu. Make sure we don't forget any new places in the future with the same CFLAGS check used for radix-tree.c. Signed-off-by: NMatthew Wilcox <mawilcox@microsoft.com>
-
由 Matthew Wilcox 提交于
We can use the root entry as a bitmap and save allocating a 128 byte bitmap for an IDA that contains only a few entries (30 on a 32-bit machine, 62 on a 64-bit machine). This costs about 300 bytes of kernel text on x86-64, so as long as 3 IDAs fall into this category, this is a net win for memory consumption. Thanks to Rasmus Villemoes for his work documenting the problem and collecting statistics on IDAs. Signed-off-by: NMatthew Wilcox <mawilcox@microsoft.com>
-
由 Matthew Wilcox 提交于
When we preload the IDA, we allocate an IDA bitmap. Instead of storing that preallocated bitmap in the IDA, we store it in a percpu variable. Generally there are more IDAs in the system than CPUs, so this cuts down on the number of preallocated bitmaps that are unused, and about half of the IDA users did not call ida_destroy() so they were leaking IDA bitmaps. Signed-off-by: NMatthew Wilcox <mawilcox@microsoft.com>
-
由 Matthew Wilcox 提交于
The IDR is very similar to the radix tree. It has some functionality that the radix tree did not have (alloc next free, cyclic allocation, a callback-based for_each, destroy tree), which is readily implementable on top of the radix tree. A few small changes were needed in order to use a tag to represent nodes with free space below them. More extensive changes were needed to support storing NULL as a valid entry in an IDR. Plain radix trees still interpret NULL as a not-present entry. The IDA is reimplemented as a client of the newly enhanced radix tree. As in the current implementation, it uses a bitmap at the last level of the tree. Signed-off-by: NMatthew Wilcox <willy@infradead.org> Signed-off-by: NMatthew Wilcox <mawilcox@microsoft.com> Tested-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
- 13 12月, 2016 1 次提交
-
-
由 Daniel Vetter 提交于
I wanted to wrap a bunch of ida_simple_get calls into their own locking, until I dug around and read the original commit message. Stuff like this should imo be added to the kernel doc, let's do that. Link: http://lkml.kernel.org/r/20161027072216.20411-1-daniel.vetter@ffwll.chSigned-off-by: NDaniel Vetter <daniel.vetter@intel.com> Acked-by: NTejun Heo <tj@kernel.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@suse.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 07 11月, 2015 1 次提交
-
-
由 Mel Gorman 提交于
mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep and avoiding waking kswapd __GFP_WAIT has been used to identify atomic context in callers that hold spinlocks or are in interrupts. They are expected to be high priority and have access one of two watermarks lower than "min" which can be referred to as the "atomic reserve". __GFP_HIGH users get access to the first lower watermark and can be called the "high priority reserve". Over time, callers had a requirement to not block when fallback options were available. Some have abused __GFP_WAIT leading to a situation where an optimisitic allocation with a fallback option can access atomic reserves. This patch uses __GFP_ATOMIC to identify callers that are truely atomic, cannot sleep and have no alternative. High priority users continue to use __GFP_HIGH. __GFP_DIRECT_RECLAIM identifies callers that can sleep and are willing to enter direct reclaim. __GFP_KSWAPD_RECLAIM to identify callers that want to wake kswapd for background reclaim. __GFP_WAIT is redefined as a caller that is willing to enter direct reclaim and wake kswapd for background reclaim. This patch then converts a number of sites o __GFP_ATOMIC is used by callers that are high priority and have memory pools for those requests. GFP_ATOMIC uses this flag. o Callers that have a limited mempool to guarantee forward progress clear __GFP_DIRECT_RECLAIM but keep __GFP_KSWAPD_RECLAIM. bio allocations fall into this category where kswapd will still be woken but atomic reserves are not used as there is a one-entry mempool to guarantee progress. o Callers that are checking if they are non-blocking should use the helper gfpflags_allow_blocking() where possible. This is because checking for __GFP_WAIT as was done historically now can trigger false positives. Some exceptions like dm-crypt.c exist where the code intent is clearer if __GFP_DIRECT_RECLAIM is used instead of the helper due to flag manipulations. o Callers that built their own GFP flags instead of starting with GFP_KERNEL and friends now also need to specify __GFP_KSWAPD_RECLAIM. The first key hazard to watch out for is callers that removed __GFP_WAIT and was depending on access to atomic reserves for inconspicuous reasons. In some cases it may be appropriate for them to use __GFP_HIGH. The second key hazard is callers that assembled their own combination of GFP flags instead of starting with something like GFP_KERNEL. They may now wish to specify __GFP_KSWAPD_RECLAIM. It's almost certainly harmless if it's missed in most cases as other activity will wake kswapd. Signed-off-by: NMel Gorman <mgorman@techsingularity.net> Acked-by: NVlastimil Babka <vbabka@suse.cz> Acked-by: NMichal Hocko <mhocko@suse.com> Acked-by: NJohannes Weiner <hannes@cmpxchg.org> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Vitaly Wool <vitalywool@gmail.com> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 13 2月, 2015 1 次提交
-
-
由 Rasmus Villemoes 提交于
idr.c doesn't seem to use anything from hardirq.h (or anything included from that). Removing it produces identical objdump -d output, and gives 44 fewer lines in the .idr.o.cmd dependency file. Signed-off-by: NRasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 09 9月, 2014 1 次提交
-
-
由 Masanari Iida 提交于
This patch fix spelling typo found in DocBook/kernel-api.xml. It is because the file is generated from the source comments, I have to fix the comments in source codes. Signed-off-by: NMasanari Iida <standby24x7@gmail.com> Acked-by: NRandy Dunlap <rdunlap@infradead.org> Signed-off-by: NJiri Kosina <jkosina@suse.cz>
-
- 09 8月, 2014 1 次提交
-
-
由 Andrey Ryabinin 提交于
I'm working on address sanitizer project for kernel. Recently we started experiments with stack instrumentation, to detect out-of-bounds read/write bugs on stack. Just after booting I've hit out-of-bounds read on stack in idr_for_each (and in __idr_remove_all as well): struct idr_layer **paa = &pa[0]; while (id >= 0 && id <= max) { ... while (n < fls(id)) { n += IDR_BITS; p = *--paa; <--- here we are reading pa[-1] value. } } Despite the fact that after this dereference we are exiting out of loop and never use p, such behaviour is undefined and should be avoided. Fix this by moving pointer derference to the beggining of the loop, right before we will use it. Signed-off-by: NAndrey Ryabinin <a.ryabinin@samsung.com> Reviewed-by: NLai Jiangshan <laijs@cn.fujitsu.com> Cc: Tejun Heo <tj@kernel.org> Cc: Alexey Preobrazhensky <preobr@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 07 6月, 2014 6 次提交
-
-
由 Lai Jiangshan 提交于
If "idr->hint == p" is true, it also implies "idr->hint" is true(not NULL). Signed-off-by: NLai Jiangshan <laijs@cn.fujitsu.com> Acked-by: NTejun Heo <tj@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Lai Jiangshan 提交于
After idr subsystem is changed to RCU-awared, the free layer will not go to the free list. The free list will not be filled up when idr_remove(). So we don't need to shink it too. Signed-off-by: NLai Jiangshan <laijs@cn.fujitsu.com> Acked-by: NTejun Heo <tj@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Lai Jiangshan 提交于
When the smaller id is not found, idr_replace() returns -ENOENT. But when the id is bigger enough, idr_replace() returns -EINVAL, actually there is no difference between these two kinds of ids. These are all unallocated id, the return values of the idr_replace() for these ids should be the same: -ENOENT. Signed-off-by: NLai Jiangshan <laijs@cn.fujitsu.com> Acked-by: NTejun Heo <tj@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Lai Jiangshan 提交于
If the ida has at least one existing id, and when an unallocated ID which meets a certain condition is passed to the ida_remove(), the system will crash because it hits NULL pointer dereference. The condition is that the unallocated ID shares the same lowest idr layer with the existing ID, but the idr slot would be different if the unallocated ID were to be allocated. In this case the matching idr slot for the unallocated_id is NULL, causing @bitmap to be NULL which the function dereferences without checking crashing the kernel. See the test code: static void test3(void) { int id; DEFINE_IDA(test_ida); printk(KERN_INFO "Start test3\n"); if (ida_pre_get(&test_ida, GFP_KERNEL) < 0) return; if (ida_get_new(&test_ida, &id) < 0) return; ida_remove(&test_ida, 4000); /* bug: null deference here */ printk(KERN_INFO "End of test3\n"); } It happens only when the caller tries to free an unallocated ID which is the caller's fault. It is not a bug. But it is better to add the proper check and complain rather than crashing the kernel. [tj@kernel.org: updated patch description] Signed-off-by: NLai Jiangshan <laijs@cn.fujitsu.com> Acked-by: NTejun Heo <tj@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Lai Jiangshan 提交于
If unallocated_id = (ANY * idr_max(idp->layers) + existing_id) is passed to idr_remove(). The existing_id will be removed unexpectedly. The following test shows this unexpected id-removal: static void test4(void) { int id; DEFINE_IDR(test_idr); printk(KERN_INFO "Start test4\n"); id = idr_alloc(&test_idr, (void *)1, 42, 43, GFP_KERNEL); BUG_ON(id != 42); idr_remove(&test_idr, 42 + IDR_SIZE); TEST_BUG_ON(idr_find(&test_idr, 42) != (void *)1); idr_destroy(&test_idr); printk(KERN_INFO "End of test4\n"); } ida_remove() shares the similar problem. It happens only when the caller tries to free an unallocated ID which is the caller's fault. It is not a bug. But it is better to add the proper check and complain rather than removing an existing_id silently. Signed-off-by: NLai Jiangshan <laijs@cn.fujitsu.com> Acked-by: NTejun Heo <tj@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Lai Jiangshan 提交于
idr_replace() open-codes the logic to calculate the maximum valid ID given the height of the idr tree; unfortunately, the open-coded logic doesn't account for the fact that the top layer may have unused slots and over-shifts the limit to zero when the tree is at its maximum height. The following test code shows it fails to replace the value for id=((1<<27)+42): static void test5(void) { int id; DEFINE_IDR(test_idr); #define TEST5_START ((1<<27)+42) /* use the highest layer */ printk(KERN_INFO "Start test5\n"); id = idr_alloc(&test_idr, (void *)1, TEST5_START, 0, GFP_KERNEL); BUG_ON(id != TEST5_START); TEST_BUG_ON(idr_replace(&test_idr, (void *)2, TEST5_START) != (void *)1); idr_destroy(&test_idr); printk(KERN_INFO "End of test5\n"); } Fix the bug by using idr_max() which correctly takes into account the maximum allowed shift. sub_alloc() shares the same problem and may incorrectly fail with -EAGAIN; however, this bug doesn't affect correct operation because idr_get_empty_slot(), which already uses idr_max(), retries with the increased @id in such cases. [tj@kernel.org: Updated patch description.] Signed-off-by: NLai Jiangshan <laijs@cn.fujitsu.com> Acked-by: NTejun Heo <tj@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 08 4月, 2014 2 次提交
-
-
由 Monam Agarwal 提交于
Replace rcu_assign_pointer(x, NULL) with RCU_INIT_POINTER(x, NULL) The rcu_assign_pointer() ensures that the initialization of a structure is carried out before storing a pointer to that structure. And in the case of the NULL pointer, there is no structure to initialize. So, rcu_assign_pointer(p, NULL) can be safely converted to RCU_INIT_POINTER(p, NULL) Signed-off-by: NMonam Agarwal <monamagarwal123@gmail.com> Acked-by: NTejun Heo <tj@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Stephen Hemminger 提交于
Remove no longer used deprecated code, and make local functions static. Signed-off-by: NStephen Hemminger <stephen@networkplumber.org> Acked-by: NJean Delvare <jdelvare@suse.de> Acked-by: NTejun Heo <tj@kernel.org> Cc: Jeff Layton <jlayton@redhat.com> Cc: Philipp Reisner <philipp.reisner@linbit.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: George Spelvin <linux@horizon.com> Cc: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 17 2月, 2014 1 次提交
-
-
由 Andreas Gruenbacher 提交于
Signed-off-by: NAndreas Gruenbacher <agruen@linbit.com> Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
-
- 04 7月, 2013 1 次提交
-
-
由 Jean Delvare 提交于
We print a dump stack after idr_remove warning. This is useful to find the faulty piece of code. Let's do the same for ida_remove, as it would be equally useful there. [akpm@linux-foundation.org: convert the open-coded printk+dump_stack into WARN()] Signed-off-by: NJean Delvare <jdelvare@suse.de> Cc: Tejun Heo <tj@kernel.org> Cc: Takashi Iwai <tiwai@suse.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 30 4月, 2013 1 次提交
-
-
由 Jeff Layton 提交于
As Tejun points out, there are several users of the IDR facility that attempt to use it in a cyclic fashion. These users are likely to see -ENOSPC errors after the counter wraps one or more times however. This patchset adds a new idr_alloc_cyclic routine and converts several of these users to it. Many of these users are in obscure parts of the kernel, and I don't have a good way to test some of them. The change is pretty straightforward though, so hopefully it won't be an issue. There is one other cyclic user of idr_alloc that I didn't touch in ipc/util.c. That one is doing some strange stuff that I didn't quite understand, but it looks like it should probably be converted later somehow. This patch: Thus spake Tejun Heo: Ooh, BTW, the cyclic allocation is broken. It's prone to -ENOSPC after the first wraparound. There are several cyclic users in the kernel and I think it probably would be best to implement cyclic support in idr. This patch does that by adding new idr_alloc_cyclic function that such users in the kernel can use. With this, there's no need for a caller to keep track of the last value used as that's now tracked internally. This should prevent the ENOSPC problems that can hit when the "last allocated" counter exceeds INT_MAX. Later patches will convert existing cyclic users to the new interface. Signed-off-by: NJeff Layton <jlayton@redhat.com> Reviewed-by: NTejun Heo <tj@kernel.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Eric Paris <eparis@parisplace.org> Cc: Jack Morgenstein <jackm@dev.mellanox.co.il> Cc: John McCutchan <john@johnmccutchan.com> Cc: Neil Horman <nhorman@tuxdriver.com> Cc: Or Gerlitz <ogerlitz@mellanox.com> Cc: Robert Love <rlove@rlove.org> Cc: Roland Dreier <roland@purestorage.com> Cc: Sridhar Samudrala <sri@us.ibm.com> Cc: Steve Wise <swise@opengridcomputing.com> Cc: Tom Tucker <tom@opengridcomputing.com> Cc: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 14 3月, 2013 2 次提交
-
-
由 Tejun Heo 提交于
GFP_NOIO is often used for idr_alloc() inside preloaded section as the allocation mask doesn't really matter. If the idr tree needs to be expanded, idr_alloc() first tries to allocate using the specified allocation mask and if it fails falls back to the preloaded buffer. This order prevent non-preloading idr_alloc() users from taking advantage of preloading ones by using preload buffer without filling it shifting the burden of allocation to the preload users. Unfortunately, this allowed/expected-to-fail kmem_cache allocation ends up generating spurious slab lowmem warning before succeeding the request from the preload buffer. This patch makes idr_layer_alloc() add __GFP_NOWARN to the first kmem_cache attempt and try kmem_cache again w/o __GFP_NOWARN after allocation from preload_buffer fails so that lowmem warning is generated if not suppressed by the original @gfp_mask. Signed-off-by: NTejun Heo <tj@kernel.org> Reported-by: NDavid Teigland <teigland@redhat.com> Tested-by: NDavid Teigland <teigland@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Tejun Heo 提交于
Now that all in-kernel users are converted to ues the new alloc interface, mark the old interface deprecated. We should be able to remove these in a few releases. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 13 3月, 2013 1 次提交
-
-
由 Randy Dunlap 提交于
Fix new kernel-doc warnings in idr: Warning(include/linux/idr.h:113): No description found for parameter 'idr' Warning(include/linux/idr.h:113): Excess function parameter 'idp' description in 'idr_find' Warning(lib/idr.c:232): Excess function parameter 'id' description in 'sub_alloc' Warning(lib/idr.c:232): Excess function parameter 'id' description in 'sub_alloc' Signed-off-by: NRandy Dunlap <rdunlap@infradead.org> Acked-by: NTejun Heo <tj@kernel.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 09 3月, 2013 1 次提交
-
-
由 Tejun Heo 提交于
idr_find(), idr_remove() and idr_replace() used to silently ignore the sign bit and perform lookup with the rest of the bits. The weird behavior has been changed such that negative IDs are treated as invalid. As the behavior change was subtle, WARN_ON_ONCE() was added in the hope of determining who's calling idr functions with negative IDs so that they can be examined for problems. Up until now, all two reported cases are ID number coming directly from userland and getting fed into idr_find() and the warnings seem to cause more problems than being helpful. Drop the WARN_ON_ONCE()s. Signed-off-by: NTejun Heo <tj@kernel.org> Reported-by: <markus@trippelsdorf.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 28 2月, 2013 12 次提交
-
-
由 Tejun Heo 提交于
Until recently, when an negative ID is specified, idr functions used to ignore the sign bit and proceeded with the operation with the rest of bits, which is bizarre and error-prone. The behavior recently got changed so that negative IDs are treated as invalid but we're triggering WARN_ON_ONCE() on negative IDs just in case somebody was depending on the sign bit being ignored, so that those can be detected and fixed easily. We only need this for a while. Explain why WARN_ON_ONCE()s are there and that they can be removed later. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Tejun Heo 提交于
While idr lookup isn't a particularly heavy operation, it still is too substantial to use in hot paths without worrying about the performance implications. With recent changes, each idr_layer covers 256 slots which should be enough to cover most use cases with single idr_layer making lookup hint very attractive. This patch adds idr->hint which points to the idr_layer which allocated an ID most recently and the fast path lookup becomes if (look up target's prefix matches that of the hinted layer) return hint->ary[ID's offset in the leaf layer]; which can be inlined. idr->hint is set to the leaf node on idr_fill_slot() and cleared from free_layer(). [andriy.shevchenko@linux.intel.com: always do slow path when hint is uninitialized] Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Tejun Heo 提交于
Add a field which carries the prefix of ID the idr_layer covers. This will be used to implement lookup hint. This patch doesn't make use of the new field and doesn't introduce any behavior difference. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Tejun Heo 提交于
Currently, idr->bitmap is declared as an unsigned long which restricts the number of bits an idr_layer can contain. All bitops can handle arbitrary positive integer bit number and there's no reason for this restriction. Declare idr_layer->bitmap using DECLARE_BITMAP() instead of a single unsigned long. * idr_layer->bitmap is now an array. '&' dropped from params to bitops. * Replaced "== IDR_FULL" tests with bitmap_full() and removed IDR_FULL. * Replaced find_next_bit() on ~bitmap with find_next_zero_bit(). * Replaced "bitmap = 0" with bitmap_clear(). This patch doesn't (or at least shouldn't) introduce any behavior changes. [akpm@linux-foundation.org: checkpatch fixes] Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Tejun Heo 提交于
MAX_IDR_MASK is another weirdness in the idr interface. As idr covers whole positive integer range, it's defined as 0x7fffffff or INT_MAX. Its usage in idr_find(), idr_replace() and idr_remove() is bizarre. They basically mask off the sign bit and operate on the rest, so if the caller, by accident, passes in a negative number, the sign bit will be masked off and the remaining part will be used as if that was the input, which is worse than crashing. The constant is visible in idr.h and there are several users in the kernel. * drivers/i2c/i2c-core.c:i2c_add_numbered_adapter() Basically used to test if adap->nr is a negative number which isn't -1 and returns -EINVAL if so. idr_alloc() already has negative @start checking (w/ WARN_ON_ONCE), so this can go away. * drivers/infiniband/core/cm.c:cm_alloc_id() drivers/infiniband/hw/mlx4/cm.c:id_map_alloc() Used to wrap cyclic @start. Can be replaced with max(next, 0). Note that this type of cyclic allocation using idr is buggy. These are prone to spurious -ENOSPC failure after the first wraparound. * fs/super.c:get_anon_bdev() The ID allocated from ida is masked off before being tested whether it's inside valid range. ida allocated ID can never be a negative number and the masking is unnecessary. Update idr_*() functions to fail with -EINVAL when negative @id is specified and update other MAX_IDR_MASK users as described above. This leaves MAX_IDR_MASK without any user, remove it and relocate other MAX_IDR_* constants to lib/idr.c. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Jean Delvare <khali@linux-fr.org> Cc: Roland Dreier <roland@kernel.org> Cc: Sean Hefty <sean.hefty@intel.com> Cc: Hal Rosenstock <hal.rosenstock@gmail.com> Cc: "Marciniszyn, Mike" <mike.marciniszyn@intel.com> Cc: Jack Morgenstein <jackm@dev.mellanox.co.il> Cc: Or Gerlitz <ogerlitz@mellanox.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Acked-by: NWolfram Sang <wolfram@the-dreams.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Tejun Heo 提交于
Most functions in idr fail to deal with the high bits when the idr tree grows to the maximum height. * idr_get_empty_slot() stops growing idr tree once the depth reaches MAX_IDR_LEVEL - 1, which is one depth shallower than necessary to cover the whole range. The function doesn't even notice that it didn't grow the tree enough and ends up allocating the wrong ID given sufficiently high @starting_id. For example, on 64 bit, if the starting id is 0x7fffff01, idr_get_empty_slot() will grow the tree 5 layer deep, which only covers the 30 bits and then proceed to allocate as if the bit 30 wasn't specified. It ends up allocating 0x3fffff01 without the bit 30 but still returns 0x7fffff01. * __idr_remove_all() will not remove anything if the tree is fully grown. * idr_find() can't find anything if the tree is fully grown. * idr_for_each() and idr_get_next() can't iterate anything if the tree is fully grown. Fix it by introducing idr_max() which returns the maximum possible ID given the depth of tree and replacing the id limit checks in all affected places. As the idr_layer pointer array pa[] needs to be 1 larger than the maximum depth, enlarge pa[] arrays by one. While this plugs the discovered issues, the whole code base is horrible and in desparate need of rewrite. It's fragile like hell, Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Tejun Heo 提交于
The current idr interface is very cumbersome. * For all allocations, two function calls - idr_pre_get() and idr_get_new*() - should be made. * idr_pre_get() doesn't guarantee that the following idr_get_new*() will not fail from memory shortage. If idr_get_new*() returns -EAGAIN, the caller is expected to retry pre_get and allocation. * idr_get_new*() can't enforce upper limit. Upper limit can only be enforced by allocating and then freeing if above limit. * idr_layer buffer is unnecessarily per-idr. Each idr ends up keeping around MAX_IDR_FREE idr_layers. The memory consumed per idr is under two pages but it makes it difficult to make idr_layer larger. This patch implements the following new set of allocation functions. * idr_preload[_end]() - Similar to radix preload but doesn't fail. The first idr_alloc() inside preload section can be treated as if it were called with @gfp_mask used for idr_preload(). * idr_alloc() - Allocate an ID w/ lower and upper limits. Takes @gfp_flags and can be used w/o preloading. When used inside preloaded section, the allocation mask of preloading can be assumed. If idr_alloc() can be called from a context which allows sufficiently relaxed @gfp_mask, it can be used by itself. If, for example, idr_alloc() is called inside spinlock protected region, preloading can be used like the following. idr_preload(GFP_KERNEL); spin_lock(lock); id = idr_alloc(idr, ptr, start, end, GFP_NOWAIT); spin_unlock(lock); idr_preload_end(); if (id < 0) error; which is much simpler and less error-prone than idr_pre_get and idr_get_new*() loop. The new interface uses per-pcu idr_layer buffer and thus the number of idr's in the system doesn't affect the amount of memory used for preloading. idr_layer_alloc() is introduced to handle idr_layer allocations for both old and new ID allocation paths. This is a bit hairy now but the new interface is expected to replace the old and the internal implementation eventually will become simpler. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Tejun Heo 提交于
Move slot filling to idr_fill_slot() from idr_get_new_above_int() and make idr_get_new_above() directly call it. idr_get_new_above_int() is no longer needed and removed. This will be used to implement a new ID allocation interface. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Tejun Heo 提交于
idr uses -1, IDR_NEED_TO_GROW and IDR_NOMORE_SPACE to communicate exception conditions internally. The return value is later translated to errno values using _idr_rc_to_errno(). This is confusing. Drop the custom ones and consistently use -EAGAIN for "tree needs to grow", -ENOMEM for "need more memory" and -ENOSPC for "ran out of ID space". Due to the weird memory preloading mechanism, [ra]_get_new*() return -EAGAIN on memory shortage, so we need to substitute -ENOMEM w/ -EAGAIN on those interface functions. They'll eventually be cleaned up and the translations will go away. This patch doesn't introduce any functional changes. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Tejun Heo 提交于
* Move idr_for_each_entry() definition next to other idr related definitions. * Make id[r|a]_get_new() inline wrappers of id[r|a]_get_new_above(). This changes the implementation of idr_get_new() but the new implementation is trivial. This patch doesn't introduce any functional change. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Tejun Heo 提交于
There was only one legitimate use of idr_remove_all() and a lot more of incorrect uses (or lack of it). Now that idr_destroy() implies idr_remove_all() and all the in-kernel users updated not to use it, there's no reason to keep it around. Mark it deprecated so that we can later unexport it. idr_remove_all() is made an inline function calling __idr_remove_all() to avoid triggering deprecated warning on EXPORT_SYMBOL(). Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Tejun Heo 提交于
idr is silly in quite a few ways, one of which is how it's supposed to be destroyed - idr_destroy() doesn't release IDs and doesn't even whine if the idr isn't empty. If the caller forgets idr_remove_all(), it simply leaks memory. Even ida gets this wrong and leaks memory on destruction. There is absoltely no reason not to call idr_remove_all() from idr_destroy(). Nobody is abusing idr_destroy() for shrinking free layer buffer and continues to use idr after idr_destroy(), so it's safe to do remove_all from destroy. In the whole kernel, there is only one place where idr_remove_all() is legitimiately used without following idr_destroy() while there are quite a few places where the caller forgets either idr_remove_all() or idr_destroy() leaking memory. This patch makes idr_destroy() call idr_destroy_all() and updates the function description accordingly. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-