- 08 4月, 2012 2 次提交
-
-
由 Amos Kong 提交于
kvm_io_bus devices are used for ioevent, pit, pic, ioapic, coalesced_mmio. Currently Qemu only emulates one PCI bus, it contains 32 slots, one slot contains 8 functions, maximum of supported PCI devices: 1 * 32 * 8 = 256. One virtio-blk takes one iobus device, one virtio-net(vhost=on) takes two iobus devices. The maximum of coalesced mmio zone is 100, each zone has an iobus devices. So 300 io_bus devices are not enough. Set an upper bounds for kvm_io_range to limit userspace. 1000 is a very large limit and not bloat the typical user. Signed-off-by: NAmos Kong <akong@redhat.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Amos Kong 提交于
This patch makes the kvm_io_range array can be resized dynamically. Signed-off-by: NAmos Kong <akong@redhat.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
- 06 4月, 2012 6 次提交
-
-
由 Ben Hutchings 提交于
Commit e52ac339 ('net: Use device model to get driver name in skb_gso_segment()') removed the only in-tree caller of ethtool ops that doesn't hold the RTNL lock. Signed-off-by: NBen Hutchings <bhutchings@solarflare.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Chris Ball 提交于
This reverts commit c16e981b2fd9455af670a69a84f4c8cf07e12658, because it's no longer useful once MSI support is reverted. Signed-off-by: NChris Ball <cjb@laptop.org>
-
由 Manuel Lauss 提交于
MSI on my O2Micro OZ600 SD card reader is broken. This patch adds a quirk to disable MSI on these controllers. Signed-off-by: NManuel Lauss <manuel.lauss@googlemail.com> Signed-off-by: NChris Ball <cjb@laptop.org>
-
由 Eric Dumazet 提交于
commit 2f533844 (tcp: allow splice() to build full TSO packets) added a regression for splice() calls using SPLICE_F_MORE. We need to call tcp_flush() at the end of the last page processed in tcp_sendpages(), or else transmits can be deferred and future sends stall. Add a new internal flag, MSG_SENDPAGE_NOTLAST, acting like MSG_MORE, but with different semantic. For all sendpage() providers, its a transparent change. Only sock_sendpage() and tcp_sendpages() can differentiate the two different flags provided by pipe_to_sendpage() Reported-by: NTom Herbert <therbert@google.com> Cc: Nandita Dukkipati <nanditad@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Tom Herbert <therbert@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: H.K. Jerry Chu <hkchu@google.com> Cc: Maciej Żenczykowski <maze@google.com> Cc: Mahesh Bandewar <maheshb@google.com> Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: NEric Dumazet <eric.dumazet@gmail>com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Michal Hocko 提交于
Although mem_cgroup_uncharge_swap has an empty placeholder for !CONFIG_CGROUP_MEM_RES_CTLR_SWAP the definition is placed in the CONFIG_SWAP ifdef block so we are missing the same definition for !CONFIG_SWAP which implies !CONFIG_CGROUP_MEM_RES_CTLR_SWAP. This has not been an issue before, because mem_cgroup_uncharge_swap was not called from !CONFIG_SWAP context. But Hugh Dickins has a cleanup patch to call __mem_cgroup_commit_charge_swapin which is defined also for !CONFIG_SWAP. Let's move both the empty definition and declaration outside of the CONFIG_SWAP block to avoid the following compilation error: mm/memcontrol.c: In function '__mem_cgroup_commit_charge_swapin': mm/memcontrol.c:2837: error: implicit declaration of function 'mem_cgroup_uncharge_swap' if CONFIG_SWAP is disabled. Reported-by: NDavid Rientjes <rientjes@google.com> Signed-off-by: NMichal Hocko <mhocko@suse.cz> Cc: Hugh Dickins <hughd@google.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Stephen Boyd 提交于
debugfs and a few other drivers use an open-coded version of simple_open() to pass a pointer from the file to the read/write file ops. Add support for this simple case to libfs so that we can remove the many duplicate copies of this simple function. Signed-off-by: NStephen Boyd <sboyd@codeaurora.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 04 4月, 2012 4 次提交
-
-
由 Eric Dumazet 提交于
Commit f04565dd (dev: use name hash for dev_seq_ops) added a second regression, as some devices are missing from /proc/net/dev if many devices are defined. When seq_file buffer is filled, the last ->next/show() method is canceled (pos value is reverted to value prior ->next() call) Problem is after above commit, we dont restart the lookup at right position in ->start() method. Fix this by removing the internal 'pos' pointer added in commit, since we need to use the 'loff_t *pos' provided by seq_file layer. This also reverts commit 5cac98dd (net: Fix corruption in /proc/*/net/dev_mcast), since its not needed anymore. Reported-by: NBen Greear <greearb@candelatech.com> Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Cc: Mihai Maruseac <mmaruseac@ixiacom.com> Tested-by: NBen Greear <greearb@candelatech.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Linus Torvalds 提交于
It just bloats the audit data structure for no good reason, since the only time those fields are filled are just before calling the common_lsm_audit() function, which is also the only user of those fields. So just make them be the arguments to common_lsm_audit(), rather than bloating that structure that is passed around everywhere, and is initialized in hot paths. Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Eric Paris 提交于
After shrinking the common_audit_data stack usage for private LSM data I'm not going to shrink the data union. To do this I'm going to move anything larger than 2 void * ptrs to it's own structure and require it to be declared separately on the calling stack. Thus hot paths which don't need more than a couple pointer don't have to declare space to hold large unneeded structures. I could get this down to one void * by dealing with the key struct and the struct path. We'll see if that is helpful after taking care of networking. Signed-off-by: NEric Paris <eparis@redhat.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Eric Paris 提交于
Linus found that the gigantic size of the common audit data caused a big perf hit on something as simple as running stat() in a loop. This patch requires LSMs to declare the LSM specific portion separately rather than doing it in a union. Thus each LSM can be responsible for shrinking their portion and don't have to pay a penalty just because other LSMs have a bigger space requirement. Signed-off-by: NEric Paris <eparis@redhat.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 03 4月, 2012 3 次提交
-
-
由 Paul Gortmaker 提交于
Builds of the openrisc or1ksim_defconfig show the following: In file included from arch/openrisc/include/generated/asm/cmpxchg.h:1:0, from include/asm-generic/atomic.h:18, from arch/openrisc/include/generated/asm/atomic.h:1, from include/linux/atomic.h:4, from include/linux/dcache.h:4, from fs/notify/fsnotify.c:19: include/asm-generic/cmpxchg.h: In function '__xchg': include/asm-generic/cmpxchg.h:34:20: error: expected ')' before 'u8' include/asm-generic/cmpxchg.h:34:20: warning: type defaults to 'int' in type name and many more lines of similar errors. It seems specific to the or32 because most other platforms have an arch specific component that would have already included types.h ahead of time, but the o32 does not. Cc: Arnd Bergmann <arnd@arndb.de> Cc: Jonas Bonn <jonas@southpole.se> Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com> Acked-by: David Howells <dhowells@redhat.com Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Paul Gortmaker 提交于
Commit 313162d0 ("device.h: audit and cleanup users in main include dir") exchanged an include <linux/device.h> for a struct *device but in actuality I misread this file when creating 313162d0 and it should have remained an include. There were no build regressions since all consumers were already getting device.h anyway, but make it right regardless. Reported-by: NStefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Paul Gortmaker 提交于
Commit bf4289cb ("ATMEL: fix nand ecc support") indicated that it wanted to "Move platform data to a common header include/linux/platform_data/atmel_nand.h" and the new header even had re-include protectors with: #ifndef __ATMEL_NAND_H__ However, the file that was added was simply called atmel.h and this caused avr32 defconfig to fail with: In file included from arch/avr32/boards/atstk1000/setup.c:22: arch/avr32/mach-at32ap/include/mach/board.h:10:44: error: linux/platform_data/atmel_nand.h: No such file or directory In file included from arch/avr32/boards/atstk1000/setup.c:22: arch/avr32/mach-at32ap/include/mach/board.h:121: warning: 'struct atmel_nand_data' declared inside parameter list arch/avr32/mach-at32ap/include/mach/board.h:121: warning: its scope is only this definition or declaration, which is probably not what you want make[2]: *** [arch/avr32/boards/atstk1000/setup.o] Error 1 It seems the scope of the file contents will expand beyond just nand, so ignore the original intention, and fix up the users who reference the bad name with the _nand suffix. CC: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com> CC: David Woodhouse <dwmw2@infradead.org> Acked-by: NHans-Christian Egtvedt <egtvedt@samfundet.no> Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 02 4月, 2012 2 次提交
-
-
由 Daniel Vetter 提交于
Totally unexpected that this regressed. Luckily it sounds like we just need to have dmar disable on the igfx, not the entire system. At least that's what a few days of testing between Tony Vroon and me indicates. Reported-by: NTony Vroon <tony@linx.net> Cc: Tony Vroon <tony@linx.net> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=43024Acked-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Pavel Shilovsky 提交于
We can deadlock if we have a write oplock and two processes use the same file handle. In this case the first process can't unlock its lock if the second process blocked on the lock in the same time. Fix it by using posix_lock_file rather than posix_lock_file_wait under cinode->lock_mutex. If we request a blocking lock and posix_lock_file indicates that there is another lock that prevents us, wait untill that lock is released and restart our call. Cc: stable@kernel.org Acked-by: NJeff Layton <jlayton@redhat.com> Signed-off-by: NPavel Shilovsky <piastry@etersoft.ru> Signed-off-by: NSteve French <sfrench@us.ibm.com>
-
- 01 4月, 2012 4 次提交
-
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
... implemented that way since the next commit will leave it almost alone in ext2_fs.h - most of the file (including struct ext2_super_block) is going to move to fs/ext2/ext2.h. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Thierry Reding 提交于
Since the on-disk format has been stable for quite some time, users should either use the headers provided by libext2fs or keep a private copy of this header. For the full discussion, see this thread: https://lkml.org/lkml/2012/3/21/516 While at it, this commit removes all __KERNEL__ guards, which are now unnecessary. Signed-off-by: NThierry Reding <thierry.reding@avionic-design.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Jan Kara <jack@suse.cz> Cc: Ted Ts'o <tytso@mit.edu> Cc: Artem Bityutskiy <dedekind1@gmail.com> Cc: Andreas Dilger <aedilger@gmail.com> Cc: linux-ext4@vger.kernel.org
-
- 31 3月, 2012 2 次提交
-
-
由 Oleg Nesterov 提交于
1. TRACE_EVENT(sched_process_exec) forgets to actually use the old pid argument, it sets ->old_pid = p->pid. 2. search_binary_handler() uses the wrong pid number. tracepoint needs the global pid_t from the root namespace, while old_pid is the virtual pid number as it seen by the tracer/parent. With this patch we have two pid_t's in search_binary_handler(), not really nice. Perhaps we should switch to "struct pid*", but in this case it would be better to cleanup the current code first and move the "depth == 0" code outside. Signed-off-by: NOleg Nesterov <oleg@redhat.com> Cc: David Smith <dsmith@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Denys Vlasenko <dvlasenk@redhat.com> Link: http://lkml.kernel.org/r/20120330162636.GA4857@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Amit Shah 提交于
The thaw operation was used by the balloon driver, but after the last commit there's no reason to have separate thaw and restore callbacks. Signed-off-by: NAmit Shah <amit.shah@redhat.com>
-
- 30 3月, 2012 11 次提交
-
-
由 Dave Airlie 提交于
This adds the basic drm dma-buf interface layer, called PRIME. This commit doesn't add any driver support, it is simply and agreed upon starting point so we can work towards merging driver support for the next merge window. Current drivers with work done are nouveau, i915, udl, exynos and omap. The main APIs exposed to userspace allow translating a 32-bit object handle to a file descriptor, and a file descriptor to a 32-bit object handle. The flags value is currently limited to O_CLOEXEC. Acknowledgements: Daniel Vetter: lots of review Rob Clark: cleaned up lots of the internals and did lifetime review. v2: rename some functions after Chris preferred a green shed fix IS_ERR_OR_NULL -> IS_ERR v3: Fix Ville pointed out using buffer + kmalloc v4: add locking as per ickle review v5: allow re-exporting the original dma-buf (Daniel) Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: NRob Clark <rob.clark@linaro.org> Reviewed-by: NSumit Semwal <sumit.semwal@linaro.org> Reviewed-by: NInki Dae <inki.dae@samsung.com> Acked-by: NBen Widawsky <benjamin.widawsky@intel.com> Signed-off-by: NDave Airlie <airlied@redhat.com>
-
由 Boris Ostrovsky 提交于
power_usage is always assigned a negative value and should be declared a signed integer Signed-off-by: NBoris Ostrovsky <boris.ostrovsky@amd.com> Signed-off-by: NLen Brown <len.brown@intel.com>
-
由 Boris Ostrovsky 提交于
Currently when a CPU is off-lined it enters either MWAIT-based idle or, if MWAIT is not desired or supported, HLT-based idle (which places the processor in C1 state). This patch allows processors without MWAIT support to stay in states deeper than C1. Signed-off-by: NBoris Ostrovsky <boris.ostrovsky@amd.com> Signed-off-by: NLen Brown <len.brown@intel.com>
-
由 Lin Ming 提交于
acpi_dev_run_wake() is a generic function which can be used by other subsystem too. Rename it to acpi_pm_device_run_wake, to be consistent with acpi_pm_device_sleep_wake. Then move it to ACPI core. Acked-by: NRafael J. Wysocki <rjw@sisk.pl> Signed-off-by: NLin Ming <ming.m.lin@intel.com> Signed-off-by: NLen Brown <len.brown@intel.com>
-
由 Daniel Lezcano 提交于
As far as I can see, this field is never used in the code. Signed-off-by: NDaniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: NLen Brown <len.brown@intel.com>
-
由 Daniel Lezcano 提交于
All the modules name are ro-data, it is never copied to the array. eg. static struct cpuidle_driver intel_idle_driver = { .name = "intel_idle", .owner = THIS_MODULE, }; It safe to assign the pointer of this ro-data to a const char *. By this way we save 12 bytes. Signed-off-by: NDaniel Lezcano <daniel.lezcano@linaro.org> Acked-by: NDeepthi Dharwar <deepthi@linux.vnet.ibm.com> Tested-by: NDeepthi Dharwar <deepthi@linux.vnet.ibm.com> Signed-off-by: NLen Brown <len.brown@intel.com>
-
由 ShuoX Liu 提交于
Some C states of new CPU might be not good. One reason is BIOS might configure them incorrectly. To help developers root cause it quickly, the patch adds a new sysfs entry, so developers could disable specific C state manually. In addition, C state might have much impact on performance tuning, as it takes much time to enter/exit C states, which might delay interrupt processing. With the new debug option, developers could check if a deep C state could impact performance and how much impact it could cause. Also add this option in Documentation/cpuidle/sysfs.txt. [akpm@linux-foundation.org: check kstrtol return value] Signed-off-by: NShuoX Liu <shuox.liu@intel.com> Reviewed-by: NYanmin Zhang <yanmin_zhang@intel.com> Reviewed-and-Tested-by: NDeepthi Dharwar <deepthi@linux.vnet.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLen Brown <len.brown@intel.com>
-
由 Lin Ming 提交于
Devices may share same list of power resources in _PR0, for example Device(Dev0) { Name (_PR0, Package (0x01) { P0PR, P1PR }) } Device(Dev1) { Name (_PR0, Package (0x01) { P0PR, P1PR } } Assume Dev0 and Dev1 were runtime suspended. Then Dev0 is resumed first and it goes into D0 state. But Dev1 is left in D0_Uninitialised state. This is wrong. In this case, Dev1 must be resumed too. In order to hand this case, each power resource maintains a list of devices which relies on it. When power resource is ON, it will check if the devices on its list can be resumed. The device can only be resumed when all the power resouces of its _PR0 are ON. Signed-off-by: NLin Ming <ming.m.lin@intel.com> Signed-off-by: NLen Brown <len.brown@intel.com>
-
由 Bob Moore 提交于
Version 20120320. Signed-off-by: NBob Moore <robert.moore@intel.com> Signed-off-by: NLin Ming <ming.m.lin@intel.com> Signed-off-by: NLen Brown <len.brown@intel.com>
-
由 Jason Wessel 提交于
There has long been a limitation using software breakpoints with a kernel compiled with CONFIG_DEBUG_RODATA going back to 2.6.26. For this particular patch, it will apply cleanly and has been tested all the way back to 2.6.36. The kprobes code uses the text_poke() function which accommodates writing a breakpoint into a read-only page. The x86 kgdb code can solve the problem similarly by overriding the default breakpoint set/remove routines and using text_poke() directly. The x86 kgdb code will first attempt to use the traditional probe_kernel_write(), and next try using a the text_poke() function. The break point install method is tracked such that the correct break point removal routine will get called later on. Cc: x86@kernel.org Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: stable@vger.kernel.org # >= 2.6.36 Inspried-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: NJason Wessel <jason.wessel@windriver.com>
-
由 Jason Wessel 提交于
There is extra state information that needs to be exposed in the kgdb_bpt structure for tracking how a breakpoint was installed. The debug_core only uses the the probe_kernel_write() to install breakpoints, but this is not enough for all the archs. Some arch such as x86 need to use text_poke() in order to install a breakpoint into a read only page. Passing the kgdb_bpt structure to kgdb_arch_set_breakpoint() and kgdb_arch_remove_breakpoint() allows other archs to set the type variable which indicates how the breakpoint was installed. Cc: stable@vger.kernel.org # >= 2.6.36 Signed-off-by: NJason Wessel <jason.wessel@windriver.com>
-
- 29 3月, 2012 6 次提交
-
-
由 Steffen Klassert 提交于
The default netlink message size limit might be exceeded when dumping a lot of algorithms to userspace. As a result, not all of the instantiated algorithms dumped to userspace. So calculate an upper bound on the message size and call netlink_dump_start() with that value. Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
由 Steffen Klassert 提交于
We lookup algorithms with crypto_alg_mod_lookup() when instantiating via crypto_add_alg(). However, algorithms that are wrapped by an IV genearator (e.g. aead or genicv type algorithms) need special care. The userspace process hangs until it gets a timeout when we use crypto_alg_mod_lookup() to lookup these algorithms. So export the lookup functions for these algorithms and use them in crypto_add_alg(). Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
由 Axel Lin 提交于
Signed-off-by: NAxel Lin <axel.lin@gmail.com> Signed-off-by: NMark Brown <broonie@opensource.wolfsonmicro.com>
-
由 Rusty Russell 提交于
These are obsolete: cpu_*_mask provides (const) pointers. Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
-
由 Konstantin Khlebnikov 提交于
A series of radix tree cleanups, and usage of them in the core pagecache code. Micro-benchmark: lookup 14 slots (typical page-vector size) in radix-tree there earch <step> slot filled and tagged before/after - nsec per full scan through tree * Intel Sandy Bridge i7-2620M 4Mb L3 New code always faster * AMD Athlon 6000+ 2x1Mb L2, without L3 New code generally faster, Minor degradation (marked with "*") for huge sparse trees * i386 on Sandy Bridge New code faster for common cases: tagged and dense trees. Some degradations for non-tagged lookup on sparse trees. Ideally, there might help __ffs() analog for searching first non-zero long element in array, gcc sometimes cannot optimize this loop corretly. Numbers: CPU: Intel Sandy Bridge i7-2620M 4Mb L3 radix-tree with 1024 slots: tagged lookup step 1 before 7156 after 3613 step 2 before 5399 after 2696 step 3 before 4779 after 1928 step 4 before 4456 after 1429 step 5 before 4292 after 1213 step 6 before 4183 after 1052 step 7 before 4157 after 951 step 8 before 4016 after 812 step 9 before 3952 after 851 step 10 before 3937 after 732 step 11 before 4023 after 709 step 12 before 3872 after 657 step 13 before 3892 after 633 step 14 before 3720 after 591 step 15 before 3879 after 578 step 16 before 3561 after 513 normal lookup step 1 before 4266 after 3301 step 2 before 2695 after 2129 step 3 before 2083 after 1712 step 4 before 1801 after 1534 step 5 before 1628 after 1313 step 6 before 1551 after 1263 step 7 before 1475 after 1185 step 8 before 1432 after 1167 step 9 before 1373 after 1092 step 10 before 1339 after 1134 step 11 before 1292 after 1056 step 12 before 1319 after 1030 step 13 before 1276 after 1004 step 14 before 1256 after 987 step 15 before 1228 after 992 step 16 before 1247 after 999 radix-tree with 1024*1024*128 slots: tagged lookup step 1 before 1086102841 after 674196409 step 2 before 816839155 after 498138306 step 7 before 599728907 after 240676762 step 15 before 555729253 after 185219677 step 63 before 606637748 after 128585664 step 64 before 608384432 after 102945089 step 65 before 596987114 after 123996019 step 128 before 304459225 after 56783056 step 256 before 158846855 after 31232481 step 512 before 86085652 after 18950595 step 12345 before 6517189 after 1674057 normal lookup step 1 before 626064869 after 544418266 step 2 before 418809975 after 336321473 step 7 before 242303598 after 207755560 step 15 before 208380563 after 176496355 step 63 before 186854206 after 167283638 step 64 before 176188060 after 170143976 step 65 before 185139608 after 167487116 step 128 before 88181865 after 86913490 step 256 before 45733628 after 45143534 step 512 before 24506038 after 23859036 step 12345 before 2177425 after 2018662 * AMD Athlon 6000+ 2x1Mb L2, without L3 radix-tree with 1024 slots: tag-lookup step 1 before 8164 after 5379 step 2 before 5818 after 5581 step 3 before 4959 after 4213 step 4 before 4371 after 3386 step 5 before 4204 after 2997 step 6 before 4950 after 2744 step 7 before 4598 after 2480 step 8 before 4251 after 2288 step 9 before 4262 after 2243 step 10 before 4175 after 2131 step 11 before 3999 after 2024 step 12 before 3979 after 1994 step 13 before 3842 after 1929 step 14 before 3750 after 1810 step 15 before 3735 after 1810 step 16 before 3532 after 1660 normal-lookup step 1 before 7875 after 5847 step 2 before 4808 after 4071 step 3 before 4073 after 3462 step 4 before 3677 after 3074 step 5 before 4308 after 2978 step 6 before 3911 after 3807 step 7 before 3635 after 3522 step 8 before 3313 after 3202 step 9 before 3280 after 3257 step 10 before 3166 after 3083 step 11 before 3066 after 3026 step 12 before 2985 after 2982 step 13 before 2925 after 2924 step 14 before 2834 after 2808 step 15 before 2805 after 2803 step 16 before 2647 after 2622 radix-tree with 1024*1024*128 slots: tag-lookup step 1 before 1288059720 after 951736580 step 2 before 961292300 after 884212140 step 7 before 768905140 after 547267580 step 15 before 771319480 after 456550640 step 63 before 504847640 after 242704304 step 64 before 392484800 after 177920786 step 65 before 491162160 after 246895264 step 128 before 208084064 after 97348392 step 256 before 112401035 after 51408126 step 512 before 75825834 after 29145070 step 12345 before 5603166 after 2847330 normal-lookup step 1 before 1025677120 after 861375100 step 2 before 647220080 after 572258540 step 7 before 505518960 after 484041813 step 15 before 430483053 after 444815320 * step 63 before 388113453 after 404250546 * step 64 before 374154666 after 396027440 * step 65 before 381423973 after 396704853 * step 128 before 190078700 after 202619384 * step 256 before 100886756 after 102829108 * step 512 before 64074505 after 56158720 step 12345 before 4237289 after 4422299 * * i686 on Sandy bridge radix-tree with 1024 slots: tagged lookup step 1 before 7990 after 4019 step 2 before 5698 after 2897 step 3 before 5013 after 2475 step 4 before 4630 after 1721 step 5 before 4346 after 1759 step 6 before 4299 after 1556 step 7 before 4098 after 1513 step 8 before 4115 after 1222 step 9 before 3983 after 1390 step 10 before 4077 after 1207 step 11 before 3921 after 1231 step 12 before 3894 after 1116 step 13 before 3840 after 1147 step 14 before 3799 after 1090 step 15 before 3797 after 1059 step 16 before 3783 after 745 normal lookup step 1 before 5103 after 3499 step 2 before 3299 after 2550 step 3 before 2489 after 2370 step 4 before 2034 after 2302 * step 5 before 1846 after 2268 * step 6 before 1752 after 2249 * step 7 before 1679 after 2164 * step 8 before 1627 after 2153 * step 9 before 1542 after 2095 * step 10 before 1479 after 2109 * step 11 before 1469 after 2009 * step 12 before 1445 after 2039 * step 13 before 1411 after 2013 * step 14 before 1374 after 2046 * step 15 before 1340 after 1975 * step 16 before 1331 after 2000 * radix-tree with 1024*1024*128 slots: tagged lookup step 1 before 1225865377 after 667153553 step 2 before 842427423 after 471533007 step 7 before 609296153 after 276260116 step 15 before 544232060 after 226859105 step 63 before 519209199 after 141343043 step 64 before 588980279 after 141951339 step 65 before 521099710 after 138282060 step 128 before 298476778 after 83390628 step 256 before 149358342 after 43602609 step 512 before 76994713 after 22911077 step 12345 before 53286669 after 1472111 normal lookup step 1 before 819284564 after 533635310 step 2 before 512421605 after 364956155 step 7 before 271443305 after 305721345 * step 15 before 223591630 after 273960216 * step 63 before 190320247 after 217770207 * step 64 before 178538168 after 267411372 * step 65 before 186400423 after 215347937 * step 128 before 88106045 after 140540612 * step 256 before 44812420 after 70660377 * step 512 before 24435438 after 36328275 * step 12345 before 2123924 after 2148062 * bloat-o-meter delta for this patchset + patchset with related shmem cleanups bloat-o-meter: x86_64 add/remove: 4/3 grow/shrink: 5/6 up/down: 928/-939 (-11) function old new delta radix_tree_next_chunk - 499 +499 shmem_unuse 428 554 +126 shmem_radix_tree_replace 131 227 +96 find_get_pages_tag 354 419 +65 find_get_pages_contig 345 407 +62 find_get_pages 362 396 +34 __kstrtab_radix_tree_next_chunk - 22 +22 __ksymtab_radix_tree_next_chunk - 16 +16 __kcrctab_radix_tree_next_chunk - 8 +8 radix_tree_gang_lookup_slot 204 203 -1 static.shmem_xattr_set 384 381 -3 radix_tree_gang_lookup_tag_slot 208 191 -17 radix_tree_gang_lookup 231 187 -44 radix_tree_gang_lookup_tag 247 199 -48 shmem_unlock_mapping 278 190 -88 __lookup 217 - -217 __lookup_tag 242 - -242 radix_tree_locate_item 279 - -279 bloat-o-meter: i386 add/remove: 3/3 grow/shrink: 8/9 up/down: 1075/-1275 (-200) function old new delta radix_tree_next_chunk - 757 +757 shmem_unuse 352 449 +97 find_get_pages_contig 269 322 +53 shmem_radix_tree_replace 113 154 +41 find_get_pages_tag 277 318 +41 dcache_dir_lseek 426 458 +32 __kstrtab_radix_tree_next_chunk - 22 +22 vc_do_resize 968 977 +9 snd_pcm_lib_read1 725 733 +8 __ksymtab_radix_tree_next_chunk - 8 +8 netlbl_cipsov4_list 1120 1127 +7 find_get_pages 293 291 -2 new_slab 467 459 -8 bitfill_unaligned_rev 425 417 -8 radix_tree_gang_lookup_tag_slot 177 146 -31 blk_dump_cmd 267 229 -38 radix_tree_gang_lookup_slot 212 134 -78 shmem_unlock_mapping 221 128 -93 radix_tree_gang_lookup_tag 275 162 -113 radix_tree_gang_lookup 255 126 -129 __lookup 227 - -227 __lookup_tag 271 - -271 radix_tree_locate_item 277 - -277 This patch: Implement a clean, simple and effective radix-tree iteration routine. Iterating divided into two phases: * lookup next chunk in radix-tree leaf node * iterating through slots in this chunk Main iterator function radix_tree_next_chunk() returns pointer to first slot, and stores in the struct radix_tree_iter index of next-to-last slot. For tagged-iterating it also constuct bitmask of tags for retunted chunk. All additional logic implemented as static-inline functions and macroses. Also adds radix_tree_find_next_bit() static-inline variant of find_next_bit() optimized for small constant size arrays, because find_next_bit() too heavy for searching in an array with one/two long elements. [akpm@linux-foundation.org: rework comments a bit] Signed-off-by: NKonstantin Khlebnikov <khlebnikov@openvz.org> Tested-by: NHugh Dickins <hughd@google.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Daniel Lezcano 提交于
In the case of a child pid namespace, rebooting the system does not really makes sense. When the pid namespace is used in conjunction with the other namespaces in order to create a linux container, the reboot syscall leads to some problems. A container can reboot the host. That can be fixed by dropping the sys_reboot capability but we are unable to correctly to poweroff/ halt/reboot a container and the container stays stuck at the shutdown time with the container's init process waiting indefinitively. After several attempts, no solution from userspace was found to reliabily handle the shutdown from a container. This patch propose to make the init process of the child pid namespace to exit with a signal status set to : SIGINT if the child pid namespace called "halt/poweroff" and SIGHUP if the child pid namespace called "reboot". When the reboot syscall is called and we are not in the initial pid namespace, we kill the pid namespace for "HALT", "POWEROFF", "RESTART", and "RESTART2". Otherwise we return EINVAL. Returning EINVAL is also an easy way to check if this feature is supported by the kernel when invoking another 'reboot' option like CAD. By this way the parent process of the child pid namespace knows if it rebooted or not and can take the right decision. Test case: ========== #include <alloca.h> #include <stdio.h> #include <sched.h> #include <unistd.h> #include <signal.h> #include <sys/reboot.h> #include <sys/types.h> #include <sys/wait.h> #include <linux/reboot.h> static int do_reboot(void *arg) { int *cmd = arg; if (reboot(*cmd)) printf("failed to reboot(%d): %m\n", *cmd); } int test_reboot(int cmd, int sig) { long stack_size = 4096; void *stack = alloca(stack_size) + stack_size; int status; pid_t ret; ret = clone(do_reboot, stack, CLONE_NEWPID | SIGCHLD, &cmd); if (ret < 0) { printf("failed to clone: %m\n"); return -1; } if (wait(&status) < 0) { printf("unexpected wait error: %m\n"); return -1; } if (!WIFSIGNALED(status)) { printf("child process exited but was not signaled\n"); return -1; } if (WTERMSIG(status) != sig) { printf("signal termination is not the one expected\n"); return -1; } return 0; } int main(int argc, char *argv[]) { int status; status = test_reboot(LINUX_REBOOT_CMD_RESTART, SIGHUP); if (status < 0) return 1; printf("reboot(LINUX_REBOOT_CMD_RESTART) succeed\n"); status = test_reboot(LINUX_REBOOT_CMD_RESTART2, SIGHUP); if (status < 0) return 1; printf("reboot(LINUX_REBOOT_CMD_RESTART2) succeed\n"); status = test_reboot(LINUX_REBOOT_CMD_HALT, SIGINT); if (status < 0) return 1; printf("reboot(LINUX_REBOOT_CMD_HALT) succeed\n"); status = test_reboot(LINUX_REBOOT_CMD_POWER_OFF, SIGINT); if (status < 0) return 1; printf("reboot(LINUX_REBOOT_CMD_POWERR_OFF) succeed\n"); status = test_reboot(LINUX_REBOOT_CMD_CAD_ON, -1); if (status >= 0) { printf("reboot(LINUX_REBOOT_CMD_CAD_ON) should have failed\n"); return 1; } printf("reboot(LINUX_REBOOT_CMD_CAD_ON) has failed as expected\n"); return 0; } [akpm@linux-foundation.org: tweak and add comments] [akpm@linux-foundation.org: checkpatch fixes] Signed-off-by: NDaniel Lezcano <daniel.lezcano@free.fr> Acked-by: NSerge Hallyn <serge.hallyn@canonical.com> Tested-by: NSerge Hallyn <serge.hallyn@canonical.com> Reviewed-by: NOleg Nesterov <oleg@redhat.com> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-