- 16 4月, 2015 2 次提交
-
-
由 Zhang Zhen 提交于
All callers of zone_movable_is_highmem are under #ifdef CONFIG_HIGHMEM, so the else branch return 0 is not needed. Signed-off-by: NZhang Zhen <zhenzhang.zhang@huawei.com> Acked-by: NDavid Rientjes <rientjes@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Zhang Zhen 提交于
vfs_readdir() was replaced by iterate_dir() in commit 5c0ba4e0 ("[readdir] introduce iterate_dir() and dir_context"). Signed-off-by: NZhang Zhen <zhenzhang.zhang@huawei.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 15 4月, 2015 25 次提交
-
-
由 Vladimir Murzin 提交于
Since memtest might be used by other architectures pass input parameters as phys_addr_t instead of long to prevent overflow. Signed-off-by: NVladimir Murzin <vladimir.murzin@arm.com> Acked-by: NWill Deacon <will.deacon@arm.com> Tested-by: NMark Rutland <mark.rutland@arm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Vladimir Murzin 提交于
Memtest is a simple feature which fills the memory with a given set of patterns and validates memory contents, if bad memory regions is detected it reserves them via memblock API. Since memblock API is widely used by other architectures this feature can be enabled outside of x86 world. This patch set promotes memtest to live under generic mm umbrella and enables memtest feature for arm/arm64. It was reported that this patch set was useful for tracking down an issue with some errant DMA on an arm64 platform. This patch (of 6): There is nothing platform dependent in the core memtest code, so other platforms might benefit from this feature too. [linux@roeck-us.net: MEMTEST depends on MEMBLOCK] Signed-off-by: NVladimir Murzin <vladimir.murzin@arm.com> Acked-by: NWill Deacon <will.deacon@arm.com> Tested-by: NMark Rutland <mark.rutland@arm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 David Rientjes 提交于
Allocating a large number of elements in atomic context could quickly deplete memory reserves, so just disallow atomic resizing entirely. Nothing currently uses mempool_resize() with anything other than GFP_KERNEL, so convert existing callers to drop the gfp_mask. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: NDavid Rientjes <rientjes@google.com> Acked-by: Steffen Maier <maier@linux.vnet.ibm.com> [zfcp] Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Steve French <sfrench@samba.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
If kernel panics due to oom, caused by a cgroup reaching its limit, when 'compulsory panic_on_oom' is enabled, then we will only see that the OOM happened because of "compulsory panic_on_oom is enabled" but this doesn't tell the difference between mempolicy and memcg. And dumping system wide information is plain wrong and more confusing. This patch provides the information of the cgroup whose limit triggerred panic Signed-off-by: NBalasubramani Vivekanandan <balasubramani_vivekanandan@mentor.com> Acked-by: NMichal Hocko <mhocko@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Mel Gorman 提交于
This code is dead since commit 9e645ab6 ("sched/numa: Continue PTE scanning even if migrate rate limited") so remove it. Signed-off-by: NMel Gorman <mgorman@suse.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Kees Cook 提交于
The arch_randomize_brk() function is used on several architectures, even those that don't support ET_DYN ASLR. To avoid bulky extern/#define tricks, consolidate the support under CONFIG_ARCH_HAS_ELF_RANDOMIZE for the architectures that support it, while still handling CONFIG_COMPAT_BRK. Signed-off-by: NKees Cook <keescook@chromium.org> Cc: Hector Marco-Gisbert <hecmargi@upv.es> Cc: Russell King <linux@arm.linux.org.uk> Reviewed-by: NIngo Molnar <mingo@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: "David A. Long" <dave.long@linaro.org> Cc: Andrey Ryabinin <a.ryabinin@samsung.com> Cc: Arun Chandran <achandran@mvista.com> Cc: Yann Droneaud <ydroneaud@opteya.com> Cc: Min-Hua Chen <orca.chen@gmail.com> Cc: Paul Burton <paul.burton@imgtec.com> Cc: Alex Smith <alex@alex-smith.me.uk> Cc: Markos Chandras <markos.chandras@imgtec.com> Cc: Vineeth Vijayan <vvijayan@mvista.com> Cc: Jeff Bailey <jeffbailey@google.com> Cc: Michael Holzheu <holzheu@linux.vnet.ibm.com> Cc: Ben Hutchings <ben@decadent.org.uk> Cc: Behan Webster <behanw@converseincode.com> Cc: Ismael Ripoll <iripoll@upv.es> Cc: Jan-Simon Mller <dl9pf@gmx.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Kees Cook 提交于
When an architecture fully supports randomizing the ELF load location, a per-arch mmap_rnd() function is used to find a randomized mmap base. In preparation for randomizing the location of ET_DYN binaries separately from mmap, this renames and exports these functions as arch_mmap_rnd(). Additionally introduces CONFIG_ARCH_HAS_ELF_RANDOMIZE for describing this feature on architectures that support it (which is a superset of ARCH_BINFMT_ELF_RANDOMIZE_PIE, since s390 already supports a separated ET_DYN ASLR from mmap ASLR without the ARCH_BINFMT_ELF_RANDOMIZE_PIE logic). Signed-off-by: NKees Cook <keescook@chromium.org> Cc: Hector Marco-Gisbert <hecmargi@upv.es> Cc: Russell King <linux@arm.linux.org.uk> Reviewed-by: NIngo Molnar <mingo@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: "David A. Long" <dave.long@linaro.org> Cc: Andrey Ryabinin <a.ryabinin@samsung.com> Cc: Arun Chandran <achandran@mvista.com> Cc: Yann Droneaud <ydroneaud@opteya.com> Cc: Min-Hua Chen <orca.chen@gmail.com> Cc: Paul Burton <paul.burton@imgtec.com> Cc: Alex Smith <alex@alex-smith.me.uk> Cc: Markos Chandras <markos.chandras@imgtec.com> Cc: Vineeth Vijayan <vvijayan@mvista.com> Cc: Jeff Bailey <jeffbailey@google.com> Cc: Michael Holzheu <holzheu@linux.vnet.ibm.com> Cc: Ben Hutchings <ben@decadent.org.uk> Cc: Behan Webster <behanw@converseincode.com> Cc: Ismael Ripoll <iripoll@upv.es> Cc: Jan-Simon Mller <dl9pf@gmx.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Toshi Kani 提交于
Add ioremap_pud_enabled() and ioremap_pmd_enabled(), which return 1 when I/O mappings with pud/pmd are enabled on the kernel. ioremap_huge_init() calls arch_ioremap_pud_supported() and arch_ioremap_pmd_supported() to initialize the capabilities at boot-time. A new kernel option "nohugeiomap" is also added, so that user can disable the huge I/O map capabilities when necessary. Signed-off-by: NToshi Kani <toshi.kani@hp.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Robert Elliott <Elliott@hp.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Michal Hocko 提交于
__GFP_NOFAIL is documented as a deprecated flag since commit 478352e7 ("mm: add comment about deprecation of __GFP_NOFAIL"). This has discouraged people from using it but in some cases an opencoded endless loop around allocator has been used instead. So the allocator is not aware of the de facto __GFP_NOFAIL allocation because this information was not communicated properly. Let's make clear that if the allocation context really cannot afford failure because there is no good failure policy then using __GFP_NOFAIL is preferable to opencoding the loop outside of the allocator. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: NMichal Hocko <mhocko@suse.cz> Acked-by: NDavid Rientjes <rientjes@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Dave Chinner <david@fromorbit.com> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Mel Gorman <mgorman@suse.de> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: "David S. Miller" <davem@davemloft.net> Cc: Vipul Pandya <vipul@chelsio.com> Cc: Jan Kara <jack@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Sasha Levin 提交于
Constify function parameters and use correct signness where needed. Signed-off-by: NSasha Levin <sasha.levin@oracle.com> Cc: Michal Nazarewicz <mina86@mina86.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Acked-by: NGregory Fong <gregory.0xf0@gmail.com> Cc: Pintu Kumar <pintu.k@samsung.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 David Rientjes 提交于
NOTE: this is not about __GFP_THISNODE, this is only about GFP_THISNODE. GFP_THISNODE is a secret combination of gfp bits that have different behavior than expected. It is a combination of __GFP_THISNODE, __GFP_NORETRY, and __GFP_NOWARN and is special-cased in the page allocator slowpath to fail without trying reclaim even though it may be used in combination with __GFP_WAIT. An example of the problem this creates: commit e97ca8e5 ("mm: fix GFP_THISNODE callers and clarify") fixed up many users of GFP_THISNODE that really just wanted __GFP_THISNODE. The problem doesn't end there, however, because even it was a no-op for alloc_misplaced_dst_page(), which also sets __GFP_NORETRY and __GFP_NOWARN, and migrate_misplaced_transhuge_page(), where __GFP_NORETRY and __GFP_NOWAIT is set in GFP_TRANSHUGE. Converting GFP_THISNODE to __GFP_THISNODE is a no-op in these cases since the page allocator special-cases __GFP_THISNODE && __GFP_NORETRY && __GFP_NOWARN. It's time to just remove GFP_THISNODE entirely. We leave __GFP_THISNODE to restrict an allocation to a local node, but remove GFP_THISNODE and its obscurity. Instead, we require that a caller clear __GFP_WAIT if it wants to avoid reclaim. This allows the aforementioned functions to actually reclaim as they should. It also enables any future callers that want to do __GFP_THISNODE but also __GFP_NORETRY && __GFP_NOWARN to reclaim. The rule is simple: if you don't want to reclaim, then don't set __GFP_WAIT. Aside: ovs_flow_stats_update() really wants to avoid reclaim as well, so it is unchanged. Signed-off-by: NDavid Rientjes <rientjes@google.com> Acked-by: NVlastimil Babka <vbabka@suse.cz> Cc: Christoph Lameter <cl@linux.com> Acked-by: NPekka Enberg <penberg@kernel.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Acked-by: NJohannes Weiner <hannes@cmpxchg.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Pravin Shelar <pshelar@nicira.com> Cc: Jarno Rajahalme <jrajahalme@nicira.com> Cc: Li Zefan <lizefan@huawei.com> Cc: Greg Thelen <gthelen@google.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Vladimir Davydov 提交于
The limit equals 32 and is imposed by the number of entries in the fs_poolid_map and shared_fs_poolid_map. Nowadays it is insufficient, because with containers on board a Linux host can have hundreds of active fs mounts. These maps were introduced by commit 49a9ab81 ("mm: cleancache: lazy initialization to allow tmem backends to build/run as modules") in order to allow compiling cleancache drivers as modules. Real pool ids are stored in these maps while super_block->cleancache_poolid points to an entry in the map, so that on cleancache registration we can walk over all (if there are <= 32 of them, of course) cleancache-enabled super blocks and assign real pool ids. Actually, there is absolutely no need in these maps, because we can iterate over all super blocks immediately using iterate_supers. This is not racy, because cleancache_init_ops is called from mount_fs with super_block->s_umount held for writing, while iterate_supers takes this semaphore for reading, so if we call iterate_supers after setting cleancache_ops, all super blocks that had been created before cleancache_register_ops was called will be assigned pool ids by the action function of iterate_supers while all newer super blocks will receive it in cleancache_init_fs. This patch therefore removes the maps and hence the artificial limit on the number of cleancache enabled filesystems. Signed-off-by: NVladimir Davydov <vdavydov@parallels.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Stefan Hengelein <ilendir@googlemail.com> Cc: Florian Schmaus <fschmaus@gmail.com> Cc: Andor Daam <andor.daam@googlemail.com> Cc: Dan Magenheimer <dan.magenheimer@oracle.com> Cc: Bob Liu <lliubbo@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Vladimir Davydov 提交于
Currently, cleancache_register_ops returns the previous value of cleancache_ops to allow chaining. However, chaining, as it is implemented now, is extremely dangerous due to possible pool id collisions. Suppose, a new cleancache driver is registered after the previous one assigned an id to a super block. If the new driver assigns the same id to another super block, which is perfectly possible, we will have two different filesystems using the same id. No matter if the new driver implements chaining or not, we are likely to get data corruption with such a configuration eventually. This patch therefore disables the ability to override cleancache_ops altogether as potentially dangerous. If there is already cleancache driver registered, all further calls to cleancache_register_ops will return EBUSY. Since no user of cleancache implements chaining, we only need to make minor changes to the code outside the cleancache core. Signed-off-by: NVladimir Davydov <vdavydov@parallels.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Stefan Hengelein <ilendir@googlemail.com> Cc: Florian Schmaus <fschmaus@gmail.com> Cc: Andor Daam <andor.daam@googlemail.com> Cc: Dan Magenheimer <dan.magenheimer@oracle.com> Cc: Bob Liu <lliubbo@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Vladimir Davydov 提交于
Use super_block->s_uuid instead. Every shared filesystem using cleancache must now initialize super_block->s_uuid before calling cleancache_init_shared_fs. The only one on the tree, ocfs2, already meets this requirement. Signed-off-by: NVladimir Davydov <vdavydov@parallels.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Stefan Hengelein <ilendir@googlemail.com> Cc: Florian Schmaus <fschmaus@gmail.com> Cc: Andor Daam <andor.daam@googlemail.com> Cc: Dan Magenheimer <dan.magenheimer@oracle.com> Cc: Bob Liu <lliubbo@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Kirill A. Shutemov 提交于
CONFIG_PGTABLE_LEVELS is now available on every architecture and we can use it to check if we need to add nr_pmds into mm_struct. Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Tested-by: NGuenter Roeck <linux@roeck-us.net> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Metcalf <cmetcalf@ezchip.com> Cc: David Howells <dhowells@redhat.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Richard Weinberger <richard@nod.at> Cc: Russell King <linux@arm.linux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Konstantin Khlebnikov 提交于
It seems nobody needs this. Signed-off-by: NKonstantin Khlebnikov <koct9i@gmail.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: David Rientjes <rientjes@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Konstantin Khlebnikov 提交于
This makes show_mem() much less verbose on huge machines. Instead of huge and almost useless dump of counters for each per-zone per-cpu lists this patch prints the sum of these counters for each zone (free_pcp) and size of per-cpu list for current cpu (local_pcp). The filter flag SHOW_MEM_PERCPU_LISTS reverts to the old verbose mode. [akpm@linux-foundation.org: update show_free_areas comment] Signed-off-by: NKonstantin Khlebnikov <khlebnikov@yandex-team.ru> Acked-by: NMichal Hocko <mhocko@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Konstantin Khlebnikov 提交于
This patch replaces cancel_dirty_page() with a helper function account_page_cleaned() which only updates counters. It's called from truncate_complete_page() and from try_to_free_buffers() (hack for ext3). Page is locked in both cases, page-lock protects against concurrent dirtiers: see commit 2d6d7f98 ("mm: protect set_page_dirty() from ongoing truncation"). Delete_from_page_cache() shouldn't be called for dirty pages, they must be handled by caller (either written or truncated). This patch treats final dirty accounting fixup at the end of __delete_from_page_cache() as a debug check and adds WARN_ON_ONCE() around it. If something removes dirty pages without proper handling that might be a bug and unwritten data might be lost. Hugetlbfs has no dirty pages accounting, ClearPageDirty() is enough here. cancel_dirty_page() in nfs_wb_page_cancel() is redundant. This is helper for nfs_invalidate_page() and it's called only in case complete invalidation. The mess was started in v2.6.20 after commits 46d2277c ("Clean up and make try_to_free_buffers() not race with dirty pages") and 3e67c098 ("truncate: clear page dirtiness before running try_to_free_buffers()") first was reverted right in v2.6.20 in commit ecdfc978 ("Resurrect 'try_to_free_buffers()' VM hackery"), second in v2.6.25 commit a2b34564 ("Fix dirty page accounting leak with ext3 data=journal"). Custom fixes were introduced between these points. NFS in v2.6.23, commit 1b3b4a1a ("NFS: Fix a write request leak in nfs_invalidate_page()"). Kludge in __delete_from_page_cache() in v2.6.24, commit 3a692790 ("Do dirty page accounting when removing a page from the page cache"). Since v2.6.25 all of them are redundant. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: NKonstantin Khlebnikov <khlebnikov@yandex-team.ru> Cc: Tejun Heo <tj@kernel.org> Cc: Jan Kara <jack@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 David Rientjes 提交于
There's a deadlock when concurrently hot-adding memory through the probe interface and switching a memory block from offline to online. When hot-adding memory via the probe interface, add_memory() first takes mem_hotplug_begin() and then device_lock() is later taken when registering the newly initialized memory block. This creates a lock dependency of (1) mem_hotplug.lock (2) dev->mutex. When switching a memory block from offline to online, dev->mutex is first grabbed in device_online() when the write(2) transitions an existing memory block from offline to online, and then online_pages() will take mem_hotplug_begin(). This creates a lock inversion between mem_hotplug.lock and dev->mutex. Vitaly reports that this deadlock can happen when kworker handling a probe event races with systemd-udevd switching a memory block's state. This patch requires the state transition to take mem_hotplug_begin() before dev->mutex. Hot-adding memory via the probe interface creates a memory block while holding mem_hotplug_begin(), there is no way to take dev->mutex first in this case. online_pages() and offline_pages() are only called when transitioning memory block state. We now require that mem_hotplug_begin() is taken before calling them -- this requires exporting the mem_hotplug_begin() and mem_hotplug_done() to generic code. In all hot-add and hot-remove cases, mem_hotplug_begin() is done prior to device_online(). This is all that is needed to avoid the deadlock. Signed-off-by: NDavid Rientjes <rientjes@google.com> Reported-by: NVitaly Kuznetsov <vkuznets@redhat.com> Tested-by: NVitaly Kuznetsov <vkuznets@redhat.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: Tang Chen <tangchen@cn.fujitsu.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Zhang Zhen <zhenzhang.zhang@huawei.com> Cc: Vladimir Davydov <vdavydov@parallels.com> Cc: Wang Nan <wangnan0@huawei.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Kirill A. Shutemov 提交于
After commit a1fde08c ("VM: skip the stack guard page lookup in get_user_pages only for mlock") FOLL_MLOCK has lost its original meaning: we don't necessarily mlock the page if the flags is set -- we also take VM_LOCKED into consideration. Since we use the same codepath for __mm_populate(), let's rename FOLL_MLOCK to FOLL_POPULATE. Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: NLinus Torvalds <torvalds@linux-foundation.org> Acked-by: NDavid Rientjes <rientjes@google.com> Cc: Michel Lespinasse <walken@google.com> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 David Rientjes 提交于
CONFIG_SLAB_DEBUG doesn't exist, CONFIG_DEBUG_SLAB does. Signed-off-by: NDavid Rientjes <rientjes@google.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Ulrich Obergfell 提交于
Have kvm_guest_init() use hardlockup_detector_disable() instead of watchdog_enable_hardlockup_detector(false). Remove the watchdog_hardlockup_detector_is_enabled() and the watchdog_enable_hardlockup_detector() function which are no longer needed. Signed-off-by: NUlrich Obergfell <uobergfe@redhat.com> Signed-off-by: NDon Zickus <dzickus@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Ulrich Obergfell 提交于
With the current user interface of the watchdog mechanism it is only possible to disable or enable both lockup detectors at the same time. This series introduces new kernel parameters and changes the semantics of some existing kernel parameters, so that the hard lockup detector and the soft lockup detector can be disabled or enabled individually. With this series applied, the user interface is as follows. - parameters in /proc/sys/kernel . soft_watchdog This is a new parameter to control and examine the run state of the soft lockup detector. . nmi_watchdog The semantics of this parameter have changed. It can now be used to control and examine the run state of the hard lockup detector. . watchdog This parameter is still available to control the run state of both lockup detectors at the same time. If this parameter is examined, it shows the logical OR of soft_watchdog and nmi_watchdog. . watchdog_thresh The semantics of this parameter are not affected by the patch. - kernel command line parameters . nosoftlockup The semantics of this parameter have changed. It can now be used to disable the soft lockup detector at boot time. . nmi_watchdog=0 or nmi_watchdog=1 Disable or enable the hard lockup detector at boot time. The patch introduces '=1' as a new option. . nowatchdog The semantics of this parameter are not affected by the patch. It is still available to disable both lockup detectors at boot time. Also, remove the proc_dowatchdog() function which is no longer needed. [dzickus@redhat.com: wrote changelog] [dzickus@redhat.com: update documentation for kernel params and sysctl] Signed-off-by: NUlrich Obergfell <uobergfe@redhat.com> Signed-off-by: NDon Zickus <dzickus@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Ulrich Obergfell 提交于
Separate handlers for each watchdog parameter in /proc/sys/kernel replace the proc_dowatchdog() function. Three of those handlers merely call proc_watchdog_common() with one different argument. Signed-off-by: NUlrich Obergfell <uobergfe@redhat.com> Signed-off-by: NDon Zickus <dzickus@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Ulrich Obergfell 提交于
The hardlockup and softockup had always been tied together. Due to the request of KVM folks, they had a need to have one enabled but not the other. Internally rework the code to split things apart more cleanly. There is a bunch of churn here, but the end result should be code that should be easier to maintain and fix without knowing the internals of what is going on. This patch (of 9): Introduce new definitions and variables to separate the user interface in /proc/sys/kernel from the internal run state of the lockup detectors. The internal run state is represented by two bits in a new variable that is named 'watchdog_enabled'. This helps simplify the code, for example: - In order to check if any of the two lockup detectors is enabled, it is sufficient to check if 'watchdog_enabled' is not zero. - In order to enable/disable one or both lockup detectors, it is sufficient to set/clear one or both bits in 'watchdog_enabled'. - Concurrent updates of 'watchdog_enabled' need not be synchronized via a spinlock or a mutex. Updates can either be atomic or concurrency can be detected by using 'cmpxchg'. Signed-off-by: NUlrich Obergfell <uobergfe@redhat.com> Signed-off-by: NDon Zickus <dzickus@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 14 4月, 2015 5 次提交
-
-
由 Richard Weinberger 提交于
portid is an unsigned integer. Fix netlink_notify to match all other portid user in the kernel. Signed-off-by: NRichard Weinberger <richard@nod.at> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Pablo Neira Ayuso 提交于
linux/if.h creates conflicts in userspace with net/if.h By using it here we force userspace to use linux/if.h while net/if.h may be needed. Note that: include/linux/netfilter_ipv4/ip_tables.h and include/linux/netfilter_ipv6/ip6_tables.h don't include linux/if.h and they also refer to IFNAMSIZ, so they are expecting userspace to include use net/if.h from the client program. Signed-off-by: NArturo Borrero Gonzalez <arturo.borrero.glez@gmail.com> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Daniel Borkmann 提交于
Even if we make use of classifier and actions from the egress path, we're going into handle_ing() executing additional code on a per-packet cost for ingress qdisc, just to realize that nothing is attached on ingress. Instead, this can just be blinded out as a no-op entirely with the use of a static key. On input fast-path, we already make use of static keys in various places, e.g. skb time stamping, in RPS, etc. It makes sense to not waste time when we're assured that no ingress qdisc is attached anywhere. Enabling/disabling of that code path is being done via two helpers, namely net_{inc,dec}_ingress_queue(), that are being invoked under RTNL mutex when a ingress qdisc is being either initialized or destructed. Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NAlexei Starovoitov <ast@plumgrid.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Thomas Graf 提交于
Some trivial reorders while preserving the RX/TX cache lines split to fill a couple of holes. Signed-off-by: NThomas Graf <tgraf@suug.ch> Acked-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Thomas Graf 提交于
e1000e is the only driver requiring pm_qos_req, instead of causing every device to waste up to 240 bytes. Allocate it for the specific driver. Signed-off-by: NThomas Graf <tgraf@suug.ch> Acked-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 13 4月, 2015 2 次提交
-
-
由 Ingo Molnar 提交于
Now that we are using smpboot_thread_init() in init/main.c as well, provide it for !CONFIG_SMP as well. This addresses a !CONFIG_SMP build failure. Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Borislav Petkov <bp@suse.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Paul E. McKenney 提交于
Currently, smpboot_unpark_threads() is invoked before the incoming CPU has been added to the scheduler's runqueue structures. This might potentially cause the unparked kthread to run on the wrong CPU, since the correct CPU isn't fully set up yet. That causes a sporadic, hard to debug boot crash triggering on some systems, reported by Borislav Petkov, and bisected down to: 2a442c9c ("x86: Use common outgoing-CPU-notification code") This patch places smpboot_unpark_threads() in a CPU hotplug notifier with priority set so that these kthreads are unparked just after the CPU has been added to the runqueues. Reported-and-tested-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
- 12 4月, 2015 5 次提交
-
-
由 Anton Altaparmakov 提交于
simillar to iov_iter_fault_in_readable() but differs in that it is not limited to faulting in the first iovec and instead faults in "bytes" bytes iterating over the iovecs as necessary. Also, instead of only faulting in the first and last page of the range, all pages are faulted in. This function is needed by NTFS when it does multi page file writes. Signed-off-by: NAnton Altaparmakov <anton@tuxera.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
just make const char iname[] the last member and compare name->name with name->iname instead of checking name->separate We need to make sure that out-of-line name doesn't end up allocated adjacent to struct filename refering to it; fortunately, it's easy to achieve - just allocate that struct filename with one byte in ->iname[], so that ->iname[0] will be inside the same object and thus have an address different from that of out-of-line name [spotted by Boqun Feng <boqun.feng@gmail.com>] Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
convert open-coded instances Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
it's equal to iov_iter_count(&msg->msg_iter) in all cases Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 11 4月, 2015 1 次提交
-
-
由 Vlad Zolotarov 提交于
Add configuration setting for drivers to allow/block an RSS Redirection Table and a Hash Key querying for discrete VFs. On some devices VF share the mentioned above information with PF and querying it may adduce a theoretical security risk. We want to let a system administrator to decide if he/she wants to take this risk or not. Signed-off-by: NVlad Zolotarov <vladz@cloudius-systems.com> Tested-by: NPhil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-