Commits · 07db04098d1e2f238959c858a0d63243157695f9 · openeuler / raspberrypi-kernel

24 Jul, 2012 · 2 commits

Authored by Masanari Iida on Jul 22, 2012

Correct spelling typos in debug messages and comments
in drivers/iommu.

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

07db0409

video: Fix typo in drivers/video · ff0c2642

Authored by Masanari Iida on Jul 22, 2012

Correct spelling typos in debug messages and comments
within drivers/video.

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

ff0c2642

21 Jul, 2012 · 1 commit

Documentation: Add newline at end-of-file to files lacking one · f9028317

Authored by Jesper Juhl on Jul 20, 2012

This patch simply adds a newline character at end-of-file to those
files in Documentation/ that currently lack one.

This is done for a few different reasons:

A) It's rather annoying when you do "cat some_file.txt" that your
   prompt/cursor ends up at the end of the last line of output rather
   than on a new line.

B) Some tools that process files line-by-line may get confused by the
   lack of a newline on the last line.

C) The "\ No newline at end of file" line in diffs annoys me for some
   reason.

So, let's just add the missing newline once and for all.

Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

f9028317

20 Jul, 2012 · 6 commits

arm,unicore32: Remove obsolete "select MISC_DEVICES" · 18d8fe1f

Authored by Geert Uytterhoeven on Jul 19, 2012

Obsolete since commit 7c5763b8 ("drivers: misc: Remove MISC_DEVICES
config option").

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

18d8fe1f

module.c: spelling s/postition/position/g · 2e76c283

Authored by Geert Uytterhoeven on Jul 19, 2012

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

2e76c283

cpufreq: Fix typo in cpufreq driver · c03c3013

Authored by Masanari Iida on Jul 18, 2012

Correct spelling typos in the cpufreq driver.

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

c03c3013

trivial: typo in comment in mksysmap · 4fec5420

Authored by Masatake YAMATO on Jul 14, 2012

Signed-off-by: Masatake YAMATO <yamato@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

4fec5420

mach-omap2: Fix typo in debug message and comment · 260db902

Authored by Masanari Iida on Jul 12, 2012

Correct spelling typos in mach-omap2.

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

260db902

scsi: aha152x: Fix sparse warning and make printing pointer address more portable. · b631cf1f

Authored by Krzysztof Wilczynski on May 02, 2012

Replace "0x%08x" with "%p" as per ../Documentation/printk-formats.txt,
which also takes care of the following compile-time warning:

drivers/scsi/aha152x.c: In function ‘get_command’:
drivers/scsi/aha152x.c:2987: warning: cast from pointer to integer of different size

Signed-off-by: Krzysztof Wilczynski <krzysztof.wilczynski@linux.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

b631cf1f
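
The `%p` change in b631cf1f can be illustrated in userspace C. This is a minimal sketch, not the driver's code: `format_ptr` is a hypothetical helper, and the standard `snprintf` stands in for the kernel's `printk`. Casting a pointer to a 32-bit integer for "0x%08x" is what provokes the quoted warning on targets where pointers are wider than `int`; `%p` takes the pointer itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: format a pointer portably with %p.
 * Returns the number of characters written, excluding the NUL. */
static int format_ptr(char *buf, size_t len, const void *ptr)
{
    assert(buf != NULL && len > 0);
    /* %p expects a void pointer, so no pointer-to-integer cast is
     * needed and the size-mismatch warning cannot occur. */
    return snprintf(buf, len, "%p", (void *)ptr);
}
```

The exact text produced by `%p` is implementation-defined, so callers should not parse it.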

17 Jul, 2012 · 1 commit

Change email address for Steve Glendinning · 90b24cfb

Authored by Steve Glendinning on Apr 16, 2012

I no longer have a mailbox at smsc.com, and I've had two reports
from people trying to contact me that the address now bounces.
This patch updates all references to that invalid address to one
where I can be contacted more permanently.

This patch also updates the maintainer status to reflect
the fact that I'm no longer directly paid to maintain these drivers.

Signed-off-by: Steve Glendinning <steve.glendinning@shawell.net>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

90b24cfb

12 Jul, 2012 · 1 commit

Btrfs: fix typo in convert_extent_bit · 10983f2e

Authored by Liu Bo on Jul 11, 2012

It should be convert_extent_bit.

Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

10983f2e

11 Jul, 2012 · 1 commit

via: Remove bogus if check · 39012f68

Authored by Alan Cox on Jul 11, 2012

Reported-by: <dcb314@hotmail.com>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Resolves-bug: https://bugzilla.kernel.org/show_bug.cgi?id=44331
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

39012f68

10 Jul, 2012 · 1 commit

netprio_cgroup.c: fix comment typo · 0f307323

Authored by Liu Bo on Jul 07, 2012

poitner -> pointer.

Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

0f307323

09 Jul, 2012 · 1 commit

backlight: fix memory leak on obscure error path · 9ea3c498

Authored by Martlin Ettl on Jul 05, 2012

Dredged out of bugzilla

Reported-by: Martlin Ettl <ettl.martin@gmx.de>
Resolves-bug: https://bugzilla.kernel.org/show_bug.cgi?id=15492
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

9ea3c498

29 Jun, 2012 · 1 commit

Merge branch 'master' into for-next · 59f91e5d

Authored by Jiri Kosina on Jun 29, 2012

Conflicts:
	include/linux/mmzone.h

Synced with Linus' tree so that trivial patches can be applied
properly on top of up-to-date code.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>

59f91e5d

28 Jun, 2012 · 8 commits

Documentation: asus-laptop.txt references an obsolete Kconfig item · 57bdfdd8

Authored by Paul Gortmaker on Jun 22, 2012

The CONFIG_X86_UP_APIC option no longer exists.  Delete the
reference to it.

Cc: trivial@kernel.org
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

57bdfdd8

Documentation: ManagementStyle: fixed typo · 692c86b7

Authored by Christopher L. Simons on Jun 19, 2012

Fixed a spelling error (less that you -> less than you)

Signed-off-by: Christopher L. Simons <christopherleesimons@gmail.com>
Acked-by: Rob Landley <rob@landley.net>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

692c86b7

mm/vmscan: cleanup comment error in balance_pgdat · ab8704b8

Authored by Wanpeng Li on Jun 17, 2012

Signed-off-by: Wanpeng Li <liwp.linux@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

ab8704b8

mm: cleanup on the comments of zone_reclaim_stat · 46028e6d

Authored by Wanpeng Li on Jun 15, 2012

Signed-off-by: Wanpeng Li <liwp.linux@gmail.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

46028e6d

mm: fix page reclaim comment error · be7bd59d

Authored by Wanpeng Li on Jun 14, 2012

Since there are five lists in the LRU cache, the array nr in get_scan_count
should be:

nr[0] = anon inactive pages to scan; nr[1] = anon active pages to scan
nr[2] = file inactive pages to scan; nr[3] = file active pages to scan

Signed-off-by: Wanpeng Li <liwp.linux@gmail.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

be7bd59d
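
The index layout described in be7bd59d can be sketched as an enum. The names below are illustrative rather than copied from the kernel, and the sketch assumes that the fifth list is the unevictable one, which is why only four entries carry scan targets.

```c
#include <assert.h>

/* Illustrative list indices in the order given by the commit message. */
enum lru_index {
    LRU_ANON_INACTIVE = 0,
    LRU_ANON_ACTIVE   = 1,
    LRU_FILE_INACTIVE = 2,
    LRU_FILE_ACTIVE   = 3,
    LRU_UNEVICTABLE   = 4    /* assumed fifth list: never scanned */
};

#define NR_SCANNABLE 4

/* Sum the scan targets over the four scannable lists. */
static unsigned long total_to_scan(const unsigned long nr[NR_SCANNABLE])
{
    unsigned long sum = 0;

    assert(nr != 0);
    for (int i = LRU_ANON_INACTIVE; i <= LRU_FILE_ACTIVE; i++)
        sum += nr[i];
    return sum;
}
```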

lib: correct link to the original source for div64_u64 · 422aa274

Authored by Akinobu Mita on Jun 09, 2012

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Jiri Kosina <trivial@kernel.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

422aa274

ab8500-btemp: Fix typo 'AB5500' · 8b7d133c

Authored by Paul Bolle on Jun 06, 2012

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

8b7d133c

treewide: Put a space between #include and FILE · 6ac7d115

Authored by Paul Bolle on Jun 06, 2012

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

6ac7d115

27 Jun, 2012 · 1 commit

lm8333: Fix check ordering · 954bd6d1

Authored by Alan Cox on Jun 27, 2012

Fix harmless reference off end of array

Reported-by: <dcb314@hotmail.com>
Resolves-bug: https://bugzilla.kernel.org/show_bug.cgi?43861
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

954bd6d1

05 Jun, 2012 · 2 commits

parisc: cleanup quoted include · c224071e

Authored by Paul Bolle on Jun 03, 2012

A quoted include starts with a superfluous "./". Clean up that quoted
include.

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

c224071e

renesas_usbhs: cleanup quoted includes · cc502bb7

Authored by Paul Bolle on Jun 03, 2012

A few quoted includes start with a superfluous "./". Clean up those
quoted includes.

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

cc502bb7

03 Jun, 2012 · 2 commits

Doc: document max raw dev number · d2582a7a

Authored by Kazuo Moriwaka on May 28, 2012

Document the maximum minor number of raw devices.

Signed-off-by: Kazuo Moriwaka <moriwaka@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

d2582a7a

Fix comment typo multipy -> multiply · 92a9f14b

Authored by Ralf Baechle on May 30, 2012

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

92a9f14b

30 May, 2012 · 12 commits

mm/memcg: move reclaim_stat into lruvec · 89abfab1

Authored by Hugh Dickins on May 29, 2012

With mem_cgroup_disabled() now explicit, it becomes clear that the
zone_reclaim_stat structure actually belongs in lruvec, per-zone when
memcg is disabled but per-memcg per-zone when it's enabled.

We can delete mem_cgroup_get_reclaim_stat(), and change
update_page_reclaim_stat() to update just the one set of stats, the one
which get_scan_count() will actually use.

Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Reviewed-by: Minchan Kim <minchan@kernel.org>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: Glauber Costa <glommer@parallels.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

89abfab1

mm/memcg: scanning_global_lru means mem_cgroup_disabled · c3c787e8

Authored by Hugh Dickins on May 29, 2012

Although one has to admire the skill with which it has been concealed,
scanning_global_lru(mz) is actually just an interesting way to test
mem_cgroup_disabled().  Too many developer hours have been wasted on
confusing it with global_reclaim(): just use mem_cgroup_disabled().

Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Glauber Costa <glommer@parallels.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

c3c787e8

memcg swap: use mem_cgroup_uncharge_swap() · 86493009

Authored by Hugh Dickins on May 29, 2012

That stuff __mem_cgroup_commit_charge_swapin() does with a swap entry
has a name and even a declaration: just use mem_cgroup_uncharge_swap().

Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

86493009

memcg swap: mem_cgroup_move_swap_account never needs fixup · e91cbb42

Authored by Hugh Dickins on May 29, 2012

The need_fixup arg to mem_cgroup_move_swap_account() is always false,
so just remove it.

Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

e91cbb42

memcg: fix/change behavior of shared anon at moving task · 4b91355e

Authored by KAMEZAWA Hiroyuki on May 29, 2012

This patch changes memcg's behavior at task_move().

At task_move(), the kernel scans a task's page table and moves the charges
for mapped pages from the source cgroup to the target cgroup.  There has
been a bug in the handling of shared anonymous pages for a long time.

Before patch:
  - The spec says 'shared anonymous pages are not moved.'
  - The implementation was 'shared anonymous pages may be moved'.
    If page_mapcount <= 2, the charges of shared anonymous pages were moved.

After patch:
  - The spec says 'all anonymous pages are moved'.
  - The implementation is 'all anonymous pages are moved'.

Considering how memcg is used, this will not affect the user's experience.
'Shared anonymous' pages only exist within a tree of processes that
don't call exec().  Moving one such process without exec() does not seem
sane.  For example, libcgroup will not be affected by this change.
(Anyway, no one noticed the implementation's behavior for a long time...)

Below is a discussion log:

 - The current spec/implementation are complex.
 - Shared file caches are already moved.
 - It adds an unclear check on page_mapcount(). To check correctly,
   we would also have to check swap users, etc.
 - No one noticed this implementation behavior, so no one benefits
   from the design.
 - In general, once a task is moved to a cgroup for running, it will not
   be moved....
 - Finally, we have a control knob, memory.move_charge_at_immigrate.

Here is a patch to allow moving shared pages completely. This makes
memcg simpler and fixes the currently broken code.

Suggested-by: Hugh Dickins <hughd@google.com>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Glauber Costa <glommer@parallels.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

4b91355e

mm/memblock: fix memory leak on extending regions · 181eb394

Authored by Gavin Shan on May 29, 2012

memblock is organized into memory regions and reserved regions.
Initially, both are stored in predetermined arrays of
"struct memblock_region". The arrays can be enlarged when regions are
newly added and no free space is left. The policy is to create a
double-sized array using either the slab allocator or the memblock
allocator. Unfortunately, the old array, which might have been allocated
through the slab allocator, was not freed. That would cause a memory leak.

The patch introduces 2 variables to track where (slab or memblock) the
memory and reserved regions come from. The memory for the memory or
reserved regions is deallocated by kfree() if it was allocated by the
slab allocator, which fixes the memory leak.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

181eb394
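
The fix in 181eb394 can be sketched in userspace C. This is a simplified model under stated assumptions, not the memblock code: `malloc`-family calls stand in for the slab allocator, a static array stands in for the predetermined array, and `struct region_table`, `table_double` and `table_add` are hypothetical names. A single flag per table plays the role of the tracking variables the patch introduces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

struct region { unsigned long base, size; };

#define INIT_REGIONS 4

/* Hypothetical stand-in for one memblock region table. */
struct region_table {
    struct region *regions;
    size_t cnt, max;
    bool heap_allocated;    /* records where the current array came from */
};

static struct region init_regions[INIT_REGIONS];

static void table_init(struct region_table *t)
{
    t->regions = init_regions;
    t->cnt = 0;
    t->max = INIT_REGIONS;
    t->heap_allocated = false;
}

/* Double the array; free the old one only if it came from the heap. */
static int table_double(struct region_table *t)
{
    assert(t != NULL);
    struct region *bigger = calloc(t->max * 2, sizeof(*bigger));
    if (!bigger)
        return -1;
    memcpy(bigger, t->regions, t->cnt * sizeof(*bigger));
    if (t->heap_allocated)
        free(t->regions);   /* the step whose absence caused the leak */
    t->regions = bigger;
    t->max *= 2;
    t->heap_allocated = true;
    return 0;
}

/* Append a region, growing the table when it is full. */
static int table_add(struct region_table *t, unsigned long base,
                     unsigned long size)
{
    if (t->cnt == t->max && table_double(t) != 0)
        return -1;
    t->regions[t->cnt].base = base;
    t->regions[t->cnt].size = size;
    t->cnt++;
    return 0;
}
```

The static array is never passed to `free`, which mirrors the reason the origin of the array has to be tracked at all.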

mm/memblock: cleanup on duplicate VA/PA conversion · 4e2f0775

Authored by Gavin Shan on May 29, 2012

memblock is organized into memory regions and reserved regions.
Initially, both are stored in predetermined arrays of
"struct memblock_region". The arrays can be enlarged when regions are
newly added and there is not enough space. In that situation, a
double-sized array is created to meet the requirement. However, the
original implementation converted the VA (Virtual Address) of the newly
allocated array of regions to a PA (Physical Address), then translated it
back when the new array was allocated from slab. That is unnecessary.

The patch removes the duplicate VA/PA conversion.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

4e2f0775

mm: fix slab->page flags corruption · 5bf5f03c

Authored by Pravin B Shelar on May 29, 2012

Transparent huge pages can change page->flags (PG_compound_lock) without
taking Slab lock.  Since THP can not break slab pages we can safely access
compound page without taking compound lock.

Specifically this patch fixes a race between compound_unlock() and slab
functions which perform page-flags updates.  This can occur when
get_page()/put_page() is called on a page from slab.

[akpm@linux-foundation.org: tweak comment text, fix comment layout, fix label indenting]
Reported-by: Amey Bhide <abhide@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Reviewed-by: Christoph Lameter <cl@linux.com>
Acked-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

5bf5f03c

mm: fix faulty initialization in vmalloc_init() · dbda591d

Authored by KyongHo on May 29, 2012

The transfer of ->flags causes some of the static mapping virtual
addresses to be prematurely freed (before the mapping is removed) because
VM_LAZY_FREE gets "set" if tmp->flags has VM_IOREMAP set.  This might
cause subsequent vmalloc/ioremap calls to fail because it might allocate
one of the freed virtual address ranges that aren't unmapped.

va->flags has different types of flags from tmp->flags.  If a region with
VM_IOREMAP set is registered with vm_area_add_early(), it will be removed
by __purge_vmap_area_lazy().

Fix vmalloc_init() to correctly initialize vmap_area for the given
vm_struct.

Also initialise va->vm.  If it is not set, find_vm_area() for the early
vm regions will always fail.

Signed-off-by: KyongHo Cho <pullip.cho@samsung.com>
Cc: "Olav Haugan" <ohaugan@codeaurora.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

dbda591d

mm: pmd_read_atomic: fix 32bit PAE pmd walk vs pmd_populate SMP race condition · 26c19178

Authored by Andrea Arcangeli on May 29, 2012

When holding the mmap_sem for reading, pmd_offset_map_lock should only
run on a pmd_t that has been read atomically from the pmdp pointer,
otherwise we may read only half of it leading to this crash.

PID: 11679  TASK: f06e8000  CPU: 3   COMMAND: "do_race_2_panic"
 #0 [f06a9dd8] crash_kexec at c049b5ec
 #1 [f06a9e2c] oops_end at c083d1c2
 #2 [f06a9e40] no_context at c0433ded
 #3 [f06a9e64] bad_area_nosemaphore at c043401a
 #4 [f06a9e6c] __do_page_fault at c0434493
 #5 [f06a9eec] do_page_fault at c083eb45
 #6 [f06a9f04] error_code (via page_fault) at c083c5d5
    EAX: 01fb470c EBX: fff35000 ECX: 00000003 EDX: 00000100 EBP: 00000000
    DS:  007b     ESI: 9e201000 ES:  007b     EDI: 01fb4700 GS:  00e0
    CS:  0060     EIP: c083bc14 ERR: ffffffff EFLAGS: 00010246
 #7 [f06a9f38] _spin_lock at c083bc14
 #8 [f06a9f44] sys_mincore at c0507b7d
 #9 [f06a9fb0] system_call at c083becd
                         start           len
    EAX: ffffffda  EBX: 9e200000  ECX: 00001000  EDX: 6228537f
    DS:  007b      ESI: 00000000  ES:  007b      EDI: 003d0f00
    SS:  007b      ESP: 62285354  EBP: 62285388  GS:  0033
    CS:  0073      EIP: 00291416  ERR: 000000da  EFLAGS: 00000286

This should be a longstanding bug affecting x86 32bit PAE without THP.
Only archs with 64bit large pmd_t and 32bit unsigned long should be
affected.

With THP enabled the barrier() in pmd_none_or_trans_huge_or_clear_bad()
would partly hide the bug when the pmd transitions from none to stable,
by forcing a re-read of the *pmd in pmd_offset_map_lock, but when THP is
enabled a new set of problems arises from the fact that the pmd could then
transition freely between any of the none, pmd_trans_huge or
pmd_trans_stable states.  So making the barrier in
pmd_none_or_trans_huge_or_clear_bad() unconditional isn't a good idea and
it would be a flaky solution.

This should be fully fixed by introducing a pmd_read_atomic that reads
the pmd in order with THP disabled, or by reading the pmd atomically
with cmpxchg8b with THP enabled.

Luckily this new race condition only triggers in the places that must
already be covered by pmd_none_or_trans_huge_or_clear_bad() so the fix
is localized there but this bug is not related to THP.

NOTE: this can trigger on x86 32bit systems with PAE enabled with more
than 4G of RAM; otherwise the high part of the pmd never risks being
truncated because it would be zero at all times, which in turn hides the
SMP race.

This bug was discovered and fully debugged by Ulrich, quote:

----
[..]
pmd_none_or_trans_huge_or_clear_bad() loads the content of edx and
eax.

    496 static inline int pmd_none_or_trans_huge_or_clear_bad(pmd_t *pmd)
    497 {
    498         /* depend on compiler for an atomic pmd read */
    499         pmd_t pmdval = *pmd;

                                // edi = pmd pointer
0xc0507a74 <sys_mincore+548>:   mov    0x8(%esp),%edi
...
                                // edx = PTE page table high address
0xc0507a84 <sys_mincore+564>:   mov    0x4(%edi),%edx
...
                                // eax = PTE page table low address
0xc0507a8e <sys_mincore+574>:   mov    (%edi),%eax

[..]

Please note that the PMD is not read atomically. These are two "mov"
instructions where the high order bits of the PMD entry are fetched
first. Hence, the above machine code is prone to the following race.

-  The PMD entry {high|low} is 0x0000000000000000.
   The "mov" at 0xc0507a84 loads 0x00000000 into edx.

-  A page fault (on another CPU) sneaks in between the two "mov"
   instructions and instantiates the PMD.

-  The PMD entry {high|low} is now 0x00000003fda38067.
   The "mov" at 0xc0507a8e loads 0xfda38067 into eax.
----
Reported-by: Ulrich Obergfell <uobergfe@redhat.com>
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: Larry Woodman <lwoodman@redhat.com>
Cc: Petr Matousek <pmatouse@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

26c19178
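
The torn read described in the quoted analysis of 26c19178 can be modelled deterministically. This is a sketch, not kernel code: `struct pmd64` and `torn_read` are hypothetical, and the write by the "other CPU" is performed inline between the two half reads to reproduce the interleaving, so no real concurrency is involved.

```c
#include <assert.h>
#include <stdint.h>

/* A 64-bit entry stored as two 32-bit halves, as on 32-bit PAE. */
struct pmd64 { uint32_t low, high; };

/* Replay the racy sequence: read the high half, let a writer install
 * the full entry, then read the low half. */
static uint64_t torn_read(struct pmd64 *pmd, uint64_t new_value)
{
    assert(pmd != 0);
    uint32_t high = pmd->high;                /* first mov: old high half */
    pmd->low  = (uint32_t)new_value;          /* other CPU populates ... */
    pmd->high = (uint32_t)(new_value >> 32);  /* ... the whole entry */
    uint32_t low = pmd->low;                  /* second mov: new low half */
    return ((uint64_t)high << 32) | low;
}
```

Starting from an empty entry and installing 0x00000003fda38067, the reader ends up with the new low half combined with the old, zero high half: a value that was never stored in the entry.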

mm, oom: normalize oom scores to oom_score_adj scale only for userspace · a7f638f9

Authored by David Rientjes on May 29, 2012

The oom_score_adj scale ranges from -1000 to 1000 and represents the
proportion of memory available to the process at allocation time.  This
means an oom_score_adj value of 300, for example, will bias a process as
though it was using an extra 30.0% of available memory and a value of
-350 will discount 35.0% of available memory from its usage.

The oom killer badness heuristic also uses this scale to report the oom
score for each eligible process in determining the "best" process to
kill.  Thus, it can only differentiate each process's memory usage by
0.1% of system RAM.

On large systems, this can end up being a large amount of memory: 256MB
on 256GB systems, for example.

This can be fixed by having the badness heuristic use the actual
memory usage in scoring threads and then normalizing it to the
oom_score_adj scale for userspace.  This results in better comparison
between eligible threads for kill and no change from the userspace
perspective.

Suggested-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Tested-by: Dave Jones <davej@redhat.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

a7f638f9
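
The loss of resolution that a7f638f9 addresses can be shown with a little arithmetic. The helper below is hypothetical and only models scaling a raw usage figure (in pages) onto the 0..1000 scale; it is not the kernel's badness function. The page counts used in the note afterwards assume 4 KiB pages.

```c
#include <assert.h>

/* Hypothetical helper: scale raw usage in pages to the 0..1000
 * oom_score_adj scale, as would be done only for reporting. */
static unsigned long long normalize_points(unsigned long long points,
                                           unsigned long long totalpages)
{
    assert(totalpages > 0);
    return points * 1000 / totalpages;
}
```

On a 256 GiB machine (67108864 pages of 4 KiB), tasks using 100000 and 100050 pages both normalize to a score of 1, so a comparison on the normalized scale cannot tell them apart, while a comparison on the raw page counts can.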

mm: avoid swapping out with swappiness==0 · fe35004f

Authored by Satoru Moriya on May 29, 2012

Sometimes we'd like to avoid swapping out anonymous memory.  In
particular, we want to avoid swapping out pages of important processes or
process groups while there is a reasonable amount of pagecache in RAM, so
that we can satisfy our customers' requirements.

OTOH, we can control how aggressively the kernel swaps memory pages with
/proc/sys/vm/swappiness globally and
/sys/fs/cgroup/memory/memory.swappiness for each memcg.

But with the current reclaim implementation, the kernel may swap out even
if we set swappiness=0 and there is pagecache in RAM.

This patch changes the behavior with swappiness==0.  If we set
swappiness==0, the kernel does not swap out at all (for global reclaim,
until the amount of free pages and file-backed pages in a zone has been
reduced to something very small (nr_free + nr_filebacked < high
watermark)).

Signed-off-by: Satoru Moriya <satoru.moriya@hds.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Reviewed-by: Rik van Riel <riel@redhat.com>
Acked-by: Jerome Marchand <jmarchan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fe35004f
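
The rule stated in fe35004f for global reclaim can be written as a small predicate. This is a sketch of the described behavior only: `may_swap_anon` and its parameters are hypothetical names, and the real reclaim code weighs many more factors when swappiness is non-zero.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical predicate for global reclaim in one zone. */
static bool may_swap_anon(int swappiness, unsigned long nr_free,
                          unsigned long nr_filebacked,
                          unsigned long high_wmark)
{
    assert(swappiness >= 0);
    if (swappiness > 0)
        return true;    /* usual anon/file balancing applies */
    /* swappiness == 0: swap only once free and file-backed pages
     * together have dropped below the high watermark. */
    return nr_free + nr_filebacked < high_wmark;
}
```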