1. 22 Feb, 2010 1 commit
  2. 17 Feb, 2010 1 commit
  3. 16 Feb, 2010 4 commits
  4. 15 Feb, 2010 4 commits
  5. 11 Feb, 2010 1 commit
  6. 09 Feb, 2010 1 commit
    • X
      selinux: fix memory leak in sel_make_bools · 8007f102
      Xiaotian Feng committed
      In sel_make_bools, the kernel allocates memory for each
      bool_pending_names[i] via security_get_bools. If we free only
      bool_pending_names, the memory for each bool_pending_names[i] is leaked.
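
The ownership rule described above can be sketched in userspace C. This is a minimal illustration, not the kernel code: `get_names()` is a hypothetical stand-in for security_get_bools(), and `free_names()` shows the general pattern of releasing every element before releasing the array that holds them.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for security_get_bools(): allocates an array of
 * n names, each name in its own allocation. Returns 0 on success. */
static int get_names(char ***names_out, int n)
{
    assert(n > 0);
    char **names = calloc((size_t)n, sizeof(*names));
    if (!names)
        return -1;
    for (int i = 0; i < n; i++) {
        names[i] = malloc(16);
        if (!names[i]) {
            /* Undo the partial allocation before reporting failure. */
            while (i-- > 0)
                free(names[i]);
            free(names);
            return -1;
        }
        strcpy(names[i], "bool");
    }
    *names_out = names;
    return 0;
}

/* Frees each element first, then the array itself. Freeing only the
 * array would leak all n element allocations. Returns the number of
 * elements released. */
static int free_names(char **names, int n)
{
    int freed = 0;
    if (!names)
        return 0;
    for (int i = 0; i < n; i++) {
        free(names[i]); /* free(NULL) is a no-op */
        freed++;
    }
    free(names);
    return freed;
}
```

A caller pairs the two: every successful `get_names(&names, n)` is matched by exactly one `free_names(names, n)`.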
      
      This patch resolves dozens of kmemleak reports like the following,
      seen after resuming from suspend:
      unreferenced object 0xffff88022e4c7380 (size 32):
        comm "init", pid 1, jiffies 4294677173
        backtrace:
          [<ffffffff810f76b5>] create_object+0x1a2/0x2a9
          [<ffffffff810f78bb>] kmemleak_alloc+0x26/0x4b
          [<ffffffff810ef3eb>] __kmalloc+0x18f/0x1b8
          [<ffffffff811cd511>] security_get_bools+0xd7/0x16f
          [<ffffffff811c48c0>] sel_write_load+0x12e/0x62b
          [<ffffffff810f9a39>] vfs_write+0xae/0x10b
          [<ffffffff810f9b56>] sys_write+0x4a/0x6e
          [<ffffffff81011b82>] system_call_fastpath+0x16/0x1b
          [<ffffffffffffffff>] 0xffffffffffffffff
      Signed-off-by: Xiaotian Feng <dfeng@redhat.com>
      Signed-off-by: James Morris <jmorris@namei.org>
      8007f102
  7. 08 Feb, 2010 1 commit
  8. 05 Feb, 2010 1 commit
  9. 04 Feb, 2010 4 commits
  10. 03 Feb, 2010 1 commit
  11. 27 Jan, 2010 1 commit
  12. 25 Jan, 2010 2 commits
  13. 18 Jan, 2010 2 commits
    • J
      Merge branch 'master' into next · 2457552d
      James Morris committed
      2457552d
    • S
      selinux: change the handling of unknown classes · 19439d05
      Stephen Smalley committed
      If allow_unknown==deny, SELinux treats an undefined kernel security
      class as an error condition rather than as a typical permission denial
      and thus does not allow permissions on undefined classes even when in
      permissive mode.  Change the SELinux logic so that this case is handled
      as a typical permission denial, subject to the usual permissive mode and
      permissive domain handling.
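
The behavioural change described above can be modelled in a few lines of userspace C. This is a simplified sketch, not the SELinux implementation: the struct, its field names, and both functions are hypothetical, and the specific error value returned by the old path (-EINVAL) is an assumption made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical inputs to a single access check. */
struct decision_input {
    bool class_defined;      /* class exists in the loaded policy */
    bool allow_unknown_deny; /* policy says unknown classes are denied */
    bool policy_allows;      /* policy result when the class is defined */
    bool enforcing;          /* global enforcing mode */
    bool permissive_domain;  /* source domain is marked permissive */
};

/* Old model: an undefined class under "deny" is a hard error, so
 * permissive mode never gets a chance to let the access through. */
static int check_old(const struct decision_input *in)
{
    assert(in != NULL);
    if (!in->class_defined && in->allow_unknown_deny)
        return -EINVAL; /* assumed error value */
    if (!in->class_defined || in->policy_allows)
        return 0;
    return (in->enforcing && !in->permissive_domain) ? -EACCES : 0;
}

/* New model: the same case is an ordinary denial, subject to the usual
 * permissive mode and permissive domain handling. */
static int check_new(const struct decision_input *in)
{
    assert(in != NULL);
    bool granted = in->class_defined ? in->policy_allows
                                     : !in->allow_unknown_deny;
    if (granted)
        return 0;
    if (!in->enforcing || in->permissive_domain)
        return 0; /* denial would be logged, but access proceeds */
    return -EACCES;
}
```

With an undefined class, deny handling, and enforcing mode off, the old model returns an error while the new model returns 0.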
      
      Also drop the 'requested' argument from security_compute_av() and
      helpers as it is a legacy of the original security server interface and
      is unused.
      
      Changes:
      - Handle permissive domains consistently by moving up the test for a
      permissive domain.
      - Make security_compute_av_user() consistent with security_compute_av();
      the only difference now is that security_compute_av() performs mapping
      between the kernel-private class and permission indices and the policy
      values.  In the userspace case, this mapping is handled by libselinux.
      - Moved avd_init inside the policy lock.
      
      Based in part on a patch by Paul Moore <paul.moore@hp.com>.
      Reported-by: Andrew Worsley <amworsley@gmail.com>
      Signed-off-by: Stephen D. Smalley <sds@tycho.nsa.gov>
      Reviewed-by: Paul Moore <paul.moore@hp.com>
      Signed-off-by: James Morris <jmorris@namei.org>
      19439d05
  14. 17 Jan, 2010 16 commits
    • K
      page allocator: update NR_FREE_PAGES only when necessary · 6ccf80eb
      KOSAKI Motohiro committed
      Commit f2260e6b (page allocator: update NR_FREE_PAGES only as necessary)
      introduced a minor regression: if __rmqueue() fails, the NR_FREE_PAGES
      statistic goes wrong.  This patch fixes it.
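
The idea of touching the counter only after the underlying removal succeeds can be sketched as follows. This is a toy model under stated assumptions, not the page allocator: `struct zone_model`, `rmqueue_model()`, and `alloc_pages_model()` are hypothetical names, and the "free list" is reduced to a single page count.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical zone: the pages actually available, plus the statistic
 * that is supposed to track them. */
struct zone_model {
    long pages_available;
    long nr_free_stat;
};

/* Stand-in for __rmqueue(): fails when the request cannot be met. */
static bool rmqueue_model(struct zone_model *z, unsigned int order)
{
    assert(order < 31);
    long want = 1L << order;
    if (z->pages_available < want)
        return false;
    z->pages_available -= want;
    return true;
}

/* The statistic is adjusted only on success, so a failed request
 * leaves it consistent with pages_available. */
static bool alloc_pages_model(struct zone_model *z, unsigned int order)
{
    if (!rmqueue_model(z, order))
        return false;
    z->nr_free_stat -= 1L << order;
    return true;
}
```

Decrementing `nr_free_stat` before checking the result of `rmqueue_model()` would reproduce the kind of drift the commit message describes.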
      Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
      Reported-by: Huang Shijie <shijie8@gmail.com>
      Reviewed-by: Christoph Lameter <cl@linux-foundation.org>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      6ccf80eb
    • L
      Merge branch 'i2c-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging · 1f0b8b95
      Linus Torvalds committed
      * 'i2c-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
        i2c: Do not use device name after device_unregister
        i2c/pca: Don't use *_interruptible
        i2c-ali1563: Remove sparse warnings
        i2c: Test off by one in {piix4,vt596}_transaction()
        i2c-core: Storage class should be before const qualifier
      1f0b8b95
    • L
      Merge branch 'x86-fixes-for-linus' of... · 330a518a
      Linus Torvalds committed
      Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
      
      * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
        x86, uv: Ensure hub revision set for all ACPI modes.
        x86, uv: Add function retrieving node controller revision number
        x86: xen: 64-bit kernel RPL should be 0
        x86: kernel_thread() -- initialize SS to a known state
        x86/agp: Fix agp_amd64_init and agp_amd64_cleanup
        x86: SGI UV: Fix mapping of MMIO registers
        x86: mce.h: Fix warning in header checks
      330a518a
    • L
      Merge branch 'core-fixes-for-linus' of... · 2a8249da
      Linus Torvalds committed
      Merge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
      
      * 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
        futexes: Remove rw parameter from get_futex_key()
      2a8249da
    • L
      Merge branch 'perf-fixes-for-linus' of... · c6a93d33
      Linus Torvalds committed
      Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
      
      * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
        perf tools: Check if /dev/null can be used as the -o gcc argument
        perf tools: Move QUIET_STDERR def to before first use
        perf: Stop stack frame walking off kernel addresses boundaries
      c6a93d33
    • L
      Merge branch 'tracing-fixes-for-linus' of... · 6ccc347b
      Linus Torvalds committed
      Merge branch 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
      
      * 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
        tracing/filters: Add comment for match callbacks
        tracing/filters: Fix MATCH_FULL filter matching for PTR_STRING
        tracing/filters: Fix MATCH_MIDDLE_ONLY filter matching
        lib: Introduce strnstr()
        tracing/filters: Fix MATCH_END_ONLY filter matching
        tracing/filters: Fix MATCH_FRONT_ONLY filter matching
        ftrace: Fix MATCH_END_ONLY function filter
        tracing/x86: Derive arch from bits argument in recordmcount.pl
        ring-buffer: Add rb_list_head() wrapper around new reader page next field
        ring-buffer: Wrap a list.next reference with rb_list_head()
      6ccc347b
    • M
      revert "drivers/video/s3c-fb.c: fix clock setting for Samsung SoC Framebuffer" · eb29a5cc
      Mark Brown committed
      Fix divide by zero and broken output.  Commit 600ce1a0 ("fix clock
      setting for Samsung SoC Framebuffer") introduced a mandatory refresh
      parameter to the platform data for the S3C framebuffer but did not
      introduce any validation code, causing existing platforms (none of which
      have refresh set) to divide by zero whenever the framebuffer is
      configured, generating warnings and unusable output.
      
      Ben Dooks noted several problems with the patch:
      
       - The platform data supplies the pixclk directly and should already
         have taken care of the refresh rate.
       - The addition of a window ID parameter doesn't help since only the
         root framebuffer can control the pixclk.
       - pixclk is specified in picoseconds (rather than Hz) as the patch
         assumed.
      
      and suggests reverting the commit so do that.  Without fixing this no
      mainline user of the driver will produce output.
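
The unit mismatch noted above (a pixel clock given as a period in picoseconds rather than a frequency in Hz) can be made concrete with a small conversion helper. This is a sketch, not driver code: `pixclock_ps_to_hz()` is a hypothetical name, and the zero check simply illustrates validating a divisor before using it.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a pixel clock period in picoseconds to a frequency in Hz.
 * One second is 10^12 ps, so f = 10^12 / period. A zero period is
 * rejected by returning 0 instead of dividing by zero. */
static uint64_t pixclock_ps_to_hz(uint32_t pixclock_ps)
{
    if (pixclock_ps == 0)
        return 0;
    return UINT64_C(1000000000000) / pixclock_ps;
}
```

For example, a period of 40000 ps corresponds to 25000000 Hz (25 MHz), so treating the raw value 40000 as a frequency would be wrong by several orders of magnitude.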
      
      [akpm@linux-foundation.org: don't revert the correct bit]
      Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
      Cc: InKi Dae <inki.dae@samsung.com>
      Cc: Kyungmin Park <kmpark@infradead.org>
      Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Ben Dooks <ben-linux@fluff.org>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      eb29a5cc
    • D
      nommu: fix shared mmap after truncate shrinkage problems · 7e660872
      David Howells committed
      Fix a problem in NOMMU mmap with ramfs whereby a shared mmap can happen
      over the end of a truncation.  The problem is that
      ramfs_nommu_check_mappings() checks the reduced file size against the
      VMA tree, but not the vm_region tree.
      
      The following sequence of events can cause the problem:
      
      	fd = open("/tmp/x", O_RDWR|O_TRUNC|O_CREAT, 0600);
      	ftruncate(fd, 32 * 1024);
      	a = mmap(NULL, 32 * 1024, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
      	b = mmap(NULL, 16 * 1024, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
      	munmap(a, 32 * 1024);
      	ftruncate(fd, 16 * 1024);
      	c = mmap(NULL, 32 * 1024, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
      
      Mapping 'a' creates a vm_region covering 32KB of the file.  Mapping 'b'
      sees that the vm_region from 'a' is covering the region it wants and so
      shares it, pinning it in memory.
      
      Mapping 'a' then goes away and the file is truncated to the end of VMA
      'b'.  However, the region allocated by 'a' is still in effect, and has
      _not_ been reduced.
      
      Mapping 'c' is then created, and because there's a vm_region covering the
      desired region, get_unmapped_area() is _not_ called to repeat the check,
      and the mapping is granted, even though the pages from the latter half of
      the mapping have been discarded.
      
      However:
      
      	d = mmap(NULL, 16 * 1024, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
      
      Mapping 'd' should work, and should end up sharing the region allocated by
      'a'.
      
      To deal with this, we shrink the vm_region struct during the truncation,
      lest do_mmap_pgoff() take it as licence to share the full region
      automatically without calling the get_unmapped_area() file op again.
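
The effect of shrinking the region at truncation time can be shown with a small model. This is a sketch under simplifying assumptions, not the NOMMU code: `struct region_model` and both helpers are hypothetical, and a region is reduced to the byte range of the file it covers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a shared region: the file range it covers. */
struct region_model {
    size_t start; /* first file offset covered */
    size_t end;   /* one past the last file offset covered */
};

/* A new shared mapping may reuse the region without a fresh check only
 * if the region fully covers the requested range. */
static bool region_covers(const struct region_model *r,
                          size_t off, size_t len)
{
    assert(r != NULL);
    return off >= r->start && off <= r->end && len <= r->end - off;
}

/* On truncation, clamp the region to the new file size so that it no
 * longer advertises pages that have been discarded. */
static void region_shrink(struct region_model *r, size_t new_size)
{
    assert(r != NULL);
    if (r->end > new_size)
        r->end = new_size;
    if (r->start > r->end)
        r->start = r->end;
}
```

In terms of the sequence above: after the file is truncated to 16KB and the region is shrunk, a 32KB request like mapping 'c' is no longer covered, while a 16KB request like mapping 'd' still is.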
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Greg Ungerer <gerg@snapgear.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      7e660872
    • D
      nommu: fix race between ramfs truncation and shared mmap · 81759b5b
      David Howells committed
      Fix the race between the truncation of a ramfs file and an attempt to make
      a shared mmap of a region of that file.
      
      The problem is that do_mmap_pgoff() calls f_op->get_unmapped_area() to
      verify that the file region is made of contiguous pages and to find its
      base address - but there isn't any locking to guarantee this region until
      vma_prio_tree_insert() is called by add_vma_to_mm().
      
      Note that moving the functionality into f_op->mmap() doesn't help as that
      is also called before vma_prio_tree_insert().
      
      Instead make ramfs_nommu_check_mappings() grab nommu_region_sem whilst it
      does its checks.  This means that this function will wait whilst mmaps
      take place.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Greg Ungerer <gerg@snapgear.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      81759b5b
    • D
      nommu: don't need get_unmapped_area() for NOMMU · efc1a3b1
      David Howells committed
      get_unmapped_area() is unnecessary for NOMMU as no-one calls it.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Greg Ungerer <gerg@snapgear.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      efc1a3b1
    • D
      nommu: remove a superfluous check of vm_region::vm_usage · 779c1023
      David Howells committed
      In split_vma(), there's no need to check if the VMA being split has a
      region that's in use by more than one VMA because:
      
       (1) The preceding test prohibits splitting of non-anonymous VMAs and regions
           (eg: file or chardev backed VMAs).
      
       (2) Anonymous regions can't be mapped multiple times because there's no handle
           by which to refer to the already existing region.
      
       (3) If a VMA has previously been split, then the region backing it has also
           been split into two regions, each of usage 1.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Greg Ungerer <gerg@snapgear.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      779c1023
    • D
      nommu: struct vm_region's vm_usage count need not be atomic · 1e2ae599
      David Howells committed
      The vm_usage count field in struct vm_region does not need to be atomic as
      it's only ever modified whilst nommu_region_sem is write locked.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Greg Ungerer <gerg@snapgear.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      1e2ae599
    • D
      nommu: fix SYSV SHM for NOMMU · ed5e5894
      David Howells committed
      Commit c4caa778 ("file ->get_unmapped_area() shouldn't duplicate work
      of get_unmapped_area()")
      broke SYSV SHM for NOMMU by taking away the pointer to
      shm_get_unmapped_area() from shm_file_operations.
      
      Put it back conditionally on CONFIG_MMU=n.
      
      file->f_ops->get_unmapped_area() is used to find out the base address for a
      mapping of a mappable chardev device or mappable memory-based file (such as a
      ramfs file).  It needs to be called prior to file->f_ops->mmap() being called.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Greg Ungerer <gerg@snapgear.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      ed5e5894
    • W
      sysdev: fix prototype for memory_sysdev_class show/store functions · 8ff410da
      Wu Fengguang committed
      The function prototypes are mismatched in this call stack:
      
                      [<ffffffff81494268>] print_block_size+0x58/0x60
                      [<ffffffff81487e3f>] sysdev_class_show+0x1f/0x30
                      [<ffffffff811d629b>] sysfs_read_file+0xcb/0x1f0
                      [<ffffffff81176328>] vfs_read+0xc8/0x180
      
      Due to prototype mismatch, print_block_size() will sprintf() into
      *attribute instead of *buf, hence user space will read the initial
      zeros from *buf:
      	$ hexdump /sys/devices/system/memory/block_size_bytes
      	0000000 0000 0000 0000 0000
      	0000008
      
      After patch:
      	cat /sys/devices/system/memory/block_size_bytes
      	0x8000000
      
      This complements commits c29af9636 and 4a0b2b4d.
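
The matching-prototype requirement can be illustrated with a small userspace model. This is a sketch, not the sysdev code: all type and function names ending in `_model` are hypothetical, the explicit `size` parameter is an addition for safe formatting, and the block size value is taken from the "After patch" output above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct class_model { const char *name; };
struct class_attr_model;

/* The callback type fixes the argument order: class, attribute, buffer. */
typedef int (*show_fn_model)(struct class_model *cls,
                             struct class_attr_model *attr,
                             char *buf, size_t size);

struct class_attr_model {
    const char *name;
    show_fn_model show;
};

/* Declared with exactly the callback prototype, so the output lands in
 * buf. A callback declared with fewer parameters would receive the
 * attribute pointer where it expects the buffer. */
static int print_block_size_model(struct class_model *cls,
                                  struct class_attr_model *attr,
                                  char *buf, size_t size)
{
    (void)cls;
    (void)attr;
    return snprintf(buf, size, "0x%lx\n", 0x8000000UL);
}

/* Dispatcher, analogous in role to sysdev_class_show() in the trace. */
static int class_show_model(struct class_model *cls,
                            struct class_attr_model *attr,
                            char *buf, size_t size)
{
    assert(cls != NULL && attr != NULL && buf != NULL);
    if (!attr->show)
        return -1;
    return attr->show(cls, attr, buf, size);
}
```

Because the function is assigned to a typed function pointer without a cast, a C compiler will diagnose a prototype mismatch at build time in this model.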
      Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Greg Kroah-Hartman <gregkh@suse.de>
      Cc: "Zheng, Shaohui" <shaohui.zheng@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      8ff410da
    • W
      memory-hotplug: add 0x prefix to HEX block_size_bytes · ba168fc3
      Wu Fengguang committed
      Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Greg KH <greg@kroah.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      ba168fc3
    • D
      memcg: ensure list is empty at rmdir · fce66477
      Daisuke Nishimura committed
      Current mem_cgroup_force_empty() only ensures mem->res.usage == 0 on
      success.  But this doesn't guarantee memcg's LRU is really empty, because
      there are some cases in which !PageCgroupUsed pages exist on memcg's LRU.
      
      For example:
      - Pages can be uncharged by their owner process while they are on the LRU.
      - A race between mem_cgroup_add_lru_list() and __mem_cgroup_uncharge_common().
      
      So there can be a case in which the usage is zero but some of the LRUs are not empty.
      
      OTOH, mem_cgroup_del_lru_list(), which can be called asynchronously with
      rmdir, accesses the mem_cgroup, so this access can cause a problem if it
      races with rmdir because the mem_cgroup might have been freed by rmdir.
      
      Actually, I saw a bug which seems to be caused by this race.
      
      	[1530745.949906] BUG: unable to handle kernel NULL pointer dereference at 0000000000000230
      	[1530745.950651] IP: [<ffffffff810fbc11>] mem_cgroup_del_lru_list+0x30/0x80
      	[1530745.950651] PGD 3863de067 PUD 3862c7067 PMD 0
      	[1530745.950651] Oops: 0002 [#1] SMP
      	[1530745.950651] last sysfs file: /sys/devices/system/cpu/cpu7/cache/index1/shared_cpu_map
      	[1530745.950651] CPU 3
      	[1530745.950651] Modules linked in: configs ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge stp nfsd nfs_acl auth_rpcgss exportfs autofs4 hidp rfcomm l2cap crc16 bluetooth lockd sunrpc ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp bnx2i cnic uio ipv6 cxgb3i cxgb3 mdio libiscsi_tcp libiscsi scsi_transport_iscsi dm_mirror dm_multipath scsi_dh video output sbs sbshc battery ac lp kvm_intel kvm sg ide_cd_mod cdrom serio_raw tpm_tis tpm tpm_bios acpi_memhotplug button parport_pc parport rtc_cmos rtc_core rtc_lib e1000 i2c_i801 i2c_core pcspkr dm_region_hash dm_log dm_mod ata_piix libata shpchp megaraid_mbox sd_mod scsi_mod megaraid_mm ext3 jbd uhci_hcd ohci_hcd ehci_hcd [last unloaded: freq_table]
      	[1530745.950651] Pid: 19653, comm: shmem_test_02 Tainted: G   M       2.6.32-mm1-00701-g2b04386 #3 Express5800/140Rd-4 [N8100-1065]
      	[1530745.950651] RIP: 0010:[<ffffffff810fbc11>]  [<ffffffff810fbc11>] mem_cgroup_del_lru_list+0x30/0x80
      	[1530745.950651] RSP: 0018:ffff8803863ddcb8  EFLAGS: 00010002
      	[1530745.950651] RAX: 00000000000001e0 RBX: ffff8803abc02238 RCX: 00000000000001e0
      	[1530745.950651] RDX: 0000000000000000 RSI: ffff88038611a000 RDI: ffff8803abc02238
      	[1530745.950651] RBP: ffff8803863ddcc8 R08: 0000000000000002 R09: ffff8803a04c8643
      	[1530745.950651] R10: 0000000000000000 R11: ffffffff810c7333 R12: 0000000000000000
      	[1530745.950651] R13: ffff880000017f00 R14: 0000000000000092 R15: ffff8800179d0310
      	[1530745.950651] FS:  0000000000000000(0000) GS:ffff880017800000(0000) knlGS:0000000000000000
      	[1530745.950651] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      	[1530745.950651] CR2: 0000000000000230 CR3: 0000000379d87000 CR4: 00000000000006e0
      	[1530745.950651] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      	[1530745.950651] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      	[1530745.950651] Process shmem_test_02 (pid: 19653, threadinfo ffff8803863dc000, task ffff88038612a8a0)
      	[1530745.950651] Stack:
      	[1530745.950651]  ffffea00040c2fe8 0000000000000000 ffff8803863ddd98 ffffffff810c739a
      	[1530745.950651] <0> 00000000863ddd18 000000000000000c 0000000000000000 0000000000000000
      	[1530745.950651] <0> 0000000000000002 0000000000000000 ffff8803863ddd68 0000000000000046
      	[1530745.950651] Call Trace:
      	[1530745.950651]  [<ffffffff810c739a>] release_pages+0x142/0x1e7
      	[1530745.950651]  [<ffffffff810c778f>] ? pagevec_move_tail+0x6e/0x112
      	[1530745.950651]  [<ffffffff810c781e>] pagevec_move_tail+0xfd/0x112
      	[1530745.950651]  [<ffffffff810c78a9>] lru_add_drain+0x76/0x94
      	[1530745.950651]  [<ffffffff810dba0c>] exit_mmap+0x6e/0x145
      	[1530745.950651]  [<ffffffff8103f52d>] mmput+0x5e/0xcf
      	[1530745.950651]  [<ffffffff81043ea8>] exit_mm+0x11c/0x129
      	[1530745.950651]  [<ffffffff8108fb29>] ? audit_free+0x196/0x1c9
      	[1530745.950651]  [<ffffffff81045353>] do_exit+0x1f5/0x6b7
      	[1530745.950651]  [<ffffffff8106133f>] ? up_read+0x2b/0x2f
      	[1530745.950651]  [<ffffffff8137d187>] ? lockdep_sys_exit_thunk+0x35/0x67
      	[1530745.950651]  [<ffffffff81045898>] do_group_exit+0x83/0xb0
      	[1530745.950651]  [<ffffffff810458dc>] sys_exit_group+0x17/0x1b
      	[1530745.950651]  [<ffffffff81002c1b>] system_call_fastpath+0x16/0x1b
      	[1530745.950651] Code: 54 53 0f 1f 44 00 00 83 3d cc 29 7c 00 00 41 89 f4 75 63 eb 4e 48 83 7b 08 00 75 04 0f 0b eb fe 48 89 df e8 18 f3 ff ff 44 89 e2 <48> ff 4c d0 50 48 8b 05 2b 2d 7c 00 48 39 43 08 74 39 48 8b 4b
      	[1530745.950651] RIP  [<ffffffff810fbc11>] mem_cgroup_del_lru_list+0x30/0x80
      	[1530745.950651]  RSP <ffff8803863ddcb8>
      	[1530745.950651] CR2: 0000000000000230
      	[1530745.950651] ---[ end trace c3419c1bb8acc34f ]---
      	[1530745.950651] Fixing recursive fault but reboot is needed!
      
      The problem is that pages on an LRU may contain a pointer to a stale
      memcg.  To bring res->usage to 0, all pages on the memcg must be uncharged
      or moved to another (parent) memcg.  A moved page_cgroup has already been
      removed from the original LRU, but an uncharged page_cgroup still contains
      a pointer to the memcg, without the PCG_USED bit set.  (This asynchronous
      LRU work is for improving performance.)  If the PCG_USED bit is not set, a
      page_cgroup will never be added to the memcg's LRU, so pages that are not
      on an LRU never access the stale pointer.  What we have to take care of,
      then, is a page_cgroup that is _on_ an LRU list.  This patch fixes the
      problem by making mem_cgroup_force_empty() visit all LRUs before exiting
      its loop, guaranteeing that no pages are left on its LRUs.
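
The stricter exit condition described above can be sketched as a predicate. This is a toy model, not the memcg code: `struct memcg_model`, the list count `NR_LISTS_MODEL`, and the function names are hypothetical, and each LRU list is reduced to an entry count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_LISTS_MODEL 5 /* assumed number of LRU lists for this sketch */

struct memcg_model {
    unsigned long usage;                     /* charged usage */
    unsigned long lru_count[NR_LISTS_MODEL]; /* entries per LRU list */
};

static bool all_lrus_empty(const struct memcg_model *m)
{
    assert(m != NULL);
    for (size_t i = 0; i < NR_LISTS_MODEL; i++)
        if (m->lru_count[i] != 0)
            return false;
    return true;
}

/* Old condition: zero usage alone counts as done. */
static bool force_empty_done_old(const struct memcg_model *m)
{
    return m->usage == 0;
}

/* New condition: zero usage and every LRU list visited and empty. */
static bool force_empty_done_new(const struct memcg_model *m)
{
    return m->usage == 0 && all_lrus_empty(m);
}
```

A state with zero usage but one uncharged entry still sitting on a list satisfies the old condition and fails the new one, which is the window the commit message describes.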
      Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
      Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      fce66477