1. 02 11月, 2013 1 次提交
    • G
      memcg: remove incorrect underflow check · 6920a1bd
      Greg Thelen 提交于
      When a memcg is deleted mem_cgroup_reparent_charges() moves charged
      memory to the parent memcg.  As of v3.11-9444-g3ea67d06 "memcg: add per
      cgroup writeback pages accounting" there's bad pointer read.  The goal
      was to check for counter underflow.  The counter is a per cpu counter
      and there are two problems with the code:
      
       (1) per cpu access function isn't used, instead a naked pointer is used
           which easily causes oops.
       (2) the check doesn't sum all cpus
      
      Test:
        $ cd /sys/fs/cgroup/memory
        $ mkdir x
        $ echo 3 > /proc/sys/vm/drop_caches
        $ (echo $BASHPID >> x/tasks && exec cat) &
        [1] 7154
        $ grep ^mapped x/memory.stat
        mapped_file 53248
        $ echo 7154 > tasks
        $ rmdir x
        <OOPS>
      
      The fix is to remove the check.  It's currently dangerous and isn't
      worth fixing it to use something expensive, such as
      percpu_counter_sum(), for each reparented page.  __this_cpu_read() isn't
      enough to fix this because there's no guarantees of the current cpus
      count.  The only guarantees is that the sum of all per-cpu counter is >=
      nr_pages.
      
      Fixes: 3ea67d06 ("memcg: add per cgroup writeback pages accounting")
      Reported-and-tested-by: NFlavio Leitner <fbl@redhat.com>
      Signed-off-by: NGreg Thelen <gthelen@google.com>
      Reviewed-by: NSha Zhengju <handai.szj@taobao.com>
      Acked-by: NJohannes Weiner <hannes@cmpxchg.org>
      Signed-off-by: NHugh Dickins <hughd@google.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      6920a1bd
  2. 01 11月, 2013 10 次提交
    • L
      Merge branch 'akpm' (fixes from Andrew Morton) · 4f794ee8
      Linus Torvalds 提交于
      Merge four more fixes from Andrew Morton.
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        lib/scatterlist.c: don't flush_kernel_dcache_page on slab page
        mm: memcg: fix test for child groups
        mm: memcg: lockdep annotation for memcg OOM lock
        mm: memcg: use proper memcg in limit bypass
      4f794ee8
    • M
      lib/scatterlist.c: don't flush_kernel_dcache_page on slab page · 3d77b50c
      Ming Lei 提交于
      Commit b1adaf65 ("[SCSI] block: add sg buffer copy helper
      functions") introduces two sg buffer copy helpers, and calls
      flush_kernel_dcache_page() on pages in SG list after these pages are
      written to.
      
      Unfortunately, the commit may introduce a potential bug:
      
       - Before sending some SCSI commands, kmalloc() buffer may be passed to
         block layper, so flush_kernel_dcache_page() can see a slab page
         finally
      
       - According to cachetlb.txt, flush_kernel_dcache_page() is only called
         on "a user page", which surely can't be a slab page.
      
       - ARCH's implementation of flush_kernel_dcache_page() may use page
         mapping information to do optimization so page_mapping() will see the
         slab page, then VM_BUG_ON() is triggered.
      
      Aaro Koskinen reported the bug on ARM/kirkwood when DEBUG_VM is enabled,
      and this patch fixes the bug by adding test of '!PageSlab(miter->page)'
      before calling flush_kernel_dcache_page().
      Signed-off-by: NMing Lei <ming.lei@canonical.com>
      Reported-by: NAaro Koskinen <aaro.koskinen@iki.fi>
      Tested-by: NSimon Baatz <gmbnomis@gmail.com>
      Cc: Russell King - ARM Linux <linux@arm.linux.org.uk>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Aaro Koskinen <aaro.koskinen@iki.fi>
      Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
      Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: <stable@vger.kernel.org>	[3.2+]
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      3d77b50c
    • J
      mm: memcg: fix test for child groups · 696ac172
      Johannes Weiner 提交于
      When memcg code needs to know whether any given memcg has children, it
      uses the cgroup child iteration primitives and returns true/false
      depending on whether the iteration loop is executed at least once or
      not.
      
      Because a cgroup's list of children is RCU protected, these primitives
      require the RCU read-lock to be held, which is not the case for all
      memcg callers.  This results in the following splat when e.g.  enabling
      hierarchy mode:
      
        WARNING: CPU: 3 PID: 1 at kernel/cgroup.c:3043 css_next_child+0xa3/0x160()
        CPU: 3 PID: 1 Comm: systemd Not tainted 3.12.0-rc5-00117-g83f11a9c-dirty #18
        Hardware name: LENOVO 3680B56/3680B56, BIOS 6QET69WW (1.39 ) 04/26/2012
        Call Trace:
          dump_stack+0x54/0x74
          warn_slowpath_common+0x78/0xa0
          warn_slowpath_null+0x1a/0x20
          css_next_child+0xa3/0x160
          mem_cgroup_hierarchy_write+0x5b/0xa0
          cgroup_file_write+0x108/0x2a0
          vfs_write+0xbd/0x1e0
          SyS_write+0x4c/0xa0
          system_call_fastpath+0x16/0x1b
      
      In the memcg case, we only care about children when we are attempting to
      modify inheritable attributes interactively.  Racing with deletion could
      mean a spurious -EBUSY, no problem.  Racing with addition is handled
      just fine as well through the memcg_create_mutex: if the child group is
      not on the list after the mutex is acquired, it won't be initialized
      from the parent's attributes until after the unlock.
      Signed-off-by: NJohannes Weiner <hannes@cmpxchg.org>
      Acked-by: NMichal Hocko <mhocko@suse.cz>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      696ac172
    • J
      mm: memcg: lockdep annotation for memcg OOM lock · 0056f4e6
      Johannes Weiner 提交于
      The memcg OOM lock is a mutex-type lock that is open-coded due to
      memcg's special needs.  Add annotations for lockdep coverage.
      Signed-off-by: NJohannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@suse.cz>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      0056f4e6
    • J
      mm: memcg: use proper memcg in limit bypass · 3168ecbe
      Johannes Weiner 提交于
      Commit 84235de3 ("fs: buffer: move allocation failure loop into the
      allocator") allowed __GFP_NOFAIL allocations to bypass the limit if they
      fail to reclaim enough memory for the charge.  But because the main test
      case was on a 3.2-based system, the patch missed the fact that on newer
      kernels the charge function needs to return root_mem_cgroup when
      bypassing the limit, and not NULL.  This will corrupt whatever memory is
      at NULL + percpu pointer offset.  Fix this quickly before problems are
      reported.
      Signed-off-by: NJohannes Weiner <hannes@cmpxchg.org>
      Acked-by: NMichal Hocko <mhocko@suse.cz>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      3168ecbe
    • L
      vfs: decrapify dput(), fix cache behavior under normal load · 358eec18
      Linus Torvalds 提交于
      We do not want to dirty the dentry->d_flags cacheline in dput() just to
      set the DCACHE_REFERENCED flag when it is already set in the common case
      anyway.  This way the first cacheline of the dentry (which contains the
      RCU lookup information etc) can stay shared among multiple CPU's.
      
      This finishes off some of the details of all the scalability patches
      merged during the merge window.
      
      Also don't mark dentry_kill() for inlining, since it's the uncommon path
      and inlining it just makes the common path slower due to extra function
      entry/exit overhead.
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      358eec18
    • L
      i915: fix compiler warning · 0baab4fd
      Linus Torvalds 提交于
      The last i915 drm update brought with it this annoying warning
      
        drivers/gpu/drm/i915/intel_crt.c: In function ‘intel_crt_get_config’:
        drivers/gpu/drm/i915/intel_crt.c:110:21: warning: unused variable ‘dev’ [-Wunused-variable]
          struct drm_device *dev = encoder->base.dev;
                             ^
      
      introduced by commit 7195a50b ("drm/i915: Add HSW CRT output readout
      support").
      
      Remove the offending pointless variable.
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      0baab4fd
    • L
      Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 52469b4f
      Linus Torvalds 提交于
      Pull NUMA balancing memory corruption fixes from Ingo Molnar:
       "So these fixes are definitely not something I'd like to sit on, but as
        I said to Mel at the KS the timing is quite tight, with Linus planning
        v3.12-final within a week.
      
        Fedora-19 is affected:
      
         comet:~> grep NUMA_BALANCING /boot/config-3.11.3-201.fc19.x86_64
      
         CONFIG_ARCH_SUPPORTS_NUMA_BALANCING=y
         CONFIG_NUMA_BALANCING_DEFAULT_ENABLED=y
         CONFIG_NUMA_BALANCING=y
      
        AFAICS Ubuntu will be affected as well, once it updates the kernel:
      
         hubble:~> grep NUMA_BALANCING /boot/config-3.8.0-32-generic
      
         CONFIG_ARCH_SUPPORTS_NUMA_BALANCING=y
         CONFIG_NUMA_BALANCING_DEFAULT_ENABLED=y
         CONFIG_NUMA_BALANCING=y
      
        These 6 commits are a minimalized set of cherry-picks needed to fix
        the memory corruption bugs.  All commits are fixes, except "mm: numa:
        Sanitize task_numa_fault() callsites" which is a cleanup that made two
        followup fixes simpler.
      
        I've done targeted testing with just this SHA1 to try to make sure
        there are no cherry-picking artifacts.  The original non-cherry-picked
        set of fixes were exposed to linux-next for a couple of weeks"
      
      * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        mm: Account for a THP NUMA hinting update as one PTE update
        mm: Close races between THP migration and PMD numa clearing
        mm: numa: Sanitize task_numa_fault() callsites
        mm: Prevent parallel splits during THP migration
        mm: Wait for THP migrations to complete during NUMA hinting faults
        mm: numa: Do not account for a hinting fault if we raced
      52469b4f
    • L
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input · 026f8f61
      Linus Torvalds 提交于
      Pull input updates from Dmitry Torokhov:
       "A bit later than I would want, but the changes are very minor - a few
        new device IDs for new hardware in existing drivers, fix for battery
        in Wacom devices not be considered system battery and cause emergency
        hibernations, and a couple of other bug fixes"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
        Input: ALPS - add support for model found on Dell XT2
        Input: wacom - add support for ISDv4 0x10E sensor
        Input: wacom - add support for ISDv4 0x10F sensor
        Input: wacom - export battery scope
        Input: cm109 - convert high volume dev_err() to dev_err_ratelimited()
        Input: move name/timer init to input_alloc_dev()
        Input: i8042 - i8042_flush fix for a full 8042 buffer
        Input: pxa27x_keypad - fix NULL pointer dereference
      026f8f61
    • L
      Merge tag 'pm+acpi-3.12-late' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · e7647027
      Linus Torvalds 提交于
      Pull ACPI and power management fixes from Rafael J Wysocki:
       "Last-minute ACPI and power management fixes for 3.12
      
         - Revert epoll and select commits related to the freezer, introduced
           during the 3.11 cycle, that cause mysterious user space breakage to
           occur during resume from suspend to RAM for multiple users of
           32-bit x86 systems.  Material for 3.11.y stable kernels.
      
         - Revert a recent ACPI-based PCI hotplug (ACPIPHP) commit that was
           part of boot problem fixes for one machine, but turns out to cause
           issues with hotplug on Thunderbolt chains with multiple devices.
           It also turns out to be unnecessary after another fix in the same
           area that went in later.  From Mika Westerberg"
      
      * tag 'pm+acpi-3.12-late' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        Revert "ACPI / hotplug / PCI: Avoid doing too much for spurious notifies"
        Revert "select: use freezable blocking call"
        Revert "epoll: use freezable blocking call"
      e7647027
  3. 31 10月, 2013 17 次提交
    • Y
      Input: ALPS - add support for model found on Dell XT2 · 5beea882
      Yunkang Tang 提交于
      This patch adds support for touchpad found on Dell XT2. It's a dual device
      with device ID: 73, 00, 14, that comply with "ALPS_PROTO_V2".
      Signed-off-by: NYunkang Tang <yunkang.tang@cn.alps.com>
      Signed-off-by: NDmitry Torokhov <dmitry.torokhov@gmail.com>
      5beea882
    • D
      Merge branch 'drm-fixes-3.12' of git://people.freedesktop.org/~agd5f/linux into drm-fixes · 74c85e13
      Dave Airlie 提交于
      Just a few small fixes for radeon (audio regression fix,
      stability fix, and an endian bug noticed by coverity).
      
      * 'drm-fixes-3.12' of git://people.freedesktop.org/~agd5f/linux:
        drm/radeon/dpm: fix incompatible casting on big endian
        drm/radeon: disable bapm on KB
        drm/radeon: use sw CTS/N values for audio on DCE4+
      74c85e13
    • L
      Merge branch 'akpm' (fixes from Andrew Morton) · 12aee278
      Linus Torvalds 提交于
      Merge three fixes from Andrew Morton.
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        memcg: use __this_cpu_sub() to dec stats to avoid incorrect subtrahend casting
        percpu: fix this_cpu_sub() subtrahend casting for unsigneds
        mm/pagewalk.c: fix walk_page_range() access of wrong PTEs
      12aee278
    • G
      memcg: use __this_cpu_sub() to dec stats to avoid incorrect subtrahend casting · 5e8cfc3c
      Greg Thelen 提交于
      As of commit 3ea67d06 ("memcg: add per cgroup writeback pages
      accounting") memcg counter errors are possible when moving charged
      memory to a different memcg.  Charge movement occurs when processing
      writes to memory.force_empty, moving tasks to a memcg with
      memcg.move_charge_at_immigrate=1, or memcg deletion.
      
      An example showing error after memory.force_empty:
      
        $ cd /sys/fs/cgroup/memory
        $ mkdir x
        $ rm /data/tmp/file
        $ (echo $BASHPID >> x/tasks && exec mmap_writer /data/tmp/file 1M) &
        [1] 13600
        $ grep ^mapped x/memory.stat
        mapped_file 1048576
        $ echo 13600 > tasks
        $ echo 1 > x/memory.force_empty
        $ grep ^mapped x/memory.stat
        mapped_file 4503599627370496
      
      mapped_file should end with 0.
        4503599627370496 == 0x10,0000,0000,0000 == 0x100,0000,0000 pages
        1048576          == 0x10,0000           == 0x100 pages
      
      This issue only affects the source memcg on 64 bit machines; the
      destination memcg counters are correct.  So the rmdir case is not too
      important because such counters are soon disappearing with the entire
      memcg.  But the memcg.force_empty and memory.move_charge_at_immigrate=1
      cases are larger problems as the bogus counters are visible for the
      (possibly long) remaining life of the source memcg.
      
      The problem is due to memcg use of __this_cpu_from(.., -nr_pages), which
      is subtly wrong because it subtracts the unsigned int nr_pages (either
      -1 or -512 for THP) from a signed long percpu counter.  When
      nr_pages=-1, -nr_pages=0xffffffff.  On 64 bit machines stat->count[idx]
      is signed 64 bit.  So memcg's attempt to simply decrement a count (e.g.
      from 1 to 0) boils down to:
      
        long count = 1
        unsigned int nr_pages = 1
        count += -nr_pages  /* -nr_pages == 0xffff,ffff */
        count is now 0x1,0000,0000 instead of 0
      
      The fix is to subtract the unsigned page count rather than adding its
      negation.  This only works once "percpu: fix this_cpu_sub() subtrahend
      casting for unsigneds" is applied to fix this_cpu_sub().
      Signed-off-by: NGreg Thelen <gthelen@google.com>
      Acked-by: NTejun Heo <tj@kernel.org>
      Acked-by: NJohannes Weiner <hannes@cmpxchg.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      5e8cfc3c
    • G
      percpu: fix this_cpu_sub() subtrahend casting for unsigneds · bd09d9a3
      Greg Thelen 提交于
      this_cpu_sub() is implemented as negation and addition.
      
      This patch casts the adjustment to the counter type before negation to
      sign extend the adjustment.  This helps in cases where the counter type
      is wider than an unsigned adjustment.  An alternative to this patch is
      to declare such operations unsupported, but it seemed useful to avoid
      surprises.
      
      This patch specifically helps the following example:
        unsigned int delta = 1
        preempt_disable()
        this_cpu_write(long_counter, 0)
        this_cpu_sub(long_counter, delta)
        preempt_enable()
      
      Before this change long_counter on a 64 bit machine ends with value
      0xffffffff, rather than 0xffffffffffffffff.  This is because
      this_cpu_sub(pcp, delta) boils down to this_cpu_add(pcp, -delta),
      which is basically:
        long_counter = 0 + 0xffffffff
      
      Also apply the same cast to:
        __this_cpu_sub()
        __this_cpu_sub_return()
        this_cpu_sub_return()
      
      All percpu_test.ko passes, especially the following cases which
      previously failed:
      
        l -= ui_one;
        __this_cpu_sub(long_counter, ui_one);
        CHECK(l, long_counter, -1);
      
        l -= ui_one;
        this_cpu_sub(long_counter, ui_one);
        CHECK(l, long_counter, -1);
        CHECK(l, long_counter, 0xffffffffffffffff);
      
        ul -= ui_one;
        __this_cpu_sub(ulong_counter, ui_one);
        CHECK(ul, ulong_counter, -1);
        CHECK(ul, ulong_counter, 0xffffffffffffffff);
      
        ul = this_cpu_sub_return(ulong_counter, ui_one);
        CHECK(ul, ulong_counter, 2);
      
        ul = __this_cpu_sub_return(ulong_counter, ui_one);
        CHECK(ul, ulong_counter, 1);
      Signed-off-by: NGreg Thelen <gthelen@google.com>
      Acked-by: NTejun Heo <tj@kernel.org>
      Acked-by: NJohannes Weiner <hannes@cmpxchg.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      bd09d9a3
    • C
      mm/pagewalk.c: fix walk_page_range() access of wrong PTEs · 3017f079
      Chen LinX 提交于
      When walk_page_range walk a memory map's page tables, it'll skip
      VM_PFNMAP area, then variable 'next' will to assign to vma->vm_end, it
      maybe larger than 'end'.  In next loop, 'addr' will be larger than
      'next'.  Then in /proc/XXXX/pagemap file reading procedure, the 'addr'
      will growing forever in pagemap_pte_range, pte_to_pagemap_entry will
      access the wrong pte.
      
        BUG: Bad page map in process procrank  pte:8437526f pmd:785de067
        addr:9108d000 vm_flags:00200073 anon_vma:f0d99020 mapping:  (null) index:9108d
        CPU: 1 PID: 4974 Comm: procrank Tainted: G    B   W  O 3.10.1+ #1
        Call Trace:
          dump_stack+0x16/0x18
          print_bad_pte+0x114/0x1b0
          vm_normal_page+0x56/0x60
          pagemap_pte_range+0x17a/0x1d0
          walk_page_range+0x19e/0x2c0
          pagemap_read+0x16e/0x200
          vfs_read+0x84/0x150
          SyS_read+0x4a/0x80
          syscall_call+0x7/0xb
      Signed-off-by: NLiu ShuoX <shuox.liu@intel.com>
      Signed-off-by: NChen LinX <linx.z.chen@intel.com>
      Acked-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Reviewed-by: NNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: <stable@vger.kernel.org>	[3.10.x+]
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      3017f079
    • R
      mm: list_lru: fix almost infinite loop causing effective livelock · c56b097a
      Russell King 提交于
      I've seen a fair number of issues with kswapd and other processes
      appearing to get stuck in v3.12-rc.  Using sysrq-p many times seems to
      indicate that it gets stuck somewhere in list_lru_walk_node(), called
      from prune_icache_sb() and super_cache_scan().
      
      I never seem to be able to trigger a calltrace for functions above that
      point.
      
      So I decided to add the following to super_cache_scan():
      
          @@ -81,10 +81,14 @@ static unsigned long super_cache_scan(struct shrinker *shrink,
                  inodes = list_lru_count_node(&sb->s_inode_lru, sc->nid);
                  dentries = list_lru_count_node(&sb->s_dentry_lru, sc->nid);
                  total_objects = dentries + inodes + fs_objects + 1;
          +printk("%s:%u: %s: dentries %lu inodes %lu total %lu\n", current->comm, current->pid, __func__, dentries, inodes, total_objects);
      
                  /* proportion the scan between the caches */
                  dentries = mult_frac(sc->nr_to_scan, dentries, total_objects);
                  inodes = mult_frac(sc->nr_to_scan, inodes, total_objects);
          +printk("%s:%u: %s: dentries %lu inodes %lu\n", current->comm, current->pid, __func__, dentries, inodes);
          +BUG_ON(dentries == 0);
          +BUG_ON(inodes == 0);
      
                  /*
                   * prune the dcache first as the icache is pinned by it, then
          @@ -99,7 +103,7 @@ static unsigned long super_cache_scan(struct shrinker *shrink,
                          freed += sb->s_op->free_cached_objects(sb, fs_objects,
                                                                 sc->nid);
                  }
          -
          +printk("%s:%u: %s: dentries %lu inodes %lu freed %lu\n", current->comm, current->pid, __func__, dentries, inodes, freed);
                  drop_super(sb);
                  return freed;
           }
      
      and shortly thereafter, having applied some pressure, I got this:
      
          update-apt-xapi:1616: super_cache_scan: dentries 25632 inodes 2 total 25635
          update-apt-xapi:1616: super_cache_scan: dentries 1023 inodes 0
          ------------[ cut here ]------------
          Kernel BUG at c0101994 [verbose debug info unavailable]
          Internal error: Oops - BUG: 0 [#3] SMP ARM
          Modules linked in: fuse rfcomm bnep bluetooth hid_cypress
          CPU: 0 PID: 1616 Comm: update-apt-xapi Tainted: G      D      3.12.0-rc7+ #154
          task: daea1200 ti: c3bf8000 task.ti: c3bf8000
          PC is at super_cache_scan+0x1c0/0x278
          LR is at trace_hardirqs_on+0x14/0x18
          Process update-apt-xapi (pid: 1616, stack limit = 0xc3bf8240)
          ...
          Backtrace:
            (super_cache_scan) from [<c00cd69c>] (shrink_slab+0x254/0x4c8)
            (shrink_slab) from [<c00d09a0>] (try_to_free_pages+0x3a0/0x5e0)
            (try_to_free_pages) from [<c00c59cc>] (__alloc_pages_nodemask+0x5)
            (__alloc_pages_nodemask) from [<c00e07c0>] (__pte_alloc+0x2c/0x13)
            (__pte_alloc) from [<c00e3a70>] (handle_mm_fault+0x84c/0x914)
            (handle_mm_fault) from [<c001a4cc>] (do_page_fault+0x1f0/0x3bc)
            (do_page_fault) from [<c001a7b0>] (do_translation_fault+0xac/0xb8)
            (do_translation_fault) from [<c000840c>] (do_DataAbort+0x38/0xa0)
            (do_DataAbort) from [<c00133f8>] (__dabt_usr+0x38/0x40)
      
      Notice that we had a very low number of inodes, which were reduced to
      zero my mult_frac().
      
      Now, prune_icache_sb() calls list_lru_walk_node() passing that number of
      inodes (0) into that as the number of objects to scan:
      
          long prune_icache_sb(struct super_block *sb, unsigned long nr_to_scan,
                               int nid)
          {
                  LIST_HEAD(freeable);
                  long freed;
      
                  freed = list_lru_walk_node(&sb->s_inode_lru, nid, inode_lru_isolate,
                                                 &freeable, &nr_to_scan);
      
      which does:
      
          unsigned long
          list_lru_walk_node(struct list_lru *lru, int nid, list_lru_walk_cb isolate,
                             void *cb_arg, unsigned long *nr_to_walk)
          {
      
                  struct list_lru_node    *nlru = &lru->node[nid];
                  struct list_head *item, *n;
                  unsigned long isolated = 0;
      
                  spin_lock(&nlru->lock);
          restart:
                  list_for_each_safe(item, n, &nlru->list) {
                          enum lru_status ret;
      
                          /*
                           * decrement nr_to_walk first so that we don't livelock if we
                           * get stuck on large numbesr of LRU_RETRY items
                           */
                          if (--(*nr_to_walk) == 0)
                                  break;
      
      So, if *nr_to_walk was zero when this function was entered, that means
      we're wanting to operate on (~0UL)+1 objects - which might as well be
      infinite.
      
      Clearly this is not correct behaviour.  If we think about the behaviour
      of this function when *nr_to_walk is 1, then clearly it's wrong - we
      decrement first and then test for zero - which results in us doing
      nothing at all.  A post-decrement would give the desired behaviour -
      we'd try to walk one object and one object only if *nr_to_walk were one.
      
      It also gives the correct behaviour for zero - we exit at this point.
      
      Fixes: 5cedf721 ("list_lru: fix broken LRU_RETRY behaviour")
      Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
      Cc: Dave Chinner <dchinner@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      [ Modified to make sure we never underflow the count: this function gets
        called in a loop, so the 0 -> ~0ul transition is dangerous  - Linus ]
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c56b097a
    • L
      Merge tag 'tty-3.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty · ced5d6b5
      Linus Torvalds 提交于
      Pull serial fixes from Greg KH:
       "Here are 3 tiny fixes that are needed for 3.12-final for some serial
        drivers.
      
        One of them is a revert of a broken patch, and two others are fixes
        for reported bugs.  All of these have been in linux-next for a while,
        I forgot I had not sent them to you yet, my fault"
      
      (Actually, Greg, you _had_ sent two of the three, so this pulls in just
      one actual new fix)
      
      * tag 'tty-3.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
        tty/serial: at91: fix uart/usart selection for older products
      ced5d6b5
    • L
      Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux · b8cab706
      Linus Torvalds 提交于
      Pull drm fixes from Dave Airlie:
       "Mainly Intel regression fixes and quirks, along with a simple one
        liner to fix rendernodes ioctl access (off by default, but testers
        want to test it)"
      
      * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
        drm: allow DRM_IOCTL_VERSION on render-nodes
        drm/i915: Fix the PPT fdi lane bifurcate state handling on ivb
        drm/i915: No LVDS hardware on Intel D410PT and D425KT
        drm/i915/dp: workaround BIOS eDP bpp clamping issue
        drm/i915: Add HSW CRT output readout support
        drm/i915: Add support for pipe_bpp readout
      b8cab706
    • L
      Merge tag 'sound-3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · 182b4fd9
      Linus Torvalds 提交于
      Pull sound fixes from Takashi Iwai:
       "A few small HD-audio regression fixes, mostly for stable kernels, too"
      
      * tag 'sound-3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
        ALSA: hda - Fix silent headphone on Thinkpads with AD1984A codec
        ALSA: hda - Add missing initial vmaster hook at build_controls callback
        ALSA: hda - Fix unbalanced runtime PM refcount after S3/S4
      182b4fd9
    • L
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 96d33b08
      Linus Torvalds 提交于
      Pull KVM fixes from Paolo Bonzini:
       "Fixes for the 3.12 debugfs problem - removing the duplicate directory
        name, and using a better the error code"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        KVM: use a more sensible error number when debugfs directory creation fails
        KVM: Fix modprobe failure for kvm_intel/kvm_amd
      96d33b08
    • D
      Staging: sb105x: info leak in mp_get_count() · a8b33654
      Dan Carpenter 提交于
      The icount.reserved[] array isn't initialized so it leaks stack
      information to userspace.
      Reported-by: NNico Golde <nico@ngolde.de>
      Reported-by: NFabian Yamaguchi <fabs@goesec.de>
      Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>
      Cc: stable@kernel.org
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a8b33654
    • D
      Staging: bcm: info leak in ioctl · 8d1e7225
      Dan Carpenter 提交于
      The DevInfo.u32Reserved[] array isn't initialized so it leaks kernel
      information to user space.
      Reported-by: NNico Golde <nico@ngolde.de>
      Reported-by: NFabian Yamaguchi <fabs@goesec.de>
      Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>
      Cc: stable@kernel.org
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      8d1e7225
    • D
      staging: wlags49_h2: buffer overflow setting station name · b5e2f339
      Dan Carpenter 提交于
      We need to check the length parameter before doing the memcpy().  I've
      actually changed it to strlcpy() as well so that it's NUL terminated.
      
      You need CAP_NET_ADMIN to trigger these so it's not the end of the
      world.
      Reported-by: NNico Golde <nico@ngolde.de>
      Reported-by: NFabian Yamaguchi <fabs@goesec.de>
      Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>
      Cc: stable@kernel.org
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b5e2f339
    • D
      aacraid: missing capable() check in compat ioctl · f856567b
      Dan Carpenter 提交于
      In commit d496f94d ('[SCSI] aacraid: fix security weakness') we
      added a check on CAP_SYS_RAWIO to the ioctl.  The compat ioctls need the
      check as well.
      Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>
      Cc: stable@kernel.org
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f856567b
    • D
      staging: ozwpan: prevent overflow in oz_cdev_write() · c2c65cd2
      Dan Carpenter 提交于
      We need to check "count" so we don't overflow the ei->data buffer.
      Reported-by: NNico Golde <nico@ngolde.de>
      Reported-by: NFabian Yamaguchi <fabs@goesec.de>
      Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>
      Cc: stable@kernel.org
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c2c65cd2
    • D
      uml: check length in exitcode_proc_write() · 201f99f1
      Dan Carpenter 提交于
      We don't cap the size of buffer from the user so we could write past the
      end of the array here.  Only root can write to this file.
      Reported-by: NNico Golde <nico@ngolde.de>
      Reported-by: NFabian Yamaguchi <fabs@goesec.de>
      Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>
      Cc: stable@kernel.org
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      201f99f1
  4. 30 10月, 2013 8 次提交
  5. 29 10月, 2013 4 次提交