1. 10 Aug, 2011 3 commits
    • Revert "memcg: get rid of percpu_charge_mutex lock" · 9f50fad6
      Michal Hocko committed
      This reverts commit 8521fc50.
      
      The patch incorrectly assumes that using atomic FLUSHING_CACHED_CHARGE
      bit operations is sufficient but that is not true.  Johannes Weiner has
      reported a crash during parallel memory cgroup removal:
      
        BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
        IP: [<ffffffff81083b70>] css_is_ancestor+0x20/0x70
        Oops: 0000 [#1] PREEMPT SMP
        Pid: 19677, comm: rmdir Tainted: G        W   3.0.0-mm1-00188-gf38d32b #35 ECS MCP61M-M3/MCP61M-M3
        RIP: 0010:[<ffffffff81083b70>]  css_is_ancestor+0x20/0x70
        RSP: 0018:ffff880077b09c88  EFLAGS: 00010202
        Process rmdir (pid: 19677, threadinfo ffff880077b08000, task ffff8800781bb310)
        Call Trace:
         [<ffffffff810feba3>] mem_cgroup_same_or_subtree+0x33/0x40
         [<ffffffff810feccf>] drain_all_stock+0x11f/0x170
         [<ffffffff81103211>] mem_cgroup_force_empty+0x231/0x6d0
         [<ffffffff811036c4>] mem_cgroup_pre_destroy+0x14/0x20
         [<ffffffff81080559>] cgroup_rmdir+0xb9/0x500
         [<ffffffff81114d26>] vfs_rmdir+0x86/0xe0
         [<ffffffff81114e7b>] do_rmdir+0xfb/0x110
         [<ffffffff81114ea6>] sys_rmdir+0x16/0x20
         [<ffffffff8154d76b>] system_call_fastpath+0x16/0x1b
      
      We are crashing because we try to dereference cached memcg when we are
      checking whether we should wait for draining on the cache.  The cache is
      already cleaned up, though.
      
      There is also a theoretical chance that the cached memcg gets freed
      between the test for FLUSHING_CACHED_CHARGE and its dereference in
      mem_cgroup_same_or_subtree:
      
              CPU0                    CPU1                         CPU2
        mem=stock->cached
        stock->cached=NULL
                                    clear_bit
                                                              test_and_set_bit
        test_bit()                    ...
        <preempted>             mem_cgroup_destroy
        use after free
      
      The percpu_charge_mutex protected against this race because sync
      draining is exclusive.
      
      It is safer to revert now and come up with a more parallel
      implementation later.
      Signed-off-by: Michal Hocko <mhocko@suse.cz>
      Reported-by: Johannes Weiner <jweiner@redhat.com>
      Acked-by: Johannes Weiner <jweiner@redhat.com>
      Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: stable@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • L
      47e180d6
    • slub: Fix partial count comparison confusion · 81107188
      Christoph Lameter committed
      deactivate_slab() gets the comparison wrong when it checks whether more
      than the minimum number of partial pages are on the partial list. As a
      result, empty pages may not be freed from deactivate_slab(), which could
      lead to an OOM due to growth of the partial slabs per node. Frees mostly
      occur from __slab_free, which is okay, so this would only affect use cases
      where a lot of switching around of per cpu slabs occurs.
      
      Switching per cpu slabs occurs with high frequency if debugging options are
      enabled.
      Reported-and-tested-by: Xiaotian Feng <xtfeng@gmail.com>
      Signed-off-by: Christoph Lameter <cl@linux.com>
      Signed-off-by: Pekka Enberg <penberg@kernel.org>
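The bug described above comes down to the direction of one comparison, so a small user-space sketch may help. The struct and function names below are hypothetical stand-ins rather than the kernel's definitions; the sketch only assumes that an empty slab should be discarded once the node already holds more than the minimum number of partial slabs.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the per-node and per-cache bookkeeping. */
struct demo_node  { unsigned long nr_partial;  };
struct demo_cache { unsigned long min_partial; };

/*
 * Decide whether a slab page with `inuse` live objects should be freed
 * instead of being kept on the node's partial list.
 *
 * Only an empty page is a candidate, and only when the node already
 * holds MORE than the configured minimum of partial slabs. Writing the
 * comparison the other way round ('<') would keep empty pages exactly
 * when the list is already long, which lets the list grow unchecked.
 */
bool demo_should_discard(const struct demo_node *n,
                         const struct demo_cache *s,
                         unsigned int inuse)
{
    return inuse == 0 && n->nr_partial > s->min_partial;
}
```

With `min_partial` set to 5, a node holding 9 partial slabs discards an empty page, while a node holding 2 keeps it as a cheap reserve.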
  2. 09 Aug, 2011 11 commits
  3. 08 Aug, 2011 16 commits
  4. 07 Aug, 2011 10 commits
    • ALSA: Fix dependency of CONFIG_SND_TEA575X · df944f66
      Takashi Iwai committed
      CONFIG_SND_TEA575X is enabled by RADIO_SF16FMR2, but the latter is
      not a PCI device.  Since tea575x-tuner itself is independent of the
      board bus type, the config should be moved out of the SND_PCI dependency.
      Reported-by: Randy Dunlap <rdunlap@xenotime.net>
      Acked-by: Randy Dunlap <rdunlap@xenotime.net>
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
    • ALSA: asihpi - use kzalloc() · 67ada836
      Thomas Meyer committed
       Use kzalloc rather than kmalloc followed by memset with 0
      
       This considers some simple cases that are common and easy to validate.
       Note in particular that there are no ...s in the rule, so all of the
       matched code has to be contiguous.
      
       The semantic patch that makes this output is available
       in scripts/coccinelle/api/alloc/kzalloc-simple.cocci.
      
       More information about semantic patching is available at
       http://coccinelle.lip6.fr/
      Signed-off-by: Thomas Meyer <thomas@m3y3r.de>
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
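The transformation described above replaces a `kmalloc()` call followed by `memset(..., 0, ...)` with a single `kzalloc()` call. Since kernel allocators are not available outside the kernel, the sketch below uses the closest user-space analogy, `malloc()` plus `memset()` versus `calloc()`. The struct and function names are hypothetical and are not taken from the asihpi driver.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical structure standing in for a driver-private object. */
struct demo_ctx {
    int  id;
    char name[16];
};

/* Before: allocate, then clear in a second, contiguous step. */
struct demo_ctx *demo_alloc_two_step(void)
{
    struct demo_ctx *c = malloc(sizeof(*c));

    if (c)
        memset(c, 0, sizeof(*c));
    return c;
}

/* After: one call that returns zeroed memory (or NULL on failure). */
struct demo_ctx *demo_alloc_zeroed(void)
{
    return calloc(1, sizeof(struct demo_ctx));
}
```

Both functions hand back identical all-zero objects; the second form is shorter and cannot forget the clearing step, which is the point of the semantic patch.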
    • Fix POSIX ACL permission check · 206b1d09
      Ari Savolainen committed
      After commit 3567866b: "RCUify freeing acls, let check_acl() go ahead in
      RCU mode if acl is cached" posix_acl_permission is being called with an
      unsupported flag and the permission check fails. This patch fixes the issue.
      Signed-off-by: Ari Savolainen <ari.m.savolainen@gmail.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • Merge branch 'for-linus' of git://git.open-osd.org/linux-open-osd · c2f340a6
      Linus Torvalds committed
      * 'for-linus' of git://git.open-osd.org/linux-open-osd:
        ore: Make ore its own module
        exofs: Rename raid engine from exofs/ios.c => ore
        exofs: ios: Move to a per inode components & device-table
        exofs: Move exofs specific osd operations out of ios.c
        exofs: Add offset/length to exofs_get_io_state
        exofs: Fix truncate for the raid-groups case
        exofs: Small cleanup of exofs_fill_super
        exofs: BUG: Avoid sbi realloc
        exofs: Remove pnfs-osd private definitions
        nfs_xdr: Move nfs4_string definition out of #ifdef CONFIG_NFS_V4
    • vfs: optimize inode cache access patterns · 3ddcd056
      Linus Torvalds committed
      The inode structure layout is largely random, and some of the vfs paths
      really do care.  The path lookup in particular is already quite D$
      intensive, and profiles show that accessing the 'inode->i_op->xyz'
      fields is quite costly.
      
      We already optimized the dcache to not unnecessarily load the d_op
      structure for members that are often NULL using the DCACHE_OP_xyz bits
      in dentry->d_flags, and this does something very similar for the inode
      ops that are used during pathname lookup.
      
      It also re-orders the fields so that the fields accessed by 'stat' are
      together at the beginning of the inode structure, and roughly in the
      order accessed.
      
      The effect of this seems to be in the 1-2% range for an empty kernel
      "make -j" run (which is fairly kernel-intensive, mostly in filename
      lookup), so it's visible.  The numbers are fairly noisy, though, and
      likely depend a lot on exact microarchitecture.  So there's more tuning
      to be done.
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
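The idea of caching "is this operation present?" in a flags word, so the hot path never has to load the operations table, can be sketched in user space as follows. All names here (`demo_inode`, `DEMO_HAS_LOOKUP`, `DEMO_FASTPERM` and so on) are hypothetical illustrations and are not claimed to match the fields or flags the patch actually adds.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical operations table, normally in a separate cache line. */
struct demo_ops {
    int (*lookup)(int);
    int (*permission)(int);
};

#define DEMO_HAS_LOOKUP 0x0001u  /* ->lookup is non-NULL      */
#define DEMO_FASTPERM   0x0002u  /* no ->permission hook set  */

/* Hypothetical inode: the flags sit next to the other hot fields. */
struct demo_inode {
    unsigned short         opflags;
    const struct demo_ops *ops;
};

/* Trivial stub so the sketch has a non-NULL lookup to point at. */
int demo_lookup_stub(int x) { return x; }

/* Record once, at setup time, which hooks exist. */
void demo_inode_init(struct demo_inode *inode, const struct demo_ops *ops)
{
    inode->ops = ops;
    inode->opflags = 0;
    if (ops->lookup)
        inode->opflags |= DEMO_HAS_LOOKUP;
    if (!ops->permission)
        inode->opflags |= DEMO_FASTPERM;
}

/* Hot path: answers from the inode itself, without touching *ops. */
bool demo_can_lookup(const struct demo_inode *inode)
{
    return (inode->opflags & DEMO_HAS_LOOKUP) != 0;
}
```

The trade-off is a little extra state that must be kept consistent with the operations table in exchange for one fewer dependent memory load per path component.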
    • vfs: renumber DCACHE_xyz flags, remove some stale ones · 830c0f0e
      Linus Torvalds committed
      Gcc tends to generate better code with small integers, including the
      DCACHE_xyz flag tests - so move the common ones to be first in the list.
      Also just remove the unused DCACHE_INOTIFY_PARENT_WATCHED and
      DCACHE_AUTOFS_PENDING values, their users no longer exist in the source
      tree.
      
      And add a "unlikely()" to the DCACHE_OP_COMPARE test, since we want the
      common case to be a nice straight-line fall-through.
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
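A rough user-space illustration of the two ideas in this message: frequently tested flags get the smallest values, and the rarely taken branch is marked as unlikely so the common case falls straight through. The names are hypothetical, and `demo_unlikely()` is written directly on top of the GCC/Clang builtin `__builtin_expect` rather than the kernel's own `unlikely()` macro.

```c
#include <ctype.h>
#include <string.h>

/* Branch hint, assuming a GCC- or Clang-compatible compiler. */
#define demo_unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical flags: the commonly tested ones get the low bits. */
#define DEMO_OP_HASH    0x0001u
#define DEMO_OP_COMPARE 0x0002u

struct demo_dentry {
    unsigned int flags;
    const char  *name;
    int (*compare)(const char *a, const char *b); /* rarely set */
};

/* Example custom comparison: case-insensitive, strcmp-style result. */
int demo_compare_nocase(const char *a, const char *b)
{
    while (*a && tolower((unsigned char)*a) == tolower((unsigned char)*b)) {
        a++;
        b++;
    }
    return tolower((unsigned char)*a) - tolower((unsigned char)*b);
}

/* Returns non-zero when `name` matches the dentry's name. */
int demo_name_matches(const struct demo_dentry *d, const char *name)
{
    /* Rare case: the filesystem supplies its own comparison. */
    if (demo_unlikely(d->flags & DEMO_OP_COMPARE))
        return d->compare(d->name, name) == 0;

    /* Common case: plain byte comparison, straight-line code. */
    return strcmp(d->name, name) == 0;
}
```

The hint changes only code layout, not behaviour, so both branches still return correct results when the prediction is wrong.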
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 7cd4767e
      Linus Torvalds committed
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
        net: Compute protocol sequence numbers and fragment IDs using MD5.
        crypto: Move md5_transform to lib/md5.c
    • ore: Make ore its own module · cf283ade
      Boaz Harrosh committed
      Export everything from ore that needs exporting. Change Kbuild and
      Kconfig to build ore.ko as an independent module. Import ore from exofs.
      Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
    • exofs: Rename raid engine from exofs/ios.c => ore · 8ff660ab
      Boaz Harrosh committed
      ORE stands for "Objects Raid Engine"
      
      This patch is a mechanical rename of everything that was in ios.c
      and its API declaration to an ore.c and an osd_ore.h header. The ore
      engine will later be used by the pnfs objects layout driver.
      
      * File ios.c => ore.c
      
      * Declaration of types and API are moved from exofs.h to a new
        osd_ore.h
      
      * All used types are prefixed by ore_ from their exofs_ name.
      
      * Shift includes from exofs.h to osd_ore.h so osd_ore.h is
        independent, include it from exofs.h.
      
      Other than a pure rename there are no other changes. Next patch
      will move the ore into its own module and will export the API
      to be used by exofs and later the layout driver.
      Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
    • exofs: ios: Move to a per inode components & device-table · 9e9db456
      Boaz Harrosh committed
      Exofs raid engine was saving on memory space by having a single layout-info,
      single pid, and a single device-table, global to the filesystem. Then passing
      a credential and object_id info at the io_state level, private for each
      inode. It would also devise this contraption of rotating the device table
      view for each inode->ino to spread out the device usage.
      
      This is not compatible with the pnfs-objects standard, demanding that
      each inode can have its own layout-info, device-table, and each object
      component its own pid, oid and creds.
      
      So: Bring exofs raid engine to be usable for generic pnfs-objects use by:
      
      * Define an exofs_comp structure that holds obj_id and credential info.
      
      * Break up exofs_layout struct to an exofs_components structure that holds a
        possible array of exofs_comp and the array of devices + the size of the
        arrays.
      
      * Add a "comps" parameter to get_io_state() that specifies the ids creds
        and device array to use for each IO.
      
        This makes it possible to keep the layout global, but the device-table
        view, creds and IDs at the inode level. It only adds two 64-bit values
        to each inode, since some of these members already existed in another
        form.
      
      * ios raid engine now accesses layout-info and comps-info through the passed
        pointers. Everything is pre-prepared by caller for generic access of
        these structures and arrays.
      
      At the exofs Level:
      
      * Super block holds an exofs_components struct that holds the device
        array, previously in layout. The devices there are in device-table
        order. The device-array is twice as big and repeats the device-table
        twice so now each inode's device array can point to a random device
        and have a round-robin view of the table, making it compatible with
        previous exofs versions.
      
      * Each inode has an exofs_components struct that is initialized at
        load time, with its own view of the device table IDs and creds.
        When doing IO this gets passed to the io_state together with the
        layout.
      
      While performing this change, bugs were found where credentials with the
      wrong IDs were used to access the different SB objects (super.c), as well
      as some dead code. It was never noticed because the target we use does not
      check the credentials.
      Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
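The doubled device array mentioned above can be sketched as follows: storing the table twice back to back lets every inode take a pointer into the first half and read a full, rotated copy without any wrap-around arithmetic. The names, the fixed table size, and the choice of `ino % numdevs` as the starting offset are assumptions made for illustration and are not taken from the exofs source.

```c
#include <stddef.h>

#define DEMO_NUM_DEVS 4  /* hypothetical table size */

/* Hypothetical super-block side: the device table stored twice. */
struct demo_sb_comps {
    unsigned int numdevs;
    int          devs[2 * DEMO_NUM_DEVS];
};

/* Copy the table into both halves of the array. */
void demo_sb_init(struct demo_sb_comps *sb, const int table[DEMO_NUM_DEVS])
{
    unsigned int i;

    sb->numdevs = DEMO_NUM_DEVS;
    for (i = 0; i < DEMO_NUM_DEVS; i++) {
        sb->devs[i] = table[i];
        sb->devs[DEMO_NUM_DEVS + i] = table[i];
    }
}

/*
 * Per-inode view: a pointer into the first half. Reading numdevs
 * entries from it yields the whole table rotated by the offset,
 * because the second half continues where the first one ends.
 * The offset rule (ino % numdevs) is an assumption of this sketch.
 */
const int *demo_inode_dev_view(const struct demo_sb_comps *sb,
                               unsigned long ino)
{
    return &sb->devs[ino % sb->numdevs];
}
```

For a table `{10, 20, 30, 40}`, an inode numbered 6 starts at offset 2 and sees the devices in the order 30, 40, 10, 20, at the cost of one extra copy of the table per super block.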