1. 09 7月, 2012 4 次提交
  2. 02 7月, 2012 4 次提交
  3. 20 6月, 2012 3 次提交
    • J
      slub: refactoring unfreeze_partials() · 43d77867
      Joonsoo Kim 提交于
      Current implementation of unfreeze_partials() is so complicated,
      but benefit from it is insignificant. In addition many code in
      do {} while loop have a bad influence to a fail rate of cmpxchg_double_slab.
      Under current implementation which test status of cpu partial slab
      and acquire list_lock in do {} while loop,
      we don't need to acquire a list_lock and gain a little benefit
      when front of the cpu partial slab is to be discarded, but this is a rare case.
      In case that add_partial is performed and cmpxchg_double_slab is failed,
      remove_partial should be called case by case.
      
      I think that these are disadvantages of current implementation,
      so I do refactoring unfreeze_partials().
      
      Minimizing code in do {} while loop introduce a reduced fail rate
      of cmpxchg_double_slab. Below is output of 'slabinfo -r kmalloc-256'
      when './perf stat -r 33 hackbench 50 process 4000 > /dev/null' is done.
      
      ** before **
      Cmpxchg_double Looping
      ------------------------
      Locked Cmpxchg Double redos   182685
      Unlocked Cmpxchg Double redos 0
      
      ** after **
      Cmpxchg_double Looping
      ------------------------
      Locked Cmpxchg Double redos   177995
      Unlocked Cmpxchg Double redos 1
      
      We can see cmpxchg_double_slab fail rate is improved slightly.
      
      Bolow is output of './perf stat -r 30 hackbench 50 process 4000 > /dev/null'.
      
      ** before **
       Performance counter stats for './hackbench 50 process 4000' (30 runs):
      
           108517.190463 task-clock                #    7.926 CPUs utilized            ( +-  0.24% )
               2,919,550 context-switches          #    0.027 M/sec                    ( +-  3.07% )
                 100,774 CPU-migrations            #    0.929 K/sec                    ( +-  4.72% )
                 124,201 page-faults               #    0.001 M/sec                    ( +-  0.15% )
         401,500,234,387 cycles                    #    3.700 GHz                      ( +-  0.24% )
         <not supported> stalled-cycles-frontend
         <not supported> stalled-cycles-backend
         250,576,913,354 instructions              #    0.62  insns per cycle          ( +-  0.13% )
          45,934,956,860 branches                  #  423.297 M/sec                    ( +-  0.14% )
             188,219,787 branch-misses             #    0.41% of all branches          ( +-  0.56% )
      
            13.691837307 seconds time elapsed                                          ( +-  0.24% )
      
      ** after **
       Performance counter stats for './hackbench 50 process 4000' (30 runs):
      
           107784.479767 task-clock                #    7.928 CPUs utilized            ( +-  0.22% )
               2,834,781 context-switches          #    0.026 M/sec                    ( +-  2.33% )
                  93,083 CPU-migrations            #    0.864 K/sec                    ( +-  3.45% )
                 123,967 page-faults               #    0.001 M/sec                    ( +-  0.15% )
         398,781,421,836 cycles                    #    3.700 GHz                      ( +-  0.22% )
         <not supported> stalled-cycles-frontend
         <not supported> stalled-cycles-backend
         250,189,160,419 instructions              #    0.63  insns per cycle          ( +-  0.09% )
          45,855,370,128 branches                  #  425.436 M/sec                    ( +-  0.10% )
             169,881,248 branch-misses             #    0.37% of all branches          ( +-  0.43% )
      
            13.596272341 seconds time elapsed                                          ( +-  0.22% )
      
      No regression is found, but rather we can see slightly better result.
      Acked-by: NChristoph Lameter <cl@linux.com>
      Signed-off-by: NJoonsoo Kim <js1304@gmail.com>
      Signed-off-by: NPekka Enberg <penberg@kernel.org>
      43d77867
    • J
      slub: use __cmpxchg_double_slab() at interrupt disabled place · d24ac77f
      Joonsoo Kim 提交于
      get_freelist(), unfreeze_partials() are only called with interrupt disabled,
      so __cmpxchg_double_slab() is suitable.
      Acked-by: NChristoph Lameter <cl@linux.com>
      Signed-off-by: NJoonsoo Kim <js1304@gmail.com>
      Signed-off-by: NPekka Enberg <penberg@kernel.org>
      d24ac77f
    • A
      slab/mempolicy: always use local policy from interrupt context · e7b691b0
      Andi Kleen 提交于
      slab_node() could access current->mempolicy from interrupt context.
      However there's a race condition during exit where the mempolicy
      is first freed and then the pointer zeroed.
      
      Using this from interrupts seems bogus anyways. The interrupt
      will interrupt a random process and therefore get a random
      mempolicy. Many times, this will be idle's, which noone can change.
      
      Just disable this here and always use local for slab
      from interrupts. I also cleaned up the callers of slab_node a bit
      which always passed the same argument.
      
      I believe the original mempolicy code did that in fact,
      so it's likely a regression.
      
      v2: send version with correct logic
      v3: simplify. fix typo.
      Reported-by: NArun Sharma <asharma@fb.com>
      Cc: penberg@kernel.org
      Cc: cl@linux.com
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      [tdmackey@twitter.com: Rework control flow based on feedback from
      cl@linux.com, fix logic, and cleanup current task_struct reference]
      Acked-by: NDavid Rientjes <rientjes@google.com>
      Acked-by: NChristoph Lameter <cl@linux.com>
      Acked-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Signed-off-by: NDavid Mackey <tdmackey@twitter.com>
      Signed-off-by: NPekka Enberg <penberg@kernel.org>
      e7b691b0
  4. 14 6月, 2012 7 次提交
  5. 02 6月, 2012 1 次提交
    • J
      fs: introduce inode operation ->update_time · c3b2da31
      Josef Bacik 提交于
      Btrfs has to make sure we have space to allocate new blocks in order to modify
      the inode, so updating time can fail.  We've gotten around this by having our
      own file_update_time but this is kind of a pain, and Christoph has indicated he
      would like to make xfs do something different with atime updates.  So introduce
      ->update_time, where we will deal with i_version an a/m/c time updates and
      indicate which changes need to be made.  The normal version just does what it
      has always done, updates the time and marks the inode dirty, and then
      filesystems can choose to do something different.
      
      I've gone through all of the users of file_update_time and made them check for
      errors with the exception of the fault code since it's complicated and I wasn't
      quite sure what to do there, also Jan is going to be pushing the file time
      updates into page_mkwrite for those who have it so that should satisfy btrfs and
      make it not a big deal to check the file_update_time() return code in the
      generic fault path. Thanks,
      Signed-off-by: NJosef Bacik <josef@redhat.com>
      c3b2da31
  6. 01 6月, 2012 17 次提交
  7. 31 5月, 2012 3 次提交
  8. 30 5月, 2012 1 次提交
    • D
      mm: fix vma_resv_map() NULL pointer · 4523e145
      Dave Hansen 提交于
      hugetlb_reserve_pages() can be used for either normal file-backed
      hugetlbfs mappings, or MAP_HUGETLB.  In the MAP_HUGETLB, semi-anonymous
      mode, there is not a VMA around.  The new call to resv_map_put() assumed
      that there was, and resulted in a NULL pointer dereference:
      
        BUG: unable to handle kernel NULL pointer dereference at 0000000000000030
        IP: vma_resv_map+0x9/0x30
        PGD 141453067 PUD 1421e1067 PMD 0
        Oops: 0000 [#1] PREEMPT SMP
        ...
        Pid: 14006, comm: trinity-child6 Not tainted 3.4.0+ #36
        RIP: vma_resv_map+0x9/0x30
        ...
        Process trinity-child6 (pid: 14006, threadinfo ffff8801414e0000, task ffff8801414f26b0)
        Call Trace:
          resv_map_put+0xe/0x40
          hugetlb_reserve_pages+0xa6/0x1d0
          hugetlb_file_setup+0x102/0x2c0
          newseg+0x115/0x360
          ipcget+0x1ce/0x310
          sys_shmget+0x5a/0x60
          system_call_fastpath+0x16/0x1b
      
      This was reported by Dave Jones, but was reproducible with the
      libhugetlbfs test cases, so shame on me for not running them in the
      first place.
      
      With this, the oops is gone, and the output of libhugetlbfs's
      run_tests.py is identical to plain 3.4 again.
      
      [ Marked for stable, since this was introduced by commit c50ac050
        ("hugetlb: fix resv_map leak in error path") which was also marked for
        stable ]
      Reported-by: NDave Jones <davej@redhat.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: <stable@vger.kernel.org>        [2.6.32+]
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      4523e145