1. 07 1月, 2011 2 次提交
    • N
      fs: use fast counters for vfs caches · 3e880fb5
      Nick Piggin 提交于
      percpu_counter library generates quite nasty code, so unless you need
      to dynamically allocate counters or take fast approximate value, a
      simple per cpu set of counters is much better.
      
      The percpu_counter can never be made to work as well, because it has an
      indirection from pointer to percpu memory, and it can't use direct
      this_cpu_inc interfaces because it doesn't use static PER_CPU data, so
      code will always be worse.
      
      In the fastpath, it is the difference between this:
      
              incl %gs:nr_dentry      # nr_dentry
      
      and this:
      
              movl    percpu_counter_batch(%rip), %edx        # percpu_counter_batch,
              movl    $1, %esi        #,
              movq    $nr_dentry, %rdi        #,
              call    __percpu_counter_add    # (plus I clobber registers)
      
      __percpu_counter_add:
              pushq   %rbp    #
              movq    %rsp, %rbp      #,
              subq    $32, %rsp       #,
              movq    %rbx, -24(%rbp) #,
              movq    %r12, -16(%rbp) #,
              movq    %r13, -8(%rbp)  #,
              movq    %rdi, %rbx      # fbc, fbc
      #APP
      # 216 "/home/npiggin/usr/src/linux-2.6/arch/x86/include/asm/thread_info.h" 1
              movq %gs:kernel_stack,%rax      #, pfo_ret__
      # 0 "" 2
      #NO_APP
              incl    -8124(%rax)     # <variable>.preempt_count
              movq    32(%rdi), %r12  # <variable>.counters, tcp_ptr__
      #APP
      # 78 "lib/percpu_counter.c" 1
              add %gs:this_cpu_off, %r12      # this_cpu_off, tcp_ptr__
      # 0 "" 2
      #NO_APP
              movslq  (%r12),%r13     #* tcp_ptr__, tmp73
              movslq  %edx,%rax       # batch, batch
              addq    %rsi, %r13      # amount, count
              cmpq    %rax, %r13      # batch, count
              jge     .L27    #,
              negl    %edx    # tmp76
              movslq  %edx,%rdx       # tmp76, tmp77
              cmpq    %rdx, %r13      # tmp77, count
              jg      .L28    #,
      .L27:
              movq    %rbx, %rdi      # fbc,
              call    _raw_spin_lock  #
              addq    %r13, 8(%rbx)   # count, <variable>.count
              movq    %rbx, %rdi      # fbc,
              movl    $0, (%r12)      #,* tcp_ptr__
              call    _raw_spin_unlock        #
      .L29:
      #APP
      # 216 "/home/npiggin/usr/src/linux-2.6/arch/x86/include/asm/thread_info.h" 1
              movq %gs:kernel_stack,%rax      #, pfo_ret__
      # 0 "" 2
      #NO_APP
              decl    -8124(%rax)     # <variable>.preempt_count
              movq    -8136(%rax), %rax       #, D.14625
              testb   $8, %al #, D.14625
              jne     .L32    #,
      .L31:
              movq    -24(%rbp), %rbx #,
              movq    -16(%rbp), %r12 #,
              movq    -8(%rbp), %r13  #,
              leave
              ret
              .p2align 4,,10
              .p2align 3
      .L28:
              movl    %r13d, (%r12)   # count,*
              jmp     .L29    #
      .L32:
              call    preempt_schedule        #
              .p2align 4,,6
              jmp     .L31    #
              .size   __percpu_counter_add, .-__percpu_counter_add
              .p2align 4,,15
      Signed-off-by: NNick Piggin <npiggin@kernel.dk>
      3e880fb5
    • N
      vfs: revert per-cpu nr_unused counters for dentry and inodes · 86c8749e
      Nick Piggin 提交于
      The nr_unused counters count the number of objects on an LRU, and as such they
      are synchronized with LRU object insertion and removal and scanning, and
      protected under the LRU lock.
      
      Making it per-cpu does not actually get any concurrency improvements because of
      this lock, and summing the counter is much slower, and
      incrementing/decrementing it costs more code size and is slower too.
      
      These counters should stay per-LRU, which currently means global.
      Signed-off-by: NNick Piggin <npiggin@kernel.dk>
      86c8749e
  2. 27 10月, 2010 1 次提交
    • E
      IMA: move read counter into struct inode · a178d202
      Eric Paris 提交于
      IMA currently allocated an inode integrity structure for every inode in
      core.  This stucture is about 120 bytes long.  Most files however
      (especially on a system which doesn't make use of IMA) will never need
      any of this space.  The problem is that if IMA is enabled we need to
      know information about the number of readers and the number of writers
      for every inode on the box.  At the moment we collect that information
      in the per inode iint structure and waste the rest of the space.  This
      patch moves those counters into the struct inode so we can eventually
      stop allocating an IMA integrity structure except when absolutely
      needed.
      
      This patch does the minimum needed to move the location of the data.
      Further cleanups, especially the location of counter updates, may still
      be possible.
      Signed-off-by: NEric Paris <eparis@redhat.com>
      Acked-by: NMimi Zohar <zohar@linux.vnet.ibm.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a178d202
  3. 26 10月, 2010 18 次提交
  4. 10 8月, 2010 12 次提交
  5. 28 7月, 2010 2 次提交
  6. 19 7月, 2010 1 次提交
    • D
      mm: add context argument to shrinker callback · 7f8275d0
      Dave Chinner 提交于
      The current shrinker implementation requires the registered callback
      to have global state to work from. This makes it difficult to shrink
      caches that are not global (e.g. per-filesystem caches). Pass the shrinker
      structure to the callback so that users can embed the shrinker structure
      in the context the shrinker needs to operate on and get back to it in the
      callback via container_of().
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      7f8275d0
  7. 22 5月, 2010 2 次提交
  8. 12 4月, 2010 1 次提交
  9. 05 3月, 2010 1 次提交
    • C
      dquot: move dquot initialization responsibility into the filesystem · 907f4554
      Christoph Hellwig 提交于
      Currently various places in the VFS call vfs_dq_init directly.  This means
      we tie the quota code into the VFS.  Get rid of that and make the
      filesystem responsible for the initialization.   For most metadata operations
      this is a straight forward move into the methods, but for truncate and
      open it's a bit more complicated.
      
      For truncate we currently only call vfs_dq_init for the sys_truncate case
      because open already takes care of it for ftruncate and open(O_TRUNC) - the
      new code causes an additional vfs_dq_init for those which is harmless.
      
      For open the initialization is moved from do_filp_open into the open method,
      which means it happens slightly earlier now, and only for regular files.
      The latter is fine because we don't need to initialize it for operations
      on special files, and we already do it as part of the namespace operations
      for directories.
      
      Add a dquot_file_open helper that filesystems that support generic quotas
      can use to fill in ->open.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NJan Kara <jack@suse.cz>
      907f4554