1. 27 3月, 2012 4 次提交
    • J
      Btrfs: set page->private to the eb · 4f2de97a
      Josef Bacik 提交于
      We spend a lot of time looking up extent buffers from pages when we could just
      store the pointer to the eb the page is associated with in page->private.  This
      patch does just that, and it makes things a little simpler and reduces a bit of
      CPU overhead involved with doing metadata IO.  Thanks,
      Signed-off-by: NJosef Bacik <josef@redhat.com>
      4f2de97a
    • C
      Btrfs: allow metadata blocks larger than the page size · 727011e0
      Chris Mason 提交于
      A few years ago the btrfs code to support blocks lager than
      the page size was disabled to fix a few corner cases in the
      page cache handling.  This fixes the code to properly support
      large metadata blocks again.
      
      Since current kernels will crash early and often with larger
      metadata blocks, this adds an incompat bit so that older kernels
      can't mount it.
      
      This also does away with different blocksizes for nodes and leaves.
      You get a single block size for all tree blocks.
      Signed-off-by: NChris Mason <chris.mason@oracle.com>
      727011e0
    • J
      Btrfs: remove search_start and search_end from find_free_extent and callers · 81c9ad23
      Josef Bacik 提交于
      We have been passing nothing but (u64)-1 to find_free_extent for search_end in
      all of the callers, so it's completely useless, and we've always been passing 0
      in as search_start, so just remove them as function arguments and move
      search_start into find_free_extent.  Thanks,
      Signed-off-by: NJosef Bacik <josef@redhat.com>
      81c9ad23
    • J
      Btrfs: remove the ideal caching code · 285ff5af
      Josef Bacik 提交于
      This is a relic from before we had the disk space cache and it was to make
      bootup times when you had btrfs as root not be so damned slow.  Now that we have
      the disk space cache this isn't a problem anymore and really having this code
      casues uneeded fragmentation and complexity, so just remove it.  Thanks,
      Signed-off-by: NJosef Bacik <josef@redhat.com>
      285ff5af
  2. 19 3月, 2012 1 次提交
  3. 17 3月, 2012 4 次提交
  4. 11 3月, 2012 5 次提交
  5. 10 3月, 2012 2 次提交
    • A
      aio: fix the "too late munmap()" race · c7b28555
      Al Viro 提交于
      Current code has put_ioctx() called asynchronously from aio_fput_routine();
      that's done *after* we have killed the request that used to pin ioctx,
      so there's nothing to stop io_destroy() waiting in wait_for_all_aios()
      from progressing.  As the result, we can end up with async call of
      put_ioctx() being the last one and possibly happening during exit_mmap()
      or elf_core_dump(), neither of which expects stray munmap() being done
      to them...
      
      We do need to prevent _freeing_ ioctx until aio_fput_routine() is done
      with that, but that's all we care about - neither io_destroy() nor
      exit_aio() will progress past wait_for_all_aios() until aio_fput_routine()
      does really_put_req(), so the ioctx teardown won't be done until then
      and we don't care about the contents of ioctx past that point.
      
      Since actual freeing of these suckers is RCU-delayed, we don't need to
      bump ioctx refcount when request goes into list for async removal.
      All we need is rcu_read_lock held just over the ->ctx_lock-protected
      area in aio_fput_routine().
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      Reviewed-by: NJeff Moyer <jmoyer@redhat.com>
      Acked-by: NBenjamin LaHaise <bcrl@kvack.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c7b28555
    • A
      aio: fix io_setup/io_destroy race · 86b62a2c
      Al Viro 提交于
      Have ioctx_alloc() return an extra reference, so that caller would drop it
      on success and not bother with re-grabbing it on failure exit.  The current
      code is obviously broken - io_destroy() from another thread that managed
      to guess the address io_setup() would've returned would free ioctx right
      under us; gets especially interesting if aio_context_t * we pass to
      io_setup() points to PROT_READ mapping, so put_user() fails and we end
      up doing io_destroy() on kioctx another thread has just got freed...
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      Acked-by: NBenjamin LaHaise <bcrl@kvack.org>
      Reviewed-by: NJeff Moyer <jmoyer@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      86b62a2c
  6. 07 3月, 2012 2 次提交
  7. 06 3月, 2012 4 次提交
  8. 05 3月, 2012 1 次提交
    • L
      vfs: move dentry_cmp from <linux/dcache.h> to fs/dcache.c · 5483f18e
      Linus Torvalds 提交于
      It's only used inside fs/dcache.c, and we're going to play games with it
      for the word-at-a-time patches.  This time we really don't even want to
      export it, because it really is an internal function to fs/dcache.c, and
      has been since it was introduced.
      
      Having it in that extremely hot header file (it's included in pretty
      much everything, thanks to <linux/fs.h>) is a disaster for testing
      different versions, and is utterly pointless.
      
      We really should have some kind of header file diet thing, where we
      figure out which parts of header files are really better off private and
      only result in more expensive compiles.
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      5483f18e
  9. 03 3月, 2012 7 次提交
  10. 02 3月, 2012 1 次提交
  11. 29 2月, 2012 1 次提交
    • R
      ecryptfs: fix printk format warning for size_t · 164974a8
      Randy Dunlap 提交于
      Fix printk format warning (from Linus's suggestion):
      
      on i386:
        fs/ecryptfs/miscdev.c:433:38: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'unsigned int'
      
      and on x86_64:
        fs/ecryptfs/miscdev.c:433:38: warning: format '%u' expects type 'unsigned int', but argument 4 has type 'long unsigned int'
      Signed-off-by: NRandy Dunlap <rdunlap@xenotime.net>
      Cc:	Geert Uytterhoeven <geert@linux-m68k.org>
      Cc:	Tyler Hicks <tyhicks@canonical.com>
      Cc:	Dustin Kirkland <dustin.kirkland@gazzang.com>
      Cc:	ecryptfs@vger.kernel.org
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      164974a8
  12. 28 2月, 2012 4 次提交
  13. 27 2月, 2012 2 次提交
  14. 26 2月, 2012 1 次提交
    • I
      autofs: work around unhappy compat problem on x86-64 · a32744d4
      Ian Kent 提交于
      When the autofs protocol version 5 packet type was added in commit
      5c0a32fc ("autofs4: add new packet type for v5 communications"), it
      obvously tried quite hard to be word-size agnostic, and uses explicitly
      sized fields that are all correctly aligned.
      
      However, with the final "char name[NAME_MAX+1]" array at the end, the
      actual size of the structure ends up being not very well defined:
      because the struct isn't marked 'packed', doing a "sizeof()" on it will
      align the size of the struct up to the biggest alignment of the members
      it has.
      
      And despite all the members being the same, the alignment of them is
      different: a "__u64" has 4-byte alignment on x86-32, but native 8-byte
      alignment on x86-64.  And while 'NAME_MAX+1' ends up being a nice round
      number (256), the name[] array starts out a 4-byte aligned.
      
      End result: the "packed" size of the structure is 300 bytes: 4-byte, but
      not 8-byte aligned.
      
      As a result, despite all the fields being in the same place on all
      architectures, sizeof() will round up that size to 304 bytes on
      architectures that have 8-byte alignment for u64.
      
      Note that this is *not* a problem for 32-bit compat mode on POWER, since
      there __u64 is 8-byte aligned even in 32-bit mode.  But on x86, 32-bit
      and 64-bit alignment is different for 64-bit entities, and as a result
      the structure that has exactly the same layout has different sizes.
      
      So on x86-64, but no other architecture, we will just subtract 4 from
      the size of the structure when running in a compat task.  That way we
      will write the properly sized packet that user mode expects.
      
      Not pretty.  Sadly, this very subtle, and unnecessary, size difference
      has been encoded in user space that wants to read packets of *exactly*
      the right size, and will refuse to touch anything else.
      Reported-and-tested-by: NThomas Meyer <thomas@m3y3r.de>
      Signed-off-by: NIan Kent <raven@themaw.net>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a32744d4
  15. 25 2月, 2012 1 次提交
    • O
      epoll: ep_unregister_pollwait() can use the freed pwq->whead · 971316f0
      Oleg Nesterov 提交于
      signalfd_cleanup() ensures that ->signalfd_wqh is not used, but
      this is not enough. eppoll_entry->whead still points to the memory
      we are going to free, ep_unregister_pollwait()->remove_wait_queue()
      is obviously unsafe.
      
      Change ep_poll_callback(POLLFREE) to set eppoll_entry->whead = NULL,
      change ep_unregister_pollwait() to check pwq->whead != NULL under
      rcu_read_lock() before remove_wait_queue(). We add the new helper,
      ep_remove_wait_queue(), for this.
      
      This works because sighand_cachep is SLAB_DESTROY_BY_RCU and because
      ->signalfd_wqh is initialized in sighand_ctor(), not in copy_sighand.
      ep_unregister_pollwait()->remove_wait_queue() can play with already
      freed and potentially reused ->sighand, but this is fine. This memory
      must have the valid ->signalfd_wqh until rcu_read_unlock().
      Reported-by: NMaxime Bizon <mbizon@freebox.fr>
      Cc: <stable@kernel.org>
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      971316f0