1. 27 Mar, 2012 14 commits
    • Btrfs: stop silently switching single chunks to raid0 on balance · e3176ca2
      Ilya Dryomov committed
      This has been causing a lot of confusion for quite a while now and a lot
      of users were surprised by this (some of them were even stuck in a
      ENOSPC situation which they couldn't easily get out of).  The addition
      of restriper gives users a clear choice between raid0 and drive concat
      setup so there's absolutely no excuse for us to keep doing this.
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
    • Btrfs: deal with read errors on extent buffers differently · ea466794
      Josef Bacik committed
      Since we need to read and write extent buffers in their entirety we can't use
      the normal bio_readpage_error stuff since it only works on a per page basis.  So
      instead make it so that if we see an io error in endio we just mark the eb as
      having an IO error and then in btree_read_extent_buffer_pages we will manually
      try other mirrors and then overwrite the bad mirror if we find a good copy.
      This works with larger than page size blocks.  Thanks,
      Signed-off-by: Josef Bacik <josef@redhat.com>
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
    • Btrfs: don't use threaded IO completion helpers for metadata writes · f3f266ab
      Chris Mason committed
      The metadata write IO completion code is now simple enough that we
      don't need the threaded helpers anymore.
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
    • Btrfs: adjust the write_lock_level as we unlock · f7c79f30
      Chris Mason committed
      btrfs_search_slot sometimes needs write locks on high levels of
      the tree.  It remembers the highest level that needs a write lock
      and will use that for all future searches through the tree in a given
      call.
      
      But, very often we'll just cow the top level or the level below and we
      won't really need write locks on the root again after that.  This patch
      changes things to adjust the write lock requirement as it unlocks
      levels.
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
    • Btrfs: loop waiting on writeback · a098d8e8
      Chris Mason committed
      lock_extent_buffer_for_io needs to loop around and make sure the
      writeback bits are not set.
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
    • Btrfs: add the ability to cache a pointer into the eb · cfed81a0
      Chris Mason committed
      This cuts down on the CPU time used by map_private_extent_buffer
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
    • Btrfs: ensure an entire eb is written at once · 0b32f4bb
      Josef Bacik committed
      This patch simplifies how we track our extent buffers.  Previously we could exit
      writepages with only having written half of an extent buffer, which meant we had
      to track the state of the pages and the state of the extent buffers differently.
      Now we only read in entire extent buffers and write out entire extent buffers,
      this allows us to simply set bits in our bflags to indicate the state of the eb
      and we no longer have to do things like track uptodate with our iotree.  Thanks,
      Signed-off-by: Josef Bacik <josef@redhat.com>
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
    • Btrfs: introduce mark_extent_buffer_accessed · 5df4235e
      Josef Bacik committed
      Because an eb can have multiple pages we need to make sure that all pages within
      the eb are marked as accessed, since releasepage can be called against any page
      in the eb.  This will keep us from possibly evicting hot eb's when we're doing
      larger than pagesize eb's.  Thanks,
      Signed-off-by: Josef Bacik <josef@redhat.com>
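The idea in 5df4235e, that every page backing a multi-page eb has to be marked as accessed, can be illustrated with a small userspace sketch. The types `page_sketch` and `eb_pages_sketch` and the function `mark_eb_accessed_sketch` are hypothetical stand-ins invented for this illustration; they are not the kernel's structures or the patch's code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a page: only the "recently accessed" flag matters here. */
struct page_sketch {
    bool accessed;
};

/* Hypothetical stand-in for an extent buffer that spans several pages. */
struct eb_pages_sketch {
    size_t num_pages;
    struct page_sketch *pages;
};

/*
 * Mark every page of the buffer as accessed. If only the first page were
 * marked, a reclaim decision made against any other page would still see
 * a cold page and could evict a buffer that is actually hot.
 */
static void mark_eb_accessed_sketch(struct eb_pages_sketch *eb)
{
    for (size_t i = 0; i < eb->num_pages; i++)
        eb->pages[i].accessed = true;
}
```

The loop is the whole point: the accessed state is kept per page, so a buffer larger than one page needs all of its pages touched.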
    • Btrfs: introduce free_extent_buffer_stale · 3083ee2e
      Josef Bacik committed
      Because btrfs cow's we can end up with extent buffers that are no longer
      necessary just sitting around in memory.  So instead of evicting these pages, we
      could end up evicting things we actually care about.  Thus we have
      free_extent_buffer_stale for use when we are freeing tree blocks.  This will
      make it so that the ref for the eb being in the radix tree is dropped as soon as
      possible and then is freed when the refcount hits 0 instead of waiting to be
      released by releasepage.  Thanks,
      Signed-off-by: Josef Bacik <josef@redhat.com>
    • Btrfs: only use the existing eb if it's count isn't 0 · 115391d2
      Josef Bacik committed
      We can run into a problem where we find an eb for our existing page already on
      the radix tree but it has a ref count of 0.  It hasn't been removed by RCU
      yet, so this can cause issues where we will use the EB after free.  So do
      atomic_inc_not_zero on the exists->refs and if it is zero just do
      synchronize_rcu() and try again.  We won't have to worry about new allocators
      coming in since they will block on the page lock at this point.  Thanks,
      Signed-off-by: Josef Bacik <josef@redhat.com>
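The "increment only if not zero" step described in 115391d2 can be sketched in userspace with C11 atomics. This is an illustration of the general pattern, not the kernel's `atomic_inc_not_zero` implementation; `eb_ref_sketch` and `ref_inc_not_zero_sketch` are hypothetical names, and the RCU retry that the commit message mentions is left out.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for an extent buffer; only the refcount matters here. */
struct eb_ref_sketch {
    atomic_int refs;
};

/*
 * Take a reference only if the count is still non-zero.
 * Returns true if a reference was taken, false if the count was already 0,
 * meaning the object is being torn down and must not be reused.
 */
static bool ref_inc_not_zero_sketch(atomic_int *refs)
{
    int old = atomic_load(refs);

    while (old != 0) {
        /* On failure the compare-exchange reloads 'old' with the current value. */
        if (atomic_compare_exchange_weak(refs, &old, old + 1))
            return true;
    }
    return false;
}
```

A caller that gets `false` back would, per the commit message, wait for the stale entry to go away and then retry the lookup rather than touch the dying buffer.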
    • Btrfs: set page->private to the eb · 4f2de97a
      Josef Bacik committed
      We spend a lot of time looking up extent buffers from pages when we could just
      store the pointer to the eb the page is associated with in page->private.  This
      patch does just that, and it makes things a little simpler and reduces a bit of
      CPU overhead involved with doing metadata IO.  Thanks,
      Signed-off-by: Josef Bacik <josef@redhat.com>
    • Btrfs: allow metadata blocks larger than the page size · 727011e0
      Chris Mason committed
      A few years ago the btrfs code to support blocks larger than
      the page size was disabled to fix a few corner cases in the
      page cache handling.  This fixes the code to properly support
      large metadata blocks again.
      
      Since current kernels will crash early and often with larger
      metadata blocks, this adds an incompat bit so that older kernels
      can't mount it.
      
      This also does away with different blocksizes for nodes and leaves.
      You get a single block size for all tree blocks.
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
    • Btrfs: remove search_start and search_end from find_free_extent and callers · 81c9ad23
      Josef Bacik committed
      We have been passing nothing but (u64)-1 to find_free_extent for search_end in
      all of the callers, so it's completely useless, and we've always been passing 0
      in as search_start, so just remove them as function arguments and move
      search_start into find_free_extent.  Thanks,
      Signed-off-by: Josef Bacik <josef@redhat.com>
    • Btrfs: remove the ideal caching code · 285ff5af
      Josef Bacik committed
      This is a relic from before we had the disk space cache and it was to make
      bootup times when you had btrfs as root not be so damned slow.  Now that we have
      the disk space cache this isn't a problem anymore and really having this code
      causes unneeded fragmentation and complexity, so just remove it.  Thanks,
      Signed-off-by: Josef Bacik <josef@redhat.com>
  2. 03 Mar, 2012 2 commits
  3. 24 Feb, 2012 1 commit
  4. 23 Feb, 2012 5 commits
  5. 21 Feb, 2012 1 commit
  6. 17 Feb, 2012 6 commits
  7. 16 Feb, 2012 1 commit
  8. 15 Feb, 2012 9 commits
    • btrfs: silence warning in raid array setup · 8a334426
      David Sterba committed
      Raid array setup code creates an extent buffer in the usual way. When the
      PAGE_CACHE_SIZE is > super block size, the extent pages are not marked
      up-to-date, which triggers a WARN_ON in the following
      write_extent_buffer call. Add an explicit up-to-date call to silence the
      warning.
      Signed-off-by: David Sterba <dsterba@suse.cz>
    • btrfs: fix structs where bitfields and spinlock/atomic share 8B word · c08782da
      David Sterba committed
      On ia64, powerpc64 and sparc64 the bitfield is modified through a RMW cycle and current
      gcc rewrites the adjacent 4B word, which in case of a spinlock or atomic has
      disastrous effects.
      
      https://lkml.org/lkml/2012/2/1/220

      Signed-off-by: David Sterba <dsterba@suse.cz>
    • btrfs: delalloc for page dirtied out-of-band in fixup worker · 87826df0
      Jeff Mahoney committed
       We encountered an issue that was easily observable on s/390 systems but
       could really happen anywhere. The timing just seemed to hit reliably
       on s/390 with limited memory.
      
       The gist is that when an unexpected set_page_dirty() happened, we'd
       run into the BUG() in btrfs_writepage_fixup_worker since it wasn't
       properly set up for delalloc.
      
       This patch does the following:
       - Performs the missing delalloc in the fixup worker
       - Allows the start hook to return -EBUSY which informs __extent_writepage
         that it should mark the page skipped and not to redirty it. This is
         required since the fixup worker can fail with -ENOSPC and the page
         will have already been redirtied. That causes an Oops in
         drop_outstanding_extents later. Retrying the fixup worker could
         lead to an infinite loop. Deferring the page redirty also saves us
         some cycles since the page would be stuck in a resubmit-redirty loop
         until the fixup worker completes. It's not harmful, just wasteful.
       - If the fixup worker fails, we mark the page and mapping as errored,
         and end the writeback, similar to what we would do had the page
         actually been submitted to writeback.
      Signed-off-by: Jeff Mahoney <jeffm@suse.com>
    • Btrfs: fix memory leak in load_free_space_cache() · a7e221e9
      Tsutomu Itoh committed
      load_free_space_cache() has forgotten to free path.
      Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
    • btrfs: don't check DUP chunks twice · 859acaf1
      Arne Jansen committed
      Because scrub enumerates the dev extent tree to find the chunks to scrub,
      it currently finds each DUP chunk twice and also scrubs it twice. This
      patch makes sure that scrub_chunk only checks that part of the chunk the
      dev extent has been found for. This only changes the behaviour for DUP
      chunks.
      Reported-and-tested-by: Stefan Behrens <sbehrens@giantdisaster.de>
      Signed-off-by: Arne Jansen <sensille@gmx.net>
    • Btrfs: fix trim 0 bytes after a device delete · 2cac13e4
      Liu Bo committed
      A user reported a bug in btrfs's trim: we trim 0 bytes
      after a device delete.
      
      The reproducer:
      
      $ mkfs.btrfs disk1
      $ mkfs.btrfs disk2
      $ mount disk1 /mnt
      $ fstrim -v /mnt
      $ btrfs device add disk2 /mnt
      $ btrfs device del disk1 /mnt
      $ fstrim -v /mnt
      
      This is because after we delete the device, the block group may start from
      a non-zero place, which confuses trim into discarding nothing.
      Reported-by: Lutz Euler <lutz.euler@freenet.de>
      Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
    • Btrfs: return the internal error unchanged if btrfs_get_extent_fiemap() call... · 6af021d8
      Jeff Liu committed
      Btrfs: return the internal error unchanged if btrfs_get_extent_fiemap() call failed for SEEK_DATA/SEEK_HOLE inquiry
      
      Given that ENXIO only means "offset beyond EOF" for either SEEK_DATA or SEEK_HOLE inquiry
      in a desired file range, we should return the internal error unchanged if the
      btrfs_get_extent_fiemap() call failed, rather than ENXIO.
      
      Cc: Dave Chinner <david@fromorbit.com>
      Signed-off-by: Jie Liu <jeff.liu@oracle.com>
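The error-handling rule in 6af021d8 can be shown with a small userspace sketch: ENXIO is reserved for "offset beyond EOF", and any failure from the extent lookup is passed back unchanged. The function `seek_sketch`, the callback type, and the two lookup helpers are hypothetical and exist only for this illustration; they do not mirror the btrfs code.

```c
#include <errno.h>

/* Hypothetical extent lookup: returns 0 on success or a negative errno. */
typedef int (*extent_lookup_sketch_fn)(long long offset);

/* Hypothetical lookups used to exercise both paths. */
static int lookup_ok_sketch(long long offset)  { (void)offset; return 0; }
static int lookup_eio_sketch(long long offset) { (void)offset; return -EIO; }

/*
 * Simplified SEEK_DATA/SEEK_HOLE style check.
 * -ENXIO is returned only when the offset lies at or beyond EOF; a failed
 * lookup propagates its own error code instead of being rewritten to -ENXIO.
 */
static int seek_sketch(long long offset, long long file_size,
                       extent_lookup_sketch_fn lookup)
{
    int ret;

    if (offset >= file_size)
        return -ENXIO;

    ret = lookup(offset);
    if (ret < 0)
        return ret; /* internal error, returned unchanged */

    return 0;
}
```

With this shape, a caller can tell "nothing past this offset" apart from a genuine I/O or allocation failure.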
    • Btrfs: avoid positive number with ERR_PTR · 8f24b496
      Jan Schmidt committed
      inode_ref_info() returns 1 when the element wasn't found and < 0 on error,
      just like btrfs_search_slot(). In iref_to_path() it's an error when the
      inode ref can't be found, thus we return ERR_PTR(ret) in that case. In order
      to avoid ERR_PTR(1), we now set ret to -ENOENT in that case.
      Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
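Why ERR_PTR(1) is a problem can be seen in a userspace re-creation of the error-pointer convention. The helpers below (`err_ptr_sketch`, `is_err_sketch`) imitate the kernel's ERR_PTR and IS_ERR under the assumption that error pointers occupy the top 4095 addresses; `lookup_sketch` and `path_sketch` are hypothetical and only loosely model inode_ref_info() and iref_to_path().

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: error codes are encoded in the last 4095 pointer values. */
#define MAX_ERRNO_SKETCH 4095

static void *err_ptr_sketch(intptr_t error)
{
    return (void *)error;
}

static bool is_err_sketch(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO_SKETCH;
}

/* Hypothetical lookup: 0 = found, 1 = not found, < 0 = error. */
static int lookup_sketch(int key)
{
    return key == 42 ? 0 : 1;
}

static char found_name_sketch[] = "found";

/* Hypothetical caller that returns either a valid pointer or an error pointer. */
static void *path_sketch(int key)
{
    int ret = lookup_sketch(key);

    if (ret > 0)
        ret = -ENOENT; /* never pass a positive value to err_ptr_sketch() */
    if (ret < 0)
        return err_ptr_sketch(ret);

    return found_name_sketch;
}
```

A positive value such as 1 turns into a small, non-NULL address that `is_err_sketch()` does not recognise, so the caller would go on to dereference it; mapping "not found" to -ENOENT first keeps the result inside the error range.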
    • btrfs: Sector Size check during Mount · 941b2ddf
      Keith Mannthey committed
      Gracefully fail when trying to mount a BTRFS file system that has a
      sectorsize smaller than PAGE_SIZE.
      
      On PPC it is possible to build a FS while using a 4k PAGE_SIZE kernel
      then boot into a 64K PAGE_SIZE kernel.  Presently open_ctree fails in an
      endless loop and hangs the machine in this situation.
      
      My debugging has shown this Sector size < Page size to be a non trivial
      situation and a graceful exit from the situation would be nice for the
      time being.
      Signed-off-by: Keith Mannthey <kmannth@us.ibm.com>
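The mount-time check described in 941b2ddf amounts to a comparison made before any further setup. The sketch below is a userspace illustration only: `check_sectorsize_sketch` is a hypothetical name, and returning -EINVAL is an assumption, since the commit message does not say which error the mount fails with.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Refuse a filesystem whose sector size is smaller than the running
 * kernel's page size, instead of continuing and hanging later.
 * Returns 0 if the combination is acceptable, -EINVAL (assumed) otherwise.
 */
static int check_sectorsize_sketch(uint32_t sectorsize, uint32_t page_size)
{
    if (sectorsize < page_size)
        return -EINVAL;

    return 0;
}
```

For the PPC case in the commit message, a filesystem created with 4K sectors and mounted on a 64K-page kernel corresponds to `check_sectorsize_sketch(4096, 65536)`, which is rejected.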
  9. 01 Feb, 2012 1 commit