1. 11 3月, 2014 2 次提交
  2. 12 11月, 2013 1 次提交
  3. 01 9月, 2013 2 次提交
  4. 18 5月, 2013 1 次提交
    • C
      Btrfs: use a btrfs bioset instead of abusing bio internals · 9be3395b
      Chris Mason 提交于
      Btrfs has been pointer tagging bi_private and using bi_bdev
      to store the stripe index and mirror number of failed IOs.
      
      As bios bubble back up through the call chain, we use these
      to decide if and how to retry our IOs.  They are also used
      to count IO failures on a per device basis.
      
      Recently a bio tracepoint was added lead to crashes because
      we were abusing bi_bdev.
      
      This commit adds a btrfs bioset, and creates explicit fields
      for the mirror number and stripe index.  The plan is to
      extend this structure for all of the fields currently in
      struct btrfs_bio, which will mean one less kmalloc in
      our IO path.
      Signed-off-by: NChris Mason <chris.mason@fusionio.com>
      Reported-by: NTejun Heo <tj@kernel.org>
      9be3395b
  5. 07 5月, 2013 1 次提交
    • E
      btrfs: make static code static & remove dead code · 48a3b636
      Eric Sandeen 提交于
      Big patch, but all it does is add statics to functions which
      are in fact static, then remove the associated dead-code fallout.
      
      removed functions:
      
      btrfs_iref_to_path()
      __btrfs_lookup_delayed_deletion_item()
      __btrfs_search_delayed_insertion_item()
      __btrfs_search_delayed_deletion_item()
      find_eb_for_page()
      btrfs_find_block_group()
      range_straddles_pages()
      extent_range_uptodate()
      btrfs_file_extent_length()
      btrfs_scrub_cancel_devid()
      btrfs_start_transaction_lflush()
      
      btrfs_print_tree() is left because it is used for debugging.
      btrfs_start_transaction_lflush() and btrfs_reada_detach() are
      left for symmetry.
      
      ulist.c functions are left, another patch will take care of those.
      Signed-off-by: NEric Sandeen <sandeen@redhat.com>
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      48a3b636
  6. 03 3月, 2013 1 次提交
  7. 01 3月, 2013 1 次提交
    • D
      btrfs: try harder to allocate raid56 stripe cache · 83c8266a
      David Sterba 提交于
      The stripe hash table is large, starting with allocation order 4 and can go as
      high as order 7 in case lock debugging is turned on and structure padding
      happens.
      
      Observed mount failure:
      
      mount: page allocation failure: order:7, mode:0x200050
      Pid: 8234, comm: mount Tainted: G        W    3.8.0-default+ #267
      Call Trace:
       [<ffffffff81114353>] warn_alloc_failed+0xf3/0x140
       [<ffffffff811171d2>] ? __alloc_pages_direct_compact+0x92/0x250
       [<ffffffff81117ac3>] __alloc_pages_nodemask+0x733/0x9d0
       [<ffffffff81152878>] ? cache_alloc_refill+0x3f8/0x840
       [<ffffffff811528bc>] cache_alloc_refill+0x43c/0x840
       [<ffffffff811302eb>] ? is_kernel_percpu_address+0x4b/0x90
       [<ffffffffa00a00ac>] ? btrfs_alloc_stripe_hash_table+0x5c/0x130 [btrfs]
       [<ffffffff811531d7>] kmem_cache_alloc_trace+0x247/0x270
       [<ffffffffa00a00ac>] btrfs_alloc_stripe_hash_table+0x5c/0x130 [btrfs]
       [<ffffffffa003133f>] open_ctree+0xb2f/0x1f90 [btrfs]
       [<ffffffff81397289>] ? string+0x49/0xe0
       [<ffffffff813987b3>] ? vsnprintf+0x443/0x5d0
       [<ffffffffa0007cb6>] btrfs_mount+0x526/0x600 [btrfs]
       [<ffffffff8115127c>] ? cache_alloc_debugcheck_after+0x4c/0x200
       [<ffffffff81162b90>] mount_fs+0x20/0xe0
       [<ffffffff8117db26>] vfs_kern_mount+0x76/0x120
       [<ffffffff811801b6>] do_mount+0x386/0x980
       [<ffffffff8112a5cb>] ? strndup_user+0x5b/0x80
       [<ffffffff81180840>] sys_mount+0x90/0xe0
       [<ffffffff81962e99>] system_call_fastpath+0x16/0x1b
      Signed-off-by: NDavid Sterba <dsterba@suse.cz>
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      83c8266a
  8. 02 2月, 2013 3 次提交
    • C
      Btrfs: add a plugging callback to raid56 writes · 6ac0f488
      Chris Mason 提交于
      Buffered writes and DIRECT_IO writes will often break up
      big contiguous changes to the file into sub-stripe writes.
      
      This adds a plugging callback to gather those smaller writes full stripe
      writes.
      
      Example on flash:
      
      fio job to do 64K writes in batches of 3 (which makes a full stripe):
      
      With plugging: 450MB/s
      Without plugging: 220MB/s
      Signed-off-by: NChris Mason <chris.mason@fusionio.com>
      6ac0f488
    • C
      Btrfs: Add a stripe cache to raid56 · 4ae10b3a
      Chris Mason 提交于
      The stripe cache allows us to avoid extra read/modify/write cycles
      by caching the pages we read off the disk.  Pages are cached when:
      
      * They are read in during a read/modify/write cycle
      
      * They are written during a read/modify/write cycle
      
      * They are involved in a parity rebuild
      
      Pages are not cached if we're doing a full stripe write.  We're
      assuming that a full stripe write won't be followed by another
      partial stripe write any time soon.
      
      This provides a substantial boost in performance for workloads that
      synchronously modify adjacent offsets in the file, and for the parity
      rebuild use case in general.
      
      The size of the stripe cache isn't tunable (yet) and is set at 1024
      entries.
      
      Example on flash: dd if=/dev/zero of=/mnt/xxx bs=4K oflag=direct
      
      Without the stripe cache  -- 2.1MB/s
      With the stripe cache 21MB/s
      Signed-off-by: NChris Mason <chris.mason@fusionio.com>
      4ae10b3a
    • D
      Btrfs: RAID5 and RAID6 · 53b381b3
      David Woodhouse 提交于
      This builds on David Woodhouse's original Btrfs raid5/6 implementation.
      The code has changed quite a bit, blame Chris Mason for any bugs.
      
      Read/modify/write is done after the higher levels of the filesystem have
      prepared a given bio.  This means the higher layers are not responsible
      for building full stripes, and they don't need to query for the topology
      of the extents that may get allocated during delayed allocation runs.
      It also means different files can easily share the same stripe.
      
      But, it does expose us to incorrect parity if we crash or lose power
      while doing a read/modify/write cycle.  This will be addressed in a
      later commit.
      
      Scrub is unable to repair crc errors on raid5/6 chunks.
      
      Discard does not work on raid5/6 (yet)
      
      The stripe size is fixed at 64KiB per disk.  This will be tunable
      in a later commit.
      Signed-off-by: NChris Mason <chris.mason@fusionio.com>
      53b381b3