1. 22 Apr, 2013 3 commits
    • xfs: add CRC checks to the AGF · 4e0e6040
      Committed by Dave Chinner
      The AGF already has some self identifying fields (e.g. the sequence
      number) so we only need to add the uuid to it to identify the
      filesystem it belongs to. The location is fixed based on the
      sequence number, so there's no need to add a block number, either.
      
      Hence the only additional fields are the CRC and LSN fields. These
      are unlogged, so place some space between the end of the logged
      fields and them so that future expansion of the AGF for logged
      fields can be placed adjacent to the existing logged fields and
      hence not complicate the field-derived range based logging we
      currently have.
      
      Based originally on a patch from myself, modified further by
      Christoph Hellwig and then modified again to fit into the
      verifier structure with additional fields by myself. The multiple
      signed-off-by tags indicate the age and history of this patch.
      Signed-off-by: Dave Chinner <dgc@sgi.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Ben Myers <bpm@sgi.com>
      Signed-off-by: Ben Myers <bpm@sgi.com>
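The layout argument in this commit message can be illustrated with a minimal sketch. It assumes a made-up structure, `demo_agf`, whose field names, field sizes and spare area are hypothetical and do not reproduce the real on-disk AGF. The point it shows is that a logging range derived from field offsets stays inside the logged region as long as the unlogged LSN and CRC sit behind a reserved gap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header layout, NOT the real XFS AGF format. */
struct demo_agf {
    /* Logged fields: bit i in a field mask selects the i-th field. */
    uint32_t magic;
    uint32_t version;
    uint32_t seqno;
    uint32_t length;
    uint8_t  uuid[16];
    /* Reserved gap: future logged fields can be added here. */
    uint32_t spare[16];
    /* Unlogged fields live after the gap. */
    uint64_t lsn;
    uint32_t crc;
    uint32_t pad;
};

enum {
    DEMO_MAGIC   = 1u << 0,
    DEMO_VERSION = 1u << 1,
    DEMO_SEQNO   = 1u << 2,
    DEMO_LENGTH  = 1u << 3,
    DEMO_UUID    = 1u << 4,
    DEMO_NFIELDS = 5
};

/* Start offset of every logged field, plus the end of the logged region. */
static const size_t demo_offsets[DEMO_NFIELDS + 1] = {
    offsetof(struct demo_agf, magic),
    offsetof(struct demo_agf, version),
    offsetof(struct demo_agf, seqno),
    offsetof(struct demo_agf, length),
    offsetof(struct demo_agf, uuid),
    offsetof(struct demo_agf, spare),
};

/* Turn a mask of modified fields into the byte range [first, last] to log. */
static void demo_log_range(unsigned int fields, size_t *first, size_t *last)
{
    int lo = 0;
    int hi = DEMO_NFIELDS - 1;

    assert(fields != 0 && fields < (1u << DEMO_NFIELDS));
    while (!(fields & (1u << lo)))
        lo++;
    while (!(fields & (1u << hi)))
        hi--;
    *first = demo_offsets[lo];
    *last = demo_offsets[hi + 1] - 1;
}
```

With this layout, logging `DEMO_SEQNO | DEMO_LENGTH` covers bytes 8 to 15, and no field mask can produce a range that reaches `lsn` or `crc`.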
    • xfs: add support for large btree blocks · ee1a47ab
      Committed by Christoph Hellwig
      Add support for larger btree blocks that contain a CRC32C checksum,
      a filesystem uuid and block number for detecting filesystem
      consistency and out of place writes.
      
      [dchinner@redhat.com] Also include an owner field to allow reverse
      mappings to be implemented for improved repairability and an LSN
      field so that log recovery can easily determine the last
      modification that made it to disk for each buffer.
      
      [dchinner@redhat.com] Add buffer log format flags to indicate the
      type of buffer to recovery so that we don't have to do blind magic
      number tests to determine what the buffer is.
      
      [dchinner@redhat.com] Modified to fit into the verifier structure.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Ben Myers <bpm@sgi.com>
      Signed-off-by: Ben Myers <bpm@sgi.com>
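How a checksum, a filesystem UUID and a block number combine to catch corruption and misplaced writes can be sketched in a few lines. The header below (`demo_hdr`) and the helpers are hypothetical and cover only a header, whereas a real implementation would checksum the whole block. The CRC routine is a plain bitwise CRC32C (Castagnoli polynomial, reflected form `0x82F63B78`), not the kernel's accelerated implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bitwise CRC32C: init 0xFFFFFFFF, reflected polynomial, final inversion. */
static uint32_t demo_crc32c(const void *buf, size_t len)
{
    const uint8_t *p = buf;
    uint32_t crc = 0xFFFFFFFFu;

    assert(buf != NULL || len == 0);
    while (len--) {
        crc ^= *p++;
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return ~crc;
}

/* Hypothetical self-describing header, NOT the real XFS btree block. */
struct demo_hdr {
    uint32_t magic;
    uint32_t crc;      /* computed with this field set to zero */
    uint64_t blkno;    /* where the block is supposed to live */
    uint64_t lsn;      /* last modification that reached disk */
    uint64_t owner;    /* enables reverse mapping during repair */
    uint8_t  uuid[16]; /* filesystem the block belongs to */
};

/* Fill in the checksum before the header is written out. */
static void demo_stamp(struct demo_hdr *h)
{
    h->crc = 0;
    h->crc = demo_crc32c(h, sizeof(*h));
}

/* Return 1 if the header is intact, in the right place, and ours. */
static int demo_verify(const struct demo_hdr *h, uint64_t blkno,
                       const uint8_t uuid[16])
{
    struct demo_hdr tmp = *h;

    tmp.crc = 0;
    if (demo_crc32c(&tmp, sizeof(tmp)) != h->crc)
        return 0; /* contents were corrupted */
    if (h->blkno != blkno)
        return 0; /* valid block, wrong location: misplaced write */
    if (memcmp(h->uuid, uuid, 16) != 0)
        return 0; /* block belongs to another filesystem */
    return 1;
}
```

The checksum alone cannot detect a well-formed block written to the wrong address, which is why the block number and UUID are checked separately.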
    • xfs: increase hexdump output in xfs_corruption_error · a2050646
      Committed by Dave Chinner
      Currently xfs_corruption_error() dumps the first 16 bytes of the
      buffer that is passed to it when a corruption occurs. This is not
      large enough to see the entire state of the header of the block that
      was determined to be corrupt. Increase the output to 64 bytes to
      capture the majority of all headers in all types of metadata blocks.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Ben Myers <bpm@sgi.com>
      Signed-off-by: Ben Myers <bpm@sgi.com>
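The effect of the larger dump size can be pictured with a small userspace sketch. `demo_hex_dump` is a hypothetical helper, not the kernel routine. It prints at most `limit` bytes of a buffer, 16 per row, so a limit of 64 yields four rows instead of one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Dump up to `limit` bytes of `buf` as hex and return the number dumped. */
static size_t demo_hex_dump(FILE *fp, const void *buf, size_t len, size_t limit)
{
    const unsigned char *p = buf;
    size_t n = len < limit ? len : limit;

    assert(fp != NULL && (buf != NULL || n == 0));
    for (size_t i = 0; i < n; i++) {
        if (i % 16 == 0)
            fprintf(fp, "%s%08zx:", i ? "\n" : "", i); /* row offset */
        fprintf(fp, " %02x", p[i]);
    }
    if (n)
        fputc('\n', fp);
    return n;
}
```

Calling it with a limit of 16 shows only the first row of a block, while a limit of 64 shows enough to cover most of a typical metadata header.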
  2. 17 Apr, 2013 2 commits
  3. 06 Apr, 2013 1 commit
    • xfs: don't free EFIs before the EFDs are committed · 666d644c
      Committed by Dave Chinner
      Filesystems are occasionally being shut down with this error:
      
      xfs_trans_ail_delete_bulk: attempting to delete a log item that is
      not in the AIL.
      
      It was diagnosed to be related to the EFI/EFD commit order when the
      EFI and EFD are in different checkpoints and the EFD is committed
      before the EFI here:
      
      http://oss.sgi.com/archives/xfs/2013-01/msg00082.html
      
      The real problem is that a single bit cannot fully describe the
      states that the EFI/EFD processing can be in. These completion
      states are:
      
      EFI                 EFI in AIL  EFD        Result
      committed/unpinned  Yes         committed  OK
      committed/pinned    No          committed  Shutdown
      uncommitted         No          committed  Shutdown
      
      Note that the "result" field is what should happen, not what does
      happen. The current logic is broken and handles the first two cases
      correctly by luck.  That is, the code will free the EFI if the
      XFS_EFI_COMMITTED bit is *not* set, rather than if it is set. The
      inverted logic "works" because if both EFI and EFD are committed,
      then the first __xfs_efi_release() call clears the XFS_EFI_COMMITTED
      bit, and the second frees the EFI item. Hence as long as
      xfs_efi_item_committed() has been called, everything appears to be
      fine.
      
      It is the third case where the logic fails - where
      xfs_efd_item_committed() is called before xfs_efi_item_committed(),
      and that results in the EFI being freed before it has been
      committed. That is the bug that triggered the shutdown, and hence
      keeping track of whether the EFI has been committed or not is
      insufficient to correctly order the EFI/EFD operations w.r.t. the
      AIL.
      
      What we really want is this: the EFI is always placed into the
      AIL before the last reference goes away. The only way to guarantee
      that is that the EFI is not freed until after it has been unpinned
      *and* the EFD has been committed. That is, restructure the logic so
      that the only case that can occur is the first case.
      
      This can be done easily by replacing the XFS_EFI_COMMITTED with an
      EFI reference count. The EFI is initialised with its own count, and
      that is not released until it is unpinned. However, there is a
      complication to this method - the high level EFI/EFD code in
      xfs_bmap_finish() does not hold direct references to the EFI
      structure, and runs a transaction commit between the EFI and EFD
      processing. Hence the EFI can be freed even before the EFD is
      created using such a method.
      
      Further, log recovery uses the AIL for tracking EFI/EFDs that need
      to be recovered, but it uses the AIL *differently* to the EFI
      transaction commit. Hence log recovery never pins or unpins EFIs, so
      we can't drop the EFI reference count indirectly to free the EFI.
      
      However, this doesn't prevent us from using a reference count here.
      There is a 1:1 relationship between EFIs and EFDs, so when we
      initialise the EFI we can take a reference count for the EFD as
      well. This solves the xfs_bmap_finish() issue - the EFI will never
      be freed until the EFD is processed. In terms of log recovery,
      during the committing of the EFD we can look for the
      XFS_EFI_RECOVERED bit being set and drop the EFI reference as well,
      thereby ensuring everything works correctly there as well.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Mark Tinguely <tinguely@sgi.com>
      Signed-off-by: Ben Myers <bpm@sgi.com>
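The reference-counting scheme described in this commit message can be modelled roughly as follows. This is a userspace sketch using C11 atomics, not the kernel code. The `demo_` names and the `freed` flag (which stands in for actually freeing the item) are hypothetical. The EFI starts with two references, one dropped when the EFI is unpinned and one dropped when the EFD commits, so the order of those two events no longer matters.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of an EFI log item. */
struct demo_efi {
    atomic_int refcount;
    bool recovered; /* models the XFS_EFI_RECOVERED bit */
    bool freed;     /* stands in for freeing the item */
};

/* Take one reference for the EFI itself and one on behalf of its EFD. */
static void demo_efi_init(struct demo_efi *efi, bool recovered)
{
    atomic_init(&efi->refcount, 2);
    efi->recovered = recovered;
    efi->freed = false;
}

/* Drop one reference and free the item when the last one goes away. */
static void demo_efi_release(struct demo_efi *efi)
{
    int old = atomic_fetch_sub(&efi->refcount, 1);

    assert(old > 0);
    if (old == 1)
        efi->freed = true;
}

/* Called when the EFI is unpinned, i.e. once it has reached the AIL. */
static void demo_efi_unpinned(struct demo_efi *efi)
{
    demo_efi_release(efi);
}

/* Called when the matching EFD is committed. */
static void demo_efd_committed(struct demo_efi *efi)
{
    /* Log recovery never unpins the EFI, so drop that reference here too. */
    if (efi->recovered)
        demo_efi_release(efi);
    demo_efi_release(efi);
}
```

In this model the item is freed only after both events have happened, whichever comes first, and a recovered EFI is freed by the EFD commit alone.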
  4. 04 Apr, 2013 1 commit
  5. 23 Mar, 2013 7 commits
  6. 15 Mar, 2013 3 commits
  7. 08 Mar, 2013 6 commits
  8. 03 Mar, 2013 8 commits
  9. 02 Mar, 2013 8 commits
  10. 01 Mar, 2013 1 commit
    • btrfs: try harder to allocate raid56 stripe cache · 83c8266a
      Committed by David Sterba
      The stripe hash table is large, starting with allocation order 4 and can go as
      high as order 7 in case lock debugging is turned on and structure padding
      happens.
      
      Observed mount failure:
      
      mount: page allocation failure: order:7, mode:0x200050
      Pid: 8234, comm: mount Tainted: G        W    3.8.0-default+ #267
      Call Trace:
       [<ffffffff81114353>] warn_alloc_failed+0xf3/0x140
       [<ffffffff811171d2>] ? __alloc_pages_direct_compact+0x92/0x250
       [<ffffffff81117ac3>] __alloc_pages_nodemask+0x733/0x9d0
       [<ffffffff81152878>] ? cache_alloc_refill+0x3f8/0x840
       [<ffffffff811528bc>] cache_alloc_refill+0x43c/0x840
       [<ffffffff811302eb>] ? is_kernel_percpu_address+0x4b/0x90
       [<ffffffffa00a00ac>] ? btrfs_alloc_stripe_hash_table+0x5c/0x130 [btrfs]
       [<ffffffff811531d7>] kmem_cache_alloc_trace+0x247/0x270
       [<ffffffffa00a00ac>] btrfs_alloc_stripe_hash_table+0x5c/0x130 [btrfs]
       [<ffffffffa003133f>] open_ctree+0xb2f/0x1f90 [btrfs]
       [<ffffffff81397289>] ? string+0x49/0xe0
       [<ffffffff813987b3>] ? vsnprintf+0x443/0x5d0
       [<ffffffffa0007cb6>] btrfs_mount+0x526/0x600 [btrfs]
       [<ffffffff8115127c>] ? cache_alloc_debugcheck_after+0x4c/0x200
       [<ffffffff81162b90>] mount_fs+0x20/0xe0
       [<ffffffff8117db26>] vfs_kern_mount+0x76/0x120
       [<ffffffff811801b6>] do_mount+0x386/0x980
       [<ffffffff8112a5cb>] ? strndup_user+0x5b/0x80
       [<ffffffff81180840>] sys_mount+0x90/0xe0
       [<ffffffff81962e99>] system_call_fastpath+0x16/0x1b
      Signed-off-by: David Sterba <dsterba@suse.cz>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
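To put the allocation orders in this message into concrete sizes: an order-n allocation is 2^n physically contiguous pages. The sketch below assumes 4 KiB pages (an assumption, as the message does not state the page size), and `demo_get_order` is a hypothetical helper written for illustration. Under that assumption order 4 is 64 KiB and order 7 is 512 KiB, which is hard to satisfy from fragmented memory.

```c
#include <assert.h>
#include <stddef.h>

/* Smallest order such that (page_size << order) >= size. */
static unsigned int demo_get_order(size_t size, size_t page_size)
{
    unsigned int order = 0;
    size_t span = page_size;

    assert(size > 0 && page_size > 0);
    while (span < size) {
        span <<= 1;
        order++;
    }
    return order;
}

/* Bytes covered by an allocation of the given order. */
static size_t demo_order_bytes(unsigned int order, size_t page_size)
{
    return page_size << order;
}
```

For example, a table one byte larger than 64 KiB already needs an order-5 allocation, so modest structure padding can push the request up by a full order.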