1. 13 12月, 2012 2 次提交
    • J
      fs/btrfs: use WARN · 31b1a2bd
      Julia Lawall 提交于
      Use WARN rather than printk followed by WARN_ON(1), for conciseness.
      
      A simplified version of the semantic patch that makes this transformation
      is as follows: (http://coccinelle.lip6.fr/)
      
      // <smpl>
      @@
      expression list es;
      @@
      
      -printk(
      +WARN(1,
        es);
      -WARN_ON(1);
      // </smpl>
      Signed-off-by: NJulia Lawall <Julia.Lawall@lip6.fr>
      Reviewed-by: NDavid Sterba <dsterba@suse.cz>
      Signed-off-by: NChris Mason <chris.mason@fusionio.com>
      31b1a2bd
    • S
      Btrfs: don't allow degraded mount if too many devices are missing · 292fd7fc
      Stefan Behrens 提交于
      The current behavior is to allow mounting or remounting a filesystem
      writeable in degraded mode if at least one writeable device is
      present.
      The next failed write access to a missing device which is above
      the tolerance of the configured level of redundancy results in an
      read-only enforcement. Even without this, the next time
      barrier_all_devices() is called and more devices are missing than
      tolerable, the switch to read-only mode takes place.
      
      In order to behave predictably and to provide proper feedback to
      the user at mount time, this patch compares the number of missing
      devices with the number of devices that are tolerated to be missing
      according to the configured RAID level. If more devices are missing
      than tolerated, e.g. if two devices are missing in case of RAID1,
      only a read-only mount and remount is allowed.
      Signed-off-by: NStefan Behrens <sbehrens@giantdisaster.de>
      Signed-off-by: NChris Mason <chris.mason@fusionio.com>
      292fd7fc
  2. 12 12月, 2012 1 次提交
  3. 09 10月, 2012 5 次提交
    • W
      Btrfs: remove repeated eb->pages check in, disk-io.c/csum_dirty_buffer · 1037a5af
      Wang Sheng-Hui 提交于
      In csum_dirty_buffer, we first get eb from page->private.
      Then we check if the page is the first page of eb. Later
      we check it again. Remove the repeated check here.
      Signed-off-by: NWang Sheng-Hui <shhuiw@gmail.com>
      1037a5af
    • S
      Btrfs: make filesystem read-only when submitting barrier fails · 5af3e8cc
      Stefan Behrens 提交于
      So far the return code of barrier_all_devices() is ignored, which
      means that errors are ignored. The result can be a corrupt
      filesystem which is not consistent.
      This commit adds code to evaluate the return code of
      barrier_all_devices(). The normal btrfs_error() mechanism is used to
      switch the filesystem into read-only mode when errors are detected.
      
      In order to decide whether barrier_all_devices() should return
      error or success, the number of disks that are allowed to fail the
      barrier submission is calculated. This calculation accounts for the
      worst RAID level of metadata, system and data. If single, dup or
      RAID0 is in use, a single disk error is already considered to be
      fatal. Otherwise a single disk error is tolerated.
      
      The calculation of the number of disks that are tolerated to fail
      the barrier operation is performed when the filesystem gets mounted,
      when a balance operation is started and finished, and when devices
      are added or removed.
      Signed-off-by: NStefan Behrens <sbehrens@giantdisaster.de>
      5af3e8cc
    • J
      Btrfs: cache extent state when writing out dirty metadata pages · e6138876
      Josef Bacik 提交于
      Everytime we write out dirty pages we search for an offset in the tree,
      convert the bits in the state, and then when we wait we search for the
      offset again and clear the bits.  So for every dirty range in the io tree we
      are doing 4 rb searches, which is suboptimal.  With this patch we are only
      doing 2 searches for every cycle (modulo weird things happening).  Thanks,
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      e6138876
    • J
      Btrfs: do not async metadata csumming in certain situations · de0022b9
      Josef Bacik 提交于
      There are a coule scenarios where farming metadata csumming off to an async
      thread doesn't help.  The first is if our processor supports crc32c, in
      which case the csumming will be fast and so the overhead of the async model
      is not worth the cost.  The other case is for our tree log.  We will be
      making that stuff dirty and writing it out and waiting for it immediately.
      Even with software crc32c this gives me a ~15% increase in speed with O_SYNC
      workloads.  Thanks,
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      de0022b9
    • M
      Btrfs: fix orphan transaction on the freezed filesystem · 354aa0fb
      Miao Xie 提交于
      With the following debug patch:
      
       static int btrfs_freeze(struct super_block *sb)
       {
      + 	struct btrfs_fs_info *fs_info = btrfs_sb(sb);
      +	struct btrfs_transaction *trans;
      +
      +	spin_lock(&fs_info->trans_lock);
      +	trans = fs_info->running_transaction;
      +	if (trans) {
      +		printk("Transid %llu, use_count %d, num_writer %d\n",
      +			trans->transid, atomic_read(&trans->use_count),
      +			atomic_read(&trans->num_writers));
      +	}
      +	spin_unlock(&fs_info->trans_lock);
       	return 0;
       }
      
      I found there was a orphan transaction after the freeze operation was done.
      
      It is because the transaction may not be committed when the transaction handle
      end even though it is the last handle of the current transaction. This design
      avoid committing the transaction frequently, but also introduce the above
      problem.
      
      So I add btrfs_attach_transaction() which can catch the current transaction
      and commit it. If there is no transaction, it will return ENOENT, and do not
      anything.
      
      This function also can be used to instead of btrfs_join_transaction_freeze()
      because it don't increase the writer counter and don't start a new transaction,
      so it also can fix the deadlock between sync and freeze.
      
      Besides that, it is used to instead of btrfs_join_transaction() in
      transaction_kthread(), because if there is no transaction, the transaction
      kthread needn't anything.
      Signed-off-by: NMiao Xie <miaox@cn.fujitsu.com>
      354aa0fb
  4. 04 10月, 2012 4 次提交
  5. 02 10月, 2012 3 次提交
  6. 29 8月, 2012 4 次提交
    • S
      Btrfs: fix that repair code is spuriously executed for transid failures · 256dd1bb
      Stefan Behrens 提交于
      If verify_parent_transid() fails for all mirrors, the current code
      calls repair_io_failure() anyway which means:
      - that the disk block is rewritten without repairing anything and
      - that a kernel log message is printed which misleadingly claims
        that a read error was corrected.
      
      This is an example:
      parent transid verify failed on 615015833600 wanted 110423 found 110424
      parent transid verify failed on 615015833600 wanted 110423 found 110424
      btrfs read error corrected: ino 1 off 615015833600 (dev /dev/...)
      
      It is wrong to ignore the results from verify_parent_transid() and to
      call repair_eb_io_failure() when the verification of the transids failed.
      This commit fixes the issue.
      Signed-off-by: NStefan Behrens <sbehrens@giantdisaster.de>
      Signed-off-by: NChris Mason <chris.mason@oracle.com>
      256dd1bb
    • S
      Btrfs: remove superblock writing after fatal error · 68ce9682
      Stefan Behrens 提交于
      With commit acce952b, btrfs was changed to flag the filesystem with
      BTRFS_SUPER_FLAG_ERROR and switch to read-only mode after a fatal
      error happened like a write I/O errors of all mirrors.
      In such situations, on unmount, the superblock is written in
      btrfs_error_commit_super(). This is done with the intention to be able
      to evaluate the error flag on the next mount. A warning is printed
      in this case during the next mount and the log tree is ignored.
      
      The issue is that it is possible that the superblock points to a root
      that was not written (due to write I/O errors).
      The result is that the filesystem cannot be mounted. btrfsck also does
      not start and all the other btrfs-progs tools fail to start as well.
      However, mount -o recovery is working well and does the right things
      to recover the filesystem (i.e., don't use the log root, clear the
      free space cache and use the next mountable root that is stored in the
      root backup array).
      
      This patch removes the writing of the superblock when
      BTRFS_SUPER_FLAG_ERROR is set, and removes the handling of the error
      flag in the mount function.
      
      These lines can be used to reproduce the issue (using /dev/sdm):
      SCRATCH_DEV=/dev/sdm
      SCRATCH_MNT=/mnt
      echo 0 25165824 linear $SCRATCH_DEV 0 | dmsetup create foo
      ls -alLF /dev/mapper/foo
      mkfs.btrfs /dev/mapper/foo
      mount /dev/mapper/foo $SCRATCH_MNT
      echo bar > $SCRATCH_MNT/foo
      sync
      echo 0 25165824 error | dmsetup reload foo
      dmsetup resume foo
      ls -alF $SCRATCH_MNT
      touch $SCRATCH_MNT/1
      ls -alF $SCRATCH_MNT
      sleep 35
      echo 0 25165824 linear $SCRATCH_DEV 0 | dmsetup reload foo
      dmsetup resume foo
      sleep 1
      umount $SCRATCH_MNT
      btrfsck /dev/mapper/foo
      dmsetup remove foo
      Signed-off-by: NStefan Behrens <sbehrens@giantdisaster.de>
      Signed-off-by: NJan Schmidt <list.btrfs@jan-o-sch.net>
      68ce9682
    • J
      Btrfs: barrier before waitqueue_active · 66657b31
      Josef Bacik 提交于
      We need a barrir before calling waitqueue_active otherwise we will miss
      wakeups.  So in places that do atomic_dec(); then atomic_read() use
      atomic_dec_return() which imply a memory barrier (see memory-barriers.txt)
      and then add an explicit memory barrier everywhere else that need them.
      Thanks,
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      66657b31
    • A
      Btrfs: fix deadlock in wait_for_more_refs · 1fa11e26
      Arne Jansen 提交于
      Commit a168650c introduced a waiting mechanism to prevent busy waiting in
      btrfs_run_delayed_refs. This can deadlock with btrfs_run_ordered_operations,
      where a tree_mod_seq is held while waiting for the io to complete, while
      the end_io calls btrfs_run_delayed_refs.
      This whole mechanism is unnecessary. If not enough runnable refs are
      available to satisfy count, just return as count is more like a guideline
      than a strict requirement.
      In case we have to run all refs, commit transaction makes sure that no
      other threads are working in the transaction anymore, so we just assert
      here that no refs are blocked.
      Signed-off-by: NArne Jansen <sensille@gmx.net>
      Signed-off-by: NChris Mason <chris.mason@fusionio.com>
      1fa11e26
  7. 31 7月, 2012 2 次提交
    • J
      btrfs: Convert to new freezing mechanism · b2b5ef5c
      Jan Kara 提交于
      We convert btrfs_file_aio_write() to use new freeze check.  We also add proper
      freeze protection to btrfs_page_mkwrite(). We also add freeze protection to
      the transaction mechanism to avoid starting transactions on frozen filesystem.
      At minimum this is necessary to stop iput() of unlinked file to change frozen
      filesystem during truncation.
      
      Checks in cleaner_kthread() and transaction_kthread() can be safely removed
      since btrfs_freeze() will lock the mutexes and thus block the threads (and they
      shouldn't have anything to do anyway).
      
      CC: linux-btrfs@vger.kernel.org
      CC: Chris Mason <chris.mason@oracle.com>
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      b2b5ef5c
    • J
      btrfs: use printk_get_level and printk_skip_level, add __printf, fix fallout · 533574c6
      Joe Perches 提交于
      Use the generic printk_get_level() to search a message for a kern_level.
      
      Add __printf to verify format and arguments.  Fix a few messages that
      had mismatches in format and arguments.  Add #ifdef CONFIG_PRINTK blocks
      to shrink the object size a bit when not using printk.
      
      [akpm@linux-foundation.org: whitespace tweak]
      Signed-off-by: NJoe Perches <joe@perches.com>
      Cc: Kay Sievers <kay.sievers@vrfy.org>
      Cc: Chris Mason <chris.mason@oracle.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      533574c6
  8. 26 7月, 2012 1 次提交
    • A
      Btrfs: introduce subvol uuids and times · 8ea05e3a
      Alexander Block 提交于
      This patch introduces uuids for subvolumes. Each
      subvolume has it's own uuid. In case it was snapshotted,
      it also contains parent_uuid. In case it was received,
      it also contains received_uuid.
      
      It also introduces subvolume ctime/otime/stime/rtime. The
      first two are comparable to the times found in inodes. otime
      is the origin/creation time and ctime is the change time.
      stime/rtime are only valid on received subvolumes.
      stime is the time of the subvolume when it was
      sent. rtime is the time of the subvolume when it was
      received.
      
      Additionally to the times, we have a transid for each
      time. They are updated at the same place as the times.
      
      btrfs receive uses stransid and rtransid to find out
      if a received subvolume changed in the meantime.
      
      If an older kernel mounts a filesystem with the
      extented fields, all fields become invalid. The next
      mount with a new kernel will detect this and reset the
      fields.
      Signed-off-by: NAlexander Block <ablock84@googlemail.com>
      Reviewed-by: NDavid Sterba <dave@jikos.cz>
      Reviewed-by: NArne Jansen <sensille@gmx.net>
      Reviewed-by: NJan Schmidt <list.btrfs@jan-o-sch.net>
      Reviewed-by: NAlex Lyakas <alex.bolshoy.btrfs@gmail.com>
      8ea05e3a
  9. 24 7月, 2012 3 次提交
  10. 12 7月, 2012 1 次提交
  11. 10 7月, 2012 3 次提交
  12. 03 7月, 2012 2 次提交
  13. 21 6月, 2012 1 次提交
  14. 15 6月, 2012 8 次提交