1. 29 May, 2018 23 commits
  2. 24 May, 2018 1 commit
    •
      Btrfs: fix error handling in btrfs_truncate() · d5014738
      Omar Sandoval committed
      Jun Wu at Facebook reported that an internal service was seeing a return
      value of 1 from ftruncate() on Btrfs in some cases. This is coming from
      the NEED_TRUNCATE_BLOCK return value from btrfs_truncate_inode_items().
      
      btrfs_truncate() uses two variables for error handling, ret and err.
      When btrfs_truncate_inode_items() returns non-zero, we set err to the
      return value. However, NEED_TRUNCATE_BLOCK is not an error. Make sure we
      only set err if ret is an error (i.e., negative).
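
The distinction between the old and the fixed check can be sketched in user space. This is a minimal model under stated assumptions, not the kernel code: the two helper functions are hypothetical, and the value 1 for NEED_TRUNCATE_BLOCK is inferred from the return value reported above.

```c
#include <errno.h>

/* Assumed value: the report above saw ftruncate() return 1 in this case. */
#define NEED_TRUNCATE_BLOCK 1

/* Hypothetical model of the old logic: any non-zero ret is stored in err. */
int truncate_result_old(int ret)
{
        int err = 0;

        if (ret)
                err = ret;
        return err;
}

/* Hypothetical model of the fixed logic: only negative values are errors. */
int truncate_result_fixed(int ret)
{
        int err = 0;

        if (ret < 0)
                err = ret;
        return err;
}
```

With these helpers, truncate_result_old(NEED_TRUNCATE_BLOCK) leaks 1 to the caller, while truncate_result_fixed(NEED_TRUNCATE_BLOCK) returns 0 and still passes through a real error such as -ENOSPC.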
      
      To reproduce the issue, mount a filesystem with -o compress-force=zstd
      and run the following program, which gets a return value of 1 from
      ftruncate():
      
      #include <fcntl.h>
      #include <stdio.h>
      #include <stdlib.h>
      #include <unistd.h>

      int main(void) {
              char buf[256] = { 0 };
              int ret;
              int fd;
      
              fd = open("test", O_CREAT | O_WRONLY | O_TRUNC, 0666);
              if (fd == -1) {
                      perror("open");
                      return EXIT_FAILURE;
              }
      
              if (write(fd, buf, sizeof(buf)) != sizeof(buf)) {
                      perror("write");
                      close(fd);
                      return EXIT_FAILURE;
              }
      
              if (fsync(fd) == -1) {
                      perror("fsync");
                      close(fd);
                      return EXIT_FAILURE;
              }
      
              ret = ftruncate(fd, 128);
              if (ret) {
                      printf("ftruncate() returned %d\n", ret);
                      close(fd);
                      return EXIT_FAILURE;
              }
      
              close(fd);
              return EXIT_SUCCESS;
      }
      
      Fixes: ddfae63c ("btrfs: move btrfs_truncate_block out of trans handle")
      CC: stable@vger.kernel.org # 4.15+
      Reported-by: Jun Wu <quark@fb.com>
      Signed-off-by: Omar Sandoval <osandov@fb.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
  3. 17 May, 2018 1 commit
  4. 12 May, 2018 1 commit
    •
      do d_instantiate/unlock_new_inode combinations safely · 1e2e547a
      Al Viro committed
      For anything NFS-exported we do _not_ want to unlock new inode
      before it has grown an alias; original set of fixes got the
      ordering right, but missed the nasty complication in case of
      lockdep being enabled - unlock_new_inode() does
      	lockdep_annotate_inode_mutex_key(inode)
      which can only be done before anyone gets a chance to touch
      ->i_mutex.  Unfortunately, flipping the order and doing
      unlock_new_inode() before d_instantiate() opens a window when
      mkdir can race with open-by-fhandle on a guessed fhandle, leading
      to multiple aliases for a directory inode and all the breakage
      that follows from that.
      
      	Correct solution: a new primitive (d_instantiate_new())
      combining these two in the right order - lockdep annotate, then
      d_instantiate(), then the rest of unlock_new_inode().  All
      combinations of d_instantiate() with unlock_new_inode() should
      be converted to that.
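
The ordering described above can be illustrated with a toy user-space model. Everything in it is hypothetical: the struct, the step names and the toy_ functions are stand-ins that only record the order of the three steps, and they do not reproduce the kernel's locking or dcache code.

```c
#include <stddef.h>

/* The three steps named in the message, in the order they must run. */
enum step { STEP_LOCKDEP_ANNOTATE, STEP_D_INSTANTIATE, STEP_UNLOCK_REST };

/* Toy stand-in for an inode; all fields are illustrative only. */
struct toy_inode {
        int is_new;          /* models the "new inode" state */
        int alias_count;     /* models the number of attached dentries */
        enum step trace[3];  /* order in which the steps ran */
        size_t nsteps;
};

void toy_record(struct toy_inode *inode, enum step s)
{
        if (inode->nsteps < 3)
                inode->trace[inode->nsteps++] = s;
}

/* Step 1: lockdep annotation, before anyone can touch the inode lock. */
void toy_lockdep_annotate(struct toy_inode *inode)
{
        toy_record(inode, STEP_LOCKDEP_ANNOTATE);
}

/* Step 2: attach the alias while the inode is still marked new. */
void toy_d_instantiate(struct toy_inode *inode)
{
        inode->alias_count++;
        toy_record(inode, STEP_D_INSTANTIATE);
}

/* Step 3: the remainder of the unlock, clearing the "new" state. */
void toy_unlock_rest(struct toy_inode *inode)
{
        inode->is_new = 0;
        toy_record(inode, STEP_UNLOCK_REST);
}

/* Combined primitive: annotate, then instantiate, then finish unlocking. */
void toy_d_instantiate_new(struct toy_inode *inode)
{
        toy_lockdep_annotate(inode);
        toy_d_instantiate(inode);
        toy_unlock_rest(inode);
}
```

The point of the combined function in this sketch is that callers cannot pick the wrong order: the inode only stops being "new" after it already has its alias.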
      
      Cc: stable@kernel.org	# 2.6.29 and later
      Tested-by: Mike Marshall <hubcap@omnibond.com>
      Reviewed-by: Andreas Dilger <adilger@dilger.ca>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  5. 19 April, 2018 1 commit
  6. 12 April, 2018 1 commit
  7. 31 March, 2018 9 commits
  8. 26 March, 2018 3 commits
    •
      btrfs: adjust return values of btrfs_inode_by_name · 005d6712
      Su Yue committed
      Previously, btrfs_inode_by_name() returned 0, which left the caller to
      check the objectid of the location even if the location type was invalid.
      
      Let btrfs_inode_by_name() return -EUCLEAN if a corrupted location of a
      dir entry is found.  Removal of label out_err also simplifies the
      function.
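
A minimal user-space sketch of this kind of check follows. The key-type enum, the struct and toy_check_location() are hypothetical stand-ins, and treating exactly two key types as valid is an assumption made for illustration. EUCLEAN is a Linux-specific errno value.

```c
#include <errno.h>

/* Hypothetical key types; the real ones are defined in the Btrfs headers. */
enum toy_key_type { TOY_INODE_ITEM_KEY, TOY_ROOT_ITEM_KEY, TOY_CORRUPTED_KEY };

/* Hypothetical stand-in for the location key of a directory entry. */
struct toy_key {
        unsigned long long objectid;
        enum toy_key_type type;
};

/* Return 0 for a usable location and -EUCLEAN for a corrupted one. */
int toy_check_location(const struct toy_key *location)
{
        if (location->type != TOY_INODE_ITEM_KEY &&
            location->type != TOY_ROOT_ITEM_KEY)
                return -EUCLEAN;
        return 0;
}
```

Returning the error from the lookup itself means a caller in this sketch never inspects objectid for a location whose type is already known to be bad.
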
      Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      [ drop unlikely ]
      Signed-off-by: David Sterba <dsterba@suse.com>
    •
      btrfs: Remove root argument from cow_file_range_inline · d02c0e20
      Nikolay Borisov committed
      This argument is always set to the root of the inode, which is also
      passed. So let's get a reference inside the function and simplify
      the arg list.
      Signed-off-by: Nikolay Borisov <nborisov@suse.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    •
      Btrfs: skip writeback of last page when truncating file to same size · 213e8c55
      Filipe Manana committed
      When we truncate a file to the same size and that size is not aligned
      with the sector size, we end up triggering writeback (and wait for it to
      complete) of the last page. This is unnecessary as we cannot have delayed
      allocation beyond the inode's i_size and the goal of truncating a file
      to its own size is to discard prealloc extents (allocated via the
      fallocate(2) system call). Besides the unnecessary IO start and wait, it
      also breaks the opportunity for larger contiguous extents on disk, as
      before the last dirty page there might be other dirty pages.
      
      This scenario is probably not very common in general, however it is
      common for btrfs receive implementations because currently the send
      stream always issues a truncate operation for each processed inode as
      the last operation for that inode (this truncate operation is not
      always needed and the send implementation will be addressed to avoid
      them).
      
      So improve this by not starting and waiting for writeback of the inode's
      last page when we are truncating to exactly the same size.
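
The decision described above can be modelled in a few lines. This is a sketch of the behaviour as the message states it, not the kernel implementation: the function names are hypothetical, and the 4096-byte sector size used in the usage note is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when size falls exactly on a sector boundary. */
bool toy_is_sector_aligned(uint64_t size, uint32_t sectorsize)
{
        return size % sectorsize == 0;
}

/* Model of the old behaviour: an unaligned size always triggers writeback. */
bool toy_writeback_before(uint64_t old_size, uint64_t new_size,
                          uint32_t sectorsize)
{
        (void)old_size; /* the old logic did not compare the two sizes */
        return !toy_is_sector_aligned(new_size, sectorsize);
}

/* Model of the new behaviour: truncating to the same size skips writeback. */
bool toy_writeback_after(uint64_t old_size, uint64_t new_size,
                         uint32_t sectorsize)
{
        if (new_size == old_size)
                return false;
        return !toy_is_sector_aligned(new_size, sectorsize);
}
```

For a 5000-byte file truncated to 5000 bytes with an assumed 4096-byte sector size, toy_writeback_before() reports writeback and toy_writeback_after() does not, while a real shrink to an unaligned size still reports writeback in both models.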
      
      The following script was used to quickly measure the time a receive
      operation takes:
      
       $ cat test_send.sh
       #!/bin/bash
      
       SRC_DEV=/dev/sdc
       DST_DEV=/dev/sdd
       SRC_MNT=/mnt/sdc
       DST_MNT=/mnt/sdd
      
       mkfs.btrfs -f $SRC_DEV >/dev/null
       mkfs.btrfs -f $DST_DEV >/dev/null
       mount $SRC_DEV $SRC_MNT
       mount $DST_DEV $DST_MNT
      
       echo "Creating source filesystem"
       for ((t = 0; t < 10; t++)); do
           (
               for ((i = 1; i <= 20000; i++)); do
                   xfs_io -f -c "pwrite -S 0xab 0 5000" \
                      $SRC_MNT/file_$i > /dev/null
               done
           ) &
           worker_pids[$t]=$!
       done
       wait ${worker_pids[@]}
      
       echo "Creating and sending snapshot"
       btrfs subvolume snapshot -r $SRC_MNT $SRC_MNT/snap1 >/dev/null
       /usr/bin/time -f "send took %e seconds"    \
           btrfs send -f $SRC_MNT/send_file $SRC_MNT/snap1
       /usr/bin/time -f "receive took %e seconds" \
           btrfs receive -f $SRC_MNT/send_file $DST_MNT
      
       umount $SRC_MNT
       umount $DST_MNT
      
      The results for 5 runs were the following:
      
      * Without this change
      
      average receive time was 26.49 seconds
      standard deviation of 2.53 seconds
      
      * With this change
      
      average receive time was 12.51 seconds
      standard deviation of 0.32 seconds
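
To put the two averages side by side, the relative improvement can be computed as below. The helper names are hypothetical and the calculation uses only the two mean values reported above.

```c
/* Ratio of the old mean time to the new one; values above 1 mean faster. */
double toy_speedup(double before_s, double after_s)
{
        return before_s / after_s;
}

/* Percentage of the old mean time that the change removes. */
double toy_reduction_pct(double before_s, double after_s)
{
        return 100.0 * (before_s - after_s) / before_s;
}
```

Calling toy_speedup(26.49, 12.51) gives a factor of roughly 2.1, and toy_reduction_pct(26.49, 12.51) gives a reduction of about 53 percent in average receive time.
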
      Reported-by: Robbie Ko <robbieko@synology.com>
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>