1. 10 6月, 2014 17 次提交
    • D
      btrfs: protect snapshots from deleting during send · 521e0546
      David Sterba 提交于
      The patch "Btrfs: fix protection between send and root deletion"
      (18f687d5) does not actually prevent to delete the snapshot
      and just takes care during background cleaning, but this seems rather
      user unfriendly, this patch implements the idea presented in
      
      http://www.spinics.net/lists/linux-btrfs/msg30813.html
      
      - add an internal root_item flag to denote a dead root
      - check if the send_in_progress is set and refuse to delete, otherwise
        set the flag and proceed
      - check the flag in send similar to the btrfs_root_readonly checks, for
        all involved roots
      
      The root lookup in send via btrfs_read_fs_root_no_name will check if the
      root is really dead or not. If it is, ENOENT, aborted send. If it's
      alive, it's protected by send_in_progress, send can continue.
      
      CC: Miao Xie <miaox@cn.fujitsu.com>
      CC: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.cz>
      Signed-off-by: NChris Mason <clm@fb.com>
      521e0546
    • D
      btrfs: remove redundant null check in btrfs_dentry_release() · 944a4515
      Daeseok Youn 提交于
      It doesn't need to check NULL for kfree()
      Signed-off-by: NDaeseok Youn <daeseok.youn@gmail.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      944a4515
    • F
      Btrfs: implement inode_operations callback tmpfile · ef3b9af5
      Filipe Manana 提交于
      This implements the tmpfile callback of struct inode_operations, introduced
      in the linux kernel 3.11, and implemented already by some filesystems. This
      callback is invoked by the VFS when the flag O_TMPFILE is passed to the open
      system call.
      Signed-off-by: NFilipe David Borba Manana <fdmanana@gmail.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      Reviewed-by: NDavid Sterba <dsterba@suse.cz>
      ef3b9af5
    • D
      btrfs: make FS_INFO ioctl available to anyone · e4ef90ff
      David Sterba 提交于
      This ioctl provides basic info about the filesystem that can be obtained
      in other ways (eg. sysfs), there's no reason to restrict it to
      CAP_SYSADMIN.
      Signed-off-by: NDavid Sterba <dsterba@suse.cz>
      Signed-off-by: NChris Mason <clm@fb.com>
      e4ef90ff
    • D
      btrfs: make DEV_INFO ioctl available to anyone · 7d6213c5
      David Sterba 提交于
      This ioctl provides basic info about the devices that can be obtained in
      other ways (eg. sysfs), there's no reason to restrict it to
      CAP_SYSADMIN.
      Signed-off-by: NDavid Sterba <dsterba@suse.cz>
      Signed-off-by: NChris Mason <clm@fb.com>
      7d6213c5
    • D
      btrfs: export more from FS_INFO to sysfs · df93589a
      David Sterba 提交于
      Similar to the FS_INFO updates, export the basic filesystem info through
      sysfs: node size, sector size and clone alignment.
      Signed-off-by: NDavid Sterba <dsterba@suse.cz>
      Signed-off-by: NChris Mason <clm@fb.com>
      df93589a
    • D
      btrfs: retrieve more info from FS_INFO ioctl · 80a773fb
      David Sterba 提交于
      Provide the basic information about filesystem through the ioctl:
      * b-tree node size (same as leaf size)
      * sector size
      * expected alignment of CLONE_RANGE and EXTENT_SAME ioctl arguments
      
      Backward compatibility: if the values are 0, kernel does not provide
      this information, the applications should ignore them.
      Signed-off-by: NDavid Sterba <dsterba@suse.cz>
      Signed-off-by: NChris Mason <clm@fb.com>
      80a773fb
    • D
      btrfs: balance filter: add limit of processed chunks · 7d824b6f
      David Sterba 提交于
      This started as debugging helper, to watch the effects of converting
      between raid levels on multiple devices, but could be useful standalone.
      
      In my case the usage filter was not finegrained enough and led to
      converting too many chunks at once. Another example use is in connection
      with drange+devid or vrange filters that allow to work with a specific
      chunk or even with a chunk on a given device.
      
      The limit filter applies last, the value of 0 means no limiting.
      
      CC: Ilya Dryomov <idryomov@gmail.com>
      CC: Hugo Mills <hugo@carfax.org.uk>
      Signed-off-by: NDavid Sterba <dsterba@suse.cz>
      Signed-off-by: NChris Mason <clm@fb.com>
      7d824b6f
    • F
      Btrfs: fix leaf corruption caused by ENOSPC while hole punching · fc19c5e7
      Filipe Manana 提交于
      While running a stress test with multiple threads writing to the same btrfs
      file system, I ended up with a situation where a leaf was corrupted in that
      it had 2 file extent item keys that had the same exact key. I was able to
      detect this quickly thanks to the following patch which triggers an assertion
      as soon as a leaf is marked dirty if there are duplicated keys or out of order
      keys:
      
          Btrfs: check if items are ordered when a leaf is marked dirty
          (https://patchwork.kernel.org/patch/3955431/)
      
      Basically while running the test, I got the following in dmesg:
      
          [28877.415877] WARNING: CPU: 2 PID: 10706 at fs/btrfs/file.c:553 btrfs_drop_extent_cache+0x435/0x440 [btrfs]()
          (...)
          [28877.415917] Call Trace:
          [28877.415922]  [<ffffffff816f1189>] dump_stack+0x4e/0x68
          [28877.415926]  [<ffffffff8104a32c>] warn_slowpath_common+0x8c/0xc0
          [28877.415929]  [<ffffffff8104a37a>] warn_slowpath_null+0x1a/0x20
          [28877.415944]  [<ffffffffa03775a5>] btrfs_drop_extent_cache+0x435/0x440 [btrfs]
          [28877.415949]  [<ffffffff8118e7be>] ? kmem_cache_alloc+0xfe/0x1c0
          [28877.415962]  [<ffffffffa03777d9>] fill_holes+0x229/0x3e0 [btrfs]
          [28877.415972]  [<ffffffffa0345865>] ? block_rsv_add_bytes+0x55/0x80 [btrfs]
          [28877.415984]  [<ffffffffa03792cb>] btrfs_fallocate+0xb6b/0xc20 [btrfs]
          (...)
          [29854.132560] BTRFS critical (device sdc): corrupt leaf, bad key order: block=955232256,root=1, slot=24
          [29854.132565] BTRFS info (device sdc): leaf 955232256 total ptrs 40 free space 778
          (...)
          [29854.132637] 	item 23 key (3486 108 667648) itemoff 2694 itemsize 53
          [29854.132638] 		extent data disk bytenr 14574411776 nr 286720
          [29854.132639] 		extent data offset 0 nr 286720 ram 286720
          [29854.132640] 	item 24 key (3486 108 954368) itemoff 2641 itemsize 53
          [29854.132641] 		extent data disk bytenr 0 nr 0
          [29854.132643] 		extent data offset 0 nr 0 ram 0
          [29854.132644] 	item 25 key (3486 108 954368) itemoff 2588 itemsize 53
          [29854.132645] 		extent data disk bytenr 8699670528 nr 77824
          [29854.132646] 		extent data offset 0 nr 77824 ram 77824
          [29854.132647] 	item 26 key (3486 108 1146880) itemoff 2535 itemsize 53
          [29854.132648] 		extent data disk bytenr 8699670528 nr 77824
          [29854.132649] 		extent data offset 0 nr 77824 ram 77824
          (...)
          [29854.132707] kernel BUG at fs/btrfs/ctree.h:3901!
          (...)
          [29854.132771] Call Trace:
          [29854.132779]  [<ffffffffa0342b5c>] setup_items_for_insert+0x2dc/0x400 [btrfs]
          [29854.132791]  [<ffffffffa0378537>] __btrfs_drop_extents+0xba7/0xdd0 [btrfs]
          [29854.132794]  [<ffffffff8109c0d6>] ? trace_hardirqs_on_caller+0x16/0x1d0
          [29854.132797]  [<ffffffff8109c29d>] ? trace_hardirqs_on+0xd/0x10
          [29854.132800]  [<ffffffff8118e7be>] ? kmem_cache_alloc+0xfe/0x1c0
          [29854.132810]  [<ffffffffa036783b>] insert_reserved_file_extent.constprop.66+0xab/0x310 [btrfs]
          [29854.132820]  [<ffffffffa036a6c6>] __btrfs_prealloc_file_range+0x116/0x340 [btrfs]
          [29854.132830]  [<ffffffffa0374d53>] btrfs_prealloc_file_range+0x23/0x30 [btrfs]
          (...)
      
      So this is caused by getting an -ENOSPC error while punching a file hole, more
      specifically, we get -ENOSPC error from __btrfs_drop_extents in the while loop
      of file.c:btrfs_punch_hole() when it's unable to modify the btree to delete one
      or more file extent items due to lack of enough free space. When this happens,
      in btrfs_punch_hole(), we attempt to reclaim free space by switching our transaction
      block reservation object to root->fs_info->trans_block_rsv, end our transaction and
      start a new transaction basically - and, we keep increasing our current offset
      (cur_offset) as long as it's smaller than the end of the target range (lockend) -
      this makes use leave the loop with cur_offset == drop_end which in turn makes us
      call fill_holes() for inserting a file extent item that represents a 0 bytes range
      hole (and this insertion succeeds, as in the meanwhile more space became available).
      
      This 0 bytes file hole extent item is a problem because any subsequent caller of
      __btrfs_drop_extents (regular file writes, or fallocate calls for e.g.), with a
      start file offset that is equal to the offset of the hole, will not remove this
      extent item due to the following conditional in the while loop of
      __btrfs_drop_extents:
      
          if (extent_end <= search_start) {
                  path->slots[0]++;
                  goto next_slot;
          }
      
      This later makes the call to setup_items_for_insert() (at the very end of
      __btrfs_drop_extents), insert a new file extent item with the same offset as
      the 0 bytes file hole extent item that follows it. Needless is to say that this
      causes chaos, either when reading the leaf from disk (btree_readpage_end_io_hook),
      where we perform leaf sanity checks or in subsequent operations that manipulate
      file extent items, as in the fallocate call as shown by the dmesg trace above.
      
      Without my other patch to perform the leaf sanity checks once a leaf is marked
      as dirty (if the integrity checker is enabled), it would have been much harder
      to debug this issue.
      
      This change might fix a few similar issues reported by users in the mailing
      list regarding assertion failures in btrfs_set_item_key_safe calls performed
      by __btrfs_drop_extents, such as the following report:
      
          http://comments.gmane.org/gmane.comp.file-systems.btrfs/32938
      
      Asking fill_holes() to create a 0 bytes wide file hole item also produced the
      first warning in the trace above, as we passed a range to btrfs_drop_extent_cache
      that has an end smaller (by -1) than its start.
      
      On 3.14 kernels this issue manifests itself through leaf corruption, as we get
      duplicated file extent item keys in a leaf when calling setup_items_for_insert(),
      but on older kernels, setup_items_for_insert() isn't called by __btrfs_drop_extents(),
      instead we have callers of __btrfs_drop_extents(), namely the functions
      inode.c:insert_inline_extent() and inode.c:insert_reserved_file_extent(), calling
      btrfs_insert_empty_item() to insert the new file extent item, which would fail with
      error -EEXIST, instead of inserting a duplicated key - which is still a serious
      issue as it would make all similar file extent item replace operations keep
      failing if they target the same file range.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NFilipe David Borba Manana <fdmanana@gmail.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      fc19c5e7
    • L
      Btrfs: do not increment on bio_index one by one · d2cbf2a2
      Liu Bo 提交于
      'bio_index' is just a index, it's really not necessary to do increment
      one by one.
      Signed-off-by: NLiu Bo <bo.li.liu@oracle.com>
      Reviewed-by: NDavid Sterba <dsterba@suse.cz>
      Signed-off-by: NChris Mason <clm@fb.com>
      d2cbf2a2
    • F
      Btrfs: read inode size after acquiring the mutex when punching a hole · a1a50f60
      Filipe Manana 提交于
      In a previous change, commit 12870f1c,
      I accidentally moved the roundup of inode->i_size to outside of the
      critical section delimited by the inode mutex, which is not atomic and
      not correct since the size can be changed by other task before we acquire
      the mutex. Therefore fix it.
      Signed-off-by: NFilipe David Borba Manana <fdmanana@gmail.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      a1a50f60
    • T
      btrfs: Remove unnecessary check for NULL · 7fb18a06
      Tobias Klauser 提交于
      iput() already checks for the inode being NULL, thus it's unnecessary to
      check before calling.
      Signed-off-by: NTobias Klauser <tklauser@distanz.ch>
      Signed-off-by: NChris Mason <clm@fb.com>
      7fb18a06
    • Z
      btrfs: fix inline compressed read err corruption · 166ae5a4
      Zach Brown 提交于
      uncompress_inline() is dropping the error from btrfs_decompress() after
      testing it and zeroing the page that was supposed to hold decompressed
      data.  This can silently turn compressed inline data in to zeros if
      decompression fails due to corrupt compressed data or memory allocation
      failure.
      
      I verified this by manually forcing the error from btrfs_decompress()
      for a silly named copy of od:
      
      	if (!strcmp(current->comm, "failod"))
      		ret = -ENOMEM;
      
        # od -x /mnt/btrfs/dir/80 | head -1
        0000000 3031 3038 310a 2d30 6f70 6e69 0a74 3031
        # echo 3 > /proc/sys/vm/drop_caches
        # cp $(which od) /tmp/failod
        # /tmp/failod -x /mnt/btrfs/dir/80 | head -1
        0000000 0000 0000 0000 0000 0000 0000 0000 0000
      
      The fix is to pass the error to its caller.  Which still has a BUG_ON().
      So we fix that too.
      
      There seems to be no reason for the zeroing of the page on the error
      from btrfs_decompress() but not from the allocation error a few lines
      above.  So the page zeroing is removed.
      Signed-off-by: NZach Brown <zab@redhat.com>
      Reviewed-by: NDavid Sterba <dsterba@suse.cz>
      Signed-off-by: NChris Mason <clm@fb.com>
      166ae5a4
    • Z
      btrfs: return ptr error from compression workspace · 774bcb35
      Zach Brown 提交于
      The btrfs compression wrappers translated errors from workspace
      allocation to either -ENOMEM or -1.  The compression type workspace
      allocators are already returning a ERR_PTR(-ENOMEM).  Just return that
      and get rid of the magical -1.
      
      This helps a future patch return errors from the compression wrappers.
      Signed-off-by: NZach Brown <zab@redhat.com>
      Reviewed-by: NDavid Sterba <dsterba@suse.cz>
      Signed-off-by: NChris Mason <clm@fb.com>
      774bcb35
    • Z
      btrfs: return errno instead of -1 from compression · 60e1975a
      Zach Brown 提交于
      The compression layer seems to have been built to return -1 and have
      callers make up errors that make sense.  This isn't great because there
      are different errors that originate down in the compression layer.
      
      Let's return real negative errnos from the compression layer so that
      callers can pass on the error without having to guess what happened.
      ENOMEM for allocation failure, E2BIG when compression exceeds the
      uncompressed input, and EIO for everything else.
      
      This helps a future path return errors from btrfs_decompress().
      Signed-off-by: NZach Brown <zab@redhat.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      60e1975a
    • S
      btrfs: check_int: propagate out-of-memory error upwards · 98806b44
      Stefan Behrens 提交于
      This issue was not causing any harm but IMO (and in the opinion of the
      static code checker) it is better to propagate this error status upwards.
      Signed-off-by: NStefan Behrens <sbehrens@giantdisaster.de>
      Reported-by: NDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      98806b44
    • F
      Btrfs: fix hang on error (such as ENOSPC) when writing extent pages · 61391d56
      Filipe Manana 提交于
      When running low on available disk space and having several processes
      doing buffered file IO, I got the following trace in dmesg:
      
      [ 4202.720152] INFO: task kworker/u8:1:5450 blocked for more than 120 seconds.
      [ 4202.720401]       Not tainted 3.13.0-fdm-btrfs-next-26+ #1
      [ 4202.720596] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [ 4202.720874] kworker/u8:1    D 0000000000000001     0  5450      2 0x00000000
      [ 4202.720904] Workqueue: btrfs-flush_delalloc normal_work_helper [btrfs]
      [ 4202.720908]  ffff8801f62ddc38 0000000000000082 ffff880203ac2490 00000000001d3f40
      [ 4202.720913]  ffff8801f62ddfd8 00000000001d3f40 ffff8800c4f0c920 ffff880203ac2490
      [ 4202.720918]  00000000001d4a40 ffff88020fe85a40 ffff88020fe85ab8 0000000000000001
      [ 4202.720922] Call Trace:
      [ 4202.720931]  [<ffffffff816a3cb9>] schedule+0x29/0x70
      [ 4202.720950]  [<ffffffffa01ec48d>] btrfs_start_ordered_extent+0x6d/0x110 [btrfs]
      [ 4202.720956]  [<ffffffff8108e620>] ? bit_waitqueue+0xc0/0xc0
      [ 4202.720972]  [<ffffffffa01ec559>] btrfs_run_ordered_extent_work+0x29/0x40 [btrfs]
      [ 4202.720988]  [<ffffffffa0201987>] normal_work_helper+0x137/0x2c0 [btrfs]
      [ 4202.720994]  [<ffffffff810680e5>] process_one_work+0x1f5/0x530
      (...)
      [ 4202.721027] 2 locks held by kworker/u8:1/5450:
      [ 4202.721028]  #0:  (%s-%s){++++..}, at: [<ffffffff81068083>] process_one_work+0x193/0x530
      [ 4202.721037]  #1:  ((&work->normal_work)){+.+...}, at: [<ffffffff81068083>] process_one_work+0x193/0x530
      [ 4202.721054] INFO: task btrfs:7891 blocked for more than 120 seconds.
      [ 4202.721258]       Not tainted 3.13.0-fdm-btrfs-next-26+ #1
      [ 4202.721444] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [ 4202.721699] btrfs           D 0000000000000001     0  7891   7890 0x00000001
      [ 4202.721704]  ffff88018c2119e8 0000000000000086 ffff8800a33d2490 00000000001d3f40
      [ 4202.721710]  ffff88018c211fd8 00000000001d3f40 ffff8802144b0000 ffff8800a33d2490
      [ 4202.721714]  ffff8800d8576640 ffff88020fe85bc0 ffff88020fe85bc8 7fffffffffffffff
      [ 4202.721718] Call Trace:
      [ 4202.721723]  [<ffffffff816a3cb9>] schedule+0x29/0x70
      [ 4202.721727]  [<ffffffff816a2ebc>] schedule_timeout+0x1dc/0x270
      [ 4202.721732]  [<ffffffff8109bd79>] ? mark_held_locks+0xb9/0x140
      [ 4202.721736]  [<ffffffff816a90c0>] ? _raw_spin_unlock_irq+0x30/0x40
      [ 4202.721740]  [<ffffffff8109bf0d>] ? trace_hardirqs_on_caller+0x10d/0x1d0
      [ 4202.721744]  [<ffffffff816a488f>] wait_for_completion+0xdf/0x120
      [ 4202.721749]  [<ffffffff8107fa90>] ? try_to_wake_up+0x310/0x310
      [ 4202.721765]  [<ffffffffa01ebee4>] btrfs_wait_ordered_extents+0x1f4/0x280 [btrfs]
      [ 4202.721781]  [<ffffffffa020526e>] btrfs_mksubvol.isra.62+0x30e/0x5a0 [btrfs]
      [ 4202.721786]  [<ffffffff8108e620>] ? bit_waitqueue+0xc0/0xc0
      [ 4202.721799]  [<ffffffffa02056a9>] btrfs_ioctl_snap_create_transid+0x1a9/0x1b0 [btrfs]
      [ 4202.721813]  [<ffffffffa020583a>] btrfs_ioctl_snap_create_v2+0x10a/0x170 [btrfs]
      (...)
      
      It turns out that extent_io.c:__extent_writepage(), which ends up being called
      through filemap_fdatawrite_range() in btrfs_start_ordered_extent(), was getting
      -ENOSPC when calling the fill_delalloc callback. In this situation, it returned
      without the writepage_end_io_hook callback (inode.c:btrfs_writepage_end_io_hook)
      ever being called for the respective page, which prevents the ordered extent's
      bytes_left count from ever reaching 0, and therefore a finish_ordered_fn work
      is never queued into the endio_write_workers queue. This makes the task that
      called btrfs_start_ordered_extent() hang forever on the wait queue of the ordered
      extent.
      
      This is fairly easy to reproduce using a small filesystem and fsstress on
      a quad core vm:
      
          mkfs.btrfs -f -b `expr 2100 \* 1024 \* 1024` /dev/sdd
          mount /dev/sdd /mnt
      
          fsstress -p 6 -d /mnt -n 100000 -x \
              "btrfs subvolume snapshot -r /mnt /mnt/mysnap" \
      	    -f allocsp=0 \
      	    -f bulkstat=0 \
      	    -f bulkstat1=0 \
      	    -f chown=0 \
      	    -f creat=1 \
      	    -f dread=0 \
      	    -f dwrite=0 \
      	    -f fallocate=1 \
      	    -f fdatasync=0 \
      	    -f fiemap=0 \
      	    -f freesp=0 \
      	    -f fsync=0 \
      	    -f getattr=0 \
      	    -f getdents=0 \
      	    -f link=0 \
      	    -f mkdir=0 \
      	    -f mknod=0 \
      	    -f punch=1 \
      	    -f read=0 \
      	    -f readlink=0 \
      	    -f rename=0 \
      	    -f resvsp=0 \
      	    -f rmdir=0 \
      	    -f setxattr=0 \
      	    -f stat=0 \
      	    -f symlink=0 \
      	    -f sync=0 \
      	    -f truncate=1 \
      	    -f unlink=0 \
      	    -f unresvsp=0 \
      	    -f write=4
      
      So just ensure that if an error happens while writing the extent page
      we call the writepage_end_io_hook callback. Also make it return the
      error code and ensure the caller (extent_write_cache_pages) processes
      all pages in the page vector even if an error happens only for some
      of them, so that ordered extents end up released.
      Signed-off-by: NFilipe David Borba Manana <fdmanana@gmail.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      61391d56
  2. 07 6月, 2014 2 次提交
    • N
      mm: add !pte_present() check on existing hugetlb_entry callbacks · d4c54919
      Naoya Horiguchi 提交于
      The age table walker doesn't check non-present hugetlb entry in common
      path, so hugetlb_entry() callbacks must check it.  The reason for this
      behavior is that some callers want to handle it in its own way.
      
      [ I think that reason is bogus, btw - it should just do what the regular
        code does, which is to call the "pte_hole()" function for such hugetlb
        entries  - Linus]
      
      However, some callers don't check it now, which causes unpredictable
      result, for example when we have a race between migrating hugepage and
      reading /proc/pid/numa_maps.  This patch fixes it by adding !pte_present
      checks on buggy callbacks.
      
      This bug exists for years and got visible by introducing hugepage
      migration.
      
      ChangeLog v2:
      - fix if condition (check !pte_present() instead of pte_present())
      Reported-by: NSasha Levin <sasha.levin@oracle.com>
      Signed-off-by: NNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: <stable@vger.kernel.org> [3.12+]
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      [ Backported to 3.15.  Signed-off-by: Josh Boyer <jwboyer@fedoraproject.org> ]
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      d4c54919
    • F
      Btrfs: send, fix corrupted path strings for long paths · 01a9a8a9
      Filipe Manana 提交于
      If a path has more than 230 characters, we allocate a new buffer to
      use for the path, but we were forgotting to copy the contents of the
      previous buffer into the new one, which has random content from the
      kmalloc call.
      
      Test:
      
          mkfs.btrfs -f /dev/sdd
          mount /dev/sdd /mnt
      
          TEST_PATH="/mnt/fdmanana/.config/google-chrome-mysetup/Default/Pepper_Data/Shockwave_Flash/WritableRoot/#SharedObjects/JSHJ4ZKN/s.wsj.net/[[IMPORT]]/players.edgesuite.net/flash/plugins/osmf/advanced-streaming-plugin/v2.7/osmf1.6/Ak#"
          mkdir -p $TEST_PATH
          echo "hello world" > $TEST_PATH/amaiAdvancedStreamingPlugin.txt
      
          btrfs subvolume snapshot -r /mnt /mnt/mysnap1
          btrfs send /mnt/mysnap1 -f /tmp/1.snap
      
      A test for xfstests follows.
      Signed-off-by: NFilipe David Borba Manana <fdmanana@gmail.com>
      Cc: Marc Merlin <marc@merlins.org>
      Tested-by: NMarc MERLIN <marc@merlins.org>
      Signed-off-by: NChris Mason <clm@fb.com>
      01a9a8a9
  3. 03 6月, 2014 1 次提交
  4. 01 6月, 2014 1 次提交
    • L
      dcache: add missing lockdep annotation · 9f12600f
      Linus Torvalds 提交于
      lock_parent() very much on purpose does nested locking of dentries, and
      is careful to maintain the right order (lock parent first).  But because
      it didn't annotate the nested locking order, lockdep thought it might be
      a deadlock on d_lock, and complained.
      
      Add the proper annotation for the inner locking of the child dentry to
      make lockdep happy.
      
      Introduced by commit 046b961b ("shrink_dentry_list(): take parent's
      ->d_lock earlier").
      Reported-and-tested-by: NJosh Boyer <jwboyer@fedoraproject.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      9f12600f
  5. 30 5月, 2014 3 次提交
    • A
      dentry_kill() doesn't need the second argument now · 8cbf74da
      Al Viro 提交于
      it's 1 in the only remaining caller.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      8cbf74da
    • A
      dealing with the rest of shrink_dentry_list() livelock · b2b80195
      Al Viro 提交于
      We have the same problem with ->d_lock order in the inner loop, where
      we are dropping references to ancestors.  Same solution, basically -
      instead of using dentry_kill() we use lock_parent() (introduced in the
      previous commit) to get that lock in a safe way, recheck ->d_count
      (in case if lock_parent() has ended up dropping and retaking ->d_lock
      and somebody managed to grab a reference during that window), trylock
      the inode->i_lock and use __dentry_kill() to do the rest.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      b2b80195
    • A
      shrink_dentry_list(): take parent's ->d_lock earlier · 046b961b
      Al Viro 提交于
      The cause of livelocks there is that we are taking ->d_lock on
      dentry and its parent in the wrong order, forcing us to use
      trylock on the parent's one.  d_walk() takes them in the right
      order, and unfortunately it's not hard to create a situation
      when shrink_dentry_list() can't make progress since trylock
      keeps failing, and shrink_dcache_parent() or check_submounts_and_drop()
      keeps calling d_walk() disrupting the very shrink_dentry_list() it's
      waiting for.
      
      Solution is straightforward - if that trylock fails, let's unlock
      the dentry itself and take locks in the right order.  We need to
      stabilize ->d_parent without holding ->d_lock, but that's doable
      using RCU.  And we'd better do that in the very beginning of the
      loop in shrink_dentry_list(), since the checks on refcount, etc.
      would need to be redone anyway.
      
      That deals with a half of the problem - killing dentries on the
      shrink list itself.  Another one (dropping their parents) is
      in the next commit.
      
      locking parent is interesting - it would be easy to do rcu_read_lock(),
      lock whatever we think is a parent, lock dentry itself and check
      if the parent is still the right one.  Except that we need to check
      that *before* locking the dentry, or we are risking taking ->d_lock
      out of order.  Fortunately, once the D1 is locked, we can check if
      D2->d_parent is equal to D1 without the need to lock D2; D2->d_parent
      can start or stop pointing to D1 only under D1->d_lock, so taking
      D1->d_lock is enough.  In other words, the right solution is
      rcu_read_lock/lock what looks like parent right now/check if it's
      still our parent/rcu_read_unlock/lock the child.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      046b961b
  6. 29 5月, 2014 2 次提交
  7. 28 5月, 2014 2 次提交
  8. 24 5月, 2014 1 次提交
  9. 23 5月, 2014 3 次提交
    • D
      AFS: Pass an afs_call* to call->async_workfn() instead of a work_struct* · 656f88dd
      David Howells 提交于
      call->async_workfn() can take an afs_call* arg rather than a work_struct* as
      the functions assigned there are now called from afs_async_workfn() which has
      to call container_of() anyway.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Reviewed-by: NNathaniel Wesley Filardo <nwf@cs.jhu.edu>
      Reviewed-by: NTejun Heo <tj@kernel.org>
      656f88dd
    • N
      AFS: Fix kafs module unloading · 150a6b47
      Nathaniel Wesley Filardo 提交于
      At present, it is not possible to successfully unload the kafs module if there
      are outstanding async outgoing calls (those made with afs_make_call()).  This
      appears to be due to the changes introduced by:
      
      	commit 05949945
      	Author: Tejun Heo <tj@kernel.org>
      	Date:   Fri Mar 7 10:24:50 2014 -0500
      	Subject: afs: don't use PREPARE_WORK
      
      which didn't go far enough.  The problem is due to:
      
       (1) The aforementioned commit introduced a separate handler function pointer
           in the call, call->async_workfn, in addition to the original workqueue
           item, call->async_work, for asynchronous operations because workqueues
           subsystem cannot handle the workqueue item pointer being changed whilst
           the item is queued or being processed.
      
       (2) afs_async_workfn() was introduced in that commit to be the callback for
           call->async_work.  Its sole purpose is to run whatever call->async_workfn
           points to.
      
       (3) call->async_workfn is only used from afs_async_workfn(), which is only
           set on async_work by afs_collect_incoming_call() - ie. for incoming
           calls.
      
       (4) call->async_workfn is *not* set by afs_make_call() when outgoing calls are
           made, and call->async_work is set afs_process_async_call() - and not
           afs_async_workfn().
      
       (5) afs_process_async_call() now changes call->async_workfn rather than
           call->async_work to point to afs_delete_async_call() to clean up, but this
           is only effective for incoming calls because call->async_work does not
           point to afs_async_workfn() for outgoing calls.
      
       (6) Because, for incoming calls, call->async_work remains pointing to
           afs_process_async_call() this results in an infinite loop.
      
      Instead, make the workqueue uniformly vector through call->async_workfn, via
      afs_async_workfn() and simply initialise call->async_workfn to point to
      afs_process_async_call() in afs_make_call().
      Signed-off-by: NNathaniel Wesley Filardo <nwf@cs.jhu.edu>
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Reviewed-by: NTejun Heo <tj@kernel.org>
      150a6b47
    • N
      AFS: Part of afs_end_call() is identical to code elsewhere, so split it · 6cf12869
      Nathaniel Wesley Filardo 提交于
      Split afs_end_call() into two pieces, one of which is identical to code in
      afs_process_async_call().  Replace the latter with a call to the first part of
      afs_end_call().
      Signed-off-by: NNathaniel Wesley Filardo <nwf@cs.jhu.edu>
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      6cf12869
  10. 21 5月, 2014 5 次提交
    • J
      nfsd4: warn on finding lockowner without stateid's · 27b11428
      J. Bruce Fields 提交于
      The current code assumes a one-to-one lockowner<->lock stateid
      correspondance.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      27b11428
    • J
      nfsd4: remove lockowner when removing lock stateid · a1b8ff4c
      J. Bruce Fields 提交于
      The nfsv4 state code has always assumed a one-to-one correspondance
      between lock stateid's and lockowners even if it appears not to in some
      places.
      
      We may actually change that, but for now when FREE_STATEID releases a
      lock stateid it also needs to release the parent lockowner.
      
      Symptoms were a subsequent LOCK crashing in find_lockowner_str when it
      calls same_lockowner_ino on a lockowner that unexpectedly has an empty
      so_stateids list.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      a1b8ff4c
    • D
      AFS: Fix cache manager service handlers · 6c67c7c3
      David Howells 提交于
      Fix the cache manager RPC service handlers.  The afs_send_empty_reply() and
      afs_send_simple_reply() functions:
      
       (a) Kill the call and free up the buffers associated with it if they fail.
      
       (b) Return with call intact if it they succeed.
      
      However, none of the callers actually check the result or clean up if
      successful - and may use the now non-existent data if it fails.
      
      This was detected by Dan Carpenter using a static checker:
      
      	The patch 08e0e7c8: "[AF_RXRPC]: Make the in-kernel AFS
      	filesystem use AF_RXRPC." from Apr 26, 2007, leads to the following
      	static checker warning:
      	"fs/afs/cmservice.c:155 SRXAFSCB_CallBack()
      		 warn: 'call' was already freed."
      Reported-by: NDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      6c67c7c3
    • F
      Btrfs: send, fix incorrect ref access when using extrefs · 51a60253
      Filipe Manana 提交于
      When running send, if an inode only has extended reference items
      associated to it and no regular references, send.c:get_first_ref()
      was incorrectly assuming the reference it found was of type
      BTRFS_INODE_REF_KEY due to use of the wrong key variable.
      This caused weird behaviour when using the found item has a regular
      reference, such as weird path string, and occasionally (when lucky)
      a crash:
      
      [  190.600652] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
      [  190.600994] Modules linked in: btrfs xor raid6_pq binfmt_misc nfsd auth_rpcgss oid_registry nfs_acl nfs lockd fscache sunrpc psmouse serio_raw evbug pcspkr i2c_piix4 e1000 floppy
      [  190.602565] CPU: 2 PID: 14520 Comm: btrfs Not tainted 3.13.0-fdm-btrfs-next-26+ #1
      [  190.602728] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
      [  190.602868] task: ffff8800d447c920 ti: ffff8801fa79e000 task.ti: ffff8801fa79e000
      [  190.603030] RIP: 0010:[<ffffffff813266b4>]  [<ffffffff813266b4>] memcpy+0x54/0x110
      [  190.603262] RSP: 0018:ffff8801fa79f880  EFLAGS: 00010202
      [  190.603395] RAX: ffff8800d4326e3f RBX: 000000000000036a RCX: ffff880000000000
      [  190.603553] RDX: 000000000000032a RSI: ffe708844042936a RDI: ffff8800d43271a9
      [  190.603710] RBP: ffff8801fa79f8c8 R08: 00000000003a4ef0 R09: 0000000000000000
      [  190.603867] R10: 793a4ef09f000000 R11: 9f0000000053726f R12: ffff8800d43271a9
      [  190.604020] R13: 0000160000000000 R14: ffff8802110134f0 R15: 000000000000036a
      [  190.604020] FS:  00007fb423d09b80(0000) GS:ffff880216200000(0000) knlGS:0000000000000000
      [  190.604020] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [  190.604020] CR2: 00007fb4229d4b78 CR3: 00000001f5d76000 CR4: 00000000000006e0
      [  190.604020] Stack:
      [  190.604020]  ffffffffa01f4d49 ffff8801fa79f8f0 00000000000009f9 ffff8801fa79f8c8
      [  190.604020]  00000000000009f9 ffff880211013260 000000000000f971 ffff88021147dba8
      [  190.604020]  00000000000009f9 ffff8801fa79f918 ffffffffa02367f5 ffff8801fa79f928
      [  190.604020] Call Trace:
      [  190.604020]  [<ffffffffa01f4d49>] ? read_extent_buffer+0xb9/0x120 [btrfs]
      [  190.604020]  [<ffffffffa02367f5>] fs_path_add_from_extent_buffer+0x45/0x60 [btrfs]
      [  190.604020]  [<ffffffffa0238806>] get_first_ref+0x1f6/0x210 [btrfs]
      [  190.604020]  [<ffffffffa0238994>] __get_cur_name_and_parent+0x174/0x3a0 [btrfs]
      [  190.604020]  [<ffffffff8118df3d>] ? kmem_cache_alloc_trace+0x11d/0x1e0
      [  190.604020]  [<ffffffffa0236674>] ? fs_path_alloc+0x24/0x60 [btrfs]
      [  190.604020]  [<ffffffffa0238c91>] get_cur_path+0xd1/0x240 [btrfs]
      (...)
      
      Steps to reproduce (either crash or some weirdness like an odd path string):
      
          mkfs.btrfs -f -O extref /dev/sdd
          mount /dev/sdd /mnt
      
          mkdir /mnt/testdir
          touch /mnt/testdir/foobar
      
          for i in `seq 1 2550`; do
              ln /mnt/testdir/foobar /mnt/testdir/foobar_link_`printf "%04d" $i`
          done
      
          ln /mnt/testdir/foobar /mnt/testdir/final_foobar_name
      
          rm -f /mnt/testdir/foobar
          for i in `seq 1 2550`; do
              rm -f /mnt/testdir/foobar_link_`printf "%04d" $i`
          done
      
          btrfs subvolume snapshot -r /mnt /mnt/mysnap
          btrfs send /mnt/mysnap -f /tmp/mysnap.send
      Signed-off-by: NFilipe David Borba Manana <fdmanana@gmail.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      Reviewed-by: NLiu Bo <bo.li.liu@oracle.com>
      51a60253
    • L
      Btrfs: fix EIO on reading file after ioctl clone works on it · d3ecfcdf
      Liu Bo 提交于
      For inline data extent, we need to make its length aligned, otherwise,
      we can get a phantom extent map which confuses readpages() to return -EIO.
      
      This can be detected by xfstests/btrfs/035.
      Reported-by: NDavid Disseldorp <ddiss@suse.de>
      Signed-off-by: NLiu Bo <bo.li.liu@oracle.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      d3ecfcdf
  11. 20 5月, 2014 1 次提交
    • T
      sysfs: make sure read buffer is zeroed · f5c16f29
      Tejun Heo 提交于
      13c589d5 ("sysfs: use seq_file when reading regular files")
      switched sysfs from custom read implementation to seq_file to enable
      later transition to kernfs.  After the change, the buffer passed to
      ->show() is acquired through seq_get_buf(); unfortunately, this
      introduces a subtle behavior change.  Before the commit, the buffer
      passed to ->show() was always zero as it was allocated using
      get_zeroed_page().  Because seq_file doesn't clear buffers on
      allocation and neither does seq_get_buf(), after the commit, depending
      on the behavior of ->show(), we may end up exposing uninitialized data
      to userland thus possibly altering userland visible behavior and
      leaking information.
      
      Fix it by explicitly clearing the buffer.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Reported-by: NRon <ron@debian.org>
      Fixes: 13c589d5 ("sysfs: use seq_file when reading regular files")
      Cc: stable <stable@vger.kernel.org> # 3.13+
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f5c16f29
  12. 16 5月, 2014 1 次提交
  13. 15 5月, 2014 1 次提交