提交 · dc2ee4e244138124a05cdc39365b38d4cc77e3fa · openeuler / Kernel

09 8月, 2015 19 次提交

btrfs: Cleanup: Remove chunk_objectid argument from btrfs_relocate_chunk() · dc2ee4e2

由 Zhaolei 提交于 8月 05, 2015

Remove chunk_objectid argument from btrfs_relocate_chunk() because
it is not necessary, it can also cleanup some code in caller for
prepare its value.
Signed-off-by: NZhao Lei <zhaolei@cn.fujitsu.com>
Reviewed-by: NDavid Sterba <dsterba@suse.com>
Signed-off-by: NChris Mason <clm@fb.com>

dc2ee4e2

btrfs: Cleanup: Remove objectid's init-value in create_reloc_inode() · 4624900d

由 Zhaolei 提交于 8月 05, 2015

objectid's init-value is not used in any case, remove it.
Signed-off-by: NZhao Lei <zhaolei@cn.fujitsu.com>
Reviewed-by: NDavid Sterba <dsterba@suse.com>
Signed-off-by: NChris Mason <clm@fb.com>

4624900d

btrfs: Error handle for get_ref_objectid_v0() in relocate_block_group() · 4b3576e4

由 Zhaolei 提交于 8月 05, 2015

We need error checking code for get_ref_objectid_v0() in
relocate_block_group(), to avoid unpredictable result, especially
for accessing uninitialized value(when function failed) after
this line.
Signed-off-by: NZhao Lei <zhaolei@cn.fujitsu.com>
Reviewed-by: NDavid Sterba <dsterba@suse.com>
Signed-off-by: NChris Mason <clm@fb.com>

4b3576e4

btrfs: Fix data checksum error cause by replace with io-load. · 55e3a601

由 Zhaolei 提交于 8月 05, 2015

xfstests btrfs/070 sometimes failed.
In my test machine, its fail rate is about 30%.
In another vm(vmware), its fail rate is about 50%.

Reason:
  btrfs/070 do replace and defrag with fsstress simultaneously,
  after above operation, checksum error is found by scrub.

  Actually, it have no relationship with defrag operation, only
  replace with fsstress can trigger this bug.

  New data writen to target device have possibility rewrited by
  old data from source device by replace code in debug, to avoid
  above problem, we can set target block group to readonly in
  replace period, so new data requested by other operation will
  not write to same place with replace code.

  Before patch(4.1-rc3):
    30% failed in 100 xfstests.
  After patch:
    0% failed in 300 xfstests.

It also happened in btrfs/071 as it's another scrub with IO load tests.
Reported-by: NQu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: NQu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: NZhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: NChris Mason <clm@fb.com>

55e3a601

btrfs: use scrub_pause_on/off() to reduce code in scrub_enumerate_chunks() · b708ce96

由 Zhaolei 提交于 8月 05, 2015

Use new intruduced scrub_pause_on/off() can make this code block
clean and more readable.
Signed-off-by: NZhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: NChris Mason <clm@fb.com>

b708ce96

btrfs: Separate scrub_blocked_if_needed() to scrub_pause_on/off() · 0e22be89

由 Zhaolei 提交于 8月 05, 2015

It can reduce current duplicated code which is similar to
scrub_blocked_if_needed() but can not call it because little
different.
It also used by my next patch which is in same case.
Signed-off-by: NZhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: NChris Mason <clm@fb.com>

0e22be89

btrfs: Use ref_cnt for set_block_group_ro() · 868f401a

由 Zhaolei 提交于 8月 05, 2015

More than one code call set_block_group_ro() and restore rw in fail.

Old code use bool bit to save blockgroup's ro state, it can not
support parallel case(it is confirmd exist in my debug log).

This patch use ref count to store ro state, and rename
set_block_group_ro/set_block_group_rw
to
inc_block_group_ro/dec_block_group_ro.
Signed-off-by: NZhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: NChris Mason <clm@fb.com>

868f401a

btrfs: Bypass unrelated items before accessing its contents in scrub · d7cad238

由 Zhao Lei 提交于 7月 22, 2015

When we access extent_root in scrub_stripe() and
scrub_raid56_parity(), we need bypass unrelated tree item firstly
before using its contents to do other condition.

It is not a bug fix, only making code sequence in logic.
Signed-off-by: NZhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: NChris Mason <clm@fb.com>

d7cad238

btrfs: Load only necessary csums into list in scrub · fe8cf654

由 Zhao Lei 提交于 7月 22, 2015

We need not load csum of whole strip in scrub because strip is trimed
before use, it is to say, what we really need to calculate csum is
data between [extent_logical, extent_len).

This patch changed to use above segment for btrfs_lookup_csums_range()
in scrub_stripe()
Signed-off-by: NZhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: NChris Mason <clm@fb.com>

fe8cf654

btrfs: Fix calculate typo caused by ambiguous meaning of logic_end · a0dd59de

由 Zhao Lei 提交于 7月 21, 2015

For example, in scrub_raid56_parity(), following lines are used
to judge is all data processed:
 place1: if (key.objectid > logic_end) ...
 place2: if (logic_start >= logic_end) ...
 ...
 (place2 is typo, is should be ">", it is copied from other
  place, where logic_end's meaning is different, long story...)

We can fix above typo directly, but the root reason is ambiguous
meaning of logic_end in scrub raid56 parity.

In other place, XXX_end is pointed to data which is not included,
and we need to process segment of [XXX_start, XXX_end).

But for scrub raid56 parity, logic_end is pointed to lattest data
need to process, and introduced many "+ 1" and "- 1" in code as
below:
 length = sparity->logic_end - sparity->logic_start + 1
 logic_end - logic_start + 1
 stripe_logical + increment - 1

This patch changed logic_end's meaning to make it in normal understanding
in raid56 parity functions and data struct alone with above bugfix.
Signed-off-by: NZhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: NChris Mason <clm@fb.com>

a0dd59de

btrfs: Free checksum list on scrub_extent() fail · 6fa96d72

由 Zhao Lei 提交于 7月 21, 2015

When scrub_extent() failed, we need to free previois created
checksum list.
Signed-off-by: NZhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: NChris Mason <clm@fb.com>

6fa96d72

btrfs: Check cancel and pause in interval of scrub operation · f2f66a2f

由 Zhao Lei 提交于 7月 21, 2015

Old code checking cancel and pause request inside scrub stripe
operation, like:
  loop() {
    if (parity) {
      scrub_parity_stripe();
      continue;
    }

    check_cancel_and_pause()

    scrub_normal_stripe();
  }

Reason is when introduce raid56 stripe scrub, new code is inserted
simplely to front of loop.

Better to:
  loop() {
    check_cancel_and_pause()

    if (parity)
      scrub_parity_stripe();
    else
      scrub_normal_stripe();
  }

This patch adjusted code place to realize above sequence.
Signed-off-by: NZhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: NChris Mason <clm@fb.com>

f2f66a2f

btrfs: Show detail information when mount failed on missing devices · 78fa1770

由 Zhao Lei 提交于 7月 20, 2015

When mount failed because missing device, we can see following
dmesg:
 [ 1060.267743] BTRFS: too many missing devices, writeable mount is not allowed
 [ 1060.273158] BTRFS: open_ctree failed

This patch add missing_device_number and tolerated_missing_device_number
to above output, to let user know what really happened, and helps
bug-report and debug.

dmesg after patch:
 [  127.050367] BTRFS: missing devices(1) exceeds the limit(0), writeable mount is not allowed
 [  127.056099] BTRFS: open_ctree failed

Changelog v1->v2:
1: Changed to more clear description, suggested-by:
   Anand Jain <anand.jain@oracle.com>
Suggested-by: NAnand Jain <anand.jain@oracle.com>
Signed-off-by: NZhao Lei <zhaolei@cn.fujitsu.com>
Reviewed-by: NAnand Jain <anand.jain@oracle.com>
Signed-off-by: NChris Mason <clm@fb.com>

78fa1770

btrfs: Fix scrub panic when leaf crosses stripes · a323e813

由 Zhao Lei 提交于 7月 23, 2015

Scrub panic in following operation:
  mkfs.ext4 /dev/vdh
  btrfs-convert /dev/vdh
  mount /dev/vdh /mnt/tmp1
  btrfs scrub start -B /dev/vdh
  (panic)

Reason:
  1: In some case, leaf created by btrfs-convert was splited into 2
     strips.
  2: Scrub bypassed part of above wrong leaf data, but remain data
     caused panic in scrub_checksum_tree_block().

For reason 1:
  we can get following information after some simple operation.
  a. mkfs.ext4 /dev/vdh
     btrfs-convert /dev/vdh
  b. btrfs-debug-tree /dev/vdh
     we can see following item in extent tree:
     item 25 key (27054080 METADATA_ITEM 0) itemoff 15083 itemsize 33
     Its logical address is [27054080, 27070464)
     and acrossed 2 strips:
     [27000832, 27066368)
     [27066368, 27131904)
  Will be fixed in btrfs-progs(btrfs-convert, btrfsck, ...)

For reason 2:
  Scrub is trying to do a "bypass" in this case, but the result is
  "panic", because current code lacks of some condition in bypass,
  and let some wrong leaf data escaped.

This patch fixed above scrub code.

Before patch:
  # btrfs scrub start -B /dev/vdh
  (panic)

After patch:
  # btrfs scrub start -B /dev/vdh
  scrub done for 353cec8f-da31-4a94-aa35-be72d997b06e
  ...
  # dmesg
  ...
  [   59.088697] BTRFS error (device vdh): scrub: tree block 27054080 spanning stripes, ignored. logical=27000832
  [   59.089929] BTRFS error (device vdh): scrub: tree block 27054080 spanning stripes, ignored. logical=27066368
  #
Reported-by: NChris Murphy <lists@colorremedies.com>
Signed-off-by: NZhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: NChris Mason <clm@fb.com>

a323e813

Btrfs: fix stale dir entries after removing a link and fsync · 18aa0922

由 Filipe Manana 提交于 8月 05, 2015

We have one more case where after a log tree is replayed we get
inconsistent metadata leading to stale directory entries, due to
some directories having entries pointing to some inode while the
inode does not have a matching BTRFS_INODE_[REF|EXTREF]_KEY item.

To trigger the problem we need to have a file with multiple hard links
belonging to different parent directories. Then if one of those hard
links is removed and we fsync the file using one of its other links
that belongs to a different parent directory, we end up not logging
the fact that the removed hard link doesn't exists anymore in the
parent directory.

Simple reproducer:

  seq=`basename $0`
  seqres=$RESULT_DIR/$seq
  echo "QA output created by $seq"
  tmp=/tmp/$$
  status=1	# failure is the default!
  trap "_cleanup; exit \$status" 0 1 2 3 15

  _cleanup()
  {
      _cleanup_flakey
      rm -f $tmp.*
  }

  # get standard environment, filters and checks
  . ./common/rc
  . ./common/filter
  . ./common/dmflakey

  # real QA test starts here
  _need_to_be_root
  _supported_fs generic
  _supported_os Linux
  _require_scratch
  _require_dm_flakey
  _require_metadata_journaling $SCRATCH_DEV

  rm -f $seqres.full

  _scratch_mkfs >>$seqres.full 2>&1
  _init_flakey
  _mount_flakey

  # Create our test directory and file.
  mkdir $SCRATCH_MNT/testdir
  touch $SCRATCH_MNT/foo
  ln $SCRATCH_MNT/foo $SCRATCH_MNT/testdir/foo2
  ln $SCRATCH_MNT/foo $SCRATCH_MNT/testdir/foo3

  # Make sure everything done so far is durably persisted.
  sync

  # Now we remove one of our file's hardlinks in the directory testdir.
  unlink $SCRATCH_MNT/testdir/foo3

  # We now fsync our file using the "foo" link, which has a parent that
  # is not the directory "testdir".
  $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/foo

  # Silently drop all writes and unmount to simulate a crash/power
  # failure.
  _load_flakey_table $FLAKEY_DROP_WRITES
  _unmount_flakey

  # Allow writes again, mount to trigger journal/log replay.
  _load_flakey_table $FLAKEY_ALLOW_WRITES
  _mount_flakey

  # After the journal/log is replayed we expect to not see the "foo3"
  # link anymore and we should be able to remove all names in the
  # directory "testdir" and then remove it (no stale directory entries
  # left after the journal/log replay).
  echo "Entries in testdir:"
  ls -1 $SCRATCH_MNT/testdir

  rm -f $SCRATCH_MNT/testdir/*
  rmdir $SCRATCH_MNT/testdir

  _unmount_flakey

  status=0
  exit

The test fails with:

  $ ./check generic/107
  FSTYP         -- btrfs
  PLATFORM      -- Linux/x86_64 debian3 4.1.0-rc6-btrfs-next-11+
  MKFS_OPTIONS  -- /dev/sdc
  MOUNT_OPTIONS -- /dev/sdc /home/fdmanana/btrfs-tests/scratch_1

  generic/107 3s ... - output mismatch (see .../results/generic/107.out.bad)
    --- tests/generic/107.out	2015-08-01 01:39:45.807462161 +0100
    +++ /home/fdmanana/git/hub/xfstests/results//generic/107.out.bad
    @@ -1,3 +1,5 @@
     QA output created by 107
     Entries in testdir:
     foo2
    +foo3
    +rmdir: failed to remove '/home/fdmanana/btrfs-tests/scratch_1/testdir': Directory not empty
    ...
    _check_btrfs_filesystem: filesystem on /dev/sdc is inconsistent \
      (see /home/fdmanana/git/hub/xfstests/results//generic/107.full)
    _check_dmesg: something found in dmesg (see .../results/generic/107.dmesg)
  Ran: generic/107
  Failures: generic/107
  Failed 1 of 1 tests

  $ cat /home/fdmanana/git/hub/xfstests/results//generic/107.full
  (...)
  checking fs roots
  root 5 inode 257 errors 200, dir isize wrong
	unresolved ref dir 257 index 3 namelen 4 name foo3 filetype 1 errors 5, no dir item, no inode ref
  (...)

And produces the following warning in dmesg:

  [127298.759064] BTRFS info (device dm-0): failed to delete reference to foo3, inode 258 parent 257
  [127298.762081] ------------[ cut here ]------------
  [127298.763311] WARNING: CPU: 10 PID: 7891 at fs/btrfs/inode.c:3956 __btrfs_unlink_inode+0x182/0x35a [btrfs]()
  [127298.767327] BTRFS: Transaction aborted (error -2)
  (...)
  [127298.788611] Call Trace:
  [127298.789137]  [<ffffffff8145f077>] dump_stack+0x4f/0x7b
  [127298.790090]  [<ffffffff81095de5>] ? console_unlock+0x356/0x3a2
  [127298.791157]  [<ffffffff8104b3b0>] warn_slowpath_common+0xa1/0xbb
  [127298.792323]  [<ffffffffa065ad09>] ? __btrfs_unlink_inode+0x182/0x35a [btrfs]
  [127298.793633]  [<ffffffff8104b410>] warn_slowpath_fmt+0x46/0x48
  [127298.794699]  [<ffffffffa065ad09>] __btrfs_unlink_inode+0x182/0x35a [btrfs]
  [127298.797640]  [<ffffffffa065be8f>] btrfs_unlink_inode+0x1e/0x40 [btrfs]
  [127298.798876]  [<ffffffffa065bf11>] btrfs_unlink+0x60/0x9b [btrfs]
  [127298.800154]  [<ffffffff8116fb48>] vfs_unlink+0x9c/0xed
  [127298.801303]  [<ffffffff81173481>] do_unlinkat+0x12b/0x1fb
  [127298.802450]  [<ffffffff81253855>] ? lockdep_sys_exit_thunk+0x12/0x14
  [127298.803797]  [<ffffffff81174056>] SyS_unlinkat+0x29/0x2b
  [127298.805017]  [<ffffffff81465197>] system_call_fastpath+0x12/0x6f
  [127298.806310] ---[ end trace bbfddacb7aaada7b ]---
  [127298.807325] BTRFS warning (device dm-0): __btrfs_unlink_inode:3956: Aborting unused transaction(No such entry).

So fix this by logging all parent inodes, current and old ones, to make
sure we do not get stale entries after log replay. This is not a simple
solution such as triggering a full transaction commit because it would
imply full transaction commit when an inode is fsynced in the same
transaction that modified it and reloaded it after eviction (because its
last_unlink_trans is set to the same value as its last_trans as of the
commit with the title "Btrfs: fix stale dir entries after unlink, inode
eviction and fsync"), and it would also make fstest generic/066 fail
since one of the fsyncs triggers a full commit and the next fsync will
not find the inode in the log anymore (therefore not removing the xattr).
Signed-off-by: NFilipe Manana <fdmanana@suse.com>
Signed-off-by: NChris Mason <clm@fb.com>

18aa0922

btrfs: fix search key advancing condition · dd81d459

由 Naohiro Aota 提交于 6月 30, 2015

The search key advancing condition used in copy_to_sk() is loose. It can
advance the key even if it reaches sk->max_*: e.g. when the max key = (512,
1024, -1) and the current key = (512, 1025, 10), it increments the
offset by 1, continues hopeless search from (512, 1025, 11). This issue
make ioctl() to take unexpectedly long time scanning all the leaf a blocks
one by one.

This commit fix the problem using standard way of key comparison:
btrfs_comp_cpu_keys()
Signed-off-by: NNaohiro Aota <naota@elisp.net>
Reviewed-by: NFilipe Manana <fdmanana@suse.com>
Signed-off-by: NChris Mason <clm@fb.com>

dd81d459

Btrfs: teach backref walking about backrefs with underflowed offset values · d6589101

由 Filipe Manana 提交于 7月 29, 2015

When cloning/deduplicating file extents (through the clone and extent_same
ioctls) we can get data back references with offset values that are a
result of an unsigned integer arithmetic underflow, that is, values that
are much larger then they could be otherwise.

This is not a problem when decrementing or dropping the back references
(happens when we overwrite the extents or punch a hole for example, through
__btrfs_drop_extents()), since we compute the same too large offset value,
but it is a problem for the backref walking code, used by an incremental
send and the ioctls that are used by the btrfs tool "inspect-internal"
commands, as it makes it miss the corresponding file extent items because
the search key is set for an extent item that starts at an offset matching
the exceptionally large offset value of the data back reference. For an
incremental send this causes the send ioctl to fail with -EIO.

So teach the backref walking code to deal with these cases by setting the
search key's offset to 0 if the backref's offset value is larger than
LLONG_MAX (the largest possible file offset). This makes sure the backref
walking code finds the corresponding file extent items at the expense of
scanning more items and leafs in the btree.

Fixing the clone/dedup ioctls to not produce such underflowed results would
require major changes breaking backward compatibility, updating user space
tools, etc.

Simple reproducer case for fstests:

  seq=`basename $0`
  seqres=$RESULT_DIR/$seq
  echo "QA output created by $seq"

  tmp=/tmp/$$
  status=1	# failure is the default!
  trap "_cleanup; exit \$status" 0 1 2 3 15

  _cleanup()
  {
      rm -fr $send_files_dir
      rm -f $tmp.*
  }

  # get standard environment, filters and checks
  . ./common/rc
  . ./common/filter

  # real QA test starts here
  _supported_fs btrfs
  _supported_os Linux
  _require_scratch
  _require_cloner
  _need_to_be_root

  send_files_dir=$TEST_DIR/btrfs-test-$seq

  rm -f $seqres.full
  rm -fr $send_files_dir
  mkdir $send_files_dir

  _scratch_mkfs >>$seqres.full 2>&1
  _scratch_mount

  # Create our test file with a single extent of 64K starting at file
  # offset 128K.
  $XFS_IO_PROG -f -c "pwrite -S 0xaa 128K 64K" $SCRATCH_MNT/foo \
      | _filter_xfs_io

  _run_btrfs_util_prog subvolume snapshot -r $SCRATCH_MNT \
      $SCRATCH_MNT/mysnap1

  # Now clone parts of the original extent into lower offsets of the file.
  #
  # The first clone operation adds a file extent item to file offset 0
  # that points to our initial extent with a data offset of 16K. The
  # corresponding data back reference in the extent tree has an offset of
  # 18446744073709535232, which is the result of file_offset - data_offset
  # = 0 - 16K.
  #
  # The second clone operation adds a file extent item to file offset 16K
  # that points to our initial extent with a data offset of 48K. The
  # corresponding data back reference in the extent tree has an offset of
  # 18446744073709518848, which is the result of file_offset - data_offset
  # = 16K - 48K.
  #
  # Those large back reference offsets (result of unsigned arithmetic
  # underflow) confused the back reference walking code (used by an
  # incremental send and the multiple inspect-internal ioctls) and made it
  # miss the back references, which for the case of an incremental send it
  # made it fail with -EIO and print a message like the following to
  # dmesg:
  #
  # "BTRFS error (device sdc): did not find backref in send_root. \
  #  inode=257, offset=0, disk_byte=12845056 found extent=12845056"
  #
  $CLONER_PROG -s $(((128 + 16) * 1024)) -d 0 -l $((16 * 1024)) \
      $SCRATCH_MNT/foo $SCRATCH_MNT/foo
  $CLONER_PROG -s $(((128 + 48) * 1024)) -d $((16 * 1024)) \
      -l $((16 * 1024)) $SCRATCH_MNT/foo $SCRATCH_MNT/foo

  _run_btrfs_util_prog subvolume snapshot -r $SCRATCH_MNT \
      $SCRATCH_MNT/mysnap2

  _run_btrfs_util_prog send $SCRATCH_MNT/mysnap1 -f $send_files_dir/1.snap
  _run_btrfs_util_prog send -p $SCRATCH_MNT/mysnap1 $SCRATCH_MNT/mysnap2 \
      -f $send_files_dir/2.snap

  echo "File digest in the original filesystem:"
  md5sum $SCRATCH_MNT/mysnap2/foo | _filter_scratch

  # Now recreate the filesystem by receiving both send streams and verify
  # we get the same file contents that the original filesystem had.
  _scratch_unmount
  _scratch_mkfs >>$seqres.full 2>&1
  _scratch_mount

  _run_btrfs_util_prog receive $SCRATCH_MNT -f $send_files_dir/1.snap
  _run_btrfs_util_prog receive $SCRATCH_MNT -f $send_files_dir/2.snap

  echo "File digest in the new filesystem:"
  md5sum $SCRATCH_MNT/mysnap2/foo | _filter_scratch

  status=0
  exit

The test's expected golden output is:

  wrote 65536/65536 bytes at offset 131072
  XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
  File digest in the original filesystem:
  6c6079335cff141b8a31233ead04cbff  SCRATCH_MNT/mysnap2/foo
  File digest in the new filesystem:
  6c6079335cff141b8a31233ead04cbff  SCRATCH_MNT/mysnap2/foo

But it failed with:

    (...)
    @@ -1,7 +1,5 @@
     QA output created by 097
     wrote 65536/65536 bytes at offset 131072
     XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
    -File digest in the original filesystem:
    -6c6079335cff141b8a31233ead04cbff  SCRATCH_MNT/mysnap2/foo
    -File digest in the new filesystem:
    -6c6079335cff141b8a31233ead04cbff  SCRATCH_MNT/mysnap2/foo
    ...

  $ cat /home/fdmanana/git/hub/xfstests/results//btrfs/097.full
  (...)
  ERROR: send ioctl failed with -5: Input/output error
Signed-off-by: NFilipe Manana <fdmanana@suse.com>
Signed-off-by: NChris Mason <clm@fb.com>

d6589101

Btrfs: fix stale dir entries after unlink, inode eviction and fsync · bde6c242

由 Filipe Manana 提交于 7月 24, 2015

If we remove a hard link from an inode, the inode gets evicted, then
we fsync the inode and then power fail/crash, when the log tree is
replayed, the parent directory inode still has entries pointing to
the name that no longer exists, while our inode no longer has the
BTRFS_INODE_REF_KEY item matching the deleted hard link (as expected),
leaving the filesystem in an inconsistent state. The stale directory
entries can not be deleted (an attempt to delete them causes -ESTALE
errors), which makes it impossible to delete the parent directory.

This happens because we track the id of the transaction where the last
unlink operation for the inode happened (last_unlink_trans) in an
in-memory only field of the inode, that is, a value that is never
persisted in the inode item stored on the fs/subvol btree. So if an
inode is evicted and loaded again, the value for last_unlink_trans is
set to 0, which prevents the fsync from logging the parent directory
at btrfs_log_inode_parent(). So fix this by setting last_unlink_trans
to the id of the transaction that last modified the inode when we
load the inode. This is a pessimistic approach but it always ensures
correctness with the trade off of ocassional full transaction commits
when an fsync is done against the inode in the same transaction where
it was evicted and reloaded when our inode is a directory and often
logging its parent unnecessarily when our inode is not a directory.

The following test case for fstests triggers the problem:

  seq=`basename $0`
  seqres=$RESULT_DIR/$seq
  echo "QA output created by $seq"
  tmp=/tmp/$$
  status=1	# failure is the default!
  trap "_cleanup; exit \$status" 0 1 2 3 15

  _cleanup()
  {
      _cleanup_flakey
      rm -f $tmp.*
  }

  # get standard environment, filters and checks
  . ./common/rc
  . ./common/filter
  . ./common/dmflakey

  # real QA test starts here
  _need_to_be_root
  _supported_fs generic
  _supported_os Linux
  _require_scratch
  _require_dm_flakey
  _require_metadata_journaling $SCRATCH_DEV

  rm -f $seqres.full

  _scratch_mkfs >>$seqres.full 2>&1
  _init_flakey
  _mount_flakey

  # Create our test file with 2 hard links.
  mkdir $SCRATCH_MNT/testdir
  touch $SCRATCH_MNT/testdir/foo
  ln $SCRATCH_MNT/testdir/foo $SCRATCH_MNT/testdir/bar

  # Make sure everything done so far is durably persisted.
  sync

  # Now remove one of the links, trigger inode eviction and then fsync
  # our inode.
  unlink $SCRATCH_MNT/testdir/bar
  echo 2 > /proc/sys/vm/drop_caches
  $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/testdir/foo

  # Silently drop all writes on our scratch device to simulate a power failure.
  _load_flakey_table $FLAKEY_DROP_WRITES
  _unmount_flakey

  # Allow writes again and mount the fs to trigger log/journal replay.
  _load_flakey_table $FLAKEY_ALLOW_WRITES
  _mount_flakey

  # Now verify our directory entries.
  echo "Entries in testdir:"
  ls -1 $SCRATCH_MNT/testdir

  # If we remove our inode, its parent should become empty and therefore we should
  # be able to remove the parent.
  rm -f $SCRATCH_MNT/testdir/*
  rmdir $SCRATCH_MNT/testdir

  _unmount_flakey

  # The fstests framework will call fsck against our filesystem which will verify
  # that all metadata is in a consistent state.

  status=0
  exit

The test failed on btrfs with:

  generic/098 4s ... - output mismatch (see /home/fdmanana/git/hub/xfstests/results//generic/098.out.bad)
    --- tests/generic/098.out	2015-07-23 18:01:12.616175932 +0100
    +++ /home/fdmanana/git/hub/xfstests/results//generic/098.out.bad	2015-07-23 18:04:58.924138308 +0100
    @@ -1,3 +1,6 @@
     QA output created by 098
     Entries in testdir:
    +bar
     foo
    +rm: cannot remove '/home/fdmanana/btrfs-tests/scratch_1/testdir/foo': Stale file handle
    +rmdir: failed to remove '/home/fdmanana/btrfs-tests/scratch_1/testdir': Directory not empty
    ...
    (Run 'diff -u tests/generic/098.out /home/fdmanana/git/hub/xfstests/results//generic/098.out.bad'  to see the entire diff)
  _check_btrfs_filesystem: filesystem on /dev/sdc is inconsistent (see /home/fdmanana/git/hub/xfstests/results//generic/098.full)

  $ cat /home/fdmanana/git/hub/xfstests/results//generic/098.full
  (...)
  checking fs roots
  root 5 inode 258 errors 2001, no inode item, link count wrong
     unresolved ref dir 257 index 0 namelen 3 name foo filetype 1 errors 6, no dir index, no inode ref
     unresolved ref dir 257 index 3 namelen 3 name bar filetype 1 errors 5, no dir item, no inode ref
  Checking filesystem on /dev/sdc
  (...)
Signed-off-by: NFilipe Manana <fdmanana@suse.com>
Signed-off-by: NChris Mason <clm@fb.com>

bde6c242

Btrfs: fix stale directory entries after fsync log replay · bb53eda9

由 Filipe Manana 提交于 7月 15, 2015

We have another case where after an fsync log replay we get an inode with
a wrong link count (smaller than it should be) and a number of directory
entries greater than its link count. This happens when we add a new link
hard link to our inode A and then we fsync some other inode B that has
the side effect of logging the parent directory inode too. In this case
at log replay time we add the new hard link to our inode (the item with
key BTRFS_INODE_REF_KEY) when processing the parent directory but we
never adjust the link count of our inode A. As a result we get stale dir
entries for our inode A that can never be deleted and therefore it makes
it impossible to remove the parent directory (as its i_size can never
decrease back to 0).

A simple reproducer for fstests that triggers this issue:

  seq=`basename $0`
  seqres=$RESULT_DIR/$seq
  echo "QA output created by $seq"
  tmp=/tmp/$$
  status=1	# failure is the default!
  trap "_cleanup; exit \$status" 0 1 2 3 15

  _cleanup()
  {
      _cleanup_flakey
      rm -f $tmp.*
  }

  # get standard environment, filters and checks
  . ./common/rc
  . ./common/filter
  . ./common/dmflakey

  # real QA test starts here
  _need_to_be_root
  _supported_fs generic
  _supported_os Linux
  _require_scratch
  _require_dm_flakey
  _require_metadata_journaling $SCRATCH_DEV

  rm -f $seqres.full

  _scratch_mkfs >>$seqres.full 2>&1
  _init_flakey
  _mount_flakey

  # Create our test directory and files.
  mkdir $SCRATCH_MNT/testdir
  touch $SCRATCH_MNT/testdir/foo
  touch $SCRATCH_MNT/testdir/bar

  # Make sure everything done so far is durably persisted.
  sync

  # Create one hard link for file foo and another one for file bar. After
  # that fsync only the file bar.
  ln $SCRATCH_MNT/testdir/bar $SCRATCH_MNT/testdir/bar_link
  ln $SCRATCH_MNT/testdir/foo $SCRATCH_MNT/testdir/foo_link
  $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/testdir/bar

  # Silently drop all writes on scratch device to simulate power failure.
  _load_flakey_table $FLAKEY_DROP_WRITES
  _unmount_flakey

  # Allow writes again and mount the fs to trigger log/journal replay.
  _load_flakey_table $FLAKEY_ALLOW_WRITES
  _mount_flakey

  # Now verify both our files have a link count of 2.
  echo "Link count for file foo: $(stat --format=%h $SCRATCH_MNT/testdir/foo)"
  echo "Link count for file bar: $(stat --format=%h $SCRATCH_MNT/testdir/bar)"

  # We should be able to remove all the links of our files in testdir, and
  # after that the parent directory should become empty and therefore
  # possible to remove it.
  rm -f $SCRATCH_MNT/testdir/*
  rmdir $SCRATCH_MNT/testdir

  _unmount_flakey

  # The fstests framework will call fsck against our filesystem which will verify
  # that all metadata is in a consistent state.

  status=0
  exit

The test fails with:

 -Link count for file foo: 2
 +Link count for file foo: 1
  Link count for file bar: 2
 +rm: cannot remove '/home/fdmanana/btrfs-tests/scratch_1/testdir/foo_link': Stale file handle
 +rmdir: failed to remove '/home/fdmanana/btrfs-tests/scratch_1/testdir': Directory not empty
 (...)
 _check_btrfs_filesystem: filesystem on /dev/sdc is inconsistent

And fsck's output:

  (...)
  checking fs roots
  root 5 inode 258 errors 2001, no inode item, link count wrong
      unresolved ref dir 257 index 5 namelen 8 name foo_link filetype 1 errors 4, no inode ref
  Checking filesystem on /dev/sdc
  (...)

So fix this by marking inodes for link count fixup at log replay time
whenever a directory entry is replayed if the entry was created in the
transaction where the fsync was made and if it points to a non-directory
inode.

This isn't a new problem/regression, the issue exists for a long time,
possibly since the log tree feature was added (2008).
Signed-off-by: NFilipe Manana <fdmanana@suse.com>
Signed-off-by: NChris Mason <clm@fb.com>

bb53eda9

03 8月, 2015 5 次提交

L

Linux 4.2-rc5 · 74d33293
由 Linus Torvalds 提交于 8月 02, 2015

74d33293

Merge tag 'powerpc-4.2-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · d08c3181

由 Linus Torvalds 提交于 8月 02, 2015

Pull powerpc fixes from Michael Ellerman:
 - TCE table memory calculation fix from Alexey
 - Build fix for ans-lcd from Luis
 - Unbalanced IRQ warning fix from Alistair

* tag 'powerpc-4.2-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/eeh-powernv: Fix unbalanced IRQ warning
  macintosh/ans-lcd: fix build failure after module_init/exit relocation
  powerpc/powernv/ioda2: Fix calculation for memory allocated for TCE table

d08c3181

i915: temporary fix for DP MST docking station NULL pointer dereference · 27667f47

由 Linus Torvalds 提交于 7月 29, 2015

Ted Ts'o reports that his Lenovo T540p ThinkPad crashes at boot if
attached to the docking station.  This is a regression that he was able
to bisect to commit 8c7b5ccb: "drm/i915: Use atomic helpers for
computing changed flags:"

The reason seems to be the new call to drm_atomic_helper_check_modeset()
added to intel_modeset_compute_config(), which in turn calls
update_connector_routing(), and somehow ends up picking a NULL crtc for
the connector state, causing the subsequent drm_crtc_index() to OOPS.

Daniel Vetter says that the fundamental issue seems to be confusion in
the encoder selection, and this isn't the right fix, but while he chases
down the proper fix, this at least avoids the NULL pointer dereference
and makes Ted's docking station work again.
Reported-bisected-and-tested-by: NTheodore Ts'o <tytso@mit.edu>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Mani Nikula <jani.nikula@linux.intel.com>
Cc: Dave Airlie <airlied@gmail.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

27667f47

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · d4edea40

由 Linus Torvalds 提交于 8月 02, 2015

Pull SCSI fixes from James Bottomley:
 "A set of three fixes for the ipr driver and one fairly major one for
  memory leaks in the mq path of SCSI"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: fix memory leak with scsi-mq
  ipr: Fix invalid array indexing for HRRQ
  ipr: Fix incorrect trace indexing
  ipr: Fix locking for unit attention handling

d4edea40

Merge tag 'armsoc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc · 30c7b56d

由 Linus Torvalds 提交于 8月 02, 2015

Pull ARM SoC fixes from Olof Johansson:
 "Things are calming down nicely here w.r.t. fixes.  This batch
  includes two week's worth since I missed to send before -rc4.

  Nothing particularly scary to point out, smaller fixes here and there.
  Shortlog describes it pretty well"

* tag 'armsoc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  ARM: dts: keystone: fix dt bindings to use post div register for mainpll
  ARM: nomadik: disable UART0 on Nomadik boards
  ARM: dts: i.MX35: Fix can support.
  ARM: OMAP2+: hwmod: Fix _wait_target_ready() for hwmods without sysc
  ARM: dts: add CPU OPP and regulator supply property for exynos4210
  ARM: dts: Update video-phy node with syscon phandle for exynos3250
  ARM: DRA7: hwmod: fix gpmc hwmod

30c7b56d

02 8月, 2015 5 次提交

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 01183609

由 Linus Torvalds 提交于 8月 01, 2015

Pull VFS fix from Al Viro:
 "Spurious ENOTDIR fix"

This should fix the problems reported by Dominique Martinet and Hugh
Dickins.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  link_path_walk(): be careful when failing with ENOTDIR

01183609

link_path_walk(): be careful when failing with ENOTDIR · 97242f99

由 Al Viro 提交于 8月 01, 2015

In RCU mode we might end up with dentry evicted just we check
that it's a directory.  In such case we should return ECHILD
rather than ENOTDIR, so that pathwalk would be retries in non-RCU
mode.

Breakage had been introduced in commit b18825a7 - prior to that
we were looking at nd->inode, which had been fetched before
verifying that ->d_seq was still valid.  That form of check
would only be satisfied if at some point the pathname prefix
would indeed have resolved to a non-directory.  The fix consists
of checking ->d_seq after we'd run into a non-directory dentry,
and failing with ECHILD in case of mismatch.

Note that all branches since 3.12 have that problem...
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

97242f99

Merge tag 'dmaengine-fix-4.2-rc5' of git://git.infradead.org/users/vkoul/slave-dma · 3f6d9e08

由 Linus Torvalds 提交于 8月 01, 2015

Pull dmaengine fixes from Vinod Koul:
 "We had a regression due to reuse of descriptor so we have reverted
  that.

  The rest are driver fixes:

   - at_hdmac and at_xdmac for residue, trannfer width, and channel config
   - pl330 final fix for dma fails and overflow issue
   - xgene resouce map fix
   - mv_xor big endian op fix"

* tag 'dmaengine-fix-4.2-rc5' of git://git.infradead.org/users/vkoul/slave-dma:
  Revert "dmaengine: virt-dma: don't always free descriptor upon completion"
  dmaengine: mv_xor: fix big endian operation in register mode
  dmaengine: xgene-dma: Fix the resource map to handle overlapping
  dmaengine: at_xdmac: fix transfer data width in at_xdmac_prep_slave_sg()
  dmaengine: at_hdmac: fix residue computation
  dmaengine: at_xdmac: fix bug about channel configuration
  dmaengine: pl330: Really fix choppy sound because of wrong residue calculation
  dmaengine: pl330: Fix overflow when reporting residue in memcpy

3f6d9e08

Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 3270c8ea

由 Linus Torvalds 提交于 8月 01, 2015

Pull irq fixlets from Thomas Gleixner:
 "Just two updates to the maintainers file"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  MAINTAINERS: Appoint Jiang and Marc as irqdomain maintainers
  MAINTAINERS: Appoint Marc Zyngier as irqchips co-maintainer

3270c8ea

Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 51d2e09b

由 Linus Torvalds 提交于 8月 01, 2015

Pull x86 fixes from Ingo Molnar:
 "Fallout from the recent NMI fixes: make x86 LDT handling more robust.

  Also some EFI fixes"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/ldt: Make modify_ldt synchronous
  x86/xen: Probe target addresses in set_aliased_prot() before the hypercall
  x86/irq: Use the caller provided polarity setting in mp_check_pin_attr()
  efi: Check for NULL efi kernel parameters
  x86/efi: Use all 64 bit of efi_memmap in setup_e820()

51d2e09b

01 8月, 2015 11 次提交

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 7c764cec

由 Linus Torvalds 提交于 7月 31, 2015

Pull networking fixes from David Miller:

 1) Must teardown SR-IOV before unregistering netdev in igb driver, from
    Alex Williamson.

 2) Fix ipv6 route unreachable crash in IPVS, from Alex Gartrell.

 3) Default route selection in ipv4 should take the prefix length, table
    ID, and TOS into account, from Julian Anastasov.

 4) sch_plug must have a reset method in order to purge all buffered
    packets when the qdisc is reset, likewise for sch_choke, from WANG
    Cong.

 5) Fix deadlock and races in slave_changelink/br_setport in bridging.
    From Nikolay Aleksandrov.

 6) mlx4 bug fixes (wrong index in port even propagation to VFs,
    overzealous BUG_ON assertion, etc.) from Ido Shamay, Jack
    Morgenstein, and Or Gerlitz.

 7) Turn off klog message about SCTP userspace interface compat that
    makes no sense at all, from Daniel Borkmann.

 8) Fix unbounded restarts of inet frag eviction process, causing NMI
    watchdog soft lockup messages, from Florian Westphal.

 9) Suspend/resume fixes for r8152 from Hayes Wang.

10) Fix busy loop when MSG_WAITALL|MSG_PEEK is used in TCP recv, from
    Sabrina Dubroca.

11) Fix performance regression when removing a lot of routes from the
    ipv4 routing tables, from Alexander Duyck.

12) Fix device leak in AF_PACKET, from Lars Westerhoff.

13) AF_PACKET also has a header length comparison bug due to signedness,
    from Alexander Drozdov.

14) Fix bug in EBPF tail call generation on x86, from Daniel Borkmann.

15) Memory leaks, TSO stats, watchdog timeout and other fixes to
    thunderx driver from Sunil Goutham and Thanneeru Srinivasulu.

16) act_bpf can leak memory when replacing programs, from Daniel
    Borkmann.

17) WOL packet fixes in gianfar driver, from Claudiu Manoil.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (79 commits)
  stmmac: fix missing MODULE_LICENSE in stmmac_platform
  gianfar: Enable device wakeup when appropriate
  gianfar: Fix suspend/resume for wol magic packet
  gianfar: Fix warning when CONFIG_PM off
  act_pedit: check binding before calling tcf_hash_release()
  net: sk_clone_lock() should only do get_net() if the parent is not a kernel socket
  net: sched: fix refcount imbalance in actions
  r8152: reset device when tx timeout
  r8152: add pre_reset and post_reset
  qlcnic: Fix corruption while copying
  act_bpf: fix memory leaks when replacing bpf programs
  net: thunderx: Fix for crash while BGX teardown
  net: thunderx: Add PCI driver shutdown routine
  net: thunderx: Fix crash when changing rss with mutliple traffic flows
  net: thunderx: Set watchdog timeout value
  net: thunderx: Wakeup TXQ only if CQE_TX are processed
  net: thunderx: Suppress alloc_pages() failure warnings
  net: thunderx: Fix TSO packet statistic
  net: thunderx: Fix memory leak when changing queue count
  net: thunderx: Fix RQ_DROP miscalculation
  ...

7c764cec

Merge branch 'for-linus-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs · acea568f

由 Linus Torvalds 提交于 7月 31, 2015

Pull btrfs fixes from Chris Mason:
 "Filipe fixed up a hard to trigger ENOSPC regression from our merge
  window pull, and we have a few other smaller fixes"

* 'for-linus-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
  Btrfs: fix quick exhaustion of the system array in the superblock
  btrfs: its btrfs_err() instead of btrfs_error()
  btrfs: Avoid NULL pointer dereference of free_extent_buffer when read_tree_block() fail
  btrfs: Fix lockdep warning of btrfs_run_delayed_iputs()

acea568f

Merge tag 'sound-4.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · c6fd4fc7

由 Linus Torvalds 提交于 7月 31, 2015

Pull sound fixes from Takashi Iwai:
 "This became a relative big update as it includes the collected ASoC
  fixes.  There are a few fixes in ASoC core side, mostly for DAPM and
  the new topology API.  The rest are various ASoC driver-specific
  fixes, as well as the usual HD-audio and USB-audio quirks"

* tag 'sound-4.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (29 commits)
  ALSA: hda - Fix MacBook Pro 5,2 quirk
  ALSA: hda - Fix race between PM ops and HDA init/probe
  ALSA: usb-audio: add dB range mapping for some devices
  ALSA: hda - Apply a fixup to Dell Vostro 5480
  ALSA: hda - Add pin quirk for the headset mic jack detection on Dell laptop
  ALSA: hda - Apply fixup for another Toshiba Satellite S50D
  ALSA: fireworks: add support for AudioFire2 quirk
  ALSA: hda - Fix the headset mic that will not work on Dell desktop machine
  ALSA: hda - fix cs4210_spdif_automute()
  ASoC: pcm1681: Fix setting de-emphasis sampling rate selection
  ASoC: ssm4567: Keep TDM_BCLKS in ssm4567_set_dai_fmt
  ASoC: sgtl5000: Fix up define for SGTL5000_SMALL_POP
  ASoC: dapm: Don't add prefix to widget stream name
  ASoC: rt5645: Check if codec is initialized in workqueue handler
  ASoC: Intel: Get correct usage_count value to load firmware
  ASoC: topology: Fix to add dapm mixer info
  ASoC: zx: spdif: Fix devm_ioremap_resource return value check
  ASoC: zx: i2s: Fix devm_ioremap_resource return value check
  ASoC: mediatek: Use platform_of_node for machine drivers
  ASoC: Free card DAPM context on snd_soc_instantiate_card() error path
  ...

c6fd4fc7

stmmac: fix missing MODULE_LICENSE in stmmac_platform · ea111545

由 Joachim Eastwood 提交于 7月 31, 2015

Commit 50649ab1 ("stmmac: drop driver from stmmac platform code")
was a bit overzealous in removing code and dropped the MODULE_*
macro's that are still needed since stmmac_platform can be a module.
Fix this by putting the macro's remvoed in 50649ab1 back.

This fixes the following errors when used as a module:
  stmmac_platform: module license 'unspecified' taints kernel.
  Disabling lock debugging due to kernel taint
  stmmac_platform: Unknown symbol devm_kmalloc (err 0)
  stmmac_platform: Unknown symbol stmmac_suspend (err 0)
  stmmac_platform: Unknown symbol platform_get_irq_byname (err 0)
  stmmac_platform: Unknown symbol stmmac_dvr_remove (err 0)
  stmmac_platform: Unknown symbol platform_get_resource (err 0)
  stmmac_platform: Unknown symbol of_get_phy_mode (err 0)
  stmmac_platform: Unknown symbol of_property_read_u32_array (err 0)
  stmmac_platform: Unknown symbol of_alias_get_id (err 0)
  stmmac_platform: Unknown symbol stmmac_resume (err 0)
  stmmac_platform: Unknown symbol stmmac_dvr_probe (err 0)

Fixes: 50649ab1 ("stmmac: drop driver from stmmac platform code")
Reported-by: NIgor Gnatenko <i.gnatenko.brain@gmail.com>
Signed-off-by: NJoachim Eastwood <manabian@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ea111545

Merge branch 'gianfar-wol-fixes' · ef1f4364

由 David S. Miller 提交于 7月 31, 2015

Claudiu Manoil says:

====================
gianfar: wol magic packet fixes

These changes were already validated as part of FSL SDK.
Patch 2 fixes occasional wake-on magic packet failures during
traffic, probably due to incorrect traffic stop/ device halt
sequence and incorrect usage of txlock.
====================
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ef1f4364

gianfar: Enable device wakeup when appropriate · b0734b6d

由 Claudiu Manoil 提交于 7月 31, 2015

The wol_en flag is 0 by default anyway, and we have the
following inconsistency: a MAGIC packet wol capable eth
interface is registered as a wake-up source but unable
to wake-up the system as wol_en is 0 (wake-on flag set to 'd').
Calling set_wakeup_enable() at netdev open is just redundant
because wol_en is 0 by default.
Let only ethtool call set_wakeup_enable() for now.

The bflock is obviously obsoleted, its utility has been corroded
over time.  The bitfield flags used today in gianfar are accessed
only on the init/ config path, with no real possibility of
concurrency - nothing that would justify smth. like bflock.
Signed-off-by: NClaudiu Manoil <claudiu.manoil@freescale.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b0734b6d

gianfar: Fix suspend/resume for wol magic packet · 614b4242

由 Claudiu Manoil 提交于 7月 31, 2015

If we disable NAPI in the first place we can mask the device's
interrupts (and halt it) without fearing that imask may be
concurrently accessed from interrupt context, so there's
no need to do local_irq_save() around gfar_halt_nodisable().
lock_rx_qs()/unlock_tx_qs() are just obsoleted and potentially
buggy routines.  The txlock is currently used in the driver only
to manage TX congestion, it has nothing to do with halting the
device.  With these changes, the TX processing is stopped before
gfar_halt().

Compact gfar_halt() is used instead of gfar_halt_nodisable(),
as it disables Rx/TX DMA h/w blocks and the Rx/TX h/w queues.
gfar_start() re-enables all these blocks on resume.  Enabling
the magic-packet mode remains the same, note that the RX block
is re-enabled just before entering sleep mode.

Add IRQF_NO_SUSPEND flag for the error interrupt line, to signal
that the interrupt line must remain active during sleep in order
to wake the system by magic packet (MAG) reception interrupt.
(On some systems the MAG interrupt did trigger w/o this flag
as well, but on others it didn't.)

Without these fixes, when suspended during fair Tx traffic the
interface occasionally failed to be woken up by magic packet.
Signed-off-by: NClaudiu Manoil <claudiu.manoil@freescale.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

614b4242

gianfar: Fix warning when CONFIG_PM off · 84868305

由 Claudiu Manoil 提交于 7月 31, 2015

CC      drivers/net/ethernet/freescale/gianfar.o
drivers/net/ethernet/freescale/gianfar.c:568:13: warning: 'lock_tx_qs'
defined but not used [-Wunused-function]
 static void lock_tx_qs(struct gfar_private *priv)
             ^
drivers/net/ethernet/freescale/gianfar.c:576:13: warning: 'unlock_tx_qs'
defined but not used [-Wunused-function]
 static void unlock_tx_qs(struct gfar_private *priv)
             ^
Reported-by: NScott Wood <scottwood@freescale.com>
Signed-off-by: NClaudiu Manoil <claudiu.manoil@freescale.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

84868305

act_pedit: check binding before calling tcf_hash_release() · 5175f710

由 WANG Cong 提交于 7月 30, 2015

When we share an action within a filter, the bind refcnt
should increase, therefore we should not call tcf_hash_release().

Fixes: 1a29321e ("net_sched: act: Dont increment refcnt on replace")
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: NCong Wang <cwang@twopensource.com>
Acked-by: NDaniel Borkmann <daniel@iogearbox.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5175f710

ARM: dts: keystone: fix dt bindings to use post div register for mainpll · c1bfa985

由 Murali Karicheri 提交于 5月 29, 2015

All of the keystone devices have a separate register to hold post
divider value for main pll clock. Currently the fixed-postdiv
value used for k2hk/l/e SoCs works by sheer luck as u-boot happens to
use a value of 2 for this. Now that we have fixed this in the pll
clock driver change the dt bindings for the same.
Signed-off-by: NMurali Karicheri <m-karicheri2@ti.com>
Acked-by: NSantosh Shilimkar <ssantosh@kernel.org>
Signed-off-by: NOlof Johansson <olof@lixom.net>

c1bfa985

Merge tag 'iommu-fixes-v4.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu · 5e49e0be

由 Linus Torvalds 提交于 7月 31, 2015

Pull IOMMU fixes from Joerg Roedel:
 "These fixes are all for the AMD IOMMU driver:

   - A regression with HSA caused by the conversion of the driver to
     default domains.  The fixes make sure that an HSA device can still
     be attached to an IOMMUv2 domain and that these domains also allow
     non-IOMMUv2 capable devices.

   - Fix iommu=pt mode which did not work because the dma_ops where set
     to nommu_ops, which breaks devices that can only do 32bit DMA.

   - Fix an issue with non-PCI devices not working, because there are no
     dma_ops for them.  This issue was discovered recently as new AMD
     x86 platforms have non-PCI devices too"

* tag 'iommu-fixes-v4.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  iommu/amd: Allow non-ATS devices in IOMMUv2 domains
  iommu/amd: Set global dma_ops if swiotlb is disabled
  iommu/amd: Use swiotlb in passthrough mode
  iommu/amd: Allow non-IOMMUv2 devices in IOMMUv2 domains
  iommu/amd: Use iommu core for passthrough mode
  iommu/amd: Use iommu_attach_group()

5e49e0be

openeuler / Kernel 大约 1 年 前同步成功

openeuler / Kernel
大约 1 年前同步成功