1. 05 May, 2016 10 commits
  2. 30 Apr, 2016 1 commit
  3. 26 Apr, 2016 1 commit
  4. 25 Apr, 2016 1 commit
  5. 17 Apr, 2016 1 commit
  6. 15 Apr, 2016 2 commits
  7. 11 Apr, 2016 1 commit
  8. 05 Apr, 2016 1 commit
    • mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros · 09cbfeaf
      Committed by Kirill A. Shutemov
      The PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced a *long*
      time ago with the promise that one day it would be possible to implement
      the page cache with chunks bigger than PAGE_SIZE.
      
      This promise never materialized, and it is unlikely that it ever will.
      
      There are many places where PAGE_CACHE_SIZE is assumed to be equal to
      PAGE_SIZE, and it is a constant source of confusion whether a
      PAGE_CACHE_* or a PAGE_* constant should be used in a particular case,
      especially on the border between fs and mm.
      
      Globally switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause too much
      breakage to be doable.
      
      Let's stop pretending that pages in page cache are special.  They are
      not.
      
      The changes are pretty straight-forward:
      
       - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
      
       - page_cache_get() -> get_page();
      
       - page_cache_release() -> put_page();
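
The substitutions listed above can be illustrated with a small user-space sketch. It is not kernel code: the 4 KiB page size and the helper names `index_old()`, `index_new()` and `offset_to_index()` are assumptions made for illustration only. It shows why the shift by `(PAGE_CACHE_SHIFT - PAGE_SHIFT)` was a no-op once both macros had the same value.

```c
#include <assert.h>

/* User-space stand-ins; in the kernel these come from architecture headers.
 * A 4 KiB page is assumed here purely for illustration. */
#define PAGE_SHIFT 12
#define PAGE_SIZE  (1UL << PAGE_SHIFT)
#define PAGE_MASK  (~(PAGE_SIZE - 1))

/* The old alias, defined to the same value as it was in practice. */
#define PAGE_CACHE_SHIFT PAGE_SHIFT

static_assert(PAGE_CACHE_SHIFT == PAGE_SHIFT,
	      "the removed macros were plain aliases");

/* Before: an index was "converted" between page-cache and page units. */
static unsigned long index_old(unsigned long index)
{
	return index << (PAGE_CACHE_SHIFT - PAGE_SHIFT); /* shift by zero */
}

/* After: the conversion is simply dropped. */
static unsigned long index_new(unsigned long index)
{
	return index;
}

/* File offset to page index, written with PAGE_SHIFT only. */
static unsigned long offset_to_index(unsigned long long pos)
{
	return (unsigned long)(pos >> PAGE_SHIFT);
}
```

Under these assumptions `index_old(i)` and `index_new(i)` agree for every `i`, which is what makes the mechanical rewrite safe.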
      
      This patch contains automated changes generated with coccinelle using
      the script below.  For some reason, coccinelle doesn't patch header
      files, so I called spatch on them manually.
      
      The only adjustment after coccinelle is a revert of the changes to the
      PAGE_CACHE_ALIGN definition: we are going to drop it later.
      
      There are a few places in the code that coccinelle didn't reach.  I'll
      fix them manually in a separate patch.  Comments and documentation will
      also be addressed in a separate patch.
      
      virtual patch
      
      @@
      expression E;
      @@
      - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      expression E;
      @@
      - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      @@
      - PAGE_CACHE_SHIFT
      + PAGE_SHIFT
      
      @@
      @@
      - PAGE_CACHE_SIZE
      + PAGE_SIZE
      
      @@
      @@
      - PAGE_CACHE_MASK
      + PAGE_MASK
      
      @@
      expression E;
      @@
      - PAGE_CACHE_ALIGN(E)
      + PAGE_ALIGN(E)
      
      @@
      expression E;
      @@
      - page_cache_get(E)
      + get_page(E)
      
      @@
      expression E;
      @@
      - page_cache_release(E)
      + put_page(E)
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  9. 02 Apr, 2016 1 commit
  10. 01 Apr, 2016 3 commits
    • MD: add rdev reference for super write · ed3b98c7
      Committed by Shaohua Li
      Xiao Ni reported below crash:
      [26396.335146] BUG: unable to handle kernel NULL pointer dereference at 00000000000002a8
      [26396.342990] IP: [<ffffffffa0425b00>] super_written+0x20/0x80 [md_mod]
      [26396.349449] PGD 0
      [26396.351468] Oops: 0002 [#1] SMP
      [26396.354898] Modules linked in: ext4 mbcache jbd2 raid456 async_raid6_recov async_memcpy async_pq async_xor xor async_td
      [26396.408404] CPU: 5 PID: 3261 Comm: loop0 Not tainted 4.5.0 #1
      [26396.414140] Hardware name: Dell Inc. PowerEdge R715/0G2DP3, BIOS 3.2.2 09/15/2014
      [26396.421608] task: ffff8808339be680 ti: ffff8808365f4000 task.ti: ffff8808365f4000
      [26396.429074] RIP: 0010:[<ffffffffa0425b00>]  [<ffffffffa0425b00>] super_written+0x20/0x80 [md_mod]
      [26396.437952] RSP: 0018:ffff8808365f7c38  EFLAGS: 00010046
      [26396.443252] RAX: ffffffffa0425ae0 RBX: ffff8804336a7900 RCX: ffffe8f9f7b41198
      [26396.450371] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8804336a7900
      [26396.457489] RBP: ffff8808365f7c50 R08: 0000000000000005 R09: 00001801e02ce3d7
      [26396.464608] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
      [26396.471728] R13: ffff8808338d9a00 R14: 0000000000000000 R15: ffff880833f9fe00
      [26396.478849] FS:  00007f9e5066d740(0000) GS:ffff880237b40000(0000) knlGS:0000000000000000
      [26396.486922] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [26396.492656] CR2: 00000000000002a8 CR3: 00000000019ea000 CR4: 00000000000006e0
      [26396.499775] Stack:
      [26396.501781]  ffff8804336a7900 0000000000000000 0000000000000000 ffff8808365f7c68
      [26396.509199]  ffffffff81308cd0 ffff8804336a7900 ffff8808365f7ca8 ffffffff81310637
      [26396.516618]  00000000a0233a00 ffff880833f9fe00 0000000000000000 ffff880833fb0000
      [26396.524038] Call Trace:
      [26396.526485]  [<ffffffff81308cd0>] bio_endio+0x40/0x60
      [26396.531529]  [<ffffffff81310637>] blk_update_request+0x87/0x320
      [26396.537439]  [<ffffffff8131a20a>] blk_mq_end_request+0x1a/0x70
      [26396.543261]  [<ffffffff81313889>] blk_flush_complete_seq+0xd9/0x2a0
      [26396.549517]  [<ffffffff81313ccf>] flush_end_io+0x15f/0x240
      [26396.554993]  [<ffffffff8131a22a>] blk_mq_end_request+0x3a/0x70
      [26396.560815]  [<ffffffff8131a314>] __blk_mq_complete_request+0xb4/0xe0
      [26396.567246]  [<ffffffff8131a35c>] blk_mq_complete_request+0x1c/0x20
      [26396.573506]  [<ffffffffa04182df>] loop_queue_work+0x6f/0x72c [loop]
      [26396.579764]  [<ffffffff81697844>] ? __schedule+0x2b4/0x8f0
      [26396.585242]  [<ffffffff810a7812>] kthread_worker_fn+0x52/0x170
      [26396.591065]  [<ffffffff810a77c0>] ? kthread_create_on_node+0x1a0/0x1a0
      [26396.597582]  [<ffffffff810a7238>] kthread+0xd8/0xf0
      [26396.602453]  [<ffffffff810a7160>] ? kthread_park+0x60/0x60
      [26396.607929]  [<ffffffff8169bdcf>] ret_from_fork+0x3f/0x70
      [26396.613319]  [<ffffffff810a7160>] ? kthread_park+0x60/0x60
      
      md_super_write() and the corresponding md_super_wait() are generally
      called with reconfig_mutex held, which prevents the disk from
      disappearing. There is one case where this rule is broken:
      write_sb_page() in bitmap.c doesn't hold the mutex. next_active_rdev()
      does take an rdev reference, but it drops the reference too early
      (i.e., before the I/O finishes), so the disk can disappear in that
      window. We unconditionally take an rdev reference in md_super_write()
      to avoid the race.
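
The idea of holding a reference for the whole lifetime of the I/O can be sketched in user space as follows. The struct and the `*_sketch` function names are hypothetical; the real code works on `struct md_rdev` with an atomic counter, which this sketch replaces with a plain `int` for brevity.

```c
#include <assert.h>

/* Hypothetical, simplified stand-in for the real rdev structure. */
struct rdev_sketch {
	int nr_pending;   /* outstanding references, including in-flight I/O */
	int write_errors; /* completions that reported an error */
};

/* Submission side: pin the device before the write is issued. */
static void super_write_sketch(struct rdev_sketch *rdev)
{
	rdev->nr_pending++; /* reference held until the completion runs */
	/* ... the superblock write would be submitted here ... */
}

/* Completion side: runs when the I/O finishes, possibly much later. */
static void super_written_sketch(struct rdev_sketch *rdev, int error)
{
	if (error)
		rdev->write_errors++;
	assert(rdev->nr_pending > 0);
	rdev->nr_pending--; /* only now may the device go away */
}

/* A removal path would refuse to free the device while it is pinned. */
static int can_remove_sketch(const struct rdev_sketch *rdev)
{
	return rdev->nr_pending == 0;
}
```

Because the reference is taken at submission and dropped at completion, the completion handler never touches a device that has already been removed, regardless of whether the caller holds the mutex.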
      Reported-and-tested-by: Xiao Ni <xni@redhat.com>
      Reviewed-by: Neil Brown <neilb@suse.de>
      Signed-off-by: Shaohua Li <shli@fb.com>
    • md: fix a trivial typo in comments · 466ad292
      Committed by Wei Fang
      Fix a trivial typo in md_ioctl().
      Signed-off-by: Wei Fang <fangwei1@huawei.com>
      Signed-off-by: Shaohua Li <shli@fb.com>
    • md:raid1: fix a dead loop when read from a WriteMostly disk · 816b0acf
      Committed by Wei Fang
      If first_bad == this_sector when we get the WriteMostly disk
      in read_balance(), a valid disk will be returned with zero
      max_sectors. This leads to an endless loop in make_request(), and
      an OOM will follow because of the endless allocation of struct bio.
      
      Since we can't get data from this disk in this case, continue
      with another disk.
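
A minimal sketch of the "skip a disk that can serve zero sectors" rule is shown below. The `disk_sketch` struct and both helpers are hypothetical and much simpler than the real read_balance(); in particular, the sketch assumes that everything from `first_bad` onward is unreadable.

```c
#include <assert.h>

/* Hypothetical, simplified view of one mirror leg. */
struct disk_sketch {
	long first_bad; /* first bad sector, or -1 if there is none */
};

/* Sectors readable from this_sector before the first bad one.
 * Assumption: everything from first_bad onward counts as bad. */
static long good_sectors(const struct disk_sketch *d,
			 long this_sector, long sectors)
{
	long good;

	if (d->first_bad < 0)
		return sectors;
	if (d->first_bad <= this_sector)
		return 0;
	good = d->first_bad - this_sector;
	return good < sectors ? good : sectors;
}

/* Returns the index of a usable disk, or -1 if none can serve the read. */
static int pick_read_disk(const struct disk_sketch *disks, int n,
			  long this_sector, long sectors, long *max_sectors)
{
	int i;

	for (i = 0; i < n; i++) {
		long good = good_sectors(&disks[i], this_sector, sectors);

		if (good == 0)
			continue; /* nothing readable here: try another disk */
		*max_sectors = good;
		return i;
	}
	return -1;
}
```

The caller can then rely on a returned disk always making progress, so a retry loop around it cannot spin on a zero-length read.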
      Signed-off-by: Wei Fang <fangwei1@huawei.com>
      Signed-off-by: Shaohua Li <shli@fb.com>
  11. 18 Mar, 2016 3 commits
    • md/raid5: Cleanup cpu hotplug notifier · 1d034e68
      Committed by Anna-Maria Gleixner
      The raid456_cpu_notify() hotplug callback lacks handling of the
      CPU_UP_CANCELED case. That means if CPU_UP_PREPARE fails, the scratch
      buffer is leaked.
      
      Add handling for CPU_UP_CANCELED[_FROZEN] hotplug notifier transitions
      to free the scratch buffer.
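
The leak and its fix can be modeled in user space with a small state machine. The event names below are hypothetical stand-ins that mirror the CPU_* notifier actions, and `malloc()`/`free()` stand in for the per-CPU scratch buffer management.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical events mirroring the CPU_* hotplug notifier actions. */
enum cpu_event_sketch { UP_PREPARE, UP_CANCELED, ONLINE, DEAD };

#define NR_CPUS_SKETCH 4
static void *scratch[NR_CPUS_SKETCH]; /* one scratch buffer per CPU */

static int cpu_notify_sketch(enum cpu_event_sketch ev, int cpu)
{
	switch (ev) {
	case UP_PREPARE:
		scratch[cpu] = malloc(4096);
		return scratch[cpu] ? 0 : -1;
	case UP_CANCELED: /* bring-up failed after a successful prepare */
	case DEAD:        /* CPU went offline */
		free(scratch[cpu]);
		scratch[cpu] = NULL;
		return 0;
	default:
		return 0;
	}
}
```

Without the `UP_CANCELED` case, a failed bring-up would leave `scratch[cpu]` allocated with no later event to release it.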
      
      CC: Shaohua Li <shli@kernel.org>
      CC: linux-raid@vger.kernel.org
      Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Signed-off-by: Shaohua Li <shli@fb.com>
    • raid10: include bio_end_io_list in nr_queued to prevent freeze_array hang · 23ddba80
      Committed by Shaohua Li
      This is the raid10 counterpart of the bug fixed by Nate
      (raid1: include bio_end_io_list in nr_queued to prevent freeze_array hang)
      
      Fixes: 95af587e(md/raid10: ensure device failure recorded before write request returns)
      Cc: stable@vger.kernel.org (V4.3+)
      Cc: Nate Dailey <nate.dailey@stratus.com>
      Signed-off-by: Shaohua Li <shli@fb.com>
    • raid1: include bio_end_io_list in nr_queued to prevent freeze_array hang · ccfc7bf1
      Committed by Nate Dailey
      If raid1d is handling a mix of read and write errors, handle_read_error's
      call to freeze_array can get stuck.
      
      This can happen because, though the bio_end_io_list is initially drained,
      writes can be added to it via handle_write_finished as the retry_list
      is processed. These writes contribute to nr_pending but are not included
      in nr_queued.
      
      If a later entry on the retry_list triggers a call to handle_read_error,
      freeze_array hangs waiting for nr_pending == nr_queued+extra. The writes
      on the bio_end_io_list aren't included in nr_queued so the condition will
      never be satisfied.
      
      To prevent the hang, include bio_end_io_list writes in nr_queued.
      
      There's probably a better way to handle decrementing nr_queued, but this
      seemed like the safest way to avoid breaking surrounding code.
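
The accounting problem can be reduced to the wait condition itself. The sketch below is a user-space model with hypothetical names; it only demonstrates why a write parked on bio_end_io_list must be counted in nr_queued for `nr_pending == nr_queued + extra` to become true.

```c
#include <assert.h>

/* Hypothetical, reduced view of the counters the condition depends on. */
struct conf_sketch {
	int nr_pending; /* requests issued and not yet completed */
	int nr_queued;  /* requests parked for the raid1d thread */
};

/* The condition freeze_array waits for, as quoted in the message above. */
static int freeze_can_proceed(const struct conf_sketch *conf, int extra)
{
	return conf->nr_pending == conf->nr_queued + extra;
}

/* The fix in this model: a write parked on bio_end_io_list is counted. */
static void account_parked_write(struct conf_sketch *conf)
{
	conf->nr_queued++;
}

/* When the parked write is later taken off the list, undo the count. */
static void release_parked_write(struct conf_sketch *conf)
{
	assert(conf->nr_queued > 0);
	conf->nr_queued--;
}
```

For example, with one read being handled (extra = 1) and one parked write, nr_pending is 2; the condition only holds once the parked write is reflected in nr_queued.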
      
      I'm happy to supply the script I used to repro this hang.
      
      Fixes: 55ce74d4(md/raid1: ensure device failure recorded before write request returns.)
      Cc: stable@vger.kernel.org (v4.3+)
      Signed-off-by: Nate Dailey <nate.dailey@stratus.com>
      Signed-off-by: Shaohua Li <shli@fb.com>
  12. 15 Mar, 2016 5 commits
  13. 12 Mar, 2016 1 commit
    • dm thin: consistently return -ENOSPC if pool has run out of data space · c3667cc6
      Committed by Mike Snitzer
      Commit 0a927c2f ("dm thin: return -ENOSPC when erroring retry list due
      to out of data space") was a step in the right direction but didn't go
      far enough.
      
      Add a new 'out_of_data_space' flag to 'struct pool' and set it if/when
      the pool runs out of data space.  This fixes cell_error() and
      error_retry_list() to not blindly return -EIO.
      
      We cannot rely on the 'error_if_no_space' feature flag since it is
      transient (in that it can be reset once space is added, plus it only
      controls whether errors are issued, it doesn't reflect whether the
      pool is actually out of space).
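
The distinction between the state flag and the policy flag can be sketched as follows. The struct and the helper name are hypothetical simplifications of 'struct pool'; only the two flags named in the message are modeled.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, reduced view of the pool. */
struct pool_sketch {
	bool out_of_data_space; /* state: set when data space has run out */
	bool error_if_no_space; /* policy: whether to error or queue I/O */
};

/* Pick the error code from the state flag, never from the policy flag. */
static int pool_io_error_code_sketch(const struct pool_sketch *pool)
{
	return pool->out_of_data_space ? -ENOSPC : -EIO;
}
```

Keying the error code off the state flag means callers get -ENOSPC whenever the pool is really full, independent of how the policy flag happens to be set at that moment.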
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
  14. 11 Mar, 2016 9 commits
    • dm cache: bump the target version · 843f0f2e
      Committed by Mike Snitzer
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • dm cache: make sure every metadata function checks fail_io · d14fcf3d
      Committed by Joe Thornber
      Otherwise operations may be attempted that will only ever go on to crash
      (since the metadata device is either missing or unreliable if 'fail_io'
      is set).
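
A guard of this kind might look like the sketch below. The struct, the function name, and the choice of -EINVAL as the error code are assumptions for illustration; the point is only that the check comes before any access to the metadata.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, reduced metadata handle. */
struct cache_metadata_sketch {
	bool fail_io;            /* set once the metadata device is unusable */
	unsigned long nr_blocks; /* example of state read from the metadata */
};

/* Every accessor bails out first if fail_io is set. */
static int metadata_get_size_sketch(const struct cache_metadata_sketch *cmd,
				    unsigned long *result)
{
	if (cmd->fail_io)
		return -EINVAL; /* assumed error code for this sketch */

	*result = cmd->nr_blocks;
	return 0;
}
```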
      Signed-off-by: Joe Thornber <ejt@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Cc: stable@vger.kernel.org
    • M
    • M
    • dm: return error if bio_integrity_clone() fails in clone_bio() · c80914e8
      Committed by Mike Snitzer
      clone_bio() now checks whether bio_integrity_clone() returned an error
      rather than silently dropping it.
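
The pattern is ordinary error propagation. In the sketch below both functions are hypothetical stand-ins (the real ones operate on struct bio); the `should_fail` parameter exists only so the failure path can be exercised.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the integrity clone step. */
static int integrity_clone_sketch(bool should_fail)
{
	return should_fail ? -ENOMEM : 0;
}

/* The caller now returns an int and forwards the failure. */
static int clone_bio_sketch(bool has_integrity, bool should_fail)
{
	if (has_integrity) {
		int r = integrity_clone_sketch(should_fail);

		if (r < 0)
			return r; /* previously the result was ignored */
	}
	/* ... remaining clone setup would follow here ... */
	return 0;
}
```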
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • dm thin metadata: don't issue prefetches if a transaction abort has failed · 2eae9e44
      Committed by Joe Thornber
      If a transaction abort has failed then we can no longer use the metadata
      device.  Typically this happens if the superblock is unreadable.
      
      This fix addresses a crash seen during metadata device failure testing.
      
      Fixes: 8a01a6af ("dm thin: prefetch missing metadata pages")
      Cc: stable@vger.kernel.org # 3.19+
      Signed-off-by: Joe Thornber <ejt@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • dm snapshot: disallow the COW and origin devices from being identical · 4df2bf46
      Committed by DingXiang
      Otherwise loading a "snapshot" table using the same device for the
      origin and COW devices, e.g.:
      
      echo "0 20971520 snapshot 253:3 253:3 P 8" | dmsetup create snap
      
      will trigger:
      
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000098
      [ 1958.979934] IP: [<ffffffffa040efba>] dm_exception_store_set_chunk_size+0x7a/0x110 [dm_snapshot]
      [ 1958.989655] PGD 0
      [ 1958.991903] Oops: 0000 [#1] SMP
      ...
      [ 1959.059647] CPU: 9 PID: 3556 Comm: dmsetup Tainted: G          IO    4.5.0-rc5.snitm+ #150
      ...
      [ 1959.083517] task: ffff8800b9660c80 ti: ffff88032a954000 task.ti: ffff88032a954000
      [ 1959.091865] RIP: 0010:[<ffffffffa040efba>]  [<ffffffffa040efba>] dm_exception_store_set_chunk_size+0x7a/0x110 [dm_snapshot]
      [ 1959.104295] RSP: 0018:ffff88032a957b30  EFLAGS: 00010246
      [ 1959.110219] RAX: 0000000000000000 RBX: 0000000000000008 RCX: 0000000000000001
      [ 1959.118180] RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffff880329334a00
      [ 1959.126141] RBP: ffff88032a957b50 R08: 0000000000000000 R09: 0000000000000001
      [ 1959.134102] R10: 000000000000000a R11: f000000000000000 R12: ffff880330884d80
      [ 1959.142061] R13: 0000000000000008 R14: ffffc90001c13088 R15: ffff880330884d80
      [ 1959.150021] FS:  00007f8926ba3840(0000) GS:ffff880333440000(0000) knlGS:0000000000000000
      [ 1959.159047] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 1959.165456] CR2: 0000000000000098 CR3: 000000032f48b000 CR4: 00000000000006e0
      [ 1959.173415] Stack:
      [ 1959.175656]  ffffc90001c13040 ffff880329334a00 ffff880330884ed0 ffff88032a957bdc
      [ 1959.183946]  ffff88032a957bb8 ffffffffa040f225 ffff880329334a30 ffff880300000000
      [ 1959.192233]  ffffffffa04133e0 ffff880329334b30 0000000830884d58 00000000569c58cf
      [ 1959.200521] Call Trace:
      [ 1959.203248]  [<ffffffffa040f225>] dm_exception_store_create+0x1d5/0x240 [dm_snapshot]
      [ 1959.211986]  [<ffffffffa040d310>] snapshot_ctr+0x140/0x630 [dm_snapshot]
      [ 1959.219469]  [<ffffffffa0005c44>] ? dm_split_args+0x64/0x150 [dm_mod]
      [ 1959.226656]  [<ffffffffa0005ea7>] dm_table_add_target+0x177/0x440 [dm_mod]
      [ 1959.234328]  [<ffffffffa0009203>] table_load+0x143/0x370 [dm_mod]
      [ 1959.241129]  [<ffffffffa00090c0>] ? retrieve_status+0x1b0/0x1b0 [dm_mod]
      [ 1959.248607]  [<ffffffffa0009e35>] ctl_ioctl+0x255/0x4d0 [dm_mod]
      [ 1959.255307]  [<ffffffff813304e2>] ? memzero_explicit+0x12/0x20
      [ 1959.261816]  [<ffffffffa000a0c3>] dm_ctl_ioctl+0x13/0x20 [dm_mod]
      [ 1959.268615]  [<ffffffff81215eb6>] do_vfs_ioctl+0xa6/0x5c0
      [ 1959.274637]  [<ffffffff81120d2f>] ? __audit_syscall_entry+0xaf/0x100
      [ 1959.281726]  [<ffffffff81003176>] ? do_audit_syscall_entry+0x66/0x70
      [ 1959.288814]  [<ffffffff81216449>] SyS_ioctl+0x79/0x90
      [ 1959.294450]  [<ffffffff8167e4ae>] entry_SYSCALL_64_fastpath+0x12/0x71
      ...
      [ 1959.323277] RIP  [<ffffffffa040efba>] dm_exception_store_set_chunk_size+0x7a/0x110 [dm_snapshot]
      [ 1959.333090]  RSP <ffff88032a957b30>
      [ 1959.336978] CR2: 0000000000000098
      [ 1959.344121] ---[ end trace b049991ccad1169e ]---
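
The validation that avoids the oops above can be sketched as a simple identity check done before any further setup. The struct, the helper names, and the -EINVAL return value are assumptions for illustration; devices are compared by major:minor number as in the `253:3 253:3` table line.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical device identity: major:minor, as in "253:3". */
struct dev_sketch {
	unsigned int major_nr;
	unsigned int minor_nr;
};

static int same_device(const struct dev_sketch *a, const struct dev_sketch *b)
{
	return a->major_nr == b->major_nr && a->minor_nr == b->minor_nr;
}

/* Reject the table early when origin and COW are the same device. */
static int snapshot_check_devices(const struct dev_sketch *origin,
				  const struct dev_sketch *cow)
{
	if (same_device(origin, cow))
		return -EINVAL; /* assumed error code for this sketch */
	return 0;
}
```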
      
      Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1195899
      Cc: stable@vger.kernel.org
      Signed-off-by: Ding Xiang <dingxiang@huawei.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • dm cache: make the 'mq' policy an alias for 'smq' · 9ed84698
      Committed by Joe Thornber
      smq seems to be performing better than the old mq policy in all
      situations, as well as using a quarter of the memory.
      
      Make 'mq' an alias for 'smq' when choosing a cache policy.  The tunables
      that were present for the old mq are faked, and have no effect.  mq
      should be considered deprecated now.
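
Aliasing one policy name to another can be done at lookup time. The helper below is a hypothetical user-space sketch, not the actual policy registration code; it only shows the name resolution step.

```c
#include <assert.h>
#include <string.h>

/* Map a requested policy name to the implementation that serves it. */
static const char *resolve_policy_name_sketch(const char *name)
{
	if (strcmp(name, "mq") == 0)
		return "smq"; /* deprecated name, served by smq */
	return name;
}
```

Resolving the alias in one place keeps existing tables that request 'mq' loadable while only one implementation has to be maintained.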
      Signed-off-by: Joe Thornber <ejt@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • dm: drop unnecessary assignment of md->queue · e233d800
      Committed by Bob Liu
      md->queue and q are the same thing in dm_old_init_request_queue() and
      dm_mq_init_request_queue().
      
      Also drop the temporary 'struct request_queue *q' in
      dm_old_init_request_queue().
      Signed-off-by: Bob Liu <bob.liu@oracle.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>