1. 02 12月, 2017 1 次提交
    • N
      md: limit mdstat resync progress to max_sectors · d2e2ec82
      Nate Dailey 提交于
      There is a small window near the end of md_do_sync where mddev->curr_resync
      can be equal to MaxSector.
      
      If status_resync is called during this window, the resulting /proc/mdstat
      output contains a HUGE number of = signs due to the very large curr_resync:
      
      Personalities : [raid1]
      md123 : active raid1 sdd3[2] sdb3[0]
        204736 blocks super 1.0 [2/1] [U_]
        [=====================================================================
         ... (82 MB more) ...
         ================>]  recovery =429496729.3% (9223372036854775807/204736)
         finish=0.2min speed=12796K/sec
        bitmap: 0/1 pages [0KB], 65536KB chunk
      
      Modify status_resync to ensure the resync variable doesn't exceed
      the array's max_sectors.
      Signed-off-by: NNate Dailey <nate.dailey@stratus.com>
      Acked-by: NGuoqing Jiang <gqjiang@suse.com>
      Signed-off-by: NShaohua Li <shli@fb.com>
      d2e2ec82
  2. 15 11月, 2017 1 次提交
  3. 11 11月, 2017 1 次提交
  4. 09 11月, 2017 1 次提交
    • N
      md: be cautious about using ->curr_resync_completed for ->recovery_offset · db0505d3
      NeilBrown 提交于
      The ->recovery_offset shows how much of a non-InSync device is actually
      in sync - how much has been recoveryed.
      
      When performing a recovery, ->curr_resync and ->curr_resync_completed
      follow the device address being recovered and so can be used to update
      ->recovery_offset.
      
      When performing a reshape, ->curr_resync* might follow the device
      addresses (raid5) or might follow array addresses (raid10), so cannot
      in general be used to set ->recovery_offset.  When reshaping backwards,
      ->curre_resync* measures from the *end* of the array-or-device, so is
      particularly unhelpful.
      
      So change the common code in md.c to only use ->curr_resync_complete
      for the simple recovery case, and add code to raid5.c to update
      ->recovery_offset during a forwards reshape.
      Signed-off-by: NNeilBrown <neilb@suse.com>
      Signed-off-by: NShaohua Li <shli@fb.com>
      db0505d3
  5. 02 11月, 2017 9 次提交
    • A
      md: don't check MD_SB_CHANGE_CLEAN in md_allow_write · b90f6ff0
      Artur Paszkiewicz 提交于
      Only MD_SB_CHANGE_PENDING should be used to wait for transition from
      clean to dirty. Checking also MD_SB_CHANGE_CLEAN is unnecessary and can
      race with e.g. md_do_sync(). This sporadically causes a hang when
      changing consistency policy during resync:
      
      INFO: task mdadm:6183 blocked for more than 30 seconds.
            Not tainted 4.14.0-rc3+ #391
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      mdadm           D12752  6183   6022 0x00000000
      Call Trace:
       __schedule+0x93f/0x990
       schedule+0x6b/0x90
       md_allow_write+0x100/0x130 [md_mod]
       ? do_wait_intr_irq+0x90/0x90
       resize_stripes+0x3a/0x5b0 [raid456]
       ? kernfs_fop_write+0xbe/0x180
       raid5_change_consistency_policy+0xa6/0x200 [raid456]
       consistency_policy_store+0x2e/0x70 [md_mod]
       md_attr_store+0x90/0xc0 [md_mod]
       sysfs_kf_write+0x42/0x50
       kernfs_fop_write+0x119/0x180
       __vfs_write+0x28/0x110
       ? rcu_sync_lockdep_assert+0x12/0x60
       ? __sb_start_write+0x15a/0x1c0
       ? vfs_write+0xa3/0x1a0
       vfs_write+0xb4/0x1a0
       SyS_write+0x49/0xa0
       entry_SYSCALL_64_fastpath+0x18/0xad
      
      Fixes: 2214c260 ("md: don't return -EAGAIN in md_allow_write for external metadata arrays")
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NArtur Paszkiewicz <artur.paszkiewicz@intel.com>
      Signed-off-by: NShaohua Li <shli@fb.com>
      b90f6ff0
    • S
      md: use lockdep_assert_held · efa4b77b
      Shaohua Li 提交于
      lockdep_assert_held is a better way to assert lock held, and it works
      for UP.
      Signed-off-by: NShaohua Li <shli@fb.com>
      efa4b77b
    • N
      md: remove special meaning of ->quiesce(.., 2) · b03e0ccb
      NeilBrown 提交于
      The '2' argument means "wake up anything that is waiting".
      This is an inelegant part of the design and was added
      to help support management of suspend_lo/suspend_hi setting.
      Now that suspend_lo/hi is managed in mddev_suspend/resume,
      that need is gone.
      These is still a couple of places where we call 'quiesce'
      with an argument of '2', but they can safely be changed to
      call ->quiesce(.., 1); ->quiesce(.., 0) which
      achieve the same result at the small cost of pausing IO
      briefly.
      
      This removes a small "optimization" from suspend_{hi,lo}_store,
      but it isn't clear that optimization served a useful purpose.
      The code now is a lot clearer.
      Suggested-by: NShaohua Li <shli@kernel.org>
      Signed-off-by: NNeilBrown <neilb@suse.com>
      Signed-off-by: NShaohua Li <shli@fb.com>
      b03e0ccb
    • N
      md: allow metadata update while suspending. · 35bfc521
      NeilBrown 提交于
      There are various deadlocks that can occur
      when a thread holds reconfig_mutex and calls
      ->quiesce(mddev, 1).
      As some write request block waiting for
      metadata to be updated (e.g. to record device
      failure), and as the md thread updates the metadata
      while the reconfig mutex is held, holding the mutex
      can stop write requests completing, and this prevents
      ->quiesce(mddev, 1) from completing.
      
      ->quiesce() is now usually called from mddev_suspend(),
      and it is always called with reconfig_mutex held.  So
      at this time it is safe for the thread to update metadata
      without explicitly taking the lock.
      
      So add 2 new flags, one which says the unlocked updates is
      allowed, and one which ways it is happening.  Then allow it
      while the quiesce completes, and then wait for it to finish.
      Reported-and-tested-by: NXiao Ni <xni@redhat.com>
      Signed-off-by: NNeilBrown <neilb@suse.com>
      Signed-off-by: NShaohua Li <shli@fb.com>
      35bfc521
    • N
      md: use mddev_suspend/resume instead of ->quiesce() · 9e1cc0a5
      NeilBrown 提交于
      mddev_suspend() is a more general interface than
      calling ->quiesce() and is so more extensible.  A
      future patch will make use of this.
      Signed-off-by: NNeilBrown <neilb@suse.com>
      Signed-off-by: NShaohua Li <shli@fb.com>
      9e1cc0a5
    • N
      md: move suspend_hi/lo handling into core md code · b3143b9a
      NeilBrown 提交于
      responding to ->suspend_lo and ->suspend_hi is similar
      to responding to ->suspended.  It is best to wait in
      the common core code without incrementing ->active_io.
      This allows mddev_suspend()/mddev_resume() to work while
      requests are waiting for suspend_lo/hi to change.
      This is will be important after a subsequent patch
      which uses mddev_suspend() to synchronize updating for
      suspend_lo/hi.
      
      So move the code for testing suspend_lo/hi out of raid1.c
      and raid5.c, and place it in md.c
      Signed-off-by: NNeilBrown <neilb@suse.com>
      Signed-off-by: NShaohua Li <shli@fb.com>
      b3143b9a
    • N
      md: don't call bitmap_create() while array is quiesced. · 52a0d49d
      NeilBrown 提交于
      bitmap_create() allocates memory with GFP_KERNEL and
      so can wait for IO.
      If called while the array is quiesced, it could wait indefinitely
      for write out to the array - deadlock.
      So call bitmap_create() before quiescing the array.
      Signed-off-by: NNeilBrown <neilb@suse.com>
      Signed-off-by: NShaohua Li <shli@fb.com>
      52a0d49d
    • N
      md: always hold reconfig_mutex when calling mddev_suspend() · 4d5324f7
      NeilBrown 提交于
      Most often mddev_suspend() is called with
      reconfig_mutex held.  Make this a requirement in
      preparation a subsequent patch.  Also require
      reconfig_mutex to be held for mddev_resume(),
      partly for symmetry and partly to guarantee
      no races with incr/decr of mddev->suspend.
      
      Taking the mutex in r5c_disable_writeback_async() is
      a little tricky as this is called from a work queue
      via log->disable_writeback_work, and flush_work()
      is called on that while holding ->reconfig_mutex.
      If the work item hasn't run before flush_work()
      is called, the work function will not be able to
      get the mutex.
      
      So we use mddev_trylock() inside the wait_event() call, and have that
      abort when conf->log is set to NULL, which happens before
      flush_work() is called.
      We wait in mddev->sb_wait and ensure this is woken
      when any of the conditions change.  This requires
      waking mddev->sb_wait in mddev_unlock().  This is only
      like to trigger extra wake_ups of threads that needn't
      be woken when metadata is being written, and that
      doesn't happen often enough that the cost would be
      noticeable.
      Signed-off-by: NNeilBrown <neilb@suse.com>
      Signed-off-by: NShaohua Li <shli@fb.com>
      4d5324f7
    • N
      md: forbid a RAID5 from having both a bitmap and a journal. · 230b55fa
      NeilBrown 提交于
      Having both a bitmap and a journal is pointless.
      Attempting to do so can corrupt the bitmap if the journal
      replay happens before the bitmap is initialized.
      Rather than try to avoid this corruption, simply
      refuse to allow arrays with both a bitmap and a journal.
      So:
       - if raid5_run sees both are present, fail.
       - if adding a bitmap finds a journal is present, fail
       - if adding a journal finds a bitmap is present, fail.
      
      Cc: stable@vger.kernel.org (4.10+)
      Signed-off-by: NNeilBrown <neilb@suse.com>
      Tested-by: NJoshua Kinard <kumba@gentoo.org>
      Acked-by: NJoshua Kinard <kumba@gentoo.org>
      Signed-off-by: NShaohua Li <shli@fb.com>
      230b55fa
  6. 31 10月, 2017 1 次提交
    • K
      treewide: Fix function prototypes for module_param_call() · e4dca7b7
      Kees Cook 提交于
      Several function prototypes for the set/get functions defined by
      module_param_call() have a slightly wrong argument types. This fixes
      those in an effort to clean up the calls when running under type-enforced
      compiler instrumentation for CFI. This is the result of running the
      following semantic patch:
      
      @match_module_param_call_function@
      declarer name module_param_call;
      identifier _name, _set_func, _get_func;
      expression _arg, _mode;
      @@
      
       module_param_call(_name, _set_func, _get_func, _arg, _mode);
      
      @fix_set_prototype
       depends on match_module_param_call_function@
      identifier match_module_param_call_function._set_func;
      identifier _val, _param;
      type _val_type, _param_type;
      @@
      
       int _set_func(
      -_val_type _val
      +const char * _val
       ,
      -_param_type _param
      +const struct kernel_param * _param
       ) { ... }
      
      @fix_get_prototype
       depends on match_module_param_call_function@
      identifier match_module_param_call_function._get_func;
      identifier _val, _param;
      type _val_type, _param_type;
      @@
      
       int _get_func(
      -_val_type _val
      +char * _val
       ,
      -_param_type _param
      +const struct kernel_param * _param
       ) { ... }
      
      Two additional by-hand changes are included for places where the above
      Coccinelle script didn't notice them:
      
      	drivers/platform/x86/thinkpad_acpi.c
      	fs/lockd/svc.c
      Signed-off-by: NKees Cook <keescook@chromium.org>
      Signed-off-by: NJessica Yu <jeyu@kernel.org>
      e4dca7b7
  7. 25 10月, 2017 1 次提交
    • M
      locking/atomics: COCCINELLE/treewide: Convert trivial ACCESS_ONCE() patterns... · 6aa7de05
      Mark Rutland 提交于
      locking/atomics: COCCINELLE/treewide: Convert trivial ACCESS_ONCE() patterns to READ_ONCE()/WRITE_ONCE()
      
      Please do not apply this to mainline directly, instead please re-run the
      coccinelle script shown below and apply its output.
      
      For several reasons, it is desirable to use {READ,WRITE}_ONCE() in
      preference to ACCESS_ONCE(), and new code is expected to use one of the
      former. So far, there's been no reason to change most existing uses of
      ACCESS_ONCE(), as these aren't harmful, and changing them results in
      churn.
      
      However, for some features, the read/write distinction is critical to
      correct operation. To distinguish these cases, separate read/write
      accessors must be used. This patch migrates (most) remaining
      ACCESS_ONCE() instances to {READ,WRITE}_ONCE(), using the following
      coccinelle script:
      
      ----
      // Convert trivial ACCESS_ONCE() uses to equivalent READ_ONCE() and
      // WRITE_ONCE()
      
      // $ make coccicheck COCCI=/home/mark/once.cocci SPFLAGS="--include-headers" MODE=patch
      
      virtual patch
      
      @ depends on patch @
      expression E1, E2;
      @@
      
      - ACCESS_ONCE(E1) = E2
      + WRITE_ONCE(E1, E2)
      
      @ depends on patch @
      expression E;
      @@
      
      - ACCESS_ONCE(E)
      + READ_ONCE(E)
      ----
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: davem@davemloft.net
      Cc: linux-arch@vger.kernel.org
      Cc: mpe@ellerman.id.au
      Cc: shuah@kernel.org
      Cc: snitzer@redhat.com
      Cc: thor.thayer@linux.intel.com
      Cc: tj@kernel.org
      Cc: viro@zeniv.linux.org.uk
      Cc: will.deacon@arm.com
      Link: http://lkml.kernel.org/r/1508792849-3115-19-git-send-email-paulmck@linux.vnet.ibm.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      6aa7de05
  8. 17 10月, 2017 1 次提交
  9. 09 10月, 2017 1 次提交
    • G
      md: always set THREAD_WAKEUP and wake up wqueue if thread existed · d1d90147
      Guoqing Jiang 提交于
      Since commit 4ad23a97 ("MD: use per-cpu counter for writes_pending"),
      the wait_queue is only got invoked if THREAD_WAKEUP is not set previously.
      
      With above change, I can see process_metadata_update could always hang on
      the wait queue, because mddev->thread could stay on 'D' status and the
      THREAD_WAKEUP flag is not cleared since there are lots of place to wake up
      mddev->thread. Then deadlock happened as follows:
      
      linux175:~ # ps aux|grep md|grep D
      root    20117   0.0 0.0         0   0 ? D   03:45   0:00 [md0_raid1]
      root    20125   0.0 0.0         0   0 ? D   03:45   0:00 [md0_cluster_rec]
      linux175:~ # cat /proc/20117/stack
      [<ffffffffa0635604>] dlm_lock_sync+0x94/0xd0 [md_cluster]
      [<ffffffffa0635674>] lock_token+0x34/0xd0 [md_cluster]
      [<ffffffffa0635804>] metadata_update_start+0x64/0x110 [md_cluster]
      [<ffffffffa04d985b>] md_update_sb.part.58+0x9b/0x860 [md_mod]
      [<ffffffffa04da035>] md_update_sb+0x15/0x30 [md_mod]
      [<ffffffffa04dc066>] md_check_recovery+0x266/0x490 [md_mod]
      [<ffffffffa06450e2>] raid1d+0x42/0x810 [raid1]
      [<ffffffffa04d2252>] md_thread+0x122/0x150 [md_mod]
      [<ffffffff81091741>] kthread+0x101/0x140
      linux175:~ # cat /proc/20125/stack
      [<ffffffffa0636679>] recv_daemon+0x3f9/0x5c0 [md_cluster]
      [<ffffffffa04d2252>] md_thread+0x122/0x150 [md_mod]
      [<ffffffff81091741>] kthread+0x101/0x140
      
      So let's revert the part of code in the commit to resovle the problem since
      we can't get lots of benefits of previous change.
      
      Fixes: 4ad23a97 ("MD: use per-cpu counter for writes_pending")
      Signed-off-by: NGuoqing Jiang <gqjiang@suse.com>
      Signed-off-by: NShaohua Li <shli@fb.com>
      d1d90147
  10. 06 10月, 2017 1 次提交
    • N
      md: fix deadlock error in recent patch. · d47c8ad2
      NeilBrown 提交于
      A recent patch aimed to cause md_write_start() to fail (rather than
      block) when the mddev was suspending, so as to avoid deadlocks.
      Unfortunately the test in wait_event() was wrong, and it didn't change
      behaviour at all.
      
      We wait_event() must wait until the metadata is written OR the array is
      suspending.
      
      Fixes: cc27b0c7 ("md: fix deadlock between mddev_suspend() and md_write_start()")
      Cc: stable@vger.kernel.org
      Reported-by: NXiao Ni <xni@redhat.com>
      Signed-off-by: NNeilBrown <neilb@suse.com>
      Signed-off-by: NShaohua Li <shli@fb.com>
      d47c8ad2
  11. 28 9月, 2017 2 次提交
  12. 28 8月, 2017 1 次提交
  13. 26 8月, 2017 2 次提交
  14. 24 8月, 2017 1 次提交
    • C
      block: replace bi_bdev with a gendisk pointer and partitions index · 74d46992
      Christoph Hellwig 提交于
      This way we don't need a block_device structure to submit I/O.  The
      block_device has different life time rules from the gendisk and
      request_queue and is usually only available when the block device node
      is open.  Other callers need to explicitly create one (e.g. the lightnvm
      passthrough code, or the new nvme multipathing code).
      
      For the actual I/O path all that we need is the gendisk, which exists
      once per block device.  But given that the block layer also does
      partition remapping we additionally need a partition index, which is
      used for said remapping in generic_make_request.
      
      Note that all the block drivers generally want request_queue or
      sometimes the gendisk, so this removes a layer of indirection all
      over the stack.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      74d46992
  15. 12 8月, 2017 1 次提交
  16. 08 8月, 2017 2 次提交
    • N
      md: fix test in md_write_start() · 81fe48e9
      NeilBrown 提交于
      md_write_start() needs to clear the in_sync flag is it is set, or if
      there might be a race with set_in_sync() such that the later will
      set it very soon.  In the later case it is sufficient to take the
      spinlock to synchronize with set_in_sync(), and then set the flag
      if needed.
      
      The current test is incorrect.
      It should be:
        if "flag is set" or "race is possible"
      
      "flag is set" is trivially "mddev->in_sync".
      "race is possible" should be tested by "mddev->sync_checkers".
      
      If sync_checkers is 0, then there can be no race.  set_in_sync() will
      wait in percpu_ref_switch_to_atomic_sync() for an RCU grace period,
      and as md_write_start() holds the rcu_read_lock(), set_in_sync() will
      be sure ot see the update to writes_pending.
      
      If sync_checkers is > 0, there could be race.  If md_write_start()
      happened entirely between
      		if (!mddev->in_sync &&
      		    percpu_ref_is_zero(&mddev->writes_pending)) {
      and
      			mddev->in_sync = 1;
      in set_in_sync(), then it would not see that is_sync had been set,
      and set_in_sync() would not see that writes_pending had been
      incremented.
      
      This bug means that in_sync is sometimes not set when it should be.
      Consequently there is a small chance that the array will be marked as
      "clean" when in fact it is inconsistent.
      
      Fixes: 4ad23a97 ("MD: use per-cpu counter for writes_pending")
      cc: stable@vger.kernel.org (v4.12+)
      Signed-off-by: NNeilBrown <neilb@suse.com>
      Signed-off-by: NShaohua Li <shli@fb.com>
      81fe48e9
    • N
      md: always clear ->safemode when md_check_recovery gets the mddev lock. · 33182d15
      NeilBrown 提交于
      If ->safemode == 1, md_check_recovery() will try to get the mddev lock
      and perform various other checks.
      If mddev->in_sync is zero, it will call set_in_sync, and clear
      ->safemode.  However if mddev->in_sync is not zero, ->safemode will not
      be cleared.
      
      When md_check_recovery() drops the mddev lock, the thread is woken
      up again.  Normally it would just check if there was anything else to
      do, find nothing, and go to sleep.  However as ->safemode was not
      cleared, it will take the mddev lock again, then wake itself up
      when unlocking.
      
      This results in an infinite loop, repeatedly calling
      md_check_recovery(), which RCU or the soft-lockup detector
      will eventually complain about.
      
      Prior to commit 4ad23a97 ("MD: use per-cpu counter for
      writes_pending"), safemode would only be set to one when the
      writes_pending counter reached zero, and would be cleared again
      when writes_pending is incremented.  Since that patch, safemode
      is set more freely, but is not reliably cleared.
      
      So in md_check_recovery() clear ->safemode before checking ->in_sync.
      
      Fixes: 4ad23a97 ("MD: use per-cpu counter for writes_pending")
      Cc: stable@vger.kernel.org (4.12+)
      Reported-by: NDominik Brodowski <linux@dominikbrodowski.net>
      Reported-by: NDavid R <david@unsolicited.net>
      Signed-off-by: NNeilBrown <neilb@suse.com>
      Signed-off-by: NShaohua Li <shli@fb.com>
      33182d15
  17. 26 7月, 2017 1 次提交
  18. 04 7月, 2017 1 次提交
  19. 24 6月, 2017 1 次提交
  20. 22 6月, 2017 1 次提交
    • N
      md: use a separate bio_set for synchronous IO. · 5a85071c
      NeilBrown 提交于
      md devices allocate a bio_set and use it for two
      distinct purposes.
      mddev->bio_set is used to clone bios as part of sending
      upper level requests down to lower level devices,
      and it is also use for synchronous IO such as superblock
      and bitmap updates, and for correcting read errors.
      
      This multiple usage can lead to deadlocks.  It is likely
      that cloned bios might be queued for write and to be
      waiting for a metadata update before the write can be permitted.
      If the cloning exhausted mddev->bio_set, the metadata update
      may not be able to proceed.
      
      This scenario has been seen during heavy testing, with lots of IO and
      lots of memory pressure.
      
      Address this by adding a new bio_set specifically for synchronous IO.
      All synchronous IO goes directly to the underlying device and is not
      queued at the md level, so request using entries from the new
      mddev->sync_set will complete in a timely fashion.
      Requests that use mddev->bio_set will sometimes need to wait
      for synchronous IO, but will no longer risk deadlocking that iO.
      
      Also: small simplification in mddev_put(): there is no need to
      wait until the spinlock is released before calling bioset_free().
      Signed-off-by: NNeilBrown <neilb@suse.com>
      Signed-off-by: NShaohua Li <shli@fb.com>
      5a85071c
  21. 19 6月, 2017 3 次提交
  22. 17 6月, 2017 1 次提交
  23. 14 6月, 2017 1 次提交
    • N
      md: fix deadlock between mddev_suspend() and md_write_start() · cc27b0c7
      NeilBrown 提交于
      If mddev_suspend() races with md_write_start() we can deadlock
      with mddev_suspend() waiting for the request that is currently
      in md_write_start() to complete the ->make_request() call,
      and md_write_start() waiting for the metadata to be updated
      to mark the array as 'dirty'.
      As metadata updates done by md_check_recovery() only happen then
      the mddev_lock() can be claimed, and as mddev_suspend() is often
      called with the lock held, these threads wait indefinitely for each
      other.
      
      We fix this by having md_write_start() abort if mddev_suspend()
      is happening, and ->make_request() aborts if md_write_start()
      aborted.
      md_make_request() can detect this abort, decrease the ->active_io
      count, and wait for mddev_suspend().
      Reported-by: NNix <nix@esperi.org.uk>
      Fix: 68866e42(MD: no sync IO while suspended)
      Cc: stable@vger.kernel.org
      Signed-off-by: NNeilBrown <neilb@suse.com>
      Signed-off-by: NShaohua Li <shli@fb.com>
      cc27b0c7
  24. 09 6月, 2017 1 次提交
  25. 06 6月, 2017 1 次提交
  26. 05 6月, 2017 1 次提交
  27. 01 6月, 2017 1 次提交
    • J
      md: Make flush bios explicitely sync · 5a8948f8
      Jan Kara 提交于
      Commit b685d3d6 "block: treat REQ_FUA and REQ_PREFLUSH as
      synchronous" removed REQ_SYNC flag from WRITE_{FUA|PREFLUSH|...}
      definitions.  generic_make_request_checks() however strips REQ_FUA and
      REQ_PREFLUSH flags from a bio when the storage doesn't report volatile
      write cache and thus write effectively becomes asynchronous which can
      lead to performance regressions
      
      Fix the problem by making sure all bios which are synchronous are
      properly marked with REQ_SYNC.
      
      CC: linux-raid@vger.kernel.org
      CC: Shaohua Li <shli@kernel.org>
      Fixes: b685d3d6
      CC: stable@vger.kernel.org
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NShaohua Li <shli@fb.com>
      5a8948f8