1. 19 October 2018, 8 commits
  2. 15 October 2018, 1 commit
  3. 11 October 2018, 2 commits
  4. 04 October 2018, 1 commit
    • md: allow metadata updates while suspending an array - fix · 059421e0
      Committed by NeilBrown
      Commit 35bfc521 ("md: allow metadata update while suspending.")
      added support for allowing md_check_recovery() to still perform
      metadata updates while the array is entering the 'suspended' state.
      This is needed to allow the processes of entering the state to
      complete.
      
      Unfortunately, the patch doesn't really work.  The test for
      "mddev->suspended" at the start of md_check_recovery() means that the
      function doesn't try to do anything at all while entering suspend.
      
      This patch moves the code of updating the metadata while suspending to
      *before* the test on mddev->suspended.
      Reported-by: Jeff Mahoney <jeffm@suse.com>
      Fixes: 35bfc521 ("md: allow metadata update while suspending.")
      Signed-off-by: NeilBrown <neilb@suse.com>
      Signed-off-by: Shaohua Li <shli@fb.com>
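      A minimal userspace sketch of the reordering described above; the names
      check_recovery(), suspended and pending_metadata are stand-ins for the
      md driver's, not the kernel code itself:

          /* sketch: flush pending work before the early return on "suspended",
           * mirroring the fix; illustrative only, not md_check_recovery(). */
          #include <stdbool.h>
          #include <stdio.h>

          static bool suspended;
          static int pending_metadata;

          static void write_metadata(void)
          {
                  pending_metadata = 0;
          }

          static void check_recovery(void)
          {
                  /* Fix: do the metadata update *before* the suspended test,
                   * so it still runs while the array is entering suspend. */
                  if (pending_metadata)
                          write_metadata();

                  if (suspended)
                          return;  /* old code returned here first, skipping the flush */

                  /* ... normal recovery handling ... */
          }

          int main(void)
          {
                  pending_metadata = 1;
                  suspended = true;               /* array entering suspend */
                  check_recovery();
                  printf("pending metadata after check: %d\n", pending_metadata);
                  return 0;
          }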
  5. 02 October 2018, 1 commit
  6. 29 September 2018, 2 commits
    • md/raid10: Fix raid10 replace hang when new added disk faulty · ee37d731
      Committed by Alex Wu
      [Symptom]
      
      The resync thread hangs when a newly added disk becomes faulty during a replace operation.
      
      [Root Cause]
      
      In raid10_sync_request(), we expect to issue a bio with callback
      end_sync_read(), and a bio with callback end_sync_write().
      
      In the normal situation, resyncing sectors are added to
      mddev->recovery_active when raid10_sync_request() returns, and the
      resynced sectors are subtracted from mddev->recovery_active when
      end_sync_write() calls end_sync_request().

      If the newly added disk, which is replacing the old disk, is set
      faulty, there is a race condition:
          1. In the first rcu-protected section, the resync thread does not
             detect that mreplace is set faulty and passes the condition.
          2. In the second rcu-protected section, mreplace is set faulty.
          3. The resync thread then prepares the read object first, and only
             afterwards checks the write condition.
          4. It finds that mreplace is set faulty and so does not prepare
             the write object.
      As a result, resync sectors are added but never subtracted.
      
      [How to Reproduce]
      
      This issue can be easily reproduced by the following steps:
          mdadm -C /dev/md0 --assume-clean -l 10 -n 4 /dev/sd[abcd]
          mdadm /dev/md0 -a /dev/sde
          mdadm /dev/md0 --replace /dev/sdd
          sleep 1
          mdadm /dev/md0 -f /dev/sde
      
      [How to Fix]
      
      This issue can be fixed by using local variables to record the result
      of test conditions. Once the conditions are satisfied, we can make sure
      that we need to issue a bio for read and a bio for write.
      
      The earlier commit 24afd80d ("md/raid10: handle recovery of
      replacement devices.") also checked whether the bio is NULL, but left
      a comment saying that it is a pointless test, so we remove this dummy
      check.
      Reported-by: Alex Chen <alexchen@synology.com>
      Reviewed-by: Allen Peng <allenpeng@synology.com>
      Reviewed-by: BingJing Chang <bingjingc@synology.com>
      Signed-off-by: Alex Wu <alexwu@synology.com>
      Signed-off-by: Shaohua Li <shli@fb.com>
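      A minimal userspace sketch of the pattern the fix uses: evaluate the
      faulty checks once into local variables, so the read and write
      decisions cannot diverge. The names are illustrative, not the raid10
      code:

          #include <stdbool.h>
          #include <stdio.h>

          static volatile bool rdev_faulty;      /* disk being replaced */
          static volatile bool replace_faulty;   /* newly added disk */

          static void sync_request(void)
          {
                  /* Record the test results in locals; later decisions use
                   * these locals, not fresh reads that a concurrent
                   * "set faulty" could change in between. */
                  bool need_recover = !rdev_faulty;
                  bool need_replace = !replace_faulty;

                  if (!need_recover && !need_replace)
                          return;

                  if (need_recover)
                          printf("issue read bio (end_sync_read)\n");
                  /* even if replace_faulty flips to true right here ... */
                  if (need_replace)
                          printf("issue write bio (end_sync_write)\n");
                  /* ... the read and write stay paired, so recovery_active
                   * is always decremented by the write completion */
          }

          int main(void)
          {
                  sync_request();
                  return 0;
          }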
    • raid5: block failing device if raid will be failed · fb73b357
      Committed by Mariusz Tkaczyk
      Currently there is an inconsistency in how member drives are failed
      across RAID levels. For RAID456 it is possible to fail all of the
      devices, whereas for other RAID levels the kernel blocks removing a
      member drive if the operation would put the array into the FAIL state
      (EBUSY is returned). For example, removing the last drive from a
      RAID1 array is not possible.
      This kind of blocker was never implemented for raid456 and we can see
      no reason why.

      We have tested the following patch and did not observe any
      regression. Do you have any comments on, or reasons for, the current
      approach, or can we send the proper patch for this?
      Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@intel.com>
      Signed-off-by: Shaohua Li <shli@fb.com>
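      For illustration, the blocker described above (refusing a removal that
      would put the array into the FAIL state) boils down to a check of this
      shape; the structure and names below are stand-ins, not the md code:

          #include <errno.h>
          #include <stdio.h>

          struct array_state {
                  int working_disks;
                  int min_disks;   /* minimum needed to keep the array alive */
          };

          /* Refuse to fail a member if doing so would fail the whole array. */
          static int try_fail_member(struct array_state *a)
          {
                  if (a->working_disks - 1 < a->min_disks)
                          return -EBUSY;   /* the behaviour other levels already enforce */
                  a->working_disks--;
                  return 0;
          }

          int main(void)
          {
                  struct array_state raid1 = { .working_disks = 1, .min_disks = 1 };
                  printf("fail last RAID1 member: %d\n", try_fail_member(&raid1));
                  return 0;
          }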
  7. 27 September 2018, 5 commits
  8. 26 September 2018, 5 commits
  9. 25 September 2018, 3 commits
  10. 24 September 2018, 12 commits
    • RDMA/bnxt_re: Fix system crash during RDMA resource initialization · de5c95d0
      Committed by Selvin Xavier
      bnxt_re_ib_reg acquires and releases the rtnl lock whenever it accesses
      the L2 driver.
      
      The following sequence can trigger a crash:

      bnxt_re acquires the rtnl_lock ->
      	registers the RoCE driver callback with the L2 driver ->
      		releases the rtnl_lock
      bnxt_re acquires the rtnl_lock ->
      	requests MSIx vectors ->
      		releases the rtnl_lock

      The issue happens when bnxt_re proceeds with the remaining part of the
      initialization and the L2 driver invokes bnxt_ulp_irq_stop as part of
      bnxt_open_nic.

      The crash is in bnxt_qplib_nq_stop_irq, as the NQ structures are not
      initialized yet:
      
      <snip>
      [ 3551.726647] BUG: unable to handle kernel NULL pointer dereference at (null)
      [ 3551.726656] IP: [<ffffffffc0840ee9>] bnxt_qplib_nq_stop_irq+0x59/0xb0 [bnxt_re]
      [ 3551.726674] PGD 0
      [ 3551.726679] Oops: 0002 [#1] SMP
      ...
      [ 3551.726822] Hardware name: Dell Inc. PowerEdge R720/08RW36, BIOS 2.4.3 07/09/2014
      [ 3551.726826] task: ffff97e30eec5ee0 ti: ffff97e3173bc000 task.ti: ffff97e3173bc000
      [ 3551.726829] RIP: 0010:[<ffffffffc0840ee9>] [<ffffffffc0840ee9>]
      bnxt_qplib_nq_stop_irq+0x59/0xb0 [bnxt_re]
      ...
      [ 3551.726872] Call Trace:
      [ 3551.726886] [<ffffffffc082cb9e>] bnxt_re_stop_irq+0x4e/0x70 [bnxt_re]
      [ 3551.726899] [<ffffffffc07d6a53>] bnxt_ulp_irq_stop+0x43/0x70 [bnxt_en]
      [ 3551.726908] [<ffffffffc07c82f4>] bnxt_reserve_rings+0x174/0x1e0 [bnxt_en]
      [ 3551.726917] [<ffffffffc07cafd8>] __bnxt_open_nic+0x368/0x9a0 [bnxt_en]
      [ 3551.726925] [<ffffffffc07cb62b>] bnxt_open_nic+0x1b/0x50 [bnxt_en]
      [ 3551.726934] [<ffffffffc07cc62f>] bnxt_setup_mq_tc+0x11f/0x260 [bnxt_en]
      [ 3551.726943] [<ffffffffc07d5f58>] bnxt_dcbnl_ieee_setets+0xb8/0x1f0 [bnxt_en]
      [ 3551.726954] [<ffffffff890f983a>] dcbnl_ieee_set+0x9a/0x250
      [ 3551.726966] [<ffffffff88fd6d21>] ? __alloc_skb+0xa1/0x2d0
      [ 3551.726972] [<ffffffff890f72fa>] dcb_doit+0x13a/0x210
      [ 3551.726981] [<ffffffff89003ff7>] rtnetlink_rcv_msg+0xa7/0x260
      [ 3551.726989] [<ffffffff88ffdb00>] ? rtnl_unicast+0x20/0x30
      [ 3551.726996] [<ffffffff88bf9dc8>] ? __kmalloc_node_track_caller+0x58/0x290
      [ 3551.727002] [<ffffffff890f7326>] ? dcb_doit+0x166/0x210
      [ 3551.727007] [<ffffffff88fd6d0d>] ? __alloc_skb+0x8d/0x2d0
      [ 3551.727012] [<ffffffff89003f50>] ? rtnl_newlink+0x880/0x880
      ...
      [ 3551.727104] [<ffffffff8911f7d5>] system_call_fastpath+0x1c/0x21
      ...
      [ 3551.727164] RIP [<ffffffffc0840ee9>] bnxt_qplib_nq_stop_irq+0x59/0xb0 [bnxt_re]
      [ 3551.727175] RSP <ffff97e3173bf788>
      [ 3551.727177] CR2: 0000000000000000
      
      Avoid this inconsistent state and the resulting system crash by
      acquiring the rtnl lock for the entire duration of device
      initialization. Refactor the code to remove the rtnl lock from the
      individual functions and acquire and release it in the caller.
      
      Fixes: 1ac5a404 ("RDMA/bnxt_re: Add bnxt_re RoCE driver")
      Fixes: 6e04b103 ("RDMA/bnxt_re: Fix broken RoCE driver due to recent L2 driver changes")
      Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
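      A minimal userspace sketch of the refactor described above, with a
      pthread mutex standing in for the rtnl lock and simplified step names;
      illustrative only, not the bnxt_re code:

          #include <pthread.h>
          #include <stdio.h>

          static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

          /* The individual steps no longer take the lock themselves ... */
          static void register_with_l2(void) { printf("register ulp callbacks\n"); }
          static void request_msix(void)     { printf("request MSIx vectors\n"); }
          static void init_resources(void)   { printf("init NQ resources\n"); }

          /* ... the caller holds it across the whole initialization, so the
           * L2 driver cannot call back into a half-initialized device. */
          static void device_init(void)
          {
                  pthread_mutex_lock(&lock);
                  register_with_l2();
                  request_msix();
                  init_resources();
                  pthread_mutex_unlock(&lock);
          }

          int main(void)
          {
                  device_init();   /* build with: cc -pthread sketch.c */
                  return 0;
          }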
    • net: aquantia: memory corruption on jumbo frames · d26ed6b0
      Committed by Friedemann Gerold
      This patch fixes corruption of the skb_shared_info area upon
      reception of 4K jumbo packets.

      Originally, build_skb was used to reuse the page for the skb and
      eliminate the need for extra fragments. But that logic does not take
      into account that skb_shared_info must be reserved at the end of the
      skb data area.

      If the packet data consumes the whole page (4K), the skb_shinfo
      location overflows the page. As a consequence, __build_skb zeroes the
      shinfo data beyond the allocated page, corrupting the next page.

      The issue is rarely seen in real life because jumbo frames are
      normally larger than 4K, which triggers another code path. But it is
      100% reproducible with a simple scapy packet, such as:
      
          sendp(IP(dst="192.168.100.3") / TCP(dport=443) \
                / Raw(RandString(size=(4096-40))), iface="enp1s0")
      
      Fixes: 018423e9 ("net: ethernet: aquantia: Add ring support code")
      Reported-by: Friedemann Gerold <f.gerold@b-c-s.de>
      Reported-by: Michael Rauch <michael@rauch.be>
      Signed-off-by: Friedemann Gerold <f.gerold@b-c-s.de>
      Tested-by: Nikita Danilov <nikita.danilov@aquantia.com>
      Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
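      A minimal userspace sketch of the size accounting the fix requires
      (reserving aligned room for the shared-info structure at the end of
      the page); the stub type and alignment below are stand-ins, not the
      kernel's skb_shared_info or SKB_DATA_ALIGN:

          #include <stdio.h>

          #define PAGE_SIZE      4096u
          #define ALIGN_UP(x, a) (((x) + (a) - 1) & ~((a) - 1))

          /* stand-in for skb_shared_info, which lives at the end of the buffer */
          struct shared_info_stub { char pad[320]; };

          int main(void)
          {
                  /* Wrong: letting the payload use the whole page pushes the
                   * tail structure onto the next page and corrupts it. */
                  unsigned int bad_max_payload = PAGE_SIZE;

                  /* Right: reserve aligned room for the tail structure. */
                  unsigned int good_max_payload =
                          PAGE_SIZE - ALIGN_UP(sizeof(struct shared_info_stub), 64u);

                  printf("bad max payload:  %u\n", bad_max_payload);
                  printf("good max payload: %u\n", good_max_payload);
                  return 0;
          }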
    • tun: remove ndo_poll_controller · 765cdc20
      Committed by Eric Dumazet
      As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous
      on loaded hosts, since the cpu calling ndo_poll_controller() might
      steal all NAPI contexts (for all RX/TX queues of the NIC). This
      capture can last for an unlimited amount of time, since one cpu is
      generally not able to drain all the queues under load.

      tun uses NAPI for TX completions, so we had better let the core
      networking stack call napi->poll() to avoid the capture.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
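      The change itself is the same in each of the following driver commits:
      the driver's poll_controller callback and its .ndo_poll_controller
      entry are deleted. A small userspace mock of that shape (not the tun
      code; the struct and names are stand-ins):

          #include <stdio.h>

          struct net_device_ops_mock {
                  void (*ndo_start_xmit)(void);
                  void (*ndo_poll_controller)(void);   /* the hook being removed */
          };

          static void xmit(void) { printf("xmit\n"); }

          /* With the hook left out (NULL), the caller falls back to the
           * budgeted per-queue poll path instead of letting one cpu try to
           * drain every queue through poll_controller. */
          static const struct net_device_ops_mock dev_ops = {
                  .ndo_start_xmit = xmit,
          };

          int main(void)
          {
                  if (dev_ops.ndo_poll_controller)
                          dev_ops.ndo_poll_controller();
                  else
                          printf("no poll_controller: core uses napi->poll()\n");
                  return 0;
          }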
    • nfp: remove ndo_poll_controller · 0825ce70
      Committed by Eric Dumazet
      As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous
      on loaded hosts, since the cpu calling ndo_poll_controller() might
      steal all NAPI contexts (for all RX/TX queues of the NIC). This
      capture can last for an unlimited amount of time, since one cpu is
      generally not able to drain all the queues under load.

      nfp uses NAPI for TX completions, so we had better let the core
      networking stack call napi->poll() to avoid the capture.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Jakub Kicinski <jakub.kicinski@netronome.com>
      Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Tested-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bnxt: remove ndo_poll_controller · 58e0e22b
      Committed by Eric Dumazet
      As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous
      on loaded hosts, since the cpu calling ndo_poll_controller() might
      steal all NAPI contexts (for all RX/TX queues of the NIC). This
      capture can last for an unlimited amount of time, since one cpu is
      generally not able to drain all the queues under load.

      bnxt uses NAPI for TX completions, so we had better let the core
      networking stack call napi->poll() to avoid the capture.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bnx2x: remove ndo_poll_controller · d8ea6a91
      Committed by Eric Dumazet
      As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous
      on loaded hosts, since the cpu calling ndo_poll_controller() might
      steal all NAPI contexts (for all RX/TX queues of the NIC). This
      capture can last for an unlimited amount of time, since one cpu is
      generally not able to drain all the queues under load.

      bnx2x uses NAPI for TX completions, so we had better let the core
      networking stack call napi->poll() to avoid the capture.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Ariel Elior <ariel.elior@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlx5: remove ndo_poll_controller · 9c29bcd1
      Committed by Eric Dumazet
      As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous
      on loaded hosts, since the cpu calling ndo_poll_controller() might
      steal all NAPI contexts (for all RX/TX queues of the NIC). This
      capture can last for an unlimited amount of time, since one cpu is
      generally not able to drain all the queues under load.

      mlx5 uses NAPI for TX completions, so we had better let the core
      networking stack call napi->poll() to avoid the capture.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlx4: remove ndo_poll_controller · a24b66c2
      Committed by Eric Dumazet
      As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous
      on loaded hosts, since the cpu calling ndo_poll_controller() might
      steal all NAPI contexts (for all RX/TX queues of the NIC). This
      capture can last for an unlimited amount of time, since one cpu is
      generally not able to drain all the queues under load.

      mlx4 uses NAPI for TX completions, so we had better let the core
      networking stack call napi->poll() to avoid the capture.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • i40evf: remove ndo_poll_controller · 1aa28fb9
      Committed by Eric Dumazet
      As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous
      on loaded hosts, since the cpu calling ndo_poll_controller() might
      steal all NAPI contexts (for all RX/TX queues of the NIC). This
      capture can last for an unlimited amount of time, since one cpu is
      generally not able to drain all the queues under load.

      i40evf uses NAPI for TX completions, so we had better let the core
      networking stack call napi->poll() to avoid the capture.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ice: remove ndo_poll_controller · 158a08a6
      Committed by Eric Dumazet
      As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous
      on loaded hosts, since the cpu calling ndo_poll_controller() might
      steal all NAPI contexts (for all RX/TX queues of the NIC). This
      capture can last for an unlimited amount of time, since one cpu is
      generally not able to drain all the queues under load.

      ice uses NAPI for TX completions, so we had better let the core
      networking stack call napi->poll() to avoid the capture.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • igb: remove ndo_poll_controller · 0542997e
      Committed by Eric Dumazet
      As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous
      on loaded hosts, since the cpu calling ndo_poll_controller() might
      steal all NAPI contexts (for all RX/TX queues of the NIC). This
      capture can last for an unlimited amount of time, since one cpu is
      generally not able to drain all the queues under load.

      igb uses NAPI for TX completions, so we had better let the core
      networking stack call napi->poll() to avoid the capture.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ixgb: remove ndo_poll_controller · 2753166e
      Committed by Eric Dumazet
      As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous
      on loaded hosts, since the cpu calling ndo_poll_controller() might
      steal all NAPI contexts (for all RX/TX queues of the NIC). This
      capture can last for an unlimited amount of time, since one cpu is
      generally not able to drain all the queues under load.

      ixgb uses NAPI for TX completions, so we had better let the core
      networking stack call napi->poll() to avoid the capture.

      This also removes a problematic use of disable_irq() in a context
      where it is forbidden, as explained in commit af3e0fcf ("8139too:
      Use disable_irq_nosync() in rtl8139_poll_controller()").
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>