1. 05 Aug 2020, 5 commits
  2. 20 Jul 2020, 6 commits
    • dm crypt: Enable zoned block device support · 8e225f04
      Committed by Damien Le Moal
      Enable support for zoned block devices. This is done by:
      1) implementing the target report_zones method.
      2) adding the DM_TARGET_ZONED_HM flag to the target features.
      3) setting DM_CRYPT_NO_WRITE_WORKQUEUE flag to avoid IO
         processing via workqueue.
      4) introducing inline write encryption completion to preserve write
         ordering.
      
      The last point is implemented by introducing the internal flag
      DM_CRYPT_WRITE_INLINE. When set, kcryptd_crypt_write_convert() always
      waits inline for the completion of a write request encryption if the
      request is not already completed once crypt_convert() returns.
      Completion of write request encryption is signaled using the
      restart completion by kcryptd_async_done(). This mechanism allows
      using ciphers that have an asynchronous implementation, isolating
      dm-crypt from any potential request completion reordering for these
      ciphers.
      Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • dm crypt: add flags to optionally bypass kcryptd workqueues · 39d42fa9
      Committed by Ignat Korchagin
      This is a follow up to [1] that detailed latency problems associated
      with dm-crypt's use of workqueues when processing IO.
      
      The current dm-crypt implementation creates a significant IO performance
      overhead (at least on small IO block sizes) for both latency and
      throughput. We suspect offloading IO request processing into
      workqueues and async threads is more harmful these days with
      modern fast storage. I also did some digging into the dm-crypt git
      history and much of this async processing is not needed anymore,
      because the reasons it was added are mostly gone from the kernel. More
      details can be found in [2] (see "Git archeology" section).
      
      This change adds DM_CRYPT_NO_READ_WORKQUEUE and
      DM_CRYPT_NO_WRITE_WORKQUEUE flags for read and write BIOs, which
      direct dm-crypt to not offload crypto operations into kcryptd
      workqueues.  In addition, writes are not buffered to be sorted in the
      dm-crypt red-black tree, but dispatched immediately. For cases where
      crypto operations cannot happen (hard interrupt context, for example
      the read path of some NVMe drivers), we offload the work to a tasklet
      rather than a workqueue.
      
      These flags only ensure no async BIO processing in the dm-crypt
      module. It is worth noting that some Crypto API implementations may
      offload encryption into their own workqueues, which are independent of
      dm-crypt and its configuration. However, upon enabling these new
      flags, dm-crypt will instruct Crypto API not to backlog crypto
      requests.
      
      To give an idea of the performance gains for certain workloads,
      consider the script, and results when tested against various
      devices, detailed here:
      https://www.redhat.com/archives/dm-devel/2020-July/msg00138.html
      
      [1]: https://www.spinics.net/lists/dm-crypt/msg07516.html
      [2]: https://blog.cloudflare.com/speeding-up-linux-disk-encryption/
      Signed-off-by: Ignat Korchagin <ignat@cloudflare.com>
      Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
      Reviewed-by: Bob Liu <bob.liu@oracle.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
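
The dispatch rule described in the commit message above (bypass the workqueue when the matching flag is set, and fall back to a tasklet in hard interrupt context) can be sketched as a small userspace model. This is only an illustration: the flag values, the `crypt_path` enum, and `choose_crypt_path()` are hypothetical names and are not taken from the dm-crypt source.

```c
#include <stdbool.h>

/* Hypothetical flag bits; the kernel's real definitions may differ. */
enum {
    NO_READ_WORKQUEUE  = 1u << 0,
    NO_WRITE_WORKQUEUE = 1u << 1,
};

/* Where the crypto work for one BIO would run in this model. */
enum crypt_path { PATH_WORKQUEUE, PATH_INLINE, PATH_TASKLET };

/*
 * Decide where the crypto work for one BIO runs.
 * flags:       bitmask of the NO_*_WORKQUEUE bits above
 * is_write:    true for a write BIO, false for a read BIO
 * in_hard_irq: true when the caller is in hard interrupt context
 */
static enum crypt_path choose_crypt_path(unsigned int flags, bool is_write,
                                         bool in_hard_irq)
{
    unsigned int bypass = is_write ? NO_WRITE_WORKQUEUE : NO_READ_WORKQUEUE;

    if (!(flags & bypass))
        return PATH_WORKQUEUE;  /* flag not set: offload to a workqueue */
    if (in_hard_irq)
        return PATH_TASKLET;    /* crypto cannot run here: defer to a tasklet */
    return PATH_INLINE;         /* process in the submitting context */
}
```

In this model, each direction is controlled independently, so setting only the read flag leaves writes on the workqueue path.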
    • dm bufio: do buffer cleanup from a workqueue · 70704c33
      Committed by Mikulas Patocka
      Until now, DM bufio's waiting for IO from reclaim context in its
      shrinker has caused kswapd to block; which results in systemic IO
      stalls and even deadlock, e.g.:
      https://www.redhat.com/archives/dm-devel/2020-March/msg00025.html
      
      Here is Dave Chinner's problem description that motivated this fix,
      from: https://lore.kernel.org/linux-fsdevel/20190809215733.GZ7777@dread.disaster.area/
      
      "Waiting for IO in kswapd reclaim context is considered harmful -
      kswapd context shrinker reclaim should be as non-blocking as possible,
      and any back-off to wait for IO to complete should be done by the high
      level reclaim core once it's completed an entire reclaim scan cycle of
      everything....
      
      What follows from that, and is pertinent in this situation, is that if
      you don't block kswapd, then other reclaim contexts are not going to
      get stuck waiting for it regardless of the reclaim context they use."
      
      Continued elsewhere:
      
      "The only way to fix this problem once and for all is to stop using
      the shrinker as a mechanism to issue and wait on IO. If you need
      background writeback of dirty buffers, do it from a WQ_MEM_RECLAIM
      workqueue that isn't directly in the memory reclaim path and so can
      issue writeback and block safely from a GFP_KERNEL context. Kick the
      workqueue from the shrinker context, but get rid of the IO submission
      and waiting from the shrinker and all the GFP_NOFS memory reclaim
      recursion problems go away."
      
      As such, this commit moves buffer cleanup to a workqueue.
      Suggested-by: Dave Chinner <dchinner@redhat.com>
      Reported-by: Tahsin Erdogan <tahsin@google.com>
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Tested-by: Gabriel Krisman Bertazi <krisman@collabora.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
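
The pattern quoted above (the shrinker only kicks a worker, and the worker does the actual cleanup outside the reclaim path) can be sketched as a userspace model. The `cache_model` struct and both functions are hypothetical and only mirror the idea; they are not the dm-bufio code, and the "kick" is modelled as a plain boolean where the kernel would queue work on a workqueue.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a buffer cache's bookkeeping. */
struct cache_model {
    unsigned long buffers;      /* buffers currently cached */
    unsigned long need_shrink;  /* buffers that reclaim has asked to drop */
    bool work_pending;          /* set when the cleanup worker was kicked */
};

/*
 * Shrinker model: record the request and kick the worker.
 * It never submits or waits for IO, so reclaim is not blocked here.
 */
static void shrink_scan(struct cache_model *c, unsigned long nr_to_scan)
{
    c->need_shrink += nr_to_scan;
    c->work_pending = true;
}

/*
 * Worker model: runs outside the reclaim path, so it could block on IO.
 * Drops up to need_shrink buffers and returns how many were dropped.
 */
static unsigned long cleanup_work(struct cache_model *c)
{
    unsigned long n = c->need_shrink < c->buffers ? c->need_shrink
                                                  : c->buffers;

    c->buffers -= n;
    c->need_shrink = 0;
    c->work_pending = false;
    return n;
}
```

Several shrinker calls can accumulate before the worker runs; the worker then handles the combined request in one pass.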
    • dm rq: don't call blk_mq_queue_stopped() in dm_stop_queue() · e766668c
      Committed by Ming Lei
      dm_stop_queue() only uses blk_mq_quiesce_queue() so it doesn't
      formally stop the blk-mq queue; therefore there is no point making the
      blk_mq_queue_stopped() check -- it will never be stopped.
      
      In addition, even though dm_stop_queue() actually tries to quiesce hw
      queues via blk_mq_quiesce_queue(), checking with blk_queue_quiesced()
      to avoid unnecessary queue quiesce isn't reliable because: the
      QUEUE_FLAG_QUIESCED flag is set before synchronize_rcu() and
      dm_stop_queue() may be called when synchronize_rcu() from another
      blk_mq_quiesce_queue() is in-progress.
      
      Fixes: 7b17c2f7 ("dm: Fix a race condition related to stopping and starting queues")
      Signed-off-by: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • dm dust: add interface to list all badblocks · 0c248ea2
      Committed by yangerkun
      This interface may help anyone who wants to know all badblocks without
      querying for each block.
      
      [Bryan: DMEMIT message if no blocks are in the bad block list.]
      Signed-off-by: yangerkun <yangerkun@huawei.com>
      Signed-off-by: Bryan Gurney <bgurney@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • dm dust: report some message results directly back to user · 4f7f590b
      Committed by yangerkun
      Some messages (queryblock, countbadblocks, removebadblock) are best
      reported directly to the user. Do so with DMEMIT.
      
      [Bryan: maintain __func__ output in DMEMIT messages]
      Signed-off-by: yangerkun <yangerkun@huawei.com>
      Signed-off-by: Bryan Gurney <bgurney@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
  3. 13 Jul 2020, 12 commits
  4. 12 Jul 2020, 6 commits
  5. 11 Jul 2020, 11 commits
    • Merge tag 'libnvdimm-fix-v5.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm · 1df0d896
      Committed by Linus Torvalds
      Pull libnvdimm fix from Dan Williams:
       "A one-line Fix for key ring search permissions to address a regression
        from -rc1"
      
      * tag 'libnvdimm-fix-v5.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
        libnvdimm/security: Fix key lookup permissions
    • Merge tag '5.8-rc4-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6 · 5ab39e08
      Committed by Linus Torvalds
      Pull cifs fixes from Steve French:
       "Four cifs/smb3 fixes: the three for stable fix problems found recently
        with change notification including a reference count leak"
      
      * tag '5.8-rc4-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
        cifs: update internal module version number
        cifs: fix reference leak for tlink
        smb3: fix unneeded error message on change notify
        cifs: remove the retry in cifs_poxis_lock_set
        smb3: fix access denied on change notify request to some servers
    • Merge tag 'inclusive-terminology' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/linux · 49decddd
      Committed by Linus Torvalds
      Pull coding style terminology documentation from Dan Williams:
       "The discussion has tapered off as well as the incoming ack, review,
        and sign-off tags. I did not see a reason to wait for the next merge
        window"
      
      * tag 'inclusive-terminology' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/linux:
        CodingStyle: Inclusive Terminology
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 5a764898
      Committed by Linus Torvalds
      Pull networking fixes from David Miller:
      
       1) Restore previous behavior of CAP_SYS_ADMIN wrt loading networking
          BPF programs, from Maciej Żenczykowski.
      
       2) Fix dropped broadcasts in mac80211 code, from Seevalamuthu
          Mariappan.
      
       3) Slay memory leak in nl80211 bss color attribute parsing code, from
          Luca Coelho.
      
       4) Get route from skb properly in ip_route_use_hint(), from Miaohe Lin.
      
       5) Don't allow anything other than ARPHRD_ETHER in llc code, from Eric
          Dumazet.
      
       6) xsk code dips too deeply into DMA mapping implementation internals.
          Add dma_need_sync and use it. From Christoph Hellwig
      
       7) Enforce power-of-2 for BPF ringbuf sizes. From Andrii Nakryiko.
      
       8) Check for disallowed attributes when loading flow dissector BPF
          programs. From Lorenz Bauer.
      
       9) Correct packet injection to L3 tunnel devices via AF_PACKET, from
          Jason A. Donenfeld.
      
      10) Don't advertise checksum offload on ipa devices that don't support
          it. From Alex Elder.
      
      11) Resolve several issues in TCP MD5 signature support. Missing memory
          barriers, bogus options emitted when using syncookies, and failure
          to allow md5 key changes in established states. All from Eric
          Dumazet.
      
      12) Fix interface leak in hsr code, from Taehee Yoo.
      
      13) VF reset fixes in hns3 driver, from Huazhong Tan.
      
      14) Make loopback work again with ipv6 anycast, from David Ahern.
      
      15) Fix TX starvation under high load in fec driver, from Tobias
          Waldekranz.
      
      16) MLD2 payload lengths not checked properly in bridge multicast code,
          from Linus Lüssing.
      
      17) Packet scheduler code that wants to find the inner protocol
          currently only works for one level of VLAN encapsulation. Allow
          Q-in-Q situations to work properly here, from Toke
          Høiland-Jørgensen.
      
      18) Fix route leak in l2tp, from Xin Long.
      
      19) Resolve conflict between the sk->sk_user_data usage of bpf reuseport
          support and various protocols. From Martin KaFai Lau.
      
      20) Fix socket cgroup v2 reference counting in some situations, from
          Cong Wang.
      
      21) Cure memory leak in mlx5 connection tracking offload support, from
          Eli Britstein.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (146 commits)
        mlxsw: pci: Fix use-after-free in case of failed devlink reload
        mlxsw: spectrum_router: Remove inappropriate usage of WARN_ON()
        net: macb: fix call to pm_runtime in the suspend/resume functions
        net: macb: fix macb_suspend() by removing call to netif_carrier_off()
        net: macb: fix macb_get/set_wol() when moving to phylink
        net: macb: mark device wake capable when "magic-packet" property present
        net: macb: fix wakeup test in runtime suspend/resume routines
        bnxt_en: fix NULL dereference in case SR-IOV configuration fails
        libbpf: Fix libbpf hashmap on (I)LP32 architectures
        net/mlx5e: CT: Fix memory leak in cleanup
        net/mlx5e: Fix port buffers cell size value
        net/mlx5e: Fix 50G per lane indication
        net/mlx5e: Fix CPU mapping after function reload to avoid aRFS RX crash
        net/mlx5e: Fix VXLAN configuration restore after function reload
        net/mlx5e: Fix usage of rcu-protected pointer
        net/mxl5e: Verify that rpriv is not NULL
        net/mlx5: E-Switch, Fix vlan or qos setting in legacy mode
        net/mlx5: Fix eeprom support for SFP module
        cgroup: Fix sock_cgroup_data on big-endian.
        selftests: bpf: Fix detach from sockmap tests
        ...
    • mips: Remove compiler check in unroll macro · 9321f1aa
      Committed by Nathan Chancellor
      CONFIG_CC_IS_GCC is undefined when Clang is used, which breaks the build
      (see our Travis link below).
      
      Clang 8 was chosen as a minimum version for this check because there
      were some improvements around __builtin_constant_p in that release. In
      reality, MIPS was not even buildable until clang 9 so that check was not
      technically necessary. Just remove all compiler checks and just assume
      that we have a working compiler.
      
      Fixes: d4e60453 ("Restore gcc check in mips asm/unroll.h")
      Link: https://travis-ci.com/github/ClangBuiltLinux/continuous-integration/jobs/359642821
      Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge branch 'mlxsw-Various-fixes' · 1195c7ce
      Committed by David S. Miller
      Ido Schimmel says:
      
      ====================
      mlxsw: Various fixes
      
      Fix two issues found by syzkaller.
      
      Patch #1 removes inappropriate usage of WARN_ON() following memory
      allocation failure. Constantly triggered when syzkaller injects faults.
      
      Patch #2 fixes a use-after-free that can be triggered by 'devlink dev
      info' following a failed devlink reload.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: pci: Fix use-after-free in case of failed devlink reload · c4317b11
      Committed by Ido Schimmel
      In case devlink reload failed, it is possible to trigger a
      use-after-free when querying the kernel for device info via 'devlink dev
      info' [1].
      
      This happens because as part of the reload error path the PCI command
      interface is de-initialized and its mailboxes are freed. When the
      devlink '->info_get()' callback is invoked the device is queried via the
      command interface and the freed mailboxes are accessed.
      
      Fix this by initializing the command interface once during probe and not
      during every reload.
      
      This is consistent with the other bus used by mlxsw (i.e., 'mlxsw_i2c')
      and also allows user space to query the running firmware version (for
      example) from the device after a failed reload.
      
      [1]
      BUG: KASAN: use-after-free in memcpy include/linux/string.h:406 [inline]
      BUG: KASAN: use-after-free in mlxsw_pci_cmd_exec+0x177/0xa60 drivers/net/ethernet/mellanox/mlxsw/pci.c:1675
      Write of size 4096 at addr ffff88810ae32000 by task syz-executor.1/2355
      
      CPU: 1 PID: 2355 Comm: syz-executor.1 Not tainted 5.8.0-rc2+ #29
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0xf6/0x16e lib/dump_stack.c:118
       print_address_description.constprop.0+0x1c/0x250 mm/kasan/report.c:383
       __kasan_report mm/kasan/report.c:513 [inline]
       kasan_report.cold+0x1f/0x37 mm/kasan/report.c:530
       check_memory_region_inline mm/kasan/generic.c:186 [inline]
       check_memory_region+0x14e/0x1b0 mm/kasan/generic.c:192
       memcpy+0x39/0x60 mm/kasan/common.c:106
       memcpy include/linux/string.h:406 [inline]
       mlxsw_pci_cmd_exec+0x177/0xa60 drivers/net/ethernet/mellanox/mlxsw/pci.c:1675
       mlxsw_cmd_exec+0x249/0x550 drivers/net/ethernet/mellanox/mlxsw/core.c:2335
       mlxsw_cmd_access_reg drivers/net/ethernet/mellanox/mlxsw/cmd.h:859 [inline]
       mlxsw_core_reg_access_cmd drivers/net/ethernet/mellanox/mlxsw/core.c:1938 [inline]
       mlxsw_core_reg_access+0x2f6/0x540 drivers/net/ethernet/mellanox/mlxsw/core.c:1985
       mlxsw_reg_query drivers/net/ethernet/mellanox/mlxsw/core.c:2000 [inline]
       mlxsw_devlink_info_get+0x17f/0x6e0 drivers/net/ethernet/mellanox/mlxsw/core.c:1090
       devlink_nl_info_fill.constprop.0+0x13c/0x2d0 net/core/devlink.c:4588
       devlink_nl_cmd_info_get_dumpit+0x246/0x460 net/core/devlink.c:4648
       genl_lock_dumpit+0x85/0xc0 net/netlink/genetlink.c:575
       netlink_dump+0x515/0xe50 net/netlink/af_netlink.c:2245
       __netlink_dump_start+0x53d/0x830 net/netlink/af_netlink.c:2353
       genl_family_rcv_msg_dumpit.isra.0+0x296/0x300 net/netlink/genetlink.c:638
       genl_family_rcv_msg net/netlink/genetlink.c:733 [inline]
       genl_rcv_msg+0x78d/0x9d0 net/netlink/genetlink.c:753
       netlink_rcv_skb+0x152/0x440 net/netlink/af_netlink.c:2469
       genl_rcv+0x24/0x40 net/netlink/genetlink.c:764
       netlink_unicast_kernel net/netlink/af_netlink.c:1303 [inline]
       netlink_unicast+0x53a/0x750 net/netlink/af_netlink.c:1329
       netlink_sendmsg+0x850/0xd90 net/netlink/af_netlink.c:1918
       sock_sendmsg_nosec net/socket.c:652 [inline]
       sock_sendmsg+0x150/0x190 net/socket.c:672
       ____sys_sendmsg+0x6d8/0x840 net/socket.c:2363
       ___sys_sendmsg+0xff/0x170 net/socket.c:2417
       __sys_sendmsg+0xe5/0x1b0 net/socket.c:2450
       do_syscall_64+0x56/0xa0 arch/x86/entry/common.c:359
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Fixes: a9c8336f ("mlxsw: core: Add support for devlink info command")
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum_router: Remove inappropriate usage of WARN_ON() · d9d54202
      Committed by Ido Schimmel
      We should not trigger a warning when a memory allocation fails. Remove
      the WARN_ON().
      
      The warning is constantly triggered by syzkaller when it is injecting
      faults:
      
      [ 2230.758664] FAULT_INJECTION: forcing a failure.
      [ 2230.758664] name failslab, interval 1, probability 0, space 0, times 0
      [ 2230.762329] CPU: 3 PID: 1407 Comm: syz-executor.0 Not tainted 5.8.0-rc2+ #28
      ...
      [ 2230.898175] WARNING: CPU: 3 PID: 1407 at drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:6265 mlxsw_sp_router_fib_event+0xfad/0x13e0
      [ 2230.898179] Kernel panic - not syncing: panic_on_warn set ...
      [ 2230.898183] CPU: 3 PID: 1407 Comm: syz-executor.0 Not tainted 5.8.0-rc2+ #28
      [ 2230.898190] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
      
      Fixes: 3057224e ("mlxsw: spectrum_router: Implement FIB offload in deferred work")
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'macb-WOL-fixes' · f9f41e3d
      Committed by David S. Miller
      Nicolas Ferre says:
      
      ====================
      net: macb: Wake-on-Lan magic packet fixes and GEM handling
      
      Here is a split series to fix WoL magic-packet on the current macb driver. Only
      fixes in this one based on current net/master.
      
      Changes in v5:
      - Addressed the error code returned by phylink_ethtool_set_wol() as suggested
        by Russell.
        If PHY handles WoL, MAC doesn't stay in the way.
      - Removed Florian's tag on 3/5 because of the above changes.
      - Correct the "Fixes" tag on 1/5.
      
      Changes in v4:
      - Pure bug fix series for 'net'. GEM addition and MACB update removed: will be
        sent later.
      
      Changes in v3:
      - Revert some of the v2 changes done in macb_resume(). Now the resume function
        supports in-depth re-configuration of the controller in order to deal with
        deeper sleep states. Basically as it was before changes introduced by this
        series
      - Tested for non-regression with our deeper Power Management mode which cuts
        power to the controller completely
      
      Changes in v2:
      - Add patch 4/7 ("net: macb: fix macb_suspend() by removing call to netif_carrier_off()")
        needed for keeping phy state consistent
      - Add patch 5/7 ("net: macb: fix call to pm_runtime in the suspend/resume functions") that prevents
        putting the macb in runtime pm suspend mode when WoL is used
      - Collect review tags on 3 first patches from Florian: Thanks!
      - Review of macb_resume() function
      - Addition of pm_wakeup_event() in both MACB and GEM WoL IRQ handlers
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: macb: fix call to pm_runtime in the suspend/resume functions · 6c8f85ca
      Committed by Nicolas Ferre
      The calls to pm_runtime_force_suspend/resume() functions are only
      relevant if the device is not configured to act as a WoL wakeup source.
      Add the device_may_wakeup() test before calling them.
      
      Fixes: 3e2a5e15 ("net: macb: add wake-on-lan support via magic packet")
      Cc: Claudiu Beznea <claudiu.beznea@microchip.com>
      Cc: Harini Katakam <harini.katakam@xilinx.com>
      Cc: Sergio Prado <sergio.prado@e-labworks.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
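
The check described in this commit message (only force a runtime suspend or resume when the device is not a wakeup source) can be sketched as a userspace model. The `dev_model` struct and the two functions are hypothetical stand-ins; in the driver, the corresponding calls named in the message are device_may_wakeup() and pm_runtime_force_suspend/resume().

```c
#include <stdbool.h>

/* Hypothetical stand-in for the device state relevant to this check. */
struct dev_model {
    bool may_wakeup;         /* device is configured as a WoL wakeup source */
    bool runtime_suspended;  /* set when a forced runtime suspend happened */
};

/* System suspend path: force runtime suspend only for non-wakeup devices. */
static void model_suspend(struct dev_model *d)
{
    if (!d->may_wakeup)
        d->runtime_suspended = true;   /* models pm_runtime_force_suspend() */
}

/* System resume path: mirror the suspend-side test. */
static void model_resume(struct dev_model *d)
{
    if (!d->may_wakeup)
        d->runtime_suspended = false;  /* models pm_runtime_force_resume() */
}
```

A device acting as a WoL wakeup source stays out of runtime suspend in this model, which is the behaviour the message says the added test is meant to ensure.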
    • net: macb: fix macb_suspend() by removing call to netif_carrier_off() · 64febc5e
      Committed by Nicolas Ferre
      As we now use the phylink call to phylink_stop() in the non-WoL path,
      there is no need for this call to netif_carrier_off() anymore. It can
      disturb the underlying phylink FSM.
      
      Fixes: 7897b071 ("net: macb: convert to phylink")
      Cc: Claudiu Beznea <claudiu.beznea@microchip.com>
      Cc: Harini Katakam <harini.katakam@xilinx.com>
      Cc: Antoine Tenart <antoine.tenart@bootlin.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>