1. 05 Aug, 2020 5 commits
  2. 20 Jul, 2020 6 commits
    • dm crypt: Enable zoned block device support · 8e225f04
      Committed by Damien Le Moal
      Enable support for zoned block devices. This is done by:
      1) implementing the target report_zones method.
      2) adding the DM_TARGET_ZONED_HM flag to the target features.
      3) setting DM_CRYPT_NO_WRITE_WORKQUEUE flag to avoid IO
         processing via workqueue.
      4) Introducing inline write encryption completion to preserve write
         ordering.
      
      The last point is implemented by introducing the internal flag
      DM_CRYPT_WRITE_INLINE. When set, kcryptd_crypt_write_convert() always
      waits inline for the completion of a write request encryption if the
      request is not already completed once crypt_convert() returns.
      Completion of write request encryption is signaled using the
      restart completion by kcryptd_async_done(). This mechanism allows
      using ciphers that have an asynchronous implementation, isolating
      dm-crypt from any potential request completion reordering for these
      ciphers.
      Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • dm crypt: add flags to optionally bypass kcryptd workqueues · 39d42fa9
      Committed by Ignat Korchagin
      This is a follow up to [1] that detailed latency problems associated
      with dm-crypt's use of workqueues when processing IO.
      
      Current dm-crypt implementation creates a significant IO performance
      overhead (at least on small IO block sizes) for both latency and
      throughput. We suspect offloading IO request processing into
      workqueues and async threads is more harmful these days with the
      modern fast storage. I also did some digging into the dm-crypt git
      history and much of this async processing is not needed anymore,
      because the reasons it was added are mostly gone from the kernel. More
      details can be found in [2] (see "Git archeology" section).
      
      This change adds DM_CRYPT_NO_READ_WORKQUEUE and
      DM_CRYPT_NO_WRITE_WORKQUEUE flags for read and write BIOs, which
      direct dm-crypt to not offload crypto operations into kcryptd
      workqueues.  In addition, writes are not buffered to be sorted in the
      dm-crypt red-black tree, but dispatched immediately. For cases where
      crypto operations cannot happen (hard interrupt context, for example
      the read path of some NVME drivers), we offload the work to a tasklet
      rather than a workqueue.
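
      The dispatch decision described above can be sketched in plain C. This
      is a minimal userspace model, not the dm-crypt code: the flag names
      follow the commit message, but the bit values, the enum crypt_path
      labels and choose_crypt_path() are hypothetical and exist only for
      illustration.

```c
#include <stdbool.h>

/* Flag names follow the commit message; the bit positions are assumptions. */
enum {
    DM_CRYPT_NO_READ_WORKQUEUE  = 1u << 0,
    DM_CRYPT_NO_WRITE_WORKQUEUE = 1u << 1,
};

/* Hypothetical labels for where the crypto work ends up running. */
enum crypt_path {
    CRYPT_PATH_WORKQUEUE, /* default: offload to a kcryptd workqueue */
    CRYPT_PATH_INLINE,    /* flag set: run in the submitting context */
    CRYPT_PATH_TASKLET,   /* flag set, but the context cannot run crypto */
};

/* Pick a path for one BIO, given the target flags and the calling context. */
enum crypt_path choose_crypt_path(unsigned int flags, bool is_write,
                                  bool in_hard_irq)
{
    unsigned int bypass = is_write ? DM_CRYPT_NO_WRITE_WORKQUEUE
                                   : DM_CRYPT_NO_READ_WORKQUEUE;

    if (!(flags & bypass))
        return CRYPT_PATH_WORKQUEUE;
    if (in_hard_irq)
        return CRYPT_PATH_TASKLET;
    return CRYPT_PATH_INLINE;
}
```

      Note that the read and write flags are independent in this model, so a
      target may bypass the workqueue for reads only, for writes only, or for
      both.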
      
      These flags only ensure no async BIO processing in the dm-crypt
      module. It is worth noting that some Crypto API implementations may
      offload encryption into their own workqueues, which are independent of
      dm-crypt and its configuration. However, upon enabling these new
      flags dm-crypt will instruct Crypto API not to backlog crypto
      requests.
      
      To give an idea of the performance gains for certain workloads,
      consider the script, and results when tested against various
      devices, detailed here:
      https://www.redhat.com/archives/dm-devel/2020-July/msg00138.html
      
      [1]: https://www.spinics.net/lists/dm-crypt/msg07516.html
      [2]: https://blog.cloudflare.com/speeding-up-linux-disk-encryption/

      Signed-off-by: Ignat Korchagin <ignat@cloudflare.com>
      Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
      Reviewed-by: Bob Liu <bob.liu@oracle.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • dm bufio: do buffer cleanup from a workqueue · 70704c33
      Committed by Mikulas Patocka
      Until now, DM bufio's waiting for IO from reclaim context in its
      shrinker has caused kswapd to block, which results in systemic IO
      stalls and even deadlock, e.g.:
      https://www.redhat.com/archives/dm-devel/2020-March/msg00025.html
      
      Here is Dave Chinner's problem description that motivated this fix,
      from: https://lore.kernel.org/linux-fsdevel/20190809215733.GZ7777@dread.disaster.area/
      
      "Waiting for IO in kswapd reclaim context is considered harmful -
      kswapd context shrinker reclaim should be as non-blocking as possible,
      and any back-off to wait for IO to complete should be done by the high
      level reclaim core once it's completed an entire reclaim scan cycle of
      everything....
      
      What follows from that, and is pertinent in this situation, is that if
      you don't block kswapd, then other reclaim contexts are not going to
      get stuck waiting for it regardless of the reclaim context they use."
      
      Continued elsewhere:
      
      "The only way to fix this problem once and for all is to stop using
      the shrinker as a mechanism to issue and wait on IO. If you need
      background writeback of dirty buffers, do it from a WQ_MEM_RECLAIM
      workqueue that isn't directly in the memory reclaim path and so can
      issue writeback and block safely from a GFP_KERNEL context. Kick the
      workqueue from the shrinker context, but get rid of the IO submission
      and waiting from the shrinker and all the GFP_NOFS memory reclaim
      recursion problems go away."
      
      As such, this commit moves buffer cleanup to a workqueue.
      Suggested-by: Dave Chinner <dchinner@redhat.com>
      Reported-by: Tahsin Erdogan <tahsin@google.com>
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Tested-by: Gabriel Krisman Bertazi <krisman@collabora.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • dm rq: don't call blk_mq_queue_stopped() in dm_stop_queue() · e766668c
      Committed by Ming Lei
      dm_stop_queue() only uses blk_mq_quiesce_queue() so it doesn't
      formally stop the blk-mq queue; therefore there is no point making the
      blk_mq_queue_stopped() check -- it will never be stopped.
      
      In addition, even though dm_stop_queue() actually tries to quiesce hw
      queues via blk_mq_quiesce_queue(), checking with blk_queue_quiesced()
      to avoid unnecessary queue quiesce isn't reliable because: the
      QUEUE_FLAG_QUIESCED flag is set before synchronize_rcu() and
      dm_stop_queue() may be called when synchronize_rcu() from another
      blk_mq_quiesce_queue() is in-progress.
      
      Fixes: 7b17c2f7 ("dm: Fix a race condition related to stopping and starting queues")
      Signed-off-by: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • dm dust: add interface to list all badblocks · 0c248ea2
      Committed by yangerkun
      This interface may help anyone who wants to know all badblocks without
      querying for each block.
      
      [Bryan: DMEMIT message if no blocks are in the bad block list.]
      Signed-off-by: yangerkun <yangerkun@huawei.com>
      Signed-off-by: Bryan Gurney <bgurney@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • dm dust: report some message results directly back to user · 4f7f590b
      Committed by yangerkun
      Some messages (queryblock, countbadblocks, removebadblock) are best
      reported directly to the user. Do so with DMEMIT.
      
      [Bryan: maintain __func__ output in DMEMIT messages]
      Signed-off-by: yangerkun <yangerkun@huawei.com>
      Signed-off-by: Bryan Gurney <bgurney@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
  3. 13 Jul, 2020 8 commits
  4. 11 Jul, 2020 8 commits
    • mlxsw: pci: Fix use-after-free in case of failed devlink reload · c4317b11
      Committed by Ido Schimmel
      In case devlink reload failed, it is possible to trigger a
      use-after-free when querying the kernel for device info via 'devlink dev
      info' [1].
      
      This happens because as part of the reload error path the PCI command
      interface is de-initialized and its mailboxes are freed. When the
      devlink '->info_get()' callback is invoked the device is queried via the
      command interface and the freed mailboxes are accessed.
      
      Fix this by initializing the command interface once during probe and not
      during every reload.
      
      This is consistent with the other bus used by mlxsw (i.e., 'mlxsw_i2c')
      and also allows user space to query the running firmware version (for
      example) from the device after a failed reload.
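
      The lifetime change can be illustrated with a small userspace model.
      This is only a sketch under stated assumptions: struct toy_dev and the
      toy_*() functions are hypothetical stand-ins, not mlxsw code, and the
      4096-byte mailbox size is taken from the write size in the KASAN report
      below.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define MBOX_SIZE 4096 /* matches the 4096-byte write in the KASAN report */

/* Hypothetical device model: one mailbox plus state that a reload cycles. */
struct toy_dev {
    unsigned char *mbox; /* command-interface mailbox */
    bool running;        /* state torn down and rebuilt by a reload */
};

/* Probe: the command interface is set up exactly once, here. */
int toy_probe(struct toy_dev *d)
{
    d->mbox = calloc(1, MBOX_SIZE);
    if (!d->mbox)
        return -1;
    d->running = true;
    return 0;
}

/* Reload: only the reloadable state is cycled. The mailbox is untouched,
 * so it stays valid even when bringing the device back up fails. */
int toy_reload(struct toy_dev *d, bool init_fails)
{
    d->running = false;
    if (init_fails)
        return -1;
    d->running = true;
    return 0;
}

/* Info query: writes to the mailbox, so the mailbox must outlive a
 * failed reload. Returns the first mailbox byte, or -1 without a mailbox. */
int toy_info_get(struct toy_dev *d)
{
    if (!d->mbox)
        return -1;
    memset(d->mbox, 0xab, MBOX_SIZE);
    return d->mbox[0];
}

/* Remove: the only place where the mailbox is freed. */
void toy_remove(struct toy_dev *d)
{
    free(d->mbox);
    d->mbox = NULL;
    d->running = false;
}
```

      In this model the mailbox is owned by the probe/remove pair, so a query
      issued after a failed reload still touches live memory.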
      
      [1]
      BUG: KASAN: use-after-free in memcpy include/linux/string.h:406 [inline]
      BUG: KASAN: use-after-free in mlxsw_pci_cmd_exec+0x177/0xa60 drivers/net/ethernet/mellanox/mlxsw/pci.c:1675
      Write of size 4096 at addr ffff88810ae32000 by task syz-executor.1/2355
      
      CPU: 1 PID: 2355 Comm: syz-executor.1 Not tainted 5.8.0-rc2+ #29
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0xf6/0x16e lib/dump_stack.c:118
       print_address_description.constprop.0+0x1c/0x250 mm/kasan/report.c:383
       __kasan_report mm/kasan/report.c:513 [inline]
       kasan_report.cold+0x1f/0x37 mm/kasan/report.c:530
       check_memory_region_inline mm/kasan/generic.c:186 [inline]
       check_memory_region+0x14e/0x1b0 mm/kasan/generic.c:192
       memcpy+0x39/0x60 mm/kasan/common.c:106
       memcpy include/linux/string.h:406 [inline]
       mlxsw_pci_cmd_exec+0x177/0xa60 drivers/net/ethernet/mellanox/mlxsw/pci.c:1675
       mlxsw_cmd_exec+0x249/0x550 drivers/net/ethernet/mellanox/mlxsw/core.c:2335
       mlxsw_cmd_access_reg drivers/net/ethernet/mellanox/mlxsw/cmd.h:859 [inline]
       mlxsw_core_reg_access_cmd drivers/net/ethernet/mellanox/mlxsw/core.c:1938 [inline]
       mlxsw_core_reg_access+0x2f6/0x540 drivers/net/ethernet/mellanox/mlxsw/core.c:1985
       mlxsw_reg_query drivers/net/ethernet/mellanox/mlxsw/core.c:2000 [inline]
       mlxsw_devlink_info_get+0x17f/0x6e0 drivers/net/ethernet/mellanox/mlxsw/core.c:1090
       devlink_nl_info_fill.constprop.0+0x13c/0x2d0 net/core/devlink.c:4588
       devlink_nl_cmd_info_get_dumpit+0x246/0x460 net/core/devlink.c:4648
       genl_lock_dumpit+0x85/0xc0 net/netlink/genetlink.c:575
       netlink_dump+0x515/0xe50 net/netlink/af_netlink.c:2245
       __netlink_dump_start+0x53d/0x830 net/netlink/af_netlink.c:2353
       genl_family_rcv_msg_dumpit.isra.0+0x296/0x300 net/netlink/genetlink.c:638
       genl_family_rcv_msg net/netlink/genetlink.c:733 [inline]
       genl_rcv_msg+0x78d/0x9d0 net/netlink/genetlink.c:753
       netlink_rcv_skb+0x152/0x440 net/netlink/af_netlink.c:2469
       genl_rcv+0x24/0x40 net/netlink/genetlink.c:764
       netlink_unicast_kernel net/netlink/af_netlink.c:1303 [inline]
       netlink_unicast+0x53a/0x750 net/netlink/af_netlink.c:1329
       netlink_sendmsg+0x850/0xd90 net/netlink/af_netlink.c:1918
       sock_sendmsg_nosec net/socket.c:652 [inline]
       sock_sendmsg+0x150/0x190 net/socket.c:672
       ____sys_sendmsg+0x6d8/0x840 net/socket.c:2363
       ___sys_sendmsg+0xff/0x170 net/socket.c:2417
       __sys_sendmsg+0xe5/0x1b0 net/socket.c:2450
       do_syscall_64+0x56/0xa0 arch/x86/entry/common.c:359
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Fixes: a9c8336f ("mlxsw: core: Add support for devlink info command")
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum_router: Remove inappropriate usage of WARN_ON() · d9d54202
      Committed by Ido Schimmel
      We should not trigger a warning when a memory allocation fails. Remove
      the WARN_ON().
      
      The warning is constantly triggered by syzkaller when it is injecting
      faults:
      
      [ 2230.758664] FAULT_INJECTION: forcing a failure.
      [ 2230.758664] name failslab, interval 1, probability 0, space 0, times 0
      [ 2230.762329] CPU: 3 PID: 1407 Comm: syz-executor.0 Not tainted 5.8.0-rc2+ #28
      ...
      [ 2230.898175] WARNING: CPU: 3 PID: 1407 at drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:6265 mlxsw_sp_router_fib_event+0xfad/0x13e0
      [ 2230.898179] Kernel panic - not syncing: panic_on_warn set ...
      [ 2230.898183] CPU: 3 PID: 1407 Comm: syz-executor.0 Not tainted 5.8.0-rc2+ #28
      [ 2230.898190] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
      
      Fixes: 3057224e ("mlxsw: spectrum_router: Implement FIB offload in deferred work")
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: macb: fix call to pm_runtime in the suspend/resume functions · 6c8f85ca
      Committed by Nicolas Ferre
      The calls to pm_runtime_force_suspend/resume() functions are only
      relevant if the device is not configured to act as a WoL wakeup source.
      Add the device_may_wakeup() test before calling them.
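
      The guard can be modelled in a few lines of userspace C. This is a
      sketch only: struct toy_macb and the toy_*() functions are hypothetical,
      the may_wakeup field stands in for the result of device_may_wakeup(),
      and the counters stand in for calls to pm_runtime_force_suspend() and
      pm_runtime_force_resume().

```c
#include <stdbool.h>

/* Hypothetical model of the driver state relevant to suspend/resume. */
struct toy_macb {
    bool may_wakeup;         /* stands in for device_may_wakeup() */
    int force_suspend_calls; /* simulated pm_runtime_force_suspend() calls */
    int force_resume_calls;  /* simulated pm_runtime_force_resume() calls */
};

/* Suspend: a WoL wakeup source must stay powered, so skip the forced
 * runtime suspend when the device may wake the system. */
void toy_macb_suspend(struct toy_macb *bp)
{
    if (!bp->may_wakeup)
        bp->force_suspend_calls++;
}

/* Resume mirrors suspend: only undo what suspend actually did. */
void toy_macb_resume(struct toy_macb *bp)
{
    if (!bp->may_wakeup)
        bp->force_resume_calls++;
}
```

      Keeping the same test in both paths keeps the forced suspend and
      resume calls balanced.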
      
      Fixes: 3e2a5e15 ("net: macb: add wake-on-lan support via magic packet")
      Cc: Claudiu Beznea <claudiu.beznea@microchip.com>
      Cc: Harini Katakam <harini.katakam@xilinx.com>
      Cc: Sergio Prado <sergio.prado@e-labworks.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: macb: fix macb_suspend() by removing call to netif_carrier_off() · 64febc5e
      Committed by Nicolas Ferre
      As we now use the phylink call to phylink_stop() in the non-WoL path,
      there is no need for this call to netif_carrier_off() anymore. It can
      disturb the underlying phylink FSM.
      
      Fixes: 7897b071 ("net: macb: convert to phylink")
      Cc: Claudiu Beznea <claudiu.beznea@microchip.com>
      Cc: Harini Katakam <harini.katakam@xilinx.com>
      Cc: Antoine Tenart <antoine.tenart@bootlin.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: macb: fix macb_get/set_wol() when moving to phylink · 253fe094
      Committed by Nicolas Ferre
      Keep the previous function goals and integrate phylink actions into them.
      
      phylink_ethtool_get_wol() is not enough to figure out if the Ethernet
      driver supports Wake-on-Lan.
      Initialization of the "supported" and "wolopts" members is done in the
      phylink function, so there is no need to keep it in the calling function.
      
      phylink_ethtool_set_wol() return value is considered and determines
      if the MAC has to handle WoL or not. The case where the PHY doesn't
      implement WoL leads to the MAC configuring it to provide this feature.
      
      Fixes: 7897b071 ("net: macb: convert to phylink")
      Cc: Claudiu Beznea <claudiu.beznea@microchip.com>
      Cc: Harini Katakam <harini.katakam@xilinx.com>
      Cc: Antoine Tenart <antoine.tenart@bootlin.com>
      Cc: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: macb: mark device wake capable when "magic-packet" property present · ced4799d
      Committed by Nicolas Ferre
      Change the way the "magic-packet" DT property is handled in the
      macb_probe() function, matching DT binding documentation.
      Now we mark the device as "wakeup capable" instead of calling the
      device_init_wakeup() function that would enable the wakeup source.
      
      For Ethernet WoL, enabling the wakeup_source is done by
      using ethtool and associated macb_set_wol() function that
      already calls device_set_wakeup_enable() for this purpose.
      
      That would reduce power consumption by cutting more clocks if
      "magic-packet" property is set but WoL is not configured by ethtool.
      
      Fixes: 3e2a5e15 ("net: macb: add wake-on-lan support via magic packet")
      Cc: Claudiu Beznea <claudiu.beznea@microchip.com>
      Cc: Harini Katakam <harini.katakam@xilinx.com>
      Cc: Sergio Prado <sergio.prado@e-labworks.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: macb: fix wakeup test in runtime suspend/resume routines · 515a10a7
      Committed by Nicolas Ferre
      Use the proper struct device pointer to check if the wakeup flag
      and wakeup source are positioned.
      Use the one passed by function call which is equivalent to
      &bp->dev->dev.parent.
      
      This prevents a spurious interrupt from being triggered when the
      Wake-on-Lan feature is used.
      
      Fixes: d54f89af ("net: macb: Add pm runtime support")
      Cc: Claudiu Beznea <claudiu.beznea@microchip.com>
      Cc: Harini Katakam <harini.katakam@xilinx.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bnxt_en: fix NULL dereference in case SR-IOV configuration fails · c8b1d743
      Committed by Davide Caratti
      We need to set 'active_vfs' back to 0, if something goes wrong during the
      allocation of SR-IOV resources: otherwise, further VF configurations will
      wrongly assume that bp->pf.vf[x] are valid memory locations, and commands
      like the ones in the following sequence:
      
       # echo 2 >/sys/bus/pci/devices/${ADDR}/sriov_numvfs
       # ip link set dev ens1f0np0 up
       # ip link set dev ens1f0np0 vf 0 trust on
      
      will cause a kernel crash similar to this:
      
       bnxt_en 0000:3b:00.0: not enough MMIO resources for SR-IOV
       BUG: kernel NULL pointer dereference, address: 0000000000000014
       #PF: supervisor read access in kernel mode
       #PF: error_code(0x0000) - not-present page
       PGD 0 P4D 0
       Oops: 0000 [#1] SMP PTI
       CPU: 43 PID: 2059 Comm: ip Tainted: G          I       5.8.0-rc2.upstream+ #871
       Hardware name: Dell Inc. PowerEdge R740/08D89F, BIOS 2.2.11 06/13/2019
       RIP: 0010:bnxt_set_vf_trust+0x5b/0x110 [bnxt_en]
       Code: 44 24 58 31 c0 e8 f5 fb ff ff 85 c0 0f 85 b6 00 00 00 48 8d 1c 5b 41 89 c6 b9 0b 00 00 00 48 c1 e3 04 49 03 9c 24 f0 0e 00 00 <8b> 43 14 89 c2 83 c8 10 83 e2 ef 45 84 ed 49 89 e5 0f 44 c2 4c 89
       RSP: 0018:ffffac6246a1f570 EFLAGS: 00010246
       RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000000000000b
       RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff98b28f538900
       RBP: ffff98b28f538900 R08: 0000000000000000 R09: 0000000000000008
       R10: ffffffffb9515be0 R11: ffffac6246a1f678 R12: ffff98b28f538000
       R13: 0000000000000001 R14: 0000000000000000 R15: ffffffffc05451e0
       FS:  00007fde0f688800(0000) GS:ffff98baffd40000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 0000000000000014 CR3: 000000104bb0a003 CR4: 00000000007606e0
       DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
       DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
       PKRU: 55555554
       Call Trace:
        do_setlink+0x994/0xfe0
        __rtnl_newlink+0x544/0x8d0
        rtnl_newlink+0x47/0x70
        rtnetlink_rcv_msg+0x29f/0x350
        netlink_rcv_skb+0x4a/0x110
        netlink_unicast+0x21d/0x300
        netlink_sendmsg+0x329/0x450
        sock_sendmsg+0x5b/0x60
        ____sys_sendmsg+0x204/0x280
        ___sys_sendmsg+0x88/0xd0
        __sys_sendmsg+0x5e/0xa0
        do_syscall_64+0x47/0x80
        entry_SYSCALL_64_after_hwframe+0x44/0xa9
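
      The fix described at the top of this message (resetting 'active_vfs'
      when SR-IOV resource allocation fails) can be modelled in userspace C.
      This is a sketch under stated assumptions: struct toy_pf, struct toy_vf
      and the toy_*() functions are hypothetical, and the alloc_fails
      parameter simply simulates the "not enough MMIO resources" failure.

```c
#include <stdlib.h>

/* Hypothetical per-VF state; only the trust bit matters here. */
struct toy_vf {
    int trusted;
};

/* Hypothetical PF state: the VF count and the per-VF array it guards. */
struct toy_pf {
    int active_vfs;
    struct toy_vf *vf;
};

/* Enable SR-IOV. On failure, active_vfs is reset to 0 so that later
 * per-VF commands do not treat pf->vf[x] as valid memory. */
int toy_sriov_enable(struct toy_pf *pf, int num_vfs, int alloc_fails)
{
    pf->active_vfs = num_vfs;
    pf->vf = alloc_fails ? NULL : calloc((size_t)num_vfs, sizeof(*pf->vf));
    if (!pf->vf) {
        pf->active_vfs = 0; /* the fix: do not advertise VFs we lack */
        return -1;
    }
    return 0;
}

/* Per-VF command: relies on active_vfs to bound access to pf->vf[]. */
int toy_set_vf_trust(struct toy_pf *pf, int vf_id, int trusted)
{
    if (vf_id < 0 || vf_id >= pf->active_vfs)
        return -1;
    pf->vf[vf_id].trusted = trusted;
    return 0;
}
```

      Without the reset in the failure path, toy_set_vf_trust() would pass
      its range check and dereference a NULL pf->vf, which is the shape of
      the crash shown above.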
      
      Fixes: c0c050c5 ("bnxt_en: New Broadcom ethernet driver.")
      Reported-by: Fei Liu <feliu@redhat.com>
      CC: Jonathan Toppins <jtoppins@redhat.com>
      CC: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: Davide Caratti <dcaratti@redhat.com>
      Reviewed-by: Michael Chan <michael.chan@broadcom.com>
      Acked-by: Jonathan Toppins <jtoppins@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  5. 10 Jul, 2020 13 commits
    • xen/xenbus: Fix a double free in xenbus_map_ring_pv() · ba8c4234
      Committed by Dan Carpenter
      When there is an error the caller frees "info->node" so the free here
      will result in a double free.  We should just delete the first kfree().
      
      Fixes: 3848e4e0 ("xen/xenbus: avoid large structs and arrays on the stack")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Link: https://lore.kernel.org/r/20200710113610.GA92345@mwanda
      Reviewed-by: Juergen Gross <jgross@suse.com>
      Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
    • net/mlx5e: CT: Fix memory leak in cleanup · eb32b3f5
      Committed by Eli Britstein
      CT entries are deleted via a workqueue from netfilter. If removing the
      module before that, the rules are cleaned by the driver itself, but the
      memory entries for them are not freed. Fix that.
      
      Fixes: ac991b48 ("net/mlx5e: CT: Offload established flows")
      Signed-off-by: Eli Britstein <elibr@mellanox.com>
      Reviewed-by: Roi Dayan <roid@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5e: Fix port buffers cell size value · 88b3d5c9
      Committed by Eran Ben Elisha
      Device unit for port buffers size, xoff_threshold and xon_threshold is
      cells. Fix a bug in driver where cell unit size was hard-coded to
      128 bytes. This hard-coded value is buggy, as it is wrong for some hardware
      versions.
      
      Make the driver read the cell size from the SBCAM register and translate
      bytes to cell units accordingly.
      
      In order to fix the bug, this patch exposes SBCAM (Shared buffer
      capabilities mask) layout and defines.
      
      If SBCAM.cap_cell_size is valid, use it for all bytes to cells
      calculations. If not valid, fall back to 128.
      
      Cell size does not change on the fly per device. Instead of issuing SBCAM
      access reg command every time such translation is needed, cache it in
      mlx5e_dcbx as part of mlx5e_dcbnl_initialize(). Pass dcbx.port_buff_cell_sz
      as a param to every function that needs bytes to cells translation.
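
      The cached cell size and the bytes-to-cells translation can be sketched
      as follows. This is an illustration, not the driver code: both function
      names are hypothetical, treating a reported value of 0 as "not valid"
      is an assumption, and truncating division is an assumption because the
      commit message does not state a rounding policy.

```c
#include <stdint.h>

#define DEFAULT_CELL_SIZE 128u /* fallback named in the commit message */

/* Resolve the cell size once, at initialization time.
 * Assumption: a reported cell size of 0 means "not valid". */
uint32_t port_buff_cell_size(uint32_t cap_cell_size)
{
    return cap_cell_size ? cap_cell_size : DEFAULT_CELL_SIZE;
}

/* Translate a byte count into device cells using the cached cell size.
 * Assumption: truncating division; the caller passes a non-zero size. */
uint32_t bytes_to_cells(uint32_t bytes, uint32_t cell_size)
{
    return bytes / cell_size;
}
```

      A caller would resolve the cell size once and pass it to every
      translation, which mirrors how the commit passes
      dcbx.port_buff_cell_sz as a parameter.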
      
      While fixing the bug, move MLX5E_BUFFER_CELL_SHIFT macro to
      en_dcbnl.c, as it is only used by that file.
      
      Fixes: 0696d608 ("net/mlx5e: Receive buffer configuration")
      Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
      Reviewed-by: Huy Nguyen <huyn@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5e: Fix 50G per lane indication · 6a1cf4e4
      Committed by Aya Levin
      Some released FW versions mistakenly don't set the capability that 50G
      per lane link-modes are supported for VFs (ptys_extended_ethernet
      capability bit). When the capability is unset, read
      PTYS.ext_eth_proto_capability (always reliable).
      If PTYS.ext_eth_proto_capability is valid (has a non-zero value)
      conclude that the HCA supports 50G per lane. Otherwise, conclude that
      the HCA doesn't support 50G per lane.
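
      The decision logic reads naturally as a small predicate. The sketch
      below is hypothetical userspace C: the function name and its parameters
      are illustrative, with the two inputs standing in for the
      ptys_extended_ethernet capability bit and the value read from
      PTYS.ext_eth_proto_capability.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decide whether 50G-per-lane link modes are supported.
 * cap_bit_set:         the ptys_extended_ethernet capability bit
 * ext_proto_capability: value read from PTYS.ext_eth_proto_capability */
bool supports_50g_per_lane(bool cap_bit_set, uint32_t ext_proto_capability)
{
    /* Trust the capability bit when firmware sets it. */
    if (cap_bit_set)
        return true;

    /* Otherwise fall back to the register, which the commit describes as
     * always reliable: any non-zero value means the modes are supported. */
    return ext_proto_capability != 0;
}
```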
      
      Fixes: a08b4ed1 ("net/mlx5: Add support to ext_* fields introduced in Port Type and Speed register")
      Signed-off-by: Aya Levin <ayal@mellanox.com>
      Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5e: Fix CPU mapping after function reload to avoid aRFS RX crash · f4aebbfb
      Committed by Aya Levin
      After function reload, CPU mapping used by aRFS RX is broken, leading to
      a kernel panic. Fix by moving initialization of rx_cpu_rmap from
      netdev_init to netdev_attach. IRQ table is re-allocated on mlx5_load,
      but netdev is not re-initialized.
      
      Trace of the panic:
      [ 22.055672] general protection fault, probably for non-canonical address 0x785634120000ff1c: 0000 [#1] SMP PTI
      [ 22.065010] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 5.7.0-rc2-for-upstream-perf-2020-04-21_16-34-03-31 #1
      [ 22.067967] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
      [ 22.071174] RIP: 0010:get_rps_cpu+0x267/0x300
      [ 22.075692] RSP: 0018:ffffc90000244d60 EFLAGS: 00010202
      [ 22.076888] RAX: ffff888459b0e400 RBX: 0000000000000000 RCX:0000000000000007
      [ 22.078364] RDX: 0000000000008884 RSI: ffff888467cb5b00 RDI:0000000000000000
      [ 22.079815] RBP: 00000000ff342b27 R08: 0000000000000007 R09:0000000000000003
      [ 22.081289] R10: ffffffffffffffff R11: 00000000000070cc R12:ffff888454900000
      [ 22.082767] R13: ffffc90000e5a950 R14: ffffc90000244dc0 R15:0000000000000007
      [ 22.084190] FS: 0000000000000000(0000) GS:ffff88846fc80000(0000)knlGS:0000000000000000
      [ 22.086161] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 22.087427] CR2: ffffffffffffffff CR3: 0000000464426003 CR4:0000000000760ee0
      [ 22.088888] DR0: 0000000000000000 DR1: 0000000000000000 DR2:0000000000000000
      [ 22.090336] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:0000000000000400
      [ 22.091764] PKRU: 55555554
      [ 22.092618] Call Trace:
      [ 22.093442] <IRQ>
      [ 22.094211] ? kvm_clock_get_cycles+0xd/0x10
      [ 22.095272] netif_receive_skb_list_internal+0x258/0x2a0
      [ 22.096460] gro_normal_list.part.137+0x19/0x40
      [ 22.097547] napi_complete_done+0xc6/0x110
      [ 22.098685] mlx5e_napi_poll+0x190/0x670 [mlx5_core]
      [ 22.099859] net_rx_action+0x2a0/0x400
      [ 22.100848] __do_softirq+0xd8/0x2a8
      [ 22.101829] irq_exit+0xa5/0xb0
      [ 22.102750] do_IRQ+0x52/0xd0
      [ 22.103654] common_interrupt+0xf/0xf
      [ 22.104641] </IRQ>
      
      Fixes: 4383cfcc ("net/mlx5: Add devlink reload")
      Signed-off-by: Aya Levin <ayal@mellanox.com>
      Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5e: Fix VXLAN configuration restore after function reload · b3c2ed21
      Committed by Aya Levin
      When detaching netdev, remove vxlan port configuration using
      udp_tunnel_drop_rx_info. During function reload, configuration will be
      restored using udp_tunnel_get_rx_info. This ensures sync between
      firmware and driver. Use udp_tunnel_get_rx_info even if its physical
      interface is down.
      
      Fixes: 4383cfcc ("net/mlx5: Add devlink reload")
      Signed-off-by: Aya Levin <ayal@mellanox.com>
      Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5e: Fix usage of rcu-protected pointer · c1aea9e1
      Committed by Vlad Buslov
      In mlx5e_configure_flower() flow pointer is protected by rcu read lock.
      However, after cited commit the pointer is being used outside of rcu read
      block. Extend the block to protect all pointer accesses.
      
      Fixes: 553f9328 ("net/mlx5e: Support tc block sharing for representors")
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Reviewed-by: Roi Dayan <roid@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mxl5e: Verify that rpriv is not NULL · 2fb15e72
      Committed by Vlad Buslov
      In helper function is_flow_rule_duplicate_allowed() verify that rpriv
      pointer is not NULL before dereferencing it. This can happen when device is
      in NIC mode and leads to the following crash:
      
      [90444.046419] BUG: kernel NULL pointer dereference, address: 0000000000000000
      [90444.048149] #PF: supervisor read access in kernel mode
      [90444.049781] #PF: error_code(0x0000) - not-present page
      [90444.051386] PGD 80000003d35a4067 P4D 80000003d35a4067 PUD 3d35a3067 PMD 0
      [90444.053051] Oops: 0000 [#1] SMP PTI
      [90444.054683] CPU: 16 PID: 31736 Comm: tc Not tainted 5.8.0-rc1+ #1157
      [90444.056340] Hardware name: Supermicro SYS-2028TP-DECR/X10DRT-P, BIOS 2.0b 03/30/2017
      [90444.058079] RIP: 0010:mlx5e_configure_flower+0x3aa/0x9b0 [mlx5_core]
      [90444.059753] Code: 24 50 49 8b 95 08 02 00 00 48 b8 00 08 00 00 04 00 00 00 48 21 c2 48 39 c2 74 0a 41 f6 85 0d 02 00 00 20 74 16 48 8b 44 24 20 <48> 8b 00 66 83 78 20 ff 74 07 4d 89 aa e0 00 00 00 48 83 7d 28 00
      [90444.063232] RSP: 0018:ffffabe9c61ff768 EFLAGS: 00010246
      [90444.065014] RAX: 0000000000000000 RBX: ffff9b13c4c91e80 RCX: 00000000000093fa
      [90444.066784] RDX: 0000000400000800 RSI: 0000000000000000 RDI: 000000000002d5e0
      [90444.068533] RBP: ffff9b174d308468 R08: 0000000000000000 R09: ffff9b17d63003f0
      [90444.070285] R10: ffff9b17ea288600 R11: 0000000000000000 R12: ffffabe9c61ff878
      [90444.072032] R13: ffff9b174d300000 R14: ffffabe9c61ffbb8 R15: ffff9b174d300880
      [90444.073760] FS:  00007f3c23775480(0000) GS:ffff9b13efc80000(0000) knlGS:0000000000000000
      [90444.075492] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [90444.077266] CR2: 0000000000000000 CR3: 00000003e2a60002 CR4: 00000000001606e0
      [90444.079024] Call Trace:
      [90444.080753]  tc_setup_cb_add+0xca/0x1e0
      [90444.082415]  fl_hw_replace_filter+0x15f/0x1f0 [cls_flower]
      [90444.084119]  fl_change+0xa59/0x13dc [cls_flower]
      [90444.085772]  ? wait_for_completion+0xa8/0xf0
      [90444.087364]  tc_new_tfilter+0x3f5/0xa60
      [90444.088960]  rtnetlink_rcv_msg+0xeb/0x360
      [90444.090514]  ? __d_lookup_done+0x76/0xe0
      [90444.092034]  ? proc_alloc_inode+0x16/0x70
      [90444.093560]  ? prep_new_page+0x8c/0xf0
      [90444.095048]  ? _cond_resched+0x15/0x30
      [90444.096483]  ? rtnl_calcit.isra.0+0x110/0x110
      [90444.097907]  netlink_rcv_skb+0x49/0x110
      [90444.099289]  netlink_unicast+0x191/0x230
      [90444.100629]  netlink_sendmsg+0x243/0x480
      [90444.101984]  sock_sendmsg+0x5e/0x60
      [90444.103305]  ____sys_sendmsg+0x1f3/0x260
      [90444.104597]  ? copy_msghdr_from_user+0x5c/0x90
      [90444.105916]  ? __mod_lruvec_state+0x3c/0xe0
      [90444.107210]  ___sys_sendmsg+0x81/0xc0
      [90444.108484]  ? do_filp_open+0xa5/0x100
      [90444.109732]  ? handle_mm_fault+0x117b/0x1e00
      [90444.110970]  ? __check_object_size+0x46/0x147
      [90444.112205]  ? __check_object_size+0x136/0x147
      [90444.113402]  __sys_sendmsg+0x59/0xa0
      [90444.114587]  do_syscall_64+0x4d/0x90
      [90444.115782]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      [90444.116953] RIP: 0033:0x7f3c2393b7b8
      [90444.118101] Code: Bad RIP value.
      [90444.119240] RSP: 002b:00007ffc6ad8e6c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      [90444.120408] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f3c2393b7b8
      [90444.121583] RDX: 0000000000000000 RSI: 00007ffc6ad8e740 RDI: 0000000000000003
      [90444.122750] RBP: 000000005eea0c3a R08: 0000000000000001 R09: 00007ffc6ad8e68c
      [90444.123928] R10: 0000000000404fa8 R11: 0000000000000246 R12: 0000000000000001
      [90444.125073] R13: 0000000000000000 R14: 00007ffc6ad92a00 R15: 00000000004866a0
      [90444.126221] Modules linked in: act_skbedit act_tunnel_key act_mirred bonding vxlan ip6_udp_tunnel udp_tunnel nfnetlink act_gact cls_flower sch_ingress openvswitch nsh nf_conncount nfsv3 nfs_acl nfs lockd grace fscache tun bridge stp llc sunrpc rdma_ucm rdma_cm iw_cm ib_cm mlx5_ib ib_uverbs ib_core mlx5_core intel_rapl_msr intel_rapl_common sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel mlxfw kvm act_ct nf_flow_table nf_nat nf_conntrack irqbypass crct10dif_pclmul nf_defrag_ipv6 igb ipmi_ssif libcrc32c crc32_pclmul crc32c_intel ipmi_si nf_defrag_ipv4 ptp ghash_clmulni_intel mei_me ses iTCO_wdt i2c_i801 pps_core ioatdma iTCO_vendor_support joydev mei enclosure intel_cstate i2c_smbus wmi dca ipmi_devintf intel_uncore lpc_ich ipmi_msghandler pcspkr acpi_pad acpi_power_meter ast i2c_algo_bit drm_vram_helper drm_kms_helper drm_ttm_helper ttm drm mpt3sas raid_class scsi_transport_sas
      [90444.136253] CR2: 0000000000000000
      [90444.137621] ---[ end trace 924af62aa2b151bd ]---
      
      Fixes: 553f9328 ("net/mlx5e: Support tc block sharing for representors")
      Reported-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Reviewed-by: Roi Dayan <roid@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      2fb15e72
    • V
      net/mlx5: E-Switch, Fix vlan or qos setting in legacy mode · 01f3d5db
      Vu Pham committed
      The refactoring of the eswitch ingress ACL code accidentally inserted an
      extra memset to zero that clears the vlan and/or qos setting in legacy mode.
      
      Fixes: 07bab950 ("net/mlx5: E-Switch, Refactor eswitch ingress acl codes")
      Signed-off-by: Vu Pham <vuhuong@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      01f3d5db
    • E
      net/mlx5: Fix eeprom support for SFP module · 47afbdd2
      Eran Ben Elisha committed
      Fix eeprom SFP query support by setting i2c_addr, offset and page number
      correctly. Unlike QSFP modules, SFP eeprom params are as follows:
      - i2c_addr is 0x50 for offset 0 - 255 and 0x51 for offset 256 - 511.
      - Page number is always zero.
      - Page offset is always relative to zero.
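
      The SFP rules listed above can be sketched as a small mapping function.
      This is a minimal illustration, not the driver's code: the names
      sfp_eeprom_map, struct sfp_eeprom_params and the SFP_* constants are
      hypothetical, and it assumes that "relative to zero" means the offset is
      rebased to the start of the selected i2c address. Offsets of 512 and
      above are outside the described range and are left to the caller to
      reject.

```c
#include <stdint.h>

/* Hypothetical constants for illustration; not the driver's identifiers. */
#define SFP_I2C_ADDR_LOW   0x50u  /* serves flat offsets 0 - 255   */
#define SFP_I2C_ADDR_HIGH  0x51u  /* serves flat offsets 256 - 511 */
#define SFP_PAGE_LENGTH    256u

struct sfp_eeprom_params {
	uint16_t i2c_addr;
	uint16_t page;    /* always zero for SFP */
	uint16_t offset;  /* offset relative to the selected i2c address */
};

/*
 * Map a flat EEPROM offset (0 - 511) to SFP query parameters.
 * Assumption: the caller has already checked that offset < 512.
 */
static struct sfp_eeprom_params sfp_eeprom_map(uint16_t offset)
{
	struct sfp_eeprom_params p = { SFP_I2C_ADDR_LOW, 0, offset };

	if (offset >= SFP_PAGE_LENGTH) {
		p.i2c_addr = SFP_I2C_ADDR_HIGH;
		p.offset = (uint16_t)(offset - SFP_PAGE_LENGTH);
	}
	return p;
}
```

      With this sketch, a flat offset of 300 maps to i2c_addr 0x51, page 0
      and offset 44.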
      
      As part of eeprom query, query the module ID (SFP / QSFP*) via helper
      function to set the params accordingly.
      
      In addition, change mlx5_qsfp_eeprom_page() input type to be u16 to avoid
      unnecessary casting.
      
      Fixes: a708fb7b ("net/mlx5e: ethtool, Add support for EEPROM high pages query")
      Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
      Signed-off-by: Huy Nguyen <huyn@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      47afbdd2
    • S
      qed: Populate nvm-file attributes while reading nvm config partition. · 13cf8aab
      Sudarsana Reddy Kalluru committed
      The NVM config file address changes when the MBI image is upgraded.
      The driver returns stale config values if the user reads the nvm-config
      (via ethtool -d) in this state. The fix is to re-populate the nvm attribute
      info while reading the nvm config values/partition.
      
      Changes from previous version:
      -------------------------------
      v3: Corrected the formatting in 'Fixes' tag.
      v2: Added 'Fixes' tag.
      
      Fixes: 1ac4329a ("qed: Add configuration information to register dump and debug data")
      Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com>
      Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      13cf8aab
    • M
      drm/amdgpu: don't do soft recovery if gpu_recovery=0 · f4892c32
      Marek Olšák committed
      It's impossible to debug shader hangs with soft recovery.
      Signed-off-by: Marek Olšák <marek.olsak@amd.com>
      Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
      Reviewed-by: Christian König <christian.koenig@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Cc: stable@vger.kernel.org
      f4892c32
    • T
      drm/radeon: fix double free · 41855a89
      Tom Rix committed
      clang static analysis flags this error:
      
      drivers/gpu/drm/radeon/ci_dpm.c:5652:9: warning: Use of memory after it is freed [unix.Malloc]
                      kfree(rdev->pm.dpm.ps[i].ps_priv);
                            ^~~~~~~~~~~~~~~~~~~~~~~~~~
      drivers/gpu/drm/radeon/ci_dpm.c:5654:2: warning: Attempt to free released memory [unix.Malloc]
              kfree(rdev->pm.dpm.ps);
              ^~~~~~~~~~~~~~~~~~~~~~
      
      The problem is reported in ci_dpm_fini, in this code block:
      
      	for (i = 0; i < rdev->pm.dpm.num_ps; i++) {
      		kfree(rdev->pm.dpm.ps[i].ps_priv);
      	}
      	kfree(rdev->pm.dpm.ps);
      
      The first free happens in ci_parse_power_table where it cleans up locally
      on a failure.  ci_dpm_fini also does a cleanup.
      
      	ret = ci_parse_power_table(rdev);
      	if (ret) {
      		ci_dpm_fini(rdev);
      		return ret;
      	}
      
      So remove the cleanup in ci_parse_power_table and
      move the num_ps calculation to inside the loop so ci_dpm_fini
      will know how many array elements to free.
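
      The ownership rule described above (the parser never frees, and the
      element count is advanced inside the loop so the single cleanup path
      frees exactly what was allocated) can be sketched in userspace C. This
      is an illustration under stated assumptions, not the radeon code:
      struct dpm_state, parse_power_table, dpm_fini and the fail_at knob are
      hypothetical, and malloc/free stand in for the kernel allocators.

```c
#include <stdlib.h>

struct ps_entry {
	void *ps_priv;
};

struct dpm_state {
	struct ps_entry *ps;
	int num_ps;  /* number of entries whose ps_priv is allocated */
};

/*
 * Hypothetical parser. On failure it returns without freeing anything;
 * cleanup belongs to dpm_fini() alone. fail_at simulates an error at
 * that index (pass -1 for no simulated failure).
 */
static int parse_power_table(struct dpm_state *dpm, int count, int fail_at)
{
	dpm->ps = calloc((size_t)count, sizeof(*dpm->ps));
	if (!dpm->ps)
		return -1;

	for (int i = 0; i < count; i++) {
		if (i == fail_at)
			return -1;  /* no local cleanup here */
		dpm->ps[i].ps_priv = malloc(16);
		if (!dpm->ps[i].ps_priv)
			return -1;
		dpm->num_ps = i + 1;  /* updated inside the loop */
	}
	return 0;
}

/* Single cleanup path: frees exactly num_ps entries, then the array. */
static void dpm_fini(struct dpm_state *dpm)
{
	for (int i = 0; i < dpm->num_ps; i++)
		free(dpm->ps[i].ps_priv);
	free(dpm->ps);
	dpm->ps = NULL;
	dpm->num_ps = 0;
}
```

      Because only dpm_fini releases memory, a caller that invokes it after a
      failed parse frees each allocation once.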
      
      Fixes: cc8dbbb4 ("drm/radeon: add dpm support for CI dGPUs (v2)")
      Signed-off-by: Tom Rix <trix@redhat.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Cc: stable@vger.kernel.org
      41855a89