1. 20 Dec, 2018 10 commits
    • A
      net/mlx5e: Remove the false indication of software timestamping support · 47654204
      Alaa Hleihel authored
      The mlx5 driver falsely advertises support for software timestamping.
      Fix it by removing the false indication.
      
      Fixes: ef9814de ("net/mlx5e: Add HW timestamping (TS) support")
      Signed-off-by: Alaa Hleihel <alaa@mellanox.com>
      Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • Y
      net/mlx5: Typo fix in del_sw_hw_rule · f0337889
      Yuval Avnery authored
      An expression was terminated with "," instead of ";", which resulted in
      set_fte getting a bad value for the modify_enable_mask field.
      
      Fixes: bd5251db ("net/mlx5_core: Introduce flow steering destination of type counter")
      Signed-off-by: Yuval Avnery <yuvalav@mellanox.com>
      Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
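As an illustration of how a "," in place of ";" can silently change behavior, here is a minimal sketch. It is not the driver's code: the flag values, struct and function names are hypothetical, and it shows only one of the ways such a typo can bite (the following statement gets glued into the `if` body).

```c
#include <assert.h>

#define MASK_ACTION   (1u << 0)   /* hypothetical flag bits */
#define MASK_COUNTERS (1u << 2)

struct result {
	unsigned int mask;
	int update;
};

/* Buggy: the first assignment ends in ',', so the next assignment belongs
 * to the same expression statement and becomes part of the if-body. */
static struct result buggy(int cond)
{
	struct result r = { 0, 0 };

	if (cond)
		r.mask = MASK_ACTION | MASK_COUNTERS,
	r.update = 1;
	return r;
}

/* Fixed: ';' ends the if-body, so r.update is always set. */
static struct result fixed(int cond)
{
	struct result r = { 0, 0 };

	if (cond)
		r.mask = MASK_ACTION | MASK_COUNTERS;
	r.update = 1;
	return r;
}
```

Both variants compile without error, which is why this kind of typo tends to survive review; compilers may at most warn about misleading indentation.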
    • T
      net/mlx5e: RX, Fix wrong early return in receive queue poll · bfc69825
      Tariq Toukan authored
      When the completion queue of the RQ is empty, do not return immediately.
      If left-over decompressed CQEs (from the previous cycle) were processed,
      we need to go to the finalization part of the poll function.
      
      The bug exists only when CQE compression is turned ON.
      
      This solves the following issue:
      mlx5_core 0000:82:00.1: mlx5_eq_int:544:(pid 0): CQ error on CQN 0xc08, syndrome 0x1
      mlx5_core 0000:82:00.1 p4p2: mlx5e_cq_error_event: cqn=0x000c08 event=0x04
      
      Fixes: 4b7dfc99 ("net/mlx5e: Early-return on empty completion queues")
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
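The control-flow problem described in this commit can be sketched with a simulated queue. This is a simplified model under assumed names (`rx_queue`, `poll_buggy`, `poll_fixed` are hypothetical), not the mlx5e poll function itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical queue state; not the mlx5e structures. */
struct rx_queue {
	int  leftover;   /* decompressed entries carried over from the last cycle */
	int  pending;    /* new entries waiting in the completion queue */
	bool finalized;  /* set when the end-of-poll bookkeeping has run */
};

/* Consume entries from *count until it is empty or the budget is used up. */
static int drain(int *count, int budget, int done)
{
	while (*count > 0 && done < budget) {
		(*count)--;
		done++;
	}
	return done;
}

/* Buggy shape: an empty queue returns before the finalization step,
 * even though left-over entries were just processed. */
static int poll_buggy(struct rx_queue *q, int budget)
{
	int done = drain(&q->leftover, budget, 0);

	if (q->pending == 0)
		return done;
	done = drain(&q->pending, budget, done);
	q->finalized = true;
	return done;
}

/* Fixed shape: return early only when no work was done at all. */
static int poll_fixed(struct rx_queue *q, int budget)
{
	int done = drain(&q->leftover, budget, 0);

	if (q->pending == 0) {
		if (done == 0)
			return 0;
		goto out;
	}
	done = drain(&q->pending, budget, done);
out:
	q->finalized = true;
	return done;
}
```

With two left-over entries and an empty queue, `poll_buggy()` reports two entries processed but never runs the finalization step, while `poll_fixed()` does.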
    • C
      ipv6: explicitly initialize udp6_addr in udp_sock_create6() · fb242745
      Cong Wang authored
      syzbot reported the use of uninitialized udp6_addr::sin6_scope_id.
      We can just set ::sin6_scope_id to zero, as tunnels are unlikely
      to use an IPv6 address that needs a scope id and there is no
      interface to bind in this context.
      
      For net-next, it looks different as we have cfg->bind_ifindex there
      so we can probably call ipv6_iface_scope_id().
      
      Same for ::sin6_flowinfo, tunnels don't use it.
      
      Fixes: 8024e028 ("udp: Add udp_sock_create for UDP tunnels to open listener socket")
      Reported-by: syzbot+c56449ed3652e6720f30@syzkaller.appspotmail.com
      Cc: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
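The idea of explicitly initializing every field of the IPv6 address can be sketched in userspace C. This is only an analogy using the POSIX `struct sockaddr_in6`; the helper name `make_tunnel_addr` is hypothetical and this is not the kernel's udp_sock_create6().

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: build a wildcard IPv6 address for a tunnel socket. */
static struct sockaddr_in6 make_tunnel_addr(unsigned short port)
{
	struct sockaddr_in6 addr;

	/* Zero everything first so no field is left with stack garbage. */
	memset(&addr, 0, sizeof(addr));
	addr.sin6_family = AF_INET6;
	addr.sin6_port = htons(port);
	addr.sin6_addr = in6addr_any;
	/* Stated explicitly for clarity, as the commit message suggests. */
	addr.sin6_flowinfo = 0;
	addr.sin6_scope_id = 0;
	return addr;
}
```

Without the `memset()` or the explicit assignments, `sin6_scope_id` and `sin6_flowinfo` would hold whatever was on the stack, which is the class of bug syzbot reported.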
    • M
      bnxt_en: Fix ethtool self-test loopback. · 84404d5f
      Michael Chan authored
      The current code has 2 problems.  It assumes that the RX ring for
      the loopback packet is combined with the TX ring.  This is not
      true if the ethtool channels are set to non-combined mode.  The
      second problem is that it won't work on 57500 chips without
      adjusting the logic to get the proper completion ring (cpr) pointer.
      Fix both issues by locating the proper cpr pointer through the RX
      ring.
      
      Fixes: e44758b7 ("bnxt_en: Use bnxt_cp_ring_info struct pointer as parameter for RX path.")
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • D
      Merge branch 'rds-fixes' · 912cb1d5
      David S. Miller authored
      Shamir Rabinovitch says:
      
      ====================
      WARNING in rds_message_alloc_sgs
      
      This patch set fixes the Google syzbot rds bug found in linux-next.
      The first patch solves the syzbot issue.
      The second patch fixes the issue mentioned by Leon Romanovsky that
      drivers should not call WARN_ON as a result of user input.
      
      The syzbot bug report can be found here: https://lkml.org/lkml/2018/10/31/28
      
      v1->v2:
      - patch 1: make rds_iov_vector field names more descriptive (Hakon)
      - patch 1: fix potential mem leak in rds_rm_size if krealloc fails
        (Hakon)
      v2->v3:
      - patch 2: harden rds_sendmsg for invalid number of sgs (Gerd)
      v3->v4:
      - Santosh's Acked-by on both patches + repost to net-dev
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • S
      net/rds: remove user triggered WARN_ON in rds_sendmsg · c75ab8a5
      shamir rabinovitch authored
      Per comment from Leon on the rdma mailing list
      https://lkml.org/lkml/2018/10/31/312 :
      
      Please don't forget to remove user triggered WARN_ON.
      https://lwn.net/Articles/769365/
      "Greg Kroah-Hartman raised the problem of core kernel API code that will
      use WARN_ON_ONCE() to complain about bad usage; that will not generate
      the desired result if WARN_ON_ONCE() is configured to crash the machine.
      He was told that the code should just call pr_warn() instead, and that
      the called function should return an error in such situations. It was
      generally agreed that any WARN_ON() or WARN_ON_ONCE() calls that can be
      triggered from user space need to be fixed."
      
      In addition, harden rds_sendmsg to detect and overcome issues with
      an invalid sg count and fail the sendmsg.
      Suggested-by: Leon Romanovsky <leon@kernel.org>
      Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
      Signed-off-by: shamir rabinovitch <shamir.rabinovitch@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
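The principle quoted above (report bad user-controlled input with an error instead of a WARN_ON) can be sketched as follows. The struct and function names are hypothetical, and this is a userspace model, not the rds code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical message with a fixed number of scatterlist slots. */
struct msg {
	size_t used_sgs;
	size_t total_sgs;
};

/* Reserve nents slots. The count ultimately derives from caller-controlled
 * input, so an invalid value is rejected with an error code rather than
 * treated as an internal invariant violation. */
static int msg_reserve_sgs(struct msg *m, size_t nents)
{
	if (nents == 0 || nents > m->total_sgs - m->used_sgs)
		return -EINVAL;
	m->used_sgs += nents;
	return 0;
}
```

The caller is then expected to propagate the error and fail the request, leaving the message state unchanged.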
    • S
      net/rds: fix warn in rds_message_alloc_sgs · ea010070
      shamir rabinovitch authored
      A redundant copy_from_user in the rds_sendmsg system call exposes rds
      to an issue where rds_rdma_extra_size walks the rds iovec and
      calculates the number of pages (sgs) it needs to add to the tail of the
      rds message, and later rds_cmsg_rdma_args copies the rds iovec again,
      recalculates the same number and gets a different result, causing the
      WARN_ON in rds_message_alloc_sgs.
      
      Fix this by doing the copy_from_user only once per rds_sendmsg
      system call.
      
      When the issue occurs, the dump below is seen:
      
      WARNING: CPU: 0 PID: 19789 at net/rds/message.c:316 rds_message_alloc_sgs+0x10c/0x160 net/rds/message.c:316
      Kernel panic - not syncing: panic_on_warn set ...
      CPU: 0 PID: 19789 Comm: syz-executor827 Not tainted 4.19.0-next-20181030+ #101
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x244/0x39d lib/dump_stack.c:113
       panic+0x2ad/0x55c kernel/panic.c:188
       __warn.cold.8+0x20/0x45 kernel/panic.c:540
       report_bug+0x254/0x2d0 lib/bug.c:186
       fixup_bug arch/x86/kernel/traps.c:178 [inline]
       do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:271
       do_invalid_op+0x36/0x40 arch/x86/kernel/traps.c:290
       invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:969
      RIP: 0010:rds_message_alloc_sgs+0x10c/0x160 net/rds/message.c:316
      Code: c0 74 04 3c 03 7e 6c 44 01 ab 78 01 00 00 e8 2b 9e 35 fa 4c 89 e0 48 83 c4 08 5b 41 5c 41 5d 41 5e 41 5f 5d c3 e8 14 9e 35 fa <0f> 0b 31 ff 44 89 ee e8 18 9f 35 fa 45 85 ed 75 1b e8 fe 9d 35 fa
      RSP: 0018:ffff8801c51b7460 EFLAGS: 00010293
      RAX: ffff8801bc412080 RBX: ffff8801d7bf4040 RCX: ffffffff8749c9e6
      RDX: 0000000000000000 RSI: ffffffff8749ca5c RDI: 0000000000000004
      RBP: ffff8801c51b7490 R08: ffff8801bc412080 R09: ffffed003b5c5b67
      R10: ffffed003b5c5b67 R11: ffff8801dae2db3b R12: 0000000000000000
      R13: 000000000007165c R14: 000000000007165c R15: 0000000000000005
       rds_cmsg_rdma_args+0x82d/0x1510 net/rds/rdma.c:623
       rds_cmsg_send net/rds/send.c:971 [inline]
       rds_sendmsg+0x19a2/0x3180 net/rds/send.c:1273
       sock_sendmsg_nosec net/socket.c:622 [inline]
       sock_sendmsg+0xd5/0x120 net/socket.c:632
       ___sys_sendmsg+0x7fd/0x930 net/socket.c:2117
       __sys_sendmsg+0x11d/0x280 net/socket.c:2155
       __do_sys_sendmsg net/socket.c:2164 [inline]
       __se_sys_sendmsg net/socket.c:2162 [inline]
       __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2162
       do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x44a859
      Code: e8 dc e6 ff ff 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 6b cb fb ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007f1d4710ada8 EFLAGS: 00000297 ORIG_RAX: 000000000000002e
      RAX: ffffffffffffffda RBX: 00000000006dcc28 RCX: 000000000044a859
      RDX: 0000000000000000 RSI: 0000000020001600 RDI: 0000000000000003
      RBP: 00000000006dcc20 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000297 R12: 00000000006dcc2c
      R13: 646e732f7665642f R14: 00007f1d4710b9c0 R15: 00000000006dcd2c
      Kernel Offset: disabled
      Rebooting in 86400 seconds..
      
      Reported-by: syzbot+26de17458aeda9d305d8@syzkaller.appspotmail.com
      Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
      Signed-off-by: shamir rabinovitch <shamir.rabinovitch@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
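The double-fetch hazard this commit describes can be modeled in plain C. In this sketch `snapshot()` stands in for copy_from_user() and `concurrent_writer()` stands in for user space changing the buffer between the two reads; all names and the page size constant are assumptions for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SZ 4096u
#define MAX_IOV 4

struct iov { size_t bytes; };   /* hypothetical descriptor */

static size_t pages_for(const struct iov *v, size_t n)
{
	size_t pages = 0;

	for (size_t i = 0; i < n; i++)
		pages += (v[i].bytes + PAGE_SZ - 1) / PAGE_SZ;
	return pages;
}

/* Stand-in for copy_from_user(): copy shared memory into a private buffer. */
static void snapshot(struct iov *dst, const struct iov *shared, size_t n)
{
	memcpy(dst, shared, n * sizeof(*dst));
}

/* Stand-in for user space modifying the buffer between two kernel reads. */
static void concurrent_writer(struct iov *shared)
{
	shared[0].bytes += 10 * PAGE_SZ;
}

/* Double fetch: sizing and consumption use two separate copies.
 * Returns true if the two page counts agree. */
static bool double_fetch_consistent(struct iov *shared, size_t n)
{
	struct iov first[MAX_IOV], second[MAX_IOV];

	if (n > MAX_IOV)
		return false;
	snapshot(first, shared, n);
	size_t reserved = pages_for(first, n);
	concurrent_writer(shared);
	snapshot(second, shared, n);
	return reserved == pages_for(second, n);
}

/* Single fetch: both steps work on the same private copy. */
static bool single_fetch_consistent(struct iov *shared, size_t n)
{
	struct iov local[MAX_IOV];

	if (n > MAX_IOV)
		return false;
	snapshot(local, shared, n);
	size_t reserved = pages_for(local, n);
	concurrent_writer(shared);
	return reserved == pages_for(local, n);
}
```

Copying once makes the later computation independent of anything user space does afterwards, which is the property the fix relies on.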
    • D
      Merge tag 'wireless-drivers-for-davem-2018-12-19' of... · c6f4075e
      David S. Miller authored
      Merge tag 'wireless-drivers-for-davem-2018-12-19' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers
      
      Kalle Valo says:
      
      ====================
      wireless-drivers fixes for 4.20
      
      Last set of fixes for 4.20. All (except the mt76 fix) of these are
      important fixes to user reported problems and pretty small in size.
      
      rtlwifi
      
      * fix skb leak
      
      mwifiex
      
      * revert a commit from v4.19 due to problems with locking
      
      mt76
      
      * fix a potential NULL dereference
      
      * add entry to MAINTAINERS
      
      iwlwifi
      
      * fix a firmware crash which was a regression introduced in v4.20-rc4
      
      ath10k
      
      * fix a firmware crash with wcn3990 firmware
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • D
      Merge tag 'mac80211-for-davem-2018-12-19' of... · 49ce708b
      David S. Miller authored
      Merge tag 'mac80211-for-davem-2018-12-19' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
      
      Johannes Berg says:
      
      ====================
      Just three fixes:
       * fix a memory leak in an error path
       * fix TXQs in interface teardown
       * free fraglist if we used it internally
         before returning SKB
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 19 Dec, 2018 30 commits
    • R
      ath10k: skip sending quiet mode cmd for WCN3990 · 53884577
      Rakesh Pillai authored
      HL2.0 firmware does not support setting quiet mode.  If the host driver sends
      the quiet mode setting command to the HL2.0 firmware, it crashes with the below
      signature.
      
      fatal error received: err_qdi.c:456:EX:wlan_process:1:WLAN RT:207a:PC=b001b4f0
      
      The quiet mode command support is exposed by the firmware via thermal throttle
      wmi service. Enable ath10k thermal support if thermal throttle wmi service bit
      is set.  10.x firmware versions support this feature by default, but
      unfortunately do not advertise the support via service flags, hence have to
      manually set the service flag in ath10k_core_compat_services().
      
      Tested on QCA988X with 10.2.4.70.9-2. Also tested on WCN3990.
      Co-developed-by: Govind Singh <govinds@codeaurora.org>
      Co-developed-by: Kalle Valo <kvalo@codeaurora.org>
      Signed-off-by: Rakesh Pillai <pillair@codeaurora.org>
      Signed-off-by: Govind Singh <govinds@codeaurora.org>
      Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
    • S
      mac80211: free skb fraglist before freeing the skb · 34b1e0e9
      Sara Sharon authored
      mac80211 uses the frag list to build AMSDU. When freeing
      the skb, it may not be really freed, since someone is still
      holding a reference to it.
      In that case, when TCP skb is being retransmitted, the
      pointer to the frag list is being reused, while the data
      in there is no longer valid.
      Since we will never get frag list from the network stack,
      as mac80211 doesn't advertise the capability, we can safely
      free and nullify it before releasing the SKB.
      Signed-off-by: Sara Sharon <sara.sharon@intel.com>
      Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • J
      nl80211: fix memory leak if validate_pae_over_nl80211() fails · d350a0f4
      Johannes Berg authored
      If validate_pae_over_nl80211() were to fail in nl80211_crypto_settings(),
      we might leak the 'connkeys' allocation. Fix this.
      
      Fixes: 64bf3d4b ("nl80211: Add CONTROL_PORT_OVER_NL80211 attribute")
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • D
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · 3061169a
      David S. Miller authored
      Alexei Starovoitov says:
      
      ====================
      pull-request: bpf 2018-12-18
      
      The following pull-request contains BPF updates for your *net* tree.
      
      The main changes are:
      
      1) promote bpf_perf_event.h to mandatory UAPI header, from Masahiro.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • M
      net/smc: fix TCP fallback socket release · 78abe3d0
      Myungho Jung authored
      clcsock can be released while kernel_accept() references it in the TCP
      listen worker. Also, clcsock needs to be woken up before it is released
      if TCP fallback is used and the clcsock is blocked in accept. Add a lock
      to safely release clcsock and call kernel_sock_shutdown() to wake up
      clcsock from accept in smc_release().
      
      Reported-by: syzbot+0bf2e01269f1274b4b03@syzkaller.appspotmail.com
      Reported-by: syzbot+e3132895630f957306bc@syzkaller.appspotmail.com
      Signed-off-by: Myungho Jung <mhjungk@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • C
      vxge: ensure data0 is initialized when fetching firmware version information · f7db2beb
      Colin Ian King authored
      Currently variable data0 is not being initialized so a garbage value is
      being passed to vxge_hw_vpath_fw_api and this value is being written to
      the rts_access_steer_data0 register.  There are other occurrences where
      data0 is being initialized to zero (e.g. in function
      vxge_hw_upgrade_read_version) so I think it makes sense to ensure data0
      is initialized likewise to 0.
      
      Detected by CoverityScan, CID#140696 ("Uninitialized scalar variable")
      
      Fixes: 8424e00d ("vxge: serialize access to steering control register")
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • J
      xen/netfront: tolerate frags with no data · d81c5054
      Juergen Gross authored
      At least old Xen net backends seem to send frags with no real data
      sometimes. If such a fragment occurs when the frag limit has already
      been reached, the frontend will currently BUG, even though this
      situation is easily recoverable.
      
      Modify the BUG_ON() condition accordingly.
      Tested-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • K
      net: phy: Fix the issue that netif always links up after resuming · 8742beb5
      Kunihiko Hayashi authored
      Even though the link is down before entering hibernation,
      there is an issue that the network interface always links up after resuming
      from hibernation.
      
      If the link is still down before enabling the network interface,
      and after resuming from hibernation, the phydev->state is forcibly set
      to PHY_UP in mdio_bus_phy_restore(), and the link becomes up.
      
      In suspend sequence, only if the PHY is attached, mdio_bus_phy_suspend()
      calls phy_stop_machine(), and mdio_bus_phy_resume() calls
      phy_start_machine().
      In resume sequence, it's enough to do the same as mdio_bus_phy_resume()
      because the state has been preserved.
      
      This patch fixes the issue by calling phy_start_machine() in
      mdio_bus_phy_restore() in the same way as mdio_bus_phy_resume().
      
      Fixes: bc87922f ("phy: Move PHY PM operations into phy_device")
      Suggested-by: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • J
      lan78xx: Resolve issue with changing MAC address · 15515aaa
      Jason Martinsen authored
      Current state for the lan78xx driver does not allow for changing the
      MAC address of the interface, without either removing the module (if
      you compiled it that way) or rebooting the machine.  If you attempt to
      change the MAC address, ifconfig will show the new address, however,
      the system/interface will not respond to any traffic using that
      configuration.  A few short-term options to work around this are to
      unload the module and reload it with the new MAC address, change the
      interface to "promisc", or reboot with the correct configuration to
      change the MAC.
      
      This patch enables the ability to change the MAC address via fairly normal means...
      ifdown <interface>
      modify entry in /etc/network/interfaces OR a similar method
      ifup <interface>
      Then test via any network communication, such as ICMP requests to gateway.
      
      My only test platform for this patch has been a Raspberry Pi Model 3B+.
      Signed-off-by: Jason Martinsen <jasonmartinsen@msn.com>
      
      -----
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • B
      lan743x: Expand phy search for LAN7431 · 0db7d253
      Bryan Whitehead authored
      The LAN7431 uses an external phy, and it can be found anywhere in
      the phy address space. This patch uses phy address 1 for the LAN7430
      only, and searches all addresses otherwise.
      Signed-off-by: Bryan Whitehead <Bryan.Whitehead@microchip.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • D
      Merge branch 'vxlan-Various-fixes' · 59fc137e
      David S. Miller authored
      Petr Machata says:
      
      ====================
      vxlan: Various fixes
      
      This patch set contains three fixes for the vxlan driver.
      
      Patch #1 fixes handling of offload mark on replaced VXLAN FDB entries. A
      way to trigger this is to replace the FDB entry with one that can not be
      offloaded. A future patch set should make it possible to veto such FDB
      changes. However the FDB might still fail to be offloaded due to another
      issue, and the offload mark should reflect that.
      
      Patch #2 fixes problems in __vxlan_dev_create() when a call to
      rtnl_configure_link() fails. These failures would be tricky to hit on a
      real system, the most likely vector is through an error in vxlan_open().
      However, with the abovementioned vetoing patchset, vetoing the created
      entry would trigger the same problems (and be easier to reproduce).
      
      Patch #3 fixes a problem in vxlan_changelink(). In situations where the
      default remote configured in the FDB table (if any) does not exactly
      match the remote address configured at the VXLAN device, changing the
      remote address breaks the default FDB entry. Patch #4 is then a self
      test for this issue.
      
      v3:
      - Patch #2:
          - Reuse the same errout block for both cleanup paths. Use a bool to
            decide whether the unregister_netdevice() call should be made.
      
      v2:
      - Drop former patch #3
      - Patch #2:
          - Delete the default entry before calling unregister_netdevice(). That
            takes care of former patch #3, hence tweak the commit message to
            mention that problem as well.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • P
      selftests: net: Add test_vxlan_fdb_changelink.sh · 55cbe079
      Petr Machata authored
      Add a test to exercise the fix from the previous patch.
      Signed-off-by: Petr Machata <petrm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • P
      vxlan: changelink: Fix handling of default remotes · ce5e098f
      Petr Machata authored
      Default remotes are stored as FDB entries with an Ethernet address of
      00:00:00:00:00:00. When a request is made to change a remote address of
      a VXLAN device, vxlan_changelink() first deletes the existing default
      remote, and then creates a new FDB entry.
      
      This works well as long as the list of default remotes matches exactly
      the configuration of a VXLAN remote address. Thus when the VXLAN device
      has a remote of X, there should be exactly one default remote FDB entry
      X. If the VXLAN device has no remote address, there should be no such
      entry.
      
      Besides using "ip link set", it is possible to manipulate the list of
      default remotes by using "bridge fdb". It is therefore easy to break
      the above condition. Under such circumstances, the __vxlan_fdb_delete()
      call doesn't delete the FDB entry itself, but just one remote. The
      following vxlan_fdb_create() then creates a new FDB entry, leading to a
      situation where two entries exist for the address 00:00:00:00:00:00,
      each with a different subset of default remotes.
      
      An even more obvious breakage rooted in the same cause can be observed
      when a remote address is configured for a VXLAN device that did not have
      one before. In that case vxlan_changelink() doesn't remove any remote,
      and just creates a new FDB entry for the new address:
      
      $ ip link add name vx up type vxlan id 2000 dstport 4789
      $ bridge fdb ap dev vx 00:00:00:00:00:00 dst 192.0.2.20 self permanent
      $ bridge fdb ap dev vx 00:00:00:00:00:00 dst 192.0.2.30 self permanent
      $ ip link set dev vx type vxlan remote 192.0.2.30
      $ bridge fdb sh dev vx | grep 00:00:00:00:00:00
      00:00:00:00:00:00 dst 192.0.2.30 self permanent <- new entry, 1 rdst
      00:00:00:00:00:00 dst 192.0.2.20 self permanent <- orig. entry, 2 rdsts
      00:00:00:00:00:00 dst 192.0.2.30 self permanent
      
      To fix this, instead of calling vxlan_fdb_create() directly, defer to
      vxlan_fdb_update(). That has logic to handle the duplicates properly.
      Additionally, it also handles notifications, so drop that call from
      changelink as well.
      
      Fixes: 0241b836 ("vxlan: fix default fdb entry netlink notify ordering during netdev create")
      Signed-off-by: Petr Machata <petrm@mellanox.com>
      Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • P
      vxlan: Fix error path in __vxlan_dev_create() · 6db92468
      Petr Machata authored
      When a failure occurs in rtnl_configure_link(), the current code
      calls unregister_netdevice() to roll back the earlier call to
      register_netdevice(), and jumps to errout, which calls
      vxlan_fdb_destroy().
      
      However, unregister_netdevice() transitively calls ndo_uninit, which is
      vxlan_uninit(), and that already takes care of deleting the default FDB
      entry by calling vxlan_fdb_delete_default(). Since the entry added
      earlier in __vxlan_dev_create() is exactly the default entry, the
      cleanup code in the errout block always leads to a double free and thus
      a panic.
      
      Besides, since vxlan_fdb_delete_default() always destroys the FDB entry
      with notification enabled, the deletion of the default entry is notified
      even before the addition was notified.
      
      Instead, move the unregister_netdevice() call after the manual destroy,
      which solves both problems.
      
      Fixes: 0241b836 ("vxlan: fix default fdb entry netlink notify ordering during netdev create")
      Signed-off-by: Petr Machata <petrm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
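The ordering problem in this error path can be modeled with a small simulation. This is a sketch under assumed names (`struct dev`, `create_buggy`, `create_fixed` are hypothetical), where a counter records a free of an already-freed entry; it is not the vxlan driver code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

struct dev {
	bool registered;
	int  fdb_entries;    /* live default FDB entries */
	int  double_frees;   /* frees of an entry that was already gone */
};

static void fdb_destroy(struct dev *d)
{
	if (d->fdb_entries == 0)
		d->double_frees++;
	else
		d->fdb_entries--;
}

/* Unregistering runs the uninit hook, which deletes the default entry
 * if it is still present. */
static void unregister_dev(struct dev *d)
{
	if (d->fdb_entries > 0)
		fdb_destroy(d);
	d->registered = false;
}

/* Buggy order: unregister first, then destroy the same entry in errout. */
static int create_buggy(struct dev *d, bool configure_fails)
{
	d->registered = true;
	d->fdb_entries = 1;
	if (configure_fails) {
		unregister_dev(d);
		goto errout;
	}
	return 0;
errout:
	fdb_destroy(d);
	return -EINVAL;
}

/* Fixed order: destroy manually first, then unregister if needed. */
static int create_fixed(struct dev *d, bool configure_fails)
{
	bool unregister = false;

	d->registered = true;
	d->fdb_entries = 1;
	if (configure_fails) {
		unregister = true;
		goto errout;
	}
	return 0;
errout:
	fdb_destroy(d);
	if (unregister)
		unregister_dev(d);
	return -EINVAL;
}
```

Using one errout block with a bool to decide whether to unregister mirrors the approach mentioned in the v3 notes of the cover letter.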
    • P
      vxlan: Unmark offloaded bit on replaced FDB entries · 6ad0b5a4
      Petr Machata authored
      When rdst of an offloaded FDB entry is replaced, it certainly isn't
      offloaded anymore. Drivers are notified about such replacements, and can
      re-mark the entry as offloaded again if they so wish. However until a
      driver does so explicitly, assume a replaced FDB entry is not offloaded.
      
      Note that replaces coming via vxlan_fdb_external_learn_add() are always
      immediately followed by an explicit offload marking.
      
      Fixes: 0efe1173 ("vxlan: Support marking RDSTs as offloaded")
      Signed-off-by: Petr Machata <petrm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • D
      Merge branch 'macb-DMA-race-fixes' · a9d6d897
      David S. Miller authored
      Anssi Hannula says:
      
      ====================
      net: macb: DMA race condition fixes
      
      Here are a couple of race condition fixes for the macb driver. The first
      two are for issues observed at runtime on real HW.
      
      v2:
      - added received Tested-bys and Acked-bys to the first two patches
      - in patch 3/3, moved the timestamp protection barrier closer to the
        timestamp reads
      - in patch 3/3, removed unnecessary move of the addr assignment in
        gem_rx() to keep the patch minimal for maximum clarity
      - in patch 3/3, clarified commit message and comments
      
      The 3/3 is the same one I improperly sent last week as a standalone
      patch.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • A
      net: macb: add missing barriers when reading descriptors · 6e0af298
      Anssi Hannula authored
      When reading buffer descriptors on RX or on TX completion, an
      RX_USED/TX_USED bit is checked first to ensure that the descriptors have
      been populated, i.e. the ownership has been transferred. However, there
      are no memory barriers to ensure that the data protected by the
      RX_USED/TX_USED bit is up-to-date with respect to that bit.
      
      Specifically:
      
      - TX timestamp descriptors may be loaded before ctrl is loaded for the
        TX_USED check, which is racy as the descriptors may be updated between
        the loads, causing old timestamp descriptor data to be used.
      
      - RX ctrl may be loaded before addr is loaded for the RX_USED check,
        which is racy as a new frame may be written between the loads, causing
        old ctrl descriptor data to be used.
        This issue exists for both macb_rx() and gem_rx() variants.
      
      Fix the races by adding DMA read memory barriers on those paths and
      reordering the reads in macb_rx().
      
      I have not observed any actual problems in practice caused by these
      being missing, though.
      
      Tested on a ZynqMP based system.
      
      Fixes: 89e5785f ("[PATCH] Atmel MACB ethernet driver")
      Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi>
      Cc: Nicolas Ferre <nicolas.ferre@microchip.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
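The "check the ownership bit, then read the protected data" ordering can be sketched in userspace C11. This is only an analogy: the kernel uses DMA barriers such as dma_rmb(), whereas `atomic_thread_fence()` orders CPU threads and is used here as a stand-in. The descriptor layout and names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define DESC_USED 0x1u   /* hypothetical ownership bit */

struct desc {
	uint32_t addr;   /* carries the ownership bit */
	uint32_t ctrl;   /* payload protected by the ownership bit */
};

/* Returns true and fills *ctrl only when the descriptor was handed back. */
static bool read_completed(const volatile struct desc *d, uint32_t *ctrl)
{
	if (!(d->addr & DESC_USED))
		return false;
	/* Stand-in for a DMA read barrier: keep the ctrl load after the
	 * ownership check so stale ctrl data is not used. */
	atomic_thread_fence(memory_order_acquire);
	*ctrl = d->ctrl;
	return true;
}
```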
    • A
      net: macb: fix dropped RX frames due to a race · 8159ecab
      Anssi Hannula authored
      Bit RX_USED set to 0 in the address field allows the controller to write
      data to the receive buffer descriptor.
      
      The driver does not ensure the ctrl field is ready (cleared) when the
      controller sees the RX_USED=0 written by the driver. The ctrl field might
      only be cleared after the controller has already updated it according to
      a newly received frame, causing the frame to be discarded in gem_rx() due
      to unexpected ctrl field contents.
      
      A message is logged when the above scenario occurs:
      
        macb ff0b0000.ethernet eth0: not whole frame pointed by descriptor
      
      Fix the issue by ensuring that when the controller sees RX_USED=0 the
      ctrl field is already cleared.
      
      This issue was observed on a ZynqMP based system.
      
      Fixes: 4df95131 ("net/macb: change RX path for GEM")
      Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi>
      Tested-by: Claudiu Beznea <claudiu.beznea@microchip.com>
      Cc: Nicolas Ferre <nicolas.ferre@microchip.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • A
      net: macb: fix random memory corruption on RX with 64-bit DMA · e100a897
      Anssi Hannula authored
      64-bit DMA addresses are split in upper and lower halves that are
      written in separate fields on GEM. For RX, bit 0 of the address is used
      as the ownership bit (RX_USED). When the RX_USED bit is unset the
      controller is allowed to write data to the buffer.
      
      The driver does not guarantee that the controller already sees the upper
      half when the RX_USED bit is cleared, possibly resulting in the
      controller writing an incoming frame to an address with an incorrect
      upper half and therefore possibly corrupting unrelated system memory.
      
      Fix that by adding the necessary DMA memory barrier between the writes.
      
      This corruption was observed on a ZynqMP based system.
      
      Fixes: fff8019a ("net: macb: Add 64 bit addressing support for GEM")
      Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi>
      Acked-by: Harini Katakam <harini.katakam@xilinx.com>
      Tested-by: Claudiu Beznea <claudiu.beznea@microchip.com>
      Cc: Nicolas Ferre <nicolas.ferre@microchip.com>
      Cc: Michal Simek <michal.simek@xilinx.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
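The write ordering described in this commit (upper half first, barrier, then the lower half that carries the ownership bit) can be sketched in userspace C11. This is an analogy only: in the kernel the barrier would be a DMA write barrier such as dma_wmb(), while `atomic_thread_fence()` here orders CPU threads, not device-visible writes. The descriptor layout and function name are assumptions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define RX_USED 0x1u   /* bit 0 of the low address word is the ownership bit */

struct rx_desc {
	uint32_t addr_lo;   /* low 32 bits of the buffer address + RX_USED */
	uint32_t addr_hi;   /* high 32 bits of the buffer address */
};

/* Hand a buffer to the (simulated) controller. */
static void give_to_device(volatile struct rx_desc *d, uint64_t buf)
{
	d->addr_hi = (uint32_t)(buf >> 32);
	/* Stand-in for a DMA write barrier: the high half must be visible
	 * before RX_USED is cleared by the low-half write below. */
	atomic_thread_fence(memory_order_release);
	d->addr_lo = (uint32_t)buf & ~RX_USED;
}
```

Writing the halves in the opposite order, or without a barrier between them, would allow the controller to see RX_USED cleared while the upper half still holds an old value.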
    • D
      net: Use __kernel_clockid_t in uapi net_tstamp.h · e2c4cf7f
      Davide Caratti authored
      Herton reports the following error when building a userspace program that
      includes net_tstamp.h:
      
       In file included from foo.c:2:
       /usr/include/linux/net_tstamp.h:158:2: error: unknown type name
       ‘clockid_t’
         clockid_t clockid; /* reference clockid */
         ^~~~~~~~~
      
      Fix it by using __kernel_clockid_t in place of clockid_t.
      
      Fixes: 80b14dee ("net: Add a new socket option for a future transmit time.")
      Cc: Timothy Redaelli <tredaelli@redhat.com>
      Reported-by: Herton R. Krzesinski <herton@redhat.com>
      Signed-off-by: Davide Caratti <dcaratti@redhat.com>
      Tested-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
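A minimal sketch of why the `__kernel_` type helps: it comes from the kernel's own UAPI type headers, so a header using it does not depend on the C library's `<time.h>`. The struct name below is hypothetical, and the example assumes a Linux system with the kernel UAPI headers installed.

```c
#include <assert.h>
#include <linux/types.h>   /* __u32 and __kernel_clockid_t; no <time.h> needed */

/* Hypothetical UAPI-style struct, modelled on the field in the error message. */
struct example_txtime {
	__kernel_clockid_t clockid;   /* reference clockid */
	__u32              flags;
};
```

The same struct declared with `clockid_t` would fail to compile unless the including program had already pulled in a header that defines that type.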
    • C
      net: macb: restart tx after tx used bit read · 42983885
      Claudiu Beznea 提交于
      On some platforms (currently detected only on SAMA5D4) TX might get stuck
      even though packets are still present in DMA memory and TX start was
      issued for them. This happens due to a race condition between the MACB
      driver updating the next TX buffer descriptor to be used and the IP
      reading the same descriptor. In such a case, the "TX USED BIT READ"
      interrupt is asserted. The GEM/MACB user guide specifies that if a
      "TX USED BIT READ" interrupt is asserted, TX must be restarted. Restart TX
      if the used bit is read and packets are present in the software TX queue.
      Packets are removed from the software TX queue if TX was successful for
      them (see macb_tx_interrupt()).
      Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
      Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      42983885
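The restart rule in this commit message (restart TX only when the used-bit-read interrupt fired and software still has packets queued) can be modeled roughly as follows. The status bit value, the `tx_queue` structure, and `tx_restart_if_needed()` are hypothetical names chosen for illustration. The real driver works on hardware registers and its own queue structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical interrupt status bit for "TX USED BIT READ". */
#define ISR_TX_USED_BIT_READ 0x8u

/* Hypothetical software view of a TX ring. */
struct tx_queue {
    unsigned int head;     /* next slot software will fill */
    unsigned int tail;     /* oldest slot not yet completed */
    unsigned int restarts; /* counts simulated "start TX" writes */
};

/* Returns true if TX was restarted. */
static bool tx_restart_if_needed(struct tx_queue *q, uint32_t isr_status)
{
    assert(q != NULL);

    /* Nothing to do unless the controller reported reading a used bit. */
    if (!(isr_status & ISR_TX_USED_BIT_READ))
        return false;

    /* Empty software queue: every packet already completed. */
    if (q->head == q->tail)
        return false;

    /* Packets are still pending, so kick the transmitter again.
     * In a real driver this would be a register write (assumption). */
    q->restarts++;
    return true;
}
```

Checking the software queue first avoids restarting the transmitter when the interrupt merely reflects the controller reaching the end of the pending work.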
    • D
      net: stmmac: Fix an error code in probe() · b26322d2
      Dan Carpenter committed
      The function should return an error if create_singlethread_workqueue()
      fails.
      
      Fixes: 34877a15 ("net: stmmac: Rework and fix TX Timeout code")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b26322d2
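The class of bug fixed here (an allocation fails, the code jumps to the error path, but the return value is still 0) looks roughly like the sketch below. Everything in it is a stand-in: `fake_create_workqueue()` only imitates an allocator that returns NULL on failure, and the choice of `-ENOMEM` is an assumption, since the commit message only says that an error should be returned.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct fake_priv {
    void *wq; /* hypothetical workqueue handle */
};

/* Hypothetical stand-in for an allocator that returns NULL on failure. */
static void *fake_create_workqueue(bool fail)
{
    return fail ? NULL : malloc(1);
}

static int fake_probe(struct fake_priv *priv, bool fail_alloc)
{
    int ret = 0;

    assert(priv != NULL);

    priv->wq = fake_create_workqueue(fail_alloc);
    if (!priv->wq) {
        /* Without this assignment the function would unwind and
         * still report success to the caller. */
        ret = -ENOMEM; /* assumed error code */
        goto error_wq;
    }

    return 0;

error_wq:
    /* Undo earlier setup steps here (none in this sketch). */
    return ret;
}
```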
    • C
      tipc: check group dests after tipc_wait_for_cond() · 3c6306d4
      Cong Wang committed
      Similar to commit 143ece65 ("tipc: check tsk->group in tipc_wait_for_cond()")
      we have to reload grp->dests too after we re-take the sock lock.
      This means we need to move the dsts check after tipc_wait_for_cond()
      too.
      
      Fixes: 75da2163 ("tipc: introduce communication groups")
      Reported-and-tested-by: syzbot+99f20222fc5018d2b97a@syzkaller.appspotmail.com
      Cc: Ying Xue <ying.xue@windriver.com>
      Cc: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Acked-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3c6306d4
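The underlying rule is that any state read before a wait that drops the lock must be read again afterwards. The single-threaded model below imitates that: the `while_unlocked` hook plays the role of another thread running while the lock is released. All names (`fake_sock`, `fake_group`, `fake_wait_for_cond()`, `fake_send_group()`) and the `-1` return value are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct fake_group {
    int dests; /* number of known destinations */
};

struct fake_sock {
    struct fake_group *group;
    /* Simulates what another thread does while the lock is dropped. */
    void (*while_unlocked)(struct fake_sock *sk);
};

/* Models a wait that releases and re-takes the socket lock. */
static void fake_wait_for_cond(struct fake_sock *sk)
{
    /* lock released here */
    if (sk->while_unlocked)
        sk->while_unlocked(sk);
    /* lock re-taken here */
}

/* Example "other thread": removes every destination. */
static void drop_all_dests(struct fake_sock *sk)
{
    sk->group->dests = 0;
}

/* Returns the destination count, or -1 if there is nothing to send to. */
static int fake_send_group(struct fake_sock *sk)
{
    struct fake_group *grp;

    assert(sk != NULL);

    fake_wait_for_cond(sk);

    /* Load and check the group state only after the wait, because it
     * may have changed while the lock was not held. */
    grp = sk->group;
    if (!grp || grp->dests == 0)
        return -1;

    return grp->dests;
}
```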
    • D
      qed: Fix an error code qed_ll2_start_xmit() · f07d4276
      Dan Carpenter committed
      We accidentally deleted the code to set "rc = -ENOMEM;" and this patch
      adds it back.
      
      Fixes: d2201a21 ("qed: No need for LL2 frags indication")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f07d4276
    • A
      net: mvpp2: 10G modes aren't supported on all ports · 00679177
      Antoine Tenart committed
      The mvpp2_phylink_validate() function sets all modes that are
      supported by a given PPv2 port. A recent change made all ports
      advertise support for 10G modes in certain cases. This is not true,
      as only port #0 can do so. This patch fixes it.
      
      Fixes: 01b3fd5a ("net: mvpp2: fix detection of 10G SFP modules")
      Cc: Baruch Siach <baruch@tkos.co.il>
      Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      00679177
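A simplified picture of the rule stated in this commit message, that only port #0 may advertise 10G modes. The mode flags and `fake_validate()` are hypothetical. The real mvpp2_phylink_validate() works on phylink link-mode bitmaps and considers more conditions than the port number.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mode flags for illustration. */
#define MODE_1G  (1u << 0)
#define MODE_10G (1u << 1)

/* Returns the set of modes a port may advertise. */
static uint32_t fake_validate(unsigned int port_id)
{
    uint32_t modes = MODE_1G; /* assumed to be available on every port */

    /* Only port #0 can do 10G, so gate the flag on the port number. */
    if (port_id == 0)
        modes |= MODE_10G;

    return modes;
}
```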
    • J
      VSOCK: Send reset control packet when socket is partially bound · a915b982
      Jorgen Hansen committed
      If a server side socket is bound to an address, but not in the listening
      state yet, incoming connection requests should receive a reset control
      packet in response. However, the function used to send the reset
      silently drops the reset packet if the sending socket isn't bound
      to a remote address (as is the case for a bound socket not yet in
      the listening state). This change fixes this by using the src
      of the incoming packet as destination for the reset packet in
      this case.
      
      Fixes: d021c344 ("VSOCK: Introduce VM Sockets")
      Reviewed-by: Adit Ranadive <aditr@vmware.com>
      Reviewed-by: Vishnu Dasa <vdasa@vmware.com>
      Signed-off-by: Jorgen Hansen <jhansen@vmware.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a915b982
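The destination choice described in this commit message can be reduced to a small decision: use the socket's remote address when it has one, otherwise fall back to the source of the incoming packet. The types and `reset_destination()` below are hypothetical and do not mirror the VSOCK structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct fake_addr {
    uint32_t cid;
    uint32_t port;
};

struct fake_vsock {
    struct fake_addr remote;
    bool remote_bound; /* false for a bound socket that is not yet listening */
};

struct fake_pkt {
    struct fake_addr src;
    struct fake_addr dst;
};

/* Picks where a reset control packet should be sent. */
static struct fake_addr reset_destination(const struct fake_vsock *sk,
                                          const struct fake_pkt *in)
{
    assert(sk != NULL && in != NULL);

    /* With no remote address the reset would have nowhere to go and
     * be dropped, so reply to whoever sent the incoming packet. */
    return sk->remote_bound ? sk->remote : in->src;
}
```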
    • D
      Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec · fde9cd69
      David S. Miller committed
      Steffen Klassert says:
      
      ====================
      pull request (net): ipsec 2018-12-18
      
      1) Fix error return code in xfrm_output_one()
         when no dst_entry is attached to the skb.
         From Wei Yongjun.
      
      2) The xfrm state hash bucket count reported to
         userspace is off by one. Fix from Benjamin Poirier.
      
      3) Fix NULL pointer dereference in xfrm_input when
         skb_dst_force clears the dst_entry.
      
      4) Fix freeing of xfrm states on acquire. We use a
         dedicated slab cache for the xfrm states now,
         so free it properly with kmem_cache_free.
         From Mathias Krause.
      
      Please pull or let me know if there are problems.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      fde9cd69
    • D
      Merge branch 'mlxsw-VXLAN-and-firmware-flashing-fixes' · 8d013b79
      David S. Miller committed
      Ido Schimmel says:
      
      ====================
      mlxsw: VXLAN and firmware flashing fixes
      
      Patch #1 fixes firmware flashing failures by increasing the time period
      after which the driver fails the transaction with the firmware. The
      problem is explained in detail in the commit message.
      
      Patch #2 adds a missing trap for decapsulated ARP packets. It is
      necessary for VXLAN routing to work.
      
      Patch #3 fixes a memory leak during driver reload caused by NULLing a
      pointer before kfree().
      
      Please consider patch #1 for 4.19.y
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8d013b79
    • I
      mlxsw: spectrum_nve: Fix memory leak upon driver reload · 5edb7e8b
      Ido Schimmel committed
      The pointer was NULLed before freeing the memory, resulting in a memory
      leak. Trace from kmemleak:
      
      unreferenced object 0xffff88820ae36528 (size 512):
        comm "devlink", pid 5374, jiffies 4295354033 (age 10829.296s)
        hex dump (first 32 bytes):
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        backtrace:
          [<00000000a43f5195>] kmem_cache_alloc_trace+0x1be/0x330
          [<00000000312f8140>] mlxsw_sp_nve_init+0xcb/0x1ae0
          [<0000000009201d22>] mlxsw_sp_init+0x1382/0x2690
          [<000000007227d877>] mlxsw_sp1_init+0x1b5/0x260
          [<000000004a16feec>] __mlxsw_core_bus_device_register+0x776/0x1360
          [<0000000070ab954c>] mlxsw_devlink_core_bus_device_reload+0x129/0x220
          [<00000000432313d5>] devlink_nl_cmd_reload+0x119/0x1e0
          [<000000003821a06b>] genl_family_rcv_msg+0x813/0x1150
          [<00000000d54d04c0>] genl_rcv_msg+0xd1/0x180
          [<0000000040543d12>] netlink_rcv_skb+0x152/0x3c0
          [<00000000efc4eae8>] genl_rcv+0x2d/0x40
          [<00000000ea645603>] netlink_unicast+0x52f/0x740
          [<00000000641fca1a>] netlink_sendmsg+0x9c7/0xf50
          [<00000000fed4a4b8>] sock_sendmsg+0xbe/0x120
          [<00000000d85795a9>] __sys_sendto+0x397/0x620
          [<00000000c5f84622>] __x64_sys_sendto+0xe6/0x1a0
      
      Fixes: 6e6030bd ("mlxsw: spectrum_nve: Implement common NVE core")
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: Petr Machata <petrm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5edb7e8b
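Why NULLing a pointer before freeing it leaks memory can be shown with a counting allocator. This is a userspace sketch with hypothetical names (`fake_dev`, `fini_buggy()`, `fini_fixed()`). `counted_free()` imitates the way freeing a NULL pointer is a no-op.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

static int live_allocs; /* allocations not yet freed */

static void *counted_alloc(size_t size)
{
    void *p = malloc(size);

    if (p)
        live_allocs++;
    return p;
}

static void counted_free(void *p)
{
    if (!p)
        return; /* freeing NULL does nothing */
    live_allocs--;
    free(p);
}

struct fake_dev {
    void *nve; /* hypothetical per-device state */
};

/* Buggy order. Returns the orphaned pointer only so that a demo
 * can release it afterwards. */
static void *fini_buggy(struct fake_dev *dev)
{
    void *orphan;

    assert(dev != NULL);
    orphan = dev->nve;
    dev->nve = NULL;
    counted_free(dev->nve); /* always frees NULL: the memory leaks */
    return orphan;
}

/* Fixed order: free first, then clear the pointer. */
static void fini_fixed(struct fake_dev *dev)
{
    assert(dev != NULL);
    counted_free(dev->nve);
    dev->nve = NULL;
}
```

After `fini_buggy()` the allocation counter stays at one even though the device no longer references the memory, which matches the "unreferenced object" report style of the kmemleak trace.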
    • I
      mlxsw: spectrum: Add trap for decapsulated ARP packets · 5d504391
      Ido Schimmel committed
      After a packet is decapsulated, it is classified to the relevant FID
      based on its VNI and undergoes L2 forwarding.
      
      Unlike regular (non-encapsulated) ARP packets, Spectrum does not trap
      decapsulated ARP packets during L2 forwarding and instead can only trap
      such packets in the underlay router during decapsulation.
      
      Add this missing packet trap, which is required for VXLAN routing when
      the MAC of the target host is not known.
      
      Fixes: b02597d5 ("mlxsw: spectrum: Add NVE packet traps")
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: Petr Machata <petrm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5d504391