1. 31 May, 2019 19 commits
    • net: correct zerocopy refcnt with udp MSG_MORE · 100f6d8e
      Willem de Bruijn committed
      TCP zerocopy takes a uarg reference for every skb, plus one for the
      tcp_sendmsg_locked datapath temporarily, to avoid reaching refcnt zero
      as it builds, sends and frees skbs inside its inner loop.
      
      UDP and RAW zerocopy do not send inside the inner loop so do not need
      the extra sock_zerocopy_get + sock_zerocopy_put pair. Commit
      52900d22288ed ("udp: elide zerocopy operation in hot path") introduced
      extra_uref to pass the initial reference taken in sock_zerocopy_alloc
      to the first generated skb.
      
      But, sock_zerocopy_realloc takes this extra reference at the start of
      every call. With MSG_MORE, no new skb may be generated to attach the
      extra_uref to, so refcnt is incorrectly 2 with only one skb.
      
      Do not take the extra ref if uarg && !tcp, which implies MSG_MORE.
      Update extra_uref accordingly.
      
      This conditional assignment triggers a false-positive "may be used
      uninitialized" warning, so extra_uref has to be initialized at its
      definition.
      
      Changes v1->v2: fix typo in Fixes SHA1
      
      Fixes: 52900d22 ("udp: elide zerocopy operation in hot path")
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Diagnosed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ethtool: Check for vlan etype or vlan tci when parsing flow_rule · b73484b2
      Maxime Chevallier committed
      When parsing an ethtool flow spec to build a flow_rule, the code checks
      if both the vlan etype and the vlan tci are specified by the user to add
      a FLOW_DISSECTOR_KEY_VLAN match.
      
      However, when the user only specified a vlan etype or a vlan tci, this
      check silently ignores these parameters.
      
      For example, the following rule :
      
      ethtool -N eth0 flow-type udp4 vlan 0x0010 action -1 loc 0
      
      will result in no error being issued, but the equivalent rule will be
      created and passed to the NIC driver :
      
      ethtool -N eth0 flow-type udp4 action -1 loc 0
      
      In the end, neither the NIC driver using the rule nor the end user have
      a way to know that these keys were dropped along the way, or that
      incorrect parameters were entered.
      
      This kind of check should be left to either the driver, or the ethtool
      flow spec layer.
      
      This commit makes it so that ethtool parameters are forwarded as-is to
      the NIC driver.
      
      Since none of the users of ethtool_rx_flow_rule_create are using the
      VLAN dissector, I don't think this qualifies as a regression.
      
      Fixes: eca4205f ("ethtool: add ethtool_rx_flow_spec to flow_rule structure translator")
      Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
      Acked-by: Pablo Neira Ayuso <pablo@gnumonks.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
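The change in condition described in the ethtool commit above can be sketched as follows. This is a minimal illustration, not the kernel code: `struct vlan_ext` and both helper names are hypothetical stand-ins, and only the "both fields" versus "either field" logic is taken from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the VLAN part of an ethtool flow spec. */
struct vlan_ext {
    uint16_t vlan_etype;
    uint16_t vlan_tci;
};

/* Behaviour before the fix, per the commit message: both fields required. */
static bool wants_vlan_match_old(const struct vlan_ext *ext)
{
    return ext->vlan_etype && ext->vlan_tci;
}

/* Behaviour after the fix: either field alone adds the VLAN match. */
static bool wants_vlan_match_new(const struct vlan_ext *ext)
{
    return ext->vlan_etype || ext->vlan_tci;
}
```

With only a TCI set, as in the `vlan 0x0010` rule quoted in the commit, the old check yields no VLAN match and the new one does.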
    • net: don't clear sock->sk early to avoid trouble in strparser · 2b81f816
      Jakub Kicinski committed
      af_inet sets sock->sk to NULL which trips strparser over:
      
      BUG: kernel NULL pointer dereference, address: 0000000000000012
      PGD 0 P4D 0
      Oops: 0000 [#1] SMP PTI
      CPU: 7 PID: 0 Comm: swapper/7 Not tainted 5.2.0-rc1-00139-g14629453a6d3 #21
      RIP: 0010:tcp_peek_len+0x10/0x60
      RSP: 0018:ffffc02e41c54b98 EFLAGS: 00010246
      RAX: 0000000000000000 RBX: ffff9cf924c4e030 RCX: 0000000000000051
      RDX: 0000000000000000 RSI: 000000000000000c RDI: ffff9cf97128f480
      RBP: ffff9cf9365e0300 R08: ffff9cf94fe7d2c0 R09: 0000000000000000
      R10: 000000000000036b R11: ffff9cf939735e00 R12: ffff9cf91ad9ae40
      R13: ffff9cf924c4e000 R14: ffff9cf9a8fcbaae R15: 0000000000000020
      FS: 0000000000000000(0000) GS:ffff9cf9af7c0000(0000) knlGS:0000000000000000
      CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000000000012 CR3: 000000013920a003 CR4: 00000000003606e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
       Call Trace:
       <IRQ>
       strp_data_ready+0x48/0x90
       tls_data_ready+0x22/0xd0 [tls]
       tcp_rcv_established+0x569/0x620
       tcp_v4_do_rcv+0x127/0x1e0
       tcp_v4_rcv+0xad7/0xbf0
       ip_protocol_deliver_rcu+0x2c/0x1c0
       ip_local_deliver_finish+0x41/0x50
       ip_local_deliver+0x6b/0xe0
       ? ip_protocol_deliver_rcu+0x1c0/0x1c0
       ip_rcv+0x52/0xd0
       ? ip_rcv_finish_core.isra.20+0x380/0x380
       __netif_receive_skb_one_core+0x7e/0x90
       netif_receive_skb_internal+0x42/0xf0
       napi_gro_receive+0xed/0x150
       nfp_net_poll+0x7a2/0xd30 [nfp]
       ? kmem_cache_free_bulk+0x286/0x310
       net_rx_action+0x149/0x3b0
       __do_softirq+0xe3/0x30a
       ? handle_irq_event_percpu+0x6a/0x80
       irq_exit+0xe8/0xf0
       do_IRQ+0x85/0xd0
       common_interrupt+0xf/0xf
       </IRQ>
      RIP: 0010:cpuidle_enter_state+0xbc/0x450
      
      To avoid this issue set sock->sk after sk_prot->close.
      My grepping and testing did not discover any code which
      would depend on the current behaviour.
      
      Fixes: c46234eb ("tls: RX path for ktls")
      Reported-by: David Beckett <david.beckett@netronome.com>
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net-gro: fix use-after-free read in napi_gro_frags() · a4270d67
      Eric Dumazet committed
      If a network driver provides to napi_gro_frags() an
      skb with a page fragment of exactly 14 bytes, the call
      to gro_pull_from_frag0() will 'consume' the fragment
      by calling skb_frag_unref(skb, 0), and the page might
      be freed and reused.
      
      Reading eth->h_proto at the end of napi_frags_skb() might
      read mangled data, or crash under specific debugging features.
      
      BUG: KASAN: use-after-free in napi_frags_skb net/core/dev.c:5833 [inline]
      BUG: KASAN: use-after-free in napi_gro_frags+0xc6f/0xd10 net/core/dev.c:5841
      Read of size 2 at addr ffff88809366840c by task syz-executor599/8957
      
      CPU: 1 PID: 8957 Comm: syz-executor599 Not tainted 5.2.0-rc1+ #32
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x172/0x1f0 lib/dump_stack.c:113
       print_address_description.cold+0x7c/0x20d mm/kasan/report.c:188
       __kasan_report.cold+0x1b/0x40 mm/kasan/report.c:317
       kasan_report+0x12/0x20 mm/kasan/common.c:614
       __asan_report_load_n_noabort+0xf/0x20 mm/kasan/generic_report.c:142
       napi_frags_skb net/core/dev.c:5833 [inline]
       napi_gro_frags+0xc6f/0xd10 net/core/dev.c:5841
       tun_get_user+0x2f3c/0x3ff0 drivers/net/tun.c:1991
       tun_chr_write_iter+0xbd/0x156 drivers/net/tun.c:2037
       call_write_iter include/linux/fs.h:1872 [inline]
       do_iter_readv_writev+0x5f8/0x8f0 fs/read_write.c:693
       do_iter_write fs/read_write.c:970 [inline]
       do_iter_write+0x184/0x610 fs/read_write.c:951
       vfs_writev+0x1b3/0x2f0 fs/read_write.c:1015
       do_writev+0x15b/0x330 fs/read_write.c:1058
      
      Fixes: a50e233c ("net-gro: restore frag0 optimization")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'Fixes-for-DSA-tagging-using-802-1Q' · c3bc6deb
      David S. Miller committed
      Vladimir Oltean says:
      
      ====================
      Fixes for DSA tagging using 802.1Q
      
      During the prototyping for the "Decoupling PHYLINK from struct
      net_device" patchset, the CPU port of the sja1105 driver was moved to a
      different spot.  This uncovered an issue in the tag_8021q DSA code,
      which used to work by mistake - the CPU port was the last hardware port
      numerically, and this was masking an ordering issue which is very likely
      to be seen in other drivers that make use of 802.1Q tags.
      
      A question was also raised whether the VID numbers bear any meaning, and
      the conclusion was that they don't, at least not in an absolute sense.
      The second patch defines bit fields inside the DSA 802.1Q VID so that
      tcpdump can decode it unambiguously (although the meaning is now clear
      even by visual inspection).
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: tag_8021q: Create a stable binary format · 0471dd42
      Vladimir Oltean committed
      Tools like tcpdump need to be able to decode the significance of fake
      VLAN headers that DSA uses to separate switch ports.
      
      But currently these have no global significance - they are simply an
      ordered list of DSA_MAX_SWITCHES x DSA_MAX_PORTS numbers ending at 4095.
      
      The reason why this is submitted as a fix is that the existing mapping
      of VIDs should not enter into a stable kernel, so we can pretend that
      only the new format exists. This way tcpdump won't need to try to make
      something out of the VLAN tags on 5.2 kernels.
      
      Fixes: f9bbe447 ("net: dsa: Optional VLAN-based port separation for switches without tagging")
      Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: tag_8021q: Change order of rx_vid setup · d34d2baa
      Ioana Ciornei committed
      The 802.1Q tagging performs an unbalanced setup in terms of RX VIDs on
      the CPU port. For the ingress path of a 802.1Q switch to work, the RX
      VID of a port needs to be seen as tagged egress on the CPU port.
      
      While configuring the other front-panel ports to be part of this VID,
      for bridge scenarios, the untagged flag is applied even on the CPU port
      in dsa_switch_vlan_add.  This happens because DSA applies the same flags
      on the CPU port as on the (bridge-controlled) slave ports, and the
      effect in this case is that the CPU port tagged settings get deleted.
      
      Instead of fixing DSA by introducing a way to control VLAN flags on the
      CPU port (and hence stop inheriting from the slave ports) - a hard,
      perhaps intractable problem - avoid this situation by moving the setup
      part of the RX VID on the CPU port after all the other front-panel ports
      have been added to the VID.
      
      Fixes: f9bbe447 ("net: dsa: Optional VLAN-based port separation for switches without tagging")
      Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
      Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: mvpp2: fix bad MVPP2_TXQ_SCHED_TOKEN_CNTR_REG queue value · 21808437
      Antoine Tenart committed
      MVPP2_TXQ_SCHED_TOKEN_CNTR_REG() expects the logical queue id but
      the current code is passing the global tx queue offset, so it ends
      up writing to unknown registers (between 0x8280 and 0x82fc, which
      seemed to be unused by the hardware). This fixes the issue by using
      the logical queue id instead.
      
      Fixes: 3f518509 ("ethernet: Add new driver for Marvell Armada 375 network unit")
      Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: tcp_input: fix stack out of bounds when parsing TCP options. · 9609dad2
      Young Xiao committed
      The TCP option parsing routines in tcp_parse_options function could
      read one byte out of the buffer of the TCP options.
      
      1         while (length > 0) {
      2                 int opcode = *ptr++;
      3                 int opsize;
      4
      5                 switch (opcode) {
      6                 case TCPOPT_EOL:
      7                         return;
      8                 case TCPOPT_NOP:        /* Ref: RFC 793 section 3.1 */
      9                         length--;
      10                        continue;
      11                default:
      12                        opsize = *ptr++; //out of bound access
      
      If length = 1, there is one access at line 2 and another at line 12.
      The second access reads one byte past the end of the option buffer.
      
      Therefore, the patch checks that the available data length is large
      enough to parse both the TCP option code and size.
      Signed-off-by: Young Xiao <92siuyang@gmail.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
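The bounds check described in the TCP option commit above can be illustrated with a small user-space sketch. It is not the kernel's tcp_parse_options(): the function `count_tcp_options`, its return convention and the local `OPT_*` names are assumptions made for the example. The point it shows is that the size byte is only read after confirming that at least two bytes remain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Option kinds from RFC 793; the names are local to this sketch. */
enum { OPT_EOL = 0, OPT_NOP = 1 };

/*
 * Count well-formed options in buf[0..length).
 * Returns -1 if an option is truncated or has an invalid size.
 * Hypothetical helper, written only to show the length check.
 */
static int count_tcp_options(const uint8_t *buf, int length)
{
    const uint8_t *ptr = buf;
    int count = 0;

    while (length > 0) {
        int opcode = *ptr++;
        int opsize;

        switch (opcode) {
        case OPT_EOL:
            return count;
        case OPT_NOP:
            length--;
            continue;
        default:
            /* Both the kind byte and the size byte must be present. */
            if (length < 2)
                return -1;
            opsize = *ptr++;
            if (opsize < 2 || opsize > length)
                return -1;
            count++;
            ptr += opsize - 2;   /* skip the option payload */
            length -= opsize;
        }
    }
    return count;
}
```

A one-byte buffer holding only an option kind is rejected instead of triggering a read past the end of the buffer.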
    • Merge branch 'mlxsw-Two-small-fixes' · 62851d71
      David S. Miller committed
      Ido Schimmel says:
      
      ====================
      mlxsw: Two small fixes
      
      Patch #1 from Jiri fixes an issue specific to Spectrum-2 where the
      insertion of two identical flower filters with different priorities
      would trigger a warning.
      
      Patch #2 from Amit prevents the driver from trying to configure a port
      with a speed of 56Gb/s and autoneg off as this is not supported and
      results in error messages from firmware.
      
      Please consider patch #1 for stable.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum: Prevent force of 56G · 275e928f
      Amit Cohen committed
      Force of 56G is not supported by hardware in Ethernet devices. This
      configuration fails with a bad parameter error from firmware.
      
      Add check of this case. Instead of trying to set 56G with autoneg off,
      return a meaningful error.
      
      Fixes: 56ade8fe ("mlxsw: spectrum: Add initial support for Spectrum ASIC")
      Signed-off-by: Amit Cohen <amitc@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum_acl: Avoid warning after identical rules insertion · ef744220
      Jiri Pirko committed
      When identical rules are inserted, the latter one goes to C-TCAM. For
      that, a second eRP with the same mask is created. By their nature,
      these 2 eRPs cannot be merged, and neither can be the parent of the other.
      Teach mlxsw_sp_acl_erp_delta_fill() about this possibility and handle it
      gracefully.
      Reported-by: Alex Kushnarov <alexanderk@mellanox.com>
      Fixes: c22291f7 ("mlxsw: spectrum: acl: Implement delta for ERP")
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: mv88e6xxx: fix handling of upper half of STATS_TYPE_PORT · 84b3fd1f
      Rasmus Villemoes committed
      Currently, the upper half of a 4-byte STATS_TYPE_PORT statistic ends
      up in bits 47:32 of the return value, instead of bits 31:16 as they
      should.
      
      Fixes: 6e46e2d8 ("net: dsa: mv88e6xxx: Fix u64 statistics")
      Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
      Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
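The bit placement described in the mv88e6xxx commit above comes down to a single shift amount. The sketch below assumes a 4-byte statistic read as two 16-bit halves; `combine_port_stat` is a hypothetical helper, not a function from the driver.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Combine the two 16-bit halves of a 4-byte port statistic into a
 * 64-bit return value. Hypothetical helper for illustration only.
 */
static uint64_t combine_port_stat(uint16_t low, uint16_t high)
{
    uint64_t value = low;

    /* The upper half belongs in bits 31:16, so the shift is 16. */
    value |= (uint64_t)high << 16;
    return value;
}
```

For halves 0x5678 (low) and 0x1234 (high) this yields 0x12345678. A shift of 32, which matches the symptom described in the commit, would instead give 0x123400005678.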
    • r8169: fix MAC address being lost in PCI D3 · 59715171
      Heiner Kallweit committed
      (At least) RTL8168e forgets its MAC address in PCI D3. To fix this set
      the MAC address when resuming. For resuming from runtime-suspend we
      had this in place already, for resuming from S3/S5 it was missing.
      
      The commit referenced as being fixed isn't wrong, it's just the first
      one where the patch applies cleanly.
      
      Fixes: 0f07bd85 ("r8169: use dev_get_drvdata where possible")
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Reported-by: Albert Astals Cid <aacid@kde.org>
      Tested-by: Albert Astals Cid <aacid@kde.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'mlx5-fixes-2019-05-28' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · 200c6758
      David S. Miller committed
      Saeed Mahameed says:
      
      ====================
      Mellanox, mlx5 fixes 2019-05-28
      
      This series introduces some fixes to mlx5 driver.
      
      Please pull and let me know if there is any problem.
      
      For -stable v4.13:
      ('net/mlx5: Allocate root ns memory using kzalloc to match kfree')
      
      For -stable v4.16:
      ('net/mlx5: Avoid double free in fs init error unwinding path')
      
      For -stable v4.18:
      ('net/mlx5e: Disable rxhash when CQE compress is enabled')
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'XDP-generic-fixes' · 4b280531
      David S. Miller committed
      Stephen Hemminger says:
      
      ====================
      XDP generic fixes
      
      This set of patches came about while investigating XDP
      generic on Azure. The split brain nature of the accelerated
      networking exposed issues with the stack device model.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: core: support XDP generic on stacked devices. · 458bf2f2
      Stephen Hemminger committed
      When a device is stacked like (team, bonding, failsafe or netvsc) the
      XDP generic program for the parent device was not called.
      
      Move the call to XDP generic inside __netif_receive_skb_core where
      it can be done multiple times for stacked case.
      
      Fixes: d4455169 ("net: xdp: support xdp generic on virtual devices")
      Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netvsc: unshare skb in VF rx handler · 996ed047
      Stephen Hemminger committed
      The netvsc VF skb handler should make sure that skb is not
      shared. Similar logic already exists in bonding and team device
      drivers.
      
      This is not an issue in practice because the VF device
      does not send up shared skb's. But the netvsc driver
      should do the right thing if it did.
      
      Fixes: 0c195567 ("netvsc: transparent VF management")
      Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • udp: Avoid post-GRO UDP checksum recalculation · f2696099
      Sean Tranchetti committed
      Currently, when resegmenting an unexpected UDP GRO packet, the full UDP
      checksum will be calculated for every new SKB created by skb_segment()
      because the netdev features passed in by udp_rcv_segment() lack any
      information about checksum offload capabilities.
      
      Usually, we have no need to perform this calculation again, as
        1) The GRO implementation guarantees that any packets making it to the
           udp_rcv_segment() function had correct checksums, and, more
           importantly,
        2) Upon the successful return of udp_rcv_segment(), we immediately pull
           the UDP header off and either queue the segment to the socket or
           hand it off to a new protocol handler.
      
      Unless userspace has set the IP_CHECKSUM sockopt to indicate that they
      want the final checksum values, we can pass the needed netdev feature
      flags to __skb_gso_segment() to avoid checksumming each segment in
      skb_segment().
      
      Fixes: cf329aa4 ("udp: cope with UDP GRO packet misdirection")
      Cc: Paolo Abeni <pabeni@redhat.com>
      Cc: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
      Signed-off-by: Sean Tranchetti <stranche@codeaurora.org>
      Acked-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 30 May, 2019 8 commits
  3. 29 May, 2019 12 commits
    • net/mlx5e: Disable rxhash when CQE compress is enabled · c0194e2d
      Saeed Mahameed committed
      When CQE compression is enabled (Multi-host systems), compressed CQEs
      might arrive to the driver rx, compressed CQEs don't have a valid hash
      offload and the driver already reports a hash value of 0 and invalid hash
      type on the skb for compressed CQEs, but this is not good enough.
      
      On a congested PCIe, where CQE compression will kick in aggressively,
      gro will deliver lots of out of order packets due to the invalid hash
      and this might cause a serious performance drop.
      
      The only valid solution is to disable rxhash offload entirely when CQE
      compression is favorable (Multi-host systems).
      
      Fixes: 7219ab34 ("net/mlx5e: CQE compression")
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5e: restrict the real_dev of vlan device is the same as uplink device · 24bcd210
      wenxu committed
      When registering an indr block for a vlan device, check that the real_dev
      of the vlan device is the same as the uplink device. Otherwise an offload
      rule that will never hit is set on mlx5e.
      
      Fixes: 35a605db ("net/mlx5e: Offload TC e-switch rules with ingress VLAN device")
      Signed-off-by: wenxu <wenxu@ucloud.cn>
      Reviewed-by: Roi Dayan <roid@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5: Allocate root ns memory using kzalloc to match kfree · 25fa506b
      Parav Pandit committed
      root ns is yet another fs core node which is freed using kfree() by
      tree_put_node().
      Rest of the other fs core objects are also allocated using kmalloc
      variants.
      
      However, root ns memory is allocated using kvzalloc().
      Hence allocate root ns memory using kzalloc().
      
      Fixes: 25302363 ("net/mlx5_core: Flow steering tree initialization")
      Signed-off-by: Parav Pandit <parav@mellanox.com>
      Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
      Reviewed-by: Mark Bloch <markb@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5: Avoid double free in fs init error unwinding path · 9414277a
      Parav Pandit committed
      In the code flow below, the ingress acl table root ns memory is
      freed twice.
      
      mlx5_init_fs
        init_ingress_acls_root_ns()
          init_ingress_acl_root_ns
             kfree(steering->esw_ingress_root_ns);
             /* steering->esw_ingress_root_ns is not marked NULL */
        mlx5_cleanup_fs
          cleanup_ingress_acls_root_ns
             steering->esw_ingress_root_ns non NULL check passes.
             kfree(steering->esw_ingress_root_ns);
             /* double free */
      
      A similar issue exists for other tables.
      
      Hence zero out the pointers to not process the table again.
      
      Fixes: 9b93ab98 ("net/mlx5: Separate ingress/egress namespaces for each vport")
      Fixes: 40c3eebb49e51 ("net/mlx5: Add support in RDMA RX steering")
      Signed-off-by: Parav Pandit <parav@mellanox.com>
      Reviewed-by: Mark Bloch <markb@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
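The "zero out the pointers" remedy from the double-free commit above follows a common pattern, sketched here in user-space C. The `struct steering` layout and both function names are hypothetical, and `calloc`/`free` stand in for the kernel allocators. Only the idea of clearing the pointer after freeing it comes from the commit message.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the steering state holding a root ns. */
struct steering {
    int *ingress_root_ns;
};

/* Init that may fail after allocating; 'fail' simulates a later error. */
static int init_ingress(struct steering *s, int fail)
{
    s->ingress_root_ns = calloc(1, sizeof(*s->ingress_root_ns));
    if (!s->ingress_root_ns)
        return -1;
    if (fail) {
        free(s->ingress_root_ns);
        /* Clear the pointer so a later cleanup skips this table. */
        s->ingress_root_ns = NULL;
        return -1;
    }
    return 0;
}

/* Cleanup that is safe to call after a failed or successful init. */
static void cleanup_ingress(struct steering *s)
{
    if (!s->ingress_root_ns)
        return;
    free(s->ingress_root_ns);
    s->ingress_root_ns = NULL;
}
```

Because the error path leaves a NULL pointer behind, calling the cleanup function after a failed init does nothing instead of freeing the same memory a second time.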
    • net/mlx5: Avoid double free of root ns in the error flow path · 905f6bd3
      Parav Pandit committed
      When root ns setup for rdma, sniffer tx and sniffer rx fails,
      such root ns cleanup is done by the error unwinding path of
      mlx5_cleanup_fs().
      Below call graph shows an example for sniffer_rx_root_ns.
      
      mlx5_init_fs()
        init_sniffer_rx_root_ns()
          cleanup_root_ns(steering->sniffer_rx_root_ns);
      mlx5_cleanup_fs()
        cleanup_root_ns(steering->sniffer_rx_root_ns);
        /* double free of sniffer_rx_root_ns */
      
      Hence, use the existing cleanup_fs to cleanup.
      
      Fixes: d83eb50e ("net/mlx5: Add support in RDMA RX steering")
      Fixes: 87d22483 ("net/mlx5: Add sniffer namespaces")
      Signed-off-by: Parav Pandit <parav@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5: Fix error handling in mlx5_load() · 87883929
      Saeed Mahameed committed
      In case mlx5_core_set_hca_defaults fails, it should jump to
      mlx5_cleanup_fs, fix that.
      
      Fixes: c85023e1 ("IB/mlx5: Add raw ethernet local loopback support")
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Reviewed-by: Huy Nguyen <huyn@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • Documentation: net-sysfs: Remove duplicate PHY device documentation · a6cd0d2d
      Florian Fainelli committed
      Both sysfs-bus-mdio and sysfs-class-net-phydev contain the same
      duplicated information. There is not currently any MDIO bus specific
      attribute, but there are PHY device (struct phy_device) specific
      attributes. Use the more precise description from sysfs-bus-mdio and
      carry that over to sysfs-class-net-phydev.
      
      Fixes: 86f22d04 ("net: sysfs: Document PHY device sysfs attributes")
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • llc: fix skb leak in llc_build_and_send_ui_pkt() · 8fb44d60
      Eric Dumazet committed
      If llc_mac_hdr_init() returns an error, we must drop the skb
      since no llc_build_and_send_ui_pkt() caller will take care of this.
      
      BUG: memory leak
      unreferenced object 0xffff8881202b6800 (size 2048):
        comm "syz-executor907", pid 7074, jiffies 4294943781 (age 8.590s)
        hex dump (first 32 bytes):
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
          1a 00 07 40 00 00 00 00 00 00 00 00 00 00 00 00  ...@............
        backtrace:
          [<00000000e25b5abe>] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline]
          [<00000000e25b5abe>] slab_post_alloc_hook mm/slab.h:439 [inline]
          [<00000000e25b5abe>] slab_alloc mm/slab.c:3326 [inline]
          [<00000000e25b5abe>] __do_kmalloc mm/slab.c:3658 [inline]
          [<00000000e25b5abe>] __kmalloc+0x161/0x2c0 mm/slab.c:3669
          [<00000000a1ae188a>] kmalloc include/linux/slab.h:552 [inline]
          [<00000000a1ae188a>] sk_prot_alloc+0xd6/0x170 net/core/sock.c:1608
          [<00000000ded25bbe>] sk_alloc+0x35/0x2f0 net/core/sock.c:1662
          [<000000002ecae075>] llc_sk_alloc+0x35/0x170 net/llc/llc_conn.c:950
          [<00000000551f7c47>] llc_ui_create+0x7b/0x140 net/llc/af_llc.c:173
          [<0000000029027f0e>] __sock_create+0x164/0x250 net/socket.c:1430
          [<000000008bdec225>] sock_create net/socket.c:1481 [inline]
          [<000000008bdec225>] __sys_socket+0x69/0x110 net/socket.c:1523
          [<00000000b6439228>] __do_sys_socket net/socket.c:1532 [inline]
          [<00000000b6439228>] __se_sys_socket net/socket.c:1530 [inline]
          [<00000000b6439228>] __x64_sys_socket+0x1e/0x30 net/socket.c:1530
          [<00000000cec820c1>] do_syscall_64+0x76/0x1a0 arch/x86/entry/common.c:301
          [<000000000c32554f>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      BUG: memory leak
      unreferenced object 0xffff88811d750d00 (size 224):
        comm "syz-executor907", pid 7074, jiffies 4294943781 (age 8.600s)
        hex dump (first 32 bytes):
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
          00 f0 0c 24 81 88 ff ff 00 68 2b 20 81 88 ff ff  ...$.....h+ ....
        backtrace:
          [<0000000053026172>] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline]
          [<0000000053026172>] slab_post_alloc_hook mm/slab.h:439 [inline]
          [<0000000053026172>] slab_alloc_node mm/slab.c:3269 [inline]
          [<0000000053026172>] kmem_cache_alloc_node+0x153/0x2a0 mm/slab.c:3579
          [<00000000fa8f3c30>] __alloc_skb+0x6e/0x210 net/core/skbuff.c:198
          [<00000000d96fdafb>] alloc_skb include/linux/skbuff.h:1058 [inline]
          [<00000000d96fdafb>] alloc_skb_with_frags+0x5f/0x250 net/core/skbuff.c:5327
          [<000000000a34a2e7>] sock_alloc_send_pskb+0x269/0x2a0 net/core/sock.c:2225
          [<00000000ee39999b>] sock_alloc_send_skb+0x32/0x40 net/core/sock.c:2242
          [<00000000e034d810>] llc_ui_sendmsg+0x10a/0x540 net/llc/af_llc.c:933
          [<00000000c0bc8445>] sock_sendmsg_nosec net/socket.c:652 [inline]
          [<00000000c0bc8445>] sock_sendmsg+0x54/0x70 net/socket.c:671
          [<000000003b687167>] __sys_sendto+0x148/0x1f0 net/socket.c:1964
          [<00000000922d78d9>] __do_sys_sendto net/socket.c:1976 [inline]
          [<00000000922d78d9>] __se_sys_sendto net/socket.c:1972 [inline]
          [<00000000922d78d9>] __x64_sys_sendto+0x2a/0x30 net/socket.c:1972
          [<00000000cec820c1>] do_syscall_64+0x76/0x1a0 arch/x86/entry/common.c:301
          [<000000000c32554f>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • selftests: pmtu: Fix encapsulating device in pmtu_vti6_link_change_mtu · 73f51d15
      Stefano Brivio committed
      In the pmtu_vti6_link_change_mtu test, both local and remote addresses
      for the vti6 tunnel are assigned to the same address given to the dummy
      interface that we use as encapsulating device with a known MTU.
      
      This works as long as the dummy interface is actually selected, via
      rt6_lookup(), as encapsulating device. But if the remote address of the
      tunnel is a local address too, the loopback interface could also be
      selected, and there's nothing wrong with it.
      
      This is what some older -stable kernels do (3.18.z, at least), and
      nothing prevents us from subtly changing FIB implementation to revert
      back to that behaviour in the future.
      
      Define an IPv6 prefix instead, and use two separate addresses as local
      and remote for vti6, so that the encapsulating device can't be a
      loopback interface.
      Reported-by: Xiumei Mu <xmu@redhat.com>
      Fixes: 1fad59ea ("selftests: pmtu: Add pmtu_vti6_link_change_mtu test")
      Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dpaa_eth: use only online CPU portals · 7aae703f
      Madalin Bucur committed
      Make sure only the portals for the online CPUs are used.
      Without this change, there are issues when someone boots with
      maxcpus=n, with n < actual number of cores available as frames
      either received or corresponding to the transmit confirmation
      path would be offered for dequeue to the offline CPU portals,
      getting lost.
      Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: mvneta: Fix err code path of probe · d484e06e
      Jisheng Zhang committed
      Fix below issues in err code path of probe:
      1. we don't need to unregister_netdev() because the netdev isn't
      registered.
      2. when register_netdev() fails, we also need to destroy bm pool for
      HWBM case.
      
      Fixes: dc35a10f ("net: mvneta: bm: add support for hardware buffer management")
      Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: stmmac: Do not output error on deferred probe · 54ed6fd2
      Thierry Reding committed
      If the subdriver defers probe, do not show an error message. It's
      perfectly fine for this error to occur since the driver will get another
      chance to probe after some time and will usually succeed after all of
      the resources that it requires have been registered.
      Signed-off-by: Thierry Reding <treding@nvidia.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
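The idea in the stmmac commit above, staying quiet when a probe is merely deferred, can be sketched in user-space C as follows. `report_probe_result` is a hypothetical helper and `fprintf` stands in for the kernel's logging. `EPROBE_DEFER` is defined locally with the value the Linux kernel uses (517).

```c
#include <assert.h>
#include <stdio.h>

/* Kernel-internal errno meaning "driver requests probe retry". */
#define EPROBE_DEFER 517

/*
 * Log a probe failure unless it is a deferral.
 * Returns 1 if a message was printed, 0 otherwise.
 * Hypothetical helper for illustration only.
 */
static int report_probe_result(int ret)
{
    if (ret && ret != -EPROBE_DEFER) {
        fprintf(stderr, "probe failed: %d\n", ret);
        return 1;
    }
    return 0;
}
```

A deferred probe is expected to be retried later, so only other error codes produce a message.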
  4. 28 May, 2019 1 commit