1. 20 Feb, 2022 24 commits
    • net: tcp: add skb drop reasons to tcp_add_backlog() · 7a26dc9e
      Menglong Dong committed
      Pass the address of drop_reason to tcp_add_backlog() to store the
      reason for the skb drop when the function fails. The following drop
      reason is introduced:

      SKB_DROP_REASON_SOCKET_BACKLOG
      Reviewed-by: Mengen Sun <mengensun@tencent.com>
      Reviewed-by: Hao Peng <flyingpeng@tencent.com>
      Signed-off-by: Menglong Dong <imagedong@tencent.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
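A minimal user-space sketch of the out-parameter pattern the tcp_add_backlog() commit above describes. The enum, `struct backlog` and `backlog_add()` are hypothetical stand-ins, not the kernel's definitions; only the SOCKET_BACKLOG name comes from the commit message. The sketch assumes the function returns true when the packet has to be dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's drop-reason enum. */
enum drop_reason {
    DROP_REASON_NOT_SPECIFIED,
    DROP_REASON_SOCKET_BACKLOG,
};

/* Toy backlog queue; the invariant used <= limit is assumed. */
struct backlog {
    size_t used;   /* bytes currently queued */
    size_t limit;  /* maximum bytes allowed */
};

/* Returns true when the packet must be dropped and stores the reason
 * in *reason. On success *reason is left untouched. */
static bool backlog_add(struct backlog *q, size_t len, enum drop_reason *reason)
{
    assert(reason != NULL);
    if (len > q->limit - q->used) {
        *reason = DROP_REASON_SOCKET_BACKLOG;
        return true;
    }
    q->used += len;
    return false;
}
```

The caller owns the reason variable and can hand it to whatever frees the packet, so the failing function does not need to know how the drop is reported.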
    • net: tcp: add skb drop reasons to tcp_v{4,6}_inbound_md5_hash() · 643b622b
      Menglong Dong committed
      Pass the address of drop reason to tcp_v4_inbound_md5_hash() and
      tcp_v6_inbound_md5_hash() to store the reasons for skb drops when this
      function fails. Therefore, the drop reason can be passed to
      kfree_skb_reason() when the skb needs to be freed.

      The following drop reasons are added:

      SKB_DROP_REASON_TCP_MD5NOTFOUND
      SKB_DROP_REASON_TCP_MD5UNEXPECTED
      SKB_DROP_REASON_TCP_MD5FAILURE

      SKB_DROP_REASON_TCP_MD5* above correspond to LINUX_MIB_TCPMD5*.
      Reviewed-by: Mengen Sun <mengensun@tencent.com>
      Reviewed-by: Hao Peng <flyingpeng@tencent.com>
      Signed-off-by: Menglong Dong <imagedong@tencent.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: tcp: use kfree_skb_reason() for tcp_v6_rcv() · c0e3154d
      Menglong Dong committed
      Replace kfree_skb() used in tcp_v6_rcv() with kfree_skb_reason().
      Reviewed-by: Mengen Sun <mengensun@tencent.com>
      Reviewed-by: Hao Peng <flyingpeng@tencent.com>
      Signed-off-by: Menglong Dong <imagedong@tencent.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: tcp: add skb drop reasons to tcp_v4_rcv() · 255f9034
      Menglong Dong committed
      Use kfree_skb_reason() for some paths in tcp_v4_rcv() that were missed
      before, including:

      SKB_DROP_REASON_SOCKET_FILTER
      SKB_DROP_REASON_XFRM_POLICY
      Reviewed-by: Mengen Sun <mengensun@tencent.com>
      Reviewed-by: Hao Peng <flyingpeng@tencent.com>
      Signed-off-by: Menglong Dong <imagedong@tencent.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: tcp: introduce tcp_drop_reason() · 082116ff
      Menglong Dong committed
      For the TCP protocol, tcp_drop() is used to free the skb when it needs
      to be dropped. To make use of kfree_skb_reason() and pass the drop
      reason to it, introduce the function tcp_drop_reason(). Meanwhile,
      make tcp_drop() an inline call to tcp_drop_reason().
      Reviewed-by: Mengen Sun <mengensun@tencent.com>
      Reviewed-by: Hao Peng <flyingpeng@tencent.com>
      Signed-off-by: Menglong Dong <imagedong@tencent.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
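The wrapper arrangement described in the tcp_drop_reason() commit above can be sketched in user space roughly as follows. All names here (`pkt_drop()`, `pkt_drop_reason()`, `free_packet_reason()`, the enum values) are hypothetical; the real functions take a socket and an skb, and the sketch assumes the plain variant passes a "not specified" reason.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reasons; not the kernel's enum. */
enum drop_reason { DROP_REASON_NOT_SPECIFIED, DROP_REASON_EXAMPLE };

struct packet { int payload; };

static enum drop_reason last_reason; /* what the last free call reported */
static int drops;                    /* number of packets dropped so far */

/* Stand-in for a reason-aware free function. */
static void free_packet_reason(struct packet *p, enum drop_reason reason)
{
    assert(p != NULL);
    last_reason = reason;
    drops++;
    free(p);
}

/* New entry point: callers that know why they drop say so. */
static void pkt_drop_reason(struct packet *p, enum drop_reason reason)
{
    free_packet_reason(p, reason);
}

/* Old entry point kept as a thin inline wrapper, so existing callers
 * compile unchanged and report an unspecified reason. */
static inline void pkt_drop(struct packet *p)
{
    pkt_drop_reason(p, DROP_REASON_NOT_SPECIFIED);
}
```

Keeping the old name as a wrapper lets call sites be converted one at a time.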
    • net: prestera: acl: fix 'client_map' buff overflow · 48c77bdf
      Volodymyr Mytnyk committed
      smatch warnings:
      drivers/net/ethernet/marvell/prestera/prestera_acl.c:103
      prestera_acl_chain_to_client() error: buffer overflow
      'client_map' 3 <= 3
      
      	prestera_acl_chain_to_client(u32 chain_index, ...)
              ...
      	u32 client_map[] = {
      		PRESTERA_HW_COUNTER_CLIENT_LOOKUP_0,
      		PRESTERA_HW_COUNTER_CLIENT_LOOKUP_1,
      		PRESTERA_HW_COUNTER_CLIENT_LOOKUP_2
      	};
      	if (chain_index > ARRAY_SIZE(client_map))
      	...
      
      Fixes: fa5d824c ("net: prestera: acl: add multi-chain support offload")
      Reported-by: kernel test robot <lkp@intel.com>
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Volodymyr Mytnyk <vmytnyk@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
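The smatch warning quoted in the prestera commit above is a classic off-by-one: with three elements, index 3 passes a `>` check but is already out of bounds. A self-contained sketch of the corrected bounds check follows; the constant values, the function name and the return convention are placeholders, not the driver's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Placeholder values; the real constants live in the prestera driver. */
enum { CLIENT_LOOKUP_0 = 10, CLIENT_LOOKUP_1 = 11, CLIENT_LOOKUP_2 = 12 };

/* Returns 0 and stores the client in *client, or -1 when chain_index
 * is out of range (in which case *client is not modified). */
static int chain_to_client(uint32_t chain_index, uint32_t *client)
{
    static const uint32_t client_map[] = {
        CLIENT_LOOKUP_0, CLIENT_LOOKUP_1, CLIENT_LOOKUP_2,
    };

    assert(client != NULL);

    /* '>=' rather than '>': an index equal to the array size is
     * already one past the last valid element. */
    if (chain_index >= ARRAY_SIZE(client_map))
        return -1;

    *client = client_map[chain_index];
    return 0;
}
```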
    • net: dsa: microchip: add ksz8563 to ksz9477 I2C driver · 173a272a
      Ahmad Fatoum committed
      The KSZ9477 SPI driver already has support for the KSZ8563. The same switch
      chip can also be managed via I2C and we have a KSZ9477 I2C driver, but
      that one lacks the relevant compatible entry. Add it.

      DT bindings already describe this compatible.
      Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/smc: unlock on error paths in __smc_setsockopt() · 7a11455f
      Dan Carpenter committed
      These two error paths need to release_sock(sk) before returning.

      Fixes: a6a6fe27 ("net/smc: Dynamic control handshake limitation by socket options")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Reviewed-by: D. Wythe <alibuda@linux.alibaba.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
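The class of bug fixed by the __smc_setsockopt() commit above (returning from an error path while a lock is still held) is commonly avoided with a single exit label. The sketch below is a toy user-space model: `struct fake_sock`, `set_limit()` and the flag-based `lock_sock()`/`release_sock()` are illustrative stand-ins, not the kernel functions of the same name.

```c
#include <assert.h>
#include <errno.h>

struct fake_sock {
    int locked;  /* 1 while the toy "socket lock" is held */
    int limit;
};

static void lock_sock(struct fake_sock *sk)
{
    assert(!sk->locked);  /* trips if an earlier call leaked the lock */
    sk->locked = 1;
}

static void release_sock(struct fake_sock *sk)
{
    assert(sk->locked);
    sk->locked = 0;
}

/* Every path after lock_sock(), including the error path, leaves
 * through the 'out' label so the lock is always released. */
static int set_limit(struct fake_sock *sk, int val)
{
    int rc = 0;

    lock_sock(sk);
    if (val < 0) {
        rc = -EINVAL;
        goto out;
    }
    sk->limit = val;
out:
    release_sock(sk);
    return rc;
}
```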
    • net: dsa: microchip: ksz9477: export HW stats over stats64 interface · a7f4f13a
      Oleksij Rempel committed
      Provide access to HW offloaded packets over the stats64 interface.
      The rx/tx_bytes values needed some fixing since the HW accounts for the
      size of the Ethernet frame together with the FCS.
      Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
      Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
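One plausible reading of the byte-count fixup mentioned in the ksz9477 commit above is that the FCS is subtracted once per counted frame. The helper below is a sketch under that assumption; the function name is hypothetical, and only the 4-byte Ethernet FCS length is a known fact.

```c
#include <assert.h>
#include <stdint.h>

#define FCS_LEN 4 /* Ethernet frame check sequence length in bytes */

/* Convert a hardware byte counter that includes the FCS of every frame
 * into a byte count without it. Assumes hw_bytes covers exactly
 * 'packets' frames. */
static uint64_t bytes_without_fcs(uint64_t hw_bytes, uint64_t packets)
{
    uint64_t fcs_total = packets * FCS_LEN;

    assert(hw_bytes >= fcs_total);
    return hw_bytes - fcs_total;
}
```

For example, ten frames counted by hardware as 640 bytes would be reported as 600 bytes.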
    • Merge branch 'phylink-remove-pcs_poll' · 0d0350c4
      David S. Miller committed
      Russell King says:
      
      ====================
      net: phylink: remove pcs_poll
      
      This small series removes the now unused pcs_poll members from DSA and
      phylink. "git grep pcs_poll drivers/net/ net/" on net-next confirms that
      the only places that reference this are in DSA core code and phylink
      code:
      
      drivers/net/phy/phylink.c:              if (pl->config->pcs_poll || pcs->poll)
      drivers/net/phy/phylink.c:              poll |= pl->config->pcs_poll;
      net/dsa/port.c: dp->pl_config.pcs_poll = ds->pcs_poll;
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phylink: remove phylink_config's pcs_poll · 64b4a0f8
      Russell King (Oracle) committed
      phylink_config's pcs_poll is no longer used, let's get rid of it.
      Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: remove pcs_poll · ccfbf44d
      Russell King (Oracle) committed
      With drivers converted over to using phylink PCS, there is no need for
      the struct dsa_switch member "pcs_poll" to exist anymore - there is a
      flag in the struct phylink_pcs which indicates whether this PCS needs
      to be polled which supersedes this.
      Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: hsr: fix suspicious RCU usage warning in hsr_node_get_first() · e7f27420
      Juhee Kang committed
      When hsr_create_self_node() calls hsr_node_get_first(), a suspicious
      RCU usage warning is raised. The reason is that the callers of
      hsr_node_get_first() use rcu_read_lock_bh() and other, different
      synchronization mechanisms. This patch fixes the warning by
      replacing rcu_dereference() with rcu_dereference_bh_check().
      
      The kernel test robot reports:
          [   50.083470][ T3596] =============================
          [   50.088648][ T3596] WARNING: suspicious RCU usage
          [   50.093785][ T3596] 5.17.0-rc3-next-20220208-syzkaller #0 Not tainted
          [   50.100669][ T3596] -----------------------------
          [   50.105513][ T3596] net/hsr/hsr_framereg.c:34 suspicious rcu_dereference_check() usage!
          [   50.113799][ T3596]
          [   50.113799][ T3596] other info that might help us debug this:
          [   50.113799][ T3596]
          [   50.124257][ T3596]
          [   50.124257][ T3596] rcu_scheduler_active = 2, debug_locks = 1
          [   50.132368][ T3596] 2 locks held by syz-executor.0/3596:
          [   50.137863][ T3596]  #0: ffffffff8d3357e8 (rtnl_mutex){+.+.}-{3:3}, at: rtnetlink_rcv_msg+0x3be/0xb80
          [   50.147470][ T3596]  #1: ffff88807ec9d5f0 (&hsr->list_lock){+...}-{2:2}, at: hsr_create_self_node+0x225/0x650
          [   50.157623][ T3596]
          [   50.157623][ T3596] stack backtrace:
          [   50.163510][ T3596] CPU: 1 PID: 3596 Comm: syz-executor.0 Not tainted 5.17.0-rc3-next-20220208-syzkaller #0
          [   50.173381][ T3596] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
          [   50.183623][ T3596] Call Trace:
          [   50.186904][ T3596]  <TASK>
          [   50.189844][ T3596]  dump_stack_lvl+0xcd/0x134
          [   50.194640][ T3596]  hsr_node_get_first+0x9b/0xb0
          [   50.199499][ T3596]  hsr_create_self_node+0x22d/0x650
          [   50.204688][ T3596]  hsr_dev_finalize+0x2c1/0x7d0
          [   50.209669][ T3596]  hsr_newlink+0x315/0x730
          [   50.214113][ T3596]  ? hsr_dellink+0x130/0x130
          [   50.218789][ T3596]  ? rtnl_create_link+0x7e8/0xc00
          [   50.223803][ T3596]  ? hsr_dellink+0x130/0x130
          [   50.228397][ T3596]  __rtnl_newlink+0x107c/0x1760
          [   50.233249][ T3596]  ? rtnl_setlink+0x3c0/0x3c0
          [   50.238043][ T3596]  ? is_bpf_text_address+0x77/0x170
          [   50.243362][ T3596]  ? lock_downgrade+0x6e0/0x6e0
          [   50.248219][ T3596]  ? unwind_next_frame+0xee1/0x1ce0
          [   50.253605][ T3596]  ? entry_SYSCALL_64_after_hwframe+0x44/0xae
          [   50.259669][ T3596]  ? __sanitizer_cov_trace_cmp4+0x1c/0x70
          [   50.265423][ T3596]  ? is_bpf_text_address+0x99/0x170
          [   50.270819][ T3596]  ? kernel_text_address+0x39/0x80
          [   50.275950][ T3596]  ? __kernel_text_address+0x9/0x30
          [   50.281336][ T3596]  ? unwind_get_return_address+0x51/0x90
          [   50.286975][ T3596]  ? create_prof_cpu_mask+0x20/0x20
          [   50.292178][ T3596]  ? arch_stack_walk+0x93/0xe0
          [   50.297172][ T3596]  ? kmem_cache_alloc_trace+0x42/0x2c0
          [   50.302637][ T3596]  ? rcu_read_lock_sched_held+0x3a/0x70
          [   50.308194][ T3596]  rtnl_newlink+0x64/0xa0
          [   50.312524][ T3596]  ? __rtnl_newlink+0x1760/0x1760
          [   50.317545][ T3596]  rtnetlink_rcv_msg+0x413/0xb80
          [   50.322631][ T3596]  ? rtnl_newlink+0xa0/0xa0
          [   50.327159][ T3596]  netlink_rcv_skb+0x153/0x420
          [   50.331931][ T3596]  ? rtnl_newlink+0xa0/0xa0
          [   50.336436][ T3596]  ? netlink_ack+0xa80/0xa80
          [   50.341095][ T3596]  ? netlink_deliver_tap+0x1a2/0xc40
          [   50.346532][ T3596]  ? netlink_deliver_tap+0x1b1/0xc40
          [   50.351839][ T3596]  netlink_unicast+0x539/0x7e0
          [   50.356633][ T3596]  ? netlink_attachskb+0x880/0x880
          [   50.361750][ T3596]  ? __sanitizer_cov_trace_const_cmp8+0x1d/0x70
          [   50.368003][ T3596]  ? __sanitizer_cov_trace_const_cmp8+0x1d/0x70
          [   50.374707][ T3596]  ? __phys_addr_symbol+0x2c/0x70
          [   50.379753][ T3596]  ? __sanitizer_cov_trace_cmp8+0x1d/0x70
          [   50.385568][ T3596]  ? __check_object_size+0x16c/0x4f0
          [   50.390859][ T3596]  netlink_sendmsg+0x904/0xe00
          [   50.395715][ T3596]  ? netlink_unicast+0x7e0/0x7e0
          [   50.400722][ T3596]  ? __sanitizer_cov_trace_const_cmp4+0x1c/0x70
          [   50.407003][ T3596]  ? netlink_unicast+0x7e0/0x7e0
          [   50.412119][ T3596]  sock_sendmsg+0xcf/0x120
          [   50.416548][ T3596]  __sys_sendto+0x21c/0x320
          [   50.421052][ T3596]  ? __ia32_sys_getpeername+0xb0/0xb0
          [   50.426427][ T3596]  ? lockdep_hardirqs_on_prepare+0x400/0x400
          [   50.432721][ T3596]  ? __context_tracking_exit+0xb8/0xe0
          [   50.438188][ T3596]  ? lock_downgrade+0x6e0/0x6e0
          [   50.443041][ T3596]  ? lock_downgrade+0x6e0/0x6e0
          [   50.447902][ T3596]  __x64_sys_sendto+0xdd/0x1b0
          [   50.452759][ T3596]  ? lockdep_hardirqs_on+0x79/0x100
          [   50.457964][ T3596]  ? syscall_enter_from_user_mode+0x21/0x70
          [   50.464150][ T3596]  do_syscall_64+0x35/0xb0
          [   50.468565][ T3596]  entry_SYSCALL_64_after_hwframe+0x44/0xae
          [   50.474452][ T3596] RIP: 0033:0x7f3148504e1c
          [   50.479052][ T3596] Code: fa fa ff ff 44 8b 4c 24 2c 4c 8b 44 24 20 89 c5 44 8b 54 24 28 48 8b 54 24 18 b8 2c 00 00 00 48 8b 74 24 10 8b 7c 24 08 0f 05 <48> 3d 00 f0 ff ff 77 34 89 ef 48 89 44 24 08 e8 20 fb ff ff 48 8b
          [   50.498926][ T3596] RSP: 002b:00007ffeab5f2ab0 EFLAGS: 00000293 ORIG_RAX: 000000000000002c
          [   50.507342][ T3596] RAX: ffffffffffffffda RBX: 00007f314959d320 RCX: 00007f3148504e1c
          [   50.515393][ T3596] RDX: 0000000000000048 RSI: 00007f314959d370 RDI: 0000000000000003
          [   50.523444][ T3596] RBP: 0000000000000000 R08: 00007ffeab5f2b04 R09: 000000000000000c
          [   50.531492][ T3596] R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000000
          [   50.539455][ T3596] R13: 00007f314959d370 R14: 0000000000000003 R15: 0000000000000000
      
      Fixes: 4acc45db ("net: hsr: use hlist_head instead of list_head for mac addresses")
      Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
      Reported-and-tested-by: syzbot+f0eb4f3876de066b128c@syzkaller.appspotmail.com
      Signed-off-by: Juhee Kang <claudiajkang@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • atm: nicstar: Use kcalloc() to simplify code · 92c54a65
      Christophe JAILLET committed
      Use kcalloc() instead of kmalloc_array() and a loop to set all the values
      of the array to NULL.

      While at it, remove a duplicated assignment to 'scq->num_entries'.
      Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
      Signed-off-by: David S. Miller <davem@davemloft.net>
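A user-space analogue of the simplification in the nicstar commit above, with `calloc()` standing in for `kcalloc()` and an overflow-checked `malloc()` standing in for `kmalloc_array()`. The function names are made up for illustration. The sketch relies on all-zero bytes representing NULL pointers, which holds on mainstream platforms but is not guaranteed by ISO C.

```c
#include <stdint.h>
#include <stdlib.h>

/* Before: allocate the array, then clear every slot by hand. */
static void **alloc_slots_loop(size_t n)
{
    void **slots;
    size_t i;

    if (n > SIZE_MAX / sizeof(*slots))  /* multiplication would overflow */
        return NULL;
    slots = malloc(n * sizeof(*slots));
    if (!slots)
        return NULL;
    for (i = 0; i < n; i++)
        slots[i] = NULL;
    return slots;
}

/* After: a zeroing allocator does the overflow check and the clearing
 * in one call. */
static void **alloc_slots_zeroed(size_t n)
{
    return calloc(n, sizeof(void *));
}
```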
    • Merge branch 'dpaa2-eth-one-step-register' · 32d51cef
      David S. Miller committed
      Radu Bulie says:
      
      ====================
      Provide direct access to 1588 one step register
      
      DPAA2 MAC supports 1588 one step timestamping.
      If this option is enabled then for each transmitted PTP event packet,
      the 1588 SINGLE_STEP register is accessed to modify the following fields:
      
      -offset of the correction field inside the PTP packet
      -UDP checksum update bit,  in case the PTP event packet has
       UDP encapsulation
      
      These values can change any time, because there may be multiple
      PTP clients connected, that receive various 1588 frame types:
      - L2 only frame
      - UDP / IPv4
      - UDP / IPv6
      - other

      The current implementation uses dpni_set_single_step_cfg to update the
      SINGLE_STEP register.
      Using an MC command  on the Tx datapath for each transmitted 1588 message
      introduces high delays, leading to low throughput and consequently to a
      small number of supported PTP clients. Besides these, the nanosecond
      correction field from the PTP packet will contain the high delay from the
      driver which together with the originTimestamp will render timestamp
      values that are unacceptable in a GM clock implementation.
      
      This patch series replaces the dpni_set_single_step_cfg function call from
      the Tx datapath for 1588 messages (when one step timestamping is enabled)
      with a callback that either implements direct access to the SINGLE_STEP
      register, eliminating the overhead caused by the MC command that will need
      to be dispatched by the MC firmware through the MC command portal
      interface or falls back to the dpni_set_single_step_cfg in case the MC
      version does not have support for returning the single step register
      base address.
      
      In other words all the delay introduced by dpni_set_single_step_cfg
      function will be eliminated (if MC version has support for returning the
      base address of the single step register), improving the egress driver
      performance for PTP packets when single step timestamping is enabled.
      
      The first patch adds a new attribute that contains the base address of
      the SINGLE_STEP register. It will be used to directly update the register
      on the Tx datapath.
      
      The second patch updates the driver such that the SINGLE_STEP
      register is either accessed directly if MC version >= 10.32 or is
      accessed through dpni_set_single_step_cfg command when 1588 messages
      are transmitted.
      
      Changes in v2:
       - move global function pointer into the driver's private structure in 2/2
       - move repetitive code outside the body of the callback functions  in 2/2
       - update function dpaa2_ptp_onestep_reg_update_method  and remove goto
         statement from non error path in 2/2
      
      Changes in v3:
       - remove static storage class specifier from within the structure in 2/2
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
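The cover letter above describes a per-device callback that either writes the SINGLE_STEP register directly or falls back to the slower firmware command. A rough user-space model of that selection is sketched below. Every identifier is hypothetical, the "register" is an ordinary variable, and the firmware command is simulated by a counter; this is not the dpaa2-eth code.

```c
#include <stddef.h>
#include <stdint.h>

struct priv;
typedef void (*onestep_update_fn)(struct priv *p, uint32_t val);

struct priv {
    uint32_t *reg;            /* mapped register, or NULL if unavailable */
    uint32_t fw_shadow;       /* last value sent through the slow command */
    unsigned int fw_commands; /* number of slow-path commands issued */
    onestep_update_fn update; /* chosen once, stored in the private struct */
};

/* Fast path: write the register directly. */
static void update_direct(struct priv *p, uint32_t val)
{
    *p->reg = val;
}

/* Fallback: stands in for issuing a firmware command per update. */
static void update_via_command(struct priv *p, uint32_t val)
{
    p->fw_shadow = val;
    p->fw_commands++;
}

/* Pick the method once at setup time; reg_base is NULL when the
 * firmware cannot report the register address. */
static void select_update_method(struct priv *p, uint32_t *reg_base)
{
    p->reg = reg_base;
    p->update = reg_base ? update_direct : update_via_command;
}
```

The transmit path then only calls `p->update(p, val)` and does not need to know which variant was selected.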
    • dpaa2-eth: Update SINGLE_STEP register access · c4680c97
      Radu Bulie committed
      DPAA2 MAC supports 1588 one step timestamping.
      If this option is enabled then for each transmitted PTP event packet,
      the 1588 SINGLE_STEP register is accessed to modify the following fields:
      
      -offset of the correction field inside the PTP packet
      -UDP checksum update bit,  in case the PTP event packet has
       UDP encapsulation
      
      These values can change any time, because there may be multiple
      PTP clients connected, that receive various 1588 frame types:
      - L2 only frame
      - UDP / IPv4
      - UDP / IPv6
      - other

      The current implementation uses dpni_set_single_step_cfg to update the
      SINGLE_STEP register.
      Using an MC command  on the Tx datapath for each transmitted 1588 message
      introduces high delays, leading to low throughput and consequently to a
      small number of supported PTP clients. Besides these, the nanosecond
      correction field from the PTP packet will contain the high delay from the
      driver which together with the originTimestamp will render timestamp
      values that are unacceptable in a GM clock implementation.
      
      This patch updates the Tx datapath for 1588 messages when single step
      timestamp is enabled and provides direct access to SINGLE_STEP register,
      eliminating the  overhead caused by the dpni_set_single_step_cfg
      MC command. MC version >= 10.32 implements this functionality.
      If the MC version does not have support for returning the
      single step register base address, the driver will use
      dpni_set_single_step_cfg command for update operations.
      
      All the delay introduced by dpni_set_single_step_cfg
      function will be eliminated (if MC version has support for returning the
      base address of the single step register), improving the egress driver
      performance for PTP packets when single step timestamping is enabled.
      
      Before these changes the maximum throughput for 1588 messages with
      single step hardware timestamp enabled was around 2000pps.
      After the updates the throughput increased up to 32.82 Mbps / 46631.02 pps.
      Signed-off-by: Radu Bulie <radu-andrei.bulie@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dpaa2-eth: Update dpni_get_single_step_cfg command · 9572594e
      Radu Bulie committed
      dpni_get_single_step_cfg is an MC firmware command used for
      retrieving the contents of SINGLE_STEP 1588 register available
      in a DPMAC.

      This patch adds a new version of this command that returns as an extra
      argument the physical base address of the aforementioned register.
      The address will be used to directly modify the contents of the
      SINGLE_STEP register instead of invoking the MC command
      dpni_set_single_step_cfg. The former approach introduced huge delays on
      the TX datapath when one step PTP events were transmitted. This led to low
      throughput and high latencies observed in the PTP correction field.
      Signed-off-by: Radu Bulie <radu-andrei.bulie@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: get rid of rtnl_lock_unregistering() · 8a4fc54b
      Eric Dumazet committed
      After recent patches, and in particular commits
       faab39f6 ("net: allow out-of-order netdev unregistration") and
       e5f80fcf ("ipv6: give an IPv6 dev to blackhole_netdev")
      we no longer need the barrier implemented in rtnl_lock_unregistering().
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: prestera: flower: fix destroy tmpl in chain · b3ae2d35
      Volodymyr Mytnyk committed
      Fix the flower destroy template callback to release the template
      only for the specific tc chain instead of all chain templates.

      The issue was introduced by the previous commit that added
      multi-chain support.

      Fixes: fa5d824c ("net: prestera: acl: add multi-chain support offload")
      Signed-off-by: Volodymyr Mytnyk <vmytnyk@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bridge: switch br_net_exit to batch mode · 36a29fb6
      Eric Dumazet committed
      cleanup_net() is competing with other rtnl users.

      Instead of calling br_net_exit() for each netns,
      call br_net_exit_batch() once.

      This gives cleanup_net() the ability to group more devices
      and call unregister_netdevice_many() only once for all bridge devices.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Roopa Prabhu <roopa@nvidia.com>
      Cc: Nikolay Aleksandrov <razor@blackwall.org>
      Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'mctp-i2c' · a7cc3464
      David S. Miller committed
      Matt Johnston says:
      
      ====================
      MCTP I2C driver
      
      This patch series adds a netdev driver providing MCTP transport over
      I2C.
      
      I think I've addressed all the points raised in v5. It now has
      mctp_i2c_unregister() to run things in the correct order, waiting for
      the worker thread and I2C rx to complete.
      
      Cheers,
      Matt
      
      --
      
      v6:
       - Changed netdev register/unregister/free to avoid races. Ensure that
         netif functions are not used by irq handler/threads after unregister.
       - Fix incoming I2C hwaddr that was previously incorrect (left
         shifted 1 bit)
       - Add a check that byte_count wire header matches the length received
       - Renamed I2C driver to mctp-i2c-interface
       - Removed __func__ from print messages, added missing newlines
       - Removed sysfs mctp_current_mux file which was used for debug
       - Renamed curr_lock to sel_lock
       - Tidied comment formatting
       - Fix newline in Kconfig
      v5:
       - Fix incorrect format string
      v4:
       - Switch to __i2c_transfer() rather than __i2c_smbus_xfer(), drop 255 byte
         smbus patches
       - Use wait_event_idle() for the sleeping TX thread
       - Use dev_addr_set()
      v3:
       - Added Reviewed-bys for npcm7xx
       - Resend with net-next open
      v2:
       - Simpler Kconfig condition for i2c-mux dependency, from Randy Dunlap
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mctp i2c: MCTP I2C binding driver · f5b8abf9
      Matt Johnston committed
      Provides MCTP network transport over an I2C bus, as specified in
      DMTF DSP0237. All messages between nodes are sent as SMBus Block Writes.
      
      Each I2C bus to be used for MCTP is flagged in devicetree by a
      'mctp-controller' property on the bus node. Each flagged bus gets a
      mctpi2cX net device created based on the bus number. A
      'mctp-i2c-controller' I2C client needs to be added under the adapter. In
      an I2C mux situation the mctp-i2c-controller node must be attached only
      to the root I2C bus. The I2C client will handle incoming I2C slave block
      write data for subordinate busses as well as its own bus.
      
      In configurations without devicetree a driver instance can be attached
      to a bus using the I2C slave new_device mechanism.
      
      The MCTP core will hold/release the MCTP I2C device while responses
      are pending (a 6 second timeout or once a socket is closed, response
      received etc). While held the MCTP I2C driver will lock the I2C bus so
      that the correct I2C mux remains selected while responses are received.
      
      (Ideally we would just lock the mux to keep the current bus selected for
      the response rather than a full I2C bus lock, but that isn't exposed in
      the I2C mux API)
      Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>
      Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
      Reviewed-by: Wolfram Sang <wsa@kernel.org> # I2C transport parts
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dt-bindings: net: New binding mctp-i2c-controller · 6881e493
      Matt Johnston committed
      Used to define a local endpoint to communicate with MCTP peripherals
      attached to an I2C bus. This I2C endpoint can communicate with remote
      MCTP devices on the I2C bus.
      
      In the example I2C topology below (matching the second yaml example) we
      have MCTP devices on busses i2c1 and i2c6. MCTP-supporting busses are
      indicated by the 'mctp-controller' DT property on an I2C bus node.
      
      A mctp-i2c-controller I2C client DT node is placed at the top of the
      mux topology, since only the root I2C adapter will support I2C slave
      functionality.
                                                     .-------.
                                                     |eeprom |
          .------------.     .------.               /'-------'
          | adapter    |     | mux  --@0,i2c5------'
          | i2c1       ----.*|      --@1,i2c6--.--.
          |............|    \'------'           \  \  .........
          | mctp-i2c-  |     \                   \  \ .mctpB  .
          | controller |      \                   \  '.0x30   .
          |            |       \  .........        \  '.......'
          | 0x50       |        \ .mctpA  .         \ .........
          '------------'         '.0x1d   .          '.mctpC  .
                                  '.......'          '.0x31   .
                                                      '.......'
      (mctpX boxes above are remote MCTP devices not included in the DT at
      present, they can be hotplugged/probed at runtime. A DT binding for
      specific fixed MCTP devices could be added later if required)
      Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>
      Reviewed-by: Rob Herring <robh@kernel.org>
      Acked-by: Wolfram Sang <wsa@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ip6mr: add support for passing full packet on wrong mif · 4b340a5a
      Mobashshera Rasool committed
      This patch adds support for MRT6MSG_WRMIFWHOLE which is used to pass
      full packet and real vif id when the incoming interface is wrong.
      While the RP and FHR are setting up state we need to be sending the
      registers encapsulated with all the data inside otherwise we lose it.
      The RP then decapsulates it and forwards it to the interested parties.
      Currently with WRONGMIF we can only be sending empty register packets
      and will lose that data.
      This behaviour can be enabled by using MRT_PIM with
      val == MRT6MSG_WRMIFWHOLE. This doesn't prevent MRT6MSG_WRONGMIF from
      happening, it happens in addition to it, also it is controlled by the same
      throttling parameters as WRONGMIF (i.e. 1 packet per 3 seconds currently).
      Both messages are generated to keep backwards compatibility and avoid
      breaking someone who was enabling MRT_PIM with val == 4, since any
      positive val is accepted and treated the same.
      Signed-off-by: Mobashshera Rasool <mobash.rasool.linux@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 19 Feb, 2022 10 commits
  3. 18 Feb, 2022 6 commits
    • net: avoid quadratic behavior in netdev_wait_allrefs_any() · 86213f80
      Eric Dumazet committed
      If the list of devices has N elements, netdev_wait_allrefs_any()
      is called N times, and linkwatch_forget_dev() is called N*(N-1)/2 times.

      Fix this by calling linkwatch_forget_dev() only once per device.

      Fixes: faab39f6 ("net: allow out-of-order netdev unregistration")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Link: https://lore.kernel.org/r/20220218065430.2613262-1-eric.dumazet@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • ipv6: annotate some data-races around sk->sk_prot · 086d4905
      Eric Dumazet committed
      IPv6 has this hack changing sk->sk_prot when an IPv6 socket
      is 'converted' to an IPv4 one with IPV6_ADDRFORM option.
      
      This operation is only performed for TCP and UDP, knowing
      their 'struct proto' for the two network families are populated
      in the same way, and can not disappear while a reader
      might use and dereference sk->sk_prot.
      
      If we think about it all reads of sk->sk_prot while
      either socket lock or RTNL is not acquired should be using READ_ONCE().
      
      Also note that other layers like MPTCP, XFRM, CHELSIO_TLS also
      write over sk->sk_prot.
      
      BUG: KCSAN: data-race in inet6_recvmsg / ipv6_setsockopt
      
      write to 0xffff8881386f7aa8 of 8 bytes by task 26932 on cpu 0:
       do_ipv6_setsockopt net/ipv6/ipv6_sockglue.c:492 [inline]
       ipv6_setsockopt+0x3758/0x3910 net/ipv6/ipv6_sockglue.c:1019
       udpv6_setsockopt+0x85/0x90 net/ipv6/udp.c:1649
       sock_common_setsockopt+0x5d/0x70 net/core/sock.c:3489
       __sys_setsockopt+0x209/0x2a0 net/socket.c:2180
       __do_sys_setsockopt net/socket.c:2191 [inline]
       __se_sys_setsockopt net/socket.c:2188 [inline]
       __x64_sys_setsockopt+0x62/0x70 net/socket.c:2188
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      read to 0xffff8881386f7aa8 of 8 bytes by task 26911 on cpu 1:
       inet6_recvmsg+0x7a/0x210 net/ipv6/af_inet6.c:659
       ____sys_recvmsg+0x16c/0x320
       ___sys_recvmsg net/socket.c:2674 [inline]
       do_recvmmsg+0x3f5/0xae0 net/socket.c:2768
       __sys_recvmmsg net/socket.c:2847 [inline]
       __do_sys_recvmmsg net/socket.c:2870 [inline]
       __se_sys_recvmmsg net/socket.c:2863 [inline]
       __x64_sys_recvmmsg+0xde/0x160 net/socket.c:2863
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      value changed: 0xffffffff85e0e980 -> 0xffffffff85e01580
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 1 PID: 26911 Comm: syz-executor.3 Not tainted 5.17.0-rc2-syzkaller-00316-g0457e515-dirty #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
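The annotation idea in the sk->sk_prot commit above is to load a pointer that may be rewritten concurrently exactly once into a local, then use only the local. The sketch below shows this with simplified user-space approximations of `READ_ONCE()`/`WRITE_ONCE()` (GNU C `__typeof__`, volatile access). The kernel's real macros are more elaborate, and the `proto_like`/`sock_like` types and functions are invented for illustration.

```c
/* Simplified approximations of the kernel macros: force a single
 * volatile access so the compiler cannot reload or tear the access
 * at the source level. */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

struct proto_like {
    int family;
    int (*hdr_len)(void);
};

static int v4_hdr_len(void) { return 20; }
static int v6_hdr_len(void) { return 40; }

static const struct proto_like proto_v4 = { 4, v4_hdr_len };
static const struct proto_like proto_v6 = { 6, v6_hdr_len };

struct sock_like {
    const struct proto_like *prot; /* may be swapped by another thread */
};

/* Reader: snapshot the pointer once and use only the snapshot, so both
 * the load and the call see the same ops table. */
static int sock_hdr_len(struct sock_like *sk)
{
    const struct proto_like *prot = READ_ONCE(sk->prot);

    return prot->hdr_len();
}

/* Writer: the counterpart of the IPV6_ADDRFORM-style conversion. */
static void sock_convert_to_v4(struct sock_like *sk)
{
    WRITE_ONCE(sk->prot, &proto_v4);
}
```

As in the commit message, this is only safe because the pointed-to tables never disappear while a reader may still dereference them.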
    • net/ibmvnic: Cleanup workaround doing an EOI after partition migration · 7ea0c16a
      Cédric Le Goater committed
      There were a fair number of changes to work around a firmware bug leaving
      a pending interrupt after migration of the ibmvnic device:
      
      commit 2df5c60e ("net/ibmvnic: Ignore H_FUNCTION return from H_EOI
             		    to tolerate XIVE mode")
      commit 284f87d2 ("Revert "net/ibmvnic: Fix EOI when running in
             		    XIVE mode"")
      commit 11d49ce9 ("net/ibmvnic: Fix EOI when running in XIVE mode.")
      commit f23e0643 ("ibmvnic: Clear pending interrupt after device reset")
      
      Here is the final one taking into account the XIVE interrupt mode.
      
      Cc: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
      Cc: Dany Madden <drt@linux.ibm.com>
      Signed-off-by: Cédric Le Goater <clg@kaod.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • teaming: deliver link-local packets with the link they arrive on · aaae162a
      jeffreyji committed
      skb is ignored if team port is disabled. We want the skb to be delivered
      if it's a link layer packet.

      Issue is already fixed for bonding in
      commit b89f04c6 ("bonding: deliver link-local packets with skb->dev set to link that packets arrived on")

      changelog:

      v2: change LLDP -> link layer in comments/commit descrip, comment format
      Signed-off-by: jeffreyji <jeffreyji@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'qca8k-phylink' · a3b355c7
      David S. Miller committed
      Russell King says:
      
      ====================
      net: dsa: qca8k: convert to phylink_pcs and mark as non-legacy
      
      This series adds support into DSA for the mac_select_pcs method, and
      converts qca8k to make use of this, eventually marking qca8k as non-
      legacy.
      
      Patch 1 adds DSA support for mac_select_pcs.
      Patches 2 and 3 move code around in qca8k to make patch 4 more
      readable.
      Patch 4 does a simple conversion to phylink_pcs.
      Patch 5 moves the serdes configuration to phylink_pcs.
      Patch 6 marks qca8k as non-legacy.
      
      v2: fix dsa_phylink_mac_select_pcs() formatting and double-blank line
      in patch 5
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: qca8k: mark as non-legacy · d9cbacf0
      Russell King (Oracle) committed
      The qca8k driver does not make use of the speed, duplex, pause or
      advertisement in its phylink_mac_config() implementation, so it can be
      marked as a non-legacy driver.
      Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>