1. 15 Feb, 2019 1 commit
  2. 07 Feb, 2019 1 commit
    • F
      net: Introduce ndo_get_port_parent_id() · d6abc596
      Florian Fainelli authored
      In preparation for getting rid of switchdev_ops, create a dedicated NDO
      operation for getting the port's parent identifier. There are
      essentially two classes of drivers that need to implement getting the
      port's parent ID which are VF/PF drivers with a built-in switch, and
      pure switchdev drivers such as mlxsw, ocelot, dsa etc.
      
      We introduce a helper function: dev_get_port_parent_id() which supports
      recursion into the lower devices to obtain the first port's parent ID.
      
      Convert the bridge, core and ipv4 multicast routing code to check for
      such ndo_get_port_parent_id() and call the helper function when valid
      before falling back to switchdev_port_attr_get(). This will allow us to
      convert all relevant drivers in one go instead of having to implement
      both switchdev_port_attr_get() and ndo_get_port_parent_id() operations,
      then get rid of switchdev_port_attr_get().
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Reviewed-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d6abc596
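The lookup described in this commit can be sketched in userspace C. This is only an illustrative model, not the kernel code: `struct toy_dev`, `struct parent_id`, `toy_get_port_parent_id()` and `report_own_id()` are hypothetical names, and the check that all lower devices report the same ID (returning `-ENODATA` otherwise) is an assumption that the commit message does not spell out.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MAX_LOWER 4

/* Hypothetical stand-in for a port parent identifier. */
struct parent_id {
    unsigned char id[8];
    unsigned char id_len;
};

/* Hypothetical, simplified device model (not struct net_device). */
struct toy_dev {
    /* Non-NULL when the driver implements the dedicated operation. */
    int (*get_port_parent_id)(const struct toy_dev *dev,
                              struct parent_id *ppid);
    const struct toy_dev *lower[MAX_LOWER]; /* lower devices, if any */
    size_t n_lower;
    struct parent_id own_id;
};

/* Example driver callback: report the ID stored in the device. */
static int report_own_id(const struct toy_dev *dev, struct parent_id *ppid)
{
    *ppid = dev->own_id;
    return 0;
}

/*
 * Ask the device itself first. If it has no operation and recursion is
 * allowed, walk the lower devices and keep the first ID found.
 * Assumption: differing IDs among lower devices are reported as -ENODATA.
 */
static int toy_get_port_parent_id(const struct toy_dev *dev,
                                  struct parent_id *ppid, int recurse)
{
    struct parent_id first;
    int err = -EOPNOTSUPP;
    size_t i;

    if (dev->get_port_parent_id)
        return dev->get_port_parent_id(dev, ppid);
    if (!recurse)
        return -EOPNOTSUPP;

    memset(&first, 0, sizeof(first));
    for (i = 0; i < dev->n_lower; i++) {
        err = toy_get_port_parent_id(dev->lower[i], ppid, recurse);
        if (err)
            break;
        if (!first.id_len)
            first = *ppid;
        else if (memcmp(&first, ppid, sizeof(first)) != 0)
            return -ENODATA;
    }
    return err;
}
```

In this model, a stacked device such as a bond over two ports of the same switch resolves to the ports' shared ID only when the caller passes a non-zero `recurse`.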
  3. 04 Feb, 2019 1 commit
  4. 23 Jan, 2019 1 commit
    • C
      net: introduce a knob to control whether to inherit devconf config · 856c395c
      Cong Wang authored
      There have been many people complaining about the inconsistent
      behaviors of IPv4 and IPv6 devconf when creating new network
      namespaces.  Currently, for IPv4, we inherit all current settings
      from init_net, but for IPv6 we reset all settings to their defaults.
      
      This patch introduces a new /proc file,
      /proc/sys/net/core/devconf_inherit_init_net, to control
      whether the current sysctl settings are inherited from init_net.
      This file itself is only available in init_net.
      
      As demonstrated below:
      
      Initial setup in init_net:
       # cat /proc/sys/net/ipv4/conf/all/rp_filter
       2
       # cat /proc/sys/net/ipv6/conf/all/accept_dad
       1
      
      Default value 0 (current behavior):
       # ip netns del test
       # ip netns add test
       # ip netns exec test cat /proc/sys/net/ipv4/conf/all/rp_filter
       2
       # ip netns exec test cat /proc/sys/net/ipv6/conf/all/accept_dad
       0
      
      Set to 1 (inherit from init_net):
       # echo 1 > /proc/sys/net/core/devconf_inherit_init_net
       # ip netns del test
       # ip netns add test
       # ip netns exec test cat /proc/sys/net/ipv4/conf/all/rp_filter
       2
       # ip netns exec test cat /proc/sys/net/ipv6/conf/all/accept_dad
       1
      
      Set to 2 (reset to default):
       # echo 2 > /proc/sys/net/core/devconf_inherit_init_net
       # ip netns del test
       # ip netns add test
       # ip netns exec test cat /proc/sys/net/ipv4/conf/all/rp_filter
       0
       # ip netns exec test cat /proc/sys/net/ipv6/conf/all/accept_dad
       0
      
      Set to a value out of range (invalid):
       # echo 3 > /proc/sys/net/core/devconf_inherit_init_net
       -bash: echo: write error: Invalid argument
       # echo -1 > /proc/sys/net/core/devconf_inherit_init_net
       -bash: echo: write error: Invalid argument
      Reported-by: Zhu Yanjun <Yanjun.Zhu@windriver.com>
      Reported-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
      Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Acked-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      856c395c
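The three knob values demonstrated above can be summarized as a small decision function. This is a minimal sketch derived only from the shell transcript in this commit message. The enum and function names are hypothetical and do not come from the kernel source.

```c
#include <errno.h>

/* Hypothetical names, used only for this illustration. */
enum devconf_family { DEVCONF_IPV4, DEVCONF_IPV6 };
enum devconf_source { DEVCONF_FROM_INIT_NET, DEVCONF_FROM_DEFAULTS };

/*
 * Decide where a new namespace takes its devconf values from.
 * Returns 0 on success, or -EINVAL for a knob value out of range,
 * mirroring the "Invalid argument" write error shown above.
 */
static int devconf_source_for(int knob, enum devconf_family family,
                              enum devconf_source *src)
{
    switch (knob) {
    case 0: /* previous behavior: IPv4 inherits, IPv6 resets */
        *src = (family == DEVCONF_IPV4) ? DEVCONF_FROM_INIT_NET
                                        : DEVCONF_FROM_DEFAULTS;
        return 0;
    case 1: /* both families inherit from init_net */
        *src = DEVCONF_FROM_INIT_NET;
        return 0;
    case 2: /* both families reset to defaults */
        *src = DEVCONF_FROM_DEFAULTS;
        return 0;
    default:
        return -EINVAL;
    }
}
```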
  5. 18 Jan, 2019 1 commit
    • P
      net: Add extack argument to ndo_fdb_add() · 87b0984e
      Petr Machata authored
      Drivers may not be able to support certain FDB entries, and an error
      code is insufficient to give clear hints as to the reasons of rejection.
      
      In order to make it possible to communicate the rejection reason, extend
      ndo_fdb_add() with an extack argument. Adapt the existing
      implementations of ndo_fdb_add() to take the parameter (and ignore it).
      Pass the extack parameter when invoking ndo_fdb_add() from rtnl_fdb_add().
      Signed-off-by: Petr Machata <petrm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      87b0984e
  6. 17 Dec, 2018 1 commit
  7. 14 Dec, 2018 3 commits
  8. 13 Dec, 2018 1 commit
  9. 07 Dec, 2018 3 commits
  10. 26 Nov, 2018 1 commit
  11. 25 Nov, 2018 1 commit
  12. 20 Nov, 2018 1 commit
  13. 18 Nov, 2018 1 commit
  14. 16 Nov, 2018 1 commit
  15. 15 Nov, 2018 1 commit
  16. 11 Nov, 2018 3 commits
  17. 09 Nov, 2018 1 commit
    • I
      net: core: dev_addr_lists: add auxiliary func to handle reference address updates · e7946760
      Ivan Khoronzhuk authored
      To avoid updating the whole table, and instead only remove or add
      individual addresses, there is an auxiliary function named
      __hw_addr_sync_dev(). It lets the end driver do nothing when nothing
      has changed, and add or remove an address only when it is added for
      the first time or removed for the last time. But it does not cover
      cases where an address of a real device or vlan is reused by other
      vlans or vlan/macvlan devices.

      To handle events where an address is reused or unreused, the patch adds
      a new auxiliary routine, __hw_addr_ref_sync_dev(). It does nothing
      when nothing has changed and performs updates only for an address
      being added, reused, deleted, or unreused. Thus, cloned address changes
      for vlans can be mirrored in the table. The function is mutually
      exclusive with __hw_addr_sync_dev(). It is the responsibility of the
      end driver to identify the address's vlan device, if it needs to.
      Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e7946760
  18. 04 Nov, 2018 1 commit
    • E
      net: bql: add __netdev_tx_sent_queue() · 3e59020a
      Eric Dumazet authored
      When qdisc_run() tries to use BQL budget to bulk-dequeue a batch
      of packets, GSO can later transform this list into another list
      of skbs, and each skb is sent to the device's ndo_start_xmit(),
      one at a time, with skb->xmit_more set to one for all but
      the last skb.

      The problem is that very often the BQL limit is hit in the middle of
      the packet train, forcing dev_hard_start_xmit() to stop the
      bulk send and requeue the end of the list.

      BQL's role is to avoid head-of-line blocking, making sure
      a qdisc can deliver high priority packets before low priority ones.
      
      But there is no way requeued packets can be bypassed by fresh
      packets in the qdisc.
      
      Aborting the bulk send increases TX softirqs, and hot cache
      lines (after skb_segment()) are wasted.
      
      Note that for TSO packets, we never split a packet in the middle
      because of BQL limit being hit.
      
      Drivers should be able to update BQL counters without
      flipping/caring about BQL status, if the current skb
      has xmit_more set.
      
      Upper layers are ultimately responsible for stopping the next
      packet train when the BQL limit is hit.

      A code template in a driver might look like the following:
      
      	send_doorbell = __netdev_tx_sent_queue(tx_queue, nr_bytes, skb->xmit_more);
      
      Note that use of __netdev_tx_sent_queue() is not mandatory,
      since a following patch will change dev_hard_start_xmit()
      to not care about BQL status.

      But it is highly recommended so that the full benefits of xmit_more
      can be reached (fewer doorbells sent, and fewer atomic operations as well).
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3e59020a
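The idea of updating byte counters without touching the queue state while xmit_more is set can be modeled in userspace. The sketch below is a toy, not the kernel helper: `struct toy_txq` and both functions are hypothetical, the limit check is simplified, and the return convention (ring the doorbell on the last skb of a train, or when the queue is already stopped) is an assumption about the intended behavior.

```c
#include <stdbool.h>

/* Hypothetical toy model of a transmit queue with a byte limit. */
struct toy_txq {
    unsigned int limit;     /* byte budget before the queue is stopped */
    unsigned int in_flight; /* bytes handed to hardware, not yet completed */
    bool stopped;
};

/* Full accounting: update the counter and stop the queue when over limit. */
static void toy_tx_sent_queue(struct toy_txq *q, unsigned int bytes)
{
    q->in_flight += bytes;
    if (q->in_flight > q->limit)
        q->stopped = true;
}

/*
 * Variant aware of xmit_more. While more packets of the same train
 * follow, only the counter is updated and the stop state is left alone.
 * Returns true when the driver should ring the doorbell.
 */
static bool toy_tx_sent_queue_more(struct toy_txq *q, unsigned int bytes,
                                   bool xmit_more)
{
    if (xmit_more) {
        q->in_flight += bytes;
        return q->stopped; /* flush only if the queue is already stopped */
    }
    toy_tx_sent_queue(q, bytes);
    return true; /* last packet of the train: always notify hardware */
}
```

With a 3000-byte limit and a train of four 1500-byte packets, this model rings the doorbell once, on the last packet, and only then marks the queue as stopped.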
  19. 16 Oct, 2018 1 commit
    • M
      FDDI: defza: Support capturing outgoing SMT traffic · 9f9a742d
      Maciej W. Rozycki authored
      DEC FDDIcontroller 700 (DEFZA) uses a Tx/Rx queue pair to communicate
      SMT frames with adapter's firmware.  Any SMT frame received from the RMC
      via the Rx queue is queued back by the driver to the SMT Rx queue for
      the firmware to process.  Similarly the firmware uses the SMT Tx queue
      to supply the driver with SMT frames which are queued back to the Tx
      queue for the RMC to send to the ring.
      
      When a network tap is attached to an FDDI interface handled by `defza'
      any incoming SMT frames captured are queued to our usual processing of
      network data received, which in turn delivers them to any listening
      taps.
      
      However the outgoing SMT frames produced by the firmware bypass our
      network protocol stack and are therefore not delivered to taps.  This in
      turn means that taps are missing a part of network traffic sent by the
      adapter, which may make it more difficult to track down network problems
      or do general traffic analysis.
      
      Call `dev_queue_xmit_nit' then in the SMT Tx path, having checked that
      a network tap is attached, with a newly-created `dev_nit_active' helper
      wrapping the usual condition used in the transmit path.
      Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9f9a742d
  20. 11 Oct, 2018 1 commit
    • S
      net: ipv4: update fnhe_pmtu when first hop's MTU changes · af7d6cce
      Sabrina Dubroca authored
      Since commit 5aad1de5 ("ipv4: use separate genid for next hop
      exceptions"), exceptions get deprecated separately from cached
      routes. In particular, administrative changes don't clear PMTU anymore.
      
      As Stefano described in commit e9fa1495 ("ipv6: Reflect MTU changes
      on PMTU of exceptions for MTU-less routes"), the PMTU discovered before
      the local MTU change can become stale:
       - if the local MTU is now lower than the PMTU, that PMTU is now
         incorrect
       - if the local MTU was the lowest value in the path, and is increased,
         we might discover a higher PMTU
      
      Similarly to what commit e9fa1495 did for IPv6, update PMTU in those
      cases.
      
      If the exception was locked, the discovered PMTU was smaller than the
      minimal accepted PMTU. In that case, if the new local MTU is smaller
      than the current PMTU, let PMTU discovery figure out if locking of the
      exception is still needed.
      
      To do this, we need to know the old link MTU in the NETDEV_CHANGEMTU
      notifier. By the time the notifier is called, dev->mtu has been
      changed. This patch adds the old MTU as additional information in the
      notifier structure, and a new call_netdevice_notifiers_u32() function.
      
      Fixes: 5aad1de5 ("ipv4: use separate genid for next hop exceptions")
      Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
      Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      af7d6cce
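The update rules described in this commit can be sketched as a small function that takes both the old and the new link MTU. This is a simplified userspace model under stated assumptions: `struct toy_exception` and `toy_update_pmtu()` are hypothetical names, and the comparisons follow the wording of the commit message rather than the kernel source.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a next hop exception carrying a PMTU. */
struct toy_exception {
    uint32_t pmtu;
    bool mtu_locked;
};

/*
 * Apply a local MTU change (old_mtu -> new_mtu) to a cached exception.
 * - Locked exception: if the new MTU is below the PMTU, adopt it and
 *   unlock, so that PMTU discovery can decide again.
 * - Otherwise: adopt the new MTU if it is lower than the PMTU, or if
 *   the old MTU was itself the PMTU (it may have been the bottleneck).
 */
static void toy_update_pmtu(struct toy_exception *e,
                            uint32_t new_mtu, uint32_t old_mtu)
{
    if (e->mtu_locked) {
        if (new_mtu < e->pmtu) {
            e->pmtu = new_mtu;
            e->mtu_locked = false;
        }
    } else if (new_mtu < e->pmtu || old_mtu == e->pmtu) {
        e->pmtu = new_mtu;
    }
}
```

The old MTU is needed for the second rule, which is why the commit passes it along with the NETDEV_CHANGEMTU notification.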
  21. 05 Oct, 2018 1 commit
  22. 27 Sep, 2018 1 commit
  23. 19 Sep, 2018 1 commit
  24. 14 Sep, 2018 1 commit
  25. 06 Sep, 2018 1 commit
    • V
      packet: add sockopt to ignore outgoing packets · fa788d98
      Vincent Whitchurch authored
      Currently, the only way to ignore outgoing packets on a packet socket is
      via the BPF filter.  With MSG_ZEROCOPY, packets that are looped into
      AF_PACKET are copied in dev_queue_xmit_nit(), and this copy happens even
      if the filter run from packet_rcv() would reject them.  So the presence
      of a packet socket on the interface takes away the benefits of
      MSG_ZEROCOPY, even if the packet socket is not interested in outgoing
      packets.  (Even when MSG_ZEROCOPY is not used, the skb is unnecessarily
      cloned, but the cost for that is much lower.)
      
      Add a socket option to allow AF_PACKET sockets to ignore outgoing
      packets to solve this.  Note that the *BSDs already have something
      similar: BIOCSSEESENT/BIOCSDIRECTION and BIOCSDIRFILT.
      
      The first intended user is lldpd.
      Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      fa788d98
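A userspace program might enable the new option roughly as follows. The commit message above does not name the option, so this sketch assumes it is the `PACKET_IGNORE_OUTGOING` socket option from the Linux UAPI header `linux/if_packet.h`. The fallback values are supplied only for older headers, and the wrapper name `packet_ignore_outgoing()` is hypothetical.

```c
#include <sys/socket.h>
#include <linux/if_packet.h>

/* Fallbacks for older headers; values assumed from the Linux UAPI. */
#ifndef SOL_PACKET
#define SOL_PACKET 263
#endif
#ifndef PACKET_IGNORE_OUTGOING
#define PACKET_IGNORE_OUTGOING 23
#endif

/*
 * Ask the kernel not to loop outgoing packets back into this
 * AF_PACKET socket. Returns 0 on success, -1 with errno set on failure
 * (for example on kernels that lack the option).
 */
static int packet_ignore_outgoing(int fd)
{
    int one = 1;

    return setsockopt(fd, SOL_PACKET, PACKET_IGNORE_OUTGOING,
                      &one, sizeof(one));
}
```

A capture tool that only cares about received frames would call this right after creating its AF_PACKET socket, which needs CAP_NET_RAW, and before binding it to an interface.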
  26. 30 Aug, 2018 1 commit
  27. 11 Aug, 2018 1 commit
  28. 01 Aug, 2018 2 commits
  29. 30 Jul, 2018 1 commit
  30. 17 Jul, 2018 1 commit
    • L
      net: convert gro_count to bitmask · d9f37d01
      Li RongQing authored
      gro_hash is 192 bytes and spans 3 cache lines. If there are few
      flows, gro_hash may not be fully used, so iterating over all of
      gro_hash in napi_gro_flush() touches cache lines unnecessarily.

      Convert gro_count to a bitmask and rename it gro_bitmask. Each bit
      represents an element of gro_hash, and a gro_hash element is flushed
      only if the related bit is set, which speeds up napi_gro_flush().

      Also update gro_bitmask only when it changes, to reduce cache
      updates.
      Suggested-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Li RongQing <lirongqing@baidu.com>
      Cc: Stefano Brivio <sbrivio@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d9f37d01
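The bitmask technique can be illustrated with a toy model in which only buckets whose bit is set are visited during a flush. Everything here is hypothetical: `struct toy_napi`, the bucket count of 8, and the per-bucket counter stand in for the real structures, and `__builtin_ctzl()` is a GCC/Clang builtin used for brevity.

```c
/* Assumed bucket count for this illustration. */
#define TOY_GRO_BUCKETS 8

/* Hypothetical model: one pending-packet counter per hash bucket. */
struct toy_napi {
    unsigned long gro_bitmask;             /* bit i set: bucket i non-empty */
    unsigned int pending[TOY_GRO_BUCKETS]; /* packets held per bucket */
};

/* Queue a packet in a bucket; write the bitmask only if it changes. */
static void toy_gro_add(struct toy_napi *n, unsigned int bucket)
{
    n->pending[bucket]++;
    if (!(n->gro_bitmask & (1UL << bucket)))
        n->gro_bitmask |= 1UL << bucket;
}

/* Flush only the non-empty buckets; returns how many were visited. */
static unsigned int toy_gro_flush(struct toy_napi *n)
{
    unsigned long mask = n->gro_bitmask;
    unsigned int visited = 0;

    while (mask) {
        unsigned int bucket = (unsigned int)__builtin_ctzl(mask);

        n->pending[bucket] = 0; /* "deliver" everything in this bucket */
        mask &= mask - 1;       /* clear the lowest set bit */
        visited++;
    }
    n->gro_bitmask = 0;
    return visited;
}
```

With packets queued in buckets 1 and 6 only, the flush loop runs twice instead of eight times, which is the saving the commit aims for when few flows are active.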
  31. 16 Jul, 2018 1 commit
  32. 14 Jul, 2018 2 commits