1. 04 1月, 2017 1 次提交
    • S
      af_packet: TX_RING support for TPACKET_V3 · 7f953ab2
      Sowmini Varadhan 提交于
      Although TPACKET_V3 Rx has some benefits over TPACKET_V2 Rx, *_v3
      does not currently have TX_RING support. As a result an application
      that wants the best perf for Tx and Rx (e.g. to handle request/response
      transacations) ends up needing 2 sockets, one with *_v2 for Tx and
      another with *_v3 for Rx.
      
      This patch enables TPACKET_V2 compatible Tx features in TPACKET_V3
      so that an application can use a single descriptor to get the benefits
      of _v3 RX_RING and _v2 TX_RING. An application may do a block-send by
      first filling up multiple frames in the Tx ring and then triggering a
      transmit. This patch only support fixed size Tx frames for TPACKET_V3,
      and requires that tp_next_offset must be zero.
      Signed-off-by: NSowmini Varadhan <sowmini.varadhan@oracle.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7f953ab2
  2. 03 1月, 2017 1 次提交
  3. 11 12月, 2016 1 次提交
  4. 09 12月, 2016 3 次提交
  5. 05 12月, 2016 1 次提交
    • F
      netfilter: conntrack: add nf_conntrack_default_on sysctl · 481fa373
      Florian Westphal 提交于
      This switch (default on) can be used to disable automatic registration
      of connection tracking functionality in newly created network
      namespaces.
      
      This means that when net namespace goes down (or the tracker protocol
      module is unloaded) we *might* have to unregister the hooks.
      
      We can either add another per-netns variable that tells if
      the hooks got registered by default, or, alternatively, just call
      the protocol _put() function and have the callee deal with a possible
      'extra' put() operation that doesn't pair with a get() one.
      
      This uses the latter approach, i.e. a put() without a get has no effect.
      
      Conntrack is still enabled automatically regardless of the new sysctl
      setting if the new net namespace requires connection tracking, e.g. when
      NAT rules are created.
      Signed-off-by: NFlorian Westphal <fw@strlen.de>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      481fa373
  6. 04 12月, 2016 2 次提交
  7. 03 12月, 2016 1 次提交
  8. 30 11月, 2016 2 次提交
    • F
      tcp: SOF_TIMESTAMPING_OPT_STATS option for SO_TIMESTAMPING · 1c885808
      Francis Yan 提交于
      This patch exports the sender chronograph stats via the socket
      SO_TIMESTAMPING channel. Currently we can instrument how long a
      particular application unit of data was queued in TCP by tracking
      SOF_TIMESTAMPING_TX_SOFTWARE and SOF_TIMESTAMPING_TX_SCHED. Having
      these sender chronograph stats exported simultaneously along with
      these timestamps allow further breaking down the various sender
      limitation.  For example, a video server can tell if a particular
      chunk of video on a connection takes a long time to deliver because
      TCP was experiencing small receive window. It is not possible to
      tell before this patch without packet traces.
      
      To prepare these stats, the user needs to set
      SOF_TIMESTAMPING_OPT_STATS and SOF_TIMESTAMPING_OPT_TSONLY flags
      while requesting other SOF_TIMESTAMPING TX timestamps. When the
      timestamps are available in the error queue, the stats are returned
      in a separate control message of type SCM_TIMESTAMPING_OPT_STATS,
      in a list of TLVs (struct nlattr) of types: TCP_NLA_BUSY_TIME,
      TCP_NLA_RWND_LIMITED, TCP_NLA_SNDBUF_LIMITED. Unit is microsecond.
      Signed-off-by: NFrancis Yan <francisyyan@gmail.com>
      Signed-off-by: NYuchung Cheng <ycheng@google.com>
      Signed-off-by: NSoheil Hassas Yeganeh <soheil@google.com>
      Acked-by: NNeal Cardwell <ncardwell@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1c885808
    • S
      docs: ieee802154: update main documentation file · 6bf0d84d
      Stefan Schmidt 提交于
      This updates some out of date documentation and fixes some wrong assumptions as
      well as pure grammar fixes. This file needs to move towards the new kernel doc
      system and getting an overhaul during this work.
      Signed-off-by: NStefan Schmidt <stefan@osg.samsung.com>
      6bf0d84d
  9. 29 11月, 2016 4 次提交
  10. 24 11月, 2016 1 次提交
  11. 10 11月, 2016 2 次提交
  12. 08 11月, 2016 1 次提交
  13. 24 10月, 2016 1 次提交
  14. 17 10月, 2016 6 次提交
  15. 14 10月, 2016 1 次提交
  16. 10 10月, 2016 1 次提交
  17. 28 9月, 2016 1 次提交
  18. 20 9月, 2016 1 次提交
  19. 19 9月, 2016 1 次提交
    • M
      ipvlan: Introduce l3s mode · 4fbae7d8
      Mahesh Bandewar 提交于
      In a typical IPvlan L3 setup where master is in default-ns and
      each slave is into different (slave) ns. In this setup egress
      packet processing for traffic originating from slave-ns will
      hit all NF_HOOKs in slave-ns as well as default-ns. However same
      is not true for ingress processing. All these NF_HOOKs are
      hit only in the slave-ns skipping them in the default-ns.
      IPvlan in L3 mode is restrictive and if admins want to deploy
      iptables rules in default-ns, this asymmetric data path makes it
      impossible to do so.
      
      This patch makes use of the l3_rcv() (added as part of l3mdev
      enhancements) to perform input route lookup on RX packets without
      changing the skb->dev and then uses nf_hook at NF_INET_LOCAL_IN
      to change the skb->dev just before handing over skb to L4.
      Signed-off-by: NMahesh Bandewar <maheshb@google.com>
      CC: David Ahern <dsa@cumulusnetworks.com>
      Reviewed-by: NDavid Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4fbae7d8
  20. 02 9月, 2016 1 次提交
    • D
      rxrpc: Don't expose skbs to in-kernel users [ver #2] · d001648e
      David Howells 提交于
      Don't expose skbs to in-kernel users, such as the AFS filesystem, but
      instead provide a notification hook the indicates that a call needs
      attention and another that indicates that there's a new call to be
      collected.
      
      This makes the following possibilities more achievable:
      
       (1) Call refcounting can be made simpler if skbs don't hold refs to calls.
      
       (2) skbs referring to non-data events will be able to be freed much sooner
           rather than being queued for AFS to pick up as rxrpc_kernel_recv_data
           will be able to consult the call state.
      
       (3) We can shortcut the receive phase when a call is remotely aborted
           because we don't have to go through all the packets to get to the one
           cancelling the operation.
      
       (4) It makes it easier to do encryption/decryption directly between AFS's
           buffers and sk_buffs.
      
       (5) Encryption/decryption can more easily be done in the AFS's thread
           contexts - usually that of the userspace process that issued a syscall
           - rather than in one of rxrpc's background threads on a workqueue.
      
       (6) AFS will be able to wait synchronously on a call inside AF_RXRPC.
      
      To make this work, the following interface function has been added:
      
           int rxrpc_kernel_recv_data(
      		struct socket *sock, struct rxrpc_call *call,
      		void *buffer, size_t bufsize, size_t *_offset,
      		bool want_more, u32 *_abort_code);
      
      This is the recvmsg equivalent.  It allows the caller to find out about the
      state of a specific call and to transfer received data into a buffer
      piecemeal.
      
      afs_extract_data() and rxrpc_kernel_recv_data() now do all the extraction
      logic between them.  They don't wait synchronously yet because the socket
      lock needs to be dealt with.
      
      Five interface functions have been removed:
      
      	rxrpc_kernel_is_data_last()
          	rxrpc_kernel_get_abort_code()
          	rxrpc_kernel_get_error_number()
          	rxrpc_kernel_free_skb()
          	rxrpc_kernel_data_consumed()
      
      As a temporary hack, sk_buffs going to an in-kernel call are queued on the
      rxrpc_call struct (->knlrecv_queue) rather than being handed over to the
      in-kernel user.  To process the queue internally, a temporary function,
      temp_deliver_data() has been added.  This will be replaced with common code
      between the rxrpc_recvmsg() path and the kernel_rxrpc_recv_data() path in a
      future patch.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d001648e
  21. 01 9月, 2016 1 次提交
  22. 30 8月, 2016 2 次提交
    • D
      rxrpc: Pass struct socket * to more rxrpc kernel interface functions · 4de48af6
      David Howells 提交于
      Pass struct socket * to more rxrpc kernel interface functions.  They should
      be starting from this rather than the socket pointer in the rxrpc_call
      struct if they need to access the socket.
      
      I have left:
      
      	rxrpc_kernel_is_data_last()
      	rxrpc_kernel_get_abort_code()
      	rxrpc_kernel_get_error_number()
      	rxrpc_kernel_free_skb()
      	rxrpc_kernel_data_consumed()
      
      unmodified as they're all about to be removed (and, in any case, don't
      touch the socket).
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      4de48af6
    • D
      rxrpc: Provide a way for AFS to ask for the peer address of a call · 8324f0bc
      David Howells 提交于
      Provide a function so that kernel users, such as AFS, can ask for the peer
      address of a call:
      
         void rxrpc_kernel_get_peer(struct rxrpc_call *call,
      			      struct sockaddr_rxrpc *_srx);
      
      In the future the kernel service won't get sk_buffs to look inside.
      Further, this allows us to hide any canonicalisation inside AF_RXRPC for
      when IPv6 support is added.
      
      Also propagate this through to afs_find_server() and issue a warning if we
      can't handle the address family yet.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      8324f0bc
  23. 29 8月, 2016 1 次提交
  24. 27 8月, 2016 1 次提交
    • I
      bridge: switchdev: Add forward mark support for stacked devices · 6bc506b4
      Ido Schimmel 提交于
      switchdev_port_fwd_mark_set() is used to set the 'offload_fwd_mark' of
      port netdevs so that packets being flooded by the device won't be
      flooded twice.
      
      It works by assigning a unique identifier (the ifindex of the first
      bridge port) to bridge ports sharing the same parent ID. This prevents
      packets from being flooded twice by the same switch, but will flood
      packets through bridge ports belonging to a different switch.
      
      This method is problematic when stacked devices are taken into account,
      such as VLANs. In such cases, a physical port netdev can have upper
      devices being members in two different bridges, thus requiring two
      different 'offload_fwd_mark's to be configured on the port netdev, which
      is impossible.
      
      The main problem is that packet and netdev marking is performed at the
      physical netdev level, whereas flooding occurs between bridge ports,
      which are not necessarily port netdevs.
      
      Instead, packet and netdev marking should really be done in the bridge
      driver with the switch driver only telling it which packets it already
      forwarded. The bridge driver will mark such packets using the mark
      assigned to the ingress bridge port and will prevent the packet from
      being forwarded through any bridge port sharing the same mark (i.e.
      having the same parent ID).
      
      Remove the current switchdev 'offload_fwd_mark' implementation and
      instead implement the proposed method. In addition, make rocker - the
      sole user of the mark - use the proposed method.
      Signed-off-by: NIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: NJiri Pirko <jiri@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6bc506b4
  25. 25 8月, 2016 1 次提交
    • V
      net: dsa: rename switch operations structure · 9d490b4e
      Vivien Didelot 提交于
      Now that the dsa_switch_driver structure contains only function pointers
      as it is supposed to, rename it to the more appropriate dsa_switch_ops,
      uniformly to any other operations structure in the kernel.
      
      No functional changes here, basically just the result of something like:
      s/dsa_switch_driver *drv/dsa_switch_ops *ops/g
      
      However keep the {un,}register_switch_driver functions and their
      dsa_switch_drivers list as is, since they represent the -- likely to be
      deprecated soon -- legacy DSA registration framework.
      
      In the meantime, also fix the following checks from checkpatch.pl to
      make it happy with this patch:
      
          CHECK: Comparison to NULL could be written "!ops"
          #403: FILE: net/dsa/dsa.c:470:
          +	if (ops == NULL) {
      
          CHECK: Comparison to NULL could be written "ds->ops->get_strings"
          #773: FILE: net/dsa/slave.c:697:
          +		if (ds->ops->get_strings != NULL)
      
          CHECK: Comparison to NULL could be written "ds->ops->get_ethtool_stats"
          #824: FILE: net/dsa/slave.c:785:
          +	if (ds->ops->get_ethtool_stats != NULL)
      
          CHECK: Comparison to NULL could be written "ds->ops->get_sset_count"
          #835: FILE: net/dsa/slave.c:798:
          +		if (ds->ops->get_sset_count != NULL)
      
          total: 0 errors, 0 warnings, 4 checks, 784 lines checked
      Signed-off-by: NVivien Didelot <vivien.didelot@savoirfairelinux.com>
      Acked-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9d490b4e
  26. 24 8月, 2016 1 次提交