1. 20 4月, 2016 2 次提交
  2. 18 4月, 2016 1 次提交
  3. 17 4月, 2016 2 次提交
    • A
      ip_tunnel_core: iptunnel_handle_offloads returns int and doesn't free skb · aed069df
      Alexander Duyck 提交于
      This patch updates the IP tunnel core function iptunnel_handle_offloads so
      that we return an int and do not free the skb inside the function.  This
      actually allows us to clean up several paths in several tunnels so that we
      can free the skb at one point in the path without having to have a
      secondary path if we are supporting tunnel offloads.
      
      In addition it should resolve some double-free issues I have found in the
      tunnels paths as I believe it is possible for us to end up triggering such
      an event in the case of fou or gue.
      Signed-off-by: NAlexander Duyck <aduyck@mirantis.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      aed069df
    • H
      vxlan: synchronously and race-free destruction of vxlan sockets · 0412bd93
      Hannes Frederic Sowa 提交于
      Due to the fact that the udp socket is destructed asynchronously in a
      work queue, we have some nondeterministic behavior during shutdown of
      vxlan tunnels and creating new ones. Fix this by keeping the destruction
      process synchronous in regards to the user space process so IFF_UP can
      be reliably set.
      
      udp_tunnel_sock_release destroys vs->sock->sk if reference counter
      indicates so. We expect to have the same lifetime of vxlan_sock and
      vxlan_sock->sock->sk even in fast paths with only rcu locks held. So
      only destruct the whole socket after we can be sure it cannot be found
      by searching vxlan_net->sock_list.
      
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Jiri Benc <jbenc@redhat.com>
      Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0412bd93
  4. 16 4月, 2016 5 次提交
    • X
      sctp: export some apis or variables for sctp_diag and reuse some for proc · 626d16f5
      Xin Long 提交于
      For some main variables in sctp.ko, we couldn't export it to other modules,
      so we have to define some api to access them.
      
      It will include sctp transport and endpoint's traversal.
      
      There are some transport traversal functions for sctp_diag, we can also
      use it for sctp_proc. cause they have the similar situation to traversal
      transport.
      
      v2->v3:
      - rhashtable_walk_init need the parameter gfp, because of recent upstrem
        update
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      626d16f5
    • X
      sctp: add sctp_info dump api for sctp_diag · 52c52a61
      Xin Long 提交于
      sctp_diag will dump some important details of sctp's assoc or ep, we use
      sctp_info to describe them,  sctp_get_sctp_info to get them, and export
      it to sctp_diag.ko.
      
      v2->v3:
      - we will not use list_for_each_safe in sctp_get_sctp_info, cause
        all the callers of it will use lock_sock.
      
      - fix the holes in struct sctp_info with __reserved* field.
        because sctp_diag is a new feature, and sctp_info is just for now,
        it may be changed in the future.
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      52c52a61
    • M
      sctp: simplify sk_receive_queue locking · 311b2177
      Marcelo Ricardo Leitner 提交于
      SCTP already serializes access to rcvbuf through its sock lock:
      sctp_recvmsg takes it right in the start and release at the end, while
      rx path will also take the lock before doing any socket processing. On
      sctp_rcv() it will check if there is an user using the socket and, if
      there is, it will queue incoming packets to the backlog. The backlog
      processing will do the same. Even timers will do such check and
      re-schedule if an user is using the socket.
      
      Simplifying this will allow us to remove sctp_skb_list_tail and get ride
      of some expensive lockings.  The lists that it is used on are also
      mangled with functions like __skb_queue_tail and __skb_unlink in the
      same context, like on sctp_ulpq_tail_event() and sctp_clear_pd().
      sctp_close() will also purge those while using only the sock lock.
      
      Therefore the lockings performed by sctp_skb_list_tail() are not
      necessary. This patch removes this function and replaces its calls with
      just skb_queue_splice_tail_init() instead.
      
      The biggest gain is at sctp_ulpq_tail_event(), because the events always
      contain a list, even if it's queueing a single skb and this was
      triggering expensive calls to spin_lock_irqsave/_irqrestore for every
      data chunk received.
      
      As SCTP will deliver each data chunk on a corresponding recvmsg, the
      more effective the change will be.
      Before this patch, with chunks with 30 bytes:
      netperf -t SCTP_STREAM -H 192.168.1.2 -cC -l 60 -- -m 30 -S 400000
      400000 -s 400000 400000
      on a 10Gbit link with 1500 MTU:
      
      SCTP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.1.1 () port 0 AF_INET
      Recv   Send    Send                          Utilization       Service Demand
      Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
      Size   Size    Size     Time     Throughput  local    remote   local   remote
      bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB
      
      425984 425984     30    60.00       137.45   7.34     7.36     52.504  52.608
      
      With it:
      
      SCTP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.1.1 () port 0 AF_INET
      Recv   Send    Send                          Utilization       Service Demand
      Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
      Size   Size    Size     Time     Throughput  local    remote   local   remote
      bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB
      
      425984 425984     30    60.00       179.10   7.97     6.70     43.740  36.788
      Signed-off-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      311b2177
    • E
      tcp: do not mess with listener sk_wmem_alloc · b3d05147
      Eric Dumazet 提交于
      When removing sk_refcnt manipulation on synflood, I missed that
      using skb_set_owner_w() was racy, if sk->sk_wmem_alloc had already
      transitioned to 0.
      
      We should hold sk_refcnt instead, but this is a big deal under attack.
      (Doing so increase performance from 3.2 Mpps to 3.8 Mpps only)
      
      In this patch, I chose to not attach a socket to syncookies skb.
      
      Performance is now 5 Mpps instead of 3.2 Mpps.
      
      Following patch will remove last known false sharing in
      tcp_rcv_state_process()
      
      Fixes: 3b24d854 ("tcp/dccp: do not touch listener sk_refcnt under synflood")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b3d05147
    • J
      devlink: fix sb register stub in case devlink is disabled · de33efd0
      Jiri Pirko 提交于
      Reported-by: Nkbuild test robot <fengguang.wu@intel.com>
      Fixes: bf797471 ("devlink: add shared buffer configuration")
      Signed-off-by: NJiri Pirko <jiri@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      de33efd0
  5. 15 4月, 2016 2 次提交
  6. 14 4月, 2016 9 次提交
  7. 13 4月, 2016 2 次提交
  8. 12 4月, 2016 5 次提交
    • J
      cfg80211: remove enum ieee80211_band · 57fbcce3
      Johannes Berg 提交于
      This enum is already perfectly aliased to enum nl80211_band, and
      the only reason for it is that we get IEEE80211_NUM_BANDS out of
      it. There's no really good reason to not declare the number of
      bands in nl80211 though, so do that and remove the cfg80211 one.
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      57fbcce3
    • J
      cfg80211: Improve Connect/Associate command documentation · 35eb8f7b
      Jouni Malinen 提交于
      The roaming cases for the Connect command were not fully covered and
      neither Connect nor Associate command uses of the prev_bssid parameter
      were very clear. Add details to describe how the prev_bssid argument is
      supposed to be used and when the driver should use association or
      reassociation.
      Signed-off-by: NJouni Malinen <jouni@qca.qualcomm.com>
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      35eb8f7b
    • D
      rxrpc: Differentiate local and remote abort codes in structs · dc44b3a0
      David Howells 提交于
      In the rxrpc_connection and rxrpc_call structs, there's one field to hold
      the abort code, no matter whether that value was generated locally to be
      sent or was received from the peer via an abort packet.
      
      Split the abort code fields in two for cleanliness sake and add an error
      field to hold the Linux error number to the rxrpc_call struct too
      (sometimes this is generated in a context where we can't return it to
      userspace directly).
      
      Furthermore, add a skb mark to indicate a packet that caused a local abort
      to be generated so that recvmsg() can pick up the correct abort code.  A
      future addition will need to be to indicate to userspace the difference
      between aborts via a control message.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      dc44b3a0
    • D
      rxrpc: Move some miscellaneous bits out into their own file · 8e688d9c
      David Howells 提交于
      Move some miscellaneous bits out into their own file to make it easier to
      split the call handling.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8e688d9c
    • D
      net: ipv4: Consider failed nexthops in multipath routes · a6db4494
      David Ahern 提交于
      Multipath route lookups should consider knowledge about next hops and not
      select a hop that is known to be failed.
      
      Example:
      
                           [h2]                   [h3]   15.0.0.5
                            |                      |
                           3|                     3|
                          [SP1]                  [SP2]--+
                           1  2                   1     2
                           |  |     /-------------+     |
                           |   \   /                    |
                           |     X                      |
                           |    / \                     |
                           |   /   \---------------\    |
                           1  2                     1   2
               12.0.0.2  [TOR1] 3-----------------3 [TOR2] 12.0.0.3
                           4                         4
                            \                       /
                              \                    /
                               \                  /
                                -------|   |-----/
                                       1   2
                                      [TOR3]
                                        3|
                                         |
                                        [h1]  12.0.0.1
      
      host h1 with IP 12.0.0.1 has 2 paths to host h3 at 15.0.0.5:
      
          root@h1:~# ip ro ls
          ...
          12.0.0.0/24 dev swp1  proto kernel  scope link  src 12.0.0.1
          15.0.0.0/16
                  nexthop via 12.0.0.2  dev swp1 weight 1
                  nexthop via 12.0.0.3  dev swp1 weight 1
          ...
      
      If the link between tor3 and tor1 is down and the link between tor1
      and tor2 then tor1 is effectively cut-off from h1. Yet the route lookups
      in h1 are alternating between the 2 routes: ping 15.0.0.5 gets one and
      ssh 15.0.0.5 gets the other. Connections that attempt to use the
      12.0.0.2 nexthop fail since that neighbor is not reachable:
      
          root@h1:~# ip neigh show
          ...
          12.0.0.3 dev swp1 lladdr 00:02:00:00:00:1b REACHABLE
          12.0.0.2 dev swp1  FAILED
          ...
      
      The failed path can be avoided by considering known neighbor information
      when selecting next hops. If the neighbor lookup fails we have no
      knowledge about the nexthop, so give it a shot. If there is an entry
      then only select the nexthop if the state is sane. This is similar to
      what fib_detect_death does.
      
      To maintain backward compatibility use of the neighbor information is
      based on a new sysctl, fib_multipath_use_neigh.
      Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com>
      Reviewed-by: NJulian Anastasov <ja@ssi.bg>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a6db4494
  9. 09 4月, 2016 3 次提交
  10. 08 4月, 2016 9 次提交