1. 06 Apr 2017, 1 commit
    • bonding: attempt to better support longer hw addresses · faeeb317
      Authored by Jarod Wilson
      People are using bonding over InfiniBand IPoIB connections, and who knows
      what else. InfiniBand has a hardware address length of 20 octets
      (INFINIBAND_ALEN), and the network core defines a MAX_ADDR_LEN of 32.
      Various places in the bonding code are currently hard-wired to 6 octets
      (ETH_ALEN), such as the 3ad code, which I've left untouched here. Besides,
      only alb is currently possible on InfiniBand links anyway, due to
      commit 1533e773, so the alb code is where most of the changes are.
      
      One major component of this change is the addition of a bond_hw_addr_copy
      function that takes a length argument, instead of using ether_addr_copy
      everywhere that hardware addresses need to be copied about. The other
      major component is converting the bonding code from struct sockaddr to
      struct sockaddr_storage for address storage: the former holds only 14
      octets of address data, while the latter holds 128 minus a few, which is
      necessary to support bonding over devices with hardware addresses of up
      to MAX_ADDR_LEN octets. Additionally, this probably fixes some memory
      corruption issues in the current code, where it is possible to write an
      InfiniBand hardware address into a sockaddr declared on the stack.
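
      A rough sketch of the length-aware copy helper described above (treat
      this as illustrative; the exact definition and its callers are in the
      patch itself):

          /* copy a hardware address of arbitrary length, up to MAX_ADDR_LEN */
          static inline void bond_hw_addr_copy(u8 *dst, const u8 *src,
                                               unsigned int len)
          {
                  memcpy(dst, src, len);
          }

          /* Callers now keep addresses in struct sockaddr_storage (128 bytes)
           * rather than struct sockaddr (14 bytes of address data), so a
           * 20-octet IPoIB address fits without overflowing the structure. */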
      
      Lightly tested on a dual mlx4 IPoIB setup, which properly shows a 20-octet
      hardware address now:
      
      $ cat /proc/net/bonding/bond0
      Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
      
      Bonding Mode: fault-tolerance (active-backup) (fail_over_mac active)
      Primary Slave: mlx4_ib0 (primary_reselect always)
      Currently Active Slave: mlx4_ib0
      MII Status: up
      MII Polling Interval (ms): 100
      Up Delay (ms): 100
      Down Delay (ms): 100
      
      Slave Interface: mlx4_ib0
      MII Status: up
      Speed: Unknown
      Duplex: Unknown
      Link Failure Count: 0
      Permanent HW addr:
      80:00:02:08:fe:80:00:00:00:00:00:00:e4:1d:2d:03:00:1d:67:01
      Slave queue ID: 0
      
      Slave Interface: mlx4_ib1
      MII Status: up
      Speed: Unknown
      Duplex: Unknown
      Link Failure Count: 0
      Permanent HW addr:
      80:00:02:09:fe:80:00:00:00:00:00:01:e4:1d:2d:03:00:1d:67:02
      Slave queue ID: 0
      
      Also tested with a standard 1Gbps NIC bonding setup (with a mix of
      e1000 and e1000e cards), running LNST's bonding tests.
      
      CC: Jay Vosburgh <j.vosburgh@gmail.com>
      CC: Veaceslav Falico <vfalico@gmail.com>
      CC: Andy Gospodarek <andy@greyhouse.net>
      CC: netdev@vger.kernel.org
      Signed-off-by: Jarod Wilson <jarod@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 05 Apr 2017, 4 commits
  3. 04 Apr 2017, 7 commits
  4. 03 Apr 2017, 1 commit
  5. 02 Apr 2017, 4 commits
    • net: mpls: Increase max number of labels for lwt encap · 1511009c
      Authored by David Ahern
      Allow users to push down more labels per MPLS encap. Similar to the LSR
      case, move the label array to the end of mpls_iptunnel_encap and allocate
      it based on the number of labels for the route.

      For consistency with the LSR case, re-use the same maximum number of
      labels.
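
      A minimal kernel-style sketch of the layout change described above (field
      names other than the trailing label array are illustrative assumptions):

          struct mpls_iptunnel_encap {
                  u8      labels;        /* number of labels for this route */
                  u8      ttl_propagate;
                  u8      default_ttl;
                  u8      reserved;
                  u32     label[];       /* flexible array member at the end */
          };

          /* The per-route state is then allocated to fit exactly the labels
           * that were configured, e.g.:
           *
           *     en = kzalloc(sizeof(*en) + num_labels * sizeof(u32), GFP_KERNEL);
           */
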
      Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • udp: use sk_protocol instead of pcflag to detect udplite sockets · 3d8417d7
      Authored by Paolo Abeni
      In the udp_sock struct, the 'forward_deficit' and 'pcflag' fields
      share the same cacheline. While the former is dirtied by
      udp_recvmsg, the latter is read, possibly several times, by the
      bottom half processing to discriminate between udp and udplite
      sockets.

      With this patch, sk->sk_protocol is used to check whether the socket
      is really a udplite one, avoiding some cache misses per packet and
      improving performance under udp flood with small packets by up
      to 10%.
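
      A sketch of the kind of check this enables (the exact macro text in the
      tree may differ):

          /* test the read-mostly protocol number instead of pcflag, which
           * shares a cacheline with the frequently dirtied forward_deficit */
          #define IS_UDPLITE(__sk)  ((__sk)->sk_protocol == IPPROTO_UDPLITE)
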
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bpf: introduce BPF_PROG_TEST_RUN command · 1cf1cae9
      Authored by Alexei Starovoitov
      Development and testing of networking bpf programs is quite cumbersome.
      Despite the availability of user space bpf interpreters, the kernel is
      the ultimate authority and execution environment.
      Current test frameworks for TC include creation of netns, veth and
      qdiscs, and use of various packet generators just to test the
      functionality of a bpf program. XDP testing is even more complicated,
      since qemu needs to be started with gro/gso disabled and a precise queue
      configuration, the xdp program has to be transferred from host into
      guest and attached to virtio/eth0, and traffic has to be generated from
      the host while capturing the results from the guest.

      Moreover, analyzing performance bottlenecks in an XDP program is
      impossible in a virtio environment, since the cost of running the
      program is tiny compared to the overhead of virtio packet processing,
      so performance testing can only be done on a physical nic
      with another server generating traffic.

      Furthermore, ongoing changes to the user space control plane of
      production applications cannot be run on the test servers, leaving bpf
      programs stubbed out for testing.

      Last but not least, upstream llvm changes are validated by the bpf
      backend testsuite, which has no ability to test the generated code.

      To improve this situation, introduce a BPF_PROG_TEST_RUN command
      to test and performance-benchmark bpf programs.
      
      Joint work with Daniel Borkmann.
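
      A hedged user-space sketch of driving the new command through the raw
      bpf(2) syscall; the attr field names follow the uapi added here but
      should be treated as illustrative, and headers new enough to carry the
      command are assumed:

          #include <linux/bpf.h>      /* union bpf_attr, BPF_PROG_TEST_RUN */
          #include <string.h>
          #include <sys/syscall.h>
          #include <unistd.h>

          /* run a loaded program over a test packet 'repeat' times and report
           * its return value and average per-run duration */
          static int prog_test_run(int prog_fd, const void *data_in,
                                   __u32 size_in, void *data_out,
                                   __u32 *size_out, __u32 repeat,
                                   __u32 *retval, __u32 *duration)
          {
                  union bpf_attr attr;

                  memset(&attr, 0, sizeof(attr));
                  attr.test.prog_fd      = prog_fd;
                  attr.test.data_in      = (__u64)(unsigned long)data_in;
                  attr.test.data_size_in = size_in;
                  attr.test.data_out     = (__u64)(unsigned long)data_out;
                  attr.test.repeat       = repeat;

                  if (syscall(__NR_bpf, BPF_PROG_TEST_RUN, &attr, sizeof(attr)))
                          return -1;

                  if (size_out)
                          *size_out = attr.test.data_size_out;
                  if (retval)
                          *retval = attr.test.retval;
                  if (duration)
                          *duration = attr.test.duration;
                  return 0;
          }
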
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: add cross-chip bridging operations · 40ef2c93
      Authored by Vivien Didelot
      Introduce crosschip_bridge_{join,leave} operations in the dsa_switch_ops
      structure, which can be used by switches supporting interconnection.
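
      A minimal sketch of how such hooks sit in the switch ops structure (the
      argument lists are assumptions, not quoted from the patch):

          struct dsa_switch_ops {
                  /* ... existing per-switch operations ... */
                  int  (*crosschip_bridge_join)(struct dsa_switch *ds,
                                                int sw_index, int port,
                                                struct net_device *bridge);
                  void (*crosschip_bridge_leave)(struct dsa_switch *ds,
                                                 int sw_index, int port,
                                                 struct net_device *bridge);
          };
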
      Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  6. 31 Mar 2017, 1 commit
    • sock: avoid dirtying sk_stamp, if possible · 6c7c98ba
      Authored by Paolo Abeni
      sock_recv_ts_and_drops() unconditionally sets sk->sk_stamp for
      every packet, even if the SOCK_TIMESTAMP flag is not set on the
      related socket.
      If selinux is enabled, this causes a cache miss for every packet,
      since sk->sk_stamp and sk->sk_security share the same cacheline.
      With this change, sk_stamp is set only if the SOCK_TIMESTAMP
      flag is set, and is cleared for the first packet, so that the
      user-perceived behavior is unchanged.

      This gives up to 5% speed-up under udp flood with small packets.
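
      The core of the change is making the timestamp write conditional; a
      rough sketch of the receive-path logic (not the literal hunk):

          /* only dirty sk_stamp when the socket actually exposes it */
          if (sock_flag(sk, SOCK_TIMESTAMP))
                  sk->sk_stamp = skb->tstamp;
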
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  7. 29 Mar 2017, 9 commits
  8. 28 Mar 2017, 4 commits
  9. 26 Mar 2017, 2 commits
    • gtp: support SGSN-side tunnels · 91ed81f9
      Authored by Jonas Bonn
      The GTP-tunnel driver is explicitly GGSN-side, as it searches for PDP
      contexts based on the incoming packet's _destination_ address.  If we
      want to place ourselves on the SGSN side of the tunnel, then we want
      to identify PDP contexts based on the _source_ address.

      Let it be noted that in a "real" configuration this module would never
      be used:  the SGSN normally does not see IP packets as input.  The
      justification for this functionality is PGW load-testing applications,
      where the input to the SGSN is locally generated IP traffic.

      This patch adds a "role" argument at GTP-link creation time to specify
      whether we are on the GGSN or SGSN side of the tunnel; this flag is then
      used to select which address of the IP packet is used to look up
      the PDP context.
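
      A rough sketch of the role-dependent lookup described above (the role
      constant and the lookup helper names are illustrative assumptions):

          /* pick the address used to find the PDP context according to which
           * side of the tunnel this link plays */
          if (gtp->role == GTP_ROLE_SGSN)
                  pctx = ipv4_pdp_find(gtp, iph->saddr);  /* source      */
          else
                  pctx = ipv4_pdp_find(gtp, iph->daddr);  /* destination */
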
      Signed-off-by: Jonas Bonn <jonas@southpole.se>
      Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Acked-by: Harald Welte <laforge@gnumonks.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • gtp: rename SGSN netlink attribute · ae6336b5
      Authored by Jonas Bonn
      This is a mostly cosmetic rename of the GTP link's SGSN netlink
      attribute.  The justification is that we will be making the module
      support decapsulation of "downstream" SGSN packets, in which case the
      netlink parameter actually refers to the upstream GGSN peer.  Renaming
      the parameter makes the relationship clearer.

      The legacy name is kept as a define in the header file so that existing
      code does not break.
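
      A minimal sketch of how such a rename can stay backward compatible in a
      uapi header (the attribute names here are assumptions based on the
      description above, not quoted from the patch):

          enum gtp_attrs {
                  GTPA_UNSPEC,
                  /* ... other attributes ... */
                  GTPA_PEER_ADDRESS,      /* formerly the SGSN address */
          };

          /* legacy alias so existing userspace keeps compiling */
          #define GTPA_SGSN_ADDRESS GTPA_PEER_ADDRESS
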
      Signed-off-by: Jonas Bonn <jonas@southpole.se>
      Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Acked-by: Harald Welte <laforge@gnumonks.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  10. 25 Mar 2017, 7 commits