1. 08 6月, 2016 2 次提交
    • E
      net: get rid of spin_trylock() in net_tx_action() · 3bcb846c
      Eric Dumazet 提交于
      Note: Tom Herbert posted almost same patch 3 months back, but for
      different reasons.
      
      The reasons we want to get rid of this spin_trylock() are :
      
      1) Under high qdisc pressure, the spin_trylock() has almost no
      chance to succeed.
      
      2) We loop multiple times in softirq handler, eventually reaching
      the max retry count (10), and we schedule ksoftirqd.
      
      Since we want to adhere more strictly to ksoftirqd being waked up in
      the future (https://lwn.net/Articles/687617/), better avoid spurious
      wakeups.
      
      3) calls to __netif_reschedule() dirty the cache line containing
      q->next_sched, slowing down the owner of qdisc.
      
      4) RT kernels can not use the spin_trylock() here.
      
      With help of busylock, we get the qdisc spinlock fast enough, and
      the trylock trick brings only performance penalty.
      
      Depending on qdisc setup, I observed a gain of up to 19 % in qdisc
      performance (1016600 pps instead of 853400 pps, using prio+tbf+fq_codel)
      
      ("mpstat -I SCPU 1" is much happier now)
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Tom Herbert <tom@herbertland.com>
      Acked-by: NTom Herbert <tom@herbertland.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3bcb846c
    • J
      vhost_net: stop polling socket during rx processing · 8241a1e4
      Jason Wang 提交于
      We don't stop rx polling socket during rx processing, this will lead
      unnecessary wakeups from under layer net devices (E.g
      sock_def_readable() form tun). Rx will be slowed down in this
      way. This patch avoids this by stop polling socket during rx
      processing. A small drawback is that this introduces some overheads in
      light load case because of the extra start/stop polling, but single
      netperf TCP_RR does not notice any change. In a super heavy load case,
      e.g using pktgen to inject packet to guest, we get about ~8.8%
      improvement on pps:
      
      before: ~1240000 pkt/s
      after:  ~1350000 pkt/s
      Signed-off-by: NJason Wang <jasowang@redhat.com>
      Acked-by: NMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8241a1e4
  2. 07 6月, 2016 8 次提交
    • B
      net: ethernet: cavium: liquidio: request_manager: Remove create_workqueue · aaa76724
      Bhaktipriya Shridhar 提交于
      alloc_workqueue replaces deprecated create_workqueue().
      
      A dedicated workqueue has been used since the workitem viz
      (&db_wq->wk.work which maps to check_db_timeout) is involved
      in normal device operation. WQ_MEM_RECLAIM has been set to guarantee
      forward progress under memory pressure, which is a requirement here.
      Since there are only a fixed number of work items, explicit concurrency
      limit is unnecessary.
      
      flush_workqueue is unnecessary since destroy_workqueue() itself calls
      drain_workqueue() which flushes repeatedly till the workqueue
      becomes empty.
      Signed-off-by: NBhaktipriya Shridhar <bhaktipriya96@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      aaa76724
    • B
      net: ethernet: cavium: liquidio: response_manager: Remove create_workqueue · 523a61b4
      Bhaktipriya Shridhar 提交于
      alloc_workqueue replaces deprecated create_workqueue().
      
      A dedicated workqueue has been used since the workitem viz
      (&cwq->wk.work which maps to oct_poll_req_completion) is involved
      in normal device operation. WQ_MEM_RECLAIM has been set to guarantee
      forward progress under memory pressure, which is a requirement here.
      Since there are only a fixed number of work items, explicit concurrency
      limit is unnecessary.
      
      flush_workqueue is unnecessary since destroy_workqueue() itself calls
      drain_workqueue() which flushes repeatedly till the workqueue
      becomes empty. Hence the call to flush_workqueue() has been dropped.
      Signed-off-by: NBhaktipriya Shridhar <bhaktipriya96@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      523a61b4
    • A
      virtio-net: Add initial MTU advice feature · 14de9d11
      Aaron Conole 提交于
      This commit adds the feature bit and associated mtu device entry for the
      virtio network device.  When a virtio device comes up, it checks the
      feature bit for the VIRTIO_NET_F_MTU feature.  If such feature bit is
      enabled, the driver will read the advised MTU and use it as the initial
      value.
      Signed-off-by: NAaron Conole <aconole@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      14de9d11
    • D
      net: Revert vrf-local changes. · 3d9dc408
      David S. Miller 提交于
      This reverts commit 2fb7ea45.
      
      It results in build errors because ip6_input is not a
      symbol exported to modules.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3d9dc408
    • D
      Merge branch 'vrf-local' · 2fb7ea45
      David S. Miller 提交于
      David Ahern says:
      
      ====================
      net: vrf: Add support for local traffic to local addresses
      
      Add support for locally originated traffic to VRF-local addresses,
      be it addresses on enslaved devices or addresses on the VRF device:
      
      $ ip addr show dev red
      33: red: <NOARP,MASTER,UP,LOWER_UP> mtu 65536 qdisc pfifo_fast state UP group default qlen 1000
          link/ether be:00:53:b5:e4:25 brd ff:ff:ff:ff:ff:ff
          inet 1.1.1.1/32 scope global red
             valid_lft forever preferred_lft forever
          inet6 1111:1::1/128 scope global
             valid_lft forever preferred_lft forever
      
      $ ip addr show dev eth1
      3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master red state UP group default qlen 1000
          link/ether 02:e0:f9:79:34:bd brd ff:ff:ff:ff:ff:ff
          inet 10.100.1.1/24 brd 10.100.1.255 scope global eth1
             valid_lft forever preferred_lft forever
          inet6 2100:1::1/120 scope global
             valid_lft forever preferred_lft forever
          inet6 fe80::e0:f9ff:fe79:34bd/64 scope link
             valid_lft forever preferred_lft forever
      
      $ ping -c1 -I red 10.100.1.1
          ping: Warning: source address might be selected on device other than red.
          PING 10.100.1.1 (10.100.1.1) from 10.100.1.1 red: 56(84) bytes of data.
          64 bytes from 10.100.1.1: icmp_seq=1 ttl=64 time=0.057 ms
      
      $ ping -c1 -I red 1.1.1.1
      PING 1.1.1.1 (1.1.1.1) from 1.1.1.1 red: 56(84) bytes of data.
      64 bytes from 1.1.1.1: icmp_seq=1 ttl=64 time=0.136 ms
      
      --- 1.1.1.1 ping statistics ---
      1 packets transmitted, 1 received, 0% packet loss, time 0ms
      rtt min/avg/max/mdev = 0.136/0.136/0.136/0.000 ms
      
      $ ping6 -c1 -I red  2100:1::1
      ping6: Warning: source address might be selected on device other than red.
      PING 2100:1::1(2100:1::1) from 2100:1::1 red: 56 data bytes
      64 bytes from 2100:1::1: icmp_seq=1 ttl=64 time=0.167 ms
      
      --- 2100:1::1 ping statistics ---
      1 packets transmitted, 1 received, 0% packet loss, time 0ms
      rtt min/avg/max/mdev = 0.167/0.167/0.167/0.000 ms
      
      $ ping6 -c1 -I red 1111::1
      PING 1111::1(1111::1) from 1111:1::1 red: 56 data bytes
      64 bytes from 1111::1: icmp_seq=1 ttl=64 time=0.187 ms
      
      --- 1111::1 ping statistics ---
      1 packets transmitted, 1 received, 0% packet loss, time 0ms
      rtt min/avg/max/mdev = 0.187/0.187/0.187/0.000 ms
      
      This change also enables use of loopback address on the VRF device:
      $ ip addr add dev red 127.0.0.1/8
      
      $ ping -c1 -I red 127.0.0.1
      PING 127.0.0.1 (127.0.0.1) from 127.0.0.1 red: 56(84) bytes of data.
      64 bytes from 127.0.0.1: icmp_seq=1 ttl=64 time=0.058 ms
      ====================
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2fb7ea45
    • D
      net: vrf: ipv6 support for local traffic to local addresses · 625b47b5
      David Ahern 提交于
      Add support for locally originated traffic to VRF-local IPv6 addresses.
      Similar to IPv4 a local dst is set on the skb and the packet is
      reinserted with a call to netif_rx. With this patch, ping, tcp and udp
      packets to a local IPv6 address are successfully routed:
      
          $ ip addr show dev eth1
          4: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master red state UP group default qlen 1000
              link/ether 02:e0:f9:1c:b9:74 brd ff:ff:ff:ff:ff:ff
              inet 10.100.1.1/24 brd 10.100.1.255 scope global eth1
                 valid_lft forever preferred_lft forever
              inet6 2100:1::1/120 scope global
                 valid_lft forever preferred_lft forever
              inet6 fe80::e0:f9ff:fe1c:b974/64 scope link
                 valid_lft forever preferred_lft forever
      
          $ ping6 -c1 -I red 2100:1::1
          ping6: Warning: source address might be selected on device other than red.
          PING 2100:1::1(2100:1::1) from 2100:1::1 red: 56 data bytes
          64 bytes from 2100:1::1: icmp_seq=1 ttl=64 time=0.098 ms
      
      ip6_input is exported so the VRF driver can use it for the dst input
      function. The dst_alloc function for IPv4 defaults to setting the input and
      output functions; IPv6's does not. VRF does not need to duplicate the Rx path
      so just export the ipv6 input function.
      Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      625b47b5
    • D
      net: vrf: ipv4 support for local traffic to local addresses · 671cd19a
      David Ahern 提交于
      Add support for locally originated traffic to VRF-local addresses. If
      destination device for an skb is the loopback or VRF device then set
      its dst to a local version of the VRF cached dst_entry and call netif_rx
      to insert the packet onto the rx queue - similar to what is done for
      loopback. This patch handles IPv4 support; follow on patch handles IPv6.
      
      With this patch, ping, tcp and udp packets to a local IPv4 address are
      successfully routed:
      
          $ ip addr show dev eth1
          4: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master red state UP group default qlen 1000
              link/ether 02:e0:f9:1c:b9:74 brd ff:ff:ff:ff:ff:ff
              inet 10.100.1.1/24 brd 10.100.1.255 scope global eth1
                 valid_lft forever preferred_lft forever
              inet6 2100:1::1/120 scope global
                 valid_lft forever preferred_lft forever
              inet6 fe80::e0:f9ff:fe1c:b974/64 scope link
                 valid_lft forever preferred_lft forever
      
          $ ping -c1 -I red 10.100.1.1
          ping: Warning: source address might be selected on device other than red.
          PING 10.100.1.1 (10.100.1.1) from 10.100.1.1 red: 56(84) bytes of data.
          64 bytes from 10.100.1.1: icmp_seq=1 ttl=64 time=0.057 ms
      
      This patch also enables use of IPv4 loopback address on the VRF device:
          $ ip addr add dev red 127.0.0.1/8
      
          $ ping -c1 -I red 127.0.0.1
          PING 127.0.0.1 (127.0.0.1) from 127.0.0.1 red: 56(84) bytes of data.
          64 bytes from 127.0.0.1: icmp_seq=1 ttl=64 time=0.058 ms
      Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      671cd19a
    • D
      net: vrf: Minor refactoring for local address patches · 09fcf916
      David Ahern 提交于
      Move the stripping of the ethernet header from is_ip_tx_frame into the
      ipv4 and ipv6 outbound functions. If the packet is destined to a local
      address the header is retained since the packet is sent back to netif_rx.
      
      Collapse vrf_send_v4_prep into vrf_process_v4_outbound.
      Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      09fcf916
  3. 06 6月, 2016 7 次提交
  4. 05 6月, 2016 23 次提交