1. 16 10月, 2014 1 次提交
    • T
      net: Add ndo_gso_check · 04ffcb25
      Tom Herbert 提交于
      Add ndo_gso_check which a device can define to indicate whether is
      is capable of doing GSO on a packet. This funciton would be called from
      the stack to determine whether software GSO is needed to be done. A
      driver should populate this function if it advertises GSO types for
      which there are combinations that it wouldn't be able to handle. For
      instance a device that performs UDP tunneling might only implement
      support for transparent Ethernet bridging type of inner packets
      or might have limitations on lengths of inner headers.
      Signed-off-by: NTom Herbert <therbert@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      04ffcb25
  2. 27 9月, 2014 1 次提交
    • V
      macvtap: Fix race between device delete and open. · 40b8fe45
      Vlad Yasevich 提交于
      In macvtap device delete and open calls can race and
      this causes a list curruption of the vlan queue_list.
      
      The race intself is triggered by the idr accessors
      that located the vlan device.  The device is stored
      into and removed from the idr under both an rtnl and
      a mutex.  However, when attempting to locate the device
      in idr, only a mutex is taken.  As a result, once cpu
      perfoming a delete may take an rtnl and wait for the mutex,
      while another cput doing an open() will take the idr
      mutex first to fetch the device pointer and later take
      an rtnl to add a queue for the device which may have
      just gotten deleted.
      
      With this patch, we now hold the rtnl for the duration
      of the macvtap_open() call thus making sure that
      open will not race with delete.
      
      CC: Michael S. Tsirkin <mst@redhat.com>
      CC: Jason Wang <jasowang@redhat.com>
      Signed-off-by: NVladislav Yasevich <vyasevic@redhat.com>
      Acked-by: NJason Wang <jasowang@redhat.com>
      Acked-by: NMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      40b8fe45
  3. 01 5月, 2014 1 次提交
    • V
      mactap: Fix checksum errors for non-gso packets in bridge mode · cbdb0427
      Vlad Yasevich 提交于
      The following is a problematic configuration:
      
       VM1: virtio-net device connected to macvtap0@eth0
       VM2: e1000 device connect to macvtap1@eth0
      
      The problem is is that virtio-net supports checksum offloading
      and thus sends the packets to the host with CHECKSUM_PARTIAL set.
      On the other hand, e1000 does not support any acceleration.
      
      For small TCP packets (and this includes the 3-way handshake),
      e1000 ends up receiving packets that only have a partial checksum
      set.  This causes TCP to fail checksum validation and to drop
      packets.  As a result tcp connections can not be established.
      
      Commit 3e4f8b78
      	macvtap: Perform GSO on forwarding path.
      fixes this issue for large packets wthat will end up undergoing GSO.
      This commit adds a check for the non-GSO case and attempts to
      compute the checksum for partially checksummed packets in the
      non-GSO case.
      
      CC: Daniel Lezcano <daniel.lezcano@free.fr>
      CC: Patrick McHardy <kaber@trash.net>
      CC: Andrian Nord <nightnord@gmail.com>
      CC: Eric Dumazet <eric.dumazet@gmail.com>
      CC: Michael S. Tsirkin <mst@redhat.com>
      CC: Jason Wang <jasowang@redhat.com>
      Signed-off-by: NVlad Yasevich <vyasevic@redhat.com>
      Acked-by: NJason Wang <jasowang@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      cbdb0427
  4. 17 1月, 2014 1 次提交
  5. 18 12月, 2013 1 次提交
  6. 13 12月, 2013 2 次提交
    • V
      macvlan: Remove custom recieve and forward handlers · 2f6a1b66
      Vlad Yasevich 提交于
      Since now macvlan and macvtap use the same receive and
      forward handlers, we can remove them completely and use
      netif_rx and dev_forward_skb() directly.
      Signed-off-by: NVlad Yasevich <vyasevic@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2f6a1b66
    • V
      macvtap: Add support of packet capture on macvtap device. · 6acf54f1
      Vlad Yasevich 提交于
      Macvtap device currently doesn not allow a user to capture
      traffic on due to the fact that it steals the packets
      from the network stack before the skb->dev is set correctly
      on the receive side, and that use uses macvlan transmit
      path directly on the send side.  As a result, we never
      get a change to give traffic to the taps while the correct
      device is set in the skb.
      
      This patch makes macvtap device behave almost exaclty like
      macvlan.  On the send side, we switch to using dev_queue_xmit().
      On the receive side, to deliver packets to macvtap, we now
      use netif_rx and dev_forward_skb just like macvlan.  The only
      differnce now is that macvtap has its own rx_handler which is
      attached to the macvtap netdev.  It is here that we now steal
      the packet and provide it to the socket.
      
      As a result, we can now capture traffic on the macvtap device:
         tcpdump -i macvtap0
      
      It also gives us the abilit to add tc actions to the macvtap
      device and actually utilize different bandwidth management
      queues on output.
      Signed-off-by: NVlad Yasevich <vyasevic@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6acf54f1
  7. 12 12月, 2013 1 次提交
  8. 11 12月, 2013 3 次提交
  9. 10 12月, 2013 1 次提交
  10. 07 12月, 2013 3 次提交
  11. 30 11月, 2013 1 次提交
  12. 29 11月, 2013 1 次提交
  13. 15 11月, 2013 1 次提交
    • J
      macvtap: limit head length of skb allocated · 16a3fa28
      Jason Wang 提交于
      We currently use hdr_len as a hint of head length which is advertised by
      guest. But when guest advertise a very big value, it can lead to an 64K+
      allocating of kmalloc() which has a very high possibility of failure when host
      memory is fragmented or under heavy stress. The huge hdr_len also reduce the
      effect of zerocopy or even disable if a gso skb is linearized in guest.
      
      To solves those issues, this patch introduces an upper limit (PAGE_SIZE) of the
      head, which guarantees an order 0 allocation each time.
      
      Cc: Stefan Hajnoczi <stefanha@redhat.com>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: NJason Wang <jasowang@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      16a3fa28
  14. 21 8月, 2013 3 次提交
  15. 12 8月, 2013 1 次提交
    • E
      macvtap: fix two races · 29d79196
      Eric Dumazet 提交于
      Since commit ac4e4af1 ("macvtap: Consistently use rcu functions"),
      Thomas gets two different warnings :
      
      BUG: using smp_processor_id() in preemptible [00000000] code: vhost-45891/45892
      caller is macvtap_do_read+0x45c/0x600 [macvtap]
      CPU: 1 PID: 45892 Comm: vhost-45891 Not tainted 3.11.0-bisecttest #13
      Call Trace:
      ([<00000000001126ee>] show_trace+0x126/0x144)
       [<00000000001127d2>] show_stack+0xc6/0xd4
       [<000000000068bcec>] dump_stack+0x74/0xd8
       [<0000000000481066>] debug_smp_processor_id+0xf6/0x114
       [<000003ff802e9a18>] macvtap_do_read+0x45c/0x600 [macvtap]
       [<000003ff802e9c1c>] macvtap_recvmsg+0x60/0x88 [macvtap]
       [<000003ff80318c5e>] handle_rx+0x5b2/0x800 [vhost_net]
       [<000003ff8028f77c>] vhost_worker+0x15c/0x1c4 [vhost]
       [<000000000015f3ac>] kthread+0xd8/0xe4
       [<00000000006934a6>] kernel_thread_starter+0x6/0xc
       [<00000000006934a0>] kernel_thread_starter+0x0/0xc
      
      And
      
      BUG: using smp_processor_id() in preemptible [00000000] code: vhost-45897/45898
      caller is macvlan_start_xmit+0x10a/0x1b4 [macvlan]
      CPU: 1 PID: 45898 Comm: vhost-45897 Not tainted 3.11.0-bisecttest #16
      Call Trace:
      ([<00000000001126ee>] show_trace+0x126/0x144)
       [<00000000001127d2>] show_stack+0xc6/0xd4
       [<000000000068bdb8>] dump_stack+0x74/0xd4
       [<0000000000481132>] debug_smp_processor_id+0xf6/0x114
       [<000003ff802b72ca>] macvlan_start_xmit+0x10a/0x1b4 [macvlan]
       [<000003ff802ea69a>] macvtap_get_user+0x982/0xbc4 [macvtap]
       [<000003ff802ea92a>] macvtap_sendmsg+0x4e/0x60 [macvtap]
       [<000003ff8031947c>] handle_tx+0x494/0x5ec [vhost_net]
       [<000003ff8028f77c>] vhost_worker+0x15c/0x1c4 [vhost]
       [<000000000015f3ac>] kthread+0xd8/0xe4
       [<000000000069356e>] kernel_thread_starter+0x6/0xc
       [<0000000000693568>] kernel_thread_starter+0x0/0xc
      2 locks held by vhost-45897/45898:
       #0:  (&vq->mutex){+.+.+.}, at: [<000003ff8031903c>] handle_tx+0x54/0x5ec [vhost_net]
       #1:  (rcu_read_lock){.+.+..}, at: [<000003ff802ea53c>] macvtap_get_user+0x824/0xbc4 [macvtap]
      
      In the first case, macvtap_put_user() calls macvlan_count_rx()
      in a preempt-able context, and this is not allowed.
      
      In the second case, macvtap_get_user() calls
      macvlan_start_xmit() with BH enabled, and this is not allowed.
      Reported-by: NThomas Huth <thuth@linux.vnet.ibm.com>
      Bisected-by: NThomas Huth <thuth@linux.vnet.ibm.com>
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Tested-by: NThomas Huth <thuth@linux.vnet.ibm.com>
      Cc: Vlad Yasevich <vyasevic@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      29d79196
  16. 10 8月, 2013 1 次提交
    • E
      net: attempt high order allocations in sock_alloc_send_pskb() · 28d64271
      Eric Dumazet 提交于
      Adding paged frags skbs to af_unix sockets introduced a performance
      regression on large sends because of additional page allocations, even
      if each skb could carry at least 100% more payload than before.
      
      We can instruct sock_alloc_send_pskb() to attempt high order
      allocations.
      
      Most of the time, it does a single page allocation instead of 8.
      
      I added an additional parameter to sock_alloc_send_pskb() to
      let other users to opt-in for this new feature on followup patches.
      
      Tested:
      
      Before patch :
      
      $ netperf -t STREAM_STREAM
      STREAM STREAM TEST
      Recv   Send    Send
      Socket Socket  Message  Elapsed
      Size   Size    Size     Time     Throughput
      bytes  bytes   bytes    secs.    10^6bits/sec
      
       2304  212992  212992    10.00    46861.15
      
      After patch :
      
      $ netperf -t STREAM_STREAM
      STREAM STREAM TEST
      Recv   Send    Send
      Socket Socket  Message  Elapsed
      Size   Size    Size     Time     Throughput
      bytes  bytes   bytes    secs.    10^6bits/sec
      
       2304  212992  212992    10.00    57981.11
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: David Rientjes <rientjes@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      28d64271
  17. 08 8月, 2013 2 次提交
  18. 19 7月, 2013 1 次提交
    • J
      macvtap: do not zerocopy if iov needs more pages than MAX_SKB_FRAGS · ece793fc
      Jason Wang 提交于
      We try to linearize part of the skb when the number of iov is greater than
      MAX_SKB_FRAGS. This is not enough since each single vector may occupy more than
      one pages, so zerocopy_sg_fromiovec() may still fail and may break the guest
      network.
      
      Solve this problem by calculate the pages needed for iov before trying to do
      zerocopy and switch to use copy instead of zerocopy if it needs more than
      MAX_SKB_FRAGS.
      
      This is done through introducing a new helper to count the pages for iov, and
      call uarg->callback() manually when switching from zerocopy to copy to notify
      vhost.
      
      We can do further optimization on top.
      
      This bug were introduced from b92946e2
      (macvtap: zerocopy: validate vectors before building skb).
      
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: NJason Wang <jasowang@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ece793fc
  19. 17 7月, 2013 2 次提交
  20. 11 7月, 2013 1 次提交
    • J
      macvtap: correctly linearize skb when zerocopy is used · 61d46bf9
      Jason Wang 提交于
      Userspace may produce vectors greater than MAX_SKB_FRAGS. When we try to
      linearize parts of the skb to let the rest of iov to be fit in
      the frags, we need count copylen into linear when calling macvtap_alloc_skb()
      instead of partly counting it into data_len. Since this breaks
      zerocopy_sg_from_iovec() since its inner counter assumes nr_frags should
      be zero at beginning. This cause nr_frags to be increased wrongly without
      setting the correct frags.
      
      This bug were introduced from b92946e2
      (macvtap: zerocopy: validate vectors before building skb).
      
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: NJason Wang <jasowang@redhat.com>
      Acked-by: NMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      61d46bf9
  21. 26 6月, 2013 5 次提交
  22. 13 6月, 2013 2 次提交
    • J
      macvtap: fix uninitialized return value macvtap_ioctl_set_queue() · f57855a5
      Jason Wang 提交于
      Return -EINVAL on illegal flag instead of uninitialized value. This fixes the
      kbuild test warning.
      Signed-off-by: NJason Wang <jasowang@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f57855a5
    • J
      macvtap: slient sparse warnings · d9a90a31
      Jason Wang 提交于
      This patch silents the following sparse warnings:
      
      drivers/net/macvtap.c:98:9: warning: incorrect type in assignment (different
      address spaces)
      drivers/net/macvtap.c:98:9:    expected struct macvtap_queue *<noident>
      drivers/net/macvtap.c:98:9:    got struct macvtap_queue [noderef]
      <asn:4>*<noident>
      drivers/net/macvtap.c:120:9: warning: incorrect type in assignment (different
      address spaces)
      drivers/net/macvtap.c:120:9:    expected struct macvtap_queue *<noident>
      drivers/net/macvtap.c:120:9:    got struct macvtap_queue [noderef]
      <asn:4>*<noident>
      drivers/net/macvtap.c:151:22: error: incompatible types in comparison expression
      (different address spaces)
      drivers/net/macvtap.c:233:23: error: incompatible types in comparison expression
      (different address spaces)
      drivers/net/macvtap.c:243:23: error: incompatible types in comparison expression
      (different address spaces)
      drivers/net/macvtap.c:247:15: error: incompatible types in comparison expression
      (different address spaces)
        CC [M]  drivers/net/macvtap.o
      drivers/net/macvlan.c:232:24: error: incompatible types in comparison expression
      (different address spaces)
      Signed-off-by: NJason Wang <jasowang@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d9a90a31
  23. 08 6月, 2013 4 次提交