1. 04 Feb, 2015 1 commit
  2. 17 Dec, 2014 2 commits
  3. 10 Dec, 2014 1 commit
    • put iov_iter into msghdr · c0371da6
      Committed by Al Viro
      Note that the code _using_ ->msg_iter at that point will be very
      unhappy with anything other than unshifted iovec-backed iov_iter.
      We still need to convert users to proper primitives.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  4. 09 Dec, 2014 1 commit
  5. 06 Dec, 2014 1 commit
  6. 24 Nov, 2014 2 commits
  7. 22 Nov, 2014 1 commit
  8. 08 Nov, 2014 1 commit
  9. 04 Nov, 2014 1 commit
  10. 31 Oct, 2014 2 commits
    • drivers/net, ipv6: Select IPv6 fragment idents for virtio UFO packets · 5188cd44
      Committed by Ben Hutchings
      UFO is now disabled on all drivers that work with virtio net headers,
      but userland may try to send UFO/IPv6 packets anyway.  Instead of
      sending with ID=0, we should select identifiers on their behalf (as we
      used to).
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
      Fixes: 916e4cf4 ("ipv6: reuse ip6_frag_id from ip6_ufo_append_data")
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • drivers/net: Disable UFO through virtio · 3d0ad094
      Committed by Ben Hutchings
      IPv6 does not allow fragmentation by routers, so there is no
      fragmentation ID in the fixed header.  UFO for IPv6 requires the ID to
      be passed separately, but there is no provision for this in the virtio
      net protocol.
      
      Until recently our software implementation of UFO/IPv6 generated a new
      ID, but this was a bug.  Now we will use ID=0 for any UFO/IPv6 packet
      passed through a tap, which is even worse.
      
      Unfortunately there is no distinction between UFO/IPv4 and v6
      features, so disable UFO on taps and virtio_net completely until we
      have a proper solution.
      
      We cannot depend on VM managers respecting the tap feature flags, so
      keep accepting UFO packets but log a warning the first time we do
      this.
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
      Fixes: 916e4cf4 ("ipv6: reuse ip6_frag_id from ip6_ufo_append_data")
      Signed-off-by: David S. Miller <davem@davemloft.net>
  11. 16 Oct, 2014 1 commit
    • net: Add ndo_gso_check · 04ffcb25
      Committed by Tom Herbert
      Add ndo_gso_check, which a device can define to indicate whether it
      is capable of doing GSO on a packet. This function would be called from
      the stack to determine whether software GSO needs to be done. A
      driver should populate this function if it advertises GSO types for
      which there are combinations that it wouldn't be able to handle. For
      instance, a device that performs UDP tunneling might only implement
      support for transparent Ethernet bridging type of inner packets
      or might have limitations on lengths of inner headers.
      Signed-off-by: Tom Herbert <therbert@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
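The callback described in 04ffcb25 can be pictured with a small user-space sketch. Everything here is hypothetical and simplified: `sketch_packet`, `sketch_device`, `tunnel_gso_check`, `needs_software_gso`, and the 128-byte inner-header limit are illustrative stand-ins, not the kernel's types or the commit's actual code. It only shows the idea of an optional per-device hook that can veto hardware GSO for individual packets.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel's sk_buff and net_device. */
struct sketch_packet {
    bool is_gso;             /* packet still needs segmentation */
    bool inner_is_ethernet;  /* inner payload is a bridged Ethernet frame */
    size_t inner_hdr_len;    /* length of the inner headers in bytes */
};

struct sketch_device {
    bool advertises_gso;     /* device claims GSO support for this packet type */
    /* Optional hook in the spirit of ndo_gso_check; NULL means "no extra limits". */
    bool (*gso_check)(const struct sketch_packet *pkt);
};

/* Example driver hook: only bridged Ethernet with short inner headers
 * is offloaded. The 128-byte limit is an arbitrary assumption. */
static bool tunnel_gso_check(const struct sketch_packet *pkt)
{
    return pkt->inner_is_ethernet && pkt->inner_hdr_len <= 128;
}

/* Returns true when the stack would have to segment the packet in software. */
static bool needs_software_gso(const struct sketch_device *dev,
                               const struct sketch_packet *pkt)
{
    if (!pkt->is_gso)
        return false;            /* nothing to segment */
    if (!dev->advertises_gso)
        return true;             /* device cannot segment at all */
    /* Device advertises GSO; let its hook reject packets it cannot handle. */
    return dev->gso_check != NULL && !dev->gso_check(pkt);
}
```

A device without a hook keeps its advertised behaviour, which matches the commit's statement that only drivers with unsupported combinations need to populate the function.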
  12. 27 Sep, 2014 1 commit
    • macvtap: Fix race between device delete and open. · 40b8fe45
      Committed by Vlad Yasevich
      In macvtap, device delete and open calls can race, and
      this causes list corruption of the vlan queue_list.
      
      The race itself is triggered by the idr accessors
      that locate the vlan device.  The device is stored
      into and removed from the idr under both an rtnl and
      a mutex.  However, when attempting to locate the device
      in idr, only a mutex is taken.  As a result, one CPU
      performing a delete may take an rtnl and wait for the mutex,
      while another CPU doing an open() will take the idr
      mutex first to fetch the device pointer and later take
      an rtnl to add a queue for the device which may have
      just gotten deleted.
      
      With this patch, we now hold the rtnl for the duration
      of the macvtap_open() call thus making sure that
      open will not race with delete.
      
      CC: Michael S. Tsirkin <mst@redhat.com>
      CC: Jason Wang <jasowang@redhat.com>
      Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
      Acked-by: Jason Wang <jasowang@redhat.com>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  13. 01 May, 2014 1 commit
    • macvtap: Fix checksum errors for non-gso packets in bridge mode · cbdb0427
      Committed by Vlad Yasevich
      The following is a problematic configuration:
      
       VM1: virtio-net device connected to macvtap0@eth0
       VM2: e1000 device connected to macvtap1@eth0
      
      The problem is that virtio-net supports checksum offloading
      and thus sends the packets to the host with CHECKSUM_PARTIAL set.
      On the other hand, e1000 does not support any acceleration.
      
      For small TCP packets (and this includes the 3-way handshake),
      e1000 ends up receiving packets that only have a partial checksum
      set.  This causes TCP to fail checksum validation and to drop
      packets.  As a result, TCP connections cannot be established.
      
      Commit 3e4f8b78 ("macvtap: Perform GSO on forwarding path.")
      fixes this issue for large packets that will end up undergoing GSO.
      This commit adds a check for the non-GSO case and attempts to
      compute the checksum for partially checksummed packets.
      
      CC: Daniel Lezcano <daniel.lezcano@free.fr>
      CC: Patrick McHardy <kaber@trash.net>
      CC: Andrian Nord <nightnord@gmail.com>
      CC: Eric Dumazet <eric.dumazet@gmail.com>
      CC: Michael S. Tsirkin <mst@redhat.com>
      CC: Jason Wang <jasowang@redhat.com>
      Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
      Acked-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
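To make the non-GSO case in cbdb0427 concrete, here is a minimal user-space sketch. It is not the macvtap code: `must_checksum_in_software` is a hypothetical predicate restating the condition from the commit message, and `inet_checksum` is the standard RFC 1071 Internet checksum, shown only to illustrate what "computing the checksum" in software means for a packet that arrived with a partial checksum.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* RFC 1071 Internet checksum over a byte buffer, treating the data as
 * big-endian 16-bit words. The 32-bit accumulator is adequate for
 * packet-sized buffers (well under 128 KiB). */
static uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;

    while (len > 1) {
        sum += ((uint32_t)data[0] << 8) | data[1];
        data += 2;
        len -= 2;
    }
    if (len == 1)
        sum += (uint32_t)data[0] << 8;       /* pad odd trailing byte with zero */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);  /* fold carries back in */
    return (uint16_t)~sum;
}

/* Hypothetical predicate: a partially checksummed, non-GSO packet headed
 * for a receiver without checksum offload must be finished in software. */
static bool must_checksum_in_software(bool csum_partial, bool is_gso,
                                      bool dest_has_csum_offload)
{
    return csum_partial && !is_gso && !dest_has_csum_offload;
}
```

In the configuration from the commit message, a small TCP segment sent by the virtio-net guest towards the e1000 guest would satisfy the predicate, while large segments are already covered by the GSO path.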
  14. 17 Jan, 2014 1 commit
  15. 18 Dec, 2013 1 commit
  16. 13 Dec, 2013 2 commits
    • macvlan: Remove custom receive and forward handlers · 2f6a1b66
      Committed by Vlad Yasevich
      Now that macvlan and macvtap use the same receive and
      forward handlers, we can remove them completely and use
      netif_rx() and dev_forward_skb() directly.
      Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • macvtap: Add support of packet capture on macvtap device. · 6acf54f1
      Committed by Vlad Yasevich
      The macvtap device currently does not allow a user to capture
      traffic on it, because it steals the packets
      from the network stack before skb->dev is set correctly
      on the receive side, and because it uses the macvlan transmit
      path directly on the send side.  As a result, we never
      get a chance to give traffic to the taps while the correct
      device is set in the skb.
      
      This patch makes the macvtap device behave almost exactly like
      macvlan.  On the send side, we switch to using dev_queue_xmit().
      On the receive side, to deliver packets to macvtap, we now
      use netif_rx and dev_forward_skb just like macvlan.  The only
      difference now is that macvtap has its own rx_handler which is
      attached to the macvtap netdev.  It is here that we now steal
      the packet and provide it to the socket.
      
      As a result, we can now capture traffic on the macvtap device:
         tcpdump -i macvtap0
      
      It also gives us the ability to add tc actions to the macvtap
      device and actually utilize different bandwidth management
      queues on output.
      Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  17. 12 Dec, 2013 1 commit
  18. 11 Dec, 2013 3 commits
  19. 10 Dec, 2013 1 commit
  20. 07 Dec, 2013 3 commits
  21. 30 Nov, 2013 1 commit
  22. 29 Nov, 2013 1 commit
  23. 15 Nov, 2013 1 commit
    • macvtap: limit head length of skb allocated · 16a3fa28
      Committed by Jason Wang
      We currently use hdr_len, which is advertised by the guest, as a hint for
      the head length. But when the guest advertises a very big value, it can lead
      to a 64K+ kmalloc() allocation, which has a very high probability of failure
      when host memory is fragmented or under heavy stress. A huge hdr_len also
      reduces the effect of zerocopy, or even disables it if a gso skb is
      linearized in the guest.
      
      To solve these issues, this patch introduces an upper limit (PAGE_SIZE) on the
      head, which guarantees an order 0 allocation each time.
      
      Cc: Stefan Hajnoczi <stefanha@redhat.com>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
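The upper limit described in 16a3fa28 amounts to clamping a guest-supplied hint. The following is a rough sketch under stated assumptions: `limit_head_len` and `SKETCH_PAGE_SIZE` (4096 bytes) are hypothetical names and values, the extra cap at the packet length is an assumption added for the sketch, and skb bookkeeping overhead is ignored, so it is not the patch's actual arithmetic.

```c
#include <stddef.h>

/* Assumption: 4 KiB pages; the kernel's PAGE_SIZE depends on the architecture. */
#define SKETCH_PAGE_SIZE 4096u

/* Clamp the guest-advertised header length so the linear head of the
 * buffer never needs more than one page. */
static size_t limit_head_len(size_t hdr_len_hint, size_t packet_len)
{
    size_t head = hdr_len_hint;

    if (head > packet_len)
        head = packet_len;          /* cannot linearize more than we have */
    if (head > SKETCH_PAGE_SIZE)
        head = SKETCH_PAGE_SIZE;    /* keep the head within a single page */
    return head;
}
```

With this clamp, a guest advertising a hint close to 64K still results in a head of at most one page, and the rest of the packet can stay in page fragments.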
  24. 21 Aug, 2013 3 commits
  25. 12 Aug, 2013 1 commit
    • macvtap: fix two races · 29d79196
      Committed by Eric Dumazet
      Since commit ac4e4af1 ("macvtap: Consistently use rcu functions"),
      Thomas gets two different warnings :
      
      BUG: using smp_processor_id() in preemptible [00000000] code: vhost-45891/45892
      caller is macvtap_do_read+0x45c/0x600 [macvtap]
      CPU: 1 PID: 45892 Comm: vhost-45891 Not tainted 3.11.0-bisecttest #13
      Call Trace:
      ([<00000000001126ee>] show_trace+0x126/0x144)
       [<00000000001127d2>] show_stack+0xc6/0xd4
       [<000000000068bcec>] dump_stack+0x74/0xd8
       [<0000000000481066>] debug_smp_processor_id+0xf6/0x114
       [<000003ff802e9a18>] macvtap_do_read+0x45c/0x600 [macvtap]
       [<000003ff802e9c1c>] macvtap_recvmsg+0x60/0x88 [macvtap]
       [<000003ff80318c5e>] handle_rx+0x5b2/0x800 [vhost_net]
       [<000003ff8028f77c>] vhost_worker+0x15c/0x1c4 [vhost]
       [<000000000015f3ac>] kthread+0xd8/0xe4
       [<00000000006934a6>] kernel_thread_starter+0x6/0xc
       [<00000000006934a0>] kernel_thread_starter+0x0/0xc
      
      And
      
      BUG: using smp_processor_id() in preemptible [00000000] code: vhost-45897/45898
      caller is macvlan_start_xmit+0x10a/0x1b4 [macvlan]
      CPU: 1 PID: 45898 Comm: vhost-45897 Not tainted 3.11.0-bisecttest #16
      Call Trace:
      ([<00000000001126ee>] show_trace+0x126/0x144)
       [<00000000001127d2>] show_stack+0xc6/0xd4
       [<000000000068bdb8>] dump_stack+0x74/0xd4
       [<0000000000481132>] debug_smp_processor_id+0xf6/0x114
       [<000003ff802b72ca>] macvlan_start_xmit+0x10a/0x1b4 [macvlan]
       [<000003ff802ea69a>] macvtap_get_user+0x982/0xbc4 [macvtap]
       [<000003ff802ea92a>] macvtap_sendmsg+0x4e/0x60 [macvtap]
       [<000003ff8031947c>] handle_tx+0x494/0x5ec [vhost_net]
       [<000003ff8028f77c>] vhost_worker+0x15c/0x1c4 [vhost]
       [<000000000015f3ac>] kthread+0xd8/0xe4
       [<000000000069356e>] kernel_thread_starter+0x6/0xc
       [<0000000000693568>] kernel_thread_starter+0x0/0xc
      2 locks held by vhost-45897/45898:
       #0:  (&vq->mutex){+.+.+.}, at: [<000003ff8031903c>] handle_tx+0x54/0x5ec [vhost_net]
       #1:  (rcu_read_lock){.+.+..}, at: [<000003ff802ea53c>] macvtap_get_user+0x824/0xbc4 [macvtap]
      
      In the first case, macvtap_put_user() calls macvlan_count_rx()
      in a preemptible context, and this is not allowed.
      
      In the second case, macvtap_get_user() calls
      macvlan_start_xmit() with BH enabled, and this is not allowed.
      Reported-by: Thomas Huth <thuth@linux.vnet.ibm.com>
      Bisected-by: Thomas Huth <thuth@linux.vnet.ibm.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Tested-by: Thomas Huth <thuth@linux.vnet.ibm.com>
      Cc: Vlad Yasevich <vyasevic@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  26. 10 Aug, 2013 1 commit
    • net: attempt high order allocations in sock_alloc_send_pskb() · 28d64271
      Committed by Eric Dumazet
      Adding paged frags skbs to af_unix sockets introduced a performance
      regression on large sends because of additional page allocations, even
      if each skb could carry at least 100% more payload than before.
      
      We can instruct sock_alloc_send_pskb() to attempt high order
      allocations.
      
      Most of the time, it does a single page allocation instead of 8.
      
      I added an additional parameter to sock_alloc_send_pskb() to
      let other users opt in to this new feature in follow-up patches.
      
      Tested:
      
      Before patch :
      
      $ netperf -t STREAM_STREAM
      STREAM STREAM TEST
      Recv   Send    Send
      Socket Socket  Message  Elapsed
      Size   Size    Size     Time     Throughput
      bytes  bytes   bytes    secs.    10^6bits/sec
      
       2304  212992  212992    10.00    46861.15
      
      After patch :
      
      $ netperf -t STREAM_STREAM
      STREAM STREAM TEST
      Recv   Send    Send
      Socket Socket  Message  Elapsed
      Size   Size    Size     Time     Throughput
      bytes  bytes   bytes    secs.    10^6bits/sec
      
       2304  212992  212992    10.00    57981.11
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: David Rientjes <rientjes@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  27. 08 Aug, 2013 2 commits
  28. 19 Jul, 2013 1 commit
    • macvtap: do not zerocopy if iov needs more pages than MAX_SKB_FRAGS · ece793fc
      Committed by Jason Wang
      We try to linearize part of the skb when the number of iovecs is greater than
      MAX_SKB_FRAGS. This is not enough, since a single vector may occupy more than
      one page, so zerocopy_sg_fromiovec() may still fail and may break the guest
      network.
      
      Solve this problem by calculating the pages needed for the iov before trying to do
      zerocopy, and switch to copy instead of zerocopy if it needs more than
      MAX_SKB_FRAGS.
      
      This is done by introducing a new helper to count the pages for the iov, and
      by calling uarg->callback() manually when switching from zerocopy to copy to notify
      vhost.
      
      We can do further optimization on top.
      
      This bug was introduced by b92946e2
      (macvtap: zerocopy: validate vectors before building skb).
      
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
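The page-counting idea in ece793fc can be sketched in user space as follows. This is an illustration, not the helper the patch adds: `count_iov_pages`, `can_zerocopy`, `SKETCH_PAGE_SIZE` (4096) and `SKETCH_MAX_FRAGS` (17, standing in for MAX_SKB_FRAGS) are all assumptions chosen for the example. The point is that a single vector can straddle several pages, so counting vectors alone underestimates the number of fragments needed.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

#define SKETCH_PAGE_SIZE 4096UL  /* assumption: 4 KiB pages */
#define SKETCH_MAX_FRAGS 17UL    /* assumption: stand-in for MAX_SKB_FRAGS */

/* Count how many pages the buffers described by iov touch in total.
 * Each vector is counted separately, since each page it touches would
 * need its own fragment slot. */
static unsigned long count_iov_pages(const struct iovec *iov, size_t nr_segs)
{
    unsigned long pages = 0;

    for (size_t i = 0; i < nr_segs; i++) {
        uintptr_t base = (uintptr_t)iov[i].iov_base;
        size_t len = iov[i].iov_len;

        if (len == 0)
            continue;
        /* Index of the page holding the first and the last byte. */
        uintptr_t first = base / SKETCH_PAGE_SIZE;
        uintptr_t last = (base + len - 1) / SKETCH_PAGE_SIZE;
        pages += (unsigned long)(last - first + 1);
    }
    return pages;
}

/* Zerocopy is only attempted when every page fits in a fragment slot;
 * otherwise the caller would fall back to copying. */
static int can_zerocopy(const struct iovec *iov, size_t nr_segs)
{
    return count_iov_pages(iov, nr_segs) <= SKETCH_MAX_FRAGS;
}
```

For example, a 32-byte vector that starts 16 bytes before a page boundary counts as two pages even though it is a single vector.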
  29. 17 Jul, 2013 1 commit