1. 01 5月, 2014 1 次提交
    • Z
      virtio-net: Set needed_headroom for virtio-net when VIRTIO_F_ANY_LAYOUT is true · 6ebbc1a6
      Zhangjie \(HZ\) 提交于
      This is a small supplement for commit e7428e95
      ("virtio-net: put virtio-net header inline with data"). TCP packages have
      enough room to put virtio-net header in, but UDP packages do not. By
      setting dev->needed_headroom for virtio-net device, UDP packages could have
      enough room.
      
      For UDP packages, sk_buff is alloced in fun __ip_append_data. The size is
      "alloclen + hh_len + 15", and "hh_len = LL_RESERVED_SPACE(rt-dst.dev);".
      The Macro is defined as follows:
      #define LL_RESERVED_SPACE(dev) \
           ((((dev)->hard_header_len+(dev)->needed_headroom)\
           &~(HH_DATA_MOD - 1)) + HH_DATA_MOD)
      By default, for UDP packages, after skb is allocated, only 16 bytes
      reserved. And 2 bytes remained after mac header is set. That is not enough
      to put virtio-net header in. If we set dev->needed_headroom to 12 or 10
      (according to mergeable_rx_bufs is on or off ), more room can be reserved.
      Then there is enough room for UDP packages to put the header in.
      
      test result list as below:
      guest and host: suse11sp3, netperf, intel 2.4GHz
      +-------+---------+---------+---------+---------+
      |       |   old             |   new             |
      +-------+---------+---------+---------+---------+
      | UDP   |  Gbit/s | pps     |  Gbit/s | pps     |
      | 64    |  0.57   | 692232  |  0.61   | 742420  |
      | 256   |  1.60   | 686860  |  1.71   | 733331  |
      | 512   |  2.92   | 674576  |  3.07   | 710446  |
      | 1024  |  4.99   | 598977  |  5.17   | 620821  |
      | 1460  |  5.68   | 483757  |  7.16   | 610519  |
      | 4096  |  6.98   | 637468  |  7.21   | 658471  |
      +-------+---------+---------+---------+---------+
      Signed-off-by: NZhang Jie <zhangjie14@huawei.com>
      Acked-by: NRusty Russell <rusty@rustcorp.com.au>
      Acked-by: NJason Wang <jasowang@redhat.com>
      Acked-by: NMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6ebbc1a6
  2. 23 4月, 2014 1 次提交
  3. 28 3月, 2014 1 次提交
    • J
      virtio-net: correct error handling of virtqueue_kick() · 681daee2
      Jason Wang 提交于
      Current error handling of virtqueue_kick() was wrong in two places:
      - The skb were freed immediately when virtqueue_kick() fail during
        xmit. This may lead double free since the skb was not detached from
        the virtqueue.
      - try_fill_recv() returns false when virtqueue_kick() fail. This will
        lead unnecessary rescheduling of refill work.
      
      Actually, it's safe to just ignore the kick failure in those two
      places. So this patch fixes this by partially revert commit
      67975901.
      
      Fixes 67975901
      (virtio_net: verify if virtqueue_kick() succeeded).
      
      Cc: Heinz Graalfs <graalfs@linux.vnet.ibm.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: NJason Wang <jasowang@redhat.com>
      Acked-by: NMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      681daee2
  4. 25 3月, 2014 1 次提交
  5. 15 3月, 2014 1 次提交
  6. 13 3月, 2014 1 次提交
  7. 25 2月, 2014 1 次提交
  8. 17 1月, 2014 4 次提交
    • M
      virtio-net: initial rx sysfs support, export mergeable rx buffer size · fbf28d78
      Michael Dalton 提交于
      Add initial support for per-rx queue sysfs attributes to virtio-net. If
      mergeable packet buffers are enabled, adds a read-only mergeable packet
      buffer size sysfs attribute for each RX queue.
      Suggested-by: NMichael S. Tsirkin <mst@redhat.com>
      Acked-by: NMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: NMichael Dalton <mwdalton@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fbf28d78
    • M
      virtio-net: auto-tune mergeable rx buffer size for improved performance · ab7db917
      Michael Dalton 提交于
      Commit 2613af0e ("virtio_net: migrate mergeable rx buffers to page frag
      allocators") changed the mergeable receive buffer size from PAGE_SIZE to
      MTU-size, introducing a single-stream regression for benchmarks with large
      average packet size. There is no single optimal buffer size for all
      workloads.  For workloads with packet size <= MTU bytes, MTU + virtio-net
      header-sized buffers are preferred as larger buffers reduce the TCP window
      due to SKB truesize. However, single-stream workloads with large average
      packet sizes have higher throughput if larger (e.g., PAGE_SIZE) buffers
      are used.
      
      This commit auto-tunes the mergeable receiver buffer packet size by
      choosing the packet buffer size based on an EWMA of the recent packet
      sizes for the receive queue. Packet buffer sizes range from MTU_SIZE +
      virtio-net header len to PAGE_SIZE. This improves throughput for
      large packet workloads, as any workload with average packet size >=
      PAGE_SIZE will use PAGE_SIZE buffers.
      
      These optimizations interact positively with recent commit
      ba275241 ("virtio-net: coalesce rx frags when possible during rx"),
      which coalesces adjacent RX SKB fragments in virtio_net. The coalescing
      optimizations benefit buffers of any size.
      
      Benchmarks taken from an average of 5 netperf 30-second TCP_STREAM runs
      between two QEMU VMs on a single physical machine. Each VM has two VCPUs
      with all offloads & vhost enabled. All VMs and vhost threads run in a
      single 4 CPU cgroup cpuset, using cgroups to ensure that other processes
      in the system will not be scheduled on the benchmark CPUs. Trunk includes
      SKB rx frag coalescing.
      
      net-next w/ virtio_net before 2613af0e (PAGE_SIZE bufs): 14642.85Gb/s
      net-next (MTU-size bufs):  13170.01Gb/s
      net-next + auto-tune: 14555.94Gb/s
      
      Jason Wang also reported a throughput increase on mlx4 from 22Gb/s
      using MTU-sized buffers to about 26Gb/s using auto-tuning.
      Signed-off-by: NMichael Dalton <mwdalton@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ab7db917
    • M
      virtio-net: use per-receive queue page frag alloc for mergeable bufs · fb51879d
      Michael Dalton 提交于
      The virtio-net driver currently uses netdev_alloc_frag() for GFP_ATOMIC
      mergeable rx buffer allocations. This commit migrates virtio-net to use
      per-receive queue page frags for GFP_ATOMIC allocation. This change unifies
      mergeable rx buffer memory allocation, which now will use skb_refill_frag()
      for both atomic and GFP-WAIT buffer allocations.
      
      To address fragmentation concerns, if after buffer allocation there
      is too little space left in the page frag to allocate a subsequent
      buffer, the remaining space is added to the current allocated buffer
      so that the remaining space can be used to store packet data.
      Acked-by: NMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: NMichael Dalton <mwdalton@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fb51879d
    • J
      virtio-net: drop rq->max and rq->num · be121f46
      Jason Wang 提交于
      It looks like there's no need for those two fields:
      
      - Unless there's a failure for the first refill try, rq->max should be always
        equal to the vring size.
      - rq->num is only used to determine the condition that we need to do the refill,
        we could check vq->num_free instead.
      - rq->num was required to be increased or decreased explicitly after each
        get/put which results a bad API.
      
      So this patch removes them both to make the code simpler.
      
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: NJason Wang <jasowang@redhat.com>
      Acked-by: NRusty Russell <rusty@rustcorp.com.au>
      Acked-by: NMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      be121f46
  9. 03 1月, 2014 1 次提交
    • J
      virtio-net: fix refill races during restore · 6cd4ce00
      Jason Wang 提交于
      During restoring, try_fill_recv() was called with neither napi lock nor napi
      disabled. This can lead two try_fill_recv() was called in the same time. Fix
      this by refilling before trying to enable napi.
      
      Fixes 0741bcb5
      (virtio: net: Add freeze, restore handlers to support S4).
      
      Cc: Amit Shah <amit.shah@redhat.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NJason Wang <jasowang@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6cd4ce00
  10. 11 12月, 2013 2 次提交
  11. 07 12月, 2013 4 次提交
    • M
      virtio-net: free bufs correctly on invalid packet length · 98bfd23c
      Michael Dalton 提交于
      When a packet with invalid length arrives, ensure that the packet
      is freed correctly if mergeable packet buffers and big packets
      (GUEST_TSO4) are both enabled.
      Signed-off-by: NMichael Dalton <mwdalton@google.com>
      Acked-by: NJason Wang <jasowang@redhat.com>
      Acked-by: NAndrew Vagin <avagin@openvz.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      98bfd23c
    • A
      virtio: delete napi structures from netdev before releasing memory · d4fb84ee
      Andrey Vagin 提交于
      free_netdev calls netif_napi_del too, but it's too late, because napi
      structures are placed on vi->rq. netif_napi_add() is called from
      virtnet_alloc_queues.
      
      general protection fault: 0000 [#1] SMP
      Dumping ftrace buffer:
         (ftrace buffer empty)
      Modules linked in: ip6table_filter ip6_tables iptable_filter ip_tables virtio_balloon pcspkr virtio_net(-) i2c_pii
      CPU: 1 PID: 347 Comm: rmmod Not tainted 3.13.0-rc2+ #171
      Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
      task: ffff8800b779c420 ti: ffff8800379e0000 task.ti: ffff8800379e0000
      RIP: 0010:[<ffffffff81322e19>]  [<ffffffff81322e19>] __list_del_entry+0x29/0xd0
      RSP: 0018:ffff8800379e1dd0  EFLAGS: 00010a83
      RAX: 6b6b6b6b6b6b6b6b RBX: ffff8800379c2fd0 RCX: dead000000200200
      RDX: 6b6b6b6b6b6b6b6b RSI: 0000000000000001 RDI: ffff8800379c2fd0
      RBP: ffff8800379e1dd0 R08: 0000000000000001 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000001 R12: ffff8800379c2f90
      R13: ffff880037839160 R14: 0000000000000000 R15: 00000000013352f0
      FS:  00007f1400e34740(0000) GS:ffff8800bfb00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      CR2: 00007f464124c763 CR3: 00000000b68cf000 CR4: 00000000000006e0
      Stack:
       ffff8800379e1df0 ffffffff8155beab 6b6b6b6b6b6b6b2b ffff8800378391c0
       ffff8800379e1e18 ffffffff8156499b ffff880037839be0 ffff880037839d20
       ffff88003779d3f0 ffff8800379e1e38 ffffffffa003477c ffff88003779d388
      Call Trace:
       [<ffffffff8155beab>] netif_napi_del+0x1b/0x80
       [<ffffffff8156499b>] free_netdev+0x8b/0x110
       [<ffffffffa003477c>] virtnet_remove+0x7c/0x90 [virtio_net]
       [<ffffffff813ae323>] virtio_dev_remove+0x23/0x80
       [<ffffffff813f62ef>] __device_release_driver+0x7f/0xf0
       [<ffffffff813f6ca0>] driver_detach+0xc0/0xd0
       [<ffffffff813f5f28>] bus_remove_driver+0x58/0xd0
       [<ffffffff813f72ec>] driver_unregister+0x2c/0x50
       [<ffffffff813ae65e>] unregister_virtio_driver+0xe/0x10
       [<ffffffffa0036942>] virtio_net_driver_exit+0x10/0x6ce [virtio_net]
       [<ffffffff810d7cf2>] SyS_delete_module+0x172/0x220
       [<ffffffff810a732d>] ? trace_hardirqs_on+0xd/0x10
       [<ffffffff810f5d4c>] ? __audit_syscall_entry+0x9c/0xf0
       [<ffffffff81677f69>] system_call_fastpath+0x16/0x1b
      Code: 00 00 55 48 8b 17 48 b9 00 01 10 00 00 00 ad de 48 8b 47 08 48 89 e5 48 39 ca 74 29 48 b9 00 02 20 00 00 00
      RIP  [<ffffffff81322e19>] __list_del_entry+0x29/0xd0
       RSP <ffff8800379e1dd0>
      ---[ end trace d5931cd3f87c9763 ]---
      
      Fixes: 986a4f4d (virtio_net: multiqueue support)
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: "Michael S. Tsirkin" <mst@redhat.com>
      Signed-off-by: NAndrey Vagin <avagin@openvz.org>
      Acked-by: NMichael S. Tsirkin <mst@redhat.com>
      Acked-by: NJason Wang <jasowang@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d4fb84ee
    • A
      virtio-net: determine type of bufs correctly · fa9fac17
      Andrey Vagin 提交于
      free_unused_bufs must check vi->mergeable_rx_bufs before
      vi->big_packets, because we use this sequence in other places.
      Otherwise we allocate buffer of one type, then free it as another
      type.
      
      general protection fault: 0000 [#1] SMP
      Dumping ftrace buffer:
         (ftrace buffer empty)
      Modules linked in: ip6table_filter ip6_tables iptable_filter ip_tables pcspkr virtio_balloon virtio_net(-) i2c_pii
      CPU: 0 PID: 400 Comm: rmmod Not tainted 3.13.0-rc2+ #170
      Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
      task: ffff8800b6d2a210 ti: ffff8800aed32000 task.ti: ffff8800aed32000
      RIP: 0010:[<ffffffffa00345f3>]  [<ffffffffa00345f3>] free_unused_bufs+0xc3/0x190 [virtio_net]
      RSP: 0018:ffff8800aed33dd8  EFLAGS: 00010202
      RAX: ffff8800b1fe2c00 RBX: ffff8800b66a7240 RCX: 6b6b6b6b6b6b6b6b
      RDX: 6b6b6b6b6b6b6b6b RSI: ffff8800b8419a68 RDI: ffff8800b66a1148
      RBP: ffff8800aed33e00 R08: 0000000000000001 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
      R13: ffff8800b66a1148 R14: 0000000000000000 R15: 000077ff80000000
      FS:  00007fc4f9c4e740(0000) GS:ffff8800bfa00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      CR2: 00007f63f432f000 CR3: 00000000b6538000 CR4: 00000000000006f0
      Stack:
       ffff8800b66a7240 ffff8800b66a7380 ffff8800377bd3f0 0000000000000000
       00000000023302f0 ffff8800aed33e18 ffffffffa00346e2 ffff8800b66a7240
       ffff8800aed33e38 ffffffffa003474d ffff8800377bd388 ffff8800377bd390
      Call Trace:
       [<ffffffffa00346e2>] remove_vq_common+0x22/0x40 [virtio_net]
       [<ffffffffa003474d>] virtnet_remove+0x4d/0x90 [virtio_net]
       [<ffffffff813ae303>] virtio_dev_remove+0x23/0x80
       [<ffffffff813f62cf>] __device_release_driver+0x7f/0xf0
       [<ffffffff813f6c80>] driver_detach+0xc0/0xd0
       [<ffffffff813f5f08>] bus_remove_driver+0x58/0xd0
       [<ffffffff813f72cc>] driver_unregister+0x2c/0x50
       [<ffffffff813ae63e>] unregister_virtio_driver+0xe/0x10
       [<ffffffffa0036852>] virtio_net_driver_exit+0x10/0x7be [virtio_net]
       [<ffffffff810d7cf2>] SyS_delete_module+0x172/0x220
       [<ffffffff810a732d>] ? trace_hardirqs_on+0xd/0x10
       [<ffffffff810f5d4c>] ? __audit_syscall_entry+0x9c/0xf0
       [<ffffffff81677f69>] system_call_fastpath+0x16/0x1b
      Code: c0 74 55 0f 1f 44 00 00 80 7b 30 00 74 7a 48 8b 50 30 4c 89 e6 48 03 73 20 48 85 d2 0f 84 bb 00 00 00 66 0f
      RIP  [<ffffffffa00345f3>] free_unused_bufs+0xc3/0x190 [virtio_net]
       RSP <ffff8800aed33dd8>
      ---[ end trace edb570ea923cce9c ]---
      
      Fixes: 2613af0e (virtio_net: migrate mergeable rx buffers to page frag allocators)
      Cc: Michael Dalton <mwdalton@google.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: "Michael S. Tsirkin" <mst@redhat.com>
      Signed-off-by: NAndrey Vagin <avagin@openvz.org>
      Acked-by: NMichael S. Tsirkin <mst@redhat.com>
      Acked-by: NJason Wang <jasowang@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fa9fac17
    • J
      drivers/net/*: Fix FSF address in file headers · adf8d3ff
      Jeff Kirsher 提交于
      Several files refer to an old address for the Free Software Foundation
      in the file header comment.  Resolve by replacing the address with
      the URL <http://www.gnu.org/licenses/> so that we do not have to keep
      updating the header comments anytime the address changes.
      
      CC: Jay Vosburgh <fubar@us.ibm.com>
      CC: Veaceslav Falico <vfalico@redhat.com>
      CC: Andy Gospodarek <andy@greyhouse.net>
      CC: Haiyang Zhang <haiyangz@microsoft.com>
      CC: "K. Y. Srinivasan" <kys@microsoft.com>
      CC: Paul Mackerras <paulus@samba.org>
      CC: Ian Campbell <ian.campbell@citrix.com>
      CC: Wei Liu <wei.liu2@citrix.com>
      CC: Rusty Russell <rusty@rustcorp.com.au>
      CC: "Michael S. Tsirkin" <mst@redhat.com>
      Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
      Acked-by: NWei Liu <wei.liu2@citrix.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      adf8d3ff
  12. 02 12月, 2013 2 次提交
  13. 01 12月, 2013 1 次提交
  14. 19 11月, 2013 1 次提交
  15. 15 11月, 2013 1 次提交
    • M
      virtio-net: mergeable buffer size should include virtio-net header · 5061de36
      Michael Dalton 提交于
      Commit 2613af0e ("virtio_net: migrate mergeable rx buffers to page
      frag allocators") changed the mergeable receive buffer size from PAGE_SIZE
      to MTU-size. However, the merge buffer size does not take into account the
      size of the virtio-net header. Consequently, packets that are MTU-size
      will take two buffers intead of one (to store the virtio-net header),
      substantially decreasing the throughput of MTU-size traffic due to TCP
      window / SKB truesize effects.
      
      This commit changes the mergeable buffer size to include the virtio-net
      header. The buffer size is cacheline-aligned because skb_page_frag_refill
      will not automatically align the requested size.
      
      Benchmarks taken from an average of 5 netperf 30-second TCP_STREAM runs
      between two QEMU VMs on a single physical machine. Each VM has two VCPUs and
      vhost enabled. All VMs and vhost threads run in a single 4 CPU cgroup
      cpuset, using cgroups to ensure that other processes in the system will not
      be scheduled on the benchmark CPUs. Transmit offloads and mergeable receive
      buffers are enabled, but guest_tso4 / guest_csum are explicitly disabled to
      force MTU-sized packets on the receiver.
      
      next-net trunk before 2613af0e (PAGE_SIZE buf): 3861.08Gb/s
      net-next trunk (MTU 1500- packet uses two buf due to size bug): 4076.62Gb/s
      net-next trunk (MTU 1480- packet fits in one buf): 6301.34Gb/s
      net-next trunk w/ size fix (MTU 1500 - packet fits in one buf): 6445.44Gb/s
      Suggested-by: NEric Northup <digitaleric@google.com>
      Signed-off-by: NMichael Dalton <mwdalton@google.com>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5061de36
  16. 06 11月, 2013 2 次提交
    • J
      net: Explicitly initialize u64_stats_sync structures for lockdep · 827da44c
      John Stultz 提交于
      In order to enable lockdep on seqcount/seqlock structures, we
      must explicitly initialize any locks.
      
      The u64_stats_sync structure, uses a seqcount, and thus we need
      to introduce a u64_stats_init() function and use it to initialize
      the structure.
      
      This unfortunately adds a lot of fairly trivial initialization code
      to a number of drivers. But the benefit of ensuring correctness makes
      this worth while.
      
      Because these changes are required for lockdep to be enabled, and the
      changes are quite trivial, I've not yet split this patch out into 30-some
      separate patches, as I figured it would be better to get the various
      maintainers thoughts on how to best merge this change along with
      the seqcount lockdep enablement.
      
      Feedback would be appreciated!
      Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
      Acked-by: NJulian Anastasov <ja@ssi.bg>
      Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
      Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
      Cc: James Morris <jmorris@namei.org>
      Cc: Jesse Gross <jesse@nicira.com>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: "Michael S. Tsirkin" <mst@redhat.com>
      Cc: Mirko Lindner <mlindner@marvell.com>
      Cc: Patrick McHardy <kaber@trash.net>
      Cc: Roger Luethi <rl@hellgate.ch>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Simon Horman <horms@verge.net.au>
      Cc: Stephen Hemminger <stephen@networkplumber.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
      Cc: Wensong Zhang <wensong@linux-vs.org>
      Cc: netdev@vger.kernel.org
      Link: http://lkml.kernel.org/r/1381186321-4906-2-git-send-email-john.stultz@linaro.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      827da44c
    • J
      virtio-net: switch to use XPS to choose txq · 9bb8ca86
      Jason Wang 提交于
      We used to use a percpu structure vq_index to record the cpu to queue
      mapping, this is suboptimal since it duplicates the work of XPS and
      loses all other XPS functionality such as allowing user to configure
      their own transmission steering strategy.
      
      So this patch switches to use XPS and suggest a default mapping when
      the number of cpus is equal to the number of queues. With XPS support,
      there's no need for keeping per-cpu vq_index and .ndo_select_queue(),
      so they were removed also.
      
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Acked-by: NRusty Russell <rusty@rustcorp.com.au>
      Acked-by: NMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: NJason Wang <jasowang@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9bb8ca86
  17. 05 11月, 2013 1 次提交
    • J
      virtio-net: coalesce rx frags when possible during rx · ba275241
      Jason Wang 提交于
      Commit 2613af0e (virtio_net: migrate mergeable
      rx buffers to page frag allocators) try to increase the payload/truesize for
      MTU-sized traffic. But this will introduce the extra overhead for GSO packets
      received because of the frag list. This commit tries to reduce this issue by
      coalesce the possible rx frags when possible during rx. Test result shows the
      about 15% improvement on full size GSO packet receiving (and even better than
      before commit 2613af0e).
      
      Before this commit:
      ./netperf -H 192.168.100.4
      MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.100.4
      () port 0 AF_INET : demo
      Recv   Send    Send
      Socket Socket  Message  Elapsed
      Size   Size    Size     Time     Throughput
      bytes  bytes   bytes    secs.    10^6bits/sec
      
       87380  16384  16384    10.00    20303.87
      
      After this commit:
      ./netperf -H 192.168.100.4
      MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.100.4
      () port 0 AF_INET : demo
      Recv   Send    Send
      Socket Socket  Message  Elapsed
      Size   Size    Size     Time     Throughput
      bytes  bytes   bytes    secs.    10^6bits/sec
      
       87380  16384  16384    10.00    23841.26
      
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Cc: Michael Dalton <mwdalton@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Acked-by: NMichael S. Tsirkin <mst@redhat.com>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NJason Wang <jasowang@redhat.com>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ba275241
  18. 30 10月, 2013 1 次提交
    • J
      virtio-net: correctly handle cpu hotplug notifier during resuming · ec9debbd
      Jason Wang 提交于
      commit 3ab098df (virtio-net: don't respond to
      cpu hotplug notifier if we're not ready) tries to bypass the cpu hotplug
      notifier by checking the config_enable and does nothing is it was false. So it
      need to try to hold the config_lock mutex which may happen in atomic
      environment which leads the following warnings:
      
      [  622.944441] CPU0 attaching NULL sched-domain.
      [  622.944446] CPU1 attaching NULL sched-domain.
      [  622.944485] CPU0 attaching NULL sched-domain.
      [  622.950795] BUG: sleeping function called from invalid context at kernel/mutex.c:616
      [  622.950796] in_atomic(): 1, irqs_disabled(): 1, pid: 10, name: migration/1
      [  622.950796] no locks held by migration/1/10.
      [  622.950798] CPU: 1 PID: 10 Comm: migration/1 Not tainted 3.12.0-rc5-wl-01249-gb91e82d #317
      [  622.950799] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
      [  622.950802]  0000000000000000 ffff88001d42dba0 ffffffff81a32f22 ffff88001bfb9c70
      [  622.950803]  ffff88001d42dbb0 ffffffff810edb02 ffff88001d42dc38 ffffffff81a396ed
      [  622.950805]  0000000000000046 ffff88001d42dbe8 ffffffff810e861d 0000000000000000
      [  622.950805] Call Trace:
      [  622.950810]  [<ffffffff81a32f22>] dump_stack+0x54/0x74
      [  622.950815]  [<ffffffff810edb02>] __might_sleep+0x112/0x114
      [  622.950817]  [<ffffffff81a396ed>] mutex_lock_nested+0x3c/0x3c6
      [  622.950818]  [<ffffffff810e861d>] ? up+0x39/0x3e
      [  622.950821]  [<ffffffff8153ea7c>] ? acpi_os_signal_semaphore+0x21/0x2d
      [  622.950824]  [<ffffffff81565ed1>] ? acpi_ut_release_mutex+0x5e/0x62
      [  622.950828]  [<ffffffff816d04ec>] virtnet_cpu_callback+0x33/0x87
      [  622.950830]  [<ffffffff81a42576>] notifier_call_chain+0x3c/0x5e
      [  622.950832]  [<ffffffff810e86a8>] __raw_notifier_call_chain+0xe/0x10
      [  622.950835]  [<ffffffff810c5556>] __cpu_notify+0x20/0x37
      [  622.950836]  [<ffffffff810c5580>] cpu_notify+0x13/0x15
      [  622.950838]  [<ffffffff81a237cd>] take_cpu_down+0x27/0x3a
      [  622.950841]  [<ffffffff81136289>] stop_machine_cpu_stop+0x93/0xf1
      [  622.950842]  [<ffffffff81136167>] cpu_stopper_thread+0xa0/0x12f
      [  622.950844]  [<ffffffff811361f6>] ? cpu_stopper_thread+0x12f/0x12f
      [  622.950847]  [<ffffffff81119710>] ? lock_release_holdtime.part.7+0xa3/0xa8
      [  622.950848]  [<ffffffff81135e4b>] ? cpu_stop_should_run+0x3f/0x47
      [  622.950850]  [<ffffffff810ea9b0>] smpboot_thread_fn+0x1c5/0x1e3
      [  622.950852]  [<ffffffff810ea7eb>] ? lg_global_unlock+0x67/0x67
      [  622.950854]  [<ffffffff810e36b7>] kthread+0xd8/0xe0
      [  622.950857]  [<ffffffff81a3bfad>] ? wait_for_common+0x12f/0x164
      [  622.950859]  [<ffffffff810e35df>] ? kthread_create_on_node+0x124/0x124
      [  622.950861]  [<ffffffff81a45ffc>] ret_from_fork+0x7c/0xb0
      [  622.950862]  [<ffffffff810e35df>] ? kthread_create_on_node+0x124/0x124
      [  622.950876] smpboot: CPU 1 is now offline
      [  623.194556] SMP alternatives: lockdep: fixing up alternatives
      [  623.194559] smpboot: Booting Node 0 Processor 1 APIC 0x1
      ...
      
      A correct fix is to unregister the hotcpu notifier during restore and register a
      new one in resume.
      Reported-by: NFengguang Wu <fengguang.wu@intel.com>
      Tested-by: NFengguang Wu <fengguang.wu@intel.com>
      Cc: Wanlong Gao <gaowanlong@cn.fujitsu.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: NJason Wang <jasowang@redhat.com>
      Acked-by: NMichael S. Tsirkin <mst@redhat.com>
      Reviewed-by: NWanlong Gao <gaowanlong@cn.fujitsu.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ec9debbd
  19. 29 10月, 2013 3 次提交
  20. 18 10月, 2013 2 次提交
  21. 17 10月, 2013 1 次提交
  22. 23 9月, 2013 1 次提交
  23. 04 9月, 2013 1 次提交
  24. 28 7月, 2013 1 次提交
  25. 10 7月, 2013 1 次提交
    • M
      virtio_net: fix race in RX VQ processing · cbdadbbf
      Michael S. Tsirkin 提交于
      virtio net called virtqueue_enable_cq on RX path after napi_complete, so
      with NAPI_STATE_SCHED clear - outside the implicit napi lock.
      This violates the requirement to synchronize virtqueue_enable_cq wrt
      virtqueue_add_buf.  In particular, used event can move backwards,
      causing us to lose interrupts.
      In a debug build, this can trigger panic within START_USE.
      
      Jason Wang reports that he can trigger the races artificially,
      by adding udelay() in virtqueue_enable_cb() after virtio_mb().
      
      However, we must call napi_complete to clear NAPI_STATE_SCHED before
      polling the virtqueue for used buffers, otherwise napi_schedule_prep in
      a callback will fail, causing us to lose RX events.
      
      To fix, call virtqueue_enable_cb_prepare with NAPI_STATE_SCHED
      set (under napi lock), later call virtqueue_poll with
      NAPI_STATE_SCHED clear (outside the lock).
      Reported-by: NJason Wang <jasowang@redhat.com>
      Tested-by: NJason Wang <jasowang@redhat.com>
      Acked-by: NJason Wang <jasowang@redhat.com>
      Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      cbdadbbf
  26. 04 7月, 2013 1 次提交
    • J
      virtio-net: fix the race between channels setting and refill · 9b9cd802
      Jason Wang 提交于
      Commit 55257d72 (virtio-net: fill only rx queues
      which are being used) tries to refill on demand when changing the number of
      channels by call try_refill_recv() directly, this may race:
      
      - the refill work who may do the refill in the same time
      - the try_refill_recv() called in bh since napi was not disabled
      
      Which may led guest complain during setting channels:
      
      virtio_net virtio0: input.1:id 0 is not a head!
      
      Solve this issue by scheduling a refill work which can guarantee the
      serialization of refill.
      
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: NJason Wang <jasowang@redhat.com>
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      9b9cd802
  27. 23 5月, 2013 1 次提交
  28. 12 5月, 2013 1 次提交