1. 21 1月, 2017 1 次提交
    • V
      xen-netfront: Fix Rx stall during network stress and OOM · 90c311b0
      Vineeth Remanan Pillai 提交于
      During an OOM scenario, request slots could not be created as skb
      allocation fails. So the netback cannot pass in packets and netfront
      wrongly assumes that there is no more work to be done and it disables
      polling. This causes Rx to stall.
      
      The issue is with the retry logic which schedules the timer if the
      created slots are less than NET_RX_SLOTS_MIN. The count of new request
      slots to be pushed are calculated as a difference between new req_prod
      and rsp_cons which could be more than the actual slots, if there are
      unconsumed responses.
      
      The fix is to calculate the count of newly created slots as the
      difference between new req_prod and old req_prod.
      Signed-off-by: NVineeth Remanan Pillai <vineethp@amazon.com>
      Reviewed-by: NJuergen Gross <jgross@suse.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      90c311b0
  2. 07 11月, 2016 1 次提交
  3. 03 11月, 2016 1 次提交
  4. 01 11月, 2016 1 次提交
  5. 21 10月, 2016 1 次提交
    • J
      net: use core MTU range checking in virt drivers · d0c2c997
      Jarod Wilson 提交于
      hyperv_net:
      - set min/max_mtu, per Haiyang, after rndis_filter_device_add
      
      virtio_net:
      - set min/max_mtu
      - remove virtnet_change_mtu
      
      vmxnet3:
      - set min/max_mtu
      
      xen-netback:
      - min_mtu = 0, max_mtu = 65517
      
      xen-netfront:
      - min_mtu = 0, max_mtu = 65535
      
      unisys/visor:
      - clean up defines a little to not clash with network core or add
        redundat definitions
      
      CC: netdev@vger.kernel.org
      CC: virtualization@lists.linux-foundation.org
      CC: "K. Y. Srinivasan" <kys@microsoft.com>
      CC: Haiyang Zhang <haiyangz@microsoft.com>
      CC: "Michael S. Tsirkin" <mst@redhat.com>
      CC: Shrikrishna Khare <skhare@vmware.com>
      CC: "VMware, Inc." <pv-drivers@vmware.com>
      CC: Wei Liu <wei.liu2@citrix.com>
      CC: Paul Durrant <paul.durrant@citrix.com>
      CC: David Kershner <david.kershner@unisys.com>
      Signed-off-by: NJarod Wilson <jarod@redhat.com>
      Reviewed-by: NHaiyang Zhang <haiyangz@microsoft.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d0c2c997
  6. 20 9月, 2016 1 次提交
    • V
      xen-netfront: avoid packet loss when ethernet header crosses page boundary · fd07160b
      Vitaly Kuznetsov 提交于
      Small packet loss is reported on complex multi host network configurations
      including tunnels, NAT, ... My investigation led me to the following check
      in netback which drops packets:
      
              if (unlikely(txreq.size < ETH_HLEN)) {
                      netdev_err(queue->vif->dev,
                                 "Bad packet size: %d\n", txreq.size);
                      xenvif_tx_err(queue, &txreq, extra_count, idx);
                      break;
              }
      
      But this check itself is legitimate. SKBs consist of a linear part (which
      has to have the ethernet header) and (optionally) a number of frags.
      Netfront transmits the head of the linear part up to the page boundary
      as the first request and all the rest becomes frags so when we're
      reconstructing the SKB in netback we can't distinguish between original
      frags and the 'tail' of the linear part. The first SKB needs to be at
      least ETH_HLEN size. So in case we have an SKB with its linear part
      starting too close to the page boundary the packet is lost.
      
      I see two ways to fix the issue:
      - Change the 'wire' protocol between netfront and netback to start keeping
        the original SKB structure. We'll have to add a flag indicating the fact
        that the particular request is a part of the original linear part and not
        a frag. We'll need to know the length of the linear part to pre-allocate
        memory.
      - Avoid transmitting SKBs with linear parts starting too close to the page
        boundary. That seems preferable short-term and shouldn't bring
        significant performance degradation as such packets are rare. That's what
        this patch is trying to achieve with skb_copy().
      Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com>
      Acked-by: NDavid Vrabel <david.vrabel@citrix.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fd07160b
  7. 29 1月, 2016 1 次提交
  8. 23 10月, 2015 1 次提交
  9. 21 10月, 2015 1 次提交
  10. 21 9月, 2015 1 次提交
  11. 11 9月, 2015 1 次提交
  12. 09 9月, 2015 1 次提交
  13. 29 8月, 2015 1 次提交
    • C
      net/xen-netfront: only napi_synchronize() if running · 274b0455
      Chas Williams 提交于
      If an interface isn't running napi_synchronize() will hang forever.
      
      [  392.248403] rmmod           R  running task        0   359    343 0x00000000
      [  392.257671]  ffff88003760fc88 ffff880037193b40 ffff880037193160 ffff88003760fc88
      [  392.267644]  ffff880037610000 ffff88003760fcd8 0000000100014c22 ffffffff81f75c40
      [  392.277524]  0000000000bc7010 ffff88003760fca8 ffffffff81796927 ffffffff81f75c40
      [  392.287323] Call Trace:
      [  392.291599]  [<ffffffff81796927>] schedule+0x37/0x90
      [  392.298553]  [<ffffffff8179985b>] schedule_timeout+0x14b/0x280
      [  392.306421]  [<ffffffff810f91b9>] ? irq_free_descs+0x69/0x80
      [  392.314006]  [<ffffffff811084d0>] ? internal_add_timer+0xb0/0xb0
      [  392.322125]  [<ffffffff81109d07>] msleep+0x37/0x50
      [  392.329037]  [<ffffffffa00ec79a>] xennet_disconnect_backend.isra.24+0xda/0x390 [xen_netfront]
      [  392.339658]  [<ffffffffa00ecadc>] xennet_remove+0x2c/0x80 [xen_netfront]
      [  392.348516]  [<ffffffff81481c69>] xenbus_dev_remove+0x59/0xc0
      [  392.356257]  [<ffffffff814e7217>] __device_release_driver+0x87/0x120
      [  392.364645]  [<ffffffff814e7cf8>] driver_detach+0xb8/0xc0
      [  392.371989]  [<ffffffff814e6e69>] bus_remove_driver+0x59/0xe0
      [  392.379883]  [<ffffffff814e84f0>] driver_unregister+0x30/0x70
      [  392.387495]  [<ffffffff814814b2>] xenbus_unregister_driver+0x12/0x20
      [  392.395908]  [<ffffffffa00ed89b>] netif_exit+0x10/0x775 [xen_netfront]
      [  392.404877]  [<ffffffff81124e08>] SyS_delete_module+0x1d8/0x230
      [  392.412804]  [<ffffffff8179a8ee>] system_call_fastpath+0x12/0x71
      Signed-off-by: NChas Williams <3chas3@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      274b0455
  14. 24 8月, 2015 1 次提交
  15. 29 6月, 2015 1 次提交
  16. 22 6月, 2015 1 次提交
  17. 17 6月, 2015 1 次提交
  18. 01 6月, 2015 1 次提交
  19. 28 5月, 2015 1 次提交
  20. 18 4月, 2015 1 次提交
  21. 15 4月, 2015 1 次提交
  22. 03 4月, 2015 1 次提交
    • J
      xen-netfront: transmit fully GSO-sized packets · 0c36820e
      Jonathan Davies 提交于
      xen-netfront limits transmitted skbs to be at most 44 segments in size. However,
      GSO permits up to 65536 bytes, which means a maximum of 45 segments of 1448
      bytes each. This slight reduction in the size of packets means a slight loss in
      efficiency.
      
      Since c/s 9ecd1a75, xen-netfront sets gso_max_size to
          XEN_NETIF_MAX_TX_SIZE - MAX_TCP_HEADER,
      where XEN_NETIF_MAX_TX_SIZE is 65535 bytes.
      
      The calculation used by tcp_tso_autosize (and also tcp_xmit_size_goal since c/s
      6c09fa09) in determining when to split an skb into two is
          sk->sk_gso_max_size - 1 - MAX_TCP_HEADER.
      
      So the maximum permitted size of an skb is calculated to be
          (XEN_NETIF_MAX_TX_SIZE - MAX_TCP_HEADER) - 1 - MAX_TCP_HEADER.
      
      Intuitively, this looks like the wrong formula -- we don't need two TCP headers.
      Instead, there is no need to deviate from the default gso_max_size of 65536 as
      this already accommodates the size of the header.
      
      Currently, the largest skb transmitted by netfront is 63712 bytes (44 segments
      of 1448 bytes each), as observed via tcpdump. This patch makes netfront send
      skbs of up to 65160 bytes (45 segments of 1448 bytes each).
      
      Similarly, the maximum allowable mtu does not need to subtract MAX_TCP_HEADER as
      it relates to the size of the whole packet, including the header.
      
      Fixes: 9ecd1a75 ("xen-netfront: reduce gso_max_size to account for max TCP header")
      Signed-off-by: NJonathan Davies <jonathan.davies@citrix.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0c36820e
  23. 05 2月, 2015 1 次提交
  24. 14 1月, 2015 3 次提交
  25. 13 1月, 2015 1 次提交
  26. 17 12月, 2014 1 次提交
    • D
      xen-netfront: use napi_complete() correctly to prevent Rx stalling · 6a6dc08f
      David Vrabel 提交于
      After d75b1ade (net: less interrupt
      masking in NAPI) the napi instance is removed from the per-cpu list
      prior to calling the n->poll(), and is only requeued if all of the
      budget was used.  This inadvertently broke netfront because netfront
      does not use NAPI correctly.
      
      If netfront had not used all of its budget it would do a final check
      for any Rx responses and avoid calling napi_complete() if there were
      more responses.  It would still return under budget so it would never
      be rescheduled.  The final check would also not re-enable the Rx
      interrupt.
      
      Additionally, xenvif_poll() would also call napi_complete() /after/
      enabling the interrupt.  This resulted in a race between the
      napi_complete() and the napi_schedule() in the interrupt handler.  The
      use of local_irq_save/restore() avoided by race iff the handler is
      running on the same CPU but not if it was running on a different CPU.
      
      Fix both of these by always calling napi_compete() if the budget was
      not all used, and then calling napi_schedule() if the final checks
      says there's more work.
      Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6a6dc08f
  27. 10 12月, 2014 1 次提交
  28. 03 12月, 2014 1 次提交
  29. 27 10月, 2014 1 次提交
    • D
      xen-netfront: always keep the Rx ring full of requests · 1f3c2eba
      David Vrabel 提交于
      A full Rx ring only requires 1 MiB of memory.  This is not enough
      memory that it is useful to dynamically scale the number of Rx
      requests in the ring based on traffic rates, because:
      
      a) Even the full 1 MiB is a tiny fraction of a typically modern Linux
         VM (for example, the AWS micro instance still has 1 GiB of memory).
      
      b) Netfront would have used up to 1 MiB already even with moderate
         data rates (there was no adjustment of target based on memory
         pressure).
      
      c) Small VMs are going to typically have one VCPU and hence only one
         queue.
      
      Keeping the ring full of Rx requests handles bursty traffic better
      than trying to converge on an optimal number of requests to keep
      filled.
      
      On a 4 core host, an iperf -P 64 -t 60 run from dom0 to a 4 VCPU guest
      improved from 5.1 Gbit/s to 5.6 Gbit/s.  Gains with more bursty
      traffic are expected to be higher.
      Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1f3c2eba
  30. 16 10月, 2014 1 次提交
    • T
      net: Add ndo_gso_check · 04ffcb25
      Tom Herbert 提交于
      Add ndo_gso_check which a device can define to indicate whether is
      is capable of doing GSO on a packet. This funciton would be called from
      the stack to determine whether software GSO is needed to be done. A
      driver should populate this function if it advertises GSO types for
      which there are combinations that it wouldn't be able to handle. For
      instance a device that performs UDP tunneling might only implement
      support for transparent Ethernet bridging type of inner packets
      or might have limitations on lengths of inner headers.
      Signed-off-by: NTom Herbert <therbert@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      04ffcb25
  31. 06 10月, 2014 1 次提交
  32. 12 8月, 2014 1 次提交
    • Z
      xen-netfront: Fix handling packets on compound pages with skb_linearize · 97a6d1bb
      Zoltan Kiss 提交于
      There is a long known problem with the netfront/netback interface: if the guest
      tries to send a packet which constitues more than MAX_SKB_FRAGS + 1 ring slots,
      it gets dropped. The reason is that netback maps these slots to a frag in the
      frags array, which is limited by size. Having so many slots can occur since
      compound pages were introduced, as the ring protocol slice them up into
      individual (non-compound) page aligned slots. The theoretical worst case
      scenario looks like this (note, skbs are limited to 64 Kb here):
      linear buffer: at most PAGE_SIZE - 17 * 2 bytes, overlapping page boundary,
      using 2 slots
      first 15 frags: 1 + PAGE_SIZE + 1 bytes long, first and last bytes are at the
      end and the beginning of a page, therefore they use 3 * 15 = 45 slots
      last 2 frags: 1 + 1 bytes, overlapping page boundary, 2 * 2 = 4 slots
      Although I don't think this 51 slots skb can really happen, we need a solution
      which can deal with every scenario. In real life there is only a few slots
      overdue, but usually it causes the TCP stream to be blocked, as the retry will
      most likely have the same buffer layout.
      This patch solves this problem by linearizing the packet. This is not the
      fastest way, and it can fail much easier as it tries to allocate a big linear
      area for the whole packet, but probably easier by an order of magnitude than
      anything else. Probably this code path is not touched very frequently anyway.
      Signed-off-by: NZoltan Kiss <zoltan.kiss@citrix.com>
      Cc: Wei Liu <wei.liu2@citrix.com>
      Cc: Ian Campbell <Ian.Campbell@citrix.com>
      Cc: Paul Durrant <paul.durrant@citrix.com>
      Cc: netdev@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Cc: xen-devel@lists.xenproject.org
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      97a6d1bb
  33. 01 8月, 2014 3 次提交
  34. 09 7月, 2014 2 次提交
  35. 22 6月, 2014 1 次提交
    • D
      xen-netfront: recreate queues correctly when reconnecting · ce58725f
      David Vrabel 提交于
      When reconnecting to the backend (after a resume/migration, for example),
      a different number of queues may be required (since the guest may have
      moved to a different host with different capabilities).  During the
      reconnection the old queues are torn down and new ones created.
      
      Introduce xennet_create_queues() and xennet_destroy_queues() that fixes
      three bugs during the reconnection.
      
      - The old info->queues was leaked.
      - The old queue's napi instances were not deleted.
      - The new queue's napi instances were left disabled (which meant no
        packets could be received).
      
      The xennet_destroy_queues() calls is deferred until the reconnection
      instead of the disconnection (in xennet_disconnect_backend()) because
      napi_disable() might sleep.
      Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
      Reviewed-by: NWei Liu <wei.liu2@citrix.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ce58725f