1. 04 Feb, 2017 1 commit
    • Revert "vring: Force use of DMA API for ARM-based systems with legacy devices" · 0d5415b4
      Michael S. Tsirkin committed
      This reverts commit c7070619.
      
      This has been shown to regress on some ARM systems:
      
      by forcing on DMA API usage for ARM systems, we have inadvertently
      kicked open a hornets' nest in terms of cache-coherency. Namely that
      unless the virtio device is explicitly described as capable of coherent
      DMA by firmware, the DMA APIs on ARM and other DT-based platforms will
      assume it is non-coherent. This turns out to cause a big problem for the
      likes of QEMU and kvmtool, which generate virtio-mmio devices in their
      guest DTs but neglect to add the often-overlooked "dma-coherent"
      property; as a result, we end up with the guest making non-cacheable
      accesses to the vring, the host doing so cacheably, both talking past
      each other and things going horribly wrong.
      
      We are working on a safer work-around.
      
      Fixes: c7070619 ("vring: Force use of DMA API for ARM-based systems with legacy devices")
      Reported-by: Robin Murphy <robin.murphy@arm.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
  2. 25 Jan, 2017 1 commit
    • vring: Force use of DMA API for ARM-based systems with legacy devices · c7070619
      Will Deacon committed
      Booting Linux on an ARM fastmodel containing an SMMU emulation results
      in an unexpected I/O page fault from the legacy virtio-blk PCI device:
      
      [    1.211721] arm-smmu-v3 2b400000.smmu: event 0x10 received:
      [    1.211800] arm-smmu-v3 2b400000.smmu:	0x00000000fffff010
      [    1.211880] arm-smmu-v3 2b400000.smmu:	0x0000020800000000
      [    1.211959] arm-smmu-v3 2b400000.smmu:	0x00000008fa081002
      [    1.212075] arm-smmu-v3 2b400000.smmu:	0x0000000000000000
      [    1.212155] arm-smmu-v3 2b400000.smmu: event 0x10 received:
      [    1.212234] arm-smmu-v3 2b400000.smmu:	0x00000000fffff010
      [    1.212314] arm-smmu-v3 2b400000.smmu:	0x0000020800000000
      [    1.212394] arm-smmu-v3 2b400000.smmu:	0x00000008fa081000
      [    1.212471] arm-smmu-v3 2b400000.smmu:	0x0000000000000000
      
      <system hangs failing to read partition table>
      
      This is because the legacy virtio-blk device is behind an SMMU, so we
      have consequently swizzled its DMA ops and configured the SMMU to
      translate accesses. This then requires the vring code to use the DMA API
      to establish translations, otherwise all transactions will result in
      fatal faults and termination.
      
      Given that ARM-based systems only see an SMMU if one is really present
      (the topology is all described by firmware tables such as device-tree or
      IORT), then we can safely use the DMA API for all legacy virtio devices.
      Modern devices can advertise the presence of an IOMMU using the
      VIRTIO_F_IOMMU_PLATFORM feature flag.
      
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Cc: <stable@vger.kernel.org>
      Fixes: 876945db ("arm64: Hook up IOMMU dma_ops")
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
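
The policy this commit message describes (and which the revert above later backed out) can be modeled in userspace roughly as follows. This is a hedged sketch, not the kernel code: `struct model_vdev`, `model_has_feature()`, `model_use_dma_api()` and the `is_arm` parameter are hypothetical stand-ins introduced for illustration, and the Xen special case in the real driver is left out. Only the two feature names and their bit numbers come from virtio itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Feature bit numbers as defined by the virtio specification. */
#define VIRTIO_F_VERSION_1      32
#define VIRTIO_F_IOMMU_PLATFORM 33

/* Hypothetical stand-in for struct virtio_device: negotiated features only. */
struct model_vdev {
	uint64_t features;
};

bool model_has_feature(const struct model_vdev *vdev, unsigned int bit)
{
	assert(bit < 64);	/* shifting by 64 or more would be undefined */
	return (vdev->features >> bit) & 1;
}

/*
 * Model of the decision described in the commit message:
 *  - a device offering VIRTIO_F_IOMMU_PLATFORM always uses the DMA API;
 *  - on ARM (is_arm is an assumed parameter standing in for a build-time
 *    architecture check), legacy devices use the DMA API as well;
 *  - everything else keeps being handed physical addresses.
 */
bool model_use_dma_api(const struct model_vdev *vdev, bool is_arm)
{
	if (model_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM))
		return true;
	if (is_arm)
		return !model_has_feature(vdev, VIRTIO_F_VERSION_1);
	return false;
}
```

In this model a legacy device (no `VIRTIO_F_VERSION_1`) on ARM is the only case whose answer changes with the architecture flag, which is the case the commit targets.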
  3. 16 Dec, 2016 1 commit
  4. 15 Dec, 2016 1 commit
    • virtio_ring: fix complaint by sparse · c60923cb
      Gonglei committed
       # make C=2 CF="-D__CHECK_ENDIAN__" ./drivers/virtio/
      
      drivers/virtio/virtio_ring.c:423:19: warning: incorrect type in assignment (different base types)
      drivers/virtio/virtio_ring.c:423:19:    expected unsigned int [unsigned] [assigned] i
      drivers/virtio/virtio_ring.c:423:19:    got restricted __virtio16 [usertype] next
      drivers/virtio/virtio_ring.c:423:19: warning: incorrect type in assignment (different base types)
      drivers/virtio/virtio_ring.c:423:19:    expected unsigned int [unsigned] [assigned] i
      drivers/virtio/virtio_ring.c:423:19:    got restricted __virtio16 [usertype] next
      drivers/virtio/virtio_ring.c:423:19: warning: incorrect type in assignment (different base types)
      drivers/virtio/virtio_ring.c:423:19:    expected unsigned int [unsigned] [assigned] i
      drivers/virtio/virtio_ring.c:423:19:    got restricted __virtio16 [usertype] next
      drivers/virtio/virtio_ring.c:604:39: warning: incorrect type in initializer (different base types)
      drivers/virtio/virtio_ring.c:604:39:    expected unsigned short [unsigned] [usertype] nextflag
      drivers/virtio/virtio_ring.c:604:39:    got restricted __virtio16
      drivers/virtio/virtio_ring.c:612:33: warning: restricted __virtio16 degrades to integer
      Signed-off-by: Gonglei <arei.gonglei@huawei.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
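
The warnings above come from treating a ring field with device-defined byte order as a plain integer. A hedged userspace sketch of the explicit conversion such warnings call for is shown below. `struct model_virtio16`, `model_virtio16_to_cpu()` and `model_cpu_to_virtio16()` are hypothetical names made up for this example; they only imitate the idea of the kernel's `__virtio16` helpers and are not their implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of a "restricted" 16-bit ring field. Wrapping the raw
 * bytes in a struct stops the value being used as an integer by accident,
 * which is the class of mistake sparse reports for __virtio16.
 */
struct model_virtio16 {
	uint8_t bytes[2];	/* raw bytes exactly as they sit in the ring */
};

/* Decode a ring field; little_endian says how the device lays it out. */
uint16_t model_virtio16_to_cpu(bool little_endian, struct model_virtio16 v)
{
	if (little_endian)
		return (uint16_t)(v.bytes[0] | (v.bytes[1] << 8));
	return (uint16_t)((v.bytes[0] << 8) | v.bytes[1]);
}

/* Encode a CPU value into the byte order the device expects. */
struct model_virtio16 model_cpu_to_virtio16(bool little_endian, uint16_t val)
{
	struct model_virtio16 v;

	if (little_endian) {
		v.bytes[0] = (uint8_t)(val & 0xff);
		v.bytes[1] = (uint8_t)(val >> 8);
	} else {
		v.bytes[0] = (uint8_t)(val >> 8);
		v.bytes[1] = (uint8_t)(val & 0xff);
	}
	return v;
}
```

Because the conversion works on bytes, the result does not depend on the endianness of the machine running the example.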
  5. 31 Oct, 2016 2 commits
  6. 10 Sep, 2016 1 commit
  7. 09 Aug, 2016 2 commits
  8. 02 Aug, 2016 1 commit
    • virtio: new feature to detect IOMMU device quirk · 1a937693
      Michael S. Tsirkin committed
      The interaction between virtio and IOMMUs is messy.
      
      On most systems with virtio, physical addresses match bus addresses,
      and it doesn't particularly matter which one we use to program
      the device.
      
      On some systems, including Xen and any system with a physical device
      that speaks virtio behind a physical IOMMU, we must program the IOMMU
      for virtio DMA to work at all.
      
      On other systems, including SPARC and PPC64, virtio-pci devices are
      enumerated as though they are behind an IOMMU, but the virtio host
      ignores the IOMMU, so we must either pretend that the IOMMU isn't
      there or somehow map everything as the identity.
      
      Add a feature bit to detect that quirk: VIRTIO_F_IOMMU_PLATFORM.
      
      Any device with this feature bit set to 0 needs a quirk and has to be
      passed physical addresses (as opposed to bus addresses) even though
      the device is behind an IOMMU.
      
      Note: it has to be a per-device quirk because for example, there could
      be a mix of passed-through and virtual virtio devices. As another
      example, some devices could be implemented by an out of process
      hypervisor backend (in case of qemu vhost, or vhost-user) and so support
      for an IOMMU needs to be coded up separately.
      
      It would be cleanest to handle this in IOMMU core code, but that needs
      per-device DMA ops. While we are waiting for that to be implemented, use
      a work-around in virtio core.
      
      Note: a "noiommu" feature is a quirk - add a wrapper to make
      that clear.
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
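
The reverse polarity of the quirk (feature bit clear means "needs the quirk") is easy to get backwards, which is why the message asks for a wrapper. A minimal sketch of such a wrapper, assuming a simplified device structure, is given below. `struct model_vdev`, `model_has_feature()`, `model_has_iommu_quirk()` and `model_addr_kind_for()` are hypothetical names for this example; the bit number 33 for `VIRTIO_F_IOMMU_PLATFORM` is the one assigned by the virtio specification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_F_IOMMU_PLATFORM 33	/* bit number from the virtio specification */

/* Hypothetical stand-in for struct virtio_device: negotiated features only. */
struct model_vdev {
	uint64_t features;
};

bool model_has_feature(const struct model_vdev *vdev, unsigned int bit)
{
	assert(bit < 64);	/* shifting by 64 or more would be undefined */
	return (vdev->features >> bit) & 1;
}

/* Reverse polarity: the quirk applies when the feature bit is NOT set. */
bool model_has_iommu_quirk(const struct model_vdev *vdev)
{
	return !model_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
}

/* Which kind of address the driver hands to the device in this model. */
enum model_addr_kind {
	MODEL_ADDR_PHYSICAL,	/* quirk: bypass the IOMMU */
	MODEL_ADDR_BUS,		/* no quirk: address as seen through the IOMMU */
};

enum model_addr_kind model_addr_kind_for(const struct model_vdev *vdev)
{
	return model_has_iommu_quirk(vdev) ? MODEL_ADDR_PHYSICAL : MODEL_ADDR_BUS;
}
```

Since the check is made on each device's own feature bits, a mix of passed-through and virtual devices can be handled side by side, as the note in the message requires.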
  9. 01 May, 2016 1 commit
  10. 02 Mar, 2016 4 commits
  11. 13 Jan, 2016 1 commit
  12. 07 Dec, 2015 2 commits
    • virtio_ring: shadow available ring flags & index · f277ec42
      Venkatesh Srinivas committed
      Improves cacheline transfer flow of available ring header.
      
      Virtqueues are implemented as a pair of rings, one producer->consumer
      avail ring and one consumer->producer used ring; preceding the
      avail ring in memory are two contiguous u16 fields -- avail->flags
      and avail->idx. A producer posts work by writing to avail->idx and
      a consumer reads avail->idx.
      
      The flags and idx fields only need to be written by a producer CPU
      and only read by a consumer CPU; when the producer and consumer are
      running on different CPUs and the virtio_ring code is structured to
      only have source writes/sink reads, we can continuously transfer the
      avail header cacheline between 'M' states between cores. This flow
      optimizes core -> core bandwidth on certain CPUs.
      
      (see: "Software Optimization Guide for AMD Family 15h Processors",
      Section 11.6; similar language appears in the 10h guide and should
      apply to CPUs w/ exclusive caches, using LLC as a transfer cache)
      
      Unfortunately the existing virtio_ring code issued reads to the
      avail->idx and read-modify-writes to avail->flags on the producer.
      
      This change shadows the flags and index fields in producer memory;
      the vring code now reads from the shadows and only ever writes to
      avail->flags and avail->idx, allowing the cacheline to transfer
      core -> core optimally.
      
      In a concurrent version of vring_bench, the time required for
      10,000,000 buffer checkout/returns was reduced by ~2% (average
      across many runs) on an AMD Piledriver (15h) CPU:
      
      (w/o shadowing):
       Performance counter stats for './vring_bench':
           5,451,082,016      L1-dcache-loads
           ...
             2.221477739 seconds time elapsed
      
      (w/ shadowing):
       Performance counter stats for './vring_bench':
           5,405,701,361      L1-dcache-loads
           ...
             2.168405376 seconds time elapsed
      
      The further away (in a NUMA sense) virtio producers and consumers are
      from each other, the more we expect to benefit. Physical implementations
      of virtio devices and implementations of virtio where the consumer polls
      vring avail indexes (vhost) should also benefit.
      Signed-off-by: Venkatesh Srinivas <venkateshs@google.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
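
The shadowing idea can be illustrated with a small single-threaded model. This is a sketch under stated assumptions rather than the driver code: `struct model_avail`, `struct model_vq`, `model_publish()`, `model_disable_cb()` and the fixed ring size are hypothetical, and the memory barriers a real producer needs are only marked by a comment. The value of `VRING_AVAIL_F_NO_INTERRUPT` is the one from the virtio ring ABI.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_RING_SIZE 8		/* assumed power-of-two queue size */
#define VRING_AVAIL_F_NO_INTERRUPT 1	/* value from the virtio ring ABI */

/* Shared with the consumer: the producer should only ever write here. */
struct model_avail {
	uint16_t flags;
	uint16_t idx;
	uint16_t ring[MODEL_RING_SIZE];
};

/* Producer-private state, including the shadows of flags and idx. */
struct model_vq {
	struct model_avail *avail;
	uint16_t avail_flags_shadow;
	uint16_t avail_idx_shadow;
};

/* Publish one descriptor head without ever reading avail->idx back. */
void model_publish(struct model_vq *vq, uint16_t head)
{
	uint16_t slot;

	assert(vq && vq->avail);
	slot = vq->avail_idx_shadow & (MODEL_RING_SIZE - 1);
	vq->avail->ring[slot] = head;
	/* A real producer needs a write barrier here; omitted in this model. */
	vq->avail_idx_shadow++;
	vq->avail->idx = vq->avail_idx_shadow;
}

/* Suppress interrupts: test the shadow, then do a plain store. */
void model_disable_cb(struct model_vq *vq)
{
	assert(vq && vq->avail);
	if (!(vq->avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT)) {
		vq->avail_flags_shadow |= VRING_AVAIL_F_NO_INTERRUPT;
		vq->avail->flags = vq->avail_flags_shadow;
	}
}
```

Every decision in the model is taken from the private shadows, so the shared header is only ever the target of stores, which is the access pattern the message argues for.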
    • virtio: Do not drop __GFP_HIGH in alloc_indirect · 82107539
      Michal Hocko committed
      b92b1b89 ("virtio: force vring descriptors to be allocated from
      lowmem") tried to exclude highmem pages for descriptors so it cleared
      __GFP_HIGHMEM from a given gfp mask. The patch also cleared __GFP_HIGH
      which doesn't make much sense for this fix because __GFP_HIGH only
      controls access to memory reserves and it doesn't have any influence
      on the zone selection. Some of the call paths use GFP_ATOMIC and
      dropping __GFP_HIGH will reduce their chances of success because of
      the lack of access to memory reserves.
      Signed-off-by: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Reviewed-by: Mel Gorman <mgorman@techsingularity.net>
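
The difference between the two masks can be shown with plain bit arithmetic. The sketch below is an illustration only: the `MODEL_GFP_*` values are hypothetical (the real `__GFP_*` constants differ between kernel versions), and the two function names are made up for this example.

```c
#include <assert.h>

/* Hypothetical bit values; the real __GFP_* constants vary by kernel version. */
#define MODEL_GFP_HIGHMEM 0x02u	/* zone selector: highmem pages allowed */
#define MODEL_GFP_HIGH    0x20u	/* priority: may dip into memory reserves */

/* Behaviour the message criticises: both bits are stripped. */
unsigned int model_indirect_gfp_old(unsigned int gfp)
{
	return gfp & ~(MODEL_GFP_HIGHMEM | MODEL_GFP_HIGH);
}

/* Behaviour after this commit: only the zone selector is stripped. */
unsigned int model_indirect_gfp_new(unsigned int gfp)
{
	return gfp & ~MODEL_GFP_HIGHMEM;
}
```

With the new mask an atomic-style caller keeps its access to reserves while the allocation still stays out of highmem.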
  13. 11 Feb, 2015 1 commit
  14. 21 Jan, 2015 1 commit
  15. 09 Dec, 2014 4 commits
  16. 14 Sep, 2014 2 commits
    • virtio_ring: unify direct/indirect code paths. · b25bd251
      Rusty Russell committed
      virtqueue_add() populates the virtqueue descriptor table from the sgs
      given.  If it uses an indirect descriptor table, then it puts a single
      descriptor in the descriptor table pointing to the kmalloc'ed indirect
      table where the sg is populated.
      
      Previously vring_add_indirect() did the allocation and the simple
      linear layout.  We replace that with alloc_indirect() which allocates
      the indirect table then chains it like the normal descriptor table so
      we can reuse the core logic.
      
      This slows down pktgen by less than 1/2 a percent (which uses direct
      descriptors), as well as vring_bench, but it's far neater.
      
      vring_bench before:
      	1061485790-1104800648(1.08254e+09+/-6.6e+06)ns
      vring_bench after:
      	1125610268-1183528965(1.14172e+09+/-8e+06)ns
      
      pktgen before:
         787781-796334(793165+/-2.4e+03)pps 365-369(367.5+/-1.2)Mb/sec (365530384-369498976(3.68028e+08+/-1.1e+06)bps) errors: 0
      
      pktgen after:
         779988-790404(786391+/-2.5e+03)pps 361-366(364.35+/-1.3)Mb/sec (361914432-366747456(3.64885e+08+/-1.2e+06)bps) errors: 0
      
      Now, if we force indirect descriptors by turning off any_header_sg
      in virtio_net.c:
      
      pktgen before:
        713773-721062(718374+/-2.1e+03)pps 331-334(332.95+/-0.92)Mb/sec (331190672-334572768(3.33325e+08+/-9.6e+05)bps) errors: 0
      pktgen after:
        710542-719195(714898+/-2.4e+03)pps 329-333(331.15+/-1.1)Mb/sec (329691488-333706480(3.31713e+08+/-1.1e+06)bps) errors: 0
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
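
The "allocate, then chain like the normal table" step can be sketched in userspace as follows. This is a simplified model, not the kernel function: `struct model_desc`, `model_alloc_indirect()` and `model_fill()` are hypothetical, `malloc()` stands in for `kmalloc()`, and descriptor flags are simply zeroed.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified descriptor; the field names follow struct vring_desc. */
struct model_desc {
	uint64_t addr;
	uint32_t len;
	uint16_t flags;
	uint16_t next;
};

/*
 * Allocate an indirect table and pre-chain it (entry i points at i + 1),
 * so a single fill loop can walk it exactly like the main descriptor table.
 * Returns NULL on allocation failure; the caller frees the table.
 */
struct model_desc *model_alloc_indirect(unsigned int total_sg)
{
	struct model_desc *desc;
	unsigned int i;

	assert(total_sg > 0 && total_sg <= UINT16_MAX);
	desc = malloc(total_sg * sizeof(*desc));
	if (!desc)
		return NULL;
	for (i = 0; i < total_sg; i++)
		desc[i].next = (uint16_t)(i + 1);
	return desc;
}

/*
 * Shared fill loop: follows .next, so it does not care whether 'table' is
 * the main descriptor table or an indirect one. Returns the index that
 * follows the last entry written.
 */
unsigned int model_fill(struct model_desc *table, unsigned int head,
			const uint64_t *addrs, const uint32_t *lens,
			unsigned int n)
{
	unsigned int i = head;
	unsigned int k;

	for (k = 0; k < n; k++) {
		table[i].addr = addrs[k];
		table[i].len = lens[k];
		table[i].flags = 0;
		i = table[i].next;
	}
	return i;
}
```

The extra pass that writes `next` for every entry is a plausible source of the small slowdown reported above, though the message itself does not attribute it.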
    • virtio_ring: assume sgs are always well-formed. · eeebf9b1
      Rusty Russell committed
      We used to have several callers which just used arrays.  They're
      gone, so we can use sg_next() everywhere, simplifying the code.
      
      On my laptop, this slowed down vring_bench by 15%:
      
      vring_bench before:
      	936153354-967745359(9.44739e+08+/-6.1e+06)ns
      vring_bench after:
      	1061485790-1104800648(1.08254e+09+/-6.6e+06)ns
      
      However, a more realistic test using pktgen on an AMD FX(tm)-8320 saw
      a few percent improvement:
      
      pktgen before:
        767390-792966(785159+/-6.5e+03)pps 356-367(363.75+/-2.9)Mb/sec (356068960-367936224(3.64314e+08+/-3e+06)bps) errors: 0
      
      pktgen after:
         787781-796334(793165+/-2.4e+03)pps 365-369(367.5+/-1.2)Mb/sec (365530384-369498976(3.68028e+08+/-1.1e+06)bps) errors: 0
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  17. 28 Apr, 2014 1 commit
  18. 13 Mar, 2014 2 commits
    • virtio: fail adding buffer on broken queues. · 70670444
      Rusty Russell committed
      Heinz points out that adding buffers to a broken virtqueue (which
      should "never happen") still works.  Failing allows drivers to detect
      and complain about broken devices.
      
      Now that drivers are robust, we can add this extra check.
      Reported-by: Heinz Graalfs <graalfs@linux.vnet.ibm.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    • tools/virtio: fix missing kmemleak_ignore symbol · 6abb2dd9
      Joel Stanley committed
      In commit bb478d8b ("virtio_ring: plug kmemleak false positive"),
      kmemleak_ignore was introduced. This broke compilation of virtio_test:
      
        cc -g -O2 -Wall -I. -I ../../usr/include/ -Wno-pointer-sign
          -fno-strict-overflow -fno-strict-aliasing -fno-common -MMD
          -U_FORTIFY_SOURCE   -c -o virtio_ring.o ../../drivers/virtio/virtio_ring.c
        ../../drivers/virtio/virtio_ring.c: In function ‘vring_add_indirect’:
        ../../drivers/virtio/virtio_ring.c:177:2: warning: implicit declaration
        of function ‘kmemleak_ignore’ [-Wimplicit-function-declaration]
          kmemleak_ignore(desc);
          ^
        cc   virtio_test.o virtio_ring.o   -o virtio_test
        virtio_ring.o: In function `vring_add_indirect':
        tools/virtio/../../drivers/virtio/virtio_ring.c:177:
        undefined reference to `kmemleak_ignore'
      
      Add a dummy header for tools/virtio, and add #include <linux/kmemleak.h>
      to drivers/virtio/virtio_ring.c so it is picked up by the userspace
      tools.
      Signed-off-by: Joel Stanley <joel@jms.id.au>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
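
What such a dummy header could provide is a no-op with the same signature, so that kernel code compiles unchanged in userspace. The sketch below is an assumption about its shape, not a copy of the file in tools/virtio; `model_alloc_table()` is a hypothetical caller added only to show the stub being used.

```c
#include <assert.h>
#include <stdlib.h>

/* Userspace stub: there is no kmemleak to inform, so do nothing. */
static inline void kmemleak_ignore(const void *ptr)
{
	(void)ptr;
}

/*
 * Hypothetical caller: allocates a table and marks it as "not a leak",
 * mirroring how the driver treats its indirect descriptor tables.
 * Returns NULL on allocation failure; the caller frees the table.
 */
void *model_alloc_table(size_t entries, size_t entry_size)
{
	void *table = calloc(entries, entry_size);

	if (table)
		kmemleak_ignore(table);
	return table;
}
```

With the stub in place the implicit-declaration warning and the undefined reference from the build log above both go away, since the symbol is now declared and defined inline.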
  19. 05 Nov, 2013 1 commit
  20. 29 Oct, 2013 3 commits
  21. 17 Oct, 2013 1 commit
    • virtio_ring: plug kmemleak false positive. · bb478d8b
      Rusty Russell committed
      unreferenced object 0xffff88003d467e20 (size 32):
        comm "softirq", pid 0, jiffies 4295197765 (age 6.364s)
        hex dump (first 32 bytes):
          28 19 bf 3d 00 00 00 00 0c 00 00 00 01 00 01 00  (..=............
          02 dc 51 3c 00 00 00 00 56 00 00 00 00 00 00 00  ..Q<....V.......
        backtrace:
          [<ffffffff8152db19>] kmemleak_alloc+0x59/0xc0
          [<ffffffff81102e93>] __kmalloc+0xf3/0x180
          [<ffffffff812db5d6>] vring_add_indirect+0x36/0x280
          [<ffffffff812dc59f>] virtqueue_add_outbuf+0xbf/0x4e0
          [<ffffffff813a8b30>] start_xmit+0x1a0/0x3b0
          [<ffffffff81445861>] dev_hard_start_xmit+0x2d1/0x4d0
          [<ffffffff81460052>] sch_direct_xmit+0xf2/0x1c0
          [<ffffffff81445c28>] dev_queue_xmit+0x1c8/0x460
          [<ffffffff814e3187>] ip6_finish_output2+0x1d7/0x470
          [<ffffffff814e34b0>] ip6_finish_output+0x90/0xb0
          [<ffffffff814e3507>] ip6_output+0x37/0xb0
          [<ffffffff815021eb>] igmp6_send+0x2db/0x470
          [<ffffffff81502645>] igmp6_timer_handler+0x95/0xa0
          [<ffffffff8104b57c>] call_timer_fn+0x2c/0x90
          [<ffffffff8104b7ba>] run_timer_softirq+0x1da/0x1f0
          [<ffffffff81045721>] __do_softirq+0xd1/0x1b0
      
      Address gets embedded in a descriptor via virt_to_phys().  See detach_buf,
      which frees it:
      
      	if (vq->vring.desc[i].flags & VRING_DESC_F_INDIRECT)
      		kfree(phys_to_virt(vq->vring.desc[i].addr));
      Reported-by: Christoph Paasch <christoph.paasch@uclouvain.be>
      Fix-suggested-by: Christoph Paasch <christoph.paasch@uclouvain.be>
      Typing-done-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
  22. 10 Jul, 2013 1 commit
  23. 20 May, 2013 1 commit
  24. 20 Mar, 2013 3 commits
  25. 18 Dec, 2012 1 commit