1. 16 Nov, 2020 1 commit
  2. 21 Oct, 2020 2 commits
  3. 04 Oct, 2020 3 commits
  4. 02 Sep, 2020 1 commit
  5. 06 Aug, 2020 1 commit
  6. 05 Aug, 2020 2 commits
  7. 21 Jul, 2020 1 commit
    • vhost: Remove redundant use of read_barrier_depends() barrier · 71c0b9a6
      Committed by Will Deacon
      Since commit 76ebbe78 ("locking/barriers: Add implicit
      smp_read_barrier_depends() to READ_ONCE()"), there is no need to use
      smp_read_barrier_depends() outside of the Alpha architecture code.
      
      Unfortunately, there is precisely _one_ user in the vhost code, and
      there isn't an obvious READ_ONCE() access making the barrier
      redundant. However, on closer inspection (thanks, Jason), it appears
      that vring synchronisation between the producer and consumer occurs via
      the 'avail_idx' field, which is followed up by an rmb() in
      vhost_get_vq_desc(), making the read_barrier_depends() redundant on
      Alpha.
      
      Jason says:
      
        | I'm also confused about the barrier here, basically in driver side
        | we did:
        |
        | 1) allocate pages
        | 2) store pages in indirect->addr
        | 3) smp_wmb()
        | 4) increase the avail idx (somehow a tail pointer of vring)
        |
        | in vhost we did:
        |
        | 1) read avail idx
        | 2) smp_rmb()
        | 3) read indirect->addr
        | 4) read from indirect->addr
        |
        | It looks to me even the data dependency barrier is not necessary
        | since we have rmb() which is sufficient for us to the correct
        | indirect->addr and driver are not expected to do any writing to
        | indirect->addr after avail idx is increased
      
      Remove the redundant barrier invocation.
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Suggested-by: Jason Wang <jasowang@redhat.com>
      Acked-by: Paul E. McKenney <paulmck@kernel.org>
      Signed-off-by: Will Deacon <will@kernel.org>
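The ordering Jason describes in the commit above (publish the data, write barrier, bump the index; read the index, read barrier, read the data) can be sketched in portable C11. This is a userspace analogy under stated assumptions, not kernel code: `struct toy_ring`, `toy_publish()` and `toy_consume()` are hypothetical names, and the C11 release and acquire fences are used as conservative stand-ins for `smp_wmb()` and `smp_rmb()`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical miniature ring: one indirect table pointer plus an index. */
struct toy_ring {
    const uint32_t *indirect_addr; /* plain field, written before publishing */
    atomic_uint avail_idx;         /* publication point shared by both sides */
};

/* Driver side: store the table address, fence, then advance the index. */
static void toy_publish(struct toy_ring *r, const uint32_t *table)
{
    r->indirect_addr = table;                   /* store pages in indirect->addr */
    atomic_thread_fence(memory_order_release);  /* stand-in for smp_wmb() */
    atomic_fetch_add_explicit(&r->avail_idx, 1, memory_order_relaxed);
}

/*
 * Consumer side: read the index, fence, then read the address and the data.
 * Returns 1 and fills *out when a new entry was published, 0 otherwise.
 * The fence after the index load already orders the later loads, so no
 * separate dependency barrier is needed before dereferencing the address.
 */
static int toy_consume(struct toy_ring *r, unsigned int last_seen, uint32_t *out)
{
    unsigned int idx = atomic_load_explicit(&r->avail_idx, memory_order_relaxed);

    if (idx == last_seen)
        return 0;                               /* nothing new was published */
    atomic_thread_fence(memory_order_acquire);  /* stand-in for smp_rmb() */
    const uint32_t *table = r->indirect_addr;   /* read indirect->addr */
    *out = table[0];                            /* read from indirect->addr */
    return 1;
}
```

A caller would hand `toy_publish()` a table that stays valid and unmodified after publication, matching the expectation that the driver does not write to indirect->addr once the avail idx has been increased.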
  8. 11 Jun, 2020 3 commits
  9. 09 Jun, 2020 1 commit
  10. 07 Jun, 2020 1 commit
  11. 05 Jun, 2020 3 commits
  12. 02 Jun, 2020 1 commit
    • virtio: force spec specified alignment on types · a865e420
      Committed by Michael S. Tsirkin
      The ring element addresses are passed between components with different
      alignment assumptions. Thus, if guest/userspace selects a pointer and
      the host then gets and dereferences it, we might need to decrease the
      compiler-selected alignment to prevent the compiler on the host from
      assuming the pointer is aligned.
      
      This actually triggers on ARM with -mabi=apcs-gnu - which is a
      deprecated configuration, but it seems safer to handle this
      generally.
      
      Note that userspace that allocates the memory is actually OK and does
      not need to be fixed, but userspace that gets it from the guest or another
      process does need to be fixed. The latter doesn't generally talk to the
      kernel, so while it might be buggy it's not talking to the kernel in the
      buggy way - it's just using the header in the buggy way - so fixing the
      header and asking userspace to recompile is the best we can do.
      
      I verified that the produced kernel binary on x86 is exactly identical
      before and after the change.
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Acked-by: Jason Wang <jasowang@redhat.com>
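The idea of decreasing the compiler-selected alignment in the commit above can be illustrated with a small sketch. It assumes GCC or Clang, where an `aligned` attribute on a typedef may lower as well as raise alignment. The names `toy_u64_a4`, `struct toy_desc` and `toy_desc_alignment()` are hypothetical and are not the types introduced by the actual header change.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical type: a 64-bit value that the compiler may only assume to be
 * 4-byte aligned. On a typedef, GCC and Clang let aligned() lower alignment.
 */
typedef uint64_t __attribute__((aligned(4))) toy_u64_a4;

/* Hypothetical descriptor layout built from the reduced-alignment type. */
struct toy_desc {
    toy_u64_a4 addr;
    uint32_t len;
    uint16_t flags;
    uint16_t next;
};

/* The layout is unchanged; only the alignment the compiler may assume drops. */
_Static_assert(sizeof(struct toy_desc) == 16, "descriptor size must not change");
_Static_assert(offsetof(struct toy_desc, len) == 8, "field offsets must not change");
_Static_assert(_Alignof(struct toy_desc) == 4, "struct alignment follows its members");

/* Report the alignment the compiler assumes for a pointer to the struct. */
static size_t toy_desc_alignment(void)
{
    return _Alignof(struct toy_desc);
}
```

Because size and field offsets stay the same, a change of this kind can leave generated code untouched on architectures where the lower alignment makes no difference, which is consistent with the identical x86 binary reported in the commit message.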
  13. 15 May, 2020 1 commit
  14. 02 Apr, 2020 2 commits
  15. 05 Dec, 2019 1 commit
  16. 15 Sep, 2019 1 commit
  17. 12 Sep, 2019 2 commits
  18. 04 Sep, 2019 2 commits
  19. 19 Jun, 2019 1 commit
  20. 09 Jun, 2019 1 commit
  21. 06 Jun, 2019 6 commits
    • vhost: access vq metadata through kernel virtual address · 7f466032
      Committed by Jason Wang
      It was noticed that the copy_to/from_user() friends that were used to
      access virtqueue metadata tend to be very expensive for dataplane
      implementations like vhost, since they involve lots of software checks,
      speculation barriers and hardware feature toggling (e.g. SMAP). The
      extra cost is more obvious when transferring small packets, since
      the time spent on metadata access becomes more significant.
      
      This patch tries to eliminate those overheads by accessing the metadata
      through a direct mapping of those pages. Invalidation callbacks are
      implemented for co-operation with general VM management (swap, KSM,
      THP or NUMA balancing). We will try to get the direct mapping of vq
      metadata before each round of packet processing if it doesn't
      exist. If we fail, we will simply fall back to the copy_to/from_user()
      friends.
      
      The invalidation and the direct mapping access are synchronized through
      a spinlock and RCU. All metadata access through the direct map is
      protected by RCU, and setup or invalidation is done under the
      spinlock.
      
      This method might not work for high mem pages, which require a
      temporary mapping, so we just fall back to normal
      copy_to/from_user(). It also may not work for archs that have a
      virtually tagged cache, since extra cache flushing is needed to
      eliminate the alias, which would result in complex logic and bad
      performance. For those archs, this patch simply goes for the
      copy_to/from_user() friends. This is done by ruling out the kernel
      mapping code through ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE.
      
      Note that this is only done when device IOTLB is not enabled. We
      could use similar method to optimize IOTLB in the future.
      
      Tests show at most about 23% improvement on TX PPS when using
      virtio-user + vhost_net + xdp1 + TAP on 2.6GHz Broadwell:
      
              SMAP on | SMAP off
      Before: 5.2Mpps | 7.1Mpps
      After:  6.4Mpps | 8.2Mpps
      
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: James Bottomley <James.Bottomley@hansenpartnership.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: David Miller <davem@davemloft.net>
      Cc: Jerome Glisse <jglisse@redhat.com>
      Cc: linux-mm@kvack.org
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linux-parisc@vger.kernel.org
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
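The fast-path and fallback structure described in the commit above can be sketched roughly as follows. This is a single-threaded userspace illustration only: `struct toy_vq`, `toy_copy_from_user()`, `toy_get_avail_idx()` and `toy_invalidate()` are hypothetical names, `memcpy()` stands in for the real user-copy routine, and the RCU and spinlock synchronization mentioned in the commit message is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical virtqueue state: a cached direct mapping plus a slow-path address. */
struct toy_vq {
    const uint16_t *avail_idx_map;  /* direct mapping, NULL when absent or invalidated */
    const uint16_t *avail_idx_user; /* stand-in for the userspace address */
    unsigned int slow_reads;        /* counts how often the fallback path ran */
};

/* Stand-in for the checked, more expensive user-copy path. */
static int toy_copy_from_user(void *dst, const void *src, size_t len)
{
    memcpy(dst, src, len);
    return 0;
}

/* Read the index through the direct mapping when it exists, else fall back. */
static int toy_get_avail_idx(struct toy_vq *vq, uint16_t *idx)
{
    const uint16_t *map = vq->avail_idx_map;

    if (map) {
        *idx = *map;    /* fast path: plain load through the mapping */
        return 0;
    }
    vq->slow_reads++;   /* slow path: generic copy */
    return toy_copy_from_user(idx, vq->avail_idx_user, sizeof(*idx));
}

/* Invalidation callback stand-in: drop the mapping so readers fall back. */
static void toy_invalidate(struct toy_vq *vq)
{
    vq->avail_idx_map = NULL;
}
```

In this sketch an invalidation never makes a read fail; it only routes later reads through the slower copy until a mapping is set up again.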
    • vhost: factor out setting vring addr and num · feebcaea
      Committed by Jason Wang
      Factor out the vring address and num setting, which needs special care
      when accelerating vq metadata access.
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    • vhost: introduce helpers to get the size of metadata area · 4942e825
      Committed by Jason Wang
      This avoids code duplication, since the helpers will also be used by kernel VA prefetching.
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    • vhost: rename vq_iotlb_prefetch() to vq_meta_prefetch() · 9b5e830b
      Committed by Jason Wang
      Rename the function to be more accurate, since it actually tries to
      prefetch the vq metadata address in the IOTLB. It will also be used by a
      following patch to prefetch metadata virtual addresses.
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    • vhost: fine grain userspace memory accessors · 7b5d753e
      Committed by Jason Wang
      This is used to hide the metadata address from virtqueue helpers. This
      will allow implementing vmap-based fast access to metadata.
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    • vhost: generalize adding used elem · 1ab5d138
      Committed by Jason Wang
      Use one generic vhost_copy_to_user() instead of two dedicated
      accessors. This will simplify the conversion to fine grain
      accessors. About a 2% improvement in PPS was seen during the virtio-user
      txonly test.
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
  22. 27 May, 2019 1 commit
  23. 15 May, 2019 1 commit
    • mm/gup: change GUP fast to use flags rather than a write 'bool' · 73b0140b
      Committed by Ira Weiny
      To facilitate additional options to get_user_pages_fast() change the
      singular write parameter to be gup_flags.
      
      This patch does not change any functionality.  New functionality will
      follow in subsequent patches.
      
      Some of the get_user_pages_fast() call sites were unchanged because they
      already passed FOLL_WRITE or 0 for the write parameter.
      
      NOTE: It was suggested to change the ordering of the get_user_pages_fast()
      arguments to ensure that callers were converted.  This breaks the current
      GUP call site convention of having the returned pages be the final
      parameter.  So the suggestion was rejected.
      
      Link: http://lkml.kernel.org/r/20190328084422.29911-4-ira.weiny@intel.com
      Link: http://lkml.kernel.org/r/20190317183438.2057-4-ira.weiny@intel.com
      Signed-off-by: Ira Weiny <ira.weiny@intel.com>
      Reviewed-by: Mike Marshall <hubcap@omnibond.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Hogan <jhogan@kernel.org>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: John Hubbard <jhubbard@nvidia.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
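The interface change in the commit above, from a single write 'bool' to a flags word, can be sketched with toy functions. Everything here is hypothetical and for illustration only: `TOY_FOLL_WRITE`, `TOY_FOLL_EXTRA`, `toy_gup_fast_old()`, `toy_gup_fast_new()` and `toy_write_to_flags()` are invented names with values chosen for this sketch, and the functions only return the effective request bits instead of pinning any pages.

```c
#include <assert.h>

/* Hypothetical flag bits; the values are chosen for this sketch only. */
#define TOY_FOLL_WRITE 0x01u
#define TOY_FOLL_EXTRA 0x02u /* stands for any option added by a later patch */

/* Old shape: a single 'write' parameter can only say "write" or "not write". */
static unsigned int toy_gup_fast_old(int write)
{
    return write ? TOY_FOLL_WRITE : 0u;
}

/* New shape: the caller passes flag bits, so further options fit in the same slot. */
static unsigned int toy_gup_fast_new(unsigned int gup_flags)
{
    return gup_flags;
}

/* Call-site conversion that preserves behaviour: write != 0 becomes the write flag. */
static unsigned int toy_write_to_flags(int write)
{
    return write ? TOY_FOLL_WRITE : 0u;
}
```

A call site that already passed the write flag constant or 0 needs no edit under this scheme, which matches the note in the commit message about unchanged call sites.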
  24. 11 Apr, 2019 1 commit