1. 01 9月, 2020 3 次提交
  2. 10 6月, 2020 1 次提交
  3. 05 6月, 2020 1 次提交
  4. 26 5月, 2020 1 次提交
  5. 22 5月, 2020 2 次提交
  6. 05 5月, 2020 2 次提交
  7. 15 4月, 2020 1 次提交
  8. 01 2月, 2020 2 次提交
    • J
      mm, tree-wide: rename put_user_page*() to unpin_user_page*() · f1f6a7dd
      John Hubbard 提交于
      In order to provide a clearer, more symmetric API for pinning and
      unpinning DMA pages.  This way, pin_user_pages*() calls match up with
      unpin_user_pages*() calls, and the API is a lot closer to being
      self-explanatory.
      
      Link: http://lkml.kernel.org/r/20200107224558.2362728-23-jhubbard@nvidia.comSigned-off-by: NJohn Hubbard <jhubbard@nvidia.com>
      Reviewed-by: NJan Kara <jack@suse.cz>
      Cc: Alex Williamson <alex.williamson@redhat.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Björn Töpel <bjorn.topel@intel.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl>
      Cc: Ira Weiny <ira.weiny@intel.com>
      Cc: Jason Gunthorpe <jgg@mellanox.com>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Jerome Glisse <jglisse@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Kirill A. Shutemov <kirill@shutemov.name>
      Cc: Leon Romanovsky <leonro@mellanox.com>
      Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f1f6a7dd
    • J
      net/xdp: set FOLL_PIN via pin_user_pages() · fb48b474
      John Hubbard 提交于
      Convert net/xdp to use the new pin_longterm_pages() call, which sets
      FOLL_PIN.  Setting FOLL_PIN is now required for code that requires
      tracking of pinned pages.
      
      In partial anticipation of this work, the net/xdp code was already calling
      put_user_page() instead of put_page().  Therefore, in order to convert
      from the get_user_pages()/put_page() model, to the
      pin_user_pages()/put_user_page() model, the only change required here is
      to change get_user_pages() to pin_user_pages().
      
      Link: http://lkml.kernel.org/r/20200107224558.2362728-18-jhubbard@nvidia.comSigned-off-by: NJohn Hubbard <jhubbard@nvidia.com>
      Acked-by: NBjörn Töpel <bjorn.topel@intel.com>
      Cc: Alex Williamson <alex.williamson@redhat.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl>
      Cc: Ira Weiny <ira.weiny@intel.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jason Gunthorpe <jgg@mellanox.com>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Jerome Glisse <jglisse@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Kirill A. Shutemov <kirill@shutemov.name>
      Cc: Leon Romanovsky <leonro@mellanox.com>
      Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      fb48b474
  9. 16 1月, 2020 1 次提交
  10. 24 10月, 2019 1 次提交
  11. 25 9月, 2019 1 次提交
  12. 19 9月, 2019 1 次提交
  13. 31 8月, 2019 1 次提交
  14. 21 8月, 2019 1 次提交
  15. 20 8月, 2019 1 次提交
  16. 18 8月, 2019 2 次提交
    • M
      xsk: add support for need_wakeup flag in AF_XDP rings · 77cd0d7b
      Magnus Karlsson 提交于
      This commit adds support for a new flag called need_wakeup in the
      AF_XDP Tx and fill rings. When this flag is set, it means that the
      application has to explicitly wake up the kernel Rx (for the bit in
      the fill ring) or kernel Tx (for bit in the Tx ring) processing by
      issuing a syscall. Poll() can wake up both depending on the flags
      submitted and sendto() will wake up tx processing only.
      
      The main reason for introducing this new flag is to be able to
      efficiently support the case when application and driver is executing
      on the same core. Previously, the driver was just busy-spinning on the
      fill ring if it ran out of buffers in the HW and there were none on
      the fill ring. This approach works when the application is running on
      another core as it can replenish the fill ring while the driver is
      busy-spinning. Though, this is a lousy approach if both of them are
      running on the same core as the probability of the fill ring getting
      more entries when the driver is busy-spinning is zero. With this new
      feature the driver now sets the need_wakeup flag and returns to the
      application. The application can then replenish the fill queue and
      then explicitly wake up the Rx processing in the kernel using the
      syscall poll(). For Tx, the flag is only set to one if the driver has
      no outstanding Tx completion interrupts. If it has some, the flag is
      zero as it will be woken up by a completion interrupt anyway.
      
      As a nice side effect, this new flag also improves the performance of
      the case where application and driver are running on two different
      cores as it reduces the number of syscalls to the kernel. The kernel
      tells user space if it needs to be woken up by a syscall, and this
      eliminates many of the syscalls.
      
      This flag needs some simple driver support. If the driver does not
      support this, the Rx flag is always zero and the Tx flag is always
      one. This makes any application relying on this feature default to the
      old behaviour of not requiring any syscalls in the Rx path and always
      having to call sendto() in the Tx path.
      
      For backwards compatibility reasons, this feature has to be explicitly
      turned on using a new bind flag (XDP_USE_NEED_WAKEUP). I recommend
      that you always turn it on as it so far always have had a positive
      performance impact.
      
      The name and inspiration of the flag has been taken from io_uring by
      Jens Axboe. Details about this feature in io_uring can be found in
      http://kernel.dk/io_uring.pdf, section 8.3.
      Signed-off-by: NMagnus Karlsson <magnus.karlsson@intel.com>
      Acked-by: NJonathan Lemon <jonathan.lemon@gmail.com>
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      77cd0d7b
    • M
      xsk: replace ndo_xsk_async_xmit with ndo_xsk_wakeup · 9116e5e2
      Magnus Karlsson 提交于
      This commit replaces ndo_xsk_async_xmit with ndo_xsk_wakeup. This new
      ndo provides the same functionality as before but with the addition of
      a new flags field that is used to specifiy if Rx, Tx or both should be
      woken up. The previous ndo only woke up Tx, as implied by the
      name. The i40e and ixgbe drivers (which are all the supported ones)
      are updated with this new interface.
      
      This new ndo will be used by the new need_wakeup functionality of XDP
      sockets that need to be able to wake up both Rx and Tx driver
      processing.
      Signed-off-by: NMagnus Karlsson <magnus.karlsson@intel.com>
      Acked-by: NJonathan Lemon <jonathan.lemon@gmail.com>
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      9116e5e2
  17. 10 8月, 2019 1 次提交
  18. 12 7月, 2019 1 次提交
  19. 03 7月, 2019 2 次提交
  20. 12 6月, 2019 1 次提交
  21. 15 5月, 2019 1 次提交
    • I
      mm/gup: replace get_user_pages_longterm() with FOLL_LONGTERM · 932f4a63
      Ira Weiny 提交于
      Pach series "Add FOLL_LONGTERM to GUP fast and use it".
      
      HFI1, qib, and mthca, use get_user_pages_fast() due to its performance
      advantages.  These pages can be held for a significant time.  But
      get_user_pages_fast() does not protect against mapping FS DAX pages.
      
      Introduce FOLL_LONGTERM and use this flag in get_user_pages_fast() which
      retains the performance while also adding the FS DAX checks.  XDP has also
      shown interest in using this functionality.[1]
      
      In addition we change get_user_pages() to use the new FOLL_LONGTERM flag
      and remove the specialized get_user_pages_longterm call.
      
      [1] https://lkml.org/lkml/2019/3/19/939
      
      "longterm" is a relative thing and at this point is probably a misnomer.
      This is really flagging a pin which is going to be given to hardware and
      can't move.  I've thought of a couple of alternative names but I think we
      have to settle on if we are going to use FL_LAYOUT or something else to
      solve the "longterm" problem.  Then I think we can change the flag to a
      better name.
      
      Secondly, it depends on how often you are registering memory.  I have
      spoken with some RDMA users who consider MR in the performance path...
      For the overall application performance.  I don't have the numbers as the
      tests for HFI1 were done a long time ago.  But there was a significant
      advantage.  Some of which is probably due to the fact that you don't have
      to hold mmap_sem.
      
      Finally, architecturally I think it would be good for everyone to use
      *_fast.  There are patches submitted to the RDMA list which would allow
      the use of *_fast (they reworking the use of mmap_sem) and as soon as they
      are accepted I'll submit a patch to convert the RDMA core as well.  Also
      to this point others are looking to use *_fast.
      
      As an aside, Jasons pointed out in my previous submission that *_fast and
      *_unlocked look very much the same.  I agree and I think further cleanup
      will be coming.  But I'm focused on getting the final solution for DAX at
      the moment.
      
      This patch (of 7):
      
      This patch starts a series which aims to support FOLL_LONGTERM in
      get_user_pages_fast().  Some callers who would like to do a longterm (user
      controlled pin) of pages with the fast variant of GUP for performance
      purposes.
      
      Rather than have a separate get_user_pages_longterm() call, introduce
      FOLL_LONGTERM and change the longterm callers to use it.
      
      This patch does not change any functionality.  In the short term
      "longterm" or user controlled pins are unsafe for Filesystems and FS DAX
      in particular has been blocked.  However, callers of get_user_pages_fast()
      were not "protected".
      
      FOLL_LONGTERM can _only_ be supported with get_user_pages[_fast]() as it
      requires vmas to determine if DAX is in use.
      
      NOTE: In merging with the CMA changes we opt to change the
      get_user_pages() call in check_and_migrate_cma_pages() to a call of
      __get_user_pages_locked() on the newly migrated pages.  This makes the
      code read better in that we are calling __get_user_pages_locked() on the
      pages before and after a potential migration.
      
      As a side affect some of the interfaces are cleaned up but this is not the
      primary purpose of the series.
      
      In review[1] it was asked:
      
      <quote>
      > This I don't get - if you do lock down long term mappings performance
      > of the actual get_user_pages call shouldn't matter to start with.
      >
      > What do I miss?
      
      A couple of points.
      
      First "longterm" is a relative thing and at this point is probably a
      misnomer.  This is really flagging a pin which is going to be given to
      hardware and can't move.  I've thought of a couple of alternative names
      but I think we have to settle on if we are going to use FL_LAYOUT or
      something else to solve the "longterm" problem.  Then I think we can
      change the flag to a better name.
      
      Second, It depends on how often you are registering memory.  I have spoken
      with some RDMA users who consider MR in the performance path...  For the
      overall application performance.  I don't have the numbers as the tests
      for HFI1 were done a long time ago.  But there was a significant
      advantage.  Some of which is probably due to the fact that you don't have
      to hold mmap_sem.
      
      Finally, architecturally I think it would be good for everyone to use
      *_fast.  There are patches submitted to the RDMA list which would allow
      the use of *_fast (they reworking the use of mmap_sem) and as soon as they
      are accepted I'll submit a patch to convert the RDMA core as well.  Also
      to this point others are looking to use *_fast.
      
      As an asside, Jasons pointed out in my previous submission that *_fast and
      *_unlocked look very much the same.  I agree and I think further cleanup
      will be coming.  But I'm focused on getting the final solution for DAX at
      the moment.
      
      </quote>
      
      [1] https://lore.kernel.org/lkml/20190220180255.GA12020@iweiny-DESK2.sc.intel.com/T/#md6abad2569f3bf6c1f03686c8097ab6563e94965
      
      [ira.weiny@intel.com: v3]
        Link: http://lkml.kernel.org/r/20190328084422.29911-2-ira.weiny@intel.com
      Link: http://lkml.kernel.org/r/20190328084422.29911-2-ira.weiny@intel.com
      Link: http://lkml.kernel.org/r/20190317183438.2057-2-ira.weiny@intel.comSigned-off-by: NIra Weiny <ira.weiny@intel.com>
      Reviewed-by: NAndrew Morton <akpm@linux-foundation.org>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: John Hubbard <jhubbard@nvidia.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: James Hogan <jhogan@kernel.org>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Mike Marshall <hubcap@omnibond.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      932f4a63
  22. 16 3月, 2019 1 次提交
  23. 13 2月, 2019 1 次提交
  24. 12 2月, 2019 1 次提交
  25. 25 1月, 2019 1 次提交
  26. 22 1月, 2019 1 次提交
  27. 16 1月, 2019 1 次提交
  28. 08 10月, 2018 1 次提交
    • B
      xsk: proper AF_XDP socket teardown ordering · 541d7fdd
      Björn Töpel 提交于
      The AF_XDP socket struct can exist in three different, implicit
      states: setup, bound and released. Setup is prior the socket has been
      bound to a device. Bound is when the socket is active for receive and
      send. Released is when the process/userspace side of the socket is
      released, but the sock object is still lingering, e.g. when there is a
      reference to the socket in an XSKMAP after process termination.
      
      The Rx fast-path code uses the "dev" member of struct xdp_sock to
      check whether a socket is bound or relased, and the Tx code uses the
      struct xdp_umem "xsk_list" member in conjunction with "dev" to
      determine the state of a socket.
      
      However, the transition from bound to released did not tear the socket
      down in correct order.
      
      On the Rx side "dev" was cleared after synchronize_net() making the
      synchronization useless. On the Tx side, the internal queues were
      destroyed prior removing them from the "xsk_list".
      
      This commit corrects the cleanup order, and by doing so
      xdp_del_sk_umem() can be simplified and one synchronize_net() can be
      removed.
      
      Fixes: 965a9909 ("xsk: add support for bind for Rx")
      Fixes: ac98d8aa ("xsk: wire upp Tx zero-copy functions")
      Reported-by: NJesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: NBjörn Töpel <bjorn.topel@intel.com>
      Acked-by: NSong Liu <songliubraving@fb.com>
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      541d7fdd
  29. 05 10月, 2018 3 次提交
  30. 26 9月, 2018 1 次提交
  31. 01 9月, 2018 1 次提交