1. 21 Sep 2018, 5 commits
  2. 13 Sep 2018, 1 commit
  3. 11 Sep 2018, 13 commits
  4. 07 Sep 2018, 1 commit
  5. 06 Sep 2018, 4 commits
  6. 05 Sep 2018, 2 commits
  7. 23 Aug 2018, 1 commit
    • mm, oom: distinguish blockable mode for mmu notifiers · 93065ac7
      Committed by Michal Hocko
      There are several blockable mmu notifiers which might sleep in
      mmu_notifier_invalidate_range_start, and that is a problem for the
      oom_reaper because it needs to guarantee forward progress and therefore
      cannot depend on any sleepable locks.
      
      Currently we simply back off and mark an oom victim with blockable mmu
      notifiers as done after a short sleep.  That can result in selecting a new
      oom victim prematurely because the previous one still hasn't torn its
      memory down yet.
      
      We can do much better, though.  Even if mmu notifiers use sleepable locks,
      there is no reason to automatically assume those locks are held.  Moreover,
      the majority of notifiers only care about a portion of the address space,
      and there is absolutely no reason to fail when we are unmapping an
      unrelated range.  Some notifiers really do block and wait for HW, which is
      harder to handle, and in those cases we have to bail out.
      
      This patch handles the low-hanging fruit:
      __mmu_notifier_invalidate_range_start gets a blockable flag, and callbacks
      are not allowed to sleep if the flag is set to false.  This is achieved by
      using trylock instead of the sleepable lock for most callbacks and
      continuing as long as we do not block down the call chain.
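
      A minimal sketch of that pattern follows.  The structure, lock, and helper
      names (demo_dev, demo_unmap_range) are hypothetical and only illustrate how
      a driver callback can honour the blockable flag; this is not the code
      changed by the patch itself.

      #include <linux/kernel.h>
      #include <linux/errno.h>
      #include <linux/mmu_notifier.h>
      #include <linux/mutex.h>

      /* Hypothetical per-device state guarded by a sleepable lock. */
      struct demo_dev {
              struct mmu_notifier mn;
              struct mutex lock;
      };

      static void demo_unmap_range(struct demo_dev *ddev,
                                   unsigned long start, unsigned long end)
      {
              /* hypothetical helper: tear down device mappings in [start, end) */
      }

      static int demo_invalidate_range_start(struct mmu_notifier *mn,
                                             struct mm_struct *mm,
                                             unsigned long start,
                                             unsigned long end,
                                             bool blockable)
      {
              struct demo_dev *ddev = container_of(mn, struct demo_dev, mn);

              if (blockable)
                      mutex_lock(&ddev->lock);
              else if (!mutex_trylock(&ddev->lock))
                      return -EAGAIN; /* could not make progress without sleeping */

              demo_unmap_range(ddev, start, end);
              mutex_unlock(&ddev->lock);
              return 0;
      }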
      
      I think we can improve this even further, because there is a common pattern
      of doing a range lookup first and then acting on the result.  The first
      part can be done without a sleeping lock in most cases AFAICS.
      
      The oom_reaper side then simply retries if there is at least one notifier
      which couldn't make any progress in !blockable mode.  A retry loop is
      already implemented to wait for the mmap_sem, and this is basically the
      same thing.
      
      The simplest way for driver developers to test this code path is to wrap
      userspace code which uses these notifiers into a memcg and set the hard
      limit low enough to hit the oom.  This can be done, e.g., after the test
      faults in all the mmu-notifier-managed memory; then set the hard limit to
      something really small and look for a proper process teardown.
      
      [akpm@linux-foundation.org: coding style fixes]
      [akpm@linux-foundation.org: minor code simplification]
      Link: http://lkml.kernel.org/r/20180716115058.5559-1-mhocko@kernel.org
      Signed-off-by: Michal Hocko <mhocko@suse.com>
      Acked-by: Christian König <christian.koenig@amd.com> # AMD notifiers
      Acked-by: Leon Romanovsky <leonro@mellanox.com> # mlx and umem_odp
      Reported-by: David Rientjes <rientjes@google.com>
      Cc: "David (ChunMing) Zhou" <David1.Zhou@amd.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Alex Deucher <alexander.deucher@amd.com>
      Cc: David Airlie <airlied@linux.ie>
      Cc: Jani Nikula <jani.nikula@linux.intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
      Cc: Doug Ledford <dledford@redhat.com>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Mike Marciniszyn <mike.marciniszyn@intel.com>
      Cc: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Cc: Sudeep Dutt <sudeep.dutt@intel.com>
      Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
      Cc: Dimitri Sivanich <sivanich@sgi.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: "Jérôme Glisse" <jglisse@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Felix Kuehling <felix.kuehling@amd.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  8. 15 Aug 2018, 1 commit
  9. 13 Aug 2018, 1 commit
  10. 11 Aug 2018, 1 commit
  11. 08 Aug 2018, 1 commit
    • RDMA/mlx5: Fix shift overflow in mlx5_ib_create_wq · 0dfe4522
      Committed by Leon Romanovsky
      [   61.182439] UBSAN: Undefined behaviour in drivers/infiniband/hw/mlx5/qp.c:5366:34
      [   61.183673] shift exponent 4294967288 is too large for 32-bit type 'unsigned int'
      [   61.185530] CPU: 0 PID: 639 Comm: qp Not tainted 4.18.0-rc1-00037-g4aa1d69a9c60-dirty #96
      [   61.186981] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-2.fc27 04/01/2014
      [   61.188315] Call Trace:
      [   61.188661]  dump_stack+0xc7/0x13b
      [   61.190427]  ubsan_epilogue+0x9/0x49
      [   61.190899]  __ubsan_handle_shift_out_of_bounds+0x1ea/0x22f
      [   61.197040]  mlx5_ib_create_wq+0x1c99/0x1d50
      [   61.206632]  ib_uverbs_ex_create_wq+0x499/0x820
      [   61.213892]  ib_uverbs_write+0x77e/0xae0
      [   61.248018]  vfs_write+0x121/0x3b0
      [   61.249831]  ksys_write+0xa1/0x120
      [   61.254024]  do_syscall_64+0x7c/0x2a0
      [   61.256178]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      [   61.259211] RIP: 0033:0x7f54bab70e99
      [   61.262125] Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89
      [   61.268678] RSP: 002b:00007ffe1541c318 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
      [   61.271076] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f54bab70e99
      [   61.273795] RDX: 0000000000000070 RSI: 0000000020000240 RDI: 0000000000000003
      [   61.276982] RBP: 00007ffe1541c330 R08: 00000000200078e0 R09: 0000000000000002
      [   61.280035] R10: 0000000000000000 R11: 0000000000000246 R12: 00000000004005c0
      [   61.283279] R13: 00007ffe1541c420 R14: 0000000000000000 R15: 0000000000000000
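
      The splat above comes from an unvalidated, user-controlled log value being
      used as a shift exponent.  A hedged sketch of the failure mode and of the
      kind of bounds check that prevents it follows; the field and limit names
      (user_log_stride, DEMO_MIN/MAX_LOG_STRIDE) are hypothetical, not the actual
      mlx5 definitions.

      #include <linux/errno.h>
      #include <linux/types.h>

      #define DEMO_MIN_LOG_STRIDE 6   /* hypothetical lower bound */
      #define DEMO_MAX_LOG_STRIDE 13  /* hypothetical upper bound */

      static int demo_set_wq_stride(u32 user_log_stride, u32 *stride_out)
      {
              /*
               * Without this check, user_log_stride < DEMO_MIN_LOG_STRIDE makes
               * the subtraction below wrap to a huge unsigned value (e.g. -8
               * becomes 4294967288), and shifting a 32-bit type by it is
               * undefined behaviour -- exactly what UBSAN reports above.
               */
              if (user_log_stride < DEMO_MIN_LOG_STRIDE ||
                  user_log_stride > DEMO_MAX_LOG_STRIDE)
                      return -EINVAL;

              *stride_out = 1u << (user_log_stride - DEMO_MIN_LOG_STRIDE);
              return 0;
      }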
      
      Cc: <stable@vger.kernel.org> # 4.7
      Fixes: 79b20a6c ("IB/mlx5: Add receive Work Queue verbs")
      Cc: syzkaller <syzkaller@googlegroups.com>
      Reported-by: Noa Osherovich <noaos@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
  12. 03 Aug 2018, 1 commit
    • RDMA/netdev: Use priv_destructor for netdev cleanup · 9f49a5b5
      Committed by Jason Gunthorpe
      Now that the unregister_netdev flow for IPoIB no longer relies on external
      code, we can introduce the use of priv_destructor and needs_free_netdev.
      
      The rdma_netdev flow is switched to use the netdev common priv_destructor
      instead of the special free_rdma_netdev, and the IPoIB ULP is adjusted
      accordingly (a hedged sketch of the chaining pattern follows the list):
       - priv_destructor needs to switch to point to the ULP's destructor
         which will then call the rdma_ndev's in the right order
       - We need to be careful around the error unwind of register_netdev
         as it sometimes calls priv_destructor on failure
       - ULPs need to use ndo_init/uninit to ensure proper ordering
         of failures around register_netdev
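
      A minimal sketch of the destructor chaining, using hypothetical demo_ulp_*
      names rather than the actual IPoIB code: the ULP saves whatever
      priv_destructor the rdma_netdev provider installed and calls it from its
      own destructor so both run in the right order.

      #include <linux/netdevice.h>

      struct demo_ulp_priv {
              void (*rdma_ndev_destructor)(struct net_device *dev);
              /* ... ULP state ... */
      };

      static void demo_ulp_destructor(struct net_device *dev)
      {
              struct demo_ulp_priv *priv = netdev_priv(dev);

              /* ULP teardown would go here, then the provider's cleanup. */
              if (priv->rdma_ndev_destructor)
                      priv->rdma_ndev_destructor(dev);
      }

      static void demo_ulp_setup(struct net_device *dev)
      {
              struct demo_ulp_priv *priv = netdev_priv(dev);

              /* Chain behind whatever destructor the rdma_netdev provider set. */
              priv->rdma_ndev_destructor = dev->priv_destructor;
              dev->priv_destructor = demo_ulp_destructor;
              /* Let the core call free_netdev() after the destructor runs. */
              dev->needs_free_netdev = true;
      }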
      
      Switching to priv_destructor is a necessary prerequisite to using
      the rtnl new_link mechanism.
      
      The VNIC user for rdma_netdev should also be revised, but that is left for
      another patch.
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: Denis Drozdov <denisd@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
  13. 02 Aug 2018, 1 commit
    • IB/uverbs: Do not pass struct ib_device to the ioctl methods · e83f0ecd
      Committed by Jason Gunthorpe
      This does the same as the previous patch, except for the ioctl path.  The
      rules are the same, but for the ioctl methods the core code handles setting
      up the uobject.  The ib_dev is then obtained in one of the following ways
      (a minimal sketch follows the list):
      
      - Retrieve the ib_dev from the uobject->context->device. This is
        safe under ioctl as the core has already done rdma_alloc_begin_uobject
        and so CREATE calls are entirely protected by the rwsem.
      - Retrieve the ib_dev from uobject->object
      - Call ib_uverbs_get_ucontext()
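
      A hedged illustration of the first two patterns; the handler boilerplate is
      omitted because it differs per method type, uobj stands for the uobject the
      core already set up, and ib_pd is just an example object type.

      #include <rdma/ib_verbs.h>

      static int demo_method_body(struct ib_uobject *uobj)
      {
              /* CREATE-style methods: the device hangs off the ucontext. */
              struct ib_device *ib_dev = uobj->context->device;

              /* Methods acting on an existing object: ask the object itself. */
              struct ib_pd *pd = uobj->object;
              struct ib_device *obj_dev = pd->device;

              (void)ib_dev;
              (void)obj_dev;
              return 0;
      }
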
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
  14. 01 Aug 2018, 1 commit
  15. 31 Jul 2018, 4 commits
  16. 27 Jul 2018, 1 commit
  17. 26 Jul 2018, 1 commit
    • IB/uverbs: Fix locking around struct ib_uverbs_file ucontext · 22fa27fb
      Committed by Jason Gunthorpe
      We have a parallel unlocked reader and writer with ib_uverbs_get_context()
      vs everything else, and nothing guarantees this works properly.
      
      Audit and fix all of the places that access ucontext to use one of the
      following locking schemes (the first scheme is sketched after the list):
      - Call ib_uverbs_get_ucontext() under SRCU and check for failure
      - Access the ucontext through a struct ib_uobject context member
        while holding a READ or WRITE lock on the uobject.
        This value cannot be NULL and has no race.
      - Hold the ucontext_lock and check for ufile->ucontext !NULL
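
      A minimal sketch of the first scheme, assuming the ufile-based,
      ERR_PTR-returning signature this series gives ib_uverbs_get_ucontext(),
      that the private uverbs core header declaring it is visible here, and that
      the caller runs under the SRCU read side taken by the uverbs dispatch code.

      #include <linux/err.h>
      #include <rdma/ib_verbs.h>

      static int demo_use_ucontext(struct ib_uverbs_file *ufile)
      {
              struct ib_ucontext *ucontext = ib_uverbs_get_ucontext(ufile);

              /* Fails if the context was never created or was disassociated. */
              if (IS_ERR(ucontext))
                      return PTR_ERR(ucontext);

              /* ... operate on ucontext here ... */
              return 0;
      }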
      
      This also re-implements ib_uverbs_get_ucontext() in a way that is safe
      against concurrent ib_uverbs_get_context() and disassociation.
      
      As a side effect, every access to ucontext in the commands is via
      ib_uverbs_get_ucontext() with an error check, or via the uobject, so there
      is no longer any need for the core code to check ucontext on every command
      call.  These checks are also removed.
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>