1. 12 Jul, 2013 1 commit
  2. 27 Jun, 2013 1 commit
  3. 22 Jun, 2013 9 commits
  4. 05 Jun, 2013 1 commit
    • IB/qib: Fix lockdep splat in qib_alloc_lkey() · f3bdf344
      Committed by Mike Marciniszyn
      The following backtrace is reported with CONFIG_PROVE_RCU:
      
          drivers/infiniband/hw/qib/qib_keys.c:64 suspicious rcu_dereference_check() usage!
          other info that might help us debug this:
          rcu_scheduler_active = 1, debug_locks = 1
          4 locks held by kworker/0:1/56:
          #0:  (events){.+.+.+}, at: [<ffffffff8107a4f5>] process_one_work+0x165/0x4a0
          #1:  ((&wfc.work)){+.+.+.}, at: [<ffffffff8107a4f5>] process_one_work+0x165/0x4a0
          #2:  (device_mutex){+.+.+.}, at: [<ffffffffa0148dd8>] ib_register_device+0x38/0x220 [ib_core]
          #3:  (&(&dev->lk_table.lock)->rlock){......}, at: [<ffffffffa017e81c>] qib_alloc_lkey+0x3c/0x1b0 [ib_qib]
      
          stack backtrace:
          Pid: 56, comm: kworker/0:1 Not tainted 3.10.0-rc1+ #6
          Call Trace:
          [<ffffffff810c0b85>] lockdep_rcu_suspicious+0xe5/0x130
          [<ffffffffa017e8e1>] qib_alloc_lkey+0x101/0x1b0 [ib_qib]
          [<ffffffffa0184886>] qib_get_dma_mr+0xa6/0xd0 [ib_qib]
          [<ffffffffa01461aa>] ib_get_dma_mr+0x1a/0x50 [ib_core]
          [<ffffffffa01678dc>] ib_mad_port_open+0x12c/0x390 [ib_mad]
          [<ffffffff810c2c55>] ?  trace_hardirqs_on_caller+0x105/0x190
          [<ffffffffa0167b92>] ib_mad_init_device+0x52/0x110 [ib_mad]
          [<ffffffffa01917c0>] ?  sl2vl_attr_show+0x30/0x30 [ib_qib]
          [<ffffffffa0148f49>] ib_register_device+0x1a9/0x220 [ib_core]
          [<ffffffffa01b1685>] qib_register_ib_device+0x735/0xa40 [ib_qib]
          [<ffffffff8106ba98>] ? mod_timer+0x118/0x220
          [<ffffffffa017d425>] qib_init_one+0x1e5/0x400 [ib_qib]
          [<ffffffff812ce86e>] local_pci_probe+0x4e/0x90
          [<ffffffff81078118>] work_for_cpu_fn+0x18/0x30
          [<ffffffff8107a566>] process_one_work+0x1d6/0x4a0
          [<ffffffff8107a4f5>] ?  process_one_work+0x165/0x4a0
          [<ffffffff8107c9c9>] worker_thread+0x119/0x370
          [<ffffffff8107c8b0>] ?  manage_workers+0x180/0x180
          [<ffffffff8108294e>] kthread+0xee/0x100
          [<ffffffff81082860>] ?  __init_kthread_worker+0x70/0x70
          [<ffffffff815c04ac>] ret_from_fork+0x7c/0xb0
          [<ffffffff81082860>] ?  __init_kthread_worker+0x70/0x70
      
      Per Documentation/RCU/lockdep-splat.txt, the code now uses rcu_access_pointer()
      instead of rcu_dereference().
      Reported-by: Jay Fenlason <fenlason@redhat.com>
      Reviewed-by: Dean Luick <dean.luick@intel.com>
      Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
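The distinction the commit above relies on can be sketched in user space. This is a minimal model, not the driver's code: `struct lkey_table`, `TABLE_SIZE`, `slot_in_use()` and the spinlock helpers are hypothetical names, and C11 atomics stand in for the kernel's RCU primitives. The point it illustrates is that the allocator, which already holds the table lock, only tests each slot pointer against NULL and never dereferences it, which is the case rcu_access_pointer() is meant for.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define TABLE_SIZE 8                     /* hypothetical table size */

struct mregion { int id; };

struct lkey_table {
    atomic_flag lock;                    /* stand-in for the table spinlock */
    _Atomic(struct mregion *) slot[TABLE_SIZE];
};

static void lkey_table_init(struct lkey_table *t)
{
    atomic_flag_clear(&t->lock);
    for (size_t i = 0; i < TABLE_SIZE; i++)
        atomic_init(&t->slot[i], NULL);
}

static void table_lock(struct lkey_table *t)
{
    while (atomic_flag_test_and_set_explicit(&t->lock, memory_order_acquire))
        ;                                /* spin until the lock is free */
}

static void table_unlock(struct lkey_table *t)
{
    atomic_flag_clear_explicit(&t->lock, memory_order_release);
}

/* Models rcu_access_pointer(): read the pointer value only to test it.
 * The result is never dereferenced, so no read-side section is needed. */
static int slot_in_use(struct lkey_table *t, size_t i)
{
    return atomic_load_explicit(&t->slot[i], memory_order_relaxed) != NULL;
}

/* Find a free slot under the lock and publish mr there.
 * Returns the slot index, or -1 if the table is full. */
static int alloc_lkey(struct lkey_table *t, struct mregion *mr)
{
    int ret = -1;

    assert(mr != NULL);
    table_lock(t);
    for (size_t i = 0; i < TABLE_SIZE; i++) {
        if (!slot_in_use(t, i)) {
            /* Models rcu_assign_pointer(): publish with release ordering. */
            atomic_store_explicit(&t->slot[i], mr, memory_order_release);
            ret = (int)i;
            break;
        }
    }
    table_unlock(t);
    return ret;
}
```

A reader that actually follows the pointer would still need the read-side protection that rcu_dereference() checks for; only the NULL test is exempt.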
  5. 08 May, 2013 1 commit
  6. 17 Apr, 2013 1 commit
  7. 06 Apr, 2013 1 commit
  8. 23 Mar, 2013 1 commit
  9. 04 Mar, 2013 1 commit
    • fs: Limit sys_mount to only request filesystem modules. · 7f78e035
      Committed by Eric W. Biederman
      Modify the request_module to prefix the file system type with "fs-"
      and add aliases to all of the filesystems that can be built as modules
      to match.
      
      A common practice is to build all of the kernel code and leave code
      that is not commonly needed as modules, with the result that many
      users are exposed to any bug anywhere in the kernel.
      
      Looking for filesystems with an fs- prefix limits the pool of possible
      modules that can be loaded by mount to just filesystems, trivially
      making things safer at no real cost.
      
      Using aliases means user space can control the policy of which
      filesystem modules are auto-loaded by editing /etc/modprobe.d/*.conf
      with blacklist and alias directives.  This allows simple, safe,
      well-understood workarounds for known problematic software.
      
      This also addresses a rare but unfortunate problem where the filesystem
      name is not the same as its module name and module auto-loading
      would not work.  While writing this patch I saw a handful of such
      cases, the most significant being autofs, which lives in the module
      autofs4.
      
      This is relevant to user namespaces because we can reach the request
      module in get_fs_type() without having any special permissions, and
      people get uncomfortable when a user specified string (in this case
      the filesystem type) goes all of the way to request_module.
      
      After having looked at this issue I don't think there is any
      particular reason to perform any filtering or permission checks beyond
      making it clear in the module request that we want a filesystem
      module.  The common pattern in the kernel is to call request_module()
      without regard to the user's permissions.  In general all a filesystem
      module does once loaded is call register_filesystem() and go to sleep,
      which means there is not much attack surface exposed by loading a
      filesystem module unless the filesystem is mounted.  In a user
      namespace filesystems are not mounted unless .fs_flags = FS_USERNS_MOUNT,
      which most filesystems do not set today.
      Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
      Acked-by: Kees Cook <keescook@chromium.org>
      Reported-by: Kees Cook <keescook@google.com>
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
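The prefixing scheme described in the commit above can be sketched as a small user-space function. This is an illustration only: `fs_module_alias()` is a hypothetical helper, not a kernel function, and the kernel builds its request string through request_module() rather than snprintf(). It shows how a user-supplied filesystem type becomes a module request that can only match modules carrying an "fs-" alias.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write "fs-<fstype>" into buf.
 * Returns 0 on success, -1 if fstype is empty or the result does not fit. */
static int fs_module_alias(const char *fstype, char *buf, size_t len)
{
    int n;

    if (fstype == NULL || fstype[0] == '\0')
        return -1;

    n = snprintf(buf, len, "fs-%s", fstype);
    if (n < 0 || (size_t)n >= len)
        return -1;                       /* truncated: refuse the request */
    return 0;
}
```

With this shape, a request for type "autofs" asks for "fs-autofs", and it is the alias declared by the module (autofs4 in the commit's example) that makes the lookup succeed. A modprobe.d alias or blacklist entry for "fs-autofs" would then be the place to override that policy.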
  10. 28 Feb, 2013 1 commit
  11. 23 Feb, 2013 1 commit
  12. 15 Feb, 2013 1 commit
    • IB/qib: Fix QP locate/remove race · bcc9b67a
      Committed by Mike Marciniszyn
      remove_qp() can execute concurrently with a qib_lookup_qpn() on
      another CPU, which in and of itself is OK, given the RCU locking.
      
      The issue is that remove_qp() NULLs out the qp->next field so that a
      qib_lookup_qpn() might fail to find a qp if it occurs after the one
      that is being deleted.  This is a momentary issue and subsequent
      qib_lookup_qpn() calls would find the qp's since the search restarts
      from the bucket head.  At scale, the issue might cause dropped
      packets and unnecessary retransmissions.
      
      The fix just deletes the qp->next NULL assignment to prevent the
      remove_qp() from hiding qp's from qib_lookup_qpn().
      Reviewed-by: Dean Luick <dean.luick@intel.com>
      Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
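The pointer logic behind the fix above can be modelled with a plain singly linked bucket. This is a single-threaded sketch under stated assumptions: `struct qp`, `remove_qp()` and `lookup_from()` here are simplified stand-ins, not the driver's definitions, and the real code relies on RCU to delay freeing the removed entry. It shows why leaving the removed entry's next pointer intact lets a lookup that is currently positioned on that entry still reach the entries after it.

```c
#include <assert.h>
#include <stddef.h>

struct qp {
    unsigned qpn;
    struct qp *next;
};

/* Unlink target from the bucket. Returns 1 if it was found, 0 otherwise.
 * target->next is deliberately left untouched. */
static int remove_qp(struct qp **head, struct qp *target)
{
    struct qp **pp;

    assert(head != NULL);
    for (pp = head; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == target) {
            *pp = target->next;          /* bypass target in the chain */
            /* No "target->next = NULL" here: a reader paused on target
             * must still be able to walk to the rest of the bucket. */
            return 1;
        }
    }
    return 0;
}

/* Continue a lookup from wherever the reader currently stands. */
static struct qp *lookup_from(struct qp *cursor, unsigned qpn)
{
    for (; cursor != NULL; cursor = cursor->next)
        if (cursor->qpn == qpn)
            return cursor;
    return NULL;
}
```

If remove_qp() also cleared target->next, the paused reader in this model would fall off the end of the list and report the later entry as missing, which is the transient miss the commit describes.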
  13. 06 Feb, 2013 1 commit
  14. 04 Jan, 2013 1 commit
    • Drivers: infinband: remove __dev* attributes. · 1e6d9abe
      Committed by Greg Kroah-Hartman
      CONFIG_HOTPLUG is going away as an option.  As a result, the __dev*
      markings need to be removed.
      
      This change removes the use of __devinit, __devexit_p, __devinitdata,
      and __devexit from these drivers.
      
      Based on patches originally written by Bill Pemberton, but redone by me
      in order to handle some of the coding style issues better, by hand.
      
      Cc: Bill Pemberton <wfp5p@virginia.edu>
      Cc: Tom Tucker <tom@opengridcomputing.com>
      Cc: Steve Wise <swise@opengridcomputing.com>
      Cc: Roland Dreier <roland@kernel.org>
      Cc: Sean Hefty <sean.hefty@intel.com>
      Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
      Cc: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
      Cc: Christoph Raisch <raisch@de.ibm.com>
      Cc: Mike Marciniszyn <infinipath@intel.com>
      Cc: Faisal Latif <faisal.latif@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  15. 09 Oct, 2012 1 commit
    • mm: kill vma flag VM_RESERVED and mm->reserved_vm counter · 314e51b9
      Committed by Konstantin Khlebnikov
      A long time ago, in v2.4, VM_RESERVED kept the swapout process off a VMA.
      It has since lost its original meaning but still has some effects:
      
       | effect                 | alternative flags
      -+------------------------+---------------------------------------------
      1| account as reserved_vm | VM_IO
      2| skip in core dump      | VM_IO, VM_DONTDUMP
      3| do not merge or expand | VM_IO, VM_DONTEXPAND, VM_HUGETLB, VM_PFNMAP
      4| do not mlock           | VM_IO, VM_DONTEXPAND, VM_HUGETLB, VM_PFNMAP
      
      This patch removes the reserved_vm counter from mm_struct.  Nobody seems
      to care about it: it is not exported to userspace directly, it only
      reduces the total_vm shown in proc.
      
      Thus VM_RESERVED can be replaced with VM_IO or the pair VM_DONTEXPAND | VM_DONTDUMP.

      remap_pfn_range() and io_remap_pfn_range() set VM_IO | VM_DONTEXPAND | VM_DONTDUMP.
      remap_vmalloc_range() sets VM_DONTEXPAND | VM_DONTDUMP.
      
      [akpm@linux-foundation.org: drivers/vfio/pci/vfio_pci.c fixup]
      Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Carsten Otte <cotte@de.ibm.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Morris <james.l.morris@oracle.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Kentaro Takeda <takedakn@nttdata.co.jp>
      Cc: Matt Helsley <matthltc@us.ibm.com>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: Venkatesh Pallipadi <venki@google.com>
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  16. 02 Oct, 2012 1 commit
  17. 01 Oct, 2012 1 commit
  18. 21 Sep, 2012 1 commit
  19. 15 Sep, 2012 1 commit
  20. 08 Sep, 2012 1 commit
  21. 24 Aug, 2012 1 commit
  22. 16 Aug, 2012 2 commits
  23. 30 Jul, 2012 1 commit
  24. 20 Jul, 2012 4 commits
  25. 18 Jul, 2012 1 commit
  26. 11 Jul, 2012 1 commit
  27. 09 Jul, 2012 2 commits
    • IB/qib: RCU locking for MR validation · 8aac4cc3
      Committed by Mike Marciniszyn
      Profiling indicates that MR validation locking is expensive.  The MR
      table is largely read-only and is a suitable candidate for RCU locking.
      
      The patch uses RCU locking during validation to eliminate one
      lock/unlock during that validation.
      Reviewed-by: Mike Heinz <michael.william.heinz@intel.com>
      Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
    • IB/qib: Avoid returning EBUSY from MR deregister · 6a82649f
      Committed by Mike Marciniszyn
      A timing issue can occur where qib_mr_dereg can return -EBUSY if the
      MR use count is not zero.
      
      This can occur if the MR is de-registered while RDMA read response
      packets are being progressed from the SDMA ring.  The suspicion is
      that the peer sent an RDMA read request, which has already been copied
      across to the peer.  The peer sees the completion of his request and
      then communicates to the responder that the MR is not needed any
      longer.  The responder tries to de-register the MR, catching some
      responses remaining in the SDMA ring holding the MR use count.
      
      The code now uses a get/put paradigm to track MR use counts and
      coordinates with the MR de-registration process using a completion
      when the count has reached zero.  A timeout on the delay is in place
      to catch other EBUSY issues.
      
      The reference count protocol is as follows:
      - The return to the user counts as 1
      - A reference from the lk_table or the qib_ibdev counts as 1.
      - Transient I/O operations increase/decrease as necessary
      
      A lot of code duplication has been folded into the new routines
      init_qib_mregion() and deinit_qib_mregion().  Additionally, explicit
      initialization of fields to zero is now handled by kzalloc().
      
      Also, the duplicated 'while.*num_sge' code that decrements reference
      counts has been consolidated in qib_put_ss().
      Reviewed-by: Ramkrishna Vepa <ramkrishna.vepa@intel.com>
      Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
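The get/put protocol listed in the commit above can be sketched with C11 atomics. This is a simplified user-space model, not the driver's implementation: `mr_init()`, `mr_get()`, `mr_put()`, `mr_begin_dereg()` and `mr_idle()` are hypothetical names, and a boolean flag stands in for the kernel completion that the real deregistration path waits on with a timeout. The initial count of 2 follows the stated protocol: one reference for the return to the user and one for the table.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct mregion {
    atomic_int refcount;
    atomic_bool done;                    /* stand-in for a kernel completion */
};

/* One reference for the return to the user, one for the lookup table. */
static void mr_init(struct mregion *mr)
{
    atomic_init(&mr->refcount, 2);
    atomic_init(&mr->done, false);
}

/* Transient I/O takes a reference while it uses the region. */
static void mr_get(struct mregion *mr)
{
    atomic_fetch_add(&mr->refcount, 1);
}

/* Drop a reference; the last put signals that the region is idle. */
static void mr_put(struct mregion *mr)
{
    int old = atomic_fetch_sub(&mr->refcount, 1);

    assert(old > 0);
    if (old == 1)
        atomic_store(&mr->done, true);
}

/* Deregistration drops the two long-lived references. In the real driver
 * the caller would then wait, with a timeout, for the completion. */
static void mr_begin_dereg(struct mregion *mr)
{
    mr_put(mr);                          /* table reference */
    mr_put(mr);                          /* user reference */
}

/* True once every reference, including in-flight I/O, has been dropped. */
static bool mr_idle(struct mregion *mr)
{
    return atomic_load(&mr->done);
}
```

In this model, a deregistration that races with in-flight I/O leaves the count above zero, and the region becomes idle only when that I/O issues its final mr_put(). Waiting for that point instead of failing immediately is what lets the deregister path avoid returning -EBUSY in the common case.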