1. 17 7月, 2015 2 次提交
    • L
      Merge branch 'for-linus' of git://git.kernel.dk/linux-block · 761ab766
      Linus Torvalds 提交于
      Pull block fixes from Jens Axboe:
       "A collection of fixes from the last few weeks that should go into the
        current series.  This contains:
      
         - Various fixes for the per-blkcg policy data, fixing regressions
           since 4.1.  From Arianna and Tejun
      
         - Code cleanup for bcache closure macros from me.  Really just
           flushing this out, it's been sitting in another branch for months
      
         - FIELD_SIZEOF cleanup from Maninder Singh
      
         - bio integrity oops fix from Mike
      
         - Timeout regression fix for blk-mq from Ming Lei"
      
      * 'for-linus' of git://git.kernel.dk/linux-block:
        blk-mq: set default timeout as 30 seconds
        NVMe: Reread partitions on metadata formats
        bcache: don't embed 'return' statements in closure macros
        blkcg: fix blkcg_policy_data allocation bug
        blkcg: implement all_blkcgs list
        blkcg: blkcg_css_alloc() should grab blkcg_pol_mutex while iterating blkcg_policy[]
        blkcg: allow blkcg_pol_mutex to be grabbed from cgroup [file] methods
        block/blk-cgroup.c: free per-blkcg data when freeing the blkcg
        block: use FIELD_SIZEOF to calculate size of a field
        bio integrity: do not assume bio_integrity_pool exists if bioset exists
      761ab766
    • L
      Merge tag 'jfs-4.2' of git://github.com/kleikamp/linux-shaggy · f76d94de
      Linus Torvalds 提交于
      Pull jfs fixes from David Kleikamp:
       "A couple trivial fixes and an error path fix"
      
      * tag 'jfs-4.2' of git://github.com/kleikamp/linux-shaggy:
        jfs: clean up jfs_rename and fix out of order unlock
        jfs: fix indentation on if statement
        jfs: removed a prohibited space after opening parenthesis
      f76d94de
  2. 16 7月, 2015 11 次提交
    • M
      blk-mq: set default timeout as 30 seconds · e56f698b
      Ming Lei 提交于
      It is reasonable to set default timeout of request as 30 seconds instead of
      30000 ticks, which may be 300 seconds if HZ is 100, for example, some arm64
      based systems may choose 100 HZ.
      Signed-off-by: NMing Lei <ming.lei@canonical.com>
      Fixes: c76cbbcf ("blk-mq: put blk_queue_rq_timeout together in blk_mq_init_queue()"
      Signed-off-by: NJens Axboe <axboe@fb.com>
      e56f698b
    • L
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security · 3aa20508
      Linus Torvalds 提交于
      Pull TPM bugfixes from James Morris.
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
        tpm, tpm_crb: fail when TPM2 ACPI table contents look corrupted
        tpm: Fix initialization of the cdev
      3aa20508
    • L
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma · 9090fdb9
      Linus Torvalds 提交于
      Pull rdma fixes from Doug Ledford:
       "Mainly fix-ups for the various 4.2 items"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (24 commits)
        IB/core: Destroy ocrdma_dev_id IDR on module exit
        IB/core: Destroy multcast_idr on module exit
        IB/mlx4: Optimize do_slave_init
        IB/mlx4: Fix memory leak in do_slave_init
        IB/mlx4: Optimize freeing of items on error unwind
        IB/mlx4: Fix use of flow-counters for process_mad
        IB/ipath: Convert use of __constant_<foo> to <foo>
        IB/ipoib: Set MTU to max allowed by mode when mode changes
        IB/ipoib: Scatter-Gather support in connected mode
        IB/ucm: Fix bitmap wrap when devnum > IB_UCM_MAX_DEVICES
        IB/ipoib: Prevent lockdep warning in __ipoib_ib_dev_flush
        IB/ucma: Fix lockdep warning in ucma_lock_files
        rds: rds_ib_device.refcount overflow
        RDMA/nes: Fix for incorrect recording of the MAC address
        RDMA/nes: Fix for resolving the neigh
        RDMA/core: Fixes for port mapper client registration
        IB/IPoIB: Fix bad error flow in ipoib_add_port()
        IB/mlx4: Do not attemp to report HCA clock offset on VFs
        IB/cm: Do not queue work to a device that's going away
        IB/srp: Avoid using uninitialized variable
        ...
      9090fdb9
    • K
      NVMe: Reread partitions on metadata formats · 7bee6074
      Keith Busch 提交于
      This patch has the driver automatically reread partitions if a namespace
      has a separate metadata format. Previously revalidating a disk was
      sufficient to get the correct capacity set on such formatted drives,
      but partitions that may exist would not have been surfaced.
      Reported-by: NPaul Grabinar <paul.grabinar@ranbarg.com>
      Signed-off-by: NKeith Busch <keith.busch@intel.com>
      Cc: Matthew Wilcox <willy@linux.intel.com>
      Tested-by: NPaul Grabinar <paul.grabinar@ranbarg.com>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      7bee6074
    • L
      Merge tag 'locks-v4.2-1' of git://git.samba.org/jlayton/linux · 16ff49a0
      Linus Torvalds 提交于
      Pull file locking updates from Jeff Layton:
       "I had thought that I was going to get away without a pull request this
        cycle.  There was a NFSv4 file locking problem that cropped up that I
        tried to fix in the NFSv4 code alone, but that fix has turned out to
        be problematic.  These patches fix this in the correct way.
      
        Note that this touches some NFSv4 code as well.  Ordinarily I'd wait
        for Trond to ACK this, but he's on holiday right now and the bug is
        rather nasty.  So I suggest we merge this and if he raises issues with
        it we can sort it out when he gets back"
      Acked-by: NBruce Fields <bfields@fieldses.org>
      Acked-by: NDan Williams <dan.j.williams@intel.com>
       [ +1 to this series fixing a 100% reproducible slab corruption +
         general protection fault in my nfs-root test environment. - Dan ]
      Acked-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
      
      * tag 'locks-v4.2-1' of git://git.samba.org/jlayton/linux:
        locks: inline posix_lock_file_wait and flock_lock_file_wait
        nfs4: have do_vfs_lock take an inode pointer
        locks: new helpers - flock_lock_inode_wait and posix_lock_inode_wait
        locks: have flock_lock_file take an inode pointer instead of a filp
        Revert "nfs: take extra reference to fl->fl_file when running a LOCKU operation"
      16ff49a0
    • L
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · df14a68d
      Linus Torvalds 提交于
      Pull KVM fixes from Paolo Bonzini:
      
       - Fix FPU refactoring ("kvm: x86: fix load xsave feature warning")
      
       - Fix eager FPU mode (Cc stable)
      
       - AMD bits of MTRR virtualization
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        kvm: x86: fix load xsave feature warning
        KVM: x86: apply guest MTRR virtualization on host reserved pages
        KVM: SVM: Sync g_pat with guest-written PAT value
        KVM: SVM: use NPT page attributes
        KVM: count number of assigned devices
        KVM: VMX: fix vmwrite to invalid VMCS
        KVM: x86: reintroduce kvm_is_mmio_pfn
        x86: hyperv: add CPUID bit for crash handlers
      df14a68d
    • L
      Merge tag 'arc-v4.2-rc3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc · bec33cd2
      Linus Torvalds 提交于
      Pull ARC fixes from Vineet Gupta:
       - Makefile changes (top-level+ARC) reinstates -O3 builds (regression
         since 3.16)
       - IDU intc related fixes, IRQ affinity
       - patch to make bitops safer for ARC
       - perf fix from Alexey to remove signed PC braino
       - Futex backend gets llock/scond support
      
      * tag 'arc-v4.2-rc3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
        ARCv2: support HS38 releases
        ARC: make sure instruction_pointer() returns unsigned value
        ARC: slightly refactor macros for boot logging
        ARC: Add llock/scond to futex backend
        arc:irqchip: prepare for drivers/irqchip/irqchip.h removal
        ARC: Make ARC bitops "safer" (add anti-optimization)
        ARCv2: [axs103] bump CPU frequency from 75 to 90 MHZ
        ARCv2: intc: IDU: Fix potential race in installing a chained IRQ handler
        ARCv2: intc: IDU: support irq affinity
        ARC: fix unused var wanring
        ARC: Don't memzero twice in dma_alloc_coherent for __GFP_ZERO
        ARC: Override toplevel default -O2 with -O3
        kbuild: Allow arch Makefiles to override {cpp,ld,c}flags
        ARCv2: guard SLC DMA ops with spinlock
        ARC: Kconfig: better way to disable ARC_HAS_LLSC for ARC_CPU_750D
      bec33cd2
    • L
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · 9c69481e
      Linus Torvalds 提交于
      Pull s390 fixes from Martin Schwidefsky:
       "One improvement for the zcrypt driver, the quality attribute for the
        hwrng device has been missing.  Without it the kernel entropy seeding
        will not happen automatically.
      
        And six bug fixes, the most important one is the fix for the vector
        register corruption due to machine checks"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
        s390/nmi: fix vector register corruption
        s390/process: fix sfpc inline assembly
        s390/dasd: fix kernel panic when alias is set offline
        s390/sclp: clear upper register halves in _sclp_print_early
        s390/oprofile: fix compile error
        s390/sclp: fix compile error
        s390/zcrypt: enable s390 hwrng to seed kernel entropy
      9c69481e
    • D
      jfs: clean up jfs_rename and fix out of order unlock · 26456955
      Dave Kleikamp 提交于
      The end of jfs_rename(), which is also used by the error paths,
      included a call to IWRITE_UNLOCK(new_ip) after labels out1, out2
      and out3. If we come in through these labels, IWRITE_LOCK() has not
      been called yet.
      
      In moving that call to the correct spot, I also moved some
      exceptional truncate code earlier as well, since the early error
      paths don't need to deal with it, and I renamed out4: to out_tx: so
      a future patch by Jan Kara doesn't need to deal with renumbering or
      confusing out-of-order labels.
      Signed-off-by: NDave Kleikamp <dave.kleikamp@oracle.com>
      26456955
    • L
      Merge tag 'module-final-v4.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux · 97d6e2b6
      Linus Torvalds 提交于
      Pull final init.h/module.h code relocation from Paul Gortmaker:
       "With the release of 4.2-rc2 done, we should not be seeing any new code
        added that gets upset by this small code move, and we've banked yet
        another complete week of testing with this move in place on top of
        4.2-rc1 via linux-next to ensure that remained true.
      
        Given that, I'd like to put it in now so that people formulating new
        work for 4.3-rc1 will be exposed to the ever so slightly stricter (but
        sensible) requirements wrt.  whether they are needing init.h vs.
        module.h macros, even if they are not using linux-next.
      
        The diffstat of the move is slightly asymmetrical due to needing to
        leave behind a couple #ifdef in the old location and add the same ones
        to the new location, but other than that, it is a 1:1 move, complete
        with the module_init/exit trailing semicolon that we can't fix.  That
        is, until/unless someone does a tree-wide sed fix of all the
        approximately 800 currently in tree users relying on it"
      
      * tag 'module-final-v4.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux:
        module: relocate module_init from init.h to module.h
      97d6e2b6
    • L
      Merge tag 'trace-v4.2-rc1-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace · 75580097
      Linus Torvalds 提交于
      Pull tracing fix from Steven Rostedt:
       "Fengguang Wu discovered a crash that happened to be because of the
        branch tracer (traces unlikely and likely branches) when enabled with
        certain debug options.
      
        What happened was that various debug options like lockdep and
        DEBUG_PREEMPT can cause parts of the branch tracer to recurse outside
        its recursion protection.  In fact, part of its recursion protection
        used these features that caused the lockup.  This cleans up the code a
        little and makes the recursion protection a bit more robust"
      
      * tag 'trace-v4.2-rc1-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
        tracing: Have branch tracer use recursive field of task struct
      75580097
  3. 15 7月, 2015 25 次提交
    • J
    • J
      IB/core: Destroy ocrdma_dev_id IDR on module exit · d8b2ba7c
      Johannes Thumshirn 提交于
      Destroy ocrdma_dev_id IDR on module exit, reclaiming the allocated memory.
      
      This was detected by the following semantic patch (written by Luis Rodriguez
      <mcgrof@suse.com>)
      <SmPL>
      @ defines_module_init @
      declarer name module_init, module_exit;
      declarer name DEFINE_IDR;
      identifier init;
      @@
      
      module_init(init);
      
      @ defines_module_exit @
      identifier exit;
      @@
      
      module_exit(exit);
      
      @ declares_idr depends on defines_module_init && defines_module_exit @
      identifier idr;
      @@
      
      DEFINE_IDR(idr);
      
      @ on_exit_calls_destroy depends on declares_idr && defines_module_exit @
      identifier declares_idr.idr, defines_module_exit.exit;
      @@
      
      exit(void)
      {
       ...
       idr_destroy(&idr);
       ...
      }
      
      @ missing_module_idr_destroy depends on declares_idr && defines_module_exit && !on_exit_calls_destroy @
      identifier declares_idr.idr, defines_module_exit.exit;
      @@
      
      exit(void)
      {
       ...
       +idr_destroy(&idr);
       }
      
      </SmPL>
      Signed-off-by: NJohannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      d8b2ba7c
    • J
      IB/core: Destroy multcast_idr on module exit · 45d25420
      Johannes Thumshirn 提交于
      Destroy multcast_idr on module exit, reclaiming the allocated memory.
      
      This was detected by the following semantic patch (written by Luis Rodriguez
      <mcgrof@suse.com>)
      <SmPL>
      @ defines_module_init @
      declarer name module_init, module_exit;
      declarer name DEFINE_IDR;
      identifier init;
      @@
      
      module_init(init);
      
      @ defines_module_exit @
      identifier exit;
      @@
      
      module_exit(exit);
      
      @ declares_idr depends on defines_module_init && defines_module_exit @
      identifier idr;
      @@
      
      DEFINE_IDR(idr);
      
      @ on_exit_calls_destroy depends on declares_idr && defines_module_exit @
      identifier declares_idr.idr, defines_module_exit.exit;
      @@
      
      exit(void)
      {
       ...
       idr_destroy(&idr);
       ...
      }
      
      @ missing_module_idr_destroy depends on declares_idr && defines_module_exit && !on_exit_calls_destroy @
      identifier declares_idr.idr, defines_module_exit.exit;
      @@
      
      exit(void)
      {
       ...
       +idr_destroy(&idr);
      }
      
      </SmPL>
      Signed-off-by: NJohannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      45d25420
    • D
      IB/mlx4: Optimize do_slave_init · d9a047ae
      Doug Ledford 提交于
      There is little chance our memory allocation will fail, so we can
      combine initializing the work structs with allocating them instead of
      looping through all of them once to allocate and again to initialize.
      Then when we need to actually find out if our device is up or in the
      process of going down, have all of our work structs batched up, take the
      spin_lock once and only once, and do all of the batch under the one
      spin_lock invocation instead of incurring all of the locked memory cycles
      we would otherwise incur to take/release the spin_lock over and over
      again.
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      d9a047ae
    • D
      IB/mlx4: Fix memory leak in do_slave_init · 9bbf282d
      Doug Ledford 提交于
      We create a number of work structs to be queued up to a workqueue, and
      on completion of the workqueue handler, the workqueue handler frees the
      allocated memory.  If, however, we don't queue the work struct because
      the device is going down, then we need to free the memory ourselves.
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      9bbf282d
    • M
      IB/mlx4: Optimize freeing of items on error unwind · a39a98ff
      Maninder Singh 提交于
      On failure, we loop through all possible pointers and test them before
      calling kfree.  But really, why even attempt to free items we didn't
      allocate when we can easily loop through exactly and only the devices
      for which the original memory allocation succeeded and free just those.
      Signed-off-by: NManinder Singh <maninder1.s@samsung.com>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      a39a98ff
    • O
      IB/mlx4: Fix use of flow-counters for process_mad · 43bfb972
      Or Gerlitz 提交于
      For IB links, reading HCA flow counters through iboe_process_mad() should
      be used when mlx4_ib_process_mad() is invoked only for VFs PMA queries and
      exactly nothing else.
      
      Fixes: 7193a141 ('IB/mlx4: Set VF to read from QP counters')
      Reported-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      43bfb972
    • V
      IB/ipath: Convert use of __constant_<foo> to <foo> · cb1ff431
      Vaishali Thakkar 提交于
      In little endian cases, the macros be16_to_cpu and cpu_to_be64
      unfolds to __swab{16,64} which provides special case for constants.
      In big endian cases, __constant_be16_to_cpu and be16_to_cpu
      expand directly to the same expression. The same applies for
      __constant_cpu_to_be64 and cpu_to_be64.
      
      So, replace __constant_be16_to_cpu with be16_to_cpu and
      __constant_cpu_to_be64 with cpu_to_be64, with the goal of getting
      rid of the definition of __constant_be16_to_cpu and
      __constant_cpu_to_be64 completely.
      Signed-off-by: NVaishali Thakkar <vthakkar1994@gmail.com>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      cb1ff431
    • E
      IB/ipoib: Set MTU to max allowed by mode when mode changes · edcd2a74
      Erez Shitrit 提交于
      When switching between modes (datagram / connected) change the MTU
      accordingly.
      datagram mode up to 4K, connected mode up to (64K - 0x10).
      Signed-off-by: NELi Cohen <eli@mellanox.com>
      Signed-off-by: NErez Shitrit <erezsh@mellanox.com>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      edcd2a74
    • Y
      IB/ipoib: Scatter-Gather support in connected mode · c4268778
      Yuval Shaia 提交于
      By default, IPoIB-CM driver uses 64k MTU. Larger MTU gives better
      performance.
      This MTU plus overhead puts the memory allocation for IP based packets at
      32 4k pages (order 5), which have to be contiguous.
      When the system memory under pressure, it was observed that allocating 128k
      contiguous physical memory is difficult and causes serious errors (such as
      system becomes unusable).
      
      This enhancement resolve the issue by removing the physically contiguous
      memory requirement using Scatter/Gather feature that exists in Linux stack.
      
      With this fix Scatter-Gather will be supported also in connected mode.
      
      This change reverts some of the change made in commit e112373f
      ("IPoIB/cm: Reduce connected mode TX object size").
      
      The ability to use SG in IPoIB CM is possible because the coupling
      between NETIF_F_SG and NETIF_F_CSUM was removed in commit
      ec5f0615 ("net: Kill link between CSUM and SG features.")
      Signed-off-by: NYuval Shaia <yuval.shaia@oracle.com>
      Acked-by: NChristian Marie <christian@ponies.io>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      c4268778
    • C
      IB/ucm: Fix bitmap wrap when devnum > IB_UCM_MAX_DEVICES · 59d40dd9
      Carol L Soto 提交于
      ib_ucm_release_dev clears the wrong bit if devnum is greater
      than IB_UCM_MAX_DEVICES.
      Signed-off-by: NCarol L Soto <clsoto@linux.vnet.ibm.com>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      59d40dd9
    • H
      IB/ipoib: Prevent lockdep warning in __ipoib_ib_dev_flush · 8b7cce0d
      Haggai Eran 提交于
      __ipoib_ib_dev_flush calls itself recursively on child devices, and lockdep
      complains about locking vlan_rwsem twice (see below). Use down_read_nested
      instead of down_read to prevent the warning.
      
       =============================================
       [ INFO: possible recursive locking detected ]
       4.1.0-rc4+ #36 Tainted: G           O
       ---------------------------------------------
       kworker/u20:2/261 is trying to acquire lock:
        (&priv->vlan_rwsem){.+.+..}, at: [<ffffffffa0791e2a>] __ipoib_ib_dev_flush+0x3a/0x2b0 [ib_ipoib]
      
       but task is already holding lock:
        (&priv->vlan_rwsem){.+.+..}, at: [<ffffffffa0791e2a>] __ipoib_ib_dev_flush+0x3a/0x2b0 [ib_ipoib]
      
       other info that might help us debug this:
        Possible unsafe locking scenario:
      
              CPU0
              ----
         lock(&priv->vlan_rwsem);
         lock(&priv->vlan_rwsem);
      
        *** DEADLOCK ***
      
        May be due to missing lock nesting notation
      
       3 locks held by kworker/u20:2/261:
        #0:  ("%s""ipoib_flush"){.+.+..}, at: [<ffffffff810827cc>] process_one_work+0x15c/0x760
        #1:  ((&priv->flush_heavy)){+.+...}, at: [<ffffffff810827cc>] process_one_work+0x15c/0x760
        #2:  (&priv->vlan_rwsem){.+.+..}, at: [<ffffffffa0791e2a>] __ipoib_ib_dev_flush+0x3a/0x2b0 [ib_ipoib]
      
       stack backtrace:
       CPU: 3 PID: 261 Comm: kworker/u20:2 Tainted: G           O    4.1.0-rc4+ #36
       Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
       Workqueue: ipoib_flush ipoib_ib_dev_flush_heavy [ib_ipoib]
        ffff8801c6c54790 ffff8801c9927af8 ffffffff81665238 0000000000000001
        ffffffff825b5b30 ffff8801c9927bd8 ffffffff810bba51 ffff880100000000
        ffffffff00000001 ffff880100000001 ffff8801c6c55428 ffff8801c6c54790
       Call Trace:
        [<ffffffff81665238>] dump_stack+0x4f/0x6f
        [<ffffffff810bba51>] __lock_acquire+0x741/0x1820
        [<ffffffff810bcbf8>] lock_acquire+0xc8/0x240
        [<ffffffffa0791e2a>] ? __ipoib_ib_dev_flush+0x3a/0x2b0 [ib_ipoib]
        [<ffffffff81669d2c>] down_read+0x4c/0x70
        [<ffffffffa0791e2a>] ? __ipoib_ib_dev_flush+0x3a/0x2b0 [ib_ipoib]
        [<ffffffffa0791e2a>] __ipoib_ib_dev_flush+0x3a/0x2b0 [ib_ipoib]
        [<ffffffffa0791e4a>] __ipoib_ib_dev_flush+0x5a/0x2b0 [ib_ipoib]
        [<ffffffffa07920ba>] ipoib_ib_dev_flush_heavy+0x1a/0x20 [ib_ipoib]
        [<ffffffff81082871>] process_one_work+0x201/0x760
        [<ffffffff810827cc>] ? process_one_work+0x15c/0x760
        [<ffffffff81082ef0>] worker_thread+0x120/0x4d0
        [<ffffffff81082dd0>] ? process_one_work+0x760/0x760
        [<ffffffff81082dd0>] ? process_one_work+0x760/0x760
        [<ffffffff81088b7e>] kthread+0xfe/0x120
        [<ffffffff81088a80>] ? __init_kthread_worker+0x70/0x70
        [<ffffffff8166c6e2>] ret_from_fork+0x42/0x70
        [<ffffffff81088a80>] ? __init_kthread_worker+0x70/0x70
      Signed-off-by: NHaggai Eran <haggaie@mellanox.com>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      8b7cce0d
    • H
      IB/ucma: Fix lockdep warning in ucma_lock_files · 31b57b87
      Haggai Eran 提交于
      The ucma_lock_files() locks the mut mutex on two files, e.g. for migrating
      an ID. Use mutex_lock_nested() to prevent the warning below.
      
       =============================================
       [ INFO: possible recursive locking detected ]
       4.1.0-rc6-hmm+ #40 Tainted: G           O
       ---------------------------------------------
       pingpong_rpc_se/10260 is trying to acquire lock:
        (&file->mut){+.+.+.}, at: [<ffffffffa047ac55>] ucma_migrate_id+0xc5/0x248 [rdma_ucm]
      
       but task is already holding lock:
        (&file->mut){+.+.+.}, at: [<ffffffffa047ac4b>] ucma_migrate_id+0xbb/0x248 [rdma_ucm]
      
       other info that might help us debug this:
        Possible unsafe locking scenario:
      
              CPU0
              ----
         lock(&file->mut);
         lock(&file->mut);
      
        *** DEADLOCK ***
      
        May be due to missing lock nesting notation
      
       1 lock held by pingpong_rpc_se/10260:
        #0:  (&file->mut){+.+.+.}, at: [<ffffffffa047ac4b>] ucma_migrate_id+0xbb/0x248 [rdma_ucm]
      
       stack backtrace:
       CPU: 0 PID: 10260 Comm: pingpong_rpc_se Tainted: G           O    4.1.0-rc6-hmm+ #40
       Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
        ffff8801f85b63d0 ffff880195677b58 ffffffff81668f49 0000000000000001
        ffffffff825cbbe0 ffff880195677c38 ffffffff810bb991 ffff880100000000
        ffff880100000000 ffff880100000001 ffff8801f85b7010 ffffffff8121bee9
       Call Trace:
        [<ffffffff81668f49>] dump_stack+0x4f/0x6e
        [<ffffffff810bb991>] __lock_acquire+0x741/0x1820
        [<ffffffff8121bee9>] ? dput+0x29/0x320
        [<ffffffff810bcb38>] lock_acquire+0xc8/0x240
        [<ffffffffa047ac55>] ? ucma_migrate_id+0xc5/0x248 [rdma_ucm]
        [<ffffffff8166b901>] ? mutex_lock_nested+0x291/0x3e0
        [<ffffffff8166b6d5>] mutex_lock_nested+0x65/0x3e0
        [<ffffffffa047ac55>] ? ucma_migrate_id+0xc5/0x248 [rdma_ucm]
        [<ffffffff810baeed>] ? trace_hardirqs_on+0xd/0x10
        [<ffffffff8166b66e>] ? mutex_unlock+0xe/0x10
        [<ffffffffa047ac55>] ucma_migrate_id+0xc5/0x248 [rdma_ucm]
        [<ffffffffa0478474>] ucma_write+0xa4/0xb0 [rdma_ucm]
        [<ffffffff81200674>] __vfs_write+0x34/0x100
        [<ffffffff8112427c>] ? __audit_syscall_entry+0xac/0x110
        [<ffffffff810ec055>] ? current_kernel_time+0xc5/0xe0
        [<ffffffff812aa4d3>] ? security_file_permission+0x23/0x90
        [<ffffffff8120088d>] ? rw_verify_area+0x5d/0xe0
        [<ffffffff812009bb>] vfs_write+0xab/0x120
        [<ffffffff81201519>] SyS_write+0x59/0xd0
        [<ffffffff8112427c>] ? __audit_syscall_entry+0xac/0x110
        [<ffffffff8166ffee>] system_call_fastpath+0x12/0x76
      Signed-off-by: NHaggai Eran <haggaie@mellanox.com>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      31b57b87
    • W
      rds: rds_ib_device.refcount overflow · 4fabb594
      Wengang Wang 提交于
      Fixes: 3e0249f9 ("RDS/IB: add refcount tracking to struct rds_ib_device")
      
      There lacks a dropping on rds_ib_device.refcount in case rds_ib_alloc_fmr
      failed(mr pool running out). this lead to the refcount overflow.
      
      A complain in line 117(see following) is seen. From vmcore:
      s_ib_rdma_mr_pool_depleted is 2147485544 and rds_ibdev->refcount is -2147475448.
      That is the evidence the mr pool is used up. so rds_ib_alloc_fmr is very likely
      to return ERR_PTR(-EAGAIN).
      
      115 void rds_ib_dev_put(struct rds_ib_device *rds_ibdev)
      116 {
      117         BUG_ON(atomic_read(&rds_ibdev->refcount) <= 0);
      118         if (atomic_dec_and_test(&rds_ibdev->refcount))
      119                 queue_work(rds_wq, &rds_ibdev->free_work);
      120 }
      
      fix is to drop refcount when rds_ib_alloc_fmr failed.
      Signed-off-by: NWengang Wang <wen.gang.wang@oracle.com>
      Reviewed-by: NHaggai Eran <haggaie@mellanox.com>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      4fabb594
    • T
      RDMA/nes: Fix for incorrect recording of the MAC address · 0a691272
      Tatyana Nikolova 提交于
      Fix for incorrect recording of the MAC address
      Signed-off-by: NTatyana Nikolova <Tatyana.E.Nikolova@intel.com>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      0a691272
    • T
      RDMA/nes: Fix for resolving the neigh · 165d6822
      Tatyana Nikolova 提交于
      Neighbor resolution doesn't work without this fix
      Signed-off-by: NTatyana Nikolova <Tatyana.E.Nikolova@intel.com>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      165d6822
    • T
      RDMA/core: Fixes for port mapper client registration · a7f2f24c
      Tatyana Nikolova 提交于
      Fixes to allow clients to make remove mapping requests, after
      they have provided the user space service with the mapping
      information, they are using when the service is restarted.
      
      1) Adding IWPM_REG_VALID, IWPM_REG_INCOMPL and IWPM_REG_UNDEF
         registration types for the port mapper clients and functions
         to set/check the registration type.
      2) If the port mapper user space service is not available to register
         the client, then its registration stays IWPM_REG_UNDEF and the
         registration isn't checked until the service becomes available
         (no mappings are possible, if the user space service isn't running).
      3) After the service is restarted, the user space port mapper pid is set
         to valid and the client registration is set to IWPM_REG_INCOMPL
         to allow the client to make remove mapping requests.
      Signed-off-by: NTatyana Nikolova <Tatyana.E.Nikolova@intel.com>
      Reviewed-by: NSteve Wise <swise@opengridcomputing.com>
      Tested-by: NSteve Wise <swise@opengridcomputing.com>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      a7f2f24c
    • A
      IB/IPoIB: Fix bad error flow in ipoib_add_port() · 58e9cc90
      Amir Vadai 提交于
      Error values of ib_query_port() and ib_query_device() weren't propagated
      correctly. Because of that, ipoib_add_port() could return NULL value,
      which escaped the IS_ERR() check in ipoib_add_one() and we crashed.
      Signed-off-by: NAmir Vadai <amirv@mellanox.com>
      Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
      Reviewed-by: NIra Weiny <ira.weiny@intel.com>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      58e9cc90
    • M
      IB/mlx4: Do not attemp to report HCA clock offset on VFs · 8a7ff14d
      Matan Barak 提交于
      mlx4 VFs can provide CQE raw time-stamping services, but they
      don't have the hca core clock mapped to their PCI bars.
      
      As such, we should not attempt to query and report the clock offset
      to user space for VFs. Doing so causes query_device over VFs to fail
      with -ENOSUPP.
      
      Fixes: 4b664c43 ('IB/mlx4: Add support for CQ time-stamping')
      Signed-off-by: NMatan Barak <matanb@mellanox.com>
      Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      8a7ff14d
    • E
      IB/cm: Do not queue work to a device that's going away · be4b4993
      Erez Shitrit 提交于
      Whenever ib_cm gets remove_one call, like when there is a hot-unplug
      event, the driver should mark itself as going_down and confirm that no
      new works are going to be queued for that device.
      so, the order of the actions are:
      1. mark the going_down bit.
      2. flush the wq.
      3. [make sure no new works for that device.]
      4. unregister mad agent.
      
      otherwise, works that are already queued can be scheduled after the mad
      agent was freed.
      Signed-off-by: NErez Shitrit <erezsh@mellanox.com>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      be4b4993
    • S
      IB/srp: Avoid using uninitialized variable · 3fdf70ac
      Sagi Grimberg 提交于
      We might return res which is not initialized. Also
      reduce code duplication by exporting srp_parse_tmo so
      srp_tmo_set can reuse it.
      
      Detected by Coverity.
      Signed-off-by: NSagi Grimberg <sagig@mellanox.com>
      Signed-off-by: NJenny Falkovich <jennyf@mellanox.com>
      Reviewed-by: NBart Van Assche <bart.vanassche@sandisk.com>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      3fdf70ac
    • V
      IB/srpt: Convert use of __constant_cpu_to_beXX to cpu_to_beXX · b356c1c1
      Vaishali Thakkar 提交于
      In little endian cases, the macro cpu_to_be{16,32,64} unfolds to
      __swab{16,32,64} which provides special case for constants. In
      big endian cases, __constant_cpu_to_be{16,32,64} and
      cpu_to_be{16,32,64} expand directly to the same expression. So,
      replace __constant_cpu_to_be{16,32,64} with cpu_to_be{16,32,64}
      with the goal of getting rid of the definitions of
      __constant_cpu_to_be{16,32,64} completely.
      
      The Coccinelle semantic patch that performs this transformation
      is as follows:
      
      @@expression x;@@
      
      (
      - __constant_cpu_to_be16(x)
      + cpu_to_be16(x)
      |
      - __constant_cpu_to_be32(x)
      + cpu_to_be32(x)
      |
      - __constant_cpu_to_be64(x)
      + cpu_to_be64(x)
      )
      Signed-off-by: NVaishali Thakkar <vthakkar1994@gmail.com>
      Reviewed-by: NBart Van Assche <bart.vanassche@sandisk.com>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      b356c1c1
    • I
      IB/mad: Remove improper use of BUG_ON · 3b8ab700
      Ira Weiny 提交于
      We recently added BUG_ON's which were inappropriate for a condition which
      should never happen. Change these to be WARN_ON_ONCE as a debugging aid.
      
      Fixes: 4cd7c947 ('IB/mad: Add support for additional MAD info to/from drivers')
      Signed-off-by: NIra Weiny <ira.weiny@intel.com>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      3b8ab700
    • I
      IB/mad: Fix compare between big endian and cpu endian · cd4cd565
      Ira Weiny 提交于
      The define OPA_LID_PERMISSIVE is big endian and was compared to the
      cpu endian variable opa_drslid.
      
      Problem caught by 0-day build infrastructure.
      
      Fixes: 8e4349d1 (IB/mad: Add final OPA MAD processing)
      Signed-off-by: NIra Weiny <ira.weiny@intel.com>
      Reviewed-by: NMike Marciniszyn <mike.marciniszyn@intel.com>
      Reviewed-by: NJohn, Jubin <jubin.john@intel.com>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      cd4cd565
    • H
      IB: Add rdma_cap_ib_switch helper and use where appropriate · 4139032b
      Hal Rosenstock 提交于
      Persuant to Liran's comments on node_type on linux-rdma
      mailing list:
      
      In an effort to reform the RDMA core and ULPs to minimize use of
      node_type in struct ib_device, an additional bit is added to
      struct ib_device for is_switch (IB switch). This is needed
      to be initialized by any IB switch device driver. This is a
      NEW requirement on such device drivers which are all
      "out of tree".
      
      In addition, an ib_switch helper was added to ib_verbs.h
      based on the is_switch device bit rather than node_type
      (although those should be consistent).
      
      The RDMA core (MAD, SMI, agent, sa_query, multicast, sysfs)
      as well as (IPoIB and SRP) ULPs are updated where
      appropriate to use this new helper. In some cases,
      the helper is now used under the covers of using
      rdma_[start end]_port rather than the open coding
      previously used.
      Reviewed-by: NSean Hefty <sean.hefty@intel.com>
      Reviewed-By: NJason Gunthorpe <jgunthorpe@obsidianresearch.com>
      Reviewed-by: NIra Weiny <ira.weiny@intel.com>
      Tested-by: NIra Weiny <ira.weiny@intel.com>
      Signed-off-by: NHal Rosenstock <hal@mellanox.com>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      4139032b
  4. 14 7月, 2015 2 次提交