1. 12 Oct, 2016 8 commits
  2. 10 Oct, 2016 1 commit
    • printk: reinstate KERN_CONT for printing continuation lines · 4bcc595c
      Linus Torvalds committed
      Long long ago the kernel log buffer was a buffered stream of bytes, very
      much like stdio in user space.  It supported log levels by scanning the
      stream and noticing the log level markers at the beginning of each line,
      but if you wanted to print a partial line in multiple chunks, you just
      did multiple printk() calls, and it just automatically worked.
      
      Except when it didn't, and you had very confusing output when different
      lines got all mixed up with each other.  Then you got fragment lines
      mixing with each other, or with non-fragment lines, because it was
      traditionally impossible to tell whether a printk() call was a
      continuation or not.
      
      To at least help clarify the issue of continuation lines, we added a
      KERN_CONT marker back in 2007 to mark continuation lines:
      
        47492527 ("printk: add KERN_CONT annotation").
      
      That continuation marker was initially an empty string, and didn't
      actually make any semantic difference.  But it at least made it possible
      to annotate the source code, and have checkpatch notice that a printk()
      didn't need or want a log level marker, because it was a continuation of
      a previous line.
      
      To avoid the ambiguity between a continuation line that had that
      KERN_CONT marker, and a printk with no level information at all, we then
      in 2009 made KERN_CONT be a real log level marker which meant that we
      could now reliably tell the difference between the two cases.
      
        5fd29d6c ("printk: clean up handling of log-levels and newlines")
      
      and we could take advantage of that to make sure we didn't mix up
      continuation lines with lines that just didn't have any loglevel at all.
      
      Then, in 2012, the kernel log buffer was changed to be a "record" based
      log, where each line was a record that has a loglevel and a timestamp.
      
      You can see the beginning of that conversion in commits
      
        e11fea92 ("kmsg: export printk records to the /dev/kmsg interface")
        7ff9554b ("printk: convert byte-buffer to variable-length record buffer")
      
      with a number of follow-up commits to fix some painful fallout from that
      conversion.  Overall, it took a couple of months to sort out most of
      it.  But the upside was that you could have concurrent readers (and
      writers) of the kernel log and not have lines with mixed output in them.
      
      And one particular pain-point for the record-based kernel logging was
      exactly the fragmentary lines that are generated in smaller chunks.  In
      order to still log them as one record, the continuation lines need to be
      attached to the previous record properly.
      
      However, the explicit continuation record marker that is actually useful
      for this exact case was removed at around the same time by commit
      
        61e99ab8 ("printk: remove the now unnecessary "C" annotation for KERN_CONT")
      
      due to the incorrect belief that KERN_CONT wasn't meaningful.  The
      ambiguity between "is this a continuation line" or "is this a plain
      printk with no log level information" was reintroduced, and in fact
      became an even bigger pain point because there was now the whole
      record-level merging of kernel messages going on.
      
      This patch reinstates the KERN_CONT as a real non-empty string marker,
      so that the ambiguity is fixed once again.
      
      But it's not a plain revert of that original removal: in the four years
      since we made KERN_CONT an empty string again, not only has the format
      of the log level markers changed, we've also had some usage changes in
      this area.
      
      For example, some ACPI code seems to use KERN_CONT _together_ with a log
      level, and now uses both the KERN_CONT marker and (for example) a
      KERN_INFO marker to show that it's an informational continuation of a
      line.
      
      Which is actually not a bad idea - if the continuation line cannot be
      attached to its predecessor, without the log level information we don't
      know what log level to assign to it (and we traditionally just assigned
      it the default loglevel).  So having both a log level and the KERN_CONT
      marker is not necessarily a bad idea, but it does mean that we need to
      actually iterate over potentially multiple markers, rather than just a
      single one.
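
      As an illustrative sketch only (user-space C, not the kernel's actual
      parser), iterating over several prefix markers could look like the
      following.  It assumes each marker is a start-of-header byte followed
      by one character, '0' to '7' for a log level or 'c' for a
      continuation.  struct msg_prefix and parse_markers() are hypothetical
      names.

```c
#include <stdbool.h>
#include <stddef.h>

#define SOH '\001'		/* byte that introduces a marker (assumed encoding) */
#define LEVEL_DEFAULT (-1)	/* no explicit log level was given */

struct msg_prefix {
	int level;	/* 0..7, or LEVEL_DEFAULT */
	bool cont;	/* true if a continuation marker was seen */
};

/*
 * Consume every leading two-byte marker (SOH + kind) and return a pointer
 * to the first byte of the actual message text.
 */
static const char *parse_markers(const char *text, struct msg_prefix *out)
{
	out->level = LEVEL_DEFAULT;
	out->cont = false;

	while (text[0] == SOH && text[1] != '\0') {
		char kind = text[1];

		if (kind >= '0' && kind <= '7')
			out->level = kind - '0';
		else if (kind == 'c')
			out->cont = true;
		else
			break;	/* unknown marker: leave it in the text */
		text += 2;
	}
	return text;
}
```

      For a line that carries both an informational level marker and a
      continuation marker, the loop records both and returns a pointer to
      the message text that follows them.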
      
      Also, since KERN_CONT was still conceptually needed, and encouraged, but
      didn't actually _do_ anything, we've also had the reverse problem:
      rather than having too many annotations it has too few, and there is bit
      rot with code that no longer marks the continuation lines with the
      KERN_CONT marker.
      
      So this patch not only re-instates the non-empty KERN_CONT marker, it
      also fixes up the cases of bit-rot I noticed in my own logs.
      
      There are probably other cases where KERN_CONT will need to be
      added, either because it is new code that never dealt with the need for
      KERN_CONT, or old code that has bitrotted without anybody noticing.
      
      That said, we should strive to avoid the need for KERN_CONT.  It does
      result in real problems for logging, and should generally not be seen as
      a good feature.  If we some day can get rid of the feature entirely,
      because nobody does any fragmented printk calls, that would be lovely.
      
      But until that point, let's at least mark the code that relies on the hacky
      multi-fragment kernel printk's.  Not only does it avoid the ambiguity,
      it also annotates code as "maybe this would be good to fix some day".
      
      (That said, particularly during single-threaded bootup, the downsides of
      KERN_CONT are very limited.  Things get much hairier when you have
      multiple threads going on and user level reading and writing logs too).
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      4bcc595c
  3. 09 Oct, 2016 5 commits
  4. 08 Oct, 2016 26 commits
    • wan/fsl_ucc_hdlc: Fix size used in dma_free_coherent() · 776482cd
      Christophe Jaillet committed
      Size used with 'dma_alloc_coherent()' and 'dma_free_coherent()' should be
      consistent.
      Here, the size of a pointer is used in dma_alloc... and the size of the
      pointed structure is used in dma_free...
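
      A small user-space sketch of the pattern behind this kind of fix, with
      calloc()/free() standing in for dma_alloc_coherent()/dma_free_coherent().
      struct ring_desc, struct ring, ring_alloc() and ring_free() are
      hypothetical names, not code from the driver.  Computing the size once
      from the pointed-to type and reusing it keeps both sides consistent.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical descriptor; only its size matters for this sketch. */
struct ring_desc {
	unsigned int status;
	unsigned int length;
	unsigned char payload[56];
};

struct ring {
	struct ring_desc *desc;
	size_t desc_bytes;	/* size used at allocation time */
};

/* Allocate one descriptor; returns 0 on success, -1 on failure. */
static int ring_alloc(struct ring *r)
{
	/* sizeof(*r->desc) is the structure; sizeof(r->desc) is only a pointer. */
	r->desc_bytes = sizeof(*r->desc);
	r->desc = calloc(1, r->desc_bytes);
	return r->desc ? 0 : -1;
}

/* Release the descriptor and report the size a matching free would use. */
static size_t ring_free(struct ring *r)
{
	size_t bytes = r->desc_bytes;

	free(r->desc);
	r->desc = NULL;
	r->desc_bytes = 0;
	return bytes;
}
```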
      
      This has been spotted with coccinelle, using the following script:
      ////////////////////
      @r@
      expression x0, x1, y0, y1, z0, z1, t0, t1, ret;
      @@
      
      *   ret = dma_alloc_coherent(x0, y0, z0, t0);
          ...
      *   dma_free_coherent(x1, y1, ret, t1);
      
      @script:python@
      y0 << r.y0;
      y1 << r.y1;
      
      @@
      if y1.find(y0) == -1:
       print "WARNING: sizes look different:  '%s'   vs   '%s'" % (y0, y1)
      ////////////////////
      Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      776482cd
    • net: macb: NULL out phydev after removing mdio bus · fa6114d4
      Nathan Sullivan committed
      To ensure the dev->phydev pointer is not used after becoming invalid in
      mdiobus_unregister, set it to NULL. This happens when removing the macb
      driver without first taking its interface down, since unregister_netdev
      will end up calling macb_close.
      Signed-off-by: Xander Huff <xander.huff@ni.com>
      Signed-off-by: Nathan Sullivan <nathan.sullivan@ni.com>
      Signed-off-by: Brad Mouring <brad.mouring@ni.com>
      Reviewed-by: Moritz Fischer <moritz.fischer@ettus.com>
      Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      fa6114d4
    • xen-netback: make sure that hashes are not send to unaware frontends · 912e27e8
      Paul Durrant committed
      In the case when a frontend only negotiates a single queue with
      xen-netback it is possible for a skbuff with a s/w hash to result in a
      hash extra_info segment being sent to the frontend even when no hash
      algorithm has been configured. (The ndo_select_queue() entry point makes
      sure the hash is not set if no algorithm is configured, but this entry
      point is not called when there is only a single queue). This can result
      in a frontend that is unable to handle extra_info segments being given
      such a segment, causing it to crash.
      
      This patch fixes the problem by clearing the hash in ndo_start_xmit()
      instead, which is clearly guaranteed to be called irrespective of the
      number of queues.
      Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
      Cc: Wei Liu <wei.liu2@citrix.com>
      Acked-by: Wei Liu <wei.liu2@citrix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      912e27e8
    • vfs: Remove {get,set,remove}xattr inode operations · fd50ecad
      Andreas Gruenbacher committed
      These inode operations are no longer used; remove them.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      fd50ecad
    • console: don't prefer first registered if DT specifies stdout-path · 05fd007e
      Paul Burton committed
      If a device tree specifies a preferred device for kernel console output
      via the stdout-path or linux,stdout-path chosen node properties or the
      stdout alias then the kernel ought to honor it & output the kernel
      console to that device.  As it stands, this isn't the case.  Whilst we
      parse the stdout-path properties & set an of_stdout variable from
      of_alias_scan(), and use that from of_console_check() to determine
      whether to add a console device as a preferred console whilst
      registering it, we also prefer the first registered console if no other
      has been selected at the time of its registration.
      
      This means that if a console other than the one the device tree selects
      via stdout-path is registered first, we will switch to using it & when
      the stdout-path console is later registered the call to
      add_preferred_console() via of_console_check() is too late to do
      anything useful.  In practice this seems to mean that we switch to the
      dummy console device fairly early & see no further console output:
      
          Console: colour dummy device 80x25
          console [tty0] enabled
          bootconsole [ns16550a0] disabled
      
      Fix this by not automatically preferring the first registered console if
      one is specified by the device tree.  This allows consoles to be
      registered but not enabled, and once the driver for the console selected
      by stdout-path calls of_console_check() the driver will be added to the
      list of preferred consoles before any other console has been enabled.
      When that console is then registered via register_console() it will be
      enabled as expected.
      
      Link: http://lkml.kernel.org/r/20160809151937.26118-1-paul.burton@imgtec.com
      Signed-off-by: Paul Burton <paul.burton@imgtec.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Paul Burton <paul.burton@imgtec.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Jiri Slaby <jslaby@suse.cz>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Ivan Delalande <colona@arista.com>
      Cc: Thierry Reding <treding@nvidia.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Jan Kara <jack@suse.com>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Joe Perches <joe@perches.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Rob Herring <robh+dt@kernel.org>
      Cc: Frank Rowand <frowand.list@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      05fd007e
    • cred: simpler, 1D supplementary groups · 81243eac
      Alexey Dobriyan committed
      Current supplementary groups code can massively overallocate memory and
      is implemented in a way so that access to individual gid is done via 2D
      array.
      
      If number of gids is <= 32, memory allocation is more or less tolerable
      (140/148 bytes).  But if it is not, code allocates full page (!)
      regardless and, what's even more fun, doesn't reuse small 32-entry
      array.
      
      2D array means dependent shifts, loads and LEAs without possibility to
      optimize them (gid is never known at compile time).
      
      All of the above is unnecessary.  Switch to the usual
      trailing-zero-len-array scheme.  Memory is allocated with
      kmalloc/vmalloc() and only as much as needed.  Accesses become simpler
      (LEA 8(gi,idx,4) or even without displacement).
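
      A minimal user-space sketch of the trailing-array layout, using a C99
      flexible array member in place of the kernel's zero-length array and
      malloc() in place of kmalloc()/vmalloc().  struct gid_list and its
      helpers are hypothetical names, and the two-int header is an
      assumption made for illustration.

```c
#include <stddef.h>
#include <stdlib.h>

/* One allocation: a small header followed directly by the gid entries. */
struct gid_list {
	int usage;		/* reference count (unused in this sketch) */
	int ngroups;		/* number of valid entries in gid[] */
	unsigned int gid[];	/* C99 flexible array member */
};

/* Bytes needed for a list holding ngroups entries. */
static size_t gid_list_bytes(int ngroups)
{
	return sizeof(struct gid_list) + (size_t)ngroups * sizeof(unsigned int);
}

/* Allocate exactly as much as needed; returns NULL on failure. */
static struct gid_list *gid_list_alloc(int ngroups)
{
	struct gid_list *gl;

	if (ngroups < 0)
		return NULL;
	gl = malloc(gid_list_bytes(ngroups));
	if (!gl)
		return NULL;
	gl->usage = 1;
	gl->ngroups = ngroups;
	return gl;
}
```

      Assuming 4-byte integers, gid_list_bytes(9) is 44 and
      gid_list_bytes(65536) is 256KB+8, and element access is a plain
      gl->gid[i].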
      
      Maximum number of gids is 65536 which translates to 256KB+8 bytes.  I
      think kernel can handle such allocation.
      
      On my usual desktop system with whole 9 (nine) aux groups, struct
      group_info shrinks from 148 bytes to 44 bytes, yay!
      
      Nice side effects:
      
       - "gi->gid[i]" is shorter than "GROUP_AT(gi, i)", less typing,
      
       - fix little mess in net/ipv4/ping.c
         should have been using GROUP_AT macro but this point becomes moot,
      
       - aux group allocation is persistent and should be accounted as such.
      
      Link: http://lkml.kernel.org/r/20160817201927.GA2096@p183.telecom.by
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Vasily Kulikov <segoon@openwall.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      81243eac
    • nmi_backtrace: generate one-line reports for idle cpus · 6727ad9e
      Chris Metcalf committed
      When doing an nmi backtrace of many cores, most of which are idle, the
      output is a little overwhelming and very uninformative.  Suppress
      messages for cpus that are idling when they are interrupted and just
      emit one line, "NMI backtrace for N skipped: idling at pc 0xNNN".
      
      We do this by grouping all the cpuidle code together into a new
      .cpuidle.text section, and then checking the address of the interrupted
      PC to see if it lies within that section.
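
      The address check can be sketched in user-space C as below.  struct
      text_range and pc_in_range() are hypothetical names; in the kernel the
      bounds would come from linker-provided section symbols, which are not
      reproduced here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Half-open address range [start, end) covering one text section. */
struct text_range {
	uintptr_t start;
	uintptr_t end;
};

/* True if the interrupted program counter lies inside the section. */
static bool pc_in_range(const struct text_range *r, uintptr_t pc)
{
	return pc >= r->start && pc < r->end;
}
```

      A backtrace handler built on this idea would emit the one-line
      "skipped" report when the check is true and the full backtrace
      otherwise.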
      
      This commit suitably tags x86 and tile idle routines, and only adds in
      the minimal framework for other architectures.
      
      Link: http://lkml.kernel.org/r/1472487169-14923-5-git-send-email-cmetcalf@mellanox.com
      Signed-off-by: Chris Metcalf <cmetcalf@mellanox.com>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Tested-by: Daniel Thompson <daniel.thompson@linaro.org> [arm]
      Tested-by: Petr Mladek <pmladek@suse.com>
      Cc: Aaron Tomlin <atomlin@redhat.com>
      Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      6727ad9e
    • memory-hotplug: fix store_mem_state() return value · d66ba15b
      Reza Arbab committed
      If store_mem_state() is called to online memory which is already online,
      it will return 1, the value it got from device_online().
      
      This is wrong because store_mem_state() is a device_attribute .store
      function.  Thus a non-negative return value represents input bytes read.
      
      Set the return value to -EINVAL in this case.
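
      A user-space sketch of the return-value convention described above.
      struct mem_block, block_online() and store_state() are hypothetical
      stand-ins, with block_online() imitating a helper that returns 1 when
      the device is already online.  It is not the kernel function itself.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>	/* ssize_t */

struct mem_block {
	bool online;
};

/* Stand-in for the online helper: 0 on success, 1 if already online. */
static int block_online(struct mem_block *b)
{
	if (b->online)
		return 1;
	b->online = true;
	return 0;
}

/*
 * A .store-style callback: a non-negative return value means "this many
 * input bytes were consumed", so a stray 1 must not leak through.
 */
static ssize_t store_state(struct mem_block *b, const char *buf, size_t count)
{
	int ret;

	(void)buf;	/* input parsing is omitted in this sketch */

	ret = block_online(b);
	if (ret < 0)
		return ret;	/* pass real errors through */
	if (ret > 0)
		return -EINVAL;	/* already online */
	return (ssize_t)count;	/* success: all input consumed */
}
```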
      
      Link: http://lkml.kernel.org/r/1472743777-24266-1-git-send-email-arbab@linux.vnet.ibm.com
      Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Yaowei Bai <baiyaowei@cmss.chinamobile.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Xishi Qiu <qiuxishi@huawei.com>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Cc: Chen Yucong <slaoub@gmail.com>
      Cc: Andrew Banman <abanman@sgi.com>
      Cc: Seth Jennings <sjenning@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      d66ba15b
    • /dev/dax: fix Kconfig dependency build breakage · 4e65e938
      Ross Zwisler committed
      The function dax_pmem_probe() in drivers/dax/pmem.c is compiled under the
      CONFIG_DEV_DAX_PMEM tri-state config option.  This config option currently
      only depends on CONFIG_NVDIMM_DAX, a bool, which means that the following
      configuration is possible:
      
      CONFIG_LIBNVDIMM=m
      ...
      CONFIG_NVDIMM_DAX=y
      CONFIG_DEV_DAX=y
      CONFIG_DEV_DAX_PMEM=y
      
      With this config LIBNVDIMM is compiled as a module with NVDIMM_DAX=y just
      meaning that we will compile drivers/nvdimm/dax_devs.c into that module.
      However, dax_pmem_probe() depends on several symbols defined in
      drivers/nvdimm/dax_devs.c, which results in the following build errors:
      
      drivers/built-in.o: In function `dax_pmem_probe':
      linux/drivers/dax/pmem.c:70: undefined reference to `to_nd_dax'
      linux/drivers/dax/pmem.c:74: undefined reference to
      `nvdimm_namespace_common_probe'
      linux/drivers/dax/pmem.c:80: undefined reference to `devm_nsio_enable'
      linux/drivers/dax/pmem.c:81: undefined reference to `nvdimm_setup_pfn'
      linux/drivers/dax/pmem.c:84: undefined reference to `devm_nsio_disable'
      linux/drivers/dax/pmem.c:122: undefined reference to `to_nd_region'
      drivers/built-in.o: In function `dax_pmem_init':
      linux/drivers/dax/pmem.c:147: undefined reference to `__nd_driver_register'
      
      Fix this by making NVDIMM_DAX a tristate.  DEV_DAX_PMEM depends on
      NVDIMM_DAX which depends on LIBNVDIMM.  Since they are all now tristates,
      if LIBNVDIMM is built as a kernel module DEV_DAX_PMEM will be as well.
      This prevents dax_devs.c from being built as a built-in while its
      dependencies are in the libnvdimm.ko module.
      Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      4e65e938
    • dax: use correct dev_t value · bc0a0fe9
      Arnd Bergmann committed
      The dev_t variable in devm_create_dax_dev() is used before it's
      first set:
      
      drivers/dax/dax.c: In function 'devm_create_dax_dev':
      drivers/dax/dax.c:205:39: error: 'dev_t' may be used uninitialized in this function [-Werror=maybe-uninitialized]
        inode = iget5_locked(dax_superblock, hash_32(devt + DAXFS_MAGIC, 31),
                                             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      drivers/dax/dax.c:688:8: note: 'dev_t' was declared here
      
      This reorders the code into the order that looks correct to me.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Fixes: 3bc52c45 ("dax: define a unified inode/address_space for device-dax mappings")
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      bc0a0fe9
    • dax: convert devm_create_dax_dev to PTR_ERR · d76911ee
      Dan Williams committed
      For sub-division support we need access to the dax_dev created by
      devm_create_dax_dev().
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      d76911ee
    • staging/lustre: Disable InfiniBand support · 2937f375
      Doug Ledford committed
      We changed one of the RDMA APIs and Lustre's InfiniBand transport
      has not been updated to match.  Disable it for now.
      Signed-off-by: Doug Ledford <dledford@redhat.com>
      2937f375
    • iw_cxgb4: add fast-path for small REG_MR operations · 49b53a93
      Steve Wise committed
      When processing a REG_MR work request, if fw supports the
      FW_RI_NSMR_TPTE_WR work request, and if the page list for this
      registration is <= 2 pages, and the current state of the mr is INVALID,
      then use FW_RI_NSMR_TPTE_WR to pass down a fully populated TPTE for FW
      to write.  This avoids FW having to do an async read of the TPTE, which
      blocks the SQ until the read completes.
      
      To know if the current MR state is INVALID or not, iw_cxgb4 must track the
      state of each fastreg MR.  The c4iw_mr struct state is updated as REG_MR
      and LOCAL_INV WRs are posted and completed, when a reg_mr is destroyed,
      and when RECV completions are processed that include a local invalidation.
      
      This optimization increases small IO IOPS for both iSER and NVMF.
      Signed-off-by: Steve Wise <swise@opengridcomputing.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
      49b53a93
    • cxgb4: advertise support for FR_NSMR_TPTE_WR · 086de575
      Steve Wise committed
      Query firmware for the FW_PARAMS_PARAM_DEV_RI_FR_NSMR_TPTE_WR parameter.
      If it exists and is 1, then advertise support for FR_NSMR_TPTE_WR to
      the ULDs.
      Signed-off-by: Steve Wise <swise@opengridcomputing.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
      086de575
    • IB/core: correctly handle rdma_rw_init_mrs() failure · b6bc1c73
      Steve Wise committed
      Function ib_create_qp() was failing to return an error when
      rdma_rw_init_mrs() fails, causing a crash further down in ib_create_qp()
      when trying to dereference the qp pointer which was actually a negative
      errno.
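
      The failure mode can be illustrated with a user-space sketch of the
      error-pointer idiom.  err_ptr(), ptr_err() and is_err() are stand-ins
      modeled on the kernel's ERR_PTR(), PTR_ERR() and IS_ERR() helpers.
      struct qp, init_mrs() and create_qp() are hypothetical and do not
      reproduce the real ib_create_qp().

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO 4095

/* Encode a negative errno in a pointer value, and decode it again. */
static inline void *err_ptr(long error) { return (void *)(intptr_t)error; }
static inline long ptr_err(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct qp {
	int num;
};

/* Hypothetical setup step that can be made to fail. */
static int init_mrs(struct qp *qp, int fail)
{
	(void)qp;
	return fail ? -ENOMEM : 0;
}

static struct qp *create_qp(int fail_mrs)
{
	struct qp *qp = malloc(sizeof(*qp));
	int ret;

	if (!qp)
		return err_ptr(-ENOMEM);
	qp->num = 1;

	ret = init_mrs(qp, fail_mrs);
	if (ret) {
		free(qp);
		/* Return right away: code below must never see an error pointer. */
		return err_ptr(ret);
	}
	return qp;
}
```

      Callers check is_err() before touching the result, so an encoded errno
      is never dereferenced as if it were a valid object.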
      
      The crash:
      
      crash> log|grep BUG
      [  136.458121] BUG: unable to handle kernel NULL pointer dereference at 0000000000000098
      crash> bt
      PID: 3736   TASK: ffff8808543215c0  CPU: 2   COMMAND: "kworker/u64:2"
       #0 [ffff88084d323340] machine_kexec at ffffffff8105fbb0
       #1 [ffff88084d3233b0] __crash_kexec at ffffffff81116758
       #2 [ffff88084d323480] crash_kexec at ffffffff8111682d
       #3 [ffff88084d3234b0] oops_end at ffffffff81032bd6
       #4 [ffff88084d3234e0] no_context at ffffffff8106e431
       #5 [ffff88084d323530] __bad_area_nosemaphore at ffffffff8106e610
       #6 [ffff88084d323590] bad_area_nosemaphore at ffffffff8106e6f4
       #7 [ffff88084d3235a0] __do_page_fault at ffffffff8106ebdc
       #8 [ffff88084d323620] do_page_fault at ffffffff8106f057
       #9 [ffff88084d323660] page_fault at ffffffff816e3148
          [exception RIP: ib_create_qp+427]
          RIP: ffffffffa02554fb  RSP: ffff88084d323718  RFLAGS: 00010246
          RAX: 0000000000000004  RBX: fffffffffffffff4  RCX: 000000018020001f
          RDX: ffff880830997fc0  RSI: 0000000000000001  RDI: ffff88085f407200
          RBP: ffff88084d323778   R8: 0000000000000001   R9: ffffea0020bae210
          R10: ffffea0020bae218  R11: 0000000000000001  R12: ffff88084d3237c8
          R13: 00000000fffffff4  R14: ffff880859fa5000  R15: ffff88082eb89800
          ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
      #10 [ffff88084d323780] rdma_create_qp at ffffffffa0782681 [rdma_cm]
      #11 [ffff88084d3237b0] nvmet_rdma_create_queue_ib at ffffffffa07c43f3 [nvmet_rdma]
      #12 [ffff88084d323860] nvmet_rdma_alloc_queue at ffffffffa07c5ba9 [nvmet_rdma]
      #13 [ffff88084d323900] nvmet_rdma_queue_connect at ffffffffa07c5c96 [nvmet_rdma]
      #14 [ffff88084d323980] nvmet_rdma_cm_handler at ffffffffa07c6450 [nvmet_rdma]
      #15 [ffff88084d3239b0] iw_conn_req_handler at ffffffffa0787480 [rdma_cm]
      #16 [ffff88084d323a60] cm_conn_req_handler at ffffffffa0775f06 [iw_cm]
      #17 [ffff88084d323ab0] process_event at ffffffffa0776019 [iw_cm]
      #18 [ffff88084d323af0] cm_work_handler at ffffffffa0776170 [iw_cm]
      #19 [ffff88084d323cb0] process_one_work at ffffffff810a1483
      #20 [ffff88084d323d90] worker_thread at ffffffff810a211d
      #21 [ffff88084d323ec0] kthread at ffffffff810a6c5c
      #22 [ffff88084d323f50] ret_from_fork at ffffffff816e1ebf
      
      Fixes: 632bc3f6 ("IB/core, RDMA RW API: Do not exceed QP SGE send limit")
      Signed-off-by: Steve Wise <swise@opengridcomputing.com>
      Cc: stable@vger.kernel.org
      Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
      b6bc1c73
    • IB/srp: Fix infinite loop when FMR sg[0].offset != 0 · 681cc360
      Bart Van Assche committed
      Prevent an infinite loop when FMR is used to map an sg-list in which
      the first element has a non-zero offset. This patch makes the FMR
      mapping code similar to that of ib_sg_to_pages().
      
      Note: older Mellanox HCAs do not support non-zero offsets for FMR.
      See also commit 8c4037b5 ("IB/srp: always avoid non-zero offsets
      into an FMR").
      Reported-by: Alex Estrin <alex.estrin@intel.com>
      Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
      681cc360
    • 52bb8c62
    • IB/core: Improve ib_map_mr_sg() documentation · 52746129
      Bart Van Assche committed
      Document that ib_map_mr_sg() is able to map physically discontiguous
      sg-lists as a single MR. Change IB_MR_TYPE_SG_GAPS_REG into
      IB_MR_TYPE_SG_GAPS.
      Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
      Cc: Sagi Grimberg <sagi@grimberg.me>
      Cc: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Sagi Grimberg <sagi@rimberg.me>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
      52746129
    • IB/mlx4: Fix possible vl/sl field mismatch in LRH header in QP1 packets · fd10ed8e
      Jack Morgenstein committed
      In MLX qp packets, the LRH (built by the driver) has both a VL field
      and an SL field. When building a QP1 packet, the VL field should
      reflect the SLtoVL mapping and not arbitrarily contain zero (as is
      done now). This bug causes credit problems in IB switches at
      high rates of QP1 packets.
      
      The fix is to cache the SL to VL mapping in the driver, and look up
      the VL mapped to the SL provided in the send request when sending
      QP1 packets.
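
      A sketch of the cache-and-lookup idea in user-space C.  struct
      sl2vl_cache, sl2vl_update() and sl2vl_lookup() are hypothetical names,
      the cache is stored unpacked (one byte per SL) for clarity, and
      locking is omitted.

```c
#include <stdint.h>
#include <string.h>

#define NUM_SL 16	/* the SL field is 4 bits wide, so 16 service levels */

struct sl2vl_cache {
	uint8_t vl[NUM_SL];	/* vl[sl] is the virtual lane mapped to sl */
};

/* Replace the cached table, for example after a mapping-change event. */
static void sl2vl_update(struct sl2vl_cache *c, const uint8_t table[NUM_SL])
{
	memcpy(c->vl, table, NUM_SL);
}

/* Look up the VL to place in the header for a send request's SL. */
static uint8_t sl2vl_lookup(const struct sl2vl_cache *c, uint8_t sl)
{
	return c->vl[sl & 0x0f] & 0x0f;	/* SL and VL are both 4-bit fields */
}
```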
      
      For FW versions which support generating a port_management_config_change
      event with subtype sl-to-vl-table-change, the driver uses that event
      to update its sl-to-vl mapping cache.  Otherwise, the driver snoops
      incoming SMP mads to update the cache.
      
      There remains the case where the FW is running in secure-host mode
      (so no QP0 packets are delivered to the driver), and the FW does not
      generate the sl2vl mapping change event. To support this case, the
      driver updates (via querying the FW) its sl2vl mapping cache when
      running in secure-host mode when it receives either a Port Up event
      or a client-reregister event (where the port is still up, but there
      may have been an opensm failover).
      OpenSM modifies the sl2vl mapping before Port Up and Client-reregister
      events occur, so if there is a mapping change the driver's cache will
      be properly updated.
      
      Fixes: 225c7b1f ("IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters")
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
      fd10ed8e
    • IB/mthca: Move user vendor structures · 486f6095
      Leon Romanovsky committed
      This patch moves mthca vendor's specific structures to
      common UAPI folder which will be visible to all consumers.
      
      These structures are used by user-space library driver
      (libmthca) and currently manually copied to that library.
      
      This move will allow cross-compile against these files and
      simplify introduction of vendor specific data.
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
      486f6095
    • IB/nes: Move user vendor structures · c546b2a3
      Leon Romanovsky committed
      This patch moves nes vendor's specific structures to
      common UAPI folder which will be visible to all consumers.
      
      These structures are used by user-space library driver
      (libnes) and currently manually copied to that library.
      
      This move will allow cross-compile against these files and
      simplify introduction of vendor specific data.
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
      c546b2a3
    • IB/ocrdma: Move user vendor structures · a7fe7380
      Leon Romanovsky committed
      This patch moves ocrdma vendor's specific structures to
      common UAPI folder which will be visible to all consumers.
      
      These structures are used by user-space library driver
      (libocrdma) and currently manually copied to that library.
      
      This move will allow cross-compile against these files and
      simplify introduction of vendor specific data.
      
      In addition, it changes types to be __uXX instead of uXX.
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Acked-by: Devesh Sharma <devesh.sharma@broadcom.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
      a7fe7380
    • IB/mlx4: Move user vendor structures · 9ce28a20
      Leon Romanovsky committed
      This patch moves mlx4 vendor's specific structures to
      common UAPI folder which will be visible to all consumers.
      
      These structures are used by user-space library driver
      (libmlx4) and currently manually copied to that library.
      
      This move will allow cross-compile against these files and
      simplify introduction of vendor specific data.
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
      9ce28a20
    • IB/cxgb4: Move user vendor structures · e44ee2fd
      Leon Romanovsky committed
      This patch moves cxgb4 vendor's specific structures to
      common UAPI folder which will be visible to all consumers.
      
      These structures are used by user-space library driver
      (libcxgb4) and currently manually copied to that library.
      
      This move will allow cross-compile against these files and
      simplify introduction of vendor specific data.
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Reviewed-by: Steve Wise <swise@opengridcomputing.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
      e44ee2fd
    • IB/cxgb3: Move user vendor structures · a85fb338
      Leon Romanovsky committed
      This patch moves cxgb3 vendor's specific structures to
      common UAPI folder which will be visible to all consumers.
      
      These structures are used by user-space library driver
      (libcxgb3) and currently manually copied to that library.
      
      This move will allow cross-compile against these files and
      simplify introduction of vendor specific data.
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Reviewed-by: Steve Wise <swise@opengridcomputing.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
      a85fb338
    • IB/mlx5: Move and decouple user vendor structures · 3085e29e
      Leon Romanovsky committed
      This patch decouples and moves vendors specific structures to
      common UAPI folder which will be visible to all consumers.
      
      These structures are used by user-space library driver
      (libmlx5) and currently manually copied to that library.
      
      This move will allow cross-compile against these files and
      simplify introduction of vendor specific data.
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
      3085e29e