1. 05 Sep, 2017 1 commit
    • libnvdimm, nfit: move the check on nd_reserved2 to the endpoint · 9edcad53
      Meng Xu committed
      Delay the check of nd_reserved2 to the actual endpoint (acpi_nfit_ctl)
      that uses it, to prevent a potential double-fetch bug.
      
      While examining the kernel source code, I found a dangerous operation that
      could turn into a double-fetch situation (a race condition bug), where
      the same userspace memory region is fetched twice into the kernel, with
      sanity checks after the first fetch but no checks after the second fetch.
      
      In the case of _IOC_NR(ioctl_cmd) == ND_CMD_CALL:
      
      1. The first fetch happens in line 935: copy_from_user(&pkg, p, sizeof(pkg))
      
      2. Subsequently, `pkg.nd_reserved2` is asserted to be all zeroes
      (line 984 to 986).
      
      3. The second fetch happens in line 1022: copy_from_user(buf, p, buf_len)
      
      4. Given that `p` can be fully controlled in userspace, an attacker can
      race to overwrite the header part of `p`, say,
      `((struct nd_cmd_pkg *)p)->nd_reserved2`, with an arbitrary value
      (say nine 0xFFFFFFFF for `nd_reserved2`) after the first fetch but before the
      second fetch. The changed value will be copied to `buf`.
      
      5. There are no checks on the second fetch until its use in
      line 1034: nd_cmd_clear_to_send(nvdimm_bus, nvdimm, cmd, buf) and
      line 1038: nd_desc->ndctl(nd_desc, nvdimm, cmd, buf, buf_len, &cmd_rc),
      which means that the assumed relation, that `p->nd_reserved2` is all zeroes,
      might not hold after the second fetch. And once control passes to these
      functions, we lose the context to assert the assumed relation.
      
      6. Based on my manual analysis, `p->nd_reserved2` is not used in function
      `nd_cmd_clear_to_send` and potential implementations of `nd_desc->ndctl`,
      so there is no working exploit against it right now. However, this could
      easily turn into an exploitable one if careless developers start to use
      `p->nd_reserved2` later and assume that it is all zeroes.
      
      Move the validation of the nd_reserved2 field to the ->ndctl()
      implementation where it has a stable buffer to evaluate.
      Signed-off-by: Meng Xu <mengxu.gatech@gmail.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
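The idea of validating a stable copy, as described in the commit above, can be sketched in userspace C. This is only an illustration under stated assumptions: `struct pkg_hdr`, `reserved_is_zero()` and `handle_cmd()` are hypothetical names, `memcpy()` stands in for copy_from_user(), and none of this is the kernel's actual code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for struct nd_cmd_pkg. */
struct pkg_hdr {
    uint32_t command;
    uint32_t reserved2[9];
};

/* Returns true when every reserved word is zero. */
static bool reserved_is_zero(const struct pkg_hdr *hdr)
{
    size_t n = sizeof(hdr->reserved2) / sizeof(hdr->reserved2[0]);

    for (size_t i = 0; i < n; i++) {
        if (hdr->reserved2[i] != 0)
            return false;
    }
    return true;
}

/*
 * Endpoint-style validation: fetch the caller's memory exactly once into
 * a private buffer, then check that copy. Later writes to `user` cannot
 * change the bytes that were validated.
 */
static int handle_cmd(const void *user, size_t len, struct pkg_hdr *stable)
{
    if (len < sizeof(*stable))
        return -1;
    memcpy(stable, user, sizeof(*stable)); /* models copy_from_user() */
    if (!reserved_is_zero(stable))
        return -1;                         /* reject based on the copy */
    return 0;
}
```

A caller would pass its request buffer once and from then on use only `stable`; re-reading `user` after the check would reintroduce the double fetch.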
  2. 04 Sep, 2017 1 commit
    • dax: fix FS_DAX=n BLOCK=y compilation · 26f2f4de
      Dan Williams committed
      The 0day kbuild robot reports:
      
      >> drivers//dax/super.c:64:20: error: redefinition of 'fs_dax_get_by_bdev'
          struct dax_device *fs_dax_get_by_bdev(struct block_device *bdev)
                             ^~~~~~~~~~~~~~~~~~
         In file included from drivers//dax/super.c:22:0:
         include/linux/dax.h:76:34: note: previous definition of 'fs_dax_get_by_bdev' was here
          static inline struct dax_device *fs_dax_get_by_bdev(struct block_device *bdev)
                                           ^~~~~~~~~~~~~~~~~~
      
      Protect the definition of fs_dax_get_by_bdev() in drivers/dax/super.c
      with an ifdef.
      
      Fixes: 78f35473 ("dax: introduce a fs_dax_get_by_bdev() helper")
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Darrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Reported-by: kbuild test robot <fengguang.wu@intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
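The redefinition error reported above is the usual result of a header supplying a static inline stub while a source file also defines the real function. A minimal single-file sketch of the guarded pattern follows; `HAVE_FS_DAX`, `struct dax_device_stub` and `lookup_dax()` are hypothetical stand-ins, not the kernel's identifiers.

```c
#include <stddef.h>

struct dax_device_stub { int id; };     /* hypothetical stand-in type */

/* What a header might contain. */
#ifdef HAVE_FS_DAX                      /* hypothetical config symbol */
struct dax_device_stub *lookup_dax(int bdev_id);
#else
static inline struct dax_device_stub *lookup_dax(int bdev_id)
{
    (void)bdev_id;
    return NULL;                        /* feature compiled out */
}
#endif

/* What the source file might contain, under the same guard. */
#ifdef HAVE_FS_DAX
static struct dax_device_stub the_device = { 1 };

struct dax_device_stub *lookup_dax(int bdev_id)
{
    return bdev_id == 1 ? &the_device : NULL;
}
#endif
```

With the guard in place, only one definition of `lookup_dax()` is visible in either configuration; without it, the stub and the real definition would collide when the symbol is undefined.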
  3. 01 Sep, 2017 9 commits
    • libnvdimm: fix integer overflow static analysis warning · 58738c49
      Dan Williams committed
      Dan reports:
          The patch 62232e45: "libnvdimm: control (ioctl) messages for
          nvdimm_bus and nvdimm devices" from Jun 8, 2015, leads to the
          following static checker warning:
      
                  drivers/nvdimm/bus.c:1018 __nd_ioctl()
                  warn: integer overflows 'buf_len'
      
          From a casual review, this seems like it might be a real bug.  On
          the first iteration we load some data into in_env[].  On the second
          iteration we read a user controlled "in_size" from nd_cmd_in_size().
          It can go up to UINT_MAX - 1.  A high number means we will fill the
          whole in_env[] buffer.  But we potentially keep looping and adding
          more to in_len so now it can be any value.
      
          It's simple enough to change, but it feels weird that we keep looping
          even though in_env is totally full.  Shouldn't we just return an
          error if we don't have space for desc->in_num?
      
      We keep looping because the size of the total input is allowed to be
      bigger than the 'envelope', which is a subset of the payload that tells
      us how much data to expect. For safety, explicitly check that buf_len
      does not overflow, which is what the checker flagged.
      
      Cc: <stable@vger.kernel.org>
      Fixes: 62232e45: "libnvdimm: control (ioctl) messages for nvdimm_bus..."
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
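An explicit overflow check on an accumulated length, as discussed above, can be sketched as follows. This is an illustrative userspace version that assumes a 64-bit accumulator for 32-bit sizes; `sum_sizes()` is a hypothetical helper and the kernel's actual check may be written differently.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Add up n user-supplied 32-bit sizes. Returns false instead of
 * wrapping when the running total no longer fits in 32 bits.
 */
static bool sum_sizes(const uint32_t *sizes, size_t n, uint32_t *total)
{
    uint64_t acc = 0; /* wider than the inputs, so one addition cannot wrap */

    for (size_t i = 0; i < n; i++) {
        acc += sizes[i];
        if (acc > UINT32_MAX)
            return false; /* would overflow a 32-bit length */
    }
    *total = (uint32_t)acc;
    return true;
}
```

Because the total is checked after every addition, the accumulator never grows beyond twice UINT32_MAX, so the 64-bit value itself cannot overflow.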
    • libnvdimm, nd_blk: remove mmio_flush_range() · 5deb67f7
      Robin Murphy committed
      mmio_flush_range() suffers from a lack of clearly-defined semantics,
      and is somewhat ambiguous to port to other architectures where the
      scope of the writeback implied by "flush" and ordering might matter,
      but MMIO would tend to imply non-cacheable anyway. Per the rationale
      in 67a3e8fe ("nd_blk: change aperture mapping from WC to WB"), the
      only existing use is actually to invalidate clean cache lines for
      ARCH_MEMREMAP_PMEM type mappings *without* writeback. Since the recent
      cleanup of the pmem API, that also now happens to be the exact purpose
      of arch_invalidate_pmem(), which would be a far more well-defined tool
      for the job.
      
      Rather than risk potentially inconsistent implementations of
      mmio_flush_range() for the sake of one callsite, streamline things by
      removing it entirely and instead move the ARCH_MEMREMAP_PMEM related
      definitions up to the libnvdimm level, so they can be shared by NFIT
      as well. This allows NFIT to be enabled for arm64.
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    • libnvdimm, btt: rework error clearing · d9b83c75
      Vishal Verma committed
      Clearing errors or badblocks during a BTT write requires sending an ACPI
      DSM, which means potentially sleeping. Since a BTT IO happens in atomic
      context (preemption disabled, spinlocks may be held), we cannot perform
      error clearing in the course of an IO. Because of this, error clearing
      for BTT IOs has hitherto been disabled.
      
      In this patch we move error clearing out of the atomic section, and thus
      re-enable error clearing with BTTs. When we are about to add a block to
      the free list, we check if it was previously marked as an error, and if
      it was, we add it to the freelist, but also set a flag that says error
      clearing will be required. We then drop the lane (ending the atomic
      context), and send a zero buffer so that the error can be cleared. The
      error flag in the free list is protected by the nd 'lane', and is set
      only by a thread while it holds that lane. When the error is cleared,
      the flag is cleared, but while holding a mutex for that freelist index.
      
      When writing, we check for two things -
      1/ If the freelist mutex is held or if the error flag is set. If so,
      this is an error block that is being (or about to be) cleared.
      2/ If the block is a known badblock based on nsio->bb
      
      The second check is required because the BTT map error flag for a map
      entry only gets set when an error LBA is read. If we write to a new
      location that may not have the map error flag set, but still might be in
      the region's badblock list, we can trigger an EIO on the write, which is
      undesirable and completely avoidable.
      
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Cc: Toshi Kani <toshi.kani@hpe.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    • libnvdimm: fix potential deadlock while clearing errors · 0930a750
      Vishal Verma committed
      With the ACPI NFIT 'DSM' methods, acpi can be called from IO paths.
      Specifically, the DSM to clear media errors is called during writes, so
      that we can provide a writes-fix-errors model.
      
      However it is easy to imagine a scenario like:
       -> write through the nvdimm driver
         -> acpi allocation
           -> writeback, causes more IO through the nvdimm driver
             -> deadlock
      
      Fix this by using memalloc_noio_{save,restore}, which sets the GFP_NOIO
      flag for the current scope when issuing commands/IOs that are expected
      to clear errors.
      
      Cc: <linux-acpi@vger.kernel.org>
      Cc: <linux-nvdimm@lists.01.org>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Robert Moore <robert.moore@intel.com>
      Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
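The scoped save/restore pattern named in the commit above can be modelled in userspace to show why nesting is safe. This is only a model: `task_flags`, `TASK_NOIO`, `noio_save()`, `noio_restore()` and the other names are hypothetical stand-ins for the per-task state that memalloc_noio_save() and memalloc_noio_restore() manipulate in the kernel.

```c
#include <stdbool.h>

/* Hypothetical per-task flag word (a model of per-task kernel state). */
static unsigned int task_flags;
#define TASK_NOIO 0x1u /* hypothetical bit meaning "allocations must not do IO" */

/* Enter a no-IO scope and return the previous state of the bit. */
static unsigned int noio_save(void)
{
    unsigned int old = task_flags & TASK_NOIO;

    task_flags |= TASK_NOIO;
    return old;
}

/* Leave the scope by restoring the bit to what it was on entry. */
static void noio_restore(unsigned int old)
{
    task_flags = (task_flags & ~TASK_NOIO) | old;
}

/* Would an allocation made right now be allowed to start IO? */
static bool alloc_may_do_io(void)
{
    return !(task_flags & TASK_NOIO);
}

/* Models issuing an error-clearing command inside a no-IO scope. */
static bool clear_error_allocates_without_io(void)
{
    unsigned int saved = noio_save();
    bool safe = !alloc_may_do_io(); /* allocations here cannot recurse into IO */

    noio_restore(saved);
    return safe;
}
```

Because the saved value is restored rather than the bit simply being cleared, an inner scope does not end an outer caller's no-IO scope early.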
    • libnvdimm, btt: cache sector_size in arena_info · 75892004
      Vishal Verma committed
      In preparation for the error clearing rework, add sector_size in the
      arena_info struct.
      Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    • libnvdimm, btt: ensure that flags were also unchanged during a map_read · 1398199d
      Vishal Verma committed
      In btt_map_read, we read the map twice to make sure that the map entry
      didn't change after we added it to the read tracking table. In
      anticipation of expanding the use of the error bit, also make sure that
      the error and zero flags are constant across the two map reads.
      Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    • libnvdimm, btt: refactor map entry operations with macros · 0595d539
      Vishal Verma committed
      Add helpers for converting a raw map entry to just the block number, or
      either of the 'e' or 'z' flags in preparation for actually using the
      error flag to mark blocks with media errors.
      Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
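Helpers of the kind described above might look like the following sketch. The bit layout is an assumption made for illustration (bit 31 as the 'z' flag, bit 30 as the 'e' flag, the low 30 bits as the block number), and the macro names are illustrative rather than quoted from the patch.

```c
#include <stdint.h>

/* Assumed layout of a raw 32-bit map entry: [z:1][e:1][block number:30]. */
#define MAP_Z_SHIFT   31
#define MAP_E_SHIFT   30
#define MAP_Z_MASK    (UINT32_C(1) << MAP_Z_SHIFT)
#define MAP_E_MASK    (UINT32_C(1) << MAP_E_SHIFT)
#define MAP_LBA_MASK  (~(MAP_Z_MASK | MAP_E_MASK))

/* Block number with both flag bits stripped. */
#define ent_lba(ent)     ((ent) & MAP_LBA_MASK)
/* 1 if the error flag is set, else 0. */
#define ent_e_flag(ent)  (!!((ent) & MAP_E_MASK))
/* 1 if the zero flag is set, else 0. */
#define ent_z_flag(ent)  (!!((ent) & MAP_Z_MASK))
```

Keeping the masking in one place means callers never open-code the shifts, which matters once the error flag starts to carry meaning.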
    • libnvdimm, btt: fix a missed NVDIMM_IO_ATOMIC case in the write path · 1db1f3ce
      Vishal Verma committed
      The IO context conversion for rw_bytes missed a case in the BTT write
      path (btt_map_write) which should've been marked as atomic.
      
      In reality this should not cause a problem, because map writes are too
      small for nsio_rw_bytes to attempt error clearing, but it should be
      fixed for posterity.
      
      Add a might_sleep() in the non-atomic section of nsio_rw_bytes so that
      things like the nfit unit tests, which don't actually sleep, can catch
      bugs like this.
      
      Cc: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    • libnvdimm, nfit: export an 'ecc_unit_size' sysfs attribute · a15797f4
      Dan Williams committed
      When the nfit driver initializes it runs an ARS (Address Range Scrub)
      operation across every pmem range. Part of that process involves
      determining the ARS capabilities of a given address range. One of the
      capabilities that is reported is the 'Clear Uncorrectable Error Range
      Length Unit Size' (see: ACPI 6.2 section 9.20.7.4 Function Index 1 -
      Query ARS Capabilities). This property is of interest to userspace
      software as it indicates the boundary at which the NVDIMM may need to
      perform read-modify-write cycles to maintain ECC blocks.
      
      Cc: Vishal Verma <vishal.l.verma@intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
  4. 31 Aug, 2017 1 commit
  5. 30 Aug, 2017 2 commits
  6. 18 Aug, 2017 4 commits
    • soc: ti: ti_sci_pm_domains: Populate name for genpd · 4dd6a997
      Dave Gerlach committed
      Commit b6a1d093 ("PM / Domains: Extend generic power domain
      debugfs") now creates a debugfs directory for each genpd based on the
      name of the genpd. Currently no name is given to the genpd created by
      the ti_sci_pm_domains driver, so we see a NULL pointer dereference on
      boot when the debugfs entry creation is attempted.
      
      Give the genpd a name before registering it to avoid this.
      
      Fixes: 52835d59 ("soc: ti: Add ti_sci_pm_domains driver")
      Signed-off-by: Dave Gerlach <d-gerlach@ti.com>
      Signed-off-by: Santosh Shilimkar <ssantosh@kernel.org>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    • nvme-pci: set cqe_seen on polled completions · e9d8a0fd
      Keith Busch committed
      Fixes: 920d13a8 ("nvme-pci: factor out the cqe reading mechanics from __nvme_process_cq")
      Reported-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Keith Busch <keith.busch@intel.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • nvme-fabrics: fix reporting of unrecognized options · 81a0b8d7
      Christoph Hellwig committed
      Only print the specified options that are not recognized, instead
      of the whole list of options.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
    • pty: fix the cached path of the pty slave file descriptor in the master · c8c03f18
      Linus Torvalds committed
      Christian Brauner reported that if you use the TIOCGPTPEER ioctl() to
      get a slave pty file descriptor, the resulting file descriptor doesn't
      look right in /proc/<pid>/fd/<fd>.  In particular, he wanted to use
      readlink() on /proc/self/fd/<fd> to get the pathname of the slave pty
      (basically implementing "ptsname{_r}()").
      
      The reason for that was that we had generated the wrong 'struct path'
      when we create the pty in ptmx_open().
      
      In particular, the dentry was correct, but the vfsmount pointed to the
      mount of the ptmx node. That _can_ be correct - in case you use
      "/dev/pts/ptmx" to open the master - but usually is not.  The normal
      case is to use /dev/ptmx, which then looks up the pts/ directory, and
      then the vfsmount of the ptmx node is obviously the /dev directory, not
      the /dev/pts/ directory.
      
      We actually did have the right vfsmount available, but in the wrong
      place (it gets looked up in 'devpts_acquire()' when we get a reference
      to the pts filesystem), and so ptmx_open() used the wrong mnt pointer.
      
      The end result of this confusion was that the pty worked fine, but
      if you did TIOCGPTPEER to get the slave side of the pty, the end result
      would also work, but have that dodgy 'struct path'.
      
      And then when doing "d_path()" on it to get the pathname, the vfsmount
      would not match the root of the pts directory, and d_path() would return
      an empty pathname thinking that the entry had escaped a bind mount into
      another mount.
      
      This fixes the problem by making devpts_acquire() return the vfsmount
      for the pts filesystem, allowing ptmx_open() to trivially just use the
      right mount for the pts dentry, and create the proper 'struct path'.
      Reported-by: Christian Brauner <christian.brauner@ubuntu.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Acked-by: Eric Biederman <ebiederm@xmission.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  7. 17 Aug, 2017 3 commits
    • of: fix DMA mask generation · ee7b1f31
      Robin Murphy committed
      Historically, DMA masks have suffered some ambiguity between whether
      they represent the range of physical memory a device can access, or the
      address bits a device is capable of driving, particularly since on many
      platforms the two are equivalent. Whilst there are some stragglers left
      (dma_max_pfn(), I'm looking at you...), the majority of DMA code has
      been cleaned up to follow the latter definition, not least since it is
      the only one which makes sense once IOMMUs are involved.
      
      In this respect, of_dma_configure() has always done the wrong thing in
      how it generates initial masks based on "dma-ranges". Although rounding
      down did not affect the TI Keystone platform where dma_addr + size is
      already a power of two, in any other case it results in a mask which is
      at best unnecessarily constrained and at worst unusable.
      
      BCM2837 illustrates the problem nicely, where we have a DMA base of 3GB
      and a size of 1GB - 16MB, giving dma_addr + size = 0xff000000 and a
      resultant mask of 0x7fffffff, which is then insufficient to even cover
      the necessary offset, effectively making all DMA addresses out-of-range.
      This has been hidden until now (mostly because we don't yet prevent
      drivers from simply overwriting this initial mask later upon probe), but
      due to recent changes elsewhere now shows up as USB being broken on
      Raspberry Pi 3.
      
      Make it right by rounding up instead of down, such that the mask
      correctly describes all possible bits the device needs to
      emit.
      
      Fixes: 9a6d7298 ("of: Calculate device DMA masks based on DT dma-range size")
      Reported-by: Stefan Wahren <stefan.wahren@i2se.com>
      Reported-by: Andreas Färber <afaerber@suse.de>
      Reported-by: Hans Verkuil <hverkuil@xs4all.nl>
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Acked-by: Rob Herring <robh@kernel.org>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
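The BCM2837 numbers in the message above can be reproduced with a short userspace calculation. The helpers below are hypothetical: `ilog2_u64()` and `bit_mask()` stand in for the kernel's ilog2() and DMA_BIT_MASK(), and the exact expressions used by the patch are an assumption.

```c
#include <stdint.h>

/* floor(log2(v)) for v > 0; userspace stand-in for ilog2(). */
static unsigned int ilog2_u64(uint64_t v)
{
    unsigned int r = 0;

    while (v >>= 1)
        r++;
    return r;
}

/* Mask with the low n bits set, for 0 <= n <= 64. */
static uint64_t bit_mask(unsigned int n)
{
    return n >= 64 ? ~UINT64_C(0) : (UINT64_C(1) << n) - 1;
}

/* Rounding down: drops the top address bit unless `end` is a power of two. */
static uint64_t mask_round_down(uint64_t end)
{
    return bit_mask(ilog2_u64(end));
}

/* Rounding up: covers every bit needed to address end - 1 (requires end >= 2). */
static uint64_t mask_round_up(uint64_t end)
{
    return bit_mask(ilog2_u64(end - 1) + 1);
}
```

For end = dma_addr + size = 0xff000000, rounding down yields 0x7fffffff and rounding up yields 0xffffffff. When the end is already a power of two, as on the TI Keystone platform mentioned above, both forms give the same mask.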
    • nvmet-fc: eliminate incorrect static markers on local variables · 369157b4
      James Smart committed
      There were 2 statics introduced that were bogus. Removed the static
      designations.
      Signed-off-by: James Smart <james.smart@broadcom.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • IB/uverbs: Fix NULL pointer dereference during device removal · 870201f9
      Maor Gottlieb committed
      As part of ib_uverbs_remove_one, which might be triggered by a
      reset flow, we trigger an IB_EVENT_DEVICE_FATAL event to the userspace
      application.
      If the device was removed after the uverbs fd was opened but before
      ib_uverbs_get_context was called, the event file will be accessed
      before it was allocated, resulting in a NULL pointer dereference:
      
      [ 72.325873] BUG: unable to handle kernel NULL pointer dereference at (null)
      ...
      [ 72.325984] IP: _raw_spin_lock_irqsave+0x22/0x40
      [ 72.327123] Call Trace:
      [ 72.327168] ib_uverbs_async_handler.isra.8+0x2e/0x160 [ib_uverbs]
      [ 72.327216] ? synchronize_srcu_expedited+0x27/0x30
      [ 72.327269] ib_uverbs_remove_one+0x120/0x2c0 [ib_uverbs]
      [ 72.327330] ib_unregister_device+0xd0/0x180 [ib_core]
      [ 72.327373] mlx5_ib_remove+0x74/0x140 [mlx5_ib]
      [ 72.327422] mlx5_remove_device+0xfb/0x110 [mlx5_core]
      [ 72.327466] mlx5_unregister_interface+0x3c/0xa0 [mlx5_core]
      [ 72.327509] mlx5_ib_cleanup+0x10/0x962 [mlx5_ib]
      [ 72.327546] SyS_delete_module+0x155/0x230
      [ 72.328472] ? exit_to_usermode_loop+0x70/0xa6
      [ 72.329370] do_syscall_64+0x54/0xc0
      [ 72.330262] entry_SYSCALL64_slow_path+0x25/0x25
      
      Fix it by checking that the user context was allocated before
      triggering the event.
      
      Fixes: 036b1063 ('IB/uverbs: Enable device removal when there are active user space applications')
      Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
      Reviewed-by: Matan Barak <matanb@mellanox.com>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
  8. 16 Aug, 2017 16 commits
  9. 15 Aug, 2017 3 commits
    • xen-blkfront: use a right index when checking requests · b15bd8cb
      Munehisa Kamata committed
      Since commit d05d7f40 ("Merge branch 'for-4.8/core' of
      git://git.kernel.dk/linux-block") and 3fc9d690 ("Merge branch
      'for-4.8/drivers' of git://git.kernel.dk/linux-block"), blkfront_resume()
      has been using the index for iterating ring_info to check requests when
      iterating blk_shadow in an inner loop. This seems to have been
      accidentally introduced during the massive rewrite of the block layer
      macros in the commits.
      
      This may cause a crash like this:
      
      [11798.057074] BUG: unable to handle kernel NULL pointer dereference at 0000000000000048
      [11798.058832] IP: [<ffffffff814411fa>] blkfront_resume+0x10a/0x610
      ....
      [11798.061063] Call Trace:
      [11798.061063]  [<ffffffff8139ce93>] xenbus_dev_resume+0x53/0x140
      [11798.061063]  [<ffffffff8139ce40>] ? xenbus_dev_probe+0x150/0x150
      [11798.061063]  [<ffffffff813f359e>] dpm_run_callback+0x3e/0x110
      [11798.061063]  [<ffffffff813f3a08>] device_resume+0x88/0x190
      [11798.061063]  [<ffffffff813f4cc0>] dpm_resume+0x100/0x2d0
      [11798.061063]  [<ffffffff813f5221>] dpm_resume_end+0x11/0x20
      [11798.061063]  [<ffffffff813950a8>] do_suspend+0xe8/0x1a0
      [11798.061063]  [<ffffffff813954bd>] shutdown_handler+0xfd/0x130
      [11798.061063]  [<ffffffff8139aba0>] ? split+0x110/0x110
      [11798.061063]  [<ffffffff8139ac26>] xenwatch_thread+0x86/0x120
      [11798.061063]  [<ffffffff810b4570>] ? prepare_to_wait_event+0x110/0x110
      [11798.061063]  [<ffffffff8108fe57>] kthread+0xd7/0xf0
      [11798.061063]  [<ffffffff811da811>] ? kfree+0x121/0x170
      [11798.061063]  [<ffffffff8108fd80>] ? kthread_park+0x60/0x60
      [11798.061063]  [<ffffffff810863b0>] ?  call_usermodehelper_exec_work+0xb0/0xb0
      [11798.061063]  [<ffffffff810864ea>] ?  call_usermodehelper_exec_async+0x13a/0x140
      [11798.061063]  [<ffffffff81534a45>] ret_from_fork+0x25/0x30
      
      Use the right index in the inner loop.
      
      Fixes: d05d7f40 ("Merge branch 'for-4.8/core' of git://git.kernel.dk/linux-block")
      Fixes: 3fc9d690 ("Merge branch 'for-4.8/drivers' of git://git.kernel.dk/linux-block")
      Signed-off-by: Munehisa Kamata <kamatam@amazon.com>
      Reviewed-by: Thomas Friebel <friebelt@amazon.de>
      Reviewed-by: Eduardo Valentin <eduval@amazon.com>
      Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Reviewed-by: Roger Pau Monne <roger.pau@citrix.com>
      Cc: xen-devel@lists.xenproject.org
      Cc: stable@vger.kernel.org
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
    • xen: fix bio vec merging · 462cdace
      Roger Pau Monne committed
      The current test for bio vec merging is not fully accurate and can be
      tricked into merging bios when certain grant combinations are used.
      The result of these malicious bio merges is a bio that extends past
      the memory page used by any of the originating bios.
      
      Take into account the following scenario, where a guest creates two
      grant references that point to the same mfn, ie: grant 1 -> mfn A,
      grant 2 -> mfn A.
      
      These references are then used in a PV block request, and mapped by
      the backend domain, thus obtaining two different pfns that point to
      the same mfn, pfn B -> mfn A, pfn C -> mfn A.
      
      If those grants happen to be used in two consecutive sectors of a disk
      IO operation becoming two different bios in the backend domain, the
      checks in xen_biovec_phys_mergeable will succeed, because bfn1 == bfn2
      (they both point to the same mfn). However due to the bio merging,
      the backend domain will end up with a bio that expands past mfn A into
      mfn A + 1.
      
      Fix this by making sure the check in xen_biovec_phys_mergeable takes
      into account the offset and the length of the bio. This basically
      replicates what's done in __BIOVEC_PHYS_MERGEABLE using mfns (bus
      addresses). While there, also remove the usage of
      __BIOVEC_PHYS_MERGEABLE, since that's already checked by the callers
      of xen_biovec_phys_mergeable.
      
      CC: stable@vger.kernel.org
      Reported-by: "Jan H. Schönherr" <jschoenh@amazon.de>
      Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
      Reviewed-by: Juergen Gross <jgross@suse.com>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
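The scenario above (two grants backed by the same frame) can be checked with a small userspace model. Everything here is a sketch under assumptions: `struct seg` and both functions are hypothetical, 4 KiB pages are assumed, and the frame-only test models just the `bfn1 == bfn2` condition named in the message rather than the exact original code.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT_SKETCH 12 /* assumes 4 KiB pages */

/* Hypothetical segment: bus frame number plus offset and length in bytes. */
struct seg {
    uint64_t bfn;
    uint32_t offset;
    uint32_t len;
};

/* Frame-only test: looks at frame numbers and ignores where the data sits. */
static bool mergeable_frames_only(const struct seg *a, const struct seg *b)
{
    return a->bfn == b->bfn || a->bfn + 1 == b->bfn;
}

/*
 * Extent-aware test: the frame holding the first byte after `a` must be
 * the frame of `b`. Byte-level adjacency is left to the caller, as the
 * commit message says the callers already check it.
 */
static bool mergeable_with_extent(const struct seg *a, const struct seg *b)
{
    uint64_t end = (uint64_t)a->offset + a->len;

    return a->bfn + (end >> PAGE_SHIFT_SKETCH) == b->bfn;
}
```

For two full-page segments that both resolve to frame A, the frame-only test accepts the merge while the extent-aware test rejects it, because the data after the first segment would have to live in frame A + 1.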
    • net/cxgb4vf: Use new PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag · b629276d
      Casey Leedom committed
      cxgb4vf Ethernet driver now queries PCIe configuration space to
      determine if it can send TLPs to it with the Relaxed Ordering
      Attribute set, just like the pf did.
      Signed-off-by: Casey Leedom <leedom@chelsio.com>
      Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
      Reviewed-by: Casey Leedom <leedom@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>