1. 07 3月, 2015 2 次提交
  2. 06 3月, 2015 1 次提交
  3. 05 3月, 2015 3 次提交
  4. 04 3月, 2015 1 次提交
    • T
      NFS: Fix a regression in the read() syscall · 874f9463
      Trond Myklebust 提交于
      When invalidating the page cache for a regular file, we want to first
      sync all dirty data to disk and then call invalidate_inode_pages2().
      The latter relies on nfs_launder_page() and nfs_release_page() to deal
      respectively with dirty pages, and unstable written pages.
      
      When commit 95905446 ("NFS: avoid deadlocks with loop-back mounted
      NFS filesystems.") changed the behaviour of nfs_release_page(), then it
      made it possible for invalidate_inode_pages2() to fail with an EBUSY.
      Unfortunately, that error is then propagated back to read().
      
      Let's therefore work around the problem for now by protecting the call
      to sync the data and invalidate_inode_pages2() so that they are atomic
      w.r.t. the addition of new writes.
      Later on, we can revisit whether or not we still need nfs_launder_page()
      and nfs_release_page().
      Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
      874f9463
  5. 03 3月, 2015 1 次提交
  6. 02 3月, 2015 3 次提交
  7. 28 2月, 2015 1 次提交
  8. 27 2月, 2015 1 次提交
  9. 26 2月, 2015 1 次提交
    • M
      genirq / PM: better describe IRQF_NO_SUSPEND semantics · 737eb030
      Mark Rutland 提交于
      The IRQF_NO_SUSPEND flag is intended to be used for interrupts required
      to be enabled during the suspend-resume cycle. This mostly consists of
      IPIs and timer interrupts, potentially including chained irqchip
      interrupts if these are necessary to handle timers or IPIs. If an
      interrupt does not fall into one of the aforementioned categories,
      requesting it with IRQF_NO_SUSPEND is likely incorrect.
      
      Using IRQF_NO_SUSPEND does not guarantee that the interrupt can wake the
      system from a suspended state. For an interrupt to be able to trigger a
      wakeup, it may be necessary to program various components of the system.
      In these cases it is necessary to use {enable,disabled}_irq_wake.
      
      Unfortunately, several drivers assume that IRQF_NO_SUSPEND ensures that
      an IRQ can wake up the system, and the documentation can be read
      ambiguously w.r.t. this property.
      
      This patch updates the documentation regarding IRQF_NO_SUSPEND to make
      this caveat explicit, hopefully making future misuse rarer. Cleanup of
      existing misuse will occur as part of later patch series.
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      737eb030
  10. 25 2月, 2015 1 次提交
  11. 24 2月, 2015 2 次提交
    • J
    • D
      x86/xen: allow privcmd hypercalls to be preempted · fdfd811d
      David Vrabel 提交于
      Hypercalls submitted by user space tools via the privcmd driver can
      take a long time (potentially many 10s of seconds) if the hypercall
      has many sub-operations.
      
      A fully preemptible kernel may deschedule such as task in any upcall
      called from a hypercall continuation.
      
      However, in a kernel with voluntary or no preemption, hypercall
      continuations in Xen allow event handlers to be run but the task
      issuing the hypercall will not be descheduled until the hypercall is
      complete and the ioctl returns to user space.  These long running
      tasks may also trigger the kernel's soft lockup detection.
      
      Add xen_preemptible_hcall_begin() and xen_preemptible_hcall_end() to
      bracket hypercalls that may be preempted.  Use these in the privcmd
      driver.
      
      When returning from an upcall, call xen_maybe_preempt_hcall() which
      adds a schedule point if if the current task was within a preemptible
      hypercall.
      
      Since _cond_resched() can move the task to a different CPU, clear and
      set xen_in_preemptible_hcall around the call.
      Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
      Reviewed-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
      fdfd811d
  12. 23 2月, 2015 5 次提交
  13. 22 2月, 2015 2 次提交
  14. 21 2月, 2015 2 次提交
    • D
      caif: fix a signedness bug in cfpkt_iterate() · 278f7b4f
      Dan Carpenter 提交于
      The cfpkt_iterate() function can return -EPROTO on error, but the
      function is a u16 so the negative value gets truncated to a positive
      unsigned short.  This causes a static checker warning.
      
      The only caller which might care is cffrml_receive(), when it's checking
      the frame checksum.  I modified cffrml_receive() so that it never says
      -EPROTO is a valid checksum.
      
      Also this isn't ever going to be inlined so I removed the "inline".
      Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      278f7b4f
    • G
      net: Initialize all members in skb_gro_remcsum_init() · 846cd667
      Geert Uytterhoeven 提交于
      skb_gro_remcsum_init() initializes the gro_remcsum.delta member only,
      leading to compiler warnings about a possibly uninitialized
      gro_remcsum.offset member:
      
      drivers/net/vxlan.c: In function ‘vxlan_gro_receive’:
      drivers/net/vxlan.c:602: warning: ‘grc.offset’ may be used uninitialized in this function
      net/ipv4/fou.c: In function ‘gue_gro_receive’:
      net/ipv4/fou.c:262: warning: ‘grc.offset’ may be used uninitialized in this function
      
      While these are harmless for now:
        - skb_gro_remcsum_process() sets offset before changing delta,
        - skb_gro_remcsum_cleanup() checks if delta is non-zero before
          accessing offset,
      it's safer to let the initialization function initialize all members.
      Signed-off-by: NGeert Uytterhoeven <geert@linux-m68k.org>
      Acked-by: NTom Herbert <therbert@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      846cd667
  15. 20 2月, 2015 6 次提交
    • K
      NVMe: Fix potential corruption during shutdown · 07836e65
      Keith Busch 提交于
      The driver has to end unreturned commands at some point even if the
      controller has not provided a completion. The driver tried to be safe by
      deleting IO queues prior to ending all unreturned commands. That should
      cause the controller to internally abort inflight commands, but IO queue
      deletion request does not have to be successful, so all bets are off. We
      still have to make progress, so to be extra safe, this patch doesn't
      clear a queue to release the dma mapping for a command until after the
      pci device has been disabled.
      
      This patch removes the special handling during device initialization
      so controller recovery can be done all the time. This is possible since
      initialization is not inlined with pci probe anymore.
      Reported-by: NNilish Choudhury <nilesh.choudhury@oracle.com>
      Signed-off-by: NKeith Busch <keith.busch@intel.com>
      07836e65
    • K
      NVMe: Asynchronous controller probe · 2e1d8448
      Keith Busch 提交于
      This performs the longest parts of nvme device probe in scheduled work.
      This speeds up probe significantly when multiple devices are in use.
      Signed-off-by: NKeith Busch <keith.busch@intel.com>
      2e1d8448
    • K
      NVMe: Register management handle under nvme class · b3fffdef
      Keith Busch 提交于
      This creates a new class type for nvme devices to register their
      management character devices with. This is so we do not rely on miscdev
      to provide enough minors for as many nvme devices some people plan to
      use. The previous limit was approximately 60 NVMe controllers, depending
      on the platform and kernel. Now the limit is 1M, which ought to be enough
      for anybody.
      
      Since we have a new device class, it makes sense to attach the block
      devices under this as well, so part of this patch moves the management
      handle initialization prior to the namespaces discovery.
      Signed-off-by: NKeith Busch <keith.busch@intel.com>
      b3fffdef
    • K
      NVMe: Update SCSI Inquiry VPD 83h translation · 4f1982b4
      Keith Busch 提交于
      The original translation created collisions on Inquiry VPD 83 for many
      existing devices. Newer specifications provide other ways to translate
      based on the device's version can be used to create unique identifiers.
      
      Version 1.1 provides an EUI64 field that uniquely identifies each
      namespace, and 1.2 added the longer NGUID field for the same reason.
      Both follow the IEEE EUI format and readily translate to the SCSI device
      identification EUI designator type 2h. For devices implementing either,
      the translation will use this type, defaulting to the EUI64 8-byte type if
      implemented then NGUID's 16 byte version if not. If neither are provided,
      the 1.0 translation is used, and is updated to use the SCSI String format
      to guarantee a unique identifier.
      
      Knowing when to use the new fields depends on the nvme controller's
      revision. The NVME_VS macro was not decoding this correctly, so that is
      fixed in this patch and moved to a more appropriate place.
      
      Since the Identify Namespace structure required an update for the NGUID
      field, this patch adds the remaining new 1.2 fields to the structure.
      Signed-off-by: NKeith Busch <keith.busch@intel.com>
      4f1982b4
    • K
      NVMe: Metadata format support · e1e5e564
      Keith Busch 提交于
      Adds support for NVMe metadata formats and exposes block devices for
      all namespaces regardless of their format. Namespace formats that are
      unusable will have disk capacity set to 0, but a handle to the block
      device is created to simplify device management. A namespace is not
      usable when the format requires host interleave block and metadata in
      single buffer, has no provisioned storage, or has better data but failed
      to register with blk integrity.
      
      The namespace has to be scanned in two phases to support separate
      metadata formats. The first establishes the sector size and capacity
      prior to invoking add_disk. If metadata is required, the capacity will
      be temporarilly set to 0 until it can be revalidated and registered with
      the integrity extenstions after add_disk completes.
      
      The driver relies on the integrity extensions to provide the metadata
      buffer. NVMe requires this be a single physically contiguous region,
      so only one integrity segment is allowed per command. If the metadata
      is used for T10 PI, the driver provides mappings to save and restore
      the reftag physical block translation. The driver provides no-op
      functions for generate and verify if metadata is not used for protection
      information. This way the setup is always provided by the block layer.
      
      If a request does not supply a required metadata buffer, the command
      is failed with bad address. This could only happen if a user manually
      disables verify/generate on such a disk. The only exception to where
      this is okay is if the controller is capable of stripping/generating
      the metadata, which is possible on some types of formats.
      
      The metadata scatter gather list now occupies the spot in the nvme_iod
      that used to be used to link retryable IOD's, but we don't do that
      anymore, so the field was unused.
      Signed-off-by: NKeith Busch <keith.busch@intel.com>
      e1e5e564
    • D
      kdb: Avoid printing KERN_ levels to consoles · f7d4ca8b
      Daniel Thompson 提交于
      Currently when kdb traps printk messages then the raw log level prefix
      (consisting of '\001' followed by a numeral) does not get stripped off
      before the message is issued to the various I/O handlers supported by
      kdb. This causes annoying visual noise as well as causing problems
      grepping for ^. It is also a change of behaviour compared to normal usage
      of printk() usage. For example <SysRq>-h ends up with different output to
      that of kdb's "sr h".
      
      This patch addresses the problem by stripping log levels from messages
      before they are issued to the I/O handlers. printk() which can also
      act as an i/o handler in some cases is special cased; if the caller
      provided a log level then the prefix will be preserved when sent to
      printk().
      
      The addition of non-printable characters to the output of kdb commands is a
      regression, albeit and extremely elderly one, introduced by commit
      04d2c8c8 ("printk: convert the format for KERN_<LEVEL> to a 2 byte
      pattern"). Note also that this patch does *not* restore the original
      behaviour from v3.5. Instead it makes printk() from within a kdb command
      display the message without any prefix (i.e. like printk() normally does).
      Signed-off-by: NDaniel Thompson <daniel.thompson@linaro.org>
      Cc: Joe Perches <joe@perches.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NJason Wessel <jason.wessel@windriver.com>
      f7d4ca8b
  16. 19 2月, 2015 8 次提交