1. 08 4月, 2015 1 次提交
  2. 03 4月, 2015 1 次提交
  3. 13 3月, 2015 4 次提交
  4. 07 3月, 2015 2 次提交
  5. 06 3月, 2015 1 次提交
  6. 05 3月, 2015 5 次提交
    • T
      workqueue: fix hang involving racing cancel[_delayed]_work_sync()'s for PREEMPT_NONE · 8603e1b3
      Tejun Heo 提交于
      cancel[_delayed]_work_sync() are implemented using
      __cancel_work_timer() which grabs the PENDING bit using
      try_to_grab_pending() and then flushes the work item with PENDING set
      to prevent the on-going execution of the work item from requeueing
      itself.
      
      try_to_grab_pending() can always grab PENDING bit without blocking
      except when someone else is doing the above flushing during
      cancelation.  In that case, try_to_grab_pending() returns -ENOENT.  In
      this case, __cancel_work_timer() currently invokes flush_work().  The
      assumption is that the completion of the work item is what the other
      canceling task would be waiting for too and thus waiting for the same
      condition and retrying should allow forward progress without excessive
      busy looping
      
      Unfortunately, this doesn't work if preemption is disabled or the
      latter task has real time priority.  Let's say task A just got woken
      up from flush_work() by the completion of the target work item.  If,
      before task A starts executing, task B gets scheduled and invokes
      __cancel_work_timer() on the same work item, its try_to_grab_pending()
      will return -ENOENT as the work item is still being canceled by task A
      and flush_work() will also immediately return false as the work item
      is no longer executing.  This puts task B in a busy loop possibly
      preventing task A from executing and clearing the canceling state on
      the work item leading to a hang.
      
      task A			task B			worker
      
      						executing work
      __cancel_work_timer()
        try_to_grab_pending()
        set work CANCELING
        flush_work()
          block for work completion
      						completion, wakes up A
      			__cancel_work_timer()
      			while (forever) {
      			  try_to_grab_pending()
      			    -ENOENT as work is being canceled
      			  flush_work()
      			    false as work is no longer executing
      			}
      
      This patch removes the possible hang by updating __cancel_work_timer()
      to explicitly wait for clearing of CANCELING rather than invoking
      flush_work() after try_to_grab_pending() fails with -ENOENT.
      
      Link: http://lkml.kernel.org/g/20150206171156.GA8942@axis.com
      
      v3: bit_waitqueue() can't be used for work items defined in vmalloc
          area.  Switched to custom wake function which matches the target
          work item and exclusive wait and wakeup.
      
      v2: v1 used wake_up() on bit_waitqueue() which leads to NULL deref if
          the target bit waitqueue has wait_bit_queue's on it.  Use
          DEFINE_WAIT_BIT() and __wake_up_bit() instead.  Reported by Tomeu
          Vizoso.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Reported-by: NRabin Vincent <rabin.vincent@axis.com>
      Cc: Tomeu Vizoso <tomeu.vizoso@gmail.com>
      Cc: stable@vger.kernel.org
      Tested-by: NJesper Nilsson <jesper.nilsson@axis.com>
      Tested-by: NRabin Vincent <rabin.vincent@axis.com>
      8603e1b3
    • A
      drm/ttm: device address space != CPU address space · 54c4cd68
      Alex Deucher 提交于
      We need to store device offsets in 64 bit as the device
      address space may be larger than the CPU's.
      
      Fixes GPU init failures on radeons with 4GB or more of
      vram on 32 bit kernels.  We put vram at the start of the
      GPU's address space so the gart aperture starts at 4 GB
      causing all GPU addresses in the gart aperture to get
      truncated.
      
      bug:
      https://bugs.freedesktop.org/show_bug.cgi?id=89072
      
      [airlied: fix warning on nouveau build]
      Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
      Cc: thellstrom@vmware.com
      Acked-by: NThomas Hellstrom <thellstrom@vmware.com>
      Signed-off-by: NDave Airlie <airlied@redhat.com>
      54c4cd68
    • T
      drm/mm: Support 4 GiB and larger ranges · 440fd528
      Thierry Reding 提交于
      The current implementation is limited by the number of addresses that
      fit into an unsigned long. This causes problems on 32-bit Tegra where
      unsigned long is 32-bit but drm_mm is used to manage an IOVA space of
      4 GiB. Given the 32-bit limitation, the range is limited to 4 GiB - 1
      (or 4 GiB - 4 KiB for page granularity).
      
      This commit changes the start and size of the range to be an unsigned
      64-bit integer, thus allowing much larger ranges to be supported.
      
      [airlied: fix i915 warnings and coloring callback]
      Signed-off-by: NThierry Reding <treding@nvidia.com>
      Reviewed-by: NAlex Deucher <alexander.deucher@amd.com>
      Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: NDave Airlie <airlied@redhat.com>
      
      fixupo
      440fd528
    • R
      genirq / PM: Add flag for shared NO_SUSPEND interrupt lines · 17f48034
      Rafael J. Wysocki 提交于
      It currently is required that all users of NO_SUSPEND interrupt
      lines pass the IRQF_NO_SUSPEND flag when requesting the IRQ or the
      WARN_ON_ONCE() in irq_pm_install_action() will trigger.  That is
      done to warn about situations in which unprepared interrupt handlers
      may be run unnecessarily for suspended devices and may attempt to
      access those devices by mistake.  However, it may cause drivers
      that have no technical reasons for using IRQF_NO_SUSPEND to set
      that flag just because they happen to share the interrupt line
      with something like a timer.
      
      Moreover, the generic handling of wakeup interrupts introduced by
      commit 9ce7a258 (genirq: Simplify wakeup mechanism) only works
      for IRQs without any NO_SUSPEND users, so the drivers of wakeup
      devices needing to use shared NO_SUSPEND interrupt lines for
      signaling system wakeup generally have to detect wakeup in their
      interrupt handlers.  Thus if they happen to share an interrupt line
      with a NO_SUSPEND user, they also need to request that their
      interrupt handlers be run after suspend_device_irqs().
      
      In both cases the reason for using IRQF_NO_SUSPEND is not because
      the driver in question has a genuine need to run its interrupt
      handler after suspend_device_irqs(), but because it happens to
      share the line with some other NO_SUSPEND user.  Otherwise, the
      driver would do without IRQF_NO_SUSPEND just fine.
      
      To make it possible to specify that condition explicitly, introduce
      a new IRQ action handler flag for shared IRQs, IRQF_COND_SUSPEND,
      that, when set, will indicate to the IRQ core that the interrupt
      user is generally fine with suspending the IRQ, but it also can
      tolerate handler invocations after suspend_device_irqs() and, in
      particular, it is capable of detecting system wakeup and triggering
      it as appropriate from its interrupt handler.
      
      That will allow us to work around a problem with a shared timer
      interrupt line on at91 platforms.
      
      Link: http://marc.info/?l=linux-kernel&m=142252777602084&w=2
      Link: http://marc.info/?t=142252775300011&r=1&w=2
      Link: https://lkml.org/lkml/2014/12/15/552Reported-by: NBoris Brezillon <boris.brezillon@free-electrons.com>
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: NMark Rutland <mark.rutland@arm.com>
      17f48034
    • P
      netfilter: nf_tables: fix userdata length overflow · 86f1ec32
      Patrick McHardy 提交于
      The NFT_USERDATA_MAXLEN is defined to 256, however we only have a u8
      to store its size. Introduce a struct nft_userdata which contains a
      length field and indicate its presence using a single bit in the rule.
      
      The length field of struct nft_userdata is also a u8, however we don't
      store zero sized data, so the actual length is udata->len + 1.
      Signed-off-by: NPatrick McHardy <kaber@trash.net>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      86f1ec32
  7. 04 3月, 2015 1 次提交
    • T
      NFS: Fix a regression in the read() syscall · 874f9463
      Trond Myklebust 提交于
      When invalidating the page cache for a regular file, we want to first
      sync all dirty data to disk and then call invalidate_inode_pages2().
      The latter relies on nfs_launder_page() and nfs_release_page() to deal
      respectively with dirty pages, and unstable written pages.
      
      When commit 95905446 ("NFS: avoid deadlocks with loop-back mounted
      NFS filesystems.") changed the behaviour of nfs_release_page(), then it
      made it possible for invalidate_inode_pages2() to fail with an EBUSY.
      Unfortunately, that error is then propagated back to read().
      
      Let's therefore work around the problem for now by protecting the call
      to sync the data and invalidate_inode_pages2() so that they are atomic
      w.r.t. the addition of new writes.
      Later on, we can revisit whether or not we still need nfs_launder_page()
      and nfs_release_page().
      Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
      874f9463
  8. 03 3月, 2015 2 次提交
  9. 02 3月, 2015 3 次提交
  10. 28 2月, 2015 1 次提交
  11. 27 2月, 2015 1 次提交
  12. 26 2月, 2015 2 次提交
    • T
      OMAPDSS: fix regression with display sysfs files · a38bb793
      Tomi Valkeinen 提交于
      omapdss's sysfs directories for displays used to have 'name' file,
      giving the name for the display. This file was later renamed to
      'display_name' to avoid conflicts with i2c sysfs 'name' file. Looks like
      at least xserver-xorg-video-omap3 requires the 'name' file to be
      present.
      
      To fix the regression, this patch creates new kobjects for each display,
      allowing us to create sysfs directories for the displays. This way we
      have the whole directory for omapdss, and there will be no sysfs file
      clashes with the underlying display device's sysfs files.
      
      We can thus add the 'name' sysfs file back.
      Signed-off-by: NTomi Valkeinen <tomi.valkeinen@ti.com>
      Tested-by: NNeilBrown <neilb@suse.de>
      a38bb793
    • M
      genirq / PM: better describe IRQF_NO_SUSPEND semantics · 737eb030
      Mark Rutland 提交于
      The IRQF_NO_SUSPEND flag is intended to be used for interrupts required
      to be enabled during the suspend-resume cycle. This mostly consists of
      IPIs and timer interrupts, potentially including chained irqchip
      interrupts if these are necessary to handle timers or IPIs. If an
      interrupt does not fall into one of the aforementioned categories,
      requesting it with IRQF_NO_SUSPEND is likely incorrect.
      
      Using IRQF_NO_SUSPEND does not guarantee that the interrupt can wake the
      system from a suspended state. For an interrupt to be able to trigger a
      wakeup, it may be necessary to program various components of the system.
      In these cases it is necessary to use {enable,disabled}_irq_wake.
      
      Unfortunately, several drivers assume that IRQF_NO_SUSPEND ensures that
      an IRQ can wake up the system, and the documentation can be read
      ambiguously w.r.t. this property.
      
      This patch updates the documentation regarding IRQF_NO_SUSPEND to make
      this caveat explicit, hopefully making future misuse rarer. Cleanup of
      existing misuse will occur as part of later patch series.
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      737eb030
  13. 25 2月, 2015 1 次提交
  14. 24 2月, 2015 2 次提交
    • J
    • D
      x86/xen: allow privcmd hypercalls to be preempted · fdfd811d
      David Vrabel 提交于
      Hypercalls submitted by user space tools via the privcmd driver can
      take a long time (potentially many 10s of seconds) if the hypercall
      has many sub-operations.
      
      A fully preemptible kernel may deschedule such as task in any upcall
      called from a hypercall continuation.
      
      However, in a kernel with voluntary or no preemption, hypercall
      continuations in Xen allow event handlers to be run but the task
      issuing the hypercall will not be descheduled until the hypercall is
      complete and the ioctl returns to user space.  These long running
      tasks may also trigger the kernel's soft lockup detection.
      
      Add xen_preemptible_hcall_begin() and xen_preemptible_hcall_end() to
      bracket hypercalls that may be preempted.  Use these in the privcmd
      driver.
      
      When returning from an upcall, call xen_maybe_preempt_hcall() which
      adds a schedule point if if the current task was within a preemptible
      hypercall.
      
      Since _cond_resched() can move the task to a different CPU, clear and
      set xen_in_preemptible_hcall around the call.
      Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
      Reviewed-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
      fdfd811d
  15. 23 2月, 2015 5 次提交
  16. 22 2月, 2015 2 次提交
  17. 21 2月, 2015 2 次提交
    • D
      caif: fix a signedness bug in cfpkt_iterate() · 278f7b4f
      Dan Carpenter 提交于
      The cfpkt_iterate() function can return -EPROTO on error, but the
      function is a u16 so the negative value gets truncated to a positive
      unsigned short.  This causes a static checker warning.
      
      The only caller which might care is cffrml_receive(), when it's checking
      the frame checksum.  I modified cffrml_receive() so that it never says
      -EPROTO is a valid checksum.
      
      Also this isn't ever going to be inlined so I removed the "inline".
      Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      278f7b4f
    • G
      net: Initialize all members in skb_gro_remcsum_init() · 846cd667
      Geert Uytterhoeven 提交于
      skb_gro_remcsum_init() initializes the gro_remcsum.delta member only,
      leading to compiler warnings about a possibly uninitialized
      gro_remcsum.offset member:
      
      drivers/net/vxlan.c: In function ‘vxlan_gro_receive’:
      drivers/net/vxlan.c:602: warning: ‘grc.offset’ may be used uninitialized in this function
      net/ipv4/fou.c: In function ‘gue_gro_receive’:
      net/ipv4/fou.c:262: warning: ‘grc.offset’ may be used uninitialized in this function
      
      While these are harmless for now:
        - skb_gro_remcsum_process() sets offset before changing delta,
        - skb_gro_remcsum_cleanup() checks if delta is non-zero before
          accessing offset,
      it's safer to let the initialization function initialize all members.
      Signed-off-by: NGeert Uytterhoeven <geert@linux-m68k.org>
      Acked-by: NTom Herbert <therbert@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      846cd667
  18. 20 2月, 2015 4 次提交
    • K
      NVMe: Fix potential corruption during shutdown · 07836e65
      Keith Busch 提交于
      The driver has to end unreturned commands at some point even if the
      controller has not provided a completion. The driver tried to be safe by
      deleting IO queues prior to ending all unreturned commands. That should
      cause the controller to internally abort inflight commands, but IO queue
      deletion request does not have to be successful, so all bets are off. We
      still have to make progress, so to be extra safe, this patch doesn't
      clear a queue to release the dma mapping for a command until after the
      pci device has been disabled.
      
      This patch removes the special handling during device initialization
      so controller recovery can be done all the time. This is possible since
      initialization is not inlined with pci probe anymore.
      Reported-by: NNilish Choudhury <nilesh.choudhury@oracle.com>
      Signed-off-by: NKeith Busch <keith.busch@intel.com>
      07836e65
    • K
      NVMe: Asynchronous controller probe · 2e1d8448
      Keith Busch 提交于
      This performs the longest parts of nvme device probe in scheduled work.
      This speeds up probe significantly when multiple devices are in use.
      Signed-off-by: NKeith Busch <keith.busch@intel.com>
      2e1d8448
    • K
      NVMe: Register management handle under nvme class · b3fffdef
      Keith Busch 提交于
      This creates a new class type for nvme devices to register their
      management character devices with. This is so we do not rely on miscdev
      to provide enough minors for as many nvme devices some people plan to
      use. The previous limit was approximately 60 NVMe controllers, depending
      on the platform and kernel. Now the limit is 1M, which ought to be enough
      for anybody.
      
      Since we have a new device class, it makes sense to attach the block
      devices under this as well, so part of this patch moves the management
      handle initialization prior to the namespaces discovery.
      Signed-off-by: NKeith Busch <keith.busch@intel.com>
      b3fffdef
    • K
      NVMe: Update SCSI Inquiry VPD 83h translation · 4f1982b4
      Keith Busch 提交于
      The original translation created collisions on Inquiry VPD 83 for many
      existing devices. Newer specifications provide other ways to translate
      based on the device's version can be used to create unique identifiers.
      
      Version 1.1 provides an EUI64 field that uniquely identifies each
      namespace, and 1.2 added the longer NGUID field for the same reason.
      Both follow the IEEE EUI format and readily translate to the SCSI device
      identification EUI designator type 2h. For devices implementing either,
      the translation will use this type, defaulting to the EUI64 8-byte type if
      implemented then NGUID's 16 byte version if not. If neither are provided,
      the 1.0 translation is used, and is updated to use the SCSI String format
      to guarantee a unique identifier.
      
      Knowing when to use the new fields depends on the nvme controller's
      revision. The NVME_VS macro was not decoding this correctly, so that is
      fixed in this patch and moved to a more appropriate place.
      
      Since the Identify Namespace structure required an update for the NGUID
      field, this patch adds the remaining new 1.2 fields to the structure.
      Signed-off-by: NKeith Busch <keith.busch@intel.com>
      4f1982b4