1. 31 8月, 2013 2 次提交
  2. 27 8月, 2013 1 次提交
  3. 25 8月, 2013 1 次提交
  4. 23 8月, 2013 4 次提交
  5. 22 8月, 2013 1 次提交
    • M
      [SCSI] zfcp: fix lock imbalance by reworking request queue locking · d79ff142
      Martin Peschke 提交于
      This patch adds wait_event_interruptible_lock_irq_timeout(), which is a
      straight-forward descendant of wait_event_interruptible_timeout() and
      wait_event_interruptible_lock_irq().
      
      The zfcp driver used to call wait_event_interruptible_timeout()
      in combination with some intricate and error-prone locking. Using
      wait_event_interruptible_lock_irq_timeout() as a replacement
      nicely cleans up that locking.
      
      This rework removes a situation that resulted in a locking imbalance
      in zfcp_qdio_sbal_get():
      
      BUG: workqueue leaked lock or atomic: events/1/0xffffff00/10
          last function: zfcp_fc_wka_port_offline+0x0/0xa0 [zfcp]
      
      It was introduced by commit c2af7545
      "[SCSI] zfcp: Do not wait for SBALs on stopped queue", which had a new
      code path related to ZFCP_STATUS_ADAPTER_QDIOUP that took an early exit
      without a required lock being held. The problem occured when a
      special, non-SCSI I/O request was being submitted in process context,
      when the adapter's queues had been torn down. In this case the bug
      surfaced when the Fibre Channel port connection for a well-known address
      was closed during a concurrent adapter shut-down procedure, which is a
      rare constellation.
      
      This patch also fixes these warnings from the sparse tool (make C=1):
      
      drivers/s390/scsi/zfcp_qdio.c:224:12: warning: context imbalance in
       'zfcp_qdio_sbal_check' - wrong count at exit
      drivers/s390/scsi/zfcp_qdio.c:244:5: warning: context imbalance in
       'zfcp_qdio_sbal_get' - unexpected unlock
      
      Last but not least, we get rid of that crappy lock-unlock-lock
      sequence at the beginning of the critical section.
      
      It is okay to call zfcp_erp_adapter_reopen() with req_q_lock held.
      Reported-by: NMikulas Patocka <mpatocka@redhat.com>
      Reported-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: NMartin Peschke <mpeschke@linux.vnet.ibm.com>
      Cc: stable@vger.kernel.org #2.6.35+
      Signed-off-by: NSteffen Maier <maier@linux.vnet.ibm.com>
      Signed-off-by: NJames Bottomley <JBottomley@Parallels.com>
      d79ff142
  6. 20 8月, 2013 1 次提交
  7. 16 8月, 2013 2 次提交
    • L
      Fix TLB gather virtual address range invalidation corner cases · 2b047252
      Linus Torvalds 提交于
      Ben Tebulin reported:
      
       "Since v3.7.2 on two independent machines a very specific Git
        repository fails in 9/10 cases on git-fsck due to an SHA1/memory
        failures.  This only occurs on a very specific repository and can be
        reproduced stably on two independent laptops.  Git mailing list ran
        out of ideas and for me this looks like some very exotic kernel issue"
      
      and bisected the failure to the backport of commit 53a59fc6 ("mm:
      limit mmu_gather batching to fix soft lockups on !CONFIG_PREEMPT").
      
      That commit itself is not actually buggy, but what it does is to make it
      much more likely to hit the partial TLB invalidation case, since it
      introduces a new case in tlb_next_batch() that previously only ever
      happened when running out of memory.
      
      The real bug is that the TLB gather virtual memory range setup is subtly
      buggered.  It was introduced in commit 597e1c35 ("mm/mmu_gather:
      enable tlb flush range in generic mmu_gather"), and the range handling
      was already fixed at least once in commit e6c495a9 ("mm: fix the TLB
      range flushed when __tlb_remove_page() runs out of slots"), but that fix
      was not complete.
      
      The problem with the TLB gather virtual address range is that it isn't
      set up by the initial tlb_gather_mmu() initialization (which didn't get
      the TLB range information), but it is set up ad-hoc later by the
      functions that actually flush the TLB.  And so any such case that forgot
      to update the TLB range entries would potentially miss TLB invalidates.
      
      Rather than try to figure out exactly which particular ad-hoc range
      setup was missing (I personally suspect it's the hugetlb case in
      zap_huge_pmd(), which didn't have the same logic as zap_pte_range()
      did), this patch just gets rid of the problem at the source: make the
      TLB range information available to tlb_gather_mmu(), and initialize it
      when initializing all the other tlb gather fields.
      
      This makes the patch larger, but conceptually much simpler.  And the end
      result is much more understandable; even if you want to play games with
      partial ranges when invalidating the TLB contents in chunks, now the
      range information is always there, and anybody who doesn't want to
      bother with it won't introduce subtle bugs.
      
      Ben verified that this fixes his problem.
      Reported-bisected-and-tested-by: NBen Tebulin <tebulin@googlemail.com>
      Build-testing-by: NStephen Rothwell <sfr@canb.auug.org.au>
      Build-testing-by: NRichard Weinberger <richard.weinberger@gmail.com>
      Reviewed-by: NMichal Hocko <mhocko@suse.cz>
      Acked-by: NPeter Zijlstra <peterz@infradead.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2b047252
    • M
      net/mlx5_core: Support MANAGE_PAGES and QUERY_PAGES firmware command changes · 0a324f31
      Moshe Lazer 提交于
      In the previous QUERY_PAGES command version we used one command to get the
      required amount of boot, init and post init pages.  The new version uses the
      op_mod field to specify whether the query is for the required amount of boot,
      init or post init pages. In addition the output field size for the required
      amount of pages increased from 16 to 32 bits.
      
      In MANAGE_PAGES command the input_num_entries and output_num_entries fields
      sizes changed from 16 to 32 bits and the PAS tables offset changed to 0x10.
      
      In the pages request event the num_pages field also changed to 32 bits.
      
      In the HCA-capabilities-layout the size and location of max_qp_mcg field has
      been changed to support 24 bits.
      
      This patch isn't compatible with firmware versions < 5; however, it  turns out that the
      first GA firmware we will publish will not support previous versions so this should be OK.
      Signed-off-by: NMoshe Lazer <moshel@mellanox.com>
      Signed-off-by: NEli Cohen <eli@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0a324f31
  8. 15 8月, 2013 1 次提交
    • J
      net_sched: restore "linklayer atm" handling · 8a8e3d84
      Jesper Dangaard Brouer 提交于
      commit 56b765b7 ("htb: improved accuracy at high rates")
      broke the "linklayer atm" handling.
      
       tc class add ... htb rate X ceil Y linklayer atm
      
      The linklayer setting is implemented by modifying the rate table
      which is send to the kernel.  No direct parameter were
      transferred to the kernel indicating the linklayer setting.
      
      The commit 56b765b7 ("htb: improved accuracy at high rates")
      removed the use of the rate table system.
      
      To keep compatible with older iproute2 utils, this patch detects
      the linklayer by parsing the rate table.  It also supports future
      versions of iproute2 to send this linklayer parameter to the
      kernel directly. This is done by using the __reserved field in
      struct tc_ratespec, to convey the choosen linklayer option, but
      only using the lower 4 bits of this field.
      
      Linklayer detection is limited to speeds below 100Mbit/s, because
      at high rates the rtab is gets too inaccurate, so bad that
      several fields contain the same values, this resembling the ATM
      detect.  Fields even start to contain "0" time to send, e.g. at
      1000Mbit/s sending a 96 bytes packet cost "0", thus the rtab have
      been more broken than we first realized.
      Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8a8e3d84
  9. 14 8月, 2013 5 次提交
  10. 13 8月, 2013 1 次提交
    • O
      sched: fix the theoretical signal_wake_up() vs schedule() race · e0acd0a6
      Oleg Nesterov 提交于
      This is only theoretical, but after try_to_wake_up(p) was changed
      to check p->state under p->pi_lock the code like
      
      	__set_current_state(TASK_INTERRUPTIBLE);
      	schedule();
      
      can miss a signal. This is the special case of wait-for-condition,
      it relies on try_to_wake_up/schedule interaction and thus it does
      not need mb() between __set_current_state() and if(signal_pending).
      
      However, this __set_current_state() can move into the critical
      section protected by rq->lock, now that try_to_wake_up() takes
      another lock we need to ensure that it can't be reordered with
      "if (signal_pending(current))" check inside that section.
      
      The patch is actually one-liner, it simply adds smp_wmb() before
      spin_lock_irq(rq->lock). This is what try_to_wake_up() already
      does by the same reason.
      
      We turn this wmb() into the new helper, smp_mb__before_spinlock(),
      for better documentation and to allow the architectures to change
      the default implementation.
      
      While at it, kill smp_mb__after_lock(), it has no callers.
      
      Perhaps we can also add smp_mb__before/after_spinunlock() for
      prepare_to_wait().
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      Acked-by: NPeter Zijlstra <peterz@infradead.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      e0acd0a6
  11. 11 8月, 2013 2 次提交
  12. 10 8月, 2013 1 次提交
  13. 09 8月, 2013 1 次提交
  14. 08 8月, 2013 2 次提交
    • T
      SUNRPC: If the rpcbind channel is disconnected, fail the call to unregister · 786615bc
      Trond Myklebust 提交于
      If rpcbind causes our connection to the AF_LOCAL socket to close after
      we've registered a service, then we want to be careful about reconnecting
      since the mount namespace may have changed.
      
      By simply refusing to reconnect the AF_LOCAL socket in the case of
      unregister, we avoid the need to somehow save the mount namespace. While
      this may lead to some services not unregistering properly, it should
      be safe.
      Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
      Cc: Nix <nix@esperi.org.uk>
      Cc: Jeff Layton <jlayton@redhat.com>
      Cc: stable@vger.kernel.org # 3.9.x
      786615bc
    • R
      ACPI: Try harder to resolve _ADR collisions for bridges · 60f75b8e
      Rafael J. Wysocki 提交于
      In theory, under a given ACPI namespace node there should be only
      one child device object with _ADR whose value matches a given bus
      address exactly.  In practice, however, there are systems in which
      multiple child device objects under a given parent have _ADR matching
      exactly the same address.  In those cases we use _STA to determine
      which of the multiple matching devices is enabled, since some systems
      are known to indicate which ACPI device object to associate with the
      given physical (usually PCI) device this way.
      
      Unfortunately, as it turns out, there are systems in which many
      device objects under the same parent have _ADR matching exactly the
      same bus address and none of them has _STA, in which case they all
      should be regarded as enabled according to the spec.  Still, if
      those device objects are supposed to represent bridges (e.g. this
      is the case for device objects corresponding to PCIe ports), we can
      try harder and skip the ones that have no child device objects in the
      ACPI namespace.  With luck, we can avoid using device objects that we
      are not expected to use this way.
      
      Although this only works for bridges whose children also have ACPI
      namespace representation, it is sufficient to address graphics
      adapter detection issues on some systems, so rework the code finding
      a matching device ACPI handle for a given bus address to implement
      this idea.
      
      Introduce a new function, acpi_find_child(), taking three arguments:
      the ACPI handle of the device's parent, a bus address suitable for
      the device's bus type and a bool indicating if the device is a
      bridge and make it work as outlined above.  Reimplement the function
      currently used for this purpose, acpi_get_child(), as a call to
      acpi_find_child() with the last argument set to 'false' and make
      the PCI subsystem use acpi_find_child() with the bridge information
      passed as the last argument to it.  [Lan Tianyu notices that it is
      not sufficient to use pci_is_bridge() for that, because the device's
      subordinate pointer hasn't been set yet at this point, so use
      hdr_type instead.]
      
      This change fixes a regression introduced inadvertently by commit
      33f767d7 (ACPI: Rework acpi_get_child() to be more efficient) which
      overlooked the fact that for acpi_walk_namespace() "post-order" means
      "after all children have been visited" rather than "on the way back",
      so for device objects without children and for namespace walks of
      depth 1, as in the acpi_get_child() case, the "post-order" callbacks
      ordering is actually the same as the ordering of "pre-order" ones.
      Since that commit changed the namespace walk in acpi_get_child() to
      terminate after finding the first matching object instead of going
      through all of them and returning the last one, it effectively
      changed the result returned by that function in some rare cases and
      that led to problems (the switch from a "pre-order" to a "post-order"
      callback was supposed to prevent that from happening, but it was
      ineffective).
      
      As it turns out, the systems where the change made by commit
      33f767d7 actually matters are those where there are multiple ACPI
      device objects representing the same PCIe port (which effectively
      is a bridge).  Moreover, only one of them, and the one we are
      expected to use, has child device objects in the ACPI namespace,
      so the regression can be addressed as described above.
      
      References: https://bugzilla.kernel.org/show_bug.cgi?id=60561Reported-by: NPeter Wu <lekensteyn@gmail.com>
      Tested-by: NVladimir Lalov <mail@vlalov.com>
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Acked-by: NBjorn Helgaas <bhelgaas@google.com>
      Cc: 3.9+ <stable@vger.kernel.org> # 3.9+
      60f75b8e
  15. 07 8月, 2013 1 次提交
  16. 06 8月, 2013 2 次提交
  17. 05 8月, 2013 2 次提交
    • L
      ASoC: dapm: Implement mixer input auto-disable · 57295073
      Lars-Peter Clausen 提交于
      Some devices have the problem that if a internal audio signal source is disabled
      the output of the source becomes undefined or goes to a undesired state (E.g.
      DAC output goes to ground instead of VMID). In this case it is necessary, in
      order to avoid unwanted clicks and pops, to disable any mixer input the signal
      feeds into or to active a mute control along the path to the output. Often it is
      still desirable to expose the same mixer input control to userspace, so cerain
      paths can sill be disabled manually. This means we can not use conventional DAPM
      to manage the mixer input control. This patch implements a method for letting
      DAPM overwrite the state of a userspace visible control. I.e. DAPM will disable
      the control if the path on which the control sits becomes inactive. Userspace
      will then only see a cached copy of the controls state. Once DAPM powers the
      path up again it will sync the userspace setting with the hardware and give
      control back to userspace.
      
      To implement this a new widget type is introduced. One widget of this type will
      be created for each DAPM kcontrol which has the auto-disable feature enabled.
      For each path that is controlled by the kcontrol the widget will be connected to
      the source of that path. The new widget type behaves like a supply widget,
      which means it will power up if one of its sinks are powered up and will only
      power down if all of its sinks are powered down. In order to only have the mixer
      input enabled when the source signal is valid the new widget type will be
      disabled before all other widget types and only be enabled after all other
      widget types.
      
      E.g. consider the following simplified example. A DAC is connected to a mixer
      and the mixer has a control to enable or disable the signal from the DAC.
      
                           +-------+
        +-----+            |       |
        | DAC |-----[Ctrl]-| Mixer |
        +-----+       :    |       |
           |          :    +-------+
           |          :
          +-------------+
          | Ctrl widget |
          +-------------+
      
      If the control has the auto-disable feature enabled we'll create a widget for
      the control. This widget is connected to the DAC as it is the source for the
      mixer input. If the DAC powers up the control widget powers up and if the DAC
      powers down the control widget is powered down. As long as the control widget
      is powered down the hardware input control is kept disabled and if it is enabled
      userspace can freely change the control's state.
      Signed-off-by: NLars-Peter Clausen <lars@metafoo.de>
      Signed-off-by: NMark Brown <broonie@linaro.org>
      57295073
    • E
      busy_poll: cleanup do-nothing placeholders · bf37d2b3
      Eliezer Tamir 提交于
      When renaming ll_poll to busy poll, I introduced a typo
      in the name of the do-nothing placeholder for sk_busy_loop
      and called it sk_busy_poll.
      This broke compile when busy poll was not configured.
      Cong Wang submitted a patch to fixed that.
      This patch removes the now redundant, misspelled placeholder.
      Signed-off-by: NEliezer Tamir <eliezer.tamir@linux.intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      bf37d2b3
  18. 03 8月, 2013 2 次提交
  19. 02 8月, 2013 3 次提交
    • C
      net: rename CONFIG_NET_LL_RX_POLL to CONFIG_NET_RX_BUSY_POLL · e0d1095a
      Cong Wang 提交于
      Eliezer renames several *ll_poll to *busy_poll, but forgets
      CONFIG_NET_LL_RX_POLL, so in case of confusion, rename it too.
      
      Cc: Eliezer Tamir <eliezer.tamir@linux.intel.com>
      Cc: David S. Miller <davem@davemloft.net>
      Signed-off-by: NCong Wang <amwang@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e0d1095a
    • C
      net: fix a compile error when CONFIG_NET_LL_RX_POLL is not set · dfcefb0b
      Cong Wang 提交于
      When CONFIG_NET_LL_RX_POLL is not set, I got:
      
      net/socket.c: In function ‘sock_poll’:
      net/socket.c:1165:4: error: implicit declaration of function ‘sk_busy_loop’ [-Werror=implicit-function-declaration]
      
      Fix this by adding a nop when !CONFIG_NET_LL_RX_POLL.
      
      Cc: Eliezer Tamir <eliezer.tamir@linux.intel.com>
      Cc: David S. Miller <davem@davemloft.net>
      Signed-off-by: NCong Wang <amwang@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      dfcefb0b
    • M
      ipv6: prevent fib6_run_gc() contention · 2ac3ac8f
      Michal Kubeček 提交于
      On a high-traffic router with many processors and many IPv6 dst
      entries, soft lockup in fib6_run_gc() can occur when number of
      entries reaches gc_thresh.
      
      This happens because fib6_run_gc() uses fib6_gc_lock to allow
      only one thread to run the garbage collector but ip6_dst_gc()
      doesn't update net->ipv6.ip6_rt_last_gc until fib6_run_gc()
      returns. On a system with many entries, this can take some time
      so that in the meantime, other threads pass the tests in
      ip6_dst_gc() (ip6_rt_last_gc is still not updated) and wait for
      the lock. They then have to run the garbage collector one after
      another which blocks them for quite long.
      
      Resolve this by replacing special value ~0UL of expire parameter
      to fib6_run_gc() by explicit "force" parameter to choose between
      spin_lock_bh() and spin_trylock_bh() and call fib6_run_gc() with
      force=false if gc_thresh is reached but not max_size.
      Signed-off-by: NMichal Kubecek <mkubecek@suse.cz>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2ac3ac8f
  20. 01 8月, 2013 5 次提交