1. 02 6月, 2012 1 次提交
    • C
      cpuidle: add support for states that affect multiple cpus · 4126c019
      Colin Cross 提交于
      On some ARM SMP SoCs (OMAP4460, Tegra 2, and probably more), the
      cpus cannot be independently powered down, either due to
      sequencing restrictions (on Tegra 2, cpu 0 must be the last to
      power down), or due to HW bugs (on OMAP4460, a cpu powering up
      will corrupt the gic state unless the other cpu runs a work
      around).  Each cpu has a power state that it can enter without
      coordinating with the other cpu (usually Wait For Interrupt, or
      WFI), and one or more "coupled" power states that affect blocks
      shared between the cpus (L2 cache, interrupt controller, and
      sometimes the whole SoC).  Entering a coupled power state must
      be tightly controlled on both cpus.
      
      The easiest solution to implementing coupled cpu power states is
      to hotplug all but one cpu whenever possible, usually using a
      cpufreq governor that looks at cpu load to determine when to
      enable the secondary cpus.  This causes problems, as hotplug is an
      expensive operation, so the number of hotplug transitions must be
      minimized, leading to very slow response to loads, often on the
      order of seconds.
      
      This file implements an alternative solution, where each cpu will
      wait in the WFI state until all cpus are ready to enter a coupled
      state, at which point the coupled state function will be called
      on all cpus at approximately the same time.
      
      Once all cpus are ready to enter idle, they are woken by an smp
      cross call.  At this point, there is a chance that one of the
      cpus will find work to do, and choose not to enter idle.  A
      final pass is needed to guarantee that all cpus will call the
      power state enter function at the same time.  During this pass,
      each cpu will increment the ready counter, and continue once the
      ready counter matches the number of online coupled cpus.  If any
      cpu exits idle, the other cpus will decrement their counter and
      retry.
      
      To use coupled cpuidle states, a cpuidle driver must:
      
         Set struct cpuidle_device.coupled_cpus to the mask of all
         coupled cpus, usually the same as cpu_possible_mask if all cpus
         are part of the same cluster.  The coupled_cpus mask must be
         set in the struct cpuidle_device for each cpu.
      
         Set struct cpuidle_device.safe_state to a state that is not a
         coupled state.  This is usually WFI.
      
         Set CPUIDLE_FLAG_COUPLED in struct cpuidle_state.flags for each
         state that affects multiple cpus.
      
         Provide a struct cpuidle_state.enter function for each state
         that affects multiple cpus.  This function is guaranteed to be
         called on all cpus at approximately the same time.  The driver
         should ensure that the cpus all abort together if any cpu tries
         to abort once the function is called.
      
      update1:
      
      cpuidle: coupled: fix count of online cpus
      
      online_count was never incremented on boot, and was also counting
      cpus that were not part of the coupled set.  Fix both issues by
      introducting a new function that counts online coupled cpus, and
      call it from register as well as the hotplug notifier.
      
      update2:
      
      cpuidle: coupled: fix decrementing ready count
      
      cpuidle_coupled_set_not_ready sometimes refuses to decrement the
      ready count in order to prevent a race condition.  This makes it
      unsuitable for use when finished with idle.  Add a new function
      cpuidle_coupled_set_done that decrements both the ready count and
      waiting count, and call it after idle is complete.
      
      Cc: Amit Kucheria <amit.kucheria@linaro.org>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Trinabh Gupta <g.trinabh@gmail.com>
      Cc: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
      Reviewed-by: NSantosh Shilimkar <santosh.shilimkar@ti.com>
      Tested-by: NSantosh Shilimkar <santosh.shilimkar@ti.com>
      Reviewed-by: NKevin Hilman <khilman@ti.com>
      Tested-by: NKevin Hilman <khilman@ti.com>
      Signed-off-by: NColin Cross <ccross@android.com>
      Acked-by: NRafael J. Wysocki <rjw@sisk.pl>
      Signed-off-by: NLen Brown <len.brown@intel.com>
      4126c019
  2. 17 5月, 2012 1 次提交
  3. 16 5月, 2012 1 次提交
    • M
      usbnet: fix skb traversing races during unlink(v2) · 5b6e9bcd
      Ming Lei 提交于
      Commit 4231d47e(net/usbnet: avoid
      recursive locking in usbnet_stop()) fixes the recursive locking
      problem by releasing the skb queue lock before unlink, but may
      cause skb traversing races:
      	- after URB is unlinked and the queue lock is released,
      	the refered skb and skb->next may be moved to done queue,
      	even be released
      	- in skb_queue_walk_safe, the next skb is still obtained
      	by next pointer of the last skb
      	- so maybe trigger oops or other problems
      
      This patch extends the usage of entry->state to describe 'start_unlink'
      state, so always holding the queue(rx/tx) lock to change the state if
      the referd skb is in rx or tx queue because we need to know if the
      refered urb has been started unlinking in unlink_urbs.
      
      The other part of this patch is based on Huajun's patch:
      always traverse from head of the tx/rx queue to get skb which is
      to be unlinked but not been started unlinking.
      Signed-off-by: NHuajun Li <huajun.li.lee@gmail.com>
      Signed-off-by: NMing Lei <tom.leiming@gmail.com>
      Cc: Oliver Neukum <oneukum@suse.de>
      Cc: stable@kernel.org
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5b6e9bcd
  4. 15 5月, 2012 2 次提交
    • T
      block: fix buffer overflow when printing partition UUIDs · 05c69d29
      Tejun Heo 提交于
      6d1d8050 "block, partition: add partition_meta_info to hd_struct"
      added part_unpack_uuid() which assumes that the passed in buffer has
      enough space for sprintfing "%pU" - 37 characters including '\0'.
      
      Unfortunately, b5af921e "init: add support for root devices
      specified by partition UUID" supplied 33 bytes buffer to the function
      leading to the following panic with stackprotector enabled.
      
        Kernel panic - not syncing: stack-protector: Kernel stack corrupted in: ffffffff81b14c7e
      
        [<ffffffff815e226b>] panic+0xba/0x1c6
        [<ffffffff81b14c7e>] ? printk_all_partitions+0x259/0x26xb
        [<ffffffff810566bb>] __stack_chk_fail+0x1b/0x20
        [<ffffffff81b15c7e>] printk_all_paritions+0x259/0x26xb
        [<ffffffff81aedfe0>] mount_block_root+0x1bc/0x27f
        [<ffffffff81aee0fa>] mount_root+0x57/0x5b
        [<ffffffff81aee23b>] prepare_namespace+0x13d/0x176
        [<ffffffff8107eec0>] ? release_tgcred.isra.4+0x330/0x30
        [<ffffffff81aedd60>] kernel_init+0x155/0x15a
        [<ffffffff81087b97>] ? schedule_tail+0x27/0xb0
        [<ffffffff815f4d24>] kernel_thread_helper+0x5/0x10
        [<ffffffff81aedc0b>] ? start_kernel+0x3c5/0x3c5
        [<ffffffff815f4d20>] ? gs_change+0x13/0x13
      
      Increase the buffer size, remove the dangerous part_unpack_uuid() and
      use snprintf() directly from printk_all_partitions().
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Reported-by: NSzymon Gruszczynski <sz.gruszczynski@googlemail.com>
      Cc: Will Drewry <wad@chromium.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      05c69d29
    • G
      Bluetooth: notify userspace of security level change · a7d7723a
      Gustavo Padovan 提交于
      It fixes L2CAP socket based security level elevation during a
      connection. The HID profile needs this (for keyboards) and it is the only
      way to achieve the security level elevation when using the management
      interface to talk to the kernel (hence the management enabling patch
      being the one that exposes this issue).
      
      It enables the userspace a security level change when the socket is
      already connected and create a way to notify the socket the result of the
      request. At the moment of the request the socket is made non writable, if
      the request fails the connections closes, otherwise the socket is made
      writable again, POLL_OUT is emmited.
      Signed-off-by: NGustavo Padovan <gustavo@padovan.org>
      Acked-by: NMarcel Holtmann <marcel@holtmann.org>
      Signed-off-by: NJohan Hedberg <johan.hedberg@intel.com>
      Signed-off-by: NJohn W. Linville <linville@tuxdriver.com>
      a7d7723a
  5. 14 5月, 2012 1 次提交
    • R
      Fix blkdev.h build errors when BLOCK=n · 85fd0bc9
      Russell King 提交于
      I see builds failing with:
      
        CC [M]  drivers/mmc/host/dw_mmc.o
      In file included from drivers/mmc/host/dw_mmc.c:15:
      include/linux/blkdev.h:1404: warning: 'struct task_struct' declared inside parameter list
      include/linux/blkdev.h:1404: warning: its scope is only this definition or declaration, which is probably not what you want
      include/linux/blkdev.h:1408: warning: 'struct task_struct' declared inside parameter list
      include/linux/blkdev.h:1413: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'blk_needs_flush_plug'
      make[4]: *** [drivers/mmc/host/dw_mmc.o] Error 1
      
      This is because dw_mmc.c includes linux/blkdev.h as the very first file,
      and when CONFIG_BLOCK=n, blkdev.h omits all includes.
      
      As it requires linux/sched.h even when CONFIG_BLOCK=n, move this out of
      the #ifdef.
      Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      85fd0bc9
  6. 11 5月, 2012 4 次提交
    • J
      block: don't mark buffers beyond end of disk as mapped · 080399aa
      Jeff Moyer 提交于
      Hi,
      
      We have a bug report open where a squashfs image mounted on ppc64 would
      exhibit errors due to trying to read beyond the end of the disk.  It can
      easily be reproduced by doing the following:
      
      [root@ibm-p750e-02-lp3 ~]# ls -l install.img
      -rw-r--r-- 1 root root 142032896 Apr 30 16:46 install.img
      [root@ibm-p750e-02-lp3 ~]# mount -o loop ./install.img /mnt/test
      [root@ibm-p750e-02-lp3 ~]# dd if=/dev/loop0 of=/dev/null
      dd: reading `/dev/loop0': Input/output error
      277376+0 records in
      277376+0 records out
      142016512 bytes (142 MB) copied, 0.9465 s, 150 MB/s
      
      In dmesg, you'll find the following:
      
      squashfs: version 4.0 (2009/01/31) Phillip Lougher
      [   43.106012] attempt to access beyond end of device
      [   43.106029] loop0: rw=0, want=277410, limit=277408
      [   43.106039] Buffer I/O error on device loop0, logical block 138704
      [   43.106053] attempt to access beyond end of device
      [   43.106057] loop0: rw=0, want=277412, limit=277408
      [   43.106061] Buffer I/O error on device loop0, logical block 138705
      [   43.106066] attempt to access beyond end of device
      [   43.106070] loop0: rw=0, want=277414, limit=277408
      [   43.106073] Buffer I/O error on device loop0, logical block 138706
      [   43.106078] attempt to access beyond end of device
      [   43.106081] loop0: rw=0, want=277416, limit=277408
      [   43.106085] Buffer I/O error on device loop0, logical block 138707
      [   43.106089] attempt to access beyond end of device
      [   43.106093] loop0: rw=0, want=277418, limit=277408
      [   43.106096] Buffer I/O error on device loop0, logical block 138708
      [   43.106101] attempt to access beyond end of device
      [   43.106104] loop0: rw=0, want=277420, limit=277408
      [   43.106108] Buffer I/O error on device loop0, logical block 138709
      [   43.106112] attempt to access beyond end of device
      [   43.106116] loop0: rw=0, want=277422, limit=277408
      [   43.106120] Buffer I/O error on device loop0, logical block 138710
      [   43.106124] attempt to access beyond end of device
      [   43.106128] loop0: rw=0, want=277424, limit=277408
      [   43.106131] Buffer I/O error on device loop0, logical block 138711
      [   43.106135] attempt to access beyond end of device
      [   43.106139] loop0: rw=0, want=277426, limit=277408
      [   43.106143] Buffer I/O error on device loop0, logical block 138712
      [   43.106147] attempt to access beyond end of device
      [   43.106151] loop0: rw=0, want=277428, limit=277408
      [   43.106154] Buffer I/O error on device loop0, logical block 138713
      [   43.106158] attempt to access beyond end of device
      [   43.106162] loop0: rw=0, want=277430, limit=277408
      [   43.106166] attempt to access beyond end of device
      [   43.106169] loop0: rw=0, want=277432, limit=277408
      ...
      [   43.106307] attempt to access beyond end of device
      [   43.106311] loop0: rw=0, want=277470, limit=2774
      
      Squashfs manages to read in the end block(s) of the disk during the
      mount operation.  Then, when dd reads the block device, it leads to
      block_read_full_page being called with buffers that are beyond end of
      disk, but are marked as mapped.  Thus, it would end up submitting read
      I/O against them, resulting in the errors mentioned above.  I fixed the
      problem by modifying init_page_buffers to only set the buffer mapped if
      it fell inside of i_size.
      
      Cheers,
      Jeff
      Signed-off-by: NJeff Moyer <jmoyer@redhat.com>
      Acked-by: NNick Piggin <npiggin@kernel.dk>
      
      --
      
      Changes from v1->v2: re-used max_block, as suggested by Nick Piggin.
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      080399aa
    • N
      sctp: check cached dst before using it · e0268868
      Nicolas Dichtel 提交于
      dst_check() will take care of SA (and obsolete field), hence
      IPsec rekeying scenario is taken into account.
      Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com>
      Acked-by: NVlad Yaseivch <vyasevich@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e0268868
    • D
      Revert "net: maintain namespace isolation between vlan and real device" · 59b9997b
      David S. Miller 提交于
      This reverts commit 8a83a00b.
      
      It causes regressions for S390 devices, because it does an
      unconditional DST drop on SKBs for vlans and the QETH device
      needs the neighbour entry hung off the DST for certain things
      on transmit.
      
      Arnd can't remember exactly why he even needed this change.
      
      Conflicts:
      
      	drivers/net/macvlan.c
      	net/8021q/vlan_dev.c
      	net/core/dev.c
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      59b9997b
    • S
      tracing: Do not enable function event with enable · 9b63776f
      Steven Rostedt 提交于
      With the adding of function tracing event to perf, it caused a
      side effect that produces the following warning when enabling all
      events in ftrace:
      
       # echo 1 > /sys/kernel/debug/tracing/events/enable
      
      [console]
      event trace: Could not enable event function
      
      This is because when enabling all events via the debugfs system
      it ignores events that do not have a ->reg() function assigned.
      This was to skip over the ftrace internal events (as they are
      not TRACE_EVENTs). But as the ftrace function event now has
      a ->reg() function attached to it for use with perf, it is no
      longer ignored.
      
      Worse yet, this ->reg() function is being called when it should
      not be. It returns an error and causes the above warning to
      be printed.
      
      By adding a new event_call flag (TRACE_EVENT_FL_IGNORE_ENABLE)
      and have all ftrace internel event structures have it set,
      setting the events/enable will no longe try to incorrectly enable
      the function event and does not warn.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      9b63776f
  7. 08 5月, 2012 1 次提交
  8. 05 5月, 2012 3 次提交
    • L
      ACPI: Fix D3hot v D3cold confusion · 1cc0c998
      Lin Ming 提交于
      Before this patch, ACPI_STATE_D3 incorrectly referenced D3hot
      in some places, but D3cold in other places.
      
      After this patch, ACPI_STATE_D3 always means ACPI_STATE_D3_COLD;
      and all references to D3hot use ACPI_STATE_D3_HOT.
      
      ACPI's _PR3 method is used to enter both D3hot and D3cold states.
      What distinguishes D3hot from D3cold is the presence _PR3
      (Power Resources for D3hot)  If these resources are all ON,
      then the state is D3hot.  If _PR3 is not present,
      or all _PR0 resources for the devices are OFF,
      then the state is D3cold.
      
      This patch applies after Linux-3.4-rc1.
      A future syntax cleanup may remove ACPI_STATE_D3
      to emphasize that it always means ACPI_STATE_D3_COLD.
      Signed-off-by: NLin Ming <ming.m.lin@intel.com>
      Acked-by: NRafael J. Wysocki <rjw@sisk.pl>
      Reviewed-by: NAaron Lu <aaron.lu@amd.com>
      Signed-off-by: NLen Brown <len.brown@intel.com>
      1cc0c998
    • L
      seqlock: add 'raw_seqcount_begin()' function · 4f988f15
      Linus Torvalds 提交于
      The normal read_seqcount_begin() function will wait for any current
      writers to exit their critical region by looping until the sequence
      count is even.
      
      That "wait for sequence count to stabilize" is the right thing to do if
      the read-locker will just retry the whole operation on contention: no
      point in doing a potentially expensive reader sequence if we know at the
      beginning that we'll just end up re-doing it all.
      
      HOWEVER.  Some users don't actually retry the operation, but instead
      will abort and do the operation with proper locking.  So the sequence
      count case may be the optimistic quick case, but in the presense of
      writers you may want to do full locking in order to guarantee forward
      progress.  The prime example of this would be the RCU name lookup.
      
      And in that case, you may well be better off without the "retry early",
      and are in a rush to instead get to the failure handling.  Thus this
      "raw" interface that just returns the sequence number without testing it
      - it just forces the low bit to zero so that read_seqcount_retry() will
      always fail such a "active concurrent writer" scenario.
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      4f988f15
    • L
      Fix __read_seqcount_begin() to use ACCESS_ONCE for sequence value read · 2f624278
      Linus Torvalds 提交于
      We really need to use a ACCESS_ONCE() on the sequence value read in
      __read_seqcount_begin(), because otherwise the compiler might end up
      reloading the value in between the test and the return of it.  As a
      result, it might end up returning an odd value (which means that a write
      is in progress).
      
      If the reader is then fast enough that that odd value is still the
      current one when the read_seqcount_retry() is done, we might end up with
      a "successful" read sequence, even despite the concurrent write being
      active.
      
      In practice this probably never really happens - there just isn't
      anything else going on around the read of the sequence count, and the
      common case is that we end up having a read barrier immediately
      afterwards.
      
      So the code sequence in which gcc might decide to reaload from memory is
      small, and there's no reason to believe it would ever actually do the
      reload.  But if the compiler ever were to decide to do so, it would be
      incredibly annoying to debug.  Let's just make sure.
      
      Cc: stable@kernel.org
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2f624278
  9. 01 5月, 2012 4 次提交
    • E
      net: fix two typos in skbuff.h · d9619496
      Eric Dumazet 提交于
      fix kernel doc typos in function names
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d9619496
    • M
      efi: Add new variable attributes · 41b3254c
      Matthew Garrett 提交于
      More recent versions of the UEFI spec have added new attributes for
      variables. Add them.
      Signed-off-by: NMatthew Garrett <mjg@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      41b3254c
    • H
      asm-generic: Use __BITS_PER_LONG in statfs.h · f5c2347e
      H. Peter Anvin 提交于
      <asm-generic/statfs.h> is exported to userspace, so using
      BITS_PER_LONG is invalid.  We need to use __BITS_PER_LONG instead.
      
      This is kernel bugzilla 43165.
      Reported-by: NH.J. Lu <hjl.tools@gmail.com>
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      Link: http://lkml.kernel.org/r/1335465916-16965-1-git-send-email-hpa@linux.intel.comAcked-by: NArnd Bergmann <arnd@arndb.de>
      Cc: <stable@vger.kernel.org>
      f5c2347e
    • E
      net: fix sk_sockets_allocated_read_positive · 518fbf9c
      Eric Dumazet 提交于
      Denys Fedoryshchenko reported frequent crashes on a proxy server and kindly
      provided a lockdep report that explains it all :
      
        [  762.903868]
        [  762.903880] =================================
        [  762.903890] [ INFO: inconsistent lock state ]
        [  762.903903] 3.3.4-build-0061 #8 Not tainted
        [  762.904133] ---------------------------------
        [  762.904344] inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
        [  762.904542] squid/1603 [HC0[0]:SC0[0]:HE1:SE1] takes:
        [  762.904542]  (key#3){+.?...}, at: [<c0232cc4>]
      __percpu_counter_sum+0xd/0x58
        [  762.904542] {IN-SOFTIRQ-W} state was registered at:
        [  762.904542]   [<c0158b84>] __lock_acquire+0x284/0xc26
        [  762.904542]   [<c01598e8>] lock_acquire+0x71/0x85
        [  762.904542]   [<c0349765>] _raw_spin_lock+0x33/0x40
        [  762.904542]   [<c0232c93>] __percpu_counter_add+0x58/0x7c
        [  762.904542]   [<c02cfde1>] sk_clone_lock+0x1e5/0x200
        [  762.904542]   [<c0303ee4>] inet_csk_clone_lock+0xe/0x78
        [  762.904542]   [<c0315778>] tcp_create_openreq_child+0x1b/0x404
        [  762.904542]   [<c031339c>] tcp_v4_syn_recv_sock+0x32/0x1c1
        [  762.904542]   [<c031615a>] tcp_check_req+0x1fd/0x2d7
        [  762.904542]   [<c0313f77>] tcp_v4_do_rcv+0xab/0x194
        [  762.904542]   [<c03153bb>] tcp_v4_rcv+0x3b3/0x5cc
        [  762.904542]   [<c02fc0c4>] ip_local_deliver_finish+0x13a/0x1e9
        [  762.904542]   [<c02fc539>] NF_HOOK.clone.11+0x46/0x4d
        [  762.904542]   [<c02fc652>] ip_local_deliver+0x41/0x45
        [  762.904542]   [<c02fc4d1>] ip_rcv_finish+0x31a/0x33c
        [  762.904542]   [<c02fc539>] NF_HOOK.clone.11+0x46/0x4d
        [  762.904542]   [<c02fc857>] ip_rcv+0x201/0x23e
        [  762.904542]   [<c02daa3a>] __netif_receive_skb+0x319/0x368
        [  762.904542]   [<c02dac07>] netif_receive_skb+0x4e/0x7d
        [  762.904542]   [<c02dacf6>] napi_skb_finish+0x1e/0x34
        [  762.904542]   [<c02db122>] napi_gro_receive+0x20/0x24
        [  762.904542]   [<f85d1743>] e1000_receive_skb+0x3f/0x45 [e1000e]
        [  762.904542]   [<f85d3464>] e1000_clean_rx_irq+0x1f9/0x284 [e1000e]
        [  762.904542]   [<f85d3926>] e1000_clean+0x62/0x1f4 [e1000e]
        [  762.904542]   [<c02db228>] net_rx_action+0x90/0x160
        [  762.904542]   [<c012a445>] __do_softirq+0x7b/0x118
        [  762.904542] irq event stamp: 156915469
        [  762.904542] hardirqs last  enabled at (156915469): [<c019b4f4>]
      __slab_alloc.clone.58.clone.63+0xc4/0x2de
        [  762.904542] hardirqs last disabled at (156915468): [<c019b452>]
      __slab_alloc.clone.58.clone.63+0x22/0x2de
        [  762.904542] softirqs last  enabled at (156915466): [<c02ce677>]
      lock_sock_nested+0x64/0x6c
        [  762.904542] softirqs last disabled at (156915464): [<c0349914>]
      _raw_spin_lock_bh+0xe/0x45
        [  762.904542]
        [  762.904542] other info that might help us debug this:
        [  762.904542]  Possible unsafe locking scenario:
        [  762.904542]
        [  762.904542]        CPU0
        [  762.904542]        ----
        [  762.904542]   lock(key#3);
        [  762.904542]   <Interrupt>
        [  762.904542]     lock(key#3);
        [  762.904542]
        [  762.904542]  *** DEADLOCK ***
        [  762.904542]
        [  762.904542] 1 lock held by squid/1603:
        [  762.904542]  #0:  (sk_lock-AF_INET){+.+.+.}, at: [<c03055c0>]
      lock_sock+0xa/0xc
        [  762.904542]
        [  762.904542] stack backtrace:
        [  762.904542] Pid: 1603, comm: squid Not tainted 3.3.4-build-0061 #8
        [  762.904542] Call Trace:
        [  762.904542]  [<c0347b73>] ? printk+0x18/0x1d
        [  762.904542]  [<c015873a>] valid_state+0x1f6/0x201
        [  762.904542]  [<c0158816>] mark_lock+0xd1/0x1bb
        [  762.904542]  [<c015876b>] ? mark_lock+0x26/0x1bb
        [  762.904542]  [<c015805d>] ? check_usage_forwards+0x77/0x77
        [  762.904542]  [<c0158bf8>] __lock_acquire+0x2f8/0xc26
        [  762.904542]  [<c0159b8e>] ? mark_held_locks+0x5d/0x7b
        [  762.904542]  [<c0159cf6>] ? trace_hardirqs_on+0xb/0xd
        [  762.904542]  [<c0158dd4>] ? __lock_acquire+0x4d4/0xc26
        [  762.904542]  [<c01598e8>] lock_acquire+0x71/0x85
        [  762.904542]  [<c0232cc4>] ? __percpu_counter_sum+0xd/0x58
        [  762.904542]  [<c0349765>] _raw_spin_lock+0x33/0x40
        [  762.904542]  [<c0232cc4>] ? __percpu_counter_sum+0xd/0x58
        [  762.904542]  [<c0232cc4>] __percpu_counter_sum+0xd/0x58
        [  762.904542]  [<c02cebc4>] __sk_mem_schedule+0xdd/0x1c7
        [  762.904542]  [<c02d178d>] ? __alloc_skb+0x76/0x100
        [  762.904542]  [<c0305e8e>] sk_wmem_schedule+0x21/0x2d
        [  762.904542]  [<c0306370>] sk_stream_alloc_skb+0x42/0xaa
        [  762.904542]  [<c0306567>] tcp_sendmsg+0x18f/0x68b
        [  762.904542]  [<c031f3dc>] ? ip_fast_csum+0x30/0x30
        [  762.904542]  [<c0320193>] inet_sendmsg+0x53/0x5a
        [  762.904542]  [<c02cb633>] sock_aio_write+0xd2/0xda
        [  762.904542]  [<c015876b>] ? mark_lock+0x26/0x1bb
        [  762.904542]  [<c01a1017>] do_sync_write+0x9f/0xd9
        [  762.904542]  [<c01a2111>] ? file_free_rcu+0x2f/0x2f
        [  762.904542]  [<c01a17a1>] vfs_write+0x8f/0xab
        [  762.904542]  [<c01a284d>] ? fget_light+0x75/0x7c
        [  762.904542]  [<c01a1900>] sys_write+0x3d/0x5e
        [  762.904542]  [<c0349ec9>] syscall_call+0x7/0xb
        [  762.904542]  [<c0340000>] ? rp_sidt+0x41/0x83
      
      Bug is that sk_sockets_allocated_read_positive() calls
      percpu_counter_sum_positive() without BH being disabled.
      
      This bug was added in commit 180d8cd9
      (foundations of per-cgroup memory pressure controlling.), since previous
      code was using percpu_counter_read_positive() which is IRQ safe.
      
      In __sk_mem_schedule() we dont need the precise count of allocated
      sockets and can revert to previous behavior.
      Reported-by: NDenys Fedoryshchenko <denys@visp.net.lb>
      Sined-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Glauber Costa <glommer@parallels.com>
      Acked-by: NNeal Cardwell <ncardwell@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      518fbf9c
  10. 30 4月, 2012 3 次提交
    • H
      ipvs: kernel oops - do_ip_vs_get_ctl · 8537de8a
      Hans Schillstrom 提交于
      Change order of init so netns init is ready
      when register ioctl and netlink.
      
      Ver2
      	Whitespace fixes and __init added.
      Reported-by: N"Ryan O'Hara" <rohara@redhat.com>
      Signed-off-by: NHans Schillstrom <hans.schillstrom@ericsson.com>
      Acked-by: NJulian Anastasov <ja@ssi.bg>
      Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: NSimon Horman <horms@verge.net.au>
      8537de8a
    • H
      ipvs: take care of return value from protocol init_netns · 582b8e3e
      Hans Schillstrom 提交于
      ip_vs_create_timeout_table() can return NULL
      All functions protocol init_netns is affected of this patch.
      Signed-off-by: NHans Schillstrom <hans.schillstrom@ericsson.com>
      Acked-by: NJulian Anastasov <ja@ssi.bg>
      Signed-off-by: NSimon Horman <horms@verge.net.au>
      582b8e3e
    • L
      pipes: add a "packetized pipe" mode for writing · 9883035a
      Linus Torvalds 提交于
      The actual internal pipe implementation is already really about
      individual packets (called "pipe buffers"), and this simply exposes that
      as a special packetized mode.
      
      When we are in the packetized mode (marked by O_DIRECT as suggested by
      Alan Cox), a write() on a pipe will not merge the new data with previous
      writes, so each write will get a pipe buffer of its own.  The pipe
      buffer is then marked with the PIPE_BUF_FLAG_PACKET flag, which in turn
      will tell the reader side to break the read at that boundary (and throw
      away any partial packet contents that do not fit in the read buffer).
      
      End result: as long as you do writes less than PIPE_BUF in size (so that
      the pipe doesn't have to split them up), you can now treat the pipe as a
      packet interface, where each read() system call will read one packet at
      a time.  You can just use a sufficiently big read buffer (PIPE_BUF is
      sufficient, since bigger than that doesn't guarantee atomicity anyway),
      and the return value of the read() will naturally give you the size of
      the packet.
      
      NOTE! We do not support zero-sized packets, and zero-sized reads and
      writes to a pipe continue to be no-ops.  Also note that big packets will
      currently be split at write time, but that the size at which that
      happens is not really specified (except that it's bigger than PIPE_BUF).
      Currently that limit is the system page size, but we might want to
      explicitly support bigger packets some day.
      
      The main user for this is going to be the autofs packet interface,
      allowing us to stop having to care so deeply about exact packet sizes
      (which have had bugs with 32/64-bit compatibility modes).  But user
      space can create packetized pipes with "pipe2(fd, O_DIRECT)", which will
      fail with an EINVAL on kernels that do not support this interface.
      Tested-by: NMichael Tokarev <mjt@tls.msk.ru>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Cc: David Miller <davem@davemloft.net>
      Cc: Ian Kent <raven@themaw.net>
      Cc: Thomas Meyer <thomas@m3y3r.de>
      Cc: stable@kernel.org  # needed for systemd/autofs interaction fix
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      9883035a
  11. 28 4月, 2012 1 次提交
  12. 27 4月, 2012 1 次提交
  13. 26 4月, 2012 2 次提交
  14. 25 4月, 2012 1 次提交
    • A
      USB: EHCI: fix crash during suspend on ASUS computers · 151b6128
      Alan Stern 提交于
      This patch (as1545) fixes a problem affecting several ASUS computers:
      The machine crashes or corrupts memory when going into suspend if the
      ehci-hcd driver is bound to any controllers.  Users have been forced
      to unbind or unload ehci-hcd before putting their systems to sleep.
      
      After extensive testing, it was determined that the machines don't
      like going into suspend when any EHCI controllers are in the PCI D3
      power state.  Presumably this is a firmware bug, but there's nothing
      we can do about it except to avoid putting the controllers in D3
      during system sleep.
      
      The patch adds a new flag to indicate whether the problem is present,
      and avoids changing the controller's power state if the flag is set.
      Runtime suspend is unaffected; this matters only for system suspend.
      However as a side effect, the controller will not respond to remote
      wakeup requests while the system is asleep.  Hence USB wakeup is not
      functional -- but of course, this is already true in the current state
      of affairs.
      
      This fixes Bugzilla #42728.
      Signed-off-by: NAlan Stern <stern@rowland.harvard.edu>
      Tested-by: NSteven Rostedt <rostedt@goodmis.org>
      Tested-by: NAndrey Rahmatullin <wrar@wrar.name>
      Tested-by: NOleksij Rempel (fishor) <bug-track@fisher-privat.net>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      151b6128
  15. 24 4月, 2012 2 次提交
  16. 23 4月, 2012 5 次提交
  17. 21 4月, 2012 6 次提交
  18. 18 4月, 2012 1 次提交