1. 26 9月, 2012 2 次提交
    • F
      rcu: New rcu_user_enter_after_irq() and rcu_user_exit_after_irq() APIs · 19dd1591
      Frederic Weisbecker 提交于
      In some cases, it is necessary to enter or exit userspace-RCU-idle mode
      from an interrupt handler, for example, if some other CPU sends this
      CPU a resched IPI.  In this case, the current CPU would enter the IPI
      handler in userspace-RCU-idle mode, but would need to exit the IPI handler
      after having exited that mode.
      
      To allow this to work, this commit adds two new APIs to TREE_RCU:
      
      - rcu_user_enter_after_irq(). This must be called from an interrupt between
      rcu_irq_enter() and rcu_irq_exit().  After the irq calls rcu_irq_exit(),
      the irq handler will return into an RCU extended quiescent state.
      In theory, this interrupt is never a nested interrupt, but in practice
      it might interrupt softirq, which looks to RCU like a nested interrupt.
      
      - rcu_user_exit_after_irq(). This must be called from a non-nesting
      interrupt, interrupting an RCU extended quiescent state, also
      between rcu_irq_enter() and rcu_irq_exit(). After the irq calls
      rcu_irq_exit(), the irq handler will return in an RCU non-quiescent
      state.
      
      [ Combined with "Allow calls to rcu_exit_user_irq from nesting irqs." ]
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: NJosh Triplett <josh@joshtriplett.org>
      19dd1591
    • F
      rcu: New rcu_user_enter() and rcu_user_exit() APIs · adf5091e
      Frederic Weisbecker 提交于
      RCU currently insists that only idle tasks can enter RCU idle mode, which
      prohibits an adaptive tickless kernel (AKA nohz cpusets), which in turn
      would mean that usermode execution would always take scheduling-clock
      interrupts, even when there is only one task runnable on the CPU in
      question.
      
      This commit therefore adds rcu_user_enter() and rcu_user_exit(), which
      allow non-idle tasks to enter RCU idle mode.  These are quite similar
      to rcu_idle_enter() and rcu_idle_exit(), respectively, except that they
      omit the idle-task checks.
      
      [ Updated to use "user" flag rather than separate check functions. ]
      
      [ paulmck: Updated to drop exports of new functions based on Josh's patch
        getting rid of the need for them. ]
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Alessio Igor Bogani <abogani@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
      Cc: Geoff Levand <geoff@infradead.org>
      Cc: Gilad Ben Yossef <gilad@benyossef.com>
      Cc: Hakan Akkan <hakanakkan@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Kevin Hilman <khilman@ti.com>
      Cc: Max Krasnyansky <maxk@qualcomm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephen Hemminger <shemminger@vyatta.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Sven-Thorsten Dietrich <thebigcorporation@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NJosh Triplett <josh@joshtriplett.org>
      adf5091e
  2. 23 9月, 2012 2 次提交
  3. 21 9月, 2012 2 次提交
    • M
      xfrm_user: ensure user supplied esn replay window is valid · ecd79187
      Mathias Krause 提交于
      The current code fails to ensure that the netlink message actually
      contains as many bytes as the header indicates. If a user creates a new
      state or updates an existing one but does not supply the bytes for the
      whole ESN replay window, the kernel copies random heap bytes into the
      replay bitmap, the ones happen to follow the XFRMA_REPLAY_ESN_VAL
      netlink attribute. This leads to following issues:
      
      1. The replay window has random bits set confusing the replay handling
         code later on.
      
      2. A malicious user could use this flaw to leak up to ~3.5kB of heap
         memory when she has access to the XFRM netlink interface (requires
         CAP_NET_ADMIN).
      
      Known users of the ESN replay window are strongSwan and Steffen's
      iproute2 patch (<http://patchwork.ozlabs.org/patch/85962/>). The latter
      uses the interface with a bitmap supplied while the former does not.
      strongSwan is therefore prone to run into issue 1.
      
      To fix both issues without breaking existing userland allow using the
      XFRMA_REPLAY_ESN_VAL netlink attribute with either an empty bitmap or a
      fully specified one. For the former case we initialize the in-kernel
      bitmap with zero, for the latter we copy the user supplied bitmap. For
      state updates the full bitmap must be supplied.
      
      To prevent overflows in the bitmap length calculation the maximum size
      of bmp_len is limited to 128 by this patch -- resulting in a maximum
      replay window of 4096 packets. This should be sufficient for all real
      life scenarios (RFC 4303 recommends a default replay window size of 64).
      
      Cc: Steffen Klassert <steffen.klassert@secunet.com>
      Cc: Martin Willi <martin@revosec.ch>
      Cc: Ben Hutchings <bhutchings@solarflare.com>
      Signed-off-by: NMathias Krause <minipli@googlemail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ecd79187
    • W
      tracing: Don't call page_to_pfn() if page is NULL · 85f2a2ef
      Wen Congyang 提交于
      When allocating memory fails, page is NULL. page_to_pfn() will
      cause the kernel panicked if we don't use sparsemem vmemmap.
      
      Link: http://lkml.kernel.org/r/505AB1FF.8020104@cn.fujitsu.com
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: stable <stable@vger.kernel.org>
      Acked-by: NMel Gorman <mel@csn.ul.ie>
      Reviewed-by: NMinchan Kim <minchan@kernel.org>
      Signed-off-by: NWen Congyang <wency@cn.fujitsu.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      85f2a2ef
  4. 19 9月, 2012 5 次提交
  5. 18 9月, 2012 3 次提交
    • A
      compiler.h: add __visible · 9a858dc7
      Andi Kleen 提交于
      gcc 4.6+ has support for a externally_visible attribute that prevents the
      optimizer from optimizing unused symbols away.  Add a __visible macro to
      use it with that compiler version or later.
      
      This is used (at least) by the "Link Time Optimization" patchset.
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      9a858dc7
    • C
      include/net/sock.h: squelch compiler warning in sk_rmem_schedule() · 35c448a8
      Chuck Lever 提交于
      This warning:
      
        In file included from linux/include/linux/tcp.h:227:0,
                         from linux/include/linux/ipv6.h:221,
                         from linux/include/net/ipv6.h:16,
                         from linux/include/linux/sunrpc/clnt.h:26,
                         from linux/net/sunrpc/stats.c:22:
        linux/include/net/sock.h: In function `sk_rmem_schedule':
        linux/nfs-2.6/include/net/sock.h:1339:13: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
      
      is seen with gcc (GCC) 4.6.3 20120306 (Red Hat 4.6.3-2) using the
      -Wextra option.
      
      Commit c76562b6 ("netvm: prevent a stream-specific deadlock")
      accidentally replaced the "size" parameter of sk_rmem_schedule() with an
      unsigned int.  This changes the semantics of the comparison in the
      return statement.
      
      In sk_wmem_schedule we have syntactically the same comparison, but
      "size" is a signed integer.  In addition, __sk_mem_schedule() takes a
      signed integer for its "size" parameter, so there is an implicit type
      conversion in sk_rmem_schedule() anyway.
      
      Revert the "size" parameter back to a signed integer so that the
      semantics of the expressions in both sk_[rw]mem_schedule() are exactly
      the same.
      Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: NMel Gorman <mgorman@suse.de>
      Cc: David Miller <davem@davemloft.net>
      Cc: Joonsoo Kim <js1304@gmail.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      35c448a8
    • J
      mm/ia64: fix a memory block size bug · 05cf9639
      Jianguo Wu 提交于
      I found following definition in include/linux/memory.h, in my IA64
      platform, SECTION_SIZE_BITS is equal to 32, and MIN_MEMORY_BLOCK_SIZE
      will be 0.
      
        #define MIN_MEMORY_BLOCK_SIZE     (1 << SECTION_SIZE_BITS)
      
      Because MIN_MEMORY_BLOCK_SIZE is int type and length of 32bits,
      so MIN_MEMORY_BLOCK_SIZE(1 << 32) will will equal to 0.
      Actually when SECTION_SIZE_BITS >= 31, MIN_MEMORY_BLOCK_SIZE will be wrong.
      This will cause wrong system memory infomation in sysfs.
      I think it should be:
      
        #define MIN_MEMORY_BLOCK_SIZE     (1UL << SECTION_SIZE_BITS)
      
      And "echo offline > memory0/state" will cause following call trace:
      
        kernel BUG at mm/memory_hotplug.c:885!
        sh[6455]: bugcheck! 0 [1]
        Pid: 6455, CPU 0, comm:                   sh
        psr : 0000101008526030 ifs : 8000000000000fa4 ip  : [<a0000001008c40f0>]    Not tainted (3.6.0-rc1)
        ip is at offline_pages+0x210/0xee0
        Call Trace:
          show_stack+0x80/0xa0
          show_regs+0x640/0x920
          die+0x190/0x2c0
          die_if_kernel+0x50/0x80
          ia64_bad_break+0x3d0/0x6e0
          ia64_native_leave_kernel+0x0/0x270
          offline_pages+0x210/0xee0
          alloc_pages_current+0x180/0x2a0
      Signed-off-by: NJianguo Wu <wujianguo@huawei.com>
      Signed-off-by: NJiang Liu <jiang.liu@huawei.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Reviewed-by: NMichal Hocko <mhocko@suse.cz>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      05cf9639
  6. 17 9月, 2012 1 次提交
  7. 16 9月, 2012 1 次提交
    • M
      mfd: core: Push irqdomain mapping out into devices · 0848c94f
      Mark Brown 提交于
      Currently the MFD core supports remapping MFD cell interrupts using an
      irqdomain but only if the MFD is being instantiated using device tree
      and only if the device tree bindings use the pattern of registering IPs
      in the device tree with compatible properties.  This will be actively
      harmful for drivers which support non-DT platforms and use this pattern
      for their DT bindings as it will mean that the core will silently change
      remapping behaviour and it is also limiting for drivers which don't do
      DT with this particular pattern.  There is also a potential fragility if
      there are interrupts not associated with MFD cells and all the cells are
      omitted from the device tree for some reason.
      
      Instead change the code to take an IRQ domain as an optional argument,
      allowing drivers to take the decision about the parent domain for their
      interrupts.  The one current user of this feature is ab8500-core, it has
      the domain lookup pushed out into the driver.
      Signed-off-by: NMark Brown <broonie@opensource.wolfsonmicro.com>
      Signed-off-by: NSamuel Ortiz <sameo@linux.intel.com>
      0848c94f
  8. 14 9月, 2012 1 次提交
  9. 13 9月, 2012 2 次提交
  10. 12 9月, 2012 3 次提交
  11. 08 9月, 2012 3 次提交
  12. 07 9月, 2012 2 次提交
    • T
      SUNRPC: Fix a UDP transport regression · f39c1bfb
      Trond Myklebust 提交于
      Commit 43cedbf0 (SUNRPC: Ensure that
      we grab the XPRT_LOCK before calling xprt_alloc_slot) is causing
      hangs in the case of NFS over UDP mounts.
      
      Since neither the UDP or the RDMA transport mechanism use dynamic slot
      allocation, we can skip grabbing the socket lock for those transports.
      Add a new rpc_xprt_op to allow switching between the TCP and UDP/RDMA
      case.
      
      Note that the NFSv4.1 back channel assigns the slot directly
      through rpc_run_bc_task, so we can ignore that case.
      Reported-by: NDick Streefland <dick.streefland@altium.nl>
      Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
      Cc: stable@vger.kernel.org [>= 3.1]
      f39c1bfb
    • B
      kobject: fix oops with "input0: bad kobj_uevent_env content in show_uevent()" · 60e233a5
      Bjørn Mork 提交于
      Fengguang Wu <fengguang.wu@intel.com> writes:
      
      > After the __devinit* removal series, I can still get kernel panic in
      > show_uevent(). So there are more sources of bug..
      >
      > Debug patch:
      >
      > @@ -343,8 +343,11 @@ static ssize_t show_uevent(struct device
      >                 goto out;
      >
      >         /* copy keys to file */
      > -       for (i = 0; i < env->envp_idx; i++)
      > +       dev_err(dev, "uevent %d env[%d]: %s/.../%s\n", env->buflen, env->envp_idx, top_kobj->name, dev->kobj.name);
      > +       for (i = 0; i < env->envp_idx; i++) {
      > +               printk(KERN_ERR "uevent %d env[%d]: %s\n", (int)count, i, env->envp[i]);
      >                 count += sprintf(&buf[count], "%s\n", env->envp[i]);
      > +       }
      >
      > Oops message, the env[] is again not properly initilized:
      >
      > [   44.068623] input input0: uevent 61 env[805306368]: input0/.../input0
      > [   44.069552] uevent 0 env[0]: (null)
      
      This is a completely different CONFIG_HOTPLUG problem, only
      demonstrating another reason why CONFIG_HOTPLUG should go away.  I had a
      hard time trying to disable it anyway ;-)
      
      The problem this time is lots of code assuming that a call to
      add_uevent_var() will guarantee that env->buflen > 0.  This is not true
      if CONFIG_HOTPLUG is unset.  So things like this end up overwriting
      env->envp_idx because the array index is -1:
      
      	if (add_uevent_var(env, "MODALIAS="))
      		return -ENOMEM;
              len = input_print_modalias(&env->buf[env->buflen - 1],
      				   sizeof(env->buf) - env->buflen,
      				   dev, 0);
      
      Don't know what the best action is, given that there seem to be a *lot*
      of this around the kernel.  This patch "fixes" the problem for me, but I
      don't know if it can be considered an appropriate fix.
      
      [ It is the correct fix for now, for 3.7 forcing CONFIG_HOTPLUG to
      always be on is the longterm fix, but it's too late for 3.6 and older
      kernels to resolve this that way - gregkh ]
      Reported-by: NFengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: NBjørn Mork <bjorn@mork.no>
      Tested-by: NFengguang Wu <fengguang.wu@intel.com>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      60e233a5
  13. 06 9月, 2012 2 次提交
  14. 05 9月, 2012 3 次提交
  15. 04 9月, 2012 2 次提交
  16. 02 9月, 2012 2 次提交
  17. 31 8月, 2012 2 次提交
  18. 29 8月, 2012 1 次提交
  19. 27 8月, 2012 1 次提交