  1. 02 May, 2009 2 commits
    • timekeeping: create arch_gettimeoffset infrastructure · 7d27558c
      john stultz committed
      Some arches don't supply their own clocksource. This is mainly the
      case in architectures that get their inter-tick times by reading the
      counter on their interval timer.  Since these timers wrap every tick,
      they're not really useful as clocksources.  Wrapping them to act like
      one is possible but not very efficient. So we provide a callout these
      arches can implement for use with the jiffies clocksource to provide
      finer than tick-granular time.
      
      [ Impact: ease the migration to generic time keeping ]
      Signed-off-by: John Stultz <johnstul@us.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
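The callout idea in this commit can be sketched in user-space C. This is only an illustration under stated assumptions, not the kernel implementation: every `sketch_`/`SKETCH_` name, the 100 Hz tick and the 1 MHz timer rate are hypothetical. Tick-granular time comes from a jiffies-style counter, and an arch-supplied function adds the nanoseconds elapsed since the last tick.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constants: a 100 Hz tick driven by a 1 MHz interval timer. */
#define SKETCH_HZ              100u
#define SKETCH_NSEC_PER_TICK   (1000000000u / SKETCH_HZ)
#define SKETCH_TIMER_HZ        1000000u
#define SKETCH_COUNTS_PER_TICK (SKETCH_TIMER_HZ / SKETCH_HZ)

/* Stand-ins for hardware and kernel state; the timer count wraps every tick. */
static uint32_t sketch_timer_count;
static uint64_t sketch_jiffies;

/* Arch callout (assumed shape): nanoseconds elapsed since the last tick. */
static uint32_t sketch_arch_gettimeoffset(void)
{
    assert(sketch_timer_count < SKETCH_COUNTS_PER_TICK);
    return sketch_timer_count * (1000000000u / SKETCH_TIMER_HZ);
}

/* Coarse tick-based time plus the arch-supplied intra-tick offset. */
static uint64_t sketch_gettime_ns(void)
{
    return sketch_jiffies * SKETCH_NSEC_PER_TICK + sketch_arch_gettimeoffset();
}
```

With three ticks elapsed and the timer a quarter of the way through the fourth, the sketch reports 32.5 ms instead of the 30 ms that the tick counter alone would give.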
    • clocksource: setup mult_orig in clocksource_enable() · a25cbd04
      Magnus Damm committed
      Setup clocksource mult_orig in clocksource_enable().
      
      Clocksource drivers can save power by keeping the
      device clock disabled while the clocksource is unused.
      
      In practice this means that the enable() and disable()
      callbacks perform clk_enable() and clk_disable().
      
      The enable() callback may also use clk_get_rate() to get
      the clock rate from the clock framework. This information
      can then be used to calculate the shift and mult variables.
      
      Currently the mult_orig variable is set up from mult at
      registration time only. This conflicts with the above
      case, since the clock is disabled and the mult variable is
      not yet calculated at the time of registration.
      
      Moving the mult_orig setup code to clocksource_enable()
      allows us to both handle the common case with no enable()
      callback and the mult-changed-after-enable() case.
      
      [ Impact: allow dynamic clock source usage ]
      Signed-off-by: Magnus Damm <damm@igel.co.jp>
      LKML-Reference: <20090501054546.8193.10688.sendpatchset@rx1.opensource.se>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
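The ordering described above can be sketched as follows. This is a simplified user-space model, not the kernel's code: `struct sketch_clocksource`, the driver callback and the 1 MHz rate are hypothetical. The point is that `mult_orig` is copied from `mult` only after the optional `enable()` callback has run, so a driver that computes `mult` at enable time and a driver with no callback both end up with a consistent snapshot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reduced, hypothetical model of a clocksource. */
struct sketch_clocksource {
    uint32_t mult;
    uint32_t mult_orig;
    uint32_t shift;
    int (*enable)(struct sketch_clocksource *cs);   /* may be NULL */
};

/* Hypothetical driver callback: the rate is only known once the clock is on. */
static int sketch_driver_enable(struct sketch_clocksource *cs)
{
    const uint64_t rate_hz = 1000000;   /* assumed answer from the clock framework */

    cs->shift = 10;
    /* Chosen so that ns = (cycles * mult) >> shift. */
    cs->mult = (uint32_t)((1000000000ull << cs->shift) / rate_hz);
    return 0;
}

static int sketch_clocksource_enable(struct sketch_clocksource *cs)
{
    int ret = 0;

    assert(cs != NULL);
    if (cs->enable)
        ret = cs->enable(cs);

    /* Snapshot mult after the callback, not at registration time. */
    cs->mult_orig = cs->mult;
    return ret;
}
```

A clocksource without a callback keeps whatever `mult` it was registered with, and the snapshot simply mirrors it.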
  2. 27 Apr, 2009 1 commit
  3. 25 Apr, 2009 1 commit
    • PM/Hibernate: Fix waiting for image device to appear on resume · 0c8454f5
      Rafael J. Wysocki committed
      Commit c7510859 ("PM/Hibernate: Wait for
      SCSI devices scan to complete during resume") added a call to
      scsi_complete_async_scans() to software_resume(), so that it waited for
      the SCSI scanning to complete, but the call was added in the wrong place.
      
      Namely, it should have been added after wait_for_device_probe(), which
      is called only if the image partition hasn't been specified yet.  Also,
      it's reasonable to check if the image partition is present and only wait
      for the device probing and SCSI scanning to complete if it is not the
      case.
      
      Additionally, since noresume is checked right at the beginning of
      software_resume() and the function returns immediately if it's set, it
      doesn't make sense to check it once again later.
      Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  4. 24 Apr, 2009 1 commit
  5. 23 Apr, 2009 1 commit
    • locking: clarify kernel-taint warning message · b48ccb09
      Ingo Molnar committed
      Andi Kleen reported this message triggering on non-lockdep kernels:
      
         Disabling lockdep due to kernel taint
      
      Clarify the message to say 'lock debugging' - debug_locks_off()
      turns off all things lock debugging, not just lockdep.
      
      [ Impact: change kernel warning message text ]
      Reported-by: Andi Kleen <andi@firstfloor.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  6. 22 Apr, 2009 2 commits
  7. 21 Apr, 2009 1 commit
  8. 20 Apr, 2009 1 commit
    • PM/Suspend: Introduce two new platform callbacks to avoid breakage · 6a7c7eaf
      Rafael J. Wysocki committed
      Commit 900af0d9 (PM: Change suspend
      code ordering) changed the ordering of suspend code in such a way
      that the platform .prepare() callback is now executed after the
      device drivers' late suspend callbacks have run.  Unfortunately, this
      turns out to break ARM platforms that need to talk via I2C to power
      control devices during the .prepare() callback.
      
      For this reason introduce two new platform suspend callbacks,
      .prepare_late() and .wake(), that will be called just prior to
      disabling non-boot CPUs and right after bringing them back on line,
      respectively, and use them instead of .prepare() and .finish() for
      ACPI suspend.  Make the PM core execute the .prepare() and .finish()
      platform suspend callbacks where they were executed previously (that
      is, right after calling the regular suspend methods provided by
      device drivers and right before executing their regular resume
      methods, respectively).
      
      It is not necessary to make analogous changes to the hibernation
      code and data structures at the moment, because they are only used
      by ACPI platforms.
      Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
      Reported-by: Russell King <rmk+kernel@arm.linux.org.uk>
      Acked-by: Len Brown <len.brown@intel.com>
  9. 19 Apr, 2009 1 commit
  10. 18 Apr, 2009 1 commit
    • lockdep: more robust lockdep_map init sequence · c8a25005
      Peter Zijlstra committed
      Steven Rostedt reported:
      
      > OK, I think I figured this bug out. This is a lockdep issue with respect
      > to tracepoints.
      >
      > The trace points in lockdep are called all the time. Outside the lockdep
      > logic. But if lockdep were to trigger an error / warning (which this run
      > did) we might be in trouble. For new locks, like the dentry->d_lock, that
      > are created, they will not get a name:
      >
      > void lockdep_init_map(struct lockdep_map *lock, const char *name,
      >                       struct lock_class_key *key, int subclass)
      > {
      >         if (unlikely(!debug_locks))
      >                 return;
      >
      > When a problem is found by lockdep, debug_locks becomes false. Thus we
      > stop allocating names for locks. This dentry->d_lock I had, now has no
      > name. Worse yet, I have CONFIG_DEBUG_VM set, that scrambles non
      > initialized memory. Thus, when the trace point was hit, it had junk for
      > the lock->name, and the machine crashed.
      
      Ah, nice catch. I think we should put at least the name in regardless.
      
      Ensure we at least initialize the trivial entries of the depmap so that
      they can be relied upon, even when lockdep itself decided to pack up and
      go home.
      
      [ Impact: fix lock tracing after lockdep warnings. ]
      Reported-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <1239954049.23397.4156.camel@laptop>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
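A reduced sketch of the reordering this message describes, assuming a hypothetical `struct sketch_lockdep_map` and a `registered` flag that stands in for the heavier lockdep bookkeeping. It is not the kernel's function. The cheap fields are initialized before the `debug_locks`-style early return, so code that later reads `name` never sees uninitialized memory.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced version of a lock's debugging map. */
struct sketch_lockdep_map {
    const char *name;
    void *class_cache;
    int registered;     /* stands in for the heavier lockdep bookkeeping */
};

/* Cleared once lock debugging has shut itself down. */
static int sketch_debug_locks = 1;

static void sketch_lockdep_init_map(struct sketch_lockdep_map *lock,
                                    const char *name)
{
    assert(lock != NULL);

    /* Trivial fields first: they stay valid even when debugging is off. */
    lock->class_cache = NULL;
    lock->name = name;
    lock->registered = 0;

    if (!sketch_debug_locks)
        return;

    /* Only the expensive part is skipped after a lockdep warning. */
    lock->registered = 1;
}
```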
  11. 17 Apr, 2009 2 commits
  12. 16 Apr, 2009 1 commit
    • RCU: Don't try and predeclare inline funcs as it upsets some versions of gcc · 5b1d07ed
      David Howells committed
      Don't try and predeclare inline funcs like this:
      
      	static inline void wait_migrated_callbacks(void)
      	...
      	static void _rcu_barrier(enum rcu_barrier type)
      	{
      		...
      		wait_migrated_callbacks();
      	}
      	...
      	static inline void wait_migrated_callbacks(void)
      	{
      		wait_event(rcu_migrate_wq, !atomic_read(&rcu_migrate_type_count));
      	}
      
      as it upsets some versions of gcc under some circumstances:
      
      	kernel/rcupdate.c: In function `_rcu_barrier':
      	kernel/rcupdate.c:125: sorry, unimplemented: inlining failed in call to 'wait_migrated_callbacks': function body not available
      	kernel/rcupdate.c:152: sorry, unimplemented: called from here
      
      This can be dealt with by simply putting the static variables (rcu_migrate_*)
      at the top, and moving the implementation of the function up so that it
      replaces its forward declaration.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Cc: Dipankar Sarma <dipankar@in.ibm.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
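The resulting layout can be sketched as below, using hypothetical `sketch_` names and a plain predicate in place of the real `wait_event()` call. The static state comes first and the inline helper is fully defined before its caller, so no forward declaration is needed and the compiler always has the body available when it considers inlining.

```c
#include <assert.h>

/* Static state declared at the top, ahead of every user. */
static int sketch_migrate_count;

/* Full definition before first use, replacing the forward declaration. */
static inline int sketch_migrated_callbacks_done(void)
{
    assert(sketch_migrate_count >= 0);
    return sketch_migrate_count == 0;
}

static int sketch_rcu_barrier(void)
{
    /* The callee's body is already visible here. */
    return sketch_migrated_callbacks_done();
}
```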
  13. 15 Apr, 2009 1 commit
  14. 14 Apr, 2009 9 commits
    • x86, irq: Remove IRQ_DISABLED check in process context IRQ move · 6ec3cfec
      Pallipadi, Venkatesh committed
      As discussed in the thread here:
      
        http://marc.info/?l=linux-kernel&m=123964468521142&w=2
      
      Eric W. Biederman observed:
      
      > It looks like some additional bugs have slipped in since last I looked.
      >
      > set_irq_affinity does this:
      > #ifdef CONFIG_GENERIC_PENDING_IRQ
      >        if (desc->status & IRQ_MOVE_PCNTXT || desc->status & IRQ_DISABLED) {
      >                cpumask_copy(desc->affinity, cpumask);
      >                desc->chip->set_affinity(irq, cpumask);
      >        } else {
      >                desc->status |= IRQ_MOVE_PENDING;
      >                cpumask_copy(desc->pending_mask, cpumask);
      >        }
      > #else
      >
      > That IRQ_DISABLED case is a software state and as such it has nothing to
      > do with how safe it is to move an irq in process context.
      
      [...]
      
      >
      > The only reason we migrate MSIs in interrupt context today is that there
      > wasn't infrastructure for support migration both in interrupt context
      > and outside of it.
      
      Yes. The idea here was to force the MSI migration to happen in process
      context. One of the patches in the series did
      
              disable_irq(dev->irq);
              irq_set_affinity(dev->irq, cpumask_of(dev->cpu));
              enable_irq(dev->irq);
      
      with the above patch adding irq/manage code check for interrupt disabled
      and moving the interrupt in process context.
      
      IIRC, there was no IRQ_MOVE_PCNTXT when we were developing this HPET
      code and we ended up having this ugly hack. IRQ_MOVE_PCNTXT was there
      when we eventually submitted the patch upstream. But, looks like I did a
      blind rebasing instead of using IRQ_MOVE_PCNTXT in hpet MSI code.
      
      The patch below fixes this: it reverts commit 932775a4
      and adds PCNTXT to the HPET MSI setup. It also removes the copying of
      desc->affinity in generic code, as the set_affinity routines do it internally.
      Reported-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: "Li Shaohua" <shaohua.li@intel.com>
      Cc: Gary Hade <garyhade@us.ibm.com>
      Cc: "lcm@us.ibm.com" <lcm@us.ibm.com>
      Cc: suresh.b.siddha@intel.com
      LKML-Reference: <20090413222058.GB8211@linux-os.sc.intel.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • rcu: Make hierarchical RCU less IPI-happy · ef631b0c
      Paul E. McKenney committed
      This patch fixes a hierarchical-RCU performance bug located by Anton
      Blanchard.  The problem stems from a misguided attempt to provide a
      work-around for jiffies-counter failure.  This work-around uses a per-CPU
      n_rcu_pending counter, which is incremented on each call to rcu_pending(),
      which in turn is called from each scheduling-clock interrupt.  Each CPU
      then treats this counter as a surrogate for the jiffies counter, so
      that if the jiffies counter fails to advance, the per-CPU n_rcu_pending
      counter will cause RCU to invoke force_quiescent_state(), which in turn
      will (among other things) send resched IPIs to CPUs that have thus far
      failed to pass through an RCU quiescent state.
      
      Unfortunately, each CPU resets only its own counter after sending a
      batch of IPIs.  This means that the other CPUs will also (needlessly)
      send -another- round of IPIs, for a full N-squared set of IPIs in the
      worst case every three scheduler-clock ticks until the grace period
      finally ends.  It is not reasonable for a given CPU to reset each and
      every n_rcu_pending for all the other CPUs, so this patch instead simply
      disables the jiffies-counter "training wheels", thus eliminating the
      excessive IPIs.
      
      Note that the jiffies-counter IPIs do not have this problem due to
      the fact that the jiffies counter is global, so that the CPU sending
      the IPIs can easily reset things, thus preventing the other CPUs from
      sending redundant IPIs.
      
      Note also that the n_rcu_pending counter remains, as it will continue to
      be used for tracing.  It may also see use to update the jiffies counter,
      should an appropriate kick-the-jiffies-counter API appear.
      Located-by: Anton Blanchard <anton@au1.ibm.com>
      Tested-by: Anton Blanchard <anton@au1.ibm.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: anton@samba.org
      Cc: akpm@linux-foundation.org
      Cc: dipankar@in.ibm.com
      Cc: manfred@colorfullife.com
      Cc: cl@linux-foundation.org
      Cc: josht@linux.vnet.ibm.com
      Cc: schamp@sgi.com
      Cc: niv@us.ibm.com
      Cc: dvhltc@us.ibm.com
      Cc: ego@in.ibm.com
      Cc: laijs@cn.fujitsu.com
      Cc: rostedt@goodmis.org
      Cc: peterz@infradead.org
      Cc: penberg@cs.helsinki.fi
      Cc: andi@firstfloor.org
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      LKML-Reference: <12396834793575-git-send-email->
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing: Fix branch tracer header · 557055be
      Zhaolei committed
      Before patch:
      
        # tracer: branch
        #
        #           TASK-PID    CPU#    TIMESTAMP  FUNCTION
        #              | |       |          |         |
                   <...>-2981  [000] 24008.872738: [  ok  ] trace_irq_handler_exit:irq_event_types.h:41
                   <...>-2981  [000] 24008.872742: [  ok  ] note_interrupt:spurious.c:229
        ...
      
      After patch:
      
        # tracer: branch
        #
        #           TASK-PID    CPU#    TIMESTAMP  CORRECT  FUNC:FILE:LINE
        #              | |       |          |         |       |
                   <...>-2985  [000] 26329.142970: [  ok  ] slab_free:slub.c:1776
                   <...>-2985  [000] 26329.142972: [  ok  ] trace_kmem_cache_free:kmem_event_types.h:191
        ...
      Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
      Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <49E2F19A.3040006@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing, sched: mark get_parent_ip() notrace · 132380a0
      Lai Jiangshan committed
      Impact: remove overly redundant tracing entries
      
      When the tracer is "function" or "function_graph", far too many
      "get_parent_ip" entries are recorded in ring_buffer.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
      Acked-by: Steven Rostedt <srostedt@redhat.com>
      LKML-Reference: <49D458B1.5000703@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • kernel/sys.c: clean up sys_shutdown exit path · 3d26dcf7
      Andi Kleen committed
      Impact: cleanup, fix
      
      Clean up sys_shutdown() exit path.  Factor out common code.  Return
      correct error code instead of always 0 on failure.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ptrace: fix exit_ptrace() vs ptrace_traceme() race · f1671f6d
      Oleg Nesterov committed
      Pointed out by Roland.  The bug was recently introduced by me in
      "forget_original_parent: split out the un-ptrace part", commit
      39c626ae.
      
      Since that patch we have a window after exit_ptrace() drops tasklist and
      before forget_original_parent() takes it again.  In this window the child
      can do ptrace(PTRACE_TRACEME) and nobody can untrace this child after
      that.
      
      Change ptrace_traceme() to not attach to the exiting ->real_parent.  We
      don't report the error in this case, we pretend we attach right before
      ->real_parent calls exit_ptrace() which should untrace us anyway.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Roland McGrath <roland@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: move the scan_unevictable_pages sysctl to the vm table · 4be6f6bb
      Peter Zijlstra committed
      vm knobs should go in the vm table.  Probably too late for
      randomize_va_space though.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
      Acked-by: Rik van Riel <riel@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • tracing: Fix power tracer header · a3d03eca
      Zhaolei committed
      Before patch:
        # tracer: power
        #
        #           TASK-PID    CPU#    TIMESTAMP  FUNCTION
        #              | |       |          |         |
        [  676.875865889] CSTATE: Going to C1 on cpu 0 for 0.005911463
        [  676.882938805] CSTATE: Going to C1 on cpu 0 for 0.104796532
        ...
      
      After patch:
        # tracer: power
        #
        #   TIMESTAMP      STATE  EVENT
        #       |            |      |
        [  676.875865889] CSTATE: Going to C1 on cpu 0 for 0.005911463
        [  676.882938805] CSTATE: Going to C1 on cpu 0 for 0.104796532
        ...
      
      v2: Use seq_puts instead of seq_printf
      Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <49E2E889.5000903@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • PM/Hibernate: Wait for SCSI devices scan to complete during resume · c7510859
      Rafael J. Wysocki committed
      There is a race between resume from hibernation and the asynchronous
      scanning of SCSI devices and to prevent it from happening we need to
      call scsi_complete_async_scans() during resume from hibernation.
      
      In addition, if the resume from hibernation is userland-driven, it's
      better to wait for all device probes in the kernel to complete before
      attempting to open the resume device.
      Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
      Acked-by: Arjan van de Ven <arjan@linux.intel.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  15. 12 Apr, 2009 7 commits
  16. 10 Apr, 2009 7 commits
    • ftrace: Output REC->var instead of __entry->var for trace format · 0462b566
      Zhaolei committed
      print fmt: "irq=%d return=%s", __entry->irq, __entry->ret ? \"handled\" : \"unhandled\"

      "__entry" should be converted to "REC" by the __stringify() macro.
      Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
      Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <49DC679D.2090901@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
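The preprocessor mechanism behind this fix can be shown with a small standalone sketch. The macro and variable names here are hypothetical, and how the kernel wires up its own `__stringify()` and `__entry` is not reproduced. A two-level stringify macro expands its argument before turning it into a string, so an alias defined as `REC` shows up in the output. A one-level macro keeps the original spelling.

```c
#include <assert.h>
#include <string.h>

/* Two-level stringification: the outer macro expands its arguments first. */
#define SKETCH_STRINGIFY_1(...) #__VA_ARGS__
#define SKETCH_STRINGIFY(...)   SKETCH_STRINGIFY_1(__VA_ARGS__)

/* Hypothetical alias: inside the format text, the record is called REC. */
#define sketch_entry REC

/* Expanded first, then stringified. */
static const char sketch_print_args[] = SKETCH_STRINGIFY(sketch_entry->irq);

/* One-level stringification skips expansion and keeps the alias name. */
static const char sketch_raw_args[] = SKETCH_STRINGIFY_1(sketch_entry->irq);

/* Small helper so callers can compare the generated strings. */
static int sketch_same(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    return strcmp(a, b) == 0;
}
```

Here `sketch_print_args` holds `REC->irq`, while `sketch_raw_args` still holds `sketch_entry->irq`.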
    • tracing: fix document references · 4d1f4372
      Li Zefan committed
      When moving documents to Documentation/trace/, I forgot to
      grep Kconfig for references to them.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Pekka Paalanen <pq@iki.fi>
      Cc: eduard.munteanu@linux360.ro
      LKML-Reference: <49DE97EF.7080208@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing: fix splice return too large · 93cfb3c9
      Lai Jiangshan committed
      I got these from strace:
      
       splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 12288
       splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 12288
       splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 12288
       splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 16384
       splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 8192
       splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 8192
       splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 8192
      
      I wanted to splice_read 4096 bytes, but it returns 8192 or larger.
      
      It is because the return value of tracing_buffers_splice_read()
      does not include "zero out any left over data" bytes.
      
      tracing_buffers_read() does include these bytes, so we make the two
      consistent.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <srostedt@redhat.com>
      LKML-Reference: <49D46674.9030804@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing: update file->f_pos when splice(2) it · c7625a55
      Lai Jiangshan committed
      Impact: Cleanup
      
      These two lines:
      
      	if (unlikely(*ppos))
      		return -ESPIPE;
      
      in tracing_buffers_splice_read() are not needed, because the VFS layer
      has already disabled seek(2).

      We remove these two lines, and then we can update file->f_pos.

      tracing_buffers_read() already updates file->f_pos; this fix
      makes tracing_buffers_splice_read() update file->f_pos too.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <srostedt@redhat.com>
      LKML-Reference: <49D46670.4010503@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing: allocate page when needed · ddd538f3
      Lai Jiangshan committed
      Impact: Cleanup
      
      Sometimes we open trace_pipe_raw and only splice(2) it, never
      read(2) it, so the page is never used.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <srostedt@redhat.com>
      LKML-Reference: <49D4666B.4010608@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing: disable seeking for trace_pipe_raw · d1e7e02f
      Lai Jiangshan committed
      Impact: disable pread()
      
      We set tracing_buffers_fops.llseek to no_llseek,
      but we can still perform pread() to read this file.
      
      That is not expected.
      
      This fix uses nonseekable_open() to disable it.
      
      tracing_buffers_fops.llseek is still set to no_llseek,
      which marks this file as a "non-seekable device" and is used by
      sys_splice(). See also do_splice() or the manual page of splice(2):
      
      ERRORS
             EINVAL Target file system doesn't support  splicing;
                    neither  of the descriptors refers to a pipe;
                    or offset given for non-seekable device.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <srostedt@redhat.com>
      LKML-Reference: <49D46668.8030806@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
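From user space, the behaviour expected of a non-seekable file can be observed on an ordinary pipe, which is used here only as a stand-in for trace_pipe_raw. This is an analogy under that assumption, not a test of the tracing file. A positioned read with `pread()` is rejected with `ESPIPE` even though data is available. The helper name is hypothetical.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/*
 * Returns the errno produced by pread() on a pipe,
 * or 0 if the call unexpectedly succeeds.
 */
static int sketch_pread_errno_on_pipe(void)
{
    int fds[2];
    char buf[4];
    int err = 0;

    if (pipe(fds) != 0)
        return errno;

    /* Put some data in the pipe so a plain read() would succeed. */
    if (write(fds[1], "abcd", 4) != 4)
        err = errno;
    else if (pread(fds[0], buf, sizeof(buf), 0) < 0)
        err = errno;    /* positioned reads need a seekable file */

    close(fds[0]);
    close(fds[1]);
    return err;
}
```

On Linux the helper returns `ESPIPE`.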
    • mutex: have non-spinning mutexes on s390 by default · 36cd3c9f
      Heiko Carstens committed
      Impact: performance regression fix for s390
      
      The adaptive spinning mutexes will not always do what one would expect on
      virtualized architectures like s390. Especially the cpu_relax() loop in
      mutex_spin_on_owner might hurt if the mutex holding cpu has been scheduled
      away by the hypervisor.
      
      We would end up in a cpu_relax() loop when there is no chance that the
      state of the mutex changes until the target cpu has been scheduled again by
      the hypervisor.
      
      For that reason we should change the default behaviour to no-spin on s390.
      
      We do have an instruction which allows us to yield the current cpu in favour of
      a different target cpu. Also we have an instruction which allows us to figure
      out if the target cpu is physically backed.
      
      However we need to do some performance tests until we can come up with
      a solution that will do the right thing on s390.
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      LKML-Reference: <20090409184834.7a0df7b2@osiris.boeblingen.de.ibm.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  17. 09 Apr, 2009 1 commit