1. 15 March 2013 (9 commits)
    • tracing: Add trace_puts() for even faster trace_printk() tracing · 09ae7234
      Committed by Steven Rostedt (Red Hat)
      The trace_printk() is extremely fast and is very handy as it can be
      used in any context (including NMIs!). But it still requires scanning
      the fmt string for parsing the args. Even the trace_bprintk() requires
      a scan to know what args will be saved, although it doesn't copy the
      format string itself.
      
      Often trace_printk() is called with no args at all, and it still
      wastes cpu cycles scanning the fmt string.
      
      Adding trace_puts() allows the developer to use an even faster
      tracing method that only saves the pointer to the string in the
      ring buffer without doing any format parsing at all. This will
      help remove even more of the "Heisenbug" effect, when debugging.
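
      A minimal usage sketch (the argument should be a string literal,
      since only its pointer is recorded):

         trace_puts("hit the slow path\n");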
      
      Also fixed up the F_printk()s for the ftrace internal bprint and print events.
      
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      09ae7234
    • tracing: Add internal tracing_snapshot() functions · ad909e21
      Committed by Steven Rostedt (Red Hat)
      The new snapshot feature is quite handy. It's a way for the user
      to take advantage of the spare buffer that, until then, only
      the latency tracers used to "snapshot" the buffer when it hit
      a max latency. Now users can trigger a "snapshot" manually when
      some condition is hit in a program. But a snapshot currently cannot
      be triggered by a condition inside the kernel.
      
      With the addition of tracing_snapshot() and tracing_snapshot_alloc(),
      snapshots can now be taken when a condition is hit and the
      developer wants to capture the case without stopping the trace.
      
      Note, any snapshot will overwrite the old one, so take care
      in how this is done.
      
      These new functions are to be used like tracing_on(), tracing_off()
      and trace_printk() are. That is, they should never be called
      in the mainline Linux kernel. They are solely for the purpose
      of debugging.
      
      The tracing_snapshot() will not allocate a buffer, but it is
      safe to be called from any context (except NMIs). But if a
      snapshot buffer isn't allocated when it is called, it will write
      to the live buffer, complaining about the lack of a snapshot
      buffer, and then stop tracing (giving you the "permanent snapshot").
      
      tracing_snapshot_alloc() will allocate the snapshot buffer if
      it was not already allocated and then take the snapshot. This routine
      *may sleep*, and must be called from a context that can sleep.
      The allocation is done with GFP_KERNEL and is not atomic.
      
      If you need a snapshot in an atomic context, say in early boot,
      then it is best to call the tracing_snapshot_alloc() before then,
      where it will allocate the buffer, and then you can use the
      tracing_snapshot() anywhere you want and still get snapshots.
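
      The intended pattern looks roughly like the sketch below (the
      trigger condition is hypothetical):

         /* once, early, from a context that can sleep */
         tracing_snapshot_alloc();

         /* later, from any context except NMI */
         if (unlikely(bug_condition_hit()))      /* hypothetical condition */
                 tracing_snapshot();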
      
      Cc: Hiraku Toyooka <hiraku.toyooka.gu@hitachi.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      ad909e21
    • tracing: Consolidate max_tr into main trace_array structure · 12883efb
      Committed by Steven Rostedt (Red Hat)
      Currently, the way the latency tracers and the snapshot feature work
      is to have a separate trace_array called "max_tr" that holds the
      snapshot buffer. For latency tracers, the running buffer is swapped
      with this snapshot buffer to save the current max latency.
      
      The only items needed for the max_tr are really just a copy of the buffer
      itself, the per_cpu data pointers, the time_start timestamp that states
      when the max latency was triggered, and the cpu that the max latency
      was triggered on. All other fields in trace_array are unused by the
      max_tr, making the max_tr mostly bloat.
      
      This change removes the max_tr completely, and adds a new structure
      called trace_buffer, that holds the buffer pointer, the per_cpu data
      pointers, the time_start timestamp, and the cpu where the latency occurred.
      
      The trace_array now has two trace_buffers, one for the normal trace and
      one for the max trace or snapshot. By doing this, not only do we remove
      the bloat from the max_tr, but trace instances can now use their own
      snapshot feature, instead of only the top level global_trace having the
      snapshot feature and latency tracers to itself.
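
      The new structure looks roughly like this (as I read the patch):

         struct trace_buffer {
                 struct trace_array              *tr;
                 struct ring_buffer              *buffer;
                 struct trace_array_cpu __percpu *data;
                 cycle_t                          time_start;
                 int                              cpu;
         };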
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      12883efb
    • tracing: Only clear trace buffer on module unload if event was traced · 575380da
      Committed by Steven Rostedt (Red Hat)
      Currently, when a module with events is unloaded, the trace buffer is
      cleared. This is just a safety net in case the module has some
      strange callback when its events are output. But there's no reason
      to reset the buffer if the module didn't have any of its events traced.
      
      Add a WAS_ENABLED flag to the event "call" structure that gets set
      when the event is ever enabled; this flag never gets cleared. When a
      module gets unloaded, if any of its events have this flag set, then the
      trace buffer will get cleared.
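
      The unload path then becomes roughly this (flag name per the patch,
      surrounding loop paraphrased):

         bool clear_trace = false;

         list_for_each_entry(call, &ftrace_events, list) {
                 if (call->mod == mod &&
                     (call->flags & TRACE_EVENT_FL_WAS_ENABLED))
                         clear_trace = true;     /* something was traced */
         }
         if (clear_trace)
                 tracing_reset_current_online_cpus();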
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      575380da
    • tracing: Add comment for trace event flag IGNORE_ENABLE · 2a30c11f
      Committed by Steven Rostedt (Red Hat)
      All the trace event flags have comments except the IGNORE_ENABLE flag,
      which is set for ftrace internal events that should not be enabled
      via the debugfs "enable" file. That is, if the top level enable file
      is set, it will enable all events. It used to just check the ftrace
      event call descriptor "reg" field and skip those without it, but now
      some ftrace internal events have a reg field but still need to be
      skipped. The flag was created to ignore those events.
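
      The check the flag guards looks like this (paraphrased):

         if (call->flags & TRACE_EVENT_FL_IGNORE_ENABLE)
                 continue;       /* ftrace internal event, skip it */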
      
      Now document it.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      2a30c11f
    • tracing: Add a helper function for event print functions · f71130de
      Committed by Li Zefan
      Move duplicate code in event print functions to a helper function.
      
      This shrinks the size of the kernel by ~13K.
      
         text    data     bss     dec     hex filename
      6596137 1743966 10138672        18478775        119f6b7 vmlinux.o.old
      6583002 1743849 10138672        18465523        119c2f3 vmlinux.o.new
      
      Link: http://lkml.kernel.org/r/51258746.2060304@huawei.com
      Signed-off-by: Li Zefan <lizefan@huawei.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      f71130de
    • tracing/ring-buffer: Move poll wake ups into ring buffer code · 15693458
      Committed by Steven Rostedt (Red Hat)
      Move the logic to wake up on ring buffer data into the ring buffer
      code itself. This simplifies the tracing code a lot and also has the
      added benefit that waiters on one of the instance buffers can be woken
      only when data is added to that instance instead of data added to
      any instance.
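
      The interfaces this adds to the ring buffer code are roughly:

         void ring_buffer_wait(struct ring_buffer *buffer, int cpu);
         int ring_buffer_poll_wait(struct ring_buffer *buffer, int cpu,
                                   struct file *filp, poll_table *poll_table);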
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      15693458
    • tracing: Pass the ftrace_file to the buffer lock reserve code · ccb469a1
      Committed by Steven Rostedt
      Pass the struct ftrace_event_file *ftrace_file to
      trace_event_buffer_lock_reserve() (a new function that replaces
      trace_current_buffer_lock_reserve()).
      
      The ftrace_file holds a pointer to the trace_array that is in use.
      In the case of multiple buffers with different trace_arrays, this
      allows different events to be recorded into different buffers.
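
      The new reservation helper has roughly this shape:

         struct ring_buffer_event *
         trace_event_buffer_lock_reserve(struct ring_buffer **current_buffer,
                                         struct ftrace_event_file *ftrace_file,
                                         int type, unsigned long len,
                                         unsigned long flags, int pc);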
      
      Also fixed some of the stale comments in include/trace/ftrace.h
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      ccb469a1
    • tracing: Separate out trace events from global variables · ae63b31e
      Committed by Steven Rostedt
      The trace events for ftrace are all defined via global variables.
      The arrays of events and event systems are linked to a global list.
      This prevents multiple users of the event system from each deciding
      what to enable and what not to.
      
      By adding descriptors to represent the event/file relation, as well
      as to which trace_array descriptor they are associated with, allows
      for more than one set of events to be defined. Once the trace events
      files have a link between the trace event and the trace_array they
      are associated with, we can create multiple trace_arrays that can
      record separate events in separate buffers.
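
      The event/file descriptor introduced here looks roughly like:

         struct ftrace_event_file {
                 struct list_head             list;
                 struct ftrace_event_call    *event_call;
                 struct dentry               *dir;
                 struct trace_array          *tr;      /* owning trace_array */
                 struct ftrace_subsystem_dir *system;
         };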
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      ae63b31e
  2. 19 February 2013 (1 commit)
  3. 14 February 2013 (2 commits)
    • smpboot: Allow selfparking per cpu threads · 7d7e499f
      Committed by Thomas Gleixner
      The stop machine threads are still killed when a cpu goes offline. The
      reason is that the thread is used to bring the cpu down, so it can't
      be parked along with the other per cpu threads.
      
      Allow a per cpu thread to be excluded from automatic parking, so it
      can park itself once it's done.
      
      Add a create callback function as well.
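
      The stop machine threads can then describe themselves roughly as
      (descriptor trimmed to the relevant fields):

         static struct smp_hotplug_thread cpu_stop_threads = {
                 .store       = &cpu_stopper_task,
                 .thread_fn   = cpu_stopper_thread,
                 .thread_comm = "migration/%u",
                 .create      = cpu_stop_create,   /* the new callback */
                 .selfparking = true,              /* excluded from parking */
         };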
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Cc: Arjan van de Veen <arjan@infradead.org>
      Cc: Paul Turner <pjt@google.com>
      Cc: Richard Weinberger <rw@linutronix.de>
      Cc: Magnus Damm <magnus.damm@gmail.com>
      Link: http://lkml.kernel.org/r/20130131120741.553993267@linutronix.de
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      7d7e499f
    • workqueue: rename cpu_workqueue to pool_workqueue · 112202d9
      Committed by Tejun Heo
      workqueue has moved away from global_cwqs to worker_pools, and with the
      scheduled custom worker pools, workqueues will be associated with
      pools which don't have anything to do with CPUs.  The workqueue code
      went through a significant amount of changes recently and mass renaming
      isn't likely to hurt much additionally.  Let's replace 'cpu' with
      'pool' so that it reflects the current design.
      
      * s/struct cpu_workqueue_struct/struct pool_workqueue/
      * s/cpu_wq/pool_wq/
      * s/cwq/pwq/
      
      This patch is purely cosmetic.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      112202d9
  4. 09 February 2013 (6 commits)
    • time, Fix setting of hardware clock in NTP code · 84e345e4
      Committed by Prarit Bhargava
      At init time, if the system time is "warped" forward in warp_clock()
      it will differ from the hardware clock by sys_tz.tz_minuteswest.  This time
      difference is not taken into account when ntp updates the hardware clock,
      and this causes the system time to jump forward by this offset every reboot.
      
      The kernel must take this offset into account when writing the system time
      to the hardware clock in the ntp code.  This patch adds
      persistent_clock_is_local which indicates that an offset has been applied
      in warp_clock() and accounts for the "warp" before writing the hardware
      clock.
      
      x86 does not have this problem as rtc writes are software limited to a
      +/-15 minute window relative to the current rtc time.  Other arches, such
      as powerpc, however do a full synchronization of the system time to the
      rtc and will see this problem.
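
      As I read the patch, the fix amounts to backing the warp out before
      the write (sign of the adjustment per my reading):

         struct timespec adjust = now;

         if (persistent_clock_is_local)
                 adjust.tv_sec -= (sys_tz.tz_minuteswest * 60);
         update_persistent_clock(adjust);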
      
      [v2]: generated against tip/timers/core
      Signed-off-by: Prarit Bhargava <prarit@redhat.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: John Stultz <john.stultz@linaro.org>
      84e345e4
    • uprobes: Introduce uprobe_apply() · bdf8647c
      Committed by Oleg Nesterov
      Currently it is not possible to change the filtering constraints after
      uprobe_register(), so a consumer can not, say, start to trace a task/mm
      which was previously filtered out, or remove the no longer needed bp's.
      
      Introduce uprobe_apply() which simply does register_for_each_vma() again
      to consult uprobe_consumer->filter() and install/remove the breakpoints.
      The only complication is that register_for_each_vma() can no longer
      assume that uprobe->consumers should be consulted if is_register == T,
      so we change it to accept "struct uprobe_consumer *new" instead.
      
      Unlike uprobe_register(), uprobe_apply(true) doesn't do "unregister" if
      register_for_each_vma() fails; it is up to the caller to handle the error.
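
      The new interface, roughly:

         int uprobe_apply(struct inode *inode, loff_t offset,
                          struct uprobe_consumer *uc, bool add);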
      
      Note: we probably need to cleanup the current interface, it is strange
      that uprobe_apply/unregister need inode/offset. We should either change
      uprobe_register() to return "struct uprobe *", or add a private ->uprobe
      member in uprobe_consumer. And in the long term uprobe_apply() should
      take a single argument, uprobe or consumer, even "bool add" should go
      away.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      bdf8647c
    • perf: Introduce hw_perf_event->tp_target and ->tp_list · f22c1bb6
      Committed by Oleg Nesterov
      sys_perf_event_open()->perf_init_event(event) is called before
      find_get_context(event); this means that event->ctx == NULL when
      class->reg(TRACE_REG_PERF_REGISTER/OPEN) is called and thus it
      can't know if this event is per-task or system-wide.
      
      This patch adds hw_perf_event->tp_target for PERF_TYPE_TRACEPOINT,
      this is analogous to PERF_TYPE_BREAKPOINT/bp_target we already have.
      The patch also moves ->bp_target up so that it can overlap with the
      new member; this can help the compiler generate better code.
      
      trace_uprobe_register() will use it for prefiltering to avoid the
      unnecessary breakpoints in mm's we do not want to trace.
      
      ->tp_target doesn't have its own reference, but we can rely on the
      fact that either sys_perf_event_open() holds a reference, or it is
      equal to event->ctx->task. So this pointer is always valid until
      free_event().
      
      Also add the "struct list_head tp_list" into this union. It is not
      strictly necessary, but it can simplify the next changes and we can
      add it for free.
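
      The union then looks roughly like this (abridged):

         union {
                 struct { /* breakpoint */
                         struct task_struct        *bp_target;  /* moved up */
                         struct arch_hw_breakpoint  info;
                         struct list_head           bp_list;
                 };
                 struct { /* tracepoint */
                         struct task_struct *tp_target;
                         struct list_head    tp_list;  /* for tp_event->class */
                 };
         };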
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      f22c1bb6
    • uprobes: Teach handler_chain() to filter out the probed task · da1816b1
      Committed by Oleg Nesterov
      Currently there are 2 problems with pre-filtering:
      
      1. It is not possible to add/remove a task (mm) after uprobe_register()
      
      2. A forked child inherits all breakpoints and uprobe_consumer can not
         control this.
      
      This patch does the first step to improve the filtering. handler_chain()
      removes the breakpoints installed by this uprobe from current->mm if all
      handlers return UPROBE_HANDLER_REMOVE.
      
      Note that handler_chain() relies on ->register_rwsem to avoid the race
      with uprobe_register/unregister which can add/del a consumer, or even
      remove and then insert the new uprobe at the same address.
      
      Perhaps we will add uprobe_apply_mm(uprobe, mm, is_register) and teach
      copy_mm() to do filter(UPROBE_FILTER_FORK), but I think this change makes
      sense anyway.
      
      Note: instead of checking the retcode from uc->handler, we could add
      uc->filter(UPROBE_FILTER_BPHIT). But I think this is not optimal to
      call 2 hooks in a row. This buys nothing, and if handler/filter do
      something nontrivial they will probably do the same work twice.
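
      A consumer can now unhook itself from uninteresting tasks roughly
      like this (the predicate is hypothetical):

         static int my_handler(struct uprobe_consumer *uc,
                               struct pt_regs *regs)
         {
                 if (!task_is_interesting(current))     /* hypothetical */
                         return UPROBE_HANDLER_REMOVE;  /* drop bps in this mm */
                 return 0;
         }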
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      da1816b1
    • uprobes: Reintroduce uprobe_consumer->filter() · 8a7f2fa0
      Committed by Oleg Nesterov
      Finally add uprobe_consumer->filter() and change consumer_filter()
      to actually call this method.
      
      Note that ->filter() accepts mm_struct, not task_struct. Because:
      
      	1. We do not have for_each_mm_user(mm, task).
      
      	2. Even if we implement for_each_mm_user(), ->filter() can
      	   use it itself.
      
      	3. It is not clear who will actually need this interface to
      	   do the "nontrivial" filtering.
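
      The callback shape, per this patch (consumer struct abridged):

         struct uprobe_consumer {
                 int  (*handler)(struct uprobe_consumer *self,
                                 struct pt_regs *regs);
                 bool (*filter)(struct uprobe_consumer *self,
                                enum uprobe_filter_ctx ctx,
                                struct mm_struct *mm);
                 struct uprobe_consumer *next;
         };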
      
      Another argument is "enum uprobe_filter_ctx", consumer->filter() can
      use it to figure out why/where it was called. For example, perhaps
      we can add UPROBE_FILTER_PRE_REGISTER used by build_map_info() to
      quickly "nack" the unwanted mm's. In this case consumer should know
      that it is called under ->i_mmap_mutex.
      
      See the previous discussion at http://marc.info/?t=135214229700002
      Perhaps we should pass more arguments, vma/vaddr?
      
      Note: this patch obviously can't help to filter out the child created
      by fork(), this will be addressed later.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      8a7f2fa0
    • uprobes: Kill uprobe_consumer->filter() · fe20d71f
      Committed by Oleg Nesterov
      uprobe_consumer->filter() is pointless in its current form, kill it.
      
      We will add it back, but with the different signature/semantics. Perhaps
      we will even re-introduce the callsite in handler_chain(), but not to
      just skip uc->handler().
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      fe20d71f
  5. 08 February 2013 (6 commits)
  6. 07 February 2013 (2 commits)
    • workqueue: add delayed_work->wq to simplify reentrancy handling · 60c057bc
      Committed by Lai Jiangshan
      To avoid executing the same work item from multiple CPUs concurrently,
      a work_struct records the last pool it was on in its ->data so that,
      on the next queueing, the pool can be queried to determine whether the
      work item is still executing or not.
      
      A delayed_work goes through timer before actually being queued on the
      target workqueue and the timer needs to know the target workqueue and
      CPU.  This is currently achieved by modifying delayed_work->work.data
      such that it points to the cwq which points to the target workqueue
      and the last CPU the work item was on.  __queue_delayed_work()
      extracts the last CPU from delayed_work->work.data and then combines
      it with the target workqueue to create new work.data.
      
      The only thing this rather ugly hack achieves is encoding the target
      workqueue into delayed_work->work.data without using a separate field,
      which could be a trade off one can make; unfortunately, this entangles
      work->data management between regular workqueue and delayed_work code
      by setting cwq pointer before the work item is actually queued and
      becomes a hindrance for further improvements of work->data handling.
      
      This can be easily made sane by adding a target workqueue field to
      delayed_work.  While delayed_work is used widely in the kernel and
      this does make it a bit larger (<5%), I think this is the right
      trade-off especially given the prospect of much saner handling of
      work->data which currently involves quite tricky memory barrier
      dancing, and don't expect to see any measurable effect.
      
      Add delayed_work->wq and drop the delayed_work->work.data overloading.
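
      The struct then looks roughly like this (other fields elided):

         struct delayed_work {
                 struct work_struct       work;
                 struct timer_list        timer;
                 struct workqueue_struct *wq;     /* the new target wq field */
         };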
      
      tj: Rewrote the description.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      60c057bc
    • workqueue: replace WORK_CPU_NONE/LAST with WORK_CPU_END · 6be19588
      Committed by Lai Jiangshan
      Now that workqueue has moved away from gcwqs, workqueue no longer has
      the need to have a CPU identifier indicating "no cpu associated" - we
      now use WORK_OFFQ_POOL_NONE instead - and most uses of WORK_CPU_NONE
      are gone.
      
      The only left usage is as the end marker for for_each_*wq*()
      iterators, where the name WORK_CPU_NONE is confusing w/o actual
      WORK_CPU_NONE usages.  Similarly, WORK_CPU_LAST which equals
      WORK_CPU_NONE no longer makes sense.
      
      Replace both WORK_CPU_NONE and LAST with WORK_CPU_END.  This patch
      doesn't introduce any functional difference.
      
      tj: s/WORK_CPU_LAST/WORK_CPU_END/ and rewrote the description.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      6be19588
  7. 05 February 2013 (2 commits)
  8. 04 February 2013 (1 commit)
  9. 01 February 2013 (3 commits)
  10. 31 January 2013 (4 commits)
    • net: usbnet: prevent buggy devices from killing us · 70c37bf9
      Committed by Bjørn Mork
      A device sending 0 length frames as fast as it can has been
      observed killing the host system due to the resulting memory
      pressure.
      
      Temporarily disable RX skb allocation and URB submission when
      the current error ratio is high, preventing us from trying to
      allocate an infinite number of skbs.  Reenable as soon as we
      are finished processing the done queue, allowing the device
      to continue working after short error bursts.
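
      The mechanism is roughly the sketch below; EVENT_RX_KILL is the flag
      the patch adds, while the counters and threshold are simplified
      placeholders:

         /* rx error path: error ratio too high? stop rx for now */
         if (++dev->pkt_err > PKT_ERR_THRESHOLD)        /* placeholder */
                 set_bit(EVENT_RX_KILL, &dev->flags);

         /* rx alloc/submit path */
         if (test_bit(EVENT_RX_KILL, &dev->flags))
                 return;                                /* throttled */

         /* bottom half, after the done queue is processed */
         clear_bit(EVENT_RX_KILL, &dev->flags);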
      Signed-off-by: Bjørn Mork <bjorn@mork.no>
      Acked-by: Oliver Neukum <oneukum@suse.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      70c37bf9
    • efi: Make 'efi_enabled' a function to query EFI facilities · 83e68189
      Committed by Matt Fleming
      Originally 'efi_enabled' indicated whether a kernel was booted from
      EFI firmware. Over time its semantics have changed, and it now
      indicates whether or not we are booted on an EFI machine with
      bit-native firmware, e.g. 64-bit kernel with 64-bit firmware.
      
      The immediate motivation for this patch is the bug report at,
      
          https://bugs.launchpad.net/ubuntu-cdimage/+bug/1040557
      
      which details how running a platform driver on an EFI machine that is
      designed to run under BIOS can cause the machine to become
      bricked. Also, the following report,
      
          https://bugzilla.kernel.org/show_bug.cgi?id=47121
      
      details how running said driver can also cause Machine Check
      Exceptions. Drivers need a new means of detecting whether they're
      running on an EFI machine, as sadly the expression,
      
          if (!efi_enabled)
      
      hasn't been a sufficient condition for quite some time.
      
      Users actually want to query 'efi_enabled' for different reasons -
      what they really want access to is the list of available EFI
      facilities.
      
      For instance, the x86 reboot code needs to know whether it can invoke
      the ResetSystem() function provided by the EFI runtime services, while
      the ACPI OSL code wants to know whether the EFI config tables were
      mapped successfully. There are also checks in some of the platform
      driver code to simply see if they're running on an EFI machine (which
      would make it a bad idea to do BIOS-y things).
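
      With this change, efi_enabled becomes a function taking a facility
      bit, so such checks can be written as, for instance:

         if (!efi_enabled(EFI_BOOT))
                 return;        /* not booted on EFI, skip EFI-only paths */

         if (efi_enabled(EFI_RUNTIME_SERVICES))
                 efi.reset_system(EFI_RESET_COLD, EFI_SUCCESS, 0, NULL);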
      
      This patch is a prereq for the samsung-laptop fix patch.
      
      Cc: David Airlie <airlied@linux.ie>
      Cc: Corentin Chary <corentincj@iksaif.net>
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Cc: Dave Jiang <dave.jiang@intel.com>
      Cc: Olof Johansson <olof@lixom.net>
      Cc: Peter Jones <pjones@redhat.com>
      Cc: Colin Ian King <colin.king@canonical.com>
      Cc: Steve Langasek <steve.langasek@canonical.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Konrad Rzeszutek Wilk <konrad@kernel.org>
      Cc: Rafael J. Wysocki <rjw@sisk.pl>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Matt Fleming <matt.fleming@intel.com>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      83e68189
    • tracing: Make a snapshot feature available from userspace · debdd57f
      Committed by Hiraku Toyooka
      Ftrace has a snapshot feature available from kernel space and
      latency tracers (e.g. irqsoff) are using it. This patch enables
      user applications to take a snapshot via debugfs.
      
      Add "snapshot" debugfs file in "tracing" directory.
      
        snapshot:
          This is used to take a snapshot and to read the output of the
          snapshot.
      
           # echo 1 > snapshot
      
          This will allocate the spare buffer for snapshot (if it is
          not allocated), and take a snapshot.
      
           # cat snapshot
      
          This will show contents of the snapshot.
      
           # echo 0 > snapshot
      
          This will free the snapshot if it is allocated.
      
          Any other positive values will clear the snapshot contents if
          the snapshot is allocated, or return EINVAL if it is not allocated.
      
      Link: http://lkml.kernel.org/r/20121226025300.3252.86850.stgit@liselsia
      
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: David Sharp <dhsharp@google.com>
      Signed-off-by: Hiraku Toyooka <hiraku.toyooka.gu@hitachi.com>
      [
         Fixed irqsoff selftest and also a conflict with a change
         that fixes the update_max_tr.
      ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      debdd57f
    • ring-buffer: Add stats field for amount read from trace ring buffer · ad964704
      Committed by Steven Rostedt (Red Hat)
      Add a stat about the number of events read from the ring buffer:
      
       #  cat /debug/tracing/per_cpu/cpu0/stats
      entries: 39869
      overrun: 870512
      commit overrun: 0
      bytes: 1449912
      oldest event ts:  6561.368690
      now ts:  6565.246426
      dropped events: 0
      read events: 112    <-- Added
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      ad964704
  11. 30 January 2013 (1 commit)
  12. 28 January 2013 (3 commits)
    • x86, msi: Use IRQ remapping specific setup_msi_irqs routine · 5afba62c
      Committed by Joerg Roedel
      Use separate routines to set up MSI IRQs for both
      irq_remapping_enabled cases.
      Signed-off-by: Joerg Roedel <joro@8bytes.org>
      Acked-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
      Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      5afba62c
    • cputime: Safely read cputime of full dynticks CPUs · 6a61671b
      Committed by Frederic Weisbecker
      While remotely reading the cputime of a task running on a
      full dynticks CPU, the values stored in the utime/stime fields
      of struct task_struct may be stale. These values may be those
      of the last kernel <-> user transition snapshot, and
      we need to add the tickless time spent since that snapshot.
      
      To fix this, flush the cputime of the dynticks CPUs on
      kernel <-> user transition and record the time / context
      where we did this. Then on top of this snapshot and the current
      time, perform the fixup on the reader side from task_times()
      accessors.
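
      The reader-side fixup amounts to, in rough sketch form (field names
      per the patch, logic simplified):

         delta = local_clock() - t->vtime_snap;      /* time since snapshot */
         utime = t->utime;
         if (t->vtime_snap_whence == VTIME_USER)     /* was in userspace */
                 utime += delta;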
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Namhyung Kim <namhyung.kim@lge.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      [fixed kvm module related build errors]
      Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com>
      6a61671b
    • kvm: Prepare to add generic guest entry/exit callbacks · c11f11fc
      Committed by Frederic Weisbecker
      Do some ground preparatory work before adding guest_enter()
      and guest_exit() context tracking callbacks. Those will
      be later used to read the guest cputime safely when we
      run in full dynticks mode.
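
      The callbacks being prepared for will look roughly like this
      (simplified sketch):

         static inline void guest_enter(void)
         {
                 vtime_account_system(current);  /* flush pre-guest cputime */
                 current->flags |= PF_VCPU;      /* now accounted as guest */
         }

         static inline void guest_exit(void)
         {
                 vtime_account_system(current);  /* flush guest cputime */
                 current->flags &= ~PF_VCPU;
         }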
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Gleb Natapov <gleb@redhat.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Namhyung Kim <namhyung.kim@lge.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      c11f11fc