1. 18 Sep, 2014 4 commits
  2. 06 Sep, 2014 1 commit
  3. 13 Aug, 2014 1 commit
  4. 08 Aug, 2014 1 commit
  5. 07 Aug, 2014 2 commits
    • mm: pagemap: avoid unnecessary overhead when tracepoints are deactivated · 24b7e581
      Authored by Mel Gorman
      This was formerly the series "Improve sequential read throughput" which
      noted some major differences in performance of tiobench since 3.0.
      While there are a number of factors, two that dominated were the
      introduction of the fair zone allocation policy and changes to CFQ.
      
      The behaviour of the fair zone allocation policy makes more sense than
      tiobench does as a benchmark, and the CFQ defaults were not changed due
      to insufficient benchmarking.
      
      This series is what's left.  It's one functional fix to the fair zone
      allocation policy when used on NUMA machines and a reduction of overhead
      in general.  tiobench was used for the comparison despite its flaws as
      an IO benchmark, since in this case we are primarily interested in the
      overhead of page allocator and page reclaim activity.
      
      On UMA, it makes little difference to the overhead:
      
                3.16.0-rc3   3.16.0-rc3
                   vanilla lowercost-v5
      User          383.61      386.77
      System        403.83      401.74
      Elapsed      5411.50     5413.11
      
      On a 4-socket NUMA machine it's a bit more noticeable:
      
                3.16.0-rc3   3.16.0-rc3
                   vanilla lowercost-v5
      User          746.94      802.00
      System      65336.22    40852.33
      Elapsed     27553.52    27368.46
      
      This patch (of 6):
      
      The LRU insertion and activate tracepoints take a PFN as a parameter,
      forcing the overhead onto the caller.  Move the overhead into the
      tracepoint's fast-assign method so that the cost is only incurred when
      the tracepoint is active.
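      
      A hedged sketch of the technique (the field set here is illustrative,
      not the verbatim upstream event): the caller passes only the struct
      page, and page_to_pfn() is evaluated inside TP_fast_assign, which only
      runs when the tracepoint is enabled:
      
        TRACE_EVENT(mm_lru_insertion,
        
        	TP_PROTO(struct page *page, int lru),
        
        	TP_ARGS(page, lru),
        
        	TP_STRUCT__entry(
        		__field(struct page *,	page)
        		__field(unsigned long,	pfn)
        		__field(int,		lru)
        	),
        
        	TP_fast_assign(
        		__entry->page	= page;
        		/* computed only when the tracepoint is active */
        		__entry->pfn	= page_to_pfn(page);
        		__entry->lru	= lru;
        	),
        
        	TP_printk("page=%p pfn=%lu lru=%d",
        		  __entry->page, __entry->pfn, __entry->lru)
        );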
      Signed-off-by: Mel Gorman <mgorman@suse.de>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm tracing: tell mm_migrate_pages event about numa_misplaced · 4a262d26
      Authored by Max Asbock
      The mm_migrate_pages trace event reports a reason for the migration,
      typically as a symbolic string.  The exception is the reason
      MR_NUMA_MISPLACED for which it just displays the numeric value:
      mm_migrate_pages: nr_succeeded=1 nr_failed=0 mode=MIGRATE_ASYNC
      reason=0x5
      
      This patch makes the output consistent by introducing a string value for
      MR_NUMA_MISPLACED.  The event is then reported as: mm_migrate_pages:
      nr_succeeded=1 nr_failed=0 mode=MIGRATE_ASYNC reason=numa_misplaced
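      
      A hedged sketch of the fix (the table entries shown are an illustrative
      subset; the full list lives in include/trace/events/migrate.h): the
      reason codes are mapped to strings with __print_symbolic(), and the
      patch adds the missing numa_misplaced entry:
      
        /* illustrative subset of the reason table */
        #define MIGRATE_REASON					\
        	{MR_COMPACTION,		"compaction"},		\
        	{MR_MEMORY_FAILURE,	"memory_failure"},	\
        	{MR_NUMA_MISPLACED,	"numa_misplaced"}	/* newly added */
        
        TP_printk("nr_succeeded=%lu nr_failed=%lu mode=%s reason=%s",
        	  __entry->succeeded, __entry->failed,
        	  __print_symbolic(__entry->mode, MIGRATE_MODE),
        	  __print_symbolic(__entry->reason, MIGRATE_REASON))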
      Signed-off-by: Max Asbock <masbock@linux.vnet.ibm.com>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Acked-by: Mel Gorman <mgorman@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  6. 06 Aug, 2014 1 commit
    • KVM: Move more code under CONFIG_HAVE_KVM_IRQFD · c77dcacb
      Authored by Paolo Bonzini
      Commit e4d57e1e (KVM: Move irq notifier implementation into
      eventfd.c, 2014-06-30) included the irq notifier code unconditionally
      in eventfd.c, while it was under CONFIG_HAVE_KVM_IRQCHIP before.
      
      Similarly, commit 297e2105 (KVM: Give IRQFD its own separate enabling
      Kconfig option, 2014-06-30) moved code from CONFIG_HAVE_IRQ_ROUTING
      to CONFIG_HAVE_KVM_IRQFD but forgot to move the pieces that used to be
      under CONFIG_HAVE_KVM_IRQCHIP.
      
      Together, this broke compilation without CONFIG_KVM_XICS.  Fix by adding
      or changing the #ifdefs so that they point at CONFIG_HAVE_KVM_IRQFD.
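      
      A hedged sketch of the shape of the fix (the declaration shown is one
      illustrative example, not the full diff):
      
        #ifdef CONFIG_HAVE_KVM_IRQFD	/* was: CONFIG_HAVE_KVM_IRQCHIP */
        void kvm_notify_acked_irq(struct kvm *kvm, unsigned irqchip, unsigned pin);
        #else
        static inline void kvm_notify_acked_irq(struct kvm *kvm,
        					unsigned irqchip, unsigned pin)
        {
        }
        #endif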
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  7. 05 Aug, 2014 2 commits
  8. 02 Aug, 2014 1 commit
  9. 31 Jul, 2014 2 commits
  10. 09 Jul, 2014 1 commit
    • fence: dma-buf cross-device synchronization (v18) · e941759c
      Authored by Maarten Lankhorst
      A fence can be attached to a buffer which is being filled or consumed
      by hw, to allow userspace to pass the buffer to another device without
      waiting.  For example, userspace can call the page_flip ioctl to display the
      next frame of graphics after kicking the GPU but while the GPU is still
      rendering.  The display device sharing the buffer with the GPU would
      attach a callback to get notified when the GPU's rendering-complete IRQ
      fires, to update the scan-out address of the display, without having to
      wake up userspace.
      
      A driver must allocate a fence context for each execution ring that can
      run in parallel.  The function for this takes an argument specifying how
      many contexts to allocate:
        + fence_context_alloc()
      
      A fence is a transient, one-shot deal.  It is allocated and attached
      to one or more dma-buf's.  When the one that attached it is done with
      the pending operation, it can signal the fence:
        + fence_signal()
      
      To get a rough approximation of whether a fence has fired, call:
        + fence_is_signaled()
      
      The dma-buf-mgr handles tracking, and waiting on, the fences associated
      with a dma-buf.
      
      The party waiting on the fence can add an async callback:
        + fence_add_callback()
      
      The callback can optionally be cancelled with:
        + fence_remove_callback()
      
      To wait synchronously, optionally with a timeout:
        + fence_wait()
        + fence_wait_timeout()
      
      When emitting a fence, call:
        + trace_fence_emit()
      
      To annotate that a fence is blocking on another fence, call:
        + trace_fence_annotate_wait_on(fence, on_fence)
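      
      Putting the pieces together, a minimal hedged usage sketch (my_cb and
      the producer/consumer split are hypothetical driver code; error
      handling omitted):
      
        /* the callback runs once, when the fence is signaled */
        static void my_cb(struct fence *fence, struct fence_cb *cb)
        {
        	/* e.g. update the display scan-out address */
        }
        
        /* consumer: register an async callback ... */
        struct fence_cb cb;
        if (fence_add_callback(fence, &cb, my_cb))
        	; /* already signaled, callback was not queued */
        
        /* ... or block, interruptibly, with a timeout in jiffies */
        fence_wait_timeout(fence, true, msecs_to_jiffies(100));
        
        /* producer: signal completion, e.g. from the rendering-done IRQ */
        fence_signal(fence);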
      
      A default software-only implementation is provided, which can be used
      by drivers attaching a fence to a buffer when they have no other means
      for hw sync.  But a memory-backed fence is also envisioned, because it
      is common that GPUs can write to, or poll on, some memory location for
      synchronization.  For example:
      
        fence = custom_get_fence(...);
        if ((seqno_fence = to_seqno_fence(fence)) != NULL) {
          struct dma_buf *fence_buf = seqno_fence->sync_buf;
          get_dma_buf(fence_buf);
      
          ... tell the hw the memory location to wait ...
          custom_wait_on(fence_buf, seqno_fence->seqno_ofs, fence->seqno);
        } else {
          /* fall-back to sw sync */
          fence_add_callback(fence, my_cb);
        }
      
      On SoC platforms, if some other hw mechanism is provided for synchronizing
      between IP blocks, it could be supported as an alternate implementation
      with its own fence ops in a similar way.
      
      The enable_signaling callback is used to provide sw signaling in case a cpu
      waiter is requested, or when no compatible hardware signaling could be used.
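      
      A hedged sketch of what a driver's ops might look like under this
      scheme (the my_* helpers are hypothetical):
      
        static bool my_fence_enable_signaling(struct fence *fence)
        {
        	/*
        	 * Called when a CPU waiter appears; arm the completion irq.
        	 * Return false if the fence already signaled meanwhile.
        	 */
        	return my_hw_enable_completion_irq(to_my_fence(fence));
        }
        
        static const struct fence_ops my_fence_ops = {
        	.get_driver_name	= my_get_driver_name,
        	.get_timeline_name	= my_get_timeline_name,
        	.enable_signaling	= my_fence_enable_signaling,
        	.wait			= fence_default_wait,
        };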
      
      The intention is to provide a userspace interface (presumably via eventfd)
      later, to be used in conjunction with dma-buf's mmap support for sw access
      to buffers (or for userspace apps that would prefer to do their own
      synchronization).
      
      v1: Original
      v2: After discussion w/ danvet and mlankhorst on #dri-devel, we decided
          that dma-fence didn't need to care about the sw->hw signaling path
          (it can be handled same as sw->sw case), and therefore the fence->ops
          can be simplified and more handled in the core.  So remove the signal,
          add_callback, cancel_callback, and wait ops, and replace with a simple
          enable_signaling() op which can be used to inform a fence supporting
          hw->hw signaling that one or more devices which do not support hw
          signaling are waiting (and therefore it should enable an irq or do
          whatever is necessary in order that the CPU is notified when the
          fence is passed).
      v3: Fix locking fail in attach_fence() and get_fence()
      v4: Remove tie-in w/ dma-buf..  after discussion w/ danvet and mlankhorst
          we decided that we need to be able to attach one fence to N dma-buf's,
          so using the list_head in dma-fence struct would be problematic.
      v5: [ Maarten Lankhorst ] Updated for dma-bikeshed-fence and dma-buf-manager.
      v6: [ Maarten Lankhorst ] I removed dma_fence_cancel_callback and some comments
          about checking if fence fired or not. This is broken by design.
          waitqueue_active during destruction is now fatal, since the signaller
          should be holding a reference in enable_signaling until it signalled
          the fence. Pass the original dma_fence_cb along, and call __remove_wait
          in the dma_fence_callback handler, so that no cleanup needs to be
          performed.
      v7: [ Maarten Lankhorst ] Set cb->func and only enable sw signaling if
          fence wasn't signaled yet, for example for hardware fences that may
          choose to signal blindly.
      v8: [ Maarten Lankhorst ] Tons of tiny fixes, moved __dma_fence_init to
          header and fixed include mess. dma-fence.h now includes dma-buf.h.
          All members are now initialized, so kmalloc can be used for
          allocating a dma-fence. More documentation added.
      v9: Change compiler bitfields to flags, change return type of
          enable_signaling to bool. Rework dma_fence_wait. Added
          dma_fence_is_signaled and dma_fence_wait_timeout.
          s/dma// and change exports to non GPL. Added fence_is_signaled and
          fence_enable_sw_signaling calls, add ability to override default
          wait operation.
      v10: remove event_queue, use a custom list, export try_to_wake_up from
          scheduler. Remove fence lock and use a global spinlock instead,
          this should hopefully remove all the locking headaches I was having
          on trying to implement this. enable_signaling is called with this
          lock held.
      v11:
          Use atomic ops for flags, lifting the need for some spin_lock_irqsaves.
          However I kept the guarantee that after fence_signal returns, it is
          guaranteed that enable_signaling has either been called to completion,
          or will not be called any more.
      
          Add contexts and seqno to base fence implementation. This allows you
          to wait for fewer fences, by testing for seqno + signaled, and then only
          wait on the later fence.
      
          Add FENCE_TRACE, FENCE_WARN, and FENCE_ERR. This makes debugging easier.
          A CONFIG_DEBUG_FENCE option will be added to turn off the FENCE_TRACE
          spam, and another option can turn it off at runtime.
      v12:
          Add CONFIG_FENCE_TRACE. Add missing documentation for the fence->context
          and fence->seqno members.
      v13:
          Fixup CONFIG_FENCE_TRACE kconfig description.
          Move fence_context_alloc to fence.
          Simplify fence_later.
          Kill priv member to fence_cb.
      v14:
          Remove priv argument from fence_add_callback, oops!
      v15:
          Remove priv from documentation.
          Explicitly include linux/atomic.h.
      v16:
          Add trace events.
          Import changes required by android syncpoints.
      v17:
          Use wake_up_state instead of try_to_wake_up. (Colin Cross)
          Fix up commit description for seqno_fence. (Rob Clark)
      v18:
          Rename release_fence to fence_release.
          Move to drivers/dma-buf/.
          Rename __fence_is_signaled and __fence_signal to *_locked.
          Rename __fence_init to fence_init.
          Make fence_default_wait return a signed long, and fix wait ops too.
      Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
      Signed-off-by: Thierry Reding <thierry.reding@gmail.com> #use smp_mb__before_atomic()
      Acked-by: Sumit Semwal <sumit.semwal@linaro.org>
      Acked-by: Daniel Vetter <daniel@ffwll.ch>
      Reviewed-by: Rob Clark <robdclark@gmail.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  11. 24 Jun, 2014 1 commit
  12. 22 Jun, 2014 1 commit
  13. 21 Jun, 2014 2 commits
  14. 11 Jun, 2014 1 commit
    • PM / sleep: trace events for device PM callbacks · e8bca479
      Authored by Todd E Brandt
      Adds two trace events which supply the same info that initcall_debug
      provides, but via ftrace instead of dmesg. The existing initcall_debug
      calls require the pm_print_times_enabled var to be set (either via
      sysfs or via the kernel cmd line). The new trace events provide all the
      same info as the initcall_debug prints but with less overhead, and also
      cover the device prepare and complete callbacks.
      
      These events replace the device_pm_report_time event (which has been
      removed). device_pm_callback_start is called first and provides the device
      and callback info.  device_pm_callback_end is called afterwards with the
      device name and error info. The time and pid are gathered from the trace
      data headers.
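      
      A hedged sketch of how the PM core would bracket a callback with the
      pair (the argument lists here are an assumption based on the
      description above, not the verbatim upstream calls):
      
        trace_device_pm_callback_start(dev, info, state.event);
        error = callback(dev);	/* the driver's suspend/resume op */
        trace_device_pm_callback_end(dev, error);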
      Signed-off-by: Todd Brandt <todd.e.brandt@intel.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  15. 07 Jun, 2014 1 commit
    • PM / sleep: trace events for suspend/resume · bb3632c6
      Authored by Todd E Brandt
      Adds trace events that give finer resolution into suspend/resume. These
      events are graphed in the timelines generated by the analyze_suspend.py
      script. They represent large areas of time consumed that are typical to
      suspend and resume.
      
      The event is triggered by calling the function "trace_suspend_resume"
      with three arguments: a string (the name of the event to be displayed
      in the timeline), an integer (case specific number, such as the power
      state or cpu number), and a boolean (where true is used to denote the start
      of the timeline event, and false to denote the end).
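      
      For instance, a hedged sketch bracketing one phase (the phase name and
      state value are illustrative; TPS() stands for the kernel's
      tracepoint_string() helper):
      
        trace_suspend_resume(TPS("machine_suspend"), state, true);	/* start */
        /* ... the expensive phase being measured ... */
        trace_suspend_resume(TPS("machine_suspend"), state, false);	/* end */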
      
      The suspend_resume trace event reproduces the data that the machine_suspend
      trace event did, so the latter has been removed.
      Signed-off-by: Todd Brandt <todd.e.brandt@intel.com>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  16. 05 Jun, 2014 6 commits
    • sched, trace: Add a tracepoint for IPI-less remote wakeups · dfc68f29
      Authored by Andy Lutomirski
      Remote wakeups of polling CPUs are a valuable performance
      improvement; add a tracepoint to make it much easier to verify that
      they're working.
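      
      A hedged sketch of where such a tracepoint would fire (the placement
      and the polling helper shown are illustrative):
      
        /* if the target CPU is polling, setting TIF_NEED_RESCHED suffices */
        if (set_nr_if_polling(rq->idle))
        	trace_sched_wake_idle_without_ipi(cpu);
        else
        	smp_send_reschedule(cpu);	/* otherwise, send a real IPI */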
      Signed-off-by: Andy Lutomirski <luto@amacapital.net>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Cc: nicolas.pitre@linaro.org
      Cc: daniel.lezcano@linaro.org
      Cc: umgwanakikbuti@gmail.com
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: linux-kernel@vger.kernel.org
      Link: http://lkml.kernel.org/r/16205aee116772aa686814f9b13bccb562108047.1401902905.git.luto@amacapital.net
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • mm/compaction: do not count migratepages when unnecessary · f8c9301f
      Authored by Vlastimil Babka
      During compaction, update_nr_listpages() has been used to count remaining
      non-migrated and free pages after a call to migrate_pages().  The
      freepages counting has become unnecessary, and it turns out that
      migratepages counting is also unnecessary in most cases.
      
      The only situation when it's needed to count cc->migratepages is when
      migrate_pages() returns with a negative error code.  Otherwise, the
      non-negative return value is the number of pages that were not migrated,
      which is exactly the count of remaining pages in the cc->migratepages
      list.
      
      Furthermore, any non-zero count is only interesting for the
      mm_compaction_migratepages tracepoint, because after that all remaining
      unmigrated pages are put back and their count is set to 0.
      
      This patch therefore removes update_nr_listpages() completely, and changes
      the tracepoint definition so that the manual counting is done only when
      the tracepoint is enabled, and only when migrate_pages() returns a
      negative error code.
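      
      A hedged sketch of the resulting fast-assign (illustrative, not the
      verbatim upstream macro):
      
        TP_fast_assign(
        	unsigned long nr_failed = 0;
        
        	if (migrate_rc >= 0) {
        		/* non-negative rc is the number of pages not migrated */
        		nr_failed = migrate_rc;
        	} else {
        		struct list_head *page_lru;
        
        		/* walk the list only on error, only when enabled */
        		list_for_each(page_lru, migratepages)
        			nr_failed++;
        	}
        
        	__entry->nr_succeeded = nr_all - nr_failed;
        	__entry->nr_failed = nr_failed;
        ),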
      
      Furthermore, migrate_pages() and the tracepoints won't be called when
      there's nothing to migrate.  This potentially avoids some wasted cycles
      and reduces the volume of uninteresting mm_compaction_migratepages events
      where "nr_migrated=0 nr_failed=0".  In the stress-highalloc mmtest, this
      was about 75% of the events.  The mm_compaction_isolate_migratepages event
      is better for determining that nothing was isolated for migration, and
      this one was just duplicating the info.
      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
      Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
      Acked-by: Michal Nazarewicz <mina86@mina86.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Rik van Riel <riel@redhat.com>
      Acked-by: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: shrinker: add nid to tracepoint output · df9024a8
      Authored by Dave Hansen
      Now that we are doing NUMA-aware shrinking, and can have shrinkers
      running in parallel, or working on individual nodes, it seems like we
      should also be sticking the node in the output.
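      
      A hedged sketch of the change (field names and layout are illustrative,
      not the verbatim upstream event):
      
        TP_STRUCT__entry(
        	__field(void *,	shrink)
        	__field(int,	nid)		/* newly recorded node id */
        	__field(long,	nr_objects)
        ),
        
        TP_fast_assign(
        	__entry->shrink		= shr->scan_objects;
        	__entry->nid		= sc->nid;
        	__entry->nr_objects	= nr_objects;
        ),
        
        TP_printk("%pF nid: %d objects to shrink %ld",
        	  __entry->shrink, __entry->nid, __entry->nr_objects)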
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Acked-by: Dave Chinner <david@fromorbit.com>
      Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: shrinker trace points: fix negatives · 7fe70475
      Authored by Dave Hansen
      I was looking at a trace of the slab shrinkers (attachment in this comment):
      
      	https://bugs.freedesktop.org/show_bug.cgi?id=72742#c67
      
      and noticed that "total_scan" can go negative in some cases.  We
      used to dump out the "total_scan" variable directly, but some of
      the shrinker modifications along the way changed that.
      
      This patch just dumps it out directly, again.  It doesn't make
      any sense to derive it from new_nr and nr any more since there
      are now other shrinkers that can be running in parallel and
      mucking with those values.
      
      Here's an example of the negative numbers in the output:
      
      >          kswapd0-840   [000]   160.869398: mm_shrink_slab_end:   i915_gem_inactive_scan+0x0 0xffff8800037cbc68: unused scan count 10 new scan count 39 total_scan 29 last shrinker return val 256
      >          kswapd0-840   [000]   160.869618: mm_shrink_slab_end:   i915_gem_inactive_scan+0x0 0xffff8800037cbc68: unused scan count 39 new scan count 102 total_scan 63 last shrinker return val 256
      >          kswapd0-840   [000]   160.870031: mm_shrink_slab_end:   i915_gem_inactive_scan+0x0 0xffff8800037cbc68: unused scan count 102 new scan count 47 total_scan -55 last shrinker return val 768
      >          kswapd0-840   [000]   160.870464: mm_shrink_slab_end:   i915_gem_inactive_scan+0x0 0xffff8800037cbc68: unused scan count 47 new scan count 45 total_scan -2 last shrinker return val 768
      >          kswapd0-840   [000]   163.384144: mm_shrink_slab_end:   i915_gem_inactive_scan+0x0 0xffff8800037cbc68: unused scan count 45 new scan count 56 total_scan 11 last shrinker return val 0
      >          kswapd0-840   [000]   163.384297: mm_shrink_slab_end:   i915_gem_inactive_scan+0x0 0xffff8800037cbc68: unused scan count 56 new scan count 15 total_scan -41 last shrinker return val 256
      >          kswapd0-840   [000]   163.384414: mm_shrink_slab_end:   i915_gem_inactive_scan+0x0 0xffff8800037cbc68: unused scan count 15 new scan count 117 total_scan 102 last shrinker return val 0
      >          kswapd0-840   [000]   163.384657: mm_shrink_slab_end:   i915_gem_inactive_scan+0x0 0xffff8800037cbc68: unused scan count 117 new scan count 36 total_scan -81 last shrinker return val 512
      >          kswapd0-840   [000]   163.384880: mm_shrink_slab_end:   i915_gem_inactive_scan+0x0 0xffff8800037cbc68: unused scan count 36 new scan count 111 total_scan 75 last shrinker return val 256
      >          kswapd0-840   [000]   163.385256: mm_shrink_slab_end:   i915_gem_inactive_scan+0x0 0xffff8800037cbc68: unused scan count 111 new scan count 34 total_scan -77 last shrinker return val 768
      >          kswapd0-840   [000]   163.385598: mm_shrink_slab_end:   i915_gem_inactive_scan+0x0 0xffff8800037cbc68: unused scan count 34 new scan count 122 total_scan 88 last shrinker return val 512
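      
      A hedged sketch of the fix (the argument order is an assumption): hand
      total_scan to the tracepoint and print it directly, instead of deriving
      it from the before/after counts at print time:
      
        /*
         * Record total_scan itself; deriving it as new_nr - nr races with
         * other shrinkers updating those counters in parallel.
         */
        trace_mm_shrink_slab_end(shrinker, nid, freed, nr, new_nr, total_scan);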
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Acked-by: Dave Chinner <david@fromorbit.com>
      Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: get rid of __GFP_KMEMCG · 52383431
      Authored by Vladimir Davydov
      Currently, to allocate a page that should be charged to kmemcg (e.g.
      thread_info), we pass the __GFP_KMEMCG flag to the page allocator.  The
      page allocated is then to be freed by free_memcg_kmem_pages.  Apart from
      looking asymmetrical, this also requires intrusion into the general
      allocation path.  So let's introduce separate functions that will
      alloc/free pages charged to kmemcg.
      
      The new functions are called alloc_kmem_pages and free_kmem_pages.  They
      should be used when the caller actually would like to use kmalloc, but
      has to fall back to the page allocator because the allocation is large.
      They only differ from alloc_pages and free_pages in that besides
      allocating or freeing pages they also charge them to the kmem resource
      counter of the current memory cgroup.
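      
      A hedged usage sketch (signatures assumed from the description: the
      allocator mirrors alloc_pages, while the free side takes an address
      like free_pages):
      
        struct page *page = alloc_kmem_pages(GFP_KERNEL, order);
        if (!page)
        	return -ENOMEM;
        
        /* ... use the (memcg-charged) pages ... */
        
        free_kmem_pages((unsigned long)page_address(page), order);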
      
      [sfr@canb.auug.org.au: export kmalloc_order() to modules]
      Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
      Acked-by: Greg Thelen <gthelen@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Acked-by: Michal Hocko <mhocko@suse.cz>
      Cc: Glauber Costa <glommer@gmail.com>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Pekka Enberg <penberg@kernel.org>
      Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • tracing: Add __get_dynamic_array_len() macro for trace events · beba4bb0
      Authored by Steven Rostedt (Red Hat)
      If a trace event uses a dynamic array for something other than a string,
      then there's currently no way for TP_printk() to figure out what size it
      is; a __get_dynamic_array_len() macro is required to know the length.
      
      This also simplifies the __get_bitmask() macro, which needed the same
      information but previously hardcoded it.
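      
      A hedged usage sketch (the event layout and the field name buf are
      illustrative):
      
        TP_printk("len=%u data=%s",
        	  __get_dynamic_array_len(buf),
        	  __print_hex(__get_dynamic_array(buf),
        		      __get_dynamic_array_len(buf)))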
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  17. 02 Jun, 2014 1 commit
  18. 15 May, 2014 1 commit
    • tracing: Add __bitmask() macro to trace events to cpumasks and other bitmasks · 4449bf92
      Authored by Steven Rostedt (Red Hat)
      Being able to show a cpumask of events can be useful as some events
      may affect only some CPUs.  There is no standard way to record the
      cpumask, and converting it to a string at trace time is rather expensive,
      as traces happen in hot paths.  It would be better to record
      the raw event mask and be able to parse it at print time.
      
      The following macros were added for use with the TRACE_EVENT() macro:
      
        __bitmask()
        __assign_bitmask()
        __get_bitmask()
      
      To test this, I added a cpumask to the sched_migrate_task event, which
      then looked like this:
      
      TRACE_EVENT(sched_migrate_task,
      
      	TP_PROTO(struct task_struct *p, int dest_cpu, const struct cpumask *cpus),
      
      	TP_ARGS(p, dest_cpu, cpus),
      
      	TP_STRUCT__entry(
      		__array(	char,	comm,	TASK_COMM_LEN	)
      		__field(	pid_t,	pid			)
      		__field(	int,	prio			)
      		__field(	int,	orig_cpu		)
      		__field(	int,	dest_cpu		)
      		__bitmask(	cpumask, num_possible_cpus()	)
      	),
      
      	TP_fast_assign(
      		memcpy(__entry->comm, p->comm, TASK_COMM_LEN);
      		__entry->pid		= p->pid;
      		__entry->prio		= p->prio;
      		__entry->orig_cpu	= task_cpu(p);
      		__entry->dest_cpu	= dest_cpu;
      		__assign_bitmask(cpumask, cpumask_bits(cpus), num_possible_cpus());
      	),
      
      	TP_printk("comm=%s pid=%d prio=%d orig_cpu=%d dest_cpu=%d cpumask=%s",
      		  __entry->comm, __entry->pid, __entry->prio,
      		  __entry->orig_cpu, __entry->dest_cpu,
      		  __get_bitmask(cpumask))
      );
      
      With the output of:
      
              ksmtuned-3613  [003] d..2   485.220508: sched_migrate_task: comm=ksmtuned pid=3615 prio=120 orig_cpu=3 dest_cpu=2 cpumask=00000000,0000000f
           migration/1-13    [001] d..5   485.221202: sched_migrate_task: comm=ksmtuned pid=3614 prio=120 orig_cpu=1 dest_cpu=0 cpumask=00000000,0000000f
                   awk-3615  [002] d.H5   485.221747: sched_migrate_task: comm=rcu_preempt pid=7 prio=120 orig_cpu=0 dest_cpu=1 cpumask=00000000,000000ff
           migration/2-18    [002] d..5   485.222062: sched_migrate_task: comm=ksmtuned pid=3615 prio=120 orig_cpu=2 dest_cpu=3 cpumask=00000000,0000000f
      
      Link: http://lkml.kernel.org/r/1399377998-14870-6-git-send-email-javi.merino@arm.com
      Link: http://lkml.kernel.org/r/20140506132238.22e136d1@gandalf.local.home
      Suggested-by: Javi Merino <javi.merino@arm.com>
      Tested-by: Javi Merino <javi.merino@arm.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  19. 08 May, 2014 1 commit
  20. 07 May, 2014 5 commits
  21. 23 Apr, 2014 1 commit
  22. 22 Apr, 2014 1 commit
    • ASoC: Remove ASoC level IO tracing · 111c0cf5
      Authored by Lars-Peter Clausen
      The ASoC framework is in the process of migrating all IO operations to regmap.
      regmap has its own more sophisticated tracing infrastructure for IO operations,
      which means that the ASoC level IO tracing becomes redundant, hence this patch
      removes it.  There are still a handful of ASoC drivers left that do not use
      regmap yet, but hopefully the removal of the ASoC IO tracing will be an
      additional incentive to switch to regmap.
      Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
      Signed-off-by: Mark Brown <broonie@linaro.org>
  23. 21 Apr, 2014 2 commits
    • ext4: rename uninitialized extents to unwritten · 556615dc
      Authored by Lukas Czerner
      Currently in ext4 there is quite a mess when it comes to naming
      unwritten extents.  Sometimes we call them uninitialized and sometimes we
      refer to them as unwritten.
      
      The right name for an extent which has been allocated but does not
      contain any written data is _unwritten_.  Other file systems use this
      name consistently; even the buffer head state refers to it as
      unwritten.  We need to fix this confusion in ext4.
      
      This commit changes every reference to an uninitialized extent (meaning
      allocated but unwritten) to unwritten extent. This includes comments,
      function names and variable names.  It even covers abbreviations of the
      word uninitialized (such as uninit) and some misspellings.
      
      This commit does not change any of the code paths at all. This has been
      confirmed by comparing md5sums of the assembly code of each object file
      after all the function names were stripped from it.
      Signed-off-by: Lukas Czerner <lczerner@redhat.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    • ext4: get rid of EXT4_MAP_UNINIT flag · 090f32ee
      Authored by Lukas Czerner
      Currently EXT4_MAP_UNINIT is used in the dioread_nolock case to mark the
      cases where we're using dioread_nolock and we're writing into either an
      unallocated or an unwritten extent, because we need to make sure that
      any DIO write into that inode will wait for the extent conversion.
      
      However, EXT4_MAP_UNINIT is not only an entirely misleading name but is
      also unnecessary, because we can check for EXT4_MAP_UNWRITTEN in the
      dioread_nolock case instead.
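      
      A hedged sketch of the substitution (the surrounding condition is
      illustrative, not the verbatim diff):
      
        if (ext4_should_dioread_nolock(inode) &&
            (map->m_flags & EXT4_MAP_UNWRITTEN))	/* was: EXT4_MAP_UNINIT */
        	ext4_set_io_unwritten_flag(inode, io);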
      
      This commit removes the EXT4_MAP_UNINIT flag.
      Signed-off-by: Lukas Czerner <lczerner@redhat.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>