Commits · 3a150df945b7408c27cad2c01a1638e8b14ac562 · openanolis / cloud-kernel

28 Feb, 2017 · 1 commit

tracing: Fix code comment for ftrace_ops_get_func() · 3a150df9

Committed by Chunyu Hu on Feb 22, 2017

There is no function 'ftrace_ops_recurs_func' existing in the current code,
it was renamed to ftrace_ops_assist_func() in commit c68c0fa2
("ftrace: Have ftrace_ops_get_func() handle RCU and PER_CPU flags too").
Update the comment to the correct function name.

Link: http://lkml.kernel.org/r/1487723366-14463-1-git-send-email-chuhu@redhat.com

Signed-off-by: Chunyu Hu <chuhu@redhat.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

17 Feb, 2017 · 1 commit

tracing: Remove outdated ring buffer comment · 67d04bb2

Committed by Joel Fernandes on Feb 16, 2017

The comment about ring buffer's organization is outdated and the code sits
elsewhere, remove the comment.
Link: http://lkml.kernel.org/r/20170217041058.23904-1-joelaf@google.com

Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Joel Fernandes <joelaf@google.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

15 Feb, 2017 · 8 commits

tracing/probes: Fix a warning message to show correct maximum length · bef5da60

Committed by Masami Hiramatsu on Feb 10, 2017

Since tracing/*probe_events will accept a probe definition
up to 4096 - 2 ('\n' and '\0') bytes, it must show 4094 instead
of 4096 in the warning message.

Note that there is one case in which 4094 bytes can be exceeded. If the
user prepares a 4096-byte null-terminated string and writes it with
count == 4095, it is accepted. However, if the user puts a '\n' after
that, it must be rejected.
So IMHO the warning message should indicate the shorter length,
since that is safer.

Link: http://lkml.kernel.org/r/148673290462.2579.7966778294009665632.stgit@devbox

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

tracing: Fix return value check in trace_benchmark_reg() · 8f0994bb

Committed by Wei Yongjun on Jan 12, 2017

In case of error, the function kthread_run() returns ERR_PTR() and never
returns NULL. The NULL test in the return value check should be replaced
with IS_ERR().
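
A minimal userspace sketch of why the NULL test misses the failure. The helpers err_ptr(), ptr_err(), is_err() and fake_thread_run() are hypothetical stand-ins written for this illustration; they only imitate the error-pointer convention and are not the kernel's own definitions.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: error codes occupy the top 4095 values of the address space. */
#define SKETCH_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static inline void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* Decode the errno value carried by an error pointer. */
static inline long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True if the pointer value lies in the reserved error range. */
static inline int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/* Hypothetical helper that fails by returning an error pointer, never NULL. */
void *fake_thread_run(int should_fail)
{
	static int dummy_task;

	return should_fail ? err_ptr(-ENOMEM) : (void *)&dummy_task;
}

/* Buggy pattern: an error pointer is not NULL, so the failure is missed. */
int start_with_null_check(int should_fail)
{
	void *task = fake_thread_run(should_fail);

	if (!task)
		return -ENOMEM;
	return 0;
}

/* Fixed pattern: test the error range and propagate the encoded errno. */
int start_with_is_err_check(int should_fail)
{
	void *task = fake_thread_run(should_fail);

	if (is_err(task))
		return (int)ptr_err(task);
	return 0;
}
```

With a failing helper, start_with_null_check() reports success while start_with_is_err_check() returns -ENOMEM.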

Link: http://lkml.kernel.org/r/20170112135502.28556-1-weiyj.lk@gmail.com

Cc: stable@vger.kernel.org
Fixes: 81dc9f0e ("tracing: Add tracepoint benchmark tracepoint")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

tracing: Use modern function declaration · eb583cd4

Committed by Arnd Bergmann on Jan 23, 2017

We get a lot of harmless warnings about this header file at W=1 level
because of an unusual function declaration:

kernel/trace/trace.h:766:1: error: 'inline' is not at beginning of declaration [-Werror=old-style-declaration]

This moves the inline statement where it normally belongs, avoiding the
warning.

Link: http://lkml.kernel.org/r/20170123122521.3389010-1-arnd@arndb.de

Fixes: 4046bf02 ("ftrace: Expose ftrace_hash_empty and ftrace_lookup_ip")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

jump_label: Reduce the size of struct static_key · 3821fd35

Committed by Jason Baron on Feb 03, 2017

The static_key->next field goes mostly unused. The field is used for
associating module uses with a static key. Most uses of struct static_key
define a static key in the core kernel and make use of it entirely within
the core kernel, or define the static key in a module and make use of it
only from within that module. In fact, of the ~3,000 static keys defined,
I found only about 5 or so that did not fit this pattern.

Thus, we can remove the static_key->next field entirely and overload
the static_key->entries field. That is, when all the static_key uses
are contained within the same module, static_key->entries continues
to point to those uses. However, if the static_key uses are not contained
within the module where the static_key is defined, then we allocate a
struct static_key_mod, store a pointer to the uses within that
struct static_key_mod, and have the static key point at the static_key_mod.
This does incur some extra memory usage when a static_key is used in a
module that does not define it, but since there are only a handful of such
cases there is a net savings.

In order to identify if the static_key->entries pointer contains a
struct static_key_mod or a struct jump_entry pointer, bit 1 of
static_key->entries is set to 1 if it points to a struct static_key_mod and
is 0 if it points to a struct jump_entry. We were already using bit 0 in a
similar way to store the initial value of the static_key. This does mean
that allocations of struct static_key_mod and that the struct jump_entry
tables need to be at least 4-byte aligned in memory. As far as I can tell
all arches meet these criteria.
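
The tagging scheme described above can be sketched in userspace C as follows. All type and function names here (tagged_key, key_mod, key_set_entries() and so on) are hypothetical and chosen for this illustration; the kernel's actual structures and helpers differ.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins; both types are at least 4-byte aligned. */
struct jump_entry {
	uint32_t code;
	uint32_t target;
	uint32_t key;
};

struct key_mod {
	struct jump_entry *entries;
	struct key_mod *next;
};

#define KEY_TYPE_TRUE	((uintptr_t)1)	/* bit 0: initial value of the key */
#define KEY_LINKED	((uintptr_t)2)	/* bit 1: word points to a key_mod */
#define KEY_TAG_MASK	((uintptr_t)3)

/* One word holds either pointer type plus the two tag bits. */
struct tagged_key {
	uintptr_t word;
};

/* Point the key at a jump_entry table; bit 1 stays clear, bit 0 is kept. */
void key_set_entries(struct tagged_key *k, struct jump_entry *entries)
{
	uintptr_t type = k->word & KEY_TYPE_TRUE;

	k->word = (uintptr_t)entries | type;
}

/* Point the key at a key_mod; bit 1 is set, bit 0 is kept. */
void key_set_mod(struct tagged_key *k, struct key_mod *mod)
{
	uintptr_t type = k->word & KEY_TYPE_TRUE;

	k->word = (uintptr_t)mod | type | KEY_LINKED;
}

/* True if the stored pointer refers to a key_mod. */
bool key_is_linked(const struct tagged_key *k)
{
	return (k->word & KEY_LINKED) != 0;
}

/* Strip both tag bits to recover the stored pointer. */
void *key_pointer(const struct tagged_key *k)
{
	return (void *)(k->word & ~KEY_TAG_MASK);
}
```

The sketch only works because both pointed-to types are at least 4-byte aligned, so the two low bits of a valid pointer are always zero.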

For my .config, the patch increased the text by 778 bytes, but reduced
the data + bss size by 14912, for a net savings of 14,134 bytes.

   text     data     bss      dec    hex  filename
8092427  5016512  790528 13899467 d416cb  vmlinux.pre
8093205  5001600  790528 13885333 d3df95  vmlinux.post

Link: http://lkml.kernel.org/r/1486154544-4321-1-git-send-email-jbaron@akamai.com

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Jason Baron <jbaron@akamai.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

tracing/probe: Show subsystem name in messages · 72576341

Committed by Masami Hiramatsu on Feb 07, 2017

Show "trace_probe:", "trace_kprobe:" and "trace_uprobe:"
headers for each warning/error/info message. This will
help people to notice that kprobe/uprobe events caused
those messages.

Link: http://lkml.kernel.org/r/148646647813.24658.16705315294927615333.stgit@devbox

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

tracing/hwlat: Update old comment about migration · 8e0f1142

Committed by Luiz Capitulino on Feb 13, 2017

The ftrace hwlat does support a cpumask.

Link: http://lkml.kernel.org/r/20170213122517.6e211955@redhat.com

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

timers: Make flags output in the timer_start tracepoint useful · 8a58a34b

Committed by Thomas Gleixner on Feb 10, 2017

The timer flags in the timer_start trace event contain lots of useful
information, but the meaning is not clear in the trace output. Making tools
rely on the bit positions is bad as they might change over time.

Decode the flags in the print out. Tools can retrieve the bits and their
meaning from the trace format file.

Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1702101639290.4036@nanos

Requested-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

tracing: Have traceprobe_probes_write() not access userspace unnecessarily · 1f9b3546

Committed by Steven Rostedt (VMware) on Feb 09, 2017

The code in traceprobe_probes_write() reads up to 4096 bytes from userspace
for each line. If userspace passes in several lines to execute, the code
will do a large read for each line, even though it is highly likely that
the first read from userspace received all of the lines at once.

I changed the logic to do a single read from userspace, and to only read
from userspace again if not all of the read from userspace made it in.

I tested this by adding printk()s and writing files that would test -1, ==,
and +1 the buffer size, to make sure that there are no overflows and that if
a single line is written with +1 the buffer size, it fails properly.

Link: http://lkml.kernel.org/r/20170209180458.5c829ab2@gandalf.local.home

Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

11 Feb, 2017 · 1 commit

tracing: Have COMM event filter key be treated as a string · 4c738413

Committed by Steven Rostedt (VMware) on Feb 08, 2017

The GLOB operation "~" should be able to work with the COMM filter key in
order to trace programs with a glob. For example

  echo 'COMM ~ "systemd*"' > events/syscalls/filter

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

03 Feb, 2017 · 8 commits

ftrace: Have set_graph_function handle multiple functions in one write · e704eff3

Committed by Steven Rostedt (VMware) on Feb 02, 2017

Currently, only one function can be written to set_graph_function and
set_graph_notrace. Only the last function in the list is saved, even
though the other functions are added and then removed.

Change the behavior to be the same as set_ftrace_function so as to allow
multiple functions to be written. If any one fails, none of them will be
added. The addition of the functions is done at the end when the file is
closed.

Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

ftrace: Do not hold references of ftrace_graph_{notrace_}hash out of graph_lock · 649b988b

Committed by Steven Rostedt (VMware) on Feb 02, 2017

The hashes ftrace_graph_hash and ftrace_graph_notrace_hash are modified
while the graph_lock is held. Holding a pointer to them and passing it
along can lead to the use of a stale pointer (fgd->hash). Move assigning the
pointer and its use to within the holding of the lock. Note, it's
rcu_sched protected data, and other instances of referencing them are done
with preemption disabled. But the file manipulation code must be protected by
the lock.

The fgd->hash pointer is set to NULL when the lock is being released.

Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

tracing: Reset parser->buffer to allow multiple "puts" · 0e684b65

Committed by Steven Rostedt (VMware) on Feb 02, 2017

trace_parser_put() simply frees the allocated parser buffer. But it does not
reset the pointer that was freed. This means that if trace_parser_put() is
called on the same parser more than once, it will corrupt the allocation
system. Setting parser->buffer to NULL after free allows it to be called
more than once without any ill effect.
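
The pattern can be sketched in userspace C, with malloc()/free() standing in for the kernel allocator. The names sketch_parser, sketch_parser_get() and sketch_parser_put() are hypothetical and only mirror the idea, not the actual trace_parser code.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical parser state: one heap buffer and its size. */
struct sketch_parser {
	char *buffer;
	size_t size;
};

/* Allocate the buffer; returns 0 on success, -1 on allocation failure. */
int sketch_parser_get(struct sketch_parser *p, size_t size)
{
	p->buffer = malloc(size);
	if (!p->buffer)
		return -1;
	p->size = size;
	return 0;
}

/*
 * Release the buffer and clear the pointer. free(NULL) is a no-op, so a
 * second call on the same parser does nothing instead of double-freeing.
 */
void sketch_parser_put(struct sketch_parser *p)
{
	free(p->buffer);
	p->buffer = NULL;
}
```

Without the assignment to NULL, a second put would pass an already freed pointer to the allocator.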

Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

ftrace: Have set_graph_functions handle write with RDWR · ae98d27a

Committed by Steven Rostedt (VMware) on Feb 02, 2017

Reading set_graph_functions uses the seq functions, which set the
file->private_data pointer to a seq_file descriptor. On writes,
file->private_data is set to the ftrace_graph_data descriptor. But if the
file is opened for RDWR, ftrace_graph_write() will incorrectly use the
file->private_data descriptor instead of the
((struct seq_file *)file->private_data)->private pointer, and this can crash
the kernel.

Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

ftrace: Reset fgd->hash in ftrace_graph_write() · d4ad9a1c

Committed by Steven Rostedt (VMware) on Feb 02, 2017

fgd->hash is saved and then freed, but is never reset to either
ftrace_graph_hash or ftrace_graph_notrace_hash. If multiple writes are
performed, the freed hash could be accessed again.

 # cd /sys/kernel/debug/tracing
 # head -1000 available_filter_functions > /tmp/funcs
 # cat /tmp/funcs > set_graph_function

Causes:

 general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
 Modules linked in:  [...]
 CPU: 2 PID: 1337 Comm: cat Not tainted 4.10.0-rc2-test-00010-g6b052e9 #32
 Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v02.05 05/07/2012
 task: ffff880113a12200 task.stack: ffffc90001940000
 RIP: 0010:free_ftrace_hash+0x7c/0x160
 RSP: 0018:ffffc90001943db0 EFLAGS: 00010246
 RAX: 6b6b6b6b6b6b6b6b RBX: 6b6b6b6b6b6b6b6b RCX: 6b6b6b6b6b6b6b6b
 RDX: 0000000000000002 RSI: 0000000000000001 RDI: ffff8800ce1e1d40
 RBP: ffff8800ce1e1d50 R08: 0000000000000000 R09: 0000000000006400
 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
 R13: ffff8800ce1e1d40 R14: 0000000000004000 R15: 0000000000000001
 FS:  00007f9408a07740(0000) GS:ffff88011e500000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000aee1f0 CR3: 0000000116bb4000 CR4: 00000000001406e0
 Call Trace:
  ? ftrace_graph_write+0x150/0x190
  ? __vfs_write+0x1f6/0x210
  ? __audit_syscall_entry+0x17f/0x200
  ? rw_verify_area+0xdb/0x210
  ? _cond_resched+0x2b/0x50
  ? __sb_start_write+0xb4/0x130
  ? vfs_write+0x1c8/0x330
  ? SyS_write+0x62/0xf0
  ? do_syscall_64+0xa3/0x1b0
  ? entry_SYSCALL64_slow_path+0x25/0x25
 Code: 01 48 85 db 0f 84 92 00 00 00 b8 01 00 00 00 d3 e0 85 c0 7e 3f 83 e8 01 48 8d 6f 10 45 31 e4 4c 8d 34 c5 08 00 00 00 49 8b 45 08 <4a> 8b 34 20 48 85 f6 74 13 48 8b 1e 48 89 ef e8 20 fa ff ff 48
 RIP: free_ftrace_hash+0x7c/0x160 RSP: ffffc90001943db0
 ---[ end trace 999b48216bf4b393 ]---

Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

ftrace: Replace (void *)1 with a meaningful macro name FTRACE_GRAPH_EMPTY · 555fc781

Committed by Steven Rostedt (VMware) on Feb 02, 2017

When the set_graph_function or set_graph_notrace contains no records, a
banner is displayed of either "#### all functions enabled ####" or
"#### all functions disabled ####" respectively. To tell the seq operations
to do this, (void *)1 is passed as a return value. Instead of using a
hardcoded, meaningless value, define it as a macro.

Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

ftrace: Create a slight optimization on searching the ftrace_hash · 2b2c279c

Committed by Steven Rostedt (VMware) on Feb 01, 2017

This is a micro-optimization, but as it has to deal with a fast path of the
function tracer, these optimizations can be noticed.

The ftrace_lookup_ip() returns true if the given ip is found in the hash. If
it's not found or the hash is NULL, it returns false. But there are some cases
where a NULL hash counts as true, and ftrace_hash_empty() is tested before
calling ftrace_lookup_ip() in those cases. But as ftrace_lookup_ip() tests
that first, that adds a few extra unneeded instructions in those cases.

A new static "always_inlined" function is created that does not perform the
hash empty test. This must only be used by callers that do the check first
anyway, as an empty or NULL hash could cause a crash if a lookup is
performed on it.
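
A simplified sketch of the split between a checked lookup and an unchecked fast-path lookup. The types and names (sketch_hash, lookup_ip_unchecked(), ip_is_traced() and so on) are hypothetical, and the fixed-size bucket array is an assumption made to keep the example short; it is not the kernel's ftrace_hash layout.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_NBUCKETS 8

struct sketch_entry {
	unsigned long ip;
	struct sketch_entry *next;
};

struct sketch_hash {
	size_t count;
	struct sketch_entry *buckets[SKETCH_NBUCKETS];
};

/* True for a NULL hash or a hash with no entries. */
static inline bool hash_empty(const struct sketch_hash *h)
{
	return !h || h->count == 0;
}

/* Fast path: the caller must already have ruled out a NULL hash. */
static inline struct sketch_entry *
lookup_ip_unchecked(const struct sketch_hash *h, unsigned long ip)
{
	struct sketch_entry *e;

	for (e = h->buckets[ip % SKETCH_NBUCKETS]; e; e = e->next)
		if (e->ip == ip)
			return e;
	return NULL;
}

/* General entry point: performs the empty test, then the lookup. */
struct sketch_entry *lookup_ip(const struct sketch_hash *h, unsigned long ip)
{
	if (hash_empty(h))
		return NULL;
	return lookup_ip_unchecked(h, ip);
}

/* Caller for which an empty hash means "everything matches". */
bool ip_is_traced(const struct sketch_hash *h, unsigned long ip)
{
	if (hash_empty(h))
		return true;
	/* The empty test was just done, so skip it in the lookup. */
	return lookup_ip_unchecked(h, ip) != NULL;
}

/* Insert an entry at the head of its bucket. */
void hash_add(struct sketch_hash *h, struct sketch_entry *e)
{
	size_t b = e->ip % SKETCH_NBUCKETS;

	e->next = h->buckets[b];
	h->buckets[b] = e;
	h->count++;
}
```

ip_is_traced() shows the case the commit targets: the caller has to test for an empty hash anyway, so repeating the test inside the lookup would be wasted work.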

Also add kernel doc for the ftrace_lookup_ip() main function.

Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

tracing: Add ftrace_hash_key() helper function · 2b0cce0e

Committed by Steven Rostedt (VMware) on Feb 01, 2017

Replace the couple of places that use a small bit of logic to produce the
ftrace function key id with a helper function. No need for duplicate code.

Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

21 Jan, 2017 · 3 commits

ftrace: Convert graph filter to use hash tables · b9b0c831

Committed by Namhyung Kim on Jan 20, 2017

Use ftrace_hash instead of a static array of a fixed size. This is
useful when a graph filter pattern matches to a large number of
functions. Now hash lookup is done with preemption disabled to protect
from the hash being changed/freed.

Link: http://lkml.kernel.org/r/20170120024447.26097-3-namhyung@kernel.org

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

ftrace: Expose ftrace_hash_empty and ftrace_lookup_ip · 4046bf02

Committed by Namhyung Kim on Jan 20, 2017

It will be used when checking graph filter hashes later.

Link: http://lkml.kernel.org/r/20170120024447.26097-2-namhyung@kernel.org

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
[ Moved ftrace_hash dec and functions outside of FUNCTION_GRAPH define ]
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

ftrace: Factor out __ftrace_hash_move() · 3e278c0d

Committed by Namhyung Kim on Jan 20, 2017

The __ftrace_hash_move() allocates a properly sized hash and moves the
entries from the src ftrace_hash. It will be used to set function graph
filters, which have nothing to do with the dyn_ftrace records.

Link: http://lkml.kernel.org/r/20170120024447.26097-1-namhyung@kernel.org

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

19 Jan, 2017 · 2 commits

tracing: Add the constant count for branch tracer · 068f530b

Committed by Steven Rostedt (VMware) on Jan 19, 2017

The unlikely/likely branch profiler now gets called even if the if statement
is a constant (always goes in one direction without a compare). Add a value
to denote this in the likely/unlikely tracer as well.

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

tracing: Show number of constants profiled in likely profiler · 134e6a03

Committed by Steven Rostedt (VMware) on Jan 19, 2017

Now that constants are traced, it is useful to see the number of constants
that are traced in the likely/unlikely profiler in order to know if they
should be ignored or not.

The likely/unlikely will display a number after the "correct" number if a
"constant" count exists.

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

18 Jan, 2017 · 2 commits

tracing: Process constants for (un)likely() profiler · d45ae1f7

Committed by Steven Rostedt (VMware) on Jan 17, 2017

When running the likely/unlikely profiler, one of the results did not look
accurate. It noted that the unlikely() in link_path_walk() was 100%
incorrect. When I added a trace_printk() to see what was happening there, it
became 80% correct! Looking deeper into what was happening, I found that
gcc split that if statement into two paths. One where the if statement
became a constant, the other path a variable. The other path had the if
statement always hit (making the unlikely there, always false), but since
the #define unlikely() has:

  #define unlikely(x) (__builtin_constant_p(x) ? !!(x) : __branch_check__(x, 0))

Where constants are ignored by the branch profiler, the "constant" path
made by the compiler was ignored, even though it was hit 80% of the time.

By just passing the constant value to the __branch_check__() function and
tracing it out of line (as always correct, as likely/unlikely isn't a factor
for constants), then we get back the accurate readings of branches that were
optimized by gcc causing part of the execution to become constant.
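
A rough sketch of the idea, assuming GCC or Clang for __builtin_constant_p(). The branch_stats structure, branch_check() and sketch_unlikely() are hypothetical names for this illustration and do not reproduce the kernel's __branch_check__() machinery; in particular, counting a constant as both "constant" and "correct" is an assumption based on the description above.

```c
#include <stdbool.h>

/* Hypothetical per-branch counters. */
struct branch_stats {
	unsigned long correct;
	unsigned long incorrect;
	unsigned long constant;
};

/* Record one evaluation of an annotated branch and return its condition. */
static inline bool branch_check(struct branch_stats *s, bool cond,
				bool expect, bool is_constant)
{
	if (is_constant) {
		/* A constant cannot be mispredicted: count it as correct. */
		s->constant++;
		s->correct++;
	} else if (cond == expect) {
		s->correct++;
	} else {
		s->incorrect++;
	}
	return cond;
}

/*
 * Constants are no longer skipped: they are passed to the checker with a
 * flag, so the path the compiler turned into a constant is still counted.
 * __builtin_constant_p() does not evaluate its argument, so x is
 * evaluated exactly once.
 */
#define sketch_unlikely(stats, x) \
	branch_check((stats), !!(x), false, __builtin_constant_p(x))
```

Whether an expression counts as a constant depends on the compiler and optimization level, which is the same effect the commit message describes for the split if statement.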

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

uprobe: Find last occurrence of ':' when parsing uprobe PATH:OFFSET · 6496bb72

Committed by Kenny Yu on Jan 13, 2017

Previously, `create_trace_uprobe` found the *first* occurrence
of the ':' character when parsing `PATH:OFFSET` for a uprobe.
However, if the path contains a ':' character, then the function
would parse the path incorrectly. Even worse, if the path does not
exist, the subsequent call to `kern_path()` would set `ret` to
`ENOENT`, leading to very cryptic errno values in user space.

The fix is to find the *last* occurrence of ':'.
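
A small userspace sketch of the difference, using the standard strrchr() to split at the last ':'. The function split_path_offset() is a hypothetical helper for this illustration, not the code in `create_trace_uprobe`.

```c
#include <stddef.h>
#include <string.h>

/*
 * Split "PATH:OFFSET" in place at the last ':' so that colons inside the
 * path are preserved. Returns 0 on success, -1 if there is no ':' at all.
 */
int split_path_offset(char *spec, char **path, char **offset)
{
	char *colon = strrchr(spec, ':');

	if (!colon)
		return -1;

	*colon = '\0';		/* terminate the path part */
	*path = spec;
	*offset = colon + 1;	/* the offset text follows the last ':' */
	return 0;
}
```

Using strchr() here instead would cut "/root/testing/bash:with:colon:0x6" after "/root/testing/bash" and leave "with:colon:0x6" as the offset.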

How to repro: The write fails with "No such file or directory", suggesting
incorrectly that the `uprobe_events` file does not exist.

  $ mkdir testing && cd testing
  $ cp /bin/bash .
  $ cp /bin/bash ./bash:with:colon
  $ echo "p:uprobes/p__root_testing_bash_0x6 /root/testing/bash:0x6" > /sys/kernel/debug/tracing/uprobe_events     # this works
  $ echo "p:uprobes/p__root_testing_bash_with_colon_0x6 /root/testing/bash:with:colon:0x6" >> /sys/kernel/debug/tracing/uprobe_events     # this doesn't
  -bash: echo: write error: No such file or directory

With the patch:

  $ echo "p:uprobes/p__root_testing_bash_0x6 /root/testing/bash:0x6" > /sys/kernel/debug/tracing/uprobe_events     # this still works
  $ echo "p:uprobes/p__root_testing_bash_with_colon_0x6 /root/testing/bash:with:colon:0x6" >> /sys/kernel/debug/tracing/uprobe_events     # this works now too!
  $ cat /sys/kernel/debug/tracing/uprobe_events
  p:uprobes/p__root_testing_bash_0x6 /root/testing/bash:0x0000000000000006
  p:uprobes/p__root_testing_bash_with_colon_0x6 /root/testing/bash:with:colon:0x0000000000000006

Link: http://lkml.kernel.org/r/20170113165834.4081016-1-kennyyu@fb.com

Signed-off-by: Kenny Yu <kennyyu@fb.com>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

02 Jan, 2017 · 2 commits

Linux 4.10-rc2 · 0c744ea4

Committed by Linus Torvalds on Jan 01, 2017

Merge branch 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm · 4759d386

Committed by Linus Torvalds on Jan 01, 2017

Pull DAX updates from Dan Williams:
 "The completion of Jan's DAX work for 4.10.

  As I mentioned in the libnvdimm-for-4.10 pull request, these are some
  final fixes for the DAX dirty-cacheline-tracking invalidation work
  that was merged through the -mm, ext4, and xfs trees in -rc1. These
  patches were prepared prior to the merge window, but we waited for
  4.10-rc1 to have a stable merge base after all the prerequisites were
  merged.

  Quoting Jan on the overall changes in these patches:

     "So I'd like all these 6 patches to go for rc2. The first three
      patches fix invalidation of exceptional DAX entries (a bug which
      is there for a long time) - without these patches data loss can
      occur on power failure even though user called fsync(2). The other
      three patches change locking of DAX faults so that ->iomap_begin()
      is called in a more relaxed locking context and we are safe to
      start a transaction there for ext4"

  These have received a build success notification from the kbuild
  robot, and pass the latest libnvdimm unit tests. There have not been
  any -next releases since -rc1, so they have not appeared there"

* 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  ext4: Simplify DAX fault path
  dax: Call ->iomap_begin without entry lock during dax fault
  dax: Finish fault completely when loading holes
  dax: Avoid page invalidation races and unnecessary radix tree traversals
  mm: Invalidate DAX radix tree entries only if appropriate
  ext2: Return BH_New buffers for zeroed blocks

31 Dec, 2016 · 2 commits

Merge tag 'docs-4.10-rc1-fix' of git://git.lwn.net/linux · 238d1d0f

Committed by Linus Torvalds on Dec 30, 2016

Pull documentation fixes from Jonathan Corbet:
 "Two small fixes:

   - A merge error on my part broke the DocBook build. I've
     requisitioned one of tglx's frozen sharks for appropriate
     disciplinary action and resolved to be more careful about testing
     the DocBook stuff as long as it's still around.

   - Fix an error in unaligned-memory-access.txt"

* tag 'docs-4.10-rc1-fix' of git://git.lwn.net/linux:
  Documentation/unaligned-memory-access.txt: fix incorrect comparison operator
  docs: Fix build failure

Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · f3de082c

Committed by Linus Torvalds on Dec 30, 2016

Pull crypto fix from Herbert Xu:
 "This fixes a boot failure on some platforms when crypto self test is
  enabled along with the new acomp interface"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: testmgr - Use heap buffer for acomp test input

30 Dec, 2016 · 2 commits

mm/filemap: fix parameters to test_bit() · 98473f9f

Committed by Olof Johansson on Dec 29, 2016

 mm/filemap.c: In function 'clear_bit_unlock_is_negative_byte':
  mm/filemap.c:933:9: error: too few arguments to function 'test_bit'
    return test_bit(PG_waiters);
         ^~~~~~~~

Fixes: b91e1302 ('mm: optimize PageWaiters bit use for unlock_page()')
Signed-off-by: Olof Johansson <olof@lixom.net>
Brown-paper-bag-by: Linus Torvalds <dummy@duh.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: optimize PageWaiters bit use for unlock_page() · b91e1302

Committed by Linus Torvalds on Dec 27, 2016

In commit 62906027 ("mm: add PageWaiters indicating tasks are
waiting for a page bit") Nick Piggin made our page locking no longer
unconditionally touch the hashed page waitqueue, which not only helps
performance in general, but is particularly helpful on NUMA machines
where the hashed wait queues can bounce around a lot.

However, the "clear lock bit atomically and then test the waiters bit"
sequence turns out to be much more expensive than it needs to be,
because you get a nasty stall when trying to access the same word that
just got updated atomically.

On architectures where locking is done with LL/SC, this would be trivial
to fix with a new primitive that clears one bit and tests another
atomically, but that ends up not working on x86, where the only atomic
operations that return the result end up being cmpxchg and xadd.  The
atomic bit operations return the old value of the same bit we changed,
not the value of an unrelated bit.

On x86, we could put the lock bit in the high bit of the byte, and use
"xadd" with that bit (where the overflow ends up not touching other
bits), and look at the other bits of the result.  However, an even
simpler model is to just use a regular atomic "and" to clear the lock
bit, and then the sign bit in eflags will indicate the resulting state
of the unrelated bit #7.

So by moving the PageWaiters bit up to bit #7, we can atomically clear
the lock bit and test the waiters bit on x86 too.  And on architectures
with LL/SC (which is all the usual RISC suspects), the particular bit
doesn't matter, so they are fine with this approach too.

This avoids the extra access to the same atomic word, and thus avoids
the costly stall at page unlock time.

The only downside is that the interface ends up being a bit odd and
specialized: clear a bit in a byte, and test the sign bit.  Nick doesn't
love the resulting name of the new primitive, but I'd rather make the
name be descriptive and very clear about the limitation imposed by
trying to work across all relevant architectures than make it be some
generic thing that doesn't make the odd semantics explicit.

So this introduces the new architecture primitive

    clear_bit_unlock_is_negative_byte();

and adds the trivial implementation for x86.  We have a generic
non-optimized fallback (that just does a "clear_bit()"+"test_bit(7)"
combination) which can be overridden by any architecture that can do
better.  According to Nick, Power has the same hiccup x86 has, for
example, but some other architectures may not even care.
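
The semantics of the primitive can be sketched with C11 atomics. This is an illustration only: clear_lock_test_waiters() and the two bit constants are hypothetical names, and a single atomic fetch-and is used here for brevity, which is neither the generic fallback nor the x86 "and" plus sign-flag implementation described above.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define SKETCH_LOCK_BIT		0	/* assumed position of the lock bit */
#define SKETCH_WAITERS_BIT	7	/* sign bit of the byte */

/*
 * Atomically clear the lock bit in a flags byte and report whether the
 * waiters bit (bit 7) is set. Clearing the lock bit does not change
 * bit 7, so the old value returned by the fetch-and is enough.
 */
bool clear_lock_test_waiters(_Atomic unsigned char *flags)
{
	unsigned char lock_mask = (unsigned char)(1u << SKETCH_LOCK_BIT);
	unsigned char old;

	old = atomic_fetch_and_explicit(flags, (unsigned char)~lock_mask,
					memory_order_release);
	return (old & (1u << SKETCH_WAITERS_BIT)) != 0;
}
```

A caller would wake the waiters only when the function returns true, which keeps the unlock path down to one atomic access to the flags byte.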

All these optimizations mean that my page locking stress-test (which is
just executing a lot of small short-lived shell scripts: "make test" in
the git source tree) no longer makes our page locking look horribly bad.
Before all these optimizations, just the unlock_page() costs were just
over 3% of all CPU overhead on "make test".  After this, it's down to
0.66%, so just a quarter of the cost it used to be.

(The difference on NUMA is bigger, but there this micro-optimization is
likely less noticeable, since the big issue on NUMA was not the accesses
to 'struct page', but the waitqueue accesses that were already removed
by Nick's earlier commit).

Acked-by: Nick Piggin <npiggin@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Bob Peterson <rpeterso@redhat.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Andrew Lutomirski <luto@kernel.org>
Cc: Andreas Gruenbacher <agruenba@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

28 Dec, 2016 · 8 commits

Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · 2d706e79

Committed by Linus Torvalds on Dec 27, 2016

Pull crypto fix from Herbert Xu:
 "This fixes a hash corruption bug in the marvell driver"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: marvell - Copy IVDIG before launching partial DMA ahash requests

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 8f18e4d0

Committed by Linus Torvalds on Dec 27, 2016

Pull networking fixes from David Miller:

 1) Various ipvlan fixes from Eric Dumazet and Mahesh Bandewar.

    The most important is to not assume the packet is RX just because
    the destination address matches that of the device. Such an
    assumption causes problems when an interface is put into loopback
    mode.

 2) If we retry when creating a new tc entry (because we dropped the
    RTNL mutex in order to load a module, for example) we end up with
    -EAGAIN and then loop trying to replay the request. But we didn't
    reset some state when looping back to the top like this, and if
    another thread meanwhile inserted the same tc entry we were trying
    to, we re-link it creating an endless loop in the tc chain. Fix from
    Daniel Borkmann.

 3) There are two different WRITE bits in the MDIO address register for
    the stmmac chip, depending upon the chip variant. Due to a bug we
    could set them both, fix from Hock Leong Kweh.

 4) Fix mlx4 bug in XDP_TX handling, from Tariq Toukan.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  net: stmmac: fix incorrect bit set in gmac4 mdio addr register
  r8169: add support for RTL8168 series add-on card.
  net: xdp: remove unused bfp_warn_invalid_xdp_buffer()
  openvswitch: upcall: Fix vlan handling.
  ipv4: Namespaceify tcp_tw_reuse knob
  net: korina: Fix NAPI versus resources freeing
  net, sched: fix soft lockup in tc_classify
  net/mlx4_en: Fix user prio field in XDP forward
  tipc: don't send FIN message from connectionless socket
  ipvlan: fix multicast processing
  ipvlan: fix various issues in ipvlan_process_multicast()

Documentation/unaligned-memory-access.txt: fix incorrect comparison operator · 36f671be

Committed by Cihangir Akturk on Dec 17, 2016

In the actual implementation, the ether_addr_equal() function tests for
equality to 0 when returning. It seems that commit 0d74c4 overlooked changing
this operator to reflect the actual function.

Signed-off-by: Cihangir Akturk <cakturk@gmail.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>

docs: Fix build failure · 66115335

Committed by John Brooks on Dec 23, 2016

The 80211.tmpl DocBook file was removed in commit 819bf593 ("docs-rst:
sphinxify 802.11 documentation"), but the 80211.xml target was re-added to
the Makefile by commit 7ddedebb ("ALSA: doc: ReSTize
writing-an-alsa-driver document"), leading to a failure when building the
documentation:

*** No rule to make target 'Documentation/DocBook/80211.xml', needed by
'Documentation/DocBook/80211.aux.xml'.

cc: stable@vger.kernel.org
Signed-off-by: John Brooks <john@fastquake.com>
Mea-culpa-by: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>

Merge tag 'v4.10-rc1' into docs-next · 54ab6db0

Committed by Jonathan Corbet on Dec 27, 2016

Linux 4.10-rc1

net: stmmac: fix incorrect bit set in gmac4 mdio addr register · 5799fc90

Committed by Kweh, Hock Leong on Dec 28, 2016

Fix the gmac4 mdio write access to use MII_GMAC4_WRITE only, instead of
ORing it together with MII_WRITE.

Signed-off-by: Kweh, Hock Leong <hock.leong.kweh@intel.com>
Acked-By: Joao Pinto <jpinto@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

r8169: add support for RTL8168 series add-on card. · 610c9087

Committed by Chun-Hao Lin on Dec 27, 2016

This chip is the same as RTL8168, but its device id is 0x8161.

Signed-off-by: Chun-Hao Lin <hau@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: xdp: remove unused bfp_warn_invalid_xdp_buffer() · be267277

Committed by Jason Wang on Dec 27, 2016

After commit 73b62bd0 ("virtio-net:
remove the warning before XDP linearizing"), there are no users of
bpf_warn_invalid_xdp_buffer(), so remove it. This is a revert of
commit f23bc46c.

Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
