Commits · 67edab48caeb75d412706f4b9d3107afd1e07623 · OpenHarmony / kernel_linux

13 Jun, 2017 · 2 commits

posix-timers: Handle relative posix-timers correctly · 67edab48

Committed by Thomas Gleixner on Jun 12, 2017

The recent rework of the posix timer internals broke the magic posix
mechanism, which requires that relative timers are not affected by
modifications of the underlying clock. That means relative CLOCK_REALTIME
timers cannot use CLOCK_REALTIME, because that can be set and adjusted. The
underlying hrtimer switches the clock for these timers to CLOCK_MONOTONIC.

That still works, but reading the remaining time of such a timer has been
broken in the rework. The old code used the hrtimer internals directly and
avoided the posix clock callbacks. Now common_timer_get() uses the
underlying kclock->timer_get() callback, which is still CLOCK_REALTIME
based. So the remaining time of such a timer is calculated against the
wrong time base.

Handle it by switching the k_itimer->kclock pointer according to the
resulting hrtimer mode. k_itimer->it_clock still contains CLOCK_REALTIME
because the timer might be set with ABSTIME later and then it needs to
switch back to the realtime posix clock implementation.

Fixes: eae1c4ae ("posix-timers: Make use of cancel/arm callbacks")
Reported-by: Andrei Vagin <avagin@virtuozzo.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Link: http://lkml.kernel.org/r/20170609201156.GB21491@outlook.office365.com

67edab48
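
The clock switch described in this commit can be modelled in a few lines. This is a minimal user-space sketch, not the kernel code: the enum values, `struct timer_model`, `timer_model_set()` and `timer_model_remaining()` are hypothetical names invented for illustration. It shows the stored clock id staying on realtime while the base used for arming and for reading the remaining time follows the timer mode.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the clock ids; not the kernel's definitions. */
enum model_clock { MODEL_REALTIME, MODEL_MONOTONIC };

struct timer_model {
    enum model_clock it_clock; /* clock the timer was created on */
    enum model_clock base;     /* clock used to arm and to read the timer */
};

/*
 * Pick the base for a (re)armed timer. A relative timer on the realtime
 * clock moves to the monotonic base so that setting the wall clock does
 * not shift it. An absolute timer stays on the clock it was created on.
 */
void timer_model_set(struct timer_model *t, bool abstime)
{
    if (t->it_clock == MODEL_REALTIME && !abstime)
        t->base = MODEL_MONOTONIC;
    else
        t->base = t->it_clock;
}

/* The remaining time has to be computed against the base the timer runs on. */
long long timer_model_remaining(const struct timer_model *t,
                                long long expires,
                                long long now_realtime,
                                long long now_monotonic)
{
    long long now = (t->base == MODEL_MONOTONIC) ? now_monotonic
                                                  : now_realtime;
    return expires - now;
}
```

Because `it_clock` is never overwritten, a later absolute arming of the same timer switches the base back to realtime.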

posix-timers: Zero out oldval itimerspec · 5c7a3a3d

Committed by Thomas Gleixner on Jun 12, 2017

The recent posix timer rework moved the clearing of the itimerspec to the
real syscall implementation, but forgot that the kclock->timer_get() is
used by timer_settime() as well. That results in an uninitialized variable
and bogus values returned to user space.

Add the missing memset to timer_settime().

Fixes: eabdec04 ("posix-timers: Zero settings value in common code")
Reported-by: Andrei Vagin <avagin@virtuozzo.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Link: http://lkml.kernel.org/r/20170609201156.GB21491@outlook.office365.com

5c7a3a3d
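
The effect of the missing memset can be illustrated with a small model. The structs and the `model_timer_get()` / `model_timer_set()` functions below are hypothetical simplifications, not the kernel's types or signatures. The getter writes nothing for a disarmed timer, so the set path has to clear the old-value buffer itself.

```c
#include <string.h>

/* Hypothetical, simplified stand-in for struct itimerspec. */
struct setting_model {
    long long interval_ns;
    long long value_ns;
};

struct timer_state_model {
    int armed;
    long long interval_ns;
    long long remaining_ns;
};

/*
 * Getter that relies on the caller having cleared the output buffer:
 * for a disarmed timer it returns without writing anything.
 */
void model_timer_get(const struct timer_state_model *t,
                     struct setting_model *out)
{
    if (!t->armed)
        return;
    out->interval_ns = t->interval_ns;
    out->value_ns = t->remaining_ns;
}

/* Set path: report the old setting, then install the new one. */
void model_timer_set(struct timer_state_model *t,
                     const struct setting_model *new_setting,
                     struct setting_model *old_setting)
{
    if (old_setting) {
        /* Without this memset a disarmed timer would leak stale bytes. */
        memset(old_setting, 0, sizeof(*old_setting));
        model_timer_get(t, old_setting);
    }
    t->interval_ns = new_setting->interval_ns;
    t->remaining_ns = new_setting->value_ns;
    t->armed = new_setting->value_ns != 0;
}
```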

12 Jun, 2017 · 1 commit

posix-timers: Fix inverted SIGEV_NONE logic in common_timer_get() · c6503be5

Committed by Thomas Gleixner on Jun 12, 2017

The refactoring of the posix-timer core to allow better code sharing
introduced inverted logic vs. SIGEV_NONE timers in common_timer_get().

That causes hrtimer_forward() to be called on active timers, which
rightfully triggers the warning in hrtimer_forward().

Make sig_none what it says: signal mode == SIGEV_NONE.

Fixes: 91d57bae ("posix-timers: Make use of forward/remaining callbacks")
Reported-by: Ye Xiaolong <xiaolong.ye@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: John Stultz <john.stultz@linaro.org>
Link: http://lkml.kernel.org/r/20170609104457.GA39907@inn.lkp.intel.com

c6503be5
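
A sketch of the corrected predicate, under stated assumptions: the `MODEL_SIGEV_*` constants are local stand-ins whose values are assumed to match the Linux definitions, and `model_forward_on_get()` is a hypothetical simplification of the decision made when a timer is read. It is not the code from common_timer_get().

```c
#include <stdbool.h>

/* Local stand-ins; the numeric values are assumptions for this sketch. */
#define MODEL_SIGEV_SIGNAL    0
#define MODEL_SIGEV_NONE      1
#define MODEL_SIGEV_THREAD_ID 4

/* sig_none is true exactly when the notification mode is SIGEV_NONE. */
bool model_sig_none(int sigev_notify)
{
    return (sigev_notify & ~MODEL_SIGEV_THREAD_ID) == MODEL_SIGEV_NONE;
}

/*
 * Assumed rule for this sketch: an interval timer is forwarded on read
 * only if a requeue is pending or no signal will ever rearm it.
 */
bool model_forward_on_get(int sigev_notify, bool has_interval,
                          bool requeue_pending)
{
    return has_interval &&
           (requeue_pending || model_sig_none(sigev_notify));
}
```

With the comparison inverted, every signal-delivering interval timer would take the forward path even while it is still queued, which is the situation the warning guards against.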

04 Jun, 2017 · 26 commits

alarmtimer: Switch over to generic set/get/rearm routine · f2c45807

Committed by Thomas Gleixner on May 30, 2017

All required callbacks are in place. Switch the alarm timer based posix
interval timer callbacks to the common implementation and remove the
incorrect private implementation.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: John Stultz <john.stultz@linaro.org>
Link: http://lkml.kernel.org/r/20170530211657.825471962@linutronix.de

f2c45807

alarmtimer: Implement arm callback · b3bf6f36

Committed by Thomas Gleixner on May 30, 2017

Preparatory change to utilize the common posix timer mechanisms.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: John Stultz <john.stultz@linaro.org>
Link: http://lkml.kernel.org/r/20170530211657.747567162@linutronix.de

b3bf6f36

alarmtimer: Implement try_to_cancel callback · e344c9e7

Committed by Thomas Gleixner on May 30, 2017

Preparatory change to utilize the common posix timer mechanisms.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: John Stultz <john.stultz@linaro.org>
Link: http://lkml.kernel.org/r/20170530211657.670026824@linutronix.de

e344c9e7

alarmtimer: Implement remaining callback · d653d845

Committed by Thomas Gleixner on May 30, 2017

Preparatory change to utilize the common posix timer mechanisms.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: John Stultz <john.stultz@linaro.org>
Link: http://lkml.kernel.org/r/20170530211657.592676753@linutronix.de

d653d845

alarmtimer: Implement forward callback · e7561f16

Committed by Thomas Gleixner on May 30, 2017

Preparatory change to utilize the common posix timer mechanisms.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: John Stultz <john.stultz@linaro.org>
Link: http://lkml.kernel.org/r/20170530211657.513694229@linutronix.de

e7561f16

alarmtimer: Implement timer_rearm() callback · b3db80f7

Committed by Thomas Gleixner on May 30, 2017

Preparatory change to utilize the common posix timer mechanisms.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: John Stultz <john.stultz@linaro.org>
Link: http://lkml.kernel.org/r/20170530211657.434598989@linutronix.de

b3db80f7

posix-timers: Make use of cancel/arm callbacks · eae1c4ae

Committed by Thomas Gleixner on May 30, 2017

Replace the hrtimer calls by calls to the new try_to_cancel()/arm() kclock
callbacks and move the hrtimer specific implementation into the
corresponding callback functions.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: John Stultz <john.stultz@linaro.org>
Link: http://lkml.kernel.org/r/20170530211657.355396667@linutronix.de

eae1c4ae

posix-timers: Add cancel/arm callbacks · 525b8ed9

Committed by Thomas Gleixner on May 30, 2017

Add timer_try_to_cancel() and timer_arm() callbacks to kclock which allow
common_timer_set() to be used by both hrtimer and alarmtimer based
clocks.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: John Stultz <john.stultz@linaro.org>
Link: http://lkml.kernel.org/r/20170530211657.278022962@linutronix.de

525b8ed9
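
The callback idea can be sketched as a table of function pointers that a common set routine dispatches through. Everything below is a hypothetical model: `struct model_kclock`, the two backends and `model_common_timer_set()` are invented names with simplified signatures, not the kernel's definitions.

```c
#include <stddef.h>

struct model_timer;

/* Hypothetical callback table, loosely modelled on the kclock idea. */
struct model_kclock {
    int  (*timer_try_to_cancel)(struct model_timer *t);
    void (*timer_arm)(struct model_timer *t, long long expires);
};

struct model_timer {
    const struct model_kclock *kclock; /* backend chosen at creation */
    long long expires;
    int armed; /* 0 = idle, 1 = armed by backend A, 2 = by backend B */
};

/* Backend A: stands in for an hrtimer based clock. */
int model_hr_try_to_cancel(struct model_timer *t)
{
    t->armed = 0;
    return 0;
}

void model_hr_arm(struct model_timer *t, long long expires)
{
    t->expires = expires;
    t->armed = 1;
}

/* Backend B: stands in for an alarmtimer based clock. */
int model_alarm_try_to_cancel(struct model_timer *t)
{
    t->armed = 0;
    return 0;
}

void model_alarm_arm(struct model_timer *t, long long expires)
{
    t->expires = expires;
    t->armed = 2;
}

const struct model_kclock model_hr_clock = {
    .timer_try_to_cancel = model_hr_try_to_cancel,
    .timer_arm           = model_hr_arm,
};

const struct model_kclock model_alarm_clock = {
    .timer_try_to_cancel = model_alarm_try_to_cancel,
    .timer_arm           = model_alarm_arm,
};

/* Common set path: identical for both backends. */
int model_common_timer_set(struct model_timer *t, long long expires)
{
    const struct model_kclock *kc = t->kclock;

    if (kc->timer_try_to_cancel(t) < 0)
        return -1; /* cancel failed; a real caller would retry */
    if (expires != 0)
        kc->timer_arm(t, expires);
    return 0;
}
```

The common routine never names a backend; adding a third clock type only means adding another table.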

posix-timers: Zero settings value in common code · eabdec04

Committed by Thomas Gleixner on May 30, 2017

Zero out the settings struct in the common code so the callbacks do not
have to do it themselves.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: John Stultz <john.stultz@linaro.org>
Link: http://lkml.kernel.org/r/20170530211657.200870713@linutronix.de

eabdec04

posix-timers: Make use of forward/remaining callbacks · 91d57bae

Committed by Thomas Gleixner on May 30, 2017

Replace the hrtimer calls by calls to the new forward/remaining kclock
callbacks and move the hrtimer specific implementation into the
corresponding callback functions.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: John Stultz <john.stultz@linaro.org>
Link: http://lkml.kernel.org/r/20170530211657.121437232@linutronix.de

91d57bae

posix-timers: Add forward/remaining callbacks · 63841b2a

Committed by Thomas Gleixner on May 30, 2017

Add two callbacks to kclock which allow using common_timer_get() for both
hrtimer and alarm timer based clocks.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: John Stultz <john.stultz@linaro.org>
Link: http://lkml.kernel.org/r/20170530211657.044915536@linutronix.de

63841b2a

posix-timers: Add active flag to k_itimer · 21e55c1f

Committed by Thomas Gleixner on May 30, 2017

Keep track of the activation state of posix timers. This is a preparatory
change for making common_timer_get() usable by both hrtimer and alarm timer
implementations.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: John Stultz <john.stultz@linaro.org>
Link: http://lkml.kernel.org/r/20170530211656.967783982@linutronix.de

21e55c1f

posix-timers: Use timer_rearm() callback in posixtimer_rearm() · f37fb0aa

Committed by Thomas Gleixner on May 30, 2017

Use the new timer_rearm() callback to replace the conditional hardcoded
calls into the hrtimer and cpu timer code.

This later allows bringing the same logic to alarmtimers.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: John Stultz <john.stultz@linaro.org>
Link: http://lkml.kernel.org/r/20170530211656.889661919@linutronix.de

f37fb0aa

posix-timers: Rename do_schedule_next_timer · 96fe3b07

Committed by Thomas Gleixner on May 30, 2017

That function is a misnomer. Rename it with a proper prefix to
posixtimer_rearm().

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: John Stultz <john.stultz@linaro.org>
Link: http://lkml.kernel.org/r/20170530211656.811362578@linutronix.de

96fe3b07

posix-timers: Add timer_rearm() callback · 30802945

Committed by Thomas Gleixner on May 30, 2017

Add a timer_rearm() callback which is used to make the rescheduling of
posix interval timers independent of the underlying clock implementation.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: John Stultz <john.stultz@linaro.org>
Link: http://lkml.kernel.org/r/20170530211656.732632167@linutronix.de

30802945

posix-timers: Store k_clock pointer in k_itimer · d97bb75d

Committed by Thomas Gleixner on May 30, 2017

Having the k_clock pointer in the k_itimer struct avoids the lookup in
several code paths and makes the next steps of unification of the hrtimer
and alarmtimer based posix timers simpler.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: John Stultz <john.stultz@linaro.org>
Link: http://lkml.kernel.org/r/20170530211656.641222072@linutronix.de

d97bb75d

posix-timers: Move interval out of the union · 80105cd0

Committed by Thomas Gleixner on May 30, 2017

Preparatory patch to unify the alarm timer and hrtimer based posix interval
timer handling.

The interval is used as a criterion for rearming decisions so moving it out
of the clock specific data structures allows later unification.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: John Stultz <john.stultz@linaro.org>
Link: http://lkml.kernel.org/r/20170530211656.563922908@linutronix.de

80105cd0

posix-timers: Unify overrun/requeue_pending handling · af888d67

Committed by Thomas Gleixner on May 30, 2017

hrtimer based posix-timers and posix-cpu-timers handle the update of the
rearming and overflow related status fields differently.

Move that update to the common rearming code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: John Stultz <john.stultz@linaro.org>
Link: http://lkml.kernel.org/r/20170530211656.484936964@linutronix.de

af888d67

posix-timers: Move posix-timer internals to core · bab0aae9

Committed by Thomas Gleixner on May 30, 2017

None of these declarations is required outside of kernel/time. Move them to
an internal header.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170530211656.394803853@linutronix.de

bab0aae9

posix-timers: Avoid gazillions of forward declarations · 6631fa12

Committed by Thomas Gleixner on May 30, 2017

Move it below the actual implementations as there are new callbacks coming
which would require even more forward declarations.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: John Stultz <john.stultz@linaro.org>
Link: http://lkml.kernel.org/r/20170530211656.238209952@linutronix.de

6631fa12

posix-clocks: Remove interval timer facility and mmap/fasync callbacks · 3a06c7ac

Committed by Thomas Gleixner on May 30, 2017

The only user of this facility is ptp_clock, which does not implement any of
those functions.

Remove them to prevent accidental users. Especially the interval timer
interfaces are now more or less impossible to implement because the
necessary infrastructure has been confined to the core code. Aside from that
it's really complex to make these callbacks implemented according to spec
as the alarm timer implementation demonstrates. If at all then a nanosleep
callback might be a reasonable extension. For now keep just what ptp_clock
needs.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: John Stultz <john.stultz@linaro.org>
Link: http://lkml.kernel.org/r/20170530211656.145036286@linutronix.de

3a06c7ac

posix-timers: Remove unused export of posix_timer_event() · a81129e5

Committed by Thomas Gleixner on May 30, 2017

Since the removal of the mmtimer driver the export is no longer needed.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: John Stultz <john.stultz@linaro.org>
Link: http://lkml.kernel.org/r/20170530211656.052744418@linutronix.de

a81129e5

alarmtimer: Remove pointless config conditional · 18c700c4

Committed by Thomas Gleixner on May 30, 2017

Having an IS_ENABLED(CONFIG_POSIX_TIMERS) inside of a
#ifdef CONFIG_POSIX_TIMERS section is pointless.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: John Stultz <john.stultz@linaro.org>
Link: http://lkml.kernel.org/r/20170530211655.975218056@linutronix.de

18c700c4

alarmtimer: Rate limit periodic intervals · ff86bf0c

Committed by Thomas Gleixner on May 30, 2017

The alarmtimer code has another source of potentially rearming itself too
fast. Interval timers with a very small interval have a similar CPU hog
effect as the previously fixed overflow issue.

The reason is that alarmtimers do not implement the normal protection
against this kind of problem which the other posix timers use:

  timer expires -> queue signal -> deliver signal -> rearm timer

This scheme brings the rearming under scheduler control and prevents
permanently firing timers which hog the CPU.

Bringing this scheme to the alarm timer code is a major overhaul because it
lacks all the necessary mechanisms completely.

So for a quick fix limit the interval to one jiffy. This is not
problematic in practice as alarmtimers are usually backed by an RTC for
suspend which has 1 second resolution. It could therefore be argued that
the resolution of this clock should be set to 1 second in general, but
that's outside the scope of this fix.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Kostya Serebryany <kcc@google.com>
Cc: syzkaller <syzkaller@googlegroups.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20170530211655.896767100@linutronix.de

ff86bf0c
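
The quick fix amounts to a lower bound on the interval. The following is a minimal sketch under assumptions: `MODEL_HZ` is an arbitrary example tick rate, `model_rate_limit_interval()` is a hypothetical helper, and leaving a zero interval (a one-shot timer) untouched is a choice made for this sketch rather than something the commit message states.

```c
#include <stdint.h>

#define MODEL_NSEC_PER_SEC 1000000000LL
#define MODEL_HZ           250 /* assumed tick rate; configuration dependent */
#define MODEL_TICK_NSEC    (MODEL_NSEC_PER_SEC / MODEL_HZ)

/*
 * Clamp a periodic interval to at least one tick so that a tiny interval
 * cannot make the timer fire back to back. Zero means "no interval" in
 * this model and is passed through unchanged.
 */
int64_t model_rate_limit_interval(int64_t interval_ns)
{
    if (interval_ns != 0 && interval_ns < MODEL_TICK_NSEC)
        return MODEL_TICK_NSEC;
    return interval_ns;
}
```

With the assumed 250 Hz tick, a requested 1 ns interval becomes 4 ms, while anything at or above one tick is left alone.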

alarmtimer: Prevent overflow of relative timers · f4781e76

Committed by Thomas Gleixner on May 30, 2017

Andrey reported an alarmtimer related RCU stall while fuzzing the kernel with
syzkaller.

The reason for this is an overflow in ktime_add() which brings the
resulting time into negative space and causes immediate expiry of the
timer. The following rearm with a small interval does not bring the timer
back into positive space due to the same issue.

This results in a permanently firing alarmtimer which hogs the CPU.

Use ktime_add_safe() instead which detects the overflow and clamps the
result to KTIME_SEC_MAX.

Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Kostya Serebryany <kcc@google.com>
Cc: syzkaller <syzkaller@googlegroups.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20170530211655.802921648@linutronix.de

f4781e76
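
The overflow and its clamped counterpart can be shown with plain 64-bit nanosecond values. This is a user-space sketch, not the kernel's ktime_add_safe(): `model_ktime_add_safe()` and `MODEL_KTIME_CLAMP` are hypothetical names, the clamp value is assumed to be the largest whole number of seconds that fits in a signed 64-bit nanosecond count, and the helper assumes both inputs are non-negative.

```c
#include <stdint.h>

#define MODEL_NSEC_PER_SEC 1000000000LL

/* Largest whole-second value representable in signed 64-bit nanoseconds. */
#define MODEL_KTIME_CLAMP \
    ((INT64_MAX / MODEL_NSEC_PER_SEC) * MODEL_NSEC_PER_SEC)

/*
 * Add two non-negative nanosecond values. If the mathematical sum would
 * not fit, return the clamp value instead of wrapping into negative
 * space. The check runs before the addition, so no signed overflow
 * ever happens.
 */
int64_t model_ktime_add_safe(int64_t lhs, int64_t rhs)
{
    if (lhs > INT64_MAX - rhs)
        return MODEL_KTIME_CLAMP;
    return lhs + rhs;
}
```

A wrapped sum would be negative and therefore already expired; the clamped result stays far in the future, so the timer does not fire immediately.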

posix-timers: Move the do_schedule_next_timer declaration · 31ea70e0

Committed by Christoph Hellwig on Jun 03, 2017

Having it in asm-generic/siginfo.h doesn't make any sense as it is in no way
architecture specific.  Move it to posix-timers.h instead.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-arch@vger.kernel.org
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-ia64@vger.kernel.org
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: sparclinux@vger.kernel.org
Cc: "David S. Miller" <davem@davemloft.net>
Link: http://lkml.kernel.org/r/20170603190102.28866-4-hch@lst.de

31ea70e0

27 May, 2017 · 5 commits

alarmtimer: Fix posix-timer constification fallout · b6b3b80f

Committed by Thomas Gleixner on May 27, 2017

Some freezer related variables are only used when either CONFIG_POSIX_TIMERS
or CONFIG_RTC_CLASS are enabled. Hide them when both are off.

Fixes: d3ba5a9a ("posix-timers: Make posix_clocks immutable")
Reported-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Christoph Hellwig <hch@lst.de>

b6b3b80f

posix-timers: Make posix_clocks immutable · d3ba5a9a

Committed by Christoph Hellwig on May 26, 2017

There are no more modular users providing a posix clock. The register
function is now pointless so the posix clock array can be initialized
statically at compile time and the array including the various k_clock
structs can be marked 'const'.

Inspired by changes in the Grsecurity patch set, but done proper.

[ tglx: Massaged changelog and fixed the POSIX_TIMER=n case ]

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Mike Travis <mike.travis@hpe.com>
Cc: Dimitri Sivanich <sivanich@hpe.com>
Link: http://lkml.kernel.org/r/20170526090311.3377-3-hch@lst.de

d3ba5a9a
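
A statically initialized, immutable clock table of the kind described here might look like the following. All names are hypothetical (`model_clock_ops`, `model_posix_clocks`, `model_clockid_to_ops()`); the sketch only illustrates the pattern of designated initializers plus `const` on both the array and the structs it points to.

```c
#include <stddef.h>

enum model_clockid {
    MODEL_CLK_REALTIME,
    MODEL_CLK_MONOTONIC,
    MODEL_CLK_MAX
};

/* Hypothetical per-clock operations. */
struct model_clock_ops {
    const char *name;
    long long (*resolution_ns)(void);
};

long long model_res_realtime(void)  { return 1; }
long long model_res_monotonic(void) { return 1; }

static const struct model_clock_ops model_clock_realtime = {
    .name          = "realtime",
    .resolution_ns = model_res_realtime,
};

static const struct model_clock_ops model_clock_monotonic = {
    .name          = "monotonic",
    .resolution_ns = model_res_monotonic,
};

/*
 * Filled in at compile time. Both the pointers and the structs they
 * point to are const, so there is nothing left to register at run time.
 */
static const struct model_clock_ops * const model_posix_clocks[MODEL_CLK_MAX] = {
    [MODEL_CLK_REALTIME]  = &model_clock_realtime,
    [MODEL_CLK_MONOTONIC] = &model_clock_monotonic,
};

/* Bounds-checked lookup; returns NULL for an unknown id. */
const struct model_clock_ops *model_clockid_to_ops(int id)
{
    if (id < 0 || id >= MODEL_CLK_MAX)
        return NULL;
    return model_posix_clocks[id];
}
```

Since the table is `const`, the compiler can place it in read-only data, and any later attempt to overwrite an entry is a compile-time error.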

kprobes/x86: Fix to set RWX bits correctly before releasing trampoline · c93f5cf5

Committed by Masami Hiramatsu on May 25, 2017

Fix kprobes to set (recover) the RWX bits correctly on the trampoline
buffer before releasing it. Releasing a read-only page to
module_memfree() crashes the kernel.

Without this fix, if a kprobes user registers a bunch of kprobes
in a function body (since kprobes on function entry usually
use ftrace) and unregisters them, the kernel hits a BUG and crashes.

Link: http://lkml.kernel.org/r/149570868652.3518.14120169373590420503.stgit@devbox
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Fixes: d0381c81 ("kprobes/x86: Set kprobes pages read-only")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

c93f5cf5

ftrace: Fix memory leak in ftrace_graph_release() · f9797c2f

Committed by Luis Henriques on May 25, 2017

ftrace_hash is being kfree'ed in ftrace_graph_release(), however the
->buckets field is not.  This results in a memory leak that is easily
captured by kmemleak:

unreferenced object 0xffff880038afe000 (size 8192):
  comm "trace-cmd", pid 238, jiffies 4294916898 (age 9.736s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff815f561e>] kmemleak_alloc+0x4e/0xb0
    [<ffffffff8113964d>] __kmalloc+0x12d/0x1a0
    [<ffffffff810bf6d1>] alloc_ftrace_hash+0x51/0x80
    [<ffffffff810c0523>] __ftrace_graph_open.isra.39.constprop.46+0xa3/0x100
    [<ffffffff810c05e8>] ftrace_graph_open+0x68/0xa0
    [<ffffffff8114003d>] do_dentry_open.isra.1+0x1bd/0x2d0
    [<ffffffff81140df7>] vfs_open+0x47/0x60
    [<ffffffff81150f95>] path_openat+0x2a5/0x1020
    [<ffffffff81152d6a>] do_filp_open+0x8a/0xf0
    [<ffffffff811411df>] do_sys_open+0x12f/0x200
    [<ffffffff811412ce>] SyS_open+0x1e/0x20
    [<ffffffff815fa6e0>] entry_SYSCALL_64_fastpath+0x13/0x94
    [<ffffffffffffffff>] 0xffffffffffffffff

Link: http://lkml.kernel.org/r/20170525152038.7661-1-lhenriques@suse.com

Cc: stable@vger.kernel.org
Fixes: b9b0c831 ("ftrace: Convert graph filter to use hash tables")
Signed-off-by: Luis Henriques <lhenriques@suse.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

f9797c2f
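
The leak pattern, a container freed without its separately allocated member, can be reproduced in a small model. The names below (`model_hash`, `model_alloc()`, `model_live_allocs`) are hypothetical, and the allocation counter exists only so the sketch can show that both allocations are released.

```c
#include <stdlib.h>

int model_live_allocs; /* outstanding allocations, for demonstration only */

void *model_alloc(size_t n)
{
    void *p = calloc(1, n);
    if (p)
        model_live_allocs++;
    return p;
}

void model_free(void *p)
{
    if (p) {
        model_live_allocs--;
        free(p);
    }
}

/* A hash header with a separately allocated bucket array. */
struct model_hash {
    size_t nbuckets;
    void **buckets;
};

struct model_hash *model_hash_alloc(size_t nbuckets)
{
    struct model_hash *h = model_alloc(sizeof(*h));

    if (!h)
        return NULL;
    h->buckets = model_alloc(nbuckets * sizeof(*h->buckets));
    if (!h->buckets) {
        model_free(h);
        return NULL;
    }
    h->nbuckets = nbuckets;
    return h;
}

void model_hash_free(struct model_hash *h)
{
    if (!h)
        return;
    model_free(h->buckets); /* the step a leaky release path skips */
    model_free(h);
}
```

Freeing only `h` would leave `model_live_allocs` at 1 after every open/release cycle, which corresponds to the unreferenced object kmemleak reports.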

livepatch: Make livepatch dependent on !TRIM_UNUSED_KSYMS · 5720acf4

Committed by Miroslav Benes on May 26, 2017

If TRIM_UNUSED_KSYMS is enabled, all unneeded exported symbols are made
unexported. Two-pass build of the kernel is done to find out which
symbols are needed based on a configuration. This effectively
complicates things for out-of-tree modules.

Livepatch exports functions to (un)register and enable/disable a live
patch. The only in-tree module which uses these functions is a sample in
samples/livepatch/. If the sample is disabled, the functions are
trimmed and out-of-tree live patches cannot be built.

Note that live patches are intended to be built out-of-tree.

Suggested-by: Michal Marek <mmarek@suse.com>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Jessica Yu <jeyu@redhat.com>
Signed-off-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

5720acf4

26 May, 2017 · 3 commits

bpf: fix wrong exposure of map_flags into fdinfo for lpm · a316338c

Committed by Daniel Borkmann on May 25, 2017

trie_alloc() always needs to have BPF_F_NO_PREALLOC passed in via
attr->map_flags, since it does not support preallocation yet. We
check the flag, but we never copy the flag into trie->map.map_flags,
which is later on exposed into fdinfo and used by loaders such as
iproute2. The latter uses this in bpf_map_selfcheck_pinned() to test
whether a pinned map has the same spec as the one from the BPF obj
file and if not, bails out, which is currently the case for lpm
since it exposes always 0 as flags.

Also copy over flags in array_map_alloc() and stack_map_alloc().
They always have to be 0 right now, but we should make sure we do not
forget to copy them over at a later point in time when we add actual
flags for them to use.

Fixes: b95a5c4d ("bpf: add a longest prefix match trie map implementation")
Reported-by: Jarno Rajahalme <jarno@covalent.io>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

a316338c

bpf: properly reset caller saved regs after helper call and ld_abs/ind · a9789ef9

Committed by Daniel Borkmann on May 25, 2017

Currently, after performing helper calls, we clear all caller saved
registers, that is r0 - r5 and fill r0 depending on struct bpf_func_proto
specification. The way we reset these regs can affect pruning decisions
in later paths, since we only reset register's imm to 0 and type to
NOT_INIT. However, we leave out clearing of other variables such as id,
min_value, max_value, etc, which can later on lead to pruning mismatches
due to stale data.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

a9789ef9

bpf: fix incorrect pruning decision when alignment must be tracked · 1ad2f583

Committed by Daniel Borkmann on May 25, 2017

Currently, when we enforce alignment tracking on direct packet access,
the verifier lets the following program pass despite doing a packet
write with unaligned access:

  0: (61) r2 = *(u32 *)(r1 +76)
  1: (61) r3 = *(u32 *)(r1 +80)
  2: (61) r7 = *(u32 *)(r1 +8)
  3: (bf) r0 = r2
  4: (07) r0 += 14
  5: (25) if r7 > 0x1 goto pc+4
   R0=pkt(id=0,off=14,r=0) R1=ctx R2=pkt(id=0,off=0,r=0)
   R3=pkt_end R7=inv,min_value=0,max_value=1 R10=fp
  6: (2d) if r0 > r3 goto pc+1
   R0=pkt(id=0,off=14,r=14) R1=ctx R2=pkt(id=0,off=0,r=14)
   R3=pkt_end R7=inv,min_value=0,max_value=1 R10=fp
  7: (63) *(u32 *)(r0 -4) = r0
  8: (b7) r0 = 0
  9: (95) exit

  from 6 to 8:
   R0=pkt(id=0,off=14,r=0) R1=ctx R2=pkt(id=0,off=0,r=0)
   R3=pkt_end R7=inv,min_value=0,max_value=1 R10=fp
  8: (b7) r0 = 0
  9: (95) exit

  from 5 to 10:
   R0=pkt(id=0,off=14,r=0) R1=ctx R2=pkt(id=0,off=0,r=0)
   R3=pkt_end R7=inv,min_value=2 R10=fp
  10: (07) r0 += 1
  11: (05) goto pc-6
  6: safe                           <----- here, wrongly found safe
  processed 15 insns

However, if we enforce a pruning mismatch by adding state into r8
which is then being mismatched in states_equal(), we find that for
the otherwise same program, the verifier detects a misaligned packet
access when actually walking that path:

  0: (61) r2 = *(u32 *)(r1 +76)
  1: (61) r3 = *(u32 *)(r1 +80)
  2: (61) r7 = *(u32 *)(r1 +8)
  3: (b7) r8 = 1
  4: (bf) r0 = r2
  5: (07) r0 += 14
  6: (25) if r7 > 0x1 goto pc+4
   R0=pkt(id=0,off=14,r=0) R1=ctx R2=pkt(id=0,off=0,r=0)
   R3=pkt_end R7=inv,min_value=0,max_value=1
   R8=imm1,min_value=1,max_value=1,min_align=1 R10=fp
  7: (2d) if r0 > r3 goto pc+1
   R0=pkt(id=0,off=14,r=14) R1=ctx R2=pkt(id=0,off=0,r=14)
   R3=pkt_end R7=inv,min_value=0,max_value=1
   R8=imm1,min_value=1,max_value=1,min_align=1 R10=fp
  8: (63) *(u32 *)(r0 -4) = r0
  9: (b7) r0 = 0
  10: (95) exit

  from 7 to 9:
   R0=pkt(id=0,off=14,r=0) R1=ctx R2=pkt(id=0,off=0,r=0)
   R3=pkt_end R7=inv,min_value=0,max_value=1
   R8=imm1,min_value=1,max_value=1,min_align=1 R10=fp
  9: (b7) r0 = 0
  10: (95) exit

  from 6 to 11:
   R0=pkt(id=0,off=14,r=0) R1=ctx R2=pkt(id=0,off=0,r=0)
   R3=pkt_end R7=inv,min_value=2
   R8=imm1,min_value=1,max_value=1,min_align=1 R10=fp
  11: (07) r0 += 1
  12: (b7) r8 = 0
  13: (05) goto pc-7                <----- mismatch due to r8
  7: (2d) if r0 > r3 goto pc+1
   R0=pkt(id=0,off=15,r=15) R1=ctx R2=pkt(id=0,off=0,r=15)
   R3=pkt_end R7=inv,min_value=2
   R8=imm0,min_value=0,max_value=0,min_align=2147483648 R10=fp
  8: (63) *(u32 *)(r0 -4) = r0
  misaligned packet access off 2+15+-4 size 4

The reason why we fail to see it in states_equal() is that the
third test in compare_ptrs_to_packet() ...

  if (old->off <= cur->off &&
      old->off >= old->range && cur->off >= cur->range)
          return true;

... will let the above pass. The situation we run into is that
old->off <= cur->off (14 <= 15), meaning that prior walked paths
went with smaller offset, which was later used in the packet
access after successful packet range check and found to be safe
already.

For example: Given is R0=pkt(id=0,off=0,r=0). Adding offset 14
as in above program to it, results in R0=pkt(id=0,off=14,r=0)
before the packet range test. Now, testing this against R3=pkt_end
with 'if r0 > r3 goto out' will transform R0 into R0=pkt(id=0,off=14,r=14)
for the case when we're within bounds. A write into the packet
at offset *(u32 *)(r0 -4), that is, 2 + 14 -4, is valid and
aligned (2 is for NET_IP_ALIGN). After processing this with
all fall-through paths, we later on check paths from branches.
When the above skb->mark test is true, then we jump near the
end of the program, perform r0 += 1, and jump back to the
'if r0 > r3 goto out' test we've visited earlier already. This
time, R0 is of type R0=pkt(id=0,off=15,r=0), and we'll prune
that part because this time we'll have a larger safe packet
range, and we already found that with off=14 all further insn
were already safe, so it's safe as well with a larger off.
However, the problem is that the subsequent write into the packet
with 2 + 15 -4 is then unaligned, and not caught by the alignment
tracking. Note that min_align, aux_off, and aux_off_align were
all 0 in this example.

Since we cannot tell at this time what kind of packet access was
performed in the prior walk and what minimal requirements it has
(we might do so in the future, but that requires more complexity),
fix it to disable this pruning case for strict alignment for now,
and let the verifier check such paths instead. With that applied,
the test cases pass and reject the program due to misalignment.

Fixes: d1174416 ("bpf: Track alignment of register values in the verifier.")
Reference: http://patchwork.ozlabs.org/patch/761909/
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

1ad2f583
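
The alignment arithmetic from the commit message (2 + 14 - 4 versus 2 + 15 - 4 for a 4-byte write) and the pruning test it quotes can be checked with a short model. The names below are hypothetical, and the `strict_alignment` parameter with its early return is an assumption about how "disable this pruning case" could be expressed; it is not taken from the actual patch.

```c
#include <stdbool.h>

#define MODEL_NET_IP_ALIGN 2 /* the "2" in the commit message's sums */

/*
 * Is an access of `size` bytes at (reg_off + insn_off) into the packet
 * aligned? Assumes the total offset is non-negative.
 */
bool model_pkt_access_aligned(int reg_off, int insn_off, int size)
{
    return ((MODEL_NET_IP_ALIGN + reg_off + insn_off) % size) == 0;
}

struct model_pkt_reg {
    int off;   /* constant offset added to the packet pointer */
    int range; /* verified safe range */
};

/*
 * Third test of the packet pointer comparison as quoted in the commit
 * message, with the assumed change: under strict alignment it never
 * reports the states as equivalent, so the path gets walked.
 */
bool model_can_prune_pkt(const struct model_pkt_reg *old,
                         const struct model_pkt_reg *cur,
                         bool strict_alignment)
{
    if (strict_alignment)
        return false;
    return old->off <= cur->off &&
           old->off >= old->range && cur->off >= cur->range;
}
```

For `off=14` the write at `r0 - 4` lands on offset 12, which is a multiple of 4. For `off=15` it lands on 13, which is not, yet the quoted test alone would still treat the second state as covered by the first.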

24 May, 2017 · 1 commit

posix-timers: Make signal printks conditional · 43fe8b8e

Committed by Thomas Gleixner on May 23, 2017

A recent commit added extra printks for CPU/RT limits. This can result in
excessive spam in dmesg.

Make the printks conditional on print_fatal_signals.

Fixes: e7ea7c98 ("rlimits: Print more information when CPU/RT limits are exceeded")
Reported-by: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Arun Raghavan <arun@arunraghavan.net>

43fe8b8e

23 May, 2017 · 2 commits

ptrace: Properly initialize ptracer_cred on fork · c70d9d80

Committed by Eric W. Biederman on May 22, 2017

When I introduced ptracer_cred I failed to consider the weirdness of
fork where the task_struct copies the old value by default. This
winds up leaving ptracer_cred set even when a process forks and
the child process does not wind up being ptraced.

Because ptracer_cred is not set on non-ptraced processes whose
parents were ptraced this has broken the ability of the enlightenment
window manager to start setuid children.

Fix this by properly initializing ptracer_cred in ptrace_init_task.

This must be done with a little bit of care to preserve the current value
of ptracer_cred when ptrace carries through fork. Re-reading the
ptracer_cred from the ptracing process at this point is inconsistent
with how PT_PTRACE_CAP has been maintained all of these years.

Tested-by: Takashi Iwai <tiwai@suse.de>
Fixes: 64b875f7 ("ptrace: Capture the ptracer's creds not PT_PTRACE_CAP")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

c70d9d80

kthread: Fix use-after-free if kthread fork fails · 4d6501dc

Committed by Vegard Nossum on May 09, 2017

If a kthread forks (e.g. usermodehelper since commit 1da5c46f) but
fails in copy_process() between calling dup_task_struct() and setting
p->set_child_tid, then the value of p->set_child_tid will be inherited
from the parent and get prematurely freed by free_kthread_struct().

    kthread()
     - worker_thread()
        - process_one_work()
        |  - call_usermodehelper_exec_work()
        |     - kernel_thread()
        |        - _do_fork()
        |           - copy_process()
        |              - dup_task_struct()
        |                 - arch_dup_task_struct()
        |                    - tsk->set_child_tid = current->set_child_tid // implied
        |              - ...
        |              - goto bad_fork_*
        |              - ...
        |              - free_task(tsk)
        |                 - free_kthread_struct(tsk)
        |                    - kfree(tsk->set_child_tid)
        - ...
        - schedule()
           - __schedule()
              - wq_worker_sleeping()
                 - kthread_data(task)->flags // UAF

The problem started showing up with commit 1da5c46f since it reused
->set_child_tid for the kthread worker data.

A better long-term solution might be to get rid of the ->set_child_tid
abuse. The comment in set_kthread_struct() also looks slightly wrong.

Debugged-by: Jamie Iles <jamie.iles@oracle.com>
Fixes: 1da5c46f ("kthread: Make struct kthread kmalloc'ed")
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jamie Iles <jamie.iles@oracle.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20170509073959.17858-1-vegard.nossum@oracle.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

4d6501dc
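
The call chain above boils down to a shallow copy that inherits a pointer, followed by an error path that frees it. The model below is hypothetical throughout: `struct model_task`, `model_free_task()` and `model_fork_fails()` are invented, and clearing the inherited pointer right after the copy is an assumed remedy chosen for illustration, since the commit message does not spell out the change.

```c
#include <stdlib.h>
#include <string.h>

/* In this model set_child_tid doubles as the kthread data pointer. */
struct model_task {
    int is_kthread;
    void *set_child_tid;
};

int model_kdata_frees; /* how often kthread data was freed */

void model_free_task(struct model_task *t)
{
    if (t->is_kthread && t->set_child_tid) {
        free(t->set_child_tid);
        model_kdata_frees++;
    }
    free(t);
}

/*
 * Simulated fork that fails after duplicating the task. With
 * clear_inherited == 0 the child's error path frees the parent's data.
 */
int model_fork_fails(const struct model_task *parent, int clear_inherited)
{
    struct model_task *child = malloc(sizeof(*child));

    if (!child)
        return -1;
    memcpy(child, parent, sizeof(*child)); /* shallow copy */
    if (clear_inherited)
        child->set_child_tid = NULL; /* assumed fix: drop it early */

    /* ... a later setup step fails here ... */
    model_free_task(child);
    return -1;
}
```

In the unfixed variant the parent still holds the freed pointer, so its next dereference is the use-after-free marked `// UAF` in the trace.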

OpenHarmony / kernel_linux · last synced more than 4 years ago