1. 26 Aug, 2015 1 commit
  2. 12 Aug, 2015 3 commits
  3. 10 Aug, 2015 1 commit
    • cpuset: use trialcs->mems_allowed as a temp variable · 24ee3cf8
      Alban Crequy authored
      The comment says it's using trialcs->mems_allowed as a temp variable but
      it didn't match the code. Change the code to match the comment.
      
      This fixes an issue when writing to cpuset.mems while a sub-directory
      exists: the value has to be written several times before it persists:
      
      | root@alban:/sys/fs/cgroup/cpuset# mkdir footest9
      | root@alban:/sys/fs/cgroup/cpuset# cd footest9
      | root@alban:/sys/fs/cgroup/cpuset/footest9# mkdir aa
      | root@alban:/sys/fs/cgroup/cpuset/footest9# cat cpuset.mems
      |
      | root@alban:/sys/fs/cgroup/cpuset/footest9# echo 0 > cpuset.mems
      | root@alban:/sys/fs/cgroup/cpuset/footest9# cat cpuset.mems
      |
      | root@alban:/sys/fs/cgroup/cpuset/footest9# echo 0 > cpuset.mems
      | root@alban:/sys/fs/cgroup/cpuset/footest9# cat cpuset.mems
      | 0
      | root@alban:/sys/fs/cgroup/cpuset/footest9# cat aa/cpuset.mems
      |
      | root@alban:/sys/fs/cgroup/cpuset/footest9# echo 0 > aa/cpuset.mems
      | root@alban:/sys/fs/cgroup/cpuset/footest9# cat aa/cpuset.mems
      | 0
      | root@alban:/sys/fs/cgroup/cpuset/footest9#
      
      This should help to fix the following issue in Docker:
      https://github.com/opencontainers/runc/issues/133
      In some conditions, a Docker container needs to be started twice in
      order to work.

      Signed-off-by: Alban Crequy <alban@endocode.com>
      Tested-by: Iago López Galeiras <iago@endocode.com>
      Cc: <stable@vger.kernel.org> # 3.17+
      Acked-by: Li Zefan <lizefan@huawei.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
  4. 07 Aug, 2015 4 commits
    • kthread: export kthread functions · 18896451
      David Kershner authored
      The s-Par visornic driver, currently in staging, processes a queue that
      is serviced by an s-Par service partition.  We can get a message that
      something has happened with the service partition; when that happens, we
      must not access the channel until we get a message that the service
      partition is back again.
      
      The visornic driver has a thread for processing the channel. When we get
      the message, we need to be able to park the thread and then resume it
      when the problem clears.
      
      We can do this with kthread_park and unpark, but they are not exported
      from the kernel. This patch exports the needed functions.
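
The park-and-resume pattern described above can be modelled in user space. The following is only a sketch under stated assumptions: it uses POSIX threads instead of the kernel's kthread_park()/kthread_unpark(), and every `demo_` name is hypothetical and does not come from the visornic driver. The worker acknowledges a park request before the caller returns, so the caller knows the channel is no longer being touched.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a parkable kernel thread. */
struct demo_worker {
    pthread_t thread;
    pthread_mutex_t lock;
    pthread_cond_t cond;
    bool should_park;        /* request: stop touching the channel */
    bool parked;             /* acknowledgement from the worker */
    bool should_stop;        /* request: exit the thread */
    unsigned long processed; /* stand-in for channel work done */
};

static void *demo_worker_main(void *arg)
{
    struct demo_worker *w = arg;

    pthread_mutex_lock(&w->lock);
    while (!w->should_stop) {
        if (w->should_park) {
            /* Acknowledge the request, then sleep until unparked or stopped. */
            w->parked = true;
            pthread_cond_broadcast(&w->cond);
            while (w->should_park && !w->should_stop)
                pthread_cond_wait(&w->cond, &w->lock);
            w->parked = false;
            continue;
        }
        w->processed++; /* "service the channel" */
        pthread_mutex_unlock(&w->lock);
        sched_yield();
        pthread_mutex_lock(&w->lock);
    }
    pthread_mutex_unlock(&w->lock);
    return NULL;
}

int demo_worker_start(struct demo_worker *w)
{
    pthread_mutex_init(&w->lock, NULL);
    pthread_cond_init(&w->cond, NULL);
    w->should_park = w->parked = w->should_stop = false;
    w->processed = 0;
    return pthread_create(&w->thread, NULL, demo_worker_main, w);
}

/* Returns only once the worker has acknowledged that it is parked. */
void demo_worker_park(struct demo_worker *w)
{
    pthread_mutex_lock(&w->lock);
    w->should_park = true;
    while (!w->parked)
        pthread_cond_wait(&w->cond, &w->lock);
    pthread_mutex_unlock(&w->lock);
}

void demo_worker_unpark(struct demo_worker *w)
{
    pthread_mutex_lock(&w->lock);
    w->should_park = false;
    pthread_cond_broadcast(&w->cond);
    pthread_mutex_unlock(&w->lock);
}

unsigned long demo_worker_processed(struct demo_worker *w)
{
    pthread_mutex_lock(&w->lock);
    unsigned long n = w->processed;
    pthread_mutex_unlock(&w->lock);
    return n;
}

void demo_worker_stop(struct demo_worker *w)
{
    pthread_mutex_lock(&w->lock);
    w->should_stop = true;
    pthread_cond_broadcast(&w->cond);
    pthread_mutex_unlock(&w->lock);
    pthread_join(w->thread, NULL);
}
```

A caller would invoke demo_worker_park() when the "service partition gone" message arrives and demo_worker_unpark() when the partition returns. In a kernel module the same roles would be played by kthread_park(), kthread_unpark() and a kthread_should_park()/kthread_parkme() check in the thread loop.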

      Signed-off-by: David Kershner <david.kershner@unisys.com>
      Acked-by: Ingo Molnar <mingo@kernel.org>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Richard Weinberger <richard.weinberger@gmail.com>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • signal: fix information leak in copy_siginfo_to_user · 26135022
      Amanieu d'Antras authored
      This function may copy the si_addr_lsb, si_lower and si_upper fields to
      user mode when they haven't been initialized, which can leak kernel
      stack data to user mode.
      
      Just checking the value of si_code is insufficient because the same
      si_code value is shared between multiple signals.  This is solved by
      checking the value of si_signo in addition to si_code.
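
The reasoning above can be illustrated with a small user-space sketch. It is not the kernel function itself: the `demo_` names are hypothetical, and the numeric si_code values are redefined locally (as recalled from the Linux uapi headers) so that the example builds against older libc headers. It shows why si_code alone is ambiguous and how adding the si_signo check resolves it.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>

/*
 * Userspace si_code values, redefined locally for illustration.
 * Note how the same number is reused by different signals.
 */
enum {
    DEMO_ILL_ILLADR    = 3, /* SIGILL  */
    DEMO_SEGV_BNDERR   = 3, /* SIGSEGV */
    DEMO_ILL_ILLTRP    = 4, /* SIGILL  */
    DEMO_BUS_MCEERR_AR = 4, /* SIGBUS  */
    DEMO_BUS_MCEERR_AO = 5, /* SIGBUS  */
};

/* si_addr_lsb is only meaningful for machine-check SIGBUS codes. */
bool demo_has_addr_lsb(int si_signo, int si_code)
{
    return si_signo == SIGBUS &&
           (si_code == DEMO_BUS_MCEERR_AR || si_code == DEMO_BUS_MCEERR_AO);
}

/* si_lower and si_upper are only meaningful for SIGSEGV bound errors. */
bool demo_has_bounds(int si_signo, int si_code)
{
    return si_signo == SIGSEGV && si_code == DEMO_SEGV_BNDERR;
}
```

With a check on si_code alone, a SIGILL carrying code 3 would be indistinguishable from a SIGSEGV bound error, and the uninitialized bound fields would be copied out.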

      Signed-off-by: Amanieu d'Antras <amanieu@gmail.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • signal: fix information leak in copy_siginfo_from_user32 · 3c00cb5e
      Amanieu d'Antras authored
      This function can leak kernel stack data when the user siginfo_t has a
      positive si_code value.  The top 16 bits of si_code describe which fields
      in the siginfo_t union are active, but they are treated inconsistently
      between copy_siginfo_from_user32, copy_siginfo_to_user32 and
      copy_siginfo_to_user.
      
      copy_siginfo_from_user32 is called from rt_sigqueueinfo and
      rt_tgsigqueueinfo, in which the user has full control over the top 16
      bits of si_code.
      
      This fixes the following information leaks:
      x86:   8 bytes leaked when sending a signal from a 32-bit process to
             itself. This leak grows to 16 bytes if the process uses x32.
             (si_code = __SI_CHLD)
      x86:   100 bytes leaked when sending a signal from a 32-bit process to
             a 64-bit process. (si_code = -1)
      sparc: 4 bytes leaked when sending a signal from a 32-bit process to a
             64-bit process. (si_code = any)
      
      parisc and s390 have similar bugs, but they are not vulnerable because
      rt_[tg]sigqueueinfo have checks that prevent sending a positive si_code
      to a different process.  These bugs are also fixed for consistency.

      Signed-off-by: Amanieu d'Antras <amanieu@gmail.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • tracing, perf: Implement BPF programs attached to uprobes · 04a22fae
      Wang Nan authored
      By copying the BPF-related operations to the uprobe processing path, this
      patch allows users to attach BPF programs to uprobes, as they already can
      on kprobes.
      
      After this patch, users are allowed to use PERF_EVENT_IOC_SET_BPF on a
      uprobe perf event, which makes it possible to profile user space programs
      and kernel events together using BPF.
      
      Because of this patch, CONFIG_BPF_EVENTS should be selected by
      CONFIG_UPROBE_EVENT to ensure trace_call_bpf() is compiled even if
      KPROBE_EVENT is not set.
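
From user space, the attachment step mentioned above comes down to one ioctl on the perf event file descriptor. The sketch below assumes that both descriptors were obtained elsewhere (a uprobe perf event via perf_event_open() and a loaded BPF program via bpf()); the wrapper name `demo_attach_bpf` is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <sys/ioctl.h>
#include <linux/perf_event.h>

/*
 * Attach an already loaded BPF program to an already opened perf event.
 * Returns 0 on success, or -1 with errno set by ioctl() on failure.
 */
int demo_attach_bpf(int perf_fd, int bpf_prog_fd)
{
    return ioctl(perf_fd, PERF_EVENT_IOC_SET_BPF, bpf_prog_fd);
}
```

Passing an invalid descriptor fails with EBADF, which makes the wrapper easy to exercise without a real uprobe or BPF program.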

      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Acked-by: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kaixu Xia <xiakaixu@huawei.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1435716878-189507-3-git-send-email-wangnan0@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  5. 04 Aug, 2015 3 commits
    • perf/x86/intel/pt: Do not force sync packets on every schedule-in · 9a6694cf
      Alexander Shishkin authored
      Currently, the PT driver zeroes out the status register every time before
      starting the event. However, all the writable bits are already taken care
      of in the pt_handle_status() function, except the new PacketByteCnt field,
      which in new versions of PT contains the number of packet bytes written
      since the last sync (PSB) packet. Zeroing it out before enabling PT forces
      a sync packet to be written. This means that, with the existing code, a
      sync packet (PSB and PSBEND, 18 bytes in total) will be generated every
      time a PT event is scheduled in.
      
      To avoid these unnecessary syncs and save a WRMSR in the fast path, this
      patch changes the default behavior to not clear the PacketByteCnt field,
      so that the sync packets will be generated with the period specified in
      the "psb_period" attribute config field. This has little impact on the
      trace data as the other packets that are normally sent within PSB+
      (between PSB and PSBEND) have their own generation scenarios which do not
      depend on the sync packets.
      
      One exception, where we do need to force a PSB like this, is when tracing
      starts, so that the decoder has a clear sync point in the trace. For this
      purpose we already have the hw::itrace_started flag, which we are
      currently using to output PERF_RECORD_ITRACE_START. This patch moves
      setting itrace_started from perf core to pmu::start, where it should
      still be 0 on the very first run.

      Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: acme@infradead.org
      Cc: adrian.hunter@intel.com
      Cc: hpa@zytor.com
      Link: http://lkml.kernel.org/r/1438264104-16189-1-git-send-email-alexander.shishkin@linux.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • perf/x86/hw_breakpoints: Disallow kernel breakpoints unless kprobe-safe · e5779e8e
      Andy Lutomirski authored
      Code on the kprobe blacklist doesn't want unexpected int3
      exceptions. It probably doesn't want unexpected debug exceptions
      either. Be safe: disallow breakpoints in nokprobes code.
      
      On non-CONFIG_KPROBES kernels, there is no kprobe blacklist.  In
      that case, disallow kernel breakpoints entirely.
      
      It will be particularly important to keep hw breakpoints out of the
      entry and NMI code once we move debug exceptions off the IST stack.

      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/e14b152af99640448d895e3c2a8c2d5ee19a1325.1438312874.git.luto@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • perf: Fix fasync handling on inherited events · fed66e2c
      Peter Zijlstra authored
      Vince reported that the fasync signal handling doesn't work properly for
      inherited events. So fix that.
      
      Installing fasync allocates memory and sets filp->f_flags |= FASYNC,
      which upon the demise of the file descriptor ensures the allocation is
      freed and state is updated.
      
      Now for perf, we can have the events stick around for a while after the
      original FD is dead because of references from child events. So we
      cannot copy the fasync pointer around. We can however consistently use
      the parent's fasync, as that will be updated.
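
The "always use the parent's fasync" rule can be sketched with a simplified model. The structures below are hypothetical stand-ins, not the kernel's struct perf_event or struct fasync_struct; only the lookup logic is the point.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's fasync state. */
struct demo_fasync {
    int fd;
};

/* Hypothetical stand-in for a perf event. */
struct demo_event {
    struct demo_event *parent;  /* NULL for the original event */
    struct demo_fasync *fasync; /* only maintained on the original event */
};

/*
 * Resolve the fasync slot to signal through: inherited (child) events
 * never carry their own copy, they redirect to the parent's slot.
 */
struct demo_fasync **demo_event_fasync(struct demo_event *event)
{
    assert(event != NULL);
    if (event->parent)
        event = event->parent;
    return &event->fasync;
}
```

Because child events look the slot up on every use rather than caching a copy, they observe the parent's state being cleared when the original file descriptor goes away.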

      Reported-and-Tested-by: Vince Weaver <vincent.weaver@maine.edu>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: <stable@vger.kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: eranian@google.com
      Link: http://lkml.kernel.org/r/1434011521.1495.71.camel@twins
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  6. 31 Jul, 2015 14 commits
  7. 29 Jul, 2015 1 commit
  8. 27 Jul, 2015 1 commit
  9. 25 Jul, 2015 1 commit
  10. 24 Jul, 2015 1 commit
    • perf: Add PERF_RECORD_SWITCH to indicate context switches · 45ac1403
      Adrian Hunter authored
      There are already two events for context switches, namely the tracepoint
      sched:sched_switch and the software event context_switches.
      Unfortunately neither is suitable for use by non-privileged users for
      the purpose of synchronizing hardware trace data (e.g. Intel PT) to the
      context switch.
      
      Tracepoints are no good at all for non-privileged users because they
      need either CAP_SYS_ADMIN or /proc/sys/kernel/perf_event_paranoid <= -1.
      
      On the other hand, kernel software events need either CAP_SYS_ADMIN or
      /proc/sys/kernel/perf_event_paranoid <= 1.
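
The two permission rules above can be condensed into a small predicate. This is a sketch that encodes only the thresholds stated in this message; the `demo_` names are hypothetical and the function does not query the running kernel.

```c
#include <assert.h>
#include <stdbool.h>

enum demo_event_kind {
    DEMO_TRACEPOINT,            /* e.g. sched:sched_switch */
    DEMO_KERNEL_SOFTWARE_EVENT, /* e.g. context_switches */
};

/*
 * paranoid is the value of /proc/sys/kernel/perf_event_paranoid,
 * cap_sys_admin says whether the caller holds CAP_SYS_ADMIN.
 */
bool demo_event_allowed(enum demo_event_kind kind, int paranoid,
                        bool cap_sys_admin)
{
    if (cap_sys_admin)
        return true;

    switch (kind) {
    case DEMO_TRACEPOINT:
        return paranoid <= -1;
    case DEMO_KERNEL_SOFTWARE_EVENT:
        return paranoid <= 1;
    }
    return false;
}
```

With perf_event_paranoid set to 1, an unprivileged user passes the check for the software event but not for the tracepoint.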
      
      Now many distributions do default perf_event_paranoid to 1, making
      context_switches a contender, except it has another problem (which is
      also shared with sched:sched_switch), which is that it happens before
      perf schedules events out instead of after perf schedules events in.
      Whereas a privileged user can see all the events anyway, a
      non-privileged user only sees events for their own processes; in other
      words, they see when their process was scheduled out, not when it was
      scheduled in. That presents two problems when using the event:
      
      1. the information comes too late, so tools have to look ahead in the
         event stream to find out what the current state is
      
      2. if they are unlucky tracing might have stopped before the
         context-switches event is recorded.
      
      This new PERF_RECORD_SWITCH event does not have those problems
      and it also has a couple of other small advantages.
      
      It is easier to use because it is an auxiliary event (like mmap, comm
      and task events) which can be enabled by setting a single bit. It is
      smaller than sched:sched_switch and easier to parse.
      
      To make the event useful for privileged users also, if the
      context is cpu-wide then the event record will be
      PERF_RECORD_SWITCH_CPU_WIDE which is the same as
      PERF_RECORD_SWITCH except it also provides the next or
      previous pid/tid.

      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Tested-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Pawel Moll <pawel.moll@arm.com>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1437471846-26995-2-git-send-email-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  11. 22 Jul, 2015 1 commit
  12. 21 Jul, 2015 1 commit
  13. 18 Jul, 2015 2 commits
  14. 17 Jul, 2015 1 commit
  15. 15 Jul, 2015 1 commit
    • genirq: Revert sparse irq locking around __cpu_up() and move it to x86 for now · ce0d3c0a
      Thomas Gleixner authored
      Boris reported that the sparse_irq protection around __cpu_up() in the
      generic code causes a regression on Xen. Xen allocates interrupts and
      some more in the xen_cpu_up() function, so it deadlocks on the
      sparse_irq_lock.
      
      There is no simple fix for this and we really should have the
      protection for all architectures, but for now the only solution is to
      move it to x86 where actual wreckage due to the lack of protection has
      been observed.

      Reported-and-tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Fixes: a8994181 'hotplug: Prevent alloc/free of irq descriptors during cpu up/down'
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: xiao jin <jin.xiao@intel.com>
      Cc: Joerg Roedel <jroedel@suse.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Yanmin Zhang <yanmin_zhang@linux.intel.com>
      Cc: xen-devel <xen-devel@lists.xenproject.org>
  16. 14 Jul, 2015 1 commit
  17. 11 Jul, 2015 1 commit
  18. 09 Jul, 2015 2 commits