1. 16 Jan, 2016 (2 commits)
  2. 15 Jan, 2016 (1 commit)
  3. 23 Nov, 2015 (1 commit)
    • treewide: Remove old email address · 90eec103
      Committed by Peter Zijlstra
      There were still a number of references to my old Red Hat email
      address in the kernel source. Remove these while keeping the
      Red Hat copyright notices intact.
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  4. 31 Jul, 2015 (14 commits)
  5. 14 Dec, 2014 (3 commits)
  6. 24 Nov, 2014 (1 commit)
  7. 09 Aug, 2014 (1 commit)
    • mm: memcontrol: rewrite charge API · 00501b53
      Committed by Johannes Weiner
      These patches rework memcg charge lifetime to integrate more naturally
      with the lifetime of user pages.  This drastically simplifies the code and
      reduces charging and uncharging overhead.  The most expensive part of
      charging and uncharging is the page_cgroup bit spinlock, which is removed
      entirely after this series.
      
      Here are the top-10 profile entries of a stress test that reads a 128G
      sparse file on a freshly booted box, without even a dedicated cgroup
      (i.e. executing in the root memcg).  Before:
      
          15.36%              cat  [kernel.kallsyms]   [k] copy_user_generic_string
          13.31%              cat  [kernel.kallsyms]   [k] memset
          11.48%              cat  [kernel.kallsyms]   [k] do_mpage_readpage
           4.23%              cat  [kernel.kallsyms]   [k] get_page_from_freelist
           2.38%              cat  [kernel.kallsyms]   [k] put_page
           2.32%              cat  [kernel.kallsyms]   [k] __mem_cgroup_commit_charge
           2.18%          kswapd0  [kernel.kallsyms]   [k] __mem_cgroup_uncharge_common
           1.92%          kswapd0  [kernel.kallsyms]   [k] shrink_page_list
           1.86%              cat  [kernel.kallsyms]   [k] __radix_tree_lookup
           1.62%              cat  [kernel.kallsyms]   [k] __pagevec_lru_add_fn
      
      After:
      
          15.67%           cat  [kernel.kallsyms]   [k] copy_user_generic_string
          13.48%           cat  [kernel.kallsyms]   [k] memset
          11.42%           cat  [kernel.kallsyms]   [k] do_mpage_readpage
           3.98%           cat  [kernel.kallsyms]   [k] get_page_from_freelist
           2.46%           cat  [kernel.kallsyms]   [k] put_page
           2.13%       kswapd0  [kernel.kallsyms]   [k] shrink_page_list
           1.88%           cat  [kernel.kallsyms]   [k] __radix_tree_lookup
           1.67%           cat  [kernel.kallsyms]   [k] __pagevec_lru_add_fn
           1.39%       kswapd0  [kernel.kallsyms]   [k] free_pcppages_bulk
           1.30%           cat  [kernel.kallsyms]   [k] kfree
      
      As you can see, the memcg footprint has shrunk quite a bit.
      
         text    data     bss     dec     hex filename
        37970    9892     400   48262    bc86 mm/memcontrol.o.old
        35239    9892     400   45531    b1db mm/memcontrol.o
      
      This patch (of 4):
      
      The memcg charge API charges pages before they are rmapped - i.e.  have an
      actual "type" - and so every callsite needs its own set of charge and
      uncharge functions to know what type is being operated on.  Worse,
      uncharge has to happen from a context that is still type-specific, rather
      than at the end of the page's lifetime with exclusive access, and so
      requires a lot of synchronization.
      
      Rewrite the charge API to provide a generic set of try_charge(),
      commit_charge() and cancel_charge() transaction operations, much like
      what's currently done for swap-in:
      
        mem_cgroup_try_charge() attempts to reserve a charge, reclaiming
        pages from the memcg if necessary.
      
        mem_cgroup_commit_charge() commits the page to the charge once it
        has a valid page->mapping and PageAnon() reliably tells the type.
      
        mem_cgroup_cancel_charge() aborts the transaction.
      
      This reduces the charge API and enables subsequent patches to
      drastically simplify uncharging.
      
      As pages need to be committed after rmap is established but before they
      are added to the LRU, page_add_new_anon_rmap() must stop doing LRU
      additions again.  Revive lru_cache_add_active_or_unevictable().
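      
      A minimal sketch of the resulting calling convention for a new
      anonymous page, modeled on the description above (the
      setup_pte_fails condition is a placeholder, and the exact
      signatures are assumptions):
      
      	struct mem_cgroup *memcg;
      
      	/* Reserve a charge; may reclaim from the memcg. No "type" yet. */
      	if (mem_cgroup_try_charge(page, mm, GFP_KERNEL, &memcg))
      		goto oom;
      
      	if (setup_pte_fails)	/* placeholder for the pte/mm work */
      		goto cancel;
      
      	/* Give the page its type, then commit, then add it to the LRU. */
      	page_add_new_anon_rmap(page, vma, address);
      	mem_cgroup_commit_charge(page, memcg, false);
      	lru_cache_add_active_or_unevictable(page, vma);
      	return 0;
      
      cancel:
      	/* Abort the transaction: drop the reserved charge. */
      	mem_cgroup_cancel_charge(page, memcg);
      oom:
      	return -ENOMEM;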
      
      [hughd@google.com: fix shmem_unuse]
      [hughd@google.com: Add comments on the private use of -EAGAIN]
      Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
      Acked-by: Michal Hocko <mhocko@suse.cz>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Vladimir Davydov <vdavydov@parallels.com>
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  8. 01 Jul, 2014 (1 commit)
  9. 05 Jun, 2014 (2 commits)
  10. 02 Jun, 2014 (2 commits)
  11. 26 May, 2014 (1 commit)
    • ARM: 8043/1: uprobes need icache flush after xol write · 72e6ae28
      Committed by Victor Kamensky
      After an instruction write into the xol area, on the ARM v7
      architecture the code needs to flush the dcache and icache to sync
      them up for the given set of addresses. Just calling
      'flush_dcache_page(page)' is not enough - a stale instruction can
      still sit in the icache for the given xol area slot address.
      
      Introduce a weak arch_uprobe_copy_ixol() function that by default
      calls the uprobes copy_to_page() function and then
      flush_dcache_page(), and on ARM define a new one that handles the
      xol slot copy in an ARM-specific way.
      
      The flush_uprobe_xol_access() function shares/reuses the
      implementation of flush_ptrace_access() and takes care of writing
      the instruction to the userland address space across the variety
      of cache types found on ARM CPUs. Because flush_uprobe_xol_access()
      does not have a vma around, flush_ptrace_access() was split into
      two parts: a first part that retrieves the set of conditions from
      the vma, and a common part that receives those conditions as flags.
      
      Note that the ARM cache flush functions need the kernel address
      through which the instruction write happened, so instead of using
      the uprobes copy_to_page() function, the code was changed to
      explicitly map the page and do the memcpy.
      
      Note that arch_uprobe_copy_ixol(), in a similar way to
      copy_to_user_page(), uses preempt_disable()/preempt_enable().
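      
      A minimal sketch of the ARM-specific copy helper described above
      (based on this description; the local variable names and exact
      kmap details are assumptions):
      
      	void arch_uprobe_copy_ixol(struct page *page, unsigned long vaddr,
      				   void *src, unsigned long len)
      	{
      		void *kaddr = kmap_atomic(page);
      		void *dst = kaddr + (vaddr & ~PAGE_MASK);
      
      		preempt_disable();
      
      		/* Write the instruction through the kernel mapping... */
      		memcpy(dst, src, len);
      		/* ...then sync dcache/icache for the slot's user address. */
      		flush_uprobe_xol_access(page, vaddr, dst, len);
      
      		preempt_enable();
      		kunmap_atomic(kaddr);
      	}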
      Signed-off-by: Victor Kamensky <victor.kamensky@linaro.org>
      Acked-by: Oleg Nesterov <oleg@redhat.com>
      Reviewed-by: David A. Long <dave.long@linaro.org>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
  12. 14 May, 2014 (2 commits)
    • uprobes/x86: Fix the wrong ->si_addr when xol triggers a trap · b02ef20a
      Committed by Oleg Nesterov
      If the probed insn triggers a trap, ->si_addr = regs->ip is technically
      correct, but this is not what the signal handler wants; we need to pass
      the address of the probed insn, not the address of the xol slot.
      
      Add a new arch-agnostic helper, uprobe_get_trap_addr(), and change
      fill_trap_info() and math_error() to use it. The !CONFIG_UPROBES case
      in uprobes.h uses a macro to avoid include hell and to ensure that it
      can be compiled even if an architecture doesn't define
      instruction_pointer().
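      
      A sketch of such a helper and its !CONFIG_UPROBES fallback (the
      uprobe_task fields used here are assumptions based on the
      description):
      
      	unsigned long uprobe_get_trap_addr(struct pt_regs *regs)
      	{
      		struct uprobe_task *utask = current->utask;
      
      		/* Stepping on an xol copy: report the probed insn itself. */
      		if (unlikely(utask && utask->active_uprobe))
      			return utask->vaddr;
      
      		return instruction_pointer(regs);
      	}
      
      	/* !CONFIG_UPROBES: a macro, to avoid include hell and to keep this
      	   buildable even if the arch doesn't define instruction_pointer(). */
      	#define uprobe_get_trap_addr(regs)	instruction_pointer(regs)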
      
      Test-case:
      
      	#include <signal.h>
      	#include <stdio.h>
      	#include <unistd.h>
      
      	extern void probe_div(void);
      
      	void sigh(int sig, siginfo_t *info, void *c)
      	{
      		int passed = (info->si_addr == (void *)probe_div);
      		printf(passed ? "PASS\n" : "FAIL\n");
      		_exit(!passed);
      	}
      
      	int main(void)
      	{
      		struct sigaction sa = {
      			.sa_sigaction	= sigh,
      			.sa_flags	= SA_SIGINFO,
      		};
      
      		sigaction(SIGFPE, &sa, NULL);
      
      		asm (
      			"xor %ecx,%ecx\n"
      			".globl probe_div; probe_div:\n"
      			"idiv %ecx\n"
      		);
      
      		return 0;
      	}
      
      It fails if probe_div() is probed.
      
      Note: show_unhandled_signals users should probably use this helper too,
      but we need to cleanup them first.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
    • uprobes: Add mem_cgroup_charge_anon() into uprobe_write_opcode() · 29dedee0
      Committed by Oleg Nesterov
      Hugh says:
      
          The one I noticed was that it forgets all about memcg (because
          it was copied from KSM, and there the replacement page has already
          been charged to a memcg). See how mm/memory.c do_anonymous_page()
          does a mem_cgroup_charge_anon().
      
      Hopefully not a big problem; uprobes is a system-wide thing and only
      root can insert the probes. But I agree, it should be fixed anyway.
      
      Add mem_cgroup_{un,}charge_anon() into uprobe_write_opcode(). To simplify
      the error handling (and avoid the new "uncharge" label) the patch also
      moves anon_vma_prepare() up before we alloc/charge the new page.
      
      While at it, fix the comment about ->mmap_sem: it is held for write.
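      
      The resulting flow in uprobe_write_opcode() might look roughly like
      this (a sketch; the labels and allocation flags are assumptions):
      
      	ret = anon_vma_prepare(vma);	/* moved up: may fail, nothing to uncharge */
      	if (ret)
      		goto put_old;
      
      	ret = -ENOMEM;
      	new_page = alloc_page_vma(GFP_HIGHUSER_MOVABLE, vma, vaddr);
      	if (!new_page)
      		goto put_old;
      
      	if (mem_cgroup_charge_anon(new_page, mm, GFP_KERNEL))
      		goto put_new;
      
      	/* ... copy old page contents, poke the breakpoint insn ... */
      
      	ret = __replace_page(vma, vaddr, old_page, new_page);
      	if (ret)
      		mem_cgroup_uncharge_page(new_page);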
      Suggested-by: Hugh Dickins <hughd@google.com>
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
  13. 01 May, 2014 (1 commit)
    • uprobes: Refuse to insert a probe into MAP_SHARED vma · 13f59c5e
      Committed by Oleg Nesterov
      valid_vma() rejects VM_SHARED vmas, but this still allows inserting a
      probe into a MAP_SHARED but not VM_MAYWRITE vma.
      
      Currently this is fine; such a mapping doesn't really differ from a
      private read-only mmap, except that mprotect(PROT_WRITE) won't work.
      However, get_user_pages(FOLL_WRITE | FOLL_FORCE) doesn't allow COW in
      this case, and it would be safer to follow the same conventions as mm
      even if currently this happens to work.
      
      After the recent commit cda540ac ("mm: get_user_pages(write,force)
      refuse to COW in shared areas"), only uprobes can insert an anon page
      into a shared file-backed area; let's stop this and change valid_vma()
      to check VM_MAYSHARE instead.
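      
      With that change, valid_vma() might look roughly like this (a
      sketch of the described check, not necessarily the verbatim code):
      
      	static bool valid_vma(struct vm_area_struct *vma, bool is_register)
      	{
      		vm_flags_t flags = VM_HUGETLB | VM_MAYEXEC | VM_MAYSHARE;
      
      		if (is_register)
      			flags |= VM_WRITE;
      
      		/* Executable, file-backed, and not even potentially shared. */
      		return vma->vm_file && (vma->vm_flags & flags) == VM_MAYEXEC;
      	}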
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
  14. 18 Apr, 2014 (2 commits)
    • uprobes/x86: Send SIGILL if arch_uprobe_post_xol() fails · 014940ba
      Committed by Oleg Nesterov
      Currently the error from arch_uprobe_post_xol() is silently ignored.
      This doesn't look good and can lead to hard-to-debug problems.
      
      1. Change handle_singlestep() to loudly complain and send SIGILL
         (see the sketch after this list).
      
         Note: this only affects x86, ppc/arm can't fail.
      
      2. Change arch_uprobe_post_xol() to call arch_uprobe_abort_xol() and
         avoid TF games if it is going to return an error.
      
         This can help to analyze the problem; if nothing else, we should
         not report ->ip = xol_slot in the core file.
      
         Note: this means that handle_riprel_post_xol() can be called twice,
         but this is fine because it is idempotent.
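      
      A sketch of the new failure handling in handle_singlestep() (the
      warning text and signal-delivery details are assumptions):
      
      	if (utask->state == UTASK_SSTEP_ACK)
      		err = arch_uprobe_post_xol(&uprobe->arch, regs);
      	else if (utask->state == UTASK_SSTEP_TRAPPED)
      		arch_uprobe_abort_xol(&uprobe->arch, regs);
      
      	/* ... */
      
      	if (unlikely(err)) {
      		uprobe_warn(current, "execute the probed insn, sending SIGILL.");
      		force_sig_info(SIGILL, SEND_SIG_FORCED, current);
      	}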
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Reviewed-by: Jim Keniston <jkenisto@us.ibm.com>
    • uprobes: Kill UPROBE_SKIP_SSTEP and can_skip_sstep() · 8a6b1732
      Committed by Oleg Nesterov
      UPROBE_COPY_INSN, UPROBE_SKIP_SSTEP, and uprobe->flags must die. This
      patch kills UPROBE_SKIP_SSTEP. I never understood why it was added;
      not only does it not help, it harms.
      
      It can only help to avoid arch_uprobe_skip_sstep() if it was already
      called before and failed. But this is ugly: if we want to know whether
      we can emulate this instruction or not, we should do this analysis in
      arch_uprobe_analyze_insn(), not when we hit this probe for the first
      time.
      
      And in fact this logic is simply wrong. arch_uprobe_skip_sstep() can
      fail or not depending on the task/register state: if this insn can be
      emulated but, say, put_user() fails, we need to xol it this time, but
      this doesn't mean we shouldn't try to emulate it when this or another
      thread hits this bp next time.
      
      And this is the actual reason for this change. We need to emulate the
      "call" insn, but push(return-address) can obviously fail.
      
      Per-arch notes:
      
      	x86: __skip_sstep() can only emulate "rep;nop". With this
      	     change it will be called every time and most probably
      	     for no reason.
      
      	     This will be fixed by the next changes. We need to
      	     change this suboptimal code anyway.
      
      	arm: Should not be affected. It has its own "bool simulate"
      	     flag checked in arch_uprobe_skip_sstep().
      
      	ppc: It looks like it can emulate almost everything. Does it
      	     actually need to record the fact that emulate_step()
      	     failed? Hopefully not. But if yes, it can add a ppc-
      	     specific flag into arch_uprobe.
      
      TODO: rename arch_uprobe_skip_sstep() to arch_uprobe_emulate_insn().
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Reviewed-by: David A. Long <dave.long@linaro.org>
      Reviewed-by: Jim Keniston <jkenisto@us.ibm.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
  15. 19 Mar, 2014 (1 commit)
  16. 13 Nov, 2014 (1 commit)
  17. 03 Jan, 2014 (1 commit)
  18. 20 Nov, 2013 (3 commits)