1. 09 Oct, 2012 1 commit
    • K
      mm: kill vma flag VM_RESERVED and mm->reserved_vm counter · 314e51b9
      Konstantin Khlebnikov committed
      A long time ago, in v2.4, VM_RESERVED kept the swapout process off a VMA.
      It has since lost its original meaning but still has some effects:
      
       | effect                 | alternative flags
      -+------------------------+---------------------------------------------
      1| account as reserved_vm | VM_IO
      2| skip in core dump      | VM_IO, VM_DONTDUMP
      3| do not merge or expand | VM_IO, VM_DONTEXPAND, VM_HUGETLB, VM_PFNMAP
      4| do not mlock           | VM_IO, VM_DONTEXPAND, VM_HUGETLB, VM_PFNMAP
      
      This patch removes the reserved_vm counter from mm_struct.  Nobody seems to
      care about it: it is not exported to userspace directly, it only
      reduces the total_vm shown in proc.
      
      Thus VM_RESERVED can be replaced with VM_IO or with the pair VM_DONTEXPAND | VM_DONTDUMP.
      
      remap_pfn_range() and io_remap_pfn_range() set VM_IO|VM_DONTEXPAND|VM_DONTDUMP.
      remap_vmalloc_range() sets VM_DONTEXPAND | VM_DONTDUMP.
      
      [akpm@linux-foundation.org: drivers/vfio/pci/vfio_pci.c fixup]
      Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Carsten Otte <cotte@de.ibm.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Morris <james.l.morris@oracle.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Kentaro Takeda <takedakn@nttdata.co.jp>
      Cc: Matt Helsley <matthltc@us.ibm.com>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: Venkatesh Pallipadi <venki@google.com>
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      314e51b9
  2. 27 Sep, 2012 2 commits
  3. 15 Sep, 2012 6 commits
    • S
      uprobes: Introduce arch_uprobe_enable/disable_step() · 9d778782
      Sebastian Andrzej Siewior committed
      As Oleg pointed out in [0] uprobe should not use the ptrace interface
      for enabling/disabling single stepping.
      
      [0] http://lkml.kernel.org/r/20120730141638.GA5306@redhat.com
      
      Add the new "__weak arch" helpers which simply call user_*_single_step()
      as a preparation. This is only needed to not break the powerpc port, we
      will fold this logic into arch_uprobe_pre/post_xol() hooks later.
      
      We should also change handle_singlestep(), _disable_step(&uprobe->arch)
      should be called before put_uprobe().
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      9d778782
    • O
      uprobes: Teach find_active_uprobe() to clear MMF_HAS_UPROBES · 499a4f3e
      Oleg Nesterov committed
      A wrong MMF_HAS_UPROBES doesn't really hurt, it just triggers
      the "slow" and unnecessary handle_swbp() path if the task hits
      a non-uprobe breakpoint.
      
      So this patch changes find_active_uprobe() to check every valid
      vma and clear MMF_HAS_UPROBES if no uprobes were found. This
      adds a slow O(n) path, but it is only taken in the unlikely case
      when the task hits a normal breakpoint for the first time after
      uprobe_unregister().
      
      Note the "not strictly accurate" comment in mmf_recalc_uprobes().
      We can fix this, we only need to teach vma_has_uprobes() to return
      a bit more info, but I am not sure this is worth the trouble.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      499a4f3e
    • O
      uprobes: Introduce MMF_RECALC_UPROBES · 9f68f672
      Oleg Nesterov committed
      Add the new MMF_RECALC_UPROBES flag, it means that MMF_HAS_UPROBES
      can be false positive after remove_breakpoint() or uprobe_munmap().
      It is also set by uprobe_dup_mmap(), this is not optimal but simple.
      We could add the new hook, uprobe_dup_vma(), to set MMF_HAS_UPROBES
      only if the new mm actually has uprobes, but I don't think this
      makes sense.
      
      The next patch will use this flag to clear MMF_HAS_UPROBES.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      9f68f672
    • O
      uprobes: uprobes_treelock should not disable irqs · 6f47caa0
      Oleg Nesterov committed
      Nobody plays with uprobes_tree/uprobes_treelock in interrupt context,
      no need to disable irqs.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      6f47caa0
    • S
      uprobes: Don't put NULL pointer in uprobe_register() · 6d1d8dfa
      Sebastian Andrzej Siewior committed
      alloc_uprobe() might return a NULL pointer, put_uprobe() can't deal with
      this.
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      6d1d8dfa
    • T
      cgroup: mark subsystems with broken hierarchy support and whine if cgroups are nested for them · 8c7f6edb
      Tejun Heo committed
      Currently, cgroup hierarchy support is a mess.  cpu related subsystems
      behave correctly - configuration, accounting and control on a parent
      properly cover its children.  blkio and freezer completely ignore
      hierarchy and treat all cgroups as if they're directly under the root
      cgroup.  Others show yet different behaviors.
      
      These differing interpretations of cgroup hierarchy make using cgroup
      confusing and make it impossible to co-mount controllers into the same
      hierarchy and obtain sane behavior.
      
      Eventually, we want full hierarchy support from all subsystems and
      probably a unified hierarchy.  Users using separate hierarchies
      expecting completely different behaviors depending on the mounted
      subsystem is detrimental to making any progress on this front.
      
      This patch adds cgroup_subsys.broken_hierarchy and sets it to %true
      for controllers which are lacking in hierarchy support.  The goal of
      this patch is two-fold.
      
      * Move users away from using hierarchy on currently non-hierarchical
        subsystems, so that implementing proper hierarchy support on those
        doesn't surprise them.
      
      * Keep track of which controllers are broken how and nudge the
        subsystems to implement proper hierarchy support.
      
      For now, start with a single warning message.  We can whine louder
      later on.
      
      v2: Fixed a typo spotted by Michal. Warning message updated.
      
      v3: Updated memcg part so that it doesn't generate warning in the
          cases where .use_hierarchy=false doesn't make the behavior
          different from root.use_hierarchy=true.  Fixed a typo spotted by
          Glauber.
      
      v4: Check ->broken_hierarchy after cgroup creation is complete so that
          ->create() can affect the result per Michal.  Dropped unnecessary
          memcg root handling per Michal.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Michal Hocko <mhocko@suse.cz>
      Acked-by: Li Zefan <lizefan@huawei.com>
      Acked-by: Serge E. Hallyn <serue@us.ibm.com>
      Cc: Glauber Costa <glommer@parallels.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paul Turner <pjt@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Thomas Graf <tgraf@suug.ch>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      8c7f6edb
  4. 04 Sep, 2012 2 commits
  5. 29 Aug, 2012 9 commits
  6. 10 Aug, 2012 5 commits
    • F
      perf: Add attribute to filter out callchains · d0775264
      Frederic Weisbecker committed
      Introducing the following bits to the perf_event_attr struct:
      
        - exclude_callchain_kernel to filter out kernel callchain
          from the sample dump
      
        - exclude_callchain_user to filter out user callchain
          from the sample dump
      
      We need to be able to disable standard user callchain dump when we use
      the dwarf cfi callchain mode, because frame pointer based user
      callchains are useless in this mode.
      
      Also implementing exclude_callchain_kernel to have a complete set of
      options.
      Signed-off-by: Jiri Olsa <jolsa@redhat.com>
      [ Added kernel callchains filtering ]
      Cc: "Frank Ch. Eigler" <fche@redhat.com>
      Cc: Arun Sharma <asharma@fb.com>
      Cc: Benjamin Redelings <benjamin.redelings@nescent.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Frank Ch. Eigler <fche@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Ulrich Drepper <drepper@gmail.com>
      Link: http://lkml.kernel.org/r/1344345647-11536-7-git-send-email-jolsa@redhat.com
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      d0775264
    • J
      perf: Add ability to attach user stack dump to sample · c5ebcedb
      Jiri Olsa committed
      Introducing PERF_SAMPLE_STACK_USER sample type bit to trigger the dump
      of the user level stack on sample. The size of the dump is specified by
      sample_stack_user value.
      
      Being able to dump parts of the user stack, starting from the stack
      pointer, will be useful to make a post mortem dwarf CFI based stack
      unwinding.
      
      Added HAVE_PERF_USER_STACK_DUMP config option to determine if the
      architecture provides user stack dump on perf event samples.  This needs
      access to the user stack pointer which is not unified across
      architectures. Enabling this for x86 architecture.
      Signed-off-by: Jiri Olsa <jolsa@redhat.com>
      Original-patch-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: "Frank Ch. Eigler" <fche@redhat.com>
      Cc: Arun Sharma <asharma@fb.com>
      Cc: Benjamin Redelings <benjamin.redelings@nescent.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Frank Ch. Eigler <fche@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Ulrich Drepper <drepper@gmail.com>
      Link: http://lkml.kernel.org/r/1344345647-11536-6-git-send-email-jolsa@redhat.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      c5ebcedb
    • J
      perf: Add perf_output_skip function to skip bytes in sample · 5685e0ff
      Jiri Olsa committed
      Introducing perf_output_skip function to be able to skip data within the
      perf ring buffer.
      
      When writing data into the perf ring buffer we first reserve the needed space in
      the ring buffer and then copy the actual data.
      
      There's a possibility we won't be able to fill all the reserved size
      with data, so we need a way to skip the remaining bytes.
      
      This is going to be useful when storing the user stack dump, where we
      might end up with less data than we originally requested.
      Signed-off-by: Jiri Olsa <jolsa@redhat.com>
      Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: "Frank Ch. Eigler" <fche@redhat.com>
      Cc: Arun Sharma <asharma@fb.com>
      Cc: Benjamin Redelings <benjamin.redelings@nescent.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Frank Ch. Eigler <fche@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Ulrich Drepper <drepper@gmail.com>
      Link: http://lkml.kernel.org/r/1344345647-11536-5-git-send-email-jolsa@redhat.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      5685e0ff
    • F
      perf: Factor __output_copy to be usable with specific copy function · 91d7753a
      Frederic Weisbecker committed
      Adding a generic way to use __output_copy function with specific copy
      function via DEFINE_PERF_OUTPUT_COPY macro.
      
      Using this to add new __output_copy_user function, that provides output
      copy from user pointers. For x86 the copy_from_user_nmi function is used
      and __copy_from_user_inatomic for the rest of the architectures.
      
      This new function will be used in user stack dump on sample, coming in
      next patches.
      Signed-off-by: Jiri Olsa <jolsa@redhat.com>
      Cc: "Frank Ch. Eigler" <fche@redhat.com>
      Cc: Arun Sharma <asharma@fb.com>
      Cc: Benjamin Redelings <benjamin.redelings@nescent.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Frank Ch. Eigler <fche@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Ulrich Drepper <drepper@gmail.com>
      Link: http://lkml.kernel.org/r/1344345647-11536-4-git-send-email-jolsa@redhat.com
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      91d7753a
    • J
      perf: Add ability to attach user level registers dump to sample · 4018994f
      Jiri Olsa committed
      Introducing PERF_SAMPLE_REGS_USER sample type bit to trigger the dump of
      user level registers on sample. Registers we want to dump are specified
      by sample_regs_user bitmask.
      
      Only user level registers are dumped at the moment. Meaning the register
      values of the user space context as it was before the user entered the
      kernel for whatever reason (syscall, irq, exception, or a PMI happening
      in userspace).
      
      The layout of the sample_regs_user bitmap is described in
      asm/perf_regs.h for archs that support register dump.
      
      This is going to be useful to bring Dwarf CFI based stack unwinding on
      top of samples.
      Original-patch-by: Frederic Weisbecker <fweisbec@gmail.com>
      [ Dump registers ABI specification. ]
      Signed-off-by: Jiri Olsa <jolsa@redhat.com>
      Suggested-by: Stephane Eranian <eranian@google.com>
      Cc: "Frank Ch. Eigler" <fche@redhat.com>
      Cc: Arun Sharma <asharma@fb.com>
      Cc: Benjamin Redelings <benjamin.redelings@nescent.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Frank Ch. Eigler <fche@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Ulrich Drepper <drepper@gmail.com>
      Link: http://lkml.kernel.org/r/1344345647-11536-3-git-send-email-jolsa@redhat.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      4018994f
  7. 31 Jul, 2012 1 commit
  8. 30 Jul, 2012 12 commits
  9. 18 Jun, 2012 2 commits