1. 05 Sep, 2016 1 commit
  2. 29 Jul, 2016 1 commit
    • mm, thp: remove __GFP_NORETRY from khugepaged and madvised allocations · 25160354
      Committed by Vlastimil Babka
      After the previous patch, we can distinguish costly allocations that
      should be really lightweight, such as THP page faults, with
      __GFP_NORETRY.  This means we don't need to recognize khugepaged
      allocations via PF_KTHREAD anymore.  We can also change THP page faults
      in areas where madvise(MADV_HUGEPAGE) was used to try as hard as
      khugepaged, as the process has indicated that it benefits from THP's and
      is willing to pay some initial latency costs.
      
      We can also make the flags handling less cryptic by distinguishing
      GFP_TRANSHUGE_LIGHT (no reclaim at all, default mode in page fault) from
      GFP_TRANSHUGE (only direct reclaim, khugepaged default).  Adding
      __GFP_NORETRY or __GFP_KSWAPD_RECLAIM is done where needed.
      
      The patch effectively changes the current GFP_TRANSHUGE users as
      follows:
      
      * get_huge_zero_page() - the zero page lifetime should be relatively
        long and it's shared by multiple users, so it's worth spending some
        effort on it.  We use GFP_TRANSHUGE, and __GFP_NORETRY is not added.
        This also restores direct reclaim to this allocation, which was
        unintentionally removed by commit e4a49efe4e7e ("mm: thp: set THP defrag
        by default to madvise and add a stall-free defrag option")
      
      * alloc_hugepage_khugepaged_gfpmask() - this is khugepaged, so latency
        is not an issue.  So if khugepaged "defrag" is enabled (the default), do
        reclaim via GFP_TRANSHUGE without __GFP_NORETRY.  We can remove the
        PF_KTHREAD check from page alloc.
      
        As a side-effect, khugepaged will now no longer check if the initial
        compaction was deferred or contended.  This is OK, as khugepaged sleep
        times between collapse attempts are long enough to prevent noticeable
        disruption, so we should allow it to spend some effort.
      
      * migrate_misplaced_transhuge_page() - already was masking out
        __GFP_RECLAIM, so just convert to GFP_TRANSHUGE_LIGHT which is
        equivalent.
      
      * alloc_hugepage_direct_gfpmask() - vma's with VM_HUGEPAGE (via madvise)
        are now allocating without __GFP_NORETRY.  Other vma's keep using
        __GFP_NORETRY if direct reclaim/compaction is at all allowed (by default
        it's allowed only for madvised vma's).  The rest is conversion to
        GFP_TRANSHUGE(_LIGHT).
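
The flag choices listed above can be modelled in a small userspace sketch. Everything here is an assumption for illustration: the `SK_` names, the bit values and the `sk_defrag` modes are hypothetical and do not match the kernel's definitions, and the handling of the "defer" mode is a guess at where __GFP_KSWAPD_RECLAIM is added.

```c
#include <stdbool.h>

/* Illustrative bit values only; the kernel's real gfp bits differ. */
#define SK_GFP_DIRECT_RECLAIM  0x1u
#define SK_GFP_KSWAPD_RECLAIM  0x2u
#define SK_GFP_NORETRY         0x4u
#define SK_GFP_THP_BASE        0x8u /* stands in for the remaining THP bits */

/* No reclaim at all: the page fault default. */
#define SK_GFP_TRANSHUGE_LIGHT SK_GFP_THP_BASE
/* Adds direct reclaim only: the khugepaged default. */
#define SK_GFP_TRANSHUGE       (SK_GFP_TRANSHUGE_LIGHT | SK_GFP_DIRECT_RECLAIM)

/* Hypothetical names for the THP "defrag" setting. */
enum sk_defrag {
	SK_DEFRAG_ALWAYS,
	SK_DEFRAG_DEFER,
	SK_DEFRAG_MADVISE,
	SK_DEFRAG_NEVER
};

/* Page fault path: madvised VMAs may try hard, others stay lightweight. */
unsigned int sk_fault_gfpmask(enum sk_defrag mode, bool vma_madvised)
{
	switch (mode) {
	case SK_DEFRAG_MADVISE:
		return vma_madvised ? SK_GFP_TRANSHUGE : SK_GFP_TRANSHUGE_LIGHT;
	case SK_DEFRAG_DEFER:
		/* Assumption: only wake kswapd, never stall the fault. */
		return SK_GFP_TRANSHUGE_LIGHT | SK_GFP_KSWAPD_RECLAIM;
	case SK_DEFRAG_ALWAYS:
		/* Non-madvised VMAs keep NORETRY so they give up early. */
		return SK_GFP_TRANSHUGE | (vma_madvised ? 0u : SK_GFP_NORETRY);
	default:
		return SK_GFP_TRANSHUGE_LIGHT;
	}
}

/* khugepaged: latency is not an issue, so no NORETRY when defrag is on. */
unsigned int sk_khugepaged_gfpmask(bool defrag_enabled)
{
	return defrag_enabled ? SK_GFP_TRANSHUGE : SK_GFP_TRANSHUGE_LIGHT;
}
```

For example, `sk_fault_gfpmask(SK_DEFRAG_ALWAYS, false)` carries the NORETRY bit, while the same call with `vma_madvised` set to true does not.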
      
      [mhocko@suse.com: suggested GFP_TRANSHUGE_LIGHT]
      Link: http://lkml.kernel.org/r/20160721073614.24395-7-vbabka@suse.cz
      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Acked-by: Mel Gorman <mgorman@techsingularity.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  3. 23 Jun, 2016 2 commits
  4. 15 Apr, 2016 1 commit
    • perf callchain: Start moving away from global per thread cursors · 91d7b2de
      Committed by Arnaldo Carvalho de Melo
      The recent perf_evsel__fprintf_callchain() move to evsel.c added several
      new symbol requirements to the python binding, for instance:
      
        # perf test -v python
        16: Try 'import perf' in python, checking link problems      :
        --- start ---
        test child forked, pid 18030
        Traceback (most recent call last):
          File "<stdin>", line 1, in <module>
        ImportError: /tmp/build/perf/python/perf.so: undefined symbol:
        callchain_cursor
        test child finished with -1
        ---- end ----
        Try 'import perf' in python, checking link problems: FAILED!
        #
      
      This would require linking against callchain.c to access the global
      callchain_cursor variable.
      
      Since lots of functions already receive a callchain_cursor struct
      pointer as a parameter, make that the case for some more functions so
      that we can start phasing out usage of yet another global variable.
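
A minimal sketch of the idea, assuming a much simplified cursor: helpers take the cursor as an explicit parameter instead of reaching for a file-scope global, so a caller (such as a language binding) does not need to link against whatever object defines that global. The `sk_cursor` type and both helpers are hypothetical and are not perf's `struct callchain_cursor` API.

```c
#include <stddef.h>

#define SK_MAX_DEPTH 8

/* Hypothetical, simplified cursor; not perf's struct callchain_cursor. */
struct sk_cursor {
	unsigned long ips[SK_MAX_DEPTH];
	size_t nr;
};

/* The cursor is passed in, so no global symbol is referenced here. */
int sk_cursor_append(struct sk_cursor *cursor, unsigned long ip)
{
	if (cursor->nr >= SK_MAX_DEPTH)
		return -1; /* cursor is full */
	cursor->ips[cursor->nr++] = ip;
	return 0;
}

size_t sk_cursor_depth(const struct sk_cursor *cursor)
{
	return cursor->nr;
}
```

A caller owns its cursor, for instance `struct sk_cursor c = { .nr = 0 };`, and passes `&c` down the call chain.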
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-djko3097eyg2rn66v2qcqfvn@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  5. 16 Mar, 2016 2 commits
    • mm, tracing: unify mm flags handling in tracepoints and printk · 420adbe9
      Committed by Vlastimil Babka
      In tracepoints, it's possible to print gfp flags in a human-friendly
      format through a macro show_gfp_flags(), which defines a translation
      array and passes it to __print_flags().  Since the following patch will
      introduce support for gfp flags printing in printk(), it would be nice
      to reuse the array.  This is not straightforward, since __print_flags()
      can't simply reference an array defined in a .c file such as mm/debug.c
      - it has to be a macro to allow the macro magic to communicate the
      format to userspace tools such as trace-cmd.
      
      The solution is to create a macro __def_gfpflag_names which is used both
      in show_gfp_flags(), and to define the gfpflag_names[] array in
      mm/debug.c.
      
      On the other hand, mm/debug.c also defines translation tables for page
      flags and vma flags, and a desire was expressed (but not implemented in
      this series) to use these also from tracepoints.  Thus, this patch also
      renames the events/gfpflags.h file to events/mmflags.h and moves the
      table definitions there, using the same macro approach as for gfpflags.
      This allows translating all three kinds of mm-specific flags both in
      tracepoints and printk.
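
The shared-macro approach described above can be sketched in plain C as follows. This is an illustration under assumptions: the `sk_` names and flag values are hypothetical, and the real `__def_gfpflag_names` macro and tracing infrastructure are not reproduced here. The point is only that one macro holding the table entries can be expanded both inside another macro and inside an array definition.

```c
#include <stdio.h>
#include <string.h>

struct sk_flag_name {
	unsigned long mask;
	const char *name;
};

/* Hypothetical flags with illustrative values. */
#define SK_FLAG_IO   0x1ul
#define SK_FLAG_FS   0x2ul
#define SK_FLAG_ZERO 0x4ul

/* Single source of truth: a macro that expands to the table entries. */
#define sk_def_flag_names               \
	{ SK_FLAG_IO,   "SK_FLAG_IO" },   \
	{ SK_FLAG_FS,   "SK_FLAG_FS" },   \
	{ SK_FLAG_ZERO, "SK_FLAG_ZERO" }

/* One consumer: a NULL-terminated array usable from ordinary C code. */
const struct sk_flag_name sk_flag_names[] = {
	sk_def_flag_names,
	{ 0, NULL }
};

/* Write the names of the set flags into buf, separated by '|'. */
size_t sk_format_flags(unsigned long flags, char *buf, size_t len)
{
	const struct sk_flag_name *f;
	size_t used = 0;

	if (len == 0)
		return 0;
	buf[0] = '\0';
	for (f = sk_flag_names; f->name; f++) {
		int n;

		if ((flags & f->mask) != f->mask)
			continue;
		n = snprintf(buf + used, len - used, "%s%s",
			     used ? "|" : "", f->name);
		if (n < 0 || (size_t)n >= len - used)
			break; /* stop on error or truncation */
		used += (size_t)n;
	}
	return used;
}
```

A second consumer could expand `sk_def_flag_names` inside its own macro, which is the property the commit relies on for the tracepoint side.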
      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
      Reviewed-by: Michal Hocko <mhocko@suse.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • tools, perf: make gfp_compact_table up to date · 14e0a214
      Committed by Vlastimil Babka
      When updating tracing's show_gfp_flags() I have noticed that perf's
      gfp_compact_table is also outdated.  Fill in the missing flags and place
      a note in gfp.h to increase chance that future updates are synced.
      Convert the __GFP_X flags from "GFP_X" to "__GFP_X" strings in line with
      the previous patch.
      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
      Acked-by: David Rientjes <rientjes@google.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  6. 27 Feb, 2016 1 commit
  7. 18 Dec, 2015 1 commit
  8. 01 Oct, 2015 2 commits
  9. 02 Jul, 2015 1 commit
  10. 29 May, 2015 1 commit
  11. 12 May, 2015 1 commit
  12. 09 May, 2015 1 commit
    • perf machine: Protect the machine->threads with a rwlock · b91fc39f
      Committed by Arnaldo Carvalho de Melo
      In addition to using refcounts for the struct thread lifetime
      management, we need to protect access to machine->threads from
      concurrent access.
      
      That happens in 'perf top', where one thread processes events, inserting
      and deleting entries from that rb_tree, while another thread decays
      hist_entries, which ends up dropping references and ultimately deleting
      threads from the rb_tree and releasing their resources when no further
      hist_entry (or other data structure, like in 'perf sched') references
      them.
      
      So the rule is the same as for refcounts + protected trees in the kernel:
      get the tree lock, find the object, bump the refcount, drop the tree lock,
      return, use the object, drop the refcount if no more use of it is needed,
      keep it if storing it in some other data structure, and drop it when
      releasing that data structure.
      
      I.e. pair "t = machine__find(new)_thread()" with a "thread__put(t)", and
      "perf_event__preprocess_sample(&al)" with "addr_location__put(&al)".
      
      The addr_location__put() one is because as we return references to
      several data structures, we may end up adding more reference counting
      for the other data structures and then we'll drop it at
      addr_location__put() time.
      Acked-by: David Ahern <dsahern@gmail.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-bs9rt4n0jw3hi9f3zxyy3xln@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  13. 06 May, 2015 1 commit
  14. 05 May, 2015 4 commits
    • perf kmem: Add kmem.default config option · 0c160d49
      Committed by Namhyung Kim
      Currently the perf kmem command selects --slab if neither --slab nor
      --page is given, for backward compatibility.  Add a kmem.default config
      option to select the default value ('page' or 'slab').
      
        # cat ~/.perfconfig
        [kmem]
        	default = page
      
        # perf kmem stat
      
        SUMMARY (page allocator)
        ========================
        Total allocation requests     :            1,518   [            6,096 KB ]
        Total free requests           :            1,431   [            5,748 KB ]
      
        Total alloc+freed requests    :            1,330   [            5,344 KB ]
        Total alloc-only requests     :              188   [              752 KB ]
        Total free-only requests      :              101   [              404 KB ]
      
        Total allocation failures     :                0   [                0 KB ]
        ...
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Acked-by: Pekka Enberg <penberg@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Joonsoo Kim <js1304@gmail.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Taeung Song <treeze.taeung@gmail.com>
      Cc: linux-mm@kvack.org
      Link: http://lkml.kernel.org/r/1429592107-1807-6-git-send-email-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf kmem: Print gfp flags in human readable string · 0e111156
      Committed by Namhyung Kim
      Save libtraceevent output and print it in the header.
      
        # perf kmem stat --page --caller
        #
        # GFP flags
        # ---------
        # 00000010:       NI: GFP_NOIO
        # 000000d0:        K: GFP_KERNEL
        # 00000200:      NWR: GFP_NOWARN
        # 000084d0:    K|R|Z: GFP_KERNEL|GFP_REPEAT|GFP_ZERO
        # 000200d2:       HU: GFP_HIGHUSER
        # 000200da:      HUM: GFP_HIGHUSER_MOVABLE
        # 000280da:    HUM|Z: GFP_HIGHUSER_MOVABLE|GFP_ZERO
        # 002084d0: K|R|Z|NT: GFP_KERNEL|GFP_REPEAT|GFP_ZERO|GFP_NOTRACK
        # 0102005a:  NF|HW|M: GFP_NOFS|GFP_HARDWALL|GFP_MOVABLE
      
        ---------------------------------------------------------------------------------------------------------
         Total alloc (KB) | Hits      | Order | Mig.type | GFP flags | Callsite
        ---------------------------------------------------------------------------------------------------------
                       60 |        15 |     0 | UNMOVABL | K|R|Z|NT  | pte_alloc_one
                       40 |        10 |     0 |  MOVABLE | HUM|Z     | handle_mm_fault
                       24 |         6 |     0 |  MOVABLE | HUM       | do_wp_page
                       24 |         6 |     0 | UNMOVABL | K         | __pollwait
         ...
      Requested-by: Joonsoo Kim <js1304@gmail.com>
      Suggested-by: Minchan Kim <minchan@kernel.org>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Acked-by: Pekka Enberg <penberg@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Joonsoo Kim <js1304@gmail.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: linux-mm@kvack.org
      Link: http://lkml.kernel.org/r/1429592107-1807-5-git-send-email-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf kmem: Add --live option for current allocation stat · 2a7ef02c
      Committed by Namhyung Kim
      Currently 'perf kmem stat --page' shows total (page) allocation stats by
      default, but sometimes one might want to see live (total alloc-only)
      requests/pages only.  The new --live option does this by subtracting freed
      allocations from the stats.
      
      E.g.:
      
       # perf kmem stat --page
      
       SUMMARY (page allocator)
       ========================
       Total allocation requests     :          988,858   [        4,045,368 KB ]
       Total free requests           :          886,484   [        3,624,996 KB ]
      
       Total alloc+freed requests    :          885,969   [        3,622,628 KB ]
       Total alloc-only requests     :          102,889   [          422,740 KB ]
       Total free-only requests      :              515   [            2,368 KB ]
      
       Total allocation failures     :                0   [                0 KB ]
      
       Order     Unmovable   Reclaimable       Movable      Reserved  CMA/Isolated
       -----  ------------  ------------  ------------  ------------  ------------
           0       172,173         3,083       806,686             .             .
           1           284             .             .             .             .
           2         6,124            58             .             .             .
           3           114           335             .             .             .
           4             .             .             .             .             .
           5             .             .             .             .             .
           6             .             .             .             .             .
           7             .             .             .             .             .
           8             .             .             .             .             .
           9             .             .             1             .             .
          10             .             .             .             .             .
       # perf kmem stat --page --live
      
       SUMMARY (page allocator)
       ========================
       Total allocation requests     :          988,858   [        4,045,368 KB ]
       Total free requests           :          886,484   [        3,624,996 KB ]
      
       Total alloc+freed requests    :          885,969   [        3,622,628 KB ]
       Total alloc-only requests     :          102,889   [          422,740 KB ]
       Total free-only requests      :              515   [            2,368 KB ]
      
       Total allocation failures     :                0   [                0 KB ]
      
       Order     Unmovable   Reclaimable       Movable      Reserved  CMA/Isolated
       -----  ------------  ------------  ------------  ------------  ------------
           0         2,214         3,025        97,156             .             .
           1            59             .             .             .             .
           2            19            58             .             .             .
           3            23           335             .             .             .
           4             .             .             .             .             .
           5             .             .             .             .             .
           6             .             .             .             .             .
           7             .             .             .             .             .
           8             .             .             .             .             .
           9             .             .             .             .             .
          10             .             .             .             .             .
       #
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Acked-by: Pekka Enberg <penberg@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Joonsoo Kim <js1304@gmail.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: linux-mm@kvack.org
      Link: http://lkml.kernel.org/r/1429592107-1807-4-git-send-email-namhyung@kernel.org
      [ Added examples to the changeset log ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf kmem: Support sort keys on page analysis · fb4f313d
      Committed by Namhyung Kim
      Add new sort keys for page: page, order, migtype, gfp - existing
      'bytes', 'hit' and 'callsite' sort keys also work for page.  Note that
      the -s/--sort option should be preceded by either the --slab or --page
      option to determine which analysis the sort keys apply to.
      
      Now it properly groups and sorts allocation stats - so same
      page/caller with different order/migtype/gfp will be printed on a
      different line.
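
Sorting by several keys, as in `-s order,hit`, can be sketched as a chain of comparators where a later key only breaks ties left by earlier ones. This is a minimal illustration, not perf's implementation: the `sk_` names are hypothetical, the key list is fixed at compile time, and descending order is an assumption chosen to match the sample output in this commit message.

```c
#include <stdlib.h>

struct sk_page_stat {
	unsigned int order;
	unsigned long hits;
};

typedef int (*sk_cmp_fn)(const struct sk_page_stat *,
			 const struct sk_page_stat *);

/* Descending by allocation order. */
int sk_cmp_order(const struct sk_page_stat *a, const struct sk_page_stat *b)
{
	return (a->order < b->order) - (a->order > b->order);
}

/* Descending by hit count. */
int sk_cmp_hits(const struct sk_page_stat *a, const struct sk_page_stat *b)
{
	return (a->hits < b->hits) - (a->hits > b->hits);
}

/* Key list equivalent to "-s order,hit" (assumed fixed for this sketch). */
static const sk_cmp_fn sk_sort_keys[] = { sk_cmp_order, sk_cmp_hits };

/* qsort() callback: the first key that differs decides the ordering. */
int sk_cmp_chain(const void *pa, const void *pb)
{
	const struct sk_page_stat *a = pa, *b = pb;
	size_t i;

	for (i = 0; i < sizeof(sk_sort_keys) / sizeof(sk_sort_keys[0]); i++) {
		int c = sk_sort_keys[i](a, b);

		if (c)
			return c;
	}
	return 0;
}

void sk_sort_stats(struct sk_page_stat *v, size_t n)
{
	qsort(v, n, sizeof(*v), sk_cmp_chain);
}
```

With entries of order 0 and 2, the order-2 entry sorts first and the order-0 entries follow by decreasing hit count.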
      
       # perf kmem stat --page --caller -l 10 -s order,hit
      
       -----------------------------------------------------------------------------
       Total alloc (KB) | Hits   | Order | Mig.type | GFP flags | Callsite
       -----------------------------------------------------------------------------
                     64 |      4 |     2 |  RECLAIM |  00285250 | new_slab
                 50,144 | 12,536 |     0 |  MOVABLE |  0102005a | __page_cache_alloc
                     52 |     13 |     0 | UNMOVABL |  002084d0 | pte_alloc_one
                     40 |     10 |     0 |  MOVABLE |  000280da | handle_mm_fault
                     28 |      7 |     0 | UNMOVABL |  000000d0 | __pollwait
                     20 |      5 |     0 |  MOVABLE |  000200da | do_wp_page
                     20 |      5 |     0 |  MOVABLE |  000200da | do_cow_fault
                     16 |      4 |     0 | UNMOVABL |  00000200 | __tlb_remove_page
                     16 |      4 |     0 | UNMOVABL |  000084d0 | __pmd_alloc
                      8 |      2 |     0 | UNMOVABL |  000084d0 | __pud_alloc
       ...              | ...    | ...   | ...      | ...       | ...
       -----------------------------------------------------------------------------
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Acked-by: Pekka Enberg <penberg@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Joonsoo Kim <js1304@gmail.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: linux-mm@kvack.org
      Link: http://lkml.kernel.org/r/1429592107-1807-3-git-send-email-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  15. 04 May, 2015 1 commit
    • perf kmem: Implement stat --page --caller · c9758cc4
      Committed by Namhyung Kim
      Make 'perf kmem' support caller statistics for pages.  Unlike the slab
      case, the tracepoints in the page allocator don't provide callsite info,
      so it records with callchains and extracts the callsite info from them.
      
      Note that the callchain contains several memory allocation functions
      which have no meaning for users.  So skip those functions to get proper
      callsites.  I used the following regex pattern to skip the allocator
      functions:
      
        ^_?_?(alloc|get_free|get_zeroed)_pages?
      
      This gave me the following list of functions:
      
        # perf kmem record --page sleep 3
        # perf kmem stat --page -v
        ...
        alloc func: __get_free_pages
        alloc func: get_zeroed_page
        alloc func: alloc_pages_exact
        alloc func: __alloc_pages_direct_compact
        alloc func: __alloc_pages_nodemask
        alloc func: alloc_page_interleave
        alloc func: alloc_pages_current
        alloc func: alloc_pages_vma
        alloc func: alloc_page_buffers
        alloc func: alloc_pages_exact_nid
        ...
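
The pattern above can be tried out with the POSIX regex API. This is a sketch for experimentation only; the helper name `sk_is_alloc_func` is hypothetical and the snippet does not claim to show how perf applies the pattern internally.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if a symbol name looks like a page allocator function. */
bool sk_is_alloc_func(const char *name)
{
	static const char pattern[] =
		"^_?_?(alloc|get_free|get_zeroed)_pages?";
	regex_t re;
	bool match;

	/* Extended syntax is needed for the (a|b) alternation and '?'. */
	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
		return false;
	match = regexec(&re, name, 0, NULL, 0) == 0;
	regfree(&re);
	return match;
}
```

Names such as `__get_free_pages` and `alloc_pages_vma` match, while callsites such as `pte_alloc_one` or `handle_mm_fault` do not, because the pattern is anchored at the start of the name.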
      
      The output looks mostly the same as --alloc (I also added a callsite
      column to that) but groups entries by callsite.  Currently, the order,
      migrate type and GFP flag info is for the last allocation and is not
      guaranteed to be the same for all allocations from the callsite.
      
        ---------------------------------------------------------------------------------------------
         Total_alloc (KB) | Hits      | Order | Mig.type | GFP flags | Callsite
        ---------------------------------------------------------------------------------------------
                    1,064 |       266 |     0 | UNMOVABL |  000000d0 | __pollwait
                       52 |        13 |     0 | UNMOVABL |  002084d0 | pte_alloc_one
                       44 |        11 |     0 |  MOVABLE |  000280da | handle_mm_fault
                       20 |         5 |     0 |  MOVABLE |  000200da | do_cow_fault
                       20 |         5 |     0 |  MOVABLE |  000200da | do_wp_page
                       16 |         4 |     0 | UNMOVABL |  000084d0 | __pmd_alloc
                       16 |         4 |     0 | UNMOVABL |  00000200 | __tlb_remove_page
                       12 |         3 |     0 | UNMOVABL |  000084d0 | __pud_alloc
                        8 |         2 |     0 | UNMOVABL |  00000010 | bio_copy_user_iov
                        4 |         1 |     0 | UNMOVABL |  000200d2 | pipe_write
                        4 |         1 |     0 |  MOVABLE |  000280da | do_wp_page
                        4 |         1 |     0 | UNMOVABL |  002084d0 | pgd_alloc
        ---------------------------------------------------------------------------------------------
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Acked-by: Pekka Enberg <penberg@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Joonsoo Kim <js1304@gmail.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: linux-mm@kvack.org
      Link: http://lkml.kernel.org/r/1429592107-1807-2-git-send-email-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  16. 29 Apr, 2015 1 commit
    • perf kmem: Fix compiles on RHEL6/OL6 · 6b1a2752
      Committed by David Ahern
      0d68bc92 breaks compiles on RHEL6/OL6:
          cc1: warnings being treated as errors
          builtin-kmem.c: In function ‘search_page_alloc_stat’:
          builtin-kmem.c:322: error: declaration of ‘stat’ shadows a global declaration
                                  node = &parent->rb_left;
          /usr/include/sys/stat.h:455: error: shadowed declaration is here
          builtin-kmem.c: In function ‘perf_evsel__process_page_alloc_event’:
          builtin-kmem.c:378: error: declaration of ‘stat’ shadows a global declaration
          /usr/include/sys/stat.h:455: error: shadowed declaration is here
          builtin-kmem.c: In function ‘perf_evsel__process_page_free_event’:
          builtin-kmem.c:431: error: declaration of ‘stat’ shadows a global declaration
          /usr/include/sys/stat.h:455: error: shadowed declaration is here
      
      Rename local variable to pstat to avoid the name conflict.
      Signed-off-by: David Ahern <david.ahern@oracle.com>
      Link: http://lkml.kernel.org/r/1429033773-31383-1-git-send-email-david.ahern@oracle.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  17. 24 Apr, 2015 2 commits
    • perf kmem: Fix compiles on RHEL6/OL6 · 4ad1f430
      Committed by David Ahern
      0d68bc92 breaks compiles on RHEL6/OL6:
          cc1: warnings being treated as errors
          builtin-kmem.c: In function ‘search_page_alloc_stat’:
          builtin-kmem.c:322: error: declaration of ‘stat’ shadows a global declaration
                                  node = &parent->rb_left;
          /usr/include/sys/stat.h:455: error: shadowed declaration is here
          builtin-kmem.c: In function ‘perf_evsel__process_page_alloc_event’:
          builtin-kmem.c:378: error: declaration of ‘stat’ shadows a global declaration
          /usr/include/sys/stat.h:455: error: shadowed declaration is here
          builtin-kmem.c: In function ‘perf_evsel__process_page_free_event’:
          builtin-kmem.c:431: error: declaration of ‘stat’ shadows a global declaration
          /usr/include/sys/stat.h:455: error: shadowed declaration is here
      
      Rename local variable to pstat to avoid the name conflict.
      Signed-off-by: David Ahern <david.ahern@oracle.com>
      Link: http://lkml.kernel.org/r/1429033773-31383-1-git-send-email-david.ahern@oracle.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf kmem: Consistently use PRIu64 for printing u64 values · 6145c259
      Committed by Will Deacon
      Building the perf tool for 32-bit ARM results in the following build
      error due to a combination of an incorrect conversion specifier and
      compiling with -Werror:
      
        builtin-kmem.c: In function ‘print_page_summary’:
        builtin-kmem.c:644:9: error: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 3 has type ‘u64’ [-Werror=format=]
                 nr_alloc_freed, (total_alloc_freed_bytes) / 1024);
                 ^
        builtin-kmem.c:647:9: error: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 3 has type ‘u64’ [-Werror=format=]
                 (total_page_alloc_bytes - total_alloc_freed_bytes) / 1024);
                 ^
        cc1: all warnings being treated as errors
      
      This patch fixes the problem by consistently using PRIu64 for printing
      out u64 values.
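
As a standalone illustration of the fix, the standard `PRIu64` macro from `<inttypes.h>` expands to the correct conversion specifier for a 64-bit unsigned value on both 32-bit and 64-bit targets, whereas `%lu` is only right where `unsigned long` happens to be 64 bits wide. The helper name `sk_format_kb` and the output layout are hypothetical.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Format a request count and a byte total (shown in KB) portably. */
int sk_format_kb(char *buf, size_t len, uint64_t nr, uint64_t bytes)
{
	return snprintf(buf, len, "%" PRIu64 " [ %" PRIu64 " KB ]",
			nr, bytes / 1024);
}
```

The string literals on either side of `PRIu64` are concatenated by the compiler, so the format string stays a single constant that format checking can still verify.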
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Joonsoo Kim <js1304@gmail.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1429796437-1790-1-git-send-email-will.deacon@arm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  18. 13 Apr, 2015 1 commit
    • perf kmem: Analyze page allocator events also · 0d68bc92
      Committed by Namhyung Kim
      The perf kmem command records and analyzes kernel memory allocation only
      for SLAB objects.  This patch implements a simple page allocator analyzer
      using the kmem:mm_page_alloc and kmem:mm_page_free events.
      
      It adds two new options, --slab and --page.  The --slab option is for
      analyzing the SLAB allocator, and that's what perf kmem currently does.
      
      The new --page option enables page allocator events and analyzes kernel
      memory usage in page units.  Currently, only the 'stat --alloc'
      subcommand is implemented.
      
      If neither --slab nor --page is specified, --slab is implied.
      
      First run 'perf kmem record' to generate a suitable perf.data file:
      
        # perf kmem record --page sleep 5
      
      Then run 'perf kmem stat' to postprocess the perf.data file:
      
        # perf kmem stat --page --alloc --line 10
      
        -------------------------------------------------------------------------------
         PFN              | Total alloc (KB) | Hits     | Order | Mig.type | GFP flags
        -------------------------------------------------------------------------------
                  4045014 |               16 |        1 |     2 |  RECLAIM |  00285250
                  4143980 |               16 |        1 |     2 |  RECLAIM |  00285250
                  3938658 |               16 |        1 |     2 |  RECLAIM |  00285250
                  4045400 |               16 |        1 |     2 |  RECLAIM |  00285250
                  3568708 |               16 |        1 |     2 |  RECLAIM |  00285250
                  3729824 |               16 |        1 |     2 |  RECLAIM |  00285250
                  3657210 |               16 |        1 |     2 |  RECLAIM |  00285250
                  4120750 |               16 |        1 |     2 |  RECLAIM |  00285250
                  3678850 |               16 |        1 |     2 |  RECLAIM |  00285250
                  3693874 |               16 |        1 |     2 |  RECLAIM |  00285250
         ...              | ...              | ...      | ...   | ...      | ...
        -------------------------------------------------------------------------------
      
        SUMMARY (page allocator)
        ========================
        Total allocation requests     :           44,260   [          177,256 KB ]
        Total free requests           :              117   [              468 KB ]
      
        Total alloc+freed requests    :               49   [              196 KB ]
        Total alloc-only requests     :           44,211   [          177,060 KB ]
        Total free-only requests      :               68   [              272 KB ]
      
        Total allocation failures     :                0   [                0 KB ]
      
        Order     Unmovable   Reclaimable       Movable      Reserved  CMA/Isolated
        -----  ------------  ------------  ------------  ------------  ------------
            0            32             .        44,210             .             .
            1             .             .             .             .             .
            2             .            18             .             .             .
            3             .             .             .             .             .
            4             .             .             .             .             .
            5             .             .             .             .             .
            6             .             .             .             .             .
            7             .             .             .             .             .
            8             .             .             .             .             .
            9             .             .             .             .             .
           10             .             .             .             .             .
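
The request counts in the summary above appear to relate to each other by simple subtraction: alloc-only is the allocation count minus the alloc+freed count, and free-only is the free count minus the alloc+freed count. The sketch below checks that reading against the printed figures; the struct and function names are hypothetical and the relationship is inferred from the sample output rather than taken from the perf source.

```c
#include <stdint.h>

/* Hypothetical container for the three counters in the summary. */
struct sk_page_summary {
	uint64_t nr_alloc;       /* Total allocation requests */
	uint64_t nr_free;        /* Total free requests */
	uint64_t nr_alloc_freed; /* Requests both allocated and freed */
};

/* Allocations seen without a matching free in the recording. */
uint64_t sk_alloc_only(const struct sk_page_summary *s)
{
	return s->nr_alloc - s->nr_alloc_freed;
}

/* Frees seen without a matching allocation in the recording. */
uint64_t sk_free_only(const struct sk_page_summary *s)
{
	return s->nr_free - s->nr_alloc_freed;
}
```

With the figures above (44,260 allocations, 117 frees, 49 alloc+freed), this gives 44,211 alloc-only and 68 free-only requests, which agrees with the summary.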
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Joonsoo Kim <js1304@gmail.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: linux-mm@kvack.org
      Link: http://lkml.kernel.org/r/1428298576-9785-4-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      0d68bc92
  19. 08 Apr, 2015 1 commit
  20. 03 Apr, 2015 1 commit
    • Y
      perf kmem: Support using -f to override perf.data file ownership · d1eeb77c
      Yunlong Song committed
      Enable perf kmem to use perf.data when it is not owned by the current
      user or root.
      
      Example:
      
       # perf kmem record ls
       # chown Yunlong.Song:Yunlong.Song perf.data
       # ls -al perf.data
       -rw------- 1 Yunlong.Song Yunlong.Song 5315665 Apr  2 10:54 perf.data
       # id
       uid=0(root) gid=0(root) groups=0(root),64(pkcs11)
      
      Before this patch:
      
       # perf kmem stat
       File perf.data not owned by current user or root (use -f to override)
       # perf kmem stat -f
         Error: unknown switch `f'
      
        usage: perf kmem [<options>] {record|stat}
      
           -i, --input <file>    input file name
           -v, --verbose         be more verbose (show symbol address, etc)
               --caller          show per-callsite statistics
               --alloc           show per-allocation statistics
           -s, --sort <key[,key2...]>
                                 sort by keys: ptr, call_site, bytes, hit,
                                 pingpong, frag
           -l, --line <num>      show n lines
               --raw-ip          show raw ip instead of symbol
      
      As shown above, the -f option does not work at all.
      
      After this patch:
      
       # perf kmem stat
       File perf.data not owned by current user or root (use -f to override)
       # perf kmem stat -f
       SUMMARY
       =======
       Total bytes requested: 437599
       Total bytes allocated: 615472
       Total bytes wasted on internal fragmentation: 177873
       Internal fragmentation: 28.900259%
       Cross CPU allocations: 6/1192
      
      As shown above, the -f option really works now.
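
      The behavior shown above can be summarized as a small predicate. This
      is a minimal sketch inferred from the error message, not the code from
      this patch; the function name and signature are hypothetical.

```c
#include <sys/types.h>

/*
 * Return 1 if a data file owned by file_uid may be read by a process
 * running as user_uid, 0 otherwise. Files owned by root or by the
 * current user are accepted; force (the -f option) skips the check.
 */
int data_file_readable(uid_t file_uid, uid_t user_uid, int force)
{
    if (force)
        return 1;
    return file_uid == 0 || file_uid == user_uid;
}
```

      In a real tool, file_uid would come from the st_uid field filled in by
      stat() and user_uid from geteuid().
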
      Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1427982439-27388-4-git-send-email-yunlong.song@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  21. 24 Mar, 2015 1 commit
  22. 13 Mar, 2015 3 commits
    • N
      perf kmem: Fix alignment of slab result table · 65f46e02
      Namhyung Kim committed
      The slab result table was slightly misaligned.  Fix it.
      
      Before:
      
        # perf kmem stat --caller -l 10
        ------------------------------------------------------------------------------------------------------
         Callsite                           | Total_alloc/Per | Total_req/Per   | Hit      | Ping-pong | Frag
        ------------------------------------------------------------------------------------------------------
         radeon_cs_parser_init.part.1+11a   |      2080/260   |      1504/188   |        8 |        0 | 27.692%
         radeon_cs_parser_init.part.1+e1    |       384/96    |       288/72    |        4 |        0 | 25.000%
         radeon_cs_parser_init.part.1+93    |       128/32    |        96/24    |        4 |        0 | 25.000%
         load_elf_binary+a39                |       512/512   |       392/392   |        1 |        0 | 23.438%
         __alloc_skb+89                     |      6144/877   |      4800/685   |        7 |        6 | 21.875%
         radeon_fence_emit+5c               |      1152/192   |       912/152   |        6 |        0 | 20.833%
         radeon_cs_parser_relocs+ad         |      8192/2048  |      6624/1656  |        4 |        0 | 19.141%
         radeon_sa_bo_new+78                |      1280/64    |      1120/56    |       20 |        0 | 12.500%
         load_elf_binary+2c4                |        32/32    |        28/28    |        1 |        0 | 12.500%
         anon_vma_prepare+101               |       576/72    |       512/64    |        8 |        0 | 11.111%
         ...                                | ...             | ...             | ...    | ...      | ...
        ------------------------------------------------------------------------------------------------------
      
      After:
      
        ---------------------------------------------------------------------------------------------------------
         Callsite                           | Total_alloc/Per | Total_req/Per   | Hit      | Ping-pong | Frag
        ---------------------------------------------------------------------------------------------------------
         radeon_cs_parser_init.part.1+11a   |      2080/260   |      1504/188   |        8 |         0 | 27.692%
         radeon_cs_parser_init.part.1+e1    |       384/96    |       288/72    |        4 |         0 | 25.000%
         radeon_cs_parser_init.part.1+93    |       128/32    |        96/24    |        4 |         0 | 25.000%
         load_elf_binary+a39                |       512/512   |       392/392   |        1 |         0 | 23.438%
         __alloc_skb+89                     |      6144/877   |      4800/685   |        7 |         6 | 21.875%
         radeon_fence_emit+5c               |      1152/192   |       912/152   |        6 |         0 | 20.833%
         radeon_cs_parser_relocs+ad         |      8192/2048  |      6624/1656  |        4 |         0 | 19.141%
         radeon_sa_bo_new+78                |      1280/64    |      1120/56    |       20 |         0 | 12.500%
         load_elf_binary+2c4                |        32/32    |        28/28    |        1 |         0 | 12.500%
         anon_vma_prepare+101               |       576/72    |       512/64    |        8 |         0 | 11.111%
         ...                                | ...             | ...             | ...      | ...       | ...
        ---------------------------------------------------------------------------------------------------------
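
      The Frag column is consistent with the share of allocated bytes that
      the caller did not request. A minimal sketch of that arithmetic
      follows; the function name is hypothetical, and the formula is
      inferred from the figures in the table rather than taken from the
      perf source.

```c
/*
 * Percentage of allocated bytes that were not requested by the caller,
 * i.e. internal fragmentation. Returns 0 when nothing was allocated.
 */
double frag_percent(unsigned long allocated, unsigned long requested)
{
    if (allocated == 0)
        return 0.0;
    return 100.0 * (double)(allocated - requested) / (double)allocated;
}
```

      For the first row, 2080 bytes allocated against 1504 requested gives
      576 / 2080, or about 27.692%.
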
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Joonsoo Kim <js1304@gmail.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1426145571-3065-4-git-send-email-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • N
      perf kmem: Allow -v option · bd72a33e
      Namhyung Kim committed
      perf kmem currently fails when the -v option is used.  Since the option
      is very useful for debugging, allow it.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Joonsoo Kim <js1304@gmail.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1426145571-3065-3-git-send-email-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • N
      perf kmem: Fix segfault when invalid sort key is given · 405f8755
      Namhyung Kim committed
      By the time the code frees 'str', strsep() has already advanced it, so
      the original pointer must be saved and freed instead.
      
        # perf kmem stat -s xxx,hit
          Error: Unknown --sort key: 'xxx'
        *** Error in `perf': free(): invalid pointer: 0x0000000000e9e7b6 ***
        ======= Backtrace: =========
        /usr/lib/libc.so.6(+0x7198e)[0x7fc7e6e0d98e]
        /usr/lib/libc.so.6(+0x76dee)[0x7fc7e6e12dee]
        /usr/lib/libc.so.6(+0x775cb)[0x7fc7e6e135cb]
        ./perf[0x44a1b5]
        ./perf[0x490b20]
        ./perf(parse_options_step+0x173)[0x491773]
        ./perf(parse_options_subcommand+0xa7)[0x491fb7]
        ./perf(cmd_kmem+0x2bc)[0x44ae4c]
        ./perf[0x47aa13]
        ./perf(main+0x60a)[0x427a9a]
        /usr/lib/libc.so.6(__libc_start_main+0xf0)[0x7fc7e6dbc800]
        ./perf(_start+0x29)[0x427bb9]
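
      The pattern behind the fix can be sketched as follows. This is an
      illustration only: parse_sort_keys() and known_key() are hypothetical
      names, and the accepted keys are chosen just for the example. The
      point is that strsep() advances the pointer it is given, so a separate
      cursor is used and the pointer returned by strdup() is the one passed
      to free().

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* for strsep() and strdup() on glibc */
#endif
#include <stdlib.h>
#include <string.h>

/* Hypothetical validator: accepts only the keys used in this sketch. */
static int known_key(const char *tok)
{
    return strcmp(tok, "hit") == 0 || strcmp(tok, "bytes") == 0;
}

/* Return 0 if every comma-separated key is known, -1 otherwise. */
int parse_sort_keys(const char *arg)
{
    char *str = strdup(arg); /* original pointer, kept for free() */
    char *pos = str;         /* cursor that strsep() advances */
    char *tok;
    int ret = 0;

    if (!str)
        return -1;

    while ((tok = strsep(&pos, ",")) != NULL) {
        if (!known_key(tok)) {
            ret = -1;
            break;
        }
    }

    free(str); /* free the saved pointer, never the advanced cursor */
    return ret;
}
```

      Calling parse_sort_keys("xxx,hit") returns -1 without touching an
      invalid pointer, because free() always receives the address that
      strdup() returned.
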
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Joonsoo Kim <js1304@gmail.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1426145571-3065-2-git-send-email-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  23. 11 Mar, 2015 1 commit
  24. 26 Sep, 2014 1 commit
  25. 14 Aug, 2014 2 commits
    • N
      perf tools: Check recorded kernel version when finding vmlinux · 0a7e6d1b
      Namhyung Kim committed
      Currently vmlinux_path__init() only looks for the vmlinux file in the
      current directory, /boot and some canonical directories named after the
      version of the running kernel.  This is a problem when reporting old
      data that was recorded on a kernel version other than the running one.
      
      The --symfs option can be used for this, but it is annoying for users
      to pass it every time.  As the recorded version is already stored in the
      perf.data file, use it for the search automatically.
      
      Before:
      
        $ perf report
        ...
        # Samples: 4K of event 'cpu-clock'
        # Event count (approx.): 1067250000
        #
        # Overhead  Command     Shared Object      Symbol
        # ........  ..........  .................  ..............................
            71.87%     swapper  [kernel.kallsyms]  [k] recover_probed_instruction
      
      After:
      
        # Overhead  Command     Shared Object      Symbol
        # ........  ..........  .................  ....................
            71.87%     swapper  [kernel.kallsyms]  [k] native_safe_halt
      
      This requires changing the signature of symbol__init() to receive a
      struct perf_session_env *.
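
      The idea can be sketched as below: prefer the kernel release stored
      with the recording and fall back to the running kernel only when none
      is available. This is not the perf implementation. The function name,
      the buffer size and the directory list are assumptions made for
      illustration.

```c
#include <stdio.h>
#include <sys/utsname.h>

#define VMLINUX_PATH_MAX 256

/*
 * Fill paths[] with candidate vmlinux locations and return how many were
 * written (at most max). recorded_release is the kernel release saved
 * with the recording, or NULL to use the running kernel's release.
 */
int vmlinux_candidates(const char *recorded_release,
                       char paths[][VMLINUX_PATH_MAX], int max)
{
    /* Illustrative directory list, not copied from perf. */
    static const char *const fixed[] = { "vmlinux", "/boot/vmlinux" };
    static const char *const prefix[] = {
        "/boot/vmlinux-", "/usr/lib/debug/boot/vmlinux-", "/lib/modules/",
    };
    static const char *const suffix[] = { "", "", "/build/vmlinux" };
    struct utsname uts;
    const char *release = recorded_release;
    int n = 0;

    /* Version-independent locations come first. */
    for (size_t i = 0; i < sizeof(fixed) / sizeof(fixed[0]) && n < max; i++)
        snprintf(paths[n++], VMLINUX_PATH_MAX, "%s", fixed[i]);

    /* Without a recorded release, fall back to the running kernel. */
    if (!release) {
        if (uname(&uts) < 0)
            return n;
        release = uts.release;
    }

    /* Version-specific locations use the chosen release string. */
    for (size_t i = 0; i < sizeof(prefix) / sizeof(prefix[0]) && n < max; i++)
        snprintf(paths[n++], VMLINUX_PATH_MAX, "%s%s%s",
                 prefix[i], release, suffix[i]);

    return n;
}
```

      Passing the recorded release, for example "3.16.0", yields paths such
      as /boot/vmlinux-3.16.0 regardless of which kernel is running when the
      report is generated.
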
      Reported-by: Minchan Kim <minchan@kernel.org>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Namhyung Kim <namhyung.kim@lge.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1407825645-24586-14-git-send-email-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • N
      perf kmem: Move session handling out of __cmd_kmem() · 2b2b2c68
      Namhyung Kim committed
      This is preparation for fixing dso__load_kernel_sym(), which needs the
      session info before symbol__init() is called.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Namhyung Kim <namhyung.kim@lge.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1407825645-24586-7-git-send-email-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  26. 12 Aug, 2014 2 commits
  27. 12 May, 2014 1 commit
  28. 22 Apr, 2014 1 commit
  29. 16 Apr, 2014 1 commit