1. 16 Apr, 2020 1 commit
  2. 11 Mar, 2020 5 commits
    • perf cs-etm: Fix unsigned variable comparison to zero · bc010dd6
      Committed by Leo Yan
      The variable 'offset' in function cs_etm__sample() is u64 type, it's not
      appropriate to check it with 'while (offset > 0)'; this patch changes to
      'while (offset)'.
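The point about unsigned comparisons can be illustrated with a minimal sketch. The helpers `count_steps()` and `wraps_to_positive()` are hypothetical and not taken from perf; the `u64` typedef is a stand-in for the kernel type. For an unsigned value, `offset > 0` and `offset != 0` are the same test, and a negative result stored in a `u64` wraps to a large positive number instead of ending the loop.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t u64;	/* stand-in for the kernel's u64 typedef */

/*
 * Hypothetical helper: count loop iterations while consuming 'step'
 * units from an unsigned 'offset'. Because 'offset' is unsigned,
 * 'while (offset)' expresses the same condition as 'while (offset > 0)'
 * without suggesting that the value could ever be negative.
 */
static unsigned int count_steps(u64 offset, u64 step)
{
	unsigned int n = 0;

	assert(step > 0);
	while (offset) {
		/* Clamp so the subtraction never wraps below zero. */
		offset -= (offset < step) ? offset : step;
		n++;
	}
	return n;
}

/* A negative value converted to u64 wraps to a huge positive value. */
static int wraps_to_positive(int64_t value)
{
	u64 as_unsigned = (u64)value;

	return as_unsigned > 0;
}
```

With these definitions, `count_steps(10, 4)` takes three iterations, and `wraps_to_positive(-3)` is true because the converted value is far above zero.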
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
      Reviewed-by: Mike Leach <mike.leach@linaro.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Robert Walker <robert.walker@arm.com>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: coresight ml <coresight@lists.linaro.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: http://lore.kernel.org/lkml/20200219021811.20067-6-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf cs-etm: Optimize copying last branches · 695378b5
      Committed by Leo Yan
      If an instruction range packet can generate multiple instruction
      samples, these samples share the same last branches; it's not necessary
      to copy the same last branches repeatedly for these samples within the
      same packet.
      
      This patch moves the copying of last branches out of function
      cs_etm__synth_instruction_sample() and executes it prior to generating
      instruction samples.
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
      Reviewed-by: Mike Leach <mike.leach@linaro.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Robert Walker <robert.walker@arm.com>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: coresight ml <coresight@lists.linaro.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: http://lore.kernel.org/lkml/20200219021811.20067-5-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf cs-etm: Correct synthesizing instruction samples · c9f5baa1
      Committed by Leo Yan
      When 'etm->instructions_sample_period' is less than
      'tidq->period_instructions', the function cs_etm__sample() cannot handle
      this case properly with its logic.
      
      Consider the following flow as an example:
      
      - If we set itrace option '--itrace=i4', then function cs_etm__sample()
        has variables with initialized values:
      
        tidq->period_instructions = 0
        etm->instructions_sample_period = 4
      
      - When the first packet is coming:
      
        packet->instr_count = 10; the number of instructions executed in this
        packet is 10, thus update period_instructions as below:
      
        tidq->period_instructions = 0 + 10 = 10
        instrs_over = 10 - 4 = 6
        offset = 10 - 6 - 1 = 3
        tidq->period_instructions = instrs_over = 6
      
      - When the second packet is coming:
      
        packet->instr_count = 10; in the second pass, assume 10 instructions
        in the trace sample again:
      
        tidq->period_instructions = 6 + 10 = 16
        instrs_over = 16 - 4 = 12
        offset = 10 - 12 - 1 = -3  -> the negative value
        tidq->period_instructions = instrs_over = 12
      
      So after handling these two packets, there are two issues:
      
      The first issue is that cs_etm__instr_addr() returns the address within
      the current trace sample of the instruction related to offset, so the
      offset is supposed to always be an unsigned value.  But in fact, function
      cs_etm__sample() might calculate a negative offset value (in handling
      the second packet, the offset is -3) and pass it to cs_etm__instr_addr()
      as a u64, where it becomes a big positive integer.
      
      The second issue is that it only synthesizes 2 samples for sample period = 4.
      In theory, every packet has 10 instructions, so the two packets have
      20 instructions in total, and 20 instructions should generate 5 samples
      (4 x 5 = 20).  This is because cs_etm__sample() calls
      cs_etm__synth_instruction_sample() only once per range packet to
      generate an instruction sample.
      
      This patch fixes the logic in function cs_etm__sample(); the basic
      idea for handling an incoming packet is:
      
      - To synthesize the first instruction sample, it combines the
        instructions left over from the previous packet with the head of the
        new packet; then it generates further samples at every sample period;
      - At the tail of the new packet, any remaining instructions are left
        for the next sample.
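The two-step idea described in this commit message can be sketched as follows. This is a simplified illustration under stated assumptions, not the code from perf: `struct sample_state` and `consume_packet()` are hypothetical names, and only the instruction counting is modelled (no addresses, no actual sample synthesis).

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t u64;	/* stand-in for the kernel's u64 typedef */

/* Hypothetical per-queue state: instructions carried over between packets. */
struct sample_state {
	u64 period_instructions;
};

/*
 * Consume one range packet holding 'instr_count' instructions and return
 * the number of samples it yields for the given sample period. If
 * 'first_offset' is non-NULL and at least one sample is produced, it
 * receives the zero-based index, within this packet, of the first
 * sampled instruction.
 */
static unsigned int consume_packet(struct sample_state *st, u64 instr_count,
				   u64 period, u64 *first_offset)
{
	u64 instrs_prev = st->period_instructions;
	unsigned int samples = 0;
	u64 offset;

	assert(period > 0);
	/* Leftovers are always smaller than one period, so no underflow. */
	assert(instrs_prev < period);

	st->period_instructions += instr_count;

	/* Instructions still needed from this packet for the first sample. */
	offset = period - instrs_prev;

	while (st->period_instructions >= period) {
		if (samples == 0 && first_offset)
			*first_offset = offset - 1;
		samples++;
		offset += period;
		st->period_instructions -= period;
	}
	/* Whatever remains is left over for the next packet. */
	return samples;
}
```

Replaying the example from the commit message (period 4, two packets of 10 instructions each) through this sketch gives 2 samples for the first packet with 2 instructions left over, then 3 samples for the second packet, for 5 samples in total and no negative offset.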
      Suggested-by: Mike Leach <mike.leach@linaro.org>
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
      Reviewed-by: Mike Leach <mike.leach@linaro.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Robert Walker <robert.walker@arm.com>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: coresight ml <coresight@lists.linaro.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: http://lore.kernel.org/lkml/20200219021811.20067-4-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf cs-etm: Continuously record last branch · f1410028
      Committed by Leo Yan
      Every time an instruction sample is synthesized, the last branch recording
      is reset.  This is fine if the instruction period is big enough; for
      example, with the option '--itrace=i100000', the last branch array is
      reset for every sample with 100000 instructions per period, and before
      the next instruction sample is generated, sufficient packets arrive
      to fill the last branch array.
      
      On the other hand, with a very small period, far fewer packets arrive
      between two consecutive instruction samples, so the frequent resets
      leave the last branch array almost empty for each new instruction
      sample.
      
      To allow the last branches to work properly for any instruction period,
      this patch avoids resetting the last branch for every instruction sample
      and only resets it when flushing the trace data.  The last branches are
      reset in only two cases: when the trace starts and when the trace is
      discontinuous; in other cases, last branches keep being recorded across
      consecutive instruction samples.
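The reset policy described here can be sketched with a small ring buffer. All names (`struct last_branch_rb`, `lbr_record()`, `lbr_emit_sample()`, `lbr_on_discontinuity()`) and the capacity of 4 are hypothetical, chosen only for illustration; the actual perf data structures differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define LAST_BRANCH_SZ 4	/* hypothetical capacity */

/* Hypothetical ring buffer holding the most recent branch sources. */
struct last_branch_rb {
	unsigned long from[LAST_BRANCH_SZ];
	size_t pos;	/* next slot to write */
	size_t nr;	/* valid entries, at most LAST_BRANCH_SZ */
};

static void lbr_reset(struct last_branch_rb *rb)
{
	memset(rb, 0, sizeof(*rb));
}

/* Record one branch, overwriting the oldest entry once the buffer is full. */
static void lbr_record(struct last_branch_rb *rb, unsigned long from)
{
	assert(rb->pos < LAST_BRANCH_SZ);
	rb->from[rb->pos] = from;
	rb->pos = (rb->pos + 1) % LAST_BRANCH_SZ;
	if (rb->nr < LAST_BRANCH_SZ)
		rb->nr++;
}

/*
 * Emitting an instruction sample only reads the buffer; it does not
 * reset it, so the next sample still sees the accumulated branches.
 */
static size_t lbr_emit_sample(const struct last_branch_rb *rb)
{
	return rb->nr;
}

/* The buffer is cleared only at trace start or on a discontinuity. */
static void lbr_on_discontinuity(struct last_branch_rb *rb)
{
	lbr_reset(rb);
}
```

In this sketch, two samples emitted back to back see a growing branch history, and the history drops to empty only after `lbr_on_discontinuity()` is called.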
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
      Reviewed-by: Mike Leach <mike.leach@linaro.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Robert Walker <robert.walker@arm.com>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: coresight ml <coresight@lists.linaro.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: http://lore.kernel.org/lkml/20200219021811.20067-3-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf cs-etm: Swap packets for instruction samples · d0175156
      Committed by Leo Yan
      If the option '--itrace=iNNN' is used with Arm CoreSight trace data, the
      perf tool fails to inject instruction samples; the root cause is that the
      packets are only swapped for branch samples and last branches but not
      for instruction samples, so newly arriving packets cannot be properly
      handled when only instruction samples are synthesized.
      
      To fix this issue, this patch refactors the code with a new function
      cs_etm__packet_swap() which is used to swap packets and adds the
      condition for instruction samples.
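A minimal sketch of the swap condition described above follows. The structures and the function `packet_swap_sketch()` are hypothetical simplifications and not the perf definitions; the only point shown is that the swap must also happen when instruction samples alone are requested.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced stand-ins for the decoder's packet and queue. */
struct packet_sketch {
	int id;
};

struct queue_sketch {
	struct packet_sketch *packet;		/* packet being decoded */
	struct packet_sketch *prev_packet;	/* previously decoded packet */
};

/* Hypothetical option flags selecting which samples are synthesized. */
struct opts_sketch {
	bool sample_branches;
	bool last_branch;
	bool sample_instructions;
};

/*
 * Swap current and previous packet whenever any kind of sample needs the
 * previous packet, including the instruction-samples-only case.
 */
static void packet_swap_sketch(const struct opts_sketch *opts,
			       struct queue_sketch *q)
{
	struct packet_sketch *tmp;

	assert(opts != NULL && q != NULL);
	if (opts->sample_branches || opts->last_branch ||
	    opts->sample_instructions) {
		tmp = q->packet;
		q->packet = q->prev_packet;
		q->prev_packet = tmp;
	}
}
```

With only `sample_instructions` set, the two packet pointers are exchanged; with no flag set, the queue is left untouched.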
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
      Reviewed-by: Mike Leach <mike.leach@linaro.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Robert Walker <robert.walker@arm.com>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: coresight ml <coresight@lists.linaro.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: http://lore.kernel.org/lkml/20200219021811.20067-2-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  3. 10 Mar, 2020 1 commit
    • perf tools: Add hw_idx in struct branch_stack · 42bbabed
      Committed by Kan Liang
      The low level index of raw branch records for the most recent branch can
      be recorded in a sample with PERF_SAMPLE_BRANCH_HW_INDEX
      branch_sample_type. Extend struct branch_stack to support it.
      
      However, if PERF_SAMPLE_BRANCH_HW_INDEX is not applied, only nr and
      entries[] will be output by the kernel. The pointer to entries[] could then
      be wrong, since the output format differs from the new struct
      branch_stack.  Add a variable no_hw_idx in struct perf_sample to
      indicate whether the hw_idx is output.  Add get_branch_entry() to return
      the corresponding pointer to entries[0].
      
      To make the dummy branch sample consistent with the new branch sample, add
      hw_idx in struct dummy_branch_stack for cs-etm and intel-pt.
      
      Apply the new struct branch_stack for synthetic events as well.
      
      Extend test case sample-parsing to support new struct branch_stack.
      
      Committer notes:
      
      Renamed get_branch_entries() to perf_sample__branch_entries() to have
      proper namespacing and pave the way for this to be moved to libperf,
      eventually.
      
      Add 'static' to that inline as it is in a header.
      
      Add 'hw_idx' to 'struct dummy_branch_stack' in cs-etm.c to fix the build
      on arm64.
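The layout problem this commit addresses can be sketched as below. The types and the helper `sketch_branch_entries()` are hypothetical, reduced stand-ins (each entry is modelled as three u64 words); they are not the perf definitions. The sketch assumes the raw data is a sequence of u64 words that starts with nr, optionally followed by hw_idx, followed by the entries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t u64;	/* stand-in for the kernel's u64 typedef */

/* Hypothetical, reduced branch entry: three u64 words. */
struct branch_entry_sketch {
	u64 from;
	u64 to;
	u64 flags;
};

/* Layout when the hardware index is present in the sample. */
struct branch_stack_sketch {
	u64 nr;
	u64 hw_idx;
	struct branch_entry_sketch entries[];
};

struct sample_sketch {
	struct branch_stack_sketch *branch_stack;
	int no_hw_idx;	/* non-zero when the kernel did not emit hw_idx */
};

/*
 * Return a pointer to the first entry. Without hw_idx the entries start
 * one u64 earlier than the struct layout suggests, so the pointer is
 * computed from the raw words rather than taken from the entries[] field.
 */
static struct branch_entry_sketch *
sketch_branch_entries(struct sample_sketch *sample)
{
	u64 *word = (u64 *)sample->branch_stack;

	assert(word != NULL);
	word++;			/* skip nr */
	if (!sample->no_hw_idx)
		word++;		/* skip hw_idx only when it was written */
	return (struct branch_entry_sketch *)word;
}
```

Given raw words `{1, 0xa, 0xb, 0xc}` with `no_hw_idx` set, or `{1, 7, 0xa, 0xb, 0xc}` with it clear, the helper returns an entry whose `from` field is `0xa` in both cases.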
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Pavel Gerasimov <pavel.gerasimov@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Vitaly Slobodskoy <vitaly.slobodskoy@intel.com>
      Link: http://lore.kernel.org/lkml/20200228163011.19358-2-kan.liang@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  4. 26 Nov, 2019 1 commit
    • perf maps: Merge 'struct maps' with 'struct map_groups' · 79b6bb73
      Committed by Arnaldo Carvalho de Melo
      And pick the shortest name: 'struct maps'.
      
      The split existed because we used to have two groups of maps, one for
      functions and one for variables, but that only complicated things,
      sometimes we needed to figure out what was at some address and then had
      to first try it on the functions group and if that failed, fall back to
      the variables one.
      
      That split is long gone, so for quite a while we had only one struct
      maps per struct map_groups, simplify things by combining those structs.
      
      First patch is the minimum needed to merge both, follow up patches will
      rename 'thread->mg' to 'thread->maps', etc.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-hom6639ro7020o708trhxh59@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  5. 07 Nov, 2019 1 commit
    • perf cs-etm: Fix definition of macro TO_CS_QUEUE_NR · 9d604aad
      Committed by Leo Yan
      The definition of macro TO_CS_QUEUE_NR has a typo: it uses 'trace_id_chan'
      as its parameter, which doesn't match its definition body, which uses
      'trace_chan_id'.  So rename the parameter to 'trace_chan_id'.
      
      By luck, the function cs_etm__setup_queue() has a local variable
      'trace_chan_id', so even with the wrongly defined macro TO_CS_QUEUE_NR,
      the local variable 'trace_chan_id' is used rather than the macro's
      parameter 'trace_id_chan'; this is why the compiler didn't complain
      before.
      
      After renaming the parameter, a compile error appears because
      cs_etm__setup_queue() has no variable 'trace_id_chan'.  This patch uses
      the variable 'trace_chan_id' for the macro to fix the compile
      error.
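The macro pitfall described in this commit message can be reproduced with a small sketch. The macros `QUEUE_NR_BUGGY` and `QUEUE_NR_FIXED`, the shift by 16, and the two wrapper functions are hypothetical and exist only to show how a misnamed parameter lets the macro body silently capture a local variable of the caller.

```c
#include <assert.h>

/*
 * Buggy shape: the parameter is spelled 'trace_id_chan' but the body uses
 * 'trace_chan_id', so the body refers to whatever variable of that name is
 * in scope at the expansion site and the argument is ignored.
 */
#define QUEUE_NR_BUGGY(queue_nr, trace_id_chan) \
	((queue_nr) << 16 | (trace_chan_id))

/* Fixed shape: parameter name and body agree, so the argument is used. */
#define QUEUE_NR_FIXED(queue_nr, trace_chan_id) \
	((queue_nr) << 16 | (trace_chan_id))

static unsigned int buggy_queue_nr(unsigned int queue_nr, unsigned int arg)
{
	unsigned int trace_chan_id = 5;	/* local captured by the buggy macro */

	(void)arg;	/* the buggy macro never looks at its second argument */
	return QUEUE_NR_BUGGY(queue_nr, arg);
}

static unsigned int fixed_queue_nr(unsigned int queue_nr, unsigned int arg)
{
	unsigned int trace_chan_id = 5;	/* no longer picked up by the macro */

	assert(queue_nr < (1u << 16));	/* keep the shift within range */
	(void)trace_chan_id;
	return QUEUE_NR_FIXED(queue_nr, arg);
}
```

Calling both wrappers with `(1, 9)` shows the difference: the buggy variant combines the queue number with the captured local value 5, while the fixed variant uses the argument 9.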
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: coresight ml <coresight@lists.linaro.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: http://lore.kernel.org/lkml/20191021074808.25795-1-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  6. 25 Sep, 2019 1 commit
  7. 20 Sep, 2019 2 commits
  8. 01 Sep, 2019 4 commits
  9. 29 Aug, 2019 2 commits
  10. 20 Aug, 2019 1 commit
    • perf cs-etm: Support sample flags 'insn' and 'insnlen' · a4973d8f
      Committed by Leo Yan
      The synthetic branch and instruction samples do not set
      instruction-related info, thus the perf tool fails to display samples
      with flags '-F,+insn,+insnlen'.
      
      The CoreSight trace decoder provides sufficient information to decide
      the instruction size based on the ISA type: A64/A32 instructions are
      32-bit size, but one exception is the T32 instruction size, which might
      be 32-bit or 16-bit.
      
      This patch handles these cases and it reads the instruction values from
      DSO file; thus can support the flags '-F,+insn,+insnlen'.
      
      Before:
      
        # perf script -F,insn,insnlen,ip,sym
                      0 [unknown] ilen: 0
           ffff97174044 _start ilen: 0
           ffff97174938 _dl_start ilen: 0
           ffff97174938 _dl_start ilen: 0
           ffff97174938 _dl_start ilen: 0
           ffff97174938 _dl_start ilen: 0
           ffff97174938 _dl_start ilen: 0
           ffff97174938 _dl_start ilen: 0
           ffff97174938 _dl_start ilen: 0
           ffff97174938 _dl_start ilen: 0
      
        [...]
      
      After:
      
        # perf script -F,insn,insnlen,ip,sym
                      0 [unknown] ilen: 0
           ffff97174044 _start ilen: 4 insn: 2f 02 00 94
           ffff97174938 _dl_start ilen: 4 insn: c1 ff ff 54
           ffff97174938 _dl_start ilen: 4 insn: c1 ff ff 54
           ffff97174938 _dl_start ilen: 4 insn: c1 ff ff 54
           ffff97174938 _dl_start ilen: 4 insn: c1 ff ff 54
           ffff97174938 _dl_start ilen: 4 insn: c1 ff ff 54
           ffff97174938 _dl_start ilen: 4 insn: c1 ff ff 54
           ffff97174938 _dl_start ilen: 4 insn: c1 ff ff 54
           ffff97174938 _dl_start ilen: 4 insn: c1 ff ff 54
      
        [...]
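The instruction-size rule stated in this commit message (A64/A32 are always 32-bit, T32 is 16-bit or 32-bit) can be sketched as follows. The enum and function names are hypothetical. The T32 check relies on the Arm encoding rule that a first halfword whose top five bits are 0b11101, 0b11110 or 0b11111 starts a 32-bit instruction; how perf reads the bytes from the DSO is not modelled here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical ISA tags for this sketch. */
enum isa_sketch {
	ISA_SKETCH_A64,
	ISA_SKETCH_A32,
	ISA_SKETCH_T32,
};

/*
 * T32: bits [15:11] of the first halfword select the encoding width.
 * Values 0b11101 (0x1D) and above mean a 32-bit instruction.
 */
static int t32_instr_size_sketch(uint16_t first_halfword)
{
	return ((first_halfword >> 11) >= 0x1D) ? 4 : 2;
}

/* Return the instruction size in bytes for the given ISA. */
static int instr_size_sketch(enum isa_sketch isa, uint16_t first_halfword)
{
	if (isa == ISA_SKETCH_T32)
		return t32_instr_size_sketch(first_halfword);

	/* A64 and A32 instructions are always 4 bytes. */
	assert(isa == ISA_SKETCH_A64 || isa == ISA_SKETCH_A32);
	return 4;
}
```

For example, the T32 halfword `0x4770` is a complete 16-bit instruction, while a halfword such as `0xF000` is the first half of a 32-bit one; for A64 the halfword argument is ignored and the size is always 4, matching the `ilen: 4` values in the output above.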
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
      Tested-by: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Robert Walker <robert.walker@arm.com>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: coresight@lists.linaro.org
      Cc: linux-arm-kernel@lists.infradead.org
      Link: http://lkml.kernel.org/r/20190815082854.18191-1-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  11. 30 Jul, 2019 3 commits
    • libperf: Move perf_event_attr field from perf's evsel to libperf's perf_evsel · 1fc632ce
      Committed by Jiri Olsa
      Move the perf_event_attr struct from 'struct evsel' to 'struct perf_evsel'.
      
      Committer notes:
      
      Fixed up these:
      
       tools/perf/arch/arm/util/auxtrace.c
       tools/perf/arch/arm/util/cs-etm.c
       tools/perf/arch/arm64/util/arm-spe.c
       tools/perf/arch/s390/util/auxtrace.c
       tools/perf/util/cs-etm.c
      
      Also
      
        cc1: warnings being treated as errors
        tests/sample-parsing.c: In function 'do_test':
        tests/sample-parsing.c:162: error: missing initializer
        tests/sample-parsing.c:162: error: (near initialization for 'evsel.core.cpus')
      
         	struct evsel evsel = {
         		.needs_swap = false,
        -		.core.attr = {
        -			.sample_type = sample_type,
        -			.read_format = read_format,
        +		.core = {
        +			.attr = {
        +				.sample_type = sample_type,
        +				.read_format = read_format,
        +			},
      
        [perfbuilder@a70e4eeb5549 /]$ gcc --version |& head -1
        gcc (GCC) 4.4.7
      
      Also we don't need to include perf_event.h in
      tools/perf/lib/include/perf/evsel.h, forward declaring 'struct
      perf_event_attr' is enough. And this even fixes the build in some
      systems where things are used somewhere down the include path from
      perf_event.h without defining __always_inline.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20190721112506.12306-43-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf evlist: Rename struct perf_evlist to struct evlist · 63503dba
      Committed by Jiri Olsa
      Rename struct perf_evlist to struct evlist, so we don't have a name
      clash when we add struct perf_evlist in libperf.
      
      Committer notes:
      
      Added fixes to build on arm64, from Jiri and from me
      (tools/perf/util/cs-etm.c)
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20190721112506.12306-6-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf evsel: Rename struct perf_evsel to struct evsel · 32dcd021
      Committed by Jiri Olsa
      Rename struct perf_evsel to struct evsel, so we don't have a name clash
      when we add struct perf_evsel in libperf.
      
      Committer notes:
      
      Added fixes for arm64, provided by Jiri.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20190721112506.12306-5-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  12. 11 Jul, 2019 2 commits
  13. 09 Jul, 2019 3 commits
  14. 11 Jun, 2019 13 commits