1. 01 Sep, 2020 10 commits
    • A
      perf tools: Correct SNOOPX field offset · 39c0a53b
      Al Grant committed
      perf_event.h has macros that define the field offsets in the data_src
      bitmask in perf records. The SNOOPX and REMOTE offsets were both 37.
      
      These are distinct fields, and the bitfield layout in perf_mem_data_src
      confirms that SNOOPX should be at offset 38.
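The corrected offset can be cross-checked by summing the widths of the bitfield members that precede each field. The sketch below does that in plain C. The enum and function names are made up for illustration, and the widths are quoted from memory of the perf_mem_data_src layout, so verify them against the header in your tree.

```c
#include <assert.h>

/* Fields of the data_src bitmask in declaration order (names local to
 * this sketch; the reserved tail is left out). */
enum mem_field {
	MEM_OP, MEM_LVL, MEM_SNOOP, MEM_LOCK, MEM_DTLB,
	MEM_LVL_NUM, MEM_REMOTE, MEM_SNOOPX, MEM_FIELD_COUNT
};

/* Width of each field in bits (assumed values, see the note above). */
static const unsigned int mem_field_width[MEM_FIELD_COUNT] = {
	[MEM_OP] = 5, [MEM_LVL] = 14, [MEM_SNOOP] = 5, [MEM_LOCK] = 2,
	[MEM_DTLB] = 7, [MEM_LVL_NUM] = 4, [MEM_REMOTE] = 1, [MEM_SNOOPX] = 2,
};

/* Offset of a field = sum of the widths of all fields before it. */
static unsigned int mem_field_shift(enum mem_field field)
{
	unsigned int shift = 0;

	assert(field < MEM_FIELD_COUNT);
	for (int i = 0; i < (int)field; i++)
		shift += mem_field_width[i];
	return shift;
}
```

Because REMOTE is one bit wide, the field that follows it cannot share its offset.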
      
      Committer notes:
      
      This was extracted from a larger patch that also contained kernel
      changes.
      
      Fixes: 52839e65 ("perf tools: Add support for printing new mem_info encodings")
      Signed-off-by: Al Grant <al.grant@arm.com>
      Reviewed-by: Andi Kleen <ak@linux.intel.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/9974f2d0-bf7f-518e-d9f7-4520e5ff1bb0@foss.arm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf intel-pt: Fix corrupt data after perf inject from · a347306f
      Al Grant committed
      Commit 42bbabed ("perf tools: Add hw_idx in struct branch_stack")
      changed the format of branch stacks in perf samples. When samples use
      this new format, a flag must be set in the corresponding event.
      
      Synthesized branch stacks generated from Intel PT were using the new
      format, but not setting the event attribute, leading to consumers
      seeing corrupt data. This patch fixes the issue by setting the event
      attribute to indicate use of the new format.
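To illustrate why a missing flag corrupts the data, here is a minimal sketch of a consumer that reads a raw branch stack laid out as nr, an optional hw_idx, then the entries. The struct and function names are hypothetical and the layout is simplified; the real perf sample parsing is more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* One branch record: source, target, and a flags word. */
struct branch_entry_sketch {
	uint64_t from, to, flags;
};

/*
 * Return entry i of a raw payload laid out as
 *   nr, [hw_idx,] entries[nr]
 * has_hw_idx mirrors the event attribute flag: it tells the reader
 * whether the hw_idx word is present and must be skipped.
 */
static struct branch_entry_sketch
branch_stack_entry(const uint64_t *raw, bool has_hw_idx, uint64_t i)
{
	const uint64_t *e;

	assert(raw);
	e = raw + (has_hw_idx ? 2 : 1) + 3 * i;
	return (struct branch_entry_sketch){
		.from = e[0], .to = e[1], .flags = e[2],
	};
}
```

If has_hw_idx is false for a payload that does contain hw_idx, every entry is read one word too early, which is the kind of corruption the commit message describes.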
      
      Fixes: 42bbabed ("perf tools: Add hw_idx in struct branch_stack")
      Signed-off-by: Al Grant <al.grant@arm.com>
      Acked-by: Adrian Hunter <adrian.hunter@intel.com>
      Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: http://lore.kernel.org/lkml/20200819084751.17686-2-leo.yan@linaro.org
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf cs-etm: Fix corrupt data after perf inject from · f5f8e7e5
      Al Grant committed
      Commit 42bbabed ("perf tools: Add hw_idx in struct branch_stack")
      changed the format of branch stacks in perf samples. When samples use
      this new format, a flag must be set in the corresponding event.
      
      Synthesized branch stacks generated from CoreSight ETM trace were using
      the new format, but not setting the event attribute, leading to
      consumers seeing corrupt data. This patch fixes the issue by setting the
      event attribute to indicate use of the new format.
      
      Fixes: 42bbabed ("perf tools: Add hw_idx in struct branch_stack")
      Signed-off-by: Al Grant <al.grant@arm.com>
      Reviewed-by: Andrea Brunato <andrea.brunato@arm.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Link: http://lore.kernel.org/lkml/20200819084751.17686-1-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf top/report: Fix infinite loop in the TUI for grouped events · d4ccbacb
      Arnaldo Carvalho de Melo committed
      For a while now we have needed a dummy event for things like
      receiving PERF_RECORD_COMM, PERF_RECORD_EXEC, etc. for threads that are
      created or die while we synthesize the pre-existing ones at tool
      start.
      
      This 'dummy' event is needed to keep track of thread lifetime events
      early in the session but is uninteresting otherwise, so there is no need
      to show it in the initial events menu in the non-grouped case, i.e. for:
      
       # perf top -e cycles,instructions
      
      or even for plain:
      
       # perf top
      
      When 'cycles' and that 'dummy' event are in place.
      
      The code to remove that 'dummy' event ended up creating an endless loop
      for the grouped case, i.e.:
      
       # perf top -e '{cycles,instructions}'
      
      Fix it.
      
      Fixes: bee9ca1c ("perf report TUI: Remove needless 'dummy' event from menu")
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • I
      perf parse-events: Avoid an uninitialized read when using fake PMUs · 33321a06
      Ian Rogers committed
      With a fake_pmu the pmu_info isn't populated by perf_pmu__check_alias.
      In this case, don't try to copy the uninitialized values to the evsel.
      
      Signed-off-by: Ian Rogers <irogers@google.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lore.kernel.org/lkml/20200826042910.1902374-2-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • T
      perf stat: Fix out of bounds array access in the print_counters() evlist method · 313146a8
      Thomas Richter committed
      Fix a compile error on F32 and gcc version 10.1 on s390 in file
      util/stat-display.c.  The error does not show up with make DEBUG=y.  In
      fact the issue shows up when using both compiler options -O6 and
      -D_FORTIFY_SOURCE=2 (which are omitted with DEBUG=y).
      
      This is the offending call chain:
      
      print_counter_aggr()
        printout(config, -1, 0, ...)  with 2nd parm id set to -1
          aggr_printout(config, x, id --> -1, ...) which leads to this code:
              case AGGR_NONE:
                      if (evsel->percore && !config->percore_show_thread) {
                              ....
                      } else {
                              fprintf(config->output, "CPU%*d%s",
                                      config->csv_output ? 0 : -7,
                                      evsel__cpus(evsel)->map[id],
                                                              ^^ id is -1 !!!!
                                      config->csv_sep);
                      }
      
      This is a compiler inlining issue which is detected on s390 but not on
      other platforms.
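A defensive pattern for this kind of call chain is to validate the index before subscripting. The following is only a sketch with a hypothetical helper name; it is not the change made by this patch.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Format "CPU<n>" for the CPU stored at map[id]. Returns the number of
 * characters written, or 0 when id does not index a valid slot (for
 * example the -1 that callers pass when no per-CPU id applies).
 */
static int format_cpu_label(char *buf, size_t len, const int *map, int nr, int id)
{
	assert(buf && len > 0 && map);
	buf[0] = '\0';
	if (id < 0 || id >= nr)
		return 0;
	return snprintf(buf, len, "CPU%d", map[id]);
}
```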
      
      Output before:
      
       # make util/stat-display.o
          .....
      
        util/stat-display.c: In function ‘perf_evlist__print_counters’:
        util/stat-display.c:121:4: error: array subscript -1 is below array
            bounds of ‘int[]’ [-Werror=array-bounds]
        121 |    fprintf(config->output, "CPU%*d%s",
            |    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        122 |     config->csv_output ? 0 : -7,
            |     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        123 |     evsel__cpus(evsel)->map[id],
            |     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        124 |     config->csv_sep);
            |     ~~~~~~~~~~~~~~~~
        In file included from util/evsel.h:13,
                       from util/evlist.h:13,
                       from util/stat-display.c:9:
        /root/linux/tools/lib/perf/include/internal/cpumap.h:10:7:
        note: while referencing ‘map’
         10 |  int  map[];
            |       ^~~
        cc1: all warnings being treated as errors
        mv: cannot stat 'util/.stat-display.o.tmp': No such file or directory
        make[3]: *** [/root/linux/tools/build/Makefile.build:97: util/stat-display.o]
        Error 1
        make[2]: *** [Makefile.perf:716: util/stat-display.o] Error 2
        make[1]: *** [Makefile.perf:231: sub-make] Error 2
        make: *** [Makefile:110: util/stat-display.o] Error 2
        [root@t35lp46 perf]#
      
      Output after:
      
        # make util/stat-display.o
          .....
        CC       util/stat-display.o
        [root@t35lp46 perf]#
      
      Committer notes:
      
      Removed the removal of {} enclosing the multiline else block, as pointed
      out by Jiri Olsa.
      
      Suggested-by: Jiri Olsa <jolsa@redhat.com>
      Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
      Cc: Sven Schnelle <svens@linux.ibm.com>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Link: http://lore.kernel.org/lkml/20200825063304.77733-1-tmricht@linux.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • T
      perf test: Set NULL sentinel in pmu_events table in "Parse and process metrics" test · 492d4d87
      Thomas Richter committed
      Linux 5.9 introduced perf test case "Parse and process metrics" and
      on s390 this test case always dumps core:
      
        [root@t35lp67 perf]# ./perf test -vvvv -F 67
        67: Parse and process metrics                             :
        --- start ---
        metric expr inst_retired.any / cpu_clk_unhalted.thread for IPC
        parsing metric: inst_retired.any / cpu_clk_unhalted.thread
        Segmentation fault (core dumped)
        [root@t35lp67 perf]#
      
      I debugged this core dump and gdb shows this call chain:
      
        (gdb) where
         #0  0x000003ffabc3192a in __strnlen_c_1 () from /lib64/libc.so.6
         #1  0x000003ffabc293de in strcasestr () from /lib64/libc.so.6
         #2  0x0000000001102ba2 in match_metric(list=0x1e6ea20 "inst_retired.any",
                  n=<optimized out>)
             at util/metricgroup.c:368
         #3  find_metric (map=<optimized out>, map=<optimized out>,
                 metric=0x1e6ea20 "inst_retired.any")
            at util/metricgroup.c:765
         #4  __resolve_metric (ids=0x0, map=<optimized out>, metric_list=0x0,
                 metric_no_group=<optimized out>, m=<optimized out>)
            at util/metricgroup.c:844
         #5  resolve_metric (ids=0x0, map=0x0, metric_list=0x0,
                metric_no_group=<optimized out>)
            at util/metricgroup.c:881
         #6  metricgroup__add_metric (metric=<optimized out>,
              metric_no_group=metric_no_group@entry=false, events=<optimized out>,
              events@entry=0x3ffd84fb878, metric_list=0x0,
              metric_list@entry=0x3ffd84fb868, map=0x0)
            at util/metricgroup.c:943
         #7  0x00000000011034ae in metricgroup__add_metric_list (map=0x13f9828 <map>,
              metric_list=0x3ffd84fb868, events=0x3ffd84fb878,
              metric_no_group=<optimized out>, list=<optimized out>)
            at util/metricgroup.c:988
         #8  parse_groups (perf_evlist=perf_evlist@entry=0x1e70260,
                str=str@entry=0x12f34b2 "IPC", metric_no_group=<optimized out>,
                metric_no_merge=<optimized out>,
                fake_pmu=fake_pmu@entry=0x1462f18 <perf_pmu.fake>,
                metric_events=0x3ffd84fba58, map=0x1)
            at util/metricgroup.c:1040
         #9  0x0000000001103eb2 in metricgroup__parse_groups_test(
        	evlist=evlist@entry=0x1e70260, map=map@entry=0x13f9828 <map>,
        	str=str@entry=0x12f34b2 "IPC",
        	metric_no_group=metric_no_group@entry=false,
        	metric_no_merge=metric_no_merge@entry=false,
        	metric_events=0x3ffd84fba58)
            at util/metricgroup.c:1082
         #10 0x00000000010c84d8 in __compute_metric (ratio2=0x0, name2=0x0,
                ratio1=<synthetic pointer>, name1=0x12f34b2 "IPC",
        	vals=0x3ffd84fbad8, name=0x12f34b2 "IPC")
            at tests/parse-metric.c:159
         #11 compute_metric (ratio=<synthetic pointer>, vals=0x3ffd84fbad8,
        	name=0x12f34b2 "IPC")
            at tests/parse-metric.c:189
         #12 test_ipc () at tests/parse-metric.c:208
      .....
      ..... omitted many more lines
      
      This test case was added with
      commit 218ca91d ("perf tests: Add parse metric test for frontend metric").
      
      When I compile with make DEBUG=y it works fine and I do not get a core dump.
      
      It turned out that the call chain listed above operates on a struct
      pmu_event array that requires a trailing all-zero element, which was
      missing. The macro map_for_each_event() loops over that array and tests
      whether the members metric_expr/metric_name/metric_group are non-NULL.
      Adding the missing element fixes the issue.
      
      Output after:
      
        [root@t35lp46 perf]# ./perf test 67
        67: Parse and process metrics                             : Ok
        [root@t35lp46 perf]#
      
      Committer notes:
      
      As Ian remarks, this is not s390 specific:
      
      <quote Ian>
        This also shows up with address sanitizer on all architectures
        (perhaps change the patch title) and perhaps add a "Fixes: <commit>"
        tag.
      
        =================================================================
        ==4718==ERROR: AddressSanitizer: global-buffer-overflow on address
        0x55c93b4d59e8 at pc 0x55c93a1541e2 bp 0x7ffd24327c60 sp
        0x7ffd24327c58
        READ of size 8 at 0x55c93b4d59e8 thread T0
            #0 0x55c93a1541e1 in find_metric tools/perf/util/metricgroup.c:764:2
            #1 0x55c93a153e6c in __resolve_metric tools/perf/util/metricgroup.c:844:9
            #2 0x55c93a152f18 in resolve_metric tools/perf/util/metricgroup.c:881:9
            #3 0x55c93a1528db in metricgroup__add_metric
        tools/perf/util/metricgroup.c:943:9
            #4 0x55c93a151996 in metricgroup__add_metric_list
        tools/perf/util/metricgroup.c:988:9
            #5 0x55c93a1511b9 in parse_groups tools/perf/util/metricgroup.c:1040:8
            #6 0x55c93a1513e1 in metricgroup__parse_groups_test
        tools/perf/util/metricgroup.c:1082:9
            #7 0x55c93a0108ae in __compute_metric tools/perf/tests/parse-metric.c:159:8
            #8 0x55c93a010744 in compute_metric tools/perf/tests/parse-metric.c:189:9
            #9 0x55c93a00f5ee in test_ipc tools/perf/tests/parse-metric.c:208:2
            #10 0x55c93a00f1e8 in test__parse_metric
        tools/perf/tests/parse-metric.c:345:2
            #11 0x55c939fd7202 in run_test tools/perf/tests/builtin-test.c:410:9
            #12 0x55c939fd6736 in test_and_print tools/perf/tests/builtin-test.c:440:9
            #13 0x55c939fd58c3 in __cmd_test tools/perf/tests/builtin-test.c:661:4
            #14 0x55c939fd4e02 in cmd_test tools/perf/tests/builtin-test.c:807:9
            #15 0x55c939e4763d in run_builtin tools/perf/perf.c:313:11
            #16 0x55c939e46475 in handle_internal_command tools/perf/perf.c:365:8
            #17 0x55c939e4737e in run_argv tools/perf/perf.c:409:2
            #18 0x55c939e45f7e in main tools/perf/perf.c:539:3
      
        0x55c93b4d59e8 is located 0 bytes to the right of global variable
        'pme_test' defined in 'tools/perf/tests/parse-metric.c:17:25'
        (0x55c93b4d54a0) of size 1352
        SUMMARY: AddressSanitizer: global-buffer-overflow
        tools/perf/util/metricgroup.c:764:2 in find_metric
        Shadow bytes around the buggy address:
          0x0ab9a7692ae0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
          0x0ab9a7692af0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
          0x0ab9a7692b00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
          0x0ab9a7692b10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
          0x0ab9a7692b20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
        =>0x0ab9a7692b30: 00 00 00 00 00 00 00 00 00 00 00 00 00[f9]f9 f9
          0x0ab9a7692b40: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
          0x0ab9a7692b50: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
          0x0ab9a7692b60: f9 f9 f9 f9 f9 f9 f9 f9 00 00 00 00 00 00 00 00
          0x0ab9a7692b70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
          0x0ab9a7692b80: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
        Shadow byte legend (one shadow byte represents 8 application bytes):
          Addressable:           00
          Partially addressable: 01 02 03 04 05 06 07
          Heap left redzone:	   fa
          Freed heap region:	   fd
          Stack left redzone:	   f1
          Stack mid redzone:	   f2
          Stack right redzone:     f3
          Stack after return:	   f5
          Stack use after scope:   f8
          Global redzone:          f9
          Global init order:	   f6
          Poisoned by user:        f7
          Container overflow:	   fc
          Array cookie:            ac
          Intra object redzone:    bb
          ASan internal:           fe
          Left alloca redzone:     ca
          Right alloca redzone:    cb
          Shadow gap:              cc
      </quote>
      
      I'm also adding the missing "Fixes" tag and setting just .name to NULL,
      as doing it that way is more compact (the compiler will zero out
      everything else) and the table iterators look for .name being NULL as
      the sentinel marking the end of the table.
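The sentinel convention described in these notes can be sketched as follows. The struct is a reduced, hypothetical stand-in for struct pmu_event with only three members, and count_events() is an illustrative helper, not a perf function.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced stand-in for an event table entry. */
struct event_sketch {
	const char *name;
	const char *metric_expr;
	const char *metric_name;
};

static const struct event_sketch events_sketch[] = {
	{
		.name        = "inst_retired.any",
		.metric_expr = "inst_retired.any / cpu_clk_unhalted.thread",
		.metric_name = "IPC",
	},
	{ .name = "cpu_clk_unhalted.thread" },
	{ .name = NULL }, /* sentinel: the other members are zero-initialized */
};

/* Count entries by walking until the sentinel; no length is passed in. */
static size_t count_events(const struct event_sketch *table)
{
	size_t n = 0;

	assert(table);
	while (table[n].name)
		n++;
	return n;
}
```

Without the last initializer the loop would read past the end of the array, which is what the address sanitizer report quoted earlier shows.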
      
      Fixes: 0a507af9 ("perf tests: Add parse metric test for ipc metric")
      Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
      Reviewed-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
      Acked-by: Ian Rogers <irogers@google.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Sven Schnelle <svens@linux.ibm.com>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Link: http://lore.kernel.org/lkml/20200825071211.16959-1-tmricht@linux.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • J
      perf parse-events: Set exclude_guest=1 for user-space counting · 943b69ac
      Jin Yao committed
      Currently, if we run 'perf record -e cycles:u', exclude_guest=0.
      
      In most cases, though, it makes no sense to request user-space
      counting and also get guest events reported.
      
      We also need to consider the 'perf kvm' use case, where authorized
      perf users on the host may want to count only guest user-space
      events. For example:
      
        # perf kvm --guest record -e cycles:u
      
      With 'exclude_guest=1' in the 'perf kvm' case, we may get nothing
      from guest events.
      
      To keep perf semantics consistent and clear, this patch sets
      exclude_guest=1 for user-space counting, except for 'perf kvm' usage.
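As a rough sketch of the resulting rule, assuming a simplified attribute struct and a guest_mode flag that stands in for however the tool detects 'perf kvm' usage (both hypothetical):

```c
#include <assert.h>
#include <stdbool.h>

/* Reduced stand-in for the exclusion bits of an event attribute. */
struct excl_sketch {
	bool exclude_user, exclude_kernel, exclude_hv, exclude_guest;
};

/*
 * Apply a lone ":u" modifier. guest_mode is true for a
 * "perf kvm --guest" style invocation, where guest events are wanted.
 */
static void apply_user_modifier(struct excl_sketch *e, bool guest_mode)
{
	assert(e);
	e->exclude_user = false;        /* ":u" keeps user space */
	e->exclude_kernel = true;
	e->exclude_hv = true;
	e->exclude_guest = !guest_mode; /* excluded unless guests are the target */
}
```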
      
      Before:
      
        perf record -e cycles:u ./div
        perf evlist -v
        cycles:u: ..., exclude_kernel: 1, exclude_hv: 1, ...
      
      After:
      
        perf record -e cycles:u ./div
        perf evlist -v
        cycles:u: ..., exclude_kernel: 1, exclude_hv: 1, exclude_guest: 1, ...
      
      Before:
      
        perf kvm --guest record -e cycles:u -vvv
      
      perf_event_attr:
      
        size                             120
        { sample_period, sample_freq }   4000
        sample_type                      IP|TID|TIME|ID|CPU|PERIOD
        read_format                      ID
        disabled                         1
        inherit                          1
        exclude_kernel                   1
        exclude_hv                       1
        freq                             1
        sample_id_all                    1
      
      After:
      
        perf kvm --guest record -e cycles:u -vvv
      
      perf_event_attr:
      
        size                             120
        { sample_period, sample_freq }   4000
        sample_type                      IP|TID|TIME|ID|CPU|PERIOD
        read_format                      ID
        disabled                         1
        inherit                          1
        exclude_kernel                   1
        exclude_hv                       1
        freq                             1
        sample_id_all                    1
      
      For 'perf kvm' usage, exclude_guest is 0 both before and after.
      
      perf test 6
      
       6: Parse event definition strings             : Ok
      
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Tested-by: Like Xu <like.xu@linux.intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20200814012120.16647-1-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • W
      perf record: Correct the help info of option "--no-bpf-event" · a060c1f1
      Wei Li committed
      The help text for option "--no-bpf-event" wrongly reads "record
      bpf events". Correct it.
      
      Committer testing:
      
        $ perf record -h bpf
      
         Usage: perf record [<options>] [<command>]
            or: perf record [<options>] -- <command> [<options>]
      
                --clang-opt <clang options>
                                  options passed to clang when compiling BPF scriptlets
                --clang-path <clang path>
                                  clang binary to use for compiling BPF scriptlets
                --no-bpf-event    do not record bpf events
      
        $
      
      Fixes: 71184c6a ("perf record: Replace option --bpf-event with --no-bpf-event")
      Signed-off-by: Wei Li <liwei391@huawei.com>
      Acked-by: Song Liu <songliubraving@fb.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Hanjun Guo <guohanjun@huawei.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Li Bin <huawei.libin@huawei.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lore.kernel.org/lkml/20200819031947.12115-1-liwei391@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • C
      perf tools: Use %zd for size_t printf formats on 32-bit · 20befbb1
      Chris Wilson committed
      A couple of trivial fixes for using %zd for size_t in the code
      supporting the ZSTD compression library.
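For reference, a minimal sketch of printing a size_t portably. It uses %zu, the unsigned form of the same 'z' length modifier; the helper name is hypothetical and unrelated to the ZSTD code touched by the patch.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Format a byte count. The 'z' length modifier matches size_t on both
 * 32-bit and 64-bit targets, unlike "%ld" or "%lu", which match only
 * where size_t happens to be defined as unsigned long.
 */
static int format_size(char *buf, size_t len, size_t nbytes)
{
	assert(buf && len > 0);
	return snprintf(buf, len, "%zu bytes", nbytes);
}
```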
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20200820212501.24421-1-chris@chris-wilson.co.uk
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  2. 21 Aug, 2020 9 commits
  3. 20 Aug, 2020 3 commits
    • L
      Merge tag 'vfio-v5.9-rc2' of git://github.com/awilliam/linux-vfio · 7eac66d0
      Linus Torvalds committed
      Pull VFIO fixes from Alex Williamson:
      
       - Fix lockdep issue reported for recursive read-lock (Alex Williamson)
      
       - Fix missing unwind in type1 replay function (Alex Williamson)
      
      * tag 'vfio-v5.9-rc2' of git://github.com/awilliam/linux-vfio:
        vfio/type1: Add proper error unwind for vfio_iommu_replay()
        vfio-pci: Avoid recursive read-lock usage
    • A
      lib/string.c: Use freestanding environment · 33d0f96f
      Arvind Sankar committed
      gcc can transform the loop in a naive implementation of memset/memcpy
      etc into a call to the function itself.  This optimization is enabled by
      -ftree-loop-distribute-patterns.
      
      This has been the case for a while, but gcc-10.x enables this option at
      -O2 rather than -O3 as in previous versions.
      
      Add -ffreestanding, which implicitly disables this optimization with
      gcc.  It is unclear whether clang performs such optimizations, but
      hopefully it will also not do so in a freestanding environment.
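The kind of loop in question looks roughly like the sketch below. The function name is deliberately not memset; the comment describes the failure mode under the assumption that the compiler performs the pattern replacement.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Byte-at-a-time fill, the shape of loop that loop-distribution pattern
 * matching can recognise as a memset. If this function were itself named
 * memset, replacing the loop with a memset call would make it recurse.
 */
static void *naive_memset(void *dst, int c, size_t n)
{
	unsigned char *p = dst;

	assert(dst || n == 0);
	while (n--)
		*p++ = (unsigned char)c;
	return dst;
}
```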
      Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
      Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • A
      x86/boot/compressed: Use builtin mem functions for decompressor · 394b19d6
      Arvind Sankar committed
      Since commits
      
        c041b5ad ("x86, boot: Create a separate string.h file to provide standard string functions")
        fb4cac57 ("x86, boot: Move memcmp() into string.h and string.c")
      
      the decompressor stub has been using the compiler's builtin memcpy,
      memset and memcmp functions, _except_ where it would likely have the
      largest impact, in the decompression code itself.
      
      Remove the #undef's of memcpy and memset in misc.c so that the
      decompressor code also uses the compiler builtins.
      
      The rationale given in the comment doesn't really apply: just because
      some functions use the out-of-line version is no reason to not use the
      builtin version in the rest.
      
      Replace the comment with an explanation of why memzero and memmove are
      being #define'd.
      
      Drop the suggestion to #undef in boot/string.h as well: the out-of-line
      versions are not really optimized versions, they're generic code that's
      good enough for the preboot environment. The compiler will likely
      generate better code for constant-size memcpy/memset/memcmp if it is
      allowed to.
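As an illustration of the constant-size case, the sketch below loads a 32-bit value with a fixed-size memcpy. The helper name is hypothetical; whether the call is actually inlined depends on the compiler and its flags.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Load a 32-bit value from a possibly unaligned position. Because the
 * size is a compile-time constant, a compiler that treats memcpy as a
 * builtin can typically emit a plain load instead of a function call.
 */
static uint32_t load_u32(const void *src)
{
	uint32_t v;

	assert(src);
	memcpy(&v, src, sizeof(v));
	return v;
}
```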
      
      Most decompressors' performance is unchanged, with the exception of LZ4
      and 64-bit ZSTD.
      
            Before   After  ARCH
      LZ4     73ms    10ms    32
      LZ4    120ms    10ms    64
      ZSTD    90ms    74ms    64
      
      Measurements on QEMU on 2.2GHz Broadwell Xeon, using defconfig kernels.
      
      Decompressor code size has small differences, with the largest being
      that 64-bit ZSTD decreases just over 2k. The largest code size increase
      was on 64-bit XZ, of about 400 bytes.
      
      Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
      Suggested-by: Nick Terrell <nickrterrell@gmail.com>
      Tested-by: Nick Terrell <nickrterrell@gmail.com>
      Acked-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  4. 19 Aug, 2020 3 commits
  5. 18 Aug, 2020 14 commits
    • L
      Merge tag 'pstore-v5.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux · 06a4ec1d
      Linus Torvalds committed
      Pull mailmap update from Kees Cook:
       "This was originally part of my pstore tree, but when I realized that
        mailmap needed re-alphabetizing, I decided to wait until -rc1 to send
        this, as I saw a lot of mailmap additions pending in -next for the
        merge window.
      
        It's a programmatic reordering and the addition of a pstore
        contributor's preferred email address"
      
      * tag 'pstore-v5.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
        mailmap: Add WeiXiong Liao
        mailmap: Restore dictionary sorting
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 4cf75621
      Linus Torvalds committed
      Pull networking fixes from David Miller:
       "Another batch of fixes:
      
        1) Remove nft_compat counter flush optimization, it generates warnings
           from the refcount infrastructure. From Florian Westphal.
      
        2) Fix BPF to search for build id more robustly, from Jiri Olsa.
      
        3) Handle bogus getopt lengths in ebtables, from Florian Westphal.
      
        4) Infoleak and other fixes to j1939 CAN driver, from Eric Dumazet and
           Oleksij Rempel.
      
        5) Reset iter properly on mptcp sendmsg() error, from Florian
           Westphal.
      
        6) Show a saner speed in bonding broadcast mode, from Jarod Wilson.
      
        7) Various kerneldoc fixes in bonding and elsewhere, from Lee Jones.
      
        8) Fix double unregister in bonding during namespace tear down, from
           Cong Wang.
      
        9) Disable RP filter during icmp_redirect selftest, from David Ahern"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (75 commits)
        otx2_common: Use devm_kcalloc() in otx2_config_npa()
        net: qrtr: fix usage of idr in port assignment to socket
        selftests: disable rp_filter for icmp_redirect.sh
        Revert "net: xdp: pull ethernet header off packet after computing skb->protocol"
        phylink: <linux/phylink.h>: fix function prototype kernel-doc warning
        mptcp: sendmsg: reset iter on error redux
        net: devlink: Remove overzealous WARN_ON with snapshots
        tipc: not enable tipc when ipv6 works as a module
        tipc: fix uninit skb->data in tipc_nl_compat_dumpit()
        net: Fix potential wrong skb->protocol in skb_vlan_untag()
        net: xdp: pull ethernet header off packet after computing skb->protocol
        ipvlan: fix device features
        bonding: fix a potential double-unregister
        can: j1939: add rxtimer for multipacket broadcast session
        can: j1939: abort multipacket broadcast session when timeout occurs
        can: j1939: cancel rxtimer on multipacket broadcast session complete
        can: j1939: fix support for multipacket broadcast message
        net: fddi: skfp: cfm: Remove seemingly unused variable 'ID_sccs'
        net: fddi: skfp: cfm: Remove set but unused variable 'oldstate'
        net: fddi: skfp: smt: Remove seemingly unused variable 'ID_sccs'
        ...
    • X
      otx2_common: Use devm_kcalloc() in otx2_config_npa() · bf2bcd6f
      Xu Wang committed
      The size of this memory allocation is computed by a multiplication,
      which indicates that an array is being allocated.
      Thus use the corresponding function "devm_kcalloc".
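A userspace analogue of the same idea, assuming standard calloc() in place of the device-managed kernel allocator; the helper name is hypothetical:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Allocate a zeroed array of n elements of the given size, refusing
 * requests whose byte count would overflow size_t. Returns NULL on
 * overflow or allocation failure; release the result with free().
 */
static void *alloc_array_zeroed(size_t n, size_t size)
{
	assert(size > 0); /* an element size of zero is a caller bug */
	if (n > SIZE_MAX / size)
		return NULL;
	return calloc(n, size);
}
```

Passing the count and the element size separately lets the allocator check the multiplication itself, which an open-coded n * size passed to a plain allocation call does not.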
      Signed-off-by: Xu Wang <vulab@iscas.ac.cn>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • C
      PCI/P2PDMA: Fix build without DMA ops · 7c2308f7
      Christoph Hellwig committed
      My commit to make DMA ops support optional missed the reference in
      the p2pdma code.  And while the build bot didn't manage to find a config
      where this can happen, Matthew did.  Fix this by replacing two IS_ENABLED
      checks with ifdefs.
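The difference between the two kinds of check can be sketched as follows, using a made-up option name and struct. With a plain if () on a constant condition the body must still compile, so a reference to a member that was configured out breaks the build, while #ifdef removes the reference before compilation.

```c
#include <assert.h>
#include <stddef.h>

/* CONFIG_SKETCH_DMA_OPS is a made-up option; leave it undefined to model
 * a configuration built without DMA ops. */
struct device_sketch {
	int id;
#ifdef CONFIG_SKETCH_DMA_OPS
	const void *dma_ops; /* member exists only when the option is set */
#endif
};

static int device_has_dma_ops(const struct device_sketch *dev)
{
	assert(dev);
#ifdef CONFIG_SKETCH_DMA_OPS
	/* Compiled only when the member exists. */
	return dev->dma_ops != NULL;
#else
	/*
	 * An ordinary if () on a constant-false condition would still have
	 * to compile dev->dma_ops here, and the member is gone.
	 */
	return 0;
#endif
}
```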
      
      Fixes: 2f9237d4 ("dma-mapping: make support for dma ops optional")
      Link: https://lore.kernel.org/r/20200810124843.1532738-1-hch@lst.de
      Reported-by: Matthew Wilcox <willy@infradead.org>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
      7c2308f7
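Why an `#ifdef` can succeed where a runtime-style config check does not can be sketched as follows. The sketch assumes the build failure came from code referring to something that exists only when the option is enabled; `CONFIG_DEMO_OPS`, `struct device_demo` and `has_ops` are hypothetical names, not the p2pdma code.

```c
#include <stddef.h>

/* CONFIG_DEMO_OPS is a hypothetical option; it is left undefined here. */

struct device_demo {
	int id;
#ifdef CONFIG_DEMO_OPS
	const void *ops;	/* member exists only when the option is set */
#endif
};

static int has_ops(const struct device_demo *dev)
{
#ifdef CONFIG_DEMO_OPS
	/* Compiled only when the member above exists. */
	return dev->ops != NULL;
#else
	/* A plain if (0) around dev->ops would still have to compile. */
	(void)dev;
	return 0;
#endif
}
```

A check written as an ordinary C condition is removed by the optimizer only after the compiler has parsed it, so every name inside it must still be declared. The preprocessor guard hides the reference entirely.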
    • N
      net: qrtr: fix usage of idr in port assignment to socket · 8dfddfb7
      Necip Fazil Yildiran authored
      Passing large uint32 sockaddr_qrtr.port numbers for port allocation
      triggers a warning within idr_alloc() since the port number is cast
      to int, and thus interpreted as a negative number. This leads to
      the rejection of such valid port numbers in qrtr_port_assign() as
      idr_alloc() fails.
      
      To avoid the problem, switch to idr_alloc_u32() instead.
      
      Fixes: bdabad3e ("net: Add Qualcomm IPC router")
      Reported-by: syzbot+f31428628ef672716ea8@syzkaller.appspotmail.com
      Signed-off-by: Necip Fazil Yildiran <necip@google.com>
      Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8dfddfb7
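The range problem described above can be made concrete with a small userspace check. This is only a sketch of the arithmetic, not the qrtr code; `port_fits_in_int` is a hypothetical helper.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * True when a 32-bit unsigned port number can be passed through an
 * int parameter without turning into a negative value.
 */
static bool port_fits_in_int(uint32_t port)
{
	return port <= (uint32_t)INT_MAX;
}
```

Any port above `INT_MAX` fails this check, which is the set of valid ports that an int-based allocation interface cannot express and a u32-based one can.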
    • D
      selftests: disable rp_filter for icmp_redirect.sh · bcf7ddb0
      David Ahern authored
      h1 is initially configured to reach h2 via r1 rather than the
      more direct path through r2. If rp_filter is set and inherited
      for r2, forwarding fails since the source address of h1 is
      reachable from eth0 vs the packet coming to it via r1 and eth1.
      Since rp_filter setting affects the test, explicitly reset it.
      Signed-off-by: David Ahern <dsahern@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      bcf7ddb0
    • K
      mailmap: Add WeiXiong Liao · 5a4fe062
      Kees Cook authored
      WeiXiong Liao noted to me offlist that his preference for email address
      had changed and that he'd like it updated in the mailmap so people
      discussing pstore/blk would be able to reach him.
      
      Cc: WeiXiong Liao <gmpy.liaowx@gmail.com>
      Signed-off-by: Kees Cook <keescook@chromium.org>
      5a4fe062
    • K
      mailmap: Restore dictionary sorting · d6bd5201
      Kees Cook authored
      Several names had been recently appended (instead of inserted). While
      git-shortlog doesn't need this file to be sorted, it helps humans to
      keep it organized this way. Sort the entire file (which includes some
      minor shuffling for dictionary order).
      
      Done with the following commands:
      
      	grep -E '^(#|$)' .mailmap > .mailmap.head
      	grep -Ev '^(#|$)' .mailmap > .mailmap.body
      	sort -f .mailmap.body > .mailmap.body.sort
      	cat .mailmap.head .mailmap.body.sort > .mailmap
      	rm .mailmap.head .mailmap.body.sort
      Signed-off-by: Kees Cook <keescook@chromium.org>
      d6bd5201
    • J
      arch/ia64: Restore arch-specific pgd_offset_k implementation · bd05220c
      Jessica Clarke authored
      IA-64 is special and treats pgd_offset_k() differently to pgd_offset(),
      using different formulae to calculate the indices into the kernel and user
      PGDs.  The index into the user PGDs takes into account the region number,
      but the index into the kernel (init_mm) PGD always assumes a predefined
      kernel region number. Commit 974b9b2c ("mm: consolidate pte_index() and
      pte_offset_*() definitions") made IA-64 use a generic pgd_offset_k() which
      incorrectly used pgd_index() for kernel page tables.  As a result, the
      index into the kernel PGD was going out of bounds and the kernel hung
      during early boot.
      
      Allow overrides of pgd_offset_k() and override it on IA-64 with the old
      implementation that will correctly index the kernel PGD.
      
      Fixes: 974b9b2c ("mm: consolidate pte_index() and pte_offset_*() definitions")
      Reported-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
      Signed-off-by: Jessica Clarke <jrtc27@jrtc27.com>
      Tested-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
      Acked-by: Tony Luck <tony.luck@intel.com>
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      bd05220c
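Allowing an architecture to override a generic helper is commonly done with a preprocessor guard. The sketch below shows that idiom under stated assumptions: `demo_index` and both formulae are made up for illustration and are not the IA-64 page-table code.

```c
/* "Arch header" (hypothetical): supply a special-case helper ... */
static inline unsigned int demo_index(unsigned long addr)
{
	return (unsigned int)((addr >> 4) & 0x3u);	/* made-up formula */
}
#define demo_index demo_index	/* ... and announce that it exists */

/* "Generic header": provide the default only if nobody overrode it. */
#ifndef demo_index
static inline unsigned int demo_index(unsigned long addr)
{
	return (unsigned int)(addr & 0xfu);	/* made-up default */
}
#endif
```

Because the arch definition comes first and defines a macro of the same name, the generic fallback is skipped and callers get the arch-specific formula.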
    • D
      Revert "net: xdp: pull ethernet header off packet after computing skb->protocol" · 7f9bf6e8
      David S. Miller authored
      This reverts commit f8414a8d.
      
      eth_type_trans() does the necessary pull on the skb.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7f9bf6e8
    • R
      phylink: <linux/phylink.h>: fix function prototype kernel-doc warning · 0b76e642
      Randy Dunlap authored
      Fix a kernel-doc warning for the pcs_config() function prototype:
      
      ../include/linux/phylink.h:406: warning: Excess function parameter 'permit_pause_to_mac' description in 'pcs_config'
      
      Fixes: 7137e18f ("net: phylink: add struct phylink_pcs")
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: netdev@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0b76e642
    • A
      vfio/type1: Add proper error unwind for vfio_iommu_replay() · aae7a75a
      Alex Williamson authored
      The vfio_iommu_replay() function does not currently unwind on error,
      yet it does pin pages, perform IOMMU mapping, and modify the vfio_dma
      structure to indicate IOMMU mapping.  The IOMMU mappings are torn down
      when the domain is destroyed, but the other actions go on to cause
      trouble later.  For example, the iommu->domain_list can be empty if we
      only have a non-IOMMU backed mdev attached.  We don't currently check
      if the list is empty before getting the first entry in the list, which
      leads to a bogus domain pointer.  If a vfio_dma entry is erroneously
      marked as iommu_mapped, we'll attempt to use that bogus pointer to
      retrieve the existing physical page addresses.
      
      This is the scenario that uncovered this issue, attempting to hot-add
      a vfio-pci device to a container with an existing mdev device and DMA
      mappings, one of which could not be pinned, causing a failure adding
      the new group to the existing container and setting the conditions
      for a subsequent attempt to explode.
      
      To resolve this, we can first check if the domain_list is empty so
      that we can reject replay of a bogus domain, should we ever encounter
      this inconsistent state again in the future.  The real fix though is
      to add the necessary unwind support, which means cleaning up the
      current pinning if an IOMMU mapping fails, then walking back through
      the r-b tree of DMA entries, reading from the IOMMU which ranges are
      mapped, and unmapping and unpinning those ranges.  To be able to do
      this, we also defer marking the DMA entry as IOMMU mapped until all
      entries are processed, in order to allow the unwind to know the
      disposition of each entry.
      
      Fixes: a54eb550 ("vfio iommu type1: Add support for mediated devices")
      Reported-by: Zhiyi Guo <zhguo@redhat.com>
      Tested-by: Zhiyi Guo <zhguo@redhat.com>
      Reviewed-by: Cornelia Huck <cohuck@redhat.com>
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
      aae7a75a
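The unwind strategy described above (undo the current step on failure, walk back over earlier entries, and defer the "mapped" marking until everything has succeeded) can be sketched in plain C. This is a simplified model over an array, not the vfio code: `struct demo_entry`, `demo_replay` and the `fail_map` flag are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

struct demo_entry {
	bool pinned;	/* resource currently held for this entry */
	bool mapped;	/* set only once the whole replay succeeded */
	bool fail_map;	/* test knob: simulate a mapping failure */
};

static int demo_replay(struct demo_entry *e, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		e[i].pinned = true;		/* "pin" the entry */
		if (e[i].fail_map) {
			e[i].pinned = false;	/* clean up the current pin */
			goto unwind;
		}
	}

	/* Deferred commit: mark entries only after all of them worked. */
	for (i = 0; i < n; i++)
		e[i].mapped = true;
	return 0;

unwind:
	/* Walk back over the entries processed before the failure. */
	while (i-- > 0)
		e[i].pinned = false;
	return -1;
}
```

Deferring the `mapped` flag means a failed replay leaves no entry claiming a mapping that was rolled back.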
    • A
      vfio-pci: Avoid recursive read-lock usage · bc93b9ae
      Alex Williamson authored
      A down_read on memory_lock is held when performing read/write accesses
      to MMIO BAR space, including across the copy_to/from_user() callouts
      which may fault.  If the user buffer for these copies resides in an
      mmap of device MMIO space, the mmap fault handler will acquire a
      recursive read-lock on memory_lock.  Avoid this by reducing the lock
      granularity.  Sequential accesses requiring multiple ioread/iowrite
      cycles are expected to be rare, therefore typical accesses should not
      see additional overhead.
      
      VGA MMIO accesses are expected to be non-fatal regardless of the PCI
      memory enable bit to allow legacy probing; this behavior remains, with
      a comment added.  ioeventfds are now included in memory access testing,
      with writes dropped while memory space is disabled.
      
      Fixes: abafbc55 ("vfio-pci: Invalidate mmaps and block MMIO access on disabled memory")
      Reported-by: Zhiyi Guo <zhguo@redhat.com>
      Tested-by: Zhiyi Guo <zhguo@redhat.com>
      Reviewed-by: Cornelia Huck <cohuck@redhat.com>
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
      bc93b9ae
    • D
      watch_queue: Limit the number of watches a user can hold · 29e44f45
      David Howells authored
      Impose a limit on the number of watches that a user can hold so that
      they can't use this mechanism to fill up all the available memory.
      
      This is done by putting a counter in user_struct that's incremented when
      a watch is allocated and decreased when it is released.  If the number
      exceeds the RLIMIT_NOFILE limit, the watch is rejected with EAGAIN.
      
      This can be tested by the following means:
      
       (1) Create a watch queue and attach it to fd 5 in the program given - in
           this case, bash:
      
      	keyctl watch_session /tmp/nlog /tmp/gclog 5 bash
      
       (2) In the shell, set the maximum number of files to, say, 99:
      
      	ulimit -n 99
      
       (3) Add 200 keyrings:
      
      	for ((i=0; i<200; i++)); do keyctl newring a$i @s || break; done
      
       (4) Try to watch all of the keyrings:
      
      	for ((i=0; i<200; i++)); do echo $i; keyctl watch_add 5 %:a$i || break; done
      
           This should fail when the number of watches belonging to the user hits
           99.
      
       (5) Remove all the keyrings and all of those watches should go away:
      
      	for ((i=0; i<200; i++)); do keyctl unlink %:a$i; done
      
       (6) Kill off the watch queue by exiting the shell spawned by
           watch_session.
      
      Fixes: c73be61c ("pipe: Add general notification queue support")
      Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      29e44f45
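The accounting scheme described above, a per-user counter checked against a limit, can be sketched in userspace C. This is a simplified, non-atomic model rather than the kernel implementation: `struct demo_user`, `demo_watch_charge` and `demo_watch_uncharge` are hypothetical names, and the limit is a plain field standing in for RLIMIT_NOFILE.

```c
#include <errno.h>

struct demo_user {
	unsigned long nr_watches;	/* watches currently held */
	unsigned long limit;		/* stand-in for RLIMIT_NOFILE */
};

/* Account for one new watch, or refuse it once the limit is reached. */
static int demo_watch_charge(struct demo_user *u)
{
	if (u->nr_watches >= u->limit)
		return -EAGAIN;
	u->nr_watches++;
	return 0;
}

/* Release the accounting for one watch. */
static void demo_watch_uncharge(struct demo_user *u)
{
	if (u->nr_watches > 0)
		u->nr_watches--;
}
```

With a limit of 99, as in the test steps above, the first 99 charges succeed and the next one is refused until a watch is released.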
  6. 17 Aug, 2020 1 commit