    perf/x86/amd/ibs: Fix sample bias for dispatched micro-ops · 7ec11cad
    Kim Phillips committed
    [ Upstream commit 0f4cd769c410e2285a4e9873a684d90423f03090 ]
    
    When counting dispatched micro-ops with cnt_ctl=1, in order to prevent
    sample bias, IBS hardware preloads the least significant 7 bits of
    current count (IbsOpCurCnt) with random values, such that, after the
    interrupt is handled and counting resumes, the next sample taken
    will be slightly perturbed.
    
    The current count bitfield is in the IBS execution control h/w register,
    alongside the maximum count field.
    
    Currently, the IBS driver writes that register with the maximum count,
    leaving zeroes to fill the current count field, thereby overwriting
    the random bits the hardware preloaded for itself.
    
    Fix the driver to actually retain and carry those random bits from the
    read of the IBS control register, through to its write, instead of
    overwriting the lower current count bits with zeroes.
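
    As a rough illustration of the idea (not the actual driver change), the
    sketch below builds the next control register value from the value that
    was read back, carrying over the low 7 bits of the current count when
    cnt_ctl is set. The SKETCH_* names and the helper are hypothetical, and
    the bit positions (maximum count in bits 15:0, cnt_ctl at bit 19, current
    count starting at bit 32) are assumptions made for this example.

```c
#include <stdint.h>

/* Assumed register layout, for illustration only. */
#define SKETCH_OP_MAX_CNT       0x0000FFFFULL   /* maximum count field */
#define SKETCH_OP_CNT_CTL       (1ULL << 19)    /* 1 = count dispatched micro-ops */
#define SKETCH_OP_CUR_CNT_RAND  (0x7FULL << 32) /* low 7 bits of current count */

/*
 * Hypothetical helper: compute the value to write back after an interrupt.
 * read_val is the control register value read in the handler; max_cnt is
 * the maximum count the driver wants to program for the next sample.
 */
static uint64_t sketch_next_op_ctl(uint64_t read_val, uint64_t max_cnt)
{
    uint64_t new_val = max_cnt & SKETCH_OP_MAX_CNT;

    /* Keep the counting mode that was in effect. */
    new_val |= read_val & SKETCH_OP_CNT_CTL;

    /*
     * When counting dispatched micro-ops, retain the randomized low bits
     * of the current count instead of writing zeroes over them. The rest
     * of the current count field is left at zero.
     */
    if (read_val & SKETCH_OP_CNT_CTL)
        new_val |= read_val & SKETCH_OP_CUR_CNT_RAND;

    return new_val;
}
```

    With cnt_ctl clear, the helper writes only the maximum count, which
    matches the behaviour described above for the unfixed driver.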
    
    Tested with:
    
    perf record -c 100001 -e ibs_op/cnt_ctl=1/pp -a -C 0 taskset -c 0 <workload>
    
    'perf annotate' output before:
    
     15.70  65:   addsd     %xmm0,%xmm1
     17.30        add       $0x1,%rax
     15.88        cmp       %rdx,%rax
                  je        82
     17.32  72:   test      $0x1,%al
                  jne       7c
      7.52        movapd    %xmm1,%xmm0
      5.90        jmp       65
      8.23  7c:   sqrtsd    %xmm1,%xmm0
     12.15        jmp       65
    
    'perf annotate' output after:
    
     16.63  65:   addsd     %xmm0,%xmm1
     16.82        add       $0x1,%rax
     16.81        cmp       %rdx,%rax
                  je        82
     16.69  72:   test      $0x1,%al
                  jne       7c
      8.30        movapd    %xmm1,%xmm0
      8.13        jmp       65
      8.24  7c:   sqrtsd    %xmm1,%xmm0
      8.39        jmp       65
    
    Tested on Family 15h and 17h machines.
    
    Machines prior to family 10h Rev. C don't have the RDWROPCNT capability,
    and have the IbsOpCurCnt bitfield reserved, so this patch shouldn't
    affect their operation.
    
    It is unknown why commit db98c5fa ("perf/x86: Implement 64-bit
    counter support for IBS") ignored the lower 4 bits of the IbsOpCurCnt
    field; the number of preloaded random bits has always been 7, AFAICT.
    
    Signed-off-by: Kim Phillips <kim.phillips@amd.com>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Cc: "Arnaldo Carvalho de Melo" <acme@kernel.org>
    Cc: <x86@kernel.org>
    Cc: Ingo Molnar <mingo@kernel.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Jiri Olsa <jolsa@redhat.com>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: "Borislav Petkov" <bp@alien8.de>
    Cc: Stephane Eranian <eranian@google.com>
    Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
    Cc: "Namhyung Kim" <namhyung@kernel.org>
    Cc: "H. Peter Anvin" <hpa@zytor.com>
    Link: https://lkml.kernel.org/r/20190826195730.30614-1-kim.phillips@amd.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>