Commits · 3408fef7448ce7d3c926978ee1a511e7707bffba · gsplhtlxg / clone-Linux

19 Aug, 2016 9 commits

Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 3408fef7

Committed by Linus Torvalds on Aug 18, 2016

Pull x86 fixes from Ingo Molnar:
 "An initrd microcode loading fix, and an SMP bootup topology setup fix
  to resolve crashes on SGI/UV systems if the BIOS is configured in a
  certain way"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/smp: Fix __max_logical_packages value setup
  x86/microcode/AMD: Fix initrd loading with CONFIG_RANDOMIZE_MEMORY=y

3408fef7

Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · b061b4f3

Committed by Linus Torvalds on Aug 18, 2016

Pull timer fixes from Ingo Molnar:
 "Three clocksource driver fixes"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  clocksource/drivers/mips-gic-timer: Make gic_clocksource_of_init() return int
  clocksource/drivers/kona: Fix get_counter() error handling
  clocksource/drivers/time-armada-370-xp: Fix the clock reference

b061b4f3

Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · ac78bc71

Committed by Linus Torvalds on Aug 18, 2016

Pull scheduler fixes from Ingo Molnar:
 "Two cputime fixes - hopefully the last ones"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/cputime: Resync steal time when guest & host lose sync
  sched/cputime: Fix NO_HZ_FULL getrusage() monotonicity regression

ac78bc71

Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 0dcb7b6f

Committed by Linus Torvalds on Aug 18, 2016

Pull perf fixes from Ingo Molnar:
 "Mostly tooling fixes, but also start/stop filter related fixes, a perf
  event read() fix, a fix uncovered by fuzzing, and an uprobes leak fix"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/core: Check return value of the perf_event_read() IPI
  perf/core: Enable mapping of the stop filters
  perf/core: Update filters only on executable mmap
  perf/core: Fix file name handling for start/stop filters
  perf/core: Fix event_function_local()
  uprobes: Fix the memcg accounting
  perf intel-pt: Fix occasional decoding errors when tracing system-wide
  tools: Sync kvm related header files for arm64 and s390
  perf probe: Release resources on error when handling exit paths
  perf probe: Check for dup and fdopen failures
  perf symbols: Fix annotation of objects with debuginfo files
  perf script: Don't disable use_callchain if input is pipe
  perf script: Show proper message when failed list scripts
  perf jitdump: Add the right header to get the major()/minor() definitions
  perf ppc64le: Fix build failure when libelf is not present
  perf tools mem: Fix -t store option for record command
  perf intel-pt: Fix ip compression

0dcb7b6f

Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · bd3fd451

Committed by Linus Torvalds on Aug 18, 2016

Pull locking fixes from Ingo Molnar:
 "Two lockless_dereference() related fixes"

* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/barriers: Suppress sparse warnings in lockless_dereference()
  Revert "drm/fb-helper: Reduce READ_ONCE(master) to lockless_dereference"

bd3fd451

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · f28535c1

Committed by Linus Torvalds on Aug 18, 2016

Pull arm64 fixes from Catalin Marinas:

 - Avoid a literal load with the MMU off on the CPU resume path
   (potential inconsistency between cache and RAM)

 - Build error with CONFIG_ACPI=n fixed

 - Compiler warning in the arch/arm64/mm/dump.c code fixed

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: Fix shift warning in arch/arm64/mm/dump.c
  arm64: kernel: avoid literal load of virtual address with MMU off
  arm64: Fix NUMA build error when !CONFIG_ACPI

f28535c1

Merge branch 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm · 114e3bae

Committed by Linus Torvalds on Aug 18, 2016

Pull ARM fixes from Russell King:
 "Only three fixes this time:

   - Emil found an overflow problem with the memory layout sanity check.

   - Ard Biesheuvel noticed that late-allocated page tables (for EFI)
     weren't being properly constructed.

   - Guenter Roeck reported a problem found on qemu caused by the recent
     addr_limit changes"

* 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm:
  ARM: fix address limit restoration for undefined instructions
  ARM: 8591/1: mm: use fully constructed struct pages for EFI pgd allocations
  ARM: 8590/1: sanity_check_meminfo(): avoid overflow on vmalloc_limit

114e3bae

Merge tag 'pm-4.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 395c4342

Committed by Linus Torvalds on Aug 18, 2016

Pull power management fixes from Rafael Wysocki:
 "More hibernation-related material: one fix for a recent regression in
  the core, one small cleanup of the x86-64 resume code and a
  documentation update.

  Specifics:

   - Fix a hibernate core regression resulting from uncovering a latent
     bug in its implementation of memory bitmaps by a recent commit
     (James Morse).

   - Use __pa() to compute a physical address in the x86-64 code
     finalizing resume from hibernation (Rafael Wysocki).

   - Update power management documentation related to system sleep
     states to remove outdated information from it and to add a
     description of a recently introduced hibernation debug feature to
     it (Rafael Wysocki)"

* tag 'pm-4.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM / hibernate: Fix rtree_next_node() to avoid walking off list ends
  x86/power/64: Use __pa() for physical address computation
  PM / sleep: Update some system sleep documentation

395c4342

Merge tag 'drm-fixes-for-4.8-rc3' of git://people.freedesktop.org/~airlied/linux · 76dcd939

Committed by Linus Torvalds on Aug 18, 2016

Pull drm fixes from Dave Airlie:
 "Pretty quiet so far:

   - a few amdgpu/radeon fixups for pcie pm changes
   - a couple of amdgpu fixes
   - some build fixes
   - printk fix"

* tag 'drm-fixes-for-4.8-rc3' of git://people.freedesktop.org/~airlied/linux:
  drm/amdgpu: Change GART offset to 64-bit
  drm/mediatek: add ARM_SMCCC dependency
  drm/mediatek: add CONFIG_OF dependency
  drm/mediatek: add COMMON_CLK dependency
  drm/amdgpu: Fix memory trashing if UVD ring test fails
  drm/amdgpu: fix vm init error path
  drm/amdkfd: print doorbell offset as a hex value
  Revert "drm/radeon: work around lack of upstream ACPI support for D3cold"
  Revert "drm/amdgpu: work around lack of upstream ACPI support for D3cold"

76dcd939

18 Aug, 2016 31 commits

locking/barriers: Suppress sparse warnings in lockless_dereference() · 112dc0c8

Committed by Johannes Berg on Aug 11, 2016

After Peter's commit:

  331b6d8c ("locking/barriers: Validate lockless_dereference() is used on a pointer type")

... we get a lot of sparse warnings (one for every rcu_dereference, and more)
since the expression here is assigning to the wrong address space.

Instead of validating that 'p' is a pointer this way, make
it fail compilation when it's not by using sizeof(*(p)). This will
not cause any sparse warnings (tested, likely since the address
space is irrelevant for sizeof), and will fail compilation when
'p' isn't a pointer type.

Tested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 331b6d8c ("locking/barriers: Validate lockless_dereference() is used on a pointer type")
Link: http://lkml.kernel.org/r/1470909022-687-2-git-send-email-johannes@sipsolutions.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>

112dc0c8

Revert "drm/fb-helper: Reduce READ_ONCE(master) to lockless_dereference" · f17b3ea3

Committed by Johannes Berg on Aug 11, 2016

This reverts commit:

  fa7d81bb ("drm/fb-helper: Reduce READ_ONCE(master) to lockless_dereference")

As Peter explained:

  [...] lockless_dereference() is _stronger_ than READ_ONCE(), not weaker.

  [...]

  Also, clue is in the name: 'dereference', you don't actually dereference
  the pointer here, only load it.

My next patch breaks the compile without this revert, because it assumes
you want to dereference and thus also need the struct type visible (which
it isn't here), so revert it.

Tested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1470909022-687-1-git-send-email-johannes@sipsolutions.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>

f17b3ea3

arm64: Fix shift warning in arch/arm64/mm/dump.c · a93a4d62

Committed by Catalin Marinas on Dec 05, 2014

When building with 48-bit VAs and 16K page configuration, it's possible
to get the following warning when building the arm64 page table dumping
code:

arch/arm64/mm/dump.c: In function ‘walk_pud’:
arch/arm64/mm/dump.c:274:102: warning: right shift count >= width of type [-Wshift-count-overflow]

This is because pud_offset(pgd, 0) performs a shift to the right by 36
while the value 0 has the type 'int' by default, therefore 32-bit.

This patch modifies all the p*_offset() uses in arch/arm64/mm/dump.c to
use 0UL for the address argument.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

a93a4d62

sched/cputime: Resync steal time when guest & host lose sync · 03cbc732

Committed by Wanpeng Li on Aug 17, 2016

Commit:

  57430218 ("sched/cputime: Count actually elapsed irq & softirq time")

... fixed a bug but also triggered a regression:

On an i5 laptop with 4 pCPUs and 4 vCPUs for one full dynticks guest, four
CPU hog processes (for loops) run in the guest. I hot-unplug the pCPUs
on the host one by one until only one is left, then observe CPU utilization
via 'top' in the guest. It shows:

  100% st for cpu0(housekeeping)
   75% st for other CPUs (nohz full mode)

However, w/o this commit it shows the correct 75% for all four CPUs.

When a guest is interrupted for a longer amount of time, missed clock ticks
are not redelivered later. Because of that, we should not limit the amount
of steal time accounted to the amount of time that the calling functions
think has passed.

However, the interval returned by account_other_time() is NOT rounded down
to the nearest jiffy, while the base interval in get_vtime_delta() it is
subtracted from is, so the max cputime limit is required to avoid underflow.

This patch fixes the regression by limiting the account_other_time() from
get_vtime_delta() to avoid underflow, and lets the other three call sites
(in account_other_time() and steal_account_process_time()) account however
much steal time the host told us elapsed.

Suggested-by: Rik van Riel <riel@redhat.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Radim Krcmar <rkrcmar@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kvm@vger.kernel.org
Link: http://lkml.kernel.org/r/1471399546-4069-1-git-send-email-wanpeng.li@hotmail.com
[ Improved the changelog. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>

03cbc732
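
The clamping argument in the steal-time commit above (03cbc732) can be illustrated with a small Python model. This is only a sketch under stated assumptions, not the kernel code: the function names, the 4000 us jiffy length, and the sample values are hypothetical. It shows why a steal value that is not rounded to jiffies has to be capped before it is subtracted from a jiffy-rounded interval, while the other call sites can account the full value the host reported.

```python
JIFFY_US = 4000  # hypothetical tick length, for illustration only


def round_down_to_jiffy(us):
    """Round an interval down to a whole number of jiffies."""
    return (us // JIFFY_US) * JIFFY_US


def vtime_delta(elapsed_us, steal_us):
    """Model of a caller that subtracts steal time from a rounded interval.

    The steal value is capped at the rounded interval, so the result can
    never drop below zero (the underflow the commit message warns about).
    """
    base = round_down_to_jiffy(elapsed_us)
    other = min(steal_us, base)  # the max-cputime limit
    return base - other


def steal_accounted_elsewhere(steal_us):
    """Model of the remaining call sites: account all reported steal time."""
    return steal_us
```

With 9000 us elapsed and 8500 us of reported steal time, the rounded base is 8000 us; without the cap the subtraction would go negative, with it the result is zero.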

sched/cputime: Fix NO_HZ_FULL getrusage() monotonicity regression · 173be9a1

Committed by Peter Zijlstra on Aug 15, 2016

Mike reports:

 Roughly 10% of the time, ltp testcase getrusage04 fails:
 getrusage04    0  TINFO  :  Expected timers granularity is 4000 us
 getrusage04    0  TINFO  :  Using 1 as multiply factor for max [us]time increment (1000+4000us)!
 getrusage04    0  TINFO  :  utime:           0us; stime:         179us
 getrusage04    0  TINFO  :  utime:        3751us; stime:           0us
 getrusage04    1  TFAIL  :  getrusage04.c:133: stime increased > 5000us:

And tracked it down to the case where the task simply doesn't get
_any_ [us]time ticks.

Update the code to assume all rtime is utime when we lack information,
thus ensuring a task that elides the tick gets time accounted.

Reported-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Tested-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Fredrik Markstrom <fredrik.markstrom@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Radim <rkrcmar@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Cc: stable@vger.kernel.org # 4.3+
Fixes: 9d7fb042 ("sched/cputime: Guarantee stime + utime == rtime")
Signed-off-by: Ingo Molnar <mingo@kernel.org>

173be9a1
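
The fallback described in the getrusage() commit above (173be9a1) can be sketched as a simplified Python model. It is an illustration only: `split_runtime` and its parameters are hypothetical names, and the real kernel code also enforces monotonicity, which this sketch leaves out. The idea is that total runtime (rtime) is split in proportion to the sampled user and system ticks, and when no ticks were sampled at all, everything is attributed to user time.

```python
def split_runtime(rtime_us, utime_ticks, stime_ticks):
    """Split rtime into (utime, stime) in proportion to sampled ticks.

    With no tick samples there is no ratio to apply, so all runtime is
    treated as user time. The two parts always add up to rtime.
    """
    total_ticks = utime_ticks + stime_ticks
    if total_ticks == 0:
        return rtime_us, 0
    stime_us = rtime_us * stime_ticks // total_ticks
    return rtime_us - stime_us, stime_us
```

A task that ran for 3751 us without receiving any tick is reported as 3751 us of user time and 0 us of system time in this model.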

perf/core: Check return value of the perf_event_read() IPI · 71e7bc2b

Committed by David Carrillo-Cisneros on Aug 17, 2016

The call to smp_call_function_single() in perf_event_read() may fail if
an invalid or offline CPU index is passed. Warn the user if such a bug is
present and return an error.

Signed-off-by: David Carrillo-Cisneros <davidcc@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vegard Nossum <vegard.nossum@gmail.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1471467307-61171-2-git-send-email-davidcc@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

71e7bc2b

perf/core: Enable mapping of the stop filters · 99f5bc9b

Committed by Mathieu Poirier on Jul 18, 2016

At this time the perf_addr_filter_needs_mmap() function will _not_
return true on a user space 'stop' filter.  But stop filters need
exactly the same kind of mapping that range and start filters get.

Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1468860187-318-4-git-send-email-mathieu.poirier@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>

99f5bc9b

perf/core: Update filters only on executable mmap · 12b40a23

Committed by Mathieu Poirier on Jul 18, 2016

Function perf_event_mmap() is called by the MM subsystem each time
part of a binary is loaded in memory.  There can be several mappings
for a binary, many times unrelated to the code section.

Each time a section of a binary is mapped, address filters are
updated, even when the map doesn't pertain to the code section.
The end result is that filters are configured based on the last map
event that was received rather than the last mapping of the code
segment.

For example, suppose we have an executable 'main' that calls library
'libcstest.so.1.0', and we want to collect traces on code
that is in that library.  The perf command line for this scenario
would be:

  perf record -e cs_etm// --filter 'filter 0x72c/0x40@/opt/lib/libcstest.so.1.0' --per-thread ./main

Resulting in binaries being mapped this way:

  root@linaro-nano:~# cat /proc/1950/maps
  00400000-00401000 r-xp 00000000 08:02 33169     /home/linaro/main
  00410000-00411000 r--p 00000000 08:02 33169     /home/linaro/main
  00411000-00412000 rw-p 00001000 08:02 33169     /home/linaro/main
  7fa2464000-7fa2474000 rw-p 00000000 00:00 0
  7fa2474000-7fa25a4000 r-xp 00000000 08:02 543   /lib/aarch64-linux-gnu/libc-2.21.so
  7fa25a4000-7fa25b3000 ---p 00130000 08:02 543   /lib/aarch64-linux-gnu/libc-2.21.so
  7fa25b3000-7fa25b7000 r--p 0012f000 08:02 543   /lib/aarch64-linux-gnu/libc-2.21.so
  7fa25b7000-7fa25b9000 rw-p 00133000 08:02 543   /lib/aarch64-linux-gnu/libc-2.21.so
  7fa25b9000-7fa25bd000 rw-p 00000000 00:00 0
  7fa25bd000-7fa25be000 r-xp 00000000 08:02 38308 /opt/lib/libcstest.so.1.0
  7fa25be000-7fa25cd000 ---p 00001000 08:02 38308 /opt/lib/libcstest.so.1.0
  7fa25cd000-7fa25ce000 r--p 00000000 08:02 38308 /opt/lib/libcstest.so.1.0
  7fa25ce000-7fa25cf000 rw-p 00001000 08:02 38308 /opt/lib/libcstest.so.1.0
  7fa25cf000-7fa25eb000 r-xp 00000000 08:02 574   /lib/aarch64-linux-gnu/ld-2.21.so
  7fa25ef000-7fa25f2000 rw-p 00000000 00:00 0
  7fa25f7000-7fa25f9000 rw-p 00000000 00:00 0
  7fa25f9000-7fa25fa000 r--p 00000000 00:00 0     [vvar]
  7fa25fa000-7fa25fb000 r-xp 00000000 00:00 0     [vdso]
  7fa25fb000-7fa25fc000 r--p 0001c000 08:02 574   /lib/aarch64-linux-gnu/ld-2.21.so
  7fa25fc000-7fa25fe000 rw-p 0001d000 08:02 574   /lib/aarch64-linux-gnu/ld-2.21.so
  7ff2ea8000-7ff2ec9000 rw-p 00000000 00:00 0     [stack]
  root@linaro-nano:~#

Before 'main()' can execute, 'libcstest.so.1.0' has to be loaded in
memory.  Once that has been done perf_event_mmap() has been called
4 times, with the last map starting at address 0x7fa25ce000 and
the address filter configured to start filtering when the
IP has passed over address 0x7fa25ce72c (0x7fa25ce000 + 0x72c).

But that is wrong since the code segment for library 'libcstest.so.1.0'
has been mapped at 0x7fa25bd000, resulting in traces not being
collected.

This patch corrects the situation by requesting that address
filters be updated only if the mapped event is for a code
segment.

Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1468860187-318-3-git-send-email-mathieu.poirier@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>

12b40a23
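
The address arithmetic in the executable-mmap commit above (12b40a23) can be replayed with a short Python sketch. It is an illustrative model, not the perf implementation: `filter_start` is a hypothetical helper that walks the four library mappings from the commit message in order, as if each one were an mmap event, and lets the last accepted mapping decide the filter address. Accepting every mapping reproduces the wrong address; accepting only executable mappings yields the address inside the code segment.

```python
LIB = "/opt/lib/libcstest.so.1.0"

# The four mappings of the library, copied from the commit message.
MAPS = """\
7fa25bd000-7fa25be000 r-xp 00000000 08:02 38308 /opt/lib/libcstest.so.1.0
7fa25be000-7fa25cd000 ---p 00001000 08:02 38308 /opt/lib/libcstest.so.1.0
7fa25cd000-7fa25ce000 r--p 00000000 08:02 38308 /opt/lib/libcstest.so.1.0
7fa25ce000-7fa25cf000 rw-p 00001000 08:02 38308 /opt/lib/libcstest.so.1.0
"""


def filter_start(maps_text, path, offset, executable_only):
    """Return the filter start address after replaying all mappings.

    Each matching mapping overwrites the previous result, which mimics
    the filter being reconfigured on every mmap event.
    """
    start = None
    for line in maps_text.splitlines():
        fields = line.split()
        if len(fields) < 6 or fields[5] != path:
            continue
        if executable_only and "x" not in fields[1]:
            continue
        base = int(fields[0].split("-")[0], 16)
        start = base + offset
    return start


wrong = filter_start(MAPS, LIB, 0x72C, executable_only=False)
right = filter_start(MAPS, LIB, 0x72C, executable_only=True)
print(hex(wrong), hex(right))
```

The first value is based on the last (rw-p) mapping at 0x7fa25ce000, the second on the r-xp mapping at 0x7fa25bd000.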

perf/core: Fix file name handling for start/stop filters · 4059ffd0

Committed by Mathieu Poirier on Jul 18, 2016

Binary file names have to be supplied for both range and start/stop
filters but the current code only processes the filename if an
address range filter is specified.  This code adds processing of
the filename for start/stop filters.

Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1468860187-318-2-git-send-email-mathieu.poirier@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>

4059ffd0

perf/core: Fix event_function_local() · cca20946

Committed by Peter Zijlstra on Aug 16, 2016

Vincent reported triggering the WARN_ON_ONCE() in event_function_local().

While thinking through cases I noticed that by using event_function()
directly, we miss the inactive case usually handled by
event_function_call().

Therefore construct a blend of event_function_call() and
event_function() that handles the cases relevant to
event_function_local().

Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org # 4.5+
Fixes: fae3fde6 ("perf: Collapse and fix event_function_call() users")
Signed-off-by: Ingo Molnar <mingo@kernel.org>

cca20946

x86/smp: Fix __max_logical_packages value setup · 7b0501b1

Committed by Jiri Olsa on Aug 15, 2016

Frank reported a kernel panic when he disabled several cores in the BIOS
via the following option:

  Core Disable Bitmap(Hex)   [0]

with number 0xFFE, which leaves 16 CPUs in the system (out of 48).

The kernel panic below goes along with the following messages:

 smpboot: Max logical packages: 2
 smpboot: APIC(0) Converting physical 0 to logical package 0
 smpboot: APIC(20) Converting physical 1 to logical package 1
 smpboot: APIC(40) Package 2 exceeds logical package map
 smpboot: CPU 8 APICId 40 disabled
 smpboot: APIC(60) Package 3 exceeds logical package map
 smpboot: CPU 12 APICId 60 disabled
 ...
 general protection fault: 0000 [#1] SMP
 Modules linked in:
 CPU: 15 PID: 1 Comm: swapper/0 Not tainted 4.7.0-rc5+ #1
 Hardware name: SGI UV300/UV300, BIOS SGI UV 300 series BIOS 05/25/2016
 task: ffff8801673e0000 ti: ffff8801673ac000 task.ti: ffff8801673ac000
 RIP: 0010:[<ffffffff81014d54>]  [<ffffffff81014d54>] uncore_change_context+0xd4/0x180
 ...
  [<ffffffff810158ac>] uncore_event_init_cpu+0x6c/0x70
  [<ffffffff81d8c91c>] intel_uncore_init+0x1c2/0x2dd
  [<ffffffff81d8c75a>] ? uncore_cpu_setup+0x17/0x17
  [<ffffffff81002190>] do_one_initcall+0x50/0x190
  [<ffffffff810ab193>] ? parse_args+0x293/0x480
  [<ffffffff81d87365>] kernel_init_freeable+0x1a5/0x249
  [<ffffffff81d86a35>] ? set_debug_rodata+0x12/0x12
  [<ffffffff816dc19e>] kernel_init+0xe/0x110
  [<ffffffff816e93bf>] ret_from_fork+0x1f/0x40
  [<ffffffff816dc190>] ? rest_init+0x80/0x80

The reason for the panic is a wrong value of __max_logical_packages,
which leaves logical_package_map uninitialized, while the uncore code
relies on this map being properly initialized (maybe we should
add some safety checks there as well).

The __max_logical_packages is computed as:

  DIV_ROUND_UP(total_cpus, ncpus);
  - ncpus being number of cores

With the above BIOS setup we get total_cpus == 16, which sets
__max_logical_packages to 2 (ncpus is 12).

Once topology_update_package_map processes a CPU with logical
pkg over 2, we display the above messages and fail to initialize
the physical_to_logical_pkg map, which makes the uncore code
crash.

The fix is to remove logical_package_map bitmap completely
and keep and update the logical_packages number instead.

After we enumerate all the present CPUs, we check if the
enumerated logical packages count is within its computed
maximum from BIOS data.

If it's not the case, we set this maximum to the new enumerated
value and freeze any new addition of logical packages.

The freeze is because a lot of init code like uncore/rapl/cqm
depends on having the maximum logical package value set to allocate
their data, so we can't change it later on.

Prarit Bhargava tested the patch and confirms that it solves
the problem:

  From dmidecode:
          Core Count: 24
          Core Enabled: 24
          Thread Count: 48

Orig kernel boot log:

 [    0.464981] smpboot: Max logical packages: 19
 [    0.469861] smpboot: APIC(0) Converting physical 0 to logical package 0
 [    0.477261] smpboot: APIC(40) Converting physical 1 to logical package 1
 [    0.484760] smpboot: APIC(80) Converting physical 2 to logical package 2
 [    0.492258] smpboot: APIC(c0) Converting physical 3 to logical package 3

1.  nr_cpus=8, should stop enumerating in package 0:

 [    0.533664] smpboot: APIC(0) Converting physical 0 to logical package 0
 [    0.539596] smpboot: Max logical packages: 19

2.  maxcpus=8, should still enumerate all packages:

 [    0.526494] smpboot: APIC(0) Converting physical 0 to logical package 0
 [    0.532428] smpboot: APIC(40) Converting physical 1 to logical package 1
 [    0.538456] smpboot: APIC(80) Converting physical 2 to logical package 2
 [    0.544486] smpboot: APIC(c0) Converting physical 3 to logical package 3
 [    0.550524] smpboot: Max logical packages: 19

3.  nr_cpus=49 (2 sockets + 1 core on 3rd socket), should stop enumerating in
    package 2:

 [    0.521378] smpboot: APIC(0) Converting physical 0 to logical package 0
 [    0.527314] smpboot: APIC(40) Converting physical 1 to logical package 1
 [    0.533345] smpboot: APIC(80) Converting physical 2 to logical package 2
 [    0.539368] smpboot: Max logical packages: 19

4.  maxcpus=49, should still enumerate all packages:

 [    0.525591] smpboot: APIC(0) Converting physical 0 to logical package 0
 [    0.531525] smpboot: APIC(40) Converting physical 1 to logical package 1
 [    0.537547] smpboot: APIC(80) Converting physical 2 to logical package 2
 [    0.543579] smpboot: APIC(c0) Converting physical 3 to logical package 3
 [    0.549624] smpboot: Max logical packages: 19

5.  kdump (nr_cpus=1) works as well.

Reported-by: Frank Ramsay <framsay@redhat.com>
Tested-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Reviewed-by: Prarit Bhargava <prarit@redhat.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20160815101700.GA30090@krava
Signed-off-by: Ingo Molnar <mingo@kernel.org>

7b0501b1
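
The arithmetic behind the __max_logical_packages commit above (7b0501b1) can be checked with a few lines of Python. This is a sketch, not the kernel code: `final_max_packages` is a hypothetical name for the "raise the maximum to the enumerated count" step, and the enumerated count of 4 is taken from the four package ids (0 to 3) visible in the truncated panic log, so the real count on that machine may have been higher.

```python
def div_round_up(n, d):
    """Integer division rounding up, like the kernel's DIV_ROUND_UP()."""
    return (n + d - 1) // d


def final_max_packages(total_cpus, ncpus, enumerated_packages):
    """Use the BIOS-derived estimate unless enumeration found more."""
    computed = div_round_up(total_cpus, ncpus)
    return max(computed, enumerated_packages)


# Values from the report: 16 CPUs left enabled, ncpus == 12.
estimate = div_round_up(16, 12)
corrected = final_max_packages(16, 12, enumerated_packages=4)
print(estimate, corrected)
```

The estimate comes out as 2, which is too small for packages 2 and 3; taking the enumerated count into account gives 4.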

x86/microcode/AMD: Fix initrd loading with CONFIG_RANDOMIZE_MEMORY=y · 88b2f634

Committed by Borislav Petkov on Aug 17, 2016

Similar to:

  efaad554 ("x86/microcode/intel: Fix initrd loading with CONFIG_RANDOMIZE_MEMORY=y")

... fix microcode loading from the initrd on AMD by adding the
randomization offset to the microcode patch container within the initrd.

Reported-and-tested-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-tip-commits@vger.kernel.org
Link: http://lkml.kernel.org/r/20160817113314.GA19221@nazgul.tnic
Signed-off-by: Ingo Molnar <mingo@kernel.org>

88b2f634

uprobes: Fix the memcg accounting · 6c4687cc

Committed by Oleg Nesterov on Aug 17, 2016

__replace_page() wrongly calls mem_cgroup_cancel_charge() in the "success" path;
it should only do this if page_check_address() fails.

This means that every enable/disable leads to an unbalanced mem_cgroup_uncharge()
from put_page(old_page), so it is trivial to underflow page_counter->count
and trigger OOM.

Reported-and-tested-by: Brenden Blanco <bblanco@plumgrid.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vladimir Davydov <vdavydov@virtuozzo.com>
Cc: stable@vger.kernel.org # 3.17+
Fixes: 00501b53 ("mm: memcontrol: rewrite charge API")
Link: http://lkml.kernel.org/r/20160817153629.GB29724@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

6c4687cc
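
The accounting imbalance described in the uprobes commit above (6c4687cc) can be pictured with a toy Python counter. This is a deliberately simplified model under stated assumptions, not the memcg implementation: `PageCounter`, `replace_page` and `enable_disable_cycle` are hypothetical names, each page is worth one unit, and the later `put_page()` is modelled as a single uncharge. It shows that cancelling the charge on the success path leaves every cycle one unit short, while cancelling only on failure stays balanced.

```python
class PageCounter:
    """Toy stand-in for a memcg page counter (not the kernel structure)."""

    def __init__(self):
        self.count = 0

    def charge(self):
        self.count += 1

    def uncharge(self):
        self.count -= 1


def replace_page(counter, address_check_ok, buggy):
    """Model one page replacement; returns True on success."""
    counter.charge()  # the new page is charged up front
    if not address_check_ok:
        counter.uncharge()  # cancelling is correct when the check fails
        return False
    if buggy:
        counter.uncharge()  # cancelling on success: the reported bug
    return True


def enable_disable_cycle(counter, buggy):
    """One enable/disable round trip, including the later page release."""
    if replace_page(counter, True, buggy):
        counter.uncharge()  # releasing the page uncharges it again


buggy_counter, fixed_counter = PageCounter(), PageCounter()
for _ in range(3):
    enable_disable_cycle(buggy_counter, buggy=True)
    enable_disable_cycle(fixed_counter, buggy=False)
print(buggy_counter.count, fixed_counter.count)
```

In this model the buggy counter drifts down by one per cycle, which is the kind of underflow the commit message refers to.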

Merge branch 'drm-fixes-4.8' of git://people.freedesktop.org/~agd5f/linux into drm-fixes · 91d62d9f

Committed by Dave Airlie on Aug 18, 2016

Single 64-bit gart size fix.

* 'drm-fixes-4.8' of git://people.freedesktop.org/~agd5f/linux:
  drm/amdgpu: Change GART offset to 64-bit

91d62d9f

Merge branch 'pm-sleep' · 6c16f42a

Committed by Rafael J. Wysocki on Aug 18, 2016

* pm-sleep:
  PM / hibernate: Fix rtree_next_node() to avoid walking off list ends
  x86/power/64: Use __pa() for physical address computation
  PM / sleep: Update some system sleep documentation

6c16f42a

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 184ca823

Committed by Linus Torvalds on Aug 17, 2016

Pull networking fixes from David Miller:

 1) Buffers powersave frame test is reversed in cfg80211, fix from Felix
    Fietkau.

 2) Remove bogus WARN_ON in openvswitch, from Jarno Rajahalme.

 3) Fix some tg3 ethtool logic bugs, and one that would cause no
    interrupts to be generated when rx-coalescing is set to 0.  From
    Satish Baddipadige and Siva Reddy Kallam.

 4) QLCNIC mailbox corruption and napi budget handling fix from Manish
    Chopra.

 5) Fix fib_trie logic when walking the trie during /proc/net/route
    output that can access a stale node pointer.  From David Forster.

 6) Several sctp_diag fixes from Phil Sutter.

 7) PAUSE frame handling fixes in mlxsw driver from Ido Schimmel.

 8) Checksum fixup fixes in bpf from Daniel Borkmann.

 9) Memory leaks in nfnetlink, from Liping Zhang.

10) Use after free in rxrpc, from David Howells.

11) Use after free in new skb_array code of macvtap driver, from Jason
    Wang.

12) Calipso resource leak, from Colin Ian King.

13) mediatek bug fixes (missing stats sync init, etc.) from Sean Wang.

14) Fix bpf non-linear packet write helpers, from Daniel Borkmann.

15) Fix lockdep splats in macsec, from Sabrina Dubroca.

16) hv_netvsc bug fixes from Vitaly Kuznetsov, mostly to do with VF
    handling.

17) Various tc-action bug fixes, from CONG Wang.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (116 commits)
  net_sched: allow flushing tc police actions
  net_sched: unify the init logic for act_police
  net_sched: convert tcf_exts from list to pointer array
  net_sched: move tc offload macros to pkt_cls.h
  net_sched: fix a typo in tc_for_each_action()
  net_sched: remove an unnecessary list_del()
  net_sched: remove the leftover cleanup_a()
  mlxsw: spectrum: Allow packets to be trapped from any PG
  mlxsw: spectrum: Unmap 802.1Q FID before destroying it
  mlxsw: spectrum: Add missing rollbacks in error path
  mlxsw: reg: Fix missing op field fill-up
  mlxsw: spectrum: Trap loop-backed packets
  mlxsw: spectrum: Add missing packet traps
  mlxsw: spectrum: Mark port as active before registering it
  mlxsw: spectrum: Create PVID vPort before registering netdevice
  mlxsw: spectrum: Remove redundant errors from the code
  mlxsw: spectrum: Don't return upon error in removal path
  i40e: check for and deal with non-contiguous TCs
  ixgbe: Re-enable ability to toggle VLAN filtering
  ixgbe: Force VLNCTRL.VFE to be set in all VMDq paths
  ...

184ca823

Merge branch 'tc_action-fixes' · b96c22c0

Committed by David S. Miller on Aug 17, 2016

Cong Wang says:

====================
net_sched: tc action fixes and updates

This patchset fixes a few regressions caused by the previous
code refactor and more. Thanks to Jamal for catching them!

Note: patches 3/7 and 4/7 are not strictly necessary for this patchset;
I just want to carry them together.

---
v4: adjust an indentation for Jamal
    add two more patches

v3: avoid list for fast path, suggested by Jamal

v2: replace flex_array with regular dynamic array
    keep tcf_action_stats_update() in act_api.h
    fix macro typos found by Amir
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

b96c22c0

net_sched: allow flushing tc police actions · b5ac8518

Committed by Roman Mashak on Aug 13, 2016

act_police uses its own code to walk the action
hashtable, so standalone tc police actions could
not be flushed. Switch to tcf_generic_walker()
like the other actions.

(Joint work from Roman and Cong.)
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b5ac8518

net_sched: unify the init logic for act_police · 0852e455

Committed by WANG Cong on Aug 13, 2016

Jamal reported a crash when creating a police action
with a specific index. The init logic is incorrect:
an action should always be created in this case.
Unify the logic with the other tc actions.

Fixes: a03e6fe5 ("act_police: fix a crash during removal")
Reported-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0852e455

net_sched: convert tcf_exts from list to pointer array · 22dc13c8

Committed by WANG Cong on Aug 13, 2016

As pointed out by Jamal, an action can be shared by
multiple filters, so we can no longer use a list to
chain them after getting rid of the original tc_action.
Instead, we can just save pointers to these actions
in tcf_exts, since they are refcount'ed, so convert
the list to an array of pointers.

The "ugly" part is that the action API still accepts a
list as a parameter. I just introduce a helper function
to convert the array of pointers to a list, instead of
relying on the C99 feature to iterate the array.
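
The array-to-list conversion described above can be sketched in plain userspace C. This is only an illustration under stated assumptions: struct action, struct exts, MAX_ACTIONS and exts_to_list() are hypothetical stand-ins, not the kernel's definitions, and a singly linked list replaces the kernel's list_head.

```c
#include <stddef.h>

/* Hypothetical stand-in for a refcounted tc action. */
struct action {
	int index;
	int refcnt;
	struct action *next;	/* only meaningful while chained for a bulk call */
};

#define MAX_ACTIONS 32		/* illustrative bound, not the kernel's value */

/* Hypothetical stand-in for tcf_exts: a filter stores pointers to its
 * actions instead of owning list nodes, so one action can be referenced
 * by several filters at once.
 */
struct exts {
	int nr_actions;
	struct action *actions[MAX_ACTIONS];
};

/* Chain the actions referenced by exts into a temporary list, in array
 * order, and return its head (NULL if there are no actions).
 */
static struct action *exts_to_list(const struct exts *exts)
{
	struct action *head = NULL;
	struct action **tail = &head;
	int i;

	for (i = 0; i < exts->nr_actions; i++) {
		struct action *a = exts->actions[i];

		a->next = NULL;
		*tail = a;
		tail = &a->next;
	}
	return head;
}
```

The list built here is short-lived: it exists only to satisfy an API that takes a list, while the pointer array remains the long-term storage.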

Fixes: a85a970a ("net_sched: move tc_action into tcf_common")
Reported-by: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

22dc13c8

net_sched: move tc offload macros to pkt_cls.h · 2734437e

Committed by WANG Cong on Aug 13, 2016

struct tcf_exts belongs to filters and should not be visible
to plain tc actions.

Cc: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2734437e

net_sched: fix a typo in tc_for_each_action() · 0c23c3e7

Committed by WANG Cong on Aug 13, 2016

It is harmless because all users pass 'a' to this macro.

Fixes: 00175aec ("net/sched: Macro instead of CONFIG_NET_CLS_ACT ifdef")
Cc: Amir Vadai <amir@vadai.me>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0c23c3e7

net_sched: remove an unnecessary list_del() · 824a7e88

Committed by WANG Cong on Aug 13, 2016

This list_del() for tc actions is not actually needed:
the list is only used to chain bulk operations, so it
should not be carried over to later operations.

Fixes: ec0595cc ("net_sched: get rid of struct tcf_common")
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

824a7e88

net_sched: remove the leftover cleanup_a() · f07fed82

Committed by WANG Cong on Aug 13, 2016

After refactoring tc_action into tcf_common, we no
longer need to clean up the temporary "actions" in the
list; they are permanently stored in the hashtable.

Fixes: a85a970a ("net_sched: move tc_action into tcf_common")
Reported-by: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f07fed82

Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue · f4abf05f

Committed by David S. Miller on Aug 17, 2016

Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates 2016-08-16

This series contains fixes to e1000e, igb, ixgbe and i40e.

Kshitiz Gupta provides a fix for igb to resolve the PHY delay compensation
math in several functions.

Jarod Wilson provides a fix for e1000e which had to be broken up into 2
patches. The first prepares the driver for expanding the list of NICs
that have occasional ~10 hour clock jumps when being used for PTP.
The second patch actually fixes i218 silicon, which has been experiencing
the clock jumps while using PTP.

Alex provides 2 patches for ixgbe now that he is back at Intel.  The first
fixes the setting of the VLNCTRL.VFE bit, which earlier patches left
unchanged, resulting in VLAN filtering being disabled for all the VFs.
The second corrects the support for disabling VLAN tag filtering via the
feature bit.

Lastly, David fixes i40e, which was causing a kernel panic when
non-contiguous traffic classes, or traffic classes not starting with TC0,
were configured on a link partner switch.  The fix changes the logic
for determining the total number of TCs enabled.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

f4abf05f

Merge branch 'mlxsw-fixes' · 647f28c7

Committed by David S. Miller on Aug 17, 2016

Jiri Pirko says:

====================
mlxsw: IPv4 UC router fixes

Ido says:
Patches 1-3 fix a long standing problem in the driver's init sequence,
which manifests itself quite often when routing daemons try to configure
an IP address on registered netdevs that don't yet have an associated
vPort.

Patches 4-9 add missing packet traps for the router to work properly and
also fix an ordering issue following the recent changes to the driver's
init sequence.

The last patch isn't related to the router, but fixes a general problem
in which under certain conditions packets aren't trapped to CPU.

v1->v2:
- Change order of patch 7
- Add patch 6 following Ilan's comment
- Add patchset name and cover letter
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

647f28c7

mlxsw: spectrum: Allow packets to be trapped from any PG · 9ffcc372

Committed by Ido Schimmel on Aug 17, 2016

When packets enter the device they are classified to a priority group
(PG) buffer based on their PCP value. After their egress port and
traffic class are determined they are moved to the switch's shared
buffer and await transmission, if:

(Ingress{Port}.Usage < Thres && Ingress{Port,PG}.Usage < Thres &&
 Egress{Port}.Usage < Thres && Egress{Port,TC}.Usage < Thres)
||
(Ingress{Port}.Usage < Min || Ingress{Port,PG}.Usage < Min ||
 Egress{Port}.Usage < Min || Egress{Port,TC}.Usage < Min)

Packets scheduled for transmission through the CPU port (trapped to the
CPU) use traffic class 7, which has zero maximum and minimum quotas.
However, when such packets arrive from PG 0 they are admitted to the
shared buffer, as PG 0 has a non-zero minimum quota.

Allow all packets to be trapped to the CPU - regardless of the PG they
were classified to - by assigning a 10KB minimum quota for CPU port and
TC7.
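
The admission condition above can be written as a small predicate. This is a minimal userspace sketch, not driver code: struct pool, struct admission_ctx and admitted() are hypothetical names, quotas are assumed to be plain byte counts, and 10KB is taken to be 10 * 1024 bytes for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* One accounting pool: current usage plus its maximum and minimum quotas. */
struct pool {
	uint32_t usage;
	uint32_t thres;	/* maximum quota */
	uint32_t min;	/* minimum quota */
};

/* The four pools consulted when a packet asks for shared buffer space. */
struct admission_ctx {
	struct pool ingress_port;
	struct pool ingress_port_pg;
	struct pool egress_port;
	struct pool egress_port_tc;
};

static bool below_thres(const struct pool *p)
{
	return p->usage < p->thres;
}

static bool below_min(const struct pool *p)
{
	return p->usage < p->min;
}

/* Admit if every pool is under its threshold, or if any pool is still
 * under its guaranteed minimum.
 */
static bool admitted(const struct admission_ctx *c)
{
	bool all_below_thres = below_thres(&c->ingress_port) &&
			       below_thres(&c->ingress_port_pg) &&
			       below_thres(&c->egress_port) &&
			       below_thres(&c->egress_port_tc);
	bool any_below_min = below_min(&c->ingress_port) ||
			     below_min(&c->ingress_port_pg) ||
			     below_min(&c->egress_port) ||
			     below_min(&c->egress_port_tc);

	return all_below_thres || any_below_min;
}
```

With zero quotas on the egress {Port,TC} pool, the first clause can never hold, so admission depends entirely on some pool having a non-zero minimum. Giving that pool a minimum of its own makes the second clause hold regardless of the ingress PG.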

Fixes: 8e8dfe9f ("mlxsw: spectrum: Add IEEE 802.1Qaz ETS support")
Reported-by: Tamir Winetroub <tamirw@mellanox.com>
Tested-by: Tamir Winetroub <tamirw@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9ffcc372

mlxsw: spectrum: Unmap 802.1Q FID before destroying it · 8168287b

Committed by Ido Schimmel on Aug 17, 2016

Before destroying the 802.1Q FID we should first remove the VID-to-FID
mapping. This makes mlxsw_sp_fid_destroy() symmetric with
mlxsw_sp_fid_create().

Fixes: 14d39461 ("mlxsw: spectrum: Use per-FID struct for the VLAN-aware bridge")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

8168287b

mlxsw: spectrum: Add missing rollbacks in error path · 0583272d

Committed by Ido Schimmel on Aug 17, 2016

While going over the code I noticed we are missing two rollbacks in the
port's creation error path. Add them and adjust the place of one of them
in the port's removal sequence so that both are symmetric.

Fixes: 56ade8fe ("mlxsw: spectrum: Add initial support for Spectrum ASIC")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0583272d

mlxsw: reg: Fix missing op field fill-up · 0e7df1a2

Committed by Jiri Pirko on Aug 17, 2016

The RALUE pack function needs to set the op field; otherwise it is always 0 (add).

Fixes: d5a1c749 ("mlxsw: reg: Add Router Algorithmic LPM Unicast Entry Register definition")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0e7df1a2

mlxsw: spectrum: Trap loop-backed packets · a94a614f

Committed by Ido Schimmel on Aug 17, 2016

One of the conditions to generate an ICMP Redirect Message is that "the
packet is being forwarded out the same physical interface that it was
received from" (RFC 1812).

Therefore, we need to be able to trap such packets and let the kernel
decide what to do with them.

For each RIF, enable the loop-back filter, which will raise the LBERROR
trap whenever the ingress RIF equals the egress RIF.
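
The loop-back check can be modeled roughly as follows. This is a sketch for illustration only: struct rif, enum trap_id and check_loopback() are hypothetical names, and consulting the ingress RIF's filter flag is an assumption, since the hardware behaviour is described here only as "ingress RIF equals egress RIF".

```c
#include <stdbool.h>
#include <stdint.h>

enum trap_id {
	TRAP_NONE,
	TRAP_LBERROR,	/* packet would leave through the RIF it arrived on */
};

/* Hypothetical router interface descriptor. */
struct rif {
	uint16_t index;
	bool lb_filter_enabled;
};

/* Raise LBERROR when the filter is enabled and the packet's ingress and
 * egress RIFs are the same; otherwise let it be forwarded normally.
 */
static enum trap_id check_loopback(const struct rif *ingress,
				   const struct rif *egress)
{
	if (ingress->lb_filter_enabled && ingress->index == egress->index)
		return TRAP_LBERROR;
	return TRAP_NONE;
}
```

A trapped packet is handed to the kernel, which can then decide whether to send an ICMP Redirect.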

Fixes: 99724c18 ("mlxsw: spectrum: Introduce support for router interfaces")
Reported-by: Ilan Tayari <ilant@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a94a614f