1. 24 Sep, 2014 2 commits
    • x86, sched: Add new topology for multi-NUMA-node CPUs · cebf15eb
      Dave Hansen committed
      I'm getting the spew below when booting with Haswell (Xeon
      E5-2699 v3) CPUs and the "Cluster-on-Die" (CoD) feature enabled
      in the BIOS.  It seems similar to the issue that some folks from
      AMD ran into on their systems and addressed in this commit:
      
        161270fc ("x86/smp: Fix topology checks on AMD MCM CPUs")
      
      Both these Intel and AMD systems break an assumption which is
      being enforced by topology_sane(): a socket may not contain more
      than one NUMA node.
      
      AMD special-cased their system by looking for a cpuid flag.  The
      Intel mode is dependent on BIOS options and I do not know of a
      way in which it is enumerated other than the tables being parsed
      during the CPU bringup process.  In other words, we have to trust
      the ACPI tables <shudder>.
      
      This detects the situation where a NUMA node occurs at a place in
      the middle of the "CPU" sched domains.  It replaces the default
      topology with one that relies on the NUMA information from the
      firmware (SRAT table) for all levels of sched domains above the
      hyperthreads.
      
      This also fixes a sysfs bug.  We used to freak out when we saw
      the "mc" group cross a node boundary, so we stopped building the
      MC group.  MC gets exported as the 'core_siblings_list' in
      /sys/devices/system/cpu/cpu*/topology/ and this caused CPUs with
      the same 'physical_package_id' to not be listed together in
      'core_siblings_list'.  This violates a statement from
      Documentation/ABI/testing/sysfs-devices-system-cpu:
      
      	core_siblings: internal kernel map of cpu#'s hardware threads
      	within the same physical_package_id.
      
      	core_siblings_list: human-readable list of the logical CPU
      	numbers within the same physical_package_id as cpu#.
      
      The sysfs effects here cause an issue with the hwloc tool where
      it gets confused and thinks there are more sockets than are
      physically present.
      
      Before this patch, there are two packages:
      
      # cd /sys/devices/system/cpu/
      # cat cpu*/topology/physical_package_id | sort | uniq -c
           18 0
           18 1
      
      But 4 _sets_ of core siblings:
      
      # cat cpu*/topology/core_siblings_list | sort | uniq -c
            9 0-8
            9 18-26
            9 27-35
            9 9-17
      
      After this set, there are only 2 sets of core siblings, which
      is what we expect for a 2-socket system.
      
      # cat cpu*/topology/physical_package_id | sort | uniq -c
           18 0
           18 1
      # cat cpu*/topology/core_siblings_list | sort | uniq -c
           18 0-17
           18 18-35
      
      Example spew:
      ...
      	NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.
      	 #2  #3  #4  #5  #6  #7  #8
      	.... node  #1, CPUs:    #9
      	------------[ cut here ]------------
      	WARNING: CPU: 9 PID: 0 at /home/ak/hle/linux-hle-2.6/arch/x86/kernel/smpboot.c:306 topology_sane.isra.2+0x74/0x90()
      	sched: CPU #9's mc-sibling CPU #0 is not on the same node! [node: 1 != 0]. Ignoring dependency.
      	Modules linked in:
      	CPU: 9 PID: 0 Comm: swapper/9 Not tainted 3.17.0-rc1-00293-g8e01c4d-dirty #631
      	Hardware name: Intel Corporation S2600WTT/S2600WTT, BIOS GRNDSDP1.86B.0036.R05.1407140519 07/14/2014
      	0000000000000009 ffff88046ddabe00 ffffffff8172e485 ffff88046ddabe48
      	ffff88046ddabe38 ffffffff8109691d 000000000000b001 0000000000000009
      	ffff88086fc12580 000000000000b020 0000000000000009 ffff88046ddabe98
      	Call Trace:
      	[<ffffffff8172e485>] dump_stack+0x45/0x56
      	[<ffffffff8109691d>] warn_slowpath_common+0x7d/0xa0
      	[<ffffffff8109698c>] warn_slowpath_fmt+0x4c/0x50
      	[<ffffffff81074f94>] topology_sane.isra.2+0x74/0x90
      	[<ffffffff8107530e>] set_cpu_sibling_map+0x31e/0x4f0
      	[<ffffffff8107568d>] start_secondary+0x1ad/0x240
      	---[ end trace 3fe5f587a9fcde61 ]---
      	#10 #11 #12 #13 #14 #15 #16 #17
      	.... node  #2, CPUs:   #18 #19 #20 #21 #22 #23 #24 #25 #26
      	.... node  #3, CPUs:   #27 #28 #29 #30 #31 #32 #33 #34 #35
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      [ Added LLC domain and s/match_mc/match_die/ ]
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Igor Mammedov <imammedo@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: brice.goglin@gmail.com
      Cc: "H. Peter Anvin" <hpa@linux.intel.com>
      Link: http://lkml.kernel.org/r/20140918193334.C065EBCE@viggo.jf.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
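The before/after comparison in the commit message above amounts to checking that the number of distinct core_siblings_list values equals the number of distinct physical_package_id values. A minimal sketch of that check in C, assuming the sysfs strings have already been read into arrays. The function names count_distinct and siblings_match_packages are hypothetical, and the code performs no sysfs access:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Count the distinct strings in vals[0..n-1].
 * Quadratic, which is acceptable for a per-CPU list of this size.
 */
size_t count_distinct(const char *const *vals, size_t n)
{
    size_t distinct = 0;

    assert(vals != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        int seen = 0;

        for (size_t j = 0; j < i; j++) {
            if (strcmp(vals[i], vals[j]) == 0) {
                seen = 1;
                break;
            }
        }
        if (!seen)
            distinct++;
    }
    return distinct;
}

/*
 * Hypothetical consistency check: returns 1 when there are as many
 * distinct sibling lists as distinct package IDs, 0 otherwise.
 * pkg_ids[i] and sibling_lists[i] hold the two sysfs strings for CPU i.
 */
int siblings_match_packages(const char *const *pkg_ids,
                            const char *const *sibling_lists,
                            size_t ncpus)
{
    return count_distinct(pkg_ids, ncpus) ==
           count_distinct(sibling_lists, ncpus);
}
```

With the pre-patch values from the commit message (2 package IDs, 4 sibling lists) the check fails; with the post-patch values (2 and 2) it passes.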
    • sched, mips, ia64: Remove __ARCH_WANT_UNLOCKED_CTXSW · c55f5158
      Peter Zijlstra committed
      Kirill found that there's a subtle race in the
      __ARCH_WANT_UNLOCKED_CTXSW code, and instead of fixing it, remove the
      entire exception because neither arch that uses it seems to actually
      still require it.
      
      Boot tested on mips64el (qemu) only.
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: Kirill Tkhai <tkhai@yandex.ru>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Davidlohr Bueso <davidlohr@hp.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul Burton <paul.burton@imgtec.com>
      Cc: Qais Yousef <qais.yousef@imgtec.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: oleg@redhat.com
      Cc: linux@roeck-us.net
      Cc: linux-ia64@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Cc: linux-mips@linux-mips.org
      Link: http://lkml.kernel.org/r/20140923150641.GH3312@worktop.programming.kicks-ass.net
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  2. 19 Sep, 2014 4 commits
    • ARM: topology: Use the new cpu_capacity interface · d3bfca1a
      Vincent Guittot committed
      Use the new arch_scale_cpu_capacity() scheduler facility to reflect
      the original capacity of a CPU, instead of arch_scale_freq_capacity(), which
      is tied more closely to scaling the capacity with the frequency.
      Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
      Acked-by: Nicolas Pitre <nico@linaro.org>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: preeti@linux.vnet.ibm.com
      Cc: riel@redhat.com
      Cc: Morten.Rasmussen@arm.com
      Cc: efault@gmx.de
      Cc: daniel.lezcano@linaro.org
      Cc: dietmar.eggemann@arm.com
      Cc: Grant Likely <grant.likely@linaro.org>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mark Brown <broonie@linaro.org>
      Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
      Cc: Rob Herring <robh+dt@kernel.org>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Vincent Guittot <vincent.guittot@linaro.org>
      Cc: devicetree@vger.kernel.org
      Cc: linux-arm-kernel@lists.infradead.org
      Link: http://lkml.kernel.org/r/1409051215-16788-6-git-send-email-vincent.guittot@linaro.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • sched: Add helper for task stack page overrun checking · a70857e4
      Aaron Tomlin committed
      This facility is used in a few places so let's introduce
      a helper function to improve code readability.
      Signed-off-by: Aaron Tomlin <atomlin@redhat.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: aneesh.kumar@linux.vnet.ibm.com
      Cc: dzickus@redhat.com
      Cc: bmr@redhat.com
      Cc: jcastillo@redhat.com
      Cc: oleg@redhat.com
      Cc: riel@redhat.com
      Cc: prarit@redhat.com
      Cc: jgh@redhat.com
      Cc: minchan@kernel.org
      Cc: mpe@ellerman.id.au
      Cc: tglx@linutronix.de
      Cc: hannes@cmpxchg.org
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Seiji Aguchi <seiji.aguchi@hds.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: linuxppc-dev@lists.ozlabs.org
      Link: http://lkml.kernel.org/r/1410527779-8133-3-git-send-email-atomlin@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • init/main.c: Give init_task a canary · d4311ff1
      Aaron Tomlin committed
      Tasks get their end of stack set to STACK_END_MAGIC with the
      aim to catch stack overruns. Currently this feature does not
      apply to init_task. This patch removes this restriction.
      
      Note that a similar patch was posted by Prarit Bhargava
      some time ago but was never merged:
      
        http://marc.info/?l=linux-kernel&m=127144305403241&w=2

      Signed-off-by: Aaron Tomlin <atomlin@redhat.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au>
      Cc: aneesh.kumar@linux.vnet.ibm.com
      Cc: dzickus@redhat.com
      Cc: bmr@redhat.com
      Cc: jcastillo@redhat.com
      Cc: jgh@redhat.com
      Cc: minchan@kernel.org
      Cc: tglx@linutronix.de
      Cc: hannes@cmpxchg.org
      Cc: Alex Thorlton <athorlton@sgi.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Daeseok Youn <daeseok.youn@gmail.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Fabian Frederick <fabf@skynet.be>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Michael Opdenacker <michael.opdenacker@free-electrons.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Seiji Aguchi <seiji.aguchi@hds.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Vladimir Davydov <vdavydov@parallels.com>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: linuxppc-dev@lists.ozlabs.org
      Link: http://lkml.kernel.org/r/1410527779-8133-2-git-send-email-atomlin@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
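The two stack-canary commits above describe writing STACK_END_MAGIC at the end of a task's stack and later checking whether it is still intact. A userspace sketch of that idea follows, under stated assumptions: struct toy_stack and every toy_* function are hypothetical stand-ins rather than kernel interfaces, and the magic value is the one believed to be defined in the kernel headers, so it should be treated as an assumption too.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed canary value; only the name STACK_END_MAGIC comes from the commit. */
#define STACK_END_MAGIC 0x57AC6E9DUL

/*
 * Hypothetical stand-in for a task stack that grows downward:
 * words[0] is the last word an overrun would reach.
 */
struct toy_stack {
    unsigned long words[64];
};

/* Return a pointer to the word at the end of the stack. */
unsigned long *toy_end_of_stack(struct toy_stack *s)
{
    assert(s != NULL);
    return &s->words[0];
}

/* Plant the canary, as is done for each task (and, after d4311ff1, init_task). */
void toy_set_stack_end_magic(struct toy_stack *s)
{
    *toy_end_of_stack(s) = STACK_END_MAGIC;
}

/* Helper in the spirit of a70857e4: non-zero if the canary was overwritten. */
int toy_stack_end_corrupted(struct toy_stack *s)
{
    return *toy_end_of_stack(s) != STACK_END_MAGIC;
}
```

Wrapping the comparison in one helper means each call site reads as a question ("is the stack end corrupted?") rather than repeating the pointer dereference and the magic constant.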
    • sched, cleanup, treewide: Remove set_current_state(TASK_RUNNING) after schedule() · f139caf2
      Kirill Tkhai committed
      schedule(), io_schedule() and schedule_timeout() always return
      with TASK_RUNNING state set, so one more setting is unnecessary.
      
      (All places in the patch are visibly good; the only exception is
       kiblnd_scheduler() from:
      
            drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c
      
       Its schedule() is one line above the standard 3 lines of unified diff.)
      
      There are no places where set_current_state() is used for mb().
      Signed-off-by: Kirill Tkhai <ktkhai@parallels.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1410529254.3569.23.camel@tkhai
      Cc: Alasdair Kergon <agk@redhat.com>
      Cc: Anil Belur <askb23@gmail.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Dave Kleikamp <shaggy@kernel.org>
      Cc: David Airlie <airlied@linux.ie>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Dmitry Eremin <dmitry.eremin@intel.com>
      Cc: Frank Blaschka <blaschka@linux.vnet.ibm.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Isaac Huang <he.huang@intel.com>
      Cc: James E.J. Bottomley <JBottomley@parallels.com>
      Cc: James E.J. Bottomley <jejb@parisc-linux.org>
      Cc: J. Bruce Fields <bfields@fieldses.org>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: Jiri Slaby <jslaby@suse.cz>
      Cc: Laura Abbott <lauraa@codeaurora.org>
      Cc: Liang Zhen <liang.zhen@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Masaru Nomura <massa.nomura@gmail.com>
      Cc: Michael Opdenacker <michael.opdenacker@free-electrons.com>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: Mike Snitzer <snitzer@redhat.com>
      Cc: Neil Brown <neilb@suse.de>
      Cc: Oleg Drokin <green@linuxhacker.ru>
      Cc: Peng Tao <bergwolf@gmail.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Robert Love <robert.w.love@intel.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Trond Myklebust <trond.myklebust@primarydata.com>
      Cc: Ursula Braun <ursula.braun@de.ibm.com>
      Cc: Zi Shen Lim <zlim.lnx@gmail.com>
      Cc: devel@driverdev.osuosl.org
      Cc: dm-devel@redhat.com
      Cc: dri-devel@lists.freedesktop.org
      Cc: fcoe-devel@open-fcoe.org
      Cc: jfs-discussion@lists.sourceforge.net
      Cc: linux390@de.ibm.com
      Cc: linux-afs@lists.infradead.org
      Cc: linux-cris-kernel@axis.com
      Cc: linux-kernel@vger.kernel.org
      Cc: linux-nfs@vger.kernel.org
      Cc: linux-parisc@vger.kernel.org
      Cc: linux-raid@vger.kernel.org
      Cc: linux-s390@vger.kernel.org
      Cc: linux-scsi@vger.kernel.org
      Cc: qla2xxx-upstream@qlogic.com
      Cc: user-mode-linux-devel@lists.sourceforge.net
      Cc: user-mode-linux-user@lists.sourceforge.net
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
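The cleanup above removes a state assignment that has no effect because schedule() already returns with TASK_RUNNING set. The pattern can be mocked in userspace as follows. This is only an illustration: toy_schedule(), toy_set_current_state() and the TOY_* states are hypothetical stand-ins, and the mock simply encodes the behaviour stated in the commit message.

```c
#include <assert.h>
#include <stddef.h>

enum toy_state { TOY_RUNNING = 0, TOY_INTERRUPTIBLE = 1 };

/* Mock of the current task's state. */
enum toy_state toy_current_state = TOY_RUNNING;

void toy_set_current_state(enum toy_state s)
{
    toy_current_state = s;
}

/*
 * Mock of schedule(): per the commit message, it always returns
 * with the running state set.
 */
void toy_schedule(void)
{
    toy_current_state = TOY_RUNNING;
}

/* Pattern before the cleanup: the last assignment is redundant. */
void wait_old_style(void)
{
    toy_set_current_state(TOY_INTERRUPTIBLE);
    toy_schedule();
    toy_set_current_state(TOY_RUNNING); /* redundant */
}

/* Pattern after the cleanup. */
void wait_new_style(void)
{
    toy_set_current_state(TOY_INTERRUPTIBLE);
    toy_schedule();
}

/* Run one of the wait functions and report the resulting state. */
enum toy_state state_after(void (*wait_fn)(void))
{
    assert(wait_fn != NULL);
    wait_fn();
    return toy_current_state;
}
```

Both variants leave the mock task in the running state, which is why the extra assignment can be dropped without changing behaviour.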
  3. 05 Sep, 2014 11 commits
  4. 04 Sep, 2014 5 commits
  5. 03 Sep, 2014 2 commits
    • powerpc/kvm/cma: Fix panic introduces by signed shift operation · 02a68d05
      Laurent Dufour committed
      Commit fc95ca72 introduces a memset in
      kvmppc_alloc_hpt since the general CMA doesn't clear the memory it
      allocates.
      
      However, the size argument passed to memset is computed from a signed value
      and its sign bit is extended by the cast the compiler performs. This leads
      to an extremely large size value when dealing with an order value >= 31, and
      almost all the memory following the allocated space is cleared. As a
      consequence, the system panics and may even fail to spawn the kdump
      kernel.
      
      This fix uses an unsigned value for the memset's size argument to
      avoid sign extension. Along with this fix, another shift operation which may
      also produce a sign-extended value is fixed.
      
      Cc: Alexey Kardashevskiy <aik@ozlabs.ru>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Alexander Graf <agraf@suse.de>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
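The sign-extension effect described in the commit above can be reproduced in a small, well-defined C sketch. The function names are hypothetical and this is not the kernel code. Because shifting a 1 into the sign bit of a signed int is undefined behaviour in C, the sketch takes the already-negative 32-bit value as input (INT32_MIN, the bit pattern such a shift typically yields for order 31) and shows what widening it to 64 bits does.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Buggy pattern: a size held in a signed 32-bit value is widened to a
 * 64-bit size. A negative input is sign-extended, so the upper 32 bits
 * become all ones.
 */
uint64_t widen_signed_size(int32_t size32)
{
    return (uint64_t)(int64_t)size32;
}

/*
 * Fixed pattern: compute the size with an unsigned 64-bit shift so no
 * sign bit is involved.
 */
uint64_t size_from_order(unsigned int order)
{
    assert(order < 64);
    return UINT64_C(1) << order;
}
```

For order 31, the signed path yields 0xFFFFFFFF80000000 while the unsigned path yields 0x80000000, which matches the "extremely large size value" the commit message describes.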
    • ARM: ux500: disable msp2 node on Snowball · dbd366fd
      Linus Walleij committed
      Analogous to commit 8858d88a, which fixed commit 70b41abc
      ("ARM: ux500: move MSP pin control to the device tree"):
      that commit accidentally activated MSP2, giving rise to a boot scroll
      scream as the kernel attempts to probe a driver for it and
      fails to obtain DMA channel 14.
      
      For some reason I forgot to fix this on the Snowball. Fix
      this up by marking the node disabled again.
      
      Cc: Lee Jones <lee.jones@linaro.org>
      Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
      Tested-by: Kevin Hilman <khilman@linaro.org>
      Signed-off-by: Kevin Hilman <khilman@linaro.org>
  6. 02 Sep, 2014 2 commits
  7. 01 Sep, 2014 6 commits
  8. 30 Aug, 2014 8 commits