1. 09 2月, 2014 1 次提交
  2. 16 1月, 2014 1 次提交
    • P
      x86: Add check for number of available vectors before CPU down · da6139e4
      Prarit Bhargava 提交于
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=64791
      
      When a cpu is downed on a system, the irqs on the cpu are assigned to
      other cpus.  It is possible, however, that when a cpu is downed there
      aren't enough free vectors on the remaining cpus to account for the
      vectors from the cpu that is being downed.
      
      This results in an interesting "overflow" condition where irqs are
      "assigned" to a CPU but are not handled.
      
      For example, when downing cpus on a 1-64 logical processor system:
      
      <snip>
      [  232.021745] smpboot: CPU 61 is now offline
      [  238.480275] smpboot: CPU 62 is now offline
      [  245.991080] ------------[ cut here ]------------
      [  245.996270] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:264 dev_watchdog+0x246/0x250()
      [  246.005688] NETDEV WATCHDOG: p786p1 (ixgbe): transmit queue 0 timed out
      [  246.013070] Modules linked in: lockd sunrpc iTCO_wdt iTCO_vendor_support sb_edac ixgbe microcode e1000e pcspkr joydev edac_core lpc_ich ioatdma ptp mdio mfd_core i2c_i801 dca pps_core i2c_core wmi acpi_cpufreq isci libsas scsi_transport_sas
      [  246.037633] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.12.0+ #14
      [  246.044451] Hardware name: Intel Corporation S4600LH ........../SVRBD-ROW_T, BIOS SE5C600.86B.01.08.0003.022620131521 02/26/2013
      [  246.057371]  0000000000000009 ffff88081fa03d40 ffffffff8164fbf6 ffff88081fa0ee48
      [  246.065728]  ffff88081fa03d90 ffff88081fa03d80 ffffffff81054ecc ffff88081fa13040
      [  246.074073]  0000000000000000 ffff88200cce0000 0000000000000040 0000000000000000
      [  246.082430] Call Trace:
      [  246.085174]  <IRQ>  [<ffffffff8164fbf6>] dump_stack+0x46/0x58
      [  246.091633]  [<ffffffff81054ecc>] warn_slowpath_common+0x8c/0xc0
      [  246.098352]  [<ffffffff81054fb6>] warn_slowpath_fmt+0x46/0x50
      [  246.104786]  [<ffffffff815710d6>] dev_watchdog+0x246/0x250
      [  246.110923]  [<ffffffff81570e90>] ? dev_deactivate_queue.constprop.31+0x80/0x80
      [  246.119097]  [<ffffffff8106092a>] call_timer_fn+0x3a/0x110
      [  246.125224]  [<ffffffff8106280f>] ? update_process_times+0x6f/0x80
      [  246.132137]  [<ffffffff81570e90>] ? dev_deactivate_queue.constprop.31+0x80/0x80
      [  246.140308]  [<ffffffff81061db0>] run_timer_softirq+0x1f0/0x2a0
      [  246.146933]  [<ffffffff81059a80>] __do_softirq+0xe0/0x220
      [  246.152976]  [<ffffffff8165fedc>] call_softirq+0x1c/0x30
      [  246.158920]  [<ffffffff810045f5>] do_softirq+0x55/0x90
      [  246.164670]  [<ffffffff81059d35>] irq_exit+0xa5/0xb0
      [  246.170227]  [<ffffffff8166062a>] smp_apic_timer_interrupt+0x4a/0x60
      [  246.177324]  [<ffffffff8165f40a>] apic_timer_interrupt+0x6a/0x70
      [  246.184041]  <EOI>  [<ffffffff81505a1b>] ? cpuidle_enter_state+0x5b/0xe0
      [  246.191559]  [<ffffffff81505a17>] ? cpuidle_enter_state+0x57/0xe0
      [  246.198374]  [<ffffffff81505b5d>] cpuidle_idle_call+0xbd/0x200
      [  246.204900]  [<ffffffff8100b7ae>] arch_cpu_idle+0xe/0x30
      [  246.210846]  [<ffffffff810a47b0>] cpu_startup_entry+0xd0/0x250
      [  246.217371]  [<ffffffff81646b47>] rest_init+0x77/0x80
      [  246.223028]  [<ffffffff81d09e8e>] start_kernel+0x3ee/0x3fb
      [  246.229165]  [<ffffffff81d0989f>] ? repair_env_string+0x5e/0x5e
      [  246.235787]  [<ffffffff81d095a5>] x86_64_start_reservations+0x2a/0x2c
      [  246.242990]  [<ffffffff81d0969f>] x86_64_start_kernel+0xf8/0xfc
      [  246.249610] ---[ end trace fb74fdef54d79039 ]---
      [  246.254807] ixgbe 0000:c2:00.0 p786p1: initiating reset due to tx timeout
      [  246.262489] ixgbe 0000:c2:00.0 p786p1: Reset adapter
      Last login: Mon Nov 11 08:35:14 from 10.18.17.119
      [root@(none) ~]# [  246.792676] ixgbe 0000:c2:00.0 p786p1: detected SFP+: 5
      [  249.231598] ixgbe 0000:c2:00.0 p786p1: NIC Link is Up 10 Gbps, Flow Control: RX/TX
      [  246.792676] ixgbe 0000:c2:00.0 p786p1: detected SFP+: 5
      [  249.231598] ixgbe 0000:c2:00.0 p786p1: NIC Link is Up 10 Gbps, Flow Control: RX/TX
      
      (last lines keep repeating.  ixgbe driver is dead until module reload.)
      
      If the downed cpu has more vectors than are free on the remaining cpus on the
      system, it is possible that some vectors are "orphaned" even though they are
      assigned to a cpu.  In this case, since the ixgbe driver had a watchdog, the
      watchdog fired and notified that something was wrong.
      
      This patch adds a function, check_vectors(), to compare the number of vectors
      on the CPU going down and compares it to the number of vectors available on
      the system.  If there aren't enough vectors for the CPU to go down, an
      error is returned and propogated back to userspace.
      
      v2: Do not need to look at percpu irqs
      v3: Need to check affinity to prevent counting of MSIs in IOAPIC Lowest
          Priority Mode
      v4: Additional changes suggested by Gong Chen.
      v5/v6/v7/v8: Updated comment text
      Signed-off-by: NPrarit Bhargava <prarit@redhat.com>
      Link: http://lkml.kernel.org/r/1389613861-3853-1-git-send-email-prarit@redhat.comReviewed-by: NGong Chen <gong.chen@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Michel Lespinasse <walken@google.com>
      Cc: Seiji Aguchi <seiji.aguchi@hds.com>
      Cc: Yang Zhang <yang.z.zhang@Intel.com>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Janet Morgan <janet.morgan@intel.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Ruiv Wang <ruiv.wang@gmail.com>
      Cc: Gong Chen <gong.chen@linux.intel.com>
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      Cc: <stable@vger.kernel.org>
      da6139e4
  3. 20 12月, 2013 1 次提交
  4. 01 10月, 2013 1 次提交
    • B
      x86/boot: Further compress CPUs bootup message · a17bce4d
      Borislav Petkov 提交于
      Turn it into (for example):
      
      [    0.073380] x86: Booting SMP configuration:
      [    0.074005] .... node   #0, CPUs:          #1   #2   #3   #4   #5   #6   #7
      [    0.603005] .... node   #1, CPUs:     #8   #9  #10  #11  #12  #13  #14  #15
      [    1.200005] .... node   #2, CPUs:    #16  #17  #18  #19  #20  #21  #22  #23
      [    1.796005] .... node   #3, CPUs:    #24  #25  #26  #27  #28  #29  #30  #31
      [    2.393005] .... node   #4, CPUs:    #32  #33  #34  #35  #36  #37  #38  #39
      [    2.996005] .... node   #5, CPUs:    #40  #41  #42  #43  #44  #45  #46  #47
      [    3.600005] .... node   #6, CPUs:    #48  #49  #50  #51  #52  #53  #54  #55
      [    4.202005] .... node   #7, CPUs:    #56  #57  #58  #59  #60  #61  #62  #63
      [    4.811005] .... node   #8, CPUs:    #64  #65  #66  #67  #68  #69  #70  #71
      [    5.421006] .... node   #9, CPUs:    #72  #73  #74  #75  #76  #77  #78  #79
      [    6.032005] .... node  #10, CPUs:    #80  #81  #82  #83  #84  #85  #86  #87
      [    6.648006] .... node  #11, CPUs:    #88  #89  #90  #91  #92  #93  #94  #95
      [    7.262005] .... node  #12, CPUs:    #96  #97  #98  #99 #100 #101 #102 #103
      [    7.865005] .... node  #13, CPUs:   #104 #105 #106 #107 #108 #109 #110 #111
      [    8.466005] .... node  #14, CPUs:   #112 #113 #114 #115 #116 #117 #118 #119
      [    9.073006] .... node  #15, CPUs:   #120 #121 #122 #123 #124 #125 #126 #127
      [    9.679901] x86: Booted up 16 nodes, 128 CPUs
      
      and drop useless elements.
      
      Change num_digits() to hpa's division-avoiding, cell-phone-typed
      version which he went at great lengths and pains to submit on a
      Saturday evening.
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Cc: huawei.libin@huawei.com
      Cc: wangyijing@huawei.com
      Cc: fenghua.yu@intel.com
      Cc: guohanjun@huawei.com
      Cc: paul.gortmaker@windriver.com
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20130930095624.GB16383@pd.tnicSigned-off-by: NIngo Molnar <mingo@kernel.org>
      a17bce4d
  5. 28 9月, 2013 1 次提交
    • B
      x86: Improve the printout of the SMP bootup CPU table · 646e29a1
      Borislav Petkov 提交于
      As the new x86 CPU bootup printout format code maintainer, I am
      taking immediate action to improve and clean (and thus indulge
      my OCD) the reporting of the cores when coming up online.
      
      Fix padding to a right-hand alignment, cleanup code and bind
      reporting width to the max number of supported CPUs on the
      system, like this:
      
       [    0.074509] smpboot: Booting Node   0, Processors:      #1  #2  #3  #4  #5  #6  #7 OK
       [    0.644008] smpboot: Booting Node   1, Processors:  #8  #9 #10 #11 #12 #13 #14 #15 OK
       [    1.245006] smpboot: Booting Node   2, Processors: #16 #17 #18 #19 #20 #21 #22 #23 OK
       [    1.864005] smpboot: Booting Node   3, Processors: #24 #25 #26 #27 #28 #29 #30 #31 OK
       [    2.489005] smpboot: Booting Node   4, Processors: #32 #33 #34 #35 #36 #37 #38 #39 OK
       [    3.093005] smpboot: Booting Node   5, Processors: #40 #41 #42 #43 #44 #45 #46 #47 OK
       [    3.698005] smpboot: Booting Node   6, Processors: #48 #49 #50 #51 #52 #53 #54 #55 OK
       [    4.304005] smpboot: Booting Node   7, Processors: #56 #57 #58 #59 #60 #61 #62 #63 OK
       [    4.961413] Brought up 64 CPUs
      
      and this:
      
       [    0.072367] smpboot: Booting Node   0, Processors:    #1 #2 #3 #4 #5 #6 #7 OK
       [    0.686329] Brought up 8 CPUs
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Cc: Libin <huawei.libin@huawei.com>
      Cc: wangyijing@huawei.com
      Cc: fenghua.yu@intel.com
      Cc: guohanjun@huawei.com
      Cc: paul.gortmaker@windriver.com
      Link: http://lkml.kernel.org/r/20130927143554.GF4422@pd.tnicSigned-off-by: NIngo Molnar <mingo@kernel.org>
      646e29a1
  6. 25 9月, 2013 1 次提交
  7. 05 9月, 2013 1 次提交
  8. 15 7月, 2013 1 次提交
    • P
      x86: delete __cpuinit usage from all x86 files · 148f9bb8
      Paul Gortmaker 提交于
      The __cpuinit type of throwaway sections might have made sense
      some time ago when RAM was more constrained, but now the savings
      do not offset the cost and complications.  For example, the fix in
      commit 5e427ec2 ("x86: Fix bit corruption at CPU resume time")
      is a good example of the nasty type of bugs that can be created
      with improper use of the various __init prefixes.
      
      After a discussion on LKML[1] it was decided that cpuinit should go
      the way of devinit and be phased out.  Once all the users are gone,
      we can then finally remove the macros themselves from linux/init.h.
      
      Note that some harmless section mismatch warnings may result, since
      notify_cpu_starting() and cpu_up() are arch independent (kernel/cpu.c)
      are flagged as __cpuinit  -- so if we remove the __cpuinit from
      arch specific callers, we will also get section mismatch warnings.
      As an intermediate step, we intend to turn the linux/init.h cpuinit
      content into no-ops as early as possible, since that will get rid
      of these warnings.  In any case, they are temporary and harmless.
      
      This removes all the arch/x86 uses of the __cpuinit macros from
      all C files.  x86 only had the one __CPUINIT used in assembly files,
      and it wasn't paired off with a .previous or a __FINIT, so we can
      delete it directly w/o any corresponding additional change there.
      
      [1] https://lkml.org/lkml/2013/5/20/589
      
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: x86@kernel.org
      Acked-by: NIngo Molnar <mingo@kernel.org>
      Acked-by: NThomas Gleixner <tglx@linutronix.de>
      Acked-by: NH. Peter Anvin <hpa@linux.intel.com>
      Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
      148f9bb8
  9. 31 5月, 2013 1 次提交
    • A
      sched/x86: Construct all sibling maps if smt · b0bc225d
      Andrew Jones 提交于
      Commit 316ad248 ("sched/x86: Rewrite
      set_cpu_sibling_map()") broke the construction of sibling maps,
      which also broke the booted_cores accounting.
      
      Before the rewrite, if smt was present, then each map was
      updated for each smt sibling. After the rewrite only
      cpu_sibling_mask gets updated, as the llc and core maps depend
      on 'has_mc = x86_max_cores > 1' instead. This leads to problems
      with topologies like the following
      
      (qemu -smp sockets=2,cores=1,threads=2)
      
        processor       : 0
        physical id     : 0
        siblings        : 1    <= should be 2
        core id         : 0
        cpu cores       : 1
      
        processor       : 1
        physical id     : 0
        siblings        : 1    <= should be 2
        core id         : 0
        cpu cores       : 0    <= should be 1
      
        processor       : 2
        physical id     : 1
        siblings        : 1    <= should be 2
        core id         : 0
        cpu cores       : 1
      
        processor       : 3
        physical id     : 1
        siblings        : 1    <= should be 2
        core id         : 0
        cpu cores       : 0    <= should be 1
      
      This patch restores the former construction by defining has_mc
      as (has_smt || x86_max_cores > 1). This should be fine as there
      were no (has_smt && !has_mc) conditions in the context.
      
      Aso rename has_mc to has_mp now that it's not just for cores.
      Signed-off-by: NAndrew Jones <drjones@redhat.com>
      Acked-by: NPeter Zijlstra <peterz@infradead.org>
      Cc: a.p.zijlstra@chello.nl
      Cc: fenghua.yu@intel.com
      Link: http://lkml.kernel.org/r/1369831695-11970-1-git-send-email-drjones@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      b0bc225d
  10. 08 4月, 2013 1 次提交
  11. 06 3月, 2013 1 次提交
  12. 10 2月, 2013 1 次提交
    • L
      x86 idle: remove mwait_idle() and "idle=mwait" cmdline param · 69fb3676
      Len Brown 提交于
      mwait_idle() is a C1-only idle loop intended to be more efficient
      than HLT, starting on Pentium-4 HT-enabled processors.
      
      But mwait_idle() has been replaced by the more general
      mwait_idle_with_hints(), which handles both C1 and deeper C-states.
      ACPI processor_idle and intel_idle use only mwait_idle_with_hints(),
      and no longer use mwait_idle().
      
      Here we simplify the x86 native idle code by removing mwait_idle(),
      and the "idle=mwait" bootparam used to invoke it.
      
      Since Linux 3.0 there has been a boot-time warning when "idle=mwait"
      was invoked saying it would be removed in 2012.  This removal
      was also noted in the (now removed:-) feature-removal-schedule.txt.
      
      After this change, kernels configured with
      (CONFIG_ACPI=n && CONFIG_INTEL_IDLE=n) when run on hardware
      that supports MWAIT will simply use HLT.  If MWAIT is desired
      on those systems, cpuidle and the cpuidle drivers above
      can be enabled.
      Signed-off-by: NLen Brown <len.brown@intel.com>
      Cc: x86@kernel.org
      69fb3676
  13. 01 12月, 2012 1 次提交
    • V
      x86, fpu: Avoid FPU lazy restore after suspend · 644c1541
      Vincent Palatin 提交于
      When a cpu enters S3 state, the FPU state is lost.
      After resuming for S3, if we try to lazy restore the FPU for a process running
      on the same CPU, this will result in a corrupted FPU context.
      
      Ensure that "fpu_owner_task" is properly invalided when (re-)initializing a CPU,
      so nobody will try to lazy restore a state which doesn't exist in the hardware.
      
      Tested with a 64-bit kernel on a 4-core Ivybridge CPU with eagerfpu=off,
      by doing thousands of suspend/resume cycles with 4 processes doing FPU
      operations running. Without the patch, a process is killed after a
      few hundreds cycles by a SIGFPE.
      
      Cc: Duncan Laurie <dlaurie@chromium.org>
      Cc: Olof Johansson <olofj@chromium.org>
      Cc: <stable@kernel.org> v3.4+ # for 3.4 need to replace this_cpu_write by percpu_write
      Signed-off-by: NVincent Palatin <vpalatin@chromium.org>
      Link: http://lkml.kernel.org/r/1354306532-1014-1-git-send-email-vpalatin@chromium.orgSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      644c1541
  14. 15 11月, 2012 2 次提交
  15. 14 11月, 2012 1 次提交
  16. 23 8月, 2012 1 次提交
    • R
      x86/smp: Don't ever patch back to UP if we unplug cpus · 816afe4f
      Rusty Russell 提交于
      We still patch SMP instructions to UP variants if we boot with a
      single CPU, but not at any other time.  In particular, not if we
      unplug CPUs to return to a single cpu.
      
      Paul McKenney points out:
      
       mean offline overhead is 6251/48=130.2 milliseconds.
      
       If I remove the alternatives_smp_switch() from the offline
       path [...] the mean offline overhead is 550/42=13.1 milliseconds
      
      Basically, we're never going to get those 120ms back, and the
      code is pretty messy.
      
      We get rid of:
      
       1) The "smp-alt-once" boot option. It's actually "smp-alt-boot", the
          documentation is wrong. It's now the default.
      
       2) The skip_smp_alternatives flag used by suspend.
      
       3) arch_disable_nonboot_cpus_begin() and arch_disable_nonboot_cpus_end()
          which were only used to set this one flag.
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      Cc: Paul McKenney <paul.mckenney@us.ibm.com>
      Cc: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/87vcgwwive.fsf@rustcorp.com.auSigned-off-by: NIngo Molnar <mingo@kernel.org>
      816afe4f
  17. 14 6月, 2012 1 次提交
    • V
      x86: Add read_mostly declaration/definition to variables from smp.h · 0816b0f0
      Vlad Zolotarov 提交于
      Add "read-mostly" qualifier to the following variables in
      smp.h:
      
       - cpu_sibling_map
       - cpu_core_map
       - cpu_llc_shared_map
       - cpu_llc_id
       - cpu_number
       - x86_cpu_to_apicid
       - x86_bios_cpu_apicid
       - x86_cpu_to_logical_apicid
      
      As long as all the variables above are only written during the
      initialization, this change is meant to prevent the false
      sharing. More specifically, on vSMP Foundation platform
      x86_cpu_to_apicid shared the same internode_cache_line with
      frequently written lapic_events.
      
      From the analysis of the first 33 per_cpu variables out of 219
      (memories they describe, to be more specific) the 8 have read_mostly
      nature (tlb_vector_offset, cpu_loops_per_jiffy, xen_debug_irq, etc.)
      and 25 are frequently written (irq_stack_union, gdt_page,
      exception_stacks, idt_desc, etc.).
      
      Assuming that the spread of the rest of the per_cpu variables is
      similar, identifying the read mostly memories will make more sense
      in terms of long-term code maintenance comparing to identifying
      frequently written memories.
      Signed-off-by: NVlad Zolotarov <vlad@scalemp.com>
      Acked-by: NShai Fultheim <shai@scalemp.com>
      Cc: Shai Fultheim (Shai@ScaleMP.com) <Shai@scalemp.com>
      Cc: ido@wizery.com
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1719258.EYKzE4Zbq5@vladSigned-off-by: NIngo Molnar <mingo@kernel.org>
      0816b0f0
  18. 13 6月, 2012 1 次提交
    • B
      x86/smp: Fix topology checks on AMD MCM CPUs · 161270fc
      Borislav Petkov 提交于
      The warning below triggers on AMD MCM packages because physical package
      IDs on the cores of a _physical_ socket are the same. I.e., this field
      says which CPUs belong to the same physical package.
      
      However, the same two CPUs belong to two different internal, i.e.
      "logical" nodes in the same physical socket which is reflected in the
      CPU-to-node map on x86 with NUMA.
      
      Which makes this check wrong on the above topologies so circumvent it.
      
      [    0.444413] Booting Node   0, Processors  #1 #2 #3 #4 #5 Ok.
      [    0.461388] ------------[ cut here ]------------
      [    0.465997] WARNING: at arch/x86/kernel/smpboot.c:310 topology_sane.clone.1+0x6e/0x81()
      [    0.473960] Hardware name: Dinar
      [    0.477170] sched: CPU #6's mc-sibling CPU #0 is not on the same node! [node: 1 != 0]. Ignoring dependency.
      [    0.486860] Booting Node   1, Processors  #6
      [    0.491104] Modules linked in:
      [    0.494141] Pid: 0, comm: swapper/6 Not tainted 3.4.0+ #1
      [    0.499510] Call Trace:
      [    0.501946]  [<ffffffff8144bf92>] ? topology_sane.clone.1+0x6e/0x81
      [    0.508185]  [<ffffffff8102f1fc>] warn_slowpath_common+0x85/0x9d
      [    0.514163]  [<ffffffff8102f2b7>] warn_slowpath_fmt+0x46/0x48
      [    0.519881]  [<ffffffff8144bf92>] topology_sane.clone.1+0x6e/0x81
      [    0.525943]  [<ffffffff8144c234>] set_cpu_sibling_map+0x251/0x371
      [    0.532004]  [<ffffffff8144c4ee>] start_secondary+0x19a/0x218
      [    0.537729] ---[ end trace 4eaa2a86a8e2da22 ]---
      [    0.628197]  #7 #8 #9 #10 #11 Ok.
      [    0.807108] Booting Node   3, Processors  #12 #13 #14 #15 #16 #17 Ok.
      [    0.897587] Booting Node   2, Processors  #18 #19 #20 #21 #22 #23 Ok.
      [    0.917443] Brought up 24 CPUs
      
      We ran a topology sanity check test we have here on it and
      it all looks ok... hopefully :).
      Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
      Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20120529135442.GE29157@aftab.osrc.amd.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      161270fc
  19. 06 6月, 2012 2 次提交
  20. 05 6月, 2012 1 次提交
  21. 30 5月, 2012 1 次提交
  22. 17 5月, 2012 1 次提交
    • P
      sched: Remove stale power aware scheduling remnants and dysfunctional knobs · 8e7fbcbc
      Peter Zijlstra 提交于
      It's been broken forever (i.e. it's not scheduling in a power
      aware fashion), as reported by Suresh and others sending
      patches, and nobody cares enough to fix it properly ...
      so remove it to make space free for something better.
      
      There's various problems with the code as it stands today, first
      and foremost the user interface which is bound to topology
      levels and has multiple values per level. This results in a
      state explosion which the administrator or distro needs to
      master and almost nobody does.
      
      Furthermore large configuration state spaces aren't good, it
      means the thing doesn't just work right because it's either
      under so many impossibe to meet constraints, or even if
      there's an achievable state workloads have to be aware of
      it precisely and can never meet it for dynamic workloads.
      
      So pushing this kind of decision to user-space was a bad idea
      even with a single knob - it's exponentially worse with knobs
      on every node of the topology.
      
      There is a proposal to replace the user interface with a single
      3 state knob:
      
       sched_balance_policy := { performance, power, auto }
      
      where 'auto' would be the preferred default which looks at things
      like Battery/AC mode and possible cpufreq state or whatever the hw
      exposes to show us power use expectations - but there's been no
      progress on it in the past many months.
      
      Aside from that, the actual implementation of the various knobs
      is known to be broken. There have been sporadic attempts at
      fixing things but these always stop short of reaching a mergable
      state.
      
      Therefore this wholesale removal with the hopes of spurring
      people who care to come forward once again and work on a
      coherent replacement.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Vincent Guittot <vincent.guittot@linaro.org>
      Cc: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Link: http://lkml.kernel.org/r/1326104915.2442.53.camel@twinsSigned-off-by: NIngo Molnar <mingo@kernel.org>
      8e7fbcbc
  23. 14 5月, 2012 1 次提交
  24. 09 5月, 2012 5 次提交
  25. 26 4月, 2012 2 次提交
  26. 30 3月, 2012 1 次提交
  27. 27 3月, 2012 1 次提交
  28. 14 3月, 2012 1 次提交
  29. 13 3月, 2012 1 次提交
    • P
      sched: Cleanup cpu_active madness · 5fbd036b
      Peter Zijlstra 提交于
      Stepan found:
      
      CPU0		CPUn
      
      _cpu_up()
        __cpu_up()
      
      		boostrap()
      		  notify_cpu_starting()
      		  set_cpu_online()
      		  while (!cpu_active())
      		    cpu_relax()
      
      <PREEMPT-out>
      
      smp_call_function(.wait=1)
        /* we find cpu_online() is true */
        arch_send_call_function_ipi_mask()
      
        /* wait-forever-more */
      
      <PREEMPT-in>
      		  local_irq_enable()
      
        cpu_notify(CPU_ONLINE)
          sched_cpu_active()
            set_cpu_active()
      
      Now the purpose of cpu_active is mostly with bringing down a cpu, where
      we mark it !active to avoid the load-balancer from moving tasks to it
      while we tear down the cpu. This is required because we only update the
      sched_domain tree after we brought the cpu-down. And this is needed so
      that some tasks can still run while we bring it down, we just don't want
      new tasks to appear.
      
      On cpu-up however the sched_domain tree doesn't yet include the new cpu,
      so its invisible to the load-balancer, regardless of the active state.
      So instead of setting the active state after we boot the new cpu (and
      consequently having to wait for it before enabling interrupts) set the
      cpu active before we set it online and avoid the whole mess.
      Reported-by: NStepan Moskovchenko <stepanm@codeaurora.org>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: NThomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1323965362.18942.71.camel@twinsSigned-off-by: NIngo Molnar <mingo@elte.hu>
      5fbd036b
  30. 05 3月, 2012 1 次提交
    • I
      x86: Introduce x86_cpuinit.early_percpu_clock_init hook · df156f90
      Igor Mammedov 提交于
      When kvm guest uses kvmclock, it may hang on vcpu hot-plug.
      This is caused by an overflow in pvclock_get_nsec_offset,
      
          u64 delta = tsc - shadow->tsc_timestamp;
      
      which in turn is caused by an undefined values from percpu
      hv_clock that hasn't been initialized yet.
      Uninitialized clock on being booted cpu is accessed from
         start_secondary
          -> smp_callin
            ->  smp_store_cpu_info
              -> identify_secondary_cpu
                -> mtrr_ap_init
                  -> mtrr_restore
                    -> stop_machine_from_inactive_cpu
                      -> queue_stop_cpus_work
                        ...
                          -> sched_clock
                            -> kvm_clock_read
      which is well before x86_cpuinit.setup_percpu_clockev call in
      start_secondary, where percpu clock is initialized.
      
      This patch introduces a hook that allows to setup/initialize
      per_cpu clock early and avoid overflow due to reading
        - undefined values
        - old values if cpu was offlined and then onlined again
      
      Another possible early user of this clock source is ftrace that
      accesses it to get timestamps for ring buffer entries. So if
      mtrr_ap_init is moved from identify_secondary_cpu to past
      x86_cpuinit.setup_percpu_clockev in start_secondary, ftrace
      may cause the same overflow/hang on cpu hot-plug anyway.
      
      More complete description of the problem:
        https://lkml.org/lkml/2012/2/2/101
      
      Credits to Marcelo Tosatti <mtosatti@redhat.com> for hook idea.
      Acked-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NIgor Mammedov <imammedo@redhat.com>
      Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
      Signed-off-by: NAvi Kivity <avi@redhat.com>
      df156f90
  31. 23 2月, 2012 1 次提交
  32. 13 2月, 2012 1 次提交
  33. 24 12月, 2011 1 次提交