1. 16 1月, 2014 1 次提交
    • H
      x86, apic, kexec: Add disable_cpu_apicid kernel parameter · 151e0c7d
      HATAYAMA Daisuke 提交于
      Add disable_cpu_apicid kernel parameter. To use this kernel parameter,
      specify an initial APIC ID of the corresponding CPU you want to
      disable.
      
      This is mostly used for the kdump 2nd kernel to disable BSP to wake up
      multiple CPUs without causing system reset or hang due to sending INIT
      from AP to BSP.
      
      Kdump users first figure out initial APIC ID of the BSP, CPU0 in the
      1st kernel, for example from /proc/cpuinfo and then set up this kernel
      parameter for the 2nd kernel using the obtained APIC ID.
      
      However, doing this procedure at each boot time manually is awkward,
      which should be automatically done by user-land service scripts, for
      example, kexec-tools on fedora/RHEL distributions.
      
      This design is more flexible than disabling BSP in kernel boot time
      automatically in that in kernel boot time we have no choice but
      referring to ACPI/MP table to obtain initial APIC ID for BSP, meaning
      that the method is not applicable to the systems without such BIOS
      tables.
      
      One assumption behind this design is that users get initial APIC ID of
      the BSP in still healthy state and so BSP is uniquely kept in
      CPU0. Thus, through the kernel parameter, only one initial APIC ID can
      be specified.
      
      In a comparison with disabled_cpu_apicid, we use read_apic_id(), not
      boot_cpu_physical_apicid, because on some platforms, the variable is
      modified to the apicid reported as BSP through MP table and this
      function is executed with the temporarily modified
      boot_cpu_physical_apicid. As a result, disabled_cpu_apicid kernel
      parameter doesn't work well for apicids of APs.
      
      Fixing the wrong handling of boot_cpu_physical_apicid requires some
      reviews and tests beyond some platforms and it could take some
      time. The fix here is a kind of workaround to focus on the main topic
      of this patch.
      Signed-off-by: NHATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
      Link: http://lkml.kernel.org/r/20140115064458.1545.38775.stgit@localhost6.localdomain6Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      151e0c7d
  2. 12 1月, 2014 1 次提交
  3. 31 12月, 2013 1 次提交
  4. 29 12月, 2013 2 次提交
    • D
      x86: Export x86 boot_params to sysfs · 5039e316
      Dave Young 提交于
      kexec-tools use boot_params for getting the 1st kernel hardware_subarch,
      the kexec kernel EFI runtime support also needs to read the old efi_info
      from boot_params. Currently it exists in debugfs which is not a good
      place for such infomation. Per HPA, we should avoid "sploit debugfs".
      
      In this patch /sys/kernel/boot_params are exported, also the setup_data is
      exported as a subdirectory. kexec-tools is using debugfs for hardware_subarch
      for a long time now so we're not removing it yet.
      
      Structure is like below:
      
      /sys/kernel/boot_params
      |__ data                /* boot_params in binary*/
      |__ setup_data
      |   |__ 0               /* the first setup_data node */
      |   |   |__ data        /* setup_data node 0 in binary*/
      |   |   |__ type        /* setup_data type of setup_data node 0, hex string */
      [snip]
      |__ version             /* boot protocal version (in hex, "0x" prefixed)*/
      Signed-off-by: NDave Young <dyoung@redhat.com>
      Acked-by: NBorislav Petkov <bp@suse.de>
      Tested-by: NToshi Kani <toshi.kani@hp.com>
      Signed-off-by: NMatt Fleming <matt.fleming@intel.com>
      5039e316
    • D
      x86: Add xloadflags bit for EFI runtime support on kexec · 456a29dd
      Dave Young 提交于
      Old kexec-tools can not load new kernels. The reason is kexec-tools does
      not fill efi_info in x86 setup header previously, thus EFI failed to
      initialize.  In new kexec-tools it will by default to fill efi_info and
      pass other EFI required infomation to 2nd kernel so kexec kernel EFI
      initialization can succeed finally.
      
      To prevent from breaking userspace, add a new xloadflags bit so
      kexec-tools can check the flag and switch to old logic.
      Signed-off-by: NDave Young <dyoung@redhat.com>
      Acked-by: NBorislav Petkov <bp@suse.de>
      Tested-by: NToshi Kani <toshi.kani@hp.com>
      Signed-off-by: NMatt Fleming <matt.fleming@intel.com>
      456a29dd
  5. 22 12月, 2013 2 次提交
  6. 21 12月, 2013 2 次提交
  7. 19 12月, 2013 2 次提交
  8. 18 12月, 2013 1 次提交
  9. 17 12月, 2013 2 次提交
  10. 16 12月, 2013 5 次提交
  11. 13 12月, 2013 1 次提交
  12. 12 12月, 2013 1 次提交
  13. 11 12月, 2013 2 次提交
  14. 10 12月, 2013 2 次提交
  15. 06 12月, 2013 1 次提交
    • T
      net: davinci_emac: Fix platform data handling and make usable for am3517 · dd0df47d
      Tony Lindgren 提交于
      When booted with device tree, we may still have platform data passed
      as auxdata. For am3517 this is needed for passing the interrupt_enable
      and interrupt_disable callbacks that access the omap system control module
      registers. These callback functions will eventually go away when we have
      a separate system control module driver.
      
      Some of the things that are currently passed as platform data we don't need
      to set up as device tree properties as they are always the same on am3517.
      So let's use a new compatible flag for those so we can get those from
      the device tree match data.
      
      Also note that we need to fix setting of phy_dev to NULL instead of an empty
      string as the code later on uses that to find the first phy on the mdio bus.
      This seems to have been caused by 5d69e007 (net: davinci_emac: switch to
      new mdio).
      Signed-off-by: NTony Lindgren <tony@atomide.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      dd0df47d
  16. 04 12月, 2013 5 次提交
    • P
      rcu: Break call_rcu() deadlock involving scheduler and perf · 96d3fd0d
      Paul E. McKenney 提交于
      Dave Jones got the following lockdep splat:
      
      >  ======================================================
      >  [ INFO: possible circular locking dependency detected ]
      >  3.12.0-rc3+ #92 Not tainted
      >  -------------------------------------------------------
      >  trinity-child2/15191 is trying to acquire lock:
      >   (&rdp->nocb_wq){......}, at: [<ffffffff8108ff43>] __wake_up+0x23/0x50
      >
      > but task is already holding lock:
      >   (&ctx->lock){-.-...}, at: [<ffffffff81154c19>] perf_event_exit_task+0x109/0x230
      >
      > which lock already depends on the new lock.
      >
      >
      > the existing dependency chain (in reverse order) is:
      >
      > -> #3 (&ctx->lock){-.-...}:
      >         [<ffffffff810cc243>] lock_acquire+0x93/0x200
      >         [<ffffffff81733f90>] _raw_spin_lock+0x40/0x80
      >         [<ffffffff811500ff>] __perf_event_task_sched_out+0x2df/0x5e0
      >         [<ffffffff81091b83>] perf_event_task_sched_out+0x93/0xa0
      >         [<ffffffff81732052>] __schedule+0x1d2/0xa20
      >         [<ffffffff81732f30>] preempt_schedule_irq+0x50/0xb0
      >         [<ffffffff817352b6>] retint_kernel+0x26/0x30
      >         [<ffffffff813eed04>] tty_flip_buffer_push+0x34/0x50
      >         [<ffffffff813f0504>] pty_write+0x54/0x60
      >         [<ffffffff813e900d>] n_tty_write+0x32d/0x4e0
      >         [<ffffffff813e5838>] tty_write+0x158/0x2d0
      >         [<ffffffff811c4850>] vfs_write+0xc0/0x1f0
      >         [<ffffffff811c52cc>] SyS_write+0x4c/0xa0
      >         [<ffffffff8173d4e4>] tracesys+0xdd/0xe2
      >
      > -> #2 (&rq->lock){-.-.-.}:
      >         [<ffffffff810cc243>] lock_acquire+0x93/0x200
      >         [<ffffffff81733f90>] _raw_spin_lock+0x40/0x80
      >         [<ffffffff810980b2>] wake_up_new_task+0xc2/0x2e0
      >         [<ffffffff81054336>] do_fork+0x126/0x460
      >         [<ffffffff81054696>] kernel_thread+0x26/0x30
      >         [<ffffffff8171ff93>] rest_init+0x23/0x140
      >         [<ffffffff81ee1e4b>] start_kernel+0x3f6/0x403
      >         [<ffffffff81ee1571>] x86_64_start_reservations+0x2a/0x2c
      >         [<ffffffff81ee1664>] x86_64_start_kernel+0xf1/0xf4
      >
      > -> #1 (&p->pi_lock){-.-.-.}:
      >         [<ffffffff810cc243>] lock_acquire+0x93/0x200
      >         [<ffffffff8173419b>] _raw_spin_lock_irqsave+0x4b/0x90
      >         [<ffffffff810979d1>] try_to_wake_up+0x31/0x350
      >         [<ffffffff81097d62>] default_wake_function+0x12/0x20
      >         [<ffffffff81084af8>] autoremove_wake_function+0x18/0x40
      >         [<ffffffff8108ea38>] __wake_up_common+0x58/0x90
      >         [<ffffffff8108ff59>] __wake_up+0x39/0x50
      >         [<ffffffff8110d4f8>] __call_rcu_nocb_enqueue+0xa8/0xc0
      >         [<ffffffff81111450>] __call_rcu+0x140/0x820
      >         [<ffffffff81111b8d>] call_rcu+0x1d/0x20
      >         [<ffffffff81093697>] cpu_attach_domain+0x287/0x360
      >         [<ffffffff81099d7e>] build_sched_domains+0xe5e/0x10a0
      >         [<ffffffff81efa7fc>] sched_init_smp+0x3b7/0x47a
      >         [<ffffffff81ee1f4e>] kernel_init_freeable+0xf6/0x202
      >         [<ffffffff817200be>] kernel_init+0xe/0x190
      >         [<ffffffff8173d22c>] ret_from_fork+0x7c/0xb0
      >
      > -> #0 (&rdp->nocb_wq){......}:
      >         [<ffffffff810cb7ca>] __lock_acquire+0x191a/0x1be0
      >         [<ffffffff810cc243>] lock_acquire+0x93/0x200
      >         [<ffffffff8173419b>] _raw_spin_lock_irqsave+0x4b/0x90
      >         [<ffffffff8108ff43>] __wake_up+0x23/0x50
      >         [<ffffffff8110d4f8>] __call_rcu_nocb_enqueue+0xa8/0xc0
      >         [<ffffffff81111450>] __call_rcu+0x140/0x820
      >         [<ffffffff81111bb0>] kfree_call_rcu+0x20/0x30
      >         [<ffffffff81149abf>] put_ctx+0x4f/0x70
      >         [<ffffffff81154c3e>] perf_event_exit_task+0x12e/0x230
      >         [<ffffffff81056b8d>] do_exit+0x30d/0xcc0
      >         [<ffffffff8105893c>] do_group_exit+0x4c/0xc0
      >         [<ffffffff810589c4>] SyS_exit_group+0x14/0x20
      >         [<ffffffff8173d4e4>] tracesys+0xdd/0xe2
      >
      > other info that might help us debug this:
      >
      > Chain exists of:
      >   &rdp->nocb_wq --> &rq->lock --> &ctx->lock
      >
      >   Possible unsafe locking scenario:
      >
      >         CPU0                    CPU1
      >         ----                    ----
      >    lock(&ctx->lock);
      >                                 lock(&rq->lock);
      >                                 lock(&ctx->lock);
      >    lock(&rdp->nocb_wq);
      >
      >  *** DEADLOCK ***
      >
      > 1 lock held by trinity-child2/15191:
      >  #0:  (&ctx->lock){-.-...}, at: [<ffffffff81154c19>] perf_event_exit_task+0x109/0x230
      >
      > stack backtrace:
      > CPU: 2 PID: 15191 Comm: trinity-child2 Not tainted 3.12.0-rc3+ #92
      >  ffffffff82565b70 ffff880070c2dbf8 ffffffff8172a363 ffffffff824edf40
      >  ffff880070c2dc38 ffffffff81726741 ffff880070c2dc90 ffff88022383b1c0
      >  ffff88022383aac0 0000000000000000 ffff88022383b188 ffff88022383b1c0
      > Call Trace:
      >  [<ffffffff8172a363>] dump_stack+0x4e/0x82
      >  [<ffffffff81726741>] print_circular_bug+0x200/0x20f
      >  [<ffffffff810cb7ca>] __lock_acquire+0x191a/0x1be0
      >  [<ffffffff810c6439>] ? get_lock_stats+0x19/0x60
      >  [<ffffffff8100b2f4>] ? native_sched_clock+0x24/0x80
      >  [<ffffffff810cc243>] lock_acquire+0x93/0x200
      >  [<ffffffff8108ff43>] ? __wake_up+0x23/0x50
      >  [<ffffffff8173419b>] _raw_spin_lock_irqsave+0x4b/0x90
      >  [<ffffffff8108ff43>] ? __wake_up+0x23/0x50
      >  [<ffffffff8108ff43>] __wake_up+0x23/0x50
      >  [<ffffffff8110d4f8>] __call_rcu_nocb_enqueue+0xa8/0xc0
      >  [<ffffffff81111450>] __call_rcu+0x140/0x820
      >  [<ffffffff8109bc8f>] ? local_clock+0x3f/0x50
      >  [<ffffffff81111bb0>] kfree_call_rcu+0x20/0x30
      >  [<ffffffff81149abf>] put_ctx+0x4f/0x70
      >  [<ffffffff81154c3e>] perf_event_exit_task+0x12e/0x230
      >  [<ffffffff81056b8d>] do_exit+0x30d/0xcc0
      >  [<ffffffff810c9af5>] ? trace_hardirqs_on_caller+0x115/0x1e0
      >  [<ffffffff810c9bcd>] ? trace_hardirqs_on+0xd/0x10
      >  [<ffffffff8105893c>] do_group_exit+0x4c/0xc0
      >  [<ffffffff810589c4>] SyS_exit_group+0x14/0x20
      >  [<ffffffff8173d4e4>] tracesys+0xdd/0xe2
      
      The underlying problem is that perf is invoking call_rcu() with the
      scheduler locks held, but in NOCB mode, call_rcu() will with high
      probability invoke the scheduler -- which just might want to use its
      locks.  The reason that call_rcu() needs to invoke the scheduler is
      to wake up the corresponding rcuo callback-offload kthread, which
      does the job of starting up a grace period and invoking the callbacks
      afterwards.
      
      One solution (championed on a related problem by Lai Jiangshan) is to
      simply defer the wakeup to some point where scheduler locks are no longer
      held.  Since we don't want to unnecessarily incur the cost of such
      deferral, the task before us is threefold:
      
      1.	Determine when it is likely that a relevant scheduler lock is held.
      
      2.	Defer the wakeup in such cases.
      
      3.	Ensure that all deferred wakeups eventually happen, preferably
      	sooner rather than later.
      
      We use irqs_disabled_flags() as a proxy for relevant scheduler locks
      being held.  This works because the relevant locks are always acquired
      with interrupts disabled.  We may defer more often than needed, but that
      is at least safe.
      
      The wakeup deferral is tracked via a new field in the per-CPU and
      per-RCU-flavor rcu_data structure, namely ->nocb_defer_wakeup.
      
      This flag is checked by the RCU core processing.  The __rcu_pending()
      function now checks this flag, which causes rcu_check_callbacks()
      to initiate RCU core processing at each scheduling-clock interrupt
      where this flag is set.  Of course this is not sufficient because
      scheduling-clock interrupts are often turned off (the things we used to
      be able to count on!).  So the flags are also checked on entry to any
      state that RCU considers to be idle, which includes both NO_HZ_IDLE idle
      state and NO_HZ_FULL user-mode-execution state.
      
      This approach should allow call_rcu() to be invoked regardless of what
      locks you might be holding, the key word being "should".
      Reported-by: NDave Jones <davej@redhat.com>
      Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      96d3fd0d
    • P
      documentation: Update circular buffer for load-acquire/store-release · 6c43c091
      Paul E. McKenney 提交于
      This commit replaces full barriers by targeted use of load-acquire and
      store-release.
      Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      [ paulmck: Restore comments as suggested by David Howells. ]
      6c43c091
    • P
      documentation: Fix circular-buffer example. · 9873552f
      Paul E. McKenney 提交于
      The code sample in Documentation/circular-buffers.txt appears to have a
      few ordering bugs.  This patch therefore applies the needed fixes.
      Reported-by: NLech Fomicki <lfomicki@poczta.fm>
      Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      9873552f
    • P
    • P
      c9592ecb
  17. 03 12月, 2013 9 次提交