1. 07 November 2008, 1 commit
    • [IA64] fix boot panic caused by offline CPUs · 62ee0540
      Committed by Doug Chapman
      This fixes a regression introduced by 2c6e6db4
      ("Minimize per_cpu reservations").  That patch incorrectly used information
      about which CPUs are possible that was not yet initialized by ACPI.  The end
      result was that per_cpu structures for offline CPUs were not initialized,
      causing a NULL pointer dereference.
      
      Since we cannot do the full acpi_boot_init() call any earlier, the simplest
      fix is to just parse the MADT for SAPIC entries early to find the CPU
      info.  This should also allow for some cleanup of the code added by
      "Minimize per_cpu reservations"; this patch just fixes the regression, and the
      cleanup will come in a later patch.
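      A minimal sketch of that early walk, assuming hypothetical names
      (early_parse_lsapic(), the available_cpus counter); only
      acpi_table_parse_madt() and the MADT entry type are real ACPI interfaces:

      #include <linux/init.h>
      #include <linux/acpi.h>

      static int available_cpus;      /* assumed bookkeeping, for illustration */

      static int __init
      early_parse_lsapic(struct acpi_subtable_header *header, const unsigned long end)
      {
              struct acpi_madt_local_sapic *lsapic =
                      (struct acpi_madt_local_sapic *)header;

              if (lsapic->lapic_flags & ACPI_MADT_ENABLED)
                      ++available_cpus;       /* count CPUs that can ever exist */
              return 0;
      }

      void __init early_acpi_boot_init(void)
      {
              /* Walk only the MADT SAPIC entries here; the full
                 acpi_boot_init() still runs at its usual, later point. */
              acpi_table_parse_madt(ACPI_MADT_TYPE_LOCAL_SAPIC,
                                    early_parse_lsapic, NR_CPUS);
      }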
      Signed-off-by: Doug Chapman <doug.chapman@hp.com>
      Signed-off-by: Alex Chiang <achiang@hp.com>
      CC: Robin Holt <holt@sgi.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
  2. 20 October 2008, 3 commits
    • always reserve elfcore header memory in crash kernel · d9a9855d
      Committed by Simon Horman
      elfcore header memory needs to be reserved in a crash kernel.  This means
      that the relevant code should be protected by CONFIG_CRASH_DUMP rather
      than CONFIG_PROC_VMCORE.
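      A hedged sketch of the shape of the change (reserve_elfcorehdr() is the
      ia64 helper; exact call sites differ per architecture):

      /* Before (sketch): reservation tied to /proc/vmcore support. */
      #ifdef CONFIG_PROC_VMCORE
              if (reserve_elfcorehdr(&rsvd_region[n].start,
                                     &rsvd_region[n].end) == 0)
                      n++;
      #endif

      /* After (sketch): reserve whenever this kernel can be a crash-dump kernel. */
      #ifdef CONFIG_CRASH_DUMP
              if (reserve_elfcorehdr(&rsvd_region[n].start,
                                     &rsvd_region[n].end) == 0)
                      n++;
      #endif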
      Signed-off-by: Simon Horman <horms@verge.net.au>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • kdump: add is_vmcore_usable() and vmcore_unusable() · 85a0ee34
      Committed by Simon Horman
      The usage of elfcorehdr_addr has changed recently: is_kdump_kernel() now
      uses whether elfcorehdr_addr is set to ELFCORE_ADDR_MAX to indicate if the
      code is executing in a kernel booted as a crash kernel.

      However, arch/ia64/kernel/setup.c:reserve_elfcorehdr will reset
      elfcorehdr_addr to ELFCORE_ADDR_MAX on error, which means any subsequent
      calls to is_kdump_kernel() will return 0, even though they should return
      1.
      
      Ok, at this point in time there are no subsequent calls, but I think it's
      fair to say that there is ample scope for error, or at the very least
      confusion.
      
      This patch adds an extra state, ELFCORE_ADDR_ERR, which indicates that
      elfcorehdr_addr was passed on the command line, and thus execution is
      taking place in a crashdump kernel, but vmcore can't be used for some
      reason.  This is tested for using is_vmcore_usable() and set using
      vmcore_unusable().  A subsequent patch makes use of this new code.
      
      To summarise, the states that elfcorehdr_addr can now be in are as follows:
      
      ELFCORE_ADDR_MAX: not a crashdump kernel
      ELFCORE_ADDR_ERR: crashdump kernel but vmcore is unusable
      any other value:  crash dump kernel and vmcore is usable
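      A sketch of helpers consistent with that summary (close to, but not
      necessarily identical to, the code added by this patch):

      #define ELFCORE_ADDR_MAX        (-1ULL)
      #define ELFCORE_ADDR_ERR        (-2ULL)

      extern unsigned long long elfcorehdr_addr;

      /* Is this kernel running as a crash-dump (kdump) kernel? */
      static inline int is_kdump_kernel(void)
      {
              return elfcorehdr_addr != ELFCORE_ADDR_MAX;
      }

      /* Crash-dump kernel *and* the ELF core header is usable. */
      static inline int is_vmcore_usable(void)
      {
              return is_kdump_kernel() &&
                     elfcorehdr_addr != ELFCORE_ADDR_ERR;
      }

      /* Mark vmcore as unusable but keep is_kdump_kernel() returning true. */
      static inline void vmcore_unusable(void)
      {
              if (is_kdump_kernel())
                      elfcorehdr_addr = ELFCORE_ADDR_ERR;
      }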
      Signed-off-by: Simon Horman <horms@verge.net.au>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • kdump: make elfcorehdr_addr independent of CONFIG_PROC_VMCORE · 57cac4d1
      Committed by Vivek Goyal
      o elfcorehdr_addr is used not only by the code under CONFIG_PROC_VMCORE
        but also by code which is not inside CONFIG_PROC_VMCORE.  For
        example, is_kdump_kernel() is used by powerpc code to determine if the
        kernel is booting after a panic and, if so, to use the previous kernel's
        TCE table.  So even if CONFIG_PROC_VMCORE is not set in the second kernel,
        one should be able to correctly determine that we are booting after a
        panic and set up the calgary iommu accordingly.
      
      o So remove the assumption that elfcorehdr_addr is under
        CONFIG_PROC_VMCORE.
      
      o Move the definition of elfcorehdr_addr to arch-dependent crash files
        (see the sketch after this list).  Unfortunately crash dump does not have
        an arch-independent file, otherwise that would have been the best place.
      
      o kexec.c is not the right place, as one can have CRASH_DUMP enabled in the
        second kernel without KEXEC being enabled.
      
      o I don't see the sh setup code parsing the command line for
        elfcorehdr_addr.  I am wondering how the vmcore interface works on sh.
        Anyway, I am at least defining elfcorehdr_addr so that compilation is not
        broken on sh.
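      A sketch of the arch-side definition and command-line parsing this implies
      (the setup_elfcorehdr() parser shown is illustrative; placement is per
      architecture):

      #include <linux/kernel.h>
      #include <linux/init.h>
      #include <linux/crash_dump.h>

      /* Defined unconditionally in an arch crash file, so is_kdump_kernel()
         works even when CONFIG_PROC_VMCORE is not set. */
      unsigned long long elfcorehdr_addr = ELFCORE_ADDR_MAX;

      static int __init setup_elfcorehdr(char *arg)
      {
              char *end;

              if (!arg)
                      return -EINVAL;
              elfcorehdr_addr = memparse(arg, &end);
              return end > arg ? 0 : -EINVAL;
      }
      early_param("elfcorehdr", setup_elfcorehdr);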
      Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
      Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Acked-by: Simon Horman <horms@verge.net.au>
      Acked-by: Paul Mundt <lethal@linux-sh.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  3. 18 October 2008, 1 commit
  4. 23 September 2008, 1 commit
  5. 13 August 2008, 1 commit
    • [IA64] Ensure cpu0 can access per-cpu variables in early boot code · 10617bbe
      Committed by Tony Luck
      ia64 handles per-cpu variables a little differently from other architectures
      in that it maps the physical memory allocated for each cpu at a constant
      virtual address (0xffffffffffff0000). This mapping is not enabled until
      the architecture-specific cpu_init() function is run, which causes problems
      since some generic code is run before this point. In particular, when
      CONFIG_PRINTK_TIME is enabled, the boot cpu will trap on the access to
      per-cpu memory at the first printk() call, so the boot will fail without
      the kernel printing anything to the console.
      
      Fix this by allocating percpu memory for cpu0 in the kernel data section
      and doing all initialization to enable percpu access in head.S before
      calling any generic code.
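      A rough sketch of the idea (the real change is in head.S and the linker
      script; the section name and size macro here are only illustrative):

      /* Carve out a per-cpu area for the boot CPU inside the kernel image's
         data section, so per-cpu accesses work before cpu_init() establishes
         the 0xffffffffffff0000 mapping. */
      #define PERCPU_PAGE_SIZE        (64 * 1024)     /* illustrative size */

      char cpu0_per_cpu[PERCPU_PAGE_SIZE]
              __attribute__((section(".data.percpu_cpu0"),
                             aligned(PERCPU_PAGE_SIZE)));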
      
      Other cpus must take care not to access per-cpu variables too early, but
      their code path from start_secondary() to cpu_init() is all in arch/ia64.
      Signed-off-by: Tony Luck <tony.luck@intel.com>
  6. 02 August 2008, 1 commit
    • [IA64] Move include/asm-ia64 to arch/ia64/include/asm · 7f30491c
      Committed by Tony Luck
      After moving the include files there were a few clean-ups:
      
      1) Some files used #include <asm-ia64/xyz.h>, changed to <asm/xyz.h>
      
      2) Some comments alerted maintainers to look at various header files to
      make matching updates if certain code were to be changed. Updated these
      comments to use the new include paths.
      
      3) Some header files mentioned their own names in initial comments. Just
      deleted these self references.
      Signed-off-by: Tony Luck <tony.luck@intel.com>
  7. 01 July 2008, 1 commit
    • [IA64] Bugfix for system with 32 cpus · dd4f0888
      Committed by Tony Luck
      On a system where there are no hot-pluggable cpus, "additional_cpus"
      is still set to -1 at the point where we call per_cpu_scan_finalize().
      If we didn't find an SRAT table and so pick the default "32" for the
      number of cpus, when we get to:
      high_cpu = min(high_cpu + reserve_cpus, NR_CPUS);
      we will end up initializing for just 31 cpus ... and so we will
      die horribly when bringing up cpu#32.
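      A worked example of the failure, using the values described above:

      int possible     = 32;   /* default guess when no SRAT is found     */
      int reserve_cpus = -1;   /* additional_cpus, never adjusted from -1 */
      int high_cpu     = min(possible + reserve_cpus, NR_CPUS);
      /* high_cpu == 31: the per_cpu setup is sized one CPU short, and the
         boot dies when the 32nd cpu is brought online. */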
      
      Problem introduced by: 2c6e6db4
      "Minimize per_cpu reservations."
      Acked-by: Robin Holt <holt@sgi.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
  8. 25 June 2008, 1 commit
  9. 28 May 2008, 2 commits
    • [IA64] pvops: define initialization hooks, pv_init_ops, for paravirtualized environment. · e51835d5
      Committed by Isaku Yamahata
      Define pv_init_ops, a set of hooks which represents the various
      initialization hooks for a paravirtualized environment, and add the hooks.
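      A sketch of what such a hook table looks like (the member names below are
      illustrative, not necessarily the exact set added by this patch):

      #include <linux/types.h>

      /* Init-time hooks that a paravirtualized platform (e.g. Xen/ia64) can
         override; the native table fills in default handlers. */
      struct pv_init_ops {
              void (*banner)(void);                       /* identify the platform     */
              int  (*reserve_memory)(u64 start, u64 end); /* hypervisor-owned areas    */
              void (*arch_setup_early)(void);             /* before ACPI table parsing */
              void (*arch_setup_console)(char **cmdline_p);
      };

      extern struct pv_init_ops pv_init_ops;              /* chosen at boot */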
      Signed-off-by: Alex Williamson <alex.williamson@hp.com>
      Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
    • [IA64] Workaround for RSE issue · 4dcc29e1
      Committed by Tony Luck
      Problem: An application violating the architectural rules regarding
      operation dependencies and having specific Register Stack Engine (RSE)
      state at the time of the violation, may result in an illegal operation
      fault and invalid RSE state.  Such faults may initiate a cascade of
      repeated illegal operation faults within OS interruption handlers.
      The specific behavior is OS dependent.
      
      Implication: An application causing an illegal operation fault with
      specific RSE state may result in a series of illegal operation faults
      and an eventual OS stack overflow condition.
      
      Workaround: OS interruption handlers that switch to kernel backing
      store implement a check for invalid RSE state to avoid the series
      of illegal operation faults.
      
      The core of the workaround is the RSE_WORKAROUND code sequence
      inserted into each invocation of the SAVE_MIN_WITH_COVER and
      SAVE_MIN_WITH_COVER_R19 macros.  This sequence includes hard-coded
      constants that depend on the number of stacked physical registers
      being 96.  The rest of this patch consists of code to disable this
      workaround should this not be the case (with the presumption that
      if a future Itanium processor increases the number of registers, it
      would also remove the need for this patch).
      
      Move the start of the RBS up to a mod32 boundary to avoid some
      corner cases.
      
      The dispatch_illegal_op_fault code outgrew the spot it was
      squatting in when built with this patch and CONFIG_VIRT_CPU_ACCOUNTING=y.
      Move it out to the end of the ivt.
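      A hedged sketch of the boot-time check implied above (ia64_pal_rse_info()
      is the PAL query; disable_rse_workaround() is a hypothetical stand-in for
      the actual un-patching of the ivt):

      #include <asm/pal.h>

      static void __init check_rse_workaround(void)   /* hypothetical wrapper */
      {
              u64 num_phys_stacked;

              /* Ask PAL how many stacked physical registers this CPU has; the
                 hard-coded workaround constants are only valid for 96. */
              if (ia64_pal_rse_info(&num_phys_stacked, NULL) != 0 ||
                  num_phys_stacked != 96)
                      disable_rse_workaround();       /* hypothetical helper */
      }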
      Signed-off-by: Tony Luck <tony.luck@intel.com>
  10. 15 May 2008, 1 commit
    • [IA64] Don't reserve crashkernel memory > 4 GB · 8a3360f0
      Committed by Bernhard Walle
      Some IA64 machines map all cell-local memory above 4 GB (32 bit limit).
      However, in most cases, the kernel needs some memory below that limit that is
      DMA-capable. So in this machine configuration, the crashkernel will be reserved
      above 4 GB.
      
      For machines that use the SWIOTLB implementation because they lack an I/O MMU,
      low memory is required by SWIOTLB itself. In that case,
      it doesn't make sense to reserve the crashkernel at all because it's unusable
      for kdump.
      
      A special case is the "hpzx1" machine vector. In theory, it has an I/O MMU, so
      it can be booted above 4 GB. However, in the kdump case that is not possible
      because of changeset 51b58e3e:
      
          On HP zx1 machines, the 'machvec=dig' parameter is needed for the kdump
          kernel to avoid problems with the HP sba iommu.  The problem is that during
          the boot of the kdump kernel, the iommu is re-initialized, so in-flight DMA
          from improperly shutdown drivers causes an IOTLB miss which leads to an
          MCA.  With kdump, the idea is to get into the kdump kernel with as little
          code as we can, so shutting down drivers properly is not an option.
      
          The workaround is to add 'machvec=dig' to the kdump kernel boot parameters.
          This makes the kdump kernel avoid using the sba iommu altogether, leaving
          the IOTLB intact.  Any ongoing DMA falls harmlessly outside the kdump
          kernel.  After the kdump kernel reboots, all devices will have been
          shutdown properly and DMA stopped.
      
      This patch pushes that functionality into the sba iommu initialization
      code, so that users won't have to find the obscure documentation telling
      them about 'machvec=dig'.
      
      This means that for hpzx1, too, it is not possible to boot when all
      memory is above the 4 GB limit. So the only machine vectors that can handle
      this case are "sn2" and "uv".
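      A sketch of the resulting reservation policy
      (machvec_handles_high_crashkernel() is a hypothetical predicate standing
      in for the per-machine-vector check):

      #define FOUR_GB         (1ULL << 32)

      /* Only "sn2" and "uv" can use a crash kernel placed entirely above
         4 GB; for everything else, skip the reservation. */
      if (crash_base >= FOUR_GB && !machvec_handles_high_crashkernel())
              return;         /* reserving the memory would only waste it */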
      Signed-off-by: Bernhard Walle <bwalle@suse.de>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
  11. 12 April 2008, 1 commit
    • [IA64] Fix NUMA configuration issue · 98075d24
      Committed by Zoltan Menyhart
      There is a NUMA memory configuration issue in 2.6.24:
      
      A 2-node machine of ours has got the following memory layout:
      
      Node 0:	0 - 2 Gbytes
      Node 0:	4 - 8 Gbytes
      Node 1:	8 - 16 Gbytes
      Node 0:	16 - 18 Gbytes
      
      "efi_memmap_init()" merges the three last ranges into one.
      
      "register_active_ranges()" is called as follows:
      
      efi_memmap_walk(register_active_ranges, NULL);
      
      i.e. once for the 4 - 18 Gbytes range. It picks up the node
      number from the start address and registers all the memory for
      node #0.
      
      "register_active_ranges()" should be called as follows to
      make sure there is no merged address range at its entry:
      
      efi_memmap_walk(filter_memory, register_active_ranges);
      
      "filter_memory()" is similar to "filter_rsvd_memory()",
      but the reserved memory ranges are not filtered out.
      Signed-off-by: Zoltan Menyhart <Zoltan.Menyhart@bull.net>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
  12. 09 April 2008, 1 commit
    • [IA64] Minimize per_cpu reservations. · 2c6e6db4
      Committed by holt@sgi.com
      This patch significantly shrinks boot memory allocation on ia64.
      It does this by not allocating per_cpu areas for cpus that can never
      exist.
      
      In the case where ACPI does not have any NUMA node description of the
      cpus, I defaulted to assigning the first 32 round-robin on the known
      nodes.  For the !CONFIG_ACPI case I used for_each_possible_cpu().
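      A sketch of that fallback (node_cpuid[] is the ia64 node bookkeeping
      array; limiting the spread to the first 32 cpus is the behaviour
      described above):

      static void __init assign_default_nodes(void)   /* hypothetical wrapper */
      {
              int cpu, nnodes = num_online_nodes();    /* nodes known so far */

              /* No per-cpu node info from ACPI: spread the first 32 possible
                 cpus round-robin across the known nodes. */
              for (cpu = 0; cpu < 32; cpu++)
                      node_cpuid[cpu].nid = cpu % nnodes;
      }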
      Signed-off-by: Robin Holt <holt@sgi.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
  13. 05 April 2008, 2 commits
  14. 07 March 2008, 1 commit
  15. 05 February 2008, 1 commit
  16. 26 January 2008, 1 commit
  17. 08 December 2007, 1 commit
  18. 30 October 2007, 1 commit
    • [IA64] /proc/cpuinfo "physical id" field cleanups · 113134fc
      Committed by Alex Chiang
      Clean up the process for presenting the "physical id" field in
      /proc/cpuinfo.
      
      	- remove global smp_num_cpucores, as it is mostly useless
      
      	- remove check_for_logical_procs(), since we do the same
      	  functionality in identify_siblings()
      
      	- reflow logic in identify_siblings(). If an older CPU
      	  does not implement PAL_LOGICAL_TO_PHYSICAL, we may still
      	  be able to get useful information from SAL_PHYSICAL_ID_INFO
      
      	- in identify_siblings(), threads/cores are a property of
      	  the CPU, not the platform
      
      	- remove useless printk's about multi-core / thread
      	  capability in identify_siblings(), as that information
      	  is readily available in /proc/cpuinfo, and printing for
      	  the BSP only adds little value
      
      	- smp_num_siblings is now meaningful if any CPU in the
      	  system supports threads, not just the BSP
      
      	- expose "physical id" field, even on CPUs that are not
      	  multi-core / multi-threaded (as long as we have a valid
      	  value). Now we know what sockets Madisons live in too.
      Signed-off-by: Alex Chiang <achiang@hp.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
  19. 22 October 2007, 1 commit
    • kexec: add BSS to resource tree · 00bf4098
      Committed by Bernhard Walle
      Add the BSS to the resource tree just as kernel text and kernel data
      are.  The main reason behind this is to avoid crashkernel
      reservation in that area.
      
      While it's not strictly necessary to have the BSS in the resource tree (the
      actual collision detection is done in the reserve_bootmem() function beforehand),
      the BSS resource should be presented to the user in /proc/iomem
      just as Kernel data and Kernel code are.
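      A sketch of the registration, mirroring how the existing Kernel code/data
      resources are set up (__bss_start/__bss_stop are the linker-provided
      symbols; the wrapper function is hypothetical):

      #include <linux/init.h>
      #include <linux/ioport.h>
      #include <asm/io.h>
      #include <asm/sections.h>

      static struct resource bss_resource = {
              .name   = "Kernel bss",
              .flags  = IORESOURCE_MEM | IORESOURCE_BUSY,
      };

      static void __init register_bss_resource(void)
      {
              bss_resource.start = virt_to_phys(__bss_start);
              bss_resource.end   = virt_to_phys(__bss_stop) - 1;
              request_resource(&iomem_resource, &bss_resource);
      }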
      
      Note: The patch currently is only implemented for x86 and ia64 (because
      efi_initialize_iomem_resources() has the same signature on i386 and ia64).
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: Bernhard Walle <bwalle@suse.de>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Vivek Goyal <vgoyal@in.ibm.com>
      Cc: <linux-arch@vger.kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  20. 20 October 2007, 1 commit
  21. 17 October 2007, 2 commits
  22. 01 September 2007, 1 commit
  23. 29 August 2007, 1 commit
  24. 18 August 2007, 1 commit
  25. 31 July 2007, 1 commit
  26. 26 July 2007, 1 commit
  27. 20 July 2007, 1 commit
  28. 17 July 2007, 1 commit
    • serial: convert early_uart to earlycon for 8250 · 18a8bd94
      Committed by Yinghai Lu
      Because SERIAL_PORT_DFNS was removed from include/asm-i386/serial.h and
      include/asm-x86_64/serial.h, the serial8250_ports need to be probed late in
      the serial initialization stage.  The console_init=>serial8250_console_init=>
      register_console=>serial8250_console_setup path will return -ENODEV, and console
      ttyS0 can not be enabled at that time.  We would need to wait until uart_add_one_port in
      drivers/serial/serial_core.c calls register_console to get console ttyS0, and
      that is too late.
      
      Make early_uart use early_param, so the uart console can be used earlier.  Make
      it a bootconsole with the CON_BOOT flag, so it can use the console handover feature
      and will switch to the corresponding normal serial console automatically.
      
      new command line will be:
      	console=uart8250,io,0x3f8,9600n8
      	console=uart8250,mmio,0xff5e0000,115200n8
      or
      	earlycon=uart8250,io,0x3f8,9600n8
      	earlycon=uart8250,mmio,0xff5e0000,115200n8
      
      it will print at a very early stage:
      	Early serial console at I/O port 0x3f8 (options '9600n8')
      	console [uart0] enabled
      later, for the console, it will print:
      	console handover: boot [uart0] -> real [ttyS0]
      
      Signed-off-by: <yinghai.lu@sun.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Gerd Hoffmann <kraxel@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  29. 10 July 2007, 1 commit
    • sched: zap the migration init / cache-hot balancing code · 0437e109
      Committed by Ingo Molnar
      The SMP load-balancer uses the boot-time migration-cost estimation
      code to attempt to improve the quality of balancing. The reason for
      this code is that the discrete priority queues do not preserve
      the order of scheduling accurately, so the load-balancer skips
      tasks that were running on a CPU 'recently'.

      This code is fundamentally fragile: the boot-time migration-cost detector
      doesn't really work on systems that have large L3 caches, it caused boot
      delays on large systems, and the whole cache-hot concept made the
      balancing code pretty non-deterministic as well.

      (And hey, I wrote most of it, so I can say out loud that it sucks. ;-)

      Under CFS the same goal of cache affinity can be achieved without
      any cache-hot special case: tasks are sorted in the 'timeline'
      tree and the SMP balancer picks tasks from the left side of the
      tree, so the most cache-cold task is balanced automatically.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  30. 12 May 2007, 1 commit
  31. 08 May 2007, 1 commit
  32. 07 April 2007, 1 commit
  33. 21 March 2007, 1 commit
  34. 08 March 2007, 1 commit
  35. 07 March 2007, 1 commit
    • [IA64] kexec: Use EFI_LOADER_DATA for ELF core header · cee87af2
      Committed by Magnus Damm
      The address where the ELF core header is stored is passed to the secondary
      kernel as a kernel command line option.  The memory area for this header is
      also marked as a separate EFI memory descriptor on ia64.
      
      The separate EFI memory descriptor currently has the type
      EFI_UNUSABLE_MEMORY.  With such a type, the secondary kernel skips over the
      entire memory granule (a config option, 16M or 64M) when detecting memory.
      If we are lucky we will just lose some memory, but if we happen to have
      data in the same granule (such as an initramfs image), then this data will
      never get mapped and the kernel bombs out when trying to access it.
      
      So this is an attempt to fix this by changing the EFI memory descriptor
      type into EFI_LOADER_DATA.  This type is the same type used for the kernel
      data and for initramfs.  In the secondary kernel we then handle the ELF
      core header data the same way as we handle the initramfs image.
      
      This patch contains the kernel changes to make this happen.  It is pretty
      straightforward: we reserve the area in reserve_memory().  The address for
      the area comes from the kernel command line and the size comes from the
      specialized EFI parsing function vmcore_find_descriptor_size().
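      A sketch of that reservation step (rsvd_region[] is the ia64
      reserved-region table; the guard and bounds handling are simplified here):

      /* elfcorehdr_addr came from the kernel command line; ask the EFI memory
         map for the size of the matching EFI_LOADER_DATA descriptor. */
      if (elfcorehdr_addr) {
              u64 size = vmcore_find_descriptor_size(elfcorehdr_addr);

              if (size) {
                      rsvd_region[n].start =
                              (unsigned long)__va(elfcorehdr_addr);
                      rsvd_region[n].end = rsvd_region[n].start + size;
                      n++;
              }
      }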
      
      The kexec-tools-testing code for this can be found here:
      http://lists.osdl.org/pipermail/fastboot/2007-February/005983.html
      Signed-off-by: Magnus Damm <magnus@valinux.co.jp>
      Cc: Simon Horman <horms@verge.net.au>
      Cc: Vivek Goyal <vgoyal@in.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Tony Luck <tony.luck@intel.com>