1. 01 2月, 2008 1 次提交
  2. 31 1月, 2008 1 次提交
  3. 30 1月, 2008 16 次提交
    • B
      x86: early boot debugging via FireWire (ohci1394_dma=early) · f212ec4b
      Bernhard Kaindl 提交于
      This patch adds a new configuration option, which adds support for a new
      early_param which gets checked in arch/x86/kernel/setup_{32,64}.c:setup_arch()
      to decide wether OHCI-1394 FireWire controllers should be initialized and
      enabled for physical DMA access to allow remote debugging of early problems
      like issues ACPI or other subsystems which are executed very early.
      
      If the config option is not enabled, no code is changed, and if the boot
      paramenter is not given, no new code is executed, and independent of that,
      all new code is freed after boot, so the config option can be even enabled
      in standard, non-debug kernels.
      
      With specialized tools, it is then possible to get debugging information
      from machines which have no serial ports (notebooks) such as the printk
      buffer contents, or any data which can be referenced from global pointers,
      if it is stored below the 4GB limit and even memory dumps of of the physical
      RAM region below the 4GB limit can be taken without any cooperation from the
      CPU of the host, so the machine can be crashed early, it does not matter.
      
      In the extreme, even kernel debuggers can be accessed in this way. I wrote
      a small kgdb module and an accompanying gdb stub for FireWire which allows
      to gdb to talk to kgdb using remote remory reads and writes over FireWire.
      
      An version of the gdb stub fore FireWire is able to read all global data
      from a system which is running a a normal kernel without any kernel debugger,
      without any interruption or support of the system's CPU. That way, e.g. the
      task struct and so on can be read and even manipulated when the physical DMA
      access is granted.
      
      A HOWTO is included in this patch, in Documentation/debugging-via-ohci1394.txt
      and I've put a copy online at
      ftp://ftp.suse.de/private/bk/firewire/docs/debugging-via-ohci1394.txt
      
      It also has links to all the tools which are available to make use of it
      another copy of it is online at:
      ftp://ftp.suse.de/private/bk/firewire/kernel/ohci1394_dma_early-v2.diffSigned-Off-By: NBernhard Kaindl <bk@suse.de>
      Tested-By: NThomas Renninger <trenn@suse.de>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      f212ec4b
    • W
      x86: GEODE add the "mfgptfix" boot time option to fix MFGPT timers · e6c4dc6c
      Willy Tarreau 提交于
      The new "mfgptfix" boot command line option may be usd to fix MFGPT
      timers on AMD Geode platforms when the BIOS has incorrectly applied
      a workaround. TinyBIOS version 0.98 is known to be affected, 0.99
      fixes the problem by letting the user disable the workaround.
      Signed-off-by: NWilly Tarreau <w@1wt.eu>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      e6c4dc6c
    • Y
      x86_32: trim memory by updating e820 · 093af8d7
      Yinghai Lu 提交于
      when MTRRs are not covering the whole e820 table, we need to trim the
      RAM and need to update e820.
      
      reuse some code on 64-bit as well.
      
      here need to add early_get_cap and use it in early_cpu_detect, and move
      mtrr_bp_init early.
      
      The code successfully trimmed the memory map on Justin's system:
      
      from:
      
       [    0.000000]  BIOS-e820: 0000000100000000 - 000000022c000000 (usable)
      
      to:
      
       [    0.000000]   modified: 0000000100000000 - 0000000228000000 (usable)
       [    0.000000]   modified: 0000000228000000 - 000000022c000000 (reserved)
      
      According to Justin it makes quite a difference:
      
      |  When I boot the box without any trimming it acts like a 286 or 386,
      |  takes about 10 minutes to boot (using raptor disks).
      Signed-off-by: NYinghai Lu <yinghai.lu@sun.com>
      Tested-by: NJustin Piszcz <jpiszcz@lucidpixels.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      093af8d7
    • A
      x86: add generic clearcpuid=... option · ac72e788
      Andi Kleen 提交于
      Add a generic option to clear any cpuid bit. I added it because it was
      very easy to add with the new generic cpuid disable bitmap and perhaps
      it will be useful in the future.
      Signed-off-by: NAndi Kleen <ak@suse.de>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      ac72e788
    • A
      x86: add noclflush option · 191679fd
      Andi Kleen 提交于
      To disable CLFLUSH usage, especially in change_page_attr().
      Signed-off-by: NAndi Kleen <ak@suse.de>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      191679fd
    • J
      x86, 32-bit: trim memory not covered by wb mtrrs · 99fc8d42
      Jesse Barnes 提交于
      On some machines, buggy BIOSes don't properly setup WB MTRRs to cover all
      available RAM, meaning the last few megs (or even gigs) of memory will be
      marked uncached.  Since Linux tends to allocate from high memory addresses
      first, this causes the machine to be unusably slow as soon as the kernel
      starts really using memory (i.e.  right around init time).
      
      This patch works around the problem by scanning the MTRRs at boot and
      figuring out whether the current end_pfn value (setup by early e820 code)
      goes beyond the highest WB MTRR range, and if so, trimming it to match.  A
      fairly obnoxious KERN_WARNING is printed too, letting the user know that
      not all of their memory is available due to a likely BIOS bug.
      
      Something similar could be done on i386 if needed, but the boot ordering
      would be slightly different, since the MTRR code on i386 depends on the
      boot_cpu_data structure being setup.
      
      This patch fixes a bug in the last patch that caused the code to run on
      non-Intel machines (AMD machines apparently don't need it and it's untested
      on other non-Intel machines, so best keep it off).
      
      Further enhancements and fixes from:
      
        Yinghai Lu <Yinghai.Lu@Sun.COM>
        Andi Kleen <ak@suse.de>
      Signed-off-by: NJesse Barnes <jesse.barnes@intel.com>
      Tested-by: NJustin Piszcz <jpiszcz@lucidpixels.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Yinghai Lu <yhlu.kernel@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      99fc8d42
    • Y
      x86: disable the GART early, 64-bit · aaf23042
      Yinghai Lu 提交于
      For K8 system: 4G RAM with memory hole remapping enabled, or more than
      4G RAM installed.
      
      when try to use kexec second kernel, and the first doesn't include
      gart_shutdown. the second kernel could have different aper position than
      the first kernel. and second kernel could use that hole as RAM that is
      still used by GART set by the first kernel. esp. when try to kexec
      2.6.24 with sparse mem enable from previous kernel (from RHEL 5 or SLES
      10). the new kernel will use aper by GART (set by first kernel) for
      vmemmap. and after new kernel setting one new GART. the position will be
      real RAM. the _mapcount set is lost.
      
      Bad page state in process 'swapper'
      page:ffffe2000e600020 flags:0x0000000000000000 mapping:0000000000000000 mapcount:1 count:0
      Trying to fix it up, but a reboot is needed
      Backtrace:
      Pid: 0, comm: swapper Not tainted 2.6.24-rc7-smp-gcdf71a10-dirty #13
      
      Call Trace:
       [<ffffffff8026401f>] bad_page+0x63/0x8d
       [<ffffffff80264169>] __free_pages_ok+0x7c/0x2a5
       [<ffffffff80ba75d1>] free_all_bootmem_core+0xd0/0x198
       [<ffffffff80ba3a42>] numa_free_all_bootmem+0x3b/0x76
       [<ffffffff80ba3461>] mem_init+0x3b/0x152
       [<ffffffff80b959d3>] start_kernel+0x236/0x2c2
       [<ffffffff80b9511a>] _sinittext+0x11a/0x121
      
      and
       [ffffe2000e600000-ffffe2000e7fffff] PMD ->ffff81001c200000 on node 0
      phys addr is : 0x1c200000
      
      RHEL 5.1 kernel -53 said:
      PCI-DMA: aperture base @ 1c000000 size 65536 KB
      
      new kernel said:
      Mapping aperture over 65536 KB of RAM @ 3c000000
      
      So could try to disable that GART if possible.
      
      According to Ingo
      
      > hm, i'm wondering, instead of modifying the GART, why dont we simply
      > _detect_ whatever GART settings we have inherited, and propagate that
      > into our e820 maps? I.e. if there's inconsistency, then punch that out
      > from the memory maps and just dont use that memory.
      >
      > that way it would not matter whether the GART settings came from a [old
      > or crashing] Linux kernel that has not called gart_iommu_shutdown(), or
      > whether it's a BIOS that has set up an aperture hole inconsistent with
      > the memory map it passed. (or the memory map we _think_ i tried to pass
      > us)
      >
      > it would also be more robust to only read and do a memory map quirk
      > based on that, than actively trying to change the GART so early in the
      > bootup. Later on we have to re-enable the GART _anyway_ and have to
      > punch a hole for it.
      >
      > and as a bonus, we would have shored up our defenses against crappy
      > BIOSes as well.
      
      add e820 modification for gart inconsistent setting.
      
      gart_fix_e820=off could be used to disable e820 fix.
      Signed-off-by: NYinghai Lu <yinghai.lu@sun.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      aaf23042
    • A
      x86: add the "print code before the trapping instruction" feature to 64 bit · a25bd949
      Arjan van de Ven 提交于
      The 32 bit x86 tree has a very useful feature that prints the Code: line
      for the code even before the trapping instrution (and the start of the
      trapping instruction is then denoted with a <>). Unfortunately, the 64 bit
      x86 tree does not yet have this feature, making diagnosing backtraces harder
      than needed.
      
      This patch adds this feature in the same was as the 32 bit tree has
      (including the same kernel boot parameter), and including a bugfix
      to make the code use probe_kernel_address() rarther than a buggy (deadlocking)
      __get_user.
      Signed-off-by: NArjan van de Ven <arjan@linux.intel.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      a25bd949
    • H
      x86: 32-bit EFI runtime service support: fixes in sync with 64-bit support · 8b2cb7a8
      Huang, Ying 提交于
      support according to fixes of x86_64 support.
      
      - Delete efi_rt_lock because it is used during system early boot,
        before SMP is initialized.
      
      - Change local_flush_tlb() to __flush_tlb_all() to flush global page
        mapping.
      
      - Clean up includes.
      
      - Revise Kconfig description.
      
      - Enable noefi kernel parameter on i386.
      Signed-off-by: NHuang Ying <ying.huang@intel.com>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      8b2cb7a8
    • H
      x86: EFI runtime service support: document for EFI runtime services · 9ad65e47
      Huang, Ying 提交于
      This patch adds document for EFI x86_64 runtime services support.
      Signed-off-by: NChandramouli Narayanan <mouli@linux.intel.com>
      Signed-off-by: NHuang Ying <ying.huang@intel.com>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      9ad65e47
    • A
      x86: add ACPI reboot option · fa20efd2
      Aaron Durbin 提交于
      Add the ability to reboot an x86_64 based machine using the RESET_REG in the
      FADT ACPI table.
      Signed-off-by: NAaron Durbin <adurbin@google.com>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      fa20efd2
    • R
      x86 vDSO: consolidate vdso32 · af65d648
      Roland McGrath 提交于
      This makes x86_64's ia32 emulation support share the sources used in the
      32-bit kernel for the 32-bit vDSO and much of its setup code.
      
      The 32-bit vDSO mapping now behaves the same on x86_64 as on native 32-bit.
      The abi.syscall32 sysctl on x86_64 now takes the same values that
      vm.vdso_enabled takes on the 32-bit kernel.  That is, 1 means a randomized
      vDSO location, 2 means the fixed old address.  The CONFIG_COMPAT_VDSO
      option is now available to make this the default setting, the same meaning
      it has for the 32-bit kernel.  (This does not affect the 64-bit vDSO.)
      
      The argument vdso32=[012] can be used on both 32-bit and 64-bit kernels to
      set this paramter at boot time.  The vdso=[012] argument still does this
      same thing on the 32-bit kernel.
      Signed-off-by: NRoland McGrath <roland@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      af65d648
    • I
      x86: various changes and cleanups to in_p/out_p delay details · 6e7c4025
      Ingo Molnar 提交于
      various changes to the in_p/out_p delay details:
      
      - add the io_delay=none method
      - make each method selectable from the kernel config
      - simplify the delay code a bit by getting rid of an indirect function call
      - add the /proc/sys/kernel/io_delay_type sysctl
      - change 'io_delay=standard|alternate' to io_delay=0x80 and io_delay=0xed
      - make the io delay config not depend on CONFIG_DEBUG_KERNEL
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Tested-by: N"David P. Reed" <dpreed@reed.com>
      6e7c4025
    • R
      x86: provide a DMI based port 0x80 I/O delay override. · b02aae9c
      Rene Herman 提交于
      x86: provide a DMI based port 0x80 I/O delay override.
      
      Certain (HP) laptops experience trouble from our port 0x80 I/O delay
      writes. This patch provides for a DMI based switch to the "alternate
      diagnostic port" 0xed (as used by some BIOSes as well) for these.
      
      David P. Reed confirmed that port 0xed works for him and provides a
      proper delay. The symptoms of _not_ working are a hanging machine,
      with "hwclock" use being a direct trigger.
      
      Earlier versions of this attempted to simply use udelay(2), with the
      2 being a value tested to be a nicely conservative upper-bound with
      help from many on the linux-kernel mailinglist but that approach has
      two problems.
      
      First, pre-loops_per_jiffy calibration (which is post PIT init while
      some implementations of the PIT are actually one of the historically
      problematic devices that need the delay) udelay() isn't particularly
      well-defined. We could initialise loops_per_jiffy conservatively (and
      based on CPU family so as to not unduly delay old machines) which
      would sort of work, but...
      
      Second, delaying isn't the only effect that a write to port 0x80 has.
      It's also a PCI posting barrier which some devices may be explicitly
      or implicitly relying on. Alan Cox did a survey and found evidence
      that additionally some drivers may be racy on SMP without the bus
      locking outb.
      
      Switching to an inb() makes the timing too unpredictable and as such,
      this DMI based switch should be the safest approach for now. Any more
      invasive changes should get more rigid testing first. It's moreover
      only very few machines with the problem and a DMI based hack seems
      to fit that situation.
      
      This also introduces a command-line parameter "io_delay" to override
      the DMI based choice again:
      
      	io_delay=<standard|alternate>
      
      where "standard" means using the standard port 0x80 and "alternate"
      port 0xed.
      
      This retains the udelay method as a config (CONFIG_UDELAY_IO_DELAY) and
      command-line ("io_delay=udelay") choice for testing purposes as well.
      
      This does not change the io_delay() in the boot code which is using
      the same port 0x80 I/O delay but those do not appear to be a problem
      as David P. Reed reported the problem was already gone after using the
      udelay version. He moreover reported that booting with "acpi=off" also
      fixed things and seeing as how ACPI isn't touched until after this DMI
      based I/O port switch I believe it's safe to leave the ones in the boot
      code be.
      
      The DMI strings from David's HP Pavilion dv9000z are in there already
      and we need to get/verify the DMI info from other machines with the
      problem, notably the HP Pavilion dv6000z.
      
      This patch is partly based on earlier patches from Pavel Machek and
      David P. Reed.
      Signed-off-by: NRene Herman <rene.herman@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      b02aae9c
    • G
      lguest: adapt launcher to per-cpuness · e3283fa0
      Glauber de Oliveira Costa 提交于
      This patch makes uses of pread() and pwrite() in lguest launcher
      to communicate the vcpu id to the lguest driver. The id is kept in
      a thread variable, which means we'll span in the future, vcpus as
      threads. But right now, only the infrastructure is out there.
      Signed-off-by: NGlauber de Oliveira Costa <gcosta@redhat.com>
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      e3283fa0
    • B
      lguest: Reboot support · ec04b13f
      Balaji Rao 提交于
      Reboot Implemented
      
      (Prevent fd leak, fix style and fix documentation --RR)
      Signed-off-by: NBalaji Rao <balajirrao@gmail.com>
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      ec04b13f
  4. 29 1月, 2008 22 次提交