1. 18 Mar, 2009 4 commits
  2. 15 Mar, 2009 8 commits
    • x86: put initial_pg_tables into .bss · 2bd2753f
      Yinghai Lu committed
      Impact: makes vmlinux section information more useful
      
      Don't blindly use RAM after _end for page tables; that is, keep the
      initial page tables before _end by putting them into .bss.
      
      [Adapted to use brk segment - Jeremy]
      
      v2: keep initial page table up to 512M only.
      v4: put initial page tables just before _end
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
    • x86: allow extend_brk users to reserve brk space · 796216a5
      Jeremy Fitzhardinge committed
      Impact: new interface; remove hard-coded limit
      
      Add RESERVE_BRK(name, size) macro to reserve space in the brk
      area.  This should be a conservative (ie, larger) estimate of
      how much space might possibly be required from the brk area.
      Any unused space will be freed, so there's no real downside
      to making the reservation too large (within limits).
      
      The name should be unique within a given file, and somewhat
      descriptive.
      
      The C definition of RESERVE_BRK() ends up being more complex than
      one would expect to work around a cluster of gcc infelicities:
      
        The first attempt was to simply try putting __section(.brk_reservation)
        on a variable.  This doesn't work because it ends up making it a
        @progbits section, which gets actual space allocated in the vmlinux
        executable.
      
        The second attempt was to emit the space into a section using asm,
        but gcc doesn't allow arguments to be passed to file-level asm()
        statements, making it hard to pass in the size.
      
        The final attempt is to wrap the asm() in a function to allow
        it to have arguments, and put the function itself into the
        .discard section, which vmlinux*.lds drops entirely from the
        emitted vmlinux.
      Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
    • x86-32: compute initial mapping size more accurately · 7543c1de
      Yinghai Lu committed
      Impact: simplification
      
      We only need to map the kernel in head_32.S, not the whole of
      lowmem.  We use 512MB as a reasonable (but arbitrary) limit on
      the maximum size of the kernel image.
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
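For a rough sense of scale behind the 512MB limit, the sketch below counts the page-table pages needed to map a region with 4 KiB pages. It assumes the classic non-PAE i386 layout (1024 four-byte entries per table, so each table maps 4 MiB) and no PSE. The macro and function names are hypothetical and are not taken from head_32.S.

```c
/* Assumed non-PAE i386 geometry; names are illustrative only. */
#define SKETCH_PAGE_SIZE       4096u   /* 4 KiB pages */
#define SKETCH_PTES_PER_TABLE  1024u   /* 4-byte entries filling one page */

/* Number of page-table pages needed to map `bytes`, rounded up. */
static unsigned long page_tables_needed(unsigned long long bytes)
{
    /* One page table covers 1024 * 4 KiB = 4 MiB. */
    unsigned long long per_table =
        (unsigned long long)SKETCH_PAGE_SIZE * SKETCH_PTES_PER_TABLE;

    return (unsigned long)((bytes + per_table - 1) / per_table);
}
```

Under these assumptions, mapping 512MB takes 128 page tables, which is 512 KiB of page-table memory.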
    • x86: use brk allocation for DMI · 6de6cb44
      Jeremy Fitzhardinge committed
      Impact: use new interface instead of previous ad hoc implementation
      
      Use extend_brk() to allocate memory for DMI rather than having an
      ad-hoc allocator.
      Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
    • x86-32: use brk segment for allocating initial kernel pagetable · ccf3fe02
      Jeremy Fitzhardinge committed
      Impact: use new interface instead of previous ad hoc implementation
      
      Rather than having special purpose init_pg_table_start/end variables
      to delimit the kernel pagetable built by head_32.S, just use the brk
      mechanism to extend the bss for the new pagetable.
      
      This patch removes init_pg_table_start/end and pg0, defines __brk_base
      (which is page-aligned and immediately follows _end), initializes
      the brk region to start there, and uses it for the 32-bit pagetable.
      Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
    • x86: move brk initialization out of #ifdef CONFIG_BLK_DEV_INITRD · 5368a2be
      H. Peter Anvin committed
      Impact: build fix
      
      The brk initialization functions were incorrectly located inside
      an #ifdef CONFIG_BLK_DEV_INITRD block, causing the obvious build failure in
      minimal configurations.
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      Cc: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
    • x86: add brk allocation for very, very early allocations · 93dbda7c
      Jeremy Fitzhardinge committed
      Impact: new interface
      
      Add a brk()-like allocator which effectively extends the bss in order
      to allow very early code to do dynamic allocations.  This is better than
      using statically allocated arrays for data in subsystems which may never
      get used.
      
      The space for brk allocations is in the bss ELF segment, so that the
      space is mapped properly by the code which maps the kernel, and so
      that bootloaders keep the space free rather than putting a ramdisk or
      something into it.
      
      The bss itself, delimited by __bss_stop, ends before the brk area
      (__brk_base to __brk_limit).  The kernel text, data and bss are reserved
      up to __bss_stop.
      
      Any brk-allocated data is reserved separately just before the kernel
      pagetable is built, as that code allocates from unreserved spaces
      in the e820 map, potentially allocating from any unused brk memory.
      Ultimately any unused memory in the brk area is used in the general
      kernel memory pool.
      
      Initially the brk space is set to 1MB, which is probably much larger
      than any user needs (the largest current user is i386 head_32.S's code
      to build the pagetables to map the kernel, which can get fairly large
      with a big kernel image and no PSE support).  So long as the system
      has sufficient memory for the bootloader to reserve the kernel+1MB brk,
      there are no bad effects resulting from an over-large brk.
      Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
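The bump-style allocation this commit describes can be sketched in userspace as follows. It is an illustration only. A static array stands in for the region between __brk_base and __brk_limit, the name extend_brk_sketch and the 4 KiB size are made up, and returning NULL on exhaustion is a choice made for the sketch, not a statement about what the kernel does.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the brk area; the size is illustrative. */
#define SKETCH_BRK_SIZE 4096
static _Alignas(64) unsigned char sketch_brk_area[SKETCH_BRK_SIZE];

/* Current top of the allocated part; 0 means "not initialised yet". */
static uintptr_t sketch_brk_end;

/*
 * Hand out `size` bytes aligned to `align` (a power of two) by bumping
 * the end pointer. Returns NULL if the alignment is invalid or the
 * request does not fit in the remaining space.
 */
static void *extend_brk_sketch(size_t size, size_t align)
{
    uintptr_t base = (uintptr_t)sketch_brk_area;
    uintptr_t limit = base + SKETCH_BRK_SIZE;
    uintptr_t mask = (uintptr_t)align - 1;
    uintptr_t start;

    if (align == 0 || (align & mask) != 0)
        return NULL;
    if (sketch_brk_end == 0)
        sketch_brk_end = base;

    start = (sketch_brk_end + mask) & ~mask;   /* round up to alignment */
    if (start > limit || size > limit - start)
        return NULL;

    sketch_brk_end = start + size;
    return (void *)start;
}
```

Nothing is ever freed individually in this scheme. Whatever lies between the final end pointer and the limit is the unused part that, per the commit message, ends up in the general memory pool.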
    • x86: make section delimiter symbols part of their section · b9719a4d
      Jeremy Fitzhardinge committed
      Impact: cleanup
      
      Make the symbols delimiting a section part of the section
      (section relative) rather than absolute.  This avoids any
      unexpected gaps between the section-start symbol and the first
      data in the section, which could be caused by implicit
      alignment of the section data.  It also makes the general
      form of vmlinux_64.lds.S consistent with vmlinux_32.lds.S.
      Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
  3. 12 Mar, 2009 3 commits
  4. 11 Mar, 2009 10 commits
  5. 10 Mar, 2009 5 commits
    • x86: BUG to BUG_ON changes · 8c5dfd25
      Stoyan Gaydarov committed
      Impact: cleanup
      Signed-off-by: Stoyan Gaydarov <stoyboyker@gmail.com>
      LKML-Reference: <1236661850-8237-8-git-send-email-stoyboyker@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
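The kind of change the title refers to can be illustrated with userspace stand-ins for the two macros. The macro definitions and both functions below are hypothetical and exist only to show the before and after shape; they are not the kernel's implementations.

```c
#include <stdio.h>
#include <stdlib.h>

/* Userspace stand-ins for the kernel macros, for illustration only. */
#define BUG() \
    do { fprintf(stderr, "BUG at %s:%d\n", __FILE__, __LINE__); abort(); } while (0)
#define BUG_ON(cond) \
    do { if (cond) BUG(); } while (0)

/* Before: the condition and the BUG() call are written out separately. */
static int first_byte_before(const unsigned char *buf)
{
    if (!buf)
        BUG();
    return buf[0];
}

/* After: the same check folded into a single BUG_ON(). */
static int first_byte_after(const unsigned char *buf)
{
    BUG_ON(!buf);
    return buf[0];
}
```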
    • percpu: generalize embedding first chunk setup helper · 66c3a757
      Tejun Heo committed
      Impact: code reorganization
      
      Separate out embedding first chunk setup helper from x86 embedding
      first chunk allocator and put it in mm/percpu.c.  This will be used by
      the default percpu first chunk allocator and possibly by other archs.
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • percpu: more flexibility for @dyn_size of pcpu_setup_first_chunk() · 6074d5b0
      Tejun Heo committed
      Impact: cleanup, more flexibility for first chunk init
      
      Non-negative @dyn_size used to be allowed iff @unit_size wasn't auto.
      This restriction stemmed from implementation detail and made things a
      bit less intuitive.  This patch allows @dyn_size to be specified
      regardless of @unit_size and swaps the positions of @dyn_size and
      @unit_size so that the parameter order makes more sense (static,
      reserved and dyn sizes followed by enclosing unit_size).
      
      While at it, add @unit_size >= PCPU_MIN_UNIT_SIZE sanity check.
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • percpu: make x86 addr <-> pcpu ptr conversion macros generic · e0100983
      Tejun Heo committed
      Impact: generic addr <-> pcpu ptr conversion macros
      
      There's nothing arch specific about x86 __addr_to_pcpu_ptr() and
      __pcpu_ptr_to_addr().  With proper __per_cpu_load and __per_cpu_start
      defined, they'll do the right thing regardless of actual layout.
      
      Move these macros from arch/x86/include/asm/percpu.h to mm/percpu.c
      and allow archs to override it as necessary.
      Signed-off-by: Tejun Heo <tj@kernel.org>
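The layout-independent conversion can be sketched as plain offset arithmetic. This is a sketch under stated assumptions: sketch_chunk_base (where the first chunk actually lives) and sketch_percpu_start (a stand-in for __per_cpu_start) are hypothetical variables, and the functions are not the macros from mm/percpu.c.

```c
#include <stdint.h>

/* Hypothetical stand-ins; in the kernel these would come from the
 * percpu setup code and the linker script respectively. */
static uintptr_t sketch_chunk_base;    /* runtime address of the first chunk */
static uintptr_t sketch_percpu_start;  /* link-time start of the percpu section */

/* Real address inside the chunk -> percpu pointer (link-time view). */
static uintptr_t addr_to_pcpu_ptr_sketch(uintptr_t addr)
{
    return addr - sketch_chunk_base + sketch_percpu_start;
}

/* Percpu pointer (link-time view) -> real address inside the chunk. */
static uintptr_t pcpu_ptr_to_addr_sketch(uintptr_t ptr)
{
    return ptr + sketch_chunk_base - sketch_percpu_start;
}
```

The two functions are inverses of each other and depend only on the two base values, which matches the commit's point that nothing here is architecture specific.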
    • Revert "[CPUFREQ] Disable sysfs ui for p4-clockmod." · 129f8ae9
      Dave Jones committed
      This reverts commit e088e4c9.
      
      Removing the sysfs interface for p4-clockmod was flagged as a
      regression in bug 12826.
      
      Course of action:
       - Find out the remaining causes of overheating, and fix them
         if possible. ACPI should be doing the right thing automatically.
         If it isn't, we need to fix that.
       - mark p4-clockmod ui as deprecated
       - try again with the removal in six months.
      
      It's not really feasible to printk about the deprecation, because
      it needs to happen at all the sysfs entry points, which means adding
      a lot of strcmp("p4-clockmod".. calls to the core, which.. bleuch.
      Signed-off-by: Dave Jones <davej@redhat.com>
  6. 09 Mar, 2009 6 commits
    • lguest: fix for CONFIG_SPARSE_IRQ=y · 6db6a5f3
      Rusty Russell committed
      Impact: remove lots of lguest boot WARN_ON() when CONFIG_SPARSE_IRQ=y
      
      We now need to call irq_to_desc_alloc_cpu() before
      set_irq_chip_and_handler_name(), but we can't do that from init_IRQ (no
      kmalloc available).
      
      So do it as we use interrupts instead.  Also means we only alloc for
      irqs we use, which was the intent of CONFIG_SPARSE_IRQ anyway.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Ingo Molnar <mingo@redhat.com>
    • lguest: fix crash 'unhandled trap 13 at <native_read_msr_safe>' · cbd88c8e
      Rusty Russell committed
      Impact: fix lguest boot crash on modern Intel machines
      
      The code in early_init_intel does:
      
      	if (c->x86 > 6 || (c->x86 == 6 && c->x86_model >= 0xd)) {
      		u64 misc_enable;
      
      		rdmsrl(MSR_IA32_MISC_ENABLE, misc_enable);
      
      And that rdmsr faults (not allowed from non-0 PL).  We can get around
      this by mugging the family ID part of the cpuid.  5 seems like a good
      number.
      
      Of course, this is a hack (how very lguest!).  We could just indicate
      that we don't support MSRs, or implement lguest_rdmsr.
      Reported-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Tested-by: Patrick McHardy <kaber@trash.net>
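The family-ID trick can be sketched as a bit manipulation on the EAX value returned by CPUID leaf 1, where the base family ID occupies bits 11:8. The function and macro names are hypothetical, and this is not presented as the lguest code itself.

```c
#include <stdint.h>

/* CPUID leaf 1, EAX: base family ID is in bits 11:8. */
#define SKETCH_FAMILY_MASK   0x00000F00u
#define SKETCH_FAMILY_SHIFT  8

/* Extract the base family ID from a leaf-1 EAX value. */
static unsigned int base_family(uint32_t eax)
{
    return (eax & SKETCH_FAMILY_MASK) >> SKETCH_FAMILY_SHIFT;
}

/* Overwrite the base family ID with 5, leaving all other bits alone. */
static uint32_t force_family_5(uint32_t eax)
{
    eax &= ~SKETCH_FAMILY_MASK;
    eax |= 5u << SKETCH_FAMILY_SHIFT;
    return eax;
}
```

With the reported family at 5, the quoted condition (family above 6, or family 6 with model at least 0xd) is false, so the guest never reaches the rdmsrl() call.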
    • x86-32: make sure virt_addr_valid() returns false for fixmap addresses · 0feca851
      Jeremy Fitzhardinge committed
      I found that virt_addr_valid() was returning true for fixmap addresses.
      
      I'm not sure whether pfn_valid() is supposed to include this test,
      but there's no harm in being explicit.
      Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Cc: Jiri Slaby <jirislaby@gmail.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      LKML-Reference: <49B166D6.2080505@goop.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86 mmiotrace: fix remove_kmmio_fault_pages() · d0fc63f7
      Stuart Bennett committed
      Impact: fix race+crash in mmiotrace
      
      The list manipulation in remove_kmmio_fault_pages() was broken. If more
      than one consecutive kmmio_fault_page was re-added during the grace
      period between unregister_kmmio_probe() and remove_kmmio_fault_pages(),
      the list manipulation failed to remove pages from the release list.
      
      After a second grace period the pages get into rcu_free_kmmio_fault_pages()
      and raise a BUG_ON() kernel crash.
      
      The list manipulation is fixed to properly remove pages from the release
      list.
      
      This bug has been present from the very beginning of mmiotrace in the
      mainline kernel. It was introduced in 0fd0e3da ("x86: mmiotrace full
      patch, preview 1").
      
      An urgent fix for Linus. Tested by Stuart (on 32-bit) and Pekka
      (on amd and intel 64-bit systems, nouveau and nvidia proprietary).
      Signed-off-by: Stuart Bennett <stuart@freedesktop.org>
      Signed-off-by: Pekka Paalanen <pq@iki.fi>
      LKML-Reference: <20090308202135.34933feb@daedalus.pq.iki.fi>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86: fix warning about nodeid · e954ef20
      Yinghai Lu committed
      Impact: cleanup
      
      Ingo found a warning about nodeid with some configs.

      Try to use for_each_online_node for non-NUMA too; in that case
      nodeid will be 0.

      Also move the boundary checking out of setup_node_bootmem(), so
      non-NUMA configs will not check it.
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      LKML-Reference: <49B03069.80001@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86: don't define __this_fixmap_does_not_exist() · 8827247f
      Wang Chen committed
      Impact: improve out-of-range fixmap index debugging
      
      Commit "1b42f516"
      defined the __this_fixmap_does_not_exist() function
      with a WARN_ON(1) in it.
      
      This causes the linker to not report an error when
      __this_fixmap_does_not_exist() is called with a
      non-constant parameter.
      
      Ingo defined __this_fixmap_does_not_exist() because he
      wanted to get the virtual address of the fixmap area for a
      given nesting level using a non-constant index.

      But we can fix this and still keep the link-time check:

      We can get the four slot virtual addresses at link time and
      store them in the array slot_virt[].

      We can then refer to slot_virt[] with a non-constant index
      in the ioremap-leak detection code.
      Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
      LKML-Reference: <49B2075B.4070509@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
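The idea of precomputing the slot addresses and then indexing the table at run time can be sketched as below. Every constant and name carrying SKETCH_ or _sketch is hypothetical (the top address, the starting index, the pages per slot), and the address formula is a simplified stand-in for the kernel's fixmap arithmetic.

```c
/* All constants below are hypothetical values chosen for the sketch. */
#define SKETCH_FIXADDR_TOP     0xFFFFF000UL
#define SKETCH_PAGE_SHIFT      12
#define SKETCH_BTMAP_BEGIN     100   /* fixmap index of the first slot */
#define SKETCH_PAGES_PER_SLOT  16
#define SKETCH_NR_SLOTS        4

/* In this sketch, fixmap indices grow downwards from the top address. */
#define SKETCH_FIX_TO_VIRT(idx) \
    (SKETCH_FIXADDR_TOP - ((unsigned long)(idx) << SKETCH_PAGE_SHIFT))

static unsigned long slot_virt_sketch[SKETCH_NR_SLOTS];

/* Fill the table once; loop bounds and constants are known at compile time. */
static void init_slot_virt_sketch(void)
{
    int i;

    for (i = 0; i < SKETCH_NR_SLOTS; i++)
        slot_virt_sketch[i] =
            SKETCH_FIX_TO_VIRT(SKETCH_BTMAP_BEGIN - SKETCH_PAGES_PER_SLOT * i);
}

/* Run-time lookup with a non-constant index: slot number, or -1 if none. */
static int slot_for_addr_sketch(unsigned long virt)
{
    int i;

    for (i = 0; i < SKETCH_NR_SLOTS; i++)
        if (slot_virt_sketch[i] == virt)
            return i;
    return -1;
}
```

Only the table initialisation goes through the index-to-address conversion. Later code reads the array, so no run-time index ever reaches the conversion that carries the link-time check.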
  7. 08 Mar, 2009 2 commits
    • x86: remove smp_apply_quirks()/smp_checks() · 1f442d70
      Yinghai Lu committed
      Impact: cleanup and code size reduction on 64-bit
      
      This code is only applied to Intel Pentium and AMD K7 32-bit cpus.
      
      Move those checks to intel_init()/amd_init() for 32-bit
      so 64-bit will not build this code.
      
      Also change to use cpu_index check to see if we need to emit warning.
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      LKML-Reference: <49B377D2.8030108@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86: UV: remove uv_flush_tlb_others() WARN_ON · 3a450de1
      Cliff Wickman committed
      In uv_flush_tlb_others() (arch/x86/kernel/tlb_uv.c),
      the "WARN_ON(!in_atomic())" fails if CONFIG_PREEMPT is not enabled.
      
      And CONFIG_PREEMPT is not enabled by default in the distribution that
      most UV owners will use.
      
      We could #ifdef CONFIG_PREEMPT the warning, but that is not good form.
      And there seems to be no suitable fix to in_atomic() when CONFIG_PREEMPT
      is not on.
      
      As Ingo commented:
      
        > and we have no proper primitive to test for atomicity. (mainly
        > because we dont know about atomicity on a non-preempt kernel)
      
      So we drop the WARN_ON.
      Signed-off-by: Cliff Wickman <cpw@sgi.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  8. 07 Mar, 2009 1 commit
    • x86: linkage.h - guard assembler specifics by __ASSEMBLY__ · 7ab15247
      Cyrill Gorcunov committed
      Stephen Rothwell reported:
      
      |Today's linux-next build (x86_64 allmodconfig) produced this warning:
      |
      |In file included from drivers/char/epca.c:49:
      |drivers/char/digiFep1.h:7:1: warning: "GLOBAL" redefined
      |In file included from include/linux/linkage.h:5,
      |                 from include/linux/kernel.h:11,
      |                 from arch/x86/include/asm/system.h:10,
      |                 from arch/x86/include/asm/processor.h:17,
      |                 from include/linux/prefetch.h:14,
      |                 from include/linux/list.h:6,
      |                 from include/linux/module.h:9,
      |                 from drivers/char/epca.c:29:
      |arch/x86/include/asm/linkage.h:55:1: warning: this is the location of the previous definition
      |
      |Probably introduced by commit 95695547
      |("x86: asm linkage - introduce GLOBAL macro") from the x86 tree.
      
      Any assembler specific snippets being placed in headers
      are to be protected by __ASSEMBLY__. Fixed.
      
      Also move __ALIGN definition under the same protection as well.
      Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
      LKML-Reference: <20090306160833.GB7420@localhost>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
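The effect of the guard can be sketched in a single C file. The first part mimics a header whose assembler-only macro is hidden from C, and the second part mimics a driver that defines an unrelated GLOBAL of its own. The macro body, the value 0x02 and the function name are assumptions made for the illustration, not copies of linkage.h or digiFep1.h.

```c
/* --- header side (sketch) --- */
#ifdef __ASSEMBLY__
/* Seen only when the file is preprocessed for the assembler. */
#define GLOBAL(name) .globl name; name:
#endif

/* --- driver side (sketch): an unrelated macro with the same name --- */
#define GLOBAL 0x02   /* hypothetical value */

/* Compiles without a "GLOBAL redefined" warning because the guarded
 * definition above is not visible to the C compiler. */
static int driver_global_sketch(void)
{
    return GLOBAL;
}
```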
  9. 06 Mar, 2009 1 commit