1. 25 Jan, 2009 1 commit
    • x86: use standard PIT frequency · e1b4d114
      Submitted by Ingo Molnar
      The RDC and ELAN platforms use slightly different PIT clocks, resulting in
      a timex.h hack that changes PIT_TICK_RATE during build time. But if a
      tester enables any of these platform support .config options, the PIT
      will be miscalibrated on standard PC platforms.
      
      So use one frequency - in a subsequent patch we'll add a quirk to allow
      x86 platforms to define different PIT frequencies.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
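The miscalibration described in this commit can be illustrated with a small sketch. It assumes the nominal PC PIT input clock of 1193182 Hz, which the commit message does not state, and uses 1189200 Hz purely as an illustrative stand-in for a platform-specific rate. The helper names `pit_latch` and `pit_error_ppm` are hypothetical and are not kernel functions.

```c
#include <assert.h>

/* Nominal PC PIT input clock in Hz (assumption: not stated in the commit). */
#define STANDARD_PIT_HZ 1193182LL
/* Illustrative platform-specific rate; a stand-in, not taken from the commit. */
#define PLATFORM_PIT_HZ 1189200LL

/* Divisor programmed into the PIT to get `hz` timer interrupts per second. */
static long long pit_latch(long long tick_rate, long long hz)
{
    assert(hz > 0);
    return (tick_rate + hz / 2) / hz;   /* round to nearest */
}

/*
 * Timing error in parts per million when the divisor was computed for
 * `assumed_rate` but the hardware really runs at `actual_rate`.
 */
static long long pit_error_ppm(long long assumed_rate, long long actual_rate)
{
    return (actual_rate - assumed_rate) * 1000000LL / assumed_rate;
}
```

With these assumed numbers, a kernel built for the platform rate but booted on a standard PC would program a divisor of 1189 instead of 1193, so its timer would run a few thousand parts per million fast.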
  2. 24 Jan, 2009 2 commits
    • xen: handle highmem pages correctly when shrinking a domain · ff4ce8c3
      Submitted by Ian Campbell
      Commit 1058a75f ("xen: actually release
      memory when shrinking domain") causes a crash if the page being released
      is a highmem page.
      
      If a page is highmem then there is no need to unmap it.
      Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
      Acked-by: Jeremy Fitzhardinge <jeremy@goop.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86, mm: fix pte_free() · 42ef73fe
      Submitted by Peter Zijlstra
      On -rt we were seeing spurious bad page states like:
      
      Bad page state in process 'firefox'
      page:c1bc2380 flags:0x40000000 mapping:c1bc2390 mapcount:0 count:0
      Trying to fix it up, but a reboot is needed
      Backtrace:
      Pid: 503, comm: firefox Not tainted 2.6.26.8-rt13 #3
      [<c043d0f3>] ? printk+0x14/0x19
      [<c0272d4e>] bad_page+0x4e/0x79
      [<c0273831>] free_hot_cold_page+0x5b/0x1d3
      [<c02739f6>] free_hot_page+0xf/0x11
      [<c0273a18>] __free_pages+0x20/0x2b
      [<c027d170>] __pte_alloc+0x87/0x91
      [<c027d25e>] handle_mm_fault+0xe4/0x733
      [<c043f680>] ? rt_mutex_down_read_trylock+0x57/0x63
      [<c043f680>] ? rt_mutex_down_read_trylock+0x57/0x63
      [<c0218875>] do_page_fault+0x36f/0x88a
      
      This is the case where a concurrent fault already installed the PTE and
      we get to free the newly allocated one.
      
      This is due to pgtable_page_ctor() doing the spin_lock_init(&page->ptl)
      which is overlaid with the {private, mapping} struct.
      
      union {
          struct {
              unsigned long private;
              struct address_space *mapping;
          };
          spinlock_t ptl;
          struct kmem_cache *slab;
          struct page *first_page;
      };
      
      Normally the spinlock is small enough to not stomp on page->mapping, but
      PREEMPT_RT=y has huge 'spin'locks.
      
      But lockdep kernels should also be able to trigger this splat, as the
      lock tracking code grows the spinlock to cover page->mapping.
      
      The obvious fix is calling pgtable_page_dtor() like the regular pte free
      path __pte_free_tlb() does.
      
      It seems all architectures except x86 and mn10300 already do this, and
      mn10300 doesn't seem to use pgtable_page_ctor(), which suggests it
      doesn't do SMP or simply doesn't do MMU at all or something.
      Signed-off-by: Peter Zijlstra <a.p.zijlsta@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Cc: <stable@kernel.org>
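The overlay problem in the pte_free() commit can be modelled in userspace. The sketch below is not the kernel's struct page: `struct big_lock`, `struct fake_page` and the `fake_*` helpers are hypothetical stand-ins, and the destructor is assumed to do no more than clear `mapping`.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for a debug or PREEMPT_RT lock that is larger than one word. */
struct big_lock {
    unsigned long words[4];
};

/* Simplified model of the overlay quoted in the commit message. */
struct fake_page {
    union {
        struct {
            unsigned long private;
            void *mapping;
        };
        struct big_lock ptl;
    };
};

/* Models pgtable_page_ctor(): initialising the lock scribbles on mapping. */
static void fake_ctor(struct fake_page *page)
{
    memset(&page->ptl, 0xa5, sizeof(page->ptl));
}

/* Models pgtable_page_dtor(): restore the state the allocator expects. */
static void fake_dtor(struct fake_page *page)
{
    page->mapping = NULL;
}

/* Models the allocator's sanity check: mapping must be NULL when freed. */
static int page_is_bad(const struct fake_page *page)
{
    return page->mapping != NULL;
}
```

Freeing the page straight after `fake_ctor()` trips the check, while calling `fake_dtor()` first does not. That is the shape of the fix described above.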
  3. 23 Jan, 2009 1 commit
  4. 22 Jan, 2009 3 commits
  5. 21 Jan, 2009 4 commits
    • x86: mtrr fix debug boot parameter · 731f1872
      Submitted by Thomas Renninger
      While looking at:
      
        http://bugzilla.kernel.org/show_bug.cgi?id=11541
      
      I realized that the mtrr.show param cannot work, because
      the code is processed much too early.
      
      This patch:
       - Declares mtrr.show as early_param
       - Stays consistent with the previous param (which I doubt
         ever worked), so mtrr.show=1 would still work
       - Declares mtrr_show as initdata
      Signed-off-by: Thomas Renninger <trenn@suse.de>
      Acked-by: Jan Beulich <jbeulich@novell.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86: fix page attribute corruption with cpa() · a1e46212
      Submitted by Suresh Siddha
      Impact: fix sporadic slowdowns and warning messages
      
      This patch fixes a performance issue reported by Linus on his
      Nehalem system. While Linus reverted the PAT patch (commit
      58dab916) which exposed the issue, the existing cpa() code can
      still potentially cause wrong behavior (page attribute corruption).
      
      This patch also fixes the "WARNING: at arch/x86/mm/pageattr.c:560" that
      various people reported.
      
      In a 64-bit kernel, the kernel identity mapping might have holes depending
      on the available memory and how e820 reports the address range
      covering the RAM, ACPI, PCI reserved regions. If there is a 2MB/1GB hole
      in the address range that is not listed by e820 entries, kernel identity
      mapping will have a corresponding hole in its 1-1 identity mapping.
      
      If cpa() happens on the kernel identity mapping which falls into these holes,
      existing code fails like this:
      
      	__change_page_attr_set_clr()
      		__change_page_attr()
      			returns 0 because of if (!kpte). But doesn't
      			set cpa->numpages and cpa->pfn.
      		cpa_process_alias()
      			uses uninitialized cpa->pfn (random value)
      			which can potentially lead to changing the page
      			attribute of kernel text/data, kernel identity
      			mapping of RAM pages etc. oops!
      
      This bug was easily exposed by another PAT patch which was doing
      cpa() more often on kernel identity mapping holes (physical range between
      max_low_pfn_mapped and 4GB), where it was also setting the
      cache disable attribute (PCD) for kernel identity mappings.
      
      Fix cpa() to handle the kernel identity mapping holes. Retain
      the WARN() for cpa() calls to other not-present address ranges
      (kernel-text/data, ioremap() addresses).
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Cc: <stable@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
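The failure mode in the cpa() commit is a function that reports success without filling in the fields its caller reads next. A minimal sketch of the corrected shape follows. It is not the actual kernel patch: `struct cpa_sketch`, `sketch_process_fault` and the address-range parameters are hypothetical, and the physical address is modelled as a simple offset from the start of the identity mapping.

```c
#include <stddef.h>

#define SKETCH_PAGE_SHIFT 12

/* Reduced stand-in for the kernel's cpa data; fields are illustrative. */
struct cpa_sketch {
    unsigned long vaddr;
    unsigned long pfn;
    int numpages;
};

/*
 * Called when no page table entry exists for cpa->vaddr.
 * Inside the identity mapping a missing entry is a legitimate hole, so
 * report success but still fill in pfn and numpages for later alias
 * processing. Anywhere else it is an error (the kernel keeps a WARN() there).
 */
static int sketch_process_fault(struct cpa_sketch *cpa,
                                unsigned long ident_start,
                                unsigned long ident_end)
{
    if (cpa->vaddr >= ident_start && cpa->vaddr < ident_end) {
        cpa->numpages = 1;
        cpa->pfn = (cpa->vaddr - ident_start) >> SKETCH_PAGE_SHIFT;
        return 0;
    }
    return -1;
}
```

The point of the design is that every path returning 0 leaves `pfn` and `numpages` well defined, so a later step never acts on a random value.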
    • Revert "x86: signal: change type of paramter for sys_rt_sigreturn()" · 552b8aa4
      Submitted by Ingo Molnar
      This reverts commit 4217458d.
      
      Justin Madru bisected this commit, it was causing weird Firefox
      crashes.
      
      The reason is that GCC mis-optimizes (re-uses) the on-stack parameters of
      the calling frame, which corrupts the syscall return pt_regs state and
      thus corrupts user-space register state.
      
      So we go back to the slightly less clean but more optimization-safe
      method of getting to pt_regs. Also add a comment to explain this.
      
      Resolves: http://bugzilla.kernel.org/show_bug.cgi?id=12505
      Reported-and-bisected-by: Justin Madru <jdm64@gawab.com>
      Tested-by: Justin Madru <jdm64@gawab.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86: use early clobbers in usercopy*.c · e0a96129
      Submitted by Andi Kleen
      Impact: fix rare (but currently harmless) miscompile with certain configs and gcc versions
      
      Hugh Dickins noticed that strncpy_from_user() was miscompiled
      in some circumstances with gcc 4.3.
      
      Thanks to Hugh's excellent analysis it was easy to track down.
      
      Hugh writes:
      
      > Try building an x86_64 defconfig 2.6.29-rc1 kernel tree,
      > except not quite defconfig, switch CONFIG_PREEMPT_NONE=y
      > and CONFIG_PREEMPT_VOLUNTARY off (because it expands a
      > might_fault() there, which hides the issue): using a
      > gcc 4.3.2 (I've checked both openSUSE 11.1 and Fedora 10).
      >
      > It generates the following:
      >
      > 0000000000000000 <__strncpy_from_user>:
      >    0:   48 89 d1                mov    %rdx,%rcx
      >    3:   48 85 c9                test   %rcx,%rcx
      >    6:   74 0e                   je     16 <__strncpy_from_user+0x16>
      >    8:   ac                      lods   %ds:(%rsi),%al
      >    9:   aa                      stos   %al,%es:(%rdi)
      >    a:   84 c0                   test   %al,%al
      >    c:   74 05                   je     13 <__strncpy_from_user+0x13>
      >    e:   48 ff c9                dec    %rcx
      >   11:   75 f5                   jne    8 <__strncpy_from_user+0x8>
      >   13:   48 29 c9                sub    %rcx,%rcx
      >   16:   48 89 c8                mov    %rcx,%rax
      >   19:   c3                      retq
      >
      > Observe that "sub %rcx,%rcx; mov %rcx,%rax", whereas gcc 4.2.1
      > (and many other configs) say "sub %rcx,%rdx; mov %rdx,%rax".
      > Isn't it returning 0 when it ought to be returning strlen?
      
      The asm constraints for the strncpy_from_user() result were missing an
      early clobber, which tells gcc that the last output arguments
      are written before all input arguments are read.
      
      Also add more early clobbers in the rest of the file and fix 32-bit
      usercopy.c in the same way.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      [ since this API is rarely used and no in-kernel user relies on a 'len'
        return value (they only rely on negative return values) this miscompile
        was never noticed in the field. But it's worth fixing it nevertheless. ]
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
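The early-clobber constraint from the usercopy commit can be shown on a much smaller asm statement than the real strncpy_from_user() code. The function below is a hypothetical illustration using GCC extended asm in AT&T syntax on x86, with a plain C fallback on other architectures.

```c
/*
 * Returns count - done. The asm writes its output (%0) before it has read
 * the input %2, so the output is marked early-clobber with "=&r". With a
 * plain "=r" the compiler would be free to place %0 and %2 in the same
 * register, turning the subtraction into "x - x" and yielding 0.
 */
static long remaining_after_copy(long count, long done)
{
    long res;
#if defined(__x86_64__) || defined(__i386__)
    __asm__("mov %1, %0\n\t"
            "sub %2, %0"
            : "=&r"(res)
            : "r"(count), "r"(done)
            : "cc");
#else
    res = count - done;   /* portable fallback, no asm involved */
#endif
    return res;
}
```

The "x - x" outcome is the same shape as the `sub %rcx,%rcx` in the disassembly quoted above, where the result register and an input register ended up being the same.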
  6. 20 Jan, 2009 5 commits
    • x86: remove kernel_physical_mapping_init() from init section · f5495506
      Submitted by Gary Hade
      Impact: fix crash with memory hotplug enabled
      
      kernel_physical_mapping_init() is called during memory hotplug
      so it does not belong in the init section.
      
      If the kernel is built with CONFIG_DEBUG_SECTION_MISMATCH=y on
      the make command line, arch/x86/mm/init_64.c is compiled with
      the -fno-inline-functions-called-once gcc option defeating
      inlining of kernel_physical_mapping_init() within init_memory_mapping().
      
      When kernel_physical_mapping_init() is not inlined it is placed
      in the .init.text section according to the __init in its current
      declaration.  A later call to kernel_physical_mapping_init() during
      a memory hotplug operation encounters an int3 trap because the
      .init.text section memory has been freed.
      
      This patch eliminates the crash caused by the int3 trap by moving the
      non-inlined kernel_physical_mapping_init() from .init.text to .meminit.text.
      Signed-off-by: Gary Hade <garyhade@us.ibm.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • fix: crash: IP: __bitmap_intersects+0x48/0x73 · bfa318ad
      Submitted by Ingo Molnar
      -tip testing found this crash:
      
      > [   35.258515] calling  acpi_cpufreq_init+0x0/0x127 @ 1
      > [   35.264127] BUG: unable to handle kernel NULL pointer dereference at (null)
      > [   35.267554] IP: [<ffffffff80478092>] __bitmap_intersects+0x48/0x73
      > [   35.267554] PGD 0
      > [   35.267554] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
      
      arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c is still broken: there's no
      allocation of the variable mask, so we pass in an uninitialized cmd.mask
      field to drv_read(), which then passes it to the scheduler which then
      crashes ...
      
      Switch it over to the much simpler constant-cpumask-pointers approach.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • cpufreq: use work_on_cpu in acpi-cpufreq.c for drv_read and drv_write · 72859081
      Submitted by Mike Travis
      Impact: use new work_on_cpu function to reduce stack usage
      
      Replace the saving of current->cpus_allowed and set_cpus_allowed_ptr() with
      a work_on_cpu function for drv_read() and drv_write().
      
      Basically converts do_drv_{read,write} into "work_on_cpu" functions that
      are now called by drv_read and drv_write.
      
      Note: This patch basically reverts 50c668d6 which reverted 7503bfba, now
      that the work_on_cpu() function is more stable.
      Signed-off-by: Mike Travis <travis@sgi.com>
      Acked-by: Rusty Russell <rusty@rustcorp.com.au>
      Tested-by: Dieter Ries <clip2@gmx.de>
      Tested-by: Maciej Rutecki <maciej.rutecki@gmail.com>
      Cc: Dave Jones <davej@redhat.com>
      Cc: <cpufreq@vger.kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • work_on_cpu: Use our own workqueue. · 8ccad40d
      Submitted by Rusty Russell
      Impact: remove potential clashes with generic kevent workqueue
      
      Annoyingly, some places we want to use work_on_cpu are already in
      workqueues.  As per Ingo's suggestion, we create a different workqueue
      for work_on_cpu.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Mike Travis <travis@sgi.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • work_on_cpu: don't try to get_online_cpus() in work_on_cpu. · 31ad9081
      Submitted by Rusty Russell
      Impact: remove potential circular lock dependency with cpu hotplug lock
      
      This has caused more problems than it solved, with a pile of cpu
      hotplug locking issues.
      
      Followup patches will get_online_cpus() in callers that need it, but
      if they don't do it they're no worse than before when they were using
      set_cpus_allowed without locking.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Mike Travis <travis@sgi.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  7. 19 Jan, 2009 3 commits
    • x86: fix section mismatch warnings in kernel/setup_percpu.c · c7f8562a
      Submitted by Leonardo Potenza
      The function setup_cpu_local_masks() has been marked __init, in
      order to remove the following section mismatch messages:
      
      WARNING: vmlinux.o(.text+0x3c2c7): Section mismatch in reference from the function setup_cpu_local_masks() to the function .init.text:alloc_bootmem_cpumask_var()
      The function setup_cpu_local_masks() references
      the function __init alloc_bootmem_cpumask_var().
      This is often because setup_cpu_local_masks lacks a __init
      annotation or the annotation of alloc_bootmem_cpumask_var is wrong.
      
      WARNING: vmlinux.o(.text+0x3c2d3): Section mismatch in reference from the function setup_cpu_local_masks() to the function .init.text:alloc_bootmem_cpumask_var()
      The function setup_cpu_local_masks() references
      the function __init alloc_bootmem_cpumask_var().
      This is often because setup_cpu_local_masks lacks a __init
      annotation or the annotation of alloc_bootmem_cpumask_var is wrong.
      
      WARNING: vmlinux.o(.text+0x3c2df): Section mismatch in reference from the function setup_cpu_local_masks() to the function .init.text:alloc_bootmem_cpumask_var()
      The function setup_cpu_local_masks() references
      the function __init alloc_bootmem_cpumask_var().
      This is often because setup_cpu_local_masks lacks a __init
      annotation or the annotation of alloc_bootmem_cpumask_var is wrong.
      
      WARNING: vmlinux.o(.text+0x3c2eb): Section mismatch in reference from the function setup_cpu_local_masks() to the function .init.text:alloc_bootmem_cpumask_var()
      The function setup_cpu_local_masks() references
      the function __init alloc_bootmem_cpumask_var().
      This is often because setup_cpu_local_masks lacks a __init
      annotation or the annotation of alloc_bootmem_cpumask_var is wrong.
      Signed-off-by: Leonardo Potenza <lpotenza@inwind.it>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86: put trigger in to detect mismatched apic versions · b2b815d8
      Submitted by Mike Travis
      Impact: add debug warning
      
      Fire off one message if two APICs are discovered with different
      APIC versions. (This code is only called during CPU init.)
      
      The goal of this is to pave the way for the removal of the apic_version[]
      array. We don't expect any APIC version incompatibilities in the x86
      landscape of systems [if so we don't handle them very well and probably
      never will handle deep APIC version asymmetries well], but it's prudent
      to have a debug check for one kernel cycle nevertheless.
      Signed-off-by: Mike Travis <travis@sgi.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
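The "fire off one message" behaviour in the APIC version commit can be sketched as a warn-once check. This is a userspace illustration only: `note_apic_version` and its static state are hypothetical, and the kernel patch itself may be structured differently.

```c
#include <stdio.h>

/* Version of the first APIC seen; -1 means none recorded yet. */
static int first_apic_version = -1;
/* Set once the single mismatch warning has been printed. */
static int mismatch_reported;

/*
 * Record one APIC's version. Returns 1 only for the call that emits the
 * single mismatch warning, and 0 for every other call.
 */
static int note_apic_version(int cpu, int version)
{
    if (first_apic_version < 0) {
        first_apic_version = version;
        return 0;
    }
    if (version == first_apic_version || mismatch_reported)
        return 0;

    mismatch_reported = 1;
    fprintf(stderr, "cpu %d: APIC version 0x%x differs from 0x%x\n",
            cpu, (unsigned int)version, (unsigned int)first_apic_version);
    return 1;
}
```

Comparing against a single remembered version is what makes a per-CPU version array unnecessary in this sketch, which fits the stated goal of eventually removing apic_version[].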
    • x86: define ARCH_WANT_FRAME_POINTERS · 64dec40d
      Submitted by Jeff Mahoney
      Commit da4276b8 changed a dependency
      for FRAME_POINTER from X86 to ARCH_WANT_FRAME_POINTERS, but didn't
      actually define it.
      
      This patch adds the definition for ARCH_WANT_FRAME_POINTERS. Without it,
      FRAME_POINTER can't be enabled on x86.
      Signed-off-by: Jeff Mahoney <jeffm@suse.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  8. 16 Jan, 2009 3 commits
    • x86: fix assumed to be contiguous leaf page tables for kmap_atomic region (take 2) · a3c6018e
      Submitted by Jan Beulich
      Debugging and original patch from Nick Piggin <npiggin@suse.de>
      
      The early fixmap pmd entry inserted at the very top of the KVA is causing the
      subsequent fixmap mapping code to not provide physically linear pte pages over
      the kmap atomic portion of the fixmap (which relies on said property to
      calculate pte addresses).
      
      This has caused weird boot failures in kmap_atomic much later in the boot
      process (initial userspace faults) on a 32-bit PAE system with a larger number
      of CPUs (smaller CPU counts tend not to run over into the next page so don't
      show up the problem).
      
      Solve this by attempting to clear out the page table, and copy any of its
      entries to the new one. Also, add a bug if a nonlinear condition is encountered
      and can't be resolved, which might save some hours of debugging if this fragile
      scheme ever breaks again...
      
      Once we have such logic, we can also use it to eliminate the early ioremap
      trickery around the page table setup for the fixmap area. This also fixes
      potential issues with FIX_* entries sharing the leaf page table with the early
      ioremap ones getting discarded by early_ioremap_clear() and not restored by
      early_ioremap_reset(). It at once eliminates the temporary (and configuration,
      namely NR_CPUS, dependent) unavailability of early fixed mappings during the
      time the fixmap area page tables get constructed.
      
      Finally, also replace the hard coded calculation of the initial table space
      needed for the fixmap area with a proper one, allowing kernels configured for
      large CPU counts to actually boot.
      
      Based-on: Nick Piggin <npiggin@suse.de>
      Signed-off-by: Jan Beulich <jbeulich@novell.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86, UV: cpu_relax in uv_wait_completion · 18c07cf5
      Submitted by Cliff Wickman
      The function uv_wait_completion() spins on reads of a memory-mapped
      register, waiting for completion of BAU hardware replies.
      
      It should call "cpu_relax()" between those reads to improve performance
      on hyperthreaded configurations.
      Signed-off-by: Cliff Wickman <cpw@sgi.com>
      Acked-by: Jack Steiner <steiner@sgi.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
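The polling pattern from the uv_wait_completion() commit looks roughly like the sketch below. It is a userspace model: `relax_hint` stands in for the kernel's cpu_relax(), and the "register" is a hypothetical struct that reports busy for a fixed number of reads, not real BAU hardware.

```c
/* Userspace stand-in for cpu_relax(): spin-wait hint on x86, else a barrier. */
static inline void relax_hint(void)
{
#if defined(__x86_64__) || defined(__i386__)
    __asm__ __volatile__("pause" ::: "memory");
#else
    __asm__ __volatile__("" ::: "memory");
#endif
}

/* Simulated status register: reads as busy until busy_reads reaches zero. */
struct fake_status_reg {
    int busy_reads;
};

/* One "register read": returns 1 while busy, 0 once the operation is done. */
static int read_status(volatile struct fake_status_reg *reg)
{
    if (reg->busy_reads > 0) {
        reg->busy_reads--;
        return 1;
    }
    return 0;
}

/* Poll until done, relaxing between reads; returns how many reads saw busy. */
static int wait_for_completion_sketch(volatile struct fake_status_reg *reg)
{
    int spins = 0;

    while (read_status(reg)) {
        relax_hint();
        spins++;
    }
    return spins;
}
```

On a hyperthreaded core, the hint inside the loop lets the sibling thread make progress while this one waits, which is the performance benefit the commit refers to.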
    • x86: avoid early crash in disable_local_APIC() · 4a13ad0b
      Submitted by Jan Beulich
      E.g. when called due to an early panic.
      Signed-off-by: Jan Beulich <jbeulich@novell.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  9. 15 Jan, 2009 1 commit
  10. 14 Jan, 2009 1 commit
    • x86, generic: mark complex bitops.h inlines as __always_inline · c8399943
      Submitted by Andi Kleen
      Impact: reduce kernel image size
      
      Hugh Dickins noticed that older gcc versions when the kernel
      is built for code size didn't inline some of the bitops.
      
      Mark all complex x86 bitops that have more than a single
      asm statement or two as always inline to avoid this problem.
      
      Probably should be done for other architectures too.
      
      Ingo then found a better fix that only requires
      a single line change, but it unfortunately only
      works on gcc 4.3.
      
      On older gccs the original patch still makes a ~0.3% defconfig
      difference with CONFIG_OPTIMIZE_INLINING=y.
      
      With gcc 4.1 and a defconfig like build:
      
          6116998 1138540  883788 8139326  7c323e vmlinux-oi-with-patch
          6137043 1138540  883788 8159371  7c808b vmlinux-optimize-inlining
      
      ~20k / 0.3% difference.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
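Forcing a small multi-step helper inline, as the bitops commit does, can be sketched as follows. The macro `sketch_always_inline` and both helpers are hypothetical illustrations, not the kernel's bitops. The sketch assumes a GCC-compatible compiler for the attribute and for `__builtin_ctzl`.

```c
#include <assert.h>

/* Assumed GCC/Clang spelling; the kernel wraps this as __always_inline. */
#define sketch_always_inline inline __attribute__((always_inline))

/* Index of the lowest set bit in a nonzero word. */
static sketch_always_inline unsigned long lowest_set_bit(unsigned long word)
{
    assert(word != 0);
    return (unsigned long)__builtin_ctzl(word);
}

/* Two-step helper in the spirit of the "complex" bitops: scan, then clear. */
static sketch_always_inline unsigned long pop_lowest_bit(unsigned long *word)
{
    unsigned long bit = lowest_set_bit(*word);

    *word &= ~(1UL << bit);
    return bit;
}
```

With a plain `inline`, a compiler optimizing for size may emit such helpers as out-of-line functions with a call at each use. The attribute removes that choice, which is what the commit relies on to shrink the image on older gcc versions.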
  11. 13 Jan, 2009 6 commits
  12. 11 Jan, 2009 5 commits
  13. 10 Jan, 2009 5 commits