1. 21 Oct, 2010 1 commit
  2. 20 Oct, 2010 2 commits
  3. 08 Oct, 2010 1 commit
    • x86-32: Fix sparse warning for the __PHYSICAL_MASK calculation · a416e9e1
      Namhyung Kim committed
      On 32-bit non-PAE systems, the cast to 'phys_addr_t' truncates the value
      before the subtraction. Subtracting before the cast produces the same
      result but removes the following warnings from sparse:
      
       arch/x86/include/asm/pgtable_types.h:255:38: warning: cast truncates bits from constant value (100000000 becomes 0)
       arch/x86/include/asm/pgtable_types.h:270:38: warning: cast truncates bits from constant value (100000000 becomes 0)
       arch/x86/include/asm/pgtable.h:127:32: warning: cast truncates bits from constant value (100000000 becomes 0)
       arch/x86/include/asm/pgtable.h:132:32: warning: cast truncates bits from constant value (100000000 becomes 0)
       arch/x86/include/asm/pgtable.h:344:31: warning: cast truncates bits from constant value (100000000 becomes 0)
      
      64-bit or PAE machines will not be affected by this change.
      Signed-off-by: Namhyung Kim <namhyung@gmail.com>
      LKML-Reference: <1285770588-14065-1-git-send-email-namhyung@gmail.com>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
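The equivalence claimed in this commit message can be checked with a small C sketch. It assumes a 32-bit physical address type (modeled by the hypothetical `phys_addr32_t`) and a mask shift of 32; the names below are illustrative and are not the kernel's macros.

```c
#include <stdint.h>

/* Assumption: on a 32-bit non-PAE build, phys_addr_t is 32 bits wide. */
typedef uint32_t phys_addr32_t;

/* Assumed mask shift for this illustration. */
#define MASK_SHIFT 32

/* Cast first: 1ULL << 32 truncates to 0, then 0 - 1 wraps to 0xffffffff. */
static phys_addr32_t mask_cast_then_subtract(void)
{
    return (phys_addr32_t)(1ULL << MASK_SHIFT) - 1;
}

/* Subtract first: 0x100000000 - 1 = 0xffffffff, which fits in 32 bits. */
static phys_addr32_t mask_subtract_then_cast(void)
{
    return (phys_addr32_t)((1ULL << MASK_SHIFT) - 1);
}
```

Both functions return the same mask; only the first form casts a constant that does not fit the target type, which is the pattern sparse warns about.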
  4. 06 Oct, 2010 1 commit
  5. 17 Sep, 2010 1 commit
    • mm, x86: Saving vmcore with non-lazy freeing of vmas · 3ee48b6a
      Cliff Wickman committed
      During the reading of /proc/vmcore the kernel is doing
      ioremap()/iounmap() repeatedly. And the buildup of un-flushed
      vm_area_struct's is causing a great deal of overhead. (rb_next()
      is chewing up most of that time).
      
      The solution is to provide the function set_iounmap_nonlazy(). It
      causes a subsequent call to iounmap() to immediately purge the
      vma area (with try_purge_vmap_area_lazy()).
      
      With this patch we have seen the time for writing a 250MB
      compressed dump drop from 71 seconds to 44 seconds.
      Signed-off-by: Cliff Wickman <cpw@sgi.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: kexec@lists.infradead.org
      Cc: <stable@kernel.org>
      LKML-Reference: <E1OwHZ4-0005WK-Tw@eag09.americas.sgi.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
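The lazy versus non-lazy behavior can be pictured with a toy model. The sketch below is not the kernel's implementation: the threshold, the counters, and every `toy_` name are assumptions, and pre-loading the backlog counter is just one plausible way to make the next unmap purge immediately.

```c
#include <stddef.h>

/* Hypothetical threshold; the kernel derives its own limit at runtime. */
#define LAZY_MAX_PAGES 1024

static size_t lazy_pages;   /* pages unmapped but not yet purged */
static size_t purge_count;  /* number of purges performed so far */

/* Purge everything that was freed lazily. */
static void toy_purge_lazy(void)
{
    lazy_pages = 0;
    purge_count++;
}

/* Unmap: account the pages, purge only once the backlog exceeds the limit. */
static void toy_iounmap(size_t pages)
{
    lazy_pages += pages;
    if (lazy_pages > LAZY_MAX_PAGES)
        toy_purge_lazy();
}

/* Force the next toy_iounmap() to purge by pre-loading the backlog counter. */
static void toy_set_iounmap_nonlazy(void)
{
    lazy_pages = LAZY_MAX_PAGES + 1;
}
```

Calling `toy_set_iounmap_nonlazy()` before each `toy_iounmap()` keeps the backlog from building up, which is the effect the commit message describes for the /proc/vmcore read path.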
  6. 09 Sep, 2010 1 commit
  7. 03 Sep, 2010 1 commit
  8. 30 Aug, 2010 1 commit
  9. 27 Aug, 2010 3 commits
    • x86, mm: Make spurious_fault check explicitly check the PRESENT bit · 660a293e
      Shaohua Li committed
      pte_present() returns true even if the present bit isn't set, as long as
      the _PAGE_PROTNONE (global) bit is set. With CONFIG_DEBUG_PAGEALLOC, free
      pages have the global bit set but the present bit clear. This patch lets
      us catch accesses to free pages with CONFIG_DEBUG_PAGEALLOC enabled.
      
      [ hpa: added a comment in the code as a warning to janitors ]
      Signed-off-by: Shaohua Li <shaohua.li@intel.com>
      LKML-Reference: <1280217988.32400.75.camel@sli10-desk.sh.intel.com>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
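The difference between the two checks can be sketched in a few lines of C. The flag values and function names below are assumptions made for illustration (PROTNONE is modeled as sharing the global bit, as the message describes); this is not the kernel's code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative flag values; treated here as assumptions. */
#define TOY_PAGE_PRESENT  (1u << 0)
#define TOY_PAGE_GLOBAL   (1u << 8)
#define TOY_PAGE_PROTNONE TOY_PAGE_GLOBAL  /* PROTNONE shares the global bit */

/* Mirrors the behavior described for pte_present(): either bit counts. */
static bool toy_pte_present(uint32_t flags)
{
    return (flags & (TOY_PAGE_PRESENT | TOY_PAGE_PROTNONE)) != 0;
}

/* Explicit check of the PRESENT bit only. */
static bool toy_pte_present_bit_set(uint32_t flags)
{
    return (flags & TOY_PAGE_PRESENT) != 0;
}
```

A free page under CONFIG_DEBUG_PAGEALLOC corresponds to flags with only the global bit set: the first check accepts it, the second rejects it.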
    • x86-64, mem: Update all PGDs for direct mapping and vmemmap mapping changes · 9b861528
      Haicheng Li committed
      When memory hotplug-adding happens for a large enough area
      that a new PGD entry is needed for the direct mapping, the PGDs
      of other processes would not get updated. This leads to some CPUs
      oopsing like below when they have to access the unmapped areas.
      
      [ 1139.243192] BUG: soft lockup - CPU#0 stuck for 61s! [bash:6534]
      [ 1139.243195] Modules linked in: ipv6 autofs4 rfcomm l2cap crc16 bluetooth rfkill binfmt_misc
      dm_mirror dm_region_hash dm_log dm_multipath dm_mod video output sbs sbshc fan battery ac parport_pc
      lp parport joydev usbhid processor thermal thermal_sys container button rtc_cmos rtc_core rtc_lib
      i2c_i801 i2c_core pcspkr uhci_hcd ohci_hcd ehci_hcd usbcore
      [ 1139.243229] irq event stamp: 8538759
      [ 1139.243230] hardirqs last  enabled at (8538759): [<ffffffff8100c3fc>] restore_args+0x0/0x30
      [ 1139.243236] hardirqs last disabled at (8538757): [<ffffffff810422df>] __do_softirq+0x106/0x146
      [ 1139.243240] softirqs last  enabled at (8538758): [<ffffffff81042310>] __do_softirq+0x137/0x146
      [ 1139.243245] softirqs last disabled at (8538743): [<ffffffff8100cb5c>] call_softirq+0x1c/0x34
      [ 1139.243249] CPU 0:
      [ 1139.243250] Modules linked in: ipv6 autofs4 rfcomm l2cap crc16 bluetooth rfkill binfmt_misc
      dm_mirror dm_region_hash dm_log dm_multipath dm_mod video output sbs sbshc fan battery ac parport_pc
      lp parport joydev usbhid processor thermal thermal_sys container button rtc_cmos rtc_core rtc_lib
      i2c_i801 i2c_core pcspkr uhci_hcd ohci_hcd ehci_hcd usbcore
      [ 1139.243284] Pid: 6534, comm: bash Tainted: G   M       2.6.32-haicheng-cpuhp #7 QSSC-S4R
      [ 1139.243287] RIP: 0010:[<ffffffff810ace35>]  [<ffffffff810ace35>] alloc_arraycache+0x35/0x69
      [ 1139.243292] RSP: 0018:ffff8802799f9d78  EFLAGS: 00010286
      [ 1139.243295] RAX: ffff8884ffc00000 RBX: ffff8802799f9d98 RCX: 0000000000000000
      [ 1139.243297] RDX: 0000000000190018 RSI: 0000000000000001 RDI: ffff8884ffc00010
      [ 1139.243300] RBP: ffffffff8100c34e R08: 0000000000000002 R09: 0000000000000000
      [ 1139.243303] R10: ffffffff8246dda0 R11: 000000d08246dda0 R12: ffff8802599bfff0
      [ 1139.243305] R13: ffff88027904c040 R14: ffff8802799f8000 R15: 0000000000000001
      [ 1139.243308] FS:  00007fe81bfe86e0(0000) GS:ffff88000d800000(0000) knlGS:0000000000000000
      [ 1139.243311] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 1139.243313] CR2: ffff8884ffc00000 CR3: 000000026cf2d000 CR4: 00000000000006f0
      [ 1139.243316] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [ 1139.243318] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      [ 1139.243321] Call Trace:
      [ 1139.243324]  [<ffffffff810ace29>] ? alloc_arraycache+0x29/0x69
      [ 1139.243328]  [<ffffffff8135004e>] ? cpuup_callback+0x1b0/0x32a
      [ 1139.243333]  [<ffffffff8105385d>] ? notifier_call_chain+0x33/0x5b
      [ 1139.243337]  [<ffffffff810538a4>] ? __raw_notifier_call_chain+0x9/0xb
      [ 1139.243340]  [<ffffffff8134ecfc>] ? cpu_up+0xb3/0x152
      [ 1139.243344]  [<ffffffff813388ce>] ? store_online+0x4d/0x75
      [ 1139.243348]  [<ffffffff811e53f3>] ? sysdev_store+0x1b/0x1d
      [ 1139.243351]  [<ffffffff8110589f>] ? sysfs_write_file+0xe5/0x121
      [ 1139.243355]  [<ffffffff810b539d>] ? vfs_write+0xae/0x14a
      [ 1139.243358]  [<ffffffff810b587f>] ? sys_write+0x47/0x6f
      [ 1139.243362]  [<ffffffff8100b9ab>] ? system_call_fastpath+0x16/0x1b
      
      This patch makes sure to always replicate new direct mapping PGD entries
      to the PGDs of all processes, and ensures that the corresponding vmemmap
      mapping gets synced.
      
      V1: initial code by Andi Kleen.
      V2: fix several issues found in testing.
      V3: as suggested by Wu Fengguang, reuse common code of vmalloc_sync_all().
      
      [ hpa: changed pgd_change from int to bool ]
      Originally-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com>
      LKML-Reference: <4C6E4FD8.6080100@linux.intel.com>
      Reviewed-by: Wu Fengguang <fengguang.wu@intel.com>
      Reviewed-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
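The replication step described in this commit can be illustrated with a toy model in C. Everything here is an assumption made for the sketch: page tables are plain arrays, a zero entry means "not mapped", and `toy_sync_pgds()` is a hypothetical name rather than a kernel function.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_PTRS_PER_PGD 512  /* entries in one top-level table */
#define TOY_NR_TABLES    3    /* hypothetical number of process page tables */

typedef uint64_t toy_pgd_t;

/* Stands in for the kernel's reference page table. */
static toy_pgd_t reference_pgd[TOY_PTRS_PER_PGD];
/* Stands in for the page tables of all other processes. */
static toy_pgd_t process_pgd[TOY_NR_TABLES][TOY_PTRS_PER_PGD];

/* Copy populated reference entries in [start, end] into every process table
 * that does not have them yet. Returns the number of entries copied. */
static size_t toy_sync_pgds(size_t start, size_t end)
{
    size_t copied = 0;

    for (size_t idx = start; idx <= end && idx < TOY_PTRS_PER_PGD; idx++) {
        if (reference_pgd[idx] == 0)
            continue;  /* nothing mapped at this index */
        for (size_t t = 0; t < TOY_NR_TABLES; t++) {
            if (process_pgd[t][idx] == 0) {
                process_pgd[t][idx] = reference_pgd[idx];
                copied++;
            }
        }
    }
    return copied;
}
```

In this model, a hot-add that populates a new reference entry would be followed by one `toy_sync_pgds()` call over the affected index range, so that no process is left with an unmapped hole.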
    • x86, mm: Separate x86_64 vmalloc_sync_all() into separate functions · 6afb5157
      Haicheng Li committed
      No behavior change.
      
      Move some of vmalloc_sync_all() code into a new function
      sync_global_pgds() that will be useful for memory hotplug.
      Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com>
      LKML-Reference: <4C6E4ECD.1090607@linux.intel.com>
      Reviewed-by: Wu Fengguang <fengguang.wu@intel.com>
      Reviewed-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
  10. 24 Aug, 2010 1 commit
    • x86, mm: Avoid unnecessary TLB flush · 61c77326
      Shaohua Li committed
      On x86, the accessed and dirty bits are set automatically by the CPU when
      it accesses memory. By the time we reach flush_tlb_fix_spurious_fault() in
      the code path below, we have already set the dirty bit for the pte and do
      not need to flush the TLB. This might mean that the TLB entry on some CPUs
      does not have the dirty bit set, but this doesn't matter: when those CPUs
      write to the page, they check the bit automatically and no software is
      involved.
      
      On the other hand, flushing the TLB at that position is harmful. The test
      creates one thread per CPU; each thread writes to the same (but random)
      address in the same vma range, and we measure the total time. On a 4-socket
      system, the original time is 1.96s, while with the patch the time is 0.8s.
      On a 2-socket system, there is a 20% time cut too. perf shows that a lot of
      time is spent sending and handling IPIs for TLB flushes.
      Signed-off-by: Shaohua Li <shaohua.li@intel.com>
      LKML-Reference: <20100816011655.GA362@sli10-desk.sh.intel.com>
      Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: Andrea Archangeli <aarcange@redhat.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
  11. 22 Aug, 2010 1 commit
  12. 20 Aug, 2010 2 commits
    • x86, apic: Fix apic=debug boot crash · 05e40760
      Daniel Kiper committed
      Fix a boot crash when apic=debug is used and the APIC is
      not properly initialized.
      
      This issue appears during Xen Dom0 kernel boot but the
      fix is generic and the crash could occur on real hardware
      as well.
      Signed-off-by: Daniel Kiper <dkiper@net-space.pl>
      Cc: xen-devel@lists.xensource.com
      Cc: konrad.wilk@oracle.com
      Cc: jeremy@goop.org
      Cc: <stable@kernel.org> # .35.x, .34.x, .33.x, .32.x
      LKML-Reference: <20100819224616.GB9967@router-fw-old.local.net-space.pl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86, hotplug: Serialize CPU hotplug to avoid bringup concurrency issues · d7c53c9e
      Borislav Petkov committed
      When testing cpu hotplug code on 32-bit we kept hitting the "CPU%d:
      Stuck ??" message due to multiple cores concurrently accessing the
      cpu_callin_mask, among others.
      
      Since these codepaths are not protected from concurrent access due to
      the fact that there's no sane reason for making an already complex
      code unnecessarily more complex - we hit the issue only when insanely
      switching cores off- and online - serialize hotplugging cores on the
      sysfs level and be done with it.
      
      [ v2.1: fix !HOTPLUG_CPU build ]
      
      Cc: <stable@kernel.org>
      Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
      LKML-Reference: <20100819181029.GC17171@aftab>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
  13. 19 Aug, 2010 4 commits
    • kprobes/x86: Fix the return address of multiple kretprobes · 737480a0
      KUMANO Syuhei committed
      Fix the return address of subsequent kretprobes when multiple
      kretprobes are set on the same function.
      
      For example:
      
       # cd /sys/kernel/debug/tracing
       # echo "r:event1 sys_symlink" > kprobe_events
       # echo "r:event2 sys_symlink" >> kprobe_events
       # echo 1 > events/kprobes/enable
       # ln -s /tmp/foo /tmp/bar
      
      (without this patch)
      
       # cat trace
                    ln-897   [000] 20404.133727: event1: (kretprobe_trampoline+0x0/0x4c <- sys_symlink)
                    ln-897   [000] 20404.133747: event2: (system_call_fastpath+0x16/0x1b <- sys_symlink)
      
      (with this patch)
      
       # cat trace
                    ln-740   [000] 13799.491076: event1: (system_call_fastpath+0x16/0x1b <- sys_symlink)
                    ln-740   [000] 13799.491096: event2: (system_call_fastpath+0x16/0x1b <- sys_symlink)
      Signed-off-by: KUMANO Syuhei <kumano.prog@gmail.com>
      Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
      LKML-Reference: <1281853084.3254.11.camel@camp10-laptop>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
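The traces in this commit suggest the following picture: when several kretprobes hook the same function, only one probe instance has recorded the real caller, and the others have recorded the trampoline. The C sketch below models one way to report the real address to every handler. The struct, the trampoline address, and the two-pass approach are assumptions for illustration, not a copy of the kernel fix.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical address standing in for kretprobe_trampoline. */
#define TOY_TRAMPOLINE_ADDR ((uintptr_t)0x1000)

struct toy_instance {
    uintptr_t ret_addr;       /* return address saved by this instance */
    uintptr_t reported_addr;  /* address handed to the probe's handler */
};

/* All but one instance saved the trampoline address instead of the caller. */
static void toy_report(struct toy_instance *inst, size_t n)
{
    uintptr_t real = 0;

    /* First pass: find the one saved address that is not the trampoline. */
    for (size_t i = 0; i < n; i++) {
        if (inst[i].ret_addr != TOY_TRAMPOLINE_ADDR) {
            real = inst[i].ret_addr;
            break;
        }
    }
    /* Second pass: every handler is given the real return address. */
    for (size_t i = 0; i < n; i++)
        inst[i].reported_addr = real;
}
```

With two instances, one holding the trampoline address and one holding the caller, both end up reporting the caller, which matches the "with this patch" trace.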
    • x86-32: Fix dummy trampoline-related inline stubs · 8848a910
      H. Peter Anvin committed
      Fix dummy inline stubs for trampoline-related functions when no
      trampolines exist (until we get rid of the no-trampoline case
      entirely.)
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      Cc: Joerg Roedel <joerg.roedel@amd.com>
      Cc: Borislav Petkov <borislav.petkov@amd.com>
      LKML-Reference: <4C6C294D.3030404@zytor.com>
    • x86-32: Separate 1:1 pagetables from swapper_pg_dir · fd89a137
      Joerg Roedel committed
      This patch fixes machine crashes which occur when heavily exercising the
      CPU hotplug codepaths on a 32-bit kernel. These crashes are caused by
      AMD Erratum 383 and result in a fatal machine check exception. Here's
      the scenario:
      
      1. On 32-bit, the swapper_pg_dir page table is used as the initial page
      table for booting a secondary CPU.
      
      2. To make this work, swapper_pg_dir needs a direct mapping of physical
      memory in it (the low mappings). By adding those low, large page (2M)
      mappings (PAE kernel), we create the necessary conditions for Erratum
      383 to occur.
      
      3. Other CPUs which do not participate in the off- and onlining game may
      use swapper_pg_dir while the low mappings are present (when leave_mm is
      called). For all steps below, the CPU referred to is a CPU that is using
      swapper_pg_dir, and not the CPU which is being onlined.
      
      4. The presence of the low mappings in swapper_pg_dir can result
      in TLB entries for addresses below __PAGE_OFFSET to be established
      speculatively. These TLB entries are marked global and large.
      
      5. When the CPU with such TLB entry switches to another page table, this
      TLB entry remains because it is global.
      
      6. The process then generates an access to an address covered by the
      above TLB entry but there is a permission mismatch - the TLB entry
      covers a large global page not accessible to userspace.
      
      7. Due to this permission mismatch a new 4kb, user TLB entry gets
      established. Further, Erratum 383 provides for a small window of time
      where both TLB entries are present. This results in an uncorrectable
      machine check exception signalling a TLB multimatch which panics the
      machine.
      
      There are two ways to fix this issue:
      
              1. Always do a global TLB flush when a new cr3 is loaded and the
              old page table was swapper_pg_dir. I consider this a hack hard
              to understand and with performance implications
      
              2. Do not use swapper_pg_dir to boot secondary CPUs like 64-bit
              does.
      
      This patch implements solution 2. It introduces a trampoline_pg_dir
      which has the same layout as swapper_pg_dir with low_mappings. This page
      table is used as the initial page table of the booting CPU. Later in the
      bringup process, it switches to swapper_pg_dir and does a global TLB
      flush. This fixes the crashes in our test cases.
      
      -v2: switch to swapper_pg_dir right after entering start_secondary() so
      that we are able to access percpu data which might not be mapped in the
      trampoline page table.
      Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
      LKML-Reference: <20100816123833.GB28147@aftab>
      Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
    • x86, cpu: Fix regression in AMD errata checking code · 07a7795c
      Hans Rosenfeld committed
      A bug in the family-model-stepping matching code caused the presence of
      errata to go undetected when OSVW was not used. This causes hangs on
      some K8 systems because the E400 workaround is not enabled.
      Signed-off-by: Hans Rosenfeld <hans.rosenfeld@amd.com>
      LKML-Reference: <1282141190-930137-1-git-send-email-hans.rosenfeld@amd.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
  14. 18 Aug, 2010 2 commits
    • perf, x86: Fix Intel-nhm PMU programming errata workaround · 351af072
      Zhang, Yanmin committed
      Fix the Errata AAK100/AAP53/BD53 workaround. The officially documented
      workaround we implemented in:
      
       11164cd4: perf, x86: Add Nehelem PMU programming errata workaround
      
      doesn't actually work fully and causes a stuck PMU state
      under load and non-functioning perf profiling.
      
      A functional workaround was found by trial & error.
      
      Affects all Nehalem-class Intel PMUs.
      Signed-off-by: Zhang Yanmin <yanmin_zhang@linux.intel.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1281073148.2125.63.camel@ymzhang.sh.intel.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: <stable@kernel.org> # .35.x
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • Make do_execve() take a const filename pointer · d7627467
      David Howells committed
      Make do_execve() take a const filename pointer so that kernel_execve() compiles
      correctly on ARM:
      
      arch/arm/kernel/sys_arm.c:88: warning: passing argument 1 of 'do_execve' discards qualifiers from pointer target type
      
      This also requires the argv and envp arguments to be consted twice, once for
      the pointer array and once for the strings the array points to.  This is
      because do_execve() passes a pointer to the filename (now const) to
      copy_strings_kernel().  A simpler alternative would be to cast the filename
      pointer in do_execve() when it's passed to copy_strings_kernel().
      
      do_execve() may not change any of the strings it is passed as part of the
      argv or envp lists, as some of them are in .rodata, so marking these
      strings as const should be fine.
      
      Further kernel_execve() and sys_execve() need to be changed to match.
      
      This has been test built on x86_64, frv, arm and mips.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Tested-by: Ralf Baechle <ralf@linux-mips.org>
      Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
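What "consted twice" means for the argv and envp arguments can be shown with a short C sketch. The `toy_` functions are hypothetical stand-ins and do not reproduce the kernel's do_execve() signature; they only illustrate the `const char *const []` parameter type.

```c
#include <stddef.h>

/* "Consted twice": both the array slots and the strings they point to are
 * read-only, so string literals stored in .rodata can be passed safely. */
static size_t toy_count_args(const char *const argv[])
{
    size_t n = 0;

    while (argv[n] != NULL)
        n++;
    /* argv[0] = NULL;    would not compile: the pointer slots are const */
    /* argv[0][0] = 'x';  would not compile: the characters are const    */
    return n;
}

/* Hypothetical stand-in for an exec-style function with const arguments. */
static size_t toy_execve(const char *filename,
                         const char *const argv[],
                         const char *const envp[])
{
    (void)filename;  /* unused in this sketch */
    return toy_count_args(argv) + toy_count_args(envp);
}
```

A caller can declare `static const char *const argv[] = { "/bin/true", NULL };` and pass it directly, with no cast and no qualifier warning.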
  15. 17 Aug, 2010 2 commits
  16. 15 Aug, 2010 4 commits
    • KVM: destroy workqueue on kvm_create_pit() failures · 3185bf8c
      Xiaotian Feng committed
      The kernel needs to destroy the workqueue if kvm_create_pit() fails;
      otherwise the workqueue is leaked after the pit is freed.
      Signed-off-by: Xiaotian Feng <dfeng@redhat.com>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Gleb Natapov <gleb@redhat.com>
      Cc: "Michael S. Tsirkin" <mst@redhat.com>
      Cc: Gregory Haskins <ghaskins@novell.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
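The leak described in this commit is an instance of a missing cleanup step on an error path. The sketch below shows the usual goto-based unwinding pattern in C. The types, the `live_workqueues` counter, and the `later_step_fails` parameter are all hypothetical and exist only to make the pattern observable.

```c
#include <stdlib.h>

struct toy_wq  { int unused; };
struct toy_pit { struct toy_wq *wq; };

static int live_workqueues;  /* workqueues created and not yet destroyed */

static struct toy_wq *toy_create_wq(void)
{
    struct toy_wq *wq = malloc(sizeof(*wq));

    if (wq)
        live_workqueues++;
    return wq;
}

static void toy_destroy_wq(struct toy_wq *wq)
{
    free(wq);
    live_workqueues--;
}

/* later_step_fails simulates a failure after the workqueue was created. */
static struct toy_pit *toy_create_pit(int later_step_fails)
{
    struct toy_pit *pit = malloc(sizeof(*pit));

    if (!pit)
        return NULL;
    pit->wq = toy_create_wq();
    if (!pit->wq)
        goto fail_free;
    if (later_step_fails)
        goto fail_destroy_wq;
    return pit;

fail_destroy_wq:
    toy_destroy_wq(pit->wq);  /* without this, the workqueue would leak */
fail_free:
    free(pit);
    return NULL;
}
```

Each label undoes exactly the steps that succeeded before the failure, in reverse order, so freeing the outer object never strands a resource it owned.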
    • KVM: fix poison overwritten caused by using wrong xstate size · f45755b8
      Xiaotian Feng committed
      fpu.state is allocated from task_xstate_cachep, whose object size is
      xstate_size. xstate_size is set from the cpuid instruction and is often
      smaller than sizeof(struct xsave_struct). kvm uses sizeof(struct xsave_struct)
      to fill fpu.state.xsave in and out; since only xstate_size bytes were
      allocated for fpu.state, the kernel writes out of bounds and triggers
      poison/redzone/padding overwritten warnings.
      Signed-off-by: Xiaotian Feng <dfeng@redhat.com>
      Reviewed-by: Sheng Yang <sheng@linux.intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Sheng Yang <sheng@linux.intel.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Gleb Natapov <gleb@redhat.com>
      Cc: Jan Kiszka <jan.kiszka@siemens.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
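The overrun described in this commit comes from copying with the size of the largest possible layout into a buffer that was allocated with a smaller, runtime-determined size. A minimal C sketch of the safer pattern follows. The struct size, the names, and the clamp to the smaller of the two sizes are assumptions for illustration, not the actual KVM change.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical maximum-size layout, standing in for struct xsave_struct. */
struct toy_xsave {
    unsigned char bytes[832];
};

/* Copy state into a buffer that holds only alloc_size bytes. Copying
 * sizeof(struct toy_xsave) unconditionally would overrun smaller buffers. */
static size_t toy_fill_state(void *dst, size_t alloc_size,
                             const struct toy_xsave *src)
{
    size_t n = alloc_size < sizeof(*src) ? alloc_size : sizeof(*src);

    memcpy(dst, src, n);
    return n;
}
```

The key point is that the copy length is tied to what was actually allocated, so bytes just past the buffer (where a slab allocator keeps its poison and redzone markers) stay untouched.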
    • defconfig reduction · 8b1bb907
      Sam Ravnborg committed
      Use the defconfig files generated by "make savedefconfig" for
      remaining defconfig files.
      Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
    • archs: replace unifdef-y with header-y · bf56fba6
      Sam Ravnborg committed
      unifdef-y and header-y have the same semantics, so drop unifdef-y
      Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
  17. 14 Aug, 2010 2 commits
  18. 13 Aug, 2010 4 commits
  19. 12 Aug, 2010 2 commits
  20. 11 Aug, 2010 4 commits