1. 21 4月, 2011 1 次提交
  2. 19 4月, 2011 4 次提交
  3. 16 4月, 2011 2 次提交
    • J
      x86, amd: Disable GartTlbWlkErr when BIOS forgets it · 5bbc097d
      Joerg Roedel 提交于
      This patch disables GartTlbWlk errors on AMD Fam10h CPUs if
      the BIOS forgets to do is (or is just too old). Letting
      these errors enabled can cause a sync-flood on the CPU
      causing a reboot.
      
      The AMD BKDG recommends disabling GART TLB Wlk Error completely.
      
      This patch is the fix for
      
      	https://bugzilla.kernel.org/show_bug.cgi?id=33012
      
      on my machine.
      Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com>
      Link: http://lkml.kernel.org/r/20110415131152.GJ18463@8bytes.orgTested-by: NAlexandre Demers <alexandre.f.demers@gmail.com>
      Cc: <stable@kernel.org>
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      5bbc097d
    • K
      x86, NUMA: Fix fakenuma boot failure · 7d6b4670
      KOSAKI Motohiro 提交于
      Currently, numa=fake boot parameter is broken. If it's used,
      kernel may panic due to devide by zero error depending on CPU
      configuration
      
      Call Trace:
       [<ffffffff8104ad4c>] find_busiest_group+0x38c/0xd30
       [<ffffffff81086aff>] ? local_clock+0x6f/0x80
       [<ffffffff81050533>] load_balance+0xa3/0x600
       [<ffffffff81050f53>] idle_balance+0xf3/0x180
       [<ffffffff81550092>] schedule+0x722/0x7d0
       [<ffffffff81550538>] ? wait_for_common+0x128/0x190
       [<ffffffff81550a65>] schedule_timeout+0x265/0x320
       [<ffffffff81095815>] ? lock_release_holdtime+0x35/0x1a0
       [<ffffffff81550538>] ? wait_for_common+0x128/0x190
       [<ffffffff8109bb6c>] ? __lock_release+0x9c/0x1d0
       [<ffffffff815534e0>] ? _raw_spin_unlock_irq+0x30/0x40
       [<ffffffff815534e0>] ? _raw_spin_unlock_irq+0x30/0x40
       [<ffffffff81550540>] wait_for_common+0x130/0x190
       [<ffffffff81051920>] ? try_to_wake_up+0x510/0x510
       [<ffffffff8155067d>] wait_for_completion+0x1d/0x20
       [<ffffffff8107f36c>] kthread_create_on_node+0xac/0x150
       [<ffffffff81077bb0>] ? process_scheduled_works+0x40/0x40
       [<ffffffff8155045f>] ? wait_for_common+0x4f/0x190
       [<ffffffff8107a283>] __alloc_workqueue_key+0x1a3/0x590
       [<ffffffff81e0cce2>] cpuset_init_smp+0x6b/0x7b
       [<ffffffff81df3d07>] kernel_init+0xc3/0x182
       [<ffffffff8155d5e4>] kernel_thread_helper+0x4/0x10
       [<ffffffff81553cd4>] ? retint_restore_args+0x13/0x13
       [<ffffffff81df3c44>] ? start_kernel+0x400/0x400
       [<ffffffff8155d5e0>] ? gs_change+0x13/0x13
      
      The divede by zero is caused by the following line,
      group->cpu_power==0:
      
       kernel/sched_fair.c::update_sg_lb_stats()
              /* Adjust by relative CPU power of the group */
              sgs->avg_load = (sgs->group_load * SCHED_LOAD_SCALE) / group->cpu_power;
      
      This regression was caused by commit e23bba60 ("x86-64, NUMA: Unify
      emulated distance mapping") because it changes cpu -> node
      mapping in the process of dropping fake_physnodes().
      
        old) all cpus are assinged node 0
        now) cpus are assigned round robin
             (the logic is implemented by numa_init_array())
      
        Note: The change in behavior only happens if the system doesn't
              have neither ACPI SRAT table nor AMD northbridge NUMA
      	information.
      
      Round robin assignment doesn't work because init_numa_sched_groups_power()
      assumes all logical cpus in the same physical cpu share the same node
      (then it only accounts for group_first_cpu()), and the simple round robin
      breaks the above assumption.
      
      Thus, this patch implements a reassignment of node-ids if buggy firmware
      or numa emulation makes wrong cpu node map. Tt enforce all logical cpus
      in the same physical cpu share the same node.
      Signed-off-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Acked-by: NTejun Heo <tj@kernel.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Cc: Shaohui Zheng <shaohui.zheng@intel.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: H. Peter Anvin <hpa@linux.intel.com>
      Link: http://lkml.kernel.org/r/20110415203928.1303.A69D9226@jp.fujitsu.comSigned-off-by: NIngo Molnar <mingo@elte.hu>
      7d6b4670
  4. 07 4月, 2011 1 次提交
    • H
      x86, hibernate: Initialize mmu_cr4_features during boot · 4da9484b
      H. Peter Anvin 提交于
      Restore the initialization of mmu_cr4_features during boot, which was
      removed without comment in checkin e5f15b45
      
      x86: Cleanup highmap after brk is concluded
      
      thereby breaking resume from hibernate.  This restores previous
      functionality in approximately the same place, and corrects the
      reading of %cr4 on pre-CPUID hardware (%cr4 exists if and only if
      CPUID is supported.)
      
      However, part of the problem is that the hibernate suspend/resume
      sequence should manage the save/restore of %cr4 explicitly.
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      Cc: Rafael J. Wysocki <rjw@sisk.pl>
      Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      LKML-Reference: <201104020154.57136.rjw@sisk.pl>
      4da9484b
  5. 01 4月, 2011 2 次提交
  6. 31 3月, 2011 1 次提交
  7. 30 3月, 2011 2 次提交
    • S
      x86, mtrr, pat: Fix one cpu getting out of sync during resume · 84ac7cdb
      Suresh Siddha 提交于
      On laptops with core i5/i7, there were reports that after resume
      graphics workloads were performing poorly on a specific AP, while
      the other cpu's were ok. This was observed on a 32bit kernel
      specifically.
      
      Debug showed that the PAT init was not happening on that AP
      during resume and hence it contributing to the poor workload
      performance on that cpu.
      
      On this system, resume flow looked like this:
      
      1. BP starts the resume sequence and we reinit BP's MTRR's/PAT
         early on using mtrr_bp_restore()
      
      2. Resume sequence brings all AP's online
      
      3. Resume sequence now kicks off the MTRR reinit on all the AP's.
      
      4. For some reason, between point 2 and 3, we moved from BP
         to one of the AP's. My guess is that printk() during resume
         sequence is contributing to this. We don't see similar
         behavior with the 64bit kernel but there is no guarantee that
         at this point the remaining resume sequence (after AP's bringup)
         has to happen on BP.
      
      5. set_mtrr() was assuming that we are still on BP and skipped the
         MTRR/PAT init on that cpu (because of 1 above)
      
      6. But we were on an AP and this led to not reprogramming PAT
         on this cpu leading to bad performance.
      
      Fix this by doing unconditional mtrr_if->set_all() in set_mtrr()
      during MTRR/PAT init. This might be unnecessary if we are still
      running on BP. But it is of no harm and will guarantee that after
      resume, all the cpu's will be in sync with respect to the
      MTRR/PAT registers.
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      LKML-Reference: <1301438292-28370-1-git-send-email-eric@anholt.net>
      Signed-off-by: NEric Anholt <eric@anholt.net>
      Tested-by: NKeith Packard <keithp@keithp.com>
      Cc: stable@kernel.org	[v2.6.32+]
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      84ac7cdb
    • T
      x86: apb_timer: Fixup genirq fallout · 86cc8dfc
      Thomas Gleixner 提交于
      The lonely user of the internal interface was not in the coccinelle
      script.
      Reported-by: NRandy Dunlap <rdunlap@xenotime.net>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      86cc8dfc
  8. 29 3月, 2011 2 次提交
    • X
      x86, microcode: Unregister syscore_ops after microcode unloaded · 4ac5fc6a
      Xiaotian Feng 提交于
      Currently, microcode doesn't unregister syscore_ops after it's
      unloaded. So if we modprobe then rmmod microcode, the stale
      microcode syscore_ops info will stay on syscore_ops_list.
      
      Later when we're trying to reboot/halt/shutdown the machine, kernel
      will panic on syscore_shutdown().
      
      With the patch applied, I can reboot/halt/shutdown my machine successfully.
      Signed-off-by: NXiaotian Feng <dfeng@redhat.com>
      Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk>
      Cc: Rafael J. Wysocki <rjw@sisk.pl>
      LKML-Reference: <1301387672-23661-1-git-send-email-dfeng@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      4ac5fc6a
    • J
      x86: Stop including <linux/delay.h> in two asm header files · ca444564
      Jean Delvare 提交于
      Stop including <linux/delay.h> in x86 header files which don't
      need it. This will let the compiler complain when this header is
      not included by source files when it should, so that
      contributors can fix the problem before building on other
      architectures starts to fail.
      
      Credits go to Geert for the idea.
      Signed-off-by: NJean Delvare <khali@linux-fr.org>
      Cc: James E.J. Bottomley <James.Bottomley@suse.de>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      LKML-Reference: <20110325152014.297890ec@endymion.delvare>
      [ this also fixes an upstream build bug in drivers/media/rc/ite-cir.c ]
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      ca444564
  9. 26 3月, 2011 1 次提交
  10. 25 3月, 2011 4 次提交
    • I
      perf, x86: Complain louder about BIOSen corrupting CPU/PMU state and continue · 45daae57
      Ingo Molnar 提交于
      Eric Dumazet reported that hardware PMU events do not work on his
      system, due to the BIOS corrupting PMU state:
      
          Performance Events: PEBS fmt0+, Core2 events, Broken BIOS detected, using software events only.
          [Firmware Bug]: the BIOS has corrupted hw-PMU resources (MSR 186 is 43003c)
      
      Linus suggested that we continue in the face of such BIOS-induced CPU
      state corruption:
      
         http://lkml.org/lkml/2011/3/24/608
      
      Such BIOSes will have to be fixed - Linux developers rely on a working and
      fully capable PMU and the BIOS interfering with the CPU's PMU state is simply
      not acceptable.
      
      So this patch changes perf to continue when it detects such BIOS
      interaction, some hardware events may be unreliable due to the BIOS
      writing and re-writing them - there's not much the kernel can do
      about that but to detect the corruption and report it.
      Reported-and-tested-by: NEric Dumazet <eric.dumazet@gmail.com>
      Suggested-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      45daae57
    • T
      x86: DT: Cleanup namespace and call irq_set_irq_type() unconditional · 07611dbd
      Thomas Gleixner 提交于
      That call escaped the name space cleanup. Fix it up.
      
      We really want to call there. The chip might have changed since the
      irq was setup initially. So let the core code and the chip decide what
      to do. The status is just an unreliable snapshot.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      07611dbd
    • T
      x86: DT: Fix return condition in irq_create_of_mapping() · 00a30b25
      Thomas Gleixner 提交于
      The xlate() function returns 0 or a negative error code. Returning the
      error code blindly will be seen as an huge irq number by the calling
      function because irq_create_of_mapping() returns an unsigned value.
      
      Return 0 (NO_IRQ) as required.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      00a30b25
    • D
      perf, x86: P4 PMU - Read proper MSR register to catch unflagged overflows · 242214f9
      Don Zickus 提交于
      The read of a proper MSR register was missed and instead of
      counter the configration register was tested (it has
      ARCH_P4_UNFLAGGED_BIT always cleared) leading to unknown NMI
      hitting the system. As result the user may obtain "Dazed and
      confused, but trying to continue" message. Fix it by reading a
      proper MSR register.
      
      When an NMI happens on a P4, the perf nmi handler checks the
      configuration register to see if the overflow bit is set or not
      before taking appropriate action.  Unfortunately, various P4
      machines had a broken overflow bit, so a backup mechanism was
      implemented.  This mechanism checked to see if the counter
      rolled over or not.
      
      A previous commit that implemented this backup mechanism was
      broken. Instead of reading the counter register, it used the
      configuration register to determine if the counter rolled over
      or not. Reading that bit would give incorrect results.
      
      This would lead to 'Dazed and confused' messages for the end
      user when using the perf tool (or if the nmi watchdog is
      running).
      
      The fix is to read the counter register before determining if
      the counter rolled over or not.
      Signed-off-by: NDon Zickus <dzickus@redhat.com>
      Signed-off-by: NCyrill Gorcunov <gorcunov@openvz.org>
      Cc: Lin Ming <ming.m.lin@intel.com>
      LKML-Reference: <4D8BAB49.3080701@openvz.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      242214f9
  11. 24 3月, 2011 4 次提交
    • N
      x86, dumpstack: Use %pB format specifier for stack trace · 71f9e598
      Namhyung Kim 提交于
      Improve noreturn function entries in call traces:
      
      Before:
      
       Call Trace:
        [<ffffffff812a8502>] panic+0x8c/0x18d
        [<ffffffffa000012a>] deep01+0x0/0x38 [test_panic]  <--- bad
        [<ffffffff81104666>] proc_file_write+0x73/0x8d
        [<ffffffff811000b3>] proc_reg_write+0x8d/0xac
        [<ffffffff810c7d32>] vfs_write+0xa1/0xc5
        [<ffffffff810c7e0f>] sys_write+0x45/0x6c
        [<ffffffff8f02943b>] system_call_fastpath+0x16/0x1b
      
      After:
      
       Call Trace:
        [<ffffffff812bce69>] panic+0x8c/0x18d
        [<ffffffffa000012a>] panic_write+0x20/0x20 [test_panic] <--- good
        [<ffffffff81115fab>] proc_file_write+0x73/0x8d
        [<ffffffff81111a5f>] proc_reg_write+0x8d/0xac
        [<ffffffff810d90ee>] vfs_write+0xa1/0xc5
        [<ffffffff810d91cb>] sys_write+0x45/0x6c
        [<ffffffff812c07fb>] system_call_fastpath+0x16/0x1b
      Signed-off-by: NNamhyung Kim <namhyung@gmail.com>
      Acked-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      LKML-Reference: <1300934550-21394-2-git-send-email-namhyung@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      71f9e598
    • O
      crash_dump: export is_kdump_kernel to modules, consolidate elfcorehdr_addr,... · 93a72052
      Olaf Hering 提交于
      crash_dump: export is_kdump_kernel to modules, consolidate elfcorehdr_addr, setup_elfcorehdr and saved_max_pfn
      
      The Xen PV drivers in a crashed HVM guest can not connect to the dom0
      backend drivers because both frontend and backend drivers are still in
      connected state.  To run the connection reset function only in case of a
      crashdump, the is_kdump_kernel() function needs to be available for the PV
      driver modules.
      
      Consolidate elfcorehdr_addr, setup_elfcorehdr and saved_max_pfn into
      kernel/crash_dump.c Also export elfcorehdr_addr to make is_kdump_kernel()
      usable for modules.
      
      Leave 'elfcorehdr' as early_param().  This changes powerpc from __setup()
      to early_param().  It adds an address range check from x86 also on ia64
      and powerpc.
      
      [akpm@linux-foundation.org: additional #includes]
      [akpm@linux-foundation.org: remove elfcorehdr_addr export]
      [akpm@linux-foundation.org: fix for Tejun's mm/nobootmem.c changes]
      Signed-off-by: NOlaf Hering <olaf@aepfle.de>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      93a72052
    • R
      x86: Use syscore_ops instead of sysdev classes and sysdevs · f3c6ea1b
      Rafael J. Wysocki 提交于
      Some subsystems in the x86 tree need to carry out suspend/resume and
      shutdown operations with one CPU on-line and interrupts disabled and
      they define sysdev classes and sysdevs or sysdev drivers for this
      purpose.  This leads to unnecessarily complicated code and excessive
      memory usage, so switch them to using struct syscore_ops objects for
      this purpose instead.
      
      Generally, there are three categories of subsystems that use
      sysdevs for implementing PM operations: (1) subsystems whose
      suspend/resume callbacks ignore their arguments entirely (the
      majority), (2) subsystems whose suspend/resume callbacks use their
      struct sys_device argument, but don't really need to do that,
      because they can be implemented differently in an arguably simpler
      way (io_apic.c), and (3) subsystems whose suspend/resume callbacks
      use their struct sys_device argument, but the value of that argument
      is always the same and could be ignored (microcode_core.c).  In all
      of these cases the subsystems in question may be readily converted to
      using struct syscore_ops objects for power management and shutdown.
      Signed-off-by: NRafael J. Wysocki <rjw@sisk.pl>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Acked-by: NIngo Molnar <mingo@elte.hu>
      f3c6ea1b
    • S
      x86: mark associated mm when running a task in 32 bit compatibility mode · 375906f8
      Stephen Wilson 提交于
      This patch simply follows the same practice as for setting the TIF_IA32 flag.
      In particular, an mm is marked as holding 32-bit tasks when a 32-bit binary is
      exec'ed.  Both ELF and a.out formats are updated.
      Signed-off-by: NStephen Wilson <wilsons@start.ca>
      Reviewed-by: NMichel Lespinasse <walken@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      375906f8
  12. 23 3月, 2011 2 次提交
  13. 22 3月, 2011 2 次提交
    • R
      x86, mpparse: Move check_slot into CONFIG_X86_IO_APIC context · cbb84c4c
      Rakib Mullick 提交于
      When CONFIG_X86_MPPARSE=y and CONFIG_X86_IO_APIC=n, then we get
      the following warning:
      
        arch/x86/kernel/mpparse.c:723: warning: 'check_slot' defined but not used
      
      So, put check_slot into CONFIG_X86_IO_APIC context. Its only
      called from CONFIG_X86_IO_APIC=y context.
      Signed-off-by: NRakib Mullick <rakib.mullick@gmail.com>
      LKML-Reference: <AANLkTinsUfGc=NG_GeH_B+zFVu+DXJzZbJKdQLscqfuH@mail.gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      cbb84c4c
    • H
      ACPI, APEI, Add ERST record ID cache · 885b976f
      Huang Ying 提交于
      APEI ERST firmware interface and implementation has no multiple users
      in mind.  For example, if there is four records in storage with ID: 1,
      2, 3 and 4, if two ERST readers enumerate the records via
      GET_NEXT_RECORD_ID as follow,
      
      reader 1		reader 2
      1
      			2
      3
      			4
      -1
      			-1
      
      where -1 signals there is no more record ID.
      
      Reader 1 has no chance to check record 2 and 4, while reader 2 has no
      chance to check record 1 and 3.  And any other GET_NEXT_RECORD_ID will
      return -1, that is, other readers will has no chance to check any
      record even they are not cleared by anyone.
      
      This makes raw GET_NEXT_RECORD_ID not suitable for used by multiple
      users.
      
      To solve the issue, an in-memory ERST record ID cache is designed and
      implemented.  When enumerating record ID, the ID returned by
      GET_NEXT_RECORD_ID is added into cache in addition to be returned to
      caller.  So other readers can check the cache to get all record ID
      available.
      Signed-off-by: NHuang Ying <ying.huang@intel.com>
      Reviewed-by: NAndi Kleen <ak@linux.intel.com>
      Signed-off-by: NLen Brown <len.brown@intel.com>
      885b976f
  14. 21 3月, 2011 1 次提交
    • S
      introduce sys_syncfs to sync a single file system · b7ed78f5
      Sage Weil 提交于
      It is frequently useful to sync a single file system, instead of all
      mounted file systems via sync(2):
      
       - On machines with many mounts, it is not at all uncommon for some of
         them to hang (e.g. unresponsive NFS server).  sync(2) will get stuck on
         those and may never get to the one you do care about (e.g., /).
       - Some applications write lots of data to the file system and then
         want to make sure it is flushed to disk.  Calling fsync(2) on each
         file introduces unnecessary ordering constraints that result in a large
         amount of sub-optimal writeback/flush/commit behavior by the file
         system.
      
      There are currently two ways (that I know of) to sync a single super_block:
      
       - BLKFLSBUF ioctl on the block device: That also invalidates the bdev
         mapping, which isn't usually desirable, and doesn't work for non-block
         file systems.
       - 'mount -o remount,rw' will call sync_filesystem as an artifact of the
         current implemention.  Relying on this little-known side effect for
         something like data safety sounds foolish.
      
      Both of these approaches require root privileges, which some applications
      do not have (nor should they need?) given that sync(2) is an unprivileged
      operation.
      
      This patch introduces a new system call syncfs(2) that takes an fd and
      syncs only the file system it references.  Maybe someday we can
      
       $ sync /some/path
      
      and not get
      
       sync: ignoring all arguments
      
      The syscall is motivated by comments by Al and Christoph at the last LSF.
      syncfs(2) seems like an appropriate name given statfs(2).
      
      A similar ioctl was also proposed a while back, see
      	http://marc.info/?l=linux-fsdevel&m=127970513829285&w=2Signed-off-by: NSage Weil <sage@newdream.net>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      b7ed78f5
  15. 20 3月, 2011 2 次提交
    • Y
      x86: Cleanup highmap after brk is concluded · e5f15b45
      Yinghai Lu 提交于
      Now cleanup_highmap actually is in two steps: one is early in head64.c
      and only clears above _end; a second one is in init_memory_mapping() and
      tries to clean from _brk_end to _end.
      It should check if those boundaries are PMD_SIZE aligned but currently
      does not.
      Also init_memory_mapping() is called several times for numa or memory
      hotplug, so we really should not handle initial kernel mappings there.
      
      This patch moves cleanup_highmap() down after _brk_end is settled so
      we can do everything in one step.
      Also we honor max_pfn_mapped in the implementation of cleanup_highmap.
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      Signed-off-by: NStefano Stabellini <stefano.stabellini@eu.citrix.com>
      LKML-Reference: <alpine.DEB.2.00.1103171739050.3382@kaball-desktop>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      e5f15b45
    • S
      perf, x86: Fix Intel fixed counters base initialization · fc66c521
      Stephane Eranian 提交于
      The following patch solves the problems introduced by Robert's
      commit 41bf4989 and reported by Arun Sharma. This commit gets rid
      of the base + index notation for reading and writing PMU msrs.
      
      The problem is that for fixed counters, the new calculation for
      the base did not take into account the fixed counter indexes,
      thus all fixed counters were read/written from fixed counter 0.
      Although all fixed counters share the same config MSR, they each
      have their own counter register.
      
      Without:
      
       $ task -e unhalted_core_cycles -e instructions_retired -e baclears noploop 1 noploop for 1 seconds
      
        242202299 unhalted_core_cycles (0.00% scaling, ena=1000790892, run=1000790892)
       2389685946 instructions_retired (0.00% scaling, ena=1000790892, run=1000790892)
            49473 baclears             (0.00% scaling, ena=1000790892, run=1000790892)
      
      With:
      
       $ task -e unhalted_core_cycles -e instructions_retired -e baclears noploop 1 noploop for 1 seconds
      
       2392703238 unhalted_core_cycles (0.00% scaling, ena=1000840809, run=1000840809)
       2389793744 instructions_retired (0.00% scaling, ena=1000840809, run=1000840809)
            47863 baclears             (0.00% scaling, ena=1000840809, run=1000840809)
      Signed-off-by: NStephane Eranian <eranian@google.com>
      Cc: peterz@infradead.org
      Cc: ming.m.lin@intel.com
      Cc: robert.richter@amd.com
      Cc: asharma@fb.com
      Cc: perfmon2-devel@lists.sf.net
      LKML-Reference: <20110319172005.GB4978@quad>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      fc66c521
  16. 18 3月, 2011 3 次提交
    • N
      x86, dumpstack: Correct stack dump info when frame pointer is available · e8e999cf
      Namhyung Kim 提交于
      Current stack dump code scans entire stack and check each entry
      contains a pointer to kernel code. If CONFIG_FRAME_POINTER=y it
      could mark whether the pointer is valid or not based on value of
      the frame pointer. Invalid entries could be preceded by '?' sign.
      
      However this was not going to happen because scan start point
      was always higher than the frame pointer so that they could not
      meet.
      
      Commit 9c0729dc ("x86: Eliminate bp argument from the stack
      tracing routines") delayed bp acquisition point, so the bp was
      read in lower frame, thus all of the entries were marked
      invalid.
      
      This patch fixes this by reverting above commit while retaining
      stack_frame() helper as suggested by Frederic Weisbecker.
      
      End result looks like below:
      
      before:
      
       [    3.508329] Call Trace:
       [    3.508551]  [<ffffffff814f35c9>] ? panic+0x91/0x199
       [    3.508662]  [<ffffffff814f3739>] ? printk+0x68/0x6a
       [    3.508770]  [<ffffffff81a981b2>] ? mount_block_root+0x257/0x26e
       [    3.508876]  [<ffffffff81a9821f>] ? mount_root+0x56/0x5a
       [    3.508975]  [<ffffffff81a98393>] ? prepare_namespace+0x170/0x1a9
       [    3.509216]  [<ffffffff81a9772b>] ? kernel_init+0x1d2/0x1e2
       [    3.509335]  [<ffffffff81003894>] ? kernel_thread_helper+0x4/0x10
       [    3.509442]  [<ffffffff814f6880>] ? restore_args+0x0/0x30
       [    3.509542]  [<ffffffff81a97559>] ? kernel_init+0x0/0x1e2
       [    3.509641]  [<ffffffff81003890>] ? kernel_thread_helper+0x0/0x10
      
      after:
      
       [    3.522991] Call Trace:
       [    3.523351]  [<ffffffff814f35b9>] panic+0x91/0x199
       [    3.523468]  [<ffffffff814f3729>] ? printk+0x68/0x6a
       [    3.523576]  [<ffffffff81a981b2>] mount_block_root+0x257/0x26e
       [    3.523681]  [<ffffffff81a9821f>] mount_root+0x56/0x5a
       [    3.523780]  [<ffffffff81a98393>] prepare_namespace+0x170/0x1a9
       [    3.523885]  [<ffffffff81a9772b>] kernel_init+0x1d2/0x1e2
       [    3.523987]  [<ffffffff81003894>] kernel_thread_helper+0x4/0x10
       [    3.524228]  [<ffffffff814f6880>] ? restore_args+0x0/0x30
       [    3.524345]  [<ffffffff81a97559>] ? kernel_init+0x0/0x1e2
       [    3.524445]  [<ffffffff81003890>] ? kernel_thread_helper+0x0/0x10
      
       -v5:
         * fix build breakage with oprofile
      
       -v4:
         * use 0 instead of regs->bp
         * separate out printk changes
      
       -v3:
         * apply comment from Frederic
         * add a couple of printk fixes
      Signed-off-by: NNamhyung Kim <namhyung@gmail.com>
      Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Soren Sandmann <ssp@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Robert Richter <robert.richter@amd.com>
      LKML-Reference: <1300416006-3163-1-git-send-email-namhyung@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      e8e999cf
    • L
      x86: Fix common misspellings · 0d2eb44f
      Lucas De Marchi 提交于
      They were generated by 'codespell' and then manually reviewed.
      Signed-off-by: NLucas De Marchi <lucas.demarchi@profusion.mobi>
      Cc: trivial@kernel.org
      LKML-Reference: <1300389856-1099-3-git-send-email-lucas.demarchi@profusion.mobi>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      0d2eb44f
    • S
      KVM guest: Fix section mismatch derived from kvm_guest_cpu_online() · 775077a0
      Sedat Dilek 提交于
      WARNING: arch/x86/built-in.o(.text+0x1bb74): Section mismatch in reference from the function kvm_guest_cpu_online() to the function .cpuinit.text:kvm_guest_cpu_init()
      The function kvm_guest_cpu_online() references
      the function __cpuinit kvm_guest_cpu_init().
      This is often because kvm_guest_cpu_online lacks a __cpuinit
      annotation or the annotation of kvm_guest_cpu_init is wrong.
      
      This patch fixes the warning.
      
      Tested with linux-next (next-20101231)
      Signed-off-by: NSedat Dilek <sedat.dilek@gmail.com>
      Acked-by: NRik van Riel <riel@redhat.com>
      Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
      775077a0
  17. 17 3月, 2011 3 次提交
  18. 16 3月, 2011 3 次提交