1. 24 10月, 2011 1 次提交
    • T
      x86: Fix S4 regression · 8548c84d
      Takashi Iwai 提交于
      Commit 4b239f45 ("x86-64, mm: Put early page table high") causes a S4
      regression since 2.6.39, namely the machine reboots occasionally at S4
      resume.  It doesn't happen always, overall rate is about 1/20.  But,
      like other bugs, once when this happens, it continues to happen.
      
      This patch fixes the problem by essentially reverting the memory
      assignment in the older way.
      Signed-off-by: NTakashi Iwai <tiwai@suse.de>
      Cc: <stable@kernel.org>
      Cc: Rafael J. Wysocki <rjw@sisk.pl>
      Cc: Yinghai Lu <yinghai.lu@oracle.com>
      [ We'll hopefully find the real fix, but that's too late for 3.1 now ]
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      8548c84d
  2. 14 10月, 2011 1 次提交
  3. 11 10月, 2011 1 次提交
  4. 07 10月, 2011 1 次提交
    • P
      x86/PCI: use host bridge _CRS info on ASUS M2V-MX SE · 29cf7a30
      Paul Menzel 提交于
      In summary, this DMI quirk uses the _CRS info by default for the ASUS
      M2V-MX SE by turning on `pci=use_crs` and is similar to the quirk
      added by commit 2491762c ("x86/PCI: use host bridge _CRS info on
      ASRock ALiveSATA2-GLAN") whose commit message should be read for further
      information.
      
      Since commit 3e3da00c ("x86/pci: AMD one chain system to use pci
      read out res") Linux gives the following oops:
      
          parport0: PC-style at 0x378, irq 7 [PCSPP,TRISTATE]
          HDA Intel 0000:20:01.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
          HDA Intel 0000:20:01.0: setting latency timer to 64
          BUG: unable to handle kernel paging request at ffffc90011c08000
          IP: [<ffffffffa0578402>] azx_probe+0x3ad/0x86b [snd_hda_intel]
          PGD 13781a067 PUD 13781b067 PMD 1300ba067 PTE 800000fd00000173
          Oops: 0009 [#1] SMP
          last sysfs file: /sys/module/snd_pcm/initstate
          CPU 0
          Modules linked in: snd_hda_intel(+) snd_hda_codec snd_hwdep snd_pcm_oss snd_mixer_oss snd_pcm snd_seq_midi snd_rawmidi snd_seq_midi_event tpm_tis tpm snd_seq tpm_bios psmouse parport_pc snd_timer snd_seq_device parport processor evdev snd i2c_viapro thermal_sys amd64_edac_mod k8temp i2c_core soundcore shpchp pcspkr serio_raw asus_atk0110 pci_hotplug edac_core button snd_page_alloc edac_mce_amd ext3 jbd mbcache sha256_generic cryptd aes_x86_64 aes_generic cbc dm_crypt dm_mod raid1 md_mod usbhid hid sg sd_mod crc_t10dif sr_mod cdrom ata_generic uhci_hcd sata_via pata_via libata ehci_hcd usbcore scsi_mod via_rhine mii nls_base [last unloaded: scsi_wait_scan]
          Pid: 1153, comm: work_for_cpu Not tainted 2.6.37-1-amd64 #1 M2V-MX SE/System Product Name
          RIP: 0010:[<ffffffffa0578402>]  [<ffffffffa0578402>] azx_probe+0x3ad/0x86b [snd_hda_intel]
          RSP: 0018:ffff88013153fe50  EFLAGS: 00010286
          RAX: ffffc90011c08000 RBX: ffff88013029ec00 RCX: 0000000000000006
          RDX: 0000000000000000 RSI: 0000000000000246 RDI: 0000000000000246
          RBP: ffff88013341d000 R08: 0000000000000000 R09: 0000000000000040
          R10: 0000000000000286 R11: 0000000000003731 R12: ffff88013029c400
          R13: 0000000000000000 R14: 0000000000000000 R15: ffff88013341d090
          FS:  0000000000000000(0000) GS:ffff8800bfc00000(0000) knlGS:00000000f7610ab0
          CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
          CR2: ffffc90011c08000 CR3: 0000000132f57000 CR4: 00000000000006f0
          DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
          DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
          Process work_for_cpu (pid: 1153, threadinfo ffff88013153e000, task ffff8801303c86c0)
          Stack:
           0000000000000005 ffffffff8123ad65 00000000000136c0 ffff88013029c400
           ffff8801303c8998 ffff88013341d000 ffff88013341d090 ffff8801322d9dc8
           ffff88013341d208 0000000000000000 0000000000000000 ffffffff811ad232
          Call Trace:
           [<ffffffff8123ad65>] ? __pm_runtime_set_status+0x162/0x186
           [<ffffffff811ad232>] ? local_pci_probe+0x49/0x92
           [<ffffffff8105afc5>] ? do_work_for_cpu+0x0/0x1b
           [<ffffffff8105afc5>] ? do_work_for_cpu+0x0/0x1b
           [<ffffffff8105afd0>] ? do_work_for_cpu+0xb/0x1b
           [<ffffffff8105fd3f>] ? kthread+0x7a/0x82
           [<ffffffff8100a824>] ? kernel_thread_helper+0x4/0x10
           [<ffffffff8105fcc5>] ? kthread+0x0/0x82
           [<ffffffff8100a820>] ? kernel_thread_helper+0x0/0x10
          Code: f4 01 00 00 ef 31 f6 48 89 df e8 29 dd ff ff 85 c0 0f 88 2b 03 00 00 48 89 ef e8 b4 39 c3 e0 8b 7b 40 e8 fc 9d b1 e0 48 8b 43 38 <66> 8b 10 66 89 14 24 8b 43 14 83 e8 03 83 f8 01 77 32 31 d2 be
          RIP  [<ffffffffa0578402>] azx_probe+0x3ad/0x86b [snd_hda_intel]
           RSP <ffff88013153fe50>
          CR2: ffffc90011c08000
          ---[ end trace 8d1f3ebc136437fd ]---
      
      Trusting the ACPI _CRS information (`pci=use_crs`) fixes this problem.
      
          $ dmesg | grep -i crs # with the quirk
          PCI: Using host bridge windows from ACPI; if necessary, use "pci=nocrs" and report a bug
      
      The match has to be against the DMI board entries though since the vendor entries are not populated.
      
          DMI: System manufacturer System Product Name/M2V-MX SE, BIOS 0304    10/30/2007
      
      This quirk should be removed when `pci=use_crs` is enabled for machines
      from 2006 or earlier or some other solution is implemented.
      
      Using coreboot [1] with this board the problem does not exist but this
      quirk also does not affect it either. To be safe though the check is
      tightened to only take effect when the BIOS from American Megatrends is
      used.
      
              15:13 < ruik> but coreboot does not need that
              15:13 < ruik> because i have there only one root bus
              15:13 < ruik> the audio is behind a bridge
      
              $ sudo dmidecode
              BIOS Information
                      Vendor: American Megatrends Inc.
                      Version: 0304
                      Release Date: 10/30/2007
      
      [1] http://www.coreboot.org/
      
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=30552
      
      Cc: stable@kernel.org (2.6.34)
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: x86@kernel.org
      Signed-off-by: NPaul Menzel <paulepanter@users.sourceforge.net>
      Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
      Acked-by: NJesse Barnes <jbarnes@virtuousgeek.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      29cf7a30
  5. 26 9月, 2011 2 次提交
  6. 21 9月, 2011 1 次提交
    • M
      x86/rtc: Don't recursively acquire rtc_lock · 47997d75
      Matt Fleming 提交于
      A deadlock was introduced on x86 in commit ef68c8f8 ("x86:
      Serialize EFI time accesses on rtc_lock") because efi_get_time()
      and friends can be called with rtc_lock already held by
      read_persistent_time(), e.g.:
      
       timekeeping_init()
          read_persistent_clock()     <-- acquire rtc_lock
              efi_get_time()
                  phys_efi_get_time() <-- acquire rtc_lock <DEADLOCK>
      
      To fix this let's push the locking down into the get_wallclock()
      and set_wallclock() implementations.  Only the clock
      implementations that access the x86 RTC directly need to acquire
      rtc_lock, so it makes sense to push the locking down into the
      rtc, vrtc and efi code.
      
      The virtualization implementations don't require rtc_lock to be
      held because they provide their own serialization.
      Signed-off-by: NMatt Fleming <matt.fleming@intel.com>
      Acked-by: NJan Beulich <jbeulich@novell.com>
      Acked-by: Avi Kivity <avi@redhat.com> [for the virtualization aspect]
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Zhang Rui <rui.zhang@intel.com>
      Cc: Josh Boyer <jwboyer@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      47997d75
  7. 16 9月, 2011 1 次提交
    • L
      asm alternatives: remove incorrect alignment notes · a7f934d4
      Linus Torvalds 提交于
      On x86-64, they were just wasteful: with the explicitly added (now
      unnecessary) padding, the size of the alternatives structure was 16
      bytes, and an alignment of 8 bytes didn't hurt much.
      
      However, it was still silly, since the natural size and alignment for
      the structure is actually just 12 bytes, 4-byte aligned since commit
      59e97e4d ("x86: Make alternative instruction pointers relative").
      So removing the padding, and removing the extra alignment is just a good
      idea.
      
      On x86-32, the alignment of 4 bytes was correct, but was incorrectly
      hardcoded as 8 bytes in <asm/alternative-asm.h>.  That header file had
      used to be an x86-64 only header file, but various unification efforts
      have made it be used for x86-32 too (ie the unification of rwlock and
      rwsem).
      
      That in turn caused x86-32 boot failures, because the extra alignment
      would result in random zero-filled words in the altinstructions section,
      causing oopses early at boot when doing alternative instruction
      replacement.
      
      So just remove all the alignment noise entirely.  It's wrong, and it's
      unnecessary.  The section itself is already properly aligned by the
      linker scripts, and all additions to the section had better be of the
      proper 12-byte format, keeping it aligned.  So if the align directive
      were to ever make a difference, that would be an indication of a serious
      bug to begin with.
      Reported-by: NWerner Landgraf <w.landgraf@ru.r>
      Acked-by: NAndrew Lutomirski <luto@mit.edu>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a7f934d4
  8. 15 9月, 2011 1 次提交
  9. 13 9月, 2011 1 次提交
    • D
      xen/e820: if there is no dom0_mem=, don't tweak extra_pages. · e3b73c4a
      David Vrabel 提交于
      The patch "xen: use maximum reservation to limit amount of usable RAM"
      (d312ae87) breaks machines that
      do not use 'dom0_mem=' argument with:
      
      reserve RAM buffer: 000000133f2e2000 - 000000133fffffff
      (XEN) mm.c:4976:d0 Global bit is set to kernel page fffff8117e
      (XEN) domain_crash_sync called from entry.S
      (XEN) Domain 0 (vcpu#0) crashed on cpu#0:
      ...
      
      The reason being that the last E820 entry is created using the
      'extra_pages' (which is based on how many pages have been freed).
      The mentioned git commit sets the initial value of 'extra_pages'
      using a hypercall which returns the number of pages (if dom0_mem
      has been used) or -1 otherwise. If the later we return with
      MAX_DOMAIN_PAGES as basis for calculation:
      
          return min(max_pages, MAX_DOMAIN_PAGES);
      
      and use it:
      
           extra_limit = xen_get_max_pages();
           if (extra_limit >= max_pfn)
                   extra_pages = extra_limit - max_pfn;
           else
                   extra_pages = 0;
      
      which means we end up with extra_pages = 128GB in PFNs (33554432)
      - 8GB in PFNs (2097152, on this specific box, can be larger or smaller),
      and then we add that value to the E820 making it:
      
        Xen: 00000000ff000000 - 0000000100000000 (reserved)
        Xen: 0000000100000000 - 000000133f2e2000 (usable)
      
      which is clearly wrong. It should look as so:
      
        Xen: 00000000ff000000 - 0000000100000000 (reserved)
        Xen: 0000000100000000 - 000000027fbda000 (usable)
      
      Naturally this problem does not present itself if dom0_mem=max:X
      is used.
      
      CC: stable@kernel.org
      Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      e3b73c4a
  10. 10 9月, 2011 1 次提交
  11. 09 9月, 2011 1 次提交
  12. 02 9月, 2011 2 次提交
  13. 01 9月, 2011 1 次提交
    • D
      xen: use maximum reservation to limit amount of usable RAM · d312ae87
      David Vrabel 提交于
      Use the domain's maximum reservation to limit the amount of extra RAM
      for the memory balloon. This reduces the size of the pages tables and
      the amount of reserved low memory (which defaults to about 1/32 of the
      total RAM).
      
      On a system with 8 GiB of RAM with the domain limited to 1 GiB the
      kernel reports:
      
      Before:
      
      Memory: 627792k/4472000k available
      
      After:
      
      Memory: 549740k/11132224k available
      
      A increase of about 76 MiB (~1.5% of the unused 7 GiB).  The reserved
      low memory is also reduced from 253 MiB to 32 MiB.  The total
      additional usable RAM is 329 MiB.
      
      For dom0, this requires at patch to Xen ('x86: use 'dom0_mem' to limit
      the number of pages for dom0') (c/s 23790)
      
      CC: stable@kernel.org
      Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      d312ae87
  14. 31 8月, 2011 1 次提交
    • A
      x86, perf: Check that current->mm is alive before getting user callchain · 20afc60f
      Andrey Vagin 提交于
      An event may occur when an mm is already released.
      
      I added an event in dequeue_entity() and caught a panic with
      the following backtrace:
      
      [  434.421110] BUG: unable to handle kernel NULL pointer dereference at 0000000000000050
      [  434.421258] IP: [<ffffffff810464ac>] __get_user_pages_fast+0x9c/0x120
      ...
      [  434.421258] Call Trace:
      [  434.421258]  [<ffffffff8101ae81>] copy_from_user_nmi+0x51/0xf0
      [  434.421258]  [<ffffffff8109a0d5>] ? sched_clock_local+0x25/0x90
      [  434.421258]  [<ffffffff8101b048>] perf_callchain_user+0x128/0x170
      [  434.421258]  [<ffffffff811154cd>] ? __perf_event_header__init_id+0xed/0x100
      [  434.421258]  [<ffffffff81116690>] perf_prepare_sample+0x200/0x280
      [  434.421258]  [<ffffffff81118da8>] __perf_event_overflow+0x1b8/0x290
      [  434.421258]  [<ffffffff81065240>] ? tg_shares_up+0x0/0x670
      [  434.421258]  [<ffffffff8104fe1a>] ? walk_tg_tree+0x6a/0xb0
      [  434.421258]  [<ffffffff81118f44>] perf_swevent_overflow+0xc4/0xf0
      [  434.421258]  [<ffffffff81119150>] do_perf_sw_event+0x1e0/0x250
      [  434.421258]  [<ffffffff81119204>] perf_tp_event+0x44/0x70
      [  434.421258]  [<ffffffff8105701f>] ftrace_profile_sched_block+0xdf/0x110
      [  434.421258]  [<ffffffff8106121d>] dequeue_entity+0x2ad/0x2d0
      [  434.421258]  [<ffffffff810614ec>] dequeue_task_fair+0x1c/0x60
      [  434.421258]  [<ffffffff8105818a>] dequeue_task+0x9a/0xb0
      [  434.421258]  [<ffffffff810581e2>] deactivate_task+0x42/0xe0
      [  434.421258]  [<ffffffff814bc019>] thread_return+0x191/0x808
      [  434.421258]  [<ffffffff81098a44>] ? switch_task_namespaces+0x24/0x60
      [  434.421258]  [<ffffffff8106f4c4>] do_exit+0x464/0x910
      [  434.421258]  [<ffffffff8106f9c8>] do_group_exit+0x58/0xd0
      [  434.421258]  [<ffffffff8106fa57>] sys_exit_group+0x17/0x20
      [  434.421258]  [<ffffffff8100b202>] system_call_fastpath+0x16/0x1b
      Signed-off-by: NAndrey Vagin <avagin@openvz.org>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: stable@kernel.org
      Link: http://lkml.kernel.org/r/1314693156-24131-1-git-send-email-avagin@openvz.orgSigned-off-by: NIngo Molnar <mingo@elte.hu>
      20afc60f
  15. 30 8月, 2011 1 次提交
    • D
      KVM: Fix instruction size issue in pvclock scaling · 3b217116
      Duncan Sands 提交于
      Commit de2d1a52 ("KVM: Fix register corruption in pvclock_scale_delta")
      introduced a mul instruction that may have only a memory operand; the
      assembler therefore cannot select the correct size:
      
         pvclock.s:229: Error: no instruction mnemonic suffix given and no register
      operands; can't size instruction
      
      In this example the assembler is:
      
               #APP
               mul -48(%rbp) ; shrd $32, %rdx, %rax
               #NO_APP
      
      A simple solution is to use mulq.
      Signed-off-by: NDuncan Sands <baldrick@free.fr>
      Signed-off-by: NAvi Kivity <avi@redhat.com>
      3b217116
  16. 27 8月, 2011 2 次提交
  17. 26 8月, 2011 2 次提交
  18. 25 8月, 2011 1 次提交
  19. 24 8月, 2011 1 次提交
  20. 22 8月, 2011 2 次提交
  21. 17 8月, 2011 2 次提交
    • J
      xen/x86: replace order-based range checking of M2P table by linear one · ccbcdf7c
      Jan Beulich 提交于
      The order-based approach is not only less efficient (requiring a shift
      and a compare, typical generated code looking like this
      
      	mov	eax, [machine_to_phys_order]
      	mov	ecx, eax
      	shr	ebx, cl
      	test	ebx, ebx
      	jnz	...
      
      whereas a direct check requires just a compare, like in
      
      	cmp	ebx, [machine_to_phys_nr]
      	jae	...
      
      ), but also slightly dangerous in the 32-on-64 case - the element
      address calculation can wrap if the next power of two boundary is
      sufficiently far away from the actual upper limit of the table, and
      hence can result in user space addresses being accessed (with it being
      unknown what may actually be mapped there).
      
      Additionally, the elimination of the mistaken use of fls() here (should
      have been __fls()) fixes a latent issue on x86-64 that would trigger
      if the code was run on a system with memory extending beyond the 44-bit
      boundary.
      
      CC: stable@kernel.org
      Signed-off-by: NJan Beulich <jbeulich@novell.com>
      [v1: Based on Jeremy's feedback]
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      ccbcdf7c
    • R
      KVM: uses TASKSTATS, depends on NET · df3d8ae1
      Randy Dunlap 提交于
      CONFIG_TASKSTATS just had a change to use netlink, including
      a change to "depends on NET".  Since "select" does not follow
      dependencies, KVM also needs to depend on NET to prevent build
      errors when CONFIG_NET is not enabled.
      
      Sample of the reported "undefined reference" build errors:
      
      taskstats.c:(.text+0x8f686): undefined reference to `nla_put'
      taskstats.c:(.text+0x8f721): undefined reference to `nla_reserve'
      taskstats.c:(.text+0x8f8fb): undefined reference to `init_net'
      taskstats.c:(.text+0x8f905): undefined reference to `netlink_unicast'
      taskstats.c:(.text+0x8f934): undefined reference to `kfree_skb'
      taskstats.c:(.text+0x8f9e9): undefined reference to `skb_clone'
      taskstats.c:(.text+0x90060): undefined reference to `__alloc_skb'
      taskstats.c:(.text+0x901e9): undefined reference to `skb_put'
      taskstats.c:(.init.text+0x4665): undefined reference to `genl_register_family'
      taskstats.c:(.init.text+0x4699): undefined reference to `genl_register_ops'
      taskstats.c:(.init.text+0x4710): undefined reference to `genl_unregister_ops'
      taskstats.c:(.init.text+0x471c): undefined reference to `genl_unregister_family'
      Signed-off-by: NRandy Dunlap <rdunlap@xenotime.net>
      Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
      df3d8ae1
  22. 16 8月, 2011 1 次提交
  23. 11 8月, 2011 3 次提交
  24. 09 8月, 2011 1 次提交
  25. 06 8月, 2011 2 次提交
  26. 05 8月, 2011 6 次提交