1. 02 10月, 2014 1 次提交
  2. 25 9月, 2014 1 次提交
    • M
      x86/efi: Truncate 64-bit values when calling 32-bit OutputString() · 115c6628
      Matt Fleming 提交于
      If we're executing the 32-bit efi_char16_printk() code path (i.e.
      running on top of 32-bit firmware) we know that efi_early->text_output
      will be a 32-bit value, even though ->text_output has type u64.
      
      Unfortunately, we currently pass ->text_output directly to
      efi_early->call() so for CONFIG_X86_32 the compiler will push a 64-bit
      value onto the stack, causing the other parameters to be misaligned.
      
      The way we handle this in the rest of the EFI boot stub is to pass
      pointers as arguments to efi_early->call(), which automatically do the
      right thing (pointers are 32-bit on CONFIG_X86_32, and we simply ignore
      the upper 32-bits of the argument register if running in 64-bit mode
      with 32-bit firmware).
      
      This fixes a corruption bug when printing strings from the 32-bit EFI
      boot stub.
      
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=84241Signed-off-by: NMatt Fleming <matt.fleming@intel.com>
      115c6628
  3. 24 9月, 2014 4 次提交
    • M
      crypto: aesni - disable "by8" AVX CTR optimization · 7da4b29d
      Mathias Krause 提交于
      The "by8" implementation introduced in commit 22cddcc7 ("crypto: aes
      - AES CTR x86_64 "by8" AVX optimization") is failing crypto tests as it
      handles counter block overflows differently. It only accounts the right
      most 32 bit as a counter -- not the whole block as all other
      implementations do. This makes it fail the cryptomgr test #4 that
      specifically tests this corner case.
      
      As we're quite late in the release cycle, just disable the "by8" variant
      for now.
      Reported-by: NRomain Francoise <romain@orebokech.com>
      Signed-off-by: NMathias Krause <minipli@googlemail.com>
      Cc: Chandramouli Narayanan <mouli@linux.intel.com>
      Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
      7da4b29d
    • W
      sched: Fix unreleased llc_shared_mask bit during CPU hotplug · 03bd4e1f
      Wanpeng Li 提交于
      The following bug can be triggered by hot adding and removing a large number of
      xen domain0's vcpus repeatedly:
      
      	BUG: unable to handle kernel NULL pointer dereference at 0000000000000004 IP: [..] find_busiest_group
      	PGD 5a9d5067 PUD 13067 PMD 0
      	Oops: 0000 [#3] SMP
      	[...]
      	Call Trace:
      	load_balance
      	? _raw_spin_unlock_irqrestore
      	idle_balance
      	__schedule
      	schedule
      	schedule_timeout
      	? lock_timer_base
      	schedule_timeout_uninterruptible
      	msleep
      	lock_device_hotplug_sysfs
      	online_store
      	dev_attr_store
      	sysfs_write_file
      	vfs_write
      	SyS_write
      	system_call_fastpath
      
      Last level cache shared mask is built during CPU up and the
      build_sched_domain() routine takes advantage of it to setup
      the sched domain CPU topology.
      
      However, llc_shared_mask is not released during CPU disable,
      which leads to an invalid sched domainCPU topology.
      
      This patch fix it by releasing the llc_shared_mask correctly
      during CPU disable.
      
      Yasuaki also reported that this can happen on real hardware:
      
        https://lkml.org/lkml/2014/7/22/1018
      
      His case is here:
      
      	==
      	Here is an example on my system.
      	My system has 4 sockets and each socket has 15 cores and HT is
      	enabled. In this case, each core of sockes is numbered as
      	follows:
      
      		 | CPU#
      	Socket#0 | 0-14 , 60-74
      	Socket#1 | 15-29, 75-89
      	Socket#2 | 30-44, 90-104
      	Socket#3 | 45-59, 105-119
      
      	Then llc_shared_mask of CPU#30 has 0x3fff80000001fffc0000000.
      
      	It means that last level cache of Socket#2 is shared with
      	CPU#30-44 and 90-104.
      
      	When hot-removing socket#2 and #3, each core of sockets is
      	numbered as follows:
      
      		 | CPU#
      	Socket#0 | 0-14 , 60-74
      	Socket#1 | 15-29, 75-89
      
      	But llc_shared_mask is not cleared. So llc_shared_mask of CPU#30
      	remains having 0x3fff80000001fffc0000000.
      
      	After that, when hot-adding socket#2 and #3, each core of
      	sockets is numbered as follows:
      
      		 | CPU#
      	Socket#0 | 0-14 , 60-74
      	Socket#1 | 15-29, 75-89
      	Socket#2 | 30-59
      	Socket#3 | 90-119
      
      	Then llc_shared_mask of CPU#30 becomes
      	0x3fff8000fffffffc0000000. It means that last level cache of
      	Socket#2 is shared with CPU#30-59 and 90-104. So the mask has
      	the wrong value.
      Signed-off-by: NWanpeng Li <wanpeng.li@linux.intel.com>
      Tested-by: NLinn Crosetto <linn@hp.com>
      Reviewed-by: NBorislav Petkov <bp@suse.de>
      Reviewed-by: NToshi Kani <toshi.kani@hp.com>
      Reviewed-by: NYasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: <stable@vger.kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Steven Rostedt <srostedt@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1411547885-48165-1-git-send-email-wanpeng.li@linux.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      03bd4e1f
    • M
      x86/efi: Delete misleading efi_printk() error message · 56394ab8
      Matt Fleming 提交于
      A number of people are reporting seeing the "setup_efi_pci() failed!"
      error message in what used to be a quiet boot,
      
        https://bugzilla.kernel.org/show_bug.cgi?id=81891
      
      The message isn't all that helpful because setup_efi_pci() can return a
      non-success error code for a variety of reasons, not all of them fatal.
      
      Let's drop the return code from setup_efi_pci*() altogether, since
      there's no way to process it in any meaningful way outside of the inner
      __setup_efi_pci*() functions.
      Reported-by: NDarren Hart <dvhart@linux.intel.com>
      Reported-by: NJosh Boyer <jwboyer@fedoraproject.org>
      Cc: Ulf Winkelvos <ulf@winkelvos.de>
      Cc: Andre Müller <andre.muller@web.de>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NMatt Fleming <matt.fleming@intel.com>
      56394ab8
    • M
      Revert "efi/x86: efistub: Move shared dependencies to <asm/efi.h>" · 84be8805
      Matt Fleming 提交于
      This reverts commit f23cf8bd ("efi/x86: efistub: Move shared
      dependencies to <asm/efi.h>") as well as the x86 parts of commit
      f4f75ad5 ("efi: efistub: Convert into static library").
      
      The road leading to these two reverts is long and winding.
      
      The above two commits were merged during the v3.17 merge window and
      turned the common EFI boot stub code into a static library. This
      necessitated making some symbols global in the x86 boot stub which
      introduced new entries into the early boot GOT.
      
      The problem was that we weren't fixing up the newly created GOT entries
      before invoking the EFI boot stub, which sometimes resulted in hangs or
      resets. This failure was reported by Maarten on his Macbook pro.
      
      The proposed fix was commit 9cb0e394 ("x86/efi: Fixup GOT in all
      boot code paths"). However, that caused issues for Linus when booting
      his Sony Vaio Pro 11. It was subsequently reverted in commit
      f3670394.
      
      So that leaves us back with Maarten's Macbook pro not booting.
      
      At this stage in the release cycle the least risky option is to revert
      the x86 EFI boot stub to the pre-merge window code structure where we
      explicitly #include efi-stub-helper.c instead of linking with the static
      library. The arm64 code remains unaffected.
      
      We can take another swing at the x86 parts for v3.18.
      
      Conflicts:
      	arch/x86/include/asm/efi.h
      Tested-by: NJosh Boyer <jwboyer@fedoraproject.org>
      Tested-by: NMaarten Lankhorst <maarten.lankhorst@canonical.com>
      Tested-by: Leif Lindholm <leif.lindholm@linaro.org> [arm64]
      Tested-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>,
      Cc: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: NMatt Fleming <matt.fleming@intel.com>
      84be8805
  4. 23 9月, 2014 1 次提交
  5. 19 9月, 2014 1 次提交
  6. 17 9月, 2014 1 次提交
    • B
      vgaarb: Don't default exclusively to first video device with mem+io · 86fd887b
      Bruno Prémont 提交于
      Commit 20cde694 ("x86, ia64: Move EFI_FB vga_default_device()
      initialization to pci_vga_fixup()") moved boot video device detection from
      efifb to x86 and ia64 pci/fixup.c.
      
      For dual-GPU Apple computers above change represents a regression as code
      in efifb did forcefully override vga_default_device while the merge did not
      (vgaarb happens prior to PCI fixup).
      
      To improve on initial device selection by vgaarb (it cannot know if PCI
      device not behind bridges see/decode legacy VGA I/O or not), move the
      screen_info based check from pci_video_fixup() to vgaarb's init function and
      use it to refine/override decision taken while adding the individual PCI
      VGA devices.  This way PCI fixup has no reason to adjust vga_default_device
      anymore but can depend on its value for flagging shadowed VBIOS.
      
      This has the nice benefit of removing duplicated code but does introduce a
      #if defined() block in vgaarb.  Not all architectures have screen_info and
      would cause compile to fail without it.
      
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=84461Reported-and-Tested-By: NAndreas Noever <andreas.noever@gmail.com>
      Signed-off-by: NBruno Prémont <bonbons@linux-vserver.org>
      Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
      CC: Matthew Garrett <matthew.garrett@nebula.com>
      CC: stable@vger.kernel.org # v3.5+
      86fd887b
  7. 14 9月, 2014 2 次提交
    • D
      x86 early_ioremap: Increase FIX_BTMAPS_SLOTS to 8 · 3eddc69f
      Dave Young 提交于
      3.16 kernel boot fail with earlyprintk=efi, it keeps scrolling at the
      bottom line of screen.
      
      Bisected, the first bad commit is below:
      commit 86dfc6f3
      Author: Lv Zheng <lv.zheng@intel.com>
      Date:   Fri Apr 4 12:38:57 2014 +0800
      
          ACPICA: Tables: Fix table checksums verification before installation.
      
      I did some debugging by enabling both serial and efi earlyprintk, below is
      some debug dmesg, seems early_ioremap fails in scroll up function due to
      no free slot, see below dmesg output:
      
        WARNING: CPU: 0 PID: 0 at mm/early_ioremap.c:116 __early_ioremap+0x90/0x1c4()
        __early_ioremap(ed00c800, 00000c80) not found slot
        Modules linked in:
        CPU: 0 PID: 0 Comm: swapper Not tainted 3.17.0-rc1+ #204
        Hardware name: Hewlett-Packard HP Z420 Workstation/1589, BIOS J61 v03.15 05/09/2013
        Call Trace:
          dump_stack+0x4e/0x7a
          warn_slowpath_common+0x75/0x8e
          ? __early_ioremap+0x90/0x1c4
          warn_slowpath_fmt+0x47/0x49
          __early_ioremap+0x90/0x1c4
          ? sprintf+0x46/0x48
          early_ioremap+0x13/0x15
          early_efi_map+0x24/0x26
          early_efi_scroll_up+0x6d/0xc0
          early_efi_write+0x1b0/0x214
          call_console_drivers.constprop.21+0x73/0x7e
          console_unlock+0x151/0x3b2
          ? vprintk_emit+0x49f/0x532
          vprintk_emit+0x521/0x532
          ? console_unlock+0x383/0x3b2
          printk+0x4f/0x51
          acpi_os_vprintf+0x2b/0x2d
          acpi_os_printf+0x43/0x45
          acpi_info+0x5c/0x63
          ? __acpi_map_table+0x13/0x18
          ? acpi_os_map_iomem+0x21/0x147
          acpi_tb_print_table_header+0x177/0x186
          acpi_tb_install_table_with_override+0x4b/0x62
          acpi_tb_install_standard_table+0xd9/0x215
          ? early_ioremap+0x13/0x15
          ? __acpi_map_table+0x13/0x18
          acpi_tb_parse_root_table+0x16e/0x1b4
          acpi_initialize_tables+0x57/0x59
          acpi_table_init+0x50/0xce
          acpi_boot_table_init+0x1e/0x85
          setup_arch+0x9b7/0xcc4
          start_kernel+0x94/0x42d
          ? early_idt_handlers+0x120/0x120
          x86_64_start_reservations+0x2a/0x2c
          x86_64_start_kernel+0xf3/0x100
      
      Quote reply from Lv.zheng about the early ioremap slot usage in this case:
      
      """
      In early_efi_scroll_up(), 2 mapping entries will be used for the src/dst screen buffer.
      In drivers/acpi/acpica/tbutils.c, we've improved the early table loading code in acpi_tb_parse_root_table().
      We now need 2 mapping entries:
      1. One mapping entry is used for RSDT table mapping. Each RSDT entry contains an address for another ACPI table.
      2. For each entry in RSDP, we need another mapping entry to map the table to perform necessary check/override before installing it.
      
      When acpi_tb_parse_root_table() prints something through EFI earlyprintk console, we'll have 4 mapping entries used.
      The current 4 slots setting of early_ioremap() seems to be too small for such a use case.
      """
      
      Thus increase the slot to 8 in this patch to fix this issue.
      boot-time mappings become 512 page with this patch.
      Signed-off-by: NDave Young <dyoung@redhat.com>
      Cc: <stable@vger.kernel.org> # v3.16
      Signed-off-by: NMatt Fleming <matt.fleming@intel.com>
      3eddc69f
    • L
      Make ARCH_HAS_FAST_MULTIPLIER a real config variable · 72d93104
      Linus Torvalds 提交于
      It used to be an ad-hoc hack defined by the x86 version of
      <asm/bitops.h> that enabled a couple of library routines to know whether
      an integer multiply is faster than repeated shifts and additions.
      
      This just makes it use the real Kconfig system instead, and makes x86
      (which was the only architecture that did this) select the option.
      
      NOTE! Even for x86, this really is kind of wrong.  If we cared, we would
      probably not enable this for builds optimized for netburst (P4), where
      shifts-and-adds are generally faster than multiplies.  This patch does
      *not* change that kind of logic, though, it is purely a syntactic change
      with no code changes.
      
      This was triggered by the fact that we have other places that really
      want to know "do I want to expand multiples by constants by hand or
      not", particularly the hash generation code.
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      72d93104
  8. 10 9月, 2014 1 次提交
    • S
      x86/xen: don't copy bogus duplicate entries into kernel page tables · 0b5a5063
      Stefan Bader 提交于
      When RANDOMIZE_BASE (KASLR) is enabled; or the sum of all loaded
      modules exceeds 512 MiB, then loading modules fails with a warning
      (and hence a vmalloc allocation failure) because the PTEs for the
      newly-allocated vmalloc address space are not zero.
      
        WARNING: CPU: 0 PID: 494 at linux/mm/vmalloc.c:128
                 vmap_page_range_noflush+0x2a1/0x360()
      
      This is caused by xen_setup_kernel_pagetables() copying
      level2_kernel_pgt into level2_fixmap_pgt, overwriting many non-present
      entries.
      
      Without KASLR, the normal kernel image size only covers the first half
      of level2_kernel_pgt and module space starts after that.
      
      L4[511]->level3_kernel_pgt[510]->level2_kernel_pgt[  0..255]->kernel
                                                        [256..511]->module
                                [511]->level2_fixmap_pgt[  0..505]->module
      
      This allows 512 MiB of of module vmalloc space to be used before
      having to use the corrupted level2_fixmap_pgt entries.
      
      With KASLR enabled, the kernel image uses the full PUD range of 1G and
      module space starts in the level2_fixmap_pgt. So basically:
      
      L4[511]->level3_kernel_pgt[510]->level2_kernel_pgt[0..511]->kernel
                                [511]->level2_fixmap_pgt[0..505]->module
      
      And now no module vmalloc space can be used without using the corrupt
      level2_fixmap_pgt entries.
      
      Fix this by properly converting the level2_fixmap_pgt entries to MFNs,
      and setting level1_fixmap_pgt as read-only.
      
      A number of comments were also using the the wrong L3 offset for
      level2_kernel_pgt.  These have been corrected.
      Signed-off-by: NStefan Bader <stefan.bader@canonical.com>
      Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
      Reviewed-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: stable@vger.kernel.org
      0b5a5063
  9. 09 9月, 2014 4 次提交
  10. 01 9月, 2014 1 次提交
    • J
      x86, irq: Fix build error caused by 9eabc99a · f3761db1
      Jiang Liu 提交于
      Commit 9eabc99a causes following build error when
      IOAPIC is disabled.
         arch/x86/pci/irq.c: In function 'pirq_disable_irq':
      >> arch/x86/pci/irq.c:1259:2: error: implicit declaration of function 'mp_should_keep_irq' [-Werror=implicit-function-declaration]
           if (io_apic_assign_pci_irqs && !mp_should_keep_irq(&dev->dev) &&
           ^
         cc1: some warnings being treated as errors
      Signed-off-by: NJiang Liu <jiang.liu@linux.intel.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Grant Likely <grant.likely@linaro.org>
      Link: http://lkml.kernel.org/r/1409382916-10649-1-git-send-email-jiang.liu@linux.intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      f3761db1
  11. 30 8月, 2014 4 次提交
    • M
      kexec: purgatory: add clean-up for purgatory directory · b0108f9e
      Michael Welling 提交于
      Without this patch the kexec-purgatory.c and purgatory.ro files are not
      removed after make mrproper.
      Signed-off-by: NMichael Welling <mwelling@ieee.org>
      Acked-by: NVivek Goyal <vgoyal@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b0108f9e
    • V
      x86/purgatory: use approprate -m64/-32 build flag for arch/x86/purgatory · 4df4185a
      Vivek Goyal 提交于
      Thomas reported that build of x86_64 kernel was failing for him.  He is
      using 32bit tool chain.
      
      Problem is that while compiling purgatory, I have not specified -m64
      flag.  And 32bit tool chain must be assuming -m32 by default.
      
      Following is error message.
      
      (mini) [~/work/linux-2.6] make
      scripts/kconfig/conf --silentoldconfig Kconfig
        CHK     include/config/kernel.release
        UPD     include/config/kernel.release
        CHK     include/generated/uapi/linux/version.h
        CHK     include/generated/utsrelease.h
        UPD     include/generated/utsrelease.h
        CC      arch/x86/purgatory/purgatory.o
      arch/x86/purgatory/purgatory.c:1:0: error: code model 'large' not supported in
      the 32 bit mode
      
      Fix it by explicitly passing appropriate -m64/-m32 build flag for
      purgatory.
      Reported-by: NThomas Glanzmann <thomas@glanzmann.de>
      Tested-by: NThomas Glanzmann <thomas@glanzmann.de>
      Suggested-by: NH. Peter Anvin <hpa@zytor.com>
      Signed-off-by: NVivek Goyal <vgoyal@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      4df4185a
    • V
      kexec: create a new config option CONFIG_KEXEC_FILE for new syscall · 74ca317c
      Vivek Goyal 提交于
      Currently new system call kexec_file_load() and all the associated code
      compiles if CONFIG_KEXEC=y.  But new syscall also compiles purgatory
      code which currently uses gcc option -mcmodel=large.  This option seems
      to be available only gcc 4.4 onwards.
      
      Hiding new functionality behind a new config option will not break
      existing users of old gcc.  Those who wish to enable new functionality
      will require new gcc.  Having said that, I am trying to figure out how
      can I move away from using -mcmodel=large but that can take a while.
      
      I think there are other advantages of introducing this new config
      option.  As this option will be enabled only on x86_64, other arches
      don't have to compile generic kexec code which will never be used.  This
      new code selects CRYPTO=y and CRYPTO_SHA256=y.  And all other arches had
      to do this for CONFIG_KEXEC.  Now with introduction of new config
      option, we can remove crypto dependency from other arches.
      
      Now CONFIG_KEXEC_FILE is available only on x86_64.  So whereever I had
      CONFIG_X86_64 defined, I got rid of that.
      
      For CONFIG_KEXEC_FILE, instead of doing select CRYPTO=y, I changed it to
      "depends on CRYPTO=y".  This should be safer as "select" is not
      recursive.
      Signed-off-by: NVivek Goyal <vgoyal@redhat.com>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Tested-by: NShaun Ruffell <sruffell@digium.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      74ca317c
    • H
      x86,mm: fix pte_special versus pte_numa · b38af472
      Hugh Dickins 提交于
      Sasha Levin has shown oopses on ffffea0003480048 and ffffea0003480008 at
      mm/memory.c:1132, running Trinity on different 3.16-rc-next kernels:
      where zap_pte_range() checks page->mapping to see if PageAnon(page).
      
      Those addresses fit struct pages for pfns d2001 and d2000, and in each
      dump a register or a stack slot showed d2001730 or d2000730: pte flags
      0x730 are PCD ACCESSED PROTNONE SPECIAL IOMAP; and Sasha's e820 map has
      a hole between cfffffff and 100000000, which would need special access.
      
      Commit c46a7c81 ("x86: define _PAGE_NUMA by reusing software bits on
      the PMD and PTE levels") has broken vm_normal_page(): a PROTNONE SPECIAL
      pte no longer passes the pte_special() test, so zap_pte_range() goes on
      to try to access a non-existent struct page.
      
      Fix this by refining pte_special() (SPECIAL with PRESENT or PROTNONE) to
      complement pte_numa() (SPECIAL with neither PRESENT nor PROTNONE).  A
      hint that this was a problem was that c46a7c81 added pte_numa() test
      to vm_normal_page(), and moved its is_zero_pfn() test from slow to fast
      path: This was papering over a pte_special() snag when the zero page was
      encountered during zap.  This patch reverts vm_normal_page() to how it
      was before, relying on pte_special().
      
      It still appears that this patch may be incomplete: aren't there other
      places which need to be handling PROTNONE along with PRESENT?  For
      example, pte_mknuma() clears _PAGE_PRESENT and sets _PAGE_NUMA, but on a
      PROT_NONE area, that would make it pte_special().  This is side-stepped
      by the fact that NUMA hinting faults skipped PROT_NONE VMAs and there
      are no grounds where a NUMA hinting fault on a PROT_NONE VMA would be
      interesting.
      
      Fixes: c46a7c81 ("x86: define _PAGE_NUMA by reusing software bits on the PMD and PTE levels")
      Reported-by: NSasha Levin <sasha.levin@oracle.com>
      Tested-by: NSasha Levin <sasha.levin@oracle.com>
      Signed-off-by: NHugh Dickins <hughd@google.com>
      Signed-off-by: NMel Gorman <mgorman@suse.de>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
      Cc: <stable@vger.kernel.org>	[3.16]
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b38af472
  12. 29 8月, 2014 1 次提交
  13. 28 8月, 2014 1 次提交
  14. 27 8月, 2014 1 次提交
    • J
      x86: irq: Fix bug in setting IOAPIC pin attributes · f395dcae
      Jiang Liu 提交于
      Commit 15a3c7cc "x86, irq: Introduce two helper functions
      to support irqdomain map operation" breaks LPSS ACPI enumerated
      devices.
      
      On startup, IOAPIC driver preallocates IRQ descriptors and programs
      IOAPIC pins with default level and polarity attributes for all legacy
      IRQs. Later legacy IRQ users may fail to set IOAPIC pin attributes
      if the requested attributes conflicts with the default IOAPIC pin
      attributes. So change mp_irqdomain_map() to allow the first legacy IRQ
      user to reprogram IOAPIC pin with different attributes.
      Reported-and-tested-by: NMika Westerberg <mika.westerberg@linux.intel.com>
      Signed-off-by: NJiang Liu <jiang.liu@linux.intel.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Grant Likely <grant.likely@linaro.org>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Link: http://lkml.kernel.org/r/1409118795-17046-1-git-send-email-jiang.liu@linux.intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      f395dcae
  15. 26 8月, 2014 1 次提交
  16. 19 8月, 2014 3 次提交
  17. 16 8月, 2014 2 次提交
  18. 13 8月, 2014 1 次提交
  19. 11 8月, 2014 2 次提交
    • D
      x86/xen: use vmap() to map grant table pages in PVH guests · 7d951f3c
      David Vrabel 提交于
      Commit b7dd0e35 (x86/xen: safely map and unmap grant frames when
      in atomic context) causes PVH guests to crash in
      arch_gnttab_map_shared() when they attempted to map the pages for the
      grant table.
      
      This use of a PV-specific function during the PVH grant table setup is
      non-obvious and not needed.  The standard vmap() function does the
      right thing.
      Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
      Reported-by: NMukesh Rathor <mukesh.rathor@oracle.com>
      Tested-by: NMukesh Rathor <mukesh.rathor@oracle.com>
      Cc: stable@vger.kernel.org
      7d951f3c
    • D
      x86/xen: resume timer irqs early · 8d5999df
      David Vrabel 提交于
      If the timer irqs are resumed during device resume it is possible in
      certain circumstances for the resume to hang early on, before device
      interrupts are resumed.  For an Ubuntu 14.04 PVHVM guest this would
      occur in ~0.5% of resume attempts.
      
      It is not entirely clear what is occuring the point of the hang but I
      think a task necessary for the resume calls schedule_timeout(),
      waiting for a timer interrupt (which never arrives).  This failure may
      require specific tasks to be running on the other VCPUs to trigger
      (processes are not frozen during a suspend/resume if PREEMPT is
      disabled).
      
      Add IRQF_EARLY_RESUME to the timer interrupts so they are resumed in
      syscore_resume().
      Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
      Reviewed-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: stable@vger.kernel.org
      8d5999df
  20. 10 8月, 2014 1 次提交
  21. 09 8月, 2014 6 次提交
    • V
      kexec: verify the signature of signed PE bzImage · 8e7d8381
      Vivek Goyal 提交于
      This is the final piece of the puzzle of verifying kernel image signature
      during kexec_file_load() syscall.
      
      This patch calls into PE file routines to verify signature of bzImage.  If
      signature are valid, kexec_file_load() succeeds otherwise it fails.
      
      Two new config options have been introduced.  First one is
      CONFIG_KEXEC_VERIFY_SIG.  This option enforces that kernel has to be
      validly signed otherwise kernel load will fail.  If this option is not
      set, no signature verification will be done.  Only exception will be when
      secureboot is enabled.  In that case signature verification should be
      automatically enforced when secureboot is enabled.  But that will happen
      when secureboot patches are merged.
      
      Second config option is CONFIG_KEXEC_BZIMAGE_VERIFY_SIG.  This option
      enables signature verification support on bzImage.  If this option is not
      set and previous one is set, kernel image loading will fail because kernel
      does not have support to verify signature of bzImage.
      
      I tested these patches with both "pesign" and "sbsign" signed bzImages.
      
      I used signing_key.priv key and signing_key.x509 cert for signing as
      generated during kernel build process (if module signing is enabled).
      
      Used following method to sign bzImage.
      
      pesign
      ======
      - Convert DER format cert to PEM format cert
      openssl x509 -in signing_key.x509 -inform DER -out signing_key.x509.PEM -outform
      PEM
      
      - Generate a .p12 file from existing cert and private key file
      openssl pkcs12 -export -out kernel-key.p12 -inkey signing_key.priv -in
      signing_key.x509.PEM
      
      - Import .p12 file into pesign db
      pk12util -i /tmp/kernel-key.p12 -d /etc/pki/pesign
      
      - Sign bzImage
      pesign -i /boot/vmlinuz-3.16.0-rc3+ -o /boot/vmlinuz-3.16.0-rc3+.signed.pesign
      -c "Glacier signing key - Magrathea" -s
      
      sbsign
      ======
      sbsign --key signing_key.priv --cert signing_key.x509.PEM --output
      /boot/vmlinuz-3.16.0-rc3+.signed.sbsign /boot/vmlinuz-3.16.0-rc3+
      
      Patch details:
      
      Well all the hard work is done in previous patches.  Now bzImage loader
      has just call into that code and verify whether bzImage signature are
      valid or not.
      
      Also create two config options.  First one is CONFIG_KEXEC_VERIFY_SIG.
      This option enforces that kernel has to be validly signed otherwise kernel
      load will fail.  If this option is not set, no signature verification will
      be done.  Only exception will be when secureboot is enabled.  In that case
      signature verification should be automatically enforced when secureboot is
      enabled.  But that will happen when secureboot patches are merged.
      
      Second config option is CONFIG_KEXEC_BZIMAGE_VERIFY_SIG.  This option
      enables signature verification support on bzImage.  If this option is not
      set and previous one is set, kernel image loading will fail because kernel
      does not have support to verify signature of bzImage.
      Signed-off-by: NVivek Goyal <vgoyal@redhat.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Cc: Greg Kroah-Hartman <greg@kroah.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: WANG Chao <chaowang@redhat.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Matt Fleming <matt@console-pimps.org>
      Cc: David Howells <dhowells@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      8e7d8381
    • V
      kexec: support kexec/kdump on EFI systems · 6a2c20e7
      Vivek Goyal 提交于
      This patch does two things.  It passes EFI run time mappings to second
      kernel in bootparams efi_info.  Second kernel parse this info and create
      new mappings in second kernel.  That means mappings in first and second
      kernel will be same.  This paves the way to enable EFI in kexec kernel.
      
      This patch also prepares and passes EFI setup data through bootparams.
      This contains bunch of information about various tables and their
      addresses.
      
      These information gathering and passing has been written along the lines
      of what current kexec-tools is doing to make kexec work with UEFI.
      
      [akpm@linux-foundation.org: s/get_efi/efi_get/g, per Matt]
      Signed-off-by: NVivek Goyal <vgoyal@redhat.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Cc: Greg Kroah-Hartman <greg@kroah.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: WANG Chao <chaowang@redhat.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Matt Fleming <matt@console-pimps.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      6a2c20e7
    • V
      kexec: support for kexec on panic using new system call · dd5f7260
      Vivek Goyal 提交于
      This patch adds support for loading a kexec on panic (kdump) kernel usning
      new system call.
      
      It prepares ELF headers for memory areas to be dumped and for saved cpu
      registers.  Also prepares the memory map for second kernel and limits its
      boot to reserved areas only.
      Signed-off-by: NVivek Goyal <vgoyal@redhat.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Cc: Greg Kroah-Hartman <greg@kroah.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: WANG Chao <chaowang@redhat.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      dd5f7260
    • V
      kexec-bzImage64: support for loading bzImage using 64bit entry · 27f48d3e
      Vivek Goyal 提交于
      This is loader specific code which can load bzImage and set it up for
      64bit entry.  This does not take care of 32bit entry or real mode entry.
      
      32bit mode entry can be implemented if somebody needs it.
      Signed-off-by: NVivek Goyal <vgoyal@redhat.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Cc: Greg Kroah-Hartman <greg@kroah.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: WANG Chao <chaowang@redhat.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      27f48d3e
    • V
      kexec: load and relocate purgatory at kernel load time · 12db5562
      Vivek Goyal 提交于
      Load purgatory code in RAM and relocate it based on the location.
      Relocation code has been inspired by module relocation code and purgatory
      relocation code in kexec-tools.
      
      Also compute the checksums of loaded kexec segments and store them in
      purgatory.
      
      Arch independent code provides this functionality so that arch dependent
      bootloaders can make use of it.
      
      Helper functions are provided to get/set symbol values in purgatory which
      are used by bootloaders later to set things like stack and entry point of
      second kernel etc.
      Signed-off-by: NVivek Goyal <vgoyal@redhat.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Cc: Greg Kroah-Hartman <greg@kroah.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: WANG Chao <chaowang@redhat.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      12db5562
    • V
      purgatory: core purgatory functionality · 8fc5b4d4
      Vivek Goyal 提交于
      Create a stand alone relocatable object purgatory which runs between two
      kernels.  This name, concept and some code has been taken from
      kexec-tools.  Idea is that this code runs after a crash and it runs in
      minimal environment.  So keep it separate from rest of the kernel and in
      long term we will have to practically do no maintenance of this code.
      
      This code also has the logic to do verify sha256 hashes of various
      segments which have been loaded into memory.  So first we verify that the
      kernel we are jumping to is fine and has not been corrupted and make
      progress only if checsums are verified.
      
      This code also takes care of copying some memory contents to backup region.
      
      [sfr@canb.auug.org.au: run host built programs from objtree]
      Signed-off-by: NVivek Goyal <vgoyal@redhat.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Cc: Greg Kroah-Hartman <greg@kroah.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: WANG Chao <chaowang@redhat.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      8fc5b4d4