1. 23 7月, 2015 1 次提交
  2. 04 7月, 2015 1 次提交
    • I
      x86/fpu: Fix boot crash in the early FPU code · b96fecbf
      Ingo Molnar 提交于
      Jan Kara and Thomas Gleixner reported boot crashes in the FPU
      code:
      
        general protection fault: 0000 [#1] SMP
        RIP: 0010:[<ffffffff81048a6c>]  [<ffffffff81048a6c>] mxcsr_feature_mask_init+0x1c/0x40
      
        2b:*  0f ae 85 00 fe ff ff    fxsave -0x200(%rbp)
      
      and bisected it down to the following FPU commit:
      
         91a8c2a5 ("x86/fpu: Clean up and fix MXCSR handling")
      
      The reason is that the on-stack FPU registers state variable,
      used by the FXSAVE instruction, did not have the required
      minimum alignment of 16 bytes, causing the general protection
      fault.
      
      This is most likely a GCC bug in older GCC versions, but the
      offending commit also added a bogus extra 32-byte alignment
      (which GCC ignored too).
      
      So fix this bug by making the variable static again, but also
      mark it __initdata this time, because fpu__init_system_mxcsr()
      is now an __init function.
      Reported-and-bisected-by: NJan Kara <jack@suse.cz>
      Reported-bisected-and-tested-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20150704075819.GA9201@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      b96fecbf
  3. 01 7月, 2015 2 次提交
  4. 30 6月, 2015 2 次提交
    • P
      perf/x86: Fix 'active_events' imbalance · 93472aff
      Peter Zijlstra 提交于
      Commit 1b7b938f ("perf/x86/intel: Fix PMI handling for Intel PT") conditionally
      increments active_events in x86_add_exclusive() but unconditionally decrements in
      x86_del_exclusive().
      
      These extra decrements can lead to the situation where
      active_events is zero and thus the PMI handler is 'disabled'
      while we have active events on the PMU generating PMIs.
      
      This leads to a truckload of:
      
        Uhhuh. NMI received for unknown reason 21 on CPU 28.
        Do you have a strange power saving mode enabled?
        Dazed and confused, but trying to continue
      
      messages and generally messes up perf.
      
      Remove the condition on the increment, double increment balanced
      by a double decrement is perfectly fine.
      
      Restructure the code a little bit to make the unconditional inc
      a bit more natural.
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: alexander.shishkin@linux.intel.com
      Cc: brgerst@gmail.com
      Cc: dvlasenk@redhat.com
      Cc: luto@amacapital.net
      Cc: oleg@redhat.com
      Fixes: 1b7b938f ("perf/x86/intel: Fix PMI handling for Intel PT")
      Link: http://lkml.kernel.org/r/20150624144750.GJ18673@twins.programming.kicks-ass.netSigned-off-by: NIngo Molnar <mingo@kernel.org>
      93472aff
    • I
      x86/fpu: Fix FPU related boot regression when CPUID masking BIOS feature is enabled · db52ef74
      Ingo Molnar 提交于
      Mike Galbraith reported:
      
        " My i7-4790 box is having one hell of a time with this merge
          window, dead in the water.
      
          BIOS setting "Limit CPUID Maximum" upsets new fpu code
          mightily. "
      
      It turns out that Linux does a double workaround here, as per:
      
        066941bd ("x86: unmask CPUID levels on Intel CPUs")
      
      it undoes the BIOS workaround - but as a side effect the CPUID
      state is not completely constant during early init anymore,
      and the new FPU init code did not take this into account.
      
      So what happened is that the xstate init code did not have full
      CPUID available, which broke subsequent attempts to use xstate
      features.
      
      Fix this by ordering the early FPU init code to after we've
      stabilized the CPUID state.
      Reported-bisected-and-tested-by: NMike Galbraith <umgwanakikbuti@gmail.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20150627082514.GA10894@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      db52ef74
  5. 26 6月, 2015 1 次提交
    • T
      libnvdimm: Set numa_node to NVDIMM devices · 41d7a6d6
      Toshi Kani 提交于
      ACPI NFIT table has System Physical Address Range Structure entries that
      describe a proximity ID of each range when ACPI_NFIT_PROXIMITY_VALID is
      set in the flags.
      
      Change acpi_nfit_register_region() to map a proximity ID to its node ID,
      and set it to a new numa_node field of nd_region_desc, which is then
      conveyed to the nd_region device.
      
      The device core arranges for btt and namespace devices to inherit their
      node from their parent region.
      Signed-off-by: NToshi Kani <toshi.kani@hp.com>
      [djbw: move set_dev_node() from region.c to bus.c]
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      41d7a6d6
  6. 25 6月, 2015 3 次提交
    • D
      libnvdimm, pmem: add libnvdimm support to the pmem driver · 9f53f9fa
      Dan Williams 提交于
      nd_pmem attaches to persistent memory regions and namespaces emitted by
      the libnvdimm subsystem, and, same as the original pmem driver, presents
      the system-physical-address range as a block device.
      
      The existing e820-type-12 to pmem setup is converted to an nvdimm_bus
      that emits an nd_namespace_io device.
      
      Note that the X in 'pmemX' is now derived from the parent region.  This
      provides some stability to the pmem devices names from boot-to-boot.
      The minor numbers are also more predictable by passing 0 to
      alloc_disk().
      
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Boaz Harrosh <boaz@plexistor.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jens Axboe <axboe@fb.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: NRoss Zwisler <ross.zwisler@linux.intel.com>
      Acked-by: NChristoph Hellwig <hch@lst.de>
      Tested-by: NToshi Kani <toshi.kani@hp.com>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      9f53f9fa
    • T
      x86, mirror: x86 enabling - find mirrored memory ranges · b05b9f5f
      Tony Luck 提交于
      UEFI GetMemoryMap() uses a new attribute bit to mark mirrored memory
      address ranges.  See UEFI 2.5 spec pages 157-158:
      
        http://www.uefi.org/sites/default/files/resources/UEFI%202_5.pdf
      
      On EFI enabled systems scan the memory map and tell memblock about any
      mirrored ranges.
      Signed-off-by: NTony Luck <tony.luck@intel.com>
      Cc: Xishi Qiu <qiuxishi@huawei.com>
      Cc: Hanjun Guo <guohanjun@huawei.com>
      Cc: Xiexiuqi <xiexiuqi@huawei.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b05b9f5f
    • T
      mm/memblock: add extra "flags" to memblock to allow selection of memory based on attribute · fc6daaf9
      Tony Luck 提交于
      Some high end Intel Xeon systems report uncorrectable memory errors as a
      recoverable machine check.  Linux has included code for some time to
      process these and just signal the affected processes (or even recover
      completely if the error was in a read only page that can be replaced by
      reading from disk).
      
      But we have no recovery path for errors encountered during kernel code
      execution.  Except for some very specific cases were are unlikely to ever
      be able to recover.
      
      Enter memory mirroring. Actually 3rd generation of memory mirroing.
      
      Gen1: All memory is mirrored
      	Pro: No s/w enabling - h/w just gets good data from other side of the
      	     mirror
      	Con: Halves effective memory capacity available to OS/applications
      
      Gen2: Partial memory mirror - just mirror memory begind some memory controllers
      	Pro: Keep more of the capacity
      	Con: Nightmare to enable. Have to choose between allocating from
      	     mirrored memory for safety vs. NUMA local memory for performance
      
      Gen3: Address range partial memory mirror - some mirror on each memory
            controller
      	Pro: Can tune the amount of mirror and keep NUMA performance
      	Con: I have to write memory management code to implement
      
      The current plan is just to use mirrored memory for kernel allocations.
      This has been broken into two phases:
      
      1) This patch series - find the mirrored memory, use it for boot time
         allocations
      
      2) Wade into mm/page_alloc.c and define a ZONE_MIRROR to pick up the
         unused mirrored memory from mm/memblock.c and only give it out to
         select kernel allocations (this is still being scoped because
         page_alloc.c is scary).
      
      This patch (of 3):
      
      Add extra "flags" to memblock to allow selection of memory based on
      attribute.  No functional changes
      Signed-off-by: NTony Luck <tony.luck@intel.com>
      Cc: Xishi Qiu <qiuxishi@huawei.com>
      Cc: Hanjun Guo <guohanjun@huawei.com>
      Cc: Xiexiuqi <xiexiuqi@huawei.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      fc6daaf9
  7. 22 6月, 2015 1 次提交
  8. 21 6月, 2015 1 次提交
  9. 20 6月, 2015 1 次提交
  10. 19 6月, 2015 5 次提交
    • B
      x86/boot: Fix overflow warning with 32-bit binutils · 04c17341
      Borislav Petkov 提交于
      When building the kernel with 32-bit binutils built with support
      only for the i386 target, we get the following warning:
      
        arch/x86/kernel/head_32.S:66: Warning: shift count out of range (32 is not between 0 and 31)
      
      The problem is that in that case, binutils' internal type
      representation is 32-bit wide and the shift range overflows.
      
      In order to fix this, manipulate the shift expression which
      creates the 4GiB constant to not overflow the shift count.
      Suggested-by: NMichael Matz <matz@suse.de>
      Reported-and-tested-by: NEnrico Mioso <mrkiko.rs@gmail.com>
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      04c17341
    • P
      perf/x86: Honor the architectural performance monitoring version · 2c33645d
      Palik, Imre 提交于
      Architectural performance monitoring, version 1, doesn't support fixed counters.
      
      Currently, even if a hypervisor advertises support for architectural
      performance monitoring version 1, perf may still try to use the fixed
      counters, as the constraints are set up based on the CPU model.
      
      This patch ensures that perf honors the architectural performance monitoring
      version returned by CPUID, and it only uses the fixed counters for version 2
      and above.
      
      (Some of the ideas in this patch came from Peter Zijlstra.)
      Signed-off-by: NImre Palik <imrep@amazon.de>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Anthony Liguori <aliguori@amazon.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1433767609-1039-1-git-send-email-imrep.amz@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      2c33645d
    • A
      perf/x86/intel: Fix PMI handling for Intel PT · 1b7b938f
      Alexander Shishkin 提交于
      Intel PT is a separate PMU and it is not using any of the x86_pmu
      code paths, which means in particular that the active_events counter
      remains intact when new PT events are created.
      
      However, PT uses the generic x86_pmu PMI handler for its PMI handling needs.
      
      The problem here is that the latter checks active_events and in case of it
      being zero, exits without calling the actual x86_pmu.handle_nmi(), which
      results in unknown NMI errors and massive data loss for PT.
      
      The effect is not visible if there are other perf events in the system
      at the same time that keep active_events counter non-zero, for instance
      if the NMI watchdog is running, so one needs to disable it to reproduce
      the problem.
      
      At the same time, the active_events counter besides doing what the name
      suggests also implicitly serves as a PMC hardware and DS area reference
      counter.
      
      This patch adds a separate reference counter for the PMC hardware, leaving
      active_events for actually counting the events and makes sure it also
      counts PT and BTS events.
      Signed-off-by: NAlexander Shishkin <alexander.shishkin@linux.intel.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: acme@infradead.org
      Cc: adrian.hunter@intel.com
      Link: http://lkml.kernel.org/r/87k2v92t0s.fsf@ashishki-desk.ger.corp.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      1b7b938f
    • A
      perf/x86/intel/bts: Fix DS area sharing with x86_pmu events · 6b099d9b
      Alexander Shishkin 提交于
      Currently, the intel_bts driver relies on the DS area allocated by the x86_pmu
      code in its event_init() path, which is a bug: creating a BTS event while
      no x86_pmu events are present results in a NULL pointer dereference.
      
      The same DS area is also used by PEBS sampling, which makes it quite a bit
      trickier to have a separate one for intel_bts' purposes.
      
      This patch makes intel_bts driver use the same DS allocation and reference
      counting code as x86_pmu to make sure it is always present when either
      intel_bts or x86_pmu need it.
      Signed-off-by: NAlexander Shishkin <alexander.shishkin@linux.intel.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: acme@infradead.org
      Cc: adrian.hunter@intel.com
      Link: http://lkml.kernel.org/r/1434024837-9916-2-git-send-email-alexander.shishkin@linux.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      6b099d9b
    • A
      perf/x86: Add more Broadwell model numbers · 4b36f1a4
      Andi Kleen 提交于
      This patch adds additional model numbers for Broadwell to perf.
      Support for Broadwell with Iris Pro (Intel Core i7-57xxC)
      and support for Broadwell Server Xeon.
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1434055942-28253-1-git-send-email-andi@firstfloor.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      4b36f1a4
  11. 18 6月, 2015 2 次提交
  12. 17 6月, 2015 5 次提交
    • P
      x86: replace __init_or_module with __init in non-modular vsmp_64.c · 70c4f78b
      Paul Gortmaker 提交于
      The __init_or_module is from commit 05e12e1c
      ("x86: fix 27-rc crash on vsmp due to paravirt during module load").
      
      But as of commit 70511134
      ("Revert "x86: don't compile vsmp_64 for 32bit") this file became
      obj-y and hence is now only for built-in.  That makes any
      "_or_module" support no longer necessary.
      
      We need to distinguish between the two in order to do some header
      reorganization between init.h and module.h and we don't want to
      be including module.h in non-modular code.
      
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: x86@kernel.org
      Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
      70c4f78b
    • P
      x86: perf_event_intel_pt.c: use arch_initcall to hook in enabling · 5b00c1eb
      Paul Gortmaker 提交于
      This was using module_init, but the current Kconfig situation is
      as follows:
      
      In arch/x86/kernel/cpu/Makefile:
      
        obj-$(CONFIG_CPU_SUP_INTEL)    += perf_event_intel_pt.o perf_event_intel_bts.o
      
      and in arch/x86/Kconfig.cpu:
      
        config CPU_SUP_INTEL
              default y
              bool "Support Intel processors" if PROCESSOR_SELECT
      
      So currently, the end user can not build this code into a module.
      If in the future, there is desire for this to be modular, then
      it can be changed to include <linux/module.h> and use module_init.
      
      But currently, in the non-modular case, a module_init becomes a
      device_initcall.  But this really isn't a device, so we should
      choose a more appropriate initcall bucket to put it in.
      
      The obvious choice here seems to be arch_initcall, but that does
      make it earlier than it was currently through device_initcall.
      As long as perf_pmu_register() is functional, we should be OK.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: x86@kernel.org
      Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
      5b00c1eb
    • P
      x86: perf_event_intel_bts.c: use arch_initcall to hook in enabling · ca41d24c
      Paul Gortmaker 提交于
      This was using module_init, but the current Kconfig situation is
      as follows:
      
      In arch/x86/kernel/cpu/Makefile:
      
        obj-$(CONFIG_CPU_SUP_INTEL)    += perf_event_intel_pt.o perf_event_intel_bts.o
      
      and in arch/x86/Kconfig.cpu:
      
        config CPU_SUP_INTEL
              default y
              bool "Support Intel processors" if PROCESSOR_SELECT
      
      So currently, the end user can not build this code into a module.
      If in the future, there is desire for this to be modular, then
      it can be changed to include <linux/module.h> and use module_init.
      
      But currently, in the non-modular case, a module_init becomes a
      device_initcall.  But this really isn't a device, so we should
      choose a more appropriate initcall bucket to put it in.
      
      The obvious choice here seems to be arch_initcall, but that does
      make it earlier than it was currently through device_initcall.
      As long as perf_pmu_register() is functional, we should be OK.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: x86@kernel.org
      Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
      ca41d24c
    • P
      x86: don't use module_init for non-modular core bootflag code · 1206f535
      Paul Gortmaker 提交于
      The bootflag.o is obj-y (always built in).  It will never be
      modular, so using module_init as an alias for __initcall is
      somewhat misleading.
      
      Fix this up now, so that we can relocate module_init from
      init.h into module.h in the future.  If we don't do this, we'd
      have to add module.h to obviously non-modular code, and that
      would be a worse thing.
      
      Note that direct use of __initcall is discouraged, vs. one
      of the priority categorized subgroups.  As __initcall gets
      mapped onto device_initcall, our use of arch_initcall (which
      makes sense for arch code) will thus change this registration
      from level 6-device to level 3-arch (i.e. slightly earlier).
      However no observable impact of that small difference has
      been observed during testing, or is expected.
      
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: x86@kernel.org
      Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
      1206f535
    • P
      x86: don't use module_init in non-modular devicetree.c code · d54b675a
      Paul Gortmaker 提交于
      The devicetree.o is built for "OF" -- which is bool, and hence
      this code is either present or absent.  It will never be modular,
      so using module_init as an alias for __initcall can be somewhat
      misleading.
      
      Fix this up now, so that we can relocate module_init from
      init.h into module.h in the future.  If we don't do this, we'd
      have to add module.h to obviously non-modular code, and that
      would be a worse thing.
      
      Note that direct use of __initcall is discouraged, vs. one
      of the priority categorized subgroups.  As __initcall gets
      mapped onto device_initcall, our use of device_initcall
      directly in this change means that the runtime impact is
      zero -- it will remain at level 6 in initcall ordering.
      Reported-by: Nkbuild test robot <fengguang.wu@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: x86@kernel.org
      Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
      d54b675a
  13. 12 6月, 2015 1 次提交
    • D
      x86/fpu: Fix double-increment in setup_xstate_features() · a8424003
      Dave Hansen 提交于
      I noticed that my MPX tracepoints were producing garbage for the
      lower and upper bounds:
      
      	mpx_bounds_register_exception: address referenced: 0x00007fffffffccb7 bounds: lower: 0x0 ~upper: 0xffffffffffffffff
      	mpx_bounds_register_exception: address referenced: 0x00007fffffffccbf bounds: lower: 0x0 ~upper: 0xffffffffffffffff
      
      This is, of course, bogus because 0x00007fffffffccbf is *within*
      the bounds.  I assumed that my instruction decoder was bad and
      went looking at it.  But I eventually realized that I was
      getting a '0' offset back from xstate_offsets[BNDREGS].
      
      It was being skipped in the initialization, which is obviously
      bogus, so remove the extra leaf++.
      
      This also goes an initializes xstate_offsets/sizes[] to -1 so
      so that bugs like this will oops instead of silently failing
      in interesting ways.
      
      This was introduced by:
      
      	39f1acd2 ("x86/fpu/xstate: Don't assume the first zero xfeatures zero bit means the end")
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: dave@sr71.net
      Link: http://lkml.kernel.org/r/20150611193400.2E0B00DB@viggo.jf.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      a8424003
  14. 11 6月, 2015 2 次提交
    • J
      x86/crash: Allocate enough low memory when crashkernel=high · 94fb9334
      Joerg Roedel 提交于
      When the crash kernel is loaded above 4GiB in memory, the
      first kernel allocates only 72MiB of low-memory for the DMA
      requirements of the second kernel. On systems with many
      devices this is not enough and causes device driver
      initialization errors and failed crash dumps. Testing by
      SUSE and Redhat has shown that 256MiB is a good default
      value for now and the discussion has lead to this value as
      well. So set this default value to 256MiB to make sure there
      is enough memory available for DMA.
      Signed-off-by: NJoerg Roedel <jroedel@suse.de>
      [ Reflow comment. ]
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Acked-by: NBaoquan He <bhe@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jörg Rödel <joro@8bytes.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: kexec@lists.infradead.org
      Link: http://lkml.kernel.org/r/1433500202-25531-4-git-send-email-joro@8bytes.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      94fb9334
    • J
      x86/swiotlb: Try coherent allocations with __GFP_NOWARN · 186dfc9d
      Joerg Roedel 提交于
      When we boot a kdump kernel in high memory, there is by
      default only 72MB of low memory available. The swiotlb code
      takes 64MB of it (by default) so that there are only 8MB
      left to allocate from. On systems with many devices this
      causes page allocator warnings from
      dma_generic_alloc_coherent():
      
        systemd-udevd: page allocation failure: order:0, mode:0x280d4
        CPU: 0 PID: 197 Comm: systemd-udevd Tainted: G        W
        3.12.28-4-default #1 Hardware name: HP ProLiant DL980 G7, BIOS
        P66 07/30/2012  ffff8800781335e0 ffffffff8150b1db 00000000000280d4 ffffffff8113af90
         0000000000000000 0000000000000000 ffff88007efdbb00 0000000100000000
         0000000000000000 0000000000000000 0000000000000000 0000000000000001
        Call Trace:
          dump_trace+0x7d/0x2d0
          show_stack_log_lvl+0x94/0x170
          show_stack+0x21/0x50
          dump_stack+0x41/0x51
          warn_alloc_failed+0xf0/0x160
          __alloc_pages_slowpath+0x72f/0x796
          __alloc_pages_nodemask+0x1ea/0x210
          dma_generic_alloc_coherent+0x96/0x140
          x86_swiotlb_alloc_coherent+0x1c/0x50
          ttm_dma_pool_alloc_new_pages+0xab/0x320 [ttm]
          ttm_dma_populate+0x3ce/0x640 [ttm]
          ttm_tt_bind+0x36/0x60 [ttm]
          ttm_bo_handle_move_mem+0x55f/0x5c0 [ttm]
          ttm_bo_move_buffer+0x105/0x130 [ttm]
          ttm_bo_validate+0xc1/0x130 [ttm]
          ttm_bo_init+0x24b/0x400 [ttm]
          radeon_bo_create+0x16c/0x200 [radeon]
          radeon_ring_init+0x11e/0x2b0 [radeon]
          r100_cp_init+0x123/0x5b0 [radeon]
          r100_startup+0x194/0x230 [radeon]
          r100_init+0x223/0x410 [radeon]
          radeon_device_init+0x6af/0x830 [radeon]
          radeon_driver_load_kms+0x89/0x180 [radeon]
          drm_get_pci_dev+0x121/0x2f0 [drm]
          local_pci_probe+0x39/0x60
          pci_device_probe+0xa9/0x120
          driver_probe_device+0x9d/0x3d0
          __driver_attach+0x8b/0x90
          bus_for_each_dev+0x5b/0x90
          bus_add_driver+0x1f8/0x2c0
          driver_register+0x5b/0xe0
          do_one_initcall+0xf2/0x1a0
          load_module+0x1207/0x1c70
          SYSC_finit_module+0x75/0xa0
          system_call_fastpath+0x16/0x1b
          0x7fac533d2788
      
      After these warnings the code enters a fall-back path and
      allocated directly from the swiotlb aperture in the end.
      So remove these warnings as this is not a fatal error.
      Signed-off-by: NJoerg Roedel <jroedel@suse.de>
      [ Simplify, reflow comment. ]
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Acked-by: NBaoquan He <bhe@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jörg Rödel <joro@8bytes.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: kexec@lists.infradead.org
      Link: http://lkml.kernel.org/r/1433500202-25531-3-git-send-email-joro@8bytes.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      186dfc9d
  15. 09 6月, 2015 9 次提交
    • D
      x86: Make is_64bit_mm() widely available · b0e9b09b
      Dave Hansen 提交于
      The uprobes code has a nice helper, is_64bit_mm(), that consults
      both the runtime and compile-time flags for 32-bit support.
      Instead of reinventing the wheel, pull it in to an x86 header so
      we can use it for MPX.
      
      I prefer passing the 'mm' around to test_thread_flag(TIF_IA32)
      because it makes it explicit where the context is coming from.
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20150607183704.F0209999@viggo.jf.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      b0e9b09b
    • D
      x86/mpx: Trace #BR exceptions · e7126cf5
      Dave Hansen 提交于
      This is the first in a series of MPX tracing patches.
      I've found these extremely useful in the process of
      debugging applications and the kernel code itself.
      
      This exception hooks in to the bounds (#BR) exception
      very early and allows capturing the key registers which
      would influence how the exception is handled.
      
      Note that bndcfgu/bndstatus are technically still
      64-bit registers even in 32-bit mode.
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20150607183703.5FE2619A@viggo.jf.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      e7126cf5
    • D
      x86/mpx: Introduce a boot-time disable flag · 8c3641e9
      Dave Hansen 提交于
      MPX has the _potential_ to cause some issues.  Say part of your
      init system tried to protect one of its components from buffer
      overflows with MPX.  If there were a false positive, it's
      possible that MPX could keep a system from booting.
      
      MPX could also potentially cause performance issues since it is
      present in hot paths like the unmap path.
      
      Allow it to be disabled at boot time.
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20150607183702.2E8B77AB@viggo.jf.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      8c3641e9
    • D
      x86/mpx: Clean up the code by not passing a task pointer around when unnecessary · 46a6e0cf
      Dave Hansen 提交于
      The MPX code can only work on the current task.  You can not,
      for instance, enable MPX management in another process or
      thread. You can also not handle a fault for another process or
      thread.
      
      Despite this, we pass a task_struct around prolifically.  This
      patch removes all of the task struct passing for code paths
      where the code can not deal with another task (which turns out
      to be all of them).
      
      This has no functional changes.  It's just a cleanup.
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: bp@alien8.de
      Link: http://lkml.kernel.org/r/20150607183702.6A81DA2C@viggo.jf.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      46a6e0cf
    • D
      x86/mpx: Use the new get_xsave_field_ptr()API · a84eeaa9
      Dave Hansen 提交于
      The MPX registers (bndcsr/bndcfgu/bndstatus) are not directly
      accessible via normal instructions.  They essentially act as
      if they were floating point registers and are saved/restored
      along with those registers.
      
      There are two main paths in the MPX code where we care about
      the contents of these registers:
      
      	1. #BR (bounds) faults
      	2. the prctl() code where we are setting MPX up
      
      Both of those paths _might_ be called without the FPU having
      been used.  That means that 'tsk->thread.fpu.state' might
      never be allocated.
      
      Also, fpu_save_init() is not preempt-safe.  It was a bug to
      call it without disabling preemption.  The new
      get_xsave_addr() calls unlazy_fpu() instead and properly
      disables preemption.
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Suresh Siddha <sbsiddha@gmail.com>
      Cc: bp@alien8.de
      Link: http://lkml.kernel.org/r/20150607183701.BC0D37CF@viggo.jf.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      a84eeaa9
    • D
      x86/fpu/xstate: Wrap get_xsave_addr() to make it safer · 04cd027b
      Dave Hansen 提交于
      The MPX code appears is calling a low-level FPU function
      (copy_fpregs_to_fpstate()).  This function is not able to
      be called in all contexts, although it is safe to call
      directly in some cases.
      
      Although probably correct, the current code is ugly and
      potentially error-prone.  So, add a wrapper that calls
      the (slightly) higher-level fpu__save() (which is preempt-
      safe) and also ensures that we even *have* an FPU context
      (in the case that this was called when in lazy FPU mode).
      
      Ingo had this to say about the details about when we need
      preemption disabled:
      
      > it's indeed generally unsafe to access/copy FPU registers with preemption enabled,
      > for two reasons:
      >
      >   - on older systems that use FSAVE the instruction destroys FPU register
      >     contents, which has to be handled carefully
      >
      >   - even on newer systems if we copy to FPU registers (which this code doesn't)
      >     then we don't want a context switch to occur in the middle of it, because a
      >     context switch will write to the fpstate, potentially overwriting our new data
      >     with old FPU state.
      >
      > But it's safe to access FPU registers with preemption enabled in a couple of
      > special cases:
      >
      >   - potentially destructively saving FPU registers: the signal handling code does
      >     this in copy_fpstate_to_sigframe(), because it can rely on the signal restore
      >     side to restore the original FPU state.
      >
      >   - reading FPU registers on modern systems: we don't do this anywhere at the
      >     moment, mostly to keep symmetry with older systems where FSAVE is
      >     destructive.
      >
      >   - initializing FPU registers on modern systems: fpu__clear() does this. Here
      >     it's safe because we don't copy from the fpstate.
      >
      >   - directly writing FPU registers from user-space memory (!). We do this in
      >     fpu__restore_sig(), and it's safe because neither context switches nor
      >     irq-handler FPU use can corrupt the source context of the copy (which is
      >     user-space memory).
      >
      > Note that the MPX code's current use of copy_fpregs_to_fpstate() was safe I think,
      > because:
      >
      >  - MPX is predicated on eagerfpu, so the destructive F[N]SAVE instruction won't be
      >    used.
      >
      >  - the code was only reading FPU registers, and was doing it only in places that
      >    guaranteed that an FPU state was already active (i.e. didn't do it in
      >    kthreads)
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Suresh Siddha <sbsiddha@gmail.com>
      Cc: bp@alien8.de
      Link: http://lkml.kernel.org/r/20150607183700.AA881696@viggo.jf.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      04cd027b
    • D
      x86/fpu/xstate: Fix up bad get_xsave_addr() assumptions · 0c4109be
      Dave Hansen 提交于
      get_xsave_addr() assumes that if an xsave bit is present in the
      hardware (pcntxt_mask) that it is present in a given xsave
      buffer.  Due to an bug in the xsave code on all of the systems
      that have MPX (and thus all the users of this code), that has
      been a true assumption.
      
      But, the bug is getting fixed, so our assumption is not going
      to hold any more.
      
      It's quite possible (and normal) for an enabled state to be
      present on 'pcntxt_mask', but *not* in 'xstate_bv'.  We need
      to consult 'xstate_bv'.
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20150607183700.1E739B34@viggo.jf.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      0c4109be
    • I
      Revert "perf/x86/intel/uncore: Move uncore_box_init() out of driver initialization" · 15c12479
      Ingo Molnar 提交于
      This reverts commit c05199e5.
      
      Vince Weaver reported the following crash while perf fuzzing:
      
      [   79.473121] kernel BUG at mm/vmalloc.c:1335!
      [   79.694391] Call Trace:
      [   79.696997]  <IRQ>
      [   79.699090]  [<ffffffff811b2130>] get_vm_area_caller+0x40/0x50
      [   79.705505]  [<ffffffff81039f4d>] ? snb_uncore_imc_init_box+0x6d/0x90
      [   79.712414]  [<ffffffff810635e5>] __ioremap_caller+0x195/0x350
      [   79.718610]  [<ffffffff81039f4d>] ? snb_uncore_imc_init_box+0x6d/0x90
      [   79.725462]  [<ffffffff81427f6b>] ? debug_object_activate+0x14b/0x1e0
      [   79.732346]  [<ffffffff810637b7>] ioremap_nocache+0x17/0x20
      [   79.738283]  [<ffffffff81039f4d>] snb_uncore_imc_init_box+0x6d/0x90
      [   79.744945]  [<ffffffff81039cf7>] snb_uncore_imc_event_start+0xb7/0x110
      [   79.752020]  [<ffffffff81039d97>] snb_uncore_imc_event_add+0x47/0x60
      [   79.758832]  [<ffffffff81162cbb>] event_sched_in.isra.85+0xfb/0x330
      [   79.765519]  [<ffffffff81162f5f>] group_sched_in+0x6f/0x1e0
      [   79.771481]  [<ffffffff8101df1a>] ? native_sched_clock+0x2a/0x90
      [   79.777858]  [<ffffffff811637bc>] __perf_event_enable+0x25c/0x2a0
      [   79.784418]  [<ffffffff810f3e69>] ? tick_nohz_irq_exit+0x29/0x30
      [   79.790820]  [<ffffffff8115ef30>] ? cpu_clock_event_start+0x40/0x40
      [   79.797546]  [<ffffffff8115ef80>] remote_function+0x50/0x60
      [   79.803535]  [<ffffffff810f8cd1>] flush_smp_call_function_queue+0x81/0x180
      [   79.810840]  [<ffffffff810f9763>] generic_smp_call_function_single_interrupt+0x13/0x60
      [   79.819328]  [<ffffffff8104b5e8>] smp_trace_call_function_single_interrupt+0x38/0xc0
      [   79.827614]  [<ffffffff816de9be>] trace_call_function_single_interrupt+0x6e/0x80
      [   79.835465]  <EOI>
      [   79.837543]  [<ffffffff8156e8b5>] ? cpuidle_enter_state+0x65/0x160
      [   79.844377]  [<ffffffff8156e8a1>] ? cpuidle_enter_state+0x51/0x160
      [   79.851015]  [<ffffffff8156e9e7>] cpuidle_enter+0x17/0x20
      [   79.856791]  [<ffffffff810b6e39>] cpu_startup_entry+0x399/0x440
      [   79.863165]  [<ffffffff816c9ddb>] rest_init+0xbb/0xd0
      
      The offending commit is clearly confused as it moves heavy initialization
      work into IPI context.
      
      Revert it.
      Reported-by: NVince Weaver <vincent.weaver@maine.edu>
      Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Yan, Zheng <zheng.z.yan@intel.com>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      15c12479
    • I
      x86/asm/entry: (Re-)rename __NR_entry_INT80_compat_max to __NR_syscall_compat_max · bace7117
      Ingo Molnar 提交于
      Brian Gerst noticed that I did a weird rename in the following commit:
      
         b2502b41 ("x86/asm/entry: Untangle 'system_call' into two entry points: entry_SYSCALL_64 and entry_INT80_32")
      
      which renamed __NR_ia32_syscall_max to __NR_entry_INT80_compat_max.
      
      Now the original name was a misnomer, but the new one is a misnomer as well,
      as all the 32-bit compat syscall entry points (sysenter, syscall) share the
      system call table, not just the INT80 based one.
      
      Rename it to __NR_syscall_compat_max.
      Reported-by: NBrian Gerst <brgerst@gmail.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      bace7117
  16. 08 6月, 2015 3 次提交
    • B
      PCI: Remove unnecessary #includes of <asm/pci.h> · 633adc71
      Bjorn Helgaas 提交于
      In include/linux/pci.h, we already #include <asm/pci.h>, so we don't need
      to include <asm/pci.h> directly.
      
      Remove the unnecessary includes.  All the files here already include
      <linux/pci.h>.
      Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
      Acked-by: Simon Horman <horms+renesas@verge.net.au>	# sh
      Acked-by: NRalf Baechle <ralf@linux-mips.org>
      633adc71
    • I
      x86/asm/entry: Untangle 'system_call' into two entry points: entry_SYSCALL_64 and entry_INT80_32 · b2502b41
      Ingo Molnar 提交于
      The 'system_call' entry points differ starkly between native 32-bit and 64-bit
      kernels: on 32-bit kernels it defines the INT 0x80 entry point, while on
      64-bit it's the SYSCALL entry point.
      
      This is pretty confusing when looking at generic code, and it also obscures
      the nature of the entry point at the assembly level.
      
      So unangle this by splitting the name into its two uses:
      
      	system_call (32) -> entry_INT80_32
      	system_call (64) -> entry_SYSCALL_64
      
      As per the generic naming scheme for x86 system call entry points:
      
      	entry_MNEMONIC_qualifier
      
      where 'qualifier' is one of _32, _64 or _compat.
      
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      b2502b41
    • I
      x86/asm/entry: Untangle 'ia32_sysenter_target' into two entry points:... · 4c8cd0c5
      Ingo Molnar 提交于
      x86/asm/entry: Untangle 'ia32_sysenter_target' into two entry points: entry_SYSENTER_32 and entry_SYSENTER_compat
      
      So the SYSENTER instruction is pretty quirky and it has different behavior
      depending on bitness and CPU maker.
      
      Yet we create a false sense of coherency by naming it 'ia32_sysenter_target'
      in both of the cases.
      
      Split the name into its two uses:
      
      	ia32_sysenter_target (32)    -> entry_SYSENTER_32
      	ia32_sysenter_target (64)    -> entry_SYSENTER_compat
      
      As per the generic naming scheme for x86 system call entry points:
      
      	entry_MNEMONIC_qualifier
      
      where 'qualifier' is one of _32, _64 or _compat.
      
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      4c8cd0c5