1. 13 11月, 2013 7 次提交
    • V
      x86: move fpu_counter into ARCH specific thread_struct · c375f15a
      Vineet Gupta 提交于
      Only a couple of arches (sh/x86) use fpu_counter in task_struct so it can
      be moved out into ARCH specific thread_struct, reducing the size of
      task_struct for other arches.
      
      Compile tested i386_defconfig + gcc 4.7.3
      Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
      Acked-by: NIngo Molnar <mingo@kernel.org>
      Cc: Paul Mundt <paul.mundt@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c375f15a
    • Z
    • T
      mem-hotplug: introduce movable_node boot option · c5320926
      Tang Chen 提交于
      The hot-Pluggable field in SRAT specifies which memory is hotpluggable.
      As we mentioned before, if hotpluggable memory is used by the kernel, it
      cannot be hot-removed.  So memory hotplug users may want to set all
      hotpluggable memory in ZONE_MOVABLE so that the kernel won't use it.
      
      Memory hotplug users may also set a node as movable node, which has
      ZONE_MOVABLE only, so that the whole node can be hot-removed.
      
      But the kernel cannot use memory in ZONE_MOVABLE.  By doing this, the
      kernel cannot use memory in movable nodes.  This will cause NUMA
      performance down.  And other users may be unhappy.
      
      So we need a way to allow users to enable and disable this functionality.
      In this patch, we introduce movable_node boot option to allow users to
      choose to not to consume hotpluggable memory at early boot time and later
      we can set it as ZONE_MOVABLE.
      
      To achieve this, the movable_node boot option will control the memblock
      allocation direction.  That said, after memblock is ready, before SRAT is
      parsed, we should allocate memory near the kernel image as we explained in
      the previous patches.  So if movable_node boot option is set, the kernel
      does the following:
      
      1. After memblock is ready, make memblock allocate memory bottom up.
      2. After SRAT is parsed, make memblock behave as default, allocate memory
         top down.
      
      Users can specify "movable_node" in kernel commandline to enable this
      functionality.  For those who don't use memory hotplug or who don't want
      to lose their NUMA performance, just don't specify anything.  The kernel
      will work as before.
      Signed-off-by: NTang Chen <tangchen@cn.fujitsu.com>
      Signed-off-by: NZhang Yanfei <zhangyanfei@cn.fujitsu.com>
      Suggested-by: NKamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Suggested-by: NIngo Molnar <mingo@kernel.org>
      Acked-by: NTejun Heo <tj@kernel.org>
      Acked-by: NToshi Kani <toshi.kani@hp.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
      Cc: Thomas Renninger <trenn@suse.de>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Jiang Liu <jiang.liu@huawei.com>
      Cc: Wen Congyang <wency@cn.fujitsu.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: Taku Izumi <izumi.taku@jp.fujitsu.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Michal Nazarewicz <mina86@mina86.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c5320926
    • T
      x86, acpi, crash, kdump: do reserve_crashkernel() after SRAT is parsed. · fa591c4a
      Tang Chen 提交于
      Memory reserved for crashkernel could be large.  So we should not allocate
      this memory bottom up from the end of kernel image.
      
      When SRAT is parsed, we will be able to know which memory is hotpluggable,
      and we can avoid allocating this memory for the kernel.  So reorder
      reserve_crashkernel() after SRAT is parsed.
      Signed-off-by: NTang Chen <tangchen@cn.fujitsu.com>
      Signed-off-by: NZhang Yanfei <zhangyanfei@cn.fujitsu.com>
      Acked-by: NTejun Heo <tj@kernel.org>
      Acked-by: NToshi Kani <toshi.kani@hp.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
      Cc: Thomas Renninger <trenn@suse.de>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Jiang Liu <jiang.liu@huawei.com>
      Cc: Wen Congyang <wency@cn.fujitsu.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: Taku Izumi <izumi.taku@jp.fujitsu.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Michal Nazarewicz <mina86@mina86.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      fa591c4a
    • T
      x86/mem-hotplug: support initialize page tables in bottom-up · b959ed6c
      Tang Chen 提交于
      The Linux kernel cannot migrate pages used by the kernel.  As a result,
      kernel pages cannot be hot-removed.  So we cannot allocate hotpluggable
      memory for the kernel.
      
      In a memory hotplug system, any numa node the kernel resides in should be
      unhotpluggable.  And for a modern server, each node could have at least
      16GB memory.  So memory around the kernel image is highly likely
      unhotpluggable.
      
      ACPI SRAT (System Resource Affinity Table) contains the memory hotplug
      info.  But before SRAT is parsed, memblock has already started to allocate
      memory for the kernel.  So we need to prevent memblock from doing this.
      
      So direct memory mapping page tables setup is the case.
      init_mem_mapping() is called before SRAT is parsed.  To prevent page
      tables being allocated within hotpluggable memory, we will use bottom-up
      direction to allocate page tables from the end of kernel image to the
      higher memory.
      
      Note:
      As for allocating page tables in lower memory, TJ said:
      
      : This is an optional behavior which is triggered by a very specific kernel
      : boot param, which I suspect is gonna need to stick around to support
      : memory hotplug in the current setup unless we add another layer of address
      : translation to support memory hotplug.
      
      As for page tables may occupy too much lower memory if using 4K mapping
      (CONFIG_DEBUG_PAGEALLOC and CONFIG_KMEMCHECK both disable using >4k
      pages), TJ said:
      
      : But as I said in the same paragraph, parsing SRAT earlier doesn't solve
      : the problem in itself either.  Ignoring the option if 4k mapping is
      : required and memory consumption would be prohibitive should work, no?
      : Something like that would be necessary if we're gonna worry about cases
      : like this no matter how we implement it, but, frankly, I'm not sure this
      : is something worth worrying about.
      Signed-off-by: NTang Chen <tangchen@cn.fujitsu.com>
      Signed-off-by: NZhang Yanfei <zhangyanfei@cn.fujitsu.com>
      Acked-by: NTejun Heo <tj@kernel.org>
      Acked-by: NToshi Kani <toshi.kani@hp.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
      Cc: Thomas Renninger <trenn@suse.de>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Jiang Liu <jiang.liu@huawei.com>
      Cc: Wen Congyang <wency@cn.fujitsu.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: Taku Izumi <izumi.taku@jp.fujitsu.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Michal Nazarewicz <mina86@mina86.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b959ed6c
    • T
      x86/mm: factor out of top-down direct mapping setup · 0167d7d8
      Tang Chen 提交于
      Create a new function memory_map_top_down to factor out of the top-down
      direct memory mapping pagetable setup.  This is also a preparation for the
      following patch, which will introduce the bottom-up memory mapping.  That
      said, we will put the two ways of pagetable setup into separate functions,
      and choose to use which way in init_mem_mapping, which makes the code more
      clear.
      Signed-off-by: NTang Chen <tangchen@cn.fujitsu.com>
      Signed-off-by: NZhang Yanfei <zhangyanfei@cn.fujitsu.com>
      Acked-by: NTejun Heo <tj@kernel.org>
      Acked-by: NToshi Kani <toshi.kani@hp.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
      Cc: Thomas Renninger <trenn@suse.de>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Jiang Liu <jiang.liu@huawei.com>
      Cc: Wen Congyang <wency@cn.fujitsu.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: Taku Izumi <izumi.taku@jp.fujitsu.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Michal Nazarewicz <mina86@mina86.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      0167d7d8
    • J
      mm/arch: use NUMA_NO_NODE · 40c3baa7
      Jianguo Wu 提交于
      Use more appropriate NUMA_NO_NODE instead of -1 in all archs' module_alloc()
      Signed-off-by: NJianguo Wu <wujianguo@huawei.com>
      Acked-by: NDavid Rientjes <rientjes@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      40c3baa7
  2. 12 11月, 2013 1 次提交
    • I
      Revert "x86/UV: Add uvtrace support" · b5dfcb09
      Ingo Molnar 提交于
      This reverts commit 8eba1842.
      
      uv_trace() is not used by anything, nor is uv_trace_nmi_func, nor
      uv_trace_func.
      
      That's not how we do instrumentation code in the kernel: we add
      tracepoints, printk()s, etc. so that everyone not just those with
      magic kernel modules can debug a system.
      
      So remove this unused (and misguied) piece of code.
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      Cc: Mike Travis <travis@sgi.com>
      Cc: Dimitri Sivanich <sivanich@sgi.com>
      Cc: Hedi Berriche <hedi@sgi.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
      Cc: Jason Wessel <jason.wessel@windriver.com>
      Link: http://lkml.kernel.org/n/tip-tumfBffmr4jmnt8Gyxanoblg@git.kernel.org
      b5dfcb09
  3. 09 11月, 2013 6 次提交
  4. 08 11月, 2013 2 次提交
  5. 07 11月, 2013 5 次提交
  6. 06 11月, 2013 6 次提交
  7. 05 11月, 2013 1 次提交
  8. 31 10月, 2013 3 次提交
    • L
      ACPICA: Cleanup asmlinkage for ACPICA APIs. · 40bce100
      Lv Zheng 提交于
      Add an asmlinkage wrapper around acpi_enter_sleep_state() to prevent
      an empty stub from being called by assmebly code for ACPI_REDUCED_HARDWARE
      set.
      
      As arch/x86/kernel/acpi/wakeup_xx.S is only compiled when CONFIG_ACPI=y
      and there are no users of ACPI_HARDWARE_REDUCED, currently this is in
      fact not a real issue, but a cleanup to reduce source code differences
      between Linux and ACPICA upstream.
      
      [rjw: Changelog]
      Signed-off-by: NLv Zheng <lv.zheng@intel.com>
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      40bce100
    • D
      x86, hyperv: Fix build error due to missing <asm/apic.h> include · d68ce017
      David Rientjes 提交于
      9e7827b5 ("x86, hyperv: Get the local APIC timer frequency from the
      hypervisor") breaks the build with some configs because apic.h isn't
      directly included:
      
      arch/x86/kernel/cpu/mshyperv.c: In function 'ms_hyperv_init_platform':
      arch/x86/kernel/cpu/mshyperv.c:90:3: error: 'lapic_timer_frequency' undeclared (first use in this function)
      arch/x86/kernel/cpu/mshyperv.c:90:3: note: each undeclared identifier is reported only once for each function it appears in
      
      Fix it by including asm/apic.h.
      Signed-off-by: NDavid Rientjes <rientjes@google.com>
      Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1310111604160.31170@chino.kir.corp.google.comAcked-by: NK. Y. Srinivasan <kys@microsoft.com>
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      d68ce017
    • G
      percpu: fix this_cpu_sub() subtrahend casting for unsigneds · bd09d9a3
      Greg Thelen 提交于
      this_cpu_sub() is implemented as negation and addition.
      
      This patch casts the adjustment to the counter type before negation to
      sign extend the adjustment.  This helps in cases where the counter type
      is wider than an unsigned adjustment.  An alternative to this patch is
      to declare such operations unsupported, but it seemed useful to avoid
      surprises.
      
      This patch specifically helps the following example:
        unsigned int delta = 1
        preempt_disable()
        this_cpu_write(long_counter, 0)
        this_cpu_sub(long_counter, delta)
        preempt_enable()
      
      Before this change long_counter on a 64 bit machine ends with value
      0xffffffff, rather than 0xffffffffffffffff.  This is because
      this_cpu_sub(pcp, delta) boils down to this_cpu_add(pcp, -delta),
      which is basically:
        long_counter = 0 + 0xffffffff
      
      Also apply the same cast to:
        __this_cpu_sub()
        __this_cpu_sub_return()
        this_cpu_sub_return()
      
      All percpu_test.ko passes, especially the following cases which
      previously failed:
      
        l -= ui_one;
        __this_cpu_sub(long_counter, ui_one);
        CHECK(l, long_counter, -1);
      
        l -= ui_one;
        this_cpu_sub(long_counter, ui_one);
        CHECK(l, long_counter, -1);
        CHECK(l, long_counter, 0xffffffffffffffff);
      
        ul -= ui_one;
        __this_cpu_sub(ulong_counter, ui_one);
        CHECK(ul, ulong_counter, -1);
        CHECK(ul, ulong_counter, 0xffffffffffffffff);
      
        ul = this_cpu_sub_return(ulong_counter, ui_one);
        CHECK(ul, ulong_counter, 2);
      
        ul = __this_cpu_sub_return(ulong_counter, ui_one);
        CHECK(ul, ulong_counter, 1);
      Signed-off-by: NGreg Thelen <gthelen@google.com>
      Acked-by: NTejun Heo <tj@kernel.org>
      Acked-by: NJohannes Weiner <hannes@cmpxchg.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      bd09d9a3
  9. 30 10月, 2013 1 次提交
    • T
      KVM: Fix modprobe failure for kvm_intel/kvm_amd · d780a312
      Tim Gardner 提交于
      The x86 specific kvm init creates a new conflicting
      debugfs directory which causes modprobe issues
      with kvm_intel and kvm_amd. For example,
      
      sudo modprobe kvm_amd
      modprobe: ERROR: could not insert 'kvm_amd': Bad address
      
      The simplest fix is to just rename the directory. The following
      KVM config options are set:
      
      CONFIG_KVM_GUEST=y
      CONFIG_KVM_DEBUG_FS=y
      CONFIG_HAVE_KVM=y
      CONFIG_HAVE_KVM_IRQCHIP=y
      CONFIG_HAVE_KVM_IRQ_ROUTING=y
      CONFIG_HAVE_KVM_EVENTFD=y
      CONFIG_KVM_APIC_ARCHITECTURE=y
      CONFIG_KVM_MMIO=y
      CONFIG_KVM_ASYNC_PF=y
      CONFIG_HAVE_KVM_MSI=y
      CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT=y
      CONFIG_KVM=m
      CONFIG_KVM_INTEL=m
      CONFIG_KVM_AMD=m
      CONFIG_KVM_DEVICE_ASSIGNMENT=y
      
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Gleb Natapov <gleb@redhat.com>
      Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Signed-off-by: NTim Gardner <tim.gardner@canonical.com>
      [Change debugfs directory name. - Paolo]
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      d780a312
  10. 29 10月, 2013 3 次提交
    • P
      perf/x86: Further optimize copy_from_user_nmi() · e00b12e6
      Peter Zijlstra 提交于
      Now that we can deal with nested NMI due to IRET re-enabling NMIs and
      can deal with faults from NMI by making sure we preserve CR2 over NMIs
      we can in fact simply access user-space memory from NMI context.
      
      So rewrite copy_from_user_nmi() to use __copy_from_user_inatomic() and
      rework the fault path to do the minimal required work before taking
      the in_atomic() fault handler.
      
      In particular avoid perf_sw_event() which would make perf recurse on
      itself (it should be harmless as our recursion protections should be
      able to deal with this -- but why tempt fate).
      
      Also rename notify_page_fault() to kprobes_fault() as that is a much
      better name; there is no notifier in it and its specific to kprobes.
      
      Don measured that his worst case NMI path shrunk from ~300K cycles to
      ~150K cycles.
      
      Cc: Stephane Eranian <eranian@google.com>
      Cc: jmario@redhat.com
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: dave.hansen@linux.intel.com
      Tested-by: NDon Zickus <dzickus@redhat.com>
      Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20131024105206.GM2490@laptop.programming.kicks-ass.netSigned-off-by: NIngo Molnar <mingo@kernel.org>
      e00b12e6
    • P
      perf/x86: Fix NMI measurements · e8a923cc
      Peter Zijlstra 提交于
      OK, so what I'm actually seeing on my WSM is that sched/clock.c is
      'broken' for the purpose we're using it for.
      
      What triggered it is that my WSM-EP is broken :-(
      
        [    0.001000] tsc: Fast TSC calibration using PIT
        [    0.002000] tsc: Detected 2533.715 MHz processor
        [    0.500180] TSC synchronization [CPU#0 -> CPU#6]:
        [    0.505197] Measured 3 cycles TSC warp between CPUs, turning off TSC clock.
        [    0.004000] tsc: Marking TSC unstable due to check_tsc_sync_source failed
      
      For some reason it consistently detects TSC skew, even though NHM+
      should have a single clock domain for 'reasonable' systems.
      
      This marks sched_clock_stable=0, which means that we do fancy stuff to
      try and get a 'sane' clock. Part of this fancy stuff relies on the tick,
      clearly that's gone when NOHZ=y. So for idle cpus time gets stuck, until
      it either wakes up or gets kicked by another cpu.
      
      While this is perfectly fine for the scheduler -- it only cares about
      actually running stuff, and when we're running stuff we're obviously not
      idle. This does somewhat break down for perf which can trigger events
      just fine on an otherwise idle cpu.
      
      So I've got NMIs get get 'measured' as taking ~1ms, which actually
      don't last nearly that long:
      
                <idle>-0     [013] d.h.   886.311970: rcu_nmi_enter <-do_nmi
        ...
                <idle>-0     [013] d.h.   886.311997: perf_sample_event_took: HERE!!! : 1040990
      
      So ftrace (which uses sched_clock(), not the fancy bits) only sees
      ~27us, but we measure ~1ms !!
      
      Now since all this measurement stuff lives in x86 code, we can actually
      fix it.
      Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
      Cc: mingo@kernel.org
      Cc: dave.hansen@linux.intel.com
      Cc: eranian@google.com
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: jmario@redhat.com
      Cc: acme@infradead.org
      Link: http://lkml.kernel.org/r/20131017133350.GG3364@laptop.programming.kicks-ass.netSigned-off-by: NIngo Molnar <mingo@kernel.org>
      e8a923cc
    • M
      x86/efi: Add EFI framebuffer earlyprintk support · 72548e83
      Matt Fleming 提交于
      It's incredibly difficult to diagnose early EFI boot issues without
      special hardware because earlyprintk=vga doesn't work on EFI systems.
      
      Add support for writing to the EFI framebuffer, via earlyprintk=efi,
      which will actually give users a chance of providing debug output.
      
      Cc: H. Peter Anvin <hpa@zytor.com>
      Acked-by: NIngo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Jones <pjones@redhat.com>
      Signed-off-by: NMatt Fleming <matt.fleming@intel.com>
      72548e83
  11. 27 10月, 2013 1 次提交
  12. 26 10月, 2013 4 次提交
    • J
      x86/time: Honor ACPI FADT flag indicating absence of a CMOS RTC · ee5872be
      Jan Beulich 提交于
      Even though the omission was found only during code review
      (originally in the Xen hypervisor, looking through ACPI v5 flags
      and their meanings and uses), we shouldn't be creating a
      corresponding platform device in that case.
      Signed-off-by: NJan Beulich <jbeulich@suse.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Link: http://lkml.kernel.org/r/5265029D02000078000FC4D2@nat28.tlf.novell.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      ee5872be
    • J
      x86/cpu: Track legacy CPU model data only on 32-bit kernels · 09dc68d9
      Jan Beulich 提交于
      struct cpu_dev's c_models is only ever set inside CONFIG_X86_32
      conditionals (or code that's being built for 32-bit only), so
      there's no use of reserving the (empty) space for the model
      names in a 64-bit kernel.
      
      Similarly, c_size_cache is only used in the #else of a
      CONFIG_X86_64 conditional, so reserving space for (and in one
      case even initializing) that field is pointless for 64-bit
      kernels too.
      
      While moving both fields to the end of the structure, I also
      noticed that:
      
       - the c_models array size was one too small, potentially causing
         table_lookup_model() to return garbage on Intel CPUs (intel.c's
         instance was lacking the sentinel with family being zero), so the
         patch bumps that by one,
      
       - c_models' vendor sub-field was unused (and anyway redundant
         with the base structure's c_x86_vendor field), so the patch deletes it.
      
      Also rename the legacy fields so that their legacy nature stands out
      and comment their declarations.
      Signed-off-by: NJan Beulich <jbeulich@suse.com>
      Link: http://lkml.kernel.org/r/5265036802000078000FC4DB@nat28.tlf.novell.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      09dc68d9
    • J
      x86: Unify copy_to_user() and add size checking to it · 7a3d9b0f
      Jan Beulich 提交于
      Similarly to copy_from_user(), where the range check is to
      protect against kernel memory corruption, copy_to_user() can
      benefit from such checking too: Here it protects against kernel
      information leaks.
      Signed-off-by: NJan Beulich <jbeulich@suse.com>
      Cc: <arjan@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/5265059502000078000FC4F6@nat28.tlf.novell.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      7a3d9b0f
    • J
      x86: Unify copy_from_user() size checking · 3df7b41a
      Jan Beulich 提交于
      Commits 4a312769 ("x86: Turn the
      copy_from_user check into an (optional) compile time warning")
      and 63312b6a ("x86: Add a
      Kconfig option to turn the copy_from_user warnings into errors")
      touched only the 32-bit variant of copy_from_user(), whereas the
      original commit 9f0cf4ad ("x86:
      Use __builtin_object_size() to validate the buffer size for
      copy_from_user()") also added the same code to the 64-bit one.
      
      Further the earlier conversion from an inline WARN() to the call
      to copy_from_user_overflow() went a little too far: When the
      number of bytes to be copied is not a constant (e.g. [looking at
      3.11] in drivers/net/tun.c:__tun_chr_ioctl() or
      drivers/pci/pcie/aer/aer_inject.c:aer_inject_write()), the
      compiler will always have to keep the funtion call, and hence
      there will always be a warning. By using __builtin_constant_p()
      we can avoid this.
      
      And then this slightly extends the effect of
      CONFIG_DEBUG_STRICT_USER_COPY_CHECKS in that apart from
      converting warnings to errors in the constant size case, it
      retains the (possibly wrong) warnings in the non-constant size
      case, such that if someone is prepared to get a few false
      positives, (s)he'll be able to recover the current behavior
      (except that these diagnostics now will never be converted to
      errors).
      
      Since the 32-bit variant (intentionally) didn't call
      might_fault(), the unification results in this being called
      twice now. Adding a suitable #ifdef would be the alternative if
      that's a problem.
      
      I'd like to point out though that with
      __compiletime_object_size() being restricted to gcc before 4.6,
      the whole construct is going to become more and more pointless
      going forward. I would question however that commit
      2fb0815c ("gcc4: disable
      __compiletime_object_size for GCC 4.6+") was really necessary,
      and instead this should have been dealt with as is done here
      from the beginning.
      Signed-off-by: NJan Beulich <jbeulich@suse.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/5265056D02000078000FC4F3@nat28.tlf.novell.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      3df7b41a