1. 12 May, 2015 · 1 commit
  2. 11 May, 2015 · 1 commit
    • x86/alternatives: Switch AMD F15h and later to the P6 NOPs · f21262b8
      Authored by Borislav Petkov
      Software optimization guides for both F15h and F16h cite those
      NOPs as the optimal ones. A microbenchmark confirms that
      actually even older families are better with the single-insn
      NOPs so switch to them for the alternatives.
      
      Cycles count below includes the loop overhead of the measurement
      but that overhead is the same with all runs.
      
      	F10h, revE:
      	-----------
      	Running NOP tests, 1000 NOPs x 1000000 repetitions
      
      	K8:
      			      90     288.212282 cycles
      			   66 90     288.220840 cycles
      			66 66 90     288.219447 cycles
      		     66 66 66 90     288.223204 cycles
      		  66 66 90 66 90     571.393424 cycles
      	       66 66 90 66 66 90     571.374919 cycles
      	    66 66 66 90 66 66 90     572.249281 cycles
      	 66 66 66 90 66 66 66 90     571.388651 cycles
      
      	P6:
      			      90     288.214193 cycles
      			   66 90     288.225550 cycles
      			0f 1f 00     288.224441 cycles
      		     0f 1f 40 00     288.225030 cycles
      		  0f 1f 44 00 00     288.233558 cycles
      	       66 0f 1f 44 00 00     324.792342 cycles
      	    0f 1f 80 00 00 00 00     325.657462 cycles
      	 0f 1f 84 00 00 00 00 00     430.246643 cycles
      
      	F14h:
      	----
      	Running NOP tests, 1000 NOPs x 1000000 repetitions
      
      	K8:
      			      90     510.404890 cycles
      			   66 90     510.432117 cycles
      			66 66 90     510.561858 cycles
      		     66 66 66 90     510.541865 cycles
      		  66 66 90 66 90    1014.192782 cycles
      	       66 66 90 66 66 90    1014.226546 cycles
      	    66 66 66 90 66 66 90    1014.334299 cycles
      	 66 66 66 90 66 66 66 90    1014.381205 cycles
      
      	P6:
      			      90     510.436710 cycles
      			   66 90     510.448229 cycles
      			0f 1f 00     510.545100 cycles
      		     0f 1f 40 00     510.502792 cycles
      		  0f 1f 44 00 00     510.589517 cycles
      	       66 0f 1f 44 00 00     510.611462 cycles
      	    0f 1f 80 00 00 00 00     511.166794 cycles
      	 0f 1f 84 00 00 00 00 00     511.651641 cycles
      
      	F15h:
      	-----
      	Running NOP tests, 1000 NOPs x 1000000 repetitions
      
      	K8:
      			      90     243.128396 cycles
      			   66 90     243.129883 cycles
      			66 66 90     243.131631 cycles
      		     66 66 66 90     242.499324 cycles
      		  66 66 90 66 90     481.829083 cycles
      	       66 66 90 66 66 90     481.884413 cycles
      	    66 66 66 90 66 66 90     481.851446 cycles
      	 66 66 66 90 66 66 66 90     481.409220 cycles
      
      	P6:
      			      90     243.127026 cycles
      			   66 90     243.130711 cycles
      			0f 1f 00     243.122747 cycles
      		     0f 1f 40 00     242.497617 cycles
      		  0f 1f 44 00 00     245.354461 cycles
      	       66 0f 1f 44 00 00     361.930417 cycles
      	    0f 1f 80 00 00 00 00     362.844944 cycles
      	 0f 1f 84 00 00 00 00 00     480.514948 cycles
      
      	F16h:
      	-----
      	Running NOP tests, 1000 NOPs x 1000000 repetitions
      
      	K8:
      			      90     507.793298 cycles
      			   66 90     507.789636 cycles
      			66 66 90     507.826490 cycles
      		     66 66 66 90     507.859075 cycles
      		  66 66 90 66 90    1008.663129 cycles
      	       66 66 90 66 66 90    1008.696259 cycles
      	    66 66 66 90 66 66 90    1008.692517 cycles
      	 66 66 66 90 66 66 66 90    1008.755399 cycles
      
      	P6:
      			      90     507.795232 cycles
      			   66 90     507.794761 cycles
      			0f 1f 00     507.834901 cycles
      		     0f 1f 40 00     507.822629 cycles
      		  0f 1f 44 00 00     507.838493 cycles
      	       66 0f 1f 44 00 00     507.908597 cycles
      	    0f 1f 80 00 00 00 00     507.946417 cycles
      	 0f 1f 84 00 00 00 00 00     507.954960 cycles
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1431332153-18566-2-git-send-email-bp@alien8.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
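As a rough illustration of how the P6 encodings measured above could be used to pad a patch site, the following sketch fills a buffer with as few NOP instructions as possible. The byte sequences are taken from the benchmark listing; the table name `p6_nops` and the helper `fill_nops()` are hypothetical and are not the kernel's own implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_NOP_LEN 8

/* P6 NOP encodings, indexed by (length - 1), as listed in the benchmark. */
static const unsigned char p6_nops[MAX_NOP_LEN][MAX_NOP_LEN] = {
    { 0x90 },
    { 0x66, 0x90 },
    { 0x0f, 0x1f, 0x00 },
    { 0x0f, 0x1f, 0x40, 0x00 },
    { 0x0f, 0x1f, 0x44, 0x00, 0x00 },
    { 0x66, 0x0f, 0x1f, 0x44, 0x00, 0x00 },
    { 0x0f, 0x1f, 0x80, 0x00, 0x00, 0x00, 0x00 },
    { 0x0f, 0x1f, 0x84, 0x00, 0x00, 0x00, 0x00, 0x00 },
};

/*
 * Hypothetical helper: pad `len` bytes at `buf` using the longest
 * available single-instruction NOP each time. Returns the number of
 * NOP instructions written.
 */
static size_t fill_nops(unsigned char *buf, size_t len)
{
    size_t count = 0;

    assert(buf != NULL || len == 0);
    while (len > 0) {
        size_t n = len < MAX_NOP_LEN ? len : MAX_NOP_LEN;

        memcpy(buf, p6_nops[n - 1], n);
        buf += n;
        len -= n;
        count++;
    }
    return count;
}
```

Padding 13 bytes this way yields one 8-byte NOP followed by one 5-byte NOP, that is, two instructions, whereas the K8 table needs two instructions for every pad longer than 4 bytes.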
  3. 10 May, 2015 · 4 commits
  4. 08 May, 2015 · 6 commits
    • x86/entry: Define 'cpu_current_top_of_stack' for 64-bit code · 3a23208e
      Authored by Denys Vlasenko
      32-bit code has PER_CPU_VAR(cpu_current_top_of_stack).
      64-bit code uses the somewhat more obscure PER_CPU_VAR(cpu_tss + TSS_sp0).
      
      Define the 'cpu_current_top_of_stack' macro on CONFIG_X86_64
      as well so that the PER_CPU_VAR(cpu_current_top_of_stack)
      expression can be used in both 32-bit and 64-bit code.
      Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Drewry <wad@chromium.org>
      Link: http://lkml.kernel.org/r/1429889495-27850-3-git-send-email-dvlasenk@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/entry: Remove unused 'kernel_stack' per-cpu variable · fed7c3f0
      Authored by Denys Vlasenko
      Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
      Acked-by: Andy Lutomirski <luto@kernel.org>
      Cc: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Drewry <wad@chromium.org>
      Link: http://lkml.kernel.org/r/1429889495-27850-2-git-send-email-dvlasenk@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/entry: Stop using PER_CPU_VAR(kernel_stack) · 63332a84
      Authored by Denys Vlasenko
      PER_CPU_VAR(kernel_stack) is redundant:
      
        - On the 64-bit build, we can use PER_CPU_VAR(cpu_tss + TSS_sp0).
        - On the 32-bit build, we can use PER_CPU_VAR(cpu_current_top_of_stack).
      
      PER_CPU_VAR(kernel_stack) will be deleted by a separate change.
      Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Drewry <wad@chromium.org>
      Link: http://lkml.kernel.org/r/1429889495-27850-1-git-send-email-dvlasenk@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86: Force inlining of atomic ops · 2a4e90b1
      Authored by Denys Vlasenko
      With both gcc 4.7.2 and 4.9.2, sometimes gcc mysteriously
      doesn't inline very small functions we expect to be inlined:
      
      $ nm --size-sort vmlinux | grep -iF ' t ' | uniq -c | grep -v '^ *1 ' | sort -rn
          473 000000000000000b t spin_unlock_irqrestore
          449 000000000000005f t rcu_read_unlock
          355 0000000000000009 t atomic_inc                <== THIS
          353 000000000000006e t rcu_read_lock
          350 0000000000000075 t rcu_read_lock_sched_held
          291 000000000000000b t spin_unlock
          266 0000000000000019 t arch_local_irq_restore
          215 000000000000000b t spin_lock
          180 0000000000000011 t kzalloc
          165 0000000000000012 t list_add_tail
          161 0000000000000019 t arch_local_save_flags
          153 0000000000000016 t test_and_set_bit
          134 000000000000000b t spin_unlock_irq
          134 0000000000000009 t atomic_dec                <== THIS
          130 000000000000000b t spin_unlock_bh
          122 0000000000000010 t brelse
          120 0000000000000016 t test_and_clear_bit
          120 000000000000000b t spin_lock_irq
          119 000000000000001e t get_dma_ops
          117 0000000000000053 t cpumask_next
          116 0000000000000036 t kref_get
          114 000000000000001a t schedule_work
          106 000000000000000b t spin_lock_bh
          103 0000000000000019 t arch_local_irq_disable
      ...
      
      Note the sizes of the marked functions: they are merely 9 bytes long!
      Selecting functions with 'atomic' in their names:
      
          355 0000000000000009 t atomic_inc
          134 0000000000000009 t atomic_dec
           98 0000000000000014 t atomic_dec_and_test
           31 000000000000000e t atomic_add_return
           27 000000000000000a t atomic64_inc
           26 000000000000002f t kmap_atomic
           24 0000000000000009 t atomic_add
           12 0000000000000009 t atomic_sub
           10 0000000000000021 t __atomic_add_unless
           10 000000000000000a t atomic64_add
            5 000000000000001f t __atomic_add_unless.constprop.7
            5 000000000000000a t atomic64_dec
            4 000000000000001f t __atomic_add_unless.constprop.18
            4 000000000000001f t __atomic_add_unless.constprop.12
            4 000000000000001f t __atomic_add_unless.constprop.10
            3 000000000000001f t __atomic_add_unless.constprop.13
            3 0000000000000011 t atomic64_add_return
            2 000000000000001f t __atomic_add_unless.constprop.9
            2 000000000000001f t __atomic_add_unless.constprop.8
            2 000000000000001f t __atomic_add_unless.constprop.6
            2 000000000000001f t __atomic_add_unless.constprop.5
            2 000000000000001f t __atomic_add_unless.constprop.3
            2 000000000000001f t __atomic_add_unless.constprop.22
            2 000000000000001f t __atomic_add_unless.constprop.14
            2 000000000000001f t __atomic_add_unless.constprop.11
            2 000000000000001e t atomic_dec_if_positive
            2 0000000000000014 t atomic_inc_and_test
            2 0000000000000011 t atomic_add_return.constprop.4
            2 0000000000000011 t atomic_add_return.constprop.17
            2 0000000000000011 t atomic_add_return.constprop.16
            2 000000000000000d t atomic_inc.constprop.4
            2 000000000000000c t atomic_cmpxchg
      
      This patch fixes this for x86 atomic ops via
      s/inline/__always_inline/. This decreases allyesconfig kernel by
      about 25k:
      
          text     data      bss       dec     hex filename
      82399481 22255416 20627456 125282353 777a831 vmlinux.before
      82375570 22255544 20627456 125258570 7774b4a vmlinux
      Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Drewry <wad@chromium.org>
      Link: http://lkml.kernel.org/r/1431080762-17797-1-git-send-email-dvlasenk@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
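The s/inline/__always_inline/ change described above can be pictured with a small user-space sketch. It assumes a GCC- or Clang-compatible compiler; the macro `demo_always_inline`, the type `demo_atomic_t`, and the `demo_*` functions are hypothetical stand-ins, and the compiler `__atomic` builtins replace the kernel's inline assembly.

```c
#include <assert.h>

/* Hypothetical stand-in for the kernel's __always_inline. */
#define demo_always_inline inline __attribute__((__always_inline__))

typedef struct { int counter; } demo_atomic_t;

/*
 * With plain `inline` the compiler is free to emit an out-of-line copy
 * per translation unit; the attribute forces expansion at each call site.
 */
static demo_always_inline void demo_atomic_inc(demo_atomic_t *v)
{
    __atomic_fetch_add(&v->counter, 1, __ATOMIC_SEQ_CST);
}

static demo_always_inline void demo_atomic_dec(demo_atomic_t *v)
{
    __atomic_fetch_sub(&v->counter, 1, __ATOMIC_SEQ_CST);
}

static demo_always_inline int demo_atomic_read(demo_atomic_t *v)
{
    return __atomic_load_n(&v->counter, __ATOMIC_SEQ_CST);
}

/* Usage: two increments and one decrement leave the counter at 1. */
static int demo_usage(void)
{
    demo_atomic_t v = { 0 };

    demo_atomic_inc(&v);
    demo_atomic_inc(&v);
    demo_atomic_dec(&v);
    assert(demo_atomic_read(&v) == 1);
    return demo_atomic_read(&v);
}
```

Running the same `nm --size-sort` pipeline on an object built from such code is one way to check that no standalone `demo_atomic_inc` symbol remains.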
    • x86/asm/entry/64: Clean up usage of TEST insns · 03335e95
      Authored by Denys Vlasenko
      By the nature of the TEST operation, it is often possible
      to test a narrower part of the operand:
      
          "testl $3, mem"  -> "testb $3, mem"
      
      This results in shorter insns, because the TEST insn, unlike
      other ALU ops, has no sign-extending byte-immediate forms.
      
         text	   data	    bss	    dec	    hex	filename
        11674	      0	      0	  11674	   2d9a	entry_64.o.before
        11658	      0	      0	  11658	   2d8a	entry_64.o
      
      Changes in object code:
      
      -	f7 84 24 88 00 00 00 03 00 00 00 	testl  $0x3,0x88(%rsp)
      +	f6 84 24 88 00 00 00 03	         	testb  $0x3,0x88(%rsp)
      -	f7 44 24 68 03 00 00 00          	testl  $0x3,0x68(%rsp)
      +	f6 44 24 68 03                  	testb  $0x3,0x68(%rsp)
      -	f7 84 24 90 00 00 00 03 00 00 00	testl  $0x3,0x90(%rsp)
      +	f6 84 24 90 00 00 00 03         	testb  $0x3,0x90(%rsp)
      Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
      Acked-by: Andy Lutomirski <luto@kernel.org>
      Cc: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Drewry <wad@chromium.org>
      Link: http://lkml.kernel.org/r/1430140912-7960-2-git-send-email-dvlasenk@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
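Why the narrower TEST is safe can be sketched in C. The example assumes a little-endian machine such as x86, where the lowest-addressed byte of a 32-bit value holds bits 0-7; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Wide form: load all 32 bits, then mask (what "testl $3, mem" checks). */
static int low_bits_set_wide(const uint32_t *mem)
{
    return (*mem & 3u) != 0;
}

/*
 * Narrow form: load only the lowest-addressed byte (what
 * "testb $3, mem" checks). On a little-endian machine that byte
 * carries bits 0-7, so masking with 3 gives the same answer.
 */
static int low_bits_set_narrow(const uint32_t *mem)
{
    uint8_t low;

    memcpy(&low, mem, sizeof(low));
    return (low & 3u) != 0;
}
```

Both functions agree for every input on little-endian hardware because the mask 3 only touches bits inside the first byte. A mask with bits above bit 7 could not be narrowed this way without also adjusting the address.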
    • x86/asm/entry/64: Tidy up JZ insns after TESTs · dde74f2e
      Authored by Denys Vlasenko
      After TESTs, use logically correct JZ/JNZ mnemonics instead of
      JE/JNE. This doesn't change code.
      Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
      Acked-by: Andy Lutomirski <luto@kernel.org>
      Cc: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Drewry <wad@chromium.org>
      Link: http://lkml.kernel.org/r/1430140912-7960-1-git-send-email-dvlasenk@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  5. 06 May, 2015 · 7 commits
  6. 05 May, 2015 · 4 commits
  7. 01 May, 2015 · 1 commit
    • x86/PCI/ACPI: Make all resources except [io 0xcf8-0xcff] available on PCI bus · 2c62e849
      Authored by Jiang Liu
      An IO port or MMIO resource assigned to a PCI host bridge may be
      consumed by the host bridge itself or available to its child
      bus/devices. The ACPI specification defines a bit (Producer/Consumer)
      to tell whether the resource is consumed by the host bridge itself,
      but firmware hasn't used that bit consistently, so we can't rely on it.
      
      Before commit 593669c2 ("x86/PCI/ACPI: Use common ACPI resource
      interfaces to simplify implementation"), arch/x86/pci/acpi.c ignored
      all IO port resources defined by acpi_resource_io and
      acpi_resource_fixed_io to filter out IO ports consumed by the host
      bridge itself.
      
      Commit 593669c2 ("x86/PCI/ACPI: Use common ACPI resource interfaces
      to simplify implementation") started accepting all IO port and MMIO
      resources, which caused a regression that IO port resources consumed
      by the host bridge itself became available to its child devices.
      
      Then commit 63f1789e ("x86/PCI/ACPI: Ignore resources consumed by
      host bridge itself") ignored resources consumed by the host bridge
      itself by checking the IORESOURCE_WINDOW flag, which accidentally removed
      MMIO resources defined by acpi_resource_memory24, acpi_resource_memory32
      and acpi_resource_fixed_memory32.
      
      On x86 and IA64 platforms, all IO port and MMIO resources are assumed
      to be available to child bus/devices except one special case:
          IO port [0xCF8-0xCFF] is consumed by the host bridge itself
          to access PCI configuration space.
      
      So explicitly filter out PCI CFG IO ports [0xCF8-0xCFF]. This solution
      will also ease the way to consolidate ACPI PCI host bridge common code
      from x86, ia64 and ARM64.
      
      Related ACPI tables are archived at:
      https://bugzilla.kernel.org/show_bug.cgi?id=94221
      
      Related discussions at:
      http://patchwork.ozlabs.org/patch/461633/
      https://lkml.org/lkml/2015/3/29/304
      
      Fixes: 63f1789e ("x86/PCI/ACPI: Ignore resources consumed by host bridge itself")
      Reported-by: Bernhard Thaler <bernhard.thaler@wvnet.at>
      Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
      Cc: 4.0+ <stable@vger.kernel.org> # 4.0+
      Reviewed-by: Bjorn Helgaas <bhelgaas@google.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
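The filtering rule described in this commit can be sketched as a simple predicate. This is a minimal illustration under stated assumptions: `struct demo_resource` and both function names are hypothetical, the descriptor is far simpler than the kernel's `struct resource`, and matching the range exactly (rather than any overlap) is a choice made for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified resource descriptor. */
enum demo_res_type { DEMO_RES_IO, DEMO_RES_MEM };

struct demo_resource {
    enum demo_res_type type;
    uint64_t start;
    uint64_t end;   /* inclusive */
};

#define PCI_CFG_IO_START 0xCF8u
#define PCI_CFG_IO_END   0xCFFu

/* True only for the IO port range [0xCF8-0xCFF]. */
static bool demo_is_pcicfg_ioport(const struct demo_resource *res)
{
    return res->type == DEMO_RES_IO &&
           res->start == PCI_CFG_IO_START &&
           res->end == PCI_CFG_IO_END;
}

/* Everything except the PCI CFG ports is handed to the child bus. */
static bool demo_available_to_children(const struct demo_resource *res)
{
    assert(res->start <= res->end);
    return !demo_is_pcicfg_ioport(res);
}
```

Note that an MMIO window that happens to cover the same numeric range is still passed on, since only IO port resources are compared against the PCI configuration ports.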
  8. 30 April, 2015 · 1 commit
    • xen: Suspend ticks on all CPUs during suspend · 2b953a5e
      Authored by Boris Ostrovsky
      Commit 77e32c89 ("clockevents: Manage device's state separately for
      the core") decouples clockevent device's modes from states. With this
      change when a Xen guest tries to resume, it won't be calling its
      set_mode op which needs to be done on each VCPU in order to make the
      hypervisor aware that we are in oneshot mode.
      
      This happens because clockevents_tick_resume() (which is an intermediate
      step of resuming ticks on a processor) doesn't call clockevents_set_state()
      anymore and because during suspend clockevent devices on all VCPUs (except
      for the one doing the suspend) are left in ONESHOT state. As a result, during
      resume the clockevents state machine will assume that device is already
      where it should be and doesn't need to be updated.
      
      To avoid this problem we should suspend ticks on all VCPUs during
      suspend.
      Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: David Vrabel <david.vrabel@citrix.com>
  9. 27 April, 2015 · 3 commits
    • x86: pvclock: Really remove the sched notifier for cross-cpu migrations · 73459e2a
      Authored by Paolo Bonzini
      This reverts commits 0a4e6be9
      and 80f7fdb1.
      
      The task migration notifier was originally introduced in order to support
      the pvclock vsyscall with non-synchronized TSC, but KVM only supports it
      with synchronized TSC.  Hence, on KVM the race-condition handling is only needed
      due to a bad implementation on the host side, and even then it's so rare
      that it's mostly theoretical.
      
      As far as KVM is concerned it's possible to fix the host, avoiding the
      additional complexity in the vDSO and the (re)introduction of the task
      migration notifier.
      
      Xen, on the other hand, hasn't yet implemented vsyscall support at
      all, so we do not care about its plans for non-synchronized TSC.
      Reported-by: Peter Zijlstra <peterz@infradead.org>
      Suggested-by: Marcelo Tosatti <mtosatti@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • kvm: x86: fix kvmclock update protocol · 5dca0d91
      Authored by Radim Krčmář
      The kvmclock spec says that the host will increment a version field to
      an odd number, then update stuff, then increment it to an even number.
      The host is buggy and doesn't do this, and the result is observable
      when one vcpu reads another vcpu's kvmclock data.
      
      There's no good way for a guest kernel to keep its vdso from reading
      a different vcpu's kvmclock data, but we don't need to care about
      changing VCPUs as long as we read consistent data from kvmclock.
      (VCPU can change outside of this loop too, so it doesn't matter if we
      return a value not fit for this VCPU.)
      
      Based on a patch by Radim Krčmář.
      Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
      Acked-by: Marcelo Tosatti <mtosatti@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
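The odd/even version protocol described in this commit can be sketched with C11 atomics. This is an illustrative model, not the kvmclock ABI: `struct demo_clock`, its field names, and both functions are hypothetical. The payload fields are plain variables, which is adequate for a single-threaded demonstration; code shared between a real host and guest would need properly ordered accesses to them as well.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical, cut-down time record. */
struct demo_clock {
    atomic_uint version;        /* odd while an update is in progress */
    uint64_t tsc_timestamp;
    uint64_t system_time;
};

/* Writer (host side): make version odd, update, make version even. */
static void demo_clock_update(struct demo_clock *c, uint64_t tsc, uint64_t ns)
{
    atomic_fetch_add(&c->version, 1);   /* now odd: update in progress */
    c->tsc_timestamp = tsc;
    c->system_time = ns;
    atomic_fetch_add(&c->version, 1);   /* now even: update complete */
}

/* Reader (guest side): retry until a stable, even version is observed. */
static void demo_clock_read(struct demo_clock *c, uint64_t *tsc, uint64_t *ns)
{
    unsigned int before, after;

    do {
        before = atomic_load(&c->version);
        *tsc = c->tsc_timestamp;
        *ns = c->system_time;
        after = atomic_load(&c->version);
    } while ((before & 1u) || before != after);
}
```

The reader never needs to know which VCPU's record it is looking at; it only needs the version to be even and unchanged across the read, which is the consistency property the commit message relies on.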
    • x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue · 61f01dd9
      Authored by Andy Lutomirski
      AMD CPUs don't reinitialize the SS descriptor on SYSRET, so SYSRET with
      SS == 0 results in an invalid usermode state in which SS is apparently
      equal to __USER_DS but causes #SS if used.
      
      Work around the issue by setting SS to __KERNEL_DS in __switch_to, thus
      ensuring that SYSRET never happens with SS set to NULL.
      
      This was exposed by a recent vDSO cleanup.
      
      Fixes: e7d6eefa ("x86/vdso32/syscall.S: Do not load __USER32_DS to %ss")
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Cc: Peter Anvin <hpa@zytor.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Denys Vlasenko <vda.linux@googlemail.com>
      Cc: Brian Gerst <brgerst@gmail.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  10. 24 April, 2015 · 12 commits
    • x86: fix special __probe_kernel_write() tail zeroing case · d869844b
      Authored by Linus Torvalds
      Commit cae2a173 ("x86: clean up/fix 'copy_in_user()' tail zeroing")
      fixed the failure case tail zeroing of one special case of the x86-64
      generic user-copy routine, namely when used for the user-to-user case
      ("copy_in_user()").
      
      But in the process it broke an even more unusual case: using the user
      copy routine for kernel-to-kernel copying.
      
      Now, normally kernel-kernel copies are obviously done using memcpy(),
      but we have a couple of special cases when we use the user-copy
      functions.  One is when we pass a kernel buffer to a regular user-buffer
      routine, using set_fs(KERNEL_DS).  That's a "normal" case, and continued
      to work fine, because it never takes any faults (with the possible
      exception of a silent and successful vmalloc fault).
      
      But Jan Beulich pointed out another, very unusual, special case: when we
      use the user-copy routines not because it's a path that expects a user
      pointer, but for a couple of ftrace/kgdb cases that want to do a kernel
      copy, but do so using "unsafe" buffers, and use the user-copy routine to
      gracefully handle faults.  IOW, for probe_kernel_write().
      
      And that broke for the case of a faulting kernel destination, because we
      saw the kernel destination and wanted to try to clear the tail of the
      buffer.  Which doesn't work, since that's what faults.
      
      This only triggers for things like kgdb and ftrace users (e.g. trying
      to set a breakpoint on read-only memory), but it's definitely a bug.
      The fix is to not compare against the kernel address start (TASK_SIZE),
      but instead use the same limits "access_ok()" uses.
      Reported-and-tested-by: Jan Beulich <jbeulich@suse.com>
      Cc: stable@vger.kernel.org # 4.0
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • x86/irq: Avoid memory allocation in __assign_irq_vector() · f7fa7aee
      Authored by Jiang Liu
      Function __assign_irq_vector() is protected by vector_lock, so use
      a global temporary cpu_mask to avoid allocating/freeing cpu_mask.
      Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: David Cohen <david.a.cohen@linux.intel.com>
      Cc: Sander Eikelenboom <linux@eikelenboom.it>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dimitri Sivanich <sivanich@sgi.com>
      Link: http://lkml.kernel.org/r/1428978610-28986-34-git-send-email-jiang.liu@linux.intel.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • x86/irq: Move irqdomain specific code into asm/irqdomain.h · d746d1eb
      Authored by Jiang Liu
      Now that we have a dedicated asm/irqdomain.h, move irqdomain-specific
      code into it.
      Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: David Cohen <david.a.cohen@linux.intel.com>
      Cc: Sander Eikelenboom <linux@eikelenboom.it>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dimitri Sivanich <sivanich@sgi.com>
      Cc: Joerg Roedel <jroedel@suse.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Link: http://lkml.kernel.org/r/1428978610-28986-33-git-send-email-jiang.liu@linux.intel.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • x86: Cleanup irq_domain ops · f7a0c786
      Authored by Thomas Gleixner
      We have 3 identical copies of the ioapic domain ops for acpi, mpparse,
      and sfi. Have a global one in the io_apic code and be done with it.
      
      To avoid include hell in io_apic.h, create a private irqdomain header
      and include the generic irqdomain header from there.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: David Cohen <david.a.cohen@linux.intel.com>
      Cc: Sander Eikelenboom <linux@eikelenboom.it>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: sfi-devel@simplefirmware.org
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dimitri Sivanich <sivanich@sgi.com>
      Cc: Len Brown <len.brown@intel.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Grant Likely <grant.likely@linaro.org>
      Cc: Rob Herring <robh@kernel.org>
      Cc: x86@kernel.org
      Link: http://lkml.kernel.org/r/1428978610-28986-32-git-send-email-jiang.liu@linux.intel.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • x86,ioapic: Cleanup irq_trigger/polarity() · ab76085e
      Authored by Thomas Gleixner
      These functions are full of pointless indentations, useless comments
      and even more useless printks.
      
      Clean them up.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: David Cohen <david.a.cohen@linux.intel.com>
      Cc: Sander Eikelenboom <linux@eikelenboom.it>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dimitri Sivanich <sivanich@sgi.com>
      Cc: Grant Likely <grant.likely@linaro.org>
      Link: http://lkml.kernel.org/r/1428978610-28986-31-git-send-email-jiang.liu@linux.intel.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Jiang Liu <jiang.liu@linux.intel.com>
      Cc: x86@kernel.org
      Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
    • x86, ioapic: Use proper defines for the entry fields · 335efdf5
      Authored by Thomas Gleixner
      While looking at the printout issue, I stumbled more than once over
      the various 0/1 assignments which are either commented in strange ways
      or force one to look up the meaning.
      
      Use proper constants and fix the misleading comments. While at it
      remove pointless 0 assignments in native_disable_io_apic() which have
      no value for understanding the code.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: David Cohen <david.a.cohen@linux.intel.com>
      Cc: Sander Eikelenboom <linux@eikelenboom.it>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dimitri Sivanich <sivanich@sgi.com>
      Cc: Grant Likely <grant.likely@linaro.org>
      Cc: x86@kernel.org
      Link: http://lkml.kernel.org/r/1428978610-28986-30-git-send-email-jiang.liu@linux.intel.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • x86/irq, ACPI: Remove private function mp_register_gsi()/mp_unregister_gsi() · 46176f39
      Authored by Jiang Liu
      Function mp_register_gsi() is only called once, so fold it into caller
      acpi_register_gsi_ioapic(). Do the same for mp_unregister_gsi().
      Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
      Tested-by: Joerg Roedel <jroedel@suse.de>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: David Cohen <david.a.cohen@linux.intel.com>
      Cc: Sander Eikelenboom <linux@eikelenboom.it>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dimitri Sivanich <sivanich@sgi.com>
      Cc: Len Brown <len.brown@intel.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Link: http://lkml.kernel.org/r/1428978610-28986-29-git-send-email-jiang.liu@linux.intel.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • J
      x86/irq: Refine the way to calculate NR_IRQS · 4399b14f
      Committed by Jiang Liu
      Now that MSI is independent of IOAPIC, refine the calculation of
      NR_IRQS to support configurations with MSI enabled but IOAPIC
      disabled.
      Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
      Tested-by: Joerg Roedel <jroedel@suse.de>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: David Cohen <david.a.cohen@linux.intel.com>
      Cc: Sander Eikelenboom <linux@eikelenboom.it>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dimitri Sivanich <sivanich@sgi.com>
      Cc: Jan Beulich <JBeulich@suse.com>
      Link: http://lkml.kernel.org/r/1428978610-28986-28-git-send-email-jiang.liu@linux.intel.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • J
      x86/irq: Move private data in struct irq_cfg into dedicated data structure · 7f3262ed
      Committed by Jiang Liu
      Several fields in struct irq_cfg are private to vector.c, so move them
      into a dedicated data structure. This helps hide implementation
      details.
      Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: David Cohen <david.a.cohen@linux.intel.com>
      Cc: Sander Eikelenboom <linux@eikelenboom.it>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dimitri Sivanich <sivanich@sgi.com>
      Link: http://lkml.kernel.org/r/1428978610-28986-27-git-send-email-jiang.liu@linux.intel.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Link: http://lkml.kernel.org/r/1416901802-24211-35-git-send-email-jiang.liu@linux.intel.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Joerg Roedel <jroedel@suse.de>
    • J
      x86/irq: Move check of cfg->move_in_progress into send_cleanup_vector() · c6c2002b
      Committed by Jiang Liu
      Move the check of cfg->move_in_progress into send_cleanup_vector() to
      prepare for simplifying struct irq_cfg.
      Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
      Tested-by: Joerg Roedel <jroedel@suse.de>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: David Cohen <david.a.cohen@linux.intel.com>
      Cc: Sander Eikelenboom <linux@eikelenboom.it>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: iommu@lists.linux-foundation.org
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dimitri Sivanich <sivanich@sgi.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Link: http://lkml.kernel.org/r/1428978610-28986-26-git-send-email-jiang.liu@linux.intel.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • J
      x86/irq: Remove function apic_set_affinity() · 68f9f440
      Committed by Jiang Liu
      apic_set_affinity() no longer has any users, so remove it. Also rename
      vector_set_affinity() to apic_set_affinity() for consistency.
      Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
      Tested-by: Joerg Roedel <jroedel@suse.de>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: David Cohen <david.a.cohen@linux.intel.com>
      Cc: Sander Eikelenboom <linux@eikelenboom.it>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dimitri Sivanich <sivanich@sgi.com>
      Link: http://lkml.kernel.org/r/1428978610-28986-25-git-send-email-jiang.liu@linux.intel.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • J
      x86/irq: Make functions only used in vector.c static · f970510c
      Committed by Jiang Liu
      Functions {assign|clear}_irq_vector() and apic_retrigger_irq() are
      used only in vector.c, so make them static.
      Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
      Tested-by: Joerg Roedel <jroedel@suse.de>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: David Cohen <david.a.cohen@linux.intel.com>
      Cc: Sander Eikelenboom <linux@eikelenboom.it>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dimitri Sivanich <sivanich@sgi.com>
      Link: http://lkml.kernel.org/r/1428978610-28986-24-git-send-email-jiang.liu@linux.intel.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>