提交 · ecb6d6185b3ae40067330eb889977bf2a51f7429 · openanolis / cloud-kernel

20 3月, 2015 2 次提交

KVM: PPC: Book3S HV: Endian fix for accessing VPA yield count · ecb6d618

由 Paul Mackerras 提交于 3月 20, 2015

The VPA (virtual processor area) is defined by PAPR and is therefore
big-endian, so we need a be32_to_cpu when reading it in
kvmppc_get_yield_count().  Without this, H_CONFER always fails on a
little-endian host, causing SMP guests to waste time spinning on
spinlocks.
Signed-off-by: NPaul Mackerras <paulus@samba.org>
Signed-off-by: NAlexander Graf <agraf@suse.de>

ecb6d618

KVM: PPC: Book3S HV: Fix spinlock/mutex ordering issue in kvmppc_set_lpcr() · 8f902b00

由 Paul Mackerras 提交于 3月 20, 2015

Currently, kvmppc_set_lpcr() has a spinlock around the whole function,
and inside that does mutex_lock(&kvm->lock).  It is not permitted to
take a mutex while holding a spinlock, because the mutex_lock might
call schedule().  In addition, this causes lockdep to warn about a
lock ordering issue:

======================================================
[ INFO: possible circular locking dependency detected ]
3.18.0-kvm-04645-gdfea862-dirty #131 Not tainted
-------------------------------------------------------
qemu-system-ppc/8179 is trying to acquire lock:
 (&kvm->lock){+.+.+.}, at: [<d00000000ecc1f54>] .kvmppc_set_lpcr+0xf4/0x1c0 [kvm_hv]

but task is already holding lock:
 (&(&vcore->lock)->rlock){+.+...}, at: [<d00000000ecc1ea0>] .kvmppc_set_lpcr+0x40/0x1c0 [kvm_hv]

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (&(&vcore->lock)->rlock){+.+...}:
       [<c000000000b3c120>] .mutex_lock_nested+0x80/0x570
       [<d00000000ecc7a14>] .kvmppc_vcpu_run_hv+0xc4/0xe40 [kvm_hv]
       [<d00000000eb9f5cc>] .kvmppc_vcpu_run+0x2c/0x40 [kvm]
       [<d00000000eb9cb24>] .kvm_arch_vcpu_ioctl_run+0x54/0x160 [kvm]
       [<d00000000eb94478>] .kvm_vcpu_ioctl+0x4a8/0x7b0 [kvm]
       [<c00000000026cbb4>] .do_vfs_ioctl+0x444/0x770
       [<c00000000026cfa4>] .SyS_ioctl+0xc4/0xe0
       [<c000000000009264>] syscall_exit+0x0/0x98

-> #0 (&kvm->lock){+.+.+.}:
       [<c0000000000ff28c>] .lock_acquire+0xcc/0x1a0
       [<c000000000b3c120>] .mutex_lock_nested+0x80/0x570
       [<d00000000ecc1f54>] .kvmppc_set_lpcr+0xf4/0x1c0 [kvm_hv]
       [<d00000000ecc510c>] .kvmppc_set_one_reg_hv+0x4dc/0x990 [kvm_hv]
       [<d00000000eb9f234>] .kvmppc_set_one_reg+0x44/0x330 [kvm]
       [<d00000000eb9c9dc>] .kvm_vcpu_ioctl_set_one_reg+0x5c/0x150 [kvm]
       [<d00000000eb9ced4>] .kvm_arch_vcpu_ioctl+0x214/0x2c0 [kvm]
       [<d00000000eb940b0>] .kvm_vcpu_ioctl+0xe0/0x7b0 [kvm]
       [<c00000000026cbb4>] .do_vfs_ioctl+0x444/0x770
       [<c00000000026cfa4>] .SyS_ioctl+0xc4/0xe0
       [<c000000000009264>] syscall_exit+0x0/0x98

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&(&vcore->lock)->rlock);
                               lock(&kvm->lock);
                               lock(&(&vcore->lock)->rlock);
  lock(&kvm->lock);

 *** DEADLOCK ***

2 locks held by qemu-system-ppc/8179:
 #0:  (&vcpu->mutex){+.+.+.}, at: [<d00000000eb93f18>] .vcpu_load+0x28/0x90 [kvm]
 #1:  (&(&vcore->lock)->rlock){+.+...}, at: [<d00000000ecc1ea0>] .kvmppc_set_lpcr+0x40/0x1c0 [kvm_hv]

stack backtrace:
CPU: 4 PID: 8179 Comm: qemu-system-ppc Not tainted 3.18.0-kvm-04645-gdfea862-dirty #131
Call Trace:
[c000001a66c0f310] [c000000000b486ac] .dump_stack+0x88/0xb4 (unreliable)
[c000001a66c0f390] [c0000000000f8bec] .print_circular_bug+0x27c/0x3d0
[c000001a66c0f440] [c0000000000fe9e8] .__lock_acquire+0x2028/0x2190
[c000001a66c0f5d0] [c0000000000ff28c] .lock_acquire+0xcc/0x1a0
[c000001a66c0f6a0] [c000000000b3c120] .mutex_lock_nested+0x80/0x570
[c000001a66c0f7c0] [d00000000ecc1f54] .kvmppc_set_lpcr+0xf4/0x1c0 [kvm_hv]
[c000001a66c0f860] [d00000000ecc510c] .kvmppc_set_one_reg_hv+0x4dc/0x990 [kvm_hv]
[c000001a66c0f8d0] [d00000000eb9f234] .kvmppc_set_one_reg+0x44/0x330 [kvm]
[c000001a66c0f960] [d00000000eb9c9dc] .kvm_vcpu_ioctl_set_one_reg+0x5c/0x150 [kvm]
[c000001a66c0f9f0] [d00000000eb9ced4] .kvm_arch_vcpu_ioctl+0x214/0x2c0 [kvm]
[c000001a66c0faf0] [d00000000eb940b0] .kvm_vcpu_ioctl+0xe0/0x7b0 [kvm]
[c000001a66c0fcb0] [c00000000026cbb4] .do_vfs_ioctl+0x444/0x770
[c000001a66c0fd90] [c00000000026cfa4] .SyS_ioctl+0xc4/0xe0
[c000001a66c0fe30] [c000000000009264] syscall_exit+0x0/0x98

This fixes it by moving the mutex_lock()/mutex_unlock() pair outside
the spin-locked region.
Signed-off-by: NPaul Mackerras <paulus@samba.org>
Signed-off-by: NAlexander Graf <agraf@suse.de>

8f902b00

13 3月, 2015 2 次提交

KVM: VMX: Set msr bitmap correctly if vcpu is in guest mode · 670125bd

由 Wincy Van 提交于 3月 04, 2015

In commit 3af18d9c ("KVM: nVMX: Prepare for using hardware MSR bitmap"),
we are setting MSR_BITMAP in prepare_vmcs02 if we should use hardware. This
is not enough since the field will be modified by following vmx_set_efer.

Fix this by setting vmx_msr_bitmap_nested in vmx_set_msr_bitmap if vcpu is
in guest mode.
Signed-off-by: NWincy Van <fanwenyi0529@gmail.com>
Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>

670125bd

kvm: x86: i8259: return initialized data on invalid-size read · c1a6bff2

由 Petr Matousek 提交于 3月 11, 2015

If data is read from PIC with invalid access size, the return data stays
uninitialized even though success is returned.

Fix this by always initializing the data.
Signed-off-by: NPetr Matousek <pmatouse@redhat.com>
Reported-by: NNadav Amit <nadav.amit@gmail.com>
Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>

c1a6bff2

11 3月, 2015 4 次提交

arm64: KVM: Fix outdated comment about VTCR_EL2.PS · 84ed7412

由 Marc Zyngier 提交于 3月 10, 2015

Commit 87366d8c ("arm64: Add boot time configuration of
Intermediate Physical Address size") removed the hardcoded setting
of VTCR_EL2.PS to use ID_AA64MMFR0_EL1.PARange instead, but didn't
remove the (now rather misleading) comment.

Fix the comments to match reality (at least for the next few minutes).
Acked-by: NChristoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

84ed7412

arm64: KVM: Do not use pgd_index to index stage-2 pgd · 04b8dc85

由 Marc Zyngier 提交于 3月 10, 2015

The kernel's pgd_index macro is designed to index a normal, page
sized array. KVM is a bit diffferent, as we can use concatenated
pages to have a bigger address space (for example 40bit IPA with
4kB pages gives us an 8kB PGD.

In the above case, the use of pgd_index will always return an index
inside the first 4kB, which makes a guest that has memory above
0x8000000000 rather unhappy, as it spins forever in a page fault,
whist the host happilly corrupts the lower pgd.

The obvious fix is to get our own kvm_pgd_index that does the right
thing(tm).

Tested on X-Gene with a hacked kvmtool that put memory at a stupidly
high address.
Reviewed-by: NChristoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

04b8dc85

arm64: KVM: Fix stage-2 PGD allocation to have per-page refcounting · a987370f

由 Marc Zyngier 提交于 3月 10, 2015

We're using __get_free_pages with to allocate the guest's stage-2
PGD. The standard behaviour of this function is to return a set of
pages where only the head page has a valid refcount.

This behaviour gets us into trouble when we're trying to increment
the refcount on a non-head page:

page:ffff7c00cfb693c0 count:0 mapcount:0 mapping:          (null) index:0x0
flags: 0x4000000000000000()
page dumped because: VM_BUG_ON_PAGE((*({ __attribute__((unused)) typeof((&page->_count)->counter) __var = ( typeof((&page->_count)->counter)) 0; (volatile typeof((&page->_count)->counter) *)&((&page->_count)->counter); })) <= 0)
BUG: failure at include/linux/mm.h:548/get_page()!
Kernel panic - not syncing: BUG!
CPU: 1 PID: 1695 Comm: kvm-vcpu-0 Not tainted 4.0.0-rc1+ #3825
Hardware name: APM X-Gene Mustang board (DT)
Call trace:
[<ffff80000008a09c>] dump_backtrace+0x0/0x13c
[<ffff80000008a1e8>] show_stack+0x10/0x1c
[<ffff800000691da8>] dump_stack+0x74/0x94
[<ffff800000690d78>] panic+0x100/0x240
[<ffff8000000a0bc4>] stage2_get_pmd+0x17c/0x2bc
[<ffff8000000a1dc4>] kvm_handle_guest_abort+0x4b4/0x6b0
[<ffff8000000a420c>] handle_exit+0x58/0x180
[<ffff80000009e7a4>] kvm_arch_vcpu_ioctl_run+0x114/0x45c
[<ffff800000099df4>] kvm_vcpu_ioctl+0x2e0/0x754
[<ffff8000001c0a18>] do_vfs_ioctl+0x424/0x5c8
[<ffff8000001c0bfc>] SyS_ioctl+0x40/0x78
CPU0: stopping

A possible approach for this is to split the compound page using
split_page() at allocation time, and change the teardown path to
free one page at a time.  It turns out that alloc_pages_exact() and
free_pages_exact() does exactly that.

While we're at it, the PGD allocation code is reworked to reduce
duplication.

This has been tested on an X-Gene platform with a 4kB/48bit-VA host
kernel, and kvmtool hacked to place memory in the second page of
the hardware PGD (PUD for the host kernel). Also regression-tested
on a Cubietruck (Cortex-A7).

 [ Reworked to use alloc_pages_exact() and free_pages_exact() and to
   return pointers directly instead of by reference as arguments
    - Christoffer ]
Reported-by: NMark Rutland <mark.rutland@arm.com>
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

a987370f

kvm: move advertising of KVM_CAP_IRQFD to common code · dc9be0fa

由 Paolo Bonzini 提交于 3月 05, 2015

POWER supports irqfds but forgot to advertise them.  Some userspace does
not check for the capability, but others check it---thus they work on
x86 and s390 but not POWER.

To avoid that other architectures in the future make the same mistake, let
common code handle KVM_CAP_IRQFD the same way as KVM_CAP_IRQFD_RESAMPLE.
Reported-and-tested-by: NGreg Kurz <gkurz@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org
Fixes: 297e2105Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>

dc9be0fa

06 3月, 2015 3 次提交

arm64: Don't use is_module_addr in setting page attributes · 8b5f5a07

由 Laura Abbott 提交于 2月 25, 2015

The set_memory_* functions currently only support module
addresses. The addresses are validated using is_module_addr.
That function is special though and relies on internal state
in the module subsystem to work properly. At the time of
module initialization and calling set_memory_*, it's too early
for is_module_addr to work properly so it always returns
false. Rather than be subject to the whims of the module state,
just bounds check against the module virtual address range.
Signed-off-by: NLaura Abbott <lauraa@codeaurora.org>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

8b5f5a07

x86/fpu/xsaves: Fix improper uses of __ex_table · 06c8173e

由 Quentin Casasnovas 提交于 3月 05, 2015

Commit:

  f31a9f7c ("x86/xsaves: Use xsaves/xrstors to save and restore xsave area")

introduced alternative instructions for XSAVES/XRSTORS and commit:

  adb9d526 ("x86/xsaves: Add xsaves and xrstors support for booting time")

added support for the XSAVES/XRSTORS instructions at boot time.

Unfortunately both failed to properly protect them against faulting:

The 'xstate_fault' macro will use the closest label named '1'
backward and that ends up in the .altinstr_replacement section
rather than in .text. This means that the kernel will never find
in the __ex_table the .text address where this instruction might
fault, leading to serious problems if userspace manages to
trigger the fault.
Signed-off-by: NQuentin Casasnovas <quentin.casasnovas@oracle.com>
Signed-off-by: NJamie Iles <jamie.iles@oracle.com>
[ Improved the changelog, fixed some whitespace noise. ]
Acked-by: NBorislav Petkov <bp@alien8.de>
Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
Cc: <stable@vger.kernel.org>
Cc: Allan Xavier <mr.a.xavier@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: adb9d526 ("x86/xsaves: Add xsaves and xrstors support for booting time")
Fixes: f31a9f7c ("x86/xsaves: Use xsaves/xrstors to save and restore xsave area")
Signed-off-by: NIngo Molnar <mingo@kernel.org>

06c8173e

x86/intel/quark: Select COMMON_CLK · 9ab6eb51

由 Andy Shevchenko 提交于 3月 05, 2015

The commit 8bbc2a13 ("x86/intel/quark: Add Intel Quark
platform support") introduced a minimal support of Intel Quark
SoC. That allows to use core parts of the SoC. However, the SPI,
I2C, and GPIO drivers can't be selected by kernel configuration
because they depend on COMMON_CLK. The patch adds a COMMON_CLK
selection to the platfrom definition to allow user choose the drivers.
Signed-off-by: NAndy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: NOng, Boon Leong <boon.leong.ong@intel.com>
Cc: Bryan O'Donoghue <pure.logic@nexus-software.ie>
Cc: Darren Hart <dvhart@linux.intel.com>
Fixes: 8bbc2a13 ("x86/intel/quark: Add Intel Quark platform support")
Link: http://lkml.kernel.org/r/1425569044-2867-1-git-send-email-andriy.shevchenko@linux.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

9ab6eb51

05 3月, 2015 3 次提交

ARM: fix typos in smc91x platform data · 04b91701

由 Arnd Bergmann 提交于 3月 04, 2015

I recently did a rework of the smc91x driver and did some build-testing
by compiling hundreds of randconfig kernels. Unfortunately, my script
was wrong and did not actually test the configurations that mattered,
so I introduced stupid typos in almost every file I touched.

I fixed my script now, built all configurations that actually matter
and fixed all the typos, this is the result.
Signed-off-by: NArnd Bergmann <arnd@arndb.de>
Fixes: b70661c7 ("net: smc91x: use run-time configuration on all ARM machines")
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

04b91701

x86/asm/entry/64: Remove a bogus 'ret_from_fork' optimization · 956421fb

由 Andy Lutomirski 提交于 3月 05, 2015

'ret_from_fork' checks TIF_IA32 to determine whether 'pt_regs' and
the related state make sense for 'ret_from_sys_call'.  This is
entirely the wrong check.  TS_COMPAT would make a little more
sense, but there's really no point in keeping this optimization
at all.

This fixes a return to the wrong user CS if we came from int
0x80 in a 64-bit task.
Signed-off-by: NAndy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/4710be56d76ef994ddf59087aad98c000fbab9a4.1424989793.git.luto@amacapital.net
[ Backported from tip:x86/asm. ]
Signed-off-by: NIngo Molnar <mingo@kernel.org>

956421fb

dtb: change binding name to match with newer firmware DT · 2a91eb72

由 Iyappan Subramanian 提交于 3月 03, 2015

This patch fixes the backward compatibility of the older driver with the
newer firmware by making the binding unique so that the older driver won't
recognize the non-supported interfaces.

The new bindings are in sync with the newer firmware.
Signed-off-by: NIyappan Subramanian <isubramanian@apm.com>
Signed-off-by: NKeyur Chudgar <kchudgar@apm.com>
Tested-by: NMark Langsdorf <mlangsdo@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2a91eb72

04 3月, 2015 7 次提交

x86/PCI/ACPI: Ignore resources consumed by host bridge itself · 63f1789e

由 Jiang Liu 提交于 3月 04, 2015

When parsing resources for PCI host bridge, we should ignore resources
consumed by host bridge itself and only report window resources available
to child PCI busses.

Fixes: 593669c2 (x86/PCI/ACPI: Use common ACPI resource interfaces ...)
Signed-off-by: NJiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

63f1789e

KVM: s390: non-LPAR case obsolete during facilities mask init · fb5bf93f

由 Michael Mueller 提交于 2月 27, 2015

With patch "include guest facilities in kvm facility test" it is no
longer necessary to have special handling for the non-LPAR case.
Signed-off-by: NMichael Mueller <mimu@linux.vnet.ibm.com>
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>

fb5bf93f

KVM: s390: include guest facilities in kvm facility test · 981467c9

由 Michael Mueller 提交于 2月 24, 2015

Most facility related decisions in KVM have to take into account:

- the facilities offered by the underlying run container (LPAR/VM)
- the facilities supported by the KVM code itself
- the facilities requested by a guest VM

This patch adds the KVM driver requested facilities to the test routine.

It additionally renames struct s390_model_fac to kvm_s390_fac and its field
names to be more meaningful.

The semantics of the facilities stored in the KVM architecture structure
is changed. The address arch.model.fac->list now points to the guest
facility list and arch.model.fac->mask points to the KVM facility mask.

This patch fixes the behaviour of KVM for some facilities for guests
that ignore the guest visible facility bits, e.g. guests could use
transactional memory intructions on hosts supporting them even if the
chosen cpu model would not offer them.

The userspace interface is not affected by this change.
Signed-off-by: NMichael Mueller <mimu@linux.vnet.ibm.com>
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>

981467c9

KVM: s390: fix in memory copy of facility lists · 94422ee8

由 Michael Mueller 提交于 2月 26, 2015

The facility lists were not fully copied.
Signed-off-by: NMichael Mueller <mimu@linux.vnet.ibm.com>
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>

94422ee8

KVM: s390/cpacf: Fix kernel bug under z/VM · 86044c8c

由 Christian Borntraeger 提交于 2月 26, 2015

Under z/VM PQAP might trigger an operation exception if no crypto cards
are defined via APVIRTUAL or APDEDICATED.

[  386.098666] Kernel BUG at 0000000000135c56 [verbose debug info unavailable]
[  386.098693] illegal operation: 0001 ilc:2 [#1] SMP
[...]
[  386.098751] Krnl PSW : 0704c00180000000 0000000000135c56 (kvm_s390_apxa_installed+0x46/0x98)
[...]
[  386.098804]  [<000000000013627c>] kvm_arch_init_vm+0x29c/0x358
[  386.098806]  [<000000000012d008>] kvm_dev_ioctl+0xc0/0x460
[  386.098809]  [<00000000002c639a>] do_vfs_ioctl+0x332/0x508
[  386.098811]  [<00000000002c660e>] SyS_ioctl+0x9e/0xb0
[  386.098814]  [<000000000070476a>] system_call+0xd6/0x258
[  386.098815]  [<000003fffc7400a2>] 0x3fffc7400a2

Lets add an extable entry and provide a zeroed config in that case.
Reported-by: NStefan Zimmermann <stzi@linux.vnet.ibm.com>
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: NThomas Huth <thuth@linux.vnet.ibm.com>
Tested-by: NStefan Zimmermann <stzi@linux.vnet.ibm.com>

86044c8c

powerpc/iommu: Remove IOMMU device references via bus notifier · 4ad04e59

由 Nishanth Aravamudan 提交于 2月 21, 2015

After d905c5df ("PPC: POWERNV: move iommu_add_device earlier"), the
refcnt on the kobject backing the IOMMU group for a PCI device is
elevated by each call to pci_dma_dev_setup_pSeriesLP() (via
set_iommu_table_base_and_group). When we go to dlpar a multi-function
PCI device out:

        iommu_reconfig_notifier ->
                iommu_free_table ->
                        iommu_group_put
                        BUG_ON(tbl->it_group)

We trip this BUG_ON, because there are still references on the table, so
it is not freed. Fix this by moving the powernv bus notifier to common
code and calling it for both powernv and pseries.

Fixes: d905c5df ("PPC: POWERNV: move iommu_add_device earlier")
Signed-off-by: NNishanth Aravamudan <nacc@linux.vnet.ibm.com>
Tested-by: NNishanth Aravamudan <nacc@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

4ad04e59

powerpc/smp: Wait until secondaries are active & online · 875ebe94

由 Michael Ellerman 提交于 2月 24, 2015

Anton has a busy ppc64le KVM box where guests sometimes hit the infamous
"kernel BUG at kernel/smpboot.c:134!" issue during boot:

  BUG_ON(td->cpu != smp_processor_id());

Basically a per CPU hotplug thread scheduled on the wrong CPU. The oops
output confirms it:

  CPU: 0
  Comm: watchdog/130

The problem is that we aren't ensuring the CPU active bit is set for the
secondary before allowing the master to continue on. The master unparks
the secondary CPU's kthreads and the scheduler looks for a CPU to run
on. It calls select_task_rq() and realises the suggested CPU is not in
the cpus_allowed mask. It then ends up in select_fallback_rq(), and
since the active bit isnt't set we choose some other CPU to run on.

This seems to have been introduced by 6acbfb96 "sched: Fix hotplug
vs. set_cpus_allowed_ptr()", which changed from setting active before
online to setting active after online. However that was in turn fixing a
bug where other code assumed an active CPU was also online, so we can't
just revert that fix.

The simplest fix is just to spin waiting for both active & online to be
set. We already have a barrier prior to set_cpu_online() (which also
sets active), to ensure all other setup is completed before online &
active are set.

Fixes: 6acbfb96 ("sched: Fix hotplug vs. set_cpus_allowed_ptr()")
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
Signed-off-by: NAnton Blanchard <anton@samba.org>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

875ebe94

03 3月, 2015 6 次提交

KVM: s390/cpacf: Enable key wrapping by default · ed6f76b4

由 Tony Krowiak 提交于 2月 24, 2015

z/VM and LPAR enable key wrapping by default, lets do the same on KVM.
Signed-off-by: NTony Krowiak <akrowiak@linux.vnet.ibm.com>
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>

ed6f76b4

KVM: MIPS: Enable after disabling interrupt · cfec0e75

由 Tapasweni Pathak 提交于 2月 22, 2015

Enable disabled interrupt, on unsuccessful operation.

Found by Coccinelle.
Signed-off-by: NTapasweni Pathak <tapaswenipathak@gmail.com>
Acked-by: NJulia Lawall <julia.lawall@lip6.fr>
Reviewed-by: NJames Hogan <james.hogan@imgtec.com>
Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>

cfec0e75

KVM: MIPS: Fix trace event to save PC directly · b3cffac0

由 James Hogan 提交于 2月 24, 2015

Currently the guest exit trace event saves the VCPU pointer to the
structure, and the guest PC is retrieved by dereferencing it when the
event is printed rather than directly from the trace record. This isn't
safe as the printing may occur long afterwards, after the PC has changed
and potentially after the VCPU has been freed. Usually this results in
the same (wrong) PC being printed for multiple trace events. It also
isn't portable as userland has no way to access the VCPU data structure
when interpreting the trace record itself.

Lets save the actual PC in the structure so that the correct value is
accessible later.

Fixes: 669e846e ("KVM/MIPS32: MIPS arch specific APIs for KVM")
Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Cc: <stable@vger.kernel.org> # v3.10+
Acked-by: NSteven Rostedt <rostedt@goodmis.org>
Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>

b3cffac0

KVM: SVM: fix interrupt injection (apic->isr_count always 0) · f563db4b

由 Radim Krčmář 提交于 2月 27, 2015

In commit b4eef9b3, we started to use hwapic_isr_update() != NULL
instead of kvm_apic_vid_enabled(vcpu->kvm). This didn't work because
SVM had it defined and "apicv" path in apic_{set,clear}_isr() does not
change apic->isr_count, because it should always be 1. The initial
value of apic->isr_count was based on kvm_apic_vid_enabled(vcpu->kvm),
which is always 0 for SVM, so KVM could have injected interrupts when it
shouldn't.

Fix it by implicitly setting SVM's hwapic_isr_update to NULL and make the
initial isr_count depend on hwapic_isr_update() for good measure.

Fixes: b4eef9b3 ("kvm: x86: vmx: NULL out hwapic_isr_update() in case of !enable_apicv")
Reported-and-tested-by: NBorislav Petkov <bp@suse.de>
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>

f563db4b

s390/mm: fix incorrect ASCE after crst_table_downgrade · 691d5264

由 Martin Schwidefsky 提交于 3月 01, 2015

The switch_mm function does nothing in case the prev and next mm
are the same. It can happen that a crst_table_downgrade has changed
the top-level pgd in the meantime on a different CPU. Always store
the new ASCE to be picked up in entry.S.

[heiko.carstens@de.ibm.com]: Bug was introduced with git commit
53e857f3 ("s390/mm,tlb: race of lazy TLB flush vs. recreation
of TLB entries") and causes random crashes due to broken page tables
being used.
Reported-by: NDominik Vogt <vogt@linux.vnet.ibm.com>
Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>

691d5264

s390/ftrace: fix crashes when switching tracers / add notrace to cpu_relax() · a9ca8eb7

由 Heiko Carstens 提交于 2月 28, 2015

With git commit 4d92f502 ("s390: reintroduce diag 44 calls for
cpu_relax()") I reintroduced a non-trivial cpu_relax() variant on s390.

The difference to the previous variant however is that the new version is
an out-of-line function, which will be traced if function tracing is enabled.

Switching to different tracers includes instruction patching. Therefore this
is done within stop_machine() "context" to prevent that any function tracing
is going on while instructions are being patched.
With the new out-of-line variant of cpu_relax() this is not true anymore,
since cpu_relax() gets called in a busy loop by all waiting cpus within
stop_machine() until function patching is finished.
Therefore cpu_relax() must be marked notrace.

This fixes kernel crashes when frequently switching between "function" and
"function_graph" tracers.

Moving cpu_relax() to a header file again, doesn't work because of header
include order dependencies.
Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>

a9ca8eb7

01 3月, 2015 2 次提交

mm: add missing __PAGETABLE_{PUD,PMD}_FOLDED defines · c07af4f1

由 Kirill A. Shutemov 提交于 2月 27, 2015

Core mm expects __PAGETABLE_{PUD,PMD}_FOLDED to be defined if these page
table levels folded.  Usually, these defines are provided by
<asm-generic/pgtable-nopmd.h> and <asm-generic/pgtable-nopud.h>.

But some architectures fold page table levels in a custom way.  They
need to define these macros themself.  This patch adds missing defines.

The patch fixes mm->nr_pmds underflow and eliminates dead __pmd_alloc()
and __pud_alloc() on architectures without these page table levels.
Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Aaro Koskinen <aaro.koskinen@iki.fi>
Cc: David Howells <dhowells@redhat.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

c07af4f1

net: smc91x: use run-time configuration on all ARM machines · b70661c7

由 Arnd Bergmann 提交于 2月 25, 2015

The smc91x driver traditionally gets configured at compile-time
for whichever hardware it runs on. This no longer works on
ARM as we continue to move to building all-in-one kernels.

Most ARM configurations with this driver already use run-time
configuration through DT or through platform_data, but a
few have not been converted yet.

I've checked all ARM boards that use this driver in their
legacy board files, and converted the ones that were using
compile-time configuration in smc91x.h to behave like the
other ones and provide the interrupt polarity along with
the MMIO configuration (width, stride) at platform device
creation time.

In particular, these combinations were previously selectable
in Kconfig but in fact broken:

- sa1100 assabet plus pleb
- msm combined with any other armv6/v7 platform
- pxa-idp combined with any non-DMA pxa variant
- LogicPD PXA270 combined with any other pxa
- nomadik combined with any other armv4/v5 platform,
  e.g. versatile.

None of these seem critical enough to warrant a backport
to stable, but it would be nice to clean this up for good.
Signed-off-by: NArnd Bergmann <arnd@arndb.de>
----
I would like the patch to get merged through netdev, after
Robert and/or Linus have verified it on at least some hardware.

There are a few other non-ARM platforms using this driver,
I could do the same patch for those if we want to take
it further.

 arch/arm/mach-msm/board-halibut.c    |   8 ++++-
 arch/arm/mach-msm/board-qsd8x50.c    |   8 ++++-
 arch/arm/mach-pxa/idp.c              |   5 +++
 arch/arm/mach-pxa/lpd270.c           |   8 ++++-
 arch/arm/mach-realview/core.c        |   7 ++++
 arch/arm/mach-realview/realview_eb.c |   2 +-
 arch/arm/mach-sa1100/neponset.c      |   6 ++++
 arch/arm/mach-sa1100/pleb.c          |   7 ++++
 drivers/net/ethernet/smsc/smc91x.c   |   9 +++--
 drivers/net/ethernet/smsc/smc91x.h   | 114 ++----------------------------------------------------------
 10 files changed, 57 insertions(+), 117 deletions(-)
Tested-by: NRobert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b70661c7

28 2月, 2015 6 次提交

x86: Init per-cpu shadow copy of CR4 on 32-bit CPUs too · 5b2bdbc8

由 Steven Rostedt 提交于 2月 27, 2015

Commit:

   1e02ce4c ("x86: Store a per-cpu shadow copy of CR4")

added a shadow CR4 such that reads and writes that do not
modify the CR4 execute much faster than always reading the
register itself.

The change modified cpu_init() in common.c, so that the
shadow CR4 gets initialized before anything uses it.

Unfortunately, there's two cpu_init()s in common.c. There's
one for 64-bit and one for 32-bit. The commit only added
the shadow init to the 64-bit path, but the 32-bit path
needs the init too.

Link: http://lkml.kernel.org/r/20150227125208.71c36402@gandalf.local.home Fixes: 1e02ce4c "x86: Store a per-cpu shadow copy of CR4"
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
Acked-by: NAndy Lutomirski <luto@amacapital.net>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20150227145019.2bdd4354@gandalf.local.homeSigned-off-by: NIngo Molnar <mingo@kernel.org>

5b2bdbc8

arm64: cpuidle: add asm/proc-fns.h inclusion · af4819af

由 Lorenzo Pieralisi 提交于 2月 27, 2015

ARM64 CPUidle driver requires the cpu_do_idle function so that it can
be used to enter the shallowest idle state, and it is declared in
asm/proc-fns.h.

The current ARM64 CPUidle driver does not include asm/proc-fns.h
explicitly and it has so far relied on implicit inclusion from other
header files.

Owing to some header dependencies reshuffling this currently triggers
build failures when CONFIG_ARM64_64K_PAGES=y:

drivers/cpuidle/cpuidle-arm64.c: In function "arm64_enter_idle_state"
drivers/cpuidle/cpuidle-arm64.c:42:3: error: implicit declaration of
function "cpu_do_idle" [-Werror=implicit-function-declaration]
   cpu_do_idle();
   ^

This patch adds the explicit inclusion of the asm/proc-fns.h header file
in the arm64 asm/cpuidle.h header file, so that the build breakage is fixed
and the required header inclusion is added to the appropriate arch back-end
CPUidle header, already included by the CPUidle arm64 driver, where
CPUidle arch related function declarations belong.
Reported-by: NLaura Abbott <lauraa@codeaurora.org>
Signed-off-by: NLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: NWill Deacon <will.deacon@arm.com>
Tested-by: NMark Rutland <mark.rutland@arm.com>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

af4819af

arm64: compat Fix siginfo_t -> compat_siginfo_t conversion on big endian · 9d42d48a

由 Catalin Marinas 提交于 2月 23, 2015

The native (64-bit) sigval_t union contains sival_int (32-bit) and
sival_ptr (64-bit). When a compat application invokes a syscall that
takes a sigval_t value (as part of a larger structure, e.g.
compat_sys_mq_notify, compat_sys_timer_create), the compat_sigval_t
union is converted to the native sigval_t with sival_int overlapping
with either the least or the most significant half of sival_ptr,
depending on endianness. When the corresponding signal is delivered to a
compat application, on big endian the current (compat_uptr_t)sival_ptr
cast always returns 0 since sival_int corresponds to the top part of
sival_ptr. This patch fixes copy_siginfo_to_user32() so that sival_int
is copied to the compat_siginfo_t structure.

Cc: <stable@vger.kernel.org>
Reported-by: NBamvor Jian Zhang <bamvor.zhangjian@huawei.com>
Tested-by: NBamvor Jian Zhang <bamvor.zhangjian@huawei.com>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

9d42d48a

arm64: Increase the swiotlb buffer size 64MB · a1e50a82

由 Catalin Marinas 提交于 2月 05, 2015

With commit 3690951f (arm64: Use swiotlb late initialisation), the
swiotlb buffer size is limited to MAX_ORDER_NR_PAGES. However, there are
platforms with 32-bit only devices that require bounce buffering via
swiotlb. This patch changes the swiotlb initialisation to an early 64MB
memblock allocation. In order to get the swiotlb buffer correctly
allocated (via memblock_virt_alloc_low_nopanic), this patch also defines
ARCH_LOW_ADDRESS_LIMIT to the maximum physical address capable of 32-bit
DMA.
Reported-by: NKefeng Wang <wangkefeng.wang@huawei.com>
Tested-by: NKefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

a1e50a82

s390/pci: unify pci_iomap symbol exports · d9426083

由 Sebastian Ott 提交于 2月 27, 2015

Since commit 8cfc99b5 ("s390: add pci_iomap_range") we use
EXPORT_SYMBOL for pci_iomap but EXPORT_SYMBOL_GPL for pci_iounmap.
Change the related functions to use EXPORT_SYMBOL like the asm-generic
variants do.
Signed-off-by: NSebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>

d9426083

s390/pci: fix [un]map_resources sequence · 1803ba2d

由 Sebastian Ott 提交于 2月 27, 2015

Commit 8cfc99b5 ("s390: add pci_iomap_range") introduced counters
to keep track of the number of mappings created. This revealed that
we don't have our internal mappings in order when using hotunplug or
resume from hibernate. This patch addresses both issues.
Signed-off-by: NSebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>

1803ba2d

27 2月, 2015 5 次提交

ARC: Fix thread_saved_pc() · 3240dd57

由 Vineet Gupta 提交于 2月 27, 2015

The old implementation assumed that SP at the time of __switch_to() is
right above pt_regs which is almost certainly not the case as there will
be some stack build up between entry into kernel and leading up to
__switch_to
Signed-off-by: NVineet Gupta <vgupta@synopsys.com>

3240dd57

ARC: Fix KSTK_ESP() · 13648b01

由 Vineet Gupta 提交于 2月 27, 2015

/proc/<pid>/maps currently don't annotate stack vma with "[stack]"
This is because KSTK_ESP ie expected to return usermode SP of tsk while
currently it returns the kernel mode SP of a sleeping tsk.

While the fix is trivial, we also need to adjust the ARC kernel stack
unwinder to not use KSTK_SP and friends any more.

Cc: <stable@vger.kernel.org>
Reported-and-suggested-by: NAlexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: NVineet Gupta <vgupta@synopsys.com>

13648b01

V
ARC: perf: Enable generic software events · ceed97ab
由 Vineet Gupta 提交于 10月 02, 2014
```
Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
```
ceed97ab

ARC: Make arc_unwind_core accessible externally · 3a51d50f

由 Vineet Gupta 提交于 7月 10, 2013

The arc unwinder can also be used for perf callchains.
Signed-off-by: NMischa Jonker <mjonker@synopsys.com>
Signed-off-by: NVineet Gupta <vgupta@synopsys.com>

3a51d50f

arm64: Fix text patching logic when using fixmap · f6242cac

由 Marc Zyngier 提交于 2月 24, 2015

Patch 2f896d58 ("arm64: use fixmap for text patching") changed
the way we patch the kernel text, using a fixmap when the kernel or
modules are flagged as read only.

Unfortunately, a flaw in the logic makes it fall over when patching
modules without CONFIG_DEBUG_SET_MODULE_RONX enabled:

[...]
[   32.032636] Call trace:
[   32.032716] [<fffffe00003da0dc>] __copy_to_user+0x2c/0x60
[   32.032837] [<fffffe0000099f08>] __aarch64_insn_write+0x94/0xf8
[   32.033027] [<fffffe000009a0a0>] aarch64_insn_patch_text_nosync+0x18/0x58
[   32.033200] [<fffffe000009c3ec>] ftrace_modify_code+0x58/0x84
[   32.033363] [<fffffe000009c4e4>] ftrace_make_nop+0x3c/0x58
[   32.033532] [<fffffe0000164420>] ftrace_process_locs+0x3d0/0x5c8
[   32.033709] [<fffffe00001661cc>] ftrace_module_init+0x28/0x34
[   32.033882] [<fffffe0000135148>] load_module+0xbb8/0xfc4
[   32.034044] [<fffffe0000135714>] SyS_finit_module+0x94/0xc4
[...]

This is triggered by the use of virt_to_page() on a module address,
which ends to pointing to Nowhereland if you're lucky, or corrupt
your precious data if not.

This patch fixes the logic by mimicking what is done on arm:
- If we're patching a module and CONFIG_DEBUG_SET_MODULE_RONX is set,
  use vmalloc_to_page().
- If we're patching the kernel and CONFIG_DEBUG_RODATA is set,
  use virt_to_page().
- Otherwise, use the provided address, as we can write to it directly.

Tested on 4.0-rc1 as a KVM guest.
Reported-by: NRichard W.M. Jones <rjones@redhat.com>
Reviewed-by: NKees Cook <keescook@chromium.org>
Acked-by: NMark Rutland <mark.rutland@arm.com>
Acked-by: NLaura Abbott <lauraa@codeaurora.org>
Tested-by: NRichard W.M. Jones <rjones@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

f6242cac

openanolis / cloud-kernel 接近 2 年 前同步成功

openanolis / cloud-kernel
接近 2 年前同步成功