提交 · 42dbfc8649737cb622b2a7e02045401c4c09561c · openanolis / cloud-kernel

28 4月, 2014 29 次提交

powerpc/pseries: Protect remove_memory() with device hotplug lock · 42dbfc86

由 Li Zhong 提交于 4月 10, 2014

While testing memory hot-remove, I found following dead lock:

Process #1141 is drmgr, trying to remove some memory, i.e. memory499.
It holds the memory_hotplug_mutex, and blocks when trying to remove file
"online" under dir memory499, in kernfs_drain(), at
        wait_event(root->deactivate_waitq,
                   atomic_read(&kn->active) == KN_DEACTIVATED_BIAS);

Process #1120 is trying to online memory499 by
   echo 1 > memory499/online

In .kernfs_fop_write, it uses kernfs_get_active() to increase
&kn->active, thus blocking process #1141. While itself is blocked later
when trying to acquire memory_hotplug_mutex, which is held by process

The backtrace of both processes are shown below:

[<c000000001b18600>] 0xc000000001b18600
[<c000000000015044>] .__switch_to+0x144/0x200
[<c000000000263ca4>] .online_pages+0x74/0x7b0
[<c00000000055b40c>] .memory_subsys_online+0x9c/0x150
[<c00000000053cbe8>] .device_online+0xb8/0x120
[<c00000000053cd04>] .online_store+0xb4/0xc0
[<c000000000538ce4>] .dev_attr_store+0x64/0xa0
[<c00000000030f4ec>] .sysfs_kf_write+0x7c/0xb0
[<c00000000030e574>] .kernfs_fop_write+0x154/0x1e0
[<c000000000268450>] .vfs_write+0xe0/0x260
[<c000000000269144>] .SyS_write+0x64/0x110
[<c000000000009ffc>] syscall_exit+0x0/0x7c

[<c000000001b18600>] 0xc000000001b18600
[<c000000000015044>] .__switch_to+0x144/0x200
[<c00000000030be14>] .__kernfs_remove+0x204/0x300
[<c00000000030d428>] .kernfs_remove_by_name_ns+0x68/0xf0
[<c00000000030fb38>] .sysfs_remove_file_ns+0x38/0x60
[<c000000000539354>] .device_remove_attrs+0x54/0xc0
[<c000000000539fd8>] .device_del+0x158/0x250
[<c00000000053a104>] .device_unregister+0x34/0xa0
[<c00000000055bc14>] .unregister_memory_section+0x164/0x170
[<c00000000024ee18>] .__remove_pages+0x108/0x4c0
[<c00000000004b590>] .arch_remove_memory+0x60/0xc0
[<c00000000026446c>] .remove_memory+0x8c/0xe0
[<c00000000007f9f4>] .pseries_remove_memblock+0xd4/0x160
[<c00000000007fcfc>] .pseries_memory_notifier+0x27c/0x290
[<c0000000008ae6cc>] .notifier_call_chain+0x8c/0x100
[<c0000000000d858c>] .__blocking_notifier_call_chain+0x6c/0xe0
[<c00000000071ddec>] .of_property_notify+0x7c/0xc0
[<c00000000071ed3c>] .of_update_property+0x3c/0x1b0
[<c0000000000756cc>] .ofdt_write+0x3dc/0x740
[<c0000000002f60fc>] .proc_reg_write+0xac/0x110
[<c000000000268450>] .vfs_write+0xe0/0x260
[<c000000000269144>] .SyS_write+0x64/0x110
[<c000000000009ffc>] syscall_exit+0x0/0x7c

This patch uses lock_device_hotplug() to protect remove_memory() called
in pseries_remove_memblock(), which is also stated before function
remove_memory():

 * NOTE: The caller must call lock_device_hotplug() to serialize hotplug
 * and online/offline operations before this call, as required by
 * try_offline_node().
 */
void __ref remove_memory(int nid, u64 start, u64 size)

With this lock held, the other process(#1120 above) trying to online the
memory block will retry the system call when calling
lock_device_hotplug_sysfs(), and finally find No such device error.
Signed-off-by: NLi Zhong <zhong@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

42dbfc86

powerpc: Fix error return in rtas_flash module init · 0c930692

由 Anton Blanchard 提交于 4月 14, 2014

module_init should return 0 or a negative errno.
Signed-off-by: NAnton Blanchard <anton@samba.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

0c930692

powerpc: Bump BOOT_COMMAND_LINE_SIZE to 2048 · 579a53ca

由 Anton Blanchard 提交于 4月 14, 2014

Bump the boot wrapper BOOT_COMMAND_LINE_SIZE to match the
kernel.
Signed-off-by: NAnton Blanchard <anton@samba.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

579a53ca

powerpc: Bump COMMAND_LINE_SIZE to 2048 · a5980d06

由 Anton Blanchard 提交于 4月 14, 2014

I've had a report that the current limit is too small for
an automated network based installer. Bump it.
Signed-off-by: NAnton Blanchard <anton@samba.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

a5980d06

powerpc: Rename duplicate COMMAND_LINE_SIZE define · a2dd5da7

由 Anton Blanchard 提交于 4月 14, 2014

We have two definitions of COMMAND_LINE_SIZE, one for the kernel
and one for the boot wrapper. I assume this is so the boot
wrapper can be self sufficient and not rely on kernel headers.

Having two defines with the same name is confusing, I just
updated the wrong one when trying to bump it.

Make the boot wrapper define unique by calling it
BOOT_COMMAND_LINE_SIZE.
Signed-off-by: NAnton Blanchard <anton@samba.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

a2dd5da7

powerpc/perf/hv-24x7: Catalog version number is be64, not be32 · bbad3e50

由 Cody P Schafer 提交于 4月 15, 2014

The catalog version number was changed from a be32 (with proceeding
32bits of padding) to a be64, update the code to treat it as a be64
Signed-off-by: NCody P Schafer <cody@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

bbad3e50

powerpc/perf/hv-24x7: Remove [static 4096], sparse chokes on it · 1ee9fcc1

由 Cody P Schafer 提交于 4月 15, 2014

Signed-off-by: NCody P Schafer <cody@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

1ee9fcc1

C
powerpc/perf/hv-24x7: Use (unsigned long) not (u32) values when calling plpar_hcall_norets() · 78d13166
由 Cody P Schafer 提交于 4月 15, 2014
```
Signed-off-by: NCody P Schafer <cody@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
```
78d13166

powerpc/perf/hv-gpci: Make device attr static · 58a685c2

由 Cody P Schafer 提交于 4月 15, 2014

Signed-off-by: NCody P Schafer <cody@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

58a685c2

powerpc/perf/hv_gpci: Probe failures use pr_debug(), and padding reduced · 0a8cf9e2

由 Cody P Schafer 提交于 4月 15, 2014

fixup for "powerpc/perf: Add support for the hv gpci (get performance
counter info) interface".

Makes the "not enabled" message less awful (and hidden unless
debugging).
Signed-off-by: NCody P Schafer <cody@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

0a8cf9e2

powerpc/perf/hv_24x7: Probe errors changed to pr_debug(), padding fixed · e98bf005

由 Cody P Schafer 提交于 4月 15, 2014

fixup for "powerpc/perf: Add support for the hv 24x7 interface"

Makes the "not enabled" message less awful (and hides it in most cases).
Signed-off-by: NCody P Schafer <cody@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

e98bf005

powerpc/mm: Fix tlbie to add AVAL fields for 64K pages · 29ef7a3e

由 Aneesh Kumar K.V 提交于 4月 21, 2014

The if condition check was based on a draft ISA doc. Remove the same.
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

29ef7a3e

powerpc/powernv: Fix little endian issues in OPAL dump code · 2d6b63bb

由 Anton Blanchard 提交于 4月 22, 2014

Signed-off-by: NAnton Blanchard <anton@samba.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

2d6b63bb

powerpc/powernv: Create OPAL sglist helper functions and fix endian issues · 3441f04b

由 Anton Blanchard 提交于 4月 22, 2014

We have two copies of code that creates an OPAL sg list. Consolidate
these into a common set of helpers and fix the endian issues.

The flash interface embedded a version number in the num_entries
field, whereas the dump interface did did not. Since versioning
wasn't added to the flash interface and it is impossible to add
this in a backwards compatible way, just remove it.
Signed-off-by: NAnton Blanchard <anton@samba.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

3441f04b

powerpc/powernv: Fix little endian issues in OPAL error log code · 14ad0c58

由 Anton Blanchard 提交于 4月 22, 2014

Fix little endian issues with the OPAL error log code.
Signed-off-by: NAnton Blanchard <anton@samba.org>
Reviewed-by: NStewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

14ad0c58

powerpc/powernv: Fix little endian issues with opal_do_notifier calls · 56b4c993

由 Anton Blanchard 提交于 4月 22, 2014

The bitmap in opal_poll_events and opal_handle_interrupt is
big endian, so we need to byteswap it on little endian builds.
Signed-off-by: NAnton Blanchard <anton@samba.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

56b4c993

powerpc/powernv: Remove some OPAL function declaration duplication · e2c8b93e

由 Anton Blanchard 提交于 4月 22, 2014

We had some duplication of the internal OPAL functions.
Signed-off-by: NAnton Blanchard <anton@samba.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

e2c8b93e

powerpc/powernv: Use uint64_t instead of size_t in OPAL APIs · 2bad7423

由 Anton Blanchard 提交于 4月 22, 2014

Using size_t in our APIs is asking for trouble, especially
when some OPAL calls use size_t pointers.
Signed-off-by: NAnton Blanchard <anton@samba.org>
Reviewed-by: NStewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

2bad7423

powerpc/powernv: Release the refcount for pci_dev · 4966bfa1

由 Wei Yang 提交于 4月 23, 2014

On PowerNV platform, we are holding an unnecessary refcount on a pci_dev, which
leads to the pci_dev is not destroyed when hotplugging a pci device.

This patch release the unnecessary refcount.
Signed-off-by: NWei Yang <weiyang@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

4966bfa1

powerpc/powernv: Reduce multi-hit of iommu_add_device() · 3f28c5af

由 Wei Yang 提交于 4月 23, 2014

During the EEH hotplug event, iommu_add_device() will be invoked three times
and two of them will trigger warning or error.

The three times to invoke the iommu_add_device() are:

    pci_device_add
       ...
       set_iommu_table_base_and_group   <- 1st time, fail
    device_add
       ...
       tce_iommu_bus_notifier           <- 2nd time, succees
    pcibios_add_pci_devices
       ...
       pcibios_setup_bus_devices        <- 3rd time, re-attach

The first time fails, since the dev->kobj->sd is not initialized. The
dev->kobj->sd is initialized in device_add().
The third time's warning is triggered by the re-attach of the iommu_group.

After applying this patch, the error

    iommu_tce: 0003:05:00.0 has not been added, ret=-14

and the warning

    [  204.123609] ------------[ cut here ]------------
    [  204.123645] WARNING: at arch/powerpc/kernel/iommu.c:1125
    [  204.123680] Modules linked in: xt_CHECKSUM nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6t_REJECT bnep bluetooth 6lowpan_iphc rfkill xt_conntrack ebtable_nat ebtable_broute bridge stp llc mlx4_ib ib_sa ib_mad ib_core ib_addr ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw bnx2x tg3 mlx4_core nfsd ptp mdio ses libcrc32c nfs_acl enclosure be2net pps_core shpchp lockd kvm uinput sunrpc binfmt_misc lpfc scsi_transport_fc ipr scsi_tgt
    [  204.124356] CPU: 18 PID: 650 Comm: eehd Not tainted 3.14.0-rc5yw+ #102
    [  204.124400] task: c0000027ed485670 ti: c0000027ed50c000 task.ti: c0000027ed50c000
    [  204.124453] NIP: c00000000003cf80 LR: c00000000006c648 CTR: c00000000006c5c0
    [  204.124506] REGS: c0000027ed50f440 TRAP: 0700   Not tainted  (3.14.0-rc5yw+)
    [  204.124558] MSR: 9000000000029032 <SF,HV,EE,ME,IR,DR,RI>  CR: 88008084  XER: 20000000
    [  204.124682] CFAR: c00000000006c644 SOFTE: 1
    GPR00: c00000000006c648 c0000027ed50f6c0 c000000001398380 c0000027ec260300
    GPR04: c0000027ea92c000 c00000000006ad00 c0000000016e41b0 0000000000000110
    GPR08: c0000000012cd4c0 0000000000000001 c0000027ec2602ff 0000000000000062
    GPR12: 0000000028008084 c00000000fdca200 c0000000000d1d90 c0000027ec281a80
    GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
    GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000001
    GPR24: 000000005342697b 0000000000002906 c000001fe6ac9800 c000001fe6ac9800
    GPR28: 0000000000000000 c0000000016e3a80 c0000027ea92c090 c0000027ea92c000
    [  204.125353] NIP [c00000000003cf80] .iommu_add_device+0x30/0x1f0
    [  204.125399] LR [c00000000006c648] .pnv_pci_ioda_dma_dev_setup+0x88/0xb0
    [  204.125443] Call Trace:
    [  204.125464] [c0000027ed50f6c0] [c0000027ed50f750] 0xc0000027ed50f750 (unreliable)
    [  204.125526] [c0000027ed50f750] [c00000000006c648] .pnv_pci_ioda_dma_dev_setup+0x88/0xb0
    [  204.125588] [c0000027ed50f7d0] [c000000000069cc8] .pnv_pci_dma_dev_setup+0x78/0x340
    [  204.125650] [c0000027ed50f870] [c000000000044408] .pcibios_setup_device+0x88/0x2f0
    [  204.125712] [c0000027ed50f940] [c000000000046040] .pcibios_setup_bus_devices+0x60/0xd0
    [  204.125774] [c0000027ed50f9c0] [c000000000043acc] .pcibios_add_pci_devices+0xdc/0x1c0
    [  204.125837] [c0000027ed50fa50] [c00000000086f970] .eeh_reset_device+0x36c/0x4f0
    [  204.125939] [c0000027ed50fb20] [c00000000003a2d8] .eeh_handle_normal_event+0x448/0x480
    [  204.126068] [c0000027ed50fbc0] [c00000000003a35c] .eeh_handle_event+0x4c/0x340
    [  204.126192] [c0000027ed50fc80] [c00000000003a74c] .eeh_event_handler+0xfc/0x1b0
    [  204.126319] [c0000027ed50fd30] [c0000000000d1ea0] .kthread+0x110/0x130
    [  204.126430] [c0000027ed50fe30] [c00000000000a460] .ret_from_kernel_thread+0x5c/0x7c
    [  204.126556] Instruction dump:
    [  204.126610] 7c0802a6 fba1ffe8 fbc1fff0 fbe1fff8 f8010010 f821ff71 7c7e1b78 60000000
    [  204.126787] 60000000 e87e0298 3143ffff 7d2a1910 <0b090000> 2fa90000 40de00c8 ebfe0218
    [  204.126966] ---[ end trace 6e7aefd80add2973 ]---

are cleared.

This patch removes iommu_add_device() in pnv_pci_ioda_dma_dev_setup(), which
revert part of the change in commit d905c5df(PPC: POWERNV: move
iommu_add_device earlier).
Signed-off-by: NWei Yang <weiyang@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

3f28c5af

powerpc/powernv: Fix little endian issues in OPAL flash code · cc146d1d

由 Anton Blanchard 提交于 4月 24, 2014

With this patch I was able to update firmware on an LE kernel.
Signed-off-by: NAnton Blanchard <anton@samba.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

cc146d1d

powerpc/powernv: Fix kexec races going back to OPAL · 298b34d7

由 Benjamin Herrenschmidt 提交于 4月 24, 2014

We have a subtle race when sending CPUs back to OPAL on kexec.

We mark them as "in real mode" right before we send them down. Once
we've booted the new kernel, it might try to call opal_reinit_cpus()
to change endianness, and that requires all CPUs to be spinning inside
OPAL.

However there is no synchronization here and we've observed cases
where the returning CPUs hadn't established their new state inside
OPAL before opal_reinit_cpus() is called, causing it to fail.

The proper fix is to actually wait for them to go down all the way
from the kexec'ing kernel.
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

298b34d7

powerpc/powernv: Check sysparam size before creation · 63aecfb2

由 Joel Stanley 提交于 4月 24, 2014

The size of the sysparam sysfs files is determined from the device tree
at boot. However the buffer is hard coded to 64 bytes. If we encounter a
parameter that is larger than 64, or miss-parse the device tree, the
buffer will overflow when reading or writing to the parameter.

Check it at discovery time, and if the parameter is too large, do not
create a sysfs entry for it.
Signed-off-by: NJoel Stanley <joel@jms.id.au>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

63aecfb2

J
powerpc/powernv: Fix typos in sysparam code · 16003d23
由 Joel Stanley 提交于 4月 24, 2014
```
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
```
16003d23

powerpc/powernv: Check sysfs size before copying · 85390378

由 Joel Stanley 提交于 4月 24, 2014

The sysparam code currently uses the userspace supplied number of
bytes when memcpy()ing in to a local 64-byte buffer.

Limit the maximum number of bytes by the size of the buffer.
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

85390378

powerpc/powernv: Use ssize_t for sysparam return values · b8569d23

由 Joel Stanley 提交于 4月 24, 2014

The OPAL calls are returning int64_t values, which the sysparam code
stores in an int, and the sysfs callback returns ssize_t. Make code a
easier to read by consistently using ssize_t.
Signed-off-by: NJoel Stanley <joel@jms.id.au>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

b8569d23

powerpc/powernv: Fix sysparam sysfs error handling · ba9a32b1

由 Joel Stanley 提交于 4月 24, 2014

When a sysparam query in OPAL returned a negative value (error code),
sysfs would spew out a decent chunk of memory; almost 64K more than
expected. This was traced to a sign/unsigned mix up in the OPAL sysparam
sysfs code at sys_param_show.

The return value of sys_param_show is a ssize_t, calculated using

  return ret ? ret : attr->param_size;

Alan Modra explains:

  "attr->param_size" is an unsigned int, "ret" an int, so the overall
  expression has type unsigned int.  Result is that ret is cast to
  unsigned int before being cast to ssize_t.

Instead of using the ternary operator, set ret to the param_size if an
error is not detected. The same bug exists in the sysfs write callback;
this patch fixes it in the same way.

A note on debugging this next time: on my system gcc will warn about
this if compiled with -Wsign-compare, which is not enabled by -Wall,
only -Wextra.
Signed-off-by: NJoel Stanley <joel@jms.id.au>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

ba9a32b1

powerpc: Fix Oops in rtas_stop_self() · 4fb8d027

由 Li Zhong 提交于 4月 28, 2014

commit 41dd03a9 may cause Oops in rtas_stop_self().

The reason is that the rtas_args was moved into stack space. For a box
with more that 4GB RAM, the stack could easily be outside 32bit range,
but RTAS is 32bit.

So the patch moves rtas_args away from stack by adding static before
it.
Signed-off-by: NLi Zhong <zhong@linux.vnet.ibm.com>
Signed-off-by: NAnton Blanchard <anton@samba.org>
Cc: stable@vger.kernel.org # 3.14+
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

4fb8d027

powerpc: Export flush_icache_range · 28ea3c75

由 Jeff Mahoney 提交于 4月 27, 2014

Commit aac416fc (lkdtm: flush icache and report actions) calls
flush_icache_range from a module. It's exported on most architectures
that implement it, but not on powerpc. This patch exports it to fix
the module link failure.
Signed-off-by: NJeff Mahoney <jeffm@suse.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

28ea3c75

19 4月, 2014 1 次提交

powerpc/mm: fix ".__node_distance" undefined · 12c743eb

由 Mike Qiu 提交于 4月 18, 2014

  CHK     include/config/kernel.release
  CHK     include/generated/uapi/linux/version.h
  CHK     include/generated/utsrelease.h
  ...
  Building modules, stage 2.
WARNING: 1 bad relocations
c0000000013d6a30 R_PPC64_ADDR64    uprobes_fetch_type_table
  WRAP    arch/powerpc/boot/zImage.pseries
  WRAP    arch/powerpc/boot/zImage.epapr
  MODPOST 1849 modules
ERROR: ".__node_distance" [drivers/block/nvme.ko] undefined!
make[1]: *** [__modpost] Error 1
make: *** [modules] Error 2
make: *** Waiting for unfinished jobs....

The reason is symbol "__node_distance" not been exported in powerpc.
Signed-off-by: NMike Qiu <qiudayu@linux.vnet.ibm.com>
Acked-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Jesse Larrew <jlarrew@linux.vnet.ibm.com>
Cc: Robert Jennings <rcj@linux.vnet.ibm.com>
Cc: Alistair Popple <alistair@popple.id.au>
Cc: Mike Qiu <qiudayu@linux.vnet.ibm.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

12c743eb

15 4月, 2014 1 次提交

powerpc/PCI: Fix NULL dereference in sys_pciconfig_iobase() list traversal · 140ab645

由 Mike Qiu 提交于 4月 14, 2014

3bc95598 ("powerpc/PCI: Use list_for_each_entry() for bus traversal")
caused a NULL pointer dereference because the loop body set the iterator to
NULL:

  Unable to handle kernel paging request for data at address 0x00000000
  Faulting instruction address: 0xc000000000041d78
  Oops: Kernel access of bad area, sig: 11 [#1]
  ...
  NIP [c000000000041d78] .sys_pciconfig_iobase+0x68/0x1f0
  LR [c000000000041e0c] .sys_pciconfig_iobase+0xfc/0x1f0
  Call Trace:
  [c0000003b4787db0] [c000000000041e0c] .sys_pciconfig_iobase+0xfc/0x1f0 (unreliable)
  [c0000003b4787e30] [c000000000009ed8] syscall_exit+0x0/0x98

Fix it by using a temporary variable for the iterator.

[bhelgaas: changelog, drop tmp_bus initialization]
Fixes: 3bc95598 powerpc/PCI: Use list_for_each_entry() for bus traversal
Signed-off-by: NMike Qiu <qiudayu@linux.vnet.ibm.com>
Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>

140ab645

13 4月, 2014 1 次提交

powerpc: Don't try to set LPCR unless we're in hypervisor mode · 18aa0da3

由 Paul Mackerras 提交于 4月 11, 2014

Commit 8f619b54 ("powerpc/ppc64: Do not turn AIL (reloc-on
interrupts) too early") added code to set the AIL bit in the LPCR
without checking whether the kernel is running in hypervisor mode. The
result is that when the kernel is running as a guest (i.e., under
PowerKVM or PowerVM), the processor takes a privileged instruction
interrupt at that point, causing a panic. The visible result is that
the kernel hangs after printing "returning from prom_init".

This fixes it by checking for hypervisor mode being available before
setting LPCR. If we are not in hypervisor mode, we enable relocation-on
interrupts later in pSeries_setup_arch using the H_SET_MODE hcall.
Signed-off-by: NPaul Mackerras <paulus@samba.org>
Acked-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

18aa0da3

09 4月, 2014 8 次提交

powerpc/powernv Adapt opal-elog and opal-dump to new sysfs_remove_file_self · cc4f265a

由 Stewart Smith 提交于 4月 09, 2014

We are currently using sysfs_schedule_callback() which is deprecated
and about to be removed. Switch to the new interface instead.
Signed-off-by: NStewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

cc4f265a

power, sched: stop updating inside arch_update_cpu_topology() when nothing to be update · 9a013361

由 Michael Wang 提交于 4月 08, 2014

Since v1:
	Edited the comment according to Srivatsa's suggestion.

During the testing, we encounter below WARN followed by Oops:

	WARNING: at kernel/sched/core.c:6218
	...
	NIP [c000000000101660] .build_sched_domains+0x11d0/0x1200
	LR [c000000000101358] .build_sched_domains+0xec8/0x1200
	PACATMSCRATCH [800000000000f032]
	Call Trace:
	[c00000001b103850] [c000000000101358] .build_sched_domains+0xec8/0x1200
	[c00000001b1039a0] [c00000000010aad4] .partition_sched_domains+0x484/0x510
	[c00000001b103aa0] [c00000000016d0a8] .rebuild_sched_domains+0x68/0xa0
	[c00000001b103b30] [c00000000005cbf0] .topology_work_fn+0x10/0x30
	...
	Oops: Kernel access of bad area, sig: 11 [#1]
	...
	NIP [c00000000045c000] .__bitmap_weight+0x60/0xf0
	LR [c00000000010132c] .build_sched_domains+0xe9c/0x1200
	PACATMSCRATCH [8000000000029032]
	Call Trace:
	[c00000001b1037a0] [c000000000288ff4] .kmem_cache_alloc_node_trace+0x184/0x3a0
	[c00000001b103850] [c00000000010132c] .build_sched_domains+0xe9c/0x1200
	[c00000001b1039a0] [c00000000010aad4] .partition_sched_domains+0x484/0x510
	[c00000001b103aa0] [c00000000016d0a8] .rebuild_sched_domains+0x68/0xa0
	[c00000001b103b30] [c00000000005cbf0] .topology_work_fn+0x10/0x30
	...

This was caused by that 'sd->groups == NULL' after building groups, which
was caused by the empty 'sd->span'.

The cpu's domain contained nothing because the cpu was assigned to a wrong
node, due to the following unfortunate sequence of events:

1. The hypervisor sent a topology update to the guest OS, to notify changes
   to the cpu-node mapping. However, the update was actually redundant - i.e.,
   the "new" mapping was exactly the same as the old one.

2. Due to this, the 'updated_cpus' mask turned out to be empty after exiting
   the 'for-loop' in arch_update_cpu_topology().

3. So we ended up calling stop-machine() with an empty cpumask list, which made
   stop-machine internally elect cpumask_first(cpu_online_mask), i.e., CPU0 as
   the cpu to run the payload (the update_cpu_topology() function).

4. This causes update_cpu_topology() to be run by CPU0. And since 'updates'
   is kzalloc()'ed inside arch_update_cpu_topology(), update_cpu_topology()
   finds update->cpu as well as update->new_nid to be 0. In other words, we
   end up assigning CPU0 (and eventually its siblings) to node 0, incorrectly.

Along with the following wrong updating, it causes the sched-domain rebuild
code to break and crash the system.

Fix this by skipping the topology update in cases where we find that
the topology has not actually changed in reality (ie., spurious updates).

CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: Paul Mackerras <paulus@samba.org>
CC: Nathan Fontenot <nfont@linux.vnet.ibm.com>
CC: Stephen Rothwell <sfr@canb.auug.org.au>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Robert Jennings <rcj@linux.vnet.ibm.com>
CC: Jesse Larrew <jlarrew@linux.vnet.ibm.com>
CC: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
CC: Alistair Popple <alistair@popple.id.au>
Suggested-by: N"Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: NMichael Wang <wangyun@linux.vnet.ibm.com>
Reviewed-by: NSrivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

9a013361

powerpc/le: Avoid creatng R_PPC64_TOCSAVE relocations for modules. · d3d35d95

由 Tony Breeds 提交于 3月 12, 2014

When building modules with a native le toolchain the linker will
generate R_PPC64_TOCSAVE relocations when it's safe to omit saving r2 on
a plt call.  This isn't helpful in the conext of a kernel module and the
kernel will fail to load those modules with an error like:
	nf_conntrack: Unknown ADD relocation: 109

This patch tells the linker to avoid createing R_PPC64_TOCSAVE
relocations allowing modules to load.
Signed-off-by: NTony Breeds <tony@bakeyournoodle.com>
Signed-off-by: NAnton Blanchard <anton@samba.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

d3d35d95

arch/powerpc: Use RCU_INIT_POINTER(x, NULL) in platforms/cell/spu_syscalls.c · 282efb70

由 Monam Agarwal 提交于 3月 22, 2014

Here rcu_assign_pointer() is ensuring that the
initialization of a structure is carried out before storing a pointer
to that structure.
So, rcu_assign_pointer(p, NULL) can always safely be converted to
RCU_INIT_POINTER(p, NULL).
Signed-off-by: NMonam Agarwal <monamagarwal123@gmail.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

282efb70

powerpc/opal: Add missing include · bfd25d72

由 Michael Neuling 提交于 3月 25, 2014

next-20140324 currently fails compiling celleb_defconfig with:

arch/powerpc/include/asm/opal.h:894:42: error: 'struct notifier_block' declared inside parameter list [-Werror]
arch/powerpc/include/asm/opal.h:894:42: error: its scope is only this definition or declaration, which is probably not what you want [-Werror]
arch/powerpc/include/asm/opal.h:896:14: error: 'struct notifier_block' declared inside parameter list [-Werror]

This is due to a missing include which is added here.
Signed-off-by: NMichael Neuling <mikey@neuling.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

bfd25d72

powerpc: Convert last uses of __FUNCTION__ to __func__ · aba6f4f2

由 Joe Perches 提交于 3月 25, 2014

Just about all of these have been converted to __func__,
so convert the last uses.
Signed-off-by: NJoe Perches <joe@perches.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

aba6f4f2

powerpc: Add lq/stq emulation · f83319d7

由 Anton Blanchard 提交于 3月 28, 2014

Recent CPUs support quad word load and store instructions. Add
support to the alignment handler for them.
Signed-off-by: NAnton Blanchard <anton@samba.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

f83319d7

powerpc/powernv: Add invalid OPAL call · e28b05e7

由 Joel Stanley 提交于 4月 01, 2014

This call will not be understood by OPAL, and cause it to add an error
to it's log. Among other things, this is useful for testing the
behaviour of the log as it fills up.
Signed-off-by: NJoel Stanley <joel@jms.id.au>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

e28b05e7

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功