1. 28 1月, 2014 2 次提交
  2. 24 1月, 2014 2 次提交
  3. 22 1月, 2014 1 次提交
    • T
      memblock: make memblock_set_node() support different memblock_type · e7e8de59
      Tang Chen 提交于
      [sfr@canb.auug.org.au: fix powerpc build]
      Signed-off-by: NTang Chen <tangchen@cn.fujitsu.com>
      Reviewed-by: NZhang Yanfei <zhangyanfei@cn.fujitsu.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: "Rafael J . Wysocki" <rjw@sisk.pl>
      Cc: Chen Tang <imtangchen@gmail.com>
      Cc: Gong Chen <gong.chen@linux.intel.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Jiang Liu <jiang.liu@huawei.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Larry Woodman <lwoodman@redhat.com>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Liu Jiang <jiang.liu@huawei.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Michal Nazarewicz <mina86@mina86.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Taku Izumi <izumi.taku@jp.fujitsu.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Thomas Renninger <trenn@suse.de>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
      Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
      Cc: Wen Congyang <wency@cn.fujitsu.com>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      e7e8de59
  4. 19 1月, 2014 1 次提交
    • M
      net: introduce SO_BPF_EXTENSIONS · ea02f941
      Michal Sekletar 提交于
      For user space packet capturing libraries such as libpcap, there's
      currently only one way to check which BPF extensions are supported
      by the kernel, that is, commit aa1113d9 ("net: filter: return
      -EINVAL if BPF_S_ANC* operation is not supported"). For querying all
      extensions at once this might be rather inconvenient.
      
      Therefore, this patch introduces a new option which can be used as
      an argument for getsockopt(), and allows one to obtain information
      about which BPF extensions are supported by the current kernel.
      
      As David Miller suggests, we do not need to define any bits right
      now and status quo can just return 0 in order to state that this
      versions supports SKF_AD_PROTOCOL up to SKF_AD_PAY_OFFSET. Later
      additions to BPF extensions need to add their bits to the
      bpf_tell_extensions() function, as documented in the comment.
      Signed-off-by: NMichal Sekletar <msekleta@redhat.com>
      Cc: David Miller <davem@davemloft.net>
      Reviewed-by: NDaniel Borkmann <dborkman@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ea02f941
  5. 17 1月, 2014 2 次提交
    • G
      dt/bindings: Remove device_type "serial" from marvell,mv64360-mpsc · 482c4341
      Grant Likely 提交于
      device_type is deprecated. There is no need to check for it in device
      driver code and no need to specify it in the device tree. Remove the
      property from stock .dts files and remove the check for it from device
      drivers. This change should be 100% backwards compatible with old device
      trees.
      Signed-off-by: NGrant Likely <grant.likely@linaro.org>
      Cc: Rob Herring <rob.herring@calxeda.com>
      Cc: Pawel Moll <pawel.moll@arm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Ian Campbell <ijc+devicetree@hellion.org.uk>
      Cc: Kumar Gala <galak@codeaurora.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      482c4341
    • G
      dt/bindings: remove users of device_type "mdio" · dae95c1f
      Grant Likely 提交于
      device_type is a deprecated property, but some MDIO bus nodes still have
      it. Except for a couple of old binding (compatible="gianfar" and
      compatible="ucc_geth_phy") the kernel doesn't look for
      device_type="mdio" at all.
      
      This patch removes all instances of device_type="mdio" from the binding
      documentation and the .dts files.
      Signed-off-by: NGrant Likely <grant.likely@linaro.org>
      Cc: Rob Herring <rob.herring@calxeda.com>
      Cc: Pawel Moll <pawel.moll@arm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Ian Campbell <ijc+devicetree@hellion.org.uk>
      Cc: Kumar Gala <galak@codeaurora.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      dae95c1f
  6. 16 1月, 2014 3 次提交
  7. 15 1月, 2014 24 次提交
    • V
      powerpc/powernv: Call OPAL sync before kexec'ing · f7d98d18
      Vasant Hegde 提交于
      Its possible that OPAL may be writing to host memory during
      kexec (like dump retrieve scenario). In this situation we might
      end up corrupting host memory.
      
      This patch makes OPAL sync call to make sure OPAL stops
      writing to host memory before kexec'ing.
      Signed-off-by: NVasant Hegde <hegdevasant@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      f7d98d18
    • G
      powerpc/eeh: Escalate error on non-existing PE · cb5b242c
      Gavin Shan 提交于
      Sometimes, especially in sinario of loading another kernel with kdump,
      we got EEH error on non-existing PE. That means the PEEV / PEST in
      the corresponding PHB would be messy and we can't handle that case.
      The patch escalates the error to fenced PHB so that the PHB could be
      rested in order to revoer the errors on non-existing PEs.
      Reported-by: NMahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
      Tested-by: NMahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      cb5b242c
    • G
      powerpc/eeh: Handle multiple EEH errors · 7e4e7867
      Gavin Shan 提交于
      For one PCI error relevant OPAL event, we possibly have multiple
      EEH errors for that. For example, multiple frozen PEs detected on
      different PHBs. Unfortunately, we didn't cover the case. The patch
      enumarates the return value from eeh_ops::next_error() and change
      eeh_handle_special_event() and eeh_ops::next_error() to handle all
      existing EEH errors.
      
      As Ben pointed out, we needn't list_for_each_entry_safe() since we
      are not deleting any PHB from the hose_list and the EEH serialized
      lock should be held while purging EEH events. The patch covers those
      suggestions as well.
      Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      7e4e7867
    • A
      powerpc/thp: Fix crash on mremap · b3084f4d
      Aneesh Kumar K.V 提交于
      This patch fix the below crash
      
      NIP [c00000000004cee4] .__hash_page_thp+0x2a4/0x440
      LR [c0000000000439ac] .hash_page+0x18c/0x5e0
      ...
      Call Trace:
      [c000000736103c40] [00001ffffb000000] 0x1ffffb000000(unreliable)
      [437908.479693] [c000000736103d50] [c0000000000439ac] .hash_page+0x18c/0x5e0
      [437908.479699] [c000000736103e30] [c00000000000924c] .do_hash_page+0x4c/0x58
      
      On ppc64 we use the pgtable for storing the hpte slot information and
      store address to the pgtable at a constant offset (PTRS_PER_PMD) from
      pmd. On mremap, when we switch the pmd, we need to withdraw and deposit
      the pgtable again, so that we find the pgtable at PTRS_PER_PMD offset
      from new pmd.
      
      We also want to move the withdraw and deposit before the set_pmd so
      that, when page fault find the pmd as trans huge we can be sure that
      pgtable can be located at the offset.
      Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Acked-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      b3084f4d
    • P
      powerpc: Fix transactional FP/VMX/VSX unavailable handlers · 3ac8ff1c
      Paul Mackerras 提交于
      Currently, if a process starts a transaction and then takes an
      exception because the FPU, VMX or VSX unit is unavailable to it,
      we end up corrupting any FP/VMX/VSX state that was valid before
      the interrupt.  For example, if the process starts a transaction
      with the FPU available to it but VMX unavailable, and then does
      a VMX instruction inside the transaction, the FP state gets
      corrupted.
      
      Loading up the desired state generally involves doing a reclaim
      and a recheckpoint.  To avoid corrupting already-valid state, we have
      to be careful not to reload that state from the thread_struct
      between the reclaim and the recheckpoint (since the thread_struct
      values are stale by now), and we have to reload that state from
      the transact_fp/vr arrays after the recheckpoint to get back the
      current transactional values saved there by the reclaim.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      3ac8ff1c
    • P
      powerpc: Don't corrupt transactional state when using FP/VMX in kernel · d31626f7
      Paul Mackerras 提交于
      Currently, when we have a process using the transactional memory
      facilities on POWER8 (that is, the processor is in transactional
      or suspended state), and the process enters the kernel and the
      kernel then uses the floating-point or vector (VMX/Altivec) facility,
      we end up corrupting the user-visible FP/VMX/VSX state.  This
      happens, for example, if a page fault causes a copy-on-write
      operation, because the copy_page function will use VMX to do the
      copy on POWER8.  The test program below demonstrates the bug.
      
      The bug happens because when FP/VMX state for a transactional process
      is stored in the thread_struct, we store the checkpointed state in
      .fp_state/.vr_state and the transactional (current) state in
      .transact_fp/.transact_vr.  However, when the kernel wants to use
      FP/VMX, it calls enable_kernel_fp() or enable_kernel_altivec(),
      which saves the current state in .fp_state/.vr_state.  Furthermore,
      when we return to the user process we return with FP/VMX/VSX
      disabled.  The next time the process uses FP/VMX/VSX, we don't know
      which set of state (the current register values, .fp_state/.vr_state,
      or .transact_fp/.transact_vr) we should be using, since we have no
      way to tell if we are still in the same transaction, and if not,
      whether the previous transaction succeeded or failed.
      
      Thus it is necessary to strictly adhere to the rule that if FP has
      been enabled at any point in a transaction, we must keep FP enabled
      for the user process with the current transactional state in the
      FP registers, until we detect that it is no longer in a transaction.
      Similarly for VMX; once enabled it must stay enabled until the
      process is no longer transactional.
      
      In order to keep this rule, we add a new thread_info flag which we
      test when returning from the kernel to userspace, called TIF_RESTORE_TM.
      This flag indicates that there is FP/VMX/VSX state to be restored
      before entering userspace, and when it is set the .tm_orig_msr field
      in the thread_struct indicates what state needs to be restored.
      The restoration is done by restore_tm_state().  The TIF_RESTORE_TM
      bit is set by new giveup_fpu/altivec_maybe_transactional helpers,
      which are called from enable_kernel_fp/altivec, giveup_vsx, and
      flush_fp/altivec_to_thread instead of giveup_fpu/altivec.
      
      The other thing to be done is to get the transactional FP/VMX/VSX
      state from .fp_state/.vr_state when doing reclaim, if that state
      has been saved there by giveup_fpu/altivec_maybe_transactional.
      Having done this, we set the FP/VMX bit in the thread's MSR after
      reclaim to indicate that that part of the state is now valid
      (having been reclaimed from the processor's checkpointed state).
      
      Finally, in the signal handling code, we move the clearing of the
      transactional state bits in the thread's MSR a bit earlier, before
      calling flush_fp_to_thread(), so that we don't unnecessarily set
      the TIF_RESTORE_TM bit.
      
      This is the test program:
      
      /* Michael Neuling 4/12/2013
       *
       * See if the altivec state is leaked out of an aborted transaction due to
       * kernel vmx copy loops.
       *
       *   gcc -m64 htm_vmxcopy.c -o htm_vmxcopy
       *
       */
      
      /* We don't use all of these, but for reference: */
      
      int main(int argc, char *argv[])
      {
      	long double vecin = 1.3;
      	long double vecout;
      	unsigned long pgsize = getpagesize();
      	int i;
      	int fd;
      	int size = pgsize*16;
      	char tmpfile[] = "/tmp/page_faultXXXXXX";
      	char buf[pgsize];
      	char *a;
      	uint64_t aborted = 0;
      
      	fd = mkstemp(tmpfile);
      	assert(fd >= 0);
      
      	memset(buf, 0, pgsize);
      	for (i = 0; i < size; i += pgsize)
      		assert(write(fd, buf, pgsize) == pgsize);
      
      	unlink(tmpfile);
      
      	a = mmap(NULL, size, PROT_READ|PROT_WRITE, MAP_PRIVATE, fd, 0);
      	assert(a != MAP_FAILED);
      
      	asm __volatile__(
      		"lxvd2x 40,0,%[vecinptr] ; " // set 40 to initial value
      		TBEGIN
      		"beq	3f ;"
      		TSUSPEND
      		"xxlxor 40,40,40 ; " // set 40 to 0
      		"std	5, 0(%[map]) ;" // cause kernel vmx copy page
      		TABORT
      		TRESUME
      		TEND
      		"li	%[res], 0 ;"
      		"b	5f ;"
      		"3: ;" // Abort handler
      		"li	%[res], 1 ;"
      		"5: ;"
      		"stxvd2x 40,0,%[vecoutptr] ; "
      		: [res]"=r"(aborted)
      		: [vecinptr]"r"(&vecin),
      		  [vecoutptr]"r"(&vecout),
      		  [map]"r"(a)
      		: "memory", "r0", "r3", "r4", "r5", "r6", "r7");
      
      	if (aborted && (vecin != vecout)){
      		printf("FAILED: vector state leaked on abort %f != %f\n",
      		       (double)vecin, (double)vecout);
      		exit(1);
      	}
      
      	munmap(a, size);
      
      	close(fd);
      
      	printf("PASSED!\n");
      	return 0;
      }
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      d31626f7
    • P
      powerpc: Reclaim two unused thread_info flag bits · ae39c58c
      Paul Mackerras 提交于
      TIF_PERFMON_WORK and TIF_PERFMON_CTXSW are completely unused.  They
      appear to be related to the old perfmon2 code, which has been
      superseded by the perf_event infrastructure.  This removes their
      definitions so that the bits can be used for other purposes.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      ae39c58c
    • B
      powerpc: Fix races with irq_work · 0215f7d8
      Benjamin Herrenschmidt 提交于
      If we set irq_work on a processor and immediately afterward, before the
      irq work has a chance to be processed, we change the decrementer value,
      we can seriously delay the handling of that irq_work.
      
      Fix it by checking in a few places for pending irq work, first before
      changing the decrementer in decrementer_set_next_event() and after
      changing it in the same function and in timer_interrupt().
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      0215f7d8
    • M
      Move precessing of MCE queued event out from syscall exit path. · 30c82635
      Mahesh Salgaonkar 提交于
      Huge Dickins reported an issue that b5ff4211
      "powerpc/book3s: Queue up and process delayed MCE events" breaks the
      PowerMac G5 boot. This patch fixes it by moving the mce even processing
      away from syscall exit, which was wrong to do that in first place, and
      using irq work framework to delay processing of mce event.
      
      Reported-by: Hugh Dickins <hughd@google.com
      Signed-off-by: NMahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      30c82635
    • P
      pseries/cpuidle: Remove redundant call to ppc64_runlatch_off() in cpu idle routines · c0c4301c
      Preeti U Murthy 提交于
      Commit fbd7740f(powerpc: Simplify pSeries idle loop) switched pseries cpu
      idle handling from complete idle loops to ppc_md.powersave functions. Earlier to
      this switch, ppc64_runlatch_off() had to be called in each of the idle routines.
      But after the switch, this call is handled in arch_cpu_idle(),just before the call
      to ppc_md.powersave, where platform specific idle routines are called.
      
      As a consequence, the call to ppc64_runlatch_off() got duplicated in the
      arch_cpu_idle() routine as well as in the some of the idle routines in
      pseries and commit fbd7740f missed to get rid of these redundant
      calls. These calls were carried over subsequent enhancements to the pseries
      cpuidle routines.
      
      Although multiple calls to ppc64_runlatch_off() is harmless, there is still some
      overhead due to it. Besides that, these calls could also make way for a
      misunderstanding that it is *necessary* to call ppc64_runlatch_off() multiple
      times, when that is not the case. Hence this patch takes care of eliminating
      this redundancy.
      Signed-off-by: NPreeti U Murthy <preeti@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      c0c4301c
    • G
      powerpc: Make add_system_ram_resources() __init · 4f770924
      Geert Uytterhoeven 提交于
      add_system_ram_resources() is a subsys_initcall.
      Signed-off-by: NGeert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      4f770924
    • O
      powerpc: add SATA_MV to ppc64_defconfig · 5906b0a7
      Olof Johansson 提交于
      This makes ppc64_defconfig bootable without initrd on pasemi systems,
      most of whom have MV SATA controllers. Some have SIL24, but that driver
      is already enabled.
      Signed-off-by: NOlof Johansson <olof@lixom.net>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      5906b0a7
    • V
      powerpc/powernv: Increase candidate fw image size · bf16a4c2
      Vasant Hegde 提交于
      At present we assume candidate image is <= 256MB. But in P8,
      candidate image size can go up to 750MB. Hence increasing
      candidate image max size to 1GB.
      Signed-off-by: NVasant Hegde <hegdevasant@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      bf16a4c2
    • S
      powerpc: Add debug checks to catch invalid cpu-to-node mappings · 68fb18aa
      Srivatsa S. Bhat 提交于
      There have been some weird bugs in the past where the kernel tried to associate
      threads of the same core to different NUMA nodes, and things went haywire after
      that point (as expected).
      
      But unfortunately, root-causing such issues have been quite challenging, due to
      the lack of appropriate debug checks in the kernel. These bugs usually lead to
      some odd soft-lockups in the scheduler's build-sched-domain code in the CPU
      hotplug path, which makes it very hard to trace it back to the incorrect
      cpu-to-node mappings.
      
      So add appropriate debug checks to catch such invalid cpu-to-node mappings
      as early as possible.
      Signed-off-by: NSrivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      68fb18aa
    • S
      powerpc: Fix the setup of CPU-to-Node mappings during CPU online · d4edc5b6
      Srivatsa S. Bhat 提交于
      On POWER platforms, the hypervisor can notify the guest kernel about dynamic
      changes in the cpu-numa associativity (VPHN topology update). Hence the
      cpu-to-node mappings that we got from the firmware during boot, may no longer
      be valid after such updates. This is handled using the arch_update_cpu_topology()
      hook in the scheduler, and the sched-domains are rebuilt according to the new
      mappings.
      
      But unfortunately, at the moment, CPU hotplug ignores these updated mappings
      and instead queries the firmware for the cpu-to-numa relationships and uses
      them during CPU online. So the kernel can end up assigning wrong NUMA nodes
      to CPUs during subsequent CPU hotplug online operations (after booting).
      
      Further, a particularly problematic scenario can result from this bug:
      On POWER platforms, the SMT mode can be switched between 1, 2, 4 (and even 8)
      threads per core. The switch to Single-Threaded (ST) mode is performed by
      offlining all except the first CPU thread in each core. Switching back to
      SMT mode involves onlining those other threads back, in each core.
      
      Now consider this scenario:
      
      1. During boot, the kernel gets the cpu-to-node mappings from the firmware
         and assigns the CPUs to NUMA nodes appropriately, during CPU online.
      
      2. Later on, the hypervisor updates the cpu-to-node mappings dynamically and
         communicates this update to the kernel. The kernel in turn updates its
         cpu-to-node associations and rebuilds its sched domains. Everything is
         fine so far.
      
      3. Now, the user switches the machine from SMT to ST mode (say, by running
         ppc64_cpu --smt=1). This involves offlining all except 1 thread in each
         core.
      
      4. The user then tries to switch back from ST to SMT mode (say, by running
         ppc64_cpu --smt=4), and this involves onlining those threads back. Since
         CPU hotplug ignores the new mappings, it queries the firmware and tries to
         associate the newly onlined sibling threads to the old NUMA nodes. This
         results in sibling threads within the same core getting associated with
         different NUMA nodes, which is incorrect.
      
         The scheduler's build-sched-domains code gets thoroughly confused with this
         and enters an infinite loop and causes soft-lockups, as explained in detail
         in commit 3be7db6a (powerpc: VPHN topology change updates all siblings).
      
      So to fix this, use the numa_cpu_lookup_table to remember the updated
      cpu-to-node mappings, and use them during CPU hotplug online operations.
      Further, we also need to ensure that all threads in a core are assigned to a
      common NUMA node, irrespective of whether all those threads were online during
      the topology update. To achieve this, we take care not to use cpu_sibling_mask()
      since it is not hotplug invariant. Instead, we use cpu_first_sibling_thread()
      and set up the mappings manually using the 'threads_per_core' value for that
      particular platform. This helps us ensure that we don't hit this bug with any
      combination of CPU hotplug and SMT mode switching.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NSrivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      d4edc5b6
    • G
      powerpc/iommu: Don't detach device without IOMMU group · 0c4b9e27
      Gavin Shan 提交于
      Some devices, for example PCI root port, don't have IOMMU table and
      group. We needn't detach them from their IOMMU group. Otherwise, it
      potentially incurs kernel crash because of referring NULL IOMMU group
      as following backtrace indicates:
      
        .iommu_group_remove_device+0x74/0x1b0
        .iommu_bus_notifier+0x94/0xb4
        .notifier_call_chain+0x78/0xe8
        .__blocking_notifier_call_chain+0x7c/0xbc
        .blocking_notifier_call_chain+0x38/0x48
        .device_del+0x50/0x234
        .pci_remove_bus_device+0x88/0x138
        .pci_stop_and_remove_bus_device+0x2c/0x40
        .pcibios_remove_pci_devices+0xcc/0xfc
        .pcibios_remove_pci_devices+0x3c/0xfc
      Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
      Reviewed-by: NAlexey Kardashevskiy <aik@ozlabs.ru>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      0c4b9e27
    • G
      powerpc/eeh: Hotplug improvement · f26c7a03
      Gavin Shan 提交于
      When EEH error comes to one specific PCI device before its driver
      is loaded, we will apply hotplug to recover the error. During the
      plug time, the PCI device will be probed and its driver is loaded.
      Then we wrongly calls to the error handlers if the driver supports
      EEH explicitly.
      
      The patch intends to fix by introducing flag EEH_DEV_NO_HANDLER and
      set it before we remove the PCI device. In turn, we can avoid wrongly
      calls the error handlers of the PCI device after its driver loaded.
      Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      f26c7a03
    • G
      powerpc/eeh: Call opal_pci_reinit() on powernv for restoring config space · 9be3becc
      Gavin Shan 提交于
      The patch implements the EEH operation backend restore_config()
      for PowerNV platform. That relies on OPAL API opal_pci_reinit()
      where we reinitialize the error reporting properly after PE or
      PHB reset.
      Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      9be3becc
    • G
      powerpc/eeh: Add restore_config operation · 1d350544
      Gavin Shan 提交于
      After reset on the specific PE or PHB, we never configure AER
      correctly on PowerNV platform. We needn't care it on pSeries
      platform. The patch introduces additional EEH operation eeh_ops::
      restore_config() so that we have chance to configure AER correctly
      for PowerNV platform.
      Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      1d350544
    • G
      powerpc/powernv: Remove unnecessary assignment · 8184616f
      Gavin Shan 提交于
      We don't have IO ports on PHB3 and the assignment of variable
      "iomap_off" on PHB3 is meaningless. The patch just removes the
      unnecessary assignment to the variable. The code change should
      have been part of commit c35d2a8c ("powerpc/powernv: Needn't IO
      segment map for PHB3").
      Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      8184616f
    • N
      Revert "pseries/iommu: Remove DDW on kexec" · 97e7dc52
      Nishanth Aravamudan 提交于
      After reverting 25ebc45b
      ("powerpc/pseries/iommu: remove default window before attempting DDW
      manipulation"), we no longer remove the base window in enable_ddw.
      Therefore, we no longer need to reset the DMA window state in
      find_existing_ddw_windows(). We can instead go back to what was done
      before, which simply reuses the previous configuration, if any. Further,
      this removes the final caller of the reset-pe-dma-windows call, so
      remove those functions.
      
      This fixes an EEH on kdump with the ipr driver. The EEH occurs, because
      the initcall removes the DDW configuration (64-bit DMA window), but
      doesn't ensure the ops are via the IOMMU -- a DMA operation occurs
      during probe (still investigating this) and we EEH.
      
      This reverts commit 14b6f00f.
      Signed-off-by: NNishanth Aravamudan <nacc@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      97e7dc52
    • N
      Revert "powerpc/pseries/iommu: remove default window before attempting DDW manipulation" · ae69e1ed
      Nishanth Aravamudan 提交于
      Ben rightfully pointed out that there is a race in the "newer" DDW code.
      Presuming we are running on recent enough firmware that supports the
      "reset" DDW manipulation call, we currently always remove the base
      32-bit DMA window in order to maximize the resources for Phyp when
      creating the 64-bit window. However, this can be problematic for the
      case where multiple functions are in the same PE (partitionable
      endpoint), where some funtions might be 32-bit DMA only. All of a
      sudden, the only functional DMA window for such functions is gone. We
      will have serious errors in such situations. The best solution is simply
      to revert the extension to the DDW code where we ever remove the base
      DMA window.
      
      This reverts commit 25ebc45b.
      Signed-off-by: NNishanth Aravamudan <nacc@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      ae69e1ed
    • P
      powerpc: Delete non-required instances of include <linux/init.h> · c141611f
      Paul Gortmaker 提交于
      None of these files are actually using any __init type directives
      and hence don't need to include <linux/init.h>.  Most are just a
      left over from __devinit and __cpuinit removal, or simply due to
      code getting copied from one driver to the next.
      
      The one instance where we add an include for init.h covers off
      a case where that file was implicitly getting it from another
      header which itself didn't need it.
      Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      c141611f
    • A
      powerpc: Add vr save/restore functions · 8fe9c93e
      Andreas Schwab 提交于
      GCC 4.8 now generates out-of-line vr save/restore functions when
      optimizing for size.  They are needed for the raid6 altivec support.
      Signed-off-by: NAndreas Schwab <schwab@linux-m68k.org>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      8fe9c93e
  8. 13 1月, 2014 1 次提交
    • B
      powerpc: Check return value of instance-to-package OF call · 10348f59
      Benjamin Herrenschmidt 提交于
      On PA-Semi firmware, the instance-to-package callback doesn't seem
      to be implemented. We didn't check for error, however, thus
      subsequently passed the -1 value returned into stdout_node to
      thins like prom_getprop etc...
      
      Thus caused the firmware to load values around 0 (physical) internally
      as node structures. It somewhat "worked" as long as we had a NULL in the
      right place (address 8) at the beginning of the kernel, we didn't "see"
      the bug. But commit 5c0484e2
      "powerpc: Endian safe trampoline" changed the kernel entry point causing
      that old bug to now cause a crash early during boot.
      
      This fixes booting on PA-Semi board by properly checking the return
      value from instance-to-package.
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Tested-by: NOlof Johansson <olof@lixom.net>
      ---
      10348f59
  9. 12 1月, 2014 1 次提交
    • P
      arch: Introduce smp_load_acquire(), smp_store_release() · 47933ad4
      Peter Zijlstra 提交于
      A number of situations currently require the heavyweight smp_mb(),
      even though there is no need to order prior stores against later
      loads.  Many architectures have much cheaper ways to handle these
      situations, but the Linux kernel currently has no portable way
      to make use of them.
      
      This commit therefore supplies smp_load_acquire() and
      smp_store_release() to remedy this situation.  The new
      smp_load_acquire() primitive orders the specified load against
      any subsequent reads or writes, while the new smp_store_release()
      primitive orders the specifed store against any prior reads or
      writes.  These primitives allow array-based circular FIFOs to be
      implemented without an smp_mb(), and also allow a theoretical
      hole in rcu_assign_pointer() to be closed at no additional
      expense on most architectures.
      
      In addition, the RCU experience transitioning from explicit
      smp_read_barrier_depends() and smp_wmb() to rcu_dereference()
      and rcu_assign_pointer(), respectively resulted in substantial
      improvements in readability.  It therefore seems likely that
      replacing other explicit barriers with smp_load_acquire() and
      smp_store_release() will provide similar benefits.  It appears
      that roughly half of the explicit barriers in core kernel code
      might be so replaced.
      
      [Changelog by PaulMck]
      Reviewed-by: N"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
      Acked-by: NWill Deacon <will.deacon@arm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Michael Ellerman <michael@ellerman.id.au>
      Cc: Michael Neuling <mikey@neuling.org>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Victor Kaplansky <VICTORK@il.ibm.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Link: http://lkml.kernel.org/r/20131213150640.908486364@infradead.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      47933ad4
  10. 11 1月, 2014 3 次提交