- 20 9月, 2011 40 次提交
-
-
由 Anshuman Khandual 提交于
perf events, powerpc: Add POWER7 stalled-cycles-frontend/backend events Extent the POWER7 PMU driver with definitions for generic front-end and back-end stall events. As explained in Ingo's original comment(8f622422 ), the exact definitions of the stall events are very much processor specific as different things mean different in their respective instruction pipeline. These two Power7 raw events are the closest approximation to the concept detailed in Ingo's comment. [PERF_COUNT_HW_STALLED_CYCLES_FRONTEND] = 0x100f8, /* GCT_NOSLOT_CYC */ It means cycles when the Global Completion Table has no slots from this thread [PERF_COUNT_HW_STALLED_CYCLES_BACKEND] = 0x4000a, /* CMPLU_STALL */ It means no groups completed and GCT not empty for this thread Signed-off-by: NAnshuman Khandual <khandual@linux.vnet.ibm.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Benjamin Herrenschmidt 提交于
The firmware doesn't wait after lifting the PCI reset. However it does timestamp it in the device tree. We use that to ensure we wait long enough (3s is our current arbitrary setting) from that timestamp to actually probing the bus. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Benjamin Herrenschmidt 提交于
This implements support for MSIs on p5ioc2 PHBs. We only support MSIs on the PCIe PHBs, not the PCI-X ones as the later hasn't been properly verified in HW. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Benjamin Herrenschmidt 提交于
This adds support for PCI-X and PCIe on the p5ioc2 IO hub using OPAL. This includes allocating & setting up TCE tables and config space access routines. This also supports fallbacks via RTAS when OPAL is absent, using legacy TCE format pre-allocated via the device-tree (BML style) Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Benjamin Herrenschmidt 提交于
OPAL can handle various interrupt for us such as Machine Checks (it performs all sorts of recovery tasks and passes back control to us with informations about the error), Hardware Management Interrupts and Softpatch interrupts. This wires up the mechanisms and prints out specific informations returned by HAL when a machine check occurs. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Benjamin Herrenschmidt 提交于
We do the minimum which is to "pass" interrupts to HAL, which makes the console smoother and will allow us to implement interrupt based completion and console. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Benjamin Herrenschmidt 提交于
OPAL handles HW access to the various ICS or equivalent chips for us (with the exception of p5ioc2 based HEA which uses a different backend) similarily to what RTAS does on pSeries. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Benjamin Herrenschmidt 提交于
Implements OPAL RTC and NVRAM support and wire all that up to the powernv platform. We use RTAS for RTC as a fallback if available. Using RTAS for nvram is not supported yet, pending some rework/cleanup and generalization of the pSeries & CHRP code. We also use RTAS fallbacks for power off and reboot Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Benjamin Herrenschmidt 提交于
This calls the respective HAL functions, and spin on hal_poll_event() to ensure the HAL has a chance to communicate with the FSP to trigger the reboot or shutdown operation Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Benjamin Herrenschmidt 提交于
This adds a udbg and an hvc console backend for supporting a console using the OPAL console interfaces. On OPAL v1 we have hvc0 mapped to whatever console the system was configured for (network or hvsi serial port) via the service processor. On OPAL v2 we have hvcN mapped to the Nth console provided by OPAL which generally corresponds to: hvc0 : network console (raw protocol) hvc1 : serial port S1 (hvsi) hvc2 : serial port S2 (hvsi) Note: At this point, early debug console only works with OPAL v1 and shouldn't be enabled in a normal kernel. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Benjamin Herrenschmidt 提交于
OPAL v2 is instantiated in a way similar to RTAS using Open Firmware client interface calls, and the resulting address and entry point are put in the device-tree Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Benjamin Herrenschmidt 提交于
Add definition of OPAL interfaces along with the wrappers to call into OPAL runtime and the early device-tree parsing hook to locate the OPAL runtime firmware. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Benjamin Herrenschmidt 提交于
We stash it in boot_command_line which isn't in BSS and so won't be overwritten. We then use that as a default cmd_line before we walk the device-tree. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Benjamin Herrenschmidt 提交于
On machines supporting the OPAL firmware version 1, the system is initially booted under pHyp. We then use a special hypercall to verify if OPAL is available and if it is, we then trigger a "takeover" which disables pHyp and loads the OPAL runtime firmware, giving control to the kernel in hypervisor mode. This patch add the necessary code to detect that the OPAL takeover capability is present when running under PowerVM (aka pHyp) and perform said takeover to get hypervisor control of the processor. To perform the takeover, we must first use RTAS (within Open Firmware runtime environment) to start all processors & threads, in order to give control to OPAL on all of them. We then call the takeover hypercall on everybody, OPAL will re-enter the kernel main entry point passing it a flat device-tree. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Benjamin Herrenschmidt 提交于
Unplugged CPU go into NAP mode in a loop until woken up Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Benjamin Herrenschmidt 提交于
We used to overwrite with CONFIG_CMDLINE if we found a chosen node but failed to get bootargs out of it or they were empty, unless CONFIG_CMDLINE_FORCE is set. Instead change that to overwrite if "data" is non empty after the bootargs check. It allows arch code to have other mechanisms to retrieve the command line prior to parsing the device-tree. Note: CONFIG_CMDLINE_FORCE case should ideally be handled elsewhere as it won't work as it-is if the device-tree has no /chosen node Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> CC: devicetree-discuss@lists-ozlabs.org CC: Grant Likely <grant.likely@secretlab.ca> Acked-by: NGrant Likely <grant.likely@secretlab.ca>
-
由 Benjamin Herrenschmidt 提交于
This adds a skeletton for the new Power "Non Virtualized" platform which will be used by machines supporting running without an hypervisor, for example in order to run KVM. These machines will be using a new firmware called OPAL for which the support will be provided by later patches. The PowerNV platform is intended to be also usable under the BML environment used internally for early CPU bringup which is why the code also supports using RTAS instead of OPAL in various places. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Benjamin Herrenschmidt 提交于
With OPAL, r8 and r9 will be used to pass the OPAL base and entry for debugging purposes (those informations are also in the device-tree). We don't want to clobber those registers that early. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Benjamin Herrenschmidt 提交于
This new function is used to properly setup the PCI Express Max Payload Size (and in some circumstances Max Read Request Size). Some systems will not operate properly if these aren't set correctly and the firmware doesn't always do it. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Benjamin Herrenschmidt 提交于
This adds more generic support for doing CPU hotplug with a simple idle loop and no actual reset of the processors. The generic smp_generic_kick_cpu() does the hotplug bringup trick if the PACA shows that the CPU has already been started at boot and we provide an accessor for the CPU state. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Benjamin Herrenschmidt 提交于
It was preventing the global early debug selection whenever KVM was enabled instead of only preventing the 440 specific one. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Brian King 提交于
The Power platform requires the partner info buffer to be page aligned otherwise it will fail the partner info hcall with H_PARAMETER. Switch from using kmalloc to allocate this buffer to __get_free_page to ensure page alignment. Signed-off-by: NBrian King <brking@linux.vnet.ibm.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Anton Blanchard 提交于
The icswx code introduced an A-B B-A deadlock: CPU0 CPU1 ---- ---- lock(&anon_vma->mutex); lock(&mm->mmap_sem); lock(&anon_vma->mutex); lock(&mm->mmap_sem); Instead of using the mmap_sem to keep mm_users constant, take the page table spinlock. Signed-off-by: NAnton Blanchard <anton@samba.org> Cc: <stable@kernel.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Anton Blanchard 提交于
If we echo an address the hypervisor doesn't like to /sys/devices/system/memory/probe we oops the box: # echo 0x10000000000 > /sys/devices/system/memory/probe kernel BUG at arch/powerpc/mm/hash_utils_64.c:541! The backtrace is: create_section_mapping arch_add_memory add_memory memory_probe_store sysdev_class_store sysfs_write_file vfs_write SyS_write In create_section_mapping we BUG if htab_bolt_mapping returned an error. A better approach is to return an error which will propagate back to userspace. Rerunning the test with this patch applied: # echo 0x10000000000 > /sys/devices/system/memory/probe -bash: echo: write error: Invalid argument Signed-off-by: NAnton Blanchard <anton@samba.org> Cc: stable@kernel.org Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Anton Blanchard 提交于
While converting code to use for_each_node_by_type I noticed a number of coding style issues. Signed-off-by: NAnton Blanchard <anton@samba.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Anton Blanchard 提交于
Use for_each_node_by_type instead of open coding it. Signed-off-by: NAnton Blanchard <anton@samba.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Anton Blanchard 提交于
During memory hotplug testing, I got the following warning: ERROR: Bad of_node_put() on /memory@0 of_node_release kref_put of_node_put of_find_node_by_type hot_add_node_scn_to_nid hot_add_scn_to_nid memory_add_physaddr_to_nid ... of_find_node_by_type() loop does the of_node_put for us so we only need the handle the case where we terminate the loop early. As suggested by Stephen Rothwell we can do the of_node_put unconditionally outside of the loop since of_node_put handles a NULL argument fine. Signed-off-by: NAnton Blanchard <anton@samba.org> Cc: stable@kernel.org Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Anton Blanchard 提交于
We have two identical definitions of RECLAIM_DISTANCE, looks like the patch got applied twice. Remove one. Signed-off-by: NAnton Blanchard <anton@samba.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Anton Blanchard 提交于
On big POWER7 boxes we see large amounts of CPU time in system processes like workqueue and watchdog kernel threads. We currently rebalance the entire machine each time a task goes idle and this is very expensive on large machines. Disable newidle balancing at the node level and rely on the scheduler tick to rebalance across nodes. Signed-off-by: NAnton Blanchard <anton@samba.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Anton Blanchard 提交于
The largest POWER7 boxes have 32 nodes. SD_NODES_PER_DOMAIN groups nodes into chunks of 16 and adds a global balancing domain (SD_ALLNODES) above it. If we bump SD_NODES_PER_DOMAIN to 32, then we avoid this extra level of balancing on our largest boxes. Signed-off-by: NAnton Blanchard <anton@samba.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Anton Blanchard 提交于
We want to override the default value of SD_NODES_PER_DOMAIN on ppc64, so move it into linux/topology.h. Signed-off-by: NAnton Blanchard <anton@samba.org> Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Anton Blanchard 提交于
When chasing a performance issue on ppc64, I noticed tasks communicating via a pipe would often end up on different nodes. It turns out SD_WAKE_AFFINE is not set in our node defition. Commit 9fcd18c9 (sched: re-tune balancing) enabled SD_WAKE_AFFINE in the node definition for x86 and we need a similar change for ppc64. I used lmbench lat_ctx and perf bench pipe to verify this fix. Each benchmark was run 10 times and the average taken. lmbench lat_ctx: before: 66565 ops/sec after: 204700 ops/sec 3.1x faster perf bench pipe: before: 5.6570 usecs after: 1.3470 usecs 4.2x faster Signed-off-by: NAnton Blanchard <anton@samba.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Benjamin Herrenschmidt 提交于
(Merge in order to get the PCIe mps/mrss code fixes)
-
git://tesla.tglx.de/git/linux-2.6-tip由 Linus Torvalds 提交于
* 'irq-fixes-for-linus' of git://tesla.tglx.de/git/linux-2.6-tip: x86, iommu: Mark DMAR IRQ as non-threaded genirq: Make irq_shutdown() symmetric vs. irq_startup again
-
git://github.com/chrismason/linux由 Linus Torvalds 提交于
* 'for-linus' of git://github.com/chrismason/linux: Btrfs: only clear the need lookup flag after the dentry is setup BTRFS: Fix lseek return value for error Btrfs: don't change inode flag of the dest clone file Btrfs: don't make a file partly checksummed through file clone Btrfs: fix pages truncation in btrfs_ioctl_clone() btrfs: fix d_off in the first dirent
-
由 Andiry Xu 提交于
When a xHC host is unable to handle isochronous transfer in the interval, it reports a Missed Service Error event and skips some tds. Currently xhci driver handles MSE event in the following ways: 1. When encounter a MSE event, set ep->skip flag, update event ring dequeue pointer and return. 2. When encounter the next event on this ep, the driver will run the do-while loop, fetch td from ep's td_list to find the td corresponding to this event. All tds missed are marked as short transfer(-EXDEV). The do-while loop will end in two ways: 1. If the td pointed by the event trb is found; 2. If the ep ring's td_list is empty. However, if a buggy HW reports some unpredicted event (for example, an overrun event following a MSE event while the ep ring is actually not empty), the driver will never find the td, and it will loop until the td_list is empty. Unfortunately, the spinlock is dropped when give back a urb in the do-while loop. During the spinlock released period, the class driver may still submit urbs and add tds to the td_list. This may cause disaster, since the td_list will never be empty and the loop never ends, and the system hangs. To fix this, count the number of TDs on the ep ring before skipping TDs, and quit the loop when skipped that number of tds. This guarantees the do-while loop will end after certain number of cycles, and driver will not be trapped in an infinite loop. Signed-off-by: NAndiry Xu <andiry.xu@amd.com> Signed-off-by: NSarah Sharp <sarah.a.sharp@linux.intel.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Greg KH 提交于
Sometimes, when a USB 3.0 device is disconnected, the Intel Panther Point xHCI host controller will report a link state change with the state set to "SS.Inactive". This causes the xHCI host controller to issue a warm port reset, which doesn't finish before the USB core times out while waiting for it to complete. When the warm port reset does complete, and the xHC gives back a port status change event, the xHCI driver kicks khubd. However, it fails to set the bit indicating there is a change event for that port because the logic in xhci-hub.c doesn't check for the warm port reset bit. After that, the warm port status change bit is never cleared by the USB core, and the xHC stops reporting port status change bits. (The xHCI spec says it shouldn't report more port events until all change bits are cleared.) This means any port changes when a new device is connected will never be reported, and the port will seem "dead" until the xHCI driver is unloaded and reloaded, or the computer is rebooted. Fix this by making the xHCI driver set the port change bit when a warm port reset change bit is set. A better solution would be to make the USB core handle warm port reset in differently, merging the current code with the standard port reset code that does an incremental backoff on the timeout, and tries to complete the port reset two more times before giving up. That more complicated fix will be merged next window, and this fix will be backported to stable. This should be backported to kernels as old as 3.0, since that was the first kernel with commit a11496eb ("xHCI: warm reset support"). Signed-off-by: NSarah Sharp <sarah.a.sharp@linux.intel.com> Cc: stable@kernel.org Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Randy Dunlap 提交于
Fix build when CONFIG_ISA_DMA_API is enabled but CONFIG_COMEDI_PCI[_DRIVERS] is not enabled. Fixes these build errors: drivers/staging/comedi/drivers/ni_labpc.c: In function 'labpc_ai_cmd': drivers/staging/comedi/drivers/ni_labpc.c:1351: error: implicit declaration of function 'labpc_suggest_transfer_size' drivers/staging/comedi/drivers/ni_labpc.c: At top level: drivers/staging/comedi/drivers/ni_labpc.c:1802: error: conflicting types for 'labpc_suggest_transfer_size' drivers/staging/comedi/drivers/ni_labpc.c:1351: note: previous implicit declaration of 'labpc_suggest_transfer_size' was here Signed-off-by: NRandy Dunlap <rdunlap@xenotime.net> Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Linus Torvalds 提交于
Even with just the interface limited to admin, there really is little to reason to give byte-per-byte counts for taskstats. So round it down to something less intrusive. Acked-by: NBalbir Singh <bsingharora@gmail.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Linus Torvalds 提交于
Ok, this isn't optimal, since it means that 'iotop' needs admin capabilities, and we may have to work on this some more. But at the same time it is very much not acceptable to let anybody just read anybody elses IO statistics quite at this level. Use of the GENL_ADMIN_PERM suggested by Johannes Berg as an alternative to checking the capabilities by hand. Reported-by: NVasiliy Kulikov <segoon@openwall.com> Cc: Johannes Berg <johannes.berg@intel.com> Acked-by: NBalbir Singh <bsingharora@gmail.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-