- 05 11月, 2009 10 次提交
-
-
由 Alexander Graf 提交于
Following S390's good example we should use hrtimers for the decrementer too! This patch converts the timer from the old mechanism to hrtimers. Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Alexander Graf 提交于
For KVM we need to store some information in the PACA, so we need to extend it. This patch adds KVM SLB shadow related entries to the PACA and a field that indicates if we're inside a guest. Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Alexander Graf 提交于
For KVM we need to allocate a new context id, but don't really care about all the mm context around it. So let's split the alloc and destroy functions for the context id, so we can grab one without allocating an mm context. Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Alexander Graf 提交于
We need to run some KVM trampoline code in real mode. Unfortunately, real mode only covers 8MB on Cell so we need to squeeze ourselves as low as possible. Also, we need to trap interrupts to get us back from guest state to host state without telling Linux about it. This patch adds interrupt traps and includes the KVM code that requires real mode in the real mode parts of Linux. Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Alexander Graf 提交于
This is the of entry / exit code. In order to switch between host and guest context, we need to switch register state and call the exit code handler on exit. This assembly file does exactly that. To finally enter the guest it calls into book3s_64_slb.S. On exit it gets jumped at from book3s_64_slb.S too. Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Alexander Graf 提交于
We need to intercept interrupt vectors. To do that, let's add a file we can always include which only activates the intercepts when we have then configured. Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Alexander Graf 提交于
This adds the book3s specific header file that contains structs that are only valid on book3s specific code. Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Alexander Graf 提交于
We need to store more information than we currently have for vcpus when running on Book3s. So let's extend the internal struct definitions. Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Alexander Graf 提交于
We need quite a bunch of new constants for KVM on Book3s, so let's define them now. These constants will be used in later patches. Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Alexander Graf 提交于
Right now sregs is unused on PPC, so we can use it for initialization of the CPU. KVM on BookE always virtualizes the host CPU. On Book3s we go a step further and take the PVR from userspace that tells us what kind of CPU we are supposed to virtualize, because we support Book3s_32 and Book3s_64 guests. In order to get that information, we use the sregs ioctl, because we don't want to reset the guest CPU on every normal register set. Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 30 10月, 2009 10 次提交
-
-
由 Michael Ellerman 提交于
Defining CONFIG_SPARSE_IRQ enables generic code that gets rid of the static irq_desc array, and replaces it with an array of pointers to irq_descs. It also allows node local allocation of irq_descs, however we currently don't have the information available to do that, so we just allocate them on all on node 0. Signed-off-by: NMichael Ellerman <michael@ellerman.id.au> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Thomas Gleixner 提交于
nvram_find_partition() has no user. The call site was removed in the arch/powerpc move, but the function stayed. Remove it. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: linuxppc-dev@ozlabs.org Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Michael Neuling 提交于
irqs_disabled_flags is #defined in linux/irqflags.h when CONFIG_TRACE_IRQFLAGS_SUPPORT is enabled. 64 and 32 bit always have CONFIG_TRACE_IRQFLAGS_SUPPORT enabled so just remove irqs_disabled_flags. This fixes the case when someone needs to include both linux/irqflags.h and asm/hw_irq.h. Signed-off-by: NMichael Neuling <mikey@neuling.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 David Gibson 提交于
The hugepage arch code provides a number of hook functions/macros which mirror the functionality of various normal page pte access functions. Various changes in the normal page accessors (in particular BenH's recent changes to the handling of lazy icache flushing and PAGE_EXEC) have caused the hugepage versions to get out of sync with the originals. In some cases, this is a bug, at least on some MMU types. One of the reasons that some hooks were not identical to the normal page versions, is that the fact we're dealing with a hugepage needed to be passed down do use the correct dcache-icache flush function. This patch makes the main flush_dcache_icache_page() function hugepage aware (by checking for the PageCompound flag). That in turn means we can make set_huge_pte_at() just a call to set_pte_at() bringing it back into sync. As a bonus, this lets us remove the hash_huge_page_do_lazy_icache() function, replacing it with a call to the hash_page_do_lazy_icache() function it was based on. Some other hugepage pte access hooks - huge_ptep_get_and_clear() and huge_ptep_clear_flush() - are not so easily unified, but this patch at least brings them back into sync with the current versions of the corresponding normal page functions. Signed-off-by: NDavid Gibson <dwg@au1.ibm.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 David Gibson 提交于
This patch separates the parts of hugetlbpage.c which are inherently specific to the hash MMU into a new hugelbpage-hash64.c file. Signed-off-by: NDavid Gibson <dwg@au1.ibm.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 David Gibson 提交于
This patch simplifies the logic used to initialize hugepages on powerpc. The somewhat oddly named set_huge_psize() is renamed to add_huge_page_size() and now does all necessary verification of whether it's given a valid hugepage sizes (instead of just some) and instantiates the generic hstate structure (but no more). hugetlbpage_init() now steps through the available pagesizes, checks if they're valid for hugepages by calling add_huge_page_size() and initializes the kmem_caches for the hugepage pagetables. This means we can now eliminate the mmu_huge_psizes array, since we no longer need to pass the sizing information for the pagetable caches from set_huge_psize() into hugetlbpage_init() Determination of the default huge page size is also moved from the hash code into the general hugepage code. Signed-off-by: NDavid Gibson <dwg@au1.ibm.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 David Gibson 提交于
Currently each available hugepage size uses a slightly different pagetable layout: that is, the bottem level table of pointers to hugepages is a different size, and may branch off from the normal page tables at a different level. Every hugepage aware path that needs to walk the pagetables must therefore look up the hugepage size from the slice info first, and work out the correct way to walk the pagetables accordingly. Future hardware is likely to add more possible hugepage sizes, more layout options and more mess. This patch, therefore reworks the handling of hugepage pagetables to reduce this complexity. In the new scheme, instead of having to consult the slice mask, pagetable walking code can check a flag in the PGD/PUD/PMD entries to see where to branch off to hugepage pagetables, and the entry also contains the information (eseentially hugepage shift) necessary to then interpret that table without recourse to the slice mask. This scheme can be extended neatly to handle multiple levels of self-describing "special" hugepage pagetables, although for now we assume only one level exists. This approach means that only the pagetable allocation path needs to know how the pagetables should be set out. All other (hugepage) pagetable walking paths can just interpret the structure as they go. There already was a flag bit in PGD/PUD/PMD entries for hugepage directory pointers, but it was only used for debug. We alter that flag bit to instead be a 0 in the MSB to indicate a hugepage pagetable pointer (normally it would be 1 since the pointer lies in the linear mapping). This means that asm pagetable walking can test for (and punt on) hugepage pointers with the same test that checks for unpopulated page directory entries (beq becomes bge), since hugepage pointers will always be positive, and normal pointers always negative. While we're at it, we get rid of the confusing (and grep defeating) #defining of hugepte_shift to be the same thing as mmu_huge_psizes. Signed-off-by: NDavid Gibson <dwg@au1.ibm.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 David Gibson 提交于
Currently we have a fair bit of rather fiddly code to manage the various kmem_caches used to store page tables of various levels. We generally have two caches holding some combination of PGD, PUD and PMD tables, plus several more for the special hugepage pagetables. This patch cleans this all up by taking a different approach. Rather than the caches being designated as for PUDs or for hugeptes for 16M pages, the caches are simply allocated to be a specific size. Thus sharing of caches between different types/levels of pagetables happens naturally. The pagetable size, where needed, is passed around encoded in the same way as {PGD,PUD,PMD}_INDEX_SIZE; that is n where the pagetable contains 2^n pointers. Signed-off-by: NDavid Gibson <dwg@au1.ibm.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Michael Ellerman 提交于
get_irq_desc() is a powerpc-specific version of irq_to_desc(). That is reason enough to remove it, but it also doesn't know about sparse irq_desc support which irq_to_desc() does (when we enable it). Signed-off-by: NMichael Ellerman <michael@ellerman.id.au> Acked-by: NGrant Likely <grant.likely@secretlab.ca> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Michael Ellerman 提交于
The irq_desc array consumes quite a lot of space, and for systems that don't need or can't have 512 irqs it's just wasted space. The first 16 are reserved for ISA, so the minimum of 32 is really 16 - and no one has asked for more than 512 so leave that as the maximum. Signed-off-by: NMichael Ellerman <michael@ellerman.id.au> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 14 10月, 2009 1 次提交
-
-
由 Anton Blanchard 提交于
Profiling of a page fault scalability microbenchmark shows flush_hash_range is not calling the batch hpte invalidate hcall (H_BULK_REMOVE). It turns out we have a duplicate firmware feature for hcall-bulk and the current setup code stops after finding the first match. This meant we never batch and always do individual invalidates. The patch below removes the duplicate and shifts FW_FEATURE_CMO to close the gap. With the patch applied the single threaded page fault rate improves from 217169 to 238755 per second on a POWER5 test box, a 10% improvement. Signed-off-by: NAnton Blanchard <anton@samba.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 24 9月, 2009 9 次提交
-
-
由 Benjamin Herrenschmidt 提交于
The test to check whether we have _PAGE_SPECIAL defined is broken, since we always define it, just not always to a meaninful value :-) That broke 8xx and 40x under some circumstances. This fixes it by adding _PAGE_SPECIAL for both of these since they had a free PTE bit, and removing the condition around advertising it. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Becky Bruce 提交于
Sometimes this is used to hold a simple offset, and sometimes it is used to hold a pointer. This patch changes it to a union containing void * and dma_addr_t. get/set accessors are also provided, because it was getting a bit ugly to get to the actual data. Signed-off-by: NBecky Bruce <beckyb@kernel.crashing.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Becky Bruce 提交于
The former is no longer really accurate with the swiotlb case now a possibility. I also move it into dma-mapping.h - it no longer needs to be in dma.c, and there are about to be some more accessors that should all end up in the same place. A comment is added to indicate that this function is not used in configs where there is no simple dma offset, such as the iommu case. Signed-off-by: NBecky Bruce <beckyb@kernel.crashing.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Benjamin Herrenschmidt 提交于
It doesn't exist ! Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Rusty Russell 提交于
Now everyone is converted to arch_send_call_function_ipi_mask, remove the shim and the #defines. Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
-
由 Rusty Russell 提交于
We're weaning the core code off handing cpumask's around on-stack. This introduces arch_send_call_function_ipi_mask(), and by defining it, the old arch_send_call_function_ipi is defined by the core code. Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
-
由 Rusty Russell 提交于
There were replaced by topology_core_cpumask and topology_thread_cpumask. Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
-
由 Rusty Russell 提交于
Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
-
由 Rusty Russell 提交于
cpumask_of_pcibus() is the new version. Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
-
- 22 9月, 2009 2 次提交
-
-
由 Arnd Bergmann 提交于
Add a flag for mmap that will be used to request a huge page region that will look like anonymous memory to user space. This is accomplished by using a file on the internal vfsmount. MAP_HUGETLB is a modifier of MAP_ANONYMOUS and so must be specified with it. The region will behave the same as a MAP_ANONYMOUS region using small pages. The patch also adds the MAP_STACK flag, which was previously defined only on some architectures but not on others. Since MAP_STACK is meant to be a hint only, architectures can define it without assigning a specific meaning to it. Signed-off-by: NArnd Bergmann <arnd@arndb.de> Cc: Eric B Munson <ebmunson@us.ibm.com> Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk> Cc: David Rientjes <rientjes@google.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Paul Mackerras 提交于
This fixes two places in the powerpc perf_event (perf_counter) code where 'list_entry' needs to be changed to 'group_entry', but were missed in commit 65abc865 ("perf_counter: Rename list_entry -> group_entry, counter_list -> group_list"). This also changes 'event' back to 'counter' in a couple of contexts: * Field and function names that deal with the limited-function counters: it's really the hardware counters whose function is limited, not the events that they count. Hence: MAX_LIMITED_HWEVENTS -> MAX_LIMITED_HWCOUNTERS limited_event -> limited_counter freeze/thaw_limited_events -> freeze/thaw_limited_counters * The machine-specific PMU description struct (struct power_pmu): this renames 'n_event' back to 'n_counter' since it really describes how many hardware counters the machine has. (Renaming this back avoids a compile error in each of the machine-specific PMU back-ends where they initialize their power_pmu struct.) Signed-off-by: NPaul Mackerras <paulus@samba.org> Cc: linuxppc-dev@ozlabs.org Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <19128.4280.813369.589704@cargo.ozlabs.ibm.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 21 9月, 2009 2 次提交
-
-
由 Ingo Molnar 提交于
- provide compatibility Kconfig entry for existing PERF_COUNTERS .config's - provide courtesy copy of old perf_counter.h, for user-space projects - small indentation fixups - fix up MAINTAINERS - fix small x86 printout fallout - fix up small PowerPC comment fallout (use 'counter' as in register) Reviewed-by: NArjan van de Ven <arjan@linux.intel.com> Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Ingo Molnar 提交于
Bye-bye Performance Counters, welcome Performance Events! In the past few months the perfcounters subsystem has grown out its initial role of counting hardware events, and has become (and is becoming) a much broader generic event enumeration, reporting, logging, monitoring, analysis facility. Naming its core object 'perf_counter' and naming the subsystem 'perfcounters' has become more and more of a misnomer. With pending code like hw-breakpoints support the 'counter' name is less and less appropriate. All in one, we've decided to rename the subsystem to 'performance events' and to propagate this rename through all fields, variables and API names. (in an ABI compatible fashion) The word 'event' is also a bit shorter than 'counter' - which makes it slightly more convenient to write/handle as well. Thanks goes to Stephane Eranian who first observed this misnomer and suggested a rename. User-space tooling and ABI compatibility is not affected - this patch should be function-invariant. (Also, defconfigs were not touched to keep the size down.) This patch has been generated via the following script: FILES=$(find * -type f | grep -vE 'oprofile|[^K]config') sed -i \ -e 's/PERF_EVENT_/PERF_RECORD_/g' \ -e 's/PERF_COUNTER/PERF_EVENT/g' \ -e 's/perf_counter/perf_event/g' \ -e 's/nb_counters/nb_events/g' \ -e 's/swcounter/swevent/g' \ -e 's/tpcounter_event/tp_event/g' \ $FILES for N in $(find . -name perf_counter.[ch]); do M=$(echo $N | sed 's/perf_counter/perf_event/g') mv $N $M done FILES=$(find . -name perf_event.*) sed -i \ -e 's/COUNTER_MASK/REG_MASK/g' \ -e 's/COUNTER/EVENT/g' \ -e 's/\<event\>/event_id/g' \ -e 's/counter/event/g' \ -e 's/Counter/Event/g' \ $FILES ... to keep it as correct as possible. This script can also be used by anyone who has pending perfcounters patches - it converts a Linux kernel tree over to the new naming. We tried to time this change to the point in time where the amount of pending patches is the smallest: the end of the merge window. Namespace clashes were fixed up in a preparatory patch - and some stylistic fallout will be fixed up in a subsequent patch. ( NOTE: 'counters' are still the proper terminology when we deal with hardware registers - and these sed scripts are a bit over-eager in renaming them. I've undone some of that, but in case there's something left where 'counter' would be better than 'event' we can undo that on an individual basis instead of touching an otherwise nicely automated patch. ) Suggested-by: NStephane Eranian <eranian@google.com> Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: NPaul Mackerras <paulus@samba.org> Reviewed-by: NArjan van de Ven <arjan@linux.intel.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David Howells <dhowells@redhat.com> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: <linux-arch@vger.kernel.org> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 16 9月, 2009 1 次提交
-
-
由 Peter Zijlstra 提交于
Sysbench thinks SD_BALANCE_WAKE is too agressive and kbuild doesn't really mind too much, SD_BALANCE_NEWIDLE picks up most of the slack. On a dual socket, quad core, dual thread nehalem system: sysbench (--num_threads=16): SD_BALANCE_WAKE-: 13982 tx/s SD_BALANCE_WAKE+: 15688 tx/s kbuild (-j16): SD_BALANCE_WAKE-: 47.648295846 seconds time elapsed ( +- 0.312% ) SD_BALANCE_WAKE+: 47.608607360 seconds time elapsed ( +- 0.026% ) (same within noise) Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 15 9月, 2009 3 次提交
-
-
由 Mike Galbraith 提交于
Make the idle balancer more agressive, to improve a x264 encoding workload provided by Jason Garrett-Glaser: NEXT_BUDDY NO_LB_BIAS encoded 600 frames, 252.82 fps, 22096.60 kb/s encoded 600 frames, 250.69 fps, 22096.60 kb/s encoded 600 frames, 245.76 fps, 22096.60 kb/s NO_NEXT_BUDDY LB_BIAS encoded 600 frames, 344.44 fps, 22096.60 kb/s encoded 600 frames, 346.66 fps, 22096.60 kb/s encoded 600 frames, 352.59 fps, 22096.60 kb/s NO_NEXT_BUDDY NO_LB_BIAS encoded 600 frames, 425.75 fps, 22096.60 kb/s encoded 600 frames, 425.45 fps, 22096.60 kb/s encoded 600 frames, 422.49 fps, 22096.60 kb/s Peter pointed out that this is better done via newidle_idx, not via LB_BIAS, newidle balancing should look for where there is load _now_, not where there was load 2 ticks ago. Worst-case latencies are improved as well as no buddies means less vruntime spread. (as per prior lkml discussions) This change improves kbuild-peak parallelism as well. Reported-by: NJason Garrett-Glaser <darkshikari@gmail.com> Signed-off-by: NMike Galbraith <efault@gmx.de> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1253011667.9128.16.camel@marge.simson.net> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
When merging select_task_rq_fair() and sched_balance_self() we lost the use of wake_idx, restore that and set them to 0 to make wake balancing more aggressive. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
The problem with wake_idle() is that is doesn't respect things like cpu_power, which means it doesn't deal well with SMT nor the recent RT interaction. To cure this, it needs to do what sched_balance_self() does, which leads to the possibility of merging select_task_rq_fair() and sched_balance_self(). Modify sched_balance_self() to: - update_shares() when walking up the domain tree, (it only called it for the top domain, but it should have done this anyway), which allows us to remove this ugly bit from try_to_wake_up(). - do wake_affine() on the smallest domain that contains both this (the waking) and the prev (the wakee) cpu for WAKE invocations. Then use the top-down balance steps it had to replace wake_idle(). This leads to the dissapearance of SD_WAKE_BALANCE and SD_WAKE_IDLE_FAR, with SD_WAKE_IDLE replaced with SD_BALANCE_WAKE. SD_WAKE_AFFINE needs SD_BALANCE_WAKE to be effective. Touch all topology bits to replace the old with new SD flags -- platforms might need re-tuning, enabling SD_BALANCE_WAKE conditionally on a NUMA distance seems like a good additional feature, magny-core and small nehalem systems would want this enabled, systems with slow interconnects would not. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 11 9月, 2009 2 次提交
-
-
由 Martyn Welch 提交于
Remove the reliance on a staticly defined NVRAM size, allowing platforms to support NVRAMs with sizes differing from the standard. A fall back value is provided for platforms not supporting this extension. Signed-off-by: NMartyn Welch <martyn.welch@gefanuc.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Paul Mackerras 提交于
Currently there is a bug where if you use oprofile on a pSeries machine, then use perf_counters, then use oprofile again, oprofile will not work correctly; it will lose the PMU configuration the next time the hypervisor does a partition context switch, and thereafter won't count anything. Maynard Johnson identified the sequence causing the problem: - oprofile setup calls ppc_enable_pmcs(), which calls pseries_lpar_enable_pmcs, which tells the hypervisor that we want to use the PMU, and sets the "PMU in use" flag in the lppaca. This flag tells the hypervisor whether it needs to save and restore the PMU config. - The perf_counter code sets and clears the "PMU in use" flag directly as it context-switches the PMU between tasks, and leaves it clear when it finishes. - oprofile setup, called for a new oprofile run, calls ppc_enable_pmcs, which does nothing because it has already been called. In particular it doesn't set the "PMU in use" flag. This fixes the problem by arranging for ppc_enable_pmcs to always set the "PMU in use" flag. It makes the perf_counter code call ppc_enable_pmcs also rather than calling the lower-level function directly, and removes the setting of the "PMU in use" flag from pseries_lpar_enable_pmcs, since that is now done in its caller. This also removes the declaration of pasemi_enable_pmcs because it isn't defined anywhere. Reported-by: NMaynard Johnson <mpjohn@us.ibm.com> Signed-off-by: NPaul Mackerras <paulus@samba.org> Cc: <stable@kernel.org) Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-