- 17 12月, 2015 13 次提交
-
-
由 Laurent Dufour 提交于
User space checkpoint and restart tool (CRIU) needs the page's change to be soft tracked. This allows to do a pre checkpoint and then dump only touched pages. This is done by using a newly assigned PTE bit (_PAGE_SOFT_DIRTY) when the page is backed in memory, and a new _PAGE_SWP_SOFT_DIRTY bit when the page is swapped out. To introduce a new PTE _PAGE_SOFT_DIRTY bit value common to hash 4k and hash 64k pte, the bits already defined in hash-*4k.h should be shifted left by one. The _PAGE_SWP_SOFT_DIRTY bit is dynamically put after the swap type in the swap pte. A check is added to ensure that the bit is not overwritten by _PAGE_HPTEFLAGS. Signed-off-by: NLaurent Dufour <ldufour@linux.vnet.ibm.com> CC: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Michael Ellerman 提交于
The STD_EXCEPTION_PSERIES macro takes both a vector number, and a location (memory address). However both are always identical, so combine them to save repeating ourselves. This does mean an exception handler must always exist at the location in memory that matches its vector number. But that's OK because this is the "STD" macro (standard), which does exactly that. We have other macros for the other cases, eg. STD_EXCEPTION_PSERIES_OOL (out of line). Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Michael Ellerman 提交于
This is only used in one location, open code it. Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Michael Ellerman 提交于
HMT_MEDIUM_LOW_HAS_PPR is only used in once place, open code it. Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Michael Ellerman 提交于
HMT_MEDIUM_PPR_DISCARD is a macro which is present at the start of most of our first level exception handlers. It conditionally executes a HMT_MEDIUM instruction, which sets the processor priority to medium. On on modern systems, ie. Power7 and later, it is nop'ed out at boot. All it does is make the exception vectors more cramped, and consume 4 bytes of icache. On old systems it has the effect of boosting the processor priority at the start of exception processing. If we were previously in the idle loop for example, we may be at low or very low priority. This is desirable as we want to process the exception as fast as possible. However looking closely at the generated code, we see that in all cases we execute another HMT_MEDIUM just four instructions later. With code patching applied, the final code on an old (Power6) system will look like, eg: c000000000000300 <data_access_pSeries>: c000000000000300: 7c 42 13 78 mr r2,r2 <- c000000000000304: 7d b2 43 a6 mtsprg 2,r13 c000000000000308: 7d b1 42 a6 mfsprg r13,1 c00000000000030c: f9 2d 00 80 std r9,128(r13) c000000000000310: 60 00 00 00 nop c000000000000314: 7c 42 13 78 mr r2,r2 <- So I suggest that the added code complexity of HMT_MEDIUM_PPR_DISCARD is not justified by the benefit of boosting the processor priority for the duration of four instructions, and therefore we drop it. Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Michael Ellerman 提交于
There are no longer any users of enter_rtas() outside of rtas.c, so make it "private", by moving the declaration inside rtas.c. Hopefully this will encourage people to use one of the wrappers which takes the sharp edges off the RTAS calling sequence. Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Michael Ellerman 提交于
Although call_rtas_display_status() does actually want to use the regular RTAS locking, it doesn't want the extra logic that is in rtas_call(), so currently it open codes the logic. Instead we can use rtas_call_unlocked(), after taking the RTAS lock. Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Michael Ellerman 提交于
Avoid open coding the logic by using rtas_call_unlocked(). Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Michael Ellerman 提交于
Avoid open coding the logic by using rtas_call_unlocked(). Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Michael Ellerman 提交于
Most users of RTAS (Run-Time Abstraction Services) use rtas_call(), which deals with locking as well as endian handling. However we have two users outside of rtas.c that can't use rtas_call() because they have different locking requirements. The hotplug CPU code can't take the RTAS lock because the CPU would go offline with the lock held and no other CPUs would be able to call RTAS until the CPU came back online. The xmon code doesn't want to take the lock because it would risk dead locking when we are trying to recover from a crash. Both sites required multiple patches when we added little endian support, proving that programmers can't do endian right. Although that ship has sailed, we can still clean the code up by providing an unlocked version of rtas_call() which avoids the need to open code the logic elsewhere. Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Stewart Smith 提交于
Long ago, only in the lab, there was OPALv1 and OPALv2. Now there is just OPALv3, with nobody ever expecting anything on pre-OPALv3 to be cared about or supported by mainline kernels. So, let's remove FW_FEATURE_OPALv3 and instead use FW_FEATURE_OPAL exclusively. Signed-off-by: NStewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Stewart Smith 提交于
OPALv2 only ever existed in the lab and didn't escape to the world. All OPAL systems in the wild are OPALv3. The probability of there being an OPALv2 system still powered on anywhere inside IBM is approximately zero, let alone anyone expecting to run mainline kernels. So, start to remove references to OPALv2. Signed-off-by: NStewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Stewart Smith 提交于
The OpenPower Abstraction Layer firmware went through a couple of iterations in the lab before being released. What we now know as OPAL advertises itself as OPALv3. OPALv2 and OPALv1 never made it outside the lab, and the possibility of anyone at all ever building a mainline kernel today and expecting it to boot on such hardware is zero. Signed-off-by: NStewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 16 12月, 2015 1 次提交
-
-
由 Daniel Axtens 提交于
GregorianDay() is supposed to calculate the day of the week (tm->tm_wday) for a given day/month/year. In that calcuation it indexed into an array called MonthOffset using tm->tm_mon-1. However tm_mon is zero-based, not one-based, so this is off-by-one. It also means that every January, GregoiranDay() will access element -1 of the MonthOffset array. It also doesn't appear to be a correct algorithm either: see in contrast kernel/time/timeconv.c's time_to_tm function. It's been broken forever, which suggests no-one in userland uses this. It looks like no-one in the kernel uses tm->tm_wday either (see e.g. drivers/rtc/rtc-ds1305.c:319). tm->tm_wday is conventionally set to -1 when not available in hardware so we can simply set it to -1 and drop the function. (There are over a dozen other drivers in drivers/rtc that do this.) Found using UBSAN. Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Andrew Morton <akpm@linux-foundation.org> # as an example of what UBSan finds. Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: Alexandre Belloni <alexandre.belloni@free-electrons.com> Cc: rtc-linux@googlegroups.com Signed-off-by: NDaniel Axtens <dja@axtens.net> Acked-by: NAlexandre Belloni <alexandre.belloni@free-electrons.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 14 12月, 2015 26 次提交
-
-
由 Rashmica Gupta 提交于
Currently if you are in xmon without an oops etc. to view the kernel version you have to type "d $linux_banner" - not necessarily obvious. As this is useful information, append to the output of "e" command. Example output: $mon> e cpu 0x1: Vector: 0 at [c0000000f879ba80] pc: c000000000081718: sysrq_handle_xmon+0x68/0x80 lr: c000000000081718: sysrq_handle_xmon+0x68/0x80 sp: c0000000f879bbe0 msr: 8000000000009033 current = 0xc0000000f604d5c0 paca = 0xc00000000fdc0480 softe: 0 irq_happened: 0x01 pid = 2467, comm = bash Linux version 4.4.0-rc2-00008-gc51af91c3ab3-dirty (rashmica@circle) (gcc version 5.1.1 20150629 (GCC) ) #45 SMP Wed Nov 25 10:25:12 AEDT 2015 Signed-off-by: NRashmica Gupta <rashmicy@gmail.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Rashmica Gupta 提交于
All users of QPACE have upgraded to QPACE2 so remove the Cell QPACE code. Signed-off-by: NRashmica Gupta <rashmicy@gmail.com> Acked-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Vipin K Parashar 提交于
Kernel prints respective warnings about various EPOW events for user information/action after parsing EPOW interrupts. At times below EPOW reset event warning is seen to be flooding kernel log over a period of time. May 25 03:46:34 alp kernel: Non critical power or cooling issue cleared May 25 03:46:52 alp kernel: Non critical power or cooling issue cleared May 25 03:53:48 alp kernel: Non critical power or cooling issue cleared May 25 03:55:46 alp kernel: Non critical power or cooling issue cleared May 25 03:56:34 alp kernel: Non critical power or cooling issue cleared May 25 03:59:04 alp kernel: Non critical power or cooling issue cleared May 25 04:02:01 alp kernel: Non critical power or cooling issue cleared These EPOW reset events are spurious in nature and are triggered by firmware without an actual EPOW event being reset. This patch avoids these multiple EPOW reset warnings by using a counter variable. This variable is incremented every time an EPOW event is reported. Upon receiving a EPOW reset event the same variable is checked to filter out spurious events and decremented accordingly. This patch also improves log messages to better describe EPOW event being reported. Merged adjacent log messages into single one to reduce number of lines printed per event. Signed-off-by: NKamalesh Babulal <kamalesh@linux.vnet.ibm.com> Signed-off-by: NVipin K Parashar <vipin@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Michael Neuling 提交于
Print MSR TM bits in oops messages. This appends them to the end like this: MSR: 8000000502823031 <SF,VEC,VSX,FP,ME,IR,DR,LE,TM[TE]> You get the TM[] only if at least one TM MSR bit is set. Inside the TM[], E means Enabled (bit 32), S means Suspended (bit 33), and T means Transactional (bit 34) If no bits are set, you get no TM[] output. Include rework of printbits() to handle this case. Signed-off-by: NMichael Neuling <mikey@neuling.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Boqun Feng 提交于
According to memory-barriers.txt, xchg*, cmpxchg* and their atomic_ versions all need to be fully ordered, however they are now just RELEASE+ACQUIRE, which are not fully ordered. So also replace PPC_RELEASE_BARRIER and PPC_ACQUIRE_BARRIER with PPC_ATOMIC_ENTRY_BARRIER and PPC_ATOMIC_EXIT_BARRIER in __{cmp,}xchg_{u32,u64} respectively to guarantee fully ordered semantics of atomic{,64}_{cmp,}xchg() and {cmp,}xchg(), as a complement of commit b97021f8 ("powerpc: Fix atomic_xxx_return barrier semantics") This patch depends on patch "powerpc: Make value-returning atomics fully ordered" for PPC_ATOMIC_ENTRY_BARRIER definition. Cc: stable@vger.kernel.org # 3.2+ Signed-off-by: NBoqun Feng <boqun.feng@gmail.com> Reviewed-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Boqun Feng 提交于
According to memory-barriers.txt: > Any atomic operation that modifies some state in memory and returns > information about the state (old or new) implies an SMP-conditional > general memory barrier (smp_mb()) on each side of the actual > operation ... Which mean these operations should be fully ordered. However on PPC, PPC_ATOMIC_ENTRY_BARRIER is the barrier before the actual operation, which is currently "lwsync" if SMP=y. The leading "lwsync" can not guarantee fully ordered atomics, according to Paul Mckenney: https://lkml.org/lkml/2015/10/14/970 To fix this, we define PPC_ATOMIC_ENTRY_BARRIER as "sync" to guarantee the fully-ordered semantics. This also makes futex atomics fully ordered, which can avoid possible memory ordering problems if userspace code relies on futex system call for fully ordered semantics. Fixes: b97021f8 ("powerpc: Fix atomic_xxx_return barrier semantics") Cc: stable@vger.kernel.org # 3.2+ Signed-off-by: NBoqun Feng <boqun.feng@gmail.com> Reviewed-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Aneesh Kumar K.V 提交于
The slot information of base page size hash pte is stored in the pgtable_t w.r.t transparent hugepage. We need to make sure we don't index beyond pgtable_t size. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Aneesh Kumar K.V 提交于
This will bulk read 4 hash pte slot entries and should reduce the loop Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Aneesh Kumar K.V 提交于
Remove the related functions and #defines Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Aneesh Kumar K.V 提交于
They don't need to track 4k subpage slot details and hence don't need second half of pgtable_t. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Aneesh Kumar K.V 提交于
Use the #define instead of open-coding the same Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Aneesh Kumar K.V 提交于
pte and pmd table size are dependent on config items. Don't hard code the same. This make sure we use the right value when masking pmd entries and also while checking pmd_bad Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Aneesh Kumar K.V 提交于
For a pte entry we will have _PAGE_PTE set. Our pte page address have a minimum alignment requirement of HUGEPD_SHIFT_MASK + 1. We use the lower 7 bits to indicate hugepd. ie. For pmd and pgd we can find: 1) _PAGE_PTE set pte -> indicate PTE 2) bits [2..6] non zero -> indicate hugepd. They also encode the size. We skip bit 1 (_PAGE_PRESENT). 3) othewise pointer to next table. Acked-by: NScott Wood <scottwood@freescale.com> Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Aneesh Kumar K.V 提交于
We support THP only with book3s_64 and 64K page size. Move THP details to hash64-64k.h to clarify the same. Acked-by: NScott Wood <scottwood@freescale.com> Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Aneesh Kumar K.V 提交于
W.r.t hugetlb, we support two format for pmd. With book3s_64 and 64K linux page size, we can have pte at the pmd level. Hence we don't need to support hugepd there. For everything else hugepd is supported and pmd_huge is (0). Acked-by: NScott Wood <scottwood@freescale.com> Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Aneesh Kumar K.V 提交于
Only difference here is, we apply the WIMG mapping early, so rflags passed to updatepp will also be changed. Acked-by: NScott Wood <scottwood@freescale.com> Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Aneesh Kumar K.V 提交于
Instead of open coding it in multiple code paths, export the helper and add more documentation. Also make sure we don't make assumption regarding pte bit position Acked-by: NScott Wood <scottwood@freescale.com> Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Aneesh Kumar K.V 提交于
This is similar to 64K insert. May be we want to consolidate Acked-by: NScott Wood <scottwood@freescale.com> Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Aneesh Kumar K.V 提交于
Convert from asm to C Acked-by: NScott Wood <scottwood@freescale.com> Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Aneesh Kumar K.V 提交于
No real change, only style changes Acked-by: NScott Wood <scottwood@freescale.com> Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Aneesh Kumar K.V 提交于
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Aneesh Kumar K.V 提交于
This free up 11 bits in pte_t. In the later patch we also change the pte_t format so that we can start supporting migration pte at pmd level. We now track 4k subpage valid bit as below If we have _PAGE_COMBO set, we override the _PAGE_F_GIX_SHIFT and _PAGE_F_SECOND. Together we have 4 bits, each of them used to indicate whether any of the 4 4k subpage in that group is valid. ie, [ group 1 bit ] [ group 2 bit ] ..... [ group 4 ] [ subpage 1 - 4] [ subpage 5- 8] ..... [ subpage 13 - 16] We still track each 4k subpage slot number and secondary hash information in the second half of pgtable_t. Removing the subpage tracking have some significant overhead on aim9 and ebizzy benchmark and to support THP with 4K subpage, we do need a pgtable_t of 4096 bytes. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Aneesh Kumar K.V 提交于
We should not expect pte bit position in asm code. Simply by moving part of that to C Acked-by: NScott Wood <scottwood@freescale.com> Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Aneesh Kumar K.V 提交于
Acked-by: NScott Wood <scottwood@freescale.com> Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Aneesh Kumar K.V 提交于
Move the booke related headers below booke/32 or booke/64 Acked-by: NScott Wood <scottwood@freescale.com> Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Aneesh Kumar K.V 提交于
functions which operate on pte bits are moved to hash*.h and other generic functions are moved to pgtable.h Acked-by: NScott Wood <scottwood@freescale.com> Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-