1. 20 Jun 2013, 1 commit
  2. 31 May 2013, 1 commit
  3. 15 May 2013, 1 commit
  4. 03 May 2013, 2 commits
  5. 02 May 2013, 5 commits
  6. 30 Apr 2013, 3 commits
    • sparse-vmemmap: specify vmemmap population range in bytes · 0aad818b
      Committed by Johannes Weiner
      The sparse code, when asking the architecture to populate the vmemmap,
      specifies the section range as a starting page and a number of pages.
      
      This is an awkward interface, because none of the arch-specific code
      actually thinks of the range in terms of 'struct page' units and always
      translates it to bytes first.
      
      In addition, later patches mix huge page and regular page backing for
      the vmemmap.  For this, they need to call vmemmap_populate_basepages()
      on sub-section ranges with PAGE_SIZE and PMD_SIZE in mind.  But these
      are not necessarily multiples of the 'struct page' size and so this unit
      is too coarse.
      
      Just translate the section range into bytes once in the generic sparse
      code, then pass byte ranges down the stack.
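      The conversion this patch centralizes can be sketched as follows. This is an
      illustrative userspace model, not the kernel code; STRUCT_PAGE_SIZE is an
      assumed stand-in for sizeof(struct page), and the function name is invented.

```c
#include <assert.h>

/* Assumed size of struct page, for illustration only. */
#define STRUCT_PAGE_SIZE 64UL

struct byte_range {
    unsigned long start;   /* first byte of the vmemmap range */
    unsigned long end;     /* one past the last byte */
};

/* Translate a section's (start pfn, page count) into the byte range of
 * its backing struct pages, once, in generic code; everything below
 * this point in the call stack would then work purely in bytes. */
static struct byte_range section_to_bytes(unsigned long start_pfn,
                                          unsigned long nr_pages)
{
    struct byte_range r;

    r.start = start_pfn * STRUCT_PAGE_SIZE;
    r.end   = r.start + nr_pages * STRUCT_PAGE_SIZE;
    return r;
}
```

      Byte ranges compose cleanly with PAGE_SIZE- or PMD_SIZE-aligned sub-ranges,
      which is exactly what the later mixed huge/regular page patches need.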
      Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Ben Hutchings <ben@decadent.org.uk>
      Cc: Bernhard Schmidt <Bernhard.Schmidt@lrz.de>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Acked-by: David S. Miller <davem@davemloft.net>
      Tested-by: David S. Miller <davem@davemloft.net>
      Cc: Wu Fengguang <fengguang.wu@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/hugetlb: add more arch-defined huge_pte functions · 106c992a
      Committed by Gerald Schaefer
      Commit abf09bed ("s390/mm: implement software dirty bits")
      introduced another difference between the pte layout and the pmd layout
      on s390, thoroughly breaking the s390 support for hugetlbfs. This
      requires replacing some more pte_xxx functions in mm/hugetlb.c with
      huge_pte_xxx versions.
      
      This patch introduces those huge_pte_xxx functions and their generic
      implementation in asm-generic/hugetlb.h, which will now be included on
      all architectures supporting hugetlbfs apart from s390.  This change
      will be a no-op for those architectures.
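      The no-op nature of the generic fallback can be modeled in a few lines of
      userspace C. The pte_t layout and the _PAGE_DIRTY bit below are assumptions
      for illustration; the point is only that on !s390 architectures each
      huge_pte_xxx wrapper delegates straight to the corresponding pte_xxx helper.

```c
#include <assert.h>

typedef struct { unsigned long val; } pte_t;

#define _PAGE_DIRTY 0x1UL  /* assumed bit layout, for illustration */

static inline pte_t pte_mkdirty(pte_t pte) { pte.val |= _PAGE_DIRTY; return pte; }
static inline int pte_dirty(pte_t pte) { return (pte.val & _PAGE_DIRTY) != 0; }

/* asm-generic style wrappers: plain delegation, no change in behavior.
 * s390 would instead provide its own definitions to bridge the
 * pte/pmd layout difference. */
static inline pte_t huge_pte_mkdirty(pte_t pte) { return pte_mkdirty(pte); }
static inline int huge_pte_dirty(pte_t pte) { return pte_dirty(pte); }
```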
      
      [akpm@linux-foundation.org: fix warning]
      Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Hillf Danton <dhillf@gmail.com>
      Acked-by: Michal Hocko <mhocko@suse.cz>	[for !s390 parts]
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/s390: use common help functions to free reserved pages · 0999f119
      Committed by Jiang Liu
      Use common help functions to free reserved pages.
      Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  7. 23 Apr 2013, 2 commits
  8. 17 Apr 2013, 4 commits
  9. 08 Mar 2013, 1 commit
  10. 28 Feb 2013, 2 commits
  11. 24 Feb 2013, 2 commits
  12. 14 Feb 2013, 2 commits
    • s390/mm: implement software dirty bits · abf09bed
      Committed by Martin Schwidefsky
      The s390 architecture is unique with respect to dirty page detection:
      it uses the change bit in the per-page storage key to track page
      modifications, while all other architectures track dirty bits by means
      of page table entries. This property of s390 has caused numerous
      problems in the past, e.g. see git commit ef5d437f
      "mm: fix XFS oops due to dirty pages without buffers on s390".
      
      To avoid future issues with per-page dirty bits, convert s390 to a
      fault-based software dirty bit detection mechanism. All user page
      table entries which are marked as clean will be hardware read-only,
      even if the pte is supposed to be writable. A write by the user
      process will trigger a protection fault, which will cause the user
      pte to be marked as dirty and the hardware read-only bit to be
      removed.
      
      With this change the dirty bit in the storage key is irrelevant
      for Linux as a host, but the storage key is still required for
      KVM guests. The effect is that page_test_and_clear_dirty and the
      related code can be removed. The referenced bit in the storage
      key is still used by the page_test_and_clear_young primitive to
      provide page age information.
      
      For page cache pages of mappings with mapping_cap_account_dirty
      there will not be any change in behavior as the dirty bit tracking
      already uses read-only ptes to control the amount of dirty pages.
      Additional protection faults can occur only for swap cache pages and
      for pages of mappings without mapping_cap_account_dirty.
      To avoid an excessive number of additional faults the mk_pte
      primitive checks for PageDirty if the pgprot value allows for writes
      and pre-dirties the pte. That avoids all additional faults for
      tmpfs and shmem pages until these pages are added to the swap cache.
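      The fault-based scheme can be modeled roughly as below. This is a hedged
      userspace sketch with invented field and function names, not s390's actual
      pte bits or fault handler.

```c
#include <assert.h>
#include <stdbool.h>

/* Invented pte model: sw_write is the logical write permission,
 * hw_ro the hardware read-only bit, dirty the software dirty bit. */
struct soft_pte {
    bool sw_write;
    bool hw_ro;
    bool dirty;
};

/* A clean pte is made hardware read-only, even if it is logically
 * writable, so the next write is guaranteed to fault. */
static void pte_mkclean(struct soft_pte *pte)
{
    pte->dirty = false;
    pte->hw_ro = true;
}

/* The protection fault handler marks the pte dirty and drops the
 * hardware read-only bit, provided writes are logically allowed. */
static bool handle_write_fault(struct soft_pte *pte)
{
    if (!pte->sw_write)
        return false;   /* genuine protection violation */
    pte->dirty = true;
    pte->hw_ro = false;
    return true;
}
```

      The mk_pte pre-dirtying mentioned above would, in this model, simply skip
      pte_mkclean() for pages that are already PageDirty, avoiding the extra fault.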
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
    • s390/mm: Fix crst upgrade of mmap with MAP_FIXED · 486c0a0b
      Committed by Hendrik Brueckner
      Right now the page table upgrade does not happen if the end address
      of a fixed mapping is greater than TASK_SIZE.
      Enhance s390_mmap_check() to handle MAP_FIXED mappings correctly.
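      The missing case can be pictured with the sketch below. All names are
      invented stand-ins, not the actual s390_mmap_check() code; the point is
      only that a MAP_FIXED request ending above the current task size must
      also trigger the page table (crst) upgrade.

```c
#include <assert.h>
#include <stdbool.h>

#define MAP_FIXED_FLAG 0x10UL  /* illustrative stand-in for MAP_FIXED */

/* A fixed mapping is placed at the caller's address, so the upgrade
 * decision must consider where the mapping *ends*, not where the
 * allocator would have put it. */
static bool needs_crst_upgrade(unsigned long addr, unsigned long len,
                               unsigned long flags, unsigned long task_size)
{
    if (flags & MAP_FIXED_FLAG)
        return addr + len > task_size;
    return false;
}
```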
      Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
  13. 08 Jan 2013, 1 commit
    • s390/irq: remove split irq fields from /proc/stat · 420f42ec
      Committed by Heiko Carstens
      Now that irq sum accounting for /proc/stat's "intr" line works again we
      have the oddity that the sum field (first field) contains only the sum
      of the second (external irqs) and third field (I/O interrupts).
      The reason is that these two fields are already sums of all the other
      fields, so if we summed up everything we would count every interrupt
      twice.
      This has been broken since the split interrupt accounting was merged
      two years ago: 052ff461 "[S390] irq: have detailed
      statistics for interrupt types".
      To fix this, remove the split interrupt fields from /proc/stat's "intr"
      line again and only have them in /proc/interrupts.
      
      This restores the old behaviour, seems to be the only sane fix and mimics
      a behaviour from other architectures where /proc/interrupts also contains
      more than /proc/stat's "intr" line does.
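      The double counting can be shown with a small arithmetic model (invented
      names and numbers, purely illustrative): the per-class fields are already
      sums, so the grand total must add only those sums, not the detailed
      counters again.

```c
#include <assert.h>

/* Sum the detailed per-source counters of one interrupt class. */
static unsigned long class_sum(const unsigned long *counters, int n)
{
    unsigned long sum = 0;
    int i;

    for (i = 0; i < n; i++)
        sum += counters[i];
    return sum;
}

/* Correct first field of the "intr" line: the sum of the two per-class
 * sums only. Adding the detailed counters on top of that would count
 * every interrupt twice. */
static unsigned long intr_total(const unsigned long *ext, int n_ext,
                                const unsigned long *io, int n_io)
{
    return class_sum(ext, n_ext) + class_sum(io, n_io);
}
```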
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
  14. 23 Nov 2012, 6 commits
  15. 13 Nov 2012, 2 commits
  16. 26 Oct 2012, 1 commit
  17. 09 Oct 2012, 4 commits
    • s390/mm,vmem: fix vmem_add_mem()/vmem_remove_range() · fc7e48aa
      Committed by Heiko Carstens
      vmem_add_mem() should only insert a large page if pmd_none() is true
      for the specific entry; we might have a leftover from a previous mapping.
      In addition, make vmem_remove_range()'s page table walk code more
      complete and fix a couple of potential endless loops (which can never
      happen :).
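      A sketch of the guard the fix adds (invented names; the real code
      operates on kernel page table entries, not this toy struct):

```c
#include <assert.h>
#include <stdbool.h>

struct toy_pmd { unsigned long val; };

static bool pmd_is_none(struct toy_pmd pmd) { return pmd.val == 0; }

/* Only install a large page mapping if the pmd slot is still empty;
 * otherwise a leftover entry from a previous mapping would be
 * silently clobbered. */
static bool try_map_large(struct toy_pmd *pmd, unsigned long large_entry)
{
    if (!pmd_is_none(*pmd))
        return false;   /* keep the existing mapping */
    pmd->val = large_entry;
    return true;
}
```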
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
    • s390/vmalloc: have separate modules area · c972cc60
      Committed by Heiko Carstens
      Add a special module area on top of the vmalloc area, which may only be
      used for modules and BPF JIT generated code.
      This makes sure that inter-module branches always happen without a
      trampoline; in addition, having all the code within a 2GB frame is
      friendly to the branch prediction unit.
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
    • s390/mm: fix mapping of read-only kernel text section · 8fe234d3
      Committed by Heiko Carstens
      Within the identity mapping the kernel text section is mapped read-only.
      However, when mapping the first and last page of the text section we
      must round upwards and downwards respectively if only part of a page
      belongs to the section.
      Otherwise, potentially rw data can be mapped read-only, so the rounding
      must be done the other way around from what we have right now.
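      The corrected rounding can be sketched as follows (userspace model, 4K
      pages assumed): round the start of the read-only range up and the end
      down, so that a partial page which also holds rw data is never mapped
      read-only.

```c
#include <assert.h>

#define TOY_PAGE_SIZE 4096UL  /* assumed page size for illustration */

static unsigned long page_align_up(unsigned long addr)
{
    return (addr + TOY_PAGE_SIZE - 1) & ~(TOY_PAGE_SIZE - 1);
}

static unsigned long page_align_down(unsigned long addr)
{
    return addr & ~(TOY_PAGE_SIZE - 1);
}

/* Only pages lying completely inside [text_start, text_end) become
 * read-only; partial boundary pages stay rw. */
static void ro_range(unsigned long text_start, unsigned long text_end,
                     unsigned long *ro_start, unsigned long *ro_end)
{
    *ro_start = page_align_up(text_start);
    *ro_end   = page_align_down(text_end);
}
```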
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
    • s390/mm: add page table dumper · e76e82d7
      Committed by Heiko Carstens
      This is more or less the same as the x86 page table dumper which was
      merged four years ago: 926e5392 "x86: add code to dump the (kernel)
      page tables for visual inspection by kernel developers".
      
      We add a file at /sys/kernel/debug/kernel_page_tables for debugging
      purposes so it's quite easy to see the kernel page table layout and
      possible odd mappings:
      
      ---[ Identity Mapping ]---
      0x0000000000000000-0x0000000000100000        1M PTE RW
      ---[ Kernel Image Start ]---
      0x0000000000100000-0x0000000000800000        7M PMD RO
      0x0000000000800000-0x00000000008a9000      676K PTE RO
      0x00000000008a9000-0x0000000000900000      348K PTE RW
      0x0000000000900000-0x0000000001500000       12M PMD RW
      ---[ Kernel Image End ]---
      0x0000000001500000-0x0000000280000000    10219M PMD RW
      0x0000000280000000-0x000003d280000000     3904G PUD I
      ---[ vmemmap Area ]---
      0x000003d280000000-0x000003d288c00000      140M PTE RW
      0x000003d288c00000-0x000003d300000000     1908M PMD I
      0x000003d300000000-0x000003e000000000       52G PUD I
      ---[ vmalloc Area ]---
      0x000003e000000000-0x000003e000009000       36K PTE RW
      0x000003e000009000-0x000003e0000ee000      916K PTE I
      0x000003e0000ee000-0x000003e000146000      352K PTE RW
      0x000003e000146000-0x000003e000200000      744K PTE I
      0x000003e000200000-0x000003e080000000     2046M PMD I
      0x000003e080000000-0x0000040000000000      126G PUD I
      
      This usually only makes sense for kernel developers. The output with
      CONFIG_DEBUG_PAGEALLOC is not very helpful because of the huge number
      of mapped-out pages; however, I decided for the time being not to add
      a !DEBUG_PAGEALLOC dependency.
      Maybe it's helpful for somebody even with that option.
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>