- 26 Sep 2011, 1 commit

Committed by Carsten Otte

If gmap_unmap_segment figures out that the segment was not mapped in the first place, it needs to up mmap_sem on exit.
Cc: <stable@kernel.org>
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
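
A minimal sketch of the corrected error path, assuming a hypothetical lookup helper (the real code lives in arch/s390/mm/pgtable.c):

    int gmap_unmap_segment(struct gmap *gmap, unsigned long to, unsigned long len)
    {
            unsigned long *entry;

            down_read(&gmap->mm->mmap_sem);
            entry = gmap_table_walk(gmap, to);      /* hypothetical lookup helper */
            if (!entry) {
                    /* The fix: the segment was never mapped, so release
                     * mmap_sem on this early-exit path as well. */
                    up_read(&gmap->mm->mmap_sem);
                    return -EINVAL;
            }
            /* ... invalidate the segment table entry ... */
            up_read(&gmap->mm->mmap_sem);
            return 0;
    }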
-
- 20 Sep 2011, 1 commit

Committed by Christian Borntraeger

598841ca ([S390] use gmap address spaces for kvm guest images) changed kvm to use a separate address space for kvm guests. This address space was switched in __vcpu_run. In some cases (preemption, page fault) there is the possibility that this address space switch is lost. The typical symptom was a huge amount of validity intercepts or random guest addressing exceptions. Fix this by doing the switch in sie_loop and sie_exit and saving the address space in the gmap structure itself. Also use the preempt notifier.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
-
- 03 Aug 2011, 2 commits

Committed by Michael Holzheu

With this patch a new S390 shutdown trigger "restart" is added. If "system restart" is entered under z/VM or the "PSW restart" button is pressed on the HMC, the PSW located at 0 (31 bit) or 0x1a0 (64 bit) is loaded. Now we execute do_restart(), which processes the restart action that is defined under /sys/firmware/shutdown_actions/on_restart. Currently the following actions are possible: reipl (default), stop, vmcmd, dump, and dump_reipl.
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
-
Committed by Jan Glauber

Fix the following compile warning for !CONFIG_PGSTE:

      CC      arch/s390/mm/pgtable.o
    arch/s390/mm/pgtable.c: In function 'page_table_alloc_pgste':
    arch/s390/mm/pgtable.c:531:1: warning: no return statement in function returning non-void [-Wreturn-type]

Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
-
- 24 Jul 2011, 1 commit

Committed by Martin Schwidefsky

Add code that allows KVM to control the virtual memory layout that is seen by a guest. The guest address space uses a second page table that shares the last-level pte-tables with the process page table. If a page is unmapped from the process page table, it is automatically unmapped from the guest page table as well. The guest address space mapping starts out empty; KVM can map any individual 1MB segment from the process virtual memory to any 1MB-aligned location in the guest virtual memory. If a target segment in the process virtual memory does not exist or is unmapped while a guest mapping exists, the desired target address is stored as an invalid segment table entry in the guest page table. The population of the guest page table is fault driven.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
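
The interface this introduces looks roughly as follows; a sketch of the intended usage, with error handling abbreviated:

    #include <asm/pgtable.h>        /* struct gmap and the gmap_* prototypes */

    static struct gmap *guest_setup(unsigned long from, unsigned long to)
    {
            struct gmap *gmap;

            /* Create a guest address space on top of the current mm. */
            gmap = gmap_alloc(current->mm);
            if (!gmap)
                    return NULL;

            /* Map one 1MB process segment at 'from' to guest address 'to';
             * the entries themselves are populated lazily on guest faults. */
            if (gmap_map_segment(gmap, from, to, 1UL << 20)) {
                    gmap_free(gmap);
                    return NULL;
            }
            return gmap;
    }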
-
- 01 Jul 2011, 1 commit

Committed by Peter Zijlstra

The nmi parameter indicated if we could do wakeups from the current context; if not, we would set some state and self-IPI and let the resulting interrupt do the wakeup. For the various event classes:

- hardware: nmi=0; PMI is in fact an NMI, or we run irq_work_run from the PMI-tail (ARM etc.)
- tracepoint: nmi=0; since a tracepoint could be from NMI context.
- software: nmi=[0,1]; some, like the schedule thing, cannot perform wakeups and hence need 0.

As one can see, there is very little nmi=1 usage, and the down-side of not using it is that on some platforms some software events can have a jiffy delay in wakeup (when arch_irq_work_raise isn't implemented). The up-side however is that we can remove the nmi parameter and save a bunch of conditionals in fast paths.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Michael Cree <mcree@orcon.net.nz>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
Cc: Anton Blanchard <anton@samba.org>
Cc: Eric B Munson <emunson@mgebm.net>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Don Zickus <dzickus@redhat.com>
Link: http://lkml.kernel.org/n/tip-agjev8eu666tvknpb3iaj0fg@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
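
For callers the change boils down to dropping one argument; the call sites, reconstructed from the description (treat as a sketch):

    /* Before: every overflow site had to say whether it ran in NMI context. */
    ret = perf_event_overflow(event, 1 /* nmi */, &data, regs);

    /* After: the parameter is gone; contexts that cannot wake up directly
     * defer the wakeup through irq_work instead. */
    ret = perf_event_overflow(event, &data, regs);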
-
- 06 Jun 2011, 1 commit

Committed by Martin Schwidefsky

Replace the s390-specific rcu page-table freeing code with the generic variant. This requires duplicating the definition of struct mmu_table_batch, as s390 does not use the generic tlb flush code. While we are at it, remove the restriction that page table fragments cannot be reused after a single fragment has been freed with rcu, and split out allocation and freeing of page tables with pgstes.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
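
The duplicated structure and its entry points look roughly like the generic ones (a sketch, not the verbatim s390 copy):

    struct mmu_table_batch {
            struct rcu_head rcu;            /* deferred free via RCU */
            unsigned int    nr;             /* tables queued so far */
            void            *tables[0];     /* fills the rest of the page */
    };

    #define MAX_TABLE_BATCH \
            ((PAGE_SIZE - sizeof(struct mmu_table_batch)) / sizeof(void *))

    void tlb_table_flush(struct mmu_gather *tlb);
    void tlb_remove_table(struct mmu_gather *tlb, void *table);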
-
- 29 May 2011, 1 commit

Committed by Heiko Carstens

Quite a few functions that get called from the tlb gather code require that preemption is disabled, so disable preemption inside the called functions themselves. The only drawback is that rcu_table_freelist_finish() doesn't necessarily get called on the cpu(s) that filled the free lists, so we may see a delay until we finally see an rcu callback. However, over time this shouldn't matter. This gets rid of lots of "BUG: using smp_processor_id() in preemptible" messages.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
-
- 26 May 2011, 7 commits

Committed by Heiko Carstens

Add ZONE_DMA to the 31-bit config again. The performance gain is minimal and hardly anybody cares anymore about a 31-bit kernel, so add ZONE_DMA again to help with SLAB_CACHE_DMA removal for !CONFIG_ZONE_DMA configurations.
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
-
Committed by Heiko Carstens

s390 arch backend for d065bd81 ("mm: retry page fault when blocking on disk transfer").
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
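
The per-architecture pattern that backend adds looks roughly like this in the fault handler (a sketch of the retry protocol, not the exact s390 code; vma re-lookup omitted):

    unsigned int flags = FAULT_FLAG_ALLOW_RETRY;
    int fault;

    retry:
            fault = handle_mm_fault(mm, vma, address, flags);
            if (fault & VM_FAULT_RETRY) {
                    /* mmap_sem was dropped while waiting on disk I/O;
                     * take it again and retry once without ALLOW_RETRY. */
                    flags &= ~FAULT_FLAG_ALLOW_RETRY;
                    down_read(&mm->mmap_sem);
                    goto retry;
            }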
-
Committed by Heiko Carstens

If e.g. copy_from_user() generates a page fault and the kernel runs into an OOM situation, the system might lock up: if the OOM killer sends a SIGKILL to the current process, the process can't handle it since it is stuck in a copy_from_user() / page fault loop. Fix this by adding the same fix as other architectures have, e.g. the x86 variant f86268 ("x86/mm: Handle mm_fault_error() in kernel space").
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
-
Committed by Heiko Carstens

Merge s390_ext.c into irq.c. That way all external interrupt related functions are together.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Committed by Heiko Carstens

Interrupt sources like pfault, sclp, dasd_diag and virtio all use the service signal external interrupt subclass mask in control register 0 to enable and disable the corresponding interrupt. Because no reference counting is implemented, each subsystem thinks it is the only user of the subclass and sets and clears the bit as it likes. This leads to the case that unloading the dasd_diag module under z/VM masks both sclp and pfault interrupts; the result is a locked-up system sooner or later. Fix this by introducing a new way to set (register) and clear (unregister) the service signal subclass mask bit in cr0, and convert all drivers.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
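
A reference-counted pair along these lines would do it (a sketch; the helper names and the cr0 bit used by the sclp driver are assumptions on my part):

    static DEFINE_SPINLOCK(sc_irq_lock);
    static int sc_irq_refcount;

    void service_subclass_irq_register(void)
    {
            spin_lock(&sc_irq_lock);
            if (!sc_irq_refcount)
                    ctl_set_bit(0, 9);      /* first user: enable the subclass */
            sc_irq_refcount++;
            spin_unlock(&sc_irq_lock);
    }

    void service_subclass_irq_unregister(void)
    {
            spin_lock(&sc_irq_lock);
            sc_irq_refcount--;
            if (!sc_irq_refcount)
                    ctl_clear_bit(0, 9);    /* last user gone: mask it again */
            spin_unlock(&sc_irq_lock);
    }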
-
Committed by Heiko Carstens

Always enable the service signal subclass mask bit in cr0 if pfault is available. That way we use the normal cpu hotplug path to propagate the subclass mask bit in cr0 instead of open coding it.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Committed by Steven Rostedt

The functions probe_kernel_write() and probe_kernel_read() do not modify the src pointer. Allow const pointers to be passed in without the need for a typecast.
Acked-by: Mike Frysinger <vapier@gentoo.org>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1305824936.1465.4.camel@gandalf.stny.rr.com
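
The resulting prototypes, as described (reconstructed, not copied from the patch):

    #include <linux/uaccess.h>

    /* src is only read, so it can now be const-qualified: */
    long probe_kernel_read(void *dst, const void *src, size_t size);
    long probe_kernel_write(void *dst, const void *src, size_t size);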
-
- 25 May 2011, 1 commit

Committed by Peter Zijlstra

Fold all the mmu_gather rework patches into one for submission.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reported-by: Hugh Dickins <hughd@google.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Miller <davem@davemloft.net>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Tony Luck <tony.luck@intel.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 23 May 2011, 6 commits

Committed by Martin Schwidefsky

Rework the architecture page table functions to access the bits in the page table extension array (pgste). There are a number of changes:

1) Fix a missing pgste update if the attach_count for the mm is <= 1.
2) For every operation that affects the invalid bit in the pte or the rcp byte in the pgste, the pcl lock needs to be acquired. The function pgste_get_lock gets the pcl lock and returns the current pgste value for a pte pointer; the function pgste_set_unlock stores the pgste and releases the lock. Between these two calls the bits in the pgste can be shuffled.
3) Define two software bits in the pte, _PAGE_SWR and _PAGE_SWC, to avoid calling SetPageDirty and SetPageReferenced from pgtable.h. If the host reference backup bit or the host change backup bit has been set, the dirty/referenced state is transferred to the pte. The common code will pick up the state from the pte.
4) Add ptep_modify_prot_start and ptep_modify_prot_commit for mprotect.
5) Remove pgd_populate_kernel, pud_populate_kernel, pmd_populate_kernel, pgd_clear_kernel, pud_clear_kernel, pmd_clear_kernel and ptep_invalidate.
6) Rename kvm_s390_test_and_clear_page_dirty to ptep_test_and_clear_user_dirty and add ptep_test_and_clear_user_young.
7) Define mm_exclusive() and mm_has_pgste() helpers to improve readability.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
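
Point 2 implies a locking pattern along these lines in every pte operation (a sketch; the real helpers live in arch/s390/include/asm/pgtable.h):

    static inline void example_pte_op(pte_t *ptep, pte_t new)
    {
            pgste_t pgste;

            pgste = pgste_get_lock(ptep);   /* take the pcl lock, fetch pgste */
            /* ... shuffle pgste bits, update the invalid bit in the pte ... */
            *ptep = new;
            pgste_set_unlock(ptep, pgste);  /* store pgste, drop the lock */
    }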
-
Committed by Heiko Carstens

Small code cleanup.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Committed by Heiko Carstens

On cpu hot remove a PFAULT CANCEL command is sent to the hypervisor, which in turn will cancel all outstanding pfault requests that have been issued on that cpu (the same happens with a SIGP cpu reset). The result is that we end up with uninterruptible processes, because the interrupt that would wake up these processes never arrives. In order to solve this, all processes which wait for a pfault completion interrupt get woken up after a cpu hot remove. The worst case that could happen is that they fault again and in turn need to wait again.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Committed by Heiko Carstens

Get rid of these:

    arch/s390/mm/extmem.c: In function 'segment_modify_shared':
    arch/s390/mm/extmem.c:622:3: warning: 'end_addr' may be used uninitialized in this function [-Wuninitialized]
    arch/s390/mm/extmem.c:627:18: warning: 'start_addr' may be used uninitialized in this function [-Wuninitialized]
    arch/s390/mm/extmem.c: In function 'segment_load':
    arch/s390/mm/extmem.c:481:11: warning: 'end_addr' may be used uninitialized in this function [-Wuninitialized]
    arch/s390/mm/extmem.c:480:18: warning: 'start_addr' may be used uninitialized in this function [-Wuninitialized]

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Committed by Heiko Carstens

Remove trivially unused variables as detected with -Wunused-but-set-variable.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Committed by Martin Schwidefsky

The noexec support on s390 does not rely on a bit in the page table entry but utilizes the secondary space mode to distinguish between memory accesses for instructions vs. data. The noexec code relies on the assumption that the cpu will always use the secondary space page table for data accesses while it is running in the secondary space mode. Up to the z9-109 class machines this has been the case. Unfortunately this is not true anymore with z10 and later machines: the load-relative-long instructions lrl, lgrl and lgfrl access the memory operand using the same addressing-space mode that has been used to fetch the instruction. This breaks the noexec mode for all user space binaries compiled with march=z10 or later. The only option is to remove the current noexec support.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
- 19 May 2011, 1 commit

Committed by Jan Glauber

While debugging I stumbled over two problems in the code that protects module pages. The first issue is that disabling the protection on module unload or when freeing a module's init sections is not symmetric with the enablement. For instance, if pages are set to RO, the page range from module_core to module_core + core_ro_size is protected. If a module is unloaded, the page range from module_core to module_core + core_size is set back to RW, so pages that were never set to RO are also changed to RW. This is not critical, but IMHO it should be symmetric. The second issue is that while set_memory_rw and set_memory_ro are used for RO/RW changes, only set_memory_nx is involved for NX/X. One would expect that the inverse function is called when the NX protection should be removed, which is not the case here, unless I'm missing something.
Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
-
- 10 May 2011, 1 commit

Committed by Michael Holzheu

Currently the diag10() function can only release one page. For exploiters that have to call diag10 on a contiguous memory region this is suboptimal. This patch replaces the diag10() function with diag10_range(), which is able to release multiple pages. In addition, the new function allows releasing memory at addresses higher than 2047 MiB; previously this was not possible due to a restriction of the diagnose implementation in z/VM releases prior to 5.2.
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
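
The interface change, roughly (prototypes reconstructed from the description):

    /* Old: release a single page at 'addr' back to the hypervisor. */
    void diag10(unsigned long addr);

    /* New: release 'num_pfn' pages starting at frame 'start_pfn';
     * also works above the old 2047 MiB limit on z/VM >= 5.2. */
    void diag10_range(unsigned long start_pfn, unsigned long num_pfn);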
-
- 29 Apr 2011, 1 commit

Committed by Heiko Carstens

pfault, dasd diag and virtio all use the same external interrupt number. The respective interrupt handlers decide by the subcode whether they are meant to handle the interrupt. Counting is currently done before looking at the subcode, which means each handler counts an interrupt even if it is not handling it. Fix this by moving the kstat code after the code which looks at the subcode.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
- 20 Apr 2011, 2 commits

Committed by Heiko Carstens

f6649a7e ([S390] cleanup lowcore access from external interrupts) changed the handling of external interrupts. Instead of letting the external interrupt handlers access the per cpu lowcore, the kernel entry code already reads all fields that are necessary and passes them to the handlers. The pfault interrupt handler was incorrectly converted: it tries to dereference a value which used to be a pointer to a lowcore field. After the conversion, however, it is no longer the pointer to the field but its content, so instead of a dereference only a cast is needed to get the task pointer that caused the pfault. Fixes a NULL pointer dereference and a subsequent kernel crash:

    Unable to handle kernel pointer dereference at virtual kernel address (null)
    Oops: 0004 [#1] SMP
    Modules linked in: nfsd exportfs nfs lockd fscache nfs_acl auth_rpcgss sunrpc
     loop qeth_l3 qeth vmur ccwgroup ext3 jbd mbcache dm_mod dasd_eckd_mod
     dasd_diag_mod dasd_mod
    CPU: 0 Not tainted 2.6.38-2-s390x #1
    Process cron (pid: 1106, task: 000000001f962f78, ksp: 000000001fa0f9d0)
    Krnl PSW : 0404200180000000 000000000002c03e (pfault_interrupt+0xa2/0x138)
               R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:0 CC:2 PM:0 EA:3
    Krnl GPRS: 0000000000000000 0000000000000001 0000000000000000 0000000000000001
               000000001f962f78 0000000000518968 0000000090000002 000000001ff03280
               0000000000000000 000000000064f000 000000001f962f78 0000000000002603
               0000000006002603 0000000000000000 000000001ff7fe68 000000001ff7fe48
    Krnl Code: 000000000002c036: 5820d010       l     %r2,16(%r13)
               000000000002c03a: 1832           lr    %r3,%r2
               000000000002c03c: 1a31           ar    %r3,%r1
              >000000000002c03e: ba23d010       cs    %r2,%r3,16(%r13)
               000000000002c042: a744fffc       brc   4,2c03a
               000000000002c046: a7290002       lghi  %r2,2
               000000000002c04a: e320d0000024   stg   %r2,0(%r13)
               000000000002c050: 07f0           bcr   15,%r0
    Call Trace:
    ([<000000001f962f78>] 0x1f962f78)
     [<000000000001acda>] do_extint+0xf6/0x138
     [<000000000039b6ca>] ext_no_vtime+0x30/0x34
     [<000000007d706e04>] 0x7d706e04
    Last Breaking-Event-Address:
     [<0000000000000000>] 0x0

For stable maintainers: the first kernel which contains this bug is 2.6.37.
Reported-by: Stephen Powell <zlinuxman@wowway.com>
Cc: Jonathan Nieder <jrnieder@gmail.com>
Cc: stable@kernel.org
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
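
The nature of the fix can be sketched like this (variable names assumed from the description, not taken from the patch):

    /* Before: the handler parameter was treated as a pointer into the
     * lowcore and dereferenced, although it already holds the content: */
    tsk = *(struct task_struct **) param64;     /* NULL pointer dereference */

    /* After: the content IS the task pointer, so a cast suffices: */
    tsk = (struct task_struct *) param64;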
-
Committed by Jan Glauber

The page table walk for changing page attributes used the wrong address for pgd/pud/pmd lookups if the range was bigger than a pmd entry. Fix the lookup by using the correct address.
Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
- 31 Mar 2011, 1 commit

Committed by Lucas De Marchi

Fixes generated by 'codespell' and manually reviewed.
Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
-
- 16 Mar 2011, 1 commit

Committed by Jan Glauber

Implement write protection for kernel module text and read-only data sections by implementing set_memory_[ro|rw] on s390. Since s390 has no execute bit in its ptes, NX is not supported. set_memory_[ro|rw] will only work on normal pages and not on large pages, so in case a large page should be modified, bail out with a warning.
Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
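
The prototypes follow the convention other architectures use (a sketch; the header that declares them differs between kernel versions):

    /* addr must be page aligned; numpages counts normal 4 KB pages.
     * Large pages trigger the bail-out warning described above. */
    int set_memory_ro(unsigned long addr, int numpages);
    int set_memory_rw(unsigned long addr, int numpages);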
-
- 31 Jan 2011, 1 commit

Committed by Martin Schwidefsky

After page_table_free_rcu has removed a page from the pgtable_list, page_table_free better not add it again. Otherwise a page_table_alloc can reuse a page table fragment that is still within its rcu grace period.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
- 12 Jan 2011, 5 commits

Committed by Heiko Carstens

Randomize the mmap start address within an 8MB range.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Committed by Heiko Carstens

Shuffle code around so it looks more like x86 and powerpc.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Committed by Heiko Carstens

Historically 64-bit processes use the legacy address layout. However, there is no reason why 64-bit processes shouldn't benefit from the advantages of the flexible mmap layout, so just enable it.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Committed by Heiko Carstens

Reduce the minimum gap between stack and mmap_base to 32MB. That way there is a bit more space for heap and mmap in tight 31-bit address spaces.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Committed by Heiko Carstens

Consider stack address randomization when calculating mmap_base for the flexible mmap layout. Because of address randomization the stack address can be up to 8MB lower than STACK_TOP. When calculating mmap_base this isn't taken into account, which could lead to the case that the gap between the real stack top and mmap_base is lower than what ulimit specifies for the maximum stack size.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
- 05 Jan 2011, 3 commits

Committed by Martin Schwidefsky

Overhaul program event recording and the code dealing with the ptrace user space interface.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Committed by Heiko Carstens

Use an early init call to initialize pfault. That way it is possible to use register_external_interrupt() instead of the early variant. There is no need to enable pfault any earlier, since it only has an effect once user space processes are running.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Committed by Heiko Carstens

Up to now /proc/interrupts only has statistics for external and i/o interrupts but doesn't split them up any further. This patch adds a line for every single interrupt source, so that it is easier to tell what the machine is/was doing. Part of the output now looks like this:

               CPU0       CPU2       CPU4
    EXT:       3898       4232       2305
    I/O:        782        315        245
    CLK:       1029       1964        727   [EXT] Clock Comparator
    IPI:       2868       2267       1577   [EXT] Signal Processor
    TMR:          0          0          0   [EXT] CPU Timer
    TAL:          0          0          0   [EXT] Timing Alert
    PFL:          0          0          0   [EXT] Pseudo Page Fault
    [...]
    NMI:          0          1          1   [NMI] Machine Checks

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
- 10 Nov 2010, 1 commit

Committed by Martin Schwidefsky

The check for the _PAGE_RO bit in get_user_pages_fast for write==1 is the wrong way around: the bit must not be set for the fast path.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
- 25 Oct 2010, 1 commit

Committed by Martin Schwidefsky

Store the facility list once at system startup with stfl/stfle and reuse the result for all facility tests.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
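
A facility test then reduces to a bit lookup in the saved list, along these lines (a sketch of the pattern; the helper and field names are assumptions, not the verbatim kernel code):

    static inline int test_facility(unsigned long nr)
    {
            unsigned char *ptr;

            if (nr >= MAX_FACILITY_BIT)
                    return 0;
            /* Index into the facility list saved in the lowcore at startup;
             * bits are numbered from the most significant bit of each byte. */
            ptr = (unsigned char *) &S390_lowcore.stfle_fac_list + (nr >> 3);
            return (*ptr & (0x80 >> (nr & 7))) != 0;
    }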
-