- 30 7月, 2012 1 次提交
-
-
由 Heiko Carstens 提交于
This is the s390 variant of 37b23e05 "x86,mm: make pagefault killable". Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 26 7月, 2012 1 次提交
-
-
由 Martin Schwidefsky 提交于
The downgrade of the 4 level page table created by init_new_context is currently done only in start_thread31. If a 31 bit process forks the new mm uses a 4 level page table, including the task size of 2<<42 that goes along with it. This is incorrect as now a 31 bit process can map memory beyond 2GB. Define arch_dup_mmap to do the downgrade after fork. Cc: stable@vger.kernel.org Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 20 7月, 2012 1 次提交
-
-
由 Heiko Carstens 提交于
Remove the file name from the comment at top of many files. In most cases the file name was wrong anyway, so it's rather pointless. Also unify the IBM copyright statement. We did have a lot of sightly different statements and wanted to change them one after another whenever a file gets touched. However that never happened. Instead people start to take the old/"wrong" statements to use as a template for new files. So unify all of them in one go. Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
-
- 30 5月, 2012 1 次提交
-
-
由 Michael Holzheu 提交于
This patch introduces the new function memcpy_absolute() that allows to copy memory using absolute addressing. This means that the prefix swap does not apply when this function is used. With this patch also all s390 kernel code that accesses absolute zero now uses the new memcpy_absolute() function. The old and less generic copy_to_absolute_zero() function is removed. Signed-off-by: NMichael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 24 5月, 2012 1 次提交
-
-
由 Heiko Carstens 提交于
Replace __s390x__ with CONFIG_64BIT in all places that are not exported to userspace or guarded with #ifdef __KERNEL__. Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 16 5月, 2012 8 次提交
-
-
由 Heiko Carstens 提交于
If the task that was found on an initial interrupt doesn't match the current task execute a WARN_ON_ONCE() and don't put the task to sleep. When this happened something went wrong between the interface of the hypervisor and the kernel. In such a case keep the tasks alive to avoid a hanging system. Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Heiko Carstens 提交于
Use __set_task_state() instead of set_task_state(). Saves a couple of instructions, since the memory barrier is not needed here. Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Heiko Carstens 提交于
Make the code a bit more symmetric and always search for the task of the reported pid. This simplifies the code a bit. Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Heiko Carstens 提交于
When setting the current task state to TASK_UNINTERRUPTIBLE this can race with a different cpu. The other cpu could set the task state after it inspected it (while it was still TASK_RUNNING) to TASK_RUNNING which would change the state from TASK_UNINTERRUPTIBLE to TASK_RUNNING again. This race was always present in the pfault interrupt code but didn't cause anything harmful before commit f2db2e6c "[S390] pfault: cpu hotplug vs missing completion interrupts" which relied on the fact that after setting the task state to TASK_UNINTERRUPTIBLE the task would really sleep. Since this is not necessarily the case the result may be a list corruption of the pfault_list or, as observed, a use-after-free bug while trying to access the task_struct of a task which terminated itself already. To fix this, we need to get a reference of the affected task when receiving the initial pfault interrupt and add special handling if we receive yet another initial pfault interrupt when the task is already enqueued in the pfault list. Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: NMartin Schwidefsky <schwidefsky@de.ibm.com> Cc: <stable@vger.kernel.org> # needed for v3.0 and newer Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Christian Borntraeger 提交于
commit c3f0327f mm: add rss counters consistency check detected the following problem with kvm on s390: BUG: Bad rss-counter state mm:00000004f73ef000 idx:0 val:-10 BUG: Bad rss-counter state mm:00000004f73ef000 idx:1 val:-5 We have to make sure that we accumulate all rss values into the mm before we replace the mm to avoid triggering this (harmless) bug message. Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Gerald Schaefer 提交于
The software large page emulation on s390 did not clear the the pre-allocated page table in arch_release_hugepage() before freeing it. This could trigger the WARN_ON(!pte_none(*pte) in mm/vmalloc.c:106 and make vmap_pte_range() fail, because the page table could be reused in page_table_alloc(). This is fixed now by calling clear_table() before page_table_free(). Signed-off-by: NGerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Martin Schwidefsky 提交于
Replace the check for TIF_SIE in the fault handler by a check for PF_VCPU. With the last user of TIF_SIE gone we can now remove the bit. Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Michael Holzheu 提交于
Currently dev/mem for s390 provides only real memory access. This means that the CPU prefix pages are swapped. The prefix swap for real memory works as follows: Each CPU owns a prefix register that points to a page aligned memory location "P". If this CPU accesses the address range [0,0x1fff], it is translated by the hardware to [P,P+0x1fff]. Accordingly if this CPU accesses the address range [P,P+0x1fff], it is translated by the hardware to [0,0x1fff]. Therefore, if [P,P+0x1fff] or [0,0x1fff] is read from the current /dev/mem device, the incorrectly swapped memory content is returned. With this patch the /dev/mem architecture code is modified to provide absolute memory access. This is done via the arch specific functions xlate_dev_mem_ptr() and unxlate_dev_mem_ptr(). For swapped pages on s390 the function xlate_dev_mem_ptr() now returns a new buffer with a copy of the requested absolute memory. In case the buffer was allocated, the unxlate_dev_mem_ptr() function frees it after /dev/mem code has called copy_to_user(). Signed-off-by: NMichael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 11 4月, 2012 2 次提交
-
-
由 Martin Schwidefsky 提交于
Git commit 36409f63 "use generic RCU page-table freeing code" introduced a tlb flushing bug. Partially revert the above git commit and go back to s390 specific page table flush code. For s390 the TLB can contain three types of entries, "normal" TLB page-table entries, TLB combined region-and-segment-table (CRST) entries and real-space entries. Linux does not use real-space entries which leaves normal TLB entries and CRST entries. The CRST entries are intermediate steps in the page-table translation called translation paths. For example a 4K page access in a three-level page table setup will create two CRST TLB entries and one page-table TLB entry. The advantage of that approach is that a page access next to the previous one can reuse the CRST entries and needs just a single read from memory to create the page-table TLB entry. The disadvantage is that the TLB flushing rules are more complicated, before any page-table may be freed the TLB needs to be flushed. In short: the generic RCU page-table freeing code is incorrect for the CRST entries, in particular the check for mm_users < 2 is troublesome. This is applicable to 3.0+ kernels. Cc: <stable@vger.kernel.org> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Michael Holzheu 提交于
Currently in the memcpy_real() function interrupts are disabled with __arch_local_irq_stnsm(). In order to notify lockdep that interrupts are disabled, with this patch local_irq_save() is used instead. The function __arch_local_irq_stnsm() is still used for switching to real mode. Reviewed-by: NHeiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: NMichael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 29 3月, 2012 1 次提交
-
-
由 David Howells 提交于
Disintegrate asm/system.h for S390. Signed-off-by: NDavid Howells <dhowells@redhat.com> cc: linux-s390@vger.kernel.org
-
- 23 3月, 2012 1 次提交
-
-
由 Ben Hutchings 提交于
This function is defined for use in exec, not in modules. No other architecture exports its implementation. Signed-off-by: NBen Hutchings <ben@decadent.org.uk> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 11 3月, 2012 1 次提交
-
-
由 Heiko Carstens 提交于
The external interrupt handlers have a parameter called ext_int_code. Besides the name this paramter does not only contain the ext_int_code but in addition also the "cpu address" (POP) which caused the external interrupt. To make the code a bit more obvious pass a struct instead so the called function can easily distinguish between external interrupt code and cpu address. The cpu address field however is named "subcode" since some external interrupt sources do not pass a cpu address but a different parameter (or none at all). Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 27 2月, 2012 1 次提交
-
-
由 Heiko Carstens 提交于
The new is_compat_task() define for the !COMPAT case in include/linux/compat.h conflicts with a similar define in arch/s390/include/asm/compat.h. This is the minimal patch which fixes the build issues. Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 25 2月, 2012 1 次提交
-
-
由 Gerald Schaefer 提交于
This fixes a kernel oops with CONFIG_DEBUG_VM triggered by a VM_BUG_ON(bad_range()): kernel BUG at mm/page_alloc.c:748. With memory hotplug on System z, it is possible that the memory online/offline state is preserved over a system restart, e.g. there may be offline memory blocks in ZONE_DMA or ZONE_NORMAL. So far, the offline memory range has always been added to ZONE_MOVABLE during system start, so that it was possible to have ZONE_MOVABLE interleave with ZONE_DMA or ZONE_NORMAL. This patch fixes that by checking for zone overlap before adding memory. Signed-off-by: NGerald Schaefer <gerald.schaefer@de.ibm.com> Reviewed-by: NHeiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 17 2月, 2012 1 次提交
-
-
由 Martin Schwidefsky 提交于
The page_table_free_pgste function is used for kvm processes to free page tables that have the pgste extension. It calls pgtable_page_ctor instead of pgtable_page_dtor which increases NR_PAGETABLE instead of decreasing it. Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 27 12月, 2011 4 次提交
-
-
由 Martin Schwidefsky 提交于
Move the program interruption code and the translation exception identifier to the pt_regs structure as 'int_code' and 'int_parm_long' and make the first level interrupt handler in entry[64].S store the two values. That makes it possible to drop 'prot_addr' and 'trap_no' from the thread_struct and to reduce the number of arguments to a lot of functions. Finally un-inline do_trap. Overall this saves 5812 bytes in the .text section of the 64 bit kernel. Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Carsten Otte 提交于
This patch disables the check for MACHINE_IS_VM when initializing the pfault infrastructure. The code checks for successful completion of diag 258 anyway, thus it's safe to try initialization on LPAR anyway. This is needed to use pfault on kvm Signed-off-by: NCarsten Otte <cotte@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Martin Schwidefsky 提交于
The kernel address space of a 64 bit kernel currently uses a three level page table and the vmemmap array has a fixed address and a fixed maximum size. A three level page table is good enough for systems with less than 3.8TB of memory, for bigger systems four page table levels need to be used. Each page table level costs a bit of performance, use 3 levels for normal systems and 4 levels only for the really big systems. To avoid bloating sparse.o too much set MAX_PHYSMEM_BITS to 46 for a maximum of 64TB of memory. Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Christian Borntraeger 提交于
commit cc772456 [S390] fix list corruption in gmap reverse mapping added a potential dead lock: BUG: sleeping function called from invalid context at mm/page_alloc.c:2260 in_atomic(): 1, irqs_disabled(): 0, pid: 1108, name: qemu-system-s39 3 locks held by qemu-system-s39/1108: #0: (&kvm->slots_lock){+.+.+.}, at: [<000003e004866542>] kvm_set_memory_region+0x3a/0x6c [kvm] #1: (&mm->mmap_sem){++++++}, at: [<0000000000123790>] gmap_map_segment+0x9c/0x298 #2: (&(&mm->page_table_lock)->rlock){+.+.+.}, at: [<00000000001237a8>] gmap_map_segment+0xb4/0x298 CPU: 0 Not tainted 3.1.3 #45 Process qemu-system-s39 (pid: 1108, task: 00000004f8b3cb30, ksp: 00000004fd5978d0) 00000004fd5979a0 00000004fd597920 0000000000000002 0000000000000000 00000004fd5979c0 00000004fd597938 00000004fd597938 0000000000617e96 0000000000000000 00000004f8b3cf58 0000000000000000 0000000000000000 000000000000000d 000000000000000c 00000004fd597988 0000000000000000 0000000000000000 0000000000100a18 00000004fd597920 00000004fd597960 Call Trace: ([<0000000000100926>] show_trace+0xee/0x144) [<0000000000131f3a>] __might_sleep+0x12a/0x158 [<0000000000217fb4>] __alloc_pages_nodemask+0x224/0xadc [<0000000000123086>] gmap_alloc_table+0x46/0x114 [<000000000012395c>] gmap_map_segment+0x268/0x298 [<000003e00486b014>] kvm_arch_commit_memory_region+0x44/0x6c [kvm] [<000003e004866414>] __kvm_set_memory_region+0x3b0/0x4a4 [kvm] [<000003e004866554>] kvm_set_memory_region+0x4c/0x6c [kvm] [<000003e004867c7a>] kvm_vm_ioctl+0x14a/0x314 [kvm] [<0000000000292100>] do_vfs_ioctl+0x94/0x588 [<0000000000292688>] SyS_ioctl+0x94/0xac [<000000000061e124>] sysc_noemu+0x22/0x28 [<000003fffcd5e7ca>] 0x3fffcd5e7ca 3 locks held by qemu-system-s39/1108: #0: (&kvm->slots_lock){+.+.+.}, at: [<000003e004866542>] kvm_set_memory_region+0x3a/0x6c [kvm] #1: (&mm->mmap_sem){++++++}, at: [<0000000000123790>] gmap_map_segment+0x9c/0x298 #2: (&(&mm->page_table_lock)->rlock){+.+.+.}, at: [<00000000001237a8>] gmap_map_segment+0xb4/0x298 Fix this by freeing the lock on the alloc path. This is ok, since the gmap table is never freed until we call gmap_free, so the table we are walking cannot go. Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 14 11月, 2011 1 次提交
-
-
由 Heiko Carstens 提交于
Ignore completion interrupts if the initial interrupt hasn't been received and the addressed task is not running. This case can only happen if leftover (pending) completion interrupt gets delivered which wasn't removed with the PFAULT CANCEL operation during cpu hotplug. Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 03 11月, 2011 3 次提交
-
-
由 Andrea Arcangeli 提交于
This avoids duplicating the function in every arch gup_fast. Signed-off-by: NAndrea Arcangeli <aarcange@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Hugh Dickins <hughd@google.com> Cc: Johannes Weiner <jweiner@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David Gibson <david@gibson.dropbear.id.au> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: David Miller <davem@davemloft.net> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Andrea Arcangeli 提交于
s390 didn't return 0 in that case, if it's rolling back the *nr pointer it should also return zero to avoid adding pages to the array at the wrong offset. Signed-off-by: NAndrea Arcangeli <aarcange@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Hugh Dickins <hughd@google.com> Cc: Johannes Weiner <jweiner@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David Gibson <david@gibson.dropbear.id.au> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: David Miller <davem@davemloft.net> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Andrea Arcangeli 提交于
Up to this point the code assumed old refcounting for hugepages (pre-thp). This updates the code directly to the thp mapcount tail page refcounting. Signed-off-by: NAndrea Arcangeli <aarcange@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Hugh Dickins <hughd@google.com> Cc: Johannes Weiner <jweiner@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David Gibson <david@gibson.dropbear.id.au> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: David Miller <davem@davemloft.net> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 01 11月, 2011 1 次提交
-
-
由 Heiko Carstens 提交于
Fix several compile errors on s390 caused by splitting module.h. Some include additions [e.g. qdio_setup.c, zfcp_qdio.c] are in anticipation of pending changes queued for s390 that increase the modular use footprint. [PG: added additional obvious changes since Heiko's original patch] Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
-
- 30 10月, 2011 10 次提交
-
-
由 Martin Schwidefsky 提交于
Add prototypes and includes for functions used in different modules. Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Christian Borntraeger 提交于
Linux on System z uses a ballooner based on diagnose 0x10. (aka as collaborative memory management). This patch implements diagnose 0x10 on the guest address space. Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Carsten Otte 提交于
gmap_fault needs to walk the guest page table. However, parts of that may change if some other thread does munmap. In that case gmap_unmap_notifier will also unmap the corresponding parts from the guest page table. We need to take mmap_sem in order to serialize these operations. do_exception now calls __gmap_fault with mmap_sem held which does not get exported to modules. The exported function, which is called from KVM, now takes mmap_sem. Reported-by: NHeiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: NCarsten Otte <cotte@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Carsten Otte 提交于
This introduces locking via mm->page_table_lock to protect the rmap list for guest mappings from being corrupted by concurrent operations. Signed-off-by: NCarsten Otte <cotte@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Carsten Otte 提交于
Fix possible deadlock reported by lockdep: qemu-system-s39/2963 is trying to acquire lock: (&mm->mmap_sem){++++++}, at: gmap_alloc_table+0x9c/0x120 but task is already holding lock: (&mm->mmap_sem){++++++}, at: gmap_map_segment+0xa6/0x27c Actually gmap_alloc_table is the only called in gmap_map_segment with mmap_sem held, thus it's safe to simply remove the inner lock. Signed-off-by: NCarsten Otte <cotte@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Martin Schwidefsky 提交于
Split out addressing mode bits from PSW_BASE_BITS, rename PSW_BASE_BITS to PSW_MASK_BASE, get rid of psw_user32_bits, remove unused function enabled_wait(), introduce PSW_MASK_USER, and drop PSW_MASK_MERGE macros. Change psw_kernel_bits / psw_user_bits to contain only the bits that are always set in the respective mode. Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Martin Schwidefsky 提交于
An instruction with an address right below the adress limit for the current addressing mode will wrap. The instruction restart logic in the protection fault handler and the signal code need to follow the wrapping rules to find the correct instruction address. Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Michael Holzheu 提交于
This patch provides the architecture specific part of the s390 kdump support. Signed-off-by: NMichael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Michael Holzheu 提交于
Add access function for real memory needed by s390 kdump backend. Signed-off-by: NMichael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Martin Schwidefsky 提交于
The rcu page table free code uses a couple of bits in the page table pointer passed to tlb_remove_table to discern the different page table types. __tlb_remove_table extracts the type with an incorrect mask which leads to memory leaks. The correct mask is ((FRAG_MASK << 4) | FRAG_MASK). Cc: stable@kernel.org Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-