- 24 2月, 2017 6 次提交
-
-
由 Nitin Gupta 提交于
Add support for using multiple hugepage sizes simultaneously on mainline. Currently, support for 256M has been added which can be used along with 8M pages. Page tables are set like this (e.g. for 256M page): VA + (8M * x) -> PA + (8M * x) (sz bit = 256M) where x in [0, 31] and TSB is set similarly: VA + (4M * x) -> PA + (4M * x) (sz bit = 256M) where x in [0, 63] - Testing Tested on Sonoma (which supports 256M pages) by running stream benchmark instances in parallel: one instance uses 8M pages and another uses 256M pages, consuming 48G each. Boot params used: default_hugepagesz=256M hugepagesz=256M hugepages=300 hugepagesz=8M hugepages=10000 Signed-off-by: NNitin Gupta <nitin.m.gupta@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Vijay Kumar 提交于
On panic, all other CPUs are stopped except the one which had hit panic. To keep console alive, we need to migrate hvcons irq to panicked CPU. Signed-off-by: NVijay Kumar <vijay.ac.kumar@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Vijay Kumar 提交于
CPU needs to be marked offline before stopping it. When not marked offline, the xcall receives HV_EWOULDBLOCK and so assumes that not all CPUs received the message, and retries. After 10000 retries, it finally fails with fatal mondo timeout. Signed-off-by: NVijay Kumar <vijay.ac.kumar@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Saint Etienne 提交于
When returning from the user probe code into userspace process, PC & NPC are truncated to 32 bits. Due to shared libraries getting loaded very high in the virtual address space of the process, placing a user probe inside a shared library makes the kernel return into the process at the wrong address, causing it to seg'fault most of the time. This patch prevents truncating PC and NPC. Signed-off-by: NEric Saint Etienne <eric.saint.etienne@oracle.com> Reviewed-by: NDavid Aldridge <david.j.aldridge@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ben Hutchings 提交于
We currently define macros referring to cpu_data if CONFIG_SMP is defined, but only include the declaration if CONFIG_NUMA is defined. Fixes: 541cc394 ("sparc: fix a building error reported by kbuild") Signed-off-by: NBen Hutchings <ben@decadent.org.uk> Acked-by: NGonglei <arei.gonglei@huawei.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Bhumika Goyal 提交于
The objects viking_ops, viking_sun4d_smp_ops and smp_cachetlb_ops of type sparc32_cachetlb_ops are not modified anywhere after getting modified in the init functions. Inside init their reference is also stored in a pointer of type const struct sparc32_cachetlb_ops *. So these structures are never modified after init, therefore add __ro_after to the declaration of these structures. Signed-off-by: NBhumika Goyal <bhumirks@gmail.com> Reviewed-by: NKees Cook <keescook@chromium.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 23 2月, 2017 9 次提交
-
-
由 Kirill A. Shutemov 提交于
There's no users of zap_page_range() who wants non-NULL 'details'. Let's drop it. Link: http://lkml.kernel.org/r/20170118122429.43661-3-kirill.shutemov@linux.intel.comSigned-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: NMichal Hocko <mhocko@suse.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Michal Hocko 提交于
show_mem() allows to filter out node specific data which is irrelevant to the allocation request via SHOW_MEM_FILTER_NODES. The filtering is done in skip_free_areas_node which skips all nodes which are not in the mems_allowed of the current process. This works most of the time as expected because the nodemask shouldn't be outside of the allocating task but there are some exceptions. E.g. memory hotplug might want to request allocations from outside of the allowed nodes (see new_node_page). Get rid of this hardcoded behavior and push the allocation mask down the show_mem path and use it instead of cpuset_current_mems_allowed. NULL nodemask is interpreted as cpuset_current_mems_allowed. [akpm@linux-foundation.org: coding-style fixes] Link: http://lkml.kernel.org/r/20170117091543.25850-5-mhocko@kernel.orgSigned-off-by: NMichal Hocko <mhocko@suse.com> Acked-by: NMel Gorman <mgorman@suse.de> Cc: Hillf Danton <hillf.zj@alibaba-inc.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: David Rientjes <rientjes@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Michal Hocko 提交于
We have a generic implementation for quite some time already. If there is any arch specific information to be printed then we should add a callback called from the generic code rather than duplicate the whole show_mem. The current code has resulted in the code duplication and the output divergence which is both confusing and adds maintainance costs. Let's just get rid of this mess. Link: http://lkml.kernel.org/r/20170117091543.25850-4-mhocko@kernel.orgSigned-off-by: NMichal Hocko <mhocko@suse.com> Acked-by: Guan Xuetao <gxt@mprc.pku.edu.cn> [UniCore32] Acked-by: Helge Deller <deller@gmx.de> [for parisc] Acked-by: Chris Metcalf <cmetcalf@mellanox.com> [for tile] Acked-by: NMel Gorman <mgorman@suse.de> Acked-by: NJohannes Weiner <hannes@cmpxchg.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Hillf Danton <hillf.zj@alibaba-inc.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: David Rientjes <rientjes@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Denys Vlasenko 提交于
On 32-bit powerpc the ELF PLT sections of binaries (built with --bss-plt, or with a toolchain which defaults to it) look like this: [17] .sbss NOBITS 0002aff8 01aff8 000014 00 WA 0 0 4 [18] .plt NOBITS 0002b00c 01aff8 000084 00 WAX 0 0 4 [19] .bss NOBITS 0002b090 01aff8 0000a4 00 WA 0 0 4 Which results in an ELF load header: Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align LOAD 0x019c70 0x00029c70 0x00029c70 0x01388 0x014c4 RWE 0x10000 This is all correct, the load region containing the PLT is marked as executable. Note that the PLT starts at 0002b00c but the file mapping ends at 0002aff8, so the PLT falls in the 0 fill section described by the load header, and after a page boundary. Unfortunately the generic ELF loader ignores the X bit in the load headers when it creates the 0 filled non-file backed mappings. It assumes all of these mappings are RW BSS sections, which is not the case for PPC. gcc/ld has an option (--secure-plt) to not do this, this is said to incur a small performance penalty. Currently, to support 32-bit binaries with PLT in BSS kernel maps *entire brk area* with executable rights for all binaries, even --secure-plt ones. Stop doing that. Teach the ELF loader to check the X bit in the relevant load header and create 0 filled anonymous mappings that are executable if the load header requests that. Test program showing the difference in /proc/$PID/maps: int main() { char buf[16*1024]; char *p = malloc(123); /* make "[heap]" mapping appear */ int fd = open("/proc/self/maps", O_RDONLY); int len = read(fd, buf, sizeof(buf)); write(1, buf, len); printf("%p\n", p); return 0; } Compiled using: gcc -mbss-plt -m32 -Os test.c -otest Unpatched ppc64 kernel: 00100000-00120000 r-xp 00000000 00:00 0 [vdso] 0fe10000-0ffd0000 r-xp 00000000 fd:00 67898094 /usr/lib/libc-2.17.so 0ffd0000-0ffe0000 r--p 001b0000 fd:00 67898094 /usr/lib/libc-2.17.so 0ffe0000-0fff0000 rw-p 001c0000 fd:00 67898094 /usr/lib/libc-2.17.so 10000000-10010000 r-xp 00000000 fd:00 100674505 /home/user/test 10010000-10020000 r--p 00000000 fd:00 100674505 /home/user/test 10020000-10030000 rw-p 00010000 fd:00 100674505 /home/user/test 10690000-106c0000 rwxp 00000000 00:00 0 [heap] f7f70000-f7fa0000 r-xp 00000000 fd:00 67898089 /usr/lib/ld-2.17.so f7fa0000-f7fb0000 r--p 00020000 fd:00 67898089 /usr/lib/ld-2.17.so f7fb0000-f7fc0000 rw-p 00030000 fd:00 67898089 /usr/lib/ld-2.17.so ffa90000-ffac0000 rw-p 00000000 00:00 0 [stack] 0x10690008 Patched ppc64 kernel: 00100000-00120000 r-xp 00000000 00:00 0 [vdso] 0fe10000-0ffd0000 r-xp 00000000 fd:00 67898094 /usr/lib/libc-2.17.so 0ffd0000-0ffe0000 r--p 001b0000 fd:00 67898094 /usr/lib/libc-2.17.so 0ffe0000-0fff0000 rw-p 001c0000 fd:00 67898094 /usr/lib/libc-2.17.so 10000000-10010000 r-xp 00000000 fd:00 100674505 /home/user/test 10010000-10020000 r--p 00000000 fd:00 100674505 /home/user/test 10020000-10030000 rw-p 00010000 fd:00 100674505 /home/user/test 10180000-101b0000 rw-p 00000000 00:00 0 [heap] ^^^^ this has changed f7c60000-f7c90000 r-xp 00000000 fd:00 67898089 /usr/lib/ld-2.17.so f7c90000-f7ca0000 r--p 00020000 fd:00 67898089 /usr/lib/ld-2.17.so f7ca0000-f7cb0000 rw-p 00030000 fd:00 67898089 /usr/lib/ld-2.17.so ff860000-ff890000 rw-p 00000000 00:00 0 [stack] 0x10180008 The patch was originally posted in 2012 by Jason Gunthorpe and apparently ignored: https://lkml.org/lkml/2012/9/30/138 Lightly run-tested. Link: http://lkml.kernel.org/r/20161215131950.23054-1-dvlasenk@redhat.comSigned-off-by: NJason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: NDenys Vlasenko <dvlasenk@redhat.com> Acked-by: NKees Cook <keescook@chromium.org> Acked-by: NMichael Ellerman <mpe@ellerman.id.au> Tested-by: NJason Gunthorpe <jgunthorpe@obsidianresearch.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Florian Weimer <fweimer@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Yasuaki Ishimatsu 提交于
To identify that pages of page table are allocated from bootmem allocator, magic number sets to page->lru.next. But page->lru list is initialized in reserve_bootmem_region(). So when calling free_pagetable(), the function cannot find the magic number of pages. And free_pagetable() frees the pages by free_reserved_page() not put_page_bootmem(). But if the pages are allocated from bootmem allocator and used as page table, the pages have private flag. So before freeing the pages, we should clear the private flag by put_page_bootmem(). Before applying the commit 7bfec6f4 ("mm, page_alloc: check multiple page fields with a single branch"), we could find the following visible issue: BUG: Bad page state in process kworker/u1024:1 page:ffffea103cfd8040 count:0 mapcount:0 mappi flags: 0x6fffff80000800(private) page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set bad because of flags: 0x800(private) <snip> Call Trace: [...] dump_stack+0x63/0x87 [...] bad_page+0x114/0x130 [...] free_pages_prepare+0x299/0x2d0 [...] free_hot_cold_page+0x31/0x150 [...] __free_pages+0x25/0x30 [...] free_pagetable+0x6f/0xb4 [...] remove_pagetable+0x379/0x7ff [...] vmemmap_free+0x10/0x20 [...] sparse_remove_one_section+0x149/0x180 [...] __remove_pages+0x2e9/0x4f0 [...] arch_remove_memory+0x63/0xc0 [...] remove_memory+0x8c/0xc0 [...] acpi_memory_device_remove+0x79/0xa5 [...] acpi_bus_trim+0x5a/0x8d [...] acpi_bus_trim+0x38/0x8d [...] acpi_device_hotplug+0x1b7/0x418 [...] acpi_hotplug_work_fn+0x1e/0x29 [...] process_one_work+0x152/0x400 [...] worker_thread+0x125/0x4b0 [...] kthread+0xd8/0xf0 [...] ret_from_fork+0x22/0x40 And the issue still silently occurs. Until freeing the pages of page table allocated from bootmem allocator, the page->freelist is never used. So the patch sets magic number to page->freelist instead of page->lru.next. [isimatu.yasuaki@jp.fujitsu.com: fix merge issue] Link: http://lkml.kernel.org/r/722b1cc4-93ac-dd8b-2be2-7a7e313b3b0b@gmail.com Link: http://lkml.kernel.org/r/2c29bd9f-5b67-02d0-18a3-8828e78bbb6f@gmail.comSigned-off-by: NYasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Xishi Qiu <qiuxishi@huawei.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Davidlohr Bueso 提交于
Given that the arch does not add its own implementations, simply use the asm-generic/current.h (generic-y) header instead of duplicating code. Link: http://lkml.kernel.org/r/1485992878-4780-4-git-send-email-dave@stgolabs.netSigned-off-by: NDavidlohr Bueso <dbueso@suse.de> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: Helge Deller <deller@gmx.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Davidlohr Bueso 提交于
... it's already using the generic version anyways, so just drop the file as do the other archs that do not implement their own version of the current macro. Link: http://lkml.kernel.org/r/1485992878-4780-5-git-send-email-dave@stgolabs.netSigned-off-by: NDavidlohr Bueso <dbueso@suse.de> Cc: Chen Liqin <liqin.linux@gmail.com> Cc: Lennox Wu <lennox.wu@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Sudip Mukherjee 提交于
Some m32r builds were having a warning: arch/m32r/include/asm/cmpxchg.h:191:3: warning: value computed is not used arch/m32r/include/asm/cmpxchg.h:68:3: warning: value computed is not used Taking the idea from commit e001bbae ("ARM: cmpxchg: avoid warnings from macro-ized cmpxchg() implementations") the m32r implementation is changed to use a similar construct with a compound expression instead of a typecast, which causes the compiler to not complain about an unused result. Link: http://lkml.kernel.org/r/1484432664-7015-1-git-send-email-sudipm.mukherjee@gmail.comSigned-off-by: NSudip Mukherjee <sudip.mukherjee@codethink.co.uk> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Davidlohr Bueso 提交于
Given that the arch does not add its own implementations, simply use the asm-generic/current.h (generic-y) header instead of duplicating code. Link: http://lkml.kernel.org/r/1482896994-25863-1-git-send-email-dave@stgolabs.netSigned-off-by: NDavidlohr Bueso <dbueso@suse.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 22 2月, 2017 2 次提交
-
-
由 Daniel Borkmann 提交于
Eric and Willem reported that they recently saw random crashes when JIT was in use and bisected this to 74451e66 ("bpf: make jited programs visible in traces"). Issue was that the consolidation part added bpf_jit_binary_unlock_ro() that would unlock previously made read-only memory back to read-write. However, DEBUG_SET_MODULE_RONX cannot be used for this to test for presence of set_memory_*() functions. We need to use ARCH_HAS_SET_MEMORY instead to fix this; also add the corresponding bpf_jit_binary_lock_ro() to filter.h. Fixes: 74451e66 ("bpf: make jited programs visible in traces") Reported-by: NEric Dumazet <edumazet@google.com> Reported-by: NWillem de Bruijn <willemb@google.com> Bisected-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Tested-by: NWillem de Bruijn <willemb@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Daniel Borkmann 提交于
Currently, there's no good way to test for the presence of set_memory_ro/rw/x/nx() helpers implemented by archs such as x86, arm, arm64 and s390. There's DEBUG_SET_MODULE_RONX and DEBUG_RODATA, however both don't really reflect that: set_memory_*() are also available even when DEBUG_SET_MODULE_RONX is turned off, and DEBUG_RODATA is set by parisc, but doesn't implement above functions. Thus, add ARCH_HAS_SET_MEMORY that is selected by mentioned archs, where generic code can test against this. This also allows later on to move DEBUG_SET_MODULE_RONX out of the arch specific Kconfig to define it only once depending on ARCH_HAS_SET_MEMORY. Suggested-by: NLaura Abbott <labbott@redhat.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 21 2月, 2017 10 次提交
-
-
由 Waiman Long 提交于
It was found when running fio sequential write test with a XFS ramdisk on a KVM guest running on a 2-socket x86-64 system, the %CPU times as reported by perf were as follows: 69.75% 0.59% fio [k] down_write 69.15% 0.01% fio [k] call_rwsem_down_write_failed 67.12% 1.12% fio [k] rwsem_down_write_failed 63.48% 52.77% fio [k] osq_lock 9.46% 7.88% fio [k] __raw_callee_save___kvm_vcpu_is_preempt 3.93% 3.93% fio [k] __kvm_vcpu_is_preempted Making vcpu_is_preempted() a callee-save function has a relatively high cost on x86-64 primarily due to at least one more cacheline of data access from the saving and restoring of registers (8 of them) to and from stack as well as one more level of function call. To reduce this performance overhead, an optimized assembly version of the the __raw_callee_save___kvm_vcpu_is_preempt() function is provided for x86-64. With this patch applied on a KVM guest on a 2-socket 16-core 32-thread system with 16 parallel jobs (8 on each socket), the aggregrate bandwidth of the fio test on an XFS ramdisk were as follows: I/O Type w/o patch with patch -------- --------- ---------- random read 8141.2 MB/s 8497.1 MB/s seq read 8229.4 MB/s 8304.2 MB/s random write 1675.5 MB/s 1701.5 MB/s seq write 1681.3 MB/s 1699.9 MB/s There are some increases in the aggregated bandwidth because of the patch. The perf data now became: 70.78% 0.58% fio [k] down_write 70.20% 0.01% fio [k] call_rwsem_down_write_failed 69.70% 1.17% fio [k] rwsem_down_write_failed 59.91% 55.42% fio [k] osq_lock 10.14% 10.14% fio [k] __kvm_vcpu_is_preempted The assembly code was verified by using a test kernel module to compare the output of C __kvm_vcpu_is_preempted() and that of assembly __raw_callee_save___kvm_vcpu_is_preempt() to verify that they matched. Suggested-by: NPeter Zijlstra <peterz@infradead.org> Signed-off-by: NWaiman Long <longman@redhat.com> Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Waiman Long 提交于
The cpu argument in the function prototype of vcpu_is_preempted() is changed from int to long. That makes it easier to provide a better optimized assembly version of that function. For Xen, vcpu_is_preempted(long) calls xen_vcpu_stolen(int), the downcast from long to int is not a problem as vCPU number won't exceed 32 bits. Signed-off-by: NWaiman Long <longman@redhat.com> Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Chao Peng 提交于
Guest segment selector is 16 bit field and guest segment base is natural width field. Fix two incorrect invocations accordingly. Without this patch, build fails when aggressive inlining is used with ICC. Cc: stable@vger.kernel.org Signed-off-by: NChao Peng <chao.p.peng@linux.intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Andy Lutomirski 提交于
Intel's VMX is daft and resets the hidden TSS limit register to 0x67 on VMX reload, and the 0x67 is not configurable. KVM currently reloads TR using the LTR instruction on every exit, but this is quite slow because LTR is serializing. The 0x67 limit is entirely harmless unless ioperm() is in use, so defer the reload until a task using ioperm() is actually running. Here's some poorly done benchmarking using kvm-unit-tests: Before: cpuid 1313 vmcall 1195 mov_from_cr8 11 mov_to_cr8 17 inl_from_pmtimer 6770 inl_from_qemu 6856 inl_from_kernel 2435 outl_to_kernel 1402 After: cpuid 1291 vmcall 1181 mov_from_cr8 11 mov_to_cr8 16 inl_from_pmtimer 6457 inl_from_qemu 6209 inl_from_kernel 2339 outl_to_kernel 1391 Signed-off-by: NAndy Lutomirski <luto@kernel.org> [Force-reload TR in invalidate_tss_limit. - Paolo] Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Andy Lutomirski 提交于
Historically, the entire TSS + io bitmap structure was cacheline aligned, but commit ca241c75 ("x86: unify tss_struct") changed it (presumably inadvertently) so that the fixed-layout hardware part is cacheline-aligned and the io bitmap is after the padding. This wastes 24 bytes (the hardware part should be 104 bytes, but this pads it to 128 bytes) and, serves no purpose, and causes sizeof(struct x86_hw_tss) to have a confusing value. Drop the pointless alignment. Signed-off-by: NAndy Lutomirski <luto@kernel.org> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Andy Lutomirski 提交于
Use actual pointer types for pointers (instead of unsigned long) and replace hardcoded constants with the appropriate self-documenting macros. The function is still a bit messy, but this seems a lot better than before to me. This is mostly borrowed from a patch by Thomas Garnier. Cc: Thomas Garnier <thgarnie@google.com> Cc: Jim Mattson <jmattson@google.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: NAndy Lutomirski <luto@kernel.org> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Andy Lutomirski 提交于
It was a bit buggy (it didn't list all segment types that needed 64-bit fixups), but the bug was irrelevant because it wasn't called in any interesting context on 64-bit kernels and was only used for data segents on 32-bit kernels. To avoid confusion, make it explicitly 32-bit only. Cc: Thomas Garnier <thgarnie@google.com> Cc: Jim Mattson <jmattson@google.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: NAndy Lutomirski <luto@kernel.org> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Andy Lutomirski 提交于
The current CPU's TSS base is a foregone conclusion, so there's no need to parse it out of the segment tables. This should save a couple cycles (as STR is surely microcoded and poorly optimized) but, more importantly, it's a cleanup and it means that segment_base() will never be called on 64-bit kernels. Cc: Thomas Garnier <thgarnie@google.com> Cc: Jim Mattson <jmattson@google.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: NAndy Lutomirski <luto@kernel.org> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Andy Lutomirski 提交于
Rather than open-coding the kernel TSS limit in set_tss_desc(), make it a real macro near the TSS layout definition. This is purely a cleanup. Cc: Thomas Garnier <thgarnie@google.com> Cc: Jim Mattson <jmattson@google.com> Signed-off-by: NAndy Lutomirski <luto@kernel.org> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
handle_vmon gets a reference on VMXON region page, but does not release it. Release the reference. Found by syzkaller; based on a patch by Dmitry. Reported-by: NDmitry Vyukov <dvyukov@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 20 2月, 2017 1 次提交
-
-
由 Martin Schwidefsky 提交于
Fix PER tracing of system calls after git commit 34525e1f "s390: store breaking event address only for program checks" broke it. Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 19 2月, 2017 1 次提交
-
-
由 Andy Gross 提交于
This patch enables the Qualcomm RPM based Clock Controller present on A-family boards. Signed-off-by: NAndy Gross <andy.gross@linaro.org> Acked-by: NBjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: NOlof Johansson <olof@lixom.net> Signed-off-by: NArnd Bergmann <arnd@arndb.de>
-
- 18 2月, 2017 3 次提交
-
-
由 Paul Mackerras 提交于
The new HPT resizing code added in commit b5baa687 ("KVM: PPC: Book3S HV: KVM-HV HPT resizing implementation", 2016-12-20) doesn't have code to handle the new HPTE format which POWER9 uses. Thus it would be best not to advertise it to userspace on POWER9 systems until it works properly. Also, since resize_hpt_rehash_hpte() contains BUG_ON() calls that could be hit on POWER9, let's prevent it from being called on POWER9 for now. Acked-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
-
由 Daniel Borkmann 提交于
Long standing issue with JITed programs is that stack traces from function tracing check whether a given address is kernel code through {__,}kernel_text_address(), which checks for code in core kernel, modules and dynamically allocated ftrace trampolines. But what is still missing is BPF JITed programs (interpreted programs are not an issue as __bpf_prog_run() will be attributed to them), thus when a stack trace is triggered, the code walking the stack won't see any of the JITed ones. The same for address correlation done from user space via reading /proc/kallsyms. This is read by tools like perf, but the latter is also useful for permanent live tracing with eBPF itself in combination with stack maps when other eBPF types are part of the callchain. See offwaketime example on dumping stack from a map. This work tries to tackle that issue by making the addresses and symbols known to the kernel. The lookup from *kernel_text_address() is implemented through a latched RB tree that can be read under RCU in fast-path that is also shared for symbol/size/offset lookup for a specific given address in kallsyms. The slow-path iteration through all symbols in the seq file done via RCU list, which holds a tiny fraction of all exported ksyms, usually below 0.1 percent. Function symbols are exported as bpf_prog_<tag>, in order to aide debugging and attribution. This facility is currently enabled for root-only when bpf_jit_kallsyms is set to 1, and disabled if hardening is active in any mode. The rationale behind this is that still a lot of systems ship with world read permissions on kallsyms thus addresses should not get suddenly exposed for them. If that situation gets much better in future, we always have the option to change the default on this. Likewise, unprivileged programs are not allowed to add entries there either, but that is less of a concern as most such programs types relevant in this context are for root-only anyway. If enabled, call graphs and stack traces will then show a correct attribution; one example is illustrated below, where the trace is now visible in tooling such as perf script --kallsyms=/proc/kallsyms and friends. Before: 7fff8166889d bpf_clone_redirect+0x80007f0020ed (/lib/modules/4.9.0-rc8+/build/vmlinux) f5d80 __sendmsg_nocancel+0xffff006451f1a007 (/usr/lib64/libc-2.18.so) After: 7fff816688b7 bpf_clone_redirect+0x80007f002107 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fffa0575728 bpf_prog_33c45a467c9e061a+0x8000600020fb (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fffa07ef1fc cls_bpf_classify+0x8000600020dc (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff81678b68 tc_classify+0x80007f002078 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff8164d40b __netif_receive_skb_core+0x80007f0025fb (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff8164d718 __netif_receive_skb+0x80007f002018 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff8164e565 process_backlog+0x80007f002095 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff8164dc71 net_rx_action+0x80007f002231 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff81767461 __softirqentry_text_start+0x80007f0020d1 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff817658ac do_softirq_own_stack+0x80007f00201c (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff810a2c20 do_softirq+0x80007f002050 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff810a2cb5 __local_bh_enable_ip+0x80007f002085 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff8168d452 ip_finish_output2+0x80007f002152 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff8168ea3d ip_finish_output+0x80007f00217d (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff8168f2af ip_output+0x80007f00203f (/lib/modules/4.9.0-rc8+/build/vmlinux) [...] 7fff81005854 do_syscall_64+0x80007f002054 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff817649eb return_from_SYSCALL_64+0x80007f002000 (/lib/modules/4.9.0-rc8+/build/vmlinux) f5d80 __sendmsg_nocancel+0xffff01c484812007 (/usr/lib64/libc-2.18.so) Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NAlexei Starovoitov <ast@kernel.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Daniel Borkmann 提交于
Remove the dummy bpf_jit_compile() stubs for eBPF JITs and make that a single __weak function in the core that can be overridden similarly to the eBPF one. Also remove stale pr_err() mentions of bpf_jit_compile. Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 17 2月, 2017 8 次提交
-
-
由 Robert Schiele 提交于
Not every toolchain has -fno-asynchronous-unwind-tables per default on MIPS. This patch specifies the necessary option explicitly for VDSO library build. This prevents the following build failure: GENVDSO arch/mips/vdso/vdso-image.c arch/mips/vdso/genvdso: 'arch/mips/vdso/vdso.so.dbg' contains relocation sections .../arch/mips/vdso/Makefile:84: recipe for target 'arch/mips/vdso/vdso-image.c' failed Signed-off-by: NRobert Schiele <rschiele@gmail.com> Signed-off-by: NAlexander Sverdlin <alexander.sverdlin@nokia.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: "Maciej W. Rozycki" <macro@imgtec.com> Cc: Alexander Sverdlin <alexander.sverdlin@nokia.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/15127/Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
-
由 Paolo Bonzini 提交于
The FPU is always active now when running KVM. Reviewed-by: NDavid Matlack <dmatlack@google.com> Reviewed-by: NBandan Das <bsd@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
The purpose of the KVM_SET_SIGNAL_MASK API is to let userspace "kick" a VCPU out of KVM_RUN through a POSIX signal. A signal is attached to a dummy signal handler; by blocking the signal outside KVM_RUN and unblocking it inside, this possible race is closed: VCPU thread service thread -------------------------------------------------------------- check flag set flag raise signal (signal handler does nothing) KVM_RUN However, one issue with KVM_SET_SIGNAL_MASK is that it has to take tsk->sighand->siglock on every KVM_RUN. This lock is often on a remote NUMA node, because it is on the node of a thread's creator. Taking this lock can be very expensive if there are many userspace exits (as is the case for SMP Windows VMs without Hyper-V reference time counter). As an alternative, we can put the flag directly in kvm_run so that KVM can see it: VCPU thread service thread -------------------------------------------------------------- raise signal signal handler set run->immediate_exit KVM_RUN check run->immediate_exit Reviewed-by: NRadim Krčmář <rkrcmar@redhat.com> Reviewed-by: NDavid Hildenbrand <david@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Mirko Parthey 提交于
The Asus WL-500W buttons are active high, but the software treats them as active low. Fix the inverted logic. Fixes: 3be97255 ("MIPS: BCM47XX: Import buttons database from OpenWrt") Signed-off-by: NMirko Parthey <mirko.parthey@web.de> Acked-by: NRafał Miłecki <rafal@milecki.pl> Cc: Hauke Mehrtens <hauke@hauke-m.de> Cc: linux-mips@linux-mips.org Cc: <stable@vger.kernel.org> # 3.14.x- Patchwork: https://patchwork.linux-mips.org/patch/15295/Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
-
由 Ian Pozella 提交于
An img directory exists for the Pistchio SoC device tree but the directory itself isn't in the dts Makefile meaning the dtbs never get built. Fixes: daa10170 ("MIPS: DTS: img: add device tree for Marduk board") Signed-off-by: NIan Pozella <Ian.Pozella@imgtec.com> Reviewed-by: NRahul Bedarkar <Rahul.Bedarkar@imgtec.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/15309/Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
-
由 Arnd Bergmann 提交于
One of the last remaining failures in kernelci.org is for a gcc bug: drivers/net/ethernet/qlogic/qlge/qlge_main.c:4819:1: error: insn does not satisfy its constraints: drivers/net/ethernet/qlogic/qlge/qlge_main.c:4819:1: internal compiler error: in extract_constrain_insn, at recog.c:2190 This is apparently broken in gcc-6 but fixed in gcc-7, and I cannot reproduce the problem here. However, it is clear that ip27_defconfig does not actually need this driver as the platform has only PCI-X but not PCIe, and the qlge adapter in turn is PCIe-only. The driver was originally enabled in 2010 along with lots of other drivers. Fixes: 59d302b3 ("MIPS: IP27: Make defconfig useful again.") Signed-off-by: NArnd Bergmann <arnd@arndb.de> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/15197/Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
-
由 Purna Chandra Mandal 提交于
Early clock API pic32_get_pbclk() is defined in early_clk.c and used by time.c and early_console.c. When CONFIG_EARLY_PRINTK isn't set, early_clk.c isn't compiled and time.c fails to link. Fix it by compiling early_clk.c always. Also sort files in alphabetical order. Fixes: 6e4ad1b4 ("MIPS: pic32mzda: fix getting timer clock rate.") Reported-by: NHarvey Hunt <harvey.hunt@imgtec.com> Signed-off-by: NPurna Chandra Mandal <purna.mandal@microchip.com> Reviewed-by: NHarvey Hunt <harvey.hunt@imgtec.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Joshua Henderson <digitalpeer@digitalpeer.com> Cc: linux-mips@linux-mips.org Cc: <stable@vger.kernel.org> # 4.7.x- Patchwork: https://patchwork.linux-mips.org/patch/13383/Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
-
由 Felix Fietkau 提交于
Disabling ethernet during reboot (only to enable it again when the ethernet driver attaches) can put the chip into a faulty state where it corrupts the header of all incoming packets. This happens if packets arrive during the time window where the core is disabled, and it can be easily reproduced by rebooting while sending a flood ping to the broadcast address. Fixes: 95135bfa ("MIPS: Lantiq: Deactivate most of the devices by default") Signed-off-by: NFelix Fietkau <nbd@nbd.name> Acked-by: NJohn Crispin <john@phrozen.org> Cc: hauke.mehrtens@lantiq.com Cc: linux-mips@linux-mips.org Cc: <stable@vger.kernel.org> # 4.4.x- Patchwork: https://patchwork.linux-mips.org/patch/15078/Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
-