- 16 8月, 2017 1 次提交
-
-
由 Nitin Gupta 提交于
Adds support for 16GB hugepage size. To use this page size use kernel parameters as: default_hugepagesz=16G hugepagesz=16G hugepages=10 Testing: Tested with the stream benchmark which allocates 48G of arrays backed by 16G hugepages and does RW operation on them in parallel. Orabug: 25362942 Cc: Anthony Yznaga <anthony.yznaga@oracle.com> Reviewed-by: NBob Picco <bob.picco@oracle.com> Signed-off-by: NNitin Gupta <nitin.m.gupta@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 11 8月, 2017 1 次提交
-
-
由 Babu Moger 提交于
New algorithm that takes advantage of the M7/M8 block init store ASI, ie, overlapping pipelines and miss buffer filling. Full details in code comments. Signed-off-by: NBabu Moger <babu.moger@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 05 8月, 2017 2 次提交
-
-
由 Allen Pais 提交于
Recognize SPARC-M8 cpu type, hardware caps and cpu distribution map. Signed-off-by: NAllen Pais <allen.pais@oracle.com> Signed-off-by: NDavid Aldridge <david.j.aldridge@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Allen Pais 提交于
Acked-by: NSam Ravnborg <sam@ravnborg.org> Acked-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NAllen Pais <allen.pais@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 10 5月, 2017 1 次提交
-
-
由 Dave Aldridge 提交于
When any of the functions contained in NGbzero.S and GENbzero.S vector through *bzero_from_clear_user, we may end up taking a fault when executing one of the store alternate address space instructions. If this happens, the exception handler does not restore the %asi register. This commit fixes the issue by introducing a new exception handler that ensures the %asi register is restored when a fault is handled. Orabug: 25577560 Signed-off-by: NDave Aldridge <david.j.aldridge@oracle.com> Reviewed-by: NRob Gardner <rob.gardner@oracle.com> Reviewed-by: NBabu Moger <babu.moger@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 28 3月, 2017 1 次提交
-
-
由 Babu Moger 提交于
Avoid un-intended DCTI Couples. Use of DCTI couples is deprecated. Also address the "Programming Note" for optimal performance. Here is the complete text from Oracle SPARC Architecture Specs. 6.3.4.7 DCTI Couples "A delayed control transfer instruction (DCTI) in the delay slot of another DCTI is referred to as a “DCTI couple”. The use of DCTI couples is deprecated in the Oracle SPARC Architecture; no new software should place a DCTI in the delay slot of another DCTI, because on future Oracle SPARC Architecture implementations DCTI couples may execute either slowly or differently than the programmer assumes it will. SPARC V8 and SPARC V9 Compatibility Note The SPARC V8 architecture left behavior undefined for a DCTI couple. The SPARC V9 architecture defined behavior in that case, but as of UltraSPARC Architecture 2005, use of DCTI couples was deprecated. Software should not expect high performance from DCTI couples, and performance of DCTI couples should be expected to decline further in future processors. Programming Note As noted in TABLE 6-5 on page 115, an annulled branch-always (branch-always with a = 1) instruction is not architecturally a DCTI. However, since not all implementations make that distinction, for optimal performance, a DCTI should not be placed in the instruction word immediately following an annulled branch-always instruction (BA,A or BPA,A)." Signed-off-by: NBabu Moger <babu.moger@oracle.com> Reviewed-by: NRob Gardner <rob.gardner@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 25 10月, 2016 3 次提交
-
-
由 David S. Miller 提交于
All of __ret{,l}_mone{_asi,_fp,_asi_fpu} are now unused. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
The fixup helper function mechanism for handling user copy fault handling is not %100 accurrate, and can never be made so. We are going to transition the code to return the running return return length, which is always kept track in one or more registers of each of these routines. In order to convert them one by one, we have to allow the existing behavior to continue functioning. Therefore make all the copy code that wants the fixup helper to be used return negative one. After all of the user copy routines have been converted, this logic and the fixup helpers themselves can be removed completely. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
It is completely unused. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 08 8月, 2016 1 次提交
-
-
由 Al Viro 提交于
Acked-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 28 4月, 2016 1 次提交
-
-
由 David S. Miller 提交于
The system call tracing bug fix mentioned in the Fixes tag below increased the amount of assembler code in the sequence of assembler files included by head_64.S This caused to total set of code to exceed 0x4000 bytes in size, which overflows the expression in head_64.S that works to place swapper_tsb at address 0x408000. When this is violated, the TSB is not properly aligned, and also the trap table is not aligned properly either. All of this together results in failed boots. So, do two things: 1) Simplify some code by using ba,a instead of ba/nop to get those bytes back. 2) Add a linker script assertion to make sure that if this happens again the build will fail. Fixes: 1a40b953 ("sparc: Fix system call tracing register handling.") Reported-by: NMeelis Roos <mroos@linux.ee> Reported-by: NJoerg Abraham <joerg.abraham@nokia.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 22 4月, 2016 1 次提交
-
-
由 Khalid Aziz 提交于
Add code to recognize SPARC-Sonoma cpu correctly and update cpu hardware caps and cpu distribution map. SPARC-Sonoma is based upon SPARC-M7 core along with additional PCI functions added on and is reported by firmware as "SPARC-SN". Signed-off-by: NKhalid Aziz <khalid.aziz@oracle.com> Acked-by: NAllen Pais <allen.pais@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 09 2月, 2016 1 次提交
-
-
由 Andrey Ryabinin 提交于
Lockdep is initialized at compile time now. Get rid of lockdep_init(). Signed-off-by: NAndrey Ryabinin <aryabinin@virtuozzo.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Krinkin <krinkin.m.u@gmail.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Cc: mm-commits@vger.kernel.org Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
- 25 12月, 2015 1 次提交
-
-
由 Rob Gardner 提交于
Short story: Exception handlers used by some copy_to_user() and copy_from_user() functions do not diligently clean up floating point register usage, and this can result in a user process seeing invalid values in floating point registers. This sometimes makes the process fail. Long story: Several cpu-specific (NG4, NG2, U1, U3) memcpy functions use floating point registers and VIS alignaddr/faligndata to accelerate data copying when source and dest addresses don't align well. Linux uses a lazy scheme for saving floating point registers; It is not done upon entering the kernel since it's a very expensive operation. Rather, it is done only when needed. If the kernel ends up not using FP regs during the course of some trap or system call, then it can return to user space without saving or restoring them. The various memcpy functions begin their FP code with VISEntry (or a variation thereof), which saves the FP regs. They conclude their FP code with VISExit (or a variation) which essentially marks the FP regs "clean", ie, they contain no unsaved values. fprs.FPRS_FEF is turned off so that a lazy restore will be triggered when/if the user process accesses floating point regs again. The bug is that the user copy variants of memcpy, copy_from_user() and copy_to_user(), employ an exception handling mechanism to detect faults when accessing user space addresses, and when this handler is invoked, an immediate return from the function is forced, and VISExit is not executed, thus leaving the fprs register in an indeterminate state, but often with fprs.FPRS_FEF set and one or more dirty bits. This results in a return to user space with invalid values in the FP regs, and since fprs.FPRS_FEF is on, no lazy restore occurs. This bug affects copy_to_user() and copy_from_user() for NG4, NG2, U3, and U1. All are fixed by using a new exception handler for those loads and stores that are done during the time between VISEnter and VISExit. n.b. In NG4memcpy, the problematic code can be triggered by a copy size greater than 128 bytes and an unaligned source address. This bug is known to be the cause of random user process memory corruptions while perf is running with the callgraph option (ie, perf record -g). This occurs because perf uses copy_from_user() to read user stacks, and may fault when it follows a stack frame pointer off to an invalid page. Validation checks on the stack address just obscure the underlying problem. Signed-off-by: NRob Gardner <rob.gardner@oracle.com> Signed-off-by: NDave Aldridge <david.j.aldridge@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 25 10月, 2014 1 次提交
-
-
由 David S. Miller 提交于
Meelis Roos reported that kernels built with gcc-4.9 do not boot, we eventually narrowed this down to only impacting machines using UltraSPARC-III and derivitive cpus. The crash happens right when the first user process is spawned: [ 54.451346] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000004 [ 54.451346] [ 54.571516] CPU: 1 PID: 1 Comm: init Not tainted 3.16.0-rc2-00211-gd7933ab7 #96 [ 54.666431] Call Trace: [ 54.698453] [0000000000762f8c] panic+0xb0/0x224 [ 54.759071] [000000000045cf68] do_exit+0x948/0x960 [ 54.823123] [000000000042cbc0] fault_in_user_windows+0xe0/0x100 [ 54.902036] [0000000000404ad0] __handle_user_windows+0x0/0x10 [ 54.978662] Press Stop-A (L1-A) to return to the boot prom [ 55.050713] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000004 Further investigation showed that compiling only per_cpu_patch() with an older compiler fixes the boot. Detailed analysis showed that the function is not being miscompiled by gcc-4.9, but it is using a different register allocation ordering. With the gcc-4.9 compiled function, something during the code patching causes some of the %i* input registers to get corrupted. Perhaps we have a TLB miss path into the firmware that is deep enough to cause a register window spill and subsequent restore when we get back from the TLB miss trap. Let's plug this up by doing two things: 1) Stop using the firmware stack for client interface calls into the firmware. Just use the kernel's stack. 2) As soon as we can, call into a new function "start_early_boot()" to put a one-register-window buffer between the firmware's deepest stack frame and the top-most initial kernel one. Reported-by: NMeelis Roos <mroos@linux.ee> Tested-by: NMeelis Roos <mroos@linux.ee> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 10 9月, 2014 1 次提交
-
-
由 Allen Pais 提交于
The following patch adds support for correctly recognising M6 and M7 cpu type. Signed-off-by: NAllen Pais <allen.pais@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 04 5月, 2014 1 次提交
-
-
由 David S. Miller 提交于
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 11 3月, 2013 1 次提交
-
-
由 Allen Pais 提交于
The following patch adds support for correctly recognizing SPARC-X chips. cpu : Unknown SUN4V CPU fpu : Unknown SUN4V FPU pmu : Unknown SUN4V PMU Signed-off-by: NKatayama Yoshihiro <kata1@jp.fujitsu.com> Signed-off-by: NAllen Pais <allen.pais@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 06 10月, 2012 1 次提交
-
-
由 David S. Miller 提交于
This adds optimized memset/bzero/page-clear routines for Niagara-4. We basically can do what powerpc has been able to do for a decade (via the "dcbz" instruction), which is use cache line clearing stores for bzero and memsets with a 'c' argument of zero. As long as we make the cache initializing store to each 32-byte subblock of the L2 cache line, it works. As with other Niagara-4 optimized routines, the key is to make sure to avoid any usage of the %asi register, as reads and writes to it cost at least 50 cycles. For the user clear cases, we don't use these new routines, we use the Niagara-1 variants instead. Those have to use %asi in an unavoidable way. A Niagara-4 8K page clear costs just under 600 cycles. Add definitions of the MRU variants of the cache initializing store ASIs. By default, cache initializing stores install the line as Least Recently Used. If we know we're going to use the data immediately (which is true for page copies and clears) we can use the Most Recently Used variant, to decrease the likelyhood of the lines being evicted before they get used. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 27 9月, 2012 1 次提交
-
-
由 David S. Miller 提交于
Before After -------------- -------------- bw_tcp: 1288.53 MB/sec 1637.77 MB/sec bw_pipe: 1517.18 MB/sec 2107.61 MB/sec bw_unix: 1838.38 MB/sec 2640.91 MB/sec make -s -j128 allmodconfig 5min 49sec 5min 31sec Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 20 5月, 2012 1 次提交
-
-
由 Sam Ravnborg 提交于
To allow us to add ttable_32.S Signed-off-by: NSam Ravnborg <sam@ravnborg.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 17 9月, 2011 1 次提交
-
-
由 David S. Miller 提交于
Recognize T4 and T5 chips. Treating them both as "T2 plus other stuff" should be extremely safe and make sure distributions will work when those chips actually ship to customers. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 03 8月, 2011 1 次提交
-
-
由 David S. Miller 提交于
Don't use floating point on Niagara2, use the traditional plain Niagara code instead. Unroll Niagara loops to 128 bytes for copy, and 256 bytes for clear. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 28 7月, 2011 1 次提交
-
-
由 David S. Miller 提交于
The cpu compatible string we look for is "SPARC-T3". As far as memset/memcpy optimizations go, we treat this chip the same as Niagara-T2/T2+. Use cache initializing stores for memset, and use perfetch, FPU block loads, cache initializing stores, and block stores for copies. We use the Niagara-T2 perf support, since T3 is a close relative in this regard. Later we'll add support for the new events T3 can report, plus enable T3's new "sample" mode. For now I haven't added any new ELF hwcap flags. We probably need to add a couple, for example: T2 and T3 both support the population count instruction in hardware. T3 supports VIS3 instructions, including support (finally) for partitioned shift. One can also now move directly between float and integer registers. T3 supports instructions meant to help with Galois Field and other HPC calculations, such as XOR multiply. Also there are "OP and negate" instructions, for example "fnmul" which is multiply-and-negate. T3 recognizes the transactional memory opcodes, however since transactional memory isn't supported: 1) 'commit' behaves as a NOP and 2) 'chkpt' always branches 3) 'rdcps' returns all zeros and 4) 'wrcps' behaves as a NOP. So we'll need about 3 new elf capability flags in the end to represent all of these things. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 31 3月, 2011 1 次提交
-
-
由 Lucas De Marchi 提交于
Fixes generated by 'codespell' and manually reviewed. Signed-off-by: NLucas De Marchi <lucas.demarchi@profusion.mobi>
-
- 16 6月, 2009 1 次提交
-
-
由 David S. Miller 提交于
Surprisingly this actually makes LOAD_PER_CPU_BASE() a little more efficient. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 28 4月, 2009 1 次提交
-
-
由 Tim Abbott 提交于
The section .text.init.refok is deprecated and __REF (.ref.text) should be used in assembly files instead. This patch cleans up a few uses of .text.init.refok in the sparc architecture. Also fix a reference to .text.init in a comment that wasn't updated to .init.text. Signed-off-by: NTim Abbott <tabbott@mit.edu> Cc: David S. Miller <davem@davemloft.net> Acked-by: NSam Ravnborg <sam@ravnborg.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 30 3月, 2009 1 次提交
-
-
由 Nick Andrew 提交于
Fix misspelling of firmware. Signed-off-by: NNick Andrew <nick@nick-andrew.net> Signed-off-by: NJiri Kosina <jkosina@suse.cz>
-
- 09 2月, 2009 1 次提交
-
-
由 David S. Miller 提交于
This is an implementation of a suggestion made by Chris Torek: -------------------- Something else I noticed in passing: the EX and EX_LD/EX_ST macros scattered throughout the various .S files make a fair bit of .fixup code, all of which does the same thing. At the cost of one symbol in copy_in_user.S, you could just have one common two-instruction retl-and-mov-1 fixup that they all share. -------------------- The following is with a defconfig build: text data bss dec hex filename 3972767 344024 584449 4901240 4ac978 vmlinux.orig 39688877 344024 584449 4897360 4aba50 vmlinux Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 05 12月, 2008 2 次提交
-
-
由 Sam Ravnborg 提交于
o Move all files from sparc64/kernel/ to sparc/kernel - rename as appropriate o Update sparc/Makefile to the changes o Update sparc/kernel/Makefile to include the sparc64 files NOTE: This commit changes link order on sparc64! Link order had to change for either of sparc32 and sparc64. And assuming sparc64 see more testing than sparc32 change link order on sparc64 where issues will be caught faster. Signed-off-by: NSam Ravnborg <sam@ravnborg.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
So that we can profile code even in a local_irq_disable() section, only write 14 (instead of 15) into the %pil register to disable IRQs. This allows PIL level 15 to serve as a pseudo NMI. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 01 9月, 2008 1 次提交
-
-
由 David S. Miller 提交于
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 28 4月, 2008 1 次提交
-
-
由 David S. Miller 提交于
entry.S was a hodge-podge of several totally unrelated sets of assembler routines, ranging from FPU trap handlers to hypervisor call functions. Split it up into topic-sized pieces. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 22 3月, 2008 1 次提交
-
-
由 David S. Miller 提交于
Currently kernel images are limited to 8MB in size, and this causes problems especially when enabling features that take up a lot of kernel image space such as lockdep. The code now will align the kernel image size up to 4MB and map that many locked TLB entries. So, the only practical limitation is the number of available locked TLB entries which is 16 on Cheetah and 64 on pre-Cheetah sparc64 cpus. Niagara cpus don't actually have hw locked TLB entry support. Rather, the hypervisor transparently provides support for "locked" TLB entries since it runs with physical addressing and does the initial TLB miss processing. Fully utilizing this change requires some help from SILO, a patch for which will be submitted to the maintainer. Essentially, SILO will only currently map up to 8MB for the kernel image and that needs to be increased. Note that neither this patch nor the SILO bits will help with network booting. The openfirmware code will only map up to a certain amount of kernel image during a network boot and there isn't much we can to about that other than to implemented a layered network booting facility. Solaris has this, and calls it "wanboot" and we may implement something similar at some point. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 07 2月, 2008 1 次提交
-
-
由 David S. Miller 提交于
The early per-cpu handling needs a slight tweak to work when booting on a non-zero cpu. We got away with this for a long time, but can't any longer as now even printk() calls functions (cpu_clock() for example) that thus make early references to per-cpu variables. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 17 9月, 2007 1 次提交
-
-
由 David S. Miller 提交于
As noted by Al Viro, when we try to call prom_set_trap_table() in the SMP trampoline code we try to take the PROM call spinlock which doesn't work because the current thread pointer isn't valid yet and lockdep depends upon that being correct. Furthermore, we cannot set the current thread pointer register because it can't be properly dereferenced until we return from prom_set_trap_table(). Kernel TLB misses only work after that call. So do the PROM call to set the trap table directly instead of going through the OBP library C code, and thus avoid the lock altogether. These calls are guarenteed to be serialized fully. Since there are now no calls to the prom_set_trap_table{_sun4v}() library functions, they can be deleted. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 16 8月, 2007 2 次提交
-
-
由 David S. Miller 提交于
This register is not a part of the sun4v architecture. Niagara 1 and 2 happened to leave it around. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
The bzero/memset implementation stays the same as Niagara-1. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 09 8月, 2007 1 次提交
-
-
由 David S. Miller 提交于
Check the cpu type in the OBP device tree before committing to using the optimized Niagara memcpy and memset implementation. If we don't recognize the cpu type, use a completely generic version. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 25 7月, 2007 1 次提交
-
-
由 David S. Miller 提交于
We can't mark the whole thing init because there are dependencies in bootloaders that assume that _start, or whatever the image entry value, is 2 instructions before the "HdrS" signature. In fact, TILO assumes this entry is always at 0x4000, yikes! Also, right after the bootloader info area there are OBP strings and values that get used later in the boot process, and those are not all provably .init yet. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-