- 10 5月, 2013 4 次提交
-
-
由 Vineet Gupta 提交于
Enforce congruency of userspace shared mappings Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
由 Vineet Gupta 提交于
Fix the one zillion warnings Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
由 Vineet Gupta 提交于
This is the meat of the series which prevents any dcache alias creation by always keeping the U and K mapping of a page congruent. If a mapping already exists, and other tries to access the page, prev one is flushed to physical page (wback+inv) Essentially flush_dcache_page()/copy_user_highpage() create K-mapping of a page, but try to defer flushing, unless U-mapping exist. When page is actually mapped to userspace, update_mmu_cache() flushes the K-mapping (in certain cases this can be optimised out) Additonally flush_cache_mm(), flush_cache_range(), flush_cache_page() handle the puring of stale userspace mappings on exit/munmap... flush_anon_page() handles the existing U-mapping for anon page before kernel reads it via the GUP path. Note that while not complete, this is enough to boot a simple dynamically linked Busybox based rootfs Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
由 Vineet Gupta 提交于
This preps the low level dcache flush helpers to take vaddr argument in addition to the existing paddr to properly flush the VIPT dcache Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
- 07 5月, 2013 6 次提交
-
-
由 Vineet Gupta 提交于
flush_dcache_page( ) is MM hook to ensure that a page has consistent views between kernel and userspace. Thus it is called when * kernel writes to a page which at some later point could get mapped to userspace (so kernel mapping needs to be flushed-n-inv) * kernel is about to read from a page with possible userspace mappings (so userspace mappings needs to be made coherent with kernel ones) However for Non aliasing VIPT dcache, any userspace mapping will always be congruent to kernel mapping. Thus d-cache need need not be flushed at all (or delayed indefinitely). The only reason it does need to be flushed is when mapping code pages. Since icache doesn't snoop dcache, those dirty dcache lines need to be written back to memory and icache line invalidated so that icache lines fetch will get the right data. Decent gains on LMBench fork/exec/sh and File I/O micro-benchmarks. (1) FPGA @ 80 MHZ Processor, Processes - times in microseconds - smaller is better ------------------------------------------------------------------------------ Host OS Mhz null null open slct sig sig fork exec sh call I/O stat clos TCP inst hndl proc proc proc --------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- 3.9-rc6-a Linux 3.9.0-r 80 4.79 8.72 66.7 116. 239. 8.39 30.4 4798 14.K 34.K 3.9-rc6-b Linux 3.9.0-r 80 4.79 8.62 65.4 111. 239. 8.35 29.0 3995 12.K 30.K 3.9-rc7-c Linux 3.9.0-r 80 4.79 9.00 66.1 106. 239. 8.61 30.4 2858 10.K 24.K ^^^^ ^^^^ ^^^ File & VM system latencies in microseconds - smaller is better ------------------------------------------------------------------------------- Host OS 0K File 10K File Mmap Prot Page 100fd Create Delete Create Delete Latency Fault Fault selct --------- ------------- ------ ------ ------ ------ ------- ----- ------- ----- 3.9-rc6-a Linux 3.9.0-r 317.8 204.2 1122.3 375.1 3522.0 4.288 20.7 126.8 3.9-rc6-b Linux 3.9.0-r 298.7 223.0 1141.6 367.8 3531.0 4.866 20.9 126.4 3.9-rc7-c Linux 3.9.0-r 278.4 179.2 862.1 339.3 3705.0 3.223 20.3 126.6 ^^^^^ ^^^^^ ^^^^^ ^^^^ (2) Customer Silicon @ 500 MHz (166 MHz mem) ------------------------------------------------------------------------------ Host OS Mhz null null open slct sig sig fork exec sh call I/O stat clos TCP inst hndl proc proc proc --------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- abilis-ba Linux 3.9.0-r 497 0.71 1.38 4.58 12.0 35.5 1.40 3.89 2070 5525 13.K abilis-ca Linux 3.9.0-r 497 0.71 1.40 4.61 11.8 35.6 1.37 3.92 1411 4317 10.K ^^^^ ^^^^ ^^^ Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
由 Vineet Gupta 提交于
Now that we have same helper used for all icache invalidates (i.e. vaddr+paddr based exact line invalidate), consolidate the open coded calls into one place. Also rename flush_icache_range_vaddr => __sync_icache_dcache Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
由 Vineet Gupta 提交于
ARC icache doesn't snoop dcache thus executable pages need to be made coherent before mapping into userspace in flush_icache_page(). However ARC700 CDU (hardware cache flush module) requires both vaddr (index in cache) as well as paddr (tag match) to correctly identify a line in the VIPT cache. A typical ARC700 SoC has aliasing icache, thus the paddr only based flush_icache_page() API couldn't be implemented efficiently. It had to loop thru all possible alias indexes and perform the invalidate operation (ofcourse the cache op would only succeed at the index(es) where tag matches - typically only 1, but the cost of visiting all the cache-bins needs to paid nevertheless). Turns out however that the vaddr (along with paddr) is available in update_mmu_cache() hence better suits ARC icache flush semantics. With both vaddr+paddr, exactly one flush operation per line is done. Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
由 Vineet Gupta 提交于
munmap ends up calling tlb_flush() which for ARC was flushing the entire TLB unconditionally (by moving the MMU to a new ASID) do_munmap unmap_region unmap_vmas unmap_single_vma unmap_page_range tlb_start_vma zap_pud_range tlb_end_vma() tlb_finish_mmu tlb_flush() ---> unconditional flush_tlb_mm() So even a single page munmap, a frequent operation when uClibc dynamic linker (ldso) is loading the dependent shared libraries, would move the the ASID multiple times - needlessly invalidating the pre-faulted TLB entries (and increasing the rate of ASID wraparound + full TLB flush). This is now optimised to only be called if tlb->full_mm (which means for exit/execve) cases only. And for those cases, flush_tlb_mm() is already optimised to be a no-op for mm->mm_users == 0. So essentially there are no mmore full mm flushes - except for fork which anyhow needs it for properly COW'ing parent address space. munmap now needs to do TLB range flush, which is implemented with tlb_end_vma() Results ------- 1. ASID now consistenly moves by 4 during a simple ls (as opposed to 5 or 7 before). 2. LMBench microbenchmark also shows improvements Basic system parameters ------------------------------------------------------------------------------ Host OS Description Mhz tlb cache mem scal pages line par load bytes --------- ------------- ----------------------- ---- ----- ----- ------ ---- 3.9-rc5-0 Linux 3.9.0-r 3.9-rc5-0404-gcc-4.4-ba 80 8 64 1.1000 1 3.9-rc5-0 Linux 3.9.0-r 3.9-rc5-0405-avoid-full 80 8 64 1.1200 1 Processor, Processes - times in microseconds - smaller is better ------------------------------------------------------------------------------ Host OS Mhz null null open slct sig sig fork exec sh call I/O stat clos TCP inst hndl proc proc proc --------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- 3.9-rc5-0 Linux 3.9.0-r 80 4.81 8.69 68.6 118. 239. 8.53 31.6 4839 13.K 34.K 3.9-rc5-0 Linux 3.9.0-r 80 4.46 8.36 53.8 91.3 223. 8.12 24.2 4725 13.K 33.K File & VM system latencies in microseconds - smaller is better ------------------------------------------------------------------------------- Host OS 0K File 10K File Mmap Prot Page 100fd Create Delete Create Delete Latency Fault Fault selct --------- ------------- ------ ------ ------ ------ ------- ----- ------- ----- 3.9-rc5-0 Linux 3.9.0-r 314.7 223.2 1054.9 390.2 3615.0 1.590 20.1 126.6 3.9-rc5-0 Linux 3.9.0-r 265.8 183.8 1014.2 314.1 3193.0 6.910 18.8 110.4 Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
由 Christian Ruppert 提交于
Infrastructure required to make the Linux kernel compile and boot on the Abilis Systems TB10x series of SOCs based on ARC700 CPUs: - Kmake related files (Kconfig, Makefile, tb10x_defconfig) - TB10x platform initialisation Signed-off-by: NChristian Ruppert <christian.ruppert@abilis.com> Signed-off-by: NPierrick Hascoet <pierrick.hascoet@abilis.com> Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
由 Christian Ruppert 提交于
This patch adds some room for CPU-external interrupt controllers in the Linux interrupt space. Until now, only the 32 CPU internal interrupt lines were supported which does not allow for external interrupt controllers such as GPIO modules etc. Signed-off-by: NChristian Ruppert <christian.ruppert@abilis.com> Signed-off-by: NPierrick Hascoet <pierrick.hascoet@abilis.com> Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
- 09 4月, 2013 1 次提交
-
-
由 Christian Ruppert 提交于
ARC irqsave/restore macros were missing the compiler barrier, causing a stale load in irq-enabled region be used in irq-safe region, despite being changed, because the register holding the value was still live. The problem manifested as random crashes in timer code when stress testing ARCLinux (3.9-rc3) on a !SMP && !PREEMPT_COUNT Here's the exact sequence which caused this: (0). tv1[x] <----> t1 <---> t2 (1). mod_timer(t1) interrupted after it calls timer_pending() (2). mod_timer(t2) completes (3). mod_timer(t1) resumes but messes up the list (4). __runt_timers( ) uses bogus timer_list entry / crashes in timer->function Essentially mod_timer() was racing against itself and while the spinlock serialized the tv1[] timer link list, timer_pending() called outside the spinlock, cached timer link list element in a register. With low register pressure (and a deep register file), lack of barrier in raw_local_irqsave() as well as preempt_disable (!PREEMPT_COUNT version), there was nothing to force gcc to reload across the spinlock, causing a stale value in reg be used for link list manipulation - ensuing a corruption. ARcompact disassembly which shows the culprit generated code: mod_timer: push_s blink mov_s r13,r0 # timer, timer .. ###### timer_pending( ) ld_s r3,[r13] # <------ <variable>.entry.next LOADED brne r3, 0, @.L163 .L163: .. ###### spin_lock_irq( ) lr r5, [status32] # flags bic r4, r5, 6 # temp, flags, and.f 0, r5, 6 # flags, flag.nz r4 ###### detach_if_pending( ) begins tst_s r3,r3 <-------------- # timer_pending( ) checks timer->entry.next # r3 is NOT reloaded by gcc, using stale value beq.d @.L169 mov.eq r0,0 ##### detach_timer( ): __list_del( ) ld r4,[r13,4] # <variable>.entry.prev, D.31439 st r4,[r3,4] # <variable>.prev, D.31439 st r3,[r4] # <variable>.next, D.30246 We initially tried to fix this by adding barrier() to preempt_* macros for !PREEMPT_COUNT but Linus clarified that it was anything but wrong. http://www.spinics.net/lists/kernel/msg1512709.html [vgupta: updated commitlog] Reported-by/Signed-off-by: Christian Ruppert <christian.ruppert@abilis.com> Cc: Christian Ruppert <christian.ruppert@abilis.com> Cc: Pierrick Hascoet <pierrick.hascoet@abilis.com> Debugged-by/Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 20 3月, 2013 1 次提交
-
-
由 Vineet Gupta 提交于
orig_r8_IS_EXCPN and orig_r8_IS_BRKPT were same values due to a copy/paste error. Although it looks bad and is wrong, it really doesn't affect gdb working. orig_r8_IS_BRKPT is the one relevant to debugging (breakpoints), since it is used to provide EFA vs. ERET to a ptrace "stop_pc" request. So when gdb has inserted a breakpoint, orig_r8_IS_BRKPT is already set, and anything else (i.e. orig_r8_IS_EXCPN) becoming same as it, really doesn't hurt gdb. The corollary case, could be nasty but nobody uses the ptrace "stop_pc" request in that case Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
- 19 3月, 2013 1 次提交
-
-
由 Pierrick Hascoet 提交于
Signed-off-by: NPierrick Hascoet <pierrick.hascoet@abilis.com> Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
- 18 3月, 2013 1 次提交
-
-
由 Vineet Gupta 提交于
Tracks commit e72837e3 "default SET_PERSONALITY() in linux/elf.h" Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
- 11 3月, 2013 2 次提交
-
-
由 Vineet Gupta 提交于
When switching to clone() only ABI - I missed out pruning the low level asm syscall wrappers Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
由 Vineet Gupta 提交于
CC drivers/mmc/host/mmc_spi.o drivers/mmc/host/mmc_spi.c:118: error: redefinition of 'struct scratch' make[3]: *** [drivers/mmc/host/mmc_spi.o] Error 1 make[2]: *** [drivers/mmc/host] Error 2 make[1]: *** [drivers/mmc] Error 2 make: *** [drivers] Error 2 CC arch/arc/kernel/kgdb.o In file included from include/linux/kgdb.h:20, from arch/arc/kernel/kgdb.c:11: /home/vineetg/arc/k.org/arc-port/arch/arc/include/asm/kgdb.h:34: warning: 'struct pt_regs' declared inside parameter list /home/vineetg/arc/k.org/arc-port/arch/arc/include/asm/kgdb.h:34: warning: its scope is only this definition or declaration, which is probably not what you want arch/arc/kernel/kgdb.c:172: error: conflicting types for 'kgdb_trap' CC arch/arc/kernel/kgdb.o arch/arc/kernel/kgdb.c: In function 'pt_regs_to_gdb_regs': arch/arc/kernel/kgdb.c:62: error: dereferencing pointer to incomplete type Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
- 27 2月, 2013 3 次提交
-
-
由 Vineet Gupta 提交于
Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
由 Vineet Gupta 提交于
The upstream kernel ABI (v3) is different from current out-of-tree (v2): * no-legacy-syscalls * user_regs_struct layout has changed So we rev up the ABI version Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
由 Vineet Gupta 提交于
ptrace regset interface relies on ELF_NGREG for ceiling the size of user request. So any larger request (even if legit) would be clipped. The existing def of ELF_NGREG didn't use user_regs_struct and was technically one placeholder short (stop_pc) - although the current code would still work because pt_regs includes a bunch of extra fields, making ELF_NGREG >= sizeof(struct user_regs_struct)/sizeof(long) But we need to remove this ambiguity, specially since pt_regs should NOT be directly associated with with anything userspace-ish. Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
- 26 2月, 2013 1 次提交
-
-
由 Vineet Gupta 提交于
The flat DT (currently embedded in vmlinux) is in .init section. The unflattened/binary tree doesn't copy strings through and references them from orig flat DT - which could cause catestrohpy if of_* APIs are called post init, say from a driver which is a loadable module. Reported-by: NJames Hogan <james.hogan@imgtec.com> Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
- 16 2月, 2013 20 次提交
-
-
由 Vineet Gupta 提交于
Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
由 Vineet Gupta 提交于
This again is for switch from singleton platform SMP API to multi-platform paradigm Platform code is not yet setup to populate the callbacks, that happens in next commit Signed-off-by: NVineet Gupta <vgupta@synopsys.com> Cc: Arnd Bergmann <arnd@arndb.de> Acked-by: NArnd Bergmann <arnd@arndb.de>
-
由 Vineet Gupta 提交于
All the current platforms can work with 0x8000_0000 based dma_addr_t since the Bus Bridges typically ignore the top bit (the only excpetion was Angel4 PCI-AHB bridge which we no longer care for). That way we don't need plat-specific cpu-addr to bus-addr conversion. Hooks still provided - just in case a platform has an obscure device which say needs 0 based bus address. That way <asm/dma_mapping.h> no longer needs to unconditinally include <plat/dma_addr.h> Also verfied that on Angel4 board, other peripherals (IDE-disk / EMAC) work fine with 0x8000_0000 based dma addresses. Signed-off-by: NVineet Gupta <vgupta@synopsys.com> Cc: Arnd Bergmann <arnd@arndb.de> Acked-by: NArnd Bergmann <arnd@arndb.de>
-
由 Vineet Gupta 提交于
For now this will suffice for all platforms, later exotic ones needs to get this from DeviceTree Signed-off-by: NVineet Gupta <vgupta@synopsys.com> Cc: Arnd Bergmann <arnd@arndb.de>
-
由 Vineet Gupta 提交于
-Top level ARC makefile removes -I for platform headers -asm/irq.h no longer includes plat/irq.h -platform makefile adds -I for it's specfic platform headers -platform code to directly include it's plat/irq.h -Linker script needed plat/memmap.h for CCM info, already in .config Signed-off-by: NVineet Gupta <vgupta@synopsys.com> Cc: Arnd Bergmann <arnd@arndb.de> Acked-by: NArnd Bergmann <arnd@arndb.de>
-
由 Vineet Gupta 提交于
-platform API is retired and instead callbacks are used Signed-off-by: NVineet Gupta <vgupta@synopsys.com> Cc: Arnd Bergmann <arnd@arndb.de> Acked-by: NArnd Bergmann <arnd@arndb.de>
-
由 Vineet Gupta 提交于
The orig platform code orgnaization was singleton design pattern - only one platform (and board thereof) would build at a time. Thus any platform/board specific code (e.g. irq init, early init ...) expected by ARC common code was exported as well defined set of APIs, with only ONE instance building ever. Now with multiple-platform build requirement, that design of code no longer holds - multiple board specific calls need to build at the same time - so ARC common code can't use the API approach, it needs a callback based design where each board registers it's specific set of functions, and at runtime, depending on board detection, the callbacks are used from the registry. This commit adds all the infrastructure, where board specific callbacks are specified as a "maThine description". All the hooks are placed in right spots, no board callbacks registered yet (with MACHINE_STARt/END constructs) so the hooks will not run. Next commit will actually convert the platform to this infrastructure. Signed-off-by: NVineet Gupta <vgupta@synopsys.com> Cc: Arnd Bergmann <arnd@arndb.de> Acked-by: NArnd Bergmann <arnd@arndb.de>
-
由 Gilad Ben-Yossef 提交于
Implement ioremap_prot() to allow mapping IO memory with variable protection via TLB. Implementing this allows the /dev/mem driver to use its generic access() VMA callback, which in turn allows ptrace to examine data in memory mapped regions mapped via /dev/mem, such as Arc DCCM. The end result is that it is possible to examine values of variables placed into DCCM in user space programs via GDB. CC: Alexey Brodkin <Alexey.Brodkin@synopsys.com> CC: Noam Camus <noamc@ezchip.com> Acked-by: NVineet Gupta <vgupta@synopsys.com> Signed-off-by: NGilad Ben-Yossef <gilad@benyossef.com> Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
由 Vineet Gupta 提交于
1. ./genfilelist.pl arch/arc/include/asm/ 2. Create arch/arc/include/uapi/asm/Kbuild as follows +# UAPI Header export list +include include/uapi/asm-generic/Kbuild.asm 3. ./disintegrate-one.pl arch/arc/include/{,uapi/}asm/<above-list> 4. Edit arch/arc/include/asm/Kbuild to remove ref to asm-generic/Kbuild.asm - To work around empty uapi/asm/setup.h added a placholder comment. - Also a manual #ifdef __ASSEMBLY__ for a late ptrace change Signed-off-by: NVineet Gupta <vgupta@synopsys.com> Cc: David Howells <dhowells@redhat.com>
-
由 Vineet Gupta 提交于
* Includes mapping of CCMs in address space * Annotations to move arbitrary code/data into CCM * Moving some of the critical code/data into CCM * Runtime detection/reporting Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
由 Vineet Gupta 提交于
Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
由 Vineet Gupta 提交于
Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
由 Mischa Jonker 提交于
Signed-off-by: NMischa Jonker <mjonker@synopsys.com> Signed-off-by: NVineet Gupta <vgupta@synopsys.com> Cc: Jason Wessel <jason.wessel@windriver.com> Acked-by: NJason Wessel <jason.wessel@windriver.com>
-
由 Vineet Gupta 提交于
ARC700 doesn't natively support unaligned access, but can be emulated -Unaligned Access Exception -Disassembly at the Fault address to find the exact insn (long/short) Also per Arnd's comment, we runtime control it using 2 sysctl knobs: * SYSCTL_ARCH_UNALIGN_ALLOW: Runtime enable/disble * SYSCTL_ARCH_UNALIGN_NO_WARN: Warn on each emulation attempt Originally contributed by Tim Yao <tim.yao@amlogic.com> Signed-off-by: NVineet Gupta <vgupta@synopsys.com> Cc: Tim Yao <tim.yao@amlogic.com> Acked-by: NArnd Bergmann <arnd@arndb.de>
-
由 Vineet Gupta 提交于
Origin port done by Rajeshwar Ranga Signed-off-by: NVineet Gupta <vgupta@synopsys.com> Cc: Rajeshwar Ranga <rajeshwar.ranga@gmail.com>
-
由 Vineet Gupta 提交于
In-kernel disassembler Due Credits * Orig written by Rajeshwar Ranga * Consolidation/cleanups by Mischa Jonker Signed-off-by: NVineet Gupta <vgupta@synopsys.com> Cc: Rajeshwar Ranga <rajeshwar.ranga@gmail.com> Cc: Mischa Jonker <mjonker@synopsys.com>
-
由 Vineet Gupta 提交于
-Originally written by Rajeshwar Ranga -Derived off of generic unwinder in 2.6.19 and adapted to ARC Signed-off-by: NVineet Gupta <vgupta@synopsys.com> Cc: Rajeshwar Ranga <rajeshwar.ranga@gmail.com>
-
由 Vineet Gupta 提交于
ARC common code to enable a SMP system + ISS provided SMP extensions. ARC700 natively lacks SMP support, hence some of the core features are are only enabled if SoCs have the necessary h/w pixie-dust. This includes: -Inter Processor Interrupts (IPI) -Cache coherency -load-locked/store-conditional ... The low level exception handling would be completely broken in SMP because we don't have hardware assisted stack switching. Thus a fair bit of this code is repurposing the MMU_SCRATCH reg for event handler prologues to keep them re-entrant. Many thanks to Rajeshwar Ranga for his initial "major" contributions to SMP Port (back in 2008), and to Noam Camus and Gilad Ben-Yossef for help with resurrecting that in 3.2 kernel (2012). Note that this platform code is again singleton design pattern - so multiple SMP platforms won't build at the moment - this deficiency is addressed in subsequent patches within this series. Signed-off-by: NVineet Gupta <vgupta@synopsys.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Rajeshwar Ranga <rajeshwar.ranga@gmail.com> Cc: Noam Camus <noamc@ezchip.com> Cc: Gilad Ben-Yossef <gilad@benyossef.com>
-
由 Vineet Gupta 提交于
Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
由 Vineet Gupta 提交于
There is a bit of hack/kludge right now where we disable preemption if a L2 (High prio) IRQ is taken while L1 (Low prio) is active. Need to revisit this Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-