- 28 July 2018, 1 commit
-
-
Committed by Eugeniy Paltsev
As of today we don't set up SMP_CACHE_BYTES and cache_line_size for ARC, so they default to L1_CACHE_BYTES. The L1 line length (L1_CACHE_BYTES) can easily be smaller than the L2 line (which is usually the case, BTW). This breaks code: for example, it breaks the ethernet infrastructure on HSDK/AXS103 boards with IOC disabled, where manual cache flushes are involved. Functions which allocate and manage the sk_buff packet data area rely on the SMP_CACHE_BYTES define. As a result, the last L2 cache line of the sk_buff linear packet data area can be shared between the DMA buffer and useful data in another structure, and we can lose that data when we invalidate the DMA buffer.

 sk_buff linear packet data area
 |
 |
 |        skb->end        skb->tail
 V           |                |
             V                V
----------------------------------------------.
 packet data | <tail padding> | <useful data in other struct>
----------------------------------------------.

---------------------.--------------------------------------------------.
      SLC line       |            SLC (L2 cache) line (128B)             |
---------------------.--------------------------------------------------.
                     ^                                                   ^
                     |                                                   |

These cache lines will be invalidated when we invalidate the skb linear packet data area before a DMA transaction starts. This leads to issues that are painful to debug, as they reproduce only if (sk_buff->end - sk_buff->tail) < SLC_LINE_SIZE and there is some useful data right after sk_buff->end.

Fix that by hardcoding SMP_CACHE_BYTES to the maximum line length we may have.

Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
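A minimal sketch of the kind of change described, assuming the defines live in arch/arc/include/asm/cache.h; the 128B value and the cache_line_size() override are taken from the text above, everything else is illustrative:

/* arch/arc/include/asm/cache.h (sketch) */

/*
 * Hardcode the SMP/DMA alignment to the largest line any ARC cache
 * level may have (128B SLC), instead of inheriting L1_CACHE_BYTES.
 */
#define SMP_CACHE_BYTES		128
#define cache_line_size()	SMP_CACHE_BYTES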
-
- 31 August 2017, 2 commits
-
-
Committed by Vineet Gupta
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
-
Committed by Alexey Brodkin
The current implementation relies on the L1 line length, which might easily be smaller than the L2 line (which is usually the case, BTW). Imagine this typical case: the L2 line is 128 bytes while the L1 line is 64 bytes. Now we want to allocate a small buffer and later use it for DMA (consider IOC is not available). kmalloc() allocates a small KMALLOC_MIN_SIZE-sized, KMALLOC_MIN_SIZE-aligned buffer. That way, if the buffer happens to be aligned to an L1 line but not an L2 line, we'll be flushing and invalidating extra portions of data from L2, which will cause cache coherency issues. And since KMALLOC_MIN_SIZE is bound to ARCH_DMA_MINALIGN, the fix can be simple: set ARCH_DMA_MINALIGN to the largest cache line we may ever get. As of today, neither the L1 of ARC700 and ARC HS38 nor the SLC may be longer than 128 bytes.

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
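A sketch of the corresponding define, again assuming asm/cache.h; the value 128 is the largest line named above:

/* arch/arc/include/asm/cache.h (sketch) */

/*
 * kmalloc() minimum size/alignment follows ARCH_DMA_MINALIGN, so pin it
 * to the largest possible cache line (128B) to keep small DMA buffers
 * from straddling an L2 line shared with unrelated data.
 */
#define ARCH_DMA_MINALIGN	128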
-
- 04 August 2017, 1 commit
-
-
Committed by Alexey Brodkin
It is necessary to explicitly set both SLC_AUX_RGN_START1 and SLC_AUX_RGN_END1, which hold the MSB bits of the physical address of the region start and end respectively; otherwise the SLC region operation executes in an unpredictable manner. Without this patch, SLC flushes on HSDK (IOC disabled) were taking seconds.

Cc: stable@vger.kernel.org #4.4+
Reported-by: Vladimir Kondratiev <vladimir.kondratiev@intel.com>
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
[vgupta: PAR40 regs only written if PAE40 exists]
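A hedged sketch of the write sequence implied here; the ARC_REG_SLC_RGN_* names follow the conventions used elsewhere in arch/arc, and upper_32_bits()/lower_32_bits() are generic kernel helpers:

/* sketch: program the full 40-bit bounds before an SLC region op */
write_aux_reg(ARC_REG_SLC_RGN_END,    lower_32_bits(end));
write_aux_reg(ARC_REG_SLC_RGN_END1,   upper_32_bits(end));	/* MSB bits */
write_aux_reg(ARC_REG_SLC_RGN_START,  lower_32_bits(paddr));
write_aux_reg(ARC_REG_SLC_RGN_START1, upper_32_bits(paddr));	/* MSB bits */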
-
- 03 May 2017, 2 commits
-
-
Committed by Vineet Gupta
DC_CTRL.RGN_OP is 3 bits wide; however, only 1 bit is used in the current programming model (0: flush, 1: invalidate). The current code targeting all 3 bits leads to an additional 8-byte AND instruction, which can be elided given that only 1 bit is ever set by software and/or looked at by hardware.

before
------
| 80b63324 <__dma_cache_wback_inv_l1>:
| 80b63324:	clri	r3
| 80b63328:	lr	r2,[dc_ctrl]
| 80b6332c:	and	r2,r2,0xfffff1ff	<--- 8 bytes insn
| 80b63334:	or	r2,r2,576
| 80b63338:	sr	r2,[dc_ctrl]
| ...
| ...
| 80b63360 <__dma_cache_inv_l1>:
| 80b63360:	clri	r3
| 80b63364:	lr	r2,[dc_ctrl]
| 80b63368:	and	r2,r2,0xfffff1ff	<--- 8 bytes insn
| 80b63370:	bset_s	r2,r2,0x9
| 80b63372:	sr	r2,[dc_ctrl]
| ...
| ...
| 80b6338c <__dma_cache_wback_l1>:
| 80b6338c:	clri	r3
| 80b63390:	lr	r2,[dc_ctrl]
| 80b63394:	and	r2,r2,0xfffff1ff	<--- 8 bytes insn
| 80b6339c:	sr	r2,[dc_ctrl]

after (AND elided totally in 2 cases, replaced with 2-byte BCLR in the 3rd)
-----
| 80b63324 <__dma_cache_wback_inv_l1>:
| 80b63324:	clri	r3
| 80b63328:	lr	r2,[dc_ctrl]
| 80b6332c:	or	r2,r2,576
| 80b63330:	sr	r2,[dc_ctrl]
| ...
| ...
| 80b63358 <__dma_cache_inv_l1>:
| 80b63358:	clri	r3
| 80b6335c:	lr	r2,[dc_ctrl]
| 80b63360:	bset_s	r2,r2,0x9
| 80b63362:	sr	r2,[dc_ctrl]
| ...
| ...
| 80b6337c <__dma_cache_wback_l1>:
| 80b6337c:	clri	r3
| 80b63380:	lr	r2,[dc_ctrl]
| 80b63384:	bclr_s	r2,r2,0x9
| 80b63386:	sr	r2,[dc_ctrl]

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
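A hedged C-level sketch of what the listings above correspond to; the macro names are illustrative, and only the single invalidate bit (bit 9) is touched:

unsigned int ctrl = read_aux_reg(ARC_REG_DC_CTRL);

/*
 * before: ctrl &= ~DC_CTRL_RGN_OP_MSK; cleared the whole 3-bit field,
 * costing a long-immediate AND. Hardware only ever looks at one bit,
 * so set or clear just that bit instead:
 */
if (op & OP_INV)
	ctrl |= DC_CTRL_RGN_OP_INV;	/* bit 9: region op = invalidate */
else
	ctrl &= ~DC_CTRL_RGN_OP_INV;	/* region op = flush */

write_aux_reg(ARC_REG_DC_CTRL, ctrl);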
-
Committed by Vineet Gupta
These are more efficient than the per-line ops.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
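A hedged sketch of the contrast, assuming the ARC_REG_DC_STARTR/ENDR aux registers (region-op counterparts of the per-line ARC_REG_DC_IVDL); the write to the START register is what kicks off the operation:

/* per-line: one aux write per cache line in the buffer */
for (off = 0; off < sz; off += L1_CACHE_BYTES)
	write_aux_reg(ARC_REG_DC_IVDL, paddr + off);

/* region op: two aux writes cover the whole buffer */
write_aux_reg(ARC_REG_DC_ENDR, paddr + sz - 1);
write_aux_reg(ARC_REG_DC_STARTR, paddr);	/* triggers the op */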
-
- 19 January 2017, 2 commits
-
-
Committed by Vineet Gupta
On AXS103 release bitfiles, DMA data corruption was seen because the IOC setup was not following the way recommended in the documentation. Flipping IOC on while caches are enabled or coherency transactions are in flight might cause some memory operations to not observe coherency as expected. So strictly follow the programming model recommendations as documented in the comment header above arc_ioc_setup().

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
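A hedged sketch of such a sequence; the helper names are hypothetical and the point is the ordering (quiesce the data cache before flipping IOC on), not the exact registers:

/*
 * sketch: bring IOC up only while the data cache is quiesced, so no
 * coherency transactions are in flight when the unit is enabled.
 */
static void arc_ioc_setup(void)
{
	/* 1. flush and disable D$ so nothing dirty or in-flight remains */
	__dc_disable();				/* hypothetical helper */

	/* 2. program the coherency aperture, then enable the unit */
	write_aux_reg(ARC_REG_IO_COH_AP0_BASE, 0x80000);
	write_aux_reg(ARC_REG_IO_COH_AP0_SIZE, ap_size);
	write_aux_reg(ARC_REG_IO_COH_ENABLE, 1);

	/* 3. only now re-enable the data cache */
	__dc_enable();				/* hypothetical helper */
}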
-
Committed by Vineet Gupta
- Move IOC setup into arc_ioc_setup()
- Move SLC disabling into arc_slc_disable()

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
-
- 25 October 2016, 1 commit
-
-
Committed by Vineet Gupta
If the user disables IOC from the debugger at startup (by clearing @ioc_enable), @ioc_exists is cleared too. This means the boot prints don't capture the fact that IOC was present but disabled, which could be misleading. So invert how we use @ioc_enable and @ioc_exists and make it more canonical: @ioc_exists represents whether the hardware is present or not and stays the same whether enabled or not; @ioc_enable is still user driven, but will be auto-disabled if the IOC hardware is not present, i.e. if @ioc_exists=0. This is the opposite of what we were doing before, but much clearer. This means @ioc_enable is now the "exported" toggle in the rest of the code, such as the DMA mapping API.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
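A minimal sketch of the inverted semantics; the two variable names come from the text, the BCR probe detail is illustrative:

int ioc_enable = 1;		/* user toggle (a debugger may clear it) */
unsigned int ioc_exists;	/* pure hardware fact, never user-modified */

static void arc_ioc_probe(void)	/* hypothetical probe hook */
{
	struct bcr_clust_cfg cbcr;

	READ_BCR(ARC_REG_CLUSTER_BCR, cbcr);
	ioc_exists = cbcr.c;		/* reflects hardware only */

	if (!ioc_exists)
		ioc_enable = 0;		/* can't enable what isn't there */
}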
-
- 01 October 2016, 1 commit
-
-
Committed by Vineet Gupta
HS release 3.0 provides even more flexibility in specifying the volatile address space used for mapping peripherals. With HS 2.1 @start was made flexible / programmable; with HS 3.0 even @end can be set up (vs. being fixed to 0xFFFF_FFFF before). So add code to reflect that, and while at it remove an unused struct definition.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
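A sketch of the shape such code might take; the struct name, field layout, and register id are hypothetical, and only the idea that both bounds are now read from hardware comes from the text:

struct bcr_volatile {			/* hypothetical name/layout */
#ifdef CONFIG_CPU_BIG_ENDIAN
	unsigned int start:4, limit:4, pad:24;
#else
	unsigned int pad:24, limit:4, start:4;
#endif
};

static void arc_read_perip_window(void)
{
	struct bcr_volatile vol;

	READ_BCR(AUX_VOL, vol);		/* hypothetical reg id */
	perip_base = (phys_addr_t)vol.start << 28;	 /* HS 2.1+: @start */
	perip_end  = ((phys_addr_t)vol.limit << 28) - 1; /* HS 3.0: @end too */
}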
-
- 19 March 2016, 1 commit
-
-
Committed by Vineet Gupta
The peripheral address space is an architectural address window which is uncached and typically used to wire up peripherals. For ARC700 cores (ARCompact ISA based) this is fixed to the 1GB region 0xC000_0000 - 0xFFFF_FFFF. For ARCv2 based HS38 cores the start address is flexible and can be 0xC000_0000, 0xD000_0000, 0xE000_0000 or 0xF000_0000, set by programming the AUX_NON_VOLATILE_LIMIT reg (typically done in the bootloader). Further, in case of PAE, the physical address can extend beyond 4GB, so we need to confine this check; otherwise all pages beyond 4GB would be treated as uncached.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
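A sketch of the confined check; is_isa_arcompact() and perip_base mirror names used in arch/arc, and the exact bounds logic is illustrative:

bool arc_uncached_addr_space(phys_addr_t paddr)
{
	if (is_isa_arcompact()) {
		/* ARC700: fixed 1GB window at the top of the 4GB space */
		if (paddr >= ARC_UNCACHED_ADDR_SPACE)
			return true;
	} else if (paddr >= perip_base && paddr <= 0xFFFFFFFFULL) {
		/* HS38: programmable start, confined below 4GB so PAE
		 * pages above 4GB are not misclassified as uncached */
		return true;
	}
	return false;
}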
-
- 21 December 2015, 1 commit
-
-
Committed by Alexey Brodkin
ARC700 cores with MMU v2 don't have the IC_PTAG AUX register, so we only define ARC_REG_IC_PTAG for MMU versions >= 3. But the current implementation of the cache_line_loop_vX() routines assumes availability of all of them (v2, v3 and v4) simultaneously. And with ARC_REG_IC_PTAG undefined if CONFIG_MMU_VER=2 we're seeing a compilation problem:

---------------------------------->8-------------------------------
  CC      arch/arc/mm/cache.o
arch/arc/mm/cache.c: In function '__cache_line_loop_v3':
arch/arc/mm/cache.c:270:13: error: 'ARC_REG_IC_PTAG' undeclared (first use in this function)
     aux_tag = ARC_REG_IC_PTAG;
             ^
arch/arc/mm/cache.c:270:13: note: each undeclared identifier is reported only once for each function it appears in
scripts/Makefile.build:258: recipe for target 'arch/arc/mm/cache.o' failed
---------------------------------->8-------------------------------

The simplest fix is to have ARC_REG_IC_PTAG defined regardless of the MMU version being used. We don't use it in cache_line_loop_v2() anyway, so who cares.

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
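A sketch of the unconditional define; treat the register number as illustrative:

/* arch/arc/include/asm/cache.h (sketch) */

/*
 * Defined for all MMU versions: __cache_line_loop_v2() simply never
 * reads it, but having it visible lets the v3/v4 loops compile too.
 */
#define ARC_REG_IC_PTAG		0x1E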
-
- 29 October 2015, 1 commit
-
-
Committed by Vineet Gupta
This is the first working implementation of the 40-bit physical address extension on ARCv2.

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
-
- 20 August 2015, 1 commit
-
-
Committed by Alexey Brodkin
In case of an ARCv2 CPU there could be the following configurations that affect cache handling for data exchanged with peripherals via DMA:

[1] Only L1 cache exists
[2] Both L1 and L2 exist, but no IO coherency unit
[3] L1, L2 caches and IO coherency unit exist

The current implementation takes care of [1] and [2]. Moreover, support of [2] is implemented with a run-time check for SLC existence, which is not super optimal. This patch introduces support of [3] and a rework of DMA ops usage: instead of doing a run-time check every time a particular DMA op is executed, we'll have 3 different implementations of the DMA ops and select the appropriate one during init (see the sketch below).

As for IOC support, for it we need to:
[a] Implement empty DMA ops, because IOC takes care of cache coherency with DMAed data
[b] Route dma_alloc_coherent() via dma_alloc_noncoherent() - this is required to make IOC work in the first place, and also serves as an optimization, as LD/ST to coherent buffers can be serviced from caches without going all the way to memory

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
[vgupta: - Added some comments about IOC gains
         - Marked dma ops as static
         - Massaged changelog a bit]
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
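A hedged sketch of selecting the op set once at init; the function-pointer pattern mirrors what the text describes, concrete names are illustrative:

/* sketch: pick the cache-maintenance flavour once, not per DMA op */
static void (*__dma_cache_wback_inv)(phys_addr_t start, unsigned long sz);

void __init arc_cache_init(void)
{
	if (ioc_enable)
		__dma_cache_wback_inv = __dma_cache_wback_inv_ioc; /* empty op */
	else if (l2_line_sz)
		__dma_cache_wback_inv = __dma_cache_wback_inv_slc; /* L1 + SLC */
	else
		__dma_cache_wback_inv = __dma_cache_wback_inv_l1;  /* L1 only */
}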
-
- 25 June 2015, 1 commit
-
-
Committed by Vineet Gupta
The L2 cache on ARCHS processors is called SLC (System Level Cache). For working DMA (in the absence of hardware-assisted IO Coherency) we need to manage the SLC explicitly when buffers transition between the cpu and controllers.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
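A sketch of how the explicit SLC management layers under the L1 maintenance at the CPU/device hand-off; slc_op() is illustrative of the region-op entry point such a change would add:

/* sketch: before handing a buffer to a DMA controller */
static void __dma_cache_wback_inv_slc(phys_addr_t start, unsigned long sz)
{
	__dc_line_op_k(start, sz, OP_FLUSH_N_INV);	/* L1 first */
	slc_op(start, sz, OP_FLUSH_N_INV);		/* then the SLC */
}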
-
- 22 June 2015, 2 commits
-
-
Committed by Vineet Gupta
This is also the default for the AXS103 release.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
-
Committed by Vineet Gupta
Caveats about cache flush on ARCv2 based cores:

- dcache is PIPT, so paddr is sufficient for cache maintenance ops (no need to set up the PTAG reg)
- icache is still VIPT, but only aliasing configs need PTAG setup

So basically this is a departure from MMU-v3, which always needs vaddr in the line ops registers (DC_IVDL, DC_FLDL, IC_IVIL) but paddr in DC_PTAG, IC_PTAG respectively.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
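A hedged sketch of a v4 line loop reflecting those caveats; the shape mirrors the __cache_line_loop_vX() routines named earlier in this log, details are illustrative:

static inline void __cache_line_loop_v4(phys_addr_t paddr, unsigned long vaddr,
					unsigned long sz, const int op)
{
	unsigned int aux_cmd;
	int num_lines;

	if (op == OP_INV_IC)
		aux_cmd = ARC_REG_IC_IVIL;
	else
		aux_cmd = op & OP_INV ? ARC_REG_DC_IVDL : ARC_REG_DC_FLDL;

	/* vaddr deliberately unused: non-aliasing config assumed */
	paddr &= CACHE_LINE_MASK;	/* align down to a line boundary */
	num_lines = DIV_ROUND_UP(sz, L1_CACHE_BYTES);

	while (num_lines-- > 0) {
		write_aux_reg(aux_cmd, paddr);	/* PIPT d$: paddr is enough */
		paddr += L1_CACHE_BYTES;
	}
}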
-
- 13 October 2014, 1 commit
-
-
Committed by Vineet Gupta
Suggested-by: Noam Camus <noamc@ezchip.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
-
- 16 June 2014, 1 commit
-
-
Committed by Paul Bolle
There's no Kconfig symbol ARC_MMU_V4, so the checks for CONFIG_ARC_MMU_V4 will always evaluate to false. Remove them.

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
-
- 03 June 2014, 1 commit
-
-
Committed by Vineet Gupta
Requested-by: Noam Camus <noamc@ezchip.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
-
- 06 November 2013, 1 commit
-
-
Committed by Vineet Gupta
Having them be different seems an obscure configuration.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
-
- 05 September 2013, 1 commit
-
-
Committed by Vineet Gupta
------------>8--------------------
WARNING: vmlinux.o(.text+0x708): Section mismatch in reference from the function read_arc_build_cfg_regs() to the function .init.text:read_decode_cache_bcr()
WARNING: vmlinux.o(.text+0x702): Section mismatch in reference from the function read_arc_build_cfg_regs() to the function .init.text:read_decode_mmu_bcr()
------------>8--------------------

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
-
- 22 June 2013, 3 commits
-
-
Committed by Vineet Gupta
* Number of (i|d)cache ways can be retrieved from BCRs, hence no need to cross-check with built-in constants
* Use IS_ENABLED() to check for a Kconfig option
* is_not_cache_aligned() not used anymore

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
-
Committed by Vineet Gupta
* Move the various sub-system defines/types into relevant files/functions (reduces compilation time)
* Move CPU specific stuff out of asm/tlb.h into asm/mmu.h

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
-
Committed by Vineet Gupta
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
-
- 10 May 2013, 1 commit
-
-
Committed by Vineet Gupta
Enforce congruency of userspace shared mappings.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
-
- 16 February 2013, 1 commit
-
-
Committed by Vineet Gupta
* ARC700 has VIPT L1 caches
* Caches don't snoop and are not coherent
* Given the PAGE_SIZE and cache associativity, we don't support aliasing D$ configurations (yet), but do allow aliasing I$ configs

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
-
- 11 February 2013, 1 commit
-
-
Committed by Vineet Gupta
* L1_CACHE_SHIFT
* PAGE_SIZE, PAGE_OFFSET
* struct pt_regs, struct user_regs_struct
* struct thread_struct, cpu_relax(), task_pt_regs(), start_thread(), ...
* struct thread_info, THREAD_SIZE, INIT_THREAD_INFO(), TIF_*, ...
* BUG()
* ELF_*
* Elf_*

To disallow user-space visibility into some of the core kernel data types such as struct pt_regs, these are wrapped in #ifdef __KERNEL__, which also makes the UAPI header split (further patch in the series) NOT export them to asm/uapi/ptrace.h.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Cc: Jonas Bonn <jonas.bonn@gmail.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Acked-by: Arnd Bergmann <arnd@arndb.de>
-