- 13 5月, 2016 1 次提交
-
-
由 Vineet Gupta 提交于
Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
- 09 5月, 2016 14 次提交
-
-
由 Noam Camus 提交于
The default 256 bytes sometimes is just not enough. We usually provide earlycon=... and console=... and ip=... All this and more may need more room. Signed-off-by: NNoam Camus <noamc@ezchip.com> Acked-by: NVineet Gupta <vgupta@synopsys.com>
-
由 Tal Zilcer 提交于
Since the CTOP is SMT hardware multi-threaded, we need to hint the HW that now will be a very good time to do a hardware thread context switching. This is done by issuing the schd.rw instruction (binary coded here so as to not require specific revision of GCC to build the kernel). sched.rw means that Thread becomes eligible for execution by the threads scheduler after all pending read/write transactions were completed. Implementing cpu_relax_lowlatency() with barrier() Since with current semantics of cpu_relax() it may take a while till yielded CPU will get back. Signed-off-by: NNoam Camus <noamc@ezchip.com> Cc: Peter Zijlstra <peterz@infradead.org> Acked-by: NVineet Gupta <vgupta@synopsys.com>
-
由 Noam Camus 提交于
With generic "identity" num of CPUs is limited to 256 (8 bit). We use our alternative AUX register GLOBAL_ID (12 bit). Now we can support up to 4096 CPUs. Signed-off-by: NNoam Camus <noamc@ezchip.com>
-
由 Noam Camus 提交于
NPS device got 256 cores and each got 16 HW threads (SMT). We use EZchip dedicated ISA to trigger HW scheduler of the core that current HW thread belongs to. This scheduling makes sure that data beyond barrier is available to all HW threads in core and by that to all in device (4K). Signed-off-by: NNoam Camus <noamc@ezchip.com> Cc: Peter Zijlstra <peterz@infradead.org>
-
由 Noam Camus 提交于
We need our own implementaions since we lack LLSC support. Our extended ISA provided with optimized solution for all 32bit operations we see in these three headers. Signed-off-by: NNoam Camus <noamc@ezchip.com>
-
由 Noam Camus 提交于
NPS use special mapping right below TASK_SIZE. Hence we need to lower STACK_TOP so that user stack won't overlap NPS special mapping. Signed-off-by: NNoam Camus <noamc@ezchip.com> Acked-by: NVineet Gupta <vgupta@synopsys.com>
-
由 Noam Camus 提交于
If we hold rwlock and interrupt occures we may end up spinning on it for ever during softirq. Note that this lock is an internal lock and since the lock is free to be used from any context, the lock needs to be IRQ-safe. Below you may see an example for interrupt we get while nl_table_lock is holding its rw->lock_mutex and we spinned on it for ever. The concept for the fix was taken from SPARC. [2015-05-12 19:16:12] Stack Trace: [2015-05-12 19:16:12] arc_unwind_core+0xb8/0x11c [2015-05-12 19:16:12] dump_stack+0x68/0xac [2015-05-12 19:16:12] _raw_read_lock+0xa8/0xac [2015-05-12 19:16:12] netlink_broadcast_filtered+0x56/0x35c [2015-05-12 19:16:12] nlmsg_notify+0x42/0xa4 [2015-05-12 19:16:13] neigh_update+0x1fe/0x44c [2015-05-12 19:16:13] neigh_event_ns+0x40/0xa4 [2015-05-12 19:16:13] arp_process+0x46e/0x5a8 [2015-05-12 19:16:13] __netif_receive_skb_core+0x358/0x500 [2015-05-12 19:16:13] process_backlog+0x92/0x154 [2015-05-12 19:16:13] net_rx_action+0xb8/0x188 [2015-05-12 19:16:13] __do_softirq+0xda/0x1d8 [2015-05-12 19:16:14] irq_exit+0x8a/0x8c [2015-05-12 19:16:14] arch_do_IRQ+0x6c/0xa8 [2015-05-12 19:16:14] handle_interrupt_level1+0xe4/0xf0 Signed-off-by: NNoam Camus <noamc@ezchip.com> Acked-by: NPeter Zijlstra <peterz@infradead.org>
-
由 Noam Camus 提交于
On ARC, lower 2G of address space is translated and used for - user vaddr space (region 0 to 5) - unused kernel-user gutter (region 6) - kernel vaddr space (region 7) where each region simply represents 256MB of address space. The kernel vaddr space of 256MB is used to implement vmalloc, modules So far this was enough, but not on EZChip system with 4K CPUs (given that per cpu mechanism uses vmalloc for allocating chunks) So allow VMALLOC_SIZE to be configurable by expanding down into the unused kernel-user gutter region which at default 256M was excessive anyways. Also use _BITUL() to fix a build error since PGDIR_SIZE cannot use "1UL" as called from assembly code in mm/tlbex.S Signed-off-by: NNoam Camus <noamc@ezchip.com> [vgupta: rewrote changelog, debugged bootup crash due to int vs. hex] Acked-by: NVineet Gupta <vgupta@synopsys.com>
-
由 Noam Camus 提交于
UAPI header should not use Kconfig items Use __BIG_ENDIAN__ defined as a compiler intrinsic Signed-off-by: NNoam Camus <noamc@ezchip.com> [vgupta: fix changelog] Signed-off-by: NVineet Gupta <vgupta@synopsys.com> Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
由 Alexey Brodkin 提交于
There are no more users of this - so RIP! Signed-off-by: NAlexey Brodkin <abrodkin@synopsys.com> [vgupta: update changelog] Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
由 Vineet Gupta 提交于
This will be needed for switching to linear irq domain as irq_create_mapping() called by intr code needs the IRQ numbers in addition to existing usage in mcip.c for requesting the irq Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
由 Vineet Gupta 提交于
- timer frequency is derived from DT (no longer rely on top level DT "clock-frequency" probed early and exported by asm/clk.h) - TIMER0_IRQ need not be exported across arch code, confined to intc as it is property of same - Any failures in clockevent setup are considered pedantic and system panic()'s as there is no generic fallback (unlike clocksource where a jiffies based soft clocksource always exists) Acked-by: NDaniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
由 Noam Camus 提交于
ARC Timers so far have been handled as "legacy" w/o explicit description in DT. This poses challenge for newer platforms wanting to use them. This series will eventually help move timers over to DT. This patch does a small change of using a CPU notifier to set clockevent on non-boot CPUs. So explicit setup is done only on boot CPU (which will later be done by DT) Signed-off-by: NNoam Camus <noamc@ezchip.com> [vgupta: broken off from a bigger patch] Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
由 Vineet Gupta 提交于
- The idea is to remove the API usage since it has a subltle design flaw - relies on being called on cpu0 first. This is true for some early per cpu irqs such as TIMER/IPI, but not for late probed per cpu peripherals such a perf. And it's usage in perf has already bitten us once: see c6317bc7 ("ARCv2: perf: Ensure perf intr gets enabled on all cores") where we ended up open coding it anyways - The seeming duplication will go away once we start using cpu notifier for timer setup Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
- 05 5月, 2016 3 次提交
-
-
由 Vineet Gupta 提交于
Initial HIGHMEM support on ARC was introduced for PAE40 where the low memory (0x8000_0000 based) and high memory (0x1_0000_0000) were physically contiguous. So CONFIG_FLATMEM sufficed (despite a peipheral hole in the middle, which wasted a bit of struct page memory, but things worked). However w/o PAE, highmem was not possible and we could only reach ~1.75GB of DDR. Now there is a use case to access ~4GB of DDR w/o PAE40 The idea is to have low memory at canonical 0x8000_0000 and highmem at 0 so enire 4GB address space is available for physical addressing This needs additional platform/interconnect mapping to convert the non contiguous physical addresses into linear bus adresses. From Linux point of view, non contiguous divide means FLATMEM no longer works and DISCONTIGMEM is needed to track the pfns in the 2 regions. This scheme would also work for PAE40, only better in that we don't waste struct page memory for the peripheral hole. The DT description will be something like memory { ... reg = <0x80000000 0x200000000 /* 512MB: lowmem */ 0x00000000 0x10000000>; /* 256MB: highmem */ } Signed-off-by: NNoam Camus <noamc@ezchip.com> Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
由 Vineet Gupta 提交于
So a benign looking cleanup which macro'ized PAGE_SHIFT shifts turned out to be bad (since it was done non-sensically across the board). It caused boot failures with PAE40 as forced cast to (unsigned long) from newly introduced virt_to_pfn() was causing truncatiion of the (long long) pte/paddr values. It is OK to use this in accessors dealing with kernel virtual address, pointers etc, but not for PTE values themelves. Fixes: cJ2ff5cf2735c ("ARC: mm: Use virt_to_pfn() for addr >> PAGE_SHIFT pattern) Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
由 Vineet Gupta 提交于
While reviewing a different change to asm-generic/io.h Arnd spotted that ARC ioread32 and ioread32be both of which come from asm-generic versions are not symmetrical in terms of calling the io barriers. generic ioread32 -> ARC readl() [ has barriers] generic ioread32be -> __be32_to_cpu(__raw_readl()) [ lacks barriers] While generic ioread32be is being remediated to call readl(), that involves a swab32(), causing double swaps on ioread32be() on Big Endian systems. So provide our versions of big endian IO accessors to ensure io barrier calls while also keeping them optimal Suggested-by: NArnd Bergmann <arnd@arndb.de> Acked-by: NArnd Bergmann <arnd@arndb.de> Cc: stable@vger.kernel.org [4.2+] Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
- 22 4月, 2016 1 次提交
-
-
由 Evgeny Voevodin 提交于
- The asm helpers for calling into irq tracer were missing - Add calls to above helpers in low level assembly entry code for ARCv2 - irq_save() uses CLRI to disable interrupts and returns the prev interrupt state (in STATUS32) in a specific encoding (and not the raw value of STATUS32). This is usable with SETI in irq_restore(). However save_flags() reads the raw value of STATUS32 which doesn't pair with irq_save/restore() and thus needs fixing. Signed-off-by: NEvgeny Voevodin <evgeny.voevodin@intel.com> [vgupta: updated changelog and also added some comments] Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
- 07 4月, 2016 1 次提交
-
-
由 Alexey Brodkin 提交于
During mmaping of frame-buffer pages to user-space fb_protect() is called to set proper page settings. In case of ARC we need to mark pages that are mmaped to user as uncached because of 2 reasons: * Huge amount of data if passing through data cache will thrash cache a lot making cache almost useless for other less traffic hungry processes. * Data written by user in FB will be immediately available for hardware (such as PGU etc) without requirements to flush data cache regularly. Signed-off-by: NAlexey Brodkin <abrodkin@synopsys.com> Cc: linux-snps-arc@lists.infradead.org Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
- 19 3月, 2016 3 次提交
-
-
由 Vineet Gupta 提交于
The peripheral address space is architectural address window which is uncached and typically used to wire up peripherals. For ARC700 cores (ARCompact ISA based) this was fixed to 1GB region 0xC000_0000 - 0xFFFF_FFFF. For ARCv2 based HS38 cores the start address is flexible and can be 0xC, 0xD, 0xE, 0xF 000_000 by programming AUX_NON_VOLATILE_LIMIT reg (typically done in bootloader) Further in cas of PAE, the physical address can extend beyond 4GB so need to confine this check, otherwise all pages beyond 4GB will be treated as uncached Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
由 Vineet Gupta 提交于
Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
由 Vineet Gupta 提交于
To support dma in physical memory beyond 4GB with PAE40 Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
- 18 3月, 2016 1 次提交
-
-
由 Vineet Gupta 提交于
With THP refcounting work, no need to mark PMDs splitting. (ARC got missed under the sweeping arch change as THP support was likely not present in orig baseline) Signed-off-by: NVineet Gupta <vgupta@synopsys.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 17 3月, 2016 1 次提交
-
-
由 Vineet Gupta 提交于
linux-next for 4.6-rc1 timeline reported ARC build failures !THP | arch/arc/include/asm/tlbflush.h:29:0: warning: "flush_pmd_tlb_range" redefined [enabled by default] | arch/arc/include/asm/tlbflush.h:29:0: warning: "flush_pmd_tlb_range" redefined [enabled by default] | arch/arc/include/asm/tlbflush.h:29:0: warning: "flush_pmd_tlb_range" redefined [enabled by default] Turns out that commit ("mm/thp/migration: switch from flush_tlb_range to flush_pmd_tlb_range") triggered the issue while the problem was in ARC code where THP specific helpers were not guarded with #ifdef. Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
- 14 3月, 2016 1 次提交
-
-
由 Alexander Duyck 提交于
This patch updates all instances of csum_tcpudp_magic and csum_tcpudp_nofold to reflect the types that are usually used as the source inputs. For example the protocol field is populated based on nexthdr which is actually an unsigned 8 bit value. The length is usually populated based on skb->len which is an unsigned integer. This addresses an issue in which the IPv6 function csum_ipv6_magic was generating a checksum using the full 32b of skb->len while csum_tcpudp_magic was only using the lower 16 bits. As a result we could run into issues when attempting to adjust the checksum as there was no protocol agnostic way to update it. With this change the value is still truncated as many architectures use "(len + proto) << 8", however this truncation only occurs for values greater than 16776960 in length and as such is unlikely to occur as we stop the inner headers at ~64K in size. I did have to make a few minor changes in the arm, mn10300, nios2, and score versions of the function in order to support these changes as they were either using things such as an OR to combine the protocol and length, or were using ntohs to convert the length which would have truncated the value. I also updated a few spots in terms of whitespace and type differences for the addresses. Most of this was just to make sure all of the definitions were in sync going forward. Signed-off-by: NAlexander Duyck <aduyck@mirantis.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 12 3月, 2016 3 次提交
-
-
由 Vineet Gupta 提交于
Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
由 Vineet Gupta 提交于
ARC architecture has 2 instruction sets: ARCompact/ARCv2. While same gcc supports compiling for either (using appropriate toggles), we can't use the same toolchain to build kernel because libgcc needs to be unique and the toolchian (uClibc based) is not multilibed. uClibc toolchain is convenient since it allows all userspace and kernel to be built with a single install for an ISA. This however means 2 gnu installs (with same triplet prefix) are needed for building for 2 ISA and need to be in PATH. As developers we keep switching the builds, but would occassionally fail to update the PATH leading to usage of wrong tools. And this would only show up at the end of kernel build when linking incompatible libgcc. So the initial solution was to have gcc define a special preprocessor macro DEFAULT_CPU_xxx which is unique for default toolchain configuration. Claudiu proposed using grep for an existing preprocessor macro which is again uniquely defined per ISA. Cc: Michal Marek <mmarek@suse.cz> Suggested-by: NClaudiu Zissulescu <claziss@synopsys.com> Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
由 Lada Trimasova 提交于
read{l,w}() write{l,w}() primitives should use le{16,32}_to_cpu() and cpu_to_le{16,32}() respectively to ensure device registers are read correctly in Big Endian CPU configuration. Per Arnd Bergmann | Most drivers using readl() or readl_relaxed() expect those to perform byte | swaps on big-endian architectures, as the registers tend to be fixed endian This was needed for getting UART to work correctly on a Big Endian ARC. The ARC accessors originally were fine, and the bug got introduced inadventently by commit b8a03302 ("ARCv2: barriers") Fixes: b8a03302 ("ARCv2: barriers") Link: http://lkml.kernel.org/r/201603100845.30602.arnd@arndb.de Cc: Alexey Brodkin <abrodkin@synopsys.com> Cc: stable@vger.kernel.org [4.2+] Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: NLada Trimasova <ltrimas@synopsys.com> [vgupta: beefed up changelog, added Fixes/stable tags] Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
- 11 3月, 2016 3 次提交
-
-
由 Vineet Gupta 提交于
commit 80f42084 removed the ARC bitops microoptimization but failed to prune the comments to same effect Fixes: 80f42084 ("ARC: Make ARC bitops "safer" (add anti-optimization)") Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
由 Adam Buchbinder 提交于
Signed-off-by: NAdam Buchbinder <adam.buchbinder@gmail.com> Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
由 Joao Pinto 提交于
Add PCI support to ARC and update drivers/pci Makefile enabling the ARC arch to use the generic PCI setup functions. [bhelgaas: fold in Joao's pci-dma-compat.h & pci-bridge.h build fix (I should have caught this myself, sorry] Signed-off-by: NJoao Pinto <jpinto@synopsys.com> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Acked-by: NVineet Gupta <vgupta@synopsys.com>
-
- 26 2月, 2016 1 次提交
-
-
由 Chen Gang 提交于
They are not symmetric with each other, neither are used in real world (can not be found by grep command in source code root directory), so remove them. Signed-off-by: NChen Gang <gang.chen.5i5j@gmail.com> Acked-by: NRussell King <rmk+kernel@arm.linux.org.uk> Acked-by: NGreg Ungerer <gerg@uclinux.org> Signed-off-by: NArnd Bergmann <arnd@arndb.de>
-
- 24 2月, 2016 3 次提交
-
-
由 Vineet Gupta 提交于
Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
由 Vineet Gupta 提交于
There is no real ARC700 based SMP SoC so remove IPI definition. EZChip's SMP ARC700 is going to use a different intc and IPI provider anyways. Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
由 Vineet Gupta 提交于
ARConnect/MCIP Inter-Core-Interrupt module can't send interrupt to local core. So use core intc capability to trigger software interrupt to self, using an unsued IRQ #21. This showed up as csd deadlock with LTP trace_sched on a dual core system. This test acts as scheduler fuzzer, triggering all sorts of schedulting activity. Trouble starts with IPI to self, which doesn't get delivered (effectively lost due to H/w capability), but the msg intended to be sent remain enqueued in per-cpu @ipi_data. All subsequent IPIs to this core from other cores get elided due to the IPI coalescing optimization in ipi_send_msg_one() where a pending msg implies an IPI already sent and assumes other core is yet to ack it. After the elided IPI, other core simply goes into csd_lock_wait() but never comes out as this core never sees the interrupt. Fixes STAR 9001008624 Cc: Peter Zijlstra <peterz@infradead.org> Cc: <stable@vger.kernel.org> [4.2] Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
- 18 2月, 2016 1 次提交
-
-
由 Vineet Gupta 提交于
- ARCv2 uses a seperate BCR for {I,D}CCM base address: ARCompact encoded both base/size in same BCR - Size encoding in common BCR is different for ARCompact/ARCv2 Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
- 12 2月, 2016 1 次提交
-
-
由 Vineet Gupta 提交于
MMUv4 supports 2 concurrent page sizes: Normal and Super [4K to 16M] So far Linux supported a single super page size for a given Normal page, depending on the software page walking address split. e.g. we had 11:8:13 address split for 8K page, which meant super page was 2 ^(8+13) = 2M (given that THP size has to be PMD_SHIFT) Now we turn this around, by allowing multiple Super Pages in Kconfig (currently 2M and 16M only) and forcing page walker address split to PGDIR_SHIFT and PAGE_SHIFT For configs without Super page, things are same as before and PGDIR_SHIFT can be hacked to get non default address split The motivation for this change is a customer who needs 16M super page and a 8K Normal page combo. Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
- 10 2月, 2016 1 次提交
-
-
由 Vineet Gupta 提交于
ARC HS Cores support configurable multiple interrupt priorities of upto 16 levels. There is processor "interrupt preemption threshhold" in STATUS32.E[4:1] And several places need to set this up: 1. seed value as kernel is booting 2. seed value for user space programs 3. Arg to SLEEP instruction in idle task (what interrupt prio can wake) 4. Per-IRQ line prioirty (i.e. what is the priority of interrupt raised by a peripheral or timer or perf counter... Currently above sites use the highest priority 0. This can be potential problem when multiple priorities are supported. e.g. user space could only be interrupted by P0 interrupt, not others... So turn this over and instead make default interruption level to be the lowest priority possible 15. This should be fine even if there are fewer priority levels configured (say two: P0 HIGH, P1 LOW) This feature also effectively disables FIRQ feature if present in hardware config. With old code, a P0 interrupt would be FIRQ, needing special handling (ISR or Register Banks) which is NOT supported yet. Now it not be P0 (P15 or whatever is lowest prio) so FIRQ is not triggered. Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
- 29 1月, 2016 1 次提交
-
-
由 Vineet Gupta 提交于
Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-