Commits · 4ba66a9760722ccbb691b8f7116cad2f791cca7b · gsplhtlxg / clone-Linux

04 Nov, 2017 1 commit

arch/tile: Implement ->set_state_oneshot_stopped() · 777a45b4

Authored by Chris Metcalf on Nov 03, 2017

set_state_oneshot_stopped() is called by the clkevt core, when the
next event is required at an expiry time of 'KTIME_MAX'. This normally
happens with NO_HZ_{IDLE|FULL} in both LOWRES/HIGHRES modes.

This patch makes the clockevent device stop on such an event, to
avoid spurious interrupts, as explained by commit 8fff52fd
("clockevents: Introduce CLOCK_EVT_STATE_ONESHOT_STOPPED state").

Signed-off-by: Chris Metcalf <cmetcalf@mellanox.com>
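The behaviour described in this commit message can be pictured with a small userspace model. This is only a sketch: `struct mock_clockevent`, the `mock_*` functions, `MOCK_KTIME_MAX` and the `irq_masked` flag are hypothetical stand-ins invented for illustration, not the kernel's or the tile driver's actual definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the kernel's KTIME_MAX, meaning "no event pending". */
#define MOCK_KTIME_MAX INT64_MAX

/* Hypothetical, heavily reduced model of a clockevent device. */
struct mock_clockevent {
	bool irq_masked;      /* true while the timer interrupt is masked */
	int64_t next_expiry;  /* last programmed expiry time */
	int (*set_next_event)(struct mock_clockevent *evt, int64_t expires);
	int (*set_state_oneshot_stopped)(struct mock_clockevent *evt);
};

/* Program the next event and unmask the (simulated) interrupt. */
static int mock_set_next_event(struct mock_clockevent *evt, int64_t expires)
{
	evt->next_expiry = expires;
	evt->irq_masked = false;
	return 0;
}

/* Stop the device: nothing is pending, so no interrupt should fire. */
static int mock_oneshot_stopped(struct mock_clockevent *evt)
{
	evt->irq_masked = true;
	return 0;
}

/* Model of the core's choice: an expiry of "max" stops the device. */
static int mock_program_event(struct mock_clockevent *evt, int64_t expires)
{
	assert(evt != NULL);
	if (expires == MOCK_KTIME_MAX && evt->set_state_oneshot_stopped)
		return evt->set_state_oneshot_stopped(evt);
	return evt->set_next_event(evt, expires);
}

/* Build a device with both callbacks wired up. */
static struct mock_clockevent mock_make_device(void)
{
	struct mock_clockevent evt = {
		.irq_masked = true,
		.next_expiry = 0,
		.set_next_event = mock_set_next_event,
		.set_state_oneshot_stopped = mock_oneshot_stopped,
	};
	return evt;
}
```

In this model a device without the callback would simply be programmed with the far-future expiry, which is roughly the situation the patch is meant to avoid.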

777a45b4

15 Apr, 2017 1 commit

tile/time: Set ->min_delta_ticks and ->max_delta_ticks · 45b586ef

Authored by Nicolai Stange on Mar 30, 2017

In preparation for making the clockevents core NTP correction aware,
all clockevent device drivers must set ->min_delta_ticks and
->max_delta_ticks rather than ->min_delta_ns and ->max_delta_ns: a
clockevent device's rate is going to change dynamically and thus, the
ratio of ns to ticks ceases to stay invariant.

Currently, the tile's timer clockevent device is initialized as follows:

  evt->max_delta_ns = clockevent_delta2ns(MAX_TICK, evt);

and

  .min_delta_ns = 1000,

The first one translates to a ->max_delta_ticks value of MAX_TICK.
For the latter, note that the clockevent core will superimpose a
minimum of 1us by itself -- setting ->min_delta_ticks to 1 is safe here.

Initialize ->min_delta_ticks and ->max_delta_ticks with these values.
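To see why a limit expressed in ticks is more robust than one expressed in nanoseconds, consider the following userspace sketch. The helper names `evt_mult()` and `ticks_to_ns()`, the shift of 32 and the two example frequencies are assumptions chosen for illustration; the kernel's own conversion helpers differ in detail.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Scaled ticks-per-nanosecond factor: mult = hz * 2^shift / 1e9. */
static uint64_t evt_mult(uint64_t hz, unsigned int shift)
{
	assert(shift < 64 && hz <= (UINT64_MAX >> shift));
	return (hz << shift) / NSEC_PER_SEC;
}

/* Convert a tick delta to nanoseconds for a given mult/shift pair. */
static uint64_t ticks_to_ns(uint64_t ticks, uint64_t mult, unsigned int shift)
{
	assert(mult != 0 && shift < 64 && ticks <= (UINT64_MAX >> shift));
	return (ticks << shift) / mult;
}
```

With these helpers, `ticks_to_ns(1000000, evt_mult(1000000000ULL, 32), 32)` yields 1000000 ns, while the same tick count at a slightly faster rate of 1001000000 Hz yields fewer nanoseconds. The tick limit stays valid when the rate is adjusted; a nanosecond limit computed once does not.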

This patch alone doesn't introduce any change in functionality as the
clockevents core still looks exclusively at the (untouched) ->min_delta_ns
and ->max_delta_ns. As soon as this has changed, a followup patch will
purge the initialization of ->min_delta_ns and ->max_delta_ns from this
driver.

Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Stephen Boyd <sboyd@codeaurora.org>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Signed-off-by: Nicolai Stange <nicstange@gmail.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>

45b586ef

02 Mar, 2017 1 commit

sched/headers: Prepare for new header dependencies before moving code to <linux/sched/clock.h> · e6017571

Authored by Ingo Molnar on Feb 01, 2017

We are going to split <linux/sched/clock.h> out of <linux/sched.h>, which
will have to be picked up from other headers and .c files.

Create a trivial placeholder <linux/sched/clock.h> file that just
maps to <linux/sched.h> to make this patch obviously correct and
bisectable.

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>

e6017571

17 Dec, 2016 1 commit

tile: use __ro_after_init instead of tile-specific __write_once · 14e73e78

Authored by Chris Metcalf on Nov 07, 2016

The semantics of the old tile __write_once are the same as the
newer generic __ro_after_init, so rename them all and get rid
of the tile-specific version.

This does not enable actual support for __ro_after_init,
which had been dropped from the tile architecture before the
initial upstreaming was done, since we had at that time switched
to using 16MB huge pages to map the kernel.

Signed-off-by: Chris Metcalf <cmetcalf@mellanox.com>

14e73e78

24 Nov, 2016 1 commit

tile: avoid using clocksource_cyc2ns with absolute cycle count · e658a6f1

Authored by Chris Metcalf on Nov 16, 2016

For large values of "mult" and long uptimes, the intermediate
result of "cycles * mult" can overflow 64 bits.  For example,
the tile platform calls clocksource_cyc2ns with a 1.2 GHz clock;
we have mult = 853, and after 208.5 days, we overflow 64 bits.

Since clocksource_cyc2ns() is intended to be used for relative
cycle counts, not absolute cycle counts, performance is more
important than accepting a wider range of cycle values.  So,
just use mult_frac() directly in tile's sched_clock().
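The overflow and its fix can be reproduced in userspace. The sketch below is not the tile code: `cyc2ns_naive()` and `cyc2ns_split()` are illustrative names, and the shift of 10 used in the usage note is an assumption that is consistent with mult = 853 at 1.2 GHz. The split version follows the quotient/remainder idea behind mult_frac() with a power-of-two denominator.

```c
#include <assert.h>
#include <stdint.h>

/* Naive conversion: the product cycles * mult can wrap around 2^64. */
static uint64_t cyc2ns_naive(uint64_t cycles, uint32_t mult, unsigned int shift)
{
	assert(shift < 64);
	return (cycles * mult) >> shift;
}

/*
 * Split conversion: scale the quotient and the remainder of
 * cycles / 2^shift separately, so no large intermediate product
 * is formed. The result is exact as long as it fits in 64 bits.
 */
static uint64_t cyc2ns_split(uint64_t cycles, uint32_t mult, unsigned int shift)
{
	assert(shift < 32); /* keeps rem * mult below 2^64 */
	uint64_t quot = cycles >> shift;
	uint64_t rem = cycles & ((1ULL << shift) - 1);
	return quot * mult + ((rem * mult) >> shift);
}
```

For one second of cycles at 1.2 GHz both functions agree: `cyc2ns_split(1200000000ULL, 853, 10)` gives 999609375 ns. Once the cycle count passes `UINT64_MAX / 853`, the naive version wraps while the split version keeps counting.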

Commit 4cecf6d4 ("sched, x86: Avoid unnecessary overflow
in sched_clock") by Salman Qazi results in essentially the same
generated code for x86 as this change does for tile.  In fact,
a follow-on change by Salman introduced mult_frac() and switched
to using it, so the C code was largely identical at that point too.

Peter Zijlstra then added mul_u64_u32_shr() and switched x86
to use it.  This is, in principle, better; by optimizing the
64x64->64 multiplies to be 32x32->64 multiplies we can potentially
save some time.  However, the compiler pipelines the 64x64->64
multiplies pretty well, and the conditional branch in the generic
mul_u64_u32_shr() causes some bubbles in execution, with the
result that it's pretty much a wash.  If tilegx provided its own
implementation of mul_u64_u32_shr() without the conditional branch,
we could potentially save 3 cycles, but that seems like small gain
for a fair amount of additional build scaffolding; no other platform
currently provides a mul_u64_u32_shr() override, and tile doesn't
currently have an <asm/div64.h> header to put the override in.

Additionally, gcc currently has an optimization bug that prevents
it from recognizing the opportunity to use a 32x32->64 multiply,
and so the result would be no better than the existing mult_frac()
until such time as the compiler is fixed.

For now, just using mult_frac() seems like the right answer.

Cc: stable@kernel.org [v3.4+]
Signed-off-by: Chris Metcalf <cmetcalf@mellanox.com>

e658a6f1

31 Jul, 2015 1 commit

tile/time: Migrate to new 'set-state' interface · 38715df2

Authored by Viresh Kumar on Jul 16, 2015

Migrate tile driver to the new 'set-state' interface provided by
clockevents core, the earlier 'set-mode' interface is marked obsolete
now.

This also enables us to implement callbacks for new states of clockevent
devices, for example: ONESHOT_STOPPED.

Cc: Chris Metcalf <cmetcalf@ezchip.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>

38715df2

27 Mar, 2015 1 commit

time: Rename timekeeper::tkr to timekeeper::tkr_mono · 876e7881

Authored by Peter Zijlstra on Mar 19, 2015

In preparation of adding another tkr field, rename this one to
tkr_mono. Also rename tk_read_base::base_mono to tk_read_base::base,
since the structure is not specific to CLOCK_MONOTONIC and the mono
name got added to the tk_read_base instance.

Lots of trivial churn.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: John Stultz <john.stultz@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20150319093400.344679419@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>

876e7881

12 Nov, 2014 1 commit

tile: Use the more common pr_warn instead of pr_warning · f4743673

Authored by Joe Perches on Oct 31, 2014

And other message logging neatening.

Other miscellanea:

o coalesce formats
o realign arguments
o standardize a couple of macros
o use __func__ instead of embedding the function name

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>

f4743673

03 Oct, 2014 1 commit

tile: add clock_gettime support to vDSO · 78410af5

Authored by Chris Metcalf on Oct 02, 2014

This change adds support for clock_gettime with CLOCK_REALTIME
and CLOCK_MONOTONIC using vDSO.  It also updates the vdso
struct nomenclature used for the clocks to match the x86 code
to keep it easier to update going forward.

We also support the *_COARSE clockid_t, for apps that want speed
but aren't concerned about fine-grained timestamps; this saves
about 20 cycles per call (see http://lwn.net/Articles/342018/).

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Acked-by: John Stultz <john.stultz@linaro.org>
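From an application's point of view, the clock ids named in this commit are reached through the ordinary clock_gettime() call; whether a given call is served by the vDSO is up to the C library and kernel. The sketch below assumes a Linux system where `CLOCK_MONOTONIC_COARSE` is available; `read_clock_ns()` and `coarse_lag_ns()` are illustrative helper names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Read a clock and return it in nanoseconds; asserts on failure. */
static int64_t read_clock_ns(clockid_t id)
{
	struct timespec ts;
	int rc = clock_gettime(id, &ts);
	assert(rc == 0);
	(void)rc;
	return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Linux-specific: difference between the fine and the coarse clock. */
static int64_t coarse_lag_ns(void)
{
	int64_t coarse = read_clock_ns(CLOCK_MONOTONIC_COARSE);
	int64_t fine = read_clock_ns(CLOCK_MONOTONIC);
	return fine - coarse;
}
```

The coarse clock only advances when the kernel updates its timekeeping state, so the lag reported by `coarse_lag_ns()` is typically on the order of a timer tick. That loss of resolution is the price of the cheaper call.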

78410af5

02 Oct, 2014 1 commit

tile: switch to using seqlocks for the vDSO time code · 94fb1afb

Authored by Chris Metcalf on Oct 02, 2014

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>

94fb1afb

27 Aug, 2014 1 commit

tile: Replace __get_cpu_var uses · b4f50191

Authored by Christoph Lameter on Aug 17, 2014

__get_cpu_var() is used for multiple purposes in the kernel source. One of
them is address calculation via the form &__get_cpu_var(x).  This calculates
the address for the instance of the percpu variable of the current processor
based on an offset.

Other use cases are for storing and retrieving data from the current
processor's percpu area.  __get_cpu_var() can be used as an lvalue when
writing data or on the right side of an assignment.

__get_cpu_var() is defined as :

#define __get_cpu_var(var) (*this_cpu_ptr(&(var)))

__get_cpu_var() always only does an address determination. However, store
and retrieve operations could use a segment prefix (or global register on
other platforms) to avoid the address calculation.

this_cpu_write() and this_cpu_read() can directly take an offset into a
percpu area and use optimized assembly code to read and write per cpu
variables.

This patch converts __get_cpu_var into either an explicit address
calculation using this_cpu_ptr() or into a use of this_cpu operations that
use the offset.  Thereby address calculations are avoided and fewer registers
are used when code is generated.

At the end of the patch set all uses of __get_cpu_var have been removed so
the macro is removed too.

The patch set includes passes over all arches as well. Once these operations
are used throughout then specialized macros can be defined in non-x86
arches as well in order to optimize per cpu access by, e.g., using a global
register that may be set to the per cpu base.

Transformations done to __get_cpu_var()

1. Determine the address of the percpu instance of the current processor.

	DEFINE_PER_CPU(int, y);
	int *x = &__get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(&y);

2. Same as #1 but this time an array structure is involved.

	DEFINE_PER_CPU(int, y[20]);
	int *x = __get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(y);

3. Retrieve the content of the current processor's instance of a per cpu
variable.

	DEFINE_PER_CPU(int, y);
	int x = __get_cpu_var(y);

   Converts to

	int x = __this_cpu_read(y);

4. Retrieve the content of a percpu struct

	DEFINE_PER_CPU(struct mystruct, y);
	struct mystruct x = __get_cpu_var(y);

   Converts to

	memcpy(&x, this_cpu_ptr(&y), sizeof(x));

5. Assignment to a per cpu variable

	DEFINE_PER_CPU(int, y);
	__get_cpu_var(y) = x;

   Converts to

	__this_cpu_write(y, x);

6. Increment/Decrement etc of a per cpu variable

	DEFINE_PER_CPU(int, y);
	__get_cpu_var(y)++;

   Converts to

	__this_cpu_inc(y);

Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

b4f50191

24 Jul, 2014 3 commits

timekeeping: Create struct tk_read_base and use it in struct timekeeper · d28ede83

Authored by Thomas Gleixner on Jul 16, 2014

The members of the new struct are the required ones for the new NMI
safe accessor to clock monotonic. In order to reuse the existing
timekeeping code and to make the update of the fast NMI safe
timekeepers a simple memcpy use the struct for the timekeeper as well
and convert all users.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>

d28ede83

clocksource: Get rid of cycle_last · 4a0e6377

Authored by Thomas Gleixner on Jul 16, 2014

cycle_last was added to the clocksource to support the TSC
validation. We moved that to the core code, so we can get rid of the
extra copy.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>

4a0e6377

tile: Convert VDSO timekeeping to the precise mechanism · dc01c9fa

Authored by Thomas Gleixner on Jul 16, 2014

The conversion to the new VDSO update mechanism was only half-arsed:
the code still uses the inaccurate base value, which lacks the
fractional part of xtime_nsec. Fix it up.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>

dc01c9fa

07 Mar, 2014 1 commit

tile: avoid overflow in ns2cycles · 767f3021

Authored by Henrik Austad on Mar 04, 2014

In commit 4cecf6d4 ("sched, x86: Avoid unnecessary overflow in
sched_clock") and in recent patch "clocksource: avoid unnecessary
overflow in cyclecounter_cyc2ns()" https://lkml.org/lkml/2014/3/4/17,
the mult-shift approach is replaced by 2 steps to avoid storing a large,
intermediate value that could overflow.

arch/tile/kernel/time.c has a similar pattern in cycles2ns, and this
copies the same pattern in this function.

CC: John Stultz <johnstul@us.ibm.com>
CC: Mike Galbraith <bitbucket@online.de>
CC: Salman Qazi <sqazi@google.com>
Signed-off-by: Henrik Austad <henrik@austad.us>
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>

767f3021

14 Aug, 2013 1 commit

tile: implement gettimeofday() via vDSO · 4a556f4f

Authored by Chris Metcalf on Aug 07, 2013

This change creates the framework for vDSO calls, makes the existing
rt_sigreturn() mechanism use it, and adds a fast gettimeofday().
Now that we need to expose the vDSO address to userspace, we add
AT_SYSINFO_EHDR to the set of aux entries provided to userspace.
(You can disable any extra vDSO support by booting with vdso=0,
but the rt_sigreturn vDSO page will still be provided.)
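On the userspace side, the AT_SYSINFO_EHDR aux entry mentioned above can be inspected with glibc's getauxval(), and gettimeofday() is called as usual; whether the C library actually routes the call through the vDSO is an implementation detail. The sketch below assumes Linux with glibc 2.16 or later; `vdso_base()` and `now_usec()` are illustrative helper names.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <sys/auxv.h>
#include <sys/time.h>

/* Return the vDSO ELF header address from the aux vector, or 0 if absent. */
static unsigned long vdso_base(void)
{
	return getauxval(AT_SYSINFO_EHDR);
}

/* Wall-clock time in microseconds; the libc may serve this via the vDSO. */
static long long now_usec(void)
{
	struct timeval tv;
	int rc = gettimeofday(&tv, NULL);
	assert(rc == 0);
	(void)rc;
	return (long long)tv.tv_sec * 1000000LL + tv.tv_usec;
}
```

A return value of 0 from `vdso_base()` means the aux vector carries no vDSO address; in that case the C library is expected to fall back to the regular system call.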

Note that glibc has supported the tile vDSO since release 2.17.

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>

4a556f4f

15 Jul, 2013 1 commit

tile: delete __cpuinit usage from all tile files · 18f894c1

Authored by Paul Gortmaker on Jun 18, 2013

The __cpuinit type of throwaway sections might have made sense
some time ago when RAM was more constrained, but now the savings
do not offset the cost and complications.  For example, the fix in
commit 5e427ec2 ("x86: Fix bit corruption at CPU resume time")
is a good example of the nasty type of bugs that can be created
with improper use of the various __init prefixes.

After a discussion on LKML[1] it was decided that cpuinit should go
the way of devinit and be phased out.  Once all the users are gone,
we can then finally remove the macros themselves from linux/init.h.

Note that some harmless section mismatch warnings may result, since
notify_cpu_starting() and cpu_up() are arch independent (kernel/cpu.c)
and are flagged as __cpuinit -- so if we remove the __cpuinit from
arch specific callers, we will also get section mismatch warnings.
As an intermediate step, we intend to turn the linux/init.h cpuinit
content into no-ops as early as possible, since that will get rid
of these warnings.  In any case, they are temporary and harmless.

This removes all the arch/tile uses of the __cpuinit macros from
all C files.  Currently tile does not have any __CPUINIT used in
assembly files.

[1] https://lkml.org/lkml/2013/5/20/589

Cc: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

18f894c1

27 Mar, 2013 1 commit

tile: ns2cycles should use __raw_get_cpu_var · 39e8202b

Authored by Henrik Austad on Mar 26, 2013

ns2cycles uses per_cpu variables, and will, eventually, find its way into
smp_processor_id(). This is not safe in a preemptible kernel;
preemption should ideally be disabled.

BUG: using smp_processor_id() in preemptible [00000000] code:
systemd-modules/367
caller is ns2cycles+0x40/0xb8

Starting stack dump of tid 367, pid 367 (systemd-modules) on cpu 2 at
cycle 20969956421
 frame 0: 0xfffffff70004b860 dump_stack+0x0/0x20 (sp 0xfffffe407993fa90)
 frame 1: 0xfffffff7006abc28 debug_smp_processor_id+0x1a8/0x1e0 (sp
0xfffffe407993fa90)
 frame 2: 0xfffffff7004d7b40 ns2cycles+0x40/0xb8 (sp 0xfffffe407993fab8)
 frame 3: 0xfffffff7004dc578 __ndelay+0x38/0x80 (sp 0xfffffe407993fae0)

However, in this case:

- the frequency is the same across all cores
- we use the data read-only
- we do not scale the frequency

Which means that we can use __raw_get_cpu_var instead.

Signed-off-by: Henrik Austad <haustad@cisco.com>
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>

39e8202b

04 Jun, 2011 1 commit

clocksource: tile: convert to use clocksource_register_hz · 9f14517b

Authored by John Stultz on Jun 01, 2011

Convert tile to use clocksource_register_hz.

CC: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>

9f14517b

05 May, 2011 1 commit

arch/tile: various header improvements for building drivers · 28d71741

Authored by Chris Metcalf on May 02, 2011

This change adds a number of missing headers in asm (fb.h, parport.h,
serial.h, and vga.h) using the minimal generic versions.

It also adds a number of missing interfaces that showed up as build
failures when trying to build various drivers not normally included in the
"tile" distribution: ioremap_wc(), memset_io(), io{read,write}{16,32}be(),
virt_to_bus(), bus_to_virt(), irq_canonicalize(), __pte(), __pgd(),
and __pmd().  I also added a cast in virt_to_page() since not all callers
pass a pointer.

I fixed <asm/stat.h> to properly include a __KERNEL__ guard for the
__ARCH_WANT_STAT64 symbol, and <asm/swab.h> to use __builtin_bswap32()
even for our 64-bit architecture, since the same code is produced.

I added an export for get_cycles(), since it's used in some modules.

And I made <arch/spr_def.h> properly include the __KERNEL__ guard,
even though it's not yet exported, since it likely will be soon.

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>

28d71741

02 Mar, 2011 1 commit

arch/tile: fix __ndelay etc to work better · 13371731

Authored by Chris Metcalf on Feb 28, 2011

The current implementations of __ndelay and __udelay call a hypervisor
service to delay, but the hypervisor service isn't actually implemented
very well, and the consensus is that Linux should handle figuring this
out natively and not use a hypervisor service.

By converting nanoseconds to cycles, and then spinning until the
cycle counter reaches the desired cycle, we get several benefits:
first, we are sensitive to the actual clock speed; second, we use
less power by issuing a slow SPR read once every six cycles while
we delay; and third, we properly handle the case of an interrupt by
exiting at the target time rather than after some number of cycles.

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
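The delay scheme described in this commit (convert nanoseconds to cycles, then spin until the counter reaches an absolute target) can be sketched in userspace as follows. All names are hypothetical: `mock_get_cycles()` treats CLOCK_MONOTONIC as a 1 GHz cycle counter in place of the hardware counter, and the slow SPR read used to save power is not modelled.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical cycle counter: CLOCK_MONOTONIC read as a 1 GHz counter. */
static uint64_t mock_get_cycles(void)
{
	struct timespec ts;
	int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
	assert(rc == 0);
	(void)rc;
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/* Convert nanoseconds to cycles at the given frequency, rounding up. */
static uint64_t mock_ns2cycles(uint64_t nsecs, uint64_t hz)
{
	/* Split off whole seconds to keep the intermediate product small. */
	uint64_t secs = nsecs / 1000000000ULL;
	uint64_t rem = nsecs % 1000000000ULL;
	return secs * hz + (rem * hz + 999999999ULL) / 1000000000ULL;
}

/* Spin until the counter reaches an absolute target value. */
static void mock_ndelay(uint64_t nsecs, uint64_t hz)
{
	uint64_t target = mock_get_cycles() + mock_ns2cycles(nsecs, hz);
	while (mock_get_cycles() < target)
		; /* busy-wait; time spent in an interrupt still counts */
}
```

Because the loop compares against an absolute target rather than counting iterations, time consumed by an interruption is not added on top of the requested delay, which matches the third benefit listed above.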

13371731

02 Nov, 2010 1 commit

arch/tile: bomb raw_local_irq_ to arch_local_irq_ · 5d966115

Authored by Chris Metcalf on Nov 01, 2010

This completes the tile migration to the new naming scheme for
the architecture-specific irq management code.

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>

5d966115

13 Aug, 2010 1 commit

arch/tile: Use separate, better minsec values for clocksource and sched_clock. · 749dc6f2

Authored by Chris Metcalf on Aug 13, 2010

We were using the same 5-sec minsec for the clocksource and sched_clock
that we were using for the clock_event_device. For the clock_event_device
that's exactly right since it has a short maximum countdown time.
But for sched_clock we want to avoid wraparound when converting from
ticks to nsec over a much longer window, so we force a shift of 10.
And for clocksource it seems dodgy to use a 5-sec minsec as well, so we
copy some other platforms and force a shift of 22.

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
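The trade-off between the two shift values can be made concrete with a little arithmetic. The sketch below assumes the 1.2 GHz clock mentioned in commit e658a6f1 further up this page; `hz_to_mult()` and `wrap_seconds()` are illustrative helpers, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* mult such that ns ~= (cycles * mult) >> shift, rounded to nearest. */
static uint64_t hz_to_mult(uint64_t hz, unsigned int shift)
{
	assert(hz != 0 && shift < 34); /* NSEC_PER_SEC << shift fits in 64 bits */
	return ((NSEC_PER_SEC << shift) + hz / 2) / hz;
}

/* Seconds of cycle count before cycles * mult no longer fits in 64 bits. */
static uint64_t wrap_seconds(uint64_t hz, uint64_t mult)
{
	assert(hz != 0 && mult != 0);
	return (UINT64_MAX / mult) / hz;
}
```

At 1.2 GHz a shift of 10 gives mult = 853, and `wrap_seconds()` reports about 208 days before the 64-bit product wraps, in line with the figure quoted in e658a6f1. A shift of 22 gives a far larger mult and hence better precision, but the product wraps after little more than an hour of cycles, which is only acceptable when the conversion is applied to short relative intervals.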

749dc6f2

07 Jul, 2010 1 commit

arch/tile: Miscellaneous cleanup changes. · 0707ad30

Authored by Chris Metcalf on Jun 25, 2010

This commit is primarily changes caused by reviewing "sparse"
and "checkpatch" output on our sources, so is somewhat noisy, since
things like "printk() -> pr_err()" (or whatever) throughout the
codebase tend to get tedious to read.  Rather than trying to tease
apart precisely which things changed due to which type of code
review, this commit includes various cleanups in the code:

- sparse: Add declarations in headers for globals.
- sparse: Fix __user annotations.
- sparse: Using gfp_t consistently instead of int.
- sparse: removing functions not actually used.
- checkpatch: Clean up printk() warnings by using pr_info(), etc.;
  also avoid partial-line printks except in bootup code.
- checkpatch: Use exposed structs rather than typedefs.
- checkpatch: Change some C99 comments to C89 comments.

In addition, a couple of minor other changes are rolled in
to this commit:

- Add support for a "raise" instruction to cause SIGFPE, etc., to be raised.
- Remove some compat code that is unnecessary when we fully eliminate
  some of the deprecated syscalls from the generic syscall ABI.
- Update the tile_defconfig to reflect current config contents.

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>

0707ad30

05 Jun, 2010 1 commit

arch/tile: core support for Tilera 32-bit chips. · 867e359b

Authored by Chris Metcalf on May 28, 2010

This change is the core kernel support for TILEPro and TILE64 chips.
No driver support (except the console driver) is included yet.

This includes the relevant Linux headers in asm/; the low-level
"Tile architecture" headers in arch/, which are
shared with the hypervisor, etc., and are build-system agnostic;
and the relevant hypervisor headers in hv/.

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Reviewed-by: Paul Mundt <lethal@linux-sh.org>

867e359b