提交 · d65670a78cdbfae94f20a9e05ec705871d7cdf2b · openanolis / cloud-kernel

11 11月, 2011 1 次提交

clocksource: Avoid selecting mult values that might overflow when adjusted · d65670a7

由 John Stultz 提交于 10月 31, 2011

For some frequencies, the clocks_calc_mult_shift() function will
unfortunately select mult values very close to 0xffffffff.  This
has the potential to overflow when NTP adjusts the clock, adding
to the mult value.

This patch adds a clocksource.maxadj value, which provides
an approximation of an 11% adjustment(NTP limits adjustments to
500ppm and the tick adjustment is limited to 10%), which could
be made to the clocksource.mult value. This is then used to both
check that the current mult value won't overflow/underflow, as
well as warning us if the timekeeping_adjust() code pushes over
that 11% boundary.

v2: Fix max_adjustment calculation, and improve WARN_ONCE
messages.

v3: Don't warn before maxadj has actually been set

CC: Yong Zhang <yong.zhang0@gmail.com>
CC: David Daney <ddaney.cavm@gmail.com>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Chen Jie <chenj@lemote.com>
CC: zhangfx <zhangfx@lemote.com>
CC: stable@kernel.org
Reported-by: NChen Jie <chenj@lemote.com>
Reported-by: Nzhangfx <zhangfx@lemote.com>
Tested-by: NYong Zhang <yong.zhang0@gmail.com>
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>

d65670a7

12 10月, 2011 1 次提交

time, s390: Get rid of compile warning · e35f95b3

由 Heiko Carstens 提交于 9月 22, 2011

"s390: Use direct ktime path for s390 clockevent device" in linux-next
introduces this compile warning:

arch/s390/kernel/time.c: In function 's390_next_ktime':
arch/s390/kernel/time.c:118:2: warning:
comparison of distinct pointer types lacks a cast [enabled by default]

Just use a u64 instead of an s64 variable. This is not a problem since it
will always contain a positive value.
Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/1316675957-5538-1-git-send-email-heiko.carstens@de.ibm.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

e35f95b3

05 10月, 2011 2 次提交

dw_apb_timer: constify clocksource name · a1330228

由 Jamie Iles 提交于 7月 25, 2011

The clocksource name should be const for correctness.

Cc: John Stultz <johnstul@us.ibm.com>
Signed-off-by: NJamie Iles <jamie@jamieiles.com>
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>

a1330228

time: Cleanup old CONFIG_GENERIC_TIME references that snuck in · dcb69290

由 John Stultz 提交于 8月 16, 2011

Awhile back I removed all the CONFIG_GENERIC_TIME referecnes as
the last of the non-GENERIC_TIME arches were converted.

However, due to the functionality being important and around for
awhile, there apparently were some out of tree hardware enablement
patches that used it and have since been merged.

This patch removes the remaining instances of GENERIC_TIME.
Singed-off-by: NJohn Stultz <john.stultz@linaro.org>

dcb69290

21 9月, 2011 1 次提交

time: Change jiffies_to_clock_t() argument type to unsigned long · cbbc719f

由 hank 提交于 9月 20, 2011

The parameter's origin type is long. On an i386 architecture, it can
easily be larger than 0x80000000, causing this function to convert it
to a sign-extended u64 type.

Change the type to unsigned long so we get the correct result.
Signed-off-by: Nhank <pyu@redhat.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: <stable@kernel.org>
[ build fix ]
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

cbbc719f

14 9月, 2011 1 次提交

alarmtimers: Fix error handling · 4523f6ad

由 Thomas Gleixner 提交于 9月 14, 2011

commit 8bc0dafb (alarmtimers: Rework RTC device selection using class
interface) did not implement required error checks. Add them.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

4523f6ad

13 9月, 2011 1 次提交

clocksource: Make watchdog reset lockless · 9fb60336

由 Thomas Gleixner 提交于 9月 12, 2011

KGDB needs to trylock watchdog_lock when trying to reset the
clocksource watchdog after the system has been stopped to avoid a
potential deadlock. When the trylock fails TSC usually becomes
unstable.

We can be more clever by using an atomic counter and checking it in
the clocksource_watchdog callback. We restart the watchdog whenever
the counter is > 0 and only decrement the counter when we ran through
a full update cycle.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <johnstul@us.ibm.com>
Acked-by: NJason Wessel <jason.wessel@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1109121326280.2723@ionosSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

9fb60336

08 9月, 2011 9 次提交

posix-cpu-timers: Cure SMP accounting oddities · e8abccb7

由 Peter Zijlstra 提交于 9月 01, 2011

David reported:

  Attached below is a watered-down version of rt/tst-cpuclock2.c from
  GLIBC.  Just build it with "gcc -o test test.c -lpthread -lrt" or
  similar.

  Run it several times, and you will see cases where the main thread
  will measure a process clock difference before and after the nanosleep
  which is smaller than the cpu-burner thread's individual thread clock
  difference.  This doesn't make any sense since the cpu-burner thread
  is part of the top-level process's thread group.

  I've reproduced this on both x86-64 and sparc64 (using both 32-bit and
  64-bit binaries).

  For example:

  [davem@boricha build-x86_64-linux]$ ./test
  process: before(0.001221967) after(0.498624371) diff(497402404)
  thread:  before(0.000081692) after(0.498316431) diff(498234739)
  self:    before(0.001223521) after(0.001240219) diff(16698)
  [davem@boricha build-x86_64-linux]$

  The diff of 'process' should always be >= the diff of 'thread'.

  I make sure to wrap the 'thread' clock measurements the most tightly
  around the nanosleep() call, and that the 'process' clock measurements
  are the outer-most ones.

  ---
  #include <unistd.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <time.h>
  #include <fcntl.h>
  #include <string.h>
  #include <errno.h>
  #include <pthread.h>

  static pthread_barrier_t barrier;

  static void *chew_cpu(void *arg)
  {
	  pthread_barrier_wait(&barrier);
	  while (1)
		  __asm__ __volatile__("" : : : "memory");
	  return NULL;
  }

  int main(void)
  {
	  clockid_t process_clock, my_thread_clock, th_clock;
	  struct timespec process_before, process_after;
	  struct timespec me_before, me_after;
	  struct timespec th_before, th_after;
	  struct timespec sleeptime;
	  unsigned long diff;
	  pthread_t th;
	  int err;

	  err = clock_getcpuclockid(0, &process_clock);
	  if (err)
		  return 1;

	  err = pthread_getcpuclockid(pthread_self(), &my_thread_clock);
	  if (err)
		  return 1;

	  pthread_barrier_init(&barrier, NULL, 2);
	  err = pthread_create(&th, NULL, chew_cpu, NULL);
	  if (err)
		  return 1;

	  err = pthread_getcpuclockid(th, &th_clock);
	  if (err)
		  return 1;

	  pthread_barrier_wait(&barrier);

	  err = clock_gettime(process_clock, &process_before);
	  if (err)
		  return 1;

	  err = clock_gettime(my_thread_clock, &me_before);
	  if (err)
		  return 1;

	  err = clock_gettime(th_clock, &th_before);
	  if (err)
		  return 1;

	  sleeptime.tv_sec = 0;
	  sleeptime.tv_nsec = 500000000;
	  nanosleep(&sleeptime, NULL);

	  err = clock_gettime(th_clock, &th_after);
	  if (err)
		  return 1;

	  err = clock_gettime(my_thread_clock, &me_after);
	  if (err)
		  return 1;

	  err = clock_gettime(process_clock, &process_after);
	  if (err)
		  return 1;

	  diff = process_after.tv_nsec - process_before.tv_nsec;
	  printf("process: before(%lu.%.9lu) after(%lu.%.9lu) diff(%lu)\n",
		 process_before.tv_sec, process_before.tv_nsec,
		 process_after.tv_sec, process_after.tv_nsec, diff);
	  diff = th_after.tv_nsec - th_before.tv_nsec;
	  printf("thread:  before(%lu.%.9lu) after(%lu.%.9lu) diff(%lu)\n",
		 th_before.tv_sec, th_before.tv_nsec,
		 th_after.tv_sec, th_after.tv_nsec, diff);
	  diff = me_after.tv_nsec - me_before.tv_nsec;
	  printf("self:    before(%lu.%.9lu) after(%lu.%.9lu) diff(%lu)\n",
		 me_before.tv_sec, me_before.tv_nsec,
		 me_after.tv_sec, me_after.tv_nsec, diff);

	  return 0;
  }

This is due to us using p->se.sum_exec_runtime in
thread_group_cputime() where we iterate the thread group and sum all
data. This does not take time since the last schedule operation (tick
or otherwise) into account. We can cure this by using
task_sched_runtime() at the cost of having to take locks.

This also means we can (and must) do away with
thread_group_sched_runtime() since the modified thread_group_cputime()
is now more accurate and would deadlock when called from
thread_group_sched_runtime().
Reported-by: NDavid Miller <davem@davemloft.net>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1314874459.7945.22.camel@twins
Cc: stable@kernel.org
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

e8abccb7

s390: Use direct ktime path for s390 clockevent device · 4f37a68c

由 Martin Schwidefsky 提交于 8月 23, 2011

The clock comparator on s390 uses the same format as the TOD clock.
If the value in the clock comparator is smaller than the current TOD
value an interrupt is pending. Use the CLOCK_EVT_FEAT_KTIME feature
to get the unmodified ktime of the next clockevent expiration and
use it to program the clock comparator without querying the TOD clock.
Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
Cc: john stultz <johnstul@us.ibm.com>
Link: http://lkml.kernel.org/r/20110823133143.153017933@de.ibm.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

4f37a68c

clockevents: Add direct ktime programming function · 65516f8a

由 Martin Schwidefsky 提交于 8月 23, 2011

There is at least one architecture (s390) with a sane clockevent device
that can be programmed with the equivalent of a ktime. No need to create
a delta against the current time, the ktime can be used directly.

A new clock device function 'set_next_ktime' is introduced that is called
with the unmodified ktime for the timer if the clock event device has the
CLOCK_EVT_FEAT_KTIME bit set.
Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
Cc: john stultz <johnstul@us.ibm.com>
Link: http://lkml.kernel.org/r/20110823133142.815350967@de.ibm.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

65516f8a

clockevents: Make minimum delay adjustments configurable · d1748302

由 Martin Schwidefsky 提交于 8月 23, 2011

The automatic increase of the min_delta_ns of a clockevents device
should be done in the clockevents code as the minimum delay is an
attribute of the clockevents device.

In addition not all architectures want the automatic adjustment, on a
massively virtualized system it can happen that the programming of a
clock event fails several times in a row because the virtual cpu has
been rescheduled quickly enough. In that case the minimum delay will
erroneously be increased with no way back. The new config symbol
GENERIC_CLOCKEVENTS_MIN_ADJUST is used to enable the automatic
adjustment. The config option is selected only for x86.
Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
Cc: john stultz <johnstul@us.ibm.com>
Link: http://lkml.kernel.org/r/20110823133142.494157493@de.ibm.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

d1748302

nohz: Remove "Switched to NOHz mode" debugging messages · 29c158e8

由 Heiko Carstens 提交于 8月 23, 2011

When performing cpu hotplug tests the kernel printk log buffer gets flooded
with pointless "Switched to NOHz mode..." messages. Especially when afterwards
analyzing a dump this might have removed more interesting stuff out of the
buffer.
Assuming that switching to NOHz mode simply works just remove the printk.
Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
Link: http://lkml.kernel.org/r/20110823112046.GB2540@osiris.boeblingen.de.ibm.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

29c158e8

proc: Consider NO_HZ when printing idle and iowait times · a25cac51

由 Michal Hocko 提交于 8月 24, 2011

show_stat handler of the /proc/stat file relies on kstat_cpu(cpu)
statistics when priting information about idle and iowait times.
This is OK if we are not using tickless kernel (CONFIG_NO_HZ) because
counters are updated periodically.
With NO_HZ things got more tricky because we are not doing idle/iowait
accounting while we are tickless so the value might get outdated.
Users of /proc/stat will notice that by unchanged idle/iowait values
which is then interpreted as 0% idle/iowait time. From the user space
POV this is an unexpected behavior and a change of the interface.

Let's fix this by using get_cpu_{idle,iowait}_time_us which accounts the
total idle/iowait time since boot and it doesn't rely on sampling or any
other periodic activity. Fall back to the previous behavior if NO_HZ is
disabled or not configured.
Signed-off-by: NMichal Hocko <mhocko@suse.cz>
Cc: Dave Jones <davej@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Link: http://lkml.kernel.org/r/39181366adac1b39cb6aa3cd53ff0f7c78d32676.1314172057.git.mhocko@suse.czSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

a25cac51

nohz: Make idle/iowait counter update conditional · 09a1d34f

由 Michal Hocko 提交于 8月 24, 2011

get_cpu_{idle,iowait}_time_us update idle/iowait counters
unconditionally if the given CPU is in the idle loop.

This doesn't work well outside of CPU governors which are singletons
so nobody (except for IRQ) can race with them.

We will need to use both functions from /proc/stat handler to properly
handle nohz idle/iowait times.

Make the update depend on a non NULL last_update_time argument.
Signed-off-by: NMichal Hocko <mhocko@suse.cz>
Cc: Dave Jones <davej@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Link: http://lkml.kernel.org/r/11f23179472635ce52e78921d47a20216b872f23.1314172057.git.mhocko@suse.czSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

09a1d34f

nohz: Fix update_ts_time_stat idle accounting · 6beea0cd

由 Michal Hocko 提交于 8月 24, 2011

update_ts_time_stat currently updates idle time even if we are in
iowait loop at the moment. The only real users of the idle counter
(via get_cpu_idle_time_us) are CPU governors and they expect to get
cumulative time for both idle and iowait times.
The value (idle_sleeptime) is also printed to userspace by print_cpu
but it prints both idle and iowait times so the idle part is misleading.

Let's clean this up and fix update_ts_time_stat to account both counters
properly and update consumers of idle to consider iowait time as well.
If we do this we might use get_cpu_{idle,iowait}_time_us from other
contexts as well and we will get expected values.
Signed-off-by: NMichal Hocko <mhocko@suse.cz>
Cc: Dave Jones <davej@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Link: http://lkml.kernel.org/r/e9c909c221a8da402c4da07e4cd968c3218f8eb1.1314172057.git.mhocko@suse.czSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

6beea0cd

cputime: Clean up cputime_to_usecs and usecs_to_cputime macros · ef0e0f5e

由 Michal Hocko 提交于 8月 24, 2011

Get rid of semicolon so that those expressions can be used also
somewhere else than just in an assignment.
Signed-off-by: NMichal Hocko <mhocko@suse.cz>
Acked-by: NArnd Bergmann <arnd@arndb.de>
Cc: Dave Jones <davej@redhat.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Link: http://lkml.kernel.org/r/7565417ce30d7e6b1ddc169843af0777dbf66e75.1314172057.git.mhocko@suse.czSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

ef0e0f5e

11 8月, 2011 9 次提交

alarmtimers: Rework RTC device selection using class interface · 8bc0dafb

由 John Stultz 提交于 7月 14, 2011

This allows cleaner detection of the RTC device being registered, rather
then probing any time someone calls alarmtimer_get_rtcdev.

CC: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>

8bc0dafb

alarmtimers: Add try_to_cancel functionality · 9082c465

由 John Stultz 提交于 8月 10, 2011

There's a number of edge cases when cancelling a alarm, so
to be sure we accurately do so, introduce try_to_cancel, which
returns proper failure errors if it cannot. Also modify cancel
to spin until the alarm is properly disabled.

CC: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>

9082c465

alarmtimers: Add more refined alarm state tracking · a28cde81

由 John Stultz 提交于 8月 10, 2011

In order to allow for functionality like try_to_cancel, add
more refined  state tracking (similar to hrtimers).

CC: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>

a28cde81

alarmtimers: Remove period from alarm structure · 9e264762

由 John Stultz 提交于 8月 10, 2011

Now that periodic alarmtimers are managed by the handler function,
remove the period value from the alarm structure and let the handlers
manage the interval on their own.

CC: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>

9e264762

alarmtimers: Remove interval cap limit hack · d77e23ac

由 John Stultz 提交于 8月 10, 2011

Now that the alarmtimers code has been refactored, the interval
cap limit can be removed.

CC: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>

d77e23ac

alarmtimers: Add alarm_forward functionality · dce75a8c

由 John Stultz 提交于 8月 10, 2011

In order to avoid wasting time expiring and re-adding very high freq
periodic alarmtimers, introduce alarm_forward() which is similar to
hrtimer_forward and moves the timer to the next future expiration time
and returns the number of overruns.

CC: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>

dce75a8c

alarmtimers: Push rearming peroidic timers down into alamrtimer handler · 54da23b7

由 John Stultz 提交于 8月 10, 2011

This patch pushes the periodic alarmtimer re-arming down into the alarmtimer
handler, mimicking how hrtimers handle this.

CC: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>

54da23b7

alarmtimers: Change alarmtimer functions to return alarmtimer_restart values · 4b41308d

由 John Stultz 提交于 8月 10, 2011

In order to properly fix the denial of service issue with high freq
periodic alarm timers, we need to push the re-arming logic into the
alarm timer handler, much as the hrtimer code does.

This patch introduces alarmtimer_restart enum and changes the
alarmtimer handler declarations to use it as a return value. Further,
to ease following changes, it extends the alarmtimer handler functions
to also take the time at expiration. No logic is yet modified.

CC: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>

4b41308d

alarmtimers: Avoid possible denial of service with high freq periodic timers · 6af7e471

由 John Stultz 提交于 8月 10, 2011

Its possible to jam up the alarm timers by setting very small interval
timers, which will cause the alarmtimer subsystem to spend all of its time
firing and restarting timers. This can effectivly lock up a box.

A deeper fix is needed, closely mimicking the hrtimer code, but for now
just cap the interval to 100us to avoid userland hanging the system.

CC: Thomas Gleixner <tglx@linutronix.de>
CC: stable@kernel.org
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>

6af7e471

10 8月, 2011 2 次提交

alarmtimers: Memset itimerspec passed into alarm_timer_get · ea7802f6

由 John Stultz 提交于 8月 04, 2011

Following common_timer_get, zero out the itimerspec passed in.

CC: Thomas Gleixner <tglx@linutronix.de>
CC: stable@kernel.org
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>

ea7802f6

alarmtimers: Avoid possible null pointer traversal · 971c90bf

由 John Stultz 提交于 8月 04, 2011

We don't check if old_setting is non null before assigning it, so
correct this.

CC: Thomas Gleixner <tglx@linutronix.de>
CC: stable@kernel.org
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>

971c90bf

08 8月, 2011 7 次提交

L

Linux 3.1-rc1 · 322a8b03
由 Linus Torvalds 提交于 8月 07, 2011

322a8b03
L
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc · 9e233113
由 Linus Torvalds 提交于 8月 07, 2011
```
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
  sparc: Fix build with DEBUG_PAGEALLOC enabled.
```
9e233113

sh: Fix boot crash related to SCI · fc97114b

由 Rafael J. Wysocki 提交于 8月 08, 2011

Commit d006199e72a9 ("serial: sh-sci: Regtype probing doesn't need to be
fatal.") made sci_init_single() return when sci_probe_regmap() succeeds,
although it should return when sci_probe_regmap() fails. This causes
systems using the serial sh-sci driver to crash during boot.

Fix the problem by using the right return condition.
Signed-off-by: NRafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

fc97114b

arm: remove stale export of 'sha_transform' · f23c126b

由 Linus Torvalds 提交于 8月 07, 2011

The generic library code already exports the generic function, this was
left-over from the ARM-specific version that just got removed.
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

f23c126b

arm: remove "optimized" SHA1 routines · 4d448714

由 Linus Torvalds 提交于 8月 07, 2011

Since commit 1eb19a12 ("lib/sha1: use the git implementation of
SHA-1"), the ARM SHA1 routines no longer work.  The reason? They
depended on the larger 320-byte workspace, and now the sha1 workspace is
just 16 words (64 bytes).  So the assembly version would overwrite the
stack randomly.

The optimized asm version is also probably slower than the new improved
C version, so there's no reason to keep it around.  At least that was
the case in git, where what appears to be the same assembly language
version was removed two years ago because the optimized C BLK_SHA1 code
was faster.
Reported-and-tested-by: NJoachim Eastwood <manabian@gmail.com>
Cc: Andreas Schwab <schwab@linux-m68k.org>
Cc: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

4d448714

fix rcu annotations noise in cred.h · 32955148

由 Al Viro 提交于 8月 07, 2011

task->cred is declared as __rcu, and access to other tasks' ->cred is,
indeed, protected. Access to current->cred does not need rcu_dereference()
at all, since only the task itself can change its ->cred. sparse, of
course, has no way of knowing that...

Add force-cast in current_cred(), make current_fsuid() et.al. use it.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

32955148

vfs: rename 'do_follow_link' to 'should_follow_link' · 7813b94a

由 Linus Torvalds 提交于 8月 07, 2011

Al points out that the do_follow_link() helper function really is
misnamed - it's about whether we should try to follow a symlink or not,
not about actually doing the following.
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

7813b94a

07 8月, 2011 6 次提交

Fix POSIX ACL permission check · 206b1d09

由 Ari Savolainen 提交于 8月 06, 2011

After commit 3567866b: "RCUify freeing acls, let check_acl() go ahead in
RCU mode if acl is cached" posix_acl_permission is being called with an
unsupported flag and the permission check fails. This patch fixes the issue.
Signed-off-by: NAri Savolainen <ari.m.savolainen@gmail.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

206b1d09

Merge branch 'for-linus' of git://git.open-osd.org/linux-open-osd · c2f340a6

由 Linus Torvalds 提交于 8月 06, 2011

* 'for-linus' of git://git.open-osd.org/linux-open-osd:
  ore: Make ore its own module
  exofs: Rename raid engine from exofs/ios.c => ore
  exofs: ios: Move to a per inode components & device-table
  exofs: Move exofs specific osd operations out of ios.c
  exofs: Add offset/length to exofs_get_io_state
  exofs: Fix truncate for the raid-groups case
  exofs: Small cleanup of exofs_fill_super
  exofs: BUG: Avoid sbi realloc
  exofs: Remove pnfs-osd private definitions
  nfs_xdr: Move nfs4_string definition out of #ifdef CONFIG_NFS_V4

c2f340a6

vfs: optimize inode cache access patterns · 3ddcd056

由 Linus Torvalds 提交于 8月 06, 2011

The inode structure layout is largely random, and some of the vfs paths
really do care.  The path lookup in particular is already quite D$
intensive, and profiles show that accessing the 'inode->i_op->xyz'
fields is quite costly.

We already optimized the dcache to not unnecessarily load the d_op
structure for members that are often NULL using the DCACHE_OP_xyz bits
in dentry->d_flags, and this does something very similar for the inode
ops that are used during pathname lookup.

It also re-orders the fields so that the fields accessed by 'stat' are
together at the beginning of the inode structure, and roughly in the
order accessed.

The effect of this seems to be in the 1-2% range for an empty kernel
"make -j" run (which is fairly kernel-intensive, mostly in filename
lookup), so it's visible.  The numbers are fairly noisy, though, and
likely depend a lot on exact microarchitecture.  So there's more tuning
to be done.
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

3ddcd056

vfs: renumber DCACHE_xyz flags, remove some stale ones · 830c0f0e

由 Linus Torvalds 提交于 8月 06, 2011

Gcc tends to generate better code with small integers, including the
DCACHE_xyz flag tests - so move the common ones to be first in the list.
Also just remove the unused DCACHE_INOTIFY_PARENT_WATCHED and
DCACHE_AUTOFS_PENDING values, their users no longer exists in the source
tree.

And add a "unlikely()" to the DCACHE_OP_COMPARE test, since we want the
common case to be a nice straight-line fall-through.
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

830c0f0e

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 7cd4767e

由 Linus Torvalds 提交于 8月 06, 2011

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  net: Compute protocol sequence numbers and fragment IDs using MD5.
  crypto: Move md5_transform to lib/md5.c

7cd4767e

ore: Make ore its own module · cf283ade

由 Boaz Harrosh 提交于 8月 06, 2011

Export everything from ore need exporting. Change Kbuild and Kconfig
to build ore.ko as an independent module. Import ore from exofs
Signed-off-by: NBoaz Harrosh <bharrosh@panasas.com>

cf283ade

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功