1. 22 11月, 2014 6 次提交
  2. 21 11月, 2014 1 次提交
  3. 20 11月, 2014 11 次提交
  4. 19 11月, 2014 3 次提交
  5. 18 11月, 2014 1 次提交
  6. 17 11月, 2014 12 次提交
  7. 16 11月, 2014 6 次提交
    • D
      x86: Require exact match for 'noxsave' command line option · 2cd3949f
      Dave Hansen 提交于
      We have some very similarly named command-line options:
      
      arch/x86/kernel/cpu/common.c:__setup("noxsave", x86_xsave_setup);
      arch/x86/kernel/cpu/common.c:__setup("noxsaveopt", x86_xsaveopt_setup);
      arch/x86/kernel/cpu/common.c:__setup("noxsaves", x86_xsaves_setup);
      
      __setup() is designed to match options that take arguments, like
      "foo=bar" where you would have:
      
      	__setup("foo", x86_foo_func...);
      
      The problem is that "noxsave" actually _matches_ "noxsaves" in
      the same way that "foo" matches "foo=bar".  If you boot an old
      kernel that does not know about "noxsaves" with "noxsaves" on the
      command line, it will interpret the argument as "noxsave", which
      is not what you want at all.
      
      This makes the "noxsave" handler only return success when it finds
      an *exact* match.
      
      [ tglx: We really need to make __setup() more robust. ]
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: x86@kernel.org
      Cc: stable@vger.kernel.org
      Link: http://lkml.kernel.org/r/20141111220133.FE053984@viggo.jf.intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      2cd3949f
    • S
      sched/cputime: Fix clock_nanosleep()/clock_gettime() inconsistency · 6e998916
      Stanislaw Gruszka 提交于
      Commit d670ec13 "posix-cpu-timers: Cure SMP wobbles" fixes one glibc
      test case in cost of breaking another one. After that commit, calling
      clock_nanosleep(TIMER_ABSTIME, X) and then clock_gettime(&Y) can result
      of Y time being smaller than X time.
      
      Reproducer/tester can be found further below, it can be compiled and ran by:
      
      	gcc -o tst-cpuclock2 tst-cpuclock2.c -pthread
      	while ./tst-cpuclock2 ; do : ; done
      
      This reproducer, when running on a buggy kernel, will complain
      about "clock_gettime difference too small".
      
      Issue happens because on start in thread_group_cputimer() we initialize
      sum_exec_runtime of cputimer with threads runtime not yet accounted and
      then add the threads runtime to running cputimer again on scheduler
      tick, making it's sum_exec_runtime bigger than actual threads runtime.
      
      KOSAKI Motohiro posted a fix for this problem, but that patch was never
      applied: https://lkml.org/lkml/2013/5/26/191 .
      
      This patch takes different approach to cure the problem. It calls
      update_curr() when cputimer starts, that assure we will have updated
      stats of running threads and on the next schedule tick we will account
      only the runtime that elapsed from cputimer start. That also assure we
      have consistent state between cpu times of individual threads and cpu
      time of the process consisted by those threads.
      
      Full reproducer (tst-cpuclock2.c):
      
      	#define _GNU_SOURCE
      	#include <unistd.h>
      	#include <sys/syscall.h>
      	#include <stdio.h>
      	#include <time.h>
      	#include <pthread.h>
      	#include <stdint.h>
      	#include <inttypes.h>
      
      	/* Parameters for the Linux kernel ABI for CPU clocks.  */
      	#define CPUCLOCK_SCHED          2
      	#define MAKE_PROCESS_CPUCLOCK(pid, clock) \
      		((~(clockid_t) (pid) << 3) | (clockid_t) (clock))
      
      	static pthread_barrier_t barrier;
      
      	/* Help advance the clock.  */
      	static void *chew_cpu(void *arg)
      	{
      		pthread_barrier_wait(&barrier);
      		while (1) ;
      
      		return NULL;
      	}
      
      	/* Don't use the glibc wrapper.  */
      	static int do_nanosleep(int flags, const struct timespec *req)
      	{
      		clockid_t clock_id = MAKE_PROCESS_CPUCLOCK(0, CPUCLOCK_SCHED);
      
      		return syscall(SYS_clock_nanosleep, clock_id, flags, req, NULL);
      	}
      
      	static int64_t tsdiff(const struct timespec *before, const struct timespec *after)
      	{
      		int64_t before_i = before->tv_sec * 1000000000ULL + before->tv_nsec;
      		int64_t after_i = after->tv_sec * 1000000000ULL + after->tv_nsec;
      
      		return after_i - before_i;
      	}
      
      	int main(void)
      	{
      		int result = 0;
      		pthread_t th;
      
      		pthread_barrier_init(&barrier, NULL, 2);
      
      		if (pthread_create(&th, NULL, chew_cpu, NULL) != 0) {
      			perror("pthread_create");
      			return 1;
      		}
      
      		pthread_barrier_wait(&barrier);
      
      		/* The test.  */
      		struct timespec before, after, sleeptimeabs;
      		int64_t sleepdiff, diffabs;
      		const struct timespec sleeptime = {.tv_sec = 0,.tv_nsec = 100000000 };
      
      		/* The relative nanosleep.  Not sure why this is needed, but its presence
      		   seems to make it easier to reproduce the problem.  */
      		if (do_nanosleep(0, &sleeptime) != 0) {
      			perror("clock_nanosleep");
      			return 1;
      		}
      
      		/* Get the current time.  */
      		if (clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &before) < 0) {
      			perror("clock_gettime[2]");
      			return 1;
      		}
      
      		/* Compute the absolute sleep time based on the current time.  */
      		uint64_t nsec = before.tv_nsec + sleeptime.tv_nsec;
      		sleeptimeabs.tv_sec = before.tv_sec + nsec / 1000000000;
      		sleeptimeabs.tv_nsec = nsec % 1000000000;
      
      		/* Sleep for the computed time.  */
      		if (do_nanosleep(TIMER_ABSTIME, &sleeptimeabs) != 0) {
      			perror("absolute clock_nanosleep");
      			return 1;
      		}
      
      		/* Get the time after the sleep.  */
      		if (clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &after) < 0) {
      			perror("clock_gettime[3]");
      			return 1;
      		}
      
      		/* The time after sleep should always be equal to or after the absolute sleep
      		   time passed to clock_nanosleep.  */
      		sleepdiff = tsdiff(&sleeptimeabs, &after);
      		if (sleepdiff < 0) {
      			printf("absolute clock_nanosleep woke too early: %" PRId64 "\n", sleepdiff);
      			result = 1;
      
      			printf("Before %llu.%09llu\n", before.tv_sec, before.tv_nsec);
      			printf("After  %llu.%09llu\n", after.tv_sec, after.tv_nsec);
      			printf("Sleep  %llu.%09llu\n", sleeptimeabs.tv_sec, sleeptimeabs.tv_nsec);
      		}
      
      		/* The difference between the timestamps taken before and after the
      		   clock_nanosleep call should be equal to or more than the duration of the
      		   sleep.  */
      		diffabs = tsdiff(&before, &after);
      		if (diffabs < sleeptime.tv_nsec) {
      			printf("clock_gettime difference too small: %" PRId64 "\n", diffabs);
      			result = 1;
      		}
      
      		pthread_cancel(th);
      
      		return result;
      	}
      Signed-off-by: NStanislaw Gruszka <sgruszka@redhat.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/20141112155843.GA24803@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      6e998916
    • P
      sched/cputime: Fix cpu_timer_sample_group() double accounting · 23cfa361
      Peter Zijlstra 提交于
      While looking over the cpu-timer code I found that we appear to add
      the delta for the calling task twice, through:
      
        cpu_timer_sample_group()
          thread_group_cputimer()
            thread_group_cputime()
              times->sum_exec_runtime += task_sched_runtime();
      
          *sample = cputime.sum_exec_runtime + task_delta_exec();
      
      Which would make the sample run ahead, making the sleep short.
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Stanislaw Gruszka <sgruszka@redhat.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Tejun Heo <tj@kernel.org>
      Link: http://lkml.kernel.org/r/20141112113737.GI10476@twins.programming.kicks-ass.netSigned-off-by: NIngo Molnar <mingo@kernel.org>
      23cfa361
    • P
      sched/numa: Avoid selecting oneself as swap target · 7af68335
      Peter Zijlstra 提交于
      Because the whole numa task selection stuff runs with preemption
      enabled (its long and expensive) we can end up migrating and selecting
      oneself as a swap target. This doesn't really work out well -- we end
      up trying to acquire the same lock twice for the swap migrate -- so
      avoid this.
      Reported-and-Tested-by: NSasha Levin <sasha.levin@oracle.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/20141110100328.GF29390@twins.programming.kicks-ass.netSigned-off-by: NIngo Molnar <mingo@kernel.org>
      7af68335
    • M
      bitops: Fix shift overflow in GENMASK macros · 00b4d9a1
      Maxime COQUELIN 提交于
      On some 32 bits architectures, including x86, GENMASK(31, 0) returns 0
      instead of the expected ~0UL.
      
      This is the same on some 64 bits architectures with GENMASK_ULL(63, 0).
      
      This is due to an overflow in the shift operand, 1 << 32 for GENMASK,
      1 << 64 for GENMASK_ULL.
      Reported-by: NEric Paire <eric.paire@st.com>
      Suggested-by: NRasmus Villemoes <linux@rasmusvillemoes.dk>
      Signed-off-by: NMaxime Coquelin <maxime.coquelin@st.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: <stable@vger.kernel.org> # v3.13+
      Cc: linux@rasmusvillemoes.dk
      Cc: gong.chen@linux.intel.com
      Cc: John Sullivan <jsrhbz@kanargh.force9.co.uk>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Fixes: 10ef6b0d ("bitops: Introduce a more generic BITMASK macro")
      Link: http://lkml.kernel.org/r/1415267659-10563-1-git-send-email-maxime.coquelin@st.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      00b4d9a1
    • A
      perf/x86/intel/uncore: Fix boot crash on SBOX PMU on Haswell-EP · 68055915
      Andi Kleen 提交于
      There were several reports that on some systems writing the SBOX0 PMU
      initialization MSR would #GP at boot. This did not happen on all
      systems -- my two test systems booted fine.
      
      Writing the three initialization bits bit-by-bit seems to avoid the
      problem. So add a special callback to do just that.
      
      This replaces an earlier patch that disabled the SBOX.
      Reported-by: NAlexei Starovoitov <alexei.starovoitov@gmail.com>
      Reported-and-Tested-by: NPatrick Lu <patrick.lu@intel.com>
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Link: http://lkml.kernel.org/r/1415062828-19759-4-git-send-email-andi@firstfloor.org
      [ Fixed a whitespace error and added attribution tags that were left out inexplicably. ]
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      68055915