1. 04 Feb, 2013 1 commit
  2. 26 Dec, 2012 3 commits
    • x32: fix sigtimedwait · b2ddedcd
      Al Viro committed
      It needs 64bit timespec.  As it is, we end up truncating the timeout
      to whole seconds; usually it doesn't matter, but having all
      sub-second timeouts truncated to one jiffy is visibly wrong.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • x32: fix waitid() · a566c288
      Al Viro committed
      It needs 64bit rusage and 32bit siginfo.  glibc never calls it with
      non-NULL rusage pointer, or we would've seen breakage already...
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • switch compat_sys_wait4() and compat_sys_waitid() to COMPAT_SYSCALL_DEFINE · 8d9807b1
      Al Viro committed
      Strictly speaking, ppc64 needs it for C ABI compliance.  Realistically
      I would be very surprised if e.g. passing 0xffffffff as 'options'
      argument to waitid() from 32bit task would cause problems, but yes,
      it puts us into undefined behaviour territory.  ppc64 expects int
      argument to be passed in 64bit register with bits 31..63 containing
      the same value.  SYSCALL_DEFINE on ppc provides a wrapper that normalizes
      the value passed from userland; so does COMPAT_SYSCALL_DEFINE.  Plain
      declaration of compat_sys_something() with an int argument obviously
      doesn't.  Again, for wait4 and waitid I would be extremely surprised
      if gcc started to produce code depending on that value having been
      properly sign-extended - the argument(s) in question end up passed
      blindly to sys_wait4 and sys_waitid resp. and normalization for native
      syscalls takes care of their use there.  Still, better to use
      COMPAT_SYSCALL_DEFINE here than worry about nasal daemons...
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
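The normalization discussed in this commit message can be illustrated with a minimal sketch in plain C. The helper name `normalize_int_arg` is hypothetical and the 64-bit register is modelled as a `uint64_t`; this is not the kernel's wrapper macro, only the arithmetic it is assumed to perform on an `int` argument.

```c
#include <stdint.h>

/*
 * Hypothetical helper: model what a syscall wrapper does with an `int`
 * argument that arrives in a 64-bit register.  Only the low 32 bits are
 * trusted; the upper half is rebuilt by sign extension.
 * The uint32_t -> int32_t conversion relies on two's-complement wrapping,
 * which mainstream compilers provide.
 */
int64_t normalize_int_arg(uint64_t reg)
{
    return (int64_t)(int32_t)(uint32_t)reg;
}
```

With this sketch, a register holding 0xffffffff with a zeroed upper half is turned into -1, which is the properly sign-extended form the C ABI expects for an `int`.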
  3. 18 Dec, 2012 1 commit
  4. 22 May, 2012 1 commit
  5. 10 May, 2012 1 commit
    • compat: Fix RT signal mask corruption via sigprocmask · b7dafa0e
      Jan Kiszka committed
      compat_sys_sigprocmask reads a smaller signal mask from userspace than
      sigprocmask accepts for setting.  So the high word of blocked.sig[0]
      will be cleared, releasing any potentially blocked RT signal.
      
      This was discovered via userspace code that relies on get/setcontext.
      glibc's i386 versions of those functions use sigprocmask instead of
      rt_sigprocmask to save/restore signal mask and caused RT signal
      unblocking this way.
      
      As suggested by Linus, this replaces the sys_sigprocmask based compat
      version with one that open-codes the required logic, including the merge
      of the existing blocked set with the new one provided on SIG_SETMASK.
      Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
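The merge described in this commit message can be sketched as follows. This is an illustration under stated assumptions, not the kernel code: the blocked set is modelled as one 64-bit word, the old-style compat mask as a 32-bit word, and the names `apply_old_sigmask`, `HOW_BLOCK`, `HOW_UNBLOCK` and `HOW_SETMASK` are hypothetical.

```c
#include <stdint.h>

/* Hypothetical stand-ins for SIG_BLOCK, SIG_UNBLOCK and SIG_SETMASK. */
enum how { HOW_BLOCK, HOW_UNBLOCK, HOW_SETMASK };

/*
 * Apply a 32-bit old-style mask to a 64-bit blocked set.
 * Only the low 32 bits may change; the high word, which holds the
 * RT signals, is always carried over from the existing set.
 */
uint64_t apply_old_sigmask(uint64_t blocked, enum how how, uint32_t new_set)
{
    switch (how) {
    case HOW_BLOCK:
        return blocked | new_set;
    case HOW_UNBLOCK:
        return blocked & ~(uint64_t)new_set;
    case HOW_SETMASK:
        /* Merge: keep the high word, replace the low word. */
        return (blocked & ~0xffffffffULL) | new_set;
    }
    return blocked;
}
```

The `HOW_SETMASK` branch is the interesting one: replacing the whole word with the 32-bit value would silently clear the high word, which is the corruption the commit describes.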
  6. 21 Feb, 2012 1 commit
    • compat: Add helper functions to read/write struct timeval, timespec · 6684ba20
      H. Peter Anvin committed
      Add helper functions to read and write struct timeval and struct
      timespec from userspace.  We already had helper functions for reading
      and writing struct compat_timespec; add a set of functions to do the
      same with struct timeval, and add a second suite of functions which
      can be sensitive to COMPAT_USE_64BIT_TIME and access either 32- or
      64-bit time structures.
      
      This also exports these helper functions to modules.
      
      Rename the existing inlines for converting between struct
      compat_timeval and native struct timespec so we can have a saner
      naming convention for the exported functions.
      Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
  7. 01 Nov, 2011 1 commit
  8. 28 Jul, 2011 2 commits
  9. 12 Jul, 2011 1 commit
    • KVM: Add compat ioctl for KVM_SET_SIGNAL_MASK · 1dda606c
      Alexander Graf committed
      KVM has an ioctl to define which signal mask should be used while running
      inside VCPU_RUN. At least for big endian systems, this mask is different
      on 32-bit and 64-bit systems (though the size is identical).
      
      Add a compat wrapper that converts the mask to whatever the kernel accepts,
      allowing 32-bit kvm user space to set signal masks.
      
      This patch fixes qemu with --enable-io-thread on ppc64 hosts when running
      32-bit user land.
      Signed-off-by: Alexander Graf <agraf@suse.de>
      Signed-off-by: Avi Kivity <avi@redhat.com>
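Why the mask differs on big endian systems even though its size is identical can be shown with a small sketch. It assumes the usual layout in which a 32-bit task stores the signal set as an array of 32-bit words (word 0 for signals 1..32, word 1 for signals 33..64) while a 64-bit kernel uses one 64-bit word. The function name `join_compat_sigset_words` is hypothetical.

```c
#include <stdint.h>

/*
 * Hypothetical helper: build one native 64-bit sigset word from two
 * consecutive 32-bit compat words.  Doing this arithmetically, rather
 * than copying the 8 bytes verbatim, gives the same result on little
 * and big endian hosts.
 */
uint64_t join_compat_sigset_words(uint32_t word0, uint32_t word1)
{
    return (uint64_t)word0 | ((uint64_t)word1 << 32);
}
```

On a big endian host a raw byte copy would place `word0` in the upper half of the 64-bit word, swapping the low and high signal ranges; the explicit shift avoids that.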
  10. 13 May, 2011 1 commit
    • compat: fixes to allow working with tile arch · be84cb43
      Chris Metcalf committed
      The existing <asm-generic/unistd.h> mechanism doesn't really provide
      enough to create the 64-bit "compat" ABI properly in a generic way,
      since the compat ABI is a mix of things where you can re-use the 64-bit
      versions of syscalls and things where you need a compat wrapper.
      
      To provide this in the most direct way possible, I added two new macros
      to go along with the existing __SYSCALL and __SC_3264 macros: __SC_COMP
      and __SC_COMP_3264.  These macros take an additional argument, typically a
      "compat_sys_xxx" function, which is passed to __SYSCALL if you define
      __SYSCALL_COMPAT when including the header, resulting in a pointer to
      the compat function being placed in the generated syscall table.
      
      The change also adds some missing definitions to <linux/compat.h> so that
      it actually has declarations for all the compat syscalls, since the
      "[nr] = ##call" approach requires proper C declarations for all the
      functions included in the syscall table.
      
      Finally, compat.c defines compat_sys_sigpending() and
      compat_sys_sigprocmask() even if the underlying architecture doesn't
      request it, which tries to pull in undefined compat_old_sigset_t defines.
      We need to guard those compat syscall definitions with appropriate
      __ARCH_WANT_SYS_xxx ifdefs.
      Acked-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
  11. 28 Apr, 2011 2 commits
  12. 02 Feb, 2011 2 commits
  13. 15 Sep, 2010 1 commit
    • compat: Make compat_alloc_user_space() incorporate the access_ok() · c41d68a5
      H. Peter Anvin committed
      compat_alloc_user_space() expects the caller to independently call
      access_ok() to verify the returned area.  A missing call could
      introduce problems on some architectures.
      
      This patch incorporates the access_ok() check into
      compat_alloc_user_space() and also adds a sanity check on the length.
      The existing compat_alloc_user_space() implementations are renamed
      arch_compat_alloc_user_space() and are used as part of the
      implementation of the new global function.
      
      This patch assumes NULL will cause __get_user()/__put_user() to either
      fail or access userspace on all architectures.  This should be
      followed by checking the return value of compat_alloc_user_space()
      for NULL in the callers, at which time the access_ok() in the callers
      can also be removed.
      Reported-by: Ben Hawkes <hawkes@sota.gen.nz>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Acked-by: Chris Metcalf <cmetcalf@tilera.com>
      Acked-by: David S. Miller <davem@davemloft.net>
      Acked-by: Ingo Molnar <mingo@elte.hu>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Tony Luck <tony.luck@intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: James Bottomley <jejb@parisc-linux.org>
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: <stable@kernel.org>
  14. 16 Jul, 2010 1 commit
  15. 20 May, 2010 1 commit
    • cpumask: fix compat getaffinity · fa9dc265
      KOSAKI Motohiro committed
      Commit a45185d2 "cpumask: convert kernel/compat.c" broke libnuma, which
      abuses sched_getaffinity to find out NR_CPUS in order to parse
      /sys/devices/system/node/node*/cpumap.
      
      On NUMA systems with fewer than 32 possible CPUs, the current
      compat_sys_sched_getaffinity now returns '4' instead of the actual
      NR_CPUS/8, which makes libnuma bail out when parsing the cpumap.
      
      libnuma first calls sched_getaffinity(0, bitmap, 4096).  That means
      libnuma expects the return value of sched_getaffinity() to be either
      the len argument or NR_CPUS.  It does not expect nr_cpu_ids.
      
      Strictly speaking, the userland requirements are:
      
      1) Glibc assumes the return value is the length of the initialized
         part of the mask argument. E.g. if sched_getaffinity(1024) returns
         128, glibc zero-fills the remaining 896 bytes.
      2) Libnuma assumes the return value can be used to guess NR_CPUS
         in the kernel. It assumes len-arg<NR_CPUS yields -EINVAL. But
         it tries len=4096 first, and 4096 is always bigger than
         NR_CPUS. So if we remove the strange min_length normalization,
         we never hit the -EINVAL case.
      
      sched_getaffinity() already solved this issue.  This patch adapts
      compat_sys_sched_getaffinity() to match the non-compat case.
      Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Acked-by: Rusty Russell <rusty@rustcorp.com.au>
      Acked-by: Arnd Bergmann <arnd@arndb.de>
      Reported-by: Ken Werner <ken.werner@web.de>
      Cc: stable@kernel.org
      Cc: Andi Kleen <andi@firstfloor.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
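The two userland expectations listed in this commit message can be captured in a small sketch of the length handling. The commit message does not show the code, so everything here is an assumption: the function `affinity_copy_len`, its parameters, and the exact checks are hypothetical and only model the behaviour described (reject a buffer too small for the possible CPUs, otherwise report the smaller of the caller's length and the kernel's mask size).

```c
#include <stddef.h>

/*
 * Hypothetical model of the return value of a getaffinity-style call.
 *   len        - buffer length passed by userspace, in bytes
 *   nr_cpu_ids - number of possible CPUs
 *   mask_bytes - size of the kernel cpumask in bytes (NR_CPUS / 8)
 *   word_bytes - size of the compat word the buffer must be aligned to
 * Returns the number of bytes that would be written, or -1 where the
 * real call would fail with -EINVAL.
 */
long affinity_copy_len(size_t len, size_t nr_cpu_ids,
                       size_t mask_bytes, size_t word_bytes)
{
    /* Buffer must have at least one bit per possible CPU. */
    if (len < (nr_cpu_ids + 7) / 8)
        return -1;
    /* Buffer length must be a whole number of words. */
    if (len % word_bytes != 0)
        return -1;
    /* Report how much of the buffer was initialized. */
    return (long)(len < mask_bytes ? len : mask_bytes);
}
```

Under these assumptions a libnuma-style probe with `len = 4096` on a kernel with a 32-byte cpumask gets 32 back, which it can use to derive NR_CPUS, while glibc learns that only the first 32 bytes were initialized.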
  16. 30 Mar, 2010 1 commit
    • include cleanup: Update gfp.h and slab.h includes to prepare for breaking... · 5a0e3ad6
      Tejun Heo committed
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
      
      percpu.h is included by sched.h and module.h and thus ends up being
      included when building most .c files.  percpu.h includes slab.h which
      in turn includes gfp.h making everything defined by the two files
      universally available and complicating inclusion dependencies.
      
      percpu.h -> slab.h dependency is about to be removed.  Prepare for
      this change by updating users of gfp and slab facilities to include those
      headers directly instead of assuming availability.  As this conversion
      needs to touch large number of source files, the following script is
      used as the basis of conversion.
      
        http://userweb.kernel.org/~tj/misc/slabh-sweep.py
      
      The script does the following.
      
      * Scan files for gfp and slab usages and update includes such that
        only the necessary includes are there.  ie. if only gfp is used,
        gfp.h, if slab is used, slab.h.
      
      * When the script inserts a new include, it looks at the include
        blocks and tries to put the new include such that its order conforms
        to its surrounding.  It's put in the include block which contains
        core kernel includes, in the same order that the rest are ordered -
        alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
        doesn't seem to be any matching order.
      
      * If the script can't find a place to put a new include (mostly
        because the file doesn't have fitting include block), it prints out
        an error message indicating which .h file needs to be added to the
        file.
      
      The conversion was done in the following steps.
      
      1. The initial automatic conversion of all .c files updated slightly
         over 4000 files, deleting around 700 includes and adding ~480 gfp.h
         and ~3000 slab.h inclusions.  The script emitted errors for ~400
         files.
      
      2. Each error was manually checked.  Some didn't need the inclusion,
         some needed manual addition while adding it to implementation .h or
         embedding .c file was more appropriate for others.  This step added
         inclusions to around 150 files.
      
      3. The script was run again and the output was compared to the edits
         from #2 to make sure no file was left behind.
      
      4. Several build tests were done and a couple of problems were fixed.
         e.g. lib/decompress_*.c used malloc/free() wrappers around slab
         APIs requiring slab.h to be added manually.
      
      5. The script was run on all .h files but without automatically
         editing them as sprinkling gfp.h and slab.h inclusions around .h
         files could easily lead to inclusion dependency hell.  Most gfp.h
         inclusion directives were ignored as stuff from gfp.h was usually
         widely available and often used in preprocessor macros.  Each
         slab.h inclusion directive was examined and added manually as
         necessary.
      
      6. percpu.h was updated not to include slab.h.
      
      7. Build tests were done on the following configurations and failures
         were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
         distributed build env didn't work with gcov compiles) and a few
         more options had to be turned off depending on archs to make things
         build (like ipr on powerpc/64 which failed due to missing writeq).
      
         * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
         * powerpc and powerpc64 SMP allmodconfig
         * sparc and sparc64 SMP allmodconfig
         * ia64 SMP allmodconfig
         * s390 SMP allmodconfig
         * alpha SMP allmodconfig
         * um on x86_64 SMP allmodconfig
      
      8. percpu.h modifications were reverted so that it could be applied as
         a separate patch and serve as bisection point.
      
      Given the fact that I had only a couple of failures from tests on step
      6, I'm fairly confident about the coverage of this conversion patch.
      If there is a breakage, it's likely to be something in one of the arch
      headers which should be easily discoverable on most builds of
      the specific arch.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
  17. 01 May, 2009 1 commit
  18. 07 Jan, 2009 1 commit
    • Allow times and time system calls to return small negative values · e3d5a27d
      Paul Mackerras committed
      At the moment, the times() system call will appear to fail for a period
      shortly after boot, while the value it wants to return is between -4095 and
      -1.  The same thing will also happen for the time() system call on 32-bit
      platforms some time in 2106 or so.
      
      On some platforms, such as x86, this is unavoidable because of the system
      call ABI, but other platforms such as powerpc have a separate error
      indication from the return value, so system calls can in fact return small
      negative values without indicating an error.  On those platforms,
      force_successful_syscall_return() provides a way to indicate that the
      system call return value should not be treated as an error even if it is
      in the range which would normally be taken as a negative error number.
      
      This adds a force_successful_syscall_return() call to the time() and
      times() system calls plus their 32-bit compat versions, so that they don't
      erroneously indicate an error on those platforms whose system call ABI has
      a separate error indication.  This will not affect anything on other
      platforms.
      
      Joakim Tjernlund added the fix for time() and the compat versions of
      time() and times(), after I did the fix for times().
      Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Acked-by: David S. Miller <davem@davemloft.net>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
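The ambiguity this commit message describes comes from encoding errors in the return value itself. A minimal sketch of that convention, using the -4095..-1 range quoted in the message; the names `MAX_ERRNO_SKETCH` and `looks_like_error` are hypothetical and stand in for whatever the userspace syscall stub actually does.

```c
#include <stdbool.h>

/* Largest error number, taken from the -4095..-1 range in the message. */
#define MAX_ERRNO_SKETCH 4095

/*
 * Hypothetical check: on ABIs without a separate error flag, a raw
 * return value in [-4095, -1] is interpreted as a negated errno, so a
 * legitimate result in that range is indistinguishable from a failure.
 */
bool looks_like_error(long ret)
{
    return ret < 0 && ret >= -MAX_ERRNO_SKETCH;
}
```

A times() result of, say, -100 ticks shortly after boot falls inside this window, which is why platforms with a separate error indication need an explicit way to mark the call as successful.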
  19. 01 Jan, 2009 1 commit
  20. 17 Oct, 2008 1 commit
    • compat: generic compat get/settimeofday · b418da16
      Christoph Hellwig committed
      Nothing arch specific in get/settimeofday.  The details of the timeval
      conversion varied a little from arch to arch, but all with the same
      results.
      
      Also add an extern declaration for sys_tz to linux/time.h because externs
      in .c files are frowned upon.  I'll kill the externs in various other files
      in a separate patch.
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Acked-by: David S. Miller <davem@davemloft.net> [ sparc bits ]
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Acked-by: Kyle McMartin <kyle@mcmartin.ca>
      Cc: Matthew Wilcox <matthew@wil.cx>
      Cc: Grant Grundler <grundler@parisc-linux.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  21. 14 Sep, 2008 1 commit
    • timers: fix itimer/many thread hang · f06febc9
      Frank Mayhar committed
      Overview
      
      This patch reworks the handling of POSIX CPU timers, including the
      ITIMER_PROF, ITIMER_VIRT timers and rlimit handling.  It was put together
      with the help of Roland McGrath, the owner and original writer of this code.
      
      The problem we ran into, and the reason for this rework, has to do with using
      a profiling timer in a process with a large number of threads.  It appears
      that the performance of the old implementation of run_posix_cpu_timers() was
      at least O(n*3) (where "n" is the number of threads in a process) or worse.
      Everything is fine with an increasing number of threads until the time taken
      for that routine to run becomes the same as or greater than the tick time, at
      which point things degrade rather quickly.
      
      This patch fixes bug 9906, "Weird hang with NPTL and SIGPROF."
      
      Code Changes
      
      This rework corrects the implementation of run_posix_cpu_timers() to make it
      run in constant time for a particular machine.  (Performance may vary between
      one machine and another depending upon whether the kernel is built as single-
      or multiprocessor and, in the latter case, depending upon the number of
      running processors.)  To do this, at each tick we now update fields in
      signal_struct as well as task_struct.  The run_posix_cpu_timers() function
      uses those fields to make its decisions.
      
      We define a new structure, "task_cputime," to contain user, system and
      scheduler times and use these in appropriate places:
      
      struct task_cputime {
      	cputime_t utime;
      	cputime_t stime;
      	unsigned long long sum_exec_runtime;
      };
      
      This is included in the structure "thread_group_cputime," which is a new
      substructure of signal_struct and which varies for uniprocessor versus
      multiprocessor kernels.  For uniprocessor kernels, it uses "task_cputime" as
      a simple substructure, while for multiprocessor kernels it is a pointer:
      
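      /* Uniprocessor kernels: the totals are embedded directly. */
      struct thread_group_cputime {
      	struct task_cputime totals;
      };
      
      /* Multiprocessor kernels: the totals are reached through a pointer. */
      struct thread_group_cputime {
      	struct task_cputime *totals;
      };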
      
      We also add a new task_cputime substructure directly to signal_struct, to
      cache the earliest expiration of process-wide timers, and task_cputime also
      replaces the it_*_expires fields of task_struct (used for earliest expiration
      of thread timers).  The "thread_group_cputime" structure contains process-wide
      timers that are updated via account_user_time() and friends.  In the non-SMP
      case the structure is a simple aggregator; unfortunately in the SMP case that
      simplicity was not achievable due to cache-line contention between CPUs (in
      one measured case performance was actually _worse_ on a 16-cpu system than
      the same test on a 4-cpu system, due to this contention).  For SMP, the
      thread_group_cputime counters are maintained as a per-cpu structure allocated
      using alloc_percpu().  The timer functions update only the timer field in
      the structure corresponding to the running CPU, obtained using per_cpu_ptr().
      
      We define a set of inline functions in sched.h that we use to maintain the
      thread_group_cputime structure and hide the differences between UP and SMP
      implementations from the rest of the kernel.  The thread_group_cputime_init()
      function initializes the thread_group_cputime structure for the given task.
      The thread_group_cputime_alloc() is a no-op for UP; for SMP it calls the
      out-of-line function thread_group_cputime_alloc_smp() to allocate and fill
      in the per-cpu structures and fields.  The thread_group_cputime_free()
      function, also a no-op for UP, in SMP frees the per-cpu structures.  The
      thread_group_cputime_clone_thread() function (also a UP no-op) for SMP calls
      thread_group_cputime_alloc() if the per-cpu structures haven't yet been
      allocated.  The thread_group_cputime() function fills the task_cputime
      structure it is passed with the contents of the thread_group_cputime fields;
      in UP it's that simple but in SMP it must also safely check that tsk->signal
      is non-NULL; if it is NULL it just uses the appropriate fields of task_struct,
      otherwise it sums the per-cpu values for each online CPU.  Finally, the three
      functions account_group_user_time(), account_group_system_time() and
      account_group_exec_runtime() are used by timer functions to update the
      respective fields of the thread_group_cputime structure.
      
      Non-SMP operation is trivial and will not be mentioned further.
      
      The per-cpu structure is always allocated when a task creates its first new
      thread, via a call to thread_group_cputime_clone_thread() from copy_signal().
      It is freed at process exit via a call to thread_group_cputime_free() from
      cleanup_signal().
      
      All functions that formerly summed utime/stime/sum_sched_runtime values
      from all threads in the thread group now use thread_group_cputime() to
      snapshot the values in the thread_group_cputime structure or the values in
      the task structure itself if the per-cpu structure hasn't been allocated.
      
      Finally, the code in kernel/posix-cpu-timers.c has changed quite a bit.
      The run_posix_cpu_timers() function has been split into a fast path and a
      slow path; the former safely checks whether there are any expired thread
      timers and, if not, just returns, while the slow path does the heavy lifting.
      With the dedicated thread group fields, timers are no longer "rebalanced" and
      the process_timer_rebalance() function and related code has gone away.  All
      summing loops are gone and all code that used them now uses the
      thread_group_cputime() inline.  When process-wide timers are set, the new
      task_cputime structure in signal_struct is used to cache the earliest
      expiration; this is checked in the fast path.
      
      Performance
      
      The fix appears not to add significant overhead to existing operations.  It
      generally performs the same as the current code except in two cases, one in
      which it performs slightly worse (Case 5 below) and one in which it performs
      very significantly better (Case 2 below).  Overall it's a wash except in those
      two cases.
      
      I've since done somewhat more involved testing on a dual-core Opteron system.
      
      Case 1: With no itimer running, for a test with 100,000 threads, the fixed
      	kernel took 1428.5 seconds, 513 seconds more than the unfixed system,
      	all of which was spent in the system.  There were twice as many
      	voluntary context switches with the fix as without it.
      
      Case 2: With an itimer running at .01 second ticks and 4000 threads (the most
      	an unmodified kernel can handle), the fixed kernel ran the test in
      	eight percent of the time (5.8 seconds as opposed to 70 seconds) and
      	had better tick accuracy (.012 seconds per tick as opposed to .023
      	seconds per tick).
      
      Case 3: A 4000-thread test with an initial timer tick of .01 second and an
      	interval of 10,000 seconds (i.e. a timer that ticks only once) had
      	very nearly the same performance in both cases:  6.3 seconds elapsed
      	for the fixed kernel versus 5.5 seconds for the unfixed kernel.
      
      With fewer threads (eight in these tests), the Case 1 test ran in essentially
      the same time on both the modified and unmodified kernels (5.2 seconds versus
      5.8 seconds).  The Case 2 test ran in about the same time as well, 5.9 seconds
      versus 5.4 seconds but again with much better tick accuracy, .013 seconds per
      tick versus .025 seconds per tick for the unmodified kernel.
      
      Since the fix affected the rlimit code, I also tested soft and hard CPU limits.
      
      Case 4: With a hard CPU limit of 20 seconds and eight threads (and an itimer
      	running), the modified kernel was very slightly favored in that while
      	it killed the process in 19.997 seconds of CPU time (5.002 seconds of
      	wall time), only .003 seconds of that was system time, the rest was
      	user time.  The unmodified kernel killed the process in 20.001 seconds
      	of CPU (5.014 seconds of wall time) of which .016 seconds was system
      	time.  Really, though, the results were too close to call.  The results
      	were essentially the same with no itimer running.
      
      Case 5: With a soft limit of 20 seconds and a hard limit of 2000 seconds
      	(where the hard limit would never be reached) and an itimer running,
      	the modified kernel exhibited worse tick accuracy than the unmodified
      	kernel: .050 seconds/tick versus .028 seconds/tick.  Otherwise,
      	performance was almost indistinguishable.  With no itimer running this
      	test exhibited virtually identical behavior and times in both cases.
      
      In times past I did some limited performance testing.  Those results are below.
      
      On a four-cpu Opteron system without this fix, a sixteen-thread test executed
      in 3569.991 seconds, of which user was 3568.435s and system was 1.556s.  On
      the same system with the fix, user and elapsed time were about the same, but
      system time dropped to 0.007 seconds.  Performance with eight, four and one
      thread were comparable.  Interestingly, the timer ticks with the fix seemed
      more accurate:  The sixteen-thread test with the fix received 149543 ticks
      for 0.024 seconds per tick, while the same test without the fix received 58720
      for 0.061 seconds per tick.  Both cases were configured for an interval of
      0.01 seconds.  Again, the other tests were comparable.  Each thread in this
      test computed the primes up to 25,000,000.
      
      I also did a test with a large number of threads, 100,000 threads, which is
      impossible without the fix.  In this case each thread computed the primes only
      up to 10,000 (to make the runtime manageable).  System time dominated, at
      1546.968 seconds out of a total 2176.906 seconds (giving a user time of
      629.938s).  It received 147651 ticks for 0.015 seconds per tick, still quite
      accurate.  There is obviously no comparable test without the fix.
      Signed-off-by: Frank Mayhar <fmayhar@google.com>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
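The SMP scheme described in this commit message, where each CPU updates only its own counters and a reader sums them on demand, can be sketched in user-space C. This is an illustration under assumptions, not the kernel implementation: the per-cpu area is modelled as a plain array, `cputime_t` is replaced by `unsigned long`, and the names `task_cputime_sketch`, `account_user_time_sketch` and `sum_group_cputime` are hypothetical.

```c
#include <stddef.h>

/* Simplified stand-in for the task_cputime structure from the message. */
struct task_cputime_sketch {
    unsigned long utime;
    unsigned long stime;
    unsigned long long sum_exec_runtime;
};

/* Writer side: a CPU touches only its own slot, so no slot is shared
 * between CPUs on the hot path. */
void account_user_time_sketch(struct task_cputime_sketch *percpu,
                              size_t cpu, unsigned long delta)
{
    percpu[cpu].utime += delta;
}

/* Reader side: take a snapshot by summing every CPU's slot. */
struct task_cputime_sketch
sum_group_cputime(const struct task_cputime_sketch *percpu, size_t ncpus)
{
    struct task_cputime_sketch total = { 0, 0, 0 };

    for (size_t i = 0; i < ncpus; i++) {
        total.utime += percpu[i].utime;
        total.stime += percpu[i].stime;
        total.sum_exec_runtime += percpu[i].sum_exec_runtime;
    }
    return total;
}
```

The design trade-off matches the message: updates stay cheap and contention-free because they are local to one slot, while the cost of aggregation moves to the comparatively rare reader.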
  22. 01 May, 2008 1 commit
  23. 30 Apr, 2008 1 commit
  24. 20 Apr, 2008 1 commit
  25. 17 Apr, 2008 1 commit
  26. 10 Feb, 2008 2 commits
    • hrtimer: don't modify restart_block->fn in restart functions · c289b074
      Oleg Nesterov committed
      hrtimer_nanosleep_restart() clears/restores restart_block->fn. This is
      pointless and complicates its usage. Note that if sys_restart_syscall()
      doesn't actually happen, we have a bogus "pending" restart->fn anyway,
      this is harmless.
      Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
      Cc: Alexey Dobriyan <adobriyan@sw.ru>
      Cc: Pavel Emelyanov <xemul@sw.ru>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Toyo Abe <toyoa@mvista.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • hrtimer: fix *rmtp/restarts handling in compat_sys_nanosleep() · 41652937
      Oleg Nesterov committed
      Spotted by Pavel Emelyanov and Alexey Dobriyan.
      
      compat_sys_nanosleep() implicitly uses hrtimer_nanosleep_restart(), this can't
      work. Make a suitable compat_nanosleep_restart() helper.
      
      Introduced by commit c70878b4
      hrtimer: hook compat_sys_nanosleep up to high res timer code
      
      Also, set ->addr_limit = KERNEL_DS before doing hrtimer_nanosleep(), this func
      was changed by the previous patch and now takes the "__user *" parameter.
      
      Thanks to Ingo Molnar for fixing the bug in this patch.
      Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Alexey Dobriyan <adobriyan@sw.ru>
      Cc: Pavel Emelyanov <xemul@sw.ru>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Toyo Abe <toyoa@mvista.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
  27. 19 Oct, 2007 2 commits
  28. 11 May, 2007 1 commit
  29. 12 Feb, 2007 1 commit
    • [PATCH] Common compat_sys_sysinfo · d4d23add
      Kyle McMartin committed
      I noticed that almost all architectures implemented exactly the same
      sys32_sysinfo...  except parisc, where a bug was to be found in handling of
      the uptime.  So let's remove a whole whack of code for fun and profit.
      Cribbed compat_sys_sysinfo from x86_64's implementation, since I figured it
      would be the best tested.
      
      This patch incorporates Arnd's suggestion of not using set_fs/get_fs, but
      instead extracting out the common code from sys_sysinfo.
      
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: <linux-arch@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  30. 04 Nov, 2006 1 commit
  31. 29 Oct, 2006 1 commit
  32. 02 Oct, 2006 1 commit
  33. 01 Oct, 2006 1 commit