1. 29 Jan, 2008 1 commit
  2. 26 Jan, 2008 1 commit
    • sched: high-res preemption tick · 8f4d37ec
      Committed by Peter Zijlstra
      Use HR-timers (when available) to deliver an accurate preemption tick.
      
      The regular scheduler tick that runs at 1/HZ can be too coarse when nice
      levels are used. The fairness system will still keep the CPU utilisation 'fair'
      by then delaying the task that got an excessive amount of CPU time, but we try
      to minimize this by delivering preemption points spot-on.
      
      The average frequency of this extra interrupt is sched_latency / nr_latency,
      which need not be higher than 1/HZ; it's just that the distribution within the
      sched_latency period is important.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  3. 11 Oct, 2007 1 commit
  4. 01 Aug, 2007 1 commit
  5. 17 Jul, 2007 1 commit
  6. 10 May, 2007 1 commit
  7. 08 May, 2007 1 commit
    • i386: use page allocator to allocate thread_info structure · b5637e65
      Committed by Christoph Lameter
      i386 uses kmalloc to allocate the thread_info structure, assuming that the
      allocations result in a page-sized, page-aligned allocation.  That has worked
      so far because SLAB exempts page-sized slabs from debugging and aligns them in
      special ways that go beyond the restrictions imposed by
      KMALLOC_ARCH_MINALIGN valid for other slabs in the kmalloc array.
      
      SLUB also works fine without debugging since page-sized allocations neatly
      align at page boundaries.  However, if debugging is switched on then SLUB
      will extend the slab with debug information.  The resulting slab is no
      longer of page size.  It will only be aligned following the requirements
      imposed by KMALLOC_ARCH_MINALIGN.  As a result the thread_info structure may
      not be page aligned, which makes i386 fail to boot with SLUB debug on.
      
      Replace the calls to kmalloc with calls into the page allocator.
      
      An alternate solution may be to create a custom slab cache where the
      alignment is set to PAGE_SIZE.  That would allow slub debugging to be
      applied to the threadinfo structure.
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Cc: William Lee Irwin III <wli@holomorphy.com>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
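      Why the alignment matters can be illustrated in user space. The sketch
      below is not the kernel code: THREAD_SIZE, alloc_stack_area() and
      is_thread_size_aligned() are hypothetical names, and C11 aligned_alloc()
      merely stands in for the page allocator. It only shows the property the
      commit relies on, namely that the block holding the thread_info structure
      starts on a boundary that is a multiple of its size.

```c
#include <stdint.h>
#include <stdlib.h>

/* Assumed size of the stack/thread_info area (two 4 KiB pages); illustrative only. */
#define THREAD_SIZE 8192u

/* Returns 1 if ptr starts on a THREAD_SIZE boundary, 0 otherwise. */
int is_thread_size_aligned(const void *ptr)
{
    return ((uintptr_t)ptr & (THREAD_SIZE - 1u)) == 0;
}

/*
 * Hypothetical stand-in for the page allocator: C11 aligned_alloc() returns
 * a block whose address is a multiple of the requested alignment.
 * The caller releases the block with free().
 */
void *alloc_stack_area(void)
{
    return aligned_alloc(THREAD_SIZE, THREAD_SIZE);
}
```

      An allocator that appends debug data to each object can hand back an
      address for which is_thread_size_aligned() returns 0, which is the failure
      mode described above.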
  8. 14 Dec, 2006 1 commit
    • [PATCH] PM: Fix SMP races in the freezer · 8a102eed
      Committed by Rafael J. Wysocki
      Currently, to tell a task that it should go to the refrigerator, we set the
      PF_FREEZE flag for it and send a fake signal to it.  Unfortunately there
      are two SMP-related problems with this approach.  First, a task running on
      another CPU may be updating its flags while the freezer attempts to set
      PF_FREEZE for it and this may leave the task's flags in an inconsistent
      state.  Second, there is a potential race between freeze_process() and
      refrigerator() in which freeze_process() running on one CPU is reading a
      task's PF_FREEZE flag while refrigerator() running on another CPU has just
      set PF_FROZEN for the same task and attempts to reset PF_FREEZE for it.  If
      the refrigerator wins the race, freeze_process() will state that PF_FREEZE
      hasn't been set for the task and will set it unnecessarily, so the task
      will go to the refrigerator once again after it's been thawed.
      
      To solve the first of these problems we need to stop using PF_FREEZE to tell
      tasks that they should go to the refrigerator.  Instead, we can introduce a
      special TIF_*** flag and use it for this purpose, since changing other
      tasks' TIF_*** flags is allowed and there are special calls for it.
      
      To avoid the freeze_process()-refrigerator() race we can make
      freeze_process() always check the task's PF_FROZEN flag after it's read
      its "freeze" flag.  We should also make sure that refrigerator() will
      always reset the task's "freeze" flag after it's set PF_FROZEN for it.
      Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
      Acked-by: Pavel Machek <pavel@ucw.cz>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
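      The reason a dedicated flag word helps can be sketched in user space with
      C11 atomics. This is not the kernel implementation: struct task_flags,
      FLAG_FREEZE, request_freeze() and test_and_clear_freeze() are hypothetical
      names standing in for the "special calls" the message mentions. The point
      is that each update is a single atomic read-modify-write, so a setter
      running on one CPU cannot clobber a concurrent update made elsewhere.

```c
#include <stdatomic.h>

/* Hypothetical bit standing in for the new "freeze" thread flag. */
#define FLAG_FREEZE (1u << 0)

/* Hypothetical flag word that other threads are allowed to modify. */
struct task_flags {
    atomic_uint bits;
};

/* Called by the freezer: atomically set the freeze request bit. */
void request_freeze(struct task_flags *t)
{
    atomic_fetch_or(&t->bits, FLAG_FREEZE);
}

/*
 * Called by the task itself on its way into the refrigerator: atomically
 * clear the bit and report whether it had been set.
 */
int test_and_clear_freeze(struct task_flags *t)
{
    unsigned int old = atomic_fetch_and(&t->bits, ~FLAG_FREEZE);
    return (old & FLAG_FREEZE) != 0;
}
```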
  9. 07 Dec, 2006 1 commit
  10. 10 Jul, 2006 1 commit
  11. 28 Jun, 2006 2 commits
    • [PATCH] vdso: randomize the i386 vDSO by moving it into a vma · e6e5494c
      Committed by Ingo Molnar
      Move the i386 VDSO down into a vma and thus randomize it.
      
      Besides the security implications, this feature also helps debuggers, which
      can COW a vma-backed VDSO just like a normal DSO and can thus do
      single-stepping and other debugging features.
      
      It's good for hypervisors (Xen, VMWare) too, which typically live in the same
      high-mapped address space as the VDSO, hence whenever the VDSO is used, they
      get lots of guest pagefaults and have to fix such guest accesses up - which
      slows things down instead of speeding things up (the primary purpose of the
      VDSO).
      
      There's a new CONFIG_COMPAT_VDSO (default=y) option, which provides support
      for older glibcs that still rely on a prelinked high-mapped VDSO.  Newer
      distributions (using glibc 2.3.3 or later) can turn this option off.  Turning
      it off is also recommended for security reasons: attackers cannot use the
      predictable high-mapped VDSO page as syscall trampoline anymore.
      
      There is a new vdso=[0|1] boot option as well, and a runtime
      /proc/sys/vm/vdso_enabled sysctl switch, that allows the VDSO to be turned
      on/off.
      
      (This version of the VDSO-randomization patch also has working ELF
      coredumping, the previous patch crashed in the coredumping code.)
      
      This code is a combined work of the exec-shield VDSO randomization
      code and Gerd Hoffmann's hypervisor-centric VDSO patch. Rusty Russell
      started this patch and i completed it.
      
      [akpm@osdl.org: cleanups]
      [akpm@osdl.org: compile fix]
      [akpm@osdl.org: compile fix 2]
      [akpm@osdl.org: compile fix 3]
      [akpm@osdl.org: revert MAXMEM change]
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Arjan van de Ven <arjan@infradead.org>
      Cc: Gerd Hoffmann <kraxel@suse.de>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Zachary Amsden <zach@vmware.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: Jan Beulich <jbeulich@novell.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] i386: use C code for current_thread_info() · c723e084
      Committed by Chuck Ebbert
      Using C code for current_thread_info() lets the compiler optimize it.
      With gcc 4.0.2, kernel is smaller:
      
          text           data     bss     dec     hex filename
       3645212         555556  312024 4512792  44dc18 2.6.17-rc6-nb-post/vmlinux
       3647276         555556  312024 4514856  44e428 2.6.17-rc6-nb/vmlinux
       -------
         -2064
      Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
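      As a rough illustration of what a C version of such a lookup can look
      like, the sketch below assumes that thread_info sits at the base of a
      THREAD_SIZE-aligned stack area, so masking any address inside that area
      yields the base. THREAD_SIZE and stack_area_base() are hypothetical names
      for this sketch, and it works on plain integers rather than the real
      stack pointer.

```c
#include <stdint.h>

/* Assumed size of the per-task stack area; must be a power of two. */
#define THREAD_SIZE 8192u

/*
 * Given any address inside a THREAD_SIZE-aligned stack area, return the
 * address of the start of that area by clearing the low bits.
 */
uintptr_t stack_area_base(uintptr_t addr_in_stack)
{
    return addr_in_stack & ~(uintptr_t)(THREAD_SIZE - 1u);
}
```

      Written as ordinary C, the mask is visible to the compiler, which can
      then combine it with surrounding address arithmetic. That is presumably
      the kind of optimization behind the size reduction reported above.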
  12. 27 Jun, 2006 1 commit
    • [PATCH] i386/x86-64/ia64: Move polling flag into thread_info_status · 495ab9c0
      Committed by Andi Kleen
      During some profiling I noticed that default_idle causes a lot of
      memory traffic. I think that is caused by the atomic operations
      to clear/set the polling flag in thread_info. There is actually
      no reason to make this atomic - only the idle thread does it
      to itself, other CPUs only read it. So I moved it into ti->status.
      
      Converted i386/x86-64/ia64 for now because that was the easiest
      way to fix ACPI which also manipulates these flags in its idle
      function.
      
      Cc: Nick Piggin <npiggin@novell.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Len Brown <len.brown@intel.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  13. 26 Apr, 2006 1 commit
  14. 18 Feb, 2006 1 commit
  15. 19 Jan, 2006 1 commit
    • [PATCH] Handle TIF_RESTORE_SIGMASK for i386 · 283828f3
      Committed by David Howells
      Handle TIF_RESTORE_SIGMASK as added by David Woodhouse's patch entitled:
      
              [PATCH] 2/3 Add TIF_RESTORE_SIGMASK support for arch/powerpc
              [PATCH] 3/3 Generic sys_rt_sigsuspend
      
      It does the following:
      
       (1) Declares TIF_RESTORE_SIGMASK for i386.
      
       (2) Invokes do_signal() when TIF_RESTORE_SIGMASK is set.
      
       (3) Makes do_signal() support TIF_RESTORE_SIGMASK, using the signal mask saved
           in current->saved_sigmask.
      
       (4) Discards sys_rt_sigsuspend() from the arch, using the generic one instead.
      
       (5) Makes sys_sigsuspend() save the signal mask and set TIF_RESTORE_SIGMASK
           rather than attempting to fudge the return registers.
      
       (6) Makes sys_sigsuspend() return -ERESTARTNOHAND rather than looping
           intrinsically.
      
       (7) Makes setup_frame(), setup_rt_frame() and handle_signal() return 0 or
           -EFAULT rather than true/false to be consistent with the rest of the
           kernel.
      
      Due to the fact do_signal() is then only called from one place:
      
       (8) Makes do_signal() no longer have a return value, as it was just being
           ignored; force_sig() takes care of this.
      
       (9) Discards the old sigmask argument to do_signal() as it's no longer
           necessary.
      
      (10) Makes do_signal() static.
      
      (11) Marks the second argument to do_notify_resume() as unused. The unused
           argument should remain in the middle as the arguments are passed in as
           registers, and the ordering is specific in entry.S
      
      Given the way do_signal() is now no longer called from sys_{,rt_}sigsuspend(),
      they no longer need access to the exception frame, and so can just take
      arguments normally.
      
      This patch depends on sys_rt_sigsuspend patch.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: David Woodhouse <dwmw2@infradead.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  16. 13 Jan, 2006 1 commit
  17. 10 Sep, 2005 1 commit
    • kbuild: full dependency check on asm-offsets.h · 86feeaa8
      Committed by Sam Ravnborg
      Building asm-offsets.h has been moved to a separate Kbuild file
      located in the top-level directory. This allows us to share the
      functionality across the architectures.
      
      The old rules in architecture specific Makefiles will die
      in subsequent patches.
      
      Furthermore, the usual kbuild dependency tracking is now used
      when deciding to rebuild asm-offsets.s. So we no longer risk
      missing a rebuild when asm-offsets.c dependencies are touched.
      
      With this common rule-set we now force the same name across
      all architectures. Following patches will fix the rest.
      Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
  18. 05 Sep, 2005 1 commit
    • [PATCH] UML Support - Ptrace: adds the host SYSEMU support, for UML and general usage · ed75e8d5
      Committed by Laurent Vivier
            Jeff Dike <jdike@addtoit.com>,
            Paolo 'Blaisorblade' Giarrusso <blaisorblade_spam@yahoo.it>,
            Bodo Stroesser <bstroesser@fujitsu-siemens.com>
      
      Adds a new ptrace(2) mode, called PTRACE_SYSEMU, resembling PTRACE_SYSCALL
      except that the kernel does not execute the requested syscall; this is useful
      to improve performance for virtual environments, like UML, which want to run
      the syscall on their own.
      
      In fact, using PTRACE_SYSCALL means stopping child execution twice, on entry
      and on exit, and each time you also have two context switches; with SYSEMU you
      avoid the 2nd stop and so save two context switches per syscall.
      
      Also, some architectures don't have support in the host for changing the
      syscall number via ptrace(), which is currently needed to skip syscall
      execution (UML turns any syscall into getpid() to avoid it being executed on
      the host).  Fixing that is hard, while SYSEMU is easier to implement.
      
      * This version of the patch includes some suggestions of Jeff Dike to avoid
        adding any instructions to the syscall fast path, plus some other little
        changes, by myself, to make it work even when the syscall is executed with
        SYSENTER (but I'm unsure about them). It has been widely tested for quite a
        lot of time.
      
      * Various fixes were included to handle the various switches between
        various states, i.e. when for instance a syscall entry is traced with one of
        PT_SYSCALL / _SYSEMU / _SINGLESTEP and another one is used on exit.
        Basically, this is done by remembering which one of them was used even after
        the call to ptrace_notify().
      
      * We're combining TIF_SYSCALL_EMU with TIF_SYSCALL_TRACE or TIF_SINGLESTEP
        to make do_syscall_trace() notice that the current syscall was started with
        SYSEMU on entry, so that no notification ought to be done in the exit path;
        this is a bit of a hack, so this problem is solved in another way in next
        patches.
      
      * Also, the effects of the patch:
      "Ptrace - i386: fix Syscall Audit interaction with singlestep"
      are cancelled; they are restored back in the last patch of this series.
      
      Detailed descriptions of the patches doing this kind of processing follow (but
      I've already summed everything up).
      
      * Fix behaviour when changing interception kind #1.
      
        In do_syscall_trace(), we check the status of the TIF_SYSCALL_EMU flag
        only after doing the debugger notification; but the debugger might have
        changed the status of this flag because he continued execution with
        PTRACE_SYSCALL, so this is wrong.  This patch fixes it by saving the flag
        status before calling ptrace_notify().
      
      * Fix behaviour when changing interception kind #2:
        avoid intercepting syscall on return when using SYSCALL again.
      
        A guest process switching from using PTRACE_SYSEMU to PTRACE_SYSCALL
        crashes.
      
        The problem is in arch/i386/kernel/entry.S.  The current SYSEMU patch
        prevents the syscall handler from being called, but does not prevent
        do_syscall_trace() from being called afterwards for syscall completion
        interception.
      
        The appended patch fixes this.  It reuses the flag TIF_SYSCALL_EMU to
        remember "we come from PTRACE_SYSEMU and now are in PTRACE_SYSCALL", since
        the flag is unused in the depicted situation.
      
      * Fix behaviour when changing interception kind #3:
        avoid intercepting syscall on return when using SINGLESTEP.
      
        When testing 2.6.9 with the skas3.v6 patch and my latest patch, I had
        problems with singlestepping on UML in SKAS with SYSEMU.  It looped
        receiving SIGTRAPs without moving forward.  EIP of the traced process was
        the same for all SIGTRAPs.
      
      What's missing is to handle switching from PTRACE_SYSCALL_EMU to
      PTRACE_SINGLESTEP in a way very similar to what is done for the change from
      PTRACE_SYSCALL_EMU to PTRACE_SYSCALL_TRACE.
      
      I.e., after calling ptrace(PTRACE_SYSEMU), on the return path, the debugger is
      notified and then wakes up the process; the syscall is executed (or skipped,
      when do_syscall_trace() returns 0, i.e.  when using PTRACE_SYSEMU), and
      do_syscall_trace() is called again.  Since we are on the return path of a
      SYSEMU'd syscall, if the wake up is performed through ptrace(PTRACE_SYSCALL),
      we must still avoid notifying the parent of the syscall exit.  Now, this
      behaviour is extended even to resuming with PTRACE_SINGLESTEP.
      Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
      Cc: Jeff Dike <jdike@addtoit.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  19. 24 Jun, 2005 1 commit
    • [PATCH] streamline preempt_count type across archs · dcd497f9
      Committed by Jesper Juhl
      The preempt_count member of struct thread_info is currently either defined
      as int, unsigned int or __s32 depending on arch.  This patch makes the type
      of preempt_count an int on all archs.
      
      Having preempt_count be an unsigned type prevents the catching of
      preempt_count < 0 bugs, and using int on some archs and __s32 on others is
      not exactly "neat" - much nicer when it's just int all over.
      
      A previous version of this patch was already ACK'ed by Robert Love, and the
      only change in this version of the patch compared to the one he ACK'ed is
      that this one also makes sure the preempt_count member is consistently
      commented.
      Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
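      The underflow argument is easy to see in isolation. In this sketch
      (hypothetical helper names, not kernel code), one unbalanced decrement of
      a signed counter at zero produces a negative value that a "< 0" check can
      catch, while the same decrement on an unsigned counter wraps to a large
      positive value and goes unnoticed.

```c
/* One unbalanced decrement of a signed counter. */
int dec_signed(int count)
{
    return count - 1;
}

/* The same decrement on an unsigned counter wraps around (well defined in C). */
unsigned int dec_unsigned(unsigned int count)
{
    return count - 1u;
}

/* A "count < 0" style sanity check only makes sense for the signed type. */
int looks_unbalanced(int count)
{
    return count < 0;
}
```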
  20. 17 Apr, 2005 1 commit
    • Linux-2.6.12-rc2 · 1da177e4
      Committed by Linus Torvalds
      Initial git repository build. I'm not bothering with the full history,
      even though we have it. We can create a separate "historical" git
      archive of that later if we want to, and in the meantime it's about
      3.2GB when imported into git - space that would just make the early
      git days unnecessarily complicated, when we don't have a lot of good
      infrastructure for it.
      
      Let it rip!