1. 27 10月, 2005 3 次提交
  2. 24 10月, 2005 5 次提交
    • O
      [PATCH] posix-timers: fix posix_cpu_timer_set() vs run_posix_cpu_timers() race · a69ac4a7
      Oleg Nesterov 提交于
      This might be harmless, but looks like a race from code inspection (I
      was unable to trigger it).  I must admit, I don't understand why we
      can't return TIMER_RETRY after 'spin_unlock(&p->sighand->siglock)'
      without doing bump_cpu_timer(), but this is what original code does.
      
      posix_cpu_timer_set:
      
      	read_lock(&tasklist_lock);
      
      	spin_lock(&p->sighand->siglock);
      	list_del_init(&timer->it.cpu.entry);
      	spin_unlock(&p->sighand->siglock);
      
      We are probaly deleting the timer from run_posix_cpu_timers's 'firing'
      local list_head while run_posix_cpu_timers() does list_for_each_safe.
      
      Various bad things can happen, for example we can just delete this timer
      so that list_for_each() will not notice it and run_posix_cpu_timers()
      will not reset '->firing' flag. In that case,
      
      	....
      
      	if (timer->it.cpu.firing) {
      		read_unlock(&tasklist_lock);
      		timer->it.cpu.firing = -1;
      		return TIMER_RETRY;
      	}
      
      sys_timer_settime() goes to 'retry:', calls posix_cpu_timer_set() again,
      it returns TIMER_RETRY ...
      Signed-off-by: NOleg Nesterov <oleg@tv-sign.ru>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      a69ac4a7
    • O
      [PATCH] posix-timers: exit path cleanup · ca531a0a
      Oleg Nesterov 提交于
      No need to rebalance when task exited
      Signed-off-by: NOleg Nesterov <oleg@tv-sign.ru>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      ca531a0a
    • O
      [PATCH] posix-timers: remove false BUG_ON() from run_posix_cpu_timers() · 3de463c7
      Oleg Nesterov 提交于
      do_exit() clears ->it_##clock##_expires, but nothing prevents
      another cpu to attach the timer to exiting process after that.
      
      After exit_notify() does 'write_unlock_irq(&tasklist_lock)' and
      before do_exit() calls 'schedule() local timer interrupt can find
      tsk->exit_state != 0. If that state was EXIT_DEAD (or another cpu
      does sys_wait4) interrupted task has ->signal == NULL.
      
      At this moment exiting task has no pending cpu timers, they were cleaned
      up in __exit_signal()->posix_cpu_timers_exit{,_group}(), so we can just
      return from irq.
      Signed-off-by: NOleg Nesterov <oleg@tv-sign.ru>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      3de463c7
    • O
      [PATCH] posix-timers: fix cleanup_timers() and run_posix_cpu_timers() races · 108150ea
      Oleg Nesterov 提交于
      1. cleanup_timers() sets timer->task = NULL under tasklist + ->sighand locks.
         That means that this code in posix_cpu_timer_del() and posix_cpu_timer_set()
      
         		lock_timer(timer);
      		if (timer->task == NULL)
      			return;
      		read_lock(tasklist);
      		put_task_struct(timer->task)
      
         is racy. With this patch timer->task modified and accounted only under
         timer->it_lock. Sadly, this means that dead task_struct won't be freed
         until timer deleted or armed.
      
      2. run_posix_cpu_timers() collects expired timers into local list under
         tasklist + ->sighand again. That means that posix_cpu_timer_del()
         should check timer->it.cpu.firing under these locks too.
      Signed-off-by: NOleg Nesterov <oleg@tv-sign.ru>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      108150ea
    • L
      Posix timers: limit number of timers firing at once · e80eda94
      Linus Torvalds 提交于
      Bursty timers aren't good for anybody, very much including latency for
      other programs when we trigger lots of timers in interrupt context.  So
      set a random limit, after which we'll handle the rest on the next timer
      tick.
      
      Noted by Oleg Nesterov <oleg@tv-sign.ru>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      e80eda94
  3. 22 10月, 2005 2 次提交
  4. 20 10月, 2005 2 次提交
    • A
      [PATCH] Threads shouldn't inherit PF_NOFREEZE · d1209d04
      Alan Stern 提交于
      The PF_NOFREEZE process flag should not be inherited when a thread is
      forked.  This patch (as585) removes the flag from the child.
      
      This problem is starting to show up more and more as drivers turn to the
      kthread API instead of using kernel_thread().  As a result, their kernel
      threads are now children of the kthread worker instead of modprobe, and
      they inherit the PF_NOFREEZE flag.  This can cause problems during system
      suspend; the kernel threads are not getting frozen as they ought to be.
      Signed-off-by: NAlan Stern <stern@rowland.harvard.edu>
      Acked-by: NPavel Machek <pavel@ucw.cz>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      d1209d04
    • R
      [PATCH] Fix cpu timers exit deadlock and races · e03d13e9
      Roland McGrath 提交于
      Oleg Nesterov reported an SMP deadlock.  If there is a running timer
      tracking a different process's CPU time clock when the process owning
      the timer exits, we deadlock on tasklist_lock in posix_cpu_timer_del via
      exit_itimers.
      
      That code was using tasklist_lock to check for a race with __exit_signal
      being called on the timer-target task and clearing its ->signal.
      However, there is actually no such race.  __exit_signal will have called
      posix_cpu_timers_exit and posix_cpu_timers_exit_group before it does
      that.  Those will clear those k_itimer's association with the dying
      task, so posix_cpu_timer_del will return early and never reach the code
      in question.
      
      In addition, posix_cpu_timer_del called from exit_itimers during execve
      or directly from timer_delete in the process owning the timer can race
      with an exiting timer-target task to cause a double put on timer-target
      task struct.  Make sure we always access cpu_timers lists with sighand
      lock held.
      Signed-off-by: NRoland McGrath <roland@redhat.com>
      Signed-off-by: NChris Wright <chrisw@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      e03d13e9
  5. 18 10月, 2005 3 次提交
  6. 15 10月, 2005 1 次提交
  7. 11 10月, 2005 1 次提交
    • H
      [PATCH] Fix signal sending in usbdevio on async URB completion · 46113830
      Harald Welte 提交于
      If a process issues an URB from userspace and (starts to) terminate
      before the URB comes back, we run into the issue described above.  This
      is because the urb saves a pointer to "current" when it is posted to the
      device, but there's no guarantee that this pointer is still valid
      afterwards.
      
      In fact, there are three separate issues:
      
      1) the pointer to "current" can become invalid, since the task could be
         completely gone when the URB completion comes back from the device.
      
      2) Even if the saved task pointer is still pointing to a valid task_struct,
         task_struct->sighand could have gone meanwhile.
      
      3) Even if the process is perfectly fine, permissions may have changed,
         and we can no longer send it a signal.
      
      So what we do instead, is to save the PID and uid's of the process, and
      introduce a new kill_proc_info_as_uid() function.
      Signed-off-by: NHarald Welte <laforge@gnumonks.org>
      [ Fixed up types and added symbol exports ]
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      46113830
  8. 10 10月, 2005 1 次提交
    • R
      [PATCH] x86_64: Set up safe page tables during resume · 3dd08325
      Rafael J. Wysocki 提交于
      The following patch makes swsusp avoid the possible temporary corruption
      of page translation tables during resume on x86-64.  This is achieved by
      creating a copy of the relevant page tables that will not be modified by
      swsusp and can be safely used by it on resume.
      
      The problem is that during resume on x86-64 swsusp may temporarily
      corrupt the page tables used for the direct mapping of RAM.  If that
      happens, a page fault occurs and cannot be handled properly, which leads
      to the solid hang of the affected system.  This leads to the loss of the
      system's state from before suspend and may result in the loss of data or
      the corruption of filesystems, so it is a serious issue.  Also, it
      appears to happen quite often (for me, as often as 50% of the time).
      
      The problem is related to the fact that (at least) one of the PMD
      entries used in the direct memory mapping (starting at PAGE_OFFSET)
      points to a page table the physical address of which is much greater
      than the physical address of the PMD entry itself.  Moreover,
      unfortunately, the physical address of the page table before suspend
      (i.e.  the one stored in the suspend image) happens to be different to
      the physical address of the corresponding page table used during resume
      (i.e.  the one that is valid right before swsusp_arch_resume() in
      arch/x86_64/kernel/suspend_asm.S is executed).  Thus while the image is
      restored, the "offending" PMD entry gets overwritten, so it does not
      point to the right physical address any more (i.e.  there's no page
      table at the address pointed to by it, because it points to the address
      the page table has been at during suspend).  Consequently, if the PMD
      entry is used later on, and it _is_ used in the process of copying the
      image pages, a page fault occurs, but it cannot be handled in the normal
      way and the system hangs.
      
      In principle we can call create_resume_mapping() from
      swsusp_arch_resume() (ie.  from suspend_asm.S), but then the memory
      allocations in create_resume_mapping(), resume_pud_mapping(), and
      resume_pmd_mapping() must be made carefully so that we use _only_
      NosaveFree pages in them (the other pages are overwritten by the loop in
      swsusp_arch_resume()).  Additionally, we are in atomic context at that
      time, so we cannot use GFP_KERNEL.  Moreover, if one of the allocations
      fails, we should free all of the allocated pages, so we need to trace
      them somehow.
      
      All of this is done in the appended patch, except that the functions
      populating the page tables are located in arch/x86_64/kernel/suspend.c
      rather than in init.c.  It may be done in a more elegan way in the
      future, with the help of some swsusp patches that are in the works now.
      
      [AK: move some externs into headers, renamed a function]
      Signed-off-by: NRafael J. Wysocki <rjw@sisk.pl>
      Signed-off-by: NAndi Kleen <ak@suse.de>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      3dd08325
  9. 09 10月, 2005 2 次提交
    • A
      [PATCH] gfp flags annotations - part 1 · dd0fc66f
      Al Viro 提交于
       - added typedef unsigned int __nocast gfp_t;
      
       - replaced __nocast uses for gfp flags with gfp_t - it gives exactly
         the same warnings as far as sparse is concerned, doesn't change
         generated code (from gcc point of view we replaced unsigned int with
         typedef) and documents what's going on far better.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      dd0fc66f
    • O
      [PATCH] fix do_coredump() vs SIGSTOP race · 788e05a6
      Oleg Nesterov 提交于
      Let's suppose we have 2 threads in thread group:
      	A - does coredump
      	B - has pending SIGSTOP
      
      thread A						thread B
      
      do_coredump:						get_signal_to_deliver:
      
        lock(->sighand)
        ->signal->flags = SIGNAL_GROUP_EXIT
        unlock(->sighand)
      
      							lock(->sighand)
      							signr = dequeue_signal()
      								->signal->flags |= SIGNAL_STOP_DEQUEUED
      								return SIGSTOP;
      
      							do_signal_stop:
      							    unlock(->sighand)
      
        coredump_wait:
      
            zap_threads:
                lock(tasklist_lock)
                send SIGKILL to B
                    // signal_wake_up() does nothing
                unlock(tasklist_lock)
      
      							    lock(tasklist_lock)
      							    lock(->sighand)
      							    re-check sig->flags & SIGNAL_STOP_DEQUEUED, yes
      							    set_current_state(TASK_STOPPED);
      							    finish_stop:
      							        schedule();
      							            // ->state == TASK_STOPPED
      
            wait_for_completion(&startup_done)
               // waits for complete() from B,
               // ->state == TASK_UNINTERRUPTIBLE
      
      We can't wake up 'B' in any way:
      
      	SIGCONT will be ignored because handle_stop_signal() sees
      	->signal->flags & SIGNAL_GROUP_EXIT.
      
      	sys_kill(SIGKILL)->__group_complete_signal() will choose
      	uninterruptible 'A', so it can't help.
      
      	sys_tkill(B, SIGKILL) will be ignored by specific_send_sig_info()
      	because B already has pending SIGKILL.
      
      This scenario is not possbile if 'A' does do_group_exit(), because
      it sets sig->flags = SIGNAL_GROUP_EXIT and delivers SIGKILL to
      subthreads atomically, holding both tasklist_lock and sighand->lock.
      That means that do_signal_stop() will notice !SIGNAL_STOP_DEQUEUED
      after re-locking ->sighand. And it is not possible to any other
      thread to re-add SIGNAL_STOP_DEQUEUED later, because dequeue_signal()
      can only return SIGKILL.
      
      I think it is better to change do_coredump() to do sigaddset(SIGKILL)
      and signal_wake_up() under sighand->lock, but this patch is much
      simpler.
      Signed-off-by: NOleg Nesterov <oleg@tv-sign.ru>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      788e05a6
  10. 02 10月, 2005 1 次提交
    • L
      Fix inequality comparison against "task->state" · 14bf01bb
      Linus Torvalds 提交于
      We should always use bitmask ops, rather than depend on some ordering of
      the different states.  With the TASK_NONINTERACTIVE flag, the inequality
      doesn't really work.
      
      Oleg Nesterov argues (likely correctly) that this test is unnecessary in
      the first place.  However, the minimal fix for now is to at least make
      it work in the presense of TASK_NONINTERACTIVE.  Waiting for consensus
      from Roland & co on potential bigger cleanups.
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      14bf01bb
  11. 30 9月, 2005 2 次提交
  12. 28 9月, 2005 5 次提交
  13. 24 9月, 2005 1 次提交
    • L
      Make sure SIGKILL gets proper respect · 188a1eaf
      Linus Torvalds 提交于
      Bhavesh P. Davda <bhavesh@avaya.com> noticed that SIGKILL wouldn't
      properly kill a process under just the right cicumstances: a stopped
      task that already had another signal queued would get the SIGKILL
      queued onto the shared queue, and there it would remain until SIGCONT.
      
      This simplifies the signal acceptance logic, and fixes the bug in the
      process.
      
      Losely based on an earlier patch by Bhavesh.
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      188a1eaf
  14. 23 9月, 2005 5 次提交
  15. 22 9月, 2005 1 次提交
    • A
      [PATCH] Add printk_clock() · 31f6d9d6
      Andrew Morton 提交于
      ia64's sched_clock() accesses per-cpu data which isn't set up at boot time.
      Hence ia64 cannot use printk timestamping, because printk() will crash in
      sched_clock().
      
      So make printk() use printk_clock(), defaulting to sched_clock(), overrideable
      by the architecture via attribute(weak).
      
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      31f6d9d6
  16. 18 9月, 2005 3 次提交
  17. 14 9月, 2005 1 次提交
  18. 13 9月, 2005 1 次提交