1. 26 2月, 2009 12 次提交
    • I
      time: ntp: make 64-bit constants more robust · 2b9d1496
      Ingo Molnar 提交于
      Impact: cleanup, no functionality changed
      
       - make PPM_SCALE an explicit s64 constant, to
         remove (s64) casts from usage sites.
      
      kernel/time/ntp.o:
      
         text	   data	    bss	    dec	    hex	filename
         2536	    114	    136	   2786	    ae2	ntp.o.before
         2536	    114	    136	   2786	    ae2	ntp.o.after
      
      md5:
         40a7728d1188aa18e83e21a81fa7b150  ntp.o.before.asm
         40a7728d1188aa18e83e21a81fa7b150  ntp.o.after.asm
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      2b9d1496
    • I
      time: ntp: refactor do_adjtimex() some more · e9629165
      Ingo Molnar 提交于
      Impact: cleanup, no functionality changed
      
      Further simplify do_adjtimex():
      
       - introduce the ntp_start_leap_timer() helper function
       - eliminate the goto adj_done complication
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      e9629165
    • I
      time: ntp: refactor do_adjtimex() · 80f22571
      Ingo Molnar 提交于
      Impact: cleanup, no functionality changed
      
      do_adjtimex() is currently a monster function with a maze of
      branches. Refactor the txc->modes setting aspects of it into
      two new helper functions:
      
      	process_adj_status()
      	process_adjtimex_modes()
      
      kernel/time/ntp.o:
      
         text	   data	    bss	    dec	    hex	filename
         2512	    114	    136	   2762	    aca	ntp.o.before
         2512	    114	    136	   2762	    aca	ntp.o.after
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      80f22571
    • I
      time: ntp: fix bug in ntp_update_offset() & do_adjtimex() · 10dd31a7
      Ingo Molnar 提交于
      Impact: change (fix) the way the NTP PLL seconds offset is initialized/tracked
      
      Fix a bug and do a micro-optimization:
      
      When PLL is enabled we do not reset time_reftime. If the PLL
      was off for a long time (for example after bootup), this is
      arguably the wrong thing to do.
      
      We already had a hack for the common boot-time case in
      ntp_update_offset(), in form of:
      
      	if (unlikely(time_status & STA_FREQHOLD || time_reftime == 0))
       		secs = 0;
      
      But the update delta should be reset later on too - not just when
      the PLL is enabled for the first time after bootup.
      
      So do it on !STA_PLL -> STA_PLL transitions.
      
      This changes behavior, as previously if ntpd was disabled for
      a long time and we restarted it, we'd run from that last update,
      with a very large delta.
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      10dd31a7
    • I
      time: ntp: micro-optimize ntp_update_offset() · c7986acb
      Ingo Molnar 提交于
      Impact: cleanup, no functionality changed
      
      The time_reftime update in ntp_update_offset() to xtime.tv_sec
      is a convoluted way of saying that we want to freeze the frequency
      and want the 'secs' delta to be 0. Also make this branch unlikely.
      
      This shaves off 8 bytes from the code size:
      
         text	   data	    bss	    dec	    hex	filename
         2504	    114	    136	   2754	    ac2	ntp.o.before
         2496	    114	    136	   2746	    aba	ntp.o.after
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      c7986acb
    • I
      time: ntp: simplify ntp_update_offset_fll() · 478b7aab
      Ingo Molnar 提交于
      Impact: cleanup, no functionality changed
      
      Change ntp_update_offset_fll() to delta logic instead of
      absolute value logic. This eliminates 'freq_adj' from the
      function.
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      478b7aab
    • I
      time: ntp: refactor and clean up ntp_update_offset() · f939890b
      Ingo Molnar 提交于
      Impact: cleanup, no functionality changed
      
      - introduce the ntp_update_offset_fll() helper
      - clean up the flow and variable naming
      
      kernel/time/ntp.o:
      
         text	   data	    bss	    dec	    hex	filename
         2504	    114	    136	   2754	    ac2	ntp.o.before
         2504	    114	    136	   2754	    ac2	ntp.o.after
      
      md5:
         01f7b8e1a5472a3056f9e4ae84d46315  ntp.o.before.asm
         01f7b8e1a5472a3056f9e4ae84d46315  ntp.o.after.asm
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      f939890b
    • I
      time: ntp: refactor up ntp_update_frequency() · bc26c31d
      Ingo Molnar 提交于
      Impact: cleanup, no functionality changed
      
      Change ntp_update_frequency() from a hard to follow code
      flow that uses global variables as temporaries, to a clean
      input+output flow.
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      bc26c31d
    • I
      time: ntp: clean up ntp_update_frequency() · 9ce616aa
      Ingo Molnar 提交于
      Impact: cleanup, no functionality changed
      
      Prepare a refactoring of ntp_update_frequency().
      
      kernel/time/ntp.o:
      
         text	   data	    bss	    dec	    hex	filename
         2504	    114	    136	   2754	    ac2	ntp.o.before
         2504	    114	    136	   2754	    ac2	ntp.o.after
      
      md5:
         41f3009debc9b397d7394dd77d912f0a  ntp.o.before.asm
         41f3009debc9b397d7394dd77d912f0a  ntp.o.after.asm
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      9ce616aa
    • I
      time: ntp: simplify the MAX_TICKADJ_SCALED definition · bbd12676
      Ingo Molnar 提交于
      Impact: cleanup, no functionality changed
      
      There's an ugly u64 typecase in the MAX_TICKADJ_SCALED definition,
      this can be eliminated by making the MAX_TICKADJ constant's type
      64-bit (signed).
      
      kernel/time/ntp.o:
      
         text	   data	    bss	    dec	    hex	filename
         2504	    114	    136	   2754	    ac2	ntp.o.before
         2504	    114	    136	   2754	    ac2	ntp.o.after
      
      md5:
         41f3009debc9b397d7394dd77d912f0a  ntp.o.before.asm
         41f3009debc9b397d7394dd77d912f0a  ntp.o.after.asm
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      bbd12676
    • I
      time: ntp: simplify the second_overflow() code flow · 3c972c24
      Ingo Molnar 提交于
      Impact: cleanup, no functionality changed
      
      Instead of a hierarchy of conditions, transform them to clean
      gradual conditions and return's.
      
      This makes the flow easier to read and makes the purpose of
      the function easier to understand.
      
      kernel/time/ntp.o:
      
         text	   data	    bss	    dec	    hex	filename
         2552	    170	    168	   2890	    b4a	ntp.o.before
         2552	    170	    168	   2890	    b4a	ntp.o.after
      
      md5:
         eae1275df0b7d6290c13f6f6f8f05c8c  ntp.o.before.asm
         eae1275df0b7d6290c13f6f6f8f05c8c  ntp.o.after.asm
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      3c972c24
    • I
      time: ntp: clean up kernel/time/ntp.c · 53bbfa9e
      Ingo Molnar 提交于
      Impact: cleanup, no functionality changed
      
      Make this file a bit more readable by applying a consistent coding style.
      
      No code changed:
      
      kernel/time/ntp.o:
      
         text	   data	    bss	    dec	    hex	filename
         2552	    170	    168	   2890	    b4a	ntp.o.before
         2552	    170	    168	   2890	    b4a	ntp.o.after
      
      md5:
         eae1275df0b7d6290c13f6f6f8f05c8c  ntp.o.before.asm
         eae1275df0b7d6290c13f6f6f8f05c8c  ntp.o.after.asm
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      53bbfa9e
  2. 19 2月, 2009 3 次提交
  3. 18 2月, 2009 1 次提交
  4. 16 2月, 2009 2 次提交
  5. 14 2月, 2009 1 次提交
  6. 13 2月, 2009 1 次提交
    • P
      timers: more consistently use clock vs timer · 3997ad31
      Peter Zijlstra 提交于
      While reviewing the manpages, I noticed I'd missed some clock vs timer sites.
      
      Make sure that all timer functions call cpu_timer_sample_group() and not
      cpu_clock_sample_group(). This ensures that we enable the process wide timer
      in time, and therefore pay the O(n) thread group cost from the syscall.
      
      Not doing it here, will result in the first jiffy tick after setting the timer
      doing this, resulting in a very expensive tick (but only once) and a delay in
      actually starting the timer.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      3997ad31
  7. 12 2月, 2009 4 次提交
  8. 11 2月, 2009 4 次提交
  9. 10 2月, 2009 1 次提交
  10. 09 2月, 2009 5 次提交
  11. 07 2月, 2009 1 次提交
  12. 06 2月, 2009 3 次提交
    • J
      wait: prevent exclusive waiter starvation · 777c6c5f
      Johannes Weiner 提交于
      With exclusive waiters, every process woken up through the wait queue must
      ensure that the next waiter down the line is woken when it has finished.
      
      Interruptible waiters don't do that when aborting due to a signal.  And if
      an aborting waiter is concurrently woken up through the waitqueue, noone
      will ever wake up the next waiter.
      
      This has been observed with __wait_on_bit_lock() used by
      lock_page_killable(): the first contender on the queue was aborting when
      the actual lock holder woke it up concurrently.  The aborted contender
      didn't acquire the lock and therefor never did an unlock followed by
      waking up the next waiter.
      
      Add abort_exclusive_wait() which removes the process' wait descriptor from
      the waitqueue, iff still queued, or wakes up the next waiter otherwise.
      It does so under the waitqueue lock.  Racing with a wake up means the
      aborting process is either already woken (removed from the queue) and will
      wake up the next waiter, or it will remove itself from the queue and the
      concurrent wake up will apply to the next waiter after it.
      
      Use abort_exclusive_wait() in __wait_event_interruptible_exclusive() and
      __wait_on_bit_lock() when they were interrupted by other means than a wake
      up through the queue.
      
      [akpm@linux-foundation.org: coding-style fixes]
      Reported-by: NChris Mason <chris.mason@oracle.com>
      Signed-off-by: NJohannes Weiner <hannes@cmpxchg.org>
      Mentored-by: NOleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Matthew Wilcox <matthew@wil.cx>
      Cc: Chuck Lever <cel@citi.umich.edu>
      Cc: Nick Piggin <nickpiggin@yahoo.com.au>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: <stable@kernel.org>		["after some testing"]
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      777c6c5f
    • A
      revert "rlimit: permit setting RLIMIT_NOFILE to RLIM_INFINITY" · 60fd760f
      Andrew Morton 提交于
      Revert commit 0c2d64fb because it causes
      (arguably poorly designed) existing userspace to spend interminable
      periods closing billions of not-open file descriptors.
      
      We could bring this back, with some sort of opt-in tunable in /proc, which
      defaults to "off".
      
      Peter's alanysis follows:
      
      : I spent several hours trying to get to the bottom of a serious
      : performance issue that appeared on one of our servers after upgrading to
      : 2.6.28.  In the end it's what could be considered a userspace bug that
      : was triggered by a change in 2.6.28.  Since this might also affect other
      : people I figured I'd at least document what I found here, and maybe we
      : can even do something about it:
      :
      :
      : So, I upgraded some of debian.org's machines to 2.6.28.1 and immediately
      : the team maintaining our ftp archive complained that one of their
      : scripts that previously ran in a few minutes still hadn't even come
      : close to being done after an hour or so.  Downgrading to 2.6.27 fixed
      : that.
      :
      : Turns out that script is forking a lot and something in it or python or
      : whereever closes all the file descriptors it doesn't want to pass on.
      : That is, it starts at zero and goes up to ulimit -n/RLIMIT_NOFILE and
      : closes them all with a few exceptions.
      :
      : Turns out that takes a long time when your limit -n is now 2^20 (1048576).
      :
      : With 2.6.27.* the ulimit -n was the standard 1024, but with 2.6.28 it is
      : now a thousand times that.
      :
      : 2.6.28 included a patch titled "rlimit: permit setting RLIMIT_NOFILE to
      : RLIM_INFINITY" (0c2d64fb)[1] that
      : allows, as the title implies, to set the limit for number of files to
      : infinity.
      :
      : Closer investigation showed that the broken default ulimit did not apply
      : to "system" processes (like stuff started from init).  In the end I
      : could establish that all processes that passed through pam_limit at one
      : point had the bad resource limit.
      :
      : Apparently the pam library in Debian etch (4.0) initializes the limits
      : to some default values when it doesn't have any settings in limit.conf
      : to override them.  Turns out that for nofiles this is RLIM_INFINITY.
      : Commenting out "case RLIMIT_NOFILE" in pam_limit.c:267 of our pam
      : package version 0.79-5 fixes that - tho I'm not sure what side effects
      : that has.
      :
      : Debian lenny (the upcoming 5.0 version) doesn't have this issue as it
      : uses a different pam (version).
      Reported-by: NPeter Palfrader <weasel@debian.org>
      Cc: Adam Tkac <vonsch@gmail.com>
      Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
      Cc: <stable@kernel.org>		[2.6.28.x]
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      60fd760f
    • A
      kernel/async.c: fix printk warnings · 58763a29
      Andrew Morton 提交于
      alpha:
      
      kernel/async.c: In function 'run_one_entry':
      kernel/async.c:141: warning: format '%lli' expects type 'long long int', but argument 2 has type 'async_cookie_t'
      kernel/async.c:149: warning: format '%lli' expects type 'long long int', but argument 2 has type 'async_cookie_t'
      kernel/async.c:149: warning: format '%lld' expects type 'long long int', but argument 4 has type 's64'
      kernel/async.c: In function 'async_synchronize_cookie_special':
      kernel/async.c:250: warning: format '%lli' expects type 'long long int', but argument 3 has type 's64'
      
      Cc: Arjan van de Ven <arjan@infradead.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      58763a29
  13. 05 2月, 2009 2 次提交
    • P
      timers: split process wide cpu clocks/timers · 4cd4c1b4
      Peter Zijlstra 提交于
      Change the process wide cpu timers/clocks so that we:
      
       1) don't mess up the kernel with too many threads,
       2) don't have a per-cpu allocation for each process,
       3) have no impact when not used.
      
      In order to accomplish this we're going to split it into two parts:
      
       - clocks; which can take all the time they want since they run
                 from user context -- ie. sys_clock_gettime(CLOCK_PROCESS_CPUTIME_ID)
      
       - timers; which need constant time sampling but since they're
                 explicity used, the user can pay the overhead.
      
      The clock readout will go back to a full sum of the thread group, while the
      timers will run of a global 'clock' that only runs when needed, so only
      programs that make use of the facility pay the price.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Reviewed-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      4cd4c1b4
    • P
      signal: re-add dead task accumulation stats. · 32bd671d
      Peter Zijlstra 提交于
      We're going to split the process wide cpu accounting into two parts:
      
       - clocks; which can take all the time they want since they run
                 from user context.
      
       - timers; which need constant time tracing but can affort the overhead
                 because they're default off -- and rare.
      
      The clock readout will go back to a full sum of the thread group, for this
      we need to re-add the exit stats that were removed in the initial itimer
      rework (f06febc9: timers: fix itimer/many thread hang).
      
      Furthermore, since that full sum can be rather slow for large thread groups
      and we have the complete dead task stats, revert the do_notify_parent time
      computation.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Reviewed-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      32bd671d