1. 28 Jul, 2008 1 commit
  2. 22 Jul, 2008 1 commit
  3. 19 Jul, 2008 1 commit
    • nohz: prevent tick stop outside of the idle loop · b8f8c3cf
      Thomas Gleixner committed
      Jack Ren and Eric Miao tracked down the following long standing
      problem in the NOHZ code:
      
      	scheduler switch to idle task
      	enable interrupts
      
      Window starts here
      
      	----> interrupt happens (does not set NEED_RESCHED)
      	      	irq_exit() stops the tick
      
      	----> interrupt happens (does set NEED_RESCHED)
      
      	return from schedule()
      	
      	cpu_idle(): preempt_disable();
      
      Window ends here
      
      The interrupts can happen at any point inside the race window. The
      first interrupt stops the tick, the second one causes the scheduler to
      rerun and switch away from idle again and we end up with the tick
      disabled.
      
      The fact that it needs two interrupts where the first one does not set
      NEED_RESCHED and the second one does made the bug obscure and extremely
      hard to reproduce and analyse. Kudos to Jack and Eric.
      
      Solution: Limit the NOHZ functionality to the idle loop to make sure
      that we can not run into such a situation ever again.
      
      cpu_idle()
      {
      	preempt_disable();

      	while (1) {
      		/* tell the NOHZ code that we are in the idle loop */
      		tick_nohz_stop_sched_tick(1);

      		while (!need_resched())
      			halt();

      		/* disables NOHZ mode */
      		tick_nohz_restart_sched_tick();
      		preempt_enable_no_resched();
      		schedule();
      		preempt_disable();
      	}
      }
      
      In hindsight we should have done this forever, but ... 
      
      /me grabs a large brown paperbag.
      
      Debugged-by: Jack Ren <jack.ren@marvell.com>
      Debugged-by: eric miao <eric.y.miao@gmail.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
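The race window described in the nohz commit above can be replayed in a small user-space model. This is only a sketch: `struct cpu_model`, `model_irq_exit()` and the `require_idle_loop` switch are hypothetical names invented for illustration and are not kernel symbols. The model only captures the ordering of events, not real interrupt handling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of one CPU; every name here is illustrative, not a kernel symbol. */
struct cpu_model {
    bool current_is_idle;  /* the idle task is the current task */
    bool in_idle_loop;     /* set only while inside the idle loop proper */
    bool need_resched;
    bool tick_stopped;
};

/* Interrupt exit path: may stop the tick when nothing needs to run. */
static void model_irq_exit(struct cpu_model *cpu, bool sets_resched,
                           bool require_idle_loop)
{
    assert(cpu != NULL);
    if (sets_resched)
        cpu->need_resched = true;
    if (!cpu->current_is_idle || cpu->need_resched)
        return;
    if (require_idle_loop && !cpu->in_idle_loop)
        return;  /* the fix: outside the idle loop the tick keeps running */
    cpu->tick_stopped = true;
}

/* Replays the race window. Returns true if a non-idle task ends up
 * running while the tick is still stopped. */
bool race_leaves_tick_stopped(bool require_idle_loop)
{
    struct cpu_model cpu = { .current_is_idle = true };

    /* Scheduler has switched to the idle task; the idle loop is not reached yet. */
    model_irq_exit(&cpu, false, require_idle_loop);  /* first interrupt */
    model_irq_exit(&cpu, true, require_idle_loop);   /* second interrupt */

    /* The scheduler reruns and switches away from idle again. */
    cpu.current_is_idle = false;
    cpu.need_resched = false;
    return cpu.tick_stopped;
}

/* Inside the idle loop the tick may still be stopped, with or without the fix. */
bool idle_loop_can_stop_tick(void)
{
    struct cpu_model cpu = { .current_is_idle = true, .in_idle_loop = true };

    model_irq_exit(&cpu, false, true);
    return cpu.tick_stopped;
}
```

Calling `race_leaves_tick_stopped(false)` models the old behaviour and `race_leaves_tick_stopped(true)` models the restriction to the idle loop.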
  4. 18 Jul, 2008 1 commit
  5. 22 May, 2008 1 commit
  6. 20 May, 2008 1 commit
    • sparc64: Add global register dumping facility. · 93dae5b7
      David S. Miller committed
      When a cpu really is stuck in the kernel, it can often be
      impossible to figure out which cpu is stuck where.  The
      worst case is when the stuck cpu has interrupts disabled.
      
      Therefore, implement a global cpu state capture that uses
      SMP message interrupts which are not disabled by the
      normal IRQ enable/disable APIs of the kernel.
      
      As long as we can get a sysrq 'y' to the kernel, we can
      get a dump.  Even if the console interrupt cpu is wedged,
      we can trigger it from userspace using /proc/sysrq-trigger
      
      The output is made compact so that this facility is more
      useful on high cpu count systems, which is where this
      facility will likely find itself the most useful :)
      Signed-off-by: David S. Miller <davem@davemloft.net>
  7. 08 May, 2008 1 commit
  8. 02 May, 2008 1 commit
    • sparc64: Fix syscall restart, for real... · 2678fefe
      David S. Miller committed
      The change I put into copy_thread() just papered over the real
      problem.
      
      When we check whether to do a syscall restart while delivering a
      signal, we should only interpret the syscall return value as an
      error if the carry condition code(s) are set.
      
      Otherwise it's a success return.
      
      Also, sigreturn paths should do a pt_regs_clear_trap_type().
      
      It turns out that doing a syscall restart when returning from a fork()
      does and should happen, from time to time.  Even if copy_thread()
      returns success, copy_process() can still unwind and signal
      -ERESTARTNOINTR in the parent.
      Signed-off-by: David S. Miller <davem@davemloft.net>
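The rule from the syscall restart commit above (treat the return value as an error only when a carry condition code is set) could be sketched as follows. The register layout is a deliberate simplification: `struct model_regs` and `MODEL_CARRY_MASK` are hypothetical, and the sketch assumes the error code is held as a positive value when carry is set. The three restart codes mirror the kernel-internal values 512 to 514.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: two carry bits, one per condition-code set. */
#define MODEL_CARRY_MASK      UINT64_C(0x3)

/* Kernel-internal restart codes, redefined locally for this sketch. */
#define MODEL_ERESTARTSYS     512
#define MODEL_ERESTARTNOINTR  513
#define MODEL_ERESTARTNOHAND  514

struct model_regs {
    uint64_t cond_codes;  /* carry bits as left by the syscall return path */
    int64_t  retval;      /* assumed positive error code when carry is set */
};

/* Returns true only if the syscall failed (carry set) with a restart code. */
bool wants_syscall_restart(const struct model_regs *regs)
{
    assert(regs != NULL);

    if (!(regs->cond_codes & MODEL_CARRY_MASK))
        return false;  /* success return, whatever the value looks like */

    switch (regs->retval) {
    case MODEL_ERESTARTSYS:
    case MODEL_ERESTARTNOINTR:
    case MODEL_ERESTARTNOHAND:
        return true;
    default:
        return false;
    }
}
```

A successful call that happens to return 513 is left alone because no carry bit is set, which is the case the earlier copy_thread() change only papered over.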
  9. 28 Apr, 2008 1 commit
  10. 27 Apr, 2008 1 commit
    • sparc: Remove old style signal frame support. · 5526b7e4
      David S. Miller committed
      Back around the same time we were bootstrapping the first 32-bit sparc
      Linux kernel with a SunOS userland, we made the signal frame match
      that of SunOS.
      
      By the time we even started putting together a native Linux userland
      for 32-bit Sparc we realized this layout wasn't sufficient for Linux's
      needs.
      
      Therefore we changed the layout, yet kept support for the old style
      signal frame layout in there.  The detection mechanism is that we had
      sys_sigaction() start passing in a negative signal number to indicate
      "new style signal frames please".
      
      Anyways, no binaries exist in the world that use the old stuff.  In
      fact, I bet Jakub Jelinek and myself are the only two people who ever
      had such binaries to be honest.
      
      So let's get rid of this stuff.
      
      I added an assertion using WARN_ON_ONCE() that makes sure 32-bit
      applications are passing in that negative signal number still.
      Signed-off-by: David S. Miller <davem@davemloft.net>
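The detection mechanism in the signal frame commit above (a negative signal number means "new style signal frames please") and the added warn-once assertion might look roughly like this. `normalize_signum()` and its `warned_once` flag are hypothetical stand-ins; the real code uses WARN_ON_ONCE() inside the kernel's sigaction path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: accepts the raw signal number passed by a 32-bit
 * application and returns the real (positive) signal number.
 * A non-negative value would have selected the removed old-style frames,
 * so it triggers a one-time warning instead.
 */
int normalize_signum(int raw, bool *warned_once)
{
    assert(warned_once != NULL);

    if (raw < 0)
        return -raw;  /* new-style request: strip the marker */

    if (!*warned_once) {
        *warned_once = true;
        fprintf(stderr, "old-style signal frame requested for signal %d\n", raw);
    }
    return raw;
}
```

Signal numbers are small, so negating `raw` cannot overflow in practice.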
  11. 24 Mar, 2008 1 commit
  12. 29 Feb, 2008 1 commit
  13. 20 Feb, 2008 4 commits
  14. 19 Feb, 2008 2 commits
  15. 09 Feb, 2008 2 commits
  16. 30 Jul, 2007 1 commit
    • Remove fs.h from mm.h · 4e950f6f
      Alexey Dobriyan committed
      Remove fs.h from mm.h. For this,
       1) Uninline vma_wants_writenotify(). It's pretty huge anyway.
       2) Add back fs.h or less bloated headers (err.h) to files that need it.
      
      As a result, on x86_64 allyesconfig, fs.h dependencies are cut down from
      3929 rebuilt files to 3444 (-12.3%).
      
      Cross-compile tested without regressions on my two usual configs and (sigh):
      
      alpha              arm-mx1ads        mips-bigsur          powerpc-ebony
      alpha-allnoconfig  arm-neponset      mips-capcella        powerpc-g5
      alpha-defconfig    arm-netwinder     mips-cobalt          powerpc-holly
      alpha-up           arm-netx          mips-db1000          powerpc-iseries
      arm                arm-ns9xxx        mips-db1100          powerpc-linkstation
      arm-assabet        arm-omap_h2_1610  mips-db1200          powerpc-lite5200
      arm-at91rm9200dk   arm-onearm        mips-db1500          powerpc-maple
      arm-at91rm9200ek   arm-picotux200    mips-db1550          powerpc-mpc7448_hpc2
      arm-at91sam9260ek  arm-pleb          mips-ddb5477         powerpc-mpc8272_ads
      arm-at91sam9261ek  arm-pnx4008       mips-decstation      powerpc-mpc8313_rdb
      arm-at91sam9263ek  arm-pxa255-idp    mips-e55             powerpc-mpc832x_mds
      arm-at91sam9rlek   arm-realview      mips-emma2rh         powerpc-mpc832x_rdb
      arm-ateb9200       arm-realview-smp  mips-excite          powerpc-mpc834x_itx
      arm-badge4         arm-rpc           mips-fulong          powerpc-mpc834x_itxgp
      arm-carmeva        arm-s3c2410       mips-ip22            powerpc-mpc834x_mds
      arm-cerfcube       arm-shannon       mips-ip27            powerpc-mpc836x_mds
      arm-clps7500       arm-shark         mips-ip32            powerpc-mpc8540_ads
      arm-collie         arm-simpad        mips-jazz            powerpc-mpc8544_ds
      arm-corgi          arm-spitz         mips-jmr3927         powerpc-mpc8560_ads
      arm-csb337         arm-trizeps4      mips-malta           powerpc-mpc8568mds
      arm-csb637         arm-versatile     mips-mipssim         powerpc-mpc85xx_cds
      arm-ebsa110        i386              mips-mpc30x          powerpc-mpc8641_hpcn
      arm-edb7211        i386-allnoconfig  mips-msp71xx         powerpc-mpc866_ads
      arm-em_x270        i386-defconfig    mips-ocelot          powerpc-mpc885_ads
      arm-ep93xx         i386-up           mips-pb1100          powerpc-pasemi
      arm-footbridge     ia64              mips-pb1500          powerpc-pmac32
      arm-fortunet       ia64-allnoconfig  mips-pb1550          powerpc-ppc64
      arm-h3600          ia64-bigsur       mips-pnx8550-jbs     powerpc-prpmc2800
      arm-h7201          ia64-defconfig    mips-pnx8550-stb810  powerpc-ps3
      arm-h7202          ia64-gensparse    mips-qemu            powerpc-pseries
      arm-hackkit        ia64-sim          mips-rbhma4200       powerpc-up
      arm-integrator     ia64-sn2          mips-rbhma4500       s390
      arm-iop13xx        ia64-tiger        mips-rm200           s390-allnoconfig
      arm-iop32x         ia64-up           mips-sb1250-swarm    s390-defconfig
      arm-iop33x         ia64-zx1          mips-sead            s390-up
      arm-ixp2000        m68k              mips-tb0219          sparc
      arm-ixp23xx        m68k-amiga        mips-tb0226          sparc-allnoconfig
      arm-ixp4xx         m68k-apollo       mips-tb0287          sparc-defconfig
      arm-jornada720     m68k-atari        mips-workpad         sparc-up
      arm-kafa           m68k-bvme6000     mips-wrppmc          sparc64
      arm-kb9202         m68k-hp300        mips-yosemite        sparc64-allnoconfig
      arm-ks8695         m68k-mac          parisc               sparc64-defconfig
      arm-lart           m68k-mvme147      parisc-allnoconfig   sparc64-up
      arm-lpd270         m68k-mvme16x      parisc-defconfig     um-x86_64
      arm-lpd7a400       m68k-q40          parisc-up            x86_64
      arm-lpd7a404       m68k-sun3         powerpc              x86_64-allnoconfig
      arm-lubbock        m68k-sun3x        powerpc-cell         x86_64-defconfig
      arm-lusl7200       mips              powerpc-celleb       x86_64-up
      arm-mainstone      mips-atlas        powerpc-chrp32
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  17. 21 Jul, 2007 1 commit
    • [SPARC]: Fix serial console device detection. · c73fcc84
      David S. Miller committed
      The current scheme works on static interpretation of text names, which
      is wrong.
      
      The output-device setting, for example, must be resolved via an alias
      or similar to a full path name to the console device.
      
      Paths also contain an optional set of 'options', which starts with a
      colon at the end of the path.  The option area is used to specify
      which of two serial ports ('a' or 'b') the path refers to when a
      device node drives multiple ports.  'a' is assumed if the option
      specification is missing.
      
      This was caught by the UltraSPARC-T1 simulator.  The 'output-device'
      property was set to 'ttya' and we didn't pick up on the fact that this
      is an OBP alias set to '/virtual-devices/console'.  As a result we saw
      it as the first serial console device, instead of the hypervisor console.
      
      The infrastructure is now there to take advantage of this to resolve
      the console correctly even in multi-head situations in fbcon too.
      
      Thanks to Greg Onufer for the bug report.
      Signed-off-by: David S. Miller <davem@davemloft.net>
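The option handling described in the serial console commit above (options follow a colon at the end of the path, and 'a' is assumed when they are missing) can be illustrated with a short helper. `serial_port_from_path()` is a hypothetical name, and the sketch assumes the alias has already been resolved to a full path and that the port letter is the first option character.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: returns 'a' or 'b' for a fully resolved device path.
 * Only the last path component is searched for the ':' that starts the
 * option area, so colons in earlier components are ignored.
 */
char serial_port_from_path(const char *path)
{
    const char *last_component;
    const char *options;

    assert(path != NULL);

    last_component = strrchr(path, '/');
    if (last_component == NULL)
        last_component = path;

    options = strchr(last_component, ':');
    if (options != NULL && options[1] == 'b')
        return 'b';

    return 'a';  /* default when the option specification is missing */
}
```

For the path from the commit message, `serial_port_from_path("/virtual-devices/console")` falls through to the default because no option area is present.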
  18. 16 Jul, 2007 1 commit
  19. 29 May, 2007 1 commit
  20. 12 May, 2007 1 commit
  21. 09 May, 2007 1 commit
  22. 26 Apr, 2007 1 commit
  23. 10 Mar, 2007 1 commit
    • [SPARC64]: Fix atomicity of TIF update in flush_thread() · c0a79b22
      Mathieu Desnoyers committed
      Fix atomicity of TIF update in flush_thread() for sparc64
      
      Correctly fixes the race by using the *_ti_thread_flag helpers.
      
      Race :
      
      parent process executing :
      sys_ptrace()
       (lock_kernel())
       (ptrace_get_task_struct(pid))
       arch_ptrace()
         ptrace_detach()
           ptrace_disable(child);
             clear_singlestep(child);
               clear_tsk_thread_flag(child, TIF_SINGLESTEP);
               (which clears the TIF_SINGLESTEP flag atomically from a different
                process)
       (put_task_struct(child))
       (unlock_kernel())
      
      And at the same time, in the child process :
      sys_execve()
       do_execve()
         search_binary_handler()
           load_elf_binary()
             flush_old_exec()
               flush_thread()
                 doing a non-atomic thread flag update
      Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Signed-off-by: David S. Miller <davem@davemloft.net>
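Why a non-atomic flag update in one process can undo an atomic clear done by another, as in the flush_thread() race above, is easiest to see when the interleaving is replayed step by step. The sketch below uses C11 atomics in a single thread to make the ordering deterministic. `MODEL_TIF_OTHER` is a hypothetical second flag, and `struct model_thread_info` is a stand-in for the real thread_info.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define MODEL_TIF_SINGLESTEP  (1UL << 0)
#define MODEL_TIF_OTHER       (1UL << 1)  /* hypothetical flag touched by the child */

struct model_thread_info {
    atomic_ulong flags;
};

/* Atomic read-modify-write, in the spirit of the *_ti_thread_flag helpers. */
static void model_clear_ti_flag(struct model_thread_info *ti, unsigned long flag)
{
    assert(ti != NULL);
    atomic_fetch_and(&ti->flags, ~flag);
}

/* Child updates the flags with a separate load and store: the parent's
 * clear lands in between and is overwritten by the stale snapshot. */
unsigned long racy_update_result(void)
{
    struct model_thread_info ti;
    unsigned long snapshot;

    atomic_init(&ti.flags, MODEL_TIF_SINGLESTEP | MODEL_TIF_OTHER);

    snapshot = atomic_load(&ti.flags);                     /* child: read   */
    model_clear_ti_flag(&ti, MODEL_TIF_SINGLESTEP);        /* parent: clear */
    atomic_store(&ti.flags, snapshot & ~MODEL_TIF_OTHER);  /* child: write  */

    return atomic_load(&ti.flags);
}

/* Both sides use the atomic helper: neither update is lost. */
unsigned long atomic_update_result(void)
{
    struct model_thread_info ti;

    atomic_init(&ti.flags, MODEL_TIF_SINGLESTEP | MODEL_TIF_OTHER);

    model_clear_ti_flag(&ti, MODEL_TIF_SINGLESTEP);  /* parent */
    model_clear_ti_flag(&ti, MODEL_TIF_OTHER);       /* child  */

    return atomic_load(&ti.flags);
}
```

In the racy replay the single-step flag reappears after the parent cleared it, which is the lost update the commit removes.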
  24. 01 Jul, 2006 1 commit
  25. 20 Mar, 2006 6 commits
    • [SPARC64]: Use sun4v_cpu_idle() in cpu_idle() on SUN4V. · 30c91d57
      David S. Miller committed
      We have to turn off the "polling nrflag" bit when we sleep
      the cpu like this, so that we'll get a cross-cpu interrupt
      to wake the processor up from the yield.
      
      We also have to disable PSTATE_IE in %pstate around the yield
      call and recheck need_resched() in order to avoid any races.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [SPARC64]: Kill cpudata->idle_volume. · 1bd0cd74
      David S. Miller committed
      Set, but never used.
      
      We used to use this for dynamic IRQ retargeting, but that
      code died a long time ago.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [SPARC64]: Disable smp_report_regs() for now. · 19a0d585
      David S. Miller committed
      For 32 cpus and a slow console, it just wedges the
      machine especially with DETECT_SOFTLOCKUP enabled.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [SPARC64]: Refine register window trap handling. · 314ef685
      David S. Miller committed
      When saving and restoring trap state, do the window spill/fill
      handling inline so that we never trap deeper than 2 trap levels.
      This is important for chips like Niagara.
      
      The window fixup code is massively simplified, and many more
      improvements are now possible.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [SPARC64]: Add infrastructure for dynamic TSB sizing. · 98c5584c
      David S. Miller committed
      This also cleans up tsb_context_switch().  The assembler
      routine is now __tsb_context_switch() and the former is
      an inline function that picks out the bits from the mm_struct
      and passes it into the assembler code as arguments.
      
      setup_tsb_parms() computes the locked TLB entry to map the
      TSB.  Later when we support using the physical address quad
      load instructions of Cheetah+ and later, we'll simply use
      the physical address for the TSB register value and set
      the map virtual and PTE both to zero.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [SPARC64]: Move away from virtual page tables, part 1. · 74bf4312
      David S. Miller committed
      We now use the TSB hardware assist features of the UltraSPARC
      MMUs.
      
      SMP is currently knowingly broken, we need to find another place
      to store the per-cpu base pointers.  We hid them away in the TSB
      base register, and that obviously will not work any more :-)
      
      Another known broken case is non-8KB base page size.
      
      Also noticed that flush_tlb_all() is not referenced anywhere, only
      the internal __flush_tlb_all() (local cpu only) is used by the
      sparc64 port, so we can get rid of flush_tlb_all().
      
      The kernel gets its own 8KB TSB (swapper_tsb) and each address space
      gets its own private 8KB TSB.  Later we can add code to dynamically
      increase the size of the per-process TSB as the RSS grows.  An 8KB TSB is
      good enough for up to about a 4MB RSS, after which the TSB starts to
      incur many capacity and conflict misses.
      
      We even accumulate OBP translations into the kernel TSB.
      
      Another area for refinement is large page size support.  We could use
      a secondary address space TSB to handle those.
      Signed-off-by: David S. Miller <davem@davemloft.net>
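The "8KB TSB is good enough for about a 4MB RSS" figure in the commit above follows from simple arithmetic, sketched here. The 16-byte entry size (an 8-byte tag plus an 8-byte translation entry) is an assumption about the TSB entry format, not something the commit message states; the 8KB page size is the base page size the message refers to.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed TSB entry size: 8-byte tag plus 8-byte translation entry. */
#define MODEL_TSB_ENTRY_BYTES 16u

/* Number of translations that fit in a TSB of the given size. */
size_t tsb_entries(size_t tsb_bytes)
{
    assert(tsb_bytes % MODEL_TSB_ENTRY_BYTES == 0);
    return tsb_bytes / MODEL_TSB_ENTRY_BYTES;
}

/* Memory covered when every entry maps a distinct page of page_bytes. */
size_t tsb_reach_bytes(size_t tsb_bytes, size_t page_bytes)
{
    return tsb_entries(tsb_bytes) * page_bytes;
}
```

Under that assumption an 8KB TSB holds 8192 / 16 = 512 entries, and 512 pages of 8KB cover 4MB, which matches the figure in the message.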
  26. 19 Jan, 2006 1 commit
  27. 13 Jan, 2006 2 commits
  28. 09 Nov, 2005 2 commits
    • [PATCH] sched: resched and cpu_idle rework · 64c7c8f8
      Nick Piggin committed
      Make some changes to the NEED_RESCHED and POLLING_NRFLAG to reduce
      confusion, and make their semantics rigid.  Improves efficiency of
      resched_task and some cpu_idle routines.
      
      * In resched_task:
      - TIF_NEED_RESCHED is only cleared with the task's runqueue lock held,
        and as we hold it during resched_task, then there is no need for an
        atomic test and set there. The only other time this should be set is
        when the task's quantum expires, in the timer interrupt - this is
        protected against because the rq lock is irq-safe.
      
      - If TIF_NEED_RESCHED is set, then we don't need to do anything. It
        won't get unset until the task gets schedule()d off.
      
      - If we are running on the same CPU as the task we resched, then set
        TIF_NEED_RESCHED and no further action is required.
      
      - If we are running on another CPU, and TIF_POLLING_NRFLAG is *not* set
        after TIF_NEED_RESCHED has been set, then we need to send an IPI.
      
      Using these rules, we are able to remove the test and set operation in
      resched_task, and make clear the previously vague semantics of
      POLLING_NRFLAG.
      
      * In idle routines:
      - Enter cpu_idle with preempt disabled. When the need_resched() condition
        becomes true, explicitly call schedule(). This makes things a bit clearer
        (IMO), but haven't updated all architectures yet.
      
      - Many do a test and clear of TIF_NEED_RESCHED for some reason. According
        to the resched_task rules, this isn't needed (and actually breaks the
        assumption that TIF_NEED_RESCHED is only cleared with the runqueue lock
        held). So remove that. Generally one less locked memory op when switching
        to the idle thread.
      
      - Many idle routines clear TIF_POLLING_NRFLAG, and only set it in the
        innermost polling idle loops. The above resched_task semantics allow it
        to stay set until just before the last need_resched() check that precedes
        a halt requiring an interrupt wakeup.
      
        Many idle routines simply never enter such a halt, and so POLLING_NRFLAG
        can be always left set, completely eliminating resched IPIs when rescheduling
        the idle task.
      
        POLLING_NRFLAG width can be increased, to reduce the chance of resched IPIs.
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Con Kolivas <kernel@kolivas.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
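The four resched_task rules listed in the commit above reduce to a small decision function. The following is a user-space sketch only: `struct model_task`, `enum resched_action` and `model_resched_task()` are hypothetical names, the flags are plain booleans rather than thread-info bits, and the runqueue lock that the rules rely on is assumed to be held by the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum resched_action {
    RESCHED_NOTHING,    /* flag was already set, nothing to do */
    RESCHED_FLAG_ONLY,  /* flag set, target will notice on its own */
    RESCHED_SEND_IPI    /* flag set, target must be interrupted */
};

struct model_task {
    bool need_resched;  /* stands in for TIF_NEED_RESCHED */
    bool polling;       /* stands in for TIF_POLLING_NRFLAG */
    int  cpu;           /* CPU the task is running on */
};

/* Caller is assumed to hold the task's runqueue lock. */
enum resched_action model_resched_task(struct model_task *t, int this_cpu)
{
    assert(t != NULL);

    if (t->need_resched)
        return RESCHED_NOTHING;

    /* A plain store is enough: the flag is only cleared under the lock. */
    t->need_resched = true;

    if (t->cpu == this_cpu)
        return RESCHED_FLAG_ONLY;

    /* On real SMP hardware a memory barrier would be needed before
     * reading the polling flag; this single-threaded model omits it. */
    if (!t->polling)
        return RESCHED_SEND_IPI;

    return RESCHED_FLAG_ONLY;
}
```

An idle routine that never halts can leave `polling` set permanently, in which case this function never asks for an IPI.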
    • [PATCH] sched: disable preempt in idle tasks · 5bfb5d69
      Nick Piggin committed
      Run idle threads with preempt disabled.
      
      Also corrected a bug in arm26's cpu_idle (make it actually call schedule()).
      How did it ever work before?
      
      Might fix the CPU hotplugging hang which Nigel Cunningham noted.
      
      We think the bug hits if the idle thread is preempted after checking
      need_resched() and before going to sleep, and the CPU is then offlined.
      
      After calling stop_machine_run, the CPU eventually returns from preemption and
      into the idle thread and goes to sleep.  The CPU will continue executing
      previous idle and have no chance to call play_dead.
      
      By disabling preemption until we are ready to explicitly schedule, this bug is
      fixed and the idle threads generally become more robust.
      
      From: alexs <ashepard@u.washington.edu>
      
        PPC build fix
      
      From: Yoichi Yuasa <yuasa@hh.iij4u.or.jp>
      
        MIPS build fix
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Signed-off-by: Yoichi Yuasa <yuasa@hh.iij4u.or.jp>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>