1. 12 6月, 2006 1 次提交
  2. 18 4月, 2006 1 次提交
    • P
      powerpc: Use correct sequence for putting CPU into nap mode · f39224a8
      Paul Mackerras 提交于
      We weren't using the recommended sequence for putting the CPU into
      nap mode.  When I changed the idle loop, for some reason 7447A cpus
      started hanging when we put them into nap mode.  Changing to the
      recommended sequence fixes that.
      
      The complexity here is that the recommended sequence is a loop that
      keeps putting the cpu back into nap mode.  Clearly we need some way
      to break out of the loop when an interrupt (external interrupt,
      decrementer, performance monitor) occurs.  Here we use a bit in
      the thread_info struct to indicate that we need this, and the exception
      entry code notices this and arranges for the exception to return
      to the value in the link register, thus breaking out of the loop.
      We use a new `local_flags' field in the thread_info which we can
      alter without needing to use an atomic update sequence.
      
      The PPC970 has the same recommended sequence, so we do the same thing
      there too.
      
      This also fixes a bug in the kernel stack overflow handling code on
      32-bit, since it was causing a value that we needed in a register to
      get trashed.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      f39224a8
  3. 28 3月, 2006 1 次提交
  4. 08 3月, 2006 1 次提交
    • P
      powerpc: Fix various syscall/signal/swapcontext bugs · 1bd79336
      Paul Mackerras 提交于
      A careful reading of the recent changes to the system call entry/exit
      paths revealed several problems, plus some things that could be
      simplified and improved:
      
      * 32-bit wasn't testing the _TIF_NOERROR bit in the syscall fast exit
        path, so it was only doing anything with it once it saw some other
        bit being set.  In other words, the noerror behaviour would apply to
        the next system call where we had to reschedule or deliver a signal,
        which is not necessarily the current system call.
      
      * 32-bit wasn't doing the call to ptrace_notify in the syscall exit
        path when the _TIF_SINGLESTEP bit was set.
      
      * _TIF_RESTOREALL was in both _TIF_USER_WORK_MASK and
        _TIF_PERSYSCALL_MASK, which is odd since _TIF_RESTOREALL is only set
        by system calls.  I took it out of _TIF_USER_WORK_MASK.
      
      * On 64-bit, _TIF_RESTOREALL wasn't causing the non-volatile registers
        to be restored (unless perhaps a signal was delivered or the syscall
        was traced or single-stepped).  Thus the non-volatile registers
        weren't restored on exit from a signal handler.  We probably got
        away with it mostly because signal handlers written in C wouldn't
        alter the non-volatile registers.
      
      * On 32-bit I simplified the code and made it more like 64-bit by
        making the syscall exit path jump to ret_from_except to handle
        preemption and signal delivery.
      
      * 32-bit was calling do_signal unnecessarily when _TIF_RESTOREALL was
        set - but I think because of that 32-bit was actually restoring the
        non-volatile registers on exit from a signal handler.
      
      * I changed the order of enabling interrupts and saving the
        non-volatile registers before calling do_syscall_trace_leave; now we
        enable interrupts first.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      1bd79336
  5. 24 2月, 2006 1 次提交
    • P
      powerpc: Implement accurate task and CPU time accounting · c6622f63
      Paul Mackerras 提交于
      This implements accurate task and cpu time accounting for 64-bit
      powerpc kernels.  Instead of accounting a whole jiffy of time to a
      task on a timer interrupt because that task happened to be running at
      the time, we now account time in units of timebase ticks according to
      the actual time spent by the task in user mode and kernel mode.  We
      also count the time spent processing hardware and software interrupts
      accurately.  This is conditional on CONFIG_VIRT_CPU_ACCOUNTING.  If
      that is not set, we do tick-based approximate accounting as before.
      
      To get this accurate information, we read either the PURR (processor
      utilization of resources register) on POWER5 machines, or the timebase
      on other machines on
      
      * each entry to the kernel from usermode
      * each exit to usermode
      * transitions between process context, hard irq context and soft irq
        context in kernel mode
      * context switches.
      
      On POWER5 systems with shared-processor logical partitioning we also
      read both the PURR and the timebase at each timer interrupt and
      context switch in order to determine how much time has been taken by
      the hypervisor to run other partitions ("steal" time).  Unfortunately,
      since we need values of the PURR on both threads at the same time to
      accurately calculate the steal time, and since we can only calculate
      steal time on a per-core basis, the apportioning of the steal time
      between idle time (time which we ceded to the hypervisor in the idle
      loop) and actual stolen time is somewhat approximate at the moment.
      
      This is all based quite heavily on what s390 does, and it uses the
      generic interfaces that were added by the s390 developers,
      i.e. account_system_time(), account_user_time(), etc.
      
      This patch doesn't add any new interfaces between the kernel and
      userspace, and doesn't change the units in which time is reported to
      userspace by things such as /proc/stat, /proc/<pid>/stat, getrusage(),
      times(), etc.  Internally the various task and cpu times are stored in
      timebase units, but they are converted to USER_HZ units (1/100th of a
      second) when reported to userspace.  Some precision is therefore lost
      but there should not be any accumulating error, since the internal
      accumulation is at full precision.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      c6622f63
  6. 13 1月, 2006 1 次提交
    • D
      [PATCH] powerpc: Remove lppaca structure from the PACA · 3356bb9f
      David Gibson 提交于
      At present the lppaca - the structure shared with the iSeries
      hypervisor and phyp - is contained within the PACA, our own low-level
      per-cpu structure.  This doesn't have to be so, the patch below
      removes it, making a separate array of lppaca structures.
      
      This saves approximately 500*NR_CPUS bytes of image size and kernel
      memory, because we don't need aligning gap between the Linux and
      hypervisor portions of every PACA.  On the other hand it means an
      extra level of dereference in many accesses to the lppaca.
      
      The patch also gets rid of several places where we assign the paca
      address to a local variable for no particular reason.
      Signed-off-by: NDavid Gibson <dwg@au1.ibm.com>
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      3356bb9f
  7. 09 1月, 2006 2 次提交
    • D
      [PATCH] powerpc: Remove some unneeded fields from the paca · 404849bb
      David Gibson 提交于
      This patch removes several unnecessary fields from the paca:
      
      - next_jiffy_update_tb was simply unused.  Remove trivially.
      
      - The exdsi exception save area was not used.  There were plans to use
        it, but they never seem to have gone anywhere.  If they ever do, we
        can put it back.  Remove from the paca, and from asm-offsets.c
      
      - The default_decr field was used from asm, but was only ever assigned
        the value of tb_ticks_per_jiffy.  Just access tb_ticks_per_jiffy from
        asm directly instead.
      
      Built and booted on POWER5 LPAR and iSeries RS64.
      Signed-off-by: NDavid Gibson <dwg@au1.ibm.com>
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      404849bb
    • D
      [PATCH] syscall entry/exit revamp · 401d1f02
      David Woodhouse 提交于
      This cleanup patch speeds up the null syscall path on ppc64 by about 3%,
      and brings the ppc32 and ppc64 code slightly closer together.
      
      The ppc64 code was checking current_thread_info()->flags twice in the
      syscall exit path; once for TIF_SYSCALL_T_OR_A before disabling
      interrupts, and then again for TIF_SIGPENDING|TIF_NEED_RESCHED etc after
      disabling interrupts. Now we do the same as ppc32 -- check the flags
      only once in the fast path, and re-enable interrupts if necessary in the
      ptrace case.
      
      The patch abolishes the 'syscall_noerror' member of struct thread_info
      and replaces it with a TIF_NOERROR bit in the flags, which is handled in
      the slow path. This shortens the syscall entry code, which no longer
      needs to clear syscall_noerror.
      
      The patch adds a TIF_SAVE_NVGPRS flag which causes the syscall exit slow
      path to save the non-volatile GPRs into a signal frame. This removes the
      need for the assembly wrappers around sys_sigsuspend(),
      sys_rt_sigsuspend(), et al which existed solely to save those registers
      in advance. It also means I don't have to add new wrappers for ppoll()
      and pselect(), which is what I was supposed to be doing when I got
      distracted into this...
      
      Finally, it unifies the ppc64 and ppc32 methods of handling syscall exit
      directly into a signal handler (as required by sigsuspend et al) by
      introducing a TIF_RESTOREALL flag which causes _all_ the registers to be
      reloaded from the pt_regs by taking the ret_from_exception path, instead
      of the normal syscall exit path which stomps on the callee-saved GPRs.
      
      It appears to pass an LTP test run on ppc64, and passes basic testing on
      ppc32 too. Brief tests of ptrace functionality with strace and gdb also
      appear OK. I wouldn't send it to Linus for 2.6.15 just yet though :)
      Signed-off-by: NDavid Woodhouse <dwmw2@infradead.org>
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      401d1f02
  8. 14 11月, 2005 1 次提交
  9. 11 11月, 2005 1 次提交
    • B
      [PATCH] powerpc: Merge vdso's and add vdso support to 32 bits kernel · a7f290da
      Benjamin Herrenschmidt 提交于
      This patch moves the vdso's to arch/powerpc, adds support for the 32
      bits vdso to the 32 bits kernel, rename systemcfg (finally !), and adds
      some new (still untested) routines to both vdso's: clock_gettime() with
      support for CLOCK_REALTIME and CLOCK_MONOTONIC, clock_getres() (same
      clocks) and get_tbfreq() for glibc to retreive the timebase frequency.
      
      Tom,Steve: The implementation of get_tbfreq() I've done for 32 bits
      returns a long long (r3, r4) not a long. This is such that if we ever
      add support for >4Ghz timebases on ppc32, the userland interface won't
      have to change.
      
      I have tested gettimeofday() using some glibc patches in both ppc32 and
      ppc64 kernels using 32 bits userland (I haven't had a chance to test a
      64 bits userland yet, but the implementation didn't change and was
      tested earlier). I haven't tested yet the new functions.
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      a7f290da
  10. 10 11月, 2005 1 次提交
  11. 07 11月, 2005 1 次提交
    • B
      [PATCH] ppc64: support 64k pages · 3c726f8d
      Benjamin Herrenschmidt 提交于
      Adds a new CONFIG_PPC_64K_PAGES which, when enabled, changes the kernel
      base page size to 64K.  The resulting kernel still boots on any
      hardware.  On current machines with 4K pages support only, the kernel
      will maintain 16 "subpages" for each 64K page transparently.
      
      Note that while real 64K capable HW has been tested, the current patch
      will not enable it yet as such hardware is not released yet, and I'm
      still verifying with the firmware architects the proper to get the
      information from the newer hypervisors.
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      3c726f8d
  12. 02 11月, 2005 1 次提交
  13. 28 10月, 2005 1 次提交
  14. 26 10月, 2005 1 次提交
    • P
      powerpc: Merge rtas.c into arch/powerpc/kernel · 033ef338
      Paul Mackerras 提交于
      This splits arch/ppc64/kernel/rtas.c into arch/powerpc/kernel/rtas.c,
      which contains generic RTAS functions useful on any CHRP platform,
      and arch/powerpc/platforms/pseries/rtas-fw.[ch], which contain
      some pSeries-specific firmware flashing bits.  The parts of rtas.c
      that are to do with pSeries-specific error logging are protected
      by a new CONFIG_RTAS_ERROR_LOGGING symbol.  The inclusion of rtas.o
      is controlled by the CONFIG_PPC_RTAS symbol, and the relevant
      platforms select that.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      033ef338
  15. 21 10月, 2005 1 次提交
    • D
      [PATCH] powerpc: Merge thread_info.h · 6cb7bfeb
      David Gibson 提交于
      Merge ppc32 and ppc64 versions of thread_info.h.  They were pretty
      similar already, the chief changes are:
      
      	- Instead of inline asm to implement current_thread_info(),
      which needs to be different for ppc32 and ppc64, we use C with an
      asm("r1") register variable.  gcc turns it into the same asm as we
      used to have for both platforms.
      	- We replace ppc32's 'local_flags' with the ppc64
      'syscall_noerror' field.  The noerror flag was in fact the only thing
      in the local_flags field anyway, so the ppc64 approach is simpler, and
      means we only need a load-immediate/store instead of load/mask/store
      when clearing the flag.
      	- In readiness for 64k pages, when THREAD_SIZE will be less
      than a page, ppc64 used kmalloc() rather than get_free_pages() to
      allocate the kernel stack.  With this patch we do the same for ppc32,
      since there's no strong reason not to.
      	- For ppc64, we no longer export THREAD_SHIFT and THREAD_SIZE
      via asm-offsets, thread_info.h can now be safely included in asm, as
      on ppc32.
      
      Built and booted on G4 Powerbook (ARCH=ppc and ARCH=powerpc) and
      Power5 (ARCH=ppc64 and ARCH=powerpc).
      Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      6cb7bfeb
  16. 11 10月, 2005 1 次提交
    • P
      ppc: Various minor compile fixes · fd582ec8
      Paul Mackerras 提交于
      This fixes up a variety of minor problems in compiling with ARCH=ppc
      arising from using the merged versions of various header files.
      A lot of the changes are just adding #include <asm/machdep.h> to
      files that use ppc_md or smp_ops_t.
      
      This also arranges for us to use semaphore.c, vecemu.c, vector.S and
      fpu.S from arch/powerpc/kernel when compiling with ARCH=ppc.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      fd582ec8
  17. 10 10月, 2005 1 次提交
  18. 30 9月, 2005 1 次提交
  19. 26 9月, 2005 1 次提交
    • P
      powerpc: Merge enough to start building in arch/powerpc. · 14cf11af
      Paul Mackerras 提交于
      This creates the directory structure under arch/powerpc and a bunch
      of Kconfig files.  It does a first-cut merge of arch/powerpc/mm,
      arch/powerpc/lib and arch/powerpc/platforms/powermac.  This is enough
      to build a 32-bit powermac kernel with ARCH=powerpc.
      
      For now we are getting some unmerged files from arch/ppc/kernel and
      arch/ppc/syslib, or arch/ppc64/kernel.  This makes some minor changes
      to files in those directories and files outside arch/powerpc.
      
      The boot directory is still not merged.  That's going to be interesting.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      14cf11af
  20. 29 8月, 2005 1 次提交
    • D
      [PATCH] Dynamic hugepage addresses for ppc64 · c594adad
      David Gibson 提交于
      Paulus, I think this is now a reasonable candidate for the post-2.6.13
      queue.
      
      Relax address restrictions for hugepages on ppc64
      
      Presently, 64-bit applications on ppc64 may only use hugepages in the
      address region from 1-1.5T.  Furthermore, if hugepages are enabled in
      the kernel config, they may only use hugepages and never normal pages
      in this area.  This patch relaxes this restriction, allowing any
      address to be used with hugepages, but with a 1TB granularity.  That
      is if you map a hugepage anywhere in the region 1TB-2TB, that entire
      area will be reserved exclusively for hugepages for the remainder of
      the process's lifetime.  This works analagously to hugepages in 32-bit
      applications, where hugepages can be mapped anywhere, but with 256MB
      (mmu segment) granularity.
      
      This patch applies on top of the four level pagetable patch
      (http://patchwork.ozlabs.org/linuxppc64/patch?id=1936).
      Signed-off-by: NDavid Gibson <dwg@au1.ibm.com>
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      c594adad
  21. 27 8月, 2005 1 次提交
    • D
      Fix missing audit_syscall_exit() on ppc64 sigsuspend exit path · 17888225
      David Woodhouse 提交于
      When we leave sigsuspend() directly into a signal handler, we don't want
      to go via the normal syscall exit path -- it'll corrupt r4 and r5 which
      are supposed to be giving information to the signal handler, and it'll
      give us one more single-step SIGTRAP than we need if single-stepping is
      in operation.
      
      However, we _should_ be calling audit_syscall_exit(), which would
      normally get invoked in that patch. It's not wonderfully pretty, but I
      suspect the best answer is just to call it directly...
      Signed-off-by: NDavid Woodhouse <dwmw2@infradead.org>
      17888225
  22. 22 6月, 2005 1 次提交
  23. 17 4月, 2005 1 次提交
    • L
      Linux-2.6.12-rc2 · 1da177e4
      Linus Torvalds 提交于
      Initial git repository build. I'm not bothering with the full history,
      even though we have it. We can create a separate "historical" git
      archive of that later if we want to, and in the meantime it's about
      3.2GB when imported into git - space that would just make the early
      git days unnecessarily complicated, when we don't have a lot of good
      infrastructure for it.
      
      Let it rip!
      1da177e4