1. 08 Feb, 2006 4 commits
  2. 07 Feb, 2006 1 commit
  3. 02 Feb, 2006 2 commits
  4. 19 Jan, 2006 2 commits
  5. 18 Jan, 2006 1 commit
    • [PATCH] Fix sparse parse error in lppaca.h · c6b3feaf
      Authored by Bryan O'Sullivan
      sparse can't parse a struct definition in include/asm-powerpc/lppaca.h,
      even though gcc can accept it.  The form looks like this:
      
              struct __attribute__((whatever)) foo { };
      
      An equivalent that both gcc and sparse can handle is
      
              struct foo { } __attribute__((whatever));
      
      This is the only definition of this type in the tree, and fixing it is
      easier than fixing sparse.
      Signed-off-by: Bryan O'Sullivan <bos@serpentine.com>
      [ Side note: fixing sparse wouldn't be hard, but the "attribute at the
        end" version is the canonical one, and the one that makes sense. So
        let's just fix the kernel instead. Luc Van Oostenryck already sent
        out a sparse patch to the sparse mailing list in case anybody cares.
                     -- Linus ]
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
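As an illustration of the "attribute at the end" form described in this commit, here is a minimal sketch. The struct name, its fields, and the 128-byte alignment are hypothetical stand-ins and are not taken from lppaca.h.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a structure shared with a hypervisor.
 * Field names and the alignment value are assumptions for illustration. */
struct demo_shared_area {
	unsigned int desc;
	unsigned short size;
} __attribute__((aligned(128)));	/* attribute placed after the closing brace */

/* Reports the alignment the compiler applied to the type. */
static size_t demo_shared_area_align(void)
{
	return _Alignof(struct demo_shared_area);
}
```

With GCC-compatible compilers the attribute in this position applies to the struct type itself, so the alignment is visible through `_Alignof` and `sizeof`.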
  6. 15 Jan, 2006 1 commit
  7. 14 Jan, 2006 2 commits
    • [PATCH] powerpc: oprofile cpu type names clash with other code · 7a45fb19
      Authored by Andy Whitcroft
      In 2.6.15-git6 a change was committed in the oprofile support in
      the powerpc architecture.  It introduced the powerpc_oprofile_type
      which contains the define G4.  This causes a name clash with the
      existing wacom usb tablet driver.
      
            CC [M]  drivers/usb/input/wacom.o
          drivers/usb/input/wacom.c:98: error: conflicting types for `G4'
          include/asm/cputable.h:37: error: previous declaration of `G4'
            CC [M]  drivers/usb/mon/mon_text.o
          make[3]: *** [drivers/usb/input/wacom.o] Error 1
          make[2]: *** [drivers/usb/input] Error 2
      
      The elements of an enum declared in global scope are effectively
      global identifiers themselves.  As such we need to ensure the names
      are unique.  This patch updates the later oprofile support to use
      unique names.
      Signed-off-by: Andy Whitcroft <apw@shadowen.org>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
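The point about enumerators being global identifiers can be sketched as follows. All names here are hypothetical; the prefixing scheme is one plausible way to make the names unique, not necessarily the exact names the patch chose.

```c
#include <assert.h>

/* Hypothetical prefixed enumerators: the prefix keeps them out of the
 * way of unrelated code that is compiled with the same header. */
enum demo_oprofile_type {
	DEMO_OPROFILE_INVALID = 0,
	DEMO_OPROFILE_G4 = 1,
};

/* A second, unrelated enum (standing in for a driver's model list) can
 * use the short name G4 without a "conflicting types" error, because
 * the first enum no longer claims it. */
enum demo_tablet_model {
	PENPARTNER_DEMO,
	G4,
};

/* Both identifiers coexist in the same scope with different values. */
static int demo_names_are_distinct(void)
{
	return DEMO_OPROFILE_G4 == 1 && G4 == 1 && PENPARTNER_DEMO == 0;
}
```

If both enums declared a bare `G4`, the translation unit would fail to compile, which is the situation the build log above shows.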
    • powerpc: Provide a suitable AT_PLATFORM value · 80f15dc7
      Authored by Paul Mackerras
      The glibc folks want to use AT_PLATFORM to select between possible
      alternative versions of shared libraries.  This commit makes the kernel
      supply an AT_PLATFORM string that indicates what class of processor
      we are running on.  Processors with the same set of user-level
      instructions and roughly the same instruction scheduling characteristics
      are given the same AT_PLATFORM value; for example, 821, 823 and 860
      are all reported as "ppc823", and 7447, 7447A, 7448, 7450, 7451, 7455
      are all called "ppc7450".
      
      The intention is that the AT_PLATFORM values match the values that
      gcc accepts for the -mcpu= option.  For values which are numeric
      (e.g. -mcpu=750), "ppc" has been prepended.
      
      This also adds a PPC_FEATURE_BOOKE bit to the AT_HWCAP value and sets
      it for the 440 family and the Freescale 85xx family.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
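The naming rule described in this commit can be sketched as a small lookup. The table holds only the two groupings the message names; the table layout, the function names, and the pass-through behaviour for non-numeric names are assumptions for illustration, not the kernel's cputable.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical table: only the groupings quoted in the commit message. */
struct demo_cpu_class {
	const char *model;
	const char *platform;
};

static const struct demo_cpu_class demo_classes[] = {
	{ "821", "ppc823" }, { "823", "ppc823" }, { "860", "ppc823" },
	{ "7447", "ppc7450" }, { "7447A", "ppc7450" }, { "7448", "ppc7450" },
	{ "7450", "ppc7450" }, { "7451", "ppc7450" }, { "7455", "ppc7450" },
};

/* Copies prefix+name into buf; returns NULL if the result does not fit. */
static const char *demo_copy(char *buf, size_t len,
			     const char *prefix, const char *name)
{
	int n = snprintf(buf, len, "%s%s", prefix, name);

	return (n < 0 || (size_t)n >= len) ? NULL : buf;
}

/* Maps a CPU model name to an AT_PLATFORM-style string. */
static const char *demo_at_platform(const char *model, char *buf, size_t len)
{
	size_t i;

	for (i = 0; i < sizeof(demo_classes) / sizeof(demo_classes[0]); i++)
		if (strcmp(model, demo_classes[i].model) == 0)
			return demo_copy(buf, len, "", demo_classes[i].platform);

	/* Numeric -mcpu= style values get "ppc" prepended; passing other
	 * names through unchanged is an assumption of this sketch. */
	if (isdigit((unsigned char)model[0]))
		return demo_copy(buf, len, "ppc", model);
	return demo_copy(buf, len, "", model);
}
```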
  8. 13 Jan, 2006 10 commits
    • [PATCH] powerpc: reformat atomic_add_unless · b11fa580
      Authored by Anton Blanchard
      It makes my eyes hurt.
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
    • [PATCH] powerpc: use lwsync in atomics, bitops, lock functions · 144b9c13
      Authored by Anton Blanchard
      eieio is only a store - store ordering. When used to order an unlock
      operation, loads may leak out of the critical region. This is potentially
      buggy; one example is a user who wants to atomically read a couple of
      values.
      
      We can solve this with an lwsync which orders everything except store - load.
      
      I removed the (now unused) EIEIO_ON_SMP macros and the C versions
      isync_on_smp and eieio_on_smp now that we don't use them. I also removed some
      old comments that were used to identify inline spinlocks in assembly;
      they don't make sense now that our locks are out of line.
      
      Another interesting thing was that read_unlock was using an eieio even
      though the rest of the spinlock code had already been converted to
      use lwsync.
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
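The ordering requirement in this commit (an unlock must keep earlier loads as well as earlier stores inside the critical region) has a portable analogue in C11 release semantics. The sketch below uses C11 atomics rather than the PowerPC assembly the patch touches; the lock type and function names are hypothetical. On PowerPC, compilers commonly implement a release operation with lwsync, but that mapping is an assumption here and not something the sketch depends on.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical minimal spinlock built on a C11 atomic flag. */
struct demo_lock {
	atomic_flag held;
};

static void demo_lock_acquire(struct demo_lock *l)
{
	/* Acquire: accesses inside the critical region cannot be
	 * reordered before the lock is taken. */
	while (atomic_flag_test_and_set_explicit(&l->held,
						 memory_order_acquire))
		;	/* spin until the flag was previously clear */
}

static void demo_lock_release(struct demo_lock *l)
{
	/* Release: earlier loads AND stores are ordered before the
	 * unlocking store, which a store-store barrier alone would
	 * not guarantee for the loads. */
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```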
    • [PATCH] powerpc: Remove lppaca structure from the PACA · 3356bb9f
      Authored by David Gibson
      At present the lppaca - the structure shared with the iSeries
      hypervisor and phyp - is contained within the PACA, our own low-level
      per-cpu structure.  This doesn't have to be so; the patch below
      removes it, making a separate array of lppaca structures.
      
      This saves approximately 500*NR_CPUS bytes of image size and kernel
      memory, because we don't need an alignment gap between the Linux and
      hypervisor portions of every PACA.  On the other hand it means an
      extra level of dereference in many accesses to the lppaca.
      
      The patch also gets rid of several places where we assign the paca
      address to a local variable for no particular reason.
      Signed-off-by: David Gibson <dwg@au1.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
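The trade-off described in this commit (no per-structure alignment gap, at the cost of one extra dereference) can be sketched with two layouts. Every name, field, and size below is a hypothetical stand-in; the real PACA and lppaca are much larger and differently aligned.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_NR_CPUS 4

/* Hypothetical hypervisor-shared area with a strict alignment. */
struct demo_lppaca {
	unsigned char data[64];
} __attribute__((aligned(256)));

/* Embedded layout: the aligned member forces padding after the
 * Linux-only fields in every per-cpu structure. */
struct demo_paca_embedded {
	unsigned long linux_field;
	struct demo_lppaca lppaca;
};

/* Split layout: the per-cpu structure only carries a pointer. */
struct demo_paca_split {
	unsigned long linux_field;
	struct demo_lppaca *lppaca_ptr;
};

static struct demo_lppaca demo_lppaca_array[DEMO_NR_CPUS];
static struct demo_paca_split demo_pacas[DEMO_NR_CPUS];

/* Points each per-cpu structure at its slot in the separate array. */
static void demo_init_pacas(void)
{
	int i;

	for (i = 0; i < DEMO_NR_CPUS; i++)
		demo_pacas[i].lppaca_ptr = &demo_lppaca_array[i];
}
```

Accesses then go through `demo_pacas[cpu].lppaca_ptr->data`, which is the extra level of dereference the message mentions.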
    • [PATCH] powerpc: Cleanup LOADADDR etc. asm macros · e58c3495
      Authored by David Gibson
      This patch consolidates the variety of macros used for loading 32 or
      64-bit constants in assembler (LOADADDR, LOADBASE, SET_REG_TO_*).  The
      idea is to make the set of macros consistent across 32 and 64 bit and
      to make it more obvious which is the appropriate one to use in a given
      situation.  The new macros and their semantics are described in the
      comments in ppc_asm.h.
      
      In the process, we change several places that were unnecessarily using
      immediate loads on ppc64 to use the GOT/TOC.  Likewise we clean up a
      couple of places where we were clumsily subtracting PAGE_OFFSET with
      asm instructions to use assemble-time arithmetic or the toreal() macro
      instead.
      Signed-off-by: David Gibson <dwg@au1.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
    • [PATCH] powerpc: Add of_find_property function · ecaa8b0f
      Authored by Dave C Boutcher
      Add an of_find_property function that returns a struct property
      given a property name.  Then change the get_property function to
      use that routine internally.
      Signed-off-by: Dave Boutcher <sleddog@us.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
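The refactoring described in this commit, a lookup that returns the property record plus a value getter layered on top of it, might look roughly like the sketch below. The struct layouts and the `demo_` function names are simplified assumptions, not the kernel's definitions, and locking is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified, hypothetical property record and device node. */
struct demo_property {
	const char *name;
	int length;
	void *value;
	struct demo_property *next;
};

struct demo_node {
	struct demo_property *properties;
};

/* Returns the property record with the given name, or NULL. */
static struct demo_property *demo_find_property(const struct demo_node *np,
						const char *name)
{
	struct demo_property *pp;

	for (pp = np->properties; pp != NULL; pp = pp->next)
		if (strcmp(pp->name, name) == 0)
			return pp;
	return NULL;
}

/* Returns only the value, built on top of the record lookup. */
static void *demo_get_property(const struct demo_node *np,
			       const char *name, int *lenp)
{
	struct demo_property *pp = demo_find_property(np, name);

	if (pp == NULL)
		return NULL;
	if (lenp != NULL)
		*lenp = pp->length;
	return pp->value;
}
```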
    • [PATCH] powerpc: Add/remove/update properties in firmware device tree · 088186de
      Authored by Dave C Boutcher
      Add support for updating and removing device tree
      properties.  Since we hand out pointers to properties with gay
      abandon, we can't just free the property storage.  Instead we
      move deleted, or the old copy of an updated property, to a
      "dead properties" list.
      
      Also note, it's not feasible to kref device tree properties.
      We call get_property() all over the kernel in a wild variety
      of contexts.
      
      One consequence of this change is that we now take a
      read_lock(&devtree_lock) when doing get_property().
      Signed-off-by: Dave Boutcher <sleddog@us.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
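The "dead properties" idea from this commit can be sketched as an unlink-and-park operation: the record leaves the live list but is never freed, so pointers handed out earlier stay valid. The types and names are hypothetical, and the devtree_lock handling the message mentions is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical property record and node with a live and a dead list. */
struct demo_prop {
	const char *name;
	struct demo_prop *next;
};

struct demo_devnode {
	struct demo_prop *properties;	/* live properties */
	struct demo_prop *deadprops;	/* removed, but still allocated */
};

/* Moves prop from the live list to the dead list.
 * Returns 0 on success, -1 if prop is not on the live list. */
static int demo_remove_property(struct demo_devnode *np,
				struct demo_prop *prop)
{
	struct demo_prop **link;

	for (link = &np->properties; *link != NULL; link = &(*link)->next) {
		if (*link == prop) {
			*link = prop->next;		/* unlink from live list */
			prop->next = np->deadprops;	/* park instead of freeing */
			np->deadprops = prop;
			return 0;
		}
	}
	return -1;
}
```

An update would follow the same pattern: link in the new copy, then park the old one on the dead list.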
    • [PATCH] powerpc: Add some more pSeries hypervisor call constants · 43ccf202
      Authored by Dave C Boutcher
      Adds a few more hypervisor call constants.
      Signed-off-by: Dave Boutcher <sleddog@us.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
    • [PATCH] death of get_thread_info/put_thread_info · f5a61d0c
      Authored by Al Viro
      {get,put}_thread_info() were introduced in 2.5.4 and have never
      been called by anything in the tree.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] scheduler cache-hot-autodetect · 198e2f18
      Authored by akpm@osdl.org
      
      From: Ingo Molnar <mingo@elte.hu>
      
      This is the latest version of the scheduler cache-hot-auto-tune patch.
      
      The first problem was that detection time scaled with O(N^2), which is
      unacceptable on larger SMP and NUMA systems. To solve this:
      
      - I've added a 'domain distance' function, which is used to cache
        measurement results. Each distance is only measured once. This means
        that e.g. on NUMA distances of 0, 1 and 2 might be measured, on HT
        distances 0 and 1, and on SMP distance 0 is measured. The code walks
        the domain tree to determine the distance, so it automatically follows
        whatever hierarchy an architecture sets up. This cuts down on the boot
        time significantly and removes the O(N^2) limit. The only assumption
        is that migration costs can be expressed as a function of domain
        distance - this covers the overwhelming majority of existing systems,
        and is a good guess even for more asymmetric systems.
      
        [ People hacking systems that have asymmetries that break this
          assumption (e.g. different CPU speeds) should experiment a bit with
          the cpu_distance() function. Adding a ->migration_distance factor to
          the domain structure would be one possible solution - but let's first
          see the problem systems, if they exist at all. Let's not overdesign. ]
      
      Another problem was that only a single cache-size was used for measuring
      the cost of migration, and most architectures didn't set that variable
      up. Furthermore, a single cache-size does not fit NUMA hierarchies with
      L3 caches and does not fit HT setups, where different CPUs will often
      have different 'effective cache sizes'. To solve this problem:
      
      - Instead of relying on a single cache-size provided by the platform and
        sticking to it, the code now auto-detects the 'effective migration
        cost' between two measured CPUs, via iterating through a wide range of
        cachesizes. The code searches for the maximum migration cost, which
        occurs when the working set of the test-workload falls just below the
        'effective cache size'. I.e. real-life optimized search is done for
        the maximum migration cost, between two real CPUs.
      
        This, amongst other things, has the positive effect that if e.g. two
        CPUs share a L2/L3 cache, a different (and accurate) migration cost
        will be found than between two CPUs on the same system that don't share
        any caches.
      
      (The reliable measurement of migration costs is tricky - see the source
      for details.)
      
      Furthermore I've added various boot-time options to override/tune
      migration behavior.
      
      Firstly, there's a blanket override for autodetection:
      
      	migration_cost=1000,2000,3000
      
      will override the depth 0/1/2 values with 1msec/2msec/3msec values.
      
      Secondly, there's a global factor that can be used to increase (or
      decrease) the autodetected values:
      
      	migration_factor=120
      
      will increase the autodetected values by 20%. This option is useful to
      tune things in a workload-dependent way - e.g. if a workload is
      cache-insensitive then CPU utilization can be maximized by specifying
      migration_factor=0.
      
      I've tested the autodetection code quite extensively on x86, on 3
      P3/Xeon/2MB, and the autodetected values look pretty good:
      
      Dual Celeron (128K L2 cache):
      
       ---------------------
       migration cost matrix (max_cache_size: 131072, cpu: 467 MHz):
       ---------------------
                 [00]    [01]
       [00]:     -     1.7(1)
       [01]:   1.7(1)    -
       ---------------------
       cacheflush times [2]: 0.0 (0) 1.7 (1784008)
       ---------------------
      
      Here the slow memory subsystem dominates system performance, and even
      though caches are small, the migration cost is 1.7 msecs.
      
      Dual HT P4 (512K L2 cache):
      
       ---------------------
       migration cost matrix (max_cache_size: 524288, cpu: 2379 MHz):
       ---------------------
                 [00]    [01]    [02]    [03]
       [00]:     -     0.4(1)  0.0(0)  0.4(1)
       [01]:   0.4(1)    -     0.4(1)  0.0(0)
       [02]:   0.0(0)  0.4(1)    -     0.4(1)
       [03]:   0.4(1)  0.0(0)  0.4(1)    -
       ---------------------
       cacheflush times [2]: 0.0 (33900) 0.4 (448514)
       ---------------------
      
      Here it can be seen that there is no migration cost between two HT
      siblings (CPU#0/2 and CPU#1/3 are separate physical CPUs). A fast memory
      system makes inter-physical-CPU migration pretty cheap: 0.4 msecs.
      
      8-way P3/Xeon [2MB L2 cache]:
      
       ---------------------
       migration cost matrix (max_cache_size: 2097152, cpu: 700 MHz):
       ---------------------
                 [00]    [01]    [02]    [03]    [04]    [05]    [06]    [07]
       [00]:     -    19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1)
       [01]:  19.2(1)    -    19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1)
       [02]:  19.2(1) 19.2(1)    -    19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1)
       [03]:  19.2(1) 19.2(1) 19.2(1)    -    19.2(1) 19.2(1) 19.2(1) 19.2(1)
       [04]:  19.2(1) 19.2(1) 19.2(1) 19.2(1)    -    19.2(1) 19.2(1) 19.2(1)
       [05]:  19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1)    -    19.2(1) 19.2(1)
       [06]:  19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1)    -    19.2(1)
       [07]:  19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1)    -
       ---------------------
       cacheflush times [2]: 0.0 (0) 19.2 (19281756)
       ---------------------
      
      This one has huge caches and a relatively slow memory subsystem - so the
      migration cost is 19 msecs.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Ashok Raj <ashok.raj@intel.com>
      Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
      Cc: <wilder@us.ibm.com>
      Signed-off-by: John Hawkes <hawkes@sgi.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
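Two ideas from this commit lend themselves to a short sketch: caching one measurement per domain distance so that each distance is measured only once, and scaling the result by a percentage factor as migration_factor does. The function names, the fake measurement, and the table size are hypothetical. Microsecond units are inferred from the message's "1000 means 1 msec" example.

```c
#include <assert.h>

#define DEMO_MAX_DISTANCE 3

/* Cached cost per domain distance, in microseconds (assumed unit).
 * Zero means "not measured yet". */
static unsigned long long demo_cost[DEMO_MAX_DISTANCE];
static unsigned int demo_measure_calls;

/* Hypothetical stand-in for the expensive boot-time measurement. */
static unsigned long long demo_measure(int distance)
{
	demo_measure_calls++;
	return 1000ULL * (unsigned long long)(distance + 1);
}

/* Measures each distance at most once and reuses the cached result;
 * returns 0 for a distance outside the table. */
static unsigned long long demo_cost_for_distance(int distance)
{
	if (distance < 0 || distance >= DEMO_MAX_DISTANCE)
		return 0;
	if (demo_cost[distance] == 0)
		demo_cost[distance] = demo_measure(distance);
	return demo_cost[distance];
}

/* Percentage scaling: 120 raises the cost by 20%, 0 removes it. */
static unsigned long long demo_apply_factor(unsigned long long cost,
					    unsigned int factor)
{
	return cost * factor / 100;
}
```

Because the cache is keyed by distance rather than by CPU pair, the number of measurements depends on the depth of the domain hierarchy and not on the number of CPUs.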
    • [PATCH] sched: add cacheflush() asm · 4dc7a0bb
      Authored by Ingo Molnar
      Add per-arch sched_cacheflush() which is a write-back cacheflush used by
      the migration-cost calibration code at bootup time.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Cc: Nick Piggin <nickpiggin@yahoo.com.au>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  9. 12 Jan, 2006 8 commits
  10. 11 Jan, 2006 9 commits