1. 24 Feb, 2006 2 commits
  2. 18 Feb, 2006 1 commit
  3. 16 Feb, 2006 1 commit
  4. 15 Feb, 2006 1 commit
    • [PATCH] madvise MADV_DONTFORK/MADV_DOFORK · f8225661
      Michael S. Tsirkin committed
      Currently, copy-on-write may change the physical address of a page even if the
      user requested that the page is pinned in memory (either by mlock or by
      get_user_pages).  This happens if the process forks meanwhile, and the parent
      writes to that page.  As a result, the page is orphaned: in case of
      get_user_pages, the application will never see any data hardware DMA's into
      this page after the COW.  In case of mlock'd memory, the parent is not getting
      the realtime/security benefits of mlock.
      
      In particular, this affects the Infiniband modules which do DMA from and into
      user pages all the time.
      
      This patch adds madvise options to control whether a memory range is
      inherited across fork.  This is useful, e.g., when hardware is doing DMA
      from/into these pages.  It could also be useful to an application wanting
      to speed up its forks by cutting large areas out of consideration.
      Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
      Acked-by: Hugh Dickins <hugh@veritas.com>
      Cc: Michael Kerrisk <mtk-manpages@gmx.net>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
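As a rough illustration of how user space might use the madvise options from the MADV_DONTFORK/MADV_DOFORK commit above, here is a minimal sketch. The helper names `alloc_dontfork_buffer` and `allow_fork_inheritance` are hypothetical and do not come from the patch; only `mmap`, `munmap`, `madvise` and the two `MADV_*` flags are real interfaces.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: map `len` bytes of anonymous memory and mark the
 * range MADV_DONTFORK so a child created by fork() does not inherit it.
 * Returns NULL on failure. */
void *alloc_dontfork_buffer(size_t len)
{
    void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return NULL;
    if (madvise(buf, len, MADV_DONTFORK) != 0) {
        munmap(buf, len);
        return NULL;
    }
    return buf;
}

/* Hypothetical helper: make the range inheritable across fork() again. */
int allow_fork_inheritance(void *buf, size_t len)
{
    return madvise(buf, len, MADV_DOFORK);
}
```

A buffer registered for DMA would presumably be marked this way before the process forks, so that copy-on-write in the parent never moves the pinned page.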
  5. 10 Feb, 2006 1 commit
  6. 08 Feb, 2006 4 commits
  7. 07 Feb, 2006 1 commit
  8. 02 Feb, 2006 2 commits
  9. 19 Jan, 2006 2 commits
  10. 18 Jan, 2006 1 commit
    • [PATCH] Fix sparse parse error in lppaca.h · c6b3feaf
      Bryan O'Sullivan committed
      sparse can't parse a struct definition in include/asm-powerpc/lppaca.h,
      even though gcc can accept it.  The form looks like this:
      
              struct __attribute__((whatever)) foo { };
      
      An equivalent that both gcc and sparse can handle is
      
              struct foo { } __attribute__((whatever));
      
      This is the only definition of this type in the tree, and fixing it is
      easier than fixing sparse.
      Signed-off-by: Bryan O'Sullivan <bos@serpentine.com>
      [ Side note: fixing sparse wouldn't be hard, but the "attribute at the
        end" version is the canonical one, and the one that makes sense. So
        let's just fix the kernel instead. Luc Van Oostenryck already sent
        out a sparse patch to the sparse mailing list in case anybody cares.
                     -- Linus ]
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
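The "attribute at the end" form from the lppaca.h commit above can be tried out with a small sketch like the following. The struct name and the `aligned(64)` attribute are stand-ins chosen for illustration (the commit only writes `whatever`), and `__attribute__` and `__alignof__` are GCC extensions.

```c
#include <stddef.h>

/* Attribute placed after the closing brace, the form the commit
 * switches to. aligned(64) is only a stand-in for "whatever". */
struct sketch_aligned {
    char tag;
    int value;
} __attribute__((aligned(64)));

/* Report the alignment the compiler actually applied. */
size_t sketch_aligned_alignment(void)
{
    return __alignof__(struct sketch_aligned);
}

/* The size is padded up to a multiple of the alignment. */
size_t sketch_aligned_size(void)
{
    return sizeof(struct sketch_aligned);
}
```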
  11. 15 Jan, 2006 1 commit
  12. 14 Jan, 2006 2 commits
    • [PATCH] powerpc: oprofile cpu type names clash with other code · 7a45fb19
      Andy Whitcroft committed
      In 2.6.15-git6 a change was committed in the oprofile support in
      the powerpc architecture.  It introduced the powerpc_oprofile_type
      enum, which contains the enumerator G4.  This causes a name clash
      with the existing wacom usb tablet driver.
      
            CC [M]  drivers/usb/input/wacom.o
          drivers/usb/input/wacom.c:98: error: conflicting types for `G4'
          include/asm/cputable.h:37: error: previous declaration of `G4'
            CC [M]  drivers/usb/mon/mon_text.o
          make[3]: *** [drivers/usb/input/wacom.o] Error 1
          make[2]: *** [drivers/usb/input] Error 2
      
      The elements of an enum declared in global scope are effectively
      global identifiers themselves.  As such we need to ensure the names
      are unique.  This patch updates the later oprofile support to use
      unique names.
      Signed-off-by: Andy Whitcroft <apw@shadowen.org>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
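The clash described in the oprofile commit above comes from C putting enumerators in the ordinary identifier namespace. A minimal sketch of the prefixing fix follows; the enumerator names and the `tablet_model` enum are illustrative assumptions, not copied from the patch.

```c
/* Prefixed enumerators: each name is unique across the whole
 * translation unit. Names here are illustrative only. */
enum powerpc_oprofile_type {
    PPC_OPROFILE_INVALID = 0,
    PPC_OPROFILE_G4 = 1,
};

/* A driver can now keep its own short name without a
 * "conflicting types for G4" error. Hypothetical driver enum. */
enum tablet_model {
    GRAPHIRE,
    G4,
};

/* Small helper so the prefixed name is actually used. */
int oprofile_is_g4(enum powerpc_oprofile_type type)
{
    return type == PPC_OPROFILE_G4;
}
```

Had the first enum used a bare `G4`, the second declaration would fail to compile in any file that includes both headers.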
    • powerpc: Provide a suitable AT_PLATFORM value · 80f15dc7
      Paul Mackerras committed
      The glibc folks want to use AT_PLATFORM to select between possible
      alternative versions of shared libraries.  This commit makes the kernel
      supply an AT_PLATFORM string that indicates what class of processor
      we are running on.  Processors with the same set of user-level
      instructions and roughly the same instruction scheduling characteristics
      are given the same AT_PLATFORM value; for example, 821, 823 and 860
      are all reported as "ppc823", and 7447, 7447A, 7448, 7450, 7451, 7455
      are all called "ppc7450".
      
      The intention is that the AT_PLATFORM values match the values that
      gcc accepts for the -mcpu= option.  For values which are numeric
      (e.g. -mcpu=750), "ppc" has been prepended.
      
      This also adds a PPC_FEATURE_BOOKE bit to the AT_HWCAP value and sets
      it for the 440 family and the Freescale 85xx family.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
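For readers who want to see the AT_PLATFORM string that the commit above supplies, a user-space program can read it from the auxiliary vector. This sketch assumes a glibc that provides `getauxval` (2.16 or later); the helper name `platform_name` and the "unknown" fallback are hypothetical.

```c
#include <elf.h>
#include <sys/auxv.h>

/* Hypothetical helper: return the kernel-supplied AT_PLATFORM string,
 * or "unknown" if the auxiliary vector has no such entry. */
const char *platform_name(void)
{
    unsigned long value = getauxval(AT_PLATFORM);

    /* For AT_PLATFORM the value is a pointer to a NUL-terminated string. */
    return value ? (const char *)value : "unknown";
}
```

On a powerpc kernel with this commit, the returned string would be one of the values the message describes, such as "ppc7450"; other architectures report their own names.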
  13. 13 Jan, 2006 10 commits
    • [PATCH] powerpc: reformat atomic_add_unless · b11fa580
      Anton Blanchard committed
      It makes my eyes hurt.
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
    • [PATCH] powerpc: use lwsync in atomics, bitops, lock functions · 144b9c13
      Anton Blanchard committed
      eieio only provides store - store ordering. When it is used to order an
      unlock operation, loads may leak out of the critical region. This is
      potentially buggy; one example is if a user wants to atomically read a
      couple of values.
      
      We can solve this with an lwsync which orders everything except store - load.
      
      I removed the (now unused) EIEIO_ON_SMP macros and the C versions
      isync_on_smp and eieio_on_smp now that we don't use them. I also removed
      some old comments that were used to identify inline spinlocks in assembly;
      they don't make sense now that our locks are out of line.
      
      Another interesting thing was that read_unlock was using an eieio even
      though the rest of the spinlock code had already been converted to
      use lwsync.
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
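In portable C11 terms, the unlock ordering discussed in the lwsync commit above corresponds roughly to a release store: earlier loads and stores must not move past the unlock. The sketch below is a user-space analogue, not the kernel's spinlock code; the `sketch_lock` type and function names are hypothetical, and the claim that compilers commonly implement a release store on POWER with lwsync is an assumption, not something the commit states.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical test-and-set lock, for illustration only. */
typedef struct {
    atomic_bool locked;
} sketch_lock;

void sketch_lock_acquire(sketch_lock *lock)
{
    bool expected = false;

    /* Acquire ordering: accesses in the critical section cannot be
     * reordered before the lock is taken. */
    while (!atomic_compare_exchange_weak_explicit(&lock->locked, &expected,
                                                  true,
                                                  memory_order_acquire,
                                                  memory_order_relaxed))
        expected = false;   /* CAS overwrote it with the current value */
}

void sketch_lock_release(sketch_lock *lock)
{
    /* Release ordering: earlier loads AND stores stay before the unlock
     * store. A store-store-only barrier would not cover the loads. */
    atomic_store_explicit(&lock->locked, false, memory_order_release);
}
```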
    • [PATCH] powerpc: Remove lppaca structure from the PACA · 3356bb9f
      David Gibson committed
      At present the lppaca - the structure shared with the iSeries
      hypervisor and phyp - is contained within the PACA, our own low-level
      per-cpu structure.  This doesn't have to be so; the patch below
      removes it, making a separate array of lppaca structures.
      
      This saves approximately 500*NR_CPUS bytes of image size and kernel
      memory, because we no longer need an alignment gap between the Linux
      and hypervisor portions of every PACA.  On the other hand it means an
      extra level of dereference in many accesses to the lppaca.
      
      The patch also gets rid of several places where we assign the paca
      address to a local variable for no particular reason.
      Signed-off-by: David Gibson <dwg@au1.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
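The space saving claimed in the lppaca commit above comes from alignment padding. The following sketch shows the effect with made-up structures: every name, field and the 1024-byte alignment are assumptions chosen to make the padding visible, and none of them is taken from the kernel headers.

```c
#include <stddef.h>

/* Hypothetical stand-in for the hypervisor-shared area. */
struct sketch_lppaca {
    char shared[256];
} __attribute__((aligned(1024)));

/* Before: the shared area is embedded, so the compiler pads the
 * Linux-only fields up to the shared area's alignment. */
struct sketch_paca_embedded {
    long linux_fields[8];
    struct sketch_lppaca lppaca;
};

/* After: only a pointer is kept; the shared areas live in a
 * separate array. Each access pays one extra dereference. */
struct sketch_paca_split {
    long linux_fields[8];
    struct sketch_lppaca *lppaca_ptr;
};

/* Bytes saved per CPU by splitting, under the assumptions above. */
size_t sketch_bytes_saved_per_cpu(void)
{
    return sizeof(struct sketch_paca_embedded)
         - (sizeof(struct sketch_paca_split) + sizeof(struct sketch_lppaca));
}
```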
    • [PATCH] powerpc: Cleanup LOADADDR etc. asm macros · e58c3495
      David Gibson committed
      This patch consolidates the variety of macros used for loading 32 or
      64-bit constants in assembler (LOADADDR, LOADBASE, SET_REG_TO_*).  The
      idea is to make the set of macros consistent across 32 and 64 bit and
      to make it more obvious which is the appropriate one to use in a given
      situation.  The new macros and their semantics are described in the
      comments in ppc_asm.h.
      
      In the process, we change several places that were unnecessarily using
      immediate loads on ppc64 to use the GOT/TOC.  Likewise we clean up a
      couple of places where we were clumsily subtracting PAGE_OFFSET with
      asm instructions to use assembly-time arithmetic or the toreal() macro
      instead.
      Signed-off-by: David Gibson <dwg@au1.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
    • [PATCH] powerpc: Add of_find_property function · ecaa8b0f
      Dave C Boutcher committed
      Add an of_find_property function that returns a struct property
      given a property name.  Then change the get_property function to
      use that routine internally.
      Signed-off-by: Dave Boutcher <sleddog@us.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
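The relationship between the two functions in the of_find_property commit above can be sketched in user space as below. This is a simplified model, not the kernel code: the struct layouts are assumptions, the functions carry a `sketch_` prefix to mark them as hypothetical, and no locking is shown.

```c
#include <stddef.h>
#include <string.h>

/* Simplified, assumed layouts for illustration. */
struct property {
    const char *name;
    int length;
    void *value;
    struct property *next;
};

struct device_node {
    struct property *properties;   /* singly linked list */
};

/* Look a property up by name and return the property itself. */
struct property *sketch_find_property(const struct device_node *np,
                                      const char *name, int *lenp)
{
    for (struct property *pp = np->properties; pp != NULL; pp = pp->next) {
        if (strcmp(pp->name, name) == 0) {
            if (lenp != NULL)
                *lenp = pp->length;
            return pp;
        }
    }
    return NULL;
}

/* The older-style accessor becomes a thin wrapper returning the value. */
void *sketch_get_property(const struct device_node *np,
                          const char *name, int *lenp)
{
    struct property *pp = sketch_find_property(np, name, lenp);

    return pp != NULL ? pp->value : NULL;
}
```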
    • [PATCH] powerpc: Add/remove/update properties in firmware device tree · 088186de
      Dave C Boutcher committed
      Add support for updating and removing device tree
      properties.  Since we hand out pointers to properties with gay
      abandon, we can't just free the property storage.  Instead we
      move deleted, or the old copy of an updated property, to a
      "dead properties" list.
      
      Also note, it's not feasible to kref device tree properties.
      We call get_property() all over the kernel in a wild variety
      of contexts.
      
      One consequence of this change is that we now take a
      read_lock(&devtree_lock) when doing get_property().
      Signed-off-by: Dave Boutcher <sleddog@us.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
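The "dead properties" idea from the commit above, where a removed property is unlinked but its storage stays valid for anyone still holding a pointer, might look roughly like this. The types, field names and function are hypothetical, and the sketch omits the devtree_lock that the commit says protects the real list.

```c
#include <stddef.h>

/* Assumed, simplified layouts for illustration. */
struct dt_prop {
    const char *name;
    struct dt_prop *next;
};

struct dt_node {
    struct dt_prop *properties;   /* live properties */
    struct dt_prop *deadprops;    /* removed, but never freed */
};

/* Unlink `prop` from the live list and park it on the dead list.
 * Returns 0 on success, -1 if the property is not on the live list. */
int sketch_remove_property(struct dt_node *np, struct dt_prop *prop)
{
    for (struct dt_prop **link = &np->properties; *link != NULL;
         link = &(*link)->next) {
        if (*link == prop) {
            *link = prop->next;          /* drop from live list */
            prop->next = np->deadprops;  /* push onto dead list */
            np->deadprops = prop;
            return 0;
        }
    }
    return -1;
}
```

Because nothing is freed, a pointer handed out earlier still refers to valid memory after the removal.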
    • [PATCH] powerpc: Add some more pSeries hypervisor call constants · 43ccf202
      Dave C Boutcher committed
      Adds a few more hypervisor call constants.
      Signed-off-by: Dave Boutcher <sleddog@us.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
    • [PATCH] death of get_thread_info/put_thread_info · f5a61d0c
      Al Viro committed
      {get,put}_thread_info() were introduced in 2.5.4 and have never
      been called by anything in the tree.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] scheduler cache-hot-autodetect · 198e2f18
      akpm@osdl.org committed
      
      From: Ingo Molnar <mingo@elte.hu>
      
      This is the latest version of the scheduler cache-hot-auto-tune patch.
      
      The first problem was that detection time scaled with O(N^2), which is
      unacceptable on larger SMP and NUMA systems. To solve this:
      
      - I've added a 'domain distance' function, which is used to cache
        measurement results. Each distance is only measured once. This means
        that e.g. on NUMA distances of 0, 1 and 2 might be measured, on HT
        distances 0 and 1, and on SMP distance 0 is measured. The code walks
        the domain tree to determine the distance, so it automatically follows
        whatever hierarchy an architecture sets up. This cuts down on the boot
        time significantly and removes the O(N^2) limit. The only assumption
        is that migration costs can be expressed as a function of domain
        distance - this covers the overwhelming majority of existing systems,
        and is a good guess even for more asymmetric systems.
      
        [ People hacking systems that have asymmetries that break this
          assumption (e.g. different CPU speeds) should experiment a bit with
          the cpu_distance() function. Adding a ->migration_distance factor to
          the domain structure would be one possible solution - but let's first
          see the problem systems, if they exist at all. Let's not overdesign. ]
      
      Another problem was that only a single cache-size was used for measuring
      the cost of migration, and most architectures didn't set that variable
      up. Furthermore, a single cache-size does not fit NUMA hierarchies with
      L3 caches and does not fit HT setups, where different CPUs will often
      have different 'effective cache sizes'. To solve this problem:
      
      - Instead of relying on a single cache-size provided by the platform and
        sticking to it, the code now auto-detects the 'effective migration
        cost' between two measured CPUs, via iterating through a wide range of
        cachesizes. The code searches for the maximum migration cost, which
        occurs when the working set of the test-workload falls just below the
        'effective cache size'. I.e. real-life optimized search is done for
        the maximum migration cost, between two real CPUs.
      
        This, amongst other things, has the positive effect that if e.g. two
        CPUs share a L2/L3 cache, a different (and accurate) migration cost
        will be found than between two CPUs on the same system that don't
        share any caches.
      
      (The reliable measurement of migration costs is tricky - see the source
      for details.)
      
      Furthermore I've added various boot-time options to override/tune
      migration behavior.
      
      Firstly, there's a blanket override for autodetection:
      
      	migration_cost=1000,2000,3000
      
      will override the depth 0/1/2 values with 1msec/2msec/3msec values.
      
      Secondly, there's a global factor that can be used to increase (or
      decrease) the autodetected values:
      
      	migration_factor=120
      
      will increase the autodetected values by 20%. This option is useful to
      tune things in a workload-dependent way - e.g. if a workload is
      cache-insensitive then CPU utilization can be maximized by specifying
      migration_factor=0.
      
      I've tested the autodetection code quite extensively on x86, on 3
      P3/Xeon/2MB, and the autodetected values look pretty good:
      
      Dual Celeron (128K L2 cache):
      
       ---------------------
       migration cost matrix (max_cache_size: 131072, cpu: 467 MHz):
       ---------------------
                 [00]    [01]
       [00]:     -     1.7(1)
       [01]:   1.7(1)    -
       ---------------------
       cacheflush times [2]: 0.0 (0) 1.7 (1784008)
       ---------------------
      
      Here the slow memory subsystem dominates system performance, and even
      though caches are small, the migration cost is 1.7 msecs.
      
      Dual HT P4 (512K L2 cache):
      
       ---------------------
       migration cost matrix (max_cache_size: 524288, cpu: 2379 MHz):
       ---------------------
                 [00]    [01]    [02]    [03]
       [00]:     -     0.4(1)  0.0(0)  0.4(1)
       [01]:   0.4(1)    -     0.4(1)  0.0(0)
       [02]:   0.0(0)  0.4(1)    -     0.4(1)
       [03]:   0.4(1)  0.0(0)  0.4(1)    -
       ---------------------
       cacheflush times [2]: 0.0 (33900) 0.4 (448514)
       ---------------------
      
      Here it can be seen that there is no migration cost between two HT
      siblings (CPU#0/2 and CPU#1/3 are separate physical CPUs). A fast memory
      system makes inter-physical-CPU migration pretty cheap: 0.4 msecs.
      
      8-way P3/Xeon [2MB L2 cache]:
      
       ---------------------
       migration cost matrix (max_cache_size: 2097152, cpu: 700 MHz):
       ---------------------
                 [00]    [01]    [02]    [03]    [04]    [05]    [06]    [07]
       [00]:     -    19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1)
       [01]:  19.2(1)    -    19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1)
       [02]:  19.2(1) 19.2(1)    -    19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1)
       [03]:  19.2(1) 19.2(1) 19.2(1)    -    19.2(1) 19.2(1) 19.2(1) 19.2(1)
       [04]:  19.2(1) 19.2(1) 19.2(1) 19.2(1)    -    19.2(1) 19.2(1) 19.2(1)
       [05]:  19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1)    -    19.2(1) 19.2(1)
       [06]:  19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1)    -    19.2(1)
       [07]:  19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1)    -
       ---------------------
       cacheflush times [2]: 0.0 (0) 19.2 (19281756)
       ---------------------
      
      This one has huge caches and a relatively slow memory subsystem - so the
      migration cost is 19 msecs.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Ashok Raj <ashok.raj@intel.com>
      Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
      Cc: <wilder@us.ibm.com>
      Signed-off-by: John Hawkes <hawkes@sgi.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
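The two boot-time options described in the cache-hot-autodetect commit above can be illustrated with a little arithmetic. This is a sketch only: both function names are hypothetical and this is not the kernel's option-parsing code. It assumes the `migration_cost=` values are in microseconds (as the 1msec/2msec/3msec example implies) and that `migration_factor=` is a percentage.

```c
#include <stddef.h>
#include <stdlib.h>

/* Parse a comma-separated list such as "1000,2000,3000" into `out`.
 * Returns the number of values stored (at most `max`). */
size_t parse_migration_costs(const char *arg, unsigned long long *out,
                             size_t max)
{
    size_t count = 0;

    while (count < max && *arg != '\0') {
        char *end;
        unsigned long long value = strtoull(arg, &end, 10);

        if (end == arg)          /* no digits found: stop parsing */
            break;
        out[count++] = value;
        if (*end != ',')         /* end of list or unexpected character */
            break;
        arg = end + 1;
    }
    return count;
}

/* Scale an autodetected cost by a percentage factor:
 * 120 means +20%, 0 disables the cost entirely. */
unsigned long long scale_migration_cost(unsigned long long cost_usec,
                                        unsigned int factor_percent)
{
    return cost_usec * factor_percent / 100;
}
```

Under these assumptions, index 0/1/2 of the parsed array would hold the override for domain distance 0/1/2.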
    • [PATCH] sched: add cacheflush() asm · 4dc7a0bb
      Ingo Molnar committed
      Add per-arch sched_cacheflush() which is a write-back cacheflush used by
      the migration-cost calibration code at bootup time.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Cc: Nick Piggin <nickpiggin@yahoo.com.au>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  14. 12 Jan, 2006 8 commits
  15. 11 Jan, 2006 3 commits
    • powerpc/32: Fix compile error caused by pud_t/pgt_t confusion · c38a04b1
      Paul Mackerras committed
      PPC32 is still using asm-generic/4level-fixup.h, but asm-powerpc/page.h
      was defining pud_t and pgd_t.  Depending on the order in which files
      got included, this could result in a compilation error.  Tweak the ifdef
      so that page.h doesn't try to define pud_t on ppc32 (which uses 2-level
      page tables).
      Signed-off-by: Paul Mackerras <paulus@samba.org>
    • [PATCH] powerpc/64: per cpu data optimisations · 7a0268fa
      Anton Blanchard committed
      The current ppc64 per cpu data implementation is quite slow. eg:
      
              lhz 11,18(13)           /* smp_processor_id() */
              ld 9,.LC63-.LCTOC1(30)  /* per_cpu__variable_name */
              ld 8,.LC61-.LCTOC1(30)  /* __per_cpu_offset */
              sldi 11,11,3            /* form index into __per_cpu_offset */
              mr 10,9
              ldx 9,11,8              /* __per_cpu_offset[smp_processor_id()] */
              ldx 0,10,9              /* load per cpu data */
      
      5 loads for something that is supposed to be fast, pretty awful. One
      reason for the large number of loads is that we have to synthesize 2
      64bit constants (per_cpu__variable_name and __per_cpu_offset).
      
      By putting __per_cpu_offset into the paca we can avoid the 2 loads
      associated with it:
      
              ld 11,56(13)            /* paca->data_offset */
              ld 9,.LC59-.LCTOC1(30)  /* per_cpu__variable_name */
              ldx 0,9,11              /* load per cpu data */
      
      Longer term we should be able to do even better than 3 loads.
      If per_cpu__variable_name wasn't a 64bit constant and paca->data_offset
      was in a register we could cut it down to one load. A suggestion from
      Rusty is to use gcc's __thread extension here. In order to do this we
      would need to free up r13 (the __thread register and where the paca
      currently is). So far I've had a few unsuccessful attempts at doing that :)
      
      The patch also allocates per cpu memory node local on NUMA machines.
      This patch from Rusty has been sitting in my queue _forever_ but stalled
      when I hit the compiler bug. Sorry about that.
      
      Finally I also only allocate per cpu data for possible cpus, which comes
      straight out of the x86-64 port. On a pseries kernel (with NR_CPUS == 128)
      and 4 possible cpus we see some nice gains:
      
                   total       used       free     shared    buffers cached
      Mem:       4012228     212860    3799368          0          0 162424
      
                   total       used       free     shared    buffers cached
      Mem:       4016200     212984    3803216          0          0 162424
      
      A saving of 3.75MB. Quite nice for smaller machines. Note: we now have
      to be careful of per cpu users that touch data for !possible cpus.
      
      At this stage it might be worth making the NUMA and possible cpu
      optimisations generic, but per cpu init is done so early we have to be
      careful that all architectures have their possible map setup correctly.
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
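The difference between the two lookup paths in the per-cpu commit above can be modelled in plain C. This is a user-space sketch with invented names (`sketch_paca`, `sketch_per_cpu_old`, `sketch_per_cpu_new`, the array sizes); it only mirrors the shape of the change, namely replacing "CPU id, then offset table" with an offset cached in the per-CPU structure.

```c
#include <stddef.h>

#define SKETCH_NR_CPUS   4
#define SKETCH_AREA_SIZE 64

/* One flat buffer holding every CPU's copy of the per-cpu data. */
char sketch_percpu_area[SKETCH_NR_CPUS * SKETCH_AREA_SIZE];

/* Old scheme: global table of offsets indexed by CPU id. */
size_t sketch_per_cpu_offset[SKETCH_NR_CPUS];

/* New scheme: the offset is cached in the per-CPU structure itself. */
struct sketch_paca {
    int cpu_id;
    size_t data_offset;
};

struct sketch_paca sketch_pacas[SKETCH_NR_CPUS];

void sketch_percpu_init(void)
{
    for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++) {
        size_t offset = (size_t)cpu * SKETCH_AREA_SIZE;

        sketch_per_cpu_offset[cpu] = offset;
        sketch_pacas[cpu].cpu_id = cpu;
        sketch_pacas[cpu].data_offset = offset;
    }
}

/* Old path: read the CPU id, then index the offset table. */
char *sketch_per_cpu_old(const struct sketch_paca *paca, size_t var)
{
    return sketch_percpu_area + sketch_per_cpu_offset[paca->cpu_id] + var;
}

/* New path: the offset comes straight from the paca. */
char *sketch_per_cpu_new(const struct sketch_paca *paca, size_t var)
{
    return sketch_percpu_area + paca->data_offset + var;
}
```

Both paths resolve to the same address; the new one simply needs fewer dependent memory reads to get there.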
    • [PATCH] powerpc: parallel port init fix · 193cac99
      Michael Neuling committed
      This stops parport from accessing nonexistent parallel ports.
      Signed-off-by: Michael Neuling <mikey@neuling.org>
      Signed-off-by: Paul Mackerras <paulus@samba.org>