1. 13 9月, 2005 14 次提交
  2. 11 9月, 2005 2 次提交
    • A
      [PATCH] i386/x86_64: make get_cpu_vendor() static · 672289e9
      Adrian Bunk 提交于
      get_cpu_vendor() no longer has any users in other files.
      Signed-off-by: NAdrian Bunk <bunk@stusta.de>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      672289e9
    • I
      [PATCH] spinlock consolidation · fb1c8f93
      Ingo Molnar 提交于
      This patch (written by me and also containing many suggestions of Arjan van
      de Ven) does a major cleanup of the spinlock code.  It does the following
      things:
      
       - consolidates and enhances the spinlock/rwlock debugging code
      
       - simplifies the asm/spinlock.h files
      
       - encapsulates the raw spinlock type and moves generic spinlock
         features (such as ->break_lock) into the generic code.
      
       - cleans up the spinlock code hierarchy to get rid of the spaghetti.
      
      Most notably there's now only a single variant of the debugging code,
      located in lib/spinlock_debug.c.  (previously we had one SMP debugging
      variant per architecture, plus a separate generic one for UP builds)
      
      Also, i've enhanced the rwlock debugging facility, it will now track
      write-owners.  There is new spinlock-owner/CPU-tracking on SMP builds too.
      All locks have lockup detection now, which will work for both soft and hard
      spin/rwlock lockups.
      
      The arch-level include files now only contain the minimally necessary
      subset of the spinlock code - all the rest that can be generalized now
      lives in the generic headers:
      
       include/asm-i386/spinlock_types.h       |   16
       include/asm-x86_64/spinlock_types.h     |   16
      
      I have also split up the various spinlock variants into separate files,
      making it easier to see which does what. The new layout is:
      
         SMP                         |  UP
         ----------------------------|-----------------------------------
         asm/spinlock_types_smp.h    |  linux/spinlock_types_up.h
         linux/spinlock_types.h      |  linux/spinlock_types.h
         asm/spinlock_smp.h          |  linux/spinlock_up.h
         linux/spinlock_api_smp.h    |  linux/spinlock_api_up.h
         linux/spinlock.h            |  linux/spinlock.h
      
      /*
       * here's the role of the various spinlock/rwlock related include files:
       *
       * on SMP builds:
       *
       *  asm/spinlock_types.h: contains the raw_spinlock_t/raw_rwlock_t and the
       *                        initializers
       *
       *  linux/spinlock_types.h:
       *                        defines the generic type and initializers
       *
       *  asm/spinlock.h:       contains the __raw_spin_*()/etc. lowlevel
       *                        implementations, mostly inline assembly code
       *
       *   (also included on UP-debug builds:)
       *
       *  linux/spinlock_api_smp.h:
       *                        contains the prototypes for the _spin_*() APIs.
       *
       *  linux/spinlock.h:     builds the final spin_*() APIs.
       *
       * on UP builds:
       *
       *  linux/spinlock_type_up.h:
       *                        contains the generic, simplified UP spinlock type.
       *                        (which is an empty structure on non-debug builds)
       *
       *  linux/spinlock_types.h:
       *                        defines the generic type and initializers
       *
       *  linux/spinlock_up.h:
       *                        contains the __raw_spin_*()/etc. version of UP
       *                        builds. (which are NOPs on non-debug, non-preempt
       *                        builds)
       *
       *   (included on UP-non-debug builds:)
       *
       *  linux/spinlock_api_up.h:
       *                        builds the _spin_*() APIs.
       *
       *  linux/spinlock.h:     builds the final spin_*() APIs.
       */
      
      All SMP and UP architectures are converted by this patch.
      
      arm, i386, ia64, ppc, ppc64, s390/s390x, x64 was build-tested via
      crosscompilers.  m32r, mips, sh, sparc, have not been tested yet, but should
      be mostly fine.
      
      From: Grant Grundler <grundler@parisc-linux.org>
      
        Booted and lightly tested on a500-44 (64-bit, SMP kernel, dual CPU).
        Builds 32-bit SMP kernel (not booted or tested).  I did not try to build
        non-SMP kernels.  That should be trivial to fix up later if necessary.
      
        I converted bit ops atomic_hash lock to raw_spinlock_t.  Doing so avoids
        some ugly nesting of linux/*.h and asm/*.h files.  Those particular locks
        are well tested and contained entirely inside arch specific code.  I do NOT
        expect any new issues to arise with them.
      
       If someone does ever need to use debug/metrics with them, then they will
        need to unravel this hairball between spinlocks, atomic ops, and bit ops
        that exist only because parisc has exactly one atomic instruction: LDCW
        (load and clear word).
      
      From: "Luck, Tony" <tony.luck@intel.com>
      
         ia64 fix
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NArjan van de Ven <arjanv@infradead.org>
      Signed-off-by: NGrant Grundler <grundler@parisc-linux.org>
      Cc: Matthew Wilcox <willy@debian.org>
      Signed-off-by: NHirokazu Takata <takata@linux-m32r.org>
      Signed-off-by: NMikael Pettersson <mikpe@csd.uu.se>
      Signed-off-by: NBenoit Boissinot <benoit.boissinot@ens-lyon.org>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      fb1c8f93
  3. 10 9月, 2005 2 次提交
  4. 08 9月, 2005 10 次提交
    • S
      [PATCH] Clean up struct flock definitions · 5ac353f9
      Stephen Rothwell 提交于
      This patch just gathers together all the struct flock definitions except
      xtensa into asm-generic/fcntl.h.
      Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      5ac353f9
    • S
      [PATCH] Clean up the fcntl operations · 1abf62af
      Stephen Rothwell 提交于
      This patch puts the most popular of each fcntl operation/flag into
      asm-generic/fcntl.h and cleans up the arch files.
      Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      1abf62af
    • S
      [PATCH] Clean up the open flags · e64ca97f
      Stephen Rothwell 提交于
      This patch puts the most popular of each open flag into asm-generic/fcntl.h
      and cleans up the arch files.
      Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      e64ca97f
    • S
      [PATCH] Create asm-generic/fcntl.h · 9317259e
      Stephen Rothwell 提交于
      This set of patches creates asm-generic/fcntl.h and consolidates as much as
      possible from the asm-*/fcntl.h files into it.
      
      This patch just gathers all the identical bits of the asm-*/fcntl.h files into
      asm-generic/fcntl.h.
      Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: NYoichi Yuasa <yuasa@hh.iij4u.or.jp>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      9317259e
    • J
      [PATCH] remove verify_area(): remove verify_area() from various uaccess.h headers · 97de50c0
      Jesper Juhl 提交于
      Remove the deprecated (and unused) verify_area() from various uaccess.h
      headers.
      Signed-off-by: NJesper Juhl <jesper.juhl@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      97de50c0
    • C
      [PATCH] remove asm-*/hdreg.h · c8d12741
      Christoph Hellwig 提交于
      unused and useless..
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      c8d12741
    • H
      [PATCH] auxiliary vector cleanups · 36d57ac4
      H. J. Lu 提交于
      The size of auxiliary vector is fixed at 42 in linux/sched.h.  But it isn't
      very obvious when looking at linux/elf.h.  This patch adds AT_VECTOR_SIZE
      so that we can change it if necessary when a new vector is added.
      
      Because of include file ordering problems, doing this necessitated the
      extraction of the AT_* symbols into a standalone header file.
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      36d57ac4
    • S
      [PATCH] compat: be more consistent about [ug]id_t · 202e5979
      Stephen Rothwell 提交于
      When I first wrote the compat layer patches, I was somewhat cavalier about
      the definition of compat_uid_t and compat_gid_t (or maybe I just
      misunderstood :-)).  This patch makes the compat types much more consistent
      with the types we are being compatible with and hopefully will fix a few
      bugs along the way.
      
      	compat type		type in compat arch
      	__compat_[ug]id_t	__kernel_[ug]id_t
      	__compat_[ug]id32_t	__kernel_[ug]id32_t
      	compat_[ug]id_t		[ug]id_t
      
      The difference is that compat_uid_t is always 32 bits (for the archs we
      care about) but __compat_uid_t may be 16 bits on some.
      Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      202e5979
    • J
      [PATCH] FUTEX_WAKE_OP: pthread_cond_signal() speedup · 4732efbe
      Jakub Jelinek 提交于
      ATM pthread_cond_signal is unnecessarily slow, because it wakes one waiter
      (which at least on UP usually means an immediate context switch to one of
      the waiter threads).  This waiter wakes up and after a few instructions it
      attempts to acquire the cv internal lock, but that lock is still held by
      the thread calling pthread_cond_signal.  So it goes to sleep and eventually
      the signalling thread is scheduled in, unlocks the internal lock and wakes
      the waiter again.
      
      Now, before 2003-09-21 NPTL was using FUTEX_REQUEUE in pthread_cond_signal
      to avoid this performance issue, but it was removed when locks were
      redesigned to the 3 state scheme (unlocked, locked uncontended, locked
      contended).
      
      Following scenario shows why simply using FUTEX_REQUEUE in
      pthread_cond_signal together with using lll_mutex_unlock_force in place of
      lll_mutex_unlock is not enough and probably why it has been disabled at
      that time:
      
      The number is value in cv->__data.__lock.
              thr1            thr2            thr3
      0       pthread_cond_wait
      1       lll_mutex_lock (cv->__data.__lock)
      0       lll_mutex_unlock (cv->__data.__lock)
      0       lll_futex_wait (&cv->__data.__futex, futexval)
      0                       pthread_cond_signal
      1                       lll_mutex_lock (cv->__data.__lock)
      1                                       pthread_cond_signal
      2                                       lll_mutex_lock (cv->__data.__lock)
      2                                         lll_futex_wait (&cv->__data.__lock, 2)
      2                       lll_futex_requeue (&cv->__data.__futex, 0, 1, &cv->__data.__lock)
                                # FUTEX_REQUEUE, not FUTEX_CMP_REQUEUE
      2                       lll_mutex_unlock_force (cv->__data.__lock)
      0                         cv->__data.__lock = 0
      0                         lll_futex_wake (&cv->__data.__lock, 1)
      1       lll_mutex_lock (cv->__data.__lock)
      0       lll_mutex_unlock (cv->__data.__lock)
                # Here, lll_mutex_unlock doesn't know there are threads waiting
                # on the internal cv's lock
      
      Now, I believe it is possible to use FUTEX_REQUEUE in pthread_cond_signal,
      but it will cost us not one, but 2 extra syscalls and, what's worse, one of
      these extra syscalls will be done for every single waiting loop in
      pthread_cond_*wait.
      
      We would need to use lll_mutex_unlock_force in pthread_cond_signal after
      requeue and lll_mutex_cond_lock in pthread_cond_*wait after lll_futex_wait.
      
      Another alternative is to do the unlocking pthread_cond_signal needs to do
      (the lock can't be unlocked before lll_futex_wake, as that is racy) in the
      kernel.
      
      I have implemented both variants, futex-requeue-glibc.patch is the first
      one and futex-wake_op{,-glibc}.patch is the unlocking inside of the kernel.
       The kernel interface allows userland to specify how exactly an unlocking
      operation should look like (some atomic arithmetic operation with optional
      constant argument and comparison of the previous futex value with another
      constant).
      
      It has been implemented just for ppc*, x86_64 and i?86, for other
      architectures I'm including just a stub header which can be used as a
      starting point by maintainers to write support for their arches and ATM
      will just return -ENOSYS for FUTEX_WAKE_OP.  The requeue patch has been
      (lightly) tested just on x86_64, the wake_op patch on ppc64 kernel running
      32-bit and 64-bit NPTL and x86_64 kernel running 32-bit and 64-bit NPTL.
      
      With the following benchmark on UP x86-64 I get:
      
      for i in nptl-orig nptl-requeue nptl-wake_op; do echo time elf/ld.so --library-path .:$i /tmp/bench; \
      for j in 1 2; do echo ( time elf/ld.so --library-path .:$i /tmp/bench ) 2>&1; done; done
      time elf/ld.so --library-path .:nptl-orig /tmp/bench
      real 0m0.655s user 0m0.253s sys 0m0.403s
      real 0m0.657s user 0m0.269s sys 0m0.388s
      time elf/ld.so --library-path .:nptl-requeue /tmp/bench
      real 0m0.496s user 0m0.225s sys 0m0.271s
      real 0m0.531s user 0m0.242s sys 0m0.288s
      time elf/ld.so --library-path .:nptl-wake_op /tmp/bench
      real 0m0.380s user 0m0.176s sys 0m0.204s
      real 0m0.382s user 0m0.175s sys 0m0.207s
      
      The benchmark is at:
      http://sourceware.org/ml/libc-alpha/2005-03/txt00001.txt
      Older futex-requeue-glibc.patch version is at:
      http://sourceware.org/ml/libc-alpha/2005-03/txt00002.txt
      Older futex-wake_op-glibc.patch version is at:
      http://sourceware.org/ml/libc-alpha/2005-03/txt00003.txt
      Will post a new version (just x86-64 fixes so that the patch
      applies against pthread_cond_signal.S) to libc-hacker ml soon.
      
      Attached is the kernel FUTEX_WAKE_OP patch as well as a simple-minded
      testcase that will not test the atomicity of the operation, but at least
      check if the threads that should have been woken up are woken up and
      whether the arithmetic operation in the kernel gave the expected results.
      Acked-by: NIngo Molnar <mingo@redhat.com>
      Cc: Ulrich Drepper <drepper@redhat.com>
      Cc: Jamie Lokier <jamie@shareable.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: NYoichi Yuasa <yuasa@hh.iij4u.or.jp>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      4732efbe
    • E
      [PATCH] x86_64: prefetchw() can fall back to prefetch() if !3DNOW · 19aaabb5
      Eric Dumazet 提交于
      This is a multi-part message in MIME format.  If the cpu lacks 3DNOW
      feature, we can use a normal prefetcht0 instruction instead of NOP5.
      "prefetchw (%rxx)" and "prefetcht0 (%rxx)" have the same length, ranging
      from 3 to 5 bytes depending on the register.  So this patch even helps
      AMD64, shortening the length of the code.
      Signed-off-by: NEric Dumazet <dada1@cosmosbay.com>
      Acked-by: NAndi Kleen <ak@muc.de>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      19aaabb5
  5. 05 9月, 2005 7 次提交
  6. 30 8月, 2005 2 次提交
  7. 27 8月, 2005 1 次提交
    • A
      [PATCH] x86_64: Tell VM about holes in nodes · 485761bd
      Andi Kleen 提交于
      Some nodes can have large holes on x86-64.
      
      This fixes problems with the VM allowing too many dirty pages because it
      overestimates the number of available RAM in a node.  In extreme cases you
      can end up with all RAM filled with dirty pages which can lead to deadlocks
      and other nasty behaviour.
      
      This patch just tells the VM about the known holes from e820.  Reserved
      (like the kernel text or mem_map) is still not taken into account, but that
      should be only a few percent error now.
      
      Small detail is that the flat setup uses the NUMA free_area_init_node() now
      too because it offers more flexibility.
      
      (akpm: lotsa thanks to Martin for working this problem out)
      
      Cc: Martin Bligh <mbligh@mbligh.org>
      Signed-off-by: NAndi Kleen <ak@suse.de>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      485761bd
  8. 26 8月, 2005 1 次提交
  9. 25 8月, 2005 1 次提交