1. 30 10月, 2005 1 次提交
    • H
      [PATCH] mm: split page table lock · 4c21e2f2
      Hugh Dickins 提交于
      Christoph Lameter demonstrated very poor scalability on the SGI 512-way, with
      a many-threaded application which concurrently initializes different parts of
      a large anonymous area.
      
      This patch corrects that, by using a separate spinlock per page table page, to
      guard the page table entries in that page, instead of using the mm's single
      page_table_lock.  (But even then, page_table_lock is still used to guard page
      table allocation, and anon_vma allocation.)
      
      In this implementation, the spinlock is tucked inside the struct page of the
      page table page: with a BUILD_BUG_ON in case it overflows - which it would in
      the case of 32-bit PA-RISC with spinlock debugging enabled.
      
      Splitting the lock is not quite for free: another cacheline access.  Ideally,
      I suppose we would use split ptlock only for multi-threaded processes on
      multi-cpu machines; but deciding that dynamically would have its own costs.
      So for now enable it by config, at some number of cpus - since the Kconfig
      language doesn't support inequalities, let preprocessor compare that with
      NR_CPUS.  But I don't think it's worth being user-configurable: for good
      testing of both split and unsplit configs, split now at 4 cpus, and perhaps
      change that to 8 later.
      
      There is a benefit even for singly threaded processes: kswapd can be attacking
      one part of the mm while another part is busy faulting.
      Signed-off-by: NHugh Dickins <hugh@veritas.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      4c21e2f2
  2. 28 10月, 2005 2 次提交
    • A
      [PATCH] gfp_t: fs/* · 27496a8c
      Al Viro 提交于
       - ->releasepage() annotated (s/int/gfp_t), instances updated
       - missing gfp_t in fs/* added
       - fixed misannotation from the original sweep caught by bitwise checks:
         XFS used __nocast both for gfp_t and for flags used by XFS allocator.
         The latter left with unsigned int __nocast; we might want to add a
         different type for those but for now let's leave them alone.  That,
         BTW, is a case when __nocast use had been actively confusing - it had
         been used in the same code for two different and similar types, with
         no way to catch misuses.  Switch of gfp_t to bitwise had caught that
         immediately...
      
      One tricky bit is left alone to be dealt with later - mapping->flags is
      a mix of gfp_t and error indications.  Left alone for now.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      27496a8c
    • A
      [PATCH] gfp_t: infrastructure · af4ca457
      Al Viro 提交于
      Beginning of gfp_t annotations:
      
       - -Wbitwise added to CHECKFLAGS
       - old __bitwise renamed to __bitwise__
       - __bitwise defined to either __bitwise__ or nothing, depending on
         __CHECK_ENDIAN__ being defined
       - gfp_t switched from __nocast to __bitwise__
       - force cast to gfp_t added to __GFP_... constants
       - new helper - gfp_zone(); extracts zone bits out of gfp_t value and casts
         the result to int
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      af4ca457
  3. 09 10月, 2005 1 次提交
  4. 11 9月, 2005 1 次提交
    • I
      [PATCH] spinlock consolidation · fb1c8f93
      Ingo Molnar 提交于
      This patch (written by me and also containing many suggestions of Arjan van
      de Ven) does a major cleanup of the spinlock code.  It does the following
      things:
      
       - consolidates and enhances the spinlock/rwlock debugging code
      
       - simplifies the asm/spinlock.h files
      
       - encapsulates the raw spinlock type and moves generic spinlock
         features (such as ->break_lock) into the generic code.
      
       - cleans up the spinlock code hierarchy to get rid of the spaghetti.
      
      Most notably there's now only a single variant of the debugging code,
      located in lib/spinlock_debug.c.  (previously we had one SMP debugging
      variant per architecture, plus a separate generic one for UP builds)
      
      Also, i've enhanced the rwlock debugging facility, it will now track
      write-owners.  There is new spinlock-owner/CPU-tracking on SMP builds too.
      All locks have lockup detection now, which will work for both soft and hard
      spin/rwlock lockups.
      
      The arch-level include files now only contain the minimally necessary
      subset of the spinlock code - all the rest that can be generalized now
      lives in the generic headers:
      
       include/asm-i386/spinlock_types.h       |   16
       include/asm-x86_64/spinlock_types.h     |   16
      
      I have also split up the various spinlock variants into separate files,
      making it easier to see which does what. The new layout is:
      
         SMP                         |  UP
         ----------------------------|-----------------------------------
         asm/spinlock_types_smp.h    |  linux/spinlock_types_up.h
         linux/spinlock_types.h      |  linux/spinlock_types.h
         asm/spinlock_smp.h          |  linux/spinlock_up.h
         linux/spinlock_api_smp.h    |  linux/spinlock_api_up.h
         linux/spinlock.h            |  linux/spinlock.h
      
      /*
       * here's the role of the various spinlock/rwlock related include files:
       *
       * on SMP builds:
       *
       *  asm/spinlock_types.h: contains the raw_spinlock_t/raw_rwlock_t and the
       *                        initializers
       *
       *  linux/spinlock_types.h:
       *                        defines the generic type and initializers
       *
       *  asm/spinlock.h:       contains the __raw_spin_*()/etc. lowlevel
       *                        implementations, mostly inline assembly code
       *
       *   (also included on UP-debug builds:)
       *
       *  linux/spinlock_api_smp.h:
       *                        contains the prototypes for the _spin_*() APIs.
       *
       *  linux/spinlock.h:     builds the final spin_*() APIs.
       *
       * on UP builds:
       *
       *  linux/spinlock_type_up.h:
       *                        contains the generic, simplified UP spinlock type.
       *                        (which is an empty structure on non-debug builds)
       *
       *  linux/spinlock_types.h:
       *                        defines the generic type and initializers
       *
       *  linux/spinlock_up.h:
       *                        contains the __raw_spin_*()/etc. version of UP
       *                        builds. (which are NOPs on non-debug, non-preempt
       *                        builds)
       *
       *   (included on UP-non-debug builds:)
       *
       *  linux/spinlock_api_up.h:
       *                        builds the _spin_*() APIs.
       *
       *  linux/spinlock.h:     builds the final spin_*() APIs.
       */
      
      All SMP and UP architectures are converted by this patch.
      
      arm, i386, ia64, ppc, ppc64, s390/s390x, x64 was build-tested via
      crosscompilers.  m32r, mips, sh, sparc, have not been tested yet, but should
      be mostly fine.
      
      From: Grant Grundler <grundler@parisc-linux.org>
      
        Booted and lightly tested on a500-44 (64-bit, SMP kernel, dual CPU).
        Builds 32-bit SMP kernel (not booted or tested).  I did not try to build
        non-SMP kernels.  That should be trivial to fix up later if necessary.
      
        I converted bit ops atomic_hash lock to raw_spinlock_t.  Doing so avoids
        some ugly nesting of linux/*.h and asm/*.h files.  Those particular locks
        are well tested and contained entirely inside arch specific code.  I do NOT
        expect any new issues to arise with them.
      
       If someone does ever need to use debug/metrics with them, then they will
        need to unravel this hairball between spinlocks, atomic ops, and bit ops
        that exist only because parisc has exactly one atomic instruction: LDCW
        (load and clear word).
      
      From: "Luck, Tony" <tony.luck@intel.com>
      
         ia64 fix
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NArjan van de Ven <arjanv@infradead.org>
      Signed-off-by: NGrant Grundler <grundler@parisc-linux.org>
      Cc: Matthew Wilcox <willy@debian.org>
      Signed-off-by: NHirokazu Takata <takata@linux-m32r.org>
      Signed-off-by: NMikael Pettersson <mikpe@csd.uu.se>
      Signed-off-by: NBenoit Boissinot <benoit.boissinot@ens-lyon.org>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      fb1c8f93
  5. 08 9月, 2005 2 次提交
  6. 08 7月, 2005 1 次提交
  7. 29 6月, 2005 1 次提交
  8. 24 6月, 2005 2 次提交
    • A
      [PATCH] Bug in error recovery in fs/buffer.c::__block_prepare_write() · 152becd2
      Anton Altaparmakov 提交于
      fs/buffer.c::__block_prepare_write() has broken error recovery.  It calls
      the get_block() callback with "create = 1" and if that succeeds it
      immediately clears buffer_new on the just allocated buffer (which has
      buffer_new set).
      
      The bug is that if an error occurs and get_block() returns != 0, we break
      from this loop and go into recovery code.  This code has this comment:
      
      /* Error case: */
      /*
       * Zero out any newly allocated blocks to avoid exposing stale
       * data.  If BH_New is set, we know that the block was newly
       * allocated in the above loop.
       */
      
      So the intent is obviously good in that it wants to clear just allocated
      and hence not zeroed buffers.  However the code recognises allocated
      buffers by checking for buffer_new being set.
      
      Unfortunately __block_prepare_write() as discussed above already cleared
      buffer_new on all allocated buffers thus no buffers will be cleared during
      error recovery and old data will be leaked.
      
      The simplest way I can see to fix this is to make the current recovery code
      work by _not_ clearing buffer_new after calling get_block() in
      __block_prepare_write().
      
      We cannot safely allow buffer_new buffers to "leak out" of
      __block_prepare_write(), thus we simply do a quick loop over the buffers
      clearing buffer_new on each of them if it is set just before returning
      "success" from __block_prepare_write().
      Signed-off-by: NAnton Altaparmakov <aia21@cantab.net>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      152becd2
    • O
      [PATCH] factor out common code in sys_fsync/sys_fdatasync · dfb388bf
      Oleg Nesterov 提交于
      This patch consolidates sys_fsync and sys_fdatasync.
      Signed-off-by: NOleg Nesterov <oleg@tv-sign.ru>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      dfb388bf
  9. 22 6月, 2005 1 次提交
  10. 17 5月, 2005 1 次提交
    • A
      [PATCH] block_read_full_page() get_block() error handling fix · c64610ba
      Andrew Morton 提交于
      If block_read_full_page() detects an error when running get_block() it will
      run SetPageError(), then it will zero out the block in pagecache and will mark
      the buffer_head uptodate.
      
      So at the end of readahead we end up with a non-uptodate pagecache page which
      is marked PageError.  But it has uptodate buffers.
      
      The pagefault code will run ClearPageError, will launch readpage a second time
      and block_read_full_page() will notice the uptodate buffers and will mark the
      page uptodate as well.  We end up with an uptodate, !PageError page full of
      zeros and the error is lost.
      
      (It seems a little odd that filemap_nopage() runs ClearPageError().  I guess
      all of this adds up to meaning that for each attempted access to the page, the
      pagefault handler will retry the I/O.  Which is good and bad.  If the app is
      ignoring SIGBUS for some reason we could get a lot of back-to-back I/O
      errors.)
      
      Fix it by not marking the pagecache buffer_head as uptodate if the attempt to
      map that buffer to a disk block failed.
      
      Credit-to: Qu Fuping <fs@ercist.iscas.ac.cn>
      
        For reporting the bug and identifying its source.
      Signed-off-by: NQu Fuping <fs@ercist.iscas.ac.cn>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      c64610ba
  11. 06 5月, 2005 6 次提交
  12. 01 5月, 2005 4 次提交
  13. 17 4月, 2005 2 次提交