1. 09 Jul, 2008 16 commits
  2. 17 Apr, 2008 1 commit
  3. 04 Feb, 2008 1 commit
  4. 11 Oct, 2007 1 commit
  5. 22 Jul, 2007 1 commit
  6. 03 May, 2007 1 commit
  7. 12 Feb, 2007 1 commit
  8. 12 Oct, 2006 1 commit
  9. 26 Jun, 2006 2 commits
    • [PATCH] Make copy_from_user_inatomic NOT zero the tail on i386 · 7c12d811
      Committed by NeilBrown
      As described in a previous patch and documented in mm/filemap.h,
      copy_from_user_inatomic* shouldn't zero out the tail of the buffer after an
      incomplete copy.
      
      This patch implements that change for i386.
      
      For the _nocache version, a new __copy_user_intel_nocache is defined similar
      to __copy_user_zeroing_intel_nocache, and this is ultimately used for the copy.
      
      For the regular version, __copy_from_user_ll_nozero is defined which uses
      __copy_user and __copy_user_intel; the latter needs casts to reposition the
      __user annotations.
      
      If copy_from_user_atomic is given a constant length of 1, 2, or 4, then we
      still zero the destination on failure.  This didn't seem worth the effort of
      fixing as the places where it is used really don't care.
      Signed-off-by: Neil Brown <neilb@suse.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: William Lee Irwin III <wli@holomorphy.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] Prepare for __copy_from_user_inatomic to not zero missed bytes · 01408c49
      Committed by NeilBrown
      The problem is that when we write to a file, the copy from userspace to
      pagecache is first done with preemption disabled, so if the source address is
      not immediately available the copy fails *and* *zeros* *the* *destination*.
      
      This is a problem because a concurrent read (which admittedly is an odd thing
      to do) might see zeros rather than what was there before the write, or what was
      there after, or some mixture of the two (any of these being a reasonable thing
      to see).
      
      If the copy did fail, it will immediately be retried with preemption
      re-enabled so any transient problem with accessing the source won't cause an
      error.
      
      The first copy attempt does not need to zero any uncopied bytes, and doing so
      causes the problem.  It uses copy_from_user_atomic rather than copy_from_user
      so the simple expedient is to change copy_from_user_atomic to *not* zero out
      bytes on failure.
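
The try-then-retry pattern described above can be sketched in user space. This is only a model under stated assumptions, not the kernel code: `struct fake_user_src` and the three functions are hypothetical names, the `resident` field stands in for "pages that can be read without faulting", and the sleeping copy is assumed to always succeed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a user-space source buffer: only the first
 * `resident` bytes can be read without taking a page fault. */
struct fake_user_src {
    const char *data;
    size_t resident;
};

/* Models the atomic-context copy: copies what it can without "faulting"
 * and returns the number of bytes NOT copied. The uncopied tail of dst
 * is left untouched (it is not zeroed). */
static size_t copy_inatomic_nozero(char *dst, const struct fake_user_src *src,
                                   size_t n)
{
    size_t ok = n < src->resident ? n : src->resident;

    assert(dst != NULL && src != NULL);
    memcpy(dst, src->data, ok);
    return n - ok;
}

/* Models the copy that may sleep: the "fault" makes every byte resident,
 * so (in this simplified model) the copy always completes. */
static size_t copy_may_sleep(char *dst, struct fake_user_src *src, size_t n)
{
    src->resident = n;
    memcpy(dst, src->data, n);
    return 0;
}

/* Fast path first; on a short copy, retry the whole range on the slow path. */
static size_t copy_with_retry(char *dst, struct fake_user_src *src, size_t n)
{
    size_t left = copy_inatomic_nozero(dst, src, n);

    if (left)
        left = copy_may_sleep(dst, src, n);
    return left;
}
```

In this model, a reader that looks at the destination between the two attempts sees a mix of old and new bytes, never zeros that no writer intended.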
      
      The first of these two patches prepares for the change by fixing two places
      which assume copy_from_user_atomic does zero the tail.  The two usages are
      very similar pieces of code which copy from a userspace iovec into one or more
      page-cache pages.  These are changed to remove the assumption.
      
      The second patch changes __copy_from_user_inatomic* to not zero the tail.
      Once these are accepted, I will look at similar patches for other architectures
      where this is important (ppc, mips and sparc being the ones I can find).
      
      This patch:
      
      There is a problem with __copy_from_user_inatomic zeroing the tail of the
      buffer in the case of an error.  As it is called in atomic context, the error
      may be transient, so it results in zeros being written where maybe they
      shouldn't be.
      
      In the usage in filemap, this opens a window for a well timed read to see data
      (zeros) which is not consistent with any ordering of reads and writes.
      
      In most cases where __copy_from_user_inatomic is called, a failure results in
      __copy_from_user being called immediately.  As long as the latter zeros the
      tail, the former doesn't need to.  However in *copy_from_user_iovec
      implementations (in both filemap and ntfs/file), it is assumed that
      copy_from_user_inatomic will zero the tail.
      
      This patch removes that assumption, so that after this patch it will
      be safe for copy_from_user_inatomic to not zero the tail.
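
One way a multi-segment copy can avoid relying on a zeroed tail is to track how many bytes were actually copied and to zero the remainder explicitly only where the caller needs it. The sketch below is a user-space model, not the code from this patch: `struct fake_seg`, `seg_copy_nozero`, `copy_from_segs` and `copy_from_segs_zero_tail` are hypothetical names, and each segment's `resident` count (assumed to be at most `len`) simulates how much can be read without faulting.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of one iovec segment in user memory. */
struct fake_seg {
    const char *base;
    size_t len;
    size_t resident; /* bytes readable without faulting, <= len */
};

/* Copies up to n bytes from one segment and returns the number of bytes
 * NOT copied. It never writes to dst beyond what it copied. */
static size_t seg_copy_nozero(char *dst, const struct fake_seg *seg, size_t n)
{
    size_t ok = n < seg->resident ? n : seg->resident;

    memcpy(dst, seg->base, ok);
    return n - ok;
}

/* Walks the segments, stops at the first short copy, and returns the
 * number of bytes actually copied into dst. */
static size_t copy_from_segs(char *dst, const struct fake_seg *segs,
                             size_t nsegs, size_t bytes)
{
    size_t copied = 0;

    assert(dst != NULL && segs != NULL);
    for (size_t i = 0; i < nsegs && copied < bytes; i++) {
        size_t want = bytes - copied;
        size_t left;

        if (want > segs[i].len)
            want = segs[i].len;
        left = seg_copy_nozero(dst + copied, &segs[i], want);
        copied += want - left;
        if (left)
            break; /* short copy: later segments are not attempted */
    }
    return copied;
}

/* A caller that does need a zeroed tail clears it itself. */
static size_t copy_from_segs_zero_tail(char *dst, const struct fake_seg *segs,
                                       size_t nsegs, size_t bytes)
{
    size_t copied = copy_from_segs(dst, segs, nsegs, bytes);

    if (copied < bytes)
        memset(dst + copied, 0, bytes - copied);
    return copied;
}
```

Keeping the zeroing in the caller makes the decision explicit: the atomic attempt can skip it, while a final, non-atomic attempt can still clear whatever was not copied.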
      
      This patch also adds some commentary to filemap.h and asm-i386/uaccess.h.
      
      After this patch, all architectures that might disable preempt when
      kmap_atomic is called need to have their __copy_from_user_inatomic* "fixed".
      This includes
       - powerpc
       - i386
       - mips
       - sparc
      Signed-off-by: Neil Brown <neilb@suse.de>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Anton Altaparmakov <aia21@cantab.net>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: William Lee Irwin III <wli@holomorphy.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  10. 23 Jun, 2006 2 commits
    • [PATCH] x86: fix __range_ok constraint · 722f4f5b
      Committed by Roman Zippel
      An immediate operand can't be the destination of the cmpl instruction,
      so exclude it.
      Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
      Cc: Mattia Dongili <malattia@linux.it>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] x86: cache pollution aware __copy_from_user_ll() · c22ce143
      Committed by Hiro Yoshioka
      Use the x86 cache-bypassing copy instructions for copy_from_user().
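
As a rough user-space illustration of a cache-bypassing copy, the sketch below uses the SSE2 intrinsic `_mm_stream_si32`, which issues a non-temporal 4-byte store. It is not the kernel routine from this patch: `copy_nocache` is a hypothetical name, and the code falls back to a plain `memcpy` when the compiler is not targeting SSE2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Copies n bytes from src to dst. On SSE2 builds the bulk of the data is
 * written with non-temporal stores so it does not displace cache lines. */
static void copy_nocache(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    assert(dst != NULL && src != NULL);
#if defined(__SSE2__)
    /* Copy single bytes until the destination is 4-byte aligned. */
    while (n && ((uintptr_t)d & 3)) {
        *d++ = *s++;
        n--;
    }
    while (n >= 4) {
        int word;

        memcpy(&word, s, sizeof word);   /* the source may be unaligned */
        _mm_stream_si32((int *)d, word); /* non-temporal store */
        d += 4;
        s += 4;
        n -= 4;
    }
    _mm_sfence(); /* make the streamed stores visible before later stores */
#endif
    memcpy(d, s, n); /* remaining tail, or everything on non-SSE2 builds */
}
```

Whether such a copy is a net win depends on the workload: it helps when the destination will not be read again soon, which is the case the measurements below target.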
      
      Some performance data:
      
      Total of GLOBAL_POWER_EVENTS (CPU cycle samples)
      
      2.6.12.4.orig    1921587
      2.6.12.4.nt      1599424
      1599424/1921587=83.23% (16.77% reduction)
      
      BSQ_CACHE_REFERENCE (L3 cache miss)
      2.6.12.4.orig      57427
      2.6.12.4.nt        20858
      20858/57427=36.32% (63.7% reduction)
      
      L3 cache miss reduction of __copy_from_user_ll
      samples  %
      37408    65.1412  vmlinux                  __copy_from_user_ll
      23        0.1103  vmlinux                  __copy_user_zeroing_intel_nocache
      23/37408=0.061% (99.94% reduction)
      
      Top 5 of 2.6.12.4.nt
      Counted GLOBAL_POWER_EVENTS events (time during which processor is not stopped) with a unit mask of 0x01 (mandatory) count 100000
      samples  %        app name                 symbol name
      128392    8.0274  vmlinux                  __copy_user_zeroing_intel_nocache
      64206     4.0143  vmlinux                  journal_add_journal_head
      59746     3.7355  vmlinux                  do_get_write_access
      47674     2.9807  vmlinux                  journal_put_journal_head
      46021     2.8774  vmlinux                  journal_dirty_metadata
      pattern9-0-cpu4-0-09011728/summary.out
      
      Counted BSQ_CACHE_REFERENCE events (cache references seen by the bus unit) with a unit mask of 0x3f (multiple flags) count 3000
      samples  %        app name                 symbol name
      69755     4.2861  vmlinux                  __copy_user_zeroing_intel_nocache
      55685     3.4215  vmlinux                  journal_add_journal_head
      52371     3.2179  vmlinux                  __find_get_block
      45504     2.7960  vmlinux                  journal_put_journal_head
      36005     2.2123  vmlinux                  journal_stop
      pattern9-0-cpu4-0-09011744/summary.out
      
      Counted BSQ_CACHE_REFERENCE events (cache references seen by the bus unit) with a unit mask of 0x200 (read 3rd level cache miss) count 3000
      samples  %        app name                 symbol name
      1147      5.4994  vmlinux                  journal_add_journal_head
      881       4.2240  vmlinux                  journal_dirty_data
      872       4.1809  vmlinux                  blk_rq_map_sg
      734       3.5192  vmlinux                  journal_commit_transaction
      617       2.9582  vmlinux                  radix_tree_delete
      pattern9-0-cpu4-0-09011731/summary.out
      
      iozone results:
      
      original 2.6.12.4 CPU time = 207.768 sec
      cache aware       CPU time = 184.783 sec
      (totals over three runs)
      184.783/207.768=88.94% (11.06% reduction)
      
      original:
      pattern9-0-cpu4-0-08191720/iozone.out:  CPU Utilization: Wall time   45.997    CPU time   64.527    CPU utilization 140.28 %
      pattern9-0-cpu4-0-08191741/iozone.out:  CPU Utilization: Wall time   46.878    CPU time   71.933    CPU utilization 153.45 %
      pattern9-0-cpu4-0-08191743/iozone.out:  CPU Utilization: Wall time   45.152    CPU time   71.308    CPU utilization 157.93 %
      
      cache aware:
      pattern9-0-cpu4-0-09011728/iozone.out:  CPU Utilization: Wall time   44.842    CPU time   62.465    CPU utilization 139.30 %
      pattern9-0-cpu4-0-09011731/iozone.out:  CPU Utilization: Wall time   44.718    CPU time   59.273    CPU utilization 132.55 %
      pattern9-0-cpu4-0-09011744/iozone.out:  CPU Utilization: Wall time   44.367    CPU time   63.045    CPU utilization 142.10 %
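
The quoted CPU-time totals and the reduction figure can be re-derived from the six per-run CPU times listed above. The helper names `sum_runs` and `reduction_percent` are hypothetical and exist only for this check.

```c
#include <assert.h>
#include <stddef.h>

/* Adds up n per-run CPU times (seconds). */
static double sum_runs(const double *runs, size_t n)
{
    double total = 0.0;

    for (size_t i = 0; i < n; i++)
        total += runs[i];
    return total;
}

/* Percentage by which `after` is smaller than `before`. */
static double reduction_percent(double before, double after)
{
    assert(before > 0.0);
    return (1.0 - after / before) * 100.0;
}

/* CPU times copied from the iozone output lines above. */
static const double original_runs[3]    = { 64.527, 71.933, 71.308 };
static const double cache_aware_runs[3] = { 62.465, 59.273, 63.045 };
```

Summing each set gives the 207.768 sec and 184.783 sec totals, and their ratio gives the reduction of roughly 11.06%.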
      Signed-off-by: Hiro Yoshioka <hyoshiok@miraclelinux.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  11. 26 Apr, 2006 1 commit
  12. 23 Mar, 2006 1 commit
  13. 15 Jan, 2006 1 commit
  14. 08 Sep, 2005 1 commit
  15. 17 Apr, 2005 1 commit
    • Linux-2.6.12-rc2 · 1da177e4
      Committed by Linus Torvalds
      Initial git repository build. I'm not bothering with the full history,
      even though we have it. We can create a separate "historical" git
      archive of that later if we want to, and in the meantime it's about
      3.2GB when imported into git - space that would just make the early
      git days unnecessarily complicated, when we don't have a lot of good
      infrastructure for it.
      
      Let it rip!