1. 30 Apr, 2012 1 commit
  2. 26 Feb, 2009 1 commit
    • powerpc: Fix 64bit memcpy() regression · e423b9ec
      Committed by Mark Nelson
      This fixes a regression introduced by commit
      25d6e2d7 ("powerpc: Update 64bit memcpy()
      using CPU_FTR_UNALIGNED_LD_STD").
      
      This commit allowed CPUs that have the CPU_FTR_UNALIGNED_LD_STD CPU
      feature bit present to do the memcpy() with unaligned load doubles.
      But along with this came a bug where our final load double would read
      bytes beyond a page boundary and into the next (unmapped) page. This
      was caught by enabling CONFIG_DEBUG_PAGEALLOC.
      
      The fix was to read only the number of bytes that we need to store rather
      than reading a full 8-byte doubleword and storing only a portion of that.
      
      In order to minimise the amount of existing code touched we use the
      original do_tail for the src_unaligned case.
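      
      To make the idea concrete, here is a minimal C sketch (not the
      actual powerpc assembly; both helper names are hypothetical)
      contrasting the buggy over-reading tail with a byte-exact tail:
      
        #include <stddef.h>
        #include <stdint.h>
        #include <string.h>
        
        /* Buggy pattern: a full doubleword load for a short tail can
         * cross into the next (possibly unmapped) page. */
        void tail_overread(unsigned char *dst, const unsigned char *src,
                           size_t tail /* 1..7 */)
        {
            uint64_t d = *(const uint64_t *)src; /* like ld: always 8 bytes */
            memcpy(dst, &d, tail);               /* only `tail` are stored  */
        }
        
        /* The fix in spirit: never load more bytes than will be stored. */
        void tail_exact(unsigned char *dst, const unsigned char *src,
                        size_t tail)
        {
            while (tail--)
                *dst++ = *src++;                 /* never reads past the end */
        }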
      
      Below is an example of the regression, as reported by Sachin Sant:
      
      Unable to handle kernel paging request for data at address 0xc00000003f380000
      Faulting instruction address: 0xc000000000039574
      cpu 0x1: Vector: 300 (Data Access) at [c00000003baf3020]
          pc: c000000000039574: .memcpy+0x74/0x244
          lr: d00000000244916c: .ext3_xattr_get+0x288/0x2f4 [ext3]
          sp: c00000003baf32a0
         msr: 8000000000009032
         dar: c00000003f380000
       dsisr: 40000000
        current = 0xc00000003e54b010
        paca    = 0xc000000000a53680
          pid   = 1840, comm = readahead
      enter ? for help
      [link register   ] d00000000244916c .ext3_xattr_get+0x288/0x2f4 [ext3]
      [c00000003baf32a0] d000000002449104 .ext3_xattr_get+0x220/0x2f4 [ext3] (unreliable)
      [c00000003baf3390] d00000000244a6e8 .ext3_xattr_security_get+0x40/0x5c [ext3]
      [c00000003baf3400] c000000000148154 .generic_getxattr+0x74/0x9c
      [c00000003baf34a0] c000000000333400 .inode_doinit_with_dentry+0x1c4/0x678
      [c00000003baf3560] c00000000032c6b0 .security_d_instantiate+0x50/0x68
      [c00000003baf35e0] c00000000013c818 .d_instantiate+0x78/0x9c
      [c00000003baf3680] c00000000013ced0 .d_splice_alias+0xf0/0x120
      [c00000003baf3720] d00000000243e05c .ext3_lookup+0xec/0x134 [ext3]
      [c00000003baf37c0] c000000000131e74 .do_lookup+0x110/0x260
      [c00000003baf3880] c000000000134ed0 .__link_path_walk+0xa98/0x1010
      [c00000003baf3970] c0000000001354a0 .path_walk+0x58/0xc4
      [c00000003baf3a20] c000000000135720 .do_path_lookup+0x138/0x1e4
      [c00000003baf3ad0] c00000000013645c .path_lookup_open+0x6c/0xc8
      [c00000003baf3b70] c000000000136780 .do_filp_open+0xcc/0x874
      [c00000003baf3d10] c0000000001251e0 .do_sys_open+0x80/0x140
      [c00000003baf3dc0] c00000000016aaec .compat_sys_open+0x24/0x38
      [c00000003baf3e30] c00000000000855c syscall_exit+0x0/0x40
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
  3. 05 Nov, 2008 1 commit
    • powerpc: Update 64bit memcpy() using CPU_FTR_UNALIGNED_LD_STD · 25d6e2d7
      Committed by Mark Nelson
      Update memcpy() to add two new feature sections: one for aligning the
      destination before copying and one for copying using aligned load
      and store doubles.
      
      These new feature sections will only affect Power6 and Cell because
      the CPU feature bit was only added to these two processors.
      
      Power6 gets its best performance in memcpy() when aligning neither the
      source nor the destination, while Cell gets its best performance when
      just the destination is aligned. But in order to save on CPU feature
      bits we can use the previously added CPU_FTR_CP_USE_DCBTZ feature bit
      to differentiate between Power6 and Cell (because CPU_FTR_CP_USE_DCBTZ
      was added to Cell but not Power6).
      
      The first feature section acts to nop out the branch that takes us to
      the code that aligns the destination to an eight-byte boundary. We
      only want to nop out this branch on Power6.
      
      So the ALT_FTR_SECTION_END() for this feature section creates a test
      mask of the two feature bits ORed together and provides an expected
      result of just CPU_FTR_UNALIGNED_LD_STD; thus we nop out the branch
      if we're on a CPU that has CPU_FTR_UNALIGNED_LD_STD set and
      CPU_FTR_CP_USE_DCBTZ unset.
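      
      As an illustration only, the mask-and-expected-value test amounts
      to the following C check (the bit values here are made up; the
      real definitions live in the kernel's cputable):
      
        #include <stdbool.h>
        #include <stdint.h>
        
        #define CPU_FTR_UNALIGNED_LD_STD (1ULL << 0) /* hypothetical bit */
        #define CPU_FTR_CP_USE_DCBTZ     (1ULL << 1) /* hypothetical bit */
        
        /* Nop out the destination-alignment branch only on a CPU with
         * CPU_FTR_UNALIGNED_LD_STD set and CPU_FTR_CP_USE_DCBTZ clear,
         * i.e. Power6 but not Cell. */
        static bool nop_align_branch(uint64_t cpu_features)
        {
            uint64_t mask = CPU_FTR_UNALIGNED_LD_STD | CPU_FTR_CP_USE_DCBTZ;
            return (cpu_features & mask) == CPU_FTR_UNALIGNED_LD_STD;
        }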
      
      For the second feature section added, if we're on a CPU that has the
      CPU_FTR_UNALIGNED_LD_STD bit set then we don't want to do the copy
      with aligned loads and stores (and the appropriate shifting left and
      right instructions), so we want to nop out the branch to
      .Lsrc_unaligned.
      
      The andi. used for this branch is moved to just above the branch
      because this allows us to nop out both instructions with just one
      feature section; this gives us better performance without the
      readability cost that two separate feature sections had.
      
      Moving the andi. to just above the branch doesn't have any noticeable
      negative effect on the remaining 64bit processors (the ones that
      didn't have this feature bit added).
      
      On Cell this simple modification results in an improvement to measured
      memcpy() bandwidth of up to 50% in the hot cache case and up to 15% in
      the cold cache case.
      
      On Power6 we get memory bandwidth results that are up to three times
      faster in the hot cache case and up to 50% faster in the cold cache
      case.
      
      CPU_FTR_CP_USE_DCBTZ itself was added in commit
      2a929436 ("powerpc: Add new CPU feature:
      CPU_FTR_CP_USE_DCBTZ").
      
      Cell gets its best performance in memcpy() with just the destination
      aligned only because the indirect shift and rotate instructions, sld
      and srd, are microcoded on Cell. This means that either the
      destination or the source can be aligned, but not both; since we get
      better performance with the destination aligned, we choose that
      option.
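      
      For reference, a minimal C sketch of this shift-and-merge copy (a
      hypothetical helper, not the kernel code; the merge order follows
      big-endian powerpc, and the byte-exact tail is assumed to be
      handled separately, as in the fix of commit e423b9ec above):
      
        #include <stddef.h>
        #include <stdint.h>
        
        /* Copy ndw doublewords using only aligned loads, merging
         * neighbouring words with shifts -- the moral equivalent of the
         * sld/srd pair.  All loads stay inside the aligned doublewords
         * that enclose the source bytes, so they never stray onto
         * another page. */
        static void copy_shifted(uint64_t *dst, const unsigned char *src,
                                 size_t ndw)
        {
            size_t off = (uintptr_t)src & 7;       /* source misalignment */
            const uint64_t *s = (const uint64_t *)(src - off);
            unsigned lsh = 8 * off;                /* bits shifted left   */
            unsigned rsh = 64 - lsh;               /* bits shifted right  */
            uint64_t hi;
        
            if (off == 0) {                        /* aligned: plain copy */
                while (ndw--)
                    *dst++ = *s++;
                return;
            }
            hi = *s++;                             /* first aligned word  */
            while (ndw--) {
                uint64_t lo = *s++;                /* next aligned word   */
                *dst++ = (hi << lsh) | (lo >> rsh);
                hi = lo;
            }
        }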
      
      While we're at it, make a one-line change from cmpldi r1,... to
      cmpldi cr1,... for consistency.
      Signed-off-by: Mark Nelson <markn@au1.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
  4. 13 Apr, 2007 1 commit
  5. 31 Aug, 2006 1 commit
  6. 10 Feb, 2006 1 commit
  7. 10 Oct, 2005 1 commit
  8. 26 Sep, 2005 1 commit
    • powerpc: Merge enough to start building in arch/powerpc. · 14cf11af
      Committed by Paul Mackerras
      This creates the directory structure under arch/powerpc and a bunch
      of Kconfig files.  It does a first-cut merge of arch/powerpc/mm,
      arch/powerpc/lib and arch/powerpc/platforms/powermac.  This is enough
      to build a 32-bit powermac kernel with ARCH=powerpc.
      
      For now we are getting some unmerged files from arch/ppc/kernel and
      arch/ppc/syslib, or arch/ppc64/kernel.  This makes some minor changes
      to files in those directories and files outside arch/powerpc.
      
      The boot directory is still not merged.  That's going to be interesting.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
  9. 17 Apr, 2005 1 commit
    • Linux-2.6.12-rc2 · 1da177e4
      Committed by Linus Torvalds
      Initial git repository build. I'm not bothering with the full history,
      even though we have it. We can create a separate "historical" git
      archive of that later if we want to, and in the meantime it's about
      3.2GB when imported into git - space that would just make the early
      git days unnecessarily complicated, when we don't have a lot of good
      infrastructure for it.
      
      Let it rip!