1. 03 Feb, 2000 1 commit
    • Andy Polyakov committed · fb81ac5e
      Support for "multiply high" instruction, see BN_UMULT_HIGH comment in
      crypto/bn/bn_lcl.h for further details. Note that at the time of
      writing the code has been tested only on Alpha. When compiled with
      DEC C, the C implementation shows a 12% performance improvement over
      crypto/bn/asm/alpha.s (on an EV56 box running AlphaLinux). GNU C is
      (unfortunately) 8% behind the assembler implementation. But it is
      OpenVMS Alpha users who *may* benefit most, as 'apps/openssl speed rsa'
      shows a 6 (six) times performance improvement over the original VMS
      bignum implementation. Here "*may*" means "as soon as the code is
      enabled through #define SIXTY_FOUR_BIT and crypto/bn/asm/vms.mar is
      skipped."
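As a rough illustration of what a "multiply high" primitive buys a bignum library, the sketch below computes the upper 64 bits of a 64x64-bit product in portable C and uses it in a single multiply-and-accumulate word step. The names `umult_high64` and `mul_add_word` are hypothetical and chosen for this example; this is not the BN_UMULT_HIGH definition from crypto/bn/bn_lcl.h. On Alpha the helper would presumably map to a single instruction, whereas the portable fallback shown here needs four partial multiplications.

```c
#include <stdint.h>

/* Hypothetical helper: upper 64 bits of the 128-bit product a * b,
 * built from four 32x32 -> 64 partial products. */
static uint64_t umult_high64(uint64_t a, uint64_t b)
{
    uint64_t a_lo = a & 0xffffffffu, a_hi = a >> 32;
    uint64_t b_lo = b & 0xffffffffu, b_hi = b >> 32;

    uint64_t lo_lo = a_lo * b_lo;
    uint64_t hi_lo = a_hi * b_lo;
    uint64_t lo_hi = a_lo * b_hi;
    uint64_t hi_hi = a_hi * b_hi;

    /* Sum the low halves of the middle products plus the carry out of
     * the low product; at most 3 * (2^32 - 1), so it cannot overflow. */
    uint64_t mid = (lo_lo >> 32) + (hi_lo & 0xffffffffu) + (lo_hi & 0xffffffffu);

    return hi_hi + (hi_lo >> 32) + (lo_hi >> 32) + (mid >> 32);
}

/* Hypothetical word step: *r = low word of (a * w + *r + carry),
 * return value = high word, i.e. the carry into the next word. */
static uint64_t mul_add_word(uint64_t *r, uint64_t a, uint64_t w, uint64_t carry)
{
    uint64_t high = umult_high64(a, w);
    uint64_t low = a * w;           /* low 64 bits, wraps modulo 2^64 */

    low += carry;
    high += (low < carry);          /* carry out of the first addition */
    low += *r;
    high += (low < *r);             /* carry out of the second addition */

    *r = low;
    return high;
}
```

The high word never overflows here, because the largest possible value of a * w + *r + carry is exactly 2^128 - 1.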
  2. 02 Feb, 2000 1 commit
  3. 01 Feb, 2000 2 commits
  4. 14 Dec, 1999 1 commit
  5. 09 Dec, 1999 1 commit
  6. 30 Sep, 1999 1 commit
  7. 25 Aug, 1999 1 commit
  8. 03 Aug, 1999 1 commit
  9. 01 Aug, 1999 1 commit
    • Andy Polyakov committed · 4c22909e
      Extra i386+gcc bn_div.c tune-up featuring inline division and saving
      the remainder left in %edx. Here is the resulting performance improvement
      matrix (improvement as a result of this *and* the previous tune-up
      committed two days ago). The results were obtained by profiling the "div"
      part of crypto/bn/bnspeed.c.
      
      CPU      BN_div  bn_div_words    overall  comment
      ----------------------------------------------------------------------------
      PII      +16%    accumulated by  +2-3%    PII multiplies damn fast! Taking
                       inlining                 multiplication out of the loop
                                                didn't make too much difference.
                                                Eliminating the multiplication
                                                involved in remainder calculation
                                                is the major factor.

      Pentium  +45%    accumulated by  +7-9%    mull isn't that fast and replacing
                       inlining                 multiplications with additions in
                                                the loop has more visible effect:-)

      MIPS     +75%    +12%            +20-25%  In addition to taking mults
      R10000                                    out of the loop (giving 12% in
                                                asm/mips3.s), three mults were
                                                eliminated in BN_div.

      Alpha    +30%    +50%            +10-15%  Same as above. But remember that
      EV4                                       bn_div_words is a C implementation.
                                                It takes 4 Alpha mults in C to do
                                                the same thing as 1 MIPS mult in
                                                assembler does. So the effect (50%)
                                                is more impressive. But not the
                                                overall one... Well, if Alpha
                                                bn_mul_add were implemented
                                                in assembler, overall improvement
                                                would be closer to MIPS...
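The idea of keeping the remainder that the divide instruction leaves behind, rather than recomputing it with a multiplication, can be sketched as follows. This is an illustrative sketch only, not the code from bn_div.c: the function names `div_words_rem` and `rem_by_multiply` are hypothetical, 32-bit words are assumed, and the inline assembly path is taken only with a GCC-compatible compiler on x86 (a portable fallback is used otherwise).

```c
#include <stdint.h>

/*
 * Hypothetical helper: divide the two-word value (n0:n1) by d0, return
 * the quotient and store the remainder in *rem.
 * Precondition: n0 < d0, so the quotient fits in one 32-bit word
 * (otherwise divl raises a divide error).
 */
static uint32_t div_words_rem(uint32_t n0, uint32_t n1, uint32_t d0,
                              uint32_t *rem)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    uint32_t q, r;

    /* divl divides %edx:%eax by its operand and leaves the quotient in
     * %eax and the remainder in %edx, so both come from one instruction. */
    __asm__("divl %4"
            : "=a"(q), "=d"(r)
            : "a"(n1), "d"(n0), "rm"(d0)
            : "cc");
    *rem = r;
    return q;
#else
    /* Portable fallback using a 64-bit intermediate. */
    uint64_t n = ((uint64_t)n0 << 32) | n1;

    *rem = (uint32_t)(n % d0);
    return (uint32_t)(n / d0);
#endif
}

/* The alternative the tune-up avoids: recover the remainder from the
 * quotient with an extra multiplication (arithmetic modulo 2^32). */
static uint32_t rem_by_multiply(uint32_t n1, uint32_t q, uint32_t d0)
{
    return n1 - q * d0;
}
```

Both routes give the same remainder; the difference is only the extra multiply, which is consistent with the table above where the slower `mull` on Pentium makes the saving more visible than on PII.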
  10. 30 Jul, 1999 1 commit
  11. 10 Jun, 1999 1 commit
  12. 05 Jun, 1999 1 commit
  13. 20 Apr, 1999 1 commit
  14. 21 Dec, 1998 3 commits