1. 23 Jul 2012, 1 commit
      block-sha1: avoid pointer conversion that violates alignment constraints · 5f6a1125
      Jonathan Nieder authored
      With 660231aa (block-sha1: support for architectures with memory
      alignment restrictions, 2009-08-12), blk_SHA1_Update was modified to
      access 32-bit chunks of memory one byte at a time on arches that
      prefer that:
      
      	#define get_be32(p)    ( \
      		(*((unsigned char *)(p) + 0) << 24) | \
      		(*((unsigned char *)(p) + 1) << 16) | \
      		(*((unsigned char *)(p) + 2) <<  8) | \
      		(*((unsigned char *)(p) + 3) <<  0) )
      
      The code previously accessed these values by just using htonl(*p).
      
      Unfortunately, Michael noticed on an Alpha machine that git was using
      plain 32-bit reads anyway.  As soon as we convert a pointer to int *,
      the compiler can assume that the object pointed to is correctly
      aligned as an int (C99 section 6.3.2.3 "pointer conversions"
      paragraph 7), and gcc takes full advantage by using a single 32-bit
      load, resulting in a whole bunch of unaligned access traps.
      
      So we need to obey the alignment constraints even when only dealing
      with pointers instead of actual values.  Do so by changing the type
      of 'data' to void *.  This patch renames 'data' to 'block' at the same
      time to make sure all references are updated to reflect the new type.
      Reported-tested-and-explained-by: Michael Cree <mcree@orcon.net.nz>
      Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
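      
      A minimal sketch of the idea (illustrative names, not the actual git code):
      keep the block parameter as a plain void * and only ever form unsigned
      char * views of it, so the compiler never sees an int * it could assume
      is aligned.
      
      	#include <stdint.h>
      
      	/* Byte-at-a-time big-endian load: no aligned pointer type is ever
      	 * formed, so the compiler cannot emit a single (possibly trapping)
      	 * 32-bit load. */
      	static uint32_t get_be32(const void *p)
      	{
      		const unsigned char *b = p;
      		return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
      		       ((uint32_t)b[2] <<  8) |  (uint32_t)b[3];
      	}
      
      	/* Hypothetical consumer: taking const void * (rather than
      	 * const unsigned int *) is the point of the fix -- once an int *
      	 * exists, C99 6.3.2.3p7 lets the compiler assume int alignment. */
      	static void load_block(uint32_t W[16], const void *block)
      	{
      		const unsigned char *p = block;
      		for (int t = 0; t < 16; t++, p += 4)
      			W[t] = get_be32(p);
      	}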
  2. 19 Aug 2009, 4 commits
  3. 15 Aug 2009, 1 commit
  4. 14 Aug 2009, 1 commit
  5. 13 Aug 2009, 3 commits
  6. 11 Aug 2009, 1 commit
      block-sha1: improve code on large-register-set machines · 926172c5
      Linus Torvalds authored
      For x86 performance (especially in 32-bit mode) I added that hack to write
      the SHA1 internal temporary hash using a volatile pointer, in order to get
      gcc to not try to cache the array contents, because gcc will do all the
      wrong things and then spill things in insane random ways.
      
      But on architectures like PPC, where you have 32 registers, it's actually
      perfectly reasonable to put the whole temporary array[] into the register
      set, and gcc can do so.
      
      So make the 'volatile unsigned int *' cast be dependent on a
      SMALL_REGISTER_SET preprocessor symbol, and enable it (currently) on just
      x86 and x86-64.  With that, the routine is fairly reasonable even when
      compared to the hand-scheduled PPC version. Ben Herrenschmidt reports on
      a G5:
      
       * Paulus asm version:       about 3.67s
       * Yours with no change:     about 5.74s
       * Yours without "volatile": about 3.78s
      
      so with this the C version is within about 3% of the asm one.
      
      And add a lot of commentary on what the heck is going on.
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
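      
      A paraphrased sketch of the approach (the SMALL_REGISTER_SET symbol is
      from the commit text; the setW macro shape here is illustrative): force
      the volatile stores only on register-starved x86, and let gcc keep the
      work array in registers everywhere else.
      
      	#if defined(__i386__) || defined(__x86_64__)
      	#define SMALL_REGISTER_SET
      	#endif
      
      	#ifdef SMALL_REGISTER_SET
      	/* Few registers: force real, in-order stores so gcc stops trying to
      	 * cache the array and spilling it at random. */
      	#define setW(W, x, val) (*(volatile unsigned int *)&(W)[(x) & 15] = (val))
      	#else
      	/* Plenty of registers (e.g. PPC with 32): the array can live in them. */
      	#define setW(W, x, val) ((W)[(x) & 15] = (val))
      	#endif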
  7. 08 Aug 2009, 2 commits
      block-sha1: improved SHA1 hashing · 66c9c6c0
      Linus Torvalds authored
      I think I have found a way to avoid the gcc craziness.
      
      Lookie here:
      
      	#             TIME[s] SPEED[MB/s]
      	rfc3174         5.094       119.8
      	rfc3174         5.098       119.7
      	linus           1.462       417.5
      	linusas         2.008         304
      	linusas2        1.878         325
      	mozilla         5.566       109.6
      	mozillaas       5.866       104.1
      	openssl         1.609       379.3
      	spelvin         1.675       364.5
      	spelvina        1.601       381.3
      	nettle          1.591       383.6
      
      notice? I outperform all the hand-tuned asm on 32-bit too. By quite a
      margin, in fact.
      
      Now, I didn't try a P4, and it's possible that it won't do that there, but
      the 32-bit code generation sure looks impressive on my Nehalem box. The
      magic? I force the stores to the 512-bit hash bucket to be done in order.
      That seems to help a lot.
      
      The diff is trivial (on top of the "rename registers with cpp" patch), as
      appended. And it does seem to fix the P4 issues too, although I can
      obviously (once again) only test Prescott, and only in 64-bit mode:
      
      	#             TIME[s] SPEED[MB/s]
      	rfc3174         1.662       36.73
      	rfc3174          1.64       37.22
      	linus          0.2523       241.9
      	linusas        0.4367       139.8
      	linusas2       0.4487         136
      	mozilla        0.9704        62.9
      	mozillaas      0.9399       64.94
      
      that's some really impressive improvement. All from just saying "do the
      stores in the order I told you to, dammit!" to the compiler.
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
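      
      A minimal sketch of the store-ordering trick described above (names are
      illustrative; assumes a 16-word, 512-bit work buffer): writing through a
      volatile pointer stops gcc from caching or reordering the stores, so they
      reach memory in program order.
      
      	#include <stdint.h>
      
      	/* Every store into the 512-bit (16 x 32-bit) buffer becomes a real
      	 * memory store, issued in the order written. */
      	#define setW(W, t, val) (*(volatile uint32_t *)&(W)[(t) & 15] = (val))
      
      	static void fill_in_order(uint32_t W[16], const uint32_t *src)
      	{
      		int t;
      		for (t = 0; t < 16; t++)
      			setW(W, t, src[t]);	/* emitted in this exact order */
      	}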
      block-sha1: perform register rotation using cpp · 30d12d4c
      Linus Torvalds authored
      Instead of letting the compiler figure out the optimal way to rotate
      register usage, explicitly rotate the register names with cpp.
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
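      
      A self-contained sketch of the technique (round structure and names are
      illustrative, not the exact git macros): each call site permutes which
      variable plays the A/B/C/D/E role, so no values ever have to be moved
      between variables after a round.
      
      	#include <stdint.h>
      
      	#define SHA_ROT(x, n) (((x) << (n)) | ((x) >> (32 - (n))))
      
      	/* One SHA1-style round; the caller decides which variable acts as
      	 * A, B, C, D or E instead of shuffling values around. */
      	#define ROUND(f, w, A, B, C, D, E) do { \
      		(E) += SHA_ROT(A, 5) + (f) + 0x5a827999 + (w); \
      		(B) = SHA_ROT(B, 30); \
      	} while (0)
      
      	static void five_rounds(uint32_t h[5], const uint32_t W[5])
      	{
      		uint32_t A = h[0], B = h[1], C = h[2], D = h[3], E = h[4];
      
      		/* cpp "rotates the registers": each call shifts the names. */
      		ROUND(((C ^ D) & B) ^ D, W[0], A, B, C, D, E);
      		ROUND(((B ^ C) & A) ^ C, W[1], E, A, B, C, D);
      		ROUND(((A ^ B) & E) ^ B, W[2], D, E, A, B, C);
      		ROUND(((E ^ A) & D) ^ A, W[3], C, D, E, A, B);
      		ROUND(((D ^ E) & C) ^ E, W[4], B, C, D, E, A);
      
      		h[0] += A; h[1] += B; h[2] += C; h[3] += D; h[4] += E;
      	}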
  8. 07 Aug 2009, 8 commits
  9. 06 Aug 2009, 17 commits
  10. 05 Aug 2009, 2 commits