Commit 74e334dc, authored by Denys Vlasenko, committed by Rich Felker

x86_64/memset: avoid performing final store twice

The code does a potentially misaligned 8-byte store to fill the tail
of the buffer. Then it fills the initial part of the buffer,
which is a multiple of 8 bytes.
Therefore, if size is divisible by 8, we were storing the last word twice.

This patch decrements the byte count before dividing it by 8,
making one fewer store in the "size is divisible by 8" case,
and not changing anything in all other cases.
All at the cost of replacing one MOV insn with an LEA insn.
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Parent bf2071ed
@@ -9,7 +9,7 @@ memset:
 	cmp $16,%rdx
 	jb 1f
-	mov %rdx,%rcx
+	lea -1(%rdx),%rcx
 	mov %rdi,%r8
 	shr $3,%rcx
 	mov %rax,-8(%rdi,%rdx)