Commit 993fef95, authored by Heiko Carstens, committed by Martin Schwidefsky

s390: optimize memset implementation

As with the memset16/32/64 variants, avoid having subsequent mvc
instructions depend on each other, since that might have a negative
performance impact.

This patch is currently hardly relevant, since at least gcc 7.1
generates only inline memset code and not a single memset call.
However, there is no reason not to provide an optimized version
in case gcc generates memset calls again, as it did in the past.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Parent 41879ff6
@@ -78,21 +78,25 @@ ENTRY(memset)
 	ex	%r4,0(%r3)
 	br	%r14
 .Lmemset_fill:
-	stc	%r3,0(%r2)
 	cghi	%r4,1
 	lgr	%r1,%r2
-	ber	%r14
+	je	.Lmemset_fill_exit
 	aghi	%r4,-2
-	srlg	%r3,%r4,8
-	ltgr	%r3,%r3
+	srlg	%r5,%r4,8
+	ltgr	%r5,%r5
 	jz	.Lmemset_fill_remainder
 .Lmemset_fill_loop:
-	mvc	1(256,%r1),0(%r1)
+	stc	%r3,0(%r1)
+	mvc	1(255,%r1),0(%r1)
 	la	%r1,256(%r1)
-	brctg	%r3,.Lmemset_fill_loop
+	brctg	%r5,.Lmemset_fill_loop
 .Lmemset_fill_remainder:
-	larl	%r3,.Lmemset_mvc
-	ex	%r4,0(%r3)
+	stc	%r3,0(%r1)
+	larl	%r5,.Lmemset_mvc
+	ex	%r4,0(%r5)
+	br	%r14
+.Lmemset_fill_exit:
+	stc	%r3,0(%r1)
 	br	%r14
 .Lmemset_xc:
 	xc	0(1,%r1),0(%r1)
......
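
To illustrate the dependency the commit message describes, here is a rough C model of the fill path before and after the patch. It is only a sketch, not the kernel code: the names mvc_model, fill_old_model and fill_new_model are made up for this example, the overlapping mvc is modelled as a plain left-to-right byte copy, and the .Lmemset_mvc template (not shown in the hunk) is assumed to copy (low byte of %r4) + 1 bytes from 0(%r1) to 1(%r1) when executed via ex.

```c
#include <assert.h>
#include <stddef.h>

/* Model of an overlapping mvc: bytes are copied one at a time, left to
 * right, so copying from p to p + 1 propagates p[0] across the range. */
static void mvc_model(unsigned char *dst, const unsigned char *src, size_t len)
{
	for (size_t i = 0; i < len; i++)
		dst[i] = src[i];
}

/* Hypothetical rendering of the old path: the fill byte is stored once,
 * and every later mvc reads the last byte written by the previous mvc. */
static void *fill_old_model(void *s, int c, size_t n)
{
	unsigned char *p = s;

	assert(n > 0);
	p[0] = (unsigned char)c;		/* stc %r3,0(%r2) */
	if (n == 1)
		return s;			/* ber %r14 */
	n -= 2;					/* aghi %r4,-2 */
	for (size_t blocks = n >> 8; blocks > 0; blocks--) {
		mvc_model(p + 1, p, 256);	/* mvc 1(256,%r1),0(%r1) */
		p += 256;			/* la %r1,256(%r1) */
	}
	mvc_model(p + 1, p, (n & 0xff) + 1);	/* ex of the assumed template */
	return s;
}

/* Hypothetical rendering of the patched path: each block is seeded with
 * its own stc, so no mvc reads data produced by an earlier mvc. */
static void *fill_new_model(void *s, int c, size_t n)
{
	unsigned char *p = s;
	unsigned char byte = (unsigned char)c;

	assert(n > 0);
	if (n == 1) {				/* je .Lmemset_fill_exit */
		p[0] = byte;
		return s;
	}
	n -= 2;					/* aghi %r4,-2 */
	for (size_t blocks = n >> 8; blocks > 0; blocks--) {
		p[0] = byte;			/* stc %r3,0(%r1) */
		mvc_model(p + 1, p, 255);	/* mvc 1(255,%r1),0(%r1) */
		p += 256;			/* la %r1,256(%r1) */
	}
	p[0] = byte;				/* stc in the remainder path */
	mvc_model(p + 1, p, (n & 0xff) + 1);	/* ex of the assumed template */
	return s;
}
```

Both models fill exactly n bytes. The difference is where each copy gets its source byte: in the old variant it comes from the preceding mvc, while in the new variant it comes from a store of the fill byte that does not depend on earlier copies. Moving the counter from %r3 to %r5 is what keeps the fill byte available in %r3 for those extra stores.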