Commit 5a9c8c05 authored by Rich Felker

mitigate performance regression in libc-internal locks on x86_64

commit 3c43c076 fixed missing
synchronization in the atomic store operation for i386 and x86_64, but
opted to use mfence for the barrier on x86_64 where it's always
available. however, in practice mfence is significantly slower than
the barrier approach used on i386 (a nop-like lock orl operation).
this commit changes x86_64 (and x32) to use the faster barrier.
Parent c13f2af1
......@@ -83,7 +83,7 @@ static inline void a_dec(volatile int *x)
static inline void a_store(volatile int *p, int x)
{
-	__asm__( "mov %1, %0 ; mfence" : "=m"(*p) : "r"(x) : "memory" );
+	__asm__( "mov %1, %0 ; lock ; orl $0,(%%rsp)" : "=m"(*p) : "r"(x) : "memory" );
}
static inline void a_spin()
......
......@@ -83,7 +83,7 @@ static inline void a_dec(volatile int *x)
static inline void a_store(volatile int *p, int x)
{
-	__asm__( "mov %1, %0 ; mfence" : "=m"(*p) : "r"(x) : "memory" );
+	__asm__( "mov %1, %0 ; lock ; orl $0,(%%rsp)" : "=m"(*p) : "r"(x) : "memory" );
}
static inline void a_spin()
......