-
由 Zach Brown 提交于
It's very common for the buffer heads in the lru to have different block numbers. By comparing the blocknr before the bdev and size we can reduce the cost of searching in the very common case where all the entries have the same bdev and size. In quick hot cache cycle counting tests on a single fs workstation this cut the cost of a miss by about 20%. A diff of the disassembly shows the reordering of the bdev and blocknr comparisons. This is in such a tiny loop that skipping one comparison is a meaningful portion of the total work being done: 1628: 83 c1 01 add $0x1,%ecx 162b: 83 f9 08 cmp $0x8,%ecx 162e: 74 60 je 1690 <__find_get_block+0xa0> 1630: 89 c8 mov %ecx,%eax 1632: 65 4c 8b 04 c5 00 00 mov %gs:0x0(,%rax,8),%r8 1639: 00 00 163b: 4d 85 c0 test %r8,%r8 163e: 4c 89 c3 mov %r8,%rbx 1641: 74 e5 je 1628 <__find_get_block+0x38> - 1643: 4d 3b 68 30 cmp 0x30(%r8),%r13 + 1643: 4d 3b 68 18 cmp 0x18(%r8),%r13 1647: 75 df jne 1628 <__find_get_block+0x38> - 1649: 4d 3b 60 18 cmp 0x18(%r8),%r12 + 1649: 4d 3b 60 30 cmp 0x30(%r8),%r12 164d: 75 d9 jne 1628 <__find_get_block+0x38> 164f: 49 39 50 20 cmp %rdx,0x20(%r8) 1653: 75 d3 jne 1628 <__find_get_block+0x38> Signed-off-by: NZach Brown <zab@zabbo.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
9470dd5d