1. 27 Apr 2008, 1 commit
    • x86: change x86 to use generic find_next_bit · 6fd92b63
      Alexander van Heukelum committed
      The versions with inline assembly are in fact slower on the machines I
      tested them on (in userspace) (Athlon XP 2800+, p4-like Xeon 2.8GHz, AMD
      Opteron 270). The i386-version needed a fix similar to 06024f21 to avoid
      crashing the benchmark.
      
      Benchmark using: gcc -fomit-frame-pointer -Os. For each bitmap size
      1...512, for each possible bitmap with one bit set, for each possible
      offset: find the position of the first bit starting at offset. If you
      follow ;). Times include setup of the bitmap and checking of the
      results.
      
      		Athlon		Xeon		Opteron 32/64bit
      x86-specific:	0m3.692s	0m2.820s	0m3.196s / 0m2.480s
      generic:	0m2.622s	0m1.662s	0m2.100s / 0m1.572s
      
      If the bitmap size is not a multiple of BITS_PER_LONG, and no set
      (cleared) bit is found, find_next_bit (find_next_zero_bit) returns a
      value outside of the range [0, size]. The generic version always returns
      exactly size. The generic version also uses unsigned long everywhere,
      while the x86 versions use a mishmash of int, unsigned (int), long and
      unsigned long.
      
      Using the generic version does give a slightly bigger kernel, though.
      
      defconfig:	   text    data     bss     dec     hex filename
      x86-specific:	4738555  481232  626688 5846475  5935cb vmlinux (32 bit)
      generic:	4738621  481232  626688 5846541  59360d vmlinux (32 bit)
      x86-specific:	5392395  846568  724424 6963387  6a40bb vmlinux (64 bit)
      generic:	5392458  846568  724424 6963450  6a40fa vmlinux (64 bit)
      Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  2. 17 Apr 2008, 2 commits
    • 286275c9
    • x86: bitops asm constraint fixes · 709f744f
      Jan Beulich committed
      This (simplified) piece of code didn't behave as expected due to
      incorrect constraints in some of the bitops functions, when
      X86_FEATURE_xxx refers to a bit outside the first long:
      
      int test(struct cpuinfo_x86 *c) {
      	if (cpu_has(c, X86_FEATURE_xxx))
      		clear_cpu_cap(c, X86_FEATURE_xxx);
      	return cpu_has(c, X86_FEATURE_xxx);
      }
      
      I'd really like to understand, though, what the policy of (not) having
      a "memory" clobber in these operations is - currently, this appears to
      be totally inconsistent. Also, many comments on the non-atomic
      functions say those may also be re-ordered - this contradicts the use
      of "asm volatile" in there, which again I'd like to understand.
      
      Equally, using 'int' for the 'nr' parameter and 'void *' for the
      'addr' one conflicts with Documentation/atomic_ops.txt, especially
      because bt{,c,r,s} indeed take the bit index as signed (which hence
      would really need special precaution) and access the full 32 bits
      pointed at (64 bits on x86-64, if 'unsigned long' were used properly
      here), so invalid uses like referencing a 'char' array cannot
      currently be caught.
      
      Finally, the code with and without this patch relies heavily on the
      -fno-strict-aliasing compiler switch and I'm not certain this really
      is a good idea.
      
      In the light of all of this I'm sending this as RFC, as fixing the
      above might warrant a much bigger patch...
      Signed-off-by: Jan Beulich <jbeulich@novell.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  3. 30 Jan 2008, 2 commits
  4. 11 Oct 2007, 1 commit