1. 22 9月, 2012 1 次提交
  2. 30 6月, 2012 1 次提交
  3. 21 1月, 2012 1 次提交
    • J
      x86: Adjust asm constraints in atomic64 wrappers · 819165fb
      Jan Beulich 提交于
      Eric pointed out overly restrictive constraints in atomic64_set(), but
      there are issues throughout the file. In the cited case, %ebx and %ecx
      are inputs only (don't get changed by either of the two low level
      implementations). This was also the case elsewhere.
      
      Further in many cases early-clobber indicators were missing.
      
      Finally, the previous implementation rolled a custom alternative
      instruction macro from scratch, rather than using alternative_call()
      (which was introduced with the commit that the description of the
      change in question actually refers to). Adjusting has the benefit of
      not hiding referenced symbols from the compiler, which however requires
      them to be declared not just in the exporting source file (which, as a
      desirable side effect, in turn allows that exporting file to become a
      real 5-line stub).
      
      This patch does not eliminate the overly restrictive memory clobbers,
      however: Doing so would occasionally make the compiler set up a second
      register for accessing the memory object (to satisfy the added "m"
      constraint), and it's not clear which of the two non-optimal
      alternatives is better.
      
      v2: Re-do the declaration and exporting of the internal symbols.
      Reported-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NJan Beulich <jbeulich@suse.com>
      Link: http://lkml.kernel.org/r/4F19A2A5020000780006E0D9@nat28.tlf.novell.com
      Cc: Luca Barbieri <luca@luca-barbieri.com>
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      819165fb
  4. 16 9月, 2011 1 次提交
    • L
      asm alternatives: remove incorrect alignment notes · a7f934d4
      Linus Torvalds 提交于
      On x86-64, they were just wasteful: with the explicitly added (now
      unnecessary) padding, the size of the alternatives structure was 16
      bytes, and an alignment of 8 bytes didn't hurt much.
      
      However, it was still silly, since the natural size and alignment for
      the structure is actually just 12 bytes, 4-byte aligned since commit
      59e97e4d ("x86: Make alternative instruction pointers relative").
      So removing the padding, and removing the extra alignment is just a good
      idea.
      
      On x86-32, the alignment of 4 bytes was correct, but was incorrectly
      hardcoded as 8 bytes in <asm/alternative-asm.h>.  That header file had
      used to be an x86-64 only header file, but various unification efforts
      have made it be used for x86-32 too (ie the unification of rwlock and
      rwsem).
      
      That in turn caused x86-32 boot failures, because the extra alignment
      would result in random zero-filled words in the altinstructions section,
      causing oopses early at boot when doing alternative instruction
      replacement.
      
      So just remove all the alignment noise entirely.  It's wrong, and it's
      unnecessary.  The section itself is already properly aligned by the
      linker scripts, and all additions to the section had better be of the
      proper 12-byte format, keeping it aligned.  So if the align directive
      were to ever make a difference, that would be an indication of a serious
      bug to begin with.
      Reported-by: NWerner Landgraf <w.landgraf@ru.r>
      Acked-by: NAndrew Lutomirski <luto@mit.edu>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a7f934d4
  5. 14 7月, 2011 1 次提交
  6. 19 4月, 2011 1 次提交
  7. 05 4月, 2011 1 次提交
    • J
      jump label: Introduce static_branch() interface · d430d3d7
      Jason Baron 提交于
      Introduce:
      
      static __always_inline bool static_branch(struct jump_label_key *key);
      
      instead of the old JUMP_LABEL(key, label) macro.
      
      In this way, jump labels become really easy to use:
      
      Define:
      
              struct jump_label_key jump_key;
      
      Can be used as:
      
              if (static_branch(&jump_key))
                      do unlikely code
      
      enable/disale via:
      
              jump_label_inc(&jump_key);
              jump_label_dec(&jump_key);
      
      that's it!
      
      For the jump labels disabled case, the static_branch() becomes an
      atomic_read(), and jump_label_inc()/dec() are simply atomic_inc(),
      atomic_dec() operations. We show testing results for this change below.
      
      Thanks to H. Peter Anvin for suggesting the 'static_branch()' construct.
      
      Since we now require a 'struct jump_label_key *key', we can store a pointer into
      the jump table addresses. In this way, we can enable/disable jump labels, in
      basically constant time. This change allows us to completely remove the previous
      hashtable scheme. Thanks to Peter Zijlstra for this re-write.
      
      Testing:
      
      I ran a series of 'tbench 20' runs 5 times (with reboots) for 3
      configurations, where tracepoints were disabled.
      
      jump label configured in
      avg: 815.6
      
      jump label *not* configured in (using atomic reads)
      avg: 800.1
      
      jump label *not* configured in (regular reads)
      avg: 803.4
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <20110316212947.GA8792@redhat.com>
      Signed-off-by: NJason Baron <jbaron@redhat.com>
      Suggested-by: NH. Peter Anvin <hpa@linux.intel.com>
      Tested-by: NDavid Daney <ddaney@caviumnetworks.com>
      Acked-by: NRalf Baechle <ralf@linux-mips.org>
      Acked-by: NDavid S. Miller <davem@davemloft.net>
      Acked-by: NMathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      d430d3d7
  8. 14 12月, 2010 1 次提交
  9. 07 12月, 2010 1 次提交
    • M
      x86: Introduce text_poke_smp_batch() for batch-code modifying · 7deb18dc
      Masami Hiramatsu 提交于
      Introduce text_poke_smp_batch(). This function modifies several
      text areas with one stop_machine() on SMP. Because calling
      stop_machine() is heavy task, it is better to aggregate
      text_poke requests.
      
      ( Note: I've talked with Rusty about this interface, and
        he would not like to expand stop_machine() interface, since
        it is not for generic use. )
      Signed-off-by: NMasami Hiramatsu <mhiramat@redhat.com>
      Acked-by: NSteven Rostedt <rostedt@goodmis.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: Jan Beulich <jbeulich@novell.com>
      Cc: 2nddept-manager@sdl.hitachi.co.jp
      LKML-Reference: <20101203095422.2961.51217.stgit@ltc236.sdl.hitachi.co.jp>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      7deb18dc
  10. 23 9月, 2010 1 次提交
    • J
      jump label: Base patch for jump label · bf5438fc
      Jason Baron 提交于
      base patch to implement 'jump labeling'. Based on a new 'asm goto' inline
      assembly gcc mechanism, we can now branch to labels from an 'asm goto'
      statment. This allows us to create a 'no-op' fastpath, which can subsequently
      be patched with a jump to the slowpath code. This is useful for code which
      might be rarely used, but which we'd like to be able to call, if needed.
      Tracepoints are the current usecase that these are being implemented for.
      Acked-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NJason Baron <jbaron@redhat.com>
      LKML-Reference: <ee8b3595967989fdaf84e698dc7447d315ce972a.1284733808.git.jbaron@redhat.com>
      
      [ cleaned up some formating ]
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      bf5438fc
  11. 21 9月, 2010 2 次提交
  12. 08 7月, 2010 1 次提交
    • H
      x86, alternatives: Use 16-bit numbers for cpufeature index · 83a7a2ad
      H. Peter Anvin 提交于
      We already have cpufeature indicies above 255, so use a 16-bit number
      for the alternatives index.  This consumes a padding field and so
      doesn't add any size, but it means that abusing the padding field to
      create assembly errors on overflow no longer works.  We can retain the
      test simply by redirecting it to the .discard section, however.
      
      [ v3: updated to include open-coded locations ]
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      LKML-Reference: <tip-f88731e3068f9d1392ba71cc9f50f035d26a0d4f@git.kernel.org>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      83a7a2ad
  13. 30 4月, 2010 1 次提交
    • H
      x86: Fix LOCK_PREFIX_HERE for uniprocessor build · b701a47b
      H. Peter Anvin 提交于
      Checkin b3ac891b:
      x86: Add support for lock prefix in alternatives
      
      ... did not define LOCK_PREFIX_HERE in the case of a uniprocessor
      build.  As a result, it would cause any of the usages of this macro to
      fail on a uniprocessor build.  Fix this by defining LOCK_PREFIX_HERE
      as a null string.
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      Cc: Luca Barbieri <luca@luca-barbieri.com>
      LKML-Reference: <1267005265-27958-2-git-send-email-luca@luca-barbieri.com>
      b701a47b
  14. 29 4月, 2010 1 次提交
  15. 07 4月, 2010 1 次提交
    • B
      x86: Add optimized popcnt variants · d61931d8
      Borislav Petkov 提交于
      Add support for the hardware version of the Hamming weight function,
      popcnt, present in CPUs which advertize it under CPUID, Function
      0x0000_0001_ECX[23]. On CPUs which don't support it, we fallback to the
      default lib/hweight.c sw versions.
      
      A synthetic benchmark comparing popcnt with __sw_hweight64 showed almost
      a 3x speedup on a F10h machine.
      Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
      LKML-Reference: <20100318112015.GC11152@aftab>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      d61931d8
  16. 26 2月, 2010 2 次提交
    • L
      x86: Add support for lock prefix in alternatives · b3ac891b
      Luca Barbieri 提交于
      The current lock prefix UP/SMP alternative code doesn't allow
      LOCK_PREFIX to be used in alternatives code.
      
      This patch solves the problem by adding a new LOCK_PREFIX_ALTERNATIVE_PATCH
      macro that only records the lock prefix location but does not emit
      the prefix.
      
      The user of this macro can then start any alternative sequence with
      "lock" and have it UP/SMP patched.
      
      To make this work, the UP/SMP alternative code is changed to do the
      lock/DS prefix switching only if the byte actually contains a lock or
      DS prefix.
      
      Thus, if an alternative without the "lock" is selected, it will now do
      nothing instead of clobbering the code.
      
      Changes in v2:
      - Naming change
      - Change label to not conflict with alternatives
      Signed-off-by: NLuca Barbieri <luca@luca-barbieri.com>
      LKML-Reference: <1267005265-27958-2-git-send-email-luca@luca-barbieri.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      b3ac891b
    • M
      x86: Add text_poke_smp for SMP cross modifying code · 3d55cc8a
      Masami Hiramatsu 提交于
      Add generic text_poke_smp for SMP which uses stop_machine()
      to synchronize modifying code.
      This stop_machine() method is officially described at "7.1.3
      Handling Self- and Cross-Modifying Code" on the intel's
      software developer's manual 3A.
      
      Since stop_machine() can't protect code against NMI/MCE, this
      function can not modify those handlers. And also, this function
      is basically for modifying multibyte-single-instruction. For
      modifying multibyte-multi-instructions, we need another special
      trap & detour code.
      
      This code originaly comes from immediate values with
      stop_machine() version. Thanks Jason and Mathieu!
      Signed-off-by: NMasami Hiramatsu <mhiramat@redhat.com>
      Cc: systemtap <systemtap@sources.redhat.com>
      Cc: DLE <dle-develop@lists.sourceforge.net>
      Cc: Mathieu Desnoyers <compudj@krystal.dyndns.org>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Jim Keniston <jkenisto@us.ibm.com>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Anders Kaseorg <andersk@ksplice.com>
      Cc: Tim Abbott <tabbott@ksplice.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      LKML-Reference: <20100225133438.6725.80273.stgit@localhost6.localdomain6>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      3d55cc8a
  17. 04 2月, 2010 1 次提交
    • M
      ftrace/alternatives: Introducing *_text_reserved functions · 2cfa1978
      Masami Hiramatsu 提交于
      Introducing *_text_reserved functions for checking the text
      address range is partially reserved or not. This patch provides
      checking routines for x86 smp alternatives and dynamic ftrace.
      Since both functions modify fixed pieces of kernel text, they
      should reserve and protect those from other dynamic text
      modifier, like kprobes.
      
      This will also be extended when introducing other subsystems
      which modify fixed pieces of kernel text. Dynamic text modifiers
      should avoid those.
      Signed-off-by: NMasami Hiramatsu <mhiramat@redhat.com>
      Cc: systemtap <systemtap@sources.redhat.com>
      Cc: DLE <dle-develop@lists.sourceforge.net>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: przemyslaw@pawelczyk.it
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Jim Keniston <jkenisto@us.ibm.com>
      Cc: Mathieu Desnoyers <compudj@krystal.dyndns.org>
      Cc: Jason Baron <jbaron@redhat.com>
      LKML-Reference: <20100202214911.4694.16587.stgit@dhcp-100-2-132.bos.redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      2cfa1978
  18. 30 12月, 2009 1 次提交
    • J
      x86-64: Modify copy_user_generic() alternatives mechanism · 1b1d9258
      Jan Beulich 提交于
      In order to avoid unnecessary chains of branches, rather than
      implementing copy_user_generic() as a function consisting of
      just a single (possibly patched) branch, instead properly deal
      with patching call instructions in the alternative instructions
      framework, and move the patching into the callers.
      
      As a follow-on, one could also introduce something like
      __EXPORT_SYMBOL_ALT() to avoid patching call sites in modules.
      Signed-off-by: NJan Beulich <jbeulich@novell.com>
      Cc: Nick Piggin <npiggin@suse.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      LKML-Reference: <4B2BB8180200007800026AE7@vpn.id2.novell.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      1b1d9258
  19. 02 12月, 2009 1 次提交
    • J
      x86/alternatives: Check replacementlen <= instrlen at build time · 01be50a3
      Jan Beulich 提交于
      Having run into the run-(boot-)time check a couple of times lately,
      I finally took time to find a build-time check so that one doesn't
      need to analyze the register/stack dump and resolve this (through
      manual lookup in vmlinux) to the offending construct.
      
      The assembler will emit a message like "Error: value of <num> too
      large for field of 1 bytes at <offset>", which while not pointing
      out the source location still makes analysis quite a bit easier.
      Signed-off-by: NJan Beulich <jbeulich@novell.com>
      LKML-Reference: <4B0FF8AA0200007800022703@vpn.id2.novell.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      01be50a3
  20. 22 8月, 2009 1 次提交
  21. 28 4月, 2009 1 次提交
    • M
      x86: clean up alternative.h · edc953fa
      Mathieu Desnoyers 提交于
      Alternative header duplicates assembly that could be merged in
      one single macro.  Merging this into this macro also allows to
      directly declare ALTERNATIVE() statements within assembly code.
      
      Uses a __stringify() of the feature bits rather than passing a
      "i" operand.  Leave the old %0 operand as-is (set to 0), unused
      to stay compatible with API.
      
      (v2: tab alignment fixes)
      
      [ Impact: cleanup ]
      Signed-off-by: NMathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      LKML-Reference: <20090428151346.GA31212@Krystal>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      edc953fa
  22. 23 10月, 2008 2 次提交
  23. 23 7月, 2008 1 次提交
    • V
      x86: consolidate header guards · 77ef50a5
      Vegard Nossum 提交于
      This patch is the result of an automatic script that consolidates the
      format of all the headers in include/asm-x86/.
      
      The format:
      
      1. No leading underscore. Names with leading underscores are reserved.
      2. Pathname components are separated by two underscores. So we can
         distinguish between mm_types.h and mm/types.h.
      3. Everything except letters and numbers are turned into single
         underscores.
      Signed-off-by: NVegard Nossum <vegard.nossum@gmail.com>
      77ef50a5
  24. 24 5月, 2008 1 次提交
  25. 17 4月, 2008 2 次提交
    • J
      2ac1ea7c
    • M
      x86: enhance DEBUG_RODATA support - alternatives · e587cadd
      Mathieu Desnoyers 提交于
      Fix a memcpy that should be a text_poke (in apply_alternatives).
      
      Use kernel_wp_save/kernel_wp_restore in text_poke to support DEBUG_RODATA
      correctly and so the CPU HOTPLUG special case can be removed.
      
      Add text_poke_early, for alternatives and paravirt boot-time and module load
      time patching.
      
      Changelog:
      
      - Fix text_set and text_poke alignment check (mixed up bitwise and and or)
      - Remove text_set
      - Export add_nops, so it can be used by others.
      - Document text_poke_early.
      - Remove clflush, since it breaks some VIA architectures and is not strictly
        necessary.
      - Add kerneldoc to text_poke and text_poke_early.
      - Create a second vmap instead of using the WP bit to support Xen and VMI.
      - Move local_irq disable within text_poke and text_poke_early to be able to
        be sleepable in these functions.
      Signed-off-by: NMathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      CC: Andi Kleen <andi@firstfloor.org>
      CC: pageexec@freemail.hu
      CC: H. Peter Anvin <hpa@zytor.com>
      CC: Jeremy Fitzhardinge <jeremy@goop.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      e587cadd
  26. 30 1月, 2008 1 次提交
  27. 11 10月, 2007 1 次提交