1. 09 6月, 2016 1 次提交
  2. 24 2月, 2016 1 次提交
    • J
      sched/x86: Add stack frame dependency to __preempt_schedule[_notrace]() · 821eae7d
      Josh Poimboeuf 提交于
      If __preempt_schedule() or __preempt_schedule_notrace() is referenced at
      the beginning of a function, gcc can insert the asm inline "call
      ___preempt_schedule[_notrace]" instruction before setting up a stack
      frame, which breaks frame pointer convention if CONFIG_FRAME_POINTER is
      enabled and can result in bad stack traces.
      
      Force a stack frame to be created if CONFIG_FRAME_POINTER is enabled by
      listing the stack pointer as an output operand for the inline asm
      statements.
      
      Specifically this fixes the following stacktool warnings:
      
        stacktool: drivers/scsi/hpsa.o: hpsa_scsi_do_simple_cmd.constprop.106()+0x79: call without frame pointer save/setup
        stacktool: fs/mbcache.o: mb_cache_entry_find_first()+0x70: call without frame pointer save/setup
        stacktool: fs/mbcache.o: mb_cache_entry_find_first()+0x92: call without frame pointer save/setup
        stacktool: fs/mbcache.o: mb_cache_entry_free()+0xff: call without frame pointer save/setup
        stacktool: fs/mbcache.o: mb_cache_entry_free()+0xf5: call without frame pointer save/setup
        stacktool: fs/mbcache.o: mb_cache_entry_free()+0x11a: call without frame pointer save/setup
        stacktool: fs/mbcache.o: mb_cache_entry_get()+0x225: call without frame pointer save/setup
        stacktool: kernel/locking/percpu-rwsem.o: percpu_up_read()+0x27: call without frame pointer save/setup
        stacktool: kernel/profile.o: do_profile_hits.isra.5()+0x139: call without frame pointer save/setup
        stacktool: lib/nmi_backtrace.o: nmi_trigger_all_cpu_backtrace()+0x2b6: call without frame pointer save/setup
        stacktool: net/rds/ib_cm.o: rds_ib_cq_comp_handler_recv()+0x58: call without frame pointer save/setup
        stacktool: net/rds/ib_cm.o: rds_ib_cq_comp_handler_send()+0x58: call without frame pointer save/setup
        stacktool: net/rds/ib_recv.o: rds_ib_attempt_ack()+0xc1: call without frame pointer save/setup
        stacktool: net/rds/iw_recv.o: rds_iw_attempt_ack()+0xc1: call without frame pointer save/setup
        stacktool: net/rds/iw_recv.o: rds_iw_recv_cq_comp_handler()+0x55: call without frame pointer save/setup
      
      So it only adds a stack frame to 15 call sites out of ~5000 calls to
      ___preempt_schedule[_notrace]().  All the others already had stack frames.
      
      Oddly, this change actually seems to make things faster in a lot of
      cases.  For many smaller functions it causes the stack frame creation to
      get moved out of the common path and into the unlikely path.
      
      For example, here's the original cyc2ns_read_end():
      
        ffffffff8101f8c0 <cyc2ns_read_end>:
        ffffffff8101f8c0:	55                   	push   %rbp
        ffffffff8101f8c1:	48 89 e5             	mov    %rsp,%rbp
        ffffffff8101f8c4:	83 6f 10 01          	subl   $0x1,0x10(%rdi)
        ffffffff8101f8c8:	75 08                	jne    ffffffff8101f8d2 <cyc2ns_read_end+0x12>
        ffffffff8101f8ca:	65 48 89 3d e6 5a ff 	mov    %rdi,%gs:0x7eff5ae6(%rip)        # 153b8 <cyc2ns+0x38>
        ffffffff8101f8d1:	7e
        ffffffff8101f8d2:	65 ff 0d 77 c4 fe 7e 	decl   %gs:0x7efec477(%rip)        # bd50 <__preempt_count>
        ffffffff8101f8d9:	74 02                	je     ffffffff8101f8dd <cyc2ns_read_end+0x1d>
        ffffffff8101f8db:	5d                   	pop    %rbp
        ffffffff8101f8dc:	c3                   	retq
        ffffffff8101f8dd:	e8 1e 37 fe ff       	callq  ffffffff81003000 <___preempt_schedule>
        ffffffff8101f8e2:	5d                   	pop    %rbp
        ffffffff8101f8e3:	c3                   	retq
        ffffffff8101f8e4:	66 66 66 2e 0f 1f 84 	data16 data16 nopw %cs:0x0(%rax,%rax,1)
        ffffffff8101f8eb:	00 00 00 00 00
      
      And here's the same function with the patch:
      
        ffffffff8101f8c0 <cyc2ns_read_end>:
        ffffffff8101f8c0:	83 6f 10 01          	subl   $0x1,0x10(%rdi)
        ffffffff8101f8c4:	75 08                	jne    ffffffff8101f8ce <cyc2ns_read_end+0xe>
        ffffffff8101f8c6:	65 48 89 3d ea 5a ff 	mov    %rdi,%gs:0x7eff5aea(%rip)        # 153b8 <cyc2ns+0x38>
        ffffffff8101f8cd:	7e
        ffffffff8101f8ce:	65 ff 0d 7b c4 fe 7e 	decl   %gs:0x7efec47b(%rip)        # bd50 <__preempt_count>
        ffffffff8101f8d5:	74 01                	je     ffffffff8101f8d8 <cyc2ns_read_end+0x18>
        ffffffff8101f8d7:	c3                   	retq
        ffffffff8101f8d8:	55                   	push   %rbp
        ffffffff8101f8d9:	48 89 e5             	mov    %rsp,%rbp
        ffffffff8101f8dc:	e8 1f 37 fe ff       	callq  ffffffff81003000 <___preempt_schedule>
        ffffffff8101f8e1:	5d                   	pop    %rbp
        ffffffff8101f8e2:	c3                   	retq
        ffffffff8101f8e3:	66 66 66 66 2e 0f 1f 	data16 data16 data16 nopw %cs:0x0(%rax,%rax,1)
        ffffffff8101f8ea:	84 00 00 00 00 00
      
      Notice that it moved the frame pointer setup code to the unlikely
      ___preempt_schedule() call path.  Going through a sampling of the
      differences in the asm, that's the most common change I see.
      
      Otherwise it has no real effect on callers which already have stack
      frames (though it does result in the reordering of some 'mov's).
      Reported-by: NJiri Slaby <jslaby@suse.cz>
      Tested-by: NJiri Slaby <jslaby@suse.cz>
      Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: live-patching@vger.kernel.org
      Link: http://lkml.kernel.org/r/20160218174158.GA28230@treble.redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      821eae7d
  3. 06 10月, 2015 2 次提交
  4. 03 8月, 2015 1 次提交
  5. 07 6月, 2015 1 次提交
  6. 28 10月, 2014 2 次提交
    • O
      sched: Kill task_preempt_count() · e2336f6e
      Oleg Nesterov 提交于
      task_preempt_count() is pointless if preemption counter is per-cpu,
      currently this is x86 only. It is only valid if the task is not
      running, and even in this case the only info it can provide is the
      state of PREEMPT_ACTIVE bit.
      
      Change its single caller to check p->on_rq instead, this should be
      the same if p->state != TASK_RUNNING, and kill this helper.
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Kirill Tkhai <tkhai@yandex.ru>
      Cc: Alexander Graf <agraf@suse.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: linux-arch@vger.kernel.org
      Link: http://lkml.kernel.org/r/20141008183348.GC17495@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      e2336f6e
    • O
      sched: stop the unbound recursion in preempt_schedule_context() · 009f60e2
      Oleg Nesterov 提交于
      preempt_schedule_context() does preempt_enable_notrace() at the end
      and this can call the same function again; exception_exit() is heavy
      and it is quite possible that need-resched is true again.
      
      1. Change this code to dec preempt_count() and check need_resched()
         by hand.
      
      2. As Linus suggested, we can use the PREEMPT_ACTIVE bit and avoid
         the enable/disable dance around __schedule(). But in this case
         we need to move into sched/core.c.
      
      3. Cosmetic, but x86 forgets to declare this function. This doesn't
         really matter because it is only called by asm helpers, still it
         make sense to add the declaration into asm/preempt.h to match
         preempt_schedule().
      Reported-by: NSasha Levin <sasha.levin@oracle.com>
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Graf <agraf@suse.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Anvin <hpa@zytor.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Chuck Ebbert <cebbert.lkml@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Link: http://lkml.kernel.org/r/20141005202322.GB27962@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      009f60e2
  7. 08 4月, 2014 1 次提交
    • C
      percpu: add raw_cpu_ops · b3ca1c10
      Christoph Lameter 提交于
      The kernel has never been audited to ensure that this_cpu operations are
      consistently used throughout the kernel.  The code generated in many
      places can be improved through the use of this_cpu operations (which
      uses a segment register for relocation of per cpu offsets instead of
      performing address calculations).
      
      The patch set also addresses various consistency issues in general with
      the per cpu macros.
      
      A. The semantics of __this_cpu_ptr() differs from this_cpu_ptr only
         because checks are skipped. This is typically shown through a raw_
         prefix. So this patch set changes the places where __this_cpu_ptr()
         is used to raw_cpu_ptr().
      
      B. There has been the long term wish by some that __this_cpu operations
         would check for preemption. However, there are cases where preemption
         checks need to be skipped. This patch set adds raw_cpu operations that
         do not check for preemption and then adds preemption checks to the
         __this_cpu operations.
      
      C. The use of __get_cpu_var is always a reference to a percpu variable
         that can also be handled via a this_cpu operation. This patch set
         replaces all uses of __get_cpu_var with this_cpu operations.
      
      D. We can then use this_cpu RMW operations in various places replacing
         sequences of instructions by a single one.
      
      E. The use of this_cpu operations throughout will allow other arches than
         x86 to implement optimized references and RMV operations to work with
         per cpu local data.
      
      F. The use of this_cpu operations opens up the possibility to
         further optimize code that relies on synchronization through
         per cpu data.
      
      The patch set works in a couple of stages:
      
      I. Patch 1 adds the additional raw_cpu operations and raw_cpu_ptr().
          Also converts the existing __this_cpu_xx_# primitive in the x86
          code to raw_cpu_xx_#.
      
      II. Patch 2-4 use the raw_cpu operations in places that would give
           us false positives once they are enabled.
      
      III. Patch 5 adds preemption checks to __this_cpu operations to allow
          checking if preemption is properly disabled when these functions
          are used.
      
      IV. Patches 6-20 are patches that simply replace uses of __get_cpu_var
         with this_cpu_ptr. They do not depend on any changes to the percpu
         code. No preemption tests are skipped if they are applied.
      
      V. Patches 21-46 are conversion patches that use this_cpu operations
         in various kernel subsystems/drivers or arch code.
      
      VI.  Patches 47/48 (not included in this series) remove no longer used
          functions (__this_cpu_ptr and __get_cpu_var).  These should only be
          applied after all the conversion patches have made it and after we
          have done additional passes through the kernel to ensure that none of
          the uses of these functions remain.
      
      This patch (of 46):
      
      The patches following this one will add preemption checks to __this_cpu
      ops so we need to have an alternative way to use this_cpu operations
      without preemption checks.
      
      raw_cpu_ops will be the basis for all other ops since these will be the
      operations that do not implement any checks.
      
      Primitive operations are renamed by this patch from __this_cpu_xxx to
      raw_cpu_xxxx.
      
      Also change the uses of the x86 percpu primitives in preempt.h.
      These depend directly on asm/percpu.h (header #include nesting issue).
      Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
      Signed-off-by: NChristoph Lameter <cl@linux.com>
      Acked-by: NIngo Molnar <mingo@kernel.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Alex Shi <alex.shi@intel.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Bryan Wu <cooloney@gmail.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
      Cc: David Daney <david.daney@cavium.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Dimitri Sivanich <sivanich@sgi.com>
      Cc: Dipankar Sarma <dipankar@in.ibm.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@linux.intel.com>
      Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
      Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
      Cc: Hedi Berriche <hedi@sgi.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Mike Frysinger <vapier@gentoo.org>
      Cc: Mike Travis <travis@sgi.com>
      Cc: Neil Brown <neilb@suse.de>
      Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Rafael J. Wysocki <rjw@sisk.pl>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Robert Richter <rric@kernel.org>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Russell King <rmk+kernel@arm.linux.org.uk>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Wim Van Sebroeck <wim@iguana.be>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b3ca1c10
  8. 11 12月, 2013 1 次提交
  9. 28 9月, 2013 1 次提交
  10. 25 9月, 2013 5 次提交