  1. 14 Mar, 2014 1 commit
  2. 21 Feb, 2014 1 commit
    • s390/mm,tlb: race of lazy TLB flush vs. recreation of TLB entries · 53e857f3
      Committed by Martin Schwidefsky
      Git commit 050eef36 "[S390] fix tlb flushing vs. concurrent
      /proc accesses" introduced the attach counter to avoid using the
      mm_users value to decide between IPTE for every PTE and lazy TLB
      flushing with IDTE. That fixed the problem with mm_users but it
      introduced another subtle race, fortunately one that is very hard
      to hit.
      The background is the requirement of the architecture that a valid
      PTE may not be changed while it can be used concurrently by another
      cpu. The decision between IPTE and lazy TLB flushing needs to be
      done while the PTE is still valid. Now if the virtual cpu is
      temporarily stopped after the decision to use lazy TLB flushing but
      before the invalid bit of the PTE has been set, another cpu can attach
      the mm, find that flush_mm is set, do the IDTE, return to userspace,
      and recreate a TLB entry that uses the PTE in question. When the first,
      stopped cpu continues it will change the PTE while it is attached on
      another cpu. The first cpu will do another IDTE shortly after the
      modification of the PTE which makes the race window quite short.
      
      To fix this race the CPU that wants to attach the address space of a
      user space thread needs to wait for the end of the PTE modification.
      The number of concurrent TLB flushers for an mm is tracked in the
      upper 16 bits of the attach_count and finish_arch_post_lock_switch
      is used to wait for the end of the flush operation if required.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
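The counting scheme described in this commit can be sketched in user-space C. This is a simplified model under stated assumptions, not the kernel code: `struct mm_sketch`, `flush_begin`, `flush_end`, `flush_in_progress` and `attach_mm` are hypothetical names, C11 atomics stand in for the kernel's atomic type, and a plain busy-wait replaces whatever relax primitive the kernel uses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the mm context; only attach_count is modelled. */
struct mm_sketch {
	/* low 16 bits: attached CPUs, high 16 bits: concurrent TLB flushers */
	atomic_int attach_count;
};

#define FLUSHER_ONE	0x10000	/* one flusher, counted in the upper half */
#define ATTACH_MASK	0xffff	/* attached-CPU count in the lower half */

/*
 * Announce a PTE modification before deciding between IPTE and lazy
 * flushing. Returns the number of CPUs attached at decision time.
 */
static int flush_begin(struct mm_sketch *mm)
{
	assert(mm != NULL);
	int count = atomic_fetch_add(&mm->attach_count, FLUSHER_ONE) + FLUSHER_ONE;
	return count & ATTACH_MASK;
}

/* The PTE modification and any required flush are complete. */
static void flush_end(struct mm_sketch *mm)
{
	atomic_fetch_sub(&mm->attach_count, FLUSHER_ONE);
}

static bool flush_in_progress(struct mm_sketch *mm)
{
	return (atomic_load(&mm->attach_count) >> 16) != 0;
}

/*
 * Attach a CPU: bump the attach count first so that later flushers see
 * it, then wait until flushers that already started have finished.
 */
static void attach_mm(struct mm_sketch *mm)
{
	atomic_fetch_add(&mm->attach_count, 1);
	while (flush_in_progress(mm))
		;	/* busy-wait; a kernel would relax the CPU here */
}
```

A flusher would bracket its PTE update with `flush_begin()` and `flush_end()`, while a CPU switching to the address space calls `attach_mm()` before it returns to user space.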
  3. 14 Nov, 2013 1 commit
  4. 26 Apr, 2013 2 commits
  5. 01 Oct, 2012 2 commits
  6. 20 Jul, 2012 1 commit
    • s390/comments: unify copyright messages and remove file names · a53c8fab
      Committed by Heiko Carstens
      Remove the file name from the comment at top of many files. In most
      cases the file name was wrong anyway, so it's rather pointless.
      
      Also unify the IBM copyright statement. We did have a lot of slightly
      different statements and wanted to change them one after another
      whenever a file gets touched. However that never happened. Instead
      people started to take the old/"wrong" statements to use as a template
      for new files.
      So unify all of them in one go.
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
  7. 24 May, 2012 2 commits
  8. 16 May, 2012 1 commit
  9. 22 Nov, 2011 1 commit
  10. 30 Oct, 2011 3 commits
    • [S390] add TIF_SYSCALL thread flag · b6ef5bb3
      Committed by Martin Schwidefsky
      Add an explicit TIF_SYSCALL bit that indicates if a task is inside
      a system call. The svc_code in the pt_regs structure is now only
      valid if TIF_SYSCALL is set. With this definition TIF_RESTART_SVC
      can be replaced with TIF_SYSCALL. Overall do_signal is a bit more
      readable and it saves a few lines of code.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
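The rule that svc_code is only valid while TIF_SYSCALL is set can be illustrated with a small sketch. Everything here is hypothetical: the bit number, `struct task_sketch` and the helper names are chosen for illustration and are not taken from the s390 headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TIF_SYSCALL	0	/* bit number is an assumption of this sketch */
#define _TIF_SYSCALL	(1UL << TIF_SYSCALL)

/* Hypothetical reduction of thread_info plus pt_regs to two fields. */
struct task_sketch {
	unsigned long flags;	/* thread flags */
	unsigned int svc_code;	/* meaningful only while TIF_SYSCALL is set */
};

static void enter_syscall(struct task_sketch *t, unsigned int svc_code)
{
	t->svc_code = svc_code;
	t->flags |= _TIF_SYSCALL;
}

static void leave_syscall(struct task_sketch *t)
{
	t->flags &= ~_TIF_SYSCALL;	/* svc_code is stale from here on */
}

/* Fetch svc_code only if the task is inside a system call. */
static bool get_svc_code(const struct task_sketch *t, unsigned int *code)
{
	assert(t != NULL && code != NULL);
	if (!(t->flags & _TIF_SYSCALL))
		return false;
	*code = t->svc_code;
	return true;
}
```

Consumers such as signal handling code would test the flag first instead of inferring "inside a system call" from the value of svc_code.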
    • [S390] signal race with restarting system calls · 20b40a79
      Committed by Martin Schwidefsky
      For an ERESTARTNOHAND/ERESTARTSYS/ERESTARTNOINTR restarting system call
      do_signal will prepare the restart of the system call with a rewind of
      the PSW before calling get_signal_to_deliver (where the debugger might
      take control). For an ERESTART_RESTARTBLOCK restarting system call
      do_signal will set -EINTR as return code.
      There are two issues with this approach:
      1) strace never sees ERESTARTNOHAND, ERESTARTSYS, ERESTARTNOINTR or
         ERESTART_RESTARTBLOCK as the rewinding already took place or the
         return code has been changed to -EINTR
      2) if get_signal_to_deliver does not return with a signal to deliver
         the restart via the repeat of the svc instruction is left in place.
         This opens a race if another signal is made pending before the
         system call instruction can be reexecuted. The original system call
         will be restarted even if the second signal would have ended the
         system call with -EINTR.
      
      These two issues can be solved by dropping the early rewind of the
      system call before get_signal_to_deliver has been called and by using
      the TIF_RESTART_SVC magic to do the restart if no signal has to be
      delivered. The only situation where the system call restart via the
      repeat of the svc instruction is appropriate is when a SA_RESTART
      signal is delivered to user space.
      
      Unfortunately this breaks inferior calls by the debugger again. The
      system call number and the length of the system call instruction are
      lost over the inferior call and user space will see ERESTARTNOHAND/
      ERESTARTSYS/ERESTARTNOINTR/ERESTART_RESTARTBLOCK. To correct this a
      new ptrace interface is added to save/restore the system call number
      and system call instruction length.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
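The restart decision described in this commit can be sketched as a pure function. This is an illustration under assumptions, not the s390 do_signal code: the enum and `restart_decision` are hypothetical names, the numeric restart codes are the kernel-internal values from include/linux/errno.h, and treating ERESTARTNOINTR like an SA_RESTART restart when a handler runs follows the general Linux convention rather than anything stated in the message.

```c
#include <assert.h>
#include <stdbool.h>

/* Kernel-internal restart codes, never meant to reach user space. */
#define ERESTARTSYS		512
#define ERESTARTNOINTR		513
#define ERESTARTNOHAND		514
#define ERESTART_RESTARTBLOCK	516

enum restart_action {
	RETURN_AS_IS,		/* not a restart code, leave the result alone */
	RETURN_EINTR,		/* end the system call with -EINTR */
	REWIND_SVC,		/* re-execute the svc instruction after the handler */
	RESTART_VIA_TIF,	/* no handler: restart through TIF_RESTART_SVC */
};

/*
 * Decide what to do with a system call result after signal lookup,
 * i.e. after get_signal_to_deliver has run.
 */
static enum restart_action restart_decision(int retval, bool handler_delivered,
					    bool sa_restart)
{
	switch (retval) {
	case -ERESTART_RESTARTBLOCK:
	case -ERESTARTNOHAND:
		return handler_delivered ? RETURN_EINTR : RESTART_VIA_TIF;
	case -ERESTARTSYS:
		if (!handler_delivered)
			return RESTART_VIA_TIF;
		return sa_restart ? REWIND_SVC : RETURN_EINTR;
	case -ERESTARTNOINTR:
		/* assumption: always restarted, by rewind if a handler runs */
		return handler_delivered ? REWIND_SVC : RESTART_VIA_TIF;
	default:
		return RETURN_AS_IS;
	}
}
```

Because the decision is taken only after the signal lookup, a tracer can still observe the raw restart code, and a second signal arriving before the restart is handled by the same logic again.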
    • [S390] fix _TIF_SINGLE_STEP definition · 80853a8a
      Committed by Tejun Heo
      _TIF_SINGLE_STEP is incorrectly defined as 1<<TIF_FREEZE.  Fix it.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
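The effect of such a mask mix-up can be shown in a few lines. The bit numbers below are placeholders chosen for this sketch, not the s390 values, and `_TIF_SINGLE_STEP_BUGGY` and `test_flag` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder bit numbers, assumed for illustration only. */
#define TIF_SINGLE_STEP	20
#define TIF_FREEZE	22

#define _TIF_FREEZE		(1UL << TIF_FREEZE)
#define _TIF_SINGLE_STEP_BUGGY	(1UL << TIF_FREEZE)		/* the mistake */
#define _TIF_SINGLE_STEP	(1UL << TIF_SINGLE_STEP)	/* the fix */

static bool test_flag(unsigned long flags, unsigned long mask)
{
	return (flags & mask) != 0;
}
```

With the buggy mask, a task that is merely being frozen also tests positive for single stepping, and a task that really is single stepping tests negative.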
  11. 24 Jul, 2011 1 commit
    • [S390] move sie code to entry.S · 603d1a50
      Committed by Martin Schwidefsky
      The entry to / exit from sie has subtle dependencies on the first level
      interrupt handler. Move the sie assembler code to entry64.S and replace
      the SIE_HOOK callback with a test and the new _TIF_SIE bit.
      In addition this patch fixes several problems in regard to the check for
      the _TIF_EXIT_SIE bits. The old code checked the TIF bits before executing
      the interrupt handler and it only modified the instruction address if it
      pointed directly to the sie instruction. In both cases it could miss
      a TIF bit that normally would cause an exit from the guest and would
      reenter the guest context.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
  12. 12 Jan, 2011 1 commit
  13. 05 Jan, 2011 2 commits
  14. 17 May, 2010 1 commit
  15. 14 May, 2010 1 commit
  16. 27 Feb, 2010 1 commit
  17. 14 Jan, 2010 1 commit
  18. 26 Aug, 2009 1 commit
    • tracing: Rename FTRACE_SYSCALLS for tracepoints · 66700001
      Committed by Josh Stone
      s/HAVE_FTRACE_SYSCALLS/HAVE_SYSCALL_TRACEPOINTS/g
      s/TIF_SYSCALL_FTRACE/TIF_SYSCALL_TRACEPOINT/g
      
      The syscall enter/exit tracing is no longer specific to just ftrace, so
      they now have names that reflect their tie to tracepoints instead.
      Signed-off-by: Josh Stone <jistone@redhat.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Jiaying Zhang <jiayingz@google.com>
      Cc: Martin Bligh <mbligh@google.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      LKML-Reference: <1251150194-1713-2-git-send-email-jistone@redhat.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
  19. 11 Jul, 2009 1 commit
  20. 12 Jun, 2009 2 commits
  21. 14 Apr, 2009 1 commit
  22. 31 Dec, 2008 1 commit
    • [PATCH] improve precision of process accounting. · aa5e97ce
      Committed by Martin Schwidefsky
      The unit of the cputime accounting values that are stored per process is
      currently a microsecond. The CPU timer has a maximum granularity of
      2**-12 microseconds. There is no benefit in storing the per process values
      in the lesser precision and there is the disadvantage that the backend
      has to do the rounding to microseconds. The better solution is to use
      the maximum granularity of the CPU timer as cputime unit.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
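With a cputime unit of 2**-12 microseconds, conversions to and from microseconds reduce to shifts by 12 bits. The sketch below assumes a 64-bit value and uses hypothetical names (`cputime_sketch_t` and the two helpers); it is not the kernel's cputime interface.

```c
#include <assert.h>
#include <stdint.h>

#define CPUTIME_SHIFT	12	/* one unit is 2**-12 microseconds */

typedef uint64_t cputime_sketch_t;

/* Microseconds to CPU-timer units: exact, no rounding needed. */
static cputime_sketch_t usecs_to_cputime_sketch(uint64_t usecs)
{
	return usecs << CPUTIME_SHIFT;
}

/* CPU-timer units to whole microseconds: truncates the sub-microsecond part. */
static uint64_t cputime_to_usecs_sketch(cputime_sketch_t ct)
{
	return ct >> CPUTIME_SHIFT;
}
```

Storing the raw timer units means rounding happens only at the point where a value is reported in microseconds, not on every accounting update.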
  23. 28 Oct, 2008 1 commit
    • [S390] No more 4kb stacks. · 7f5a8ba6
      Committed by Heiko Carstens
      We got a stack overflow with a small stack configuration on a 32 bit
      system. It just looks like 4kb isn't enough and is too dangerous.
      So let's get rid of 4kb stacks on 32 bit.
      
      But one thing I completely dislike about the call trace below is that
      just for debugging or tracing purposes sprintf gets called (cio_start_key):
      
      	/* process condition code */
      	sprintf(dbf_txt, "ccode:%d", ccode);
      	CIO_TRACE_EVENT(4, dbf_txt);
      
      But maybe it's just me who thinks that this could be done better.
      
          <4>Kernel stack overflow.
          <4>Modules linked in: dm_multipath sunrpc bonding qeth_l2 dm_mod qeth ccwgroup vmur
          <4>CPU: 1 Not tainted 2.6.27-30.x.20081015-s390default #1
          <4>Process httpd (pid: 3807, task: 20ae2df8, ksp: 1666fb78)
          <4>Krnl PSW : 040c0000 8027098a (number+0xe/0x348)
          <4>           R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:0 CC:0 PM:0
          <4>Krnl GPRS: 00d43318 0027097c 1666f277 9666f270
          <4>           00000000 00000000 0000000a ffffffff
          <4>           9666f270 1666f228 1666f277 1666f098
          <4>           00000002 80270982 80271016 1666f098
          <4>Krnl Code: 8027097e: f0340dd0a7f1	srp	3536(4,%r0),2033(%r10),4
          <4>           80270984: 0f00		clcl	%r0,%r0
          <4>           80270986: a7840001		brc	8,80270988
          <4>          >8027098a: 18ef		lr	%r14,%r15
          <4>           8027098c: a7faff68		ahi	%r15,-152
          <4>           80270990: 18bf		lr	%r11,%r15
          <4>           80270992: 18a2		lr	%r10,%r2
          <4>           80270994: 1893		lr	%r9,%r3
      
      Modified calltrace with annotated stackframe size of each function:
      
      stackframe size
          |
       0 304 vsnprintf+850 [0x271016]
       1  72 sprintf+74 [0x271522]
       2  56 cio_start_key+262 [0x2d4c16]
       3  56 ccw_device_start_key+222 [0x2dfe92]
       4  56 ccw_device_start+40 [0x2dff28]
       5  48 raw3215_start_io+104 [0x30b0f8]
       6  56 raw3215_write+494 [0x30ba0a]
       7  40 con3215_write+68 [0x30bafc]
       8  40 __call_console_drivers+146 [0x12b0fa]
       9  32 _call_console_drivers+102 [0x12b192]
      10  64 release_console_sem+268 [0x12b614]
      11 168 vprintk+462 [0x12bca6]
      12  72 printk+68 [0x12bfd0]
      13 256 __print_symbol+50 [0x15a882]
      14  56 __show_trace+162 [0x103d06]
      15  32 show_trace+224 [0x103e70]
      16  48 show_stack+152 [0x103f20]
      17  56 dump_stack+126 [0x104612]
      18  96 __alloc_pages_internal+592 [0x175004]
      19  80 cache_alloc_refill+776 [0x196f3c]
      20  40 __kmalloc+258 [0x1972ae]
      21  40 __alloc_skb+94 [0x328086]
      22  32 pskb_copy+50 [0x328252]
      23  32 skb_realloc_headroom+110 [0x328a72]
      24 104 qeth_l2_hard_start_xmit+378 [0x7803bfde]
      25  56 dev_hard_start_xmit+450 [0x32ef6e]
      26  56 __qdisc_run+390 [0x3425d6]
      27  48 dev_queue_xmit+410 [0x331e06]
      28  40 ip_finish_output+308 [0x354ac8]
      29  56 ip_output+218 [0x355b6e]
      30  24 ip_local_out+56 [0x354584]
      31 120 ip_queue_xmit+300 [0x355cec]
      32  96 tcp_transmit_skb+812 [0x367da8]
      33  40 tcp_push_one+158 [0x369fda]
      34 112 tcp_sendmsg+852 [0x35d5a0]
      35 240 sock_sendmsg+164 [0x32035c]
      36  56 kernel_sendmsg+86 [0x32064a]
      37  88 sock_no_sendpage+98 [0x322b22]
      38 104 tcp_sendpage+70 [0x35cc1e]
      39  48 sock_sendpage+74 [0x31eb66]
      40  64 pipe_to_sendpage+102 [0x1c4b2e]
      41  64 __splice_from_pipe+120 [0x1c5340]
      42  72 splice_from_pipe+90 [0x1c57e6]
      43  56 generic_splice_sendpage+38 [0x1c5832]
      44  48 do_splice_from+104 [0x1c4c38]
      45  48 direct_splice_actor+52 [0x1c4c88]
      46  80 splice_direct_to_actor+180 [0x1c4f80]
      47  72 do_splice_direct+70 [0x1c5112]
      48  64 do_sendfile+360 [0x19de18]
      49  72 sys_sendfile64+126 [0x19df32]
      50 336 sysc_do_restart+18 [0x111a1a]
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
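Adding up the annotated stackframe sizes from the call trace shows how much stack the chain consumed. The sketch below copies the 51 frame sizes from the trace (entries 0 to 50); the array and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Frame sizes in bytes, entries 0..50 of the annotated call trace. */
static const unsigned int frame_sizes[] = {
	304,  72,  56,  56,  56,  48,  56,  40,  40,  32,	/*  0..9  */
	 64, 168,  72, 256,  56,  32,  48,  56,  96,  80,	/* 10..19 */
	 40,  40,  32,  32, 104,  56,  56,  48,  40,  56,	/* 20..29 */
	 24, 120,  96,  40, 112, 240,  56,  88, 104,  48,	/* 30..39 */
	 64,  64,  72,  56,  48,  48,  80,  72,  64,  72,	/* 40..49 */
	336,							/* 50     */
};

/* Sum of all frames on the stack at the time of the overflow. */
static unsigned int total_stack_usage(void)
{
	unsigned int total = 0;

	for (size_t i = 0; i < sizeof(frame_sizes) / sizeof(frame_sizes[0]); i++)
		total += frame_sizes[i];
	return total;
}
```

The frames add up to 4096 bytes, which is the entire 4kb stack, so the chain leaves no room for anything else.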
  24. 20 Oct, 2008 1 commit
    • container freezer: add TIF_FREEZE flag to all architectures · 83224b08
      Committed by Matt Helsley
      This patch series introduces a cgroup subsystem that utilizes the swsusp
      freezer to freeze a group of tasks.  It's immediately useful for batch job
      management scripts.  It should also be useful in the future for
      implementing container checkpoint/restart.
      
      The freezer subsystem in the container filesystem defines a cgroup file
      named freezer.state.  Reading freezer.state will return the current state
      of the cgroup.  Writing "FROZEN" to the state file will freeze all tasks
      in the cgroup.  Subsequently writing "RUNNING" will unfreeze the tasks in
      the cgroup.
      
      * Examples of usage :
      
         # mkdir /containers/freezer
         # mount -t cgroup -ofreezer freezer  /containers
         # mkdir /containers/0
         # echo $some_pid > /containers/0/tasks
      
      to get status of the freezer subsystem :
      
         # cat /containers/0/freezer.state
         RUNNING
      
      to freeze all tasks in the container :
      
         # echo FROZEN > /containers/0/freezer.state
         # cat /containers/0/freezer.state
         FREEZING
         # cat /containers/0/freezer.state
         FROZEN
      
      to unfreeze all tasks in the container :
      
         # echo RUNNING > /containers/0/freezer.state
         # cat /containers/0/freezer.state
         RUNNING
      
      This patch:
      
      The first step in making the refrigerator() available to all
      architectures, even for those without power management.
      
      The purpose of such a change is to be able to use the refrigerator() in a
      new control group subsystem which will implement a control group freezer.
      
      [akpm@linux-foundation.org: fix sparc]
      Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
      Signed-off-by: Matt Helsley <matthltc@us.ibm.com>
      Acked-by: Pavel Machek <pavel@suse.cz>
      Acked-by: Serge E. Hallyn <serue@us.ibm.com>
      Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
      Acked-by: Nigel Cunningham <nigel@tuxonice.net>
      Tested-by: Matt Helsley <matthltc@us.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  25. 11 Oct, 2008 1 commit
  26. 02 Aug, 2008 1 commit
  27. 26 Jul, 2008 1 commit
  28. 30 Apr, 2008 1 commit
  29. 29 Jun, 2006 1 commit
  30. 02 Feb, 2006 1 commit
  31. 13 Jan, 2006 1 commit
  32. 26 Jun, 2005 1 commit
  33. 24 Jun, 2005 1 commit
    • [PATCH] streamline preempt_count type across archs · dcd497f9
      Committed by Jesper Juhl
      The preempt_count member of struct thread_info is currently either defined
      as int, unsigned int or __s32 depending on arch.  This patch makes the type
      of preempt_count an int on all archs.
      
      Having preempt_count be an unsigned type prevents the catching of
      preempt_count < 0 bugs, and using int on some archs and __s32 on others is
      not exactly "neat" - much nicer when it's just int all over.
      
      A previous version of this patch was already ACK'ed by Robert Love, and the
      only change in this version of the patch compared to the one he ACK'ed is
      that this one also makes sure the preempt_count member is consistently
      commented.
      Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
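Why an unsigned preempt_count hides underflow bugs can be shown with two reduced structs. The struct and function names are hypothetical, and the decrement of a zero counter stands in for an unbalanced preempt enable.

```c
#include <assert.h>
#include <stdbool.h>

/* Two hypothetical thread_info reductions differing only in the type. */
struct ti_unsigned { unsigned int preempt_count; };
struct ti_signed   { int preempt_count; };

/* Generic "count < 0" check, applied to the widened counter value. */
static bool looks_negative(long long count)
{
	return count < 0;
}

/* One decrement too many on an unsigned counter wraps to a huge value. */
static bool underflow_caught_unsigned(void)
{
	struct ti_unsigned ti = { .preempt_count = 0 };

	ti.preempt_count--;	/* wraps around to UINT_MAX */
	return looks_negative(ti.preempt_count);
}

/* The same bug on a signed counter yields -1, which the check catches. */
static bool underflow_caught_signed(void)
{
	struct ti_signed ti = { .preempt_count = 0 };

	ti.preempt_count--;	/* becomes -1 */
	return looks_negative(ti.preempt_count);
}
```

Only the signed variant lets a `preempt_count < 0` sanity check fire, which is the argument the patch makes for using int everywhere.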