1. 25 Dec, 2008 1 commit
  2. 01 Dec, 2008 1 commit
  3. 27 Nov, 2008 2 commits
    • C
      [S390] pgtable.h: Fix oops in unmap_vmas for KVM processes · 2944a5c9
      Christian Borntraeger committed
      When running several kvm processes with lots of memory overcommitment,
      we have seen an oops during process shutdown:
      ------------[ cut here ]------------
      Kernel BUG at 0000000000193434 [verbose debug info unavailable]
      addressing exception: 0005 [#1] PREEMPT SMP
      Modules linked in: kvm sunrpc qeth_l2 dm_mod qeth ccwgroup
      CPU: 10 Not tainted 2.6.28-rc4-kvm-bigiron-00521-g0ccca08-dirty #8
      Process kuli (pid: 14460, task: 0000000149822338, ksp: 0000000024f57650)
      Krnl PSW : 0704e00180000000 0000000000193434 (unmap_vmas+0x884/0xf10)
      R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 EA:3
      Krnl GPRS: 0000000000000002 0000000000000000 000000051008d000 000003e05e6034e0
                 00000000001933f6 00000000000001e9 0000000407259e0a 00000002be88c400
                 00000200001c1000 0000000407259608 0000000407259e08 0000000024f577f0
                 0000000407259e09 0000000000445fa8 00000000001933f6 0000000024f577f0
      Krnl Code: 0000000000193426: eb22000c000d sllg %r2,%r2,12
                 000000000019342c: a7180000 lhi %r1,0
                 0000000000193430: b2290012 iske %r1,%r2
                >0000000000193434: a7110002 tmll %r1,2
                 0000000000193438: a7840006 brc 8,193444
                 000000000019343c: 9602c000 oi 0(%r12),2
                 0000000000193440: 96806000 oi 0(%r6),128
                 0000000000193444: a7110004 tmll %r1,4
      Call Trace:
      ([<00000000001933f6>] unmap_vmas+0x846/0xf10)
      [<0000000000199680>] exit_mmap+0x210/0x458
      [<000000000012a8f8>] mmput+0x54/0xfc
      [<000000000012f714>] exit_mm+0x134/0x144
      [<000000000013120c>] do_exit+0x240/0x878
      [<00000000001318dc>] do_group_exit+0x98/0xc8
      [<000000000013e6b0>] get_signal_to_deliver+0x30c/0x358
      [<000000000010bee0>] do_signal+0xec/0x860
      [<0000000000112e30>] sysc_sigpending+0xe/0x22
      [<000002000013198a>] 0x2000013198a
      INFO: lockdep is turned off.
      Last Breaking-Event-Address:
      [<00000000001a68d0>] free_swap_and_cache+0x1a0/0x1a4
      <4>---[ end trace bc19f1d51ac9db7c ]---
      
      The faulting instruction is the storage key operation (iske) in
      ptep_rcp_copy (called by pte_clear, called by unmap_vmas). iske
      reads dirty and reference bit information for a physical page and
      requires a valid physical address. Since we are in pte_clear, we
      cannot rely on the pte containing a valid address. Fortunately we
      don't need this information in pte_clear - after all, there is no
      mapping. The best fix is to remove the needless call to ptep_rcp_copy
      that contains the iske.
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
    • M
      [S390] fix system call parameter functions. · 59da2139
      Martin Schwidefsky committed
      syscall_get_nr() currently returns a valid result only if the call
      chain of the traced process includes do_syscall_trace_enter(). But
      since collect_syscall() can be called for any sleeping task, the result
      of syscall_get_nr() is in general completely bogus.
      
      To make syscall_get_nr() work for any sleeping task, the traps field
      in pt_regs is replaced with svcnr - the system call number the process
      is executing. If svcnr == 0 the process is not on a system call path.
      
      The syscall_get_arguments and syscall_set_arguments use regs->gprs[2]
      for the first system call parameter. This is incorrect since gprs[2]
      may have been overwritten with the system call number if the call
      chain includes do_syscall_trace_enter. Use regs->orig_gprs2 instead.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
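      
      The idea can be sketched in plain C as follows. This is a minimal
      illustration, not the kernel code: struct pt_regs_sketch and both
      helper names are hypothetical, the field names follow the text
      above, and returning -1 when svcnr == 0 is an assumption.
      
      ```c
      /* Hypothetical, trimmed-down stand-in for the s390 pt_regs. */
      struct pt_regs_sketch {
      	unsigned long gprs[16];   /* gprs[2] may hold the syscall number */
      	unsigned long orig_gprs2; /* first syscall argument, saved on entry */
      	unsigned short svcnr;     /* syscall number, 0 = not on a syscall path */
      };
      
      /* Valid for any sleeping task, because svcnr is always maintained.
       * Returning -1 for "no syscall" is an assumption of this sketch. */
      static long syscall_get_nr_sketch(const struct pt_regs_sketch *regs)
      {
      	return regs->svcnr ? (long)regs->svcnr : -1;
      }
      
      /* Read the first argument from the saved copy, not from gprs[2]. */
      static unsigned long syscall_first_arg_sketch(const struct pt_regs_sketch *regs)
      {
      	return regs->orig_gprs2;
      }
      ```
      
      With gprs[2] overwritten by the system call number, the sketch still
      reports the original first argument because it never reads gprs[2].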
  4. 28 Oct, 2008 3 commits
    • C
      [S390] s390: Fix build for !CONFIG_S390_GUEST + CONFIG_VIRTIO_CONSOLE · ea4bfdf5
      Christian Borntraeger committed
      The s390 kernel does not compile if virtio console is enabled, but guest
      support is disabled:
      
        LD      .tmp_vmlinux1
      arch/s390/kernel/built-in.o: In function `setup_arch':
      /space/linux-2.5/arch/s390/kernel/setup.c:773: undefined reference to
      `s390_virtio_console_init'
      
      The fix is related to
      commit 99e65c92
      Author: Christian Borntraeger <borntraeger@de.ibm.com>
      Date:   Fri Jul 25 15:50:04 2008 +0200
          KVM: s390: Fix guest kconfig
      
      That commit changed the build process to build kvm_virtio.c only if
      CONFIG_S390_GUEST is set. We must ifdef the prototype in the header file
      accordingly.
      Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
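      
      One common way to guard such a prototype is sketched here. The empty
      inline stub for the !CONFIG_S390_GUEST case and the caller
      setup_arch_sketch() are assumptions made for illustration; the
      actual patch may guard the declaration differently.
      
      ```c
      /* Define CONFIG_S390_GUEST to get the real prototype, which then
       * needs a definition from the guest support code at link time. */
      #ifdef CONFIG_S390_GUEST
      void s390_virtio_console_init(void);
      #else
      /* Assumed stub: without guest support the call compiles to nothing. */
      static inline void s390_virtio_console_init(void)
      {
      }
      #endif
      
      /* Hypothetical caller standing in for setup_arch(). */
      static int setup_arch_sketch(void)
      {
      	s390_virtio_console_init();
      	return 0;
      }
      ```
      
      With the stub in place the caller links even though kvm_virtio.c is
      not built, which is the failure mode shown in the linker output above.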
    • H
      [S390] No more 4kb stacks. · 7f5a8ba6
      Heiko Carstens committed
      We got a stack overflow with a small stack configuration on a 32 bit
      system. It looks like 4kb just isn't enough and is too dangerous.
      So let's get rid of 4kb stacks on 32 bit.
      
      But one thing I completely dislike about the call trace below is that
      just for debugging or tracing purposes sprintf gets called (cio_start_key):
      
      	/* process condition code */
      	sprintf(dbf_txt, "ccode:%d", ccode);
      	CIO_TRACE_EVENT(4, dbf_txt);
      
      But maybe it's just me who thinks that this could be done better.
      
          <4>Kernel stack overflow.
          <4>Modules linked in: dm_multipath sunrpc bonding qeth_l2 dm_mod qeth ccwgroup vmur
          <4>CPU: 1 Not tainted 2.6.27-30.x.20081015-s390default #1
          <4>Process httpd (pid: 3807, task: 20ae2df8, ksp: 1666fb78)
          <4>Krnl PSW : 040c0000 8027098a (number+0xe/0x348)
          <4>           R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:0 CC:0 PM:0
          <4>Krnl GPRS: 00d43318 0027097c 1666f277 9666f270
          <4>           00000000 00000000 0000000a ffffffff
          <4>           9666f270 1666f228 1666f277 1666f098
          <4>           00000002 80270982 80271016 1666f098
          <4>Krnl Code: 8027097e: f0340dd0a7f1	srp	3536(4,%r0),2033(%r10),4
          <4>           80270984: 0f00		clcl	%r0,%r0
          <4>           80270986: a7840001		brc	8,80270988
          <4>          >8027098a: 18ef		lr	%r14,%r15
          <4>           8027098c: a7faff68		ahi	%r15,-152
          <4>           80270990: 18bf		lr	%r11,%r15
          <4>           80270992: 18a2		lr	%r10,%r2
          <4>           80270994: 1893		lr	%r9,%r3
      
      Modified calltrace with annotated stackframe size of each function:
      
      stackframe size
          |
       0 304 vsnprintf+850 [0x271016]
       1  72 sprintf+74 [0x271522]
       2  56 cio_start_key+262 [0x2d4c16]
       3  56 ccw_device_start_key+222 [0x2dfe92]
       4  56 ccw_device_start+40 [0x2dff28]
       5  48 raw3215_start_io+104 [0x30b0f8]
       6  56 raw3215_write+494 [0x30ba0a]
       7  40 con3215_write+68 [0x30bafc]
       8  40 __call_console_drivers+146 [0x12b0fa]
       9  32 _call_console_drivers+102 [0x12b192]
      10  64 release_console_sem+268 [0x12b614]
      11 168 vprintk+462 [0x12bca6]
      12  72 printk+68 [0x12bfd0]
      13 256 __print_symbol+50 [0x15a882]
      14  56 __show_trace+162 [0x103d06]
      15  32 show_trace+224 [0x103e70]
      16  48 show_stack+152 [0x103f20]
      17  56 dump_stack+126 [0x104612]
      18  96 __alloc_pages_internal+592 [0x175004]
      19  80 cache_alloc_refill+776 [0x196f3c]
      20  40 __kmalloc+258 [0x1972ae]
      21  40 __alloc_skb+94 [0x328086]
      22  32 pskb_copy+50 [0x328252]
      23  32 skb_realloc_headroom+110 [0x328a72]
      24 104 qeth_l2_hard_start_xmit+378 [0x7803bfde]
      25  56 dev_hard_start_xmit+450 [0x32ef6e]
      26  56 __qdisc_run+390 [0x3425d6]
      27  48 dev_queue_xmit+410 [0x331e06]
      28  40 ip_finish_output+308 [0x354ac8]
      29  56 ip_output+218 [0x355b6e]
      30  24 ip_local_out+56 [0x354584]
      31 120 ip_queue_xmit+300 [0x355cec]
      32  96 tcp_transmit_skb+812 [0x367da8]
      33  40 tcp_push_one+158 [0x369fda]
      34 112 tcp_sendmsg+852 [0x35d5a0]
      35 240 sock_sendmsg+164 [0x32035c]
      36  56 kernel_sendmsg+86 [0x32064a]
      37  88 sock_no_sendpage+98 [0x322b22]
      38 104 tcp_sendpage+70 [0x35cc1e]
      39  48 sock_sendpage+74 [0x31eb66]
      40  64 pipe_to_sendpage+102 [0x1c4b2e]
      41  64 __splice_from_pipe+120 [0x1c5340]
      42  72 splice_from_pipe+90 [0x1c57e6]
      43  56 generic_splice_sendpage+38 [0x1c5832]
      44  48 do_splice_from+104 [0x1c4c38]
      45  48 direct_splice_actor+52 [0x1c4c88]
      46  80 splice_direct_to_actor+180 [0x1c4f80]
      47  72 do_splice_direct+70 [0x1c5112]
      48  64 do_sendfile+360 [0x19de18]
      49  72 sys_sendfile64+126 [0x19df32]
      50 336 sysc_do_restart+18 [0x111a1a]
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
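      
      To see how little headroom is left, the annotated frame sizes from
      the call trace can simply be added up. The following is a small
      illustrative sketch, not part of the patch; the array is copied from
      the trace and total_stack_usage() is a hypothetical helper name.
      
      ```c
      #include <stddef.h>
      
      /* Frame sizes in bytes from the annotated call trace
       * (entry 0 = vsnprintf, entry 50 = sysc_do_restart). */
      static const unsigned int frame_sizes[] = {
      	304,  72,  56,  56,  56,  48,  56,  40,  40,  32,
      	 64, 168,  72, 256,  56,  32,  48,  56,  96,  80,
      	 40,  40,  32,  32, 104,  56,  56,  48,  40,  56,
      	 24, 120,  96,  40, 112, 240,  56,  88, 104,  48,
      	 64,  64,  72,  56,  48,  48,  80,  72,  64,  72,
      	336,
      };
      
      /* Hypothetical helper: total bytes used by the listed frames. */
      static unsigned int total_stack_usage(void)
      {
      	unsigned int total = 0;
      	size_t i;
      
      	for (i = 0; i < sizeof(frame_sizes) / sizeof(frame_sizes[0]); i++)
      		total += frame_sizes[i];
      	return total;
      }
      ```
      
      The 51 listed frames add up to 4096 bytes, so a 4kb stack is already
      full before number(), where the PSW points, sets up its own frame.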
    • C
      [S390] pgtables: Fix race in enable_sie vs. page table ops · 250cf776
      Christian Borntraeger committed
      The current enable_sie code sets the mm->context.pgstes bit to tell
      dup_mm that the new mm should have extended page tables. This bit is also
      used by the s390 specific page table primitives to decide about the page
      table layout - which means context.pgstes has two meanings. This can cause
      all kinds of bugs. For example, shrink_zone can call
      ptep_clear_flush_young while enable_sie is running. ptep_clear_flush_young
      will test for context.pgstes. Since enable_sie changed that value of the old
      struct mm without changing the page table layout, ptep_clear_flush_young will
      do the wrong thing.
      The solution is to split pgstes into two bits
      - one for the allocation
      - one for the current state
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
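      
      The split into a request bit and a state bit can be modelled in a
      few lines of C. This is a simplified sketch under stated assumptions:
      the struct, its field names and all three helpers are hypothetical
      stand-ins, not the kernel's definitions.
      
      ```c
      #include <stdbool.h>
      
      /* Hypothetical stand-in for the s390 mm context. */
      struct mm_context_sketch {
      	bool alloc_pgste; /* request: a duplicated mm should get extended page tables */
      	bool has_pgste;   /* state: this mm's page tables have the extended layout */
      };
      
      /* enable_sie-like step: record the request only; the state bit of the
       * live mm stays untouched, so concurrent page table ops are unaffected. */
      static void request_pgstes(struct mm_context_sketch *old_ctx)
      {
      	old_ctx->alloc_pgste = true;
      }
      
      /* dup_mm-like step: the new mm is allocated with the requested layout,
       * so its state bit is derived from the old mm's request bit. */
      static struct mm_context_sketch dup_context(const struct mm_context_sketch *old_ctx)
      {
      	struct mm_context_sketch new_ctx;
      
      	new_ctx.alloc_pgste = old_ctx->alloc_pgste;
      	new_ctx.has_pgste = old_ctx->alloc_pgste;
      	return new_ctx;
      }
      
      /* Page table primitives look at the state bit only. */
      static bool uses_extended_layout(const struct mm_context_sketch *ctx)
      {
      	return ctx->has_pgste;
      }
      ```
      
      In this model a primitive running against the old mm during the
      request still sees the old layout, which is the property the single
      pgstes bit could not provide.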
  5. 20 Oct, 2008 1 commit
    • M
      container freezer: add TIF_FREEZE flag to all architectures · 83224b08
      Matt Helsley committed
      This patch series introduces a cgroup subsystem that utilizes the swsusp
      freezer to freeze a group of tasks.  It's immediately useful for batch job
      management scripts.  It should also be useful in the future for
      implementing container checkpoint/restart.
      
      The freezer subsystem in the container filesystem defines a cgroup file
      named freezer.state.  Reading freezer.state will return the current state
      of the cgroup.  Writing "FROZEN" to the state file will freeze all tasks
      in the cgroup.  Subsequently writing "RUNNING" will unfreeze the tasks in
      the cgroup.
      
      * Examples of usage :
      
         # mkdir /containers/freezer
         # mount -t cgroup -ofreezer freezer  /containers
         # mkdir /containers/0
         # echo $some_pid > /containers/0/tasks
      
      to get status of the freezer subsystem :
      
         # cat /containers/0/freezer.state
         RUNNING
      
      to freeze all tasks in the container :
      
         # echo FROZEN > /containers/0/freezer.state
         # cat /containers/0/freezer.state
         FREEZING
         # cat /containers/0/freezer.state
         FROZEN
      
      to unfreeze all tasks in the container :
      
         # echo RUNNING > /containers/0/freezer.state
         # cat /containers/0/freezer.state
         RUNNING
      
      This patch:
      
      This is the first step in making the refrigerator() available to all
      architectures, even those without power management.
      
      The purpose of such a change is to be able to use the refrigerator() in a
      new control group subsystem which will implement a control group freezer.
      
      [akpm@linux-foundation.org: fix sparc]
      Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
      Signed-off-by: Matt Helsley <matthltc@us.ibm.com>
      Acked-by: Pavel Machek <pavel@suse.cz>
      Acked-by: Serge E. Hallyn <serue@us.ibm.com>
      Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
      Acked-by: Nigel Cunningham <nigel@tuxonice.net>
      Tested-by: Matt Helsley <matthltc@us.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  6. 16 Oct, 2008 1 commit
  7. 11 Oct, 2008 5 commits
  8. 07 Sep, 2008 1 commit
  9. 22 Aug, 2008 1 commit
    • E
      [S390] fix ext2_find_next_bit · 152382af
      Eric Sandeen committed
      ext4 does not work on s390 because ext2_find_next_bit is broken. Fortunately
      this function is only used by ext4. The function uses ffs, which does not work
      analogously to ffz: the result of ffs has an offset of 1, which is not taken
      into account. To fix this, use the low level __ffs_word function directly
      instead of the ill-defined ffs.
      
      In addition the patch improves find_next_zero_bit and ext2_find_next_zero_bit
      by passing the bit offset into __ffz_word instead of adding it after the
      function call returned.
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
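      
      The off-by-one can be reproduced in userspace with the POSIX ffs()
      from <strings.h>, which also numbers bits starting at 1. This is an
      illustration only: both helper names are hypothetical, and the code
      is not the s390 kernel implementation.
      
      ```c
      #include <strings.h> /* POSIX ffs(): 1-based index of lowest set bit, 0 if none */
      
      /* Buggy variant: treats the 1-based ffs() result as a 0-based bit index. */
      static int lowest_set_bit_buggy(int word)
      {
      	return word ? ffs(word) : -1;
      }
      
      /* Fixed variant: compensates for the offset of 1. */
      static int lowest_set_bit_fixed(int word)
      {
      	return word ? ffs(word) - 1 : -1;
      }
      ```
      
      For a word with only bit 3 set (0x8), the buggy variant reports
      index 4 while the fixed variant reports index 3.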
  10. 15 Aug, 2008 1 commit
  11. 02 Aug, 2008 1 commit