1. 30 Oct, 2011 4 commits
    • [S390] signal race with restarting system calls · 20b40a79
      Committed by Martin Schwidefsky
      For an ERESTARTNOHAND/ERESTARTSYS/ERESTARTNOINTR restarting system call
      do_signal will prepare the restart of the system call with a rewind of
      the PSW before calling get_signal_to_deliver (where the debugger might
      take control). For an ERESTART_RESTARTBLOCK restarting system call
      do_signal will set -EINTR as the return code.
      There are two issues with this approach:
      1) strace never sees ERESTARTNOHAND, ERESTARTSYS, ERESTARTNOINTR or
         ERESTART_RESTARTBLOCK as the rewinding already took place or the
         return code has been changed to -EINTR
      2) if get_signal_to_deliver does not return with a signal to deliver
         the restart via the repeat of the svc instruction is left in place.
         This opens a race if another signal is made pending before the
         system call instruction can be reexecuted. The original system call
         will be restarted even if the second signal would have ended the
         system call with -EINTR.
      
      These two issues can be solved by dropping the early rewind of the
      system call before get_signal_to_deliver has been called and by using
      the TIF_RESTART_SVC magic to do the restart if no signal has to be
      delivered. The only situation where the system call restart via the
      repeat of the svc instruction is appropriate is when a SA_RESTART
      signal is delivered to user space.
      
      Unfortunately this breaks inferior calls by the debugger again. The
      system call number and the length of the system call instruction are
      lost over the inferior call and user space will see ERESTARTNOHAND/
      ERESTARTSYS/ERESTARTNOINTR/ERESTART_RESTARTBLOCK. To correct this, a
      new ptrace interface is added to save/restore the system call number
      and system call instruction length.
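
The restart decision described above can be sketched in plain C. This is a simplified model, not the s390 code: `decide_restart` and the `restart_action` enum are hypothetical names, the restart code values are the kernel-internal ones as commonly defined, and the handling of ERESTARTNOINTR with a handler follows generic Linux restart semantics rather than anything stated in the commit message.

```c
#include <stdbool.h>

/* Kernel-internal restart codes; they should never reach user space. */
#define ERESTARTSYS           512
#define ERESTARTNOINTR        513
#define ERESTARTNOHAND        514
#define ERESTART_RESTARTBLOCK 516

enum restart_action {
    RESTART_NONE,        /* return code is not a restart code */
    RESTART_VIA_FLAG,    /* no handler runs: restart via TIF_RESTART_SVC */
    RESTART_REWIND_SVC,  /* restarting handler runs: rewind the PSW to the svc */
    RESTART_FAIL_EINTR   /* handler runs: the system call returns -EINTR */
};

/*
 * Hypothetical helper: decide what to do AFTER get_signal_to_deliver()
 * has returned, so a tracer still sees the original restart code.
 * have_handler: a signal with a user-space handler will be delivered.
 * sa_restart:   that handler was installed with SA_RESTART.
 */
enum restart_action decide_restart(long rc, bool have_handler, bool sa_restart)
{
    switch (rc) {
    case -ERESTARTNOINTR:
        /* Assumption (generic Linux semantics): always restarted. */
        return have_handler ? RESTART_REWIND_SVC : RESTART_VIA_FLAG;
    case -ERESTARTSYS:
        if (!have_handler)
            return RESTART_VIA_FLAG;
        return sa_restart ? RESTART_REWIND_SVC : RESTART_FAIL_EINTR;
    case -ERESTARTNOHAND:
    case -ERESTART_RESTARTBLOCK:
        return have_handler ? RESTART_FAIL_EINTR : RESTART_VIA_FLAG;
    default:
        return RESTART_NONE;
    }
}
```

Because the rewind only happens once a handler is actually being set up, a second signal arriving before the restart can still end the call with -EINTR, which is the race the commit closes.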
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
    • [S390] kdump: Add infrastructure for unmapping crashkernel memory · 558df720
      Committed by Michael Holzheu
      This patch introduces a mechanism that allows architecture backends to
      remove page tables for the crashkernel memory. This can protect the loaded
      kdump kernel from being overwritten by broken kernel code.  Two new
      functions crash_map_reserved_pages() and crash_unmap_reserved_pages() are
      added that can be implemented by architecture code.  The
      crash_map_reserved_pages() function is called before and
      crash_unmap_reserved_pages() after the crashkernel segments are loaded.  The
      functions are also called in crash_shrink_memory() to create/remove page
      tables when the crashkernel memory size is reduced.
      
      To support architectures that have large pages this patch also introduces
      a new define KEXEC_CRASH_MEM_ALIGN. The crashkernel start and size must
      always be aligned with KEXEC_CRASH_MEM_ALIGN.
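
The alignment requirement can be illustrated with a small check. This is a sketch only: the 1 MiB value chosen for KEXEC_CRASH_MEM_ALIGN is an assumption for illustration, and `crash_region_aligned` and `crash_size_round_up` are hypothetical helper names, not functions from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed value for illustration: 1 MiB, e.g. one large page. */
#define KEXEC_CRASH_MEM_ALIGN (UINT64_C(1) << 20)

/* True if both start and size of the crashkernel region are aligned. */
bool crash_region_aligned(uint64_t start, uint64_t size)
{
    const uint64_t mask = KEXEC_CRASH_MEM_ALIGN - 1;

    return (start & mask) == 0 && (size & mask) == 0;
}

/* Round a requested size up to the next multiple of the alignment. */
uint64_t crash_size_round_up(uint64_t size)
{
    const uint64_t mask = KEXEC_CRASH_MEM_ALIGN - 1;

    return (size + mask) & ~mask;
}
```

Keeping the region a whole number of large pages means the page tables covering it can be created or removed without touching mappings for neighbouring memory.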
      
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Vivek Goyal <vgoyal@redhat.com>
      Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
    • [S390] kdump: Add size to elfcorehdr kernel parameter · d3bf3795
      Committed by Michael Holzheu
      Currently only the address of the pre-allocated ELF header is passed with
      the elfcorehdr= kernel parameter. To reserve memory for the header in the
      2nd kernel, the size is also required. Current kdump architecture
      backends use different methods to do that, e.g. x86 uses the memmap= kernel
      parameter. On s390 there is no easy way to transfer this information.
      Therefore the elfcorehdr kernel parameter is extended to also pass the size.
      This can now also be used as a standard mechanism by all future kdump
      architecture backends.
      
      The syntax of the kernel parameter is extended as follows:
      
      elfcorehdr=[size[KMG]@]offset[KMG]
      
      This change is backward compatible because the old form elfcorehdr=offset
      (address only, without a size) is still allowed.
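
A user-space sketch of parsing this syntax follows. It is not the kernel implementation: `parse_mem` is a stand-in for the kernel's memparse() helper, `parse_elfcorehdr` is a hypothetical name, and reporting an omitted size as 0 is an assumption made for the example.

```c
#include <stdlib.h>

/* Stand-in for memparse(): a number with an optional K/M/G suffix. */
unsigned long long parse_mem(const char *s, char **end)
{
    unsigned long long v = strtoull(s, end, 0);

    switch (**end) {
    case 'G': case 'g':
        v <<= 10; /* fall through */
    case 'M': case 'm':
        v <<= 10; /* fall through */
    case 'K': case 'k':
        v <<= 10;
        (*end)++;
        break;
    default:
        break;
    }
    return v;
}

/*
 * Parse "[size[KMG]@]offset[KMG]".
 * Returns 0 on success, -1 on malformed input.
 * *size is set to 0 when only the offset is given (the old syntax).
 */
int parse_elfcorehdr(const char *arg, unsigned long long *size,
                     unsigned long long *offset)
{
    char *end;

    if (!arg)
        return -1;
    *size = 0;
    *offset = parse_mem(arg, &end);
    if (end == arg)
        return -1;
    if (*end == '@') {
        /* The first number was the size; the offset follows the '@'. */
        const char *rest = end + 1;

        *size = *offset;
        *offset = parse_mem(rest, &end);
        if (end == rest)
            return -1;
    }
    return *end == '\0' ? 0 : -1;
}
```

Parsing the first number before looking for '@' is what keeps the old single-value form working: without an '@' the value is simply treated as the offset.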
      Acked-by: Vivek Goyal <vgoyal@redhat.com>
      Acked-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
    • [S390] kdump: Add KEXEC_CRASH_CONTROL_MEMORY_LIMIT · 3d214fae
      Committed by Michael Holzheu
      On s390 there is a different KEXEC_CONTROL_MEMORY_LIMIT for the normal and
      the kdump kexec case. Therefore this patch introduces a new macro
      KEXEC_CRASH_CONTROL_MEMORY_LIMIT. This is set to
      KEXEC_CONTROL_MEMORY_LIMIT for all architectures that do not define
      KEXEC_CRASH_CONTROL_MEMORY_LIMIT.
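
The fallback described here is the usual `#ifndef` default pattern. A minimal sketch, in which the numeric limit is a placeholder and `control_memory_limit` is a hypothetical helper added only to make the effect visible:

```c
/* Placeholder value; the real limit is architecture-specific. */
#define KEXEC_CONTROL_MEMORY_LIMIT 0x40000000UL

/*
 * An architecture with a separate kdump limit would define
 * KEXEC_CRASH_CONTROL_MEMORY_LIMIT before this point.
 * Every other architecture falls back to the normal limit.
 */
#ifndef KEXEC_CRASH_CONTROL_MEMORY_LIMIT
#define KEXEC_CRASH_CONTROL_MEMORY_LIMIT KEXEC_CONTROL_MEMORY_LIMIT
#endif

/* Hypothetical helper: pick the limit for the requested kexec type. */
unsigned long control_memory_limit(int is_crash_kexec)
{
    return is_crash_kexec ? KEXEC_CRASH_CONTROL_MEMORY_LIMIT
                          : KEXEC_CONTROL_MEMORY_LIMIT;
}
```

With this pattern generic code can always refer to the crash-specific macro, and architectures that do not care need no changes.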
      Acked-by: Vivek Goyal <vgoyal@redhat.com>
      Acked-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
  2. 29 Oct, 2011 1 commit
  3. 28 Oct, 2011 4 commits
    • vfs: add generic_file_llseek_size · 5760495a
      Committed by Andi Kleen
      Add a generic_file_llseek variant to the VFS that allows passing in
      the maximum file size of the file system, instead of always
      using maxbytes from the superblock.
      
      This can be used to eliminate some cut'n'paste seek code in ext4.
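
The idea can be sketched in user space as follows. The struct and function names ending in `_sketch` are hypothetical stand-ins, not the kernel's types or signatures, and overflow checks on the offset arithmetic are left out for brevity.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>   /* SEEK_SET, SEEK_CUR, SEEK_END */

/* Minimal stand-ins for the superblock and the open file. */
struct sb_sketch { int64_t s_maxbytes; };
struct file_sketch {
    int64_t f_pos;
    int64_t i_size;
    const struct sb_sketch *sb;
};

/* Seek variant that takes the maximum file size from the caller. */
int64_t llseek_size_sketch(struct file_sketch *f, int64_t offset,
                           int whence, int64_t maxsize)
{
    switch (whence) {
    case SEEK_SET:
        break;
    case SEEK_CUR:
        offset += f->f_pos;
        break;
    case SEEK_END:
        offset += f->i_size;
        break;
    default:
        return -EINVAL;
    }
    if (offset < 0 || offset > maxsize)
        return -EINVAL;
    f->f_pos = offset;
    return offset;
}

/* The plain variant keeps using the limit stored in the superblock. */
int64_t llseek_sketch(struct file_sketch *f, int64_t offset, int whence)
{
    return llseek_size_sketch(f, offset, whence, f->sb->s_maxbytes);
}
```

A file system whose limit depends on the individual file can call the size-taking variant directly instead of copying the seek logic.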
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • vfs: do (nearly) lockless generic_file_llseek · ef3d0fd2
      Committed by Andi Kleen
      The i_mutex lock use of generic_file_llseek hurts.  Independent processes
      accessing the same file synchronize over a single lock, even though
      they have no need for synchronization at all.
      
      Under high utilization this can cause llseek to scale very poorly on larger
      systems.
      
      This patch does some rethinking of the llseek locking model:
      
      First the 64bit f_pos is not necessarily atomic without locks
      on 32bit systems. This can already cause races with read() today.
      This was discussed on linux-kernel in the past and deemed acceptable.
      The patch does not change that.
      
      Let's look at the different seek variants:
      
      SEEK_SET: Doesn't really need any locking.
      If there's a race one writer wins, the other loses.
      
      For 32bit the non atomic update races against read()
      stay the same. Without a lock they can also happen
      against write() now.  The read() race was deemed
      acceptable in past discussions, and I think if it's
      ok for read it's ok for write too.
      
      => Don't need a lock.
      
      SEEK_END: This behaves like SEEK_SET plus it reads
      the maximum size too. Reading the maximum size would have the
      32bit atomic problem. But luckily we already have a way to read
      the maximum size without locking (i_size_read), so we
      can just use that instead.
      
      Without i_mutex there is no synchronization with write() anymore,
      however since the write() update is atomic on 64bit it just behaves
      like another racy SEEK_SET.  On non atomic 32bit it's the same
      as SEEK_SET.
      
      => Don't need a lock, but need to use i_size_read()
      
      SEEK_CUR: This has a read-modify-write race window
      on the same file. One could argue that any application
      doing unsynchronized seeks on the same file is already broken.
      But for the sake of not adding a regression here I'm
      using the file->f_lock to synchronize this. Using this
      lock is much better than the inode mutex because it doesn't
      synchronize between processes.
      
      => So still need a lock, but can use f_lock.
      
      This patch implements this new scheme in generic_file_llseek.
      I dropped generic_file_llseek_unlocked and changed all callers.
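
The three cases above can be modelled in user space with C11 atomics. This is a sketch under stated assumptions, not the kernel code: `seek_file`, `lock_file`, `unlock_file` and `seek_sketch` are hypothetical names, an `atomic_flag` spinlock stands in for file->f_lock, and an atomic load stands in for i_size_read().

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>   /* SEEK_SET, SEEK_CUR, SEEK_END */

struct seek_file {
    int64_t f_pos;           /* plain field: racy by design, as in the text */
    _Atomic int64_t i_size;  /* read atomically, like i_size_read() */
    atomic_flag f_lock;      /* per-file lock, stand-in for file->f_lock */
};

static void lock_file(struct seek_file *f)
{
    while (atomic_flag_test_and_set_explicit(&f->f_lock, memory_order_acquire))
        ; /* spin until the lock is free */
}

static void unlock_file(struct seek_file *f)
{
    atomic_flag_clear_explicit(&f->f_lock, memory_order_release);
}

int64_t seek_sketch(struct seek_file *f, int64_t offset, int whence,
                    int64_t maxsize)
{
    int64_t pos;

    switch (whence) {
    case SEEK_SET:
        pos = offset;                    /* no lock: last writer wins */
        break;
    case SEEK_END:
        pos = atomic_load(&f->i_size) + offset;  /* no lock, atomic size read */
        break;
    case SEEK_CUR:
        lock_file(f);                    /* read-modify-write needs the lock */
        pos = f->f_pos + offset;
        if (pos < 0 || pos > maxsize) {
            unlock_file(f);
            return -EINVAL;
        }
        f->f_pos = pos;
        unlock_file(f);
        return pos;
    default:
        return -EINVAL;
    }
    if (pos < 0 || pos > maxsize)
        return -EINVAL;
    f->f_pos = pos;
    return pos;
}
```

The lock lives in the per-open-file structure, so two processes that opened the same file independently never contend on it, unlike with the inode mutex.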
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • vfs: add hex format for MAY_* flag values · 8522ca58
      Committed by Aneesh Kumar K.V
      We are going to add more flags and having them in hex format
      makes it simpler.
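
For illustration, permission flags written in hex make each bit position obvious and leave a clear slot for the next flag. The values below are believed to match the existing MAY_* flags in include/linux/fs.h but should be checked against the tree; `may_mask_allows` is a hypothetical helper.

```c
/* One bit per flag; hex makes the next free bit easy to spot. */
#define MAY_EXEC   0x00000001
#define MAY_WRITE  0x00000002
#define MAY_READ   0x00000004
#define MAY_APPEND 0x00000008
#define MAY_ACCESS 0x00000010
#define MAY_OPEN   0x00000020
#define MAY_CHDIR  0x00000040

/* Hypothetical helper: true if every wanted bit is present in granted. */
int may_mask_allows(int granted, int wanted)
{
    return (granted & wanted) == wanted;
}
```

In decimal the same sequence reads 1, 2, 4, 8, 16, 32, 64, where a mistyped value is much harder to notice than a misplaced hex digit.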
      Acked-by: J. Bruce Fields <bfields@redhat.com>
      Acked-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • Fix build break when freezer not configured · e0c8ea1a
      Committed by Steve French
      fs/cifs/transport.c: In function 'wait_for_response':
      fs/cifs/transport.c:328: error: implicit declaration of function 'wait_event_freezekillable'
      
      Caused by commit f06ac72e ("cifs, freezer: add
      wait_event_freezekillable and have cifs use it").  In this config,
      CONFIG_FREEZER is not set.
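
The commit message does not spell out the fix. One common way to avoid this kind of build break, shown here purely as an assumed pattern, is to provide a fallback definition for the configuration where the feature is disabled. In this sketch `wait_event_killable` is a trivial stand-in for the kernel macro and `wait_for_response_sketch` is a hypothetical caller.

```c
/* CONFIG_FREEZER is deliberately left undefined to model the failing config. */

/* Stand-in for the kernel macro: here it only evaluates the condition. */
#define wait_event_killable(wq, condition) ((condition) ? 0 : -1)

#ifdef CONFIG_FREEZER
/* The freezer-aware definition would live here. */
#else
/* Fallback so callers still compile when the freezer is not configured. */
#define wait_event_freezekillable(wq, condition) \
    wait_event_killable(wq, condition)
#endif

/* Hypothetical caller, loosely modelled on wait_for_response(). */
int wait_for_response_sketch(int ready)
{
    int wq = 0;  /* placeholder for a wait queue */

    (void)wq;
    return wait_event_freezekillable(wq, ready);
}
```

With a definition in both branches of the `#ifdef`, the caller compiles regardless of whether CONFIG_FREEZER is set.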
      Reviewed-by: Shirish Pargaonkar <shirishp@us.ibm.com>
      CC: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: Steve French <smfrench@gmail.com>
  4. 27 Oct, 2011 23 commits
  5. 26 Oct, 2011 3 commits
  6. 25 Oct, 2011 2 commits
  7. 24 Oct, 2011 3 commits