1. 20 Jun, 2017 8 commits
  2. 19 Jun, 2017 1 commit
    • mm: larger stack guard gap, between vmas · 1be7107f
      Authored by Hugh Dickins
      Stack guard page is a useful feature to reduce a risk of stack smashing
      into a different mapping. We have been using a single page gap which
      is sufficient to prevent having stack adjacent to a different mapping.
      But this seems to be insufficient in the light of the stack usage in
      userspace. E.g. glibc uses as large as 64kB alloca() in many commonly
      used functions. Others use constructs like gid_t buffer[NGROUPS_MAX]
      which is 256kB or stack strings with MAX_ARG_STRLEN.
      
      This becomes especially dangerous for suid binaries combined with the
      default of no limit on the stack size, because those applications can be
      tricked into consuming a large portion of the stack, after which a single
      glibc call could jump over the guard page. These attacks are not
      theoretical, unfortunately.
      
      Make those attacks less probable by increasing the stack guard gap
      to 1MB (on systems with 4k pages; but make it depend on the page size
      because systems with larger base pages might cap stack allocations in
      the PAGE_SIZE units) which should cover larger alloca() and VLA stack
      allocations. It is obviously not a full fix because the problem is
      somehow inherent, but it should reduce attack space a lot.
      
      One could argue that the gap size should be configurable from userspace,
      but that can be done later when somebody finds that the new 1MB is wrong
      for some special case applications.  For now, add a kernel command line
      option (stack_guard_gap) to specify the stack gap size (in page units).
      
      Implementation wise, first delete all the old code for stack guard page:
      because although we could get away with accounting one extra page in a
      stack vma, accounting a larger gap can break userspace - case in point,
      a program run with "ulimit -S -v 20000" failed when the 1MB gap was
      counted for RLIMIT_AS; similar problems could come with RLIMIT_MLOCK
      and strict non-overcommit mode.
      
      Instead of keeping gap inside the stack vma, maintain the stack guard
      gap as a gap between vmas: using vm_start_gap() in place of vm_start
      (or vm_end_gap() in place of vm_end if VM_GROWSUP) in just those few
      places which need to respect the gap - mainly arch_get_unmapped_area(),
      and the vma tree's subtree_gap support for that.
      Original-patch-by: Oleg Nesterov <oleg@redhat.com>
      Original-patch-by: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Tested-by: Helge Deller <deller@gmx.de> # parisc
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
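The gap arithmetic described in this commit message can be modelled in user space. The following is a minimal sketch, not the kernel code: the `model_` names, the flag value, and the 4k page size are assumptions made for illustration. It shows a gap of 256 pages (1MB with 4k pages) and a helper in the spirit of vm_start_gap() that reports the lowest address a neighbouring mapping must stay below.

```c
/* Illustrative constants; the real values live in kernel headers. */
#define MODEL_PAGE_SHIFT   12        /* assumption: 4k base pages */
#define MODEL_VM_GROWSDOWN 0x1UL     /* hypothetical flag value */

/* Hypothetical, stripped-down stand-in for a vma. */
struct model_vma {
    unsigned long vm_start;
    unsigned long vm_end;
    unsigned long vm_flags;
};

/* Gap expressed in page units: 256 pages is 1MB with 4k pages. */
static unsigned long model_stack_guard_gap = 256UL << MODEL_PAGE_SHIFT;

/* Lowest address that must stay unmapped below this vma. */
static unsigned long model_vm_start_gap(const struct model_vma *vma)
{
    unsigned long start = vma->vm_start;

    if (vma->vm_flags & MODEL_VM_GROWSDOWN) {
        start -= model_stack_guard_gap;
        if (start > vma->vm_start)   /* unsigned wrap-around: clamp to 0 */
            start = 0;
    }
    return start;
}

/* Would a new mapping ending at new_end intrude into the gap below vma? */
static int model_intrudes_on_gap(unsigned long new_end,
                                 const struct model_vma *vma)
{
    return new_end > model_vm_start_gap(vma);
}
```

Because the gap is computed from the vma rather than stored inside it, the stack vma's own size is unchanged, which is consistent with the commit's point about not counting the gap against RLIMIT_AS.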
  3. 17 Jun, 2017 1 commit
  4. 16 Jun, 2017 1 commit
  5. 15 Jun, 2017 13 commits
  6. 12 Jun, 2017 2 commits
    • configfs: Introduce config_item_get_unless_zero() · 19e72d3a
      Authored by Bart Van Assche
      Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
      [hch: minor style tweak]
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • configfs: Fix race between create_link and configfs_rmdir · ba80aa90
      Authored by Nicholas Bellinger
      This patch closes a long standing race in configfs between
      the creation of a new symlink in create_link(), while the
      symlink target's config_item is being concurrently removed
      via configfs_rmdir().
      
      This can happen because the symlink target's reference
      is obtained by config_item_get() in create_link() before
      the CONFIGFS_USET_DROPPING bit set by configfs_detach_prep()
      during configfs_rmdir() shutdown is actually checked.
      
      This originally manifested itself on ppc64 on v4.8.y under
      heavy load using ibmvscsi target ports with Novalink API:
      
      [ 7877.289863] rpadlpar_io: slot U8247.22L.212A91A-V1-C8 added
      [ 7879.893760] ------------[ cut here ]------------
      [ 7879.893768] WARNING: CPU: 15 PID: 17585 at ./include/linux/kref.h:46 config_item_get+0x7c/0x90 [configfs]
      [ 7879.893811] CPU: 15 PID: 17585 Comm: targetcli Tainted: G           O 4.8.17-customv2.22 #12
      [ 7879.893812] task: c00000018a0d3400 task.stack: c0000001f3b40000
      [ 7879.893813] NIP: d000000002c664ec LR: d000000002c60980 CTR: c000000000b70870
      [ 7879.893814] REGS: c0000001f3b43810 TRAP: 0700   Tainted: G O     (4.8.17-customv2.22)
      [ 7879.893815] MSR: 8000000000029033 <SF,EE,ME,IR,DR,RI,LE>  CR: 28222242  XER: 00000000
      [ 7879.893820] CFAR: d000000002c664bc SOFTE: 1
                      GPR00: d000000002c60980 c0000001f3b43a90 d000000002c70908 c0000000fbc06820
                      GPR04: c0000001ef1bd900 0000000000000004 0000000000000001 0000000000000000
                      GPR08: 0000000000000000 0000000000000001 d000000002c69560 d000000002c66d80
                      GPR12: c000000000b70870 c00000000e798700 c0000001f3b43ca0 c0000001d4949d40
                      GPR16: c00000014637e1c0 0000000000000000 0000000000000000 c0000000f2392940
                      GPR20: c0000001f3b43b98 0000000000000041 0000000000600000 0000000000000000
                      GPR24: fffffffffffff000 0000000000000000 d000000002c60be0 c0000001f1dac490
                      GPR28: 0000000000000004 0000000000000000 c0000001ef1bd900 c0000000f2392940
      [ 7879.893839] NIP [d000000002c664ec] config_item_get+0x7c/0x90 [configfs]
      [ 7879.893841] LR [d000000002c60980] check_perm+0x80/0x2e0 [configfs]
      [ 7879.893842] Call Trace:
      [ 7879.893844] [c0000001f3b43ac0] [d000000002c60980] check_perm+0x80/0x2e0 [configfs]
      [ 7879.893847] [c0000001f3b43b10] [c000000000329770] do_dentry_open+0x2c0/0x460
      [ 7879.893849] [c0000001f3b43b70] [c000000000344480] path_openat+0x210/0x1490
      [ 7879.893851] [c0000001f3b43c80] [c00000000034708c] do_filp_open+0xfc/0x170
      [ 7879.893853] [c0000001f3b43db0] [c00000000032b5bc] do_sys_open+0x1cc/0x390
      [ 7879.893856] [c0000001f3b43e30] [c000000000009584] system_call+0x38/0xec
      [ 7879.893856] Instruction dump:
      [ 7879.893858] 409d0014 38210030 e8010010 7c0803a6 4e800020 3d220000 e94981e0 892a0000
      [ 7879.893861] 2f890000 409effe0 39200001 992a0000 <0fe00000> 4bffffd0 60000000 60000000
      [ 7879.893866] ---[ end trace 14078f0b3b5ad0aa ]---
      
      To close this race, go ahead and obtain the symlink's target
      config_item reference only after the existing CONFIGFS_USET_DROPPING
      check succeeds.
      
      This way, if configfs_rmdir() wins, create_link() will return -ENOENT,
      and if create_link() wins, configfs_rmdir() will return -EBUSY.
      Reported-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
      Tested-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
      Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Cc: stable@vger.kernel.org
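The ordering this commit message describes (check the dropping flag first, take the reference only afterwards) can be sketched as a small single-threaded model. Everything here is hypothetical: the `model_` names, the flag value, and the struct layout are illustrative, and the locking that the real code relies on is only indicated in comments.

```c
#include <errno.h>

#define MODEL_USET_DROPPING 0x1u   /* hypothetical flag value */

/* Hypothetical stand-in for a symlink target and its bookkeeping. */
struct model_item {
    unsigned int s_type;   /* state flags, including the dropping bit */
    int refcount;          /* references held on the item */
    int links;             /* symlinks pointing at the item */
};

/*
 * Models the fixed ordering: test the dropping flag first and take the
 * reference only if the test passes. Both steps are assumed to run
 * under one lock in the real code.
 */
static int model_create_link(struct model_item *target)
{
    if (target->s_type & MODEL_USET_DROPPING)
        return -ENOENT;        /* removal already won */
    target->refcount++;        /* safe: the item is not being dropped */
    target->links++;
    return 0;
}

/* Models removal: refuse while links exist, otherwise mark as dropping. */
static int model_rmdir(struct model_item *target)
{
    if (target->links > 0)
        return -EBUSY;         /* link creation already won */
    target->s_type |= MODEL_USET_DROPPING;
    return 0;
}
```

Whichever call runs first determines the outcome: a removal that has marked the item makes link creation fail without touching the reference count, and an existing link makes removal fail.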
  7. 11 Jun, 2017 1 commit
  8. 10 Jun, 2017 10 commits
  9. 08 Jun, 2017 2 commits
    • xfs: fix spurious spin_is_locked() assert failures on non-smp kernels · 95989c46
      Authored by Brian Foster
      The 0-day kernel test robot reports assertion failures on
      !CONFIG_SMP kernels due to failed spin_is_locked() checks. As it
      turns out, spin_is_locked() is hardcoded to return zero on
      !CONFIG_SMP kernels and so this function cannot be relied on to
      verify spinlock state in this configuration.
      
      To avoid this problem, replace the associated asserts with lockdep
      variants that do the right thing regardless of kernel configuration.
      Drop the one assert that checks for an unlocked lock as there is no
      suitable lockdep variant for that case. This moves the spinlock
      checks from XFS debug code to lockdep, but generally provides the
      same level of protection.
      Reported-by: kbuild test robot <fengguang.wu@intel.com>
      Signed-off-by: Brian Foster <bfoster@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
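Why an assert on spin_is_locked() misfires on a uniprocessor build can be illustrated with a toy model. This is only a sketch under stated assumptions: `MODEL_SMP`, the `model_` functions, and the separate ownership field are invented for illustration, and the ownership field is only loosely analogous to what lockdep records when it is enabled.

```c
/* Define MODEL_SMP to model an SMP build; leave it undefined for UP. */

/* Hypothetical lock with ownership tracked apart from the lock word. */
struct model_spinlock {
    int locked;   /* the lock word an SMP build would inspect */
    int held;     /* separately tracked ownership */
};

static void model_spin_lock(struct model_spinlock *l)
{
    l->locked = 1;
    l->held = 1;
}

static void model_spin_unlock(struct model_spinlock *l)
{
    l->locked = 0;
    l->held = 0;
}

#ifdef MODEL_SMP
static int model_spin_is_locked(const struct model_spinlock *l)
{
    return l->locked;
}
#else
/* UP model: the query is hardcoded to zero, whatever the real state is. */
static int model_spin_is_locked(const struct model_spinlock *l)
{
    (void)l;
    return 0;
}
#endif

/* Ownership query that does not depend on the SMP setting. */
static int model_lock_is_held(const struct model_spinlock *l)
{
    return l->held;
}
```

With `MODEL_SMP` undefined, an assertion of the form "the lock is locked" built on model_spin_is_locked() fails even right after model_spin_lock(), while the ownership query still reports the lock as held.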
    • crypto: Work around deallocated stack frame reference gcc bug on sparc. · d41519a6
      Authored by David Miller
      On sparc, if we have an alloca() like situation, as is the case with
      SHASH_DESC_ON_STACK(), we can end up referencing deallocated stack
      memory.  The result can be that the value is clobbered if a trap
      or interrupt arrives at just the right instruction.
      
      It only occurs if the function ends up returning a value from that
      alloca() area and that value can be placed into the return value
      register using a single instruction.
      
      For example, in lib/libcrc32c.c:crc32c() we end up with a return
      sequence like:
      
              return  %i7+8
               lduw   [%o5+16], %o0   ! MEM[(u32 *)__shash_desc.1_10 + 16B],
      
      %o5 holds the base of the on-stack area allocated for the shash
      descriptor.  But the return released the stack frame and the
      register window.
      
      So if an interrupt arrives between 'return' and 'lduw', then
      the value read at %o5+16 can be corrupted.
      
      Add a data compiler barrier to work around this problem.  This is
      exactly what the gcc fix will end up doing as well, and it absolutely
      should not change the code generated for other cpus (unless gcc
      on them has the same bug :-)
      
      With crucial insight from Eric Sandeen.
      
      Cc: <stable@vger.kernel.org>
      Reported-by: Anatoly Pugachev <matorola@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
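The "data compiler barrier" mentioned in this commit message can be sketched in user-space C. This is an illustration under assumptions, not the patch itself: `model_barrier_data` is a hypothetical macro written with GNU C inline assembly (GCC and Clang), and the function and buffer are placeholders for the on-stack descriptor. The idea is to load the value into a local while the frame is still live, then place a barrier that names the stack area before returning.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical data barrier: an empty asm statement that takes the
 * pointer as an input and clobbers memory, so the compiler may not move
 * accesses to that memory across it. GNU C extension.
 */
#define model_barrier_data(ptr) \
    __asm__ __volatile__("" : : "r"(ptr) : "memory")

static uint32_t model_read_from_stack_area(uint32_t seed)
{
    uint32_t area[8];   /* stands in for the on-stack descriptor */
    uint32_t ret;

    memset(area, 0, sizeof(area));
    area[4] = seed ^ 0xffffffffu;   /* placeholder for the real work */

    ret = area[4];                  /* load while the frame is live */
    model_barrier_data(area);       /* keep the load above this point */
    return ret;
}
```

On most targets the barrier emits no instructions; its only job is to constrain how the compiler schedules the final load relative to the function's return sequence.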
  10. 05 Jun, 2017 1 commit