You need to sign in or sign up before continuing.
  1. 02 5月, 2018 2 次提交
  2. 28 4月, 2018 1 次提交
  3. 27 4月, 2018 1 次提交
  4. 26 4月, 2018 1 次提交
    • T
      Revert: Unify CLOCK_MONOTONIC and CLOCK_BOOTTIME · a3ed0e43
      Thomas Gleixner 提交于
      Revert commits
      
      92af4dcb ("tracing: Unify the "boot" and "mono" tracing clocks")
      127bfa5f ("hrtimer: Unify MONOTONIC and BOOTTIME clock behavior")
      7250a404 ("posix-timers: Unify MONOTONIC and BOOTTIME clock behavior")
      d6c7270e ("timekeeping: Remove boot time specific code")
      f2d6fdbf ("Input: Evdev - unify MONOTONIC and BOOTTIME clock behavior")
      d6ed449a ("timekeeping: Make the MONOTONIC clock behave like the BOOTTIME clock")
      72199320 ("timekeeping: Add the new CLOCK_MONOTONIC_ACTIVE clock")
      
      As stated in the pull request for the unification of CLOCK_MONOTONIC and
      CLOCK_BOOTTIME, it was clear that we might have to revert the change.
      
      As reported by several folks systemd and other applications rely on the
      documented behaviour of CLOCK_MONOTONIC on Linux and break with the above
      changes. After resume daemons time out and other timeout related issues are
      observed. Rafael compiled this list:
      
      * systemd kills daemons on resume, after >WatchdogSec seconds
        of suspending (Genki Sky).  [Verified that that's because systemd uses
        CLOCK_MONOTONIC and expects it to not include the suspend time.]
      
      * systemd-journald misbehaves after resume:
        systemd-journald[7266]: File /var/log/journal/016627c3c4784cd4812d4b7e96a34226/system.journal
      corrupted or uncleanly shut down, renaming and replacing.
        (Mike Galbraith).
      
      * NetworkManager reports "networking disabled" and networking is broken
        after resume 50% of the time (Pavel).  [May be because of systemd.]
      
      * MATE desktop dims the display and starts the screensaver right after
        system resume (Pavel).
      
      * Full system hang during resume (me).  [May be due to systemd or NM or both.]
      
      That happens on debian and open suse systems.
      
      It's sad, that these problems were neither catched in -next nor by those
      folks who expressed interest in this change.
      Reported-by: NRafael J. Wysocki <rjw@rjwysocki.net>
      Reported-by: Genki Sky <sky@genki.is>,
      Reported-by: NPavel Machek <pavel@ucw.cz>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Kevin Easton <kevin@guarana.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mark Salyzyn <salyzyn@android.com>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      a3ed0e43
  5. 25 4月, 2018 1 次提交
  6. 23 4月, 2018 1 次提交
  7. 19 4月, 2018 1 次提交
    • J
      cfg80211: limit wiphy names to 128 bytes · a7cfebcb
      Johannes Berg 提交于
      There's currently no limit on wiphy names, other than netlink
      message size and memory limitations, but that causes issues when,
      for example, the wiphy name is used in a uevent, e.g. in rfkill
      where we use the same name for the rfkill instance, and then the
      buffer there is "only" 2k for the environment variables.
      
      This was reported by syzkaller, which used a 4k name.
      
      Limit the name to something reasonable, I randomly picked 128.
      
      Reported-by: syzbot+230d9e642a85d3fec29c@syzkaller.appspotmail.com
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      a7cfebcb
  8. 17 4月, 2018 1 次提交
  9. 16 4月, 2018 1 次提交
  10. 14 4月, 2018 1 次提交
  11. 12 4月, 2018 8 次提交
    • M
      linux/const.h: refactor _BITUL and _BITULL a bit · 21e7bc60
      Masahiro Yamada 提交于
      Minor cleanups available by _UL and _ULL.
      
      Link: http://lkml.kernel.org/r/1519301715-31798-5-git-send-email-yamada.masahiro@socionext.comSigned-off-by: NMasahiro Yamada <yamada.masahiro@socionext.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Cc: Russell King <rmk+kernel@armlinux.org.uk>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      21e7bc60
    • M
      linux/const.h: move UL() macro to include/linux/const.h · 2dd8a62c
      Masahiro Yamada 提交于
      ARM, ARM64 and UniCore32 duplicate the definition of UL():
      
        #define UL(x) _AC(x, UL)
      
      This is not actually arch-specific, so it will be useful to move it to a
      common header.  Currently, we only have the uapi variant for
      linux/const.h, so I am creating include/linux/const.h.
      
      I also added _UL(), _ULL() and ULL() because _AC() is mostly used in
      the form either _AC(..., UL) or _AC(..., ULL).  I expect they will be
      replaced in follow-up cleanups.  The underscore-prefixed ones should
      be used for exported headers.
      
      Link: http://lkml.kernel.org/r/1519301715-31798-4-git-send-email-yamada.masahiro@socionext.comSigned-off-by: NMasahiro Yamada <yamada.masahiro@socionext.com>
      Acked-by: NGuan Xuetao <gxt@mprc.pku.edu.cn>
      Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
      Acked-by: NRussell King <rmk+kernel@armlinux.org.uk>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2dd8a62c
    • M
      linux/const.h: prefix include guard of uapi/linux/const.h with _UAPI · 2a6cc8a6
      Masahiro Yamada 提交于
      Patch series "linux/const.h: cleanups of macros such as UL(), _BITUL(),
      BIT() etc", v3.
      
      ARM, ARM64, UniCore32 define UL() as a shorthand of _AC(..., UL).  More
      architectures may introduce it in the future.
      
      UL() is arch-agnostic, and useful. So let's move it to
      include/linux/const.h
      
      Currently, <asm/memory.h> must be included to use UL().  It pulls in more
      bloats just for defining some bit macros.
      
      I posted V2 one year ago.
      
      The previous posts are:
      https://patchwork.kernel.org/patch/9498273/
      https://patchwork.kernel.org/patch/9498275/
      https://patchwork.kernel.org/patch/9498269/
      https://patchwork.kernel.org/patch/9498271/
      
      At that time, what blocked this series was a comment from
      David Howells:
        You need to be very careful doing this.  Some userspace stuff
        depends on the guard macro names on the kernel header files.
      
      (https://patchwork.kernel.org/patch/9498275/)
      
      Looking at the code closer, I noticed this is not a problem.
      
      See the following line.
      https://github.com/torvalds/linux/blob/v4.16-rc2/scripts/headers_install.sh#L40
      
      scripts/headers_install.sh rips off _UAPI prefix from guard macro names.
      
      I ran "make headers_install" and confirmed the result is what I expect.
      
      So, we can prefix the include guard of include/uapi/linux/const.h,
      and add a new include/linux/const.h.
      
      This patch (of 4):
      
      I am going to add include/linux/const.h for the kernel space.
      
      Add _UAPI to the include guard of include/uapi/linux/const.h to
      prepare for that.
      
      Please notice the guard name of the exported one will be kept as-is.
      So, this commit has no impact to the userspace even if some userspace
      stuff depends on the guard macro names.
      
      scripts/headers_install.sh processes exported headers by SED, and
      rips off "_UAPI" from guard macro names.
      
        #ifndef _UAPI_LINUX_CONST_H
        #define _UAPI_LINUX_CONST_H
      
      will be turned into
      
        #ifndef _LINUX_CONST_H
        #define _LINUX_CONST_H
      
      Link: http://lkml.kernel.org/r/1519301715-31798-2-git-send-email-yamada.masahiro@socionext.comSigned-off-by: NMasahiro Yamada <yamada.masahiro@socionext.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Russell King <rmk+kernel@armlinux.org.uk>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2a6cc8a6
    • M
      fs, elf: drop MAP_FIXED usage from elf_map · 4ed28639
      Michal Hocko 提交于
      Both load_elf_interp and load_elf_binary rely on elf_map to map segments
      on a controlled address and they use MAP_FIXED to enforce that.  This is
      however dangerous thing prone to silent data corruption which can be
      even exploitable.
      
      Let's take CVE-2017-1000253 as an example.  At the time (before commit
      eab09532: "binfmt_elf: use ELF_ET_DYN_BASE only for PIE")
      ELF_ET_DYN_BASE was at TASK_SIZE / 3 * 2 which is not that far away from
      the stack top on 32b (legacy) memory layout (only 1GB away).  Therefore
      we could end up mapping over the existing stack with some luck.
      
      The issue has been fixed since then (a87938b2: "fs/binfmt_elf.c: fix
      bug in loading of PIE binaries"), ELF_ET_DYN_BASE moved moved much
      further from the stack (eab09532 and later by c715b72c: "mm:
      revert x86_64 and arm64 ELF_ET_DYN_BASE base changes") and excessive
      stack consumption early during execve fully stopped by da029c11
      ("exec: Limit arg stack to at most 75% of _STK_LIM").  So we should be
      safe and any attack should be impractical.  On the other hand this is
      just too subtle assumption so it can break quite easily and hard to
      spot.
      
      I believe that the MAP_FIXED usage in load_elf_binary (et. al) is still
      fundamentally dangerous.  Moreover it shouldn't be even needed.  We are
      at the early process stage and so there shouldn't be unrelated mappings
      (except for stack and loader) existing so mmap for a given address should
      succeed even without MAP_FIXED.  Something is terribly wrong if this is
      not the case and we should rather fail than silently corrupt the
      underlying mapping.
      
      Address this issue by changing MAP_FIXED to the newly added
      MAP_FIXED_NOREPLACE.  This will mean that mmap will fail if there is an
      existing mapping clashing with the requested one without clobbering it.
      
      [mhocko@suse.com: fix build]
      [akpm@linux-foundation.org: coding-style fixes]
      [avagin@openvz.org: don't use the same value for MAP_FIXED_NOREPLACE and MAP_SYNC]
        Link: http://lkml.kernel.org/r/20171218184916.24445-1-avagin@openvz.org
      Link: http://lkml.kernel.org/r/20171213092550.2774-3-mhocko@kernel.orgSigned-off-by: NMichal Hocko <mhocko@suse.com>
      Signed-off-by: NAndrei Vagin <avagin@openvz.org>
      Signed-off-by: NMichal Hocko <mhocko@suse.com>
      Reviewed-by: NKhalid Aziz <khalid.aziz@oracle.com>
      Acked-by: NMichael Ellerman <mpe@ellerman.id.au>
      Acked-by: NKees Cook <keescook@chromium.org>
      Cc: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
      Cc: Joel Stanley <joel@jms.id.au>
      Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      4ed28639
    • M
      mm: introduce MAP_FIXED_NOREPLACE · a4ff8e86
      Michal Hocko 提交于
      Patch series "mm: introduce MAP_FIXED_NOREPLACE", v2.
      
      This has started as a follow up discussion [3][4] resulting in the
      runtime failure caused by hardening patch [5] which removes MAP_FIXED
      from the elf loader because MAP_FIXED is inherently dangerous as it
      might silently clobber an existing underlying mapping (e.g.  stack).
      The reason for the failure is that some architectures enforce an
      alignment for the given address hint without MAP_FIXED used (e.g.  for
      shared or file backed mappings).
      
      One way around this would be excluding those archs which do alignment
      tricks from the hardening [6].  The patch is really trivial but it has
      been objected, rightfully so, that this screams for a more generic
      solution.  We basically want a non-destructive MAP_FIXED.
      
      The first patch introduced MAP_FIXED_NOREPLACE which enforces the given
      address but unlike MAP_FIXED it fails with EEXIST if the given range
      conflicts with an existing one.  The flag is introduced as a completely
      new one rather than a MAP_FIXED extension because of the backward
      compatibility.  We really want a never-clobber semantic even on older
      kernels which do not recognize the flag.  Unfortunately mmap sucks
      wrt flags evaluation because we do not EINVAL on unknown flags.  On
      those kernels we would simply use the traditional hint based semantic so
      the caller can still get a different address (which sucks) but at least
      not silently corrupt an existing mapping.  I do not see a good way
      around that.  Except we won't export expose the new semantic to the
      userspace at all.
      
      It seems there are users who would like to have something like that.
      Jemalloc has been mentioned by Michael Ellerman [7]
      
      Florian Weimer has mentioned the following:
      : glibc ld.so currently maps DSOs without hints.  This means that the kernel
      : will map right next to each other, and the offsets between them a completely
      : predictable.  We would like to change that and supply a random address in a
      : window of the address space.  If there is a conflict, we do not want the
      : kernel to pick a non-random address. Instead, we would try again with a
      : random address.
      
      John Hubbard has mentioned CUDA example
      : a) Searches /proc/<pid>/maps for a "suitable" region of available
      : VA space.  "Suitable" generally means it has to have a base address
      : within a certain limited range (a particular device model might
      : have odd limitations, for example), it has to be large enough, and
      : alignment has to be large enough (again, various devices may have
      : constraints that lead us to do this).
      :
      : This is of course subject to races with other threads in the process.
      :
      : Let's say it finds a region starting at va.
      :
      : b) Next it does:
      :     p = mmap(va, ...)
      :
      : *without* setting MAP_FIXED, of course (so va is just a hint), to
      : attempt to safely reserve that region. If p != va, then in most cases,
      : this is a failure (almost certainly due to another thread getting a
      : mapping from that region before we did), and so this layer now has to
      : call munmap(), before returning a "failure: retry" to upper layers.
      :
      :     IMPROVEMENT: --> if instead, we could call this:
      :
      :             p = mmap(va, ... MAP_FIXED_NOREPLACE ...)
      :
      :         , then we could skip the munmap() call upon failure. This
      :         is a small thing, but it is useful here. (Thanks to Piotr
      :         Jaroszynski and Mark Hairgrove for helping me get that detail
      :         exactly right, btw.)
      :
      : c) After that, CUDA suballocates from p, via:
      :
      :      q = mmap(sub_region_start, ... MAP_FIXED ...)
      :
      : Interestingly enough, "freeing" is also done via MAP_FIXED, and
      : setting PROT_NONE to the subregion. Anyway, I just included (c) for
      : general interest.
      
      Atomic address range probing in the multithreaded programs in general
      sounds like an interesting thing to me.
      
      The second patch simply replaces MAP_FIXED use in elf loader by
      MAP_FIXED_NOREPLACE.  I believe other places which rely on MAP_FIXED
      should follow.  Actually real MAP_FIXED usages should be docummented
      properly and they should be more of an exception.
      
      [1] http://lkml.kernel.org/r/20171116101900.13621-1-mhocko@kernel.org
      [2] http://lkml.kernel.org/r/20171129144219.22867-1-mhocko@kernel.org
      [3] http://lkml.kernel.org/r/20171107162217.382cd754@canb.auug.org.au
      [4] http://lkml.kernel.org/r/1510048229.12079.7.camel@abdul.in.ibm.com
      [5] http://lkml.kernel.org/r/20171023082608.6167-1-mhocko@kernel.org
      [6] http://lkml.kernel.org/r/20171113094203.aofz2e7kueitk55y@dhcp22.suse.cz
      [7] http://lkml.kernel.org/r/87efp1w7vy.fsf@concordia.ellerman.id.au
      
      This patch (of 2):
      
      MAP_FIXED is used quite often to enforce mapping at the particular range.
      The main problem of this flag is, however, that it is inherently dangerous
      because it unmaps existing mappings covered by the requested range.  This
      can cause silent memory corruptions.  Some of them even with serious
      security implications.  While the current semantic might be really
      desiderable in many cases there are others which would want to enforce the
      given range but rather see a failure than a silent memory corruption on a
      clashing range.  Please note that there is no guarantee that a given range
      is obeyed by the mmap even when it is free - e.g.  arch specific code is
      allowed to apply an alignment.
      
      Introduce a new MAP_FIXED_NOREPLACE flag for mmap to achieve this
      behavior.  It has the same semantic as MAP_FIXED wrt.  the given address
      request with a single exception that it fails with EEXIST if the requested
      address is already covered by an existing mapping.  We still do rely on
      get_unmaped_area to handle all the arch specific MAP_FIXED treatment and
      check for a conflicting vma after it returns.
      
      The flag is introduced as a completely new one rather than a MAP_FIXED
      extension because of the backward compatibility.  We really want a
      never-clobber semantic even on older kernels which do not recognize the
      flag.  Unfortunately mmap sucks wrt.  flags evaluation because we do not
      EINVAL on unknown flags.  On those kernels we would simply use the
      traditional hint based semantic so the caller can still get a different
      address (which sucks) but at least not silently corrupt an existing
      mapping.  I do not see a good way around that.
      
      [mpe@ellerman.id.au: fix whitespace]
      [fail on clashing range with EEXIST as per Florian Weimer]
      [set MAP_FIXED before round_hint_to_min as per Khalid Aziz]
      Link: http://lkml.kernel.org/r/20171213092550.2774-2-mhocko@kernel.orgReviewed-by: NKhalid Aziz <khalid.aziz@oracle.com>
      Signed-off-by: NMichal Hocko <mhocko@suse.com>
      Acked-by: NMichael Ellerman <mpe@ellerman.id.au>
      Cc: Khalid Aziz <khalid.aziz@oracle.com>
      Cc: Russell King - ARM Linux <linux@armlinux.org.uk>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Florian Weimer <fweimer@redhat.com>
      Cc: John Hubbard <jhubbard@nvidia.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
      Cc: Joel Stanley <joel@jms.id.au>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Jason Evans <jasone@google.com>
      Cc: David Goldblatt <davidtgoldblatt@gmail.com>
      Cc: Edward Tomasz Napierała <trasz@FreeBSD.org>
      Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a4ff8e86
    • D
      ipc/msg: introduce msgctl(MSG_STAT_ANY) · 23c8cec8
      Davidlohr Bueso 提交于
      There is a permission discrepancy when consulting msq ipc object
      metadata between /proc/sysvipc/msg (0444) and the MSG_STAT shmctl
      command.  The later does permission checks for the object vs S_IRUGO.
      As such there can be cases where EACCESS is returned via syscall but the
      info is displayed anyways in the procfs files.
      
      While this might have security implications via info leaking (albeit no
      writing to the msq metadata), this behavior goes way back and showing
      all the objects regardless of the permissions was most likely an
      overlook - so we are stuck with it.  Furthermore, modifying either the
      syscall or the procfs file can cause userspace programs to break (ie
      ipcs).  Some applications require getting the procfs info (without root
      privileges) and can be rather slow in comparison with a syscall -- up to
      500x in some reported cases for shm.
      
      This patch introduces a new MSG_STAT_ANY command such that the msq ipc
      object permissions are ignored, and only audited instead.  In addition,
      I've left the lsm security hook checks in place, as if some policy can
      block the call, then the user has no other choice than just parsing the
      procfs file.
      
      Link: http://lkml.kernel.org/r/20180215162458.10059-4-dave@stgolabs.netSigned-off-by: NDavidlohr Bueso <dbueso@suse.de>
      Reported-by: NRobert Kettler <robert.kettler@outlook.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Manfred Spraul <manfred@colorfullife.com>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      23c8cec8
    • D
      ipc/sem: introduce semctl(SEM_STAT_ANY) · a280d6dc
      Davidlohr Bueso 提交于
      There is a permission discrepancy when consulting shm ipc object
      metadata between /proc/sysvipc/sem (0444) and the SEM_STAT semctl
      command.  The later does permission checks for the object vs S_IRUGO.
      As such there can be cases where EACCESS is returned via syscall but the
      info is displayed anyways in the procfs files.
      
      While this might have security implications via info leaking (albeit no
      writing to the sma metadata), this behavior goes way back and showing
      all the objects regardless of the permissions was most likely an
      overlook - so we are stuck with it.  Furthermore, modifying either the
      syscall or the procfs file can cause userspace programs to break (ie
      ipcs).  Some applications require getting the procfs info (without root
      privileges) and can be rather slow in comparison with a syscall -- up to
      500x in some reported cases for shm.
      
      This patch introduces a new SEM_STAT_ANY command such that the sem ipc
      object permissions are ignored, and only audited instead.  In addition,
      I've left the lsm security hook checks in place, as if some policy can
      block the call, then the user has no other choice than just parsing the
      procfs file.
      
      Link: http://lkml.kernel.org/r/20180215162458.10059-3-dave@stgolabs.netSigned-off-by: NDavidlohr Bueso <dbueso@suse.de>
      Reported-by: NRobert Kettler <robert.kettler@outlook.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Manfred Spraul <manfred@colorfullife.com>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a280d6dc
    • D
      ipc/shm: introduce shmctl(SHM_STAT_ANY) · c21a6970
      Davidlohr Bueso 提交于
      Patch series "sysvipc: introduce STAT_ANY commands", v2.
      
      The following patches adds the discussed (see [1]) new command for shm
      as well as for sems and msq as they are subject to the same
      discrepancies for ipc object permission checks between the syscall and
      via procfs.  These new commands are justified in that (1) we are stuck
      with this semantics as changing syscall and procfs can break userland;
      and (2) some users can benefit from performance (for large amounts of
      shm segments, for example) from not having to parse the procfs
      interface.
      
      Once merged, I will submit the necesary manpage updates.  But I'm thinking
      something like:
      
      : diff --git a/man2/shmctl.2 b/man2/shmctl.2
      : index 7bb503999941..bb00bbe21a57 100644
      : --- a/man2/shmctl.2
      : +++ b/man2/shmctl.2
      : @@ -41,6 +41,7 @@
      :  .\" 2005-04-25, mtk -- noted aberrant Linux behavior w.r.t. new
      :  .\"	attaches to a segment that has already been marked for deletion.
      :  .\" 2005-08-02, mtk: Added IPC_INFO, SHM_INFO, SHM_STAT descriptions.
      : +.\" 2018-02-13, dbueso: Added SHM_STAT_ANY description.
      :  .\"
      :  .TH SHMCTL 2 2017-09-15 "Linux" "Linux Programmer's Manual"
      :  .SH NAME
      : @@ -242,6 +243,18 @@ However, the
      :  argument is not a segment identifier, but instead an index into
      :  the kernel's internal array that maintains information about
      :  all shared memory segments on the system.
      : +.TP
      : +.BR SHM_STAT_ANY " (Linux-specific)"
      : +Return a
      : +.I shmid_ds
      : +structure as for
      : +.BR SHM_STAT .
      : +However, the
      : +.I shm_perm.mode
      : +is not checked for read access for
      : +.IR shmid ,
      : +resembing the behaviour of
      : +/proc/sysvipc/shm.
      :  .PP
      :  The caller can prevent or allow swapping of a shared
      :  memory segment with the following \fIcmd\fP values:
      : @@ -287,7 +300,7 @@ operation returns the index of the highest used entry in the
      :  kernel's internal array recording information about all
      :  shared memory segments.
      :  (This information can be used with repeated
      : -.B SHM_STAT
      : +.B SHM_STAT/SHM_STAT_ANY
      :  operations to obtain information about all shared memory segments
      :  on the system.)
      :  A successful
      : @@ -328,7 +341,7 @@ isn't accessible.
      :  \fIshmid\fP is not a valid identifier, or \fIcmd\fP
      :  is not a valid command.
      :  Or: for a
      : -.B SHM_STAT
      : +.B SHM_STAT/SHM_STAT_ANY
      :  operation, the index value specified in
      :  .I shmid
      :  referred to an array slot that is currently unused.
      
      This patch (of 3):
      
      There is a permission discrepancy when consulting shm ipc object metadata
      between /proc/sysvipc/shm (0444) and the SHM_STAT shmctl command.  The
      later does permission checks for the object vs S_IRUGO.  As such there can
      be cases where EACCESS is returned via syscall but the info is displayed
      anyways in the procfs files.
      
      While this might have security implications via info leaking (albeit no
      writing to the shm metadata), this behavior goes way back and showing all
      the objects regardless of the permissions was most likely an overlook - so
      we are stuck with it.  Furthermore, modifying either the syscall or the
      procfs file can cause userspace programs to break (ie ipcs).  Some
      applications require getting the procfs info (without root privileges) and
      can be rather slow in comparison with a syscall -- up to 500x in some
      reported cases.
      
      This patch introduces a new SHM_STAT_ANY command such that the shm ipc
      object permissions are ignored, and only audited instead.  In addition,
      I've left the lsm security hook checks in place, as if some policy can
      block the call, then the user has no other choice than just parsing the
      procfs file.
      
      [1] https://lkml.org/lkml/2017/12/19/220
      
      Link: http://lkml.kernel.org/r/20180215162458.10059-2-dave@stgolabs.netSigned-off-by: NDavidlohr Bueso <dbueso@suse.de>
      Acked-by: NMichal Hocko <mhocko@suse.com>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Manfred Spraul <manfred@colorfullife.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Robert Kettler <robert.kettler@outlook.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c21a6970
  12. 11 4月, 2018 2 次提交
  13. 06 4月, 2018 4 次提交
  14. 05 4月, 2018 9 次提交
  15. 04 4月, 2018 2 次提交
  16. 03 4月, 2018 1 次提交
    • E
      signal: Correct the offset of si_pkey and si_lower in struct siginfo on m68k · 8420f719
      Eric W. Biederman 提交于
      The change moving addr_lsb into the _sigfault union failed to take
      into account that _sigfault._addr_bnd._lower being a pointer forced
      the entire union to have pointer alignment.  The fix for
      _sigfault._addr_bnd._lower having pointer alignment failed to take
      into account that m68k has a pointer alignment less than the size
      of a pointer.  So simply making the padding members pointers changed
      the location of later members in the structure.
      
      Fix this by directly computing the needed size of the padding members,
      and making the padding members char arrays of the needed size.  AKA
      if __alignof__(void *) is 1 sizeof(short) otherwise __alignof__(void *).
      Which should be exactly the same rules the compiler whould have
      used when computing the padding.
      
      I have tested this change by adding BUILD_BUG_ONs to m68k to verify
      the offset of every member of struct siginfo, and with those testing
      that the offsets of the fields in struct siginfo is the same before
      I changed the generic _sigfault member and after the correction
      to the _sigfault member.
      
      I have also verified that the x86 with it's own BUILD_BUG_ONs to verify
      the offsets of the siginfo members also compiles cleanly.
      
      Cc: stable@vger.kernel.org
      Reported-by: NEugene Syromiatnikov <esyr@redhat.com>
      Fixes: 859d880c ("signal: Correct the offset of si_pkey in struct siginfo")
      Fixes: b68a68d3 ("signal: Move addr_lsb into the _sigfault union for clarity")
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      8420f719
  17. 01 4月, 2018 2 次提交
    • J
      tipc: avoid possible string overflow · 7494cfa6
      Jon Maloy 提交于
      gcc points out that the combined length of the fixed-length inputs to
      l->name is larger than the destination buffer size:
      
      net/tipc/link.c: In function 'tipc_link_create':
      net/tipc/link.c:465:26: error: '%s' directive writing up to 32 bytes
      into a region of size between 26 and 58 [-Werror=format-overflow=]
      sprintf(l->name, "%s:%s-%s:unknown", self_str, if_name, peer_str);
      
      net/tipc/link.c:465:2: note: 'sprintf' output 11 or more bytes
      (assuming 75) into a destination of size 60
      sprintf(l->name, "%s:%s-%s:unknown", self_str, if_name, peer_str);
      
      A detailed analysis reveals that the theoretical maximum length of
      a link name is:
      max self_str + 1 + max if_name + 1 + max peer_str + 1 + max if_name =
      16 + 1 + 15 + 1 + 16 + 1 + 15 = 65
      Since we also need space for a trailing zero we now set MAX_LINK_NAME
      to 68.
      
      Just to be on the safe side we also replace the sprintf() call with
      snprintf().
      
      Fixes: 25b0b9c4 ("tipc: handle collisions of 32-bit node address
      hash values")
      Reported-by: NArnd Bergmann <arnd@arndb.de>
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7494cfa6
    • J
      tipc: tipc: rename address types in user api · 7a74d39c
      Jon Maloy 提交于
      The three address type structs in the user API have names that in
      reality reflect the specific, non-Linux environment where they were
      originally created.
      
      We now give them more intuitive names, in accordance with how TIPC is
      described in the current documentation.
      
      struct tipc_portid   -> struct tipc_socket_addr
      struct tipc_name     -> struct tipc_service_addr
      struct tipc_name_seq -> struct tipc_service_range
      
      To avoid confusion, we also update some commmets and macro names to
       match the new terminology.
      
      For compatibility, we add macros that map all old names to the new ones.
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7a74d39c
  18. 31 3月, 2018 1 次提交
    • A
      bpf: Post-hooks for sys_bind · aac3fc32
      Andrey Ignatov 提交于
      "Post-hooks" are hooks that are called right before returning from
      sys_bind. At this time IP and port are already allocated and no further
      changes to `struct sock` can happen before returning from sys_bind but
      BPF program has a chance to inspect the socket and change sys_bind
      result.
      
      Specifically it can e.g. inspect what port was allocated and if it
      doesn't satisfy some policy, BPF program can force sys_bind to fail and
      return EPERM to user.
      
      Another example of usage is recording the IP:port pair to some map to
      use it in later calls to sys_connect. E.g. if some TCP server inside
      cgroup was bound to some IP:port_n, it can be recorded to a map. And
      later when some TCP client inside same cgroup is trying to connect to
      127.0.0.1:port_n, BPF hook for sys_connect can override the destination
      and connect application to IP:port_n instead of 127.0.0.1:port_n. That
      helps forcing all applications inside a cgroup to use desired IP and not
      break those applications if they e.g. use localhost to communicate
      between each other.
      
      == Implementation details ==
      
      Post-hooks are implemented as two new attach types
      `BPF_CGROUP_INET4_POST_BIND` and `BPF_CGROUP_INET6_POST_BIND` for
      existing prog type `BPF_PROG_TYPE_CGROUP_SOCK`.
      
      Separate attach types for IPv4 and IPv6 are introduced to avoid access
      to IPv6 field in `struct sock` from `inet_bind()` and to IPv4 field from
      `inet6_bind()` since those fields might not make sense in such cases.
      Signed-off-by: NAndrey Ignatov <rdna@fb.com>
      Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      aac3fc32