1. 30 Jan, 2018 7 commits
  2. 29 Jan, 2018 1 commit
  3. 26 Jan, 2018 2 commits
  4. 15 Jan, 2018 4 commits
  5. 14 Jan, 2018 1 commit
  6. 12 Jan, 2018 1 commit
  7. 11 Jan, 2018 1 commit
  8. 10 Jan, 2018 2 commits
    • D
      bpf: avoid false sharing of map refcount with max_entries · be95a845
      Daniel Borkmann committed
      In addition to commit b2157399 ("bpf: prevent out-of-bounds
      speculation") also change the layout of struct bpf_map such that
      false sharing of fast-path members like max_entries is avoided
      when the map's reference counter is altered. Therefore enforce
      them to be placed into separate cachelines.
      
      pahole dump after change:
      
        struct bpf_map {
              const struct bpf_map_ops  * ops;                 /*     0     8 */
              struct bpf_map *           inner_map_meta;       /*     8     8 */
              void *                     security;             /*    16     8 */
              enum bpf_map_type          map_type;             /*    24     4 */
              u32                        key_size;             /*    28     4 */
              u32                        value_size;           /*    32     4 */
              u32                        max_entries;          /*    36     4 */
              u32                        map_flags;            /*    40     4 */
              u32                        pages;                /*    44     4 */
              u32                        id;                   /*    48     4 */
              int                        numa_node;            /*    52     4 */
              bool                       unpriv_array;         /*    56     1 */
      
              /* XXX 7 bytes hole, try to pack */
      
              /* --- cacheline 1 boundary (64 bytes) --- */
              struct user_struct *       user;                 /*    64     8 */
              atomic_t                   refcnt;               /*    72     4 */
              atomic_t                   usercnt;              /*    76     4 */
              struct work_struct         work;                 /*    80    32 */
              char                       name[16];             /*   112    16 */
              /* --- cacheline 2 boundary (128 bytes) --- */
      
              /* size: 128, cachelines: 2, members: 17 */
              /* sum members: 121, holes: 1, sum holes: 7 */
        };
      
      Now all entries in the first cacheline are read-only throughout
      the lifetime of the map, set up once during map creation. The overall
      struct size and number of cachelines do not change with the
      reordering. struct bpf_map is usually the first member embedded
      in the map structs of specific map implementations, so also avoid
      letting those members sit at the end, where they could share a
      cacheline with the first map values, e.g. in the array map: remote
      CPUs could trigger map updates for those just as well (easily and
      intentionally dirtying members like max_entries) while
      having subsequent values in cache.
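
The layout idea above can be sketched in portable C11. This is only an illustration under stated assumptions: `struct demo_map`, `DEMO_CACHELINE` (assumed to be 64 bytes) and the helper are hypothetical names, and standard `alignas` is swapped in for whatever alignment annotation the kernel itself uses. It is not the real `struct bpf_map`.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_CACHELINE 64 /* assumed cacheline size for this sketch */

/* Hypothetical, reduced stand-in for a map header. */
struct demo_map {
    /* Read-mostly members, written once at creation time. */
    uint32_t key_size;
    uint32_t value_size;
    uint32_t max_entries;

    /* Frequently written members start on their own cacheline. */
    alignas(DEMO_CACHELINE) int refcnt;
    int usercnt;
};

/* Returns nonzero when two member offsets fall into different cachelines. */
static int demo_on_separate_cachelines(size_t off_a, size_t off_b)
{
    return off_a / DEMO_CACHELINE != off_b / DEMO_CACHELINE;
}

/* Compile-time check that the write-hot part is cacheline aligned. */
static_assert(offsetof(struct demo_map, refcnt) % DEMO_CACHELINE == 0,
              "refcnt must start a cacheline");
```

With this layout, a write to `refcnt` invalidates only the second cacheline, so readers of `max_entries` on other CPUs keep their cached copy.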
      
      Quoting from Google's Project Zero blog [1]:
      
        Additionally, at least on the Intel machine on which this was
        tested, bouncing modified cache lines between cores is slow,
        apparently because the MESI protocol is used for cache coherence
        [8]. Changing the reference counter of an eBPF array on one
        physical CPU core causes the cache line containing the reference
        counter to be bounced over to that CPU core, making reads of the
        reference counter on all other CPU cores slow until the changed
        reference counter has been written back to memory. Because the
        length and the reference counter of an eBPF array are stored in
        the same cache line, this also means that changing the reference
        counter on one physical CPU core causes reads of the eBPF array's
        length to be slow on other physical CPU cores (intentional false
        sharing).
      
      While this doesn't 'control' the out-of-bounds speculation through
      masking the index as in commit b2157399, triggering a manipulation
      of the map's reference counter is really trivial, so let's not allow
      it to easily affect max_entries.
      
      Splitting to separate cachelines also generally makes sense from
      a performance perspective anyway in that fast-path won't have a
      cache miss if the map gets pinned, reused in other progs, etc out
      of control path, thus also avoids unintentional false sharing.
      
        [1] https://googleprojectzero.blogspot.ch/2018/01/reading-privileged-memory-with-side.html

      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • S
  9. 09 Jan, 2018 4 commits
    • B
      drm/bridge/synopsys: stop clobbering drvdata · 8242ecbd
      Brian Norris committed
      Bridge drivers/helpers shouldn't be clobbering the drvdata, since a
      parent driver might need to own this. Instead, let's return our
      'dw_mipi_dsi' object and have callers pass that back to us for removal.
      Signed-off-by: Brian Norris <briannorris@chromium.org>
      Reviewed-by: Matthias Kaehlcke <mka@chromium.org>
      Reviewed-by: Archit Taneja <architt@codeaurora.org>
      Acked-by: Philippe Cornu <philippe.cornu@st.com>
      Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171128010538.119114-1-briannorris@chromium.org
    • A
      bpf: prevent out-of-bounds speculation · b2157399
      Alexei Starovoitov committed
      Under speculation, CPUs may mis-predict branches in bounds checks. Thus,
      memory accesses under a bounds check may be speculated even if the
      bounds check fails, providing a primitive for building a side channel.
      
      To avoid leaking kernel data, round up array-based maps and mask the index
      after the bounds check, so a speculated load with an out-of-bounds index will
      load either a valid value from the array or zero from the padded area.
      
      Unconditionally mask index for all array types even when max_entries
      are not rounded to power of 2 for root user.
      When map is created by unpriv user generate a sequence of bpf insns
      that includes AND operation to make sure that JITed code includes
      the same 'index & index_mask' operation.
      
      If prog_array map is created by unpriv user replace
        bpf_tail_call(ctx, map, index);
      with
        if (index < max_entries) {
          index &= map->index_mask;
          bpf_tail_call(ctx, map, index);
        }
      (along with roundup to power 2) to prevent out-of-bounds speculation.
      There is secondary redundant 'if (index >= max_entries)' in the interpreter
      and in all JITs, but they can be optimized later if necessary.
      
      Other array-like maps (cpumap, devmap, sockmap, perf_event_array, cgroup_array)
      cannot be used by unpriv, so no changes there.
      
      That fixes bpf side of "Variant 1: bounds check bypass (CVE-2017-5753)" on
      all architectures with and without JIT.
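
The rounding and masking described above can be sketched in plain C. This is a minimal illustration of the arithmetic only, not the kernel's code and not a verified speculation mitigation: `struct demo_array`, `DEMO_CAPACITY` and the `demo_*` functions are hypothetical, while `max_entries` and `index_mask` follow the names used in the message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_CAPACITY 8u /* fixed power-of-two backing store for the sketch */

/* Hypothetical array map: storage is padded to a power of two. */
struct demo_array {
    uint32_t max_entries;           /* logical number of elements */
    uint32_t index_mask;            /* roundup_pow2(max_entries) - 1 */
    uint64_t values[DEMO_CAPACITY]; /* padding beyond max_entries stays zero */
};

/* Smallest power of two >= n, for 1 <= n <= DEMO_CAPACITY. */
static uint32_t demo_roundup_pow2(uint32_t n)
{
    uint32_t p = 1;
    while (p < n)
        p <<= 1;
    return p;
}

/* Returns 0 on success, -1 if max_entries does not fit the sketch. */
static int demo_array_init(struct demo_array *a, uint32_t max_entries)
{
    if (max_entries == 0 || max_entries > DEMO_CAPACITY)
        return -1;
    memset(a->values, 0, sizeof(a->values));
    a->max_entries = max_entries;
    a->index_mask = demo_roundup_pow2(max_entries) - 1;
    return 0;
}

/* Bounds check first, then mask, so any index that reaches the load
 * (even a mispredicted one) stays inside the padded storage. */
static const uint64_t *demo_lookup(const struct demo_array *a, uint32_t index)
{
    if (index >= a->max_entries)
        return NULL;
    return &a->values[index & a->index_mask];
}
```

For `max_entries = 5` the mask is 7, so a masked index can only select one of the eight padded slots: either a valid element or a zero-filled pad entry.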
      
      v2->v3:
      Daniel noticed that attack potentially can be crafted via syscall commands
      without loading the program, so add masking to those paths as well.
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: John Fastabend <john.fastabend@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • M
      sctp: fix the handling of ICMP Frag Needed for too small MTUs · b6c5734d
      Marcelo Ricardo Leitner committed
      syzbot reported a hang involving SCTP, on which it kept flooding dmesg
      with the message:
      [  246.742374] sctp: sctp_transport_update_pmtu: Reported pmtu 508 too
      low, using default minimum of 512
      
      That happened because whenever SCTP hits an ICMP Frag Needed, it tries
      to adjust to the new MTU and triggers an immediate retransmission. But
      it didn't consider the fact that MTUs smaller than the SCTP minimum MTU
      allowed (512) would not cause the PMTU to change, and issued the
      retransmission anyway (thus leading to another ICMP Frag Needed, and so
      on).
      
      As IPv4 (ip_rt_min_pmtu=556) and IPv6 (IPV6_MIN_MTU=1280) minimum MTU
      are higher than that, sctp_transport_update_pmtu() is changed to
      re-fetch the PMTU that got set after our request, and with that, detect
      if there was an actual change or not.
      
      The fix, thus, skips the immediate retransmission if the received ICMP
      resulted in no change, in the hope that SCTP will select another path.
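
The "retransmit only if the PMTU really changed" logic can be sketched as follows. This is a simplified model under assumptions: every `demo_*` name is hypothetical, the routing layer is reduced to a clamp against a caller-supplied floor, and 512 is the SCTP minimum quoted in the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_SCTP_MIN_MTU 512u /* SCTP floor quoted in the commit message */

/* Hypothetical per-path state. */
struct demo_transport {
    uint32_t pathmtu;
};

/* Stand-in for the IP layer: it may clamp the request to its own minimum. */
static uint32_t demo_route_apply_pmtu(uint32_t requested, uint32_t route_min)
{
    return requested < route_min ? route_min : requested;
}

/* Apply a reported PMTU, re-fetch the effective value, and report
 * whether the transport's PMTU actually changed. */
static bool demo_update_pmtu(struct demo_transport *t, uint32_t reported,
                             uint32_t route_min)
{
    uint32_t wanted = reported < DEMO_SCTP_MIN_MTU ? DEMO_SCTP_MIN_MTU : reported;
    uint32_t effective = demo_route_apply_pmtu(wanted, route_min);
    bool changed = effective != t->pathmtu;

    t->pathmtu = effective;
    return changed;
}

/* ICMP Frag Needed handler: returns true when an immediate
 * retransmission should be triggered. */
static bool demo_on_frag_needed(struct demo_transport *t, uint32_t reported,
                                uint32_t route_min)
{
    return demo_update_pmtu(t, reported, route_min);
}
```

A second ICMP reporting the same too-small MTU leaves `pathmtu` untouched, so no retransmission is issued and the loop is broken.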
      
      Note: The value being used for the minimum MTU (512,
      SCTP_DEFAULT_MINSEGMENT) is not right and instead it should be (576,
      SCTP_MIN_PMTU), but such change belongs to another patch.
      
      Changes from v1:
      - do not disable PMTU discovery, in the light of commit
        06ad3919 ("[SCTP] Don't disable PMTU discovery when mtu is small")
        and as suggested by Xin Long.
      - changed the way to break the rtx loop by detecting if the icmp
        resulted in a change or not
      Changes from v2:
      none
      
      See-also: https://lkml.org/lkml/2017/12/22/811
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • I
      locking/lockdep: Remove cross-release leftovers · 527187d2
      Ingo Molnar committed
      There are two cross-release leftover facilities:
      
       - the crossrelease_hist_*() irq-tracing callbacks (NOPs currently)
       - the complete_release_commit() callback (NOP as well)
      
      Remove them.
      
      Cc: David Sterba <dsterba@suse.com>
      Cc: Byungchul Park <byungchul.park@lge.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  10. 08 Jan, 2018 1 commit
  11. 06 Jan, 2018 2 commits
  12. 05 Jan, 2018 1 commit
  13. 04 Jan, 2018 1 commit
  14. 03 Jan, 2018 2 commits
    • F
      uapi libc compat: add fallback for unsupported libcs · c0bace79
      Felix Janda committed
      libc-compat.h aims to prevent symbol collisions between uapi and libc
      headers for each supported libc. This requires continuous coordination
      between them.
      
      The goal of this commit is to improve the situation for libcs (such as
      musl) which are not yet supported and/or do not wish to be explicitly
      supported, while not affecting supported libcs. More precisely, with
      this commit, unsupported libcs can request the suppression of any
      specific uapi definition by defining the corresponding _UAPI_DEF_*
      macro as 0. This can fix symbol collisions for them, as long as the
      libc headers are included before the uapi headers. Inclusion in the
      other order is outside the scope of this commit.
      
      All infrastructure in order to enable this fallback for unsupported
      libcs is already in place, except that libc-compat.h unconditionally
      defines all _UAPI_DEF_* macros to 1 for all unsupported libcs so that
      any previous definitions are ignored. In order to fix this, this commit
      merely makes these definitions conditional.
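
The conditional-definition pattern described above can be sketched in one translation unit. The names here (`DEMO_UAPI_DEF_IN6_ADDR`, `struct demo_in6_addr`) are hypothetical stand-ins rather than the real header contents; the first part plays the role of a libc header included first, the second part the role of the uapi header.

```c
#include <assert.h>

/* --- "libc header" part (hypothetical): provides the type itself and
 * asks the uapi side to suppress its own copy. --- */
struct demo_in6_addr {
    unsigned char s6_addr[16];
};
#define DEMO_UAPI_DEF_IN6_ADDR 0

/* --- "uapi header" part (hypothetical): only default the switch to 1
 * when nobody has set it before, instead of overriding it. --- */
#ifndef DEMO_UAPI_DEF_IN6_ADDR
#define DEMO_UAPI_DEF_IN6_ADDR 1
#endif

#if DEMO_UAPI_DEF_IN6_ADDR
/* Skipped here because the libc part set the switch to 0, so there is
 * no redefinition error. */
struct demo_in6_addr {
    unsigned char s6_addr[16];
};
#endif

/* Reports whether the uapi side emitted its own definition. */
static int demo_uapi_defines_in6_addr(void)
{
    return DEMO_UAPI_DEF_IN6_ADDR;
}
```

If the `#ifndef` guard were replaced by an unconditional `#define ... 1`, the second struct definition would be compiled and the redefinition error would come back.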
      
      This commit together with the musl libc commit
      
      http://git.musl-libc.org/cgit/musl/commit/?id=04983f2272382af92eb8f8838964ff944fbb8258
      
      fixes for example the following compiler errors when <linux/in6.h> is
      included after musl's <netinet/in.h>:
      
      ./linux/in6.h:32:8: error: redefinition of 'struct in6_addr'
      ./linux/in6.h:49:8: error: redefinition of 'struct sockaddr_in6'
      ./linux/in6.h:59:8: error: redefinition of 'struct ipv6_mreq'
      
      The comments referencing glibc are still correct, but this file is not
      only used for glibc any more.
      Signed-off-by: Felix Janda <felix.janda@posteo.de>
      Reviewed-by: Hauke Mehrtens <hauke@hauke-m.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • A
      efi/capsule-loader: Reinstate virtual capsule mapping · f24c4d47
      Ard Biesheuvel committed
      Commit:
      
        82c3768b ("efi/capsule-loader: Use a cached copy of the capsule header")
      
      ... refactored the capsule loading code that maps the capsule header,
      to avoid having to map it several times.
      
      However, as it turns out, the vmap() call we ended up removing did not
      just map the header, but the entire capsule image, and dropping this
      virtual mapping breaks capsules that are processed by the firmware
      immediately (i.e., without a reboot).
      
      Unfortunately, that change was part of a larger refactor that allowed
      a quirk to be implemented for Quark, which has a non-standard memory
      layout for capsules, and we have slightly painted ourselves into a
      corner by allowing quirk code to mangle the capsule header and memory
      layout.
      
      So we need to fix this without breaking Quark. Fortunately, Quark does
      not appear to care about the virtual mapping, and so we can simply
      do a partial revert of commit:
      
        2a457fb3 ("efi/capsule-loader: Use page addresses rather than struct page pointers")
      
      ... and create a vmap() mapping of the entire capsule (including header)
      based on the reinstated struct page array, unless running on Quark, in
      which case we pass the capsule header copy as before.
      Reported-by: Ge Song <ge.song@hxt-semitech.com>
      Tested-by: Bryan O'Donoghue <pure.logic@nexus-software.ie>
      Tested-by: Ge Song <ge.song@hxt-semitech.com>
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: <stable@vger.kernel.org>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-efi@vger.kernel.org
      Fixes: 82c3768b ("efi/capsule-loader: Use a cached copy of the capsule header")
      Link: http://lkml.kernel.org/r/20180102172110.17018-3-ard.biesheuvel@linaro.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  15. 02 Jan, 2018 3 commits
    • D
      fscache: Fix the default for fscache_maybe_release_page() · 98801506
      David Howells committed
      Fix the default for fscache_maybe_release_page() for when the cookie isn't
      valid or the page isn't cached.  It mustn't return false as that indicates
      the page cannot yet be freed.
      
      The problem with the default is that if, say, there's no cache, but a
      network filesystem's pages are using up almost all the available memory, a
      system can OOM because the filesystem ->releasepage() op will not allow
      them to be released as fscache_maybe_release_page() incorrectly prevents
      it.
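
The corrected default can be sketched with simplified stand-in types. Everything prefixed `demo_` is hypothetical and only mirrors the decision described above: when the cache has nothing to do with the page, the answer is "may be released".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced stand-ins for the cookie and page state. */
struct demo_cookie {
    bool valid;
};

struct demo_page {
    bool cached;        /* page is known to the cache */
    bool being_written; /* cache is still storing it */
};

/* Stand-in for asking the cache itself; false means "not yet". */
static bool demo_cache_release_page(const struct demo_page *page)
{
    return !page->being_written;
}

/* Returns true if the page may be freed. The default for "no valid
 * cookie" or "page not cached" is true, since false would tell the
 * caller that the page cannot yet be released. */
static bool demo_maybe_release_page(const struct demo_cookie *cookie,
                                    const struct demo_page *page)
{
    if (cookie != NULL && cookie->valid && page->cached)
        return demo_cache_release_page(page);
    return true;
}
```

Only a page that the cache is actively working on can block release in this sketch; every other case falls through to the permissive default.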
      
      This can be tested by writing a sequence of 512MiB files to an AFS mount.
      It does not affect NFS or CIFS because both of those wrap the call in a
      check of PG_fscache and it shouldn't bother Ceph as that only has
      PG_private set whilst writeback is in progress.  This might be an issue for
      9P, however.
      
      Note that the pages aren't entirely stuck.  Removing a file or unmounting
      will clear things because that uses ->invalidatepage() instead.
      
      Fixes: 201a1542 ("FS-Cache: Handle pages pending storage that get evicted under OOM conditions")
      Reported-by: Marc Dionne <marc.dionne@auristor.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Jeff Layton <jlayton@redhat.com>
      Acked-by: Al Viro <viro@zeniv.linux.org.uk>
      Tested-by: Marc Dionne <marc.dionne@auristor.com>
      cc: stable@vger.kernel.org # 2.6.32+
    • M
      drm/exynos: ipp: Remove Exynos DRM IPP subsystem · 8ded5941
      Marek Szyprowski committed
      The Exynos DRM IPP subsystem is in fact non-functional and, frankly
      speaking, dead code. This patch clearly marks the Exynos DRM IPP subsystem
      as broken and never really functional. It will be replaced by a completely
      rewritten API.
      
      Exynos DRM IPP user-space API can be obsoleted for the following
      reasons:
      
      1. Exynos DRM IPP user-space API can be optional in Exynos DRM, so
      userspace should not rely on it always being available and should have
      a software fallback in case it is not there.
      
      2. The only mode which was initially semi-working was memory-to-memory
      image processing. The remaining modes (LCD-"writeback" and "output")
      were never operational due to missing code (both in mainline and even
      vendor kernels).
      
      3. Exynos DRM IPP mainline user-space API compatibility for
      memory-to-memory got broken very early by commit 083500ba ("drm:
      remove DRM_FORMAT_NV12MT"), which removed the support for tiled formats,
      the main feature which made this API somehow useful on Exynos platforms
      (the video codec at that time produced only tiled frames; to implement
      xvideo or any other video overlay, one has to de-tile them for proper
      display).
      
      4. Broken drivers. Especially once support for IOMMU has been added,
      it revealed that drivers don't configure DMA operations properly and in
      many cases operate outside the provided buffers trashing memory around.
      
      5. Need for external patches. Although the IPP user-space API has been used
      in some vendor kernels, in such cases additional patches were
      applied (like reverting the mentioned 083500ba patch), which means that
      userspace apps which might use it still won't work with the
      mainline kernel version.
      
      We don't have time machines, so we cannot change it, but Exynos DRM IPP
      extension should never have been merged to mainline in that form.
      
      Exynos IPP subsystem and user-space API will be rewritten, so remove
      current IPP core code and mark existing drivers as BROKEN.
      Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
      Acked-by: Daniel Stone <daniels@collabora.com>
      Acked-by: Krzysztof Kozlowski <krzk@kernel.org>
      Signed-off-by: Inki Dae <inki.dae@samsung.com>
    • K
      drm/exynos/decon: Move headers from global to local place · 4f52e550
      Krzysztof Kozlowski committed
      The DECON headers contain only defines for registers.  No other
      drivers use them, so they should be kept local to the Exynos DRM
      driver.  Keeping headers local helps manage the code.
      Suggested-by: Marek Szyprowski <m.szyprowski@samsung.com>
      Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
      Signed-off-by: Inki Dae <inki.dae@samsung.com>
  16. 30 Dec, 2017 3 commits
    • T
      timers: Reinitialize per cpu bases on hotplug · 26456f87
      Thomas Gleixner committed
      The timer wheel bases are not (re)initialized on CPU hotplug. That leaves
      them with a potentially stale clk and next_expiry value, which can cause
      trouble when the CPU is plugged.
      
      Add a prepare callback which forwards the clock, sets next_expiry to far in
      the future and resets the control flags to a known state.
      
      Set base->must_forward_clk so the first timer which is queued will try to
      forward the clock to current jiffies.
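
A prepare callback of the kind described can be sketched as below. This is a reduced model, not the kernel implementation: `struct demo_timer_base`, `DEMO_FAR_FUTURE`, the `is_idle` flag and the function name are assumptions for illustration, while `clk`, `next_expiry` and `must_forward_clk` follow the names in the message.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed "far in the future" distance for this sketch. */
#define DEMO_FAR_FUTURE ((1UL << 30) - 1)

/* Hypothetical, reduced per-CPU timer base. */
struct demo_timer_base {
    unsigned long clk;         /* wheel clock, in ticks */
    unsigned long next_expiry; /* earliest pending expiry */
    bool is_idle;              /* control flag (assumed name) */
    bool must_forward_clk;     /* first queued timer forwards clk */
};

/* Called before a CPU comes up: discard any stale state left over
 * from the last time this CPU was online. */
static void demo_timers_prepare_cpu(struct demo_timer_base *base,
                                    unsigned long now)
{
    base->clk = now;                           /* forward the clock */
    base->next_expiry = now + DEMO_FAR_FUTURE; /* nothing pending yet */
    base->is_idle = false;                     /* known flag state */
    base->must_forward_clk = true;
}
```

After the callback runs, a base that was left with an old `clk` and `next_expiry` starts from the current tick count instead of replaying the gap.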
      
      Fixes: 500462a9 ("timers: Switch to a non-cascading wheel")
      Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Sebastian Siewior <bigeasy@linutronix.de>
      Cc: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1712272152200.2431@nanos
    • T
      genirq/irqdomain: Rename early argument of irq_domain_activate_irq() · 702cb0a0
      Thomas Gleixner committed
      The 'early' argument of irq_domain_activate_irq() is actually used to
      denote reservation mode. To avoid confusion, rename it before abuse
      happens.
      
      No functional change.
      
      Fixes: 72491643 ("genirq/irqdomain: Update irq_domain_ops.activate() signature")
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Alexandru Chirvasitu <achirvasub@gmail.com>
      Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Dou Liyang <douly.fnst@cn.fujitsu.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Maciej W. Rozycki <macro@linux-mips.org>
      Cc: Mikael Pettersson <mikpelinux@gmail.com>
      Cc: Josh Poulson <jopoulso@microsoft.com>
      Cc: Mihai Costache <v-micos@microsoft.com>
      Cc: Stephen Hemminger <sthemmin@microsoft.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: linux-pci@vger.kernel.org
      Cc: Haiyang Zhang <haiyangz@microsoft.com>
      Cc: Dexuan Cui <decui@microsoft.com>
      Cc: Simon Xiao <sixiao@microsoft.com>
      Cc: Saeed Mahameed <saeedm@mellanox.com>
      Cc: Jork Loeser <Jork.Loeser@microsoft.com>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: devel@linuxdriverproject.org
      Cc: KY Srinivasan <kys@microsoft.com>
      Cc: Alan Cox <alan@linux.intel.com>
      Cc: Sakari Ailus <sakari.ailus@intel.com>
      Cc: linux-media@vger.kernel.org
    • T
      genirq: Introduce IRQD_CAN_RESERVE flag · 69790ba9
      Thomas Gleixner committed
      Add a new flag to mark interrupts which can use reservation mode. This is
      going to be used in subsequent patches to disable reservation mode for a
      certain class of MSI devices.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Alexandru Chirvasitu <achirvasub@gmail.com>
      Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Dou Liyang <douly.fnst@cn.fujitsu.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Maciej W. Rozycki <macro@linux-mips.org>
      Cc: Mikael Pettersson <mikpelinux@gmail.com>
      Cc: Josh Poulson <jopoulso@microsoft.com>
      Cc: Mihai Costache <v-micos@microsoft.com>
      Cc: Stephen Hemminger <sthemmin@microsoft.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: linux-pci@vger.kernel.org
      Cc: Haiyang Zhang <haiyangz@microsoft.com>
      Cc: Dexuan Cui <decui@microsoft.com>
      Cc: Simon Xiao <sixiao@microsoft.com>
      Cc: Saeed Mahameed <saeedm@mellanox.com>
      Cc: Jork Loeser <Jork.Loeser@microsoft.com>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: devel@linuxdriverproject.org
      Cc: KY Srinivasan <kys@microsoft.com>
      Cc: Alan Cox <alan@linux.intel.com>
      Cc: Sakari Ailus <sakari.ailus@intel.com>
      Cc: linux-media@vger.kernel.org
  17. 29 Dec, 2017 1 commit
  18. 28 Dec, 2017 3 commits