1. 15 Sep, 2017 1 commit
    • J
      livepatch: introduce shadow variable API · 439e7271
      Joe Lawrence committed
      Add exported API for livepatch modules:
      
        klp_shadow_get()
        klp_shadow_alloc()
        klp_shadow_get_or_alloc()
        klp_shadow_free()
        klp_shadow_free_all()
      
      that implement "shadow" variables, which allow callers to associate new
      shadow fields to existing data structures.  This is intended to be used
      by livepatch modules seeking to emulate additions to data structure
      definitions.
      
      See Documentation/livepatch/shadow-vars.txt for a summary of the new
      shadow variable API, including a few common use cases.
      
      See samples/livepatch/livepatch-shadow-* for example modules that
      demonstrate shadow variables.
      
      [jkosina@suse.cz: fix __klp_shadow_get_or_alloc() comment as spotted by
       Josh]
      Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com>
      Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Acked-by: Miroslav Benes <mbenes@suse.cz>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      439e7271
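The idea behind shadow variables in 439e7271 can be illustrated with a small user-space model. This is only a sketch under stated assumptions: the `toy_shadow_*` names, the singly linked list, and the use of `malloc()` are hypothetical stand-ins chosen for illustration, and they are not the kernel implementation or its signatures. What the sketch does show is the core concept from the commit message: extra data is attached to an existing object by keying it on an <obj, id> pair, so the object's own definition never changes.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One shadow variable: extra data attached to an <obj, id> pair. */
struct toy_shadow {
	struct toy_shadow *next;
	void *obj;		/* existing object being "extended" */
	unsigned long id;	/* which shadow field this is */
	char data[];		/* caller-sized payload */
};

/* Hypothetical global registry; a plain list keeps the sketch short. */
static struct toy_shadow *toy_shadow_list;

/* Return the payload attached to <obj, id>, or NULL if there is none. */
void *toy_shadow_get(void *obj, unsigned long id)
{
	struct toy_shadow *s;

	for (s = toy_shadow_list; s; s = s->next)
		if (s->obj == obj && s->id == id)
			return s->data;
	return NULL;
}

/*
 * Attach a new payload of `size` bytes to <obj, id>.
 * In this toy model a duplicate <obj, id> is treated as an error (NULL).
 */
void *toy_shadow_alloc(void *obj, unsigned long id,
		       const void *data, size_t size)
{
	struct toy_shadow *s;

	if (toy_shadow_get(obj, id))
		return NULL;

	s = malloc(sizeof(*s) + size);
	if (!s)
		return NULL;

	s->obj = obj;
	s->id = id;
	if (data)
		memcpy(s->data, data, size);
	else
		memset(s->data, 0, size);

	s->next = toy_shadow_list;
	toy_shadow_list = s;
	return s->data;
}

/* Detach and free the payload attached to <obj, id>, if any. */
void toy_shadow_free(void *obj, unsigned long id)
{
	struct toy_shadow **pp = &toy_shadow_list;

	while (*pp) {
		struct toy_shadow *s = *pp;

		if (s->obj == obj && s->id == id) {
			*pp = s->next;
			free(s);
			return;
		}
		pp = &s->next;
	}
}
```

A patched function would look up its extra field with the get call each time it handles the object, and the free call would run wherever the original object is released. The real API and its locking rules are described in Documentation/livepatch/shadow-vars.txt.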
  2. 22 Jun, 2017 4 commits
  3. 20 Jun, 2017 9 commits
    • P
      livepatch: Fix stacking of patches with respect to RCU · 842c0884
      Petr Mladek committed
      rcu_read_(un)lock(), list_*_rcu(), and synchronize_rcu() are used for safe
      access to and manipulation of the list of patches that modify the same function.
      In particular, it is the variable func_stack that is accessible from the ftrace
      handler via struct ftrace_ops and klp_ops.
      
      Of course, it also synchronizes some states of the patch on the top of the
      stack, e.g. func->transition in klp_ftrace_handler.
      
      At the same time, this mechanism also guards the manipulation of
      task->patch_state. It is modified according to the state of the transition and
      the state of the process.
      
      Now, all this works well as long as RCU works well. Sadly, livepatching might
      get into some corner cases where this is not true. For example, RCU is not
      watching when rcu_read_lock() is taken in idle threads.  This is because they
      might sleep and prevent reaching the grace period for too long.
      
      There are ways to make RCU watch even in idle threads, see
      rcu_irq_enter(). But there is a small location inside the RCU infrastructure
      where even this does not work.
      
      This small problematic location can be detected either before calling
      rcu_irq_enter() by rcu_irq_enter_disabled() or later by rcu_is_watching().
      Sadly, there is no safe way to handle it.  Once we detect that RCU was not
      watching, we might see an inconsistent state of the function stack and the
      related variables in klp_ftrace_handler(). Then we could make a wrong decision,
      use an incompatible implementation of the function and break the consistency of
      the system. We could warn but we could not avoid the damage.
      
      Fortunately, ftrace has similar problems and they seem to be solved well there.
      It uses a heavyweight implementation of some RCU operations. In particular, it
      replaces:
      
        + rcu_read_lock() with preempt_disable_notrace()
        + rcu_read_unlock() with preempt_enable_notrace()
        + synchronize_rcu() with schedule_on_each_cpu(sync_work)
      
      My understanding is that this is an RCU implementation from the stone age. It
      meets the core RCU requirements but it is rather inefficient. In particular, it
      does not allow batching or speeding up the synchronize calls.
      
      On the other hand, it is very simple. It makes it possible to safely trace and/or
      livepatch even the RCU core infrastructure.  And the efficiency is not a
      big issue because using ftrace or livepatches on production systems is a rare
      operation.  Safety is much more important than a negligible extra load.
      
      Note that the alternative implementation follows the RCU principles. Therefore,
      we could and actually must use list_*_rcu() variants when manipulating the
      func_stack.  These functions access the pointers in the right order and with
      the right barriers. But they do not use any other information that would be
      set only by rcu_read_lock().
      
      Also note that there are actually two problems solved in ftrace:
      
      First, it cares about the consistency of RCU read sections.  This is solved
      in the way described and used in this patch.
      
      Second, ftrace needs to make sure that nobody is inside the dynamic trampoline
      when it is being freed. For this, it also calls synchronize_rcu_tasks() in
      preemptible kernels in ftrace_shutdown().
      
      Livepatch has a similar problem but it is solved by ftrace for free.
      klp_ftrace_handler() is a good guy and never sleeps. In addition, it is
      registered with FTRACE_OPS_FL_DYNAMIC. As a result,
      unregister_ftrace_function() calls:
      
      	* schedule_on_each_cpu(ftrace_sync) - always
      	* synchronize_rcu_tasks() - in preemptible kernels
      
      The effect is that nobody is inside either the dynamic trampoline or
      the ftrace handler after unregister_ftrace_function() returns.
      
      [jkosina@suse.cz: reformat changelog, fix comment]
      Signed-off-by: Petr Mladek <pmladek@suse.com>
      Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Acked-by: Miroslav Benes <mbenes@suse.cz>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      842c0884
    • L
      Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux · 9705596d
      Linus Torvalds committed
      Pull clk fixes from Stephen Boyd:
       "One build fix for an Amlogic clk driver and a handful of Allwinner clk
        driver fixes for some DT bindings and a randconfig build error that
        all came in this merge window"
      
      * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
        clk: sunxi-ng: a64: Export PLL_PERIPH0 clock for the PRCM
        clk: sunxi-ng: h3: Export PLL_PERIPH0 clock for the PRCM
        dt-bindings: clock: sunxi-ccu: Add pll-periph to PRCM's needed clocks
        clk: sunxi-ng: sun5i: Fix ahb_bist_clk definition
        clk: sunxi-ng: enable SUNXI_CCU_MP for PRCM
        clk: meson: gxbb: fix build error without RESET_CONTROLLER
        clk: sunxi-ng: v3s: Fix usb otg device reset bit
        clk: sunxi-ng: a31: Correct lcd1-ch1 clock register offset
      9705596d
    • L
      Merge tag 'ntb-4.12-bugfixes' of git://github.com/jonmason/ntb · 865be780
      Linus Torvalds committed
      Pull NTB fixes from Jon Mason:
       "NTB bug fixes to address the modinfo in ntb_perf, a couple of bugs in
        the NTB transport QP calculations, skx doorbells, and sleeping in
        ntb_async_tx_submit"
      
      * tag 'ntb-4.12-bugfixes' of git://github.com/jonmason/ntb:
        ntb: no sleep in ntb_async_tx_submit
        ntb: ntb_hw_intel: Skylake doorbells should be 32bits, not 64bits
        ntb_transport: fix bug calculating num_qps_mw
        ntb_transport: fix qp count bug
        NTB: ntb_test: fix bug printing ntb_perf results
        ntb: Correct modinfo usage statement for ntb_perf
      865be780
    • A
      ntb: no sleep in ntb_async_tx_submit · 88931ec3
      Allen Hubbe committed
      Do not sleep in ntb_async_tx_submit, which could deadlock.
      This reverts commit "8c874cc1"
      
      Fixes: 8c874cc1 ("NTB: Address out of DMA descriptor issue with NTB")
      Reported-by: Jia-Ju Bai <baijiaju1990@163.com>
      Signed-off-by: Allen Hubbe <Allen.Hubbe@dell.com>
      Acked-by: Dave Jiang <dave.jiang@intel.com>
      Signed-off-by: Jon Mason <jdmason@kudzu.us>
      88931ec3
    • D
      ntb: ntb_hw_intel: Skylake doorbells should be 32bits, not 64bits · 5eb449e1
      Dave Jiang committed
      Fixing doorbell register length to 32bits per spec. On Skylake NTB, the
      doorbell registers are 32bit write only registers. The source for the
      doorbell is a 64bit register that shows the interrupt bits.
      Signed-off-by: Dave Jiang <dave.jiang@intel.com>
      Fixes: 783dfa6c ("ntb: Adding Skylake Xeon NTB support")
      Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
      Signed-off-by: Jon Mason <jdmason@kudzu.us>
      5eb449e1
    • L
      ntb_transport: fix bug calculating num_qps_mw · 8e8496e0
      Logan Gunthorpe committed
      A divide by zero error occurs if qp_count is less than mw_count because
      num_qps_mw is calculated to be zero. The calculation appears to be
      incorrect.
      
      The requirement is for num_qps_mw to be set to qp_count / mw_count
      with any remainder divided among the earlier mws.
      
      For example, if mw_count is 5 and qp_count is 12 then mws 0 and 1
      will have 3 qps per window and mws 2 through 4 will have 2 qps per window.
      Thus, when mw_num < qp_count % mw_count, num_qps_mw is 1 higher
      than when mw_num >= qp_count % mw_count.
      Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
      Fixes: e26a5843 ("NTB: Split ntb_hw_intel and ntb_transport drivers")
      Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
      Signed-off-by: Jon Mason <jdmason@kudzu.us>
      8e8496e0
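The distribution rule described in 8e8496e0 can be written as a small stand-alone function. This is a minimal sketch: the function name `num_qps_for_mw` and its parameter list are hypothetical, chosen only to mirror the names used in the commit message (qp_count, mw_count, mw_num), and it is not the driver code itself.

```c
#include <assert.h>

/*
 * Number of queue pairs (qps) assigned to memory window mw_num when
 * qp_count qps are spread over mw_count windows. Each window gets
 * qp_count / mw_count qps, and the remainder is handed out one apiece
 * to the lowest-numbered windows.
 */
unsigned int num_qps_for_mw(unsigned int qp_count, unsigned int mw_count,
			    unsigned int mw_num)
{
	/* Guard against the divide-by-zero the commit message mentions. */
	assert(mw_count > 0);
	assert(mw_num < mw_count);

	unsigned int base = qp_count / mw_count;
	unsigned int remainder = qp_count % mw_count;

	return mw_num < remainder ? base + 1 : base;
}
```

With mw_count = 5 and qp_count = 12, windows 0 and 1 get 3 qps and windows 2 through 4 get 2, which matches the example in the commit message and sums to 12.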
    • L
      ntb_transport: fix qp count bug · cb827ee6
      Logan Gunthorpe committed
      In cases where there are more mw's than spads/2-2, the mw count gets
      reduced to match the limitation. ntb_transport also tries to ensure that
      there are fewer qps than mws but uses the full mw count instead of
      the reduced one. When this happens, the math in
      'ntb_transport_setup_qp_mw' will get confused and result in a kernel
      paging request bug.
      
      This patch fixes the bug by reducing qp_count to the reduced mw count
      instead of the full mw count.
      Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
      Fixes: e26a5843 ("NTB: Split ntb_hw_intel and ntb_transport drivers")
      Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
      Signed-off-by: Jon Mason <jdmason@kudzu.us>
      cb827ee6
    • L
      NTB: ntb_test: fix bug printing ntb_perf results · 07b0b22b
      Logan Gunthorpe committed
      The code mistakenly prints the local perf results for the remote test
      so the script reports identical results for both directions. Fix this
      by ensuring we print the remote result.
      Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
      Fixes: a9c59ef7 ("ntb_test: Add a selftest script for the NTB subsystem")
      Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
      Signed-off-by: Jon Mason <jdmason@kudzu.us>
      07b0b22b
    • G
      ntb: Correct modinfo usage statement for ntb_perf · 94fc7954
      Gary R Hook committed
      The order parameters are powers of 2; adjust the usage information
      to use correct mathematical representations.
      Signed-off-by: Gary R Hook <gary.hook@amd.com>
      Fixes: 8a7b6a77 ("ntb: ntb perf tool")
      Acked-by: Dave Jiang <dave.jiang@intel.com>
      Signed-off-by: Jon Mason <jdmason@kudzu.us>
      94fc7954
  4. 19 Jun, 2017 9 commits
    • L
      Linux 4.12-rc6 · 41f1830f
      Linus Torvalds committed
      41f1830f
    • H
      mm: larger stack guard gap, between vmas · 1be7107f
      Hugh Dickins committed
      Stack guard page is a useful feature to reduce the risk of stack smashing
      into a different mapping. We have been using a single page gap which
      is sufficient to prevent having stack adjacent to a different mapping.
      But this seems to be insufficient in the light of the stack usage in
      userspace. E.g. glibc uses as large as 64kB alloca() in many commonly
      used functions. Others use constructs like gid_t buffer[NGROUPS_MAX]
      which is 256kB or stack strings with MAX_ARG_STRLEN.
      
      This will become especially dangerous for suid binaries and the default
      no limit for the stack size limit because those applications can be
      tricked into consuming a large portion of the stack and a single glibc call
      could jump over the guard page. These attacks are not theoretical,
      unfortunately.
      
      Make those attacks less probable by increasing the stack guard gap
      to 1MB (on systems with 4k pages; but make it depend on the page size
      because systems with larger base pages might cap stack allocations in
      PAGE_SIZE units) which should cover larger alloca() and VLA stack
      allocations. It is obviously not a full fix because the problem is
      somewhat inherent, but it should reduce the attack space a lot.
      
      One could argue that the gap size should be configurable from userspace,
      but that can be done later when somebody finds that the new 1MB is wrong
      for some special case applications.  For now, add a kernel command line
      option (stack_guard_gap) to specify the stack gap size (in page units).
      
      Implementation-wise, first delete all the old code for stack guard page:
      because although we could get away with accounting one extra page in a
      stack vma, accounting a larger gap can break userspace - case in point,
      a program run with "ulimit -S -v 20000" failed when the 1MB gap was
      counted for RLIMIT_AS; similar problems could come with RLIMIT_MLOCK
      and strict non-overcommit mode.
      
      Instead of keeping gap inside the stack vma, maintain the stack guard
      gap as a gap between vmas: using vm_start_gap() in place of vm_start
      (or vm_end_gap() in place of vm_end if VM_GROWSUP) in just those few
      places which need to respect the gap - mainly arch_get_unmapped_area(),
      and the vma tree's subtree_gap support for that.
      Original-patch-by: Oleg Nesterov <oleg@redhat.com>
      Original-patch-by: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Tested-by: Helge Deller <deller@gmx.de> # parisc
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      1be7107f
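The "gap between vmas" approach in 1be7107f can be modeled in user space as follows. This is a simplified sketch, not the kernel source: `struct toy_vma`, the `TOY_*` constants, and `toy_vm_start_gap()` are hypothetical names, and the model assumes 4k pages and a downward-growing stack only. It shows the arithmetic the commit message describes: the gap is 256 pages (1MB with 4k pages), it is not part of the stack mapping itself, and callers that must respect it use a gap-adjusted start address in place of vm_start.

```c
#include <assert.h>

#define TOY_PAGE_SHIFT		12	/* assumption: 4k pages */
#define TOY_VM_GROWSDOWN	0x1UL	/* hypothetical flag value */

/* Minimal stand-in for a memory mapping. */
struct toy_vma {
	unsigned long vm_start;
	unsigned long vm_end;
	unsigned long vm_flags;
};

/* Guard gap in bytes: 256 pages, which is 1MB with 4k pages. */
unsigned long toy_stack_guard_gap = 256UL << TOY_PAGE_SHIFT;

/*
 * Lowest address that must stay free below this mapping.
 * For a stack that grows down, this is vm_start minus the guard gap,
 * clamped to 0 if the subtraction wraps around. Other mappings are
 * not widened at all.
 */
unsigned long toy_vm_start_gap(const struct toy_vma *vma)
{
	unsigned long start = vma->vm_start;

	if (vma->vm_flags & TOY_VM_GROWSDOWN) {
		start -= toy_stack_guard_gap;
		if (start > vma->vm_start)	/* unsigned wrap-around */
			start = 0;
	}
	return start;
}
```

Because the gap lives outside the mapping, the mapping's own size is unchanged, which is why the commit message notes that limits such as RLIMIT_AS are no longer charged for the extra 1MB.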
    • L
      Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc · 1132d5e7
      Linus Torvalds committed
      Pull ARM SoC fixes from Olof Johansson:
       "Stream of fixes has slowed down, only a few this week:
      
         - Some DT fixes for Allwinner platforms, and addition of a clock to
           the R_CCU clock controller that had been missed.
      
         - A couple of small DT fixes for am335x-sl50"
      
      * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
        arm64: allwinner: a64: Add PLL_PERIPH0 clock to the R_CCU
        ARM: sunxi: h3-h5: Add PLL_PERIPH0 clock to the R_CCU
        ARM: dts: am335x-sl50: Fix cannot claim requested pins for spi0
        ARM: dts: am335x-sl50: Fix card detect pin for mmc1
        arm64: allwinner: h5: Remove syslink to shared DTSI
        ARM: sunxi: h3/h5: fix the compatible of R_CCU
      1132d5e7
    • O
      Merge tag 'sunxi-fixes-for-4.12' of... · a1858df9
      Olof Johansson committed
      Merge tag 'sunxi-fixes-for-4.12' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux into fixes
      
      Allwinner fixes for 4.12
      
      A few fixes around the PRCM support that got in 4.12 with a wrong
      compatible, and a missing clock in the binding.
      
      * tag 'sunxi-fixes-for-4.12' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux:
        arm64: allwinner: a64: Add PLL_PERIPH0 clock to the R_CCU
        ARM: sunxi: h3-h5: Add PLL_PERIPH0 clock to the R_CCU
        arm64: allwinner: h5: Remove syslink to shared DTSI
        ARM: sunxi: h3/h5: fix the compatible of R_CCU
      Signed-off-by: Olof Johansson <olof@lixom.net>
      a1858df9
    • O
      Merge tag 'omap-for-v4.12/fixes-sl50' of... · 51b6e281
      Olof Johansson committed
      Merge tag 'omap-for-v4.12/fixes-sl50' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes
      
      Two fixes for am335x-sl50 to fix a boot time error
      for claiming SPI pins, and to fix a SDIO card detect
      pin for production version of the device.
      
      * tag 'omap-for-v4.12/fixes-sl50' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
        ARM: dts: am335x-sl50: Fix cannot claim requested pins for spi0
        ARM: dts: am335x-sl50: Fix card detect pin for mmc1
      Signed-off-by: Olof Johansson <olof@lixom.net>
      51b6e281
    • L
      Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost · 3696e4f0
      Linus Torvalds committed
      Pull virtio bugfix from Michael Tsirkin:
       "It turns out balloon does not handle IOMMUs correctly. We should fix
        that at some point, for now let's just disable this configuration"
      
      * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
        virtio_balloon: disable VIOMMU support
      3696e4f0
    • L
      Merge branch 'i2c/for-current-fixed' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · 7d62d947
      Linus Torvalds committed
      Pull i2c fixes from Wolfram Sang:
       "Two driver bugfixes"
      
      * 'i2c/for-current-fixed' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
        i2c: ismt: fix wrong device address when unmap the data buffer
        i2c: rcar: use correct length when unmapping DMA
      7d62d947
    • L
      Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus · b3ee4edd
      Linus Torvalds committed
      Pull MIPS fixes from Ralf Baechle:
      
       - Three highmem fixes:
          + Fixed mapping initialization
          + Adjust the pkmap location
          + Ensure we use at most one page for PTEs
      
       - Fix makefile dependencies for .its targets to depend on vmlinux
      
       - Fix reversed condition in BNEZC and JIALC software branch emulation
      
       - Only flush initialized flush_insn_slot to avoid NULL pointer
         dereference
      
       - perf: Remove incorrect odd/even counter handling for I6400
      
       - ftrace: Fix init functions tracing
      
      * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
        MIPS: .its targets depend on vmlinux
        MIPS: Fix bnezc/jialc return address calculation
        MIPS: kprobes: flush_insn_slot should flush only if probe initialised
        MIPS: ftrace: fix init functions tracing
        MIPS: mm: adjust PKMAP location
        MIPS: highmem: ensure that we don't use more than one page for PTEs
        MIPS: mm: fixed mappings: correct initialisation
        MIPS: perf: Remove incorrect odd/even counter handling for I6400
      b3ee4edd
    • M
      virtio_balloon: disable VIOMMU support · e41b1355
      Michael S. Tsirkin committed
      virtio balloon bypasses the DMA API entirely so does not support the
      VIOMMU right now.  It's not clear we need that support, for now let's
      just make sure we don't pretend to support it.
      
      Cc: stable@vger.kernel.org
      Cc: Wei Wang <wei.w.wang@intel.com>
      Fixes: 1a937693 ("virtio: new feature to detect IOMMU device quirk")
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Acked-by: Jason Wang <jasowang@redhat.com>
      e41b1355
  5. 18 Jun, 2017 13 commits
  6. 17 Jun, 2017 4 commits