1. 13 Jan, 2018 5 commits
    • E
      signal/arm64: Document conflicts with SI_USER and SIGFPE,SIGTRAP,SIGBUS · 526c3ddb
      Eric W. Biederman committed
      Setting si_code to 0 results in userspace seeing an si_code of 0.
      This is the same si_code as SI_USER.  POSIX and common sense require
      that SI_USER not be a signal specific si_code.  As such this use of 0
      for the si_code is a pretty horribly broken ABI.
      
      Further, use of si_code == 0 guaranteed that copy_siginfo_to_user saw a
      value of __SI_KILL and now sees a value of SIL_KILL, with the result
      that the uid and pid fields are copied, and the si_addr field might be
      copied by accident but certainly not by design.  This makes for a very
      flaky implementation.
      
      Utilizing FPE_FIXME, BUS_FIXME, and TRAP_FIXME, siginfo_layout() will now
      return SIL_FAULT and the appropriate fields will be reliably copied.
      
      But folks this is a new and unique kind of bad.  This is massively
      untested code bad.  This is inventing new and unique ways to get
      siginfo wrong bad.  This is don't even think about POSIX or what
      siginfo means bad.  This is lots of eyeballs all missing the fact
      that the code does the wrong thing bad.  This is getting stuck
      and continuing to make the same mistake bad.
      
      I really hope we can find a non userspace breaking fix for this on a
      port as new as arm64.
      
      Possible ABI fixes include:
      - Send the signal without siginfo
      - Don't generate a signal
      - Possibly assign and use an appropriate si_code
      - Don't handle cases which can't happen
      
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Tyler Baicar <tbaicar@codeaurora.org>
      Cc: James Morse <james.morse@arm.com>
      Cc: Tony Lindgren <tony@atomide.com>
      Cc: Nicolas Pitre <nico@linaro.org>
      Cc: Olof Johansson <olof@lixom.net>
      Cc: Santosh Shilimkar <santosh.shilimkar@ti.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: linux-arm-kernel@lists.infradead.org
      Ref: 53631b54 ("arm64: Floating point and SIMD")
      Ref: 32015c23 ("arm64: exception: handle Synchronous External Abort")
      Ref: 1d18c47c ("arm64: MMU fault handling and page table management")
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      526c3ddb
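      The layout change described in this commit can be sketched in userspace C.
      This is a simplified model, not the kernel's siginfo_layout(): the
      model_ names, the reduction to two layouts, and the assumption that
      FPE_FIXME, BUS_FIXME and TRAP_FIXME all stand for the value 0 are
      illustrative only.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>

/* Hypothetical stand-ins for two of the layouts named in the commit text. */
enum model_layout { MODEL_SIL_KILL, MODEL_SIL_FAULT };

/* Assumption: the *_FIXME placeholders name the ambiguous si_code value 0. */
#define MODEL_FPE_FIXME  0
#define MODEL_BUS_FIXME  0
#define MODEL_TRAP_FIXME 0

static int model_is_fault_signal(int sig)
{
    return sig == SIGFPE || sig == SIGBUS || sig == SIGTRAP;
}

/* Before: si_code 0 equals SI_USER, so the pid/uid layout is chosen. */
static enum model_layout model_layout_before(int sig, int si_code)
{
    assert(sig > 0);
    if (si_code > SI_USER && model_is_fault_signal(sig))
        return MODEL_SIL_FAULT;  /* genuine signal-specific code */
    return MODEL_SIL_KILL;       /* treated like a kill()-style signal */
}

/*
 * After: the (signal, FIXME code) pairs are special-cased so the fault
 * fields (si_addr) are the ones copied. A SIGFPE genuinely sent with
 * SI_USER is indistinguishable here, which is the conflict being documented.
 */
static enum model_layout model_layout_after(int sig, int si_code)
{
    assert(sig > 0);
    if ((sig == SIGFPE  && si_code == MODEL_FPE_FIXME) ||
        (sig == SIGBUS  && si_code == MODEL_BUS_FIXME) ||
        (sig == SIGTRAP && si_code == MODEL_TRAP_FIXME))
        return MODEL_SIL_FAULT;
    return model_layout_before(sig, si_code);
}
```

      In this model only the three fault signals change behaviour for
      si_code 0; every other signal keeps the SI_USER interpretation.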
    • E
      signal/powerpc: Document conflicts with SI_USER and SIGFPE and SIGTRAP · cf4674c4
      Eric W. Biederman committed
      Setting si_code to 0 results in userspace seeing an si_code of 0.
      This is the same si_code as SI_USER.  POSIX and common sense require
      that SI_USER not be a signal specific si_code.  As such this use of 0
      for the si_code is a pretty horribly broken ABI.
      
      Further, use of si_code == 0 guaranteed that copy_siginfo_to_user saw a
      value of __SI_KILL and now sees a value of SIL_KILL, with the result
      that the uid and pid fields are copied, and the si_addr field might be
      copied by accident but certainly not by design.  This makes for a very
      flaky implementation.
      
      Utilizing FPE_FIXME and TRAP_FIXME, siginfo_layout() will now return
      SIL_FAULT and the appropriate fields will be reliably copied.
      
      Possible ABI fixes include:
      - Send the signal without siginfo
      - Don't generate a signal
      - Possibly assign and use an appropriate si_code
      - Don't handle cases which can't happen
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Kumar Gala <kumar.gala@freescale.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: linuxppc-dev@lists.ozlabs.org
      Ref: 9bad068c24d7 ("[PATCH] ppc32: support for e500 and 85xx")
      Ref: 0ed70f6105ef ("PPC32: Provide proper siginfo information on various exceptions.")
      History Tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      cf4674c4
    • E
      signal/metag: Document a conflict with SI_USER with SIGFPE · b80328be
      Eric W. Biederman committed
      Setting si_code to 0 results in userspace seeing an si_code of 0.
      This is the same si_code as SI_USER.  POSIX and common sense require
      that SI_USER not be a signal specific si_code.  As such this use of 0
      for the si_code is a pretty horribly broken ABI.
      
      Further, use of si_code == 0 guaranteed that copy_siginfo_to_user saw a
      value of __SI_KILL and now sees a value of SIL_KILL, with the result
      that the uid and pid fields are copied, and the si_addr field might be
      copied by accident but certainly not by design.  This makes for a very
      flaky implementation.
      
      Utilizing FPE_FIXME, siginfo_layout() will now return SIL_FAULT and the
      appropriate fields will reliably be copied.
      
      Possible ABI fixes include:
        - Send the signal without siginfo
        - Don't generate a signal
        - Possibly assign and use an appropriate si_code
        - Don't handle cases which can't happen
      
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: linux-metag@vger.kernel.org
      Ref: ac919f08 ("metag: Traps")
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      b80328be
    • E
      signal/parisc: Document a conflict with SI_USER with SIGFPE · b5daf2b9
      Eric W. Biederman committed
      Setting si_code to 0 results in userspace seeing an si_code of 0.
      This is the same si_code as SI_USER.  POSIX and common sense require
      that SI_USER not be a signal specific si_code.  As such this use of 0
      for the si_code is a pretty horribly broken ABI.
      
      Further, use of si_code == 0 guaranteed that copy_siginfo_to_user saw a
      value of __SI_KILL and now sees a value of SIL_KILL, with the result
      that the uid and pid fields are copied, and the si_addr field might be
      copied by accident but certainly not by design.  This makes for a very
      flaky implementation.
      
      Utilizing FPE_FIXME, siginfo_layout() will now return SIL_FAULT and the
      appropriate fields will reliably be copied.
      
      This bug is 13 years old and parisc machines are no longer being built,
      so I don't know if it is possible or worth fixing.  But it is at least
      worth documenting this so other architectures don't make the same
      mistake.
      
      Possible ABI fixes include:
        - Send the signal without siginfo
        - Don't generate a signal
        - Possibly assign and use an appropriate si_code
        - Don't handle cases which can't happen
      
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: Helge Deller <deller@gmx.de>
      Cc: linux-parisc@vger.kernel.org
      Ref: 313c01d3e3fd ("[PATCH] PA-RISC update for 2.6.0")
      History Tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      b5daf2b9
    • E
      signal/openrisc: Fix do_unaligned_access to send the proper signal · 500d5830
      Eric W. Biederman committed
      While reviewing the signal sending on openrisc, the do_unaligned_access
      function stood out because it is obviously wrong.  A comment refers to
      an si_code set above when in fact si_code is never set, leading to a
      random si_code being sent to userspace in the event of an unaligned
      access.
      
      Looking further, SIGBUS with BUS_ADRALN is the proper pair of signal and
      si_code to send for an unaligned access.  That is what other
      architectures do and what is required by POSIX.
      
      Given that do_unaligned_access is broken in a way that no one can be
      relying on it on openrisc, fix the code to just do the right thing.
      
      Cc: stable@vger.kernel.org
      Fixes: 769a8a96 ("OpenRISC: Traps")
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: openrisc@lists.librecores.org
      Acked-by: Stafford Horne <shorne@gmail.com>
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      500d5830
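      The signal and si_code pair named in this commit can be illustrated
      with a small userspace C sketch. The helper name
      make_unaligned_access_info() is hypothetical and this is not the
      openrisc kernel code; it only shows a fully initialized siginfo_t
      carrying SIGBUS and BUS_ADRALN together with the faulting address.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/*
 * Hypothetical helper: build the siginfo an unaligned access is expected
 * to produce. Zeroing first ensures si_code is never left uninitialized,
 * which is the failure mode the commit message describes.
 */
static siginfo_t make_unaligned_access_info(void *fault_addr)
{
    siginfo_t info;

    assert(fault_addr != NULL);
    memset(&info, 0, sizeof(info));
    info.si_signo = SIGBUS;      /* bus error */
    info.si_errno = 0;
    info.si_code  = BUS_ADRALN;  /* invalid address alignment */
    info.si_addr  = fault_addr;  /* address that triggered the fault */
    return info;
}
```

      A SIGBUS handler installed with SA_SIGINFO could then tell alignment
      faults apart from other bus errors by comparing si_code with
      BUS_ADRALN.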
  2. 06 Jan, 2018 1 commit
  3. 04 Jan, 2018 1 commit
    • E
      signal: Simplify and fix kdb_send_sig · 0b44bf9a
      Eric W. Biederman committed
      - Rename from kdb_send_sig_info to kdb_send_sig
        As there is no meaningful siginfo sent
      
      - Use SEND_SIG_PRIV instead of generating a siginfo for a kdb
        signal.  The generated siginfo had a bogus rationale and was
        not correct in the face of pid namespaces.  SEND_SIG_PRIV
        is simpler and actually correct.
      
      - As the code grabs siglock just send the signal with siglock
        held instead of dropping siglock and attempting to grab it again.
      
      - Move the sig_valid test into kdb_kill where it can generate
        a good error message.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      0b44bf9a
  4. 01 Jan, 2018 16 commits
    • L
      Linux 4.15-rc6 · 30a7acd5
      Linus Torvalds committed
      30a7acd5
    • L
      Merge branch 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · f39d7d78
      Linus Torvalds committed
      Pull x86 fixes from Thomas Gleixner:
       "A couple of fixlets for x86:
      
         - Fix the ESPFIX double fault handling for 5-level pagetables
      
         - Fix the commandline parsing for 'apic=' on 32bit systems and update
           documentation
      
         - Make zombie stack traces reliable
      
         - Fix kexec with stack canary
      
         - Fix the delivery mode for APICs which was missed when the x86
           vector management was converted to single target delivery. Caused a
           regression due to the broken hardware which ignores affinity
           settings in lowest prio delivery mode.
      
         - Unbreak modules when AMD memory encryption is enabled
      
         - Remove an unused parameter of prepare_switch_to"
      
      * 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/apic: Switch all APICs to Fixed delivery mode
        x86/apic: Update the 'apic=' description of setting APIC driver
        x86/apic: Avoid wrong warning when parsing 'apic=' in X86-32 case
        x86-32: Fix kexec with stack canary (CONFIG_CC_STACKPROTECTOR)
        x86: Remove unused parameter of prepare_switch_to
        x86/stacktrace: Make zombie stack traces reliable
        x86/mm: Unbreak modules that use the DMA API
        x86/build: Make isoimage work on Debian
        x86/espfix/64: Fix espfix double-fault handling on 5-level systems
      f39d7d78
    • L
      Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 52c90f2d
      Linus Torvalds committed
      Pull x86 page table isolation fixes from Thomas Gleixner:
       "Four patches addressing the PTI fallout as discussed and debugged
        yesterday:
      
         - Remove stale and pointless TLB flush invocations from the hotplug
           code
      
         - Remove stale preempt_disable/enable from __native_flush_tlb()
      
         - Plug the memory leak in the write_ldt() error path"
      
      * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/ldt: Make LDT pgtable free conditional
        x86/ldt: Plug memory leak in error path
        x86/mm: Remove preempt_disable/enable() from __native_flush_tlb()
        x86/smpboot: Remove stale TLB flush invocations
      52c90f2d
    • L
      Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · cea92e84
      Linus Torvalds committed
      Pull timer fixes from Thomas Gleixner:
       "A pile of fixes for long standing issues with the timer wheel and the
        NOHZ code:
      
         - Prevent timer base confusion across the nohz switch, which can
           cause unlocked access and data corruption
      
         - Reinitialize the stale base clock on cpu hotplug to prevent subtle
           side effects including rollovers on 32bit
      
         - Prevent an interrupt storm when the timer softirq is already
           pending caused by tick_nohz_stop_sched_tick()
      
         - Move the timer start tracepoint to a place where it actually makes
           sense
      
         - Add documentation to timerqueue functions as they caused confusion
           several times now"
      
      * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        timerqueue: Document return values of timerqueue_add/del()
        timers: Invoke timer_start_debug() where it makes sense
        nohz: Prevent a timer interrupt storm in tick_nohz_stop_sched_tick()
        timers: Reinitialize per cpu bases on hotplug
        timers: Use deferrable base independent of base::nohz_active
      cea92e84
    • L
      Merge branch 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 8d517bdf
      Linus Torvalds committed
      Pull smp fixlet from Thomas Gleixner:
       "A trivial build warning fix for newer compilers"
      
      * 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        cpu/hotplug: Move inline keyword at the beginning of declaration
      8d517bdf
    • L
      Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 4c470317
      Linus Torvalds committed
      Pull scheduler fixes from Thomas Gleixner:
       "Three patches addressing the fallout of the CPU_ISOLATION changes
        especially with NO_HZ_FULL plus documentation of boot parameter
        dependency"
      
      * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sched/isolation: Document boot parameters dependency on CONFIG_CPU_ISOLATION=y
        sched/isolation: Enable CONFIG_CPU_ISOLATION=y by default
        sched/isolation: Make CONFIG_NO_HZ_FULL select CONFIG_CPU_ISOLATION
      4c470317
    • L
      Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · e7c632fc
      Linus Torvalds committed
      Pull perf fixes from Thomas Gleixner:
      
       - plug a memory leak in the intel pmu init code
      
       - clang fixes
      
       - tooling fix to avoid including kernel headers
      
       - a fix for jvmti to generate correct debug information for inlined
         code
      
       - replace backtick with a regular shell function
      
       - fix the build in hardened environments
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        perf/x86/intel: Plug memory leak in intel_pmu_init()
        x86/asm: Allow again using asm.h when building for the 'bpf' clang target
        tools arch s390: Do not include header files from the kernel sources
        perf jvmti: Generate correct debug information for inlined code
        perf tools: Fix up build in hardened environments
        perf tools: Use shell function for perl cflags retrieval
      e7c632fc
    • L
      Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 88fa025d
      Linus Torvalds committed
      Pull irq fixes from Thomas Gleixner:
       "A rather large update after the kaisered maintainer finally found time
        to handle regression reports.
      
         - The larger part addresses a regression caused by the x86 vector
           management rework.
      
           The reservation based model does not work reliably for MSI
           interrupts, if they cannot be masked (yes, yet another hw
           engineering trainwreck). The reason is that the reservation mode
           assigns a dummy vector when the interrupt is allocated and switches
           to a real vector when the interrupt is requested.
      
           If the MSI entry cannot be masked then the initialization might
           raise an interrupt before the interrupt is requested, which ends up
           as spurious interrupt and causes device malfunction and worse. The
           fix is to exclude MSI interrupts which do not support masking from
           reservation mode and assign a real vector right away.
      
         - Extend the extra lockdep class setup for nested interrupts with a
           class for the recently added irq_desc::request_mutex so lockdep can
            differentiate and does not emit false positive warnings.
      
         - A ratelimit guard for the bad irq printout so in case a bad irq
           comes back immediately the system does not drown in dmesg spam"
      
      * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        genirq/msi, x86/vector: Prevent reservation mode for non maskable MSI
        genirq/irqdomain: Rename early argument of irq_domain_activate_irq()
        x86/vector: Use IRQD_CAN_RESERVE flag
        genirq: Introduce IRQD_CAN_RESERVE flag
        genirq/msi: Handle reactivation only on success
        gpio: brcmstb: Make really use of the new lockdep class
        genirq: Guard handle_bad_irq log messages
        kernel/irq: Extend lockdep class for request mutex
      88fa025d
    • L
      Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 31336ed9
      Linus Torvalds committed
      Pull objtool fixes from Thomas Gleixner:
       "Three fixlets for objtool:
      
         - Address two segfaults related to missing parameter and clang
           objects
      
         - Make it compile clean with clang"
      
      * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        objtool: Fix seg fault with clang-compiled objects
        objtool: Fix seg fault caused by missing parameter
        objtool: Fix Clang enum conversion warning
      31336ed9
    • L
      Merge tag 'char-misc-4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc · 8371e5a0
      Linus Torvalds committed
      Pull char/misc fixes from Greg KH:
       "Here are six small fixes of some of the char/misc drivers that have
        been sent in to resolve reported issues.
      
        Nothing major, a binder use-after-free fix, some thunderbolt bugfixes,
        a hyper-v bugfix, and an nvmem driver fix. All of these have been in
        linux-next with no reported issues for a while"
      
      * tag 'char-misc-4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
        nvmem: meson-mx-efuse: fix reading from an offset other than 0
        binder: fix proc->files use-after-free
        vmbus: unregister device_obj->channels_kset
        thunderbolt: Mask ring interrupt properly when polling starts
        MAINTAINERS: Add thunderbolt.rst to the Thunderbolt driver entry
        thunderbolt: Make pathname to force_power shorter
      8371e5a0
    • L
      Merge tag 'driver-core-4.15-rc6' of... · 4288e6b4
      Linus Torvalds committed
      Merge tag 'driver-core-4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
      
      Pull driver core fixes from Greg KH:
       "Here are two driver core fixes for 4.15-rc6, resolving some reported
        issues.
      
        The first is a cacheinfo fix for DT based systems to resolve a
        reported issue that has been around for a while, and the other is to
        resolve a regression in the kobject uevent code that showed up in
        4.15-rc1.
      
        Both have been in linux-next for a while with no reported issues"
      
      * tag 'driver-core-4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
        kobject: fix suppressing modalias in uevents delivered over netlink
        drivers: base: cacheinfo: fix cache type for non-architected system cache
      4288e6b4
    • L
      Merge tag 'staging-4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging · 29a9b000
      Linus Torvalds committed
      Pull staging fixes from Greg KH:
       "Here are three staging driver fixes for 4.15-rc6
      
        The first resolves a bug in the lustre driver that came about due to a
        broken cleanup patch, due to crazy list usage in that codebase.
      
        The remaining two are ion driver fixes, finally getting the CMA
        interaction to work properly, resolving two regressions in that area
        of the code.
      
        All have been in linux-next with no reported issues for a while"
      
      * tag 'staging-4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
        staging: android: ion: Fix dma direction for dma_sync_sg_for_cpu/device
        staging: ion: Fix ion_cma_heap allocations
        staging: lustre: lnet: Fix recent breakage from list_for_each conversion
      29a9b000
    • L
      Merge tag 'tty-4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty · bc7236fb
      Linus Torvalds committed
      Pull TTY fix from Greg KH:
       "Here is a single tty fix for a reported issue that you wrote the patch
        for :)
      
        It's been in linux-next for a week or so with no reported issues"
      
      * tag 'tty-4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
        n_tty: fix EXTPROC vs ICANON interaction with TIOCINQ (aka FIONREAD)
      bc7236fb
    • L
      Merge tag 'usb-4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · a9746e40
      Linus Torvalds committed
      Pull USB/PHY fixes from Greg KH:
       "Here are a number of small USB and PHY driver fixes for 4.15-rc6.
      
        Nothing major, but there are a number of regression fixes in here that
        resolve issues that have been reported a bunch. There are also the
        usual xhci fixes as well as a number of new usb serial device ids.
      
        All of these have been in linux-next for a while with no reported
        issues"
      
      * tag 'usb-4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
        usb: xhci: Add XHCI_TRUST_TX_LENGTH for Renesas uPD720201
        xhci: Fix use-after-free in xhci debugfs
        xhci: Fix xhci debugfs NULL pointer dereference in resume from hibernate
        USB: serial: ftdi_sio: add id for Airbus DS P8GR
        usb: Add device quirk for Logitech HD Pro Webcam C925e
        usb: add RESET_RESUME for ELSA MicroLink 56K
        usbip: fix usbip bind writing random string after command in match_busid
        usbip: stub_rx: fix static checker warning on unnecessary checks
        usbip: prevent leaking socket pointer address in messages
        usbip: stub: stop printing kernel pointer addresses in messages
        usbip: vhci: stop printing kernel pointer addresses in messages
        USB: Fix off by one in type-specific length check of BOS SSP capability
        USB: serial: option: adding support for YUGA CLM920-NC5
        phy: rcar-gen3-usb2: select USB_COMMON
        phy: rockchip-typec: add pm_runtime_disable in err case
        phy: cpcap-usb: Fix platform_get_irq_byname's error checking.
        phy: tegra: fix device-tree node lookups
        USB: serial: qcserial: add Sierra Wireless EM7565
        USB: serial: option: add support for Telit ME910 PID 0x1101
        USB: chipidea: msm: fix ulpi-node lookup
      a9746e40
    • A
      MAINTAINERS: mark arch/blackfin/ and its gubbins as orphaned · c0b23903
      Adam Borowski committed
      The blackfin architecture has seen no maintainer action of any kind since
      April 2015.  No new code, no pull requests, no acks to patches, no response
      to mails, nothing.
      
      The web site has an expired certificate (expiration Sep 2017, issued in
      2013), the mailing list sees no answers either, with one exception:
      
        https://sourceforge.net/p/adi-buildroot/mailman/adi-buildroot-devel/
        >
        > Steven is no longer working on this for ADI. Acked by me if this works. Thanks.
        >
        > Best regards,
        > Aaron Wu
        > Analog Devices Inc.
      
      But, Aaron doesn't seem to respond to queries either.
      Signed-off-by: Adam Borowski <kilobyte@angband.pl>
      Acked-by: Linus Walleij <linus.walleij@linaro.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      c0b23903
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc · 6bba94d0
      Linus Torvalds committed
      Pull sparc bugfix from David Miller.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
        sparc64: repair calling incorrect hweight function from stubs
      6bba94d0
  5. 31 Dec, 2017 9 commits
  6. 30 Dec, 2017 8 commits
    • L
      Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 5aa90a84
      Linus Torvalds committed
      Pull x86 page table isolation updates from Thomas Gleixner:
       "This is the final set of enabling page table isolation on x86:
      
         - Infrastructure patches for handling the extra page tables.
      
         - Patches which map the various bits and pieces which are required to
           get in and out of user space into the user space visible page
           tables.
      
         - The required changes to have CR3 switching in the entry/exit code.
      
         - Optimizations for the CR3 switching along with documentation how
           the ASID/PCID mechanism works.
      
         - Updates to dump pagetables to cover the user space page tables for
           W+X scans and extra debugfs files to analyze both the kernel and
           the user space visible page tables
      
        The whole functionality is compile time controlled via a config switch
        and can be turned on/off on the command line as well"
      
      * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (32 commits)
        x86/ldt: Make the LDT mapping RO
        x86/mm/dump_pagetables: Allow dumping current pagetables
        x86/mm/dump_pagetables: Check user space page table for WX pages
        x86/mm/dump_pagetables: Add page table directory to the debugfs VFS hierarchy
        x86/mm/pti: Add Kconfig
        x86/dumpstack: Indicate in Oops whether PTI is configured and enabled
        x86/mm: Clarify the whole ASID/kernel PCID/user PCID naming
        x86/mm: Use INVPCID for __native_flush_tlb_single()
        x86/mm: Optimize RESTORE_CR3
        x86/mm: Use/Fix PCID to optimize user/kernel switches
        x86/mm: Abstract switching CR3
        x86/mm: Allow flushing for future ASID switches
        x86/pti: Map the vsyscall page if needed
        x86/pti: Put the LDT in its own PGD if PTI is on
        x86/mm/64: Make a full PGD-entry size hole in the memory map
        x86/events/intel/ds: Map debug buffers in cpu_entry_area
        x86/cpu_entry_area: Add debugstore entries to cpu_entry_area
        x86/mm/pti: Map ESPFIX into user space
        x86/mm/pti: Share entry text PMD
        x86/entry: Align entry text section to PMD boundary
        ...
      5aa90a84
    • T
      timerqueue: Document return values of timerqueue_add/del() · 9f4533cd
      Thomas Gleixner committed
      The return values of timerqueue_add/del() are not documented in the kernel doc
      comment. Add proper documentation.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Sebastian Siewior <bigeasy@linutronix.de>
      Cc: rt@linutronix.de
      Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Link: https://lkml.kernel.org/r/20171222145337.872681338@linutronix.de
      9f4533cd
    • T
      timers: Invoke timer_start_debug() where it makes sense · fd45bb77
      Thomas Gleixner committed
      The timer start debug function is called before the proper timer base is
      set. As a consequence the trace data contains the stale CPU and flags
      values.
      
      Call the debug function after setting the new base and flags.
      
      Fixes: 500462a9 ("timers: Switch to a non-cascading wheel")
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Sebastian Siewior <bigeasy@linutronix.de>
      Cc: stable@vger.kernel.org
      Cc: rt@linutronix.de
      Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Link: https://lkml.kernel.org/r/20171222145337.792907137@linutronix.de
      fd45bb77
    • T
      nohz: Prevent a timer interrupt storm in tick_nohz_stop_sched_tick() · 5d62c183
      Thomas Gleixner committed
      The conditions in irq_exit() to invoke tick_nohz_irq_exit() which
      subsequently invokes tick_nohz_stop_sched_tick() are:
      
        if ((idle_cpu(cpu) && !need_resched()) || tick_nohz_full_cpu(cpu))
      
      If need_resched() is not set, but a timer softirq is pending then this is
      an indication that the softirq code punted and delegated the execution to
      softirqd. need_resched() is not true because the current interrupted task
      takes precedence over softirqd.
      
      Invoking tick_nohz_irq_exit() in this case can cause an endless loop of
      timer interrupts because the timer wheel contains an expired timer, but
      softirqs are not yet executed. So it returns an immediate expiry request,
      which causes the timer to fire immediately again. Lather, rinse and
      repeat....
      
      Prevent that by adding a check for a pending timer soft interrupt to the
      conditions in tick_nohz_stop_sched_tick() which avoid calling
      get_next_timer_interrupt(). That keeps the tick sched timer on the tick and
      prevents a repetitive programming of an already expired timer.
      Reported-by: Sebastian Siewior <bigeasy@linutronix.de>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Cc: Sebastian Siewior <bigeasy@linutronix.de>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1712272156050.2431@nanos
      5d62c183
    • T
      timers: Reinitialize per cpu bases on hotplug · 26456f87
      Thomas Gleixner committed
      The timer wheel bases are not (re)initialized on CPU hotplug. That leaves
      them with a potentially stale clk and next_expiry value, which can cause
      trouble when the CPU is plugged.
      
      Add a prepare callback which forwards the clock, sets next_expiry to far in
      the future and resets the control flags to a known state.
      
      Set base->must_forward_clk so the first timer which is queued will try to
      forward the clock to current jiffies.
      
      Fixes: 500462a9 ("timers: Switch to a non-cascading wheel")
      Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Sebastian Siewior <bigeasy@linutronix.de>
      Cc: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1712272152200.2431@nanos
      26456f87
    • A
      timers: Use deferrable base independent of base::nohz_active · ced6d5c1
      Anna-Maria Gleixner 提交于
      During boot and before base::nohz_active is set in the timer bases, deferrable
      timers are enqueued into the standard timer base. This works correctly as
      long as base::nohz_active is false.
      
      Once base::nohz_active is set and a timer which was enqueued before that
      is accessed, the lock selector code chooses the lock of the deferrable
      base. This causes unlocked access to the standard base and, in case the
      timer is removed, it does not clear the pending flag in the standard base
      bitmap, which causes get_next_timer_interrupt() to return bogus values.
      
      To prevent that, the deferrable timers must be enqueued in the deferrable
      base, even when base::nohz_active is not set. Those deferrable timers also
      need to be expired unconditionally.
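
The base selection problem described above can be sketched in plain C. This is an illustration under stated assumptions, not the kernel code: `timer_sketch`, `pick_base_old`, `pick_base_new`, the enum values and `TIMER_DEFERRABLE_SKETCH` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical indices for the standard and the deferrable base. */
enum base_index_sketch { BASE_STD_SKETCH = 0, BASE_DEF_SKETCH = 1 };

/* Hypothetical flag bit marking a timer as deferrable. */
#define TIMER_DEFERRABLE_SKETCH 0x1u

struct timer_sketch {
	unsigned int flags;
};

/*
 * Behaviour described as buggy: the selected base depends on nohz_active,
 * so a timer enqueued before the flag flips is later looked up in a
 * different base than the one it was queued in.
 */
static enum base_index_sketch pick_base_old(const struct timer_sketch *t,
					    bool nohz_active)
{
	assert(t != NULL);
	if ((t->flags & TIMER_DEFERRABLE_SKETCH) && nohz_active)
		return BASE_DEF_SKETCH;
	return BASE_STD_SKETCH;
}

/* Behaviour described as the fix: the timer flag alone decides. */
static enum base_index_sketch pick_base_new(const struct timer_sketch *t)
{
	assert(t != NULL);
	return (t->flags & TIMER_DEFERRABLE_SKETCH) ? BASE_DEF_SKETCH
						    : BASE_STD_SKETCH;
}
```

With the old selector, a deferrable timer maps to the standard base while `nohz_active` is false and to the deferrable base afterwards, which is the mismatch behind the unlocked access. With the new selector the result never changes over the timer's lifetime.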
      
      Fixes: 500462a9 ("timers: Switch to a non-cascading wheel")
      Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sebastian Siewior <bigeasy@linutronix.de>
      Cc: stable@vger.kernel.org
      Cc: rt@linutronix.de
      Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
      Link: https://lkml.kernel.org/r/20171222145337.633328378@linutronix.de
    • T
      genirq/msi, x86/vector: Prevent reservation mode for non maskable MSI · bc976233
      Thomas Gleixner committed
      The new reservation mode for interrupts assigns a dummy vector when the
      interrupt is allocated and assigns a real vector when the interrupt is
      requested. The reservation mode prevents vector pressure when devices with
      a large number of queues/interrupts are initialized, but only a minimal
      subset of those queues/interrupts is actually used.
      
      This mode has an issue with MSI interrupts which cannot be masked. If the
      driver is not careful or the hardware emits an interrupt before the device
      irq is requested by the driver, then the interrupt ends up on the dummy
      vector as a spurious interrupt, which can cause malfunction of the device
      or, in the worst case, a lockup of the machine.
      
      Change the logic for the reservation mode so that the early activation of
      MSI interrupts checks whether:
      
       - the device is a PCI/MSI device
       - the reservation mode of the underlying irqdomain is activated
       - PCI/MSI masking is globally enabled
       - the PCI/MSI device uses either MSI-X, which supports masking, or
         MSI with the maskbit supported.
      
      If one of those conditions is false, then clear the reservation mode flag
      in the irq data of the interrupt and invoke irq_domain_activate_irq() with
      the reserve argument cleared. In the x86 vector code, clear the can_reserve
      flag in the vector allocation data so a subsequent free_irq() won't create
      the same situation again. The interrupt stays assigned to a real vector
      until pci_disable_msi() is invoked and all allocations are undone.
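
The four conditions and the fallback described above can be sketched as a predicate plus an activation helper. This is a simplified model, assuming all relevant state fits in one structure; `msi_irq_sketch`, `msi_may_reserve` and `msi_early_activate` are hypothetical names and do not reflect the kernel's real data structures or signatures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flattened view of the state the check looks at. */
struct msi_irq_sketch {
	bool is_pci_msi;      /* the device is a PCI/MSI device */
	bool domain_reserves; /* irqdomain has reservation mode activated */
	bool masking_enabled; /* PCI/MSI masking is globally enabled */
	bool is_msix;         /* MSI-X, which supports masking */
	bool msi_maskbit;     /* plain MSI with the maskbit supported */
	bool can_reserve;     /* reservation mode flag of this interrupt */
};

/* True only if every condition from the list holds. */
static bool msi_may_reserve(const struct msi_irq_sketch *irq)
{
	assert(irq != NULL);
	return irq->is_pci_msi &&
	       irq->domain_reserves &&
	       irq->masking_enabled &&
	       (irq->is_msix || irq->msi_maskbit);
}

/*
 * Early activation: if any condition fails, clear the reservation flag.
 * Returns the value that would be passed on as the reserve argument.
 */
static bool msi_early_activate(struct msi_irq_sketch *irq)
{
	assert(irq != NULL);
	if (!msi_may_reserve(irq))
		irq->can_reserve = false; /* stays on a real vector from now on */
	return irq->can_reserve;
}
```

In this model a plain MSI device without the maskbit gets `false` back and keeps `can_reserve` cleared, so a later free and re-request cannot fall back to the dummy vector.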
      
      Fixes: 4900be83 ("x86/vector/msi: Switch to global reservation mode")
      Reported-by: Alexandru Chirvasitu <achirvasub@gmail.com>
      Reported-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Alexandru Chirvasitu <achirvasub@gmail.com>
      Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Dou Liyang <douly.fnst@cn.fujitsu.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Maciej W. Rozycki <macro@linux-mips.org>
      Cc: Mikael Pettersson <mikpelinux@gmail.com>
      Cc: Josh Poulson <jopoulso@microsoft.com>
      Cc: Mihai Costache <v-micos@microsoft.com>
      Cc: Stephen Hemminger <sthemmin@microsoft.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: linux-pci@vger.kernel.org
      Cc: Haiyang Zhang <haiyangz@microsoft.com>
      Cc: Dexuan Cui <decui@microsoft.com>
      Cc: Simon Xiao <sixiao@microsoft.com>
      Cc: Saeed Mahameed <saeedm@mellanox.com>
      Cc: Jork Loeser <Jork.Loeser@microsoft.com>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: devel@linuxdriverproject.org
      Cc: KY Srinivasan <kys@microsoft.com>
      Cc: Alan Cox <alan@linux.intel.com>
      Cc: Sakari Ailus <sakari.ailus@intel.com>
      Cc: linux-media@vger.kernel.org
      Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1712291406420.1899@nanos
      Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1712291409460.1899@nanos
    • T
      genirq/irqdomain: Rename early argument of irq_domain_activate_irq() · 702cb0a0
      Thomas Gleixner committed
      The 'early' argument of irq_domain_activate_irq() is actually used to
      denote reservation mode. To avoid confusion, rename it before abuse
      happens.
      
      No functional change.
      
      Fixes: 72491643 ("genirq/irqdomain: Update irq_domain_ops.activate() signature")
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Alexandru Chirvasitu <achirvasub@gmail.com>
      Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Dou Liyang <douly.fnst@cn.fujitsu.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Maciej W. Rozycki <macro@linux-mips.org>
      Cc: Mikael Pettersson <mikpelinux@gmail.com>
      Cc: Josh Poulson <jopoulso@microsoft.com>
      Cc: Mihai Costache <v-micos@microsoft.com>
      Cc: Stephen Hemminger <sthemmin@microsoft.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: linux-pci@vger.kernel.org
      Cc: Haiyang Zhang <haiyangz@microsoft.com>
      Cc: Dexuan Cui <decui@microsoft.com>
      Cc: Simon Xiao <sixiao@microsoft.com>
      Cc: Saeed Mahameed <saeedm@mellanox.com>
      Cc: Jork Loeser <Jork.Loeser@microsoft.com>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: devel@linuxdriverproject.org
      Cc: KY Srinivasan <kys@microsoft.com>
      Cc: Alan Cox <alan@linux.intel.com>
      Cc: Sakari Ailus <sakari.ailus@intel.com>
      Cc: linux-media@vger.kernel.org