1. 19 Mar, 2016 3 commits
  2. 18 Mar, 2016 3 commits
    • T
      x86/irq: Cure live lock in fixup_irqs() · 551adc60
      Thomas Gleixner committed
      Harry reported, that he's able to trigger a system freeze with cpu hot
      unplug. The freeze turned out to be a live lock caused by recent changes in
      irq_force_complete_move().
      
      When fixup_irqs() and from there irq_force_complete_move() is called on the
      dying cpu, then all other cpus are in stop machine and wait for the dying cpu
      to complete the teardown. If there is a move of an interrupt pending then
      irq_force_complete_move() sends the cleanup IPI to the cpus in the old_domain
      mask and waits for them to clear the mask. That's obviously impossible as
      those cpus are firmly stuck in stop machine with interrupts disabled.
      
      I should have known that, but I completely overlooked it while concentrating on
      the locking issues around the vectors. And the existence of the call to
      __irq_complete_move() in the code, which actually sends the cleanup IPI, made
      it reasonable to wait for that cleanup to complete. That call was bogus even
      before the recent changes as it was just a pointless distraction.
      
      We have to look at two cases:
      
      1) The move_in_progress flag of the interrupt is set
      
         This means the ioapic has been updated with the new vector, but it has not
         fired yet. In theory there is a race:
      
         set_ioapic(new_vector) <-- Interrupt is raised before update is effective,
         			      i.e. it's raised on the old vector. 
      
         So if the target cpu cannot handle that interrupt before the old vector is
         cleaned up, we get a spurious interrupt and in the worst case the ioapic
         irq line becomes stale, but my experiments so far have only resulted in
         spurious interrupts.
      
         But in case of cpu hotplug this should be a non issue because if the
         affinity update happens right before all cpus rendezvous in stop machine,
         there is no way that the interrupt can be blocked on the target cpu because
         all cpus first loop with interrupts enabled in stop machine, so the old
         vector is not yet cleaned up when the interrupt fires.
      
         So the only way to run into this issue is if the delivery of the interrupt
         on the apic/system bus would be delayed beyond the point where the target
         cpu disables interrupts in stop machine. I doubt that it can happen, but at
         least there is a theoretical chance. Virtualization might be able to
         expose this, but AFAICT the IOAPIC emulation is not as stupid as the real
         hardware.
      
         I've spent quite some time over the weekend trying to enforce that situation,
         but I was not able to trigger the delayed case.
      
      2) The move_in_progress flag is not set and the old_domain cpu mask is not
         empty.
      
         That means, that an interrupt was delivered after the change and the
         cleanup IPI has been sent to the cpus in old_domain, but not all CPUs have
         responded to it yet.
      
      In both cases we can assume that the next interrupt will arrive on the new
      vector, so we can clean up the old vectors on the cpus in the old_domain cpu
      mask.
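
The conclusion above can be modeled outside the kernel. The following is a minimal user-space sketch, not the actual patch: `struct vector_state`, `force_complete_move()` and the bitmask representation are hypothetical stand-ins for the kernel's per-interrupt vector data. It only illustrates the idea that both cases are resolved by releasing the old vectors directly, without sending a cleanup IPI and waiting for cpus that are stuck in stop machine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS 8

/* Hypothetical stand-in for the per-interrupt vector bookkeeping. */
struct vector_state {
	bool     move_in_progress;      /* case 1: new vector set, not fired yet */
	uint32_t old_domain;            /* bitmask of cpus still holding the old vector */
	int      old_vector;
	int      vector_owner[NR_CPUS]; /* per cpu: vector held for this irq, -1 if none */
};

/*
 * Model of the fixed behaviour: never wait for other cpus, because they
 * cannot respond while they sit in stop machine with interrupts disabled.
 * Returns the number of old vector slots that were released.
 */
int force_complete_move(struct vector_state *vs)
{
	int released = 0;

	assert(vs);

	/* Nothing pending: neither case 1 nor case 2 applies. */
	if (!vs->move_in_progress && vs->old_domain == 0)
		return 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (!(vs->old_domain & (1u << cpu)))
			continue;
		if (vs->vector_owner[cpu] == vs->old_vector) {
			vs->vector_owner[cpu] = -1;
			released++;
		}
	}

	/* The next interrupt is assumed to arrive on the new vector. */
	vs->old_domain = 0;
	vs->move_in_progress = false;
	return released;
}
```

Calling `force_complete_move()` a second time on the same state is a no-op in this model, which mirrors the fact that there is nothing left to clean up once the old_domain mask is empty.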
      
      Fixes: 98229aa3 "x86/irq: Plug vector cleanup race"
      Reported-by: Harry Junior <harryjr@outlook.fr>
      Tested-by: Tony Luck <tony.luck@intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Joe Lawrence <joe.lawrence@stratus.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Ben Hutchings <ben@decadent.org.uk>
      Cc: stable@vger.kernel.org
      Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1603140931430.3657@nanos
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      551adc60
    • T
      x86/tsc: Prevent NULL pointer deref in calibrate_delay_is_known() · f508a5ba
      Thomas Gleixner committed
      The topology_core_cpumask is used to find a neighbour cpu in
      calibrate_delay_is_known(). It might not be allocated at the first invocation
      of that function on the boot cpu, when CONFIG_CPUMASK_OFFSTACK is set.
      
      The mask is allocated later in native_smp_prepare_cpus. As a consequence the
      underlying find_next_bit() call dereferences a NULL pointer.
      
      Add a proper check to prevent this.
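
A minimal user-space sketch of the kind of check described above, under stated assumptions: `struct cpu_mask`, `find_sibling()` and `delay_is_known()` are hypothetical names, and the single-word bitmask is a simplification of the kernel's cpumask. It is not the actual patch; it only shows the pattern of testing for an unallocated mask before searching it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a cpumask that may not be allocated yet. */
struct cpu_mask {
	unsigned long bits;
};

/* Return the first cpu in the mask other than `self`, or -1 if there is none. */
int find_sibling(const struct cpu_mask *mask, int self)
{
	assert(self >= 0);

	/* The mask may still be unallocated early during boot. */
	if (!mask)
		return -1;

	for (int cpu = 0; cpu < (int)(8 * sizeof(mask->bits)); cpu++) {
		if (cpu != self && (mask->bits & (1UL << cpu)))
			return cpu;
	}
	return -1;
}

/* Return a neighbour's known delay value, or 0 meaning "calibrate yourself". */
unsigned long delay_is_known(const struct cpu_mask *mask, int self,
			     const unsigned long *lpj_per_cpu)
{
	int sibling = find_sibling(mask, self);

	return sibling >= 0 ? lpj_per_cpu[sibling] : 0;
}
```

With a NULL mask the model falls back to "not known" instead of dereferencing the pointer, which is the behaviour the commit message asks for on the boot cpu.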
      
      Fixes: c25323c0 "x86/tsc: Use topology functions"
      Reported-and-tested-by: Richard W.M. Jones <rjones@redhat.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Josh Boyer <jwboyer@fedoraproject.org>
      Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1603180843270.3978@nanos
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      f508a5ba
    • D
      x86/apic: Fix suspicious RCU usage in smp_trace_call_function_interrupt() · 7834c103
      Dave Jones committed
      Since 4.4, I've been able to trigger this occasionally:
      
      ===============================
      [ INFO: suspicious RCU usage. ]
      4.5.0-rc7-think+ #3 Not tainted
      Cc: Andi Kleen <ak@linux.intel.com>
      Link: http://lkml.kernel.org/r/20160315012054.GA17765@codemonkey.org.uk
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      
      -------------------------------
      ./arch/x86/include/asm/msr-trace.h:47 suspicious rcu_dereference_check() usage!
      
      other info that might help us debug this:
      
      RCU used illegally from idle CPU!
      rcu_scheduler_active = 1, debug_locks = 1
      RCU used illegally from extended quiescent state!
      no locks held by swapper/3/0.
      
      stack backtrace:
      CPU: 3 PID: 0 Comm: swapper/3 Not tainted 4.5.0-rc7-think+ #3
       ffffffff92f821e0 1f3e5c340597d7fc ffff880468e07f10 ffffffff92560c2a
       ffff880462145280 0000000000000001 ffff880468e07f40 ffffffff921376a6
       ffffffff93665ea0 0000cc7c876d28da 0000000000000005 ffffffff9383dd60
      Call Trace:
       <IRQ>  [<ffffffff92560c2a>] dump_stack+0x67/0x9d
       [<ffffffff921376a6>] lockdep_rcu_suspicious+0xe6/0x100
       [<ffffffff925ae7a7>] do_trace_write_msr+0x127/0x1a0
       [<ffffffff92061c83>] native_apic_msr_eoi_write+0x23/0x30
       [<ffffffff92054408>] smp_trace_call_function_interrupt+0x38/0x360
       [<ffffffff92d1ca60>] trace_call_function_interrupt+0x90/0xa0
       <EOI>  [<ffffffff92ac5124>] ? cpuidle_enter_state+0x1b4/0x520
      
      Move the entering_irq() call before ack_APIC_irq(), because entering_irq()
      tells the RCU subsystems to end the extended quiescent state, so that the
      following trace call in ack_APIC_irq() works correctly.
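
The ordering issue can be illustrated with a small user-space model. This is a sketch under assumptions, not kernel code: the `model_*` functions and the two handler variants are hypothetical, and a boolean flag stands in for RCU's notion of whether the cpu is being watched.

```c
#include <assert.h>
#include <stdbool.h>

static bool rcu_watching;        /* false while the cpu is in an extended quiescent state */
static int  suspicious_rcu_uses; /* counts trace calls made while RCU is not watching */

/* Stand-in for entering_irq(): ends the extended quiescent state. */
void model_entering_irq(void)
{
	rcu_watching = true;
}

/* Stand-in for exiting_irq(): the idle cpu goes back to the quiescent state. */
void model_exiting_irq(void)
{
	rcu_watching = false;
}

/* Stand-in for ack_APIC_irq(): the EOI write hits a tracepoint that uses RCU. */
void model_ack_irq(void)
{
	if (!rcu_watching)
		suspicious_rcu_uses++;
}

/* Old order: acknowledge first, then enter. Returns the number of warnings. */
int handler_old_order(void)
{
	rcu_watching = false;   /* interrupt arrives on an idle cpu */
	suspicious_rcu_uses = 0;
	model_ack_irq();
	model_entering_irq();
	model_exiting_irq();
	return suspicious_rcu_uses;
}

/* New order: enter first, then acknowledge. Returns the number of warnings. */
int handler_new_order(void)
{
	rcu_watching = false;
	suspicious_rcu_uses = 0;
	model_entering_irq();
	model_ack_irq();
	model_exiting_irq();
	return suspicious_rcu_uses;
}
```

In this model only the old ordering records a suspicious use, because the trace call runs before the quiescent state has been ended.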
      
      Suggested-by: Andi Kleen <ak@linux.intel.com>
      Fixes: 4787c368 "x86/tracing: Add irq_enter/exit() in smp_trace_reschedule_interrupt()"
      Signed-off-by: Dave Jones <davej@codemonkey.org.uk>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      7834c103
  3. 17 Mar, 2016 4 commits
  4. 16 Mar, 2016 14 commits
    • T
      x86/mm, x86/mce: Fix return type/value for memcpy_mcsafe() · cbf8b5a2
      Tony Luck committed
      Returning a 'bool' was very unpopular. Doubly so because the
      code was just wrong (returning zero for true, one for false;
      great for shell programming, not so good for C).
      
      Change return type to "int". Keep zero as the success indicator
      because it matches other similar code and people may be more
      comfortable writing:
      
      	if (memcpy_mcsafe(to, from, count)) {
      		printk("Sad panda, copy failed\n");
      		...
      	}
      
      Make the failure return value -EFAULT for now.
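
A minimal user-space sketch of the return convention described above. The function `copy_mcsafe_model()` and its `poisoned` parameter are hypothetical: the parameter merely simulates a machine check hitting the source during the copy, which the real function detects in a completely different way.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical model of the convention: return 0 on success and a
 * negative errno value (-EFAULT) on failure, so that callers can write
 * "if (copy(...)) handle_error();".
 */
int copy_mcsafe_model(void *to, const void *from, size_t count, int poisoned)
{
	assert(to && from);

	if (poisoned)
		return -EFAULT; /* copy failed, destination left untouched in this model */

	memcpy(to, from, count);
	return 0;
}
```

Because zero means success, a non-zero result reads naturally as the error path in an `if` statement, which is the usage the commit message has in mind.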
      
      Reported-by: Mika Penttilä <mika.penttila@nextfour.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: mika.penttila@nextfour.com
      Fixes: 92b0729c ("x86/mm, x86/mce: Add memcpy_mcsafe()")
      Link: http://lkml.kernel.org/r/695f14233fa7a54fcac4406c706d7fec228e3f4c.1457993040.git.tony.luck@intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      cbf8b5a2
    • L
      Merge branch 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 710d60cb
      Linus Torvalds committed
      Pull cpu hotplug updates from Thomas Gleixner:
       "This is the first part of the ongoing cpu hotplug rework:
      
         - Initial implementation of the state machine
      
         - Runs all online and prepare down callbacks on the plugged cpu and
           not on some random processor
      
         - Replaces busy loop waiting with completions
      
         - Adds tracepoints so the states can be followed"
      
      More detailed commentary on this work from an earlier email:
       "What's wrong with the current cpu hotplug infrastructure?
      
         - Asymmetry
      
           The hotplug notifier mechanism is asymmetric versus the bringup and
           teardown.  This is mostly caused by the notifier mechanism.
      
         - Largely undocumented dependencies
      
           While some notifiers use explicitly defined notifier priorities,
           we have quite some notifiers which use numerical priorities to
           express dependencies without any documentation why.
      
         - Control processor driven
      
           Most of the bringup/teardown of a cpu is driven by a control
           processor.  While it is understandable that preparatory steps,
           like idle thread creation, memory allocation for and initialization
           of essential facilities need to be done before a cpu can boot,
           there is no reason why everything else must run on a control
           processor.  Before this patch series, bringup looks like this:
      
             Control CPU                     Booting CPU
      
             do preparatory steps
             kick cpu into life
      
                                             do low level init
      
             sync with booting cpu           sync with control cpu
      
             bring the rest up
      
         - All or nothing approach
      
           There is no way to do partial bringups.  That's something which is
           really desired because we waste, e.g. at boot, a substantial amount of
           time just busy waiting for the cpu to come to life.  That's stupid
           as we could very well do preparatory steps and the initial IPI for
           other cpus and then go back and do the necessary low level
           synchronization with the freshly booted cpu.
      
         - Minimal debuggability
      
           Due to the notifier based design, it's impossible to switch between
           two stages of the bringup/teardown back and forth in order to test
           the correctness.  So in many hotplug notifiers the cancel
           mechanisms are either not existent or completely untested.
      
         - Notifier [un]registering is tedious
      
           To [un]register notifiers we need to protect against hotplug at
           every callsite.  There is no mechanism that bringup/teardown
           callbacks are issued on the online cpus, so every caller needs to
           do it itself.  That also includes error rollback.
      
        What's the new design?
      
           The base of the new design is a symmetric state machine, where both
           the control processor and the booting/dying cpu execute a well
           defined set of states.  Each state is symmetric in the end, except
           for some well defined exceptions, and the bringup/teardown can be
           stopped and reversed at almost all states.
      
           So the bringup of a cpu will look like this in the future:
      
             Control CPU                     Booting CPU
      
             do preparatory steps
             kick cpu into life
      
                                             do low level init
      
             sync with booting cpu           sync with control cpu
      
                                             bring itself up
      
           The synchronization step does not require the control cpu to wait.
           That mechanism can be done asynchronously via a worker or some
           other mechanism.
      
           The teardown can be made very similar, so that the dying cpu cleans
           up and brings itself down.  Cleanups which need to be done after
           the cpu is gone, can be scheduled asynchronously as well.
      
        There is a long way to this, as we need to refactor the notion when a
        cpu is available.  Today we set the cpu online right after it comes
        out of the low level bringup, which is not really correct.
      
        The proper mechanism is to set it to available, i.e. cpu local
        threads, like softirqd, hotplug thread etc. can be scheduled on that
        cpu, and once it finished all booting steps, it's set to online, so
        general workloads can be scheduled on it.  The reverse happens on
        teardown.  First thing to do is to forbid scheduling of general
        workloads, then teardown all the per cpu resources and finally shut it
        off completely.
      
        This patch series implements the basic infrastructure for this at the
        core level.  This includes the following:
      
         - Basic state machine implementation with well defined states, so
           ordering and prioritization can be expressed.
      
         - Interfaces to [un]register state callbacks
      
           This invokes the bringup/teardown callback on all online cpus with
           the proper protection in place and [un]installs the callbacks in
           the state machine array.
      
           For callbacks which have no particular ordering requirement we have
           a dynamic state space, so that drivers don't have to register an
           explicit hotplug state.
      
           If a callback fails, the code automatically does a rollback to the
           previous state.
      
         - Sysfs interface to drive the state machine to a particular step.
      
           This is only partially functional today.  Full functionality and
           therefore testability will be achieved once we have converted all
           existing hotplug notifiers over to the new scheme.
      
         - Run all CPU_ONLINE/DOWN_PREPARE notifiers on the booting/dying
           processor:
      
             Control CPU                     Booting CPU
      
             do preparatory steps
             kick cpu into life
      
                                             do low level init
      
             sync with booting cpu           sync with control cpu
             wait for boot
                                             bring itself up
      
                                             Signal completion to control cpu
      
           In a previous step of this work we've done a full tree mechanical
           conversion of all hotplug notifiers to the new scheme.  The balance
           is a net removal of about 4000 lines of code.
      
           This is not included in this series, as we decided to take a
           different approach.  Instead of mechanically converting everything
           over, we will do a proper overhaul of the usage sites one by one so
           they nicely fit into the symmetric callback scheme.
      
           I decided to do that after I looked at the ugliness of some of the
           converted sites and figured out that their hotplug mechanism is
           completely buggered anyway.  So there is no point to do a
           mechanical conversion first as we need to go through the usage
           sites one by one again in order to achieve a fully symmetric and
           testable behaviour"
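
The symmetric state walk with automatic rollback described in the quoted mail can be sketched in a few lines of user-space C. Everything here is an assumption made for illustration: `struct hp_step`, `hp_bringup()` and the `demo_*` callbacks are hypothetical names, not the kernel's interfaces, and the sketch rolls back to the state the walk started from.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*hp_fn)(unsigned int cpu);

/* Hypothetical state entry: a bringup callback and its symmetric teardown. */
struct hp_step {
	const char *name;
	hp_fn       startup;
	hp_fn       teardown;
};

/*
 * Walk the steps from *state up to (not including) `target`. If a startup
 * callback fails, undo the steps completed by this walk in reverse order.
 * Returns 0 on success or the error code of the failing callback.
 */
int hp_bringup(const struct hp_step *steps, size_t nsteps,
	       unsigned int cpu, size_t target, size_t *state)
{
	size_t start = *state;

	assert(target <= nsteps && start <= target);

	while (*state < target) {
		hp_fn fn = steps[*state].startup;
		int ret = fn ? fn(cpu) : 0;

		if (ret) {
			/* Roll back everything this walk has brought up. */
			while (*state > start) {
				(*state)--;
				if (steps[*state].teardown)
					steps[*state].teardown(cpu);
			}
			return ret;
		}
		(*state)++;
	}
	return 0;
}

/* Demo callbacks: "prepare" succeeds, "online" always fails. */
int demo_prepared;

int demo_prepare(unsigned int cpu)
{
	(void)cpu;
	demo_prepared = 1;
	return 0;
}

int demo_unprepare(unsigned int cpu)
{
	(void)cpu;
	demo_prepared = 0;
	return 0;
}

int demo_online_fails(unsigned int cpu)
{
	(void)cpu;
	return -1;
}

const struct hp_step demo_steps[] = {
	{ "prepare", demo_prepare,      demo_unprepare },
	{ "online",  demo_online_fails, NULL },
};
```

Driving the walk only up to an intermediate `target` corresponds to the partial bringup the mail asks for, and the reverse walk on failure is what makes each step testable in both directions.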
      
      * 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
        cpu/hotplug: Document states better
        cpu/hotplug: Fix smpboot thread ordering
        cpu/hotplug: Remove redundant state check
        cpu/hotplug: Plug death reporting race
        rcu: Make CPU_DYING_IDLE an explicit call
        cpu/hotplug: Make wait for dead cpu completion based
        cpu/hotplug: Let upcoming cpu bring itself fully up
        arch/hotplug: Call into idle with a proper state
        cpu/hotplug: Move online calls to hotplugged cpu
        cpu/hotplug: Create hotplug threads
        cpu/hotplug: Split out the state walk into functions
        cpu/hotplug: Unpark smpboot threads from the state machine
        cpu/hotplug: Move scheduler cpu_online notifier to hotplug core
        cpu/hotplug: Implement setup/removal interface
        cpu/hotplug: Make target state writeable
        cpu/hotplug: Add sysfs state interface
        cpu/hotplug: Hand in target state to _cpu_up/down
        cpu/hotplug: Convert the hotplugged cpu work to a state machine
        cpu/hotplug: Convert to a state machine for the control processor
        cpu/hotplug: Add tracepoints
        ...
      710d60cb
    • L
      Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · df2e37c8
      Linus Torvalds committed
      Pull irq updates from Thomas Gleixner:
       "The 4.6 pile of irq updates contains:
      
         - Support for IPI irqdomains to support proper integration of IPIs to
           and from coprocessors.  The first user of this new facility is
           MIPS.  The relevant MIPS patches come with the core to avoid merge
           ordering issues and have been acked by Ralf.
      
         - A new command line option to set the default interrupt affinity
           mask at boot time.
      
         - Support for some more new ARM and MIPS interrupt controllers:
           tango, alpine-msix and bcm6345-l1
      
         - Two small cleanups for x86/apic which we merged into irq/core to
           avoid yet another branch in x86 with two tiny commits.
      
         - The usual set of updates, cleanups in drivers/irqchip.  Mostly in
           the area of ARM-GIC, armada-370-xp and atmel chips.  Nothing
           outstanding here"
      
      * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (56 commits)
        irqchip/irq-alpine-msi: Release the correct domain on error
        irqchip/mxs: Fix error check of of_io_request_and_map()
        irqchip/sunxi-nmi: Fix error check of of_io_request_and_map()
        genirq: Export IRQ functions for module use
        irqchip/gic/realview: Support more RealView DCC variants
        Documentation/bindings: Document the Alpine MSIX driver
        irqchip: Add the Alpine MSIX interrupt controller
        irqchip/gic-v3: Always return IRQ_SET_MASK_OK_DONE in gic_set_affinity
        irqchip/gic-v3-its: Mark its_init() and its children as __init
        irqchip/gic-v3: Remove gic_root_node variable from the ITS code
        irqchip/gic-v3: ACPI: Add redistributor support via GICC structures
        irqchip/gic-v3: Add ACPI support for GICv3/4 initialization
        irqchip/gic-v3: Refactor gic_of_init() for GICv3 driver
        x86/apic: Deinline _flat_send_IPI_mask, save ~150 bytes
        x86/apic: Deinline __default_send_IPI_*, save ~200 bytes
        dt-bindings: interrupt-controller: Add SoC-specific compatible string to Marvell ODMI
        irqchip/mips-gic: Add new DT property to reserve IPIs
        MIPS: Delete smp-gic.c
        MIPS: Make smp CMP, CPS and MT use the new generic IPI functions
        MIPS: Add generic SMP IPI support
        ...
      df2e37c8
    • L
      Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 8a284c06
      Linus Torvalds committed
      Pull timer updates from Thomas Gleixner:
       "The timer department delivers this time:
      
         - Support for cross clock domain timestamps in the core code plus a
           first user.  That allows more precise timestamping for PTP and
           later for audio and other peripherals.
      
           The ptp/e1000e patches have been acked by the relevant maintainers
           and are carried in the timer tree to avoid merge ordering issues.
      
         - Support for unregistering the current clocksource watchdog.  That
           lifts a limitation for switching clocksources which has been there
           from day 1
      
         - The usual pile of fixes and updates to the core and the drivers.
           Nothing outstanding and exciting"
      
      * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (26 commits)
        time/timekeeping: Work around false positive GCC warning
        e1000e: Adds hardware supported cross timestamp on e1000e nic
        ptp: Add PTP_SYS_OFFSET_PRECISE for driver crosstimestamping
        x86/tsc: Always Running Timer (ART) correlated clocksource
        hrtimer: Revert CLOCK_MONOTONIC_RAW support
        time: Add history to cross timestamp interface supporting slower devices
        time: Add driver cross timestamp interface for higher precision time synchronization
        time: Remove duplicated code in ktime_get_raw_and_real()
        time: Add timekeeping snapshot code capturing system time and counter
        time: Add cycles to nanoseconds translation
        jiffies: Use CLOCKSOURCE_MASK instead of constant
        clocksource: Introduce clocksource_freq2mult()
        clockevents/drivers/exynos_mct: Implement ->set_state_oneshot_stopped()
        clockevents/drivers/arm_global_timer: Implement ->set_state_oneshot_stopped()
        clockevents/drivers/arm_arch_timer: Implement ->set_state_oneshot_stopped()
        clocksource/drivers/arm_global_timer: Register delay timer
        clocksource/drivers/lpc32xx: Support timer-based ARM delay
        clocksource/drivers/lpc32xx: Support periodic mode
        clocksource/drivers/lpc32xx: Don't use the prescaler counter for clockevents
        clocksource/drivers/rockchip: Add err handle for rk_timer_init
        ...
      8a284c06
    • L
      Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 208de214
      Linus Torvalds committed
      Pull RCU updates from Ingo Molnar:
       "The main changes in this cycle were:
      
         - Miscellaneous fixes, cleanups, restructuring.
      
         - RCU torture-test updates"
      
      * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        rcu: Export rcu_gp_is_normal()
        rcu: Remove rcu_user_hooks_switch
        rcu: Catch up rcu_report_qs_rdp() comment with reality
        rcu: Document unique-name limitation for DEFINE_STATIC_SRCU()
        rcu: Make rcu/tiny_plugin.h explicitly non-modular
        irq: Privatize irq_common_data::state_use_accessors
        RCU: Privatize rcu_node::lock
        sparse: Add __private to privatize members of structs
        rcu: Remove useless rcu_data_p when !PREEMPT_RCU
        rcutorture: Correct no-expedite console messages
        rcu: Set rdp->gpwrap when CPU is idle
        rcu: Stop treating in-kernel CPU-bound workloads as errors
        rcu: Update rcu_report_qs_rsp() comment
        rcu: Assign false instead of 0 for ->core_needs_qs
        rcutorture: Check for self-detected stalls
        rcutorture: Don't keep empty console.log.diags files
        rcutorture: Add checks for rcutorture writer starvation
      208de214
    • L
      Merge branch 'x86-timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · ae465bee
      Linus Torvalds committed
      Pull x86 timer update from Ingo Molnar:
       "A single simplification of the x86 TSC code"
      
      * 'x86-timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/tsc: Use topology functions
      ae465bee
    • L
      Merge branch 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 8ab84ef6
      Linus Torvalds committed
      Pull x86 core platform updates from Ingo Molnar:
       "Intel Quark and Geode SoC platform updates"
      
      * 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/platform/intel/quark: Drop IMR lock bit support
        x86/platform/intel/mid: Remove dead code
        x86/platform: Make platform/geode/net5501.c explicitly non-modular
        x86/platform: Make platform/geode/alix.c explicitly non-modular
        x86/platform: Make platform/geode/geos.c explicitly non-modular
        x86/platform: Make platform/intel-quark/imr_selftest.c explicitly non-modular
        x86/platform: Make platform/intel-quark/imr.c explicitly non-modular
      8ab84ef6
    • L
      Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 13c76ad8
      Linus Torvalds committed
      Pull x86 mm updates from Ingo Molnar:
       "The main changes in this cycle were:
      
         - Enable full ASLR randomization for 32-bit programs (Hector
           Marco-Gisbert)
      
         - Add initial minimal INVPCID support, to flush global mappings (Andy
           Lutomirski)
      
         - Add KASAN enhancements (Andrey Ryabinin)
      
         - Fix mmiotrace for huge pages (Karol Herbst)
      
         - ... misc cleanups and small enhancements"
      
      * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/mm/32: Enable full randomization on i386 and X86_32
        x86/mm/kmmio: Fix mmiotrace for hugepages
        x86/mm: Avoid premature success when changing page attributes
        x86/mm/ptdump: Remove paravirt_enabled()
        x86/mm: Fix INVPCID asm constraint
        x86/dmi: Switch dmi_remap() from ioremap() [uncached] to ioremap_cache()
        x86/mm: If INVPCID is available, use it to flush global mappings
        x86/mm: Add a 'noinvpcid' boot option to turn off INVPCID
        x86/mm: Add INVPCID helpers
        x86/kasan: Write protect kasan zero shadow
        x86/kasan: Clear kasan_zero_page after TLB flush
        x86/mm/numa: Check for failures in numa_clear_kernel_node_hotplug()
        x86/mm/numa: Clean up numa_clear_kernel_node_hotplug()
        x86/mm: Make kmap_prot into a #define
        x86/mm/32: Set NX in __supported_pte_mask before enabling paging
        x86/mm: Streamline and restore probe_memory_block_size()
      13c76ad8
    • L
      Merge branch 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 9cf8d636
      Linus Torvalds committed
      Pull x86 microcode updates from Ingo Molnar:
       "The biggest change in this cycle was the separation of the microcode
        loading mechanism from the initrd code plus the support of built-in
        microcode images.
      
        There were also lots of cleanups and general restructuring (by Borislav
        Petkov)"
      
      * 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
        x86/microcode/intel: Drop orig_sum from ext signature checksum
        x86/microcode/intel: Improve microcode sanity-checking error messages
        x86/microcode/intel: Merge two consecutive if-statements
        x86/microcode/intel: Get rid of DWSIZE
        x86/microcode/intel: Change checksum variables to u32
        x86/microcode: Use kmemdup() rather than duplicating its implementation
        x86/microcode: Remove unnecessary paravirt_enabled check
        x86/microcode: Document builtin microcode loading method
        x86/microcode/AMD: Issue microcode updated message later
        x86/microcode/intel: Cleanup get_matching_model_microcode()
        x86/microcode/intel: Remove unused arg of get_matching_model_microcode()
        x86/microcode/intel: Rename mc_saved_in_initrd
        x86/microcode/intel: Use *wrmsrl variants
        x86/microcode/intel: Cleanup apply_microcode_intel()
        x86/microcode/intel: Move the BUG_ON up and turn it into WARN_ON
        x86/microcode/intel: Rename mc_intel variable to mc
        x86/microcode/intel: Rename mc_saved_count to num_saved
        x86/microcode/intel: Rename local variables of type struct mc_saved_data
        x86/microcode/AMD: Drop redundant printk prefix
        x86/microcode: Issue update message only once
        ...
      9cf8d636
    • L
      Merge branch 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · ecc026bf
      Linus Torvalds committed
      Pull x86 fpu updates from Ingo Molnar:
       "The biggest change in terms of impact is the changing of the FPU
        context switch model to 'eagerfpu' for all CPU types, via: commit
        58122bf1: "x86/fpu: Default eagerfpu=on on all CPUs"
      
        This makes all FPU saves and restores synchronous and makes the FPU
        code a lot more obvious to read.  In the next cycle, if this change is
        problem free, we'll remove the old lazy FPU restore code altogether.
      
        This change flushed out some old bugs, which should all be fixed by
        now, BYMMV"
      
      * 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/fpu: Default eagerfpu=on on all CPUs
        x86/fpu: Speed up lazy FPU restores slightly
        x86/fpu: Fold fpu_copy() into fpu__copy()
        x86/fpu: Fix FNSAVE usage in eagerfpu mode
        x86/fpu: Fix math emulation in eager fpu mode
      ecc026bf
    • L
      Merge branch 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · fa53c489
      Linus Torvalds committed
      Pull x86 build update from Ingo Molnar:
       "A single adjustment of a defconfig value"
      
      * 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/defconfigs/32: Set CONFIG_FRAME_WARN to the Kconfig default
      fa53c489
    • L
      Merge branch 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 42576bee
      Linus Torvalds committed
      Pull x86 boot updates from Ingo Molnar:
       "Early command line options parsing enhancements from Dave Hansen, plus
        minor cleanups and enhancements"
      
      * 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/boot: Remove unused 'is_big_kernel' variable
        x86/boot: Use proper array element type in memset() size calculation
        x86/boot: Pass in size to early cmdline parsing
        x86/boot: Simplify early command line parsing
        x86/boot: Fix early command-line parsing when partial word matches
        x86/boot: Fix early command-line parsing when matching at end
        x86/boot: Simplify kernel load address alignment check
        x86/boot: Micro-optimize reset_early_page_tables()
      42576bee
    • L
      Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · ba33ea81
      Linus Torvalds committed
      Pull x86 asm updates from Ingo Molnar:
       "This is another big update. Main changes are:
      
         - lots of x86 system call (and other traps/exceptions) entry code
           enhancements.  In particular the complex parts of the 64-bit entry
           code have been migrated to C code as well, and a number of dusty
           corners have been refreshed.  (Andy Lutomirski)
      
         - vDSO special mapping robustification and general cleanups (Andy
           Lutomirski)
      
         - cpufeature refactoring, cleanups and speedups (Borislav Petkov)
      
         - lots of other changes ..."
      
      * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (64 commits)
        x86/cpufeature: Enable new AVX-512 features
        x86/entry/traps: Show unhandled signal for i386 in do_trap()
        x86/entry: Call enter_from_user_mode() with IRQs off
        x86/entry/32: Change INT80 to be an interrupt gate
        x86/entry: Improve system call entry comments
        x86/entry: Remove TIF_SINGLESTEP entry work
        x86/entry/32: Add and check a stack canary for the SYSENTER stack
        x86/entry/32: Simplify and fix up the SYSENTER stack #DB/NMI fixup
        x86/entry: Only allocate space for tss_struct::SYSENTER_stack if needed
        x86/entry: Vastly simplify SYSENTER TF (single-step) handling
        x86/entry/traps: Clear DR6 early in do_debug() and improve the comment
        x86/entry/traps: Clear TIF_BLOCKSTEP on all debug exceptions
        x86/entry/32: Restore FLAGS on SYSEXIT
        x86/entry/32: Filter NT and speed up AC filtering in SYSENTER
        x86/entry/compat: In SYSENTER, sink AC clearing below the existing FLAGS test
        selftests/x86: In syscall_nt, test NT|TF as well
        x86/asm-offsets: Remove PARAVIRT_enabled
        x86/entry/32: Introduce and use X86_BUG_ESPFIX instead of paravirt_enabled
        uprobes: __create_xol_area() must nullify xol_mapping.fault
        x86/cpufeature: Create a new synthetic cpu capability for machine check recovery
        ...
      ba33ea81
  5. 15 Mar, 2016 11 commits
    • V
      x86/video: Don't assume all FB devices are PCI devices · 743146db
      Vitaly Kuznetsov committed
      When booting Hyper-V Generation 2 guests, KASAN reports the following
      out-of-bounds access:
      
        BUG: KASAN: slab-out-of-bounds in fb_is_primary_device+0x58/0x70 at addr ffff880079cf0eb0
        Read of size 8 by task swapper/0/1
        ...
         [<ffffffff81581308>] dump_stack+0x63/0x8b
         [<ffffffff812e1f99>] print_trailer+0xf9/0x150
         [<ffffffff812e7344>] object_err+0x34/0x40
         [<ffffffff812e9630>] kasan_report_error+0x230/0x550
         [<ffffffff812e9ee8>] kasan_report+0x58/0x60
         [<ffffffff812e4500>] ? ___slab_alloc+0x80/0x490
         [<ffffffff81878a28>] ? fb_is_primary_device+0x58/0x70
         [<ffffffff812e87cd>] __asan_load8+0x5d/0x70
         [<ffffffff81878a28>] fb_is_primary_device+0x58/0x70
         [<ffffffff8162357a>] register_framebuffer+0xda/0x5b0
         [<ffffffff816234a0>] ? remove_conflicting_framebuffers+0x50/0x50
        ...
      
      The issue is caused by calling to_pci_dev() without checking that the
      given info->device is in fact a PCI device; some FB devices (Hyper-V FB,
      EFI FB, ...) are not.
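
      A minimal userspace sketch of the missing check. The struct layouts,
      the 'vmbus' bus object, the 'is_boot_vga' field and the helper names
      checked_to_pci_dev() and fb_is_primary() are assumptions made for
      illustration; this is not the actual patch, only the pattern of testing
      the bus type before recovering the enclosing PCI structure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel types; all names are illustrative. */
struct bus_type { const char *name; };

struct device {
    const struct bus_type *bus;   /* which bus the device sits on */
};

struct pci_dev {
    int is_boot_vga;              /* hypothetical PCI-only field */
    struct device dev;            /* embedded generic device */
};

static const struct bus_type pci_bus = { "pci" };
static const struct bus_type vmbus   = { "vmbus" };

/* Same idea as the kernel's container_of(): recover the outer struct. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

static bool dev_is_pci(const struct device *d)
{
    return d != NULL && d->bus == &pci_bus;
}

/* Returns the enclosing pci_dev, or NULL when the device is not PCI. */
static struct pci_dev *checked_to_pci_dev(struct device *d)
{
    if (!dev_is_pci(d))
        return NULL;              /* without this, the cast reads foreign memory */
    return container_of(d, struct pci_dev, dev);
}

/* Sketch of a primary-device test that is safe for non-PCI framebuffers. */
static bool fb_is_primary(struct device *d)
{
    assert(d != NULL);
    struct pci_dev *pdev = checked_to_pci_dev(d);
    return pdev != NULL && pdev->is_boot_vga != 0;
}
```

      A device that is not embedded in a 'struct pci_dev' (the 'vmbus' case
      here) is rejected before any PCI-only field is read.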
      
      While at it, clean up the function.

      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Acked-by: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Bjorn Helgaas <helgaas@kernel.org>
      Cc: Cathy Avery <cavery@redhat.com>
      Cc: K. Y. Srinivasan <kys@microsoft.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1458030033-10122-1-git-send-email-vkuznets@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      743146db
    • I
      Merge commit 'torture.2015.02.23a' into core/rcu · 67019157
      Ingo Molnar committed
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      67019157
    • I
      Merge commit 'fixes.2015.02.23a' into core/rcu · 8bc6782f
      Ingo Molnar committed
       Conflicts:
      	kernel/rcu/tree.c
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      8bc6782f
    • L
      Merge branch 'timers-nohz-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · e23604ed
      Linus Torvalds committed
      Pull NOHZ updates from Ingo Molnar:
       "NOHZ enhancements, by Frederic Weisbecker, which reorganize/refactor
        the NOHZ 'can the tick be stopped?' infrastructure and related code to
        be data driven, and harmonize the naming and handling of all the
        various properties"
      
      [ This makes the ugly "fetch_or()" macro that the scheduler used
        internally a new generic helper, and does a bad job at it.
      
        I'm pulling it, but I've asked Ingo and Frederic to get this
        fixed up ]
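
      A hedged userspace sketch of a data-driven "can the tick be stopped?"
      check built on a fetch-or primitive. It uses C11 atomic_fetch_or() as a
      stand-in for the kernel helper; the dependency bit names and the
      functions tick_dep_set(), tick_dep_clear() and can_stop_tick() are
      hypothetical and only illustrate the mask idea.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical dependency bits: each subsystem that needs the tick owns one. */
enum {
    DEP_POSIX_TIMER = 1u << 0,
    DEP_PERF_EVENTS = 1u << 1,
    DEP_SCHED       = 1u << 2,
};

/* Zero-initialised: no subsystem depends on the tick. */
static atomic_uint tick_dep_mask;

/*
 * Set a dependency bit. fetch-or returns the previous mask, so the caller
 * learns whether this was the first dependency (the point at which a
 * stopped tick would have to be restarted).
 */
static bool tick_dep_set(unsigned int bit)
{
    assert(bit != 0);
    unsigned int prev = atomic_fetch_or(&tick_dep_mask, bit);
    return prev == 0;
}

static void tick_dep_clear(unsigned int bit)
{
    atomic_fetch_and(&tick_dep_mask, ~bit);
}

/* The tick may stop only when no dependency bit is set. */
static bool can_stop_tick(void)
{
    return atomic_load(&tick_dep_mask) == 0;
}
```

      The decision becomes a single mask test instead of a chain of
      per-subsystem queries.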
      
      * 'timers-nohz-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sched-clock: Migrate to use new tick dependency mask model
        posix-cpu-timers: Migrate to use new tick dependency mask model
        sched: Migrate sched to use new tick dependency mask model
        sched: Account rr tasks
        perf: Migrate perf to use new tick dependency mask model
        nohz: Use enum code for tick stop failure tracing message
        nohz: New tick dependency mask
        nohz: Implement wide kick on top of irq work
        atomic: Export fetch_or()
      e23604ed
    • L
      Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · d4e79615
      Linus Torvalds committed
      Pull scheduler updates from Ingo Molnar:
       "The main changes in this cycle are:
      
         - Make schedstats a runtime tunable (disabled by default) and
           optimize it via static keys.
      
           As most distributions enable CONFIG_SCHEDSTATS=y due to its
           instrumentation value, this is a nice performance enhancement.
           (Mel Gorman)
      
         - Implement 'simple waitqueues' (swait): these are just pure
           waitqueues without any of the more complex features of full-blown
           waitqueues (callbacks, wake flags, wake keys, etc.).  Simple
           waitqueues have less memory overhead and are faster.
      
           Use simple waitqueues in the RCU code (in 4 different places) and
           for handling KVM vCPU wakeups.
      
           (Peter Zijlstra, Daniel Wagner, Thomas Gleixner, Paul Gortmaker,
           Marcelo Tosatti)
      
         - sched/numa enhancements (Rik van Riel)
      
         - NOHZ performance enhancements (Rik van Riel)
      
         - Various sched/deadline enhancements (Steven Rostedt)
      
         - Various fixes (Peter Zijlstra)
      
         - ... and a number of other fixes, cleanups and smaller enhancements"
      
      * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (29 commits)
        sched/cputime: Fix steal_account_process_tick() to always return jiffies
        sched/deadline: Remove dl_new from struct sched_dl_entity
        Revert "kbuild: Add option to turn incompatible pointer check into error"
        sched/deadline: Remove superfluous call to switched_to_dl()
        sched/debug: Fix preempt_disable_ip recording for preempt_disable()
        sched, time: Switch VIRT_CPU_ACCOUNTING_GEN to jiffy granularity
        time, acct: Drop irq save & restore from __acct_update_integrals()
        acct, time: Change indentation in __acct_update_integrals()
        sched, time: Remove non-power-of-two divides from __acct_update_integrals()
        sched/rt: Kick RT bandwidth timer immediately on start up
        sched/debug: Add deadline scheduler bandwidth ratio to /proc/sched_debug
        sched/debug: Move sched_domain_sysctl to debug.c
        sched/debug: Move the /sys/kernel/debug/sched_features file setup into debug.c
        sched/rt: Fix PI handling vs. sched_setscheduler()
        sched/core: Remove duplicated sched_group_set_shares() prototype
        sched/fair: Consolidate nohz CPU load update code
        sched/fair: Avoid using decay_load_missed() with a negative value
        sched/deadline: Always calculate end of period on sched_yield()
        sched/cgroup: Fix cgroup entity load tracking tear-down
        rcu: Use simple wait queues where possible in rcutree
        ...
      d4e79615
    • L
      Merge branch 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · d88bfe1d
      Linus Torvalds committed
      Pull RAS updates from Ingo Molnar:
       "Various RAS updates:
      
         - AMD MCE support updates for future CPUs, fixes and 'SMCA' (Scalable
           MCA) error decoding support (Aravind Gopalakrishnan)
      
         - x86 memcpy_mcsafe() support, to enable smart(er) hardware error
           recovery in NVDIMM drivers, based on an extension of the x86
           exception handling code.  (Tony Luck)"
      
      * 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        EDAC/sb_edac: Fix computation of channel address
        x86/mm, x86/mce: Add memcpy_mcsafe()
        x86/mce/AMD: Document some functionality
        x86/mce: Clarify comments regarding deferred error
        x86/mce/AMD: Fix logic to obtain block address
        x86/mce/AMD, EDAC: Enable error decoding of Scalable MCA errors
        x86/mce: Move MCx_CONFIG MSR definitions
        x86/mce: Check for faults tagged in EXTABLE_CLASS_FAULT exception table entries
        x86/mm: Expand the exception table logic to allow new handling options
        x86/mce/AMD: Set MCAX Enable bit
        x86/mce/AMD: Carve out threshold block preparation
        x86/mce/AMD: Fix LVT offset configuration for thresholding
        x86/mce/AMD: Reduce number of blocks scanned per bank
        x86/mce/AMD: Do not perform shared bank check for future processors
        x86/mce: Fix order of AMD MCE init function call
      d88bfe1d
    • L
      Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · e71c2c1e
      Linus Torvalds committed
      Pull perf updates from Ingo Molnar:
       "Main kernel side changes:
      
         - Big reorganization of the x86 perf support code.  The old code grew
           organically deep inside arch/x86/kernel/cpu/perf* and its naming
           became somewhat messy.
      
           The new location is under arch/x86/events/, using the following
           cleaner hierarchy of source code files:
      
             perf/x86: Move perf_event.c .................. => x86/events/core.c
             perf/x86: Move perf_event_amd.c .............. => x86/events/amd/core.c
             perf/x86: Move perf_event_amd_ibs.c .......... => x86/events/amd/ibs.c
             perf/x86: Move perf_event_amd_iommu.[ch] ..... => x86/events/amd/iommu.[ch]
             perf/x86: Move perf_event_amd_uncore.c ....... => x86/events/amd/uncore.c
             perf/x86: Move perf_event_intel_bts.c ........ => x86/events/intel/bts.c
             perf/x86: Move perf_event_intel.c ............ => x86/events/intel/core.c
             perf/x86: Move perf_event_intel_cqm.c ........ => x86/events/intel/cqm.c
             perf/x86: Move perf_event_intel_cstate.c ..... => x86/events/intel/cstate.c
             perf/x86: Move perf_event_intel_ds.c ......... => x86/events/intel/ds.c
             perf/x86: Move perf_event_intel_lbr.c ........ => x86/events/intel/lbr.c
             perf/x86: Move perf_event_intel_pt.[ch] ...... => x86/events/intel/pt.[ch]
             perf/x86: Move perf_event_intel_rapl.c ....... => x86/events/intel/rapl.c
             perf/x86: Move perf_event_intel_uncore.[ch] .. => x86/events/intel/uncore.[ch]
             perf/x86: Move perf_event_intel_uncore_nhmex.c => x86/events/intel/uncore_nmhex.c
             perf/x86: Move perf_event_intel_uncore_snb.c   => x86/events/intel/uncore_snb.c
             perf/x86: Move perf_event_intel_uncore_snbep.c => x86/events/intel/uncore_snbep.c
             perf/x86: Move perf_event_knc.c .............. => x86/events/intel/knc.c
             perf/x86: Move perf_event_p4.c ............... => x86/events/intel/p4.c
             perf/x86: Move perf_event_p6.c ............... => x86/events/intel/p6.c
             perf/x86: Move perf_event_msr.c .............. => x86/events/msr.c
      
           (Borislav Petkov)
      
         - Update various x86 PMU constraint and hw support details (Stephane
           Eranian)
      
         - Optimize kprobes for BPF execution (Martin KaFai Lau)
      
         - Rewrite, refactor and fix the Intel uncore PMU driver code (Thomas
           Gleixner)
      
         - Rewrite, refactor and fix the Intel RAPL PMU code (Thomas Gleixner)
      
         - Various fixes and smaller cleanups.
      
        There are lots of perf tooling updates as well.  A few highlights:
      
        perf report/top:
      
           - Hierarchy histogram mode for 'perf top' and 'perf report',
             showing multiple levels, one per --sort entry: (Namhyung Kim)
      
             On a mostly idle system:
      
               # perf top --hierarchy -s comm,dso
      
             Then expand some levels and use 'P' to take a snapshot:
      
               # cat perf.hist.0
               -  92.32%         perf
                     58.20%         perf
                     22.29%         libc-2.22.so
                      5.97%         [kernel]
                      4.18%         libelf-0.165.so
                      1.69%         [unknown]
               -   4.71%         qemu-system-x86
                      3.10%         [kernel]
                      1.60%         qemu-system-x86_64 (deleted)
               +   2.97%         swapper
               #
      
           - Add 'L' hotkey to dynamically set the percent threshold for
             histogram entries and callchains, i.e.  dynamically do what the
             --percent-limit command line option to 'top' and 'report' does.
             (Namhyung Kim)
      
        perf mem:
      
           - Allow specifying events via -e in 'perf mem record', also listing
             what events can be specified via 'perf mem record -e list' (Jiri
             Olsa)
      
        perf record:
      
           - Add 'perf record' --all-user/--all-kernel options, so that one
             can tell that all the events in the command line should be
             restricted to the user or kernel levels (Jiri Olsa), i.e.:
      
               perf record -e cycles:u,instructions:u
      
             is equivalent to:
      
               perf record --all-user -e cycles,instructions
      
           - Make 'perf record' collect CPU cache info in the perf.data file header:
      
               $ perf record usleep 1
               [ perf record: Woken up 1 times to write data ]
               [ perf record: Captured and wrote 0.017 MB perf.data (7 samples) ]
               $ perf report --header-only -I | tail -10 | head -8
               # CPU cache info:
               #  L1 Data                 32K [0-1]
               #  L1 Instruction          32K [0-1]
               #  L1 Data                 32K [2-3]
               #  L1 Instruction          32K [2-3]
               #  L2 Unified             256K [0-1]
               #  L2 Unified             256K [2-3]
               #  L3 Unified            4096K [0-3]
      
             This will be used in 'perf c2c' and eventually in 'perf diff',
             for instance to run the same workload on multiple machines and
             then have 'diff' show the hardware differences.
             (Jiri Olsa)
      
           - Improved support for Java: the JVMTI agent library produces
             jitdumps, which 'perf inject' then inserts as synthesized
             PERF_RECORD_MMAP2 events pointing to synthesized ELF files
             stored in ~/.debug and keyed with build-ids.  This allows
             symbol resolution and even annotation with source line info;
             see the changeset comments for how to use it (Stephane Eranian)
      
        perf script/trace:
      
           - Decode data_src values (e.g.  perf.data files generated by 'perf
             mem record') in 'perf script': (Jiri Olsa)
      
               # perf script
                 perf 693 [1] 4.088652: 1 cpu/mem-loads,ldlat=30/P: ffff88007d0b0f40 68100142 L1 hit|SNP None|TLB L1 or L2 hit|LCK No <SNIP>
                                                                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
           - Improve support to 'data_src', 'weight' and 'addr' fields in
             'perf script' (Jiri Olsa)
      
           - Handle empty print fmts in 'perf script -s' i.e. when running
             python or perl scripts (Taeung Song)
      
        perf stat:
      
           - 'perf stat' now shows shadow metrics (insn per cycle, etc) in
             interval mode too.  E.g:
      
               # perf stat -I 1000 -e instructions,cycles sleep 1
               #         time   counts unit events
                  1.000215928  519,620      instructions     #  0.69 insn per cycle
                  1.000215928  752,003      cycles
               <SNIP>
      
           - Port 'perf kvm stat' to PowerPC (Hemant Kumar)
      
           - Implement CSV metrics output in 'perf stat' (Andi Kleen)
      
        perf BPF support:
      
           - Support converting data from bpf events in 'perf data' (Wang Nan)
      
           - Print bpf-output events in 'perf script': (Wang Nan).
      
               # perf record -e bpf-output/no-inherit,name=evt/ -e ./test_bpf_output_3.c/map:channel.event=evt/ usleep 1000
               # perf script
                  usleep  4882 21384.532523:   evt:  ffffffff810e97d1 sys_nanosleep ([kernel.kallsyms])
                   BPF output: 0000: 52 61 69 73 65 20 61 20  Raise a
                               0008: 42 50 46 20 65 76 65 6e  BPF even
                               0010: 74 21 00 00              t!..
                   BPF string: "Raise a BPF event!"
               #
      
           - Add API to set values of map entries in a BPF object, be it
             individual map slots or ranges (Wang Nan)
      
           - Introduce support for the 'bpf-output' event (Wang Nan)
      
           - Add glue to read perf events in a BPF program (Wang Nan)
      
           - Improve support for bpf-output events in 'perf trace' (Wang Nan)
      
        ... and tons of other changes as well - see the shortlog and git log
        for details!"
      
      * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (342 commits)
        perf stat: Add --metric-only support for -A
        perf stat: Implement --metric-only mode
        perf stat: Document CSV format in manpage
        perf hists browser: Check sort keys before hot key actions
        perf hists browser: Allow thread filtering for comm sort key
        perf tools: Add sort__has_comm variable
        perf tools: Recalc total periods using top-level entries in hierarchy
        perf tools: Remove nr_sort_keys field
        perf hists browser: Cleanup hist_browser__fprintf_hierarchy_entry()
        perf tools: Remove hist_entry->fmt field
        perf tools: Fix command line filters in hierarchy mode
        perf tools: Add more sort entry check functions
        perf tools: Fix hist_entry__filter() for hierarchy
        perf jitdump: Build only on supported archs
        tools lib traceevent: Add '~' operation within arg_num_eval()
        perf tools: Omit unnecessary cast in perf_pmu__parse_scale
        perf tools: Pass perf_hpp_list all the way through setup_sort_list
        perf tools: Fix perf script python database export crash
        perf jitdump: DWARF is also needed
        perf bench mem: Prepare the x86-64 build for upstream memcpy_mcsafe() changes
        ...
      e71c2c1e
    • L
      Merge branch 'mm-readonly-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · d09e356a
      Linus Torvalds committed
      Pull read-only kernel memory updates from Ingo Molnar:
       "This tree adds two (security related) enhancements to the kernel's
        handling of read-only kernel memory:
      
         - extend read-only kernel memory to a new class of formerly writable
           kernel data: 'post-init read-only memory' via the __ro_after_init
           attribute, and mark the ARM and x86 vDSO as such read-only memory.
      
           This kind of attribute can be used for data that requires a once
           per bootup initialization sequence, but is otherwise never modified
           after that point.
      
           This feature was based on the work by PaX Team and Brad Spengler.
      
           (by Kees Cook, the ARM vDSO bits by David Brown.)
      
         - make CONFIG_DEBUG_RODATA always enabled on x86 and remove the
           Kconfig option.  This simplifies the kernel and also signals that
           read-only memory is the default model and a first-class citizen.
           (Kees Cook)"
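
      A userspace analogy for "initialise once, then read-only", sketched
      with mmap() and mprotect(). This is not how the kernel attribute is
      implemented; 'struct boot_config', 'boot_cfg' and boot_config_init()
      are hypothetical names used only to show the write-then-seal pattern.

```c
#define _DEFAULT_SOURCE           /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical data that is written once at start-up and never again. */
struct boot_config {
    int nr_cpus;
};

/* Readers only ever see a pointer to const data. */
static const struct boot_config *boot_cfg;

/* Fill in the config on a private page, then drop write permission. */
static int boot_config_init(int nr_cpus)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;

    size_t len = (size_t)page;
    assert(sizeof(struct boot_config) <= len);

    struct boot_config *c = mmap(NULL, len, PROT_READ | PROT_WRITE,
                                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (c == MAP_FAILED)
        return -1;

    c->nr_cpus = nr_cpus;         /* the one-time initialisation */

    /* From here on, any store to the page faults instead of succeeding. */
    if (mprotect(c, len, PROT_READ) != 0) {
        munmap(c, len);
        return -1;
    }

    boot_cfg = c;
    return 0;
}
```

      After boot_config_init() returns, a stray write through a stale
      pointer is caught by the MMU rather than silently changing the data.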
      
      * 'mm-readonly-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        ARM/vdso: Mark the vDSO code read-only after init
        x86/vdso: Mark the vDSO code read-only after init
        lkdtm: Verify that '__ro_after_init' works correctly
        arch: Introduce post-init read-only memory
        x86/mm: Always enable CONFIG_DEBUG_RODATA and remove the Kconfig option
        mm/init: Add 'rodata=off' boot cmdline parameter to disable read-only kernel mappings
        asm-generic: Consolidate mark_rodata_ro()
      d09e356a
    • L
      Merge branch 'mm-pat-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 5ec94246
      Linus Torvalds committed
      Pull dma_*_writecombine rename from Ingo Molnar:
       "Rename dma_*_writecombine() to dma_*_wc()
      
        This is a tree-wide API rename, to move the dma_*() write-combining
        APIs closer in name to their usual API families.  (The old API names
        are kept as compatibility wrappers to not introduce extra breakage.)
      
        The patch was Coccinelle generated"
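
      One common way to keep old names alive across such a rename is a thin
      alias layer. The sketch below assumes hypothetical demo_*() functions
      backed by malloc(); it shows the wrapper pattern only and does not
      reproduce the kernel's DMA API.

```c
#include <assert.h>
#include <stdlib.h>

/* New, shorter names: the implementation lives here (hypothetical API). */
static void *demo_alloc_wc(size_t size)
{
    assert(size > 0);
    return malloc(size);
}

static void demo_free_wc(void *buf)
{
    free(buf);
}

/*
 * Old names kept as aliases, so callers that have not been converted yet
 * still compile and end up in exactly the same code.
 */
#define demo_alloc_writecombine demo_alloc_wc
#define demo_free_writecombine  demo_free_wc
```

      Because the aliases expand to the new functions, both spellings can
      coexist until the remaining callers are converted.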
      
      * 'mm-pat-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        dma, mm/pat: Rename dma_*_writecombine() to dma_*_wc()
      5ec94246
    • L
      Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · fbed0bc0
      Linus Torvalds committed
      Pull locking changes from Ingo Molnar:
       "Various updates:
      
         - Futex scalability improvements: remove page lock use for shared
           futex get_futex_key(), which speeds up 'perf bench futex hash'
           benchmarks by over 40% on a 60-core Westmere.  This makes anon-mem
           shared futexes perform close to private futexes.  (Mel Gorman)
      
         - lockdep hash collision detection and fix (Alfredo Alvarez
           Fernandez)
      
         - lockdep testing enhancements (Alfredo Alvarez Fernandez)
      
         - robustify lockdep init by using hlists (Andrew Morton, Andrey
           Ryabinin)
      
         - mutex and csd_lock micro-optimizations (Davidlohr Bueso)
      
         - small x86 barrier tweaks (Michael S Tsirkin)
      
         - qspinlock updates (Waiman Long)"
      
      * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits)
        locking/csd_lock: Use smp_cond_acquire() in csd_lock_wait()
        locking/csd_lock: Explicitly inline csd_lock*() helpers
        futex: Replace barrier() in unqueue_me() with READ_ONCE()
        locking/lockdep: Detect chain_key collisions
        locking/lockdep: Prevent chain_key collisions
        tools/lib/lockdep: Fix link creation warning
        tools/lib/lockdep: Add tests for AA and ABBA locking
        tools/lib/lockdep: Add userspace version of READ_ONCE()
        tools/lib/lockdep: Fix the build on recent kernels
        locking/qspinlock: Move __ARCH_SPIN_LOCK_UNLOCKED to qspinlock_types.h
        locking/mutex: Allow next waiter lockless wakeup
        locking/pvqspinlock: Enable slowpath locking count tracking
        locking/qspinlock: Use smp_cond_acquire() in pending code
        locking/pvqspinlock: Move lock stealing count tracking code into pv_queued_spin_steal_lock()
        locking/mcs: Fix mcs_spin_lock() ordering
        futex: Remove requirement for lock_page() in get_futex_key()
        futex: Rename barrier references in ordering guarantees
        locking/atomics: Update comment about READ_ONCE() and structures
        locking/lockdep: Eliminate lockdep_init()
        locking/lockdep: Convert hash tables to hlists
        ...
      fbed0bc0
    • L
      Merge branch 'core-resources-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · d37a14bb
      Linus Torvalds committed
      Pull ram resource handling changes from Ingo Molnar:
       "Core kernel resource handling changes to support NVDIMM error
        injection.
      
        This tree introduces a new I/O resource type, IORESOURCE_SYSTEM_RAM,
        for System RAM while keeping the current IORESOURCE_MEM type bit set
        for all memory-mapped ranges (including System RAM) for backward
        compatibility.
      
        With this resource flag it no longer takes a strcmp() loop through the
        resource tree to find "System RAM" resources.
      
        The new resource type is then used to extend the ACPI/APEI error
        injection facility to also support NVDIMM"
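
      A small userspace sketch contrasting the two lookup styles described
      above. 'struct res', the RES_* bit values and the two counting helpers
      are illustrative stand-ins, not the kernel definitions; the point is
      that the combined flag keeps the generic memory bit set while allowing
      a plain bit test instead of a string comparison.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative flag bits; the values are assumptions for this sketch. */
#define RES_MEM        0x00000200u
#define RES_SYSRAM     0x01000000u
/* System RAM keeps the generic memory bit for backward compatibility. */
#define RES_SYSTEM_RAM (RES_MEM | RES_SYSRAM)

struct res {
    const char *name;
    unsigned int flags;
};

/* Old style: identify System RAM by comparing names. */
static size_t count_ram_by_name(const struct res *r, size_t n)
{
    size_t hits = 0;

    assert(r != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (strcmp(r[i].name, "System RAM") == 0)
            hits++;
    return hits;
}

/* New style: identify System RAM with a flag test, no string work. */
static size_t count_ram_by_flags(const struct res *r, size_t n)
{
    size_t hits = 0;

    assert(r != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if ((r[i].flags & RES_SYSTEM_RAM) == RES_SYSTEM_RAM)
            hits++;
    return hits;
}
```

      Code that only tests the generic memory bit still matches System RAM
      entries, which is the backward-compatibility property the message
      describes.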
      
      * 'core-resources-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        ACPI/EINJ: Allow memory error injection to NVDIMM
        resource: Kill walk_iomem_res()
        x86/kexec: Remove walk_iomem_res() call with GART type
        x86, kexec, nvdimm: Use walk_iomem_res_desc() for iomem search
        resource: Add walk_iomem_res_desc()
        memremap: Change region_intersects() to take @flags and @desc
        arm/samsung: Change s3c_pm_run_res() to use System RAM type
        resource: Change walk_system_ram() to use System RAM type
        drivers: Initialize resource entry to zero
        xen, mm: Set IORESOURCE_SYSTEM_RAM to System RAM
        kexec: Set IORESOURCE_SYSTEM_RAM for System RAM
        arch: Set IORESOURCE_SYSTEM_RAM flag for System RAM
        ia64: Set System RAM type and descriptor
        x86/e820: Set System RAM type and descriptor
        resource: Add I/O resource descriptor
        resource: Handle resource flags properly
        resource: Add System RAM resource type
      d37a14bb
  6. 14 Mar, 2016 4 commits
  7. 13 Mar, 2016 1 commit
    • J
      MIPS: smp.c: Fix uninitialised temp_foreign_map · d825c06b
      James Hogan committed
      When calculate_cpu_foreign_map() recalculates the cpu_foreign_map
      cpumask it uses the local variable temp_foreign_map without initialising
      it to zero. Since the calculation only ever sets bits in this cpumask
      any existing bits at that memory location will remain set and find their
      way into cpu_foreign_map too. This could potentially lead to cache
      operations suboptimally doing smp calls to multiple VPEs in the same
      core, even though the VPEs share primary caches.
      
      Therefore initialise temp_foreign_map using cpumask_clear() before use.
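
      The effect can be reproduced in a userspace model. The sketch below
      loosely follows the description above: it uses a plain 32-bit mask
      instead of a cpumask, and the function name calc_foreign_map(), the
      'core_of' table and the 'start' parameter (which simulates whatever
      bits happened to be in the uninitialised variable) are assumptions
      made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 8   /* illustrative size */

/*
 * core_of[i] is the core that CPU (VPE) i belongs to, or -1 if offline.
 * 'start' is the initial content of the temporary mask: 0 models the
 * fixed code, anything else models leftover stack contents.
 * The result should contain one CPU per core.
 */
static uint32_t calc_foreign_map(const int core_of[NR_CPUS], uint32_t start)
{
    uint32_t map = start;

    for (int i = 0; i < NR_CPUS; i++) {
        if (core_of[i] < 0)
            continue;                     /* skip offline CPUs */

        int core_present = 0;
        for (int k = 0; k < NR_CPUS; k++)
            if ((map & (UINT32_C(1) << k)) && core_of[k] == core_of[i])
                core_present = 1;

        /* The loop only ever sets bits, so stale bits are never removed. */
        if (!core_present)
            map |= UINT32_C(1) << i;
    }
    return map;
}
```

      With CPUs 0 and 1 on core 0 and CPUs 2 and 3 on core 1, a cleared
      start mask selects CPUs 0 and 2, while a start mask that already has
      bits 0 and 1 set ends up with both VPEs of core 0 in the result.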
      
      Fixes: cccf34e9 ("MIPS: c-r4k: Fix cache flushing for MT cores")
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paul Burton <paul.burton@imgtec.com>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/12759/
      Signed-off-by: Ingo Molnar <ralf@linux-mips.org>
      d825c06b