1. 16 3月, 2016 2 次提交
    • L
      Merge branch 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 42576bee
      Linus Torvalds 提交于
      Pull x86 boot updates from Ingo Molnar:
       "Early command line options parsing enhancements from Dave Hansen, plus
        minor cleanups and enhancements"
      
      * 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/boot: Remove unused 'is_big_kernel' variable
        x86/boot: Use proper array element type in memset() size calculation
        x86/boot: Pass in size to early cmdline parsing
        x86/boot: Simplify early command line parsing
        x86/boot: Fix early command-line parsing when partial word matches
        x86/boot: Fix early command-line parsing when matching at end
        x86/boot: Simplify kernel load address alignment check
        x86/boot: Micro-optimize reset_early_page_tables()
      42576bee
    • L
      Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · ba33ea81
      Linus Torvalds 提交于
      Pull x86 asm updates from Ingo Molnar:
       "This is another big update. Main changes are:
      
         - lots of x86 system call (and other traps/exceptions) entry code
           enhancements.  In particular the complex parts of the 64-bit entry
           code have been migrated to C code as well, and a number of dusty
           corners have been refreshed.  (Andy Lutomirski)
      
         - vDSO special mapping robustification and general cleanups (Andy
           Lutomirski)
      
         - cpufeature refactoring, cleanups and speedups (Borislav Petkov)
      
         - lots of other changes ..."
      
      * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (64 commits)
        x86/cpufeature: Enable new AVX-512 features
        x86/entry/traps: Show unhandled signal for i386 in do_trap()
        x86/entry: Call enter_from_user_mode() with IRQs off
        x86/entry/32: Change INT80 to be an interrupt gate
        x86/entry: Improve system call entry comments
        x86/entry: Remove TIF_SINGLESTEP entry work
        x86/entry/32: Add and check a stack canary for the SYSENTER stack
        x86/entry/32: Simplify and fix up the SYSENTER stack #DB/NMI fixup
        x86/entry: Only allocate space for tss_struct::SYSENTER_stack if needed
        x86/entry: Vastly simplify SYSENTER TF (single-step) handling
        x86/entry/traps: Clear DR6 early in do_debug() and improve the comment
        x86/entry/traps: Clear TIF_BLOCKSTEP on all debug exceptions
        x86/entry/32: Restore FLAGS on SYSEXIT
        x86/entry/32: Filter NT and speed up AC filtering in SYSENTER
        x86/entry/compat: In SYSENTER, sink AC clearing below the existing FLAGS test
        selftests/x86: In syscall_nt, test NT|TF as well
        x86/asm-offsets: Remove PARAVIRT_enabled
        x86/entry/32: Introduce and use X86_BUG_ESPFIX instead of paravirt_enabled
        uprobes: __create_xol_area() must nullify xol_mapping.fault
        x86/cpufeature: Create a new synthetic cpu capability for machine check recovery
        ...
      ba33ea81
  2. 15 3月, 2016 8 次提交
    • L
      Merge branch 'timers-nohz-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · e23604ed
      Linus Torvalds 提交于
      Pull NOHZ updates from Ingo Molnar:
       "NOHZ enhancements, by Frederic Weisbecker, which reorganizes/refactors
        the NOHZ 'can the tick be stopped?' infrastructure and related code to
        be data driven, and harmonizes the naming and handling of all the
        various properties"
      
      [ This makes the ugly "fetch_or()" macro that the scheduler used
        internally a new generic helper, and does a bad job at it.
      
        I'm pulling it, but I've asked Ingo and Frederic to get this
        fixed up ]
      
      * 'timers-nohz-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sched-clock: Migrate to use new tick dependency mask model
        posix-cpu-timers: Migrate to use new tick dependency mask model
        sched: Migrate sched to use new tick dependency mask model
        sched: Account rr tasks
        perf: Migrate perf to use new tick dependency mask model
        nohz: Use enum code for tick stop failure tracing message
        nohz: New tick dependency mask
        nohz: Implement wide kick on top of irq work
        atomic: Export fetch_or()
      e23604ed
    • L
      Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · d4e79615
      Linus Torvalds 提交于
      Pull scheduler updates from Ingo Molnar:
       "The main changes in this cycle are:
      
         - Make schedstats a runtime tunable (disabled by default) and
           optimize it via static keys.
      
           As most distributions enable CONFIG_SCHEDSTATS=y due to its
           instrumentation value, this is a nice performance enhancement.
           (Mel Gorman)
      
         - Implement 'simple waitqueues' (swait): these are just pure
           waitqueues without any of the more complex features of full-blown
           waitqueues (callbacks, wake flags, wake keys, etc.).  Simple
           waitqueues have less memory overhead and are faster.
      
           Use simple waitqueues in the RCU code (in 4 different places) and
           for handling KVM vCPU wakeups.
      
           (Peter Zijlstra, Daniel Wagner, Thomas Gleixner, Paul Gortmaker,
           Marcelo Tosatti)
      
         - sched/numa enhancements (Rik van Riel)
      
         - NOHZ performance enhancements (Rik van Riel)
      
         - Various sched/deadline enhancements (Steven Rostedt)
      
         - Various fixes (Peter Zijlstra)
      
         - ... and a number of other fixes, cleanups and smaller enhancements"
      
      * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (29 commits)
        sched/cputime: Fix steal_account_process_tick() to always return jiffies
        sched/deadline: Remove dl_new from struct sched_dl_entity
        Revert "kbuild: Add option to turn incompatible pointer check into error"
        sched/deadline: Remove superfluous call to switched_to_dl()
        sched/debug: Fix preempt_disable_ip recording for preempt_disable()
        sched, time: Switch VIRT_CPU_ACCOUNTING_GEN to jiffy granularity
        time, acct: Drop irq save & restore from __acct_update_integrals()
        acct, time: Change indentation in __acct_update_integrals()
        sched, time: Remove non-power-of-two divides from __acct_update_integrals()
        sched/rt: Kick RT bandwidth timer immediately on start up
        sched/debug: Add deadline scheduler bandwidth ratio to /proc/sched_debug
        sched/debug: Move sched_domain_sysctl to debug.c
        sched/debug: Move the /sys/kernel/debug/sched_features file setup into debug.c
        sched/rt: Fix PI handling vs. sched_setscheduler()
        sched/core: Remove duplicated sched_group_set_shares() prototype
        sched/fair: Consolidate nohz CPU load update code
        sched/fair: Avoid using decay_load_missed() with a negative value
        sched/deadline: Always calculate end of period on sched_yield()
        sched/cgroup: Fix cgroup entity load tracking tear-down
        rcu: Use simple wait queues where possible in rcutree
        ...
      d4e79615
    • L
      Merge branch 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · d88bfe1d
      Linus Torvalds 提交于
      Pull RAS updates from Ingo Molnar:
       "Various RAS updates:
      
         - AMD MCE support updates for future CPUs, fixes and 'SMCA' (Scalable
           MCA) error decoding support (Aravind Gopalakrishnan)
      
         - x86 memcpy_mcsafe() support, to enable smart(er) hardware error
           recovery in NVDIMM drivers, based on an extension of the x86
           exception handling code.  (Tony Luck)"
      
      * 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        EDAC/sb_edac: Fix computation of channel address
        x86/mm, x86/mce: Add memcpy_mcsafe()
        x86/mce/AMD: Document some functionality
        x86/mce: Clarify comments regarding deferred error
        x86/mce/AMD: Fix logic to obtain block address
        x86/mce/AMD, EDAC: Enable error decoding of Scalable MCA errors
        x86/mce: Move MCx_CONFIG MSR definitions
        x86/mce: Check for faults tagged in EXTABLE_CLASS_FAULT exception table entries
        x86/mm: Expand the exception table logic to allow new handling options
        x86/mce/AMD: Set MCAX Enable bit
        x86/mce/AMD: Carve out threshold block preparation
        x86/mce/AMD: Fix LVT offset configuration for thresholding
        x86/mce/AMD: Reduce number of blocks scanned per bank
        x86/mce/AMD: Do not perform shared bank check for future processors
        x86/mce: Fix order of AMD MCE init function call
      d88bfe1d
    • L
      Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · e71c2c1e
      Linus Torvalds 提交于
      Pull perf updates from Ingo Molnar:
       "Main kernel side changes:
      
         - Big reorganization of the x86 perf support code.  The old code grew
           organically deep inside arch/x86/kernel/cpu/perf* and its naming
           became somewhat messy.
      
           The new location is under arch/x86/events/, using the following
           cleaner hierarchy of source code files:
      
             perf/x86: Move perf_event.c .................. => x86/events/core.c
             perf/x86: Move perf_event_amd.c .............. => x86/events/amd/core.c
             perf/x86: Move perf_event_amd_ibs.c .......... => x86/events/amd/ibs.c
             perf/x86: Move perf_event_amd_iommu.[ch] ..... => x86/events/amd/iommu.[ch]
             perf/x86: Move perf_event_amd_uncore.c ....... => x86/events/amd/uncore.c
             perf/x86: Move perf_event_intel_bts.c ........ => x86/events/intel/bts.c
             perf/x86: Move perf_event_intel.c ............ => x86/events/intel/core.c
             perf/x86: Move perf_event_intel_cqm.c ........ => x86/events/intel/cqm.c
             perf/x86: Move perf_event_intel_cstate.c ..... => x86/events/intel/cstate.c
             perf/x86: Move perf_event_intel_ds.c ......... => x86/events/intel/ds.c
             perf/x86: Move perf_event_intel_lbr.c ........ => x86/events/intel/lbr.c
             perf/x86: Move perf_event_intel_pt.[ch] ...... => x86/events/intel/pt.[ch]
             perf/x86: Move perf_event_intel_rapl.c ....... => x86/events/intel/rapl.c
             perf/x86: Move perf_event_intel_uncore.[ch] .. => x86/events/intel/uncore.[ch]
             perf/x86: Move perf_event_intel_uncore_nhmex.c => x86/events/intel/uncore_nmhex.c
             perf/x86: Move perf_event_intel_uncore_snb.c   => x86/events/intel/uncore_snb.c
             perf/x86: Move perf_event_intel_uncore_snbep.c => x86/events/intel/uncore_snbep.c
             perf/x86: Move perf_event_knc.c .............. => x86/events/intel/knc.c
             perf/x86: Move perf_event_p4.c ............... => x86/events/intel/p4.c
             perf/x86: Move perf_event_p6.c ............... => x86/events/intel/p6.c
             perf/x86: Move perf_event_msr.c .............. => x86/events/msr.c
      
           (Borislav Petkov)
      
         - Update various x86 PMU constraint and hw support details (Stephane
           Eranian)
      
         - Optimize kprobes for BPF execution (Martin KaFai Lau)
      
         - Rewrite, refactor and fix the Intel uncore PMU driver code (Thomas
           Gleixner)
      
         - Rewrite, refactor and fix the Intel RAPL PMU code (Thomas Gleixner)
      
         - Various fixes and smaller cleanups.
      
        There are lots of perf tooling updates as well.  A few highlights:
      
        perf report/top:
      
           - Hierarchy histogram mode for 'perf top' and 'perf report',
             showing multiple levels, one per --sort entry: (Namhyung Kim)
      
             On a mostly idle system:
      
               # perf top --hierarchy -s comm,dso
      
             Then expand some levels and use 'P' to take a snapshot:
      
               # cat perf.hist.0
               -  92.32%         perf
                     58.20%         perf
                     22.29%         libc-2.22.so
                      5.97%         [kernel]
                      4.18%         libelf-0.165.so
                      1.69%         [unknown]
               -   4.71%         qemu-system-x86
                      3.10%         [kernel]
                      1.60%         qemu-system-x86_64 (deleted)
               +   2.97%         swapper
               #
      
           - Add 'L' hotkey to dynamicly set the percent threshold for
             histogram entries and callchains, i.e.  dynamicly do what the
             --percent-limit command line option to 'top' and 'report' does.
             (Namhyung Kim)
      
        perf mem:
      
           - Allow specifying events via -e in 'perf mem record', also listing
             what events can be specified via 'perf mem record -e list' (Jiri
             Olsa)
      
        perf record:
      
           - Add 'perf record' --all-user/--all-kernel options, so that one
             can tell that all the events in the command line should be
             restricted to the user or kernel levels (Jiri Olsa), i.e.:
      
               perf record -e cycles:u,instructions:u
      
             is equivalent to:
      
               perf record --all-user -e cycles,instructions
      
           - Make 'perf record' collect CPU cache info in the perf.data file header:
      
               $ perf record usleep 1
               [ perf record: Woken up 1 times to write data ]
               [ perf record: Captured and wrote 0.017 MB perf.data (7 samples) ]
               $ perf report --header-only -I | tail -10 | head -8
               # CPU cache info:
               #  L1 Data                 32K [0-1]
               #  L1 Instruction          32K [0-1]
               #  L1 Data                 32K [2-3]
               #  L1 Instruction          32K [2-3]
               #  L2 Unified             256K [0-1]
               #  L2 Unified             256K [2-3]
               #  L3 Unified            4096K [0-3]
      
             Will be used in 'perf c2c' and eventually in 'perf diff' to
             allow, for instance running the same workload in multiple
             machines and then when using 'diff' show the hardware difference.
             (Jiri Olsa)
      
           - Improved support for Java, using the JVMTI agent library to do
             jitdumps that then will be inserted in synthesized
             PERF_RECORD_MMAP2 events via 'perf inject' pointed to synthesized
             ELF files stored in ~/.debug and keyed with build-ids, to allow
             symbol resolution and even annotation with source line info, see
             the changeset comments to see how to use it (Stephane Eranian)
      
        perf script/trace:
      
           - Decode data_src values (e.g.  perf.data files generated by 'perf
             mem record') in 'perf script': (Jiri Olsa)
      
               # perf script
                 perf 693 [1] 4.088652: 1 cpu/mem-loads,ldlat=30/P: ffff88007d0b0f40 68100142 L1 hit|SNP None|TLB L1 or L2 hit|LCK No <SNIP>
                                                                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
           - Improve support to 'data_src', 'weight' and 'addr' fields in
             'perf script' (Jiri Olsa)
      
           - Handle empty print fmts in 'perf script -s' i.e. when running
             python or perl scripts (Taeung Song)
      
        perf stat:
      
           - 'perf stat' now shows shadow metrics (insn per cycle, etc) in
             interval mode too.  E.g:
      
               # perf stat -I 1000 -e instructions,cycles sleep 1
               #         time   counts unit events
                  1.000215928  519,620      instructions     #  0.69 insn per cycle
                  1.000215928  752,003      cycles
               <SNIP>
      
           - Port 'perf kvm stat' to PowerPC (Hemant Kumar)
      
           - Implement CSV metrics output in 'perf stat' (Andi Kleen)
      
        perf BPF support:
      
           - Support converting data from bpf events in 'perf data' (Wang Nan)
      
           - Print bpf-output events in 'perf script': (Wang Nan).
      
               # perf record -e bpf-output/no-inherit,name=evt/ -e ./test_bpf_output_3.c/map:channel.event=evt/ usleep 1000
               # perf script
                  usleep  4882 21384.532523:   evt:  ffffffff810e97d1 sys_nanosleep ([kernel.kallsyms])
                   BPF output: 0000: 52 61 69 73 65 20 61 20  Raise a
                               0008: 42 50 46 20 65 76 65 6e  BPF even
                               0010: 74 21 00 00              t!..
                   BPF string: "Raise a BPF event!"
               #
      
           - Add API to set values of map entries in a BPF object, be it
             individual map slots or ranges (Wang Nan)
      
           - Introduce support for the 'bpf-output' event (Wang Nan)
      
           - Add glue to read perf events in a BPF program (Wang Nan)
      
           - Improve support for bpf-output events in 'perf trace' (Wang Nan)
      
        ... and tons of other changes as well - see the shortlog and git log
        for details!"
      
      * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (342 commits)
        perf stat: Add --metric-only support for -A
        perf stat: Implement --metric-only mode
        perf stat: Document CSV format in manpage
        perf hists browser: Check sort keys before hot key actions
        perf hists browser: Allow thread filtering for comm sort key
        perf tools: Add sort__has_comm variable
        perf tools: Recalc total periods using top-level entries in hierarchy
        perf tools: Remove nr_sort_keys field
        perf hists browser: Cleanup hist_browser__fprintf_hierarchy_entry()
        perf tools: Remove hist_entry->fmt field
        perf tools: Fix command line filters in hierarchy mode
        perf tools: Add more sort entry check functions
        perf tools: Fix hist_entry__filter() for hierarchy
        perf jitdump: Build only on supported archs
        tools lib traceevent: Add '~' operation within arg_num_eval()
        perf tools: Omit unnecessary cast in perf_pmu__parse_scale
        perf tools: Pass perf_hpp_list all the way through setup_sort_list
        perf tools: Fix perf script python database export crash
        perf jitdump: DWARF is also needed
        perf bench mem: Prepare the x86-64 build for upstream memcpy_mcsafe() changes
        ...
      e71c2c1e
    • L
      Merge branch 'mm-readonly-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · d09e356a
      Linus Torvalds 提交于
      Pull read-only kernel memory updates from Ingo Molnar:
       "This tree adds two (security related) enhancements to the kernel's
        handling of read-only kernel memory:
      
         - extend read-only kernel memory to a new class of formerly writable
           kernel data: 'post-init read-only memory' via the __ro_after_init
           attribute, and mark the ARM and x86 vDSO as such read-only memory.
      
           This kind of attribute can be used for data that requires a once
           per bootup initialization sequence, but is otherwise never modified
           after that point.
      
           This feature was based on the work by PaX Team and Brad Spengler.
      
           (by Kees Cook, the ARM vDSO bits by David Brown.)
      
         - make CONFIG_DEBUG_RODATA always enabled on x86 and remove the
           Kconfig option.  This simplifies the kernel and also signals that
           read-only memory is the default model and a first-class citizen.
           (Kees Cook)"
      
      * 'mm-readonly-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        ARM/vdso: Mark the vDSO code read-only after init
        x86/vdso: Mark the vDSO code read-only after init
        lkdtm: Verify that '__ro_after_init' works correctly
        arch: Introduce post-init read-only memory
        x86/mm: Always enable CONFIG_DEBUG_RODATA and remove the Kconfig option
        mm/init: Add 'rodata=off' boot cmdline parameter to disable read-only kernel mappings
        asm-generic: Consolidate mark_rodata_ro()
      d09e356a
    • L
      Merge branch 'mm-pat-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 5ec94246
      Linus Torvalds 提交于
      Pull dma_*_writecombine rename from Ingo Molnar:
       "Rename dma_*_writecombine() to dma_*_wc()
      
        This is a tree-wide API rename, to move the dma_*() write-combining
        APIs closer in name to their usual API families.  (The old API names
        are kept as compatibility wrappers to not introduce extra breakage.)
      
        The patch was Coccinelle generated"
      
      * 'mm-pat-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        dma, mm/pat: Rename dma_*_writecombine() to dma_*_wc()
      5ec94246
    • L
      Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · fbed0bc0
      Linus Torvalds 提交于
      Pull locking changes from Ingo Molnar:
       "Various updates:
      
         - Futex scalability improvements: remove page lock use for shared
           futex get_futex_key(), which speeds up 'perf bench futex hash'
           benchmarks by over 40% on a 60-core Westmere.  This makes anon-mem
           shared futexes perform close to private futexes.  (Mel Gorman)
      
         - lockdep hash collision detection and fix (Alfredo Alvarez
           Fernandez)
      
         - lockdep testing enhancements (Alfredo Alvarez Fernandez)
      
         - robustify lockdep init by using hlists (Andrew Morton, Andrey
           Ryabinin)
      
         - mutex and csd_lock micro-optimizations (Davidlohr Bueso)
      
         - small x86 barriers tweaks (Michael S Tsirkin)
      
         - qspinlock updates (Waiman Long)"
      
      * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits)
        locking/csd_lock: Use smp_cond_acquire() in csd_lock_wait()
        locking/csd_lock: Explicitly inline csd_lock*() helpers
        futex: Replace barrier() in unqueue_me() with READ_ONCE()
        locking/lockdep: Detect chain_key collisions
        locking/lockdep: Prevent chain_key collisions
        tools/lib/lockdep: Fix link creation warning
        tools/lib/lockdep: Add tests for AA and ABBA locking
        tools/lib/lockdep: Add userspace version of READ_ONCE()
        tools/lib/lockdep: Fix the build on recent kernels
        locking/qspinlock: Move __ARCH_SPIN_LOCK_UNLOCKED to qspinlock_types.h
        locking/mutex: Allow next waiter lockless wakeup
        locking/pvqspinlock: Enable slowpath locking count tracking
        locking/qspinlock: Use smp_cond_acquire() in pending code
        locking/pvqspinlock: Move lock stealing count tracking code into pv_queued_spin_steal_lock()
        locking/mcs: Fix mcs_spin_lock() ordering
        futex: Remove requirement for lock_page() in get_futex_key()
        futex: Rename barrier references in ordering guarantees
        locking/atomics: Update comment about READ_ONCE() and structures
        locking/lockdep: Eliminate lockdep_init()
        locking/lockdep: Convert hash tables to hlists
        ...
      fbed0bc0
    • L
      Merge branch 'core-resources-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · d37a14bb
      Linus Torvalds 提交于
      Pull ram resource handling changes from Ingo Molnar:
       "Core kernel resource handling changes to support NVDIMM error
        injection.
      
        This tree introduces a new I/O resource type, IORESOURCE_SYSTEM_RAM,
        for System RAM while keeping the current IORESOURCE_MEM type bit set
        for all memory-mapped ranges (including System RAM) for backward
        compatibility.
      
        With this resource flag it no longer takes a strcmp() loop through the
        resource tree to find "System RAM" resources.
      
        The new resource type is then used to extend ACPI/APEI error injection
        facility to also support NVDIMM"
      
      * 'core-resources-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        ACPI/EINJ: Allow memory error injection to NVDIMM
        resource: Kill walk_iomem_res()
        x86/kexec: Remove walk_iomem_res() call with GART type
        x86, kexec, nvdimm: Use walk_iomem_res_desc() for iomem search
        resource: Add walk_iomem_res_desc()
        memremap: Change region_intersects() to take @flags and @desc
        arm/samsung: Change s3c_pm_run_res() to use System RAM type
        resource: Change walk_system_ram() to use System RAM type
        drivers: Initialize resource entry to zero
        xen, mm: Set IORESOURCE_SYSTEM_RAM to System RAM
        kexec: Set IORESOURCE_SYSTEM_RAM for System RAM
        arch: Set IORESOURCE_SYSTEM_RAM flag for System RAM
        ia64: Set System RAM type and descriptor
        x86/e820: Set System RAM type and descriptor
        resource: Add I/O resource descriptor
        resource: Handle resource flags properly
        resource: Add System RAM resource type
      d37a14bb
  3. 14 3月, 2016 2 次提交
  4. 13 3月, 2016 9 次提交
  5. 12 3月, 2016 12 次提交
    • M
      x86/efi: Fix boot crash by always mapping boot service regions into new EFI page tables · 452308de
      Matt Fleming 提交于
      Some machines have EFI regions in page zero (physical address
      0x00000000) and historically that region has been added to the e820
      map via trim_bios_range(), and ultimately mapped into the kernel page
      tables. It was not mapped via efi_map_regions() as one would expect.
      
      Alexis reports that with the new separate EFI page tables some boot
      services regions, such as page zero, are not mapped. This triggers an
      oops during the SetVirtualAddressMap() runtime call.
      
      For the EFI boot services quirk on x86 we need to memblock_reserve()
      boot services regions until after SetVirtualAddressMap(). Doing that
      while respecting the ownership of regions that may have already been
      reserved by the kernel was the motivation behind this commit:
      
        7d68dc3f ("x86, efi: Do not reserve boot services regions within reserved areas")
      
      That patch was merged at a time when the EFI runtime virtual mappings
      were inserted into the kernel page tables as described above, and the
      trick of setting ->numpages (and hence the region size) to zero to
      track regions that should not be freed in efi_free_boot_services()
      meant that we never mapped those regions in efi_map_regions(). Instead
      we were relying solely on the existing kernel mappings.
      
      Now that we have separate page tables we need to make sure the EFI
      boot services regions are mapped correctly, even if someone else has
      already called memblock_reserve(). Instead of stashing a tag in
      ->numpages, set the EFI_MEMORY_RUNTIME bit of ->attribute. Since it
      generally makes no sense to mark a boot services region as required at
      runtime, it's pretty much guaranteed the firmware will not have
      already set this bit.
      
      For the record, the specific circumstances under which Alexis
      triggered this bug was that an EFI runtime driver on his machine was
      responding to the EVT_SIGNAL_VIRTUAL_ADDRESS_CHANGE event during
      SetVirtualAddressMap().
      
      The event handler for this driver looks like this,
      
        sub rsp,0x28
        lea rdx,[rip+0x2445] # 0xaa948720
        mov ecx,0x4
        call func_aa9447c0  ; call to ConvertPointer(4, & 0xaa948720)
        mov r11,QWORD PTR [rip+0x2434] # 0xaa948720
        xor eax,eax
        mov BYTE PTR [r11+0x1],0x1
        add rsp,0x28
        ret
      
      Which is pretty typical code for an EVT_SIGNAL_VIRTUAL_ADDRESS_CHANGE
      handler. The "mov r11, QWORD PTR [rip+0x2424]" was the faulting
      instruction because ConvertPointer() was being called to convert the
      address 0x0000000000000000, which when converted is left unchanged and
      remains 0x0000000000000000.
      
      The output of the oops trace gave the impression of a standard NULL
      pointer dereference bug, but because we're accessing physical
      addresses during ConvertPointer(), it wasn't. EFI boot services code
      is stored at that address on Alexis' machine.
      Reported-by: NAlexis Murzeau <amurzeau@gmail.com>
      Signed-off-by: NMatt Fleming <matt@codeblueprint.co.uk>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Ben Hutchings <ben@decadent.org.uk>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Maarten Lankhorst <maarten.lankhorst@canonical.com>
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Raphael Hertzog <hertzog@debian.org>
      Cc: Roger Shimizu <rogershimizu@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-efi@vger.kernel.org
      Link: http://lkml.kernel.org/r/1457695163-29632-2-git-send-email-matt@codeblueprint.co.uk
      Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=815125Signed-off-by: NIngo Molnar <mingo@kernel.org>
      452308de
    • B
      x86/fpu: Fix eager-FPU handling on legacy FPU machines · 6e686709
      Borislav Petkov 提交于
      i486 derived cores like Intel Quark support only the very old,
      legacy x87 FPU (FSAVE/FRSTOR, CPUID bit FXSR is not set), and
      our FPU code wasn't handling the saving and restoring there
      properly in the 'eagerfpu' case.
      
      So after we made eagerfpu the default for all CPU types:
      
        58122bf1 x86/fpu: Default eagerfpu=on on all CPUs
      
      these old FPU designs broke. First, Andy Shevchenko reported a splat:
      
        WARNING: CPU: 0 PID: 823 at arch/x86/include/asm/fpu/internal.h:163 fpu__clear+0x8c/0x160
      
      which was us trying to execute FXRSTOR on those machines even though
      they don't support it.
      
      After taking care of that, Bryan O'Donoghue reported that a simple FPU
      test still failed because we weren't initializing the FPU state properly
      on those machines.
      
      Take care of all that.
      Reported-and-tested-by: NBryan O'Donoghue <pure.logic@nexus-software.ie>
      Reported-by: NAndy Shevchenko <andy.shevchenko@gmail.com>
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yu-cheng <yu-cheng.yu@intel.com>
      Link: http://lkml.kernel.org/r/20160311113206.GD4312@pd.tnicSigned-off-by: NIngo Molnar <mingo@kernel.org>
      6e686709
    • L
      Merge tag 'for-linus-20160311' of git://git.infradead.org/linux-mtd · 03c668a9
      Linus Torvalds 提交于
      Pull MTD fixes from Brian Norris:
       "Late MTD fix for v4.5:
      
         - A simple error code handling fix for the NAND ECC test; this was a
           regression in v4.5-rc1
      
         - A MAINTAINERS update, which might as well go in ASAP"
      
      * tag 'for-linus-20160311' of git://git.infradead.org/linux-mtd:
        MAINTAINERS: add a maintainer for the NAND subsystem
        mtd: nand: tests: fix regression introduced in mtd_nandectest
      03c668a9
    • L
      Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux · 3ab0a0f9
      Linus Torvalds 提交于
      Pull drm/i915 fixes from Dave Airlie:
       "Just two i915 regression fixes, that should be it from me"
      
      * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
        drm/i915: Actually retry with bit-banging after GMBUS timeout
        drm/i915: Fix bogus dig_port_map[] assignment for pre-HSW
      3ab0a0f9
    • M
      mm/mempool: avoid KASAN marking mempool poison checks as use-after-free · 76401310
      Matthew Dawson 提交于
      When removing an element from the mempool, mark it as unpoisoned in KASAN
      before verifying its contents for SLUB/SLAB debugging.  Otherwise KASAN
      will flag the reads checking the element use-after-free writes as
      use-after-free reads.
      Signed-off-by: NMatthew Dawson <matthew@mjdsystems.ca>
      Acked-by: NAndrey Ryabinin <aryabinin@virtuozzo.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      76401310
    • L
      Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc · 2a4fb270
      Linus Torvalds 提交于
      Pull ARM SoC fixes from Olof Johansson:
       "Two more fixes for 4.5:
      
         - One is a fix for OMAP that is urgently needed to avoid DRA7xx chips
           from premature aging, by always keeping the Ethernet clock enabled.
      
         - The other solves a I/O memory layout issue on Armada, where SROM
           and PCI memory windows were conflicting in some configurations"
      
      * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
        ARM: mvebu: fix overlap of Crypto SRAM with PCIe memory window
        ARM: dts: dra7: do not gate cpsw clock due to errata i877
        ARM: OMAP2+: hwmod: Introduce ti,no-idle dt property
      2a4fb270
    • L
      Merge tag 'media/v4.5-5' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media · 95f41fb2
      Linus Torvalds 提交于
      Pull media fix from Mauro Carvalho Chehab:
       "One last time fix: It adds a code that prevents some media tools like
        media-ctl to hide some entities that have their IDs out of the range
        expected by those apps"
      
      * tag 'media/v4.5-5' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
        [media] media-device: map new functions into old types for legacy API
      95f41fb2
    • T
      ARM: mvebu: fix overlap of Crypto SRAM with PCIe memory window · d7d5a43c
      Thomas Petazzoni 提交于
      When the Crypto SRAM mappings were added to the Device Tree files
      describing the Armada XP boards in commit c466d997 ("ARM: mvebu:
      define crypto SRAM ranges for all armada-xp boards"), the fact that
      those mappings were overlaping with the PCIe memory aperture was
      overlooked. Due to this, we currently have for all Armada XP platforms
      a situation that looks like this:
      
      Memory mapping on Armada XP boards with internal registers at
      0xf1000000:
      
       - 0x00000000 -> 0xf0000000	3.75G 	RAM
       - 0xf0000000 -> 0xf1000000	16M	NOR flashes (AXP GP / AXP DB)
       - 0xf1000000 -> 0xf1100000	1M	internal registers
       - 0xf8000000 -> 0xffe0000	126M	PCIe memory aperture
       - 0xf8100000 -> 0xf8110000	64KB	Crypto SRAM #0	=> OVERLAPS WITH PCIE !
       - 0xf8110000 -> 0xf8120000	64KB	Crypto SRAM #1	=> OVERLAPS WITH PCIE !
       - 0xffe00000 -> 0xfff00000	1M	PCIe I/O aperture
       - 0xfff0000  -> 0xffffffff	1M	BootROM
      
      The overlap means that when PCIe devices are added, depending on their
      memory window needs, they might or might not be mapped into the
      physical address space. Indeed, they will not be mapped if the area
      allocated in the PCIe memory aperture by the PCI core overlaps with
      one of the Crypto SRAM. Typically, a Intel IGB PCIe NIC that needs 8MB
      of PCIe memory will see its PCIe memory window allocated from
      0xf80000000 for 8MB, which overlaps with the Crypto SRAM windows. Due
      to this, the PCIe window is not created, and any attempt to access the
      PCIe window makes the kernel explode:
      
      [    3.302213] igb: Copyright (c) 2007-2014 Intel Corporation.
      [    3.307841] pci 0000:00:09.0: enabling device (0140 -> 0143)
      [    3.313539] mvebu_mbus: cannot add window '4:f8', conflicts with another window
      [    3.320870] mvebu-pcie soc:pcie-controller: Could not create MBus window at [mem 0xf8000000-0xf87fffff]: -22
      [    3.330811] Unhandled fault: external abort on non-linefetch (0x1008) at 0xf08c0018
      
      This problem does not occur on Armada 370 boards, because we use the
      following memory mapping (for boards that have internal registers at
      0xf1000000):
      
       - 0x00000000 -> 0xf0000000	3.75G 	RAM
       - 0xf0000000 -> 0xf1000000	16M	NOR flashes (AXP GP / AXP DB)
       - 0xf1000000 -> 0xf1100000	1M	internal registers
       - 0xf1100000 -> 0xf1110000	64KB	Crypto SRAM #0 => OK !
       - 0xf8000000 -> 0xffe0000	126M	PCIe memory
       - 0xffe00000 -> 0xfff00000	1M	PCIe I/O
       - 0xfff0000  -> 0xffffffff	1M	BootROM
      
      Obviously, the solution is to align the location of the Crypto SRAM
      mappings of Armada XP to be similar with the ones on Armada 370, i.e
      have them between the "internal registers" area and the beginning of
      the PCIe aperture.
      
      However, we have a special case with the OpenBlocks AX3-4 platform,
      which has a 128 MB NOR flash. Currently, this NOR flash is mapped from
      0xf0000000 to 0xf8000000. This is possible because on OpenBlocks
      AX3-4, the internal registers are not at 0xf1000000. And this explains
      why the Crypto SRAM mappings were not configured at the same place on
      Armada XP.
      
      Hence, the solution is two-fold:
      
       (1) Move the NOR flash mapping on Armada XP OpenBlocks AX3-4 from
           0xe8000000 to 0xf0000000. This frees the 0xf0000000 ->
           0xf80000000 space.
      
       (2) Move the Crypto SRAM mappings on Armada XP to be similar to
           Armada 370 (except of course that Armada XP has two Crypto SRAM
           and not one).
      
      After this patch, the memory mapping on Armada XP boards with
      registers at 0xf1 is:
      
       - 0x00000000 -> 0xf0000000	3.75G 	RAM
       - 0xf0000000 -> 0xf1000000	16M	NOR flashes (AXP GP / AXP DB)
       - 0xf1000000 -> 0xf1100000	1M	internal registers
       - 0xf1100000 -> 0xf1110000	64KB	Crypto SRAM #0
       - 0xf1110000 -> 0xf1120000	64KB	Crypto SRAM #1
       - 0xf8000000 -> 0xffe0000	126M	PCIe memory
       - 0xffe00000 -> 0xfff00000	1M	PCIe I/O
       - 0xfff0000  -> 0xffffffff	1M	BootROM
      
      And the memory mapping for the special case of the OpenBlocks AX3-4
      (internal registers at 0xd0000000, NOR of 128 MB):
      
       - 0x00000000 -> 0xc0000000	3G 	RAM
       - 0xd0000000 -> 0xd1000000	1M	internal registers
       - 0xe800000  -> 0xf0000000	128M	NOR flash
       - 0xf1100000 -> 0xf1110000	64KB	Crypto SRAM #0
       - 0xf1110000 -> 0xf1120000	64KB	Crypto SRAM #1
       - 0xf8000000 -> 0xffe0000	126M	PCIe memory
       - 0xffe00000 -> 0xfff00000	1M	PCIe I/O
       - 0xfff0000  -> 0xffffffff	1M	BootROM
      
      Fixes: c466d997 ("ARM: mvebu: define crypto SRAM ranges for all armada-xp boards")
      Reported-by: NPhil Sutter <phil@nwl.cc>
      Cc: Phil Sutter <phil@nwl.cc>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NThomas Petazzoni <thomas.petazzoni@free-electrons.com>
      Acked-by: NGregory CLEMENT <gregory.clement@free-electrons.com>
      Signed-off-by: NOlof Johansson <olof@lixom.net>
      d7d5a43c
    • L
      Merge tag 'dmaengine-fix-4.5' of git://git.infradead.org/users/vkoul/slave-dma · 20698c92
      Linus Torvalds 提交于
      Pull dmaengine fixes from Vinod Koul:
       "Two fixes showed up in last few days, and they should be included in
        4.5.  Summary:
      
        Two more late fixes to drivers, nothing major here:
      
         - A memory leak fix in fsdma unmap the dma descriptors on freeup
      
         - A fix in xdmac driver for residue calculation of dma descriptor"
      
      * tag 'dmaengine-fix-4.5' of git://git.infradead.org/users/vkoul/slave-dma:
        dmaengine: at_xdmac: fix residue computation
        dmaengine: fsldma: fix memory leak
      20698c92
    • L
      Merge tag 'pm+acpi-4.5-final' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 7ae9c768
      Linus Torvalds 提交于
      Pull power management and ACPI fixes from Rafael Wysocki:
       "Two more fixes for issues introduced recently, one in the generic
        device properties framework and one in ACPICA.
      
        Specifics:
      
         - Revert a recent ACPICA commit that has been reverted upstream,
           because it caused problems to happen on user systems and the
           problem it attempted to address will not be relevant any more after
           upcoming ACPI specification changes (Bob Moore).
      
         - Fix crash in the generic device properties framework introduced by
           a recent change that forgot to check pointers against error values
           in addition to checking them against NULL (Heikki Krogerus)"
      
      * tag 'pm+acpi-4.5-final' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        device property: fwnode->secondary may contain ERR_PTR(-ENODEV)
        ACPICA: Revert "Parser: Fix for SuperName method invocation"
      7ae9c768
    • L
      Merge tag 'xfs-for-linus-4.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs · 2a62ec0a
      Linus Torvalds 提交于
      Pull xfs fixes from Dave Chinner:
       "This is a fix for a regression introduced in 4.5-rc1 by the new torn
        log write detection code.  The regression only affects people moving a
        clean filesystem between machines/kernels of different architecture
        (such as changing between 32 bit and 64 bit kernels), but this is the
        recommended (and only!) safe way to migrate a filesystem between
        architectures so we really need to ensure it works.
      
        The changes are larger than I'd prefer right at the end of the release
        cycle, but the majority of the change is just factoring code to enable
        the detection of a clean log at the correct time to avoid this issue.
      
        Changes:
      
         - Only perform torn log write detection on dirty logs.  This prevents
           failures being detected due to a clean filesystem being moved
           between machines or kernels of different architectures (e.g.  32 ->
           64 bit, BE -> LE, etc).  This fixes a regression introduced by the
           torn log write detection in 4.5-rc1"
      
      * tag 'xfs-for-linus-4.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs:
        xfs: only run torn log write detection on dirty logs
        xfs: refactor in-core log state update to helper
        xfs: refactor unmount record detection into helper
        xfs: separate log head record discovery from verification
      2a62ec0a
    • L
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 63cf207e
      Linus Torvalds 提交于
      Pull vfs fixes from Al Viro:
       "A couple of fixes: Fix for my dumb braino in ncpfs and a long-standing
        breakage on recovery from failed rename() in jffs2"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        jffs2: reduce the breakage on recovery from halfway failed rename()
        ncpfs: fix a braino in OOM handling in ncp_fill_cache()
      63cf207e
  6. 11 3月, 2016 7 次提交