提交 · 842b90b62461d0848bd56ad776117d15a5fa95c0 · openanolis / cloud-kernel

16 3月, 2016 9 次提交

ocfs2/dlm: return in progress if master can not clear the refmap bit right now · 842b90b6

由 xuejiufei 提交于 3月 15, 2016

Master returns in-progress to non-master node when it can not clear the
refmap bit right now.  And non-master node will not purge the lock
resource until receiving deref done message.
Signed-off-by: Nxuejiufei <xuejiufei@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Reviewed-by: NJoseph Qi <joseph.qi@huawei.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

842b90b6

ocfs2/dlm: add DEREF_DONE message · 60d663cb

由 xuejiufei 提交于 3月 15, 2016

This series of patches is to fix the dis-order issue of setting/clearing
refmap bit described below.

Node 1                               Node 2(master)
dlmlock
dlm_do_master_request
                                dlm_master_request_handler
                                -> dlm_lockres_set_refmap_bit
dlmlock succeed
dlmunlock succeed

dlm_purge_lockres
                                dlm_deref_handler
                                -> find lock resource is in
                                   DLM_LOCK_RES_SETREF_INPROG state,
                                   so dispatch a deref work
dlm_purge_lockres succeed.

call dlmlock again
dlm_do_master_request
                                dlm_master_request_handler
                                -> dlm_lockres_set_refmap_bit

                                deref work trigger, call
                                dlm_lockres_clear_refmap_bit
                                to clear Node 1 from refmap

                                dlm_purge_lockres succeed

dlm_send_remote_lock_request
                                return DLM_IVLOCKID because
                                the lockres is not exist
BUG if the lockres is $RECOVERY

This series of patches add a new message to keep the order of set and
clear.  Other nodes can purge the lock resource only after the refmap bit
on master is cleared.

This patch is to add DEREF_DONE message and corresponding handler.  Node
can purge the lock resource after receiving this message.  As a new
message is added, so increase the minor number of dlm protocol version.
Signed-off-by: Nxuejiufei <xuejiufei@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Reviewed-by: NJoseph Qi <joseph.qi@huawei.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

60d663cb

ocfs2/dlm: fix a typo in dlmcommon.h · 39b29af0

由 Joseph Qi 提交于 3月 15, 2016

Refer to cluster/tcp.h, NET_MAX_PAYLOAD_BYTES is a typo for
O2NET_MAX_PAYLOAD_BYTES.

Since currently DLM_MIG_LOCKRES_RESERVED is not actually used, it won't
cause any problem.  But we'd better correct it for further use.
Signed-off-by: NJoseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <joseph.qi@huawei.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

39b29af0

ocfs2: use spinlock_irqsave() to downconvert lock in ocfs2_osb_dump() · bfd97a03

由 jiangyiwen 提交于 3月 15, 2016

Commit a75e9cca ("ocfs2: use spinlock irqsave for downconvert lock")
missed an unmodified place in ocfs2_osb_dump(), so it still exists a
deadlock scenario.

    ocfs2_wake_downconvert_thread
    ocfs2_rw_unlock
    ocfs2_dio_end_io
    dio_complete
    .....
    bio_endio
    req_bio_endio
    ....
    scsi_io_completion
    blk_done_softirq
    __do_softirq
    do_softirq
    irq_exit
    do_IRQ
    ocfs2_osb_dump
    cat /sys/kernel/debug/ocfs2/${uuid}/fs_state

This patch still uses spin_lock_irqsave() - replace spin_lock() to solve
this situation.
Signed-off-by: NYiwen Jiang <jiangyiwen@huawei.com>
Reviewed-by: NJoseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

bfd97a03

ocfs2/cluster: replace the interrupt safe spinlocks with common ones · 4d548f61

由 jiangyiwen 提交于 3月 15, 2016

There actually no hardware or software interrupts in the context which
using o2hb_live_lock, so we don't need to worry about race conditions
caused by irq/softirq with spinlock held.  Turning off irq is not good
for system performance after all.  Just replace them with a non
interrupt safe function.
Signed-off-by: NYiwen Jiang <jiangyiwen@huawei.com>
Reviewed-by: NJoseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

4d548f61

blackfin: define dummy pgprot_writecombine for !MMU · e928f350

由 Sudip Mukherjee 提交于 3月 15, 2016

blackfin allmodconfig build fails with the error:

  ../sound/core/pcm_native.c: In function 'snd_pcm_lib_default_mmap':
  ../sound/core/pcm_native.c:3386:24: error: implicit declaration of function 'pgprot_writecombine' [-Werror=implicit-function-declaration]
     area->vm_page_prot = pgprot_writecombine(area->vm_page_prot);
                          ^
  ../sound/core/pcm_native.c:3386:22: error: incompatible types when assigning to type 'pgprot_t {aka struct <anonymous>}' from type 'int'
     area->vm_page_prot = pgprot_writecombine(area->vm_page_prot);
                        ^

When !MMU, asm-generic will not define default pgprot_writecombine, so
blackfin needs to define it by itself.

The patch idea is from commit 65b9ab88 ("arch/c6x/include/asm/pgtable.h:
define dummy pgprot_writecombine for !MMU")
Signed-off-by: NSudip Mukherjee <sudip@vectorindia.org>
Cc: Steven Miao <realmz6@gmail.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

e928f350

m32r: mm: fix build warning · 3701dc81

由 Sudip Mukherjee 提交于 3月 15, 2016

While building we are getting warnings:

  arch/m32r/mm/init.c:63:17: warning: unused variable 'low'
  arch/m32r/mm/init.c:62:17: warning: unused variable 'max_dma'

max_dma and low are only used if CONFIG_MMU is defined.  Lets declare
the variables inside the #ifdef.
Signed-off-by: NSudip Mukherjee <sudip@vectorindia.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

3701dc81

tags: Fix DEFINE_PER_CPU expansions · 25528213

由 Peter Zijlstra 提交于 3月 15, 2016

$ make tags
GEN tags
ctags: Warning: drivers/acpi/processor_idle.c:64: null expansion of name pattern "\1"
ctags: Warning: drivers/xen/events/events_2l.c:41: null expansion of name pattern "\1"
ctags: Warning: kernel/locking/lockdep.c:151: null expansion of name pattern "\1"
ctags: Warning: kernel/rcu/rcutorture.c:133: null expansion of name pattern "\1"
ctags: Warning: kernel/rcu/rcutorture.c:135: null expansion of name pattern "\1"
ctags: Warning: kernel/workqueue.c:323: null expansion of name pattern "\1"
ctags: Warning: net/ipv4/syncookies.c:53: null expansion of name pattern "\1"
ctags: Warning: net/ipv6/syncookies.c:44: null expansion of name pattern "\1"
ctags: Warning: net/rds/page.c:45: null expansion of name pattern "\1"

Which are all the result of the DEFINE_PER_CPU pattern:

scripts/tags.sh:200: '/\<DEFINE_PER_CPU([^,]*, *$[[:alnum:]_]*$/\1/v/'
scripts/tags.sh:201: '/\<DEFINE_PER_CPU_SHARED_ALIGNED([^,]*, *$[[:alnum:]_]*$/\1/v/'

The below cures them. All except the workqueue one are within reasonable
distance of the 80 char limit. TJ do you have any preference on how to
fix the wq one, or shall we just not care its too long?
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: NDavid S. Miller <davem@davemloft.net>
Acked-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

25528213

init/main.c: use list_for_each_entry() · e6fd1fb3

由 Geliang Tang 提交于 3月 15, 2016

Use list_for_each_entry() instead of list_for_each() to simplify the code.
Signed-off-by: NGeliang Tang <geliangtang@163.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

e6fd1fb3

15 3月, 2016 8 次提交

Merge branch 'timers-nohz-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · e23604ed

由 Linus Torvalds 提交于 3月 14, 2016

Pull NOHZ updates from Ingo Molnar:
 "NOHZ enhancements, by Frederic Weisbecker, which reorganizes/refactors
  the NOHZ 'can the tick be stopped?' infrastructure and related code to
  be data driven, and harmonizes the naming and handling of all the
  various properties"

[ This makes the ugly "fetch_or()" macro that the scheduler used
  internally a new generic helper, and does a bad job at it.

  I'm pulling it, but I've asked Ingo and Frederic to get this
  fixed up ]

* 'timers-nohz-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched-clock: Migrate to use new tick dependency mask model
  posix-cpu-timers: Migrate to use new tick dependency mask model
  sched: Migrate sched to use new tick dependency mask model
  sched: Account rr tasks
  perf: Migrate perf to use new tick dependency mask model
  nohz: Use enum code for tick stop failure tracing message
  nohz: New tick dependency mask
  nohz: Implement wide kick on top of irq work
  atomic: Export fetch_or()

e23604ed

Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · d4e79615

由 Linus Torvalds 提交于 3月 14, 2016

Pull scheduler updates from Ingo Molnar:
 "The main changes in this cycle are:

   - Make schedstats a runtime tunable (disabled by default) and
     optimize it via static keys.

     As most distributions enable CONFIG_SCHEDSTATS=y due to its
     instrumentation value, this is a nice performance enhancement.
     (Mel Gorman)

   - Implement 'simple waitqueues' (swait): these are just pure
     waitqueues without any of the more complex features of full-blown
     waitqueues (callbacks, wake flags, wake keys, etc.).  Simple
     waitqueues have less memory overhead and are faster.

     Use simple waitqueues in the RCU code (in 4 different places) and
     for handling KVM vCPU wakeups.

     (Peter Zijlstra, Daniel Wagner, Thomas Gleixner, Paul Gortmaker,
     Marcelo Tosatti)

   - sched/numa enhancements (Rik van Riel)

   - NOHZ performance enhancements (Rik van Riel)

   - Various sched/deadline enhancements (Steven Rostedt)

   - Various fixes (Peter Zijlstra)

   - ... and a number of other fixes, cleanups and smaller enhancements"

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (29 commits)
  sched/cputime: Fix steal_account_process_tick() to always return jiffies
  sched/deadline: Remove dl_new from struct sched_dl_entity
  Revert "kbuild: Add option to turn incompatible pointer check into error"
  sched/deadline: Remove superfluous call to switched_to_dl()
  sched/debug: Fix preempt_disable_ip recording for preempt_disable()
  sched, time: Switch VIRT_CPU_ACCOUNTING_GEN to jiffy granularity
  time, acct: Drop irq save & restore from __acct_update_integrals()
  acct, time: Change indentation in __acct_update_integrals()
  sched, time: Remove non-power-of-two divides from __acct_update_integrals()
  sched/rt: Kick RT bandwidth timer immediately on start up
  sched/debug: Add deadline scheduler bandwidth ratio to /proc/sched_debug
  sched/debug: Move sched_domain_sysctl to debug.c
  sched/debug: Move the /sys/kernel/debug/sched_features file setup into debug.c
  sched/rt: Fix PI handling vs. sched_setscheduler()
  sched/core: Remove duplicated sched_group_set_shares() prototype
  sched/fair: Consolidate nohz CPU load update code
  sched/fair: Avoid using decay_load_missed() with a negative value
  sched/deadline: Always calculate end of period on sched_yield()
  sched/cgroup: Fix cgroup entity load tracking tear-down
  rcu: Use simple wait queues where possible in rcutree
  ...

d4e79615

Merge branch 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · d88bfe1d

由 Linus Torvalds 提交于 3月 14, 2016

Pull RAS updates from Ingo Molnar:
 "Various RAS updates:

   - AMD MCE support updates for future CPUs, fixes and 'SMCA' (Scalable
     MCA) error decoding support (Aravind Gopalakrishnan)

   - x86 memcpy_mcsafe() support, to enable smart(er) hardware error
     recovery in NVDIMM drivers, based on an extension of the x86
     exception handling code.  (Tony Luck)"

* 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  EDAC/sb_edac: Fix computation of channel address
  x86/mm, x86/mce: Add memcpy_mcsafe()
  x86/mce/AMD: Document some functionality
  x86/mce: Clarify comments regarding deferred error
  x86/mce/AMD: Fix logic to obtain block address
  x86/mce/AMD, EDAC: Enable error decoding of Scalable MCA errors
  x86/mce: Move MCx_CONFIG MSR definitions
  x86/mce: Check for faults tagged in EXTABLE_CLASS_FAULT exception table entries
  x86/mm: Expand the exception table logic to allow new handling options
  x86/mce/AMD: Set MCAX Enable bit
  x86/mce/AMD: Carve out threshold block preparation
  x86/mce/AMD: Fix LVT offset configuration for thresholding
  x86/mce/AMD: Reduce number of blocks scanned per bank
  x86/mce/AMD: Do not perform shared bank check for future processors
  x86/mce: Fix order of AMD MCE init function call

d88bfe1d

Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · e71c2c1e

由 Linus Torvalds 提交于 3月 14, 2016

Pull perf updates from Ingo Molnar:
 "Main kernel side changes:

   - Big reorganization of the x86 perf support code.  The old code grew
     organically deep inside arch/x86/kernel/cpu/perf* and its naming
     became somewhat messy.

     The new location is under arch/x86/events/, using the following
     cleaner hierarchy of source code files:

       perf/x86: Move perf_event.c .................. => x86/events/core.c
       perf/x86: Move perf_event_amd.c .............. => x86/events/amd/core.c
       perf/x86: Move perf_event_amd_ibs.c .......... => x86/events/amd/ibs.c
       perf/x86: Move perf_event_amd_iommu.[ch] ..... => x86/events/amd/iommu.[ch]
       perf/x86: Move perf_event_amd_uncore.c ....... => x86/events/amd/uncore.c
       perf/x86: Move perf_event_intel_bts.c ........ => x86/events/intel/bts.c
       perf/x86: Move perf_event_intel.c ............ => x86/events/intel/core.c
       perf/x86: Move perf_event_intel_cqm.c ........ => x86/events/intel/cqm.c
       perf/x86: Move perf_event_intel_cstate.c ..... => x86/events/intel/cstate.c
       perf/x86: Move perf_event_intel_ds.c ......... => x86/events/intel/ds.c
       perf/x86: Move perf_event_intel_lbr.c ........ => x86/events/intel/lbr.c
       perf/x86: Move perf_event_intel_pt.[ch] ...... => x86/events/intel/pt.[ch]
       perf/x86: Move perf_event_intel_rapl.c ....... => x86/events/intel/rapl.c
       perf/x86: Move perf_event_intel_uncore.[ch] .. => x86/events/intel/uncore.[ch]
       perf/x86: Move perf_event_intel_uncore_nhmex.c => x86/events/intel/uncore_nmhex.c
       perf/x86: Move perf_event_intel_uncore_snb.c   => x86/events/intel/uncore_snb.c
       perf/x86: Move perf_event_intel_uncore_snbep.c => x86/events/intel/uncore_snbep.c
       perf/x86: Move perf_event_knc.c .............. => x86/events/intel/knc.c
       perf/x86: Move perf_event_p4.c ............... => x86/events/intel/p4.c
       perf/x86: Move perf_event_p6.c ............... => x86/events/intel/p6.c
       perf/x86: Move perf_event_msr.c .............. => x86/events/msr.c

     (Borislav Petkov)

   - Update various x86 PMU constraint and hw support details (Stephane
     Eranian)

   - Optimize kprobes for BPF execution (Martin KaFai Lau)

   - Rewrite, refactor and fix the Intel uncore PMU driver code (Thomas
     Gleixner)

   - Rewrite, refactor and fix the Intel RAPL PMU code (Thomas Gleixner)

   - Various fixes and smaller cleanups.

  There are lots of perf tooling updates as well.  A few highlights:

  perf report/top:

     - Hierarchy histogram mode for 'perf top' and 'perf report',
       showing multiple levels, one per --sort entry: (Namhyung Kim)

       On a mostly idle system:

         # perf top --hierarchy -s comm,dso

       Then expand some levels and use 'P' to take a snapshot:

         # cat perf.hist.0
         -  92.32%         perf
               58.20%         perf
               22.29%         libc-2.22.so
                5.97%         [kernel]
                4.18%         libelf-0.165.so
                1.69%         [unknown]
         -   4.71%         qemu-system-x86
                3.10%         [kernel]
                1.60%         qemu-system-x86_64 (deleted)
         +   2.97%         swapper
         #

     - Add 'L' hotkey to dynamicly set the percent threshold for
       histogram entries and callchains, i.e.  dynamicly do what the
       --percent-limit command line option to 'top' and 'report' does.
       (Namhyung Kim)

  perf mem:

     - Allow specifying events via -e in 'perf mem record', also listing
       what events can be specified via 'perf mem record -e list' (Jiri
       Olsa)

  perf record:

     - Add 'perf record' --all-user/--all-kernel options, so that one
       can tell that all the events in the command line should be
       restricted to the user or kernel levels (Jiri Olsa), i.e.:

         perf record -e cycles:u,instructions:u

       is equivalent to:

         perf record --all-user -e cycles,instructions

     - Make 'perf record' collect CPU cache info in the perf.data file header:

         $ perf record usleep 1
         [ perf record: Woken up 1 times to write data ]
         [ perf record: Captured and wrote 0.017 MB perf.data (7 samples) ]
         $ perf report --header-only -I | tail -10 | head -8
         # CPU cache info:
         #  L1 Data                 32K [0-1]
         #  L1 Instruction          32K [0-1]
         #  L1 Data                 32K [2-3]
         #  L1 Instruction          32K [2-3]
         #  L2 Unified             256K [0-1]
         #  L2 Unified             256K [2-3]
         #  L3 Unified            4096K [0-3]

       Will be used in 'perf c2c' and eventually in 'perf diff' to
       allow, for instance running the same workload in multiple
       machines and then when using 'diff' show the hardware difference.
       (Jiri Olsa)

     - Improved support for Java, using the JVMTI agent library to do
       jitdumps that then will be inserted in synthesized
       PERF_RECORD_MMAP2 events via 'perf inject' pointed to synthesized
       ELF files stored in ~/.debug and keyed with build-ids, to allow
       symbol resolution and even annotation with source line info, see
       the changeset comments to see how to use it (Stephane Eranian)

  perf script/trace:

     - Decode data_src values (e.g.  perf.data files generated by 'perf
       mem record') in 'perf script': (Jiri Olsa)

         # perf script
           perf 693 [1] 4.088652: 1 cpu/mem-loads,ldlat=30/P: ffff88007d0b0f40 68100142 L1 hit|SNP None|TLB L1 or L2 hit|LCK No <SNIP>
                                                                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     - Improve support to 'data_src', 'weight' and 'addr' fields in
       'perf script' (Jiri Olsa)

     - Handle empty print fmts in 'perf script -s' i.e. when running
       python or perl scripts (Taeung Song)

  perf stat:

     - 'perf stat' now shows shadow metrics (insn per cycle, etc) in
       interval mode too.  E.g:

         # perf stat -I 1000 -e instructions,cycles sleep 1
         #         time   counts unit events
            1.000215928  519,620      instructions     #  0.69 insn per cycle
            1.000215928  752,003      cycles
         <SNIP>

     - Port 'perf kvm stat' to PowerPC (Hemant Kumar)

     - Implement CSV metrics output in 'perf stat' (Andi Kleen)

  perf BPF support:

     - Support converting data from bpf events in 'perf data' (Wang Nan)

     - Print bpf-output events in 'perf script': (Wang Nan).

         # perf record -e bpf-output/no-inherit,name=evt/ -e ./test_bpf_output_3.c/map:channel.event=evt/ usleep 1000
         # perf script
            usleep  4882 21384.532523:   evt:  ffffffff810e97d1 sys_nanosleep ([kernel.kallsyms])
             BPF output: 0000: 52 61 69 73 65 20 61 20  Raise a
                         0008: 42 50 46 20 65 76 65 6e  BPF even
                         0010: 74 21 00 00              t!..
             BPF string: "Raise a BPF event!"
         #

     - Add API to set values of map entries in a BPF object, be it
       individual map slots or ranges (Wang Nan)

     - Introduce support for the 'bpf-output' event (Wang Nan)

     - Add glue to read perf events in a BPF program (Wang Nan)

     - Improve support for bpf-output events in 'perf trace' (Wang Nan)

  ... and tons of other changes as well - see the shortlog and git log
  for details!"

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (342 commits)
  perf stat: Add --metric-only support for -A
  perf stat: Implement --metric-only mode
  perf stat: Document CSV format in manpage
  perf hists browser: Check sort keys before hot key actions
  perf hists browser: Allow thread filtering for comm sort key
  perf tools: Add sort__has_comm variable
  perf tools: Recalc total periods using top-level entries in hierarchy
  perf tools: Remove nr_sort_keys field
  perf hists browser: Cleanup hist_browser__fprintf_hierarchy_entry()
  perf tools: Remove hist_entry->fmt field
  perf tools: Fix command line filters in hierarchy mode
  perf tools: Add more sort entry check functions
  perf tools: Fix hist_entry__filter() for hierarchy
  perf jitdump: Build only on supported archs
  tools lib traceevent: Add '~' operation within arg_num_eval()
  perf tools: Omit unnecessary cast in perf_pmu__parse_scale
  perf tools: Pass perf_hpp_list all the way through setup_sort_list
  perf tools: Fix perf script python database export crash
  perf jitdump: DWARF is also needed
  perf bench mem: Prepare the x86-64 build for upstream memcpy_mcsafe() changes
  ...

e71c2c1e

Merge branch 'mm-readonly-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · d09e356a

由 Linus Torvalds 提交于 3月 14, 2016

Pull read-only kernel memory updates from Ingo Molnar:
 "This tree adds two (security related) enhancements to the kernel's
  handling of read-only kernel memory:

   - extend read-only kernel memory to a new class of formerly writable
     kernel data: 'post-init read-only memory' via the __ro_after_init
     attribute, and mark the ARM and x86 vDSO as such read-only memory.

     This kind of attribute can be used for data that requires a once
     per bootup initialization sequence, but is otherwise never modified
     after that point.

     This feature was based on the work by PaX Team and Brad Spengler.

     (by Kees Cook, the ARM vDSO bits by David Brown.)

   - make CONFIG_DEBUG_RODATA always enabled on x86 and remove the
     Kconfig option.  This simplifies the kernel and also signals that
     read-only memory is the default model and a first-class citizen.
     (Kees Cook)"

* 'mm-readonly-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  ARM/vdso: Mark the vDSO code read-only after init
  x86/vdso: Mark the vDSO code read-only after init
  lkdtm: Verify that '__ro_after_init' works correctly
  arch: Introduce post-init read-only memory
  x86/mm: Always enable CONFIG_DEBUG_RODATA and remove the Kconfig option
  mm/init: Add 'rodata=off' boot cmdline parameter to disable read-only kernel mappings
  asm-generic: Consolidate mark_rodata_ro()

d09e356a

Merge branch 'mm-pat-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 5ec94246

由 Linus Torvalds 提交于 3月 14, 2016

Pull dma_*_writecombine rename from Ingo Molnar:
 "Rename dma_*_writecombine() to dma_*_wc()

  This is a tree-wide API rename, to move the dma_*() write-combining
  APIs closer in name to their usual API families.  (The old API names
  are kept as compatibility wrappers to not introduce extra breakage.)

  The patch was Coccinelle generated"

* 'mm-pat-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  dma, mm/pat: Rename dma_*_writecombine() to dma_*_wc()

5ec94246

Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · fbed0bc0

由 Linus Torvalds 提交于 3月 14, 2016

Pull locking changes from Ingo Molnar:
 "Various updates:

   - Futex scalability improvements: remove page lock use for shared
     futex get_futex_key(), which speeds up 'perf bench futex hash'
     benchmarks by over 40% on a 60-core Westmere.  This makes anon-mem
     shared futexes perform close to private futexes.  (Mel Gorman)

   - lockdep hash collision detection and fix (Alfredo Alvarez
     Fernandez)

   - lockdep testing enhancements (Alfredo Alvarez Fernandez)

   - robustify lockdep init by using hlists (Andrew Morton, Andrey
     Ryabinin)

   - mutex and csd_lock micro-optimizations (Davidlohr Bueso)

   - small x86 barriers tweaks (Michael S Tsirkin)

   - qspinlock updates (Waiman Long)"

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits)
  locking/csd_lock: Use smp_cond_acquire() in csd_lock_wait()
  locking/csd_lock: Explicitly inline csd_lock*() helpers
  futex: Replace barrier() in unqueue_me() with READ_ONCE()
  locking/lockdep: Detect chain_key collisions
  locking/lockdep: Prevent chain_key collisions
  tools/lib/lockdep: Fix link creation warning
  tools/lib/lockdep: Add tests for AA and ABBA locking
  tools/lib/lockdep: Add userspace version of READ_ONCE()
  tools/lib/lockdep: Fix the build on recent kernels
  locking/qspinlock: Move __ARCH_SPIN_LOCK_UNLOCKED to qspinlock_types.h
  locking/mutex: Allow next waiter lockless wakeup
  locking/pvqspinlock: Enable slowpath locking count tracking
  locking/qspinlock: Use smp_cond_acquire() in pending code
  locking/pvqspinlock: Move lock stealing count tracking code into pv_queued_spin_steal_lock()
  locking/mcs: Fix mcs_spin_lock() ordering
  futex: Remove requirement for lock_page() in get_futex_key()
  futex: Rename barrier references in ordering guarantees
  locking/atomics: Update comment about READ_ONCE() and structures
  locking/lockdep: Eliminate lockdep_init()
  locking/lockdep: Convert hash tables to hlists
  ...

fbed0bc0

Merge branch 'core-resources-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · d37a14bb

由 Linus Torvalds 提交于 3月 14, 2016

Pull ram resource handling changes from Ingo Molnar:
 "Core kernel resource handling changes to support NVDIMM error
  injection.

  This tree introduces a new I/O resource type, IORESOURCE_SYSTEM_RAM,
  for System RAM while keeping the current IORESOURCE_MEM type bit set
  for all memory-mapped ranges (including System RAM) for backward
  compatibility.

  With this resource flag it no longer takes a strcmp() loop through the
  resource tree to find "System RAM" resources.

  The new resource type is then used to extend ACPI/APEI error injection
  facility to also support NVDIMM"

* 'core-resources-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  ACPI/EINJ: Allow memory error injection to NVDIMM
  resource: Kill walk_iomem_res()
  x86/kexec: Remove walk_iomem_res() call with GART type
  x86, kexec, nvdimm: Use walk_iomem_res_desc() for iomem search
  resource: Add walk_iomem_res_desc()
  memremap: Change region_intersects() to take @flags and @desc
  arm/samsung: Change s3c_pm_run_res() to use System RAM type
  resource: Change walk_system_ram() to use System RAM type
  drivers: Initialize resource entry to zero
  xen, mm: Set IORESOURCE_SYSTEM_RAM to System RAM
  kexec: Set IORESOURCE_SYSTEM_RAM for System RAM
  arch: Set IORESOURCE_SYSTEM_RAM flag for System RAM
  ia64: Set System RAM type and descriptor
  x86/e820: Set System RAM type and descriptor
  resource: Add I/O resource descriptor
  resource: Handle resource flags properly
  resource: Add System RAM resource type

d37a14bb

14 3月, 2016 2 次提交

L

Linux 4.5 · b562e44f
由 Linus Torvalds 提交于 3月 13, 2016

b562e44f

Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus · a2655549

由 Linus Torvalds 提交于 3月 13, 2016

Pull MIPS fixes from Ralf Baechle:
 "Another round of MIPS fixes for 4.5:

   - Fix JZ4780 build with DEBUG_ZBOOT and MACH_JZ4780
   - Fix build with DEBUG_ZBOOT and MACH_JZ4780
   - Fix issue with uninitialised temp_foreign_map
   - Fix awk regex compile failure with certain versions of awk.  At
     this time, the sole user, ld-ifversion, is only used on MIPS"

* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
  MIPS: smp.c: Fix uninitialised temp_foreign_map
  MIPS: Fix build error when SMP is used without GIC
  ld-version: Fix awk regex compile failure
  MIPS: Fix build with DEBUG_ZBOOT and MACH_JZ4780

a2655549

13 3月, 2016 8 次提交

MIPS: smp.c: Fix uninitialised temp_foreign_map · d825c06b

由 James Hogan 提交于 3月 04, 2016

When calculate_cpu_foreign_map() recalculates the cpu_foreign_map
cpumask it uses the local variable temp_foreign_map without initialising
it to zero. Since the calculation only ever sets bits in this cpumask
any existing bits at that memory location will remain set and find their
way into cpu_foreign_map too. This could potentially lead to cache
operations suboptimally doing smp calls to multiple VPEs in the same
core, even though the VPEs share primary caches.

Therefore initialise temp_foreign_map using cpumask_clear() before use.

Fixes: cccf34e9 ("MIPS: c-r4k: Fix cache flushing for MT cores")
Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/12759/Signed-off-by: NRalf Baechle <ralf@linux-mips.org>

d825c06b

MIPS: Fix build error when SMP is used without GIC · 7a50e468

由 Hauke Mehrtens 提交于 3月 06, 2016

The MIPS_GIC_IPI should only be selected when MIPS_GIC is also
selected, otherwise it results in a compile error. smp-gic.c uses some
functions from include/linux/irqchip/mips-gic.h like
plat_ipi_call_int_xlate() which are only added to the header file when
MIPS_GIC is set. The Lantiq SoC does not use the GIC, but supports SMP.
The calls top the functions from smp-gic.c are already protected by
some #ifdefs

The first part of this was introduced in commit 72e20142 ("MIPS:
Move GIC IPI functions out of smp-cmp.c")
Signed-off-by: NHauke Mehrtens <hauke@hauke-m.de>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: stable@vger.kernel.org # v3.15+
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/12774/Signed-off-by: NRalf Baechle <ralf@linux-mips.org>

7a50e468

ld-version: Fix awk regex compile failure · 4b7b1ef2

由 James Hogan 提交于 3月 08, 2016

The ld-version.sh script fails on some versions of awk with the
following error, resulting in build failures for MIPS:

awk: scripts/ld-version.sh: line 4: regular expression compile failed (missing '(')

This is due to the regular expression ".*)", meant to strip off the
beginning of the ld version string up to the close bracket, however
brackets have a meaning in regular expressions, so lets escape it so
that awk doesn't expect a corresponding open bracket.

Fixes: ccbef167 ("Kbuild, lto: add ld-version and ld-ifversion ...")
Reported-by: NGeert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
Tested-by: NMichael S. Tsirkin <mst@redhat.com>
Acked-by: NMichael S. Tsirkin <mst@redhat.com>
Tested-by: NSudip Mukherjee <sudip.mukherjee@codethink.co.uk>
Cc: Michal Marek <mmarek@suse.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: linux-mips@linux-mips.org
Cc: linux-kbuild@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org # 4.4.x-
Patchwork: https://patchwork.linux-mips.org/patch/12838/Signed-off-by: NRalf Baechle <ralf@linux-mips.org>

4b7b1ef2

MIPS: Fix build with DEBUG_ZBOOT and MACH_JZ4780 · ba9e72c2

由 Aaro Koskinen 提交于 3月 08, 2016

Ingenic SoC declares ZBOOT support, but debug definitions are missing
for MACH_JZ4780 resulting in a build failure when DEBUG_ZBOOT is set.
The UART addresses are same as with JZ4740, so fix by covering JZ4780
with those as well.
Signed-off-by: NAaro Koskinen <aaro.koskinen@iki.fi>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/12830/Signed-off-by: NRalf Baechle <ralf@linux-mips.org>

ba9e72c2

Merge branch 'for-linus' of git://git.kernel.dk/linux-block · f414ca64

由 Linus Torvalds 提交于 3月 12, 2016

Pull block merge fix from Jens Axboe.

This fixes the block segment counting bug and resulting sg overrun
reported by Kent Overstreet, introduced with the last block pull.

* 'for-linus' of git://git.kernel.dk/linux-block:
  block: don't optimize for non-cloned bio in bio_get_last_bvec()

f414ca64

Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 2f51c820

由 Linus Torvalds 提交于 3月 12, 2016

Pull x86 fixes from Ingo Molnar:
 "This fixes 3 FPU handling related bugs, an EFI boot crash and a
  runtime warning.

  The EFI fix arrived late but I didn't want to delay it to after v4.5
  because the effects are pretty bad for the systems that are affected
  by it"

[ Actually, I don't think the EFI fix really matters yet, because we
  haven't switched to the separate EFI page tables in mainline yet ]

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/efi: Fix boot crash by always mapping boot service regions into new EFI page tables
  x86/fpu: Fix eager-FPU handling on legacy FPU machines
  x86/delay: Avoid preemptible context checks in delay_mwaitx()
  x86/fpu: Revert ("x86/fpu: Disable AVX when eagerfpu is off")
  x86/fpu: Fix 'no387' regression

2f51c820

Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending · fda604a4

由 Linus Torvalds 提交于 3月 12, 2016

Pull SCSI target bug fix from Nicholas Bellinger:
 "Here is an outstanding target-core bug-fix for v4.5 code."

  This patch addresses a recent Task Management (TMR) regression related
  to larger set of multi-port LUN_RESET bug-fixes in v4.5-rc5.

  It drops a left-over target_put_sess_cmd() of se_cmd->cmd_kref within
  ABORT_TASK failure path, once a se_cmd descriptor has already
  completed posting response to fabric driver logic, and must be skipped
  during normal ABORT_TASK se_cmd->tag lookup"

* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
  target: Drop incorrect ABORT_TASK put for completed commands

fda604a4

block: don't optimize for non-cloned bio in bio_get_last_bvec() · 90d0f0f1

由 Ming Lei 提交于 3月 12, 2016

For !BIO_CLONED bio, we can use .bi_vcnt safely, but it
doesn't mean we can just simply return .bi_io_vec[.bi_vcnt - 1]
because the start postion may have been moved in the middle of
the bvec, such as splitting in the middle of bvec.

Fixes: 7bcd79ac(block: bio: introduce helpers to get the 1st and last bvec)
Cc: stable@vger.kernel.org
Reported-by: NKent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: NMing Lei <ming.lei@canonical.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

90d0f0f1

12 3月, 2016 12 次提交

x86/efi: Fix boot crash by always mapping boot service regions into new EFI page tables · 452308de

由 Matt Fleming 提交于 3月 11, 2016

Some machines have EFI regions in page zero (physical address
0x00000000) and historically that region has been added to the e820
map via trim_bios_range(), and ultimately mapped into the kernel page
tables. It was not mapped via efi_map_regions() as one would expect.

Alexis reports that with the new separate EFI page tables some boot
services regions, such as page zero, are not mapped. This triggers an
oops during the SetVirtualAddressMap() runtime call.

For the EFI boot services quirk on x86 we need to memblock_reserve()
boot services regions until after SetVirtualAddressMap(). Doing that
while respecting the ownership of regions that may have already been
reserved by the kernel was the motivation behind this commit:

  7d68dc3f ("x86, efi: Do not reserve boot services regions within reserved areas")

That patch was merged at a time when the EFI runtime virtual mappings
were inserted into the kernel page tables as described above, and the
trick of setting ->numpages (and hence the region size) to zero to
track regions that should not be freed in efi_free_boot_services()
meant that we never mapped those regions in efi_map_regions(). Instead
we were relying solely on the existing kernel mappings.

Now that we have separate page tables we need to make sure the EFI
boot services regions are mapped correctly, even if someone else has
already called memblock_reserve(). Instead of stashing a tag in
->numpages, set the EFI_MEMORY_RUNTIME bit of ->attribute. Since it
generally makes no sense to mark a boot services region as required at
runtime, it's pretty much guaranteed the firmware will not have
already set this bit.

For the record, the specific circumstances under which Alexis
triggered this bug was that an EFI runtime driver on his machine was
responding to the EVT_SIGNAL_VIRTUAL_ADDRESS_CHANGE event during
SetVirtualAddressMap().

The event handler for this driver looks like this,

  sub rsp,0x28
  lea rdx,[rip+0x2445] # 0xaa948720
  mov ecx,0x4
  call func_aa9447c0  ; call to ConvertPointer(4, & 0xaa948720)
  mov r11,QWORD PTR [rip+0x2434] # 0xaa948720
  xor eax,eax
  mov BYTE PTR [r11+0x1],0x1
  add rsp,0x28
  ret

Which is pretty typical code for an EVT_SIGNAL_VIRTUAL_ADDRESS_CHANGE
handler. The "mov r11, QWORD PTR [rip+0x2424]" was the faulting
instruction because ConvertPointer() was being called to convert the
address 0x0000000000000000, which when converted is left unchanged and
remains 0x0000000000000000.

The output of the oops trace gave the impression of a standard NULL
pointer dereference bug, but because we're accessing physical
addresses during ConvertPointer(), it wasn't. EFI boot services code
is stored at that address on Alexis' machine.
Reported-by: NAlexis Murzeau <amurzeau@gmail.com>
Signed-off-by: NMatt Fleming <matt@codeblueprint.co.uk>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Raphael Hertzog <hertzog@debian.org>
Cc: Roger Shimizu <rogershimizu@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/1457695163-29632-2-git-send-email-matt@codeblueprint.co.uk
Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=815125Signed-off-by: NIngo Molnar <mingo@kernel.org>

452308de

x86/fpu: Fix eager-FPU handling on legacy FPU machines · 6e686709

由 Borislav Petkov 提交于 3月 11, 2016

i486 derived cores like Intel Quark support only the very old,
legacy x87 FPU (FSAVE/FRSTOR, CPUID bit FXSR is not set), and
our FPU code wasn't handling the saving and restoring there
properly in the 'eagerfpu' case.

So after we made eagerfpu the default for all CPU types:

  58122bf1 x86/fpu: Default eagerfpu=on on all CPUs

these old FPU designs broke. First, Andy Shevchenko reported a splat:

  WARNING: CPU: 0 PID: 823 at arch/x86/include/asm/fpu/internal.h:163 fpu__clear+0x8c/0x160

which was us trying to execute FXRSTOR on those machines even though
they don't support it.

After taking care of that, Bryan O'Donoghue reported that a simple FPU
test still failed because we weren't initializing the FPU state properly
on those machines.

Take care of all that.
Reported-and-tested-by: NBryan O'Donoghue <pure.logic@nexus-software.ie>
Reported-by: NAndy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yu-cheng <yu-cheng.yu@intel.com>
Link: http://lkml.kernel.org/r/20160311113206.GD4312@pd.tnicSigned-off-by: NIngo Molnar <mingo@kernel.org>

6e686709

Merge tag 'for-linus-20160311' of git://git.infradead.org/linux-mtd · 03c668a9

由 Linus Torvalds 提交于 3月 11, 2016

Pull MTD fixes from Brian Norris:
 "Late MTD fix for v4.5:

   - A simple error code handling fix for the NAND ECC test; this was a
     regression in v4.5-rc1

   - A MAINTAINERS update, which might as well go in ASAP"

* tag 'for-linus-20160311' of git://git.infradead.org/linux-mtd:
  MAINTAINERS: add a maintainer for the NAND subsystem
  mtd: nand: tests: fix regression introduced in mtd_nandectest

03c668a9

Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux · 3ab0a0f9

由 Linus Torvalds 提交于 3月 11, 2016

Pull drm/i915 fixes from Dave Airlie:
 "Just two i915 regression fixes, that should be it from me"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  drm/i915: Actually retry with bit-banging after GMBUS timeout
  drm/i915: Fix bogus dig_port_map[] assignment for pre-HSW

3ab0a0f9

mm/mempool: avoid KASAN marking mempool poison checks as use-after-free · 76401310

由 Matthew Dawson 提交于 3月 11, 2016

When removing an element from the mempool, mark it as unpoisoned in KASAN
before verifying its contents for SLUB/SLAB debugging.  Otherwise KASAN
will flag the reads checking the element use-after-free writes as
use-after-free reads.
Signed-off-by: NMatthew Dawson <matthew@mjdsystems.ca>
Acked-by: NAndrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

76401310

Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc · 2a4fb270

由 Linus Torvalds 提交于 3月 11, 2016

Pull ARM SoC fixes from Olof Johansson:
 "Two more fixes for 4.5:

   - One is a fix for OMAP that is urgently needed to avoid DRA7xx chips
     from premature aging, by always keeping the Ethernet clock enabled.

   - The other solves a I/O memory layout issue on Armada, where SROM
     and PCI memory windows were conflicting in some configurations"

* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  ARM: mvebu: fix overlap of Crypto SRAM with PCIe memory window
  ARM: dts: dra7: do not gate cpsw clock due to errata i877
  ARM: OMAP2+: hwmod: Introduce ti,no-idle dt property

2a4fb270

Merge tag 'media/v4.5-5' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media · 95f41fb2

由 Linus Torvalds 提交于 3月 11, 2016

Pull media fix from Mauro Carvalho Chehab:
 "One last time fix: It adds a code that prevents some media tools like
  media-ctl to hide some entities that have their IDs out of the range
  expected by those apps"

* tag 'media/v4.5-5' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
  [media] media-device: map new functions into old types for legacy API

95f41fb2

ARM: mvebu: fix overlap of Crypto SRAM with PCIe memory window · d7d5a43c

由 Thomas Petazzoni 提交于 3月 08, 2016

When the Crypto SRAM mappings were added to the Device Tree files
describing the Armada XP boards in commit c466d997 ("ARM: mvebu:
define crypto SRAM ranges for all armada-xp boards"), the fact that
those mappings were overlaping with the PCIe memory aperture was
overlooked. Due to this, we currently have for all Armada XP platforms
a situation that looks like this:

Memory mapping on Armada XP boards with internal registers at
0xf1000000:

 - 0x00000000 -> 0xf0000000	3.75G 	RAM
 - 0xf0000000 -> 0xf1000000	16M	NOR flashes (AXP GP / AXP DB)
 - 0xf1000000 -> 0xf1100000	1M	internal registers
 - 0xf8000000 -> 0xffe0000	126M	PCIe memory aperture
 - 0xf8100000 -> 0xf8110000	64KB	Crypto SRAM #0	=> OVERLAPS WITH PCIE !
 - 0xf8110000 -> 0xf8120000	64KB	Crypto SRAM #1	=> OVERLAPS WITH PCIE !
 - 0xffe00000 -> 0xfff00000	1M	PCIe I/O aperture
 - 0xfff0000  -> 0xffffffff	1M	BootROM

The overlap means that when PCIe devices are added, depending on their
memory window needs, they might or might not be mapped into the
physical address space. Indeed, they will not be mapped if the area
allocated in the PCIe memory aperture by the PCI core overlaps with
one of the Crypto SRAM. Typically, a Intel IGB PCIe NIC that needs 8MB
of PCIe memory will see its PCIe memory window allocated from
0xf80000000 for 8MB, which overlaps with the Crypto SRAM windows. Due
to this, the PCIe window is not created, and any attempt to access the
PCIe window makes the kernel explode:

[    3.302213] igb: Copyright (c) 2007-2014 Intel Corporation.
[    3.307841] pci 0000:00:09.0: enabling device (0140 -> 0143)
[    3.313539] mvebu_mbus: cannot add window '4:f8', conflicts with another window
[    3.320870] mvebu-pcie soc:pcie-controller: Could not create MBus window at [mem 0xf8000000-0xf87fffff]: -22
[    3.330811] Unhandled fault: external abort on non-linefetch (0x1008) at 0xf08c0018

This problem does not occur on Armada 370 boards, because we use the
following memory mapping (for boards that have internal registers at
0xf1000000):

 - 0x00000000 -> 0xf0000000	3.75G 	RAM
 - 0xf0000000 -> 0xf1000000	16M	NOR flashes (AXP GP / AXP DB)
 - 0xf1000000 -> 0xf1100000	1M	internal registers
 - 0xf1100000 -> 0xf1110000	64KB	Crypto SRAM #0 => OK !
 - 0xf8000000 -> 0xffe0000	126M	PCIe memory
 - 0xffe00000 -> 0xfff00000	1M	PCIe I/O
 - 0xfff0000  -> 0xffffffff	1M	BootROM

Obviously, the solution is to align the location of the Crypto SRAM
mappings of Armada XP to be similar with the ones on Armada 370, i.e
have them between the "internal registers" area and the beginning of
the PCIe aperture.

However, we have a special case with the OpenBlocks AX3-4 platform,
which has a 128 MB NOR flash. Currently, this NOR flash is mapped from
0xf0000000 to 0xf8000000. This is possible because on OpenBlocks
AX3-4, the internal registers are not at 0xf1000000. And this explains
why the Crypto SRAM mappings were not configured at the same place on
Armada XP.

Hence, the solution is two-fold:

 (1) Move the NOR flash mapping on Armada XP OpenBlocks AX3-4 from
     0xe8000000 to 0xf0000000. This frees the 0xf0000000 ->
     0xf80000000 space.

 (2) Move the Crypto SRAM mappings on Armada XP to be similar to
     Armada 370 (except of course that Armada XP has two Crypto SRAM
     and not one).

After this patch, the memory mapping on Armada XP boards with
registers at 0xf1 is:

 - 0x00000000 -> 0xf0000000	3.75G 	RAM
 - 0xf0000000 -> 0xf1000000	16M	NOR flashes (AXP GP / AXP DB)
 - 0xf1000000 -> 0xf1100000	1M	internal registers
 - 0xf1100000 -> 0xf1110000	64KB	Crypto SRAM #0
 - 0xf1110000 -> 0xf1120000	64KB	Crypto SRAM #1
 - 0xf8000000 -> 0xffe0000	126M	PCIe memory
 - 0xffe00000 -> 0xfff00000	1M	PCIe I/O
 - 0xfff0000  -> 0xffffffff	1M	BootROM

And the memory mapping for the special case of the OpenBlocks AX3-4
(internal registers at 0xd0000000, NOR of 128 MB):

 - 0x00000000 -> 0xc0000000	3G 	RAM
 - 0xd0000000 -> 0xd1000000	1M	internal registers
 - 0xe800000  -> 0xf0000000	128M	NOR flash
 - 0xf1100000 -> 0xf1110000	64KB	Crypto SRAM #0
 - 0xf1110000 -> 0xf1120000	64KB	Crypto SRAM #1
 - 0xf8000000 -> 0xffe0000	126M	PCIe memory
 - 0xffe00000 -> 0xfff00000	1M	PCIe I/O
 - 0xfff0000  -> 0xffffffff	1M	BootROM

Fixes: c466d997 ("ARM: mvebu: define crypto SRAM ranges for all armada-xp boards")
Reported-by: NPhil Sutter <phil@nwl.cc>
Cc: Phil Sutter <phil@nwl.cc>
Cc: <stable@vger.kernel.org>
Signed-off-by: NThomas Petazzoni <thomas.petazzoni@free-electrons.com>
Acked-by: NGregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: NOlof Johansson <olof@lixom.net>

d7d5a43c

Merge tag 'dmaengine-fix-4.5' of git://git.infradead.org/users/vkoul/slave-dma · 20698c92

由 Linus Torvalds 提交于 3月 11, 2016

Pull dmaengine fixes from Vinod Koul:
 "Two fixes showed up in last few days, and they should be included in
  4.5.  Summary:

  Two more late fixes to drivers, nothing major here:

   - A memory leak fix in fsdma unmap the dma descriptors on freeup

   - A fix in xdmac driver for residue calculation of dma descriptor"

* tag 'dmaengine-fix-4.5' of git://git.infradead.org/users/vkoul/slave-dma:
  dmaengine: at_xdmac: fix residue computation
  dmaengine: fsldma: fix memory leak

20698c92

Merge tag 'pm+acpi-4.5-final' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 7ae9c768

由 Linus Torvalds 提交于 3月 11, 2016

Pull power management and ACPI fixes from Rafael Wysocki:
 "Two more fixes for issues introduced recently, one in the generic
  device properties framework and one in ACPICA.

  Specifics:

   - Revert a recent ACPICA commit that has been reverted upstream,
     because it caused problems to happen on user systems and the
     problem it attempted to address will not be relevant any more after
     upcoming ACPI specification changes (Bob Moore).

   - Fix crash in the generic device properties framework introduced by
     a recent change that forgot to check pointers against error values
     in addition to checking them against NULL (Heikki Krogerus)"

* tag 'pm+acpi-4.5-final' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  device property: fwnode->secondary may contain ERR_PTR(-ENODEV)
  ACPICA: Revert "Parser: Fix for SuperName method invocation"

7ae9c768

Merge tag 'xfs-for-linus-4.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs · 2a62ec0a

由 Linus Torvalds 提交于 3月 11, 2016

Pull xfs fixes from Dave Chinner:
 "This is a fix for a regression introduced in 4.5-rc1 by the new torn
  log write detection code.  The regression only affects people moving a
  clean filesystem between machines/kernels of different architecture
  (such as changing between 32 bit and 64 bit kernels), but this is the
  recommended (and only!) safe way to migrate a filesystem between
  architectures so we really need to ensure it works.

  The changes are larger than I'd prefer right at the end of the release
  cycle, but the majority of the change is just factoring code to enable
  the detection of a clean log at the correct time to avoid this issue.

  Changes:

   - Only perform torn log write detection on dirty logs.  This prevents
     failures being detected due to a clean filesystem being moved
     between machines or kernels of different architectures (e.g.  32 ->
     64 bit, BE -> LE, etc).  This fixes a regression introduced by the
     torn log write detection in 4.5-rc1"

* tag 'xfs-for-linus-4.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs:
  xfs: only run torn log write detection on dirty logs
  xfs: refactor in-core log state update to helper
  xfs: refactor unmount record detection into helper
  xfs: separate log head record discovery from verification

2a62ec0a

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 63cf207e

由 Linus Torvalds 提交于 3月 11, 2016

Pull vfs fixes from Al Viro:
 "A couple of fixes: Fix for my dumb braino in ncpfs and a long-standing
  breakage on recovery from failed rename() in jffs2"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  jffs2: reduce the breakage on recovery from halfway failed rename()
  ncpfs: fix a braino in OOM handling in ncp_fill_cache()

63cf207e

11 3月, 2016 1 次提交

Merge branches 'device-properties-fixes' and 'acpica-fixes' · 5b3e7e05

由 Rafael J. Wysocki 提交于 3月 11, 2016

* device-properties-fixes:
  device property: fwnode->secondary may contain ERR_PTR(-ENODEV)

* acpica-fixes:
  ACPICA: Revert "Parser: Fix for SuperName method invocation"

5b3e7e05

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功