1. 12 Aug, 2017 5 commits
    • perf test shell: Move vfs_getname probe function to lib · 5ce669a5
      Arnaldo Carvalho de Melo authored
      Multiple tests will be able to reuse these functions, to test things
      like perf report, 'trace', etc, using this probe.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-48xagvozhouhyi8fjota6o2d@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      5ce669a5
    • perf test shell: Install shell tests · 122e0b94
      Arnaldo Carvalho de Melo authored
      Now that we have shell tests, install them.
      
      Developers don't need this pass, as 'perf test' will look first at the
      in tree scripts at tools/perf/tests/shell/.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-j21u4v0jsehi0lpwqwjb4j45@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      122e0b94
    • perf test shell: Add 'probe_vfs_getname' shell test · a3534842
      Arnaldo Carvalho de Melo authored
      First perf shell test:
      
        # perf test vfs_getname
        60: Add vfs_getname probe to get syscall args filenames: Ok
        #
      
      In verbose mode:
      
        # perf test -v vfs_getname
        60: Add vfs_getname probe to get syscall args filenames:
        --- start ---
        test child forked, pid 19146
        Added new event:
          probe:vfs_getname    (on getname_flags:72 with pathname=result->name:string)
      
        You can now use it in all perf tools, such as:
      
      	  perf record -e probe:vfs_getname -aR sleep 1
      
        test child finished with 0
        ---- end ----
        Add vfs_getname probe to get syscall args filenames: Ok
        #
      
      And if the vmlinux file is not found:
      
        # mv ../build/v4.12.0-rc6+/vmlinux ../build/v4.12.0-rc6+/vmlinux.hidden
        # perf test vfs_getname
        60: Add vfs_getname probe to get syscall args filenames: Skip
        #
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-8f3n22c1yn516ev30s603ow2@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      a3534842
    • perf test: Make 'list' use same filtering code as main 'perf test' · 6d02acc1
      Arnaldo Carvalho de Melo authored
      Before:
      
        # perf test Synth
        39: Synthesize thread map  : Ok
        41: Synthesize cpu map     : Ok
        42: Synthesize stat config : Ok
        43: Synthesize stat        : Ok
        44: Synthesize stat round  : Ok
        45: Synthesize attr update : Ok
        # perf test list Synth
        #
      
      After:
      
        # perf test Synth
        39: Synthesize thread map  : Ok
        41: Synthesize cpu map     : Ok
        42: Synthesize stat config : Ok
        43: Synthesize stat        : Ok
        44: Synthesize stat round  : Ok
        45: Synthesize attr update : Ok
        # perf test list Synth
        39: Synthesize thread map
        41: Synthesize cpu map
        42: Synthesize stat config
        43: Synthesize stat
        44: Synthesize stat round
        45: Synthesize attr update
        #
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-v95tqqzuwawsmds3zn2mosje@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      6d02acc1
    • perf test: Add infrastructure to run shell based tests · 1209b273
      Arnaldo Carvalho de Melo authored
      To allow testing by using the perf tools directly in scripts, checking
      that the effects on the system are the expected ones and that the
      output produced is the desired one as well.
      
      For instance, adding a probe at a well known location with 'perf probe',
      then checking that the results from using that probe to record are the
      desired ones, etc.
      
      The next csets will introduce tests using this new testing
      infrastructure.
      
      The scripts should return 0 for Ok, 1 for FAIL and 2 for SKIP.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-swbpn7amrjqffh83lsr39s9p@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      1209b273
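The exit-code convention stated in the commit above (0 for Ok, 1 for FAIL, 2 for SKIP) can be illustrated with a small user-space harness. This is a minimal sketch, not the code in tools/perf: the names `shell_test_label` and `run_shell_test` are hypothetical, and treating any other exit code (or death by signal) as a failure is an assumption.

```c
#include <stdlib.h>
#include <sys/wait.h>

/* Exit-code convention from the commit message: 0 Ok, 1 FAIL, 2 SKIP. */
enum shell_test_result { TEST_OK = 0, TEST_FAIL = 1, TEST_SKIP = 2 };

/* Map a script's exit code to a result label (hypothetical helper). */
static const char *shell_test_label(int exit_code)
{
	switch (exit_code) {
	case TEST_OK:
		return "Ok";
	case TEST_SKIP:
		return "Skip";
	default:
		/* Assumption: anything other than 0 or 2 counts as a failure. */
		return "FAIL";
	}
}

/* Run a command through the shell and classify how it exited. */
static const char *run_shell_test(const char *command)
{
	int status = system(command);

	/* system() itself failed, or the script was killed by a signal. */
	if (status == -1 || !WIFEXITED(status))
		return "FAIL";

	return shell_test_label(WEXITSTATUS(status));
}
```

For example, `run_shell_test("exit 2")` classifies a script that bails out early, such as one that cannot find a vmlinux file, as skipped rather than failed.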
  2. 11 Aug, 2017 7 commits
  3. 10 Aug, 2017 18 commits
    • Merge tag 'perf-core-for-mingo-4.14-20170801' of... · 82119cbe
      Ingo Molnar authored
      Merge tag 'perf-core-for-mingo-4.14-20170801' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
      
      Pull perf/core improvements from Arnaldo Carvalho de Melo:
      
      User visible changes:
      
      - Beautifiers for the 'cmd' arg of several ioctl types, including:
        sound, DRM, KVM, vhost virtio and perf_events.
      
        This was done by using scripts that extract the information from
        the UAPI headers, generating string tables that are then used in
        the 'perf trace' syscall argument ioctl beautifier.
      
        More work needed to further use it, for instance, to use the
        _IOC_DIR value where it is used sanely to suppress the third
        argument, to set formatters for non-pointer values and ultimately
        for using eBPF + pahole-like code to collect + beautify structs in
        the third arg.
      
        Using the current scheme of having tools/ copies of kernel headers
        we'll make sure tooling stays working when changes are made to the
        kernel ABI headers and will be notified when they get changed,
        reducing the time for 'perf trace' to support new ABIs and allowing
        the tools/perf/ codebase to have the definitions it needs to
        build in dozens of distros/versions, as routinely tested using
        containers for, at this time, 47 environments. (Arnaldo Carvalho de Melo)
      
      Infrastructure changes:
      
      - Clarify header version warning message (Ingo Molnar)
      
      - Sync kernel ABI headers with tooling headers (Ingo Molnar, Arnaldo Carvalho de Melo)
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      82119cbe
    • kprobes/x86: Do not jump-optimize kprobes on irq entry code · d9f5f32a
      Masami Hiramatsu authored
      Since the kernel segment registers are not prepared at the
      entry of irq-entry code, if a kprobe on such code is
      jump-optimized, accessing per-CPU variables may cause a
      kernel panic.
      
      However, if the kprobe is not optimized, it triggers an int3
      exception and sets segment registers correctly.
      
      With this patch we check the probe-address and if it is in the
      irq-entry code, it prohibits optimizing such kprobes.
      
      This means we can continue probing such interrupt handlers with
      kprobes, but those probes are no longer jump-optimized.
      Reported-by: Francis Deslauriers <francis.deslauriers@efficios.com>
      Tested-by: Francis Deslauriers <francis.deslauriers@efficios.com>
      Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: David S . Miller <davem@davemloft.net>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: linux-arch@vger.kernel.org
      Cc: linux-cris-kernel@axis.com
      Cc: mathieu.desnoyers@efficios.com
      Link: http://lkml.kernel.org/r/150172795654.27216.9824039077047777477.stgit@devbox
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      d9f5f32a
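The check described in the commit above amounts to an address-range test: a probe whose address falls inside irq-entry text is refused jump optimization and stays int3-based. The following is a user-space sketch of that idea only. `struct text_range` and `may_jump_optimize` are hypothetical names, and the ranges are passed in explicitly rather than taken from kernel section symbols.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Half-open address range [start, end) standing in for a text section. */
struct text_range {
	uintptr_t start;
	uintptr_t end;
};

static bool in_range(const struct text_range *r, uintptr_t addr)
{
	return addr >= r->start && addr < r->end;
}

/*
 * Return true if a probe at addr may be jump-optimized.
 * Probes inside any of the given irq-entry ranges are rejected, so they
 * keep using the breakpoint path instead of a jump.
 */
static bool may_jump_optimize(uintptr_t addr,
			      const struct text_range *irqentry,
			      size_t nr_ranges)
{
	size_t i;

	for (i = 0; i < nr_ranges; i++) {
		if (in_range(&irqentry[i], addr))
			return false;
	}
	return true;
}
```

The ranges are half-open so that the end address of a section, which is the first byte after it, is not treated as part of the section.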
    • irq: Make the irqentry text section unconditional · 229a7186
      Masami Hiramatsu authored
      Generate irqentry and softirqentry text sections without
      any Kconfig dependencies. This will add extra sections, but
      there should be no performance impact.
      Suggested-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: David S . Miller <davem@davemloft.net>
      Cc: Francis Deslauriers <francis.deslauriers@efficios.com>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: linux-arch@vger.kernel.org
      Cc: linux-cris-kernel@axis.com
      Cc: mathieu.desnoyers@efficios.com
      Link: http://lkml.kernel.org/r/150172789110.27216.3955739126693102122.stgit@devbox
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      229a7186
    • cris: Mark _stext and _end as char-arrays, not single char variables · c2579fee
      Masami Hiramatsu authored
      Mark _stext and _end as character arrays instead of single
      character variables, like include/asm-generic/sections.h does.
      Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: David S . Miller <davem@davemloft.net>
      Cc: Francis Deslauriers <francis.deslauriers@efficios.com>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: linux-arch@vger.kernel.org
      Cc: linux-cris-kernel@axis.com
      Cc: mathieu.desnoyers@efficios.com
      Link: http://lkml.kernel.org/r/150172782555.27216.2805751327900543374.stgit@devbox
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      c2579fee
    • xtensa: Mark _stext and _end as char-arrays, not single char variables · 18244362
      Masami Hiramatsu authored
      Mark _stext and _end as character arrays instead of single
      character variables, like include/asm-generic/sections.h does.
      Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: David S . Miller <davem@davemloft.net>
      Cc: Francis Deslauriers <francis.deslauriers@efficios.com>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: linux-arch@vger.kernel.org
      Cc: linux-cris-kernel@axis.com
      Cc: mathieu.desnoyers@efficios.com
      Link: http://lkml.kernel.org/r/150172775958.27216.12951305461398200544.stgit@devbox
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      18244362
    • h8300: Mark _stext and _etext as char-arrays, not single char variables · b4464bf9
      Masami Hiramatsu authored
      Mark _stext and _etext as character arrays instead of
      single character variables, like include/asm-generic/sections.h
      does.
      Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: David S . Miller <davem@davemloft.net>
      Cc: Francis Deslauriers <francis.deslauriers@efficios.com>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: linux-arch@vger.kernel.org
      Cc: linux-cris-kernel@axis.com
      Cc: mathieu.desnoyers@efficios.com
      Link: http://lkml.kernel.org/r/150172769415.27216.12021110228384155707.stgit@devbox
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      b4464bf9
    • perf/core: Reduce context switch overhead · fdccc3fb
      leilei.lin authored
      Skip most of the PMU context switching overhead when ctx->nr_events is 0.
      
      50% performance overhead was observed under an extreme testcase.
      Signed-off-by: leilei.lin <leilei.lin@alibaba-inc.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: acme@kernel.org
      Cc: alexander.shishkin@linux.intel.com
      Cc: eranian@gmail.com
      Cc: jolsa@redhat.com
      Cc: linxiulei@gmail.com
      Cc: yang_oliver@hotmail.com
      Link: http://lkml.kernel.org/r/20170809002921.69813-1-leilei.lin@alibaba-inc.com
      [ Rewrote the changelog. ]
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      fdccc3fb
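The optimization in the commit above is an early return: when a context has no events, the costly part of the switch is skipped. The sketch below is a toy user-space model of that pattern, not the kernel code. `struct pmu_ctx_model`, its `expensive_switches` counter, and `ctx_switch_model` are hypothetical stand-ins for the real context and its locking and PMU programming.

```c
/* Toy model of a perf event context; only nr_events mirrors the commit text. */
struct pmu_ctx_model {
	int nr_events;                    /* number of events attached to the context */
	unsigned long expensive_switches; /* counts how often the slow path ran */
};

/*
 * Model of a context switch hook: bail out before the expensive work
 * when there is nothing to schedule.
 */
static void ctx_switch_model(struct pmu_ctx_model *ctx)
{
	if (ctx->nr_events == 0)
		return; /* fast path: no events, so no PMU state to touch */

	/* Stand-in for taking locks and reprogramming the PMU. */
	ctx->expensive_switches++;
}
```

In this model a context with no events can go through any number of switches without ever incrementing `expensive_switches`, which is the effect the commit aims for on the real slow path.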
    • perf/x86/amd/uncore: Get correct number of cores sharing last level cache · ab027620
      Janakarajan Natarajan authored
      In Family 17h, the number of cores sharing a cache level is obtained
      from the Cache Properties CPUID leaf (0x8000001d) by passing in the
      cache level in ECX. In prior families, a cache level of 2 was used to
      determine this information.
      
      To get the right information, irrespective of Family, iterate over
      the cache levels using CPUID 0x8000001d. The last level cache is the
      last value to return a non-zero value in EAX.
      Signed-off-by: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: Borislav Petkov <bp@suse.de>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/5ab569025b39cdfaeca55b571d78c0fc800bdb69.1497452002.git.Janakarajan.Natarajan@amd.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      ab027620
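The iteration described in the commit above can be sketched as a loop over subleaves of CPUID 0x8000001d that remembers the last EAX value describing a cache. This is a user-space model under stated assumptions, not the uncore driver code: the CPUID query is abstracted behind a hypothetical callback so the logic can run anywhere, `fake_cpuid_count` returns made-up values, and the bit layout used (EAX[4:0] as cache type with 0 meaning "no more caches", EAX[25:14] as the number of sharers minus one) is my reading of AMD's documentation for this leaf.

```c
#include <stdint.h>

#define CACHE_PROPERTIES_LEAF 0x8000001dU
#define MAX_SUBLEAVES 16U /* arbitrary safety bound for this sketch */

/* Hypothetical callback: fills regs[] with EAX, EBX, ECX, EDX. */
typedef void (*cpuid_count_fn)(uint32_t leaf, uint32_t subleaf, uint32_t regs[4]);

/*
 * Walk the cache levels and return how many logical processors share the
 * last one reported, or 0 if the leaf describes no cache at all.
 */
static unsigned int sharers_of_last_level_cache(cpuid_count_fn query)
{
	uint32_t regs[4];
	uint32_t last_eax = 0;
	uint32_t subleaf;

	for (subleaf = 0; subleaf < MAX_SUBLEAVES; subleaf++) {
		query(CACHE_PROPERTIES_LEAF, subleaf, regs);

		/* Cache type 0 marks the end of the list. */
		if ((regs[0] & 0x1f) == 0)
			break;

		last_eax = regs[0];
	}

	if (last_eax == 0)
		return 0;

	/* The field stores the count minus one. */
	return ((last_eax >> 14) & 0xfff) + 1;
}

/* Made-up topology: L1D, L1I, L2 shared by 2 threads; L3 shared by 8. */
static void fake_cpuid_count(uint32_t leaf, uint32_t subleaf, uint32_t regs[4])
{
	static const uint32_t eax_by_subleaf[] = {
		(1u << 14) | (1u << 5) | 1u, /* level 1, data cache */
		(1u << 14) | (1u << 5) | 2u, /* level 1, instruction cache */
		(1u << 14) | (2u << 5) | 3u, /* level 2, unified */
		(7u << 14) | (3u << 5) | 3u, /* level 3, unified */
	};

	regs[0] = regs[1] = regs[2] = regs[3] = 0;
	if (leaf == CACHE_PROPERTIES_LEAF && subleaf < 4)
		regs[0] = eax_by_subleaf[subleaf];
}
```

With the fake topology, `sharers_of_last_level_cache(fake_cpuid_count)` reports the L3 sharer count without hard-coding which level is last, which is the point of iterating instead of assuming level 2.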
    • perf/x86/amd/uncore: Rename cpufeatures macro for cache counters · 910448bb
      Janakarajan Natarajan authored
      In Family 17h, L3 is the last level cache as opposed to L2 in previous
      families. Avoid this name confusion and rename X86_FEATURE_PERFCTR_L2 to
      X86_FEATURE_PERFCTR_LLC to indicate the performance counter on the last
      level of cache.
      Signed-off-by: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: Borislav Petkov <bp@suse.de>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/016311029fdecdc3fdc13b7ed865c6cbf48b2f15.1497452002.git.Janakarajan.Natarajan@amd.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      910448bb
    • I
      1ccb2f4e
    • perf/core: Fix time on IOC_ENABLE · 9b231d9f
      Peter Zijlstra authored
      Vince reported that when we do IOC_ENABLE/IOC_DISABLE while the task
      is in SIGSTOP'ed state the timestamps go wobbly.
      
      It turns out we indeed fail to correctly account time while in 'OFF'
      state and doing IOC_ENABLE without getting scheduled in exposes the
      problem.
      
      Further thinking about this problem, it occurred to me that we can
      suffer a similar fate when we migrate an uncore event between CPUs.
      The perf_event_install() on the 'new' CPU will do add_event_to_ctx()
      which will reset all the time stamps, resulting in a subsequent
      update_event_times() to overwrite the total_time_* fields with smaller
      values.
      Reported-by: Vince Weaver <vincent.weaver@maine.edu>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      9b231d9f
    • perf/x86: Fix RDPMC vs. mm_struct tracking · bfe33492
      Peter Zijlstra authored
      Vince reported the following rdpmc() testcase failure:
      
       > Failing test case:
       >
       >	fd=perf_event_open();
       >	addr=mmap(fd);
       >	exec()  // without closing or unmapping the event
       >	fd=perf_event_open();
       >	addr=mmap(fd);
       >	rdpmc()	// GPFs due to rdpmc being disabled
      
      The problem is of course that exec() plays tricks with what is
      current->mm, only destroying the old mappings after having
      installed the new mm.
      
      Fix this confusion by passing along vma->vm_mm instead of relying on
      current->mm.
      Reported-by: Vince Weaver <vincent.weaver@maine.edu>
      Tested-by: Vince Weaver <vincent.weaver@maine.edu>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: Andy Lutomirski <luto@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Fixes: 1e0fb9ec ("perf: Add pmu callbacks to track event mapping and unmapping")
      Link: http://lkml.kernel.org/r/20170802173930.cstykcqefmqt7jau@hirez.programming.kicks-ass.net
      [ Minor cleanups. ]
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      bfe33492
    • Merge tag 'pinctrl-v4.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl · 8d31f80e
      Linus Torvalds authored
      Pull pin control fixes from Linus Walleij:
       "These are the pin control fixes I have gathered since the return from
        my vacation. They boiled in -next a while so let's get them in.
      
        Apart from the documentation build it is purely driver fixes. Which is
        nice. The Intel fixes seem kind of important.
      
         - Fix the documentation build as the docs were moved
      
         - Correct the UART pin list on the Intel Merrifield
      
         - Fix pin assignment and number of pins on the Marvell Armada 37xx
           pin controller
      
         - Cover the Setzer models in the Chromebook DMI quirk in the Intel
           cherryview driver so they start working
      
         - Add the missing "sim" function to the sunxi driver
      
         - Fix USB pin definitions on Uniphier Pro4
      
         - Smatch fix for invalid reference in the zx pin control driver"
      
      * tag 'pinctrl-v4.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
        pinctrl: generic: update references to Documentation/pinctrl.txt
        pinctrl: intel: merrifield: Correct UART pin lists
        pinctrl: armada-37xx: Fix number of pin in south bridge
        pinctrl: armada-37xx: Fix the pin 23 on south bridge
        pinctrl: cherryview: Add Setzer models to the Chromebook DMI quirk
        pinctrl: sunxi: add a missing function of A10/A20 pinctrl driver
        pinctrl: uniphier: fix USB3 pin assignment for Pro4
        pinctrl: zte: fix dereference of 'data' in zx_set_mux()
      8d31f80e
    • futex: Remove unnecessary warning from get_futex_key · 48fb6f4d
      Mel Gorman authored
      Commit 65d8fc77 ("futex: Remove requirement for lock_page() in
      get_futex_key()") removed an unnecessary lock_page() with the
      side-effect that page->mapping needed to be treated very carefully.
      
      Two defensive warnings were added in case any assumption was missed and
      the first warning assumed a correct application would not alter a
      mapping backing a futex key.  Since merging, it has not triggered for
      any unexpected case but Mark Rutland reported the following bug
      triggering due to the first warning.
      
        kernel BUG at kernel/futex.c:679!
        Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
        Modules linked in:
        CPU: 0 PID: 3695 Comm: syz-executor1 Not tainted 4.13.0-rc3-00020-g307fec773ba3 #3
        Hardware name: linux,dummy-virt (DT)
        task: ffff80001e271780 task.stack: ffff000010908000
        PC is at get_futex_key+0x6a4/0xcf0 kernel/futex.c:679
        LR is at get_futex_key+0x6a4/0xcf0 kernel/futex.c:679
        pc : [<ffff00000821ac14>] lr : [<ffff00000821ac14>] pstate: 80000145
      
      The fact that it's a bug instead of a warning was due to an unrelated
      arm64 problem, but the warning itself triggered because the underlying
      mapping changed.
      
      This is an application issue but from a kernel perspective it's a
      recoverable situation and the warning is unnecessary so this patch
      removes the warning.  The warning may potentially be triggered with the
      following test program from Mark although it may be necessary to adjust
      NR_FUTEX_THREADS to be a value smaller than the number of CPUs in the
      system.
      
          #include <linux/futex.h>
          #include <pthread.h>
          #include <stdio.h>
          #include <stdlib.h>
          #include <sys/mman.h>
          #include <sys/syscall.h>
          #include <sys/time.h>
          #include <unistd.h>
      
          #define NR_FUTEX_THREADS 16
          pthread_t threads[NR_FUTEX_THREADS];
      
          void *mem;
      
          #define MEM_PROT  (PROT_READ | PROT_WRITE)
          #define MEM_SIZE  65536
      
          static int futex_wrapper(int *uaddr, int op, int val,
                                   const struct timespec *timeout,
                                   int *uaddr2, int val3)
          {
              return syscall(SYS_futex, uaddr, op, val, timeout, uaddr2, val3);
          }
      
          void *poll_futex(void *unused)
          {
              for (;;) {
                  futex_wrapper(mem, FUTEX_CMP_REQUEUE_PI, 1, NULL, mem + 4, 1);
              }
          }
      
          int main(int argc, char *argv[])
          {
              int i;
      
              mem = mmap(NULL, MEM_SIZE, MEM_PROT,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
      
              printf("Mapping @ %p\n", mem);
      
              printf("Creating futex threads...\n");
      
              for (i = 0; i < NR_FUTEX_THREADS; i++)
                  pthread_create(&threads[i], NULL, poll_futex, NULL);
      
              printf("Flipping mapping...\n");
              for (;;) {
                  mmap(mem, MEM_SIZE, MEM_PROT,
                       MAP_FIXED | MAP_SHARED | MAP_ANONYMOUS, -1, 0);
              }
      
              return 0;
          }
      Reported-and-tested-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Mel Gorman <mgorman@suse.de>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: stable@vger.kernel.org # 4.7+
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      48fb6f4d
    • Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · 358f8c26
      Linus Torvalds authored
      Pull i2c fixes from Wolfram Sang:
       "The main thing is to allow empty id_tables for ACPI to make some
        drivers get probed again. It looks a bit bigger than usual because it
        needs some internal renaming, too.
      
        Other than that, there is a fix for broken DSDTs, a super simple
        enablement for ARM MPS, and two documentation fixes which I'd like to
        see in v4.13 already"
      
      * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
        i2c: rephrase explanation of I2C_CLASS_DEPRECATED
        i2c: allow i2c-versatile for ARM MPS platforms
        i2c: designware: Some broken DSTDs use 1MiHz instead of 1MHz
        i2c: designware: Print clock freq on invalid clock freq error
        i2c: core: Allow empty id_table in ACPI case as well
        i2c: mux: pinctrl: mention correct module name in Kconfig help text
      358f8c26
    • Merge branch 'for-linus' of git://git.kernel.dk/linux-block · 31cf92f3
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
       "Three patches that should go into this release.
      
        Two of them are from Paolo and fix up some corner cases with BFQ, and
        the last patch is from Ming and fixes up a potential usage count
        imbalance regression due to the recent NOWAIT work"
      
      * 'for-linus' of git://git.kernel.dk/linux-block:
        blk-mq: don't leak preempt counter/q_usage_counter when allocating rq failed
        block, bfq: consider also in_service_entity to state whether an entity is active
        block, bfq: reset in_service_entity if it becomes idle
      31cf92f3
    • Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · d555eb6b
      Linus Torvalds authored
      Pull crypto fixes from Herbert Xu:
       "Fix two regressions in the inside-secure driver with respect to
        hmac(sha1)"
      
      * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
        crypto: inside-secure - fix the sha state length in hmac_sha1_setkey
        crypto: inside-secure - fix invalidation check in hmac_sha1_setkey
      d555eb6b
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 4530cca1
      Linus Torvalds authored
      Pull networking fixes from David Miller:
       "The pull requests are getting smaller, that's progress I suppose :-)
      
         1) Fix infinite loop in CIPSO option parsing, from Yujuan Qi.
      
         2) Fix remote checksum handling in VXLAN and GUE tunneling drivers,
            from Koichiro Den.
      
         3) Missing u64_stats_init() calls in several drivers, from Florian
            Fainelli.
      
         4) TCP can set the congestion window to an invalid ssthresh value
            after congestion window reductions, from Yuchung Cheng.
      
         5) Fix BPF jit branch generation on s390, from Daniel Borkmann.
      
         6) Correct MIPS ebpf JIT merge, from David Daney.
      
         7) Correct byte order test in BPF test_verifier.c, from Daniel
            Borkmann.
      
         8) Fix various crashes and leaks in ASIX driver, from Dean Jenkins.
      
         9) Handle SCTP checksums properly in mlx4 driver, from Davide
            Caratti.
      
        10) We can potentially enter tcp_connect() with a cached route
            already, due to fastopen, so we have to explicitly invalidate it.
      
        11) skb_warn_bad_offload() can bark in legitimate situations, fix from
            Willem de Bruijn"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (52 commits)
        net: avoid skb_warn_bad_offload false positives on UFO
        qmi_wwan: fix NULL deref on disconnect
        ppp: fix xmit recursion detection on ppp channels
        rds: Reintroduce statistics counting
        tcp: fastopen: tcp_connect() must refresh the route
        net: sched: set xt_tgchk_param par.net properly in ipt_init_target
        net: dsa: mediatek: add adjust link support for user ports
        net/mlx4_en: don't set CHECKSUM_COMPLETE on SCTP packets
        qed: Fix a memory allocation failure test in 'qed_mcp_cmd_init()'
        hysdn: fix to a race condition in put_log_buffer
        s390/qeth: fix L3 next-hop in xmit qeth hdr
        asix: Fix small memory leak in ax88772_unbind()
        asix: Ensure asix_rx_fixup_info members are all reset
        asix: Add rx->ax_skb = NULL after usbnet_skb_return()
        bpf: fix selftest/bpf/test_pkt_md_access on s390x
        netvsc: fix race on sub channel creation
        bpf: fix byte order test in test_verifier
        xgene: Always get clk source, but ignore if it's missing for SGMII ports
        MIPS: Add missing file for eBPF JIT.
        bpf, s390: fix build for libbpf and selftest suite
        ...
      4530cca1
  4. 09 Aug, 2017 10 commits
    • net: avoid skb_warn_bad_offload false positives on UFO · 8d63bee6
      Willem de Bruijn authored
      skb_warn_bad_offload triggers a warning when an skb enters the GSO
      stack at __skb_gso_segment that does not have CHECKSUM_PARTIAL
      checksum offload set.
      
      Commit b2504a5d ("net: reduce skb_warn_bad_offload() noise")
      observed that SKB_GSO_DODGY producers can trigger the check and
      that passing those packets through the GSO handlers will fix it
      up. But, the software UFO handler will set ip_summed to
      CHECKSUM_NONE.
      
      When __skb_gso_segment is called from the receive path, this
      triggers the warning again.
      
      Make UFO set CHECKSUM_UNNECESSARY instead of CHECKSUM_NONE. On
      Tx these two are equivalent. On Rx, this better matches the
      skb state (checksum computed), as CHECKSUM_NONE here means no
      checksum computed.
      
      See also this thread for context:
      http://patchwork.ozlabs.org/patch/799015/
      
      Fixes: b2504a5d ("net: reduce skb_warn_bad_offload() noise")
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8d63bee6
    • qmi_wwan: fix NULL deref on disconnect · bbae08e5
      Bjørn Mork authored
      qmi_wwan_disconnect is called twice when disconnecting devices with
      separate control and data interfaces.  The first invocation will set
      the interface data to NULL for both interfaces to flag that the
      disconnect has been handled.  But the matching NULL check was left
      out when qmi_wwan_disconnect was added, resulting in this oops:
      
        usb 2-1.4: USB disconnect, device number 4
        qmi_wwan 2-1.4:1.6 wwp0s29u1u4i6: unregister 'qmi_wwan' usb-0000:00:1d.0-1.4, WWAN/QMI device
        BUG: unable to handle kernel NULL pointer dereference at 00000000000000e0
        IP: qmi_wwan_disconnect+0x25/0xc0 [qmi_wwan]
        PGD 0
        P4D 0
        Oops: 0000 [#1] SMP
        Modules linked in: <stripped irrelevant module list>
        CPU: 2 PID: 33 Comm: kworker/2:1 Tainted: G            E   4.12.3-nr44-normandy-r1500619820+ #1
        Hardware name: LENOVO 4291LR7/4291LR7, BIOS CBET4000 4.6-810-g50522254fb 07/21/2017
        Workqueue: usb_hub_wq hub_event [usbcore]
        task: ffff8c882b716040 task.stack: ffffb8e800d84000
        RIP: 0010:qmi_wwan_disconnect+0x25/0xc0 [qmi_wwan]
        RSP: 0018:ffffb8e800d87b38 EFLAGS: 00010246
        RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
        RDX: 0000000000000001 RSI: ffff8c8824f3f1d0 RDI: ffff8c8824ef6400
        RBP: ffff8c8824ef6400 R08: 0000000000000000 R09: 0000000000000000
        R10: ffffb8e800d87780 R11: 0000000000000011 R12: ffffffffc07ea0e8
        R13: ffff8c8824e2e000 R14: ffff8c8824e2e098 R15: 0000000000000000
        FS:  0000000000000000(0000) GS:ffff8c8835300000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 00000000000000e0 CR3: 0000000229ca5000 CR4: 00000000000406e0
        Call Trace:
         ? usb_unbind_interface+0x71/0x270 [usbcore]
         ? device_release_driver_internal+0x154/0x210
         ? qmi_wwan_unbind+0x6d/0xc0 [qmi_wwan]
         ? usbnet_disconnect+0x6c/0xf0 [usbnet]
         ? qmi_wwan_disconnect+0x87/0xc0 [qmi_wwan]
         ? usb_unbind_interface+0x71/0x270 [usbcore]
         ? device_release_driver_internal+0x154/0x210
      Reported-and-tested-by: Nathaniel Roach <nroach44@gmail.com>
      Fixes: c6adf779 ("net: usb: qmi_wwan: add qmap mux protocol support")
      Cc: Daniele Palmas <dnlplm@gmail.com>
      Signed-off-by: Bjørn Mork <bjorn@mork.no>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      bbae08e5
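The double-disconnect pattern described in the commit above can be sketched as follows. This is a simplified model, not the driver's code: `demo_dev`, `demo_intf`, and `demo_disconnect()` are hypothetical names standing in for the driver state, the USB interface with its interface data, and the disconnect handler.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver state and a USB interface. */
struct demo_dev  { int unregister_calls; };
struct demo_intf { struct demo_dev *data; };

/*
 * Called once per interface. Returns 1 when this call performed the
 * teardown and 0 when an earlier call had already handled it.
 */
int demo_disconnect(struct demo_intf *self, struct demo_intf *sibling)
{
    assert(self != NULL && sibling != NULL);

    struct demo_dev *dev = self->data;

    /* The guard: NULL interface data means the disconnect was handled. */
    if (dev == NULL)
        return 0;

    /* Without the guard, the second call would dereference NULL here. */
    dev->unregister_calls++;

    /* Flag both interfaces as handled. */
    self->data = NULL;
    sibling->data = NULL;
    return 1;
}
```

Calling `demo_disconnect()` first for the control interface and then for the data interface performs the teardown once; the second call returns early instead of dereferencing a NULL pointer.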
    • G
      ppp: fix xmit recursion detection on ppp channels · 0a0e1a85
      Guillaume Nault authored
      Commit e5dadc65 ("ppp: Fix false xmit recursion detect with two ppp
      devices") dropped the xmit_recursion counter incrementation in
      ppp_channel_push() and relied on ppp_xmit_process() for this task.
      But __ppp_channel_push() can also send packets directly (using the
      .start_xmit() channel callback), in which case the xmit_recursion
      counter isn't incremented anymore. If such packets get routed back to
      the parent ppp unit, ppp_xmit_process() won't notice the recursion and
      will call ppp_channel_push() on the same channel, effectively creating
      the deadlock situation that the xmit_recursion mechanism was supposed
      to prevent.
      
      This patch re-introduces the xmit_recursion counter incrementation in
      ppp_channel_push(). Since the xmit_recursion variable is now part of
      the parent ppp unit, incrementation is skipped if the channel doesn't
      have any. This is fine because only packets routed through the parent
      unit may enter the channel recursively.
      
      Finally, we have to ensure that pch->ppp is not going to be modified
      while executing ppp_channel_push(). Instead of taking this lock only
      while calling ppp_xmit_process(), we now have to hold it for the full
      ppp_channel_push() execution. This respects the ppp locks ordering
      which requires locking ->upl before ->downl.
      
      Fixes: e5dadc65 ("ppp: Fix false xmit recursion detect with two ppp devices")
      Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0a0e1a85
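A toy model of the recursion guard described in the commit above, assuming a single-threaded setting with no locking. All `demo_` names are hypothetical; the real driver uses `ppp_channel_push()`, `ppp_xmit_process()`, and the channel's `.start_xmit()` callback, and also takes the locks this sketch omits.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical parent unit: owns the recursion counter. */
struct demo_unit {
    int xmit_recursion;
    int dropped;
};

/* Hypothetical channel: may or may not have a parent unit. */
struct demo_channel {
    struct demo_unit *ppp; /* parent unit, NULL if none */
    int loops_back;        /* nonzero: sent packets route back to the unit */
    int sent;
};

void demo_xmit_process(struct demo_unit *ppp, struct demo_channel *pch);

/* Models the channel's direct send, which may re-enter the parent unit. */
static void demo_start_xmit(struct demo_channel *pch)
{
    pch->sent++;
    if (pch->loops_back && pch->ppp != NULL)
        demo_xmit_process(pch->ppp, pch);
}

/* Unit-level transmit: refuses to recurse while a transmit is active. */
void demo_xmit_process(struct demo_unit *ppp, struct demo_channel *pch)
{
    assert(ppp != NULL && pch != NULL);
    if (ppp->xmit_recursion > 0) {
        ppp->dropped++; /* recursion detected: drop instead of deadlocking */
        return;
    }
    ppp->xmit_recursion++;
    demo_start_xmit(pch);
    ppp->xmit_recursion--;
}

/* Channel-level push: bumps the parent's counter, if there is a parent. */
void demo_channel_push(struct demo_channel *pch)
{
    assert(pch != NULL);
    struct demo_unit *ppp = pch->ppp;

    if (ppp != NULL)
        ppp->xmit_recursion++;
    demo_start_xmit(pch);
    if (ppp != NULL)
        ppp->xmit_recursion--;
}
```

Because `demo_channel_push()` increments the parent's counter before sending, a packet that loops back into `demo_xmit_process()` is detected and dropped. If the two counter updates in `demo_channel_push()` were removed, the same looping channel would recurse without bound in this model.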
    • H
      rds: Reintroduce statistics counting · 05bfd7db
      Håkon Bugge authored
      In commit 7e3f2952 ("rds: don't let RDS shutdown a connection
      while senders are present"), refilling the receive queue was removed
      from rds_ib_recv(), along with the increment of
      s_ib_rx_refill_from_thread.
      
      Commit 73ce4317 ("RDS: make sure we post recv buffers")
      re-introduces filling the receive queue from rds_ib_recv(), but does
      not add the statistics counter. rds_ib_recv() was later renamed to
      rds_ib_recv_path().
      
      This commit reintroduces the statistics counting of
      s_ib_rx_refill_from_thread and s_ib_rx_refill_from_cq.
      Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
      Reviewed-by: Knut Omang <knut.omang@oracle.com>
      Reviewed-by: Wei Lin Guay <wei.lin.guay@oracle.com>
      Reviewed-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
      Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      05bfd7db
    • E
      tcp: fastopen: tcp_connect() must refresh the route · 8ba60924
      Eric Dumazet authored
      With the new TCP_FASTOPEN_CONNECT socket option, tcp_connect()
      can be called while the socket's sk_dst_cache is either NULL
      or invalid.
      
       +0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 4
       +0 fcntl(4, F_SETFL, O_RDWR|O_NONBLOCK) = 0
       +0 setsockopt(4, SOL_TCP, TCP_FASTOPEN_CONNECT, [1], 4) = 0
       +0 connect(4, ..., ...) = 0
      
      << sk->sk_dst_cache becomes obsolete, or even set to NULL >>
      
       +1 sendto(4, ..., 1000, MSG_FASTOPEN, ..., ...) = 1000
      
      We need to refresh the route otherwise bad things can happen,
      especially when syzkaller is running on the host :/
      
      Fixes: 19f6d3f3 ("net/tcp-fastopen: Add new API support")
      Reported-by: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Wei Wang <weiwan@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Acked-by: Wei Wang <weiwan@google.com>
      Acked-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8ba60924
    • X
      net: sched: set xt_tgchk_param par.net properly in ipt_init_target · ec0acb09
      Xin Long authored
      Now that xt_tgchk_param par in ipt_init_target is a local variable,
      par.net is not initialized there. Later, when xt_check_target
      calls the target's checkentry, which may access par.net, it
      can cause a kernel panic.
      
      Jaroslav found this panic when running:
      
        # ip link add TestIface type dummy
        # tc qd add dev TestIface ingress handle ffff:
        # tc filter add dev TestIface parent ffff: u32 match u32 0 0 \
          action xt -j CONNMARK --set-mark 4
      
      This patch passes the net parameter into ipt_init_target and
      sets par.net from it there.
      
      v1->v2:
        As Wang Cong pointed out, I missed ipt_net_id != xt_net_id, so fix
        it by also passing net_id to __tcf_ipt_init.
      v2->v3:
        Missed the fixes tag, so add it.
      
      Fixes: ecb2421b ("netfilter: add and use nf_ct_netns_get/put")
      Reported-by: Jaroslav Aster <jaster@redhat.com>
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ec0acb09
    • J
      net: dsa: mediatek: add adjust link support for user ports · 8e6f1521
      John Crispin authored
      Manually adjust the port settings of user ports once PHY polling has
      completed. This patch extends the adjust_link callback to configure the
      per port PMCR register, applying the proper values polled from the PHY.
      Without this patch, flow control was not always set up properly.
      Signed-off-by: Shashidhar Lakkavalli <shashidhar.lakkavalli@openmesh.com>
      Signed-off-by: Muciri Gatimu <muciri@openmesh.com>
      Signed-off-by: John Crispin <john@phrozen.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8e6f1521
    • D
      net/mlx4_en: don't set CHECKSUM_COMPLETE on SCTP packets · e718fe45
      Davide Caratti authored
      If the NIC fails to validate the checksum on TCP/UDP, and validation of IP
      checksum is successful, the driver subtracts the pseudo-header checksum
      from the value obtained by the hardware and sets CHECKSUM_COMPLETE. Don't
      do that if protocol is IPPROTO_SCTP, otherwise CRC32c validation fails.
      
      V2: don't test MLX4_CQE_STATUS_IPV6 if MLX4_CQE_STATUS_IPV4 is set
      Reported-by: Shuang Li <shuali@redhat.com>
      Fixes: f8c6455b ("net/mlx4_en: Extend checksum offloading by CHECKSUM COMPLETE")
      Signed-off-by: Davide Caratti <dcaratti@redhat.com>
      Acked-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e718fe45
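The decision described in the commit above can be sketched as a small predicate. This is an illustration under stated assumptions, not the mlx4 driver's code: `demo_use_csum_complete()` and its parameters are hypothetical, and the flags stand in for the hardware completion status bits. The protocol numbers are the IANA-assigned values. SCTP is excluded because it is protected by a CRC32c rather than the Internet ones' complement checksum, so a hardware checksum value cannot validate it.

```c
#include <assert.h>
#include <stdbool.h>

/* IANA-assigned IP protocol numbers, under hypothetical demo names. */
enum {
    DEMO_PROTO_TCP  = 6,
    DEMO_PROTO_UDP  = 17,
    DEMO_PROTO_SCTP = 132
};

/*
 * Returns true when the fallback path (adjust the hardware checksum and
 * report it as a complete checksum) may be used for this packet.
 */
bool demo_use_csum_complete(bool hw_l4_ok, bool hw_ip_ok, int l4_proto)
{
    assert(l4_proto >= 0 && l4_proto <= 255);

    if (hw_l4_ok)
        return false; /* hardware already validated L4: no fallback needed */
    if (!hw_ip_ok)
        return false; /* IP header checksum failed: no fallback */

    /* SCTP uses CRC32c, so the hardware checksum value is meaningless. */
    return l4_proto != DEMO_PROTO_SCTP;
}
```

With these assumptions, a TCP packet whose L4 validation failed but whose IP checksum passed takes the fallback path, while an SCTP packet in the same situation does not.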
    • L
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma · bfa738cf
      Linus Torvalds authored
      Pull rdma fixes from Doug Ledford:
       "Third set of -rc fixes for 4.13 cycle
      
         - small set of miscellaneous fixes
      
         - a reasonably sizable set of IPoIB fixes that deal with multiple
           long standing issues"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
        IB/hns: checking for IS_ERR() instead of NULL
        RDMA/mlx5: Fix existence check for extended address vector
        IB/uverbs: Fix device cleanup
        RDMA/uverbs: Prevent leak of reserved field
        IB/core: Fix race condition in resolving IP to MAC
        IB/ipoib: Notify on modify QP failure only when relevant
        Revert "IB/core: Allow QP state transition from reset to error"
        IB/ipoib: Remove double pointer assigning
        IB/ipoib: Clean error paths in add port
        IB/ipoib: Add get statistics support to SRIOV VF
        IB/ipoib: Add multicast packets statistics
        IB/ipoib: Set IPOIB_NEIGH_TBL_FLUSH after flushed completion initialization
        IB/ipoib: Prevent setting negative values to max_nonsrq_conn_qp
        IB/ipoib: Make sure no in-flight joins while leaving that mcast
        IB/ipoib: Use cancel_delayed_work_sync when needed
        IB/ipoib: Fix race between light events and interface restart
      bfa738cf
    • J
      parse-maintainers: Move matching sections from MAINTAINERS · b95c29a2
      Joe Perches authored
      Allow any number of command line arguments to match either the
      section header or the section contents and create new files.
      
      Create MAINTAINERS.new and SECTION.new.
      
      This allows scripting of the movement of various sections from
      MAINTAINERS.
      Signed-off-by: Joe Perches <joe@perches.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      b95c29a2