1. 13 3月, 2018 4 次提交
    • T
      tracing: Unify the "boot" and "mono" tracing clocks · 92af4dcb
      Thomas Gleixner 提交于
      Unify the "boot" and "mono" tracing clocks and document the new behaviour.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Kevin Easton <kevin@guarana.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mark Salyzyn <salyzyn@android.com>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/20180301165150.489635255@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      92af4dcb
    • T
      hrtimer: Unify MONOTONIC and BOOTTIME clock behavior · 127bfa5f
      Thomas Gleixner 提交于
      Now that th MONOTONIC and BOOTTIME clocks are indentical remove all the special
      casing.
      
      The user space visible interfaces still support both clocks, but their behavior
      is identical.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Kevin Easton <kevin@guarana.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mark Salyzyn <salyzyn@android.com>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/20180301165150.410218515@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      127bfa5f
    • T
      timekeeping: Remove boot time specific code · d6c7270e
      Thomas Gleixner 提交于
      Now that the MONOTONIC and BOOTTIME clocks are the same, remove all the
      special handling from timekeeping. Keep wrappers for the existing users of
      the *boot* timekeeper interfaces.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Kevin Easton <kevin@guarana.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mark Salyzyn <salyzyn@android.com>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/20180301165150.236279497@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      d6c7270e
    • T
      timekeeping: Add the new CLOCK_MONOTONIC_ACTIVE clock · 72199320
      Thomas Gleixner 提交于
      The planned change to unify the behaviour of the MONOTONIC and BOOTTIME
      clocks vs. suspend removes the ability to retrieve the active
      non-suspended time of a system.
      
      Provide a new CLOCK_MONOTONIC_ACTIVE clock which returns the active
      non-suspended time of the system via clock_gettime().
      
      This preserves the old behaviour of CLOCK_MONOTONIC before the
      BOOTTIME/MONOTONIC unification.
      
      This new clock also allows applications to detect programmatically that
      the MONOTONIC and BOOTTIME clocks are identical.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Kevin Easton <kevin@guarana.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mark Salyzyn <salyzyn@android.com>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/20180301165149.965235774@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      72199320
  2. 10 3月, 2018 1 次提交
    • M
      timekeeping/ntp: Determine the multiplier directly from NTP tick length · 78b98e3c
      Miroslav Lichvar 提交于
      When the length of the NTP tick changes significantly, e.g. when an
      NTP/PTP application is correcting the initial offset of the clock, a
      large value may accumulate in the NTP error before the multiplier
      converges to the correct value. It may then take a very long time (hours
      or even days) before the error is corrected. This causes the clock to
      have an unstable frequency offset, which has a negative impact on the
      stability of synchronization with precise time sources (e.g. NTP/PTP
      using hardware timestamping or the PTP KVM clock).
      
      Use division to determine the correct multiplier directly from the NTP
      tick length and replace the iterative approach. This removes the last
      major source of the NTP error. The only remaining source is now limited
      resolution of the multiplier, which is corrected by adding 1 to the
      multiplier when the system clock is behind the NTP time.
      Signed-off-by: NMiroslav Lichvar <mlichvar@redhat.com>
      Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Stephen Boyd <stephen.boyd@linaro.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1520620971-9567-3-git-send-email-john.stultz@linaro.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      78b98e3c
  3. 23 2月, 2018 3 次提交
  4. 22 2月, 2018 6 次提交
    • T
      seccomp, ptrace: switch get_metadata types to arch independent · 2a040f9f
      Tycho Andersen 提交于
      Commit 26500475 ("ptrace, seccomp: add support for retrieving seccomp
      metadata") introduced `struct seccomp_metadata`, which contained unsigned
      longs that should be arch independent. The type of the flags member was
      chosen to match the corresponding argument to seccomp(), and so we need
      something at least as big as unsigned long. My understanding is that __u64
      should fit the bill, so let's switch both types to that.
      
      While this is userspace facing, it was only introduced in 4.16-rc2, and so
      should be safe assuming it goes in before then.
      Reported-by: N"Dmitry V. Levin" <ldv@altlinux.org>
      Signed-off-by: NTycho Andersen <tycho@tycho.ws>
      CC: Kees Cook <keescook@chromium.org>
      CC: Oleg Nesterov <oleg@redhat.com>
      Reviewed-by: N"Dmitry V. Levin" <ldv@altlinux.org>
      Signed-off-by: NKees Cook <keescook@chromium.org>
      2a040f9f
    • A
      bug.h: work around GCC PR82365 in BUG() · 173a3efd
      Arnd Bergmann 提交于
      Looking at functions with large stack frames across all architectures
      led me discovering that BUG() suffers from the same problem as
      fortify_panic(), which I've added a workaround for already.
      
      In short, variables that go out of scope by calling a noreturn function
      or __builtin_unreachable() keep using stack space in functions
      afterwards.
      
      A workaround that was identified is to insert an empty assembler
      statement just before calling the function that doesn't return.  I'm
      adding a macro "barrier_before_unreachable()" to document this, and
      insert calls to that in all instances of BUG() that currently suffer
      from this problem.
      
      The files that saw the largest change from this had these frame sizes
      before, and much less with my patch:
      
        fs/ext4/inode.c:82:1: warning: the frame size of 1672 bytes is larger than 800 bytes [-Wframe-larger-than=]
        fs/ext4/namei.c:434:1: warning: the frame size of 904 bytes is larger than 800 bytes [-Wframe-larger-than=]
        fs/ext4/super.c:2279:1: warning: the frame size of 1160 bytes is larger than 800 bytes [-Wframe-larger-than=]
        fs/ext4/xattr.c:146:1: warning: the frame size of 1168 bytes is larger than 800 bytes [-Wframe-larger-than=]
        fs/f2fs/inode.c:152:1: warning: the frame size of 1424 bytes is larger than 800 bytes [-Wframe-larger-than=]
        net/netfilter/ipvs/ip_vs_core.c:1195:1: warning: the frame size of 1068 bytes is larger than 800 bytes [-Wframe-larger-than=]
        net/netfilter/ipvs/ip_vs_core.c:395:1: warning: the frame size of 1084 bytes is larger than 800 bytes [-Wframe-larger-than=]
        net/netfilter/ipvs/ip_vs_ftp.c:298:1: warning: the frame size of 928 bytes is larger than 800 bytes [-Wframe-larger-than=]
        net/netfilter/ipvs/ip_vs_ftp.c:418:1: warning: the frame size of 908 bytes is larger than 800 bytes [-Wframe-larger-than=]
        net/netfilter/ipvs/ip_vs_lblcr.c:718:1: warning: the frame size of 960 bytes is larger than 800 bytes [-Wframe-larger-than=]
        drivers/net/xen-netback/netback.c:1500:1: warning: the frame size of 1088 bytes is larger than 800 bytes [-Wframe-larger-than=]
      
      In case of ARC and CRIS, it turns out that the BUG() implementation
      actually does return (or at least the compiler thinks it does),
      resulting in lots of warnings about uninitialized variable use and
      leaving noreturn functions, such as:
      
        block/cfq-iosched.c: In function 'cfq_async_queue_prio':
        block/cfq-iosched.c:3804:1: error: control reaches end of non-void function [-Werror=return-type]
        include/linux/dmaengine.h: In function 'dma_maxpq':
        include/linux/dmaengine.h:1123:1: error: control reaches end of non-void function [-Werror=return-type]
      
      This makes them call __builtin_trap() instead, which should normally
      dump the stack and kill the current process, like some of the other
      architectures already do.
      
      I tried adding barrier_before_unreachable() to panic() and
      fortify_panic() as well, but that had very little effect, so I'm not
      submitting that patch.
      
      Vineet said:
      
      : For ARC, it is double win.
      :
      : 1. Fixes 3 -Wreturn-type warnings
      :
      : | ../net/core/ethtool.c:311:1: warning: control reaches end of non-void function
      : [-Wreturn-type]
      : | ../kernel/sched/core.c:3246:1: warning: control reaches end of non-void function
      : [-Wreturn-type]
      : | ../include/linux/sunrpc/svc_xprt.h:180:1: warning: control reaches end of
      : non-void function [-Wreturn-type]
      :
      : 2.  bloat-o-meter reports code size improvements as gcc elides the
      :    generated code for stack return.
      
      Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82365
      Link: http://lkml.kernel.org/r/20171219114112.939391-1-arnd@arndb.deSigned-off-by: NArnd Bergmann <arnd@arndb.de>
      Acked-by: Vineet Gupta <vgupta@synopsys.com>	[arch/arc]
      Tested-by: Vineet Gupta <vgupta@synopsys.com>	[arch/arc]
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Christopher Li <sparse@chrisli.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      173a3efd
    • S
      mm, mlock, vmscan: no more skipping pagevecs · 9c4e6b1a
      Shakeel Butt 提交于
      When a thread mlocks an address space backed either by file pages which
      are currently not present in memory or swapped out anon pages (not in
      swapcache), a new page is allocated and added to the local pagevec
      (lru_add_pvec), I/O is triggered and the thread then sleeps on the page.
      On I/O completion, the thread can wake on a different CPU, the mlock
      syscall will then sets the PageMlocked() bit of the page but will not be
      able to put that page in unevictable LRU as the page is on the pagevec
      of a different CPU.  Even on drain, that page will go to evictable LRU
      because the PageMlocked() bit is not checked on pagevec drain.
      
      The page will eventually go to right LRU on reclaim but the LRU stats
      will remain skewed for a long time.
      
      This patch puts all the pages, even unevictable, to the pagevecs and on
      the drain, the pages will be added on their LRUs correctly by checking
      their evictability.  This resolves the mlocked pages on pagevec of other
      CPUs issue because when those pagevecs will be drained, the mlocked file
      pages will go to unevictable LRU.  Also this makes the race with munlock
      easier to resolve because the pagevec drains happen in LRU lock.
      
      However there is still one place which makes a page evictable and does
      PageLRU check on that page without LRU lock and needs special attention.
      TestClearPageMlocked() and isolate_lru_page() in clear_page_mlock().
      
      	#0: __pagevec_lru_add_fn	#1: clear_page_mlock
      
      	SetPageLRU()			if (!TestClearPageMlocked())
      					  return
      	smp_mb() // <--required
      					// inside does PageLRU
      	if (!PageMlocked())		if (isolate_lru_page())
      	  move to evictable LRU		  putback_lru_page()
      	else
      	  move to unevictable LRU
      
      In '#1', TestClearPageMlocked() provides full memory barrier semantics
      and thus the PageLRU check (inside isolate_lru_page) can not be
      reordered before it.
      
      In '#0', without explicit memory barrier, the PageMlocked() check can be
      reordered before SetPageLRU().  If that happens, '#0' can put a page in
      unevictable LRU and '#1' might have just cleared the Mlocked bit of that
      page but fails to isolate as PageLRU fails as '#0' still hasn't set
      PageLRU bit of that page.  That page will be stranded on the unevictable
      LRU.
      
      There is one (good) side effect though.  Without this patch, the pages
      allocated for System V shared memory segment are added to evictable LRUs
      even after shmctl(SHM_LOCK) on that segment.  This patch will correctly
      put such pages to unevictable LRU.
      
      Link: http://lkml.kernel.org/r/20171121211241.18877-1-shakeelb@google.comSigned-off-by: NShakeel Butt <shakeelb@google.com>
      Acked-by: NVlastimil Babka <vbabka@suse.cz>
      Cc: Jérôme Glisse <jglisse@redhat.com>
      Cc: Huang Ying <ying.huang@intel.com>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Balbir Singh <bsingharora@gmail.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Shaohua Li <shli@fb.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      9c4e6b1a
    • J
      mm: memcontrol: fix NR_WRITEBACK leak in memcg and system stats · c3cc3911
      Johannes Weiner 提交于
      After commit a983b5eb ("mm: memcontrol: fix excessive complexity in
      memory.stat reporting"), we observed slowly upward creeping NR_WRITEBACK
      counts over the course of several days, both the per-memcg stats as well
      as the system counter in e.g.  /proc/meminfo.
      
      The conversion from full per-cpu stat counts to per-cpu cached atomic
      stat counts introduced an irq-unsafe RMW operation into the updates.
      
      Most stat updates come from process context, but one notable exception
      is the NR_WRITEBACK counter.  While writebacks are issued from process
      context, they are retired from (soft)irq context.
      
      When writeback completions interrupt the RMW counter updates of new
      writebacks being issued, the decs from the completions are lost.
      
      Since the global updates are routed through the joint lruvec API, both
      the memcg counters as well as the system counters are affected.
      
      This patch makes the joint stat and event API irq safe.
      
      Link: http://lkml.kernel.org/r/20180203082353.17284-1-hannes@cmpxchg.org
      Fixes: a983b5eb ("mm: memcontrol: fix excessive complexity in memory.stat reporting")
      Signed-off-by: NJohannes Weiner <hannes@cmpxchg.org>
      Debugged-by: NTejun Heo <tj@kernel.org>
      Reviewed-by: NRik van Riel <riel@surriel.com>
      Reviewed-by: NAndrew Morton <akpm@linux-foundation.org>
      Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c3cc3911
    • A
      Kbuild: always define endianess in kconfig.h · 101110f6
      Arnd Bergmann 提交于
      Build testing with LTO found a couple of files that get compiled
      differently depending on whether asm/byteorder.h gets included early
      enough or not.  In particular, include/asm-generic/qrwlock_types.h is
      affected by this, but there are probably others as well.
      
      The symptom is a series of LTO link time warnings, including these:
      
          net/netlabel/netlabel_unlabeled.h:223: error: type of 'netlbl_unlhsh_add' does not match original declaration [-Werror=lto-type-mismatch]
           int netlbl_unlhsh_add(struct net *net,
          net/netlabel/netlabel_unlabeled.c:377: note: 'netlbl_unlhsh_add' was previously declared here
      
          include/net/ipv6.h:360: error: type of 'ipv6_renew_options_kern' does not match original declaration [-Werror=lto-type-mismatch]
           ipv6_renew_options_kern(struct sock *sk,
          net/ipv6/exthdrs.c:1162: note: 'ipv6_renew_options_kern' was previously declared here
      
          net/core/dev.c:761: note: 'dev_get_by_name_rcu' was previously declared here
           struct net_device *dev_get_by_name_rcu(struct net *net, const char *name)
          net/core/dev.c:761: note: code may be misoptimized unless -fno-strict-aliasing is used
      
          drivers/gpu/drm/i915/i915_drv.h:3377: error: type of 'i915_gem_object_set_to_wc_domain' does not match original declaration [-Werror=lto-type-mismatch]
           i915_gem_object_set_to_wc_domain(struct drm_i915_gem_object *obj, bool write);
          drivers/gpu/drm/i915/i915_gem.c:3639: note: 'i915_gem_object_set_to_wc_domain' was previously declared here
      
          include/linux/debugfs.h:92:9: error: type of 'debugfs_attr_read' does not match original declaration [-Werror=lto-type-mismatch]
           ssize_t debugfs_attr_read(struct file *file, char __user *buf,
          fs/debugfs/file.c:318: note: 'debugfs_attr_read' was previously declared here
      
          include/linux/rwlock_api_smp.h:30: error: type of '_raw_read_unlock' does not match original declaration [-Werror=lto-type-mismatch]
           void __lockfunc _raw_read_unlock(rwlock_t *lock) __releases(lock);
          kernel/locking/spinlock.c:246:26: note: '_raw_read_unlock' was previously declared here
      
          include/linux/fs.h:3308:5: error: type of 'simple_attr_open' does not match original declaration [-Werror=lto-type-mismatch]
           int simple_attr_open(struct inode *inode, struct file *file,
          fs/libfs.c:795: note: 'simple_attr_open' was previously declared here
      
      All of the above are caused by include/asm-generic/qrwlock_types.h
      failing to include asm/byteorder.h after commit e0d02285
      ("locking/qrwlock: Use 'struct qrwlock' instead of 'struct __qrwlock'")
      in linux-4.15.
      
      Similar bugs may or may not exist in older kernels as well, but there is
      no easy way to test those with link-time optimizations, and kernels
      before 4.14 are harder to fix because they don't have Babu's patch
      series
      
      We had similar issues with CONFIG_ symbols in the past and ended up
      always including the configuration headers though linux/kconfig.h.  This
      works around the issue through that same file, defining either
      __BIG_ENDIAN or __LITTLE_ENDIAN depending on CONFIG_CPU_BIG_ENDIAN,
      which is now always set on all architectures since commit 4c97a0c8
      ("arch: define CPU_BIG_ENDIAN for all fixed big endian archs").
      
      Link: http://lkml.kernel.org/r/20180202154104.1522809-2-arnd@arndb.deSigned-off-by: NArnd Bergmann <arnd@arndb.de>
      Cc: Babu Moger <babu.moger@amd.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
      Cc: Nicolas Pitre <nico@linaro.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      101110f6
    • A
      include/linux/sched/mm.h: re-inline mmdrop() · d34bc48f
      Andrew Morton 提交于
      As Peter points out, Doing a CALL+RET for just the decrement is a bit silly.
      
      Fixes: d70f2a14 ("include/linux/sched/mm.h: uninline mmdrop_async(), etc")
      Acked-by: NPeter Zijlstra (Intel) <peterz@infraded.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      d34bc48f
  5. 21 2月, 2018 1 次提交
  6. 20 2月, 2018 5 次提交
  7. 19 2月, 2018 1 次提交
  8. 17 2月, 2018 4 次提交
  9. 16 2月, 2018 5 次提交
  10. 15 2月, 2018 3 次提交
  11. 14 2月, 2018 1 次提交
    • H
      uapi/if_ether.h: move __UAPI_DEF_ETHHDR libc define · da360299
      Hauke Mehrtens 提交于
      This fixes a compile problem of some user space applications by not
      including linux/libc-compat.h in uapi/if_ether.h.
      
      linux/libc-compat.h checks which "features" the header files, included
      from the libc, provide to make the Linux kernel uapi header files only
      provide no conflicting structures and enums. If a user application mixes
      kernel headers and libc headers it could happen that linux/libc-compat.h
      gets included too early where not all other libc headers are included
      yet. Then the linux/libc-compat.h would not prevent all the
      redefinitions and we run into compile problems.
      This patch removes the include of linux/libc-compat.h from
      uapi/if_ether.h to fix the recently introduced case, but not all as this
      is more or less impossible.
      
      It is no problem to do the check directly in the if_ether.h file and not
      in libc-compat.h as this does not need any fancy glibc header detection
      as glibc never provided struct ethhdr and should define
      __UAPI_DEF_ETHHDR by them self when they will provide this.
      
      The following test program did not compile correctly any more:
      
      #include <linux/if_ether.h>
      #include <netinet/in.h>
      #include <linux/in.h>
      
      int main(void)
      {
      	return 0;
      }
      
      Fixes: 6926e041 ("uapi/if_ether.h: prevent redefinition of struct ethhdr")
      Reported-by: NGuillaume Nault <g.nault@alphalink.fr>
      Cc: <stable@vger.kernel.org> # 4.15
      Signed-off-by: NHauke Mehrtens <hauke@hauke-m.de>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      da360299
  12. 13 2月, 2018 4 次提交
    • T
      x86/mm, mm/hwpoison: Don't unconditionally unmap kernel 1:1 pages · fd0e786d
      Tony Luck 提交于
      In the following commit:
      
        ce0fa3e5 ("x86/mm, mm/hwpoison: Clear PRESENT bit for kernel 1:1 mappings of poison pages")
      
      ... we added code to memory_failure() to unmap the page from the
      kernel 1:1 virtual address space to avoid speculative access to the
      page logging additional errors.
      
      But memory_failure() may not always succeed in taking the page offline,
      especially if the page belongs to the kernel.  This can happen if
      there are too many corrected errors on a page and either mcelog(8)
      or drivers/ras/cec.c asks to take a page offline.
      
      Since we remove the 1:1 mapping early in memory_failure(), we can
      end up with the page unmapped, but still in use. On the next access
      the kernel crashes :-(
      
      There are also various debug paths that call memory_failure() to simulate
      occurrence of an error. Since there is no actual error in memory, we
      don't need to map out the page for those cases.
      
      Revert most of the previous attempt and keep the solution local to
      arch/x86/kernel/cpu/mcheck/mce.c. Unmap the page only when:
      
      	1) there is a real error
      	2) memory_failure() succeeds.
      
      All of this only applies to 64-bit systems. 32-bit kernel doesn't map
      all of memory into kernel space. It isn't worth adding the code to unmap
      the piece that is mapped because nobody would run a 32-bit kernel on a
      machine that has recoverable machine checks.
      Signed-off-by: NTony Luck <tony.luck@intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave <dave.hansen@intel.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Robert (Persistent Memory) <elliott@hpe.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-mm@kvack.org
      Cc: stable@vger.kernel.org #v4.14
      Fixes: ce0fa3e5 ("x86/mm, mm/hwpoison: Clear PRESENT bit for kernel 1:1 mappings of poison pages")
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      fd0e786d
    • T
      locking/semaphore: Update the file path in documentation · 2dd6fd2e
      Tycho Andersen 提交于
      While reading this header I noticed that the locking stuff has moved to
      kernel/locking/*, so update the path in semaphore.h to point to that.
      Signed-off-by: NTycho Andersen <tycho@tycho.ws>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20180201114119.1090-1-tycho@tycho.wsSigned-off-by: NIngo Molnar <mingo@kernel.org>
      2dd6fd2e
    • W
      locking/atomic/bitops: Document and clarify ordering semantics for failed test_and_{}_bit() · 61e02392
      Will Deacon 提交于
      A test_and_{}_bit() operation fails if the value of the bit is such that
      the modification does not take place. For example, if test_and_set_bit()
      returns 1. In these cases, follow the behaviour of cmpxchg and allow the
      operation to be unordered. This also applies to test_and_set_bit_lock()
      if the lock is found to be be taken already.
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1518528619-20049-1-git-send-email-will.deacon@arm.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      61e02392
    • jia zhang's avatar
      vfs/proc/kcore, x86/mm/kcore: Fix SMAP fault when dumping vsyscall user page · 595dd46e
      jia zhang 提交于
      Commit:
      
        df04abfd ("fs/proc/kcore.c: Add bounce buffer for ktext data")
      
      ... introduced a bounce buffer to work around CONFIG_HARDENED_USERCOPY=y.
      However, accessing the vsyscall user page will cause an SMAP fault.
      
      Replace memcpy() with copy_from_user() to fix this bug works, but adding
      a common way to handle this sort of user page may be useful for future.
      
      Currently, only vsyscall page requires KCORE_USER.
      Signed-off-by: jia zhang's avatarJia Zhang <zhang.jia@linux.alibaba.com>
      Reviewed-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: jolsa@redhat.com
      Link: http://lkml.kernel.org/r/1518446694-21124-2-git-send-email-zhang.jia@linux.alibaba.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      595dd46e
  13. 12 2月, 2018 2 次提交