1. 19 May, 2018 3 commits
  2. 18 May, 2018 1 commit
    • W
      proc: do not access cmdline nor environ from file-backed areas · 7f7ccc2c
      Committed by Willy Tarreau
      proc_pid_cmdline_read() and environ_read() directly access the target
      process' VM to retrieve the command line and environment. If this
      process remaps these areas onto a file via mmap(), the requesting
      process may experience various issues such as extra delays if the
      underlying device is slow to respond.
      
      Let's simply refuse to access file-backed areas in these functions.
      For this we add a new FOLL_ANON gup flag that is passed to all calls
      to access_remote_vm(). The code already takes care of such failures
      (including unmapped areas). Accesses via /proc/pid/mem were not
      changed though.
      
      This was assigned CVE-2018-1120.
      
      Note for stable backports: the patch may apply to kernels prior to 4.11
      but silently miss one location; it must be checked that no call to
      access_remote_vm() keeps zero as the last argument.
      Reported-by: Qualys Security Advisory <qsa@qualys.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Willy Tarreau <w@1wt.eu>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      7f7ccc2c
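The rule described in this commit (refuse remote access to file-backed areas when a dedicated flag is passed) can be pictured with a small user-space model. This is only a sketch: `struct vma_model`, `FOLL_ANON_MODEL` and `check_vma_model()` are hypothetical names invented for illustration, and the flag value is arbitrary; the real check lives in the kernel's gup path and is not reproduced here.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define FOLL_ANON_MODEL 0x1u  /* hypothetical flag value, for illustration only */

/* Simplified stand-in for a memory mapping. */
struct vma_model {
    const char *backing_file;  /* NULL for an anonymous mapping */
};

static bool vma_model_is_anonymous(const struct vma_model *vma)
{
    return vma->backing_file == NULL;
}

/* Returns 0 if the access may proceed, -EFAULT if it is refused. */
int check_vma_model(const struct vma_model *vma, unsigned int gup_flags)
{
    /* With the flag set, only anonymous memory may be read. */
    if ((gup_flags & FOLL_ANON_MODEL) && !vma_model_is_anonymous(vma))
        return -EFAULT;
    return 0;
}
```

In this model a caller that does not pass the flag (the /proc/pid/mem case mentioned above) is unaffected, while the cmdline and environ readers would get an error instead of blocking on a slow backing device.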
  3. 15 May, 2018 1 commit
  4. 14 May, 2018 1 commit
  5. 12 May, 2018 2 commits
  6. 11 May, 2018 1 commit
    • W
      KVM: Extend MAX_IRQ_ROUTES to 4096 for all archs · ddc9cfb7
      Committed by Wanpeng Li
      Our virtual machines make use of device assignment by configuring
      12 NVMe disks for high I/O performance. Each NVMe device has 129
      MSI-X Table entries:
      Capabilities: [50] MSI-X: Enable+ Count=129 Masked-
              Vector table: BAR=0 offset=00002000
      Windows virtual machines fail to boot because they map every MSI-X
      table entry that the NVMe hardware reports into the MSI routing
      table, which exceeds the limit of 1024 routes. This patch extends
      MAX_IRQ_ROUTES to 4096 for all archs; it may be extended again in the
      future if needed.
      Reviewed-by: Cornelia Huck <cohuck@redhat.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Cornelia Huck <cohuck@redhat.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
      Signed-off-by: Tonny Lu <tonnylu@tencent.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      ddc9cfb7
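The arithmetic behind this commit is easy to check: 12 devices with 129 MSI-X entries each need 1548 routes, which is above 1024 but below 4096. The helper names below are hypothetical, and the sketch ignores any other routes a guest may use.

```c
#include <stdbool.h>

enum {
    OLD_MAX_IRQ_ROUTES = 1024,  /* limit before the patch, per the commit text */
    NEW_MAX_IRQ_ROUTES = 4096,  /* limit after the patch */
};

/* Routes needed if every MSI-X table entry of every device is mapped. */
unsigned int msix_routes_needed(unsigned int devices, unsigned int entries_per_device)
{
    return devices * entries_per_device;
}

bool routes_fit(unsigned int needed, unsigned int limit)
{
    return needed <= limit;
}
```

With the numbers from the commit message, `msix_routes_needed(12, 129)` is 1548, so the old limit is exceeded and the new one leaves headroom.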
  7. 10 May, 2018 1 commit
  8. 05 May, 2018 1 commit
  9. 04 May, 2018 1 commit
    • P
      sched/core: Introduce set_special_state() · b5bf9a90
      Committed by Peter Zijlstra
      Gaurav reported a perceived problem with TASK_PARKED, which turned out
      to be a broken wait-loop pattern in __kthread_parkme(), but the
      reported issue can (and does) in fact happen for states that do not do
      condition based sleeps.
      
      When the 'current->state = TASK_RUNNING' store of a previous
      (concurrent) try_to_wake_up() collides with the setting of a 'special'
      sleep state, we can lose the sleep state.
      
      Normal condition based wait-loops are immune to this problem, but
      sleep states that are not condition based are subject to it.
      
      There already is a fix for TASK_DEAD. Abstract that and also apply it
      to TASK_STOPPED and TASK_TRACED, both of which are also without
      condition based wait-loop.
      Reported-by: Gaurav Kohli <gkohli@codeaurora.org>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: Oleg Nesterov <oleg@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      b5bf9a90
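The lost-state scenario in this commit can be replayed deterministically in a single-threaded model. This is a sketch under assumptions: the state values, `struct task_model` and both replay functions are hypothetical, and the "lock" is a plain marker that stands in for whatever serialization the kernel helper uses, which the commit text implies but does not spell out.

```c
/* Hypothetical state values, chosen only for this model. */
enum task_state_model {
    TASK_RUNNING_M       = 0,
    TASK_INTERRUPTIBLE_M = 1,
    TASK_STOPPED_M       = 4,
};

struct task_model {
    int state;
    int wake_lock_held;  /* marker for the section shared with the wakeup path */
};

/* Racy order: the waker has decided to wake the task, but its
 * TASK_RUNNING store lands after the special state was set. */
int replay_unserialized(void)
{
    struct task_model t = { TASK_INTERRUPTIBLE_M, 0 };
    int waker_will_wake = (t.state & TASK_INTERRUPTIBLE_M) != 0; /* waker: check */

    t.state = TASK_RUNNING_M;      /* task: left its wait loop on its own */
    t.state = TASK_STOPPED_M;      /* task: plain store of a special state */
    if (waker_will_wake)
        t.state = TASK_RUNNING_M;  /* waker: late store clobbers the state */
    return t.state;
}

/* Serialized order: the waker's check and store form one critical
 * section, so the special-state store cannot land in the middle. */
int replay_serialized(void)
{
    struct task_model t = { TASK_INTERRUPTIBLE_M, 0 };

    t.wake_lock_held = 1;          /* waker: enter critical section */
    if (t.state & TASK_INTERRUPTIBLE_M)
        t.state = TASK_RUNNING_M;  /* waker: check and store together */
    t.wake_lock_held = 0;          /* waker: leave critical section */

    t.wake_lock_held = 1;          /* task: enter critical section */
    t.state = TASK_STOPPED_M;      /* task: special state store */
    t.wake_lock_held = 0;          /* task: leave critical section */
    return t.state;
}
```

In the first replay the task ends up running although it meant to stop; in the second the special state survives.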
  10. 03 May, 2018 2 commits
    • T
      bdi: wake up concurrent wb_shutdown() callers. · 8236b0ae
      Committed by Tetsuo Handa
      syzbot is reporting hung tasks at wait_on_bit(WB_shutting_down) in
      wb_shutdown() [1]. This seems to be because commit 5318ce7d ("bdi:
      Shutdown writeback on all cgwbs in cgwb_bdi_destroy()") forgot to call
      wake_up_bit(WB_shutting_down) after clear_bit(WB_shutting_down).
      
      Introduce a helper function clear_and_wake_up_bit() and use it, in order
      to avoid similar errors in future.
      
      [1] https://syzkaller.appspot.com/bug?id=b297474817af98d5796bc544e1bb806fc3da0e5e
      Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Reported-by: syzbot <syzbot+c0cf869505e03bdf1a24@syzkaller.appspotmail.com>
      Fixes: 5318ce7d ("bdi: Shutdown writeback on all cgwbs in cgwb_bdi_destroy()")
      Cc: Tejun Heo <tj@kernel.org>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      8236b0ae
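The point of a combined helper is that the clear and the wakeup can no longer be separated by accident. A minimal user-space sketch of that idea, using C11 atomics, might look as follows. The names `clear_and_wake_up_bit_model`, `wake_fn_t` and `count_wake` are hypothetical, and the wakeup is a callback that only counts calls rather than a real wait queue.

```c
#include <stdatomic.h>

/* Callback standing in for a real "wake everyone waiting on this bit". */
typedef void (*wake_fn_t)(atomic_ulong *word, int bit);

/* Clear the bit, then always notify waiters, in one helper. */
void clear_and_wake_up_bit_model(int bit, atomic_ulong *word, wake_fn_t wake)
{
    atomic_fetch_and_explicit(word, ~(1UL << bit), memory_order_release);
    atomic_thread_fence(memory_order_seq_cst);  /* order the clear before the wakeup */
    wake(word, bit);
}

/* Stand-in wakeup used for demonstration: it only counts calls. */
static int wake_calls;

void count_wake(atomic_ulong *word, int bit)
{
    (void)word;
    (void)bit;
    wake_calls++;
}
```

A caller that previously had to remember two steps now makes one call, which is the class of error the commit message says it wants to avoid in future.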
    • P
      kthread, sched/wait: Fix kthread_parkme() completion issue · 85f1abe0
      Committed by Peter Zijlstra
      Even with the wait-loop fixed, there is a further issue with
      kthread_parkme(). Upon hotplug, when we do takedown_cpu(),
      smpboot_park_threads() can return before all those threads are in fact
      blocked, due to the placement of the complete() in __kthread_parkme().
      
      When that happens, sched_cpu_dying() -> migrate_tasks() can end up
      migrating such a still runnable task onto another CPU.
      
      Normally the task will have hit schedule() and gone to sleep by the
      time we do kthread_unpark(), which will then do __kthread_bind() to
      re-bind the task to the correct CPU.
      
      However, when we lose the initial TASK_PARKED store to the concurrent
      wakeup issue described previously, do the complete(), get migrated, it
      is possible to either:
      
       - observe kthread_unpark()'s clearing of SHOULD_PARK and terminate
         the park and set TASK_RUNNING, or
      
       - __kthread_bind()'s wait_task_inactive() to observe the competing
         TASK_RUNNING store.
      
      Either way the WARN() in __kthread_bind() will trigger and fail to
      correctly set the CPU affinity.
      
      Fix this by only issuing the complete() when the kthread has scheduled
      out. This does away with all the icky 'still running' nonsense.
      
      The alternative is to promote TASK_PARKED to a special state, this
      guarantees wait_task_inactive() cannot observe a 'stale' TASK_RUNNING
      and we'll end up doing the right thing, but this preserves the whole
      icky business of potentially migrating the still runnable thing.
      Reported-by: Gaurav Kohli <gkohli@codeaurora.org>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      85f1abe0
  11. 29 April, 2018 1 commit
    • A
      <linux/stringhash.h>: fix end_name_hash() for 64bit long · 19b9ad67
      Committed by Amir Goldstein
      The comment claims that this helper will try not to lose bits, but for
      64bit long it loses the high bits before hashing 64bit long into 32bit
      int.  Use the helper hash_long() to do the right thing for 64bit long.
      For 32bit long, there is no change.
      
      All the callers of end_name_hash() either assign the result to
      qstr->hash, which is u32 or return the result as an int value (e.g.
      full_name_hash()).  Change the helper return type to int to conform to
      its users.
      
      [ It took me a while to apply this, because my initial reaction to it
        was - incorrectly - that it could make for slower code.
      
        After having looked more at it, I take back all my complaints about
        the patch, Amir was right and I was mis-reading things or just being
        stupid.
      
        I also don't worry too much about the possible performance impact of
        this on 64-bit, since most architectures that actually care about
        performance end up not using this very much (the dcache code is the
        most performance-critical, but the word-at-a-time case uses its own
        hashing anyway).
      
        So this ends up being mostly used for filesystems that do their own
        degraded hashing (usually because they want a case-insensitive
        comparison function).
      
        A _tiny_ worry remains, in that not everybody uses DCACHE_WORD_ACCESS,
        and then this potentially makes things more expensive on 64-bit
        architectures with slow or lacking multipliers even for the normal
        case.
      
        That said, realistically the only such architecture I can think of is
        PA-RISC. Nobody really cares about performance on that, it's more of a
        "look ma, I've got warts^W an odd machine" platform.
      
        So the patch is fine, and all my initial worries were just misplaced
        from not looking at this properly.   - Linus ]
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      19b9ad67
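The difference between truncating a 64-bit hash and folding it can be shown with two inputs that differ only in their high bits. This is a sketch, not the kernel helper: the function names are hypothetical, and the multiplier is assumed to match the golden-ratio constant the kernel uses for 64-bit multiplicative hashing.

```c
#include <stdint.h>

/* Assumed to match the kernel's 64-bit golden-ratio multiplier. */
#define GOLDEN_RATIO_64_MODEL 0x61C8864680B583EBull

/* Old behaviour on 64-bit: plain truncation drops the high 32 bits. */
uint32_t end_name_hash_truncate(uint64_t hash)
{
    return (uint32_t)hash;
}

/* Multiplicative fold: multiply, then keep the top 32 bits of the
 * product, so the high input bits also influence the result. */
uint32_t end_name_hash_fold(uint64_t hash)
{
    return (uint32_t)((hash * GOLDEN_RATIO_64_MODEL) >> 32);
}
```

For example, `0x100000005` and `0x200000005` collide under truncation because their low 32 bits are identical, while the folded values differ.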
  12. 27 April, 2018 2 commits
  13. 26 April, 2018 4 commits
    • O
      blk-mq: fix sysfs inflight counter · bf0ddaba
      Committed by Omar Sandoval
      When the blk-mq inflight implementation was added, /proc/diskstats was
      converted to use it, but /sys/block/$dev/inflight was not. Fix it by
      adding another helper to count in-flight requests by data direction.
      
      Fixes: f299b7c7 ("blk-mq: provide internal in-flight variant")
      Signed-off-by: Omar Sandoval <osandov@fb.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      bf0ddaba
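Counting in-flight requests by data direction, as this commit describes, amounts to keeping one counter for reads and one for writes. The sketch below uses hypothetical types (`struct request_model`, `enum data_dir_model`) and a flat array in place of the block layer's own request tracking.

```c
#include <stddef.h>

enum data_dir_model { DIR_READ = 0, DIR_WRITE = 1 };

struct request_model {
    enum data_dir_model dir;
    int in_flight;  /* non-zero while the request is outstanding */
};

/* Fills inflight[DIR_READ] and inflight[DIR_WRITE] with the number of
 * outstanding requests in each direction. */
void count_inflight_rw(const struct request_model *rqs, size_t n,
                       unsigned int inflight[2])
{
    inflight[DIR_READ] = 0;
    inflight[DIR_WRITE] = 0;
    for (size_t i = 0; i < n; i++) {
        if (rqs[i].in_flight)
            inflight[rqs[i].dir]++;
    }
}
```

The two counters correspond to the two numbers that /sys/block/$dev/inflight reports.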
    • T
      Revert: Unify CLOCK_MONOTONIC and CLOCK_BOOTTIME · a3ed0e43
      Committed by Thomas Gleixner
      Revert commits
      
      92af4dcb ("tracing: Unify the "boot" and "mono" tracing clocks")
      127bfa5f ("hrtimer: Unify MONOTONIC and BOOTTIME clock behavior")
      7250a404 ("posix-timers: Unify MONOTONIC and BOOTTIME clock behavior")
      d6c7270e ("timekeeping: Remove boot time specific code")
      f2d6fdbf ("Input: Evdev - unify MONOTONIC and BOOTTIME clock behavior")
      d6ed449a ("timekeeping: Make the MONOTONIC clock behave like the BOOTTIME clock")
      72199320 ("timekeeping: Add the new CLOCK_MONOTONIC_ACTIVE clock")
      
      As stated in the pull request for the unification of CLOCK_MONOTONIC and
      CLOCK_BOOTTIME, it was clear that we might have to revert the change.
      
      As reported by several folks systemd and other applications rely on the
      documented behaviour of CLOCK_MONOTONIC on Linux and break with the above
      changes. After resume daemons time out and other timeout related issues are
      observed. Rafael compiled this list:
      
      * systemd kills daemons on resume, after >WatchdogSec seconds
        of suspending (Genki Sky).  [Verified that that's because systemd uses
        CLOCK_MONOTONIC and expects it to not include the suspend time.]
      
      * systemd-journald misbehaves after resume:
        systemd-journald[7266]: File /var/log/journal/016627c3c4784cd4812d4b7e96a34226/system.journal
        corrupted or uncleanly shut down, renaming and replacing.
        (Mike Galbraith).
      
      * NetworkManager reports "networking disabled" and networking is broken
        after resume 50% of the time (Pavel).  [May be because of systemd.]
      
      * MATE desktop dims the display and starts the screensaver right after
        system resume (Pavel).
      
      * Full system hang during resume (me).  [May be due to systemd or NM or both.]
      
      That happens on Debian and openSUSE systems.
      
      It's sad that these problems were caught neither in -next nor by those
      folks who expressed interest in this change.
      Reported-by: Rafael J. Wysocki <rjw@rjwysocki.net>
      Reported-by: Genki Sky <sky@genki.is>
      Reported-by: Pavel Machek <pavel@ucw.cz>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Kevin Easton <kevin@guarana.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mark Salyzyn <salyzyn@android.com>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      a3ed0e43
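The behaviour that user space relies on here can be observed from a program: with the revert in place, CLOCK_MONOTONIC does not advance during suspend while CLOCK_BOOTTIME does, so their difference approximates the time spent suspended. The sketch below uses the standard `clock_gettime()` call; the function name `suspended_time_ns` is hypothetical, and the feature macro is defined defensively in case the C library needs it to expose CLOCK_BOOTTIME.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  /* may be needed for CLOCK_BOOTTIME, depending on the libc */
#endif
#include <stdint.h>
#include <time.h>

static int64_t to_ns(const struct timespec *ts)
{
    return (int64_t)ts->tv_sec * 1000000000LL + ts->tv_nsec;
}

/* Stores CLOCK_BOOTTIME minus CLOCK_MONOTONIC in *out_ns.
 * Returns 0 on success, -1 if either clock cannot be read. */
int suspended_time_ns(int64_t *out_ns)
{
    struct timespec mono, boot;

    if (clock_gettime(CLOCK_MONOTONIC, &mono) != 0)
        return -1;
    if (clock_gettime(CLOCK_BOOTTIME, &boot) != 0)
        return -1;
    *out_ns = to_ns(&boot) - to_ns(&mono);
    return 0;
}
```

A watchdog that measures timeouts with CLOCK_MONOTONIC therefore does not count suspended time, which is the expectation the reports listed above depend on.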
    • A
      remoteproc: fix crashed parameter logic on stop call · fcd58037
      Committed by Arnaud Pouliquen
      Fix rproc_add_subdev parameter name and inverse the crashed logic.
      
      Fixes: 880f5b38 ("remoteproc: Pass type of shutdown to subdev remove")
      Reviewed-by: Alex Elder <elder@linaro.org>
      Signed-off-by: Arnaud Pouliquen <arnaud.pouliquen@st.com>
      Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
      fcd58037
    • M
      virtio: add ability to iterate over vqs · 24a7e4d2
      Committed by Michael S. Tsirkin
      For cleanup it's helpful to be able to simply scan all vqs and discard
      all data. Add an iterator to do that.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      24a7e4d2
  14. 25 April, 2018 2 commits
  15. 24 April, 2018 3 commits
  16. 23 April, 2018 3 commits
  17. 21 April, 2018 3 commits
    • A
      kasan: add no_sanitize attribute for clang builds · 12c8f25a
      Committed by Andrey Konovalov
      KASAN uses the __no_sanitize_address macro to disable instrumentation of
      particular functions.  Right now it is defined only for GCC builds, which
      causes false positives when clang is used.
      
      This patch adds a definition for clang.
      
      Note that clang revision 329612 or higher is required.
      
      [andreyknvl@google.com: remove redundant #ifdef CONFIG_KASAN check]
        Link: http://lkml.kernel.org/r/c79aa31a2a2790f6131ed607c58b0dd45dd62a6c.1523967959.git.andreyknvl@google.com
      Link: http://lkml.kernel.org/r/4ad725cc903f8534f8c8a60f0daade5e3d674f8d.1523554166.git.andreyknvl@google.com
      Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
      Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Cc: Andrey Konovalov <andreyknvl@google.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Paul Lawrence <paullawrence@google.com>
      Cc: Sandipan Das <sandipan@linux.vnet.ibm.com>
      Cc: Kees Cook <keescook@chromium.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      12c8f25a
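A per-compiler definition of such a macro can be sketched as below. The macro name `NO_SANITIZE_ADDRESS_MODEL` and the example function are hypothetical; the two attribute spellings are the ones clang and GCC document for disabling address-sanitizer instrumentation of a function, and the kernel's actual definition may be guarded differently.

```c
#include <stddef.h>

/* Pick the attribute spelling the current compiler understands. */
#if defined(__clang__)
#define NO_SANITIZE_ADDRESS_MODEL __attribute__((no_sanitize("address")))
#elif defined(__GNUC__)
#define NO_SANITIZE_ADDRESS_MODEL __attribute__((no_sanitize_address))
#else
#define NO_SANITIZE_ADDRESS_MODEL
#endif

/* A function that the sanitizer will not instrument. */
NO_SANITIZE_ADDRESS_MODEL
unsigned int sum_bytes_uninstrumented(const unsigned char *p, size_t n)
{
    unsigned int sum = 0;

    for (size_t i = 0; i < n; i++)
        sum += p[i];
    return sum;
}
```

Without a clang branch, the macro would expand to nothing under clang and the function would be instrumented, which is the source of the false positives mentioned above.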
    • G
      writeback: safer lock nesting · 2e898e4c
      Committed by Greg Thelen
      lock_page_memcg()/unlock_page_memcg() use spin_lock_irqsave/restore() if
      the page's memcg is undergoing move accounting, which occurs when a
      process leaves its memcg for a new one that has
      memory.move_charge_at_immigrate set.
      
      unlocked_inode_to_wb_begin,end() use spin_lock_irq/spin_unlock_irq() if
      the given inode is switching writeback domains.  Switches occur when
      enough writes are issued from a new domain.
      
      This existing pattern is thus suspicious:
          lock_page_memcg(page);
          unlocked_inode_to_wb_begin(inode, &locked);
          ...
          unlocked_inode_to_wb_end(inode, locked);
          unlock_page_memcg(page);
      
      If inode switch and process memcg migration are both in-flight then
      unlocked_inode_to_wb_end() will unconditionally enable interrupts while
      still holding the lock_page_memcg() irq spinlock.  This suggests the
      possibility of deadlock if an interrupt occurs before unlock_page_memcg().
      
          truncate
          __cancel_dirty_page
          lock_page_memcg
          unlocked_inode_to_wb_begin
          unlocked_inode_to_wb_end
          <interrupts mistakenly enabled>
                                          <interrupt>
                                          end_page_writeback
                                          test_clear_page_writeback
                                          lock_page_memcg
                                          <deadlock>
          unlock_page_memcg
      
      Due to configuration limitations this deadlock is not currently possible
      because we don't mix cgroup writeback (a cgroupv2 feature) and
      memory.move_charge_at_immigrate (a cgroupv1 feature).
      
      If the kernel is hacked to always claim inode switching and memcg
      moving_account, then this script triggers lockup in less than a minute:
      
        cd /mnt/cgroup/memory
        mkdir a b
        echo 1 > a/memory.move_charge_at_immigrate
        echo 1 > b/memory.move_charge_at_immigrate
        (
          echo $BASHPID > a/cgroup.procs
          while true; do
            dd if=/dev/zero of=/mnt/big bs=1M count=256
          done
        ) &
        while true; do
          sync
        done &
        sleep 1h &
        SLEEP=$!
        while true; do
          echo $SLEEP > a/cgroup.procs
          echo $SLEEP > b/cgroup.procs
        done
      
      The deadlock does not seem possible, so it's debatable if there's any
      reason to modify the kernel.  I suggest we should, to prevent future
      surprises.  And Wang Long said "this deadlock occurs three times in our
      environment", so there's more reason to apply this, even to stable.
      Stable 4.4 has minor conflicts applying this patch.  For a clean 4.4 patch
      see "[PATCH for-4.4] writeback: safer lock nesting"
      https://lkml.org/lkml/2018/4/11/146
      
      [gthelen@google.com: v4]
        Link: http://lkml.kernel.org/r/20180411084653.254724-1-gthelen@google.com
      [akpm@linux-foundation.org: comment tweaks, struct initialization simplification]
      Change-Id: Ibb773e8045852978f6207074491d262f1b3fb613
      Link: http://lkml.kernel.org/r/20180410005908.167976-1-gthelen@google.com
      Fixes: 682aa8e1 ("writeback: implement unlocked_inode_to_wb transaction and use it for stat updates")
      Signed-off-by: Greg Thelen <gthelen@google.com>
      Reported-by: Wang Long <wanglong19@meituan.com>
      Acked-by: Wang Long <wanglong19@meituan.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: <stable@vger.kernel.org>	[v4.2+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      2e898e4c
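The nesting hazard in this commit can be modelled with a single boolean that stands for the CPU's interrupt-enable state. Everything here is a hypothetical sketch: the `model_*` helpers only mimic the enable/disable side effects of the irq and irqsave lock variants, and the "fixed" variant assumes the inner pair is changed to save and restore the interrupt state, which the commit title suggests but the text above does not detail.

```c
#include <stdbool.h>

static bool irqs_enabled = true;  /* models the CPU interrupt-enable flag */

static void model_lock_irq(void)   { irqs_enabled = false; }
static void model_unlock_irq(void) { irqs_enabled = true; }  /* unconditional enable */

static void model_lock_irqsave(bool *flags)
{
    *flags = irqs_enabled;  /* remember the state on entry */
    irqs_enabled = false;
}

static void model_unlock_irqrestore(bool flags)
{
    irqs_enabled = flags;   /* put back whatever was saved */
}

/* Inner pair uses the plain irq variant. Returns true if interrupts
 * are enabled while the outer lock is still held. */
bool nested_with_irq(void)
{
    bool outer, leaked;

    irqs_enabled = true;
    model_lock_irqsave(&outer);      /* outer: lock_page_memcg() */
    model_lock_irq();                /* inner: unlocked_inode_to_wb_begin() */
    model_unlock_irq();              /* inner: unlocked_inode_to_wb_end() */
    leaked = irqs_enabled;
    model_unlock_irqrestore(outer);  /* outer: unlock_page_memcg() */
    return leaked;
}

/* Inner pair saves and restores instead. */
bool nested_with_irqsave(void)
{
    bool outer, inner, leaked;

    irqs_enabled = true;
    model_lock_irqsave(&outer);
    model_lock_irqsave(&inner);
    model_unlock_irqrestore(inner);
    leaked = irqs_enabled;
    model_unlock_irqrestore(outer);
    return leaked;
}
```

In the first variant interrupts come back on inside the outer critical section, which is the window in which the interrupt path shown in the call trace could try to take the same lock.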
    • K
      fork: unconditionally clear stack on fork · e01e8063
      Committed by Kees Cook
      One of the classes of kernel stack content leaks[1] is the exposure of
      prior heap or stack contents when a new process stack is
      allocated.  Normally, those stacks are not zeroed, and the old contents
      remain in place.  In the face of stack content exposure flaws, those
      contents can leak to userspace.
      
      Fixing this will make the kernel no longer vulnerable to these flaws, as
      the stack will be wiped each time a stack is assigned to a new process.
      There's not a meaningful change in runtime performance; it almost looks
      like it provides a benefit.
      
      Performing back-to-back kernel builds before:
      	Run times: 157.86 157.09 158.90 160.94 160.80
      	Mean: 159.12
      	Std Dev: 1.54
      
      and after:
      	Run times: 159.31 157.34 156.71 158.15 160.81
      	Mean: 158.46
      	Std Dev: 1.46
      
      Instead of making this a build or runtime config, Andy Lutomirski
      recommended this just be enabled by default.
      
      [1] A noisy search for many kinds of stack content leaks can be seen here:
      https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=linux+kernel+stack+leak
      
      I did some more testing with perf and cycle counts, running 100,000
      execs of /bin/true.
      
      before:
      Cycles: 218858861551 218853036130 214727610969 227656844122 224980542841
      Mean:  221015379122.60
      Std Dev: 4662486552.47
      
      after:
      Cycles: 213868945060 213119275204 211820169456 224426673259 225489986348
      Mean:  217745009865.40
      Std Dev: 5935559279.99
      
      It continues to look like it's faster, though the deviation is rather
      wide, but I'm not sure what I could do that would be less noisy.  I'm
      open to ideas!
      
      Link: http://lkml.kernel.org/r/20180221021659.GA37073@beast
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Laura Abbott <labbott@redhat.com>
      Cc: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      e01e8063
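The idea of wiping a stack every time it is handed to a new owner can be sketched with a one-slot cache in user space. The names and the 4096-byte size are hypothetical; the kernel's stack allocator and its caching are more involved than this.

```c
#include <stdlib.h>
#include <string.h>

#define STACK_SIZE_MODEL 4096  /* arbitrary size for the model */

static unsigned char *cached_stack;  /* one-slot cache of a freed stack */

/* Hands out a stack that never contains a previous owner's data. */
unsigned char *alloc_stack_model(void)
{
    unsigned char *s = cached_stack;

    if (s) {
        cached_stack = NULL;
        memset(s, 0, STACK_SIZE_MODEL);  /* wipe before reuse */
        return s;
    }
    return calloc(1, STACK_SIZE_MODEL);  /* fresh memory, already zeroed */
}

/* Keeps one freed stack around for reuse; its contents are left as-is. */
void free_stack_model(unsigned char *s)
{
    if (!cached_stack)
        cached_stack = s;
    else
        free(s);
}
```

Because the wipe happens on the allocation side, a stack that was freed with sensitive data in it is clean by the time any new owner can read it.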
  18. 20 April, 2018 2 commits
    • A
      y2038: ipc: Use __kernel_timespec · 21fc538d
      Committed by Arnd Bergmann
      This is a preparation for changing over __kernel_timespec to 64-bit
      times, which involves assigning new system call numbers for mq_timedsend(),
      mq_timedreceive() and semtimedop() for compatibility with future y2038
      proof user space.
      
      The existing ABIs will remain available through compat code.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      21fc538d
    • R
      fsnotify: Fix fsnotify_mark_connector race · d90a10e2
      Committed by Robert Kolchmeyer
      fsnotify() acquires a reference to a fsnotify_mark_connector through
      the SRCU-protected pointer to_tell->i_fsnotify_marks. However, it
      appears that no precautions are taken in fsnotify_put_mark() to
      ensure that fsnotify() drops its reference to this
      fsnotify_mark_connector before assigning a value to its 'destroy_next'
      field. This can result in fsnotify_put_mark() assigning a value
      to a connector's 'destroy_next' field right before fsnotify() tries to
      traverse the linked list referenced by the connector's 'list' field.
      Since these two fields are members of the same union, this behavior
      results in a kernel panic.
      
      This issue is resolved by moving the connector's 'destroy_next' field
      into the object pointer union. This should work since the object pointer
      access is protected by both a spinlock and the value of the 'flags'
      field, and the 'flags' field is cleared while holding the spinlock in
      fsnotify_put_mark() before 'destroy_next' is updated. It shouldn't be
      possible for another thread to accidentally read from the object pointer
      after the 'destroy_next' field is updated.
      
      The offending behavior here is extremely unlikely; since
      fsnotify_put_mark() removes references to a connector (specifically,
      it ensures that the connector is unreachable from the inode it was
      formerly attached to) before updating its 'destroy_next' field, a
      sizeable chunk of code in fsnotify_put_mark() has to execute in the
      short window between when fsnotify() acquires the connector reference
      and saves the value of its 'list' field. On the HEAD kernel, I've only
      been able to reproduce this by inserting a udelay(1) in fsnotify().
      However, I've been able to reproduce this issue without inserting a
      udelay(1) anywhere on older unmodified release kernels, so I believe
      it's worth fixing at HEAD.
      
      References: https://bugzilla.kernel.org/show_bug.cgi?id=199437
      Fixes: 08991e83
      CC: stable@vger.kernel.org
      Signed-off-by: Robert Kolchmeyer <rkolchmeyer@google.com>
      Signed-off-by: Jan Kara <jack@suse.cz>
      d90a10e2
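The layout change described in this commit can be illustrated with two simplified structures. These are hypothetical models, not the kernel definitions: field types are reduced to plain pointers and the lock is omitted. What they show is which field 'destroy_next' shares storage with before and after the change.

```c
#include <stddef.h>

struct hlist_head_model { void *first; };

/* Before: destroy_next shares storage with the list head that
 * readers may still be traversing. */
struct connector_before {
    unsigned int flags;
    void *object;
    union {
        struct hlist_head_model list;
        struct connector_before *destroy_next;
    };
};

/* After: destroy_next shares storage with the object pointer,
 * and the list head has storage of its own. */
struct connector_after {
    unsigned int flags;
    union {
        void *object;
        struct connector_after *destroy_next;
    };
    struct hlist_head_model list;
};
```

In the first layout a store to `destroy_next` overwrites `list`; in the second it can only overwrite `object`, whose access the commit message describes as protected by the spinlock and the 'flags' field.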
  19. 19 April, 2018 6 commits
    • M
      coresight: Move to SPDX identifier · 8a9fd832
      Committed by Mathieu Poirier
      Move CoreSight headers to the SPDX identifier.
      Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: http://lkml.kernel.org/r/1524089118-27595-1-git-send-email-mathieu.poirier@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      8a9fd832
    • D
      time: Change nanosleep to safe __kernel_* types · 01909974
      Committed by Deepa Dinamani
      Change over clock_nanosleep syscalls to use y2038 safe
      __kernel_timespec times. This will enable changing over
      of these syscalls to use new y2038 safe syscalls when
      the architectures define the CONFIG_64BIT_TIME.
      
      Note that the nanosleep syscall is deprecated and there is no
      plan for making it y2038 safe. The syscall should still work as
      before on 64 bit machines; on 32 bit machines it works
      correctly until y2038, as before, using the existing compat
      syscall version. There is no new syscall for supporting 64 bit
      time_t on 32 bit architectures.
      
      Cc: linux-api@vger.kernel.org
      Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      01909974
    • D
      time: Change types to new y2038 safe __kernel_* types · 6d5b8413
      Committed by Deepa Dinamani
      Change over clock_settime, clock_gettime and clock_getres
      syscalls to use __kernel_timespec times. This will enable
      changing over of these syscalls to use new y2038 safe syscalls
      when the architectures define the CONFIG_64BIT_TIME.
      
      Cc: linux-api@vger.kernel.org
      Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      6d5b8413
    • D
      time: Fix get_timespec64() for y2038 safe compat interfaces · ea2ce8f3
      Committed by Deepa Dinamani
      get/put_timespec64() interfaces will eventually be used for
      conversions between the new y2038 safe struct __kernel_timespec
      and struct timespec64.
      
      The new y2038 safe syscalls have a common entry for native
      and compat interfaces.
      On compat interfaces, the high order bits of nanoseconds
      should be zeroed out. This is because neither the application code
      nor the libc guarantees zeroing of these. If used without
      zeroing, the kernel might be at risk of using timespec values
      incorrectly.
      
      Note that clearing of bits is dependent on CONFIG_64BIT_TIME
      for now. This is until COMPAT_USE_64BIT_TIME has been handled
      correctly. x86 will be the first architecture that will use the
      CONFIG_64BIT_TIME.
      Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      ea2ce8f3
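The zeroing step described in this commit can be sketched as a mask applied only for compat callers. The struct and function names are hypothetical, and the `compat_caller` parameter stands in for whatever condition the kernel uses to decide that the caller is a compat task.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for the y2038 safe timespec layout. */
struct kernel_timespec_model {
    int64_t tv_sec;
    int64_t tv_nsec;
};

/* For compat callers only the low 32 bits of tv_nsec are meaningful;
 * the upper half is padding that user space may leave uninitialized. */
void sanitize_compat_nsec(struct kernel_timespec_model *ts, bool compat_caller)
{
    if (compat_caller)
        ts->tv_nsec &= 0xFFFFFFFFLL;  /* drop the garbage in the high bits */
}
```

A value such as `0x7EADBEEF00000064` in `tv_nsec` becomes 100 after the mask for a compat caller, and is left untouched for a native one.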
    • D
      time: Add new y2038 safe __kernel_timespec · acf8870a
      Committed by Deepa Dinamani
      The new struct __kernel_timespec is similar to current
      internal kernel struct timespec64 on 64 bit architecture.
      The compat structure however is similar to below on little
      endian systems (padding and tv_nsec are switched for big
      endian systems):
      
      typedef s32            compat_long_t;
      typedef s64            compat_kernel_time64_t;
      
      struct compat_kernel_timespec {
             compat_kernel_time64_t  tv_sec;
             compat_long_t           tv_nsec;
             compat_long_t           padding;
      };
      
      This allows for both the native and compat representations to
      be the same and syscalls using this type as part of their ABI
      can have a single entry point to both.
      
      Note that the compat define is not included anywhere in the
      kernel explicitly to avoid confusion.
      
      These types will be used by the new syscalls that will be
      introduced in the subsequent patches.
      Most of the new syscalls are just an update to the existing
      native ones with this new type. Hence, put this new type under
      an ifdef so that the architectures can define CONFIG_64BIT_TIME
      when they are ready to handle this switch.
      
      Cc: linux-arch@vger.kernel.org
      Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      acf8870a
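The claim that the native and compat representations coincide can be checked at compile time. The sketch below restates the compat structure from the commit message with fixed-width types and compares it against a native 64-bit layout; both struct names are hypothetical and the field order is the little-endian one quoted above.

```c
#include <stddef.h>
#include <stdint.h>

/* Native view: two 64-bit fields. */
struct kernel_timespec_native_model {
    int64_t   tv_sec;
    long long tv_nsec;
};

/* Compat view, little-endian order as quoted in the commit message. */
struct compat_kernel_timespec_model {
    int64_t tv_sec;
    int32_t tv_nsec;
    int32_t padding;
};

/* Same total size, and both fields start at the same offsets. */
_Static_assert(sizeof(struct kernel_timespec_native_model) ==
               sizeof(struct compat_kernel_timespec_model),
               "native and compat views must have the same size");
_Static_assert(offsetof(struct kernel_timespec_native_model, tv_sec) ==
               offsetof(struct compat_kernel_timespec_model, tv_sec),
               "tv_sec must sit at the same offset");
_Static_assert(offsetof(struct kernel_timespec_native_model, tv_nsec) ==
               offsetof(struct compat_kernel_timespec_model, tv_nsec),
               "tv_nsec must sit at the same offset");
```

Because the layouts line up, a single syscall entry point can read either representation, as long as the padding half of `tv_nsec` is not trusted for compat callers.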
    • D
      compat: Enable compat_get/put_timespec64 always · 1c68adf6
      Committed by Deepa Dinamani
      These functions are used in the repurposed compat syscalls
      to provide backward compatibility for using 32 bit time_t
      on 32 bit systems.
      Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      1c68adf6