1. 25 5月, 2018 3 次提交
    • S
      perf/core: Fix bad use of igrab() · 9511bce9
      Song Liu 提交于
      As Miklos reported and suggested:
      
       "This pattern repeats two times in trace_uprobe.c and in
        kernel/events/core.c as well:
      
            ret = kern_path(filename, LOOKUP_FOLLOW, &path);
            if (ret)
                goto fail_address_parse;
      
            inode = igrab(d_inode(path.dentry));
            path_put(&path);
      
        And it's wrong.  You can only hold a reference to the inode if you
        have an active ref to the superblock as well (which is normally
        through path.mnt) or holding s_umount.
      
        This way unmounting the containing filesystem while the tracepoint is
        active will give you the "VFS: Busy inodes after unmount..." message
        and a crash when the inode is finally put.
      
        Solution: store path instead of inode."
      
      This patch fixes the issue in kernel/event/core.c.
      Reviewed-and-tested-by: NAlexander Shishkin <alexander.shishkin@linux.intel.com>
      Reported-by: NMiklos Szeredi <miklos@szeredi.hu>
      Signed-off-by: NSong Liu <songliubraving@fb.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: <kernel-team@fb.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Fixes: 375637bc ("perf/core: Introduce address range filtering")
      Link: http://lkml.kernel.org/r/20180418062907.3210386-2-songliubraving@fb.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      9511bce9
    • S
      perf/core: Fix group scheduling with mixed hw and sw events · a1150c20
      Song Liu 提交于
      When hw and sw events are mixed in the same group, they are all attached
      to the hw perf_event_context. This sometimes requires moving group of
      perf_event to a different context.
      
      We found a bug in how the kernel handles this, for example if we do:
      
         perf stat -e '{faults,ref-cycles,faults}'  -I 1000
      
           1.005591180              1,297      faults
           1.005591180        457,476,576      ref-cycles
           1.005591180    <not supported>      faults
      
      First, sw event "faults" is attached to the sw context, and becomes the
      group leader. Then, hw event "ref-cycles" is attached, so both events
      are moved to the hw context. Last, another sw "faults" tries to attach,
      but it fails because of mismatch between the new target ctx (from sw
      pmu) and the group_leader's ctx (hw context, same as ref-cycles).
      
      The broken condition is:
         group_leader is sw event;
         group_leader is on hw context;
         add a sw event to the group.
      
      Fix this scenario by checking group_leader's context (instead of just
      event type). If group_leader is on hw context, use the ->pmu of this
      context to look up context for the new event.
      Signed-off-by: NSong Liu <songliubraving@fb.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: <kernel-team@fb.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Fixes: b04243ef ("perf: Complete software pmu grouping")
      Link: http://lkml.kernel.org/r/20180503194716.162815-1-songliubraving@fb.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      a1150c20
    • J
      Revert "mm/cma: manage the memory of the CMA area by using the ZONE_MOVABLE" · d883c6cf
      Joonsoo Kim 提交于
      This reverts the following commits that change CMA design in MM.
      
       3d2054ad ("ARM: CMA: avoid double mapping to the CMA area if CONFIG_HIGHMEM=y")
      
       1d47a3ec ("mm/cma: remove ALLOC_CMA")
      
       bad8c6c0 ("mm/cma: manage the memory of the CMA area by using the ZONE_MOVABLE")
      
      Ville reported a following error on i386.
      
        Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
        microcode: microcode updated early to revision 0x4, date = 2013-06-28
        Initializing CPU#0
        Initializing HighMem for node 0 (000377fe:00118000)
        Initializing Movable for node 0 (00000001:00118000)
        BUG: Bad page state in process swapper  pfn:377fe
        page:f53effc0 count:0 mapcount:-127 mapping:00000000 index:0x0
        flags: 0x80000000()
        raw: 80000000 00000000 00000000 ffffff80 00000000 00000100 00000200 00000001
        page dumped because: nonzero mapcount
        Modules linked in:
        CPU: 0 PID: 0 Comm: swapper Not tainted 4.17.0-rc5-elk+ #145
        Hardware name: Dell Inc. Latitude E5410/03VXMC, BIOS A15 07/11/2013
        Call Trace:
         dump_stack+0x60/0x96
         bad_page+0x9a/0x100
         free_pages_check_bad+0x3f/0x60
         free_pcppages_bulk+0x29d/0x5b0
         free_unref_page_commit+0x84/0xb0
         free_unref_page+0x3e/0x70
         __free_pages+0x1d/0x20
         free_highmem_page+0x19/0x40
         add_highpages_with_active_regions+0xab/0xeb
         set_highmem_pages_init+0x66/0x73
         mem_init+0x1b/0x1d7
         start_kernel+0x17a/0x363
         i386_start_kernel+0x95/0x99
         startup_32_smp+0x164/0x168
      
      The reason for this error is that the span of MOVABLE_ZONE is extended
      to whole node span for future CMA initialization, and, normal memory is
      wrongly freed here.  I submitted the fix and it seems to work, but,
      another problem happened.
      
      It's so late time to fix the later problem so I decide to reverting the
      series.
      Reported-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
      Acked-by: NLaura Abbott <labbott@redhat.com>
      Acked-by: NMichal Hocko <mhocko@suse.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NJoonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      d883c6cf
  2. 20 5月, 2018 1 次提交
    • A
      bpf: Prevent memory disambiguation attack · af86ca4e
      Alexei Starovoitov 提交于
      Detect code patterns where malicious 'speculative store bypass' can be used
      and sanitize such patterns.
      
       39: (bf) r3 = r10
       40: (07) r3 += -216
       41: (79) r8 = *(u64 *)(r7 +0)   // slow read
       42: (7a) *(u64 *)(r10 -72) = 0  // verifier inserts this instruction
       43: (7b) *(u64 *)(r8 +0) = r3   // this store becomes slow due to r8
       44: (79) r1 = *(u64 *)(r6 +0)   // cpu speculatively executes this load
       45: (71) r2 = *(u8 *)(r1 +0)    // speculatively arbitrary 'load byte'
                                       // is now sanitized
      
      Above code after x86 JIT becomes:
       e5: mov    %rbp,%rdx
       e8: add    $0xffffffffffffff28,%rdx
       ef: mov    0x0(%r13),%r14
       f3: movq   $0x0,-0x48(%rbp)
       fb: mov    %rdx,0x0(%r14)
       ff: mov    0x0(%rbx),%rdi
      103: movzbq 0x0(%rdi),%rsi
      Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      af86ca4e
  3. 19 5月, 2018 1 次提交
  4. 18 5月, 2018 1 次提交
    • W
      proc: do not access cmdline nor environ from file-backed areas · 7f7ccc2c
      Willy Tarreau 提交于
      proc_pid_cmdline_read() and environ_read() directly access the target
      process' VM to retrieve the command line and environment. If this
      process remaps these areas onto a file via mmap(), the requesting
      process may experience various issues such as extra delays if the
      underlying device is slow to respond.
      
      Let's simply refuse to access file-backed areas in these functions.
      For this we add a new FOLL_ANON gup flag that is passed to all calls
      to access_remote_vm(). The code already takes care of such failures
      (including unmapped areas). Accesses via /proc/pid/mem were not
      changed though.
      
      This was assigned CVE-2018-1120.
      
      Note for stable backports: the patch may apply to kernels prior to 4.11
      but silently miss one location; it must be checked that no call to
      access_remote_vm() keeps zero as the last argument.
      Reported-by: NQualys Security Advisory <qsa@qualys.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NWilly Tarreau <w@1wt.eu>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      7f7ccc2c
  5. 17 5月, 2018 1 次提交
  6. 16 5月, 2018 1 次提交
    • W
      locking/percpu-rwsem: Annotate rwsem ownership transfer by setting RWSEM_OWNER_UNKNOWN · 5a817641
      Waiman Long 提交于
      The filesystem freezing code needs to transfer ownership of a rwsem
      embedded in a percpu-rwsem from the task that does the freezing to
      another one that does the thawing by calling percpu_rwsem_release()
      after freezing and percpu_rwsem_acquire() before thawing.
      
      However, the new rwsem debug code runs afoul with this scheme by warning
      that the task that releases the rwsem isn't the one that acquires it,
      as reported by Amir Goldstein:
      
        DEBUG_LOCKS_WARN_ON(sem->owner != get_current())
        WARNING: CPU: 1 PID: 1401 at /home/amir/build/src/linux/kernel/locking/rwsem.c:133 up_write+0x59/0x79
      
        Call Trace:
         percpu_up_write+0x1f/0x28
         thaw_super_locked+0xdf/0x120
         do_vfs_ioctl+0x270/0x5f1
         ksys_ioctl+0x52/0x71
         __x64_sys_ioctl+0x16/0x19
         do_syscall_64+0x5d/0x167
         entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      To work properly with the rwsem debug code, we need to annotate that the
      rwsem ownership is unknown during the tranfer period until a brave soul
      comes forward to acquire the ownership. During that period, optimistic
      spinning will be disabled.
      Reported-by: NAmir Goldstein <amir73il@gmail.com>
      Tested-by: NAmir Goldstein <amir73il@gmail.com>
      Signed-off-by: NWaiman Long <longman@redhat.com>
      Acked-by: NPeter Zijlstra <peterz@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Theodore Y. Ts'o <tytso@mit.edu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: linux-fsdevel@vger.kernel.org
      Link: http://lkml.kernel.org/r/1526420991-21213-3-git-send-email-longman@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      5a817641
  7. 15 5月, 2018 1 次提交
  8. 14 5月, 2018 2 次提交
  9. 12 5月, 2018 3 次提交
  10. 11 5月, 2018 1 次提交
    • W
      KVM: Extend MAX_IRQ_ROUTES to 4096 for all archs · ddc9cfb7
      Wanpeng Li 提交于
      Our virtual machines make use of device assignment by configuring
      12 NVMe disks for high I/O performance. Each NVMe device has 129
      MSI-X Table entries:
      Capabilities: [50] MSI-X: Enable+ Count=129 Masked-Vector table: BAR=0 offset=00002000
      The windows virtual machines fail to boot since they will map the number of
      MSI-table entries that the NVMe hardware reported to the bus to msi routing
      table, this will exceed the 1024. This patch extends MAX_IRQ_ROUTES to 4096
      for all archs, in the future this might be extended again if needed.
      Reviewed-by: NCornelia Huck <cohuck@redhat.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim KrÄmář <rkrcmar@redhat.com>
      Cc: Cornelia Huck <cohuck@redhat.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: NWanpeng Li <wanpengli@tencent.com>
      Signed-off-by: NTonny Lu <tonnylu@tencent.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      ddc9cfb7
  11. 10 5月, 2018 1 次提交
  12. 05 5月, 2018 4 次提交
  13. 04 5月, 2018 1 次提交
    • P
      sched/core: Introduce set_special_state() · b5bf9a90
      Peter Zijlstra 提交于
      Gaurav reported a perceived problem with TASK_PARKED, which turned out
      to be a broken wait-loop pattern in __kthread_parkme(), but the
      reported issue can (and does) in fact happen for states that do not do
      condition based sleeps.
      
      When the 'current->state = TASK_RUNNING' store of a previous
      (concurrent) try_to_wake_up() collides with the setting of a 'special'
      sleep state, we can loose the sleep state.
      
      Normal condition based wait-loops are immune to this problem, but for
      sleep states that are not condition based are subject to this problem.
      
      There already is a fix for TASK_DEAD. Abstract that and also apply it
      to TASK_STOPPED and TASK_TRACED, both of which are also without
      condition based wait-loop.
      Reported-by: NGaurav Kohli <gkohli@codeaurora.org>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: NOleg Nesterov <oleg@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      b5bf9a90
  14. 03 5月, 2018 5 次提交
    • T
      bdi: wake up concurrent wb_shutdown() callers. · 8236b0ae
      Tetsuo Handa 提交于
      syzbot is reporting hung tasks at wait_on_bit(WB_shutting_down) in
      wb_shutdown() [1]. This seems to be because commit 5318ce7d ("bdi:
      Shutdown writeback on all cgwbs in cgwb_bdi_destroy()") forgot to call
      wake_up_bit(WB_shutting_down) after clear_bit(WB_shutting_down).
      
      Introduce a helper function clear_and_wake_up_bit() and use it, in order
      to avoid similar errors in future.
      
      [1] https://syzkaller.appspot.com/bug?id=b297474817af98d5796bc544e1bb806fc3da0e5eSigned-off-by: NTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Reported-by: Nsyzbot <syzbot+c0cf869505e03bdf1a24@syzkaller.appspotmail.com>
      Fixes: 5318ce7d ("bdi: Shutdown writeback on all cgwbs in cgwb_bdi_destroy()")
      Cc: Tejun Heo <tj@kernel.org>
      Reviewed-by: NJan Kara <jack@suse.cz>
      Suggested-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      8236b0ae
    • K
      nospec: Allow getting/setting on non-current task · 7bbf1373
      Kees Cook 提交于
      Adjust arch_prctl_get/set_spec_ctrl() to operate on tasks other than
      current.
      
      This is needed both for /proc/$pid/status queries and for seccomp (since
      thread-syncing can trigger seccomp in non-current threads).
      Signed-off-by: NKees Cook <keescook@chromium.org>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      7bbf1373
    • T
      prctl: Add speculation control prctls · b617cfc8
      Thomas Gleixner 提交于
      Add two new prctls to control aspects of speculation related vulnerabilites
      and their mitigations to provide finer grained control over performance
      impacting mitigations.
      
      PR_GET_SPECULATION_CTRL returns the state of the speculation misfeature
      which is selected with arg2 of prctl(2). The return value uses bit 0-2 with
      the following meaning:
      
      Bit  Define           Description
      0    PR_SPEC_PRCTL    Mitigation can be controlled per task by
                            PR_SET_SPECULATION_CTRL
      1    PR_SPEC_ENABLE   The speculation feature is enabled, mitigation is
                            disabled
      2    PR_SPEC_DISABLE  The speculation feature is disabled, mitigation is
                            enabled
      
      If all bits are 0 the CPU is not affected by the speculation misfeature.
      
      If PR_SPEC_PRCTL is set, then the per task control of the mitigation is
      available. If not set, prctl(PR_SET_SPECULATION_CTRL) for the speculation
      misfeature will fail.
      
      PR_SET_SPECULATION_CTRL allows to control the speculation misfeature, which
      is selected by arg2 of prctl(2) per task. arg3 is used to hand in the
      control value, i.e. either PR_SPEC_ENABLE or PR_SPEC_DISABLE.
      
      The common return values are:
      
      EINVAL  prctl is not implemented by the architecture or the unused prctl()
              arguments are not 0
      ENODEV  arg2 is selecting a not supported speculation misfeature
      
      PR_SET_SPECULATION_CTRL has these additional return values:
      
      ERANGE  arg3 is incorrect, i.e. it's not either PR_SPEC_ENABLE or PR_SPEC_DISABLE
      ENXIO   prctl control of the selected speculation misfeature is disabled
      
      The first supported controlable speculation misfeature is
      PR_SPEC_STORE_BYPASS. Add the define so this can be shared between
      architectures.
      
      Based on an initial patch from Tim Chen and mostly rewritten.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NIngo Molnar <mingo@kernel.org>
      Reviewed-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      b617cfc8
    • K
      x86/bugs: Expose /sys/../spec_store_bypass · c456442c
      Konrad Rzeszutek Wilk 提交于
      Add the sysfs file for the new vulerability. It does not do much except
      show the words 'Vulnerable' for recent x86 cores.
      
      Intel cores prior to family 6 are known not to be vulnerable, and so are
      some Atoms and some Xeon Phi.
      
      It assumes that older Cyrix, Centaur, etc. cores are immune.
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NBorislav Petkov <bp@suse.de>
      Reviewed-by: NIngo Molnar <mingo@kernel.org>
      c456442c
    • P
      kthread, sched/wait: Fix kthread_parkme() completion issue · 85f1abe0
      Peter Zijlstra 提交于
      Even with the wait-loop fixed, there is a further issue with
      kthread_parkme(). Upon hotplug, when we do takedown_cpu(),
      smpboot_park_threads() can return before all those threads are in fact
      blocked, due to the placement of the complete() in __kthread_parkme().
      
      When that happens, sched_cpu_dying() -> migrate_tasks() can end up
      migrating such a still runnable task onto another CPU.
      
      Normally the task will have hit schedule() and gone to sleep by the
      time we do kthread_unpark(), which will then do __kthread_bind() to
      re-bind the task to the correct CPU.
      
      However, when we loose the initial TASK_PARKED store to the concurrent
      wakeup issue described previously, do the complete(), get migrated, it
      is possible to either:
      
       - observe kthread_unpark()'s clearing of SHOULD_PARK and terminate
         the park and set TASK_RUNNING, or
      
       - __kthread_bind()'s wait_task_inactive() to observe the competing
         TASK_RUNNING store.
      
      Either way the WARN() in __kthread_bind() will trigger and fail to
      correctly set the CPU affinity.
      
      Fix this by only issuing the complete() when the kthread has scheduled
      out. This does away with all the icky 'still running' nonsense.
      
      The alternative is to promote TASK_PARKED to a special state, this
      guarantees wait_task_inactive() cannot observe a 'stale' TASK_RUNNING
      and we'll end up doing the right thing, but this preserves the whole
      icky business of potentially migating the still runnable thing.
      Reported-by: NGaurav Kohli <gkohli@codeaurora.org>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      85f1abe0
  15. 29 4月, 2018 1 次提交
    • A
      <linux/stringhash.h>: fix end_name_hash() for 64bit long · 19b9ad67
      Amir Goldstein 提交于
      The comment claims that this helper will try not to loose bits, but for
      64bit long it looses the high bits before hashing 64bit long into 32bit
      int.  Use the helper hash_long() to do the right thing for 64bit long.
      For 32bit long, there is no change.
      
      All the callers of end_name_hash() either assign the result to
      qstr->hash, which is u32 or return the result as an int value (e.g.
      full_name_hash()).  Change the helper return type to int to conform to
      its users.
      
      [ It took me a while to apply this, because my initial reaction to it
        was - incorrectly - that it could make for slower code.
      
        After having looked more at it, I take back all my complaints about
        the patch, Amir was right and I was mis-reading things or just being
        stupid.
      
        I also don't worry too much about the possible performance impact of
        this on 64-bit, since most architectures that actually care about
        performance end up not using this very much (the dcache code is the
        most performance-critical, but the word-at-a-time case uses its own
        hashing anyway).
      
        So this ends up being mostly used for filesystems that do their own
        degraded hashing (usually because they want a case-insensitive
        comparison function).
      
        A _tiny_ worry remains, in that not everybody uses DCACHE_WORD_ACCESS,
        and then this potentially makes things more expensive on 64-bit
        architectures with slow or lacking multipliers even for the normal
        case.
      
        That said, realistically the only such architecture I can think of is
        PA-RISC. Nobody really cares about performance on that, it's more of a
        "look ma, I've got warts^W an odd machine" platform.
      
        So the patch is fine, and all my initial worries were just misplaced
        from not looking at this properly.   - Linus ]
      Signed-off-by: NAmir Goldstein <amir73il@gmail.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      19b9ad67
  16. 27 4月, 2018 2 次提交
  17. 26 4月, 2018 4 次提交
    • O
      blk-mq: fix sysfs inflight counter · bf0ddaba
      Omar Sandoval 提交于
      When the blk-mq inflight implementation was added, /proc/diskstats was
      converted to use it, but /sys/block/$dev/inflight was not. Fix it by
      adding another helper to count in-flight requests by data direction.
      
      Fixes: f299b7c7 ("blk-mq: provide internal in-flight variant")
      Signed-off-by: NOmar Sandoval <osandov@fb.com>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      bf0ddaba
    • T
      Revert: Unify CLOCK_MONOTONIC and CLOCK_BOOTTIME · a3ed0e43
      Thomas Gleixner 提交于
      Revert commits
      
      92af4dcb ("tracing: Unify the "boot" and "mono" tracing clocks")
      127bfa5f ("hrtimer: Unify MONOTONIC and BOOTTIME clock behavior")
      7250a404 ("posix-timers: Unify MONOTONIC and BOOTTIME clock behavior")
      d6c7270e ("timekeeping: Remove boot time specific code")
      f2d6fdbf ("Input: Evdev - unify MONOTONIC and BOOTTIME clock behavior")
      d6ed449a ("timekeeping: Make the MONOTONIC clock behave like the BOOTTIME clock")
      72199320 ("timekeeping: Add the new CLOCK_MONOTONIC_ACTIVE clock")
      
      As stated in the pull request for the unification of CLOCK_MONOTONIC and
      CLOCK_BOOTTIME, it was clear that we might have to revert the change.
      
      As reported by several folks systemd and other applications rely on the
      documented behaviour of CLOCK_MONOTONIC on Linux and break with the above
      changes. After resume daemons time out and other timeout related issues are
      observed. Rafael compiled this list:
      
      * systemd kills daemons on resume, after >WatchdogSec seconds
        of suspending (Genki Sky).  [Verified that that's because systemd uses
        CLOCK_MONOTONIC and expects it to not include the suspend time.]
      
      * systemd-journald misbehaves after resume:
        systemd-journald[7266]: File /var/log/journal/016627c3c4784cd4812d4b7e96a34226/system.journal
      corrupted or uncleanly shut down, renaming and replacing.
        (Mike Galbraith).
      
      * NetworkManager reports "networking disabled" and networking is broken
        after resume 50% of the time (Pavel).  [May be because of systemd.]
      
      * MATE desktop dims the display and starts the screensaver right after
        system resume (Pavel).
      
      * Full system hang during resume (me).  [May be due to systemd or NM or both.]
      
      That happens on debian and open suse systems.
      
      It's sad, that these problems were neither catched in -next nor by those
      folks who expressed interest in this change.
      Reported-by: NRafael J. Wysocki <rjw@rjwysocki.net>
      Reported-by: Genki Sky <sky@genki.is>,
      Reported-by: NPavel Machek <pavel@ucw.cz>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Kevin Easton <kevin@guarana.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mark Salyzyn <salyzyn@android.com>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      a3ed0e43
    • A
      remoteproc: fix crashed parameter logic on stop call · fcd58037
      Arnaud Pouliquen 提交于
      Fix rproc_add_subdev parameter name and inverse the crashed logic.
      
      Fixes: 880f5b38 ("remoteproc: Pass type of shutdown to subdev remove")
      Reviewed-by: NAlex Elder <elder@linaro.org>
      Signed-off-by: NArnaud Pouliquen <arnaud.pouliquen@st.com>
      Signed-off-by: NBjorn Andersson <bjorn.andersson@linaro.org>
      fcd58037
    • M
      virtio: add ability to iterate over vqs · 24a7e4d2
      Michael S. Tsirkin 提交于
      For cleanup it's helpful to be able to simply scan all vqs and discard
      all data. Add an iterator to do that.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
      24a7e4d2
  18. 25 4月, 2018 2 次提交
  19. 24 4月, 2018 3 次提交
  20. 23 4月, 2018 2 次提交