1. 06 Oct, 2018 1 commit
  2. 03 Oct, 2018 3 commits
  3. 20 Sep, 2018 1 commit
    • D
      KVM: x86: Control guest reads of MSR_PLATFORM_INFO · 6fbbde9a
      Drew Schmitt committed
      Add KVM_CAP_MSR_PLATFORM_INFO so that userspace can disable guest access
      to reads of MSR_PLATFORM_INFO.
      
      Disabling access to reads of this MSR gives userspace the control to "expose"
      this platform-dependent information to guests in a clear way. As it exists
      today, guests that read this MSR would get unpopulated information if userspace
      hadn't already set it (and prior to this patch series, only the CPUID faulting
      information could have been populated). This existing interface could be
      confusing if guests don't handle the potential for incorrect/incomplete
      information gracefully (e.g. zero reported for base frequency).
      Signed-off-by: Drew Schmitt <dasch@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      6fbbde9a
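
A minimal sketch of how a userspace VMM might toggle this capability, written in Python for brevity. It assumes the constant values from `<linux/kvm.h>` (`KVM_CAP_MSR_PLATFORM_INFO` as 159, `KVM_ENABLE_CAP` as ioctl number 0xA3 on the `0xAE` type byte), the common x86 ioctl encoding, and that `args[0]` acts as the on/off flag. The helper names `iow`, `pack_enable_cap` and `set_platform_info_reads` are hypothetical and not part of the patch.

```python
import fcntl
import struct

KVMIO = 0xAE                      # ioctl "type" byte used by KVM
KVM_CAP_MSR_PLATFORM_INFO = 159   # assumed value, taken from <linux/kvm.h>

# struct kvm_enable_cap { __u32 cap; __u32 flags; __u64 args[4]; __u8 pad[64]; }
ENABLE_CAP_FMT = "=II4Q64s"


def iow(ioc_type, nr, size):
    """Mirror the kernel's _IOW() macro for the common (x86) ioctl encoding."""
    ioc_write = 1
    return (ioc_write << 30) | (size << 16) | (ioc_type << 8) | nr


KVM_ENABLE_CAP = iow(KVMIO, 0xA3, struct.calcsize(ENABLE_CAP_FMT))


def pack_enable_cap(cap, arg0):
    """Build a struct kvm_enable_cap payload with flags = 0 and args[0] = arg0."""
    return struct.pack(ENABLE_CAP_FMT, cap, 0, arg0, 0, 0, 0, b"")


def set_platform_info_reads(vm_fd, allowed):
    """Hypothetical helper: allow or forbid guest reads of MSR_PLATFORM_INFO.

    vm_fd must be a VM file descriptor obtained via KVM_CREATE_VM.
    """
    payload = pack_enable_cap(KVM_CAP_MSR_PLATFORM_INFO, 1 if allowed else 0)
    fcntl.ioctl(vm_fd, KVM_ENABLE_CAP, payload)
```

A VMM that does not want to expose the MSR would call `set_platform_info_reads(vm_fd, False)` once after creating the VM and before running any vCPU.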
  4. 19 Sep, 2018 1 commit
  5. 17 Sep, 2018 2 commits
  6. 14 Sep, 2018 1 commit
    • M
      xen/balloon: add runtime control for scrubbing ballooned out pages · 197ecb38
      Marek Marczykowski-Górecki committed
      Scrubbing pages on the initial balloon down can take some time, especially
      in the nested virtualization case (nested EPT is slow). When an HVM/PVH
      guest is started with memory= significantly lower than maxmem=, all the
      extra pages are scrubbed before being returned to Xen. But since most of
      them weren't used at all at that point, Xen needs to populate them first
      (from the populate-on-demand pool). In the nested virt case (Xen inside
      KVM) this slows down the guest boot by 15-30s with just 1.5GB needing to
      be returned to Xen.
      
      Add a runtime parameter to enable/disable scrubbing, so that it can be
      disabled initially and then re-enabled during boot (for example in the
      initramfs). Such usage relies on the assumption that a) most pages
      ballooned out during initial boot weren't used at all, and b) even if
      they were, very few secrets are in the guest at that time (before any
      serious userspace kicks in).
      
      Convert CONFIG_XEN_SCRUB_PAGES to CONFIG_XEN_SCRUB_PAGES_DEFAULT (also
      enabled by default), controlling the default value of the new runtime
      switch.
      Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
      Reviewed-by: Juergen Gross <jgross@suse.com>
      Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      197ecb38
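
The "disable at boot, re-enable from the initramfs" usage described above could be scripted roughly as follows. This is only a sketch: the commit message does not name the runtime switch, so the sysfs path below is an assumption, and `set_scrubbing` and `scrubbing_enabled` are hypothetical helper names.

```python
from pathlib import Path

# Assumed sysfs location of the runtime switch; the commit message does not name it.
DEFAULT_SCRUB_PATH = Path("/sys/devices/system/xen_memory/xen_memory0/scrub_pages")


def set_scrubbing(enabled, path=DEFAULT_SCRUB_PATH):
    """Write 1 or 0 to the scrubbing switch (needs root on a real Xen guest)."""
    Path(path).write_text("1\n" if enabled else "0\n")


def scrubbing_enabled(path=DEFAULT_SCRUB_PATH):
    """Return True if the switch currently reads as enabled."""
    return Path(path).read_text().strip() in ("1", "Y", "y")
```

An initramfs hook would call `set_scrubbing(True)` once the initial balloon down has finished, so that pages released later are scrubbed again.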
  7. 12 Sep, 2018 1 commit
  8. 10 Sep, 2018 1 commit
  9. 07 Sep, 2018 1 commit
  10. 03 Sep, 2018 3 commits
  11. 02 Sep, 2018 1 commit
  12. 31 Aug, 2018 1 commit
  13. 30 Aug, 2018 5 commits
  14. 29 Aug, 2018 3 commits
  15. 28 Aug, 2018 2 commits
  16. 27 Aug, 2018 1 commit
  17. 24 Aug, 2018 9 commits
  18. 23 Aug, 2018 3 commits
    • M
      ipc: reorganize initialization of kern_ipc_perm.seq · e2652ae6
      Manfred Spraul committed
      ipc_addid() initializes kern_ipc_perm.seq after having called idr_alloc()
      (within ipc_idr_alloc()).
      
      Thus a parallel semop() or msgrcv() that uses ipc_obtain_object_check()
      may see an uninitialized value.
      
      The patch moves the initialization of kern_ipc_perm.seq before the calls
      of idr_alloc().
      
      Notes:
      1) This patch has a user space visible side effect:
      If /proc/sys/kernel/*_next_id is used (i.e.: checkpoint/restore) and
      if semget()/msgget()/shmget() fails in the final step of adding the id
      to the rhash tree, then .._next_id is cleared. Before the patch, it
      remained unmodified.
      
      There is no change of the behavior after a successful ..get() call: It
      always clears .._next_id, there is no impact to non checkpoint/restore
      code as that code does not use .._next_id.
      
      2) The patch correctly documents that after a call to ipc_idr_alloc(),
      the full tear-down sequence must be used. The callers of ipc_addid()
      do not fulfill that, i.e. more bugfixes are required.
      
      The patch is a squash of a patch from Dmitry and my own changes.
      
      Link: http://lkml.kernel.org/r/20180712185241.4017-3-manfred@colorfullife.com
      Reported-by: syzbot+2827ef6b3385deb07eaf@syzkaller.appspotmail.com
      Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Davidlohr Bueso <dbueso@suse.de>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      e2652ae6
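
The ordering this patch establishes (initialize `seq`, then publish) can be illustrated with a toy, single-threaded model. It is not the kernel code: the classes are hypothetical, a plain dict stands in for the IDR, and the id layout (`SEQ_MULTIPLIER * seq + index`, with a multiplier of 32768) is an assumption about kernels of this era. The model does not reproduce the race itself, only the order of steps that closes it.

```python
SEQ_MULTIPLIER = 32768  # assumed to equal IPCMNI in kernels of this era


class KernIpcPerm:
    """Toy stand-in for struct kern_ipc_perm."""

    def __init__(self):
        self.seq = None  # None models "not yet initialized"
        self.id = None


class IpcIds:
    """Toy stand-in for struct ipc_ids; a dict replaces the IDR."""

    def __init__(self):
        self.seq = 0
        self.table = {}

    def addid(self, obj):
        # 1) Initialize the sequence number first ...
        obj.seq = self.seq
        self.seq += 1
        # 2) ... and only then publish the object where lookups can find it.
        idx = len(self.table)
        self.table[idx] = obj
        obj.id = SEQ_MULTIPLIER * obj.seq + idx
        return obj.id

    def obtain_object_check(self, ipc_id):
        """Look up by index, then reject ids whose sequence part is stale."""
        obj = self.table.get(ipc_id % SEQ_MULTIPLIER)
        if obj is None or ipc_id // SEQ_MULTIPLIER != obj.seq:
            return None
        return obj
```

Because `seq` is set before the object enters the table, any lookup that can find the object also sees a valid `seq` for the sequence check.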
    • D
      kernel/hung_task.c: allow to set checking interval separately from timeout · a2e51445
      Dmitry Vyukov committed
      Currently the task hung checking interval is equal to the timeout; as a
      result, a hang is detected anywhere between timeout and 2*timeout.  This
      is fine for most interactive environments, but it hurts automated testing
      setups (syzbot).  In an automated setup we need to strictly order CPU
      lockup < RCU stall < workqueue lockup < task hung < silent loss, so that
      an RCU stall is not detected as task hung and task hung is not detected as
      silent machine loss.  The large variance in task hung detection time
      requires setting the silent machine loss timeout to a very large value
      (e.g.  if task hung is 3 mins, then silent loss needs to be set to ~7
      mins).  The additional 3 minutes significantly reduce testing efficiency
      because usually we crash the kernel within a minute, and this can add
      hours to the bug localization process, as it needs to do dozens of tests.
      
      Allow setting the checking interval separately from the timeout.  This
      allows setting the timeout to, say, 3 minutes, but the checking interval to
      10 secs.
      
      The interval is controlled via a new hung_task_check_interval_secs sysctl,
      similar to the existing hung_task_timeout_secs sysctl.  The default value
      of 0 results in the current behavior: checking interval is equal to
      timeout.
      
      [akpm@linux-foundation.org: update hung_task_timeout_max's comment]
      Link: http://lkml.kernel.org/r/20180611111004.203513-1-dvyukov@google.com
      Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      a2e51445
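
The effect on detection latency can be worked through with a small model. This is a simplified sketch, not the kernel logic: it assumes a hang is reported at the first check that runs after the task has been blocked for the full timeout, and `detection_window` is a hypothetical helper name.

```python
def detection_window(timeout_secs, check_interval_secs=0):
    """Return (earliest, latest) seconds after which a hang is reported.

    A check interval of 0 means "same as the timeout", matching the default
    described in the commit message.
    """
    interval = check_interval_secs if check_interval_secs else timeout_secs
    earliest = timeout_secs              # a check fires right as the timeout expires
    latest = timeout_secs + interval     # the timeout expires just after a check ran
    return earliest, latest


# Old behavior: 3 minute timeout, interval equal to the timeout.
print(detection_window(180))        # (180, 360)
# New behavior: 3 minute timeout, 10 second checking interval.
print(detection_window(180, 10))    # (180, 190)
```

Under this model the spread shrinks from a full timeout to one checking interval, which is what lets the silent-loss timeout sit much closer to the task hung timeout.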
    • D
      /proc/meminfo: add percpu populated pages count · 7e8a6304
      Dennis Zhou (Facebook) committed
      Currently, percpu memory only exposes allocation and utilization
      information via debugfs.  This more or less is only really useful for
      understanding the fragmentation and allocation information at a per-chunk
      level with a few global counters.  This is also gated behind a config.
      BPF and cgroup, for example, have seen an increase in use causing
      increased use of percpu memory.  Let's make it easier for someone to
      identify how much memory is being used.
      
      This patch adds the "Percpu" stat to meminfo to more easily look up how
      much percpu memory is in use.  This number includes the cost for all
      allocated backing pages and not just insight at the per-unit, per-chunk
      level.  Metadata is excluded.  I think excluding metadata is fair because
      the backing memory scales with the number of cpus and can quickly
      outweigh the metadata.  It also makes this calculation light.
      
      Link: http://lkml.kernel.org/r/20180807184723.74919-1-dennisszhou@gmail.com
      Signed-off-by: Dennis Zhou <dennisszhou@gmail.com>
      Acked-by: Tejun Heo <tj@kernel.org>
      Acked-by: Roman Gushchin <guro@fb.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: David Rientjes <rientjes@google.com>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      7e8a6304
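
Reading the new field from userspace might look like the sketch below. It assumes the usual `Key:   value kB` layout of `/proc/meminfo`; `percpu_kb` is a hypothetical helper name, and it returns `None` on kernels that predate this patch.

```python
def percpu_kb(meminfo_text):
    """Return the "Percpu" value in kB from /proc/meminfo text, or None if absent."""
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        if key.strip() == "Percpu":
            # The remainder looks like "     1234 kB"; the first token is the number.
            return int(rest.split()[0])
    return None
```

On a live system this would be called as `percpu_kb(open("/proc/meminfo").read())`.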