1. 09 8月, 2019 3 次提交
  2. 05 8月, 2019 6 次提交
    • P
      x86: kvm: remove useless calls to kvm_para_available · 57b76bdb
      Paolo Bonzini 提交于
      Most code in arch/x86/kernel/kvm.c is called through x86_hyper_kvm, and thus only
      runs if KVM has been detected.  There is no need to check again for the CPUID
      base.
      
      Cc: Sergio Lopez <slp@redhat.com>
      Cc: Jan Kiszka <jan.kiszka@siemens.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      57b76bdb
    • G
      KVM: no need to check return value of debugfs_create functions · 3e7093d0
      Greg KH 提交于
      When calling debugfs functions, there is no need to ever check the
      return value.  The function can work or not, but the code logic should
      never do something different based on this.
      
      Also, when doing this, change kvm_arch_create_vcpu_debugfs() to return
      void instead of an integer, as we should not care at all about if this
      function actually does anything or not.
      
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: <x86@kernel.org>
      Cc: <kvm@vger.kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      3e7093d0
    • P
      KVM: remove kvm_arch_has_vcpu_debugfs() · 741cbbae
      Paolo Bonzini 提交于
      There is no need for this function as all arches have to implement
      kvm_arch_create_vcpu_debugfs() no matter what.  A #define symbol
      let us actually simplify the code.
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      741cbbae
    • W
      KVM: Fix leak vCPU's VMCS value into other pCPU · 17e433b5
      Wanpeng Li 提交于
      After commit d73eb57b (KVM: Boost vCPUs that are delivering interrupts), a
      five years old bug is exposed. Running ebizzy benchmark in three 80 vCPUs VMs
      on one 80 pCPUs Skylake server, a lot of rcu_sched stall warning splatting
      in the VMs after stress testing:
      
       INFO: rcu_sched detected stalls on CPUs/tasks: { 4 41 57 62 77} (detected by 15, t=60004 jiffies, g=899, c=898, q=15073)
       Call Trace:
         flush_tlb_mm_range+0x68/0x140
         tlb_flush_mmu.part.75+0x37/0xe0
         tlb_finish_mmu+0x55/0x60
         zap_page_range+0x142/0x190
         SyS_madvise+0x3cd/0x9c0
         system_call_fastpath+0x1c/0x21
      
      swait_active() sustains to be true before finish_swait() is called in
      kvm_vcpu_block(), voluntarily preempted vCPUs are taken into account
      by kvm_vcpu_on_spin() loop greatly increases the probability condition
      kvm_arch_vcpu_runnable(vcpu) is checked and can be true, when APICv
      is enabled the yield-candidate vCPU's VMCS RVI field leaks(by
      vmx_sync_pir_to_irr()) into spinning-on-a-taken-lock vCPU's current
      VMCS.
      
      This patch fixes it by checking conservatively a subset of events.
      
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Marc Zyngier <Marc.Zyngier@arm.com>
      Cc: stable@vger.kernel.org
      Fixes: 98f4a146 (KVM: add kvm_arch_vcpu_runnable() test to kvm_vcpu_on_spin() loop)
      Signed-off-by: NWanpeng Li <wanpengli@tencent.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      17e433b5
    • W
      KVM: Check preempted_in_kernel for involuntary preemption · 046ddeed
      Wanpeng Li 提交于
      preempted_in_kernel is updated in preempt_notifier when involuntary preemption
      ocurrs, it can be stale when the voluntarily preempted vCPUs are taken into
      account by kvm_vcpu_on_spin() loop. This patch lets it just check preempted_in_kernel
      for involuntary preemption.
      
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: NWanpeng Li <wanpengli@tencent.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      046ddeed
    • W
      KVM: LAPIC: Don't need to wakeup vCPU twice afer timer fire · a48d06f9
      Wanpeng Li 提交于
      kvm_set_pending_timer() will take care to wake up the sleeping vCPU which
      has pending timer, don't need to check this in apic_timer_expired() again.
      
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: NWanpeng Li <wanpengli@tencent.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      a48d06f9
  3. 24 7月, 2019 2 次提交
    • W
      KVM: X86: Boost queue head vCPU to mitigate lock waiter preemption · 266e85a5
      Wanpeng Li 提交于
      Commit 11752adb (locking/pvqspinlock: Implement hybrid PV queued/unfair locks)
      introduces hybrid PV queued/unfair locks
       - queued mode (no starvation)
       - unfair mode (good performance on not heavily contended lock)
      The lock waiter goes into the unfair mode especially in VMs with over-commit
      vCPUs since increaing over-commitment increase the likehood that the queue
      head vCPU may have been preempted and not actively spinning.
      
      However, reschedule queue head vCPU timely to acquire the lock still can get
      better performance than just depending on lock stealing in over-subscribe
      scenario.
      
      Testing on 80 HT 2 socket Xeon Skylake server, with 80 vCPUs VM 80GB RAM:
      ebizzy -M
                   vanilla     boosting    improved
       1VM          23520        25040         6%
       2VM           8000        13600        70%
       3VM           3100         5400        74%
      
      The lock holder vCPU yields to the queue head vCPU when unlock, to boost queue
      head vCPU which is involuntary preemption or the one which is voluntary halt
      due to fail to acquire the lock after a short spin in the guest.
      
      Cc: Waiman Long <longman@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: NWanpeng Li <wanpengli@tencent.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      266e85a5
    • C
      Documentation: move Documentation/virtual to Documentation/virt · 2f5947df
      Christoph Hellwig 提交于
      Renaming docs seems to be en vogue at the moment, so fix on of the
      grossly misnamed directories.  We usually never use "virtual" as
      a shortcut for virtualization in the kernel, but always virt,
      as seen in the virt/ top-level directory.  Fix up the documentation
      to match that.
      
      Fixes: ed16648e ("Move kvm, uml, and lguest subdirectories under a common "virtual" directory, I.E:")
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      2f5947df
  4. 22 7月, 2019 5 次提交
  5. 20 7月, 2019 9 次提交
    • E
      KVM: x86: Add fixed counters to PMU filter · 30cd8604
      Eric Hankland 提交于
      Updates KVM_CAP_PMU_EVENT_FILTER so it can also whitelist or blacklist
      fixed counters.
      Signed-off-by: NEric Hankland <ehankland@google.com>
      [No need to check padding fields for zero. - Paolo]
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      30cd8604
    • P
      KVM: nVMX: do not use dangling shadow VMCS after guest reset · 88dddc11
      Paolo Bonzini 提交于
      If a KVM guest is reset while running a nested guest, free_nested will
      disable the shadow VMCS execution control in the vmcs01.  However,
      on the next KVM_RUN vmx_vcpu_run would nevertheless try to sync
      the VMCS12 to the shadow VMCS which has since been freed.
      
      This causes a vmptrld of a NULL pointer on my machime, but Jan reports
      the host to hang altogether.  Let's see how much this trivial patch fixes.
      Reported-by: NJan Kiszka <jan.kiszka@siemens.com>
      Cc: Liran Alon <liran.alon@oracle.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      88dddc11
    • P
      KVM: VMX: dump VMCS on failed entry · 3b20e03a
      Paolo Bonzini 提交于
      This is useful for debugging, and is ratelimited nowadays.
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      3b20e03a
    • L
      KVM: x86/vPMU: refine kvm_pmu err msg when event creation failed · 6fc3977c
      Like Xu 提交于
      If a perf_event creation fails due to any reason of the host perf
      subsystem, it has no chance to log the corresponding event for guest
      which may cause abnormal sampling data in guest result. In debug mode,
      this message helps to understand the state of vPMC and we may not
      limit the number of occurrences but not in a spamming style.
      Suggested-by: NJoe Perches <joe@perches.com>
      Signed-off-by: NLike Xu <like.xu@linux.intel.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      6fc3977c
    • W
      KVM: s390: Use kvm_vcpu_wake_up in kvm_s390_vcpu_wakeup · d9847409
      Wanpeng Li 提交于
      Use kvm_vcpu_wake_up() in kvm_s390_vcpu_wakeup().
      Suggested-by: NPaolo Bonzini <pbonzini@redhat.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: NWanpeng Li <wanpengli@tencent.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      d9847409
    • W
      KVM: Boost vCPUs that are delivering interrupts · d73eb57b
      Wanpeng Li 提交于
      Inspired by commit 9cac38dd (KVM/s390: Set preempted flag during
      vcpu wakeup and interrupt delivery), we want to also boost not just
      lock holders but also vCPUs that are delivering interrupts. Most
      smp_call_function_many calls are synchronous, so the IPI target vCPUs
      are also good yield candidates.  This patch introduces vcpu->ready to
      boost vCPUs during wakeup and interrupt delivery time; unlike s390 we do
      not reuse vcpu->preempted so that voluntarily preempted vCPUs are taken
      into account by kvm_vcpu_on_spin, but vmx_vcpu_pi_put is not affected
      (VT-d PI handles voluntary preemption separately, in pi_pre_block).
      
      Testing on 80 HT 2 socket Xeon Skylake server, with 80 vCPUs VM 80GB RAM:
      ebizzy -M
      
                  vanilla     boosting    improved
      1VM          21443       23520         9%
      2VM           2800        8000       180%
      3VM           1800        3100        72%
      
      Testing on my Haswell desktop 8 HT, with 8 vCPUs VM 8GB RAM, two VMs,
      one running ebizzy -M, the other running 'stress --cpu 2':
      
      w/ boosting + w/o pv sched yield(vanilla)
      
                  vanilla     boosting   improved
                    1570         4000      155%
      
      w/ boosting + w/ pv sched yield(vanilla)
      
                  vanilla     boosting   improved
                    1844         5157      179%
      
      w/o boosting, perf top in VM:
      
       72.33%  [kernel]       [k] smp_call_function_many
        4.22%  [kernel]       [k] call_function_i
        3.71%  [kernel]       [k] async_page_fault
      
      w/ boosting, perf top in VM:
      
       38.43%  [kernel]       [k] smp_call_function_many
        6.31%  [kernel]       [k] async_page_fault
        6.13%  libc-2.23.so   [.] __memcpy_avx_unaligned
        4.88%  [kernel]       [k] call_function_interrupt
      
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Paul Mackerras <paulus@ozlabs.org>
      Cc: Marc Zyngier <maz@kernel.org>
      Signed-off-by: NWanpeng Li <wanpengli@tencent.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      d73eb57b
    • T
      KVM: selftests: Remove superfluous define from vmx.c · 2417c870
      Thomas Huth 提交于
      The code in vmx.c does not use "program_invocation_name", so there
      is no need to "#define _GNU_SOURCE" here.
      Signed-off-by: NThomas Huth <thuth@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      2417c870
    • L
      KVM: SVM: Fix detection of AMD Errata 1096 · 118154bd
      Liran Alon 提交于
      When CPU raise #NPF on guest data access and guest CR4.SMAP=1, it is
      possible that CPU microcode implementing DecodeAssist will fail
      to read bytes of instruction which caused #NPF. This is AMD errata
      1096 and it happens because CPU microcode reading instruction bytes
      incorrectly attempts to read code as implicit supervisor-mode data
      accesses (that is, just like it would read e.g. a TSS), which are
      susceptible to SMAP faults. The microcode reads CS:RIP and if it is
      a user-mode address according to the page tables, the processor
      gives up and returns no instruction bytes.  In this case,
      GuestIntrBytes field of the VMCB on a VMEXIT will incorrectly
      return 0 instead of the correct guest instruction bytes.
      
      Current KVM code attemps to detect and workaround this errata, but it
      has multiple issues:
      
      1) It mistakenly checks if guest CR4.SMAP=0 instead of guest CR4.SMAP=1,
      which is required for encountering a SMAP fault.
      
      2) It assumes SMAP faults can only occur when guest CPL==3.
      However, in case guest CR4.SMEP=0, the guest can execute an instruction
      which reside in a user-accessible page with CPL<3 priviledge. If this
      instruction raise a #NPF on it's data access, then CPU DecodeAssist
      microcode will still encounter a SMAP violation.  Even though no sane
      OS will do so (as it's an obvious priviledge escalation vulnerability),
      we still need to handle this semanticly correct in KVM side.
      
      Note that (2) *is* a useful optimization, because CR4.SMAP=1 is an easy
      triggerable condition and guests usually enable SMAP together with SMEP.
      If the vCPU has CR4.SMEP=1, the errata could indeed be encountered onlt
      at guest CPL==3; otherwise, the CPU would raise a SMEP fault to guest
      instead of #NPF.  We keep this condition to avoid false positives in
      the detection of the errata.
      
      In addition, to avoid future confusion and improve code readbility,
      include details of the errata in code and not just in commit message.
      
      Fixes: 05d5a486 ("KVM: SVM: Workaround errata#1096 (insn_len maybe zero on SMAP violation)")
      Cc: Singh Brijesh <brijesh.singh@amd.com>
      Cc: Sean Christopherson <sean.j.christopherson@intel.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Reviewed-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: NLiran Alon <liran.alon@oracle.com>
      Reviewed-by: NBrijesh Singh <brijesh.singh@amd.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      118154bd
    • W
      KVM: LAPIC: Inject timer interrupt via posted interrupt · 0c5f81da
      Wanpeng Li 提交于
      Dedicated instances are currently disturbed by unnecessary jitter due
      to the emulated lapic timers firing on the same pCPUs where the
      vCPUs reside.  There is no hardware virtual timer on Intel for guest
      like ARM, so both programming timer in guest and the emulated timer fires
      incur vmexits.  This patch tries to avoid vmexit when the emulated timer
      fires, at least in dedicated instance scenario when nohz_full is enabled.
      
      In that case, the emulated timers can be offload to the nearest busy
      housekeeping cpus since APICv has been found for several years in server
      processors. The guest timer interrupt can then be injected via posted interrupts,
      which are delivered by the housekeeping cpu once the emulated timer fires.
      
      The host should tuned so that vCPUs are placed on isolated physical
      processors, and with several pCPUs surplus for busy housekeeping.
      If disabled mwait/hlt/pause vmexits keep the vCPUs in non-root mode,
      ~3% redis performance benefit can be observed on Skylake server, and the
      number of external interrupt vmexits drops substantially.  Without patch
      
                  VM-EXIT  Samples  Samples%  Time%   Min Time  Max Time   Avg time
      EXTERNAL_INTERRUPT    42916    49.43%   39.30%   0.47us   106.09us   0.71us ( +-   1.09% )
      
      While with patch:
      
                  VM-EXIT  Samples  Samples%  Time%   Min Time  Max Time         Avg time
      EXTERNAL_INTERRUPT    6871     9.29%     2.96%   0.44us    57.88us   0.72us ( +-   4.02% )
      
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Signed-off-by: NWanpeng Li <wanpengli@tencent.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      0c5f81da
  6. 18 7月, 2019 1 次提交
    • W
      KVM: LAPIC: Make lapic timer unpinned · 4d151bf3
      Wanpeng Li 提交于
      Commit 61abdbe0 ("kvm: x86: make lapic hrtimer pinned") pinned the
      lapic timer to avoid to wait until the next kvm exit for the guest to
      see KVM_REQ_PENDING_TIMER set. There is another solution to give a kick
      after setting the KVM_REQ_PENDING_TIMER bit, make lapic timer unpinned
      will be used in follow up patches.
      
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Signed-off-by: NWanpeng Li <wanpengli@tencent.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      4d151bf3
  7. 17 7月, 2019 1 次提交
  8. 16 7月, 2019 2 次提交
  9. 15 7月, 2019 6 次提交
    • Y
      kvm: x86: some tsc debug cleanup · 9a5611af
      Yi Wang 提交于
      There are some pr_debug in TSC code, which may have
      been no use, so remove them as Paolo suggested.
      Signed-off-by: NYi Wang <wang.yi59@zte.com.cn>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      9a5611af
    • Y
      kvm: vmx: fix coccinelle warnings · 9481b7f1
      Yi Wang 提交于
      This fixes the following coccinelle warning:
      
      WARNING: return of 0/1 in function 'vmx_need_emulation_on_page_fault'
      with return type bool
      
      Return false instead of 0.
      Signed-off-by: NYi Wang <wang.yi59@zte.com.cn>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      9481b7f1
    • P
      Merge tag 'kvm-s390-next-5.3-1' of... · fd4198bf
      Paolo Bonzini 提交于
      Merge tag 'kvm-s390-next-5.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD
      
      KVM: s390: add kselftests
      
      This is the initial implementation for KVM selftests on s390.
      fd4198bf
    • A
      x86: kvm: avoid constant-conversion warning · a6a6d3b1
      Arnd Bergmann 提交于
      clang finds a contruct suspicious that converts an unsigned
      character to a signed integer and back, causing an overflow:
      
      arch/x86/kvm/mmu.c:4605:39: error: implicit conversion from 'int' to 'u8' (aka 'unsigned char') changes value from -205 to 51 [-Werror,-Wconstant-conversion]
                      u8 wf = (pfec & PFERR_WRITE_MASK) ? ~w : 0;
                         ~~                               ^~
      arch/x86/kvm/mmu.c:4607:38: error: implicit conversion from 'int' to 'u8' (aka 'unsigned char') changes value from -241 to 15 [-Werror,-Wconstant-conversion]
                      u8 uf = (pfec & PFERR_USER_MASK) ? ~u : 0;
                         ~~                              ^~
      arch/x86/kvm/mmu.c:4609:39: error: implicit conversion from 'int' to 'u8' (aka 'unsigned char') changes value from -171 to 85 [-Werror,-Wconstant-conversion]
                      u8 ff = (pfec & PFERR_FETCH_MASK) ? ~x : 0;
                         ~~                               ^~
      
      Add an explicit cast to tell clang that everything works as
      intended here.
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Link: https://github.com/ClangBuiltLinux/linux/issues/95Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      a6a6d3b1
    • A
      x86: kvm: avoid -Wsometimes-uninitized warning · f4e4805e
      Arnd Bergmann 提交于
      Clang notices a code path in which some variables are never
      initialized, but fails to figure out that this can never happen
      on i386 because is_64_bit_mode() always returns false.
      
      arch/x86/kvm/hyperv.c:1610:6: error: variable 'ingpa' is used uninitialized whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized]
              if (!longmode) {
                  ^~~~~~~~~
      arch/x86/kvm/hyperv.c:1632:55: note: uninitialized use occurs here
              trace_kvm_hv_hypercall(code, fast, rep_cnt, rep_idx, ingpa, outgpa);
                                                                   ^~~~~
      arch/x86/kvm/hyperv.c:1610:2: note: remove the 'if' if its condition is always true
              if (!longmode) {
              ^~~~~~~~~~~~~~~
      arch/x86/kvm/hyperv.c:1595:18: note: initialize the variable 'ingpa' to silence this warning
              u64 param, ingpa, outgpa, ret = HV_STATUS_SUCCESS;
                              ^
                               = 0
      arch/x86/kvm/hyperv.c:1610:6: error: variable 'outgpa' is used uninitialized whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized]
      arch/x86/kvm/hyperv.c:1610:6: error: variable 'param' is used uninitialized whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized]
      
      Flip the condition around to avoid the conditional execution on i386.
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      f4e4805e
    • J
      KVM: x86: expose AVX512_BF16 feature to guest · 0b774629
      Jing Liu 提交于
      AVX512 BFLOAT16 instructions support 16-bit BFLOAT16 floating-point
      format (BF16) for deep learning optimization.
      
      Intel adds AVX512 BFLOAT16 feature in CooperLake, which is CPUID.7.1.EAX[5].
      
      Detailed information of the CPUID bit can be found here,
      https://software.intel.com/sites/default/files/managed/c5/15/\
      architecture-instruction-set-extensions-programming-reference.pdf.
      Signed-off-by: NJing Liu <jing2.liu@linux.intel.com>
      [Fix type mismatch in min, changing constant "1" to "1u". - Paolo]
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      0b774629
  10. 13 7月, 2019 5 次提交
    • L
      Merge tag 'dlm-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm · 964a4eac
      Linus Torvalds 提交于
      Pull dlm updates from David Teigland:
       "This set removes some unnecessary debugfs error handling, and checks
        that lowcomms workqueues are not NULL before destroying"
      
      * tag 'dlm-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm:
        dlm: no need to check return value of debugfs_create functions
        dlm: check if workqueues are NULL before flushing/destroying
      964a4eac
    • L
      Merge tag '9p-for-5.3' of git://github.com/martinetd/linux · 23bbbf5c
      Linus Torvalds 提交于
      Pull 9p updates from Dominique Martinet:
       "Two small fixes to properly cleanup the 9p transports list if
        virtio/xen module initialization fail.
      
        9p might otherwise try to access memory from a module that failed to
        register got freed"
      
      * tag '9p-for-5.3' of git://github.com/martinetd/linux:
        9p/xen: Add cleanup path in p9_trans_xen_init
        9p/virtio: Add cleanup path in p9_virtio_init
      23bbbf5c
    • L
      Merge tag 'f2fs-for-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs · a641a88e
      Linus Torvalds 提交于
      Pull f2fs updates from Jaegeuk Kim:
       "In this round, we've introduced native swap file support which can
        exploit DIO, enhanced existing checkpoint=disable feature with
        additional mount option to tune the triggering condition, and allowed
        user to preallocate physical blocks in a pinned file which will be
        useful to avoid f2fs fragmentation in append-only workloads. In
        addition, we've fixed subtle quota corruption issue.
      
        Enhancements:
         - add swap file support which uses DIO
         - allocate blocks for pinned file
         - allow SSR and mount option to enhance checkpoint=disable
         - enhance IPU IOs
         - add more sanity checks such as memory boundary access
      
        Bug fixes:
         - quota corruption in very corner case of error-injected SPO case
         - fix root_reserved on remount and some wrong counts
         - add missing fsck flag
      
        Some patches were also introduced to clean up ambiguous i_flags and
        debugging messages codes"
      
      * tag 'f2fs-for-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (33 commits)
        f2fs: improve print log in f2fs_sanity_check_ckpt()
        f2fs: avoid out-of-range memory access
        f2fs: fix to avoid long latency during umount
        f2fs: allow all the users to pin a file
        f2fs: support swap file w/ DIO
        f2fs: allocate blocks for pinned file
        f2fs: fix is_idle() check for discard type
        f2fs: add a rw_sem to cover quota flag changes
        f2fs: set SBI_NEED_FSCK for xattr corruption case
        f2fs: use generic EFSBADCRC/EFSCORRUPTED
        f2fs: Use DIV_ROUND_UP() instead of open-coding
        f2fs: print kernel message if filesystem is inconsistent
        f2fs: introduce f2fs_<level> macros to wrap f2fs_printk()
        f2fs: avoid get_valid_blocks() for cleanup
        f2fs: ioctl for removing a range from F2FS
        f2fs: only set project inherit bit for directory
        f2fs: separate f2fs i_flags from fs_flags and ext4 i_flags
        f2fs: replace ktype default_attrs with default_groups
        f2fs: Add option to limit required GC for checkpoint=disable
        f2fs: Fix accounting for unusable blocks
        ...
      a641a88e
    • L
      Merge tag 'xfs-5.3-merge-12' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux · 4ce9d181
      Linus Torvalds 提交于
      Pull xfs updates from Darrick Wong:
       "In this release there are a significant amounts of consolidations and
        cleanups in the log code; restructuring of the log to issue struct
        bios directly; new bulkstat ioctls to return v5 fs inode information
        (and fix all the padding problems of the old ioctl); the beginnings of
        multithreaded inode walks (e.g. quotacheck); and a reduction in memory
        usage in the online scrub code leading to reduced runtimes.
      
         - Refactor inode geometry calculation into a single structure instead
           of open-coding pieces everywhere.
      
         - Add online repair to build options.
      
         - Remove unnecessary function call flags and functions.
      
         - Claim maintainership of various loose xfs documentation and header
           files.
      
         - Use struct bio directly for log buffer IOs instead of struct
           xfs_buf.
      
         - Reduce log item boilerplate code requirements.
      
         - Merge log item code spread across too many files.
      
         - Further distinguish between log item commits and cancellations.
      
         - Various small cleanups to the ag small allocator.
      
         - Support cgroup-aware writeback
      
         - libxfs refactoring for mkfs cleanup
      
         - Remove unneeded #includes
      
         - Fix a memory allocation miscalculation in the new log bio code
      
         - Fix bisection problems
      
         - Fix a crash in ioend processing caused by tripping over freeing of
           preallocated transactions
      
         - Split out a generic inode walk mechanism from the bulkstat code,
           hook up all the internal users to use the walking code, then clean
           up bulkstat to serve only the bulkstat ioctls.
      
         - Add a multithreaded iwalk implementation to speed up quotacheck on
           fast storage with many CPUs.
      
         - Remove unnecessary return values in logging teardown functions.
      
         - Supplement the bstat and inogrp structures with new bulkstat and
           inumbers structures that have all the fields we need for v5
           filesystem features and none of the padding problems of their
           predecessors.
      
         - Wire up new ioctls that use the new structures with a much simpler
           bulk_ireq structure at the head instead of the pointerhappy mess we
           had before.
      
         - Enable userspace to constrain bulkstat returns to a single AG or a
           single special inode so that we can phase out a lot of geometry
           guesswork in userspace.
      
         - Reduce memory consumption and zeroing overhead in extended
           attribute scrub code.
      
         - Fix some behavioral regressions in the new bulkstat backend code.
      
         - Fix some behavioral regressions in the new log bio code"
      
      * tag 'xfs-5.3-merge-12' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (100 commits)
        xfs: chain bios the right way around in xfs_rw_bdev
        xfs: bump INUMBERS cursor correctly in xfs_inumbers_walk
        xfs: don't update lastino for FSBULKSTAT_SINGLE
        xfs: online scrub needn't bother zeroing its temporary buffer
        xfs: only allocate memory for scrubbing attributes when we need it
        xfs: refactor attr scrub memory allocation function
        xfs: refactor extended attribute buffer pointer functions
        xfs: attribute scrub should use seen_enough to pass error values
        xfs: allow single bulkstat of special inodes
        xfs: specify AG in bulk req
        xfs: wire up the v5 inumbers ioctl
        xfs: wire up new v5 bulkstat ioctls
        xfs: introduce v5 inode group structure
        xfs: introduce new v5 bulkstat structure
        xfs: rename bulkstat functions
        xfs: remove various bulk request typedef usage
        fs: xfs: xfs_log: Change return type from int to void
        xfs: poll waiting for quotacheck
        xfs: multithreaded iwalk implementation
        xfs: refactor INUMBERS to use iwalk functions
        ...
      4ce9d181
    • L
      Merge tag 'vfs-fix-ioctl-checking-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux · 5010fe9f
      Linus Torvalds 提交于
      Pull common SETFLAGS/FSSETXATTR parameter checking from Darrick Wong:
       "Here's a patch series that sets up common parameter checking functions
        for the FS_IOC_SETFLAGS and FS_IOC_FSSETXATTR ioctl implementations.
      
        The goal here is to reduce the amount of behaviorial variance between
        the filesystems where those ioctls originated (ext2 and XFS,
        respectively) and everybody else.
      
         - Standardize parameter checking for the SETFLAGS and FSSETXATTR
           ioctls (which were the file attribute setters for ext4 and xfs and
           have now been hoisted to the vfs)
      
         - Only allow the DAX flag to be set on files and directories"
      
      * tag 'vfs-fix-ioctl-checking-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
        vfs: only allow FSSETXATTR to set DAX flag on files and dirs
        vfs: teach vfs_ioc_fssetxattr_check to check extent size hints
        vfs: teach vfs_ioc_fssetxattr_check to check project id info
        vfs: create a generic checking function for FS_IOC_FSSETXATTR
        vfs: create a generic checking and prep function for FS_IOC_SETFLAGS
      5010fe9f