1. 27 May, 2019 9 commits
    •
      iommu: Add API to request DMA domain for device · 7423e017
      Lu Baolu committed
      Normally, while the IOMMU probes a device, a default domain is
      allocated and attached to the device. The domain type of
      the default domain is statically defined, which can result in a
      situation where the allocated default domain isn't suitable
      for the device due to some limitations. We already have the API
      iommu_request_dm_for_dev() to replace a DMA domain with an
      identity one. This adds iommu_request_dma_domain_for_dev()
      to request a DMA domain if an allocated identity domain isn't
      suitable for the device in question.
      Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
    •
      iommu/vt-d: Add debugfs support to show scalable mode DMAR table internals · dd5142ca
      Sai Praneeth Prakhya committed
      A DMAR table walk would typically follow the below process.
      1. Bus number is used to index into root table which points to a context
         table.
      2. Device number and Function number are used together to index into
         context table which then points to a pasid directory.
      3. PASID[19:6] is used to index into PASID directory which points to a
         PASID table.
      4. PASID[5:0] is used to index into PASID table which points to all levels
         of page tables.
      
      Whenever a user opens the file
      "/sys/kernel/debug/iommu/intel/dmar_translation_struct", the above
      described DMAR table walk is performed and the contents of the table are
      dumped into the file. The dump could be handy while dealing with devices
      that use PASID.
      
      Example of such dump:
      cat /sys/kernel/debug/iommu/intel/dmar_translation_struct
      
      (Please note that because of 80 char limit, entries that should have been
      in the same line are broken into different lines)
      
      IOMMU dmar0: Root Table Address: 0x436f7c000
      B.D.F	Root_entry				Context_entry
      PASID	PASID_table_entry
      00:0a.0	0x0000000000000000:0x000000044dd3f001	0x0000000000100000:0x0000000435460e1d
      0	0x000000044d6e1089:0x0000000000000003:0x0000000000000001
      00:0a.0	0x0000000000000000:0x000000044dd3f001	0x0000000000100000:0x0000000435460e1d
      1	0x0000000000000049:0x0000000000000001:0x0000000003c0e001
      
      Note that the above format is followed even for a legacy DMAR table dump,
      which doesn't support PASID; in such cases PASID defaults to
      -1, indicating that PASID and its related fields are invalid.
      
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Ashok Raj <ashok.raj@intel.com>
      Cc: Lu Baolu <baolu.lu@linux.intel.com>
      Cc: Sohil Mehta <sohil.mehta@intel.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
      Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
      Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
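The four-step table walk described in this commit message comes down to a little index arithmetic. The following is a minimal userspace sketch of that arithmetic only. It reads no memory and is not the kernel code. The names `WalkIndices` and `walk_indices` are made up for illustration, and the device/function pair is packed with the usual PCI devfn layout (device in the upper five bits, function in the lower three).

```python
from typing import NamedTuple


class WalkIndices(NamedTuple):
    root_index: int         # step 1: bus number indexes the root table
    context_index: int      # step 2: device and function, packed as a devfn
    pasid_dir_index: int    # step 3: PASID[19:6]
    pasid_table_index: int  # step 4: PASID[5:0]


def walk_indices(bus: int, device: int, function: int, pasid: int) -> WalkIndices:
    """Split a B:D.F and a 20-bit PASID into the four indices of the walk."""
    if not (0 <= bus < 256 and 0 <= device < 32 and 0 <= function < 8):
        raise ValueError("B:D.F out of range")
    if not 0 <= pasid < (1 << 20):
        raise ValueError("PASID must fit in 20 bits")
    devfn = (device << 3) | function  # standard PCI devfn packing
    return WalkIndices(
        root_index=bus,
        context_index=devfn,
        pasid_dir_index=(pasid >> 6) & 0x3FFF,  # upper 14 bits of the PASID
        pasid_table_index=pasid & 0x3F,         # lower 6 bits of the PASID
    )


if __name__ == "__main__":
    # Device 00:0a.0 with PASID 1, as in the example dump.
    print(walk_indices(0x00, 0x0A, 0, 1))
```

How the hardware turns each index into an address (entry sizes, upper and lower context tables) is deliberately left out of this sketch.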
    •
      iommu/vt-d: Introduce macros useful for dumping DMAR table · cdd3a249
      Sai Praneeth Prakhya committed
      A scalable mode DMAR table walk would involve looking at bits in each stage
      of walk, like,
      1. Is PASID enabled in the context entry?
      2. What's the size of PASID directory?
      3. Is the PASID directory entry present?
      4. Is the PASID table entry present?
      5. Number of PASID table entries?
      
      Hence, add these macros that will later be used during this walk.
      Apart from adding new macros, move existing macros (like
      pasid_pde_is_present(), get_pasid_table_from_pde() and pasid_supported())
      to appropriate header files so that they could be reused.
      
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Ashok Raj <ashok.raj@intel.com>
      Cc: Lu Baolu <baolu.lu@linux.intel.com>
      Cc: Sohil Mehta <sohil.mehta@intel.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
      Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
      Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
    •
      iommu/vt-d: Modify the format of intel DMAR tables dump · ea09506c
      Sai Praneeth Prakhya committed
      Presently, "/sys/kernel/debug/iommu/intel/dmar_translation_struct" file
      dumps DMAR tables in the below format
      
      IOMMU dmar2: Root Table Address:4362cc000
      Root Table Entries:
       Bus: 0 H: 0 L: 4362f0001
       Context Table Entries for Bus: 0
        Entry	B:D.F	High	Low
        160   00:14.0	102     4362ef001
        184   00:17.0	302     435ec4001
        248   00:1f.0	202     436300001
      
      This format has a few shortcomings:
      1. When extended for dumping a scalable mode DMAR table, it will quickly
         become clumsy and unreadable.
      2. It carries information like the Bus number and Entry, which are
         already part of B:D.F, so they are repetitive and not very useful.
      
      So, change it to a new format which could be easily extended to dump
      scalable mode DMAR table. The new format looks as below:
      
      IOMMU dmar2: Root Table Address: 0x436f7d000
      B.D.F	Root_entry				Context_entry
      00:14.0	0x0000000000000000:0x0000000436fbd001	0x0000000000000102:0x0000000436fbc001
      00:17.0	0x0000000000000000:0x0000000436fbd001	0x0000000000000302:0x0000000436af4001
      00:1f.0	0x0000000000000000:0x0000000436fbd001	0x0000000000000202:0x0000000436fcd001
      
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Ashok Raj <ashok.raj@intel.com>
      Cc: Lu Baolu <baolu.lu@linux.intel.com>
      Cc: Sohil Mehta <sohil.mehta@intel.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
      Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
      Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
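One practical effect of the new one-row-per-device format is that it is easy to consume from a script. The sketch below parses a single data row of the new format shown in this commit message. `DumpRow` and `parse_row` are hypothetical names, and the two halves of each entry are simply called "first" and "second" because the message does not label them.

```python
from typing import NamedTuple, Tuple


class DumpRow(NamedTuple):
    bdf: str                        # e.g. "00:14.0"
    root_entry: Tuple[int, int]     # the two colon-separated halves, in order
    context_entry: Tuple[int, int]  # the two colon-separated halves, in order


def _pair(field: str) -> Tuple[int, int]:
    """Turn '0x...:0x...' into a pair of integers."""
    first, second = field.split(":")
    return int(first, 16), int(second, 16)


def parse_row(line: str) -> DumpRow:
    """Parse one whitespace-separated data row of the new dump format."""
    bdf, root, context = line.split()
    return DumpRow(bdf, _pair(root), _pair(context))


if __name__ == "__main__":
    row = parse_row(
        "00:14.0\t0x0000000000000000:0x0000000436fbd001"
        "\t0x0000000000000102:0x0000000436fbc001"
    )
    print(row.bdf, [hex(v) for v in row.context_entry])
```

The header lines ("IOMMU dmar2: ..." and the column titles) are not handled here; a caller would need to skip them first.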
    •
      iommu/vt-d: Remove unnecessary rcu_read_locks · f780a8dc
      Lukasz Odzioba committed
      We use RCU for rarely updated lists like iommus, rmrr, atsr units.

      I'm not sure why domain_remove_dev_info() in domain_exit() was surrounded
      by rcu_read_lock. The lock was present before the refactoring in d160aca5,
      but it was related to the RCU list, not the domain_remove_dev_info function.

      dmar_remove_one_dev_info() doesn't touch any of those lists, so it doesn't
      require a lock. In fact it is called 6 times without it anyway.

      Fixes: d160aca5 ("iommu/vt-d: Unify domain->iommu attach/detachment")
      Signed-off-by: Lukasz Odzioba <lukasz.odzioba@intel.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
    •
      iommu/vt-d: Fix bind svm with multiple devices · d7af4d98
      Jacob Pan committed
      If multiple devices try to bind to the same mm/PASID, we need to
      set up first level PASID entries for all the devices. The current
      code does not consider this case, which results in failed DMA for
      devices after the first bind.
      Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
      Reported-by: Mike Campin <mike.campin@intel.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
    •
      Linux 5.2-rc2 · cd6c84d8
      Linus Torvalds committed
    •
      Merge tag 'trace-v5.2-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace · c5b44095
      Linus Torvalds committed
      Pull tracing warning fix from Steven Rostedt:
       "Make the GCC 9 warning for sub struct memset go away.
      
        GCC 9 now warns about calling memset() on partial structures when it
        goes across multiple fields. This adds a helper for the place in
        tracing that does this type of clearing of a structure"
      
      * tag 'trace-v5.2-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
        tracing: Silence GCC 9 array bounds warning
    •
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 862f0a32
      Linus Torvalds committed
      Pull KVM fixes from Paolo Bonzini:
       "The usual smattering of fixes and tunings that came in too late for
        the merge window, but should not wait four months before they appear
        in a release.
      
        I also travelled a bit more than usual in the first part of May, which
        didn't help with picking up patches and reports promptly"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (33 commits)
        KVM: x86: fix return value for reserved EFER
        tools/kvm_stat: fix fields filter for child events
        KVM: selftests: Wrap vcpu_nested_state_get/set functions with x86 guard
        kvm: selftests: aarch64: compile with warnings on
        kvm: selftests: aarch64: fix default vm mode
        kvm: selftests: aarch64: dirty_log_test: fix unaligned memslot size
        KVM: s390: fix memory slot handling for KVM_SET_USER_MEMORY_REGION
        KVM: x86/pmu: do not mask the value that is written to fixed PMUs
        KVM: x86/pmu: mask the result of rdpmc according to the width of the counters
        x86/kvm/pmu: Set AMD's virt PMU version to 1
        KVM: x86: do not spam dmesg with VMCS/VMCB dumps
        kvm: Check irqchip mode before assign irqfd
        kvm: svm/avic: fix off-by-one in checking host APIC ID
        KVM: selftests: do not blindly clobber registers in guest asm
        KVM: selftests: Remove duplicated TEST_ASSERT in hyperv_cpuid.c
        KVM: LAPIC: Expose per-vCPU timer_advance_ns to userspace
        KVM: LAPIC: Fix lapic_timer_advance_ns parameter overflow
        kvm: vmx: Fix -Wmissing-prototypes warnings
        KVM: nVMX: Fix using __this_cpu_read() in preemptible context
        kvm: fix compilation on s390
        ...
  2. 26 May, 2019 6 commits
  3. 25 May, 2019 25 commits
    •
      ext4: fix dcache lookup of !casefolded directories · 66883da1
      Gabriel Krisman Bertazi committed
      Found by visual inspection. This wasn't caught by my xfstest, since its
      effect is ignoring positive dentries in the cache, so the fallback just goes
      to the disk. It was introduced in the last iteration of the
      case-insensitive patch.

      d_compare should return 0 when the entries match, so make sure we are
      correctly comparing the entire string if the encoding feature is set and
      we are on a case-INsensitive directory.

      Fixes: b886ee3e ("ext4: Support case-insensitive file name lookups")
      Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    •
      Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 2409207a
      Linus Torvalds committed
      Pull SCSI fixes from James Bottomley:
       "This is the same set of patches sent in the merge window as the final
        pull except that Martin's read only rework is replaced with a simple
        revert of the original change that caused the regression.
      
        Everything else is an obvious fix or small cleanup"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        Revert "scsi: sd: Keep disk read-only when re-reading partition"
        scsi: bnx2fc: fix incorrect cast to u64 on shift operation
        scsi: smartpqi: Reporting unhandled SCSI errors
        scsi: myrs: Fix uninitialized variable
        scsi: lpfc: Update lpfc version to 12.2.0.2
        scsi: lpfc: add check for loss of ndlp when sending RRQ
        scsi: lpfc: correct rcu unlock issue in lpfc_nvme_info_show
        scsi: lpfc: resolve lockdep warnings
        scsi: qedi: remove set but not used variables 'cdev' and 'udev'
        scsi: qedi: remove memset/memcpy to nfunc and use func instead
        scsi: qla2xxx: Add cleanup for PCI EEH recovery
    •
      Merge tag 'for-linus-20190524' of git://git.kernel.dk/linux-block · 7fbc78e3
      Linus Torvalds committed
      Pull block fixes from Jens Axboe:
      
       - NVMe pull request from Keith, with fixes from a few folks.
      
       - bio and sbitmap before atomic barrier fixes (Andrea)
      
       - Hang fix for blk-mq freeze and unfreeze (Bob)
      
       - Single segment count regression fix (Christoph)
      
       - AoE now has a new maintainer
      
       - tools/io_uring/ Makefile fix, and sync with liburing (me)
      
      * tag 'for-linus-20190524' of git://git.kernel.dk/linux-block: (23 commits)
        tools/io_uring: sync with liburing
        tools/io_uring: fix Makefile for pthread library link
        blk-mq: fix hang caused by freeze/unfreeze sequence
        block: remove the bi_seg_{front,back}_size fields in struct bio
        block: remove the segment size check in bio_will_gap
        block: force an unlimited segment size on queues with a virt boundary
        block: don't decrement nr_phys_segments for physically contigous segments
        sbitmap: fix improper use of smp_mb__before_atomic()
        bio: fix improper use of smp_mb__before_atomic()
        aoe: list new maintainer for aoe driver
        nvme-pci: use blk-mq mapping for unmanaged irqs
        nvme: update MAINTAINERS
        nvme: copy MTFA field from identify controller
        nvme: fix memory leak for power latency tolerance
        nvme: release namespace SRCU protection before performing controller ioctls
        nvme: merge nvme_ns_ioctl into nvme_ioctl
        nvme: remove the ifdef around nvme_nvm_ioctl
        nvme: fix srcu locking on error return in nvme_get_ns_from_disk
        nvme: Fix known effects
        nvme-pci: Sync queues on reset
        ...
    •
      Merge tag 'linux-kselftest-5.2-rc2' of... · 7f8b40e3
      Linus Torvalds committed
      Merge tag 'linux-kselftest-5.2-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
      
      Pull Kselftest fixes from Shuah Khan:
      
       - Two fixes to regressions introduced in kselftest Makefile test run
         output refactoring work (Kees Cook)
      
       - Adding Atom support to syscall_arg_fault test (Tong Bo)
      
      * tag 'linux-kselftest-5.2-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
        selftests/timers: Add missing fflush(stdout) calls
        selftests: Remove forced unbuffering for test running
        selftests/x86: Support Atom for syscall_arg_fault test
    •
      Merge tag 'devicetree-fixes-for-5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux · e7bd3e24
      Linus Torvalds committed
      Pull Devicetree fixes from Rob Herring:
      
       - Update checkpatch.pl to use DT vendor-prefixes.yaml
      
       - Fix DT binding references to files converted to DT schema
      
       - Clean-up Arm CPU binding examples to match schema
      
       - Add Sifive block versioning scheme documentation
      
       - Pass binding directory base to validation tools for reference lookups
      
      * tag 'devicetree-fixes-for-5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
        checkpatch.pl: Update DT vendor prefix check
        dt: bindings: mtd: replace references to nand.txt with nand-controller.yaml
        dt-bindings: interrupt-controller: arm,gic: Fix schema errors in example
        dt-bindings: arm: Clean up CPU binding examples
        dt: fix refs that were renamed to json with the same file name
        dt-bindings: Pass binding directory to validation tools
        dt-bindings: sifive: describe sifive-blocks versioning
    •
      Merge tag 'spdx-5.2-rc2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core · 86c2f5d6
      Linus Torvalds committed
      Pull more SPDX updates from Greg KH:
       "Here is another set of reviewed patches that adds SPDX tags to
        different kernel files, based on a set of rules that are being used to
        parse the comments to try to determine that the license of the file is
        "GPL-2.0-or-later".
      
        Only the "obvious" versions of these matches are included here, a
        number of "non-obvious" variants of text have been found but those
        have been postponed for later review and analysis.
      
        These patches have been out for review on the linux-spdx@vger mailing
        list, and while they were created by automatic tools, they were
        hand-verified by a bunch of different people, all of whose names are
        on the patches as reviewers"
      
      * tag 'spdx-5.2-rc2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (85 commits)
        treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 125
        treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 123
        treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 122
        treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 121
        treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 120
        treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 119
        treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 118
        treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 116
        treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 114
        treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 113
        treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 112
        treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 111
        treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 110
        treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 106
        treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 105
        treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 104
        treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 103
        treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 102
        treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 101
        treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 98
        ...
    •
      locking/lock_events: Use this_cpu_add() when necessary · 51816e9e
      Waiman Long committed
      The kernel test robot has reported that the use of __this_cpu_add()
      causes bug messages like:
      
        BUG: using __this_cpu_add() in preemptible [00000000] code: ...
      
      Given the imprecise nature of the count and the possibility of resetting
      the count and doing the measurement again, it is not really a big
      problem to use the unprotected __this_cpu_*() functions.
      
      To make the preemption checking code happy, the this_cpu_*() functions
      will be used if CONFIG_DEBUG_PREEMPT is defined.
      
      The imprecise nature of the locking counts is also documented, with
      the suggestion that we should run the measurement a few times with the
      counts reset in between to get a better picture of what is going on
      under the hood.
      
      Fixes: a8654596 ("locking/rwsem: Enable lock event counting")
      Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Waiman Long <longman@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    •
      KVM: x86: fix return value for reserved EFER · 66f61c92
      Paolo Bonzini committed
      Commit 11988499 ("KVM: x86: Skip EFER vs. guest CPUID checks for
      host-initiated writes", 2019-04-02) introduced a "return false" in a
      function returning int, and anyway set_efer has a "nonzero on error"
      convention so it should be returning 1.
      Reported-by: Pavel Machek <pavel@denx.de>
      Fixes: 11988499 ("KVM: x86: Skip EFER vs. guest CPUID checks for host-initiated writes")
      Cc: Sean Christopherson <sean.j.christopherson@intel.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
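To see why "return false" is a bug under a "nonzero on error" convention, here is a small userspace model rather than the kernel function. `HYPOTHETICAL_RESERVED_BITS` is a made-up mask, and both `set_efer_*` functions are illustrative stand-ins. The only point is that `false` equals 0, which the caller reads as success.

```python
HYPOTHETICAL_RESERVED_BITS = 1 << 63  # made-up mask, for illustration only


def set_efer_buggy(value: int):
    """Model of the bug: the error path returns False, which equals 0."""
    if value & HYPOTHETICAL_RESERVED_BITS:
        return False  # the caller's "nonzero means error" check sees success
    return 0


def set_efer_fixed(value: int) -> int:
    """Model of the fix: the error path returns 1."""
    if value & HYPOTHETICAL_RESERVED_BITS:
        return 1
    return 0


def write_failed(ret) -> bool:
    """Caller-side check for the "nonzero on error" convention."""
    return ret != 0


if __name__ == "__main__":
    bad_value = HYPOTHETICAL_RESERVED_BITS
    print("buggy reports failure:", write_failed(set_efer_buggy(bad_value)))
    print("fixed reports failure:", write_failed(set_efer_fixed(bad_value)))
```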
    •
      tools/kvm_stat: fix fields filter for child events · 883d25e7
      Stefan Raspl committed
      The fields filter would not work with child fields, as the respective
      parents would not be included. No parents displayed == no children displayed.
      To reproduce, run on s390 (would work on other platforms, too, but would
      require a different filter name):
      - Run 'kvm_stat -d'
      - Press 'f'
      - Enter 'instruct'
      Notice that events like instruction_diag_44 or instruction_diag_500 are not
      displayed - the output remains empty.
      With this patch, we will filter by matching events and their parents.
      However, consider the following example where we filter by
      instruction_diag_44:
      
        kvm statistics - summary
                         regex filter: instruction_diag_44
         Event                                         Total %Total CurAvg/s
         exit_instruction                                276  100.0       12
           instruction_diag_44                           256   92.8       11
         Total                                           276              12
      
      Note that the parent ('exit_instruction') displays the total events, but
      the children listed do not match its total (256 instead of 276). This is
      intended (since we're filtering all but one child), but might be confusing
      at first sight.
      Signed-off-by: Stefan Raspl <raspl@linux.ibm.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
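The "matching events and their parents" rule can be sketched in a few lines. This is not the kvm_stat implementation. `filter_events` and the explicit `parent_of` mapping are hypothetical, and the sketch assumes the filter is applied with `re.match`, i.e. anchored at the start of the event name (which is why 'instruct' does not match 'exit_instruction' on its own).

```python
import re
from typing import Dict, List


def filter_events(names: List[str], parent_of: Dict[str, str], pattern: str) -> List[str]:
    """Keep every event matching the pattern, plus the parent of each match."""
    regex = re.compile(pattern)
    keep = set()
    for name in names:
        if regex.match(name):
            keep.add(name)
            parent = parent_of.get(name)
            if parent is not None:
                keep.add(parent)  # without this, the child has nowhere to be shown
    # Preserve the original display order.
    return [name for name in names if name in keep]


if __name__ == "__main__":
    names = ["exit_instruction", "instruction_diag_44", "instruction_diag_500"]
    parent_of = {
        "instruction_diag_44": "exit_instruction",
        "instruction_diag_500": "exit_instruction",
    }
    print(filter_events(names, parent_of, "instruction_diag_44"))
```

Because the parent row is kept unfiltered, its total still counts all of its children, which is the 256 versus 276 mismatch the commit message warns about.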
    •
      KVM: selftests: Wrap vcpu_nested_state_get/set functions with x86 guard · c7957206
      Thomas Huth committed
      struct kvm_nested_state is only available on x86 so far. To be able
      to compile the code on other architectures as well, we need to wrap
      the related code with #ifdefs.
      Signed-off-by: Thomas Huth <thuth@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    •
      kvm: selftests: aarch64: compile with warnings on · 98e68344
      Andrew Jones committed
      aarch64 fixups needed to compile with warnings as errors.
      Reviewed-by: Thomas Huth <thuth@redhat.com>
      Signed-off-by: Andrew Jones <drjones@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    •
      kvm: selftests: aarch64: fix default vm mode · 55eda003
      Andrew Jones committed
      VM_MODE_P52V48_4K is not a valid mode for AArch64. Replace its
      use in vm_create_default() with a mode that works and represents
      a good AArch64 default. (We didn't ever see a problem with this
      because we don't have any unit tests using vm_create_default(),
      but it's good to get it fixed in advance.)
      Reported-by: Thomas Huth <thuth@redhat.com>
      Signed-off-by: Andrew Jones <drjones@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    •
      kvm: selftests: aarch64: dirty_log_test: fix unaligned memslot size · bffed38d
      Andrew Jones committed
      The memory slot size must be aligned to the host's page size. When
      testing a guest with a 4k page size on a host with a 64k page size,
      3 guest pages are not host page size aligned. Since we just need
      a nearly arbitrary number of extra pages to ensure the memslot is not
      aligned to a 64 host-page boundary for this test, we can use
      16, as that's 64k aligned, but not 64 * 64k aligned.

      Fixes: 76d58e0f ("KVM: fix KVM_CLEAR_DIRTY_LOG for memory slots of unaligned size", 2019-04-17)
      Signed-off-by: Andrew Jones <drjones@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
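The arithmetic behind choosing 16 extra pages can be checked directly. This is a minimal sketch using the page sizes named in the commit message; `is_aligned` and `extra_bytes` are illustrative helper names, not selftest functions.

```python
GUEST_PAGE = 4 * 1024   # 4k guest page size, as in the commit message
HOST_PAGE = 64 * 1024   # 64k host page size, as in the commit message


def is_aligned(size: int, boundary: int) -> bool:
    """True if size is a whole multiple of boundary."""
    return size % boundary == 0


def extra_bytes(guest_pages: int) -> int:
    """Size in bytes of the given number of extra guest pages."""
    return guest_pages * GUEST_PAGE


if __name__ == "__main__":
    for pages in (3, 16):
        size = extra_bytes(pages)
        print(
            pages,
            "host-page aligned:", is_aligned(size, HOST_PAGE),
            "64-host-page aligned:", is_aligned(size, 64 * HOST_PAGE),
        )
```

Three guest pages (12k) fail the host page alignment requirement, while 16 guest pages (64k) satisfy it and still leave the slot off a 64 * 64k boundary, which is what the test wants.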
    •
      KVM: s390: fix memory slot handling for KVM_SET_USER_MEMORY_REGION · 19ec166c
      Christian Borntraeger committed
      kselftests exposed a problem in the s390 handling for memory slots.
      Right now we only do proper memory slot handling for creation of new
      memory slots. Neither MOVE nor DELETION is handled properly. Let us
      implement those.
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    •
      KVM: x86/pmu: do not mask the value that is written to fixed PMUs · 2924b521
      Paolo Bonzini committed
      According to the SDM, for MSR_IA32_PERFCTR0/1 "the lower-order 32 bits of
      each MSR may be written with any value, and the high-order 8 bits are
      sign-extended according to the value of bit 31", but the fixed counters
      in real hardware are limited to the width of the fixed counters ("bits
      beyond the width of the fixed-function counter are reserved and must be
      written as zeros").  Fix KVM to do the same.
      Reported-by: Nadav Amit <nadav.amit@gmail.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    •
      KVM: x86/pmu: mask the result of rdpmc according to the width of the counters · 0e6f467e
      Paolo Bonzini committed
      This patch will simplify the changes in the next patch, by enforcing the
      masking of the counters for RDPMC and RDMSR.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
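Masking a counter read to the counter's width is a one-line operation. The sketch below shows the general technique only, not KVM's code. `counter_mask` and `mask_counter_read` are hypothetical names, and the 48-bit width used in the demo is an assumed example value, since real widths are reported by the hardware.

```python
def counter_mask(width_bits: int) -> int:
    """Bit mask covering the low width_bits bits."""
    if width_bits <= 0:
        raise ValueError("counter width must be positive")
    return (1 << width_bits) - 1


def mask_counter_read(raw_value: int, width_bits: int) -> int:
    """Value a reader would see once bits beyond the counter width are cleared."""
    return raw_value & counter_mask(width_bits)


if __name__ == "__main__":
    ASSUMED_WIDTH = 48  # assumption for the demo, not taken from the commit
    print(hex(mask_counter_read(0xFFFF_FFFF_FFFF_FFFF, ASSUMED_WIDTH)))
```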
    •
      x86/kvm/pmu: Set AMD's virt PMU version to 1 · a80c4ec1
      Borislav Petkov committed
      After commit:
      
        672ff6cf ("KVM: x86: Raise #GP when guest vCPU do not support PMU")
      
      my AMD guests started #GPing like this:
      
        general protection fault: 0000 [#1] PREEMPT SMP
        CPU: 1 PID: 4355 Comm: bash Not tainted 5.1.0-rc6+ #3
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
        RIP: 0010:x86_perf_event_update+0x3b/0xa0
      
      with Code: pointing to RDPMC. It is RDPMC because the guest has the
      hardware watchdog CONFIG_HARDLOCKUP_DETECTOR_PERF enabled, which uses
      perf. Some instrumentation of kvm_pmu_rdpmc() showed that it fails due to:
      
        if (!pmu->version)
        	return 1;
      
      which the above commit added. Since AMD's PMU leaves the version at 0,
      that causes the #GP injection into the guest.
      
      Set pmu->version arbitrarily to 1 and move it above the non-applicable
      struct kvm_pmu members.
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
      Cc: kvm@vger.kernel.org
      Cc: Liran Alon <liran.alon@oracle.com>
      Cc: Mihai Carabas <mihai.carabas@oracle.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: x86@kernel.org
      Cc: stable@vger.kernel.org
      Fixes: 672ff6cf ("KVM: x86: Raise #GP when guest vCPU do not support PMU")
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    •
      KVM: x86: do not spam dmesg with VMCS/VMCB dumps · 6f2f8453
      Paolo Bonzini committed
      Userspace can easily set up invalid processor state in such a way that
      dmesg will be filled with VMCS or VMCB dumps.  Disable this by default
      using a module parameter.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    •
      kvm: Check irqchip mode before assign irqfd · 654f1f13
      Peter Xu committed
      When assigning kvm irqfd we didn't check the irqchip mode but we allow
      KVM_IRQFD to succeed with all the irqchip modes.  However it does not
      make much sense to create irqfd even without the kernel chips.  Let's
      provide an arch-dependent helper to check whether a specific irqfd is
      allowed by the arch.  At least for x86, it should make sense to check:
      
      - when irqchip mode is NONE, all irqfds should be disallowed, and,
      
      - when irqchip mode is SPLIT, irqfds that are with resamplefd should
        be disallowed.
      
      In either case, we previously silently ignored the irq or
      the irq ack event if the irqchip mode was incorrect.  However, that can
      cause mysterious guest behaviors that are hard to triage.  Let's
      fail KVM_IRQFD even earlier to detect these incorrect configurations.
      
      CC: Paolo Bonzini <pbonzini@redhat.com>
      CC: Radim Krčmář <rkrcmar@redhat.com>
      CC: Alex Williamson <alex.williamson@redhat.com>
      CC: Eduardo Habkost <ehabkost@redhat.com>
      Signed-off-by: Peter Xu <peterx@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
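The two x86 rules listed in this commit message amount to a small decision function. The following models only that policy and is not the kernel helper. `IrqchipMode` and `irqfd_allowed` are hypothetical names, and the `KERNEL` member is an assumption standing in for the case where the irqchip is fully in the kernel.

```python
from enum import Enum


class IrqchipMode(Enum):
    NONE = "none"      # no in-kernel irqchip
    SPLIT = "split"    # split irqchip
    KERNEL = "kernel"  # assumed name for a fully in-kernel irqchip


def irqfd_allowed(mode: IrqchipMode, has_resamplefd: bool) -> bool:
    """Apply the two rules from the commit message; allow everything else."""
    if mode is IrqchipMode.NONE:
        return False  # all irqfds are disallowed
    if mode is IrqchipMode.SPLIT and has_resamplefd:
        return False  # irqfds with a resamplefd are disallowed
    return True


if __name__ == "__main__":
    for mode in IrqchipMode:
        for resample in (False, True):
            print(mode.name, "resamplefd" if resample else "plain",
                  irqfd_allowed(mode, resample))
```

Rejecting the request up front, instead of silently dropping the irq later, is what turns a hard-to-triage guest misbehaviour into an immediate KVM_IRQFD failure.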
    •
      kvm: svm/avic: fix off-by-one in checking host APIC ID · c9bcd3e3
      Suthikulpanit, Suravee committed
      Current logic does not allow VCPU to be loaded onto CPU with
      APIC ID 255. This should be allowed since the host physical APIC ID
      field in the AVIC Physical APIC table entry is an 8-bit value,
      and APIC ID 255 is valid in system with x2APIC enabled.
      Instead, do not allow VCPU load if the host APIC ID cannot be
      represented by an 8-bit value.
      
      Also, use the more appropriate AVIC_PHYSICAL_ID_ENTRY_HOST_PHYSICAL_ID_MASK
      instead of AVIC_MAX_PHYSICAL_ID_COUNT.
      Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
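The off-by-one is easiest to see by putting the two checks side by side. This is a sketch of the comparison only. `can_load_old` models the described behaviour (APIC ID 255 rejected) and `can_load_fixed` models the intended one (reject only IDs that do not fit in 8 bits). Both names and the constant `HOST_PHYSICAL_ID_MASK` are illustrative, with the mask value assumed from the "8-bit value" wording.

```python
HOST_PHYSICAL_ID_MASK = 0xFF  # assumed: the field is described as 8 bits wide


def can_load_old(host_apic_id: int) -> bool:
    """Pre-fix behaviour as described: APIC ID 255 is refused."""
    return 0 <= host_apic_id < 255


def can_load_fixed(host_apic_id: int) -> bool:
    """Post-fix behaviour: any ID representable in 8 bits is accepted."""
    return 0 <= host_apic_id <= HOST_PHYSICAL_ID_MASK


if __name__ == "__main__":
    for apic_id in (254, 255, 256):
        print(apic_id, can_load_old(apic_id), can_load_fixed(apic_id))
```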
    •
      KVM: selftests: do not blindly clobber registers in guest asm · 204c91ef
      Paolo Bonzini committed
      The guest_code of sync_regs_test is assuming that the compiler will not
      touch %r11 outside the asm that increments it, which is a bit brittle.
      Instead, we can increment a variable and use a dummy asm to ensure the
      increment is not optimized away.  However, we also need to use a
      callee-save register or the compiler will insert a save/restore around
      the vmexit, breaking the whole idea behind the test.

      (Yes, "if it ain't broken...", but I would like the test to be clean
      before it is copied into the upcoming s390 selftests).
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    •
      KVM: selftests: Remove duplicated TEST_ASSERT in hyperv_cpuid.c · 12e9612c
      Thomas Huth committed
      The check for entry->index == 0 is done twice. One time should
      be sufficient.
      Suggested-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: Thomas Huth <thuth@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    •
      KVM: LAPIC: Expose per-vCPU timer_advance_ns to userspace · 16ba3ab4
      Wanpeng Li committed
      Expose per-vCPU timer_advance_ns to userspace, so it is able to
      query the auto-adjusted value.

      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Sean Christopherson <sean.j.christopherson@intel.com>
      Cc: Liran Alon <liran.alon@oracle.com>
      Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    •
      KVM: LAPIC: Fix lapic_timer_advance_ns parameter overflow · 0e6edceb
      Wanpeng Li committed
      After commit c3941d9e (KVM: lapic: Allow user to disable adaptive tuning of
      timer advancement), '-1' enables adaptive tuning starting from the default
      advancement of 1000ns. However, we should expose an int module parameter
      instead of an overflowed uint.
      
      Before patch:
      
      /sys/module/kvm/parameters/lapic_timer_advance_ns:4294967295
      
      After patch:
      
      /sys/module/kvm/parameters/lapic_timer_advance_ns:-1
      
      Fixes: c3941d9e (KVM: lapic: Allow user to disable adaptive tuning of timer advancement)
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Sean Christopherson <sean.j.christopherson@intel.com>
      Cc: Liran Alon <liran.alon@oracle.com>
      Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
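The "before" and "after" values in this commit message are the same 32 bits read as unsigned and as signed. A quick userspace check with `ctypes` shows the relationship; `as_uint32` and `as_int32` are illustrative helper names.

```python
import ctypes


def as_uint32(value: int) -> int:
    """Reinterpret an integer as an unsigned 32-bit value (wraps around)."""
    return ctypes.c_uint32(value).value


def as_int32(value: int) -> int:
    """Reinterpret an integer as a signed 32-bit value (wraps around)."""
    return ctypes.c_int32(value).value


if __name__ == "__main__":
    print(as_uint32(-1))         # how -1 reads back through a uint parameter
    print(as_int32(4294967295))  # the same bits read back as an int
```

Exposing the parameter as an int lets the value that enables adaptive tuning read back as the same -1 the user wrote.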
    •
      kvm: vmx: Fix -Wmissing-prototypes warnings · 4d259965
      Yi Wang committed
      We get a warning when building the kernel with W=1:
      arch/x86/kvm/vmx/vmx.c:6365:6: warning: no previous prototype for ‘vmx_update_host_rsp’ [-Wmissing-prototypes]
       void vmx_update_host_rsp(struct vcpu_vmx *vmx, unsigned long host_rsp)

      Add the missing declaration to fix this.
      Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>