- 17 11月, 2022 3 次提交
-
-
由 Colton Lewis 提交于
Randomize which pages are written vs read using the random number generator. Change the variable wr_fract and associated function calls to write_percent that now operates as a percentage from 0 to 100 where X means each page has an X% chance of being written. Change the -f argument to -w to reflect the new variable semantics. Keep the same default of 100% writes. Population always uses 100% writes to ensure all memory is actually populated and not just mapped to the zero page. The prevents expensive copy-on-write faults from occurring during the dirty memory iterations below, which would pollute the performance results. Each vCPU calculates its own random seed by adding its index to the seed provided. Signed-off-by: NColton Lewis <coltonlewis@google.com> Reviewed-by: NDavid Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20221107182208.479157-4-coltonlewis@google.comSigned-off-by: NSean Christopherson <seanjc@google.com>
-
由 Colton Lewis 提交于
Create a -r argument to specify a random seed. If no argument is provided, the seed defaults to 1. The random seed is set with perf_test_set_random_seed() and must be set before guest_code runs to apply. Signed-off-by: NColton Lewis <coltonlewis@google.com> Reviewed-by: NDavid Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20221107182208.479157-3-coltonlewis@google.comSigned-off-by: NSean Christopherson <seanjc@google.com>
-
由 Vipin Sharma 提交于
Add a command line option, -c, to pin vCPUs to physical CPUs (pCPUs), i.e. to force vCPUs to run on specific pCPUs. Requirement to implement this feature came in discussion on the patch "Make page tables for eager page splitting NUMA aware" https://lore.kernel.org/lkml/YuhPT2drgqL+osLl@google.com/ This feature is useful as it provides a way to analyze performance based on the vCPUs and dirty log worker locations, like on the different NUMA nodes or on the same NUMA nodes. To keep things simple, implementation is intentionally very limited, either all of the vCPUs will be pinned followed by an optional main thread or nothing will be pinned. Signed-off-by: NVipin Sharma <vipinsh@google.com> Suggested-by: NDavid Matlack <dmatlack@google.com> Reviewed-by: NSean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221103191719.1559407-8-vipinsh@google.comSigned-off-by: NSean Christopherson <seanjc@google.com>
-
- 11 6月, 2022 7 次提交
-
-
由 Sean Christopherson 提交于
Drop @num_percpu_pages from __vm_create_with_vcpus(), all callers pass '0' and there's unlikely to be a test that allocates just enough memory that it needs a per-CPU allocation, but not so much that it won't just do its own memory management. Signed-off-by: NSean Christopherson <seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
All callers of __vm_create_with_vcpus() pass DEFAULT_GUEST_PHY_PAGES for @slot_mem_pages; drop the param and just hardcode the "default" as the base number of pages for slot0. Signed-off-by: NSean Christopherson <seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Drop a variety of 'struct kvm_vm' accessors that wrap a single variable now that tests can simply reference the variable directly. Signed-off-by: NSean Christopherson <seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Take a vCPU directly instead of a VM+vcpu pair in all vCPU-scoped helpers and ioctls. Signed-off-by: NSean Christopherson <seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Track vCPUs by their 'struct kvm_vcpu' object, and stop assuming that a vCPU's ID is the same as its index when referencing a vCPU's metadata. Signed-off-by: NSean Christopherson <seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Drop the @vcpuids parameter from VM creators now that there are no users. Allowing tests to specify IDs was a gigantic mistake as it resulted in tests with arbitrary and ultimately meaningless IDs that differed only because the author used test X intead of test Y as the source for copy+paste (the de facto standard way to create a KVM selftest). Except for literally two tests, x86's set_boot_cpu_id and s390's resets, tests do not and should not care about the vCPU ID. Signed-off-by: NSean Christopherson <seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Add a VM creator that "returns" the created vCPUs by filling the provided array. This will allow converting multi-vCPU tests away from hardcoded vCPU IDs. Signed-off-by: NSean Christopherson <seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 09 6月, 2022 2 次提交
-
-
由 David Matlack 提交于
The selftests nested code only supports 4-level paging at the moment. This means it cannot map nested guest physical addresses with more than 48 bits. Allow perf_test_util nested mode to work on hosts with more than 48 physical addresses by restricting the guest test region to 48-bits. While here, opportunistically fix an off-by-one error when dealing with vm_get_max_gfn(). perf_test_util.c was treating this as the maximum number of GFNs, rather than the maximum allowed GFN. This didn't result in any correctness issues, but it did end up shifting the test region down slightly when using huge pages. Suggested-by: NSean Christopherson <seanjc@google.com> Signed-off-by: NDavid Matlack <dmatlack@google.com> Message-Id: <20220520233249.3776001-12-dmatlack@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 David Matlack 提交于
Add an option to dirty_log_perf_test that configures the vCPUs to run in L2 instead of L1. This makes it possible to benchmark the dirty logging performance of nested virtualization, which is particularly interesting because KVM must shadow L1's EPT/NPT tables. For now this support only works on x86_64 CPUs with VMX. Otherwise passing -n results in the test being skipped. Signed-off-by: NDavid Matlack <dmatlack@google.com> Message-Id: <20220520233249.3776001-11-dmatlack@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 16 11月, 2021 11 次提交
-
-
由 David Matlack 提交于
Thread creation requires taking the mmap_sem in write mode, which causes vCPU threads running in guest mode to block while they are populating memory. Fix this by waiting for all vCPU threads to be created and start running before entering guest mode on any one vCPU thread. This substantially improves the "Populate memory time" when using 1GiB pages since it allows all vCPUs to zero pages in parallel rather than blocking because a writer is waiting (which is waiting for another vCPU that is busy zeroing a 1GiB page). Before: $ ./dirty_log_perf_test -v256 -s anonymous_hugetlb_1gb ... Populate memory time: 52.811184013s After: $ ./dirty_log_perf_test -v256 -s anonymous_hugetlb_1gb ... Populate memory time: 10.204573342s Signed-off-by: NDavid Matlack <dmatlack@google.com> Message-Id: <20211111001257.1446428-4-dmatlack@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 David Matlack 提交于
Move vCPU thread creation and joining to common helper functions. This is in preparation for the next commit which ensures that all vCPU threads are fully created before entering guest mode on any one vCPU. No functional change intended. Signed-off-by: NDavid Matlack <dmatlack@google.com> Reviewed-by: NBen Gardon <bgardon@google.com> Message-Id: <20211111001257.1446428-3-dmatlack@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Copy perf_test_args to the guest during VM creation instead of relying on the caller to do so at their leisure. Ideally, tests wouldn't even be able to modify perf_test_args, i.e. they would have no motivation to do the sync, but enforcing that is arguably a net negative for readability. No functional change intended. [Set wr_fract=1 by default and add helper to override it since the new access_tracking_perf_test needs to set it dynamically.] Signed-off-by: NSean Christopherson <seanjc@google.com> Signed-off-by: NDavid Matlack <dmatlack@google.com> Reviewed-by: NBen Gardon <bgardon@google.com> Message-Id: <20211111000310.1435032-13-dmatlack@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Fill the per-vCPU args when creating the perf_test VM instead of having the caller do so. This helps ensure that any adjustments to the number of pages (and thus vcpu_memory_bytes) are reflected in the per-VM args. Automatically filling the per-vCPU args will also allow a future patch to do the sync to the guest during creation. Signed-off-by: NSean Christopherson <seanjc@google.com> [Updated access_tracking_perf_test as well.] Signed-off-by: NDavid Matlack <dmatlack@google.com> Reviewed-by: NBen Gardon <bgardon@google.com> Message-Id: <20211111000310.1435032-12-dmatlack@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Use the already computed guest_num_pages when creating the so called extra VM pages for a perf test, and add a comment explaining why the pages are allocated as extra pages. Signed-off-by: NSean Christopherson <seanjc@google.com> Reviewed-by: NBen Gardon <bgardon@google.com> Signed-off-by: NDavid Matlack <dmatlack@google.com> Message-Id: <20211111000310.1435032-11-dmatlack@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Remove perf_test_args.host_page_size and instead use getpagesize() so that it's somewhat obvious that, for tests that care about the host page size, they care about the system page size, not the hardware page size, e.g. that the logic is unchanged if hugepages are in play. No functional change intended. Signed-off-by: NSean Christopherson <seanjc@google.com> Reviewed-by: NBen Gardon <bgardon@google.com> Signed-off-by: NDavid Matlack <dmatlack@google.com> Message-Id: <20211111000310.1435032-10-dmatlack@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Move the per-VM GPA into perf_test_args instead of storing it as a separate global variable. It's not obvious that guest_test_phys_mem holds a GPA, nor that it's connected/coupled with per_vcpu->gpa. No functional change intended. Signed-off-by: NSean Christopherson <seanjc@google.com> Reviewed-by: NBen Gardon <bgardon@google.com> Signed-off-by: NDavid Matlack <dmatlack@google.com> Message-Id: <20211111000310.1435032-9-dmatlack@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Capture the per-vCPU GPA in perf_test_vcpu_args so that tests can get the GPA without having to calculate the GPA on their own. No functional change intended. Signed-off-by: NSean Christopherson <seanjc@google.com> Reviewed-by: NBen Gardon <bgardon@google.com> Signed-off-by: NDavid Matlack <dmatlack@google.com> Message-Id: <20211111000310.1435032-7-dmatlack@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Use 'pta' as a local pointer to the global perf_tests_args in order to shorten line lengths and make the code borderline readable. No functional change intended. Signed-off-by: NSean Christopherson <seanjc@google.com> Reviewed-by: NBen Gardon <bgardon@google.com> Signed-off-by: NDavid Matlack <dmatlack@google.com> Message-Id: <20211111000310.1435032-6-dmatlack@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Assert that the GPA for a memslot backed by a hugepage is aligned to the hugepage size and fix perf_test_util accordingly. Lack of GPA alignment prevents KVM from backing the guest with hugepages, e.g. x86's write-protection of hugepages when dirty logging is activated is otherwise not exercised. Add a comment explaining that guest_page_size is for non-huge pages to try and avoid confusion about what it actually tracks. Cc: Ben Gardon <bgardon@google.com> Cc: Yanan Wang <wangyanan55@huawei.com> Cc: Andrew Jones <drjones@redhat.com> Cc: Peter Xu <peterx@redhat.com> Cc: Aaron Lewis <aaronlewis@google.com> Signed-off-by: NSean Christopherson <seanjc@google.com> [Used get_backing_src_pagesz() to determine alignment dynamically.] Signed-off-by: NDavid Matlack <dmatlack@google.com> Message-Id: <20211111000310.1435032-5-dmatlack@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Refactor align() to work with non-pointers and split into separate helpers for aligning up vs. down. Add align_ptr_up() for use with pointers. Expose all helpers so that they can be used by tests and/or other utilities. The align_down() helper in particular will be used to ensure gpa alignment for hugepages. No functional change intended. [Added sepearate up/down helpers and replaced open-coded alignment bit math throughout the KVM selftests.] Signed-off-by: NSean Christopherson <seanjc@google.com> Signed-off-by: NDavid Matlack <dmatlack@google.com> Reviewed-by: NBen Gardon <bgardon@google.com> Message-Id: <20211111000310.1435032-3-dmatlack@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 06 8月, 2021 2 次提交
-
-
由 David Matlack 提交于
perf_test_util is used to set up KVM selftests where vCPUs touch a region of memory. The guest code is implemented in perf_test_util.c (not the calling selftests). The guest code requires a 1 parameter, the vcpuid, which has to be set by calling vcpu_args_set(vm, vcpu_id, 1, vcpu_id). Today all of the selftests that use perf_test_util are making this call. Instead, perf_test_util should just do it. This will save some code but more importantly prevents mistakes since totally non-obvious that this needs to be called and failing to do so results in vCPUs not accessing the right regions of memory. Signed-off-by: NDavid Matlack <dmatlack@google.com> Message-Id: <20210805172821.2622793-1-dmatlack@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 David Matlack 提交于
Introduce a new option to dirty_log_perf_test: -x number_of_slots. This causes the test to attempt to split the region of memory into the given number of slots. If the region cannot be evenly divided, the test will fail. This allows testing with more than one slot and therefore measure how performance scales with the number of memslots. Signed-off-by: NDavid Matlack <dmatlack@google.com> Message-Id: <20210804222844.1419481-8-dmatlack@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 24 6月, 2021 1 次提交
-
-
由 Sean Christopherson 提交于
Drop the memslot param from virt_pg_map() and virt_map() and shove the hardcoded '0' down to the vm_phy_page_alloc() calls. No functional change intended. Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210622200529.3650424-13-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 09 6月, 2021 1 次提交
-
-
由 Zhenzhong Duan 提交于
Until commit 39fe2fc9 ("selftests: kvm: make allocation of extra memory take effect", 2021-05-27), parameter extra_mem_pages was used only to calculate the page table size for all the memory chunks, because real memory allocation happened with calls of vm_userspace_mem_region_add() after vm_create_default(). Commit 39fe2fc9 however changed the meaning of extra_mem_pages to the size of memory slot 0. This makes the memory allocation more flexible, but makes it harder to account for the number of pages needed for the page tables. For example, memslot_perf_test has a small amount of memory in slot 0 but a lot in other slots, and adding that memory twice (both in slot 0 and with later calls to vm_userspace_mem_region_add()) causes an error that was fixed in commit 000ac429 ("selftests: kvm: fix overlapping addresses in memslot_perf_test", 2021-05-29) Since both uses are sensible, add a new parameter slot0_mem_pages to vm_create_with_vcpus() and some comments to clarify the meaning of slot0_mem_pages and extra_mem_pages. With this change, memslot_perf_test can go back to passing the number of memory pages as extra_mem_pages. Signed-off-by: NZhenzhong Duan <zhenzhong.duan@intel.com> Message-Id: <20210608233816.423958-4-zhenzhong.duan@intel.com> [Squashed in a single patch and rewrote the commit message. - Paolo] Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 27 5月, 2021 1 次提交
-
-
由 David Matlack 提交于
vm_get_max_gfn() casts vm->max_gfn from a uint64_t to an unsigned int, which causes the upper 32-bits of the max_gfn to get truncated. Nobody noticed until now likely because vm_get_max_gfn() is only used as a mechanism to create a memslot in an unused region of the guest physical address space (the top), and the top of the 32-bit physical address space was always good enough. This fix reveals a bug in memslot_modification_stress_test which was trying to create a dummy memslot past the end of guest physical memory. Fix that by moving the dummy memslot lower. Fixes: 52200d0d ("KVM: selftests: Remove duplicate guest mode handling") Reviewed-by: NVenkatesh Srinivas <venkateshs@chromium.org> Signed-off-by: NDavid Matlack <dmatlack@google.com> Message-Id: <20210521173828.1180619-1-dmatlack@google.com> Reviewed-by: NAndrew Jones <drjones@redhat.com> Reviewed-by: NPeter Xu <peterx@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 04 2月, 2021 2 次提交
-
-
由 Ben Gardon 提交于
Add a parameter to control the backing memory type for dirty_log_perf_test so that the test can be run with hugepages. To: linux-kselftest@vger.kernel.org CC: Peter Xu <peterx@redhat.com> CC: Andrew Jones <drjones@redhat.com> CC: Thomas Huth <thuth@redhat.com> Signed-off-by: NBen Gardon <bgardon@google.com> Message-Id: <20210202185734.1680553-28-bgardon@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Ben Gardon 提交于
Add an option to overlap the ranges of memory each vCPU accesses instead of partitioning them. This option will increase the probability of multiple vCPUs faulting on the same page at the same time, and causing interesting races, if there are bugs in the page fault handler or elsewhere in the kernel. Reviewed-by: NJacob Xu <jacobhxu@google.com> Reviewed-by: NMakarand Sonare <makarandsonare@google.com> Signed-off-by: NBen Gardon <bgardon@google.com> Message-Id: <20210112214253.463999-6-bgardon@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 08 1月, 2021 2 次提交
-
-
由 Andrew Jones 提交于
It's not conventional C to put non-inline functions in header files. Create a source file for the functions instead. Also reduce the amount of globals and rename the functions to something less generic. Reviewed-by: NBen Gardon <bgardon@google.com> Reviewed-by: NPeter Xu <peterx@redhat.com> Signed-off-by: NAndrew Jones <drjones@redhat.com> Message-Id: <20201218141734.54359-4-drjones@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Andrew Jones 提交于
Reviewed-by: NBen Gardon <bgardon@google.com> Signed-off-by: NAndrew Jones <drjones@redhat.com> Message-Id: <20201218141734.54359-3-drjones@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 17 11月, 2020 1 次提交
-
-
由 Andrew Jones 提交于
Almost all tests do this anyway and the ones that don't don't appear to care. Only vmx_set_nested_state_test assumes that a feature (VMX) is disabled until later setting the supported CPUIDs. It's better to disable that explicitly anyway. Signed-off-by: NAndrew Jones <drjones@redhat.com> Message-Id: <20201111122636.73346-11-drjones@redhat.com> [Restore CPUID_VMX, or vmx_set_nested_state breaks. - Paolo] Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 08 11月, 2020 7 次提交
-
-
由 Ben Gardon 提交于
The dirty log perf test will time verious dirty logging operations (enabling dirty logging, dirtying memory, getting the dirty log, clearing the dirty log, and disabling dirty logging) in order to quantify dirty logging performance. This test can be used to inform future performance improvements to KVM's dirty logging infrastructure. This series was tested by running the following invocations on an Intel Skylake machine: dirty_log_perf_test -b 20m -i 100 -v 64 dirty_log_perf_test -b 20g -i 5 -v 4 dirty_log_perf_test -b 4g -i 5 -v 32 demand_paging_test -b 20m -v 64 demand_paging_test -b 20g -v 4 demand_paging_test -b 4g -v 32 All behaved as expected. Signed-off-by: NBen Gardon <bgardon@google.com> Message-Id: <20201027233733.1484855-6-bgardon@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Andrew Jones 提交于
We also check the input number of vcpus against the maximum supported. Signed-off-by: NAndrew Jones <drjones@redhat.com> Message-Id: <20201104212357.171559-8-drjones@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Andrew Jones 提交于
Rename vcpu_memory_bytes to something with "percpu" in it in order to be less ambiguous. Also make it global to simplify things. Signed-off-by: NAndrew Jones <drjones@redhat.com> Message-Id: <20201104212357.171559-7-drjones@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Andrew Jones 提交于
Signed-off-by: NAndrew Jones <drjones@redhat.com> Message-Id: <20201104212357.171559-3-drjones@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Ben Gardon 提交于
Wrfract will be used by the dirty logging perf test introduced later in this series to dirty memory sparsely. This series was tested by running the following invocations on an Intel Skylake machine: dirty_log_perf_test -b 20m -i 100 -v 64 dirty_log_perf_test -b 20g -i 5 -v 4 dirty_log_perf_test -b 4g -i 5 -v 32 demand_paging_test -b 20m -v 64 demand_paging_test -b 20g -v 4 demand_paging_test -b 4g -v 32 All behaved as expected. Signed-off-by: NBen Gardon <bgardon@google.com> Message-Id: <20201027233733.1484855-5-bgardon@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Ben Gardon 提交于
Rounding the address the guest writes to a host page boundary will only have an effect if the host page size is larger than the guest page size, but in that case the guest write would still go to the same host page. There's no reason to round the address down, so remove the rounding to simplify the demand paging test. This series was tested by running the following invocations on an Intel Skylake machine: dirty_log_perf_test -b 20m -i 100 -v 64 dirty_log_perf_test -b 20g -i 5 -v 4 dirty_log_perf_test -b 4g -i 5 -v 32 demand_paging_test -b 20m -v 64 demand_paging_test -b 20g -v 4 demand_paging_test -b 4g -v 32 All behaved as expected. Signed-off-by: NBen Gardon <bgardon@google.com> Message-Id: <20201027233733.1484855-3-bgardon@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Ben Gardon 提交于
Much of the code in demand_paging_test can be reused by other, similar multi-vCPU-memory-touching-perfromance-tests. Factor that common code out for reuse. No functional change expected. This series was tested by running the following invocations on an Intel Skylake machine: dirty_log_perf_test -b 20m -i 100 -v 64 dirty_log_perf_test -b 20g -i 5 -v 4 dirty_log_perf_test -b 4g -i 5 -v 32 demand_paging_test -b 20m -v 64 demand_paging_test -b 20g -v 4 demand_paging_test -b 4g -v 32 All behaved as expected. Signed-off-by: NBen Gardon <bgardon@google.com> Message-Id: <20201027233733.1484855-2-bgardon@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-