提交 · e885cafc6a0b97890ec404ee411a3610117b0494 · openanolis / cloud-kernel

17 1月, 2020 40 次提交

iommu/amd: Wait for completion of IOTLB flush in attach_device · e885cafc

由 Filippo Sironi 提交于 11月 26, 2019

commit 0b15e02f0cc4fb34a9160de7ba6db3a4013dc1b7 upstream.

To make sure the domain tlb flush completes before the
function returns, explicitly wait for its completion.
Signed-off-by: NFilippo Sironi <sironi@amazon.de>
Fixes: 42a49f96 ("amd-iommu: flush domain tlb when attaching a new device")
[joro: Added commit message and fixes tag]
Signed-off-by: NJoerg Roedel <jroedel@suse.de>
Signed-off-by: NWANG Siyuan <Siyuan.Wang@amd.com>
Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>

e885cafc

iommu/amd: fix a crash in iova_magazine_free_pfns · 37a64909

由 Qian Cai 提交于 11月 26, 2019

commit 8cf66504210d308a35cca35fe9c310b1241f9fa7 upstream.

The commit b3aa14f02254 ("iommu: remove the mapping_error dma_map_ops
method") incorrectly changed the checking from dma_ops_alloc_iova() in
map_sg() causes a crash under memory pressure as dma_ops_alloc_iova()
never return DMA_MAPPING_ERROR on failure but 0, so the error handling
is all wrong.

   kernel BUG at drivers/iommu/iova.c:801!
    Workqueue: kblockd blk_mq_run_work_fn
    RIP: 0010:iova_magazine_free_pfns+0x7d/0xc0
    Call Trace:
     free_cpu_cached_iovas+0xbd/0x150
     alloc_iova_fast+0x8c/0xba
     dma_ops_alloc_iova.isra.6+0x65/0xa0
     map_sg+0x8c/0x2a0
     scsi_dma_map+0xc6/0x160
     pqi_aio_submit_io+0x1f6/0x440 [smartpqi]
     pqi_scsi_queue_command+0x90c/0xdd0 [smartpqi]
     scsi_queue_rq+0x79c/0x1200
     blk_mq_dispatch_rq_list+0x4dc/0xb70
     blk_mq_sched_dispatch_requests+0x249/0x310
     __blk_mq_run_hw_queue+0x128/0x200
     blk_mq_run_work_fn+0x27/0x30
     process_one_work+0x522/0xa10
     worker_thread+0x63/0x5b0
     kthread+0x1d2/0x1f0
     ret_from_fork+0x22/0x40

Fixes: b3aa14f02254 ("iommu: remove the mapping_error dma_map_ops method")
Signed-off-by: NQian Cai <cai@lca.pw>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NWANG Siyuan <Siyuan.Wang@amd.com>
Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>

37a64909

iommu: remove the mapping_error dma_map_ops method · 4b14fd99

由 Christoph Hellwig 提交于 11月 26, 2019

commit b3aa14f022543ed86823c97c145495b747102fa9 upstream.

Return DMA_MAPPING_ERROR instead of 0 on a dma mapping failure and let
the core dma-mapping code handle the rest.

Note that the existing code used AMD_IOMMU_MAPPING_ERROR to check from
a 0 return from the IOVA allocator, which is replaced with an explicit
0 as in the implementation and other users of that interface.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NWANG Siyuan <Siyuan.Wang@amd.com>
Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>

4b14fd99

iommu: Fix IOMMU debugfs fallout · 4fa497af

由 Geert Uytterhoeven 提交于 11月 26, 2019

commit 18b3af4492a0aa6046b86d712f6ba4cbb66100fb upstream.

A change made in the final version of IOMMU debugfs support replaced the
public function iommu_debugfs_new_driver_dir() by the public dentry
iommu_debugfs_dir in <linux/iommu.h>, but forgot to update both the
implementation in iommu-debugfs.c, and the patch description.

Fix this by exporting iommu_debugfs_dir, and removing the reference to
and implementation of iommu_debugfs_new_driver_dir().

Fixes: bad614b2 ("iommu: Enable debugfs exposure of IOMMU driver internals")
Signed-off-by: NGeert Uytterhoeven <geert+renesas@glider.be>
Acked-by: NGary R Hook <gary.hook@amd.com>
Signed-off-by: NJoerg Roedel <jroedel@suse.de>
Signed-off-by: NWANG Siyuan <Siyuan.Wang@amd.com>
Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>

4fa497af

dma-mapping: provide a generic DMA_MAPPING_ERROR · 58c0aa62

由 Christoph Hellwig 提交于 11月 26, 2019

commit 42ee3cae0ed38b6c04038bf851ea2496da2135bb upstream.

Error handling of the dma_map_single and dma_map_page APIs is a little
problematic at the moment, in that we use different encodings in the
returned dma_addr_t to indicate an error.  That means we require an
additional indirect call to figure out if a dma mapping call returned
an error, and a lot of boilerplate code to implement these semantics.

Instead return the maximum addressable value as the error.  As long
as we don't allow mapping single-byte ranges with single-byte alignment
this value can never be a valid return.  Additionaly if drivers do
not check the return value from the dma_map* routines this values means
they will generally not be pointed to actual memory.

Once the default value is added here we can start removing the
various mapping_error methods and just rely on this generic check.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NRobin Murphy <robin.murphy@arm.com>
Acked-by: NRussell King <rmk+kernel@armlinux.org.uk>
Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NWANG Siyuan <Siyuan.Wang@amd.com>
Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>

58c0aa62

EDAC/amd64: Adjust printed chip select sizes when interleaved · f2e28686

由 Yazen Ghannam 提交于 11月 19, 2019

commit fc00c6a416381010c4a721a4142ddd0260d68f20 upstream.

AMD systems may support chip select interleaving. However, on family
17h+ this was not taken into account when printing the chip select
sizes.

Add support to detect if chip selects are interleaved on family 17h+,
and adjust the sizes accordingly.
Signed-off-by: NYazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Tested-by: NKim Phillips <kim.phillips@amd.com>
Cc: James Morse <james.morse@arm.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: https://lkml.kernel.org/r/20190228153558.127292-6-Yazen.Ghannam@amd.comSigned-off-by: NWANG Siyuan <Siyuan.Wang@amd.com>
Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>

f2e28686

EDAC/amd64: Support more than two controllers for chip select handling · ec89b933

由 Yazen Ghannam 提交于 11月 19, 2019

commit 0a227af521d6df5286550b62f4b591417170b4ea upstream.

The struct chip_select array that's used for saving chip select bases
and masks is fixed at length of two. There should be one struct
chip_select for each controller, so this array should be increased to
support systems that may have more than two controllers.

Increase the size of the struct chip_select array to eight, which is the
largest number of controllers per die currently supported on AMD
systems.

Also, carve out the Family 17h+ reading of the bases/masks into a
separate function. This effectively reverts the original bases/masks
reading code to before Family 17h support was added.
Signed-off-by: NYazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Tested-by: NKim Phillips <kim.phillips@amd.com>
Cc: James Morse <james.morse@arm.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: https://lkml.kernel.org/r/20190228153558.127292-5-Yazen.Ghannam@amd.comSigned-off-by: NWANG Siyuan <Siyuan.Wang@amd.com>
Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>

ec89b933

EDAC/amd64: Recognize x16 symbol size · f5dcc1bf

由 Yazen Ghannam 提交于 11月 19, 2019

commit 7835961d377b75ab9ae77f715e378fcb72508306 upstream.

Future AMD systems may support x16 symbol sizes.

Recognize if a system is using x16 symbol size. Also, simplify the print
statement.

Note that a x16 syndrome vector table is not necessary like with x4 or
x8 syndromes. This is because systems that support x16 symbol sizes are
SMCA systems and in that case, the syndrome can be directly extracted
from the MCA_SYND[Syndrome] field.

 [ bp: massage. ]
Signed-off-by: NYazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Tested-by: NKim Phillips <kim.phillips@amd.com>
Cc: James Morse <james.morse@arm.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: https://lkml.kernel.org/r/20190228153558.127292-4-Yazen.Ghannam@amd.comSigned-off-by: NWANG Siyuan <Siyuan.Wang@amd.com>
Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>

f5dcc1bf

EDAC/amd64: Set maximum channel layer size depending on family · 2c71ebbb

由 Yazen Ghannam 提交于 11月 19, 2019

commit 869adc4316ea348e3c52af2494d9b1f6bd68abbd upstream.

The AMD64 EDAC module currently hardcodes the EDAC channel layer size
count to two. Future AMD systems may have more channels than this.

Set the EDAC channel layer size equal to the maximum number of channels
possible for the system. On Family 17h and later, this is set in the
num_umcs variable. Older systems will continue to use two as the
default.
Signed-off-by: NYazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Cc: James Morse <james.morse@arm.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: https://lkml.kernel.org/r/20190325203319.7603-1-Yazen.Ghannam@amd.comSigned-off-by: NWANG Siyuan <Siyuan.Wang@amd.com>
Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>

2c71ebbb

EDAC/amd64: Support more than two Unified Memory Controllers · 2c2b112c

由 Yazen Ghannam 提交于 11月 19, 2019

commit bdcee7747f5c490297665af0e1e0fbeb4368804d upstream.

The first few models of Family 17h all had 2 Unified Memory Controllers
per Die, so this was treated as a fixed value. However, future systems
may have more Unified Memory Controllers per Die.

Related to this, the channel number and base address of a Unified Memory
Controller were found by matching on fixed, known values. However,
current and future systems follow this pattern for the channel number
and base address of a Unified Memory Controller: 0xYXXXXX, where Y is
the channel number. So matching on hardcoded values is not necessary.

Set the number of Unified Memory Controllers at driver init time based
on the family/model. Also, update the functions that find the channel
number and base address of a Unified Memory Controller to support more
than two.

 [ bp: Move num_umcs into the .c file and simplify comment. ]
Signed-off-by: NYazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Tested-by: NKim Phillips <kim.phillips@amd.com>
Cc: James Morse <james.morse@arm.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: https://lkml.kernel.org/r/20190228153558.127292-3-Yazen.Ghannam@amd.comSigned-off-by: NWANG Siyuan <Siyuan.Wang@amd.com>
Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>

2c2b112c

EDAC/amd64: Use a macro for iterating over Unified Memory Controllers · 6144c0f5

由 Yazen Ghannam 提交于 11月 19, 2019

commit 4d30d2bc3c23e63c2608bc5b03b0960490d5b740 upstream.

Define and use a macro for looping over the number of Unified Memory
Controllers.

No functional change.
Signed-off-by: NYazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Tested-by: NKim Phillips <kim.phillips@amd.com>
Cc: James Morse <james.morse@arm.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: https://lkml.kernel.org/r/20190228153558.127292-2-Yazen.Ghannam@amd.comSigned-off-by: NWANG Siyuan <Siyuan.Wang@amd.com>
Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>

6144c0f5

EDAC/amd64: Add Family 17h Model 30h PCI IDs · efff4c14

由 Yazen Ghannam 提交于 11月 19, 2019

commit 6e846239e5487cbb89ac8192d5f11437d010130e upstream.

Add the new Family 17h Model 30h PCI IDs to the AMD64 EDAC module.

This also fixes a probe failure that appeared when some other PCI IDs
for Family 17h Model 30h were added to the AMD NB code.

Fixes: be3518a16ef2 (x86/amd_nb: Add PCI device IDs for family 17h, model 30h)
Signed-off-by: NYazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Tested-by: NKim Phillips <kim.phillips@amd.com>
Cc: James Morse <james.morse@arm.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: https://lkml.kernel.org/r/20190228153558.127292-1-Yazen.Ghannam@amd.comSigned-off-by: NWANG Siyuan <Siyuan.Wang@amd.com>
Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>

efff4c14

EDAC/mce_amd: Decode MCA_STATUS[Scrub] bit · 3d04fc72

由 Yazen Ghannam 提交于 11月 19, 2019

commit 3f4da372ec8e4ce58c17ac4f2e3c8891bbfea17e upstream.

Previous AMD systems have had a bit in MCA_STATUS to indicate that an
error was detected on a scrub operation. However, this bit was defined
differently within different banks and families/models.

Starting with Family 17h, MCA_STATUS[40] is either Reserved/Read-as-Zero
or defined as "Scrub", for all MCA banks and CPU models. Therefore, this
bit can be defined as the "Scrub" bit.

Define MCA_STATUS[40] as "Scrub" and decode it in the AMD MCE decoding
module for Family 17h and newer systems.
Signed-off-by: NYazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Morse <james.morse@arm.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Pu Wen <puwen@hygon.cn>
Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190212212417.107049-1-Yazen.Ghannam@amd.comSigned-off-by: NWANG Siyuan <Siyuan.Wang@amd.com>
Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>

3d04fc72

x86/MCE/AMD, EDAC/mce_amd: Add new error descriptions for some SMCA bank types · de7cb125

由 Yazen Ghannam 提交于 11月 19, 2019

commit 8a5dd2cd2f2e94878cacc969655a69ca214795ab upstream.

Some SMCA bank types on future systems will report new error types even
though the bank type is not treated as a new version. These new error
types will reported by bits that are reserved in past systems.

Add the new error descriptions to the lists in edac_mce_amd.
Signed-off-by: NYazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Shirish S <Shirish.S@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190201225534.8177-4-Yazen.Ghannam@amd.comSigned-off-by: NWANG Siyuan <Siyuan.Wang@amd.com>
Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>

de7cb125

x86/MCE/AMD, EDAC/mce_amd: Add new McaTypes for CS, PSP, and SMU units · 0b8080b5

由 Yazen Ghannam 提交于 11月 19, 2019

commit 3ad7e748c12cc771df6020a552def3e1727e8a17 upstream.

The existing CS, PSP, and SMU SMCA bank types will see new versions (as
indicated by their McaTypes) in future SMCA systems.

Add the new (HWID, MCATYPE) tuples for these new versions. Reuse the
same names as the older versions, since they are logically the same to
the user. SMCA systems won't mix and match IP blocks with different
McaType versions in the same system, so there isn't a need to
distinguish them. The MCA_IPID register is saved when logging an MCA
error, and that can be used to triage the error.

Also, add the new error descriptions to edac_mce_amd. Some error types
(positions in the list) are overloaded compared to the previous
McaTypes. Therefore, just create new lists of the error descriptions to
keep things simple even if some of the error descriptions are the same
between versions.
Signed-off-by: NYazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Pu Wen <puwen@hygon.cn>
Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Cc: Shirish S <Shirish.S@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190201225534.8177-3-Yazen.Ghannam@amd.comSigned-off-by: NWANG Siyuan <Siyuan.Wang@amd.com>
Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>

0b8080b5

x86/MCE/AMD, EDAC/mce_amd: Add new MP5, NBIO, and PCIE SMCA bank types · 43886175

由 Yazen Ghannam 提交于 11月 19, 2019

commit cbfa447edd6a3825fdb8a4ffae74ff7208f2d2c0 upstream.

Add the (HWID, MCATYPE) tuples and names for the new MP5, NBIO, and
PCIE SMCA bank types.

Also, add their respective error descriptions to the MCE decoding module
edac_mce_amd.
Signed-off-by: NYazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Pu Wen <puwen@hygon.cn>
Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Cc: Shirish S <Shirish.S@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190201225534.8177-2-Yazen.Ghannam@amd.comSigned-off-by: NWANG Siyuan <Siyuan.Wang@amd.com>
Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>

43886175

x86/amd_nb: Add PCI device IDs for family 17h, model 30h · 4bcf519a

由 Woods, Brian 提交于 11月 19, 2019

commit be3518a16ef270e3b030a6ae96055f83f51bd3dd upstream.

Add the PCI device IDs for family 17h model 30h, since they are needed
for accessing various registers via the data fabric/SMN interface.
Signed-off-by: NBrian Woods <brian.woods@amd.com>
Signed-off-by: NBorislav Petkov <bp@suse.de>
CC: Bjorn Helgaas <bhelgaas@google.com>
CC: Clemens Ladisch <clemens@ladisch.de>
CC: Guenter Roeck <linux@roeck-us.net>
CC: "H. Peter Anvin" <hpa@zytor.com>
CC: Ingo Molnar <mingo@redhat.com>
CC: Jean Delvare <jdelvare@suse.com>
CC: Jia Zhang <qianyue.zj@alibaba-inc.com>
CC: <linux-hwmon@vger.kernel.org>
CC: <linux-pci@vger.kernel.org>
CC: Pu Wen <puwen@hygon.cn>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: x86-ml <x86@kernel.org>
Link: http://lkml.kernel.org/r/20181106200754.60722-4-brian.woods@amd.comSigned-off-by: NWANG Siyuan <Siyuan.Wang@amd.com>
Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>

4bcf519a

x86/amd_nb: Add support for newer PCI topologies · 27e3c2c7

由 Woods, Brian 提交于 11月 19, 2019

commit 556e4c62baffa71e2045a298379db7e57dd47f3d upstream.

Add support for new processors which have multiple PCI root complexes
per data fabric/system management network interface.  If there are (N)
multiple PCI roots per DF/SMN interface, then the PCI roots are
redundant (as far as SMN/DF access goes).  For each DF/SMN interface:
map to the first available PCI root and skip the next N-1 PCI roots so
the following DF/SMN interface get mapped to a correct PCI root.

Ex:
DF/SMN 0 -> 60
	    40
	    20
	    00
DF/SMN 1 -> e0
	    c0
	    a0
	    80
Signed-off-by: NBrian Woods <brian.woods@amd.com>
Signed-off-by: NBorislav Petkov <bp@suse.de>
CC: Bjorn Helgaas <bhelgaas@google.com>
CC: Clemens Ladisch <clemens@ladisch.de>
CC: Guenter Roeck <linux@roeck-us.net>
CC: "H. Peter Anvin" <hpa@zytor.com>
CC: Ingo Molnar <mingo@redhat.com>
CC: Jean Delvare <jdelvare@suse.com>
CC: Jia Zhang <qianyue.zj@alibaba-inc.com>
CC: <linux-hwmon@vger.kernel.org>
CC: <linux-pci@vger.kernel.org>
CC: Pu Wen <puwen@hygon.cn>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: x86-ml <x86@kernel.org>
Link: http://lkml.kernel.org/r/20181106200754.60722-3-brian.woods@amd.comSigned-off-by: NWANG Siyuan <Siyuan.Wang@amd.com>
Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>

27e3c2c7

alinux: sched/fair: use static load in wake_affine_weight · d2440c99

由 Huaixin Chang 提交于 12月 23, 2019

For a long time runnable cpu load has been used in selecting task rq
when waking up tasks. Recent test has shown for test load with a large
quantity of short running tasks and almost full cpu utility, static load
is more helpful.

In our e2e tests, runnable load avg of java threads ranges from less than
10 to as large as 362, while these java threads are no different from
each other, and should be treated in the same way. After using static
load, qps imporvement has been seen in multiple test cases.

A new sched feature WA_STATIC_WEIGHT is introduced here to control. Echo
WA_STATIC_WEIGHT to /sys/kernel/debug/sched_features to turn static load
in wake_affine_weight on and NO_WA_STATIC_WEIGHT to turn it off. This
feature is kept off by default.

Test is done on the following hardware:

4 threads Intel(R) Xeon(R) Platinum 8269CY CPU @ 2.50GHz

In tests with 120 threads and sql loglevel configured to info:

	NO_WA_STATIC_WEIGHT     WA_STATIC_WEIGHT
	33170.63                34614.95 (+4.35%)

In tests with 160 threads and sql loglevel configured to info:

	NO_WA_STATIC_WEIGHT     WA_STATIC_WEIGHT
	35888.71                38247.20 (+6.57%)

In tests with 160 threads and sql loglevel configured to warn:

	NO_WA_STATIC_WEIGHT     WA_STATIC_WEIGHT
	39118.72                39698.72 (+1.48%)
Signed-off-by: NHuaixin Chang <changhuaixin@linux.alibaba.com>
Acked-by: NShanpei Chen <shanpeic@linux.alibaba.com>

d2440c99

modsign: use all trusted keys to verify module signature · f7a573c3

由 Ke Wu 提交于 11月 06, 2018

commit e84cd7ee630e44a2cc8ae49e85920a271b214cb3 upstream

Make mod_verify_sig to use all trusted keys. This allows keys in
secondary_trusted_keys to be used to verify PKCS#7 signature on a
kernel module.
Signed-off-by: NKe Wu <mikewu@google.com>
Signed-off-by: NJessica Yu <jeyu@kernel.org>
Signed-off-by: NTianjia Zhang <tianjia.zhang@linux.alibaba.com>
Reviewed-by: Jia Zhang <zhang.jia@linux.alibaba.com>

f7a573c3

tpm: Fix off-by-one when reading binary_bios_measurements · 1031b869

由 jia zhang 提交于 1月 11, 2019

commit 64494d39ff630a63b5308042b20132b491e3706b upstream

It is unable to read the entry when it is the only one in
binary_bios_measurements:

00000000  00 00 00 00 08 00 00 00  c4 2f ed ad 26 82 00 cb
00000010  1d 15 f9 78 41 c3 44 e7  9d ae 33 20 00 00 00 00
00000020

This is obviously a firmware problem on my linux machine:

	Manufacturer: Inspur
	Product Name: SA5212M4
	Version: 01

However, binary_bios_measurements should return it any way,
rather than nothing, after all its content is completely
valid.

Fixes: 55a82ab3 ("tpm: add bios measurement log")
Signed-off-by: Jia Zhang <zhang.jia@linux.alibaba.com>
Reviewd-by: NJarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: NJarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: NTianjia Zhang <tianjia.zhang@linux.alibaba.com>
Reviewed-by: Jia Zhang <zhang.jia@linux.alibaba.com>

1031b869

tpm: Simplify the measurements loop · 0448f28b

由 jia zhang 提交于 1月 11, 2019

commit bb3b6b0fc57182b568ded61c55eff8a02fcfe27b upstream

The responsibility of tpm1_bios_measurements_start() is to walk over the
first *pos measurements, ensuring the skipped and to-be-read
measurements are not out-of-boundary.

This commit simplifies the loop by employing a do-while loop with
the necessary sanity check.
Signed-off-by: Jia Zhang <zhang.jia@linux.alibaba.com>
Reviewd-by: NJarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: NJarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: NTianjia Zhang <tianjia.zhang@linux.alibaba.com>
Reviewed-by: Jia Zhang <zhang.jia@linux.alibaba.com>

0448f28b

alinux: hotfix: Add Cloud Kernel hotfix enhancement · f94e5b1a

由 Xunlei Pang 提交于 7月 18, 2019

We reserve some fields beforehand for core structures prone to change,
so that we won't hurt when extra fields have to be added for hotfix,
thereby inceasing the success rate, we even can hot add features with
this enhancement.

After reserving, normally cache does not matter as the reserved fields
(usually at tail) are not accessed at all.

Currently involve the following structures:
    MM:
    struct zone
    struct pglist_data
    struct mm_struct
    struct vm_area_struct
    struct mem_cgroup
    struct writeback_control

    Block:
    struct gendisk
    struct backing_dev_info
    struct bio
    struct queue_limits
    struct request_queue
    struct blkcg
    struct blkcg_policy
    struct blk_mq_hw_ctx
    struct blk_mq_tag_set
    struct blk_mq_queue_data
    struct blk_mq_ops
    struct elevator_mq_ops
    struct inode
    struct dentry
    struct address_space
    struct block_device
    struct hd_struct
    struct bio_set

    Network:
    struct sk_buff
    struct sock
    struct net_device_ops
    struct xt_target
    struct dst_entry
    struct dst_ops
    struct fib_rule

    Scheduler:
    struct task_struct
    struct cfs_rq
    struct rq
    struct sched_statistics
    struct sched_entity
    struct signal_struct
    struct task_group
    struct cpuacct

    cgroup:
    struct cgroup_root
    struct cgroup_subsys_state
    struct cgroup_subsys
    struct css_set
Reviewed-by: NJoseph Qi <joseph.qi@linux.alibaba.com>
Signed-off-by: NXunlei Pang <xlpang@linux.alibaba.com>
[ caspar: use SPDX-License-Identifier ]
Signed-off-by: NCaspar Zhang <caspar@linux.alibaba.com>

f94e5b1a

x86/unwind/orc: Remove boot-time ORC unwind tables sorting · cea84490

由 Shile Zhang 提交于 12月 04, 2019

commit f14bf6a350dfd6613dbf91be5b423bc7eab690da upstream.

Now that the orc_unwind and orc_unwind_ip tables are sorted at build time,
remove the boot time sorting pass.

No change in functionality.

[ mingo: Rewrote the changelog and code comments. ]
Signed-off-by: NShile Zhang <shile.zhang@linux.alibaba.com>
Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kbuild@vger.kernel.org
Link: https://lkml.kernel.org/r/20191204004633.88660-8-shile.zhang@linux.alibaba.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>

cea84490

scripts/sorttable: Implement build-time ORC unwind table sorting · 74de6a39

由 Shile Zhang 提交于 12月 04, 2019

commit 57fa1899428538e314a7e0d52a5b617af082389a upstream.

The ORC unwinder has two tables: .orc_unwind_ip and .orc_unwind, which
need to be sorted for binary search. Previously this sorting was done
during bootup.

Sort them at build time to speed up booting.

Add the ORC tables sorting in a parallel build process to speed up the build.

[ mingo: Rewrote the changelog and fixed some comments. ]
Suggested-by: NAndy Lutomirski <luto@amacapital.net>
Suggested-by: NPeter Zijlstra <peterz@infradead.org>
Reported-by: Nkbuild test robot <lkp@intel.com>
Signed-off-by: NShile Zhang <shile.zhang@linux.alibaba.com>
Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Michal Marek <michal.lkml@markovi.net>
Cc: linux-kbuild@vger.kernel.org
Link: https://lkml.kernel.org/r/20191204004633.88660-7-shile.zhang@linux.alibaba.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>

74de6a39

scripts/sorttable: Rename 'sortextable' to 'sorttable' · 1fd0b7e9

由 Shile Zhang 提交于 12月 04, 2019

commit 1091670637be8bd34a39dd1ddcc0a10a7c88d4e2 upstream.

Use a more generic name for additional table sorting usecases,
such as the upcoming ORC table sorting feature. This tool is
not tied to exception table sorting anymore.

No functional changes intended.

[ mingo: Rewrote the changelog. ]
Signed-off-by: NShile Zhang <shile.zhang@linux.alibaba.com>
Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Michal Marek <michal.lkml@markovi.net>
Cc: linux-kbuild@vger.kernel.org
Link: https://lkml.kernel.org/r/20191204004633.88660-6-shile.zhang@linux.alibaba.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>

1fd0b7e9

scripts/sortextable: Refactor the do_func() function · f2180acd

由 Shile Zhang 提交于 12月 04, 2019

commit 57cafdf2a04e161b9654c4ae3888a7549594c499 upstream.

Refine the loop, naming and code structure, make the code more readable
and extendable. No functional changes intended.
Signed-off-by: NShile Zhang <shile.zhang@linux.alibaba.com>
Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Michal Marek <michal.lkml@markovi.net>
Cc: linux-kbuild@vger.kernel.org
Link: https://lkml.kernel.org/r/20191204004633.88660-5-shile.zhang@linux.alibaba.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>

f2180acd

scripts/sortextable: Remove dead code · dd4b2860

由 Shile Zhang 提交于 12月 04, 2019

commit abe4f92ca8948a3e04c56788354933c326909acb upstream.
Signed-off-by: NShile Zhang <shile.zhang@linux.alibaba.com>
Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Michal Marek <michal.lkml@markovi.net>
Cc: linux-kbuild@vger.kernel.org
Link: https://lkml.kernel.org/r/20191204004633.88660-4-shile.zhang@linux.alibaba.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>

dd4b2860

scripts/sortextable: Clean up the code to meet the kernel coding style better · e3f834c3

由 Shile Zhang 提交于 12月 04, 2019

commit 6402e1416255a7bb94834925ba0255c750f54a2d upstream.

Fix various style errors and inconsistencies, no functional changes
intended.
Signed-off-by: NShile Zhang <shile.zhang@linux.alibaba.com>
Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Michal Marek <michal.lkml@markovi.net>
Cc: linux-kbuild@vger.kernel.org
Link: https://lkml.kernel.org/r/20191204004633.88660-3-shile.zhang@linux.alibaba.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>

e3f834c3

scripts/sortextable: Rewrite error/success handling · 99991623

由 Shile Zhang 提交于 12月 04, 2019

commit 3c47b787b6516d2c3cbaa193fe13a83adbaaad1f upstream.

The scripts/sortextable.c code has originally copied some code from
scripts/recordmount.c, which used the same setjmp/longjmp method to
manage control flow.

Meanwhile recordmcount has improved its error handling via:

   3f1df12019f3 ("recordmcount: Rewrite error/success handling").

So rewrite this part of sortextable as well to get rid of the setjmp/longjmp
kludges, with additional refactoring, to make it more readable and
easier to extend.

No functional changes intended.

[ mingo: Rewrote the changelog. ]
Signed-off-by: NShile Zhang <shile.zhang@linux.alibaba.com>
Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Michal Marek <michal.lkml@markovi.net>
Cc: linux-kbuild@vger.kernel.org
Link: https://lkml.kernel.org/r/20191204004633.88660-2-shile.zhang@linux.alibaba.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>

99991623

alinux: introduce psi_v1 boot parameter · dc159a61

由 Joseph Qi 提交于 12月 25, 2019

Instead using static kconfig CONFIG_PSI_CGROUP_V1, we introduce a boot
parameter psi_v1 to enable psi cgroup v1 support. Default it is
disabled, which means when passing psi=1 boot parameter, we only support
cgroup v2.
This is to keep consistent with other cgroup v1 features such as cgroup
writeback v1 (cgwb_v1).
Signed-off-by: NJoseph Qi <joseph.qi@linux.alibaba.com>
Acked-by: NXunlei Pang <xlpang@linux.alibaba.com>

dc159a61

alinux: psi: Support PSI under cgroup v1 · 1f49a738

由 Xunlei Pang 提交于 12月 23, 2019

Export "cpu|io|memory.pressure" under cgroup v1 "cpuacct" subsystem.
Reviewed-by: NJoseph Qi <joseph.qi@linux.alibaba.com>
Signed-off-by: NXunlei Pang <xlpang@linux.alibaba.com>

1f49a738

perf/x86: Make perf callchains work without CONFIG_FRAME_POINTER · 3e25ac47

由 Kairui Song 提交于 4月 23, 2019

commit d15d356887e770c5f2dcf963b52c7cb510c9e42d upstream.

Currently perf callchain doesn't work well with ORC unwinder
when sampling from trace point. We'll get useless in kernel callchain
like this:

perf  6429 [000]    22.498450:             kmem:mm_page_alloc: page=0x176a17 pfn=1534487 order=0 migratetype=0 gfp_flags=GFP_KERNEL
    ffffffffbe23e32e __alloc_pages_nodemask+0x22e (/lib/modules/5.1.0-rc3+/build/vmlinux)
	7efdf7f7d3e8 __poll+0x18 (/usr/lib64/libc-2.28.so)
	5651468729c1 [unknown] (/usr/bin/perf)
	5651467ee82a main+0x69a (/usr/bin/perf)
	7efdf7eaf413 __libc_start_main+0xf3 (/usr/lib64/libc-2.28.so)
    5541f689495641d7 [unknown] ([unknown])

The root cause is that, for trace point events, it doesn't provide a
real snapshot of the hardware registers. Instead perf tries to get
required caller's registers and compose a fake register snapshot
which suppose to contain enough information for start a unwinding.
However without CONFIG_FRAME_POINTER, if failed to get caller's BP as the
frame pointer, so current frame pointer is returned instead. We get
a invalid register combination which confuse the unwinder, and end the
stacktrace early.

So in such case just don't try dump BP, and let the unwinder start
directly when the register is not a real snapshot. Use SP
as the skip mark, unwinder will skip all the frames until it meet
the frame of the trace point caller.

Tested with frame pointer unwinder and ORC unwinder, this makes perf
callchain get the full kernel space stacktrace again like this:

perf  6503 [000]  1567.570191:             kmem:mm_page_alloc: page=0x16c904 pfn=1493252 order=0 migratetype=0 gfp_flags=GFP_KERNEL
    ffffffffb523e2ae __alloc_pages_nodemask+0x22e (/lib/modules/5.1.0-rc3+/build/vmlinux)
    ffffffffb52383bd __get_free_pages+0xd (/lib/modules/5.1.0-rc3+/build/vmlinux)
    ffffffffb52fd28a __pollwait+0x8a (/lib/modules/5.1.0-rc3+/build/vmlinux)
    ffffffffb521426f perf_poll+0x2f (/lib/modules/5.1.0-rc3+/build/vmlinux)
    ffffffffb52fe3e2 do_sys_poll+0x252 (/lib/modules/5.1.0-rc3+/build/vmlinux)
    ffffffffb52ff027 __x64_sys_poll+0x37 (/lib/modules/5.1.0-rc3+/build/vmlinux)
    ffffffffb500418b do_syscall_64+0x5b (/lib/modules/5.1.0-rc3+/build/vmlinux)
    ffffffffb5a0008c entry_SYSCALL_64_after_hwframe+0x44 (/lib/modules/5.1.0-rc3+/build/vmlinux)
	7f71e92d03e8 __poll+0x18 (/usr/lib64/libc-2.28.so)
	55a22960d9c1 [unknown] (/usr/bin/perf)
	55a22958982a main+0x69a (/usr/bin/perf)
	7f71e9202413 __libc_start_main+0xf3 (/usr/lib64/libc-2.28.so)
    5541f689495641d7 [unknown] ([unknown])
Co-developed-by: NJosh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: NKairui Song <kasong@redhat.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Young <dyoung@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20190422162652.15483-1-kasong@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
Signed-off-by: NShile Zhang <shile.zhang@linux.alibaba.com>
Acked-by: NJoseph Qi <joseph.qi@linux.alibaba.com>

3e25ac47

x86/cpufeatures: Add WBNOINVD feature definition · 4cef0f14

由 Janakarajan Natarajan 提交于 12月 18, 2019

commit 08e823c2c5899ef2de3aa1727233f1f19e8c1cc1 upstream.

Add a new cpufeature definition for the WBNOINVD instruction.

The WBNOINVD instruction writes all modified cache lines in all levels of
the cache associated with a processor to main memory while retaining the
cached values.

Both AMD and Intel support this instruction.
Signed-off-by: NJanakarajan Natarajan <Janakarajan.Natarajan@amd.com>
Signed-off-by: NBorislav Petkov <bp@suse.de>
CC: David Woodhouse <dwmw@amazon.co.uk>
CC: Fenghua Yu <fenghua.yu@intel.com>
CC: "H. Peter Anvin" <hpa@zytor.com>
CC: Ingo Molnar <mingo@redhat.com>
CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
CC: Rudolf Marek <r.marek@assembler.cz>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: x86-ml <x86@kernel.org>
Link: http://lkml.kernel.org/r/1541624211-32196-1-git-send-email-Janakarajan.Natarajan@amd.comSigned-off-by: NWANG Siyuan <Siyuan.Wang@amd.com>
Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>

4cef0f14

ACPI / APEI: Fix parsing HEST that includes Deferred Machine Check subtable · 5df57f07

由 Yazen Ghannam 提交于 12月 18, 2019

commit f3355298fc5a24eb7606448bc02a08b3485e5979 upstream.

ACPI 6.2 includes a new definition for a Deferred Machine Check "DMC"
subtable.

The definition of this subtable was included in following commit:

c042933d (ACPICA: Add support for new HEST subtable)

However, the HEST parsing function was not updated to include this new
subtable. Therefore, Linux will fail to parse the HEST on systems that
include a DMC entry.

Add the length check for the new DMC subtable so that HEST parsing
doesn't fail on systems that include it.
Signed-off-by: NYazen Ghannam <yazen.ghannam@amd.com>
Reviewed-by: NBorislav Petkov <bp@suse.de>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: NWANG Siyuan <Siyuan.Wang@amd.com>
Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>

5df57f07

ACPI / processor: Set P_LVL{2,3} idle state descriptions · 58e2ef5a

由 Yazen Ghannam 提交于 12月 18, 2019

commit 34a62cd0df89dd7034165048c0921d1314191b66 upstream.

The ACPI idle driver will fallback to using the legacy P_LVL* SystemIO
method of entering C-states if the _CST method is disabled and P_BLK is
defined. However, in this case the C2 and C3 states won't have a
description set, so the user will see "<null>" when reading the
description from sysfs.

Give each of these states a description.
Signed-off-by: NYazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: NWANG Siyuan <Siyuan.Wang@amd.com>
Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>

58e2ef5a

perf vendor events amd: perf PMU events for AMD Family 17h · 3253729e

由 Martin Liška 提交于 12月 18, 2019

commit 98c07a8f74f85a19aeee2016f5afa0c667fa694d upstream.

Thi patch adds PMC events for AMD Family 17 CPUs as defined in [1].  It
covers events described in section: 2.1.13. Regex pattern in mapfile.csv
covers all CPUs of the family.

[1] https://support.amd.com/TechDocs/54945_PPR_Family_17h_Models_00h-0Fh.pdfSigned-off-by: NMartin Liška <mliska@suse.cz>
Acked-by: NBorislav Petkov <bp@suse.de>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Jon Grimm <jon.grimm@amd.com>
Cc: Martin Jambor <mjambor@suse.cz>
Cc: William Cohen <wcohen@redhat.com>
Link: https://lkml.kernel.org/r/d65873ca-e402-b198-4fe9-8c4af81258c8@suse.czSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: NWANG Siyuan <Siyuan.Wang@amd.com>
Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>

3253729e

KVM: SVM: Workaround errata#1096 (insn_len maybe zero on SMAP violation) · 20ba1423

由 Singh, Brijesh 提交于 12月 18, 2019

commit 05d5a48635259e621ea26d01e8316c6feeb34190 upstream.

Errata#1096:

On a nested data page fault when CR.SMAP=1 and the guest data read
generates a SMAP violation, GuestInstrBytes field of the VMCB on a
VMEXIT will incorrectly return 0h instead the correct guest
instruction bytes .

Recommend Workaround:

To determine what instruction the guest was executing the hypervisor
will have to decode the instruction at the instruction pointer.

The recommended workaround can not be implemented for the SEV
guest because guest memory is encrypted with the guest specific key,
and instruction decoder will not be able to decode the instruction
bytes. If we hit this errata in the SEV guest then log the message
and request a guest shutdown.
Reported-by: NVenkatesh Srinivas <venkateshs@google.com>
Cc: Jim Mattson <jmattson@google.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: NBrijesh Singh <brijesh.singh@amd.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
Signed-off-by: NWANG Siyuan <Siyuan.Wang@amd.com>
Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>

20ba1423

svm/avic: Fix invalidate logical APIC id entry · 1abad802

由 Suthikulpanit, Suravee 提交于 12月 18, 2019

commit e44e3eacccfd2294a1ce279f68452b1635d7fa82 upstream.

Only clear the valid bit when invalidate logical APIC id entry.
The current logic clear the valid bit, but also set the rest of
the bits (including reserved bits) to 1.

Fixes: 98d90582be2e ('svm: Fix AVIC DFR and LDR handling')
Signed-off-by: NSuravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
Signed-off-by: NWANG Siyuan <Siyuan.Wang@amd.com>
Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>

1abad802

svm: Fix improper check when deactivate AVIC · 20ce601e

由 Suthikulpanit, Suravee 提交于 12月 18, 2019

commit c57cd3c89ecf2812976f53e494580a396f93efd2 upstream.

The function svm_refresh_apicv_exec_ctrl() always returning prematurely
as kvm_vcpu_apicv_active() always return false when calling from
the function arch/x86/kvm/x86.c:kvm_vcpu_deactivate_apicv().
This is because the apicv_active is set to false just before calling
refresh_apicv_exec_ctrl().

Also, we need to mark VMCB_AVIC bit as dirty instead of VMCB_INTR.

So, fix svm_refresh_apicv_exec_ctrl() to properly deactivate AVIC.

Fixes: 67034bb9 ('KVM: SVM: Add irqchip_split() checks before enabling AVIC')
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: NSuravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
Signed-off-by: NWANG Siyuan <Siyuan.Wang@amd.com>
Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>

20ce601e

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功