- 07 12月, 2022 1 次提交
-
-
由 Anders Roxell 提交于
When booting a arm 32-bit kernel with config CONFIG_AHCI_DWC enabled on a am57xx-evm board. This happens when the clock references are unnamed in DT, the strcmp() produces a NULL pointer dereference, see the following oops, NULL pointer dereference: [ 4.673950] Unable to handle kernel NULL pointer dereference at virtual address 00000000 [ 4.682098] [00000000] *pgd=00000000 [ 4.685699] Internal error: Oops: 5 [#1] SMP ARM [ 4.690338] Modules linked in: [ 4.693420] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.1.0-rc7 #1 [ 4.699615] Hardware name: Generic DRA74X (Flattened Device Tree) [ 4.705749] PC is at strcmp+0x0/0x34 [ 4.709350] LR is at ahci_platform_find_clk+0x3c/0x5c [ 4.714416] pc : [<c130c494>] lr : [<c0c230e0>] psr: 20000013 [ 4.720703] sp : f000dda8 ip : 00000001 fp : c29b1840 [ 4.725952] r10: 00000020 r9 : c1b23380 r8 : c1b23368 [ 4.731201] r7 : c1ab4cc4 r6 : 00000001 r5 : c3c66040 r4 : 00000000 [ 4.737762] r3 : 00000080 r2 : 00000080 r1 : c1ab4cc4 r0 : 00000000 [...] [ 4.998870] strcmp from ahci_platform_find_clk+0x3c/0x5c [ 5.004302] ahci_platform_find_clk from ahci_dwc_probe+0x1f0/0x54c [ 5.010589] ahci_dwc_probe from platform_probe+0x64/0xc0 [ 5.016021] platform_probe from really_probe+0xe8/0x41c [ 5.021362] really_probe from __driver_probe_device+0xa4/0x204 [ 5.027313] __driver_probe_device from driver_probe_device+0x38/0xc8 [ 5.033782] driver_probe_device from __driver_attach+0xb4/0x1ec [ 5.039825] __driver_attach from bus_for_each_dev+0x78/0xb8 [ 5.045532] bus_for_each_dev from bus_add_driver+0x17c/0x220 [ 5.051300] bus_add_driver from driver_register+0x90/0x124 [ 5.056915] driver_register from do_one_initcall+0x48/0x1e8 [ 5.062591] do_one_initcall from kernel_init_freeable+0x1cc/0x234 [ 5.068817] kernel_init_freeable from kernel_init+0x20/0x13c [ 5.074584] kernel_init from ret_from_fork+0x14/0x2c [ 5.079681] Exception stack(0xf000dfb0 to 0xf000dff8) [ 5.084747] dfa0: 00000000 00000000 00000000 00000000 [ 5.092956] dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 [ 5.101165] dfe0: 00000000 00000000 00000000 00000000 00000013 00000000 [ 5.107818] Code: e5e32001 e3520000 1afffffb e12fff1e (e4d03001) [ 5.114013] ---[ end trace 0000000000000000 ]--- Add an extra check in the if-statement if hpriv-clks[i].id. Fixes: 6ce73f3a ("ata: libahci_platform: Add function returning a clock-handle by id") Suggested-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NAnders Roxell <anders.roxell@linaro.org> Reviewed-by: NSerge Semin <fancer.lancer@gmail.com> Signed-off-by: NDamien Le Moal <damien.lemoal@opensource.wdc.com>
-
- 12 11月, 2022 1 次提交
-
-
由 Niklas Cassel 提交于
While the ATA specification states that a device should return command aborted for all commands queued after the device has entered error state, since ATA only keeps the sense data for the latest command (in non-NCQ case), we really don't want to send block layer commands to the device after it has entered error state. (Only ATA EH commands should be sent, to read the sense data etc.) Currently, scsi_queue_rq() will check if scsi_host_in_recovery() (state is SHOST_RECOVERY), and if so, it will _not_ issue a command via: scsi_dispatch_cmd() -> host->hostt->queuecommand() (ata_scsi_queuecmd()) -> __ata_scsi_queuecmd() -> ata_scsi_translate() -> ata_qc_issue() Before commit e494f6a7 ("[SCSI] improved eh timeout handler"), when receiving a TFES error IRQ, the call chain looked like this: ahci_error_intr() -> ata_port_abort() -> ata_do_link_abort() -> ata_qc_complete() -> ata_qc_schedule_eh() -> blk_abort_request() -> blk_rq_timed_out() -> q->rq_timed_out_fn() (scsi_times_out()) -> scsi_eh_scmd_add() -> scsi_host_set_state(shost, SHOST_RECOVERY) Which meant that as soon as an error IRQ was serviced, SHOST_RECOVERY would be set. However, after commit e494f6a7 ("[SCSI] improved eh timeout handler"), scsi_times_out() will instead call scsi_abort_command() which will queue delayed work, and the worker function scmd_eh_abort_handler() will call scsi_eh_scmd_add(), which calls scsi_host_set_state(shost, SHOST_RECOVERY). So now, after the TFES error IRQ has been serviced, we need to wait for the SCSI workqueue to run its work before SHOST_RECOVERY gets set. It is worth noting that, even before commit e494f6a7 ("[SCSI] improved eh timeout handler"), we could receive an error IRQ from the time when scsi_queue_rq() checks scsi_host_in_recovery(), to the time when ata_scsi_queuecmd() is actually called. In order to handle both the delayed setting of SHOST_RECOVERY and the window where we can receive an error IRQ, add a check against ATA_PFLAG_EH_PENDING (which gets set when servicing the error IRQ), inside ata_scsi_queuecmd() itself, while holding the ap->lock. (Since the ap->lock is held while servicing IRQs.) Fixes: e494f6a7 ("[SCSI] improved eh timeout handler") Signed-off-by: NNiklas Cassel <niklas.cassel@wdc.com> Tested-by: NJohn Garry <john.g.garry@oracle.com> Signed-off-by: NDamien Le Moal <damien.lemoal@opensource.wdc.com>
-
- 11 11月, 2022 4 次提交
-
-
由 Yang Yingliang 提交于
In ata_tdev_add(), the return value of transport_add_device() is not checked. As a result, it causes null-ptr-deref while removing the module, because transport_remove_device() is called to remove the device that was not added. Unable to handle kernel NULL pointer dereference at virtual address 00000000000000d0 CPU: 13 PID: 13603 Comm: rmmod Kdump: loaded Tainted: G W 6.1.0-rc3+ #36 pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : device_del+0x48/0x3a0 lr : device_del+0x44/0x3a0 Call trace: device_del+0x48/0x3a0 attribute_container_class_device_del+0x28/0x40 transport_remove_classdev+0x60/0x7c attribute_container_device_trigger+0x118/0x120 transport_remove_device+0x20/0x30 ata_tdev_delete+0x24/0x50 [libata] ata_tlink_delete+0x40/0xa0 [libata] ata_tport_delete+0x2c/0x60 [libata] ata_port_detach+0x148/0x1b0 [libata] ata_pci_remove_one+0x50/0x80 [libata] ahci_remove_one+0x4c/0x8c [ahci] Fix this by checking and handling return value of transport_add_device() in ata_tdev_add(). In the error path, device_del() is called to delete the device which was added earlier in this function, and ata_tdev_free() is called to free ata_dev. Fixes: d9027470 ("[libata] Add ATA transport class") Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> Signed-off-by: NDamien Le Moal <damien.lemoal@opensource.wdc.com>
-
由 Yang Yingliang 提交于
In ata_tlink_add(), the return value of transport_add_device() is not checked. As a result, it causes null-ptr-deref while removing the module, because transport_remove_device() is called to remove the device that was not added. Unable to handle kernel NULL pointer dereference at virtual address 00000000000000d0 CPU: 33 PID: 13850 Comm: rmmod Kdump: loaded Tainted: G W 6.1.0-rc3+ #12 pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : device_del+0x48/0x39c lr : device_del+0x44/0x39c Call trace: device_del+0x48/0x39c attribute_container_class_device_del+0x28/0x40 transport_remove_classdev+0x60/0x7c attribute_container_device_trigger+0x118/0x120 transport_remove_device+0x20/0x30 ata_tlink_delete+0x88/0xb0 [libata] ata_tport_delete+0x2c/0x60 [libata] ata_port_detach+0x148/0x1b0 [libata] ata_pci_remove_one+0x50/0x80 [libata] ahci_remove_one+0x4c/0x8c [ahci] Fix this by checking and handling return value of transport_add_device() in ata_tlink_add(). Fixes: d9027470 ("[libata] Add ATA transport class") Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> Signed-off-by: NDamien Le Moal <damien.lemoal@opensource.wdc.com>
-
由 Yang Yingliang 提交于
In ata_tport_add(), the return value of transport_add_device() is not checked. As a result, it causes null-ptr-deref while removing the module, because transport_remove_device() is called to remove the device that was not added. Unable to handle kernel NULL pointer dereference at virtual address 00000000000000d0 CPU: 12 PID: 13605 Comm: rmmod Kdump: loaded Tainted: G W 6.1.0-rc3+ #8 pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : device_del+0x48/0x39c lr : device_del+0x44/0x39c Call trace: device_del+0x48/0x39c attribute_container_class_device_del+0x28/0x40 transport_remove_classdev+0x60/0x7c attribute_container_device_trigger+0x118/0x120 transport_remove_device+0x20/0x30 ata_tport_delete+0x34/0x60 [libata] ata_port_detach+0x148/0x1b0 [libata] ata_pci_remove_one+0x50/0x80 [libata] ahci_remove_one+0x4c/0x8c [ahci] Fix this by checking and handling return value of transport_add_device() in ata_tport_add(). Fixes: d9027470 ("[libata] Add ATA transport class") Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> Signed-off-by: NDamien Le Moal <damien.lemoal@opensource.wdc.com>
-
由 Yang Yingliang 提交于
In the error path in ata_tport_add(), when calling put_device(), ata_tport_release() is called, it will put the refcount of 'ap->host'. And then ata_host_put() is called again, the refcount is decreased to 0, ata_host_release() is called, all ports are freed and set to null. When unbinding the device after failure, ata_host_stop() is called to release the resources, it leads a null-ptr-deref(), because all the ports all freed and null. Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008 CPU: 7 PID: 18671 Comm: modprobe Kdump: loaded Tainted: G E 6.1.0-rc3+ #8 pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : ata_host_stop+0x3c/0x84 [libata] lr : release_nodes+0x64/0xd0 Call trace: ata_host_stop+0x3c/0x84 [libata] release_nodes+0x64/0xd0 devres_release_all+0xbc/0x1b0 device_unbind_cleanup+0x20/0x70 really_probe+0x158/0x320 __driver_probe_device+0x84/0x120 driver_probe_device+0x44/0x120 __driver_attach+0xb4/0x220 bus_for_each_dev+0x78/0xdc driver_attach+0x2c/0x40 bus_add_driver+0x184/0x240 driver_register+0x80/0x13c __pci_register_driver+0x4c/0x60 ahci_pci_driver_init+0x30/0x1000 [ahci] Fix this by removing redundant ata_host_put() in the error path. Fixes: 2623c7a5 ("libata: add refcounting to ata_host") Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> Signed-off-by: NDamien Le Moal <damien.lemoal@opensource.wdc.com>
-
- 08 11月, 2022 1 次提交
-
-
由 Shin'ichiro Kawasaki 提交于
SAT SCSI/ATA Translation specification requires SCSI SYNCHRONIZE CACHE (10) and (16) commands both shall be translated to ATA flush command. Also, ZBC Zoned Block Commands specification mandates SYNCHRONIZE CACHE (16) command support. However, libata translates only SYNCHRONIZE CACHE (10). This results in SYNCHRONIZE CACHE (16) command failures on SATA drives and then libata translation does not conform to ZBC. To avoid the failure, add support for SYNCHRONIZE CACHE (16). Signed-off-by: NShin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Cc: stable@vger.kernel.org Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Signed-off-by: NDamien Le Moal <damien.lemoal@opensource.wdc.com>
-
- 31 10月, 2022 2 次提交
-
-
由 Yang Yingliang 提交于
If devm_platform_ioremap_resource() fails, it never return NULL pointer, replace the check with IS_ERR(). Fixes: 57bf0f5a ("ARM: pxa: use pdev resource for palmld mmio") Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> Reviewed-by: NSergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: NDamien Le Moal <damien.lemoal@opensource.wdc.com>
-
由 Sergey Shtylyov 提交于
Clang gives a warning when compiling pata_legacy.c with 'make W=1' about the 'rt' local variable in pdc20230_set_piomode() being set but unused. Quite obviously, there is an outb() call missing to write back the updated variable. Moreover, checking the docs by Petr Soucek revealed that bitwise AND should have been done with a negated timing mask and the master/slave timing masks were swapped while updating... Fixes: 669a5db4 ("[libata] Add a bunch of PATA drivers.") Reported-by: NDamien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by: NSergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: NDamien Le Moal <damien.lemoal@opensource.wdc.com>
-
- 18 10月, 2022 5 次提交
-
-
由 Damien Le Moal 提交于
When compiling with clang and W=1, the following warning is generated: drivers/ata/ahci_qoriq.c:283:22: error: cast to smaller integer type 'enum ahci_qoriq_type' from 'const void *' [-Werror,-Wvoid-pointer-to-enum-cast] qoriq_priv->type = (enum ahci_qoriq_type)of_id->data; ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Fix this by using a cast to unsigned long to match the "void *" type size of of_id->data. Signed-off-by: NDamien Le Moal <damien.lemoal@opensource.wdc.com> Acked-by: NArnd Bergmann <arnd@arndb.de>
-
由 Damien Le Moal 提交于
When compiling with clang and W=1, the following warning is generated: drivers/ata/ahci_imx.c:1070:18: error: cast to smaller integer type 'enum ahci_imx_type' from 'const void *' [-Werror,-Wvoid-pointer-to-enum-cast] imxpriv->type = (enum ahci_imx_type)of_id->data; ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Fix this by using a cast to unsigned long to match the "void *" type size of of_id->data. Signed-off-by: NDamien Le Moal <damien.lemoal@opensource.wdc.com> Acked-by: NArnd Bergmann <arnd@arndb.de>
-
由 Damien Le Moal 提交于
When compiling with clang and W=1, the following warning is generated: drivers/ata/ahci_xgene.c:788:14: error: cast to smaller integer type 'enum xgene_ahci_version' from 'const void *' [-Werror,-Wvoid-pointer-to-enum-cast] version = (enum xgene_ahci_version) of_devid->data; ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Fix this by using a cast to unsigned long to match the "void *" type size of of_devid->data. Signed-off-by: NDamien Le Moal <damien.lemoal@opensource.wdc.com> Acked-by: NArnd Bergmann <arnd@arndb.de>
-
由 Damien Le Moal 提交于
When compiling with clang and W=1, the following warning is generated: drivers/ata/ahci_brcm.c:451:18: error: cast to smaller integer type 'enum brcm_ahci_version' from 'const void *' [-Werror,-Wvoid-pointer-to-enum-cast] priv->version = (enum brcm_ahci_version)of_id->data; ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Fix this by using a cast to unsigned long to match the "void *" type size of of_id->data. Signed-off-by: NDamien Le Moal <damien.lemoal@opensource.wdc.com> Acked-by: NArnd Bergmann <arnd@arndb.de> Acked-by: NFlorian Fainelli <f.fainelli@gmail.com>
-
由 Damien Le Moal 提交于
When compiling with clang and W=1, the following warning is generated: drivers/ata/sata_rcar.c:878:15: error: cast to smaller integer type 'enum sata_rcar_type' from 'const void *' [-Werror,-Wvoid-pointer-to-enum-cast] priv->type = (enum sata_rcar_type)of_device_get_match_data(dev); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Fix this by using a cast to unsigned long to match the "void *" type size returned by of_device_get_match_data(). Signed-off-by: NDamien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: NGeert Uytterhoeven <geert+renesas@glider.be> Acked-by: NArnd Bergmann <arnd@arndb.de> Reviewed-by: NSergey Shtylyov <s.shtylyov@omp.ru>
-
- 17 10月, 2022 3 次提交
-
-
由 Damien Le Moal 提交于
If CONFIG_OF is disabled and the ahci_st driver is builtin (or CONFIG_MODULES is disabled), then using the macro of_match_ptr() results in the st_ahci_match variable being unused, which generates a compilation warning and a compilation error if CONFIG_WERROR is enabled. Fix this by directly assigning st_ahci_match to .of_match_table in the st_ahci_driver platform driver definition. Reported-by: Nkernel test robot <lkp@intel.com> Signed-off-by: NDamien Le Moal <damien.lemoal@opensource.wdc.com> Acked-by: NArnd Bergmann <arnd@arndb.de>
-
由 Kai-Heng Feng 提交于
UBSAN complains about array-index-out-of-bounds: [ 1.980703] kernel: UBSAN: array-index-out-of-bounds in /build/linux-9H675w/linux-5.15.0/drivers/ata/libahci.c:968:41 [ 1.980709] kernel: index 15 is out of range for type 'ahci_em_priv [8]' [ 1.980713] kernel: CPU: 0 PID: 209 Comm: scsi_eh_8 Not tainted 5.15.0-25-generic #25-Ubuntu [ 1.980716] kernel: Hardware name: System manufacturer System Product Name/P5Q3, BIOS 1102 06/11/2010 [ 1.980718] kernel: Call Trace: [ 1.980721] kernel: <TASK> [ 1.980723] kernel: show_stack+0x52/0x58 [ 1.980729] kernel: dump_stack_lvl+0x4a/0x5f [ 1.980734] kernel: dump_stack+0x10/0x12 [ 1.980736] kernel: ubsan_epilogue+0x9/0x45 [ 1.980739] kernel: __ubsan_handle_out_of_bounds.cold+0x44/0x49 [ 1.980742] kernel: ahci_qc_issue+0x166/0x170 [libahci] [ 1.980748] kernel: ata_qc_issue+0x135/0x240 [ 1.980752] kernel: ata_exec_internal_sg+0x2c4/0x580 [ 1.980754] kernel: ? vprintk_default+0x1d/0x20 [ 1.980759] kernel: ata_exec_internal+0x67/0xa0 [ 1.980762] kernel: sata_pmp_read+0x8d/0xc0 [ 1.980765] kernel: sata_pmp_read_gscr+0x3c/0x90 [ 1.980768] kernel: sata_pmp_attach+0x8b/0x310 [ 1.980771] kernel: ata_eh_revalidate_and_attach+0x28c/0x4b0 [ 1.980775] kernel: ata_eh_recover+0x6b6/0xb30 [ 1.980778] kernel: ? ahci_do_hardreset+0x180/0x180 [libahci] [ 1.980783] kernel: ? ahci_stop_engine+0xb0/0xb0 [libahci] [ 1.980787] kernel: ? ahci_do_softreset+0x290/0x290 [libahci] [ 1.980792] kernel: ? trace_event_raw_event_ata_eh_link_autopsy_qc+0xe0/0xe0 [ 1.980795] kernel: sata_pmp_eh_recover.isra.0+0x214/0x560 [ 1.980799] kernel: sata_pmp_error_handler+0x23/0x40 [ 1.980802] kernel: ahci_error_handler+0x43/0x80 [libahci] [ 1.980806] kernel: ata_scsi_port_error_handler+0x2b1/0x600 [ 1.980810] kernel: ata_scsi_error+0x9c/0xd0 [ 1.980813] kernel: scsi_error_handler+0xa1/0x180 [ 1.980817] kernel: ? scsi_unjam_host+0x1c0/0x1c0 [ 1.980820] kernel: kthread+0x12a/0x150 [ 1.980823] kernel: ? set_kthread_struct+0x50/0x50 [ 1.980826] kernel: ret_from_fork+0x22/0x30 [ 1.980831] kernel: </TASK> This happens because sata_pmp_init_links() initialize link->pmp up to SATA_PMP_MAX_PORTS while em_priv is declared as 8 elements array. I can't find the maximum Enclosure Management ports specified in AHCI spec v1.3.1, but "12.2.1 LED message type" states that "Port Multiplier Information" can utilize 4 bits, which implies it can support up to 16 ports. Hence, use SATA_PMP_MAX_PORTS as EM_MAX_SLOTS to resolve the issue. BugLink: https://bugs.launchpad.net/bugs/1970074 Cc: stable@vger.kernel.org Signed-off-by: NKai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: NDamien Le Moal <damien.lemoal@opensource.wdc.com>
-
由 Alexander Stein 提交于
'ahci:' is an invalid prefix, preventing the module from autoloading. Fix this by using the 'platform:' prefix and DRV_NAME. Fixes: 9e54eae2 ("ahci_imx: add ahci sata support on imx platforms") Cc: stable@vger.kernel.org Signed-off-by: NAlexander Stein <alexander.stein@ew.tq-group.com> Reviewed-by: NFabio Estevam <festevam@gmail.com> Signed-off-by: NDamien Le Moal <damien.lemoal@opensource.wdc.com>
-
- 15 10月, 2022 6 次提交
-
-
由 Jon Hunter 提交于
Commit 8c193f47 ("pwm: tegra: Optimize period calculation") updated the period calculation in the Tegra PWM driver and now returns an error if the period requested is less than minimum period supported. This is breaking PWM support on various Tegra platforms. For example, on the Tegra210 Jetson Nano platform this is breaking the PWM fan support and probing the PWM fan driver now fails ... pwm-fan pwm-fan: Failed to configure PWM: -22 pwm-fan: probe of pwm-fan failed with error -22 The problem is that the default parent clock for the PWM on Tegra210 is a 32kHz clock and is unable to support the requested PWM period. Fix PWM support on Tegra20, Tegra30, Tegra114, Tegra124 and Tegra210 by updating the parent clock for the PWM to be the PLL_P. Fixes: 8c193f47 ("pwm: tegra: Optimize period calculation") Signed-off-by: NJon Hunter <jonathanh@nvidia.com> Tested-by: Robert Eckelmann <longnoserob@gmail.com> # TF101 T20 Tested-by: Antoni Aloy Torrens <aaloytorrens@gmail.com> # TF101 T20 Tested-by: Svyatoslav Ryhel <clamor95@gmail.com> # TF201 T30 Tested-by: Andreas Westman Dorcsak <hedmoo@yahoo.com> # TF700T T3 Link: https://lore.kernel.org/r/20221010100046.6477-1-jonathanh@nvidia.comAcked-by: NUwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: NStephen Boyd <sboyd@kernel.org>
-
由 Linus Walleij 提交于
These two clocks are now registered in the device tree as fixed clocks, causing a regression in the driver as the clock already exists with e.g. the name "pxo_board" as the MSM8660 GCC driver probes. Fix this by just not hard-coding this anymore and everything works like a charm. Cc: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Fixes: baecbda5 ("ARM: dts: qcom: msm8660: fix node names for fixed clocks") Signed-off-by: NLinus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20221013140745.7801-1-linus.walleij@linaro.orgSigned-off-by: NStephen Boyd <sboyd@kernel.org>
-
Since commit 262ca38f ("clk: Stop forwarding clk_rate_requests to the parent"), the clk_rate_request is .. as the title says, not forwarded anymore to the parent: this produces an issue with the MediaTek clock MUX driver during GPU DVFS on MT8195, but not on MT8192 or others. This is because, differently from others, like MT8192 where all of the clocks in the MFG parents tree are of mtk_mux type, but in the parent tree of MT8195's MFG clock, we have one mtk_mux clock and one (clk framework generic) mux clock, like so: names: mfg_bg3d -> mfg_ck_fast_ref -> top_mfg_core_tmp (or) mfgpll types: mtk_gate -> mux -> mtk_mux (or) mtk_pll To solve this issue and also keep the GPU DVFS clocks code working as expected, wire up a .determine_rate() callback for the mtk_mux ops; for that, the standard clk_mux_determine_rate_flags() was used as it was possible to. This commit was successfully tested on MT6795 Xperia M5, MT8173 Elm, MT8192 Spherion and MT8195 Tomato; no regressions were seen. For the sake of some more documentation about this issue here's the trace of it: [ 12.211587] ------------[ cut here ]------------ [ 12.211589] WARNING: CPU: 6 PID: 78 at drivers/clk/clk.c:1462 clk_core_init_rate_req+0x84/0x90 [ 12.211593] Modules linked in: stp crct10dif_ce mtk_adsp_common llc rfkill snd_sof_xtensa_dsp panfrost(+) sbs_battery cros_ec_lid_angle cros_ec_sensors snd_sof_of cros_ec_sensors_core hid_multitouch cros_usbpd_logger snd_sof gpu_sched snd_sof_utils fuse ipv6 [ 12.211614] CPU: 6 PID: 78 Comm: kworker/u16:2 Tainted: G W 6.0.0-next-20221011+ #58 [ 12.211616] Hardware name: Acer Tomato (rev2) board (DT) [ 12.211617] Workqueue: devfreq_wq devfreq_monitor [ 12.211620] pstate: 40400009 (nZcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 12.211622] pc : clk_core_init_rate_req+0x84/0x90 [ 12.211625] lr : clk_core_forward_rate_req+0xa4/0xe4 [ 12.211627] sp : ffff80000893b8e0 [ 12.211628] x29: ffff80000893b8e0 x28: ffffdddf92f9b000 x27: ffff46a2c0e8bc05 [ 12.211632] x26: ffff46a2c1041200 x25: 0000000000000000 x24: 00000000173eed80 [ 12.211636] x23: ffff80000893b9c0 x22: ffff80000893b940 x21: 0000000000000000 [ 12.211641] x20: ffff46a2c1039f00 x19: ffff46a2c1039f00 x18: 0000000000000000 [ 12.211645] x17: 0000000000000038 x16: 000000000000d904 x15: 0000000000000003 [ 12.211649] x14: ffffdddf9357ce48 x13: ffffdddf935e71c8 x12: 000000000004803c [ 12.211653] x11: 00000000a867d7ad x10: 00000000a867d7ad x9 : ffffdddf90c28df4 [ 12.211657] x8 : ffffdddf9357a980 x7 : 0000000000000000 x6 : 0000000000000004 [ 12.211661] x5 : ffffffffffffffc8 x4 : 00000000173eed80 x3 : ffff80000893b940 [ 12.211665] x2 : 00000000173eed80 x1 : ffff80000893b940 x0 : 0000000000000000 [ 12.211669] Call trace: [ 12.211670] clk_core_init_rate_req+0x84/0x90 [ 12.211673] clk_core_round_rate_nolock+0xe8/0x10c [ 12.211675] clk_mux_determine_rate_flags+0x174/0x1f0 [ 12.211677] clk_mux_determine_rate+0x1c/0x30 [ 12.211680] clk_core_determine_round_nolock+0x74/0x130 [ 12.211682] clk_core_round_rate_nolock+0x58/0x10c [ 12.211684] clk_core_round_rate_nolock+0xf4/0x10c [ 12.211686] clk_core_set_rate_nolock+0x194/0x2ac [ 12.211688] clk_set_rate+0x40/0x94 [ 12.211691] _opp_config_clk_single+0x38/0xa0 [ 12.211693] _set_opp+0x1b0/0x500 [ 12.211695] dev_pm_opp_set_rate+0x120/0x290 [ 12.211697] panfrost_devfreq_target+0x3c/0x50 [panfrost] [ 12.211705] devfreq_set_target+0x8c/0x2d0 [ 12.211707] devfreq_update_target+0xcc/0xf4 [ 12.211708] devfreq_monitor+0x40/0x1d0 [ 12.211710] process_one_work+0x294/0x664 [ 12.211712] worker_thread+0x7c/0x45c [ 12.211713] kthread+0x104/0x110 [ 12.211716] ret_from_fork+0x10/0x20 [ 12.211718] irq event stamp: 7102 [ 12.211719] hardirqs last enabled at (7101): [<ffffdddf904ea5a0>] finish_task_switch.isra.0+0xec/0x2f0 [ 12.211723] hardirqs last disabled at (7102): [<ffffdddf91794b74>] el1_dbg+0x24/0x90 [ 12.211726] softirqs last enabled at (6716): [<ffffdddf90410be4>] __do_softirq+0x414/0x588 [ 12.211728] softirqs last disabled at (6507): [<ffffdddf904171d8>] ____do_softirq+0x18/0x24 [ 12.211730] ---[ end trace 0000000000000000 ]--- Fixes: 262ca38f ("clk: Stop forwarding clk_rate_requests to the parent") Signed-off-by: NAngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://lore.kernel.org/r/20221011135548.318323-1-angelogioacchino.delregno@collabora.comSigned-off-by: NStephen Boyd <sboyd@kernel.org>
-
由 Bjorn Helgaas 提交于
This reverts commit e96e27fc. Jonathan reported that this commit broke this topology, where all the space available on bus 02 was assigned to the 02:00.0 bridge window, leaving none for the e1000 device at 02:00.1: pci 0000:00:04.0: bridge window [mem 0x10200000-0x103fffff] to [bus 02-04] pci 0000:02:00.0: bridge window [mem 0x10200000-0x103fffff] to [bus 03-04] pci 0000:02:00.1: BAR 0: failed to assign [mem size 0x00020000] e1000 0000:02:00.1: can't ioremap BAR 0: [??? 0x00000000 flags 0x0] Link: https://lore.kernel.org/r/20221014124553.0000696f@huawei.comReported-by: NJonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
-
由 Nathan Chancellor 提交于
After commit 8799c0be ("drm/amd/display: Fix vblank refcount in vrr transition"), a build with CONFIG_DEBUG_FS=n is broken due to a misplaced brace, along the lines of: In file included from drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_trace.h:39, from drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:41: drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c: At top level: ./include/drm/drm_atomic.h:864:9: error: expected identifier or ‘(’ before ‘for’ 864 | for ((__i) = 0; \ | ^~~ drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:8317:9: note: in expansion of macro ‘for_each_new_crtc_in_state’ 8317 | for_each_new_crtc_in_state(state, crtc, new_crtc_state, j) | ^~~~~~~~~~~~~~~~~~~~~~~~~~ Move the brace within the #ifdef so that the file can be built with or without CONFIG_DEBUG_FS. Fixes: 8799c0be ("drm/amd/display: Fix vblank refcount in vrr transition") Signed-off-by: NNathan Chancellor <nathan@kernel.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Colin Ian King 提交于
There are several spelling mistakes in kernel error messages. Fix them. Signed-off-by: NColin Ian King <colin.i.king@gmail.com> Signed-off-by: NHelge Deller <deller@gmx.de>
-
- 14 10月, 2022 11 次提交
-
-
由 Helge Deller 提交于
Independend of the current graphics resolution, adjust the reported graphics card memory size to the next 4MB boundary. This fixes the fbtest program which expects a naturally aligned size. Signed-off-by: NHelge Deller <deller@gmx.de> Cc: <stable@vger.kernel.org>
-
由 Ke Sun 提交于
Compiler warnings: drivers/rtc/rtc-rv3028.c: In function 'rv3028_param_set': drivers/rtc/rtc-rv3028.c:559:20: warning: statement will never be executed [-Wswitch-unreachable] 559 | u8 mode; | ^~~~ drivers/rtc/rtc-rv3028.c: In function 'rv3028_param_get': drivers/rtc/rtc-rv3028.c:526:21: warning: statement will never be executed [-Wswitch-unreachable] 526 | u32 value; | ^~~~~ Fix it by moving the variable declaration to the beginning of the function. Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: Alexandre Belloni <alexandre.belloni@bootlin.com> Cc: linux-rtc@vger.kernel.org Cc: linux-kernel@vger.kernel.org Reported-by: Nk2ci <kernel-bot@kylinos.cn> Signed-off-by: NKe Sun <sunke@kylinos.cn> Link: https://lore.kernel.org/r/20221008071321.1799971-1-sunke@kylinos.cnSigned-off-by: NAlexandre Belloni <alexandre.belloni@bootlin.com>
-
由 Rafael J. Wysocki 提交于
Because acpi_install_fixed_event_handler() enables the event automatically on success, it is incorrect to call it before the handler routine passed to it is ready to handle events. Unfortunately, the rtc-cmos driver does exactly the incorrect thing by calling cmos_wake_setup(), which passes rtc_handler() to acpi_install_fixed_event_handler(), before cmos_do_probe(), because rtc_handler() uses dev_get_drvdata() to get to the cmos object pointer and the driver data pointer is only populated in cmos_do_probe(). This leads to a NULL pointer dereference in rtc_handler() on boot if the RTC fixed event happens to be active at the init time. To address this issue, change the initialization ordering of the driver so that cmos_wake_setup() is always called after a successful cmos_do_probe() call. While at it, change cmos_pnp_probe() to call cmos_do_probe() after the initial if () statement used for computing the IRQ argument to be passed to cmos_do_probe() which is cleaner than calling it in each branch of that if () (local variable "irq" can be of type int, because it is passed to that function as an argument of type int). Note that commit 6492fed7 ("rtc: rtc-cmos: Do not check ACPI_FADT_LOW_POWER_S0") caused this issue to affect a larger number of systems, because previously it only affected systems with ACPI_FADT_LOW_POWER_S0 set, but it is present regardless of that commit. Fixes: 6492fed7 ("rtc: rtc-cmos: Do not check ACPI_FADT_LOW_POWER_S0") Fixes: a474aaed ("rtc-cmos: move wake setup from ACPI glue into RTC driver") Link: https://lore.kernel.org/linux-acpi/20221010141630.zfzi7mk7zvnmclzy@techsingularity.net/Reported-by: NMel Gorman <mgorman@techsingularity.net> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: NBjorn Helgaas <bhelgaas@google.com> Tested-by: NMel Gorman <mgorman@techsingularity.net> Link: https://lore.kernel.org/r/5629262.DvuYhMxLoT@kreacherSigned-off-by: NAlexandre Belloni <alexandre.belloni@bootlin.com>
-
由 Palmer Dabbelt 提交于
These counters were part of the ISA when we froze the uABI, removing them breaks userspace. Link: https://lore.kernel.org/all/YxEhC%2FmDW1lFt36J@aurel32.net/ Fixes: e9991434 ("RISC-V: Add perf platform driver based on SBI PMU extension") Tested-by: NConor Dooley <conor.dooley@microchip.com> Reviewed-by: NConor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20220928131807.30386-1-palmer@rivosinc.com Cc: stable@vger.kernel.org Signed-off-by: NPalmer Dabbelt <palmer@rivosinc.com>
-
由 Zong Li 提交于
Define the macro for the register shifts, it could make the code be more readable Signed-off-by: NZong Li <zong.li@sifive.com> Reviewed-by: NConor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20220913061817.22564-7-zong.li@sifive.comSigned-off-by: NPalmer Dabbelt <palmer@rivosinc.com>
-
由 Ben Dooks 提交于
Use the pr_fmt() macro to prefix all the output with "CCACHE:" to avoid having to write it out each time, or make a large diff when the next change comes along. Signed-off-by: NBen Dooks <ben.dooks@sifive.com> Signed-off-by: NZong Li <zong.li@sifive.com> Reviewed-by: NConor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20220913061817.22564-6-zong.li@sifive.comSigned-off-by: NPalmer Dabbelt <palmer@rivosinc.com>
-
由 Ben Dooks 提交于
The driver prints out 6 lines on startup, which can easily be redcued to two lines without losing any information. Note, to make the types work better, uint64_t has been replaced with ULL to make the unsigned long long match the format in the print statement. Signed-off-by: NBen Dooks <ben.dooks@sifive.com> Signed-off-by: NZong Li <zong.li@sifive.com> Reviewed-by: NConor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20220913061817.22564-5-zong.li@sifive.comSigned-off-by: NPalmer Dabbelt <palmer@rivosinc.com>
-
由 Zong Li 提交于
Composable cache could be L2 or L3 cache, use 'cache-level' property of device node to determine the level. Signed-off-by: NZong Li <zong.li@sifive.com> Reviewed-by: NConor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20220913061817.22564-4-zong.li@sifive.comSigned-off-by: NPalmer Dabbelt <palmer@rivosinc.com>
-
由 Greentime Hu 提交于
Since composable cache may be L3 cache if there is a L2 cache, we should use its original name composable cache to prevent confusion. There are some new lines were generated due to adding the compatible "sifive,ccache0" into ID table and indent requirement. The sifive L2 has been renamed to sifive CCACHE, EDAC driver needs to apply the change as well. Signed-off-by: NGreentime Hu <greentime.hu@sifive.com> Signed-off-by: NZong Li <zong.li@sifive.com> Co-developed-by: NZong Li <zong.li@sifive.com> Reviewed-by: NConor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20220913061817.22564-3-zong.li@sifive.comSigned-off-by: NPalmer Dabbelt <palmer@rivosinc.com>
-
由 Dan Carpenter 提交于
The devm_request_region() function does not return error pointers, it returns NULL on error. Fixes: 914d9b27 ("sunhme: switch to devres") Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Reviewed-by: NSean Anderson <seanga2@gmail.com> Reviewed-by: NRolf Eike Beer <eike-kernel@sf-tec.de> Link: https://lore.kernel.org/r/Y0bWzJL8JknX8MUf@kiliSigned-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Dan Carpenter 提交于
The __prestera_nexthop_group_create() function returns NULL on error and the prestera_nexthop_group_get() returns error pointers. Fix these two checks. Fixes: 0a23ae23 ("net: marvell: prestera: Add router nexthops ABI") Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/Y0bWq+7DoKK465z8@kiliSigned-off-by: NJakub Kicinski <kuba@kernel.org>
-
- 13 10月, 2022 6 次提交
-
-
由 Michael S. Tsirkin 提交于
commit 71491c54 ("virtio_pci: don't try to use intxif pin is zero") breaks virtio_pci on powerpc, when running as a qemu guest. vp_find_vqs() bails out because pci_dev->pin == 0. But pci_dev->irq is populated correctly, so vp_find_vqs_intx() would succeed if we called it - which is what the code used to do. This seems to happen because pci_dev->pin is not populated in pci_assign_irq(). A PCI core bug? Maybe. However Linus said: I really think that that is basically the only time you should use that 'pci_dev->pin' thing: it basically exists not for "does this device have an IRQ", but for "what is the routing of this irq on this device". and The correct way to check for "no irq" doesn't use NO_IRQ at all, it just does if (dev->irq) ... so let's just check irq and be done with it. Suggested-by: NLinus Torvalds <torvalds@linux-foundation.org> Reported-by: NMichael Ellerman <mpe@ellerman.id.au> Fixes: 71491c54 ("virtio_pci: don't try to use intxif pin is zero") Cc: "Angus Chen" <angus.chen@jaguarmicro.com> Signed-off-by: NMichael S. Tsirkin <mst@redhat.com> Tested-by: NMichael Ellerman <mpe@ellerman.id.au> Acked-by: NJason Wang <jasowang@redhat.com> Message-Id: <20221012220312.308522-1-mst@redhat.com>
-
由 Brian Geffon 提交于
Currently zram will adjust its fops to a version which does not contain rw_page when a backing device has been assigned. This is done to prevent upper layers from assuming a synchronous operation when a page may have been written back. This forces every operation through bio which has overhead associated with bio_alloc/frees. The code can be simplified to always expose an rw_page method and only in the rare event that a page is written back we instead will return -EOPNOTSUPP forcing the upper layer to fallback to bio. Link: https://lkml.kernel.org/r/20221003144832.2906610-1-bgeffon@google.comSigned-off-by: NBrian Geffon <bgeffon@google.com> Reviewed-by: NSergey Senozhatsky <senozhatsky@chromium.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Rom Lemarchand <romlem@google.com> Cc: Suleiman Souhlal <suleiman@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
由 Alistair Popple 提交于
When the module is unloaded or a GPU is unbound from the module it is possible for device private pages to still be mapped in currently running processes. This can lead to a hangs and RCU stall warnings when unbinding the device as memunmap_pages() will wait in an uninterruptible state until all device pages have been freed which may never happen. Fix this by migrating device mappings back to normal CPU memory prior to freeing the GPU memory chunks and associated device private pages. Link: https://lkml.kernel.org/r/66277601fb8fda9af408b33da9887192bf895bda.1664366292.git-series.apopple@nvidia.comSigned-off-by: NAlistair Popple <apopple@nvidia.com> Cc: Lyude Paul <lyude@redhat.com> Cc: Ben Skeggs <bskeggs@redhat.com> Cc: Ralph Campbell <rcampbell@nvidia.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Alex Sierra <alex.sierra@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: Felix Kuehling <Felix.Kuehling@amd.com> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Yang Shi <shy828301@gmail.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
由 Alistair Popple 提交于
nouveau_dmem_fault_copy_one() is used during handling of CPU faults via the migrate_to_ram() callback and is used to copy data from GPU to CPU memory. It is currently specific to fault handling, however a future patch implementing eviction of data during teardown needs similar functionality. Refactor out the core functionality so that it is not specific to fault handling. Link: https://lkml.kernel.org/r/20573d7b4e641a78fde9935f948e64e71c9e709e.1664366292.git-series.apopple@nvidia.comSigned-off-by: NAlistair Popple <apopple@nvidia.com> Reviewed-by: NLyude Paul <lyude@redhat.com> Cc: Ben Skeggs <bskeggs@redhat.com> Cc: Ralph Campbell <rcampbell@nvidia.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Alex Sierra <alex.sierra@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: Felix Kuehling <Felix.Kuehling@amd.com> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Yang Shi <shy828301@gmail.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
由 Alistair Popple 提交于
Since 27674ef6 ("mm: remove the extra ZONE_DEVICE struct page refcount") device private pages have no longer had an extra reference count when the page is in use. However before handing them back to the owning device driver we add an extra reference count such that free pages have a reference count of one. This makes it difficult to tell if a page is free or not because both free and in use pages will have a non-zero refcount. Instead we should return pages to the drivers page allocator with a zero reference count. Kernel code can then safely use kernel functions such as get_page_unless_zero(). Link: https://lkml.kernel.org/r/cf70cf6f8c0bdb8aaebdbfb0d790aea4c683c3c6.1664366292.git-series.apopple@nvidia.comSigned-off-by: NAlistair Popple <apopple@nvidia.com> Acked-by: NFelix Kuehling <Felix.Kuehling@amd.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Ben Skeggs <bskeggs@redhat.com> Cc: Lyude Paul <lyude@redhat.com> Cc: Ralph Campbell <rcampbell@nvidia.com> Cc: Alex Sierra <alex.sierra@amd.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Yang Shi <shy828301@gmail.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
由 Alistair Popple 提交于
Patch series "Fix several device private page reference counting issues", v2 This series aims to fix a number of page reference counting issues in drivers dealing with device private ZONE_DEVICE pages. These result in use-after-free type bugs, either from accessing a struct page which no longer exists because it has been removed or accessing fields within the struct page which are no longer valid because the page has been freed. During normal usage it is unlikely these will cause any problems. However without these fixes it is possible to crash the kernel from userspace. These crashes can be triggered either by unloading the kernel module or unbinding the device from the driver prior to a userspace task exiting. In modules such as Nouveau it is also possible to trigger some of these issues by explicitly closing the device file-descriptor prior to the task exiting and then accessing device private memory. This involves some minor changes to both PowerPC and AMD GPU code. Unfortunately I lack hardware to test either of those so any help there would be appreciated. The changes mimic what is done in for both Nouveau and hmm-tests though so I doubt they will cause problems. This patch (of 8): When the CPU tries to access a device private page the migrate_to_ram() callback associated with the pgmap for the page is called. However no reference is taken on the faulting page. Therefore a concurrent migration of the device private page can free the page and possibly the underlying pgmap. This results in a race which can crash the kernel due to the migrate_to_ram() function pointer becoming invalid. It also means drivers can't reliably read the zone_device_data field because the page may have been freed with memunmap_pages(). Close the race by getting a reference on the page while holding the ptl to ensure it has not been freed. Unfortunately the elevated reference count will cause the migration required to handle the fault to fail. To avoid this failure pass the faulting page into the migrate_vma functions so that if an elevated reference count is found it can be checked to see if it's expected or not. [mpe@ellerman.id.au: fix build] Link: https://lkml.kernel.org/r/87fsgbf3gh.fsf@mpe.ellerman.id.au Link: https://lkml.kernel.org/r/cover.60659b549d8509ddecafad4f498ee7f03bb23c69.1664366292.git-series.apopple@nvidia.com Link: https://lkml.kernel.org/r/d3e813178a59e565e8d78d9b9a4e2562f6494f90.1664366292.git-series.apopple@nvidia.comSigned-off-by: NAlistair Popple <apopple@nvidia.com> Acked-by: NFelix Kuehling <Felix.Kuehling@amd.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Ralph Campbell <rcampbell@nvidia.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Lyude Paul <lyude@redhat.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Alex Sierra <alex.sierra@amd.com> Cc: Ben Skeggs <bskeggs@redhat.com> Cc: Christian König <christian.koenig@amd.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Yang Shi <shy828301@gmail.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-