- 15 1月, 2022 27 次提交
-
-
由 Eli Cohen 提交于
Add wrappers to get/set status and protect these operations with cf_mutex to serialize these operations with respect to get/set config operations. Signed-off-by: NEli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20220105114646.577224-4-elic@nvidia.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 Eli Cohen 提交于
Distribute the available rx virtqueues amongst the available RQT entries. RQTs require to have a power of two entries. When creating or modifying the RQT, use the lowest number of power of two entries that is not less than the number of rx virtqueues. Distribute them in the available entries such that some virtqueus may be referenced twice. This allows to configure any number of virtqueue pairs when multiqueue is used. Reviewed-by: NSi-Wei Liu <si-wei.liu@oracle.com> Acked-by: NJason Wang <jasowang@redhat.com> Signed-off-by: NEli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20220105114646.577224-3-elic@nvidia.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 Eli Cohen 提交于
Provide an interface to read the negotiated features. This is needed when building the netlink message in vdpa_dev_net_config_fill(). Also fix the implementation of vdpa_dev_net_config_fill() to use the negotiated features instead of the device features. To make APIs clearer, make the following name changes to struct vdpa_config_ops so they better describe their operations: get_features -> get_device_features set_features -> set_driver_features Finally, add get_driver_features to return the negotiated features and add implementation to all the upstream drivers. Acked-by: NJason Wang <jasowang@redhat.com> Signed-off-by: NEli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20220105114646.577224-2-elic@nvidia.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 Laura Abbott 提交于
The return type of get_config_size is size_t so it makes sense to change the type of the variable holding its result. That said, this already got taken care of (differently, and arguably not as well) by commit 3ed21c14 ("vdpa: check that offsets are within bounds"). The added 'c->off > size' test in that commit will be done as an unsigned comparison on 32-bit (safe due to not being signed). On a 64-bit platform, it will be done as a signed comparison, but in that case the comparison will be done in 64-bit, and 'c->off' being an u32 it will be valid thanks to the extended range (ie both values will be positive in 64 bits). So this was a real bug, but it was already addressed and marked for stable. Signed-off-by: NLaura Abbott <labbott@kernel.org> Reported-by: NLuo Likang <luolikang@nsfocus.com> Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 Michael S. Tsirkin 提交于
A recently added error path does not mark ring unused when exiting on OOM, which will lead to BUG on the next entry in debug builds. TODO: refactor code so we have START_USE and END_USE in the same function. Fixes: fc6d70f4 ("virtio_ring: check desc == NULL when using indirect with packed") Cc: "Xuan Zhuo" <xuanzhuo@linux.alibaba.com> Cc: Jiasheng Jiang <jiasheng@iscas.ac.cn> Reviewed-by: NXuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 Xianting Tian 提交于
We need free the vqs in .release(), which are allocated in .open(). Signed-off-by: NXianting Tian <xianting.tian@linux.alibaba.com> Link: https://lore.kernel.org/r/20211228030924.3468439-1-xianting.tian@linux.alibaba.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com> Acked-by: NJason Wang <jasowang@redhat.com>
-
由 Eli Cohen 提交于
Remove overriding of virtio_version_1_0 which forced the virtqueue object to version 1. Fixes: 1a86b377 ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices") Signed-off-by: NEli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20211230142024.142979-1-elic@nvidia.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com> Reviewed-by: NParav Pandit <parav@nvidia.com> Acked-by: NJason Wang <jasowang@redhat.com> Reviewed-by: NSi-Wei Liu <si-wei.liu@oracle.com>
-
由 Peng Hao 提交于
When pci_iomap return NULL, the return value is zero. Signed-off-by: NPeng Hao <flyingpeng@tencent.com> Link: https://lore.kernel.org/r/20211222112014.87394-1-flyingpeng@tencent.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com> Acked-by: NJason Wang <jasowang@redhat.com>
-
由 Peng Hao 提交于
There is a check for vm->sbm.sb_states before, and it should check it here as well. Signed-off-by: NPeng Hao <flyingpeng@tencent.com> Link: https://lore.kernel.org/r/20211222011225.40573-1-flyingpeng@tencent.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com> Fixes: 5f1f79bb ("virtio-mem: Paravirtualized memory hotplug") Cc: stable@vger.kernel.org # v5.8+
-
由 Dapeng Mi 提交于
Function name "vp_modern_remove" in comments is written to "vp_modern_probe" incorrectly. Change it. Signed-off-by: NDapeng Mi <dapeng1.mi@intel.com> Link: https://lore.kernel.org/r/20211210073546.700783-1-dapeng1.mi@intel.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com> Reviewed-by: NStefano Garzarella <sgarzare@redhat.com>
-
由 王贇 提交于
The error message on the failure of pfn check should tell virtio-pci rather than virtio-mmio, just fix it. Signed-off-by: NMichael Wang <yun.wang@linux.alibaba.com> Suggested-by: NMichael S. Tsirkin <mst@redhat.com> Link: https://lore.kernel.org/r/ae5e154e-ac59-f0fa-a7c7-091a2201f581@linux.alibaba.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 Johan Hovold 提交于
Explicitly remove the file entries from sysfs before dropping the final reference for symmetry reasons and for consistency with the rest of the driver. Signed-off-by: NJohan Hovold <johan@kernel.org> Link: https://lore.kernel.org/r/20211201132528.30025-5-johan@kernel.orgSigned-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 Johan Hovold 提交于
Make sure to always NUL-terminate file names retrieved from the firmware to avoid accessing data beyond the entry slab buffer and exposing it through sysfs in case the firmware data is corrupt. Fixes: 75f3e8e4 ("firmware: introduce sysfs driver for QEMU's fw_cfg device") Cc: stable@vger.kernel.org # 4.6 Cc: Gabriel Somlo <somlo@cmu.edu> Signed-off-by: NJohan Hovold <johan@kernel.org> Link: https://lore.kernel.org/r/20211201132528.30025-4-johan@kernel.orgSigned-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 Johan Hovold 提交于
An initialised kobject must be freed using kobject_put() to avoid leaking associated resources (e.g. the object name). Commit fe3c6068 ("firmware: Fix a reference count leak.") "fixed" the leak in the first error path of the file registration helper but left the second one unchanged. This "fix" would however result in a NULL pointer dereference due to the release function also removing the never added entry from the fw_cfg_entry_cache list. This has now been addressed. Fix the remaining kobject leak by restoring the common error path and adding the missing kobject_put(). Fixes: 75f3e8e4 ("firmware: introduce sysfs driver for QEMU's fw_cfg device") Cc: stable@vger.kernel.org # 4.6 Cc: Gabriel Somlo <somlo@cmu.edu> Signed-off-by: NJohan Hovold <johan@kernel.org> Link: https://lore.kernel.org/r/20211201132528.30025-3-johan@kernel.orgSigned-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 Johan Hovold 提交于
Commit fe3c6068 ("firmware: Fix a reference count leak.") "fixed" a kobject leak in the file registration helper by properly calling kobject_put() for the entry in case registration of the object fails (e.g. due to a name collision). This would however result in a NULL pointer dereference when the release function tries to remove the never added entry from the fw_cfg_entry_cache list. Fix this by moving the list-removal out of the release function. Note that the offending commit was one of the benign looking umn.edu fixes which was reviewed but not reverted. [1][2] [1] https://lore.kernel.org/r/202105051005.49BFABCE@keescook [2] https://lore.kernel.org/all/YIg7ZOZvS3a8LjSv@kroah.com Fixes: fe3c6068 ("firmware: Fix a reference count leak.") Cc: stable@vger.kernel.org # 5.8 Cc: Qiushi Wu <wu000273@umn.edu> Cc: Kees Cook <keescook@chromium.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NJohan Hovold <johan@kernel.org> Link: https://lore.kernel.org/r/20211201132528.30025-2-johan@kernel.orgSigned-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 Eugenio Pérez 提交于
Since vhost_vdpa_mmap checks for its existence before calling it. Signed-off-by: NEugenio Pérez <eperezma@redhat.com> Link: https://lore.kernel.org/r/20211104195248.2088904-1-eperezma@redhat.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com> Acked-by: NJason Wang <jasowang@redhat.com> Reviewed-by: NStefano Garzarella <sgarzare@redhat.com>
-
由 Eugenio Pérez 提交于
It has no sense to call get_status twice, since we already have a variable for that. Signed-off-by: NEugenio Pérez <eperezma@redhat.com> Link: https://lore.kernel.org/r/20211104195833.2089796-1-eperezma@redhat.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com> Acked-by: NJason Wang <jasowang@redhat.com> Reviewed-by: NStefano Garzarella <sgarzare@redhat.com>
-
由 Christophe JAILLET 提交于
When 'pcim_enable_device()' is used, some resources become automagically managed. There is no need to call 'pci_free_irq_vectors()' when the driver is removed. The same will already be done by 'pcim_release()'. Signed-off-by: NChristophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/02045bdcbbb25f79bae4827f66029cfcddc90381.1636301587.git.christophe.jaillet@wanadoo.frSigned-off-by: NMichael S. Tsirkin <mst@redhat.com> Acked-by: NJason Wang <jasowang@redhat.com>
-
由 Eli Cohen 提交于
Make sure to offer VIRTIO_NET_F_MTU since we configure the MTU based on what was queried from the device. This allows the virtio driver to allocate large enough buffers based on the reported MTU. Signed-off-by: NEli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20211124170949.51725-1-elic@nvidia.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com> Acked-by: NJason Wang <jasowang@redhat.com> Reviewed-by: NSi-Wei Liu <si-wei.liu@oracle.com>
-
由 David Hildenbrand 提交于
Let's prepare our fake page onlining code for subblock size smaller than MAX_ORDER - 1: we might get called for ranges not covering properly aligned MAX_ORDER - 1 pages. We have to detect the order to use dynamically. Signed-off-by: NDavid Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20211126134209.17332-3-david@redhat.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com> Reviewed-by: NZi Yan <ziy@nvidia.com> Reviewed-by: NEric Ren <renzhengeek@gmail.com>
-
由 David Hildenbrand 提交于
Let's prepare our page onlining code for subblock size smaller than MAX_ORDER - 1: we'll get called for a MAX_ORDER - 1 page but might have some subblocks in the range plugged and some unplugged. In that case, fallback to subblock granularity to properly only expose the plugged parts to the buddy. Signed-off-by: NDavid Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20211126134209.17332-2-david@redhat.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com> Reviewed-by: NZi Yan <ziy@nvidia.com> Reviewed-by: NEric Ren <renzhengeek@gmail.com>
-
由 Stefano Garzarella 提交于
`driver_override` allows to control which of the vDPA bus drivers binds to a vDPA device. If `driver_override` is not set, the previous behaviour is followed: devices use the first vDPA bus driver loaded (unless auto binding is disabled). Tested on Fedora 34 with driverctl(8): $ modprobe virtio-vdpa $ modprobe vhost-vdpa $ modprobe vdpa-sim-net $ vdpa dev add mgmtdev vdpasim_net name dev1 # dev1 is attached to the first vDPA bus driver loaded $ driverctl -b vdpa list-devices dev1 virtio_vdpa $ driverctl -b vdpa set-override dev1 vhost_vdpa $ driverctl -b vdpa list-devices dev1 vhost_vdpa [*] Note: driverctl(8) integrates with udev so the binding is preserved. Suggested-by: NJason Wang <jasowang@redhat.com> Acked-by: NJason Wang <jasowang@redhat.com> Signed-off-by: NStefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20211126164753.181829-3-sgarzare@redhat.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 Stefano Garzarella 提交于
Add missing documentation of sysfs ABI for vDPA bus in the new Documentation/ABI/testing/sysfs-bus-vdpa file. Signed-off-by: NStefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20211126164753.181829-2-sgarzare@redhat.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com> Acked-by: NJason Wang <jasowang@redhat.com>
-
由 Zhu Lingshan 提交于
This commit fixes a misuse of virtio-net device config size issue for virtio-block devices. A new member config_size in struct ifcvf_hw is introduced and would be initialized through vdpa_dev_add() to record correct device config size. To be more generic, rename ifcvf_hw.net_config to ifcvf_hw.dev_config, the helpers ifcvf_read/write_net_config() to ifcvf_read/write_dev_config() Signed-off-by: NZhu Lingshan <lingshan.zhu@intel.com> Reported-and-suggested-by: NStefano Garzarella <sgarzare@redhat.com> Reviewed-by: NStefano Garzarella <sgarzare@redhat.com> Fixes: 6ad31d16 ("vDPA/ifcvf: enable Intel C5000X-PL virtio-block for vDPA") Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20211201081255.60187-1-lingshan.zhu@intel.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 Guanjun 提交于
This free action should be moved into caller 'vduse_ioctl' in concert with the allocation. No functional change. Signed-off-by: NGuanjun <guanjun@linux.alibaba.com> Link: https://lore.kernel.org/r/1638780498-55571-1-git-send-email-guanjun@linux.alibaba.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 Michael S. Tsirkin 提交于
unregister after reset is clearly wrong - device can be used while it's reset. There's an attempt to protect against that using hwrng_removed but it seems racy since access can be in progress when the flag is set. Just unregister, then reset seems simpler and cleaner. NB: we might be able to drop hwrng_removed in a follow-up patch. Signed-off-by: NLaurent Vivier <lvivier@redhat.com> Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 Michael S. Tsirkin 提交于
This will enable cleanups down the road. The idea is to disable cbs, then add "flush_queued_cbs" callback as a parameter, this way drivers can flush any work queued after callbacks have been disabled. Signed-off-by: NMichael S. Tsirkin <mst@redhat.com> Link: https://lore.kernel.org/r/20211013105226.20225-1-mst@redhat.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
- 03 1月, 2022 4 次提交
-
-
由 Linus Torvalds 提交于
-
由 Linus Torvalds 提交于
Merge tag 'perf-tools-fixes-for-v5.16-2022-01-02' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux Pull perf tools fixes from Arnaldo Carvalho de Melo: - Fix TUI exit screen refresh race condition in 'perf top'. - Fix parsing of Intel PT VM time correlation arguments. - Honour CPU filtering command line request of a script's switch events in 'perf script'. - Fix printing of switch events in Intel PT python script. - Fix duplicate alias events list printing in 'perf list', noticed on heterogeneous arm64 systems. - Fix return value of ids__new(), users expect NULL for failure, not ERR_PTR(-ENOMEM). * tag 'perf-tools-fixes-for-v5.16-2022-01-02' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: perf top: Fix TUI exit screen refresh race condition perf pmu: Fix alias events list perf scripts python: intel-pt-events.py: Fix printing of switch events perf script: Fix CPU filtering of a script's switch events perf intel-pt: Fix parsing of VM time correlation arguments perf expr: Fix return value of ids__new()
-
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux由 Linus Torvalds 提交于
Pull i2c fixes from Wolfram Sang: "Better input validation for compat ioctls and a documentation bugfix for 5.16" * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: Docs: Fixes link to I2C specification i2c: validate user data in compat ioctl
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip由 Linus Torvalds 提交于
Pull x86 fix from Borislav Petkov: - Use the proper CONFIG symbol in a preprocessor check. * tag 'x86_urgent_for_v5.16_rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/build: Use the proper name CONFIG_FW_LOADER
-
- 02 1月, 2022 3 次提交
-
-
由 yaowenbin 提交于
When the following command is executed several times, a coredump file is generated. $ timeout -k 9 5 perf top -e task-clock ******* ******* ******* 0.01% [kernel] [k] __do_softirq 0.01% libpthread-2.28.so [.] __pthread_mutex_lock 0.01% [kernel] [k] __ll_sc_atomic64_sub_return double free or corruption (!prev) perf top --sort comm,dso timeout: the monitored command dumped core When we terminate "perf top" using sending signal method, SLsmg_reset_smg() called. SLsmg_reset_smg() resets the SLsmg screen management routines by freeing all memory allocated while it was active. However SLsmg_reinit_smg() maybe be called by another thread. SLsmg_reinit_smg() will free the same memory accessed by SLsmg_reset_smg(), thus it results in a double free. SLsmg_reinit_smg() is called already protected by ui__lock, so we fix the problem by adding pthread_mutex_trylock of ui__lock when calling SLsmg_reset_smg(). Signed-off-by: NWenyu Liu <liuwenyu7@huawei.com> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: wuxu.wu@huawei.com Link: http://lore.kernel.org/lkml/a91e3943-7ddc-f5c0-a7f5-360f073c20e6@huawei.comSigned-off-by: NHewenliang <hewenliang4@huawei.com> Signed-off-by: Nyaowenbin <yaowenbin1@huawei.com> Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 John Garry 提交于
Commit 0e0ae874 ("perf list: Display hybrid PMU events with cpu type") changes the event list for uncore PMUs or arm64 heterogeneous CPU systems, such that duplicate aliases are incorrectly listed per PMU (which they should not be), like: # perf list ... unc_cbo_cache_lookup.any_es [Unit: uncore_cbox L3 Lookup any request that access cache and found line in E or S-state] unc_cbo_cache_lookup.any_es [Unit: uncore_cbox L3 Lookup any request that access cache and found line in E or S-state] unc_cbo_cache_lookup.any_i [Unit: uncore_cbox L3 Lookup any request that access cache and found line in I-state] unc_cbo_cache_lookup.any_i [Unit: uncore_cbox L3 Lookup any request that access cache and found line in I-state] ... Notice how the events are listed twice. The named commit changed how we remove duplicate events, in that events for different PMUs are not treated as duplicates. I suppose this is to handle how "Each hybrid pmu event has been assigned with a pmu name". Fix PMU alias listing by restoring behaviour to remove duplicates for non-hybrid PMUs. Fixes: 0e0ae874 ("perf list: Display hybrid PMU events with cpu type") Signed-off-by: NJohn Garry <john.garry@huawei.com> Tested-by: NZhengjun Xing <zhengjun.xing@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/1640103090-140490-1-git-send-email-john.garry@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input由 Linus Torvalds 提交于
Pull input fixes from Dmitry Torokhov: "Two small fixups for spaceball joystick driver and appletouch touchpad driver" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: spaceball - fix parsing of movement data packets Input: appletouch - initialize work before device registration
-
- 01 1月, 2022 6 次提交
-
-
由 Mel Gorman 提交于
Hugh Dickins reported the following My tmpfs swapping load (tweaked to use huge pages more heavily than in real life) is far from being a realistic load: but it was notably slowed down by your throttling mods in 5.16-rc, and this patch makes it well again - thanks. But: it very quickly hit NULL pointer until I changed that last line to if (first_pgdat) consider_reclaim_throttle(first_pgdat, sc); The likely issue is that huge pages are a major component of the test workload. When this is the case, first_pgdat may never get set if compaction is ready to continue due to this check if (IS_ENABLED(CONFIG_COMPACTION) && sc->order > PAGE_ALLOC_COSTLY_ORDER && compaction_ready(zone, sc)) { sc->compaction_ready = true; continue; } If this was true for every zone in the zonelist, first_pgdat would never get set resulting in a NULL pointer exception. Link: https://lkml.kernel.org/r/20211209095453.GM3366@techsingularity.net Fixes: 1b4e3f26 ("mm: vmscan: Reduce throttling due to a failure to make progress") Signed-off-by: NMel Gorman <mgorman@techsingularity.net> Reported-by: NHugh Dickins <hughd@google.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Rik van Riel <riel@surriel.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Darrick J. Wong <djwong@kernel.org> Cc: Shakeel Butt <shakeelb@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Mel Gorman 提交于
Mike Galbraith, Alexey Avramov and Darrick Wong all reported similar problems due to reclaim throttling for excessive lengths of time. In Alexey's case, a memory hog that should go OOM quickly stalls for several minutes before stalling. In Mike and Darrick's cases, a small memcg environment stalled excessively even though the system had enough memory overall. Commit 69392a40 ("mm/vmscan: throttle reclaim when no progress is being made") introduced the problem although commit a19594ca ("mm/vmscan: increase the timeout if page reclaim is not making progress") made it worse. Systems at or near an OOM state that cannot be recovered must reach OOM quickly and memcg should kill tasks if a memcg is near OOM. To address this, only stall for the first zone in the zonelist, reduce the timeout to 1 tick for VMSCAN_THROTTLE_NOPROGRESS and only stall if the scan control nr_reclaimed is 0, kswapd is still active and there were excessive pages pending for writeback. If kswapd has stopped reclaiming due to excessive failures, do not stall at all so that OOM triggers relatively quickly. Similarly, if an LRU is simply congested, only lightly throttle similar to NOPROGRESS. Alexey's original case was the most straight forward for i in {1..3}; do tail /dev/zero; done On vanilla 5.16-rc1, this test stalled heavily, after the patch the test completes in a few seconds similar to 5.15. Alexey's second test case added watching a youtube video while tail runs 10 times. On 5.15, playback only jitters slightly, 5.16-rc1 stalls a lot with lots of frames missing and numerous audio glitches. With this patch applies, the video plays similarly to 5.15. [lkp@intel.com: Fix W=1 build warning] Link: https://lore.kernel.org/r/99e779783d6c7fce96448a3402061b9dc1b3b602.camel@gmx.de Link: https://lore.kernel.org/r/20211124011954.7cab9bb4@mail.inbox.lv Link: https://lore.kernel.org/r/20211022144651.19914-1-mgorman@techsingularity.net Link: https://lore.kernel.org/r/20211202150614.22440-1-mgorman@techsingularity.net Link: https://linux-regtracking.leemhuis.info/regzbot/regression/20211124011954.7cab9bb4@mail.inbox.lv/Reported-and-tested-by: NAlexey Avramov <hakavlad@inbox.lv> Reported-and-tested-by: NMike Galbraith <efault@gmx.de> Reported-and-tested-by: NDarrick J. Wong <djwong@kernel.org> Reported-by: Nkernel test robot <lkp@intel.com> Acked-by: NHugh Dickins <hughd@google.com> Tracked-by: NThorsten Leemhuis <regressions@leemhuis.info> Fixes: 69392a40 ("mm/vmscan: throttle reclaim when no progress is being made") Signed-off-by: NMel Gorman <mgorman@techsingularity.net> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Linus Torvalds 提交于
Merge misc mm fixes from Andrew Morton: "2 patches. Subsystems affected by this patch series: mm (userfaultfd and damon)" * akpm: mm/damon/dbgfs: fix 'struct pid' leaks in 'dbgfs_target_ids_write()' userfaultfd/selftests: fix hugetlb area allocations
-
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi由 Linus Torvalds 提交于
Pull SCSI fixes from James Bottomley: "Three fixes, all in drivers. The lpfc one doesn't look exploitable, but nasty things could happen in string operations if mybuf ends up with an on stack unterminated string" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: vmw_pvscsi: Set residual data length conditionally scsi: libiscsi: Fix UAF in iscsi_conn_get_param()/iscsi_conn_teardown() scsi: lpfc: Terminate string in lpfc_debugfs_nvmeio_trc_write()
-
由 SeongJae Park 提交于
DAMON debugfs interface increases the reference counts of 'struct pid's for targets from the 'target_ids' file write callback ('dbgfs_target_ids_write()'), but decreases the counts only in DAMON monitoring termination callback ('dbgfs_before_terminate()'). Therefore, when 'target_ids' file is repeatedly written without DAMON monitoring start/termination, the reference count is not decreased and therefore memory for the 'struct pid' cannot be freed. This commit fixes this issue by decreasing the reference counts when 'target_ids' is written. Link: https://lkml.kernel.org/r/20211229124029.23348-1-sj@kernel.org Fixes: 4bc05954 ("mm/damon: implement a debugfs-based user space interface") Signed-off-by: NSeongJae Park <sj@kernel.org> Cc: <stable@vger.kernel.org> [5.15+] Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Mike Kravetz 提交于
Currently, userfaultfd selftest for hugetlb as run from run_vmtests.sh or any environment where there are 'just enough' hugetlb pages will always fail with: testing events (fork, remap, remove): ERROR: UFFDIO_COPY error: -12 (errno=12, line=616) The ENOMEM error code implies there are not enough hugetlb pages. However, there are free hugetlb pages but they are all reserved. There is a basic problem with the way the test allocates hugetlb pages which has existed since the test was originally written. Due to the way 'cleanup' was done between different phases of the test, this issue was masked until recently. The issue was uncovered by commit 8ba6e864 ("userfaultfd/selftests: reinitialize test context in each test"). For the hugetlb test, src and dst areas are allocated as PRIVATE mappings of a hugetlb file. This means that at mmap time, pages are reserved for the src and dst areas. At the start of event testing (and other tests) the src area is populated which results in allocation of huge pages to fill the area and consumption of reserves associated with the area. Then, a child is forked to fault in the dst area. Note that the dst area was allocated in the parent and hence the parent owns the reserves associated with the mapping. The child has normal access to the dst area, but can not use the reserves created/owned by the parent. Thus, if there are no other huge pages available allocation of a page for the dst by the child will fail. Fix by not creating reserves for the dst area. In this way the child can use free (non-reserved) pages. Also, MAP_PRIVATE of a file only makes sense if you are interested in the contents of the file before making a COW copy. The test does not do this. So, just use MAP_ANONYMOUS | MAP_HUGETLB to create an anonymous hugetlb mapping. There is no need to create a hugetlb file in the non-shared case. Link: https://lkml.kernel.org/r/20211217172919.7861-1-mike.kravetz@oracle.comSigned-off-by: NMike Kravetz <mike.kravetz@oracle.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Peter Xu <peterx@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Mina Almasry <almasrymina@google.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-