- 11 11月, 2021 40 次提交
-
-
由 Colin Ian King 提交于
mainline inclusion from mainline-v5.14-rc1 commit f25247d8 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4H7E6 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f25247d88708 ---------------------------------------------------------------------- The comparisons of the u16 values priv->phycr1 and priv->phycr2 to less than zero always false because they are unsigned. Fix this by using an int for the assignment and less than zero check. Addresses-Coverity: ("Unsigned compared against 0") Fixes: 0a4355c2 ("net: phy: realtek: add dt property to disable CLKOUT clock") Fixes: d90db36a ("net: phy: realtek: add dt property to enable ALDPS mode") Signed-off-by: NColin Ian King <colin.king@canonical.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Reviewed-by: NYongxin Li <liyongxin1@huawei.com> Signed-off-by: NJunxin Chen <chenjunxin1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Joakim Zhang 提交于
mainline inclusion from mainline-v5.14-rc1 commit d90db36a category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4H7E6 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d90db36a9e74 ---------------------------------------------------------------------- If enable Advance Link Down Power Saving (ALDPS) mode, it will change crystal/clock behavior, which cause RXC clock stop for dozens to hundreds of miliseconds. This is comfirmed by Realtek engineer. For some MACs, it needs RXC clock to support RX logic, after this patch, PHY can generate continuous RXC clock during auto-negotiation. ALDPS default is disabled after hardware reset, it's more reasonable to add a property to enable this feature, since ALDPS would introduce side effect. This patch adds dt property "realtek,aldps-enable" to enable ALDPS mode per users' requirement. Jisheng Zhang enables this feature, changes the default behavior. Since mine patch breaks the rule that new implementation should not break existing design, so Cc'ed let him know to see if it can be accepted. Cc: Jisheng Zhang <Jisheng.Zhang@synaptics.com> Signed-off-by: NJoakim Zhang <qiangqing.zhang@nxp.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Reviewed-by: NYongxin Li <liyongxin1@huawei.com> Signed-off-by: NJunxin Chen <chenjunxin1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Joakim Zhang 提交于
mainline inclusion from mainline-v5.14-rc1 commit 0a4355c2 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4H7E6 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0a4355c2b7f8 ---------------------------------------------------------------------- CLKOUT is enabled by default after PHY hardware reset, this patch adds "realtek,clkout-disable" property for user to disable CLKOUT clock to save PHY power. Per RTL8211F guide, a PHY reset should be issued after setting these bits in PHYCR2 register. After this patch, CLKOUT clock output to be disabled. Signed-off-by: NJoakim Zhang <qiangqing.zhang@nxp.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Reviewed-by: NYongxin Li <liyongxin1@huawei.com> Signed-off-by: NJunxin Chen <chenjunxin1@huawei.com> (fix conflicts: adjust the location of RTL8211F_CLKOUT_EN) (fix conflicts: replace rtl8211e_config_init by rtl821x_resume) (fix conflicts: replace .read_status by .ack_interrupt) Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 l00525341 提交于
driver inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4A1WQ ------------------------------------------------------------------------ Build HISI PMU drivers as modules. Signed-off-by: NQi Liu <liuqi115@huawei.com> Reviewed-by: NShaokun Zhang <zhangshaokun@hisilicon.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Qi Deng 提交于
driver inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4ETXO CVE: NA ----------------------------------------- Make CONFIG_BMA=m, except euleros_defconfig. Link: https://lkml.org/lkml/2020/6/22/752Signed-off-by: NQi Deng <dengqi7@huawei.com> Reviewed-by: NQidong Wang <wangqindong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Qi Deng 提交于
driver inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4ETXO CVE: NA ----------------------------------------- The BMA software is a system management software offered by Huawei. It supports the status monitoring, performance monitoring, and event monitoring of various components, including server CPUs, memory, hard disks, NICs, IB cards, PCIe cards, RAID controller cards, and optical modules. This cdev_veth_drv driver is one of the communication drivers used by BMA software. It depends on the host_edma_drv driver. It will create a char device once loaded, offering interfaces (open, close, read, write and poll) to BMA to send/receive RESTful messages between BMC software. When the message is longer than 1KB, it will be cut into packets of 1KB. The other side, BMC's cdev_veth driver, will assemble these packets back into original mesages. Link: https://lkml.org/lkml/2020/6/22/752Signed-off-by: NQi Deng <dengqi7@huawei.com> Reviewed-by: NQidong Wang <wangqindong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Qi Deng 提交于
driver inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4ETXO CVE: NA ----------------------------------------- The BMA software is a system management software offered by Huawei. It supports the status monitoring, performance monitoring, and event monitoring of various components, including server CPUs, memory, hard disks, NICs, IB cards, PCIe cards, RAID controller cards, and optical modules. The host_kbox_drv driver serves the function of a black box. When a panic or mce event happen to the system, it will record the event time, system's status and system logs and send them to BMC before the OS shutdown. This driver depends on the host_edms_drv driver. Link: https://lkml.org/lkml/2020/6/22/752Signed-off-by: NQi Deng <dengqi7@huawei.com> Reviewed-by: NQidong Wang <wangqindong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Qi Deng 提交于
driver inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4ETXO CVE: NA ----------------------------------------- The BMA software is a system management software offered by Huawei. It supports the status monitoring, performance monitoring, and event monitoring of various components, including server CPUs, memory, hard disks, NICs, IB cards, PCIe cards, RAID controller cards, and optical modules. This host_veth_drv driver is one of the communication drivers used by BMA software. It depends on the host_edma_drv driver. The host_veth_drv driver will create a virtual net device "veth" once loaded. BMA software will use it to send/receive RESTful messages to/from BMC software. Link: https://lkml.org/lkml/2020/6/22/752Signed-off-by: NQi Deng <dengqi7@huawei.com> Reviewed-by: NQidong Wang <wangqindong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Qi Deng 提交于
driver inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4ETXO CVE: NA ----------------------------------------- The BMA software is a system management software offered by Huawei. It supports the status monitoring, performance monitoring, and event monitoring of various components, including server CPUs, memory, hard disks, NICs, IB cards, PCIe cards, RAID controller cards, and optical modules. This host_cdev_drv driver is one of the communication driver used by BMA software. It depends on the host_edma_drv driver. The host_cdev_drv driver will create 4 char devices(hwbmc0, hwbmc1, hwbmc2, hwbmc3) once loaded. These char devices offer interfaces, including open, close, read, write and poll, to upper level applications. BMA uses them to send/receive ipmi commons to/from BMC. Link: https://lkml.org/lkml/2020/6/22/752Signed-off-by: NQi Deng <dengqi7@huawei.com> Reviewed-by: NQidong Wang <wangqindong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Qi Deng 提交于
driver inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4ETXO CVE: NA ----------------------------------------- The BMA software is a system management software offered by Huawei. It supports the status monitoring, performance monitoring, and event monitoring of various components, including server CPUs, memory, hard disks, NICs, IB cards, PCIe cards, RAID controller cards, and optical modules. This host_edma_drv driver is a PCIe driver used by Huawei BMA software. The main function of it is to control the PCIe bus between BMA software and Huawei 1711 chip. The chip will then process the data and display to users. This host_edma_drv driver offers API to send/receive data for other BMA drivers which want to use the PCIe channel in different ways(eg.host_cdev_drv, host_veth_drv). Link: https://lkml.org/lkml/2020/6/22/752Signed-off-by: NQi Deng <dengqi7@huawei.com> Reviewed-by: NQidong Wang <wangqindong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Yunsheng Lin 提交于
mainline inclusion from mainline-master commit d00e60ee category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4CVS3 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d00e60ee54b1 ---------------------------------------------------------------------- As the 32-bit arch with 64-bit DMA seems to rare those days, and page pool might carry a lot of code and complexity for systems that possibly. So disable dma mapping support for such systems, if drivers really want to work on such systems, they have to implement their own DMA-mapping fallback tracking outside page_pool. Reviewed-by: NIlias Apalodimas <ilias.apalodimas@linaro.org> Signed-off-by: NYunsheng Lin <linyunsheng@huawei.com> Acked-by: NJesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Reviewed-by: NYongxin Li <liyongxin1@huawei.com> Signed-off-by: NJunxin Chen <chenjunxin1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Yunsheng Lin 提交于
mainline inclusion from mainline-v5.15-rc1 commit 7fb9b66d category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4CVS3 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7fb9b66dc9ce ---------------------------------------------------------------------- There is no need to synchronize the account updating, so use the relaxed atomic to avoid some memory barrier in the data path. Acked-by: NJesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: NYunsheng Lin <linyunsheng@huawei.com> Acked-by: NIlias Apalodimas <ilias.apalodimas@linaro.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Reviewed-by: NYongxin Li <liyongxin1@huawei.com> Signed-off-by: NJunxin Chen <chenjunxin1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Yunsheng Lin 提交于
mainline inclusion from mainline-v5.15-rc2 commit f7ec554b category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4CVS3 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f7ec554b73c5 ---------------------------------------------------------------------- When page pool is added to the hns3 driver, it is always enabled unconditionally, which means spilt page handling in the hns3 driver is dead code. As there is a requirement to test the performance between spilt page handling in driver and page pool, so add a module param to support disabling the page pool. When the page pool is proved to perform better in most case, the spilt page handling in driver can be removed. Fixes: 93188e96 ("net: hns3: support skb's frag page recycling based on page pool") Signed-off-by: NYunsheng Lin <linyunsheng@huawei.com> Signed-off-by: NGuangbin Huang <huangguangbin2@huawei.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Reviewed-by: NYongxin Li <liyongxin1@huawei.com> Signed-off-by: NJunxin Chen <chenjunxin1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Yunsheng Lin 提交于
mainline inclusion from mainline-v5.15-rc1 commit 93188e96 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4CVS3 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=93188e9642c3 ---------------------------------------------------------------------- This patch adds skb's frag page recycling support based on the frag page support in page pool. The performance improves above 10~20% for single thread iperf TCP flow with IOMMU disabled when iperf server and irq/NAPI have a different CPU. The performance improves about 135%(14Gbit to 33Gbit) for single thread iperf TCP flow when IOMMU is in strict mode and iperf server shares the same cpu with irq/NAPI. Signed-off-by: NYunsheng Lin <linyunsheng@huawei.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org> Reviewed-by: NYongxin Li <liyongxin1@huawei.com> Signed-off-by: NJunxin Chen <chenjunxin1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Yunsheng Lin 提交于
mainline inclusion from mainline-v5.15-rc1 commit 53e0961d category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4CVS3 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=53e0961da1c7 ---------------------------------------------------------------------- Currently page pool only support page recycling when there is only one user of the page, and the split page reusing implemented in the most driver can not use the page pool as bing-pong way of reusing requires the multi user support in page pool. Those reusing or recycling has below limitations: 1. page from page pool can only be used be one user in order for the page recycling to happen. 2. Bing-pong way of reusing in most driver does not support multi desc using different part of the same page in order to save memory. So add multi-users support and frag page recycling in page pool to overcome the above limitation. Signed-off-by: NYunsheng Lin <linyunsheng@huawei.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org> Reviewed-by: NYongxin Li <liyongxin1@huawei.com> Signed-off-by: NJunxin Chen <chenjunxin1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Yunsheng Lin 提交于
mainline inclusion from mainline-v5.15-rc1 commit 0e9d2a0a category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4CVS3 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0e9d2a0a3a83 ---------------------------------------------------------------------- For 32 bit systems with 64 bit dma, dma_addr[1] is used to store the upper 32 bit dma addr, those system should be rare those days. For normal system, the dma_addr[1] in 'struct page' is not used, so we can reuse dma_addr[1] for storing frag count, which means how many frags this page might be splited to. In order to simplify the page frag support in the page pool, the PAGE_POOL_DMA_USE_PP_FRAG_COUNT macro is added to indicate the 32 bit systems with 64 bit dma, and the page frag support in page pool is disabled for such system. The newly added page_pool_set_frag_count() is called to reserve the maximum frag count before any page frag is passed to the user. The page_pool_atomic_sub_frag_count_return() is called when user is done with the page frag. Signed-off-by: NYunsheng Lin <linyunsheng@huawei.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org> Reviewed-by: NYongxin Li <liyongxin1@huawei.com> Signed-off-by: NJunxin Chen <chenjunxin1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Yunsheng Lin 提交于
mainline inclusion from mainline-v5.15-rc1 commit 57f05bc2 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4CVS3 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=57f05bc2ab24 ---------------------------------------------------------------------- Currently, page->pp is cleared and set everytime the page is recycled, which is unnecessary. So only set the page->pp when the page is added to the page pool and only clear it when the page is released from the page pool. This is also a preparation to support allocating frag page in page pool. Reviewed-by: NIlias Apalodimas <ilias.apalodimas@linaro.org> Signed-off-by: NYunsheng Lin <linyunsheng@huawei.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org> Reviewed-by: NYongxin Li <liyongxin1@huawei.com> Signed-off-by: NJunxin Chen <chenjunxin1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Yunsheng Lin 提交于
mainline inclusion from mainline-v5.14-rc6 commit 0fa32ca4 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4CVS3 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0fa32ca438b4 ---------------------------------------------------------------------- As mentioned in commit c07aea3e ("mm: add a signature in struct page"): "The page->signature field is aliased to page->lru.next and page->compound_head." And as the comment in page_is_pfmemalloc(): "lru.next has bit 1 set if the page is allocated from the pfmemalloc reserves. Callers may simply overwrite it if they do not need to preserve that information." The page->signature is OR’ed with PP_SIGNATURE when a page is allocated in page pool, see __page_pool_alloc_pages_slow(), and page->signature is checked directly with PP_SIGNATURE in page_pool_return_skb_page(), which might cause resoure leaking problem for a page from page pool if bit 1 of lru.next is set for a pfmemalloc page. What happens here is that the original pp->signature is OR'ed with PP_SIGNATURE after the allocation in order to preserve any existing bits(such as the bit 1, used to indicate a pfmemalloc page), so when those bits are present, those page is not considered to be from page pool and the DMA mapping of those pages will be left stale. As bit 0 is for page->compound_head, So mask both bit 0/1 before the checking in page_pool_return_skb_page(). And we will return those pfmemalloc pages back to the page allocator after cleaning up the DMA mapping. Fixes: 6a5bcd84 ("page_pool: Allow drivers to hint on SKB recycling") Reviewed-by: NIlias Apalodimas <ilias.apalodimas@linaro.org> Signed-off-by: NYunsheng Lin <linyunsheng@huawei.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Reviewed-by: NYongxin Li <liyongxin1@huawei.com> Signed-off-by: NJunxin Chen <chenjunxin1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Ilias Apalodimas 提交于
mainline inclusion from mainline-v5.14-rc3 commit 2cc3aeb5 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4CVS3 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2cc3aeb5eccc ---------------------------------------------------------------------- As Alexander points out, when we are trying to recycle a cloned/expanded SKB we might trigger a race. The recycling code relies on the pp_recycle bit to trigger, which we carry over to cloned SKBs. If that cloned SKB gets expanded or if we get references to the frags, call skb_release_data() and overwrite skb->head, we are creating separate instances accessing the same page frags. Since the skb_release_data() will first try to recycle the frags, there's a potential race between the original and cloned SKB, since both will have the pp_recycle bit set. Fix this by explicitly those SKBs not recyclable. The atomic_sub_return effectively limits us to a single release case, and when we are calling skb_release_data we are also releasing the option to perform the recycling, or releasing the pages from the page pool. Fixes: 6a5bcd84 ("page_pool: Allow drivers to hint on SKB recycling") Reported-by: NAlexander Duyck <alexanderduyck@fb.com> Suggested-by: NAlexander Duyck <alexanderduyck@fb.com> Reviewed-by: NAlexander Duyck <alexanderduyck@fb.com> Acked-by: NJesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: NIlias Apalodimas <ilias.apalodimas@linaro.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Reviewed-by: NYongxin Li <liyongxin1@huawei.com> Signed-off-by: NJunxin Chen <chenjunxin1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Lorenzo Bianconi 提交于
mainline inclusion from mainline-v5.14-rc1 commit a078d981 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4CVS3 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a078d981f863 ---------------------------------------------------------------------- As already done for mvneta and mvpp2, enable skb recycling for ti ethernet drivers ti driver on net-next: ---------------------- [perf top] 47.15% [kernel] [k] _raw_spin_unlock_irqrestore 11.77% [kernel] [k] __cpdma_chan_free 3.16% [kernel] [k] ___bpf_prog_run 2.52% [kernel] [k] cpsw_rx_vlan_encap 2.34% [kernel] [k] __netif_receive_skb_core 2.27% [kernel] [k] free_unref_page 2.26% [kernel] [k] kmem_cache_free 2.24% [kernel] [k] kmem_cache_alloc 1.69% [kernel] [k] __softirqentry_text_start 1.61% [kernel] [k] cpsw_rx_handler 1.19% [kernel] [k] page_pool_release_page 1.19% [kernel] [k] clear_bits_ll 1.15% [kernel] [k] page_frag_free 1.06% [kernel] [k] __dma_page_dev_to_cpu 0.99% [kernel] [k] memset 0.94% [kernel] [k] __alloc_pages_bulk 0.92% [kernel] [k] kfree_skb 0.85% [kernel] [k] packet_rcv 0.78% [kernel] [k] page_address 0.75% [kernel] [k] v7_dma_inv_range 0.71% [kernel] [k] __lock_text_start [iperf3 tcp] [ 5] 0.00-10.00 sec 873 MBytes 732 Mbits/sec 0 sender [ 5] 0.00-10.01 sec 866 MBytes 726 Mbits/sec receiver ti + skb recycling: ------------------- [perf top] 40.58% [kernel] [k] _raw_spin_unlock_irqrestore 16.18% [kernel] [k] __softirqentry_text_start 10.33% [kernel] [k] __cpdma_chan_free 2.62% [kernel] [k] ___bpf_prog_run 2.05% [kernel] [k] cpsw_rx_vlan_encap 2.00% [kernel] [k] kmem_cache_alloc 1.86% [kernel] [k] __netif_receive_skb_core 1.80% [kernel] [k] kmem_cache_free 1.63% [kernel] [k] cpsw_rx_handler 1.12% [kernel] [k] cpsw_rx_mq_poll 1.11% [kernel] [k] page_pool_put_page 1.04% [kernel] [k] _raw_spin_unlock 0.97% [kernel] [k] clear_bits_ll 0.90% [kernel] [k] packet_rcv 0.88% [kernel] [k] __dma_page_dev_to_cpu 0.85% [kernel] [k] kfree_skb 0.80% [kernel] [k] memset 0.71% [kernel] [k] __lock_text_start 0.66% [kernel] [k] v7_dma_inv_range 0.64% [kernel] [k] gen_pool_free_owner [iperf3 tcp] [ 5] 0.00-10.00 sec 884 MBytes 742 Mbits/sec 0 sender [ 5] 0.00-10.01 sec 878 MBytes 735 Mbits/sec receiver Tested-by: NGrygorii Strashko <grygorii.strashko@ti.com> Reviewed-by: NGrygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: NLorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Reviewed-by: NYongxin Li <liyongxin1@huawei.com> Signed-off-by: NJunxin Chen <chenjunxin1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Matteo Croce 提交于
mainline inclusion from mainline-v5.14-rc1 commit 2f128eb3 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4CVS3 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2f128eb3308a ---------------------------------------------------------------------- Most of the time during the RX is caused by the compound_head() call done at the end of the RX loop: │ build_skb(): [...] │ static inline struct page *compound_head(struct page *page) │ { │ unsigned long head = READ_ONCE(page->compound_head); 65.23 │ ldr x2, [x1, #8] Prefetch the page struct as soon as possible, to speedup the RX path noticeabily by a ~3-4% packet rate in a drop test. │ build_skb(): [...] │ static inline struct page *compound_head(struct page *page) │ { │ unsigned long head = READ_ONCE(page->compound_head); 17.92 │ ldr x2, [x1, #8] Signed-off-by: NMatteo Croce <mcroce@microsoft.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Reviewed-by: NYongxin Li <liyongxin1@huawei.com> Signed-off-by: NJunxin Chen <chenjunxin1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Matteo Croce 提交于
mainline inclusion from mainline-v5.14-rc1 commit d8ea89fe category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4CVS3 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d8ea89fe8a49 ---------------------------------------------------------------------- In the RX buffer, the received data starts after a headroom used to align the IP header and to allow prepending headers efficiently. The prefetch() should take this into account, and prefetch from the very start of the received data. We can see that ether_addr_equal_64bits(), which is the first function to access the data, drops from the top of the perf top output. prefetch(data): Overhead Shared Object Symbol 11.64% [kernel] [k] eth_type_trans prefetch(data + MVPP2_MH_SIZE + MVPP2_SKB_HEADROOM): Overhead Shared Object Symbol 13.42% [kernel] [k] build_skb 10.35% [mvpp2] [k] mvpp2_rx 9.35% [kernel] [k] __netif_receive_skb_core 8.24% [kernel] [k] kmem_cache_free 7.97% [kernel] [k] dev_gro_receive 7.68% [kernel] [k] page_pool_put_page 7.32% [kernel] [k] kmem_cache_alloc 7.09% [mvpp2] [k] mvpp2_bm_pool_put 3.36% [kernel] [k] eth_type_trans Also, move the eth_type_trans() call a bit down, to give the RAM more time to prefetch the data. Signed-off-by: NMatteo Croce <mcroce@microsoft.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Reviewed-by: NYongxin Li <liyongxin1@huawei.com> Signed-off-by: NJunxin Chen <chenjunxin1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Matteo Croce 提交于
mainline inclusion from mainline-v5.14-rc1 commit e4017570 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4CVS3 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e4017570daee ---------------------------------------------------------------------- Use the new recycling API for page_pool. In a drop rate test, the packet rate increased by 10%, from 296 Kpps to 326 Kpps. perf top on a stock system shows: Overhead Shared Object Symbol 23.66% [kernel] [k] __pi___inval_dcache_area 22.85% [mvneta] [k] mvneta_rx_swbm 7.54% [kernel] [k] kmem_cache_alloc 6.49% [kernel] [k] eth_type_trans 3.94% [kernel] [k] dev_gro_receive 3.91% [kernel] [k] __netif_receive_skb_core 3.91% [kernel] [k] kmem_cache_free 3.76% [kernel] [k] page_pool_release_page 3.56% [kernel] [k] free_unref_page 2.40% [kernel] [k] build_skb 1.49% [kernel] [k] skb_release_data 1.45% [kernel] [k] __alloc_pages_bulk 1.30% [kernel] [k] page_frag_free And this is the same output with recycling enabled: Overhead Shared Object Symbol 26.41% [kernel] [k] __pi___inval_dcache_area 25.00% [mvneta] [k] mvneta_rx_swbm 8.14% [kernel] [k] kmem_cache_alloc 6.84% [kernel] [k] eth_type_trans 4.44% [kernel] [k] __netif_receive_skb_core 4.38% [kernel] [k] kmem_cache_free 4.16% [kernel] [k] dev_gro_receive 3.21% [kernel] [k] page_pool_put_page 2.41% [kernel] [k] build_skb 1.82% [kernel] [k] skb_release_data 1.61% [kernel] [k] napi_gro_receive 1.25% [kernel] [k] page_pool_refill_alloc_cache 1.16% [kernel] [k] __netif_receive_skb_list_core We can see that page_pool_release_page(), free_unref_page() and __alloc_pages_bulk() are no longer on top of the list when receiving traffic. The test was done with mausezahn on the TX side with 64 byte raw ethernet frames. Signed-off-by: NMatteo Croce <mcroce@microsoft.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Reviewed-by: NYongxin Li <liyongxin1@huawei.com> Signed-off-by: NJunxin Chen <chenjunxin1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Matteo Croce 提交于
mainline inclusion from mainline-v5.14-rc1 commit 133637fc category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4CVS3 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=133637fcfab2 ---------------------------------------------------------------------- Use the new recycling API for page_pool. In a drop rate test, the packet rate is almost doubled, from 1110 Kpps to 2128 Kpps. perf top on a stock system shows: Overhead Shared Object Symbol 34.88% [kernel] [k] page_pool_release_page 8.06% [kernel] [k] free_unref_page 6.42% [mvpp2] [k] mvpp2_rx 6.07% [kernel] [k] eth_type_trans 5.18% [kernel] [k] __netif_receive_skb_core 4.95% [kernel] [k] build_skb 4.88% [kernel] [k] kmem_cache_free 3.97% [kernel] [k] kmem_cache_alloc 3.45% [kernel] [k] dev_gro_receive 2.73% [kernel] [k] page_frag_free 2.07% [kernel] [k] __alloc_pages_bulk 1.99% [kernel] [k] arch_local_irq_save 1.84% [kernel] [k] skb_release_data 1.20% [kernel] [k] netif_receive_skb_list_internal With packet rate stable at 1100 Kpps: tx: 0 bps 0 pps rx: 532.7 Mbps 1110 Kpps tx: 0 bps 0 pps rx: 532.6 Mbps 1110 Kpps tx: 0 bps 0 pps rx: 532.4 Mbps 1109 Kpps tx: 0 bps 0 pps rx: 532.1 Mbps 1109 Kpps tx: 0 bps 0 pps rx: 531.9 Mbps 1108 Kpps tx: 0 bps 0 pps rx: 531.9 Mbps 1108 Kpps And this is the same output with recycling enabled: Overhead Shared Object Symbol 12.91% [kernel] [k] eth_type_trans 12.54% [mvpp2] [k] mvpp2_rx 9.67% [kernel] [k] build_skb 9.63% [kernel] [k] __netif_receive_skb_core 8.44% [kernel] [k] page_pool_put_page 8.07% [kernel] [k] kmem_cache_free 7.79% [kernel] [k] kmem_cache_alloc 6.86% [kernel] [k] dev_gro_receive 3.19% [kernel] [k] skb_release_data 2.41% [kernel] [k] netif_receive_skb_list_internal 2.18% [kernel] [k] page_pool_refill_alloc_cache 1.76% [kernel] [k] napi_gro_receive 1.61% [kernel] [k] kfree_skb 1.20% [kernel] [k] dma_sync_single_for_device 1.16% [mvpp2] [k] mvpp2_poll 1.12% [mvpp2] [k] mvpp2_read With packet rate above 2100 Kpps: tx: 0 bps 0 pps rx: 1021 Mbps 2128 Kpps tx: 0 bps 0 pps rx: 1021 Mbps 2127 Kpps tx: 0 bps 0 pps rx: 1021 Mbps 2128 Kpps tx: 0 bps 0 pps rx: 1021 Mbps 2128 Kpps tx: 0 bps 0 pps rx: 1022 Mbps 2128 Kpps tx: 0 bps 0 pps rx: 1022 Mbps 2129 Kpps The major performance increase is explained by the fact that the most CPU consuming functions (page_pool_release_page, page_frag_free and free_unref_page) are no longer called on a per packet basis. The test was done by sending to the macchiatobin 64 byte ethernet frames with an invalid ethertype, so the packets are dropped early in the RX path. Signed-off-by: NMatteo Croce <mcroce@microsoft.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Reviewed-by: NYongxin Li <liyongxin1@huawei.com> Signed-off-by: NJunxin Chen <chenjunxin1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Ilias Apalodimas 提交于
mainline inclusion from mainline-v5.14-rc1 commit 6a5bcd84 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4CVS3 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6a5bcd84e886 ---------------------------------------------------------------------- Up to now several high speed NICs have custom mechanisms of recycling the allocated memory they use for their payloads. Our page_pool API already has recycling capabilities that are always used when we are running in 'XDP mode'. So let's tweak the API and the kernel network stack slightly and allow the recycling to happen even during the standard operation. The API doesn't take into account 'split page' policies used by those drivers currently, but can be extended once we have users for that. The idea is to be able to intercept the packet on skb_release_data(). If it's a buffer coming from our page_pool API recycle it back to the pool for further usage or just release the packet entirely. To achieve that we introduce a bit in struct sk_buff (pp_recycle:1) and a field in struct page (page->pp) to store the page_pool pointer. Storing the information in page->pp allows us to recycle both SKBs and their fragments. We could have skipped the skb bit entirely, since identical information can bederived from struct page. However, in an effort to affect the free path as less as possible, reading a single bit in the skb which is already in cache, is better that trying to derive identical information for the page stored data. The driver or page_pool has to take care of the sync operations on it's own during the buffer recycling since the buffer is, after opting-in to the recycling, never unmapped. Since the gain on the drivers depends on the architecture, we are not enabling recycling by default if the page_pool API is used on a driver. In order to enable recycling the driver must call skb_mark_for_recycle() to store the information we need for recycling in page->pp and enabling the recycling bit, or page_pool_store_mem_info() for a fragment. Co-developed-by: NJesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com> Co-developed-by: NMatteo Croce <mcroce@microsoft.com> Signed-off-by: NMatteo Croce <mcroce@microsoft.com> Signed-off-by: NIlias Apalodimas <ilias.apalodimas@linaro.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Reviewed-by: NYongxin Li <liyongxin1@huawei.com> Signed-off-by: NJunxin Chen <chenjunxin1@huawei.com> (fix conflicts: replace skb_get_kcov_handle by skb_csum_is_sctp) Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Matteo Croce 提交于
mainline inclusion from mainline-v5.14-rc1 commit c420c989 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4CVS3 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c420c98982fa ---------------------------------------------------------------------- This is a prerequisite patch, the next one is enabling recycling of skbs and fragments. Add an extra argument on __skb_frag_unref() to handle recycling, and update the current users of the function with that. Signed-off-by: NMatteo Croce <mcroce@microsoft.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Reviewed-by: NYongxin Li <liyongxin1@huawei.com> Signed-off-by: NJunxin Chen <chenjunxin1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Matteo Croce 提交于
mainline inclusion from mainline-v5.14-rc1 commit c07aea3e category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4CVS3 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c07aea3ef4d4 ---------------------------------------------------------------------- This is needed by the page_pool to avoid recycling a page not allocated via page_pool. The page->signature field is aliased to page->lru.next and page->compound_head, but it can't be set by mistake because the signature value is a bad pointer, and can't trigger a false positive in PageTail() because the last bit is 0. Co-developed-by: NMatthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: NMatthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: NMatteo Croce <mcroce@microsoft.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Reviewed-by: NYongxin Li <liyongxin1@huawei.com> Signed-off-by: NJunxin Chen <chenjunxin1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Alexander Lobakin 提交于
mainline inclusion from mainline-v5.12-rc1-dontuse commit 05656132 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4CVS3 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=05656132a874 ---------------------------------------------------------------------- pool_page_reusable() is a leftover from pre-NUMA-aware times. For now, this function is just a redundant wrapper over page_is_pfmemalloc(), so inline it into its sole call site. Signed-off-by: NAlexander Lobakin <alobakin@pm.me> Acked-by: NJesper Dangaard Brouer <brouer@redhat.com> Reviewed-by: NIlias Apalodimas <ilias.apalodimas@linaro.org> Reviewed-by: NJesse Brandeburg <jesse.brandeburg@intel.com> Acked-by: NDavid Rientjes <rientjes@google.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org> Reviewed-by: NYongxin Li <liyongxin1@huawei.com> Signed-off-by: NJunxin Chen <chenjunxin1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Jonathan Lemon 提交于
mainline inclusion from mainline-v5.12-rc1-dontuse commit 70c43167 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4CVS3 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=70c4316749f6 ---------------------------------------------------------------------- RX zerocopy fragment pages which are not allocated from the system page pool require special handling. Give the callback in skb_zcopy_clear() a chance to process them first. Signed-off-by: NJonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org> Reviewed-by: NYongxin Li <liyongxin1@huawei.com> Signed-off-by: NJunxin Chen <chenjunxin1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Lorenzo Bianconi 提交于
mainline inclusion from mainline-v5.11-rc1 commit 78862447 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4CVS3 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7886244736a4 ---------------------------------------------------------------------- Introduce the capability to batch page_pool ptr_ring refill since it is usually run inside the driver NAPI tx completion loop. Suggested-by: NJesper Dangaard Brouer <brouer@redhat.com> Co-developed-by: NJesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: NLorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NJohn Fastabend <john.fastabend@gmail.com> Acked-by: NIlias Apalodimas <ilias.apalodimas@linaro.org> Link: https://lore.kernel.org/bpf/08dd249c9522c001313f520796faa777c4089e1c.1605267335.git.lorenzo@kernel.orgReviewed-by: NYongxin Li <liyongxin1@huawei.com> Signed-off-by: NJunxin Chen <chenjunxin1@huawei.com> (fix conflicts: delete modifications to net/core/xdp.c) Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 SeongJae Park 提交于
mainline inclusion from mainline-5.15-rc1 commit 75e39b1a category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4GVMK CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=75e39b1a3668c008c32c7ffbebe93386e67c7352 ------------------------------------------------- This commit updates MAINTAINERS file for DAMON related files. Link: https://lkml.kernel.org/r/20210716081449.22187-14-sj38.park@gmail.comSigned-off-by: NSeongJae Park <sjpark@amazon.de> Reviewed-by: NMarkus Boehme <markubo@amazon.de> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Amit Shah <amit@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Brendan Higgins <brendanhiggins@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: David Woodhouse <dwmw@amazon.com> Cc: Fan Du <fan.du@intel.com> Cc: Fernand Sieber <sieberf@amazon.com> Cc: Greg Kroah-Hartman <greg@kroah.com> Cc: Greg Thelen <gthelen@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Joe Perches <joe@perches.com> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Leonard Foerster <foersleo@amazon.de> Cc: Marco Elver <elver@google.com> Cc: Maximilian Heyne <mheyne@amazon.de> Cc: Mel Gorman <mgorman@suse.de> Cc: Minchan Kim <minchan@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit 75e39b1a) Signed-off-by: NYue Zou <zouyue3@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 SeongJae Park 提交于
mainline inclusion from mainline-5.15-rc1 commit b348eb7a category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4GVMK CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b348eb7abd0987b849420113ced27ad7a1bc6cf3 ------------------------------------------------- This commit adds a simple user space tests for DAMON. The tests are using kselftest framework. Link: https://lkml.kernel.org/r/20210716081449.22187-13-sj38.park@gmail.comSigned-off-by: NSeongJae Park <sjpark@amazon.de> Reviewed-by: NMarkus Boehme <markubo@amazon.de> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Amit Shah <amit@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Brendan Higgins <brendanhiggins@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: David Woodhouse <dwmw@amazon.com> Cc: Fan Du <fan.du@intel.com> Cc: Fernand Sieber <sieberf@amazon.com> Cc: Greg Kroah-Hartman <greg@kroah.com> Cc: Greg Thelen <gthelen@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Joe Perches <joe@perches.com> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Leonard Foerster <foersleo@amazon.de> Cc: Marco Elver <elver@google.com> Cc: Maximilian Heyne <mheyne@amazon.de> Cc: Mel Gorman <mgorman@suse.de> Cc: Minchan Kim <minchan@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit b348eb7a) Signed-off-by: NYue Zou <zouyue3@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 SeongJae Park 提交于
mainline inclusion from mainline-5.15-rc1 commit 17ccae8b category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4GVMK CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=17ccae8bb5c928946f6f3af14626ec458f74e6ad ------------------------------------------------- This commit adds kunit based unit tests for the core and the virtual address spaces monitoring primitives of DAMON. Link: https://lkml.kernel.org/r/20210716081449.22187-12-sj38.park@gmail.comSigned-off-by: NSeongJae Park <sjpark@amazon.de> Reviewed-by: NBrendan Higgins <brendanhiggins@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Amit Shah <amit@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David Hildenbrand <david@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: David Woodhouse <dwmw@amazon.com> Cc: Fan Du <fan.du@intel.com> Cc: Fernand Sieber <sieberf@amazon.com> Cc: Greg Kroah-Hartman <greg@kroah.com> Cc: Greg Thelen <gthelen@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Joe Perches <joe@perches.com> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Leonard Foerster <foersleo@amazon.de> Cc: Marco Elver <elver@google.com> Cc: Markus Boehme <markubo@amazon.de> Cc: Maximilian Heyne <mheyne@amazon.de> Cc: Mel Gorman <mgorman@suse.de> Cc: Minchan Kim <minchan@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit 17ccae8b) Signed-off-by: NYue Zou <zouyue3@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 SeongJae Park 提交于
mainline inclusion from mainline-5.15-rc1 commit c4ba6014 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4GVMK CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c4ba6014aec39e74ad3c10229dcfd187c42ee4f3 ------------------------------------------------- This commit adds documents for DAMON under `Documentation/admin-guide/mm/damon/` and `Documentation/vm/damon/`. Link: https://lkml.kernel.org/r/20210716081449.22187-11-sj38.park@gmail.comSigned-off-by: NSeongJae Park <sjpark@amazon.de> Reviewed-by: NFernand Sieber <sieberf@amazon.com> Reviewed-by: NMarkus Boehme <markubo@amazon.de> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Amit Shah <amit@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Brendan Higgins <brendanhiggins@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: David Woodhouse <dwmw@amazon.com> Cc: Fan Du <fan.du@intel.com> Cc: Greg Kroah-Hartman <greg@kroah.com> Cc: Greg Thelen <gthelen@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Joe Perches <joe@perches.com> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Leonard Foerster <foersleo@amazon.de> Cc: Marco Elver <elver@google.com> Cc: Maximilian Heyne <mheyne@amazon.de> Cc: Mel Gorman <mgorman@suse.de> Cc: Minchan Kim <minchan@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit c4ba6014) Signed-off-by: NYue Zou <zouyue3@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 SeongJae Park 提交于
mainline inclusion from mainline-5.15-rc1 commit 75c1c2b5 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4GVMK CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=75c1c2b53c78bf3b3188ebb7b3508dadbf98bba1 ------------------------------------------------- In some use cases, users would want to run multiple monitoring context. For example, if a user wants a high precision monitoring and dedicating multiple CPUs for the job is ok, because DAMON creates one monitoring thread per one context, the user can split the monitoring target regions into multiple small regions and create one context for each region. Or, someone might want to simultaneously monitor different address spaces, e.g., both virtual address space and physical address space. The DAMON's API allows such usage, but 'damon-dbgfs' does not. Therefore, only kernel space DAMON users can do multiple contexts monitoring. This commit allows the user space DAMON users to use multiple contexts monitoring by introducing two new 'damon-dbgfs' debugfs files, 'mk_context' and 'rm_context'. Users can create a new monitoring context by writing the desired name of the new context to 'mk_context'. Then, a new directory with the name and having the files for setting of the context ('attrs', 'target_ids' and 'record') will be created under the debugfs directory. Writing the name of the context to remove to 'rm_context' will remove the related context and directory. Link: https://lkml.kernel.org/r/20210716081449.22187-10-sj38.park@gmail.comSigned-off-by: NSeongJae Park <sjpark@amazon.de> Reviewed-by: NFernand Sieber <sieberf@amazon.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Amit Shah <amit@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Brendan Higgins <brendanhiggins@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: David Woodhouse <dwmw@amazon.com> Cc: Fan Du <fan.du@intel.com> Cc: Greg Kroah-Hartman <greg@kroah.com> Cc: Greg Thelen <gthelen@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Joe Perches <joe@perches.com> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Leonard Foerster <foersleo@amazon.de> Cc: Marco Elver <elver@google.com> Cc: Markus Boehme <markubo@amazon.de> Cc: Maximilian Heyne <mheyne@amazon.de> Cc: Mel Gorman <mgorman@suse.de> Cc: Minchan Kim <minchan@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit 75c1c2b5) Signed-off-by: NYue Zou <zouyue3@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 SeongJae Park 提交于
mainline inclusion from mainline-5.15-rc1 commit 429538e8 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4GVMK CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=429538e85410c3ae12719ec42b89ab873ed6d47b ------------------------------------------------- For CPU usage accounting, knowing pid of the monitoring thread could be helpful. For example, users could use cpuaccount cgroups with the pid. This commit therefore exports the pid of currently running monitoring thread to the user space via 'kdamond_pid' file in the debugfs directory. Link: https://lkml.kernel.org/r/20210716081449.22187-9-sj38.park@gmail.comSigned-off-by: NSeongJae Park <sjpark@amazon.de> Reviewed-by: NFernand Sieber <sieberf@amazon.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Amit Shah <amit@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Brendan Higgins <brendanhiggins@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: David Woodhouse <dwmw@amazon.com> Cc: Fan Du <fan.du@intel.com> Cc: Greg Kroah-Hartman <greg@kroah.com> Cc: Greg Thelen <gthelen@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Joe Perches <joe@perches.com> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Leonard Foerster <foersleo@amazon.de> Cc: Marco Elver <elver@google.com> Cc: Markus Boehme <markubo@amazon.de> Cc: Maximilian Heyne <mheyne@amazon.de> Cc: Mel Gorman <mgorman@suse.de> Cc: Minchan Kim <minchan@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit 429538e8) Signed-off-by: NYue Zou <zouyue3@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 SeongJae Park 提交于
mainline inclusion from mainline-5.15-rc1 commit 4bc05954 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4GVMK CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4bc05954d0076655cfaf6f0135585bdc20cd6b11 ------------------------------------------------- DAMON is designed to be used by kernel space code such as the memory management subsystems, and therefore it provides only kernel space API. That said, letting the user space control DAMON could provide some benefits to them. For example, it will allow user space to analyze their specific workloads and make their own special optimizations. For such cases, this commit implements a simple DAMON application kernel module, namely 'damon-dbgfs', which merely wraps the DAMON api and exports those to the user space via the debugfs. 'damon-dbgfs' exports three files, ``attrs``, ``target_ids``, and ``monitor_on`` under its debugfs directory, ``<debugfs>/damon/``. Attributes ---------- Users can read and write the ``sampling interval``, ``aggregation interval``, ``regions update interval``, and min/max number of monitoring target regions by reading from and writing to the ``attrs`` file. For example, below commands set those values to 5 ms, 100 ms, 1,000 ms, 10, 1000 and check it again:: # cd <debugfs>/damon # echo 5000 100000 1000000 10 1000 > attrs # cat attrs 5000 100000 1000000 10 1000 Target IDs ---------- Some types of address spaces supports multiple monitoring target. For example, the virtual memory address spaces monitoring can have multiple processes as the monitoring targets. Users can set the targets by writing relevant id values of the targets to, and get the ids of the current targets by reading from the ``target_ids`` file. In case of the virtual address spaces monitoring, the values should be pids of the monitoring target processes. For example, below commands set processes having pids 42 and 4242 as the monitoring targets and check it again:: # cd <debugfs>/damon # echo 42 4242 > target_ids # cat target_ids 42 4242 Note that setting the target ids doesn't start the monitoring. Turning On/Off -------------- Setting the files as described above doesn't incur effect unless you explicitly start the monitoring. You can start, stop, and check the current status of the monitoring by writing to and reading from the ``monitor_on`` file. Writing ``on`` to the file starts the monitoring of the targets with the attributes. Writing ``off`` to the file stops those. DAMON also stops if every targets are invalidated (in case of the virtual memory monitoring, target processes are invalidated when terminated). Below example commands turn on, off, and check the status of DAMON:: # cd <debugfs>/damon # echo on > monitor_on # echo off > monitor_on # cat monitor_on off Please note that you cannot write to the above-mentioned debugfs files while the monitoring is turned on. If you write to the files while DAMON is running, an error code such as ``-EBUSY`` will be returned. [akpm@linux-foundation.org: remove unneeded "alloc failed" printks] [akpm@linux-foundation.org: replace macro with static inline] Link: https://lkml.kernel.org/r/20210716081449.22187-8-sj38.park@gmail.comSigned-off-by: NSeongJae Park <sjpark@amazon.de> Reviewed-by: NLeonard Foerster <foersleo@amazon.de> Reviewed-by: NFernand Sieber <sieberf@amazon.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Amit Shah <amit@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Brendan Higgins <brendanhiggins@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: David Woodhouse <dwmw@amazon.com> Cc: Fan Du <fan.du@intel.com> Cc: Greg Kroah-Hartman <greg@kroah.com> Cc: Greg Thelen <gthelen@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Joe Perches <joe@perches.com> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Marco Elver <elver@google.com> Cc: Markus Boehme <markubo@amazon.de> Cc: Maximilian Heyne <mheyne@amazon.de> Cc: Mel Gorman <mgorman@suse.de> Cc: Minchan Kim <minchan@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit 4bc05954) Signed-off-by: NYue Zou <zouyue3@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 SeongJae Park 提交于
mainline inclusion from mainline-5.15-rc1 commit 2fcb9362 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4GVMK CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2fcb93629ad8911c846cdc44521c746e53cc4e6d ------------------------------------------------- This commit adds a tracepoint for DAMON. It traces the monitoring results of each region for each aggregation interval. Using this, DAMON can easily integrated with tracepoints supporting tools such as perf. Link: https://lkml.kernel.org/r/20210716081449.22187-7-sj38.park@gmail.comSigned-off-by: NSeongJae Park <sjpark@amazon.de> Reviewed-by: NLeonard Foerster <foersleo@amazon.de> Reviewed-by: NSteven Rostedt (VMware) <rostedt@goodmis.org> Reviewed-by: NFernand Sieber <sieberf@amazon.com> Acked-by: NShakeel Butt <shakeelb@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Amit Shah <amit@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Brendan Higgins <brendanhiggins@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: David Woodhouse <dwmw@amazon.com> Cc: Fan Du <fan.du@intel.com> Cc: Greg Kroah-Hartman <greg@kroah.com> Cc: Greg Thelen <gthelen@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Joe Perches <joe@perches.com> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Marco Elver <elver@google.com> Cc: Markus Boehme <markubo@amazon.de> Cc: Maximilian Heyne <mheyne@amazon.de> Cc: Mel Gorman <mgorman@suse.de> Cc: Minchan Kim <minchan@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit 2fcb9362) Signed-off-by: NYue Zou <zouyue3@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 SeongJae Park 提交于
mainline inclusion from mainline-5.15-rc1 commit 3f49584b category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4GVMK CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3f49584b262cf8f42b25f4c1ad9f5bfd3bdc1bca ------------------------------------------------- This commit introduces a reference implementation of the address space specific low level primitives for the virtual address space, so that users of DAMON can easily monitor the data accesses on virtual address spaces of specific processes by simply configuring the implementation to be used by DAMON. The low level primitives for the fundamental access monitoring are defined in two parts: 1. Identification of the monitoring target address range for the address space. 2. Access check of specific address range in the target space. The reference implementation for the virtual address space does the works as below. PTE Accessed-bit Based Access Check ----------------------------------- The implementation uses PTE Accessed-bit for basic access checks. That is, it clears the bit for the next sampling target page and checks whether it is set again after one sampling period. This could disturb the reclaim logic. DAMON uses ``PG_idle`` and ``PG_young`` page flags to solve the conflict, as Idle page tracking does. VMA-based Target Address Range Construction ------------------------------------------- Only small parts in the super-huge virtual address space of the processes are mapped to physical memory and accessed. Thus, tracking the unmapped address regions is just wasteful. However, because DAMON can deal with some level of noise using the adaptive regions adjustment mechanism, tracking every mapping is not strictly required but could even incur a high overhead in some cases. That said, too huge unmapped areas inside the monitoring target should be removed to not take the time for the adaptive mechanism. For the reason, this implementation converts the complex mappings to three distinct regions that cover every mapped area of the address space. Also, the two gaps between the three regions are the two biggest unmapped areas in the given address space. The two biggest unmapped areas would be the gap between the heap and the uppermost mmap()-ed region, and the gap between the lowermost mmap()-ed region and the stack in most of the cases. Because these gaps are exceptionally huge in usual address spaces, excluding these will be sufficient to make a reasonable trade-off. Below shows this in detail:: <heap> <BIG UNMAPPED REGION 1> <uppermost mmap()-ed region> (small mmap()-ed regions and munmap()-ed regions) <lowermost mmap()-ed region> <BIG UNMAPPED REGION 2> <stack> [akpm@linux-foundation.org: mm/damon/vaddr.c needs highmem.h for kunmap_atomic()] [sjpark@amazon.de: remove unnecessary PAGE_EXTENSION setup] Link: https://lkml.kernel.org/r/20210806095153.6444-2-sj38.park@gmail.com [sjpark@amazon.de: safely walk page table] Link: https://lkml.kernel.org/r/20210831161800.29419-1-sj38.park@gmail.com Link: https://lkml.kernel.org/r/20210716081449.22187-6-sj38.park@gmail.comSigned-off-by: NSeongJae Park <sjpark@amazon.de> Reviewed-by: NLeonard Foerster <foersleo@amazon.de> Reviewed-by: NFernand Sieber <sieberf@amazon.com> Acked-by: NShakeel Butt <shakeelb@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Amit Shah <amit@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Brendan Higgins <brendanhiggins@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: David Woodhouse <dwmw@amazon.com> Cc: Fan Du <fan.du@intel.com> Cc: Greg Kroah-Hartman <greg@kroah.com> Cc: Greg Thelen <gthelen@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Joe Perches <joe@perches.com> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Marco Elver <elver@google.com> Cc: Markus Boehme <markubo@amazon.de> Cc: Maximilian Heyne <mheyne@amazon.de> Cc: Mel Gorman <mgorman@suse.de> Cc: Minchan Kim <minchan@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit 3f49584b) Signed-off-by: NYue Zou <zouyue3@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 SeongJae Park 提交于
mainline inclusion from mainline-5.15-rc1 commit 1c676e0d category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4GVMK CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1c676e0d9b1a59b98885b24a0e16a81fe4cc8301 ------------------------------------------------- PG_idle and PG_young allow the two PTE Accessed bit users, Idle Page Tracking and the reclaim logic concurrently work while not interfering with each other. That is, when they need to clear the Accessed bit, they set PG_young to represent the previous state of the bit, respectively. And when they need to read the bit, if the bit is cleared, they further read the PG_young to know whether the other has cleared the bit meanwhile or not. For yet another user of the PTE Accessed bit, we could add another page flag, or extend the mechanism to use the flags. For the DAMON usecase, however, we don't need to do that just yet. IDLE_PAGE_TRACKING and DAMON are mutually exclusive, so there's only ever going to be one user of the current set of flags. In this commit, we split out the CONFIG options to allow for the use of PG_young and PG_idle outside of idle page tracking. In the next commit, DAMON's reference implementation of the virtual memory address space monitoring primitives will use it. [sjpark@amazon.de: set PAGE_EXTENSION for non-64BIT] Link: https://lkml.kernel.org/r/20210806095153.6444-1-sj38.park@gmail.com [akpm@linux-foundation.org: tweak Kconfig text] [sjpark@amazon.de: hide PAGE_IDLE_FLAG from users] Link: https://lkml.kernel.org/r/20210813081238.34705-1-sj38.park@gmail.com Link: https://lkml.kernel.org/r/20210716081449.22187-5-sj38.park@gmail.comSigned-off-by: NSeongJae Park <sjpark@amazon.de> Reviewed-by: NShakeel Butt <shakeelb@google.com> Reviewed-by: NFernand Sieber <sieberf@amazon.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Amit Shah <amit@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Brendan Higgins <brendanhiggins@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: David Woodhouse <dwmw@amazon.com> Cc: Fan Du <fan.du@intel.com> Cc: Greg Kroah-Hartman <greg@kroah.com> Cc: Greg Thelen <gthelen@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Joe Perches <joe@perches.com> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Leonard Foerster <foersleo@amazon.de> Cc: Marco Elver <elver@google.com> Cc: Markus Boehme <markubo@amazon.de> Cc: Maximilian Heyne <mheyne@amazon.de> Cc: Mel Gorman <mgorman@suse.de> Cc: Minchan Kim <minchan@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit 1c676e0d) Signed-off-by: NYue Zou <zouyue3@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-