- 02 8月, 2016 11 次提交
-
-
由 xypron.glpk@gmx.de 提交于
Do not write random bytes from the kernel stack when calling qed_wr. Signed-off-by: NHeinrich Schuchardt <xypron.glpk@gmx.de> Acked-by: NYuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 xypron.glpk@gmx.de 提交于
All assignments to components of priv should only occur after the check if prif is NULL. Signed-off-by: NHeinrich Schuchardt <xypron.glpk@gmx.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 xypron.glpk@gmx.de 提交于
Variable length is not used after the deleted line. Signed-off-by: NHeinrich Schuchardt <xypron.glpk@gmx.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 xypron.glpk@gmx.de 提交于
(!A || (A && B)) is equivalent to (!A || B) Signed-off-by: NHeinrich Schuchardt <xypron.glpk@gmx.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 xypron.glpk@gmx.de 提交于
i is defined as unsigned. So print it with %u. Signed-off-by: NHeinrich Schuchardt <xypron.glpk@gmx.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 xypron.glpk@gmx.de 提交于
add and val are read with sscanf(kern_buf, "%x:%x", &addr, &val); and used as arguments for bna_reg_offset_check and writel so they have to be unsigned. Signed-off-by: NHeinrich Schuchardt <xypron.glpk@gmx.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 xypron.glpk@gmx.de 提交于
addr and len are read with sscanf(kern_buf, "%x:%x", &addr, &len); and used as arguments for bna_reg_offset_check. So they have to be unsigned. Signed-off-by: NHeinrich Schuchardt <xypron.glpk@gmx.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 xypron.glpk@gmx.de 提交于
If dev_get_platdata has failed pd is null. Do not dereference a null pointer. Signed-off-by: NHeinrich Schuchardt <xypron.glpk@gmx.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 xypron.glpk@gmx.de 提交于
i has been defined as unsigned int. So use %u for output. Signed-off-by: NHeinrich Schuchardt <xypron.glpk@gmx.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 xypron.glpk@gmx.de 提交于
If platform_get_resource fails, mem2 is null. Do not dereference null. Signed-off-by: NHeinrich Schuchardt <xypron.glpk@gmx.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 xypron.glpk@gmx.de 提交于
%u is the wrong format specifier for int. size_t cannot be converted to int without possible loss of information. So leave the result as size_t and use %zu as format specifier. cf. Documentation/printk-formats.txt Signed-off-by: NHeinrich Schuchardt <xypron.glpk@gmx.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 01 8月, 2016 4 次提交
-
-
由 Alexander Stein 提交于
There are KSZ8721 PHYs with phy_id 0x00221619. In order to detect them as PHY_ID_KSZ8001 compatible while staying different to PHY_ID_KSZ9021 ignore the last two bits when matching PHY_ID Signed-off-by: NAlexander Stein <alexanders83@web.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Chun-Hao Lin 提交于
When there is no AC power, NIC may not work after changing mac address. Please refer to following link. http://www.spinics.net/lists/netdev/msg356572.html This issue is caused by runtime power management. When there is no AC power, if we put NIC down (ifconfig down), the driver will be in runtime suspend state and hardware will be put into D3 state. During this time, driver cannot access hardware regisers. So if you set new mac address during this time, it will not be set to hardware. After resume, NIC will keep using the old mac address and the network will not work normally. In this patch I add detecting runtime pm status when setting mac address. If driver is in runtime suspend state, it will skip setting mac address, keep the new mac address, and set the new mac address during runtime resume. Signed-off-by: NChunhao Lin <hau@realtek.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Chun-Hao Lin 提交于
Not to call rtl8169_update_counters() to dump tally counter when driver is in runtime suspend state. Calling rtl8169_update_counters() in runtime suspend state will produce warning message "rtl_counters_cond == 1 (loop: 1000, delay: 10)". Signed-off-by: NChunhao Lin <hau@realtek.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Chun-Hao Lin 提交于
NIC will be put into D3 state during runtime suspend state. When set or get hardware wol setting, driver will write or read hardware registers. If we set or get hardware wol setting in runtime suspend state, because NIC will in D3 state, the hardware registers read by driver will return all 0xff. That will let driver thinking register flag is not toggled and then prints the warning message "rtl_counters_cond == 1 (loop: 1000, delay: 10)" to kernel log. For fixing this issue, add checking driver's pm runtime status in rtl8169_get_wol() and rtl8169_set_wol(). Signed-off-by: NChunhao Lin <hau@realtek.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 31 7月, 2016 16 次提交
-
-
由 Florian Fainelli 提交于
In case we cannot complete bcm_sf2_sw_setup() for any reason, and we go to the out_unmap label, but the MDIO bus has not been registered yet, we will hit the BUG condition in drivers/net/phy/mdio_bus.c about the bus not being registered. Fix this by dedicating a specific lable for when we fail after the MDIO bus has been successfully registered. Fixes: 461cd1b0 ("net: dsa: bcm_sf2: Register our slave MDIO bus") Signed-off-by: NFlorian Fainelli <f.fainelli@gmail.com> Reviewed-by: NVivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Colin Ian King 提交于
trivial fix to spelling mistake in printk message Signed-off-by: NColin Ian King <colin.king@canonical.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Sabrina Dubroca 提交于
When creation of a macsec device fails because an identical device already exists on this link, the current code decrements the refcnt on the parent link (in ->destructor for the macsec device), but it had not been incremented yet. Move the dev_hold(parent_link) call earlier during macsec device creation. Fixes: c09440f7 ("macsec: introduce IEEE 802.1AE driver") Signed-off-by: NSabrina Dubroca <sd@queasysnail.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Sabrina Dubroca 提交于
Following the previous patch, RXSCs are held and properly refcounted in the RX path (instead of being implicitly held by their SA), so the SA doesn't need to hold a reference on its parent RXSC. This also avoids panics on module unload caused by the double layer of RCU callbacks (call_rcu frees the RXSA, which puts the final reference on the RXSC and allows to free it in its own call_rcu) that commit b196c22a ("macsec: add rcu_barrier() on module exit") didn't protect against. There were also some refcounting bugs in macsec_add_rxsa where I didn't put the reference on the RXSC on the error paths, which would lead to memory leaks. Fixes: c09440f7 ("macsec: introduce IEEE 802.1AE driver") Signed-off-by: NSabrina Dubroca <sd@queasysnail.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Sabrina Dubroca 提交于
Currently, we lookup the RXSC without taking a reference on it. The RXSA holds a reference on the RXSC, but the SA and SC could still both disappear before we take a reference on the SA. Take a reference on the RXSC in macsec_handle_frame. Fixes: c09440f7 ("macsec: introduce IEEE 802.1AE driver") Signed-off-by: NSabrina Dubroca <sd@queasysnail.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Grygorii Strashko 提交于
Use of_platform_depopulate() in cpsw_remove() instead of of_device_unregister(), because CSPW child devices will not be recreated otherwise on next insmod. of_platform_depopulate() is correct way now as it will ensure that all steps done in of_platform_populate() are reverted, including cleaning up of OF_POPULATED flag. Signed-off-by: NGrygorii Strashko <grygorii.strashko@ti.com> Reviewed-by: NMugunthan V N <mugunthanvnm@ti.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Grygorii Strashko 提交于
The L3 error will be generated and system will crash during unloading of CPSW driver if CPSW is used as module and ethX devices are down. This happens because CPSW can be power off by PM runtime now when ethX devices are down. Hence, ensure that CPSW powered up by PM runtime before performing any deinitialization actions which require CPSW registers access. In case of PM runtime error just leave cpsw_remove() as we can't do anything anymore. Signed-off-by: NGrygorii Strashko <grygorii.strashko@ti.com> Reviewed-by: NMugunthan V N <mugunthanvnm@ti.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Grygorii Strashko 提交于
Fix deadlock in cpdma_ctlr_destroy() which is triggered now on cpsw module removal: cpsw_remove() - cpdma_ctlr_destroy() - spin_lock_irqsave(&ctlr->lock, flags) - cpdma_ctlr_stop() - spin_lock_irqsave(&ctlr->lock, flags); - cpdma_chan_destroy() - spin_lock_irqsave(&ctlr->lock, flags); The issue has not been observed before because CPDMA channels have been destroyed manually by CPSW until commit d941ebe8 ("net: ethernet: ti: cpsw: use destroy ctlr to destroy channels") was merged. Signed-off-by: NGrygorii Strashko <grygorii.strashko@ti.com> Reviewed-by: NMugunthan V N <mugunthanvnm@ti.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Hariprasad Shenai 提交于
The commit 637d3e99 ("cxgb4: Discard the packet if the length is greater than mtu") introduced a regression in the VLAN interface performance when Tx VLAN offload is disabled. Check if skb is tagged, regardless of whether it is hardware accelerated or not. Presently we were checking only for hardware acclereated one, which caused performance to drop to ~0.17Mbps on a 10GbE adapter for VLAN interface, when tx vlan offload is turned off using ethtool. The ethernet head length calculation was going wrong in this case, and driver ended up dropping packets. Fixes: 637d3e99 ("cxgb4: Discard the packet if the length is greater than mtu") Signed-off-by: NHariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Wei Yongjun 提交于
There is a error message within devm_ioremap_resource already, so remove the dev_err call to avoid redundant error message. Signed-off-by: NWei Yongjun <weiyj.lk@gmail.com> Acked-By: NIyappan Subramanian <isubramanian@apm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Yuval Mintz 提交于
Each PF/VF has a limited number of vlan filters for configuration purposes; This information is passed to qede and is used to prevent over-usage - once a vlan is to be configured and no filter credit is available, the driver would switch into working in vlan-promisc mode. Problem is the credit pool is shared by both PFs and VFs, and currently PFs aren't deducting the filters that are reserved for their VFs from their quota, which may lead to some vlan filters failing unknowingly due to lack of credit. Signed-off-by: NYuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Yuval Mintz 提交于
Driver uses reverse logic when checking if minimum bandwidth configuration applied, causing it to configure the guarantee only on the first hw-function. Fixes: a0d26d5a ("qed*: Don't reset statistics on inner reload") Signed-off-by: NYuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Yuval Mintz 提交于
Adding the necessary logic to prevet statistics reset on inner-reload introduced a bug, and now statistics are reset only when re-probing the driver. Fixes: a0d26d5a ("qed*: Don't reset statistics on inner reload") Signed-off-by: NYuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Yuval Mintz 提交于
Before requesting the firmware to start Rx queues, driver goes and sets the queue producer in the device to 0. But while the producer is 32-bit, the driver currently clears 64 bits, effectively zeroing an additional CID's producer as well. Signed-off-by: NYuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Yuval Mintz 提交于
Driver has reverse logic for checking the result of the spoof-checking configuration. As a result, it would log that the configuration failed [even though it succeeded], and will no longer do anything when requested to remove the configuration, as it's accounting of the feature will be incorrect. Fixes: 6ddc7608 ("qed*: IOV support spoof-checking") Signed-off-by: NYuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Yuval Mintz 提交于
As part of ndo_vlan_rx_kill_vid() implementation, qede is requesting firmware to remove the vlan filter. This currently happens even if the vlan wasn't previously added [In case device ran out of vlan credits]. For PFs this doesn't cause any issues as the firmware would simply ignore the removal request. But for VFs their parent PF is holding an accounting of the configured vlans, and such a request would cause the PF to fail the VF's removal request. Simply fix this for both PFs & VFs and don't remove filters that were not previously added. Fixes: 7c1bfcad ("qede: Add vlan filtering offload support") Signed-off-by: NYuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 29 7月, 2016 7 次提交
-
-
由 Andy Lutomirski 提交于
Currently, NR_KERNEL_STACK tracks the number of kernel stacks in a zone. This only makes sense if each kernel stack exists entirely in one zone, and allowing vmapped stacks could break this assumption. Since frv has THREAD_SIZE < PAGE_SIZE, we need to track kernel stack allocations in a unit that divides both THREAD_SIZE and PAGE_SIZE on all architectures. Keep it simple and use KiB. Link: http://lkml.kernel.org/r/083c71e642c5fa5f1b6898902e1b2db7b48940d4.1468523549.git.luto@kernel.orgSigned-off-by: NAndy Lutomirski <luto@kernel.org> Cc: Vladimir Davydov <vdavydov@virtuozzo.com> Acked-by: NJohannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Reviewed-by: NJosh Poimboeuf <jpoimboe@redhat.com> Reviewed-by: NVladimir Davydov <vdavydov@virtuozzo.com> Acked-by: NMichal Hocko <mhocko@suse.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Mel Gorman 提交于
There are now a number of accounting oddities such as mapped file pages being accounted for on the node while the total number of file pages are accounted on the zone. This can be coped with to some extent but it's confusing so this patch moves the relevant file-based accounted. Due to throttling logic in the page allocator for reliable OOM detection, it is still necessary to track dirty and writeback pages on a per-zone basis. [mgorman@techsingularity.net: fix NR_ZONE_WRITE_PENDING accounting] Link: http://lkml.kernel.org/r/1468404004-5085-5-git-send-email-mgorman@techsingularity.net Link: http://lkml.kernel.org/r/1467970510-21195-20-git-send-email-mgorman@techsingularity.netSigned-off-by: NMel Gorman <mgorman@techsingularity.net> Acked-by: NVlastimil Babka <vbabka@suse.cz> Acked-by: NMichal Hocko <mhocko@suse.com> Cc: Hillf Danton <hillf.zj@alibaba-inc.com> Acked-by: NJohannes Weiner <hannes@cmpxchg.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Rik van Riel <riel@surriel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Mel Gorman 提交于
NR_FILE_PAGES is the number of file pages. NR_FILE_MAPPED is the number of mapped file pages. NR_ANON_PAGES is the number of mapped anon pages. This is unhelpful naming as it's easy to confuse NR_FILE_MAPPED and NR_ANON_PAGES for mapped pages. This patch renames NR_ANON_PAGES so we have NR_FILE_PAGES is the number of file pages. NR_FILE_MAPPED is the number of mapped file pages. NR_ANON_MAPPED is the number of mapped anon pages. Link: http://lkml.kernel.org/r/1467970510-21195-19-git-send-email-mgorman@techsingularity.netSigned-off-by: NMel Gorman <mgorman@techsingularity.net> Acked-by: NVlastimil Babka <vbabka@suse.cz> Cc: Hillf Danton <hillf.zj@alibaba-inc.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Rik van Riel <riel@surriel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Mel Gorman 提交于
Reclaim makes decisions based on the number of pages that are mapped but it's mixing node and zone information. Account NR_FILE_MAPPED and NR_ANON_PAGES pages on the node. Link: http://lkml.kernel.org/r/1467970510-21195-18-git-send-email-mgorman@techsingularity.netSigned-off-by: NMel Gorman <mgorman@techsingularity.net> Acked-by: NVlastimil Babka <vbabka@suse.cz> Acked-by: NMichal Hocko <mhocko@suse.com> Cc: Hillf Danton <hillf.zj@alibaba-inc.com> Acked-by: NJohannes Weiner <hannes@cmpxchg.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Rik van Riel <riel@surriel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Mel Gorman 提交于
This moves the LRU lists from the zone to the node and related data such as counters, tracing, congestion tracking and writeback tracking. Unfortunately, due to reclaim and compaction retry logic, it is necessary to account for the number of LRU pages on both zone and node logic. Most reclaim logic is based on the node counters but the retry logic uses the zone counters which do not distinguish inactive and active sizes. It would be possible to leave the LRU counters on a per-zone basis but it's a heavier calculation across multiple cache lines that is much more frequent than the retry checks. Other than the LRU counters, this is mostly a mechanical patch but note that it introduces a number of anomalies. For example, the scans are per-zone but using per-node counters. We also mark a node as congested when a zone is congested. This causes weird problems that are fixed later but is easier to review. In the event that there is excessive overhead on 32-bit systems due to the nodes being on LRU then there are two potential solutions 1. Long-term isolation of highmem pages when reclaim is lowmem When pages are skipped, they are immediately added back onto the LRU list. If lowmem reclaim persisted for long periods of time, the same highmem pages get continually scanned. The idea would be that lowmem keeps those pages on a separate list until a reclaim for highmem pages arrives that splices the highmem pages back onto the LRU. It potentially could be implemented similar to the UNEVICTABLE list. That would reduce the skip rate with the potential corner case is that highmem pages have to be scanned and reclaimed to free lowmem slab pages. 2. Linear scan lowmem pages if the initial LRU shrink fails This will break LRU ordering but may be preferable and faster during memory pressure than skipping LRU pages. Link: http://lkml.kernel.org/r/1467970510-21195-4-git-send-email-mgorman@techsingularity.netSigned-off-by: NMel Gorman <mgorman@techsingularity.net> Acked-by: NJohannes Weiner <hannes@cmpxchg.org> Acked-by: NVlastimil Babka <vbabka@suse.cz> Cc: Hillf Danton <hillf.zj@alibaba-inc.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Rik van Riel <riel@surriel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Mel Gorman 提交于
Patchset: "Move LRU page reclaim from zones to nodes v9" This series moves LRUs from the zones to the node. While this is a current rebase, the test results were based on mmotm as of June 23rd. Conceptually, this series is simple but there are a lot of details. Some of the broad motivations for this are; 1. The residency of a page partially depends on what zone the page was allocated from. This is partially combatted by the fair zone allocation policy but that is a partial solution that introduces overhead in the page allocator paths. 2. Currently, reclaim on node 0 behaves slightly different to node 1. For example, direct reclaim scans in zonelist order and reclaims even if the zone is over the high watermark regardless of the age of pages in that LRU. Kswapd on the other hand starts reclaim on the highest unbalanced zone. A difference in distribution of file/anon pages due to when they were allocated results can result in a difference in again. While the fair zone allocation policy mitigates some of the problems here, the page reclaim results on a multi-zone node will always be different to a single-zone node. it was scheduled on as a result. 3. kswapd and the page allocator scan zones in the opposite order to avoid interfering with each other but it's sensitive to timing. This mitigates the page allocator using pages that were allocated very recently in the ideal case but it's sensitive to timing. When kswapd is allocating from lower zones then it's great but during the rebalancing of the highest zone, the page allocator and kswapd interfere with each other. It's worse if the highest zone is small and difficult to balance. 4. slab shrinkers are node-based which makes it harder to identify the exact relationship between slab reclaim and LRU reclaim. The reason we have zone-based reclaim is that we used to have large highmem zones in common configurations and it was necessary to quickly find ZONE_NORMAL pages for reclaim. Today, this is much less of a concern as machines with lots of memory will (or should) use 64-bit kernels. Combinations of 32-bit hardware and 64-bit hardware are rare. Machines that do use highmem should have relatively low highmem:lowmem ratios than we worried about in the past. Conceptually, moving to node LRUs should be easier to understand. The page allocator plays fewer tricks to game reclaim and reclaim behaves similarly on all nodes. The series has been tested on a 16 core UMA machine and a 2-socket 48 core NUMA machine. The UMA results are presented in most cases as the NUMA machine behaved similarly. pagealloc --------- This is a microbenchmark that shows the benefit of removing the fair zone allocation policy. It was tested uip to order-4 but only orders 0 and 1 are shown as the other orders were comparable. 4.7.0-rc4 4.7.0-rc4 mmotm-20160623 nodelru-v9 Min total-odr0-1 490.00 ( 0.00%) 457.00 ( 6.73%) Min total-odr0-2 347.00 ( 0.00%) 329.00 ( 5.19%) Min total-odr0-4 288.00 ( 0.00%) 273.00 ( 5.21%) Min total-odr0-8 251.00 ( 0.00%) 239.00 ( 4.78%) Min total-odr0-16 234.00 ( 0.00%) 222.00 ( 5.13%) Min total-odr0-32 223.00 ( 0.00%) 211.00 ( 5.38%) Min total-odr0-64 217.00 ( 0.00%) 208.00 ( 4.15%) Min total-odr0-128 214.00 ( 0.00%) 204.00 ( 4.67%) Min total-odr0-256 250.00 ( 0.00%) 230.00 ( 8.00%) Min total-odr0-512 271.00 ( 0.00%) 269.00 ( 0.74%) Min total-odr0-1024 291.00 ( 0.00%) 282.00 ( 3.09%) Min total-odr0-2048 303.00 ( 0.00%) 296.00 ( 2.31%) Min total-odr0-4096 311.00 ( 0.00%) 309.00 ( 0.64%) Min total-odr0-8192 316.00 ( 0.00%) 314.00 ( 0.63%) Min total-odr0-16384 317.00 ( 0.00%) 315.00 ( 0.63%) Min total-odr1-1 742.00 ( 0.00%) 712.00 ( 4.04%) Min total-odr1-2 562.00 ( 0.00%) 530.00 ( 5.69%) Min total-odr1-4 457.00 ( 0.00%) 433.00 ( 5.25%) Min total-odr1-8 411.00 ( 0.00%) 381.00 ( 7.30%) Min total-odr1-16 381.00 ( 0.00%) 356.00 ( 6.56%) Min total-odr1-32 372.00 ( 0.00%) 346.00 ( 6.99%) Min total-odr1-64 372.00 ( 0.00%) 343.00 ( 7.80%) Min total-odr1-128 375.00 ( 0.00%) 351.00 ( 6.40%) Min total-odr1-256 379.00 ( 0.00%) 351.00 ( 7.39%) Min total-odr1-512 385.00 ( 0.00%) 355.00 ( 7.79%) Min total-odr1-1024 386.00 ( 0.00%) 358.00 ( 7.25%) Min total-odr1-2048 390.00 ( 0.00%) 362.00 ( 7.18%) Min total-odr1-4096 390.00 ( 0.00%) 362.00 ( 7.18%) Min total-odr1-8192 388.00 ( 0.00%) 363.00 ( 6.44%) This shows a steady improvement throughout. The primary benefit is from reduced system CPU usage which is obvious from the overall times; 4.7.0-rc4 4.7.0-rc4 mmotm-20160623nodelru-v8 User 189.19 191.80 System 2604.45 2533.56 Elapsed 2855.30 2786.39 The vmstats also showed that the fair zone allocation policy was definitely removed as can be seen here; 4.7.0-rc3 4.7.0-rc3 mmotm-20160623 nodelru-v8 DMA32 allocs 28794729769 0 Normal allocs 48432501431 77227309877 Movable allocs 0 0 tiobench on ext4 ---------------- tiobench is a benchmark that artifically benefits if old pages remain resident while new pages get reclaimed. The fair zone allocation policy mitigates this problem so pages age fairly. While the benchmark has problems, it is important that tiobench performance remains constant as it implies that page aging problems that the fair zone allocation policy fixes are not re-introduced. 4.7.0-rc4 4.7.0-rc4 mmotm-20160623 nodelru-v9 Min PotentialReadSpeed 89.65 ( 0.00%) 90.21 ( 0.62%) Min SeqRead-MB/sec-1 82.68 ( 0.00%) 82.01 ( -0.81%) Min SeqRead-MB/sec-2 72.76 ( 0.00%) 72.07 ( -0.95%) Min SeqRead-MB/sec-4 75.13 ( 0.00%) 74.92 ( -0.28%) Min SeqRead-MB/sec-8 64.91 ( 0.00%) 65.19 ( 0.43%) Min SeqRead-MB/sec-16 62.24 ( 0.00%) 62.22 ( -0.03%) Min RandRead-MB/sec-1 0.88 ( 0.00%) 0.88 ( 0.00%) Min RandRead-MB/sec-2 0.95 ( 0.00%) 0.92 ( -3.16%) Min RandRead-MB/sec-4 1.43 ( 0.00%) 1.34 ( -6.29%) Min RandRead-MB/sec-8 1.61 ( 0.00%) 1.60 ( -0.62%) Min RandRead-MB/sec-16 1.80 ( 0.00%) 1.90 ( 5.56%) Min SeqWrite-MB/sec-1 76.41 ( 0.00%) 76.85 ( 0.58%) Min SeqWrite-MB/sec-2 74.11 ( 0.00%) 73.54 ( -0.77%) Min SeqWrite-MB/sec-4 80.05 ( 0.00%) 80.13 ( 0.10%) Min SeqWrite-MB/sec-8 72.88 ( 0.00%) 73.20 ( 0.44%) Min SeqWrite-MB/sec-16 75.91 ( 0.00%) 76.44 ( 0.70%) Min RandWrite-MB/sec-1 1.18 ( 0.00%) 1.14 ( -3.39%) Min RandWrite-MB/sec-2 1.02 ( 0.00%) 1.03 ( 0.98%) Min RandWrite-MB/sec-4 1.05 ( 0.00%) 0.98 ( -6.67%) Min RandWrite-MB/sec-8 0.89 ( 0.00%) 0.92 ( 3.37%) Min RandWrite-MB/sec-16 0.92 ( 0.00%) 0.93 ( 1.09%) 4.7.0-rc4 4.7.0-rc4 mmotm-20160623 approx-v9 User 645.72 525.90 System 403.85 331.75 Elapsed 6795.36 6783.67 This shows that the series has little or not impact on tiobench which is desirable and a reduction in system CPU usage. It indicates that the fair zone allocation policy was removed in a manner that didn't reintroduce one class of page aging bug. There were only minor differences in overall reclaim activity 4.7.0-rc4 4.7.0-rc4 mmotm-20160623nodelru-v8 Minor Faults 645838 647465 Major Faults 573 640 Swap Ins 0 0 Swap Outs 0 0 DMA allocs 0 0 DMA32 allocs 46041453 44190646 Normal allocs 78053072 79887245 Movable allocs 0 0 Allocation stalls 24 67 Stall zone DMA 0 0 Stall zone DMA32 0 0 Stall zone Normal 0 2 Stall zone HighMem 0 0 Stall zone Movable 0 65 Direct pages scanned 10969 30609 Kswapd pages scanned 93375144 93492094 Kswapd pages reclaimed 93372243 93489370 Direct pages reclaimed 10969 30609 Kswapd efficiency 99% 99% Kswapd velocity 13741.015 13781.934 Direct efficiency 100% 100% Direct velocity 1.614 4.512 Percentage direct scans 0% 0% kswapd activity was roughly comparable. There were differences in direct reclaim activity but negligible in the context of the overall workload (velocity of 4 pages per second with the patches applied, 1.6 pages per second in the baseline kernel). pgbench read-only large configuration on ext4 --------------------------------------------- pgbench is a database benchmark that can be sensitive to page reclaim decisions. This also checks if removing the fair zone allocation policy is safe pgbench Transactions 4.7.0-rc4 4.7.0-rc4 mmotm-20160623 nodelru-v8 Hmean 1 188.26 ( 0.00%) 189.78 ( 0.81%) Hmean 5 330.66 ( 0.00%) 328.69 ( -0.59%) Hmean 12 370.32 ( 0.00%) 380.72 ( 2.81%) Hmean 21 368.89 ( 0.00%) 369.00 ( 0.03%) Hmean 30 382.14 ( 0.00%) 360.89 ( -5.56%) Hmean 32 428.87 ( 0.00%) 432.96 ( 0.95%) Negligible differences again. As with tiobench, overall reclaim activity was comparable. bonnie++ on ext4 ---------------- No interesting performance difference, negligible differences on reclaim stats. paralleldd on ext4 ------------------ This workload uses varying numbers of dd instances to read large amounts of data from disk. 4.7.0-rc3 4.7.0-rc3 mmotm-20160623 nodelru-v9 Amean Elapsd-1 186.04 ( 0.00%) 189.41 ( -1.82%) Amean Elapsd-3 192.27 ( 0.00%) 191.38 ( 0.46%) Amean Elapsd-5 185.21 ( 0.00%) 182.75 ( 1.33%) Amean Elapsd-7 183.71 ( 0.00%) 182.11 ( 0.87%) Amean Elapsd-12 180.96 ( 0.00%) 181.58 ( -0.35%) Amean Elapsd-16 181.36 ( 0.00%) 183.72 ( -1.30%) 4.7.0-rc4 4.7.0-rc4 mmotm-20160623 nodelru-v9 User 1548.01 1552.44 System 8609.71 8515.08 Elapsed 3587.10 3594.54 There is little or no change in performance but some drop in system CPU usage. 4.7.0-rc3 4.7.0-rc3 mmotm-20160623 nodelru-v9 Minor Faults 362662 367360 Major Faults 1204 1143 Swap Ins 22 0 Swap Outs 2855 1029 DMA allocs 0 0 DMA32 allocs 31409797 28837521 Normal allocs 46611853 49231282 Movable allocs 0 0 Direct pages scanned 0 0 Kswapd pages scanned 40845270 40869088 Kswapd pages reclaimed 40830976 40855294 Direct pages reclaimed 0 0 Kswapd efficiency 99% 99% Kswapd velocity 11386.711 11369.769 Direct efficiency 100% 100% Direct velocity 0.000 0.000 Percentage direct scans 0% 0% Page writes by reclaim 2855 1029 Page writes file 0 0 Page writes anon 2855 1029 Page reclaim immediate 771 1628 Sector Reads 293312636 293536360 Sector Writes 18213568 18186480 Page rescued immediate 0 0 Slabs scanned 128257 132747 Direct inode steals 181 56 Kswapd inode steals 59 1131 It basically shows that kswapd was active at roughly the same rate in both kernels. There was also comparable slab scanning activity and direct reclaim was avoided in both cases. There appears to be a large difference in numbers of inodes reclaimed but the workload has few active inodes and is likely a timing artifact. stutter ------- stutter simulates a simple workload. One part uses a lot of anonymous memory, a second measures mmap latency and a third copies a large file. The primary metric is checking for mmap latency. stutter 4.7.0-rc4 4.7.0-rc4 mmotm-20160623 nodelru-v8 Min mmap 16.6283 ( 0.00%) 13.4258 ( 19.26%) 1st-qrtle mmap 54.7570 ( 0.00%) 34.9121 ( 36.24%) 2nd-qrtle mmap 57.3163 ( 0.00%) 46.1147 ( 19.54%) 3rd-qrtle mmap 58.9976 ( 0.00%) 47.1882 ( 20.02%) Max-90% mmap 59.7433 ( 0.00%) 47.4453 ( 20.58%) Max-93% mmap 60.1298 ( 0.00%) 47.6037 ( 20.83%) Max-95% mmap 73.4112 ( 0.00%) 82.8719 (-12.89%) Max-99% mmap 92.8542 ( 0.00%) 88.8870 ( 4.27%) Max mmap 1440.6569 ( 0.00%) 121.4201 ( 91.57%) Mean mmap 59.3493 ( 0.00%) 42.2991 ( 28.73%) Best99%Mean mmap 57.2121 ( 0.00%) 41.8207 ( 26.90%) Best95%Mean mmap 55.9113 ( 0.00%) 39.9620 ( 28.53%) Best90%Mean mmap 55.6199 ( 0.00%) 39.3124 ( 29.32%) Best50%Mean mmap 53.2183 ( 0.00%) 33.1307 ( 37.75%) Best10%Mean mmap 45.9842 ( 0.00%) 20.4040 ( 55.63%) Best5%Mean mmap 43.2256 ( 0.00%) 17.9654 ( 58.44%) Best1%Mean mmap 32.9388 ( 0.00%) 16.6875 ( 49.34%) This shows a number of improvements with the worst-case outlier greatly improved. Some of the vmstats are interesting 4.7.0-rc4 4.7.0-rc4 mmotm-20160623nodelru-v8 Swap Ins 163 502 Swap Outs 0 0 DMA allocs 0 0 DMA32 allocs 618719206 1381662383 Normal allocs 891235743 564138421 Movable allocs 0 0 Allocation stalls 2603 1 Direct pages scanned 216787 2 Kswapd pages scanned 50719775 41778378 Kswapd pages reclaimed 41541765 41777639 Direct pages reclaimed 209159 0 Kswapd efficiency 81% 99% Kswapd velocity 16859.554 14329.059 Direct efficiency 96% 0% Direct velocity 72.061 0.001 Percentage direct scans 0% 0% Page writes by reclaim 6215049 0 Page writes file 6215049 0 Page writes anon 0 0 Page reclaim immediate 70673 90 Sector Reads 81940800 81680456 Sector Writes 100158984 98816036 Page rescued immediate 0 0 Slabs scanned 1366954 22683 While this is not guaranteed in all cases, this particular test showed a large reduction in direct reclaim activity. It's also worth noting that no page writes were issued from reclaim context. This series is not without its hazards. There are at least three areas that I'm concerned with even though I could not reproduce any problems in that area. 1. Reclaim/compaction is going to be affected because the amount of reclaim is no longer targetted at a specific zone. Compaction works on a per-zone basis so there is no guarantee that reclaiming a few THP's worth page pages will have a positive impact on compaction success rates. 2. The Slab/LRU reclaim ratio is affected because the frequency the shrinkers are called is now different. This may or may not be a problem but if it is, it'll be because shrinkers are not called enough and some balancing is required. 3. The anon/file reclaim ratio may be affected. Pages about to be dirtied are distributed between zones and the fair zone allocation policy used to do something very similar for anon. The distribution is now different but not necessarily in any way that matters but it's still worth bearing in mind. VM statistic counters for reclaim decisions are zone-based. If the kernel is to reclaim on a per-node basis then we need to track per-node statistics but there is no infrastructure for that. The most notable change is that the old node_page_state is renamed to sum_zone_node_page_state. The new node_page_state takes a pglist_data and uses per-node stats but none exist yet. There is some renaming such as vm_stat to vm_zone_stat and the addition of vm_node_stat and the renaming of mod_state to mod_zone_state. Otherwise, this is mostly a mechanical patch with no functional change. There is a lot of similarity between the node and zone helpers which is unfortunate but there was no obvious way of reusing the code and maintaining type safety. Link: http://lkml.kernel.org/r/1467970510-21195-2-git-send-email-mgorman@techsingularity.netSigned-off-by: NMel Gorman <mgorman@techsingularity.net> Acked-by: NJohannes Weiner <hannes@cmpxchg.org> Acked-by: NVlastimil Babka <vbabka@suse.cz> Cc: Rik van Riel <riel@surriel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Hillf Danton <hillf.zj@alibaba-inc.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Minchan Kim <minchan@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Shaohua Li 提交于
The md device might not have personality (for example, ddf raid array). The issue is introduced by 8430e7e0(md: disconnect device from personality before trying to remove it) Reported-by: Nkernel test robot <xiaolong.ye@intel.com> Signed-off-by: NShaohua Li <shli@fb.com>
-
- 28 7月, 2016 2 次提交
-
-
由 Dan Carpenter 提交于
We accidentally take the "port->lock" twice in a row. This old code was supposed to be deleted. Fixes: e58e241c ('sparc: serial: Clean up the locking for -rt') Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Theodore Ts'o 提交于
This fixes a crash on s390 with fake NUMA enabled. Reported-by: NHeiko Carstens <heiko.carstens@de.ibm.com> Fixes: 1e7f583a ("random: make /dev/urandom scalable for silly userspace programs") Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-