- 09 9月, 2021 1 次提交
-
-
由 David Hildenbrand 提交于
There is only a single user remaining. We can simply lookup the nid only used for node offlining purposes when walking our memory blocks. We don't expect to remove multi-nid ranges; and if we'd ever do, we most probably don't care about removing multi-nid ranges that actually result in empty nodes. If ever required, we can detect the "multi-nid" scenario and simply try offlining all online nodes. Link: https://lkml.kernel.org/r/20210712124052.26491-4-david@redhat.comSigned-off-by: NDavid Hildenbrand <david@redhat.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Len Brown <lenb@kernel.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Nathan Lynch <nathanl@linux.ibm.com> Cc: Laurent Dufour <ldufour@linux.ibm.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> Cc: Scott Cheloha <cheloha@linux.ibm.com> Cc: Anton Blanchard <anton@ozlabs.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Baoquan He <bhe@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Christophe Leroy <christophe.leroy@c-s.fr> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jia He <justin.he@arm.com> Cc: Joe Perches <joe@perches.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Michel Lespinasse <michel@lespinasse.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Pankaj Gupta <pankaj.gupta@ionos.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Pierre Morel <pmorel@linux.ibm.com> Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com> Cc: Rich Felker <dalias@libc.org> Cc: Sergei Trofimovich <slyfox@gentoo.org> Cc: Thiago Jung Bauermann <bauerman@linux.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 24 6月, 2021 3 次提交
-
-
由 Daniel Henrique Barboza 提交于
The validation done at the start of dlpar_memory_add_by_ic() is an all of nothing scenario - if any LMBs in the range is marked as RESERVED we can fail right away. We then can remove the 'lmbs_available' var and its check with 'lmbs_to_add' since the whole LMB range was already validated in the previous step. Signed-off-by: NDaniel Henrique Barboza <danielhb413@gmail.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210622133923.295373-4-danielhb413@gmail.com
-
由 Daniel Henrique Barboza 提交于
After a successful dlpar_add_lmb() call the LMB is marked as reserved. Later on, depending whether we added enough LMBs or not, we rely on the marked LMBs to see which ones might need to be removed, and we remove the reservation of all of them. These are done in for_each_drmem_lmb() loops without any break condition. This means that we're going to check all LMBs of the partition even after going through all the reserved ones. This patch adds break conditions in both loops to avoid this. The 'lmbs_added' variable was renamed to 'lmbs_reserved', and it's now being decremented each time a lmb reservation is removed, indicating if there are still marked LMBs to be processed. Signed-off-by: NDaniel Henrique Barboza <danielhb413@gmail.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210622133923.295373-3-danielhb413@gmail.com
-
由 Daniel Henrique Barboza 提交于
The function is counting reserved LMBs as available to be added, but they aren't. This will cause the function to miscalculate the available LMBs and can trigger errors later on when executing dlpar_add_lmb(). Signed-off-by: NDaniel Henrique Barboza <danielhb413@gmail.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210622133923.295373-2-danielhb413@gmail.com
-
- 23 5月, 2021 5 次提交
-
-
由 Daniel Henrique Barboza 提交于
We don't need the 'lmbs_available' variable to count the valid LMBs and to check if we have less than 'lmbs_to_remove'. We must ensure that the entire LMB range must be removed, so we can error out immediately if any LMB in the range is marked as reserved. Add a couple of comments explaining the reasoning behind the differences we have in this function in contrast to what it is done in its sister function, dlpar_memory_remove_by_count(). Signed-off-by: NDaniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210512202809.95363-5-danielhb413@gmail.com
-
由 Daniel Henrique Barboza 提交于
After marking the LMBs as reserved depending on dlpar_remove_lmb() rc, we evaluate whether we need to add the LMBs back or if we can release the LMB DRCs. In both cases, a for_each_drmem_lmb() loop without a break condition is used. This means that we're going to cycle through all LMBs of the partition even after we're done with what we were going to do. This patch adds break conditions in both loops to avoid this. The 'lmbs_removed' variable was renamed to 'lmbs_reserved', and it's now being decremented each time a lmb reservation is removed, indicating that the operation we're doing (adding back LMBs or releasing DRCs) is completed. Signed-off-by: NDaniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210512202809.95363-4-danielhb413@gmail.com
-
由 Daniel Henrique Barboza 提交于
DRCONF_MEM_RESERVED is a flag that represents the "Reserved Memory" status in LOPAR v2.10, section 4.2.8. If a LMB is marked as reserved, quoting LOPAR, "is not to be used or altered by the base OS". This flag is read only in the kernel, being set by the firmware/hypervisor in the DT. As an example, QEMU will set this flag in hw/ppc/spapr.c, spapr_dt_dynamic_memory(). lmb_is_removable() does not check for DRCONF_MEM_RESERVED. This function is used in dlpar_remove_lmb() as a guard before the removal logic. Since it is failing to check for !RESERVED, dlpar_remove_lmb() will fail in a later stage instead of failing in the validation when receiving a reserved LMB as input. lmb_is_removable() is also used in dlpar_memory_remove_by_count() to evaluate if we have enough LMBs to complete the request. The missing !RESERVED check in this case is causing dlpar_memory_remove_by_count() to miscalculate the number of elegible LMBs for the removal, and can make it error out later on instead of failing in the validation with the 'not enough LMBs to satisfy request' message. Making a DRCONF_MEM_RESERVED check in lmb_is_removable() fixes all these issues. Signed-off-by: NDaniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210512202809.95363-3-danielhb413@gmail.com
-
由 Daniel Henrique Barboza 提交于
As previously done in dlpar_cpu_remove() for CPUs, this patch changes dlpar_memory_remove_by_ic() to unisolate the LMB DRC when the LMB is failed to be removed. The hypervisor, seeing a LMB DRC that was supposed to be removed being unisolated instead, can do error recovery on its side. This change is done in dlpar_memory_remove_by_ic() only because, as of today, only QEMU is using this code path for error recovery (via the PSERIES_HP_ELOG_ID_DRC_IC event). phyp treats it as a no-op. Signed-off-by: NDaniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210512202809.95363-2-danielhb413@gmail.com
-
由 YueHaibing 提交于
dlpar_memory_remove() is never used, so can be removed. Signed-off-by: NYueHaibing <yuehaibing@huawei.com> Reviewed-by: NDaniel Henrique Barboza <danielhb413@gmail.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210514071041.17432-1-yuehaibing@huawei.com
-
- 15 12月, 2020 1 次提交
-
-
由 Laurent Dufour 提交于
When attempting to remove by index a set of LMBs a lot of messages are displayed on the console, even when everything goes fine: pseries-hotplug-mem: Attempting to hot-remove LMB, drc index 8000002d Offlined Pages 4096 pseries-hotplug-mem: Memory at 2d0000000 was hot-removed The 2 messages prefixed by "pseries-hotplug-mem" are not really helpful for the end user, they should be debug outputs. In case of error, because some of the LMB's pages couldn't be offlined, the following is displayed on the console: pseries-hotplug-mem: Attempting to hot-remove LMB, drc index 8000003e pseries-hotplug-mem: Failed to hot-remove memory at 3e0000000 dlpar: Could not handle DLPAR request "memory remove index 0x8000003e" Again, the 2 messages prefixed by "pseries-hotplug-mem" are useless, and the generic DLPAR prefixed message should be enough. These 2 first changes are mainly triggered by the changes introduced in drmgr: https://groups.google.com/g/powerpc-utils-devel/c/Y6ef4NB3EzM/m/9cu5JHRxAQAJ Also, when adding a bunch of LMBs, a message is displayed in the console per LMB like these ones: pseries-hotplug-mem: Memory at 7e0000000 (drc index 8000007e) was hot-added pseries-hotplug-mem: Memory at 7f0000000 (drc index 8000007f) was hot-added pseries-hotplug-mem: Memory at 800000000 (drc index 80000080) was hot-added pseries-hotplug-mem: Memory at 810000000 (drc index 80000081) was hot-added When adding 1TB of memory and LMB size is 256MB, this leads to 4096 messages to be displayed on the console. These messages are not really helpful for the end user, so moving them to the DEBUG level. Signed-off-by: NLaurent Dufour <ldufour@linux.ibm.com> [mpe: Tweak change log wording] Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201211145954.90143-1-ldufour@linux.ibm.com
-
- 17 10月, 2020 1 次提交
-
-
由 David Hildenbrand 提交于
We soon want to pass flags, e.g., to mark added System RAM resources. mergeable. Prepare for that. This patch is based on a similar patch by Oscar Salvador: https://lkml.kernel.org/r/20190625075227.15193-3-osalvador@suse.deSigned-off-by: NDavid Hildenbrand <david@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: Juergen Gross <jgross@suse.com> # Xen related part Reviewed-by: NPankaj Gupta <pankaj.gupta.linux@gmail.com> Acked-by: NWei Liu <wei.liu@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Baoquan He <bhe@redhat.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Len Brown <lenb@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Wei Liu <wei.liu@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: David Hildenbrand <david@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: "Oliver O'Halloran" <oohall@gmail.com> Cc: Pingfan Liu <kernelfans@gmail.com> Cc: Nathan Lynch <nathanl@linux.ibm.com> Cc: Libor Pechacek <lpechacek@suse.cz> Cc: Anton Blanchard <anton@ozlabs.org> Cc: Leonardo Bras <leobras.c@gmail.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Julien Grall <julien@xen.org> Cc: Kees Cook <keescook@chromium.org> Cc: Roger Pau Monné <roger.pau@citrix.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wei Yang <richardw.yang@linux.intel.com> Link: https://lkml.kernel.org/r/20200911103459.10306-5-david@redhat.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 08 10月, 2020 2 次提交
-
-
由 Aneesh Kumar K.V 提交于
Make it consistent with other usages. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201007114836.282468-5-aneesh.kumar@linux.ibm.com
-
由 Aneesh Kumar K.V 提交于
Similar to commit 89c140bb ("pseries: Fix 64 bit logical memory block panic") make sure different variables tracking lmb_size are updated to be 64 bit. This was found by code audit. Cc: stable@vger.kernel.org Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201007114836.282468-3-aneesh.kumar@linux.ibm.com
-
- 06 10月, 2020 1 次提交
-
-
由 Scott Cheloha 提交于
During memory hot-add, dlpar_add_lmb() calls memory_add_physaddr_to_nid() to determine which node id (nid) to use when later calling __add_memory(). This is wasteful. On pseries, memory_add_physaddr_to_nid() finds an appropriate nid for a given address by looking up the LMB containing the address and then passing that LMB to of_drconf_to_nid_single() to get the nid. In dlpar_add_lmb() we get this address from the LMB itself. In short, we have a pointer to an LMB and then we are searching for that LMB *again* in order to find its nid. If we call of_drconf_to_nid_single() directly from dlpar_add_lmb() we can skip the redundant lookup. The only error handling we need to duplicate from memory_add_physaddr_to_nid() is the fallback to the default nid when drconf_to_nid_single() returns -1 (NUMA_NO_NODE) or an invalid nid. Skipping the extra lookup makes hot-add operations faster, especially on machines with many LMBs. Consider an LPAR with 126976 LMBs. In one test, hot-adding 126000 LMBs on an upatched kernel took ~3.5 hours while a patched kernel completed the same operation in ~2 hours: Unpatched (12450 seconds): Sep 9 04:06:31 ltc-brazos1 drmgr[810169]: drmgr: -c mem -a -q 126000 Sep 9 04:06:31 ltc-brazos1 kernel: pseries-hotplug-mem: Attempting to hot-add 126000 LMB(s) [...] Sep 9 07:34:01 ltc-brazos1 kernel: pseries-hotplug-mem: Memory at 20000000 (drc index 80000002) was hot-added Patched (7065 seconds): Sep 8 21:49:57 ltc-brazos1 drmgr[877703]: drmgr: -c mem -a -q 126000 Sep 8 21:49:57 ltc-brazos1 kernel: pseries-hotplug-mem: Attempting to hot-add 126000 LMB(s) [...] Sep 8 23:27:42 ltc-brazos1 kernel: pseries-hotplug-mem: Memory at 20000000 (drc index 80000002) was hot-added It should be noted that the speedup grows more substantial when hot-adding LMBs at the end of the drconf range. This is because we are skipping a linear LMB search. To see the distinction, consider smaller hot-add test on the same LPAR. A perf-stat run with 10 iterations showed that hot-adding 4096 LMBs completed less than 1 second faster on a patched kernel: Unpatched: Performance counter stats for 'drmgr -c mem -a -q 4096' (10 runs): 104,753.42 msec task-clock # 0.992 CPUs utilized ( +- 0.55% ) 4,708 context-switches # 0.045 K/sec ( +- 0.69% ) 2,444 cpu-migrations # 0.023 K/sec ( +- 1.25% ) 394 page-faults # 0.004 K/sec ( +- 0.22% ) 445,902,503,057 cycles # 4.257 GHz ( +- 0.55% ) (66.67%) 8,558,376,740 stalled-cycles-frontend # 1.92% frontend cycles idle ( +- 0.88% ) (49.99%) 300,346,181,651 stalled-cycles-backend # 67.36% backend cycles idle ( +- 0.76% ) (50.01%) 258,091,488,691 instructions # 0.58 insn per cycle # 1.16 stalled cycles per insn ( +- 0.22% ) (66.67%) 70,568,169,256 branches # 673.660 M/sec ( +- 0.17% ) (50.01%) 3,100,725,426 branch-misses # 4.39% of all branches ( +- 0.20% ) (49.99%) 105.583 +- 0.589 seconds time elapsed ( +- 0.56% ) Patched: Performance counter stats for 'drmgr -c mem -a -q 4096' (10 runs): 104,055.69 msec task-clock # 0.993 CPUs utilized ( +- 0.32% ) 4,606 context-switches # 0.044 K/sec ( +- 0.20% ) 2,463 cpu-migrations # 0.024 K/sec ( +- 0.93% ) 394 page-faults # 0.004 K/sec ( +- 0.25% ) 442,951,129,921 cycles # 4.257 GHz ( +- 0.32% ) (66.66%) 8,710,413,329 stalled-cycles-frontend # 1.97% frontend cycles idle ( +- 0.47% ) (50.06%) 299,656,905,836 stalled-cycles-backend # 67.65% backend cycles idle ( +- 0.39% ) (50.02%) 252,731,168,193 instructions # 0.57 insn per cycle # 1.19 stalled cycles per insn ( +- 0.20% ) (66.66%) 68,902,851,121 branches # 662.173 M/sec ( +- 0.13% ) (49.94%) 3,100,242,882 branch-misses # 4.50% of all branches ( +- 0.15% ) (49.98%) 104.829 +- 0.325 seconds time elapsed ( +- 0.31% ) This is consistent. An add-by-count hot-add operation adds LMBs greedily, so LMBs near the start of the drconf range are considered first. On an otherwise idle LPAR with so many LMBs we would expect to find the LMBs we need near the start of the drconf range, hence the smaller speedup. Signed-off-by: NScott Cheloha <cheloha@linux.ibm.com> Reviewed-by: NLaurent Dufour <ldufour@linux.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200916145122.3408129-1-cheloha@linux.ibm.com
-
- 02 9月, 2020 1 次提交
-
-
由 Scott Cheloha 提交于
At memory hot-remove time we can retrieve an LMB's nid from its corresponding memory_block. There is no need to store the nid in multiple locations. Note that lmb_to_memblock() uses find_memory_block() to get the corresponding memory_block. As find_memory_block() runs in sub-linear time this approach is negligibly slower than what we do at present. In exchange for this lookup at hot-remove time we no longer need to call memory_add_physaddr_to_nid() during drmem_init() for each LMB. On powerpc, memory_add_physaddr_to_nid() is a linear search, so this spares us an O(n^2) initialization during boot. On systems with many LMBs that initialization overhead is palpable and disruptive. For example, on a box with 249854 LMBs we're seeing drmem_init() take upwards of 30 seconds to complete: [ 53.721639] drmem: initializing drmem v2 [ 80.604346] watchdog: BUG: soft lockup - CPU#65 stuck for 23s! [swapper/0:1] [ 80.604377] Modules linked in: [ 80.604389] CPU: 65 PID: 1 Comm: swapper/0 Not tainted 5.6.0-rc2+ #4 [ 80.604397] NIP: c0000000000a4980 LR: c0000000000a4940 CTR: 0000000000000000 [ 80.604407] REGS: c0002dbff8493830 TRAP: 0901 Not tainted (5.6.0-rc2+) [ 80.604412] MSR: 8000000002009033 <SF,VEC,EE,ME,IR,DR,RI,LE> CR: 44000248 XER: 0000000d [ 80.604431] CFAR: c0000000000a4a38 IRQMASK: 0 [ 80.604431] GPR00: c0000000000a4940 c0002dbff8493ac0 c000000001904400 c0003cfffffede30 [ 80.604431] GPR04: 0000000000000000 c000000000f4095a 000000000000002f 0000000010000000 [ 80.604431] GPR08: c0000bf7ecdb7fb8 c0000bf7ecc2d3c8 0000000000000008 c00c0002fdfb2001 [ 80.604431] GPR12: 0000000000000000 c00000001e8ec200 [ 80.604477] NIP [c0000000000a4980] hot_add_scn_to_nid+0xa0/0x3e0 [ 80.604486] LR [c0000000000a4940] hot_add_scn_to_nid+0x60/0x3e0 [ 80.604492] Call Trace: [ 80.604498] [c0002dbff8493ac0] [c0000000000a4940] hot_add_scn_to_nid+0x60/0x3e0 (unreliable) [ 80.604509] [c0002dbff8493b20] [c000000000087c10] memory_add_physaddr_to_nid+0x20/0x60 [ 80.604521] [c0002dbff8493b40] [c0000000010d4880] drmem_init+0x25c/0x2f0 [ 80.604530] [c0002dbff8493c10] [c000000000010154] do_one_initcall+0x64/0x2c0 [ 80.604540] [c0002dbff8493ce0] [c0000000010c4aa0] kernel_init_freeable+0x2d8/0x3a0 [ 80.604550] [c0002dbff8493db0] [c000000000010824] kernel_init+0x2c/0x148 [ 80.604560] [c0002dbff8493e20] [c00000000000b648] ret_from_kernel_thread+0x5c/0x74 [ 80.604567] Instruction dump: [ 80.604574] 392918e8 e9490000 e90a000a e92a0000 80ea000c 1d080018 3908ffe8 7d094214 [ 80.604586] 7fa94040 419d00dc e9490010 714a0088 <2faa0008> 409e00ac e9490000 7fbe5040 [ 89.047390] drmem: 249854 LMB(s) With a patched kernel on the same machine we're no longer seeing the soft lockup. drmem_init() now completes in negligible time, even when the LMB count is large. Fixes: b2d3b5ee ("powerpc/pseries: Track LMB nid instead of using device tree") Signed-off-by: NScott Cheloha <cheloha@linux.ibm.com> Reviewed-by: NNathan Lynch <nathanl@linux.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200811015115.63677-1-cheloha@linux.ibm.com
-
- 16 7月, 2020 3 次提交
-
-
由 Anton Blanchard 提交于
Booting with a 4GB LMB size causes us to panic: qemu-system-ppc64: OS terminated: OS panic: Memory block size not suitable: 0x0 Fix pseries_memory_block_size() to handle 64 bit LMBs. Cc: stable@vger.kernel.org Signed-off-by: NAnton Blanchard <anton@ozlabs.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200715000820.1255764-1-anton@ozlabs.org
-
由 Nathan Lynch 提交于
pseries_update_drconf_memory() runs from a DT notifier in response to an update to the ibm,dynamic-memory property of the /ibm,dynamic-reconfiguration-memory node. This property is an older less compact format than the ibm,dynamic-memory-v2 property used in most currently supported firmwares. There has never been an equivalent function for the v2 property. pseries_update_drconf_memory() compares the 'assigned' flag for each LMB in the old vs new properties and adds or removes the block accordingly. However it appears to be of no actual utility: * Partition suspension and PRRNs are specified only to change LMBs' NUMA affinity information. This notifier should be a no-op for those scenarios since the assigned flags should not change. * The memory hotplug/DLPAR path has a hack which short-circuits execution of the notifier: dlpar_memory() ... rtas_hp_event = true; drmem_update_dt() of_update_property() pseries_memory_notifier() pseries_update_drconf_memory() if (rtas_hp_event) return; So this code only makes sense as a relic of the time when more of the DLPAR workflow took place in user space. I don't see a purpose for it now. Signed-off-by: NNathan Lynch <nathanl@linux.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200612051238.1007764-19-nathanl@linux.ibm.com
-
由 Nathan Lynch 提交于
dlpar_memory() no longer has any callers which pass PSERIES_HP_ELOG_ACTION_READD. Remove this case and the corresponding unreachable code. Signed-off-by: NNathan Lynch <nathanl@linux.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200612051238.1007764-17-nathanl@linux.ibm.com
-
- 05 6月, 2020 1 次提交
-
-
由 David Hildenbrand 提交于
In commit 53cdc1cb ("drivers/base/memory.c: indicate all memory blocks as removable"), the user space interface to compute whether a memory block can be offlined (exposed via /sys/devices/system/memory/memoryX/removable) has effectively been deprecated. We want to remove the leftovers of the kernel implementation. When offlining a memory block (mm/memory_hotplug.c:__offline_pages()), we'll start by: 1. Testing if it contains any holes, and reject if so 2. Testing if pages belong to different zones, and reject if so 3. Isolating the page range, checking if it contains any unmovable pages Using is_mem_section_removable() before trying to offline is not only racy, it can easily result in false positives/negatives. Let's stop manually checking is_mem_section_removable(), and let device_offline() handle it completely instead. We can remove the racy is_mem_section_removable() implementation next. We now take more locks (e.g., memory hotplug lock when offlining and the zone lock when isolating), but maybe we should optimize that implementation instead if this ever becomes a real problem (after all, memory unplug is already an expensive operation). We started using is_mem_section_removable() in commit 51925fb3 ("powerpc/pseries: Implement memory hotplug remove in the kernel"), with the initial hotremove support of lmbs. Signed-off-by: NDavid Hildenbrand <david@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Baoquan He <bhe@redhat.com> Cc: Wei Yang <richard.weiyang@gmail.com> Link: http://lkml.kernel.org/r/20200407135416.24093-2-david@redhat.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 03 4月, 2020 1 次提交
-
-
由 Pingfan Liu 提交于
After introducing mem sub section concept, pfn_present() loses its literal meaning, and will not be necessary a truth on partial populated mem section. Since all of the callers use it to judge an absent section, it is better to rename pfn_present() as pfn_in_present_section(). Signed-off-by: NPingfan Liu <kernelfans@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NDavid Hildenbrand <david@redhat.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> [powerpc] Cc: Dan Williams <dan.j.williams@intel.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Cc: Leonardo Bras <leonardo@linux.ibm.com> Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com> Cc: Nathan Lynch <nathanl@linux.ibm.com> Link: http://lkml.kernel.org/r/1581919110-29575-1-git-send-email-kernelfans@gmail.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 19 2月, 2020 1 次提交
-
-
由 Libor Pechacek 提交于
In guests without hotplugagble memory drmem structure is only zero initialized. Trying to manipulate DLPAR parameters results in a crash. $ echo "memory add count 1" > /sys/kernel/dlpar Oops: Kernel access of bad area, sig: 11 [#1] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries ... NIP: c0000000000ff294 LR: c0000000000ff248 CTR: 0000000000000000 REGS: c0000000fb9d3880 TRAP: 0300 Tainted: G E (5.5.0-rc6-2-default) MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 28242428 XER: 20000000 CFAR: c0000000009a6c10 DAR: 0000000000000010 DSISR: 40000000 IRQMASK: 0 ... NIP dlpar_memory+0x6e4/0xd00 LR dlpar_memory+0x698/0xd00 Call Trace: dlpar_memory+0x698/0xd00 (unreliable) handle_dlpar_errorlog+0xc0/0x190 dlpar_store+0x198/0x4a0 kobj_attr_store+0x30/0x50 sysfs_kf_write+0x64/0x90 kernfs_fop_write+0x1b0/0x290 __vfs_write+0x3c/0x70 vfs_write+0xd0/0x260 ksys_write+0xdc/0x130 system_call+0x5c/0x68 Taking closer look at the code, I can see that for_each_drmem_lmb is a macro expanding into `for (lmb = &drmem_info->lmbs[0]; lmb <= &drmem_info->lmbs[drmem_info->n_lmbs - 1]; lmb++)`. When drmem_info->lmbs is NULL, the loop would iterate through the whole address range if it weren't stopped by the NULL pointer dereference on the next line. This patch aligns for_each_drmem_lmb and for_each_drmem_lmb_in_range macro behavior with the common C semantics, where the end marker does not belong to the scanned range, and alters get_lmb_range() semantics. As a side effect, the wraparound observed in the crash is prevented. Fixes: 6c6ea537 ("powerpc/mm: Separate ibm, dynamic-memory data from DT format") Cc: stable@vger.kernel.org # v4.16+ Signed-off-by: NLibor Pechacek <lpechacek@suse.cz> Signed-off-by: NMichal Suchanek <msuchanek@suse.de> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200131132829.10281-1-msuchanek@suse.de
-
- 14 1月, 2020 1 次提交
-
-
由 Pingfan Liu 提交于
In lmb_is_removable(), if a section is not present, it should continue to test the rest of the sections in the block. But the current code fails to do so. Fixes: 51925fb3 ("powerpc/pseries: Implement memory hotplug remove in the kernel") Cc: stable@vger.kernel.org # v4.1+ Signed-off-by: NPingfan Liu <kernelfans@gmail.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1578632042-12415-1-git-send-email-kernelfans@gmail.com
-
- 13 11月, 2019 1 次提交
-
-
由 Leonardo Bras 提交于
Changes the return variable to bool (as the return value) and avoids doing a ternary operation before returning. Signed-off-by: NLeonardo Bras <leonardo@linux.ibm.com> Reviewed-by: NDavid Hildenbrand <david@redhat.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190802133914.30413-1-leonardo@linux.ibm.com
-
- 05 8月, 2019 1 次提交
-
-
由 Leonardo Bras 提交于
I noticed these nested ifs can be easily replaced by switch-cases, which can improve readability. Signed-off-by: NLeonardo Bras <leonardo@linux.ibm.com> Reviewed-by: NDavid Hildenbrand <david@redhat.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190801225251.17864-1-leonardo@linux.ibm.com
-
- 14 6月, 2019 1 次提交
-
-
由 Nathan Lynch 提交于
During post-migration device tree updates, we can oops in pseries_update_drconf_memory() if the source device tree has an ibm,dynamic-memory-v2 property and the destination has a ibm,dynamic_memory (v1) property. The notifier processes an "update" for the ibm,dynamic-memory property but it's really an add in this scenario. So make sure the old property object is there before dereferencing it. Fixes: 2b31e3ae ("powerpc/drmem: Add support for ibm, dynamic-memory-v2 property") Cc: stable@vger.kernel.org # v4.16+ Signed-off-by: NNathan Lynch <nathanl@linux.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 31 5月, 2019 1 次提交
-
-
由 Thomas Gleixner 提交于
Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 3029 file(s). Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NAllison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.deSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 29 4月, 2019 1 次提交
-
-
由 Nathan Fontenot 提交于
When removing memory we need to remove the memory from the node it was added to instead of looking up the node it should be in in the device tree. During testing we have seen scenarios where the affinity for a LMB changes due to a partition migration or PRRN event. In these cases the node the LMB exists in may not match the node the device tree indicates it belongs in. This can lead to a system crash when trying to DLPAR remove the LMB after a migration or PRRN event. The current code looks up the node in the device tree to remove the LMB from, the crash occurs when we try to offline this node and it does not have any data, i.e. node_data[nid] == NULL. 36:mon> e cpu 0x36: Vector: 300 (Data Access) at [c0000001828b7810] pc: c00000000036d08c: try_offline_node+0x2c/0x1b0 lr: c0000000003a14ec: remove_memory+0xbc/0x110 sp: c0000001828b7a90 msr: 800000000280b033 dar: 9a28 dsisr: 40000000 current = 0xc0000006329c4c80 paca = 0xc000000007a55200 softe: 0 irq_happened: 0x01 pid = 76926, comm = kworker/u320:3 36:mon> t [link register ] c0000000003a14ec remove_memory+0xbc/0x110 [c0000001828b7a90] c00000000006a1cc arch_remove_memory+0x9c/0xd0 (unreliable) [c0000001828b7ad0] c0000000003a14e0 remove_memory+0xb0/0x110 [c0000001828b7b20] c0000000000c7db4 dlpar_remove_lmb+0x94/0x160 [c0000001828b7b60] c0000000000c8ef8 dlpar_memory+0x7e8/0xd10 [c0000001828b7bf0] c0000000000bf828 handle_dlpar_errorlog+0xf8/0x160 [c0000001828b7c60] c0000000000bf8cc pseries_hp_work_fn+0x3c/0xa0 [c0000001828b7c90] c000000000128cd8 process_one_work+0x298/0x5a0 [c0000001828b7d20] c000000000129068 worker_thread+0x88/0x620 [c0000001828b7dc0] c00000000013223c kthread+0x1ac/0x1c0 [c0000001828b7e30] c00000000000b45c ret_from_kernel_thread+0x5c/0x80 To resolve this we need to track the node a LMB belongs to when it is added to the system so we can remove it from that node instead of the node that the device tree indicates it should belong to. Signed-off-by: NNathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 22 12月, 2018 1 次提交
-
-
由 Michael Ellerman 提交于
In update_lmb_associativity_index() we lookup dr_node using of_find_node_by_path() which takes a reference for us. In the non-error case we forget to drop the reference. Note that find_aa_index() does modify properties of the node, but doesn't need an extra reference held once it's returned. Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 21 12月, 2018 1 次提交
-
-
由 Mahesh Salgaonkar 提交于
For fadump to work successfully there should not be any holes in reserved memory ranges where kernel has asked firmware to move the content of old kernel memory in event of crash. Now that fadump uses CMA for reserved area, this memory area is now not protected from hot-remove operations unless it is cma allocated. Hence, fadump service can fail to re-register after the hot-remove operation, if hot-removed memory belongs to fadump reserved region. To avoid this make sure that memory from fadump reserved area is not hot-removable if fadump is registered. However, if user still wants to remove that memory, he can do so by manually stopping fadump service before hot-remove operation. Signed-off-by: NMahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 26 11月, 2018 1 次提交
-
-
由 Rob Herring 提交于
Remove directly accessing device_node.type pointer and use the accessors instead. This will eventually allow removing the type pointer. Replace the open coded iterating over child nodes with for_each_child_of_node() while we're here. Signed-off-by: NRob Herring <robh@kernel.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 31 10月, 2018 2 次提交
-
-
由 David Hildenbrand 提交于
add_memory() currently does not take the device_hotplug_lock, however is aleady called under the lock from arch/powerpc/platforms/pseries/hotplug-memory.c drivers/acpi/acpi_memhotplug.c to synchronize against CPU hot-remove and similar. In general, we should hold the device_hotplug_lock when adding memory to synchronize against online/offline request (e.g. from user space) - which already resulted in lock inversions due to device_lock() and mem_hotplug_lock - see 30467e0b ("mm, hotplug: fix concurrent memory hot-add deadlock"). add_memory()/add_memory_resource() will create memory block devices, so this really feels like the right thing to do. Holding the device_hotplug_lock makes sure that a memory block device can really only be accessed (e.g. via .online/.state) from user space, once the memory has been fully added to the system. The lock is not held yet in drivers/xen/balloon.c arch/powerpc/platforms/powernv/memtrace.c drivers/s390/char/sclp_cmd.c drivers/hv/hv_balloon.c So, let's either use the locked variants or take the lock. Don't export add_memory_resource(), as it once was exported to be used by XEN, which is never built as a module. If somebody requires it, we also have to export a locked variant (as device_hotplug_lock is never exported). Link: http://lkml.kernel.org/r/20180925091457.28651-3-david@redhat.comSigned-off-by: NDavid Hildenbrand <david@redhat.com> Reviewed-by: NPavel Tatashin <pavel.tatashin@microsoft.com> Reviewed-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: NRashmica Gupta <rashmica.g@gmail.com> Reviewed-by: NOscar Salvador <osalvador@suse.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Len Brown <lenb@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Juergen Gross <jgross@suse.com> Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com> Cc: John Allen <jallen@linux.vnet.ibm.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Mathieu Malaterre <malat@debian.org> Cc: Pavel Tatashin <pavel.tatashin@microsoft.com> Cc: YASUAKI ISHIMATSU <yasu.isimatu@gmail.com> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kate Stewart <kstewart@linuxfoundation.org> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Neuling <mikey@neuling.org> Cc: Philippe Ombredanne <pombredanne@nexb.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 David Hildenbrand 提交于
Patch series "mm: online/offline_pages called w.o. mem_hotplug_lock", v3. Reading through the code and studying how mem_hotplug_lock is to be used, I noticed that there are two places where we can end up calling device_online()/device_offline() - online_pages()/offline_pages() without the mem_hotplug_lock. And there are other places where we call device_online()/device_offline() without the device_hotplug_lock. While e.g. echo "online" > /sys/devices/system/memory/memory9/state is fine, e.g. echo 1 > /sys/devices/system/memory/memory9/online Will not take the mem_hotplug_lock. However the device_lock() and device_hotplug_lock. E.g. via memory_probe_store(), we can end up calling add_memory()->online_pages() without the device_hotplug_lock. So we can have concurrent callers in online_pages(). We e.g. touch in online_pages() basically unprotected zone->present_pages then. Looks like there is a longer history to that (see Patch #2 for details), and fixing it to work the way it was intended is not really possible. We would e.g. have to take the mem_hotplug_lock in device/base/core.c, which sounds wrong. Summary: We had a lock inversion on mem_hotplug_lock and device_lock(). More details can be found in patch 3 and patch 6. I propose the general rules (documentation added in patch 6): 1. add_memory/add_memory_resource() must only be called with device_hotplug_lock. 2. remove_memory() must only be called with device_hotplug_lock. This is already documented and holds for all callers. 3. device_online()/device_offline() must only be called with device_hotplug_lock. This is already documented and true for now in core code. Other callers (related to memory hotplug) have to be fixed up. 4. mem_hotplug_lock is taken inside of add_memory/remove_memory/ online_pages/offline_pages. To me, this looks way cleaner than what we have right now (and easier to verify). And looking at the documentation of remove_memory, using lock_device_hotplug also for add_memory() feels natural. This patch (of 6): remove_memory() is exported right now but requires the device_hotplug_lock, which is not exported. So let's provide a variant that takes the lock and only export that one. The lock is already held in arch/powerpc/platforms/pseries/hotplug-memory.c drivers/acpi/acpi_memhotplug.c arch/powerpc/platforms/powernv/memtrace.c Apart from that, there are not other users in the tree. Link: http://lkml.kernel.org/r/20180925091457.28651-2-david@redhat.comSigned-off-by: NDavid Hildenbrand <david@redhat.com> Reviewed-by: NPavel Tatashin <pavel.tatashin@microsoft.com> Reviewed-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: NRashmica Gupta <rashmica.g@gmail.com> Reviewed-by: NOscar Salvador <osalvador@suse.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Len Brown <lenb@kernel.org> Cc: Rashmica Gupta <rashmica.g@gmail.com> Cc: Michael Neuling <mikey@neuling.org> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com> Cc: John Allen <jallen@linux.vnet.ibm.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: YASUAKI ISHIMATSU <yasu.isimatu@gmail.com> Cc: Mathieu Malaterre <malat@debian.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Juergen Gross <jgross@suse.com> Cc: Kate Stewart <kstewart@linuxfoundation.org> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Philippe Ombredanne <pombredanne@nexb.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 13 10月, 2018 1 次提交
-
-
由 YueHaibing 提交于
The variable 'aa_index' is defined as an unsigned value in update_lmb_associativity_index(), but find_aa_index() may return -1 when dlpar_clone_property() fails. So change find_aa_index() to return a bool, which indicates whether 'aa_index' was found or not. Fixes: c05a5a40 ("powerpc/pseries: Dynamic add entires to associativity lookup array") Signed-off-by: NYueHaibing <yuehaibing@huawei.com> Reviewed-by: Nathan Fontenot nfont@linux.vnet.ibm.com> [mpe: Tweak changelog, rename is_found to just found] Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 19 9月, 2018 1 次提交
-
-
由 Nathan Fontenot 提交于
The updates to powerpc numa and memory hotplug code now use the in-kernel LMB array instead of the device tree. This change allows the pseries memory DLPAR code to only update the device tree once after successfully handling a DLPAR request. Prior to the in-kernel LMB array, the numa code looked up the affinity for memory being added in the device tree, the code now looks this up in the LMB array. This change means the memory hotplug code can just update the affinity for an LMB in the LMB array instead of updating the device tree. This also provides a savings in kernel memory. When updating the device tree old properties are never free'ed since there is no usecount on properties. This behavior leads to a new copy of the property being allocated every time a LMB is added or removed (i.e. a request to add 100 LMBs creates 100 new copies of the property). With this update only a single new property is created when a DLPAR request completes successfully. Signed-off-by: NNathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 16 1月, 2018 2 次提交
-
-
由 Nathan Fontenot 提交于
Now that the powerpc code parses dynamic reconfiguration memory LMB information from the LMB array and not the device tree directly we can move the of_drconf_cell struct to drmem.h where it fits better. In addition, the struct is renamed to of_drconf_cell_v1 in anticipation of upcoming support for version 2 of the dynamic reconfiguration property and the members are typed as __be* values to reflect how they exist in the device tree. Signed-off-by: NNathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Nathan Fontenot 提交于
Update the pseries memory hotplug code to use the newly added dynamic reconfiguration LMB array. Doing this is required for the upcoming support of version 2 of the dynamic reconfiguration device tree property. In addition, making this change cleans up the code that parses the LMB information as we no longer need to worry about device tree format. This allows us to discard one of the first steps on memory hotplug where we make a working copy of the device tree property and convert the entire property to cpu format. Instead we just use the LMB array directly while holding the memory hotplug lock. This patch also moves the updating of the device tree property to powerpc/mm/drmem.c. This allows to the hotplug code to work without needing to know the device tree format and provides a single routine for updating the device tree property. This new routine will handle determination of the proper device tree format and generate a properly formatted device tree property. Signed-off-by: NNathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 31 8月, 2017 1 次提交
-
-
由 John Allen 提交于
Check if an LMB is assigned before attempting to call dlpar_acquire_drc in order to avoid any unnecessary rtas calls. This substantially reduces the running time of memory hot add on lpars with large amounts of memory. [mpe: We need to explicitly set rc to 0 in the success case, otherwise the compiler might think we use rc without initialising it.] Fixes: c21f515c ("powerpc/pseries: Make the acquire/release of the drc for memory a seperate step") Cc: stable@vger.kernel.org # v4.11+ Signed-off-by: NJohn Allen <jallen@linux.vnet.ibm.com> Reviewed-by: NNathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 10 8月, 2017 1 次提交
-
-
由 Nathan Fontenot 提交于
When DLPAR adding or removing memory we need to check the device offline status before trying to online/offline the memory. This is needed because calls to device_online() and device_offline() will return non-zero for memory that is already online and offline respectively. This update resolves two scenarios. First, for a kernel built with auto-online memory enabled (CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE=y), memory will be onlined as part of calls to add_memory(). After adding the memory the pseries DLPAR code tries to online it and fails since the memory is already online. The DLPAR code then tries to remove the memory which produces the oops message below because the memory is not offline. The second scenario occurs when removing memory that is already offline, i.e. marking memory offline (via sysfs) and then trying to remove that memory. This doesn't work because offlining the already offline memory does not succeed and the DLPAR code then fails the DLPAR remove operation. The fix for both scenarios is to check the device.offline status before making the calls to device_online() or device_offline(). kernel BUG at mm/memory_hotplug.c:1936! ... NIP [c0000000002ca428] .remove_memory+0xb8/0xc0 LR [c0000000002ca3cc] .remove_memory+0x5c/0xc0 Call Trace: .remove_memory+0x5c/0xc0 (unreliable) .dlpar_add_lmb+0x384/0x400 .dlpar_memory+0x5dc/0xca0 .handle_dlpar_errorlog+0x74/0xe0 .pseries_hp_work_fn+0x2c/0x90 .process_one_work+0x17c/0x460 .worker_thread+0x88/0x500 .kthread+0x15c/0x1a0 .ret_from_kernel_thread+0x58/0xc0 Fixes: 943db62c ("powerpc/pseries: Revert 'Auto-online hotplugged memory'") Signed-off-by: NNathan Fontenot <nfont@linux.vnet.ibm.com> [mpe: Use bool, add explicit rc=0 case, change log typos & formatting] Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 28 6月, 2017 1 次提交
-
-
由 Hari Bathini 提交于
To register fadump, boot memory area - the size of low memory chunk that is required for a kernel to boot successfully when booted with restricted memory, is assumed to have no holes. But this memory area is currently not protected from hot-remove operations. So, fadump could fail to re-register after a memory hot-remove operation, if memory is removed from boot memory area. To avoid this, ensure that memory from boot memory area is not hot-removed when fadump is registered. Signed-off-by: NHari Bathini <hbathini@linux.vnet.ibm.com> Reviewed-by: NMahesh J Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 01 6月, 2017 1 次提交
-
-
由 Michael Bringmann 提交于
When adding or removing memory, the aa_index (affinity value) for the memblock must also be converted to match the endianness of the rest of the 'ibm,dynamic-memory' property. Otherwise, subsequent retrieval of the attribute will likely lead to non-existent nodes, followed by using the default node in the code inappropriately. Fixes: 5f97b2a0 ("powerpc/pseries: Implement memory hotplug add in the kernel") Cc: stable@vger.kernel.org # v4.1+ Signed-off-by: NMichael Bringmann <mwb@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-