- 06 6月, 2020 24 次提交
-
-
由 Erwan Velu 提交于
Some vendors like HPe or Dell, encode the release version of their BIOS in the "System BIOS {Major|Minor} Release" fields of Type 0. This information is used to know which bios release actually runs. It could be used for some quirks, debugging sessions or inventory tasks. A typical output for a Dell system running the 65.27 bios is : [root@t1700 ~]# cat /sys/devices/virtual/dmi/id/bios_release 65.27 [root@t1700 ~]# Servers that have a BMC encode the release version of their firmware in the "Embedded Controller Firmware {Major|Minor} Release" fields of Type 0. This information is used to know which BMC release actually runs. It could be used for some quirks, debugging sessions or inventory tasks. A typical output for a Dell system running the 3.75 bmc release is : [root@t1700 ~]# cat /sys/devices/virtual/dmi/id/ec_firmware_release 3.75 [root@t1700 ~]# Signed-off-by: NErwan Velu <e.velu@criteo.com> Signed-off-by: NJean Delvare <jdelvare@suse.de>
-
由 Eric Biggers 提交于
queue_limits::logical_block_size got changed from unsigned short to unsigned int, but it was forgotten to update crypt_io_hints() to use the new type. Fix it. Fixes: ad6bf88a ("block: fix an integer overflow in logical block size") Cc: stable@vger.kernel.org Signed-off-by: NEric Biggers <ebiggers@google.com> Reviewed-by: NMikulas Patocka <mpatocka@redhat.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Mike Snitzer 提交于
When there are many DM multipath devices it really helps to have additional context for which DM device a failed or reinstated path is part of. Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Mike Snitzer 提交于
Add more DMDEBUG that shows arguments passed and caller, and another that shows state of related flags at end of queue_if_no_path(). Also add queue_if_no_path DMDEBUG to multipath_resume(). Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Mike Snitzer 提交于
Do not allow saving disabled queue_if_no_path if already saved as enabled; implies multiple suspends (which shouldn't ever happen). Log if this unlikely scenario is ever triggered. Also, only write MPATHF_SAVED_QUEUE_IF_NO_PATH during presuspend or if "fail_if_no_path" message. MPATHF_SAVED_QUEUE_IF_NO_PATH is no longer always modified, e.g.: even if queue_if_no_path()'s save_old_value argument wasn't set. This just implies a bit tighter control over the management of MPATHF_SAVED_QUEUE_IF_NO_PATH. Side-effect is multipath_resume() doesn't reset MPATHF_QUEUE_IF_NO_PATH unless MPATHF_SAVED_QUEUE_IF_NO_PATH was set (during presuspend); and at that time the MPATHF_SAVED_QUEUE_IF_NO_PATH bit gets cleared. So MPATHF_SAVED_QUEUE_IF_NO_PATH's use is much more narrow in scope. Last, but not least, do _not_ disable queue_if_no_path during noflush suspend. There is no need/benefit to saving off queue_if_no_path via MPATHF_SAVED_QUEUE_IF_NO_PATH and clearing MPATHF_QUEUE_IF_NO_PATH for noflush suspend -- by avoiding this needless queue_if_no_path flag churn there is less potential for MPATHF_QUEUE_IF_NO_PATH to get lost. Which avoids potential for IOs to be errored back up to userspace during DM multipath's handling of path failures. That said, this last change papers over a reported issue concerning request-based dm-multipath's interaction with blk-mq, relative to suspend and resume: multipath_endio is being called _before_ multipath_resume. This should never happen if DM suspend's blk_mq_quiesce_queue() + dm_wait_for_completion() is genuinely waiting for all inflight blk-mq requests to complete. Similarly: drivers/md/dm.c:__dm_resume() clearly calls dm_table_resume_targets() _before_ dm_start_queue()'s blk_mq_unquiesce_queue() is called. If the queue isn't even restarted until after multipath_resume(); the BIG question that still needs answering is: how can multipath_end_io beat multipath_resume in a race!? Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Mike Snitzer 提交于
Remove micro-optimization that infers device is between presuspend and resume (was done purely to avoid call to dm_noflush_suspending, which isn't expensive anyway). Remove flags argument since they are no longer checked. And remove must_push_back_bio() since it was simply a call to __must_push_back(). Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Hannes Reinecke 提交于
When specifying several devices the superblock location must be checked to ensure the devices are specified in the correct order. Signed-off-by: NHannes Reinecke <hare@suse.de> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Hannes Reinecke 提交于
Prefer full zones when selecting the next zone for reclaim. Signed-off-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NDamien Le Moal <damien.lemoal@wdc.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Hannes Reinecke 提交于
per-device reclaim should select zones on that device only. Signed-off-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NDamien Le Moal <damien.lemoal@wdc.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Hannes Reinecke 提交于
When allocating a zone, pass in an indicator on which device the zone should be allocated; this increases performance for a multi-device setup because reclaim will now allocate zones on the device for which reclaim is running. Signed-off-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NDamien Le Moal <damien.lemoal@wdc.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Hannes Reinecke 提交于
Remove the hard-coded limit of two devices and support an unlimited number of additional zoned devices. Signed-off-by: NHannes Reinecke <hare@suse.de> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Hannes Reinecke 提交于
Random and sequential zones should be part of the respective device structure to make arbitration between devices possible. Signed-off-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NDamien Le Moal <damien.lemoal@wdc.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Hannes Reinecke 提交于
Instead of having one reclaim workqueue for the entire set we should be allocating a reclaim workqueue per device; doing so will reduce contention and should boost performance for a multi-device setup. Signed-off-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NDamien Le Moal <damien.lemoal@wdc.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Hannes Reinecke 提交于
Add a metadata pointer within struct dmz_dev and use it as argument for blkdev_report_zones() instead of the metadata itself. Signed-off-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NDamien Le Moal <damien.lemoal@wdc.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Hannes Reinecke 提交于
Add a pointer, to the containing device, within struct dm_zone and kill dmz_zone_to_dev(). Signed-off-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NDamien Le Moal <damien.lemoal@wdc.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Hannes Reinecke 提交于
Checking the tertiary superblock just consists of validating UUIDs, crcs, and the generation number; it doesn't have contents which would be required during the actual operation. So allocate a temporary superblock when checking tertiary devices to avoid having to store it together with the 'real' superblocks. Signed-off-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NDamien Le Moal <damien.lemoal@wdc.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Hannes Reinecke 提交于
The zones array is getting really large, and large arrays tend to wreak havoc with the CPU caches. So convert it to xarray to become more cache friendly. Signed-off-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NDamien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Colin Ian King <colin.king@canonical.com> # fix leak in dmz_insert Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Hannes Reinecke 提交于
Instead of counting the number of reserved zones in dmz_free_zone(), mark the zone as 'reserved' during allocation and simplify dmz_free_zone(). Signed-off-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NDamien Le Moal <damien.lemoal@wdc.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Hannes Reinecke 提交于
Instead of just reporting the errno, add some more verbose debugging message in the reclaim path. Signed-off-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NDamien Le Moal <damien.lemoal@wdc.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Hannes Reinecke 提交于
The secondary superblock must reside on the same device as the primary superblock, so there is no need to re-calculate the device. Signed-off-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NDamien Le Moal <damien.lemoal@wdc.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Hannes Reinecke 提交于
Signed-off-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NDamien Le Moal <damien.lemoal@wdc.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Mikulas Patocka 提交于
Use dm_bufio_forget_buffers instead of a block-by-block loop that calls dm_bufio_forget. dm_bufio_forget_buffers is faster than the loop because it searches for used buffers using rb-tree. Signed-off-by: NMikulas Patocka <mpatocka@redhat.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Mikulas Patocka 提交于
Introduce a function forget_buffer_locked that forgets a range of buffers. It is more efficient than calling forget_buffer in a loop. Signed-off-by: NMikulas Patocka <mpatocka@redhat.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Mikulas Patocka 提交于
dm-bufio uses unnatural ordering in the rb-tree - blocks with smaller numbers were put to the right node and blocks with bigger numbers were put to the left node. Reverse that logic so that it's natural. Signed-off-by: NMikulas Patocka <mpatocka@redhat.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
- 05 6月, 2020 7 次提交
-
-
由 John Hubbard 提交于
This code was using get_user_pages_fast(), in a "Case 2" scenario (DMA/RDMA), using the categorization from [1]. That means that it's time to convert the get_user_pages_fast() + put_page() calls to pin_user_pages_fast() + unpin_user_pages() calls. There is some helpful background in [2]: basically, this is a small part of fixing a long-standing disconnect between pinning pages, and file systems' use of those pages. [1] Documentation/core-api/pin_user_pages.rst [2] "Explicit pinning of user-space pages": https://lwn.net/Articles/807108/Signed-off-by: NJohn Hubbard <jhubbard@nvidia.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Cc: Matt Porter <mporter@kernel.crashing.org> Cc: Alexandre Bounine <alex.bou9@gmail.com> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: Dan Carpenter <dan.carpenter@oracle.com> Link: http://lkml.kernel.org/r/20200517235620.205225-3-jhubbard@nvidia.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Madhuparna Bhowmik 提交于
Fields of md(mport_dev) are set after cdev_device_add(). However, the file operation callbacks can be called after cdev_device_add() and therefore accesses to fields of md in the callbacks can race with the rest of the mport_cdev_add() function. One such example is INIT_LIST_HEAD(&md->portwrites) in mport_cdev_add(), the list is initialised after cdev_device_add(). This can race with list_add_tail(&pw_filter->md_node,&md->portwrites) in rio_mport_add_pw_filter() which is called by unlocked_ioctl. To avoid such data races use cdev_device_add() after initializing md. Found by Linux Driver Verification project (linuxtesting.org). Signed-off-by: NMadhuparna Bhowmik <madhuparnabhowmik10@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Acked-by: NAlexandre Bounine <alex.bou9@gmail.com> Cc: Matt Porter <mporter@kernel.crashing.org> Cc: Dan Carpenter <dan.carpenter@oracle.com> Cc: Mike Marshall <hubcap@omnibond.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Allison Randal <allison@lohutok.net> Cc: Pavel Andrianov <andrianov@ispras.ru> Link: http://lkml.kernel.org/r/20200426112950.1803-1-madhuparnabhowmik10@gmail.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Andy Shevchenko 提交于
Instead of keeping NULL terminated array switch to use ARRAY_SIZE() which helps to further clean up. Signed-off-by: NAndy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NAndrew Morton <akpm@linux-foundation.org> Acked-by: NMinchan Kim <minchan@kernel.org> Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: http://lkml.kernel.org/r/20200508100758.51644-1-andriy.shevchenko@linux.intel.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 David Hildenbrand 提交于
Currently, when adding memory, we create entries in /sys/firmware/memmap/ as "System RAM". This will lead to kexec-tools to add that memory to the fixed-up initial memmap for a kexec kernel (loaded via kexec_load()). The memory will be considered initial System RAM by the kexec'd kernel and can no longer be reconfigured. This is not what happens during a real reboot. Let's add our memory via add_memory_driver_managed() now, so we won't create entries in /sys/firmware/memmap/ and indicate the memory as "System RAM (kmem)" in /proc/iomem. This allows everybody (especially kexec-tools) to identify that this memory is special and has to be treated differently than ordinary (hotplugged) System RAM. Before configuring the namespace: [root@localhost ~]# cat /proc/iomem ... 140000000-33fffffff : Persistent Memory 140000000-33fffffff : namespace0.0 3280000000-32ffffffff : PCI Bus 0000:00 After configuring the namespace: [root@localhost ~]# cat /proc/iomem ... 140000000-33fffffff : Persistent Memory 140000000-1481fffff : namespace0.0 148200000-33fffffff : dax0.0 3280000000-32ffffffff : PCI Bus 0000:00 After loading kmem before this change: [root@localhost ~]# cat /proc/iomem ... 140000000-33fffffff : Persistent Memory 140000000-1481fffff : namespace0.0 150000000-33fffffff : dax0.0 150000000-33fffffff : System RAM 3280000000-32ffffffff : PCI Bus 0000:00 After loading kmem after this change: [root@localhost ~]# cat /proc/iomem ... 140000000-33fffffff : Persistent Memory 140000000-1481fffff : namespace0.0 150000000-33fffffff : dax0.0 150000000-33fffffff : System RAM (kmem) 3280000000-32ffffffff : PCI Bus 0000:00 After a proper reboot: [root@localhost ~]# cat /proc/iomem ... 140000000-33fffffff : Persistent Memory 140000000-1481fffff : namespace0.0 148200000-33fffffff : dax0.0 3280000000-32ffffffff : PCI Bus 0000:00 Within the kexec kernel before this change: [root@localhost ~]# cat /proc/iomem ... 140000000-33fffffff : Persistent Memory 140000000-1481fffff : namespace0.0 150000000-33fffffff : System RAM 3280000000-32ffffffff : PCI Bus 0000:00 Within the kexec kernel after this change: [root@localhost ~]# cat /proc/iomem ... 140000000-33fffffff : Persistent Memory 140000000-1481fffff : namespace0.0 148200000-33fffffff : dax0.0 3280000000-32ffffffff : PCI Bus 0000:00 /sys/firmware/memmap/ before this change: 0000000000000000-000000000009fc00 (System RAM) 000000000009fc00-00000000000a0000 (Reserved) 00000000000f0000-0000000000100000 (Reserved) 0000000000100000-00000000bffdf000 (System RAM) 00000000bffdf000-00000000c0000000 (Reserved) 00000000feffc000-00000000ff000000 (Reserved) 00000000fffc0000-0000000100000000 (Reserved) 0000000100000000-0000000140000000 (System RAM) 0000000150000000-0000000340000000 (System RAM) /sys/firmware/memmap/ after a proper reboot: 0000000000000000-000000000009fc00 (System RAM) 000000000009fc00-00000000000a0000 (Reserved) 00000000000f0000-0000000000100000 (Reserved) 0000000000100000-00000000bffdf000 (System RAM) 00000000bffdf000-00000000c0000000 (Reserved) 00000000feffc000-00000000ff000000 (Reserved) 00000000fffc0000-0000000100000000 (Reserved) 0000000100000000-0000000140000000 (System RAM) /sys/firmware/memmap/ after this change: 0000000000000000-000000000009fc00 (System RAM) 000000000009fc00-00000000000a0000 (Reserved) 00000000000f0000-0000000000100000 (Reserved) 0000000000100000-00000000bffdf000 (System RAM) 00000000bffdf000-00000000c0000000 (Reserved) 00000000feffc000-00000000ff000000 (Reserved) 00000000fffc0000-0000000100000000 (Reserved) 0000000100000000-0000000140000000 (System RAM) kexec-tools already seem to basically ignore any System RAM that's not on top level when searching for areas to place kexec images - but also for determining crash areas to dump via kdump. Changing the resource name won't have an impact. Handle unloading of the driver after memory hotremove failed properly, by duplicating the string if necessary. Signed-off-by: NDavid Hildenbrand <david@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Acked-by: NPankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Baoquan He <bhe@redhat.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Dan Williams <dan.j.williams@intel.com> Link: http://lkml.kernel.org/r/20200508084217.9160-5-david@redhat.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Ira Weiny 提交于
kmap_atomic_prot() is now exported by all architectures. Use this function rather than open coding a driver specific kmap_atomic. [arnd@arndb.de: include linux/highmem.h] Link: http://lkml.kernel.org/r/20200508220150.649044-1-arnd@arndb.deSigned-off-by: NIra Weiny <ira.weiny@intel.com> Signed-off-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Acked-by: NDaniel Vetter <daniel.vetter@ffwll.ch> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andy Lutomirski <luto@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Chris Zankel <chris@zankel.net> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Helge Deller <deller@gmx.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20200507150004.1423069-12-ira.weiny@intel.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Andrey Konovalov 提交于
This patch adds kcov_remote_start/stop() callbacks around the urb complete() callback that is executed in softirq context when dummy_hcd is in use. As the result, kcov can be used to collect coverage from those callbacks, which is used to facilitate coverage-guided fuzzing with syzkaller. Signed-off-by: NAndrey Konovalov <andreyknvl@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NDmitry Vyukov <dvyukov@google.com> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Marco Elver <elver@google.com> Link: http://lkml.kernel.org/r/4520671eeb604adbc2432c248b0c07fbaa5519ef.1585233617.git.andreyknvl@google.com Link: http://lkml.kernel.org/r/2821d497ac1cdc0efb5e00df30271e4a67fc8009.1584655448.git.andreyknvl@google.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Mikulas Patocka 提交于
There is no user for this interface. If in future it is needed it can be reimplemented to walk the rbtree of buffers instead of doing block-by-block lookups. Signed-off-by: NMikulas Patocka <mpatocka@redhat.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
- 04 6月, 2020 8 次提交
-
-
由 Kunihiko Hayashi 提交于
Add driver for the Socionext UniPhier Pro5 SoC endpoint controller. This controller is based on the DesignWare PCIe core. And add "host" to existing controller descriontions for the host controller in Kconfig. Link: https://lore.kernel.org/r/1589457801-12796-3-git-send-email-hayashi.kunihiko@socionext.comSigned-off-by: NKunihiko Hayashi <hayashi.kunihiko@socionext.com> Signed-off-by: NLorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: NRob Herring <robh@kernel.org>
-
由 Linus Torvalds 提交于
The atomisp_mrfld_power() function isn't actually ever called, because the two call-sites have commented out the use because it breaks on some platforms. That results in: drivers/staging/media/atomisp/pci/atomisp_v4l2.c:764:12: warning: ‘atomisp_mrfld_power’ defined but not used [-Wunused-function] 764 | static int atomisp_mrfld_power(struct atomisp_device *isp, bool enable) | ^~~~~~~~~~~~~~~~~~~ during the build. Rather than commenting out the use entirely, just disable it semantically instead (using a "0 &&" construct), leaving the call in place from a syntax standpoint, and avoiding the warning. I really don't want my builds to have any warnings that can then hide real issues. Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Scott Cheloha 提交于
Searching for a particular memory block by id is an O(n) operation because each memory block's underlying device is kept in an unsorted linked list on the subsystem bus. We can cut the lookup cost to O(log n) if we cache each memory block in an xarray. This time complexity improvement is significant on systems with many memory blocks. For example: 1. A 128GB POWER9 VM with 256MB memblocks has 512 blocks. With this change memory_dev_init() completes ~12ms faster and walk_memory_blocks() completes ~12ms faster. Before: [ 0.005042] memory_dev_init: adding memory blocks [ 0.021591] memory_dev_init: added memory blocks [ 0.022699] walk_memory_blocks: walking memory blocks [ 0.038730] walk_memory_blocks: walked memory blocks 0-511 After: [ 0.005057] memory_dev_init: adding memory blocks [ 0.009415] memory_dev_init: added memory blocks [ 0.010519] walk_memory_blocks: walking memory blocks [ 0.014135] walk_memory_blocks: walked memory blocks 0-511 2. A 256GB POWER9 LPAR with 256MB memblocks has 1024 blocks. With this change memory_dev_init() completes ~88ms faster and walk_memory_blocks() completes ~87ms faster. Before: [ 0.252246] memory_dev_init: adding memory blocks [ 0.395469] memory_dev_init: added memory blocks [ 0.409413] walk_memory_blocks: walking memory blocks [ 0.433028] walk_memory_blocks: walked memory blocks 0-511 [ 0.433094] walk_memory_blocks: walking memory blocks [ 0.500244] walk_memory_blocks: walked memory blocks 131072-131583 After: [ 0.245063] memory_dev_init: adding memory blocks [ 0.299539] memory_dev_init: added memory blocks [ 0.313609] walk_memory_blocks: walking memory blocks [ 0.315287] walk_memory_blocks: walked memory blocks 0-511 [ 0.315349] walk_memory_blocks: walking memory blocks [ 0.316988] walk_memory_blocks: walked memory blocks 131072-131583 3. A 32TB POWER9 LPAR with 256MB memblocks has 131072 blocks. With this change we complete memory_dev_init() ~37 minutes faster and walk_memory_blocks() at least ~30 minutes faster. The exact timing for walk_memory_blocks() is missing, though I observed that the soft lockups in walk_memory_blocks() disappeared with the change, suggesting that lower bound. Before: [ 13.703907] memory_dev_init: adding blocks [ 2287.406099] memory_dev_init: added all blocks [ 2347.494986] [c000000014c5bb60] [c000000000869af4] walk_memory_blocks+0x94/0x160 [ 2527.625378] [c000000014c5bb60] [c000000000869af4] walk_memory_blocks+0x94/0x160 [ 2707.761977] [c000000014c5bb60] [c000000000869af4] walk_memory_blocks+0x94/0x160 [ 2887.899975] [c000000014c5bb60] [c000000000869af4] walk_memory_blocks+0x94/0x160 [ 3068.028318] [c000000014c5bb60] [c000000000869af4] walk_memory_blocks+0x94/0x160 [ 3248.158764] [c000000014c5bb60] [c000000000869af4] walk_memory_blocks+0x94/0x160 [ 3428.287296] [c000000014c5bb60] [c000000000869af4] walk_memory_blocks+0x94/0x160 [ 3608.425357] [c000000014c5bb60] [c000000000869af4] walk_memory_blocks+0x94/0x160 [ 3788.554572] [c000000014c5bb60] [c000000000869af4] walk_memory_blocks+0x94/0x160 [ 3968.695071] [c000000014c5bb60] [c000000000869af4] walk_memory_blocks+0x94/0x160 [ 4148.823970] [c000000014c5bb60] [c000000000869af4] walk_memory_blocks+0x94/0x160 After: [ 13.696898] memory_dev_init: adding blocks [ 15.660035] memory_dev_init: added all blocks (the walk_memory_blocks traces disappear) There should be no significant negative impact for machines with few memory blocks. A sparse xarray has a small footprint and an O(log n) lookup is negligibly slower than an O(n) lookup for only the smallest number of memory blocks. 1. A 16GB x86 machine with 128MB memblocks has 132 blocks. With this change memory_dev_init() completes ~300us faster and walk_memory_blocks() completes no faster or slower. The improvement is pretty close to noise. Before: [ 0.224752] memory_dev_init: adding memory blocks [ 0.227116] memory_dev_init: added memory blocks [ 0.227183] walk_memory_blocks: walking memory blocks [ 0.227183] walk_memory_blocks: walked memory blocks 0-131 After: [ 0.224911] memory_dev_init: adding memory blocks [ 0.226935] memory_dev_init: added memory blocks [ 0.227089] walk_memory_blocks: walking memory blocks [ 0.227089] walk_memory_blocks: walked memory blocks 0-131 [david@redhat.com: document the locking] Link: http://lkml.kernel.org/r/bc21eec6-7251-4c91-2f57-9a0671f8d414@redhat.comSigned-off-by: NScott Cheloha <cheloha@linux.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Acked-by: NDavid Hildenbrand <david@redhat.com> Acked-by: NNathan Lynch <nathanl@linux.ibm.com> Acked-by: NMichal Hocko <mhocko@suse.com> Cc: Rafael J. Wysocki <rafael@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Rick Lindsley <ricklind@linux.vnet.ibm.com> Cc: Scott Cheloha <cheloha@linux.ibm.com> Link: http://lkml.kernel.org/r/20200121231028.13699-1-cheloha@linux.ibm.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 John Hubbard 提交于
This code was using get_user_pages*(), in a "Case 2" scenario (DMA/RDMA), using the categorization from [1]. That means that it's time to convert the get_user_pages*() + put_page() calls to pin_user_pages*() + unpin_user_pages() calls. There is some helpful background in [2]: basically, this is a small part of fixing a long-standing disconnect between pinning pages, and file systems' use of those pages. [1] Documentation/core-api/pin_user_pages.rst [2] "Explicit pinning of user-space pages": https://lwn.net/Articles/807108/Signed-off-by: NJohn Hubbard <jhubbard@nvidia.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Souptick Joarder <jrdr.linux@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: "Joonas Lahtinen" <joonas.lahtinen@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Link: http://lkml.kernel.org/r/20200519002124.2025955-5-jhubbard@nvidia.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Ka-Cheong Poon 提交于
If the cm_id state is IB_CM_REP_SENT when cm_destroy_id() is called, it calls cm_send_rej_locked(). In cm_send_rej_locked(), it calls cm_enter_timewait() and the state is changed to IB_CM_TIMEWAIT. Now back to cm_destroy_id(), it breaks from the switch statement, and the next call is WARN_ON(cm_id->state != IB_CM_IDLE). This triggers a spurious warning. Instead, the code should goto retest after returning from cm_send_rej_locked() to move the state to IDLE. Fixes: 67b3c8dc ("RDMA/cm: Make sure the cm_id is in the IB_CM_IDLE state in destroy") Link: https://lore.kernel.org/r/1591191218-9446-1-git-send-email-ka-cheong.poon@oracle.comSigned-off-by: NKa-Cheong Poon <ka-cheong.poon@oracle.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Leon Romanovsky 提交于
The DC QPs are many-to-one QP types that means that first connection will establish ECE options that coming connections should follow. Due to this property, the ECE code was removed between first [1] and second [2] ECE submissions. This patch returns the dropped code, because ECE is a property of a connection and like any other connection users are needed to manage this data. Allow them to set ECE parameter for DC too and avoid need of having compatibility flag for the DC ECE. [1] https://lore.kernel.org/linux-rdma/20200523132243.817936-1-leon@kernel.org/ [2] https://lore.kernel.org/linux-rdma/20200525174401.71152-1-leon@kernel.org/ Link: https://lore.kernel.org/r/20200602125548.172654-4-leon@kernel.orgSigned-off-by: NLeon Romanovsky <leonro@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Leon Romanovsky 提交于
The FW returns zeros in case feature is not enabled, but it is better to have the capability check and ensure that returned result is cleared. Fixes: 3e09a427 ("RDMA/mlx5: Get ECE options from FW during create QP") Link: https://lore.kernel.org/r/20200602125548.172654-3-leon@kernel.orgSigned-off-by: NLeon Romanovsky <leonro@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Leon Romanovsky 提交于
In theoretical event, the ib_copy_to_udata() can fail, so return -EFAULT error to the user, so he will destroy the QP. Fixes: 50aec2c3 ("RDMA/mlx5: Return ECE data after modify QP") Link: https://lore.kernel.org/r/20200602125548.172654-2-leon@kernel.orgSigned-off-by: NLeon Romanovsky <leonro@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
- 03 6月, 2020 1 次提交
-
-
由 Andy Shevchenko 提交于
ACPI_PTR() becomes a no-op when !CONFIG_ACPI. This is not needed since we always have ID table enabled. Moreover, in the mentioned case compiler will complain about defined but not used variable. Signed-off-by: NAndy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20200520211916.25727-3-andriy.shevchenko@linux.intel.comSigned-off-by: NLinus Walleij <linus.walleij@linaro.org>
-