- 30 11月, 2019 3 次提交
-
-
由 Jiri Kosina 提交于
- removal of superfluous delay (You-Sheng Yang)
-
由 Jiri Kosina 提交于
- printk() -> pr_*() cleanup (Rishi Gupta)
-
由 Jiri Kosina 提交于
- hid_have_special_driver[] cleanup for LED devices (Heiner Kallweit) - HID parser improvements (Blaž Hrastnik, Candle Sun)
-
- 18 11月, 2019 3 次提交
-
-
由 Andrew Duggan 提交于
In the event that the RMI device is unreachable, the calls to rmi_set_mode() or rmi_set_page() will fail before registering the RMI transport device. When the device is removed, rmi_remove() will call rmi_unregister_transport_device() which will attempt to access the rmi_dev pointer which was not set. This patch adds a check of the RMI_STARTED bit before calling rmi_unregister_transport_device(). The RMI_STARTED bit is only set after rmi_register_transport_device() completes successfully. The kernel oops was reported in this message: https://www.spinics.net/lists/linux-input/msg58433.html [jkosina@suse.cz: reworded changelog as agreed with Andrew] Signed-off-by: NAndrew Duggan <aduggan@synaptics.com> Reported-by: NFederico Cerutti <federico@ceres-c.it> Signed-off-by: NJiri Kosina <jkosina@suse.cz>
-
由 Heiner Kallweit 提交于
Since e04a0442 ("HID: core: remove the absolute need of hid_have_special_driver[]") it's no longer needed to list these LED devices in hid_have_special_driver[]. This allows libraries needing access to the hidraw device to work properly. Signed-off-by: NHeiner Kallweit <hkallweit1@gmail.com> Signed-off-by: NBenjamin Tissoires <benjamin.tissoires@redhat.com>
-
由 Blaž Hrastnik 提交于
Per Microsoft spec, usage 0xC5 (page 0xFF) returns a blob containing data used to verify the touchpad as a Windows Precision Touchpad. 0x85, REPORTID_PTPHQA, // REPORT_ID (PTPHQA) 0x09, 0xC5, // USAGE (Vendor Usage 0xC5) 0x15, 0x00, // LOGICAL_MINIMUM (0) 0x26, 0xff, 0x00, // LOGICAL_MAXIMUM (0xff) 0x75, 0x08, // REPORT_SIZE (8) 0x96, 0x00, 0x01, // REPORT_COUNT (0x100 (256)) 0xb1, 0x02, // FEATURE (Data,Var,Abs) However, some devices, namely Microsoft's Surface line of products instead implement a "segmented device certification report" (usage 0xC6) which returns the same report, but in smaller chunks. 0x06, 0x00, 0xff, // USAGE_PAGE (Vendor Defined) 0x85, REPORTID_PTPHQA, // REPORT_ID (PTPHQA) 0x09, 0xC6, // USAGE (Vendor usage for segment #) 0x25, 0x08, // LOGICAL_MAXIMUM (8) 0x75, 0x08, // REPORT_SIZE (8) 0x95, 0x01, // REPORT_COUNT (1) 0xb1, 0x02, // FEATURE (Data,Var,Abs) 0x09, 0xC7, // USAGE (Vendor Usage) 0x26, 0xff, 0x00, // LOGICAL_MAXIMUM (0xff) 0x95, 0x20, // REPORT_COUNT (32) 0xb1, 0x02, // FEATURE (Data,Var,Abs) By expanding Win8 touchpad detection to also look for the segmented report, all Surface touchpads are now properly recognized by hid-multitouch. Signed-off-by: NBlaž Hrastnik <blaz@mxxn.io> Signed-off-by: NBenjamin Tissoires <benjamin.tissoires@redhat.com>
-
- 15 11月, 2019 3 次提交
-
-
由 Kai-Heng Feng 提交于
Commit 52cf93e6 ("HID: i2c-hid: Don't reset device upon system resume") fixes many touchpads and touchscreens, however ALPS touchpads start to trigger IRQ storm after system resume. Since it's total silence from ALPS, let's bring the old behavior back to ALPS touchpads. Fixes: 52cf93e6 ("HID: i2c-hid: Don't reset device upon system resume") Signed-off-by: NKai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: NJiri Kosina <jkosina@suse.cz>
-
由 Aaron Ma 提交于
On some ThinkPad L390 some raydium 3118 touchscreen devices doesn't response any data after reset, but some does. Add this ID to no irq quirk, then don't wait for any response alike on these touchscreens. All kinds of raydium 3118 devices work fine. BugLink: https://bugs.launchpad.net/bugs/1849721Signed-off-by: NAaron Ma <aaron.ma@canonical.com> Signed-off-by: NJiri Kosina <jkosina@suse.cz>
-
由 You-Sheng Yang 提交于
This was introduced in commit 00b790ea ("HID: i2c-hid: Add a small delay after sleep command for Raydium touchpanel") which has been effectively reverted by commit 67b18dfb ("HID: i2c-hid: Remove runtime power management"). Signed-off-by: NYou-Sheng Yang <vicamo.yang@canonical.com> Signed-off-by: NJiri Kosina <jkosina@suse.cz>
-
- 14 11月, 2019 1 次提交
-
-
由 Jinke Fan 提交于
The PixArt OEM mouse disconnets/reconnects every minute on Linux. All contents of dmesg are repetitive: [ 1465.810014] usb 1-2.2: USB disconnect, device number 20 [ 1467.431509] usb 1-2.2: new low-speed USB device number 21 using xhci_hcd [ 1467.654982] usb 1-2.2: New USB device found, idVendor=03f0,idProduct=1f4a, bcdDevice= 1.00 [ 1467.654985] usb 1-2.2: New USB device strings: Mfr=1, Product=2,SerialNumber=0 [ 1467.654987] usb 1-2.2: Product: HP USB Optical Mouse [ 1467.654988] usb 1-2.2: Manufacturer: PixArt [ 1467.699722] input: PixArt HP USB Optical Mouse as /devices/pci0000:00/0000:00:07.1/0000:05:00.3/usb1/1-2/1-2.2/1-2.2:1.0/0003:03F0:1F4A.0012/input/input19 [ 1467.700124] hid-generic 0003:03F0:1F4A.0012: input,hidraw0: USB HID v1.11 Mouse [PixArt HP USB Optical Mouse] on usb-0000:05:00.3-2.2/input0 So add HID_QUIRK_ALWAYS_POLL for this one as well. Test the patch, the mouse is no longer disconnected and there are no duplicate logs in dmesg. Reference: https://github.com/sriemer/fix-linux-mouseSigned-off-by: NJinke Fan <fanjinke@hygon.cn> Signed-off-by: NJiri Kosina <jkosina@suse.cz>
-
- 12 11月, 2019 1 次提交
-
-
由 Candle Sun 提交于
Upstream commit 58e75155 ("HID: core: move Usage Page concatenation to Main item") adds support for Usage Page item after Usage ID items (such as keyboards manufactured by Primax). Usage Page concatenation in Main item works well for following report descriptor patterns: USAGE_PAGE (Keyboard) 05 07 USAGE_MINIMUM (Keyboard LeftControl) 19 E0 USAGE_MAXIMUM (Keyboard Right GUI) 29 E7 LOGICAL_MINIMUM (0) 15 00 LOGICAL_MAXIMUM (1) 25 01 REPORT_SIZE (1) 75 01 REPORT_COUNT (8) 95 08 INPUT (Data,Var,Abs) 81 02 ------------- USAGE_MINIMUM (Keyboard LeftControl) 19 E0 USAGE_MAXIMUM (Keyboard Right GUI) 29 E7 LOGICAL_MINIMUM (0) 15 00 LOGICAL_MAXIMUM (1) 25 01 REPORT_SIZE (1) 75 01 REPORT_COUNT (8) 95 08 USAGE_PAGE (Keyboard) 05 07 INPUT (Data,Var,Abs) 81 02 But it makes the parser act wrong for the following report descriptor pattern(such as some Gamepads): USAGE_PAGE (Button) 05 09 USAGE (Button 1) 09 01 USAGE (Button 2) 09 02 USAGE (Button 4) 09 04 USAGE (Button 5) 09 05 USAGE (Button 7) 09 07 USAGE (Button 8) 09 08 USAGE (Button 14) 09 0E USAGE (Button 15) 09 0F USAGE (Button 13) 09 0D USAGE_PAGE (Consumer Devices) 05 0C USAGE (Back) 0a 24 02 USAGE (HomePage) 0a 23 02 LOGICAL_MINIMUM (0) 15 00 LOGICAL_MAXIMUM (1) 25 01 REPORT_SIZE (1) 75 01 REPORT_COUNT (11) 95 0B INPUT (Data,Var,Abs) 81 02 With Usage Page concatenation in Main item, parser recognizes all the 11 Usages as consumer keys, it is not the HID device's real intention. This patch checks whether Usage Page is really defined after Usage ID items by comparing usage page using status. Usage Page concatenation on currently defined Usage Page will always do in local parsing when Usage ID items encountered. When Main item is parsing, concatenation will do again with last defined Usage Page if this page has not been used in the previous usages concatenation. Signed-off-by: NCandle Sun <candle.sun@unisoc.com> Signed-off-by: NNianfu Bai <nianfu.bai@unisoc.com> Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: NJiri Kosina <jkosina@suse.cz>
-
- 08 11月, 2019 1 次提交
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid由 Linus Torvalds 提交于
Pull HID fixes from Jiri Kosina: "Two fixes for the HID subsystem: - regression fix for i2c-hid power management (Hans de Goede) - signed vs unsigned API fix for Wacom driver (Jason Gerecke)" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: HID: wacom: generic: Treat serial number and related fields as unsigned HID: i2c-hid: Send power-on command after reset
-
- 07 11月, 2019 19 次提交
-
-
由 Jason Gerecke 提交于
The HID descriptors for most Wacom devices oddly declare the serial number and other related fields as signed integers. When these numbers are ingested by the HID subsystem, they are automatically sign-extended into 32-bit integers. We treat the fields as unsigned elsewhere in the kernel and userspace, however, so this sign-extension causes problems. In particular, the sign-extended tool ID sent to userspace as ABS_MISC does not properly match unsigned IDs used by xf86-input-wacom and libwacom. We introduce a function 'wacom_s32tou' that can undo the automatic sign extension performed by 'hid_snto32'. We call this function when processing the serial number and related fields to ensure that we are dealing with and reporting the unsigned form. We opt to use this method rather than adding a descriptor fixup in 'wacom_hid_usage_quirk' since it should be more robust in the face of future devices. Ref: https://github.com/linuxwacom/input-wacom/issues/134 Fixes: f85c9dc6 ("HID: wacom: generic: Support tool ID and additional tool types") CC: <stable@vger.kernel.org> # v4.10+ Signed-off-by: NJason Gerecke <jason.gerecke@wacom.com> Reviewed-by: NAaron Armstrong Skomra <aaron.skomra@wacom.com> Signed-off-by: NJiri Kosina <jkosina@suse.cz>
-
由 Linus Torvalds 提交于
Merge more fixes from Andrew Morton: "17 fixes" Mostly mm fixes and one ocfs2 locking fix. * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm: memcontrol: fix network errors from failing __GFP_ATOMIC charges mm/memory_hotplug: fix updating the node span scripts/gdb: fix debugging modules compiled with hot/cold partitioning mm: slab: make page_cgroup_ino() to recognize non-compound slab pages properly MAINTAINERS: update information for "MEMORY MANAGEMENT" dump_stack: avoid the livelock of the dump_lock zswap: add Vitaly to the maintainers list mm/page_alloc.c: ratelimit allocation failure warnings more aggressively mm/khugepaged: fix might_sleep() warn with CONFIG_HIGHPTE=y mm, vmstat: reduce zone->lock holding time by /proc/pagetypeinfo mm, vmstat: hide /proc/pagetypeinfo from normal users mm/mmu_notifiers: use the right return code for WARN_ON ocfs2: protect extent tree in ocfs2_prepare_inode_for_write() mm: thp: handle page cache THP correctly in PageTransCompoundMap mm, meminit: recalculate pcpu batch and high limits after init completes mm/gup_benchmark: fix MAP_HUGETLB case mm: memcontrol: fix NULL-ptr deref in percpu stats flush
-
由 Johannes Weiner 提交于
While upgrading from 4.16 to 5.2, we noticed these allocation errors in the log of the new kernel: SLUB: Unable to allocate memory on node -1, gfp=0xa20(GFP_ATOMIC) cache: tw_sock_TCPv6(960:helper-logs), object size: 232, buffer size: 240, default order: 1, min order: 0 node 0: slabs: 5, objs: 170, free: 0 slab_out_of_memory+1 ___slab_alloc+969 __slab_alloc+14 kmem_cache_alloc+346 inet_twsk_alloc+60 tcp_time_wait+46 tcp_fin+206 tcp_data_queue+2034 tcp_rcv_state_process+784 tcp_v6_do_rcv+405 __release_sock+118 tcp_close+385 inet_release+46 __sock_release+55 sock_close+17 __fput+170 task_work_run+127 exit_to_usermode_loop+191 do_syscall_64+212 entry_SYSCALL_64_after_hwframe+68 accompanied by an increase in machines going completely radio silent under memory pressure. One thing that changed since 4.16 is e699e2c6 ("net, mm: account sock objects to kmemcg"), which made these slab caches subject to cgroup memory accounting and control. The problem with that is that cgroups, unlike the page allocator, do not maintain dedicated atomic reserves. As a cgroup's usage hovers at its limit, atomic allocations - such as done during network rx - can fail consistently for extended periods of time. The kernel is not able to operate under these conditions. We don't want to revert the culprit patch, because it indeed tracks a potentially substantial amount of memory used by a cgroup. We also don't want to implement dedicated atomic reserves for cgroups. There is no point in keeping a fixed margin of unused bytes in the cgroup's memory budget to accomodate a consumer that is impossible to predict - we'd be wasting memory and get into configuration headaches, not unlike what we have going with min_free_kbytes. We do this for physical mem because we have to, but cgroups are an accounting game. Instead, account these privileged allocations to the cgroup, but let them bypass the configured limit if they have to. This way, we get the benefits of accounting the consumed memory and have it exert pressure on the rest of the cgroup, but like with the page allocator, we shift the burden of reclaimining on behalf of atomic allocations onto the regular allocations that can block. Link: http://lkml.kernel.org/r/20191022233708.365764-1-hannes@cmpxchg.org Fixes: e699e2c6 ("net, mm: account sock objects to kmemcg") Signed-off-by: NJohannes Weiner <hannes@cmpxchg.org> Reviewed-by: NShakeel Butt <shakeelb@google.com> Cc: Suleiman Souhlal <suleiman@google.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: <stable@vger.kernel.org> [4.18+] Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 David Hildenbrand 提交于
We recently started updating the node span based on the zone span to avoid touching uninitialized memmaps. Currently, we will always detect the node span to start at 0, meaning a node can easily span too many pages. pgdat_is_empty() will still work correctly if all zones span no pages. We should skip over all zones without spanned pages and properly handle the first detected zone that spans pages. Unfortunately, in contrast to the zone span (/proc/zoneinfo), the node span cannot easily be inspected and tested. The node span gives no real guarantees when an architecture supports memory hotplug, meaning it can easily contain holes or span pages of different nodes. The node span is not really used after init on architectures that support memory hotplug. E.g., we use it in mm/memory_hotplug.c:try_offline_node() and in mm/kmemleak.c:kmemleak_scan(). These users seem to be fine. Link: http://lkml.kernel.org/r/20191027222714.5313-1-david@redhat.com Fixes: 00d6c019 ("mm/memory_hotplug: don't access uninitialized memmaps in shrink_pgdat_span()") Signed-off-by: NDavid Hildenbrand <david@redhat.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Ilya Leoshkevich 提交于
gcc's -freorder-blocks-and-partition option makes it group frequently and infrequently used code in .text.hot and .text.unlikely sections respectively. At least when building modules on s390, this option is used by default. gdb assumes that all code is located in .text section, and that .text section is located at module load address. With such modules this is no longer the case: there is code in .text.hot and .text.unlikely, and either of them might precede .text. Fix by explicitly telling gdb the addresses of code sections. It might be tempting to do this for all sections, not only the ones in the white list. Unfortunately, gdb appears to have an issue, when telling it about e.g. loadable .note.gnu.build-id section causes it to think that non-loadable .note.Linux section is loaded at address 0, which in turn causes NULL pointers to be resolved to bogus symbols. So keep using the white list approach for the time being. Link: http://lkml.kernel.org/r/20191028152734.13065-1-iii@linux.ibm.comSigned-off-by: NIlya Leoshkevich <iii@linux.ibm.com> Reviewed-by: NJan Kiszka <jan.kiszka@siemens.com> Cc: Kieran Bingham <kbingham@kernel.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Roman Gushchin 提交于
page_cgroup_ino() doesn't return a valid memcg pointer for non-compound slab pages, because it depends on PgHead AND PgSlab flags to be set to determine the memory cgroup from the kmem_cache. It's correct for compound pages, but not for generic small pages. Those don't have PgHead set, so it ends up returning zero. Fix this by replacing the condition to PageSlab() && !PageTail(). Before this patch: [root@localhost ~]# ./page-types -c /sys/fs/cgroup/user.slice/user-0.slice/user@0.service/ | grep slab 0x0000000000000080 38 0 _______S___________________________________ slab After this patch: [root@localhost ~]# ./page-types -c /sys/fs/cgroup/user.slice/user-0.slice/user@0.service/ | grep slab 0x0000000000000080 147 0 _______S___________________________________ slab Also, hwpoison_filter_task() uses output of page_cgroup_ino() in order to filter error injection events based on memcg. So if page_cgroup_ino() fails to return memcg pointer, we just fail to inject memory error. Considering that hwpoison filter is for testing, affected users are limited and the impact should be marginal. [n-horiguchi@ah.jp.nec.com: changelog additions] Link: http://lkml.kernel.org/r/20191031012151.2722280-1-guro@fb.com Fixes: 4d96ba35 ("mm: memcg/slab: stop setting page->mem_cgroup pointer for slab pages") Signed-off-by: NRoman Gushchin <guro@fb.com> Reviewed-by: NShakeel Butt <shakeelb@google.com> Acked-by: NDavid Rientjes <rientjes@google.com> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Song Liu 提交于
I was trying to find the mm tree in MAINTAINERS by searching "Morton". Unfortunately, I didn't find one. And I didn't even locate the MEMORY MANAGEMENT section quickly, because Andrew's name was not listed there. Thanks to Johannes who helped me find the mm tree. Let save other's time searching around by adding: M: Andrew Morton <akpm@linux-foundation.org> T: git git://github.com/hnaz/linux-mm.git [akpm@linux-foundation.org: add ozlabs.org quilt trees] Link: http://lkml.kernel.org/r/20191030202217.3498133-1-songliubraving@fb.comSigned-off-by: NSong Liu <songliubraving@fb.com> Acked-by: NAndrew Morton <akpm@linux-foundation.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Kevin Hao 提交于
In the current code, we use the atomic_cmpxchg() to serialize the output of the dump_stack(), but this implementation suffers the thundering herd problem. We have observed such kind of livelock on a Marvell cn96xx board(24 cpus) when heavily using the dump_stack() in a kprobe handler. Actually we can let the competitors to wait for the releasing of the lock before jumping to atomic_cmpxchg(). This will definitely mitigate the thundering herd problem. Thanks Linus for the suggestion. [akpm@linux-foundation.org: fix comment] Link: http://lkml.kernel.org/r/20191030031637.6025-1-haokexin@gmail.com Fixes: b58d9774 ("dump_stack: serialize the output from dump_stack()") Signed-off-by: NKevin Hao <haokexin@gmail.com> Suggested-by: NLinus Torvalds <torvalds@linux-foundation.org> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Vitaly Wool 提交于
Per conversation with Dan, add myself to the zswap MAINTAINERS list. Link: http://lkml.kernel.org/r/20191028143154.31304-1-vitaly.wool@konsulko.comSigned-off-by: NVitaly Wool <vitaly.wool@konsulko.com> Acked-by: NDan Streetman <ddstreet@ieee.org> Acked-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Johannes Weiner 提交于
While investigating a bug related to higher atomic allocation failures, we noticed the failure warnings positively drowning the console, and in our case trigger lockup warnings because of a serial console too slow to handle all that output. But even if we had a faster console, it's unclear what additional information the current level of repetition provides. Allocation failures happen for three reasons: The machine is OOM, the VM is failing to handle reasonable requests, or somebody is making unreasonable requests (and didn't acknowledge their opportunism with __GFP_NOWARN). Having the memory dump, a callstack, and the ratelimit stats on skipped failure warnings should provide enough information to let users/admins/developers know whether something is wrong and point them in the right direction for debugging, bpftracing etc. Limit allocation failure warnings to one spew every ten seconds. Link: http://lkml.kernel.org/r/20191028194906.26899-1-hannes@cmpxchg.orgSigned-off-by: NJohannes Weiner <hannes@cmpxchg.org> Acked-by: NDavid Rientjes <rientjes@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Ville Syrjälä 提交于
I got some khugepaged spew on a 32bit x86: BUG: sleeping function called from invalid context at include/linux/mmu_notifier.h:346 in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 25, name: khugepaged INFO: lockdep is turned off. CPU: 1 PID: 25 Comm: khugepaged Not tainted 5.4.0-rc5-elk+ #206 Hardware name: System manufacturer P5Q-EM/P5Q-EM, BIOS 2203 07/08/2009 Call Trace: dump_stack+0x66/0x8e ___might_sleep.cold.96+0x95/0xa6 __might_sleep+0x2e/0x80 collapse_huge_page.isra.51+0x5ac/0x1360 khugepaged+0x9a9/0x20f0 kthread+0xf5/0x110 ret_from_fork+0x2e/0x38 Looks like it's due to CONFIG_HIGHPTE=y pte_offset_map()->kmap_atomic() vs. mmu_notifier_invalidate_range_start(). Let's do the naive approach and just reorder the two operations. Link: http://lkml.kernel.org/r/20191029201513.GG1208@intel.com Fixes: 810e24e0 ("mm/mmu_notifiers: annotate with might_sleep()") Signed-off-by: NVille Syrjl <ville.syrjala@linux.intel.com> Reviewed-by: NAndrew Morton <akpm@linux-foundation.org> Acked-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Jérôme Glisse <jglisse@redhat.com> Cc: Ralph Campbell <rcampbell@nvidia.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Jason Gunthorpe <jgg@mellanox.com> Cc: Daniel Vetter <daniel.vetter@intel.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Michal Hocko 提交于
pagetypeinfo_showfree_print is called by zone->lock held in irq mode. This is not really nice because it blocks both any interrupts on that cpu and the page allocator. On large machines this might even trigger the hard lockup detector. Considering the pagetypeinfo is a debugging tool we do not really need exact numbers here. The primary reason to look at the outuput is to see how pageblocks are spread among different migratetypes and low number of pages is much more interesting therefore putting a bound on the number of pages on the free_list sounds like a reasonable tradeoff. The new output will simply tell [...] Node 6, zone Normal, type Movable >100000 >100000 >100000 >100000 41019 31560 23996 10054 3229 983 648 instead of Node 6, zone Normal, type Movable 399568 294127 221558 102119 41019 31560 23996 10054 3229 983 648 The limit has been chosen arbitrary and it is a subject of a future change should there be a need for that. While we are at it, also drop the zone lock after each free_list iteration which will help with the IRQ and page allocator responsiveness even further as the IRQ lock held time is always bound to those 100k pages. [akpm@linux-foundation.org: tweak comment text, per David Hildenbrand] Link: http://lkml.kernel.org/r/20191025072610.18526-3-mhocko@kernel.orgSigned-off-by: NMichal Hocko <mhocko@suse.com> Suggested-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NWaiman Long <longman@redhat.com> Acked-by: NVlastimil Babka <vbabka@suse.cz> Acked-by: NDavid Hildenbrand <david@redhat.com> Acked-by: NRafael Aquini <aquini@redhat.com> Acked-by: NDavid Rientjes <rientjes@google.com> Reviewed-by: NAndrew Morton <akpm@linux-foundation.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jann Horn <jannh@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Cc: Mel Gorman <mgorman@suse.de> Cc: Roman Gushchin <guro@fb.com> Cc: Song Liu <songliubraving@fb.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Michal Hocko 提交于
/proc/pagetypeinfo is a debugging tool to examine internal page allocator state wrt to fragmentation. It is not very useful for any other use so normal users really do not need to read this file. Waiman Long has noticed that reading this file can have negative side effects because zone->lock is necessary for gathering data and that a) interferes with the page allocator and its users and b) can lead to hard lockups on large machines which have very long free_list. Reduce both issues by simply not exporting the file to regular users. Link: http://lkml.kernel.org/r/20191025072610.18526-2-mhocko@kernel.org Fixes: 467c996c ("Print out statistics in relation to fragmentation avoidance to /proc/pagetypeinfo") Signed-off-by: NMichal Hocko <mhocko@suse.com> Reported-by: NWaiman Long <longman@redhat.com> Acked-by: NMel Gorman <mgorman@suse.de> Acked-by: NVlastimil Babka <vbabka@suse.cz> Acked-by: NWaiman Long <longman@redhat.com> Acked-by: NRafael Aquini <aquini@redhat.com> Acked-by: NDavid Rientjes <rientjes@google.com> Reviewed-by: NAndrew Morton <akpm@linux-foundation.org> Cc: David Hildenbrand <david@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Roman Gushchin <guro@fb.com> Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Cc: Jann Horn <jannh@google.com> Cc: Song Liu <songliubraving@fb.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Jason Gunthorpe 提交于
The return code from the op callback is actually in _ret, while the WARN_ON was checking ret which causes it to misfire. Link: http://lkml.kernel.org/r/20191025175502.GA31127@ziepe.ca Fixes: 8402ce61 ("mm/mmu_notifiers: check if mmu notifier callbacks are allowed to fail") Signed-off-by: NJason Gunthorpe <jgg@mellanox.com> Reviewed-by: NAndrew Morton <akpm@linux-foundation.org> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Shuning Zhang 提交于
When the extent tree is modified, it should be protected by inode cluster lock and ip_alloc_sem. The extent tree is accessed and modified in the ocfs2_prepare_inode_for_write, but isn't protected by ip_alloc_sem. The following is a case. The function ocfs2_fiemap is accessing the extent tree, which is modified at the same time. kernel BUG at fs/ocfs2/extent_map.c:475! invalid opcode: 0000 [#1] SMP Modules linked in: tun ocfs2 ocfs2_nodemanager configfs ocfs2_stackglue [...] CPU: 16 PID: 14047 Comm: o2info Not tainted 4.1.12-124.23.1.el6uek.x86_64 #2 Hardware name: Oracle Corporation ORACLE SERVER X7-2L/ASM, MB MECH, X7-2L, BIOS 42040600 10/19/2018 task: ffff88019487e200 ti: ffff88003daa4000 task.ti: ffff88003daa4000 RIP: ocfs2_get_clusters_nocache.isra.11+0x390/0x550 [ocfs2] Call Trace: ocfs2_fiemap+0x1e3/0x430 [ocfs2] do_vfs_ioctl+0x155/0x510 SyS_ioctl+0x81/0xa0 system_call_fastpath+0x18/0xd8 Code: 18 48 c7 c6 60 7f 65 a0 31 c0 bb e2 ff ff ff 48 8b 4a 40 48 8b 7a 28 48 c7 c2 78 2d 66 a0 e8 38 4f 05 00 e9 28 fe ff ff 0f 1f 00 <0f> 0b 66 0f 1f 44 00 00 bb 86 ff ff ff e9 13 fe ff ff 66 0f 1f RIP ocfs2_get_clusters_nocache.isra.11+0x390/0x550 [ocfs2] ---[ end trace c8aa0c8180e869dc ]--- Kernel panic - not syncing: Fatal exception Kernel Offset: disabled This issue can be reproduced every week in a production environment. This issue is related to the usage mode. If others use ocfs2 in this mode, the kernel will panic frequently. [akpm@linux-foundation.org: coding style fixes] [Fix new warning due to unused function by removing said function - Linus ] Link: http://lkml.kernel.org/r/1568772175-2906-2-git-send-email-sunny.s.zhang@oracle.comSigned-off-by: NShuning Zhang <sunny.s.zhang@oracle.com> Reviewed-by: NJunxiao Bi <junxiao.bi@oracle.com> Reviewed-by: NGang He <ghe@suse.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Joseph Qi <jiangqi903@gmail.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Yang Shi 提交于
We have a usecase to use tmpfs as QEMU memory backend and we would like to take the advantage of THP as well. But, our test shows the EPT is not PMD mapped even though the underlying THP are PMD mapped on host. The number showed by /sys/kernel/debug/kvm/largepage is much less than the number of PMD mapped shmem pages as the below: 7f2778200000-7f2878200000 rw-s 00000000 00:14 262232 /dev/shm/qemu_back_mem.mem.Hz2hSf (deleted) Size: 4194304 kB [snip] AnonHugePages: 0 kB ShmemPmdMapped: 579584 kB [snip] Locked: 0 kB cat /sys/kernel/debug/kvm/largepages 12 And some benchmarks do worse than with anonymous THPs. By digging into the code we figured out that commit 127393fb ("mm: thp: kvm: fix memory corruption in KVM with THP enabled") checks if there is a single PTE mapping on the page for anonymous THP when setting up EPT map. But the _mapcount < 0 check doesn't work for page cache THP since every subpage of page cache THP would get _mapcount inc'ed once it is PMD mapped, so PageTransCompoundMap() always returns false for page cache THP. This would prevent KVM from setting up PMD mapped EPT entry. So we need handle page cache THP correctly. However, when page cache THP's PMD gets split, kernel just remove the map instead of setting up PTE map like what anonymous THP does. Before KVM calls get_user_pages() the subpages may get PTE mapped even though it is still a THP since the page cache THP may be mapped by other processes at the mean time. Checking its _mapcount and whether the THP has PTE mapped or not. Although this may report some false negative cases (PTE mapped by other processes), it looks not trivial to make this accurate. With this fix /sys/kernel/debug/kvm/largepage would show reasonable pages are PMD mapped by EPT as the below: 7fbeaee00000-7fbfaee00000 rw-s 00000000 00:14 275464 /dev/shm/qemu_back_mem.mem.SKUvat (deleted) Size: 4194304 kB [snip] AnonHugePages: 0 kB ShmemPmdMapped: 557056 kB [snip] Locked: 0 kB cat /sys/kernel/debug/kvm/largepages 271 And the benchmarks are as same as anonymous THPs. [yang.shi@linux.alibaba.com: v4] Link: http://lkml.kernel.org/r/1571865575-42913-1-git-send-email-yang.shi@linux.alibaba.com Link: http://lkml.kernel.org/r/1571769577-89735-1-git-send-email-yang.shi@linux.alibaba.com Fixes: dd78fedd ("rmap: support file thp") Signed-off-by: NYang Shi <yang.shi@linux.alibaba.com> Reported-by: NGang Deng <gavin.dg@linux.alibaba.com> Tested-by: NGang Deng <gavin.dg@linux.alibaba.com> Suggested-by: NHugh Dickins <hughd@google.com> Acked-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: <stable@vger.kernel.org> [4.8+] Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Mel Gorman 提交于
Deferred memory initialisation updates zone->managed_pages during the initialisation phase but before that finishes, the per-cpu page allocator (pcpu) calculates the number of pages allocated/freed in batches as well as the maximum number of pages allowed on a per-cpu list. As zone->managed_pages is not up to date yet, the pcpu initialisation calculates inappropriately low batch and high values. This increases zone lock contention quite severely in some cases with the degree of severity depending on how many CPUs share a local zone and the size of the zone. A private report indicated that kernel build times were excessive with extremely high system CPU usage. A perf profile indicated that a large chunk of time was lost on zone->lock contention. This patch recalculates the pcpu batch and high values after deferred initialisation completes for every populated zone in the system. It was tested on a 2-socket AMD EPYC 2 machine using a kernel compilation workload -- allmodconfig and all available CPUs. mmtests configuration: config-workload-kernbench-max Configuration was modified to build on a fresh XFS partition. kernbench 5.4.0-rc3 5.4.0-rc3 vanilla resetpcpu-v2 Amean user-256 13249.50 ( 0.00%) 16401.31 * -23.79%* Amean syst-256 14760.30 ( 0.00%) 4448.39 * 69.86%* Amean elsp-256 162.42 ( 0.00%) 119.13 * 26.65%* Stddev user-256 42.97 ( 0.00%) 19.15 ( 55.43%) Stddev syst-256 336.87 ( 0.00%) 6.71 ( 98.01%) Stddev elsp-256 2.46 ( 0.00%) 0.39 ( 84.03%) 5.4.0-rc3 5.4.0-rc3 vanilla resetpcpu-v2 Duration User 39766.24 49221.79 Duration System 44298.10 13361.67 Duration Elapsed 519.11 388.87 The patch reduces system CPU usage by 69.86% and total build time by 26.65%. The variance of system CPU usage is also much reduced. Before, this was the breakdown of batch and high values over all zones was: 256 batch: 1 256 batch: 63 512 batch: 7 256 high: 0 256 high: 378 512 high: 42 512 pcpu pagesets had a batch limit of 7 and a high limit of 42. After the patch: 256 batch: 1 768 batch: 63 256 high: 0 768 high: 378 [mgorman@techsingularity.net: fix merge/linkage snafu] Link: http://lkml.kernel.org/r/20191023084705.GD3016@techsingularity.netLink: http://lkml.kernel.org/r/20191021094808.28824-2-mgorman@techsingularity.netSigned-off-by: NMel Gorman <mgorman@techsingularity.net> Acked-by: NMichal Hocko <mhocko@suse.com> Acked-by: NVlastimil Babka <vbabka@suse.cz> Acked-by: NDavid Hildenbrand <david@redhat.com> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Qian Cai <cai@lca.pw> Cc: <stable@vger.kernel.org> [4.1+] Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 John Hubbard 提交于
The MAP_HUGETLB ("-H" option) of gup_benchmark fails: $ sudo ./gup_benchmark -H mmap: Invalid argument This is because gup_benchmark.c is passing in a file descriptor to mmap(), but the fd came from opening up the /dev/zero file. This confuses the mmap syscall implementation, which thinks that, if the caller did not specify MAP_ANONYMOUS, then the file must be a huge page file. So it attempts to verify that the file really is a huge page file, as you can see here: ksys_mmap_pgoff() { if (!(flags & MAP_ANONYMOUS)) { retval = -EINVAL; if (unlikely(flags & MAP_HUGETLB && !is_file_hugepages(file))) goto out_fput; /* THIS IS WHERE WE END UP */ else if (flags & MAP_HUGETLB) { ...proceed normally, /dev/zero is ok here... ...and of course is_file_hugepages() returns "false" for the /dev/zero file. The problem is that the user space program, gup_benchmark.c, really just wants anonymous memory here. The simplest way to get that is to pass MAP_ANONYMOUS whenever MAP_HUGETLB is specified, so that's what this patch does. Link: http://lkml.kernel.org/r/20191021212435.398153-2-jhubbard@nvidia.comSigned-off-by: NJohn Hubbard <jhubbard@nvidia.com> Reviewed-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NJérôme Glisse <jglisse@redhat.com> Cc: Keith Busch <keith.busch@intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Shakeel Butt 提交于
__mem_cgroup_free() can be called on the failure path in mem_cgroup_alloc(). However memcg_flush_percpu_vmstats() and memcg_flush_percpu_vmevents() which are called from __mem_cgroup_free() access the fields of memcg which can potentially be null if called from failure path from mem_cgroup_alloc(). Indeed syzbot has reported the following crash: kasan: CONFIG_KASAN_INLINE enabled kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] PREEMPT SMP KASAN CPU: 0 PID: 30393 Comm: syz-executor.1 Not tainted 5.4.0-rc2+ #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:memcg_flush_percpu_vmstats+0x4ae/0x930 mm/memcontrol.c:3436 Code: 05 41 89 c0 41 0f b6 04 24 41 38 c7 7c 08 84 c0 0f 85 5d 03 00 00 44 3b 05 33 d5 12 08 0f 83 e2 00 00 00 4c 89 f0 48 c1 e8 03 <42> 80 3c 28 00 0f 85 91 03 00 00 48 8b 85 10 fe ff ff 48 8b b0 90 RSP: 0018:ffff888095c27980 EFLAGS: 00010206 RAX: 0000000000000012 RBX: ffff888095c27b28 RCX: ffffc90008192000 RDX: 0000000000040000 RSI: ffffffff8340fae7 RDI: 0000000000000007 RBP: ffff888095c27be0 R08: 0000000000000000 R09: ffffed1013f0da33 R10: ffffed1013f0da32 R11: ffff88809f86d197 R12: fffffbfff138b760 R13: dffffc0000000000 R14: 0000000000000090 R15: 0000000000000007 FS: 00007f5027170700(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000710158 CR3: 00000000a7b18000 CR4: 00000000001406f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: __mem_cgroup_free+0x1a/0x190 mm/memcontrol.c:5021 mem_cgroup_free mm/memcontrol.c:5033 [inline] mem_cgroup_css_alloc+0x3a1/0x1ae0 mm/memcontrol.c:5160 css_create kernel/cgroup/cgroup.c:5156 [inline] cgroup_apply_control_enable+0x44d/0xc40 kernel/cgroup/cgroup.c:3119 cgroup_mkdir+0x899/0x11b0 kernel/cgroup/cgroup.c:5401 kernfs_iop_mkdir+0x14d/0x1d0 fs/kernfs/dir.c:1124 vfs_mkdir+0x42e/0x670 fs/namei.c:3807 do_mkdirat+0x234/0x2a0 fs/namei.c:3830 __do_sys_mkdir fs/namei.c:3846 [inline] __se_sys_mkdir fs/namei.c:3844 [inline] __x64_sys_mkdir+0x5c/0x80 fs/namei.c:3844 do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe Fixing this by moving the flush to mem_cgroup_free as there is no need to flush anything if we see failure in mem_cgroup_alloc(). Link: http://lkml.kernel.org/r/20191018165231.249872-1-shakeelb@google.com Fixes: bb65f89b ("mm: memcontrol: flush percpu vmevents before releasing memcg") Fixes: c350a99e ("mm: memcontrol: flush percpu vmstats before releasing memcg") Signed-off-by: NShakeel Butt <shakeelb@google.com> Reported-by: syzbot+515d5bcfe179cdf049b2@syzkaller.appspotmail.com Reviewed-by: NRoman Gushchin <guro@fb.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 06 11月, 2019 2 次提交
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux由 Linus Torvalds 提交于
Pull clone3 stack argument update from Christian Brauner: "This changes clone3() to do basic stack validation and to set up the stack depending on whether or not it is growing up or down. With clone3() the expectation is now very simply that the .stack argument points to the lowest address of the stack and that .stack_size specifies the initial stack size. This is diferent from legacy clone() where the "stack" argument had to point to the lowest or highest address of the stack depending on the architecture. clone3() was released with 5.3. Currently, it is not documented and very unclear to userspace how the stack and stack_size argument have to be passed. After talking to glibc folks we concluded that changing clone3() to determine stack direction and doing basic validation is the right course of action. Note, this is a potentially user visible change. In the very unlikely case, that it breaks someone's use-case we will revert. (And then e.g. place the new behavior under an appropriate flag.) Note that passing an empty stack will continue working just as before. Breaking someone's use-case is very unlikely. Neither glibc nor musl currently expose a wrapper for clone3(). There is currently also no real motivation for anyone to use clone3() directly. First, because using clone{3}() with stacks requires some assembly (see glibc and musl). Second, because it does not provide features that legacy clone() doesn't. New features for clone3() will first happen in v5.5 which is why v5.4 is still a good time to try and make that change now and backport it to v5.3. I did a codesearch on https://codesearch.debian.net, github, and gitlab and could not find any software currently relying directly on clone3(). I expect this to change once we land CLONE_CLEAR_SIGHAND which was a request coming from glibc at which point they'll likely start using it" * tag 'for-linus-2019-11-05' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: clone3: validate stack arguments
-
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio由 Linus Torvalds 提交于
Pull GPIO fixes from Linus Walleij: "More GPIO fixes! We found a late regression in the Intel Merrifield driver. Oh well. We fixed it up. - Fix a build error in the tools used for kselftest - A series of reverts to bring the Intel Merrifield back to working. We will likely unrevert the reverts for v5.5 but we can't have v5.4 broken" * tag 'gpio-v5.4-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: Revert "gpio: merrifield: Pass irqchip when adding gpiochip" Revert "gpio: merrifield: Restore use of irq_base" Revert "gpio: merrifield: Move hardware initialization to callback" tools: gpio: Use !building_out_of_srctree to determine srctree
-
- 05 11月, 2019 1 次提交
-
-
由 Christian Brauner 提交于
Validate the stack arguments and setup the stack depening on whether or not it is growing down or up. Legacy clone() required userspace to know in which direction the stack is growing and pass down the stack pointer appropriately. To make things more confusing microblaze uses a variant of the clone() syscall selected by CONFIG_CLONE_BACKWARDS3 that takes an additional stack_size argument. IA64 has a separate clone2() syscall which also takes an additional stack_size argument. Finally, parisc has a stack that is growing upwards. Userspace therefore has a lot nasty code like the following: #define __STACK_SIZE (8 * 1024 * 1024) pid_t sys_clone(int (*fn)(void *), void *arg, int flags, int *pidfd) { pid_t ret; void *stack; stack = malloc(__STACK_SIZE); if (!stack) return -ENOMEM; #ifdef __ia64__ ret = __clone2(fn, stack, __STACK_SIZE, flags | SIGCHLD, arg, pidfd); #elif defined(__parisc__) /* stack grows up */ ret = clone(fn, stack, flags | SIGCHLD, arg, pidfd); #else ret = clone(fn, stack + __STACK_SIZE, flags | SIGCHLD, arg, pidfd); #endif return ret; } or even crazier variants such as [3]. With clone3() we have the ability to validate the stack. We can check that when stack_size is passed, the stack pointer is valid and the other way around. We can also check that the memory area userspace gave us is fine to use via access_ok(). Furthermore, we probably should not require userspace to know in which direction the stack is growing. It is easy for us to do this in the kernel and I couldn't find the original reasoning behind exposing this detail to userspace. /* Intentional user visible API change */ clone3() was released with 5.3. Currently, it is not documented and very unclear to userspace how the stack and stack_size argument have to be passed. After talking to glibc folks we concluded that trying to change clone3() to setup the stack instead of requiring userspace to do this is the right course of action. Note, that this is an explicit change in user visible behavior we introduce with this patch. If it breaks someone's use-case we will revert! (And then e.g. place the new behavior under an appropriate flag.) Breaking someone's use-case is very unlikely though. First, neither glibc nor musl currently expose a wrapper for clone3(). Second, there is no real motivation for anyone to use clone3() directly since it does not provide features that legacy clone doesn't. New features for clone3() will first happen in v5.5 which is why v5.4 is still a good time to try and make that change now and backport it to v5.3. Searches on [4] did not reveal any packages calling clone3(). [1]: https://lore.kernel.org/r/CAG48ez3q=BeNcuVTKBN79kJui4vC6nw0Bfq6xc-i0neheT17TA@mail.gmail.com [2]: https://lore.kernel.org/r/20191028172143.4vnnjpdljfnexaq5@wittgenstein [3]: https://github.com/systemd/systemd/blob/5238e9575906297608ff802a27e2ff9effa3b338/src/basic/raw-clone.h#L31 [4]: https://codesearch.debian.net Fixes: 7f192e3c ("fork: add clone3") Cc: Kees Cook <keescook@chromium.org> Cc: Jann Horn <jannh@google.com> Cc: David Howells <dhowells@redhat.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Florian Weimer <fweimer@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linux-api@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: <stable@vger.kernel.org> # 5.3 Cc: GNU C Library <libc-alpha@sourceware.org> Signed-off-by: NChristian Brauner <christian.brauner@ubuntu.com> Acked-by: NArnd Bergmann <arnd@arndb.de> Acked-by: NAleksa Sarai <cyphar@cyphar.com> Link: https://lore.kernel.org/r/20191031113608.20713-1-christian.brauner@ubuntu.com
-
- 04 11月, 2019 5 次提交
-
-
由 Linus Walleij 提交于
This reverts commit 8f86a5b4. It has been established that this causes a boot regression on both Baytrail and Cherrytrail SoCs, and we can't have that in the final kernel release, so we need to revert it. Reported-by: NHans de Goede <hdegoede@redhat.com> Acked-by: NAndy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: NLinus Walleij <linus.walleij@linaro.org>
-
由 Linus Walleij 提交于
This reverts commit 6658f87f. This revert is a prerequisite for the later revert of commit 8f86a5b4. Reported-by: NHans de Goede <hdegoede@redhat.com> Acked-by: NAndy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: NLinus Walleij <linus.walleij@linaro.org>
-
由 Linus Walleij 提交于
This reverts commit 4c875409. This revert is a prerequisite for the later revert of commit 8f86a5b4. Reported-by: NHans de Goede <hdegoede@redhat.com> Acked-by: NAndy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: NLinus Walleij <linus.walleij@linaro.org>
-
由 Linus Torvalds 提交于
-
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb由 Linus Torvalds 提交于
Pull USB fixes from Greg KH: "The USB sub-maintainers woke up this past week and sent a bunch of tiny fixes. Here are a lot of small patches that that resolve a bunch of reported issues in the USB core, drivers, serial drivers, gadget drivers, and of course, xhci :) All of these have been in linux-next with no reported issues" * tag 'usb-5.4-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (31 commits) usb: dwc3: gadget: fix race when disabling ep with cancelled xfers usb: cdns3: gadget: Fix g_audio use case when connected to Super-Speed host usb: cdns3: gadget: reset EP_CLAIMED flag while unloading USB: serial: whiteheat: fix line-speed endianness USB: serial: whiteheat: fix potential slab corruption USB: gadget: Reject endpoints with 0 maxpacket value UAS: Revert commit 3ae62a42 ("UAS: fix alignment of scatter/gather segments") usb-storage: Revert commit 747668db ("usb-storage: Set virt_boundary_mask to avoid SG overflows") usbip: Fix free of unallocated memory in vhci tx usbip: tools: Fix read_usb_vudc_device() error path handling usb: xhci: fix __le32/__le64 accessors in debugfs code usb: xhci: fix Immediate Data Transfer endianness xhci: Fix use-after-free regression in xhci clear hub TT implementation USB: ldusb: fix control-message timeout USB: ldusb: use unsigned size format specifiers USB: ldusb: fix ring-buffer locking USB: Skip endpoints with 0 maxpacket length usb: cdns3: gadget: Don't manage pullups usb: dwc3: remove the call trace of USBx_GFLADJ usb: gadget: configfs: fix concurrent issue between composite APIs ...
-
- 03 11月, 2019 1 次提交
-
-
git://git.samba.org/sfrench/cifs-2.6由 Linus Torvalds 提交于
Pull cifs fix from Steve French: "A small smb3 memleak fix" * tag '5.4-rc6-smb3-fix' of git://git.samba.org/sfrench/cifs-2.6: fix memory leak in large read decrypt offload
-