- 14 4月, 2017 1 次提交
-
-
由 Rafael J. Wysocki 提交于
* pm-cpufreq-fixes: cpufreq: Bring CPUs up even if cpufreq_online() failed * pm-tools-fixes: cpupower: Fix turbo frequency reporting for pre-Sandy Bridge cores tools/power turbostat: update version number tools/power turbostat: fix impossibly large CPU%c1 value tools/power turbostat: turbostat.8 add missing column definitions tools/power turbostat: update HWP dump to decimal from hex tools/power turbostat: enable package THERM_INTERRUPT dump tools/power turbostat: show missing Core and GFX power on SKL and KBL tools/power turbostat: bugfix: GFXMHz column not changing
-
- 13 4月, 2017 9 次提交
-
-
由 Ben Hutchings 提交于
The switch that conditionally sets CPUPOWER_CAP_HAS_TURBO_RATIO and CPUPOWER_CAP_IS_SNB flags is missing a break, so all cores get both flags set and an assumed base clock of 100 MHz for turbo values. Reported-by: NGSR <gsr.bugs@infernal-iceberg.com> Tested-by: NGSR <gsr.bugs@infernal-iceberg.com> References: https://bugs.debian.org/859978 Fixes: 8fb2e440 (cpupower: Show Intel turbo ratio support via ...) Signed-off-by: NBen Hutchings <ben@decadent.org.uk> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux由 Rafael J. Wysocki 提交于
Pull turbostat utility fixes for v4.11 from Len Brown. * 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: tools/power turbostat: update version number tools/power turbostat: fix impossibly large CPU%c1 value tools/power turbostat: turbostat.8 add missing column definitions tools/power turbostat: update HWP dump to decimal from hex tools/power turbostat: enable package THERM_INTERRUPT dump tools/power turbostat: show missing Core and GFX power on SKL and KBL tools/power turbostat: bugfix: GFXMHz column not changing
-
由 Chen Yu 提交于
There is a report that after commit 27622b06 ("cpufreq: Convert to hotplug state machine"), the normal CPU offline/online cycle fails on some platforms. According to the ftrace result, this problem was triggered on platforms using acpi-cpufreq as the default cpufreq driver, and due to the lack of some ACPI freq method (eg. _PCT), cpufreq_online() failed and returned a negative value, so the CPU hotplug state machine rolled back the CPU online process. Actually, from the user's perspective, the failure of cpufreq_online() should not prevent that CPU from being brought up, although cpufreq might not work on that CPU. BTW, during system startup cpufreq_online() is not invoked via CPU online but by the cpufreq device creation process, so the APs can be brought up even though cpufreq_online() fails in that stage. This patch ignores the return value of cpufreq_online/offline() and lets the cpufreq framework deal with the failure. cpufreq_online() itself will do a proper rollback in that case and if _PCT is missing, the ACPI cpufreq driver will print a warning if the corresponding debug options have been enabled. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=194581 Fixes: 27622b06 ("cpufreq: Convert to hotplug state machine") Reported-and-tested-by: NTomasz Maciej Nowak <tmn505@gmail.com> Signed-off-by: NChen Yu <yu.c.chen@intel.com> Acked-by: NViresh Kumar <viresh.kumar@linaro.org> Cc: 4.9+ <stable@vger.kernel.org> # 4.9+ Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 Len Brown 提交于
Signed-off-by: NLen Brown <len.brown@intel.com>
-
由 Len Brown 提交于
Most CPUs do not have a hardware c1 counter, and so turbostat derives c1 residency: c1 = TSC - MPERF - other_core_cstate_counters As it is not possible to atomically read these coutners, measurement jitter can case this calcuation to "go negative" when very close to 0. Turbostat detect that case and simply prints c1 = 0.00% But that check neglected to account for systems where the TSC crystal clock domain and the MPERF BCLK domain are differ by a small amount. That allowed very small negative c1 numbers to escape this check and be printed as huge positve numbers. This code begs for a bit of cleanup, but this patch is the minimal change to fix the issue. Signed-off-by: NLen Brown <len.brown@intel.com>
-
由 Doug Smythies 提交于
Add GFX%rc6 and GFXMHz to the column descriptions section of the turbostat man page. Signed-off-by: NDoug Smythies <dsmythies@telus.net> Signed-off-by: NLen Brown <len.brown@intel.com>
-
由 Len Brown 提交于
Syntax only. The HWP CAPABILTIES and REQUEST ratios are more easily viewed in decimal -- just multiply by 100 and you get MHz... new: cpu0: MSR_HWP_CAPABILITIES: 0x010c1b23 (high 35 guar 27 eff 12 low 1) cpu0: MSR_HWP_REQUEST: 0x80002301 (min 1 max 35 des 0 epp 0x80 window 0x0 pkg 0x0) old: cpu0: MSR_HWP_CAPABILITIES: 0x010c1b23 (high 0x23 guar 0x1b eff 0xc low 0x1) cpu0: MSR_HWP_REQUEST: 0x80002301 (min 0x1 max 0x23 des 0x0 epp 0x80 window 0x0 pkg 0x0) Signed-off-by: NLen Brown <len.brown@intel.com>
-
由 Len Brown 提交于
cpu0: MSR_IA32_TEMPERATURE_TARGET: 0x00641400 (100 C) cpu0: MSR_IA32_PACKAGE_THERM_STATUS: 0x884b0800 (25 C) cpu0: MSR_IA32_PACKAGE_THERM_INTERRUPT: 0x00000003 (100 C, 100 C) Enable the same per-core output, but hide it behind --debug because it is too verbose on big systems. Signed-off-by: NLen Brown <len.brown@intel.com>
-
由 Len Brown 提交于
While the current SDM is silent on the matter, the Core and GFX RAPL power meters on SKL and KBL appear to work -- so show them. Reported-by: NYaroslav Isakov <yaroslav.isakov@gmail.com> Signed-off-by: NLen Brown <len.brown@intel.com>
-
- 10 4月, 2017 5 次提交
-
-
由 Linus Torvalds 提交于
-
git://git.samba.org/sfrench/cifs-2.6由 Linus Torvalds 提交于
Pull CIFS fixes from Steve French: "This is a set of CIFS/SMB3 fixes for stable. There is another set of four SMB3 reconnect fixes for stable in progress but they are still being reviewed/tested, so didn't want to wait any longer to send these five below" * 'for-next' of git://git.samba.org/sfrench/cifs-2.6: Reset TreeId to zero on SMB2 TREE_CONNECT CIFS: Fix build failure with smb2 Introduce cifs_copy_file_range() SMB3: Rename clone_range to copychunk_range Handle mismatched open calls
-
git://git.armlinux.org.uk/~rmk/linux-arm由 Linus Torvalds 提交于
Pull ARM fixes from Russell King: "A number of ARM fixes: - prevent oopses caused by dma_get_sgtable() and declared DMA coherent memory - fix boot failure on nommu caused by ID_PFR1 access - a number of kprobes fixes from Jon Medhurst and Masami Hiramatsu" * 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm: ARM: 8665/1: nommu: access ID_PFR1 only if CPUID scheme ARM: dma-mapping: disallow dma_get_sgtable() for non-kernel managed memory arm: kprobes: Align stack to 8-bytes in test code arm: kprobes: Fix the return address of multiple kretprobes arm: kprobes: Skip single-stepping in recursing path if possible arm: kprobes: Allow to handle reentered kprobe on single-stepping
-
由 Linus Torvalds 提交于
Merge tag 'driver-core-4.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core fixes from Greg KH: "Here are 3 small fixes for 4.11-rc6. One resolves a reported issue with sysfs files that NeilBrown found, one is a documenatation fix for the stable kernel rules, and the last is a small MAINTAINERS file update for kernfs" * tag 'driver-core-4.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: MAINTAINERS: separate out kernfs maintainership sysfs: be careful of error returns from ops->show() Documentation: stable-kernel-rules: fix stable-tag format
-
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging由 Linus Torvalds 提交于
Pull staging/IIO driver rfixes from Greg KH: "Here are a number of small IIO and staging driver fixes for 4.11-rc6. Nothing big here, just iio fixes for reported issues, and an ashmem fix for a very old bug that has been reported by a number of Android vendors" * tag 'staging-4.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: staging: android: ashmem: lseek failed due to no FMODE_LSEEK. iio: hid-sensor-attributes: Fix sensor property setting failure. iio: accel: hid-sensor-accel-3d: Fix duplicate scan index error iio: core: Fix IIO_VAL_FRACTIONAL_LOG2 for negative values iio: st_pressure: initialize lps22hb bootime iio: bmg160: reset chip when probing iio: cros_ec_sensors: Fix return value to get raw and calibbias data.
-
- 09 4月, 2017 7 次提交
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs由 Linus Torvalds 提交于
Pull VFS fixes from Al Viro: "statx followup fixes and a fix for stack-smashing on alpha" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: alpha: fix stack smashing in old_adjtimex(2) statx: Include a mask for stx_attributes in struct statx statx: Reserve the top bit of the mask for future struct expansion xfs: report crtime and attribute flags to statx ext4: Add statx support statx: optimize copy of struct statx to userspace statx: remove incorrect part of vfs_statx() comment statx: reject unknown flags when using NULL path Documentation/filesystems: fix documentation for ->getattr()
-
git://git.kernel.dk/linux-block由 Linus Torvalds 提交于
Pull block fixes from Jens Axboe: "Here's a pull request for 4.11-rc, fixing a set of issues mostly centered around the new scheduling framework. These have been brewing for a while, but split up into what we absolutely need in 4.11, and what we can defer until 4.12. These are well tested, on both single queue and multiqueue setups, and with and without shared tags. They fix several hangs that have happened in testing. This is obviously larger than I would have preferred at this point in time, but I don't think we can shave much off this and still get the desired results. In detail, this pull request contains: - a set of five fixes for NVMe, mostly from Christoph and one from Roland. - a series from Bart, fixing issues with dm-mq and SCSI shared tags and scheduling. Note that one of those patches commit messages may read like an optimization, but it is in fact an important fix for queue restarts in particular. - a series from Omar, most importantly fixing a hang with multiple hardware queues when we fail to get a driver tag. Another important fix in there is for resizing hardware queues, which nbd does when handling multiple sockets for one connection. - fixing an imbalance in putting the ctx for hctx request allocations from Minchan" * 'for-linus' of git://git.kernel.dk/linux-block: blk-mq: Restart a single queue if tag sets are shared dm rq: Avoid that request processing stalls sporadically scsi: Avoid that SCSI queues get stuck blk-mq: Introduce blk_mq_delay_run_hw_queue() blk-mq: remap queues when adding/removing hardware queues blk-mq-sched: fix crash in switch error path blk-mq-sched: set up scheduler tags when bringing up new queues blk-mq-sched: refactor scheduler initialization blk-mq: use the right hctx when getting a driver tag fails nvmet: fix byte swap in nvmet_parse_io_cmd nvmet: fix byte swap in nvmet_execute_write_zeroes nvmet: add missing byte swap in nvmet_get_smart_log nvme: add missing byte swap in nvme_setup_discard nvme: Correct NVMF enum values to match NVMe-oF rev 1.0 block: do not put mq context in blk_mq_alloc_request_hctx
-
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl由 Linus Torvalds 提交于
Pull pin control fix from Linus Walleij: "This late fix for pin control is hopefully the last I send this cycle. The problem was detected early in the v4.11 release cycle and there has been some back and forth on how to solve it. Sadly the proper fix arrives late, but at least not too late. An issue was detected with pin control on the Freescale i.MX after the refactorings for more general group and function handling. We now have the proper fix for this" * tag 'pinctrl-v4.11-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: pinctrl: core: Fix pinctrl_register_and_init() with pinctrl_enable()
-
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux由 Linus Torvalds 提交于
Pull powerpc fixes from Michael Ellerman: "Some more powerpc fixes for 4.11: Headed to stable: - disable HFSCR[TM] if TM is not supported, fixes a potential host kernel crash triggered by a hostile guest, but only in configurations that no one uses - don't try to fix up misaligned load-with-reservation instructions - fix flush_(d|i)cache_range() called from modules on little endian kernels - add missing global TLB invalidate if cxl is active - fix missing preempt_disable() in crc32c-vpmsum And a fix for selftests build changes that went in this release: - selftests/powerpc: Fix standalone powerpc build Thanks to: Benjamin Herrenschmidt, Frederic Barrat, Oliver O'Halloran, Paul Mackerras" * tag 'powerpc-4.11-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/crypto/crc32c-vpmsum: Fix missing preempt_disable() powerpc/mm: Add missing global TLB invalidate if cxl is active powerpc/64: Fix flush_(d|i)cache_range() called from modules powerpc: Don't try to fix up misaligned load-with-reservation instructions powerpc: Disable HFSCR[TM] if TM is not supported selftests/powerpc: Fix standalone powerpc build
-
由 Chris Salls 提交于
In the case that compat_get_bitmap fails we do not want to copy the bitmap to the user as it will contain uninitialized stack data and leak sensitive data. Signed-off-by: NChris Salls <salls@cs.ucsb.edu> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Liping Zhang 提交于
Currently, inputting the following command will succeed but actually the value will be truncated: # echo 0x12ffffffff > /proc/sys/net/ipv4/tcp_notsent_lowat This is not friendly to the user, so instead, we should report error when the value is larger than UINT_MAX. Fixes: e7d316a0 ("sysctl: handle error writing UINT_MAX to u32 fields") Signed-off-by: NLiping Zhang <zlpnobody@gmail.com> Cc: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Tejun Heo 提交于
Separate out kernfs from driver core and add myself as a co-maintainer. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 08 4月, 2017 18 次提交
-
-
由 NeilBrown 提交于
ops->show() can return a negative error code. Commit 65da3484 ("sysfs: correctly handle short reads on PREALLOC attrs.") (in v4.4) caused this to be stored in an unsigned 'size_t' variable, so errors would look like large numbers. As a result, if an error is returned, sysfs_kf_read() will return the value of 'count', typically 4096. Commit 17d0774f ("sysfs: correctly handle read offset on PREALLOC attrs") (in v4.8) extended this error to use the unsigned large 'len' as a size for memmove(). Consequently, if ->show returns an error, then the first read() on the sysfs file will return 4096 and could return uninitialized memory to user-space. If the application performs a subsequent read, this will trigger a memmove() with extremely large count, and is likely to crash the machine is bizarre ways. This bug can currently only be triggered by reading from an md sysfs attribute declared with __ATTR_PREALLOC() during the brief period between when mddev_put() deletes an mddev from the ->all_mddevs list, and when mddev_delayed_delete() - which is scheduled on a workqueue - completes. Before this, an error won't be returned by the ->show() After this, the ->show() won't be called. I can reproduce it reliably only by putting delay like usleep_range(500000,700000); early in mddev_delayed_delete(). Then after creating an md device md0 run echo clear > /sys/block/md0/md/array_state; cat /sys/block/md0/md/array_state The bug can be triggered without the usleep. Fixes: 65da3484 ("sysfs: correctly handle short reads on PREALLOC attrs.") Fixes: 17d0774f ("sysfs: correctly handle read offset on PREALLOC attrs") Cc: stable@vger.kernel.org Signed-off-by: NNeilBrown <neilb@suse.com> Acked-by: NTejun Heo <tj@kernel.org> Reported-and-tested-by: NMiroslav Benes <mbenes@suse.cz> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Johan Hovold 提交于
A patch documenting how to specify which kernels a particular fix should be backported to (seemingly) inadvertently added a minus sign after the kernel version. This particular stable-tag format had never been used prior to this patch, and was neither present when the patch in question was first submitted (it was added in v2 without any comment). Drop the minus sign to avoid any confusion. Fixes: fdc81b79 ("stable_kernel_rules: Add clause about specification of kernel versions to patch.") Signed-off-by: NJohan Hovold <johan@kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Shuxiao Zhang 提交于
vfs_llseek will check whether the file mode has FMODE_LSEEK, no return failure. But ashmem can be lseek, so add FMODE_LSEEK to ashmem file. Comment From Greg Hackmann: ashmem_llseek() passes the llseek() call through to the backing shmem file. 91360b02 ("ashmem: use vfs_llseek()") changed this from directly calling the file's llseek() op into a VFS layer call. This also adds a check for the FMODE_LSEEK bit, so without that bit ashmem_llseek() now always fails with -ESPIPE. Fixes: 91360b02 ("ashmem: use vfs_llseek()") Signed-off-by: NShuxiao Zhang <zhangshuxiao@xiaomi.com> Tested-by: NGreg Hackmann <ghackmann@google.com> Cc: stable <stable@vger.kernel.org> # 3.18+ Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc由 Linus Torvalds 提交于
Pull sparc fixes from David Miller: "Several fixes here, mostly having to due with either build errors or memory corruptions depending upon whether you have THP enabled or not" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc: sparc: remove unused wp_works_ok macro sparc32: Export vac_cache_size to fix build error sparc64: Fix memory corruption when THP is enabled sparc64: Fix kernel panic due to erroneous #ifdef surrounding pmd_write() arch/sparc: Avoid DCTI Couples sparc64: kern_addr_valid regression sparc64: Add support for 2G hugepages sparc64: Fix size check in huge_pte_alloc
-
git://git.kernel.org/pub/scm/virt/kvm/kvm由 Linus Torvalds 提交于
Pull KVM fixes from Radim Krčmář: "ARM: - Fix a problem with GICv3 userspace save/restore - Clarify GICv2 userspace save/restore ABI - Be more careful in clearing GIC LRs - Add missing synchronization primitive to our MMU handling code PPC: - Check for a NULL return from kzalloc s390: - Prevent translation exception errors on valid page tables for the instruction-exection-protection support x86: - Fix Page-Modification Logging when running a nested guest" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: PPC: Book3S HV: Check for kmalloc errors in ioctl KVM: nVMX: initialize PML fields in vmcs02 KVM: nVMX: do not leak PML full vmexit to L1 KVM: arm/arm64: vgic: Fix GICC_PMR uaccess on GICv3 and clarify ABI KVM: arm64: Ensure LRs are clear when they should be kvm: arm/arm64: Fix locking for kvm_free_stage2_pgd KVM: s390: remove change-recording override support arm/arm64: KVM: Take mmap_sem in kvm_arch_prepare_memory_region arm/arm64: KVM: Take mmap_sem in stage2_unmap_vm
-
git://git.infradead.org/users/pcmoore/audit由 Linus Torvalds 提交于
Pull audit cleanup from Paul Moore: "A week later than I had hoped, but as promised, here is the audit uninline-fix we talked about during the last audit pull request. The patch is slightly different than what we originally discussed as it made more sense to keep the audit_signal_info() function in auditsc.c rather than move it and bunch of other related variables/definitions into audit.c/audit.h. At some point in the future I need to look at how the audit code is organized across kernel/audit*, I suspect we could do things a bit better, but it doesn't seem like a -rc release is a good place for that ;) Regardless, this patch passes our tests without problem and looks good for v4.11" * 'stable-4.11' of git://git.infradead.org/users/pcmoore/audit: audit: move audit_signal_info() into kernel/auditsc.c
-
由 Linus Torvalds 提交于
Merge misc fixes from Andrew Morton: "10 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm: move pcp and lru-pcp draining into single wq mailmap: update Yakir Yang email address mm, swap_cgroup: reschedule when neeed in swap_cgroup_swapoff() dax: fix radix tree insertion race mm, thp: fix setting of defer+madvise thp defrag mode ptrace: fix PTRACE_LISTEN race corrupting task->state vmlinux.lds: add missing VMLINUX_SYMBOL macros mm/page_alloc.c: fix print order in show_free_areas() userfaultfd: report actual registered features in fdinfo mm: fix page_vma_mapped_walk() for ksm pages
-
由 Michal Hocko 提交于
We currently have 2 specific WQ_RECLAIM workqueues in the mm code. vmstat_wq for updating pcp stats and lru_add_drain_wq dedicated to drain per cpu lru caches. This seems more than necessary because both can run on a single WQ. Both do not block on locks requiring a memory allocation nor perform any allocations themselves. We will save one rescuer thread this way. On the other hand drain_all_pages() queues work on the system wq which doesn't have rescuer and so this depend on memory allocation (when all workers are stuck allocating and new ones cannot be created). Initially we thought this would be more of a theoretical problem but Hugh Dickins has reported: : 4.11-rc has been giving me hangs after hours of swapping load. At : first they looked like memory leaks ("fork: Cannot allocate memory"); : but for no good reason I happened to do "cat /proc/sys/vm/stat_refresh" : before looking at /proc/meminfo one time, and the stat_refresh stuck : in D state, waiting for completion of flush_work like many kworkers. : kthreadd waiting for completion of flush_work in drain_all_pages(). This worker should be using WQ_RECLAIM as well in order to guarantee a forward progress. We can reuse the same one as for lru draining and vmstat. Link: http://lkml.kernel.org/r/20170307131751.24936-1-mhocko@kernel.orgSigned-off-by: NMichal Hocko <mhocko@suse.com> Suggested-by: NTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: NVlastimil Babka <vbabka@suse.cz> Acked-by: NMel Gorman <mgorman@suse.de> Tested-by: NYang Li <pku.leo@gmail.com> Tested-by: NHugh Dickins <hughd@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Jeffy Chen 提交于
Set current email address to replace previous employers email addresses. Link: http://lkml.kernel.org/r/1491450722-6633-1-git-send-email-jeffy.chen@rock-chips.comSigned-off-by: NJeffy Chen <jeffy.chen@rock-chips.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 David Rientjes 提交于
We got need_resched() warnings in swap_cgroup_swapoff() because swap_cgroup_ctrl[type].length is particularly large. Reschedule when needed. Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1704061315270.80559@chino.kir.corp.google.comSigned-off-by: NDavid Rientjes <rientjes@google.com> Acked-by: NMichal Hocko <mhocko@suse.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Ross Zwisler 提交于
While running generic/340 in my test setup I hit the following race. It can happen with kernels that support FS DAX PMDs, so v4.10 thru v4.11-rc5. Thread 1 Thread 2 -------- -------- dax_iomap_pmd_fault() grab_mapping_entry() spin_lock_irq() get_unlocked_mapping_entry() 'entry' is NULL, can't call lock_slot() spin_unlock_irq() radix_tree_preload() dax_iomap_pmd_fault() grab_mapping_entry() spin_lock_irq() get_unlocked_mapping_entry() ... lock_slot() spin_unlock_irq() dax_pmd_insert_mapping() <inserts a PMD mapping> spin_lock_irq() __radix_tree_insert() fails with -EEXIST <fall back to 4k fault, and die horribly when inserting a 4k entry where a PMD exists> The issue is that we have to drop mapping->tree_lock while calling radix_tree_preload(), but since we didn't have a radix tree entry to lock (unlike in the pmd_downgrade case) we have no protection against Thread 2 coming along and inserting a PMD at the same index. For 4k entries we handled this with a special-case response to -EEXIST coming from the __radix_tree_insert(), but this doesn't save us for PMDs because the -EEXIST case can also mean that we collided with a 4k entry in the radix tree at a different index, but one that is covered by our PMD range. So, correctly handle both the 4k and 2M collision cases by explicitly re-checking the radix tree for an entry at our index once we reacquire mapping->tree_lock. This patch has made it through a clean xfstests run with the current v4.11-rc5 based linux/master, and it also ran generic/340 500 times in a loop. It used to fail within the first 10 iterations. Link: http://lkml.kernel.org/r/20170406212944.2866-1-ross.zwisler@linux.intel.comSigned-off-by: NRoss Zwisler <ross.zwisler@linux.intel.com> Cc: "Darrick J. Wong" <darrick.wong@oracle.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Jan Kara <jack@suse.cz> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: <stable@vger.kernel.org> [4.10+] Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 David Rientjes 提交于
Setting thp defrag mode of "defer+madvise" actually sets "defer" in the kernel due to the name similarity and the out-of-order way the string is checked in defrag_store(). Check the string in the correct order so that TRANSPARENT_HUGEPAGE_DEFRAG_KSWAPD_OR_MADV_FLAG is set appropriately for "defer+madvise". Fixes: 21440d7e ("mm, thp: add new defer+madvise defrag option") Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1704051814420.137626@chino.kir.corp.google.comSigned-off-by: NDavid Rientjes <rientjes@google.com> Acked-by: NMichal Hocko <mhocko@suse.com> Acked-by: NVlastimil Babka <vbabka@suse.cz> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 bsegall@google.com 提交于
In PT_SEIZED + LISTEN mode STOP/CONT signals cause a wakeup against __TASK_TRACED. If this races with the ptrace_unfreeze_traced at the end of a PTRACE_LISTEN, this can wake the task /after/ the check against __TASK_TRACED, but before the reset of state to TASK_TRACED. This causes it to instead clobber TASK_WAKING, allowing a subsequent wakeup against TRACED while the task is still on the rq wake_list, corrupting it. Oleg said: "The kernel can crash or this can lead to other hard-to-debug problems. In short, "task->state = TASK_TRACED" in ptrace_unfreeze_traced() assumes that nobody else can wake it up, but PTRACE_LISTEN breaks the contract. Obviusly it is very wrong to manipulate task->state if this task is already running, or WAKING, or it sleeps again" [akpm@linux-foundation.org: coding-style fixes] Fixes: 9899d11f ("ptrace: ensure arch_ptrace/ptrace_request can never race with SIGKILL") Link: http://lkml.kernel.org/r/xm26y3vfhmkp.fsf_-_@bsegall-linux.mtv.corp.google.comSigned-off-by: NBen Segall <bsegall@google.com> Acked-by: NOleg Nesterov <oleg@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Jessica Yu 提交于
When __{start,end}_ro_after_init is referenced from C code, we run into the following build errors on blackfin: kernel/extable.c:169: undefined reference to `__start_ro_after_init' kernel/extable.c:169: undefined reference to `__end_ro_after_init' The build error is due to the fact that blackfin is one of the few arches that prepends an underscore '_' to all symbols defined in C. Fix this by wrapping __{start,end}_ro_after_init in vmlinux.lds.h with VMLINUX_SYMBOL(), which adds the necessary prefix for arches that have HAVE_UNDERSCORE_SYMBOL_PREFIX. Link: http://lkml.kernel.org/r/1491259387-15869-1-git-send-email-jeyu@redhat.comSigned-off-by: NJessica Yu <jeyu@redhat.com> Acked-by: NKees Cook <keescook@chromium.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Eddie Kovsky <ewk@edkovsky.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Alexander Polakov 提交于
Fixes: 11fb9989 ("mm: move most file-based accounting to the node") Link: http://lkml.kernel.org/r/1490377730.30219.2.camel@beget.ruSigned-off-by: NAlexander Polyakov <apolyakov@beget.com> Acked-by: NMichal Hocko <mhocko@suse.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: <stable@vger.kernel.org> [4.8+] Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Mike Rapoport 提交于
fdinfo for userfault file descriptor reports UFFD_API_FEATURES. Up until recently, the UFFD_API_FEATURES was defined as 0, therefore corresponding field in fdinfo always contained zero. Now, with introduction of several additional features, UFFD_API_FEATURES is not longer 0 and it seems better to report actual features requested for the userfaultfd object described by the fdinfo. First, the applications that were using userfault will still see zero at the features field in fdinfo. Next, reporting actual features rather than available features, gives clear indication of what userfault features are used by an application. Link: http://lkml.kernel.org/r/1491140181-22121-1-git-send-email-rppt@linux.vnet.ibm.comSigned-off-by: NMike Rapoport <rppt@linux.vnet.ibm.com> Reviewed-by: NAndrea Arcangeli <aarcange@redhat.com> Cc: Pavel Emelyanov <xemul@virtuozzo.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Hugh Dickins 提交于
Doug Smythies reports oops with KSM in this backtrace, I've been seeing the same: page_vma_mapped_walk+0xe6/0x5b0 page_referenced_one+0x91/0x1a0 rmap_walk_ksm+0x100/0x190 rmap_walk+0x4f/0x60 page_referenced+0x149/0x170 shrink_active_list+0x1c2/0x430 shrink_node_memcg+0x67a/0x7a0 shrink_node+0xe1/0x320 kswapd+0x34b/0x720 Just as observed in commit 4b0ece6f ("mm: migrate: fix remove_migration_pte() for ksm pages"), you cannot use page->index calculations on ksm pages. page_vma_mapped_walk() is relying on __vma_address(), where a ksm page can lead it off the end of the page table, and into whatever nonsense is in the next page, ending as an oops inside check_pte()'s pte_page(). KSM tells page_vma_mapped_walk() exactly where to look for the page, it does not need any page->index calculation: and that's so also for all the normal and file and anon pages - just not for THPs and their subpages. Get out early in most cases: instead of a PageKsm test, move down the earlier not-THP-page test, as suggested by Kirill. I'm also slightly worried that this loop can stray into other vmas, so added a vm_end test to prevent surprises; though I have not imagined anything worse than a very contrived case, in which a page mlocked in the next vma might be reclaimed because it is not mlocked in this vma. Fixes: ace71a19 ("mm: introduce page_vma_mapped_walk()") Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1704031104400.1118@eggly.anvilsSigned-off-by: NHugh Dickins <hughd@google.com> Reported-by: NDoug Smythies <dsmythies@telus.net> Tested-by: NDoug Smythies <dsmythies@telus.net> Reviewed-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Martin Brandenburg 提交于
Without this fix (and another to the userspace component itself described later), the kernel will be unable to process any OrangeFS requests after the userspace component is restarted (due to a crash or at the administrator's behest). The bug here is that inside orangefs_remount, the orangefs_request_mutex is locked. When the userspace component restarts while the filesystem is mounted, it sends a ORANGEFS_DEV_REMOUNT_ALL ioctl to the device, which causes the kernel to send it a few requests aimed at synchronizing the state between the two. While this is happening the orangefs_request_mutex is locked to prevent any other requests going through. This is only half of the bugfix. The other half is in the userspace component which outright ignores(!) requests made before it considers the filesystem remounted, which is after the ioctl returns. Of course the ioctl doesn't return until after the userspace component responds to the request it ignores. The userspace component has been changed to allow ORANGEFS_VFS_OP_FEATURES regardless of the mount status. Mike Marshall says: "I've tested this patch against the fixed userspace part. This patch is real important, I hope it can make it into 4.11... Here's what happens when the userspace daemon is restarted, without the patch: ============================================= [ INFO: possible recursive locking detected ] [ 4.10.0-00007-ge98bdb30 #1 Not tainted ] --------------------------------------------- pvfs2-client-co/29032 is trying to acquire lock: (orangefs_request_mutex){+.+.+.}, at: service_operation+0x3c7/0x7b0 [orangefs] but task is already holding lock: (orangefs_request_mutex){+.+.+.}, at: dispatch_ioctl_command+0x1bf/0x330 [orangefs] CPU: 0 PID: 29032 Comm: pvfs2-client-co Not tainted 4.10.0-00007-ge98bdb30 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-1.fc25 04/01/2014 Call Trace: __lock_acquire+0x7eb/0x1290 lock_acquire+0xe8/0x1d0 mutex_lock_killable_nested+0x6f/0x6e0 service_operation+0x3c7/0x7b0 [orangefs] orangefs_remount+0xea/0x150 [orangefs] dispatch_ioctl_command+0x227/0x330 [orangefs] orangefs_devreq_ioctl+0x29/0x70 [orangefs] do_vfs_ioctl+0xa3/0x6e0 SyS_ioctl+0x79/0x90" Signed-off-by: NMartin Brandenburg <martin@omnibond.com> Acked-by: NMike Marshall <hubcap@omnibond.com> Cc: stable@vger.kernel.org Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-