- 30 9月, 2017 16 次提交
-
-
由 Peter Zijlstra 提交于
The problem with the overestimate is that it will subtract too big a value from the load_sum, thereby pushing it down further than it ought to go. Since runnable_load_avg is not subject to a similar 'force', this results in the occasional 'runnable_load > load' situation. Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Peter Zijlstra 提交于
The PELT _sum values are a saw-tooth function, dropping on the decay edge and then growing back up again during the window. When these window-edges are not aligned between cfs_rq and se, we can have the situation where, for example, on dequeue, the se decays first. Its _sum values will be small(er), while the cfs_rq _sum values will still be on their way up. Because of this, the subtraction: cfs_rq->avg._sum -= se->avg._sum will result in a positive value. This will then, once the cfs_rq reaches an edge, translate into its _avg value jumping up. This is especially visible with the runnable_load bits, since they get added/subtracted a lot. Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Peter Zijlstra 提交于
Vincent wondered why his self migrating task had a roughly 50% dip in load_avg when landing on the new CPU. This is because we uncondionally take the asynchronous detatch_entity route, which can lead to the attach on the new CPU still seeing the old CPU's contribution to tg->load_avg, effectively halving the new CPU's shares. While in general this is something we have to live with, there is the special case of runnable migration where we can do better. Tested-by: NVincent Guittot <vincent.guittot@linaro.org> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Peter Zijlstra 提交于
The load balancer uses runnable_load_avg as load indicator. For !cgroup this is: runnable_load_avg = \Sum se->avg.load_avg ; where se->on_rq That is, a direct sum of all runnable tasks on that runqueue. As opposed to load_avg, which is a sum of all tasks on the runqueue, which includes a blocked component. However, in the cgroup case, this comes apart since the group entities are always runnable, even if most of their constituent entities are blocked. Therefore introduce a runnable_weight which for task entities is the same as the regular weight, but for group entities is a fraction of the entity weight and represents the runnable part of the group runqueue. Then propagate this load through the PELT hierarchy to arrive at an effective runnable load avgerage -- which we should not confuse with the canonical runnable load average. Suggested-by: NTejun Heo <tj@kernel.org> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Peter Zijlstra 提交于
When an entity migrates in (or out) of a runqueue, we need to add (or remove) its contribution from the entire PELT hierarchy, because even non-runnable entities are included in the load average sums. In order to do this we have some propagation logic that updates the PELT tree, however the way it 'propagates' the runnable (or load) change is (more or less): tg->weight * grq->avg.load_avg ge->avg.load_avg = ------------------------------ tg->load_avg But that is the expression for ge->weight, and per the definition of load_avg: ge->avg.load_avg := ge->weight * ge->avg.runnable_avg That destroys the runnable_avg (by setting it to 1) we wanted to propagate. Instead directly propagate runnable_sum. Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Peter Zijlstra 提交于
Since on wakeup migration we don't hold the rq->lock for the old CPU we cannot update its state. Instead we add the removed 'load' to an atomic variable and have the next update on that CPU collect and process it. Currently we have 2 atomic variables; which already have the issue that they can be read out-of-sync. Also, two atomic ops on a single cacheline is already more expensive than an uncontended lock. Since we want to add more, convert the thing over to an explicit cacheline with a lock in. Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Vincent Guittot 提交于
Now that we directly change load_avg and propagate that change into the sums, sys_nice() and co should do the same, otherwise its possible to confuse load accounting when we migrate near the weight change. Fixes-by: NJosef Bacik <josef@toxicpanda.com> Signed-off-by: NVincent Guittot <vincent.guittot@linaro.org> [ Added changelog, fixed the call condition. ] Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Link: http://lkml.kernel.org/r/20170517095045.GA8420@linaro.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Peter Zijlstra 提交于
When a (group) entity changes it's weight we should instantly change its load_avg and propagate that change into the sums it is part of. Because we use these values to predict future behaviour and are not interested in its historical value. Without this change, the change in load would need to propagate through the average, by which time it could again have changed etc.. always chasing itself. With this change, the cfs_rq load_avg sum will more accurately reflect the current runnable and expected return of blocked load. Reported-by: NPaul Turner <pjt@google.com> [josef: compile fix !SMP || !FAIR_GROUP] Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Peter Zijlstra 提交于
Analogous to the existing {en,de}queue_runnable_load_avg() add helpers for {en,de}queue_load_avg(). More users will follow. Includes some code movement to avoid fwd declarations. Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Peter Zijlstra 提交于
Since they're now purely about runnable_load, rename them. Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Peter Zijlstra 提交于
Move the entity migrate handling from enqueue_entity_load_avg() to update_load_avg(). This has two benefits: - {en,de}queue_entity_load_avg() will become purely about managing runnable_load - we can avoid a double update_tg_load_avg() and reduce pressure on the global tg->shares cacheline The reason we do this is so that we can change update_cfs_shares() to change both weight and (future) runnable_weight. For this to work we need to have the cfs_rq averages up-to-date (which means having done the attach), but we need the cfs_rq->avg.runnable_avg to not yet include the se's contribution (since se->on_rq == 0). Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Peter Zijlstra 提交于
Most call sites of update_load_avg() already have cfs_rq_of(se) available, pass it down instead of recomputing it. Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Peter Zijlstra 提交于
Remove the load from the load_sum for sched_entities, basically turning load_sum into runnable_sum. This prepares for better reweighting of group entities. Since we now have different rules for computing load_avg, split ___update_load_avg() into two parts, ___update_load_sum() and ___update_load_avg(). So for se: ___update_load_sum(.weight = 1) ___upate_load_avg(.weight = se->load.weight) and for cfs_rq: ___update_load_sum(.weight = cfs_rq->load.weight) ___upate_load_avg(.weight = 1) Since the primary consumable is load_avg, most things will not be affected. Only those few sites that initialize/modify load_sum need attention. Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Peter Zijlstra 提交于
Vincent reported that when running in a cgroup, his root cfs_rq->avg.load_avg dropped to 0 on task idle. This is because reweight_entity() will now immediately propagate the weight change of the group entity to its cfs_rq, and as it happens, our approxmation (5) for calc_cfs_shares() results in 0 when the group is idle. Avoid this by using the correct (3) as a lower bound on (5). This way the empty cgroup will slowly decay instead of instantly drop to 0. Reported-by: NVincent Guittot <vincent.guittot@linaro.org> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Peter Zijlstra 提交于
Explain the magic equation in calc_cfs_shares() a bit better. Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Peter Zijlstra 提交于
For consistencies sake, we should have only a single reading of tg->shares. Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
- 29 9月, 2017 17 次提交
-
-
由 Ethan Zhao 提交于
System will hang if user set sysctl_sched_time_avg to 0: [root@XXX ~]# sysctl kernel.sched_time_avg_ms=0 Stack traceback for pid 0 0xffff883f6406c600 0 0 1 3 R 0xffff883f6406cf50 *swapper/3 ffff883f7ccc3ae8 0000000000000018 ffffffff810c4dd0 0000000000000000 0000000000017800 ffff883f7ccc3d78 0000000000000003 ffff883f7ccc3bf8 ffffffff810c4fc9 ffff883f7ccc3c08 00000000810c5043 ffff883f7ccc3c08 Call Trace: <IRQ> [<ffffffff810c4dd0>] ? update_group_capacity+0x110/0x200 [<ffffffff810c4fc9>] ? update_sd_lb_stats+0x109/0x600 [<ffffffff810c5507>] ? find_busiest_group+0x47/0x530 [<ffffffff810c5b84>] ? load_balance+0x194/0x900 [<ffffffff810ad5ca>] ? update_rq_clock.part.83+0x1a/0xe0 [<ffffffff810c6d42>] ? rebalance_domains+0x152/0x290 [<ffffffff810c6f5c>] ? run_rebalance_domains+0xdc/0x1d0 [<ffffffff8108a75b>] ? __do_softirq+0xfb/0x320 [<ffffffff8108ac85>] ? irq_exit+0x125/0x130 [<ffffffff810b3a17>] ? scheduler_ipi+0x97/0x160 [<ffffffff81052709>] ? smp_reschedule_interrupt+0x29/0x30 [<ffffffff8173a1be>] ? reschedule_interrupt+0x6e/0x80 <EOI> [<ffffffff815bc83c>] ? cpuidle_enter_state+0xcc/0x230 [<ffffffff815bc80c>] ? cpuidle_enter_state+0x9c/0x230 [<ffffffff815bc9d7>] ? cpuidle_enter+0x17/0x20 [<ffffffff810cd6dc>] ? cpu_startup_entry+0x38c/0x420 [<ffffffff81053373>] ? start_secondary+0x173/0x1e0 Because divide-by-zero error happens in function: update_group_capacity() update_cpu_capacity() scale_rt_capacity() { ... total = sched_avg_period() + delta; used = div_u64(avg, total); ... } To fix this issue, check user input value of sysctl_sched_time_avg, keep it unchanged when hitting invalid input, and set the minimum limit of sysctl_sched_time_avg to 1 ms. Reported-by: NJames Puthukattukaran <james.puthukattukaran@oracle.com> Signed-off-by: NEthan Zhao <ethan.zhao@oracle.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: efault@gmx.de Cc: ethan.kernel@gmail.com Cc: keescook@chromium.org Cc: mcgrof@kernel.org Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/1504504774-18253-1-git-send-email-ethan.zhao@oracle.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Peter Zijlstra 提交于
Currently TASK_PARKED is masqueraded as TASK_INTERRUPTIBLE, give it its own print state because it will not in fact get woken by regular wakeups and is a long-term state. This requires moving TASK_PARKED into the TASK_REPORT mask, and since that latter needs to be a contiguous bitmask, we need to shuffle the bits around a bit. Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Peter Zijlstra 提交于
Markus reported that tasks in TASK_IDLE state are reported by SysRq-W, which results in undesirable clutter. Reported-by: NMarkus Trippelsdorf <markus@trippelsdorf.de> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Peter Zijlstra 提交于
Markus reported that kthreads that idle using TASK_IDLE instead of TASK_INTERRUPTIBLE are reported in as TASK_UNINTERRUPTIBLE and things like htop mark those red. This is undesirable, so add an explicit state for TASK_IDLE. Reported-by: NMarkus Trippelsdorf <markus@trippelsdorf.de> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Peter Zijlstra 提交于
Remove yet another task-state char instance. Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Peter Zijlstra 提交于
Convert trace_sched_switch to use the common task-state helpers and fix the "X" and "Z" order, possibly they ended up in the wrong order because TASK_REPORT has them in the wrong order too. Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Peter Zijlstra 提交于
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Peter Zijlstra 提交于
Bit patterns are easier in hex. Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Peter Zijlstra 提交于
Currently get_task_state() and task_state_to_char() report different states, create a number of common helpers and unify the reported state space. Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm由 Linus Torvalds 提交于
Pull ACPI fix from Rafael Wysocki: "This fixes an APEI problem that may cause a reported error to be missed due to a race condition" * tag 'acpi-4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI / APEI: clear error status before acknowledging the error
-
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm由 Linus Torvalds 提交于
Pull power management fixes from Rafael Wysocki: "These fix a deadlock in the operating performance points (OPP) framework introduced during the 4.11 cycle, more issues with duplicate device objects for cpufreq-dt and cpufreq documentation. Specifics: - Fix a deadlock in the operating performance points (OPP) framework caused by a notifier callback taking a lock that's already held by its caller (Viresh Kumar). - Prevent the ti-cpufreq and cpufreq-dt-platdev drivers from attempting to register conflicting device objects which triggers a warning from sysfs (Suniel Mahesh). - Drop a stale reference to a piece of intel_pstate documentation that's not in the tree any more (Rafael Wysocki)" * tag 'pm-4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpufreq: docs: Drop intel-pstate.txt from index.txt cpufreq: dt: Fix sysfs duplicate filename creation for platform-device PM / OPP: Call notifier without holding opp_table->lock
-
git://git.kernel.org/pub/scm/fs/xfs/xfs-linux由 Linus Torvalds 提交于
Pull xfs fixes from Darrick Wong: - fix various problems with the copy-on-write extent maps getting freed at the wrong time - fix printk format specifier problems - report zeroing operation outcomes instead of dropping them on the floor - fix some crashes when dio operations partially fail - fix a race condition between unwritten extent conversion & dio read - fix some incorrect tests in the inode log item processing - correct the delayed allocation space reservations on rmap filesystems - fix some problems checking for dax support * tag 'xfs-4.14-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: revert "xfs: factor rmap btree size into the indlen calculations" xfs: Capture state of the right inode in xfs_iflush_done xfs: perag initialization should only touch m_ag_max_usable for AG 0 xfs: update i_size after unwritten conversion in dio completion iomap_dio_rw: Allocate AIO completion queue before submitting dio xfs: validate bdev support for DAX inode flag xfs: remove redundant re-initialization of total_nr_pages xfs: Output warning message when discard option was enabled even though the device does not support discard xfs: report zeroed or not correctly in xfs_zero_range() xfs: kill meaningless variable 'zero' fs/xfs: Use %pS printk format for direct addresses xfs: evict CoW fork extents when performing finsert/fcollapse xfs: don't unconditionally clear the reflink flag on zero-block files
-
由 Linus Torvalds 提交于
This reverts commit dbbccdc4. It turns out that the "legacy" users aren't so legacy at all, and that turning off the legacy ioctl will break the current Qt bluetooth stack for bluetooth LE devices that were released just a couple of months ago. So it's simply not true that this was a legacy interface that hasn't been needed and is only limited to old legacy BT devices. Because I actually read Kconfig help messages, and actively try to turn off features that I don't need, I turned the option off. Then I spent _way_ too much time debugging BLE issues until I realized that it wasn't the Qt and subsurface development that had broken one of my dive computer BLE downloads, but simply my broken kernel config. Maybe in a decade it will be true that this is a legacy interface. And maybe with a better help-text and correct dependencies, this kind of legacy removal might be acceptable. But as things are right now both the commit message and the Kconfig help text were misleading, and the Kconfig option had the wrong dependenencies. There's no reason to keep that broken Kconfig option in the tree. Cc: Marcel Holtmann <marcel@holtmann.org> Cc: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Rafael J. Wysocki 提交于
* acpi-apei: ACPI / APEI: clear error status before acknowledging the error
-
git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma由 Linus Torvalds 提交于
Pull rdma fixes from Doug Ledford: "Second -rc update for 4.14. Both Mellanox and Intel had a series of -rc fixes that landed this week. The Mellanox bunch is spread throughout the stack and not just in their driver, where as the Intel bunch was mostly in the hfi1 driver. And, several of the fixes in the hfi1 driver were more than just simple 5 line fixes. As a result, the hfi1 driver fixes has a sizable LOC count. Everything else is as one would expect in an RC cycle in terms of LOC count. One item that might jump out and make you think "That's not an rc item" is the fix that corrects a typo. But, that change fixes a typo in a user visible API that was just added in this merge window, so if we fix it now, we can fix it. If we don't, the typo is in the API forever. Another that might not appear to be a fix at first glance is the Simplify mlx5_ib_cont_pages patch, but the simplification allows them to fix a bug in the existing function whenever the length of an SGE exceeded page size. We also had to revert one patch from the merge window that was wrong. Summary: - a few core fixes - a few ipoib fixes - a few mlx5 fixes - a 7-patch hfi1 related series" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: IB/hfi1: Unsuccessful PCIe caps tuning should not fail driver load IB/hfi1: On error, fix use after free during user context setup Revert "IB/ipoib: Update broadcast object if PKey value was changed in index 0" IB/hfi1: Return correct value in general interrupt handler IB/hfi1: Check eeprom config partition validity IB/hfi1: Only reset QSFP after link up and turn off AOC TX IB/hfi1: Turn off AOC TX after offline substates IB/mlx5: Fix NULL deference on mlx5_ib_update_xlt failure IB/mlx5: Simplify mlx5_ib_cont_pages IB/ipoib: Fix inconsistency with free_netdev and free_rdma_netdev IB/ipoib: Fix sysfs Pkey create<->remove possible deadlock IB: Correct MR length field to be 64-bit IB/core: Fix qp_sec use after free access IB/core: Fix typo in the name of the tag-matching cap struct
-
由 Rafael J. Wysocki 提交于
* pm-opp: PM / OPP: Call notifier without holding opp_table->lock * pm-cpufreq: cpufreq: docs: Drop intel-pstate.txt from index.txt cpufreq: dt: Fix sysfs duplicate filename creation for platform-device
-
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux由 Linus Torvalds 提交于
Pull seccomp fix from Kees Cook: "Fix refcounting bug in CRIU interface, noticed by Chris Salls (Oleg & Tycho)" * tag 'seccomp-v4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: seccomp: fix the usage of get/put_seccomp_filter() in seccomp_get_filter()
-
- 28 9月, 2017 5 次提交
-
-
由 Oleg Nesterov 提交于
As Chris explains, get_seccomp_filter() and put_seccomp_filter() can end up using different filters. Once we drop ->siglock it is possible for task->seccomp.filter to have been replaced by SECCOMP_FILTER_FLAG_TSYNC. Fixes: f8e529ed ("seccomp, ptrace: add support for dumping seccomp filters") Reported-by: NChris Salls <chrissalls5@gmail.com> Cc: stable@vger.kernel.org # needs s/refcount_/atomic_/ for v4.12 and earlier Signed-off-by: NOleg Nesterov <oleg@redhat.com> [tycho: add __get_seccomp_filter vs. open coding refcount_inc()] Signed-off-by: NTycho Andersen <tycho@docker.com> [kees: tweak commit log] Signed-off-by: NKees Cook <keescook@chromium.org>
-
由 Rafael J. Wysocki 提交于
Commit 33fc30b4 (cpufreq: intel_pstate: Document the current behavior and user interface) dropped the intel-pstate.txt file from Documentation/cpu-freq/, but it did not update the index.txt file in there accordingly, so do that now. Fixes: 33fc30b4 (cpufreq: intel_pstate: Document the current behavior and user interface) Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 Tyler Baicar 提交于
Currently we acknowledge errors before clearing the error status. This could cause a new error to be populated by firmware in-between the error acknowledgment and the error status clearing which would cause the second error's status to be cleared without being handled. So, clear the error status before acknowledging the errors. Also, make sure to acknowledge the error if the error status read fails. Signed-off-by: NTyler Baicar <tbaicar@codeaurora.org> Reviewed-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs由 Linus Torvalds 提交于
Pull quota and isofs fixes from Jan Kara: "Two quota fixes (fallout of the quota locking changes) and an isofs build fix" * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: quota: Fix quota corruption with generic/232 test isofs: fix build regression quota: add missing lock into __dquot_transfer()
-
由 Linus Torvalds 提交于
Merge tag 'linux-kselftest-4.14-rc3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull kselftest fixes from Shuah Khan: "This update consists of: - fixes to several existing tests - a test for regression introduced by b9470c27 ("inet: kill smallest_size and smallest_port") - seccomp support for glibc 2.26 siginfo_t.h - fixes to kselftest framework and tests to run make O=dir use-case - fixes to silence unnecessary test output to de-clutter test results" * tag 'linux-kselftest-4.14-rc3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: (28 commits) selftests: timers: set-timer-lat: Fix hang when testing unsupported alarms selftests: timers: set-timer-lat: fix hang when std out/err are redirected selftests/memfd: correct run_tests.sh permission selftests/seccomp: Support glibc 2.26 siginfo_t.h selftests: futex: Makefile: fix for loops in targets to run silently selftests: Makefile: fix for loops in targets to run silently selftests: mqueue: Use full path to run tests from Makefile selftests: futex: copy sub-dir test scripts for make O=dir run selftests: lib.mk: copy test scripts and test files for make O=dir run selftests: sync: kselftest and kselftest-clean fail for make O=dir case selftests: sync: use TEST_CUSTOM_PROGS instead of TEST_PROGS selftests: lib.mk: add TEST_CUSTOM_PROGS to allow custom test run/install selftests: watchdog: fix to use TEST_GEN_PROGS and remove clean selftests: lib.mk: fix test executable status check to use full path selftests: Makefile: clear LDFLAGS for make O=dir use-case selftests: lib.mk: kselftest and kselftest-clean fail for make O=dir case Makefile: kselftest and kselftest-clean fail for make O=dir case selftests/net: msg_zerocopy enable build with older kernel headers selftests: actually run the various net selftests selftest: add a reuseaddr test ...
-
- 27 9月, 2017 2 次提交
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip由 Linus Torvalds 提交于
Pull x86 fpu fixes and cleanups from Ingo Molnar: "This is _way_ more cleanups than fixes, but the bugs were subtle and hard to hit, and the primary reason for them existing was the unnecessary historical complexity of some of the x86/fpu interfaces. The first bunch of commits clean up and simplify the xstate user copy handling functions, in reaction to the collective head-scratching about the xstate user-copy handling code that leads up to the fix for this SkyLake xstate handling bug: 0852b374: x86/fpu: Add FPU state copying quirk to handle XRSTOR failure on Intel Skylake CPUs The cleanups don't change any functionality, they just (hopefully) make it all clearer, more consistent, more debuggable and more robust. Note that most of the linecount increase comes from these commits, where we better split the user/kernel copy logic by having more variants, instead repeated fragile patterns of: if (kbuf) { memcpy(kbuf + pos, data, copy); } else { if (__copy_to_user(ubuf + pos, data, copy)) return -EFAULT; } The next bunch of commits simplify the FPU state-machine to get rid of old lazy-FPU idiosyncrasies - a defensive simplification to make all the code easier to review and fix. No change in functionality. Then there's a couple of additional debugging tweaks: static checker warning fix and move an FPU related warning to under WARN_ON_FPU(), followed by another bunch of commits that represent a finegrained split-up of the fixes from Eric Biggers to handle weird xstate bits properly. I did this finegrained split-up because some of these fixes also impact the ABI for weird xstate handling, for which we'd like to have good bisection results, should they cause any problems. (We also had one regression with the more monolithic fixes, so splitting it all up sounded prudent for robustness reasons as well.) About the whole series: the commits up to 03eaec81 have been in -next for months - but I've recently rebased them to remove a state machine clean-up commit that was objected to, and to make it more bisectable - so technically it's a new, rebased tree. Robustness history: this series had some regressions along the way, and all reported regressions have been fixed. All but one of the regressions manifested itself as easy to report warnings. The previous version of this latest series was also in linux-next, with one (warning-only) regression reported which is fixed in the latest version. Barring last minute brown paper bag bugs (and the commits are now older by a day which I'd hope helps paperbag reduction), I'm reasonably confident about its general robustness. Famous last words ..." * 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (42 commits) x86/fpu: Use using_compacted_format() instead of open coded X86_FEATURE_XSAVES x86/fpu: Use validate_xstate_header() to validate the xstate_header in copy_user_to_xstate() x86/fpu: Eliminate the 'xfeatures' local variable in copy_user_to_xstate() x86/fpu: Copy the full header in copy_user_to_xstate() x86/fpu: Use validate_xstate_header() to validate the xstate_header in copy_kernel_to_xstate() x86/fpu: Eliminate the 'xfeatures' local variable in copy_kernel_to_xstate() x86/fpu: Copy the full state_header in copy_kernel_to_xstate() x86/fpu: Use validate_xstate_header() to validate the xstate_header in __fpu__restore_sig() x86/fpu: Use validate_xstate_header() to validate the xstate_header in xstateregs_set() x86/fpu: Introduce validate_xstate_header() x86/fpu: Rename fpu__activate_fpstate_read/write() to fpu__prepare_[read|write]() x86/fpu: Rename fpu__activate_curr() to fpu__initialize() x86/fpu: Simplify and speed up fpu__copy() x86/fpu: Fix stale comments about lazy FPU logic x86/fpu: Rename fpu::fpstate_active to fpu::initialized x86/fpu: Remove fpu__current_fpstate_write_begin/end() x86/fpu: Fix fpu__activate_fpstate_read() and update comments x86/fpu: Reinitialize FPU registers if restoring FPU state fails x86/fpu: Don't let userspace set bogus xcomp_bv x86/fpu: Turn WARN_ON() in context switch into WARN_ON_FPU() ...
-
由 Harish Chegondi 提交于
Failure to tune PCIe capabilities should not fail driver load. This can cause the driver load to fail on systems with any of the following: 1. HFI's parent is not root. Example: HFI card is behind a PCIe bridge. 2. HFI's parent is not PCI Express capable. In these situations, failure to tune PCIe capabilities should be logged in the system message logs but not cause the driver load to fail. This patch also ensures pcie capability word DevCtl is written only after a successful read and the capability tuning process continues even if read/write of the pcie capability word DevCtl fails. Fixes: c53df62c ("IB/hfi1: Check return values from PCI config API calls") Fixes: bf70a775 ("staging/rdma/hfi1: Enable WFR PCIe extended tags from the driver") Reviewed-by: NMichael J. Ruhl <michael.j.ruhl@intel.com> Reviewed-by: NMike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: NJakub Byczkowski <jakub.byczkowski@intel.com> Signed-off-by: NHarish Chegondi <harish.chegondi@intel.com> Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-