提交 · 7e868b6eff21edb59eb6a723dfd18761476ddb46 · openeuler / Kernel

18 12月, 2014 5 次提交

Y
libceph: require cephx message signature by default · a3fc9800
由 Yan, Zheng 提交于 11月 11, 2014
```
Signed-off-by: NYan, Zheng <zyan@redhat.com>
Reviewed-by: NIlya Dryomov <idryomov@redhat.com>
```
a3fc9800

libceph: update ceph_msg_header structure · d4e1a4e0

由 John Spray 提交于 10月 16, 2014

2 bytes of what was reserved space is now used by userspace for the
compat_version field.
Signed-off-by: NJohn Spray <john.spray@redhat.com>
Reviewed-by: NSage Weil <sage@redhat.com>

d4e1a4e0

Y
libceph: message signature support · 33d07337
由 Yan, Zheng 提交于 11月 04, 2014
```
Signed-off-by: NYan, Zheng <zyan@redhat.com>
```
33d07337

libceph: nuke ceph_kvfree() · 4965fc38

由 Ilya Dryomov 提交于 10月 23, 2014

Use kvfree() from linux/mm.h instead, which is identical.  Also fix the
ceph_buffer comment: we will allocate with kmalloc() up to 32k - the
value of PAGE_ALLOC_COSTLY_ORDER, but that really is just an
implementation detail so don't mention it at all.
Signed-off-by: NIlya Dryomov <idryomov@redhat.com>

4965fc38

ceph: fix file lock interruption · 9280be24

由 Yan, Zheng 提交于 10月 14, 2014

When a lock operation is interrupted, current code sends a unlock request to
MDS to undo the lock operation. This method does not work as expected because
the unlock request can drop locks that have already been acquired.

The fix is use the newly introduced CEPH_LOCK_FCNTL_INTR/CEPH_LOCK_FLOCK_INTR
requests to interrupt blocked file lock request. These requests do not drop
locks that have alread been acquired, they only interrupt blocked file lock
request.
Signed-off-by: NYan, Zheng <zyan@redhat.com>

9280be24

26 11月, 2014 1 次提交

kvm: fix kvm_is_mmio_pfn() and rename to kvm_is_reserved_pfn() · d3fccc7e

由 Ard Biesheuvel 提交于 11月 10, 2014

This reverts commit 85c8555f ("KVM: check for !is_zero_pfn() in
kvm_is_mmio_pfn()") and renames the function to kvm_is_reserved_pfn.

The problem being addressed by the patch above was that some ARM code
based the memory mapping attributes of a pfn on the return value of
kvm_is_mmio_pfn(), whose name indeed suggests that such pfns should
be mapped as device memory.

However, kvm_is_mmio_pfn() doesn't do quite what it says on the tin,
and the existing non-ARM users were already using it in a way which
suggests that its name should probably have been 'kvm_is_reserved_pfn'
from the beginning, e.g., whether or not to call get_page/put_page on
it etc. This means that returning false for the zero page is a mistake
and the patch above should be reverted.
Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

d3fccc7e

24 11月, 2014 2 次提交

PCI/MSI: Add device flag indicating that 64-bit MSIs don't work · f144d149

由 Benjamin Herrenschmidt 提交于 10月 03, 2014

This can be set by quirks/drivers to be used by the architecture code
that assigns the MSI addresses.

We additionally add verification in the core MSI code that the values
assigned by the architecture do satisfy the limitation in order to fail
gracefully if they don't (ie. the arch hasn't been updated to deal with
that quirk yet).
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
CC: <stable@vger.kernel.org>
Acked-by: NBjorn Helgaas <bhelgaas@google.com>

f144d149

percpu-ref: fix DEAD flag contamination of percpu pointer · 4aab3b5b

由 Tejun Heo 提交于 11月 22, 2014

While decoupling ATOMIC and DEAD flags, f47ad457 ("percpu_ref:
decouple switching to percpu mode and reinit") updated
__ref_is_percpu() so that it only tests ATOMIC flag to determine
whether the ref is in percpu mode or not; however, while DEAD implies
ATOMIC, the two flags are set separately during percpu_ref_kill() and
if __ref_is_percpu() races percpu_ref_kill(), it may see DEAD w/o
ATOMIC.  Because __ref_is_percpu() returns @ref->percpu_count_ptr
value verbatim as the percpu pointer after testing ATOMIC, the pointer
may now be contaminated with the DEAD flag.

This can be fixed by clearing the flag bits before returning the
pointer which was the fix proposed by Shaohua; however, as DEAD
implies ATOMIC, we can just test for both flags at once and avoid the
explicit masking.

Update __ref_is_percpu() so that it tests that both ATOMIC and DEAD
are clear before returning @ref->percpu_count_ptr as the percpu
pointer.
Signed-off-by: NTejun Heo <tj@kernel.org>
Reported-and-Reviewed-by: NShaohua Li <shli@kernel.org>
Link: http://lkml.kernel.org/r/995deb699f5b873c45d667df4add3b06f73c2c25.1416638887.git.shli@kernel.org
Fixes: f47ad457 ("percpu_ref: decouple switching to percpu mode and reinit")

4aab3b5b

18 11月, 2014 2 次提交

can: dev: add can_is_canfd_skb() API · 98e69016

由 Dong Aisheng 提交于 11月 07, 2014

The CAN device drivers can use can_is_canfd_skb() to check if the frame to send
is on CAN FD mode or normal CAN mode.
Acked-by: NOliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: NDong Aisheng <b29396@freescale.com>
Signed-off-by: NMarc Kleine-Budde <mkl@pengutronix.de>

98e69016

clk-divider: Fix READ_ONLY when divider > 1 · e6d5e7d9

由 James Hogan 提交于 11月 14, 2014

Commit 79c6ab50 (clk: divider: add CLK_DIVIDER_READ_ONLY flag) in
v3.16 introduced the CLK_DIVIDER_READ_ONLY flag which caused the
recalc_rate() and round_rate() clock callbacks to be omitted.

However using this flag has the unfortunate side effect of causing the
clock recalculation code when a clock rate change is attempted to always
treat it as a pass-through clock, i.e. with a fixed divide of 1, which
may not be the case. Child clock rates are then recalculated using the
wrong parent rate.

Therefore instead of dropping the recalc_rate() and round_rate()
callbacks, alter clk_divider_bestdiv() to always report the current
divider as the best divider so that it is never altered.

For me the read only clock was the system clock, which divided the PLL
rate by 2, from which both the UART and the SPI clocks were divided.
Initial setting of the UART rate set it correctly, but when the SPI
clock was set, the other child clocks were miscalculated. The UART clock
was recalculated using the PLL rate as the parent rate, resulting in a
UART new_rate of double what it should be, and a UART which spewed forth
garbage when the rate changes were propagated.
Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
Cc: Thomas Abraham <thomas.ab@samsung.com>
Cc: Tomasz Figa <t.figa@samsung.com>
Cc: Max Schwarz <max.schwarz@online.de>
Cc: <stable@vger.kernel.org> # v3.16+
Acked-by: NHaojian Zhuang <haojian.zhuang@gmail.com>
Signed-off-by: NMichael Turquette <mturquette@linaro.org>

e6d5e7d9

16 11月, 2014 3 次提交

sched/cputime: Fix cpu_timer_sample_group() double accounting · 23cfa361

由 Peter Zijlstra 提交于 11月 12, 2014

While looking over the cpu-timer code I found that we appear to add
the delta for the calling task twice, through:

  cpu_timer_sample_group()
    thread_group_cputimer()
      thread_group_cputime()
        times->sum_exec_runtime += task_sched_runtime();

    *sample = cputime.sum_exec_runtime + task_delta_exec();

Which would make the sample run ahead, making the sleep short.
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Link: http://lkml.kernel.org/r/20141112113737.GI10476@twins.programming.kicks-ass.netSigned-off-by: NIngo Molnar <mingo@kernel.org>

23cfa361

bitops: Fix shift overflow in GENMASK macros · 00b4d9a1

由 Maxime COQUELIN 提交于 11月 06, 2014

On some 32 bits architectures, including x86, GENMASK(31, 0) returns 0
instead of the expected ~0UL.

This is the same on some 64 bits architectures with GENMASK_ULL(63, 0).

This is due to an overflow in the shift operand, 1 << 32 for GENMASK,
1 << 64 for GENMASK_ULL.
Reported-by: NEric Paire <eric.paire@st.com>
Suggested-by: NRasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: NMaxime Coquelin <maxime.coquelin@st.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Cc: <stable@vger.kernel.org> # v3.13+
Cc: linux@rasmusvillemoes.dk
Cc: gong.chen@linux.intel.com
Cc: John Sullivan <jsrhbz@kanargh.force9.co.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Fixes: 10ef6b0d ("bitops: Introduce a more generic BITMASK macro")
Link: http://lkml.kernel.org/r/1415267659-10563-1-git-send-email-maxime.coquelin@st.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

00b4d9a1

iio: Fix IIO_EVENT_CODE_EXTRACT_DIR bit mask · ccf54555

由 Cristina Ciocan 提交于 11月 11, 2014

The direction field is set on 7 bits, thus we need to AND it with 0111 111 mask
in order to retrieve it, that is 0x7F, not 0xCF as it is now.

Fixes: ade7ef7b (staging:iio: Differential channel handling)
Signed-off-by: NCristina Ciocan <cristina.ciocan@intel.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: NJonathan Cameron <jic23@kernel.org>

ccf54555

15 11月, 2014 1 次提交

inetdevice: fixed signed integer overflow · 84bc8868

由 Vincent BENAYOUN 提交于 11月 13, 2014

There could be a signed overflow in the following code.

The expression, (32-logmask) is comprised between 0 and 31 included.
It may be equal to 31.
In such a case the left shift will produce a signed integer overflow.
According to the C99 Standard, this is an undefined behavior.
A simple fix is to replace the signed int 1 with the unsigned int 1U.
Signed-off-by: NVincent BENAYOUN <vincent.benayoun@trust-in-soft.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

84bc8868

14 11月, 2014 2 次提交

mem-hotplug: reset node managed pages when hot-adding a new pgdat · f784a3f1

由 Tang Chen 提交于 11月 13, 2014

In free_area_init_core(), zone->managed_pages is set to an approximate
value for lowmem, and will be adjusted when the bootmem allocator frees
pages into the buddy system.

But free_area_init_core() is also called by hotadd_new_pgdat() when
hot-adding memory.  As a result, zone->managed_pages of the newly added
node's pgdat is set to an approximate value in the very beginning.

Even if the memory on that node has node been onlined,
/sys/device/system/node/nodeXXX/meminfo has wrong value:

  hot-add node2 (memory not onlined)
  cat /sys/device/system/node/node2/meminfo
  Node 2 MemTotal:       33554432 kB
  Node 2 MemFree:               0 kB
  Node 2 MemUsed:        33554432 kB
  Node 2 Active:                0 kB

This patch fixes this problem by reset node managed pages to 0 after
hot-adding a new node.

1. Move reset_managed_pages_done from reset_node_managed_pages() to
   reset_all_zones_managed_pages()
2. Make reset_node_managed_pages() non-static
3. Call reset_node_managed_pages() in hotadd_new_pgdat() after pgdat
   is initialized
Signed-off-by: NTang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: NYasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: <stable@vger.kernel.org>	[3.16+]
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

f784a3f1

mm/page_alloc: fix incorrect isolation behavior by rechecking migratetype · ad53f92e

由 Joonsoo Kim 提交于 11月 13, 2014

Before describing bugs itself, I first explain definition of freepage.

 1. pages on buddy list are counted as freepage.
 2. pages on isolate migratetype buddy list are *not* counted as freepage.
 3. pages on cma buddy list are counted as CMA freepage, too.

Now, I describe problems and related patch.

Patch 1: There is race conditions on getting pageblock migratetype that
it results in misplacement of freepages on buddy list, incorrect
freepage count and un-availability of freepage.

Patch 2: Freepages on pcp list could have stale cached information to
determine migratetype of buddy list to go.  This causes misplacement of
freepages on buddy list and incorrect freepage count.

Patch 4: Merging between freepages on different migratetype of
pageblocks will cause freepages accouting problem.  This patch fixes it.

Without patchset [3], above problem doesn't happens on my CMA allocation
test, because CMA reserved pages aren't used at all.  So there is no
chance for above race.

With patchset [3], I did simple CMA allocation test and get below
result:

 - Virtual machine, 4 cpus, 1024 MB memory, 256 MB CMA reservation
 - run kernel build (make -j16) on background
 - 30 times CMA allocation(8MB * 30 = 240MB) attempts in 5 sec interval
 - Result: more than 5000 freepage count are missed

With patchset [3] and this patchset, I found that no freepage count are
missed so that I conclude that problems are solved.

On my simple memory offlining test, these problems also occur on that
environment, too.

This patch (of 4):

There are two paths to reach core free function of buddy allocator,
__free_one_page(), one is free_one_page()->__free_one_page() and the
other is free_hot_cold_page()->free_pcppages_bulk()->__free_one_page().
Each paths has race condition causing serious problems.  At first, this
patch is focused on first type of freepath.  And then, following patch
will solve the problem in second type of freepath.

In the first type of freepath, we got migratetype of freeing page
without holding the zone lock, so it could be racy.  There are two cases
of this race.

 1. pages are added to isolate buddy list after restoring orignal
    migratetype

    CPU1                                   CPU2

    get migratetype => return MIGRATE_ISOLATE
    call free_one_page() with MIGRATE_ISOLATE

                                grab the zone lock
                                unisolate pageblock
                                release the zone lock

    grab the zone lock
    call __free_one_page() with MIGRATE_ISOLATE
    freepage go into isolate buddy list,
    although pageblock is already unisolated

This may cause two problems.  One is that we can't use this page anymore
until next isolation attempt of this pageblock, because freepage is on
isolate buddy list.  The other is that freepage accouting could be wrong
due to merging between different buddy list.  Freepages on isolate buddy
list aren't counted as freepage, but ones on normal buddy list are
counted as freepage.  If merge happens, buddy freepage on normal buddy
list is inevitably moved to isolate buddy list without any consideration
of freepage accouting so it could be incorrect.

 2. pages are added to normal buddy list while pageblock is isolated.
    It is similar with above case.

This also may cause two problems.  One is that we can't keep these
freepages from being allocated.  Although this pageblock is isolated,
freepage would be added to normal buddy list so that it could be
allocated without any restriction.  And the other problem is same as
case 1, that it, incorrect freepage accouting.

This race condition would be prevented by checking migratetype again
with holding the zone lock.  Because it is somewhat heavy operation and
it isn't needed in common case, we want to avoid rechecking as much as
possible.  So this patch introduce new variable, nr_isolate_pageblock in
struct zone to check if there is isolated pageblock.  With this, we can
avoid to re-check migratetype in common case and do it only if there is
isolated pageblock or migratetype is MIGRATE_ISOLATE.  This solve above
mentioned problems.

Changes from v3:
Add one more check in free_one_page() that checks whether migratetype is
MIGRATE_ISOLATE or not. Without this, abovementioned case 1 could happens.
Signed-off-by: NJoonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: NMinchan Kim <minchan@kernel.org>
Acked-by: NMichal Nazarewicz <mina86@mina86.com>
Acked-by: NVlastimil Babka <vbabka@suse.cz>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Laura Abbott <lauraa@codeaurora.org>
Cc: Heesub Shin <heesub.shin@samsung.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Ritesh Harjani <ritesh.list@gmail.com>
Cc: Gioh Kim <gioh.kim@lge.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

ad53f92e

13 11月, 2014 1 次提交

nfs: fix pnfs direct write memory leak · 8c393f9a

由 Peng Tao 提交于 11月 05, 2014

For pNFS direct writes, layout driver may dynamically allocate ds_cinfo.buckets.
So we need to take care to free them when freeing dreq.

Ideally this needs to be done inside layout driver where ds_cinfo.buckets
are allocated. But buckets are attached to dreq and reused across LD IO iterations.
So I feel it's OK to free them in the generic layer.

Cc: stable@vger.kernel.org [v3.4+]
Signed-off-by: NPeng Tao <tao.peng@primarydata.com>
Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>

8c393f9a

12 11月, 2014 1 次提交

PM / Domains: Fix initial default state of the need_restore flag · 67732cd3

由 Ulf Hansson 提交于 11月 11, 2014

The initial state of the device's need_restore flag should'nt depend on
the current state of the PM domain. For example it should be perfectly
valid to attach an inactive device to a powered PM domain.

The pm_genpd_dev_need_restore() API allow us to update the need_restore
flag to somewhat cope with such scenarios. Typically that should have
been done from drivers/buses ->probe() since it's those that put the
requirements on the value of the need_restore flag.

Until recently, the Exynos SOCs were the only user of the
pm_genpd_dev_need_restore() API, though invoking it from a centralized
location while adding devices to their PM domains.

Due to that Exynos now have swithed to the generic OF-based PM domain
look-up, it's no longer possible to invoke the API from a centralized
location. The reason is because devices are now added to their PM
domains during the probe sequence.

Commit "ARM: exynos: Move to generic PM domain DT bindings"
did the switch for Exynos to the generic OF-based PM domain look-up,
but it also removed the call to pm_genpd_dev_need_restore(). This
caused a regression for some of the Exynos drivers.

To handle things more properly in the generic PM domain, let's change
the default initial value of the need_restore flag to reflect that the
state is unknown. As soon as some of the runtime PM callbacks gets
invoked, update the initial value accordingly.

Moreover, since the generic PM domain is verifying that all devices
are both runtime PM enabled and suspended, using pm_runtime_suspended()
while pm_genpd_poweroff() is invoked from the scheduled work, we can be
sure of that the PM domain won't be powering off while having active
devices.

Do note that, the generic PM domain can still only know about active
devices which has been activated through invoking its runtime PM resume
callback. In other words, buses/drivers using pm_runtime_set_active()
during ->probe() will still suffer from a race condition, potentially
probing a device without having its PM domain being powered. That issue
will have to be solved using a different approach.

This a log from the boot regression for Exynos5, which is being fixed in
this patch.

------------[ cut here ]------------
WARNING: CPU: 0 PID: 308 at ../drivers/clk/clk.c:851 clk_disable+0x24/0x30()
Modules linked in:
CPU: 0 PID: 308 Comm: kworker/0:1 Not tainted 3.18.0-rc3-00569-gbd9449f-dirty #10
Workqueue: pm pm_runtime_work
[<c0013c64>] (unwind_backtrace) from [<c0010dec>] (show_stack+0x10/0x14)
[<c0010dec>] (show_stack) from [<c03ee4cc>] (dump_stack+0x70/0xbc)
[<c03ee4cc>] (dump_stack) from [<c0020d34>] (warn_slowpath_common+0x64/0x88)
[<c0020d34>] (warn_slowpath_common) from [<c0020d74>] (warn_slowpath_null+0x1c/0x24)
[<c0020d74>] (warn_slowpath_null) from [<c03107b0>] (clk_disable+0x24/0x30)
[<c03107b0>] (clk_disable) from [<c02cc834>] (gsc_runtime_suspend+0x128/0x160)
[<c02cc834>] (gsc_runtime_suspend) from [<c0249024>] (pm_generic_runtime_suspend+0x2c/0x38)
[<c0249024>] (pm_generic_runtime_suspend) from [<c024f44c>] (pm_genpd_default_save_state+0x2c/0x8c)
[<c024f44c>] (pm_genpd_default_save_state) from [<c024ff2c>] (pm_genpd_poweroff+0x224/0x3ec)
[<c024ff2c>] (pm_genpd_poweroff) from [<c02501b4>] (pm_genpd_runtime_suspend+0x9c/0xcc)
[<c02501b4>] (pm_genpd_runtime_suspend) from [<c024a4f8>] (__rpm_callback+0x2c/0x60)
[<c024a4f8>] (__rpm_callback) from [<c024a54c>] (rpm_callback+0x20/0x74)
[<c024a54c>] (rpm_callback) from [<c024a930>] (rpm_suspend+0xd4/0x43c)
[<c024a930>] (rpm_suspend) from [<c024bbcc>] (pm_runtime_work+0x80/0x90)
[<c024bbcc>] (pm_runtime_work) from [<c0032a9c>] (process_one_work+0x12c/0x314)
[<c0032a9c>] (process_one_work) from [<c0032cf4>] (worker_thread+0x3c/0x4b0)
[<c0032cf4>] (worker_thread) from [<c003747c>] (kthread+0xcc/0xe8)
[<c003747c>] (kthread) from [<c000e738>] (ret_from_fork+0x14/0x3c)
---[ end trace 40cd58bcd6988f12 ]---

Fixes: a4a8c2c4 (ARM: exynos: Move to generic PM domain DT bindings)
Reported-and-tested0by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Reviewed-by: NSylwester Nawrocki <s.nawrocki@samsung.com>
Reviewed-by: NKevin Hilman <khilman@linaro.org>
Signed-off-by: NUlf Hansson <ulf.hansson@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

67732cd3

11 11月, 2014 1 次提交

tracing: Do not busy wait in buffer splice · e30f53aa

由 Rabin Vincent 提交于 11月 10, 2014

On a !PREEMPT kernel, attempting to use trace-cmd results in a soft
lockup:

 # trace-cmd record -e raw_syscalls:* -F false
 NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [trace-cmd:61]
 ...
 Call Trace:
  [<ffffffff8105b580>] ? __wake_up_common+0x90/0x90
  [<ffffffff81092e25>] wait_on_pipe+0x35/0x40
  [<ffffffff810936e3>] tracing_buffers_splice_read+0x2e3/0x3c0
  [<ffffffff81093300>] ? tracing_stats_read+0x2a0/0x2a0
  [<ffffffff812d10ab>] ? _raw_spin_unlock+0x2b/0x40
  [<ffffffff810dc87b>] ? do_read_fault+0x21b/0x290
  [<ffffffff810de56a>] ? handle_mm_fault+0x2ba/0xbd0
  [<ffffffff81095c80>] ? trace_event_buffer_lock_reserve+0x40/0x80
  [<ffffffff810951e2>] ? trace_buffer_lock_reserve+0x22/0x60
  [<ffffffff81095c80>] ? trace_event_buffer_lock_reserve+0x40/0x80
  [<ffffffff8112415d>] do_splice_to+0x6d/0x90
  [<ffffffff81126971>] SyS_splice+0x7c1/0x800
  [<ffffffff812d1edd>] tracesys_phase2+0xd3/0xd8

The problem is this: tracing_buffers_splice_read() calls
ring_buffer_wait() to wait for data in the ring buffers.  The buffers
are not empty so ring_buffer_wait() returns immediately.  But
tracing_buffers_splice_read() calls ring_buffer_read_page() with full=1,
meaning it only wants to read a full page.  When the full page is not
available, tracing_buffers_splice_read() tries to wait again with
ring_buffer_wait(), which again returns immediately, and so on.

Fix this by adding a "full" argument to ring_buffer_wait() which will
make ring_buffer_wait() wait until the writer has left the reader's
page, i.e.  until full-page reads will succeed.

Link: http://lkml.kernel.org/r/1415645194-25379-1-git-send-email-rabin@rab.in

Cc: stable@vger.kernel.org # 3.16+
Fixes: b1169cc6 ("tracing: Remove mock up poll wait function")
Signed-off-by: NRabin Vincent <rabin@rab.in>
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

e30f53aa

10 11月, 2014 1 次提交

mfd: max77693: Fix always masked MUIC interrupts · c0acb814

由 Krzysztof Kozlowski 提交于 10月 10, 2014

All interrupts coming from MUIC were ignored because interrupt source
register was masked.

The Maxim 77693 has a "interrupt source" - a separate register and interrupts
which give information about PMIC block triggering the individual
interrupt (charger, topsys, MUIC, flash LED).

By default bootloader could initialize this register to "mask all"
value. In such case (observed on Trats2 board) MUIC interrupts won't be
generated regardless of their mask status. Regmap irq chip was unmasking
individual MUIC interrupts but the source was masked

Before introducing regmap irq chip this interrupt source was unmasked,
read and acked. Reading and acking is not necessary but unmasking is.

Fixes: 342d669c ("mfd: max77693: Handle IRQs using regmap")

Cc: <stable@vger.kernel.org>
Signed-off-by: NKrzysztof Kozlowski <k.kozlowski@samsung.com>
Reviewed-by: NChanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: NLee Jones <lee.jones@linaro.org>

c0acb814

08 11月, 2014 1 次提交

PM / Domains: Change prototype for the attach and detach callbacks · c16561e8

由 Ulf Hansson 提交于 11月 06, 2014

Convert the prototypes to return an int in order to support error
handling in these callbacks.

Also, as suggested by Dmitry Torokhov, pass the domain pointer for use
inside the callbacks, and so that they match the existing
power_on/power_off callbacks which currently take the domain pointer.
Acked-by: NDmitry Torokhov <dmitry.torokhov@gmail.com>
Acked-by: NGeert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: NUlf Hansson <ulf.hansson@linaro.org>
[ khilman: added domain as parameter to callbacks, as suggested by Dmitry ]
Signed-off-by: NKevin Hilman <khilman@linaro.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

c16561e8

06 11月, 2014 2 次提交

include/linux/socket.h: Fix comment · 9cdb5dbf

由 Rasmus Villemoes 提交于 11月 05, 2014

File descriptors are always closed on exit :-)
Signed-off-by: NRasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9cdb5dbf

PCI: Don't oops on virtual buses in acpi_pci_get_bridge_handle() · 32f638fc

由 Yinghai Lu 提交于 10月 30, 2014

acpi_pci_get_bridge_handle() returns the ACPI handle for the bridge device
(either a host bridge or a PCI-to-PCI bridge) leading to a PCI bus.  But
SR-IOV virtual functions can be on a virtual bus with no bridge leading to
it.  Return a NULL acpi_handle in this case instead of trying to
dereference the NULL pointer to the bridge.

This fixes a NULL pointer dereference oops in pci_get_hp_params() when
adding SR-IOV VF devices on virtual buses.

[bhelgaas: changelog, add comment in code]
Fixes: 6cd33649 ("PCI: Add pci_configure_device() during enumeration")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=87591Reported-by: NChao Zhou <chao.zhou@intel.com>
Reported-by: NJoerg Roedel <joro@8bytes.org>
Signed-off-by: NYinghai Lu <yinghai@kernel.org>
Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>

32f638fc

04 11月, 2014 1 次提交

of: Fix overflow bug in string property parsing functions · a87fa1d8

由 Grant Likely 提交于 11月 03, 2014

The string property read helpers will run off the end of the buffer if
it is handed a malformed string property. Rework the parsers to make
sure that doesn't happen. At the same time add new test cases to make
sure the functions behave themselves.

The original implementations of of_property_read_string_index() and
of_property_count_strings() both open-coded the same block of parsing
code, each with it's own subtly different bugs. The fix here merges
functions into a single helper and makes the original functions static
inline wrappers around the helper.

One non-bugfix aspect of this patch is the addition of a new wrapper,
of_property_read_string_array(). The new wrapper is needed by the
device_properties feature that Rafael is working on and planning to
merge for v3.19. The implementation is identical both with and without
the new static inline wrapper, so it just got left in to reduce the
churn on the header file.
Signed-off-by: NGrant Likely <grant.likely@linaro.org>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Darren Hart <darren.hart@intel.com>
Cc: <stable@vger.kernel.org>  # v3.3+: Drop selftest hunks that don't apply

a87fa1d8

31 10月, 2014 2 次提交

Return short read or 0 at end of a raw device, not EIO · b2de525f

由 David Jeffery 提交于 9月 29, 2014

Author: David Jeffery <djeffery@redhat.com>
Changes to the basic direct I/O code have broken the raw driver when reading
to the end of a raw device.  Instead of returning a short read for a read that
extends partially beyond the device's end or 0 when at the end of the device,
these reads now return EIO.

The raw driver needs the same end of device handling as was added for normal
block devices.  Using blkdev_read_iter, which has the needed size checks,
prevents the EIO conditions at the end of the device.
Signed-off-by: NDavid Jeffery <djeffery@redhat.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

b2de525f

net: skb_fclone_busy() needs to detect orphaned skb · 39bb5e62

由 Eric Dumazet 提交于 10月 30, 2014

Some drivers are unable to perform TX completions in a bound time.
They instead call skb_orphan()

Problem is skb_fclone_busy() has to detect this case, otherwise
we block TCP retransmits and can freeze unlucky tcp sessions on
mostly idle hosts.
Signed-off-by: NEric Dumazet <edumazet@google.com>
Fixes: 1f3279ae ("tcp: avoid retransmits of TCP packets hanging in host queues")
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

39bb5e62

30 10月, 2014 4 次提交

mm: memcontrol: fix missed end-writeback page accounting · d7365e78

由 Johannes Weiner 提交于 10月 29, 2014

Commit 0a31bc97 ("mm: memcontrol: rewrite uncharge API") changed
page migration to uncharge the old page right away.  The page is locked,
unmapped, truncated, and off the LRU, but it could race with writeback
ending, which then doesn't unaccount the page properly:

test_clear_page_writeback()              migration
                                           wait_on_page_writeback()
  TestClearPageWriteback()
                                           mem_cgroup_migrate()
                                             clear PCG_USED
  mem_cgroup_update_page_stat()
    if (PageCgroupUsed(pc))
      decrease memcg pages under writeback

  release pc->mem_cgroup->move_lock

The per-page statistics interface is heavily optimized to avoid a
function call and a lookup_page_cgroup() in the file unmap fast path,
which means it doesn't verify whether a page is still charged before
clearing PageWriteback() and it has to do it in the stat update later.

Rework it so that it looks up the page's memcg once at the beginning of
the transaction and then uses it throughout.  The charge will be
verified before clearing PageWriteback() and migration can't uncharge
the page as long as that is still set.  The RCU lock will protect the
memcg past uncharge.

As far as losing the optimization goes, the following test results are
from a microbenchmark that maps, faults, and unmaps a 4GB sparse file
three times in a nested fashion, so that there are two negative passes
that don't account but still go through the new transaction overhead.
There is no actual difference:

 old:     33.195102545 seconds time elapsed       ( +-  0.01% )
 new:     33.199231369 seconds time elapsed       ( +-  0.03% )

The time spent in page_remove_rmap()'s callees still adds up to the
same, but the time spent in the function itself seems reduced:

     # Children      Self  Command        Shared Object       Symbol
 old:     0.12%     0.11%  filemapstress  [kernel.kallsyms]   [k] page_remove_rmap
 new:     0.12%     0.08%  filemapstress  [kernel.kallsyms]   [k] page_remove_rmap
Signed-off-by: NJohannes Weiner <hannes@cmpxchg.org>
Acked-by: NMichal Hocko <mhocko@suse.cz>
Cc: Vladimir Davydov <vdavydov@parallels.com>
Cc: <stable@vger.kernel.org>	[3.17.x]
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

d7365e78

mm: page-writeback: inline account_page_dirtied() into single caller · 3a3c02ec

由 Johannes Weiner 提交于 10月 29, 2014

A follow-up patch would have changed the call signature.  To save the
trouble, just fold it instead.
Signed-off-by: NJohannes Weiner <hannes@cmpxchg.org>
Acked-by: NMichal Hocko <mhocko@suse.cz>
Cc: Vladimir Davydov <vdavydov@parallels.com>
Cc: <stable@vger.kernel.org>	[3.17.x]
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

3a3c02ec

mm, thp: fix collapsing of hugepages on madvise · 6d50e60c

由 David Rientjes 提交于 10月 29, 2014

If an anonymous mapping is not allowed to fault thp memory and then
madvise(MADV_HUGEPAGE) is used after fault, khugepaged will never
collapse this memory into thp memory.

This occurs because the madvise(2) handler for thp, hugepage_madvise(),
clears VM_NOHUGEPAGE on the stack and it isn't stored in vma->vm_flags
until the final action of madvise_behavior().  This causes the
khugepaged_enter_vma_merge() to be a no-op in hugepage_madvise() when
the vma had previously had VM_NOHUGEPAGE set.

Fix this by passing the correct vma flags to the khugepaged mm slot
handler.  There's no chance khugepaged can run on this vma until after
madvise_behavior() returns since we hold mm->mmap_sem.

It would be possible to clear VM_NOHUGEPAGE directly from vma->vm_flags
in hugepage_advise(), but I didn't want to introduce special case
behavior into madvise_behavior().  I think it's best to just let it
always set vma->vm_flags itself.
Signed-off-by: NDavid Rientjes <rientjes@google.com>
Reported-by: NSuleiman Souhlal <suleiman@google.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

6d50e60c

drivers: of: add return value to of_reserved_mem_device_init() · 47f29df7

由 Marek Szyprowski 提交于 10月 29, 2014

Driver calling of_reserved_mem_device_init() might be interested if the
initialization has been successful or not, so add support for returning
error code.

This fixes a build warining caused by commit 7bfa5ab6 ("drivers:
dma-coherent: add initialization from device tree"), which has been
merged without this change and without fixing function return value.

Fixes: 7bfa5ab6 ("drivers: dma-coherent: add initialization from device tree")
Signed-off-by: NMarek Szyprowski <m.szyprowski@samsung.com>
Acked-by: NArnd Bergmann <arnd@arndb.de>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Grant Likely <grant.likely@linaro.org>
Cc: Laura Abbott <lauraa@codeaurora.org>
Cc: Josh Cartwright <joshc@codeaurora.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

47f29df7

29 10月, 2014 5 次提交

block: Fix merge logic when CONFIG_BLK_DEV_INTEGRITY is not defined · cb1a5ab6

由 Martin K. Petersen 提交于 10月 28, 2014

Commit 4eaf99be switched to returning bool and as a result reversed
the logic of the integrity merge checks.  However, the empty stubs used
when the block integrity code is compiled out were still returning
0. Make these stubs return "true".
Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
Reported-by: NMichael L. Semon <mlsemon35@gmail.com>
Tested-by: NMichael L. Semon <mlsemon35@gmail.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

cb1a5ab6

overlayfs: fix lockdep misannotation · d1b72cc6

由 Miklos Szeredi 提交于 10月 27, 2014

In an overlay directory that shadows an empty lower directory, say
/mnt/a/empty102, do:

 	touch /mnt/a/empty102/x
 	unlink /mnt/a/empty102/x
 	rmdir /mnt/a/empty102

It's actually harmless, but needs another level of nesting between
I_MUTEX_CHILD and I_MUTEX_NORMAL.
Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>
Tested-by: NDavid Howells <dhowells@redhat.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

d1b72cc6

rcu: Provide counterpart to rcu_dereference() for non-RCU situations · 54ef6df3

由 Paul E. McKenney 提交于 10月 27, 2014

Although rcu_dereference() and friends can be used in situations where
object lifetimes are being managed by something other than RCU, the
resulting sparse and lockdep-RCU noise can be annoying. This commit
therefore supplies a lockless_dereference(), which provides the
protection for dereferences without the RCU-related debugging noise.
Reported-by: NAl Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

54ef6df3

usbnet: add a callback for set_rx_mode · 1efed2d0

由 Olivier Blin 提交于 10月 24, 2014

To delegate promiscuous mode and multicast filtering to the subdriver.
Signed-off-by: NOlivier Blin <olivier.blin@softathome.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1efed2d0

skbuff.h: fix kernel-doc warning for headers_end · ebcf34f3

由 Randy Dunlap 提交于 10月 26, 2014

Fix kernel-doc warning in <linux/skbuff.h> by making both headers_start
and headers_end private fields.

Warning(..//include/linux/skbuff.h:654): No description found for parameter 'headers_end[0]'
Signed-off-by: NRandy Dunlap <rdunlap@infradead.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ebcf34f3

28 10月, 2014 5 次提交

compiler/gcc4+: Remove inaccurate comment about 'asm goto' miscompiles · 5631b8fb

由 Steven Noonan 提交于 10月 25, 2014

The bug referenced by the comment in this commit was not
completely fixed in GCC 4.8.2, as I mentioned in a thread back
in February:

   https://lkml.org/lkml/2014/2/12/797

The conclusion at that time was to make the quirk unconditional
until the bug could be found and fixed in GCC. Unfortunately,
when I submitted the patch (commit a9f18034) I left a comment
in that claimed the bug was fixed in GCC 4.8.2+.

This comment is inaccurate, and should be removed.
Signed-off-by: NSteven Noonan <steven@uplinklabs.net>
Signed-off-by: NIngo Molnar <mingo@kernel.org>
Cc: Jakub Jelinek <jakub@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1414274982-14040-1-git-send-email-steven@uplinklabs.net
Cc: Ingo Molnar <mingo@kernel.org>

5631b8fb

Revert "block: all blk-mq requests are tagged" · e999dbc2

由 Christoph Hellwig 提交于 10月 19, 2014

This reverts commit fb3ccb5d.

SCSI-2/SPI actually needs the tagged/untagged flag in the request to
work properly.  Revert this patch and add a follow on to set it in
the right place.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com>
Acked-by: NJens Axboe <axboe@kernel.dk>
Reported-by: NMeelis Roos <mroos@linux.ee>
Tested-by: NMeelis Roos <mroos@linux.ee>
Cc: stable@vger.kernel.org

e999dbc2

power: charger-manager: Fix accessing invalidated power supply after charger unbind · cdaf3e15

由 Krzysztof Kozlowski 提交于 10月 13, 2014

The charger manager obtained in probe references to power supplies for
all chargers with power_supply_get_by_name() for later usage. However
if such charger driver was removed then this reference would point to
old power supply (from driver which was removed).

This lead to accessing invalid memory which could be observed with:
$ echo "max77693-charger" > /sys/bus/platform/drivers/max77693-charger/unbind
$ grep . /sys/devices/virtual/power_supply/battery/charger.0/*
$ grep . /sys/devices/virtual/power_supply/battery/*
[   15.339817] Unable to handle kernel paging request at virtual address 0001c12c
[   15.346187] pgd = edd08000
[   15.348814] [0001c12c] *pgd=6dce2831, *pte=00000000, *ppte=00000000
[   15.355075] Internal error: Oops: 80000007 [#1] PREEMPT SMP ARM
[   15.360967] Modules linked in:
[   15.364010] CPU: 2 PID: 1388 Comm: grep Not tainted 3.17.0-next-20141007-00027-ga95e761db1b0 #245
[   15.372859] task: ee03ad00 ti: edcf6000 task.ti: edcf6000
[   15.378241] PC is at 0x1c12c
[   15.381113] LR is at is_ext_pwr_online+0x30/0x6c
[   15.385706] pc : [<0001c12c>]    lr : [<c0339fc4>]    psr: a0000013
[   15.385706] sp : edcf7e88  ip : 00000000  fp : 00000000
[   15.397161] r10: eeb02c08  r9 : c04b1f84  r8 : eeb02c00
[   15.402369] r7 : edc69a10  r6 : eea6ac10  r5 : eea6ac10  r4 : 00000004
[   15.408878] r3 : 0001c12c  r2 : edcf7e8c  r1 : 00000004  r0 : ee914418
[   15.415390] Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
[   15.422506] Control: 10c5387d  Table: 6dd0804a  DAC: 00000015
[   15.428236] Process grep (pid: 1388, stack limit = 0xedcf6240)
[   15.434050] Stack: (0xedcf7e88 to 0xedcf8000)
[   15.438395] 7e80:                   ee03ad00 00000000 edcf7f80 eea6aca8 edcf7ec4 c033b7b0
[   15.446554] 7ea0: 00000001 ee1cc3f0 00000004 c06e1e44 eebdc000 c06e1e44 eeb02c00 c0337144
[   15.454713] 7ec0: ee2dac68 c005cffc ee1cc3c0 c06e1e44 00000fff 00001000 eebdc000 c0278ca8
[   15.462872] 7ee0: c0278c8c ee1cc3c0 eeb7ce00 c014422c edcf7f20 00008000 ee1cc3c0 ee9a48c0
[   15.471030] 7f00: 00000001 00000001 edcf7f80 c0142d94 c0142d70 c01060f4 00021000 ee1cc3f0
[   15.479190] 7f20: 00000000 00000000 c06a2150 eebdc000 2e7ec000 ee9a48c0 00008000 00021000
[   15.487349] 7f40: edcf7f80 00008000 edcf6000 00021000 00021000 c00e39a4 00000000 ee9a48c0
[   15.495508] 7f60: 00004000 00000000 00000000 ee9a48c0 ee9a48c0 00008000 00021000 c00e3aa0
[   15.503668] 7f80: 00000000 00000000 0001f2e0 0001f2e0 00021000 00001000 00000003 c000f364
[   15.511826] 7fa0: 00000000 c000f1a0 0001f2e0 00021000 00000003 00021000 00008000 00000000
[   15.519986] 7fc0: 0001f2e0 00021000 00001000 00000003 00000001 000205e8 00000000 00021000
[   15.528145] 7fe0: 00008000 bebbe910 0000a7ad b6edc49c 60000010 00000003 aaaaaaaa aaaaaaaa
[   15.536320] [<c0339fc4>] (is_ext_pwr_online) from [<c033b7b0>] (charger_get_property+0x170/0x314)
[   15.545164] [<c033b7b0>] (charger_get_property) from [<c0337144>] (power_supply_show_property+0x48/0x20c)
[   15.554719] [<c0337144>] (power_supply_show_property) from [<c0278ca8>] (dev_attr_show+0x1c/0x48)
[   15.563577] [<c0278ca8>] (dev_attr_show) from [<c014422c>] (sysfs_kf_seq_show+0x84/0x104)
[   15.571725] [<c014422c>] (sysfs_kf_seq_show) from [<c0142d94>] (kernfs_seq_show+0x24/0x28)
[   15.579973] [<c0142d94>] (kernfs_seq_show) from [<c01060f4>] (seq_read+0x1b0/0x484)
[   15.587614] [<c01060f4>] (seq_read) from [<c00e39a4>] (vfs_read+0x88/0x144)
[   15.594552] [<c00e39a4>] (vfs_read) from [<c00e3aa0>] (SyS_read+0x40/0x8c)
[   15.601417] [<c00e3aa0>] (SyS_read) from [<c000f1a0>] (ret_fast_syscall+0x0/0x48)
[   15.608877] Code: bad PC value
[   15.611991] ---[ end trace a88fcc95208db283 ]---

The charger-manager should get reference to charger power supply on
each use of get_property callback.
Signed-off-by: NKrzysztof Kozlowski <k.kozlowski@samsung.com>
Cc: <stable@vger.kernel.org>
Fixes: 3bb3dbbd ("power_supply: Add initial Charger-Manager driver")
Signed-off-by: NSebastian Reichel <sre@kernel.org>

cdaf3e15

power: charger-manager: Fix accessing invalidated power supply after fuel gauge unbind · bdbe8144

由 Krzysztof Kozlowski 提交于 10月 13, 2014

The charger manager obtained reference to fuel gauge power supply in probe
with power_supply_get_by_name() for later usage. However if fuel gauge
driver was removed and re-added then this reference would point to old
power supply (from driver which was removed).

This lead to accessing old (and probably invalid) memory which could be
observed with:
$ echo "12-0036" > /sys/bus/i2c/drivers/max17042/unbind
$ echo "12-0036" > /sys/bus/i2c/drivers/max17042/bind
$ cat /sys/devices/virtual/power_supply/battery/capacity
[  240.480084] INFO: task cat:1393 blocked for more than 120 seconds.
[  240.484799]       Not tainted 3.17.0-next-20141007-00028-ge60b6dd79570 #203
[  240.491782] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  240.499589] cat             D c0469530     0  1393      1 0x00000000
[  240.505947] [<c0469530>] (__schedule) from [<c0469d3c>] (schedule_preempt_disabled+0x14/0x20)
[  240.514449] [<c0469d3c>] (schedule_preempt_disabled) from [<c046af08>] (mutex_lock_nested+0x1bc/0x458)
[  240.523736] [<c046af08>] (mutex_lock_nested) from [<c0287a98>] (regmap_read+0x30/0x60)
[  240.531647] [<c0287a98>] (regmap_read) from [<c032238c>] (max17042_get_property+0x2e8/0x350)
[  240.540055] [<c032238c>] (max17042_get_property) from [<c03247d8>] (charger_get_property+0x264/0x348)
[  240.549252] [<c03247d8>] (charger_get_property) from [<c0320764>] (power_supply_show_property+0x48/0x1e0)
[  240.558808] [<c0320764>] (power_supply_show_property) from [<c027308c>] (dev_attr_show+0x1c/0x48)
[  240.567664] [<c027308c>] (dev_attr_show) from [<c0141fb0>] (sysfs_kf_seq_show+0x84/0x104)
[  240.575814] [<c0141fb0>] (sysfs_kf_seq_show) from [<c0140b18>] (kernfs_seq_show+0x24/0x28)
[  240.584061] [<c0140b18>] (kernfs_seq_show) from [<c0104574>] (seq_read+0x1b0/0x484)
[  240.591702] [<c0104574>] (seq_read) from [<c00e1e24>] (vfs_read+0x88/0x144)
[  240.598640] [<c00e1e24>] (vfs_read) from [<c00e1f20>] (SyS_read+0x40/0x8c)
[  240.605507] [<c00e1f20>] (SyS_read) from [<c000e760>] (ret_fast_syscall+0x0/0x48)
[  240.612952] 4 locks held by cat/1393:
[  240.616589]  #0:  (&p->lock){+.+.+.}, at: [<c01043f4>] seq_read+0x30/0x484
[  240.623414]  #1:  (&of->mutex){+.+.+.}, at: [<c01417dc>] kernfs_seq_start+0x1c/0x8c
[  240.631086]  #2:  (s_active#31){++++.+}, at: [<c01417e4>] kernfs_seq_start+0x24/0x8c
[  240.638777]  #3:  (&map->mutex){+.+...}, at: [<c0287a98>] regmap_read+0x30/0x60

The charger-manager should get reference to fuel gauge power supply on
each use of get_property callback. The thermal zone 'tzd' field of
power supply should not be used because of the same reason.

Additionally this change solves also the issue with nested
thermal_zone_get_temp() calls and related false lockdep positive for
deadlock for thermal zone's mutex [1]. When fuel gauge is used as source of
temperature then the charger manager forwards its get_temp calls to fuel
gauge thermal zone. So actually different mutexes are used (one for
charger manager thermal zone and second for fuel gauge thermal zone) but
for lockdep this is one class of mutex.

The recursion is removed by retrieving temperature through power
supply's get_property().

In case external thermal zone is used ('cm-thermal-zone' property is
present in DTS) the recursion does not exist. Charger manager simply
exports POWER_SUPPLY_PROP_TEMP_AMBIENT property (instead of
POWER_SUPPLY_PROP_TEMP) thus no thermal zone is created for this power
supply.

[1] https://lkml.org/lkml/2014/10/6/309Signed-off-by: NKrzysztof Kozlowski <k.kozlowski@samsung.com>
Cc: <stable@vger.kernel.org>
Fixes: 3bb3dbbd ("power_supply: Add initial Charger-Manager driver")
Signed-off-by: NSebastian Reichel <sre@kernel.org>

bdbe8144

power_supply: Add no_thermal property to prevent recursive get_temp calls · a69d82b9

由 Krzysztof Kozlowski 提交于 10月 07, 2014

Add a 'no_thermal' property to the power supply class. If true then
thermal zone won't be created for this power supply in
power_supply_register().

Power supply drivers may want to set it if they support
POWER_SUPPLY_PROP_TEMP and they are forwarding this get property call to
other thermal zone.

If they won't set it lockdep may report false positive deadlock for
thermal zone's mutex because of nested calls to thermal_zone_get_temp().
First is the call to thermal_zone_get_temp() of the driver's thermal
zone. Thermal core gets POWER_SUPPLY_PROP_TEMP property from this
driver. The driver then calls other thermal zone thermal_zone_get_temp()
and returns result.

Example of such driver is charger manager.
Signed-off-by: NKrzysztof Kozlowski <k.kozlowski@samsung.com>
Signed-off-by: NSebastian Reichel <sre@kernel.org>

a69d82b9

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功