提交 · e19ca3fe6cf230a0e5df70da7acade133f2b670c · openanolis / cloud-kernel

24 3月, 2019 40 次提交

clocksource/drivers/arch_timer: Workaround for Allwinner A64 timer instability · e19ca3fe

由 Samuel Holland 提交于 1月 12, 2019

commit c950ca8c35eeb32224a63adc47e12f9e226da241 upstream.

The Allwinner A64 SoC is known[1] to have an unstable architectural
timer, which manifests itself most obviously in the time jumping forward
a multiple of 95 years[2][3]. This coincides with 2^56 cycles at a
timer frequency of 24 MHz, implying that the time went slightly backward
(and this was interpreted by the kernel as it jumping forward and
wrapping around past the epoch).

Investigation revealed instability in the low bits of CNTVCT at the
point a high bit rolls over. This leads to power-of-two cycle forward
and backward jumps. (Testing shows that forward jumps are about twice as
likely as backward jumps.) Since the counter value returns to normal
after an indeterminate read, each "jump" really consists of both a
forward and backward jump from the software perspective.

Unless the kernel is trapping CNTVCT reads, a userspace program is able
to read the register in a loop faster than it changes. A test program
running on all 4 CPU cores that reported jumps larger than 100 ms was
run for 13.6 hours and reported the following:

 Count | Event
-------+---------------------------
  9940 | jumped backward      699ms
   268 | jumped backward     1398ms
     1 | jumped backward     2097ms
 16020 | jumped forward       175ms
  6443 | jumped forward       699ms
  2976 | jumped forward      1398ms
     9 | jumped forward    356516ms
     9 | jumped forward    357215ms
     4 | jumped forward    714430ms
     1 | jumped forward   3578440ms

This works out to a jump larger than 100 ms about every 5.5 seconds on
each CPU core.

The largest jump (almost an hour!) was the following sequence of reads:
    0x0000007fffffffff → 0x00000093feffffff → 0x0000008000000000

Note that the middle bits don't necessarily all read as all zeroes or
all ones during the anomalous behavior; however the low 10 bits checked
by the function in this patch have never been observed with any other
value.

Also note that smaller jumps are much more common, with backward jumps
of 2048 (2^11) cycles observed over 400 times per second on each core.
(Of course, this is partially explained by lower bits rolling over more
frequently.) Any one of these could have caused the 95 year time skip.

Similar anomalies were observed while reading CNTPCT (after patching the
kernel to allow reads from userspace). However, the CNTPCT jumps are
much less frequent, and only small jumps were observed. The same program
as before (except now reading CNTPCT) observed after 72 hours:

 Count | Event
-------+---------------------------
    17 | jumped backward      699ms
    52 | jumped forward       175ms
  2831 | jumped forward       699ms
     5 | jumped forward      1398ms

Further investigation showed that the instability in CNTPCT/CNTVCT also
affected the respective timer's TVAL register. The following values were
observed immediately after writing CNVT_TVAL to 0x10000000:

 CNTVCT             | CNTV_TVAL  | CNTV_CVAL          | CNTV_TVAL Error
--------------------+------------+--------------------+-----------------
 0x000000d4a2d8bfff | 0x10003fff | 0x000000d4b2d8bfff | +0x00004000
 0x000000d4a2d94000 | 0x0fffffff | 0x000000d4b2d97fff | -0x00004000
 0x000000d4a2d97fff | 0x10003fff | 0x000000d4b2d97fff | +0x00004000
 0x000000d4a2d9c000 | 0x0fffffff | 0x000000d4b2d9ffff | -0x00004000

The pattern of errors in CNTV_TVAL seemed to depend on exactly which
value was written to it. For example, after writing 0x10101010:

 CNTVCT             | CNTV_TVAL  | CNTV_CVAL          | CNTV_TVAL Error
--------------------+------------+--------------------+-----------------
 0x000001ac3effffff | 0x1110100f | 0x000001ac4f10100f | +0x1000000
 0x000001ac40000000 | 0x1010100f | 0x000001ac5110100f | -0x1000000
 0x000001ac58ffffff | 0x1110100f | 0x000001ac6910100f | +0x1000000
 0x000001ac66000000 | 0x1010100f | 0x000001ac7710100f | -0x1000000
 0x000001ac6affffff | 0x1110100f | 0x000001ac7b10100f | +0x1000000
 0x000001ac6e000000 | 0x1010100f | 0x000001ac7f10100f | -0x1000000

I was also twice able to reproduce the issue covered by Allwinner's
workaround[4], that writing to TVAL sometimes fails, and both CVAL and
TVAL are left with entirely bogus values. One was the following values:

 CNTVCT             | CNTV_TVAL  | CNTV_CVAL
--------------------+------------+--------------------------------------
 0x000000d4a2d6014c | 0x8fbd5721 | 0x000000d132935fff (615s in the past)
Reviewed-by: NMarc Zyngier <marc.zyngier@arm.com>

========================================================================

Because the CPU can read the CNTPCT/CNTVCT registers faster than they
change, performing two reads of the register and comparing the high bits
(like other workarounds) is not a workable solution. And because the
timer can jump both forward and backward, no pair of reads can
distinguish a good value from a bad one. The only way to guarantee a
good value from consecutive reads would be to read _three_ times, and
take the middle value only if the three values are 1) each unique and
2) increasing. This takes at minimum 3 counter cycles (125 ns), or more
if an anomaly is detected.

However, since there is a distinct pattern to the bad values, we can
optimize the common case (1022/1024 of the time) to a single read by
simply ignoring values that match the error pattern. This still takes no
more than 3 cycles in the worst case, and requires much less code. As an
additional safety check, we still limit the loop iteration to the number
of max-frequency (1.2 GHz) CPU cycles in three 24 MHz counter periods.

For the TVAL registers, the simple solution is to not use them. Instead,
read or write the CVAL and calculate the TVAL value in software.

Although the manufacturer is aware of at least part of the erratum[4],
there is no official name for it. For now, use the kernel-internal name
"UNKNOWN1".

[1]: https://github.com/armbian/build/commit/a08cd6fe7ae9
[2]: https://forum.armbian.com/topic/3458-a64-datetime-clock-issue/
[3]: https://irclog.whitequark.org/linux-sunxi/2018-01-26
[4]: https://github.com/Allwinner-Homlet/H6-BSP4.9-linux/blob/master/drivers/clocksource/arm_arch_timer.c#L272Acked-by: NMaxime Ripard <maxime.ripard@bootlin.com>
Tested-by: NAndre Przywara <andre.przywara@arm.com>
Signed-off-by: NSamuel Holland <samuel@sholland.org>
Cc: stable@vger.kernel.org
Signed-off-by: NDaniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

e19ca3fe

clocksource/drivers/exynos_mct: Clear timer interrupt when shutdown · ef8062e2

由 Stuart Menefy 提交于 2月 10, 2019

commit d2f276c8d3c224d5b493c42b6cf006ae4e64fb1c upstream.

When shutting down the timer, ensure that after we have stopped the
timer any pending interrupts are cleared. This fixes a problem when
suspending, as interrupts are disabled before the timer is stopped,
so the timer interrupt may still be asserted, preventing the system
entering a low power state when the wfi is executed.
Signed-off-by: NStuart Menefy <stuart.menefy@mathembedded.com>
Reviewed-by: NKrzysztof Kozlowski <krzk@kernel.org>
Tested-by: NMarek Szyprowski <m.szyprowski@samsung.com>
Cc: <stable@vger.kernel.org> # v4.3+
Signed-off-by: NDaniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

ef8062e2

clocksource/drivers/exynos_mct: Move one-shot check from tick clear to ISR · c1f45c10

由 Stuart Menefy 提交于 2月 10, 2019

commit a5719a40aef956ba704f2aa1c7b977224d60fa96 upstream.

When a timer tick occurs and the clock is in one-shot mode, the timer
needs to be stopped to prevent it triggering subsequent interrupts.
Currently this code is in exynos4_mct_tick_clear(), but as it is
only needed when an ISR occurs move it into exynos4_mct_tick_isr(),
leaving exynos4_mct_tick_clear() just doing what its name suggests it
should.
Signed-off-by: NStuart Menefy <stuart.menefy@mathembedded.com>
Reviewed-by: NKrzysztof Kozlowski <krzk@kernel.org>
Tested-by: NMarek Szyprowski <m.szyprowski@samsung.com>
Cc: stable@vger.kernel.org # v4.3+
Signed-off-by: NDaniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

c1f45c10

regulator: s2mpa01: Fix step values for some LDOs · 06607b1b

由 Stuart Menefy 提交于 2月 12, 2019

commit 28c4f730d2a44f2591cb104091da29a38dac49fe upstream.

The step values for some of the LDOs appears to be incorrect, resulting
in incorrect voltages (or at least, ones which are different from the
Samsung 3.4 vendor kernel).
Signed-off-by: NStuart Menefy <stuart.menefy@mathembedded.com>
Reviewed-by: NKrzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: NMark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

06607b1b

regulator: max77620: Initialize values for DT properties · c288e34d

由 Mark Zhang 提交于 1月 10, 2019

commit 0ab66b3c326ef8f77dae9f528118966365757c0c upstream.

If regulator DT node doesn't exist, its of_parse_cb callback
function isn't called. Then all values for DT properties are
filled with zero. This leads to wrong register update for
FPS and POK settings.
Signed-off-by: NJinyoung Park <jinyoungp@nvidia.com>
Signed-off-by: NMark Zhang <markz@nvidia.com>
Signed-off-by: NMark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

c288e34d

regulator: s2mps11: Fix steps for buck7, buck8 and LDO35 · 462aee48

由 Krzysztof Kozlowski 提交于 2月 09, 2019

commit 56b5d4ea778c1b0989c5cdb5406d4a488144c416 upstream.

LDO35 uses 25 mV step, not 50 mV.  Bucks 7 and 8 use 12.5 mV step
instead of 6.25 mV.  Wrong step caused over-voltage (LDO35) or
under-voltage (buck7 and 8) if regulators were used (e.g. on Exynos5420
Arndale Octa board).

Cc: <stable@vger.kernel.org>
Fixes: cb74685e ("regulator: s2mps11: Add samsung s2mps11 regulator driver")
Signed-off-by: NKrzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: NMark Brown <broonie@kernel.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

462aee48

spi: pxa2xx: Setup maximum supported DMA transfer length · 15ead7e2

由 Andy Shevchenko 提交于 2月 19, 2019

commit ef070b4e4aa25bb5f8632ad196644026c11903bf upstream.

When the commit b6ced294

   ("spi: pxa2xx: Switch to SPI core DMA mapping functionality")

switches to SPI core provided DMA helpers, it missed to setup maximum
supported DMA transfer length for the controller and thus users
mistakenly try to send more data than supported with the following
warning:

  ili9341 spi-PRP0001:01: DMA disabled for transfer length 153600 greater than 65536

Setup maximum supported DMA transfer length in order to make users know
the limit.

Fixes: b6ced294 ("spi: pxa2xx: Switch to SPI core DMA mapping functionality")
Signed-off-by: NAndy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: NMark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

15ead7e2

spi: ti-qspi: Fix mmap read when more than one CS in use · e51c5ec9

由 Vignesh R 提交于 1月 29, 2019

commit 673c865efbdc5fec3cc525c46d71844d42c60072 upstream.

Commit 4dea6c9b ("spi: spi-ti-qspi: add mmap mode read support") has
has got order of parameter wrong when calling regmap_update_bits() to
select CS for mmap access. Mask and value arguments are interchanged.
Code will work on a system with single slave, but fails when more than
one CS is in use. Fix this by correcting the order of parameters when
calling regmap_update_bits().

Fixes: 4dea6c9b ("spi: spi-ti-qspi: add mmap mode read support")
Cc: stable@vger.kernel.org
Signed-off-by: NVignesh R <vigneshr@ti.com>
Signed-off-by: NMark Brown <broonie@kernel.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

e51c5ec9

netfilter: ipt_CLUSTERIP: fix warning unused variable cn · 0d98ecb1

由 Anders Roxell 提交于 1月 23, 2019

commit 206b8cc514d7ff2b79dd2d5ad939adc7c493f07a upstream.

When CONFIG_PROC_FS isn't set the variable cn isn't used.

net/ipv4/netfilter/ipt_CLUSTERIP.c: In function ‘clusterip_net_exit’:
net/ipv4/netfilter/ipt_CLUSTERIP.c:849:24: warning: unused variable ‘cn’ [-Wunused-variable]
  struct clusterip_net *cn = clusterip_pernet(net);
                        ^~

Rework so the variable 'cn' is declared inside "#ifdef CONFIG_PROC_FS".

Fixes: b12f7bad5ad3 ("netfilter: ipt_CLUSTERIP: remove wrong WARN_ON_ONCE in netns exit routine")
Signed-off-by: NAnders Roxell <anders.roxell@linaro.org>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

0d98ecb1

mmc:fix a bug when max_discard is 0 · 6bd9959a

由 Jiong Wu 提交于 3月 01, 2019

commit d4721339dcca7def04909a8e60da43c19a24d8bf upstream.

The original purpose of the code I fix is to replace max_discard with
max_trim if max_trim is less than max_discard. When max_discard is 0
we should replace max_discard with max_trim as well, because
max_discard equals 0 happens only when the max_do_calc_max_discard
process is overflowed, so if mmc_can_trim(card) is true, max_discard
should be replaced by an available max_trim.
However, in the original code, there are two lines of code interfere
the right process.
1) if (max_discard && mmc_can_trim(card))
when max_discard is 0, it skips the process checking if max_discard
needs to be replaced with max_trim.
2) if (max_trim < max_discard)
the condition is false when max_discard is 0. it also skips the process
that replaces max_discard with max_trim, in fact, we should replace the
0-valued max_discard with max_trim.
Signed-off-by: NJiong Wu <Lohengrin1024@gmail.com>
Fixes: b305882f (mmc: core: optimize mmc_calc_max_discard)
Cc: stable@vger.kernel.org # v4.17+
Signed-off-by: NUlf Hansson <ulf.hansson@linaro.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

6bd9959a

mmc: sdhci-esdhc-imx: fix HS400 timing issue · 2946910e

由 BOUGH CHEN 提交于 12月 27, 2018

commit de0a0decf2edfc5b0c782915f4120cf990a9bd13 upstream.

Now tuning reset will be done when the timing is MMC_TIMING_LEGACY/
MMC_TIMING_MMC_HS/MMC_TIMING_SD_HS. But for timing MMC_TIMING_MMC_HS,
we can not do tuning reset, otherwise HS400 timing is not right.

Here is the process of init HS400, first finish tuning in HS200 mode,
then switch to HS mode and 8 bit DDR mode, finally switch to HS400
mode. If we do tuning reset in HS mode, this will cause HS400 mode
lost the tuning setting, which will cause CRC error.
Signed-off-by: NHaibo Chen <haibo.chen@nxp.com>
Cc: stable@vger.kernel.org # v4.12+
Acked-by: NAdrian Hunter <adrian.hunter@intel.com>
Fixes: d9370424 ("mmc: sdhci-esdhc-imx: reset tuning circuit when power on mmc card")
Signed-off-by: NUlf Hansson <ulf.hansson@linaro.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

2946910e

ACPI / device_sysfs: Avoid OF modalias creation for removed device · c19b9673

由 Andy Shevchenko 提交于 3月 11, 2019

commit f16eb8a4b096514ac06fb25bf599dcc792899b3d upstream.

If SSDT overlay is loaded via ConfigFS and then unloaded the device,
we would like to have OF modalias for, already gone. Thus, acpi_get_name()
returns no allocated buffer for such case and kernel crashes afterwards:

 ACPI: Host-directed Dynamic ACPI Table Unload
 ads7950 spi-PRP0001:00: Dropping the link to regulator.0
 BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
 #PF error: [normal kernel read fault]
 PGD 80000000070d6067 P4D 80000000070d6067 PUD 70d0067 PMD 0
 Oops: 0000 [#1] SMP PTI
 CPU: 0 PID: 40 Comm: kworker/u4:2 Not tainted 5.0.0+ #96
 Hardware name: Intel Corporation Merrifield/BODEGA BAY, BIOS 542 2015.01.21:18.19.48
 Workqueue: kacpi_hotplug acpi_device_del_work_fn
 RIP: 0010:create_of_modalias.isra.1+0x4c/0x150
 Code: 00 00 48 89 44 24 18 31 c0 48 8d 54 24 08 48 c7 44 24 10 00 00 00 00 48 c7 44 24 08 ff ff ff ff e8 7a b0 03 00 48 8b 4c 24 10 <0f> b6 01 84 c0 74 27 48 c7 c7 00 09 f4 a5 0f b6 f0 8d 50 20 f6 04
 RSP: 0000:ffffa51040297c10 EFLAGS: 00010246
 RAX: 0000000000001001 RBX: 0000000000000785 RCX: 0000000000000000
 RDX: 0000000000001001 RSI: 0000000000000286 RDI: ffffa2163dc042e0
 RBP: ffffa216062b1196 R08: 0000000000001001 R09: ffffa21639873000
 R10: ffffffffa606761d R11: 0000000000000001 R12: ffffa21639873218
 R13: ffffa2163deb5060 R14: ffffa216063d1010 R15: 0000000000000000
 FS:  0000000000000000(0000) GS:ffffa2163e000000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000000 CR3: 0000000007114000 CR4: 00000000001006f0
 Call Trace:
  __acpi_device_uevent_modalias+0xb0/0x100
  spi_uevent+0xd/0x40

 ...

In order to fix above let create_of_modalias() check the status returned
by acpi_get_name() and bail out in case of failure.

Fixes: 8765c5ba ("ACPI / scan: Rework modalias creation when "compatible" is present")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=201381Reported-by: NFerry Toth <fntoth@gmail.com>
Tested-by: Ferry Toth<fntoth@gmail.com>
Signed-off-by: NAndy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: NMika Westerberg <mika.westerberg@linux.intel.com>
Cc: 4.1+ <stable@vger.kernel.org> # 4.1+
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

c19b9673

xen: fix dom0 boot on huge systems · 468ff43f

由 Juergen Gross 提交于 3月 07, 2019

commit 01bd2ac2f55a1916d81dace12fa8d7ae1c79b5ea upstream.

Commit f7c90c2a ("x86/xen: don't write ptes directly in 32-bit
PV guests") introduced a regression for booting dom0 on huge systems
with lots of RAM (in the TB range).

Reason is that on those hosts the p2m list needs to be moved early in
the boot process and this requires temporary page tables to be created.
Said commit modified xen_set_pte_init() to use a hypercall for writing
a PTE, but this requires the page table being in the direct mapped
area, which is not the case for the temporary page tables used in
xen_relocate_p2m().

As the page tables are completely written before being linked to the
actual address space instead of set_pte() a plain write to memory can
be used in xen_relocate_p2m().

Fixes: f7c90c2a ("x86/xen: don't write ptes directly in 32-bit PV guests")
Cc: stable@vger.kernel.org
Signed-off-by: NJuergen Gross <jgross@suse.com>
Reviewed-by: NJan Beulich <jbeulich@suse.com>
Signed-off-by: NJuergen Gross <jgross@suse.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

468ff43f

tracing/perf: Use strndup_user() instead of buggy open-coded version · 24d50976

由 Jann Horn 提交于 2月 20, 2019

commit 83540fbc8812a580b6ad8f93f4c29e62e417687e upstream.

The first version of this method was missing the check for
`ret == PATH_MAX`; then such a check was added, but it didn't call kfree()
on error, so there was still a small memory leak in the error case.
Fix it by using strndup_user() instead of open-coding it.

Link: http://lkml.kernel.org/r/20190220165443.152385-1-jannh@google.com

Cc: Ingo Molnar <mingo@kernel.org>
Cc: stable@vger.kernel.org
Fixes: 0eadcc7a ("perf/core: Fix perf_uprobe_init()")
Reviewed-by: NMasami Hiramatsu <mhiramat@kernel.org>
Acked-by: NSong Liu <songliubraving@fb.com>
Signed-off-by: NJann Horn <jannh@google.com>
Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

24d50976

tracing: Do not free iter->trace in fail path of tracing_open_pipe() · f27077e5

由 zhangyi (F) 提交于 2月 13, 2019

commit e7f0c424d0806b05d6f47be9f202b037eb701707 upstream.

Commit d716ff71 ("tracing: Remove taking of trace_types_lock in
pipe files") use the current tracer instead of the copy in
tracing_open_pipe(), but it forget to remove the freeing sentence in
the error path.

There's an error path that can call kfree(iter->trace) after the iter->trace
was assigned to tr->current_trace, which would be bad to free.

Link: http://lkml.kernel.org/r/1550060946-45984-1-git-send-email-yi.zhang@huawei.com

Cc: stable@vger.kernel.org
Fixes: d716ff71 ("tracing: Remove taking of trace_types_lock in pipe files")
Signed-off-by: Nzhangyi (F) <yi.zhang@huawei.com>
Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

f27077e5

tracing: Use strncpy instead of memcpy for string keys in hist triggers · ebca08d7

由 Tom Zanussi 提交于 2月 04, 2019

commit 9f0bbf3115ca9f91f43b7c74e9ac7d79f47fc6c2 upstream.

Because there may be random garbage beyond a string's null terminator,
it's not correct to copy the the complete character array for use as a
hist trigger key.  This results in multiple histogram entries for the
'same' string key.

So, in the case of a string key, use strncpy instead of memcpy to
avoid copying in the extra bytes.

Before, using the gdbus entries in the following hist trigger as an
example:

  # echo 'hist:key=comm' > /sys/kernel/debug/tracing/events/sched/sched_waking/trigger
  # cat /sys/kernel/debug/tracing/events/sched/sched_waking/hist

  ...

  { comm: ImgDecoder #4                      } hitcount:        203
  { comm: gmain                              } hitcount:        213
  { comm: gmain                              } hitcount:        216
  { comm: StreamTrans #73                    } hitcount:        221
  { comm: mozStorage #3                      } hitcount:        230
  { comm: gdbus                              } hitcount:        233
  { comm: StyleThread#5                      } hitcount:        253
  { comm: gdbus                              } hitcount:        256
  { comm: gdbus                              } hitcount:        260
  { comm: StyleThread#4                      } hitcount:        271

  ...

  # cat /sys/kernel/debug/tracing/events/sched/sched_waking/hist | egrep gdbus | wc -l
  51

After:

  # cat /sys/kernel/debug/tracing/events/sched/sched_waking/hist | egrep gdbus | wc -l
  1

Link: http://lkml.kernel.org/r/50c35ae1267d64eee975b8125e151e600071d4dc.1549309756.git.tom.zanussi@linux.intel.com

Cc: Namhyung Kim <namhyung@kernel.org>
Cc: stable@vger.kernel.org
Fixes: 79e577cb ("tracing: Support string type key properly")
Signed-off-by: NTom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

ebca08d7

CIFS: Fix read after write for files with read caching · 43eaa6cc

由 Pavel Shilovsky 提交于 3月 04, 2019

commit 6dfbd84684700cb58b34e8602c01c12f3d2595c8 upstream.

When we have a READ lease for a file and have just issued a write
operation to the server we need to purge the cache and set oplock/lease
level to NONE to avoid reading stale data. Currently we do that
only if a write operation succedeed thus not covering cases when
a request was sent to the server but a negative error code was
returned later for some other reasons (e.g. -EIOCBQUEUED or -EINTR).
Fix this by turning off caching regardless of the error code being
returned.

The patches fixes generic tests 075 and 112 from the xfs-tests.

Cc: <stable@vger.kernel.org>
Signed-off-by: NPavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: NSteve French <stfrench@microsoft.com>
Reviewed-by: NRonnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

43eaa6cc

CIFS: Do not skip SMB2 message IDs on send failures · dc8e8ad9

由 Pavel Shilovsky 提交于 3月 04, 2019

commit c781af7e0c1fed9f1d0e0ec31b86f5b21a8dca17 upstream.

When we hit failures during constructing MIDs or sending PDUs
through the network, we end up not using message IDs assigned
to the packet. The next SMB packet will skip those message IDs
and continue with the next one. This behavior may lead to a server
not granting us credits until we use the skipped IDs. Fix this by
reverting the current ID to the original value if any errors occur
before we push the packet through the network stack.

This patch fixes the generic/310 test from the xfs-tests.

Cc: <stable@vger.kernel.org> # 4.19.x
Signed-off-by: NPavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: NSteve French <stfrench@microsoft.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

dc8e8ad9

CIFS: Do not reset lease state to NONE on lease break · 3ed9f22e

由 Pavel Shilovsky 提交于 2月 13, 2019

commit 7b9b9edb49ad377b1e06abf14354c227e9ac4b06 upstream.

Currently on lease break the client sets a caching level twice:
when oplock is detected and when oplock is processed. While the
1st attempt sets the level to the value provided by the server,
the 2nd one resets the level to None unconditionally.
This happens because the oplock/lease processing code was changed
to avoid races between page cache flushes and oplock breaks.
The commit c11f1df5 ("cifs: Wait for writebacks to complete
before attempting write.") fixed the races for oplocks but didn't
apply the same changes for leases resulting in overwriting the
server granted value to None. Fix this by properly processing
lease breaks.
Signed-off-by: NPavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: NSteve French <stfrench@microsoft.com>
CC: Stable <stable@vger.kernel.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

3ed9f22e

crypto: arm64/aes-ccm - fix bugs in non-NEON fallback routine · 41e2d1c4

由 Ard Biesheuvel 提交于 1月 24, 2019

commit 969e2f59d589c15f6aaf306e590dde16f12ea4b3 upstream.

Commit 5092fcf3 ("crypto: arm64/aes-ce-ccm: add non-SIMD generic
fallback") introduced C fallback code to replace the NEON routines
when invoked from a context where the NEON is not available (i.e.,
from the context of a softirq taken while the NEON is already being
used in kernel process context)

Fix two logical flaws in the MAC calculation of the associated data.
Reported-by: NEric Biggers <ebiggers@kernel.org>
Fixes: 5092fcf3 ("crypto: arm64/aes-ce-ccm: add non-SIMD generic fallback")
Cc: stable@vger.kernel.org
Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

41e2d1c4

crypto: arm64/aes-ccm - fix logical bug in AAD MAC handling · d5a5bded

由 Ard Biesheuvel 提交于 1月 24, 2019

commit eaf46edf6ea89675bd36245369c8de5063a0272c upstream.

The NEON MAC calculation routine fails to handle the case correctly
where there is some data in the buffer, and the input fills it up
exactly. In this case, we enter the loop at the end with w8 == 0,
while a negative value is assumed, and so the loop carries on until
the increment of the 32-bit counter wraps around, which is quite
obviously wrong.

So omit the loop altogether in this case, and exit right away.
Reported-by: NEric Biggers <ebiggers@kernel.org>
Fixes: a3fd8210 ("arm64/crypto: AES in CCM mode using ARMv8 Crypto ...")
Cc: stable@vger.kernel.org
Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

d5a5bded

crypto: x86/morus - fix handling chunked inputs and MAY_SLEEP · 66700c89

由 Eric Biggers 提交于 1月 31, 2019

commit 2060e284e9595fc3baed6e035903c05b93266555 upstream.

The x86 MORUS implementations all fail the improved AEAD tests because
they produce the wrong result with some data layouts.  The issue is that
they assume that if the skcipher_walk API gives 'nbytes' not aligned to
the walksize (a.k.a. walk.stride), then it is the end of the data.  In
fact, this can happen before the end.

Also, when the CRYPTO_TFM_REQ_MAY_SLEEP flag is given, they can
incorrectly sleep in the skcipher_walk_*() functions while preemption
has been disabled by kernel_fpu_begin().

Fix these bugs.

Fixes: 56e8e57f ("crypto: morus - Add common SIMD glue code for MORUS")
Cc: <stable@vger.kernel.org> # v4.18+
Cc: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: NEric Biggers <ebiggers@google.com>
Reviewed-by: NOndrej Mosnacek <omosnace@redhat.com>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

66700c89

crypto: x86/aesni-gcm - fix crash on empty plaintext · 8a9fcf4a

由 Eric Biggers 提交于 1月 31, 2019

commit 3af349639597fea582a93604734d717e59a0e223 upstream.

gcmaes_crypt_by_sg() dereferences the NULL pointer returned by
scatterwalk_ffwd() when encrypting an empty plaintext and the source
scatterlist ends immediately after the associated data.

Fix it by only fast-forwarding to the src/dst data scatterlists if the
data length is nonzero.

This bug is reproduced by the "rfc4543(gcm(aes))" test vectors when run
with the new AEAD test manager.

Fixes: e8455207 ("crypto: aesni - Update aesni-intel_glue to use scatter/gather")
Cc: <stable@vger.kernel.org> # v4.17+
Cc: Dave Watson <davejwatson@fb.com>
Signed-off-by: NEric Biggers <ebiggers@google.com>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

8a9fcf4a

crypto: x86/aegis - fix handling chunked inputs and MAY_SLEEP · 5d2a5172

由 Eric Biggers 提交于 1月 31, 2019

commit ba6771c0a0bc2fac9d6a8759bab8493bd1cffe3b upstream.

The x86 AEGIS implementations all fail the improved AEAD tests because
they produce the wrong result with some data layouts.  The issue is that
they assume that if the skcipher_walk API gives 'nbytes' not aligned to
the walksize (a.k.a. walk.stride), then it is the end of the data.  In
fact, this can happen before the end.

Also, when the CRYPTO_TFM_REQ_MAY_SLEEP flag is given, they can
incorrectly sleep in the skcipher_walk_*() functions while preemption
has been disabled by kernel_fpu_begin().

Fix these bugs.

Fixes: 1d373d4e ("crypto: x86 - Add optimized AEGIS implementations")
Cc: <stable@vger.kernel.org> # v4.18+
Cc: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: NEric Biggers <ebiggers@google.com>
Reviewed-by: NOndrej Mosnacek <omosnace@redhat.com>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

5d2a5172

crypto: testmgr - skip crc32c context test for ahash algorithms · 574c19d9

由 Eric Biggers 提交于 1月 23, 2019

commit eb5e6730db98fcc4b51148b4a819fa4bf864ae54 upstream.

Instantiating "cryptd(crc32c)" causes a crypto self-test failure because
the crypto_alloc_shash() in alg_test_crc32c() fails.  This is because
cryptd(crc32c) is an ahash algorithm, not a shash algorithm; so it can
only be accessed through the ahash API, unlike shash algorithms which
can be accessed through both the ahash and shash APIs.

As the test is testing the shash descriptor format which is only
applicable to shash algorithms, skip it for ahash algorithms.

(Note that it's still important to fix crypto self-test failures even
 for weird algorithm instantiations like cryptd(crc32c) that no one
 would really use; in fips_enabled mode unprivileged users can use them
 to panic the kernel, and also they prevent treating a crypto self-test
 failure as a bug when fuzzing the kernel.)

Fixes: 8e3ee85e ("crypto: crc32c - Test descriptor context format")
Cc: stable@vger.kernel.org
Signed-off-by: NEric Biggers <ebiggers@google.com>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

574c19d9

crypto: skcipher - set CRYPTO_TFM_NEED_KEY if ->setkey() fails · e6c703f1

由 Eric Biggers 提交于 1月 06, 2019

commit b1f6b4bf416b49f00f3abc49c639371cdecaaad1 upstream.

Some algorithms have a ->setkey() method that is not atomic, in the
sense that setting a key can fail after changes were already made to the
tfm context. In this case, if a key was already set the tfm can end up
in a state that corresponds to neither the old key nor the new key.

For example, in lrw.c, if gf128mul_init_64k_bbe() fails due to lack of
memory, then priv::table will be left NULL. After that, encryption with
that tfm will cause a NULL pointer dereference.

It's not feasible to make all ->setkey() methods atomic, especially ones
that have to key multiple sub-tfms. Therefore, make the crypto API set
CRYPTO_TFM_NEED_KEY if ->setkey() fails and the algorithm requires a
key, to prevent the tfm from being used until a new key is set.

[Cc stable mainly because when introducing the NEED_KEY flag I changed
AF_ALG to rely on it; and unlike in-kernel crypto API users, AF_ALG
previously didn't have this problem. So these "incompletely keyed"
states became theoretically accessible via AF_ALG -- though, the
opportunities for causing real mischief seem pretty limited.]

Fixes: f8d33fac ("crypto: skcipher - prevent using skciphers without setting key")
Cc: <stable@vger.kernel.org> # v4.16+
Signed-off-by: NEric Biggers <ebiggers@google.com>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

e6c703f1

crypto: pcbc - remove bogus memcpy()s with src == dest · bb1ae0aa

由 Eric Biggers 提交于 1月 03, 2019

commit 251b7aea34ba3c4d4fdfa9447695642eb8b8b098 upstream.

The memcpy()s in the PCBC implementation use walk->iv as both the source
and destination, which has undefined behavior.  These memcpy()'s are
actually unneeded, because walk->iv is already used to hold the previous
plaintext block XOR'd with the previous ciphertext block.  Thus,
walk->iv is already updated to its final value.

So remove the broken and unnecessary memcpy()s.

Fixes: 91652be5 ("[CRYPTO] pcbc: Add Propagated CBC template")
Cc: <stable@vger.kernel.org> # v2.6.21+
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: NEric Biggers <ebiggers@google.com>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

bb1ae0aa

crypto: morus - fix handling chunked inputs · c0bfdac6

由 Eric Biggers 提交于 1月 31, 2019

commit d644f1c8746ed24f81075480f9e9cb3777ae8d65 upstream.

The generic MORUS implementations all fail the improved AEAD tests
because they produce the wrong result with some data layouts.  The issue
is that they assume that if the skcipher_walk API gives 'nbytes' not
aligned to the walksize (a.k.a. walk.stride), then it is the end of the
data.  In fact, this can happen before the end.  Fix them.

Fixes: 396be41f ("crypto: morus - Add generic MORUS AEAD implementations")
Cc: <stable@vger.kernel.org> # v4.18+
Cc: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: NEric Biggers <ebiggers@google.com>
Reviewed-by: NOndrej Mosnacek <omosnace@redhat.com>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

c0bfdac6

crypto: hash - set CRYPTO_TFM_NEED_KEY if ->setkey() fails · dc410d2d

由 Eric Biggers 提交于 1月 06, 2019

commit ba7d7433a0e998c902132bd47330e355a1eaa894 upstream.

Some algorithms have a ->setkey() method that is not atomic, in the
sense that setting a key can fail after changes were already made to the
tfm context.  In this case, if a key was already set the tfm can end up
in a state that corresponds to neither the old key nor the new key.

It's not feasible to make all ->setkey() methods atomic, especially ones
that have to key multiple sub-tfms.  Therefore, make the crypto API set
CRYPTO_TFM_NEED_KEY if ->setkey() fails and the algorithm requires a
key, to prevent the tfm from being used until a new key is set.

Note: we can't set CRYPTO_TFM_NEED_KEY for OPTIONAL_KEY algorithms, so
->setkey() for those must nevertheless be atomic.  That's fine for now
since only the crc32 and crc32c algorithms set OPTIONAL_KEY, and it's
not intended that OPTIONAL_KEY be used much.

[Cc stable mainly because when introducing the NEED_KEY flag I changed
 AF_ALG to rely on it; and unlike in-kernel crypto API users, AF_ALG
 previously didn't have this problem.  So these "incompletely keyed"
 states became theoretically accessible via AF_ALG -- though, the
 opportunities for causing real mischief seem pretty limited.]

Fixes: 9fa68f62 ("crypto: hash - prevent using keyed hashes without setting key")
Cc: stable@vger.kernel.org
Signed-off-by: NEric Biggers <ebiggers@google.com>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

dc410d2d

crypto: arm64/crct10dif - revert to C code for short inputs · 76f21678

由 Ard Biesheuvel 提交于 1月 27, 2019

commit d72b9d4acd548251f55b16843fc7a05dc5c80de8 upstream.

The SIMD routine ported from x86 used to have a special code path
for inputs < 16 bytes, which got lost somewhere along the way.
Instead, the current glue code aligns the input pointer to 16 bytes,
which is not really necessary on this architecture (although it
could be beneficial to performance to expose aligned data to the
the NEON routine), but this could result in inputs of less than
16 bytes to be passed in. This not only fails the new extended
tests that Eric has implemented, it also results in the code
reading past the end of the input, which could potentially result
in crashes when dealing with less than 16 bytes of input at the
end of a page which is followed by an unmapped page.

So update the glue code to only invoke the NEON routine if the
input is at least 16 bytes.
Reported-by: NEric Biggers <ebiggers@kernel.org>
Reviewed-by: NEric Biggers <ebiggers@kernel.org>
Fixes: 6ef5737f ("crypto: arm64/crct10dif - port x86 SSE implementation to arm64")
Cc: <stable@vger.kernel.org> # v4.10+
Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

76f21678

crypto: arm64/aes-neonbs - fix returning final keystream block · 4bca5a9a

由 Eric Biggers 提交于 1月 31, 2019

commit 12455e320e19e9cc7ad97f4ab89c280fe297387c upstream.

The arm64 NEON bit-sliced implementation of AES-CTR fails the improved
skcipher tests because it sometimes produces the wrong ciphertext. The
bug is that the final keystream block isn't returned from the assembly
code when the number of non-final blocks is zero. This can happen if
the input data ends a few bytes after a page boundary. In this case the
last bytes get "encrypted" by XOR'ing them with uninitialized memory.

Fix the assembly code to return the final keystream block when needed.

Fixes: 88a3f582 ("crypto: arm64/aes - don't use IV buffer to return final keystream block")
Cc: <stable@vger.kernel.org> # v4.11+
Reviewed-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: NEric Biggers <ebiggers@google.com>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

4bca5a9a

crypto: arm/crct10dif - revert to C code for short inputs · 0beb34b8

由 Ard Biesheuvel 提交于 1月 27, 2019

commit 62fecf295e3c48be1b5f17c440b93875b9adb4d6 upstream.

The SIMD routine ported from x86 used to have a special code path
for inputs < 16 bytes, which got lost somewhere along the way.
Instead, the current glue code aligns the input pointer to permit
the NEON routine to use special versions of the vld1 instructions
that assume 16 byte alignment, but this could result in inputs of
less than 16 bytes to be passed in. This not only fails the new
extended tests that Eric has implemented, it also results in the
code reading past the end of the input, which could potentially
result in crashes when dealing with less than 16 bytes of input
at the end of a page which is followed by an unmapped page.

So update the glue code to only invoke the NEON routine if the
input is at least 16 bytes.
Reported-by: NEric Biggers <ebiggers@kernel.org>
Reviewed-by: NEric Biggers <ebiggers@kernel.org>
Fixes: 1d481f1c ("crypto: arm/crct10dif - port x86 SSE implementation to ARM")
Cc: <stable@vger.kernel.org> # v4.10+
Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

0beb34b8

crypto: aegis - fix handling chunked inputs · 4c152af9

由 Eric Biggers 提交于 1月 31, 2019

commit 0f533e67d26f228ea5dfdacc8a4bdeb487af5208 upstream.

The generic AEGIS implementations all fail the improved AEAD tests
because they produce the wrong result with some data layouts.  The issue
is that they assume that if the skcipher_walk API gives 'nbytes' not
aligned to the walksize (a.k.a. walk.stride), then it is the end of the
data.  In fact, this can happen before the end.  Fix them.

Fixes: f606a88e ("crypto: aegis - Add generic AEGIS AEAD implementations")
Cc: <stable@vger.kernel.org> # v4.18+
Cc: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: NEric Biggers <ebiggers@google.com>
Reviewed-by: NOndrej Mosnacek <omosnace@redhat.com>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

4c152af9

crypto: aead - set CRYPTO_TFM_NEED_KEY if ->setkey() fails · 736807d6

由 Eric Biggers 提交于 1月 06, 2019

commit 6ebc97006b196aafa9df0497fdfa866cf26f259b upstream.

For example, in gcm.c, if the kzalloc() fails due to lack of memory,
then the CTR part of GCM will have the new key but GHASH will not.

It's not feasible to make all ->setkey() methods atomic, especially ones
that have to key multiple sub-tfms. Therefore, make the crypto API set
CRYPTO_TFM_NEED_KEY if ->setkey() fails, to prevent the tfm from being
used until a new key is set.

Fixes: dc26c17f ("crypto: aead - prevent using AEADs without setting key")
Cc: <stable@vger.kernel.org> # v4.16+
Signed-off-by: NEric Biggers <ebiggers@google.com>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

736807d6

fix cgroup_do_mount() handling of failure exits · 7a8b0484

由 Al Viro 提交于 1月 06, 2019

commit 399504e21a10be16dd1408ba0147367d9d82a10c upstream.

same story as with last May fixes in sysfs (7b745a4e
"unfuck sysfs_mount()"); new_sb is left uninitialized
in case of early errors in kernfs_mount_ns() and papering
over it by treating any error from kernfs_mount_ns() as
equivalent to !new_ns ends up conflating the cases when
objects had never been transferred to a superblock with
ones when that has happened and resulting new superblock
had been dropped.  Easily fixed (same way as in sysfs
case).  Additionally, there's a superblock leak on
kernfs_node_dentry() failure *and* a dentry leak inside
kernfs_node_dentry() itself - the latter on probably
impossible errors, but the former not impossible to trigger
(as the matter of fact, injecting allocation failures
at that point *does* trigger it).

Cc: stable@kernel.org
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

7a8b0484

libnvdimm: Fix altmap reservation size calculation · 3b8da135

由 Oliver O'Halloran 提交于 2月 06, 2019

commit 07464e88365e9236febaca9ed1a2e2006d8bc952 upstream.

Libnvdimm reserves the first 8K of pfn and devicedax namespaces to
store a superblock describing the namespace. This 8K reservation
is contained within the altmap area which the kernel uses for the
vmemmap backing for the pages within the namespace. The altmap
allows for some pages at the start of the altmap area to be reserved
and that mechanism is used to protect the superblock from being
re-used as vmemmap backing.

The number of PFNs to reserve is calculated using:

	PHYS_PFN(SZ_8K)

Which is implemented as:

 #define PHYS_PFN(x) ((unsigned long)((x) >> PAGE_SHIFT))

So on systems where PAGE_SIZE is greater than 8K the reservation
size is truncated to zero and the superblock area is re-used as
vmemmap backing. As a result all the namespace information stored
in the superblock (i.e. if it's a PFN or DAX namespace) is lost
and the namespace needs to be re-created to get access to the
contents.

This patch fixes this by using PFN_UP() rather than PHYS_PFN() to ensure
that at least one page is reserved. On systems with a 4K pages size this
patch should have no effect.

Cc: stable@vger.kernel.org
Cc: Dan Williams <dan.j.williams@intel.com>
Fixes: ac515c08 ("libnvdimm, pmem, pfn: move pfn setup to the core")
Signed-off-by: NOliver O'Halloran <oohall@gmail.com>
Reviewed-by: NVishal Verma <vishal.l.verma@intel.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

3b8da135

libnvdimm/pmem: Honor force_raw for legacy pmem regions · 696c3752

由 Dan Williams 提交于 1月 24, 2019

commit fa7d2e639cd90442d868dfc6ca1d4cc9d8bf206e upstream.

For recovery, where non-dax access is needed to a given physical address
range, and testing, allow the 'force_raw' attribute to override the
default establishment of a dev_pagemap.

Otherwise without this capability it is possible to end up with a
namespace that can not be activated due to corrupted info-block, and one
that can not be repaired due to a section collision.

Cc: <stable@vger.kernel.org>
Fixes: 004f1afb ("libnvdimm, pmem: direct map legacy pmem by default")
Signed-off-by: NDan Williams <dan.j.williams@intel.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

696c3752

libnvdimm, pfn: Fix over-trim in trim_pfn_device() · 6a89ed7a

由 Wei Yang 提交于 1月 22, 2019

commit f101ada7da6551127d192c2f1742c1e9e0f62799 upstream.

When trying to see whether current nd_region intersects with others,
trim_pfn_device() has already calculated the *size* to be expanded to
SECTION size.

Do not double append 'adjust' to 'size' when calculating whether the end
of a region collides with the next pmem region.

Fixes: ae86cbfef381 "libnvdimm, pfn: Pad pfn namespaces relative to other regions"
Cc: <stable@vger.kernel.org>
Signed-off-by: NWei Yang <richardw.yang@linux.intel.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

6a89ed7a

libnvdimm/label: Clear 'updating' flag after label-set update · 2b88d92e

由 Dan Williams 提交于 1月 15, 2019

commit 966d23a006ca7b44ac8cf4d0c96b19785e0c3da0 upstream.

The UEFI 2.7 specification sets expectations that the 'updating' flag is
eventually cleared. To date, the libnvdimm core has never adhered to
that protocol. The policy of the core matches the policy of other
multi-device info-block formats like MD-Software-RAID that expect
administrator intervention on inconsistent info-blocks, not automatic
invalidation.

However, some pre-boot environments may unfortunately attempt to "clean
up" the labels and invalidate a set when it fails to find at least one
"non-updating" label in the set. Clear the updating flag after set
updates to minimize the window of vulnerability to aggressive pre-boot
environments.

Ideally implementations would not write to the label area outside of
creating namespaces.

Note that this only minimizes the window, it does not close it as the
system can still crash while clearing the flag and the set can be
subsequently deleted / invalidated by the pre-boot environment.

Fixes: f524bf27 ("libnvdimm: write pmem label set")
Cc: <stable@vger.kernel.org>
Cc: Kelly Couch <kelly.j.couch@intel.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

2b88d92e

nfit/ars: Attempt short-ARS even in the no_init_ars case · f4dfb94a

由 Dan Williams 提交于 2月 13, 2019

commit fa3ed4d981b1fc19acdd07fcb152a4bd3706892b upstream.

The no_init_ars option is meant to prevent long-ARS, but short-ARS
should be allowed to grab any immediate results.

Fixes: bc6ba808 ("nfit, address-range-scrub: rework and simplify ARS...")
Cc: <stable@vger.kernel.org>
Reported-by: NErwin Tsaur <erwin.tsaur@oracle.com>
Reviewed-by: NToshi Kani <toshi.kani@hpe.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

f4dfb94a

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功