1. 03 Aug, 2020 (1 commit)
  2. 02 Aug, 2020 (1 commit)
  3. 31 Jul, 2020 (2 commits)
  4. 30 Jul, 2020 (2 commits)
    •
      random32: remove net_rand_state from the latent entropy gcc plugin · 83bdc727
      Authored by Linus Torvalds
      It turns out that the plugin right now ends up being really unhappy
      about the change from 'static' to 'extern' storage that happened in
      commit f227e3ec ("random32: update the net random state on interrupt
      and activity").
      
      This is probably a trivial fix for the latent_entropy plugin, but for
      now, just remove net_rand_state from the list of things the plugin
      worries about.
      Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Emese Revfy <re.emese@gmail.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Willy Tarreau <w@1wt.eu>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      83bdc727
    •
      random32: update the net random state on interrupt and activity · f227e3ec
      Authored by Willy Tarreau
      This modifies the first 32 bits out of the 128 bits of a random CPU's
      net_rand_state on interrupt or CPU activity to complicate remote
      observations that could lead to guessing the network RNG's internal
      state.
      
      Note that depending on some network devices' interrupt rate moderation
      or binding, this re-seeding might happen on every packet or even almost
      never.
      
      In addition, with NOHZ some CPUs might not even get timer interrupts,
      leaving their local state rarely updated, while they are running
      networked processes making use of the random state.  For this reason, we
      also perform this update in update_process_times() in order to at least
      update the state when there is user or system activity, since it's the
      only case we care about.
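
      The reseeding described above can be sketched roughly as follows. This is a minimal illustration assuming a 4 x 32-bit Tausworthe-style state; the struct and function names are hypothetical, not the kernel's:

```c
#include <stdint.h>

/* Hypothetical stand-in for net_rand_state: 128 bits as 4 x 32-bit words. */
struct rnd_state_sketch {
	uint32_t s1, s2, s3, s4;
};

/* On an interrupt or a timer tick, fold a cheap timing value (e.g. a cycle
 * counter) into the first 32-bit word only, leaving the rest untouched. */
static void reseed_on_activity(struct rnd_state_sketch *st, uint32_t entropy)
{
	st->s1 ^= entropy;	/* perturb only the first 32 bits of 128 */
	if (st->s1 == 0)	/* hypothetical guard: keep the word non-zero */
		st->s1 = 1;
}
```

      The point of touching only one word is to keep the per-interrupt cost negligible while still making the full state harder to reconstruct remotely.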
      Reported-by: Amit Klein <aksecurity@gmail.com>
      Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      f227e3ec
  5. 29 Jul, 2020 (4 commits)
  6. 28 Jul, 2020 (8 commits)
  7. 27 Jul, 2020 (1 commit)
    •
      genirq/affinity: Make affinity setting if activated opt-in · f0c7baca
      Authored by Thomas Gleixner
      John reported that on a RK3288 system the perf per CPU interrupts are all
      affine to CPU0 and provided the analysis:
      
       "It looks like what happens is that because the interrupts are not per-CPU
        in the hardware, armpmu_request_irq() calls irq_force_affinity() while
        the interrupt is deactivated and then request_irq() with IRQF_PERCPU |
        IRQF_NOBALANCING.  
      
        Now when irq_startup() runs with IRQ_STARTUP_NORMAL, it calls
        irq_setup_affinity() which returns early because IRQF_PERCPU and
        IRQF_NOBALANCING are set, leaving the interrupt on its original CPU."
      
      This was broken by the recent commit which blocked interrupt affinity
      setting in hardware before activation of the interrupt. While this works in
      general, it does not work for this particular case. As contrary to the
      initial analysis not all interrupt chip drivers implement an activate
      callback, the safe cure is to make the deferred interrupt affinity setting
      at activation time opt-in.
      
      Implement the necessary core logic and make the two irqchip implementations
      for which this is required opt-in. In hindsight this would have been the
      right thing to do, but ...
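
      A rough sketch of the opt-in logic, with hypothetical flag names (the real kernel flag names differ):

```c
#include <stdbool.h>

/* Illustrative flags: affinity changes on a not-yet-activated interrupt are
 * deferred to activation time only when the irqchip opts in; otherwise they
 * apply immediately, which is what the RK3288 perf IRQs rely on. */
enum {
	IRQF_SKETCH_ACTIVATED            = 1 << 0,
	IRQF_SKETCH_AFFINITY_ON_ACTIVATE = 1 << 1,	/* the opt-in */
};

static bool affinity_change_allowed_now(unsigned int flags)
{
	if (!(flags & IRQF_SKETCH_AFFINITY_ON_ACTIVATE))
		return true;	/* no opt-in: set affinity even while inactive */
	return flags & IRQF_SKETCH_ACTIVATED;	/* opt-in: defer until activation */
}
```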
      
      Fixes: baedb87d ("genirq/affinity: Handle affinity setting on inactive interrupts correctly")
      Reported-by: John Keeping <john@metanate.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Marc Zyngier <maz@kernel.org>
      Acked-by: Marc Zyngier <maz@kernel.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/87blk4tzgm.fsf@nanos.tec.linutronix.de
      f0c7baca
  8. 25 Jul, 2020 (3 commits)
  9. 24 Jul, 2020 (4 commits)
    •
      tpm: Unify the mismatching TPM space buffer sizes · 6c4e79d9
      Authored by Jarkko Sakkinen
      The size of the buffers for storing context's and sessions can vary from
      arch to arch as PAGE_SIZE can be anything between 4 kB and 256 kB (the
      maximum for PPC64). Define a fixed buffer size set to 16 kB. This should be
      enough for most use with three handles (that is how many we allow at the
      moment). Parametrize the buffer size while doing this, so that it is easier
      to revisit this later on if required.
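
      A minimal sketch of the parametrized fixed-size buffer; the macro and struct names are illustrative, not the driver's actual identifiers:

```c
#include <stddef.h>

/* Fixed 16 kB buffer size, independent of PAGE_SIZE, which varies across
 * architectures (4 kB up to 256 kB on PPC64). */
#define TPM_SPACE_BUFSIZE_SKETCH (16 * 1024)

struct tpm_space_sketch {
	unsigned char context_buf[TPM_SPACE_BUFSIZE_SKETCH];
	unsigned char session_buf[TPM_SPACE_BUFSIZE_SKETCH];
};

/* Single parametrized size, so revisiting the value later means changing
 * one definition. */
static size_t tpm_space_buf_size(void)
{
	return TPM_SPACE_BUFSIZE_SKETCH;
}
```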
      
      Cc: stable@vger.kernel.org
      Reported-by: Stefan Berger <stefanb@linux.ibm.com>
      Fixes: 745b361e ("tpm: infrastructure for TPM spaces")
      Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
      Tested-by: Stefan Berger <stefanb@linux.ibm.com>
      Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
      6c4e79d9
    •
      tpm: Require that all digests are present in TCG_PCR_EVENT2 structures · 7f3d176f
      Authored by Tyler Hicks
      Require that the TCG_PCR_EVENT2.digests.count value strictly matches the
      value of TCG_EfiSpecIdEvent.numberOfAlgorithms in the event field of the
      TCG_PCClientPCREvent event log header. Also require that
      TCG_EfiSpecIdEvent.numberOfAlgorithms is non-zero.
      
      The TCG PC Client Platform Firmware Profile Specification section 9.1
      (Family "2.0", Level 00 Revision 1.04) states:
      
       For each Hash algorithm enumerated in the TCG_PCClientPCREvent entry,
       there SHALL be a corresponding digest in all TCG_PCR_EVENT2 structures.
       Note: This includes EV_NO_ACTION events which do not extend the PCR.
      
      Section 9.4.5.1 provides this description of
      TCG_EfiSpecIdEvent.numberOfAlgorithms:
      
       The number of Hash algorithms in the digestSizes field. This field MUST
       be set to a value of 0x01 or greater.
      
      Enforce these restrictions, as required by the above specification, in
      order to better identify and ignore invalid sequences of bytes at the
      end of an otherwise valid TPM2 event log. Firmware doesn't always have
      the means necessary to inform the kernel of the actual event log size so
      the kernel's event log parsing code should be stringent when parsing the
      event log for resiliency against firmware bugs. This is true, for
      example, when firmware passes the event log to the kernel via a reserved
      memory region described in device tree.
      
      POWER and some ARM systems use the "linux,sml-base" and "linux,sml-size"
      device tree properties to describe the memory region used to pass the
      event log from firmware to the kernel. Unfortunately, the
      "linux,sml-size" property describes the size of the entire reserved
       memory region rather than the size of the event log within the memory
      region and the event log format does not include information describing
      the size of the event log.
      
      tpm_read_log_of(), in drivers/char/tpm/eventlog/of.c, is where the
      "linux,sml-size" property is used. At the end of that function,
      log->bios_event_log_end is pointing at the end of the reserved memory
      region. That's typically 0x10000 bytes offset from "linux,sml-base",
      depending on what's defined in the device tree source.
      
      The firmware event log only fills a portion of those 0x10000 bytes and
      the rest of the memory region should be zeroed out by firmware. Even in
       the case of properly zeroed bytes in the remainder of the memory
      region, the only thing allowing the kernel's event log parser to detect
      the end of the event log is the following conditional in
      __calc_tpm2_event_size():
      
              if (event_type == 0 && event_field->event_size == 0)
                      size = 0;
      
      If that wasn't there, __calc_tpm2_event_size() would think that a 16
      byte sequence of zeroes, following an otherwise valid event log, was
      a valid event.
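
      The quoted conditional can be sketched in isolation; the struct below is a minimal stand-in for the real event header, and the size calculation is simplified:

```c
#include <stddef.h>
#include <stdint.h>

/* Minimal stand-in: only the two fields the terminator check reads. */
struct tpm2_event_sketch {
	uint32_t event_type;
	uint32_t event_size;
};

static size_t calc_event_size_sketch(const struct tpm2_event_sketch *ev)
{
	/* An all-zero header is taken as the end of the log, so a properly
	 * zeroed tail of the reserved memory region stops the parser. */
	if (ev->event_type == 0 && ev->event_size == 0)
		return 0;
	/* Simplified: the real code also walks the per-algorithm digests. */
	return sizeof(*ev) + ev->event_size;
}
```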
      
      However, problems can occur if a single bit is set in the offset
      corresponding to either the TCG_PCR_EVENT2.eventType or
      TCG_PCR_EVENT2.eventSize fields, after the last valid event log entry.
      This could confuse the parser into thinking that an additional entry is
      present in the event log and exposing this invalid entry to userspace in
      the /sys/kernel/security/tpm0/binary_bios_measurements file. Such
      problems have been seen if firmware does not fully zero the memory
      region upon a warm reboot.
      
      This patch significantly raises the bar on how difficult it is for
      stale/invalid memory to confuse the kernel's event log parser but
      there's still, ultimately, a reliance on firmware to properly initialize
      the remainder of the memory region reserved for the event log as the
      parser cannot be expected to detect a stale but otherwise properly
      formatted firmware event log entry.
      
      Fixes: fd5c7869 ("tpm: fix handling of the TPM 2.0 event logs")
      Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com>
      Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
      Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
      7f3d176f
    •
      tcp: allow at most one TLP probe per flight · 76be93fc
      Authored by Yuchung Cheng
      Previously TLP may send multiple probes of new data in one
      flight. This happens when the sender is cwnd limited. After the
      initial TLP containing new data is sent, the sender receives another
      ACK that acks partial inflight.  It may re-arm another TLP timer
      to send more, if no further ACK returns before the next TLP timeout
       (PTO) expires. In theory the sender may keep sending TLP probes until
       the send queue is depleted. This only happens when the sender sees such
       an irregular, uncommon ACK pattern, but it is generally undesirable
       behavior, especially during congestion.
      
       The original TLP design allows only one TLP probe per inflight, as
      published in "Reducing Web Latency: the Virtue of Gentle Aggression",
      SIGCOMM 2013. This patch changes TLP to send at most one probe
      per inflight.
      
       Note that if the sender is app-limited, TLP retransmits old data
       and does not have this issue.
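
      A minimal sketch of the one-probe-per-flight rule, with a hypothetical flag name (not the kernel's actual field):

```c
#include <stdbool.h>

/* Track whether a TLP probe was already sent for the current flight. */
struct tlp_state_sketch {
	bool probe_sent_this_flight;
};

static bool tlp_may_send_probe(struct tlp_state_sketch *st)
{
	if (st->probe_sent_this_flight)
		return false;	/* at most one probe per flight */
	st->probe_sent_this_flight = true;
	return true;
}

static void tlp_new_flight(struct tlp_state_sketch *st)
{
	st->probe_sent_this_flight = false;	/* reset when a new flight starts */
}
```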
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      76be93fc
    •
      dm integrity: fix integrity recalculation that is improperly skipped · 5df96f2b
      Authored by Mikulas Patocka
      Commit adc0daad ("dm: report suspended
      device during destroy") broke integrity recalculation.
      
      The problem is dm_suspended() returns true not only during suspend,
      but also during resume. So this race condition could occur:
      1. dm_integrity_resume calls queue_work(ic->recalc_wq, &ic->recalc_work)
      2. integrity_recalc (&ic->recalc_work) preempts the current thread
      3. integrity_recalc calls if (unlikely(dm_suspended(ic->ti))) goto unlock_ret;
      4. integrity_recalc exits and no recalculating is done.
      
      To fix this race condition, add a function dm_post_suspending that is
      only true during the postsuspend phase and use it instead of
      dm_suspended().
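
      The race and the fix can be sketched with two predicates; all names here are illustrative, not device-mapper's actual API:

```c
#include <stdbool.h>

struct dm_dev_sketch {
	bool post_suspending;	/* true only in the postsuspend phase */
	bool resuming;		/* true while resume is in progress */
};

/* The problematic check: also true during resume, so a recalc worker
 * queued from resume can see it set and bail out. */
static bool dm_suspended_sketch(const struct dm_dev_sketch *d)
{
	return d->post_suspending || d->resuming;
}

/* The fixed check: true only in the postsuspend phase, so recalculation
 * started from resume is no longer skipped. */
static bool dm_post_suspending_sketch(const struct dm_dev_sketch *d)
{
	return d->post_suspending;
}
```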
      
      Signed-off-by: Mikulas Patocka <mpatocka redhat com>
      Fixes: adc0daad ("dm: report suspended device during destroy")
      Cc: stable vger kernel org # v4.18+
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      5df96f2b
  10. 23 Jul, 2020 (5 commits)
  11. 22 Jul, 2020 (3 commits)
    •
      i2c: drop duplicated word in the header file · aca7ed09
      Authored by Randy Dunlap
      Drop the doubled word "be" in a comment.
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Signed-off-by: Wolfram Sang <wsa@kernel.org>
      aca7ed09
    •
      fs-verity: use smp_load_acquire() for ->i_verity_info · f3db0bed
      Authored by Eric Biggers
      Normally smp_store_release() or cmpxchg_release() is paired with
      smp_load_acquire().  Sometimes smp_load_acquire() can be replaced with
      the more lightweight READ_ONCE().  However, for this to be safe, all the
      published memory must only be accessed in a way that involves the
      pointer itself.  This may not be the case if allocating the object also
      involves initializing a static or global variable, for example.
      
      fsverity_info::tree_params.hash_alg->tfm is a crypto_ahash object that's
      internal to and is allocated by the crypto subsystem.  So by using
      READ_ONCE() for ->i_verity_info, we're relying on internal
      implementation details of the crypto subsystem.
      
      Remove this fragile assumption by using smp_load_acquire() instead.
      
      Also fix the cmpxchg logic to correctly execute an ACQUIRE barrier when
      losing the cmpxchg race, since cmpxchg doesn't guarantee a memory
      barrier on failure.
      
      (Note: I haven't seen any real-world problems here.  This change is just
      fixing the code to be guaranteed correct and less fragile.)
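
      The pairing can be sketched with C11 atomics in place of the kernel's smp_store_release()/smp_load_acquire(); the object type here is an illustrative stand-in for ->i_verity_info:

```c
#include <stdatomic.h>
#include <stddef.h>

struct info_sketch {
	int ready;	/* stands in for state set up before publication */
};

static _Atomic(struct info_sketch *) slot;

static void publish(struct info_sketch *p)
{
	p->ready = 1;	/* all initialization happens before the release store */
	atomic_store_explicit(&slot, p, memory_order_release);
}

static struct info_sketch *get(void)
{
	/* The acquire pairs with the release above, so every prior write to
	 * *p is visible. A relaxed, READ_ONCE-style load would only be safe
	 * for data reached purely through the pointer value itself. */
	return atomic_load_explicit(&slot, memory_order_acquire);
}
```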
      
      Fixes: fd2d1acf ("fs-verity: add the hook for file ->open()")
      Link: https://lore.kernel.org/r/20200721225920.114347-6-ebiggers@kernel.org
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      f3db0bed
    •
      fscrypt: use smp_load_acquire() for ->i_crypt_info · ab673b98
      Authored by Eric Biggers
      Normally smp_store_release() or cmpxchg_release() is paired with
      smp_load_acquire().  Sometimes smp_load_acquire() can be replaced with
      the more lightweight READ_ONCE().  However, for this to be safe, all the
      published memory must only be accessed in a way that involves the
      pointer itself.  This may not be the case if allocating the object also
      involves initializing a static or global variable, for example.
      
      fscrypt_info includes various sub-objects which are internal to and are
      allocated by other kernel subsystems such as keyrings and crypto.  So by
      using READ_ONCE() for ->i_crypt_info, we're relying on internal
      implementation details of these other kernel subsystems.
      
      Remove this fragile assumption by using smp_load_acquire() instead.
      
      (Note: I haven't seen any real-world problems here.  This change is just
      fixing the code to be guaranteed correct and less fragile.)
      
      Fixes: e37a784d ("fscrypt: use READ_ONCE() to access ->i_crypt_info")
      Link: https://lore.kernel.org/r/20200721225920.114347-5-ebiggers@kernel.org
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      ab673b98
  12. 21 Jul, 2020 (4 commits)
  13. 20 Jul, 2020 (1 commit)
  14. 18 Jul, 2020 (1 commit)
    •
      blk-cgroup: show global disk stats in root cgroup io.stat · ef45fe47
      Authored by Boris Burkov
      In order to improve consistency and usability in cgroup stat accounting,
      we would like to support the root cgroup's io.stat.
      
      Since the root cgroup has processes doing io even if the system has no
      explicitly created cgroups, we need to be careful to avoid overhead in
      that case.  For that reason, the rstat algorithms don't handle the root
      cgroup, so just turning the file on wouldn't give correct statistics.
      
      To get around this, we simulate flushing the iostat struct by filling it
      out directly from global disk stats. The result is a root cgroup io.stat
      file consistent with both /proc/diskstats and io.stat.
      
      Note that in order to collect the disk stats, we needed to iterate over
      devices. To facilitate that, we had to change the linkage of a disk_type
      to external so that it can be used from blk-cgroup.c to iterate over
      disks.
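
      A rough sketch of the simulated flush, assuming hypothetical per-disk counters (field and function names are illustrative):

```c
#include <stddef.h>
#include <stdint.h>

struct disk_stat_sketch {
	uint64_t rbytes, wbytes;	/* bytes read / written */
};

/* Rather than flushing rstat for the root cgroup, synthesize its io.stat
 * by summing global per-disk counters. */
static struct disk_stat_sketch
root_iostat_sketch(const struct disk_stat_sketch *disks, size_t ndisks)
{
	struct disk_stat_sketch sum = { 0, 0 };

	for (size_t i = 0; i < ndisks; i++) {	/* iterate over all disks */
		sum.rbytes += disks[i].rbytes;
		sum.wbytes += disks[i].wbytes;
	}
	return sum;	/* consistent with summing /proc/diskstats */
}
```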
      Suggested-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Boris Burkov <boris@bur.io>
      Acked-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      ef45fe47