提交 · e13e75b86ef2f88e3a47d672dd4c52a293efb95b · openeuler / Kernel

10 4月, 2018 3 次提交

D

Merge branch 'for-4.17/dax' into libnvdimm-for-next · e13e75b8
由 Dan Williams 提交于 4月 09, 2018

e13e75b8
D

Merge branch 'for-4.17/libnvdimm' into libnvdimm-for-next · 1ed41b56
由 Dan Williams 提交于 4月 09, 2018

1ed41b56

libnvdimm, of_pmem: workaround OF_NUMA=n build error · 291717b6

由 Dan Williams 提交于 4月 09, 2018

Stephen reports that an x86 allmodconfig build fails to build the
of_pmem driver due to a missing definition of of_node_to_nid(). That
helper is currently only exported in the OF_NUMA=y case. In other cases,
ppc and sparc, it is a weak symbol, and outside of those platforms it is
a static inline.

Until an OF_NUMA=n configuration can reliably support usage of
of_node_to_nid() in modules across architectures, mark this driver as
'bool' instead of 'tristate'.

Cc: Rob Herring <robh@kernel.org>
Cc: Oliver O'Halloran <oohall@gmail.com>
Reported-by: NStephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

291717b6

07 4月, 2018 10 次提交

nfit, address-range-scrub: add module option to skip initial ars · bca811a7

由 Dan Williams 提交于 4月 02, 2018

After attempting to quickly retrieve known errors the kernel proceeds to
kick off a long running ARS. Add a module option to disable this
behavior at initialization time, or at new region discovery time.
Otherwise, ARS can be started manually regardless of the state of this
setting.
Co-developed-by: NDave Jiang <dave.jiang@intel.com>
Signed-off-by: NDave Jiang <dave.jiang@intel.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

bca811a7

nfit, address-range-scrub: rework and simplify ARS state machine · bc6ba808

由 Dan Williams 提交于 4月 05, 2018

ARS is an operation that can take 10s to 100s of seconds to find media
errors that should rarely be present. If the platform crashes due to
media errors in persistent memory, the expectation is that the BIOS will
report those known errors in a 'short' ARS request.

A 'short' ARS request asks platform firmware to return an ARS payload
with all known errors, but without issuing a 'long' scrub. At driver
init a short request is issued to all PMEM ranges before registering
regions. Then, in the background, a long ARS is scheduled for each
region.

The ARS implementation is simplified to centralize ARS completion work
in the ars_complete() helper. The timeout is removed since there is no
facility to cancel ARS, and this otherwise arranges for system init to
never be blocked waiting for a 'long' ARS. The ars_state flags are used
to coordinate ARS requests from driver init, ARS requests from
userspace, and ARS requests in response to media error notifications.

Given that there is no notification of ARS completion the implementation
still needs to poll. It backs off exponentially to a maximum poll period
of 30 minutes.
Suggested-by: NToshi Kani <toshi.kani@hpe.com>
Co-developed-by: NDave Jiang <dave.jiang@intel.com>
Signed-off-by: NDave Jiang <dave.jiang@intel.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

bc6ba808

nfit, address-range-scrub: determine one platform max_ars value · 459d0ddb

由 Dan Williams 提交于 4月 05, 2018

acpi_nfit_query_poison() is awkward in that it requires an nfit_spa
argument in order to determine what max_ars value to use. Instead probe
for the minimum max_ars across all scrub-capable ranges in the system
and drop the nfit_spa argument.

This enables a larger rework / simplification of the ARS state machine
whereby the status can be retrieved once and then iterated over all
address ranges to reap completions.
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

459d0ddb

powerpc/powernv: Create platform devs for nvdimm buses · 3013e173

由 Oliver O'Halloran 提交于 4月 06, 2018

Scan the devicetree for an nvdimm-bus compatible and create
a platform device for them.
Signed-off-by: NOliver O'Halloran <oohall@gmail.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

3013e173

doc/devicetree: Persistent memory region bindings · ddc141e5

由 Oliver O'Halloran 提交于 4月 06, 2018

Add device-tree binding documentation for the nvdimm region driver.

Cc: devicetree@vger.kernel.org
Signed-off-by: NOliver O'Halloran <oohall@gmail.com>
Acked-by: NMichael Ellerman <mpe@ellerman.id.au>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

ddc141e5

libnvdimm: Add device-tree based driver · 71719760

由 Oliver O'Halloran 提交于 4月 06, 2018

This patch adds peliminary device-tree bindings for persistent memory
regions. The driver registers a libnvdimm bus for each pmem-region
node and each address range under the node is converted to a region
within that bus.
Signed-off-by: NOliver O'Halloran <oohall@gmail.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

71719760

libnvdimm: Add of_node to region and bus descriptors · 1ff19f48

由 Oliver O'Halloran 提交于 4月 06, 2018

We want to be able to cross reference the region and bus devices
with the device tree node that they were spawned from. libNVDIMM
handles creating the actual devices for these internally, so we
need to pass in a pointer to the relevant node in the descriptor.
Signed-off-by: NOliver O'Halloran <oohall@gmail.com>
Acked-by: NDan Williams <dan.j.williams@intel.com>
Acked-by: NBalbir Singh <bsingharora@gmail.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

1ff19f48

libnvdimm, region: quiet region probe · 60ce0f93

由 Dan Williams 提交于 4月 07, 2018

The message about constraining number of online cpus to be less than or
equal to ND_MAX_LANES (256) is only useful for block-aperture
configurations and BTT. Make it debug since it is only relevant when
debugging performance.
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

60ce0f93

libnvdimm, namespace: use a safe lookup for dimm device name · 4f867220

由 Dan Williams 提交于 4月 06, 2018

The following NULL dereference results from incorrectly assuming that
ndd is valid in this print:

  struct nvdimm_drvdata *ndd = to_ndd(&nd_region->mapping[i]);

  /*
   * Give up if we don't find an instance of a uuid at each
   * position (from 0 to nd_region->ndr_mappings - 1), or if we
   * find a dimm with two instances of the same uuid.
   */
  dev_err(&nd_region->dev, "%s missing label for %pUb\n",
                  dev_name(ndd->dev), nd_label->uuid);

 BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
 IP: nd_region_register_namespaces+0xd67/0x13c0 [libnvdimm]
 PGD 0 P4D 0
 Oops: 0000 [#1] SMP PTI
 CPU: 43 PID: 673 Comm: kworker/u609:10 Not tainted 4.16.0-rc4+ #1
 [..]
 RIP: 0010:nd_region_register_namespaces+0xd67/0x13c0 [libnvdimm]
 [..]
 Call Trace:
  ? devres_add+0x2f/0x40
  ? devm_kmalloc+0x52/0x60
  ? nd_region_activate+0x9c/0x320 [libnvdimm]
  nd_region_probe+0x94/0x260 [libnvdimm]
  ? kernfs_add_one+0xe4/0x130
  nvdimm_bus_probe+0x63/0x100 [libnvdimm]

Switch to using the nvdimm device directly.

Fixes: 0e3b0d12 ("libnvdimm, namespace: allow multiple pmem...")
Cc: <stable@vger.kernel.org>
Reported-by: NDave Jiang <dave.jiang@intel.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

4f867220

libnvdimm, dimm: fix dpa reservation vs uninitialized label area · c31898c8

由 Dan Williams 提交于 4月 06, 2018

At initialization time the 'dimm' driver caches a copy of the memory
device's label area and reserves address space for each of the
namespaces defined.

However, as can be seen below, the reservation occurs even when the
index blocks are invalid:

 nvdimm nmem0: nvdimm_init_config_data: len: 131072 rc: 0
 nvdimm nmem0: config data size: 131072
 nvdimm nmem0: __nd_label_validate: nsindex0 labelsize 1 invalid
 nvdimm nmem0: __nd_label_validate: nsindex1 labelsize 1 invalid
 nvdimm nmem0: : pmem-6025e505: 0x1000000000 @ 0xf50000000 reserve <-- bad

Gate dpa reservation on the presence of valid index blocks.

Cc: <stable@vger.kernel.org>
Fixes: 4a826c83 ("libnvdimm: namespace indices: read and validate")
Reported-by: NKrzysztof Rusocki <krzysztof.rusocki@intel.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

c31898c8

06 4月, 2018 3 次提交

libnvdimm, testing: update the default smart ctrl_temperature · f6adcca0

由 Vishal Verma 提交于 2月 08, 2018

The default value for smart ctrl_temperature was the same as the
threshold for ctrl_temperature. As a result, any arbitrary smart
injection to the nfit_test dimm could cause this alarm to trigger
and cause an acpi notification. Drop the default value to below the
threshold, so that unrelated injections don't trigger notifications.

Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NVishal Verma <vishal.l.verma@intel.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

f6adcca0

libnvdimm, testing: Add emulation for smart injection commands · 4cf260fc

由 Vishal Verma 提交于 2月 08, 2018

Add support for the smart injection command in the nvdimm unit test
framework. This allows for directly injecting to smart fields and flags
that are supported in the injection command. If the injected values are
past the threshold, then an acpi notification is also triggered.

Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NVishal Verma <vishal.l.verma@intel.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

4cf260fc

nfit, address-range-scrub: introduce nfit_spa->ars_state · 14c73f99

由 Dan Williams 提交于 4月 02, 2018

In preparation for re-working the ARS implementation to better handle
short vs long ARS runs, introduce nfit_spa->ars_state. For now this just
replaces the nfit_spa->ars_required bit-field/flag, but going forward it
will be used to track ARS completion and make short vs long ARS
requests.
Reviewed-by: NDave Jiang <dave.jiang@intel.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

14c73f99

04 4月, 2018 2 次提交

libnvdimm: add an api to cast a 'struct nd_region' to its 'struct device' · 243f29fe

由 Dan Williams 提交于 4月 02, 2018

For debug, it is useful for bus providers to be able to retrieve the
'struct device' associated with an nd_region instance that it
registered. We already have to_nd_region() to perform the reverse cast
operation, in fact its duplicate declaration can be removed from the
private drivers/nvdimm/nd.h header.
Reviewed-by: NDave Jiang <dave.jiang@intel.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

243f29fe

nfit, address-range-scrub: fix scrub in-progress reporting · 78727137

由 Dan Williams 提交于 4月 02, 2018

There is a small window whereby ARS scan requests can schedule work that
userspace will miss when polling scrub_show. Hold the init_mutex lock
over calls to report the status to close this potential escape. Also,
make sure that requests to cancel the ARS workqueue are treated as an
idle event.

Cc: <stable@vger.kernel.org>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Fixes: 37b137ff ("nfit, libnvdimm: allow an ARS scrub...")
Reviewed-by: NDave Jiang <dave.jiang@intel.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

78727137

03 4月, 2018 5 次提交

dax, dm: allow device-mapper to operate without dax support · 976431b0

由 Dan Williams 提交于 3月 29, 2018

Change device-mapper's DAX dependency to require the presence of at
least one DAX_DRIVER. This allows device-mapper to be built without
bringing the DAX core along which is especially wasteful when there are
no DAX drivers, like BLK_DEV_PMEM, configured.

Cc: Alasdair Kergon <agk@redhat.com>
Reported-by: NBart Van Assche <Bart.VanAssche@wdc.com>
Reported-by: Nkbuild test robot <lkp@intel.com>
Reported-by: NArnd Bergmann <arnd@arndb.de>
Reviewed-by: NMike Snitzer <snitzer@redhat.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

976431b0

dax: introduce CONFIG_DAX_DRIVER · 2080e88a

由 Dan Williams 提交于 3月 29, 2018

In support of allowing device-mapper to compile out idle/dead code when
there are no dax providers in the system, introduce the DAX_DRIVER
symbol. This is selected by all leaf drivers that device-mapper might be
layered on top. This allows device-mapper to conditionally 'select DAX'
only when a provider is present.

Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Reported-by: NBart Van Assche <Bart.VanAssche@wdc.com>
Reviewed-by: NMike Snitzer <snitzer@redhat.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

2080e88a

fs, dax: use page->mapping to warn if truncate collides with a busy page · d2c997c0

由 Dan Williams 提交于 12月 22, 2017

Catch cases where extent unmap operations encounter pages that are
pinned / busy. Typically this is pinned pages that are under active dma.
This warning is a canary for potential data corruption as truncated
blocks could be allocated to a new file while the device is still
performing i/o.

Here is an example of a collision that this implementation catches:

 WARNING: CPU: 2 PID: 1286 at fs/dax.c:343 dax_disassociate_entry+0x55/0x80
 [..]
 Call Trace:
  __dax_invalidate_mapping_entry+0x6c/0xf0
  dax_delete_mapping_entry+0xf/0x20
  truncate_exceptional_pvec_entries.part.12+0x1af/0x200
  truncate_inode_pages_range+0x268/0x970
  ? tlb_gather_mmu+0x10/0x20
  ? up_write+0x1c/0x40
  ? unmap_mapping_range+0x73/0x140
  xfs_free_file_space+0x1b6/0x5b0 [xfs]
  ? xfs_file_fallocate+0x7f/0x320 [xfs]
  ? down_write_nested+0x40/0x70
  ? xfs_ilock+0x21d/0x2f0 [xfs]
  xfs_file_fallocate+0x162/0x320 [xfs]
  ? rcu_read_lock_sched_held+0x3f/0x70
  ? rcu_sync_lockdep_assert+0x2a/0x50
  ? __sb_start_write+0xd0/0x1b0
  ? vfs_fallocate+0x20c/0x270
  vfs_fallocate+0x154/0x270
  SyS_fallocate+0x43/0x80
  entry_SYSCALL_64_fastpath+0x1f/0x96

Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Reviewed-by: NJan Kara <jack@suse.cz>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

d2c997c0

ext2, dax: introduce ext2_dax_aops · fb094c90

由 Dan Williams 提交于 12月 21, 2017

In preparation for the dax implementation to start associating dax pages
to inodes via page->mapping, we need to provide a 'struct
address_space_operations' instance for dax. Otherwise, direct-I/O
triggers incorrect page cache assumptions and warnings.
Reviewed-by: NJan Kara <jack@suse.com>
Reported-by: Nkbuild test robot <lkp@intel.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

fb094c90

nfit: fix region registration vs block-data-window ranges · 8d0d8ed3

由 Dan Williams 提交于 4月 02, 2018

Commit 1cf03c00 "nfit: scrub and register regions in a workqueue"
mistakenly attempts to register a region per BLK aperture. There is
nothing to register for individual apertures as they belong as a set to
a BLK aperture group that are registered with a corresponding
DIMM-control-region. Filter them for registration to prevent some
needless devm_kzalloc() allocations.

Cc: <stable@vger.kernel.org>
Fixes: 1cf03c00 ("nfit: scrub and register regions in a workqueue")
Reviewed-by: NDave Jiang <dave.jiang@intel.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

8d0d8ed3

31 3月, 2018 5 次提交

ext4, dax: introduce ext4_dax_aops · 5f0663bb

由 Dan Williams 提交于 12月 21, 2017

In preparation for the dax implementation to start associating dax pages
to inodes via page->mapping, we need to provide a 'struct
address_space_operations' instance for dax. Otherwise, direct-I/O
triggers incorrect page cache assumptions and warnings.

Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: linux-ext4@vger.kernel.org
Reviewed-by: NJan Kara <jack@suse.cz>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

5f0663bb

xfs, dax: introduce xfs_dax_aops · 6e2608df

由 Dan Williams 提交于 3月 07, 2018

In preparation for the dax implementation to start associating dax pages
to inodes via page->mapping, we need to provide a 'struct
address_space_operations' instance for dax. Otherwise, direct-I/O
triggers incorrect page cache assumptions and warnings like the
following:

 WARNING: CPU: 27 PID: 1783 at fs/xfs/xfs_aops.c:1468
 xfs_vm_set_page_dirty+0xf3/0x1b0 [xfs]
 [..]
 CPU: 27 PID: 1783 Comm: dma-collision Tainted: G           O 4.15.0-rc2+ #984
 [..]
 Call Trace:
  set_page_dirty_lock+0x40/0x60
  bio_set_pages_dirty+0x37/0x50
  iomap_dio_actor+0x2b7/0x3b0
  ? iomap_dio_zero+0x110/0x110
  iomap_apply+0xa4/0x110
  iomap_dio_rw+0x29e/0x3b0
  ? iomap_dio_zero+0x110/0x110
  ? xfs_file_dio_aio_read+0x7c/0x1a0 [xfs]
  xfs_file_dio_aio_read+0x7c/0x1a0 [xfs]
  xfs_file_read_iter+0xa0/0xc0 [xfs]
  __vfs_read+0xf9/0x170
  vfs_read+0xa6/0x150
  SyS_pread64+0x93/0xb0
  entry_SYSCALL_64_fastpath+0x1f/0x96

...where the default set_page_dirty() handler assumes that dirty state
is being tracked in 'struct page' flags.

Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Suggested-by: NJan Kara <jack@suse.cz>
Suggested-by: NDave Chinner <david@fromorbit.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NJan Kara <jack@suse.cz>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

6e2608df

block, dax: remove dead code in blkdev_writepages() · 15aa8a01

由 Dan Williams 提交于 3月 11, 2018

Block device inodes never have S_DAX set, so kill the check for DAX and
diversion to dax_writeback_mapping_range().

Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Dave Chinner <david@fromorbit.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NJan Kara <jack@suse.cz>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

15aa8a01

fs, dax: prepare for dax-specific address_space_operations · f44c7763

由 Dan Williams 提交于 3月 07, 2018

In preparation for the dax implementation to start associating dax pages
to inodes via page->mapping, we need to provide a 'struct
address_space_operations' instance for dax. Define some generic VFS aops
helpers for dax. These noop implementations are there in the dax case to
prevent the VFS from falling back to operations with page-cache
assumptions, dax_writeback_mapping_range() may not be referenced in the
FS_DAX=n case.

Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Suggested-by: NMatthew Wilcox <mawilcox@microsoft.com>
Suggested-by: NJan Kara <jack@suse.cz>
Suggested-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NJan Kara <jack@suse.cz>
Suggested-by: NDave Chinner <david@fromorbit.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

f44c7763

dax: store pfns in the radix · 3fe0791c

由 Dan Williams 提交于 10月 14, 2017

In preparation for examining the busy state of dax pages in the truncate
path, switch from sectors to pfns in the radix.

Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Reviewed-by: NJan Kara <jack@suse.cz>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

3fe0791c

29 3月, 2018 1 次提交

acpi, nfit: rework NVDIMM leaf method detection · 466d1493

由 Dan Williams 提交于 3月 28, 2018

Some BIOSen do not handle 0-byte transfer lengths for the _LSR and _LSW
(label storage read/write) methods. This causes Linux to fallback to the
deprecated _DSM path, or otherwise disable label support.

Introduce acpi_nvdimm_has_method() to detect whether a method is
available rather than calling the method, require _LSI and _LSR to be
paired, and require read support before enabling write support.

Cc: <stable@vger.kernel.org>
Fixes: 4b27db7e ("acpi, nfit: add support for the _LS...")
Suggested-by: NErik Schmauss <erik.schmauss@intel.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

466d1493

26 3月, 2018 8 次提交

L

Linux 4.16-rc7 · 3eb2ce82
由 Linus Torvalds 提交于 3月 25, 2018

3eb2ce82

Merge tag 'dmaengine-fix-4.16-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/slave-dma · cb641659

由 Linus Torvalds 提交于 3月 25, 2018

Pull dmaengine fix from Vinod Koul:
 "One small fix for stm32-dmamux fixing buffer overflow"

* tag 'dmaengine-fix-4.16-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/slave-dma:
  dmaengine: stm32-dmamux: fix a potential buffer overflow

cb641659

Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · d2862360

由 Linus Torvalds 提交于 3月 25, 2018

Pull x86 and PTI fixes from Ingo Molnar:
 "Misc fixes:

   - fix EFI pagetables freeing

   - fix vsyscall pagetable setting on Xen PV guests

   - remove ancient CONFIG_X86_PPRO_FENCE=y - x86 is TSO again

   - fix two binutils (ld) development version related incompatibilities

   - clean up breakpoint handling

   - fix an x86 self-test"

* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/entry/64: Don't use IST entry for #BP stack
  x86/efi: Free efi_pgd with free_pages()
  x86/vsyscall/64: Use proper accessor to update P4D entry
  x86/cpu: Remove the CONFIG_X86_PPRO_FENCE=y quirk
  x86/boot/64: Verify alignment of the LOAD segment
  x86/build/64: Force the linker to use 2MB page size
  selftests/x86/ptrace_syscall: Fix for yet more glibc interference

d2862360

Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 9fd64e8a

由 Linus Torvalds 提交于 3月 25, 2018

Pull timer fix from Ingo Molnar:
 "Make posix clock ID usage Spectre-safe"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  posix-timers: Protect posix clock array access against speculation

9fd64e8a

Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · bf45bae9

由 Linus Torvalds 提交于 3月 25, 2018

Pull scheduler fixes from Ingo Molnar:
 "Two sched debug output related fixes: a console output fix and
  formatting fixes"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/debug: Adjust newlines for better alignment
  sched/debug: Fix per-task line continuation for console output

bf45bae9

Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · eaf67993

由 Linus Torvalds 提交于 3月 25, 2018

Pull perf fixes from Ingo Molnar:
 "Misc kernel side fixes.

  Generic:
   - cgroup events counting fix

  x86:
   - Intel PMU truncated-parameter fix

   - RDPMC fix

   - API naming fix/rename

   - uncore driver big-hardware PCI enumeration fix

   - uncore driver filter constraint fix"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/cgroup: Fix child event counting bug
  perf/x86/intel/uncore: Fix multi-domain PCI CHA enumeration bug on Skylake servers
  perf/x86/intel: Rename confusing 'freerunning PEBS' API and implementation to 'large PEBS'
  perf/x86/intel/uncore: Add missing filter constraint for SKX CHA event
  perf/x86/intel: Don't accidentally clear high bits in bdw_limit_period()
  perf/x86/intel: Disable userspace RDPMC usage for large PEBS

eaf67993

Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 6bacf660

由 Linus Torvalds 提交于 3月 25, 2018

Pull locking fixes from Ingo Molnar:
 "Two fixes: tighten up a jump-labels warning to not trigger on certain
  modules and fix confusing (and non-existent) mutex API documentation"

* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  jump_label: Disable jump labels in __exit code
  locking/mutex: Improve documentation

6bacf660

tty: vt: fix up tabstops properly · f1869a89

由 Linus Torvalds 提交于 3月 24, 2018

Tabs on a console with long lines do not wrap properly, so correctly
account for the line length when computing the tab placement location.
Reported-by: NJames Holderness <j4_james@hotmail.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

f1869a89

25 3月, 2018 3 次提交

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace · e43d40b3

由 Linus Torvalds 提交于 3月 24, 2018

Pull mqueuefs revert from Eric Biederman:
 "This fixes a regression that came in the merge window for v4.16.

  The problem is that the permissions for mounting and using the
  mqueuefs filesystem are broken. The necessary permission check is
  missing letting people who should not be able to mount mqueuefs mount
  mqueuefs. The field sb->s_user_ns is set incorrectly not allowing the
  mounter of mqueuefs to remount and otherwise have proper control over
  the filesystem.

  Al Viro and I see the path to the necessary fixes differently and I am
  not even certain at this point he actually sees all of the necessary
  fixes. Given a couple weeks we can probably work something out but I
  don't see the review being resolved in time for the final v4.16. I
  don't want v4.16 shipping with a nasty regression. So unfortunately I
  am sending a revert"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  Revert "mqueue: switch to on-demand creation of internal mount"

e43d40b3

Revert "mqueue: switch to on-demand creation of internal mount" · cfb2f6f6

由 Eric W. Biederman 提交于 3月 24, 2018

This reverts commit 36735a6a.

Aleksa Sarai <asarai@suse.de> writes:
> [REGRESSION v4.16-rc6] [PATCH] mqueue: forbid unprivileged user access to internal mount
>
> Felix reported weird behaviour on 4.16.0-rc6 with regards to mqueue[1],
> which was introduced by 36735a6a ("mqueue: switch to on-demand
> creation of internal mount").
>
> Basically, the reproducer boils down to being able to mount mqueue if
> you create a new user namespace, even if you don't unshare the IPC
> namespace.
>
> Previously this was not possible, and you would get an -EPERM. The mount
> is the *host* mqueue mount, which is being cached and just returned from
> mqueue_mount(). To be honest, I'm not sure if this is safe or not (or if
> it was intentional -- since I'm not familiar with mqueue).
>
> To me it looks like there is a missing permission check. I've included a
> patch below that I've compile-tested, and should block the above case.
> Can someone please tell me if I'm missing something? Is this actually
> safe?
>
> [1]: https://github.com/docker/docker/issues/36674

The issue is a lot deeper than a missing permission check.  sb->s_user_ns
was is improperly set as well.  So in addition to the filesystem being
mounted when it should not be mounted, so things are not allow that should
be.

We are practically to the release of 4.16 and there is no agreement between
Al Viro and myself on what the code should looks like to fix things properly.
So revert the code to what it was before so that we can take our time
and discuss this properly.

Fixes: 36735a6a ("mqueue: switch to on-demand creation of internal mount")
Reported-by: NFelix Abecassis <fabecassis@nvidia.com>
Reported-by: NAleksa Sarai <asarai@suse.de>
Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>

cfb2f6f6

Merge tag 'pinctrl-v4.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl · bcfc1f45

由 Linus Torvalds 提交于 3月 24, 2018

Pull pin control fixes from Linus Walleij:
 "Two fixes for pin control for v4.16:

   - Renesas SH-PFC: remove a duplicate clkout pin which was causing
     crashes

   - fix Samsung out of bounds exceptions"

* tag 'pinctrl-v4.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  pinctrl: samsung: Validate alias coming from DT
  pinctrl: sh-pfc: r8a7795: remove duplicate of CLKOUT pin in pinmux_pins[]

bcfc1f45

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功