提交 · 3ae3d67ba705c754a3c91ac009f9ce73a0e7286a · openanolis / cloud-kernel

11 5月, 2017 1 次提交

libnvdimm: add an atomic vs process context flag to rw_bytes · 3ae3d67b

由 Vishal Verma 提交于 5月 10, 2017

nsio_rw_bytes can clear media errors, but this cannot be done while we
are in an atomic context due to locking within ACPI. From the BTT,
->rw_bytes may be called either from atomic or process context depending
on whether the calls happen during initialization or during IO.

During init, we want to ensure error clearing happens, and the flag
marking process context allows nsio_rw_bytes to do that. When called
during IO, we're in atomic context, and error clearing can be skipped.

Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NVishal Verma <vishal.l.verma@intel.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

3ae3d67b

10 5月, 2017 2 次提交

x86, pmem: Fix cache flushing for iovec write < 8 bytes · 8376efd3

由 Ben Hutchings 提交于 5月 09, 2017

Commit 11e63f6d added cache flushing for unaligned writes from an
iovec, covering the first and last cache line of a >= 8 byte write and
the first cache line of a < 8 byte write.  But an unaligned write of
2-7 bytes can still cover two cache lines, so make sure we flush both
in that case.

Cc: <stable@vger.kernel.org>
Fixes: 11e63f6d ("x86, pmem: fix broken __copy_user_nocache ...")
Signed-off-by: NBen Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

8376efd3

device-dax: kill NR_DEV_DAX · cf1e2289

由 Dan Williams 提交于 5月 08, 2017

There is no point to ask how many device-dax instances the kernel should
support. Since we are already using a dynamic major number, just allow
the max number of minors by default and be done. This also fixes the
fact that the proposed max for the NR_DEV_DAX range was larger than what
could be supported by alloc_chrdev_region().

Fixes: ba09c01d ("dax: convert to the cdev api")
Reported-by: NGeert Uytterhoeven <geert@linux-m68k.org>
Tested-by: NGeert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

cf1e2289

09 5月, 2017 2 次提交

block, dax: move "select DAX" from BLOCK to FS_DAX · ef510424

由 Dan Williams 提交于 5月 08, 2017

For configurations that do not enable DAX filesystems or drivers, do not
require the DAX core to be built.

Given that the 'direct_access' method has been removed from
'block_device_operations', we can also go ahead and remove the
block-related dax helper functions from fs/block_dev.c to
drivers/dax/super.c. This keeps dax details out of the block layer and
lets the DAX core be built as a module in the FS_DAX=n case.

Filesystems need to include dax.h to call bdev_dax_supported().

Cc: linux-xfs@vger.kernel.org
Cc: Jens Axboe <axboe@kernel.dk>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Reviewed-by: NJan Kara <jack@suse.com>
Reported-by: NGeert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

ef510424

device-dax: Tell kbuild DEV_DAX_PMEM depends on DEV_DAX · 74d71a01

由 Mike Galbraith 提交于 5月 06, 2017

ERROR: "devm_create_dev_dax" [drivers/dax/dax_pmem.ko] undefined!
ERROR: "alloc_dax_region" [drivers/dax/dax_pmem.ko] undefined!
ERROR: "dax_region_put" [drivers/dax/dax_pmem.ko] undefined!
Signed-off-by: NMike Galbraith <efault@gmx.de>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

74d71a01

05 5月, 2017 4 次提交

D

Merge branch 'for-4.12/dax' into libnvdimm-for-next · 73616367
由 Dan Williams 提交于 5月 04, 2017

73616367

libnvdimm, pfn: fix 'npfns' vs section alignment · d5483fed

由 Dan Williams 提交于 5月 04, 2017

Fix failures to create namespaces due to the vmem_altmap not advertising
enough free space to store the memmap.

 WARNING: CPU: 15 PID: 8022 at arch/x86/mm/init_64.c:656 arch_add_memory+0xde/0xf0
 [..]
 Call Trace:
  dump_stack+0x63/0x83
  __warn+0xcb/0xf0
  warn_slowpath_null+0x1d/0x20
  arch_add_memory+0xde/0xf0
  devm_memremap_pages+0x244/0x440
  pmem_attach_disk+0x37e/0x490 [nd_pmem]
  nd_pmem_probe+0x7e/0xa0 [nd_pmem]
  nvdimm_bus_probe+0x71/0x120 [libnvdimm]
  driver_probe_device+0x2bb/0x460
  bind_store+0x114/0x160
  drv_attr_store+0x25/0x30

In commit 658922e5 "libnvdimm, pfn: fix memmap reservation sizing"
we arranged for the capacity to be allocated, but failed to also update
the 'npfns' parameter. This leads to cases where there is enough
capacity reserved to hold all the allocated sections, but
vmemmap_populate_hugepages() still encounters -ENOMEM from
altmap_alloc_block_buf().

This fix is a stop-gap until we can teach the core memory hotplug
implementation to permit sub-section hotplug.

Cc: <stable@vger.kernel.org>
Fixes: 658922e5 ("libnvdimm, pfn: fix memmap reservation sizing")
Reported-by: NAnisha Allada <anisha.allada@intel.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

d5483fed

libnvdimm: handle locked label storage areas · 9d62ed96

由 Dan Williams 提交于 5月 04, 2017

Per the latest version of the "NVDIMM DSM Interface Example" [1], the
label data retrieval routine can report a "locked" status. In this case
all regions associated with that DIMM are disabled until the label area
is unlocked. Provide generic libnvdimm enabling for NVDIMMs with label
data area locking capabilities.

[1]: http://pmem.io/documents/Signed-off-by: NDan Williams <dan.j.williams@intel.com>

9d62ed96

libnvdimm: convert NDD_ flags to use bitops, introduce NDD_LOCKED · 8f078b38

由 Dan Williams 提交于 5月 04, 2017

This is a preparation patch for handling locked nvdimm label regions, a
new concept as introduced by the latest DSM document on pmem.io [1]. A
future patch will leverage nvdimm_set_locked() at DIMM probe time to
flag regions that can not be enabled. There should be no functional
difference resulting from this change.

[1]: http://pmem.io/documents/NVDIMM_DSM_Interface_Example-V1.3.pdfSigned-off-by: NDan Williams <dan.j.williams@intel.com>

8f078b38

04 5月, 2017 1 次提交

brd: fix uninitialized use of brd->dax_dev · 1ef97fe4

由 Gerald Schaefer 提交于 5月 03, 2017

commit 1647b9b9 "brd: add dax_operations support" introduced the allocation
and freeing of a dax_device, but the allocated dax_device is not stored
into the brd_device, so brd_del_one() will eventually operate on an
uninitialized brd->dax_dev.

Fix this by storing the allocated dax_device to brd->dax_dev.
Signed-off-by: NGerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

1ef97fe4

02 5月, 2017 3 次提交

block, dax: use correct format string in bdev_dax_supported · 67fd3897

由 Arnd Bergmann 提交于 4月 27, 2017

The new message has an incorrect format string, causing a warning in some
configurations:

fs/block_dev.c: In function 'bdev_dax_supported':
fs/block_dev.c:779:5: error: format '%d' expects argument of type 'int', but argument 2 has type 'long int' [-Werror=format=]
     "error: dax access failed (%d)", len);

This changes it to use the correct %ld instead of %d.

Fixes: 2093f2e9 ("block, dax: convert bdev_dax_supported() to dax_direct_access()")
Signed-off-by: NArnd Bergmann <arnd@arndb.de>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

67fd3897

device-dax: fix sysfs attribute deadlock · 565851c9

由 Dan Williams 提交于 4月 30, 2017

Usage of device_lock() for dax_region attributes is unnecessary and
deadlock prone. It's unnecessary because the order of registration /
un-registration guarantees that drvdata is always valid. It's deadlock
prone because it sets up this situation:

 ndctl           D    0  2170   2082 0x00000000
 Call Trace:
  __schedule+0x31f/0x980
  schedule+0x3d/0x90
  schedule_preempt_disabled+0x15/0x20
  __mutex_lock+0x402/0x980
  ? __mutex_lock+0x158/0x980
  ? align_show+0x2b/0x80 [dax]
  ? kernfs_seq_start+0x2f/0x90
  mutex_lock_nested+0x1b/0x20
  align_show+0x2b/0x80 [dax]
  dev_attr_show+0x20/0x50

 ndctl           D    0  2186   2079 0x00000000
 Call Trace:
  __schedule+0x31f/0x980
  schedule+0x3d/0x90
  __kernfs_remove+0x1f6/0x340
  ? kernfs_remove_by_name_ns+0x45/0xa0
  ? remove_wait_queue+0x70/0x70
  kernfs_remove_by_name_ns+0x45/0xa0
  remove_files.isra.1+0x35/0x70
  sysfs_remove_group+0x44/0x90
  sysfs_remove_groups+0x2e/0x50
  dax_region_unregister+0x25/0x40 [dax]
  devm_action_release+0xf/0x20
  release_nodes+0x16d/0x2b0
  devres_release_all+0x3c/0x60
  device_release_driver_internal+0x17d/0x220
  device_release_driver+0x12/0x20
  unbind_store+0x112/0x160

ndctl/2170 is trying to acquire the device_lock() to read an attribute,
and ndctl/2186 is holding the device_lock() while trying to drain all
active attribute readers.

Thanks to Yi Zhang for the reproduction script.

Fixes: d7fe1a67 ("dax: add region 'id', 'size', and 'align' attributes")
Cc: <stable@vger.kernel.org>
Reported-by: NYi Zhang <yizhan@redhat.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

565851c9

libnvdimm: restore "libnvdimm: band aid btt vs clear poison locking" · a3e9af95

由 Dan Williams 提交于 5月 01, 2017

This continues the 4.11 status quo of disabling of error clearing from
the BTT I/O path. Toshi found that even though we have eliminated all
the libnvdimm sources of sleeping-while-atomic triggers, we still have
sleeping operations that will occur in the path to send the ACPI DSM to
the DIMM to clear the error:

 BUG: sleeping function called from invalid context at mm/slab.h:432
 in_atomic(): 1, irqs_disabled(): 0, pid: 13353, name: dd
 Call Trace:
  dump_stack+0x86/0xc3
  ___might_sleep+0x17d/0x250
  __might_sleep+0x4a/0x80
  __kmalloc+0x1c0/0x2e0
  acpi_os_allocate_zeroed+0x2d/0x2f
  acpi_evaluate_object+0x59/0x3b1
  acpi_evaluate_dsm+0xbd/0x10c
  acpi_nfit_ctl+0x1ef/0x7c0 [nfit]
  ? nsio_rw_bytes+0x152/0x280
  nvdimm_clear_poison+0x77/0x140
  nsio_rw_bytes+0x18f/0x280
  btt_write_pg+0x1d4/0x3d0 [nd_btt]
  btt_make_request+0x119/0x2d0 [nd_btt]

A solution for tracking and handling media errors natively in the BTT is
needed.

Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Reported-by: NToshi Kani <toshi.kani@hpe.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

a3e9af95

01 5月, 2017 1 次提交

libnvdimm: fix nvdimm_bus_lock() vs device_lock() ordering · 452bae0a

由 Dan Williams 提交于 4月 28, 2017

A debug patch to turn the standard device_lock() into something that
lockdep can analyze yielded the following:

 ======================================================
 [ INFO: possible circular locking dependency detected ]
 4.11.0-rc4+ #106 Tainted: G           O
 -------------------------------------------------------
 lt-libndctl/1898 is trying to acquire lock:
  (&dev->nvdimm_mutex/3){+.+.+.}, at: [<ffffffffc023c948>] nd_attach_ndns+0x178/0x1b0 [libnvdimm]

 but task is already holding lock:
  (&nvdimm_bus->reconfig_mutex){+.+.+.}, at: [<ffffffffc022e0b1>] nvdimm_bus_lock+0x21/0x30 [libnvdimm]

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

 -> #1 (&nvdimm_bus->reconfig_mutex){+.+.+.}:
        lock_acquire+0xf6/0x1f0
        __mutex_lock+0x88/0x980
        mutex_lock_nested+0x1b/0x20
        nvdimm_bus_lock+0x21/0x30 [libnvdimm]
        nvdimm_namespace_capacity+0x1b/0x40 [libnvdimm]
        nvdimm_namespace_common_probe+0x230/0x510 [libnvdimm]
        nd_pmem_probe+0x14/0x180 [nd_pmem]
        nvdimm_bus_probe+0xa9/0x260 [libnvdimm]

 -> #0 (&dev->nvdimm_mutex/3){+.+.+.}:
        __lock_acquire+0x1107/0x1280
        lock_acquire+0xf6/0x1f0
        __mutex_lock+0x88/0x980
        mutex_lock_nested+0x1b/0x20
        nd_attach_ndns+0x178/0x1b0 [libnvdimm]
        nd_namespace_store+0x308/0x3c0 [libnvdimm]
        namespace_store+0x87/0x220 [libnvdimm]

In this case '&dev->nvdimm_mutex/3' mirrors '&dev->mutex'.

Fix this by replacing the use of device_lock() with nvdimm_bus_lock() to protect
nd_{attach,detach}_ndns() operations.

Cc: <stable@vger.kernel.org>
Fixes: 8c2f7e86 ("libnvdimm: infrastructure for btt devices")
Reported-by: NYi Zhang <yizhan@redhat.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

452bae0a

30 4月, 2017 1 次提交

libnvdimm: rework region badblocks clearing · 23f49844

由 Dan Williams 提交于 4月 29, 2017

Toshi noticed that the new support for a region-level badblocks missed
the case where errors are cleared due to BTT I/O.

An initial attempt to fix this ran into a "sleeping while atomic"
warning due to taking the nvdimm_bus_lock() in the BTT I/O path to
satisfy the locking requirements of __nvdimm_bus_badblocks_clear().
However, that lock is not needed since we are not acting on any data that
is subject to change under that lock. The badblocks instance has its own
internal lock to handle mutations of the error list.

So, in order to make it clear that we are just acting on region devices,
rename __nvdimm_bus_badblocks_clear() to nvdimm_clear_badblocks_regions().
Eliminate the lock and consolidate all support routines for the new
nvdimm_account_cleared_poison() in drivers/nvdimm/bus.c. Finally, to the
opportunity to cleanup to some unnecessary casts, make the calling
convention of nvdimm_clear_badblocks_regions() clearer by replacing struct
resource with the minimal struct clear_badblocks_context, and use the
DEVICE_ATTR macro.

Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Reported-by: NToshi Kani <toshi.kani@hpe.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

23f49844

29 4月, 2017 4 次提交

acpi, nfit: kill ACPI_NFIT_DEBUG · 7699a6a3

由 Dan Williams 提交于 4月 28, 2017

Inevitably when one actually needs to debug a DSM issue it's on a
distribution kernel that has CONFIG_ACPI_NFIT_DEBUG=n. The config symbol
was only there to avoid the compile error due to the missing fallback for
print_hex_dump_debug in the CONFIG_DYNAMIC_DEBUG=n case. That was fixed
with commit cdf17449 "hexdump: do not print debug dumps for
!CONFIG_DEBUG", so the config symbol can just be dropped.

Cc: Joe Perches <joe@perches.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

7699a6a3

libnvdimm: fix clear length of nvdimm_forget_poison() · 8d13c029

由 Toshi Kani 提交于 4月 27, 2017

ND_CMD_CLEAR_ERROR command returns 'clear_err.cleared', the length
of error actually cleared, which may be smaller than its requested
'len'.

Change nvdimm_clear_poison() to call nvdimm_forget_poison() with
'clear_err.cleared' when this value is valid.

Cc: <stable@vger.kernel.org>
Fixes: e046114a ("libnvdimm: clear the internal poison_list when clearing badblocks")
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: NToshi Kani <toshi.kani@hpe.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

8d13c029

libnvdimm, pmem: fix a NULL pointer BUG in nd_pmem_notify · b2518c78

由 Toshi Kani 提交于 4月 25, 2017

The following BUG was observed when nd_pmem_notify() was called
for a BTT device.  The use of a pmem_device pointer is not valid
with BTT.

 BUG: unable to handle kernel NULL pointer dereference at 0000000000000030
 IP: nd_pmem_notify+0x30/0xf0 [nd_pmem]
 Call Trace:
  nd_device_notify+0x40/0x50
  child_notify+0x10/0x20
  device_for_each_child+0x50/0x90
  nd_region_notify+0x20/0x30
  nd_device_notify+0x40/0x50
  nvdimm_region_notify+0x27/0x30
  acpi_nfit_scrub+0x341/0x590 [nfit]
  process_one_work+0x197/0x450
  worker_thread+0x4e/0x4a0
  kthread+0x109/0x140

Fix nd_pmem_notify() by setting nd_region and badblocks pointers
properly for BTT.

Cc: <stable@vger.kernel.org>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Fixes: 71999466 ("libnvdimm: async notification support")
Signed-off-by: NToshi Kani <toshi.kani@hpe.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

b2518c78

libnvdimm, region: sysfs trigger for nvdimm_flush() · ab630891

由 Dan Williams 提交于 4月 21, 2017

The nvdimm_flush() mechanism helps to reduce the impact of an ADR
(asynchronous-dimm-refresh) failure. The ADR mechanism handles flushing
platform WPQ (write-pending-queue) buffers when power is removed. The
nvdimm_flush() mechanism performs that same function on-demand.

When a pmem namespace is associated with a block device, an
nvdimm_flush() is triggered with every block-layer REQ_FUA, or REQ_FLUSH
request. These requests are typically associated with filesystem
metadata updates. However, when a namespace is in device-dax mode,
userspace (think database metadata) needs another path to perform the
same flushing. In other words this is not required to make data
persistent, but in the case of metadata it allows for a smaller failure
domain in the unlikely event of an ADR failure.

The new 'deep_flush' attribute is visible when the individual DIMMs
backing a given interleave-set are described by platform firmware. In
ACPI terms this is "NVDIMM Region Mapping Structures" and associated
"Flush Hint Address Structures". Reads return "1" if the region supports
triggering WPQ flushes on all DIMMs. Reads return "0" the flush
operation is a platform nop, and in that case the attribute is
read-only.

Why sysfs and not an ioctl? An ioctl requires establishing a new
ioctl function number space for device-dax. Given that this would be
called on a device-dax fd an application could be forgiven for
accidentally calling this on a filesystem-dax fd. Placing this interface
in libnvdimm sysfs removes that potential for collision with a
filesystem ioctl, and it keeps ioctls out of the generic device-dax
implementation.

Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

ab630891

28 4月, 2017 1 次提交

libnvdimm: fix phys_addr for nvdimm_clear_poison · 97681f9b

由 Toshi Kani 提交于 4月 25, 2017

nvdimm_clear_poison() expects a physical address, not an offset.
Fix nsio_rw_bytes() to call nvdimm_clear_poison() with a physical
address.
Signed-off-by: NToshi Kani <toshi.kani@hpe.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Reviewed-by: NVishal Verma <vishal.l.verma@intel.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

97681f9b

26 4月, 2017 7 次提交

x86, dax, pmem: remove indirection around memcpy_from_pmem() · 6abccd1b

由 Dan Williams 提交于 1月 13, 2017

memcpy_from_pmem() maps directly to memcpy_mcsafe(). The wrapper
serves no real benefit aside from affording a more generic function name
than the x86-specific 'mcsafe'. However this would not be the first time
that x86 terminology leaked into the global namespace. For lack of
better name, just use memcpy_mcsafe() directly.

This conversion also catches a place where we should have been using
plain memcpy, acpi_nfit_blk_single_io().

Cc: <x86@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Acked-by: NTony Luck <tony.luck@intel.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

6abccd1b

block: remove block_device_operations ->direct_access() · d4b29fd7

由 Dan Williams 提交于 1月 27, 2017

Now that all the producers and consumers of dax interfaces have been
converted to using dax_operations on a dax_device, remove the block
device direct_access enabling.
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

d4b29fd7

block, dax: convert bdev_dax_supported() to dax_direct_access() · 2093f2e9

由 Dan Williams 提交于 4月 01, 2017

Kill of the final user of bdev_direct_access() and struct blk_dax_ctl.
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

2093f2e9

filesystem-dax: convert to dax_direct_access() · cccbce67

由 Dan Williams 提交于 1月 27, 2017

Now that a dax_device is plumbed through all dax-capable drivers we can
switch from block_device_operations to dax_operations for invoking
->direct_access.

This also lets us kill off some usages of struct blk_dax_ctl on the way
to its eventual removal.
Suggested-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

cccbce67

Revert "block: use DAX for partition table reads" · a41fe02b

由 Dan Williams 提交于 1月 27, 2017

commit d1a5f2b4 ("block: use DAX for partition table reads") was
part of a stalled effort to allow dax mappings of block devices. Since
then the device-dax mechanism has filled the role of dax-mapping static
device ranges.

Now that we are moving ->direct_access() from a block_device operation
to a dax_inode operation we would need block devices to map and carry
their own dax_inode reference.

Unless / until we decide to revive dax mapping of raw block devices
through the dax_inode scheme, there is no need to carry
read_dax_sector(). Its removal in turn allows for the removal of
bdev_direct_access() and should have been included in commit
22375701 ("block_dev: remove DAX leftovers").

Cc: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

a41fe02b

ext2, ext4, xfs: retrieve dax_device for iomap operations · fa5d932c

由 Dan Williams 提交于 1月 27, 2017

In preparation for converting fs/dax.c to use dax_direct_access()
instead of bdev_direct_access(), add the plumbing to retrieve the
dax_device associated with a given block_device.
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

fa5d932c

dm: teach dm-targets to use a dax_device + dax_operations · 817bf402

由 Dan Williams 提交于 4月 12, 2017

Arrange for dm to lookup the dax services available from member devices.
Update the dax-capable targets, linear and stripe, to route dax
operations to the underlying device. Changes the target-internal
->direct_access() method to more closely align with the dax_operations
->direct_access() calling convention.

Cc: Toshi Kani <toshi.kani@hpe.com>
Reviewed-by: NMike Snitzer <snitzer@redhat.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

817bf402

25 4月, 2017 1 次提交

libnvdimm, region: fix flush hint detection crash · bc042fdf

由 Dan Williams 提交于 4月 24, 2017

In the case where a dimm does not have any associated flush hints the
ndrd->flush_wpq array may be uninitialized leading to crashes with the
following signature:

 BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
 IP: region_visible+0x10f/0x160 [libnvdimm]

 Call Trace:
  internal_create_group+0xbe/0x2f0
  sysfs_create_groups+0x40/0x80
  device_add+0x2d8/0x650
  nd_async_device_register+0x12/0x40 [libnvdimm]
  async_run_entry_fn+0x39/0x170
  process_one_work+0x212/0x6c0
  ? process_one_work+0x197/0x6c0
  worker_thread+0x4e/0x4a0
  kthread+0x10c/0x140
  ? process_one_work+0x6c0/0x6c0
  ? kthread_create_on_node+0x60/0x60
  ret_from_fork+0x31/0x40

Cc: <stable@vger.kernel.org>
Reviewed-by: NJeff Moyer <jmoyer@redhat.com>
Fixes: f284a4f2 ("libnvdimm: introduce nvdimm_flush() and nvdimm_has_flush()")
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

bc042fdf

21 4月, 2017 3 次提交

dm: add dax_device and dax_operations support · f26c5719

由 Dan Williams 提交于 4月 12, 2017

Allocate a dax_device to represent the capacity of a device-mapper
instance. Provide a ->direct_access() method via the new dax_operations
indirection that mirrors the functionality of the current direct_access
support via block_device_operations.  Once fs/dax.c has been converted
to use dax_operations the old dm_blk_direct_access() will be removed.

A new helper dm_dax_get_live_target() is introduced to separate some of
the dm-specifics from the direct_access implementation.

This enabling is only for the top-level dm representation to upper
layers. Converting target direct_access implementations is deferred to a
separate patch.

Cc: Toshi Kani <toshi.kani@hpe.com>
Reviewed-by: NMike Snitzer <snitzer@redhat.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

f26c5719

dax: introduce dax_direct_access() · b0686260

由 Dan Williams 提交于 1月 26, 2017

Replace bdev_direct_access() with dax_direct_access() that uses
dax_device and dax_operations instead of a block_device and
block_device_operations for dax. Once all consumers of the old api have
been converted bdev_direct_access() will be deleted.

Given that block device partitioning decisions can cause dax page
alignment constraints to be violated this also introduces the
bdev_dax_pgoff() helper. It handles calculating a logical pgoff relative
to the dax_device and also checks for page alignment.
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

b0686260

block: kill bdev_dax_capable() · d8f07aee

由 Dan Williams 提交于 1月 26, 2017

This is leftover dead code that has since been replaced by
bdev_dax_supported().
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

d8f07aee

20 4月, 2017 6 次提交

dcssblk: add dax_operations support · 7a2765f6

由 Dan Williams 提交于 1月 26, 2017

Setup a dax_dev to have the same lifetime as the dcssblk block device
and add a ->direct_access() method that is equivalent to
dcssblk_direct_access(). Once fs/dax.c has been converted to use
dax_operations the old dcssblk_direct_access() will be removed.
Reported-by: NGerald Schaefer <gerald.schaefer@de.ibm.com>
Acked-by: NGerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

7a2765f6

brd: add dax_operations support · 1647b9b9

由 Dan Williams 提交于 1月 25, 2017

    
Setup a dax_inode to have the same lifetime as the brd block device and
add a ->direct_access() method that is equivalent to
brd_direct_access(). Once fs/dax.c has been converted to use
dax_operations the old brd_direct_access() will be removed.
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

1647b9b9

axon_ram: add dax_operations support · 60fcd55c

由 Dan Williams 提交于 1月 25, 2017

Setup a dax_device to have the same lifetime as the axon_ram block
device and add a ->direct_access() method that is equivalent to
axon_ram_direct_access(). Once fs/dax.c has been converted to use
dax_operations the old axon_ram_direct_access() will be removed.
Reported-by: NGerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

60fcd55c

pmem: add dax_operations support · c1d6e828

由 Dan Williams 提交于 1月 24, 2017

Setup a dax_device to have the same lifetime as the pmem block device
and add a ->direct_access() method that is equivalent to
pmem_direct_access(). Once fs/dax.c has been converted to use
dax_operations the old pmem_direct_access() will be removed.
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

c1d6e828

dax: introduce dax_operations · 6568b08b

由 Dan Williams 提交于 1月 24, 2017

Track a set of dax_operations per dax_device that can be set at
alloc_dax() time. These operations will be used to stop the abuse of
block_device_operations for communicating dax capabilities to
filesystems. It will also be used to replace the "pmem api" and move
pmem-specific cache maintenance, and other dax-driver-specific
filesystem-dax operations, to dax device methods. In particular this
allows us to stop abusing __copy_user_nocache(), via memcpy_to_pmem(),
with a driver specific replacement.

This is a standalone introduction of the operations. Follow on patches
convert each dax-driver and teach fs/dax.c to use ->direct_access() from
dax_operations instead of block_device_operations.
Suggested-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

6568b08b

dax: add a facility to lookup a dax device by 'host' device name · 72058005

由 Dan Williams 提交于 4月 19, 2017

For the current block_device based filesystem-dax path, we need a way
for it to lookup the dax_device associated with a block_device. Add a
'host' property of a dax_device that can be used for this purpose. It is
a free form string, but for a dax_device associated with a block device
it is the bdev name.

This is a stop-gap until filesystems are able to mount on a dax-inode
directly.
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

72058005

19 4月, 2017 2 次提交

acpi, nfit: fix module unload vs workqueue shutdown race · fbabd829

由 Dan Williams 提交于 4月 18, 2017

The workqueue may still be running when the devres callbacks start
firing to deallocate an acpi_nfit_desc instance. Stop and flush the
workqueue before letting any other devres de-allocations proceed.
Reported-by: NLinda Knippers <linda.knippers@hpe.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

fbabd829

tools/testing/nvdimm: fix nfit_test shutdown crash · 8b06b884

由 Dan Williams 提交于 4月 13, 2017

Keep the nfit_test instances alive until after nfit_test_teardown(), as
we may be doing resource lookups until the final un-registrations have
completed. This fixes crashes of the form.

 BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
 IP: __release_resource+0x12/0x90
 Call Trace:
  remove_resource+0x23/0x40
  __wrap_remove_resource+0x29/0x30 [nfit_test_iomap]
  acpi_nfit_remove_resource+0xe/0x10 [nfit]
  devm_action_release+0xf/0x20
  release_nodes+0x16d/0x2b0
  devres_release_all+0x3c/0x60
  device_release+0x21/0x90
  kobject_release+0x6a/0x170
  kobject_put+0x2f/0x60
  put_device+0x17/0x20
  platform_device_unregister+0x20/0x30
  nfit_test_exit+0x36/0x960 [nfit_test]
Reported-by: NLinda Knippers <linda.knippers@hpe.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

8b06b884

18 4月, 2017 1 次提交

acpi, nfit: limit ->flush_probe() to initialization work · 9ccaed4b

由 Dan Williams 提交于 4月 13, 2017

The nvdimm probe flushing mechanism gives userspace a sync point where
it knows all asynchronous driver probe sequences have completed.
However, it need not wait for other asynchronous actions, like
on-demand address-range-scrub. Track the init work separately from other
work in the workqueue, and only flush the former.
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

9ccaed4b

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功