提交 · 32f61d67570db036dba9e4fb2509b5343e28f627 · openeuler / Kernel

02 9月, 2020 1 次提交

nvdimm: simplify revalidate_disk handling · 32f61d67

由 Christoph Hellwig 提交于 9月 01, 2020

The nvdimm block driver abuse revalidate_disk in a strange way, and
totally unrelated to what other drivers do.  Simplify this by just
calling nvdimm_revalidate_disk (which seems rather misnamed) from the
probe routines, as the additional bdev size revalidation is pointless
at this point, and remove the revalidate_disk methods given that
it can only be triggered from add_disk, which is right before the
manual calls.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NJosef Bacik <josef@toxicpanda.com>
Reviewed-by: NJohannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

32f61d67

27 5月, 2020 1 次提交

nvdimm: use bio_{start,end}_io_acct · 0fd92f89

由 Christoph Hellwig 提交于 5月 27, 2020

Switch dm to use the nicer bio accounting helpers.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NKonstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

0fd92f89

31 3月, 2020 1 次提交

libnvdimm/region: Replace zero-length array with flexible-array member · 9106137c

由 Gustavo A. R. Silva 提交于 3月 19, 2020

The current codebase makes use of the zero-length array language
extension to the C90 standard, but the preferred mechanism to declare
variable-length types such as these ones is a flexible array member[1][2],
introduced in C99:

struct foo {
        int stuff;
        struct boo array[];
};

By making use of the mechanism above, we will get a compiler warning
in case the flexible array does not occur last in the structure, which
will help us prevent some kind of undefined behavior bugs from being
inadvertently introduced[3] to the codebase from now on.

Also, notice that, dynamic memory allocations won't be affected by
this change:

"Flexible array members have incomplete type, and so the sizeof operator
may not be applied. As a quirk of the original implementation of
zero-length arrays, sizeof evaluates to zero."[1]

This issue was found with the help of Coccinelle.

[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit 76497732 ("cxgb3/l2t: Fix undefined behaviour")
Signed-off-by: NGustavo A. R. Silva <gustavo@embeddedor.com>
Link: https://lore.kernel.org/r/20200319230937.GA16648@embeddedor.comSigned-off-by: NDan Williams <dan.j.williams@intel.com>

9106137c

18 3月, 2020 2 次提交

libnvdimm/region: Introduce an 'align' attribute · 2522afb8

由 Dan Williams 提交于 1月 30, 2020

The align attribute applies an alignment constraint for namespace
creation in a region. Whereas the 'align' attribute of a namespace
applied alignment padding via an info block, the 'align' attribute
applies alignment constraints to the free space allocation.

The default for 'align' is the maximum known memremap_compat_align()
across all archs (16MiB from PowerPC at time of writing) multiplied by
the number of interleave ways if there is blk-aliasing. The minimum is
PAGE_SIZE and allows for the creation of cross-arch incompatible
namespaces, just as previous kernels allowed, but the expectation is
cross-arch and mode-independent compatibility by default.

The regression risk with this change is limited to cases that were
dependent on the ability to create unaligned namespaces, *and* for some
reason are unable to opt-out of aligned namespaces by writing to
'regionX/align'. If such a scenario arises the default can be flipped
from opt-out to opt-in of compat-aligned namespace creation, but that is
a last resort. The kernel will otherwise continue to support existing
defined misaligned namespaces.

Unfortunately this change needs to touch several parts of the
implementation at once:

- region/available_size: expand busy extents to current align
- region/max_available_extent: expand busy extents to current align
- namespace/size: trim free space to current align

...to keep the free space accounting conforming to the dynamic align
setting.
Reported-by: NAneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reported-by: NJeff Moyer <jmoyer@redhat.com>
Reviewed-by: NAneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: NJeff Moyer <jmoyer@redhat.com>
Link: https://lore.kernel.org/r/158041478371.3889308.14542630147672668068.stgit@dwillia2-desk3.amr.corp.intel.comSigned-off-by: NDan Williams <dan.j.williams@intel.com>

2522afb8

libnvdimm/region: Introduce NDD_LABELING · a0e37452

由 Dan Williams 提交于 1月 30, 2020

The NDD_ALIASING flag is used to indicate where pmem capacity might
alias with blk capacity and require labeling. It is also used to
indicate whether the DIMM supports labeling. Separate this latter
capability into its own flag so that the NDD_ALIASING flag is scoped to
true aliased configurations.

To my knowledge aliased configurations only exist in the ACPI spec,
there are no known platforms that ship this support in production.

This clarity allows namespace-capacity alignment constraints around
interleave-ways to be relaxed.

Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Oliver O'Halloran <oohall@gmail.com>
Reviewed-by: NJeff Moyer <jmoyer@redhat.com>
Reviewed-by: NAneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Link: https://lore.kernel.org/r/158041477856.3889308.4212605617834097674.stgit@dwillia2-desk3.amr.corp.intel.comSigned-off-by: NDan Williams <dan.j.williams@intel.com>

a0e37452

20 11月, 2019 2 次提交

libnvdimm: Move nvdimm_bus_attribute_group to device_type · e755799a

由 Dan Williams 提交于 11月 12, 2019

A 'struct device_type' instance can carry default attributes for the
device. Use this facility to remove the export of
nvdimm_bus_attribute_group and put the responsibility on the core rather
than leaf implementations to define this attribute.

Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Oliver O'Halloran" <oohall@gmail.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>
Reviewed-by: NAneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Link: https://lore.kernel.org/r/157309903815.1582359.6418211876315050283.stgit@dwillia2-desk3.amr.corp.intel.com

e755799a

libnvdimm: Move nd_numa_attribute_group to device_type · e2f6a0e3

由 Dan Williams 提交于 11月 19, 2019

A 'struct device_type' instance can carry default attributes for the
device. Use this facility to remove the export of
nd_numa_attribute_group and put the responsibility on the core rather
than leaf implementations to define this attribute.

Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Oliver O'Halloran" <oohall@gmail.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: NAneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Link: https://lore.kernel.org/r/157401269537.43284.14411189404186877352.stgit@dwillia2-desk3.amr.corp.intel.comSigned-off-by: NDan Williams <dan.j.williams@intel.com>

e2f6a0e3

18 11月, 2019 2 次提交

libnvdimm: Move nd_device_attribute_group to device_type · adbb6829

由 Dan Williams 提交于 11月 12, 2019

A 'struct device_type' instance can carry default attributes for the
device. Use this facility to remove the export of
nd_device_attribute_group and put the responsibility on the core rather
than leaf implementations to define this attribute.

For regions this creates a new nd_region_attribute_groups[] added to the
per-region device-type instances.

Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Oliver O'Halloran" <oohall@gmail.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: NAneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Link: https://lore.kernel.org/r/157309901138.1582359.12909354140826530394.stgit@dwillia2-desk3.amr.corp.intel.comSigned-off-by: NDan Williams <dan.j.williams@intel.com>

adbb6829

libnvdimm: Move attribute groups to device type · 78c81cc8

由 Dan Williams 提交于 11月 06, 2019

Statically initialize the attribute groups for each libnvdimm
device_type. This is a preparation step for removing unnecessary exports
of attributes that can be included in the device_type by default.

Also take the opportunity to mark 'struct device_type' instances const.

Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>
Reviewed-by: NAneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Link: https://lore.kernel.org/r/157309900111.1582359.2445687530383470348.stgit@dwillia2-desk3.amr.corp.intel.com

78c81cc8

15 11月, 2019 1 次提交

libnvdimm/namespace: Differentiate between probe mapping and runtime mapping · 8f4b01fc

由 Aneesh Kumar K.V 提交于 10月 31, 2019

The nvdimm core currently maps the full namespace to an ioremap range
while probing the namespace mode. This can result in probe failures on
architectures that have limited ioremap space.

For example, with a large btt namespace that consumes most of I/O remap
range, depending on the sequence of namespace initialization, the user
can find a pfn namespace initialization failure due to unavailable I/O
remap space which nvdimm core uses for temporary mapping.

nvdimm core can avoid this failure by only mapping the reserved info
block area to check for pfn superblock type and map the full namespace
resource only before using the namespace.

Given that personalities like BTT can be layered on top of any namespace
type create a generic form of devm_nsio_enable (devm_namespace_enable)
and use it inside the per-personality attach routines. Now
devm_namespace_enable() is always paired with disable unless the mapping
is going to be used for long term runtime access.
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Link: https://lore.kernel.org/r/20191017073308.32645-1-aneesh.kumar@linux.ibm.com
[djbw: reworks to move devm_namespace_{en,dis}able into *attach helpers]
Reported-by: Nkbuild test robot <lkp@intel.com>
Link: https://lore.kernel.org/r/20191031105741.102793-2-aneesh.kumar@linux.ibm.comSigned-off-by: NDan Williams <dan.j.williams@intel.com>

8f4b01fc

25 9月, 2019 1 次提交

libnvdimm/dax: Pick the right alignment default when creating dax devices · f5376699

由 Aneesh Kumar K.V 提交于 9月 05, 2019

Allow arch to provide the supported alignments and use hugepage alignment only
if we support hugepage. Right now we depend on compile time configs whereas this
patch switch this to runtime discovery.

Architectures like ppc64 can have THP enabled in code, but then can have
hugepage size disabled by the hypervisor. This allows us to create dax devices
with PAGE_SIZE alignment in this case.

Existing dax namespace with alignment larger than PAGE_SIZE will fail to
initialize in this specific case. We still allow fsdax namespace initialization.

With respect to identifying whether to enable hugepage fault for a dax device,
if THP is enabled during compile, we default to taking hugepage fault and in dax
fault handler if we find the fault size > alignment we retry with PAGE_SIZE
fault size.

This also addresses the below failure scenario on ppc64

ndctl create-namespace --mode=devdax  | grep align
 "align":16777216,
 "align":16777216

cat /sys/devices/ndbus0/region0/dax0.0/supported_alignments
 65536 16777216

daxio.static-debug  -z -o /dev/dax0.0
  Bus error (core dumped)

  $ dmesg | tail
   lpar: Failed hash pte insert with error -4
   hash-mmu: mm: Hashing failure ! EA=0x7fff17000000 access=0x8000000000000006 current=daxio
   hash-mmu:     trap=0x300 vsid=0x22cb7a3a ssize=1 base psize=2 psize 10 pte=0xc000000501002b86
   daxio[3860]: bus error (7) at 7fff17000000 nip 7fff973c007c lr 7fff973bff34 code 2 in libpmem.so.1.0.0[7fff973b0000+20000]
   daxio[3860]: code: 792945e4 7d494b78 e95f0098 7d494b78 f93f00a0 4800012c e93f0088 f93f0120
   daxio[3860]: code: e93f00a0 f93f0128 e93f0120 e95f0128 <f9490000> e93f0088 39290008 f93f0110

The failure was due to guest kernel using wrong page size.

The namespaces created with 16M alignment will appear as below on a config with
16M page size disabled.

$ ndctl list -Ni
[
  {
    "dev":"namespace0.1",
    "mode":"fsdax",
    "map":"dev",
    "size":5351931904,
    "uuid":"fc6e9667-461a-4718-82b4-69b24570bddb",
    "align":16777216,
    "blockdev":"pmem0.1",
    "supported_alignments":[
      65536
    ]
  },
  {
    "dev":"namespace0.0",
    "mode":"fsdax",    <==== devdax 16M alignment marked disabled.
    "map":"mem",
    "size":5368709120,
    "uuid":"a4bdf81a-f2ee-4bc6-91db-7b87eddd0484",
    "state":"disabled"
  }
]

Cc: linux-mm@kvack.org
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Link: https://lore.kernel.org/r/20190905154603.10349-8-aneesh.kumar@linux.ibm.comSigned-off-by: NDan Williams <dan.j.williams@intel.com>

f5376699

06 9月, 2019 1 次提交

libnvdimm/pfn_dev: Add a build check to make sure we notice when struct page size change · e96f0bf2

由 Aneesh Kumar K.V 提交于 9月 05, 2019

Namespaces created with PFN_MODE_PMEM mode stores struct page in the reserve
block area. We need to make sure we account for the right struct page
size while doing this. Instead of directly depending on sizeof(struct page)
which can change based on different kernel config option, use the max struct
page size (64) while calculating the reserve block area. This makes sure pmem
device can be used across kernels built with different configs.

If the above assumption of max struct page size change, we need to update the
reserve block allocation space for new namespaces created.
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Link: https://lore.kernel.org/r/20190905154603.10349-4-aneesh.kumar@linux.ibm.comSigned-off-by: NDan Williams <dan.j.williams@intel.com>

e96f0bf2

06 7月, 2019 1 次提交

libnvdimm: nd_region flush callback support · c5d4355d

由 Pankaj Gupta 提交于 7月 05, 2019

This patch adds functionality to perform flush from guest
to host over VIRTIO. We are registering a callback based
on 'nd_region' type. virtio_pmem driver requires this special
flush function. For rest of the region types we are registering
existing flush function. Report error returned by host fsync
failure to userspace.
Signed-off-by: NPankaj Gupta <pagupta@redhat.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

c5d4355d

05 6月, 2019 1 次提交

treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 295 · 5b497af4

由 Thomas Gleixner 提交于 5月 29, 2019

Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of version 2 of the gnu general public license as
  published by the free software foundation this program is
  distributed in the hope that it will be useful but without any
  warranty without even the implied warranty of merchantability or
  fitness for a particular purpose see the gnu general public license
  for more details

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 64 file(s).
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Reviewed-by: NAlexios Zavras <alexios.zavras@intel.com>
Reviewed-by: NAllison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190529141901.894819585@linutronix.deSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

5b497af4

01 5月, 2019 1 次提交

libnvdimm/namespace: Fix label tracking error · c4703ce1

由 Dan Williams 提交于 4月 30, 2019

Users have reported intermittent occurrences of DIMM initialization
failures due to duplicate allocations of address capacity detected in
the labels, or errors of the form below, both have the same root cause.

    nd namespace1.4: failed to track label: 0
    WARNING: CPU: 17 PID: 1381 at drivers/nvdimm/label.c:863

    RIP: 0010:__pmem_label_update+0x56c/0x590 [libnvdimm]
    Call Trace:
     ? nd_pmem_namespace_label_update+0xd6/0x160 [libnvdimm]
     nd_pmem_namespace_label_update+0xd6/0x160 [libnvdimm]
     uuid_store+0x17e/0x190 [libnvdimm]
     kernfs_fop_write+0xf0/0x1a0
     vfs_write+0xb7/0x1b0
     ksys_write+0x57/0xd0
     do_syscall_64+0x60/0x210

Unfortunately those reports were typically with a busy parallel
namespace creation / destruction loop making it difficult to see the
components of the bug. However, Jane provided a simple reproducer using
the work-in-progress sub-section implementation.

When ndctl is reconfiguring a namespace it may take an existing defunct
/ disabled namespace and reconfigure it with a new uuid and other
parameters. Critically namespace_update_uuid() takes existing address
resources and renames them for the new namespace to use / reconfigure as
it sees fit. The bug is that this rename only happens in the resource
tracking tree. Existing labels with the old uuid are not reaped leading
to a scenario where multiple active labels reference the same span of
address range.

Teach namespace_update_uuid() to flag any references to the old uuid for
reaping at the next label update attempt.

Cc: <stable@vger.kernel.org>
Fixes: bf9bccc1 ("libnvdimm: pmem label sets and namespace instantiation")
Link: https://github.com/pmem/ndctl/issues/91Reported-by: NJane Chu <jane.chu@oracle.com>
Reported-by: NJeff Moyer <jmoyer@redhat.com>
Reported-by: NErwin Tsaur <erwin.tsaur@oracle.com>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

c4703ce1

22 1月, 2019 1 次提交

libnvdimm/security: Require nvdimm_security_setup_events() to succeed · 1cd73865

由 Dan Williams 提交于 1月 19, 2019

The following warning:

    ACPI0012:00: security event setup failed: -19

...is meant to capture exceptional failures of sysfs_get_dirent(),
however it will also fail in the common case when security support is
disabled. A few issues:

1/ A dev_warn() report for a common case is too chatty
2/ The setup of this notifier is generic, no need for it to be driven
   from the nfit driver, it can exist completely in the core.
3/ If it fails for any reason besides security support being disabled,
   that's fatal and should abort DIMM activation. Userspace may hang if
   it never gets overwrite notifications.
4/ The dirent needs to be released.

Move the call to the core 'dimm' driver, make it conditional on security
support being active, make it fatal for the exceptional case, add the
missing sysfs_put() at device disable time.

Fixes: 7d988097 ("...Add security DSM overwrite support")
Reviewed-by: NDave Jiang <dave.jiang@intel.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

1cd73865

07 1月, 2019 1 次提交

acpi/nfit, device-dax: Identify differentiated memory with a unique numa-node · 8fc5c735

由 Dan Williams 提交于 11月 09, 2018

Persistent memory, as described by the ACPI NFIT (NVDIMM Firmware
Interface Table), is the first known instance of a memory range
described by a unique "target" proximity domain. Where "initiator" and
"target" proximity domains is an approach that the ACPI HMAT
(Heterogeneous Memory Attributes Table) uses to described the unique
performance properties of a memory range relative to a given initiator
(e.g. CPU or DMA device).

Currently the numa-node for a /dev/pmemX block-device or /dev/daxX.Y
char-device follows the traditional notion of 'numa-node' where the
attribute conveys the closest online numa-node. That numa-node attribute
is useful for cpu-binding and memory-binding processes *near* the
device. However, when the memory range backing a 'pmem', or 'dax' device
is onlined (memory hot-add) the memory-only-numa-node representing that
address needs to be differentiated from the set of online nodes. In
other words, the numa-node association of the device depends on whether
you can bind processes *near* the cpu-numa-node in the offline
device-case, or bind process *on* the memory-range directly after the
backing address range is onlined.

Allow for the case that platform firmware describes persistent memory
with a unique proximity domain, i.e. when it is distinct from the
proximity of DRAM and CPUs that are on the same socket. Plumb the Linux
numa-node translation of that proximity through the libnvdimm region
device to namespaces that are in device-dax mode. With this in place the
proposed kmem driver [1] can optionally discover a unique numa-node
number for the address range as it transitions the memory from an
offline state managed by a device-driver to an online memory range
managed by the core-mm.

[1]: https://lore.kernel.org/lkml/20181022201317.8558C1D8@viggo.jf.intel.comReported-by: NFan Du <fan.du@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Oliver O'Halloran" <oohall@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Jérôme Glisse <jglisse@redhat.com>
Reviewed-by: NYang Shi <yang.shi@linux.alibaba.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

8fc5c735

14 12月, 2018 1 次提交

acpi/nfit, libnvdimm: Add unlock of nvdimm support for Intel DIMMs · 4c6926a2

由 Dave Jiang 提交于 12月 06, 2018

Add support to unlock the dimm via the kernel key management APIs. The
passphrase is expected to be pulled from userspace through keyutils.
The key management and sysfs attributes are libnvdimm generic.

Encrypted keys are used to protect the nvdimm passphrase at rest. The
master key can be a trusted-key sealed in a TPM, preferred, or an
encrypted-key, more flexible, but more exposure to a potential attacker.
Signed-off-by: NDave Jiang <dave.jiang@intel.com>
Co-developed-by: NDan Williams <dan.j.williams@intel.com>
Reported-by: NRandy Dunlap <rdunlap@infradead.org>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

4c6926a2

12 10月, 2018 1 次提交

nvdimm: Split label init out from the logic for getting config data · 2d657d17

由 Alexander Duyck 提交于 10月 10, 2018

This patch splits the initialization of the label data into two functions.
One for doing the init, and another for reading the actual configuration
data. The idea behind this is that by doing this we create a symmetry
between the getting and setting of config data in that we have a function
for both. In addition it will make it easier for us to identify the bits
that are related to init versus the pieces that are a wrapper for reading
data from the ACPI interface.

So for example by splitting things out like this it becomes much more
obvious that we were performing checks that weren't necessarily related to
the set/get operations such as relying on ndd->data being present when the
set and get ops should not care about a locally cached copy of the label
area.
Reviewed-by: NToshi Kani <toshi.kani@hpe.com>
Signed-off-by: NAlexander Duyck <alexander.h.duyck@linux.intel.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

2d657d17

18 7月, 2018 1 次提交

block: Add and use op_stat_group() for indexing disk_stat fields. · ddcf35d3

由 Michael Callahan 提交于 7月 18, 2018

Add and use a new op_stat_group() function for indexing partition stat
fields rather than indexing them by rq_data_dir() or bio_data_dir().
This function works similarly to op_is_sync() in that it takes the
request::cmd_flags or bio::bi_opf flags and determines which stats
should et updated.

In addition, the second parameter to generic_start_io_acct() and
generic_end_io_acct() is now a REQ_OP rather than simply a read or
write bit and it uses op_stat_group() on the parameter to determine
the stat group.

Note that the partition in_flight counts are not part of the per-cpu
statistics and as such are not indexed via this function.  It's now
indexed by op_is_write().

tj: Refreshed on top of v4.17.  Updated to pass around REQ_OP.
Signed-off-by: NMichael Callahan <michaelcallahan@fb.com>
Signed-off-by: NTejun Heo <tj@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Joshua Morris <josh.h.morris@us.ibm.com>
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Cc: Matias Bjorling <mb@lightnvm.io>
Cc: Kent Overstreet <kent.overstreet@gmail.com>
Cc: Alasdair Kergon <agk@redhat.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

ddcf35d3

15 7月, 2018 1 次提交

libnvdimm: Introduce locked DIMM capacity support · 08e6b3c6

由 Dan Williams 提交于 6月 13, 2018

When a DIMM is locked its namespace label area may not be. Introduce the
distinction of locked namespaces to allow namespace enumeration while
the capacity is locked.
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

08e6b3c6

04 4月, 2018 1 次提交

libnvdimm: add an api to cast a 'struct nd_region' to its 'struct device' · 243f29fe

由 Dan Williams 提交于 4月 02, 2018

For debug, it is useful for bus providers to be able to retrieve the
'struct device' associated with an nd_region instance that it
registered. We already have to_nd_region() to perform the reverse cast
operation, in fact its duplicate declaration can be removed from the
private drivers/nvdimm/nd.h header.
Reviewed-by: NDave Jiang <dave.jiang@intel.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

243f29fe

18 3月, 2018 1 次提交

block: Move SECTOR_SIZE and SECTOR_SHIFT definitions into <linux/blkdev.h> · 233bde21

由 Bart Van Assche 提交于 3月 14, 2018

It happens often while I'm preparing a patch for a block driver that
I'm wondering: is a definition of SECTOR_SIZE and/or SECTOR_SHIFT
available for this driver? Do I have to introduce definitions of these
constants before I can use these constants? To avoid this confusion,
move the existing definitions of SECTOR_SIZE and SECTOR_SHIFT into the
<linux/blkdev.h> header file such that these become available for all
block drivers. Make the SECTOR_SIZE definition in the uapi msdos_fs.h
header file conditional to avoid that including that header file after
<linux/blkdev.h> causes the compiler to complain about a SECTOR_SIZE
redefinition.

Note: the SECTOR_SIZE / SECTOR_SHIFT / SECTOR_BITS definitions have
not been removed from uapi header files nor from NAND drivers in
which these constants are used for another purpose than converting
block layer offsets and sizes into a number of sectors.

Cc: David S. Miller <davem@davemloft.net>
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Reviewed-by: NSergey Senozhatsky <sergey.senozhatsky@gmail.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: NBart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

233bde21

09 1月, 2018 1 次提交

memremap: change devm_memremap_pages interface to use struct dev_pagemap · e8d51348

由 Christoph Hellwig 提交于 12月 29, 2017

This new interface is similar to how struct device (and many others)
work. The caller initializes a 'struct dev_pagemap' as required
and calls 'devm_memremap_pages'. This allows the pagemap structure to
be embedded in another structure and thus container_of can be used. In
this way application specific members can be stored in a containing
struct.

This will be used by the P2P infrastructure and HMM could probably
be cleaned up to use it as well (instead of having it's own, similar
'hmm_devmem_pages_create' function).
Signed-off-by: NLogan Gunthorpe <logang@deltatee.com>
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

e8d51348

03 11月, 2017 1 次提交

libnvdimm: move poison list functions to a new 'badrange' file · aa9ad44a

由 Dave Jiang 提交于 8月 23, 2017

nfit_test needs to use the poison list manipulation code as well. Make
it more generic and in the process rename poison to badrange, and move
all the related helpers to a new file.
Signed-off-by: NDave Jiang <dave.jiang@intel.com>
[vishal: Add badrange.o to nfit_test's Kbuild]
[vishal: add a missed include in bus.c for the new badrange functions]
[vishal: rename all instances of 'be' to 'bre']
Signed-off-by: NVishal Verma <vishal.l.verma@intel.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

aa9ad44a

29 9月, 2017 1 次提交

libnvdimm, dimm: clear 'locked' status on successful DIMM enable · d34cb808

由 Dan Williams 提交于 9月 25, 2017

If we successfully enable a DIMM then it must not be locked and we can
clear the label-read failure condition. Otherwise, we need to reload the
entire bus provider driver to achieve the same effect, and that can
disrupt unrelated DIMMs and namespaces.

Fixes: 9d62ed96 ("libnvdimm: handle locked label storage areas")
Cc: <stable@vger.kernel.org>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

d34cb808

30 8月, 2017 1 次提交

libnvdimm, label: fix index block size calculation · 02881768

由 Dan Williams 提交于 8月 29, 2017

The old calculation assumed that the label space was 128k and the label
size is 128. With v1.2 labels where the label size is 256 this
calculation will return zero. We are saved by the fact that the
nsindex_size is always pre-initialized from a previous 128 byte
assumption and we are lucky that the index sizes turn out the same.

Fix this going forward in case we start encountering different
geometries of label areas besides 128k.

Since the label size can change from one call to the next, drop the
caching of nsindex_size.
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

02881768

24 8月, 2017 1 次提交

block: replace bi_bdev with a gendisk pointer and partitions index · 74d46992

由 Christoph Hellwig 提交于 8月 23, 2017

This way we don't need a block_device structure to submit I/O.  The
block_device has different life time rules from the gendisk and
request_queue and is usually only available when the block device node
is open.  Other callers need to explicitly create one (e.g. the lightnvm
passthrough code, or the new nvme multipathing code).

For the actual I/O path all that we need is the gendisk, which exists
once per block device.  But given that the block layer also does
partition remapping we additionally need a partition index, which is
used for said remapping in generic_make_request.

Note that all the block drivers generally want request_queue or
sometimes the gendisk, so this removes a layer of indirection all
over the stack.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

74d46992

12 8月, 2017 1 次提交

libnvdimm: rename nd_sector_size_{show,store} to nd_size_select_{show,store} · b2c48f9f

由 Dan Williams 提交于 8月 11, 2017

Prepare for other another consumer of this size selection scheme that is
not a 'sector size'.

Cc: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

b2c48f9f

10 8月, 2017 1 次提交

block: pass in queue to inflight accounting · d62e26b3

由 Jens Axboe 提交于 6月 30, 2017

No functional change in this patch, just in preparation for
basing the inflight mechanism on the queue in question.
Reviewed-by: NBart Van Assche <bart.vanassche@wdc.com>
Reviewed-by: NOmar Sandoval <osandov@fb.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

d62e26b3

05 8月, 2017 1 次提交

nfit, libnvdimm, region: export 'position' in mapping info · 401c0a19

由 Dan Williams 提交于 8月 04, 2017

It is useful to be able to know the position of a DIMM in an
interleave-set. Consider the case where the order of the DIMMs changes
causing a namespace to be invalidated because the interleave-set cookie no
longer matches. If the before and after state of each DIMM position is
known this state debugged by the system owner.
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

401c0a19

26 7月, 2017 1 次提交

libnvdimm: Stop using HPAGE_SIZE · 0dd69643

由 Oliver O'Halloran 提交于 6月 27, 2017

Currently libnvdimm uses HPAGE_SIZE as the default alignment for DAX and
PFN devices. HPAGE_SIZE is the default hugetlbfs page size and when
hugetlbfs is disabled it defaults to PAGE_SIZE. Given DAX has more
in common with THP than hugetlbfs we should proably be using
HPAGE_PMD_SIZE, but this is undefined when THP is disabled so lets just
give it a new name.

The other usage of HPAGE_SIZE in libnvdimm is when determining how large
the altmap should be. For the reasons mentioned above it doesn't really
make sense to use HPAGE_SIZE here either. PMD_SIZE seems to be safe to
use in generic code and it happens to match the vmemmap allocation block
on x86 and Power. It's still a hack, but it's a slightly nicer hack.
Signed-off-by: NOliver O'Halloran <oohall@gmail.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

0dd69643

30 6月, 2017 1 次提交

libnvdimm, btt: BTT updates for UEFI 2.7 format · 14e49454

由 Vishal Verma 提交于 6月 28, 2017

The UEFI 2.7 specification defines an updated BTT metadata format,
bumping the revision to 2.0. Add support for the new format, while
retaining compatibility for the old 1.1 format.

Cc: Toshi Kani <toshi.kani@hpe.com>
Cc: Linda Knippers <linda.knippers@hpe.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NVishal Verma <vishal.l.verma@intel.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

14e49454

16 6月, 2017 5 次提交

libnvdimm, pmem: Add sysfs notifications to badblocks · 975750a9

由 Toshi Kani 提交于 6月 12, 2017

Sysfs "badblocks" information may be updated during run-time that:
 - MCE, SCI, and sysfs "scrub" may add new bad blocks
 - Writes and ioctl() may clear bad blocks

Add support to send sysfs notifications to sysfs "badblocks" file
under region and pmem directories when their badblocks information
is re-evaluated (but is not necessarily changed) during run-time.
Signed-off-by: NToshi Kani <toshi.kani@hpe.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Linda Knippers <linda.knippers@hpe.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

975750a9

libnvdimm, label: add address abstraction identifiers · b3fde74e

由 Dan Williams 提交于 6月 04, 2017

Starting with v1.2 labels, 'address abstractions' can be hinted via an
address abstraction id that implies an info-block format. The standard
address abstraction in the specification is the v2 format of the
Block-Translation-Table (BTT). Support for that is saved for a later
patch, for now we add support for the Linux supported address
abstractions BTT (v1), PFN, and DAX.

The new 'holder_class' attribute for namespace devices is added for
tooling to specify the 'abstraction_guid' to store in the namespace label.
For v1.1 labels this field is undefined and any setting of
'holder_class' away from the default 'none' value will only have effect
until the driver is unloaded. Setting 'holder_class' requires that
whatever device tries to claim the namespace must be of the specified
class.

Cc: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

b3fde74e

libnvdimm, label: honor the lba size specified in v1.2 labels · f979b13c

由 Dan Williams 提交于 6月 04, 2017

Previously we only honored the lba size for blk-aperture mode
namespaces. For pmem namespaces the lba size was just assumed to be 512.
With the new v1.2 label definition and compatibility with other
operating environments, the ->lbasize property is now respected for pmem
namespaces.

Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

f979b13c

libnvdimm, label: add v1.2 interleave-set-cookie algorithm · c12c48ce

由 Dan Williams 提交于 6月 04, 2017

The interleave-set-cookie algorithm is extended to incorporate all the
same components that are used to generate an nvdimm unique-id. For
backwards compatibility we still maintain the old v1.1 definition.
Reported-by: NNicholas Moulin <nicholas.w.moulin@intel.com>
Reported-by: NKaushik Kanetkar <kaushik.a.kanetkar@intel.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

c12c48ce

libnvdimm, label: add v1.2 nvdimm label definitions · 564e871a

由 Dan Williams 提交于 6月 03, 2017

In support of improved interoperability between operating systems and pre-boot
environments the Intel proposed NVDIMM Namespace Specification [1], has been
adopted and modified to the the UEFI 2.7 NVDIMM Label Protocol [2].

Update the definitions of the namespace label data structures so that the new
format can be supported alongside the existing label format.

The new specification changes the default label size to 256 bytes, so
everywhere that relied on sizeof(struct nd_namespace_label) must now use the
sizeof_namespace_label() helper.

There should be no functional differences from these changes as the
default is still the v1.1 128-byte format. Future patches will move the
default to the v1.2 definition.

[1]: http://pmem.io/documents/NVDIMM_Namespace_Spec.pdf
[2]: http://www.uefi.org/sites/default/files/resources/UEFI_Spec_2_7.pdfSigned-off-by: NDan Williams <dan.j.williams@intel.com>

564e871a

11 5月, 2017 1 次提交

libnvdimm: add an atomic vs process context flag to rw_bytes · 3ae3d67b

由 Vishal Verma 提交于 5月 10, 2017

nsio_rw_bytes can clear media errors, but this cannot be done while we
are in an atomic context due to locking within ACPI. From the BTT,
->rw_bytes may be called either from atomic or process context depending
on whether the calls happen during initialization or during IO.

During init, we want to ensure error clearing happens, and the flag
marking process context allows nsio_rw_bytes to do that. When called
during IO, we're in atomic context, and error clearing can be skipped.

Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NVishal Verma <vishal.l.verma@intel.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

3ae3d67b

05 5月, 2017 1 次提交

libnvdimm: convert NDD_ flags to use bitops, introduce NDD_LOCKED · 8f078b38

由 Dan Williams 提交于 5月 04, 2017

This is a preparation patch for handling locked nvdimm label regions, a
new concept as introduced by the latest DSM document on pmem.io [1]. A
future patch will leverage nvdimm_set_locked() at DIMM probe time to
flag regions that can not be enabled. There should be no functional
difference resulting from this change.

[1]: http://pmem.io/documents/NVDIMM_DSM_Interface_Example-V1.3.pdfSigned-off-by: NDan Williams <dan.j.williams@intel.com>

8f078b38

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功