1. 25 6月, 2015 6 次提交
    • D
      libnvdimm, nfit: regions (block-data-window, persistent memory, volatile memory) · 1f7df6f8
      Dan Williams 提交于
      A "region" device represents the maximum capacity of a BLK range (mmio
      block-data-window(s)), or a PMEM range (DAX-capable persistent memory or
      volatile memory), without regard for aliasing.  Aliasing, in the
      dimm-local address space (DPA), is resolved by metadata on a dimm to
      designate which exclusive interface will access the aliased DPA ranges.
      Support for the per-dimm metadata/label arrvies is in a subsequent
      patch.
      
      The name format of "region" devices is "regionN" where, like dimms, N is
      a global ida index assigned at discovery time.  This id is not reliable
      across reboots nor in the presence of hotplug.  Look to attributes of
      the region or static id-data of the sub-namespace to generate a
      persistent name.  However, if the platform configuration does not change
      it is reasonable to expect the same region id to be assigned at the next
      boot.
      
      "region"s have 2 generic attributes "size", and "mapping"s where:
      - size: the BLK accessible capacity or the span of the
        system physical address range in the case of PMEM.
      
      - mappingN: a tuple describing a dimm's contribution to the region's
        capacity in the format (<nmemX>,<dpa>,<size>).  For a PMEM-region
        there will be at least one mapping per dimm in the interleave set.  For
        a BLK-region there is only "mapping0" listing the starting DPA of the
        BLK-region and the available DPA capacity of that space (matches "size"
        above).
      
      The max number of mappings per "region" is hard coded per the
      constraints of sysfs attribute groups.  That said the number of mappings
      per region should never exceed the maximum number of possible dimms in
      the system.  If the current number turns out to not be enough then the
      "mappings" attribute clarifies how many there are supposed to be. "32
      should be enough for anybody...".
      
      Cc: Neil Brown <neilb@suse.de>
      Cc: <linux-acpi@vger.kernel.org>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: Robert Moore <robert.moore@intel.com>
      Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Acked-by: NChristoph Hellwig <hch@lst.de>
      Acked-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Tested-by: NToshi Kani <toshi.kani@hp.com>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      1f7df6f8
    • D
      libnvdimm, nvdimm: dimm driver and base libnvdimm device-driver infrastructure · 4d88a97a
      Dan Williams 提交于
      * Implement the device-model infrastructure for loading modules and
        attaching drivers to nvdimm devices.  This is a simple association of a
        nd-device-type number with a driver that has a bitmask of supported
        device types.  To facilitate userspace bind/unbind operations 'modalias'
        and 'devtype', that also appear in the uevent, are added as generic
        sysfs attributes for all nvdimm devices.  The reason for the device-type
        number is to support sub-types within a given parent devtype, be it a
        vendor-specific sub-type or otherwise.
      
      * The first consumer of this infrastructure is the driver
        for dimm devices.  It simply uses control messages to retrieve and
        store the configuration-data image (label set) from each dimm.
      
      Note: nd_device_register() arranges for asynchronous registration of
            nvdimm bus devices by default.
      
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: Neil Brown <neilb@suse.de>
      Acked-by: NChristoph Hellwig <hch@lst.de>
      Tested-by: NToshi Kani <toshi.kani@hp.com>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      4d88a97a
    • D
      libnvdimm: control (ioctl) messages for nvdimm_bus and nvdimm devices · 62232e45
      Dan Williams 提交于
      Most discovery/configuration of the nvdimm-subsystem is done via sysfs
      attributes.  However, some nvdimm_bus instances, particularly the
      ACPI.NFIT bus, define a small set of messages that can be passed to the
      platform.  For convenience we derive the initial libnvdimm-ioctl command
      formats directly from the NFIT DSM Interface Example formats.
      
          ND_CMD_SMART: media health and diagnostics
          ND_CMD_GET_CONFIG_SIZE: size of the label space
          ND_CMD_GET_CONFIG_DATA: read label space
          ND_CMD_SET_CONFIG_DATA: write label space
          ND_CMD_VENDOR: vendor-specific command passthrough
          ND_CMD_ARS_CAP: report address-range-scrubbing capabilities
          ND_CMD_ARS_START: initiate scrubbing
          ND_CMD_ARS_STATUS: report on scrubbing state
          ND_CMD_SMART_THRESHOLD: configure alarm thresholds for smart events
      
      If a platform later defines different commands than this set it is
      straightforward to extend support to those formats.
      
      Most of the commands target a specific dimm.  However, the
      address-range-scrubbing commands target the bus.  The 'commands'
      attribute in sysfs of an nvdimm_bus, or nvdimm, enumerate the supported
      commands for that object.
      
      Cc: <linux-acpi@vger.kernel.org>
      Cc: Robert Moore <robert.moore@intel.com>
      Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Reported-by: NNicholas Moulin <nicholas.w.moulin@linux.intel.com>
      Acked-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      62232e45
    • D
      libnvdimm, nfit: dimm/memory-devices · e6dfb2de
      Dan Williams 提交于
      Enable nvdimm devices to be registered on a nvdimm_bus.  The kernel
      assigned device id for nvdimm devicesis dynamic.  If userspace needs a
      more static identifier it should consult a provider-specific attribute.
      In the case where NFIT is the provider, the 'nmemX/nfit/handle' or
      'nmemX/nfit/serial' attributes may be used for this purpose.
      
      Cc: Neil Brown <neilb@suse.de>
      Cc: <linux-acpi@vger.kernel.org>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: Robert Moore <robert.moore@intel.com>
      Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Acked-by: NChristoph Hellwig <hch@lst.de>
      Acked-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Tested-by: NToshi Kani <toshi.kani@hp.com>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      e6dfb2de
    • D
      libnvdimm: control character device and nvdimm_bus sysfs attributes · 45def22c
      Dan Williams 提交于
      The control device for a nvdimm_bus is registered as an "nd" class
      device.  The expectation is that there will usually only be one "nd" bus
      registered under /sys/class/nd.  However, we allow for the possibility
      of multiple buses and they will listed in discovery order as
      ndctl0...ndctlN.  This character device hosts the ioctl for passing
      control messages.  The initial command set has a 1:1 correlation with
      the commands listed in the by the "NFIT DSM Example" document [1], but
      this scheme is extensible to future command sets.
      
      Note, nd_ioctl() and the backing ->ndctl() implementation are defined in
      a subsequent patch.  This is simply the initial registrations and sysfs
      attributes.
      
      [1]: http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf
      
      Cc: Neil Brown <neilb@suse.de>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: <linux-acpi@vger.kernel.org>
      Cc: Robert Moore <robert.moore@intel.com>
      Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Acked-by: NChristoph Hellwig <hch@lst.de>
      Acked-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Tested-by: NToshi Kani <toshi.kani@hp.com>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      45def22c
    • D
      libnvdimm, nfit: initial libnvdimm infrastructure and NFIT support · b94d5230
      Dan Williams 提交于
      A struct nvdimm_bus is the anchor device for registering nvdimm
      resources and interfaces, for example, a character control device,
      nvdimm devices, and I/O region devices.  The ACPI NFIT (NVDIMM Firmware
      Interface Table) is one possible platform description for such
      non-volatile memory resources in a system.  The nfit.ko driver attaches
      to the "ACPI0012" device that indicates the presence of the NFIT and
      parses the table to register a struct nvdimm_bus instance.
      
      Cc: <linux-acpi@vger.kernel.org>
      Cc: Lv Zheng <lv.zheng@intel.com>
      Cc: Robert Moore <robert.moore@intel.com>
      Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Acked-by: NJeff Moyer <jmoyer@redhat.com>
      Acked-by: NChristoph Hellwig <hch@lst.de>
      Acked-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Tested-by: NToshi Kani <toshi.kani@hp.com>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      b94d5230
  2. 28 5月, 2015 1 次提交
    • D
      e820, efi: add ACPI 6.0 persistent memory types · ad5fb870
      Dan Williams 提交于
      ACPI 6.0 formalizes e820-type-7 and efi-type-14 as persistent memory.
      Mark it "reserved" and allow it to be claimed by a persistent memory
      device driver.
      
      This definition is in addition to the Linux kernel's existing type-12
      definition that was recently added in support of shipping platforms with
      NVDIMM support that predate ACPI 6.0 (which now classifies type-12 as
      OEM reserved).
      
      Note, /proc/iomem can be consulted for differentiating legacy
      "Persistent Memory (legacy)" E820_PRAM vs standard "Persistent Memory"
      E820_PMEM.
      
      Cc: Boaz Harrosh <boaz@plexistor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jens Axboe <axboe@fb.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Matthew Wilcox <willy@linux.intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: NJeff Moyer <jmoyer@redhat.com>
      Acked-by: NAndy Lutomirski <luto@amacapital.net>
      Reviewed-by: NRoss Zwisler <ross.zwisler@linux.intel.com>
      Acked-by: NChristoph Hellwig <hch@lst.de>
      Tested-by: NToshi Kani <toshi.kani@hp.com>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      ad5fb870
  3. 26 5月, 2015 2 次提交
  4. 22 5月, 2015 12 次提交
  5. 19 5月, 2015 1 次提交
  6. 07 5月, 2015 1 次提交
  7. 06 5月, 2015 4 次提交
  8. 05 5月, 2015 2 次提交
    • T
      RDMA/core: Enable the iWarp Port Mapper to provide the actual address of the... · 6eec1774
      Tatyana Nikolova 提交于
      RDMA/core: Enable the iWarp Port Mapper to provide the actual address of the connecting peer to its clients
      
      Add functionality to enable the port mapper on the passive side to provide to its
      clients the actual (non-mapped) ip/tcp address information of the connecting peer
      
      1) Adding remote_info_cb() to process the address info of the connecting peer
         The address info is provided by the user space port mapper service when
         the connection is initiated by the peer
      2) Adding a hash list to store the remote address info
      3) Adding functionality to add/remove the remote address info
         After the info has been provided to the port mapper client,
         it is removed from the hash list
      Signed-off-by: NTatyana Nikolova <tatyana.e.nikolova@intel.com>
      Reviewed-by: NSteve Wise <swise@opengridcomputing.com>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      6eec1774
    • S
      blk-mq: fix FUA request hang · b2387ddc
      Shaohua Li 提交于
      When a FUA request enters its DATA stage of flush pipeline, the
      request is added to mq requeue list, the request will then be added to
      ctx->rq_list. blk_mq_attempt_merge() might merge the request with a bio.
      Later when the request is finished the flush pipeline, the
      request->__data_len is 0. Then I only saw the bio gets endio called, the
      original request never finish.
      
      Adding REQ_FLUSH_SEQ into REQ_NOMERGE_FLAGS looks an easy fix.
      
      stable: 3.15+
      Signed-off-by: NShaohua Li <shli@fb.com>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      b2387ddc
  9. 04 5月, 2015 1 次提交
    • D
      lib: make memzero_explicit more robust against dead store elimination · 7829fb09
      Daniel Borkmann 提交于
      In commit 0b053c95 ("lib: memzero_explicit: use barrier instead
      of OPTIMIZER_HIDE_VAR"), we made memzero_explicit() more robust in
      case LTO would decide to inline memzero_explicit() and eventually
      find out it could be elimiated as dead store.
      
      While using barrier() works well for the case of gcc, recent efforts
      from LLVMLinux people suggest to use llvm as an alternative to gcc,
      and there, Stephan found in a simple stand-alone user space example
      that llvm could nevertheless optimize and thus elimitate the memset().
      A similar issue has been observed in the referenced llvm bug report,
      which is regarded as not-a-bug.
      
      Based on some experiments, icc is a bit special on its own, while it
      doesn't seem to eliminate the memset(), it could do so with an own
      implementation, and then result in similar findings as with llvm.
      
      The fix in this patch now works for all three compilers (also tested
      with more aggressive optimization levels). Arguably, in the current
      kernel tree it's more of a theoretical issue, but imho, it's better
      to be pedantic about it.
      
      It's clearly visible with gcc/llvm though, with the below code: if we
      would have used barrier() only here, llvm would have omitted clearing,
      not so with barrier_data() variant:
      
        static inline void memzero_explicit(void *s, size_t count)
        {
          memset(s, 0, count);
          barrier_data(s);
        }
      
        int main(void)
        {
          char buff[20];
          memzero_explicit(buff, sizeof(buff));
          return 0;
        }
      
        $ gcc -O2 test.c
        $ gdb a.out
        (gdb) disassemble main
        Dump of assembler code for function main:
         0x0000000000400400  <+0>: lea   -0x28(%rsp),%rax
         0x0000000000400405  <+5>: movq  $0x0,-0x28(%rsp)
         0x000000000040040e <+14>: movq  $0x0,-0x20(%rsp)
         0x0000000000400417 <+23>: movl  $0x0,-0x18(%rsp)
         0x000000000040041f <+31>: xor   %eax,%eax
         0x0000000000400421 <+33>: retq
        End of assembler dump.
      
        $ clang -O2 test.c
        $ gdb a.out
        (gdb) disassemble main
        Dump of assembler code for function main:
         0x00000000004004f0  <+0>: xorps  %xmm0,%xmm0
         0x00000000004004f3  <+3>: movaps %xmm0,-0x18(%rsp)
         0x00000000004004f8  <+8>: movl   $0x0,-0x8(%rsp)
         0x0000000000400500 <+16>: lea    -0x18(%rsp),%rax
         0x0000000000400505 <+21>: xor    %eax,%eax
         0x0000000000400507 <+23>: retq
        End of assembler dump.
      
      As gcc, clang, but also icc defines __GNUC__, it's sufficient to define
      this in compiler-gcc.h only to be picked up. For a fallback or otherwise
      unsupported compiler, we define it as a barrier. Similarly, for ecc which
      does not support gcc inline asm.
      
      Reference: https://llvm.org/bugs/show_bug.cgi?id=15495Reported-by: NStephan Mueller <smueller@chronox.de>
      Tested-by: NStephan Mueller <smueller@chronox.de>
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: Stephan Mueller <smueller@chronox.de>
      Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Cc: mancha security <mancha1@zoho.com>
      Cc: Mark Charlebois <charlebm@gmail.com>
      Cc: Behan Webster <behanw@converseincode.com>
      Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
      7829fb09
  10. 02 5月, 2015 1 次提交
  11. 30 4月, 2015 2 次提交
    • N
      bridge/nl: remove wrong use of NLM_F_MULTI · 46c264da
      Nicolas Dichtel 提交于
      NLM_F_MULTI must be used only when a NLMSG_DONE message is sent. In fact,
      it is sent only at the end of a dump.
      
      Libraries like libnl will wait forever for NLMSG_DONE.
      
      Fixes: e5a55a89 ("net: create generic bridge ops")
      Fixes: 815cccbf ("ixgbe: add setlink, getlink support to ixgbe and ixgbevf")
      CC: John Fastabend <john.r.fastabend@intel.com>
      CC: Sathya Perla <sathya.perla@emulex.com>
      CC: Subbu Seetharaman <subbu.seetharaman@emulex.com>
      CC: Ajit Khaparde <ajit.khaparde@emulex.com>
      CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      CC: intel-wired-lan@lists.osuosl.org
      CC: Jiri Pirko <jiri@resnulli.us>
      CC: Scott Feldman <sfeldma@gmail.com>
      CC: Stephen Hemminger <stephen@networkplumber.org>
      CC: bridge@lists.linux-foundation.org
      Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      46c264da
    • B
      xen: Suspend ticks on all CPUs during suspend · 2b953a5e
      Boris Ostrovsky 提交于
      Commit 77e32c89 ("clockevents: Manage device's state separately for
      the core") decouples clockevent device's modes from states. With this
      change when a Xen guest tries to resume, it won't be calling its
      set_mode op which needs to be done on each VCPU in order to make the
      hypervisor aware that we are in oneshot mode.
      
      This happens because clockevents_tick_resume() (which is an intermediate
      step of resuming ticks on a processor) doesn't call clockevents_set_state()
      anymore and because during suspend clockevent devices on all VCPUs (except
      for the one doing the suspend) are left in ONESHOT state. As result, during
      resume the clockevents state machine will assume that device is already
      where it should be and doesn't need to be updated.
      
      To avoid this problem we should suspend ticks on all VCPUs during
      suspend.
      Signed-off-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
      2b953a5e
  12. 29 4月, 2015 2 次提交
  13. 28 4月, 2015 4 次提交
    • R
      ASoC: Update email-id of Rajeev Kumar · 9d7dd6cd
      Rajeev Kumar 提交于
      rajeev-dlh.kumar@st.com email-id doesn't exist anymore as I have left the
      company.  Replace ST's id with Rajeev Kumar <rajeevkumar.linux@gmail.com>
      Signed-off-by: NRajeev Kumar <rajeevkumar.linux@gmail.com>
      Signed-off-by: NMark Brown <broonie@kernel.org>
      9d7dd6cd
    • F
      tty: Re-add external interface for tty_set_termios() · b00f5c2d
      Frederic Danis 提交于
      This is needed by Bluetooth hci_uart module to be able to change speed
      of Bluetooth controller and local UART.
      Signed-off-by: NFrederic Danis <frederic.danis@linux.intel.com>
      Reviewed-by: NPeter Hurley <peter@hurleysoftware.com>
      Cc: Marcel Holtmann <marcel@holtmann.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b00f5c2d
    • H
      uas: Add US_FL_MAX_SECTORS_240 flag · ee136af4
      Hans de Goede 提交于
      The usb-storage driver sets max_sectors = 240 in its scsi-host template,
      for uas we do not want to do that for all devices, but testing has shown
      that some devices need it.
      
      This commit adds a US_FL_MAX_SECTORS_240 flag for such devices, and
      implements support for it in uas.c, while at it it also adds support
      for US_FL_MAX_SECTORS_64 to uas.c.
      
      Cc: stable@vger.kernel.org # 3.16
      Signed-off-by: NHans de Goede <hdegoede@redhat.com>
      Acked-by: NAlan Stern <stern@rowland.harvard.edu>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ee136af4
    • M
      SCSI: add 1024 max sectors black list flag · 35e9a9f9
      Mike Christie 提交于
      This works around a issue with qnap iscsi targets not handling large IOs
      very well.
      
      The target returns:
      
      VPD INQUIRY: Block limits page (SBC)
        Maximum compare and write length: 1 blocks
        Optimal transfer length granularity: 1 blocks
        Maximum transfer length: 4294967295 blocks
        Optimal transfer length: 4294967295 blocks
        Maximum prefetch, xdread, xdwrite transfer length: 0 blocks
        Maximum unmap LBA count: 8388607
        Maximum unmap block descriptor count: 1
        Optimal unmap granularity: 16383
        Unmap granularity alignment valid: 0
        Unmap granularity alignment: 0
        Maximum write same length: 0xffffffff blocks
        Maximum atomic transfer length: 0
        Atomic alignment: 0
        Atomic transfer length granularity: 0
      
      and it is *sometimes* able to handle at least one IO of size up to 8 MB. We
      have seen in traces where it will sometimes work, but other times it
      looks like it fails and it looks like it returns failures if we send
      multiple large IOs sometimes. Also it looks like it can return 2 different
      errors. It will sometimes send iscsi reject errors indicating out of
      resources or it will send invalid cdb illegal requests check conditions.
      And then when it sends iscsi rejects it does not seem to handle retries
      when there are command sequence holes, so I could not just add code to
      try and gracefully handle that error code.
      
      The problem is that we do not have a good contact for the company,
      so we are not able to determine under what conditions it returns
      which error and why it sometimes works.
      
      So, this patch just adds a new black list flag to set targets like this to
      the old max safe sectors of 1024. The max_hw_sectors changes added in 3.19
      caused this regression, so I also ccing stable.
      Reported-by: NChristian Hesse <list@eworm.de>
      Signed-off-by: NMike Christie <michaelc@cs.wisc.edu>
      Cc: stable@vger.kernel.org
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NJames Bottomley <JBottomley@Odin.com>
      35e9a9f9
  14. 27 4月, 2015 1 次提交