1. 25 Jun, 2015 1 commit
    • libnvdimm: control (ioctl) messages for nvdimm_bus and nvdimm devices · 62232e45
      Committed by Dan Williams
      Most discovery/configuration of the nvdimm-subsystem is done via sysfs
      attributes.  However, some nvdimm_bus instances, particularly the
      ACPI.NFIT bus, define a small set of messages that can be passed to the
      platform.  For convenience we derive the initial libnvdimm-ioctl command
      formats directly from the NFIT DSM Interface Example formats.
      
          ND_CMD_SMART: media health and diagnostics
          ND_CMD_GET_CONFIG_SIZE: size of the label space
          ND_CMD_GET_CONFIG_DATA: read label space
          ND_CMD_SET_CONFIG_DATA: write label space
          ND_CMD_VENDOR: vendor-specific command passthrough
          ND_CMD_ARS_CAP: report address-range-scrubbing capabilities
          ND_CMD_ARS_START: initiate scrubbing
          ND_CMD_ARS_STATUS: report on scrubbing state
          ND_CMD_SMART_THRESHOLD: configure alarm thresholds for smart events
      
      If a platform later defines different commands than this set it is
      straightforward to extend support to those formats.
      
      Most of the commands target a specific dimm.  However, the
      address-range-scrubbing commands target the bus.  The 'commands'
      attribute in sysfs of an nvdimm_bus, or nvdimm, enumerates the supported
      commands for that object.
      
      Cc: <linux-acpi@vger.kernel.org>
      Cc: Robert Moore <robert.moore@intel.com>
      Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Reported-by: Nicholas Moulin <nicholas.w.moulin@linux.intel.com>
      Acked-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
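
      As a rough illustration of the bus/dimm split described in this commit, the sketch below models a 'commands' attribute as a bitmask and checks a command against it. The `ex_` names, the numbering of the enum, and the mask layout are hypothetical and chosen only for this example; they are not the kernel's definitions.

      ```c
      #include <stdbool.h>
      #include <stdint.h>

      /* Hypothetical command IDs for illustration only; not the kernel's values. */
      enum ex_cmd {
      	EX_CMD_SMART,
      	EX_CMD_SMART_THRESHOLD,
      	EX_CMD_GET_CONFIG_SIZE,
      	EX_CMD_GET_CONFIG_DATA,
      	EX_CMD_SET_CONFIG_DATA,
      	EX_CMD_VENDOR,
      	EX_CMD_ARS_CAP,
      	EX_CMD_ARS_START,
      	EX_CMD_ARS_STATUS,
      };

      /* The address-range-scrubbing commands target the bus; the rest target a dimm. */
      static bool ex_cmd_targets_bus(enum ex_cmd cmd)
      {
      	return cmd == EX_CMD_ARS_CAP || cmd == EX_CMD_ARS_START ||
      	       cmd == EX_CMD_ARS_STATUS;
      }

      /* Assumed model of a per-object mask: bit N set means command N is supported. */
      static bool ex_cmd_supported(uint32_t mask, enum ex_cmd cmd)
      {
      	return (mask >> cmd) & 1u;
      }
      ```

      A caller would first consult the mask of the object it is talking to (bus or dimm) and only then issue the command.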
  2. 05 May, 2015 1 commit
  3. 02 May, 2015 1 commit
  4. 24 Apr, 2015 1 commit
  5. 22 Apr, 2015 2 commits
  6. 21 Apr, 2015 1 commit
    • KVM: PPC: Book3S HV: Add fast real-mode H_RANDOM implementation. · e928e9cb
      Committed by Michael Ellerman
      Some PowerNV systems include a hardware random-number generator.
      This HWRNG is present on POWER7+ and POWER8 chips and is capable of
      generating one 64-bit random number every microsecond.  The random
      numbers are produced by sampling a set of 64 unstable high-frequency
      oscillators and are almost completely entropic.
      
      PAPR defines an H_RANDOM hypercall which guests can use to obtain one
      64-bit random sample from the HWRNG.  This adds a real-mode
      implementation of the H_RANDOM hypercall.  This hypercall was
      implemented in real mode because the latency of reading the HWRNG is
      generally small compared to the latency of a guest exit and entry for
      all the threads in the same virtual core.
      
      Userspace can detect the presence of the HWRNG and the H_RANDOM
      implementation by querying the KVM_CAP_PPC_HWRNG capability.  The
      H_RANDOM hypercall implementation will only be invoked when the guest
      does an H_RANDOM hypercall if userspace first enables the in-kernel
      H_RANDOM implementation using the KVM_CAP_PPC_ENABLE_HCALL capability.
      Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
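
      The capability query mentioned above can be sketched from userspace as follows. This is a minimal sketch, assuming a Linux system whose `<linux/kvm.h>` defines `KVM_CAP_PPC_HWRNG` and a file descriptor obtained by opening the KVM device; the helper name `ex_kvm_has_hwrng` is hypothetical. Enabling the in-kernel handler through `KVM_CAP_PPC_ENABLE_HCALL` is not shown.

      ```c
      #include <linux/kvm.h>
      #include <sys/ioctl.h>

      /*
       * Returns 1 if KVM reports KVM_CAP_PPC_HWRNG on the given KVM file
       * descriptor, 0 otherwise (including when the ioctl itself fails).
       */
      static int ex_kvm_has_hwrng(int kvm_fd)
      {
      	int ret = ioctl(kvm_fd, KVM_CHECK_EXTENSION, KVM_CAP_PPC_HWRNG);

      	return ret > 0;
      }
      ```

      Treating an ioctl failure as "not available" keeps the caller's fallback path simple: it only uses H_RANDOM when the check is positive.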
  7. 20 Apr, 2015 2 commits
    • target: Version 2 of TCMU ABI · 0ad46af8
      Committed by Andy Grover
      The initial version of TCMU (in 3.18) does not properly handle
      bidirectional SCSI commands -- those with both an in and out buffer. In
      looking to fix this it also became clear that TCMU's support for adding
      new types of entries (opcodes) to the command ring was broken. We need
      to fix this now, so that future issues can be handled properly by adding
      new opcodes.
      
      We make the most of this ABI break by enabling bidi cmd handling within the
      TCMU_OP_CMD opcode. Add an iov_bidi_cnt field to tcmu_cmd_entry.req.
      This enables TCMU to describe bidi commands, but further kernel work is
      needed for full bidi support.
      
      Enlarge tcmu_cmd_entry_hdr by 32 bits by pulling in cmd_id and __pad1. Turn
      __pad1 into two 8 bit flags fields, for kernel-set and userspace-set flags,
      "kflags" and "uflags" respectively.
      
      Update version fields so userspace can tell the interface is changed.
      
      Update tcmu-design.txt with details of how new stuff works:
      - Specify an additional requirement for userspace to set UNKNOWN_OP
        (bit 0) in hdr.uflags for unknown/unhandled opcodes.
      - Define how Data-In and Data-Out fields are described in req.iov[]
      
      Changed in v2:
      - Change name of SKIPPED bit to UNKNOWN bit
      - PAD op does not set the bit any more
      - Change len_op helper functions to take just len_op, not the whole struct
      - Change version to 2 in missed spots, and use defines
      - Add 16 unused bytes to cmd_entry.req, in case additional SAM cmd
        parameters need to be included
      - Add iov_dif_cnt field to specify buffers used for DIF info in iov[]
      - Rearrange fields to naturally align cdb_off
      - Handle if userspace sets UNKNOWN_OP by indicating failure of the cmd
      - Wrap some overly long UPDATE_HEAD lines
      
      (Add missing req.iov_bidi_cnt + req.iov_dif_cnt zeroing - Ilias)
      Signed-off-by: Andy Grover <agrover@redhat.com>
      Reviewed-by: Ilias Tsitsimpis <iliastsi@arrikto.com>
      Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
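
      The enlarged header and the len_op helpers described in this commit might look roughly like the sketch below. The field order follows the commit text (len_op, then cmd_id, then the two 8-bit flags fields); the `ex_` names and the choice of the low 3 bits of len_op for the opcode are assumptions made for this illustration, not the definitions from the TCMU header.

      ```c
      #include <stdint.h>

      /* Assumption for this sketch: the low 3 bits of len_op carry the opcode. */
      #define EX_OP_MASK          0x7u
      #define EX_UFLAG_UNKNOWN_OP 0x1u   /* bit 0 of uflags, per the commit text */

      struct ex_entry_hdr {
      	uint32_t len_op;   /* entry length and opcode packed together */
      	uint16_t cmd_id;
      	uint8_t  kflags;   /* set by the kernel */
      	uint8_t  uflags;   /* set by userspace */
      };

      /* Helpers take just len_op, not the whole struct. */
      static uint32_t ex_hdr_get_op(uint32_t len_op)  { return len_op & EX_OP_MASK; }
      static uint32_t ex_hdr_get_len(uint32_t len_op) { return len_op & ~EX_OP_MASK; }

      static void ex_hdr_set(uint32_t *len_op, uint32_t len, uint32_t op)
      {
      	*len_op = (len & ~EX_OP_MASK) | (op & EX_OP_MASK);
      }

      /* Userspace side: flag an entry whose opcode it does not handle. */
      static void ex_mark_unknown(struct ex_entry_hdr *hdr)
      {
      	hdr->uflags |= EX_UFLAG_UNKNOWN_OP;
      }
      ```

      Packing the opcode into the low bits only works if entry lengths are multiples of the mask size plus one, which is why the length getter masks those bits off.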
    • media-bus: Fixup RGB444_1X12, RGB565_1X16, and YUV8_1X24 media bus format · cec32a47
      Committed by Philipp Zabel
      Change the constant values for RGB444_1X12, RGB565_1X16, and YUV8_1X24 media
      bus formats in anticipation of a merge conflict with the media tree, where
      the old values are already taken by RBG888_1X24, RGB888_1X32_PADHI, and
      VUY8_1X24, respectively.
      Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
  8. 18 Apr, 2015 1 commit
  9. 17 Apr, 2015 2 commits
    • errno.h: Improve ENOSYS's comment · e15f431f
      Committed by Andy Lutomirski
      ENOSYS is the mechanism used by user code to detect whether the running
      kernel implements a given system call.  It should not be returned by
      anything except an unimplemented system call.
      
      Unfortunately, it is rather frequently used in the kernel to indicate that
      various new functions of existing system calls are not implemented.  This
      should be discouraged.
      
      Improve the comment in errno.h to help clarify ENOSYS's purpose.
      Signed-off-by: Andy Lutomirski <luto@amacapital.net>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Joe Perches <joe@perches.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
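
      The detection pattern this commit message describes can be sketched as a small classifier over the result of a probing call. The names `ex_probe` and `ex_classify_probe` are hypothetical; the sketch only assumes the usual convention that a failed system call returns -1 and sets errno.

      ```c
      #include <errno.h>

      enum ex_probe { EX_IMPLEMENTED, EX_NOT_IMPLEMENTED };

      /*
       * Interpret the result of a probing system call: only ENOSYS means the
       * kernel lacks the call. Any other failure (EINVAL, EBADF, ...) still
       * proves that the call exists.
       */
      static enum ex_probe ex_classify_probe(long ret, int err)
      {
      	if (ret == -1 && err == ENOSYS)
      		return EX_NOT_IMPLEMENTED;
      	return EX_IMPLEMENTED;
      }
      ```

      This is exactly why returning ENOSYS for an unsupported flag of an existing call is harmful: a prober written this way would wrongly conclude that the whole system call is missing.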
    • bpf: fix bpf helpers to use skb->mac_header relative offsets · a166151c
      Committed by Alexei Starovoitov
      As a short-term solution, let's fix the bpf helper functions to use
      skb->mac_header relative offsets instead of skb->data so that the
      same eBPF programs work with cls_bpf and act_bpf on both the ingress
      and egress qdisc paths. We need to ensure that mac_header is set
      before calling into programs. This is effectively the first option
      from the discussion referenced below.

      A longer-term solution for LD_ABS|LD_IND instructions will be more
      intrusive but also more beneficial than this, and will be implemented
      later, as it is too risky at this point in time.

      I.e., we plan to look into the option of moving skb_pull() out of
      eth_type_trans() and into netif_receive_skb(), as has been suggested
      as the second option. Meanwhile, this solution ensures ingress can be
      used with eBPF, too, and that we won't run into ABI troubles later.
      For dealing with negative offsets inside eBPF helper functions,
      we've implemented bpf_skb_clone_unwritable() to test for unwritable
      headers.
      
      Reference: http://thread.gmane.org/gmane.linux.network/359129/focus=359694
      Fixes: 608cd71a ("tc: bpf: generalize pedit action")
      Fixes: 91bc4822 ("tc: bpf: add checksum helpers")
      Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
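
      To see why mac_header relative offsets give the same result on both paths, consider the toy model below. It is only a sketch: `struct ex_skb` is a hypothetical stand-in for an skb, and it assumes that on ingress the data offset has already been advanced past the 14-byte Ethernet header while on egress it still points at it.

      ```c
      #include <stddef.h>
      #include <stdint.h>

      #define EX_ETH_HLEN 14   /* Ethernet header length */

      /* Toy stand-in for an skb: offsets into one flat frame buffer. */
      struct ex_skb {
      	const uint8_t *head;
      	size_t mac_header;   /* offset of the Ethernet header */
      	size_t data;         /* offset the stack currently treats as "data" */
      };

      /* Offset interpreted relative to data: path-dependent. */
      static uint8_t ex_load_from_data(const struct ex_skb *skb, size_t off)
      {
      	return skb->head[skb->data + off];
      }

      /* Offset interpreted relative to mac_header: same byte on both paths. */
      static uint8_t ex_load_from_mac(const struct ex_skb *skb, size_t off)
      {
      	return skb->head[skb->mac_header + off];
      }
      ```

      With the same frame, a data-relative load at a fixed offset returns different bytes on ingress and egress, whereas the mac_header-relative load does not.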
  10. 16 Apr, 2015 1 commit
    • dm: add full blk-mq support to request-based DM · bfebd1cd
      Committed by Mike Snitzer
      Commit e5863d9a ("dm: allocate requests in target when stacking on
      blk-mq devices") served as the first step toward fully utilizing blk-mq
      in request-based DM -- it enabled stacking an old-style (request_fn)
      request_queue on top of the underlying blk-mq device(s).  That first step
      didn't improve performance of DM multipath on top of fast blk-mq devices
      (e.g. NVMe) because the top-level old-style request_queue was severely
      limited by the queue_lock.

      The second step offered here enables stacking a blk-mq request_queue
      on top of the underlying blk-mq device(s).  This unlocks significant
      performance gains on fast blk-mq devices.  Keith Busch tested on his NVMe
      testbed and offered this really positive news:
      
       "Just providing a performance update. All my fio tests are getting
        roughly equal performance whether accessed through the raw block
        device or the multipath device mapper (~470k IOPS). I could only push
        ~20% of the raw iops through dm before this conversion, so this latest
        tree is looking really solid from a performance standpoint."
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Tested-by: Keith Busch <keith.busch@intel.com>
  11. 15 Apr, 2015 1 commit
  12. 14 Apr, 2015 5 commits
    • drm/nouveau/gem: allow user-space to specify an object should be coherent · 996f545f
      Committed by Alexandre Courbot
      User-space uses mappable BOs notably for fences, and expects that a
      value update by the GPU will be immediately visible through the
      user-space mapping.
      
      ARM has a property that may prevent this from happening though: memory
      can be mapped multiple times only if the different mappings share the
      same caching properties. However all the lowmem memory is already
      identity-mapped into the kernel with cache enabled, so when user-space
      requests an uncached mapping, we actually get an "undefined caching
      policy" one and this has strange side-effects described on Freedesktop
      bug 86690.
      
      To prevent this from happening, allow user-space to explicitly specify
      which objects should be coherent, and create such objects with the
      TTM_PL_FLAG_UNCACHED flag. This will make TTM allocate memory using the
      DMA API, which will fix the identity mapping and allow us to safely map
      the objects to user-space uncached.
      Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
      Reviewed-by: Lucas Stach <dev@lynxeye.de>
      Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
    • netfilter: nft_dynset: dynamic stateful expression instantiation · 3e135cd4
      Committed by Patrick McHardy
      Support instantiating stateful expressions based on a template that
      are associated with dynamically created set entries. The expressions
      are evaluated when adding or updating the set element.
      
      This allows maintaining per-flow state using the existing set
      infrastructure and expression types, with arbitrary definitions of
      a flow.

      Usage is currently restricted to anonymous sets, meaning only a single
      binding can exist, since the desired semantics of multiple independent
      bindings haven't been defined so far.
      
      Examples (userspace syntax is still WIP):
      
      1. Limit the rate of new SSH connections per host, similar to iptables
         hashlimit:
      
      	flow ip saddr timeout 60s \
      	limit 10/second \
      	accept
      
      2. Account network traffic between each set of /24 networks:
      
      	flow ip saddr & 255.255.255.0 . ip daddr & 255.255.255.0 \
      	counter
      
      3. Account traffic to each host per user:
      
      	flow skuid . ip daddr \
      	counter
      
      4. Account traffic for each combination of source address and TCP flags:
      
      	flow ip saddr . tcp flags \
      	counter
      
      The resulting set content after an Xmas scan looks like this:
      
      {
      	192.168.122.1 . fin | psh | urg : counter packets 1001 bytes 40040,
      	192.168.122.1 . ack : counter packets 74 bytes 3848,
      	192.168.122.1 . psh | ack : counter packets 35 bytes 3144
      }
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
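
      The idea behind example 2 above (one counter per pair of /24 networks, created on first use) can be modelled in a few lines. This is only a conceptual sketch, not nf_tables code: the `ex_` types, the fixed-size table, and the linear search are hypothetical simplifications, and addresses are plain host-order integers.

      ```c
      #include <stddef.h>
      #include <stdint.h>

      #define EX_MAX_FLOWS 16

      struct ex_flow {
      	uint32_t src_net, dst_net;   /* flow key: masked source and destination */
      	uint64_t packets, bytes;     /* per-flow "counter" state */
      };

      struct ex_flow_table {
      	struct ex_flow flows[EX_MAX_FLOWS];
      	size_t used;
      };

      /*
       * Account one packet. The entry is created the first time a key is seen
       * and updated afterwards. Returns the entry, or NULL if the table is full.
       */
      static struct ex_flow *ex_account(struct ex_flow_table *t, uint32_t saddr,
      				  uint32_t daddr, uint32_t len)
      {
      	const uint32_t mask = 0xffffff00u;   /* 255.255.255.0 */
      	uint32_t s = saddr & mask, d = daddr & mask;
      	struct ex_flow *f = NULL;

      	for (size_t i = 0; i < t->used; i++) {
      		if (t->flows[i].src_net == s && t->flows[i].dst_net == d) {
      			f = &t->flows[i];
      			break;
      		}
      	}
      	if (!f) {
      		if (t->used == EX_MAX_FLOWS)
      			return NULL;
      		f = &t->flows[t->used++];
      		*f = (struct ex_flow){ .src_net = s, .dst_net = d };
      	}
      	f->packets++;
      	f->bytes += len;
      	return f;
      }
      ```

      The "template" in the commit corresponds to the zero-initialised counter here: every new key gets its own fresh copy of the state.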
    • netfilter: nf_tables: add flag to indicate set contains expressions · 7c6c6e95
      Committed by Patrick McHardy
      Add a set flag to indicate that the set is used as a state table and
      contains expressions for evaluation. This operation is mutually
      exclusive with the mapping operation, so sets specifying both are
      rejected. The lookup expression also rejects binding to state tables
      since it only deals with lookup and map operations.
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
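
      The two rules stated in this commit (state-table and mapping flags are mutually exclusive, and the lookup expression refuses state tables) can be written down as two small predicates. The flag names and bit positions below are hypothetical and chosen for this sketch only.

      ```c
      #include <stdbool.h>
      #include <stdint.h>

      /* Hypothetical flag bits for illustration; not the kernel's values. */
      #define EX_SET_MAP  (1u << 0)   /* set maps keys to data */
      #define EX_SET_EVAL (1u << 1)   /* set is a state table holding expressions */

      /* A set may be a map or a state table, but not both. */
      static bool ex_set_flags_valid(uint32_t flags)
      {
      	const uint32_t both = EX_SET_MAP | EX_SET_EVAL;

      	return (flags & both) != both;
      }

      /* The lookup expression only handles lookup and map sets. */
      static bool ex_lookup_may_bind(uint32_t flags)
      {
      	return !(flags & EX_SET_EVAL);
      }
      ```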
    • netfilter: nf_tables: prepare for expressions associated to set elements · f25ad2e9
      Committed by Patrick McHardy
      Preparation to attach expressions to set elements: add a set extension
      type to hold an expression and dump the expression information with the
      set element.
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • uapi: ebtables: don't include linux/if.h · 24477e57
      Committed by Pablo Neira Ayuso
      linux/if.h creates conflicts in userspace with net/if.h
      
      By using it here we force userspace to use linux/if.h while
      net/if.h may be needed.
      
      Note that:
      
      include/linux/netfilter_ipv4/ip_tables.h and
      include/linux/netfilter_ipv6/ip6_tables.h
      
      don't include linux/if.h and they also refer to IFNAMSIZ, so they
      expect userspace to include net/if.h from the client program.
      Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
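
      From the client program's side, the expectation described above amounts to including net/if.h itself before using IFNAMSIZ. A minimal sketch, with a hypothetical helper `ex_set_ifname` that fills a fixed-size interface-name buffer:

      ```c
      #include <net/if.h>    /* userspace header that defines IFNAMSIZ */
      #include <string.h>

      /*
       * Copy an interface name into a fixed IFNAMSIZ buffer.
       * Returns 0 on success, -1 if the name (plus terminator) does not fit.
       */
      static int ex_set_ifname(char dst[IFNAMSIZ], const char *name)
      {
      	size_t len = strlen(name);

      	if (len >= IFNAMSIZ)
      		return -1;
      	memcpy(dst, name, len + 1);
      	return 0;
      }
      ```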
  13. 13 Apr, 2015 3 commits
  14. 11 Apr, 2015 1 commit
  15. 08 Apr, 2015 5 commits
  16. 07 Apr, 2015 1 commit
    • tc: bpf: add checksum helpers · 91bc4822
      Committed by Alexei Starovoitov
      Commit 608cd71a ("tc: bpf: generalize pedit action") added the ability
      for BPF programs in the tc pipeline to mangle packet data.
      This patch adds two helpers, bpf_l3_csum_replace() and bpf_l4_csum_replace(),
      for fixing up the protocol checksums after the packet mangling.

      It also adds a 'flags' argument to the bpf_skb_store_bytes() helper to avoid
      unnecessary checksum recomputations when BPF programs adjust l3/l4
      checksums, and documents all three helpers in the uapi header.
      
      Moreover, a sample program is added to show how BPF programs can make use
      of the mangle and csum helpers.
      Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
      Acked-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
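
      Fixing up a checksum after mangling, rather than recomputing it, relies on the incremental update property of the one's-complement Internet checksum. The sketch below shows that generic technique for a single 16-bit word; it is not the implementation of the helpers added by this commit, and all `ex_` names are hypothetical.

      ```c
      #include <stddef.h>
      #include <stdint.h>

      /* Fold a 32-bit accumulator into 16 bits with end-around carry. */
      static uint16_t ex_fold(uint32_t sum)
      {
      	while (sum >> 16)
      		sum = (sum & 0xffffu) + (sum >> 16);
      	return (uint16_t)sum;
      }

      /* Full one's-complement checksum over n 16-bit words. */
      static uint16_t ex_csum(const uint16_t *words, size_t n)
      {
      	uint32_t sum = 0;

      	for (size_t i = 0; i < n; i++)
      		sum += words[i];
      	return (uint16_t)~ex_fold(sum);
      }

      /* Incremental update after one 16-bit word changes from old_w to new_w. */
      static uint16_t ex_csum_replace(uint16_t check, uint16_t old_w, uint16_t new_w)
      {
      	uint32_t sum = (uint16_t)~check;

      	sum += (uint16_t)~old_w;   /* subtract the old word */
      	sum += new_w;              /* add the new word */
      	return (uint16_t)~ex_fold(sum);
      }
      ```

      The incremental form touches three 16-bit values regardless of packet size, which is the reason a program that rewrites one field would prefer it over a full recomputation.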
  17. 04 Apr, 2015 1 commit
  18. 03 Apr, 2015 9 commits
  19. 02 Apr, 2015 1 commit
    • perf: Add ITRACE_START record to indicate that tracing has started · ec0d7729
      Committed by Alexander Shishkin
      For counters that generate AUX data that is bound to the context of a
      running task, such as instruction tracing, the decoder needs to know
      exactly which task is running when the event is first scheduled in,
      before the first sched_switch. The decoder's need to know this stems
      from the fact that instruction flow trace decoding will almost always
      require the program's object code in order to reconstruct said flow, and
      for that we need at least its pid/tid in the perf stream.

      To single out such instruction tracing pmus, this patch introduces the
      ITRACE PMU capability. The reason this is not part of the RECORD_AUX
      record is that not all pmus capable of generating AUX data need this,
      and the opposite is *probably* also true.

      While sched_switch covers most cases, there are two problems with it:
      the consumer will need to process events out of order (that is, having
      found RECORD_AUX, it will have to skip forward to the nearest sched_switch
      to figure out which task it was, then go back to the actual trace to
      decode it) and it completely misses the case when the tracing is enabled
      and disabled before sched_switch, for example, via PERF_EVENT_IOC_DISABLE.
      Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Kaixu Xia <kaixu.xia@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Robert Richter <rric@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: acme@infradead.org
      Cc: adrian.hunter@intel.com
      Cc: kan.liang@intel.com
      Cc: markus.t.metzger@intel.com
      Cc: mathieu.poirier@linaro.org
      Link: http://lkml.kernel.org/r/1421237903-181015-15-git-send-email-alexander.shishkin@linux.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
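
      The benefit for a trace consumer can be illustrated with a toy record stream. In this sketch the record types and the `ex_record` layout are hypothetical simplifications of the perf stream; the point is only that, with a start record carrying the tid, each AUX record can be attributed in a single forward pass without skipping ahead to the next sched_switch.

      ```c
      #include <stddef.h>

      enum ex_rec_type { EX_REC_ITRACE_START, EX_REC_AUX, EX_REC_SWITCH };

      struct ex_record {
      	enum ex_rec_type type;
      	int tid;   /* meaningful for ITRACE_START and SWITCH records only */
      };

      /*
       * Single forward pass: remember the task announced by the most recent
       * ITRACE_START or SWITCH record and attribute each AUX record to it.
       * out_tid[i] is set for AUX records; -1 means the task is unknown.
       */
      static void ex_attribute_aux(const struct ex_record *recs, size_t n,
      			     int *out_tid)
      {
      	int cur = -1;

      	for (size_t i = 0; i < n; i++) {
      		out_tid[i] = -1;
      		if (recs[i].type == EX_REC_AUX)
      			out_tid[i] = cur;
      		else
      			cur = recs[i].tid;
      	}
      }
      ```

      Without the leading start record, the first AUX record in the stream comes out as unknown, which is the gap this commit closes.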