1. 06 3月, 2015 1 次提交
    • J
      bridge: Extend Proxy ARP design to allow optional rules for Wi-Fi · 842a9ae0
      Jouni Malinen 提交于
      This extends the design in commit 95850116 ("bridge: Add support for
      IEEE 802.11 Proxy ARP") with optional set of rules that are needed to
      meet the IEEE 802.11 and Hotspot 2.0 requirements for ProxyARP. The
      previously added BR_PROXYARP behavior is left as-is and a new
      BR_PROXYARP_WIFI alternative is added so that this behavior can be
      configured from user space when required.
      
      In addition, this enables proxyarp functionality for unicast ARP
      requests for both BR_PROXYARP and BR_PROXYARP_WIFI since it is possible
      to use unicast as well as broadcast for these frames.
      
      The key differences in functionality:
      
      BR_PROXYARP:
      - uses the flag on the bridge port on which the request frame was
        received to determine whether to reply
      - block bridge port flooding completely on ports that enable proxy ARP
      
      BR_PROXYARP_WIFI:
      - uses the flag on the bridge port to which the target device of the
        request belongs
      - block bridge port flooding selectively based on whether the proxyarp
        functionality replied
      Signed-off-by: NJouni Malinen <jouni@codeaurora.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      842a9ae0
  2. 04 3月, 2015 1 次提交
    • E
      mpls: Basic routing support · 0189197f
      Eric W. Biederman 提交于
      This change adds a new Kconfig option MPLS_ROUTING.
      
      The core of this change is the code to look at an mpls packet received
      from another machine.  Look that packet up in a routing table and
      forward the packet on.
      
      Support of MPLS over ATM is not considered or attempted here.  This
      implemntation follows RFC3032 and implements the MPLS shim header that
      can pass over essentially any network.
      
      What RFC3021 refers to as the as the Incoming Label Map (ILM) I call
      net->mpls.platform_label[].  What RFC3031 refers to as the Next Label
      Hop Forwarding Entry (NHLFE) I call mpls_route.  Though calling it the
      label fordwarding information base (lfib) might also be valid.
      
      Further the implemntation forwards packets as described in RFC3032.
      There is no need and given the original motivation for MPLS a strong
      discincentive to have a flexible label forwarding path.  In essence
      the logic is the topmost label is read, looked up, removed, and
      replaced by 0 or more new lables and the sent out the specified
      interface to it's next hop.
      
      Quite a few optional features are not implemented here.  Among them
      are generation of ICMP errors when the TTL is exceeded or the packet
      is larger than the next hop MTU (those conditions are detected and the
      packets are dropped instead of generating an icmp error).  The traffic
      class field is always set to 0.  The implementation focuses on IP over
      MPLS and does not handle egress of other kinds of protocols.
      
      Instead of implementing coordination with the neighbour table and
      sorting out how to input next hops in a different address family (for
      which there is value).  I was lazy and implemented a next hop mac
      address instead.  The code is simpler and there are flavor of MPLS
      such as MPLS-TP where neither an IPv4 nor an IPv6 next hop is
      appropriate so a next hop by mac address would need to be implemented
      at some point.
      
      Two new definitions AF_MPLS and PF_MPLS are exposed to userspace.
      
      Decoding the mpls header must be done by first byeswapping a 32bit bit
      endian word into the local cpu endian and then bit shifting to extract
      the pieces.  There is no C bit-field that can represent a wire format
      mpls header on a little endian machine as the low bits of the 20bit
      label wind up in the wrong half of third byte.  Therefore internally
      everything is deal with in cpu native byte order except when writing
      to and reading from a packet.
      
      For management simplicity if a label is configured to forward out
      an interface that is down the packet is dropped early.  Similarly
      if an network interface is removed rt_dev is updated to NULL
      (so no reference is preserved) and any packets for that label
      are dropped.  Keeping the label entries in the kernel allows
      the kernel label table to function as the definitive source
      of which labels are allocated and which are not.
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0189197f
  3. 03 3月, 2015 4 次提交
  4. 02 3月, 2015 6 次提交
  5. 01 3月, 2015 2 次提交
  6. 28 2月, 2015 2 次提交
  7. 25 2月, 2015 1 次提交
  8. 23 2月, 2015 5 次提交
  9. 22 2月, 2015 2 次提交
  10. 21 2月, 2015 2 次提交
  11. 20 2月, 2015 6 次提交
    • K
      NVMe: Fix potential corruption during shutdown · 07836e65
      Keith Busch 提交于
      The driver has to end unreturned commands at some point even if the
      controller has not provided a completion. The driver tried to be safe by
      deleting IO queues prior to ending all unreturned commands. That should
      cause the controller to internally abort inflight commands, but IO queue
      deletion request does not have to be successful, so all bets are off. We
      still have to make progress, so to be extra safe, this patch doesn't
      clear a queue to release the dma mapping for a command until after the
      pci device has been disabled.
      
      This patch removes the special handling during device initialization
      so controller recovery can be done all the time. This is possible since
      initialization is not inlined with pci probe anymore.
      Reported-by: NNilish Choudhury <nilesh.choudhury@oracle.com>
      Signed-off-by: NKeith Busch <keith.busch@intel.com>
      07836e65
    • K
      NVMe: Asynchronous controller probe · 2e1d8448
      Keith Busch 提交于
      This performs the longest parts of nvme device probe in scheduled work.
      This speeds up probe significantly when multiple devices are in use.
      Signed-off-by: NKeith Busch <keith.busch@intel.com>
      2e1d8448
    • K
      NVMe: Register management handle under nvme class · b3fffdef
      Keith Busch 提交于
      This creates a new class type for nvme devices to register their
      management character devices with. This is so we do not rely on miscdev
      to provide enough minors for as many nvme devices some people plan to
      use. The previous limit was approximately 60 NVMe controllers, depending
      on the platform and kernel. Now the limit is 1M, which ought to be enough
      for anybody.
      
      Since we have a new device class, it makes sense to attach the block
      devices under this as well, so part of this patch moves the management
      handle initialization prior to the namespaces discovery.
      Signed-off-by: NKeith Busch <keith.busch@intel.com>
      b3fffdef
    • K
      NVMe: Update SCSI Inquiry VPD 83h translation · 4f1982b4
      Keith Busch 提交于
      The original translation created collisions on Inquiry VPD 83 for many
      existing devices. Newer specifications provide other ways to translate
      based on the device's version can be used to create unique identifiers.
      
      Version 1.1 provides an EUI64 field that uniquely identifies each
      namespace, and 1.2 added the longer NGUID field for the same reason.
      Both follow the IEEE EUI format and readily translate to the SCSI device
      identification EUI designator type 2h. For devices implementing either,
      the translation will use this type, defaulting to the EUI64 8-byte type if
      implemented then NGUID's 16 byte version if not. If neither are provided,
      the 1.0 translation is used, and is updated to use the SCSI String format
      to guarantee a unique identifier.
      
      Knowing when to use the new fields depends on the nvme controller's
      revision. The NVME_VS macro was not decoding this correctly, so that is
      fixed in this patch and moved to a more appropriate place.
      
      Since the Identify Namespace structure required an update for the NGUID
      field, this patch adds the remaining new 1.2 fields to the structure.
      Signed-off-by: NKeith Busch <keith.busch@intel.com>
      4f1982b4
    • K
      NVMe: Metadata format support · e1e5e564
      Keith Busch 提交于
      Adds support for NVMe metadata formats and exposes block devices for
      all namespaces regardless of their format. Namespace formats that are
      unusable will have disk capacity set to 0, but a handle to the block
      device is created to simplify device management. A namespace is not
      usable when the format requires host interleave block and metadata in
      single buffer, has no provisioned storage, or has better data but failed
      to register with blk integrity.
      
      The namespace has to be scanned in two phases to support separate
      metadata formats. The first establishes the sector size and capacity
      prior to invoking add_disk. If metadata is required, the capacity will
      be temporarilly set to 0 until it can be revalidated and registered with
      the integrity extenstions after add_disk completes.
      
      The driver relies on the integrity extensions to provide the metadata
      buffer. NVMe requires this be a single physically contiguous region,
      so only one integrity segment is allowed per command. If the metadata
      is used for T10 PI, the driver provides mappings to save and restore
      the reftag physical block translation. The driver provides no-op
      functions for generate and verify if metadata is not used for protection
      information. This way the setup is always provided by the block layer.
      
      If a request does not supply a required metadata buffer, the command
      is failed with bad address. This could only happen if a user manually
      disables verify/generate on such a disk. The only exception to where
      this is okay is if the controller is capable of stripping/generating
      the metadata, which is possible on some types of formats.
      
      The metadata scatter gather list now occupies the spot in the nvme_iod
      that used to be used to link retryable IOD's, but we don't do that
      anymore, so the field was unused.
      Signed-off-by: NKeith Busch <keith.busch@intel.com>
      e1e5e564
    • D
      kdb: Avoid printing KERN_ levels to consoles · f7d4ca8b
      Daniel Thompson 提交于
      Currently when kdb traps printk messages then the raw log level prefix
      (consisting of '\001' followed by a numeral) does not get stripped off
      before the message is issued to the various I/O handlers supported by
      kdb. This causes annoying visual noise as well as causing problems
      grepping for ^. It is also a change of behaviour compared to normal usage
      of printk() usage. For example <SysRq>-h ends up with different output to
      that of kdb's "sr h".
      
      This patch addresses the problem by stripping log levels from messages
      before they are issued to the I/O handlers. printk() which can also
      act as an i/o handler in some cases is special cased; if the caller
      provided a log level then the prefix will be preserved when sent to
      printk().
      
      The addition of non-printable characters to the output of kdb commands is a
      regression, albeit and extremely elderly one, introduced by commit
      04d2c8c8 ("printk: convert the format for KERN_<LEVEL> to a 2 byte
      pattern"). Note also that this patch does *not* restore the original
      behaviour from v3.5. Instead it makes printk() from within a kdb command
      display the message without any prefix (i.e. like printk() normally does).
      Signed-off-by: NDaniel Thompson <daniel.thompson@linaro.org>
      Cc: Joe Perches <joe@perches.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NJason Wessel <jason.wessel@windriver.com>
      f7d4ca8b
  12. 19 2月, 2015 6 次提交
  13. 18 2月, 2015 2 次提交
    • N
      sched: Prevent recursion in io_schedule() · 9cff8ade
      NeilBrown 提交于
      io_schedule() calls blk_flush_plug() which, depending on the
      contents of current->plug, can initiate arbitrary blk-io requests.
      
      Note that this contrasts with blk_schedule_flush_plug() which requires
      all non-trivial work to be handed off to a separate thread.
      
      This makes it possible for io_schedule() to recurse, and initiating
      block requests could possibly call mempool_alloc() which, in times of
      memory pressure, uses io_schedule().
      
      Apart from any stack usage issues, io_schedule() will not behave
      correctly when called recursively as delayacct_blkio_start() does
      not allow for repeated calls.
      
      So:
       - use ->in_iowait to detect recursion.  Set it earlier, and restore
         it to the old value.
       - move the call to "raw_rq" after the call to blk_flush_plug().
         As this is some sort of per-cpu thing, we want some chance that
         we are on the right CPU
       - When io_schedule() is called recurively, use blk_schedule_flush_plug()
         which cannot further recurse.
       - as this makes io_schedule() a lot more complex and as io_schedule()
         must match io_schedule_timeout(), but all the changes in io_schedule_timeout()
         and make io_schedule a simple wrapper for that.
      Signed-off-by: NNeilBrown <neilb@suse.de>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      [ Moved the now rudimentary io_schedule() into sched.h. ]
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Tony Battersby <tonyb@cybernetics.com>
      Link: http://lkml.kernel.org/r/20150213162600.059fffb2@notabene.brownSigned-off-by: NIngo Molnar <mingo@kernel.org>
      9cff8ade
    • J
      7647f14f