1. 08 Jul, 2014 1 commit
    • ipv6: Implement automatic flow label generation on transmit · cb1ce2ef
      Committed by Tom Herbert
      Automatically generate flow labels for IPv6 packets on transmit.
      The flow label is computed based on skb_get_hash. The flow label will
      only automatically be set when it is zero otherwise (i.e. flow label
      manager hasn't set one). This supports the transmit side functionality
      of RFC 6438.
      
      Added an IPv6 sysctl auto_flowlabels to enable/disable this behavior
      system wide, and added IPV6_AUTOFLOWLABEL socket option to enable this
      functionality per socket.
      
      By default, auto flowlabels are disabled to avoid possible conflicts
      with the flow label manager; however, if this feature proves useful we
      may want to enable it by default.
      
      It should also be noted that FreeBSD has already implemented automatic
      flow labels (including the sysctl and socket option). In FreeBSD,
      automatic flow labels default to enabled.
      
      Performance impact:
      
      Running super_netperf with 200 flows for TCP_RR and UDP_RR for
      IPv6. Note that in the UDP case, __skb_get_hash will be called for
      every packet, which explains the slight regression. In the TCP case
      the hash is saved in the socket so there is no regression.
      
      Automatic flow labels disabled:
      
        TCP_RR:
          86.53% CPU utilization
          127/195/322 90/95/99% latencies
          1.40498e+06 tps
      
        UDP_RR:
          90.70% CPU utilization
          118/168/243 90/95/99% latencies
          1.50309e+06 tps
      
      Automatic flow labels enabled:
      
        TCP_RR:
          85.90% CPU utilization
          128/199/337 90/95/99% latencies
          1.40051e+06 tps

        UDP_RR:
          92.61% CPU utilization
          115/164/236 90/95/99% latencies
          1.4687e+06 tps
      Signed-off-by: Tom Herbert <therbert@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
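The transmit-side rule in the commit message above (set a label only when none is already set) can be sketched as follows. This is a minimal illustration in Python, not the kernel code: `auto_flow_label` and its parameters are hypothetical names, and taking the low 20 bits of the hash is an assumption based on the width of the IPv6 flow label field.

```python
FLOW_LABEL_MASK = 0x000FFFFF  # the IPv6 flow label field is 20 bits wide


def auto_flow_label(current_label: int, flow_hash: int, enabled: bool) -> int:
    """Return the flow label to place in the outgoing IPv6 header.

    Hypothetical helper: a label that is already non-zero (for example one
    set by the flow label manager) is kept. Otherwise, when the feature is
    enabled, the label is derived from the packet's flow hash.
    """
    if current_label != 0 or not enabled:
        return current_label
    # Assumption: the label is simply the low 20 bits of the hash.
    return flow_hash & FLOW_LABEL_MASK


print(hex(auto_flow_label(0, 0x12345678, True)))
```

Because the label depends only on the flow hash, packets of one flow carry the same label, which is what lets routers use it for load balancing without reordering a flow.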
  2. 02 Jul, 2014 1 commit
    • ipv6: Allow accepting RA from local IP addresses. · d9333196
      Committed by Ben Greear
      This can be used in virtual networking applications, and
      may have other uses as well.  The option is disabled by
      default.
      
      A specific use case is setting up virtual routers, bridges, and
      hosts on a single OS without the use of network namespaces or
      virtual machines.  With proper use of ip rules, routing tables,
      veth interface pairs and/or other virtual interfaces,
      and applications that can bind to interfaces and/or IP addresses,
      it is possible to create one or more virtual routers with multiple
      hosts attached.  The host interfaces can act as IPv6 systems,
      with radvd running on the ports in the virtual routers.  With the
      option provided in this patch enabled, those hosts can now properly
      obtain IPv6 addresses from the radvd.
      Signed-off-by: Ben Greear <greearb@candelatech.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 26 Jun, 2014 1 commit
  4. 23 Jun, 2014 1 commit
  5. 14 Jun, 2014 1 commit
  6. 13 Jun, 2014 1 commit
  7. 12 Jun, 2014 2 commits
  8. 11 Jun, 2014 1 commit
    • RDMA/core: Add support for iWARP Port Mapper user space service · 30dc5e63
      Committed by Tatyana Nikolova
      This patch adds iWARP Port Mapper (IWPM) Version 2 support.  The iWARP
      Port Mapper implementation is based on the port mapper specification
      section in the Sockets Direct Protocol paper -
      http://www.rdmaconsortium.org/home/draft-pinkerton-iwarp-sdp-v1.0.pdf
      
      Existing iWARP RDMA providers use the same IP address as the native
      TCP/IP stack when creating RDMA connections.  They need a mechanism to
      claim the TCP ports used for RDMA connections to prevent TCP port
      collisions when other host applications use TCP ports.  The iWARP Port
      Mapper provides a standard mechanism to accomplish this.  Without this
      service it is possible for an RDMA application to bind/listen on a
      port which is already being used by a native TCP host application.  If
      that happens, the incoming TCP connection data can be passed to the
      RDMA stack with error.
      
      The iWARP Port Mapper solution doesn't contain any changes to the
      existing network stack in the kernel space.  All the changes are
      contained within the infiniband tree and also in user space.
      
      The iWARP Port Mapper service is implemented as a user space daemon
      process.  Source for the IWPM service is located at
      http://git.openfabrics.org/git?p=~tnikolova/libiwpm-1.0.0/.git;a=summary
      
      The iWARP driver (port mapper client) sends to the IWPM service the
      local IP address and TCP port it has received from the RDMA
      application, when starting a connection.  The IWPM service performs a
      socket bind from user space to get an available TCP port, called a
      mapped port, and communicates it back to the client.  In that sense,
      the IWPM service is used to map the TCP port which the RDMA
      application uses to any port available from the host TCP port
      space. The mapped ports are used in iWARP RDMA connections to avoid
      collisions with the native TCP stack, which is aware that these ports are
      taken. When an RDMA connection using a mapped port is terminated, the
      client notifies the IWPM service, which then releases the TCP port.
      
      The message exchange between the IWPM service and the iWARP drivers
      (between user space and kernel space) is implemented using netlink
      sockets.
      
      1) Netlink interface functions are added: ibnl_unicast() and
         ibnl_multicast() for sending netlink messages to user space
      
      2) The signature of the existing ibnl_put_msg() is changed to be more
         generic
      
      3) Two netlink clients are added: RDMA_NL_NES, RDMA_NL_C4IW
         corresponding to the two iWARP drivers - nes and cxgb4 which use
         the IWPM service
      
      4) Enums are added to enumerate the attributes in the netlink
         messages, which are exchanged between the user space IWPM service
         and the iWARP drivers
      Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
      Signed-off-by: Steve Wise <swise@opengridcomputing.com>
      Reviewed-by: PJ Waskiewicz <pj.waskiewicz@solidfire.com>
      
      [ Fold in range checking fixes and nlh_next removal as suggested by Dan
        Carpenter and Steve Wise.  Fix sparse endianness in hash.  - Roland ]
      Signed-off-by: Roland Dreier <roland@purestorage.com>
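The core idea described above, a user space bind that reserves a TCP port so the native stack knows it is taken, can be illustrated with ordinary sockets. This is a rough sketch in Python and not the IWPM daemon itself: `claim_mapped_port` and `release_mapped_port` are hypothetical names, and the netlink exchange with the iWARP driver is left out entirely.

```python
import socket


def claim_mapped_port(local_ip: str = "127.0.0.1") -> socket.socket:
    """Bind a TCP socket to port 0 so the kernel picks a free port.

    The returned socket must stay open for as long as the mapping is in
    use; while it is open, the native TCP stack treats the port as taken.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((local_ip, 0))
    return sock


def release_mapped_port(sock: socket.socket) -> None:
    """Close the socket, which returns the port to the host port space."""
    sock.close()


holder = claim_mapped_port()
mapped_port = holder.getsockname()[1]  # the port the kernel selected
print("mapped port:", mapped_port)
release_mapped_port(holder)
```

Holding the socket open is what makes the reservation visible to every other application on the host, which is why no change to the kernel network stack is needed for this part.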
  9. 10 Jun, 2014 3 commits
  10. 09 Jun, 2014 1 commit
  11. 07 Jun, 2014 2 commits
    • ipc,shm: document new limits in the uapi header · f57a19a7
      Committed by Davidlohr Bueso
      This is useful in the future and allows users to better understand the
      reasoning behind the changes.
      
      Also use UL as we're dealing with it anyways.
      Signed-off-by: Davidlohr Bueso <davidlohr@hp.com>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Manfred Spraul <manfred@colorfullife.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ipc/shm.c: increase the defaults for SHMALL, SHMMAX · 060028ba
      Committed by Manfred Spraul
      System V shared memory
      
      a) can be abused to trigger out-of-memory conditions and the standard
         measures against out-of-memory do not work:
      
          - it is not possible to use setrlimit to limit the size of shm segments.
      
          - segments can exist without association with any processes, thus
            the oom-killer is unable to free that memory.
      
      b) is typically used for shared information - today often multiple GB.
         (e.g. database shared buffers)
      
      The current default is a maximum segment size of 32 MB and a maximum
      total size of 8 GB.  This is often too much for a) and not enough for
      b), which means that lots of users must change the defaults.
      
      This patch increases the default limits (nearly) to the maximum, which
      is perfect for case b).  The defaults are used after boot and as the
      initial value for each new namespace.
      
      Admins/distros that need a protection against a) should reduce the
      limits and/or enable shm_rmid_forced.
      
      Unix has historically required setting these limits for shared memory,
      and Linux inherited such behavior.  The consequence of this is added
      complexity for users and administrators.  One very common example is
      database setup/installation documents and scripts, where users must
      manually calculate the values for these limits.  This also requires
      (some) knowledge of how the underlying memory management works, thus
      causing, on many occasions, the limits to just be flat out wrong.
      Disabling these limits sooner could have saved companies a lot of time,
      headaches and money for support.  But it's never too late: simplify
      users' lives now.
      
      Further notes:
      - The patch only changes the defaults; overrides behave as before:
              # sysctl kernel.shmmax=33554432
        would recreate the previous limit for SHMMAX (for the current namespace).
      
      - Disabling sysv shm allocation is possible with:
              # sysctl kernel.shmall=0
        (not a new feature, also per-namespace)
      
      - The limits are intentionally set to a value slightly less than ULONG_MAX,
        to avoid triggering overflows in user space apps.
        [not unreasonable, see http://marc.info/?l=linux-mm&m=139638334330127]
      Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
      Signed-off-by: Davidlohr Bueso <davidlohr@hp.com>
      Reported-by: Davidlohr Bueso <davidlohr@hp.com>
      Acked-by: Michael Kerrisk <mtk.manpages@gmail.com>
      Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
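The old defaults quoted in the commit message can be checked with a little arithmetic. The sketch below is in Python and rests on stated assumptions: 4 KiB pages, a 64-bit unsigned long, and a margin of `1 << 24` below ULONG_MAX that is used here only as an example of "slightly less than ULONG_MAX".

```python
PAGE_SIZE = 4096          # assumption: 4 KiB pages
ULONG_MAX_64 = 2**64 - 1  # assumption: 64-bit unsigned long

# Old defaults from the commit message: 32 MB per segment, 8 GB in total.
old_shmmax_bytes = 32 * 1024 * 1024            # segment limit, in bytes
old_shmall_pages = (8 * 1024**3) // PAGE_SIZE  # total limit, in pages

# Hypothetical margin: any value slightly below ULONG_MAX matches the
# description in the commit message; 1 << 24 is only an example.
example_new_limit = ULONG_MAX_64 - (1 << 24)

print(old_shmmax_bytes)
print(old_shmall_pages)
print(example_new_limit < ULONG_MAX_64)
```

Keeping the limit a little below ULONG_MAX leaves headroom for user space programs that add to the value, so their arithmetic does not wrap around.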
  12. 06 Jun, 2014 1 commit
    • perf: Differentiate exec() and non-exec() comm events · 82b89778
      Committed by Adrian Hunter
      perf tools like 'perf report' can aggregate samples by comm strings,
      which generally works.  However, there are other potential use-cases.
      For example, to pair up 'calls' with 'returns' accurately (from branch
      events like Intel BTS) it is necessary to identify whether the process
      has exec'd.  Although a comm event is generated when an 'exec' happens
      it is also generated whenever the comm string is changed on a whim
      (e.g. by prctl PR_SET_NAME).  This patch adds a flag to the comm event
      to differentiate one case from the other.
      
      In order to determine whether the kernel supports the new flag, a
      selection bit named 'exec' is added to struct perf_event_attr.  The
      bit does nothing but will cause perf_event_open() to fail if the bit
      is set on kernels that do not have it defined.
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/537D9EBE.7030806@intel.com
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Dave Jones <davej@redhat.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: linux-fsdevel@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  13. 05 Jun, 2014 2 commits
  14. 04 Jun, 2014 3 commits
  15. 03 Jun, 2014 2 commits
    • NVMe: Update data structures for NVMe 1.2 · 23372af1
      Committed by Matthew Wilcox
      Include changes from the current set of ratified Technical Proposals
      for NVMe 1.2.
      Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
    • bridge: Add bridge ifindex to bridge fdb notify msgs · 41c389d7
      Committed by Roopa Prabhu
      (This patch was previously posted as RFC at
      http://patchwork.ozlabs.org/patch/352677/)
      
      This patch adds NDA_MASTER attribute to neighbour attributes enum for
      bridge/master ifindex. And adds NDA_MASTER to bridge fdb notify msgs.
      
      Today bridge fdb notifications don't contain bridge information.
      Userspace can derive it from the port information in the fdb
      notification. However, this is tricky in some scenarios.

      For example, the bridge port delete notification comes before the bridge
      fdb delete notifications. We have seen problems in userspace
      when using libnl where the bridge fdb delete notification handling code
      does not understand which bridge this fdb entry is part of because
      the bridge and port association has already been deleted.
      These notifications (port membership and fdb) are generated on
      separate rtnl groups.
      
      Fixing the order of notifications could possibly solve the problem
      for some cases (I can submit a separate patch for that).
      
      This patch chooses to add NDA_MASTER to bridge fdb notify msgs
      because it not only solves the problem described above, but also helps
      userspace avoid another lookup into link msgs to derive the master index.
      Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
      Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
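The ordering problem described above can be modelled in a few lines. This is a hypothetical userspace sketch in Python, not libnl code: the `port_to_bridge` cache, the ifindex values, and `bridge_for_fdb_event` are all made up for illustration, and the `nda_master` argument stands in for the new NDA_MASTER attribute.

```python
from typing import Dict, Optional

# Hypothetical userspace cache: port ifindex -> bridge (master) ifindex.
port_to_bridge: Dict[int, int] = {10: 1, 11: 1}


def bridge_for_fdb_event(port_ifindex: int, nda_master: Optional[int]) -> Optional[int]:
    """Return the bridge ifindex that an fdb notification belongs to.

    If the notification carries a master ifindex, use it directly.
    Otherwise fall back to the port cache, which fails once the port has
    already been removed from the bridge.
    """
    if nda_master is not None:
        return nda_master
    return port_to_bridge.get(port_ifindex)


# The port delete notification arrives first and drops the association.
del port_to_bridge[10]

print(bridge_for_fdb_event(10, None))  # bridge can no longer be derived
print(bridge_for_fdb_event(10, 1))     # taken from the master attribute
```

With the attribute present, the handler no longer depends on the relative order of notifications from the two rtnl groups.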
  16. 31 May, 2014 1 commit
  17. 30 May, 2014 2 commits
    • KVM: PPC: Add CAP to indicate hcall fixes · f2e91042
      Committed by Alexander Graf
      We worked around some nasty KVM magic page hcall breakages:
      
        1) NX bit not honored, so ignore NX when we detect it
        2) LE guests swizzle hypercall instruction
      
      Without these fixes in place, there's no way it would make sense to expose kvm
      hypercalls to a guest. Chances are immensely high it would trip over and break.
      
      So add a new CAP that gives user space a hint that we have workarounds for the
      bugs above in place. It can use that as a hint to disable PV hypercalls when
      the guest CPU is anything POWER7 or higher and the host does not have fixes
      in place.
      Signed-off-by: Alexander Graf <agraf@suse.de>
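The decision that user space is expected to make from this hint is small enough to write down. The function below is a hypothetical Python sketch: the name and both parameters are illustrative, and a real VMM would obtain the second value by querying the capability from KVM rather than passing a boolean.

```python
def should_disable_pv_hypercalls(guest_is_power7_or_newer: bool,
                                 host_has_hcall_fixes: bool) -> bool:
    """Return True when PV hypercalls should not be exposed to the guest.

    Follows the rule in the commit message: disable them when the guest
    CPU is POWER7 or higher and the host lacks the workarounds.
    """
    return guest_is_power7_or_newer and not host_has_hcall_fixes


print(should_disable_pv_hypercalls(True, False))
print(should_disable_pv_hypercalls(True, True))
```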
    • drm: add DP MST encoder type · 182407a6
      Committed by Dave Airlie
      This adds an encoder type for DP MST encoders.
      Reviewed-by: Todd Previte <tprevite@gmail.com>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
  18. 28 May, 2014 2 commits
  19. 26 May, 2014 3 commits
    • ALSA: bebob: Add hwdep interface · 618eabea
      Committed by Takashi Sakamoto
      This interface is designed for mixer/control application. By using hwdep
      interface, the application can get information about firewire node, can
      lock/unlock kernel streaming and can get notification at starting/stopping
      kernel streaming.
      Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
    • ALSA: fireworks: Add command/response functionality into hwdep interface · 555e8a8f
      Committed by Takashi Sakamoto
      This commit adds two functions to the hwdep interface, adds two parameters
      for this driver, and adds a node to the proc interface.

      To receive responses from devices, this driver already allocates its own
      callback in the initial memory space of the host controller. This means no
      one else can allocate a callback at that address, so this driver must give
      user applications a way to receive responses.
      
      This commit adds functionality to receive responses via the hwdep interface.
      The application can receive responses by reading from this interface. To
      achieve this, this commit adds a buffer to queue responses. The default size
      of this buffer is 1024 bytes. The size can be changed by giving a preferable
      size to the 'resp_buf_size' parameter of this driver. The application should
      keep track of the remaining space in this buffer because this driver doesn't
      push responses when the buffer has no space.
      
      Additionally, this commit adds functionality to transmit commands via the
      hwdep interface. The application can transmit commands by writing to this
      interface. Note that the application can transmit one command at a time, but
      can receive as many responses as possible until the user buffer is full.
      
      When using these interfaces, the application must keep the maximum sequence
      number in its commands within the number defined in firewire.h, because this
      driver uses this number to distinguish whether a response is to a command
      from the application or from this driver.

      Usually only responses to commands which the application transmits are
      pushed into this buffer. But when the 'resp_buf_debug' parameter of this
      driver is enabled, all responses are pushed into the buffer. When using this
      mode, I recommend expanding the size of the buffer.
      
      Finally this commit adds a new node into proc interface to output status of the
      buffer.
      Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
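The buffering behaviour described above (a fixed-size queue that does not accept a response when space runs out, and a read that returns as much as fits) can be modelled as follows. This is a hypothetical Python model, not the driver code: `ResponseBuffer` is an invented name, and the choice to hand out only whole responses is an assumption made for the sketch.

```python
from collections import deque


class ResponseBuffer:
    """Hypothetical model of a fixed-size response queue."""

    def __init__(self, size: int = 1024) -> None:
        self.size = size      # capacity in bytes (1024 is the stated default)
        self.used = 0         # bytes currently queued
        self.queue = deque()  # queued responses, oldest first

    def push(self, response: bytes) -> bool:
        """Queue a response; drop it if it does not fit in the free space."""
        if len(response) > self.size - self.used:
            return False
        self.queue.append(response)
        self.used += len(response)
        return True

    def read(self, user_buf_len: int) -> bytes:
        """Return as many whole responses as fit in the reader's buffer."""
        out = bytearray()
        while self.queue and len(out) + len(self.queue[0]) <= user_buf_len:
            response = self.queue.popleft()
            self.used -= len(response)
            out += response
        return bytes(out)


buf = ResponseBuffer()
print(buf.push(b"\x00" * 600))  # fits
print(buf.push(b"\x01" * 600))  # dropped: only 424 bytes remain
print(len(buf.read(1024)))      # drains the one queued response
```

The dropped second push is the reason the commit message advises applications to watch the remaining space, and to enlarge the buffer when all responses are being captured.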
    • ALSA: fireworks: Add hwdep interface · 594ddced
      Committed by Takashi Sakamoto
      This interface is designed for mixer/control application. By using hwdep
      interface, the application can get information about firewire node, can
      lock/unlock kernel streaming and can get notification at starting/stopping
      kernel streaming.
      Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
  20. 25 May, 2014 3 commits
  21. 24 May, 2014 4 commits
    • [media] v4l: Add source change event · 3cbe6e5b
      Committed by Arun Kumar K
      This event indicates that the video device has encountered
      a source parameter change during runtime. This can typically be a
      resolution change detected by a video decoder OR a format change
      detected by an input connector.
      
      This needs to be notified to the userspace and the application may
      be expected to reallocate buffers before proceeding. The application
      can subscribe to events on a specific pad or input port which
      it is interested in.
      Signed-off-by: Arun Kumar K <arun.kk@samsung.com>
      Acked-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
      Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
      Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
    • l2tp: Add support for zero IPv6 checksums · 6b649fea
      Committed by Tom Herbert
      Added new L2TP configuration options to allow TX and RX of
      zero checksums in IPv6. Default is not to use them.
      Signed-off-by: Tom Herbert <therbert@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: Make enabling of zero UDP6 csums more restrictive · 1c19448c
      Committed by Tom Herbert
      RFC 6935 permits zero checksums to be used in IPv6; however, this is
      recommended only for certain tunnel protocols. It does not make
      checksums completely optional like they are in IPv4.
      
      This patch restricts the use of IPv6 zero checksums that was previously
      introduced. no_check6_tx and no_check6_rx have been added to control
      the use of checksums in the UDP6 RX and TX paths. The normal
      sk_no_check_{rx,tx} settings are not used (this avoids ambiguity when
      dealing with a dual stack socket).

      A helper function has been added (udp_set_no_check6) which can be
      called by tunnel implementations to allow zero checksums (send on the
      socket, and accept them as valid).
      Signed-off-by: Tom Herbert <therbert@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
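The effect of the two per-socket flags can be sketched as a pair of small decision functions. This is an illustrative Python model, not the kernel implementation: the function names are hypothetical, and the flags are passed in as plain booleans rather than read from a socket.

```python
def accept_udp6_datagram(checksum: int, checksum_valid: bool,
                         no_check6_rx: bool) -> bool:
    """Decide whether a received UDP-over-IPv6 datagram is accepted.

    A zero checksum field means the sender did not compute a checksum,
    so it is accepted only when the socket opted in via no_check6_rx.
    """
    if checksum == 0:
        return no_check6_rx
    return checksum_valid


def tx_checksum_field(computed: int, no_check6_tx: bool) -> int:
    """Return the value to place in the UDP checksum field on transmit."""
    if no_check6_tx:
        return 0  # the socket opted in to sending zero checksums
    # A computed checksum of zero is transmitted as all ones, so that a
    # zero field always means "no checksum".
    return computed if computed != 0 else 0xFFFF


print(accept_udp6_datagram(0, False, no_check6_rx=False))
print(accept_udp6_datagram(0, False, no_check6_rx=True))
print(hex(tx_checksum_field(0, no_check6_tx=False)))
```

Keeping separate flags for the IPv6 case means a dual stack socket can leave its IPv4 checksum behaviour untouched.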
    • net-next:v4: Add support to configure SR-IOV VF minimum and maximum Tx rate through ip tool. · ed616689
      Committed by Sucheta Chakraborty
      o min_tx_rate puts lower limit on the VF bandwidth. VF is guaranteed
        to have a bandwidth of at least this value.
        max_tx_rate puts cap on the VF bandwidth. VF can have a bandwidth
        of up to this value.
      
      o A new handler set_vf_rate for attr IFLA_VF_RATE has been introduced
        which takes 4 arguments:
        netdev, VF number, min_tx_rate, max_tx_rate
      
      o ndo_set_vf_rate replaces ndo_set_vf_tx_rate handler.
      
      o Drivers that currently implement ndo_set_vf_tx_rate should now call
        ndo_set_vf_rate instead and reject attempt to set a minimum bandwidth
        greater than 0 for IFLA_VF_TX_RATE when IFLA_VF_RATE is not yet
        implemented by driver.
      
      o If user enters only one of either min_tx_rate or max_tx_rate, then,
        userland should read back the other value from driver and set both
        for IFLA_VF_RATE.
        Drivers that have not yet implemented IFLA_VF_RATE should always
        return min_tx_rate as 0 when read from ip tool.
      
      o If both IFLA_VF_TX_RATE and IFLA_VF_RATE options are specified, then
        IFLA_VF_RATE should override.
      
      o Idea is to have consistent display of rate values to user.
      
      o Usage example: -
      
        ./ip link set p4p1 vf 0 rate 900
      
        ./ip link show p4p1
        32: p4p1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode
        DEFAULT qlen 1000
          link/ether 00:0e:1e:08:b0:f0 brd ff:ff:ff:ff:ff:ff
          vf 0 MAC 3e:a0:ca:bd:ae:5a, tx rate 900 (Mbps), max_tx_rate 900Mbps
          vf 1 MAC f6:c6:7c:3f:3d:6c
          vf 2 MAC 56:32:43:98:d7:71
          vf 3 MAC d6:be:c3:b5:85:ff
          vf 4 MAC ee:a9:9a:1e:19:14
          vf 5 MAC 4a:d0:4c:07:52:18
          vf 6 MAC 3a:76:44:93:62:f9
          vf 7 MAC 82:e9:e7:e3:15:1a
      
        ./ip link set p4p1 vf 0 max_tx_rate 300 min_tx_rate 200
      
        ./ip link show p4p1
        32: p4p1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode
        DEFAULT qlen 1000
          link/ether 00:0e:1e:08:b0:f0 brd ff:ff:ff:ff:ff:ff
          vf 0 MAC 3e:a0:ca:bd:ae:5a, tx rate 300 (Mbps), max_tx_rate 300Mbps,
          min_tx_rate 200Mbps
          vf 1 MAC f6:c6:7c:3f:3d:6c
          vf 2 MAC 56:32:43:98:d7:71
          vf 3 MAC d6:be:c3:b5:85:ff
          vf 4 MAC ee:a9:9a:1e:19:14
          vf 5 MAC 4a:d0:4c:07:52:18
          vf 6 MAC 3a:76:44:93:62:f9
          vf 7 MAC 82:e9:e7:e3:15:1a
      
        ./ip link set p4p1 vf 0 max_tx_rate 600 rate 300
      
        ./ip link show p4p1
        32: p4p1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode
        DEFAULT qlen 1000
          link/ether 00:0e:1e:08:b0:f0 brd ff:ff:ff:ff:ff:ff
          vf 0 MAC 3e:a0:ca:bd:ae:5a, tx rate 600 (Mbps), max_tx_rate 600Mbps,
          min_tx_rate 200Mbps
          vf 1 MAC f6:c6:7c:3f:3d:6c
          vf 2 MAC 56:32:43:98:d7:71
          vf 3 MAC d6:be:c3:b5:85:ff
          vf 4 MAC ee:a9:9a:1e:19:14
          vf 5 MAC 4a:d0:4c:07:52:18
          vf 6 MAC 3a:76:44:93:62:f9
          vf 7 MAC 82:e9:e7:e3:15:1a
      Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
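The userland rules listed above (read back the missing value, and let the new options override the legacy `rate`) can be sketched as one resolution step. This is a hypothetical Python model rather than iproute2 code: `resolve_vf_rate` and its parameters are invented names, and `legacy_rate` stands for the older `rate` option, which the usage example shows being reported as `max_tx_rate`.

```python
from typing import Optional, Tuple


def resolve_vf_rate(
    current: Tuple[int, int],
    min_tx_rate: Optional[int] = None,
    max_tx_rate: Optional[int] = None,
    legacy_rate: Optional[int] = None,
) -> Tuple[int, int]:
    """Return the (min_tx_rate, max_tx_rate) pair to send to the driver.

    current is the pair read back from the driver. A value the user did
    not give is taken from current; an explicit max_tx_rate overrides
    the legacy rate option when both are given.
    """
    cur_min, cur_max = current
    new_min = cur_min if min_tx_rate is None else min_tx_rate
    if max_tx_rate is not None:
        new_max = max_tx_rate
    elif legacy_rate is not None:
        new_max = legacy_rate
    else:
        new_max = cur_max
    return new_min, new_max


# Replays the three "ip link set" commands from the usage example.
state = resolve_vf_rate((0, 0), legacy_rate=900)
print(state)
state = resolve_vf_rate(state, min_tx_rate=200, max_tx_rate=300)
print(state)
state = resolve_vf_rate(state, max_tx_rate=600, legacy_rate=300)
print(state)
```

The three results correspond to the `min_tx_rate` and `max_tx_rate` values shown by `ip link show` after each command in the usage example.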
  22. 23 May, 2014 2 commits