1. 27 Aug, 2016 8 commits
  2. 26 Aug, 2016 7 commits
  3. 25 Aug, 2016 15 commits
    • net: diag: allow socket bytecode filters to match socket marks · a52e95ab
      Lorenzo Colitti authored
      This allows a privileged process to filter by socket mark when
      dumping sockets via INET_DIAG_BY_FAMILY. This is useful on
      systems that use mark-based routing such as Android.
      
      The ability to filter socket marks requires CAP_NET_ADMIN, which
      is consistent with other privileged operations allowed by the
      SOCK_DIAG interface such as the ability to destroy sockets and
      the ability to inspect BPF filters attached to packet sockets.
      
      Tested: https://android-review.googlesource.com/261350
      Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
      Acked-by: David Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
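The mark filter described in this commit can be modelled roughly as follows. This is a minimal sketch, not the kernel's code: the `MarkCond` type, the value-under-mask comparison, and the `has_net_admin` flag are illustrative assumptions that the commit message does not spell out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MarkCond:
    """Hypothetical filter condition: match when (socket mark & mask) == mark."""
    mark: int
    mask: int


def mark_matches(sk_mark: int, cond: MarkCond) -> bool:
    # Compare only the bits selected by the mask (assumed semantics).
    return (sk_mark & cond.mask) == cond.mark


def filter_sockets(sockets, cond: MarkCond, has_net_admin: bool):
    """Return the sockets whose mark matches; refuse unprivileged callers."""
    if not has_net_admin:
        # Stands in for the CAP_NET_ADMIN requirement named in the commit.
        raise PermissionError("filtering by socket mark requires CAP_NET_ADMIN")
    return [s for s in sockets if mark_matches(s["mark"], cond)]
```

On a system with mark-based routing, a caller could use a mask to select only the bits of the mark that carry, say, a network identifier and ignore the rest.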
    • net: diag: slightly refactor the inet_diag_bc_audit error checks. · 627cc4ad
      Lorenzo Colitti authored
      This simplifies the code a bit and also allows inet_diag_bc_audit
      to send to userspace an error that isn't EINVAL.
      Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
      Acked-by: David Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: rename switch operations structure · 9d490b4e
      Vivien Didelot authored
      Now that the dsa_switch_driver structure contains only function pointers
      as it is supposed to, rename it to the more appropriate dsa_switch_ops,
      consistent with other operations structures in the kernel.
      
      No functional changes here, basically just the result of something like:
      s/dsa_switch_driver *drv/dsa_switch_ops *ops/g
      
      However keep the {un,}register_switch_driver functions and their
      dsa_switch_drivers list as is, since they represent the -- likely to be
      deprecated soon -- legacy DSA registration framework.
      
      In the meantime, also fix the following checks from checkpatch.pl to
      make it happy with this patch:
      
          CHECK: Comparison to NULL could be written "!ops"
          #403: FILE: net/dsa/dsa.c:470:
          +	if (ops == NULL) {
      
          CHECK: Comparison to NULL could be written "ds->ops->get_strings"
          #773: FILE: net/dsa/slave.c:697:
          +		if (ds->ops->get_strings != NULL)
      
          CHECK: Comparison to NULL could be written "ds->ops->get_ethtool_stats"
          #824: FILE: net/dsa/slave.c:785:
          +	if (ds->ops->get_ethtool_stats != NULL)
      
          CHECK: Comparison to NULL could be written "ds->ops->get_sset_count"
          #835: FILE: net/dsa/slave.c:798:
          +		if (ds->ops->get_sset_count != NULL)
      
          total: 0 errors, 0 warnings, 4 checks, 784 lines checked
      Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
      Acked-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bnx2x: Don't flush multicast MACs · c7b7b483
      Yuval Mintz authored
      When ndo_set_rx_mode() is called for bnx2x, as part of the process of
      configuring the new MAC address filters [both unicast & multicast], the
      driver begins by flushing the existing configuration and then iterates
      over the network device's list of addresses and configures those instead.
      
      This has the side-effect of creating a short gap where traffic wouldn't
      be properly classified, as no filters are configured in HW.
      While for unicasts this is rather insignificant [as unicast MACs don't
      frequently change while the interface is actually running],
      for multicast traffic it does pose an issue as there are multicast-based
      networks where new multicast groups would constantly be removed and
      added.
      
      This patch tries to remedy this [at least for the newer adapters] -
      Instead of flushing & reconfiguring all existing multicast filters,
      the driver would instead create the approximate hash match that would
      result from the required filters. It would then compare it against the
      currently configured approximate hash match, and only add and remove the
      delta between those.
      Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
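The delta approach in this commit can be illustrated with a small sketch. The bin count (`NUM_BINS`) and the CRC32-based `mac_to_bin` hash are assumptions made for illustration only; the commit does not describe the hardware's actual hash or table size.

```python
import zlib

NUM_BINS = 256  # assumed size of the approximate-match table


def mac_to_bin(mac: bytes) -> int:
    # Illustrative hash only; the adapter's real hash is not given in the source.
    return zlib.crc32(mac) % NUM_BINS


def required_bins(macs):
    """Approximate hash match that the requested multicast MACs would produce."""
    return {mac_to_bin(mac) for mac in macs}


def hash_delta(current_bins, macs):
    """Return (bins_to_add, bins_to_remove) instead of flushing everything."""
    wanted = required_bins(macs)
    return wanted - current_bins, current_bins - wanted
```

Bins present in both the current and the required match are never touched, so traffic for unchanged groups keeps being classified while the update is applied.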
    • Merge tag 'rxrpc-rewrite-20160824-2' of... · 6546c78e
      David S. Miller authored
      Merge tag 'rxrpc-rewrite-20160824-2' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
      
      David Howells says:
      
      ====================
      rxrpc: Add better client conn management strategy
      
      These two patches add a better client connection management strategy.  They
      need to be applied on top of the just-posted fixes.
      
       (1) Duplicate the connection list and separate out procfs iteration from
           garbage collection.  This is necessary for the next patch as with that
           client connections no longer appear on a single list and may not
           appear on a list at all - and really don't want to be exposed to the
           old garbage collector.
      
           (Note that client conns aren't left dangling, they're also in a tree
           rooted in the local endpoint so that they can be found by a user
           wanting to make a new client call.  Service conns do not appear in
           this tree.)
      
       (2) Implement a better lifetime management and garbage collection strategy
           for client connections.
      
           In this, a client connection can be in one of five cache states
           (inactive, waiting, active, culled and idle).  Limits are set on the
           number of client conns that may be active at any one time, and users
           are made to wait if they want to start a new call when there isn't
           capacity available.
      
           To make capacity available, active and idle connections can be culled,
           after a short delay (to allow for retransmission).  The delay is
           reduced if the capacity exceeds a tunable threshold.
      
           If there is spare capacity, client conns are permitted to hang around
           a fair bit longer (tunable) so as to allow reuse of negotiated
           security contexts.
      
           After this patch, the client conn strategy is separate from that of
           service conns (which continues to use the old code for the moment).
      
           This difference in strategy is because the client side retains control
           over when it allows a connection to become active, whereas the service
           side has no control over when it sees a new connection or a new call
           on an old connection.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'rxrpc-rewrite-20160824-1' of... · d3c10db1
      David S. Miller authored
      Merge tag 'rxrpc-rewrite-20160824-1' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
      
      David Howells says:
      
      ====================
      rxrpc: More fixes
      
      Here are a couple of fix patches:
      
       (1) Fix the conn-based retransmission patch posted yesterday.  This breaks
           if it actually has to retransmit.  However, it seems the likelihood of
           this happening is really low, despite the server I'm testing against
           being located >3000 miles away, and some of the time it's handled
           in the call background processor before we manage to disconnect the
           call - hence why I didn't spot it.
      
       (2) /proc/net/rxrpc_calls can cause a crash if accessed whilst a call is
           being torn down.  The window of opportunity is pretty small, however,
           as calls don't stay in this state for long.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'mlxsw-fdb-learning-offload' · d14c800b
      David S. Miller authored
      Jiri Pirko says:
      
      ====================
      mlxsw: Offload FDB learning configuration
      
      Ido says:
      This patchset addresses two long standing issues in the mlxsw driver
      concerning FDB learning.
      
      Patch 1 limits the number of FDB records processed by the driver in a
      single session. This is useful in situations in which many new records
      need to be processed, thereby causing the RTNL mutex to be held for
      long periods of time.
      
      Patches 2-6 offload the learning configuration (on / off) of bridge
      ports to the device instead of having the driver decide whether a
      record needs to be learned or not.
      
      The last patch is fallout and removes configuration no longer necessary
      after the first patches are applied.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum: Don't set learning when creating vPorts · 0f7a4d8a
      Ido Schimmel authored
      Before commit 99724c18 ("mlxsw: spectrum: Introduce support for
      router interfaces") we used to assign vFIDs to the created vPorts. Since
      these vPorts were used for slow path traffic we had to disable learning
      for them, as it doesn't make sense to have it enabled.
      
      This is no longer the case and now vPorts are either used for router
      interfaces (for which learning is disabled by the firmware) or bridge
      ports (for which learning is explicitly enabled by the driver).
      
      Therefore, we can remove the learning configuration upon vPort creation.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum: Remove unnecessary check in FDB processing · 81f77bc0
      Ido Schimmel authored
      We now offload the learning configuration to the device and don't rely
      on the driver to decide whether to learn the FDB record, so remove the
      check.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum: Offload learning to the switch ASIC · 89b548f0
      Ido Schimmel authored
      Up until now we simply stored the learning configuration of a bridge
      port in the driver and decided whether to learn a new FDB record based
      on this value.
      
      However, this is sub-optimal in cases where learning is disabled on the
      bridge port, as the device repeatedly generates learning notifications
      for the same record.
      
      Instead, offload the learning configuration to the device, thereby
      preventing it from generating notifications when learning is disabled.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum: Configure learning for VLAN-aware bridge port · 584d73df
      Ido Schimmel authored
      We are going to prevent the device from generating learning
      notifications for a port that was configured with learning disabled.
      
      Since learning configuration is done per {Port, VID} we need to apply
      the port's learning configuration for any VID that is added to the
      bridge port's VLAN filter list.
      
      When a VID is added to the VLAN filter list of a VLAN-aware bridge port,
      configure the {Port, VID} learning status according to the port's
      configuration. When the VID is removed, disable learning for the {Port,
      VID}.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
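The per-{Port, VID} rule in this commit can be sketched as a toy model. The `BridgePort` and `Device` classes and their method names are hypothetical; they only mirror the add and remove behaviour described in the message, not the driver's data structures.

```python
class BridgePort:
    """Hypothetical VLAN-aware bridge port with a port-level learning setting."""

    def __init__(self, name: str, learning: bool):
        self.name = name
        self.learning = learning
        self.vids = set()  # VLAN filter list


class Device:
    """Stand-in for the switch: stores the learning state per (port, VID)."""

    def __init__(self):
        self.learning = {}

    def vid_add(self, port: BridgePort, vid: int) -> None:
        port.vids.add(vid)
        # A newly added VID inherits the port's learning configuration.
        self.learning[(port.name, vid)] = port.learning

    def vid_del(self, port: BridgePort, vid: int) -> None:
        port.vids.discard(vid)
        # On removal, learning is disabled for that {Port, VID}.
        self.learning[(port.name, vid)] = False
```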
    • mlxsw: spectrum: Don't abort on first error when removing VLANs · 640be7b7
      Ido Schimmel authored
      When removing VLANs from the VLAN-aware bridge we shouldn't abort on the
      first error, as we'll otherwise have resources that will never be freed.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
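The general pattern behind this fix, continuing a teardown loop past failures so that the remaining resources are still released, might look like the sketch below. The `remove_vlans` helper, its `remove_one` callback, and the choice to collect errors in a list are assumptions for illustration, not the driver's interface.

```python
def remove_vlans(vids, remove_one):
    """Try to remove every VID; collect failures instead of stopping early."""
    errors = []
    for vid in vids:
        try:
            remove_one(vid)
        except OSError as exc:
            # Record the failure and keep going so later VIDs are still freed.
            errors.append((vid, exc))
    return errors
```

Returning on the first exception would leave every later VID configured, which is the leak the commit message warns about.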
    • mlxsw: spectrum: Make VLAN deletion function symmetric · f7a8f6ce
      Ido Schimmel authored
      Commit 05978481 ("mlxsw: spectrum: Create PVID vPort before
      registering netdevice") removed __mlxsw_sp_port_vlans_del() from the
      init sequence of the driver, which forced it to be non-symmetric with
      regard to __mlxsw_sp_port_vlans_add().
      
      Make both functions symmetric as the constraint no longer exists.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum: Limit number of FDB records per learning session · 1803e0fb
      Ido Schimmel authored
      Up until now a learning session ended whenever the number of queried
      records was zero. This turned out to be problematic in situations where
      a large number of MACs (48K) had to be processed by the switch driver,
      as RTNL mutex is held during the learning session.
      
      Instead, limit the number of FDB records that can be processed in a
      session to 64. This means that every time the device is queried for
      learning notifications (currently, every 100ms), up to 64 records will
      be processed by the switch driver.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
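The bounded session described in this commit can be sketched as follows. The `pending` queue and the `process` callback are hypothetical stand-ins for the device's notification queue and the driver's record handler; only the limit of 64 comes from the commit message.

```python
from collections import deque

MAX_RECORDS_PER_SESSION = 64  # limit stated in the commit message


def learning_session(pending: deque, process) -> int:
    """Process at most MAX_RECORDS_PER_SESSION records, then yield."""
    handled = 0
    while pending and handled < MAX_RECORDS_PER_SESSION:
        process(pending.popleft())
        handled += 1
    return handled
```

Each call corresponds to one periodic query of the device; whatever is left in the queue waits for the next query, so the lock held during a session is released after a bounded amount of work.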
    • Merge tag 'shared-for-4.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma · fff84d2a
      David S. Miller authored
      Saeed Mahameed says:
      
      ====================
      Mellanox mlx5 core driver updates 2016-08-24
      
      This series contains some low level and API updates for mlx5 core
      driver interface and mlx5_ifc.h, plus mlx5 LAG core driver support,
      to be shared as base code for net-next and rdma mlx5 4.9 submissions.
      
      From Alex and Artemy, Update mlx5_ifc for modify RQ and XRC bits.
      
      From Noa, Expose mlx5 link modes so they can be used in RDMA tree for rdma tools.
      
      From Aviv, LAG support needed for RDMA.
          - Add needed hardware structures, layouts and interface
          - mlx5 core driver LAG implementation
          - Introduce mlx5 core driver LAG API for mlx5_ib
      
      From Maor, add two low level patches for mlx5 hardware sniffer QP
      infrastructure bits and capabilities, plus add the namespace for sniffer
      steering tables.  Needed for RDMA subtree.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 24 Aug, 2016 10 commits
    • rxrpc: Improve management and caching of client connection objects · 45025bce
      David Howells authored
      Improve the management and caching of client rxrpc connection objects.
      From this point, client connections will be managed separately from service
      connections because AF_RXRPC controls the creation and re-use of client
      connections but doesn't have that luxury with service connections.
      
      Further, there will be limits on the numbers of client connections that may
      be live on a machine.  No direct restriction will be placed on the number
      of client calls, excepting that each client connection can support a
      maximum of four concurrent calls.
      
      Note that, for a number of reasons, we don't want to simply discard a
      client connection as soon as the last call is apparently finished:
      
       (1) Security is negotiated per-connection and the context is then shared
           between all calls on that connection.  The context can be negotiated
           again if the connection lapses, but that involves holding up calls
           whilst at least two packets are exchanged and various crypto bits are
           performed - so we'd ideally like to cache it for a little while at
           least.
      
       (2) If a packet goes astray, we will need to retransmit a final ACK or
           ABORT packet.  To make this work, we need to keep around the
           connection details for a little while.
      
       (3) The locally held structures represent some amount of setup time, to be
           weighed against their occupation of memory when idle.
      
      
      To this end, the client connection cache is managed by a state machine on
      each connection.  There are five states:
      
       (1) INACTIVE - The connection is not held in any list and may not have
           been exposed to the world.  If it has been previously exposed, it was
           discarded from the idle list after expiring.
      
       (2) WAITING - The connection is waiting for the number of client conns to
           drop below the maximum capacity.  Calls may be in progress upon it
           from when it was active and got culled.
      
           The connection is on the rxrpc_waiting_client_conns list which is kept
           in to-be-granted order.  Culled conns with waiters go to the back of
           the queue just like new conns.
      
       (3) ACTIVE - The connection has at least one call in progress upon it, it
           may freely grant available channels to new calls and calls may be
           waiting on it for channels to become available.
      
           The connection is on the rxrpc_active_client_conns list which is kept
           in activation order for culling purposes.
      
       (4) CULLED - The connection got summarily culled to try and free up
           capacity.  Calls currently in progress on the connection are allowed
           to continue, but new calls will have to wait.  There can be no waiters
           in this state - the conn would have to go to the WAITING state
           instead.
      
       (5) IDLE - The connection has no calls in progress upon it and must have
           been exposed to the world (ie. the EXPOSED flag must be set).  When it
           expires, the EXPOSED flag is cleared and the connection transitions to
           the INACTIVE state.
      
           The connection is on the rxrpc_idle_client_conns list which is kept in
           order of how soon they'll expire.
      
      A connection in the ACTIVE or CULLED state must have at least one active
      call upon it; if in the WAITING state it may have active calls upon it;
      other states may not have active calls.
      
      As long as a connection remains active and doesn't get culled, it may
      continue to process calls - even if there are connections on the wait
      queue.  This simplifies things a bit and reduces the amount of checking we
      need to do.
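The five cache states and the invariants stated above can be written down as a small model. This is a sketch under assumptions: the `ClientConn` class, its field names, and the `check_invariants` and `expire_idle` helpers are hypothetical and do not reflect the kernel's structures; only the states, the four-call limit, and the rules being checked come from the commit message.

```python
from dataclasses import dataclass
from enum import Enum, auto

MAX_CALLS_PER_CONN = 4  # "a maximum of four concurrent calls"


class CacheState(Enum):
    INACTIVE = auto()
    WAITING = auto()
    ACTIVE = auto()
    CULLED = auto()
    IDLE = auto()


@dataclass
class ClientConn:
    state: CacheState = CacheState.INACTIVE
    active_calls: int = 0
    exposed: bool = False  # models the EXPOSED flag


def check_invariants(conn: ClientConn) -> bool:
    """Check the per-state rules on active calls and the EXPOSED flag."""
    if not 0 <= conn.active_calls <= MAX_CALLS_PER_CONN:
        return False
    if conn.state in (CacheState.ACTIVE, CacheState.CULLED):
        return conn.active_calls >= 1
    if conn.state is CacheState.WAITING:
        return True  # may or may not have calls in progress
    if conn.state is CacheState.IDLE:
        return conn.active_calls == 0 and conn.exposed
    return conn.active_calls == 0  # INACTIVE


def expire_idle(conn: ClientConn) -> None:
    """IDLE -> INACTIVE on expiry, clearing the EXPOSED flag."""
    if conn.state is not CacheState.IDLE:
        raise ValueError("only idle connections expire")
    conn.exposed = False
    conn.state = CacheState.INACTIVE
```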
      
      
      There are a couple of flags of relevance to the cache:
      
       (1) EXPOSED - The connection ID got exposed to the world.  If this flag is
           set, an extra ref is added to the connection preventing it from being
           reaped when it has no calls outstanding.  This flag is cleared and the
           ref dropped when a conn is discarded from the idle list.
      
       (2) DONT_REUSE - The connection should be discarded as soon as possible and
           should not be reused.
      
      
      This commit also provides a number of new settings:
      
       (*) /proc/net/rxrpc/max_client_conns
      
           The maximum number of live client connections.  Above this number, new
           connections get added to the wait list and must wait for an active
           conn to be culled.  Culled connections can be reused, but they will go
           to the back of the wait list and have to wait.
      
       (*) /proc/net/rxrpc/reap_client_conns
      
           If the number of desired connections exceeds the maximum above, the
           active connection list will be culled until there are only this many
           left in it.
      
       (*) /proc/net/rxrpc/idle_conn_expiry
      
           The normal expiry time for a client connection, provided there are
           fewer than reap_client_conns of them around.
      
       (*) /proc/net/rxrpc/idle_conn_fast_expiry
      
           The expedited expiry time, used when there are more than
           reap_client_conns of them around.
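How these settings interact can be sketched with two small helpers. The function names are hypothetical, and the handling of the exact boundary (a count equal to the limit) is an assumption, since the description above only covers "fewer than" and "more than".

```python
def must_wait(live_conns: int, max_client_conns: int) -> bool:
    """A new connection waits once the live count has reached the maximum.

    Treating a count equal to the maximum as full is an assumption.
    """
    return live_conns >= max_client_conns


def idle_expiry(conn_count: int, reap_client_conns: int,
                idle_conn_expiry: int, idle_conn_fast_expiry: int) -> int:
    """Pick the expiry time for an idle client connection."""
    if conn_count > reap_client_conns:
        return idle_conn_fast_expiry  # expedited expiry under pressure
    # A count equal to reap_client_conns uses the normal expiry (assumption).
    return idle_conn_expiry
```

For example, with made-up values of 120 seconds for the normal expiry and 2 seconds for the fast one, `idle_expiry(901, 900, 120, 2)` selects the fast expiry while `idle_expiry(10, 900, 120, 2)` keeps the connection around for the normal period.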
      
      
      Note that I combined the Tx wait queue with the channel grant wait queue to
      save space as only one of these should be in use at once.
      
      Note also that, for the moment, the service connection cache still uses the
      old connection management code.
      Signed-off-by: David Howells <dhowells@redhat.com>
    • rxrpc: Dup the main conn list for the proc interface · 4d028b2c
      David Howells authored
      The main connection list is used for two independent purposes: primarily it
      is used to find connections to reap and secondarily it is used to list
      connections in procfs.
      
      Split the procfs list out from the reap list.  This allows us to stop using
      the reap list for client connections when they acquire a separate
      management strategy from service connections.
      
      The client connections will not be on a single management list, and sometimes
      won't be on a management list at all.  This doesn't leave them floating,
      however, as they will also be on an rb-tree rooted on the socket so that the
      socket can find them to dispatch calls.
      Signed-off-by: David Howells <dhowells@redhat.com>
    • rxrpc: Make /proc/net/rxrpc_calls safer · df5d8bf7
      David Howells authored
      Make /proc/net/rxrpc_calls safer by stashing a copy of the peer pointer in
      the rxrpc_call struct and checking in the show routine that the peer
      pointer, the socket pointer and the local pointer obtained from the socket
      pointer aren't NULL before we use them.
      Signed-off-by: David Howells <dhowells@redhat.com>
    • rxrpc: Fix conn-based retransmit · 2266ffde
      David Howells authored
      If a duplicate packet comes in for a call that has just completed on a
      connection's channel then there will be an oops in the data_ready handler
      because it tries to examine the connection struct via a call struct (which
      we don't have - the pointer is unset).
      
      Since the connection struct pointer is available to us, go direct instead.
      
      Also, the ACK packet to be retransmitted needs three octets of padding
      between the soft ack list and the ackinfo.
      
      Fixes: 18bfeba5 ("rxrpc: Perform terminal call ACK/ABORT retransmission from conn processor")
      Signed-off-by: David Howells <dhowells@redhat.com>
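The padding requirement mentioned in this commit can be shown with a byte-level sketch of the tail of an ACK packet. Only the three padding octets come from the commit message; the `build_ack_tail` helper, the example soft-ack bytes, and the layout of the ackinfo block as four 32-bit big-endian values are assumptions for illustration.

```python
import struct

ACK_PADDING = 3  # octets between the soft ack list and the ackinfo


def build_ack_tail(soft_acks: bytes, ackinfo: bytes) -> bytes:
    """Concatenate the soft ack list, three zero octets, and the ackinfo."""
    return soft_acks + bytes(ACK_PADDING) + ackinfo


# Hypothetical example: three soft-ack entries and a made-up ackinfo block.
soft_acks = bytes([1, 1, 0])
ackinfo = struct.pack("!4I", 1444, 1444, 16, 4)
tail = build_ack_tail(soft_acks, ackinfo)
```

Omitting the padding would shift the ackinfo by three octets, so a receiver parsing at the expected offset would read the wrong values.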
    • Merge branch 'remove-clear_sk' · e53ee454
      David S. Miller authored
      Eric Dumazet says:
      
      ====================
      net: remove clear_sk() method
      
      Since IPv6 socket lookups no longer dereference pinet6 pointer
      and UDP lost SLAB_DESTROY_BY_RCU special rules, we no longer
      need special clear_sk() methods.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: remove clear_sk() method · ba2489b0
      Eric Dumazet authored
      We no longer use this handler, so we can delete it.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: tcp: get rid of tcp_v6_clear_sk() · 391bb6be
      Eric Dumazet authored
      Now that RCU lookups of IPv6 TCP sockets no longer dereference pinet6,
      we do not need tcp_v6_clear_sk() anymore.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • udp: get rid of sk_prot_clear_portaddr_nulls() · 4cac8204
      Eric Dumazet authored
      Since we no longer use SLAB_DESTROY_BY_RCU for UDP,
      we do not need sk_prot_clear_portaddr_nulls() helper.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: udp: remove udp_v6_clear_sk() · 6a6ad2a4
      Eric Dumazet authored
      Now that RCU lookups of IPv6 UDP sockets no longer dereference the
      pinet6 field, we can get rid of the udp_v6_clear_sk() helper.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: diag: support SOCK_DESTROY for UDP sockets · 5d77dca8
      David Ahern authored
      This implements SOCK_DESTROY for UDP sockets similar to what was done
      for TCP with commit c1e64e29 ("net: diag: Support destroying TCP
      sockets.") A process with a UDP socket targeted for destroy is awakened
      and recvmsg fails with ECONNABORTED.
      Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>