1. 12 Aug, 2015 1 commit
  2. 11 Aug, 2015 4 commits
  3. 10 Aug, 2015 3 commits
  4. 08 Aug, 2015 1 commit
  5. 01 Aug, 2015 5 commits
  6. 31 Jul, 2015 1 commit
    • D
      net: sched: fix refcount imbalance in actions · 28e6b67f
      Daniel Borkmann committed
      Since commit 55334a5d ("net_sched: act: refuse to remove bound action
      outside"), we end up with a wrong reference count for a tc action.
      
      Test case 1:
      
        FOO="1,6 0 0 4294967295,"
        BAR="1,6 0 0 4294967294,"
        tc filter add dev foo parent 1: bpf bytecode "$FOO" flowid 1:1 \
           action bpf bytecode "$FOO"
        tc actions show action bpf
          action order 0: bpf bytecode '1,6 0 0 4294967295' default-action pipe
          index 1 ref 1 bind 1
        tc actions replace action bpf bytecode "$BAR" index 1
        tc actions show action bpf
          action order 0: bpf bytecode '1,6 0 0 4294967294' default-action pipe
          index 1 ref 2 bind 1
        tc actions replace action bpf bytecode "$FOO" index 1
        tc actions show action bpf
          action order 0: bpf bytecode '1,6 0 0 4294967295' default-action pipe
          index 1 ref 3 bind 1
      
      Test case 2:
      
        FOO="1,6 0 0 4294967295,"
        tc filter add dev foo parent 1: bpf bytecode "$FOO" flowid 1:1 action ok
        tc actions show action gact
          action order 0: gact action pass
          random type none pass val 0
           index 1 ref 1 bind 1
        tc actions add action drop index 1
          RTNETLINK answers: File exists [...]
        tc actions show action gact
          action order 0: gact action pass
           random type none pass val 0
           index 1 ref 2 bind 1
        tc actions add action drop index 1
          RTNETLINK answers: File exists [...]
        tc actions show action gact
          action order 0: gact action pass
           random type none pass val 0
           index 1 ref 3 bind 1
      
      What happens is that in tcf_hash_check(), we check tcf_common for a given
      index and increase tcfc_refcnt and conditionally tcfc_bindcnt when we've
      found an existing action. Now there are the following cases:
      
        1) We do a late binding of an action. In that case, we leave the
           tcfc_refcnt/tcfc_bindcnt increased and are done with the ->init()
           handler. This is handled correctly.
      
        2) We replace the given action, or we try to add one without replacing
           and find out that the action at a specific index already exists
           (thus, we go out with error in that case).
      
      In case of 2), we have to undo the reference count increase from
      tcf_hash_check() in the tcf_hash_release() function. Currently, we fail to
      do so because of the 'tcfc_bindcnt > 0' check which bails out early with
      an -EPERM error.
      
      Now, while commit 55334a5d prevents 'tc actions del action ...' on an
      already classifier-bound action from dropping the reference count (which
      could then become negative, wrap around etc), this restriction only
      accounts for invocations outside a specific action's ->init() handler.
      
      One possible solution would be to add a flag so that we trigger the
      -EPERM only in situations where it is indeed relevant.
      
      After the patch, the above test cases show the correct reference count again.
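
      The accounting described above can be modelled in user space. What
      follows is a minimal sketch, not the kernel code: struct action,
      hash_check(), hash_release() and replace_action() are hypothetical
      stand-ins for tcf_common, tcf_hash_check() and the release path, and
      the strict parameter illustrates the flag proposed above under the
      assumption that -EPERM is only wanted for removals from outside ->init().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the counters kept in tcf_common. */
struct action {
    int refcnt;
    int bindcnt;
};

/* Models the lookup: an existing action gets a reference and,
 * when requested by a classifier, a binding as well. */
static void hash_check(struct action *a, bool bind)
{
    assert(a != NULL);
    a->refcnt++;
    if (bind)
        a->bindcnt++;
}

/* Models the release path. With strict set (removal from outside
 * ->init()), a bound action is refused with -EPERM. With strict
 * clear (undo inside ->init()), the reference is always dropped. */
static int hash_release(struct action *a, bool bind, bool strict)
{
    assert(a != NULL);
    if (bind)
        a->bindcnt--;
    else if (strict && a->bindcnt > 0)
        return -EPERM;
    a->refcnt--;
    return 0;
}

/* Models a replace on an existing index: take the lookup
 * reference, update in place, then undo the reference. */
static int replace_action(struct action *a)
{
    hash_check(a, false);
    /* ... the action parameters would be updated here ... */
    return hash_release(a, false, false);
}
```

      Starting from "ref 1 bind 1", calling replace_action() any number of
      times leaves both counters unchanged in this model, while
      hash_release(a, false, true) still refuses to drop the reference of a
      bound action.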
      
      Fixes: 55334a5d ("net_sched: act: refuse to remove bound action outside")
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Reviewed-by: Cong Wang <cwang@twopensource.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  7. 30 Jul, 2015 3 commits
  8. 27 Jul, 2015 6 commits
  9. 25 Jul, 2015 1 commit
    • J
      ipv4: consider TOS in fib_select_default · 2392debc
      Julian Anastasov committed
      fib_select_default considers alternative routes only when
      res->fi is for the first alias in res->fa_head. In the
      common case this can happen only when the initial lookup
      matches the first alias with the highest TOS value. This
      prevents alternative routes from requiring a specific TOS.
      
      This patch solves the problem as follows:
      
      - routes that require a specific TOS should be returned by
      fib_select_default only when the TOS matches, as already done
      in fib_table_lookup. This rule implies that, depending on the
      TOS, we can have many different lists of alternative gateways
      and we have to keep the last used gateway (fa_default) in the first
      alias for the TOS instead of using a single tb_default value.
      
      - as the aliases are ordered by many keys (TOS desc,
      fib_priority asc), we restrict the possible results to
      routes with matching TOS and lowest metric (fib_priority)
      and routes that match any TOS, again with lowest metric.
      
      For example, a packet with TOS 8 cannot use gw3 (not lowest
      metric), gw4 (different TOS) or gw6 (not lowest metric);
      all other gateways can be used:
      
      tos 8 via gw1 metric 2 <--- res->fa_head and res->fi
      tos 8 via gw2 metric 2
      tos 8 via gw3 metric 3
      tos 4 via gw4
      tos 0 via gw5
      tos 0 via gw6 metric 1
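
      The selection rule can be checked against this table with a small
      user-space model. This is a sketch under stated assumptions, not the
      kernel implementation: struct alias and select_candidates() are
      hypothetical, the list is assumed to be sorted by TOS descending and
      then metric ascending, and routes listed without a metric are assumed
      to have metric 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified route alias. */
struct alias {
    const char *gw;
    int tos;     /* 0 matches any TOS */
    int metric;  /* assumed 0 when not given */
};

/* Marks which aliases may serve as default route for pkt_tos:
 * the lowest-metric routes with matching TOS, plus the
 * lowest-metric routes with TOS 0. Returns how many are eligible. */
static size_t select_candidates(const struct alias *fa, size_t n,
                                int pkt_tos, bool *eligible)
{
    size_t count = 0;
    int cur_tos = -1, best_metric = 0;

    assert(fa != NULL && eligible != NULL);
    for (size_t i = 0; i < n; i++) {
        eligible[i] = false;
        if (fa[i].tos != 0 && fa[i].tos != pkt_tos)
            continue;               /* different TOS */
        if (fa[i].tos != cur_tos) { /* first alias of this TOS group */
            cur_tos = fa[i].tos;
            best_metric = fa[i].metric;
        }
        if (fa[i].metric != best_metric)
            continue;               /* not the lowest metric */
        eligible[i] = true;
        count++;
    }
    return count;
}
```

      Fed with the six routes above and a packet TOS of 8, the model marks
      gw1, gw2 and gw5 as usable, which agrees with the description.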
      Reported-by: Hagen Paul Pfeifer <hagen@jauu.net>
      Signed-off-by: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  10. 23 Jul, 2015 7 commits
  11. 22 Jul, 2015 8 commits
    • M
      net: #ifdefify sk_classid member of struct sock · e181a543
      Mathias Krause committed
      The sk_classid member is only required when CONFIG_CGROUP_NET_CLASSID is
      enabled. #ifdefify it to reduce the size of struct sock on 32 bit
      systems, at least.
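
      The pattern can be illustrated with a miniature structure. This is a
      sketch only: struct mini_sock and the mini_sock_classid() accessor are
      hypothetical and do not reflect the real layout of struct sock or any
      helper added by the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical miniature of a socket structure. */
struct mini_sock {
    uint32_t sk_mark;
#ifdef CONFIG_CGROUP_NET_CLASSID
    uint32_t sk_classid;   /* present only when the option is set */
#endif
};

/* Accessor that compiles either way, so callers need no #ifdef
 * of their own; it reports 0 when the member is compiled out. */
static inline uint32_t mini_sock_classid(const struct mini_sock *sk)
{
    assert(sk != NULL);
#ifdef CONFIG_CGROUP_NET_CLASSID
    return sk->sk_classid;
#else
    return 0;
#endif
}
```

      Built without -DCONFIG_CGROUP_NET_CLASSID, the structure shrinks by
      the size of the omitted member.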
      Signed-off-by: Mathias Krause <minipli@googlemail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • T
      openvswitch: Use regular VXLAN net_device device · 614732ea
      Thomas Graf committed
      This gets rid of all OVS specific VXLAN code in the receive and
      transmit path by using a VXLAN net_device to represent the vport.
      Only a small shim layer remains which takes care of handling the
      VXLAN specific OVS Netlink configuration.
      
      Unexports vxlan_sock_add(), vxlan_sock_release(), vxlan_xmit_skb()
      since they are no longer needed.
      Signed-off-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • T
      vxlan: Factor out device configuration · 0dfbdf41
      Thomas Graf committed
      This factors the device configuration out of the RTNL newlink
      API, which allows for in-kernel creation of VXLAN net_devices.
      Signed-off-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • T
      fib: Add fib rule match on tunnel id · e7030878
      Thomas Graf committed
      This adds the ability to select a routing table based on the tunnel
      id, which makes it possible to maintain separate routing tables for
      each virtual tunnel network.
      
      ip rule add from all tunnel-id 100 lookup 100
      ip rule add from all tunnel-id 200 lookup 200
      
      A new static key controls the collection of metadata at tunnel level
      upon demand.
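
      The effect of the two rules above can be modelled as a first-match
      lookup. This is a sketch under stated assumptions: struct rule and
      lookup_table() are hypothetical, and a tun_id of 0 is taken here to
      mean that the rule does not match on the tunnel id at all.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced rule; tun_id 0 is treated as a wildcard. */
struct rule {
    uint64_t tun_id;
    uint32_t table;
};

/* Returns the table of the first rule whose tunnel id matches the
 * packet's, or fallback when no rule matches. */
static uint32_t lookup_table(const struct rule *rules, size_t n,
                             uint64_t pkt_tun_id, uint32_t fallback)
{
    assert(rules != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (rules[i].tun_id == 0 || rules[i].tun_id == pkt_tun_id)
            return rules[i].table;
    }
    return fallback;
}
```

      With rules for tunnel ids 100 and 200, a packet received on tunnel id
      200 is looked up in table 200, and a packet with any other tunnel id
      falls through to the table passed as fallback.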
      Signed-off-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • T
      route: Per route IP tunnel metadata via lightweight tunnel · 3093fbe7
      Thomas Graf committed
      This introduces a new IP tunnel lightweight tunnel type which allows
      IP tunnel instructions to be specified per route. Only IPv4 is
      supported at this point.
      Signed-off-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • T
      route: Extend flow representation with tunnel key · 1b7179d3
      Thomas Graf committed
      Add a new flowi_tunnel structure which is a subset of ip_tunnel_key to
      allow routes to match on tunnel metadata. For now, the tunnel id is
      added to flowi_tunnel which allows for routes to be bound to specific
      virtual tunnels.
      Signed-off-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • T
      vxlan: Flow based tunneling · ee122c79
      Thomas Graf committed
      Allows putting a VXLAN device into a new flow-based mode in which
      skbs with an ip_tunnel_info dst metadata attached will be encapsulated
      according to the instructions stored in there with the VXLAN device
      defaults taken into consideration.
      
      Similarly, on the receive side, if the VXLAN_F_COLLECT_METADATA flag is
      set, the packet processing will populate an ip_tunnel_info struct for
      each packet received and attach it to the skb using the new metadata
      dst. The metadata structure will contain the outer header and tunnel
      header fields which have been stripped off. Layers further up in the
      stack such as routing, tc or netfilter can later match on these fields
      and perform forwarding. It is the responsibility of upper layers to
      ensure that the flag is set if the metadata is needed. The flag limits
      the additional cost of metadata collection to cases where it is needed.
      
      This prepares the VXLAN device to be steered by the routing and other
      subsystems, which makes it possible to support encapsulation for a large
      number of tunnel endpoints and tunnel ids through a single net_device
      and thereby improves scalability.
      
      It also allows for OVS to leverage this mode which in turn allows for
      the removal of the OVS specific VXLAN code.
      
      Because the skb is currently scrubbed in vxlan_rcv(), the attachment of
      the new dst metadata is postponed until after scrubbing, which requires
      the temporary addition of a new member to vxlan_metadata. This member
      is removed again in a later commit after the indirect VXLAN receive API
      has been removed.
      Signed-off-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • T
      dst: Metadata destinations · f38a9eb1
      Thomas Graf committed
      Introduces a new dst_metadata which makes it possible to carry per-packet
      metadata between forwarding and processing elements via the skb->dst
      pointer.
      
      The structure is set up to be a union. Thus, each separate type of
      metadata requires its own dst instance. If demand arises to carry
      multiple types of metadata concurrently, metadata dst entries can be
      made stackable.
      
      The metadata dst entry is refcnt'ed as expected for now but a non
      reference counted use is possible if the reference is forced before
      queueing the skb.
      
      In order to allow allocating dsts with variable length, the existing
      dst_alloc() is split into a dst_alloc() and dst_init() function. The
      existing dst_init() function to initialize the subsystem is being
      renamed to dst_subsys_init() to make it clear what is what.
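
      The reason for the split can be shown with a user-space miniature. This
      is a sketch only: struct mini_dst, struct mini_metadata_dst and the
      mini_*() helpers are hypothetical and merely mirror the idea that a
      separate init step lets a larger, variable-length object reuse the
      common initialisation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical miniature of a dst entry. */
struct mini_dst {
    int refcnt;
    int flags;
};

/* Metadata variant: fixed header plus variable-length options. */
struct mini_metadata_dst {
    struct mini_dst dst;
    size_t optslen;
    unsigned char opts[];   /* flexible array member */
};

/* Initialisation only; usable on memory from any allocation. */
static void mini_dst_init(struct mini_dst *dst, int flags)
{
    assert(dst != NULL);
    dst->refcnt = 1;
    dst->flags = flags;
}

/* Fixed-size allocation, expressed in terms of the init helper. */
static struct mini_dst *mini_dst_alloc(int flags)
{
    struct mini_dst *dst = malloc(sizeof(*dst));

    if (dst)
        mini_dst_init(dst, flags);
    return dst;
}

/* Variable-length allocation reuses the same init helper. */
static struct mini_metadata_dst *mini_metadata_dst_alloc(size_t optslen)
{
    struct mini_metadata_dst *md = malloc(sizeof(*md) + optslen);

    if (!md)
        return NULL;
    mini_dst_init(&md->dst, 0);
    md->optslen = optslen;
    memset(md->opts, 0, optslen);
    return md;
}
```

      Both allocators return memory that the caller releases with free() in
      this model; reference-count handling beyond the initial value is left
      out.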
      
      The check before ip_route_input() is changed to ignore metadata dsts
      and drop the dst inside the routing function, which allows metadata to
      be interpreted in a later commit.
      Signed-off-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>