1. 03 Nov, 2016 26 commits
  2. 02 Nov, 2016 7 commits
    • F
      netfilter: nf_queue: place volatile data in own cacheline · 886bc503
      Florian Westphal committed
      As the comment indicates, the data at the end of nfqnl_instance struct is
      written on every queue/dequeue, so it should reside in its own cacheline.
      
      Before this change, 'lock' was in first cacheline so we dirtied both.
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
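The layout idea in this commit can be sketched in userspace C. This is only an illustration: the struct, its field names, the 64-byte line size and the helper `hot_fields_isolated()` are assumptions made for the example, not the kernel's `nfqnl_instance` definition, and C11 `alignas` stands in for whatever alignment annotation the kernel uses.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

#define CACHELINE 64 /* assumption: 64-byte cache lines */

/* Hypothetical stand-in for a queue instance; field names are illustrative. */
struct queue_instance {
	/* Read-mostly configuration. */
	unsigned int peer_portid;
	unsigned int queue_maxlen;
	unsigned short queue_num;

	/* Written on every enqueue/dequeue: forced onto a fresh cache line. */
	alignas(CACHELINE) int lock; /* stand-in for a spinlock */
	unsigned int queue_total;
	unsigned int id_sequence;
};

/* Compile-time check that the hot block starts on a cache line boundary. */
static_assert(offsetof(struct queue_instance, lock) % CACHELINE == 0,
	      "hot fields must start on a cache line boundary");

/* Returns 1 if the read-mostly fields end before the hot fields begin. */
static int hot_fields_isolated(void)
{
	size_t hot = offsetof(struct queue_instance, lock);
	size_t cold_end = offsetof(struct queue_instance, queue_num) +
			  sizeof(unsigned short);

	return hot % CACHELINE == 0 && cold_end <= hot;
}
```

With the lock grouped next to the counters it protects, a queue or dequeue operation in this model dirties one line instead of two.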
    • L
      netfilter: nf_tables: remove useless U8_MAX validation · e41e9d62
      Liping Zhang committed
      After calling nft_data_init, size is already validated and desc.len will
      not exceed the sizeof(struct nft_data), i.e. 16 bytes. So it will never
      exceed U8_MAX.
      
      Furthermore, in nft_immediate_init, we forget to call nft_data_uninit
      when desc.len exceeds U8_MAX. This cannot happen in practice, but it is
      still a logic error.
      
      Now remove this redundant validation introduced by commit 36b701fa
      ("netfilter: nf_tables: validate maximum value of u32 netlink attributes")
      Signed-off-by: Liping Zhang <zlpnobody@gmail.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
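The argument for dropping the check can be illustrated with a small sketch. The type `struct data_blob`, the function `blob_init()` and its error codes are hypothetical stand-ins; only the 16-byte bound is taken from the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical 16-byte container, standing in for the data register type. */
struct data_blob {
	uint32_t data[4];
};

/* The container size is far below the 8-bit maximum. */
static_assert(sizeof(struct data_blob) <= UINT8_MAX,
	      "a validated length always fits in 8 bits");

/* Validate len against the container and store it in an 8-bit field. */
static int blob_init(unsigned int len, uint8_t *dlen)
{
	if (len == 0)
		return -EINVAL;
	if (len > sizeof(struct data_blob))
		return -EOVERFLOW;

	/* No further "len > UINT8_MAX" check is needed: len <= 16 here. */
	*dlen = (uint8_t)len;
	return 0;
}
```

Once the initializer has bounded the length by the container size, a later comparison against the 8-bit maximum can never be true, so it only adds an error path that is easy to get wrong.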
    • A
      netfilter: nf_tables: introduce routing expression · 2fa84193
      Anders K. Pedersen committed
      Introduces an nftables rt expression for routing related data with support
      for nexthop (i.e. the directly connected IP address that an outgoing packet
      is sent to), which can be used either for matching or accounting, e.g.
      
       # nft add rule filter postrouting \
      	ip daddr 192.168.1.0/24 rt nexthop != 192.168.0.1 drop
      
      This will drop any traffic to 192.168.1.0/24 that is not routed via
      192.168.0.1.
      
       # nft add rule filter postrouting \
      	flow table acct { rt nexthop timeout 600s counter }
       # nft add rule ip6 filter postrouting \
      	flow table acct { rt nexthop timeout 600s counter }
      
      These rules count outgoing traffic per nexthop. Note that the timeout
      releases an entry if no traffic is seen for this nexthop within 10 minutes.
      
       # nft add rule inet filter postrouting \
      	ether type ip \
      	flow table acct { rt nexthop timeout 600s counter }
       # nft add rule inet filter postrouting \
      	ether type ip6 \
      	flow table acct { rt nexthop timeout 600s counter }
      
      Same as above, but via the inet family, where the ether type must be
      specified explicitly.
      
      "rt classid" is also implemented, identical to "meta rtclassid", since it
      is more logical to have this match in the routing expression going forward.
      Signed-off-by: Anders K. Pedersen <akp@cohaesio.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
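The per-nexthop accounting with an idle timeout described in this commit message can be modelled roughly as follows. This is a userspace sketch under stated assumptions, not how nftables flow tables are implemented: the fixed-size table, the names `acct_table`, `acct_expire()` and `acct_count()`, and the "idle for at least 600 seconds" boundary are all illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ACCT_SLOTS   8
#define ACCT_TIMEOUT 600 /* seconds, mirroring "timeout 600s" */

struct acct_entry {
	uint32_t nexthop;   /* IPv4 nexthop address */
	uint64_t packets;   /* packets counted for this nexthop */
	uint64_t last_seen; /* time of the last packet, in seconds */
	int in_use;
};

struct acct_table {
	struct acct_entry slot[ACCT_SLOTS];
};

/* Release entries that have seen no traffic for ACCT_TIMEOUT seconds. */
static void acct_expire(struct acct_table *t, uint64_t now)
{
	for (size_t i = 0; i < ACCT_SLOTS; i++) {
		struct acct_entry *e = &t->slot[i];

		if (e->in_use && now - e->last_seen >= ACCT_TIMEOUT)
			e->in_use = 0;
	}
}

/* Count one packet; returns the new packet count, or 0 if the table is full. */
static uint64_t acct_count(struct acct_table *t, uint32_t nexthop, uint64_t now)
{
	struct acct_entry *free_slot = NULL;

	acct_expire(t, now);
	for (size_t i = 0; i < ACCT_SLOTS; i++) {
		struct acct_entry *e = &t->slot[i];

		if (e->in_use && e->nexthop == nexthop) {
			e->last_seen = now; /* traffic refreshes the timeout */
			return ++e->packets;
		}
		if (!e->in_use && !free_slot)
			free_slot = e;
	}
	if (!free_slot)
		return 0;
	*free_slot = (struct acct_entry){ .nexthop = nexthop, .packets = 1,
					  .last_seen = now, .in_use = 1 };
	return 1;
}
```

Each packet refreshes the entry for its nexthop, so a counter only disappears after a full timeout period without traffic, and a later packet starts a fresh entry at 1.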
    • P
      netfilter: move socket lookup infrastructure to nf_socket_ipv{4,6}.c · 8db4c5be
      Pablo Neira Ayuso committed
      We need this split to reuse the existing codebase for the upcoming
      nf_tables socket expression.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • P
      netfilter: nf_log: add packet logging for netdev family · 1fddf4ba
      Pablo Neira Ayuso committed
      Move layer 2 packet logging into nf_log_l2packet() that resides in
      nf_log_common.c, so this can be shared by both bridge and netdev
      families.
      
      This patch adds the boilerplate code to register the netdev logging
      family.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • F
      netfilter: nf_tables: add fib expression · f6d0cbcf
      Florian Westphal committed
      Add FIB expression, supported for ipv4, ipv6 and inet family (the latter
      just dispatches to ipv4 or ipv6 one based on nfproto).
      
      Currently supports fetching output interface index/name and the
      rtm_type associated with an address.
      
      This can be used for adding path filtering. rtm_type is useful
      to e.g. enforce a strong-end host model where packets
      are only accepted if daddr is configured on the interface the
      packet arrived on.
      
      The fib expression is a native nftables alternative to the
      xtables addrtype and rp_filter matches.
      
      FIB result order for oif/oifname retrieval is as follows:
       - if packet is local (skb has rtable, RTF_LOCAL set, this
         will also catch looped-back multicast packets), set oif to
         the loopback interface.
       - if fib lookup returns an error, or result points to local,
         store zero result.  This means '--local' option of -m rpfilter
         is not supported. It is possible to use 'fib type local' or add
         explicit saddr/daddr matching rules to create exceptions if this
         is really needed.
       - store result in the destination register.
         In case of multiple routes, search set for desired oif in case
         strict matching is requested.
      
      ipv4 and ipv6 fib expressions are supposed to behave the same.
      
      [ I have collapsed Arnd Bergmann's ("netfilter: nf_tables: fib warnings")
      
      	http://patchwork.ozlabs.org/patch/688615/
      
        to address fallout from this patch after rebasing nf-next, that was
        posted to address compilation warnings. --pablo ]
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
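The result order listed in this commit message for oif retrieval can be written down as a small decision function. This is a simplified model, not the kernel code: `struct fib_lookup_result`, `fib_oif_result()`, the loopback index value of 1, and returning zero when strict matching finds no route via the desired interface are assumptions made for the sketch.

```c
#include <assert.h>
#include <stddef.h>

#define LOOPBACK_IFINDEX 1 /* assumption: loopback has interface index 1 */

enum route_type { ROUTE_UNICAST, ROUTE_LOCAL };

/* Hypothetical lookup result: an error code, a route type and candidate oifs. */
struct fib_lookup_result {
	int err;
	enum route_type type;
	const int *oifs; /* output interface of each matching route */
	size_t n_oifs;
};

/* Returns the interface index to store in the destination register. */
static int fib_oif_result(int pkt_is_local, const struct fib_lookup_result *res,
			  int strict, int want_oif)
{
	/* 1. Locally generated or looped-back packet: loopback interface. */
	if (pkt_is_local)
		return LOOPBACK_IFINDEX;

	/* 2. Lookup error, or the result points to local: zero result. */
	if (res->err || res->type == ROUTE_LOCAL || res->n_oifs == 0)
		return 0;

	/* 3. Store the result; with strict matching, search for want_oif. */
	if (!strict)
		return res->oifs[0];
	for (size_t i = 0; i < res->n_oifs; i++)
		if (res->oifs[i] == want_oif)
			return want_oif;
	return 0;
}
```

A zero result then acts as "no usable route" for the rule that consumes the register, which is how a reverse path check can be expressed as a comparison against zero.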
    • W
      genetlink: fix error return code in genl_register_family() · 22ca904a
      Wei Yongjun committed
      Fix to return a negative error code from the idr_alloc() error handling
      case instead of 0, as done elsewhere in this function.
      
      Also fix the return value check of idr_alloc(), since idr_alloc() returns
      negative errors on failure, not zero.
      
      Fixes: 2ae0f17d ("genetlink: use idr to track families")
      Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
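The two fixes in this commit follow a common error-handling pattern, sketched below in userspace C. The allocator `fake_idr_alloc()`, the ID range constants and `register_family()` are hypothetical stand-ins; the only behaviour taken from the commit message is that the allocator reports failure with a negative error code, never with zero.

```c
#include <assert.h>
#include <errno.h>

#define ID_START 16   /* hypothetical first usable ID */
#define ID_END   1024 /* hypothetical end of range, exclusive */

/* Stand-in allocator: returns the ID on success, a negative errno on failure. */
static int fake_idr_alloc(int next_free, int start, int end)
{
	if (next_free < start || next_free >= end)
		return -ENOSPC;
	return next_free;
}

/* Registers a family and stores its ID; returns 0 or a negative errno. */
static int register_family(int next_free, int *id_out)
{
	int err = 0;
	int id = fake_idr_alloc(next_free, ID_START, ID_END);

	/* Test for "< 0", not "== 0": failure is a negative value. */
	if (id < 0) {
		err = id; /* propagate the error instead of returning 0 */
		goto errout;
	}
	*id_out = id;
errout:
	return err;
}
```

Checking for zero would miss every real failure, and jumping to the error label without setting `err` would report success to the caller.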
  3. 01 Nov, 2016 7 commits
    • D
      net: Enable support for VRF with ipv4 multicast · e58e4159
      David Ahern committed
      Enable support for IPv4 multicast:
      - similar to unicast the flow struct is updated to L3 master device
        if relevant prior to calling fib_rules_lookup. The table id is saved
        to the lookup arg so the rule action for ipmr can return the table
        associated with the device.
      
      - ip_mr_forward needs to check for master device mismatch as well
        since the skb->dev is set to it
      
      - allow multicast address on VRF device for Rx by checking for the
        daddr in the VRF device as well as the original ingress device
      
      - on Tx need to drop to __mkroute_output when FIB lookup fails for
        multicast destination address.
      
      - if CONFIG_IP_MROUTE_MULTIPLE_TABLES is enabled VRF driver creates
        IPMR FIB rules on first device create similar to FIB rules. In
        addition the VRF driver does not divert IPv4 multicast packets:
        it breaks on Tx since the fib lookup fails on the mcast address.
      
      With this patch, ipmr forwarding and local rx/tx work.
      Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • D
      Merge branch 'tipc-socket-layer-improvements' · 1c851758
      David S. Miller committed
      Parthasarathy Bhuvaragan says:
      
      ====================
      tipc: socket layer improvements
      
      The following issues with the current socket layer hinder the socket
      diagnostics implementation, which led to this patch series.
      
      1. tipc socket state is derived from multiple variables like
         sock->state, tsk->probing_state and tsk->connected. This style forces
         us to export multiple attributes to the user space, which has to be
         backward compatible.
      
      2. Abuse of sock->state cannot be exported to user-space without
         requiring tipc specific hacks in the user-space.
         - For connection less (CL) sockets sock->state is overloaded to
           tipc state SS_READY.
         - For connection oriented (CO) listening socket sock->state is
           overloaded to tipc state SS_LISTEN.
      
      This series is split into four:
      1. Bug fixes in patch #1,2,3.
      2. Minor cleanups in patch#4-5.
      3. Express all tipc states using a single variable in patch#6-8.
      4. Migrate the new tipc states to sk->sk_state in patch#9-16.
      
      The figures below represent the FSM after this series:
      
      Stream Server Listening Socket:
      +-----------+       +-------------+
      | TIPC_OPEN |------>| TIPC_LISTEN |
      +-----------+       +-------------+
      
      Stream Server Data Socket:
      +-----------+       +------------------+
      | TIPC_OPEN |------>| TIPC_ESTABLISHED |
      +-----------+       +------------------+
                                ^   |
                                |   |
                                |   v
                          +--------------------+
                          | TIPC_DISCONNECTING |
                          +--------------------+
      
      Stream Socket Client:
      +-----------+       +-----------------+
      | TIPC_OPEN |------>| TIPC_CONNECTING |------+
      +-----------+       +-----------------+      |
                                  |                |
                                  |                |
                                  v                |
                          +------------------+     |
                          | TIPC_ESTABLISHED |     |
                          +------------------+     |
                                ^   |              |
                                |   |              |
                                |   v              |
                          +--------------------+   |
                          | TIPC_DISCONNECTING |<--+
                          +--------------------+
      
      NOTE:
      This is just a base refactoring required for socket diagnostics.
      TIPC socket diagnostics support will be introduced in a later series.
      
      v2: - remove extra cast and parenthesis as suggested by David S. Miller in #4.
          - map new tipc state values to tcp states to address Eric Dumazet's concern,
            thus allow the usage of generic sk_* helpers. This is done in patch#10-15.
          - remove TIPC_PROBING state and replace it with probe_unacked flag in #11.
          - replace the TIPC_CLOSING state in v1 with sk_shutdown flag in #14.
          - introduce __tipc_shutdown() to avoid code duplication in #14.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
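The state diagrams in this cover letter can be encoded as a transition check over a single state variable. The sketch is a model only: the enum values are arbitrary (the v2 notes say the real values are mapped to TCP states), `tipc_transition_ok()` and `tipc_set_state()` are hypothetical helpers, and only the forward transitions are encoded. The upward arrow drawn between TIPC_DISCONNECTING and TIPC_ESTABLISHED is deliberately left out, which is an assumption of this example.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Illustrative values; not the kernel's numeric state values. */
enum tipc_state {
	TIPC_OPEN,
	TIPC_LISTEN,
	TIPC_CONNECTING,
	TIPC_ESTABLISHED,
	TIPC_DISCONNECTING,
};

/* True if the diagrams contain a forward edge from 'from' to 'to'. */
static bool tipc_transition_ok(enum tipc_state from, enum tipc_state to)
{
	switch (to) {
	case TIPC_LISTEN:     /* listening server socket */
	case TIPC_CONNECTING: /* client socket */
		return from == TIPC_OPEN;
	case TIPC_ESTABLISHED: /* server data socket or connected client */
		return from == TIPC_OPEN || from == TIPC_CONNECTING;
	case TIPC_DISCONNECTING:
		return from == TIPC_CONNECTING || from == TIPC_ESTABLISHED;
	default:
		return false; /* nothing transitions back to TIPC_OPEN */
	}
}

/* Applies a transition; returns 0 on success, -EINVAL if it is not allowed. */
static int tipc_set_state(enum tipc_state *cur, enum tipc_state next)
{
	if (!tipc_transition_ok(*cur, next))
		return -EINVAL;
	*cur = next;
	return 0;
}
```

Keeping every transition behind one setter is what makes a single exported state variable practical: a diagnostics consumer only needs to read that one value.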
    • P
      tipc: remove SS_CONNECTED sock state · f40acbaf
      Parthasarathy Bhuvaragan committed
      In this commit, we replace references to sock->state SS_CONNECTED
      with sk_state TIPC_ESTABLISHED.
      
      Finally, the sock->state is no longer explicitly used by tipc.
      The FSM below is for various types of connection oriented sockets.
      
      Stream Server Listening Socket:
      +-----------+       +-------------+
      | TIPC_OPEN |------>| TIPC_LISTEN |
      +-----------+       +-------------+
      
      Stream Server Data Socket:
      +-----------+       +------------------+
      | TIPC_OPEN |------>| TIPC_ESTABLISHED |
      +-----------+       +------------------+
                                ^   |
                                |   |
                                |   v
                          +--------------------+
                          | TIPC_DISCONNECTING |
                          +--------------------+
      
      Stream Socket Client:
      +-----------+       +-----------------+
      | TIPC_OPEN |------>| TIPC_CONNECTING |------+
      +-----------+       +-----------------+      |
                                  |                |
                                  |                |
                                  v                |
                          +------------------+     |
                          | TIPC_ESTABLISHED |     |
                          +------------------+     |
                                ^   |              |
                                |   |              |
                                |   v              |
                          +--------------------+   |
                          | TIPC_DISCONNECTING |<--+
                          +--------------------+
      Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • P
      tipc: create TIPC_CONNECTING as a new sk_state · 99a20889
      Parthasarathy Bhuvaragan committed
      In this commit, we create a new tipc socket state TIPC_CONNECTING
      by primarily replacing the SS_CONNECTING with TIPC_CONNECTING.
      
      There is no functional change in this commit.
      Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • P
      tipc: remove SS_DISCONNECTING state · 6f00089c
      Parthasarathy Bhuvaragan committed
      In this commit, we replace the references to SS_DISCONNECTING with
      the combination of sk_state TIPC_DISCONNECTING and flags set in
      sk_shutdown.
      We introduce a new function __tipc_shutdown(), which provides
      the common code required by tipc_release() and tipc_shutdown().
      Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • P
      tipc: create TIPC_DISCONNECTING as a new sk_state · 9fd4b070
      Parthasarathy Bhuvaragan committed
      In this commit, we create a new tipc socket state TIPC_DISCONNECTING in
      sk_state. TIPC_DISCONNECTING replaces the socket connection status
      update that used SS_DISCONNECTING.
      TIPC_DISCONNECTING is set for connection oriented sockets at:
      - tipc_shutdown()
      - connection probe timeout
      - when we receive an error message on the connection.
      
      There is no functional change in this commit.
      Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • P
      tipc: create TIPC_OPEN as a new sk_state · 438adcaf
      Parthasarathy Bhuvaragan committed
      In this commit, we create a new tipc socket state TIPC_OPEN in
      sk_state. We primarily replace the SS_UNCONNECTED sock->state with
      TIPC_OPEN.
      Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>