1. 28 Oct, 2015 1 commit
  2. 27 Oct, 2015 1 commit
    • netfilter: nf_nat_redirect: add missing NULL pointer check · 94f9cd81
      Munehisa Kamata committed
      Commit 8b13eddf ("netfilter: refactor NAT
      redirect IPv4 to use it from nf_tables") has introduced a trivial logic
      change which can result in the following crash.
      
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000030
      IP: [<ffffffffa033002d>] nf_nat_redirect_ipv4+0x2d/0xa0 [nf_nat_redirect]
      PGD 3ba662067 PUD 3ba661067 PMD 0
      Oops: 0000 [#1] SMP
      Modules linked in: ipv6(E) xt_REDIRECT(E) nf_nat_redirect(E) xt_tcpudp(E) iptable_nat(E) nf_conntrack_ipv4(E) nf_defrag_ipv4(E) nf_nat_ipv4(E) nf_nat(E) nf_conntrack(E) ip_tables(E) x_tables(E) binfmt_misc(E) xfs(E) libcrc32c(E) evbug(E) evdev(E) psmouse(E) i2c_piix4(E) i2c_core(E) acpi_cpufreq(E) button(E) ext4(E) crc16(E) jbd2(E) mbcache(E) dm_mirror(E) dm_region_hash(E) dm_log(E) dm_mod(E)
      CPU: 0 PID: 2536 Comm: ip Tainted: G            E   4.1.7-15.23.amzn1.x86_64 #1
      Hardware name: Xen HVM domU, BIOS 4.2.amazon 05/06/2015
      task: ffff8800eb438000 ti: ffff8803ba664000 task.ti: ffff8803ba664000
      [...]
      Call Trace:
       <IRQ>
       [<ffffffffa0334065>] redirect_tg4+0x15/0x20 [xt_REDIRECT]
       [<ffffffffa02e2e99>] ipt_do_table+0x2b9/0x5e1 [ip_tables]
       [<ffffffffa0328045>] iptable_nat_do_chain+0x25/0x30 [iptable_nat]
       [<ffffffffa031777d>] nf_nat_ipv4_fn+0x13d/0x1f0 [nf_nat_ipv4]
       [<ffffffffa0328020>] ? iptable_nat_ipv4_fn+0x20/0x20 [iptable_nat]
       [<ffffffffa031785e>] nf_nat_ipv4_in+0x2e/0x90 [nf_nat_ipv4]
       [<ffffffffa03280a5>] iptable_nat_ipv4_in+0x15/0x20 [iptable_nat]
       [<ffffffff81449137>] nf_iterate+0x57/0x80
       [<ffffffff814491f7>] nf_hook_slow+0x97/0x100
       [<ffffffff814504d4>] ip_rcv+0x314/0x400
      
      unsigned int
      nf_nat_redirect_ipv4(struct sk_buff *skb,
      ...
      {
      ...
      		rcu_read_lock();
      		indev = __in_dev_get_rcu(skb->dev);
      		if (indev != NULL) {
      			ifa = indev->ifa_list;
      			newdst = ifa->ifa_local; <---
      		}
      		rcu_read_unlock();
      ...
      }
      
      Before the commit, 'ifa' had always been checked before access. After the
      commit, however, it could be dereferenced even when it is NULL. Interestingly,
      this was once fixed in 2003.
      
      http://marc.info/?l=netfilter-devel&m=106668497403047&w=2
      
      In addition to the original one, we have seen the crash when packets that
      need to be redirected somehow arrive on an interface which hasn't been
      yet fully configured.
      
      This change just reverts the logic to the old behavior to avoid the crash.
      
      Fixes: 8b13eddf ("netfilter: refactor NAT redirect IPv4 to use it from nf_tables")
      Signed-off-by: Munehisa Kamata <kamatam@amazon.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
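The restored check can be sketched in user space as follows. This is a minimal model, not the kernel code: `struct ifaddr_model`, `struct indev_model`, `pick_redirect_addr()` and `should_drop()` are hypothetical stand-ins for the in-kernel structures and logic, and treating a zero address as "drop" is an assumption of the sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the kernel's interface address structures. */
struct ifaddr_model {
    uint32_t ifa_local;              /* primary local address */
};

struct indev_model {
    struct ifaddr_model *ifa_list;   /* NULL while no address is configured */
};

/*
 * Return the address to redirect to, or 0 when none is available.
 * Both the device and its address list are checked before use, so an
 * interface that is not fully configured yet cannot cause a NULL
 * dereference.
 */
static uint32_t pick_redirect_addr(const struct indev_model *indev)
{
    uint32_t newdst = 0;

    if (indev != NULL && indev->ifa_list != NULL)
        newdst = indev->ifa_list->ifa_local;

    return newdst;
}

/* Hypothetical caller policy: no usable address means the packet is dropped. */
static int should_drop(const struct indev_model *indev)
{
    return pick_redirect_addr(indev) == 0;
}
```

The point of the sketch is only the ordering: the list pointer is tested in the same condition as the device pointer, before any field of the address entry is read.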
  3. 22 Oct, 2015 1 commit
  4. 17 Oct, 2015 1 commit
    • netfilter: ipset: Fix sleeping memory allocation in atomic context · 00db674b
      Nikolay Borisov committed
      Commit 00590fdd introduced RCU locking in list type and in
      doing so introduced a memory allocation in list_set_add, which
      is done in an atomic context, due to the fact that ipset rcu
      list modifications are serialised with a spin lock. The reason
      why we can't use a mutex is that in addition to modifying the
      list with ipset commands, it's also being modified when a
      particular ipset rule timeout expires aka garbage collection.
      This gc is triggered from set_cleanup_entries, which in turn
      is invoked from a timer thus requiring the lock to be bh-safe.
      
      Concretely the following call chain can lead to "sleeping function
      called in atomic context" splat:
      call_ad -> list_set_uadt -> list_set_uadd -> kzalloc(, GFP_KERNEL).
      Since GFP_KERNEL allows initiating direct reclaim, the allocation
      path can potentially sleep.
      
      To fix the issue, change the allocation type to GFP_ATOMIC to
      correctly reflect that it is occurring in an atomic context.
      
      Fixes: 00590fdd ("netfilter: ipset: Introduce RCU locking in list type")
      Signed-off-by: Nikolay Borisov <kernel@kyup.com>
      Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
  5. 13 Oct, 2015 1 commit
    • netfilter: sync with packet rx also after removing queue entries · 514ed62e
      Florian Westphal committed
      We need to sync packet rx again after flushing the queue entries.
      Otherwise, the following race could happen:
      
      cpu1: nf_unregister_hook(H) called, H unlinked from lists, calls
      synchronize_net() to wait for packet rx completion.
      
      Problem is that while no new nf_queue_entry structs that use H can be
      allocated, another CPU might receive a verdict from userspace just before
      cpu1 calls nf_queue_nf_hook_drop to remove this entry:
      
      cpu2: receive verdict from userspace, lock queue
      cpu2: unlink nf_queue_entry struct E, which references H, from queue list
      cpu1: calls nf_queue_nf_hook_drop, blocks on queue spinlock
      cpu2: unlock queue
      cpu1: nf_queue_nf_hook_drop drops affected queue entries
      cpu2: call nf_reinject for E
      cpu1: kfree(H)
      cpu2: potential use-after-free for H
      
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Fixes: 085db2c0 ("netfilter: Per network namespace netfilter hooks.")
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
  6. 17 Sep, 2015 1 commit
  7. 15 Sep, 2015 1 commit
    • netfilter: nft_compat: skip family comparison in case of NFPROTO_UNSPEC · ba378ca9
      Pablo Neira Ayuso committed
      Fix lookup of existing match/target structures in the corresponding list
      by skipping the family check if NFPROTO_UNSPEC is used.
      
      Without the skip, one match/target structure is allocated and inserted
      for each use of them. This not only bloats memory consumption but also
      severely affects the time to reload the ruleset from the
      iptables-compat utility.
      
      After this patch, iptables-compat-restore and iptables-compat take
      almost the same time to reload large rulesets.
      
      Fixes: 0ca743a5 ("netfilter: nf_tables: add compatibility layer for x_tables")
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
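The lookup rule described in this commit can be illustrated with a small user-space sketch. It assumes a flat array instead of the kernel's list, and `struct ext_entry`, `family_compatible()` and `ext_lookup()` are hypothetical names. The `FAMILY_*` constants mirror the numeric values of NFPROTO_UNSPEC, NFPROTO_IPV4 and NFPROTO_IPV6.

```c
#include <stddef.h>
#include <string.h>

/* Stand-ins for NFPROTO_UNSPEC, NFPROTO_IPV4 and NFPROTO_IPV6. */
enum { FAMILY_UNSPEC = 0, FAMILY_IPV4 = 2, FAMILY_IPV6 = 10 };

/* One cached match/target description (hypothetical layout). */
struct ext_entry {
    const char *name;
    unsigned int rev;
    unsigned int family;
};

/* A family-independent entry is compatible with any requesting family. */
static int family_compatible(unsigned int entry_family, unsigned int wanted)
{
    return entry_family == FAMILY_UNSPEC || entry_family == wanted;
}

/* Return the cached entry for (name, rev, family), or NULL if none exists. */
static const struct ext_entry *ext_lookup(const struct ext_entry *list,
                                          size_t n, const char *name,
                                          unsigned int rev,
                                          unsigned int family)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(list[i].name, name) == 0 &&
            list[i].rev == rev &&
            family_compatible(list[i].family, family))
            return &list[i];
    }
    return NULL;
}
```

With a strict family comparison, a lookup for an NFPROTO_UNSPEC entry from an IPv4 ruleset would miss every time, so each rule would allocate a fresh structure. Treating the unspecified family as a wildcard lets the existing entry be reused.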
  8. 14 Sep, 2015 1 commit
  9. 10 Sep, 2015 1 commit
    • netlink, mmap: fix edge-case leakages in nf queue zero-copy · 6bb0fef4
      Daniel Borkmann committed
      When netlink mmap on receive side is the consumer of nf queue data,
      it can happen that in some edge cases, we write skb shared info into
      the user space mmap buffer:
      
      Assume a possible rx ring frame size of only 4096, and the network skb,
      which is being zero-copied into the netlink skb, contains page frags
      with an overall skb->len larger than the linear part of the netlink
      skb.
      
      skb_zerocopy(), which is generic and thus not aware of the fact that
      shared info cannot be accessed for such skbs then tries to write and
      fill frags, thus leaking kernel data/pointers and in some corner cases
      possibly writing out of bounds of the mmap area (when filling the
      last slot in the ring buffer this way).
      
      I.e. the ring buffer slot is then of status NL_MMAP_STATUS_VALID, has
      an advertised length larger than 4096, where the linear part is visible
      at the slot beginning, and the leaked sizeof(struct skb_shared_info)
      has been written to the beginning of the next slot (also corrupting
      the struct nl_mmap_hdr slot header incl. status etc), since skb->end
      points to skb->data + ring->frame_size - NL_MMAP_HDRLEN.
      
      The fix adds and lets __netlink_alloc_skb() take the actual needed
      linear room for the network skb + meta data into account. It's completely
      irrelevant for non-mmaped netlink sockets, but in case mmap sockets
      are used, it can be decided whether the available skb_tailroom() is
      really large enough for the buffer, or whether it needs to internally
      fallback to a normal alloc_skb().
      
      From nf queue side, the information whether the destination port is
      an mmap RX ring is not really available without extra port-to-socket
      lookup, thus it can only be determined in lower layers i.e. when
      __netlink_alloc_skb() is called that checks internally for this. I
      chose to add the extra ldiff parameter as mmap will then still work:
      We have data_len and hlen in nfqnl_build_packet_message(), data_len
      is the full length (capped at queue->copy_range) for skb_zerocopy()
      and hlen some possible part of data_len that needs to be copied; the
      rem_len variable indicates the needed remaining linear mmap space.
      
      The only other workaround in nf queue internally would be after
      allocation time by f.e. cap'ing the data_len to the skb_tailroom()
      iff we deal with an mmap skb, but that would 1) expose the fact that
      we use a mmap skb to upper layers, and 2) trim the skb where we
      otherwise could just have moved the full skb into the normal receive
      queue.
      
      After the patch, in my test case the ring slot doesn't fit and therefore
      shows NL_MMAP_STATUS_COPY, where a full skb carries all the data and
      thus needs to be picked up via recv().
      
      Fixes: 3ab1f683 ("nfnetlink: add support for memory mapped netlink")
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
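The allocation decision described above can be modelled as a simple size check. This is a sketch under stated assumptions, not the actual `__netlink_alloc_skb()` code: `choose_alloc_path()`, the enum and all parameter names are hypothetical, and the header length used in any example call is an arbitrary illustrative value.

```c
#include <assert.h>
#include <stddef.h>

enum alloc_path {
    ALLOC_RING_FRAME,   /* data fits into the mmap ring slot */
    ALLOC_NORMAL_SKB    /* fall back to a regular allocation */
};

/*
 * Decide whether a ring slot can hold the linear data.
 *
 * frame_size: size of one ring slot
 * hdr_len:    per-slot header length (stand-in for NL_MMAP_HDRLEN)
 * linear_len: linear bytes requested by the caller
 * extra_len:  additional linear room needed later (the "ldiff" idea)
 */
static enum alloc_path choose_alloc_path(size_t frame_size, size_t hdr_len,
                                         size_t linear_len, size_t extra_len)
{
    assert(frame_size >= hdr_len);
    size_t room = frame_size - hdr_len;

    /* Compare without adding the two lengths, to avoid size_t overflow. */
    if (linear_len > room || extra_len > room - linear_len)
        return ALLOC_NORMAL_SKB;

    return ALLOC_RING_FRAME;
}
```

For example, with a 4096-byte slot and an assumed 16-byte header, `choose_alloc_path(4096, 16, 1000, 4000)` selects the fallback path, which corresponds to the NL_MMAP_STATUS_COPY case mentioned in the commit message.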
  10. 03 Sep, 2015 1 commit
    • netfilter: nf_conntrack: make nf_ct_zone_dflt built-in · 62da9865
      Daniel Borkmann committed
      Fengguang reported that some randconfig generated the following linker
      issue with nf_ct_zone_dflt object involved:
      
        [...]
        CC      init/version.o
        LD      init/built-in.o
        net/built-in.o: In function `ipv4_conntrack_defrag':
        nf_defrag_ipv4.c:(.text+0x93e95): undefined reference to `nf_ct_zone_dflt'
        net/built-in.o: In function `ipv6_defrag':
        nf_defrag_ipv6_hooks.c:(.text+0xe3ffe): undefined reference to `nf_ct_zone_dflt'
        make: *** [vmlinux] Error 1
      
      Given that configurations exist where we have a built-in part, which is
      accessing nf_ct_zone_dflt such as the two handlers nf_ct_defrag_user()
      and nf_ct6_defrag_user(), and a part that configures nf_conntrack as a
      module, we must move nf_ct_zone_dflt into a fixed, guaranteed built-in
      area when netfilter is configured in general.
      
      Therefore, split the more generic parts into a common header under
      include/linux/netfilter/ and move nf_ct_zone_dflt into the built-in
      section that already holds parts related to CONFIG_NF_CONNTRACK in the
      netfilter core. This fixes the issue on my side.
      
      Fixes: 308ac914 ("netfilter: nf_conntrack: push zone object into functions")
      Reported-by: Fengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  11. 01 Sep, 2015 1 commit
    • netfilter: conntrack: use nf_ct_tmpl_free in CT/synproxy error paths · 9cf94eab
      Daniel Borkmann committed
      Commit 0838aa7f ("netfilter: fix netns dependencies with conntrack
      templates") migrated templates to the new allocator api, but forgot to
      update error paths for them in CT and synproxy to use nf_ct_tmpl_free()
      instead of nf_conntrack_free().
      
      Due to that, memory is being freed into the wrong kmemcache, but also
      we drop the per net reference count of ct objects causing an imbalance.
      
      In Brad's case, this leads to a wrap-around of net->ct.count and thus
      lets __nf_conntrack_alloc() refuse to create a new ct object:
      
        [   10.340913] xt_addrtype: ipv6 does not support BROADCAST matching
        [   10.810168] nf_conntrack: table full, dropping packet
        [   11.917416] r8169 0000:07:00.0 eth0: link up
        [   11.917438] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
        [   12.815902] nf_conntrack: table full, dropping packet
        [   15.688561] nf_conntrack: table full, dropping packet
        [   15.689365] nf_conntrack: table full, dropping packet
        [   15.690169] nf_conntrack: table full, dropping packet
        [   15.690967] nf_conntrack: table full, dropping packet
        [...]
      
      With slab debugging, it also reports the wrong kmemcache (kmalloc-512 vs.
      nf_conntrack_ffffffff81ce75c0) and reports poison overwrites, etc. Thus,
      to fix the problem, export and use nf_ct_tmpl_free() instead.
      
      Fixes: 0838aa7f ("netfilter: fix netns dependencies with conntrack templates")
      Reported-by: Brad Jackson <bjackson0971@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
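The counter imbalance and the resulting "table full" message can be reproduced with a small model. All names here (`struct netns_model`, `ct_alloc()`, `ct_free()`, `tmpl_alloc()`, `tmpl_free()`, `table_full()`) are hypothetical, and the sketch assumes that the limit check compares a signed per-namespace counter against an unsigned maximum, which is what turns a negative count into a very large value.

```c
/* Per-namespace accounting (stand-in for net->ct.count). */
struct netns_model {
    int ct_count;
};

/* Regular conntrack objects are counted. */
static void ct_alloc(struct netns_model *net) { net->ct_count++; }
static void ct_free(struct netns_model *net)  { net->ct_count--; }

/* Templates are not counted, so their free path must not decrement. */
static void tmpl_alloc(struct netns_model *net) { (void)net; }
static void tmpl_free(struct netns_model *net)  { (void)net; }

/*
 * Model of the limit check: the signed counter is converted to unsigned
 * for the comparison, so -1 compares as the largest unsigned value.
 */
static int table_full(int count, unsigned int max)
{
    return max != 0 && (unsigned int)count > max;
}
```

Pairing `tmpl_alloc()` with `ct_free()` in an error path leaves `ct_count` at -1, after which `table_full()` reports a full table even though no entries exist. Pairing it with `tmpl_free()` keeps the counter balanced.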
  12. 29 Aug, 2015 4 commits
    • netfilter: nfnetlink: work around wrong endianess in res_id field · a9de9777
      Pablo Neira Ayuso committed
      The convention in nfnetlink is to use network byte order in every header field
      as well as in the attribute payload. The initial version of the batching
      infrastructure assumes that res_id comes in host byte order though.
      
      The only client of the batching infrastructure is nf_tables, so let's add a
      workaround to address this inconsistency. We currently have 11 nfnetlink
      subsystems according to NFNL_SUBSYS_COUNT, so we can assume that the subsystem
      2560, ie. htons(10), will not be allocated anytime soon, so it can be an alias
      of nf_tables from the nfnetlink batching path when interpreting the res_id
      field.
      
      Based on original patch from Florian Westphal.
      Reported-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
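The alias value 2560 follows directly from swapping the two bytes of 10. The sketch below shows that arithmetic and one portable way to accept either byte order. `SUBSYS_NFTABLES`, `swab16()` and `res_id_is_nftables()` are hypothetical names, and the check is an illustration of the idea rather than the code in the patch.

```c
#include <stdint.h>

/* Subsystem number taken from the "htons(10)" mentioned in the commit text. */
#define SUBSYS_NFTABLES 10u

/* Swap the two bytes of a 16-bit value, independent of host endianness. */
static uint16_t swab16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/*
 * Accept the id whether the sender used the expected byte order or the
 * opposite one. This is only safe while no real subsystem is numbered 2560.
 */
static int res_id_is_nftables(uint16_t res_id)
{
    return res_id == SUBSYS_NFTABLES || res_id == swab16(SUBSYS_NFTABLES);
}
```

Here 10 is 0x000A and the byte-swapped form is 0x0A00, which is 2560 in decimal.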
    • netfilter: ipset: Fixing unnamed union init · 96be5f28
      Elad Raz committed
      Following up on Vinson Lee's proposed post [1], this patch fixes compilation
      issues found with gcc 4.4.7. Initializing the .cidr field of unnamed
      unions causes a compilation error in gcc 4.4.x.
      
      [1] https://lkml.org/lkml/2015/7/5/74
      
      Signed-off-by: Elad Raz <eladr@mellanox.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: reduce sparse warnings · 851345c5
      Florian Westphal committed
      bridge/netfilter/ebtables.c:290:26: warning: incorrect type in assignment (different modifiers)
      -> remove __pure annotation.
      
      ipv6/netfilter/ip6t_SYNPROXY.c:240:27: warning: cast from restricted __be16
      -> switch ntohs to htons and vice versa.
      
      netfilter/core.c:391:30: warning: symbol 'nfq_ct_nat_hook' was not declared. Should it be static?
      -> delete it, got removed
      
      net/netfilter/nf_synproxy_core.c:221:48: warning: cast to restricted __be32
      -> Use __be32 instead of u32.
      
      Tested with objdiff that these changes do not affect generated code.
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: ipset: Out of bound access in hash:net* types fixed · 6fe7ccfd
      Jozsef Kadlecsik committed
      Dave Jones reported that KASan detected out of bounds access in hash:net*
      types:
      
      [   23.139532] ==================================================================
      [   23.146130] BUG: KASan: out of bounds access in hash_net4_add_cidr+0x1db/0x220 at addr ffff8800d4844b58
      [   23.152937] Write of size 4 by task ipset/457
      [   23.159742] =============================================================================
      [   23.166672] BUG kmalloc-512 (Not tainted): kasan: bad access detected
      [   23.173641] -----------------------------------------------------------------------------
      [   23.194668] INFO: Allocated in hash_net_create+0x16a/0x470 age=7 cpu=1 pid=456
      [   23.201836]  __slab_alloc.constprop.66+0x554/0x620
      [   23.208994]  __kmalloc+0x2f2/0x360
      [   23.216105]  hash_net_create+0x16a/0x470
      [   23.223238]  ip_set_create+0x3e6/0x740
      [   23.230343]  nfnetlink_rcv_msg+0x599/0x640
      [   23.237454]  netlink_rcv_skb+0x14f/0x190
      [   23.244533]  nfnetlink_rcv+0x3f6/0x790
      [   23.251579]  netlink_unicast+0x272/0x390
      [   23.258573]  netlink_sendmsg+0x5a1/0xa50
      [   23.265485]  SYSC_sendto+0x1da/0x2c0
      [   23.272364]  SyS_sendto+0xe/0x10
      [   23.279168]  entry_SYSCALL_64_fastpath+0x12/0x6f
      
      The bug is fixed in the patch and the testsuite is extended in ipset
      to check cidr handling more thoroughly.
      Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
  13. 28 Aug, 2015 2 commits
  14. 22 Aug, 2015 5 commits
  15. 19 Aug, 2015 1 commit
  16. 18 Aug, 2015 3 commits
    • net: Change pseudohdr argument of inet_proto_csum_replace* to be a bool · 4b048d6d
      Tom Herbert committed
      inet_proto_csum_replace4,2,16 take a pseudohdr argument which indicates
      the checksum field carries a pseudo header. This argument should be a
      boolean instead of an int.
      Signed-off-by: Tom Herbert <tom@herbertland.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netfilter: nf_conntrack: add efficient mark to zone mapping · 5e8018fc
      Daniel Borkmann committed
      This work adds the possibility of deriving the zone id from the skb->mark
      field in a scalable manner. This allows for having only a single template
      serving hundreds/thousands of different zones, for example, instead of the
      need to have one match for each zone as an extra CT jump target.
      
      Note that we'd need to have this information attached to the template as at
      the time when we're trying to lookup a possible ct object, we already need
      to know zone information for a possible match when going into
      __nf_conntrack_find_get(). This work provides a minimal implementation for
      a possible mapping.
      
      In order to not add/expose an extra ct->status bit, the zone structure has
      been extended to carry a flag for deriving the mark.
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
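The mark-to-zone idea can be sketched as follows. This is a simplified model: `struct zone_tmpl_model`, `ZONE_FLAG_MARK` and `effective_zone_id()` are hypothetical names, and truncating the 32-bit mark to a 16-bit zone id is an assumption of the sketch.

```c
#include <stdint.h>

/* Flag on the template's zone: derive the zone id from the packet mark. */
#define ZONE_FLAG_MARK 0x1u

/* Zone information carried by a conntrack template (hypothetical layout). */
struct zone_tmpl_model {
    uint16_t id;      /* static zone id, used when the flag is not set */
    uint8_t  flags;
};

/*
 * Pick the zone id for a packet before the conntrack lookup.
 * One template with ZONE_FLAG_MARK set can serve many zones, because the
 * id comes from the packet rather than from the template.
 */
static uint16_t effective_zone_id(const struct zone_tmpl_model *tmpl,
                                  uint32_t skb_mark)
{
    if (tmpl->flags & ZONE_FLAG_MARK)
        return (uint16_t)skb_mark;

    return tmpl->id;
}
```

A single rule attaching such a template then replaces one CT jump target per zone, since packets marked 1, 2, 3 and so on land in zones 1, 2, 3 without further rules.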
    • netfilter: nf_conntrack: add direction support for zones · deedb590
      Daniel Borkmann committed
      This work adds a direction parameter to netfilter zones, so identity
      separation can be performed only in original/reply or both directions
      (default). This basically opens up the possibility of doing NAT with
      conflicting IP address/port tuples from multiple, isolated tenants
      on a host (e.g. from a netns) without requiring each tenant to NAT
      twice resp. to use its own dedicated IP address to SNAT to, meaning
      overlapping tuples can be made unique with the zone identifier in
      original direction, where the NAT engine will then allocate a unique
      tuple in the commonly shared default zone for the reply direction.
      In some restricted, local DNAT cases, also port redirection could be
      used for making the reply traffic unique w/o requiring SNAT.
      
      The consensus we've reached and discussed at NFWS and since the initial
      implementation [1] was to directly integrate the direction meta data
      into the existing zones infrastructure, as opposed to the ct->mark
      approach we proposed initially.
      
      As we pass the nf_conntrack_zone object directly around, we don't have
      to touch all call-sites, but only those, that contain equality checks
      of zones. Thus, based on the current direction (original or reply),
      we either return the actual id, or the default NF_CT_DEFAULT_ZONE_ID.
      CT expectations are direction-agnostic entities when expectations are
      being compared among themselves, so we can only use the identifier
      in this case.
      
      Note that zone identifiers can not be included into the hash mix
      anymore as they don't contain a "stable" value that would be equal
      for both directions at all times, f.e. if only zone->id would
      unconditionally be xor'ed into the table slot hash, then replies won't
      find the corresponding conntracking entry anymore.
      
      If no particular direction is specified when configuring zones, the
      behaviour is exactly as we expect currently (both directions).
      
      Support has been added for the CT netlink interface as well as the
      x_tables raw CT target, which both already offer existing interfaces
      to user space for the configuration of zones.
      
      Below a minimal, simplified collision example (script in [2]) with
      netperf sessions:
      
        +--- tenant-1 ---+   mark := 1
        |    netperf     |--+
        +----------------+  |                CT zone := mark [ORIGINAL]
         [ip,sport] := X   +--------------+  +--- gateway ---+
                           | mark routing |--|     SNAT      |-- ... +
                           +--------------+  +---------------+       |
        +--- tenant-2 ---+  |                                     ~~~|~~~
        |    netperf     |--+                +-----------+           |
        +----------------+   mark := 2       | netserver |------ ... +
         [ip,sport] := X                     +-----------+
                                              [ip,port] := Y
      On the gateway netns, example:
      
        iptables -t raw -A PREROUTING -j CT --zone mark --zone-dir ORIGINAL
        iptables -t nat -A POSTROUTING -o <dev> -j SNAT --to-source <ip> --random-fully
      
        iptables -t mangle -A PREROUTING -m conntrack --ctdir ORIGINAL -j CONNMARK --save-mark
        iptables -t mangle -A POSTROUTING -m conntrack --ctdir REPLY -j CONNMARK --restore-mark
      
      conntrack dump from gateway netns:
      
        netperf -H 10.1.1.2 -t TCP_STREAM -l60 -p12865,5555 from each tenant netns
      
        tcp 6 431995 ESTABLISHED src=40.1.1.1 dst=10.1.1.2 sport=5555 dport=12865 zone-orig=1
                                 src=10.1.1.2 dst=10.1.1.1 sport=12865 dport=1024
                     [ASSURED] mark=1 secctx=system_u:object_r:unlabeled_t:s0 use=1
      
        tcp 6 431994 ESTABLISHED src=40.1.1.1 dst=10.1.1.2 sport=5555 dport=12865 zone-orig=2
                                 src=10.1.1.2 dst=10.1.1.1 sport=12865 dport=5555
                     [ASSURED] mark=2 secctx=system_u:object_r:unlabeled_t:s0 use=1
      
        tcp 6 299 ESTABLISHED src=40.1.1.1 dst=10.1.1.2 sport=39438 dport=33768 zone-orig=1
                              src=10.1.1.2 dst=10.1.1.1 sport=33768 dport=39438
                     [ASSURED] mark=1 secctx=system_u:object_r:unlabeled_t:s0 use=1
      
        tcp 6 300 ESTABLISHED src=40.1.1.1 dst=10.1.1.2 sport=32889 dport=40206 zone-orig=2
                              src=10.1.1.2 dst=10.1.1.1 sport=40206 dport=32889
                     [ASSURED] mark=2 secctx=system_u:object_r:unlabeled_t:s0 use=2
      
      Taking this further, test script in [2] creates 200 tenants and runs
      original-tuple colliding netperf sessions each. A conntrack -L dump in
      the gateway netns also confirms 200 overlapping entries, all in ESTABLISHED
      state as expected.
      
      I also did run various other tests with some permutations of the script,
      to mention some: SNAT in random/random-fully/persistent mode, no zones (no
      overlaps), static zones (original, reply, both directions), etc.
      
        [1] http://thread.gmane.org/gmane.comp.security.firewalls.netfilter.devel/57412/
        [2] https://paste.fedoraproject.org/242835/65657871/
      
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
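The direction-dependent zone comparison described in this commit can be sketched as below. The names (`struct zone_model`, `zone_id_for()`, `zones_equal()`, the `ZONE_DIR_*` and `DIR_*` constants) are hypothetical stand-ins, and the default zone id of 0 is an assumption of the sketch.

```c
#include <stdint.h>

#define DEFAULT_ZONE_ID 0u   /* stand-in for NF_CT_DEFAULT_ZONE_ID */

enum ct_dir { DIR_ORIGINAL = 0, DIR_REPLY = 1 };

/* Bitmask of the directions in which the zone id applies. */
#define ZONE_DIR_ORIG  (1u << DIR_ORIGINAL)
#define ZONE_DIR_REPLY (1u << DIR_REPLY)
#define ZONE_DIR_BOTH  (ZONE_DIR_ORIG | ZONE_DIR_REPLY)

struct zone_model {
    uint16_t id;
    uint8_t  dir;
};

/* Return the real id in a configured direction, the default id otherwise. */
static uint16_t zone_id_for(const struct zone_model *zone, enum ct_dir dir)
{
    return (zone->dir & (1u << dir)) ? zone->id : DEFAULT_ZONE_ID;
}

/* Equality check as used when comparing tuples in a given direction. */
static int zones_equal(const struct zone_model *a, const struct zone_model *b,
                       enum ct_dir dir)
{
    return zone_id_for(a, dir) == zone_id_for(b, dir);
}
```

Two tenants with zones 1 and 2 configured for the original direction only are kept apart on the original tuple but share the default zone on the reply tuple, which is why the NAT engine has to pick a unique reply tuple there.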
  17. 11 Aug, 2015 1 commit
  18. 07 Aug, 2015 10 commits
  19. 05 Aug, 2015 1 commit
  20. 30 Jul, 2015 2 commits
    • netfilter: nf_conntrack: checking for IS_ERR() instead of NULL · 1a727c63
      Dan Carpenter committed
      We recently changed this from nf_conntrack_alloc() to nf_ct_tmpl_alloc()
      so the error handling needs to be changed to check for NULL instead of
      IS_ERR().
      
      Fixes: 0838aa7f ('netfilter: fix netns dependencies with conntrack templates')
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
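Why the two checks are not interchangeable can be seen in a user-space model of error pointers. `err_ptr()`, `is_err()` and `MAX_ERRNO_MODEL` are hypothetical re-implementations written for this sketch. They follow the common convention that small negative error codes are encoded in the top 4095 values of the address space.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO_MODEL 4095

/* Encode a negative error code as a pointer value. */
static void *err_ptr(long err)
{
    return (void *)(intptr_t)err;
}

/* True only for pointers in the reserved error range at the top. */
static int is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO_MODEL;
}

/* An allocator that reports failure with NULL must be checked like this. */
static int alloc_failed(const void *ptr)
{
    return ptr == NULL;
}
```

NULL is address 0, far below the error range, so `is_err(NULL)` is false. A caller that keeps the old `is_err()` style check after switching to a NULL-returning allocator would therefore treat a failed allocation as success.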
    • netfilter: nf_ct_sctp: minimal multihoming support · d7ee3519
      Michal Kubeček committed
      Currently nf_conntrack_proto_sctp module handles only packets between
      primary addresses used to establish the connection. Any packets between
      secondary addresses are classified as invalid so that usual firewall
      configurations drop them. Allowing HEARTBEAT and HEARTBEAT-ACK chunks to
      establish a new conntrack would allow traffic between secondary
      addresses to pass through. A more sophisticated solution based on the
      addresses advertised in the initial handshake (and possibly also later
      dynamic address addition and removal) would be much harder to implement.
      Moreover, in general we cannot assume to always see the initial
      handshake as it can be routed through a different path.
      
      The patch adds two new conntrack states:
      
        SCTP_CONNTRACK_HEARTBEAT_SENT  - a HEARTBEAT chunk seen but not acked
        SCTP_CONNTRACK_HEARTBEAT_ACKED - a HEARTBEAT acked by HEARTBEAT-ACK
      
      State transition rules:
      
      - HEARTBEAT_SENT responds to usual chunks the same way as NONE (so that
        the behaviour changes as little as possible)
      - HEARTBEAT_ACKED responds to usual chunks the same way as ESTABLISHED
        does, except the resulting state is HEARTBEAT_ACKED rather than
        ESTABLISHED
      - previously existing states except NONE are preserved when HEARTBEAT or
        HEARTBEAT-ACK is seen
      - NONE (in the initial direction) changes to HEARTBEAT_SENT on HEARTBEAT
        and to CLOSED on HEARTBEAT-ACK
      - HEARTBEAT_SENT changes to HEARTBEAT_ACKED on HEARTBEAT-ACK in the
        reply direction
      - HEARTBEAT_SENT and HEARTBEAT_ACKED are preserved on HEARTBEAT and
        HEARTBEAT-ACK otherwise
      
      Normally, vtag is set from the INIT chunk for the reply direction and
      from the INIT-ACK chunk for the originating direction (i.e. each of
      these defines vtag value for the opposite direction). For secondary
      conntracks, we can't rely on seeing INIT/INIT-ACK and even if we have
      seen them, we would need to connect two different conntracks. Therefore
      simplified logic is applied: vtag of first packet in each direction
      (HEARTBEAT in the originating and HEARTBEAT-ACK in reply direction) is
      saved and all following packets in that direction are compared with this
      saved value. While INIT and INIT-ACK define vtag for the opposite
      direction, vtags extracted from HEARTBEAT and HEARTBEAT-ACK are always
      for their direction.
      
      Default timeout values for new states are
      
        HEARTBEAT_SENT: 30 seconds (default hb_interval)
        HEARTBEAT_ACKED: 210 seconds (hb_interval * path_max_retry + max_rto)
      
      (We cannot expect to see the shutdown sequence so that, unlike
      ESTABLISHED, the HEARTBEAT_ACKED timeout shouldn't be too long.)
      Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
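The HEARTBEAT transition rules and the timeout arithmetic listed in this commit can be sketched as a small state function. The enum names and `next_state()` are hypothetical, `ST_UNSPECIFIED` is a placeholder for the one case the rules above do not cover (NONE in the reply direction), and the values 5 for path_max_retry and 60 seconds for max_rto are assumptions chosen because they reproduce the stated 210 seconds.

```c
enum sctp_st {
    ST_NONE, ST_CLOSED, ST_ESTABLISHED,
    ST_HEARTBEAT_SENT, ST_HEARTBEAT_ACKED,
    ST_UNSPECIFIED   /* case not covered by the rules in the text */
};

enum hb_chunk { CH_HEARTBEAT, CH_HEARTBEAT_ACK };
enum ct_dir   { DIR_ORIGINAL, DIR_REPLY };

/* Next state after seeing a HEARTBEAT or HEARTBEAT-ACK chunk. */
static enum sctp_st next_state(enum sctp_st cur, enum hb_chunk chunk,
                               enum ct_dir dir)
{
    switch (cur) {
    case ST_NONE:
        if (dir != DIR_ORIGINAL)
            return ST_UNSPECIFIED;
        return chunk == CH_HEARTBEAT ? ST_HEARTBEAT_SENT : ST_CLOSED;
    case ST_HEARTBEAT_SENT:
        /* Only an ACK travelling in the reply direction confirms the path. */
        if (chunk == CH_HEARTBEAT_ACK && dir == DIR_REPLY)
            return ST_HEARTBEAT_ACKED;
        return ST_HEARTBEAT_SENT;
    default:
        /* HEARTBEAT_ACKED and all previously existing states are preserved. */
        return cur;
    }
}

/* HEARTBEAT_ACKED timeout: hb_interval * path_max_retry + max_rto (seconds). */
static unsigned int acked_timeout(unsigned int hb_interval,
                                  unsigned int path_max_retry,
                                  unsigned int max_rto)
{
    return hb_interval * path_max_retry + max_rto;
}
```

With the assumed values, `acked_timeout(30, 5, 60)` gives 30 * 5 + 60 = 210 seconds, matching the default listed for HEARTBEAT_ACKED.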