1. 26 Sep, 2012 1 commit
  2. 25 Sep, 2012 3 commits
    • E
      net: raw: revert unrelated change · 8489c1d9
      Committed by Eric Dumazet
      Commit 5640f768 ("net: use a per task frag allocator")
      accidentally contained an unrelated change to net/ipv4/raw.c,
      later committed (without the pr_err() debugging bits) in the
      net tree as commit ab43ed8b ("ipv4: raw: fix icmp_filter()").
      
      This patch reverts this glitch, noticed by Stephen Rothwell.
      Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • D
      filter: add XOR instruction for use with X/K · 9e49e889
      Committed by Daniel Borkmann
      SKF_AD_ALU_XOR_X was added a while ago, but as an 'ancillary'
      operation that is invoked through a negative offset in K within BPF
      load operations. Since BPF_MOD has recently been added, BPF_XOR should
      also be part of the common ALU operations. Removing SKF_AD_ALU_XOR_X
      might not be an option since it is exposed to user space.
      Signed-off-by: Daniel Borkmann <daniel.borkmann@tik.ee.ethz.ch>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
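To make the X/K distinction in the BPF_XOR commit concrete, here is a minimal sketch of a toy interpreter that handles only the two XOR forms. The opcode values mirror the classic BPF encoding, but the `SK_` names, `struct sk_insn`, and `run_xor_prog()` are hypothetical and exist only for this illustration; this is not the kernel's filter code.

```c
#include <stddef.h>
#include <stdint.h>

/* Classic BPF opcode fields, redefined under hypothetical SK_ names so the
 * sketch stays standalone. */
#define SK_BPF_ALU 0x04
#define SK_BPF_XOR 0xa0
#define SK_BPF_K   0x00  /* operand is the immediate K */
#define SK_BPF_X   0x08  /* operand is the index register X */

struct sk_insn {
    uint16_t code;
    uint32_t k;
};

/* Toy interpreter: applies only ALU XOR instructions to accumulator A. */
static uint32_t run_xor_prog(const struct sk_insn *prog, size_t len,
                             uint32_t a, uint32_t x)
{
    for (size_t i = 0; i < len; i++) {
        switch (prog[i].code) {
        case SK_BPF_ALU | SK_BPF_XOR | SK_BPF_X:
            a ^= x;
            break;
        case SK_BPF_ALU | SK_BPF_XOR | SK_BPF_K:
            a ^= prog[i].k;
            break;
        default:
            break; /* every other opcode is ignored in this sketch */
        }
    }
    return a;
}
```

With a regular ALU opcode, XOR no longer needs the detour through a load instruction with a negative offset in K.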
    • E
      net: use a per task frag allocator · 5640f768
      Committed by Eric Dumazet
      We currently use a per socket order-0 page cache for tcp_sendmsg()
      operations.
      
      This page is used to build fragments for skbs.
      
      It's done to increase the probability of coalescing small write() calls
      into single segments in skbs still in the write queue (not yet sent).
      
      But it wastes a lot of memory for applications handling many mostly
      idle sockets, since each socket holds one page in sk->sk_sndmsg_page.
      
      It's also quite inefficient for building TSO 64KB packets, because we need
      about 16 pages per skb on arches where PAGE_SIZE = 4096, so we hit the
      page allocator more than wanted.
      
      This patch adds a per task frag allocator and uses bigger pages,
      if available. An automatic fallback is done in case of memory pressure.
      
      (up to 32768 bytes per frag, that's order-3 pages on x86)
      
      This increases TCP stream performance by 20% on the loopback device,
      but also benefits other network devices, since 8x fewer frags are
      mapped on transmit and unmapped on tx completion. Alexander Duyck
      mentioned a probable performance win on systems with IOMMU enabled.
      
      It's possible some SG enabled hardware can't cope with bigger fragments,
      but their ndo_start_xmit() should already handle this, splitting a
      fragment into sub fragments, since some arches have PAGE_SIZE=65536.
      
      Successfully tested on various ethernet devices
      (ixgbe, igb, bnx2x, tg3, mellanox mlx4).
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Ben Hutchings <bhutchings@solarflare.com>
      Cc: Vijay Subramanian <subramanian.vijay@gmail.com>
      Cc: Alexander Duyck <alexander.h.duyck@intel.com>
      Tested-by: Vijay Subramanian <subramanian.vijay@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
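The fragment arithmetic in the frag allocator commit (16 pages per 64KB skb at order 0, 8x fewer with 32768-byte frags) can be checked with a short sketch. The fallback loop is an assumption about how "automatic fallback" might look, stepping down one order at a time; `frags_needed()`, `alloc_order_with_fallback()`, and the `only_small_pages()` stub are hypothetical names, not kernel functions.

```c
#include <assert.h>
#include <stddef.h>

/* Number of fragments needed to hold len bytes using frag_size-byte frags. */
static size_t frags_needed(size_t len, size_t frag_size)
{
    assert(frag_size > 0);
    return (len + frag_size - 1) / frag_size;
}

/* Hypothetical fallback: start at max_order and step down until the
 * caller-supplied allocator stub succeeds. Returns the order obtained,
 * or -1 if even an order-0 page is unavailable. */
static int alloc_order_with_fallback(int max_order, int (*try_alloc)(int order))
{
    for (int order = max_order; order >= 0; order--)
        if (try_alloc(order))
            return order;
    return -1;
}

/* Stub simulating memory pressure: only order 0 and order 1 succeed. */
static int only_small_pages(int order)
{
    return order <= 1;
}
```

A 65536-byte packet needs `frags_needed(65536, 4096)` fragments with order-0 pages and `frags_needed(65536, 32768)` with order-3 pages, which is where the 8x reduction comes from.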
  3. 24 Sep, 2012 4 commits
    • P
      netfilter: nfnetlink_queue: add NFQA_CAP_LEN attribute · 6ee584be
      Committed by Pablo Neira Ayuso
      This patch adds the NFQA_CAP_LEN attribute, which lets user space
      learn the real packet size (even if we decided to retrieve just
      a few bytes from the packet instead of all of it).
      
      Security software that inspects packets should always check for
      this new attribute to make sure that it is inspecting the entire
      packet.
      
      This also helps to provide a workaround for the problem described
      in: http://marc.info/?l=netfilter-devel&m=134519473212536&w=2
      
      Original idea from Florian Westphal.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
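The check that inspecting software is asked to perform for NFQA_CAP_LEN could look roughly like the sketch below. `struct queued_pkt` and `pkt_is_complete()` are hypothetical and stand in for whatever a program extracts from the Netlink message; treating a missing attribute as "not truncated" is an assumption, not something the commit message states.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical view of what a userspace program extracted from one NFQUEUE
 * message; the field names are illustrative, not a library API. */
struct queued_pkt {
    uint32_t payload_len; /* bytes actually present in NFQA_PAYLOAD */
    bool     has_cap_len; /* was an NFQA_CAP_LEN attribute present? */
    uint32_t cap_len;     /* its value, already in host byte order */
};

/* True when the received payload is known to cover the whole packet. */
static bool pkt_is_complete(const struct queued_pkt *p)
{
    if (!p->has_cap_len)
        return true; /* assumption: attribute absent means no truncation */
    return p->cap_len <= p->payload_len;
}
```

A verdict based on a packet for which `pkt_is_complete()` returns false only reflects the copied prefix, not the full packet.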
    • P
      netfilter: nfnetlink_queue: fix maximum packet length to userspace · ba8d3b0b
      Committed by Pablo Neira Ayuso
      The packets that we send via NFQUEUE are encapsulated in the NFQA_PAYLOAD
      attribute. The length of the packet in userspace is obtained via the
      attr->nla_len field. This field contains the size of the Netlink
      attribute header plus the packet length.
      
      If the maximum packet length is specified, i.e. 65535 bytes, and
      packets in the range of (65531,65535] are sent to userspace,
      attr->nla_len overflows and reports bogus lengths to the
      application.
      
      To fix this, this patch limits the maximum packet length to 65531
      bytes. If a larger packet length is specified, the packet that we
      send to user-space is truncated to 65531 bytes.
      
      To support 65535-byte packets, we have to revisit the idea of
      the 32-bit Netlink attribute length.
      Reported-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
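The 65531 limit follows from the 16-bit nla_len field and the 4-byte attribute header. The sketch below reproduces the wrap-around; `ATTR_HDRLEN`, `nla_len_for()`, and `clamp_copy_len()` are hypothetical names used only for this illustration.

```c
#include <stdint.h>

#define ATTR_HDRLEN 4u /* size of a Netlink attribute header */

/* Value that ends up in a 16-bit nla_len for a given payload size. */
static uint16_t nla_len_for(uint32_t payload_len)
{
    return (uint16_t)(ATTR_HDRLEN + payload_len);
}

/* Clamp a requested copy length to the largest payload that still fits. */
static uint32_t clamp_copy_len(uint32_t requested)
{
    const uint32_t max_len = UINT16_MAX - ATTR_HDRLEN; /* 65531 */
    return requested > max_len ? max_len : requested;
}
```

A 65531-byte payload yields an nla_len of exactly 65535, while 65532 bytes already wraps to 0 and 65535 bytes wraps to 3.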
    • P
      netfilter: nf_ct_ftp: add sequence tracking pickup facility for injected entries · 7be54ca4
      Committed by Pablo Neira Ayuso
      This patch allows the FTP helper to pick up the sequence tracking from
      the first packet seen. This is useful to fix the breakage of the first
      FTP command after a failover while using conntrackd to synchronize
      states.
      
      The seq_aft_nl_num field in struct nf_ct_ftp_info has been shrunk to
      16 bits (enough for what it does), so we can use the remaining 16 bits
      to store the flags while keeping the same size for the private FTP helper
      data.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
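The size-preserving field split described for the FTP helper can be illustrated with two simplified layouts. Both structs are hypothetical reductions (the real struct nf_ct_ftp_info has more members), and the per-direction arrays and `DIR_MAX` are assumptions made for this sketch.

```c
#include <assert.h>
#include <stdint.h>

#define DIR_MAX 2 /* hypothetical: original and reply directions */

/* Hypothetical "before" layout: one 32-bit counter per direction. */
struct ftp_info_old {
    uint32_t seq_aft_nl_num[DIR_MAX];
};

/* Hypothetical "after" layout: 16-bit counter plus 16-bit flags. */
struct ftp_info_new {
    uint16_t seq_aft_nl_num[DIR_MAX];
    uint16_t flags[DIR_MAX];
};

/* The private helper data must not grow. */
static_assert(sizeof(struct ftp_info_new) == sizeof(struct ftp_info_old),
              "helper data size changed");
```

The compile-time assertion documents the constraint that the flags reuse freed space rather than enlarging the helper's private data.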
    • F
      netfilter: xt_time: add support to ignore day transition · 54eb3df3
      Committed by Florian Westphal
      Currently, if you want to do something like:
      "match Monday, starting 23:00, for two hours"
      you need two rules, one for Mon 23:00 to 0:00 and one for Tue 0:00-1:00.
      
      The rule: --weekdays Mo --timestart 23:00  --timestop 01:00
      
      looks correct, but it will first match on Monday from midnight to 1 a.m.
      and then again for another hour from 23:00 onwards.
      
      This patch permits userspace to explicitly ignore the day transition and
      match for a single, continuous time period instead.
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
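The difference between the two xt_time matching modes can be modelled in a few lines. This is a simplified sketch, not the xt_time implementation: `struct time_rule`, its `contiguous` flag, and `rule_matches()` are hypothetical, and time zones and date ranges are ignored.

```c
#include <stdbool.h>

/* Days are 0 = Sunday ... 6 = Saturday; times are minutes since midnight.
 * All names here are hypothetical and not taken from xt_time. */
struct time_rule {
    unsigned weekdays; /* bitmask, bit d set means day d is selected */
    int start, stop;   /* inclusive range; start > stop wraps past midnight */
    bool contiguous;   /* treat a wrapping range as one continuous period */
};

static bool day_selected(const struct time_rule *r, int day)
{
    return (r->weekdays >> day) & 1u;
}

static bool rule_matches(const struct time_rule *r, int day, int minute)
{
    if (r->start <= r->stop)
        return day_selected(r, day) &&
               minute >= r->start && minute <= r->stop;

    if (!r->contiguous) /* two separate windows on the same weekday */
        return day_selected(r, day) &&
               (minute >= r->start || minute <= r->stop);

    if (minute >= r->start) /* evening part: today must be selected */
        return day_selected(r, day);
    if (minute <= r->stop)  /* morning part belongs to the previous day */
        return day_selected(r, (day + 6) % 7);
    return false;
}
```

For a Monday rule from 23:00 to 01:00, the default mode matches Monday 00:30 and Monday 23:30, while the contiguous mode matches Monday 23:30 and Tuesday 00:30.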
  4. 23 Sep, 2012 9 commits
  5. 22 Sep, 2012 4 commits
    • J
      netfilter: ipset: Check and reject crazy /0 input parameters · b9fed748
      Committed by Jozsef Kadlecsik
      The bitmap:ip and bitmap:ip,mac types did not reject such a crazy range
      at creation time, and using such a set results in a kernel crash.
      The hash types just silently ignored such parameters.
      
      Reject invalid /0 input parameters explicitly.
      Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
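As a rough illustration of explicit rejection for ipset input, a prefix-length check might look like the sketch below. `cidr_is_valid()` and `cidr_host_count()` are hypothetical helpers, and the accepted range of 1 to 32 is an assumption for IPv4; the commit's actual checks are not shown here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical validator for an IPv4 prefix length supplied by userspace. */
static bool cidr_is_valid(unsigned cidr)
{
    return cidr >= 1 && cidr <= 32;
}

/* Number of addresses covered by a prefix length in the range 0..32. */
static uint64_t cidr_host_count(unsigned cidr)
{
    return UINT64_C(1) << (32 - cidr);
}
```

A /0 prefix covers the whole IPv4 space, 2^32 addresses, which is one more than a 32-bit counter can represent.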
    • J
    • C
      ipconfig: add nameserver IPs to kernel-parameter ip= · 5e953778
      Committed by Christoph Fritz
      On small systems (e.g. embedded ones) IP addresses are often configured
      by bootloaders and passed to the kernel via the parameter "ip=".  If set to
      "ip=dhcp", even nameserver entries from DHCP daemons are handled. These
      entries, exported in /proc/net/pnp, are commonly linked to by /etc/resolv.conf.
      
      To configure nameservers for networks without DHCP, this patch adds the options
      <dns0-ip> and <dns1-ip> to the kernel parameter 'ip='.
      Signed-off-by: Christoph Fritz <chf.fritz@googlemail.com>
      Tested-by: Jan Weitzel <j.weitzel@phytec.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
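To show where the new nameserver fields would sit in an ip= value, here is a small field extractor. It assumes the colon-separated layout `<client-ip>:<server-ip>:<gw-ip>:<netmask>:<hostname>:<device>:<autoconf>:<dns0-ip>:<dns1-ip>`, so dns0 is field 7 and dns1 is field 8; that ordering is an assumption not stated in the commit message, and `ip_param_field()` is a hypothetical helper, not the kernel's parser.

```c
#include <stddef.h>
#include <string.h>

/* Copies the idx-th ':'-separated field of s into out (NUL-terminated).
 * Returns 0 on success, -1 if the field is absent or does not fit.
 * Empty fields are valid and yield an empty string. */
static int ip_param_field(const char *s, int idx, char *out, size_t outlen)
{
    for (int i = 0; i < idx; i++) {
        s = strchr(s, ':');
        if (!s)
            return -1;
        s++; /* skip the separator */
    }

    size_t n = strcspn(s, ":");
    if (n >= outlen)
        return -1;
    memcpy(out, s, n);
    out[n] = '\0';
    return 0;
}
```

For a value such as `192.0.2.10::192.0.2.1:255.255.255.0:board:eth0:off:192.0.2.53:192.0.2.54`, fields 7 and 8 hold the two nameserver addresses under the assumed layout.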
    • A
      l2tp: fix compile error when CONFIG_IPV6=m and CONFIG_L2TP=y · fc181625
      Committed by Amerigo Wang
      When CONFIG_IPV6=m and CONFIG_L2TP=y, I got the following compile error:
      
        LD      init/built-in.o
      net/built-in.o: In function `l2tp_xmit_core':
      l2tp_core.c:(.text+0x147781): undefined reference to `inet6_csk_xmit'
      net/built-in.o: In function `l2tp_tunnel_create':
      (.text+0x149067): undefined reference to `udpv6_encap_enable'
      net/built-in.o: In function `l2tp_ip6_recvmsg':
      l2tp_ip6.c:(.text+0x14e991): undefined reference to `ipv6_recv_error'
      net/built-in.o: In function `l2tp_ip6_sendmsg':
      l2tp_ip6.c:(.text+0x14ec64): undefined reference to `fl6_sock_lookup'
      l2tp_ip6.c:(.text+0x14ed6b): undefined reference to `datagram_send_ctl'
      l2tp_ip6.c:(.text+0x14eda0): undefined reference to `fl6_sock_lookup'
      l2tp_ip6.c:(.text+0x14ede5): undefined reference to `fl6_merge_options'
      l2tp_ip6.c:(.text+0x14edf4): undefined reference to `ipv6_fixup_options'
      l2tp_ip6.c:(.text+0x14ee5d): undefined reference to `fl6_update_dst'
      l2tp_ip6.c:(.text+0x14eea3): undefined reference to `ip6_dst_lookup_flow'
      l2tp_ip6.c:(.text+0x14eee7): undefined reference to `ip6_dst_hoplimit'
      l2tp_ip6.c:(.text+0x14ef8b): undefined reference to `ip6_append_data'
      l2tp_ip6.c:(.text+0x14ef9d): undefined reference to `ip6_flush_pending_frames'
      l2tp_ip6.c:(.text+0x14efe2): undefined reference to `ip6_push_pending_frames'
      net/built-in.o: In function `l2tp_ip6_destroy_sock':
      l2tp_ip6.c:(.text+0x14f090): undefined reference to `ip6_flush_pending_frames'
      l2tp_ip6.c:(.text+0x14f0a0): undefined reference to `inet6_destroy_sock'
      net/built-in.o: In function `l2tp_ip6_connect':
      l2tp_ip6.c:(.text+0x14f14d): undefined reference to `ip6_datagram_connect'
      net/built-in.o: In function `l2tp_ip6_bind':
      l2tp_ip6.c:(.text+0x14f4fe): undefined reference to `ipv6_chk_addr'
      net/built-in.o: In function `l2tp_ip6_init':
      l2tp_ip6.c:(.init.text+0x73fa): undefined reference to `inet6_add_protocol'
      l2tp_ip6.c:(.init.text+0x740c): undefined reference to `inet6_register_protosw'
      net/built-in.o: In function `l2tp_ip6_exit':
      l2tp_ip6.c:(.exit.text+0x1954): undefined reference to `inet6_unregister_protosw'
      l2tp_ip6.c:(.exit.text+0x1965): undefined reference to `inet6_del_protocol'
      net/built-in.o:(.rodata+0xf2d0): undefined reference to `inet6_release'
      net/built-in.o:(.rodata+0xf2d8): undefined reference to `inet6_bind'
      net/built-in.o:(.rodata+0xf308): undefined reference to `inet6_ioctl'
      net/built-in.o:(.data+0x1af40): undefined reference to `ipv6_setsockopt'
      net/built-in.o:(.data+0x1af48): undefined reference to `ipv6_getsockopt'
      net/built-in.o:(.data+0x1af50): undefined reference to `compat_ipv6_setsockopt'
      net/built-in.o:(.data+0x1af58): undefined reference to `compat_ipv6_getsockopt'
      make: *** [vmlinux] Error 1
      
      This is because l2tp uses symbols from IPV6, so when IPV6
      is a module, l2tp is not allowed to be built in.
      
      Cc: David Miller <davem@davemloft.net>
      Signed-off-by: Cong Wang <amwang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
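The l2tp rule "when IPV6 is a module, l2tp may not be built in" can be modelled with Kconfig's tristate logic. The sketch assumes the dependency is written with the common idiom `IPV6 || IPV6=n`; the commit's actual Kconfig change is not shown above, and the function names are hypothetical.

```c
/* Kconfig tristate values, ordered n < m < y. */
enum tristate { TRI_N = 0, TRI_M = 1, TRI_Y = 2 };

/* Evaluates the assumed dependency expression "IPV6 || IPV6=n":
 * '||' is the maximum of both sides, and a comparison yields y or n. */
static enum tristate l2tp_dep_limit(enum tristate ipv6)
{
    enum tristate ipv6_is_n = (ipv6 == TRI_N) ? TRI_Y : TRI_N;
    return ipv6 > ipv6_is_n ? ipv6 : ipv6_is_n;
}

/* A symbol's value cannot exceed the value of its dependencies. */
static enum tristate l2tp_effective(enum tristate wanted, enum tristate ipv6)
{
    enum tristate limit = l2tp_dep_limit(ipv6);
    return wanted < limit ? wanted : limit;
}
```

Under this model, asking for L2TP=y with IPV6=m is capped to m, while IPV6=y or IPV6=n leaves L2TP=y allowed.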
  6. 21 Sep, 2012 7 commits
    • J
      netfilter: combine ipt_REDIRECT and ip6t_REDIRECT · 2cbc78a2
      Committed by Jan Engelhardt
      Combine more modules since the actual code is so small anyway that the
      kmod metadata and the module in its loaded state totally outweigh the
      combined actual code size.
      
      IP_NF_TARGET_REDIRECT becomes a compat option; IP6_NF_TARGET_REDIRECT
      is completely eliminated since it has not seen a release yet.
      Signed-off-by: Jan Engelhardt <jengelh@inai.de>
      Acked-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • J
      netfilter: combine ipt_NETMAP and ip6t_NETMAP · b3d54b3e
      Committed by Jan Engelhardt
      Combine more modules since the actual code is so small anyway that the
      kmod metadata and the module in its loaded state totally outweigh the
      combined actual code size.
      
      IP_NF_TARGET_NETMAP becomes a compat option; IP6_NF_TARGET_NETMAP
      is completely eliminated since it has not seen a release yet.
      Signed-off-by: Jan Engelhardt <jengelh@inai.de>
      Acked-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • U
      netfilter: nf_nat: remove obsolete rcu_read_unlock call · 136251d0
      Committed by Ulrich Weber
      The hlist walk in find_appropriate_src() is no longer protected by rcu_read_lock(),
      so the rcu_read_unlock() call when in_range() matches is unnecessary.
      
      This bug was introduced in c7232c99 ("netfilter: add protocol independent NAT core").
      Signed-off-by: Ulrich Weber <ulrich.weber@sophos.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • P
      netfilter: nf_nat: fix oops when unloading protocol modules · b0cdb1d9
      Committed by Patrick McHardy
      When unloading a protocol module, nf_ct_iterate_cleanup() is used to
      remove all conntracks using the protocol from the bysource hash and
      clean their NAT sections. Since the conntrack isn't actually killed,
      the NAT callback is invoked twice, once for each direction, which
      causes an oops when trying to delete it from the bysource hash for
      the second time.
      
      The same oops can also happen when removing both an L3 and L4 protocol
      since the cleanup function doesn't check whether the conntrack has
      already been cleaned up.
      
      Pid: 4052, comm: modprobe Not tainted 3.6.0-rc3-test-nat-unload-fix+ #32 Red Hat KVM
      RIP: 0010:[<ffffffffa002c303>]  [<ffffffffa002c303>] nf_nat_proto_clean+0x73/0xd0 [nf_nat]
      RSP: 0018:ffff88007808fe18  EFLAGS: 00010246
      RAX: 0000000000000000 RBX: ffff8800728550c0 RCX: ffff8800756288b0
      RDX: dead000000200200 RSI: ffff88007808fe88 RDI: ffffffffa002f208
      RBP: ffff88007808fe28 R08: ffff88007808e000 R09: 0000000000000000
      R10: dead000000200200 R11: dead000000100100 R12: ffffffff81c6dc00
      R13: ffff8800787582b8 R14: ffff880078758278 R15: ffff88007808fe88
      FS:  00007f515985d700(0000) GS:ffff88007cd00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      CR2: 00007f515986a000 CR3: 000000007867a000 CR4: 00000000000006e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process modprobe (pid: 4052, threadinfo ffff88007808e000, task ffff8800756288b0)
      Stack:
       ffff88007808fe68 ffffffffa002c290 ffff88007808fe78 ffffffff815614e3
       ffffffff00000000 00000aeb00000246 ffff88007808fe68 ffffffff81c6dc00
       ffff88007808fe88 ffffffffa00358a0 0000000000000000 000000000040f5b0
      Call Trace:
       [<ffffffffa002c290>] ? nf_nat_net_exit+0x50/0x50 [nf_nat]
       [<ffffffff815614e3>] nf_ct_iterate_cleanup+0xc3/0x170
       [<ffffffffa002c55a>] nf_nat_l3proto_unregister+0x8a/0x100 [nf_nat]
       [<ffffffff812a0303>] ? compat_prepare_timeout+0x13/0xb0
       [<ffffffffa0035848>] nf_nat_l3proto_ipv4_exit+0x10/0x23 [nf_nat_ipv4]
       ...
      
      To fix this,
      
      - check whether the conntrack has already been cleaned up in
        nf_nat_proto_clean
      
      - change nf_ct_iterate_cleanup() to only invoke the callback function
        once for each conntrack (IP_CT_DIR_ORIGINAL).
      
      The second change doesn't affect other callers since when conntracks are
      actually killed, both directions are removed from the hash immediately
      and the callback is already only invoked once. If it is not killed, the
      second callback invocation will always return the same decision not to
      kill it.
      Reported-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
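The two-part fix for the nf_nat unload oops (an idempotent clean callback, plus invoking the callback only for the original direction) can be sketched with a toy table in which every connection appears once per direction. All types and functions below are hypothetical stand-ins, not the conntrack API.

```c
#include <stdbool.h>
#include <stddef.h>

enum { DIR_ORIGINAL = 0, DIR_REPLY = 1 };

struct conn {
    bool nat_cleaned;
    int  clean_calls; /* counts real cleanups, for demonstration only */
};

/* Each connection appears twice in the table, once per direction. */
struct tuple_entry {
    struct conn *ct;
    int dir;
};

/* Idempotent callback: a second call on the same connection is a no-op. */
static void nat_clean(struct conn *ct)
{
    if (ct->nat_cleaned)
        return;
    ct->nat_cleaned = true;
    ct->clean_calls++; /* stands in for removal from the bysource hash */
}

/* Visits every entry but invokes the callback only for the original
 * direction, so each connection is handled once per walk. */
static int iterate_cleanup(struct tuple_entry *tab, size_t n,
                           void (*cb)(struct conn *))
{
    int invoked = 0;

    for (size_t i = 0; i < n; i++) {
        if (tab[i].dir != DIR_ORIGINAL)
            continue;
        cb(tab[i].ct);
        invoked++;
    }
    return invoked;
}
```

The direction filter prevents the double invocation within one walk, and the flag check covers the case where an L3 and an L4 module are unloaded one after the other.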
    • P
      netfilter: fix IPv6 NAT dependencies in Kconfig · b0041d1b
      Committed by Pablo Neira Ayuso
      * NF_NAT_IPV6 requires IP6_NF_IPTABLES
      
      * IP6_NF_TARGET_MASQUERADE, IP6_NF_TARGET_NETMAP, IP6_NF_TARGET_REDIRECT
        and IP6_NF_TARGET_NPT require NF_NAT_IPV6.
      
      This change just mirrors what IPv4 does in Kconfig, for consistency.
      Reported-by: Randy Dunlap <rdunlap@xenotime.net>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • A
      tcp: Document use of undefined variable. · 4308fc58
      Committed by Alan Cox
      Both tcp_timewait_state_process and tcp_check_req use the same basic
      construct of
      
      	struct tcp_options_received tmp_opt;
      	tmp_opt.saw_tstamp = 0;
      
      then call
      
      	tcp_parse_options
      
      However, if they are fed a frame containing a TCP_SACK, then the code
      behaviour is undefined because opt_rx->sack_ok is undefined data.
      
      This ought to be documented if it is intentional.
      Signed-off-by: Alan Cox <alan@linux.intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
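One way to avoid reading indeterminate fields in the tcp_parse_options pattern is to clear the whole structure before parsing, rather than a single member. The sketch below uses a hypothetical, trimmed-down struct and helper; it is not the kernel's struct tcp_options_received, and this commit itself only documents the issue rather than changing the code.

```c
#include <string.h>

/* Hypothetical, trimmed-down stand-in for the kernel's option state. */
struct opts_received {
    unsigned saw_tstamp : 1;
    unsigned sack_ok    : 1;
    unsigned num_sacks;
};

/* Clears every field so that a later parser never reads stale stack data. */
static void opts_init(struct opts_received *o)
{
    memset(o, 0, sizeof(*o));
}
```

Setting only `saw_tstamp = 0` on a stack variable leaves `sack_ok` holding whatever was previously on the stack, which is the situation the commit message describes.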
    • C
  7. 20 Sep, 2012 11 commits
  8. 19 Sep, 2012 1 commit