1. 03 5月, 2011 1 次提交
    • D
      ipv4: Make sure flowi4->{saddr,daddr} are always set. · 56157872
      David S. Miller 提交于
      Slow path output route resolution always makes sure that
      ->{saddr,daddr} are set, and also if we trigger into IPSEC resolution
      we initialize them as well, because xfrm_lookup() expects them to be
      fully resolved.
      
      But if we hit the fast path and flowi4->flowi4_proto is zero, we won't
      do this initialization.
      
      Therefore, move the IPSEC path initialization to the route cache
      lookup fast path to make sure these are always set.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      56157872
  2. 02 5月, 2011 4 次提交
  3. 30 4月, 2011 6 次提交
  4. 29 4月, 2011 17 次提交
  5. 28 4月, 2011 8 次提交
    • E
      net: filter: Just In Time compiler for x86-64 · 0a14842f
      Eric Dumazet 提交于
      In order to speedup packet filtering, here is an implementation of a
      JIT compiler for x86_64
      
      It is disabled by default, and must be enabled by the admin.
      
      echo 1 >/proc/sys/net/core/bpf_jit_enable
      
      It uses module_alloc() and module_free() to get memory in the 2GB text
      kernel range since we call helpers functions from the generated code.
      
      EAX : BPF A accumulator
      EBX : BPF X accumulator
      RDI : pointer to skb   (first argument given to JIT function)
      RBP : frame pointer (even if CONFIG_FRAME_POINTER=n)
      r9d : skb->len - skb->data_len (headlen)
      r8  : skb->data
      
      To get a trace of generated code, use :
      
      echo 2 >/proc/sys/net/core/bpf_jit_enable
      
      Example of generated code :
      
      # tcpdump -p -n -s 0 -i eth1 host 192.168.20.0/24
      
      flen=18 proglen=147 pass=3 image=ffffffffa00b5000
      JIT code: ffffffffa00b5000: 55 48 89 e5 48 83 ec 60 48 89 5d f8 44 8b 4f 60
      JIT code: ffffffffa00b5010: 44 2b 4f 64 4c 8b 87 b8 00 00 00 be 0c 00 00 00
      JIT code: ffffffffa00b5020: e8 24 7b f7 e0 3d 00 08 00 00 75 28 be 1a 00 00
      JIT code: ffffffffa00b5030: 00 e8 fe 7a f7 e0 24 00 3d 00 14 a8 c0 74 49 be
      JIT code: ffffffffa00b5040: 1e 00 00 00 e8 eb 7a f7 e0 24 00 3d 00 14 a8 c0
      JIT code: ffffffffa00b5050: 74 36 eb 3b 3d 06 08 00 00 74 07 3d 35 80 00 00
      JIT code: ffffffffa00b5060: 75 2d be 1c 00 00 00 e8 c8 7a f7 e0 24 00 3d 00
      JIT code: ffffffffa00b5070: 14 a8 c0 74 13 be 26 00 00 00 e8 b5 7a f7 e0 24
      JIT code: ffffffffa00b5080: 00 3d 00 14 a8 c0 75 07 b8 ff ff 00 00 eb 02 31
      JIT code: ffffffffa00b5090: c0 c9 c3
      
      BPF program is 144 bytes long, so native program is almost same size ;)
      
      (000) ldh      [12]
      (001) jeq      #0x800           jt 2    jf 8
      (002) ld       [26]
      (003) and      #0xffffff00
      (004) jeq      #0xc0a81400      jt 16   jf 5
      (005) ld       [30]
      (006) and      #0xffffff00
      (007) jeq      #0xc0a81400      jt 16   jf 17
      (008) jeq      #0x806           jt 10   jf 9
      (009) jeq      #0x8035          jt 10   jf 17
      (010) ld       [28]
      (011) and      #0xffffff00
      (012) jeq      #0xc0a81400      jt 16   jf 13
      (013) ld       [38]
      (014) and      #0xffffff00
      (015) jeq      #0xc0a81400      jt 16   jf 17
      (016) ret      #65535
      (017) ret      #0
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: Ben Hutchings <bhutchings@solarflare.com>
      Cc: Hagen Paul Pfeifer <hagen@jauu.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0a14842f
    • D
      ipv4: Remove erroneous check in igmpv3_newpack() and igmp_send_report(). · 2e97e980
      David S. Miller 提交于
      Output route resolution never returns a route with rt_src set to zero
      (which is INADDR_ANY).
      
      Even if the flow key for the output route lookup specifies INADDR_ANY
      for the source address, the output route resolution chooses a real
      source address to use in the final route.
      
      This test has existed forever in igmp_send_report() and David Stevens
      simply copied over the erroneous test when implementing support for
      IGMPv3.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Reviewed-by: NEric Dumazet <eric.dumazet@gmail.com>
      2e97e980
    • D
      ipv4: Sanitize and simplify ip_route_{connect,newports}() · 2d7192d6
      David S. Miller 提交于
      These functions are used together as a unit for route resolution
      during connect().  They address the chicken-and-egg problem that
      exists when ports need to be allocated during connect() processing,
      yet such port allocations require addressing information from the
      routing code.
      
      It's currently more heavy handed than it needs to be, and in
      particular we allocate and initialize a flow object twice.
      
      Let the callers provide the on-stack flow object.  That way we only
      need to initialize it once in the ip_route_connect() call.
      
      Later, if ip_route_newports() needs to do anything, it re-uses that
      flow object as-is except for the ports which it updates before the
      route re-lookup.
      
      Also, describe why this set of facilities are needed and how it works
      in a big comment.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Reviewed-by: NEric Dumazet <eric.dumazet@gmail.com>
      2d7192d6
    • V
      sctp: clean up route lookup calls · da0420be
      Vlad Yasevich 提交于
      Change the call to take the transport parameter and set the
      cached 'dst' appropriately inside the get_dst() function calls.
      
      This will allow us in the future  to clean up source address
      storage as well.
      Signed-off-by: NVlad Yasevich <vladislav.yasevich@hp.com>
      Signed-off-by: NWei Yongjun <yjwei@cn.fujitsu.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      da0420be
    • V
      sctp: remove useless arguments from get_saddr() call · af138470
      Vlad Yasevich 提交于
      There is no point in passing a destination address to
      a get_saddr() call.
      Signed-off-by: NVlad Yasevich <vladislav.yasevich@hp.com>
      Signed-off-by: NWei Yongjun <yjwei@cn.fujitsu.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      af138470
    • V
      sctp: make sctp over IPv6 work with IPsec · 9c6a02f4
      Vlad Yasevich 提交于
      SCTP never called xfrm_output after it's v6 route lookups so
      that never really worked with ipsec.  Additioanlly, we never
      passed port nubmers and protocol in the flowi, so any port
      based policies were never applied as well.  Now that we can
      fixed ipv6 routing lookup code, using ip6_dst_lookup_flow()
      and pass port numbers.
      Signed-off-by: NVlad Yasevich <vladislav.yasevich@hp.com>
      Signed-off-by: NWei Yongjun <yjwei@cn.fujitsu.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9c6a02f4
    • V
      sctp: cache the ipv6 source after route lookup · 9914ae3c
      Vlad Yasevich 提交于
      The ipv6 routing lookup does give us a source address,
      but instead of filling it into the dst, it's stored in
      the flowi.  We can use that instead of going through the
      entire source address selection again.
      Also the useless ->dst_saddr member of sctp_pf is removed.
      And sctp_v6_dst_saddr() is removed, instead by introduce
      sctp_v6_to_addr(), which can be reused to cleanup some dup
      code.
      Signed-off-by: NVlad Yasevich <vladislav.yasevich@hp.com>
      Signed-off-by: NWei Yongjun <yjwei@cn.fujitsu.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9914ae3c
    • W
      sctp: fix sctp to work with ipv6 source address routing · 62503411
      Weixing Shi 提交于
      In the below test case, using the source address routing,
      sctp can not work.
      Node-A
      1)ifconfig eth0 inet6 add 2001:1::1/64
      2)ip -6 rule add from 2001:1::1 table 100 pref 100
      3)ip -6 route add 2001:2::1 dev eth0 table 100
      4)sctp_darn -H 2001:1::1 -P 250 -l &
      Node-B
      1)ifconfig eth0 inet6 add 2001:2::1/64
      2)ip -6 rule add from 2001:2::1 table 100 pref 100
      3)ip -6 route add 2001:1::1 dev eth0 table 100
      4)sctp_darn -H 2001:2::1 -P 250 -h 2001:1::1 -p 250 -s
      
      root cause:
      Node-A and Node-B use the source address routing, and
      at begining, source address will be NULL,sctp will
      search the  routing table by the destination address,
      because using the source address routing table, and
      the result dst_entry will be NULL.
      
      solution:
      walk through the bind address list to get the source
      address and then lookup the routing table again to get
      the correct dst_entry.
      Signed-off-by: NWeixing Shi <Weixing.Shi@windriver.com>
      Signed-off-by: NVlad Yasevich <vladislav.yasevich@hp.com>
      Signed-off-by: NWei Yongjun <yjwei@cn.fujitsu.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      62503411
  6. 26 4月, 2011 4 次提交
    • D
      bluetooth: Fix use-before-initiailized var. · bf734843
      David S. Miller 提交于
      net/bluetooth/l2cap_core.c: In function ‘l2cap_recv_frame’:
      net/bluetooth/l2cap_core.c:3612:15: warning: ‘sk’ may be used uninitialized in this function
      net/bluetooth/l2cap_core.c:3612:15: note: ‘sk’ was declared here
      
      Actually the problem is in the inline function l2cap_data_channel(), we
      branch to the label 'done' which tests 'sk' before we set it to anything.
      
      Initialize it to NULL to fix this.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      bf734843
    • J
      bonding: move processing of recv handlers into handle_frame() · 3aba891d
      Jiri Pirko 提交于
      Since now when bonding uses rx_handler, all traffic going into bond
      device goes thru bond_handle_frame. So there's no need to go back into
      bonding code later via ptype handlers. This patch converts
      original ptype handlers into "bonding receive probes". These functions
      are called from bond_handle_frame and they are registered per-mode.
      
      Note that vlan packets are also handled because they are always untagged
      thanks to vlan_untag()
      
      Note that this also allows arpmon for eth-bond-bridge-vlan topology.
      Signed-off-by: NJiri Pirko <jpirko@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3aba891d
    • M
    • H
      net: provide cow_metrics() methods to blackhole dst_ops · 0972ddb2
      Held Bernhard 提交于
      Since commit 62fa8a84 (net: Implement read-only protection and COW'ing
      of metrics.) the kernel throws an oops.
      
      [  101.620985] BUG: unable to handle kernel NULL pointer dereference at
                 (null)
      [  101.621050] IP: [<          (null)>]           (null)
      [  101.621084] PGD 6e53c067 PUD 3dd6a067 PMD 0
      [  101.621122] Oops: 0010 [#1] SMP
      [  101.621153] last sysfs file: /sys/devices/virtual/ppp/ppp/uevent
      [  101.621192] CPU 2
      [  101.621206] Modules linked in: l2tp_ppp pppox ppp_generic slhc
      l2tp_netlink l2tp_core deflate zlib_deflate twofish_x86_64
      twofish_common des_generic cbc ecb sha1_generic hmac af_key
      iptable_filter snd_pcm_oss snd_mixer_oss snd_seq snd_seq_device loop
      snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_intel snd_hda_codec
      snd_pcm snd_timer snd i2c_i801 iTCO_wdt psmouse soundcore snd_page_alloc
      evdev uhci_hcd ehci_hcd thermal
      [  101.621552]
      [  101.621567] Pid: 5129, comm: openl2tpd Not tainted 2.6.39-rc4-Quad #3
      Gigabyte Technology Co., Ltd. G33-DS3R/G33-DS3R
      [  101.621637] RIP: 0010:[<0000000000000000>]  [<          (null)>]   (null)
      [  101.621684] RSP: 0018:ffff88003ddeba60  EFLAGS: 00010202
      [  101.621716] RAX: ffff88003ddb5600 RBX: ffff88003ddb5600 RCX:
      0000000000000020
      [  101.621758] RDX: ffffffff81a69a00 RSI: ffffffff81b7ee61 RDI:
      ffff88003ddb5600
      [  101.621800] RBP: ffff8800537cd900 R08: 0000000000000000 R09:
      ffff88003ddb5600
      [  101.621840] R10: 0000000000000005 R11: 0000000000014b38 R12:
      ffff88003ddb5600
      [  101.621881] R13: ffffffff81b7e480 R14: ffffffff81b7e8b8 R15:
      ffff88003ddebad8
      [  101.621924] FS:  00007f06e4182700(0000) GS:ffff88007fd00000(0000)
      knlGS:0000000000000000
      [  101.621971] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  101.622005] CR2: 0000000000000000 CR3: 0000000045274000 CR4:
      00000000000006e0
      [  101.622046] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
      0000000000000000
      [  101.622087] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
      0000000000000400
      [  101.622129] Process openl2tpd (pid: 5129, threadinfo
      ffff88003ddea000, task ffff88003de9a280)
      [  101.622177] Stack:
      [  101.622191]  ffffffff81447efa ffff88007d3ded80 ffff88003de9a280
      ffff88007d3ded80
      [  101.622245]  0000000000000001 ffff88003ddebbb8 ffffffff8148d5a7
      0000000000000212
      [  101.622299]  ffff88003dcea000 ffff88003dcea188 ffffffff00000001
      ffffffff81b7e480
      [  101.622353] Call Trace:
      [  101.622374]  [<ffffffff81447efa>] ? ipv4_blackhole_route+0x1ba/0x210
      [  101.622415]  [<ffffffff8148d5a7>] ? xfrm_lookup+0x417/0x510
      [  101.622450]  [<ffffffff8127672a>] ? extract_buf+0x9a/0x140
      [  101.622485]  [<ffffffff8144c6a0>] ? __ip_flush_pending_frames+0x70/0x70
      [  101.622526]  [<ffffffff8146fbbf>] ? udp_sendmsg+0x62f/0x810
      [  101.622562]  [<ffffffff813f98a6>] ? sock_sendmsg+0x116/0x130
      [  101.622599]  [<ffffffff8109df58>] ? find_get_page+0x18/0x90
      [  101.622633]  [<ffffffff8109fd6a>] ? filemap_fault+0x12a/0x4b0
      [  101.622668]  [<ffffffff813fb5c4>] ? move_addr_to_kernel+0x64/0x90
      [  101.622706]  [<ffffffff81405d5a>] ? verify_iovec+0x7a/0xf0
      [  101.622739]  [<ffffffff813fc772>] ? sys_sendmsg+0x292/0x420
      [  101.622774]  [<ffffffff810b994a>] ? handle_pte_fault+0x8a/0x7c0
      [  101.622810]  [<ffffffff810b76fe>] ? __pte_alloc+0xae/0x130
      [  101.622844]  [<ffffffff810ba2f8>] ? handle_mm_fault+0x138/0x380
      [  101.622880]  [<ffffffff81024af9>] ? do_page_fault+0x189/0x410
      [  101.622915]  [<ffffffff813fbe03>] ? sys_getsockname+0xf3/0x110
      [  101.622952]  [<ffffffff81450c4d>] ? ip_setsockopt+0x4d/0xa0
      [  101.622986]  [<ffffffff813f9932>] ? sockfd_lookup_light+0x22/0x90
      [  101.623024]  [<ffffffff814b61fb>] ? system_call_fastpath+0x16/0x1b
      [  101.623060] Code:  Bad RIP value.
      [  101.623090] RIP  [<          (null)>]           (null)
      [  101.623125]  RSP <ffff88003ddeba60>
      [  101.623146] CR2: 0000000000000000
      [  101.650871] ---[ end trace ca3856a7d8e8dad4 ]---
      [  101.651011] __sk_free: optmem leakage (160 bytes) detected.
      
      The oops happens in dst_metrics_write_ptr()
      include/net/dst.h:124: return dst->ops->cow_metrics(dst, p);
      
      dst->ops->cow_metrics is NULL and causes the oops.
      
      Provide cow_metrics() methods, like we did in commit 214f45c9
      (net: provide default_advmss() methods to blackhole dst_ops)
      Signed-off-by: NHeld Bernhard <berny156@gmx.de>
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0972ddb2