1. 18 5月, 2017 1 次提交
    • C
      ipv6: Prevent overrun when parsing v6 header options · 2423496a
      Craig Gallek 提交于
      The KASAN warning repoted below was discovered with a syzkaller
      program.  The reproducer is basically:
        int s = socket(AF_INET6, SOCK_RAW, NEXTHDR_HOP);
        send(s, &one_byte_of_data, 1, MSG_MORE);
        send(s, &more_than_mtu_bytes_data, 2000, 0);
      
      The socket() call sets the nexthdr field of the v6 header to
      NEXTHDR_HOP, the first send call primes the payload with a non zero
      byte of data, and the second send call triggers the fragmentation path.
      
      The fragmentation code tries to parse the header options in order
      to figure out where to insert the fragment option.  Since nexthdr points
      to an invalid option, the calculation of the size of the network header
      can made to be much larger than the linear section of the skb and data
      is read outside of it.
      
      This fix makes ip6_find_1stfrag return an error if it detects
      running out-of-bounds.
      
      [   42.361487] ==================================================================
      [   42.364412] BUG: KASAN: slab-out-of-bounds in ip6_fragment+0x11c8/0x3730
      [   42.365471] Read of size 840 at addr ffff88000969e798 by task ip6_fragment-oo/3789
      [   42.366469]
      [   42.366696] CPU: 1 PID: 3789 Comm: ip6_fragment-oo Not tainted 4.11.0+ #41
      [   42.367628] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.1-1ubuntu1 04/01/2014
      [   42.368824] Call Trace:
      [   42.369183]  dump_stack+0xb3/0x10b
      [   42.369664]  print_address_description+0x73/0x290
      [   42.370325]  kasan_report+0x252/0x370
      [   42.370839]  ? ip6_fragment+0x11c8/0x3730
      [   42.371396]  check_memory_region+0x13c/0x1a0
      [   42.371978]  memcpy+0x23/0x50
      [   42.372395]  ip6_fragment+0x11c8/0x3730
      [   42.372920]  ? nf_ct_expect_unregister_notifier+0x110/0x110
      [   42.373681]  ? ip6_copy_metadata+0x7f0/0x7f0
      [   42.374263]  ? ip6_forward+0x2e30/0x2e30
      [   42.374803]  ip6_finish_output+0x584/0x990
      [   42.375350]  ip6_output+0x1b7/0x690
      [   42.375836]  ? ip6_finish_output+0x990/0x990
      [   42.376411]  ? ip6_fragment+0x3730/0x3730
      [   42.376968]  ip6_local_out+0x95/0x160
      [   42.377471]  ip6_send_skb+0xa1/0x330
      [   42.377969]  ip6_push_pending_frames+0xb3/0xe0
      [   42.378589]  rawv6_sendmsg+0x2051/0x2db0
      [   42.379129]  ? rawv6_bind+0x8b0/0x8b0
      [   42.379633]  ? _copy_from_user+0x84/0xe0
      [   42.380193]  ? debug_check_no_locks_freed+0x290/0x290
      [   42.380878]  ? ___sys_sendmsg+0x162/0x930
      [   42.381427]  ? rcu_read_lock_sched_held+0xa3/0x120
      [   42.382074]  ? sock_has_perm+0x1f6/0x290
      [   42.382614]  ? ___sys_sendmsg+0x167/0x930
      [   42.383173]  ? lock_downgrade+0x660/0x660
      [   42.383727]  inet_sendmsg+0x123/0x500
      [   42.384226]  ? inet_sendmsg+0x123/0x500
      [   42.384748]  ? inet_recvmsg+0x540/0x540
      [   42.385263]  sock_sendmsg+0xca/0x110
      [   42.385758]  SYSC_sendto+0x217/0x380
      [   42.386249]  ? SYSC_connect+0x310/0x310
      [   42.386783]  ? __might_fault+0x110/0x1d0
      [   42.387324]  ? lock_downgrade+0x660/0x660
      [   42.387880]  ? __fget_light+0xa1/0x1f0
      [   42.388403]  ? __fdget+0x18/0x20
      [   42.388851]  ? sock_common_setsockopt+0x95/0xd0
      [   42.389472]  ? SyS_setsockopt+0x17f/0x260
      [   42.390021]  ? entry_SYSCALL_64_fastpath+0x5/0xbe
      [   42.390650]  SyS_sendto+0x40/0x50
      [   42.391103]  entry_SYSCALL_64_fastpath+0x1f/0xbe
      [   42.391731] RIP: 0033:0x7fbbb711e383
      [   42.392217] RSP: 002b:00007ffff4d34f28 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
      [   42.393235] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fbbb711e383
      [   42.394195] RDX: 0000000000001000 RSI: 00007ffff4d34f60 RDI: 0000000000000003
      [   42.395145] RBP: 0000000000000046 R08: 00007ffff4d34f40 R09: 0000000000000018
      [   42.396056] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000400aad
      [   42.396598] R13: 0000000000000066 R14: 00007ffff4d34ee0 R15: 00007fbbb717af00
      [   42.397257]
      [   42.397411] Allocated by task 3789:
      [   42.397702]  save_stack_trace+0x16/0x20
      [   42.398005]  save_stack+0x46/0xd0
      [   42.398267]  kasan_kmalloc+0xad/0xe0
      [   42.398548]  kasan_slab_alloc+0x12/0x20
      [   42.398848]  __kmalloc_node_track_caller+0xcb/0x380
      [   42.399224]  __kmalloc_reserve.isra.32+0x41/0xe0
      [   42.399654]  __alloc_skb+0xf8/0x580
      [   42.400003]  sock_wmalloc+0xab/0xf0
      [   42.400346]  __ip6_append_data.isra.41+0x2472/0x33d0
      [   42.400813]  ip6_append_data+0x1a8/0x2f0
      [   42.401122]  rawv6_sendmsg+0x11ee/0x2db0
      [   42.401505]  inet_sendmsg+0x123/0x500
      [   42.401860]  sock_sendmsg+0xca/0x110
      [   42.402209]  ___sys_sendmsg+0x7cb/0x930
      [   42.402582]  __sys_sendmsg+0xd9/0x190
      [   42.402941]  SyS_sendmsg+0x2d/0x50
      [   42.403273]  entry_SYSCALL_64_fastpath+0x1f/0xbe
      [   42.403718]
      [   42.403871] Freed by task 1794:
      [   42.404146]  save_stack_trace+0x16/0x20
      [   42.404515]  save_stack+0x46/0xd0
      [   42.404827]  kasan_slab_free+0x72/0xc0
      [   42.405167]  kfree+0xe8/0x2b0
      [   42.405462]  skb_free_head+0x74/0xb0
      [   42.405806]  skb_release_data+0x30e/0x3a0
      [   42.406198]  skb_release_all+0x4a/0x60
      [   42.406563]  consume_skb+0x113/0x2e0
      [   42.406910]  skb_free_datagram+0x1a/0xe0
      [   42.407288]  netlink_recvmsg+0x60d/0xe40
      [   42.407667]  sock_recvmsg+0xd7/0x110
      [   42.408022]  ___sys_recvmsg+0x25c/0x580
      [   42.408395]  __sys_recvmsg+0xd6/0x190
      [   42.408753]  SyS_recvmsg+0x2d/0x50
      [   42.409086]  entry_SYSCALL_64_fastpath+0x1f/0xbe
      [   42.409513]
      [   42.409665] The buggy address belongs to the object at ffff88000969e780
      [   42.409665]  which belongs to the cache kmalloc-512 of size 512
      [   42.410846] The buggy address is located 24 bytes inside of
      [   42.410846]  512-byte region [ffff88000969e780, ffff88000969e980)
      [   42.411941] The buggy address belongs to the page:
      [   42.412405] page:ffffea000025a780 count:1 mapcount:0 mapping:          (null) index:0x0 compound_mapcount: 0
      [   42.413298] flags: 0x100000000008100(slab|head)
      [   42.413729] raw: 0100000000008100 0000000000000000 0000000000000000 00000001800c000c
      [   42.414387] raw: ffffea00002a9500 0000000900000007 ffff88000c401280 0000000000000000
      [   42.415074] page dumped because: kasan: bad access detected
      [   42.415604]
      [   42.415757] Memory state around the buggy address:
      [   42.416222]  ffff88000969e880: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      [   42.416904]  ffff88000969e900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      [   42.417591] >ffff88000969e980: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [   42.418273]                    ^
      [   42.418588]  ffff88000969ea00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [   42.419273]  ffff88000969ea80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [   42.419882] ==================================================================
      Reported-by: NAndrey Konovalov <andreyknvl@google.com>
      Signed-off-by: NCraig Gallek <kraig@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2423496a
  2. 14 3月, 2017 1 次提交
    • F
      ipv6: avoid write to a possibly cloned skb · 79e49503
      Florian Westphal 提交于
      ip6_fragment, in case skb has a fraglist, checks if the
      skb is cloned.  If it is, it will move to the 'slow path' and allocates
      new skbs for each fragment.
      
      However, right before entering the slowpath loop, it updates the
      nexthdr value of the last ipv6 extension header to NEXTHDR_FRAGMENT,
      to account for the fragment header that will be inserted in the new
      ipv6-fragment skbs.
      
      In case original skb is cloned this munges nexthdr value of another
      skb.  Avoid this by doing the nexthdr update for each of the new fragment
      skbs separately.
      
      This was observed with tcpdump on a bridge device where netfilter ipv6
      reassembly is active:  tcpdump shows malformed fragment headers as
      the l4 header (icmpv6, tcp, etc). is decoded as a fragment header.
      
      Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Reported-by: NAndreas Karis <akaris@redhat.com>
      Signed-off-by: NFlorian Westphal <fw@strlen.de>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      79e49503
  3. 10 3月, 2017 1 次提交
    • A
      udp: avoid ufo handling on IP payload compression packets · 4b3b45ed
      Alexey Kodanev 提交于
      commit c146066a ("ipv4: Don't use ufo handling on later transformed
      packets") and commit f89c56ce ("ipv6: Don't use ufo handling on
      later transformed packets") added a check that 'rt->dst.header_len' isn't
      zero in order to skip UFO, but it doesn't include IPcomp in transport mode
      where it equals zero.
      
      Packets, after payload compression, may not require further fragmentation,
      and if original length exceeds MTU, later compressed packets will be
      transmitted incorrectly. This can be reproduced with LTP udp_ipsec.sh test
      on veth device with enabled UFO, MTU is 1500 and UDP payload is 2000:
      
      * IPv4 case, offset is wrong + unnecessary fragmentation
          udp_ipsec.sh -p comp -m transport -s 2000 &
          tcpdump -ni ltp_ns_veth2
          ...
          IP (tos 0x0, ttl 64, id 45203, offset 0, flags [+],
            proto Compressed IP (108), length 49)
            10.0.0.2 > 10.0.0.1: IPComp(cpi=0x1000)
          IP (tos 0x0, ttl 64, id 45203, offset 1480, flags [none],
            proto UDP (17), length 21) 10.0.0.2 > 10.0.0.1: ip-proto-17
      
      * IPv6 case, sending small fragments
          udp_ipsec.sh -6 -p comp -m transport -s 2000 &
          tcpdump -ni ltp_ns_veth2
          ...
          IP6 (flowlabel 0x6b9ba, hlim 64, next-header Compressed IP (108)
            payload length: 37) fd00::2 > fd00::1: IPComp(cpi=0x1000)
          IP6 (flowlabel 0x6b9ba, hlim 64, next-header Compressed IP (108)
            payload length: 21) fd00::2 > fd00::1: IPComp(cpi=0x1000)
      
      Fix it by checking 'rt->dst.xfrm' pointer to 'xfrm_state' struct, skip UFO
      if xfrm is set. So the new check will include both cases: IPcomp and IPsec.
      
      Fixes: c146066a ("ipv4: Don't use ufo handling on later transformed packets")
      Fixes: f89c56ce ("ipv6: Don't use ufo handling on later transformed packets")
      Signed-off-by: NAlexey Kodanev <alexey.kodanev@oracle.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4b3b45ed
  4. 19 2月, 2017 1 次提交
  5. 15 2月, 2017 1 次提交
  6. 12 2月, 2017 1 次提交
  7. 08 2月, 2017 2 次提交
  8. 31 1月, 2017 1 次提交
  9. 27 1月, 2017 1 次提交
  10. 30 12月, 2016 1 次提交
    • Z
      ipv6: Should use consistent conditional judgement for ip6 fragment between... · e4c5e13a
      Zheng Li 提交于
      ipv6: Should use consistent conditional judgement for ip6 fragment between __ip6_append_data and ip6_finish_output
      
      There is an inconsistent conditional judgement between __ip6_append_data
      and ip6_finish_output functions, the variable length in __ip6_append_data
      just include the length of application's payload and udp6 header, don't
      include the length of ipv6 header, but in ip6_finish_output use
      (skb->len > ip6_skb_dst_mtu(skb)) as judgement, and skb->len include the
      length of ipv6 header.
      
      That causes some particular application's udp6 payloads whose length are
      between (MTU - IPv6 Header) and MTU were fragmented by ip6_fragment even
      though the rst->dev support UFO feature.
      
      Add the length of ipv6 header to length in __ip6_append_data to keep
      consistent conditional judgement as ip6_finish_output for ip6 fragment.
      Signed-off-by: NZheng Li <james.z.li@ericsson.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e4c5e13a
  11. 26 11月, 2016 1 次提交
    • D
      net: ipv4, ipv6: run cgroup eBPF egress programs · 33b48679
      Daniel Mack 提交于
      If the cgroup associated with the receiving socket has an eBPF
      programs installed, run them from ip_output(), ip6_output() and
      ip_mc_output(). From mentioned functions we have two socket contexts
      as per 7026b1dd ("netfilter: Pass socket pointer down through
      okfn()."). We explicitly need to use sk instead of skb->sk here,
      since otherwise the same program would run multiple times on egress
      when encap devices are involved, which is not desired in our case.
      
      eBPF programs used in this context are expected to either return 1 to
      let the packet pass, or != 1 to drop them. The programs have access to
      the skb through bpf_skb_load_bytes(), and the payload starts at the
      network headers (L3).
      
      Note that cgroup_bpf_run_filter() is stubbed out as static inline nop
      for !CONFIG_CGROUP_BPF, and is otherwise guarded by a static key if
      the feature is unused.
      Signed-off-by: NDaniel Mack <daniel@zonque.org>
      Acked-by: NAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      33b48679
  12. 20 11月, 2016 1 次提交
    • A
      net: fix bogus cast in skb_pagelen() and use unsigned variables · c72d8cda
      Alexey Dobriyan 提交于
      1) cast to "int" is unnecessary:
         u8 will be promoted to int before decrementing,
         small positive numbers fit into "int", so their values won't be changed
         during promotion.
      
         Once everything is int including loop counters, signedness doesn't
         matter: 32-bit operations will stay 32-bit operations.
      
         But! Someone tried to make this loop smart by making everything of
         the same type apparently in an attempt to optimise it.
         Do the optimization, just differently.
         Do the cast where it matters. :^)
      
      2) frag size is unsigned entity and sum of fragments sizes is also
         unsigned.
      
      Make everything unsigned, leave no MOVSX instruction behind.
      
      	add/remove: 0/0 grow/shrink: 0/3 up/down: 0/-4 (-4)
      	function                                     old     new   delta
      	skb_cow_data                                 835     834      -1
      	ip_do_fragment                              2549    2548      -1
      	ip6_fragment                                3130    3128      -2
      	Total: Before=154865032, After=154865028, chg -0.00%
      Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c72d8cda
  13. 10 11月, 2016 1 次提交
  14. 01 11月, 2016 1 次提交
    • J
      ipv6: Don't use ufo handling on later transformed packets · f89c56ce
      Jakub Sitnicki 提交于
      Similar to commit c146066a ("ipv4: Don't use ufo handling on later
      transformed packets"), don't perform UFO on packets that will be IPsec
      transformed. To detect it we rely on the fact that headerlen in
      dst_entry is non-zero only for transformation bundles (xfrm_dst
      objects).
      
      Unwanted segmentation can be observed with a NETIF_F_UFO capable device,
      such as a dummy device:
      
        DEV=dum0 LEN=1493
      
        ip li add $DEV type dummy
        ip addr add fc00::1/64 dev $DEV nodad
        ip link set $DEV up
        ip xfrm policy add dir out src fc00::1 dst fc00::2 \
           tmpl src fc00::1 dst fc00::2 proto esp spi 1
        ip xfrm state add src fc00::1 dst fc00::2 \
           proto esp spi 1 enc 'aes' 0x0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b
      
        tcpdump -n -nn -i $DEV -t &
        socat /dev/zero,readbytes=$LEN udp6:[fc00::2]:$LEN
      
      tcpdump output before:
      
        IP6 fc00::1 > fc00::2: frag (0|1448) ESP(spi=0x00000001,seq=0x1), length 1448
        IP6 fc00::1 > fc00::2: frag (1448|48)
        IP6 fc00::1 > fc00::2: ESP(spi=0x00000001,seq=0x2), length 88
      
      ... and after:
      
        IP6 fc00::1 > fc00::2: frag (0|1448) ESP(spi=0x00000001,seq=0x1), length 1448
        IP6 fc00::1 > fc00::2: frag (1448|80)
      
      Fixes: e89e9cf5 ("[IPv4/IPv6]: UFO Scatter-gather approach")
      Signed-off-by: NJakub Sitnicki <jkbs@redhat.com>
      Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f89c56ce
  15. 11 9月, 2016 3 次提交
  16. 31 8月, 2016 1 次提交
    • R
      net: lwtunnel: Handle fragmentation · 14972cbd
      Roopa Prabhu 提交于
      Today mpls iptunnel lwtunnel_output redirect expects the tunnel
      output function to handle fragmentation. This is ok but can be
      avoided if we did not do the mpls output redirect too early.
      ie we could wait until ip fragmentation is done and then call
      mpls output for each ip fragment.
      
      To make this work we will need,
      1) the lwtunnel state to carry encap headroom
      2) and do the redirect to the encap output handler on the ip fragment
      (essentially do the output redirect after fragmentation)
      
      This patch adds tunnel headroom in lwtstate to make sure we
      account for tunnel data in mtu calculations during fragmentation
      and adds new xmit redirect handler to redirect to lwtunnel xmit func
      after ip fragmentation.
      
      This includes IPV6 and some mtu fixes and testing from David Ahern.
      Signed-off-by: NRoopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      14972cbd
  17. 18 6月, 2016 1 次提交
    • D
      net: vrf: Implement get_saddr for IPv6 · 0d240e78
      David Ahern 提交于
      IPv6 source address selection needs to consider the real egress route.
      Similar to IPv4 implement a get_saddr6 method which is called if
      source address has not been set.  The get_saddr6 method does a full
      lookup which means pulling a route from the VRF FIB table and properly
      considering linklocal/multicast destination addresses. Lookup failures
      (eg., unreachable) then cause the source address selection to fail
      which gets propagated back to the caller.
      Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0d240e78
  18. 09 6月, 2016 1 次提交
    • J
      ipv6: Skip XFRM lookup if dst_entry in socket cache is valid · 00bc0ef5
      Jakub Sitnicki 提交于
      At present we perform an xfrm_lookup() for each UDPv6 message we
      send. The lookup involves querying the flow cache (flow_cache_lookup)
      and, in case of a cache miss, creating an XFRM bundle.
      
      If we miss the flow cache, we can end up creating a new bundle and
      deriving the path MTU (xfrm_init_pmtu) from on an already transformed
      dst_entry, which we pass from the socket cache (sk->sk_dst_cache) down
      to xfrm_lookup(). This can happen only if we're caching the dst_entry
      in the socket, that is when we're using a connected UDP socket.
      
      To put it another way, the path MTU shrinks each time we miss the flow
      cache, which later on leads to incorrectly fragmented payload. It can
      be observed with ESPv6 in transport mode:
      
        1) Set up a transformation and lower the MTU to trigger fragmentation
          # ip xfrm policy add dir out src ::1 dst ::1 \
            tmpl src ::1 dst ::1 proto esp spi 1
          # ip xfrm state add src ::1 dst ::1 \
            proto esp spi 1 enc 'aes' 0x0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b
          # ip link set dev lo mtu 1500
      
        2) Monitor the packet flow and set up an UDP sink
          # tcpdump -ni lo -ttt &
          # socat udp6-listen:12345,fork /dev/null &
      
        3) Send a datagram that needs fragmentation with a connected socket
          # perl -e 'print "@" x 1470 | socat - udp6:[::1]:12345
          2016/06/07 18:52:52 socat[724] E read(3, 0x555bb3d5ba00, 8192): Protocol error
          00:00:00.000000 IP6 ::1 > ::1: frag (0|1448) ESP(spi=0x00000001,seq=0x2), length 1448
          00:00:00.000014 IP6 ::1 > ::1: frag (1448|32)
          00:00:00.000050 IP6 ::1 > ::1: ESP(spi=0x00000001,seq=0x3), length 1272
          (^ ICMPv6 Parameter Problem)
          00:00:00.000022 IP6 ::1 > ::1: ESP(spi=0x00000001,seq=0x5), length 136
      
        4) Compare it to a non-connected socket
          # perl -e 'print "@" x 1500' | socat - udp6-sendto:[::1]:12345
          00:00:40.535488 IP6 ::1 > ::1: frag (0|1448) ESP(spi=0x00000001,seq=0x6), length 1448
          00:00:00.000010 IP6 ::1 > ::1: frag (1448|64)
      
      What happens in step (3) is:
      
        1) when connecting the socket in __ip6_datagram_connect(), we
           perform an XFRM lookup, miss the flow cache, create an XFRM
           bundle, and cache the destination,
      
        2) afterwards, when sending the datagram, we perform an XFRM lookup,
           again, miss the flow cache (due to mismatch of flowi6_iif and
           flowi6_oif, which is an issue of its own), and recreate an XFRM
           bundle based on the cached (and already transformed) destination.
      
      To prevent the recreation of an XFRM bundle, avoid an XFRM lookup
      altogether whenever we already have a destination entry cached in the
      socket. This prevents the path MTU shrinkage and brings us on par with
      UDPv4.
      
      The fix also benefits connected PINGv6 sockets, another user of
      ip6_sk_dst_lookup_flow(), who also suffer messages being transformed
      twice.
      
      Joint work with Hannes Frederic Sowa.
      Reported-by: NJan Tluka <jtluka@redhat.com>
      Signed-off-by: NJakub Sitnicki <jkbs@redhat.com>
      Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      00bc0ef5
  19. 04 6月, 2016 1 次提交
  20. 04 5月, 2016 1 次提交
    • W
      ipv6: add new struct ipcm6_cookie · 26879da5
      Wei Wang 提交于
      In the sendmsg function of UDP, raw, ICMP and l2tp sockets, we use local
      variables like hlimits, tclass, opt and dontfrag and pass them to corresponding
      functions like ip6_make_skb, ip6_append_data and xxx_push_pending_frames.
      This is not a good practice and makes it hard to add new parameters.
      This fix introduces a new struct ipcm6_cookie similar to ipcm_cookie in
      ipv4 and include the above mentioned variables. And we only pass the
      pointer to this structure to corresponding functions. This makes it easier
      to add new parameters in the future and makes the function cleaner.
      Signed-off-by: NWei Wang <weiwan@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      26879da5
  21. 28 4月, 2016 1 次提交
  22. 08 4月, 2016 1 次提交
    • J
      ipv6: Count in extension headers in skb->network_header · 3ba3458f
      Jakub Sitnicki 提交于
      When sending a UDPv6 message longer than MTU, account for the length
      of fragmentable IPv6 extension headers in skb->network_header offset.
      Same as we do in alloc_new_skb path in __ip6_append_data().
      
      This ensures that later on __ip6_make_skb() will make space in
      headroom for fragmentable extension headers:
      
      	/* move skb->data to ip header from ext header */
      	if (skb->data < skb_network_header(skb))
      		__skb_pull(skb, skb_network_offset(skb));
      
      Prevents a splat due to skb_under_panic:
      
      skbuff: skb_under_panic: text:ffffffff8143397b len:2126 put:14 \
      head:ffff880005bacf50 data:ffff880005bacf4a tail:0x48 end:0xc0 dev:lo
      ------------[ cut here ]------------
      kernel BUG at net/core/skbuff.c:104!
      invalid opcode: 0000 [#1] KASAN
      CPU: 0 PID: 160 Comm: reproducer Not tainted 4.6.0-rc2 #65
      [...]
      Call Trace:
       [<ffffffff813eb7b9>] skb_push+0x79/0x80
       [<ffffffff8143397b>] eth_header+0x2b/0x100
       [<ffffffff8141e0d0>] neigh_resolve_output+0x210/0x310
       [<ffffffff814eab77>] ip6_finish_output2+0x4a7/0x7c0
       [<ffffffff814efe3a>] ip6_output+0x16a/0x280
       [<ffffffff815440c1>] ip6_local_out+0xb1/0xf0
       [<ffffffff814f1115>] ip6_send_skb+0x45/0xd0
       [<ffffffff81518836>] udp_v6_send_skb+0x246/0x5d0
       [<ffffffff8151985e>] udpv6_sendmsg+0xa6e/0x1090
      [...]
      Reported-by: NJi Jianwen <jiji@redhat.com>
      Signed-off-by: NJakub Sitnicki <jkbs@redhat.com>
      Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3ba3458f
  23. 05 4月, 2016 1 次提交
    • S
      sock: enable timestamping using control messages · c14ac945
      Soheil Hassas Yeganeh 提交于
      Currently, SOL_TIMESTAMPING can only be enabled using setsockopt.
      This is very costly when users want to sample writes to gather
      tx timestamps.
      
      Add support for enabling SO_TIMESTAMPING via control messages by
      using tsflags added in `struct sockcm_cookie` (added in the previous
      patches in this series) to set the tx_flags of the last skb created in
      a sendmsg. With this patch, the timestamp recording bits in tx_flags
      of the skbuff is overridden if SO_TIMESTAMPING is passed in a cmsg.
      
      Please note that this is only effective for overriding the recording
      timestamps flags. Users should enable timestamp reporting (e.g.,
      SOF_TIMESTAMPING_SOFTWARE | SOF_TIMESTAMPING_OPT_ID) using
      socket options and then should ask for SOF_TIMESTAMPING_TX_*
      using control messages per sendmsg to sample timestamps for each
      write.
      Signed-off-by: NSoheil Hassas Yeganeh <soheil@google.com>
      Acked-by: NWillem de Bruijn <willemb@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c14ac945
  24. 02 3月, 2016 1 次提交
  25. 30 1月, 2016 1 次提交
    • P
      ipv6: enforce flowi6_oif usage in ip6_dst_lookup_tail() · 6f21c96a
      Paolo Abeni 提交于
      The current implementation of ip6_dst_lookup_tail basically
      ignore the egress ifindex match: if the saddr is set,
      ip6_route_output() purposefully ignores flowi6_oif, due
      to the commit d46a9d67 ("net: ipv6: Dont add RT6_LOOKUP_F_IFACE
      flag if saddr set"), if the saddr is 'any' the first route lookup
      in ip6_dst_lookup_tail fails, but upon failure a second lookup will
      be performed with saddr set, thus ignoring the ifindex constraint.
      
      This commit adds an output route lookup function variant, which
      allows the caller to specify lookup flags, and modify
      ip6_dst_lookup_tail() to enforce the ifindex match on the second
      lookup via said helper.
      
      ip6_route_output() becames now a static inline function build on
      top of ip6_route_output_flags(); as a side effect, out-of-tree
      modules need now a GPL license to access the output route lookup
      functionality.
      Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
      Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Acked-by: NDavid Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6f21c96a
  26. 12 1月, 2016 1 次提交
  27. 16 12月, 2015 1 次提交
  28. 02 11月, 2015 2 次提交
  29. 29 10月, 2015 2 次提交
  30. 23 10月, 2015 1 次提交
  31. 22 10月, 2015 1 次提交
  32. 13 10月, 2015 1 次提交
  33. 11 10月, 2015 1 次提交
  34. 08 10月, 2015 2 次提交