1. 15 7月, 2014 1 次提交
    • D
      net: sctp: fix information leaks in ulpevent layer · 8f2e5ae4
      Daniel Borkmann 提交于
      While working on some other SCTP code, I noticed that some
      structures shared with user space are leaking uninitialized
      stack or heap buffer. In particular, struct sctp_sndrcvinfo
      has a 2 bytes hole between .sinfo_flags and .sinfo_ppid that
      remains unfilled by us in sctp_ulpevent_read_sndrcvinfo() when
      putting this into cmsg. But also struct sctp_remote_error
      contains a 2 bytes hole that we don't fill but place into a skb
      through skb_copy_expand() via sctp_ulpevent_make_remote_error().
      
      Both structures are defined by the IETF in RFC6458:
      
      * Section 5.3.2. SCTP Header Information Structure:
      
        The sctp_sndrcvinfo structure is defined below:
      
        struct sctp_sndrcvinfo {
          uint16_t sinfo_stream;
          uint16_t sinfo_ssn;
          uint16_t sinfo_flags;
          <-- 2 bytes hole  -->
          uint32_t sinfo_ppid;
          uint32_t sinfo_context;
          uint32_t sinfo_timetolive;
          uint32_t sinfo_tsn;
          uint32_t sinfo_cumtsn;
          sctp_assoc_t sinfo_assoc_id;
        };
      
      * 6.1.3. SCTP_REMOTE_ERROR:
      
        A remote peer may send an Operation Error message to its peer.
        This message indicates a variety of error conditions on an
        association. The entire ERROR chunk as it appears on the wire
        is included in an SCTP_REMOTE_ERROR event. Please refer to the
        SCTP specification [RFC4960] and any extensions for a list of
        possible error formats. An SCTP error notification has the
        following format:
      
        struct sctp_remote_error {
          uint16_t sre_type;
          uint16_t sre_flags;
          uint32_t sre_length;
          uint16_t sre_error;
          <-- 2 bytes hole  -->
          sctp_assoc_t sre_assoc_id;
          uint8_t  sre_data[];
        };
      
      Fix this by setting both to 0 before filling them out. We also
      have other structures shared between user and kernel space in
      SCTP that contains holes (e.g. struct sctp_paddrthlds), but we
      copy that buffer over from user space first and thus don't need
      to care about it in that cases.
      
      While at it, we can also remove lengthy comments copied from
      the draft, instead, we update the comment with the correct RFC
      number where one can look it up.
      Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8f2e5ae4
  2. 12 7月, 2014 2 次提交
  3. 10 7月, 2014 1 次提交
    • B
      netlink: Fix handling of error from netlink_dump(). · ac30ef83
      Ben Pfaff 提交于
      netlink_dump() returns a negative errno value on error.  Until now,
      netlink_recvmsg() directly recorded that negative value in sk->sk_err, but
      that's wrong since sk_err takes positive errno values.  (This manifests as
      userspace receiving a positive return value from the recv() system call,
      falsely indicating success.) This bug was introduced in the commit that
      started checking the netlink_dump() return value, commit b44d211e (netlink:
      handle errors from netlink_dump()).
      
      Multithreaded Netlink dumps are one way to trigger this behavior in
      practice, as described in the commit message for the userspace workaround
      posted here:
          http://openvswitch.org/pipermail/dev/2014-June/042339.html
      
      This commit also fixes the same bug in netlink_poll(), introduced in commit
      cd1df525 (netlink: add flow control for memory mapped I/O).
      Signed-off-by: NBen Pfaff <blp@nicira.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ac30ef83
  4. 09 7月, 2014 3 次提交
    • A
      appletalk: Fix socket referencing in skb · 36beddc2
      Andrey Utkin 提交于
      Setting just skb->sk without taking its reference and setting a
      destructor is invalid. However, in the places where this was done, skb
      is used in a way not requiring skb->sk setting. So dropping the setting
      of skb->sk.
      Thanks to Eric Dumazet <eric.dumazet@gmail.com> for correct solution.
      
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=79441Reported-by: NEd Martin <edman007@edman007.com>
      Signed-off-by: NAndrey Utkin <andrey.krieger.utkin@gmail.com>
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      36beddc2
    • D
      ip_tunnel: fix ip_tunnel_lookup · e0056593
      Dmitry Popov 提交于
      This patch fixes 3 similar bugs where incoming packets might be routed into
      wrong non-wildcard tunnels:
      
      1) Consider the following setup:
          ip address add 1.1.1.1/24 dev eth0
          ip address add 1.1.1.2/24 dev eth0
          ip tunnel add ipip1 remote 2.2.2.2 local 1.1.1.1 mode ipip dev eth0
          ip link set ipip1 up
      
      Incoming ipip packets from 2.2.2.2 were routed into ipip1 even if it has dst =
      1.1.1.2. Moreover even if there was wildcard tunnel like
         ip tunnel add ipip0 remote 2.2.2.2 local any mode ipip dev eth0
      but it was created before explicit one (with local 1.1.1.1), incoming ipip
      packets with src = 2.2.2.2 and dst = 1.1.1.2 were still routed into ipip1.
      
      Same issue existed with all tunnels that use ip_tunnel_lookup (gre, vti)
      
      2)  ip address add 1.1.1.1/24 dev eth0
          ip tunnel add ipip1 remote 2.2.146.85 local 1.1.1.1 mode ipip dev eth0
          ip link set ipip1 up
      
      Incoming ipip packets with dst = 1.1.1.1 were routed into ipip1, no matter what
      src address is. Any remote ip address which has ip_tunnel_hash = 0 raised this
      issue, 2.2.146.85 is just an example, there are more than 4 million of them.
      And again, wildcard tunnel like
         ip tunnel add ipip0 remote any local 1.1.1.1 mode ipip dev eth0
      wouldn't be ever matched if it was created before explicit tunnel like above.
      
      Gre & vti tunnels had the same issue.
      
      3)  ip address add 1.1.1.1/24 dev eth0
          ip tunnel add gre1 remote 2.2.146.84 local 1.1.1.1 key 1 mode gre dev eth0
          ip link set gre1 up
      
      Any incoming gre packet with key = 1 were routed into gre1, no matter what
      src/dst addresses are. Any remote ip address which has ip_tunnel_hash = 0 raised
      the issue, 2.2.146.84 is just an example, there are more than 4 million of them.
      Wildcard tunnel like
         ip tunnel add gre2 remote any local any key 1 mode gre dev eth0
      wouldn't be ever matched if it was created before explicit tunnel like above.
      
      All this stuff happened because while looking for a wildcard tunnel we didn't
      check that matched tunnel is a wildcard one. Fixed.
      Signed-off-by: NDmitry Popov <ixaphire@qrator.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e0056593
    • J
      tipc: fix bug in multicast/broadcast message reassembly · 29322d0d
      Jon Paul Maloy 提交于
      Since commit 37e22164 ("tipc: rename and
      move message reassembly function") reassembly of long broadcast messages
      has been broken. This is because we test for a non-NULL return value
      of the *buf parameter as criteria for succesful reassembly. However, this
      parameter is left defined even after reception of the first fragment,
      when reassebly is still incomplete. This leads to a kernel crash as soon
      as a the first fragment of a long broadcast message is received.
      
      We fix this with this commit, by implementing a stricter behavior of the
      function and its return values.
      
      This commit should be applied to both net and net-next.
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Acked-by: NYing Xue <ying.xue@windriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      29322d0d
  5. 08 7月, 2014 5 次提交
    • Y
      tcp: fix false undo corner cases · 6e08d5e3
      Yuchung Cheng 提交于
      The undo code assumes that, upon entering loss recovery, TCP
      1) always retransmit something
      2) the retransmission never fails locally (e.g., qdisc drop)
      
      so undo_marker is set in tcp_enter_recovery() and undo_retrans is
      incremented only when tcp_retransmit_skb() is successful.
      
      When the assumption is broken because TCP's cwnd is too small to
      retransmit or the retransmit fails locally. The next (DUP)ACK
      would incorrectly revert the cwnd and the congestion state in
      tcp_try_undo_dsack() or tcp_may_undo(). Subsequent (DUP)ACKs
      may enter the recovery state. The sender repeatedly enter and
      (incorrectly) exit recovery states if the retransmits continue to
      fail locally while receiving (DUP)ACKs.
      
      The fix is to initialize undo_retrans to -1 and start counting on
      the first retransmission. Always increment undo_retrans even if the
      retransmissions fail locally because they couldn't cause DSACKs to
      undo the cwnd reduction.
      Signed-off-by: NYuchung Cheng <ycheng@google.com>
      Signed-off-by: NNeal Cardwell <ncardwell@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6e08d5e3
    • D
      igmp: fix the problem when mc leave group · 52ad353a
      dingtianhong 提交于
      The problem was triggered by these steps:
      
      1) create socket, bind and then setsockopt for add mc group.
         mreq.imr_multiaddr.s_addr = inet_addr("255.0.0.37");
         mreq.imr_interface.s_addr = inet_addr("192.168.1.2");
         setsockopt(sockfd, IPPROTO_IP, IP_ADD_MEMBERSHIP, &mreq, sizeof(mreq));
      
      2) drop the mc group for this socket.
         mreq.imr_multiaddr.s_addr = inet_addr("255.0.0.37");
         mreq.imr_interface.s_addr = inet_addr("0.0.0.0");
         setsockopt(sockfd, IPPROTO_IP, IP_DROP_MEMBERSHIP, &mreq, sizeof(mreq));
      
      3) and then drop the socket, I found the mc group was still used by the dev:
      
         netstat -g
      
         Interface       RefCnt Group
         --------------- ------ ---------------------
         eth2		   1	  255.0.0.37
      
      Normally even though the IP_DROP_MEMBERSHIP return error, the mc group still need
      to be released for the netdev when drop the socket, but this process was broken when
      route default is NULL, the reason is that:
      
      The ip_mc_leave_group() will choose the in_dev by the imr_interface.s_addr, if input addr
      is NULL, the default route dev will be chosen, then the ifindex is got from the dev,
      then polling the inet->mc_list and return -ENODEV, but if the default route dev is NULL,
      the in_dev and ifIndex is both NULL, when polling the inet->mc_list, the mc group will be
      released from the mc_list, but the dev didn't dec the refcnt for this mc group, so
      when dropping the socket, the mc_list is NULL and the dev still keep this group.
      
      v1->v2: According Hideaki's suggestion, we should align with IPv6 (RFC3493) and BSDs,
      	so I add the checking for the in_dev before polling the mc_list, make sure when
      	we remove the mc group, dec the refcnt to the real dev which was using the mc address.
      	The problem would never happened again.
      Signed-off-by: NDing Tianhong <dingtianhong@huawei.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      52ad353a
    • L
      net: Fix NETDEV_CHANGE notifier usage causing spurious arp flush · 54951194
      Loic Prylli 提交于
      A bug was introduced in NETDEV_CHANGE notifier sequence causing the
      arp table to be sometimes spuriously cleared (including manual arp
      entries marked permanent), upon network link carrier changes.
      
      The changed argument for the notifier was applied only to a single
      caller of NETDEV_CHANGE, missing among others netdev_state_change().
      So upon net_carrier events induced by the network, which are
      triggering a call to netdev_state_change(), arp_netdev_event() would
      decide whether to clear or not arp cache based on random/junk stack
      values (a kind of read buffer overflow).
      
      Fixes: be9efd36 ("net: pass changed flags along with NETDEV_CHANGE event")
      Fixes: 6c8b4e3f ("arp: flush arp cache on IFF_NOARP change")
      Signed-off-by: NLoic Prylli <loicp@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      54951194
    • T
      net: Performance fix for process_backlog · 11ef7a89
      Tom Herbert 提交于
      In process_backlog the input_pkt_queue is only checked once for new
      packets and quota is artificially reduced to reflect precisely the
      number of packets on the input_pkt_queue so that the loop exits
      appropriately.
      
      This patches changes the behavior to be more straightforward and
      less convoluted. Packets are processed until either the quota
      is met or there are no more packets to process.
      
      This patch seems to provide a small, but noticeable performance
      improvement. The performance improvement is a result of staying
      in the process_backlog loop longer which can reduce number of IPI's.
      
      Performance data using super_netperf TCP_RR with 200 flows:
      
      Before fix:
      
      88.06% CPU utilization
      125/190/309 90/95/99% latencies
      1.46808e+06 tps
      1145382 intrs.sec.
      
      With fix:
      
      87.73% CPU utilization
      122/183/296 90/95/99% latencies
      1.4921e+06 tps
      1021674.30 intrs./sec.
      Signed-off-by: NTom Herbert <therbert@google.com>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      11ef7a89
    • E
      ipv4: icmp: Fix pMTU handling for rare case · 68b7107b
      Edward Allcutt 提交于
      Some older router implementations still send Fragmentation Needed
      errors with the Next-Hop MTU field set to zero. This is explicitly
      described as an eventuality that hosts must deal with by the
      standard (RFC 1191) since older standards specified that those
      bits must be zero.
      
      Linux had a generic (for all of IPv4) implementation of the algorithm
      described in the RFC for searching a list of MTU plateaus for a good
      value. Commit 46517008 ("ipv4: Kill ip_rt_frag_needed().")
      removed this as part of the changes to remove the routing cache.
      Subsequently any Fragmentation Needed packet with a zero Next-Hop
      MTU has been discarded without being passed to the per-protocol
      handlers or notifying userspace for raw sockets.
      
      When there is a router which does not implement RFC 1191 on an
      MTU limited path then this results in stalled connections since
      large packets are discarded and the local protocols are not
      notified so they never attempt to lower the pMTU.
      
      One example I have seen is an OpenBSD router terminating IPSec
      tunnels. It's worth pointing out that this case is distinct from
      the BSD 4.2 bug which incorrectly calculated the Next-Hop MTU
      since the commit in question dismissed that as a valid concern.
      
      All of the per-protocols handlers implement the simple approach from
      RFC 1191 of immediately falling back to the minimum value. Although
      this is sub-optimal it is vastly preferable to connections hanging
      indefinitely.
      
      Remove the Next-Hop MTU != 0 check and allow such packets
      to follow the normal path.
      
      Fixes: 46517008 ("ipv4: Kill ip_rt_frag_needed().")
      Signed-off-by: NEdward Allcutt <edward.allcutt@openmarket.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      68b7107b
  6. 03 7月, 2014 2 次提交
    • C
      tcp: Fix divide by zero when pushing during tcp-repair · 5924f17a
      Christoph Paasch 提交于
      When in repair-mode and TCP_RECV_QUEUE is set, we end up calling
      tcp_push with mss_now being 0. If data is in the send-queue and
      tcp_set_skb_tso_segs gets called, we crash because it will divide by
      mss_now:
      
      [  347.151939] divide error: 0000 [#1] SMP
      [  347.152907] Modules linked in:
      [  347.152907] CPU: 1 PID: 1123 Comm: packetdrill Not tainted 3.16.0-rc2 #4
      [  347.152907] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
      [  347.152907] task: f5b88540 ti: f3c82000 task.ti: f3c82000
      [  347.152907] EIP: 0060:[<c1601359>] EFLAGS: 00210246 CPU: 1
      [  347.152907] EIP is at tcp_set_skb_tso_segs+0x49/0xa0
      [  347.152907] EAX: 00000b67 EBX: f5acd080 ECX: 00000000 EDX: 00000000
      [  347.152907] ESI: f5a28f40 EDI: f3c88f00 EBP: f3c83d10 ESP: f3c83d00
      [  347.152907]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
      [  347.152907] CR0: 80050033 CR2: 083158b0 CR3: 35146000 CR4: 000006b0
      [  347.152907] Stack:
      [  347.152907]  c167f9d9 f5acd080 000005b4 00000002 f3c83d20 c16013e6 f3c88f00 f5acd080
      [  347.152907]  f3c83da0 c1603b5a f3c83d38 c10a0188 00000000 00000000 f3c83d84 c10acc85
      [  347.152907]  c1ad5ec0 00000000 00000000 c1ad679c 010003e0 00000000 00000000 f3c88fc8
      [  347.152907] Call Trace:
      [  347.152907]  [<c167f9d9>] ? apic_timer_interrupt+0x2d/0x34
      [  347.152907]  [<c16013e6>] tcp_init_tso_segs+0x36/0x50
      [  347.152907]  [<c1603b5a>] tcp_write_xmit+0x7a/0xbf0
      [  347.152907]  [<c10a0188>] ? up+0x28/0x40
      [  347.152907]  [<c10acc85>] ? console_unlock+0x295/0x480
      [  347.152907]  [<c10ad24f>] ? vprintk_emit+0x1ef/0x4b0
      [  347.152907]  [<c1605716>] __tcp_push_pending_frames+0x36/0xd0
      [  347.152907]  [<c15f4860>] tcp_push+0xf0/0x120
      [  347.152907]  [<c15f7641>] tcp_sendmsg+0xf1/0xbf0
      [  347.152907]  [<c116d920>] ? kmem_cache_free+0xf0/0x120
      [  347.152907]  [<c106a682>] ? __sigqueue_free+0x32/0x40
      [  347.152907]  [<c106a682>] ? __sigqueue_free+0x32/0x40
      [  347.152907]  [<c114f0f0>] ? do_wp_page+0x3e0/0x850
      [  347.152907]  [<c161c36a>] inet_sendmsg+0x4a/0xb0
      [  347.152907]  [<c1150269>] ? handle_mm_fault+0x709/0xfb0
      [  347.152907]  [<c15a006b>] sock_aio_write+0xbb/0xd0
      [  347.152907]  [<c1180b79>] do_sync_write+0x69/0xa0
      [  347.152907]  [<c1181023>] vfs_write+0x123/0x160
      [  347.152907]  [<c1181d55>] SyS_write+0x55/0xb0
      [  347.152907]  [<c167f0d8>] sysenter_do_call+0x12/0x28
      
      This can easily be reproduced with the following packetdrill-script (the
      "magic" with netem, sk_pacing and limit_output_bytes is done to prevent
      the kernel from pushing all segments, because hitting the limit without
      doing this is not so easy with packetdrill):
      
      0   socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
      +0  setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
      
      +0  bind(3, ..., ...) = 0
      +0  listen(3, 1) = 0
      
      +0  < S 0:0(0) win 32792 <mss 1460>
      +0  > S. 0:0(0) ack 1 <mss 1460>
      +0.1  < . 1:1(0) ack 1 win 65000
      
      +0  accept(3, ..., ...) = 4
      
      // This forces that not all segments of the snd-queue will be pushed
      +0 `tc qdisc add dev tun0 root netem delay 10ms`
      +0 `sysctl -w net.ipv4.tcp_limit_output_bytes=2`
      +0 setsockopt(4, SOL_SOCKET, 47, [2], 4) = 0
      
      +0 write(4,...,10000) = 10000
      +0 write(4,...,10000) = 10000
      
      // Set tcp-repair stuff, particularly TCP_RECV_QUEUE
      +0 setsockopt(4, SOL_TCP, 19, [1], 4) = 0
      +0 setsockopt(4, SOL_TCP, 20, [1], 4) = 0
      
      // This now will make the write push the remaining segments
      +0 setsockopt(4, SOL_SOCKET, 47, [20000], 4) = 0
      +0 `sysctl -w net.ipv4.tcp_limit_output_bytes=130000`
      
      // Now we will crash
      +0 write(4,...,1000) = 1000
      
      This happens since ec342325 (tcp: fix retransmission in repair
      mode). Prior to that, the call to tcp_push was prevented by a check for
      tp->repair.
      
      The patch fixes it, by adding the new goto-label out_nopush. When exiting
      tcp_sendmsg and a push is not required, which is the case for tp->repair,
      we go to this label.
      
      When repairing and calling send() with TCP_RECV_QUEUE, the data is
      actually put in the receive-queue. So, no push is required because no
      data has been added to the send-queue.
      
      Cc: Andrew Vagin <avagin@openvz.org>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Fixes: ec342325 (tcp: fix retransmission in repair mode)
      Signed-off-by: NChristoph Paasch <christoph.paasch@uclouvain.be>
      Acked-by: NAndrew Vagin <avagin@openvz.org>
      Acked-by: NPavel Emelyanov <xemul@parallels.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5924f17a
    • E
      vlan: free percpu stats in device destructor · a48e5faf
      Eric Dumazet 提交于
      Madalin-Cristian reported crashs happening after a recent commit
      (5a4ae5f6 "vlan: unnecessary to check if vlan_pcpu_stats is NULL")
      
      -----------------------------------------------------------------------
      root@p5040ds:~# vconfig add eth8 1
      root@p5040ds:~# vconfig rem eth8.1
      Unable to handle kernel paging request for data at address 0x2bc88028
      Faulting instruction address: 0xc058e950
      Oops: Kernel access of bad area, sig: 11 [#1]
      SMP NR_CPUS=8 CoreNet Generic
      Modules linked in:
      CPU: 3 PID: 2167 Comm: vconfig Tainted: G        W     3.16.0-rc3-00346-g65e85bf #2
      task: e7264d90 ti: e2c2c000 task.ti: e2c2c000
      NIP: c058e950 LR: c058ea30 CTR: c058e900
      REGS: e2c2db20 TRAP: 0300   Tainted: G        W      (3.16.0-rc3-00346-g65e85bf)
      MSR: 00029002 <CE,EE,ME>  CR: 48000428  XER: 20000000
      DEAR: 2bc88028 ESR: 00000000
      GPR00: c047299c e2c2dbd0 e7264d90 00000000 2bc88000 00000000 ffffffff 00000000
      GPR08: 0000000f 00000000 000000ff 00000000 28000422 10121928 10100000 10100000
      GPR16: 10100000 00000000 c07c5968 00000000 00000000 00000000 e2c2dc48 e7838000
      GPR24: c07c5bac c07c58a8 e77290cc c07b0000 00000000 c05de6c0 e7838000 e2c2dc48
      NIP [c058e950] vlan_dev_get_stats64+0x50/0x170
      LR [c058ea30] vlan_dev_get_stats64+0x130/0x170
      Call Trace:
      [e2c2dbd0] [ffffffea] 0xffffffea (unreliable)
      [e2c2dc20] [c047299c] dev_get_stats+0x4c/0x140
      [e2c2dc40] [c0488ca8] rtnl_fill_ifinfo+0x3d8/0x960
      [e2c2dd70] [c0489f4c] rtmsg_ifinfo+0x6c/0x110
      [e2c2dd90] [c04731d4] rollback_registered_many+0x344/0x3b0
      [e2c2ddd0] [c047332c] rollback_registered+0x2c/0x50
      [e2c2ddf0] [c0476058] unregister_netdevice_queue+0x78/0xf0
      [e2c2de00] [c058d800] unregister_vlan_dev+0xc0/0x160
      [e2c2de20] [c058e360] vlan_ioctl_handler+0x1c0/0x550
      [e2c2de90] [c045d11c] sock_ioctl+0x28c/0x2f0
      [e2c2deb0] [c010d070] do_vfs_ioctl+0x90/0x7b0
      [e2c2df20] [c010d7d0] SyS_ioctl+0x40/0x80
      [e2c2df40] [c000f924] ret_from_syscall+0x0/0x3c
      
      Fix this problem by freeing percpu stats from dev->destructor() instead
      of ndo_uninit()
      Reported-by: NMadalin-Cristian Bucur <madalin.bucur@freescale.com>
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Tested-by: NMadalin-Cristian Bucur <madalin.bucur@freescale.com>
      Fixes: 5a4ae5f6 ("vlan: unnecessary to check if vlan_pcpu_stats is NULL")
      Cc: Li RongQing <roy.qing.li@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a48e5faf
  7. 01 7月, 2014 2 次提交
    • E
      ipv4: irq safe sk_dst_[re]set() and ipv4_sk_update_pmtu() fix · 7f502361
      Eric Dumazet 提交于
      We have two different ways to handle changes to sk->sk_dst
      
      First way (used by TCP) assumes socket lock is owned by caller, and use
      no extra lock : __sk_dst_set() & __sk_dst_reset()
      
      Another way (used by UDP) uses sk_dst_lock because socket lock is not
      always taken. Note that sk_dst_lock is not softirq safe.
      
      These ways are not inter changeable for a given socket type.
      
      ipv4_sk_update_pmtu(), added in linux-3.8, added a race, as it used
      the socket lock as synchronization, but users might be UDP sockets.
      
      Instead of converting sk_dst_lock to a softirq safe version, use xchg()
      as we did for sk_rx_dst in commit e47eb5df ("udp: ipv4: do not use
      sk_dst_lock from softirq context")
      
      In a follow up patch, we probably can remove sk_dst_lock, as it is
      only used in IPv6.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Steffen Klassert <steffen.klassert@secunet.com>
      Fixes: 9cb3a50c ("ipv4: Invalidate the socket cached route on pmtu events if possible")
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7f502361
    • A
      openvswitch: Use exact lookup for flow_get and flow_del. · 4a46b24e
      Alex Wang 提交于
      Due to the race condition in userspace, there is chance that two
      overlapping megaflows could be installed in datapath.  And this
      causes userspace unable to delete the less inclusive megaflow flow
      even after it timeout, since the flow_del logic will stop at the
      first match of masked flow.
      
      This commit fixes the bug by making the kernel flow_del and flow_get
      logic check all masks in that case.
      
      Introduced by 03f0d916 (openvswitch: Mega flow implementation).
      Signed-off-by: NAlex Wang <alexw@nicira.com>
      Acked-by: NAndy Zhou <azhou@nicira.com>
      Signed-off-by: NPravin B Shelar <pshelar@nicira.com>
      4a46b24e
  8. 30 6月, 2014 3 次提交
  9. 27 6月, 2014 2 次提交
  10. 26 6月, 2014 5 次提交
  11. 25 6月, 2014 1 次提交
  12. 24 6月, 2014 1 次提交
  13. 23 6月, 2014 3 次提交
  14. 22 6月, 2014 1 次提交
  15. 20 6月, 2014 7 次提交
    • L
      Bluetooth: Fix for ACL disconnect when pairing fails · 1d56dc4f
      Lukasz Rymanowski 提交于
      When pairing fails hci_conn refcnt drops below zero. This cause that
      ACL link is not disconnected when disconnect timeout fires.
      
      Probably this is because l2cap_conn_del calls l2cap_chan_del for each
      channel, and inside l2cap_chan_del conn is dropped. After that loop
      hci_chan_del is called which also drops conn.
      
      Anyway, as it is desrcibed in hci_core.h, it is known that refcnt
      drops below 0 sometimes and it should be fine. If so, let disconnect
      link when hci_conn_timeout fires and refcnt is 0 or below. This patch
      does it.
      
      This affects PTS test SM_TC_JW_BV_05_C
      
      Logs from scenario:
      
      [69713.706227] [6515] pair_device:
      [69713.706230] [6515] hci_conn_add: hci0 dst 00:1b:dc:06:06:22
      [69713.706233] [6515] hci_dev_hold: hci0 orig refcnt 8
      [69713.706235] [6515] hci_conn_init_sysfs: conn ffff88021f65a000
      [69713.706239] [6515] hci_req_add_ev: hci0 opcode 0x200d plen 25
      [69713.706242] [6515] hci_prepare_cmd: skb len 28
      [69713.706243] [6515] hci_req_run: length 1
      [69713.706248] [6515] hci_conn_hold: hcon ffff88021f65a000 orig refcnt 0
      [69713.706251] [6515] hci_dev_put: hci0 orig refcnt 9
      [69713.706281] [8909] hci_cmd_work: hci0 cmd_cnt 1 cmd queued 1
      [69713.706288] [8909] hci_send_frame: hci0 type 1 len 28
      [69713.706290] [8909] hci_send_to_monitor: hdev ffff88021f0c7000 len 28
      [69713.706316] [5949] hci_sock_recvmsg: sock ffff8800941a9680, sk ffff88012bf4d000
      [69713.706382] [5949] hci_sock_recvmsg: sock ffff8800941a9680, sk ffff88012bf4d000
      [69713.711664] [8909] hci_rx_work: hci0
      [69713.711668] [8909] hci_send_to_monitor: hdev ffff88021f0c7000 len 6
      [69713.711680] [8909] hci_rx_work: hci0 Event packet
      [69713.711683] [8909] hci_cs_le_create_conn: hci0 status 0x00
      [69713.711685] [8909] hci_sent_cmd_data: hci0 opcode 0x200d
      [69713.711688] [8909] hci_req_cmd_complete: opcode 0x200d status 0x00
      [69713.711690] [8909] hci_sent_cmd_data: hci0 opcode 0x200d
      [69713.711695] [5949] hci_sock_recvmsg: sock ffff8800941a9680, sk ffff88012bf4d000
      [69713.711744] [5949] hci_sock_recvmsg: sock ffff8800941a9680, sk ffff88012bf4d000
      [69713.818875] [8909] hci_rx_work: hci0
      [69713.818889] [8909] hci_send_to_monitor: hdev ffff88021f0c7000 len 21
      [69713.818913] [8909] hci_rx_work: hci0 Event packet
      [69713.818917] [8909] hci_le_conn_complete_evt: hci0 status 0x00
      [69713.818922] [8909] hci_send_to_control: len 19
      [69713.818927] [5949] hci_sock_recvmsg: sock ffff8800941a9680, sk ffff88012bf4d000
      [69713.818938] [8909] hci_conn_add_sysfs: conn ffff88021f65a000
      [69713.818975] [6450] bt_sock_poll: sock ffff88005e758500, sk ffff88010323b800
      [69713.818981] [6515] hci_sock_recvmsg: sock ffff88005e75a080, sk ffff88010323ac00
      ...
      [69713.819021] [8909] hci_dev_hold: hci0 orig refcnt 10
      [69713.819025] [8909] l2cap_connect_cfm: hcon ffff88021f65a000 bdaddr 00:1b:dc:06:06:22 status 0
      [69713.819028] [8909] hci_chan_create: hci0 hcon ffff88021f65a000
      [69713.819031] [8909] l2cap_conn_add: hcon ffff88021f65a000 conn ffff880221005c00 hchan ffff88020d60b1c0
      [69713.819034] [8909] l2cap_conn_ready: conn ffff880221005c00
      [69713.819036] [5949] hci_sock_recvmsg: sock ffff8800941a9680, sk ffff88012bf4d000
      [69713.819037] [8909] smp_conn_security: conn ffff880221005c00 hcon ffff88021f65a000 level 0x02
      [69713.819039] [8909] smp_chan_create:
      [69713.819041] [8909] hci_conn_hold: hcon ffff88021f65a000 orig refcnt 1
      [69713.819043] [8909] smp_send_cmd: code 0x01
      [69713.819045] [8909] hci_send_acl: hci0 chan ffff88020d60b1c0 flags 0x0000
      [69713.819046] [5949] hci_sock_recvmsg: sock ffff8800941a9900, sk ffff88012bf4e800
      [69713.819049] [8909] hci_queue_acl: hci0 nonfrag skb ffff88005157c100 len 15
      [69713.819055] [5949] hci_sock_recvmsg: sock ffff8800941a9900, sk ffff88012bf4e800
      [69713.819057] [8909] l2cap_le_conn_ready:
      [69713.819064] [8909] l2cap_chan_create: chan ffff88005ede2c00
      [69713.819066] [8909] l2cap_chan_hold: chan ffff88005ede2c00 orig refcnt 1
      [69713.819069] [8909] l2cap_sock_init: sk ffff88005ede5800
      [69713.819072] [8909] bt_accept_enqueue: parent ffff880160356000, sk ffff88005ede5800
      [69713.819074] [8909] __l2cap_chan_add: conn ffff880221005c00, psm 0x00, dcid 0x0004
      [69713.819076] [8909] l2cap_chan_hold: chan ffff88005ede2c00 orig refcnt 2
      [69713.819078] [8909] hci_conn_hold: hcon ffff88021f65a000 orig refcnt 2
      [69713.819080] [8909] smp_conn_security: conn ffff880221005c00 hcon ffff88021f65a000 level 0x01
      [69713.819082] [8909] l2cap_sock_ready_cb: sk ffff88005ede5800, parent ffff880160356000
      [69713.819086] [8909] le_pairing_complete_cb: status 0
      [69713.819091] [8909] hci_tx_work: hci0 acl 10 sco 8 le 0
      [69713.819093] [8909] hci_sched_acl: hci0
      [69713.819094] [8909] hci_sched_sco: hci0
      [69713.819096] [8909] hci_sched_esco: hci0
      [69713.819098] [8909] hci_sched_le: hci0
      [69713.819099] [8909] hci_chan_sent: hci0
      [69713.819101] [8909] hci_chan_sent: chan ffff88020d60b1c0 quote 10
      [69713.819104] [8909] hci_sched_le: chan ffff88020d60b1c0 skb ffff88005157c100 len 15 priority 7
      [69713.819106] [8909] hci_send_frame: hci0 type 2 len 15
      [69713.819108] [8909] hci_send_to_monitor: hdev ffff88021f0c7000 len 15
      [69713.819119] [8909] hci_chan_sent: hci0
      [69713.819121] [8909] hci_prio_recalculate: hci0
      [69713.819123] [8909] process_pending_rx:
      [69713.819226] [6450] hci_sock_recvmsg: sock ffff88005e758780, sk ffff88010323d400
      ...
      [69713.822022] [6450] l2cap_sock_accept: sk ffff880160356000 timeo 0
      [69713.822024] [6450] bt_accept_dequeue: parent ffff880160356000
      [69713.822026] [6450] bt_accept_unlink: sk ffff88005ede5800 state 1
      [69713.822028] [6450] l2cap_sock_accept: new socket ffff88005ede5800
      [69713.822368] [6450] l2cap_sock_getname: sock ffff8800941ab700, sk ffff88005ede5800
      [69713.822375] [6450] l2cap_sock_getsockopt: sk ffff88005ede5800
      [69713.822383] [6450] l2cap_sock_getname: sock ffff8800941ab700, sk ffff88005ede5800
      [69713.822414] [6450] bt_sock_poll: sock ffff8800941ab700, sk ffff88005ede5800
      ...
      [69713.823255] [6450] l2cap_sock_getname: sock ffff8800941ab700, sk ffff88005ede5800
      [69713.823259] [6450] l2cap_sock_getsockopt: sk ffff88005ede5800
      [69713.824322] [6450] l2cap_sock_getname: sock ffff8800941ab700, sk ffff88005ede5800
      [69713.824330] [6450] l2cap_sock_getsockopt: sk ffff88005ede5800
      [69713.825029] [6450] bt_sock_poll: sock ffff88005e758500, sk ffff88010323b800
      ...
      [69713.825187] [6450] l2cap_sock_sendmsg: sock ffff8800941ab700, sk ffff88005ede5800
      [69713.825189] [6450] bt_sock_wait_ready: sk ffff88005ede5800
      [69713.825192] [6450] l2cap_create_basic_pdu: chan ffff88005ede2c00 len 3
      [69713.825196] [6450] l2cap_do_send: chan ffff88005ede2c00, skb ffff880160b0b500 len 7 priority 0
      [69713.825199] [6450] hci_send_acl: hci0 chan ffff88020d60b1c0 flags 0x0000
      [69713.825201] [6450] hci_queue_acl: hci0 nonfrag skb ffff880160b0b500 len 11
      [69713.825210] [8909] hci_tx_work: hci0 acl 9 sco 8 le 0
      [69713.825213] [8909] hci_sched_acl: hci0
      [69713.825214] [8909] hci_sched_sco: hci0
      [69713.825216] [8909] hci_sched_esco: hci0
      [69713.825217] [8909] hci_sched_le: hci0
      [69713.825219] [8909] hci_chan_sent: hci0
      [69713.825221] [8909] hci_chan_sent: chan ffff88020d60b1c0 quote 9
      [69713.825223] [8909] hci_sched_le: chan ffff88020d60b1c0 skb ffff880160b0b500 len 11 priority 0
      [69713.825225] [8909] hci_send_frame: hci0 type 2 len 11
      [69713.825227] [8909] hci_send_to_monitor: hdev ffff88021f0c7000 len 11
      [69713.825242] [8909] hci_chan_sent: hci0
      [69713.825253] [5949] hci_sock_recvmsg: sock ffff8800941a9680, sk ffff88012bf4d000
      [69713.825253] [8909] hci_prio_recalculate: hci0
      [69713.825292] [5949] hci_sock_recvmsg: sock ffff8800941a9680, sk ffff88012bf4d000
      [69713.825768] [6450] bt_sock_poll: sock ffff88005e758500, sk ffff88010323b800
      ...
      [69713.866902] [8909] hci_rx_work: hci0
      [69713.866921] [8909] hci_send_to_monitor: hdev ffff88021f0c7000 len 7
      [69713.866928] [8909] hci_rx_work: hci0 Event packet
      [69713.866931] [8909] hci_num_comp_pkts_evt: hci0 num_hndl 1
      [69713.866937] [8909] hci_tx_work: hci0 acl 9 sco 8 le 0
      [69713.866939] [5949] hci_sock_recvmsg: sock ffff8800941a9680, sk ffff88012bf4d000
      [69713.866940] [8909] hci_sched_acl: hci0
      ...
      [69713.866944] [8909] hci_sched_le: hci0
      [69713.866953] [8909] hci_chan_sent: hci0
      [69713.866997] [5949] hci_sock_recvmsg: sock ffff8800941a9680, sk ffff88012bf4d000
      [69713.867840] [28074] hci_rx_work: hci0
      [69713.867844] [28074] hci_send_to_monitor: hdev ffff88021f0c7000 len 7
      [69713.867850] [28074] hci_rx_work: hci0 Event packet
      [69713.867853] [28074] hci_num_comp_pkts_evt: hci0 num_hndl 1
      [69713.867857] [5949] hci_sock_recvmsg: sock ffff8800941a9680, sk ffff88012bf4d000
      [69713.867858] [28074] hci_tx_work: hci0 acl 10 sco 8 le 0
      [69713.867860] [28074] hci_sched_acl: hci0
      [69713.867861] [28074] hci_sched_sco: hci0
      [69713.867862] [28074] hci_sched_esco: hci0
      [69713.867863] [28074] hci_sched_le: hci0
      [69713.867865] [28074] hci_chan_sent: hci0
      [69713.867888] [5949] hci_sock_recvmsg: sock ffff8800941a9680, sk ffff88012bf4d000
      [69714.145661] [8909] hci_rx_work: hci0
      [69714.145666] [8909] hci_send_to_monitor: hdev ffff88021f0c7000 len 10
      [69714.145676] [8909] hci_rx_work: hci0 ACL data packet
      [69714.145679] [8909] hci_acldata_packet: hci0 len 6 handle 0x002d flags 0x0002
      [69714.145681] [8909] hci_conn_enter_active_mode: hcon ffff88021f65a000 mode 0
      [69714.145683] [8909] l2cap_recv_acldata: conn ffff880221005c00 len 6 flags 0x2
      [69714.145693] [8909] l2cap_recv_frame: len 2, cid 0x0006
      [69714.145696] [8909] hci_send_to_control: len 14
      [69714.145710] [8909] smp_chan_destroy:
      [69714.145713] [8909] pairing_complete: status 3
      [69714.145714] [8909] cmd_complete: sock ffff88010323ac00
      [69714.145717] [8909] hci_conn_drop: hcon ffff88021f65a000 orig refcnt 3
      [69714.145719] [5949] hci_sock_recvmsg: sock ffff8800941a9680, sk ffff88012bf4d000
      [69714.145720] [6450] bt_sock_poll: sock ffff88005e758500, sk ffff88010323b800
      [69714.145722] [6515] hci_sock_recvmsg: sock ffff88005e75a080, sk ffff88010323ac00
      [69714.145724] [6450] bt_sock_poll: sock ffff8801db6b4f00, sk ffff880160351c00
      ...
      [69714.145735] [6515] hci_sock_recvmsg: sock ffff88005e75a080, sk ffff88010323ac00
      [69714.145737] [8909] hci_conn_drop: hcon ffff88021f65a000 orig refcnt 2
      [69714.145739] [8909] l2cap_conn_del: hcon ffff88021f65a000 conn ffff880221005c00, err 13
      [69714.145740] [6450] bt_sock_poll: sock ffff8801db6b5400, sk ffff88021e775000
      [69714.145743] [6450] bt_sock_poll: sock ffff8801db6b5e00, sk ffff880160356000
      [69714.145744] [8909] l2cap_chan_hold: chan ffff88005ede2c00 orig refcnt 3
      [69714.145746] [6450] bt_sock_poll: sock ffff8800941ab700, sk ffff88005ede5800
      [69714.145748] [8909] l2cap_chan_del: chan ffff88005ede2c00, conn ffff880221005c00, err 13
      [69714.145749] [8909] l2cap_chan_put: chan ffff88005ede2c00 orig refcnt 4
      [69714.145751] [8909] hci_conn_drop: hcon ffff88021f65a000 orig refcnt 1
      [69714.145754] [6450] bt_sock_poll: sock ffff8800941ab700, sk ffff88005ede5800
      [69714.145756] [8909] l2cap_chan_put: chan ffff88005ede2c00 orig refcnt 3
      [69714.145759] [8909] hci_chan_del: hci0 hcon ffff88021f65a000 chan ffff88020d60b1c0
      [69714.145766] [5949] hci_sock_recvmsg: sock ffff8800941a9680, sk ffff88012bf4d000
      [69714.145787] [6515] hci_sock_release: sock ffff88005e75a080 sk ffff88010323ac00
      [69714.146002] [6450] hci_sock_recvmsg: sock ffff88005e758780, sk ffff88010323d400
      [69714.150795] [6450] l2cap_sock_release: sock ffff8800941ab700, sk ffff88005ede5800
      [69714.150799] [6450] l2cap_sock_shutdown: sock ffff8800941ab700, sk ffff88005ede5800
      [69714.150802] [6450] l2cap_chan_close: chan ffff88005ede2c00 state BT_CLOSED
      [69714.150805] [6450] l2cap_sock_kill: sk ffff88005ede5800 state BT_CLOSED
      [69714.150806] [6450] l2cap_chan_put: chan ffff88005ede2c00 orig refcnt 2
      [69714.150808] [6450] l2cap_sock_destruct: sk ffff88005ede5800
      [69714.150809] [6450] l2cap_chan_put: chan ffff88005ede2c00 orig refcnt 1
      [69714.150811] [6450] l2cap_chan_destroy: chan ffff88005ede2c00
      [69714.150970] [6450] bt_sock_poll: sock ffff88005e758500, sk ffff88010323b800
      ...
      [69714.151991] [8909] hci_conn_drop: hcon ffff88021f65a000 orig refcnt 0
      [69716.150339] [8909] hci_conn_timeout: hcon ffff88021f65a000 state BT_CONNECTED, refcnt -1
      Signed-off-by: NLukasz Rymanowski <lukasz.rymanowski@tieto.com>
      Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>
      1d56dc4f
    • J
      Bluetooth: Fix rejecting pairing in case of insufficient capabilities · 2ed8f65c
      Johan Hedberg 提交于
      If we need an MITM protected connection but the local and remote IO
      capabilities cannot provide it we should reject the pairing attempt in
      the appropriate way. This patch adds the missing checks for such a
      situation to the smp_cmd_pairing_req() and smp_cmd_pairing_rsp()
      functions.
      Signed-off-by: NJohan Hedberg <johan.hedberg@intel.com>
      Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>
      2ed8f65c
    • J
      Bluetooth: Refactor authentication method lookup into its own function · 581370cc
      Johan Hedberg 提交于
      We'll need to do authentication method lookups from more than one place,
      so refactor the lookup into its own function.
      Signed-off-by: NJohan Hedberg <johan.hedberg@intel.com>
      Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>
      581370cc
    • J
      Bluetooth: Fix overriding higher security level in SMP · c7262e71
      Johan Hedberg 提交于
      When we receive a pairing request or an internal request to start
      pairing we shouldn't blindly overwrite the existing pending_sec_level
      value as that may actually be higher than the new one. This patch fixes
      the SMP code to only overwrite the value in case the new one is higher
      than the old.
      Signed-off-by: NJohan Hedberg <johan.hedberg@intel.com>
      Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>
      c7262e71
    • D
      net: sctp: check proc_dointvec result in proc_sctp_do_auth · 24599e61
      Daniel Borkmann 提交于
      When writing to the sysctl field net.sctp.auth_enable, it can well
      be that the user buffer we handed over to proc_dointvec() via
      proc_sctp_do_auth() handler contains something other than integers.
      
      In that case, we would set an uninitialized 4-byte value from the
      stack to net->sctp.auth_enable that can be leaked back when reading
      the sysctl variable, and it can unintentionally turn auth_enable
      on/off based on the stack content since auth_enable is interpreted
      as a boolean.
      
      Fix it up by making sure proc_dointvec() returned sucessfully.
      
      Fixes: b14878cc ("net: sctp: cache auth_enable per endpoint")
      Reported-by: NFlorian Westphal <fwestpha@redhat.com>
      Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
      Acked-by: NNeil Horman <nhorman@tuxdriver.com>
      Acked-by: NVlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      24599e61
    • N
      tcp: fix tcp_match_skb_to_sack() for unaligned SACK at end of an skb · 2cd0d743
      Neal Cardwell 提交于
      If there is an MSS change (or misbehaving receiver) that causes a SACK
      to arrive that covers the end of an skb but is less than one MSS, then
      tcp_match_skb_to_sack() was rounding up pkt_len to the full length of
      the skb ("Round if necessary..."), then chopping all bytes off the skb
      and creating a zero-byte skb in the write queue.
      
      This was visible now because the recently simplified TLP logic in
      bef1909e ("tcp: fixing TLP's FIN recovery") could find that 0-byte
      skb at the end of the write queue, and now that we do not check that
      skb's length we could send it as a TLP probe.
      
      Consider the following example scenario:
      
       mss: 1000
       skb: seq: 0 end_seq: 4000  len: 4000
       SACK: start_seq: 3999 end_seq: 4000
      
      The tcp_match_skb_to_sack() code will compute:
      
       in_sack = false
       pkt_len = start_seq - TCP_SKB_CB(skb)->seq = 3999 - 0 = 3999
       new_len = (pkt_len / mss) * mss = (3999/1000)*1000 = 3000
       new_len += mss = 4000
      
      Previously we would find the new_len > skb->len check failing, so we
      would fall through and set pkt_len = new_len = 4000 and chop off
      pkt_len of 4000 from the 4000-byte skb, leaving a 0-byte segment
      afterward in the write queue.
      
      With this new commit, we notice that the new new_len >= skb->len check
      succeeds, so that we return without trying to fragment.
      
      Fixes: adb92db8 ("tcp: Make SACK code to split only at mss boundaries")
      Reported-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NNeal Cardwell <ncardwell@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Cc: Ilpo Jarvinen <ilpo.jarvinen@helsinki.fi>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2cd0d743
    • D
      Revert "net: return actual error on register_queue_kobjects" · 8e4946cc
      David S. Miller 提交于
      This reverts commit d36a4f4b.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8e4946cc
  16. 19 6月, 2014 1 次提交