1. 15 10月, 2014 1 次提交
    • D
      net: sctp: fix panic on duplicate ASCONF chunks · b69040d8
      Daniel Borkmann 提交于
      When receiving a e.g. semi-good formed connection scan in the
      form of ...
      
        -------------- INIT[ASCONF; ASCONF_ACK] ------------->
        <----------- INIT-ACK[ASCONF; ASCONF_ACK] ------------
        -------------------- COOKIE-ECHO -------------------->
        <-------------------- COOKIE-ACK ---------------------
        ---------------- ASCONF_a; ASCONF_b ----------------->
      
      ... where ASCONF_a equals ASCONF_b chunk (at least both serials
      need to be equal), we panic an SCTP server!
      
      The problem is that good-formed ASCONF chunks that we reply with
      ASCONF_ACK chunks are cached per serial. Thus, when we receive a
      same ASCONF chunk twice (e.g. through a lost ASCONF_ACK), we do
      not need to process them again on the server side (that was the
      idea, also proposed in the RFC). Instead, we know it was cached
      and we just resend the cached chunk instead. So far, so good.
      
      Where things get nasty is in SCTP's side effect interpreter, that
      is, sctp_cmd_interpreter():
      
      While incoming ASCONF_a (chunk = event_arg) is being marked
      !end_of_packet and !singleton, and we have an association context,
      we do not flush the outqueue the first time after processing the
      ASCONF_ACK singleton chunk via SCTP_CMD_REPLY. Instead, we keep it
      queued up, although we set local_cork to 1. Commit 2e3216cd
      changed the precedence, so that as long as we get bundled, incoming
      chunks we try possible bundling on outgoing queue as well. Before
      this commit, we would just flush the output queue.
      
      Now, while ASCONF_a's ASCONF_ACK sits in the corked outq, we
      continue to process the same ASCONF_b chunk from the packet. As
      we have cached the previous ASCONF_ACK, we find it, grab it and
      do another SCTP_CMD_REPLY command on it. So, effectively, we rip
      the chunk->list pointers and requeue the same ASCONF_ACK chunk
      another time. Since we process ASCONF_b, it's correctly marked
      with end_of_packet and we enforce an uncork, and thus flush, thus
      crashing the kernel.
      
      Fix it by testing if the ASCONF_ACK is currently pending and if
      that is the case, do not requeue it. When flushing the output
      queue we may relink the chunk for preparing an outgoing packet,
      but eventually unlink it when it's copied into the skb right
      before transmission.
      
      Joint work with Vlad Yasevich.
      
      Fixes: 2e3216cd ("sctp: Follow security requirement of responding with 1 packet")
      Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
      Signed-off-by: NVlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b69040d8
  2. 30 8月, 2014 1 次提交
    • D
      net: sctp: fix ABI mismatch through sctp_assoc_to_state helper · 38ab1fa9
      Daniel Borkmann 提交于
      Since SCTP day 1, that is, 19b55a2af145 ("Initial commit") from lksctp
      tree, the official <netinet/sctp.h> header carries a copy of enum
      sctp_sstat_state that looks like (compared to the current in-kernel
      enumeration):
      
        User definition:                     Kernel definition:
      
        enum sctp_sstat_state {              typedef enum {
          SCTP_EMPTY             = 0,          <removed>
          SCTP_CLOSED            = 1,          SCTP_STATE_CLOSED            = 0,
          SCTP_COOKIE_WAIT       = 2,          SCTP_STATE_COOKIE_WAIT       = 1,
          SCTP_COOKIE_ECHOED     = 3,          SCTP_STATE_COOKIE_ECHOED     = 2,
          SCTP_ESTABLISHED       = 4,          SCTP_STATE_ESTABLISHED       = 3,
          SCTP_SHUTDOWN_PENDING  = 5,          SCTP_STATE_SHUTDOWN_PENDING  = 4,
          SCTP_SHUTDOWN_SENT     = 6,          SCTP_STATE_SHUTDOWN_SENT     = 5,
          SCTP_SHUTDOWN_RECEIVED = 7,          SCTP_STATE_SHUTDOWN_RECEIVED = 6,
          SCTP_SHUTDOWN_ACK_SENT = 8,          SCTP_STATE_SHUTDOWN_ACK_SENT = 7,
        };                                   } sctp_state_t;
      
      This header was later on also placed into the uapi, so that user space
      programs can compile without having <netinet/sctp.h>, but the shipped
      with <linux/sctp.h> instead.
      
      While RFC6458 under 8.2.1.Association Status (SCTP_STATUS) says that
      sstat_state can range from SCTP_CLOSED to SCTP_SHUTDOWN_ACK_SENT, we
      nevertheless have a what it appears to be dummy SCTP_EMPTY state from
      the very early days.
      
      While it seems to do just nothing, commit 0b8f9e25 ("sctp: remove
      completely unsed EMPTY state") did the right thing and removed this dead
      code. That however, causes an off-by-one when the user asks the SCTP
      stack via SCTP_STATUS API and checks for the current socket state thus
      yielding possibly undefined behaviour in applications as they expect
      the kernel to tell the right thing.
      
      The enumeration had to be changed however as based on the current socket
      state, we access a function pointer lookup-table through this. Therefore,
      I think the best way to deal with this is just to add a helper function
      sctp_assoc_to_state() to encapsulate the off-by-one quirk.
      Reported-by: NTristan Su <sooqing@gmail.com>
      Fixes: 0b8f9e25 ("sctp: remove completely unsed EMPTY state")
      Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
      Acked-by: NVlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      38ab1fa9
  3. 01 8月, 2014 1 次提交
    • J
      sctp: Fixup v4mapped behaviour to comply with Sock API · 299ee123
      Jason Gunthorpe 提交于
      The SCTP socket extensions API document describes the v4mapping option as
      follows:
      
      8.1.15.  Set/Clear IPv4 Mapped Addresses (SCTP_I_WANT_MAPPED_V4_ADDR)
      
         This socket option is a Boolean flag which turns on or off the
         mapping of IPv4 addresses.  If this option is turned on, then IPv4
         addresses will be mapped to V6 representation.  If this option is
         turned off, then no mapping will be done of V4 addresses and a user
         will receive both PF_INET6 and PF_INET type addresses on the socket.
         See [RFC3542] for more details on mapped V6 addresses.
      
      This description isn't really in line with what the code does though.
      
      Introduce addr_to_user (renamed addr_v4map), which should be called
      before any sockaddr is passed back to user space. The new function
      places the sockaddr into the correct format depending on the
      SCTP_I_WANT_MAPPED_V4_ADDR option.
      
      Audit all places that touched v4mapped and either sanely construct
      a v4 or v6 address then call addr_to_user, or drop the
      unnecessary v4mapped check entirely.
      
      Audit all places that call addr_to_user and verify they are on a sycall
      return path.
      
      Add a custom getname that formats the address properly.
      
      Several bugs are addressed:
       - SCTP_I_WANT_MAPPED_V4_ADDR=0 often returned garbage for
         addresses to user space
       - The addr_len returned from recvmsg was not correct when
         returning AF_INET on a v6 socket
       - flowlabel and scope_id were not zerod when promoting
         a v4 to v6
       - Some syscalls like bind and connect behaved differently
         depending on v4mapped
      
      Tested bind, getpeername, getsockname, connect, and recvmsg for proper
      behaviour in v4mapped = 1 and 0 cases.
      Signed-off-by: NNeil Horman <nhorman@tuxdriver.com>
      Tested-by: NJason Gunthorpe <jgunthorpe@obsidianresearch.com>
      Signed-off-by: NJason Gunthorpe <jgunthorpe@obsidianresearch.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      299ee123
  4. 17 7月, 2014 1 次提交
    • G
      net: sctp: implement rfc6458, 5.3.6. SCTP_NXTINFO cmsg support · 2347c80f
      Geir Ola Vaagland 提交于
      This patch implements section 5.3.6. of RFC6458, that is, support
      for 'SCTP Next Receive Information Structure' (SCTP_NXTINFO) which
      is placed into ancillary data cmsghdr structure for each recvmsg()
      call, if this information is already available when delivering the
      current message.
      
      This option can be enabled/disabled via setsockopt(2) on SOL_SCTP
      level by setting an int value with 1/0 for SCTP_RECVNXTINFO in
      user space applications as per RFC6458, section 8.1.30.
      
      The sctp_nxtinfo structure is defined as per RFC as below ...
      
        struct sctp_nxtinfo {
          uint16_t nxt_sid;
          uint16_t nxt_flags;
          uint32_t nxt_ppid;
          uint32_t nxt_length;
          sctp_assoc_t nxt_assoc_id;
        };
      
      ... and provided under cmsg_level IPPROTO_SCTP, cmsg_type
      SCTP_NXTINFO, while cmsg_data[] contains struct sctp_nxtinfo.
      
      Joint work with Daniel Borkmann.
      Signed-off-by: NGeir Ola Vaagland <geirola@gmail.com>
      Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2347c80f
  5. 03 7月, 2014 1 次提交
    • D
      net: sctp: improve timer slack calculation for transport HBs · 8f61059a
      Daniel Borkmann 提交于
      RFC4960, section 8.3 says:
      
        On an idle destination address that is allowed to heartbeat,
        it is recommended that a HEARTBEAT chunk is sent once per RTO
        of that destination address plus the protocol parameter
        'HB.interval', with jittering of +/- 50% of the RTO value,
        and exponential backoff of the RTO if the previous HEARTBEAT
        is unanswered.
      
      Currently, we calculate jitter via sctp_jitter() function first,
      and then add its result to the current RTO for the new timeout:
      
        TMO = RTO + (RAND() % RTO) - (RTO / 2)
                    `------------------------^-=> sctp_jitter()
      
      Instead, we can just simplify all this by directly calculating:
      
        TMO = (RTO / 2) + (RAND() % RTO)
      
      With the help of prandom_u32_max(), we don't need to open code
      our own global PRNG, but can instead just make use of the per
      CPU implementation of prandom with better quality numbers. Also,
      we can now spare us the conditional for divide by zero check
      since no div or mod operation needs to be used. Note that
      prandom_u32_max() won't emit the same result as a mod operation,
      but we really don't care here as we only want to have a random
      number scaled into RTO interval.
      
      Note, exponential RTO backoff is handeled elsewhere, namely in
      sctp_do_8_2_transport_strike().
      Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8f61059a
  6. 12 4月, 2014 1 次提交
    • D
      net: Fix use after free by removing length arg from sk_data_ready callbacks. · 676d2369
      David S. Miller 提交于
      Several spots in the kernel perform a sequence like:
      
      	skb_queue_tail(&sk->s_receive_queue, skb);
      	sk->sk_data_ready(sk, skb->len);
      
      But at the moment we place the SKB onto the socket receive queue it
      can be consumed and freed up.  So this skb->len access is potentially
      to freed up memory.
      
      Furthermore, the skb->len can be modified by the consumer so it is
      possible that the value isn't accurate.
      
      And finally, no actual implementation of this callback actually uses
      the length argument.  And since nobody actually cared about it's
      value, lots of call sites pass arbitrary values in such as '0' and
      even '1'.
      
      So just remove the length argument from the callback, that way there
      is no confusion whatsoever and all of these use-after-free cases get
      fixed as a side effect.
      
      Based upon a patch by Eric Dumazet and his suggestion to audit this
      issue tree-wide.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      676d2369
  7. 22 1月, 2014 7 次提交
  8. 07 12月, 2013 1 次提交
  9. 24 9月, 2013 1 次提交
  10. 10 8月, 2013 1 次提交
  11. 03 8月, 2013 1 次提交
  12. 25 7月, 2013 1 次提交
  13. 02 7月, 2013 1 次提交
    • D
      net: sctp: rework debugging framework to use pr_debug and friends · bb33381d
      Daniel Borkmann 提交于
      We should get rid of all own SCTP debug printk macros and use the ones
      that the kernel offers anyway instead. This makes the code more readable
      and conform to the kernel code, and offers all the features of dynamic
      debbuging that pr_debug() et al has, such as only turning on/off portions
      of debug messages at runtime through debugfs. The runtime cost of having
      CONFIG_DYNAMIC_DEBUG enabled, but none of the debug statements printing,
      is negligible [1]. If kernel debugging is completly turned off, then these
      statements will also compile into "empty" functions.
      
      While we're at it, we also need to change the Kconfig option as it /now/
      only refers to the ifdef'ed code portions in outqueue.c that enable further
      debugging/tracing of SCTP transaction fields. Also, since SCTP_ASSERT code
      was enabled with this Kconfig option and has now been removed, we
      transform those code parts into WARNs resp. where appropriate BUG_ONs so
      that those bugs can be more easily detected as probably not many people
      have SCTP debugging permanently turned on.
      
      To turn on all SCTP debugging, the following steps are needed:
      
       # mount -t debugfs none /sys/kernel/debug
       # echo -n 'module sctp +p' > /sys/kernel/debug/dynamic_debug/control
      
      This can be done more fine-grained on a per file, per line basis and others
      as described in [2].
      
       [1] https://www.kernel.org/doc/ols/2009/ols2009-pages-39-46.pdf
       [2] Documentation/dynamic-debug-howto.txt
      Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      bb33381d
  14. 26 6月, 2013 2 次提交
  15. 18 6月, 2013 2 次提交
  16. 18 3月, 2013 1 次提交
  17. 28 2月, 2013 1 次提交
    • S
      hlist: drop the node parameter from iterators · b67bfe0d
      Sasha Levin 提交于
      I'm not sure why, but the hlist for each entry iterators were conceived
      
              list_for_each_entry(pos, head, member)
      
      The hlist ones were greedy and wanted an extra parameter:
      
              hlist_for_each_entry(tpos, pos, head, member)
      
      Why did they need an extra pos parameter? I'm not quite sure. Not only
      they don't really need it, it also prevents the iterator from looking
      exactly like the list iterator, which is unfortunate.
      
      Besides the semantic patch, there was some manual work required:
      
       - Fix up the actual hlist iterators in linux/list.h
       - Fix up the declaration of other iterators based on the hlist ones.
       - A very small amount of places were using the 'node' parameter, this
       was modified to use 'obj->member' instead.
       - Coccinelle didn't handle the hlist_for_each_entry_safe iterator
       properly, so those had to be fixed up manually.
      
      The semantic patch which is mostly the work of Peter Senna Tschudin is here:
      
      @@
      iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;
      
      type T;
      expression a,c,d,e;
      identifier b;
      statement S;
      @@
      
      -T b;
          <+... when != b
      (
      hlist_for_each_entry(a,
      - b,
      c, d) S
      |
      hlist_for_each_entry_continue(a,
      - b,
      c) S
      |
      hlist_for_each_entry_from(a,
      - b,
      c) S
      |
      hlist_for_each_entry_rcu(a,
      - b,
      c, d) S
      |
      hlist_for_each_entry_rcu_bh(a,
      - b,
      c, d) S
      |
      hlist_for_each_entry_continue_rcu_bh(a,
      - b,
      c) S
      |
      for_each_busy_worker(a, c,
      - b,
      d) S
      |
      ax25_uid_for_each(a,
      - b,
      c) S
      |
      ax25_for_each(a,
      - b,
      c) S
      |
      inet_bind_bucket_for_each(a,
      - b,
      c) S
      |
      sctp_for_each_hentry(a,
      - b,
      c) S
      |
      sk_for_each(a,
      - b,
      c) S
      |
      sk_for_each_rcu(a,
      - b,
      c) S
      |
      sk_for_each_from
      -(a, b)
      +(a)
      S
      + sk_for_each_from(a) S
      |
      sk_for_each_safe(a,
      - b,
      c, d) S
      |
      sk_for_each_bound(a,
      - b,
      c) S
      |
      hlist_for_each_entry_safe(a,
      - b,
      c, d, e) S
      |
      hlist_for_each_entry_continue_rcu(a,
      - b,
      c) S
      |
      nr_neigh_for_each(a,
      - b,
      c) S
      |
      nr_neigh_for_each_safe(a,
      - b,
      c, d) S
      |
      nr_node_for_each(a,
      - b,
      c) S
      |
      nr_node_for_each_safe(a,
      - b,
      c, d) S
      |
      - for_each_gfn_sp(a, c, d, b) S
      + for_each_gfn_sp(a, c, d) S
      |
      - for_each_gfn_indirect_valid_sp(a, c, d, b) S
      + for_each_gfn_indirect_valid_sp(a, c, d) S
      |
      for_each_host(a,
      - b,
      c) S
      |
      for_each_host_safe(a,
      - b,
      c, d) S
      |
      for_each_mesh_entry(a,
      - b,
      c, d) S
      )
          ...+>
      
      [akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
      [akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
      [akpm@linux-foundation.org: checkpatch fixes]
      [akpm@linux-foundation.org: fix warnings]
      [akpm@linux-foudnation.org: redo intrusive kvm changes]
      Tested-by: NPeter Senna Tschudin <peter.senna@gmail.com>
      Acked-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: NSasha Levin <sasha.levin@oracle.com>
      Cc: Wu Fengguang <fengguang.wu@intel.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Gleb Natapov <gleb@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b67bfe0d
  18. 04 12月, 2012 1 次提交
    • M
      sctp: Add support to per-association statistics via a new SCTP_GET_ASSOC_STATS call · 196d6759
      Michele Baldessari 提交于
      The current SCTP stack is lacking a mechanism to have per association
      statistics. This is an implementation modeled after OpenSolaris'
      SCTP_GET_ASSOC_STATS.
      
      Userspace part will follow on lksctp if/when there is a general ACK on
      this.
      V4:
      - Move ipackets++ before q->immediate.func() for consistency reasons
      - Move sctp_max_rto() at the end of sctp_transport_update_rto() to avoid
        returning bogus RTO values
      - return asoc->rto_min when max_obs_rto value has not changed
      
      V3:
      - Increase ictrlchunks in sctp_assoc_bh_rcv() as well
      - Move ipackets++ to sctp_inq_push()
      - return 0 when no rto updates took place since the last call
      
      V2:
      - Implement partial retrieval of stat struct to cope for future expansion
      - Kill the rtxpackets counter as it cannot be precise anyway
      - Rename outseqtsns to outofseqtsns to make it clearer that these are out
        of sequence unexpected TSNs
      - Move asoc->ipackets++ under a lock to avoid potential miscounts
      - Fold asoc->opackets++ into the already existing asoc check
      - Kill unneeded (q->asoc) test when increasing rtxchunks
      - Do not count octrlchunks if sending failed (SCTP_XMIT_OK != 0)
      - Don't count SHUTDOWNs as SACKs
      - Move SCTP_GET_ASSOC_STATS to the private space API
      - Adjust the len check in sctp_getsockopt_assoc_stats() to allow for
        future struct growth
      - Move association statistics in their own struct
      - Update idupchunks when we send a SACK with dup TSNs
      - return min_rto in max_rto when RTO has not changed. Also return the
        transport when max_rto last changed.
      
      Signed-off: Michele Baldessari <michele@acksyn.org>
      Acked-by: NVlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      196d6759
  19. 15 8月, 2012 10 次提交
  20. 16 7月, 2012 1 次提交
  21. 12 7月, 2012 1 次提交
  22. 11 5月, 2012 1 次提交
  23. 09 3月, 2012 1 次提交