1. 02 Aug, 2017 2 commits
    • W
      net: add skb_frag_foreach_page and use with kmap_atomic · c613c209
      Willem de Bruijn committed
      Skb frags may contain compound pages. Various operations map frags
      temporarily using kmap_atomic, but this function works on single
      pages, not whole compound pages. The distinction is only relevant
      for highmem pages that require temporary mappings.
      
      Introduce a looping mechanism that maps compound highmem pages
      one page at a time and does not change behavior on other pages.
      Use the loop in the kmap_atomic callers in net/core/skbuff.c.
      
      Verified by triggering skb_copy_bits with
      
          tcpdump -n -c 100 -i ${DEV} -w /dev/null &
          netperf -t TCP_STREAM -H ${HOST}
      
      and by triggering __skb_checksum with
      
          ethtool -K ${DEV} tx off
      
      Repeated the tests with looping on a non-highmem platform
      (x86_64) by making skb_frag_must_loop always return true.
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
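The per-page looping described in this commit can be illustrated in userspace. The following is a minimal sketch, not the kernel macro: `split_frag_by_page`, `struct page_chunk`, and the 4096-byte `PAGE_SIZE_SKETCH` are hypothetical names and values chosen for illustration. It splits a byte range inside a compound page into chunks that each stay within one page, which is the unit a single-page mapping can cover.

```c
#include <stddef.h>

#define PAGE_SIZE_SKETCH 4096u  /* assumed page size for this sketch */

/* One chunk of a frag that lies entirely within a single page. */
struct page_chunk {
    size_t page_index;   /* which page of the compound page */
    size_t page_offset;  /* offset inside that page */
    size_t len;          /* bytes to process in that page */
};

/*
 * Split the byte range [offset, offset + len) of a compound page into
 * per-page chunks. Stores at most max_chunks entries in out and returns
 * the number of chunks the range needs.
 */
static size_t split_frag_by_page(size_t offset, size_t len,
                                 struct page_chunk *out, size_t max_chunks)
{
    size_t n = 0;

    while (len > 0) {
        size_t in_page = offset % PAGE_SIZE_SKETCH;
        size_t chunk = PAGE_SIZE_SKETCH - in_page;

        if (chunk > len)
            chunk = len;
        if (n < max_chunks) {
            out[n].page_index = offset / PAGE_SIZE_SKETCH;
            out[n].page_offset = in_page;
            out[n].len = chunk;
        }
        n++;
        offset += chunk;
        len -= chunk;
    }
    return n;
}
```

A caller would map, process, and unmap one page per chunk. A range that fits in a single page yields exactly one chunk, so the single-page case behaves as before.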
    • T
      skbuff: Function to send an skbuf on a socket · 20bf50de
      Tom Herbert committed
      Add skb_send_sock to send an skbuff on a socket within the kernel.
      Arguments include an offset so that an skbuff can be sent in multiple
      calls (e.g. when the send buffer limit is hit).
      Signed-off-by: Tom Herbert <tom@quantonium.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
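The idea of resuming a send at an offset can be sketched in userspace. This is only an illustration under assumptions: `struct sink`, `send_from_offset`, and `send_all` are hypothetical names, and the sink stands in for a socket whose send buffer accepts a limited number of bytes per call. It does not reproduce the kernel function's signature.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical sink that accepts at most `room` bytes per call. */
struct sink {
    char data[64];
    size_t used;
    size_t room;
};

/*
 * Send buf[offset .. offset + len) to the sink. Returns the number of
 * bytes actually accepted, which may be less than len.
 */
static size_t send_from_offset(struct sink *s, const char *buf,
                               size_t offset, size_t len)
{
    size_t n = len < s->room ? len : s->room;

    if (n > sizeof(s->data) - s->used)
        n = sizeof(s->data) - s->used;
    memcpy(s->data + s->used, buf + offset, n);
    s->used += n;
    return n;
}

/*
 * Drive the send to completion, resuming each call at the offset already
 * sent. Returns the number of calls made, or -1 if no progress is possible.
 */
static int send_all(struct sink *s, const char *buf, size_t len)
{
    size_t offset = 0;
    int calls = 0;

    while (offset < len) {
        size_t sent = send_from_offset(s, buf, offset, len - offset);

        if (sent == 0)
            return -1;
        offset += sent;
        calls++;
    }
    return calls;
}
```

Keeping the offset on the caller's side means the buffer itself is never modified between calls, so a partial send can be retried later from the same position.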
  2. 18 Jul, 2017 2 commits
  3. 03 Jul, 2017 1 commit
  4. 01 Jul, 2017 3 commits
  5. 21 Jun, 2017 1 commit
    • Y
      net: introduce __skb_put_[zero, data, u8] · de77b966
      yuan linyu committed
      Follow Johannes Berg's approach, using the semantic patch below:
      @@
      identifier p, p2;
      expression len;
      expression skb;
      type t, t2;
      @@
      (
      -p = __skb_put(skb, len);
      +p = __skb_put_zero(skb, len);
      |
      -p = (t)__skb_put(skb, len);
      +p = __skb_put_zero(skb, len);
      )
      ... when != p
      (
      p2 = (t2)p;
      -memset(p2, 0, len);
      |
      -memset(p, 0, len);
      )
      
      @@
      identifier p;
      expression len;
      expression skb;
      type t;
      @@
      (
      -t p = __skb_put(skb, len);
      +t p = __skb_put_zero(skb, len);
      )
      ... when != p
      (
      -memset(p, 0, len);
      )
      
      @@
      type t, t2;
      identifier p, p2;
      expression skb;
      @@
      t *p;
      ...
      (
      -p = __skb_put(skb, sizeof(t));
      +p = __skb_put_zero(skb, sizeof(t));
      |
      -p = (t *)__skb_put(skb, sizeof(t));
      +p = __skb_put_zero(skb, sizeof(t));
      )
      ... when != p
      (
      p2 = (t2)p;
      -memset(p2, 0, sizeof(*p));
      |
      -memset(p, 0, sizeof(*p));
      )
      
      @@
      expression skb, len;
      @@
      -memset(__skb_put(skb, len), 0, len);
      +__skb_put_zero(skb, len);
      
      @@
      expression skb, len, data;
      @@
      -memcpy(__skb_put(skb, len), data, len);
      +__skb_put_data(skb, data, len);
      
      @@
      expression SKB, C, S;
      typedef u8;
      identifier fn = {__skb_put};
      fresh identifier fn2 = fn ## "_u8";
      @@
      - *(u8 *)fn(SKB, S) = C;
      + fn2(SKB, C);
      Signed-off-by: yuan linyu <Linyu.Yuan@alcatel-sbell.com.cn>
      Signed-off-by: David S. Miller <davem@davemloft.net>
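The three helpers this semantic patch converts to can be pictured with a toy buffer. The sketch below is a userspace illustration under assumptions: `struct toy_skb` and the `toy_put*` functions are hypothetical stand-ins, not the kernel definitions, and the toy omits all bounds checking.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-in for an sk_buff: a flat buffer plus the length in use. */
struct toy_skb {
    uint8_t data[32];
    size_t len;
};

/* Extend the used area by len bytes and return the start of the new space. */
static void *toy_put(struct toy_skb *skb, size_t len)
{
    void *p = skb->data + skb->len;

    skb->len += len;
    return p;
}

/* Replaces the "put, then memset(p, 0, len)" pattern. */
static void *toy_put_zero(struct toy_skb *skb, size_t len)
{
    void *p = toy_put(skb, len);

    memset(p, 0, len);
    return p;
}

/* Replaces the "put, then memcpy(p, data, len)" pattern. */
static void *toy_put_data(struct toy_skb *skb, const void *data, size_t len)
{
    void *p = toy_put(skb, len);

    memcpy(p, data, len);
    return p;
}

/* Replaces the "*(u8 *)put(skb, 1) = c" pattern. */
static void toy_put_u8(struct toy_skb *skb, uint8_t val)
{
    *(uint8_t *)toy_put(skb, 1) = val;
}
```

Each helper folds a two-step idiom into one call, which is what lets the semantic patch delete the separate `memset` or `memcpy` line.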
  6. 16 Jun, 2017 6 commits
    • J
      networking: add and use skb_put_u8() · 634fef61
      Johannes Berg committed
      Joe and Bjørn suggested that it'd be nicer to not have the
      cast in the fairly common case of doing
      	*(u8 *)skb_put(skb, 1) = c;
      
      Add skb_put_u8() for this case, and use it across the code,
      using the following spatch:
      
          @@
          expression SKB, C, S;
          typedef u8;
          identifier fn = {skb_put};
          fresh identifier fn2 = fn ## "_u8";
          @@
          - *(u8 *)fn(SKB, S) = C;
          + fn2(SKB, C);
      
      Note that due to the "S", the spatch isn't perfect: it should
      have checked that S is 1, but there are also places that use a
      sizeof expression like sizeof(var) or sizeof(u8) etc. It turns
      out that nobody ever did something like
      	*(u8 *)skb_put(skb, 2) = c;
      
      which would be wrong anyway since the second byte wouldn't be
      initialized.
      Suggested-by: Joe Perches <joe@perches.com>
      Suggested-by: Bjørn Mork <bjorn@mork.no>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • J
      networking: make skb_push & __skb_push return void pointers · d58ff351
      Johannes Berg committed
      It seems like a historic accident that these return unsigned char *,
      and in many places that means casts are required, more often than not.
      
      Make these functions return void * and remove all the casts across
      the tree, adding a (u8 *) cast only where the unsigned char pointer
      was used directly, all done with the following spatch:
      
          @@
          expression SKB, LEN;
          typedef u8;
          identifier fn = { skb_push, __skb_push, skb_push_rcsum };
          @@
          - *(fn(SKB, LEN))
          + *(u8 *)fn(SKB, LEN)
      
          @@
          expression E, SKB, LEN;
          identifier fn = { skb_push, __skb_push, skb_push_rcsum };
          type T;
          @@
          - E = ((T *)(fn(SKB, LEN)))
          + E = fn(SKB, LEN)
      
          @@
          expression SKB, LEN;
          identifier fn = { skb_push, __skb_push, skb_push_rcsum };
          @@
          - fn(SKB, LEN)[0]
          + *(u8 *)fn(SKB, LEN)
      
      Note that the last part there converts from push(...)[0] to the
      more idiomatic *(u8 *)push(...).
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
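The effect of returning `void *` can be shown with a toy push helper. This is a userspace sketch under assumptions: `struct toy_buf`, `struct toy_hdr`, and the `toy_*` functions are hypothetical and do no bounds checking. In C a `void *` converts implicitly to any object pointer type, so the struct assignment needs no cast, while a direct byte store still needs one.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical two-byte header used only for this illustration. */
struct toy_hdr {
    uint8_t type;
    uint8_t flags;
};

struct toy_buf {
    uint8_t storage[16];
    uint8_t *data;   /* start of payload; headroom lies before it */
    size_t len;
};

/* Reserve headroom bytes at the front of the storage area. */
static void toy_init(struct toy_buf *b, size_t headroom)
{
    b->data = b->storage + headroom;
    b->len = 0;
}

/* Grow the buffer toward the front and return the new start as void *. */
static void *toy_push(struct toy_buf *b, size_t len)
{
    b->data -= len;
    b->len += len;
    return b->data;
}

/* Prepend a header, then one marker byte in front of it. */
static void toy_build(struct toy_buf *b)
{
    struct toy_hdr *h = toy_push(b, sizeof(*h)); /* no cast needed */

    h->type = 1;
    h->flags = 2;
    *(uint8_t *)toy_push(b, 1) = 0xaa; /* direct byte store needs the cast */
}
```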
    • J
      networking: make skb_pull & friends return void pointers · af72868b
      Johannes Berg committed
      It seems like a historic accident that these return unsigned char *,
      and in many places that means casts are required, more often than not.
      
      Make these functions return void * and remove all the casts across
      the tree, adding a (u8 *) cast only where the unsigned char pointer
      was used directly, all done with the following spatch:
      
          @@
          expression SKB, LEN;
          typedef u8;
          identifier fn = {
                  skb_pull,
                  __skb_pull,
                  skb_pull_inline,
                  __pskb_pull_tail,
                  __pskb_pull,
                  pskb_pull
          };
          @@
          - *(fn(SKB, LEN))
          + *(u8 *)fn(SKB, LEN)
      
          @@
          expression E, SKB, LEN;
          identifier fn = {
                  skb_pull,
                  __skb_pull,
                  skb_pull_inline,
                  __pskb_pull_tail,
                  __pskb_pull,
                  pskb_pull
          };
          type T;
          @@
          - E = ((T *)(fn(SKB, LEN)))
          + E = fn(SKB, LEN)
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • J
      networking: make skb_put & friends return void pointers · 4df864c1
      Johannes Berg committed
      It seems like a historic accident that these return unsigned char *,
      and in many places that means casts are required, more often than not.
      
      Make these functions (skb_put, __skb_put and pskb_put) return void *
      and remove all the casts across the tree, adding a (u8 *) cast only
      where the unsigned char pointer was used directly, all done with the
      following spatch:
      
          @@
          expression SKB, LEN;
          typedef u8;
          identifier fn = { skb_put, __skb_put };
          @@
          - *(fn(SKB, LEN))
          + *(u8 *)fn(SKB, LEN)
      
          @@
          expression E, SKB, LEN;
          identifier fn = { skb_put, __skb_put };
          type T;
          @@
          - E = ((T *)(fn(SKB, LEN)))
          + E = fn(SKB, LEN)
      
      which actually doesn't cover pskb_put since there are only three
      users overall.
      
      A handful of stragglers were converted manually, notably a macro in
      drivers/isdn/i4l/isdn_bsdcomp.c and, oddly enough, one of the many
      instances in net/bluetooth/hci_sock.c. In the former file, I also
      had to fix one whitespace problem spatch introduced.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • J
      networking: introduce and use skb_put_data() · 59ae1d12
      Johannes Berg committed
      A common pattern with skb_put() is to just want to memcpy()
      some data into the new space. Introduce skb_put_data() for
      this.
      
      An spatch similar to the one for skb_put_zero() converts many
      of the places using it:
      
          @@
          identifier p, p2;
          expression len, skb, data;
          type t, t2;
          @@
          (
          -p = skb_put(skb, len);
          +p = skb_put_data(skb, data, len);
          |
          -p = (t)skb_put(skb, len);
          +p = skb_put_data(skb, data, len);
          )
          (
          p2 = (t2)p;
          -memcpy(p2, data, len);
          |
          -memcpy(p, data, len);
          )
      
          @@
          type t, t2;
          identifier p, p2;
          expression skb, data;
          @@
          t *p;
          ...
          (
          -p = skb_put(skb, sizeof(t));
          +p = skb_put_data(skb, data, sizeof(t));
          |
          -p = (t *)skb_put(skb, sizeof(t));
          +p = skb_put_data(skb, data, sizeof(t));
          )
          (
          p2 = (t2)p;
          -memcpy(p2, data, sizeof(*p));
          |
          -memcpy(p, data, sizeof(*p));
          )
      
          @@
          expression skb, len, data;
          @@
          -memcpy(skb_put(skb, len), data, len);
          +skb_put_data(skb, data, len);
      
      (again, manually post-processed to retain some comments)
      Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • J
      skbuff: make skb_put_zero() return void · 83ad357d
      Johannes Berg committed
      It's nicer to return void, since then there's no need to
      cast to any structures. Currently none of the users have
      a cast, but a number of future conversions do.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  7. 12 Jun, 2017 2 commits
  8. 05 Jun, 2017 1 commit
    • J
      skbuff: return -EMSGSIZE in skb_to_sgvec to prevent overflow · 48a1df65
      Jason A. Donenfeld committed
      This is a defense-in-depth measure in response to bugs like
      4d6fa57b ("macsec: avoid heap overflow in skb_to_sgvec"). There's
      not only a potential overflow of sglist items, but also a stack overflow
      potential, so we fix this by limiting the amount of recursion this function
      is allowed to do. Not actually providing a bounded base case is a future
      disaster that we can easily avoid here.
      
      As a small matter of housekeeping, we take this opportunity to move the
      documentation comment over the actual function the documentation is for.
      
      While this could be implemented by using an explicit stack of skbuffs,
      when implementing this, the function complexity increased considerably,
      and I don't think such complexity and bloat is actually worth it. So,
      instead I built this and tested it on x86, x86_64, ARM, ARM64, and MIPS,
      and measured the stack usage there. I also reverted the recent MIPS
      changes that give it a separate IRQ stack, so that I could experience
      some worst-case situations. I found that limiting it to 24 layers deep
      yielded a good stack usage with room for safety, as well as being much
      deeper than any driver actually ever creates.
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Cc: Steffen Klassert <steffen.klassert@secunet.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Sabrina Dubroca <sd@queasysnail.net>
      Cc: "Michael S. Tsirkin" <mst@redhat.com>
      Cc: Jason Wang <jasowang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
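The bounded recursion described above can be sketched in userspace. This is an illustration under assumptions, not the kernel code: `struct toy_node` and `count_segs` are hypothetical, and only the depth limit of 24 and the `-EMSGSIZE` return value come from the commit message. The function fails both when nesting gets too deep and when the segment count would exceed the space the caller provided.

```c
#include <errno.h>
#include <stddef.h>

#define MAX_DEPTH_SKETCH 24  /* depth limit quoted in the commit message */

/* Toy nested buffer: some segments of its own plus an optional child. */
struct toy_node {
    int nsegs;
    struct toy_node *child;   /* nested fragment list, may be NULL */
};

/*
 * Count segments recursively. Returns -EMSGSIZE when nesting reaches the
 * depth limit or when the total would exceed max_segs.
 */
static int count_segs(const struct toy_node *n, int max_segs,
                      unsigned int depth)
{
    int total;

    if (depth >= MAX_DEPTH_SKETCH)
        return -EMSGSIZE;

    total = n->nsegs;
    if (total > max_segs)
        return -EMSGSIZE;

    if (n->child) {
        int ret = count_segs(n->child, max_segs - total, depth + 1);

        if (ret < 0)
            return ret;   /* propagate the error to the caller */
        total += ret;
    }
    return total;
}
```

Because the function can now fail, every caller has to check for a negative return value instead of assuming the count is valid.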
  9. 30 May, 2017 1 commit
  10. 22 May, 2017 2 commits
  11. 20 May, 2017 6 commits
  12. 18 May, 2017 1 commit
  13. 17 May, 2017 1 commit
  14. 16 May, 2017 1 commit
  15. 22 Apr, 2017 1 commit
  16. 14 Apr, 2017 1 commit
  17. 09 Apr, 2017 1 commit
    • S
      skbuff: Extend gso_type to unsigned int. · 7f564528
      Steffen Klassert committed
      All available gso_type flags are currently in use, so
      extend gso_type from 'unsigned short' to 'unsigned int'
      to be able to add further flags.
      
      We reorder the struct skb_shared_info to use
      two bytes of the four byte hole before dataref.
      All fields before dataref are cleared, i.e.
      four bytes more than before the change.
      
      The remaining two-byte hole is moved to the
      beginning of the structure; this protects us
      from immediate overwrites on out-of-bounds writes
      to the sk_buff head.
      
      Structure layout on x86-64 before the change:
      
      struct skb_shared_info {
      	unsigned char              nr_frags;             /*     0     1 */
      	__u8                       tx_flags;             /*     1     1 */
      	short unsigned int         gso_size;             /*     2     2 */
      	short unsigned int         gso_segs;             /*     4     2 */
      	short unsigned int         gso_type;             /*     6     2 */
      	struct sk_buff *           frag_list;            /*     8     8 */
      	struct skb_shared_hwtstamps hwtstamps;           /*    16     8 */
      	u32                        tskey;                /*    24     4 */
      	__be32                     ip6_frag_id;          /*    28     4 */
      	atomic_t                   dataref;              /*    32     4 */
      
      	/* XXX 4 bytes hole, try to pack */
      
      	void *                     destructor_arg;       /*    40     8 */
      	skb_frag_t                 frags[17];            /*    48   272 */
      	/* --- cacheline 5 boundary (320 bytes) --- */
      
      	/* size: 320, cachelines: 5, members: 12 */
      	/* sum members: 316, holes: 1, sum holes: 4 */
      };
      
      Structure layout on x86-64 after the change:
      
      struct skb_shared_info {
      	short unsigned int         _unused;              /*     0     2 */
      	unsigned char              nr_frags;             /*     2     1 */
      	__u8                       tx_flags;             /*     3     1 */
      	short unsigned int         gso_size;             /*     4     2 */
      	short unsigned int         gso_segs;             /*     6     2 */
      	struct sk_buff *           frag_list;            /*     8     8 */
      	struct skb_shared_hwtstamps hwtstamps;           /*    16     8 */
      	unsigned int               gso_type;             /*    24     4 */
      	u32                        tskey;                /*    28     4 */
      	__be32                     ip6_frag_id;          /*    32     4 */
      	atomic_t                   dataref;              /*    36     4 */
      	void *                     destructor_arg;       /*    40     8 */
      	skb_frag_t                 frags[17];            /*    48   272 */
      	/* --- cacheline 5 boundary (320 bytes) --- */
      
      	/* size: 320, cachelines: 5, members: 13 */
      };
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
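The two layouts listed above can be checked with `offsetof` in userspace. This is a sketch under assumptions: it targets x86-64 (8-byte pointers), and `struct toy_frag`, `struct toy_hwtstamps`, and the plain `int` for `dataref` are stand-ins chosen only to match the sizes in the listings, not the kernel types.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins sized as in the listings above (x86-64 assumed). */
struct toy_frag { uint64_t a, b; };            /* 16 bytes */
struct toy_hwtstamps { int64_t hwtstamp; };    /* 8 bytes */

struct toy_shinfo_before {
    unsigned char nr_frags;
    uint8_t tx_flags;
    unsigned short gso_size;
    unsigned short gso_segs;
    unsigned short gso_type;      /* 16-bit flags field */
    void *frag_list;
    struct toy_hwtstamps hwtstamps;
    uint32_t tskey;
    uint32_t ip6_frag_id;
    int dataref;                  /* stand-in for atomic_t */
    void *destructor_arg;         /* preceded by a 4-byte hole */
    struct toy_frag frags[17];
};

struct toy_shinfo_after {
    unsigned short _unused;       /* remaining 2 bytes, now at the front */
    unsigned char nr_frags;
    uint8_t tx_flags;
    unsigned short gso_size;
    unsigned short gso_segs;
    void *frag_list;
    struct toy_hwtstamps hwtstamps;
    unsigned int gso_type;        /* widened to 32 bits */
    uint32_t tskey;
    uint32_t ip6_frag_id;
    int dataref;
    void *destructor_arg;
    struct toy_frag frags[17];
};

/* Padding between dataref and destructor_arg in the old layout. */
static size_t hole_before(void)
{
    return offsetof(struct toy_shinfo_before, destructor_arg) -
           (offsetof(struct toy_shinfo_before, dataref) + sizeof(int));
}

/* The same gap in the new layout. */
static size_t hole_after(void)
{
    return offsetof(struct toy_shinfo_after, destructor_arg) -
           (offsetof(struct toy_shinfo_after, dataref) + sizeof(int));
}
```

On x86-64 both structures should come out at the same total size, with the padding after `dataref` absorbed by the wider `gso_type` and the `_unused` member.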
  18. 02 Mar, 2017 1 commit
  19. 11 Feb, 2017 1 commit
  20. 08 Feb, 2017 1 commit
  21. 02 Feb, 2017 2 commits
  22. 11 Jan, 2017 1 commit
  23. 09 Jan, 2017 1 commit
    • W
      net-tc: convert tc_from to tc_from_ingress and tc_redirected · bc31c905
      Willem de Bruijn committed
      The tc_from field fulfills two roles. It encodes whether a packet was
      redirected by an act_mirred device and, if so, whether act_mirred was
      called on ingress or egress. Split it into separate fields.
      
      The information is needed by the special IFB loop, where packets are
      taken out of the normal path by act_mirred, forwarded to IFB, then
      reinjected at their original location (ingress or egress) by IFB.
      
      The IFB device cannot use skb->tc_at_ingress, because that may have
      been overwritten as the packet travels from act_mirred to ifb_xmit,
      when it passes through tc_classify on the IFB egress path. Cache this
      value in skb->tc_from_ingress.
      
      That field is valid only if a packet arriving at ifb_xmit came from
      act_mirred. Other packets can be crafted to reach ifb_xmit. These
      must be dropped. Set tc_redirected on redirection and drop all packets
      that do not have this bit set.
      
      Both fields are set only on cloned skbs in tc actions, so original
      packet sources do not have to clear the bit when reusing packets
      (notably, pktgen and octeon).
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
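The split into two one-bit fields can be sketched in userspace. This is an illustration under assumptions: `struct toy_pkt`, `toy_redirect`, `toy_ifb_xmit`, and the verdict enum are hypothetical; only the field names `tc_at_ingress`, `tc_from_ingress`, and `tc_redirected` come from the commit message.

```c
/* Toy packet carrying the flags discussed in the commit message. */
struct toy_pkt {
    unsigned int tc_at_ingress:1;    /* may change as the packet travels */
    unsigned int tc_redirected:1;    /* set only on redirection */
    unsigned int tc_from_ingress:1;  /* cached origin, valid if redirected */
};

enum toy_verdict {
    TOY_DROP,
    TOY_REINJECT_INGRESS,
    TOY_REINJECT_EGRESS
};

/* What a mirred-like action might do to the cloned packet. */
static void toy_redirect(struct toy_pkt *clone)
{
    clone->tc_redirected = 1;
    /* Cache the origin now, before tc_at_ingress can be overwritten. */
    clone->tc_from_ingress = clone->tc_at_ingress;
}

/* What an IFB-like device might decide on transmit. */
static enum toy_verdict toy_ifb_xmit(const struct toy_pkt *p)
{
    if (!p->tc_redirected)
        return TOY_DROP;   /* did not arrive via redirection */
    return p->tc_from_ingress ? TOY_REINJECT_INGRESS
                              : TOY_REINJECT_EGRESS;
}
```

Separating "was this packet redirected" from "where did it come from" lets the transmit path reject crafted packets with a single bit test.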