1. 25 Jun, 2017 1 commit
  2. 24 Jun, 2017 5 commits
  3. 22 Jun, 2017 6 commits
  4. 21 Jun, 2017 5 commits
  5. 19 Jun, 2017 1 commit
    • H
      mm: larger stack guard gap, between vmas · 1be7107f
      Hugh Dickins committed
      Stack guard page is a useful feature to reduce a risk of stack smashing
      into a different mapping. We have been using a single page gap which
      is sufficient to prevent having stack adjacent to a different mapping.
      But this seems to be insufficient in the light of the stack usage in
      userspace. E.g. glibc uses as large as 64kB alloca() in many commonly
      used functions. Others use constructs like gid_t buffer[NGROUPS_MAX]
      which is 256kB or stack strings with MAX_ARG_STRLEN.
      
      This will become especially dangerous for suid binaries and the default
      no limit for the stack size limit because those applications can be
      tricked to consume a large portion of the stack and a single glibc call
      could jump over the guard page. These attacks are not theoretical,
      unfortunately.
      
      Make those attacks less probable by increasing the stack guard gap
      to 1MB (on systems with 4k pages; but make it depend on the page size
      because systems with larger base pages might cap stack allocations in
      the PAGE_SIZE units) which should cover larger alloca() and VLA stack
      allocations. It is obviously not a full fix because the problem is
      somehow inherent, but it should reduce attack space a lot.
      
      One could argue that the gap size should be configurable from userspace,
      but that can be done later when somebody finds that the new 1MB is wrong
      for some special case applications.  For now, add a kernel command line
      option (stack_guard_gap) to specify the stack gap size (in page units).
      
      Implementation wise, first delete all the old code for stack guard page:
      because although we could get away with accounting one extra page in a
      stack vma, accounting a larger gap can break userspace - case in point,
      a program run with "ulimit -S -v 20000" failed when the 1MB gap was
      counted for RLIMIT_AS; similar problems could come with RLIMIT_MLOCK
      and strict non-overcommit mode.
      
      Instead of keeping gap inside the stack vma, maintain the stack guard
      gap as a gap between vmas: using vm_start_gap() in place of vm_start
      (or vm_end_gap() in place of vm_end if VM_GROWSUP) in just those few
      places which need to respect the gap - mainly arch_get_unmapped_area(),
      and the vma tree's subtree_gap support for that.
      Original-patch-by: Oleg Nesterov <oleg@redhat.com>
      Original-patch-by: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Tested-by: Helge Deller <deller@gmx.de> # parisc
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      1be7107f
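The gap-between-vmas idea described above can be sketched in user space. This is a simplified model, not the kernel code: `struct vma_model`, the flag values and the fixed 4 kB page size are assumptions made for illustration, while `vm_start_gap()`, `vm_end_gap()` and `stack_guard_gap` are the names the commit message uses.

```c
#include <stddef.h>

/* Assumptions for this sketch: 4 kB pages and illustrative flag values. */
#define PAGE_SHIFT   12UL
#define PAGE_SIZE    (1UL << PAGE_SHIFT)
#define VM_GROWSDOWN 0x0100UL
#define VM_GROWSUP   0x0200UL

/* Hypothetical stand-in for struct vm_area_struct. */
struct vma_model {
	unsigned long vm_start;
	unsigned long vm_end;
	unsigned long vm_flags;
};

/* 256 pages, i.e. 1 MB with 4 kB pages, as in the commit message. */
static unsigned long stack_guard_gap = 256UL << PAGE_SHIFT;

/* Lowest address that must stay free below this vma. */
static unsigned long vm_start_gap(const struct vma_model *vma)
{
	unsigned long start = vma->vm_start;

	if (vma->vm_flags & VM_GROWSDOWN) {
		start -= stack_guard_gap;
		if (start > vma->vm_start)	/* wrapped below zero */
			start = 0;
	}
	return start;
}

/* Highest address that must stay free above this vma. */
static unsigned long vm_end_gap(const struct vma_model *vma)
{
	unsigned long end = vma->vm_end;

	if (vma->vm_flags & VM_GROWSUP) {
		end += stack_guard_gap;
		if (end < vma->vm_end)		/* wrapped past the top */
			end = -PAGE_SIZE;
	}
	return end;
}
```

Because the gap is computed on the fly instead of being stored inside the stack vma, the vma's own size is unchanged, which is why limits such as RLIMIT_AS are not charged for the gap.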
  6. 16 Jun, 2017 12 commits
    • M
      net: Add IFLA_XDP_PROG_ID · 58038695
      Martin KaFai Lau committed
      Expose prog_id through IFLA_XDP_PROG_ID.  This patch
      makes modification to generic_xdp.  The later patches will
      modify other xdp-supported drivers.
      
      prog_id is added to struct net_dev_xdp.
      
      An iproute2 patch will follow. Here is what the 'ip link' output
      will look like:
      > ip link show eth0
      3: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp(prog_id:1) qdisc fq_codel state UP mode DEFAULT group default qlen 1000
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      Acked-by: Alexei Starovoitov <ast@fb.com>
      Acked-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      58038695
    • J
      networking: add and use skb_put_u8() · 634fef61
      Johannes Berg committed
      Joe and Bjørn suggested that it'd be nicer to not have the
      cast in the fairly common case of doing
      	*(u8 *)skb_put(skb, 1) = c;
      
      Add skb_put_u8() for this case, and use it across the code,
      using the following spatch:
      
          @@
          expression SKB, C, S;
          typedef u8;
          identifier fn = {skb_put};
          fresh identifier fn2 = fn ## "_u8";
          @@
          - *(u8 *)fn(SKB, S) = C;
          + fn2(SKB, C);
      
      Note that due to the "S", the spatch isn't perfect, it should
      have checked that S is 1, but there's also places that use a
      sizeof expression like sizeof(var) or sizeof(u8) etc. Turns
      out that nobody ever did something like
      	*(u8 *)skb_put(skb, 2) = c;
      
      which would be wrong anyway since the second byte wouldn't be
      initialized.
      Suggested-by: Joe Perches <joe@perches.com>
      Suggested-by: Bjørn Mork <bjorn@mork.no>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      634fef61
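A minimal user-space sketch of what skb_put_u8() does, assuming a toy buffer in place of the real struct sk_buff. `struct skb_model` and its fixed 64-byte array are hypothetical; only the helper names and the one-line body of skb_put_u8() follow the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint8_t u8;

/* Hypothetical stand-in for struct sk_buff: a flat buffer with a fill level. */
struct skb_model {
	u8 buf[64];
	size_t len;
};

/* Reserve len bytes at the tail and return a pointer to them. */
static void *skb_put(struct skb_model *skb, size_t len)
{
	void *tail = skb->buf + skb->len;

	assert(skb->len + len <= sizeof(skb->buf));
	skb->len += len;
	return tail;
}

/* Append a single byte; this hides the cast that callers used to write. */
static void skb_put_u8(struct skb_model *skb, u8 val)
{
	*(u8 *)skb_put(skb, 1) = val;
}
```

A caller that previously wrote `*(u8 *)skb_put(skb, 1) = c;` now writes `skb_put_u8(skb, c);`.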
    • J
      networking: make skb_push & __skb_push return void pointers · d58ff351
      Johannes Berg committed
      It seems like a historic accident that these return unsigned char *,
      and in many places that means casts are required, more often than not.
      
      Make these functions return void * and remove all the casts across
      the tree, adding a (u8 *) cast only where the unsigned char pointer
      was used directly, all done with the following spatch:
      
          @@
          expression SKB, LEN;
          typedef u8;
          identifier fn = { skb_push, __skb_push, skb_push_rcsum };
          @@
          - *(fn(SKB, LEN))
          + *(u8 *)fn(SKB, LEN)
      
          @@
          expression E, SKB, LEN;
          identifier fn = { skb_push, __skb_push, skb_push_rcsum };
          type T;
          @@
          - E = ((T *)(fn(SKB, LEN)))
          + E = fn(SKB, LEN)
      
          @@
          expression SKB, LEN;
          identifier fn = { skb_push, __skb_push, skb_push_rcsum };
          @@
          - fn(SKB, LEN)[0]
          + *(u8 *)fn(SKB, LEN)
      
      Note that the last part there converts from push(...)[0] to the
      more idiomatic *(u8 *)push(...).
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d58ff351
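The effect of the void * return type on callers can be illustrated with a toy model. `struct skb_model`, `struct demo_hdr` and the two wrapper functions are hypothetical names made up for this sketch; only skb_push() itself comes from the commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint8_t u8;

/* Hypothetical stand-in for struct sk_buff with headroom before the data. */
struct skb_model {
	u8 buf[64];
	size_t data;	/* offset of the first used byte; bytes before it are headroom */
	size_t len;	/* number of used bytes starting at data */
};

/* Hypothetical two-byte header used only for this example. */
struct demo_hdr {
	u8 type;
	u8 flags;
};

/* Claim len bytes of headroom in front of the data and return their start. */
static void *skb_push(struct skb_model *skb, size_t len)
{
	assert(len <= skb->data);
	skb->data -= len;
	skb->len += len;
	return skb->buf + skb->data;
}

/* Structured use: void * converts implicitly, so no cast is needed. */
static void push_demo_hdr(struct skb_model *skb, u8 type)
{
	struct demo_hdr *h = skb_push(skb, sizeof(*h));

	h->type = type;
	h->flags = 0;
}

/* Direct byte use: this is the one place where a (u8 *) cast remains. */
static void push_tag(struct skb_model *skb, u8 tag)
{
	*(u8 *)skb_push(skb, 1) = tag;
}
```

With the old unsigned char * return type, the first wrapper would have needed `(struct demo_hdr *)skb_push(...)`, which is the cast the spatch removes.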
    • J
      networking: make skb_pull & friends return void pointers · af72868b
      Johannes Berg committed
      It seems like a historic accident that these return unsigned char *,
      and in many places that means casts are required, more often than not.
      
      Make these functions return void * and remove all the casts across
      the tree, adding a (u8 *) cast only where the unsigned char pointer
      was used directly, all done with the following spatch:
      
          @@
          expression SKB, LEN;
          typedef u8;
          identifier fn = {
                  skb_pull,
                  __skb_pull,
                  skb_pull_inline,
                  __pskb_pull_tail,
                  __pskb_pull,
                  pskb_pull
          };
          @@
          - *(fn(SKB, LEN))
          + *(u8 *)fn(SKB, LEN)
      
          @@
          expression E, SKB, LEN;
          identifier fn = {
                  skb_pull,
                  __skb_pull,
                  skb_pull_inline,
                  __pskb_pull_tail,
                  __pskb_pull,
                  pskb_pull
          };
          type T;
          @@
          - E = ((T *)(fn(SKB, LEN)))
          + E = fn(SKB, LEN)
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      af72868b
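For the pull direction, a comparable toy model shows a header being stripped and the result assigned to a typed pointer without a cast. `struct skb_model`, `struct inner_hdr` and `strip_outer()` are hypothetical; returning NULL when fewer than len bytes are available follows the behaviour of the kernel's skb_pull().

```c
#include <stddef.h>
#include <stdint.h>

typedef uint8_t u8;

/* Hypothetical stand-in for struct sk_buff. */
struct skb_model {
	u8 buf[64];
	size_t data;	/* offset of the first used byte */
	size_t len;	/* number of used bytes starting at data */
};

/* Hypothetical inner header used only for this example. */
struct inner_hdr {
	u8 proto;
	u8 ttl;
};

/* Drop len bytes from the front; NULL if fewer than len bytes are present. */
static void *skb_pull(struct skb_model *skb, size_t len)
{
	if (len > skb->len)
		return NULL;
	skb->data += len;
	skb->len -= len;
	return skb->buf + skb->data;
}

/* Strip an outer header and return the inner one, or NULL if too short. */
static struct inner_hdr *strip_outer(struct skb_model *skb, size_t outer_len)
{
	struct inner_hdr *in;

	if (skb->len < outer_len + sizeof(*in))
		return NULL;
	in = skb_pull(skb, outer_len);	/* void * converts implicitly */
	return in;
}
```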
    • J
      networking: make skb_put & friends return void pointers · 4df864c1
      Johannes Berg committed
      It seems like a historic accident that these return unsigned char *,
      and in many places that means casts are required, more often than not.
      
      Make these functions (skb_put, __skb_put and pskb_put) return void *
      and remove all the casts across the tree, adding a (u8 *) cast only
      where the unsigned char pointer was used directly, all done with the
      following spatch:
      
          @@
          expression SKB, LEN;
          typedef u8;
          identifier fn = { skb_put, __skb_put };
          @@
          - *(fn(SKB, LEN))
          + *(u8 *)fn(SKB, LEN)
      
          @@
          expression E, SKB, LEN;
          identifier fn = { skb_put, __skb_put };
          type T;
          @@
          - E = ((T *)(fn(SKB, LEN)))
          + E = fn(SKB, LEN)
      
      which actually doesn't cover pskb_put since there are only three
      users overall.
      
      A handful of stragglers were converted manually, notably a macro in
      drivers/isdn/i4l/isdn_bsdcomp.c and, oddly enough, one of the many
      instances in net/bluetooth/hci_sock.c. In the former file, I also
      had to fix one whitespace problem spatch introduced.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4df864c1
    • J
      networking: introduce and use skb_put_data() · 59ae1d12
      Johannes Berg committed
      A common pattern with skb_put() is to just want to memcpy()
      some data into the new space, introduce skb_put_data() for
      this.
      
      An spatch similar to the one for skb_put_zero() converts many
      of the places using it:
      
          @@
          identifier p, p2;
          expression len, skb, data;
          type t, t2;
          @@
          (
          -p = skb_put(skb, len);
          +p = skb_put_data(skb, data, len);
          |
          -p = (t)skb_put(skb, len);
          +p = skb_put_data(skb, data, len);
          )
          (
          p2 = (t2)p;
          -memcpy(p2, data, len);
          |
          -memcpy(p, data, len);
          )
      
          @@
          type t, t2;
          identifier p, p2;
          expression skb, data;
          @@
          t *p;
          ...
          (
          -p = skb_put(skb, sizeof(t));
          +p = skb_put_data(skb, data, sizeof(t));
          |
          -p = (t *)skb_put(skb, sizeof(t));
          +p = skb_put_data(skb, data, sizeof(t));
          )
          (
          p2 = (t2)p;
          -memcpy(p2, data, sizeof(*p));
          |
          -memcpy(p, data, sizeof(*p));
          )
      
          @@
          expression skb, len, data;
          @@
          -memcpy(skb_put(skb, len), data, len);
          +skb_put_data(skb, data, len);
      
      (again, manually post-processed to retain some comments)
      Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      59ae1d12
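The put-then-memcpy pattern that skb_put_data() folds into one call can be sketched as follows. The buffer type `struct skb_model` is a hypothetical stand-in for struct sk_buff; the helper simply reserves tail space and copies into it, which is the behaviour the commit message describes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint8_t u8;

/* Hypothetical stand-in for struct sk_buff: a flat buffer with a fill level. */
struct skb_model {
	u8 buf[64];
	size_t len;
};

/* Reserve len bytes at the tail and return a pointer to them. */
static void *skb_put(struct skb_model *skb, size_t len)
{
	void *tail = skb->buf + skb->len;

	assert(skb->len + len <= sizeof(skb->buf));
	skb->len += len;
	return tail;
}

/* Reserve len bytes at the tail, copy data into them, return the new area. */
static void *skb_put_data(struct skb_model *skb, const void *data, size_t len)
{
	void *tmp = skb_put(skb, len);

	memcpy(tmp, data, len);
	return tmp;
}
```

A call site such as `memcpy(skb_put(skb, len), data, len);` becomes `skb_put_data(skb, data, len);`, matching the last rule of the spatch above.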
    • M
      net/mlx5: Add fast unload support in shutdown flow · 8812c24d
      Majd Dibbiny committed
      Add support for flushing all HW resources with one FW command and
      skipping all the heavy unload flows of the driver on kernel shutdown.
      There's no need to free all the SW context since a new fresh kernel
      will be loaded afterwards.
      
      Regarding the FW resources, they should be closed, otherwise we will
      have leakage in the FW. To accelerate this flow, we execute one command
      in the beginning that tells the FW that the driver isn't going to close
      any of the FW resources and asks the FW to clean up everything.
      Once the commands complete, it's safe to close the PCI resources and
      finish the routine.
      Signed-off-by: Majd Dibbiny <majd@mellanox.com>
      Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      8812c24d
    • M
      net/mlx5: Expose command polling interface · 4525abea
      Majd Dibbiny committed
      Add a new interface for commands execution that allows the
      caller to wait for the command's completion in a busy-wait
      loop (polling mode).
      
      This is useful if we want to execute a command in a polling mode
      while the driver is working in events mode for the rest of
      the commands.
      This interface will be used in the downstream patches.
      Signed-off-by: Majd Dibbiny <majd@mellanox.com>
      Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      4525abea
    • G
      net/mlx5e: Move and optimize query out of buffer function · 432609a4
      Gal Pressman committed
      Move "query queue counter out of buffer" helper function out of
      qp.c to en_main.c, since mlx5e netdev driver is the only one to use it.
      
      Also allocate the output buffer on the stack instead of the heap, to reduce
      number of heap allocs on update_stats work.
      Signed-off-by: Gal Pressman <galp@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Cc: kernel-team@fb.com
      432609a4
    • O
      net/mlx5: Fix some spelling mistakes · bd10838a
      Or Gerlitz committed
      Fixed a few places where endianness was misspelled and
      one spot where output was:
      
      CHECK: 'endianess' may be misspelled - perhaps 'endianness'?
      CHECK: 'ouput' may be misspelled - perhaps 'output'?
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      bd10838a
    • J
      skbuff: make skb_put_zero() return void · 83ad357d
      Johannes Berg committed
      It's nicer to return void, since then there's no need to
      cast to any structures. Currently none of the users have
      a cast, but a number of future conversions do.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      83ad357d
    • D
      tls: kernel TLS support · 3c4d7559
      Dave Watson committed
      Software implementation of transport layer security, implemented using ULP
      infrastructure.  tcp proto_ops are replaced with tls equivalents of sendmsg and
      sendpage.
      
      Only symmetric crypto is done in the kernel, keys are passed by setsockopt
      after the handshake is complete.  All control messages are supported via CMSG
      data - the actual symmetric encryption is the same, just the message type needs
      to be passed separately.
      
      For user API, please see Documentation patch.
      
      Pieces that can be shared between hw and sw implementation
      are in tls_main.c
      Signed-off-by: Boris Pismenny <borisp@mellanox.com>
      Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com>
      Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com>
      Signed-off-by: Dave Watson <davejwatson@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3c4d7559
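The commit message defers the user API to the Documentation patch, so the following is only a hedged sketch of the "keys are passed by setsockopt after the handshake" step, based on the interface exposed by the kernel's `<linux/tls.h>` header as I understand it. The function name `enable_ktls_tx()` and its parameters are made up for this example, the fallback values for `TCP_ULP` and `SOL_TLS` are assumptions for systems whose libc headers lack them, and AES-GCM-128 with TLS 1.2 is assumed as the cipher suite.

```c
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <linux/tls.h>

/* Assumed fallbacks for older libc headers. */
#ifndef TCP_ULP
#define TCP_ULP 31
#endif
#ifndef SOL_TLS
#define SOL_TLS 282
#endif

/*
 * Hypothetical helper: attach the "tls" ULP to a connected TCP socket and
 * install the transmit key material negotiated by a user-space handshake.
 * Returns 0 on success, -1 on failure (errno is set by setsockopt).
 */
static int enable_ktls_tx(int fd,
			  const unsigned char *key,	/* 16 bytes */
			  const unsigned char *iv,	/* 8 bytes */
			  const unsigned char *salt,	/* 4 bytes */
			  const unsigned char *rec_seq)	/* 8 bytes */
{
	struct tls12_crypto_info_aes_gcm_128 ci;

	/* Step 1: switch the socket to the TLS upper layer protocol. */
	if (setsockopt(fd, IPPROTO_TCP, TCP_ULP, "tls", sizeof("tls")) < 0)
		return -1;

	/* Step 2: hand the symmetric key material to the kernel. */
	memset(&ci, 0, sizeof(ci));
	ci.info.version = TLS_1_2_VERSION;
	ci.info.cipher_type = TLS_CIPHER_AES_GCM_128;
	memcpy(ci.key, key, TLS_CIPHER_AES_GCM_128_KEY_SIZE);
	memcpy(ci.iv, iv, TLS_CIPHER_AES_GCM_128_IV_SIZE);
	memcpy(ci.salt, salt, TLS_CIPHER_AES_GCM_128_SALT_SIZE);
	memcpy(ci.rec_seq, rec_seq, TLS_CIPHER_AES_GCM_128_REC_SEQ_SIZE);

	return setsockopt(fd, SOL_TLS, TLS_TX, &ci, sizeof(ci));
}
```

After a successful call, ordinary send() or sendmsg() calls on the socket would be encrypted by the kernel; only the transmit side is configured here, in line with the sendmsg and sendpage replacement the commit describes.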
  7. 15 Jun, 2017 4 commits
  8. 14 Jun, 2017 3 commits
  9. 13 Jun, 2017 2 commits
  10. 12 Jun, 2017 1 commit
    • P
      udp: avoid a cache miss on dequeue · 0a463c78
      Paolo Abeni committed
      Since UDP no longer uses sk->destructor, we can completely clear
      the skb head state before enqueuing. Amend and use
      skb_release_head_state() for that.
      
      All head states share a single cacheline, which is not
      normally used/accessed on dequeue. We can avoid accessing that
      cacheline entirely by implementing and using in the UDP code a
      specialized skb free helper which ignores the skb head state.
      
      This saves a cacheline miss at skb deallocation time.
      
      v1 -> v2:
        replaced secpath_reset() with skb_release_head_state()
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0a463c78