1. 02 Feb, 2017 (2 commits)
  2. 10 Jan, 2017 (1 commit)
    • xtables: add xt_match, xt_target and data copy_to_user functions · f32815d2
      Committed by Willem de Bruijn
      xt_entry_target, xt_entry_match and their private data may contain
      kernel data.
      
      Introduce helper functions xt_match_to_user, xt_target_to_user and
      xt_data_to_user that copy only the expected fields. These replace
      existing logic that calls copy_to_user on entire structs, then
      overwrites select fields.
      
      Private data is defined in xt_match and xt_target. All matches and
      targets that maintain kernel data store this at the tail of their
      private structure. Extend xt_match and xt_target with .usersize to
      limit how many bytes of data are copied. The remainder is cleared.
      
      If compatsize is specified, usersize can only safely be used if all
      fields up to usersize use platform-independent types. Otherwise, the
      compat_to_user callback must be defined.
      
      This patch does not yet enable the support logic.
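
      A minimal sketch of the idea, assuming simplified struct and helper
      names (the real definitions live in include/linux/netfilter/x_tables.h):

        #include <linux/uaccess.h>

        /* Sketch only: a match advertises how many bytes of its private
         * data are safe to expose; the copy helper clears the rest. */
        struct xt_match_sketch {
        	const char *name;
        	unsigned int matchsize;	/* total size of private data */
        	unsigned int usersize;	/* user-visible prefix; 0 = all */
        };

        static int xt_data_to_user_sketch(void __user *dst, const void *src,
        				  int usersize, int size)
        {
        	if (copy_to_user(dst, src, usersize))
        		return -EFAULT;
        	if (usersize != size &&
        	    clear_user(dst + usersize, size - usersize))
        		return -EFAULT;
        	return 0;
        }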
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
  3. 03 Jan, 2017 (10 commits)
  4. 02 Jan, 2017 (3 commits)
  5. 30 Dec, 2016 (1 commit)
    • net: dev_weight: TX/RX orthogonality · 3d48b53f
      Committed by Matthias Tafelmeier
      Often, adjusting one of TX/RX via sysctl introduces unwanted side
      effects on packet processing in the other half of the stack. There
      are cases that demand asymmetric, orthogonal configurability.
      
      This holds true especially for nodes where RPS (with RFS on top) is
      configured, and which therefore use the 'old dev_weight'. This is
      quite a common base configuration nowadays, even with NICs that have
      superior processing support (e.g. aRFS).
      
      A good example use case is nodes acting as NoSQL databases, with a
      large number of tiny requests and fewer but larger packets as
      responses. A large budget and RX dev_weight are affordable for the
      requests, but as a side effect, processing that many packets on TX
      in one run can overwhelm drivers.
      
      This patch therefore exposes independent TX and RX configurability
      to userland via sysctl, as sketched below.
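
      A hedged sketch of the resulting knobs; the names below
      (dev_weight_rx_bias, dev_weight_tx_bias) are bias multipliers
      layered on the existing net.core.dev_weight, with the exact shape
      an assumption of this sketch:

        /* Sketch: derive asymmetric per-run budgets from one base weight. */
        int dev_weight = 64;		/* net.core.dev_weight */
        int dev_weight_rx_bias = 1;	/* net.core.dev_weight_rx_bias */
        int dev_weight_tx_bias = 1;	/* net.core.dev_weight_tx_bias */

        /* RX: NAPI poll budget; TX: qdisc dequeue quota per run. */
        int dev_rx_weight(void) { return dev_weight * dev_weight_rx_bias; }
        int dev_tx_weight(void) { return dev_weight * dev_weight_tx_bias; }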
      Signed-off-by: Matthias Tafelmeier <matthias.tafelmeier@gmx.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  6. 28 Dec, 2016 (1 commit)
  7. 26 Dec, 2016 (5 commits)
    • mm: add PageWaiters indicating tasks are waiting for a page bit · 62906027
      Committed by Nicholas Piggin
      Add a new page flag, PageWaiters, to indicate the page waitqueue has
      tasks waiting. This can be tested rather than testing waitqueue_active
      which requires another cacheline load.
      
      This bit is always set when the page has tasks on page_waitqueue(page),
      and is set and cleared under the waitqueue lock. It may be set when
      there are no tasks on the waitqueue, which will cause a harmless extra
      wakeup check that clears the bit.
      
      The generic bit-waitqueue infrastructure is no longer used for pages.
      Instead, waitqueues are used directly with a custom key type. The
      generic code was not flexible enough to have PageWaiters manipulation
      under the waitqueue lock (which simplifies concurrency).
      
      This improves the performance of page lock intensive microbenchmarks by
      2-3%.
      
      Putting two bits in the same word opens the opportunity to remove the
      memory barrier between clearing the lock bit and testing the waiters
      bit, after some work on the arch primitives (e.g., ensuring memory
      operand widths match and cover both bits).
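
      A simplified sketch of the resulting unlock path (names follow
      mm/filemap.c; compound-page handling and assertions elided):

        void unlock_page_sketch(struct page *page)
        {
        	clear_bit_unlock(PG_locked, &page->flags);
        	/* Order the lock-bit clear against the waiters test; this is
        	 * the barrier the last paragraph aims to remove later. */
        	smp_mb__after_atomic();
        	if (PageWaiters(page))
        		wake_up_page_bit(page, PG_locked);
        }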
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Bob Peterson <rpeterso@redhat.com>
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Cc: Andrew Lutomirski <luto@kernel.org>
      Cc: Andreas Gruenbacher <agruenba@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: Use owner_priv bit for PageSwapCache, valid when PageSwapBacked · 6326fec1
      Committed by Nicholas Piggin
      A page is not added to the swap cache without being swap backed,
      so PageSwapBacked mappings can use PG_owner_priv_1 for PageSwapCache.
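
      Roughly, the test becomes the following (a sketch; the real flag
      macros are generated in include/linux/page-flags.h):

        /* PG_swapcache now aliases PG_owner_priv_1 and is only meaningful
         * while the page is swap backed. */
        static inline int PageSwapCache_sketch(struct page *page)
        {
        	return PageSwapBacked(page) &&
        	       test_bit(PG_owner_priv_1, &page->flags);
        }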
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Acked-by: Hugh Dickins <hughd@google.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Bob Peterson <rpeterso@redhat.com>
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Cc: Andrew Lutomirski <luto@kernel.org>
      Cc: Andreas Gruenbacher <agruenba@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ktime: Get rid of ktime_equal() · 1f3a8e49
      Committed by Thomas Gleixner
      No point in going through loops and hoops instead of just comparing the
      values.
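
      An illustrative (hypothetical) call site before and after:

        /* Before: a helper hiding a trivial comparison. */
        if (ktime_equal(timer->expires, now))
        	return;

        /* After: ktime_t is a scalar, so compare directly. */
        if (timer->expires == now)
        	return;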
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
    • ktime: Cleanup ktime_set() usage · 8b0e1953
      Committed by Thomas Gleixner
      ktime_set(S,N) was required for the timespec storage type and is still
      useful for situations where a Seconds and Nanoseconds part of a time value
      needs to be converted. For anything where the Seconds argument is 0, this
      is pointless and can be replaced with a simple assignment.
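
      For example (illustrative only):

        ktime_t kt;

        /* Pointless when the seconds part is 0 ... */
        kt = ktime_set(0, 10 * NSEC_PER_MSEC);
        /* ... since scalar nanoseconds can simply be assigned: */
        kt = 10 * NSEC_PER_MSEC;

        /* Still useful when converting a two-part value: */
        kt = ktime_set(ts.tv_sec, ts.tv_nsec);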
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
    • ktime: Get rid of the union · 2456e855
      Committed by Thomas Gleixner
      ktime is a union because the initial implementation stored the time in
      scalar nanoseconds on 64-bit machines and in an endianness-optimized
      timespec variant for 32-bit machines. The Y2038 cleanup removed the
      timespec variant and switched everything to scalar nanoseconds. The
      union remained, but became completely pointless.
      
      Get rid of the union and just keep ktime_t as simple typedef of type s64.
      
      The conversion was done with coccinelle and some manual mopping up.
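
      The change boils down to:

        /* Before: a union with a single remaining member. */
        union ktime {
        	s64	tv64;
        };
        typedef union ktime ktime_t;

        /* After: */
        typedef s64 ktime_t;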
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
  8. 25 Dec, 2016 (9 commits)
  9. 24 Dec, 2016 (1 commit)
  10. 23 Dec, 2016 (1 commit)
    • move aio compat to fs/aio.c · c00d2c7e
      Committed by Al Viro
      ... and fix the minor buglet in compat io_setup(): the native one
      kills the ioctx as cleanup when put_user() fails.  Get rid of the
      bogus compat_... stubs in the !CONFIG_AIO case while we are at it;
      they should simply fail with ENOSYS, the same as their native
      counterparts.
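
      A sketch of the fixed compat error path, simplified from fs/aio.c
      (internal names such as ioctx_alloc() and kill_ioctx() are assumed
      to match that file):

        COMPAT_SYSCALL_DEFINE2(io_setup, unsigned, nr_events, u32 __user *, ctx32p)
        {
        	struct kioctx *ioctx;
        	unsigned long ctx;
        	long ret;

        	ret = get_user(ctx, ctx32p);
        	if (unlikely(ret))
        		return ret;
        	if (unlikely(ctx || nr_events == 0))
        		return -EINVAL;

        	ioctx = ioctx_alloc(nr_events);
        	ret = PTR_ERR(ioctx);
        	if (!IS_ERR(ioctx)) {
        		ret = put_user(ioctx->user_id, ctx32p);
        		if (ret)	/* the buglet fix: clean up like the native path */
        			kill_ioctx(current->mm, ioctx, NULL);
        		percpu_ref_put(&ioctx->users);
        	}
        	return ret;
        }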
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  11. 21 Dec, 2016 (2 commits)
  12. 20 Dec, 2016 (2 commits)
  13. 19 Dec, 2016 (1 commit)
  14. 18 Dec, 2016 (1 commit)
    • bpf: fix overflow in prog accounting · 5ccb071e
      Committed by Daniel Borkmann
      Commit aaac3ba9 ("bpf: charge user for creation of BPF maps and
      programs") made a wrong assumption of charging against prog->pages.
      Unlike map->pages, prog->pages are still subject to change when we
      need to expand the program through bpf_prog_realloc().
      
      This can for example happen during the verification stage when we need
      to expand and rewrite parts of the program. Should the required space
      cross a page boundary, then prog->pages is no longer the same as the
      original value that we passed to bpf_prog_charge_memlock(). Thus,
      we'll hit a wrap-around during bpf_prog_uncharge_memlock() when the
      prog is eventually freed. I noticed this when, despite having
      unlimited memlock, programs suddenly refused to load with an EPERM
      error due to insufficient memlock.
      
      There are two ways to fix this issue. One would be to add a cached
      variable to struct bpf_prog that takes a snapshot of prog->pages at the
      time of charging. The other approach is to also account for resizes. I
      chose the latter for a couple of reasons: i) we want the accounting to
      be more accurate rather than further fooling the limits, and ii) adding
      yet another page counter to struct bpf_prog would be a waste for this
      purpose alone. We also want to charge as early as possible, to avoid
      going into the verifier just to find out later that we crossed the
      limits. The only place that needs fixing is bpf_prog_realloc(), since
      it is the only place where we expand the program; there we try to
      charge the needed delta and, should that fail, call sites check the
      outcome anyway. On cBPF to eBPF migrations we don't grab a reference
      to the user, as they are charged differently. With that in place, my
      test case worked fine.
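
      A sketch of the delta accounting in bpf_prog_realloc() (simplified;
      the helper name __bpf_prog_charge() matches what this change
      introduces, and the reallocation itself is elided):

        struct bpf_prog *bpf_prog_realloc_sketch(struct bpf_prog *fp_old, u32 size)
        {
        	u32 pages = round_up(size, PAGE_SIZE) / PAGE_SIZE;
        	u32 delta;

        	if (pages <= fp_old->pages)
        		return fp_old;		/* still fits, nothing to charge */

        	/* Charge only the growth, so prog->pages and the memlock
        	 * accounting cannot diverge and later underflow on free. */
        	delta = pages - fp_old->pages;
        	if (__bpf_prog_charge(fp_old->aux->user, delta))
        		return NULL;		/* call sites check the outcome */

        	/* ... reallocate and copy as before, then: */
        	fp_old->pages = pages;
        	return fp_old;
        }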
      
      Fixes: aaac3ba9 ("bpf: charge user for creation of BPF maps and programs")
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>