1. 11 10月, 2015 1 次提交
    • A
      bpf: fix cb access in socket filter programs · ff936a04
      Alexei Starovoitov 提交于
      eBPF socket filter programs may see junk in 'u32 cb[5]' area,
      since it could have been used by protocol layers earlier.
      
      For socket filter programs used in af_packet we need to clean
      20 bytes of skb->cb area if it could be used by the program.
      For programs attached to TCP/UDP sockets we need to save/restore
      these 20 bytes, since it's used by protocol layers.
      
      Remove SK_RUN_FILTER macro, since it's no longer used.
      
      Long term we may move this bpf cb area to per-cpu scratch, but that
      requires addition of new 'per-cpu load/store' instructions,
      so not suitable as a short term fix.
      
      Fixes: d691f9e8 ("bpf: allow programs to write to certain skb fields")
      Reported-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NAlexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ff936a04
  2. 09 10月, 2015 10 次提交
  3. 08 10月, 2015 6 次提交
  4. 07 10月, 2015 1 次提交
  5. 06 10月, 2015 2 次提交
  6. 05 10月, 2015 4 次提交
  7. 03 10月, 2015 3 次提交
    • S
      phylib: Add phy_set_max_speed helper · f3a6bd39
      Simon Horman 提交于
      Add a helper to allow ethernet drivers to limit the speed of a phy
      (that they are attached to).
      
      This mainly involves factoring out the business-end of
      of_set_phy_supported() and exporting a new symbol.
      
      This code seems to be open coded in several places, in several different
      variants.
      
      It is is envisaged that this will be used in situations where setting the
      "max-speed" property in DT is not appropriate, e.g. because the maximum
      speed is not a property of the phy hardware.
      Signed-off-by: NSimon Horman <horms+renesas@verge.net.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f3a6bd39
    • D
      sched, bpf: add helper for retrieving routing realms · c46646d0
      Daniel Borkmann 提交于
      Using routing realms as part of the classifier is quite useful, it
      can be viewed as a tag for one or multiple routing entries (think of
      an analogy to net_cls cgroup for processes), set by user space routing
      daemons or via iproute2 as an indicator for traffic classifiers and
      later on processed in the eBPF program.
      
      Unlike actions, the classifier can inspect device flags and enable
      netif_keep_dst() if necessary. tc actions don't have that possibility,
      but in case people know what they are doing, it can be used from there
      as well (e.g. via devs that must keep dsts by design anyway).
      
      If a realm is set, the handler returns the non-zero realm. User space
      can set the full 32bit realm for the dst.
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NAlexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c46646d0
    • D
      ebpf: migrate bpf_prog's flags to bitfield · a91263d5
      Daniel Borkmann 提交于
      As we need to add further flags to the bpf_prog structure, lets migrate
      both bools to a bitfield representation. The size of the base structure
      (excluding insns) remains unchanged at 40 bytes.
      
      Add also tags for the kmemchecker, so that it doesn't throw false
      positives. Even in case gcc would generate suboptimal code, it's not
      being accessed in performance critical paths.
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NAlexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a91263d5
  8. 02 10月, 2015 2 次提交
    • G
      memcg: remove pcp_counter_lock · ef510194
      Greg Thelen 提交于
      Commit 733a572e ("memcg: make mem_cgroup_read_{stat|event}() iterate
      possible cpus instead of online") removed the last use of the per memcg
      pcp_counter_lock but forgot to remove the variable.
      
      Kill the vestigial variable.
      Signed-off-by: NGreg Thelen <gthelen@google.com>
      Acked-by: NMichal Hocko <mhocko@suse.com>
      Acked-by: NJohannes Weiner <hannes@cmpxchg.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ef510194
    • G
      memcg: fix dirty page migration · 0610c25d
      Greg Thelen 提交于
      The problem starts with a file backed dirty page which is charged to a
      memcg.  Then page migration is used to move oldpage to newpage.
      
      Migration:
       - copies the oldpage's data to newpage
       - clears oldpage.PG_dirty
       - sets newpage.PG_dirty
       - uncharges oldpage from memcg
       - charges newpage to memcg
      
      Clearing oldpage.PG_dirty decrements the charged memcg's dirty page
      count.
      
      However, because newpage is not yet charged, setting newpage.PG_dirty
      does not increment the memcg's dirty page count.  After migration
      completes newpage.PG_dirty is eventually cleared, often in
      account_page_cleaned().  At this time newpage is charged to a memcg so
      the memcg's dirty page count is decremented which causes underflow
      because the count was not previously incremented by migration.  This
      underflow causes balance_dirty_pages() to see a very large unsigned
      number of dirty memcg pages which leads to aggressive throttling of
      buffered writes by processes in non root memcg.
      
      This issue:
       - can harm performance of non root memcg buffered writes.
       - can report too small (even negative) values in
         memory.stat[(total_)dirty] counters of all memcg, including the root.
      
      To avoid polluting migrate.c with #ifdef CONFIG_MEMCG checks, introduce
      page_memcg() and set_page_memcg() helpers.
      
      Test:
          0) setup and enter limited memcg
          mkdir /sys/fs/cgroup/test
          echo 1G > /sys/fs/cgroup/test/memory.limit_in_bytes
          echo $$ > /sys/fs/cgroup/test/cgroup.procs
      
          1) buffered writes baseline
          dd if=/dev/zero of=/data/tmp/foo bs=1M count=1k
          sync
          grep ^dirty /sys/fs/cgroup/test/memory.stat
      
          2) buffered writes with compaction antagonist to induce migration
          yes 1 > /proc/sys/vm/compact_memory &
          rm -rf /data/tmp/foo
          dd if=/dev/zero of=/data/tmp/foo bs=1M count=1k
          kill %
          sync
          grep ^dirty /sys/fs/cgroup/test/memory.stat
      
          3) buffered writes without antagonist, should match baseline
          rm -rf /data/tmp/foo
          dd if=/dev/zero of=/data/tmp/foo bs=1M count=1k
          sync
          grep ^dirty /sys/fs/cgroup/test/memory.stat
      
                             (speed, dirty residue)
                   unpatched                       patched
          1) 841 MB/s 0 dirty pages          886 MB/s 0 dirty pages
          2) 611 MB/s -33427456 dirty pages  793 MB/s 0 dirty pages
          3) 114 MB/s -33427456 dirty pages  891 MB/s 0 dirty pages
      
          Notice that unpatched baseline performance (1) fell after
          migration (3): 841 -> 114 MB/s.  In the patched kernel, post
          migration performance matches baseline.
      
      Fixes: c4843a75 ("memcg: add per cgroup dirty page accounting")
      Signed-off-by: NGreg Thelen <gthelen@google.com>
      Reported-by: NDave Hansen <dave.hansen@intel.com>
      Acked-by: NMichal Hocko <mhocko@suse.com>
      Acked-by: NJohannes Weiner <hannes@cmpxchg.org>
      Cc: <stable@vger.kernel.org>	[4.2+]
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      0610c25d
  9. 30 9月, 2015 11 次提交