1. 11 12月, 2018 1 次提交
  2. 01 12月, 2018 1 次提交
    • J
      psi: make disabling/enabling easier for vendor kernels · e0c27447
      Johannes Weiner 提交于
      Mel Gorman reports a hackbench regression with psi that would prohibit
      shipping the suse kernel with it default-enabled, but he'd still like
      users to be able to opt in at little to no cost to others.
      
      With the current combination of CONFIG_PSI and the psi_disabled bool set
      from the commandline, this is a challenge.  Do the following things to
      make it easier:
      
      1. Add a config option CONFIG_PSI_DEFAULT_DISABLED that allows distros
         to enable CONFIG_PSI in their kernel but leave the feature disabled
         unless a user requests it at boot-time.
      
         To avoid double negatives, rename psi_disabled= to psi=.
      
      2. Make psi_disabled a static branch to eliminate any branch costs
         when the feature is disabled.
      
      In terms of numbers before and after this patch, Mel says:
      
      : The following is a comparision using CONFIG_PSI=n as a baseline against
      : your patch and a vanilla kernel
      :
      :                          4.20.0-rc4             4.20.0-rc4             4.20.0-rc4
      :                 kconfigdisable-v1r1                vanilla        psidisable-v1r1
      : Amean     1       1.3100 (   0.00%)      1.3923 (  -6.28%)      1.3427 (  -2.49%)
      : Amean     3       3.8860 (   0.00%)      4.1230 *  -6.10%*      3.8860 (  -0.00%)
      : Amean     5       6.8847 (   0.00%)      8.0390 * -16.77%*      6.7727 (   1.63%)
      : Amean     7       9.9310 (   0.00%)     10.8367 *  -9.12%*      9.9910 (  -0.60%)
      : Amean     12     16.6577 (   0.00%)     18.2363 *  -9.48%*     17.1083 (  -2.71%)
      : Amean     18     26.5133 (   0.00%)     27.8833 *  -5.17%*     25.7663 (   2.82%)
      : Amean     24     34.3003 (   0.00%)     34.6830 (  -1.12%)     32.0450 (   6.58%)
      : Amean     30     40.0063 (   0.00%)     40.5800 (  -1.43%)     41.5087 (  -3.76%)
      : Amean     32     40.1407 (   0.00%)     41.2273 (  -2.71%)     39.9417 (   0.50%)
      :
      : It's showing that the vanilla kernel takes a hit (as the bisection
      : indicated it would) and that disabling PSI by default is reasonably
      : close in terms of performance for this particular workload on this
      : particular machine so;
      
      Link: http://lkml.kernel.org/r/20181127165329.GA29728@cmpxchg.orgSigned-off-by: NJohannes Weiner <hannes@cmpxchg.org>
      Tested-by: NMel Gorman <mgorman@techsingularity.net>
      Reported-by: NMel Gorman <mgorman@techsingularity.net>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      e0c27447
  3. 30 11月, 2018 2 次提交
    • Z
      tracepoint: Use __idx instead of idx in DO_TRACE macro to make it unique · 0c7a52e4
      Zenghui Yu 提交于
      After enabling KVM event tracing, almost all of trace_kvm_exit()'s
      printk shows
      
      	"kvm_exit: IRQ: ..."
      
      even if the actual exception_type is NOT IRQ.  More specifically,
      trace_kvm_exit() is defined in virt/kvm/arm/trace.h by TRACE_EVENT.
      
      This slight problem may have existed after commit e6753f23
      ("tracepoint: Make rcuidle tracepoint callers use SRCU"). There are
      two variables in trace_kvm_exit() and __DO_TRACE() which have the
      same name, *idx*. Thus the actual value of *idx* will be overwritten
      when tracing. Fix it by adding a simple prefix.
      
      Cc: Joel Fernandes <joel@joelfernandes.org>
      Cc: Wang Haibin <wanghaibin.wang@huawei.com>
      Cc: linux-trace-devel@vger.kernel.org
      Cc: stable@vger.kernel.org
      Fixes: e6753f23 ("tracepoint: Make rcuidle tracepoint callers use SRCU")
      Reviewed-by: NJoel Fernandes (Google) <joel@joelfernandes.org>
      Signed-off-by: NZenghui Yu <yuzenghui@huawei.com>
      Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
      0c7a52e4
    • K
      pstore/ram: Correctly calculate usable PRZ bytes · 89d328f6
      Kees Cook 提交于
      The actual number of bytes stored in a PRZ is smaller than the
      bytes requested by platform data, since there is a header on each
      PRZ. Additionally, if ECC is enabled, there are trailing bytes used
      as well. Normally this mismatch doesn't matter since PRZs are circular
      buffers and the leading "overflow" bytes are just thrown away. However, in
      the case of a compressed record, this rather badly corrupts the results.
      
      This corruption was visible with "ramoops.mem_size=204800 ramoops.ecc=1".
      Any stored crashes would not be uncompressable (producing a pstorefs
      "dmesg-*.enc.z" file), and triggering errors at boot:
      
        [    2.790759] pstore: crypto_comp_decompress failed, ret = -22!
      
      Backporting this depends on commit 70ad35db ("pstore: Convert console
      write to use ->write_buf")
      Reported-by: NJoel Fernandes <joel@joelfernandes.org>
      Fixes: b0aad7a9 ("pstore: Add compression support to pstore")
      Signed-off-by: NKees Cook <keescook@chromium.org>
      Reviewed-by: NJoel Fernandes (Google) <joel@joelfernandes.org>
      89d328f6
  4. 28 11月, 2018 7 次提交
    • K
      fscache: Fix race in fscache_op_complete() due to split atomic_sub & read · 3f2b7b90
      kiran.modukuri 提交于
      The code in fscache_retrieval_complete is using atomic_sub followed by an
      atomic_read:
      
              atomic_sub(n_pages, &op->n_pages);
              if (atomic_read(&op->n_pages) <= 0)
                      fscache_op_complete(&op->op, true);
      
      This causes two threads doing a decrement of n_pages to race with each
      other seeing the op->refcount 0 at same time - and they end up calling
      fscache_op_complete() in both the threads leading to an assertion failure.
      
      Fix this by using atomic_sub_return_relaxed() instead of two calls.  Note
      that I'm using 'relaxed' rather than, say, 'release' as there aren't
      multiple variables that appear to need ordering across the release.
      
      The oops looks something like:
      
      FS-Cache: Assertion failed
      FS-Cache: 0 > 0 is false
      ...
      kernel BUG at /usr/src/linux-4.4.0/fs/fscache/operation.c:449!
      ...
      Workqueue: fscache_operation fscache_op_work_func [fscache]
      ...
      RIP: 0010:[<ffffffffc037eacd>] fscache_op_complete+0x10d/0x180 [fscache]
      ...
      Call Trace:
       [<ffffffffc1464cf9>] cachefiles_read_copier+0x3a9/0x410 [cachefiles]
       [<ffffffffc037e272>] fscache_op_work_func+0x22/0x50 [fscache]
       [<ffffffff81096da0>] process_one_work+0x150/0x3f0
       [<ffffffff8109751a>] worker_thread+0x11a/0x470
       [<ffffffff81808e59>] ? __schedule+0x359/0x980
       [<ffffffff81097400>] ? rescuer_thread+0x310/0x310
       [<ffffffff8109cdd6>] kthread+0xd6/0xf0
       [<ffffffff8109cd00>] ? kthread_park+0x60/0x60
       [<ffffffff8180d0cf>] ret_from_fork+0x3f/0x70
       [<ffffffff8109cd00>] ? kthread_park+0x60/0x60
      
      This seen this in 4.4.x kernels and the same bug affects fscache in latest
      upstreams kernels.
      
      Fixes: 1bb4b7f9 ("FS-Cache: The retrieval remaining-pages counter needs to be atomic_t")
      Signed-off-by: NKiran Kumar Modukuri <kiran.modukuri@gmail.com>
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      3f2b7b90
    • T
      x86/speculation: Add prctl() control for indirect branch speculation · 9137bb27
      Thomas Gleixner 提交于
      Add the PR_SPEC_INDIRECT_BRANCH option for the PR_GET_SPECULATION_CTRL and
      PR_SET_SPECULATION_CTRL prctls to allow fine grained per task control of
      indirect branch speculation via STIBP and IBPB.
      
      Invocations:
       Check indirect branch speculation status with
       - prctl(PR_GET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, 0, 0, 0);
      
       Enable indirect branch speculation with
       - prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, PR_SPEC_ENABLE, 0, 0);
      
       Disable indirect branch speculation with
       - prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, PR_SPEC_DISABLE, 0, 0);
      
       Force disable indirect branch speculation with
       - prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, PR_SPEC_FORCE_DISABLE, 0, 0);
      
      See Documentation/userspace-api/spec_ctrl.rst.
      Signed-off-by: NTim Chen <tim.c.chen@linux.intel.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NIngo Molnar <mingo@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Casey Schaufler <casey.schaufler@intel.com>
      Cc: Asit Mallick <asit.k.mallick@intel.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Jon Masters <jcm@redhat.com>
      Cc: Waiman Long <longman9394@gmail.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: Dave Stewart <david.c.stewart@intel.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20181125185005.866780996@linutronix.de
      9137bb27
    • T
      ptrace: Remove unused ptrace_may_access_sched() and MODE_IBRS · 46f7ecb1
      Thomas Gleixner 提交于
      The IBPB control code in x86 removed the usage. Remove the functionality
      which was introduced for this.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NIngo Molnar <mingo@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Casey Schaufler <casey.schaufler@intel.com>
      Cc: Asit Mallick <asit.k.mallick@intel.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Jon Masters <jcm@redhat.com>
      Cc: Waiman Long <longman9394@gmail.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: Dave Stewart <david.c.stewart@intel.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20181125185005.559149393@linutronix.de
      
      46f7ecb1
    • T
      x86/speculation: Rework SMT state change · a74cfffb
      Thomas Gleixner 提交于
      arch_smt_update() is only called when the sysfs SMT control knob is
      changed. This means that when SMT is enabled in the sysfs control knob the
      system is considered to have SMT active even if all siblings are offline.
      
      To allow finegrained control of the speculation mitigations, the actual SMT
      state is more interesting than the fact that siblings could be enabled.
      
      Rework the code, so arch_smt_update() is invoked from each individual CPU
      hotplug function, and simplify the update function while at it.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NIngo Molnar <mingo@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Casey Schaufler <casey.schaufler@intel.com>
      Cc: Asit Mallick <asit.k.mallick@intel.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Jon Masters <jcm@redhat.com>
      Cc: Waiman Long <longman9394@gmail.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: Dave Stewart <david.c.stewart@intel.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20181125185004.521974984@linutronix.de
      
      a74cfffb
    • T
      sched/smt: Expose sched_smt_present static key · 321a874a
      Thomas Gleixner 提交于
      Make the scheduler's 'sched_smt_present' static key globaly available, so
      it can be used in the x86 speculation control code.
      
      Provide a query function and a stub for the CONFIG_SMP=n case.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NIngo Molnar <mingo@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Casey Schaufler <casey.schaufler@intel.com>
      Cc: Asit Mallick <asit.k.mallick@intel.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Jon Masters <jcm@redhat.com>
      Cc: Waiman Long <longman9394@gmail.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: Dave Stewart <david.c.stewart@intel.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20181125185004.430168326@linutronix.de
      321a874a
    • S
      function_graph: Use new curr_ret_depth to manage depth instead of curr_ret_stack · 39eb456d
      Steven Rostedt (VMware) 提交于
      Currently, the depth of the ret_stack is determined by curr_ret_stack index.
      The issue is that there's a race between setting of the curr_ret_stack and
      calling of the callback attached to the return of the function.
      
      Commit 03274a3f ("tracing/fgraph: Adjust fgraph depth before calling
      trace return callback") moved the calling of the callback to after the
      setting of the curr_ret_stack, even stating that it was safe to do so, when
      in fact, it was the reason there was a barrier() there (yes, I should have
      commented that barrier()).
      
      Not only does the curr_ret_stack keep track of the current call graph depth,
      it also keeps the ret_stack content from being overwritten by new data.
      
      The function profiler, uses the "subtime" variable of ret_stack structure
      and by moving the curr_ret_stack, it allows for interrupts to use the same
      structure it was using, corrupting the data, and breaking the profiler.
      
      To fix this, there needs to be two variables to handle the call stack depth
      and the pointer to where the ret_stack is being used, as they need to change
      at two different locations.
      
      Cc: stable@kernel.org
      Fixes: 03274a3f ("tracing/fgraph: Adjust fgraph depth before calling trace return callback")
      Reviewed-by: NMasami Hiramatsu <mhiramat@kernel.org>
      Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
      39eb456d
    • S
      function_graph: Make ftrace_push_return_trace() static · d125f3f8
      Steven Rostedt (VMware) 提交于
      As all architectures now call function_graph_enter() to do the entry work,
      no architecture should ever call ftrace_push_return_trace(). Make it static.
      
      This is needed to prepare for a fix of a design bug on how the curr_ret_stack
      is used.
      
      Cc: stable@kernel.org
      Fixes: 03274a3f ("tracing/fgraph: Adjust fgraph depth before calling trace return callback")
      Reviewed-by: NMasami Hiramatsu <mhiramat@kernel.org>
      Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
      d125f3f8
  5. 27 11月, 2018 2 次提交
    • D
      bpf, ppc64: generalize fetching subprog into bpf_jit_get_func_addr · e2c95a61
      Daniel Borkmann 提交于
      Make fetching of the BPF call address from ppc64 JIT generic. ppc64
      was using a slightly different variant rather than through the insns'
      imm field encoding as the target address would not fit into that space.
      Therefore, the target subprog number was encoded into the insns' offset
      and fetched through fp->aux->func[off]->bpf_func instead. Given there
      are other JITs with this issue and the mechanism of fetching the address
      is JIT-generic, move it into the core as a helper instead. On the JIT
      side, we get information on whether the retrieved address is a fixed
      one, that is, not changing through JIT passes, or a dynamic one. For
      the former, JITs can optimize their imm emission because this doesn't
      change jump offsets throughout JIT process.
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Reviewed-by: NSandipan Das <sandipan@linux.ibm.com>
      Tested-by: NSandipan Das <sandipan@linux.ibm.com>
      Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
      e2c95a61
    • S
      function_graph: Create function_graph_enter() to consolidate architecture code · 8114865f
      Steven Rostedt (VMware) 提交于
      Currently all the architectures do basically the same thing in preparing the
      function graph tracer on entry to a function. This code can be pulled into a
      generic location and then this will allow the function graph tracer to be
      fixed, as well as extended.
      
      Create a new function graph helper function_graph_enter() that will call the
      hook function (ftrace_graph_entry) and the shadow stack operation
      (ftrace_push_return_trace), and remove the need of the architecture code to
      manage the shadow stack.
      
      This is needed to prepare for a fix of a design bug on how the curr_ret_stack
      is used.
      
      Cc: stable@kernel.org
      Fixes: 03274a3f ("tracing/fgraph: Adjust fgraph depth before calling trace return callback")
      Reviewed-by: NMasami Hiramatsu <mhiramat@kernel.org>
      Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
      8114865f
  6. 26 11月, 2018 2 次提交
    • B
      gpio: davinci: restore a way to manually specify the GPIO base · 786a9ab1
      Bartosz Golaszewski 提交于
      Commit 587f7a69 ("gpio: davinci: Use dev name for label and
      automatic base selection") broke the network support in legacy boot
      mode for da850-evm since we can no longer request the MDIO clock GPIO.
      
      Other boards may be broken too, which I haven't tested.
      
      The problem is in the fact that most board files still use the legacy
      GPIO API where lines are requested by numbers rather than descriptors.
      
      While this should be fixed eventually, in order to unbreak the board
      for now - provide a way to manually specify the GPIO base in platform
      data.
      
      Fixes: 587f7a69 ("gpio: davinci: Use dev name for label and automatic base selection")
      Cc: stable@vger.kernel.org
      Signed-off-by: NBartosz Golaszewski <bgolaszewski@baylibre.com>
      Acked-by: NLinus Walleij <linus.walleij@linaro.org>
      Signed-off-by: NSekhar Nori <nsekhar@ti.com>
      786a9ab1
    • F
      netfilter: nfnetlink_cttimeout: fetch timeouts for udplite and gre, too · 89259088
      Florian Westphal 提交于
      syzbot was able to trigger the WARN in cttimeout_default_get() by
      passing UDPLITE as l4protocol.  Alias UDPLITE to UDP, both use
      same timeout values.
      
      Furthermore, also fetch GRE timeouts.  GRE is a bit more complicated,
      as it still can be a module and its netns_proto_gre struct layout isn't
      visible outside of the gre module. Can't move timeouts around, it
      appears conntrack sysctl unregister assumes net_generic() returns
      nf_proto_net, so we get crash. Expose layout of netns_proto_gre instead.
      
      A followup nf-next patch could make gre tracker be built-in as well
      if needed, its not that large.
      
      Last, make the WARN() mention the missing protocol value in case
      anything else is missing.
      
      Reported-by: syzbot+2fae8fa157dd92618cae@syzkaller.appspotmail.com
      Fixes: 8866df92 ("netfilter: nfnetlink_cttimeout: pass default timeout policy to obj_to_nlattr")
      Signed-off-by: NFlorian Westphal <fw@strlen.de>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      89259088
  7. 24 11月, 2018 1 次提交
    • W
      packet: copy user buffers before orphan or clone · 5cd8d46e
      Willem de Bruijn 提交于
      tpacket_snd sends packets with user pages linked into skb frags. It
      notifies that pages can be reused when the skb is released by setting
      skb->destructor to tpacket_destruct_skb.
      
      This can cause data corruption if the skb is orphaned (e.g., on
      transmit through veth) or cloned (e.g., on mirror to another psock).
      
      Create a kernel-private copy of data in these cases, same as tun/tap
      zerocopy transmission. Reuse that infrastructure: mark the skb as
      SKBTX_ZEROCOPY_FRAG, which will trigger copy in skb_orphan_frags(_rx).
      
      Unlike other zerocopy packets, do not set shinfo destructor_arg to
      struct ubuf_info. tpacket_destruct_skb already uses that ptr to notify
      when the original skb is released and a timestamp is recorded. Do not
      change this timestamp behavior. The ubuf_info->callback is not needed
      anyway, as no zerocopy notification is expected.
      
      Mark destructor_arg as not-a-uarg by setting the lower bit to 1. The
      resulting value is not a valid ubuf_info pointer, nor a valid
      tpacket_snd frame address. Add skb_zcopy_.._nouarg helpers for this.
      
      The fix relies on features introduced in commit 52267790 ("sock:
      add MSG_ZEROCOPY"), so can be backported as is only to 4.14.
      
      Tested with from `./in_netns.sh ./txring_overwrite` from
      http://github.com/wdebruij/kerneltools/tests
      
      Fixes: 69e3c75f ("net: TX_RING and packet mmap")
      Reported-by: NAnand H. Krishnan <anandhkrishnan@gmail.com>
      Signed-off-by: NWillem de Bruijn <willemb@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5cd8d46e
  8. 23 11月, 2018 1 次提交
    • T
      net/dim: Update DIM start sample after each DIM iteration · 0211dda6
      Tal Gilboa 提交于
      On every iteration of net_dim, the algorithm may choose to
      check for the system state by comparing current data sample
      with previous data sample. After each of these comparison,
      regardless of the action taken, the sample used as baseline
      is needed to be updated.
      
      This patch fixes a bug that causes DIM to take wrong decisions,
      due to never updating the baseline sample for comparison between
      iterations. This way, DIM always compares current sample with
      zeros.
      
      Although this is a functional fix, it also improves and stabilizes
      performance as the algorithm works properly now.
      
      Performance:
      Tested single UDP TX stream with pktgen:
      samples/pktgen/pktgen_sample03_burst_single_flow.sh -i p4p2 -d 1.1.1.1
      -m 24:8a:07:88:26:8b -f 3 -b 128
      
      ConnectX-5 100GbE packet rate improved from 15-19Mpps to 19-20Mpps.
      Also, toggling between profiles is less frequent with the fix.
      
      Fixes: 8115b750 ("net/dim: use struct net_dim_sample as arg to net_dim")
      Signed-off-by: NTal Gilboa <talgi@mellanox.com>
      Reviewed-by: NTariq Toukan <tariqt@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0211dda6
  9. 22 11月, 2018 3 次提交
  10. 16 11月, 2018 1 次提交
  11. 15 11月, 2018 1 次提交
  12. 12 11月, 2018 1 次提交
  13. 10 11月, 2018 3 次提交
  14. 09 11月, 2018 1 次提交
  15. 08 11月, 2018 1 次提交
  16. 07 11月, 2018 4 次提交
  17. 06 11月, 2018 8 次提交