1. 21 Sep, 2016 15 commits
    •
      tcp: use windowed min filter library for TCP min_rtt estimation · 64033892
      Neal Cardwell committed
      Refactor the TCP min_rtt code to reuse the new win_minmax library in
      lib/win_minmax.c to simplify the TCP code.
      
      This is a pure refactor: the functionality is exactly the same. We
      just moved the windowed min code to make TCP easier to read and
      maintain, and to allow other parts of the kernel to use the windowed
      min/max filter code.
      Signed-off-by: Van Jacobson <vanj@google.com>
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: Nandita Dukkipati <nanditad@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    •
      lib/win_minmax: windowed min or max estimator · a4f1f9ac
      Neal Cardwell committed
      This commit introduces a generic library to estimate either the min or
      max value of a time-varying variable over a recent time window. This
      is code originally from Kathleen Nichols. The current form of the code
      is from Van Jacobson.
      
      A single struct minmax_sample will track the estimated windowed-max
      value of the series if you call minmax_running_max() or the estimated
      windowed-min value of the series if you call minmax_running_min().
      
      Nearly equivalent code is already in place for minimum RTT estimation
      in the TCP stack. This commit extracts that code and generalizes it to
      handle both min and max. Moving the code here reduces the footprint
      and complexity of the TCP code base and makes the filter generally
      available for other parts of the codebase, including an upcoming TCP
      congestion control module.
      
      This library works well for time series where the measurements are
      smoothly increasing or decreasing.
      Signed-off-by: Van Jacobson <vanj@google.com>
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: Nandita Dukkipati <nanditad@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
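One way to estimate a windowed minimum in constant space is to keep the best, second-best, and third-best samples along with their timestamps. The following is a simplified user-space sketch of that idea, not the kernel's lib/win_minmax.c: the names `wmin_sample`, `wmin_filter`, `wmin_reset`, and `wmin_update` are hypothetical, and time is assumed to be a plain `uint32_t` tick counter.

```c
#include <stdint.h>

/* One measurement: the time it was taken and its value. */
struct wmin_sample {
	uint32_t t;
	uint32_t v;
};

/* s[0] is the current windowed min; s[1] and s[2] are fallbacks. */
struct wmin_filter {
	struct wmin_sample s[3];
};

/* Forget history and treat (t, v) as the only sample seen so far. */
uint32_t wmin_reset(struct wmin_filter *f, uint32_t t, uint32_t v)
{
	struct wmin_sample val = { t, v };

	f->s[0] = f->s[1] = f->s[2] = val;
	return v;
}

/* Feed a new measurement and return the min over the last `win` ticks. */
uint32_t wmin_update(struct wmin_filter *f, uint32_t win,
		     uint32_t t, uint32_t v)
{
	struct wmin_sample val = { t, v };
	uint32_t dt;

	/* A new overall min, or a stale filter, restarts the window. */
	if (v <= f->s[0].v || t - f->s[2].t > win)
		return wmin_reset(f, t, v);

	/* Keep the fallbacks ordered by value. */
	if (v <= f->s[1].v)
		f->s[2] = f->s[1] = val;
	else if (v <= f->s[2].v)
		f->s[2] = val;

	dt = t - f->s[0].t;
	if (dt > win) {
		/* The best sample aged out: promote the fallbacks. */
		f->s[0] = f->s[1];
		f->s[1] = f->s[2];
		f->s[2] = val;
		if (t - f->s[0].t > win) {
			f->s[0] = f->s[1];
			f->s[1] = f->s[2];
			f->s[2] = val;
		}
	} else if (f->s[1].t == f->s[0].t && dt > win / 4) {
		/* A quarter window passed without a second choice. */
		f->s[2] = f->s[1] = val;
	} else if (f->s[2].t == f->s[1].t && dt > win / 2) {
		/* Half a window passed without a third choice. */
		f->s[2] = val;
	}
	return f->s[0].v;
}
```

A caller would seed the filter with `wmin_reset()` and then pass each new measurement to `wmin_update()` together with the window length. A max filter follows by flipping the comparisons.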
    •
      tcp: cdg: rename struct minmax in tcp_cdg.c to avoid a naming conflict · f78e73e2
      Soheil Hassas Yeganeh committed
      The upcoming change "lib/win_minmax: windowed min or max estimator"
      introduces a struct called minmax, which is then included in
      include/linux/tcp.h in the upcoming change "tcp: use windowed min
      filter library for TCP min_rtt estimation". This would create a
      compilation error for tcp_cdg.c, which defines its own minmax
      struct. To avoid this naming conflict (and potentially others in the
      future), this commit renames the version used in tcp_cdg.c to
      cdg_minmax.
      Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Kenneth Klette Jonassen <kennetkl@ifi.uio.no>
      Acked-by: Kenneth Klette Jonassen <kennetkl@ifi.uio.no>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    •
      net: ethernet: mediatek: enhance with avoiding superfluous assignment inside mtk_get_ethtool_stats · 94d308d0
      Sean Wang committed
      data_src is unchanged inside the loop, so this patch moves
      the assignment outside the loop to avoid an unnecessary
      assignment.
      Signed-off-by: Sean Wang <sean.wang@mediatek.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
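The general pattern, hoisting a loop-invariant assignment out of the loop, can be sketched as follows. This is a generic illustration rather than the driver's code: `struct hw_stats` and `copy_stats()` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_COUNTERS 4

/* Hypothetical stand-in for a block of hardware counters. */
struct hw_stats {
	uint64_t counters[NUM_COUNTERS];
};

/* Copy up to n counters into out. */
void copy_stats(const struct hw_stats *hw, uint64_t *out, size_t n)
{
	/* The source pointer does not depend on i, so set it once here
	 * instead of reassigning it on every iteration. */
	const uint64_t *src = hw->counters;
	size_t i;

	for (i = 0; i < n && i < NUM_COUNTERS; i++)
		out[i] = src[i];
}
```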
    •
      net: dsa: mv88e6xxx: handle multiple ports in ATU · 88472939
      Vivien Didelot committed
      An address can be loaded in the ATU with multiple ports, for instance
      when adding multiple ports to a Multicast group with "bridge mdb".
      
      The current code doesn't allow that. Add a helper to get a single entry
      from the ATU, then set or clear the requested port, before loading the
      entry back in the ATU.
      
      Note that the required _mv88e6xxx_atu_getnext function is defined below
      mv88e6xxx_port_db_load_purge, so forward-declare it for the moment. The
      ATU code will be isolated in future patches.
      
      Fixes: 83dabd1f ("net: dsa: mv88e6xxx: make switchdev DB ops generic")
      Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
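The read-modify-write flow described above (fetch the entry, set or clear one port, load it back) can be sketched with an in-memory table. This is a toy model under stated assumptions, not the driver's register access code: `struct toy_atu`, `toy_atu_find()`, `toy_atu_port_update()`, and `toy_atu_ports()` are hypothetical names, and dropping an entry once its port vector is empty is an assumption of the sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_ATU_SIZE 8

/* One address entry with a bitmap of member ports. */
struct toy_atu_entry {
	uint8_t mac[6];
	uint16_t portvec;	/* bit n set: address is loaded for port n */
	bool valid;
};

/* The table must be zero-initialized before first use. */
struct toy_atu {
	struct toy_atu_entry e[TOY_ATU_SIZE];
};

/* Return the entry for mac, or NULL if it is not loaded. */
struct toy_atu_entry *toy_atu_find(struct toy_atu *t, const uint8_t mac[6])
{
	size_t i;

	for (i = 0; i < TOY_ATU_SIZE; i++)
		if (t->e[i].valid && memcmp(t->e[i].mac, mac, 6) == 0)
			return &t->e[i];
	return NULL;
}

/* Add or remove a single port (0..15) without disturbing the others. */
int toy_atu_port_update(struct toy_atu *t, const uint8_t mac[6],
			unsigned int port, bool add)
{
	struct toy_atu_entry *ent = toy_atu_find(t, mac);
	size_t i;

	if (port > 15)
		return -1;
	if (!ent) {
		if (!add)
			return 0;	/* nothing to remove */
		for (i = 0; i < TOY_ATU_SIZE && t->e[i].valid; i++)
			;
		if (i == TOY_ATU_SIZE)
			return -1;	/* table full */
		ent = &t->e[i];
		memcpy(ent->mac, mac, 6);
		ent->portvec = 0;
		ent->valid = true;
	}
	if (add)
		ent->portvec |= (uint16_t)(1u << port);
	else
		ent->portvec &= (uint16_t)~(1u << port);
	/* Assumption: an entry with no ports left is purged. */
	if (ent->portvec == 0)
		ent->valid = false;
	return 0;
}

/* Return the port bitmap for mac, or 0 if the address is not loaded. */
uint16_t toy_atu_ports(struct toy_atu *t, const uint8_t mac[6])
{
	struct toy_atu_entry *ent = toy_atu_find(t, mac);

	return ent ? ent->portvec : 0;
}
```

Overwriting the whole entry on each load would leave only the most recently added port, which is the limitation the commit message describes.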
    •
      net sched actions: fix GETing actions · aecc5cef
      Jamal Hadi Salim committed
      The batch changes translated transient actions into a temporary
      list. Lost in the translation was the fact that
      tcf_action_destroy() will eventually delete the action from
      the permanent location if the refcount is zero.
      
      Example of what broke:
      ...add a gact action to drop
      sudo $TC actions add action drop index 10
      ...now retrieve it, looks good
      sudo $TC actions get action gact index 10
      ...retrieve it again and find it is gone!
      sudo $TC actions get action gact index 10
      
      Fixes: 22dc13c8 ("net_sched: convert tcf_exts from list to pointer array")
      Fixes: 824a7e88 ("net_sched: remove an unnecessary list_del()")
      Fixes: f07fed82 ("net_sched: remove the leftover cleanup_a()")
      Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
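The failure mode in the example above, where the second GET finds nothing, is what an unbalanced reference drop looks like. The toy model below illustrates that class of bug and one way to balance it. It is not the patch itself: all names (`toy_action`, `toy_add`, `toy_put`, `toy_get_unbalanced`, `toy_get_balanced`) are hypothetical, and taking an extra reference during lookup is an assumption of the sketch.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_TABLE_SIZE 4

/* Hypothetical stand-in for an action kept in a permanent table. */
struct toy_action {
	int index;
	int refcnt;
	bool present;
};

struct toy_action toy_table[TOY_TABLE_SIZE];

/* Install an action; the table itself holds one reference. */
void toy_add(size_t slot, int index)
{
	toy_table[slot].index = index;
	toy_table[slot].refcnt = 1;
	toy_table[slot].present = true;
}

struct toy_action *toy_lookup(int index)
{
	size_t i;

	for (i = 0; i < TOY_TABLE_SIZE; i++)
		if (toy_table[i].present && toy_table[i].index == index)
			return &toy_table[i];
	return NULL;
}

/* Drop one reference; the action leaves the table when none remain. */
void toy_put(struct toy_action *a)
{
	if (--a->refcnt <= 0)
		a->present = false;
}

/* Unbalanced GET: cleanup drops a reference the lookup never took. */
bool toy_get_unbalanced(int index)
{
	struct toy_action *a = toy_lookup(index);

	if (!a)
		return false;
	toy_put(a);
	return true;
}

/* Balanced GET: the lookup takes the reference that cleanup drops. */
bool toy_get_balanced(int index)
{
	struct toy_action *a = toy_lookup(index);

	if (!a)
		return false;
	a->refcnt++;
	toy_put(a);
	return true;
}
```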
    •
      Merge branch 'bpf-direct-packet-access-improvements' · 1d9423ae
      David S. Miller committed
      Daniel Borkmann says:
      
      ====================
      BPF direct packet access improvements
      
      This set adds write support to the currently available read support
      for {cls,act}_bpf programs. First one is a fix for affected commit
      sitting in net-next and prerequisite for the second one, last patch
      adds a number of test cases against the verifier. For details, please
      see individual patches.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    •
      bpf: add test cases for direct packet access · 7d95b0ab
      Daniel Borkmann committed
      Add a couple of test cases for direct write and the negative size issue, and
      also adjust the direct packet access test4: it asserted that writes are
      not possible, but since we've just added support for writes, we need to
      invert the verdict to ACCEPT. Summary: 133 PASSED, 0 FAILED.
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    •
      bpf: direct packet write and access for helpers for clsact progs · 36bbef52
      Daniel Borkmann committed
      This work implements direct packet access for helpers and direct packet
      write in a similar fashion as already available for XDP types via commits
      4acf6c0b ("bpf: enable direct packet data write for xdp progs") and
      6841de8b ("bpf: allow helpers access the packet directly"), and as a
      complementary feature to the already available direct packet read for tc
      (cls/act) programs.
      
      For enabling this, we need to introduce two helpers, bpf_skb_pull_data()
      and bpf_csum_update(). The first is generally needed for both read and
      write, because they would otherwise be limited to the current linear
      skb head. Usually, when the data_end test fails, programs just bail out,
      or, in the direct read case, use bpf_skb_load_bytes() as an alternative
      to overcome this limitation. If such data sits in non-linear parts, we
      can just pull them in once with the new helper, retest and eventually
      access them.
      
      At the same time, this also makes sure the skb is uncloned, which is, of
      course, a necessary condition for direct write. As this needs to be an
      invariant for the write part only, the verifier detects writes and adds
      a prologue that is calling bpf_skb_pull_data() to effectively unclone the
      skb from the very beginning in case it is indeed cloned. The heuristic
      makes use of a similar trick that was done in 233577a2 ("net: filter:
      constify detection of pkt_type_offset"). This comes at zero cost for
      programs that do not use the direct write feature. If a program uses
      this feature only sparingly and mostly reads, for example returning
      drop codes, the write can be delegated to a tail-called program. That
      defers the cost of potential uncloning to a late point in time, where
      it would have been paid by bpf_skb_store_bytes() anyway. The advantage
      of direct write is that the writes are inlined, whereas the helper
      cannot make any length assumptions and thus needs to generate a call
      to memcpy() even for small sizes; the cost of the helper call itself,
      with its sanity checks, is also avoided. Plus, when direct read is
      already used, we don't need to cache or recheck the data boundaries
      (the verifier invalidates previous checks after helpers that change
      skb->data), so more complex programs using rewrites can benefit from
      switching to direct read plus write.
      
      For direct packet access to helpers, we save the otherwise needed copy into
      a temp struct sitting on stack memory when the use case allows. Both facilities
      are enabled via may_access_direct_pkt_data() in the verifier. For now, we limit
      this to map helpers and csum_diff, and can successively enable other helpers
      where we find it makes sense. Helpers that definitely cannot be allowed for
      this are those that are part of bpf_helper_changes_skb_data(), since they can
      change the underlying data, and those that write into memory, as this could
      happen for packet-typed args when the skb is still cloned. The bpf_csum_update()
      helper accounts for the fact that we need to fix up checksum_complete when using
      direct write instead of bpf_skb_store_bytes(), meaning that programs can use
      available helpers like bpf_csum_diff() and implement csum_add(), csum_sub(),
      csum_block_add(), csum_block_sub() equivalents in eBPF together with the
      new helper. A usage example will be provided for iproute2's examples/bpf/
      directory.
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
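The "test the bounds, pull the data in once, retest, then access" pattern described above can be illustrated with a user-space analogy. This is not an eBPF program and does not use the kernel helpers: `struct toy_pkt`, `toy_pull()`, and `toy_write_u8()` are hypothetical names, and the split between a linear area and a separate fragment buffer is a deliberate simplification of an skb.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_PKT_MAX 64

/* Toy packet: bytes [0, linear) are directly addressable in data,
 * bytes [linear, total) live in a separate "non-linear" area. */
struct toy_pkt {
	uint8_t data[TOY_PKT_MAX];
	uint8_t frag[TOY_PKT_MAX];
	size_t linear;
	size_t total;
};

/* Make at least the first len bytes directly addressable. */
int toy_pull(struct toy_pkt *p, size_t len)
{
	size_t need;

	if (len > p->total || len > TOY_PKT_MAX)
		return -1;
	if (len > p->linear) {
		need = len - p->linear;
		memcpy(p->data + p->linear, p->frag, need);
		memmove(p->frag, p->frag + need, p->total - len);
		p->linear = len;
	}
	return 0;
}

/* Direct write of one byte, pulling data in only when required. */
int toy_write_u8(struct toy_pkt *p, size_t off, uint8_t val)
{
	if (off >= p->linear) {
		/* The bounds test failed: pull once instead of bailing out. */
		if (toy_pull(p, off + 1) < 0)
			return -1;
		/* The bounds may have changed, so retest before access. */
		if (off >= p->linear)
			return -1;
	}
	p->data[off] = val;
	return 0;
}
```

Writes that already fall inside the linear area never pay for the pull, which mirrors the zero-cost claim for programs that do not need it.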
    •
      bpf, verifier: enforce larger zero range for pkt on overloading stack buffs · b399cf64
      Daniel Borkmann committed
      Current contract for the following two helper argument types is:
      
        * ARG_CONST_STACK_SIZE: passed argument pair must be (ptr, >0).
        * ARG_CONST_STACK_SIZE_OR_ZERO: passed argument pair can be either
          (NULL, 0) or (ptr, >0).
      
      With 6841de8b ("bpf: allow helpers access the packet directly"), we can
      pass also raw packet data to helpers, so depending on the argument type
      being PTR_TO_PACKET, we now either assert memory via check_packet_access()
      or check_stack_boundary(). As a result, the tests in check_packet_access()
      currently allow more than intended with regards to reg->imm.
      
      Back in 969bf05e ("bpf: direct packet access"), check_packet_access()
      was fine to ignore size argument since in check_mem_access() size was
      bpf_size_to_bytes() derived and prior to the call to check_packet_access()
      guaranteed to be larger than zero.
      
      However, for the above two argument types, this currently means we can have
      a size <= 0, which breaks the current guarantees for helpers. Enforce a
      check for size <= 0 and bail out if so.
      
      check_stack_boundary() doesn't have such an issue since it already tests
      for access_size <= 0 and bails out, resp. access_size == 0 in case of NULL
      pointer passed when allowed.
      
      Fixes: 6841de8b ("bpf: allow helpers access the packet directly")
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
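The added guard can be sketched as a standalone range check. This is a simplified user-space illustration, not the verifier's code: `struct toy_pkt_reg` and `toy_check_pkt_range()` are hypothetical names, and the register state is reduced to a constant offset plus the number of bytes already proven accessible.

```c
#include <errno.h>

/* Hypothetical, reduced view of what is known about a packet pointer. */
struct toy_pkt_reg {
	int off;	/* constant offset already added to the pointer */
	int range;	/* bytes proven accessible from the packet start */
};

/* Return 0 if [off, off + size) is inside the proven range. */
int toy_check_pkt_range(const struct toy_pkt_reg *reg, int off, int size)
{
	long long start = (long long)reg->off + off;

	/* Reject zero and negative sizes explicitly: a helper that is
	 * handed a (ptr, size) pair expects size > 0. */
	if (start < 0 || size <= 0 || start + size > reg->range)
		return -EACCES;
	return 0;
}
```

Without the `size <= 0` term, a zero or negative size would pass whenever the start offset is in range, which is the gap the commit message describes.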
    •
      ipvlan: Fix dependency issue · cf714ac1
      Mahesh Bandewar committed
      kbuild-build-bot reported that if NETFILTER is not selected, the
      build fails pointing to netfilter symbols.
      
      Fixes: 4fbae7d8 ("ipvlan: Introduce l3s mode")
      Signed-off-by: Mahesh Bandewar <maheshb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    •
      openvswitch: avoid resetting flow key while installing new flow. · 2279994d
      pravin shelar committed
      Since commit db74a333 ("openvswitch: use percpu
      flow stats"), flow alloc resets the flow-key. So there is no need
      to reset the flow-key again if OVS is using a newly allocated
      flow-key.
      Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    •
      openvswitch: Fix Frame-size larger than 1024 bytes warning. · 190aa3e7
      pravin shelar committed
      There is no need to declare a separate key on the stack;
      we can just use sw_flow->key to store the key directly.
      
      This commit fixes the following warning:
      
      net/openvswitch/datapath.c: In function ‘ovs_flow_cmd_new’:
      net/openvswitch/datapath.c:1080:1: warning: the frame size of 1040 bytes
      is larger than 1024 bytes [-Wframe-larger-than=]
      Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    •
      Merge branch 'for-upstream' of... · 204dfe17
      David S. Miller committed
      Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next
      
      Johan Hedberg says:
      
      ====================
      pull request: bluetooth-next 2016-09-19
      
      Here's the main bluetooth-next pull request for the 4.9 kernel.
      
       - Added new messages for monitor sockets for better mgmt tracing
       - Added local name and appearance support in scan response
       - Added new Qualcomm WCNSS SMD based HCI driver
       - Minor fixes & cleanup to 802.15.4 code
       - New USB ID to btusb driver
       - Added Marvell support to HCI UART driver
       - Add combined LED trigger for controller power
       - Other minor fixes here and there
      
      Please let me know if there are any issues pulling. Thanks.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    •
      6pack: fix buffer length mishandling · ad979896
      Alan Cox committed
      Dmitry Vyukov wrote:
      > different runs). Looking at code, the following looks suspicious -- we
      > limit copy by 512 bytes, but use the original count which can be
      > larger than 512:
      >
      > static void sixpack_receive_buf(struct tty_struct *tty,
      >     const unsigned char *cp, char *fp, int count)
      > {
      >     unsigned char buf[512];
      >     ....
      >     memcpy(buf, cp, count < sizeof(buf) ? count : sizeof(buf));
      >     ....
      >     sixpack_decode(sp, buf, count1);
      
      With the sane tty locking we now have, I believe the following is safe, as
      we consume the bytes and move them into the decoded buffer before
      returning.
      Signed-off-by: Alan Cox <alan@linux.intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
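The bug class quoted above is a copy whose length is clamped while a later consumer still uses the original, unclamped count. The sketch below shows one generic way to keep the two in agreement by computing the clamped length once and reusing it. It is not necessarily the approach this patch takes: `toy_receive()`, `toy_consume()`, and the two bookkeeping globals are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

#define TOY_RX_BUF_SIZE 512

/* Bookkeeping for the hypothetical consumer below. */
size_t toy_consumed_total;
size_t toy_max_chunk;

/* Hypothetical consumer standing in for a decoder. */
void toy_consume(const unsigned char *p, size_t n)
{
	(void)p;
	toy_consumed_total += n;
	if (n > toy_max_chunk)
		toy_max_chunk = n;
}

/* Process count bytes in pieces that always fit the local buffer. */
void toy_receive(const unsigned char *cp, size_t count)
{
	unsigned char buf[TOY_RX_BUF_SIZE];

	while (count > 0) {
		/* Clamp once; the copy and the consumer share this length. */
		size_t n = count < sizeof(buf) ? count : sizeof(buf);

		memcpy(buf, cp, n);
		toy_consume(buf, n);
		cp += n;
		count -= n;
	}
}
```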
  2. 20 Sep, 2016 25 commits