1. 16 May, 2020 38 commits
    • net/mlx5e: IPoIB, Enable loopback packets for IPoIB interfaces · 80639b19
      Committed by Erez Shitrit
      Enable loopback of unicast and multicast traffic for IPoIB enhanced
      mode.
      This will allow interfaces with the same pkey to communicate with
      each other, e.g. cloned interfaces located in different namespaces.
      Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
      Reviewed-by: Alex Vesker <valex@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5e: CT: Fix offload with CT action after CT NAT action · 9102d836
      Committed by Roi Dayan
      A chain of rules may perform the CT action again after CT NAT.
      Before this fix, matching would break because we get into the CT table
      after the NAT changes and not into the CT NAT table.
      Fix this by adding pre ct and pre ct nat tables to skip ct/ct_nat
      tables and go straight to post_ct table if ct/nat was already done.
      Signed-off-by: Roi Dayan <roid@mellanox.com>
      Reviewed-by: Paul Blakey <paulb@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5: Move internal timer read function to clock library · 90bf1c8d
      Committed by Eran Ben Elisha
      Move mlx5_read_internal_timer() into lib/clock.c file as it is being
      used there. As such, make this function a static one.
      
      In addition, rearrange headers include to support function move.
      Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
      Reviewed-by: Aya Levin <ayal@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5: Wait for inactive autogroups · 49c0355d
      Committed by Paul Blakey
      Currently, if one thread tries to add an entry to an autogrouped table
      with no free matching group, while another thread is in the process of
      creating a new matching autogroup, it doesn't wait for the new group
      creation, and creates an unnecessary new autogroup.
      
      Instead of skipping inactive groups, wait on the write lock of those groups.
      Signed-off-by: Paul Blakey <paulb@mellanox.com>
      Reviewed-by: Roi Dayan <roid@mellanox.com>
      Reviewed-by: Mark Bloch <markb@mellanox.com>
      Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5: Drain wq first during PCI device removal · 41798df9
      Committed by Parav Pandit
      mlx5_unload_one() is done with cleanup = true only once.
      
      So instead of draining the health wq inside the if(), do it directly
      during PCI device removal.
      Signed-off-by: Parav Pandit <parav@mellanox.com>
      Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5: Have single error unwinding path · 4162f58b
      Committed by Parav Pandit
      Having multiple error unwinding paths is error prone.
      Let's have just one error unwinding path.
      Signed-off-by: Parav Pandit <parav@mellanox.com>
      Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5: Fix a bug of releasing wrong chunks on > 4K page size systems · e7f860e2
      Committed by Eran Ben Elisha
      On systems with page size larger than 4K, a fwp object has a few 4K chunks.
      Fix a bug in fwp free flow where the chunk address was dropped and
      fwp->addr was used instead (first chunk address). This caused a wrong
      update of fwp->bitmask which later can cause errors in re-alloc fwp
      chunk flow.
      
      In order to fix this, refactor the release flow:
      - Free 4k: Releases a specific 4k chunk inside the fwp, defined by
        starting address.
      - Free fwp: Unconditionally release the whole fwp and its resources.
      Free addr will call free fwp if all chunks were released, in order to do
      code sharing.
      
      In addition, fix npages to count for all released chunks correctly.
      
      Fixes: c6168161 ("net/mlx5: Add support for release all pages event")
      Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
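The chunk bookkeeping described in this commit message can be illustrated with a minimal userspace sketch. It is not the mlx5 driver code: the struct, its fields, and the function name are hypothetical stand-ins, and the convention that a set bit means "chunk is free" is an assumption made for the example. The point it shows is that the bit to update is derived from the address of the chunk being released, not from the base address of the fwp.

```c
#include <stdbool.h>
#include <stdint.h>

#define CHUNK_SHIFT 12u /* one chunk is 4K */

/* Hypothetical stand-in for a firmware page; not the driver's struct. */
struct fw_page_sketch {
	uint64_t addr;        /* address of the first chunk */
	uint32_t bitmask;     /* assumption: bit n set means chunk n is free */
	unsigned int nchunks; /* system page size / 4K, at most 32 here */
};

/*
 * Release the single 4K chunk that starts at chunk_addr.
 * Returns true once every chunk of the page is free, i.e. when the
 * caller could go on to release the whole page.
 */
static bool free_4k_sketch(struct fw_page_sketch *fwp, uint64_t chunk_addr)
{
	uint32_t all_free;
	uint64_t n;

	if (chunk_addr < fwp->addr)
		return false;
	/* Index comes from the chunk's own address, not from fwp->addr. */
	n = (chunk_addr - fwp->addr) >> CHUNK_SHIFT;
	if (n >= fwp->nchunks || n >= 32)
		return false;

	all_free = fwp->nchunks >= 32 ? UINT32_MAX : (1u << fwp->nchunks) - 1u;
	fwp->bitmask |= 1u << n;
	return fwp->bitmask == all_free;
}
```

With a 64K system page there are 16 chunks per page, so releasing the chunk at base + 3 * 4K sets only bit 3. Using the base address for every release would set bit 0 each time.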
    • net/mlx5: Dedicate fw page to the requesting function · 2726cd4a
      Committed by Eran Ben Elisha
      The cited patch assumes that all chunks in a fw page belong to the same
      function, thus the driver must dedicate fw page to the requesting
      function, which is actually what was intended in the original fw pages
      allocator design, hence the fwp->func_id!
      
      Up until the cited patch everything worked ok, but now "release all pages"
      is broken on systems with page_size > 4k.
      
      Fix this by dedicating fw page to the requesting function id via adding a
      func_id parameter to alloc_4k() function.
      
      Fixes: c6168161 ("net/mlx5: Add support for release all pages event")
      Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · da07f52d
      Committed by David S. Miller
      Move the bpf verifier trace check into the new switch statement in
      HEAD.
      
      Resolve the overlapping changes in hinic, where bug fixes overlap
      the addition of VF support.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · f85c1598
      Committed by Linus Torvalds
      Pull networking fixes from David Miller:
      
       1) Fix sk_psock reference count leak on receive, from Xiyu Yang.
      
       2) CONFIG_HNS should be invisible, from Geert Uytterhoeven.
      
       3) Don't allow locking route MTUs in ipv6, RFCs actually forbid this,
          from Maciej Żenczykowski.
      
       4) ipv4 route redirect backoff wasn't actually enforced, from Paolo
          Abeni.
      
       5) Fix netprio cgroup v2 leak, from Zefan Li.
      
       6) Fix infinite loop on rmmod in conntrack, from Florian Westphal.
      
       7) Fix tcp SO_RCVLOWAT hangs, from Eric Dumazet.
      
       8) Various bpf probe handling fixes, from Daniel Borkmann.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (68 commits)
        selftests: mptcp: pm: rm the right tmp file
        dpaa2-eth: properly handle buffer size restrictions
        bpf: Restrict bpf_trace_printk()'s %s usage and add %pks, %pus specifier
        bpf: Add bpf_probe_read_{user, kernel}_str() to do_refine_retval_range
        bpf: Restrict bpf_probe_read{, str}() only to archs where they work
        MAINTAINERS: Mark networking drivers as Maintained.
        ipmr: Add lockdep expression to ipmr_for_each_table macro
        ipmr: Fix RCU list debugging warning
        drivers: net: hamradio: Fix suspicious RCU usage warning in bpqether.c
        net: phy: broadcom: fix BCM54XX_SHD_SCR3_TRDDAPD value for BCM54810
        tcp: fix error recovery in tcp_zerocopy_receive()
        MAINTAINERS: Add Jakub to networking drivers.
        MAINTAINERS: another add of Karsten Graul for S390 networking
        drivers: ipa: fix typos for ipa_smp2p structure doc
        pppoe: only process PADT targeted at local interfaces
        selftests/bpf: Enforce returning 0 for fentry/fexit programs
        bpf: Enforce returning 0 for fentry/fexit progs
        net: stmmac: fix num_por initialization
        security: Fix the default value of secid_to_secctx hook
        libbpf: Fix register naming in PT_REGS s390 macros
        ...
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma · d5dfe4f1
      Committed by Linus Torvalds
      Pull rdma fixes from Jason Gunthorpe:
       "A few minor bug fixes for user visible defects, and one regression:
      
         - Various bugs from static checkers and syzkaller
      
         - Add missing error checking in mlx4
      
         - Prevent RTNL lock recursion in i40iw
      
         - Fix segfault in cxgb4 in peer abort cases
      
         - Fix a regression added in 5.7 where the IB_EVENT_DEVICE_FATAL could
           be lost, and wasn't delivered to all the FDs"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
        RDMA/uverbs: Move IB_EVENT_DEVICE_FATAL to destroy_uobj
        RDMA/uverbs: Do not discard the IB_EVENT_DEVICE_FATAL event
        RDMA/iw_cxgb4: Fix incorrect function parameters
        RDMA/core: Fix double put of resource
        IB/core: Fix potential NULL pointer dereference in pkey cache
        IB/hfi1: Fix another case where pq is left on waitlist
        IB/i40iw: Remove bogus call to netdev_master_upper_dev_get()
        IB/mlx4: Test return value of calls to ib_get_cached_pkey
        RDMA/rxe: Always return ERR_PTR from rxe_create_mmap_info()
        i40iw: Fix error handling in i40iw_manage_arp_cache()
    • Merge tag 'linux-kselftest-5.7-rc6' of... · ce247296
      Committed by Linus Torvalds
      Merge tag 'linux-kselftest-5.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
      
      Pull kselftest fixes from Shuah Khan:
      
       - lkdtm runner fixes to prevent dmesg clearing and shellcheck errors
      
       - ftrace test handling when test module doesn't exist
      
       - nsfs test fix to replace zero-length array with flexible-array
      
       - dmabuf-heaps test fix to return clear error value
      
      * tag 'linux-kselftest-5.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
        selftests/lkdtm: Use grep -E instead of egrep
        selftests/lkdtm: Don't clear dmesg when running tests
        selftests/ftrace: mark irqsoff_tracer.tc test as unresolved if the test module does not exist
        tools/testing: Replace zero-length array with flexible-array
        kselftests: dmabuf-heaps: Fix confused return value on expected error testing
    • Merge tag 'riscv-for-linus-5.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux · 67e45621
      Committed by Linus Torvalds
      Pull RISC-V fixes from Palmer Dabbelt:
       "A handful of build fixes, all found by Huawei's autobuilder.
      
        None of these patches should have any functional impact on kernels
        that build, and they're mostly related to various features
        intermingling with !MMU.
      
        While some of these might be better hoisted to generic code, it seems
        better to have the simple fixes in the meanwhile.
      
        As far as I know these are the only outstanding patches for 5.7"
      
      * tag 'riscv-for-linus-5.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
        riscv: mmiowb: Fix implicit declaration of function 'smp_processor_id'
        riscv: pgtable: Fix __kernel_map_pages build error if NOMMU
        riscv: Make SYS_SUPPORTS_HUGETLBFS depends on MMU
        riscv: Disable ARCH_HAS_DEBUG_VIRTUAL if NOMMU
        riscv: Add pgprot_writecombine/device and PAGE_SHARED defination if NOMMU
        riscv: stacktrace: Fix undefined reference to `walk_stackframe'
        riscv: Fix unmet direct dependencies built based on SOC_VIRT
        riscv: perf: RISCV_BASE_PMU should be independent
        riscv: perf_event: Make some funciton static
    • Merge branch 'mptcp-fix-MP_JOIN-failure-handling' · 93d43e58
      Committed by David S. Miller
      Paolo Abeni says:
      
      ====================
      mptcp: fix MP_JOIN failure handling
      
      Currently if we hit an MP_JOIN failure on the third ack, the child socket is
      closed with reset, but the request socket is not deleted, causing weird
      behaviors.
      
      The main problem is that MPTCP's MP_JOIN code needs to plug its own
      'valid 3rd ack' checks and the current TCP callbacks do not allow that.
      
      This series tries to address the above shortcoming introducing a new MPTCP
      specific bit in a 'struct tcp_request_sock' hole, and leveraging that to allow
      tcp_check_req releasing the request socket when needed.
      
      The above allows cleaning-up a bit current MPTCP hooking in tcp_check_req().
      
      An alternative solution, possibly cleaner but more invasive, would be
      changing the 'bool *own_req' syn_recv_sock() argument into 'int *req_status'
      and let MPTCP set it to 'REQ_DROP'.
      
      v1 -> v2:
       - be more conservative about drop_req initialization
      
      RFC -> v1:
       - move the drop_req bit inside tcp_request_sock (Eric)
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mptcp: cope better with MP_JOIN failure · 729cd643
      Committed by Paolo Abeni
      Currently, on MP_JOIN failure we reset the child
      socket, but leave the request socket untouched.
      
      tcp_check_req will deal with it according to the
      'tcp_abort_on_overflow' sysctl value - by default the
      req socket will stay alive.
      
      The above leads to inconsistent behavior on MP JOIN
      failure, and bad listener overflow accounting.
      
      This patch addresses the issue leveraging the infrastructure
      just introduced to ask the TCP stack to drop the req on
      failure.
      
      The child socket is not freed anymore by subflow_syn_recv_sock(),
      instead it's moved to a dead state and will be disposed by the
      next sock_put done by the TCP stack, so that listener overflow
      accounting is not affected by MP JOIN failure.
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Reviewed-by: Christoph Paasch <cpaasch@apple.com>
      Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • inet_connection_sock: factor out destroy helper. · 2f8a397d
      Committed by Paolo Abeni
      Move the steps to prepare an inet_connection_sock for
      forced disposal inside a separate helper. No functional
      changes intended, this will just simplify the next patch.
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Reviewed-by: Christoph Paasch <cpaasch@apple.com>
      Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mptcp: add new sock flag to deal with join subflows · 90bf4513
      Committed by Paolo Abeni
      MP_JOIN subflows must not land into the accept queue.
      Currently tcp_check_req() calls an mptcp specific helper
      to detect such scenario.
      
      Such helper leverages the subflow context to check for
      MP_JOIN subflows. We need to deal also with MP JOIN
      failures, even when the subflow context is not available
      due to allocation failure.
      
      A possible solution would be changing the syn_recv_sock()
      signature to allow returning a more descriptive action/
      error code and deal with that in tcp_check_req().
      
      Since the above need is MPTCP specific, this patch instead
      uses a TCP request socket hole to add a MPTCP specific flag.
      Such flag is used by the MPTCP syn_recv_sock() to tell
      tcp_check_req() how to deal with the request socket.
      
      This change is a no-op for !MPTCP build, and makes the
      MPTCP code simpler. It allows also the next patch to deal
      correctly with MP JOIN failure.
      
      v1 -> v2:
       - be more conservative on drop_req initialization (Mat)
      
      RFC -> v1:
       - move the drop_req bit inside tcp_request_sock (Eric)
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
      Reviewed-by: Christoph Paasch <cpaasch@apple.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · 01d8a748
      Committed by Linus Torvalds
      Pull arm64 fix from Catalin Marinas:
       "Fix flush_icache_range() second argument in machine_kexec() to be an
        address rather than size"
      
      * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
        arm64: fix the flush_icache_range arguments in machine_kexec
    • net: phy: tja11xx: execute cable test on link up · ca1c933b
      Committed by Oleksij Rempel
      A typical 100Base-T1 link should be always connected. If the link is in
      a short or open state, it is a failure. In most cases, we won't be able
      to automatically handle this issue, but we need to log it or notify user
      (if possible).
      
      With this patch, the cable will be tested on "ip l s dev .. up" attempt
      and send ethnl notification to the user space.
      
      This patch was tested with TJA1102 PHY and "ethtool --monitor" command.
      Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · 8e138104
      Committed by David S. Miller
      Alexei Starovoitov says:
      
      ====================
      pull-request: bpf 2020-05-15
      
      The following pull-request contains BPF updates for your *net* tree.
      
      We've added 9 non-merge commits during the last 2 day(s) which contain
      a total of 14 files changed, 137 insertions(+), 43 deletions(-).
      
      The main changes are:
      
      1) Fix secid_to_secctx LSM hook default value, from Anders.
      
      2) Fix bug in mmap of bpf array, from Andrii.
      
      3) Restrict bpf_probe_read to archs where they work, from Daniel.
      
      4) Enforce returning 0 for fentry/fexit progs, from Yonghong.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: broadcom: add support for BCM54811 PHY · b0ed0bbf
      Committed by Kevin Lo
      The BCM54811 PHY shares many similarities with the already supported BCM54810
      PHY but additionally requires some semi-unique configuration.
      Signed-off-by: Kevin Lo <kevlo@kevlo.org>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'cxgb4-improve-and-tune-TC-MQPRIO-offload' · d42d118c
      Committed by David S. Miller
      Rahul Lakkireddy says:
      
      ====================
      cxgb4: improve and tune TC-MQPRIO offload
      
      Patch 1 improves the Tx path's credit request and recovery mechanism
      when running under heavy load.
      
      Patch 2 adds ability to tune the burst buffer sizes of all traffic
      classes to improve performance for <= 1500 MTU, under heavy load.
      
      Patch 3 adds support to track EOTIDs and dump software queue
      contexts used by TC-MQPRIO offload.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • cxgb4: add EOTID tracking and software context dump · 5148e595
      Committed by Rahul Lakkireddy
      Rework and add support for dumping EOTID software context used by
      TC-MQPRIO. Also track number of EOTIDs in use.
      Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • cxgb4: tune burst buffer size for TC-MQPRIO offload · 4bccfc03
      Committed by Rahul Lakkireddy
      For each traffic class, firmware handles up to 4 * MTU amount of data
      per burst cycle. Under heavy load, this small buffer size is a
      bottleneck when buffering large TSO packets in <= 1500 MTU case.
      Increase the burst buffer size to 8 * MTU when supported.
      
      Also, keep the driver's traffic class configuration API similar to
      the firmware API counterpart.
      Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • cxgb4: improve credits recovery in TC-MQPRIO Tx path · 4f1d9726
      Committed by Rahul Lakkireddy
      Request a credit update for every half of the credits consumed,
      including the current request. Also, avoid re-trying to post packets
      when there are no credits left. The credit update reply via interrupt
      will eventually restore the credits and will invoke the Tx path again.
      Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
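The credit policy in this commit message can be pictured with a small userspace sketch. It is only an illustration under assumptions: the struct, its fields, and the function name are hypothetical and do not come from the cxgb4 driver, and the real credit accounting may differ in detail. It shows the two rules stated above: ask for a credit update once half of the credits have been consumed (counting the current request), and do not retry when no credits are left.

```c
#include <stdbool.h>

/* Hypothetical Tx queue credit state; not the driver's data structure. */
struct txq_sketch {
	unsigned int total;     /* credits the queue starts with */
	unsigned int avail;     /* credits currently available */
	unsigned int since_req; /* credits consumed since the last update request */
};

/*
 * Try to post a request that needs `need` credits.
 * Returns 0 on success and -1 when there are not enough credits, in
 * which case the caller stops and waits for the credit update reply
 * instead of retrying. *want_update is set when the posted request
 * should also ask for a credit update.
 */
static int post_sketch(struct txq_sketch *q, unsigned int need,
		       bool *want_update)
{
	*want_update = false;
	if (need > q->avail)
		return -1;

	q->avail -= need;
	q->since_req += need; /* the current request is counted too */
	if (q->since_req >= q->total / 2) {
		*want_update = true;
		q->since_req = 0;
	}
	return 0;
}
```

Starting from 64 credits, two posts of 16 credits each would trigger an update request on the second post, since 32 of 64 credits have then been consumed.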
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next · 3430223d
      Committed by David S. Miller
      Alexei Starovoitov says:
      
      ====================
      pull-request: bpf-next 2020-05-15
      
      The following pull-request contains BPF updates for your *net-next* tree.
      
      We've added 37 non-merge commits during the last 1 day(s) which contain
      a total of 67 files changed, 741 insertions(+), 252 deletions(-).
      
      The main changes are:
      
      1) bpf_xdp_adjust_tail() now allows to grow the tail as well, from Jesper.
      
      2) bpftool can probe CONFIG_HZ, from Daniel.
      
      3) CAP_BPF is introduced to isolate user processes that use BPF infra and
         to secure BPF networking services by dropping CAP_SYS_ADMIN requirement
         in certain cases, from Alexei.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: mt7530: fix VLAN setup · 0141792f
      Committed by DENG Qingfang
      Allow DSA to add VLAN entries even if VLAN filtering is disabled, so
      enabling it will not block the traffic of existing ports in the bridge.
      Signed-off-by: DENG Qingfang <dqfext@gmail.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'Implement-classifier-action-terse-dump-mode' · cd2809cc
      Committed by David S. Miller
      Vlad Buslov says:
      
      ====================
      Implement classifier-action terse dump mode
      
      The output rate of the current upstream kernel TC filter dump
      implementation is relatively low (~100k rules/sec depending on
      configuration). This constraint impacts the performance of software
      switch implementations that rely on TC for their datapath and
      periodically call TC filter dump to update rule stats. Moreover, TC
      filter dump outputs a lot of static data that doesn't change during
      the filter lifecycle (filter key, specific action details, etc.),
      which constitutes a significant portion of the payload of the
      resulting netlink packets and increases the number of syscalls
      necessary to dump all filters on a particular Qdisc. In order to
      significantly improve the filter dump rate, this patch set implements
      a new mode of TC filter dump operation named "terse dump" mode. In
      this mode only parameters necessary to identify the filter (handle,
      action cookie, etc.) and data that can change during the filter
      lifecycle (filter flags, action stats, etc.) are preserved in the dump
      output while everything else is omitted.
      
      Userspace API is implemented using new TCA_DUMP_FLAGS tlv with only
      available flag value TCA_DUMP_FLAGS_TERSE. Internally, new API requires
      individual classifier support (new tcf_proto_ops->terse_dump()
      callback). Support for action terse dump is implemented in act API and
      doesn't require changing individual action implementations.
      
      The following table provides performance comparison between regular
      filter dump and new terse dump mode for two classifier-action profiles:
      one minimal config with L2 flower classifier and single gact action and
      another heavier config with L2+5tuple flower classifier with
      tunnel_key+mirred actions.
      
       Classifier-action type      |        dump |  terse dump | X improvement
                                   | (rules/sec) | (rules/sec) |
      -----------------------------+-------------+-------------+---------------
       L2 with gact                |       141.8 |       293.2 |          2.07
       L2+5tuple tunnel_key+mirred |        76.4 |       198.8 |          2.60
      
      Benchmark details: to measure the rate tc filter dump and terse dump
      commands are invoked on ingress Qdisc that have one million filters
      configured using following commands.
      
      > time sudo tc -s filter show dev ens1f0 ingress >/dev/null
      
      > time sudo tc -s filter show terse dev ens1f0 ingress >/dev/null
      
      Value in results table is calculated by dividing 1000000 total rules by
      "real" time reported by time command.
      
      Setup details: 2x Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz, 32GB memory
      ====================
      Reviewed-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
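The arithmetic behind the results table in this cover letter can be written out as a short sketch. The function names are hypothetical and exist only for this example. The first helper is the calculation the benchmark details describe (total rules divided by the "real" time from the time command), and the second is the ratio shown in the "X improvement" column.

```c
/* Dump rate: total rules divided by the "real" time reported by time. */
static double dump_rate_sketch(double total_rules, double real_seconds)
{
	return total_rules / real_seconds;
}

/* "X improvement" column: terse dump rate divided by regular dump rate. */
static double improvement_sketch(double dump_rate, double terse_rate)
{
	return terse_rate / dump_rate;
}
```

Feeding the two table rows into improvement_sketch() gives 293.2 / 141.8 and 198.8 / 76.4, which round to the 2.07 and 2.60 listed in the table.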
    • selftests: mptcp: pm: rm the right tmp file · 9a2dbb59
      Committed by Matthieu Baerts
      "$err" is a variable pointing to a temp file. "$out" is not: only used
      as a local variable in "check()" and representing the output of a
      command line.
      
      Fixes: eedbc685 (selftests: add PM netlink functional tests)
      Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dpaa2-eth: properly handle buffer size restrictions · efa6a7d0
      Committed by Ioana Ciornei
      Depending on the WRIOP version, the buffer size on the RX path must be a
      multiple of 64 or 256. Handle this restriction properly by aligning down
      the buffer size to the necessary value. Also, use the new buffer size
      dynamically computed instead of the compile time one.
      
      Fixes: 27c87486 ("dpaa2-eth: Use a single page per Rx buffer")
      Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
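Aligning a size down to a multiple of 64 or 256, as this commit message describes, comes down to one mask operation when the alignment is a power of two. The helper below is a minimal userspace sketch with a hypothetical name, not the driver's code.

```c
/*
 * Round size down to the nearest multiple of align.
 * align must be a power of two (64 and 256 both are).
 */
static unsigned int align_down_sketch(unsigned int size, unsigned int align)
{
	return size & ~(align - 1u);
}
```

For an arbitrary example size of 3000 bytes, this gives 2944 with a 64-byte restriction and 2816 with a 256-byte restriction.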
    • selftests: implement flower classifier terse dump tests · e7534fd4
      Committed by Vlad Buslov
      Implement two basic tests to verify terse dump functionality of flower
      classifier:
      
      - Test that verifies that terse dump works.
      
      - Test that verifies that terse dump doesn't print filter key.
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: cls_flower: implement terse dump support · 0348451d
      Committed by Vlad Buslov
      Implement tcf_proto_ops->terse_dump() callback for flower classifier. Only
      dump handle, flags and action data in terse mode.
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Reviewed-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: implement terse dump support in act · ca44b738
      Committed by Vlad Buslov
      Extend tcf_action_dump() with boolean argument 'terse' that is used to
      request terse-mode action dump. In terse mode only essential data needed to
      identify particular action (action kind, cookie, etc.) and its stats is put
      to resulting skb and everything else is omitted. Implement
      tcf_exts_terse_dump() helper in cls API that is intended to be used to
      request terse dump of all exts (actions) attached to the filter.
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Reviewed-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: introduce terse dump flag · f8ab1807
      Committed by Vlad Buslov
      Add new TCA_DUMP_FLAGS attribute and use it in cls API to request terse
      filter output from classifiers with TCA_DUMP_FLAGS_TERSE flag. This option
      is intended to be used to improve performance of TC filter dump when
      userland only needs to obtain stats and not the whole classifier/action
      data. Extend struct tcf_proto_ops with new terse_dump() callback that must
      be defined by supporting classifier implementations.
      
      Support of the options in specific classifiers and actions is
      implemented in following patches in the series.
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Reviewed-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: core: recursively find netdev by device node · 2e186a2c
      Committed by Tobias Waldekranz
      The assumption that a device node is associated either with the
      netdev's device, or the parent of that device, does not hold for all
      drivers. E.g. Freescale's DPAA has two layers of platform devices
      above the netdev. Instead, recursively walk up the tree from the
      netdev, allowing any parent to match against the sought after node.
      Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
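The lookup change in this commit message, walking up from the netdev so that any ancestor can match the sought node, can be sketched in userspace as follows. The struct and function names are hypothetical and far simpler than the kernel's device model, and the walk is written as a loop, which visits the same ancestors a recursive version would.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, much simplified device: a parent link and a node pointer. */
struct dev_sketch {
	struct dev_sketch *parent;
	const void *of_node; /* device node associated with this device, if any */
};

/*
 * Return true if dev or any of its ancestors is associated with np.
 * Earlier behaviour, per the commit message, only considered the
 * netdev's device and its direct parent.
 */
static bool dev_matches_node_sketch(const struct dev_sketch *dev,
				    const void *np)
{
	if (!np)
		return false;
	for (; dev; dev = dev->parent)
		if (dev->of_node == np)
			return true;
	return false;
}
```

With two platform-device layers above the netdev, as in the DPAA case mentioned above, the node attached to the topmost layer is still found because the walk does not stop at the direct parent.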
    • Merge tag 'hwmon-for-v5.7-rc6' of... · 051e6b7e
      Committed by Linus Torvalds
      Merge tag 'hwmon-for-v5.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
      
      Pull hwmon fixes from Guenter Roeck:
      
       - Fix ADC access synchronization problem with da9052 driver
      
       - Fix temperature limit and status reporting in nct7904 driver
      
       - Fix drivetemp temperature reporting if SCT is supported but SCT data
         tables are not.
      
      * tag 'hwmon-for-v5.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
        hwmon: (da9052) Synchronize access with mfd
        hwmon: (nct7904) Fix incorrect range of temperature limit registers
        hwmon: (nct7904) Read all SMI status registers in probe function
        hwmon: (drivetemp) Fix SCT support if SCT data tables are not supported
    • Merge tag 'sound-5.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · 1742bcd0
      Committed by Linus Torvalds
      Pull sound fixes from Takashi Iwai:
       "Things look good and calming down; the only change to ALSA core is the
        fix for racy rawmidi buffer accesses spotted by syzkaller, and the
        rest are all small device-specific quirks for HD-audio and USB-audio
        devices"
      
      * tag 'sound-5.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
        ALSA: hda/realtek - Limit int mic boost for Thinkpad T530
        ALSA: hda/realtek - Add COEF workaround for ASUS ZenBook UX431DA
        ALSA: hda/realtek: Enable headset mic of ASUS UX581LV with ALC295
        ALSA: hda/realtek - Enable headset mic of ASUS UX550GE with ALC295
        ALSA: hda/realtek - Enable headset mic of ASUS GL503VM with ALC295
        ALSA: hda/realtek: Add quirk for Samsung Notebook
        ALSA: rawmidi: Fix racy buffer resize under concurrent accesses
        ALSA: usb-audio: add mapping for ASRock TRX40 Creator
        ALSA: hda/realtek - Fix S3 pop noise on Dell Wyse
        Revert "ALSA: hda/realtek: Fix pop noise on ALC225"
        ALSA: firewire-lib: fix 'function sizeof not defined' error of tracepoints format
        ALSA: usb-audio: Add control message quirk delay for Kingston HyperX headset
    • L
      Merge tag 'drm-fixes-2020-05-15' of git://anongit.freedesktop.org/drm/drm · e7cea790
      Linus Torvalds committed
      Pull drm fixes from Dave Airlie:
       "As mentioned last week an i915 PR came in late, but I left it, so the
        i915 bits of this cover 2 weeks, which is why it's likely a bit larger
        than usual.
      
        Otherwise it's mostly amdgpu fixes, one tegra fix, one meson fix.
      
        i915:
         - Handle idling during i915_gem_evict_something busy loops (Chris)
         - Mark current submissions with a weak-dependency (Chris)
         - Propagate error from completed fences (Chris)
         - Fixes on execlist to avoid GPU hang situation (Chris)
         - Fixes couple deadlocks (Chris)
         - Timeslice preemption fixes (Chris)
         - Fix Display Port interrupt handling on Tiger Lake (Imre)
         - Reduce debug noise around Frame Buffer Compression (Peter)
         - Fix logic around IPC W/a for Coffee Lake and Kaby Lake (Sultan)
         - Avoid dereferencing a dead context (Chris)
      
        tegra:
         - tegra120/4 smmu fixes
      
        amdgpu:
         - Clockgating fixes
         - Fix fbdev with scatter/gather display
         - S4 fix for navi
         - Soft recovery for gfx10
         - Freesync fixes
         - Atomic check cursor fix
         - Add a gfxoff quirk
         - MST fix
      
        amdkfd:
         - Fix GEM reference counting
      
        meson:
         - error code propagation fix"
      
      * tag 'drm-fixes-2020-05-15' of git://anongit.freedesktop.org/drm/drm: (29 commits)
        drm/i915: Handle idling during i915_gem_evict_something busy loops
        drm/meson: pm resume add return errno branch
        drm/amd/amdgpu: Update update_config() logic
        drm/amd/amdgpu: add raven1 part to the gfxoff quirk list
        drm/i915: Mark concurrent submissions with a weak-dependency
        drm/i915: Propagate error from completed fences
        drm/i915/gvt: Fix kernel oops for 3-level ppgtt guest
        drm/i915/gvt: Init DPLL/DDI vreg for virtual display instead of inheritance.
        drm/amd/display: add basic atomic check for cursor plane
        drm/amd/display: Fix vblank and pageflip event handling for FreeSync
        drm/amdgpu: implement soft_recovery for gfx10
        drm/amdgpu: enable hibernate support on Navi1X
        drm/amdgpu: Use GEM obj reference for KFD BOs
        drm/amdgpu: force fbdev into vram
        drm/amd/powerplay: perform PG ungate prior to CG ungate
        drm/amdgpu: drop unnecessary cancel_delayed_work_sync on PG ungate
        drm/amdgpu: disable MGCG/MGLS also on gfx CG ungate
        drm/i915/execlists: Track inflight CCID
        drm/i915/execlists: Avoid reusing the same logical CCID
        drm/i915/gem: Remove object_is_locked assertion from unpin_from_display_plane
        ...
      e7cea790
  2. 15 May, 2020 2 commits
    • D
      Merge branch 'bpf-cap' · ed24a7a8
      Daniel Borkmann committed
      Alexei Starovoitov says:
      
      ====================
      v6->v7:
      - permit SK_REUSEPORT program type under CAP_BPF as suggested by Marek Majkowski.
        It's equivalent to SOCKET_FILTER which is unpriv.
      
      v5->v6:
      - split allow_ptr_leaks into four flags.
      - retain bpf_jit_limit under cap_sys_admin.
      - fixed few other issues spotted by Daniel.
      
      v4->v5:
      
      Split BPF operations that are allowed under CAP_SYS_ADMIN into combination of
      CAP_BPF, CAP_PERFMON, CAP_NET_ADMIN and keep some of them under CAP_SYS_ADMIN.
      
      The user process has to have
      - CAP_BPF to create maps, do other sys_bpf() commands and load SK_REUSEPORT progs.
        Note: dev_map, sock_hash, sock_map map types still require CAP_NET_ADMIN.
        That could be relaxed in the future.
      - CAP_BPF and CAP_PERFMON to load tracing programs.
      - CAP_BPF and CAP_NET_ADMIN to load networking programs.
      (or CAP_SYS_ADMIN for backward compatibility).
      
      CAP_BPF solves three main goals:
      1. provides isolation to user space processes that drop CAP_SYS_ADMIN and switch to CAP_BPF.
         More on this below. This is the major difference vs v4 set back from Sep 2019.
      2. makes networking BPF progs more secure, since CAP_BPF + CAP_NET_ADMIN
         prevents pointer leaks and arbitrary kernel memory access.
      3. enables fuzzers to exercise all of the verifier logic. Eventually finding bugs
         and making BPF infra more secure. Currently fuzzers run in unpriv.
         They will be able to run with CAP_BPF.
      
      The patchset is long overdue follow-up from the last plumbers conference.
      Comparing to what was discussed at LPC the CAP* checks at attach time are gone.
      For tracing progs the CAP_SYS_ADMIN check was done at load time only. There was
      no check at attach time. For networking and cgroup progs CAP_SYS_ADMIN was
      required at load time and CAP_NET_ADMIN at attach time, but there are several
      ways to bypass CAP_NET_ADMIN:
      - if networking prog is using tail_call writing FD into prog_array will
        effectively attach it, but bpf_map_update_elem is an unprivileged operation.
      - freplace prog with CAP_SYS_ADMIN can replace networking prog
      
      Consolidating all CAP checks at load time makes security model similar to
      open() syscall. Once the user got an FD it can do everything with it.
      read/write/poll don't check permissions. The same way when bpf_prog_load
      command returns an FD the user can do everything (including attaching,
      detaching, and bpf_test_run).
      
      The important design decision is to allow ID->FD transition for
      CAP_SYS_ADMIN only. What it means that user processes can run
      with CAP_BPF and CAP_NET_ADMIN and they will not be able to affect each
      other unless they pass FDs via scm_rights or via pinning in bpffs.
      ID->FD is a mechanism for human override and introspection.
      An admin can do 'sudo bpftool prog ...'. It's possible to enforce via LSM that
      only bpftool binary does bpf syscall with CAP_SYS_ADMIN and the rest of user
      space processes do bpf syscall with CAP_BPF isolating bpf objects (progs, maps,
      links) that are owned by such processes from each other.
      
      Another significant change from LPC is that the verifier checks are split into
      four flags. The allow_ptr_leaks flag allows pointer manipulations. The
      bpf_capable flag enables all modern verifier features like bpf-to-bpf calls,
      BTF, bounded loops, dead code elimination, etc. All the goodness. The
      bypass_spec_v1 flag enables indirect stack access from bpf programs and
      disables speculative analysis and bpf array mitigations. The bypass_spec_v4
      flag disables store sanitation. That allows networking progs with CAP_BPF +
      CAP_NET_ADMIN enjoy modern verifier features while being more secure.
      
      Some networking progs may need CAP_BPF + CAP_NET_ADMIN + CAP_PERFMON,
      since subtracting pointers (like skb->data_end - skb->data) is a pointer leak,
      but the verifier may get smarter in the future.
      ====================
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      ed24a7a8
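The load-time capability rules listed in the 'bpf-cap' cover letter above ("The user process has to have ...") can be summarized in a small model. This is only an illustrative sketch in Python, not kernel code: the names `ProgKind`, `EXTRA_CAP` and `may_load` are hypothetical, the three program kinds are just the ones the cover letter names, and details such as the map types that still require CAP_NET_ADMIN are left out.

```python
from enum import Enum, auto


class ProgKind(Enum):
    """Hypothetical grouping of the program kinds named in the cover letter."""
    SK_REUSEPORT = auto()
    TRACING = auto()
    NETWORKING = auto()


# Capabilities needed in addition to CAP_BPF, per the cover letter's list.
EXTRA_CAP = {
    ProgKind.SK_REUSEPORT: set(),
    ProgKind.TRACING: {"CAP_PERFMON"},
    ProgKind.NETWORKING: {"CAP_NET_ADMIN"},
}


def may_load(kind, caps):
    """Return True if a process holding `caps` may load a program of `kind`."""
    caps = set(caps)
    # CAP_SYS_ADMIN is still accepted for backward compatibility.
    if "CAP_SYS_ADMIN" in caps:
        return True
    required = {"CAP_BPF"} | EXTRA_CAP[kind]
    return required <= caps
```

For example, under this model `may_load(ProgKind.TRACING, {"CAP_BPF"})` is false because CAP_PERFMON is missing, while `may_load(ProgKind.NETWORKING, {"CAP_BPF", "CAP_NET_ADMIN"})` is true.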
    • A
      selftests/bpf: Use CAP_BPF and CAP_PERFMON in tests · 81626001
      Alexei Starovoitov committed
      Make all test_verifier test exercise CAP_BPF and CAP_PERFMON
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20200513230355.7858-4-alexei.starovoitov@gmail.com
      81626001