1. 19 Feb, 2017 8 commits
  2. 18 Feb, 2017 32 commits
    • tcp: use page_ref_inc() in tcp_sendmsg() · 4e33e346
      Eric Dumazet committed
      sk_page_frag_refill() allocates either a compound page or an order-0
      page. We can use page_ref_inc() which is slightly faster than get_page().
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: accommodate sequence number to a peer's shrunk receive window caused by... · a4ecb15a
      Cui, Cheng committed
      tcp: accommodate sequence number to a peer's shrunk receive window caused by precision loss in window scaling
      
      Prevent sending out a left-shifted sequence number from a Linux sender in
      response to a peer's shrunk receive-window caused by losing least significant
      bits in window-scaling.
      
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
      Cc: James Morris <jmorris@namei.org>
      Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
      Cc: Patrick McHardy <kaber@trash.net>
      Signed-off-by: Cheng Cui <Cheng.Cui@netapp.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
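The precision loss described in the window-scaling commit above can be illustrated with a small model. This is only a sketch in Python, not the kernel code: the function names are hypothetical, and the numbers (a scale factor of 7, windows of 1024 and 1014 bytes) are chosen purely for illustration. It shows how dropping the low bits of the window can make the right edge seen by the sender move left even though the receiver never shrank its true window.

```python
def advertised_window(true_window, wscale):
    """Window as the sender reconstructs it: the low `wscale` bits are lost."""
    field = true_window >> wscale      # value that fits in the TCP header field
    return field << wscale             # what the sender scales it back to


def right_edge(ack, true_window, wscale):
    """Right edge of the receive window as seen by the sender."""
    return ack + advertised_window(true_window, wscale)


# The receiver's true right edge stays at 1024 in both cases (hypothetical numbers).
edge_before = right_edge(ack=0, true_window=1024, wscale=7)
edge_after = right_edge(ack=10, true_window=1014, wscale=7)
print(edge_before, edge_after)
```

With these inputs the second edge is lower than the first, which is the apparent shrink the commit message refers to.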
    • Merge branch 'sfc-misc-fixes' · e606519e
      David S. Miller committed
      Edward Cree says:
      
      ====================
      sfc: misc. fixes
      
      Three largely unrelated fixes to increase robustness in rare edge cases.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sfc: do not device_attach if a reset is pending · 9c568fd8
      Peter Dunning committed
      efx_start_all can return without initialising queues as a reset is pending.
       This means that when netif_device_attach is called, the kernel can start
       sending traffic without having an initialised TX queue to send to.
      This patch avoids this by not calling netif_device_attach if there is a
       pending reset.
      
      Fixes: e283546c ("sfc:On MCDI timeout, issue an FLR (and mark MCDI to fail-fast)")
      Signed-off-by: Edward Cree <ecree@solarflare.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sfc: forget filters from sw table if hw replies ENOENT on removing them · 105eac6c
      Bert Kenward committed
      If the hw doesn't think they exist, we should defer to its authority.
      Signed-off-by: Edward Cree <ecree@solarflare.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sfc: fix filter_id misinterpretation in edge case · 0ccb998b
      Jon Cooper committed
      On EF10, hardware filter IDs are 13 bits, but in some places we store
       32-bit "full filter IDs" in which higher order bits encode the filter
       match-priority.  This could cause a filter to have a full filter ID of
       0xffff, which is also the value EFX_EF10_FILTER_ID_INVALID which we use
       in 16-bit "short" filter IDs (without match-priority bits).  This would
       occur if the hardware filter ID was 0x1fff and the match-priority was 7.
      Unfortunately, some code that checks for EFX_EF10_FILTER_ID_INVALID can
       be called on full filter IDs, and will WARN_ON if this ever happens.
      So, since we have plenty of spare bits in the full filter ID, this patch
       shifts the priority bits left one bit when constructing the full filter
       IDs, ensuring that the 0x2000 bit of a full filter ID will always be 0
       and thus no full filter ID can ever equal EFX_EF10_FILTER_ID_INVALID.
      
      This patch also replaces open-coded full<->short filter ID conversions
       with calls to functions, thus keeping the definition of the full filter
       ID format in one place.
      Signed-off-by: Edward Cree <ecree@solarflare.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
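The collision described in the filter_id commit above can be checked with simple bit arithmetic. The following is a sketch under stated assumptions: the helper names are hypothetical, and the old layout (priority placed directly above the 13-bit hardware ID) is inferred from the commit message's own example rather than taken from the driver source.

```python
HW_ID_BITS = 13
HW_ID_MASK = (1 << HW_ID_BITS) - 1     # 0x1fff, the largest hardware filter ID
FILTER_ID_INVALID = 0xFFFF             # value of EFX_EF10_FILTER_ID_INVALID per the message


def full_id_old(priority, hw_id):
    """Assumed old layout: priority bits sit directly above the hardware ID."""
    return (priority << HW_ID_BITS) | (hw_id & HW_ID_MASK)


def full_id_new(priority, hw_id):
    """Layout after the patch: priority shifted one more bit, so bit 0x2000 is always 0."""
    return (priority << (HW_ID_BITS + 1)) | (hw_id & HW_ID_MASK)


def short_id(full_id):
    """Recover the hardware filter ID from a full filter ID (either layout)."""
    return full_id & HW_ID_MASK


def priority_of(full_id):
    """Recover the match-priority from a full filter ID in the new layout."""
    return full_id >> (HW_ID_BITS + 1)


print(hex(full_id_old(7, 0x1FFF)), hex(full_id_new(7, 0x1FFF)))
```

Keeping the packing and unpacking in a few helpers mirrors the commit's second point: the format is defined in one place instead of being open-coded at each call site.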
    • vmxnet3: prevent building with 64K pages · fbdf0e28
      Arnd Bergmann committed
      I got a warning about broken code on ARM64 with 64K pages:
      
      drivers/net/vmxnet3/vmxnet3_drv.c: In function 'vmxnet3_rq_init':
      drivers/net/vmxnet3/vmxnet3_drv.c:1679:29: error: large integer implicitly truncated to unsigned type [-Werror=overflow]
          rq->buf_info[0][i].len = PAGE_SIZE;
      
      'len' here is a 16-bit integer, so this clearly won't work. I don't think
      this driver is used much on anything other than x86, so there is no need
      to fix this properly and we can work around it with a Kconfig dependency
      to forbid known-broken configurations. qemu in theory supports it on
      other architectures too, but presumably only for compatibility with x86
      guests that also run on vmware.
      
      CONFIG_PAGE_SIZE_64KB is used on hexagon, mips, sh and tile, the other
      symbols are architecture-specific names for the same thing.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
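The truncation behind the vmxnet3 warning above is easy to reproduce numerically. This is a minimal Python model, with a hypothetical helper name, of what storing a value into an unsigned 16-bit field does; it is not the driver code.

```python
U16_MAX = 0xFFFF


def store_u16(value):
    """Model assignment to an unsigned 16-bit field: only the low 16 bits are kept."""
    return value & U16_MAX


PAGE_SIZE_4K = 4 * 1024
PAGE_SIZE_64K = 64 * 1024

print(store_u16(PAGE_SIZE_4K), store_u16(PAGE_SIZE_64K))
```

A 4K page size fits in the field unchanged, while a 64K page size is one more than the field can hold and wraps to zero, which is why the configuration is forbidden rather than patched.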
    • net/wan: add MODULE_LICENSE for fsl_ucc_hdlc · 74179d44
      Valentin Longchamp committed
      It is required to build it as a module.
      Signed-off-by: Valentin Longchamp <valentin.longchamp@keymile.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rds:Remove unnecessary ib_ring unalloc · d2c58294
      Zhu Yanjun committed
      In the function rds_ib_xmit_atomic, ib_ring is not allocated
      successfully. As such, it is not necessary to unalloc it.
      
      Cc: Joe Jin <joe.jin@oracle.com>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
      Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • nfp: Use PCI_DEVICE_ID_NETRONOME_NFP* defines · 3b473528
      Simon Horman committed
      Use PCI_DEVICE_ID_NETRONOME_NFP*, defined in linux/pci_ids.h,
      rather than replicating the same values in the NFP driver.
      Signed-off-by: Simon Horman <simon.horman@netronome.com>
      Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • pkt_sched: Remove useless qdisc_stab_lock · 806a8376
      Gao Feng committed
      The qdisc_stab_lock is used in qdisc_get_stab and qdisc_put_stab.
      These two functions are invoked in qdisc_create, qdisc_change, and
      qdisc_destroy which run fully under RTNL.
      
      RTNL therefore already guarantees that only one context can access
      qdisc_stab_list at a time, so qdisc_stab_lock is no longer necessary.
      Signed-off-by: Gao Feng <fgao@ikuai8.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rxrpc: Change module filename to rxrpc.ko · 88c4845d
      David Howells committed
      Change module filename from af-rxrpc.ko to rxrpc.ko so as to be consistent
      with the other protocol drivers.
      
      Also adjust the documentation to reflect this.
      
      Further, there is no longer a standalone rxkad module, as it has been
      merged into the rxrpc core, so get rid of references to that.
      Reported-by: Marc Dionne <marc.dionne@auristor.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netvsc: fix typo on statistics · b5124720
      Simon Xiao committed
      Return the correct tx_errors stats in netvsc.
      Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
      Signed-off-by: Simon Xiao <sixiao@microsoft.com>
      Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rtnl: don't account unused struct ifla_port_vsi in rtnl_port_size · 025331df
      Daniel Borkmann committed
      When allocating rtnl dump messages, struct ifla_port_vsi is never dumped,
      so we can save header plus payload in rtnl_port_size(). In fact, attribute
      IFLA_PORT_VSI_TYPE and struct ifla_port_vsi are not used anywhere in
      the kernel. We only need to keep the nla policy should applications in
      user space be filling this out. Same NLA_BINARY issue exists as was fixed
      in 364d5716 ("rtnetlink: ifla_vf_policy: fix misuses of NLA_BINARY")
      and others, but then again IFLA_PORT_VSI_TYPE is not used anywhere, so
      just add a comment that it's unused.
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: qlogic: netxen: use new api ethtool_{get|set}_link_ksettings · 99f18f1d
      Philippe Reynes committed
      The ethtool api {get|set}_settings is deprecated.
      We move this driver to new api {get|set}_link_ksettings.
      
      As I don't have the hardware, I'd be very pleased if
      someone could test this patch.
      Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: hamachi: use new api ethtool_{get|set}_link_ksettings · 336f8a71
      Philippe Reynes committed
      The ethtool api {get|set}_settings is deprecated.
      We move this driver to new api {get|set}_link_ksettings.
      
      As I don't have the hardware, I'd be very pleased if
      someone could test this patch.
      Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bridge: don't indicate expiry on NTF_EXT_LEARNED fdb entries · eda7a5e8
      Roopa Prabhu committed
      added_by_external_learn fdb entries are added and expired by
      external entities like switchdev driver or external controllers.
      Ageing is already disabled for such entries. Hence, don't
      indicate expiry for such fdb entries.
      
      CC: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      CC: Jiri Pirko <jiri@resnulli.us>
      CC: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
      Reviewed-by: Ido Schimmel <idosch@mellanox.com>
      Tested-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'bpf-misc' · a2b4eb55
      David S. Miller committed
      Daniel Borkmann says:
      
      ====================
      Misc BPF improvements
      
      This last series for this window adds various misc
      improvements to BPF, one is to mark registered map and
      prog types as __ro_after_init, another one for removing
      cBPF stubs in eBPF JITs and moving the stub to the core
      and last also improving JITs is to make generated images
      visible to the kernel and kallsyms, so they can be
      seen in traces. For details, please have a look at the
      individual patches.
      
      Thanks a lot!
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bpf: make jited programs visible in traces · 74451e66
      Daniel Borkmann committed
      Long standing issue with JITed programs is that stack traces from
      function tracing check whether a given address is kernel code
      through {__,}kernel_text_address(), which checks for code in core
      kernel, modules and dynamically allocated ftrace trampolines. But
      what is still missing is BPF JITed programs (interpreted programs
      are not an issue as __bpf_prog_run() will be attributed to them),
      thus when a stack trace is triggered, the code walking the stack
      won't see any of the JITed ones. The same for address correlation
      done from user space via reading /proc/kallsyms. This is read by
      tools like perf, but the latter is also useful for permanent live
      tracing with eBPF itself in combination with stack maps when other
      eBPF types are part of the callchain. See offwaketime example on
      dumping stack from a map.
      
      This work tries to tackle that issue by making the addresses and
      symbols known to the kernel. The lookup from *kernel_text_address()
      is implemented through a latched RB tree that can be read under
      RCU in fast-path that is also shared for symbol/size/offset lookup
      for a specific given address in kallsyms. The slow-path iteration
      through all symbols in the seq file done via RCU list, which holds
      a tiny fraction of all exported ksyms, usually below 0.1 percent.
      Function symbols are exported as bpf_prog_<tag>, in order to aid
      debugging and attribution. This facility is currently enabled for
      root-only when bpf_jit_kallsyms is set to 1, and disabled if hardening
      is active in any mode. The rationale behind this is that still a lot
      of systems ship with world read permissions on kallsyms thus addresses
      should not get suddenly exposed for them. If that situation gets
      much better in future, we always have the option to change the
      default on this. Likewise, unprivileged programs are not allowed
      to add entries there either, but that is less of a concern as most
      such programs types relevant in this context are for root-only anyway.
      If enabled, call graphs and stack traces will then show a correct
      attribution; one example is illustrated below, where the trace is
      now visible in tooling such as perf script --kallsyms=/proc/kallsyms
      and friends.
      
      Before:
      
        7fff8166889d bpf_clone_redirect+0x80007f0020ed (/lib/modules/4.9.0-rc8+/build/vmlinux)
               f5d80 __sendmsg_nocancel+0xffff006451f1a007 (/usr/lib64/libc-2.18.so)
      
      After:
      
        7fff816688b7 bpf_clone_redirect+0x80007f002107 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fffa0575728 bpf_prog_33c45a467c9e061a+0x8000600020fb (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fffa07ef1fc cls_bpf_classify+0x8000600020dc (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff81678b68 tc_classify+0x80007f002078 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff8164d40b __netif_receive_skb_core+0x80007f0025fb (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff8164d718 __netif_receive_skb+0x80007f002018 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff8164e565 process_backlog+0x80007f002095 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff8164dc71 net_rx_action+0x80007f002231 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff81767461 __softirqentry_text_start+0x80007f0020d1 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff817658ac do_softirq_own_stack+0x80007f00201c (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff810a2c20 do_softirq+0x80007f002050 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff810a2cb5 __local_bh_enable_ip+0x80007f002085 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff8168d452 ip_finish_output2+0x80007f002152 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff8168ea3d ip_finish_output+0x80007f00217d (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff8168f2af ip_output+0x80007f00203f (/lib/modules/4.9.0-rc8+/build/vmlinux)
        [...]
        7fff81005854 do_syscall_64+0x80007f002054 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff817649eb return_from_SYSCALL_64+0x80007f002000 (/lib/modules/4.9.0-rc8+/build/vmlinux)
               f5d80 __sendmsg_nocancel+0xffff01c484812007 (/usr/lib64/libc-2.18.so)
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
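Since the commit above exports JITed programs as bpf_prog_<tag> symbols, user space can pick them out of kallsyms-style text. The sketch below is an assumption-laden illustration in Python: the function name is hypothetical, the sample text and its addresses are made up for demonstration (only the tag is taken from the trace in the commit message), and the parser relies on the usual "address type name [module]" column layout.

```python
SAMPLE_KALLSYMS = (
    "ffffffff816688b7 T bpf_clone_redirect\n"                 # illustrative address
    "ffffffffa0575000 t bpf_prog_33c45a467c9e061a\t[bpf]\n"   # illustrative address
    "ffffffff81678b68 T tc_classify\n"                        # illustrative address
)

PREFIX = "bpf_prog_"


def bpf_prog_symbols(kallsyms_text):
    """Return (address, tag) pairs for every bpf_prog_<tag> symbol in the text."""
    found = []
    for line in kallsyms_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue                       # skip blank or malformed lines
        address, name = fields[0], fields[2]
        if name.startswith(PREFIX):
            found.append((int(address, 16), name[len(PREFIX):]))
    return found


print(bpf_prog_symbols(SAMPLE_KALLSYMS))
```

On a live system the same function could be fed the contents of /proc/kallsyms, read as root with bpf_jit_kallsyms set to 1, as the commit message describes.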
    • bpf: remove stubs for cBPF from arch code · 9383191d
      Daniel Borkmann committed
      Remove the dummy bpf_jit_compile() stubs for eBPF JITs and make
      that a single __weak function in the core that can be overridden
      similarly to the eBPF one. Also remove stale pr_err() mentions
      of bpf_jit_compile.
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bpf: mark all registered map/prog types as __ro_after_init · c78f8bdf
      Daniel Borkmann committed
      All map types and prog types are registered to the BPF core through
      bpf_register_map_type() and bpf_register_prog_type() during init and
      remain unchanged thereafter. As by design we don't (and never will)
      have any pluggable code that can register to that at any later point
      in time, let's mark all the existing bpf_{map,prog}_type_list objects
      in the tree as __ro_after_init, so they can be moved to read-only
      section from then onwards.
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bridge: vlan_tunnel: explicitly reset metadata attrs to NULL on failure · afcb50ba
      Roopa Prabhu committed
      Fixes: efa5356b ("bridge: per vlan dst_metadata netlink support")
      Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
      Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: bgmac: store MAC address directly in netdev->dev_addr · 6850f8b5
      Tobias Klauser committed
      After commit 34a5102c ("net: bgmac: allocate struct bgmac just once
      & don't copy it") the mac_addr member of struct bgmac is no longer
      necessary to pass the MAC address to bgmac_enet_probe(). Instead it can
      directly be stored in netdev->dev_addr.
      
      Also use eth_hw_addr_random() instead of eth_random_addr() in case a
      random MAC is needed. This will make sure netdev->addr_assign_type will
      be properly set.
      Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
      Acked-by: Jon Mason <jon.mason@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'wireless-drivers-next-for-davem-2017-02-16' of... · 3105dfb2
      David S. Miller committed
      Merge tag 'wireless-drivers-next-for-davem-2017-02-16' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next
      
      Kalle Valo says:
      
      ====================
      wireless-drivers-next patches for 4.11
      
      Mostly small fixes, not really any new features.
      
      Major changes:
      
      ath10k
      
      * when trying older firmware versions don't confuse user with error messages
      
      ath9k
      
      * fix crash in AP mode (regression)
      * fix relayfs crash (regression)
      * fix initialisation with AR9340 and AR9550
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethoc: Use eth_hw_addr_random() · 6d6a505a
      Tobias Klauser committed
      Use eth_hw_addr_random() to set a random dev_addr and update
      addr_assign_type instead of open-coding it.
      Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'rhashtable-allocation-failure-during-insertion' · e1c151a4
      David S. Miller committed
      Herbert Xu says:
      
      ====================
      rhashtable: Handle table allocation failure during insertion
      
      v2 -
      
      Added Ack to patch 2.
      Fixed RCU annotation in code path executed by rehasher by using
      rht_dereference_bucket.
      
      v1 -
      
      This series tackles the problem of table allocation failures during
      insertion.  The issue is that we cannot vmalloc during insertion.
      This series deals with this by introducing nested tables.
      
      The first two patches remove manual hash table walks which cannot
      work on a nested table.
      
      The final patch introduces nested tables.
      
      I've tested this with test_rhashtable and it appears to work.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rhashtable: Add nested tables · da20420f
      Herbert Xu committed
      This patch adds code that handles GFP_ATOMIC kmalloc failure on
      insertion.  As we cannot use vmalloc, we solve it by making our
      hash table nested.  That is, we allocate single pages at each level
      and reach our desired table size by nesting them.
      
      When a nested table is created, only a single page is allocated
      at the top-level.  Lower levels are allocated on demand during
      insertion.  Therefore for each insertion to succeed, only two
      (non-consecutive) pages are needed.
      
      After a nested table is created, a rehash will be scheduled in
      order to switch to a vmalloced table as soon as possible.  Also,
      the rehash code will never rehash into a nested table.  If we
      detect a nested table during a rehash, the rehash will be aborted
      and a new rehash will be scheduled.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
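The nested-table idea in the rhashtable commit above can be sketched as a toy model. This is not the kernel implementation: the class and method names are hypothetical, only two levels are modelled, and the figure of 512 slots per page assumes 4096-byte pages holding 8-byte pointers. It shows the top-level page being allocated up front and lower-level pages appearing only when an insertion needs them.

```python
SLOTS_PER_PAGE = 512   # assumption: 4096-byte page / 8-byte pointer


class NestedTable:
    """Toy two-level bucket table built from page-sized arrays."""

    def __init__(self):
        self.top = [None] * SLOTS_PER_PAGE   # the only page allocated at creation
        self.pages_allocated = 1

    def bucket(self, index, create=False):
        """Return the bucket list for `index`, or None if its page does not exist."""
        top_slot, leaf_slot = divmod(index, SLOTS_PER_PAGE)
        if not 0 <= top_slot < SLOTS_PER_PAGE:
            raise IndexError("bucket index outside the two-level table")
        leaf = self.top[top_slot]
        if leaf is None:
            if not create:
                return None
            # Lower-level page allocated on demand, one page at a time.
            leaf = [[] for _ in range(SLOTS_PER_PAGE)]
            self.top[top_slot] = leaf
            self.pages_allocated += 1
        return leaf[leaf_slot]

    def insert(self, index, item):
        self.bucket(index, create=True).append(item)


table = NestedTable()
table.insert(1000, "a")
table.insert(1001, "b")    # same lower-level page as bucket 1000
table.insert(5, "c")       # different lower-level page
print(table.pages_allocated)
```

Because every allocation is a single page, the model never needs one large contiguous region, which is the property that lets insertion proceed where vmalloc cannot be used.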
    • tipc: Fix tipc_sk_reinit race conditions · 40f9f439
      Herbert Xu committed
      There are two problems with the function tipc_sk_reinit.  Firstly
      it's doing a manual walk over an rhashtable.  This is broken as
      an rhashtable can be resized and if you manually walk over it
      during a resize then you may miss entries.
      
      Secondly it's missing memory barriers as previously the code used
      spinlocks which provide the barriers implicitly.
      
      This patch fixes both problems.
      
      Fixes: 07f6c4bc ("tipc: convert tipc reference table to...")
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • gfs2: Use rhashtable walk interface in glock_hash_walk · 98687f42
      Herbert Xu committed
      The function glock_hash_walk walks the rhashtable by hand.  This
      is broken because if it catches the hash table in the middle of
      a rehash, then it will miss entries.
      
      This patch replaces the manual walk by using the rhashtable walk
      interface.
      
      Fixes: 88ffbf3e ("GFS2: Use resizable hash table for glocks")
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: mvneta: make mvneta_eth_tool_ops static · 4581be42
      Jisheng Zhang committed
      The mvneta_eth_tool_ops is only used internally in mvneta driver, so
      make it static.
      Signed-off-by: Jisheng Zhang <jszhang@marvell.com>
      Acked-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'net-sched-reflect-hw-offload-in-classifiers' · 4b5026ad
      David S. Miller committed
      Or Gerlitz says:
      
      ====================
      net/sched: Reflect HW offload status in classifiers
      
      Currently there is no way of querying whether a filter is
      offloaded to HW or not when using "both" policy (where none
      of skip_sw or skip_hw flags are set by user-space).
      
      Added two new flags, "in hw" and "not in hw" such that user space
      can determine if a filter is actually offloaded to hw. The "in hw"
      UAPI semantics was chosen so it's similar to the "skip hw" flag logic.
      
      If none of these two flags are set, this signals running
      over older kernel.
      
      As an example, add one vlan push + fwd rule, one matchall rule and one u32 rule
      without any flags, and another vlan + fwd skip_sw rule, such that the different TC
      classifiers attempt to offload all of them -- all over mlx5 SRIOV VF rep:
      
      	flower skip_sw indev eth2_0 src_mac e4:11:22:33:44:50 dst_mac e4:1d:2d:a5:f3:9d
      	action vlan push id 52 action mirred egress redirect dev eth2
      
      	flower indev eth2_0 src_mac e4:11:22:33:44:50 dst_mac e4:11:22:33:44:51
      	action vlan push id 53 action mirred egress redirect dev eth2
      
      	u32 ht 800: flowid 800:1 match ip src 192.168.1.0/24 action drop
      
      Since that VF rep doesn't offload matchall/u32 and can currently offload
      only one vlan push rule we expect three of the rules not to be offloaded:
      
      filter protocol ip pref 99 u32
      filter protocol ip pref 99 u32 fh 800: ht divisor 1
      filter protocol ip pref 99 u32 fh 800::1 order 1 key ht 800 bkt 0 flowid 800:1 not in_hw
        match c0a80100/ffffff00 at 12
      	action order 1: gact action drop
      	 random type none pass val 0
      	 index 8 ref 1 bind 1
      
      filter protocol all pref 49150 matchall
      filter protocol all pref 49150 matchall handle 0x1
        not in_hw
      	action order 1: mirred (Egress Mirror to device veth1) pipe
       	index 27 ref 1 bind 1
      
      filter protocol ip pref 49151 flower
      filter protocol ip pref 49151 flower handle 0x1
        indev eth2_0
        dst_mac e4:11:22:33:44:51
        src_mac e4:11:22:33:44:50
        eth_type ipv4
        not in_hw
      	action order 1:  vlan push id 53 protocol 802.1Q priority 0 pipe
      	 index 20 ref 1 bind 1
      
      	action order 2: mirred (Egress Redirect to device eth2) stolen
       	index 26 ref 1 bind 1
      
      filter protocol ip pref 49152 flower
      filter protocol ip pref 49152 flower handle 0x1
        indev eth2_0
        dst_mac e4:1d:2d:a5:f3:9d
        src_mac e4:11:22:33:44:50
        eth_type ipv4
        skip_sw
        in_hw
      	action order 1:  vlan push id 52 protocol 802.1Q priority 0 pipe
      	 index 19 ref 1 bind 1
      
      	action order 2: mirred (Egress Redirect to device eth2) stolen
       	index 25 ref 1 bind 1
      
      v3 --> v4 changes:
       - removed extra parenthesis (Dave)
      
      v2 --> v3 changes:
       - fixed the matchall dump flags patch to do proper checks (Jakub)
       - added the same proper checks to flower where they were missing
       - that flower patch was added as #1 and hence all the other patches are shifted by one
      
      v1 --> v2 changes:
       - applied feedback from Jakub and Dave -- where none of the skip flags were set,
         the suggested approach didn't allow user space to distinguish between an old kernel
         and a case where offloading to HW worked fine.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
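The flag semantics described in the offload-status series above (either "in hw" or "not in hw" is reported, and neither being set signals an older kernel) can be sketched as a small decoder. This is an illustrative Python model with a hypothetical function name; the bit values are an assumption that should be checked against the kernel's UAPI header include/uapi/linux/pkt_cls.h before relying on them.

```python
# Assumed bit values; verify against include/uapi/linux/pkt_cls.h.
TCA_CLS_FLAGS_SKIP_HW = 1 << 0
TCA_CLS_FLAGS_SKIP_SW = 1 << 1
TCA_CLS_FLAGS_IN_HW = 1 << 2
TCA_CLS_FLAGS_NOT_IN_HW = 1 << 3


def offload_status(flags):
    """Classify a filter's dumped flags the way the cover letter describes."""
    if flags & TCA_CLS_FLAGS_IN_HW:
        return "in_hw"
    if flags & TCA_CLS_FLAGS_NOT_IN_HW:
        return "not in_hw"
    # Neither flag present: the kernel predates this series.
    return "unknown (older kernel)"


# Mirrors the example dump: a skip_sw rule that was offloaded, and a rule
# added without flags that the device could not offload.
print(offload_status(TCA_CLS_FLAGS_SKIP_SW | TCA_CLS_FLAGS_IN_HW))
print(offload_status(TCA_CLS_FLAGS_NOT_IN_HW))
```

Treating the absence of both flags as a third state is what lets user space tell an old kernel apart from a filter that was simply not offloaded.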
    • net/sched: cls_bpf: Reflect HW offload status · 5cecb6cc
      Or Gerlitz committed
      BPF classifier support for the "in hw" offloading flags.
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Reviewed-by: Amir Vadai <amir@vadai.me>
      Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>