- 19 May 2018, 1 commit
-
-
Committed by Jiong Wang

For indirect shifts, the shift amount is not specified as a constant. The NFP needs to get the shift amount through the low 5 bits of the source A operand in PREV_ALU, so extra instructions are needed compared with shifts by a constant. Because the NFP is 32-bit, we use a register pair for 64-bit shifts and therefore need different instruction sequences depending on whether the shift amount is less than 32 or not. An NFP branch-on-bit-test instruction emitter is added by this patch and used for an efficient runtime check on the shift amount: the amount is treated as less than 32 if bit 5 is clear, and as greater than or equal to 32 otherwise. A shift amount greater than or equal to 64 results in undefined behavior. This patch also uses range info to avoid generating the runtime check when we are certain which side of 32 the shift amount falls on.

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
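For illustration, here is a host-side C sketch (not the NFP JIT output itself) of how a 64-bit left shift by a variable amount decomposes onto a 32-bit register pair, branching on bit 5 of the shift amount as described above:

```c
#include <stdint.h>

static uint64_t shl64_on_pair(uint32_t lo, uint32_t hi, uint32_t amt)
{
	amt &= 63;			/* amounts >= 64 are undefined */

	if (!(amt & 32)) {		/* bit 5 clear: amount < 32 */
		if (amt) {
			hi = (hi << amt) | (lo >> (32 - amt));
			lo <<= amt;
		}
	} else {			/* bit 5 set: amount >= 32 */
		hi = lo << (amt & 31);
		lo = 0;
	}
	return ((uint64_t)hi << 32) | lo;
}
```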
-
- 05 May 2018, 2 commits
-
-
Committed by Jakub Kicinski

Add support for the perf_event_output family of helpers. The implementation on the NFP will not match the host code exactly. The state of the host map and rings is unknown to the device, hence the device can't return errors when rings are not installed. The device simply packs the data into a firmware notification message and sends it over to the host, returning success to the program. There is no notion of a host CPU on the device when packets are being processed. The device will only offload programs which set BPF_F_CURRENT_CPU. Still, if the map index doesn't match the CPU, no error will be returned (see above). Dropped/lost firmware notification messages will not cause a "lost events" event on the perf ring; they are only visible via device error counters. Firmware notification messages may also get reordered with respect to the packets which caused their generation.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
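For reference, a minimal BPF C sketch (libbpf-style map definition assumed; map, section, and payload names are illustrative) of the only usage pattern the device will offload, with BPF_F_CURRENT_CPU as the index flag:

```c
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

/* Perf event array read by the host side via the perf ring. */
struct {
	__uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY);
	__uint(key_size, sizeof(int));
	__uint(value_size, sizeof(int));
} events SEC(".maps");

SEC("xdp")
int sample_pkt(struct xdp_md *ctx)
{
	__u32 meta = 1;		/* hypothetical payload */

	/* The device packs this into a FW notification message; success
	 * is returned to the program even if the host ring is missing.
	 */
	bpf_perf_event_output(ctx, &events, BPF_F_CURRENT_CPU,
			      &meta, sizeof(meta));
	return XDP_PASS;
}

char _license[] SEC("license") = "GPL";
```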
-
Committed by Jakub Kicinski

For asynchronous events originating from the device, like perf event output, we need to be able to make sure that objects being referred to by the FW message are valid on the host. FW events can get queued and reordered. Even if we had a FW message "barrier", we should still protect ourselves from bogus FW output. Add a reverse-mapping hash table and record in it all raw map pointers the FW may refer to. Only record neutral maps, i.e. perf event arrays; these are currently the only objects the FW can refer to. Use RCU protection on the read side; the update side is under RTNL. Since program vs map destruction order is slightly painful for offload, simply take an extra reference on all the recorded maps to make sure they don't disappear.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
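A hedged sketch of the record/lookup pattern described above; the struct layout, table name, and params are illustrative, not the driver's actual definitions:

```c
#include <linux/bpf.h>
#include <linux/rhashtable.h>

/* One record per neutral map the FW may name in a notification. */
struct neutral_map_rec {
	struct rhash_head l;
	struct bpf_map *ptr;	/* raw host pointer used as the hash key */
};

static const struct rhashtable_params neutral_map_params = {
	.key_len	= sizeof(struct bpf_map *),
	.key_offset	= offsetof(struct neutral_map_rec, ptr),
	.head_offset	= offsetof(struct neutral_map_rec, l),
};

/* Read side: the FW notification handler validates the raw pointer
 * under RCU before touching the map.  The update side (insert/remove
 * plus taking the extra map reference) runs under RTNL.
 */
static bool map_is_known(struct rhashtable *ht, struct bpf_map *map)
{
	bool known;

	rcu_read_lock();
	known = !!rhashtable_lookup_fast(ht, &map, neutral_map_params);
	rcu_read_unlock();
	return known;
}
```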
-
- 29 March 2018, 1 commit
-
-
Committed by Jakub Kicinski

Implement the atomic add operation for 32 and 64 bit values. Depend on the verifier to ensure alignment. Values have to be kept in big endian and swapped upon read/write. For now only atomic add of a constant is supported.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
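A minimal BPF C sketch of the source pattern this maps to: __sync_fetch_and_add() is what LLVM turns into the BPF atomic-add (XADD) instruction. Map and section names are illustrative.

```c
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

struct {
	__uint(type, BPF_MAP_TYPE_ARRAY);
	__uint(max_entries, 1);
	__type(key, __u32);
	__type(value, __u64);
} counters SEC(".maps");

SEC("xdp")
int count_pkts(struct xdp_md *ctx)
{
	__u32 key = 0;
	__u64 *val = bpf_map_lookup_elem(&counters, &key);

	if (val)
		/* Compiles to the BPF atomic-add instruction; the addend
		 * is a constant, matching what is supported for now.
		 */
		__sync_fetch_and_add(val, 1);
	return XDP_PASS;
}

char _license[] SEC("license") = "GPL";
```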
-
- 23 January 2018, 1 commit
-
-
Committed by Quentin Monnet

Use the recently added extack support for eBPF offload in the driver.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
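A tiny hedged sketch of what this enables in the driver's offload checks; the condition, limit, and message text are made up for illustration:

```c
#include <linux/filter.h>
#include <linux/netlink.h>	/* NL_SET_ERR_MSG_MOD() */

static int example_verify_offload(struct bpf_prog *prog,
				  struct netlink_ext_ack *extack)
{
	/* illustrative rejection point with a made-up limit; the point is
	 * that the reason now reaches user space instead of a bare errno
	 */
	if (prog->len > 42 * 1024) {
		NL_SET_ERR_MSG_MOD(extack, "program too long for offload");
		return -EOPNOTSUPP;
	}
	return 0;
}
```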
-
- 19 January 2018, 1 commit
-
-
Committed by Jakub Kicinski

The special handling of different map types is left to the driver. Allow offload of array maps by simply adding them to the accepted types. For nfp we have to make sure array elements are not deleted.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
- 18 January 2018, 1 commit
-
-
Committed by Jiong Wang

This patch sets the new JIT info fields introduced earlier in this patch set.

Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
- 15 January 2018, 2 commits
-
-
Committed by Jakub Kicinski

Plug into the stack's map offload callbacks for BPF map offload. The get-next call needs some special handling on the FW side: since we can't send a NULL pointer to the FW, there is a separate get-first-entry FW command.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
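A hedged sketch of that quirk: fw_map_op() and the EXAMPLE_OP_* values are illustrative stand-ins for the driver's FW map request machinery, only the callback shape follows the stack's map offload API.

```c
#include <linux/bpf.h>

static int example_map_get_next(struct bpf_offloaded_map *offmap,
				void *key, void *next_key)
{
	/* The stack passes key == NULL to ask for the first entry, but a
	 * NULL pointer can't be sent to the FW, hence a dedicated
	 * "get first entry" command.
	 */
	if (!key)
		return fw_map_op(offmap, EXAMPLE_OP_GETFIRST, NULL, next_key);
	return fw_map_op(offmap, EXAMPLE_OP_GETNEXT, key, next_key);
}
```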
-
Committed by Jakub Kicinski

With map offload coming, we need to call the program offload structure something less ambiguous. Pure rename, no functional changes.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
- 10 January 2018, 5 commits
-
-
Committed by Jakub Kicinski

Instead of having an app callback per message type, hand off all offload-related handling to apps with one "rest of ndo_bpf" callback.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Committed by Jakub Kicinski

The translator pre-allocates a buffer of maximal program size. Due to HW/FW limitations the program buffer can't currently be longer than 128KB, so we used to kmalloc() it and then map it for DMA directly. Now that the late branch resolution is copying the program image anyway, we can just kvmalloc() the buffer. While at it, reallocate the buffer after translation to save space.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
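An illustrative sketch of the "reallocate to save space" step under the assumptions stated in the message; the helper name is made up, not the driver's:

```c
#include <linux/mm.h>		/* kvmalloc(), kvfree() */
#include <linux/string.h>

/* Shrink the maximal-size translation buffer to the bytes actually
 * used; on allocation failure simply keep the oversized buffer.
 */
static void *example_shrink_prog(void *prog, size_t used_len)
{
	void *p = kvmalloc(used_len, GFP_KERNEL);

	if (!p)
		return prog;
	memcpy(p, prog, used_len);
	kvfree(prog);
	return p;
}
```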
-
Committed by Jakub Kicinski

Don't translate the program assuming it will be loaded at a given address. This will be required for sharing programs between ports of the same NIC, tail calls and subprograms. It will also make the jump targets easier to understand when dumping the program to user space. Translate the program as if it was going to be loaded at address zero. When the load happens, add the load offset in and set the addresses of special branches.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Committed by Jakub Kicinski

Jump target resolution should be in jit.c, not offload.c. No functional changes.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Committed by Jakub Kicinski

The kernel enforces alignment of the bottom of the stack; the NFP deals with positive offsets better, so we should align the top of the stack instead. Round the stack size up to the NFP word size (4B).

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
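A small worked sketch of the offset convention this implies (the numbers are just an example; round_up() is the standard kernel helper):

```c
#include <linux/kernel.h>	/* round_up() */

/* BPF addresses the stack with negative offsets from the frame pointer;
 * the NFP image uses positive offsets from the rounded-up top of the
 * stack, which rounding to 4B keeps aligned.
 */
static unsigned int example_nfp_stack_off(unsigned int stack_depth, int bpf_off)
{
	unsigned int frame = round_up(stack_depth, 4);	/* e.g. 13B -> 16B */

	return frame + bpf_off;		/* "fp - 4" with a 16B frame -> 12 */
}
```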
-
- 15 December 2017, 1 commit
-
-
Committed by Jakub Kicinski

The BPF FW creates a run-time symbol called bpf_capabilities which contains TLV-formatted capability information. Allocate an app private structure to store parsed capabilities and add a skeleton of the parsing logic.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
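A hedged skeleton of such a TLV walk; the layout below (big-endian type/length pairs) is an assumption made for the example, not the documented firmware format:

```c
#include <linux/kernel.h>

struct example_tlv {
	__be32 type;
	__be32 length;		/* length of value[], in bytes */
	u8 value[];
};

static int example_parse_caps(const u8 *data, unsigned int size)
{
	while (size >= sizeof(struct example_tlv)) {
		const struct example_tlv *tlv = (const void *)data;
		u32 len = be32_to_cpu(tlv->length);

		if (len > size - sizeof(*tlv))
			return -EINVAL;

		switch (be32_to_cpu(tlv->type)) {
		/* record known capabilities in the app private struct */
		default:
			break;	/* skip capabilities we don't understand */
		}

		data += sizeof(*tlv) + len;
		size -= sizeof(*tlv) + len;
	}
	return 0;
}
```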
-
- 02 December 2017, 2 commits
-
-
Committed by Jiong Wang

The NFP eBPF offload JIT engine does some instruction-combine based optimizations which, however, are not safe if the combined sequences cross basic block borders. Currently, there are post checks while fixing up jump destinations: if a jump destination is found to be an eBPF insn that has been combined into another one, the JIT engine raises an error and aborts. This is not optimal. The JIT engine ought to disable the optimization on such cross-bb-border sequences instead of aborting. As the eBPF infrastructure provides no control flow information, we can't do basic-block-based optimizations; instead, this patch extends the existing jump destination record pass to also flag the jump destinations, so that the instruction combine passes can skip the optimization when insns in the sequence are jump targets.

Suggested-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
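An illustrative sketch of the resulting guard in a combine pass; the struct and flag are examples, not the driver's definitions:

```c
#define EXAMPLE_FLAG_JUMP_DST	0x1

struct example_insn_meta {
	unsigned int flags;
};

/* The pre-scan flags every insn that is a jump destination; a pair
 * whose second insn carries the flag is never fused, since fusing it
 * would cross a basic-block border.
 */
static bool example_can_combine(const struct example_insn_meta *first,
				const struct example_insn_meta *second)
{
	return !(second->flags & EXAMPLE_FLAG_JUMP_DST);
}
```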
-
Committed by Jiong Wang

eBPF insns are organized internally as a doubly linked list inside the NFP offload JIT. Random access to an insn needs to be done by either forward or backward traversal along the list. One place we need to do such traversal is nfp_fixup_branches, where one traversal is needed for each jump insn to find the destination. Such traversals could be avoided if jump destinations were collected through a single traversal in a pre-scan pass, and this information could also be useful in other places where jump destination info is needed. This patch adds such jump destination collection in nfp_prog_prepare.

Suggested-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
- 21 November 2017, 1 commit
-
-
Committed by Jakub Kicinski

With the TC shared block changes we can't depend on a correct netdev pointer being available in cls_bpf. Move the device validation to the driver. The core will only make sure that offloaded programs are always attached in the driver (or in HW by the driver). We trust that drivers which implement offload callbacks will perform the necessary checks. Moving the checks to the driver is generally a useful thing; in practice the check should be against a switchdev instance, not a netdev, given that most ASICs will probably allow using the same program on many ports.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
- 05 November 2017, 7 commits
-
-
Committed by Jakub Kicinski

The following steps are taken in the driver to offload an XDP program:

XDP_SETUP_PROG:
  * prepare:
    - allocate program state;
    - run verifier (bpf_analyzer());
    - run translation;
  * load:
    - stop old program if needed;
    - load program;
    - enable BPF if not enabled;
  * clean up:
    - free program image.

With the new infrastructure the flow will look like this:

BPF_OFFLOAD_VERIFIER_PREP:
  - allocate program state;
BPF_OFFLOAD_TRANSLATE:
  - run translation;
XDP_SETUP_PROG:
  - stop old program if needed;
  - load program;
  - enable BPF if not enabled;
BPF_OFFLOAD_DESTROY:
  - free program image.

Take advantage of the new infrastructure. Allocation of driver metadata has to be moved from jit.c to offload.c since it's now done at a different stage. Since there is no separate driver private data for the verification step, move the temporary nfp_meta pointer into nfp_prog. We will now use user space context offsets.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
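A hedged sketch of the resulting .ndo_bpf dispatch; the example_*() handlers are illustrative, while the command values are the ones listed in the message above:

```c
#include <linux/netdevice.h>

static int example_ndo_bpf(struct net_device *netdev, struct netdev_bpf *bpf)
{
	switch (bpf->command) {
	case BPF_OFFLOAD_VERIFIER_PREP:
		return example_verifier_prep(netdev, bpf);	/* alloc state */
	case BPF_OFFLOAD_TRANSLATE:
		return example_translate(netdev, bpf);		/* run translation */
	case XDP_SETUP_PROG:
		return example_setup_prog(netdev, bpf);		/* stop old, load, enable */
	case BPF_OFFLOAD_DESTROY:
		return example_destroy(netdev, bpf);		/* free program image */
	default:
		return -EINVAL;
	}
}
```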
-
Committed by Jakub Kicinski

struct nfp_prog is currently only used internally by the translator. This means there is a lot of parameter passing going on between the translator and the different stages of offload. Simplify things by allocating nfp_prog in offload.c already. We will now use kmalloc() to allocate the program area and only DMA map it for the time of loading (instead of allocating DMA coherent memory upfront).

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Jakub Kicinski

Most of the offload/translation prepare logic will be moved to offload.c. To help git generate more reasonable diffs, move the nfp_prog_prepare() and nfp_prog_free() functions there as a first step.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Jakub Kicinski

Firmware has supported live replacement of programs for quite some time now. Remove the software-fallback related logic and depend on the FW for program replace. Seamless reload will become a requirement if maps are present, anyway. The load and start stages have to be split now, since replace only needs a load; start has already been done on add.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Jakub Kicinski

We currently create a fake cls_bpf offload object when we want to offload XDP. Simplify and clarify the code by moving the TC/XDP-specific logic out of the common offload code. This is easy now that we don't support legacy TC actions. We only need the bpf program and the state of the skip_sw flag. Temporarily set @code to NULL in nfp_net_bpf_offload(); compilers seem to have trouble recognizing it's always initialized. The next patches will eliminate that variable.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Jakub Kicinski

The register renumbering was removed and will not be coming back in its old, naive form, given that it would be fundamentally incompatible with calling functions. Remove the leftovers.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Jakub Kicinski

Only support BPF_PROG_TYPE_SCHED_CLS programs in direct action mode. This simplifies preparing the offload since there will now be only one mode of operation for that type of program. We need to know the attachment mode of cls_bpf programs because exit codes are interpreted differently for legacy vs DA mode.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 02 November 2017, 1 commit
-
-
Committed by Jakub Kicinski

If the kernel config does not include BPF, just replace the BPF app handler with the handler for the basic NIC. The BPF app will now be built only if the BPF infrastructure is selected in the kernel config.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 27 October 2017, 1 commit
-
-
Committed by Kees Cook

In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() to pass the timer pointer explicitly.

Cc: Jakub Kicinski <jakub.kicinski@netronome.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jiri Pirko <jiri@mellanox.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Simon Horman <simon.horman@netronome.com>
Cc: oss-drivers@netronome.com
Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
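For context, the generic shape of this treewide conversion (struct and field names are illustrative, not taken from the converted drivers):

```c
#include <linux/timer.h>

struct example_dev {
	struct timer_list timer;
	unsigned long timeouts;
};

static void example_timeout(struct timer_list *t)
{
	/* from_timer() recovers the containing struct, replacing the old
	 * (unsigned long)data cast in the callback.
	 */
	struct example_dev *dev = from_timer(dev, t, timer);

	dev->timeouts++;
}

static void example_init(struct example_dev *dev)
{
	/* previously: setup_timer(&dev->timer, cb, (unsigned long)dev) */
	timer_setup(&dev->timer, example_timeout, 0);
}
```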
-
- 24 October 2017, 2 commits
-
-
Committed by Jakub Kicinski

To access beyond the 64th byte of the stack we need to set a new stack pointer register (LMEM is accessed indirectly through those pointers). Add a function for encoding the local CSR access instruction. Use stack pointer number 3. Note that stack pointer registers allow us to index into 32 bytes of LMEM (with shift operations, i.e. when operands are restricted). This means that if an access crosses a 32-byte boundary we must not use offsetting; we have to set the pointer to the exact address and move it with post-increments. We depend on the datapath placing the stack base address in GPR A22 for our use.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
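A small hedged helper mirroring that rule (the 32-byte window size is taken from the message; the helper itself is illustrative):

```c
/* An access that stays inside one 32-byte LMEM window can use
 * pointer + offset; one that would cross the boundary must set the
 * pointer to the exact address and walk with post-increments instead.
 */
static bool example_fits_lmem_window(unsigned int off, unsigned int size)
{
	return off / 32 == (off + size - 1) / 32;
}
```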
-
Committed by Jakub Kicinski

The stack is implemented by the LMEM register file. Unaligned accesses to LMEM are not allowed, and accesses also have to be 4B wide. To support the stack we need to make sure offsets of pointers are known at translation time (for now) and perform the correct load/mask/shift operations. Since we can access the first 64B of LMEM without much effort, support only stacks no bigger than 64B. Following commits will extend the possible sizes beyond that.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
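A host-side C sketch of the load/mask/shift idea for a 1-byte stack read; little-endian byte numbering within the word is assumed purely for illustration:

```c
#include <stdint.h>

/* LMEM only allows aligned 4B accesses, so the containing word is
 * loaded and the requested byte extracted.
 */
static uint8_t read_u8_via_word(const uint32_t *lmem, unsigned int byte_off)
{
	uint32_t word = lmem[byte_off / 4];		/* aligned 4B load */

	return (word >> ((byte_off % 4) * 8)) & 0xff;	/* shift + mask */
}
```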
-
- 05 August 2017, 1 commit
-
-
Committed by Jiri Pirko

The rest of the helpers are named tcf_exts_*, so change the names of the action number helpers to be aligned. While at it, change them to inline functions.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 01 June 2017, 3 commits
-
-
Committed by Jakub Kicinski

Allow apps to associate private data with vNICs and move the BPF-specific fields of nfp_net to such a structure.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Jakub Kicinski

Pure move of the eBPF offload files to the BPF app directory; only the names and relative header locations change. nfp_asm.h stays in the main dir and it doesn't really have to include nfp_bpf.h.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Jakub Kicinski

Forgetting to disable preemption around tcf_action_stats_update() seems to be a common mistake. Add a helper function for updating stats on all actions of a filter.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 01 May 2017, 1 commit
-
-
Committed by Jakub Kicinski

As Or points out in commit 423b3aec ("net/mlx4: Change ENOTSUPP to EOPNOTSUPP"), ENOTSUPP is an NFS-specific error. Replace it with EOPNOTSUPP.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 13 March 2017, 1 commit
-
-
Committed by Jakub Kicinski

Move all data path information into a separate structure. This way we will be able to allocate a new data path with all new rings etc. and swap it in easily. No functional changes.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 10 March 2017, 1 commit
-
-
Committed by Jakub Kicinski

We really only need the device pointer on the fast path; stash it at the beginning of the adapter structure and move the pci_dev pointer down. This saves a few lines of code.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 05 November 2016, 2 commits
-
-
Committed by Jakub Kicinski

Most infrastructure can be reused; provide separate handling of context offsets and exit codes.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Jakub Kicinski

nfp_net_bpf_offload() takes all the .setup_tc() parameters but it doesn't use them at the moment. Remove the unnecessary ones to make it possible for XDP to reuse this function.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 14 October 2016, 1 commit
-
-
Committed by Shmulik Ladkani

These accessors are used in various drivers that support tc offloading to detect properties of a given 'tc_action'. 'is_tcf_mirred_redirect' tests that the action is TCA_EGRESS_REDIR. 'is_tcf_mirred_mirror' tests that the action is TCA_EGRESS_MIRROR. As a prep towards supporting INGRESS redir/mirror, rename these predicates to reflect their true meaning:

  s/is_tcf_mirred_redirect/is_tcf_mirred_egress_redirect/
  s/is_tcf_mirred_mirror/is_tcf_mirred_egress_mirror/

Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Cc: Hariprasad S <hariprasad@chelsio.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Saeed Mahameed <saeedm@mellanox.com>
Cc: Jiri Pirko <jiri@mellanox.com>
Cc: Ido Schimmel <idosch@mellanox.com>
Cc: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 26 September 2016, 1 commit
-
-
Committed by Arnd Bergmann

I stumbled over a new warning during randconfig testing, with CONFIG_BPF_SYSCALL disabled:

  drivers/net/ethernet/netronome/nfp/nfp_net_offload.c: In function 'nfp_net_bpf_offload':
  drivers/net/ethernet/netronome/nfp/nfp_net_offload.c:263:3: error: '*((void *)&res+4)' may be used uninitialized in this function [-Werror=maybe-uninitialized]
  drivers/net/ethernet/netronome/nfp/nfp_net_offload.c:263:3: error: 'res.n_instr' may be used uninitialized in this function [-Werror=maybe-uninitialized]

As far as I can tell, this is a false positive caused by the compiler getting confused about a function that is partially inlined, but it's easy to avoid while improving the code: the nfp_bpf_jit() stub helper for that configuration is unusual as it is defined in a header file but not marked 'static inline'. By moving the compile-time check into the caller using the IS_ENABLED() macro, we can remove that stub and simplify the nfp_net_bpf_offload_prepare() function enough to unconfuse the compiler.

Fixes: 7533fdc0 ("nfp: bpf: add hardware bpf offload")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
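The fix boils down to the standard IS_ENABLED() pattern; a generic sketch follows (the private struct and callee are hypothetical stand-ins, not the driver's real nfp_bpf_jit() signature):

```c
#include <linux/kernel.h>

struct example_priv;

static int example_offload_prepare(struct example_priv *priv)
{
	/* compile-time constant: the whole offload path is discarded when
	 * CONFIG_BPF_SYSCALL is off, with no uninlined header stub needed
	 */
	if (!IS_ENABLED(CONFIG_BPF_SYSCALL))
		return -EOPNOTSUPP;

	return example_do_jit(priv);
}
```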
-