提交 · 3f6148e76a58eceef1435b65afecaf448b509cfd · openeuler / raspberrypi-kernel

25 7月, 2014 1 次提交

net/mlx4_en: mlx4_en_[gs]et_priv_flags() can be static · 3f6148e7

由 Fengguang Wu 提交于 7月 24, 2014

Fixes sparse warning intrduced by commit 0fef9d03 ("net/mlx4_en: Disable
blueflame using ethtool private flags")
Signed-off-by: NFengguang Wu <fengguang.wu@intel.com>
Signed-off-by: NAmir Vadai <amirv@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3f6148e7

23 7月, 2014 4 次提交

net/mlx4_en: Reduce memory consumption on kdump kernel · ea1c1af1

由 Amir Vadai 提交于 7月 22, 2014

When memory is limited, reduce number of rx and tx rings.
Signed-off-by: NAmir Vadai <amirv@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ea1c1af1

net/mlx4_core: Use low memory profile on kdump kernel · 2599d858

由 Amir Vadai 提交于 7月 22, 2014

When running in kdump kernel, reduce number of resources allocated for
the hardware. This will enable the NIC to operate in this low memory
environment at the expense of performance and some features not related
to the basic NIC functionality.
Signed-off-by: NAmir Vadai <amirv@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2599d858

net/mlx4_en: Disable blueflame using ethtool private flags · 0fef9d03

由 Amir Vadai 提交于 7月 22, 2014

Enable the user to turn off the hardware feature called BlueFlame.
Since it is something specific to mlx4_en hardware, we control
the feature via ethtool private flags.
Signed-off-by: NAmir Vadai <amirv@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0fef9d03

net/mlx4_en: current_mac isn't updated in port up · b94901f3

由 Eyal Perry 提交于 7月 22, 2014

When port is down dev_addr is changed (e.g. by bonding) but current_mac
is not touched. When port is up again, hash_mac is updated to dev_addr,
but current_mac isn't. This leads to inconsistency between current_mac
and mac_hash. Because of that, mlx4_en_replace_mac() fails to find
current_mac in mac_hash.

Fix is to reset current_mac to dev_addr when port is up - as we do for
mac_hash.
Signed-off-by: NEyal Perry <eyalpe@mellanox.com>
Signed-off-by: NAmir Vadai <amirv@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b94901f3

17 7月, 2014 6 次提交

net/mlx4_en: cq->irq_desc wasn't set in legacy EQ's · 858e6c32

由 Amir Vadai 提交于 7月 16, 2014

Fix a regression introduced by commit 35f6f453 ("net/mlx4_en: Don't use
irq_affinity_notifier to track changes in IRQ affinity map").
When core is started in legacy EQ's (number of IRQ's < rx rings), cq->irq_desc
was NULL.  This caused a kernel crash under heavy traffic - when having more
than rx NAPI budget completions.
Fixed to have it set for both EQ modes.
Signed-off-by: NAmir Vadai <amirv@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

858e6c32

net/mlx4_core: Remove MCG in case it is attached to promiscuous QPs only · 1645a540

由 Saeed Mahameed 提交于 7月 16, 2014

In B0 steering mode if promiscuous QP asks to be detached from MCG entry,
and it is the only one in this entry then the entry will never be deleted.
This is a wrong behavior since we don't want to keep those entries after
the promiscuous QP becomes non-promiscuous. Therefore remove steering
entry containing only promiscuous QP.
Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
Signed-off-by: NEugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: NAmir Vadai <amirv@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1645a540

net/mlx4_core: In SR-IOV mode host should add promisc QP to default entry only · 816e5984

由 Eugenia Emantayev 提交于 7月 16, 2014

In current situation host is adding the promiscuous QP to all steering
entries and the default entry as well. In this case when having PV
and SR-IOV on the same setup bridge will receive all traffic that is
targeted to the other VMs. This is bad.
Solution: In SR-IOV mode host can add promiscuous QP to default entry only.
The above problem and fix are relevant for B0 steering mode only.
Signed-off-by: NEugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: NAmir Vadai <amirv@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

816e5984

net/mlx4_core: Make sure the max number of QPs per MCG isn't exceeded · 75908376

由 Alexander Guller 提交于 7月 16, 2014

In B0 steering mode when adding QPs to the default MCG entry need
to check that maximal number of QPs per MCG entry was not exceeded.
Signed-off-by: NAlexander Guller <alexg@mellanox.com>
Reviewed-by: NAviad Yehezkel <aviadye@mellanox.co.il>
Signed-off-by: NEugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: NAmir Vadai <amirv@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

75908376

net/mlx4_core: Make sure that negative array index isn't used · aab2bb0e

由 Dotan Barak 提交于 7月 16, 2014

To make sure that the array index isn't used in the code with
negative value, we stop using the for loop integer iterator
outside of it.
>From now on use members count to swap the last QP with removed one.
Fix also the second occurrence of this flow in mlx4_qp_detach_common().
In mlx4_qp_detach_common() use members_count instead of
loop iterator outside of the for loop.
Signed-off-by: NDotan Barak <dotanb@dev.mellanox.co.il>
Reviewed-by: NYevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: NEugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: NAmir Vadai <amirv@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

aab2bb0e

net/mlx4_core: Fix leakage of SW multicast entries · 0d7869ac

由 Yevgeny Petrilin 提交于 7月 16, 2014

When removing multicast address in B0 steering mode there is
a bug in cases where there is a single QP registered for the address,
and this QP is also promiscuous. In such cases the entry wouldn't be
deleted from the SW structure representing all Ethernet MCG entries,
but would be removed in HW. This way when driver goes to remove it
from SW and HW structures the HW deletion fails.
Moreover the same index could later be used for registering
different address, which can be Infiniband.
Signed-off-by: NYevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: NEugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: NAmir Vadai <amirv@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0d7869ac

15 7月, 2014 1 次提交

mlx4: mark napi id for gro_skb · 32b333fe

由 Jason Wang 提交于 7月 14, 2014

Napi id was not marked for gro_skb, this will lead rx busy loop won't
work correctly since they stack never try to call low latency receive
method because of a zero socket napi id. Fix this by marking napi id
for gro_skb.

The transaction rate of 1 byte netperf tcp_rr gets about 50% increased
(from 20531.68 to 30610.88).

Cc: Amir Vadai <amirv@mellanox.com>
Signed-off-by: NJason Wang <jasowang@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

32b333fe

09 7月, 2014 7 次提交

net/mlx4_en: Ignore budget on TX napi polling · fbc6daf1

由 Amir Vadai 提交于 7月 08, 2014

It is recommended that TX work not count against the quota.
The cost of TX packet liberation is a minute percentage of what it costs to
process an RX frame. Furthermore, that SKB freeing makes memory available for
other paths in the stack.

Give the TX a larger budget and be more aggressive about cleaning up the Tx
descriptors this budget could be changed using ethtool:
$ ethtool -C eth1 tx-frames-irq <budget>
Signed-off-by: NAmir Vadai <amirv@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

fbc6daf1

net/mlx4_en: Fix mac_hash database inconsistency · 2695bab2

由 Noa Osherovich 提交于 7月 08, 2014

Using a local copy of dev_addr in mlx4_en_set_mac() to prevent dev_addr
from being modified during error flow or when dev_addr is modified in
another context (which is another problem that is being discussed over
the mailing list [1]).
Also fixing bad naming of priv->prev_mac into priv->current_mac.

[1] - http://patchwork.ozlabs.org/patch/351489/Reviewed-by: NEyal Perry <eyalpe@mellanox.com>
Signed-off-by: NNoa Osherovich <noaos@mellanox.com>
Signed-off-by: NAmir Vadai <amirv@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2695bab2

net/mlx4_en: Do not count LLC/SNAP in MTU calculation · d5b8dff0

由 Yishai Hadas 提交于 7月 08, 2014

LLC/SNAP 8 bytes should not be added as part of header calculation.
If used, payload will be decreased accordingly. For MTU of 1500
we'll set 1522 instead of 1523.
Signed-off-by: NYishai Hadas <yishaih@mellanox.com>
Reviewed-by: NLiran Liss <liranl@mellanox.com>
Signed-off-by: NEugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: NAmir Vadai <amirv@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d5b8dff0

net/mlx4_en: Do not disable vlan filter during promiscuous mode · 49a1e4f6

由 Eugenia Emantayev 提交于 7月 08, 2014

Promiscous mode is only for MACs.
Should not disable/enable VLAN filter when entering/leaving promisuous mode.
Signed-off-by: NAviad Yehezkel <aviadye@mellanox.co.il>
Signed-off-by: NEugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: NAmir Vadai <amirv@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

49a1e4f6

net/mlx4: Verify port number in __mlx4_unregister_mac · 143b3efb

由 Eugenia Emantayev 提交于 7月 08, 2014

Verify port number to avoid crashes if port number is outside the range.
Signed-off-by: NEli Cohen <eli@mellanox.com>
Signed-off-by: NEugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: NAmir Vadai <amirv@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

143b3efb

net/mlx4_en: Run loopback test only when port is up · 4359db1e

由 Eugenia Emantayev 提交于 7月 08, 2014

Loopback can't work when port is down.
Signed-off-by: NEugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: NAmir Vadai <amirv@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4359db1e

net/mlx4_en: Fix set port ratelimit for 40GE · 523ece88

由 Eugenia Emantayev 提交于 7月 08, 2014

In 40GE we can't use the default bw units for set ratelimit (100 Mbps)
since the max is 255*100 Mbps = 25 Gbps (not suited for 40GE), thus we need 1 Gbps units.
But for 10GE 1 Gbps units might be too bruit so we use the following solution.

For user set ratelimit <= 25 Gbps:
        use 100 Mbps units * user_ratelimit (* 10).

For user set ratelimit > 25 Gbps:
        use 1 Gbps units * user_ratelimit.

For user set unlimited ratelimit (0 Gbps):
        use 1 Gbps units * MAX_RATELIMIT_DEFAULT (57)

Note: any value > 58 will damage the FW ratelimit computation, so we allow
      a max and any higher value will be pulled down to 57.
Signed-off-by: NSagi Grimberg <sagig@mellanox.com>
Signed-off-by: NEugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: NAmir Vadai <amirv@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

523ece88

08 7月, 2014 1 次提交

net/mlx4_en: Don't configure the HW vxlan parser when vxlan offloading isn't set · e326f2f1

由 Or Gerlitz 提交于 7月 02, 2014

The add_vxlan_port ndo driver code was wrongly testing whether HW vxlan offloads
are supported by the device instead of checking if they are currently enabled.

This causes the driver to configure the HW parser to conduct matching for vxlan
packets but since no steering rules were set, vxlan packets are dropped on RX.

Fix that by doing the right test, as done in the del_vxlan_port ndo handler.

Fixes: 1b136de1 ('net/mlx4: Implement vxlan ndo calls')
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e326f2f1

03 7月, 2014 2 次提交

net/mlx4_en: IRQ affinity hint is not cleared on port down · bb273617

由 Amir Vadai 提交于 6月 29, 2014

Need to remove affinity hint at mlx4_en_deactivate_cq() and not at
mlx4_en_destroy_cq() - since affinity_mask might be free'd while still
being used by procfs.
Signed-off-by: NAmir Vadai <amirv@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

bb273617

net/mlx4_en: Don't use irq_affinity_notifier to track changes in IRQ affinity map · 35f6f453

由 Amir Vadai 提交于 6月 29, 2014

IRQ affinity notifier can only have a single notifier - cpu_rmap
notifier. Can't use it to track changes in IRQ affinity map.
Detect IRQ affinity changes by comparing CPU to current IRQ affinity map
during NAPI poll thread.

CC: Thomas Gleixner <tglx@linutronix.de>
CC: Ben Hutchings <ben@decadent.org.uk>
Fixes: 2eacc23c ("net/mlx4_core: Enforce irq affinity changes immediatly")
Signed-off-by: NAmir Vadai <amirv@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

35f6f453

23 6月, 2014 1 次提交

net/mlx4_core: Fix the error flow when probing with invalid VF configuration · 960b1f45

由 Or Gerlitz 提交于 6月 22, 2014

Single ported VF are currently not supported on configurations where
one or both ports are IB. When we hit this case, the relevant flow in
the driver didn't return error and jumped to the wrong label. Fix that.

Fixes: dd41cc3b ('net/mlx4: Adapt num_vfs/probed_vf params for single port VF')
Reported-by: NShirley Ma <shirley.ma@oracle.com>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

960b1f45

12 6月, 2014 1 次提交

net/mlx4_en: Use affinity hint · 9e311e77

由 Yuval Atias 提交于 6月 09, 2014

The “affinity hint” mechanism is used by the user space
daemon, irqbalancer, to indicate a preferred CPU mask for irqs.
Irqbalancer can use this hint to balance the irqs between the
cpus indicated by the mask.

We wish the HCA to preferentially map the IRQs it uses to numa cores
close to it.  To accomplish this, we use cpumask_set_cpu_local_first(), that
sets the affinity hint according the following policy:
First it maps IRQs to “close” numa cores.  If these are exhausted, the
remaining IRQs are mapped to “far” numa cores.
Signed-off-by: NYuval Atias <yuvala@mellanox.com>
Signed-off-by: NAmir Vadai <amirv@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9e311e77

11 6月, 2014 2 次提交

net/mlx4_core: Keep only one driver entry release mlx4_priv · da1de8df

由 Wei Yang 提交于 6月 08, 2014

Following commit befdf897 "net/mlx4_core: Preserve pci_dev_data after
__mlx4_remove_one()", there are two mlx4 pci callbacks which will
attempt to release the mlx4_priv object -- .shutdown and .remove.

This leads to a use-after-free access to the already freed mlx4_priv
instance and trigger a "Kernel access of bad area" crash when both
.shutdown and .remove are called.

During reboot or kexec, .shutdown is called, with the VFs probed to
the host going through shutdown first and then the PF. Later, the PF
will trigger VFs' .remove since VFs still have driver attached.

Fix that by keeping only one driver entry which releases mlx4_priv.

Fixes: befdf897 ('net/mlx4_core: Preserve pci_dev_data after __mlx4_remove_one()')
CC: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: NWei Yang <weiyang@linux.vnet.ibm.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

da1de8df

net/mlx4_core: Fix SRIOV free-pool management when enforcing resource quotas · 95646373

由 Jack Morgenstein 提交于 6月 08, 2014

The Hypervisor driver tracks free slots and reserved slots at the global level
and tracks allocated slots and guaranteed slots per VF.

Guaranteed slots are treated as reserved by the driver, so the total
reserved slots is the sum of all guaranteed slots over all the VFs.

As VFs allocate resources, free (global) is decremented and allocated (per VF)
is incremented for those resources. However, reserved (global) is never changed.

This means that effectively, when a VF allocates a resource from its
guaranteed pool, it is actually reducing that resource's free pool (since
the global reserved count was not also reduced).

The fix for this problem is the following: For each resource, as long as a
VF's allocated count is <= its guaranteed number, when allocating for that
VF, the reserved count (global) should be reduced by the allocation as well.

When the global reserved count reaches zero, the remaining global free count
is still accessible as the free pool for that resource.

When the VF frees resources, the reverse happens: the global reserved count
for a resource is incremented only once the VFs allocated number falls below
its guaranteed number.

This fix was developed by Rick Kready <kready@us.ibm.com>
Reported-by: NRick Kready <kready@us.ibm.com>
Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

95646373

07 6月, 2014 1 次提交
- J
  net: use SPEED_UNKNOWN and DUPLEX_UNKNOWN when appropriate · 537fae01
  由 Jiri Pirko 提交于 6月 06, 2014
```
Signed-off-by: NJiri Pirko <jiri@resnulli.us>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
```
  537fae01
05 6月, 2014 1 次提交

mlx4_core: Fix GFP flags parameters to be gfp_t · 4e2c341b

由 Roland Dreier 提交于 6月 04, 2014

Otherwise sparse gives a bunch of warnings like

drivers/net/ethernet/mellanox/mlx4/srq.c:110:66: sparse: incorrect type in argument 4 (different base types)
drivers/net/ethernet/mellanox/mlx4/srq.c:110:66: expected int [signed] gfp
drivers/net/ethernet/mellanox/mlx4/srq.c:110:66: got restricted gfp_t
Signed-off-by: NRoland Dreier <roland@purestorage.com>

4e2c341b

03 6月, 2014 2 次提交

ethtool: Replace ethtool_ops::{get,set}_rxfh_indir() with {get,set}_rxfh() · fe62d001

由 Ben Hutchings 提交于 5月 15, 2014

ETHTOOL_{G,S}RXFHINDIR and ETHTOOL_{G,S}RSSH should work for drivers
regardless of whether they expose the hash key, unless you try to
set a hash key for a driver that doesn't expose it.
Signed-off-by: NBen Hutchings <ben@decadent.org.uk>
Acked-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

fe62d001

IB/mlx4: Implement IB_QP_CREATE_USE_GFP_NOIO · 40f2287b

由 Jiri Kosina 提交于 5月 11, 2014

Modify the various routines used to allocate memory resources which
serve QPs in mlx4 to get an input GFP directive.  Have the Ethernet
driver to use GFP_KERNEL in it's QP allocations as done prior to this
commit, and the IB driver to use GFP_NOIO when the IB verbs
IB_QP_CREATE_USE_GFP_NOIO QP creation flag is provided.
Signed-off-by: NMel Gorman <mgorman@suse.de>
Signed-off-by: NJiri Kosina <jkosina@suse.cz>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

40f2287b

02 6月, 2014 2 次提交

D
Revert "net/mlx4_en: Use affinity hint" · 96b2e73c
由 David S. Miller 提交于 6月 02, 2014
```
This reverts commit 70a640d0.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
```
96b2e73c

net/mlx4_en: Use affinity hint · 70a640d0

由 Yuval Atias 提交于 5月 25, 2014

The “affinity hint” mechanism is used by the user space
daemon, irqbalancer, to indicate a preferred CPU mask for irqs.
Irqbalancer can use this hint to balance the irqs between the
cpus indicated by the mask.

We wish the HCA to preferentially map the IRQs it uses to numa cores
close to it.  To accomplish this, we use cpumask_set_cpu_local_first(), that
sets the affinity hint according the following policy:
First it maps IRQs to “close” numa cores.  If these are exhausted, the
remaining IRQs are mapped to “far” numa cores.
Signed-off-by: NYuval Atias <yuvala@mellanox.com>
Signed-off-by: NAmir Vadai <amirv@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

70a640d0

31 5月, 2014 2 次提交

net/mlx4_core: Reset RoCE VF gids when guest driver goes down · 111c6094

由 Jack Morgenstein 提交于 5月 27, 2014

Reset the GIDs assigned to a VF in the port RoCE GID table when
that guest goes down (either crashes or goes down cleanly).

As part of this fix, we refactor the RoCE gid table driver copy,
moving it to the mlx4_port_info structure (together with the MAC
and VLAN tables).

As with the MAC and VLAN tables, we now use a mutex per port
for the GID table so that modifying the driver copy and
modifying the firmware copy of a port GID table becomes an
atomic operation (thus avoiding driver-copy/FW-copy mismatches).
Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

111c6094

mlx4_core: Move handling of MLX4_QP_ST_MLX to proper switch statement · 165cb465

由 Roland Dreier 提交于 5月 30, 2014

The handling of MLX4_QP_ST_MLX in verify_qp_parameters() was
accidentally put inside the inner switch statement (that handles which
transition of RC/UC/XRC QPs is happening).  Fix this by moving the case
to the outer switch statement.

The compiler pointed this out with:

    drivers/net/ethernet/mellanox/mlx4/resource_tracker.c: In function 'verify_qp_parameters':
 >> drivers/net/ethernet/mellanox/mlx4/resource_tracker.c:2875:3: warning: case value '7' not in enumerated type 'enum qp_transition' [-Wswitch]
       case MLX4_QP_ST_MLX:
Reported-by: Nkbuild test robot <fengguang.wu@intel.com>
Fixes: 99ec41d0 ("mlx4: Add infrastructure for selecting VFs to enable QP0 via MLX proxy QPs")
Signed-off-by: NRoland Dreier <roland@purestorage.com>

165cb465

30 5月, 2014 5 次提交

IB/mlx4: Add interface for selecting VFs to enable QP0 via MLX proxy QPs · 65fed8a8

由 Jack Morgenstein 提交于 5月 29, 2014

This commit adds the sysfs interface for enabling QP0 on VFs for
selected VF/port.

By default, no VFs are enabled for QP0 operation.

To enable QP0 operation on a VF/port, under
/sys/class/infiniband/mlx4_x/iov/<b:d:f>/ports/x there are two new entries:

- smi_enabled (read-only). Indicates whether smi is currently
  enabled for the indicated VF/port

- enable_smi_admin (rw). Used by the admin to request that smi
  capability be enabled or disabled for the indicated VF/port.
  0 = disable, 1 = enable.
  The requested enablement will occur at the next reset of the
  VF (e.g. driver restart on the VM which owns the VF).
Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

65fed8a8

mlx4: Add infrastructure for selecting VFs to enable QP0 via MLX proxy QPs · 99ec41d0

由 Jack Morgenstein 提交于 5月 29, 2014

This commit adds the infrastructure for enabling selected VFs to
operate SMI (QP0) MADs without restriction.

Additionally, for these enabled VFs, their QP0 proxy and tunnel QPs
are MLX QPs.  As such, they operate over VL15.  Therefore, they are
not affected by "credit" problems or changes in the VLArb table (which
may shut down VL0).

Non-enabled VFs may only create UD proxy QP0 qps (which are forced by
the hypervisor to send packets using the q-key it assigns and places
in the qp-context).  Thus, non-enabled VFs will not pose a security
risk.  The hypervisor discards any privileged MADs it receives from
these non-enabled VFs.

By default, all VFs are NOT enabled, and must explicitly be enabled
by the administrator.

The sysfs interface which operates the VF enablement infrastructure
is provided in the next commit.
Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

99ec41d0

IB/mlx4: Preparation for VFs to issue/receive SMI (QP0) requests/responses · 97982f5a

由 Jack Morgenstein 提交于 5月 29, 2014

Currently, VFs in SRIOV VFs are denied QP0 access.  The main reason
for this decision is security, since Subnet Management Datagrams
(SMPs) are not restricted by network partitioning and may affect the
physical network topology.  Moreover, even the SM may be denied access
from portions of the network by setting management keys unknown to the
SM.

However, it is desirable to grant SMI access to certain privileged
VFs, so that certain network management activities may be conducted
within virtual machines instead of the hypervisor.

This commit does the following:

1. Create QP0 tunnel QPs for all VFs.

2. Discard SMI mads sent-from/received-for non-privileged VFs in the
   hypervisor MAD multiplex/demultiplex logic.  SMI mads from/for
   privileged VFs are allowed to pass.

3. MAD_IFC wrapper changes/fixes.  For non-privileged VFs, only
   host-view MAD_IFC commands are allowed, and only for SMI LID-Routed
   GET mads.  For privileged VFs, there are no restrictions.

This commit does not allow privileged VFs as yet.  To determine if a VF
is privileged, it calls function mlx4_vf_smi_enabled().  This function
returns 0 unconditionally for now.

The next two commits allow defining and activating privileged VFs.
Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

97982f5a

mlx4_core: Fix incorrect FLAGS1 bitmap test in mlx4_QUERY_FUNC_CAP · bc82878b

由 Jack Morgenstein 提交于 5月 29, 2014

Commit eb17711b ("net/mlx4_core: Introduce nic_info new flag in
QUERY_FUNC_CAP") did:

	if (func_cap->flags1 & QUERY_FUNC_CAP_FLAGS1_OFFSET) {

which should be:

	if (func_cap->flags1 & QUERY_FUNC_CAP_FLAGS1_FORCE_VLAN) {

Fix that.

Fixes: eb17711b ("net/mlx4_core: Introduce nic_info new flag in QUERY_FUNC_CAP")
Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

bc82878b

mlx4_core: Fix memory leaks in SR-IOV error paths · b38f2879

由 Dotan Barak 提交于 5月 29, 2014

Fix a few memory leaks that happen if errors happen in SR-IOV mode.
Signed-off-by: NDotan Barak <dotanb@dev.mellanox.co.il>
Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

b38f2879

24 5月, 2014 1 次提交

net-next:v4: Add support to configure SR-IOV VF minimum and maximum Tx rate through ip tool. · ed616689

由 Sucheta Chakraborty 提交于 5月 22, 2014

o min_tx_rate puts lower limit on the VF bandwidth. VF is guaranteed
  to have a bandwidth of at least this value.
  max_tx_rate puts cap on the VF bandwidth. VF can have a bandwidth
  of up to this value.

o A new handler set_vf_rate for attr IFLA_VF_RATE has been introduced
  which takes 4 arguments:
  netdev, VF number, min_tx_rate, max_tx_rate

o ndo_set_vf_rate replaces ndo_set_vf_tx_rate handler.

o Drivers that currently implement ndo_set_vf_tx_rate should now call
  ndo_set_vf_rate instead and reject attempt to set a minimum bandwidth
  greater than 0 for IFLA_VF_TX_RATE when IFLA_VF_RATE is not yet
  implemented by driver.

o If user enters only one of either min_tx_rate or max_tx_rate, then,
  userland should read back the other value from driver and set both
  for IFLA_VF_RATE.
  Drivers that have not yet implemented IFLA_VF_RATE should always
  return min_tx_rate as 0 when read from ip tool.

o If both IFLA_VF_TX_RATE and IFLA_VF_RATE options are specified, then
  IFLA_VF_RATE should override.

o Idea is to have consistent display of rate values to user.

o Usage example: -

  ./ip link set p4p1 vf 0 rate 900

  ./ip link show p4p1
  32: p4p1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode
  DEFAULT qlen 1000
    link/ether 00:0e:1e:08:b0:f0 brd ff:ff:ff:ff:ff:ff
    vf 0 MAC 3e:a0:ca:bd:ae:5a, tx rate 900 (Mbps), max_tx_rate 900Mbps
    vf 1 MAC f6:c6:7c:3f:3d:6c
    vf 2 MAC 56:32:43:98:d7:71
    vf 3 MAC d6:be:c3:b5:85:ff
    vf 4 MAC ee:a9:9a:1e:19:14
    vf 5 MAC 4a:d0:4c:07:52:18
    vf 6 MAC 3a:76:44:93:62:f9
    vf 7 MAC 82:e9:e7:e3:15:1a

  ./ip link set p4p1 vf 0 max_tx_rate 300 min_tx_rate 200

  ./ip link show p4p1
  32: p4p1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode
  DEFAULT qlen 1000
    link/ether 00:0e:1e:08:b0:f0 brd ff:ff:ff:ff:ff:ff
    vf 0 MAC 3e:a0:ca:bd:ae:5a, tx rate 300 (Mbps), max_tx_rate 300Mbps,
    min_tx_rate 200Mbps
    vf 1 MAC f6:c6:7c:3f:3d:6c
    vf 2 MAC 56:32:43:98:d7:71
    vf 3 MAC d6:be:c3:b5:85:ff
    vf 4 MAC ee:a9:9a:1e:19:14
    vf 5 MAC 4a:d0:4c:07:52:18
    vf 6 MAC 3a:76:44:93:62:f9
    vf 7 MAC 82:e9:e7:e3:15:1a

  ./ip link set p4p1 vf 0 max_tx_rate 600 rate 300

  ./ip link show p4p1
  32: p4p1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode
  DEFAULT qlen 1000
    link/ether 00:0e:1e:08:b0:f brd ff:ff:ff:ff:ff:ff
    vf 0 MAC 3e:a0:ca:bd:ae:5, tx rate 600 (Mbps), max_tx_rate 600Mbps,
    min_tx_rate 200Mbps
    vf 1 MAC f6:c6:7c:3f:3d:6c
    vf 2 MAC 56:32:43:98:d7:71
    vf 3 MAC d6:be:c3:b5:85:ff
    vf 4 MAC ee:a9:9a:1e:19:14
    vf 5 MAC 4a:d0:4c:07:52:18
    vf 6 MAC 3a:76:44:93:62:f9
    vf 7 MAC 82:e9:e7:e3:15:1a
Signed-off-by: NSucheta Chakraborty <sucheta.chakraborty@qlogic.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ed616689