提交 · 9e311e77a85e37b5caec3d64c3593cd52b2cdb71 · xiphi1978 / linux

12 6月, 2014 12 次提交

net/mlx4_en: Use affinity hint · 9e311e77

由 Yuval Atias 提交于 6月 09, 2014

The “affinity hint” mechanism is used by the user space
daemon, irqbalancer, to indicate a preferred CPU mask for irqs.
Irqbalancer can use this hint to balance the irqs between the
cpus indicated by the mask.

We wish the HCA to preferentially map the IRQs it uses to numa cores
close to it.  To accomplish this, we use cpumask_set_cpu_local_first(), that
sets the affinity hint according the following policy:
First it maps IRQs to “close” numa cores.  If these are exhausted, the
remaining IRQs are mapped to “far” numa cores.
Signed-off-by: NYuval Atias <yuvala@mellanox.com>
Signed-off-by: NAmir Vadai <amirv@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9e311e77

cpumask: Utility function to set n'th cpu - local cpu first · da91309e

由 Amir Vadai 提交于 6月 09, 2014

This function sets the n'th cpu - local cpu's first.
For example: in a 16 cores server with even cpu's local, will get the
following values:
cpumask_set_cpu_local_first(0, numa, cpumask) => cpu 0 is set
cpumask_set_cpu_local_first(1, numa, cpumask) => cpu 2 is set
...
cpumask_set_cpu_local_first(7, numa, cpumask) => cpu 14 is set
cpumask_set_cpu_local_first(8, numa, cpumask) => cpu 1 is set
cpumask_set_cpu_local_first(9, numa, cpumask) => cpu 3 is set
...
cpumask_set_cpu_local_first(15, numa, cpumask) => cpu 15 is set

Curently this function will be used by multi queue networking devices to
calculate the irq affinity mask, such that as many local cpu's as
possible will be utilized to handle the mq device irq's.
Signed-off-by: NAmir Vadai <amirv@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

da91309e

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next · d4f38620

由 David S. Miller 提交于 6月 11, 2014

Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates 2014-06-11

This series contains updates to igb, i40e and i40evf.

Todd makes a change to igb to un-hide invariant returns by getting rid of
the E1000_SUCCESS define and converting those returns to return 0.

Jacob separates the hardware logic from the set function, so that we can
re-use it during a ptp_reset in igb.  This enables the reset to return
functionality to the last know timestamp mode, rather than resetting the
value.

Ashish implements context flags for headwb and headwb_addr so that we
do not have to keep them always enabled.

Shannon updates the admin queue API for the new firmware, which adds
set_pf_content, nvm_config_read/write, replaces set_phy_reset with
set_phy_debug and removes nvm_read/write_reg_se.  Cleans up the driver
to use the stored base_queue value since there is no need to read the
PCI register for the PF's base queue on every single transmit queue
enable and disable as we already have the value stored from reading
the capability features at startup.

Anjali changes the notion of source and destination for FD_SB in ethtool
to align i40e with other drivers.  Adds flow director statistics to
the PF stats.  Fixes a bug in ethtool for flow director drop packet
filter where the drop action comes down as a ring_cookie value, so allow
it as a special value that can be used to configure destination control.

Mitch fixes the i40evf to keep the driver from going down when it is
already in a down state.  This prevents a CPU soft lock in napi_disable().
Also change the i40evf to check the admin queue error bits since the
firmware can indicate any admin queue error states to the driver via
some bits in the length registers.

Neerav separates out the DCB capability and enabled flags because currently
if the firmware reports DCB capability the driver enables
I40E_FLAG_DCB_ENABLED flag.  When this flag is enabled the driver inserts
a tag when transmitting a packet from the port even if there are no DCB
traffic classes configured at the port.  So by adding the additional flag,
I40E_FLAG_DCB_CAPABLE, that will be set when the DCB capability is present
and the existing enabled flag will only be set if there are more than one
traffic classes configured at the port.

Greg fixes the i40e driver to not automatically accept tagged packets by
default so that the system must request a VLAN tag packet filter to get
packets with that tag.  Greg also converts i40e to use the in-kernel
ether_addr_copy() instead of mempcy().

Jesse removes the FTYPE field from the receive descriptor to match the
hardware implementation.
====================
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d4f38620

Merge branch 'sctp-next' · 813ebbbf

由 David S. Miller 提交于 6月 11, 2014

Daniel Borkmann says:

====================
SCTP update

This set contains transport path selection improvements in
SCTP. Please see individual patches for details.
====================
Acked-by: NVlad Yasevich <vyasevich@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

813ebbbf

net: sctp: fix incorrect type in gfp initializer · 9b87d465

由 Daniel Borkmann 提交于 6月 11, 2014

This fixes the following sparse warning:

net/sctp/associola.c:1556:29: warning: incorrect type in initializer (different base types)
net/sctp/associola.c:1556:29: expected bool [unsigned] [usertype] preload
net/sctp/associola.c:1556:29: got restricted gfp_t
Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9b87d465

net: sctp: improve sctp_select_active_and_retran_path selection · a7288c4d

由 Daniel Borkmann 提交于 6月 11, 2014

In function sctp_select_active_and_retran_path(), we walk the
transport list in order to look for the two most recently used
ACTIVE transports (trans_pri, trans_sec). In case we didn't find
anything ACTIVE, we currently just camp on a possibly PF or
INACTIVE transport that is primary path; this behavior actually
dates back to linux-history tree of the very early days of
lksctp, and can yield a behavior that chooses suboptimal
transport paths.

Instead, be a bit more clever by reusing and extending the
recently introduced sctp_trans_elect_best() handler. In case
both transports are evaluated to have the same score resulting
from their states, break the tie by looking at: 1) transport
patch error count 2) last_time_heard value from each transport.

This is analogous to Nishida's Quick Failover draft [1],
section 5.1, 3:

  The sender SHOULD avoid data transmission to PF destinations.
  When all destinations are in either PF or Inactive state,
  the sender MAY either move the destination from PF to active
  state (and transmit data to the active destination) or the
  sender MAY transmit data to a PF destination. In the former
  scenario, (i) the sender MUST NOT notify the ULP about the
  state transition, and (ii) MUST NOT clear the destination's
  error counter. It is recommended that the sender picks the
  PF destination with least error count (fewest consecutive
  timeouts) for data transmission. In case of a tie (multiple PF
  destinations with same error count), the sender MAY choose the
  last active destination.

Thus for sctp_select_active_and_retran_path(), we keep track of
the best, if any, transport that is in PF state and in case no
ACTIVE transport has been found (hence trans_{pri,sec} is NULL),
we select the best out of the three: current primary_path and
retran_path as well as a possible PF transport.

The secondary may still camp on the original primary_path as
before. The change in sctp_trans_elect_best() with a more fine
grained tie selection also improves at the same time path selection
for sctp_assoc_update_retran_path() in case of non-ACTIVE states.

  [1] http://tools.ietf.org/html/draft-nishida-tsvwg-sctp-failover-05Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a7288c4d

net: sctp: migrate most recently used transport to ktime · e575235f

由 Daniel Borkmann 提交于 6月 11, 2014

Be more precise in transport path selection and use ktime
helpers instead of jiffies to compare and pick the better
primary and secondary recently used transports. This also
avoids any side-effects during a possible roll-over, and
could lead to better path decision-making.
Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e575235f

net: sctp: refactor active path selection · b82e8f31

由 Daniel Borkmann 提交于 6月 11, 2014

This patch just refactors and moves the code for the active
path selection into its own helper function outside of
sctp_assoc_control_transport() which is already big enough.
No functional changes here.
Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b82e8f31

ktime: add ktime_after and ktime_before helper · 67cb9366

由 Daniel Borkmann 提交于 6月 11, 2014

Add two minimal helper functions analogous to time_before() and
time_after() that will later on both be needed by SCTP code.
Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

67cb9366

Merge branch 'mac802154' · 9181a6bd

由 David S. Miller 提交于 6月 11, 2014

Phoebe Buckheister says:

====================
Recent llsec code introduced a memory leak on decryption failures during rx.
This fixes said leak, and optimizes the receive loops for monitor and wpan
devices to only deliver skbs to devices that are actually up. Also changes a
dev_kfree_skb to kfree_skb when an invalid packet is dropped before being
pushed into the stack.
====================
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9181a6bd

mac802154: don't deliver packets to devices that are down · 2d3b5b0a

由 Phoebe Buckheister 提交于 6月 11, 2014

Only one WPAN devices can be active at any given time, so only deliver
packets to that one interface that is actually up. Multiple monitors may
be up at any given time, but we don't have to deliver to monitors that
are down either.
Signed-off-by: NPhoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2d3b5b0a

mac802154: properly free incoming skbs on decryption failure · a374eeb5

由 Phoebe Buckheister 提交于 6月 11, 2014

mac802154 RX did not free skbs on decryption failure, assuming that the
caller would when the local rx handler returned _DROP. This was false.
Signed-off-by: NPhoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a374eeb5

11 6月, 2014 28 次提交

i40e/i40evf: Bump i40e to version 0.4.10 and i40evf to 0.9.34 · f8320902

由 Catherine Sullivan 提交于 5月 22, 2014

Bump versions.

Change-ID: Ic4a84354955061ca18321b1e97c9c30fe1563b5c
Signed-off-by: NCatherine Sullivan <catherine.sullivan@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

f8320902

i40e: use stored base_queue value · dfb699f9

由 Shannon Nelson 提交于 5月 22, 2014

No need to read the PCI register for the PF's base queue on every single Tx
queue enable and disable as we already have the value stored from reading
the capability features at startup.

Change-ID: Ic02fb622757742f43cb8269369c3d972d4f66555
Signed-off-by: NShannon Nelson <shannon.nelson@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

dfb699f9

i40e: Fix a bug in ethtool for FD drop packet filter action · 387ce1a9

由 Anjali Singhai Jain 提交于 5月 22, 2014

A drop action comes down as a ring_cookie value, so allow it as
a special value that can be used to configure destination control.

Also fix the output to filter read command accordingly.

Change-ID: I9956723cee42f3194885403317dd21ed4a151144
Signed-off-by: NAnjali Singhai Jain <anjali.singhai@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

387ce1a9

i40e/i40evf: Add Flow director stats to PF stats · 433c47de

由 Anjali Singhai Jain 提交于 5月 22, 2014

Add members to stat struct to keep track of Flow director ATR and
SideBand filter packet matches.

Change-ID: Ibbb31a53c7adcc2bb96991dd80565442a2f2513c
Signed-off-by: NAnjali Singhai Jain <anjali.singhai@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

433c47de

i40e/i40evf: remove FTYPE · 2c50ef80

由 Jesse Brandeburg 提交于 5月 22, 2014

This change drops the FTYPE field from the Rx descriptor, to
match the hardware implementation.

Change-ID: I66d31d2b43861da45e8ace4fb03df033abe88bab
Signed-off-by: NJesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

2c50ef80

i40evf: check admin queue error bits · 912257e5

由 Mitch Williams 提交于 5月 22, 2014

FW can indicate any admin queue error states to the driver via some bits
in the length registers. Each time we process an admin queue message,
check these bits and log any errors we find. Since the VF really can't
do much, we just print the message and depend on the PF driver to clear
things up on our behalf.

Change-ID: I92bc6c53ce3b4400544e0ca19c5de2d27490bd0d
Signed-off-by: NMitch Williams <mitch.a.williams@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

912257e5

i40e/i40evf: User ether_addr_copy instead of memcpy · 9a173901

由 Greg Rose 提交于 5月 22, 2014

Linux gives us a function to copy Ethernet MAC addresses, let's use it.

Change-ID: I0c861900029ca5ea65a53ca39565852fb633f6fd
Signed-off-by: NGreg Rose <gregory.v.rose@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

9a173901

i40e: Do not accept tagged packets by default · 8c27d42e

由 Greg Rose 提交于 5月 22, 2014

Remove the filter created by the firmware with the default MAC address it
reads out of the NVM storage and a promiscuous VLAN tag and replace it
with a filter that will not accept tagged packets by default. The system
must request a VLAN tag packet filter to get packets with that tag.

Change-ID: I119e6c3603a039bd68282ba31bf26f33a575490a
Signed-off-by: NGreg Rose <gregory.v.rose@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

8c27d42e

i40e: Separate out DCB capability and enabled flags · 4d9b6043

由 Neerav Parikh 提交于 5月 22, 2014

Currently if the firmware reports DCB capability the driver enables
I40E_FLAG_DCB_ENABLED flag. When this flag is enabled the driver
inserts a tag when transmitting a packet from the port even if there
are no DCB traffic classes configured at the port.

This patch adds a new flag I40E_FLAG_DCB_CAPABLE that will be set
when the DCB capability is present and the existing flag
I40E_FLAG_DCB_ENABLED will be set only if there are more than one
traffic classes configured at the port.

Change-ID: I24ccbf53ef293db2eba80c8a9772acf729795bd5
Signed-off-by: NNeerav Parikh <neerav.parikh@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

4d9b6043

i40evf: don't go further down · ddf0b3a6

由 Mitch Williams 提交于 5月 22, 2014

If the device is down, there's no place to go but up, so don't try to go
down even more. This prevents a CPU soft lock in napi_disable().

Change-ID: I8b058b9ee974dfa01c212fae2597f4f54b333314
Signed-off-by: NMitch Williams <mitch.a.williams@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

ddf0b3a6

i40e: Change the notion of src and dst for FD_SB in ethtool · 04b73bd7

由 Anjali Singhai Jain 提交于 5月 22, 2014

In XL710 devices we program FD filter's fields from Tx perspective of the flow.
However the user interface exposed in ethtool should be compliant with the
previous generation of drivers where a filter src and dst field are from
the RX perspective. This patch changes the ethtool interface in this regard
to match the other drivers.

Change-ID: Iec6ccddd87357c4fb53ccf33aa0fae699faf70cf
Signed-off-by: NAnjali Singhai Jain <anjali.singhai@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

04b73bd7

i40e/i40evf: AdminQ API update for new FW · f94234ee

由 Shannon Nelson 提交于 5月 22, 2014

Add set_pf_context, replace set_phy_reset with set_phy_debug, add
nvm_config_read/write, remove nvm_read/write_reg_se and add some
PHY types.

With these changes we bump the API version to 1.2.

Change-ID: I4dc3aec175c2316f66fc9b726b3f7d594699d84e
Signed-off-by: NShannon Nelson <shannon.nelson@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

f94234ee

i40e/i40evf: set headwb Tx context flags and use them · 5d29896a

由 Ashish Shah 提交于 5月 22, 2014

Set appropriate fields in Tx queue configuration virtchnl message
to pf to enable headwb and setup headwb addr.
Then use that info from the VF to set headwb and headwb_addr instead of
always enabling them.

Change-ID: I7d393d1b2b07f0f3355b3a4f7c2d3c6ee3b0d622
Signed-off-by: NAshish Shah <ashish.n.shah@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

5d29896a

igb: separate hardware setting from the set_ts_config ioctl · 9f62ecf4

由 Jacob Keller 提交于 6月 05, 2014

This patch separates the hardware logic from the set function, so that
we can re-use it during a ptp_reset. This enables the reset to return
functionality to the last known timestamp mode, rather than resetting
the value. We initialize the mode to off during the ptp_init cycle.
Signed-off-by: NJacob Keller <jacob.e.keller@intel.com>
Tested-by: NAaron Brown <aaron.f.brown@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

9f62ecf4

igb: unhide invariant returns · 23d87824

由 Todd Fujinaka 提交于 6月 04, 2014

Return a 0 directly rather than a constant.
Reported-by: NPeter Senna Tschudin <peter.senna@gmail.com>
Signed-off-by: NTodd Fujinaka <todd.fujinaka@intel.com>
Tested-by: NAaron Brown <aaron.f.brown@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

23d87824

amd-xgbe: Rename MAX_DMA_CHANNELS to avoid powerpc conflict · d5c48582

由 Lendacky, Thomas 提交于 6月 09, 2014

MAX_DMA_CHANNELS is defined in asm/scatterlist.h of the powerpc
architecture.  Rename this #define in xgbe.h to avoid the
redefined warning issued during compilation.
Reported-by: NFengguang Wu <fengguang.wu@intel.com>
Signed-off-by: NTom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d5c48582

farsync: Fix confusion about DMA address and buffer offset types · 581d9baa

由 Ben Hutchings 提交于 6月 08, 2014

Use dma_addr_t for DMA address parameters and u32 for shared memory
offset parameters.

Do not assume that dma_addr_t is the same as unsigned long; it will
not be in PAE configurations.  Truncate DMA addresses to 32 bits when
printing them.  This is OK because the DMA mask for this device is
32-bit (per default).

Also rename the DMA address parameters from 'skb' to 'dma'.

Compile-tested only.
Signed-off-by: NBen Hutchings <ben@decadent.org.uk>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

581d9baa

Merge branch 'mlx4' · 7e1a61b6

由 David S. Miller 提交于 6月 11, 2014

Or Gerlitz says:

====================
mlx4 SRIOV fixes

The patch from Wei Yang is a designed fix to a regression introduced by earlier commit
of him. Jack added a fix to the resource management which we got from IBM.

Let's get that into 3.16-rc1 1st and later see to what stable version/s this should go.
====================
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

7e1a61b6

net/mlx4_core: Keep only one driver entry release mlx4_priv · da1de8df

由 Wei Yang 提交于 6月 08, 2014

Following commit befdf897 "net/mlx4_core: Preserve pci_dev_data after
__mlx4_remove_one()", there are two mlx4 pci callbacks which will
attempt to release the mlx4_priv object -- .shutdown and .remove.

This leads to a use-after-free access to the already freed mlx4_priv
instance and trigger a "Kernel access of bad area" crash when both
.shutdown and .remove are called.

During reboot or kexec, .shutdown is called, with the VFs probed to
the host going through shutdown first and then the PF. Later, the PF
will trigger VFs' .remove since VFs still have driver attached.

Fix that by keeping only one driver entry which releases mlx4_priv.

Fixes: befdf897 ('net/mlx4_core: Preserve pci_dev_data after __mlx4_remove_one()')
CC: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: NWei Yang <weiyang@linux.vnet.ibm.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

da1de8df

net/mlx4_core: Fix SRIOV free-pool management when enforcing resource quotas · 95646373

由 Jack Morgenstein 提交于 6月 08, 2014

The Hypervisor driver tracks free slots and reserved slots at the global level
and tracks allocated slots and guaranteed slots per VF.

Guaranteed slots are treated as reserved by the driver, so the total
reserved slots is the sum of all guaranteed slots over all the VFs.

As VFs allocate resources, free (global) is decremented and allocated (per VF)
is incremented for those resources. However, reserved (global) is never changed.

This means that effectively, when a VF allocates a resource from its
guaranteed pool, it is actually reducing that resource's free pool (since
the global reserved count was not also reduced).

The fix for this problem is the following: For each resource, as long as a
VF's allocated count is <= its guaranteed number, when allocating for that
VF, the reserved count (global) should be reduced by the allocation as well.

When the global reserved count reaches zero, the remaining global free count
is still accessible as the free pool for that resource.

When the VF frees resources, the reverse happens: the global reserved count
for a resource is incremented only once the VFs allocated number falls below
its guaranteed number.

This fix was developed by Rick Kready <kready@us.ibm.com>
Reported-by: NRick Kready <kready@us.ibm.com>
Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

95646373

net: wimax: i2400m: control.c: Cleaning up conjunction always evaluates to false · f05d435b

由 Rickard Strandqvist 提交于 6月 07, 2014

Logical conjunction always evaluates to false: minor < 2 && minor > 1
I guess what you wanted is rather: minor > 2 || minor < 1

This was partly found using a static code analysis program called cppcheck.
Signed-off-by: NRickard Strandqvist <rickard_strandqvist@spectrumdigital.se>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f05d435b

net: ethernet: toshiba: ps3_gelic_net.c: Cleaning up a check on a memory allocation · 655aa393

由 Rickard Strandqvist 提交于 6月 07, 2014

A check on a memory allocation is checked incorrectly.

This was partly found using a static code analysis program called cppcheck.
Signed-off-by: NRickard Strandqvist <rickard_strandqvist@spectrumdigital.se>
Acked-by: NGeoff Levand <geoff@infradead.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

655aa393

amd-xgbe: fix unused variable compilation warning in phylib driver · a25aafaa

由 françois romieu 提交于 6月 07, 2014

Fix following compilation warning:
[...]
  CC      drivers/net/phy/amd-xgbe-phy.o
drivers/net/phy/amd-xgbe-phy.c:1353:30: warning:
‘amd_xgbe_phy_ids’ defined but not used [-Wunused-variable]
 static struct mdio_device_id amd_xgbe_phy_ids[] = {
                              ^
Signed-off-by: NFrancois Romieu <romieu@fr.zoreil.com>
Acked-by: NTom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a25aafaa

net: filter: fix nlattr and nlattr_nest BPF tests · df6d0f98

由 Alexei Starovoitov 提交于 6月 06, 2014

- 'struct nlattr' must be 2 byte aligned
- provide big-endian input data for nlattr/nlattr_nest tests
Signed-off-by: NAlexei Starovoitov <ast@plumgrid.com>
Acked-by: NDaniel Borkmann <dborkman@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

df6d0f98

net: filter: cleanup A/X name usage · e430f34e

由 Alexei Starovoitov 提交于 6月 06, 2014

The macro 'A' used in internal BPF interpreter:
 #define A regs[insn->a_reg]
was easily confused with the name of classic BPF register 'A', since
'A' would mean two different things depending on context.

This patch is trying to clean up the naming and clarify its usage in the
following way:

- A and X are names of two classic BPF registers

- BPF_REG_A denotes internal BPF register R0 used to map classic register A
  in internal BPF programs generated from classic

- BPF_REG_X denotes internal BPF register R7 used to map classic register X
  in internal BPF programs generated from classic

- internal BPF instruction format:
struct sock_filter_int {
        __u8    code;           /* opcode */
        __u8    dst_reg:4;      /* dest register */
        __u8    src_reg:4;      /* source register */
        __s16   off;            /* signed offset */
        __s32   imm;            /* signed immediate constant */
};

- BPF_X/BPF_K is 1 bit used to encode source operand of instruction
In classic:
  BPF_X - means use register X as source operand
  BPF_K - means use 32-bit immediate as source operand
In internal:
  BPF_X - means use 'src_reg' register as source operand
  BPF_K - means use 32-bit immediate as source operand
Suggested-by: NChema Gonzalez <chema@google.com>
Signed-off-by: NAlexei Starovoitov <ast@plumgrid.com>
Acked-by: NDaniel Borkmann <dborkman@redhat.com>
Acked-by: NChema Gonzalez <chema@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e430f34e

Merge branch 'bridge_multicast_exports' · 7b0dcbd8

由 David S. Miller 提交于 6月 10, 2014

Linus Lüssing says:

====================
bridge: multicast snooping patches / exports

The first patch is simply a cosmetic patch. So far I (and maybe others
too?) have been regularly confusing these two structs, therefore I'd
suggest renaming them and therefore making the follow-up patches easier
to understand and nicer to fit in.

The second patch fixes a minor issue, but probably not worth for stable.

On the other hand the first two patches are also preparations for the
third and fourth patch:

These two patches are exporting functionality needed to marry the bridge
multicast snooping with the batman-adv multicast optimizations recently
added for the 3.15 kernel, allowing to use these optimzations in common
setups having a bridge on top of e.g. bat0, too. So far these bridged
setups would fall back to simple flooding through the batman-adv mesh
network for any multicast packet entering bat0.

More information about the batman-adv multicast optimizations currently
implemented can be found here:

http://www.open-mesh.org/projects/batman-adv/wiki/Basic-multicast-optimizations

The integration on the batman-adv side could afterwards look like this,
for instance:

http://git.open-mesh.org/batman-adv.git/commitdiff/576b59dd3e34737c702e548b21fa72059262f796?hp=f95ce7131746c65fbcdffcf2089cab59e2c2f7ac
====================
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

7b0dcbd8

bridge: memorize and export selected IGMP/MLD querier port · 2cd41431

由 Linus Lüssing 提交于 6月 07, 2014

Adding bridge support to the batman-adv multicast optimization requires
batman-adv knowing about the existence of bridged-in IGMP/MLD queriers
to be able to reliably serve any multicast listener behind this same
bridge.
Signed-off-by: NLinus Lüssing <linus.luessing@web.de>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2cd41431

bridge: add export of multicast database adjacent to net_dev · 07f8ac4a

由 Linus Lüssing 提交于 6月 07, 2014

With this new, exported function br_multicast_list_adjacent(net_dev) a
list of IPv4/6 addresses is returned. This list contains all multicast
addresses sensed by the bridge multicast snooping feature on all bridge
ports of the bridge interface of net_dev, excluding addresses from the
specified net_device itself.

Adding bridge support to the batman-adv multicast optimization requires
batman-adv knowing about the existence of bridged-in multicast
listeners to be able to reliably serve them with multicast packets.
Signed-off-by: NLinus Lüssing <linus.luessing@web.de>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

07f8ac4a