Commits · 61bd3857ff2c7daf756d49b41e6277bbdaa8f789 · OpenHarmony / kernel_linux

05 Feb, 2015 3 commits

net/core: Add event for a change in slave state · 61bd3857

Committed by Moni Shoua on Feb 03, 2015

Add event which provides an indication on a change in the state
of a bonding slave. The event handler should cast the pointer to the
appropriate type (struct netdev_bonding_info) in order to get the
full info about the slave.
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
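
The handler pattern described in the message above can be sketched in user space as follows. This is only an illustration: the event ids, the fields of the trimmed-down struct netdev_bonding_info, and the handler name are assumptions made for the example, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel types; the real
 * struct netdev_bonding_info carries more fields than shown here. */
enum { EV_OTHER = 1, EV_BONDING_INFO = 2 };   /* illustrative event ids */

struct netdev_bonding_info {
    int slave_id;   /* which slave changed (illustrative field) */
    int link_up;    /* new link state of that slave (illustrative field) */
};

/* Last slave state seen by the handler, kept for demonstration only. */
static int last_slave_id = -1;
static int last_link_up = -1;

/* Notifier-style callback: the payload arrives as an opaque pointer and
 * is cast to the event-specific type only after checking the event id. */
static int bonding_event_handler(unsigned long event, void *ptr)
{
    const struct netdev_bonding_info *info;

    if (event != EV_BONDING_INFO || ptr == NULL)
        return 0;   /* not our event: ignore it */

    info = (const struct netdev_bonding_info *)ptr;
    last_slave_id = info->slave_id;
    last_link_up = info->link_up;
    return 1;       /* handled */
}
```

Checking the event id before the cast matters because the same callback receives unrelated events whose payloads have a different layout.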

net: add skb functions to process remote checksum offload · dcdc8994

Committed by Tom Herbert on Feb 02, 2015

This patch adds skb_remcsum_process and skb_gro_remcsum_process to
perform the appropriate adjustments to the skb when receiving
remote checksum offload.

Updated vxlan and gue to use these functions.

Tested: Ran TCP_RR and TCP_STREAM netperf for VXLAN and GUE, did
not see any change in performance.
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

xps: fix xps for stacked devices · 2bd82484

Committed by Eric Dumazet on Feb 03, 2015

A typical qdisc setup is the following :

bond0 : bonding device, using HTB hierarchy
eth1/eth2 : slaves, multiqueue NIC, using MQ + FQ qdisc

XPS allows spreading packets on specific tx queues, based on the cpu
doing the send.

Problem is that dequeues from bond0 qdisc can happen on random cpus,
due to the fact that qdisc_run() can dequeue a batch of packets.

CPUA -> queue packet P1 on bond0 qdisc, P1->ooo_okay=1
CPUA -> queue packet P2 on bond0 qdisc, P2->ooo_okay=0

CPUB -> dequeue packet P1 from bond0
        enqueue packet on eth1/eth2
CPUC -> dequeue packet P2 from bond0
        enqueue packet on eth1/eth2 using sk cache (ooo_okay is 0)

get_xps_queue() then might select wrong queue for P1, since current cpu
might be different than CPUA.

P2 might be sent on the old queue (stored in sk->sk_tx_queue_mapping),
if CPUC runs a bit faster (or CPUB spins a bit on qdisc lock)

Effect of this bug is TCP reorders, and more generally not optimal
TX queue placement. (A victim bulk flow can be migrated to the wrong TX
queue for a while)

To fix this, we have to record sender cpu number the first time
dev_queue_xmit() is called for one tx skb.

We can union napi_id (used on receive path) and sender_cpu,
provided we clear sender_cpu in skb_scrub_packet() (credit to Willem for
this union idea)
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Nandita Dukkipati <nanditad@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
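
A minimal user-space sketch of the fix described above, assuming a miniature skb type. The struct, the function names, the per-CPU queue map, and the "cpu + 1 so that 0 means unset" encoding are all assumptions of this sketch rather than quotes from the patch.

```c
#include <assert.h>

/* Hypothetical miniature of an skb: napi_id (receive path) and
 * sender_cpu (transmit path) share storage, as the message describes. */
struct mini_skb {
    union {
        unsigned int napi_id;
        unsigned int sender_cpu;   /* 0 means "not recorded yet" */
    };
};

/* Record the CPU only on the first transmit attempt. The value is
 * stored as cpu + 1 so that 0 can act as the "unset" marker. */
static void record_sender_cpu(struct mini_skb *skb, unsigned int current_cpu)
{
    if (skb->sender_cpu == 0)
        skb->sender_cpu = current_cpu + 1;
}

/* Pick a tx queue from the recorded CPU, not from whichever CPU happens
 * to be dequeuing. queue_of_cpu[] is an illustrative per-CPU map, and
 * the caller must have recorded a sender CPU beforehand. */
static unsigned int pick_tx_queue(const struct mini_skb *skb,
                                  const unsigned int *queue_of_cpu)
{
    return queue_of_cpu[skb->sender_cpu - 1];
}

/* Scrubbing a packet must forget the old value, because the storage is
 * shared with napi_id. */
static void scrub_packet(struct mini_skb *skb)
{
    skb->sender_cpu = 0;
}
```

With this shape, a packet queued on bond0 by CPUA keeps CPUA's queue choice even when CPUB or CPUC later dequeues it toward eth1/eth2.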

03 Feb, 2015 1 commit

ipv6: pull cork initialization into its own function. · 366e41d9

Committed by Vlad Yasevich on Jan 31, 2015

Pull IPv6 cork initialization into its own function that
can be re-used.  IPv6 specific cork data did not have an
explicit data structure.  This patch creates one so that
just ipv6 cork data can be passed as arguments.  Also, since
IPv6 tries to save the flow label into inet_cork_full
structure, pass the full cork.

Adjust ip6_cork_release() to take cork data structures.
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

02 Feb, 2015 2 commits

bridge: add flags argument to ndo_bridge_setlink and ndo_bridge_dellink · add511b3

Committed by Roopa Prabhu on Jan 29, 2015

bridge flags are needed inside ndo_bridge_setlink/dellink handlers to
avoid another call to parse IFLA_AF_SPEC inside these handlers

This is used later in this series
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netdev: introduce new NETIF_F_HW_SWITCH_OFFLOAD feature flag for switch device offloads · aafb3e98

Committed by Roopa Prabhu on Jan 29, 2015

This is a high level feature flag for all switch asic offloads.

Switch drivers set this flag on switch ports. Logical devices like
bridge, bonds, vxlans can inherit this flag from their slaves/ports.

The patch also adds the flag to NETIF_F_ONE_FOR_ALL, so that it gets
propagated to the upper devices (bridges and bonds).
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

30 Jan, 2015 1 commit

dev: add per net_device packet type chains · 7866a621

Committed by Salam Noureddine on Jan 27, 2015

When many pf_packet listeners are created on a lot of interfaces the
current implementation using global packet type lists scales poorly.
This patch adds per net_device packet type lists to fix this problem.

The patch was originally written by Eric Biederman for linux-2.6.29.
Tested on linux-3.16.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Salam Noureddine <noureddine@arista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

28 Jan, 2015 3 commits

net/mlx4_core: Adjust command timeouts to conform to the firmware spec · 5a031086

Committed by Jack Morgenstein on Jan 27, 2015

The firmware spec states that the timeout for all commands should be 60 seconds.

In the past, the spec indicated that there were several classes of timeout
(short, medium, and long).  The driver has these different timeout classes.
We leave the class differentiation in the driver as-is (to protect against any
future spec changes), but set the timeout for all classes to be 60 seconds.

In addition, we fix a few commands which had hard-coded numeric timeouts specified.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx4_core: Add bad-cable event support · be6a6b43

Committed by Jack Morgenstein on Jan 27, 2015

If the firmware can detect a bad cable, allow it to generate an
event, and print the problem in the log.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

NFC: st21nfca: Adding support for secure element · 2130fb97

Committed by Christophe Ricard on Jan 27, 2015

st21nfca has 1 physical SWP line and can support up to 2 secure elements
(UICC & eSE) thanks to an external switch managed with a gpio.

The platform integrator needs to specify, through 2 initialization
properties, uicc-present and ese-present, whether the platform is supposed
to have a uicc and/or an ese. Of course, if the platform does not have an
external switch, only one kind of secure element can be supported. Setting
these parameters is the platform integrator's responsibility.

During initialization, the white_list will be set according to those
parameters.

The discovery_se function will assume a secure element is physically
present according to the uicc-present and ese-present values and will add
it to the secure element list. On ese activation, the atr is retrieved to
calculate a command exchange timeout based on the first atr(TB) value.

The se_io function allows data to be transferred over SWP. 2 kinds of
events may appear after data is sent:
- ST21NFCA_EVT_TRANSMIT_DATA when receiving an apdu answer
- ST21NFCA_EVT_WTX_REQUEST when the secure element needs more time than
expected to compute a command. If this timeout expires, a first recovery
attempt consists of sending a simple software reset proprietary command.
If this attempt still fails, a second recovery attempt consists of sending
a hardware reset proprietary command.
This function is only relevant for eSE-like secure elements.

This patch also changes the way a pipe is referenced. There can be
different pipes connected to the same gate with different host
destinations (e.g. CONNECTIVITY). In order to keep host information, every
pipe is referenced with a tuple (gate, host). In order to reduce changes,
we keep unchanged the way a gate is addressed on the Terminal Host.
However, this works because we consider that the apdu reader gate is only
present on the eSE slot; also, the connectivity gate cannot give a reliable
value: it will give the latest stored pipe value.
Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>

27 Jan, 2015 8 commits

mac80211: Add BIP-CMAC-256 cipher · 56c52da2

Committed by Jouni Malinen on Jan 24, 2015

This allows mac80211 to configure BIP-CMAC-256 to the driver and also
use a software implementation within mac80211 when the driver does not
support this with hardware acceleration.
Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

cfg80211: Add new GCMP, CCMP-256, BIP-GMAC, BIP-CMAC-256 ciphers · cfcf1682

Committed by Jouni Malinen on Jan 24, 2015

This makes cfg80211 aware of the GCMP, GCMP-256, CCMP-256, BIP-GMAC-128,
BIP-GMAC-256, and BIP-CMAC-256 cipher suites. These new cipher suites
were defined in IEEE Std 802.11ac-2013.
Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

net: phy: keep track of the PHY suspend state · 8a477a6f

Committed by Florian Fainelli on Jan 26, 2015

In order to avoid double calls to phydev->drv->suspend and resume, keep
track of whether the PHY has already been suspended as a consequence of
a successful call to phy_suspend(). We will use this in our MDIO bus
suspend/resume hooks to avoid a double suspend call.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
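
The idea of remembering a successful suspend so that a bus-level hook can skip it can be sketched like this. The miniature PHY type and the function names ending in _sketch are hypothetical; only the "set a flag after a successful suspend" logic comes from the message above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical miniature of a PHY device. */
struct mini_phy {
    bool suspended;          /* set only after a successful suspend */
    int drv_suspend_calls;   /* counts calls into the driver hook */
};

/* Stand-in for the driver's suspend hook; 0 means success. */
static int drv_suspend(struct mini_phy *phy)
{
    phy->drv_suspend_calls++;
    return 0;
}

/* Suspend the PHY and remember that it worked. */
static int phy_suspend_sketch(struct mini_phy *phy)
{
    int ret = drv_suspend(phy);

    if (ret == 0)
        phy->suspended = true;
    return ret;
}

/* Bus-level hook: skip PHYs that were already suspended. */
static int bus_suspend_sketch(struct mini_phy *phy)
{
    if (phy->suspended)
        return 0;
    return phy_suspend_sketch(phy);
}
```

A resume path would clear the flag in the same way, so that the next suspend cycle reaches the driver hook again.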

net: phy: document has_fixups field · aae88261

Committed by Florian Fainelli on Jan 26, 2015

has_fixups was introduced to help keep track of fixups/quirks running
on a PHY device, but we did not update the comment above struct
phy_device accordingly.

Fixes: b0ae009f ("net: phy: add "has_fixups" boolean property")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

NFC: st21nfcb: Fix copy/paste error in comment · 6da7c85c

Committed by Christophe Ricard on Jan 25, 2015

include/linux/platform_data/st21nfcb.h is based on
include/linux/platform_data/st21nfca.h.

The endif comment is inaccurate for st21nfcb.
Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>

NFC: st21nfcb: Remove useless include · 0b35fa7d

Committed by Christophe Ricard on Jan 25, 2015

include/linux/platform_data/st21nfcb.h is phy generic.
There is no need to include linux/i2c.h
Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>

printk: add dummy routine for when CONFIG_PRINTK=n · 07261edb

Committed by Pranith Kumar on Jan 26, 2015

There are missing dummy routines for log_buf_addr_get() and
log_buf_len_get() for when CONFIG_PRINTK is not set causing build
failures.

This patch adds these dummy routines at the appropriate location.
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Petr Mladek <pmladek@suse.cz>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: page_alloc: embed OOM killing naturally into allocation slowpath · 9879de73

Committed by Johannes Weiner on Jan 26, 2015

The OOM killing invocation does a lot of duplicative checks against the
task's allocation context.  Rework it to take advantage of the existing
checks in the allocator slowpath.

The OOM killer is invoked when the allocator is unable to reclaim any
pages but the allocation has to keep looping.  Instead of having a check
for __GFP_NORETRY hidden in oom_gfp_allowed(), just move the OOM
invocation to the true branch of should_alloc_retry().  The __GFP_FS
check from oom_gfp_allowed() can then be moved into the OOM avoidance
branch in __alloc_pages_may_oom(), along with the PF_DUMPCORE test.

__alloc_pages_may_oom() can then signal to the caller whether the OOM
killer was invoked, instead of requiring it to duplicate the order and
high_zoneidx checks to guess this when deciding whether to continue.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

26 Jan, 2015 9 commits

rhashtable: fix rht_for_each_entry_safe() endless loop · 607954b0

Committed by Patrick McHardy on Jan 21, 2015

"next" is not updated, causing an endless loop for buckets with more than
one element.

Fixes: 88d6ed15 ("rhashtable: Convert bucket iterators to take table and index")
Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
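
The class of bug fixed here can be illustrated with a generic "safe" iterator over a singly linked bucket. The macro and types below are hypothetical and are not the rhashtable macros themselves; they only show why the second cursor has to be refreshed on every step.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    int key;
    struct node *next;
};

/* Hypothetical "safe" iterator over a singly linked bucket. The second
 * cursor is refreshed in the step expression; if that refresh is left
 * out, pos keeps being reset to the same stale node and the loop never
 * ends once the bucket has more than one element. */
#define bucket_for_each_safe(pos, nxt, head)                      \
    for ((pos) = (head), (nxt) = (pos) ? (pos)->next : NULL;      \
         (pos) != NULL;                                           \
         (pos) = (nxt), (nxt) = (pos) ? (pos)->next : NULL)

/* Unlink every node while walking; safe because nxt was saved first. */
static int drain_bucket(struct node *head)
{
    struct node *pos, *nxt;
    int visited = 0;

    bucket_for_each_safe(pos, nxt, head) {
        pos->next = NULL;   /* clobber the link, as a removal would */
        visited++;
    }
    return visited;
}
```

Calling drain_bucket() on a three-element chain visits each node exactly once even though every node's next pointer is cleared during the walk.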

net: ipv6: Add sysctl entry to disable MTU updates from RA · c2943f14

Committed by Harout Hedeshian on Jan 20, 2015

The kernel forcefully applies MTU values received in router
advertisements provided the new MTU is less than the current. This
behavior is undesirable when the user space is managing the MTU. Instead
a sysctl flag 'accept_ra_mtu' is introduced such that the user space
can control whether or not RA provided MTU updates should be applied. The
default behavior is unchanged; user space must explicitly set this flag
to 0 for RA MTUs to be ignored.
Signed-off-by: Harout Hedeshian <harouth@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
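
As a rough sketch of the decision described above, assuming a simplified per-interface configuration: the struct and function names are hypothetical, accept_ra_mtu mirrors the sysctl named in the message, and the lower bound of 1280 bytes is the IPv6 minimum link MTU.

```c
#include <assert.h>
#include <stdbool.h>

#define IPV6_MIN_MTU 1280u   /* minimum IPv6 link MTU */

/* Hypothetical per-interface settings; accept_ra_mtu mirrors the sysctl
 * named in the commit message, the other field is illustrative. */
struct if_conf {
    bool accept_ra_mtu;   /* true by default, false = ignore RA MTUs */
    unsigned int mtu;     /* MTU currently in effect */
};

/* Returns true and updates conf->mtu when the advertised MTU is applied. */
static bool handle_ra_mtu(struct if_conf *conf, unsigned int ra_mtu)
{
    if (!conf->accept_ra_mtu)
        return false;     /* user space owns the MTU */
    if (ra_mtu < IPV6_MIN_MTU || ra_mtu >= conf->mtu)
        return false;     /* only a smaller, still valid MTU is taken */
    conf->mtu = ra_mtu;
    return true;
}
```

With the flag left at its default the behavior matches the old one; clearing it makes every advertised MTU a no-op.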

net/mlx4_core: Enable device recovery flow with SRIOV · 55ad3592

Committed by Yishai Hadas on Jan 25, 2015

In SRIOV, both the PF and the VF may attempt device recovery whenever they
assume that the device is not functioning.  When the PF driver resets the
device, the VF should detect this and attempt to reinitialize itself.

The VF must be able to reset itself under all circumstances, even
if the PF is not responsive.

The VF shall reset itself in the following cases:

1. Commands are not processed within a reasonable time over the communication channel.
This takes the device state into account and returns the correct return code based on
the command, as was done in native mode; it is implemented in the next patch.

2. The VF driver receives an internal error event reported by the PF on the
communication channel. This occurs when the PF driver resets the device or
when VF is out of sync with the PF.

Add 'VF reset' capability, which allows the VF to reinitialize itself even when the
PF is not responsive.

As PF and VF may run their reset flow simultaneously, there are several cases
that are handled:
- Prevent freeing VF resources upon FLR, when PF is in its unloading stage.
- Prevent PF getting VF commands before it has finished initializing its resources.
- Upon VF startup, check that comm-channel is online before sending
  commands to the PF and getting timed-out.
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx4_core: Manage interface state for Reset flow cases · c69453e2

Committed by Yishai Hadas on Jan 25, 2015

We need to manage interface state to sync between reset flow and some other
related cases such as remove_one. This has to be done to prevent certain
races. For example, if the software stack is down as a result of an unload call,
remove_one should skip the unload phase.

Implement the remove_one case; handling AER and other cases comes next.

The interface can be up/down. Upon remove_one, the state will include an extra
bit indicating that the device is cleaned up, forcing other tasks to finish
before the final cleanup.
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx4_core: Activate reset flow upon fatal command cases · f5aef5aa

Committed by Yishai Hadas on Jan 25, 2015

We activate reset flow upon command fatal errors, when the device enters an
erroneous state, and must be reset.

The cases below are assumed to be fatal: FW command timed-out, an error from FW
on closing commands, pci is offline when posting/pending a command.

In those cases we place the device into an error state: chip is reset, pending
commands are awakened and completed immediately. Subsequent commands will
return immediately.

The return code in the above cases will depend on the command. Commands which
free and close resources will return success (because the chip was reset, so
callers may safely free their kernel resources). Other commands will return -EIO.

Since the device's state was marked as error, the catas poller will
detect this and restart the device's software stack (as is done when a FW
internal error is directly detected). The device state is protected by a
persistent mutex that lives on its mlx4_dev, so the hcr_mutex is no longer
needed and is removed.
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
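
The return-code rule described above (teardown commands succeed, everything else fails with -EIO once the device is in an error state) can be sketched as follows. The two command classes, the state struct, and the function names are assumptions of this sketch; the real driver decides per opcode.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical command classes; the real driver decides per opcode. */
enum cmd_kind {
    CMD_CLOSE_OR_FREE,   /* releases a resource */
    CMD_OTHER            /* anything else */
};

struct dev_state {
    bool in_error;       /* set once a fatal command error is seen */
};

/* Stand-in for posting a command to firmware; always succeeds here. */
static int post_to_firmware(enum cmd_kind kind)
{
    (void)kind;
    return 0;
}

static int issue_cmd(const struct dev_state *dev, enum cmd_kind kind)
{
    if (dev->in_error) {
        /* Chip was reset: teardown may proceed, everything else fails. */
        return kind == CMD_CLOSE_OR_FREE ? 0 : -EIO;
    }
    return post_to_firmware(kind);
}
```

Returning success for close/free commands lets callers release their kernel-side resources without special-casing a device that has already been reset.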

net/mlx4_core: Enhance the catas flow to support device reset · f6bc11e4

Committed by Yishai Hadas on Jan 25, 2015

This includes:

- resetting the chip when a fatal error is detected (the current code
  does not do this).

- exposing the ability to enter error state from outside the catas code
  by calling its functionality. (E.g. FW Command timeout, AER error).

- managing a persistent device state. This is needed to sync between
  reset flow cases.
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx4_core: Refactor the catas flow to work per device · ad9a0bf0

Committed by Yishai Hadas on Jan 25, 2015

Use a WQ per device instead of a single global WQ. This allows
independent reset handling per device even when SRIOV is used.

This comes as a pre-patch for supporting chip reset
for both native and SRIOV.
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx4_core: Set device configuration data to be persistent across reset · dd0eefe3

Committed by Yishai Hadas on Jan 25, 2015

When an HCA enters an internal error state, this is detected by the driver.
The driver then should reset the HCA and restart the software stack.

Keep ports information and some SRIOV configuration in a persistent area
to have it valid across reset.
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx4_core: Maintain a persistent memory for mlx4 device · 872bf2fb

Committed by Yishai Hadas on Jan 25, 2015

Maintain a persistent memory that should survive reset flow/PCI error.
This comes as preparation for a coming series to support the above flows.
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

24 Jan, 2015 1 commit

bcma: use standard bus scanning during early register · c5ed1df7

Committed by Rafał Miłecki on Jan 19, 2015

Starting with kernel 3.19-rc1, early registration of bcma on MIPS is done
a bit later, with the memory allocator available. This allows us to simplify
the code by using the standard bus scanning method.
Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>

22 Jan, 2015 1 commit

module: make module_refcount() a signed integer. · d5db139a

Committed by Rusty Russell on Jan 22, 2015

James Bottomley points out that it will be -1 during unload.  It's
only used for diagnostics, so let's not hide that as it could be a
clue as to what's gone wrong.

Cc: Jason Wessel <jason.wessel@windriver.com>
Acked-and-documention-added-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Reviewed-by: Masami Hiramatsu <maasami.hiramatsu.pt@hitachi.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
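
To see why a signed return type helps diagnostics, here is a small sketch. It assumes the stored counter is offset by a base of one, so that a module being unloaded reports -1; the struct, the REF_BASE name, and both helper functions are hypothetical.

```c
#include <assert.h>

#define REF_BASE 1   /* assumed: the stored counter is offset by one */

struct mini_module {
    int refcnt;      /* REF_BASE while idle, 0 once unloading starts */
};

/* Unsigned reporting: the subtraction wraps for an unloading module. */
static unsigned int refcount_unsigned(const struct mini_module *m)
{
    return (unsigned int)(m->refcnt - REF_BASE);
}

/* Signed reporting keeps the -1 visible for diagnostics. */
static int refcount_signed(const struct mini_module *m)
{
    return m->refcnt - REF_BASE;
}
```

For a module in the unloading state the unsigned variant yields the largest unsigned value, which hides the -1 that points at the unload.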

20 Jan, 2015 2 commits

module: remove mod arg from module_free, rename module_memfree(). · be1f221c

Committed by Rusty Russell on Jan 20, 2015

Nothing needs the module pointer any more, and the next patch will
call it from RCU, where the module itself might no longer exist.
Removing the arg is the safest approach.

This just codifies the use of the module_alloc/module_free pattern
which ftrace and bpf use.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: x86@kernel.org
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: linux-cris-kernel@axis.com
Cc: linux-kernel@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: nios2-dev@lists.rocketboards.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: sparclinux@vger.kernel.org
Cc: netdev@vger.kernel.org

module_arch_freeing_init(): new hook for archs before module->module_init freed. · d453cded

Committed by Rusty Russell on Jan 20, 2015

Archs have been abusing module_free() to clean up their arch-specific
allocations.  Since module_free() is also (ab)used by BPF and trace code,
let's keep it to simple allocations, and provide a hook called before
that.

This means that avr32, ia64, parisc and s390 no longer need to implement
their own module_free() at all.  avr32 doesn't need module_finalize()
either.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Helge Deller <deller@gmx.de>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: linux-kernel@vger.kernel.org
Cc: linux-ia64@vger.kernel.org
Cc: linux-parisc@vger.kernel.org
Cc: linux-s390@vger.kernel.org

19 Jan, 2015 1 commit

libata: allow sata_sil24 to opt-out of tag ordered submission · 72dd299d

Committed by Dan Williams on Jan 16, 2015

Ronny reports: https://bugzilla.kernel.org/show_bug.cgi?id=87101
    "Since commit 8a4aeec8 "libata/ahci: accommodate tag ordered
    controllers" the access to the harddisk on the first SATA-port is
    failing on its first access. The access to the harddisk on the
    second port is working normal.

    When reverting the above commit, access to both harddisks is working
    fine again."

Maintain tag ordered submission as the default, but allow sata_sil24 to
continue with the old behavior.

Cc: <stable@vger.kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Reported-by: Ronny Hegewald <Ronny.Hegewald@online.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

18 Jan, 2015 1 commit

net: replace br_fdb_external_learn_* calls with switchdev notifier events · 3aeb6617

Committed by Jiri Pirko on Jan 15, 2015

This patch benefits from the newly introduced switchdev notifier and uses it
to propagate fdb learn events from the rocker driver to the bridge. That avoids
direct function calls and allows possible use by other listeners (ovs).
Suggested-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

17 Jan, 2015 3 commits

genetlink: synchronize socket closing and family removal · ee1c2442

Committed by Johannes Berg on Jan 16, 2015

In addition to the problem Jeff Layton reported, I looked at the code
and reproduced the same warning by subscribing and removing the genl
family with a socket still open. This is a fairly tricky race which
originates in the fact that generic netlink allows the family to go
away while sockets are still open - unlike regular netlink which has
a module refcount for every open socket so in general this cannot be
triggered.

Trying to resolve this issue by the obvious locking isn't possible as
it will result in deadlocks between unregistration and group unbind
notification (which incidentally lockdep doesn't find due to the home
grown locking in the netlink table.)

To really resolve this, introduce a "closing socket" reference counter
(for generic netlink only, as it's the only affected family) in the
core netlink code and use that in generic netlink to wait for all the
sockets that are being closed at the same time as a generic netlink
family is removed.

This fixes the race that when a socket is closed, it should call
the unbind, but if the family is removed at the same time the unbind
will not find it, leading to the warning. The real problem though is
that in this case the unbind could actually find a new family that is
registered to have a multicast group with the same ID, and call its
mcast_unbind() leading to confusion.

Also remove the warning since it would still trigger, but is now no
longer a problem.

This also moves the code in af_netlink.c to before unreferencing the
module to avoid having the same problem in the normal non-genl case.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

PCI: Add pci_claim_bridge_resource() to clip window if necessary · 8505e729

Committed by Yinghai Lu on Jan 15, 2015

Add pci_claim_bridge_resource() to claim a PCI-PCI bridge window. This is
like regular pci_claim_resource(), except that if we fail to claim the
window, we check to see if we can reduce the size of the window and try
again.

This is for scenarios like this:

pci_bus 0000:00: root bus resource [mem 0xc0000000-0xffffffff]
pci 0000:00:01.0: bridge window [mem 0xbdf00000-0xddefffff 64bit pref]
pci 0000:01:00.0: reg 0x10: [mem 0xc0000000-0xcfffffff pref]

The 00:01.0 window is illegal: it starts before the host bridge window, so
we have to assume the [0xbdf00000-0xbfffffff] region is inaccessible. We
can make it legal by clipping it to [mem 0xc0000000-0xddefffff 64bit pref].

Previously we discarded the 00:01.0 window and tried to reassign that part
of the hierarchy from scratch. That is a problem because Linux doesn't
always assign things optimally. For example, in this case, BIOS put the
01:00.0 device in a prefetchable window below 4GB, but after 5b285415,
Linux puts the prefetchable window above 4GB where the 32-bit 01:00.0
device can't use it.

Clipping the 00:01.0 window is less intrusive than completely reassigning
things and is sufficient to let us use most of the BIOS configuration. Of
course, it's possible that devices below 00:01.0 will no longer fit. If
that's the case, we'll have to reassign things. But that's a separate
problem.

[bhelgaas: changelog, split into separate patch]
Link: https://bugzilla.kernel.org/show_bug.cgi?id=85491
Reported-by: Marek Kordik <kordikmarek@gmail.com>
Fixes: 5b285415 ("PCI: Restrict 64-bit prefetchable bridge windows to 64-bit resources")
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: stable@vger.kernel.org # v3.16+
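
The clipping step in the scenario above amounts to intersecting two inclusive address ranges. The sketch below uses a hypothetical struct range and clip_window() helper, not the kernel's resource API, and reuses the addresses quoted in the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct range {
    uint64_t start;
    uint64_t end;   /* inclusive, like the ranges in the log above */
};

/* Clip *win so it lies inside *avail. Returns false when the two ranges
 * do not intersect, in which case *win is left untouched. */
static bool clip_window(struct range *win, const struct range *avail)
{
    uint64_t start = win->start > avail->start ? win->start : avail->start;
    uint64_t end = win->end < avail->end ? win->end : avail->end;

    if (start > end)
        return false;
    win->start = start;
    win->end = end;
    return true;
}
```

Clipping the bridge window [0xbdf00000-0xddefffff] against the root bus resource [0xc0000000-0xffffffff] gives [0xc0000000-0xddefffff], the legal window named in the message.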

PCI: Add flag for devices where we can't use bus reset · f331a859

Committed by Alex Williamson on Jan 15, 2015

Enable a mechanism for devices to quirk that they do not behave when
doing a PCI bus reset. We require a modest level of spec compliant
behavior in order to do a reset, for instance the device should come
out of reset without throwing errors and PCI config space should be
accessible after reset. This is too much to ask for some devices.

Link: http://lkml.kernel.org/r/20140923210318.498dacbd@dualc.maya.org
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: stable@vger.kernel.org # v3.14+

16 Jan, 2015 1 commit

rhashtable: Fix race in rhashtable_destroy() and use regular work_struct · 57699a40

Committed by Ying Xue on Jan 16, 2015

When we put our declared work task in the global workqueue with
schedule_delayed_work(), its delay parameter is always zero.
Therefore, we should define a regular work in rhashtable structure
instead of a delayed work.

By the way, we add a condition to check whether resizing functions
are NULL before cancelling the work, to avoid cancelling an
uninitialized work.

Lastly, while we wait with cancel_delayed_work() for all previously
submitted work items to run to completion, ht->mutex has already been
taken in rhashtable_destroy(). Moreover, cancel_delayed_work() doesn't
return until all work items have completed, and when work items are
scheduled, the work's function, rht_deferred_worker(), will be called.
However, as rht_deferred_worker() also needs to acquire the lock, a
deadlock might happen because the lock is already held.
So if the cancel work function is moved out of the lock covered scope,
this will avoid the deadlock.

Fixes: 97defe1e ("rhashtable: Per bucket locks & deferred expansion/shrinking")
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Cc: Thomas Graf <tgraf@suug.ch>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

15 Jan, 2015 2 commits

netdevice: Add missing parentheses in macro · 4ccce02e

Committed by Benjamin Poirier on Jan 14, 2015

For example, one could conceivably call
	for_each_netdev_in_bond_rcu(condition ? bond1 : bond2, slave)
and get an unexpected result.
Signed-off-by: Benjamin Poirier <bpoirier@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
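
The "unexpected result" comes from operator precedence: == binds tighter than ?:, so an unparenthesized macro argument gets split. The two macros below are hypothetical stand-ins that compare integer ids instead of device pointers, purely to show the effect.

```c
#include <assert.h>

/* Hypothetical macros: compare an id against the wanted one. */
#define MATCHES_BAD(id, want)   ((id) == want)     /* argument not wrapped */
#define MATCHES_GOOD(id, want)  ((id) == (want))   /* argument wrapped */

static int check_bad(int id, int cond, int a, int b)
{
    /* Expands to ((id) == cond ? a : b), i.e. ((id) == cond) ? a : b */
    return MATCHES_BAD(id, cond ? a : b);
}

static int check_good(int id, int cond, int a, int b)
{
    /* Expands to ((id) == (cond ? a : b)) */
    return MATCHES_GOOD(id, cond ? a : b);
}
```

With id = 2, cond = 1, a = 1 and b = 2, the intended comparison is 2 == 1, which is false; the unwrapped macro instead evaluates (2 == 1) ? 1 : 2 and yields a nonzero value.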

udp: pass udp_offload struct to UDP gro callbacks · a2b12f3c

Committed by Tom Herbert on Jan 12, 2015

This patch introduces udp_offload_callbacks, which has the same
GRO functions (but not a GSO function) as offload_callbacks,
except that a udp_offload struct is passed as an additional argument
to the gro_receive and gro_complete functions. This additional argument
can be used to retrieve the per port structure of the encapsulation
for use in gro processing (mostly by doing container_of on the
structure).
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
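
The container_of technique mentioned above can be shown with a user-space equivalent of the macro. The trimmed-down struct udp_offload, the struct tunnel_port that embeds it, and the callback name are hypothetical and only illustrate how a callback recovers the enclosing per-port structure.

```c
#include <assert.h>
#include <stddef.h>

/* User-space equivalent of the kernel's container_of(). */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, trimmed-down offload handle passed to GRO callbacks. */
struct udp_offload {
    unsigned short port;
};

/* Hypothetical per-port encapsulation state embedding the handle. */
struct tunnel_port {
    int vni_count;
    struct udp_offload offload;
};

/* A gro_receive-style callback gets only the embedded handle and
 * recovers the enclosing per-port structure from it. */
static int gro_receive_sketch(struct udp_offload *uoff)
{
    struct tunnel_port *tp = container_of(uoff, struct tunnel_port, offload);

    return tp->vni_count;
}
```

Embedding the handle avoids a separate lookup from port number to per-port state inside the receive path.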

14 Jan, 2015 1 commit

net: rename vlan_tx_* helpers since "tx" is misleading there · df8a39de

Committed by Jiri Pirko on Jan 13, 2015

The same macros are used for rx as well. So rename it.
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
