提交 · 8f25348b65cd073f77945f559ab1e5de83422cd1 · openanolis / cloud-kernel

05 11月, 2015 2 次提交

net: add forgotten IFF_L3MDEV_SLAVE define · 8f25348b

由 Jiri Pirko 提交于 11月 04, 2015

Fixes: fee6d4c7 ("net: Add netif_is_l3_slave")
Signed-off-by: NJiri Pirko <jiri@mellanox.com>
Acked-by: NDavid Ahern <dsa@cumulusnetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

8f25348b

tun_dst: Fix potential NULL dereference · f63ce5b6

由 Tobias Klauser 提交于 11月 04, 2015

In tun_dst_unclone() the return value of skb_metadata_dst() is checked
for being NULL after it is dereferenced. Fix this by moving the
dereference after the NULL check.

Found by the Coverity scanner (CID 1338068).

Fixes: fc4099f1 ("openvswitch: Fix egress tunnel info.")
Cc: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: NTobias Klauser <tklauser@distanz.ch>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f63ce5b6

04 11月, 2015 3 次提交

atomic: remove all traces of READ_ONCE_CTRL() and atomic*_read_ctrl() · 105ff3cb

由 Linus Torvalds 提交于 11月 03, 2015

This seems to be a mis-reading of how alpha memory ordering works, and
is not backed up by the alpha architecture manual.  The helper functions
don't do anything special on any other architectures, and the arguments
that support them being safe on other architectures also argue that they
are safe on alpha.

Basically, the "control dependency" is between a previous read and a
subsequent write that is dependent on the value read.  Even if the
subsequent write is actually done speculatively, there is no way that
such a speculative write could be made visible to other cpu's until it
has been committed, which requires validating the speculation.

Note that most weakely ordered architectures (very much including alpha)
do not guarantee any ordering relationship between two loads that depend
on each other on a control dependency:

    read A
    if (val == 1)
        read B

because the conditional may be predicted, and the "read B" may be
speculatively moved up to before reading the value A.  So we require the
user to insert a smp_rmb() between the two accesses to be correct:

    read A;
    if (A == 1)
        smp_rmb()
        read B

Alpha is further special in that it can break that ordering even if the
*address* of B depends on the read of A, because the cacheline that is
read later may be stale unless you have a memory barrier in between the
pointer read and the read of the value behind a pointer:

    read ptr
    read offset(ptr)

whereas all other weakly ordered architectures guarantee that the data
dependency (as opposed to just a control dependency) will order the two
accesses.  As a result, alpha needs a "smp_read_barrier_depends()" in
between those two reads for them to be ordered.

The coontrol dependency that "READ_ONCE_CTRL()" and "atomic_read_ctrl()"
had was a control dependency to a subsequent *write*, however, and
nobody can finalize such a subsequent write without having actually done
the read.  And were you to write such a value to a "stale" cacheline
(the way the unordered reads came to be), that would seem to lose the
write entirely.

So the things that make alpha able to re-order reads even more
aggressively than other weak architectures do not seem to be relevant
for a subsequent write.  Alpha memory ordering may be strange, but
there's no real indication that it is *that* strange.

Also, the alpha architecture reference manual very explicitly talks
about the definition of "Dependence Constraints" in section 5.6.1.7,
where a preceding read dominates a subsequent write.

Such a dependence constraint admittedly does not impose a BEFORE (alpha
architecture term for globally visible ordering), but it does guarantee
that there can be no "causal loop".  I don't see how you could avoid
such a loop if another cpu could see the stored value and then impact
the value of the first read.  Put another way: the read and the write
could not be seen as being out of order wrt other cpus.

So I do not see how these "x_ctrl()" functions can currently be necessary.

I may have to eat my words at some point, but in the absense of clear
proof that alpha actually needs this, or indeed even an explanation of
how alpha could _possibly_ need it, I do not believe these functions are
called for.

And if it turns out that alpha really _does_ need a barrier for this
case, that barrier still should not be "smp_read_barrier_depends()".
We'd have to make up some new speciality barrier just for alpha, along
with the documentation for why it really is necessary.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul E McKenney <paulmck@us.ibm.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

105ff3cb

net/core: fix for_each_netdev_feature · 5ba3f7d6

由 Jarod Wilson 提交于 11月 03, 2015

As pointed out by Nikolay and further explained by Geert, the initial
for_each_netdev_feature macro was broken, as feature would get set outside
of the block of code it was intended to run in, thus only ever working for
the first feature bit in the mask. While less pretty this way, this is
tested and confirmed functional with multiple feature bits set in
NETIF_F_UPPER_DISABLES.

[root@dell-per730-01 ~]# ethtool -K bond0 lro off
...
[  242.761394] bond0: Disabling feature 0x0000000000008000 on lower dev p5p2.
[  243.552178] bnx2x 0000:06:00.1 p5p2: using MSI-X  IRQs: sp 74  fp[0] 76 ... fp[7] 83
[  244.353978] bond0: Disabling feature 0x0000000000008000 on lower dev p5p1.
[  245.147420] bnx2x 0000:06:00.0 p5p1: using MSI-X  IRQs: sp 62  fp[0] 64 ... fp[7] 71

[root@dell-per730-01 ~]# ethtool -K bond0 gro off
...
[  251.925645] bond0: Disabling feature 0x0000000000004000 on lower dev p5p2.
[  252.713693] bnx2x 0000:06:00.1 p5p2: using MSI-X  IRQs: sp 74  fp[0] 76 ... fp[7] 83
[  253.499085] bond0: Disabling feature 0x0000000000004000 on lower dev p5p1.
[  254.290922] bnx2x 0000:06:00.0 p5p1: using MSI-X  IRQs: sp 62  fp[0] 64 ... fp[7] 71

Fixes: fd867d51 ("net/core: generic support for disabling netdev features down stack")
CC: "David S. Miller" <davem@davemloft.net>
CC: Eric Dumazet <edumazet@google.com>
CC: Jay Vosburgh <j.vosburgh@gmail.com>
CC: Veaceslav Falico <vfalico@gmail.com>
CC: Andy Gospodarek <gospo@cumulusnetworks.com>
CC: Jiri Pirko <jiri@resnulli.us>
CC: Nikolay Aleksandrov <razor@blackwall.org>
CC: Michal Kubecek <mkubecek@suse.cz>
CC: Alexander Duyck <alexander.duyck@gmail.com>
CC: Geert Uytterhoeven <geert@linux-m68k.org>
CC: netdev@vger.kernel.org
Signed-off-by: NJarod Wilson <jarod@redhat.com>
Acked-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5ba3f7d6

ptp: Change ptp_class to a proper bitmask · 5f94c943

由 Stefan Sørensen 提交于 11月 03, 2015

Change the definition of PTP_CLASS_L2 to not have any bits overlapping with
the other defined protocol values, allowing the PTP_CLASS_* definitions to
be for simple filtering on packet type.
Signed-off-by: NStefan Sørensen <stefan.sorensen@spectralink.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5f94c943

03 11月, 2015 10 次提交

mac80211: document sleep requirements for channel context ops · dcae9e02

由 Chaitanya T K 提交于 10月 30, 2015

Channel context driver operations can sleep, so add might_sleep()
and document this.
Signed-off-by: NChaitanya T K <chaitanya.mgit@gmail.com>
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>

dcae9e02

cfg80211/mac80211: clarify RSSI CQM reporting requirements · e86abc68

由 Johannes Berg 提交于 10月 22, 2015

The previous patch changed mac80211 to always report an event
after a CQM RSSI reconfiguration. Document that as expected
behaviour in both the cfg80211 and mac80211 API.

Currently, iwlmvm already implements that behaviour; the other
drivers implementing CQM RSSI events may have to be changed.

This behaviour lets userspace know what the current state is
without relying on querying the data which is racy.
Reviewed-by: NSharon, Sara <sara.sharon@intel.com>
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>

e86abc68

leds: netxbig: add device tree binding · 2976b179

由 Simon Guinot 提交于 9月 26, 2015

This patch adds device tree support for the netxbig LEDs.

This also introduces a additionnal DT binding for the GPIO extension bus
(netxbig-gpio-ext) used to configure the LEDs. Since this bus could also
be used to control other devices, then it seems more suitable to have it
in a separate DT binding.
Signed-off-by: NSimon Guinot <simon.guinot@sequanux.org>
Acked-by: NLinus Walleij <linus.walleij@linaro.org>
Signed-off-by: NJacek Anaszewski <j.anaszewski@samsung.com>

2976b179

net/core: generic support for disabling netdev features down stack · fd867d51

由 Jarod Wilson 提交于 11月 02, 2015

There are some netdev features, which when disabled on an upper device,
such as a bonding master or a bridge, must be disabled and cannot be
re-enabled on underlying devices.

This is a rework of an earlier more heavy-handed appraoch, which simply
disables and prevents re-enabling of netdev features listed in a new
define in include/net/netdev_features.h, NETIF_F_UPPER_DISABLES. Any upper
device that disables a flag in that feature mask, the disabling will
propagate down the stack, and any lower device that has any upper device
with one of those flags disabled should not be able to enable said flag.

Initially, only LRO is included for proof of concept, and because this
code effectively does the same thing as dev_disable_lro(), though it will
also activate from the ethtool path, which was one of the goals here.

[root@dell-per730-01 ~]# ethtool -k bond0 |grep large
large-receive-offload: on
[root@dell-per730-01 ~]# ethtool -k p5p1 |grep large
large-receive-offload: on
[root@dell-per730-01 ~]# ethtool -K bond0 lro off
[root@dell-per730-01 ~]# ethtool -k bond0 |grep large
large-receive-offload: off
[root@dell-per730-01 ~]# ethtool -k p5p1 |grep large
large-receive-offload: off

dmesg dump:

[ 1033.277986] bond0: Disabling feature 0x0000000000008000 on lower dev p5p2.
[ 1034.067949] bnx2x 0000:06:00.1 p5p2: using MSI-X  IRQs: sp 74  fp[0] 76 ... fp[7] 83
[ 1034.753612] bond0: Disabling feature 0x0000000000008000 on lower dev p5p1.
[ 1035.591019] bnx2x 0000:06:00.0 p5p1: using MSI-X  IRQs: sp 62  fp[0] 64 ... fp[7] 71

This has been successfully tested with bnx2x, qlcnic and netxen network
cards as slaves in a bond interface. Turning LRO on or off on the master
also turns it on or off on each of the slaves, new slaves are added with
LRO in the same state as the master, and LRO can't be toggled on the
slaves.

Also, this should largely remove the need for dev_disable_lro(), and most,
if not all, of its call sites can be replaced by simply making sure
NETIF_F_LRO isn't included in the relevant device's feature flags.

Note that this patch is driven by bug reports from users saying it was
confusing that bonds and slaves had different settings for the same
features, and while it won't be 100% in sync if a lower device doesn't
support a feature like LRO, I think this is a good step in the right
direction.

CC: "David S. Miller" <davem@davemloft.net>
CC: Eric Dumazet <edumazet@google.com>
CC: Jay Vosburgh <j.vosburgh@gmail.com>
CC: Veaceslav Falico <vfalico@gmail.com>
CC: Andy Gospodarek <gospo@cumulusnetworks.com>
CC: Jiri Pirko <jiri@resnulli.us>
CC: Nikolay Aleksandrov <razor@blackwall.org>
CC: Michal Kubecek <mkubecek@suse.cz>
CC: Alexander Duyck <alexander.duyck@gmail.com>
CC: netdev@vger.kernel.org
Signed-off-by: NJarod Wilson <jarod@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

fd867d51

bonding: simplify / unify event handling code for 3ad mode. · 52bc6716

由 Mahesh Bandewar 提交于 10月 31, 2015

Old logic of updating state-machine is not required since
ad_update_actor_keys() does it implicitly. The only loss is
the notification differentiation between speed vs. duplex
change. Now only one unified notification is printed.
Signed-off-by: NMahesh Bandewar <maheshb@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

52bc6716

bpf: add support for persistent maps/progs · b2197755

由 Daniel Borkmann 提交于 10月 29, 2015

This work adds support for "persistent" eBPF maps/programs. The term
"persistent" is to be understood that maps/programs have a facility
that lets them survive process termination. This is desired by various
eBPF subsystem users.

Just to name one example: tc classifier/action. Whenever tc parses
the ELF object, extracts and loads maps/progs into the kernel, these
file descriptors will be out of reach after the tc instance exits.
So a subsequent tc invocation won't be able to access/relocate on this
resource, and therefore maps cannot easily be shared, f.e. between the
ingress and egress networking data path.

The current workaround is that Unix domain sockets (UDS) need to be
instrumented in order to pass the created eBPF map/program file
descriptors to a third party management daemon through UDS' socket
passing facility. This makes it a bit complicated to deploy shared
eBPF maps or programs (programs f.e. for tail calls) among various
processes.

We've been brainstorming on how we could tackle this issue and various
approches have been tried out so far, which can be read up further in
the below reference.

The architecture we eventually ended up with is a minimal file system
that can hold map/prog objects. The file system is a per mount namespace
singleton, and the default mount point is /sys/fs/bpf/. Any subsequent
mounts within a given namespace will point to the same instance. The
file system allows for creating a user-defined directory structure.
The objects for maps/progs are created/fetched through bpf(2) with
two new commands (BPF_OBJ_PIN/BPF_OBJ_GET). I.e. a bpf file descriptor
along with a pathname is being passed to bpf(2) that in turn creates
(we call it eBPF object pinning) the file system nodes. Only the pathname
is being passed to bpf(2) for getting a new BPF file descriptor to an
existing node. The user can use that to access maps and progs later on,
through bpf(2). Removal of file system nodes is being managed through
normal VFS functions such as unlink(2), etc. The file system code is
kept to a very minimum and can be further extended later on.

The next step I'm working on is to add dump eBPF map/prog commands
to bpf(2), so that a specification from a given file descriptor can
be retrieved. This can be used by things like CRIU but also applications
can inspect the meta data after calling BPF_OBJ_GET.

Big thanks also to Alexei and Hannes who significantly contributed
in the design discussion that eventually let us end up with this
architecture here.

Reference: https://lkml.org/lkml/2015/10/15/925Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b2197755

bpf: align and clean bpf_{map,prog}_get helpers · c2101297

由 Daniel Borkmann 提交于 10月 29, 2015

Add a bpf_map_get() function that we're going to use later on and
align/clean the remaining helpers a bit so that we have them a bit
more consistent:

  - __bpf_map_get() and __bpf_prog_get() that both work on the fd
    struct, check whether the descriptor is eBPF and return the
    pointer to the map/prog stored in the private data.

    Also, we can return f.file->private_data directly, the function
    signature is enough of a documentation already.

  - bpf_map_get() and bpf_prog_get() that both work on u32 user fd,
    call their respective __bpf_map_get()/__bpf_prog_get() variants,
    and take a reference.
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Acked-by: NAlexei Starovoitov <ast@kernel.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c2101297

net: fix percpu memory leaks · 1d6119ba

由 Eric Dumazet 提交于 11月 02, 2015

This patch fixes following problems :

1) percpu_counter_init() can return an error, therefore
  init_frag_mem_limit() must propagate this error so that
  inet_frags_init_net() can do the same up to its callers.

2) If ip[46]_frags_ns_ctl_register() fail, we must unwind
   properly and free the percpu_counter.

Without this fix, we leave freed object in percpu_counters
global list (if CONFIG_HOTPLUG_CPU) leading to crashes.

This bug was detected by KASAN and syzkaller tool
(http://github.com/google/syzkaller)

Fixes: 6d7b857d ("net: use lib/percpu_counter API for fragmentation mem accounting")
Signed-off-by: NEric Dumazet <edumazet@google.com>
Reported-by: NDmitry Vyukov <dvyukov@google.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1d6119ba

net: avoid NULL deref in inet_ctl_sock_destroy() · 8fa677d2

由 Eric Dumazet 提交于 11月 02, 2015

Under low memory conditions, tcp_sk_init() and icmp_sk_init()
can both iterate on all possible cpus and call inet_ctl_sock_destroy(),
with eventual NULL pointer.
Signed-off-by: NEric Dumazet <edumazet@google.com>
Reported-by: NDmitry Vyukov <dvyukov@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

8fa677d2

net: make skb_set_owner_w() more robust · 9e17f8a4

由 Eric Dumazet 提交于 11月 01, 2015

skb_set_owner_w() is called from various places that assume
skb->sk always point to a full blown socket (as it changes
sk->sk_wmem_alloc)

We'd like to attach skb to request sockets, and in the future
to timewait sockets as well. For these kind of pseudo sockets,
we need to take a traditional refcount and use sock_edemux()
as the destructor.

It is now time to un-inline skb_set_owner_w(), being too big.

Fixes: ca6fb065 ("tcp: attach SYNACK messages to request sockets instead of listener")
Signed-off-by: NEric Dumazet <edumazet@google.com>
Bisected-by: NHaiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9e17f8a4

02 11月, 2015 4 次提交

mm: get rid of 'vmalloc_info' from /proc/meminfo · a5ad88ce

由 Linus Torvalds 提交于 11月 01, 2015

It turns out that at least some versions of glibc end up reading
/proc/meminfo at every single startup, because glibc wants to know the
amount of memory the machine has.  And while that's arguably insane,
it's just how things are.

And it turns out that it's not all that expensive most of the time, but
the vmalloc information statistics (amount of virtual memory used in the
vmalloc space, and the biggest remaining chunk) can be rather expensive
to compute.

The 'get_vmalloc_info()' function actually showed up on my profiles as
4% of the CPU usage of "make test" in the git source repository, because
the git tests are lots of very short-lived shell-scripts etc.

It turns out that apparently this same silly vmalloc info gathering
shows up on the facebook servers too, according to Dave Jones.  So it's
not just "make test" for git.

We had two patches to just cache the information (one by me, one by
Ingo) to mitigate this issue, but the whole vmalloc information of of
rather dubious value to begin with, and people who *actually* want to
know what the situation is wrt the vmalloc area should just look at the
much more complete /proc/vmallocinfo instead.

In fact, according to my testing - and perhaps more importantly,
according to that big search engine in the sky: Google - there is
nothing out there that actually cares about those two expensive fields:
VmallocUsed and VmallocChunk.

So let's try to just remove them entirely.  Actually, this just removes
the computation and reports the numbers as zero for now, just to try to
be minimally intrusive.

If this breaks anything, we'll obviously have to re-introduce the code
to compute this all and add the caching patches on top.  But if given
the option, I'd really prefer to just remove this bad idea entirely
rather than add even more code to work around our historical mistake
that likely nobody really cares about.
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

a5ad88ce

ipv4: fix to not remove local route on link down · 4f823def

由 Julian Anastasov 提交于 10月 30, 2015

When fib_netdev_event calls fib_disable_ip on NETDEV_DOWN event
we should not delete the local routes if the local address
is still present. The confusion comes from the fact that both
fib_netdev_event and fib_inetaddr_event use the NETDEV_DOWN
constant. Fix it by returning back the variable 'force'.

Steps to reproduce:
modprobe dummy
ifconfig dummy0 192.168.168.1 up
ifconfig dummy0 down
ip route list table local | grep dummy | grep host
local 192.168.168.1 dev dummy0  proto kernel  scope host  src 192.168.168.1

Fixes: 8a3d0316 ("net: track link-status of ipv4 nexthops")
Signed-off-by: NJulian Anastasov <ja@ssi.bg>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4f823def

net: dsa: use switchdev obj for VLAN add/del ops · 76e398a6

由 Vivien Didelot 提交于 11月 01, 2015

Simplify DSA by pushing the switchdev objects for VLAN add and delete
operations down to its drivers. Currently only mv88e6xxx is affected.
Signed-off-by: NVivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

76e398a6

VSOCK: define VSOCK_SS_LISTEN once only · ea3803c1

由 Stefan Hajnoczi 提交于 10月 29, 2015

The SS_LISTEN socket state is defined by both af_vsock.c and
vmci_transport.c.  This is risky since the value could be changed in one
file and the other would be out of sync.

Rename from SS_LISTEN to VSOCK_SS_LISTEN since the constant is not part
of enum socket_state (SS_CONNECTED, ...).  This way it is clear that the
constant is vsock-specific.

The big text reflow in af_vsock.c was necessary to keep to the maximum
line length.  Text is unchanged except for s/SS_LISTEN/VSOCK_SS_LISTEN/.
Signed-off-by: NStefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ea3803c1

01 11月, 2015 1 次提交

vfs: Fix pathological performance case for __alloc_fd() · f3f86e33

由 Linus Torvalds 提交于 10月 30, 2015

Al Viro points out that:
> >     * [Linux-specific aside] our __alloc_fd() can degrade quite badly
> > with some use patterns.  The cacheline pingpong in the bitmap is probably
> > inevitable, unless we accept considerably heavier memory footprint,
> > but we also have a case when alloc_fd() takes O(n) and it's _not_ hard
> > to trigger - close(3);open(...); will have the next open() after that
> > scanning the entire in-use bitmap.

And Eric Dumazet has a somewhat realistic multithreaded microbenchmark
that opens and closes a lot of sockets with minimal work per socket.

This patch largely fixes it.  We keep a 2nd-level bitmap of the open
file bitmaps, showing which words are already full.  So then we can
traverse that second-level bitmap to efficiently skip already allocated
file descriptors.

On his benchmark, this improves performance by up to an order of
magnitude, by avoiding the excessive open file bitmap scanning.
Tested-and-acked-by: NEric Dumazet <edumazet@google.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

f3f86e33

30 10月, 2015 2 次提交

Document that IRQ_NONE should be returned when IRQ not actually handled · d9e4ad5b

由 David Woodhouse 提交于 10月 28, 2015

Our IRQ storm detection works when an interrupt handler returns
IRQ_NONE for thousands of consecutive interrupts in a second. It
doesn't hurt to occasionally return IRQ_NONE when the interrupt is
actually genuine.

Drivers should only be returning IRQ_HANDLED if they have actually
*done* something to stop an interrupt from happening — it doesn't just
mean "this really *was* my device".
Signed-off-by: NDavid Woodhouse <David.Woodhouse@intel.com>
Cc: davem@davemloft.net
Link: http://lkml.kernel.org/r/1446016471.3405.201.camel@infradead.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

d9e4ad5b

geneve: implement support for IPv6-based tunnels · 8ed66f0e

由 John W. Linville 提交于 10月 26, 2015

NOTE: Link-local IPv6 addresses for remote endpoints are not supported,
since the driver currently has no capacity for binding a geneve
interface to a specific link.
Signed-off-by: NJohn W. Linville <linville@tuxdriver.com>
Reviewed-by: NJesse Gross <jesse@nicira.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

8ed66f0e

29 10月, 2015 2 次提交

Revert "Merge branch 'ipv6-overflow-arith'" · 1e0d69a9

由 Hannes Frederic Sowa 提交于 10月 28, 2015

Linus dislikes these changes. To not hold up the net-merge let's revert
it for now and fix the bug like Linus suggested.

This reverts commit ec3661b4, reversing
changes made to c80dbe04.

Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1e0d69a9

ARM: OMAP1: fix incorrect INT_DMA_LCD · 1bd5dfe4

由 Aaro Koskinen 提交于 10月 26, 2015

Commit 685e2d08 ("ARM: OMAP1: Change interrupt numbering for
sparse IRQ") turned on SPARSE_IRQ on OMAP1, but forgot to change
the number of INT_DMA_LCD. This broke the boot at least on Nokia 770,
where the device hangs during framebuffer initialization.

Fix by defining INT_DMA_LCD like the other interrupts.

Cc: stable@vger.kernel.org # v4.2+
Fixes: 685e2d08 ("ARM: OMAP1: Change interrupt numbering for sparse IRQ")
Signed-off-by: NAaro Koskinen <aaro.koskinen@iki.fi>
Signed-off-by: NTony Lindgren <tony@atomide.com>

1bd5dfe4

28 10月, 2015 8 次提交

efi: Use correct type for struct efi_memory_map::phys_map · 44511fb9

由 Ard Biesheuvel 提交于 10月 23, 2015

We have been getting away with using a void* for the physical
address of the UEFI memory map, since, even on 32-bit platforms
with 64-bit physical addresses, no truncation takes place if the
memory map has been allocated by the firmware (which only uses
1:1 virtually addressable memory), which is usually the case.

However, commit:

  0f96a99d ("efi: Add "efi_fake_mem" boot option")

adds code that clones and modifies the UEFI memory map, and the
clone may live above 4 GB on 32-bit platforms.

This means our use of void* for struct efi_memory_map::phys_map has
graduated from 'incorrect but working' to 'incorrect and
broken', and we need to fix it.

So redefine struct efi_memory_map::phys_map as phys_addr_t, and
get rid of a bunch of casts that are now unneeded.
Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: NMatt Fleming <matt@codeblueprint.co.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: izumi.taku@jp.fujitsu.com
Cc: kamezawa.hiroyu@jp.fujitsu.com
Cc: linux-efi@vger.kernel.org
Cc: matt.fleming@intel.com
Link: http://lkml.kernel.org/r/1445593697-1342-1-git-send-email-ard.biesheuvel@linaro.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

44511fb9

NFC: nci: non-static functions can not be inline · f1163174

由 Robert Dolca 提交于 10月 26, 2015

This fixes a build error that seems to be toochain
dependent (Not seen with gcc v5.1):

In file included from net/nfc/nci/rsp.c:36:0:
net/nfc/nci/rsp.c: In function ‘nci_rsp_packet’:
include/net/nfc/nci_core.h:355:12: error: inlining failed in call to
always_inline ‘nci_prop_rsp_packet’: function body not available
 inline int nci_prop_rsp_packet(struct nci_dev *ndev, __u16 opcode,
Signed-off-by: NRobert Dolca <robert.dolca@intel.com>
Signed-off-by: NSamuel Ortiz <sameo@linux.intel.com>

f1163174

seccomp, ptrace: add support for dumping seccomp filters · f8e529ed

由 Tycho Andersen 提交于 10月 27, 2015

This patch adds support for dumping a process' (classic BPF) seccomp
filters via ptrace.

PTRACE_SECCOMP_GET_FILTER allows the tracer to dump the user's classic BPF
seccomp filters. addr should be an integer which represents the ith seccomp
filter (0 is the most recently installed filter). data should be a struct
sock_filter * with enough room for the ith filter, or NULL, in which case
the filter is not saved. The return value for this command is the number of
BPF instructions the program represents, or negative in the case of errors.
Command specific errors are ENOENT: which indicates that there is no ith
filter in this seccomp tree, and EMEDIUMTYPE, which indicates that the ith
filter was not installed as a classic BPF filter.

A caveat with this approach is that there is no way to get explicitly at
the heirarchy of seccomp filters, and users need to memcmp() filters to
decide which are inherited. This means that a task which installs two of
the same filter can potentially confuse users of this interface.

v2: * make save_orig const
    * check that the orig_prog exists (not necessary right now, but when
       grows eBPF support it will be)
    * s/n/filter_off and make it an unsigned long to match ptrace
    * count "down" the tree instead of "up" when passing a filter offset

v3: * don't take the current task's lock for inspecting its seccomp mode
    * use a 0x42** constant for the ptrace command value

v4: * don't copy to userspace while holding spinlocks

v5: * add another condition to WARN_ON

v6: * rebase on net-next
Signed-off-by: NTycho Andersen <tycho.andersen@canonical.com>
Acked-by: NKees Cook <keescook@chromium.org>
CC: Will Drewry <wad@chromium.org>
Reviewed-by: NOleg Nesterov <oleg@redhat.com>
CC: Andy Lutomirski <luto@amacapital.net>
CC: Pavel Emelyanov <xemul@parallels.com>
CC: Serge E. Hallyn <serge.hallyn@ubuntu.com>
CC: Alexei Starovoitov <ast@kernel.org>
CC: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: NAlexei Starovoitov <ast@kernel.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f8e529ed

qed: Add statistics support · 9df2ed04

由 Manish Chopra 提交于 10月 26, 2015

Device statistics can be gathered on-demand. This adds the qed support for
reading the statistics [both function and port] from the device, and adds
to the public API a method for requesting the current statistics.
Signed-off-by: NManish Chopra <Manish.Chopra@qlogic.com>
Signed-off-by: NYuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: NAriel Elior <Ariel.Elior@qlogic.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9df2ed04

qed: Add link support · cc875c2e

由 Yuval Mintz 提交于 10月 26, 2015

Physical link is handled by the management Firmware.
This patch lays the infrastructure for attention handling in the driver,
as link change notifications arrive via async. attentions,
as well the handling of such notifications.

This patch also extends the API with the protocol drivers by adding
registered callbacks which the protocol driver passes to qed in order
to be notified of async. events originating from the FW/HW.
Signed-off-by: NYuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: NAriel Elior <Ariel.Elior@qlogic.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

cc875c2e

qed: Add slowpath L2 support · cee4d264

由 Manish Chopra 提交于 10月 26, 2015

This patch adds to the qed the support to configure various L2 elements,
such as channels and basic filtering conditions.
It also enhances its public API to allow qede to later utilize this
functionality.
Signed-off-by: NManish Chopra <Manish.Chopra@qlogic.com>
Signed-off-by: NYuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: NAriel Elior <Ariel.Elior@qlogic.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

cee4d264

qed: Add basic L2 interface · 25c089d7

由 Yuval Mintz 提交于 10月 26, 2015

This patch adds a public API for a network driver to work on top of QED.
The interface itself is very minimal - it's mostly infrastructure, as the
only content it has after this patch is a query for HW-based information
required for the creation of a network interface [I.e., no actual
protocol-specific configurations are supported].
Signed-off-by: NManish Chopra <Manish.Chopra@qlogic.com>
Signed-off-by: NYuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: NAriel Elior <Ariel.Elior@qlogic.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

25c089d7

qed: Add module with basic common support · fe56b9e6

由 Yuval Mintz 提交于 10月 26, 2015

The Qlogic Everest Driver is the backend module for the QL4xxx ethernet
products by Qlogic.

This module serves two main purposes:
 1. It's responsible to contain all the common code that will be shared
    between the various drivers that would be used with said line of
    products. Flows such as chip initialization and de-initialization
    fall under this category.

 2. It would abstract the protocol-specific HW & FW components, allowing
    the protocol drivers to have a clean APIs which is detached in its
    slowpath configuration from the actual HSI.

This adds a very basic module without any protocol-specific bits.
I.e., this adds a basic implementation that almost entirely falls under
the first category.
Signed-off-by: NYuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: NAriel Elior <Ariel.Elior@qlogic.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

fe56b9e6

27 10月, 2015 8 次提交

drivers/pinctrl: Add the concept of an "init" state · ef0eebc0

由 Douglas Anderson 提交于 10月 20, 2015

For pinctrl the "default" state is applied to pins before the driver's
probe function is called.  This is normally a sensible thing to do,
but in some cases can cause problems.  That's because the pins will
change state before the driver is given a chance to program how those
pins should behave.

As an example you might have a regulator that is controlled by a PWM
(output high = high voltage, output low = low voltage).  The firmware
might leave this pin as driven high.  If we allow the driver core to
reconfigure this pin as a PWM pin before the PWM's probe function runs
then you might end up running at too low of a voltage while we probe.

Let's introudce a new "init" state.  If this is defined we'll set
pinctrl to this state before probe and then "default" after probe
(unless the driver explicitly changed states already).

An alternative idea that was thought of was to use the pre-existing
"sleep" or "idle" states and add a boolean property that we should
start in that mode.  This was not done because the "init" state is
needed for correctness and those other states are only present (and
only transitioned in to and out of) when (optional) power management
is enabled.

Changes in v3:
- Moved declarations to pinctrl/devinfo.h
- Fixed author/SoB

Changes in v2:
- Added comment to pinctrl_init_done() as per Linus W.
Signed-off-by: NDouglas Anderson <dianders@chromium.org>
Acked-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Tested-by: NCaesar Wang <wxt@rock-chips.com>
Signed-off-by: NLinus Walleij <linus.walleij@linaro.org>

ef0eebc0

mmc: mmc: extend the mmc_send_tuning() · 9979dbe5

由 Chaotian Jing 提交于 10月 27, 2015

The mmc_execute_tuning() has already prepared the opcode,
there is no need to prepare it again at mmc_send_tuning(),
and, there is a BUG of mmc_send_tuning() to determine the opcode
by bus width, assume eMMC was running at HS200, 4bit mode,
then the mmc_send_tuning() will overwrite the opcode from CMD21
to CMD19, then got error.

in addition, extend an argument of "cmd_error" to allow getting
if there was cmd error when tune response.
Signed-off-by: NChaotian Jing <chaotian.jing@mediatek.com>
[Ulf: Rebased patch]
Signed-off-by: NUlf Hansson <ulf.hansson@linaro.org>

9979dbe5

blkcg: fix incorrect read/write sync/async stat accounting · 174fd8d3

由 Tejun Heo 提交于 10月 22, 2015

While unifying how blkcg stats are collected, 77ea7338 ("blkcg:
move io_service_bytes and io_serviced stats into blkcg_gq")
incorrectly used bio->flags instead of bio->rw to tell the IO type.
This made IOs to be accounted as the wrong type.  Fix it.
Signed-off-by: NTejun Heo <tj@kernel.org>
Fixes: 77ea7338 ("blkcg: move io_service_bytes and io_serviced stats into blkcg_gq")
Reviewed-by: NJeff Moyer <jmoyer@redhat.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

174fd8d3

net: tso: add support for IPv6 · 8941faa1

由 emmanuel.grumbach@intel.com 提交于 10月 26, 2015

Adding IPv6 for the TSO helper API is trivial:
* Don't play with the id (which doesn't exist in IPv6)
* Correctly update the payload_len (don't include the
  length of the IP header itself)
Signed-off-by: NEmmanuel Grumbach <emmanuel.grumbach@intel.com>
Acked-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

8941faa1

bpf: fix bpf_perf_event_read() helper · 62544ce8

由 Alexei Starovoitov 提交于 10月 22, 2015

Fix safety checks for bpf_perf_event_read():
- only non-inherited events can be added to perf_event_array map
  (do this check statically at map insertion time)
- dynamically check that event is local and !pmu->count
Otherwise buggy bpf program can cause kernel splat.

Also fix error path after perf_event_attrs()
and remove redundant 'extern'.

Fixes: 35578d79 ("bpf: Implement function bpf_perf_event_read() that get the selected hardware PMU conuter")
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
Tested-by: NWang Nan <wangnan0@huawei.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

62544ce8

NFC: NCI: allow spi driver to choose transfer clock · 2bd83245

由 Vincent Cuissard 提交于 10月 26, 2015

In some cases low level drivers might want to update the
SPI transfer clock (e.g. during firmware download).

This patch adds this support. Without any modification the
driver will use the default SPI clock (from pdata or device tree).
Signed-off-by: NVincent Cuissard <cuissard@marvell.com>
Signed-off-by: NSamuel Ortiz <sameo@linux.intel.com>

2bd83245

NFC: nfcmrvl: add i2c driver · b5b3e23e

由 Vincent Cuissard 提交于 10月 26, 2015

This driver adds the support of I2C-based Marvell NFC controller.
Signed-off-by: NVincent Cuissard <cuissard@marvell.com>
Signed-off-by: NSamuel Ortiz <sameo@linux.intel.com>

b5b3e23e

NFC: NCI: export nci_send_frame and nci_send_cmd function · e5629d29

由 Vincent Cuissard 提交于 10月 26, 2015

Export nci_send_frame and nci_send_cmd symbols to allow drivers
to use it. This is needed for example if NCI is used during
firmware download phase.
Signed-off-by: NVincent Cuissard <cuissard@marvell.com>
Signed-off-by: NSamuel Ortiz <sameo@linux.intel.com>

e5629d29

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功