Commits · 0e90b31f4ba77027a7c21cbfc66404df0851ca21 · openeuler / raspberrypi-kernel

23 Jan, 2012 2 commits

net: introduce res_counter_charge_nofail() for socket allocations · 0e90b31f

Authored by Glauber Costa on Jan 20, 2012

There is a case in __sk_mem_schedule(), where an allocation
is beyond the maximum, but yet we are allowed to proceed.
It happens under the following condition:

	sk->sk_wmem_queued + size >= sk->sk_sndbuf

The network code won't revert the allocation in this case,
meaning that at some point later it'll try to uncharge it. Since
the allocation is never communicated to the underlying res_counter
code, there is an imbalance in the res_counter uncharge operation.

I see two ways of fixing this:

1) storing the information about those allocations somewhere
   in memcg, and then deducting from that first, before
   we start draining the res_counter,
2) providing a slightly different allocation function for
   the res_counter, that matches the original behavior of
   the network code more closely.

I decided to go for #2 here, believing it to be more elegant,
since #1 would require us to do basically that, but in a more
obscure way.
Signed-off-by: Glauber Costa <glommer@parallels.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
CC: Tejun Heo <tj@kernel.org>
CC: Li Zefan <lizf@cn.fujitsu.com>
CC: Laurent Chavey <chavey@google.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

0e90b31f
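
As a rough userspace illustration of option 2 from the message above, a "no fail" charge could always account the bytes and merely report that the limit was exceeded, so that a later uncharge stays balanced. The struct and function names below are hypothetical simplifications and are not the kernel's res_counter API.

```c
#include <stdbool.h>

/* Hypothetical, simplified counter; the real res_counter carries more state. */
struct demo_counter {
    unsigned long usage;
    unsigned long limit;
};

/* Strict charge: refuse to go past the limit and leave usage untouched. */
static bool demo_charge(struct demo_counter *c, unsigned long val)
{
    if (c->usage + val > c->limit)
        return false;
    c->usage += val;
    return true;
}

/* "No fail" charge: always account the bytes, but tell the caller whether
 * the counter is still within its limit afterwards. */
static bool demo_charge_nofail(struct demo_counter *c, unsigned long val)
{
    c->usage += val;
    return c->usage <= c->limit;
}

/* Uncharge, clamped at zero so an unmatched uncharge cannot wrap around. */
static void demo_uncharge(struct demo_counter *c, unsigned long val)
{
    c->usage = (val > c->usage) ? 0 : c->usage - val;
}
```

With the strict variant, an allocation that the network code lets through anyway would never be recorded, and the matching uncharge would later remove bytes that were never added. The no-fail variant keeps both sides of the ledger in step.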

pktgen: Fix unsigned function that is returning negative vals · bf0813bd

Authored by Paul Gortmaker on Jan 19, 2012

Every call to num_args() immediately checks the return value for
less than zero, as it will return -EFAULT for a failed get_user()
call.  So it makes no sense for the function to be declared as an
unsigned long.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bf0813bd
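
The problem described above can be reproduced in a few lines of plain C. This is only a sketch with hypothetical function names; the `fault` parameter stands in for a failed get_user() call.

```c
#include <errno.h>

/* Buggy shape: the negative error code is converted to a huge positive
 * value because the return type is unsigned. */
static unsigned long demo_count_unsigned(int fault)
{
    if (fault)
        return -EFAULT;
    return 3;
}

/* Fixed shape: a signed return type lets the error code survive, so a
 * "< 0" check in the caller does what it is meant to do. */
static long demo_count_signed(int fault)
{
    if (fault)
        return -EFAULT;
    return 3;
}
```

A caller that tests the unsigned result directly for `< 0` can never see the error, whereas the signed version returns -EFAULT unchanged.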

18 Jan, 2012 2 commits

net: fix NULL-deref in WARN() in skb_gso_segment() · 65e9d2fa

Authored by Michał Mirosław on Jan 17, 2012

Bug was introduced in commit c8f44aff.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>

65e9d2fa

net: WARN if skb_checksum_help() is called on skb requiring segmentation · 36c92474

Authored by Ben Hutchings on Jan 17, 2012

skb_checksum_help() has never done anything useful with skbs that
require segmentation.  Setting skb->ip_summed = CHECKSUM_NONE makes
them invalid and provokes a later WARNing in skb_gso_segment().

Passing such an skb to skb_checksum_help() indicates a bug, so we
should warn about it immediately.  Move the warning from
skb_gso_segment() into a shared function, and add gso_type and
gso_size to it.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

36c92474

17 Jan, 2012 3 commits

net: fix some sparse errors · 747465ef

Authored by Eric Dumazet on Jan 16, 2012

make C=2 CF="-D__CHECK_ENDIAN__" M=net

And fix flowi4_init_output() prototype for sport
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

747465ef

net: Use device model to get driver name in skb_gso_segment() · e52ac339

Authored by Ben Hutchings on Jan 16, 2012

ethtool operations generally require the caller to hold RTNL and are
not safe to call in atomic context.  The device model provides this
information for most devices; we'll only lose it for some old ISA
drivers.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

e52ac339

bql: Fix inconsistency between file mode and attr method. · 795d9a25

Authored by Hiroaki SHIMODA on Jan 14, 2012

There is no store() method for the inflight attribute in the
tx-<n>/byte_queue_limits sysfs directory.
So remove the S_IWUSR bit.
Signed-off-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

795d9a25

13 Jan, 2012 1 commit

net: reintroduce missing rcu_assign_pointer() calls · cf778b00

Authored by Eric Dumazet on Jan 12, 2012

commit a9b3cd7f (rcu: convert uses of rcu_assign_pointer(x, NULL) to
RCU_INIT_POINTER) did a lot of incorrect changes, since it did a
complete conversion of rcu_assign_pointer(x, y) to RCU_INIT_POINTER(x,
y).

We miss needed barriers, even on x86, when y is not NULL.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Stephen Hemminger <shemminger@vyatta.com>
CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cf778b00
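
The distinction the message draws can be approximated in userspace with C11 atomics. This is only an analogy: the function names are hypothetical, and the kernel primitives are implemented differently. A plain (relaxed) store is acceptable when publishing NULL, while publishing a freshly initialised object needs release ordering so that readers never observe the pointer before the object's contents.

```c
#include <stdatomic.h>
#include <stddef.h>

struct demo_config {
    int value;
};

/* Shared pointer that readers dereference concurrently. */
static _Atomic(struct demo_config *) demo_current;

/* Rough analogue of RCU_INIT_POINTER(): no ordering guarantees. Suitable
 * for storing NULL, or before any reader can reach the pointer. */
static void demo_publish_relaxed(struct demo_config *c)
{
    atomic_store_explicit(&demo_current, c, memory_order_relaxed);
}

/* Rough analogue of rcu_assign_pointer(): the release store orders the
 * initialisation of *c before the pointer becomes visible. */
static void demo_publish_release(struct demo_config *c)
{
    atomic_store_explicit(&demo_current, c, memory_order_release);
}

/* Reader side: acquire load pairs with the release store above. */
static struct demo_config *demo_read(void)
{
    return atomic_load_explicit(&demo_current, memory_order_acquire);
}
```

Even on a strongly ordered CPU the release store matters, because it also stops the compiler from moving the initialising writes past the pointer store.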

10 Jan, 2012 3 commits

net: Fix build with INET disabled. · 3969eb38

Authored by David S. Miller on Jan 09, 2012

> net/core/sock.c: In function 'sk_update_clone':
> net/core/sock.c:1278:3: error: implicit declaration of function 'sock_update_memcg'
Reported-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

3969eb38

net: introduce netif_addr_lock_nested() and call if when appropriate · 2429f7ac

Authored by Jiri Pirko on Jan 09, 2012

dev_uc_sync() and dev_mc_sync() are acquiring netif_addr_lock for
destination device of synchronization. Since netif_addr_lock is
already held at the time for source device, this triggers lockdep
deadlock warning.

There's no way this deadlock can happen so use spin_lock_nested() to
silence the warning.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2429f7ac

net: correct lock name in dev_[uc/mc]_sync documentations. · ab16ebf3

Authored by Jiri Pirko on Jan 09, 2012

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ab16ebf3

09 Jan, 2012 1 commit

net: sk_update_clone is only used in net/core/sock.c · 475f1b52

Authored by Stephen Rothwell on Jan 09, 2012

so move it there.  Fixes build errors when CONFIG_INET is not defined:

In file included from include/linux/tcp.h:211:0,
                 from include/linux/ipv6.h:221,
                 from include/net/ipv6.h:16,
                 from include/linux/sunrpc/clnt.h:26,
                 from include/linux/nfs_fs.h:50,
                 from init/do_mounts.c:20:
include/net/sock.h: In function 'sk_update_clone':
include/net/sock.h:1109:3: error: implicit declaration of function 'sock_update_memcg' [-Werror=implicit-function-declaration]
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

475f1b52

08 Jan, 2012 2 commits

pktgen: set correct max and min in pktgen_setup_inject() · 26e29eed

Authored by Dan Carpenter on Jan 06, 2012

In 88271660 "pktgen: fix multiple queue warning" we added special
logic to handle the case where ntxq is zero.  It's not clear to me that
ntxq can actually be zero.  But if it were then we would set
->queue_map_min and ->queue_map_max to USHRT_MAX when probably we want
to set them to zero?
Signed-off-by: David S. Miller <davem@davemloft.net>

26e29eed

net: fix sock_clone reference mismatch with tcp memcontrol · f3f511e1

Authored by Glauber Costa on Jan 05, 2012

Sockets can also be created through sock_clone. Because it copies
all data in the sock structure, it also copies the memcg-related pointer,
and all should be fine. However, since we now use reference counts in
socket creation, we are left with some sockets that have no reference
counts. It matters when we destroy them, since it leads to a mismatch.
Signed-off-by: Glauber Costa <glommer@parallels.com>
CC: David S. Miller <davem@davemloft.net>
CC: Greg Thelen <gthelen@google.com>
CC: Hiroyouki Kamezawa <kamezawa.hiroyu@jp.fujitsu.com>
CC: Laurent Chavey <chavey@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f3f511e1
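
The mismatch can be pictured with a toy reference count. The types and functions below are hypothetical stand-ins, not the kernel's sock or memcg structures: creation takes a reference, destruction drops one, and a clone that merely copies the pointer would drop a reference it never took.

```c
#include <string.h>

/* Hypothetical stand-ins for the per-cgroup state and a socket. */
struct demo_group {
    int refs;
};

struct demo_sock {
    struct demo_group *grp;
    int payload;
};

/* Creation attaches the socket to a group and takes a reference. */
static void demo_sock_create(struct demo_sock *sk, struct demo_group *g)
{
    sk->grp = g;
    sk->payload = 0;
    g->refs++;
}

/* Cloning copies every field, including the group pointer, so the clone
 * takes its own reference to keep get and put balanced. */
static void demo_sock_clone(struct demo_sock *dst, const struct demo_sock *src)
{
    memcpy(dst, src, sizeof(*dst));
    if (dst->grp)
        dst->grp->refs++;
}

/* Destruction drops the reference held by this socket. */
static void demo_sock_destroy(struct demo_sock *sk)
{
    if (sk->grp)
        sk->grp->refs--;
    sk->grp = NULL;
}
```

Without the increment in the clone path, destroying both sockets would drive the count one below where it started.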

06 Jan, 2012 1 commit

security: remove the security_netlink_recv hook as it is equivalent to capable() · fd778461

Authored by Eric Paris on Jan 03, 2012

Once upon a time netlink was not sync and we had to get the effective
capabilities from the skb that was being received.  Today we instead get
the capabilities from the current task.  This has rendered the entire
purpose of the hook moot as it is now functionally equivalent to the
capable() call.
Signed-off-by: Eric Paris <eparis@redhat.com>

fd778461

05 Jan, 2012 2 commits

ethtool: Remove ethtool_ops::set_rx_ntuple operation · 6cfb5e75

Authored by Ben Hutchings on Jan 03, 2012

All implementations have been converted to implement set_rxnfc
instead.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6cfb5e75

ethtool: Allow drivers to select RX NFC rule locations · 55664f32

Authored by Ben Hutchings on Jan 03, 2012

Define special location values for RX NFC that request the driver to
select the actual rule location.  This allows for implementation on
devices that use hash-based filter lookup, whereas currently the API is
more suited to devices with TCAM lookup or linear search.

In ethtool_set_rxnfc() and the compat wrapper ethtool_ioctl(), copy
the structure back to user-space after insertion so that the actual
location is returned.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

55664f32

31 Dec, 2011 1 commit

sock_diag: Introduce the meminfo nla core (v2) · 5d2e5f27

Authored by Pavel Emelyanov on Dec 30, 2011

Add a routine that dumps memory-related values of a socket.
It's made as an array to make it possible to add more stuff
here later without breaking compatibility.

Since v1: The SK_MEMINFO_ constants are in userspace
visible part of sock_diag.h, the rest is under __KERNEL__.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5d2e5f27
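
The array layout mentioned above might look roughly like the following sketch. The index names are hypothetical (loosely modelled on the SK_MEMINFO_ prefix) and the stats struct is invented for illustration. New values are appended before the final count entry, so consumers that only know the earlier indices keep working.

```c
#include <stdint.h>

/* Hypothetical index names; the trailing entry is the slot count and
 * grows whenever a new value is appended. */
enum {
    DEMO_MEMINFO_RMEM_ALLOC,
    DEMO_MEMINFO_RCVBUF,
    DEMO_MEMINFO_WMEM_ALLOC,
    DEMO_MEMINFO_SNDBUF,
    DEMO_MEMINFO_VARS
};

/* Invented stand-in for the per-socket memory counters. */
struct demo_sock_stats {
    uint32_t rmem_alloc;
    uint32_t rcvbuf;
    uint32_t wmem_alloc;
    uint32_t sndbuf;
};

/* Dump the counters into a flat array indexed by the enum above. */
static void demo_fill_meminfo(const struct demo_sock_stats *s,
                              uint32_t mem[DEMO_MEMINFO_VARS])
{
    mem[DEMO_MEMINFO_RMEM_ALLOC] = s->rmem_alloc;
    mem[DEMO_MEMINFO_RCVBUF] = s->rcvbuf;
    mem[DEMO_MEMINFO_WMEM_ALLOC] = s->wmem_alloc;
    mem[DEMO_MEMINFO_SNDBUF] = s->sndbuf;
}
```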

29 Dec, 2011 1 commit

ipv6: Use universal hash for NDISC. · 2c2aba6c

Authored by David S. Miller on Dec 28, 2011

In order to perform a proper universal hash on a vector of integers,
we have to use different universal hashes on each vector element.

Which means we need 4 different hash randoms for ipv6.
Signed-off-by: David S. Miller <davem@davemloft.net>

2c2aba6c
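
A minimal sketch of the idea, assuming a multiply-and-add construction with one random multiplier per 32-bit word of the address. The function name is hypothetical and the kernel's actual hash function may mix in further inputs.

```c
#include <stdint.h>

/* Hash four 32-bit words, each with its own random multiplier.
 * Unsigned arithmetic wraps modulo 2^32, which is intended here. */
static uint32_t demo_hash_ipv6(const uint32_t addr[4], const uint32_t rnd[4])
{
    return addr[0] * rnd[0] + addr[1] * rnd[1] +
           addr[2] * rnd[2] + addr[3] * rnd[3];
}
```

With a single shared multiplier, swapping two words of the address would leave the sum unchanged. Distinct multipliers make the position of each word matter.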

25 Dec, 2011 1 commit

rfs: better sizing of dev_flow_table · 60b778ce

Authored by Eric Dumazet on Dec 24, 2011

Aim of this patch is to provide full range of rps_flow_cnt on 64bit arches.

Theoretical limit on number of flows is 2^32

Fix some buggy RPS/RFS macros as well.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Tom Herbert <therbert@google.com>
CC: Xi Wang <xi.wang@gmail.com>
CC: Laurent Chavey <chavey@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

60b778ce

23 Dec, 2011 2 commits

net: relax rcvbuf limits · 0fd7bac6

Authored by Eric Dumazet on Dec 21, 2011

skb->truesize might be big even for a small packet.

It's even bigger after commit 87fb4b7b (net: more accurate skb
truesize) and big MTU.

We should allow queueing at least one packet per receiver, even with a
low RCVBUF setting.
Reported-by: Michal Simek <monstr@monstr.eu>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0fd7bac6

rps: fix insufficient bounds checking in store_rps_dev_flow_table_cnt() · a0a129f8

Authored by Xi Wang on Dec 22, 2011

Setting a large rps_flow_cnt like (1 << 30) on a 32-bit platform will
cause a kernel oops due to insufficient bounds checking.

	if (count > 1<<30) {
		/* Enforce a limit to prevent overflow */
		return -EINVAL;
	}
	count = roundup_pow_of_two(count);
	table = vmalloc(RPS_DEV_FLOW_TABLE_SIZE(count));

Note that the macro RPS_DEV_FLOW_TABLE_SIZE(count) is defined as:

	... + (count * sizeof(struct rps_dev_flow))

where sizeof(struct rps_dev_flow) is 8.  (1 << 30) * 8 will overflow
32 bits.

This patch replaces the magic number (1 << 30) with a symbolic bound.
Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a0a129f8
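
The arithmetic in the message can be checked with fixed-width types. The sketch below models a 32-bit size computation with `uint32_t`; the helper names and the `header` parameter (the fixed part of the table size) are hypothetical, and the bound shown is one possible symbolic limit rather than the exact expression used by the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Size of one flow entry in bytes, as stated in the commit message. */
#define DEMO_FLOW_ENTRY_SIZE 8u

/* Byte count as computed with 32-bit arithmetic: wraps modulo 2^32. */
static uint32_t demo_table_bytes_32(uint32_t count)
{
    return count * DEMO_FLOW_ENTRY_SIZE;
}

/* Bound derived from the size limit instead of a magic number: the entry
 * array plus the fixed header must fit in 32 bits. */
static bool demo_count_fits_32(uint32_t count, uint32_t header)
{
    return count <= (UINT32_MAX - header) / DEMO_FLOW_ENTRY_SIZE;
}
```

For a count of 1 << 30 the product is 2^33, which wraps to 0 in 32 bits, so the allocation would be far smaller than the table the code then tries to fill.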

22 Dec, 2011 1 commit

net: Add a flow_cache_flush_deferred function · c0ed1c14

Authored by Steffen Klassert on Dec 21, 2011

flow_cache_flush() might sleep but can be called from
atomic context via the xfrm garbage collector. So add
a flow_cache_flush_deferred() function and use this if
the xfrm garbage collector is invoked from within the
packet path.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Timo Teräs <timo.teras@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>

c0ed1c14

20 Dec, 2011 1 commit

Revert "net: Remove unused neighbour layer ops." · 447f2191

Authored by David S. Miller on Dec 19, 2011

This reverts commit 5c3ddec7.

S390 qeth driver actually still uses the setup ops.
Reported-by: Frank Blaschka <blaschka@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

447f2191

17 Dec, 2011 6 commits

net:core: use IS_ENABLED · a3bf7ae9

Authored by Igor Maravić on Dec 12, 2011

Use IS_ENABLED(CONFIG_FOO)
instead of defined(CONFIG_FOO) || defined (CONFIG_FOO_MODULE)
Signed-off-by: Igor Maravić <igorm@etf.rs>
Signed-off-by: David S. Miller <davem@davemloft.net>

a3bf7ae9
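
For readers unfamiliar with how such a macro can evaluate to 1 or 0 inside ordinary C expressions, here is a sketch of one preprocessor technique. All DEMO_ names and the CONFIG_ options are hypothetical, and this is not necessarily how the kernel implemented IS_ENABLED at the time of this commit. An option defined to 1 is pasted onto a placeholder prefix that expands to `0,`, which shifts a literal 1 into the argument slot that gets selected.

```c
/* Expands to "0," only when the pasted suffix is exactly 1. */
#define DEMO_PLACEHOLDER_1 0,
/* Select the second argument; the rest is ignored. */
#define DEMO_SECOND(ignored, val, ...) val
#define DEMO_IS_DEFINED(x) DEMO_IS_DEFINED_(x)
#define DEMO_IS_DEFINED_(val) DEMO_IS_DEFINED__(DEMO_PLACEHOLDER_##val)
#define DEMO_IS_DEFINED__(arg1_or_junk) DEMO_SECOND(arg1_or_junk 1, 0, 0)

/* True when the option is built in (=y) or built as a module (=m). */
#define DEMO_IS_ENABLED(option) \
    (DEMO_IS_DEFINED(option) || DEMO_IS_DEFINED(option##_MODULE))

/* Hypothetical configuration: one built-in option, one modular option. */
#define CONFIG_DEMO_BUILTIN 1
#define CONFIG_DEMO_MODULAR_MODULE 1
```

Unlike `defined()`, the result is a normal integer constant expression, so it can be used in `if` statements as well as in `#if` directives.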

net: fix sleeping while atomic problem in sock mem_cgroup. · 36b77a52

Authored by Glauber Costa on Dec 16, 2011

We can't scan the proto_list to initialize sock cgroups, as it
holds a rwlock, and we also want to keep the code generic enough to
avoid calling the initialization functions of protocols directly.

Convert proto_list_lock into a mutex, so we can sleep and do the
necessary allocations. This lock is seldom taken, so there shouldn't
be any performance penalties associated with that.
Signed-off-by: Glauber Costa <glommer@parallels.com>
CC: Hiroyouki Kamezawa <kamezawa.hiroyu@jp.fujitsu.com>
CC: David S. Miller <davem@davemloft.net>
CC: Eric Dumazet <eric.dumazet@gmail.com>
CC: Stephen Rothwell <sfr@canb.auug.org.au>
CC: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

36b77a52

ethtool: Define and apply a default policy for RX flow hash indirection · 278bc429

Authored by Ben Hutchings on Dec 15, 2011

All drivers that support modification of the RX flow hash indirection
table initialise it in the same way: RX rings are assigned to table
entries in rotation.  Make that default policy explicit by having them
call an ethtool_rxfh_indir_default() function.

In the ethtool core, add support for a zero size value for
ETHTOOL_SRXFHINDIR, which resets the table to this default.
Partly-suggested-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: Shreyas N Bhatewara <sbhatewara@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

278bc429
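
The "in rotation" policy amounts to a modulo. The sketch below uses hypothetical helper names and assumes the default for a table entry depends only on its index and the number of RX rings.

```c
#include <stddef.h>
#include <stdint.h>

/* Default ring for a table entry: rings are assigned round-robin.
 * n_rx_rings must be non-zero. */
static uint32_t demo_rxfh_indir_default(uint32_t index, uint32_t n_rx_rings)
{
    return index % n_rx_rings;
}

/* Reset a whole indirection table to the default policy. */
static void demo_fill_default_indir(uint32_t *table, size_t size,
                                    uint32_t n_rx_rings)
{
    for (size_t i = 0; i < size; i++)
        table[i] = demo_rxfh_indir_default((uint32_t)i, n_rx_rings);
}
```

For example, with three rings an eight-entry table becomes 0, 1, 2, 0, 1, 2, 0, 1.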

ethtool: Centralise validation of ETHTOOL_{G, S}RXFHINDIR parameters · 7850f63f

Authored by Ben Hutchings on Dec 15, 2011

Add a new ethtool operation (get_rxfh_indir_size) to get the
indirection table size.  Use this to validate the user buffer size
before calling get_rxfh_indir or set_rxfh_indir.  Use get_rxnfc to get
the number of RX rings, and validate the contents of the new
indirection table before calling set_rxfh_indir.  Remove this
validation from drivers.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: Dimitris Michailidis <dm@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7850f63f
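
A centralised check of this kind might look like the sketch below. The function name is hypothetical, and the rule that the user-supplied size must equal the device's table size is an assumption made for illustration. The contents check simply requires every entry to name an existing RX ring.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Validate a user-supplied indirection table before handing it to a
 * driver: the size must match the device's table (assumed rule) and
 * every entry must refer to a ring below n_rx_rings. */
static bool demo_indir_request_valid(const uint32_t *table, size_t user_size,
                                     size_t dev_size, uint32_t n_rx_rings)
{
    if (user_size != dev_size)
        return false;
    for (size_t i = 0; i < user_size; i++)
        if (table[i] >= n_rx_rings)
            return false;
    return true;
}
```

Doing this once in the core means individual drivers can assume the table they receive is already well-formed.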

sock_diag: Generalize requests cookies managements · f65c1b53

Authored by Pavel Emelyanov on Dec 15, 2011

The sk address is used as a cookie between dump/get_exact calls.
It will be required for unix socket dumping, so move it from
inet_diag to sock_diag.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f65c1b53

sock_diag: Fix module netlink aliases · aec8dc62

Authored by Pavel Emelyanov on Dec 15, 2011

I've made a mistake when fixing the sock_/inet_diag aliases :(

1. The sock_diag layer should request the family-based alias,
   not just the IPPROTO_IP one;
2. The inet_diag layer should request the AF_INET+protocol alias,
   not just the protocol one.

Thus fix this.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

aec8dc62

14 Dec, 2011 2 commits

rtnetlink: rtnl_link_register() sanity test · c63044f0

Authored by Eric Dumazet on Dec 13, 2011

Before adding a struct rtnl_link_ops into link_ops list, check it doesn't
clash with a prior one.

Based on a previous patch from Alexander Smirnov
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c63044f0
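
Such a sanity test could be sketched as a lookup by name before insertion. Everything here is a hypothetical simplification: the struct, the list, the assumption that a clash means two entries with the same `kind` string, and the choice of -EEXIST as the error code.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for a link ops entry on a singly linked list. */
struct demo_link_ops {
    const char *kind;
    struct demo_link_ops *next;
};

static struct demo_link_ops *demo_link_ops_list;

/* Register ops unless an entry with the same kind is already present. */
static int demo_link_register(struct demo_link_ops *ops)
{
    for (struct demo_link_ops *p = demo_link_ops_list; p; p = p->next)
        if (strcmp(p->kind, ops->kind) == 0)
            return -EEXIST;

    ops->next = demo_link_ops_list;
    demo_link_ops_list = ops;
    return 0;
}
```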

net: Remove unused neighbour layer ops. · 5c3ddec7

Authored by David S. Miller on Dec 13, 2011

It's simpler to just keep these things out until there is a real user
of them, so we can see what the needs actually are, rather than keep
these things around as useless overhead.
Signed-off-by: David S. Miller <davem@davemloft.net>

5c3ddec7

13 Dec, 2011 3 commits

tcp memory pressure controls · d1a4c0b3

Authored by Glauber Costa on Dec 11, 2011

This patch introduces memory pressure controls for the tcp
protocol. It uses the generic socket memory pressure code
introduced in earlier patches, and fills in the
necessary data in cg_proto struct.
Signed-off-by: Glauber Costa <glommer@parallels.com>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujtisu.com>
CC: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d1a4c0b3

socket: initial cgroup code. · e1aab161

Authored by Glauber Costa on Dec 11, 2011

The goal of this work is to move the memory pressure tcp
controls to a cgroup, instead of just relying on global
conditions.

To avoid excessive overhead in the network fast paths,
the code that accounts allocated memory to a cgroup is
hidden inside a static_branch(). This branch is patched out
until the first non-root cgroup is created. So when nobody
is using cgroups, even if it is mounted, no significant performance
penalty should be seen.

This patch handles the generic part of the code, and has nothing
tcp-specific.
Signed-off-by: Glauber Costa <glommer@parallels.com>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujtsu.com>
CC: Kirill A. Shutemov <kirill@shutemov.name>
CC: David S. Miller <davem@davemloft.net>
CC: Eric W. Biederman <ebiederm@xmission.com>
CC: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e1aab161

foundations of per-cgroup memory pressure controlling. · 180d8cd9

Authored by Glauber Costa on Dec 11, 2011

This patch replaces all uses of struct sock fields' memory_pressure,
memory_allocated, sockets_allocated, and sysctl_mem with accessor
macros. Those macros can either receive a socket argument, or a mem_cgroup
argument, depending on the context they live in.

Since we're only doing a macro wrapping here, no performance impact at all is
expected in the case where we don't have cgroups disabled.
Signed-off-by: Glauber Costa <glommer@parallels.com>
Reviewed-by: Hiroyouki Kamezawa <kamezawa.hiroyu@jp.fujitsu.com>
CC: David S. Miller <davem@davemloft.net>
CC: Eric W. Biederman <ebiederm@xmission.com>
CC: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

180d8cd9

12 Dec, 2011 1 commit

net: use IS_ENABLED(CONFIG_IPV6) · dfd56b8b

Authored by Eric Dumazet on Dec 10, 2011

Instead of testing defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE)
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dfd56b8b

10 Dec, 2011 1 commit

Revert "net: netprio_cgroup: make net_prio_subsys static" · 0221cd51

Authored by John Fastabend on Dec 09, 2011

This reverts commit 865d9f9f.

This commit breaks the build with CONFIG_NETPRIO_CGROUP=y so
revert it. It does build as a module though. The SUBSYS macro
in the cgroup core code automatically defines a subsys structure
as extern. Long term we should fix the macro. And I need to
fully build test things.

Tested with CONFIG_NETPRIO_CGROUP={y|m|n} with and without
CONFIG_CGROUPS defined.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
CC: Neil Horman <nhorman@tuxdriver.com>
Reported-By: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0221cd51

09 Dec, 2011 2 commits

sock_diag: off by one checks · 6f8e4ad0

Authored by Dan Carpenter on Dec 07, 2011

These tests are off by one because sock_diag_handlers[] only has AF_MAX
elements.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6f8e4ad0
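
The off-by-one can be shown with a small table lookup. The array, its size constant and the function name are hypothetical. The point is that an array with N elements has valid indices 0 to N-1, so the rejection test must use `>=` rather than `>`.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for AF_MAX and the handlers table. */
#define DEMO_AF_MAX 4

static const char *const demo_handlers[DEMO_AF_MAX] = {
    "unspec", "unix", "inet", "ax25"
};

/* Look up a handler by family, rejecting out-of-range values. Using
 * "family > DEMO_AF_MAX" here would admit family == DEMO_AF_MAX and
 * read one element past the end of the array. */
static int demo_lookup_handler(int family, const char **out)
{
    if (family < 0 || family >= DEMO_AF_MAX)
        return -EINVAL;
    *out = demo_handlers[family];
    return 0;
}
```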

net: netprio_cgroup: make net_prio_subsys static · 865d9f9f

Authored by John Fastabend on Dec 07, 2011

net_prio_subsys can be made static; this removes the sparse
warning it was throwing.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

865d9f9f

07 Dec, 2011 1 commit

net: Silence seq_scale() unused warning · 68109090

Authored by Stephen Boyd on Dec 06, 2011

On a CONFIG_NET=y build

net/core/secure_seq.c:22: warning: 'seq_scale' defined but not
used
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

68109090