- 23 1月, 2012 2 次提交
-
-
由 Glauber Costa 提交于
There is a case in __sk_mem_schedule(), where an allocation is beyond the maximum, but yet we are allowed to proceed. It happens under the following condition: sk->sk_wmem_queued + size >= sk->sk_sndbuf The network code won't revert the allocation in this case, meaning that at some point later it'll try to do it. Since this is never communicated to the underlying res_counter code, there is an inbalance in res_counter uncharge operation. I see two ways of fixing this: 1) storing the information about those allocations somewhere in memcg, and then deducting from that first, before we start draining the res_counter, 2) providing a slightly different allocation function for the res_counter, that matches the original behavior of the network code more closely. I decided to go for #2 here, believing it to be more elegant, since #1 would require us to do basically that, but in a more obscure way. Signed-off-by: NGlauber Costa <glommer@parallels.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> CC: Tejun Heo <tj@kernel.org> CC: Li Zefan <lizf@cn.fujitsu.com> CC: Laurent Chavey <chavey@google.com> Acked-by: NTejun Heo <tj@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Paul Gortmaker 提交于
Every call to num_args() immediately checks the return value for less than zero, as it will return -EFAULT for a failed get_user() call. So it makes no sense for the function to be declared as an unsigned long. Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 18 1月, 2012 2 次提交
-
-
由 Michał Mirosław 提交于
Bug was introduced in commit c8f44aff. Signed-off-by: NMichał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ben Hutchings 提交于
skb_checksum_help() has never done anything useful with skbs that require segmentation. Setting skb->ip_summed = CHECKSUM_NONE makes them invalid and provokes a later WARNing in skb_gso_segment(). Passing such an skb to skb_checksum_help() indicates a bug, so we should warn about it immediately. Move the warning from skb_gso_segment() into a shared function, and add gso_type and gso_size to it. Signed-off-by: NBen Hutchings <bhutchings@solarflare.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 17 1月, 2012 3 次提交
-
-
由 Eric Dumazet 提交于
make C=2 CF="-D__CHECK_ENDIAN__" M=net And fix flowi4_init_output() prototype for sport Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ben Hutchings 提交于
ethtool operations generally require the caller to hold RTNL and are not safe to call in atomic context. The device model provides this information for most devices; we'll only lose it for some old ISA drivers. Signed-off-by: NBen Hutchings <bhutchings@solarflare.com> Acked-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Hiroaki SHIMODA 提交于
There is no store() method for inflight attribute in the tx-<n>/byte_queue_limits sysfs directory. So remove S_IWUSR bit. Signed-off-by: NHiroaki SHIMODA <shimoda.hiroaki@gmail.com> Acked-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 13 1月, 2012 1 次提交
-
-
由 Eric Dumazet 提交于
commit a9b3cd7f (rcu: convert uses of rcu_assign_pointer(x, NULL) to RCU_INIT_POINTER) did a lot of incorrect changes, since it did a complete conversion of rcu_assign_pointer(x, y) to RCU_INIT_POINTER(x, y). We miss needed barriers, even on x86, when y is not NULL. Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> CC: Stephen Hemminger <shemminger@vyatta.com> CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 10 1月, 2012 3 次提交
-
-
由 David S. Miller 提交于
> net/core/sock.c: In function 'sk_update_clone': > net/core/sock.c:1278:3: error: implicit declaration of function 'sock_update_memcg' Reported-by: NRandy Dunlap <rdunlap@xenotime.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
dev_uc_sync() and dev_mc_sync() are acquiring netif_addr_lock for destination device of synchronization. Since netif_addr_lock is already held at the time for source device, this triggers lockdep deadlock warning. There's no way this deadlock can happen so use spin_lock_nested() to silence the warning. Signed-off-by: NJiri Pirko <jpirko@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
Signed-off-by: NJiri Pirko <jpirko@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 09 1月, 2012 1 次提交
-
-
由 Stephen Rothwell 提交于
so move it there. Fixes build errors when CONFIG_INET is not defined: In file included from include/linux/tcp.h:211:0, from include/linux/ipv6.h:221, from include/net/ipv6.h:16, from include/linux/sunrpc/clnt.h:26, from include/linux/nfs_fs.h:50, from init/do_mounts.c:20: include/net/sock.h: In function 'sk_update_clone': include/net/sock.h:1109:3: error: implicit declaration of function 'sock_update_memcg' [-Werror=implicit-function-declaration] Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 08 1月, 2012 2 次提交
-
-
由 Dan Carpenter 提交于
In 88271660 "pktgen: fix multiple queue warning" we added special logic to handle the case where ntxq is zero. It's not clear to me that ntxq can actually be zero. But if it were then we would set ->queue_map_min and ->queue_map_max to USHRT_MAX when probably we want to set them to zero? Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Glauber Costa 提交于
Sockets can also be created through sock_clone. Because it copies all data in the sock structure, it also copies the memcg-related pointer, and all should be fine. However, since we now use reference counts in socket creation, we are left with some sockets that have no reference counts. It matters when we destroy them, since it leads to a mismatch. Signed-off-by: NGlauber Costa <glommer@parallels.com> CC: David S. Miller <davem@davemloft.net> CC: Greg Thelen <gthelen@google.com> CC: Hiroyouki Kamezawa <kamezawa.hiroyu@jp.fujitsu.com> CC: Laurent Chavey <chavey@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 06 1月, 2012 1 次提交
-
-
由 Eric Paris 提交于
Once upon a time netlink was not sync and we had to get the effective capabilities from the skb that was being received. Today we instead get the capabilities from the current task. This has rendered the entire purpose of the hook moot as it is now functionally equivalent to the capable() call. Signed-off-by: NEric Paris <eparis@redhat.com>
-
- 05 1月, 2012 2 次提交
-
-
由 Ben Hutchings 提交于
All implementations have been converted to implement set_rxnfc instead. Signed-off-by: NBen Hutchings <bhutchings@solarflare.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ben Hutchings 提交于
Define special location values for RX NFC that request the driver to select the actual rule location. This allows for implementation on devices that use hash-based filter lookup, whereas currently the API is more suited to devices with TCAM lookup or linear search. In ethtool_set_rxnfc() and the compat wrapper ethtool_ioctl(), copy the structure back to user-space after insertion so that the actual location is returned. Signed-off-by: NBen Hutchings <bhutchings@solarflare.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 31 12月, 2011 1 次提交
-
-
由 Pavel Emelyanov 提交于
Add a routine that dumps memory-related values of a socket. It's made as an array to make it possible to add more stuff here later without breaking compatibility. Since v1: The SK_MEMINFO_ constants are in userspace visible part of sock_diag.h, the rest is under __KERNEL__. Signed-off-by: NPavel Emelyanov <xemul@parallels.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 29 12月, 2011 1 次提交
-
-
由 David S. Miller 提交于
In order to perform a proper universal hash on a vector of integers, we have to use different universal hashes on each vector element. Which means we need 4 different hash randoms for ipv6. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 25 12月, 2011 1 次提交
-
-
由 Eric Dumazet 提交于
Aim of this patch is to provide full range of rps_flow_cnt on 64bit arches. Theorical limit on number of flows is 2^32 Fix some buggy RPS/RFS macros as well. Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> CC: Tom Herbert <therbert@google.com> CC: Xi Wang <xi.wang@gmail.com> CC: Laurent Chavey <chavey@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 23 12月, 2011 2 次提交
-
-
由 Eric Dumazet 提交于
skb->truesize might be big even for a small packet. Its even bigger after commit 87fb4b7b (net: more accurate skb truesize) and big MTU. We should allow queueing at least one packet per receiver, even with a low RCVBUF setting. Reported-by: NMichal Simek <monstr@monstr.eu> Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Xi Wang 提交于
Setting a large rps_flow_cnt like (1 << 30) on 32-bit platform will cause a kernel oops due to insufficient bounds checking. if (count > 1<<30) { /* Enforce a limit to prevent overflow */ return -EINVAL; } count = roundup_pow_of_two(count); table = vmalloc(RPS_DEV_FLOW_TABLE_SIZE(count)); Note that the macro RPS_DEV_FLOW_TABLE_SIZE(count) is defined as: ... + (count * sizeof(struct rps_dev_flow)) where sizeof(struct rps_dev_flow) is 8. (1 << 30) * 8 will overflow 32 bits. This patch replaces the magic number (1 << 30) with a symbolic bound. Suggested-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NXi Wang <xi.wang@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 22 12月, 2011 1 次提交
-
-
由 Steffen Klassert 提交于
flow_cach_flush() might sleep but can be called from atomic context via the xfrm garbage collector. So add a flow_cache_flush_deferred() function and use this if the xfrm garbage colector is invoked from within the packet path. Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com> Acked-by: NTimo Teräs <timo.teras@iki.fi> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 20 12月, 2011 1 次提交
-
-
由 David S. Miller 提交于
This reverts commit 5c3ddec7. S390 qeth driver actually still uses the setup ops. Reported-by: NFrank Blaschka <blaschka@linux.vnet.ibm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 17 12月, 2011 6 次提交
-
-
由 Igor Maravić 提交于
Use IS_ENABLED(CONFIG_FOO) instead of defined(CONFIG_FOO) || defined (CONFIG_FOO_MODULE) Signed-off-by: NIgor Maravić <igorm@etf.rs> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Glauber Costa 提交于
We can't scan the proto_list to initialize sock cgroups, as it holds a rwlock, and we also want to keep the code generic enough to avoid calling the initialization functions of protocols directly, Convert proto_list_lock into a mutex, so we can sleep and do the necessary allocations. This lock is seldom taken, so there shouldn't be any performance penalties associated with that Signed-off-by: NGlauber Costa <glommer@parallels.com> CC: Hiroyouki Kamezawa <kamezawa.hiroyu@jp.fujitsu.com> CC: David S. Miller <davem@davemloft.net> CC: Eric Dumazet <eric.dumazet@gmail.com> CC: Stephen Rothwell <sfr@canb.auug.org.au> CC: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ben Hutchings 提交于
All drivers that support modification of the RX flow hash indirection table initialise it in the same way: RX rings are assigned to table entries in rotation. Make that default policy explicit by having them call a ethtool_rxfh_indir_default() function. In the ethtool core, add support for a zero size value for ETHTOOL_SRXFHINDIR, which resets the table to this default. Partly-suggested-by: NMatt Carlson <mcarlson@broadcom.com> Signed-off-by: NBen Hutchings <bhutchings@solarflare.com> Acked-by: NShreyas N Bhatewara <sbhatewara@vmware.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ben Hutchings 提交于
Add a new ethtool operation (get_rxfh_indir_size) to get the indirectional table size. Use this to validate the user buffer size before calling get_rxfh_indir or set_rxfh_indir. Use get_rxnfc to get the number of RX rings, and validate the contents of the new indirection table before calling set_rxfh_indir. Remove this validation from drivers. Signed-off-by: NBen Hutchings <bhutchings@solarflare.com> Acked-by: NDimitris Michailidis <dm@chelsio.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Pavel Emelyanov 提交于
The sk address is used as a cookie between dump/get_exact calls. It will be required for unix socket sdumping, so move it from inet_diag to sock_diag. Signed-off-by: NPavel Emelyanov <xemul@parallels.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Pavel Emelyanov 提交于
I've made a mistake when fixing the sock_/inet_diag aliases :( 1. The sock_diag layer should request the family-based alias, not just the IPPROTO_IP one; 2. The inet_diag layer should request for AF_INET+protocol alias, not just the protocol one. Thus fix this. Signed-off-by: NPavel Emelyanov <xemul@parallels.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 14 12月, 2011 2 次提交
-
-
由 Eric Dumazet 提交于
Before adding a struct rtnl_link_ops into link_ops list, check it doesnt clash with a prior one. Based on a previous patch from Alexander Smirnov Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> CC: Alexander Smirnov <alex.bluesman.smirnov@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
It's simpler to just keep these things out until there is a real user of them, so we can see what the needs actually are, rather than keep these things around as useless overhead. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 13 12月, 2011 3 次提交
-
-
由 Glauber Costa 提交于
This patch introduces memory pressure controls for the tcp protocol. It uses the generic socket memory pressure code introduced in earlier patches, and fills in the necessary data in cg_proto struct. Signed-off-by: NGlauber Costa <glommer@parallels.com> Reviewed-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujtisu.com> CC: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Glauber Costa 提交于
The goal of this work is to move the memory pressure tcp controls to a cgroup, instead of just relying on global conditions. To avoid excessive overhead in the network fast paths, the code that accounts allocated memory to a cgroup is hidden inside a static_branch(). This branch is patched out until the first non-root cgroup is created. So when nobody is using cgroups, even if it is mounted, no significant performance penalty should be seen. This patch handles the generic part of the code, and has nothing tcp-specific. Signed-off-by: NGlauber Costa <glommer@parallels.com> Reviewed-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujtsu.com> CC: Kirill A. Shutemov <kirill@shutemov.name> CC: David S. Miller <davem@davemloft.net> CC: Eric W. Biederman <ebiederm@xmission.com> CC: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Glauber Costa 提交于
This patch replaces all uses of struct sock fields' memory_pressure, memory_allocated, sockets_allocated, and sysctl_mem to acessor macros. Those macros can either receive a socket argument, or a mem_cgroup argument, depending on the context they live in. Since we're only doing a macro wrapping here, no performance impact at all is expected in the case where we don't have cgroups disabled. Signed-off-by: NGlauber Costa <glommer@parallels.com> Reviewed-by: NHiroyouki Kamezawa <kamezawa.hiroyu@jp.fujitsu.com> CC: David S. Miller <davem@davemloft.net> CC: Eric W. Biederman <ebiederm@xmission.com> CC: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 12 12月, 2011 1 次提交
-
-
由 Eric Dumazet 提交于
Instead of testing defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 10 12月, 2011 1 次提交
-
-
由 John Fastabend 提交于
This reverts commit 865d9f9f. This commit breaks the build with CONFIG_NETPRIO_CGROUP=y so revert it. It does build as a module though. The SUBSYS macro in the cgroup core code automatically defines a subsys structure as extern. Long term we should fix the macro. And I need to fully build test things. Tested with CONFIG_NETPRIO_CGROUP={y|m|n} with and without CONFIG_CGROUPS defined. Signed-off-by: NJohn Fastabend <john.r.fastabend@intel.com> CC: Neil Horman <nhorman@tuxdriver.com> Reported-By: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 09 12月, 2011 2 次提交
-
-
由 Dan Carpenter 提交于
These tests are off by one because sock_diag_handlers[] only has AF_MAX elements. Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Acked-by: NPavel Emelyanov <xemul@parallels.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 John Fastabend 提交于
net_prio_subsys can be made static this removes the sparse warning it was throwing. Signed-off-by: NJohn Fastabend <john.r.fastabend@intel.com> Acked-by: NNeil Horman <nhorman@tuxdriver.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 07 12月, 2011 1 次提交
-
-
由 Stephen Boyd 提交于
On a CONFIG_NET=y build net/core/secure_seq.c:22: warning: 'seq_scale' defined but not used Signed-off-by: NStephen Boyd <sboyd@codeaurora.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-