- 26 6月, 2012 1 次提交
-
-
由 Vijay Subramanian 提交于
With early demux enabled by default for TCP flows, there is high chance that skb->sk will be non-null. 'unlikely()' was removed from __inet_lookup_skb() but maybe it can be removed from skb_steal_sock() as well. Note: skb_steal_sock() is also called by __inet6_lookup_skb() and __udp4_lib_lookup_skb() but they are protected by their own 'unlikely' calls. Signed-off-by: NVijay Subramanian <subramanian.vijay@gmail.com> Acked-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 20 6月, 2012 1 次提交
-
-
由 David S. Miller 提交于
Input packet processing for local sockets involves two major demuxes. One for the route and one for the socket. But we can optimize this down to one demux for certain kinds of local sockets. Currently we only do this for established TCP sockets, but it could at least in theory be expanded to other kinds of connections. If a TCP socket is established then it's identity is fully specified. This means that whatever input route was used during the three-way handshake must work equally well for the rest of the connection since the keys will not change. Once we move to established state, we cache the receive packet's input route to use later. Like the existing cached route in sk->sk_dst_cache used for output packets, we have to check for route invalidations using dst->obsolete and dst->ops->check(). Early demux occurs outside of a socket locked section, so when a route invalidation occurs we defer the fixup of sk->sk_rx_dst until we are actually inside of established state packet processing and thus have the socket locked. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 30 5月, 2012 1 次提交
-
-
由 Glauber Costa 提交于
We call the destroy function when a cgroup starts to be removed, such as by a rmdir event. However, because of our reference counters, some objects are still inflight. Right now, we are decrementing the static_keys at destroy() time, meaning that if we get rid of the last static_key reference, some objects will still have charges, but the code to properly uncharge them won't be run. This becomes a problem specially if it is ever enabled again, because now new charges will be added to the staled charges making keeping it pretty much impossible. We just need to be careful with the static branch activation: since there is no particular preferred order of their activation, we need to make sure that we only start using it after all call sites are active. This is achieved by having a per-memcg flag that is only updated after static_key_slow_inc() returns. At this time, we are sure all sites are active. This is made per-memcg, not global, for a reason: it also has the effect of making socket accounting more consistent. The first memcg to be limited will trigger static_key() activation, therefore, accounting. But all the others will then be accounted no matter what. After this patch, only limited memcgs will have its sockets accounted. [akpm@linux-foundation.org: move enum sock_flag_bits into sock.h, document enum sock_flag_bits, convert memcg_proto_active() and memcg_proto_activated() to test_bit(), redo tcp_update_limit() comment to 80 cols] Signed-off-by: NGlauber Costa <glommer@parallels.com> Cc: Tejun Heo <tj@kernel.org> Cc: Li Zefan <lizefan@huawei.com> Acked-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Acked-by: NDavid Miller <davem@davemloft.net> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 17 5月, 2012 2 次提交
-
-
由 Eric Dumazet 提交于
bool/const conversions where possible __inline__ -> inline space cleanups Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
- sock_flag() accepts a const pointer - sock_flag() returns a boolean Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 01 5月, 2012 1 次提交
-
-
由 Eric Dumazet 提交于
Denys Fedoryshchenko reported frequent crashes on a proxy server and kindly provided a lockdep report that explains it all : [ 762.903868] [ 762.903880] ================================= [ 762.903890] [ INFO: inconsistent lock state ] [ 762.903903] 3.3.4-build-0061 #8 Not tainted [ 762.904133] --------------------------------- [ 762.904344] inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage. [ 762.904542] squid/1603 [HC0[0]:SC0[0]:HE1:SE1] takes: [ 762.904542] (key#3){+.?...}, at: [<c0232cc4>] __percpu_counter_sum+0xd/0x58 [ 762.904542] {IN-SOFTIRQ-W} state was registered at: [ 762.904542] [<c0158b84>] __lock_acquire+0x284/0xc26 [ 762.904542] [<c01598e8>] lock_acquire+0x71/0x85 [ 762.904542] [<c0349765>] _raw_spin_lock+0x33/0x40 [ 762.904542] [<c0232c93>] __percpu_counter_add+0x58/0x7c [ 762.904542] [<c02cfde1>] sk_clone_lock+0x1e5/0x200 [ 762.904542] [<c0303ee4>] inet_csk_clone_lock+0xe/0x78 [ 762.904542] [<c0315778>] tcp_create_openreq_child+0x1b/0x404 [ 762.904542] [<c031339c>] tcp_v4_syn_recv_sock+0x32/0x1c1 [ 762.904542] [<c031615a>] tcp_check_req+0x1fd/0x2d7 [ 762.904542] [<c0313f77>] tcp_v4_do_rcv+0xab/0x194 [ 762.904542] [<c03153bb>] tcp_v4_rcv+0x3b3/0x5cc [ 762.904542] [<c02fc0c4>] ip_local_deliver_finish+0x13a/0x1e9 [ 762.904542] [<c02fc539>] NF_HOOK.clone.11+0x46/0x4d [ 762.904542] [<c02fc652>] ip_local_deliver+0x41/0x45 [ 762.904542] [<c02fc4d1>] ip_rcv_finish+0x31a/0x33c [ 762.904542] [<c02fc539>] NF_HOOK.clone.11+0x46/0x4d [ 762.904542] [<c02fc857>] ip_rcv+0x201/0x23e [ 762.904542] [<c02daa3a>] __netif_receive_skb+0x319/0x368 [ 762.904542] [<c02dac07>] netif_receive_skb+0x4e/0x7d [ 762.904542] [<c02dacf6>] napi_skb_finish+0x1e/0x34 [ 762.904542] [<c02db122>] napi_gro_receive+0x20/0x24 [ 762.904542] [<f85d1743>] e1000_receive_skb+0x3f/0x45 [e1000e] [ 762.904542] [<f85d3464>] e1000_clean_rx_irq+0x1f9/0x284 [e1000e] [ 762.904542] [<f85d3926>] e1000_clean+0x62/0x1f4 [e1000e] [ 762.904542] [<c02db228>] net_rx_action+0x90/0x160 [ 762.904542] [<c012a445>] __do_softirq+0x7b/0x118 [ 762.904542] irq event stamp: 156915469 [ 762.904542] hardirqs last enabled at (156915469): [<c019b4f4>] __slab_alloc.clone.58.clone.63+0xc4/0x2de [ 762.904542] hardirqs last disabled at (156915468): [<c019b452>] __slab_alloc.clone.58.clone.63+0x22/0x2de [ 762.904542] softirqs last enabled at (156915466): [<c02ce677>] lock_sock_nested+0x64/0x6c [ 762.904542] softirqs last disabled at (156915464): [<c0349914>] _raw_spin_lock_bh+0xe/0x45 [ 762.904542] [ 762.904542] other info that might help us debug this: [ 762.904542] Possible unsafe locking scenario: [ 762.904542] [ 762.904542] CPU0 [ 762.904542] ---- [ 762.904542] lock(key#3); [ 762.904542] <Interrupt> [ 762.904542] lock(key#3); [ 762.904542] [ 762.904542] *** DEADLOCK *** [ 762.904542] [ 762.904542] 1 lock held by squid/1603: [ 762.904542] #0: (sk_lock-AF_INET){+.+.+.}, at: [<c03055c0>] lock_sock+0xa/0xc [ 762.904542] [ 762.904542] stack backtrace: [ 762.904542] Pid: 1603, comm: squid Not tainted 3.3.4-build-0061 #8 [ 762.904542] Call Trace: [ 762.904542] [<c0347b73>] ? printk+0x18/0x1d [ 762.904542] [<c015873a>] valid_state+0x1f6/0x201 [ 762.904542] [<c0158816>] mark_lock+0xd1/0x1bb [ 762.904542] [<c015876b>] ? mark_lock+0x26/0x1bb [ 762.904542] [<c015805d>] ? check_usage_forwards+0x77/0x77 [ 762.904542] [<c0158bf8>] __lock_acquire+0x2f8/0xc26 [ 762.904542] [<c0159b8e>] ? mark_held_locks+0x5d/0x7b [ 762.904542] [<c0159cf6>] ? trace_hardirqs_on+0xb/0xd [ 762.904542] [<c0158dd4>] ? __lock_acquire+0x4d4/0xc26 [ 762.904542] [<c01598e8>] lock_acquire+0x71/0x85 [ 762.904542] [<c0232cc4>] ? __percpu_counter_sum+0xd/0x58 [ 762.904542] [<c0349765>] _raw_spin_lock+0x33/0x40 [ 762.904542] [<c0232cc4>] ? __percpu_counter_sum+0xd/0x58 [ 762.904542] [<c0232cc4>] __percpu_counter_sum+0xd/0x58 [ 762.904542] [<c02cebc4>] __sk_mem_schedule+0xdd/0x1c7 [ 762.904542] [<c02d178d>] ? __alloc_skb+0x76/0x100 [ 762.904542] [<c0305e8e>] sk_wmem_schedule+0x21/0x2d [ 762.904542] [<c0306370>] sk_stream_alloc_skb+0x42/0xaa [ 762.904542] [<c0306567>] tcp_sendmsg+0x18f/0x68b [ 762.904542] [<c031f3dc>] ? ip_fast_csum+0x30/0x30 [ 762.904542] [<c0320193>] inet_sendmsg+0x53/0x5a [ 762.904542] [<c02cb633>] sock_aio_write+0xd2/0xda [ 762.904542] [<c015876b>] ? mark_lock+0x26/0x1bb [ 762.904542] [<c01a1017>] do_sync_write+0x9f/0xd9 [ 762.904542] [<c01a2111>] ? file_free_rcu+0x2f/0x2f [ 762.904542] [<c01a17a1>] vfs_write+0x8f/0xab [ 762.904542] [<c01a284d>] ? fget_light+0x75/0x7c [ 762.904542] [<c01a1900>] sys_write+0x3d/0x5e [ 762.904542] [<c0349ec9>] syscall_call+0x7/0xb [ 762.904542] [<c0340000>] ? rp_sidt+0x41/0x83 Bug is that sk_sockets_allocated_read_positive() calls percpu_counter_sum_positive() without BH being disabled. This bug was added in commit 180d8cd9 (foundations of per-cgroup memory pressure controlling.), since previous code was using percpu_counter_read_positive() which is IRQ safe. In __sk_mem_schedule() we dont need the precise count of allocated sockets and can revert to previous behavior. Reported-by: NDenys Fedoryshchenko <denys@visp.net.lb> Sined-off-by: NEric Dumazet <edumazet@google.com> Cc: Glauber Costa <glommer@parallels.com> Acked-by: NNeal Cardwell <ncardwell@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 24 4月, 2012 1 次提交
-
-
由 Eric Dumazet 提交于
sk_add_backlog() & sk_rcvqueues_full() hard coded sk_rcvbuf as the memory limit. We need to make this limit a parameter for TCP use. No functional change expected in this patch, all callers still using the old sk_rcvbuf limit. Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Tom Herbert <therbert@google.com> Cc: Maciej Żenczykowski <maze@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Cc: Rick Jones <rick.jones2@hp.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 22 4月, 2012 1 次提交
-
-
由 Pavel Emelyanov 提交于
Name them in a "backward compatible" manner, i.e. reuse or not are still 1 and 0 respectively. The reuse value of 2 means that the socket with it will forcibly reuse everyone else's port. Signed-off-by: NPavel Emelyanov <xemul@openvz.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 18 4月, 2012 1 次提交
-
-
由 Randy Dunlap 提交于
Fix kernel-doc warning in net/sock.h: Warning(include/net/sock.h:377): No description found for parameter 'sk_peek_off' Signed-off-by: NRandy Dunlap <rdunlap@xenotime.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 11 4月, 2012 1 次提交
-
-
由 Glauber Costa 提交于
The only reason cgroup was used, was to be consistent with the populate() interface. Now that we're getting rid of it, not only we no longer need it, but we also *can't* call it this way. Since we will no longer rely on populate(), this will be called from create(). During create, the association between struct mem_cgroup and struct cgroup does not yet exist, since cgroup internals hasn't yet initialized its bookkeeping. This means we would not be able to draw the memcg pointer from the cgroup pointer in these functions, which is highly undesirable. Signed-off-by: NGlauber Costa <glommer@parallels.com> Acked-by: NKamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: NTejun Heo <tj@kernel.org> CC: Li Zefan <lizefan@huawei.com> CC: Johannes Weiner <hannes@cmpxchg.org> CC: Michal Hocko <mhocko@suse.cz>
-
- 24 3月, 2012 1 次提交
-
-
由 Hans Verkuil 提交于
In some cases the poll() implementation in a driver has to do different things depending on the events the caller wants to poll for. An example is when a driver needs to start a DMA engine if the caller polls for POLLIN, but doesn't want to do that if POLLIN is not requested but instead only POLLOUT or POLLPRI is requested. This is something that can happen in the video4linux subsystem among others. Unfortunately, the current epoll/poll/select implementation doesn't provide that information reliably. The poll_table_struct does have it: it has a key field with the event mask. But once a poll() call matches one or more bits of that mask any following poll() calls are passed a NULL poll_table pointer. Also, the eventpoll implementation always left the key field at ~0 instead of using the requested events mask. This was changed in eventpoll.c so the key field now contains the actual events that should be polled for as set by the caller. The solution to the NULL poll_table pointer is to set the qproc field to NULL in poll_table once poll() matches the events, not the poll_table pointer itself. That way drivers can obtain the mask through a new poll_requested_events inline. The poll_table_struct can still be NULL since some kernel code calls it internally (netfs_state_poll() in ./drivers/staging/pohmelfs/netfs.h). In that case poll_requested_events() returns ~0 (i.e. all events). Very rarely drivers might want to know whether poll_wait will actually wait. If another earlier file descriptor in the set already matched the events the caller wanted to wait for, then the kernel will return from the select() call without waiting. This might be useful information in order to avoid doing expensive work. A new helper function poll_does_not_wait() is added that drivers can use to detect this situation. This is now used in sock_poll_wait() in include/net/sock.h. This was the only place in the kernel that needed this information. Drivers should no longer access any of the poll_table internals, but use the poll_requested_events() and poll_does_not_wait() access functions instead. In order to enforce that the poll_table fields are now prepended with an underscore and a comment was added warning against using them directly. This required a change in unix_dgram_poll() in unix/af_unix.c which used the key field to get the requested events. It's been replaced by a call to poll_requested_events(). For qproc it was especially important to change its name since the behavior of that field changes with this patch since this function pointer can now be NULL when that wasn't possible in the past. Any driver accessing the qproc or key fields directly will now fail to compile. Some notes regarding the correctness of this patch: the driver's poll() function is called with a 'struct poll_table_struct *wait' argument. This pointer may or may not be NULL, drivers can never rely on it being one or the other as that depends on whether or not an earlier file descriptor in the select()'s fdset matched the requested events. There are only three things a driver can do with the wait argument: 1) obtain the key field: events = wait ? wait->key : ~0; This will still work although it should be replaced with the new poll_requested_events() function (which does exactly the same). This will now even work better, since wait is no longer set to NULL unnecessarily. 2) use the qproc callback. This could be deadly since qproc can now be NULL. Renaming qproc should prevent this from happening. There are no kernel drivers that actually access this callback directly, BTW. 3) test whether wait == NULL to determine whether poll would return without waiting. This is no longer sufficient as the correct test is now wait == NULL || wait->_qproc == NULL. However, the worst that can happen here is a slight performance hit in the case where wait != NULL and wait->_qproc == NULL. In that case the driver will assume that poll_wait() will actually add the fd to the set of waiting file descriptors. Of course, poll_wait() will not do that since it tests for wait->_qproc. This will not break anything, though. There is only one place in the whole kernel where this happens (sock_poll_wait() in include/net/sock.h) and that code will be replaced by a call to poll_does_not_wait() in the next patch. Note that even if wait->_qproc != NULL drivers cannot rely on poll_wait() actually waiting. The next file descriptor from the set might match the event mask and thus any possible waits will never happen. Signed-off-by: NHans Verkuil <hans.verkuil@cisco.com> Reviewed-by: NJonathan Corbet <corbet@lwn.net> Reviewed-by: NAl Viro <viro@zeniv.linux.org.uk> Cc: Davide Libenzi <davidel@xmailserver.org> Signed-off-by: NHans de Goede <hdegoede@redhat.com> Cc: Mauro Carvalho Chehab <mchehab@infradead.org> Cc: David Miller <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 24 2月, 2012 2 次提交
-
-
由 Ben Greear 提交于
This is useful for testing RX handling of frames with bad CRCs. Requires driver support to actually put the packet on the wire properly. Signed-off-by: NBen Greear <greearb@candelatech.com> Tested-by: NAaron Brown <aaron.f.brown@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
由 Ingo Molnar 提交于
static keys: Introduce 'struct static_key', static_key_true()/false() and static_key_slow_[inc|dec]() So here's a boot tested patch on top of Jason's series that does all the cleanups I talked about and turns jump labels into a more intuitive to use facility. It should also address the various misconceptions and confusions that surround jump labels. Typical usage scenarios: #include <linux/static_key.h> struct static_key key = STATIC_KEY_INIT_TRUE; if (static_key_false(&key)) do unlikely code else do likely code Or: if (static_key_true(&key)) do likely code else do unlikely code The static key is modified via: static_key_slow_inc(&key); ... static_key_slow_dec(&key); The 'slow' prefix makes it abundantly clear that this is an expensive operation. I've updated all in-kernel code to use this everywhere. Note that I (intentionally) have not pushed through the rename blindly through to the lowest levels: the actual jump-label patching arch facility should be named like that, so we want to decouple jump labels from the static-key facility a bit. On non-jump-label enabled architectures static keys default to likely()/unlikely() branches. Signed-off-by: NIngo Molnar <mingo@elte.hu> Acked-by: NJason Baron <jbaron@redhat.com> Acked-by: NSteven Rostedt <rostedt@goodmis.org> Cc: a.p.zijlstra@chello.nl Cc: mathieu.desnoyers@efficios.com Cc: davem@davemloft.net Cc: ddaney.cavm@gmail.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20120222085809.GA26397@elte.huSigned-off-by: NIngo Molnar <mingo@elte.hu>
-
- 22 2月, 2012 1 次提交
-
-
由 Pavel Emelyanov 提交于
This one specifies where to start MSG_PEEK-ing queue data from. When set to negative value means that MSG_PEEK works as ususally -- peeks from the head of the queue always. When some bytes are peeked from queue and the peeking offset is non negative it is moved forward so that the next peek will return next portion of data. When non-peeking recvmsg occurs and the peeking offset is non negative is is moved backward so that the next peek will still peek the proper data (i.e. the one that would have been picked if there were no non peeking recv in between). The offset is set using per-proto opteration to let the protocol handle the locking issues and to check whether the peeking offset feature is supported by the protocol the socket belongs to. Signed-off-by: NPavel Emelyanov <xemul@parallels.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 14 2月, 2012 1 次提交
-
-
由 Al Viro 提交于
Trim security.h Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk> Signed-off-by: NJames Morris <jmorris@namei.org>
-
- 03 2月, 2012 1 次提交
-
-
由 Li Zefan 提交于
The argument is not used at all, and it's not necessary, because a specific callback handler of course knows which subsys it belongs to. Now only ->pupulate() takes this argument, because the handlers of this callback always call cgroup_add_file()/cgroup_add_files(). So we reduce a few lines of code, though the shrinking of object size is minimal. 16 files changed, 113 insertions(+), 162 deletions(-) text data bss dec hex filename 5486240 656987 7039960 13183187 c928d3 vmlinux.o.orig 5486170 656987 7039960 13183117 c9288d vmlinux.o Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com> Signed-off-by: NTejun Heo <tj@kernel.org>
-
- 27 1月, 2012 1 次提交
-
-
由 Glauber Costa 提交于
Commit 36a12119 removed linux/module.h include statement from one of the headers that end up in net/sock.h. It was providing us with static_branch() definition implicitly, so after its removal the build got broken. To fix this, and avoid having this happening in the future, let me do the right thing and include linux/jump_label.h explicitly in sock.h. Signed-off-by: NGlauber Costa <glommer@parallels.com> Reported-by: NRandy Dunlap <rdunlap@xenotime.net> CC: David S. Miller <davem@davemloft.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 23 1月, 2012 3 次提交
-
-
由 Glauber Costa 提交于
There is a case in __sk_mem_schedule(), where an allocation is beyond the maximum, but yet we are allowed to proceed. It happens under the following condition: sk->sk_wmem_queued + size >= sk->sk_sndbuf The network code won't revert the allocation in this case, meaning that at some point later it'll try to do it. Since this is never communicated to the underlying res_counter code, there is an inbalance in res_counter uncharge operation. I see two ways of fixing this: 1) storing the information about those allocations somewhere in memcg, and then deducting from that first, before we start draining the res_counter, 2) providing a slightly different allocation function for the res_counter, that matches the original behavior of the network code more closely. I decided to go for #2 here, believing it to be more elegant, since #1 would require us to do basically that, but in a more obscure way. Signed-off-by: NGlauber Costa <glommer@parallels.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> CC: Tejun Heo <tj@kernel.org> CC: Li Zefan <lizf@cn.fujitsu.com> CC: Laurent Chavey <chavey@google.com> Acked-by: NTejun Heo <tj@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Glauber Costa 提交于
There is still a build bug with the sock memcg code, that triggers with !CONFIG_NET, that survived my series of randconfig builds. Signed-off-by: NGlauber Costa <glommer@parallels.com> Reported-by: NRandy Dunlap <rdunlap@xenotime.net> CC: Hiroyouki Kamezawa <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Randy Dunlap 提交于
Fix new kernel-doc warning: Warning(include/net/sock.h:372): No description found for parameter 'sk_cgrp_prioidx' Signed-off-by: NRandy Dunlap <rdunlap@xenotime.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 09 1月, 2012 1 次提交
-
-
由 Stephen Rothwell 提交于
so move it there. Fixes build errors when CONFIG_INET is not defined: In file included from include/linux/tcp.h:211:0, from include/linux/ipv6.h:221, from include/net/ipv6.h:16, from include/linux/sunrpc/clnt.h:26, from include/linux/nfs_fs.h:50, from init/do_mounts.c:20: include/net/sock.h: In function 'sk_update_clone': include/net/sock.h:1109:3: error: implicit declaration of function 'sock_update_memcg' [-Werror=implicit-function-declaration] Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 08 1月, 2012 1 次提交
-
-
由 Glauber Costa 提交于
Sockets can also be created through sock_clone. Because it copies all data in the sock structure, it also copies the memcg-related pointer, and all should be fine. However, since we now use reference counts in socket creation, we are left with some sockets that have no reference counts. It matters when we destroy them, since it leads to a mismatch. Signed-off-by: NGlauber Costa <glommer@parallels.com> CC: David S. Miller <davem@davemloft.net> CC: Greg Thelen <gthelen@google.com> CC: Hiroyouki Kamezawa <kamezawa.hiroyu@jp.fujitsu.com> CC: Laurent Chavey <chavey@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 23 12月, 2011 1 次提交
-
-
由 Eric Dumazet 提交于
skb->truesize might be big even for a small packet. Its even bigger after commit 87fb4b7b (net: more accurate skb truesize) and big MTU. We should allow queueing at least one packet per receiver, even with a low RCVBUF setting. Reported-by: NMichal Simek <monstr@monstr.eu> Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 17 12月, 2011 1 次提交
-
-
由 Glauber Costa 提交于
Reported-by: NRandy Dunlap <rdunlap@xenotime.net> Signed-off-by: NGlauber Costa <glommer@parallels.com> CC: Hiroyouki Kamezawa <kamezawa.hiroyu@jp.fujitsu.com> CC: David S. Miller <davem@davemloft.net> CC: Eric Dumazet <eric.dumazet@gmail.com> CC: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 14 12月, 2011 1 次提交
-
-
由 Eric Dumazet 提交于
Reported-by: NChristoph Paasch <christoph.paasch@uclouvain.be> Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 13 12月, 2011 3 次提交
-
-
由 Glauber Costa 提交于
This patch introduces memory pressure controls for the tcp protocol. It uses the generic socket memory pressure code introduced in earlier patches, and fills in the necessary data in cg_proto struct. Signed-off-by: NGlauber Costa <glommer@parallels.com> Reviewed-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujtisu.com> CC: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Glauber Costa 提交于
The goal of this work is to move the memory pressure tcp controls to a cgroup, instead of just relying on global conditions. To avoid excessive overhead in the network fast paths, the code that accounts allocated memory to a cgroup is hidden inside a static_branch(). This branch is patched out until the first non-root cgroup is created. So when nobody is using cgroups, even if it is mounted, no significant performance penalty should be seen. This patch handles the generic part of the code, and has nothing tcp-specific. Signed-off-by: NGlauber Costa <glommer@parallels.com> Reviewed-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujtsu.com> CC: Kirill A. Shutemov <kirill@shutemov.name> CC: David S. Miller <davem@davemloft.net> CC: Eric W. Biederman <ebiederm@xmission.com> CC: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Glauber Costa 提交于
This patch replaces all uses of struct sock fields' memory_pressure, memory_allocated, sockets_allocated, and sysctl_mem to acessor macros. Those macros can either receive a socket argument, or a mem_cgroup argument, depending on the context they live in. Since we're only doing a macro wrapping here, no performance impact at all is expected in the case where we don't have cgroups disabled. Signed-off-by: NGlauber Costa <glommer@parallels.com> Reviewed-by: NHiroyouki Kamezawa <kamezawa.hiroyu@jp.fujitsu.com> CC: David S. Miller <davem@davemloft.net> CC: Eric W. Biederman <ebiederm@xmission.com> CC: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 23 11月, 2011 1 次提交
-
-
由 Neil Horman 提交于
This patch adds in the infrastructure code to create the network priority cgroup. The cgroup, in addition to the standard processes file creates two control files: 1) prioidx - This is a read-only file that exports the index of this cgroup. This is a value that is both arbitrary and unique to a cgroup in this subsystem, and is used to index the per-device priority map 2) priomap - This is a writeable file. On read it reports a table of 2-tuples <name:priority> where name is the name of a network interface and priority is indicates the priority assigned to frames egresessing on the named interface and originating from a pid in this cgroup This cgroup allows for skb priority to be set prior to a root qdisc getting selected. This is benenficial for DCB enabled systems, in that it allows for any application to use dcb configured priorities so without application modification Signed-off-by: NNeil Horman <nhorman@tuxdriver.com> Signed-off-by: NJohn Fastabend <john.r.fastabend@intel.com> CC: Robert Love <robert.w.love@intel.com> CC: "David S. Miller" <davem@davemloft.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 17 11月, 2011 1 次提交
-
-
由 Michał Mirosław 提交于
v2: add couple missing conversions in drivers split unexporting netdev_fix_features() implemented %pNF convert sock::sk_route_(no?)caps Signed-off-by: NMichał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 10 11月, 2011 1 次提交
-
-
由 Johannes Berg 提交于
The 802.1X EAPOL handshake hostapd does requires knowing whether the frame was ack'ed by the peer. Currently, we fudge this pretty badly by not even transmitting the frame as a normal data frame but injecting it with radiotap and getting the status out of radiotap monitor as well. This is rather complex, confuses users (mon.wlan0 presence) and doesn't work with all hardware. To get rid of that hack, introduce a real wifi TX status option for data frame transmissions. This works similar to the existing TX timestamping in that it reflects the SKB back to the socket's error queue with a SCM_WIFI_STATUS cmsg that has an int indicating ACK status (0/1). Since it is possible that at some point we will want to have TX timestamping and wifi status in a single errqueue SKB (there's little point in not doing that), redefine SO_EE_ORIGIN_TIMESTAMPING to SO_EE_ORIGIN_TXSTATUS which can collect more than just the timestamp; keep the old constant as an alias of course. Currently the internal APIs don't make that possible, but it wouldn't be hard to split them up in a way that makes it possible. Thanks to Neil Horman for helping me figure out the functions that add the control messages. Signed-off-by: NJohannes Berg <johannes.berg@intel.com> Signed-off-by: NJohn W. Linville <linville@tuxdriver.com>
-
- 09 11月, 2011 1 次提交
-
-
由 Eric Dumazet 提交于
Make clear that sk_clone() and inet_csk_clone() return a locked socket. Add _lock() prefix and kerneldoc. Suggested-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 01 11月, 2011 2 次提交
-
-
由 Joe Perches 提交于
Standardize the style for compiler based printf format verification. Standardized the location of __printf too. Done via script and a little typing. $ grep -rPl --include=*.[ch] -w "__attribute__" * | \ grep -vP "^(tools|scripts|include/linux/compiler-gcc.h)" | \ xargs perl -n -i -e 'local $/; while (<>) { s/\b__attribute__\s*\(\s*\(\s*format\s*\(\s*printf\s*,\s*(.+)\s*,\s*(.+)\s*\)\s*\)\s*\)/__printf($1, $2)/g ; print; }' [akpm@linux-foundation.org: revert arch bits] Signed-off-by: NJoe Perches <joe@perches.com> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Paul Gortmaker 提交于
The <linux/module.h> pretty much brings in the kitchen sink along with it, so it should be avoided wherever reasonably possible in terms of being included from other commonly used <linux/something.h> files, as it results in a measureable increase on compile times. The worst culprit was probably device.h since it is used everywhere. This file also had an implicit dependency/usage of mutex.h which was masked by module.h, and is also fixed here at the same time. There are over a dozen other headers that simply declare the struct instead of pulling in the whole file, so follow their lead and simply make it a few more. Most of the implicit dependencies on module.h being present by these headers pulling it in have been now weeded out, so we can finally make this change with hopefully minimal breakage. Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
-
- 18 8月, 2011 1 次提交
-
-
由 Tom Herbert 提交于
The l4_rxhash flag was added to the skb structure to indicate that the rxhash value was computed over the 4 tuple for the packet which includes the port information in the encapsulated transport packet. This is used by the stack to preserve the rxhash value in __skb_rx_tunnel. Signed-off-by: NTom Herbert <therbert@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 09 7月, 2011 1 次提交
-
-
由 Michal Hocko 提交于
Since ca5ecddf (rcu: define __rcu address space modifier for sparse) rcu_dereference_check use rcu_read_lock_held as a part of condition automatically so callers do not have to do that as well. Signed-off-by: NMichal Hocko <mhocko@suse.cz> Acked-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: NJiri Kosina <jkosina@suse.cz>
-
- 07 7月, 2011 1 次提交
-
-
由 Shirley Ma 提交于
Signed-off-by: NShirley Ma <xma@us.ibm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 28 6月, 2011 2 次提交
-
-
由 Vitaliy Ivanov 提交于
Fix 'make htmldocs' warnings: Warning(/include/linux/hrtimer.h:153): No description found for parameter 'clockid' Warning(/include/linux/device.h:604): Excess struct/union/enum/typedef member 'of_match' description in 'device' Warning(/include/net/sock.h:349): Excess struct/union/enum/typedef member 'sk_rmem_alloc' description in 'sock' Signed-off-by: NVitaliy Ivanov <vitalivanov@gmail.com> Acked-by: NGrant Likely <grant.likely@secretlab.ca> Acked-by: NDavid S. Miller <davem@davemloft.net> Acked-by: NRandy Dunlap <rdunlap@xenotime.net> Signed-off-by: NJiri Kosina <jkosina@suse.cz>
-
由 Vitaliy Ivanov 提交于
Fix 'make htmldocs' warnings: Warning(/include/linux/hrtimer.h:153): No description found for parameter 'clockid' Warning(/include/linux/device.h:604): Excess struct/union/enum/typedef member 'of_match' description in 'device' Warning(/include/net/sock.h:349): Excess struct/union/enum/typedef member 'sk_rmem_alloc' description in 'sock' Signed-off-by: NVitaliy Ivanov <vitalivanov@gmail.com> Acked-by: NGrant Likely <grant.likely@secretlab.ca> Acked-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 07 6月, 2011 1 次提交
-
-
由 Alexey Dobriyan 提交于
* remove interrupt.g inclusion from netdevice.h -- not needed * fixup fallout, add interrupt.h and hardirq.h back where needed. Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-