- 20 4月, 2013 1 次提交
-
-
由 Patrick McHardy 提交于
Rename the hardware VLAN acceleration features to include "CTAG" to indicate that they only support CTAGs. Follow up patches will introduce 802.1ad server provider tagging (STAGs) and require the distinction for hardware not supporting acclerating both. Signed-off-by: NPatrick McHardy <kaber@trash.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 17 4月, 2013 1 次提交
-
-
由 Joe Perches 提交于
Update debugging messages to a more current style. Emit these debugging messages at KERN_DEBUG instead of KERN_DEFAULT. Add and use neigh_dbg(level, fmt, ...) macro Add dynamic_debug capability via pr_debug Convert embedded function names to "%s: ", __func__ Signed-off-by: NJoe Perches <joe@perches.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 16 4月, 2013 1 次提交
-
-
由 Vlad Yasevich 提交于
The current implementation of dev_uc_sync/unsync() assumes that there is a strict 1-to-1 relationship between the source and destination of the sync. In other words, once an address has been synced to a destination device, it will not be synced to any other device through the sync API. However, there are some virtual devices that aggreate a number of lower devices and need to sync addresses to all of them. The current API falls short there. This patch introduces a new dev_uc_sync_multiple() api that can be called in the above circumstances and allows sync to work for every invocation. CC: Jiri Pirko <jiri@resnulli.us> Signed-off-by: NVlad Yasevich <vyasevic@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 12 4月, 2013 1 次提交
-
-
由 David S. Miller 提交于
This reverts commit 763eff57. It causes build regressions, as per Stephen Rothwell: ==================== After merging the final tree, today's linux-next build (powerpc allyesconfig) failed like this: net/core/netprio_cgroup.c:250:29: error: static declaration of 'net_prio_subsys' follows non-static declaration include/linux/cgroup_subsys.h:71:1: note: previous declaration of 'net_prio_subsys' was here ==================== Reported-by: NStephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 11 4月, 2013 1 次提交
-
-
由 stephen hemminger 提交于
Minor sparse warning Signed-off-by: NStephen Hemminger <stephen@networkplumber.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 10 4月, 2013 2 次提交
-
-
由 Zefan Li 提交于
The callers always pass current to sock_update_netprio(). Signed-off-by: NLi Zefan <lizefan@huawei.com> Acked-by: NNeil Horman <nhorman@tuxdriver.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Zefan Li 提交于
The callers always pass current to sock_update_classid(). Signed-off-by: NLi Zefan <lizefan@huawei.com> Acked-by: NNeil Horman <nhorman@tuxdriver.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 08 4月, 2013 1 次提交
-
-
由 Eric W. Biederman 提交于
Now that uids and gids are completely encapsulated in kuid_t and kgid_t we no longer need to pass struct cred which allowed us to test both the uid and the user namespace for equality. Passing struct cred potentially allows us to pass the entire group list as BSD does but I don't believe the cost of cache line misses justifies retaining code for a future potential application. Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 06 4月, 2013 1 次提交
-
-
由 Patrick McHardy 提交于
Commit 130549fe ("netfilter: reset nf_trace in nf_reset") added code to reset nf_trace in nf_reset(). This is wrong and unnecessary. nf_reset() is used in the following cases: - when passing packets up the the socket layer, at which point we want to release all netfilter references that might keep modules pinned while the packet is queued. nf_trace doesn't matter anymore at this point. - when encapsulating or decapsulating IPsec packets. We want to continue tracing these packets after IPsec processing. - when passing packets through virtual network devices. Only devices on that encapsulate in IPv4/v6 matter since otherwise nf_trace is not used anymore. Its not entirely clear whether those packets should be traced after that, however we've always done that. - when passing packets through virtual network devices that make the packet cross network namespace boundaries. This is the only cases where we clearly want to reset nf_trace and is also what the original patch intended to fix. Add a new function nf_reset_trace() and use it in dev_forward_skb() to fix this properly. Signed-off-by: NPatrick McHardy <kaber@trash.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 05 4月, 2013 1 次提交
-
-
由 Vlad Yasevich 提交于
A few drivers use dev_uc_sync/unsync to synchronize the address lists from master down to slave/lower devices. In some cases (bond/team) a single address list is synched down to multiple devices. At the time of unsync, we have a leak in these lower devices, because "synced" is treated as a boolean and the address will not be unsynced for anything after the first device/call. Treat "synced" as a count (same as refcount) and allow all unsync calls to work. Signed-off-by: NVlad Yasevich <vyasevic@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 03 4月, 2013 1 次提交
-
-
由 Jacob Keller 提交于
Commit 7d4c04fc ("net: add option to enable error queue packets waking select") has an issue due to operator precedence causing the bit-wise OR to bind to the sock_flags call instead of the result of the terniary conditional. This fixes the *_poll functions to work properly. The old code results in "mask |= POLLPRI" instead of what was intended, which is to only include POLLPRI when the socket option is enabled. Signed-off-by: NJacob Keller <jacob.e.keller@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 02 4月, 2013 1 次提交
-
-
由 Julian Anastasov 提交于
Rename skb_dst_set_noref to __skb_dst_set_noref and add force flag as suggested by David Miller. The new wrapper skb_dst_set_noref_force will force dst entries that are not cached to be attached as skb dst without taking reference as long as provided dst is reclaimed after RCU grace period. Signed-off-by: NJulian Anastasov <ja@ssi.bg> Signed-off by: Hans Schillstrom <hans@schillstrom.com> Acked-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NSimon Horman <horms@verge.net.au>
-
- 01 4月, 2013 1 次提交
-
-
由 Keller, Jacob E 提交于
Currently, when a socket receives something on the error queue it only wakes up the socket on select if it is in the "read" list, that is the socket has something to read. It is useful also to wake the socket if it is in the error list, which would enable software to wait on error queue packets without waking up for regular data on the socket. The main use case is for receiving timestamped transmit packets which return the timestamp to the socket via the error queue. This enables an application to select on the socket for the error queue only instead of for the regular traffic. -v2- * Added the SO_SELECT_ERR_QUEUE socket option to every architechture specific file * Modified every socket poll function that checks error queue Signed-off-by: NJacob Keller <jacob.e.keller@intel.com> Cc: Jeffrey Kirsher <jeffrey.t.kirsher@intel.com> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Matthew Vick <matthew.vick@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 30 3月, 2013 5 次提交
-
-
由 Eric Dumazet 提交于
commit 35d48903 (bonding: fix rx_handler locking) added a race in bonding driver, reported by Steven Rostedt who did a very good diagnosis : <quoting Steven> I'm currently debugging a crash in an old 3.0-rt kernel that one of our customers is seeing. The bug happens with a stress test that loads and unloads the bonding module in a loop (I don't know all the details as I'm not the one that is directly interacting with the customer). But the bug looks to be something that may still be present and possibly present in mainline too. It will just be much harder to trigger it in mainline. In -rt, interrupts are threads, and can schedule in and out just like any other thread. Note, mainline now supports interrupt threads so this may be easily reproducible in mainline as well. I don't have the ability to tell the customer to try mainline or other kernels, so my hands are somewhat tied to what I can do. But according to a core dump, I tracked down that the eth irq thread crashed in bond_handle_frame() here: slave = bond_slave_get_rcu(skb->dev); bond = slave->bond; <--- BUG the slave returned was NULL and accessing slave->bond caused a NULL pointer dereference. Looking at the code that unregisters the handler: void netdev_rx_handler_unregister(struct net_device *dev) { ASSERT_RTNL(); RCU_INIT_POINTER(dev->rx_handler, NULL); RCU_INIT_POINTER(dev->rx_handler_data, NULL); } Which is basically: dev->rx_handler = NULL; dev->rx_handler_data = NULL; And looking at __netif_receive_skb() we have: rx_handler = rcu_dereference(skb->dev->rx_handler); if (rx_handler) { if (pt_prev) { ret = deliver_skb(skb, pt_prev, orig_dev); pt_prev = NULL; } switch (rx_handler(&skb)) { My question to all of you is, what stops this interrupt from happening while the bonding module is unloading? What happens if the interrupt triggers and we have this: CPU0 CPU1 ---- ---- rx_handler = skb->dev->rx_handler netdev_rx_handler_unregister() { dev->rx_handler = NULL; dev->rx_handler_data = NULL; rx_handler() bond_handle_frame() { slave = skb->dev->rx_handler; bond = slave->bond; <-- NULL pointer dereference!!! What protection am I missing in the bond release handler that would prevent the above from happening? </quoting Steven> We can fix bug this in two ways. First is adding a test in bond_handle_frame() and others to check if rx_handler_data is NULL. A second way is adding a synchronize_net() in netdev_rx_handler_unregister() to make sure that a rcu protected reader has the guarantee to see a non NULL rx_handler_data. The second way is better as it avoids an extra test in fast path. Reported-by: NSteven Rostedt <rostedt@goodmis.org> Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: Jiri Pirko <jpirko@redhat.com> Cc: Paul E. McKenney <paulmck@us.ibm.com> Acked-by: NSteven Rostedt <rostedt@goodmis.org> Reviewed-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 John Fastabend 提交于
In rtnl_fdb_dump() when the fdb_dump ndo op is not populated we never set the idx value so that cb->arg[0] is always 0. Resulting in a endless loop of messages. Introduced with this commit, commit 090096bf Author: Vlad Yasevich <vyasevic@redhat.com> Date: Wed Mar 6 15:39:42 2013 +0000 net: generic fdb support for drivers without ndo_fdb_<op> CC: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: NJohn Fastabend <john.r.fastabend@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Shmulik Ladkani 提交于
'nf_reset' is called just prior calling 'netif_rx'. No need to call it twice. Reported-by: NIgor Michailov <rgohita@gmail.com> Signed-off-by: NShmulik Ladkani <shmulik.ladkani@gmail.com> Acked-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Li RongQing 提交于
replace per_cpu with per_cpu_ptr to save conversion between address and pointer Signed-off-by: NLi RongQing <roy.qing.li@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Li RongQing 提交于
flush_tasklet is not percpu var, and percpu is percpu var, and this_cpu_ptr(&info->cache->percpu->flush_tasklet) is not equal to &this_cpu_ptr(info->cache->percpu)->flush_tasklet 1f743b07(use this_cpu_ptr per-cpu helper) introduced this bug. Signed-off-by: NLi RongQing <roy.qing.li@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 29 3月, 2013 1 次提交
-
-
由 Hong zhi guo 提交于
Signed-off-by: NHong Zhiguo <honkiko@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 28 3月, 2013 3 次提交
-
-
由 Wei Yongjun 提交于
Fix to return a negative error code from the error handling case instead of 0(possible overwrite to 0 by ops->fill_xstats call), as returned elsewhere in this function. Signed-off-by: NWei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Andy Shevchenko 提交于
In kernel we have fast and pretty implementation of the isxdigit() function. Let's use it. Signed-off-by: NAndy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jason Wang 提交于
For untrusted packets with partial checksum, we need to set the transport header for precise packet length estimation. We can just let skb_pratial_csum_set() to do this to avoid extra call to skb_flow_dissect() and simplify the caller. Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: NJason Wang <jasowang@redhat.com> Acked-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 27 3月, 2013 1 次提交
-
-
由 Jason Wang 提交于
gso_segs were reset to zero when kernel receive packets from untrusted source. But we use this zero value to estimate precise packet len which is wrong. So this patch tries to estimate the correct gso_segs value before using it in qdisc_pkt_len_init(). Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: NJason Wang <jasowang@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 26 3月, 2013 1 次提交
-
-
由 David S. Miller 提交于
It's always zero. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 25 3月, 2013 2 次提交
-
-
由 Eric Dumazet 提交于
The WARN_ON(in_interrupt()) in net_enable_timestamp() can get false positive, in socket clone path, run from softirq context : [ 3641.624425] WARNING: at net/core/dev.c:1532 net_enable_timestamp+0x7b/0x80() [ 3641.668811] Call Trace: [ 3641.671254] <IRQ> [<ffffffff80286817>] warn_slowpath_common+0x87/0xc0 [ 3641.677871] [<ffffffff8028686a>] warn_slowpath_null+0x1a/0x20 [ 3641.683683] [<ffffffff80742f8b>] net_enable_timestamp+0x7b/0x80 [ 3641.689668] [<ffffffff80732ce5>] sk_clone_lock+0x425/0x450 [ 3641.695222] [<ffffffff8078db36>] inet_csk_clone_lock+0x16/0x170 [ 3641.701213] [<ffffffff807ae449>] tcp_create_openreq_child+0x29/0x820 [ 3641.707663] [<ffffffff807d62e2>] ? ipt_do_table+0x222/0x670 [ 3641.713354] [<ffffffff807aaf5b>] tcp_v4_syn_recv_sock+0xab/0x3d0 [ 3641.719425] [<ffffffff807af63a>] tcp_check_req+0x3da/0x530 [ 3641.724979] [<ffffffff8078b400>] ? inet_hashinfo_init+0x60/0x80 [ 3641.730964] [<ffffffff807ade6f>] ? tcp_v4_rcv+0x79f/0xbe0 [ 3641.736430] [<ffffffff807ab9bd>] tcp_v4_do_rcv+0x38d/0x4f0 [ 3641.741985] [<ffffffff807ae14a>] tcp_v4_rcv+0xa7a/0xbe0 Its safe at this point because the parent socket owns a reference on the netstamp_needed, so we cant have a 0 -> 1 transition, which requires to lock a mutex. Instead of refining the check, lets remove it, as all known callers are safe. If it ever changes in the future, static_key_slow_inc() will complain anyway. Reported-by: NLaurent Chavey <chavey@google.com> Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Nicolas Dichtel 提交于
This patch takes benefit of dev_addr_genid and dev_base_seq to check if a change occurs during a netlink dump. If a change is detected, the flag NLM_F_DUMP_INTR is set in the first message after the dump was interrupted. Note that seq and prev_seq must be reset between each family in rtnl_dump_all() because they are specific to each family. Reported-by: NJunwei Zhang <junwei.zhang@6wind.com> Reported-by: NHongjun Li <hongjun.li@6wind.com> Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 22 3月, 2013 1 次提交
-
-
由 Thomas Graf 提交于
With decnet converted, we can finally get rid of rta_buf and its computations around it. It also gets rid of the minimal header length verification since all message handlers do that explicitly anyway. Signed-off-by: NThomas Graf <tgraf@suug.ch> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 21 3月, 2013 5 次提交
-
-
由 Zefan Li 提交于
The cgroup code has been surrounded by ifdef CONFIG_NET_CLS_CGROUP and CONFIG_NETPRIO_CGROUP. Signed-off-by: NLi Zefan <lizefan@huawei.com> Acked-by: NNeil Horman <nhorman@tuxdriver.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Chris Metcalf 提交于
Previously, if you did an "ifconfig down" or similar on one core, and the kernel had CONFIG_XFRM enabled, every core would be interrupted to check its percpu flow list for items that could be garbage collected. With this change, we generate a mask of cores that actually have any percpu items, and only interrupt those cores. When we are trying to isolate a set of cpus from interrupts, this is important to do. Signed-off-by: NChris Metcalf <cmetcalf@tilera.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Daniel Borkmann 提交于
It is very useful to do dynamic truncation of packets. In particular, we're interested to push the necessary header bytes to the user space and cut off user payload that should probably not be transferred for some reasons (e.g. privacy, speed, or others). With the ancillary extension PAY_OFFSET, we can load it into the accumulator, and return it. E.g. in bpfc syntax ... ld #poff ; { 0x20, 0, 0, 0xfffff034 }, ret a ; { 0x16, 0, 0, 0x00000000 }, ... as a filter will accomplish this without having to do a big hackery in a BPF filter itself. Follow-up JIT implementations are welcome. Thanks to Eric Dumazet for suggesting and discussing this during the Netfilter Workshop in Copenhagen. Suggested-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDaniel Borkmann <dborkman@redhat.com> Acked-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Daniel Borkmann 提交于
__skb_get_poff() returns the offset to the payload as far as it could be dissected. The main user is currently BPF, so that we can dynamically truncate packets without needing to push actual payload to the user space and instead can analyze headers only. Suggested-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDaniel Borkmann <dborkman@redhat.com> Acked-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Daniel Borkmann 提交于
In skb_flow_dissect(), we perform a dissection of a skbuff. Since we're doing the work here anyway, also store thoff for a later usage, e.g. in the BPF filter. Suggested-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDaniel Borkmann <dborkman@redhat.com> Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 19 3月, 2013 1 次提交
-
-
由 Kusanagi Kouichi 提交于
Signed-off-by: NKusanagi Kouichi <slash@ac.auone-net.jp> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 18 3月, 2013 3 次提交
-
-
由 Eric W. Biederman 提交于
Don't allow spoofing pids over unix domain sockets in the corner cases where a user has created a user namespace but has not yet created a pid namespace. Cc: stable@vger.kernel.org Reported-by: NAndy Lutomirski <luto@amacapital.net> Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
-
由 Lai Jiangshan 提交于
DEFINE_STATIC_SRCU() defines srcu struct and do init at build time. Signed-off-by: NLai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David Stevens 提交于
This patch generalizes VXLAN forwarding table entries allowing an administrator to: 1) specify multiple destinations for a given MAC 2) specify alternate vni's in the VXLAN header 3) specify alternate destination UDP ports 4) use multicast MAC addresses as fdb lookup keys 5) specify multicast destinations 6) specify the outgoing interface for forwarded packets The combination allows configuration of more complex topologies using VXLAN encapsulation. Changes since v1: rebase to 3.9.0-rc2 Signed-Off-By: NDavid L Stevens <dlstevens@us.ibm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 17 3月, 2013 1 次提交
-
-
由 Vlad Yasevich 提交于
Range/validity checks on rta_type in rtnetlink_rcv_msg() do not account for flags that may be set. This causes the function to return -EINVAL when flags are set on the type (for example NLA_F_NESTED). Signed-off-by: NVlad Yasevich <vyasevic@redhat.com> Acked-by: NThomas Graf <tgraf@suug.ch> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 12 3月, 2013 2 次提交
-
-
由 Li RongQing 提交于
[ Bug added added in commit 05e8ef4a (net: factor out skb_mac_gso_segment() from skb_gso_segment() ) ] move vlan_depth out of while loop, or else vlan_depth always is ETH_HLEN, can not be increased, and lead to infinite loop when frame has two vlan headers. Signed-off-by: NLi RongQing <roy.qing.li@gmail.com> Acked-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Michael Dalton 提交于
Add support for L2 GRE tunnels, so that RPS can be more effective. Signed-off-by: NMichael Dalton <mwdalton@google.com> Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 10 3月, 2013 1 次提交
-
-
由 Mathias Krause 提交于
Initialize the mac address buffer with 0 as the driver specific function will probably not fill the whole buffer. In fact, all in-kernel drivers fill only ETH_ALEN of the MAX_ADDR_LEN bytes, i.e. 6 of the 32 possible bytes. Therefore we currently leak 26 bytes of stack memory to userland via the netlink interface. Signed-off-by: NMathias Krause <minipli@googlemail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-