Commits · d94ce9b283736a876b2e6dec665c68e5e8b5d55e · openeuler / Kernel

23 Oct, 2012 · 3 commits

ipv4: 16 slots in initial fib_info hash table · d94ce9b2

Committed by Eric Dumazet on Oct 21, 2012

A small host typically needs ~10 fib_info structures, so create initial
hash table with 16 slots instead of only one. This removes potential
false sharing and reallocs/rehashes (1->2->4->8->16)
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
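
The saving described above can be estimated with a small sketch. It assumes a simplified growth policy in which the table doubles whenever it is smaller than the number of entries; the function name `demo_resizes_needed` is hypothetical and is not kernel code.

```c
#include <stddef.h>

/*
 * Hypothetical helper: counts how many doubling steps a hash table
 * needs to grow from initial_slots until it can hold `entries` items.
 * Simplified model, not the kernel's actual resize condition.
 */
static unsigned int demo_resizes_needed(size_t initial_slots, size_t entries)
{
    size_t size = initial_slots ? initial_slots : 1;
    unsigned int resizes = 0;

    while (size < entries) {
        size *= 2;      /* each doubling implies a realloc and a rehash */
        resizes++;
    }
    return resizes;
}
```

Under this model, a host with about 10 entries goes through four doublings when it starts from one slot (1->2->4->8->16) and none when it starts from 16.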

tcp: speedup SIOCINQ ioctl · 0e71c55c

Committed by Eric Dumazet on Oct 21, 2012

SIOCINQ can use the lock_sock_fast() version to avoid double acquisition
of socket lock.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: RFC 5961 5.2 Blind Data Injection Attack Mitigation · 354e4aa3

Committed by Eric Dumazet on Oct 21, 2012

RFC 5961 5.2 [Blind Data Injection Attack].[Mitigation]

  All TCP stacks MAY implement the following mitigation.  TCP stacks
  that implement this mitigation MUST add an additional input check to
  any incoming segment.  The ACK value is considered acceptable only if
  it is in the range of ((SND.UNA - MAX.SND.WND) <= SEG.ACK <=
  SND.NXT).  All incoming segments whose ACK value doesn't satisfy the
  above condition MUST be discarded and an ACK sent back.

Move tcp_send_challenge_ack() before tcp_ack() to avoid a forward
declaration.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Jerry Chu <hkchu@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
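
The range check quoted from RFC 5961 can be sketched in userspace C as follows. This is an illustration under stated assumptions, not the kernel patch itself: the names `demo_seq_before`, `demo_seq_after` and `demo_ack_acceptable` are hypothetical, and the comparisons use modulo-2^32 arithmetic so that sequence number wraparound is handled.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if sequence number a comes before b, modulo 2^32. */
static bool demo_seq_before(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) < 0;
}

/* True if sequence number a comes after b, modulo 2^32. */
static bool demo_seq_after(uint32_t a, uint32_t b)
{
    return demo_seq_before(b, a);
}

/*
 * RFC 5961 section 5.2 check:
 *   (SND.UNA - MAX.SND.WND) <= SEG.ACK <= SND.NXT
 * A segment failing this check would be discarded and answered
 * with an ACK (a "challenge ACK").
 */
static bool demo_ack_acceptable(uint32_t seg_ack, uint32_t snd_una,
                                uint32_t snd_nxt, uint32_t max_snd_wnd)
{
    if (demo_seq_before(seg_ack, snd_una - max_snd_wnd))
        return false;   /* too old */
    if (demo_seq_after(seg_ack, snd_nxt))
        return false;   /* acknowledges data never sent */
    return true;
}
```

A caller would run this check on every incoming segment carrying an ACK and send the challenge ACK when it returns false.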

22 Oct, 2012 · 4 commits

pkt_sched: use ns_to_ktime() helper · 46baac38

Committed by Eric Dumazet on Oct 20, 2012

ns_to_ktime() seems better than ktime_set() + ktime_add_ns()
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net:dev: remove double indentical assignment in dev_change_net_namespace(). · 47b70db5

Committed by Rami Rosen on Oct 19, 2012

This patch removes double assignment of err to -EINVAL in dev_change_net_namespace().
Signed-off-by: Rami Rosen <ramirose@gmail.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sockopt: Make SO_BINDTODEVICE readable · f7b86bfe

Committed by Pavel Emelyanov on Oct 18, 2012

The SO_BINDTODEVICE option is the only SOL_SOCKET one that can be set but
cannot be read back via the sockopt API. The only way to find the device id a
socket is bound to is via the sock-diag interface. But the diag works only on
hashed sockets, while the option in question can be set on a not-yet-hashed one.

That said, in order to know what device a socket is bound to (we do want
to know this in checkpoint-restore project) I propose to make this option
getsockopt-able and report the respective device index.

Another solution to the problem might be to teach the sock-diag reporting
info on unhashed sockets. Should I go this way instead?
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

pktgen: Use ipv6_addr_any · 06e30411

Committed by Joe Perches on Oct 18, 2012

Use the standard test for a non-zero ipv6 address.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
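
ipv6_addr_any() is a kernel helper, so it cannot be called from userspace. As a rough analogue, the sketch below uses the standard IN6_IS_ADDR_UNSPECIFIED() macro to test for the all-zero address "::"; the wrapper name `demo_is_unspecified_v6` is hypothetical.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdbool.h>

/*
 * Returns true if `text` parses as an IPv6 address and that address
 * is the unspecified (all-zero) address. Unparsable input yields false.
 */
static bool demo_is_unspecified_v6(const char *text)
{
    struct in6_addr addr;

    if (inet_pton(AF_INET6, text, &addr) != 1)
        return false;
    return IN6_IS_ADDR_UNSPECIFIED(&addr);
}
```

Using one well-known predicate instead of an open-coded comparison of the address words keeps the intent obvious, which appears to be the point of the change above.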

19 Oct, 2012 · 6 commits

tcp: fix FIONREAD/SIOCINQ · a3374c42

Committed by Eric Dumazet on Oct 18, 2012

tcp_ioctl() tries to take into account whether the tcp socket received a FIN
in order to report the correct number of bytes in the receive queue.

But it's flaky, because if the application ate the last skb,
we return 1 instead of 0.

Correct way to detect that FIN was received is to test SOCK_DONE.
Reported-by: Elliot Hughes <enh@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
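
From the application side, the value discussed above is what a FIONREAD (SIOCINQ) ioctl returns. The sketch below shows the call; `demo_bytes_readable` is a hypothetical wrapper name, and it works on any stream socket, not only TCP.

```c
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Returns the number of unread bytes queued on fd, or -1 on error.
 * Once all data has been consumed, the expected answer is 0, even if
 * the peer has already closed its end of the connection.
 */
static int demo_bytes_readable(int fd)
{
    int pending = 0;

    if (ioctl(fd, FIONREAD, &pending) < 0)
        return -1;
    return pending;
}
```

An application would typically call this before sizing a read buffer; a spurious 1 after the last skb was consumed, as described in the commit message, would make it believe one more byte is waiting.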

netlink: use kfree_rcu() in netlink_release() · 6d772ac5

Committed by Eric Dumazet on Oct 18, 2012

On some suspend/resume operations involving wimax device, we have
noticed some intermittent memory corruptions in netlink code.

Stéphane Marchesin tracked this corruption in netlink_update_listeners()
and suggested a patch.

It appears netlink_release() should use kfree_rcu() instead of kfree()
for the listeners structure as it may be used by other cpus using RCU
protection.

netlink_release() must set to NULL the listeners pointer when
it is about to be freed.

Also have to protect netlink_update_listeners() and
netlink_has_listeners() if listeners is NULL.

Add a nl_deref_protected() lockdep helper to properly document which
locks protect us.
Reported-by: Jonathan Kliegman <kliegs@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Stéphane Marchesin <marcheu@google.com>
Cc: Sam Leffler <sleffler@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv4: Fix flushing of cached routing informations · 13d82bf5

Committed by Steffen Klassert on Oct 17, 2012

Currently we cannot flush cached pmtu/redirect information via
the ipv4_sysctl_rtcache_flush sysctl. We need to check the rt_genid
of the old route and reset the nh exception if the old route is
expired when we bind a new route to a nh exception.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

vlan: allow to change type when no vlan device is hooked on netdev · 18c22a03

Committed by Jiri Pirko on Oct 17, 2012

vlan_info might be present but still no vlan devices might be there.
That is in case of vlan0 automatically added.

So in that case, allow to change netdev type.
Reported-by: Jon Stanley <jstanley@rmrf.net>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>

batman-adv: Fix potential broadcast BLA-duplicate-check race condition · 7dac7b76

Committed by Linus Lüssing on Oct 17, 2012

Threads in the bottom half of batadv_bla_check_bcast_duplist() might
otherwise for instance overwrite variables which other threads might
be using/reading at the same time in the top half, potentially
leading to messing up the bcast_duplist, possibly resulting in false
bridge loop avoidance duplicate check decisions.
Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Acked-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>

batman-adv: Fix broadcast packet CRC calculation · 7f112af4

Committed by Linus Lüssing on Oct 17, 2012

So far the crc16 checksum for a batman-adv broadcast data packet, received
on a batman-adv hard interface, was calculated over zero bytes of its
content leading to many incoming broadcast data packets wrongly being
dropped (60-80% packet loss).

This patch fixes this issue by calculating the crc16 over the actual,
complete broadcast payload.

The issue is a regression introduced by
("batman-adv: add broadcast duplicate check").
Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Acked-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
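
A minimal sketch of why a checksum over zero bytes is harmful: with a length of 0 the CRC never looks at the payload, so every packet produces the same value and distinct packets can be mistaken for duplicates. The bitwise routine below uses the reflected polynomial 0xA001 (CRC-16/ARC), which is assumed here to match the kernel's crc16(); the name `demo_crc16` is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Bitwise CRC-16 with reflected polynomial 0xA001.
 * `crc` is the starting value; with len == 0 it is returned unchanged.
 */
static uint16_t demo_crc16(uint16_t crc, const uint8_t *buf, size_t len)
{
    while (len--) {
        crc ^= *buf++;
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 1)
                crc = (uint16_t)((crc >> 1) ^ 0xA001);
            else
                crc = (uint16_t)(crc >> 1);
        }
    }
    return crc;
}
```

Calling it with the real payload length yields distinct checksums for distinct short payloads, while a length of 0 always returns the starting value.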

17 Oct, 2012 · 4 commits

netfilter: xt_TEE: don't use destination address found in header · 2ad5b9e4

Committed by Eric Dumazet on Oct 16, 2012

Torsten Luettgert bisected TEE regression starting with commit
f8126f1d (ipv4: Adjust semantics of rt->rt_gateway.)

The problem is that it tries to ARP-lookup the original destination
address of the forwarded packet, not the address of the gateway.

Fix this using FLOWI_FLAG_KNOWN_NH Julian added in commit
c92b9655 (ipv4: Add FLOWI_FLAG_KNOWN_NH), so that known
nexthop (info->gw.ip) has preference on resolving.
Reported-by: Torsten Luettgert <ml-netfilter@enda.eu>
Bisected-by: Torsten Luettgert <ml-netfilter@enda.eu>
Tested-by: Torsten Luettgert <ml-netfilter@enda.eu>
Cc: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

ipv6: addrconf: fix /proc/net/if_inet6 · 9f0d3c27

Committed by Eric Dumazet on Oct 16, 2012

Commit 1d578303 (ipv6/addrconf: speedup /proc/net/if_inet6 filling)
added bugs hiding some devices from if_inet6 and breaking applications.

"ip -6 addr" could still display all IPv6 addresses, while "ifconfig -a"
couldn't.

One way to reproduce the bug is by starting in a shell :

unshare -n /bin/bash
ifconfig lo up

And in the original net namespace, the lo device disappeared from if_inet6.
Reported-by: Jan Hinnerk Stosch <janhinnerk.stosch@gmail.com>
Tested-by: Jan Hinnerk Stosch <janhinnerk.stosch@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Mihai Maruseac <mihai.maruseac@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sctp: fix call to SCTP_CMD_PROCESS_SACK in sctp_cmd_interpreter() · f6e80abe

Committed by Zijie Pan on Oct 15, 2012

Bug introduced by commit edfee033
(sctp: check src addr when processing SACK to update transport state)
Signed-off-by: Zijie Pan <zijie.pan@6wind.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

vlan: fix bond/team enslave of vlan challenged slave/port · 55462cf3

Committed by Jiri Pirko on Oct 14, 2012

In vlan_uses_dev(), check the number of vlan devices rather than the
existence of vlan_info. By default, vlan id 0 is present without a
corresponding vlan device, which prevented enslaving a vlan-challenged
device.
Reported-by: Jon Stanley <jstanley@rmrf.net>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>

15 Oct, 2012 · 2 commits

netfilter: xt_nat: fix incorrect hooks for SNAT and DNAT targets · 939ccba4

Committed by Elison Niven on Oct 15, 2012

In (c7232c99 netfilter: add protocol independent NAT core), the
hooks were accidentally modified:

SNAT hooks are POST_ROUTING and LOCAL_IN (before it was LOCAL_OUT).
DNAT hooks are PRE_ROUTING and LOCAL_OUT (before it was LOCAL_IN).
Signed-off-by: Elison Niven <elison.niven@cyberoam.com>
Signed-off-by: Sanket Shah <sanket.shah@cyberoam.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netfilter: xt_CT: fix timeout setting with IPv6 · 0153d5a8

Committed by Pablo Neira Ayuso on Oct 11, 2012

This patch fixes ip6tables and the CT target if it is used to set
some custom conntrack timeout policy for IPv6.

Use xt_ct_find_proto which already handles the ip6tables case for us.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

13 Oct, 2012 · 5 commits

userns: Properly print bluetooth socket uids · 1bbb3095

Committed by Eric W. Biederman on Oct 03, 2012

With user namespace support enabled building bluetooth generated the warning.
net/bluetooth/af_bluetooth.c: In function ‘bt_seq_show’:
net/bluetooth/af_bluetooth.c:598:7: warning: format ‘%u’ expects argument of type ‘unsigned int’, but argument 7 has type ‘kuid_t’ [-Wformat]

Convert sock_i_uid from a kuid_t to a uid_t before printing, to avoid
this problem.
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Cc: Masatake YAMATO <yamato@redhat.com>
Cc: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

net: add doc for in4_pton() · 93ac0ef0

Committed by Amerigo Wang on Oct 11, 2012

It is not easy to use in4_pton() correctly without reading
its definition, so add some doc for it.

Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: add doc for in6_pton() · 28194fcd

Committed by Amerigo Wang on Oct 11, 2012

It is not easy to use in6_pton() correctly without reading
its definition, so add some doc for it.

Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

vti: fix sparse bit endian warnings · 8437e761

Committed by stephen hemminger on Oct 11, 2012

Use be32_to_cpu instead of htonl to keep sparse happy.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
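
At run time, converting big-endian to host order and host order to big-endian perform the same byte swap; the difference sparse cares about is which direction the types say the conversion goes. The userspace sketch below uses ntohl() and htonl() as stand-ins for the kernel's be32_to_cpu() and cpu_to_be32(); the wrapper names are hypothetical.

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Wire (big-endian) value to host order: the role be32_to_cpu() plays. */
static uint32_t demo_wire_to_host(uint32_t wire)
{
    return ntohl(wire);
}

/* Host-order value to wire (big-endian) order: the role cpu_to_be32() plays. */
static uint32_t demo_host_to_wire(uint32_t host)
{
    return htonl(host);
}
```

Because both directions produce the same bits, using the wrong helper does not change behaviour, which is presumably why the issue only showed up as a sparse warning.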

tcp: resets are misrouted · 4c675258

Committed by Alexey Kuznetsov on Oct 12, 2012

After commit e2446eaa ("tcp_v4_send_reset: binding oif to iif in no
sock case"), tcp resets are always lost when routing is asymmetric.
Yes, backing out that patch will result in misrouting of resets for
dead connections which used interface binding when were alive, but we
actually cannot do anything here.  What's died that's died and correct
handling normal unbound connections is obviously a priority.

Comment to comment:
> This has few benefits:
>   1. tcp_v6_send_reset already did that.

It was done to route resets for IPv6 link local addresses. It was a
mistake to do so for global addresses. The patch fixes this as well.

Actually, the problem appears to be even more serious than guaranteed
loss of resets.  As reported by Sergey Soloviev <sol@eqv.ru>, those
misrouted resets create a lot of arp traffic and a huge number of
unresolved arp entries, bringing NAT firewalls which use asymmetric
routing to their knees.
Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>

12 Oct, 2012 · 2 commits

tcp: sysctl interface leaks 16 bytes of kernel memory · 0e24c4fc

Committed by Alan Cox on Oct 11, 2012

If the rcu_dereference of tcp_fastopen_ctx ever fails then we copy 16 bytes
of kernel stack into the proc result.
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
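
The class of bug described above, an on-stack buffer that is copied out even when it was never filled, can be sketched as follows. This is a userspace illustration with hypothetical names (`demo_fastopen_ctx`, `demo_export_key`), not the kernel code; the fix shown is simply to zero the buffer when no context is available.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_KEY_LEN 16

/* Hypothetical stand-in for the structure holding the 16-byte key. */
struct demo_fastopen_ctx {
    uint8_t key[DEMO_KEY_LEN];
};

/*
 * Copies the key into `out`. If ctx is NULL, `out` receives zeros
 * instead of whatever happened to be on the stack.
 */
static void demo_export_key(const struct demo_fastopen_ctx *ctx,
                            uint8_t out[DEMO_KEY_LEN])
{
    uint8_t key[DEMO_KEY_LEN];

    if (ctx)
        memcpy(key, ctx->key, sizeof(key));
    else
        memset(key, 0, sizeof(key));   /* without this, stack bytes leak */

    memcpy(out, key, sizeof(key));
}
```

Handling the failure branch explicitly keeps every byte that reaches the caller well defined.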

9P: Fix race between p9_write_work() and p9_fd_request() · 759f4298

Committed by Simon Derr on Sep 17, 2012

Race scenario:

thread A                       thread B

p9_write_work()                p9_fd_request()

if (list_empty
  (&m->unsent_req_list))
  ...

                               spin_lock(&client->lock);
                               req->status = REQ_STATUS_UNSENT;
                               list_add_tail(..., &m->unsent_req_list);
                               spin_unlock(&client->lock);
                               ....
                               if (n & POLLOUT &&
                                   !test_and_set_bit(Wworksched, &m->wsched))
                                       schedule_work(&m->wq);
                               --> not done because Wworksched is set

  clear_bit(Wworksched, &m->wsched);
  return;

--> nobody will take care of sending the new request.

This is not very likely to happen though, because p9_write_work()
being called with an empty unsent_req_list is not frequent.
But this also means that taking the lock earlier will not be costly.
Signed-off-by: Simon Derr <simon.derr@bull.net>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>

11 Oct, 2012 · 7 commits

ipv4: fix route mark sparse warning · 68aaed54

Committed by stephen hemminger on Oct 10, 2012

Sparse complains about RTA_MARK, which should be in host order according
to the include file and its usage in iproute.

net/ipv4/route.c:2223:46: warning: incorrect type in argument 3 (different base types)
net/ipv4/route.c:2223:46: expected restricted __be32 [usertype] value
net/ipv4/route.c:2223:46: got unsigned int [unsigned] [usertype] flowic_mark
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bridge: Pull ip header into skb->data before looking into ip header. · 6caab7b0

Committed by Sarveshwar Bandi on Oct 10, 2012

If the lower layer driver leaves the ip header in the skb fragment, it needs to
be pulled into skb->data first, before inspecting the ip header length or ip
version number.
Signed-off-by: Sarveshwar Bandi <sarveshwar.bandi@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

pktgen: replace scan_ip6() with in6_pton() · c468fb13