提交 · a50a97d415d839e6db9df288ff0205528e52c03e · openeuler / Kernel

15 7月, 2009 1 次提交

gre: fix ToS/DiffServ inherit bug · ee686ca9

由 Andreas Jaggi 提交于 7月 14, 2009

Fixes two bugs:
- ToS/DiffServ inheritance was unintentionally activated when using impair fixed ToS values
- ECN bit was lost during ToS/DiffServ inheritance
Signed-off-by: NAndreas Jaggi <aj@open.ch>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ee686ca9

12 7月, 2009 1 次提交

net: ip_push_pending_frames() fix · e51a67a9

由 Eric Dumazet 提交于 7月 08, 2009

After commit 2b85a34e
(net: No more expensive sock_hold()/sock_put() on each tx)
we do not take any more references on sk->sk_refcnt on outgoing packets.

I forgot to delete two __sock_put() from ip_push_pending_frames()
and ip6_push_pending_frames().
Reported-by: NEmil S Tantilov <emils.tantilov@gmail.com>
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Tested-by: NEmil S Tantilov <emils.tantilov@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e51a67a9

10 7月, 2009 1 次提交

net: adding memory barrier to the poll and receive callbacks · a57de0b4

由 Jiri Olsa 提交于 7月 08, 2009

Adding memory barrier after the poll_wait function, paired with
receive callbacks. Adding fuctions sock_poll_wait and sk_has_sleeper
to wrap the memory barrier.

Without the memory barrier, following race can happen.
The race fires, when following code paths meet, and the tp->rcv_nxt
and __add_wait_queue updates stay in CPU caches.

CPU1                         CPU2

sys_select                   receive packet
  ...                        ...
  __add_wait_queue           update tp->rcv_nxt
  ...                        ...
  tp->rcv_nxt check          sock_def_readable
  ...                        {
  schedule                      ...
                                if (sk->sk_sleep && waitqueue_active(sk->sk_sleep))
                                        wake_up_interruptible(sk->sk_sleep)
                                ...
                             }

If there was no cache the code would work ok, since the wait_queue and
rcv_nxt are opposit to each other.

Meaning that once tp->rcv_nxt is updated by CPU2, the CPU1 either already
passed the tp->rcv_nxt check and sleeps, or will get the new value for
tp->rcv_nxt and will return with new data mask.
In both cases the process (CPU1) is being added to the wait queue, so the
waitqueue_active (CPU2) call cannot miss and will wake up CPU1.

The bad case is when the __add_wait_queue changes done by CPU1 stay in its
cache, and so does the tp->rcv_nxt update on CPU2 side.  The CPU1 will then
endup calling schedule and sleep forever if there are no more data on the
socket.

Calls to poll_wait in following modules were ommited:
	net/bluetooth/af_bluetooth.c
	net/irda/af_irda.c
	net/irda/irnet/irnet_ppp.c
	net/mac80211/rc80211_pid_debugfs.c
	net/phonet/socket.c
	net/rds/af_rds.c
	net/rfkill/core.c
	net/sunrpc/cache.c
	net/sunrpc/rpc_pipe.c
	net/tipc/socket.c
Signed-off-by: NJiri Olsa <jolsa@redhat.com>
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a57de0b4

09 7月, 2009 1 次提交

ipv4: Fix fib_trie rebalancing, part 4 (root thresholds) · 345aa031

由 Jarek Poplawski 提交于 7月 07, 2009

Pawel Staszewski wrote:
<blockquote>
Some time ago i report this:
http://bugzilla.kernel.org/show_bug.cgi?id=6648

and now with 2.6.29 / 2.6.29.1 / 2.6.29.3 and 2.6.30 it back
dmesg output:
oprofile: using NMI interrupt.
Fix inflate_threshold_root. Now=15 size=11 bits
...
Fix inflate_threshold_root. Now=15 size=11 bits

cat /proc/net/fib_triestat
Basic info: size of leaf: 40 bytes, size of tnode: 56 bytes.
Main:
        Aver depth:     2.28
        Max depth:      6
        Leaves:         276539
        Prefixes:       289922
        Internal nodes: 66762
          1: 35046  2: 13824  3: 9508  4: 4897  5: 2331  6: 1149  7: 5
9: 1  18: 1
        Pointers: 691228
Null ptrs: 347928
Total size: 35709  kB
</blockquote>

It seems, the current threshold for root resizing is too aggressive,
and it causes misleading warnings during big updates, but it might be
also responsible for memory problems, especially with non-preempt
configs, when RCU freeing is delayed long after call_rcu.

It should be also mentioned that because of non-atomic changes during
resizing/rebalancing the current lookup algorithm can miss valid leaves
so it's additional argument to shorten these activities even at a cost
of a minimally longer searching.

This patch restores values before the patch "[IPV4]: fib_trie root
node settings", commit: 965ffea4 from
v2.6.22.

Pawel's report:
<blockquote>
I dont see any big change of (cpu load or faster/slower
routing/propagating routes from bgpd or something else) - in avg there
is from 2% to 3% more of CPU load i dont know why but it is - i change
from "preempt" to "no preempt" 3 times and check this my "mpstat -P ALL
1 30"
always avg cpu load was from 2 to 3% more compared to "no preempt"
[...]
cat /proc/net/fib_triestat
Basic info: size of leaf: 20 bytes, size of tnode: 36 bytes.
Main:
        Aver depth:     2.44
        Max depth:      6
        Leaves:         277814
        Prefixes:       291306
        Internal nodes: 66420
          1: 32737  2: 14850  3: 10332  4: 4871  5: 2313  6: 942  7: 371  8: 3  17: 1
        Pointers: 599098
Null ptrs: 254865
Total size: 18067  kB
</blockquote>

According to this and other similar reports average depth is slightly
increased (~0.2), and root nodes are shorter (log 17 vs. 18), but
there is no visible performance decrease. So, until memory handling is
improved or added parameters for changing this individually, this
patch resets to safer defaults.
Reported-by: NPawel Staszewski <pstaszewski@itcare.pl>
Reported-by: NJorge Boncompte [DTI2] <jorge@dti2.net>
Signed-off-by: NJarek Poplawski <jarkao2@gmail.com>
Tested-by: NPawel Staszewski <pstaszewski@itcare.pl>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

345aa031

04 7月, 2009 1 次提交

xfrm4: fix the ports decode of sctp protocol · c615c9f3

由 Wei Yongjun 提交于 7月 02, 2009

The SCTP pushed the skb data above the sctp chunk header, so the check
of pskb_may_pull(skb, xprth + 4 - skb->data) in _decode_session4() will
never return 0 because xprth + 4 - skb->data < 0, the ports decode of
sctp will always fail.
Signed-off-by: NWei Yongjun <yjwei@cn.fujitsu.com>
Acked-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c615c9f3

01 7月, 2009 2 次提交

Revert "ipv4: arp announce, arp_proxy and windows ip conflict verification" · f8a68e75

由 Eric W. Biederman 提交于 6月 30, 2009

This reverts commit 73ce7b01.

After discovering that we don't listen to gratuitious arps in 2.6.30
I tracked the failure down to this commit.

The patch makes absolutely no sense.  RFC2131 RFC3927 and RFC5227.
are all in agreement that an arp request with sip == 0 should be used
for the probe (to prevent learning) and an arp request with sip == tip
should be used for the gratitous announcement that people can learn
from.

It appears the author of the broken patch got those two cases confused
and modified the code to drop all gratuitous arp traffic.  Ouch!

Cc: stable@kernel.org
Signed-off-by: NEric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f8a68e75

ipv4: Fix fib_trie rebalancing, part 3 · 008440e3

由 Jarek Poplawski 提交于 6月 30, 2009

Alas current delaying of freeing old tnodes by RCU in trie_rebalance
is still not enough because we can free a top tnode before updating a
t->trie pointer.
Reported-by: NPawel Staszewski <pstaszewski@itcare.pl>
Tested-by: NPawel Staszewski <pstaszewski@itcare.pl>
Signed-off-by: NJarek Poplawski <jarkao2@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

008440e3

30 6月, 2009 2 次提交

tcp: Do not tack on TSO data to non-TSO packet · 6828b92b

由 Herbert Xu 提交于 6月 28, 2009

If a socket starts out on a non-TSO route, and then switches to
a TSO route, then we will tack on data to the tail of the tx queue
even if it started out life as non-TSO. This is suboptimal because
all of it will then be copied and checksummed unnecessarily.

This patch fixes this by ensuring that skb->ip_summed is set to
CHECKSUM_PARTIAL before appending extra data beyond the MSS.
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

6828b92b

tcp: Stop non-TSO packets morphing into TSO · 8e5b9dda

由 Herbert Xu 提交于 6月 28, 2009

If a socket starts out on a non-TSO route, and then switches to
a TSO route, then the tail on the tx queue can morph into a TSO
packet, causing mischief because the rest of the stack does not
expect a partially linear TSO packet.

This patch fixes this by ensuring that skb->ip_summed is set to
CHECKSUM_PARTIAL before declaring a packet as TSO.
Reported-by: NJohannes Berg <johannes@sipsolutions.net>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

8e5b9dda

29 6月, 2009 1 次提交

netfilter: tcp conntrack: fix unacknowledged data detection with NAT · a3a9f79e

由 Patrick McHardy 提交于 6月 29, 2009

When NAT helpers change the TCP packet size, the highest seen sequence
number needs to be corrected. This is currently only done upwards, when
the packet size is reduced the sequence number is unchanged. This causes
TCP conntrack to falsely detect unacknowledged data and decrease the
timeout.

Fix by updating the highest seen sequence number in both directions after
packet mangling.
Tested-by: NKrzysztof Piotr Oledzki <ole@ans.pl>
Signed-off-by: NPatrick McHardy <kaber@trash.net>

a3a9f79e

27 6月, 2009 1 次提交

inet: Call skb_orphan before tproxy activates · 71f9dacd

由 Herbert Xu 提交于 6月 26, 2009

As transparent proxying looks up the socket early and assigns
it to the skb for later processing, we must drop any existing
socket ownership prior to that in order to distinguish between
the case where tproxy is active and where it is not.
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

71f9dacd

26 6月, 2009 1 次提交

tcp: missing check ACK flag of received segment in FIN-WAIT-2 state · 1ac530b3

由 Wei Yongjun 提交于 6月 24, 2009

RFC0793 defined that in FIN-WAIT-2 state if the ACK bit is off drop
the segment and return[Page 72]. But this check is missing in function
tcp_timewait_state_process(). This cause the segment with FIN flag but
no ACK has two diffent action:

Case 1:
    Node A                      Node B
              <-------------    FIN,ACK
                                (enter FIN-WAIT-1)
    ACK       ------------->
                                (enter FIN-WAIT-2)
    FIN       ------------->    discard
                                (move sk to tw list)

Case 2:
    Node A                      Node B
              <-------------    FIN,ACK
                                (enter FIN-WAIT-1)
    ACK       ------------->
                                (enter FIN-WAIT-2)
                                (move sk to tw list)
    FIN       ------------->

              <-------------    ACK

This patch fixed the problem.
Signed-off-by: NWei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1ac530b3

24 6月, 2009 1 次提交

ipv4 routing: Ensure that route cache entries are usable and reclaimable with caching is off · b6280b47

由 Neil Horman 提交于 6月 22, 2009

When route caching is disabled (rt_caching returns false), We still use route
cache entries that are created and passed into rt_intern_hash once. These
routes need to be made usable for the one call path that holds a reference to
them, and they need to be reclaimed when they're finished with their use. To be
made usable, they need to be associated with a neighbor table entry (which they
currently are not), otherwise iproute_finish2 just discards the packet, since we
don't know which L2 peer to send the packet to. To do this binding, we need to
follow the path a bit higher up in rt_intern_hash, which calls
arp_bind_neighbour, but not assign the route entry to the hash table.
Currently, if caching is off, we simply assign the route to the rp pointer and
are reutrn success. This patch associates us with a neighbor entry first.

Secondly, we need to make sure that any single use routes like this are known to
the garbage collector when caching is off. If caching is off, and we try to
hash in a route, it will leak when its refcount reaches zero. To avoid this,
this patch calls rt_free on the route cache entry passed into rt_intern_hash.
This places us on the gc list for the route cache garbage collector, so that
when its refcount reaches zero, it will be reclaimed (Thanks to Alexey for this
suggestion).

I've tested this on a local system here, and with these patches in place, I'm
able to maintain routed connectivity to remote systems, even if I set
/proc/sys/net/ipv4/rt_cache_rebuild_count to -1, which forces rt_caching to
return false.
Signed-off-by: NNeil Horman <nhorman@redhat.com>
Reported-by: NJarek Poplawski <jarkao2@gmail.com>
Reported-by: NMaxime Bizon <mbizon@freebox.fr>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b6280b47

20 6月, 2009 1 次提交

ipv4: fix NULL pointer + success return in route lookup path · 73e42897

由 Neil Horman 提交于 6月 20, 2009

Don't drop route if we're not caching

I recently got a report of an oops on a route lookup. Maxime was
testing what would happen if route caching was turned off (doing so by setting
making rt_caching always return 0), and found that it triggered an oops. I
looked at it and found that the problem stemmed from the fact that the route
lookup routines were returning success from their lookup paths (which is good),
but never set the **rp pointer to anything (which is bad). This happens because
in rt_intern_hash, if rt_caching returns false, we call rt_drop and return 0.
This almost emulates slient success. What we should be doing is assigning *rp =
rt and _not_ dropping the route. This way, during slow path lookups, when we
create a new route cache entry, we don't immediately discard it, rather we just
don't add it into the cache hash table, but we let this one lookup use it for
the purpose of this route request. Maxime has tested and reports it prevents
the oops. There is still a subsequent routing issue that I'm looking into
further, but I'm confident that, even if its related to this same path, this
patch makes sense to take.
Signed-off-by: NNeil Horman <nhorman@tuxdriver.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

73e42897

18 6月, 2009 2 次提交

net: correct off-by-one write allocations reports · 31e6d363

由 Eric Dumazet 提交于 6月 17, 2009

commit 2b85a34e
(net: No more expensive sock_hold()/sock_put() on each tx)
changed initial sk_wmem_alloc value.

We need to take into account this offset when reporting
sk_wmem_alloc to user, in PROC_FS files or various
ioctls (SIOCOUTQ/TIOCOUTQ)
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

31e6d363

ipv4: Fix fib_trie rebalancing, part 2 · 7b85576d

由 Jarek Poplawski 提交于 6月 18, 2009

My previous patch, which explicitly delays freeing of tnodes by adding
them to the list to flush them after the update is finished, isn't
strict enough. It treats exceptionally tnodes without parent, assuming
they are newly created, so "invisible" for the read side yet.

But the top tnode doesn't have parent as well, so we have to exclude
all exceptions (at least until a better way is found). Additionally we
need to move rcu assignment of this node before flushing, so the
return type of the trie_rebalance() function is changed.
Reported-by: NYan Zheng <zheng.yan@oracle.com>
Signed-off-by: NJarek Poplawski <jarkao2@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

7b85576d

15 6月, 2009 2 次提交

net: annotate inet_timewait_sock bitfields · 9e337b0f

由 Vegard Nossum 提交于 10月 18, 2008

The use of bitfields here would lead to false positive warnings with
kmemcheck. Silence them.

(Additionally, one erroneous comment related to the bitfield was also
fixed.)
Signed-off-by: NVegard Nossum <vegard.nossum@gmail.com>

9e337b0f

ipv4: Fix fib_trie rebalancing · e0f7cb8c

由 Jarek Poplawski 提交于 6月 15, 2009

While doing trie_rebalance(): resize(), inflate(), halve() RCU free
tnodes before updating their parents. It depends on RCU delaying the
real destruction, but if RCU readers start after call_rcu() and before
parent update they could access freed memory.

It is currently prevented with preempt_disable() on the update side,
but it's not safe, except maybe classic RCU, plus it conflicts with
memory allocations with GFP_KERNEL flag used from these functions.

This patch explicitly delays freeing of tnodes by adding them to the
list, which is flushed after the update is finished.
Reported-by: NYan Zheng <zheng.yan@oracle.com>
Signed-off-by: NJarek Poplawski <jarkao2@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e0f7cb8c

14 6月, 2009 3 次提交

PIM-SM: namespace changes · 403dbb97

由 Tom Goff 提交于 6月 14, 2009

IPv4:
  - make PIM register vifs netns local
  - set the netns when a PIM register vif is created
  - make PIM available in all network namespaces (if CONFIG_IP_PIMSM_V2)
    by adding the protocol handler when multicast routing is initialized

IPv6:
  - make PIM register vifs netns local
  - make PIM available in all network namespaces (if CONFIG_IPV6_PIMSM_V2)
    by adding the protocol handler when multicast routing is initialized
Signed-off-by: NTom Goff <thomas.goff@boeing.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

403dbb97

ipv4: update ARPD help text · e61a4b63

由 Timo Teräs 提交于 6月 11, 2009

Removed the statements about ARP cache size as this config option does
not affect it. The cache size is controlled by neigh_table gc thresholds.

Remove also expiremental and obsolete markings as the API originally
intended for arp caching is useful for implementing ARP-like protocols
(e.g. NHRP) in user space and has been there for a long enough time.
Signed-off-by: NTimo Teras <timo.teras@iki.fi>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e61a4b63

net: use a deferred timer in rt_check_expire · 125bb8f5

由 Eric Dumazet 提交于 6月 11, 2009

For the sake of power saver lovers, use a deferrable timer to fire
rt_check_expire()

As some big routers cache equilibrium depends on garbage collection
done in time, we take into account elapsed time between two
rt_check_expire() invocations to adjust the amount of slots we have to
check.

Based on an initial idea and patch from Tero Kristo
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NTero Kristo <tero.kristo@nokia.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

125bb8f5

12 6月, 2009 1 次提交

netfilter: ip_tables: fix build error · 24992eac

由 Patrick McHardy 提交于 6月 12, 2009

Fix build error introduced by commit bb70dfa5 (netfilter: xtables:
consolidate comefrom debug cast access):

net/ipv4/netfilter/ip_tables.c: In function 'ipt_do_table':
net/ipv4/netfilter/ip_tables.c:421: error: 'comefrom' undeclared (first use in this function)
net/ipv4/netfilter/ip_tables.c:421: error: (Each undeclared identifier is reported only once
net/ipv4/netfilter/ip_tables.c:421: error: for each function it appears in.)
Signed-off-by: NPatrick McHardy <kaber@trash.net>

24992eac

11 6月, 2009 1 次提交

net: No more expensive sock_hold()/sock_put() on each tx · 2b85a34e

由 Eric Dumazet 提交于 6月 11, 2009

One of the problem with sock memory accounting is it uses
a pair of sock_hold()/sock_put() for each transmitted packet.

This slows down bidirectional flows because the receive path
also needs to take a refcount on socket and might use a different
cpu than transmit path or transmit completion path. So these
two atomic operations also trigger cache line bounces.

We can see this in tx or tx/rx workloads (media gateways for example),
where sock_wfree() can be in top five functions in profiles.

We use this sock_hold()/sock_put() so that sock freeing
is delayed until all tx packets are completed.

As we also update sk_wmem_alloc, we could offset sk_wmem_alloc
by one unit at init time, until sk_free() is called.
Once sk_free() is called, we atomic_dec_and_test(sk_wmem_alloc)
to decrement initial offset and atomicaly check if any packets
are in flight.

skb_set_owner_w() doesnt call sock_hold() anymore

sock_wfree() doesnt call sock_put() anymore, but check if sk_wmem_alloc
reached 0 to perform the final freeing.

Drawback is that a skb->truesize error could lead to unfreeable sockets, or
even worse, prematurely calling __sk_free() on a live socket.

Nice speedups on SMP. tbench for example, going from 2691 MB/s to 2711 MB/s
on my 8 cpu dev machine, even if tbench was not really hitting sk_refcnt
contention point. 5 % speedup on a UDP transmit workload (depends
on number of flows), lowering TX completion cpu usage.
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2b85a34e

10 6月, 2009 1 次提交
- D
  netfilter: Fix extra semi-colon in skb_walk_frags() changes. · 0808dc80
  由 David S. Miller 提交于 6月 09, 2009
```
Noticed by Jesper Dangaard Brouer
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
```
  0808dc80
09 6月, 2009 2 次提交
- D
  netfilter: Use frag list abstraction interfaces. · 343a9972
  由 David S. Miller 提交于 6月 09, 2009
```
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
```
  343a9972
- D
  ipv4: Use frag list abstraction interfaces. · d7fcf1a5
  由 David S. Miller 提交于 6月 09, 2009
```
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
```
  d7fcf1a5
08 6月, 2009 1 次提交

netfilter: nf_ct_icmp: keep the ICMP ct entries longer · f87fb666

由 Jan Kasprzak 提交于 6月 08, 2009

Current conntrack code kills the ICMP conntrack entry as soon as
the first reply is received. This is incorrect, as we then see only
the first ICMP echo reply out of several possible duplicates as
ESTABLISHED, while the rest will be INVALID. Also this unnecessarily
increases the conntrackd traffic on H-A firewalls.

Make all the ICMP conntrack entries (including the replied ones)
last for the default of nf_conntrack_icmp{,v6}_timeout seconds.
Signed-off-by: NJan "Yenya" Kasprzak <kas@fi.muni.cz>
Signed-off-by: NPatrick McHardy <kaber@trash.net>

f87fb666

05 6月, 2009 1 次提交

netfilter: ipt_MASQUERADE: remove redundant rwlock · 17f2f52b

由 Florian Westphal 提交于 6月 05, 2009

The lock "protects" an assignment and a comparision of an integer.
When the caller of device_cmp() evaluates the result, nat->masq_index
may already have been changed (regardless if the lock is there or not).

So, the lock either has to be held during nf_ct_iterate_cleanup(),
or can be removed.

This does the latter.
Signed-off-by: NFlorian Westphal <fw@strlen.de>
Signed-off-by: NPatrick McHardy <kaber@trash.net>

17f2f52b

04 6月, 2009 2 次提交

E
netfilter: x_tables: added hook number into match extension parameter structure. · a5e78820
由 Evgeniy Polyakov 提交于 6月 04, 2009
```
Signed-off-by: NEvgeniy Polyakov <zbr@ioremap.net>
Signed-off-by: NPatrick McHardy <kaber@trash.net>
```
a5e78820

ipv4: remove ip_mc_drop_socket() declaration from af_inet.c. · 2307f866

由 Rami Rosen 提交于 6月 03, 2009

ip_mc_drop_socket() method is declared in linux/igmp.h, which
is included anyhow in af_inet.c. So there is no need for this declaration.
This patch removes it from af_inet.c.
Signed-off-by: NRami Rosen <ramirose@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2307f866

03 6月, 2009 3 次提交

net: skb->dst accessors · adf30907

由 Eric Dumazet 提交于 6月 02, 2009

Define three accessors to get/set dst attached to a skb

struct dst_entry *skb_dst(const struct sk_buff *skb)

void skb_dst_set(struct sk_buff *skb, struct dst_entry *dst)

void skb_dst_drop(struct sk_buff *skb)
This one should replace occurrences of :
dst_release(skb->dst)
skb->dst = NULL;

Delete skb->dst field
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

adf30907

net: skb->rtable accessor · 511c3f92

由 Eric Dumazet 提交于 6月 02, 2009

Define skb_rtable(const struct sk_buff *skb) accessor to get rtable from skb

Delete skb->rtable field

Setting rtable is not allowed, just set dst instead as rtable is an alias.
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

511c3f92

netfilter: conntrack: simplify event caching system · 17e6e4ea

由 Pablo Neira Ayuso 提交于 6月 02, 2009

This patch simplifies the conntrack event caching system by removing
several events:

 * IPCT_[*]_VOLATILE, IPCT_HELPINFO and IPCT_NATINFO has been deleted
   since the have no clients.
 * IPCT_COUNTER_FILLING which is a leftover of the 32-bits counter
   days.
 * IPCT_REFRESH which is not of any use since we always include the
   timeout in the messages.

After this patch, the existing events are:

 * IPCT_NEW, IPCT_RELATED and IPCT_DESTROY, that are used to identify
 addition and deletion of entries.
 * IPCT_STATUS, that notes that the status bits have changes,
 eg. IPS_SEEN_REPLY and IPS_ASSURED.
 * IPCT_PROTOINFO, that reports that internal protocol information has
 changed, eg. the TCP, DCCP and SCTP protocol state.
 * IPCT_HELPER, that a helper has been assigned or unassigned to this
 entry.
 * IPCT_MARK and IPCT_SECMARK, that reports that the mark has changed, this
 covers the case when a mark is set to zero.
 * IPCT_NATSEQADJ, to report that there's updates in the NAT sequence
 adjustment.
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

17e6e4ea

02 6月, 2009 2 次提交

ipv4: New multicast-all socket option · f771bef9

由 Nivedita Singhvi 提交于 5月 28, 2009

After some discussion offline with Christoph Lameter and David Stevens
regarding multicast behaviour in Linux, I'm submitting a slightly
modified patch from the one Christoph submitted earlier.

This patch provides a new socket option IP_MULTICAST_ALL.

In this case, default behaviour is _unchanged_ from the current
Linux standard. The socket option is set by default to provide
original behaviour. Sockets wishing to receive data only from
multicast groups they join explicitly will need to clear this
socket option.
Signed-off-by: NNivedita Singhvi <niv@us.ibm.com>
Signed-off-by: Christoph Lameter<cl@linux.com>
Acked-by: NDavid Stevens <dlstevens@us.ibm.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f771bef9

net: ipv4/ip_sockglue.c cleanups · 4d52cfbe

由 Eric Dumazet 提交于 6月 02, 2009

Pure cleanups
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4d52cfbe

30 5月, 2009 1 次提交

tcp: fix loop in ofo handling code and reduce its complexity · 2df9001e

由 Ilpo Järvinen 提交于 5月 29, 2009

Somewhat luckily, I was looking into these parts with very fine
comb because I've made somewhat similar changes on the same
area (conflicts that arose weren't that lucky though). The loop
was very much overengineered recently in commit 91521944
(tcp: Use SKB queue and list helpers instead of doing it
by-hand), while it basically just wants to know if there are
skbs after 'skb'.

Also it got broken because skb1 = skb->next got translated into
skb1 = skb1->next (though abstracted) improperly. Note that
'skb1' is pointing to previous sk_buff than skb or NULL if at
head. Two things went wrong:
- We'll kfree 'skb' on the first iteration instead of the
  skbuff following 'skb' (it would require required SACK reneging
  to recover I think).
- The list head case where 'skb1' is NULL is checked too early
  and the loop won't execute whereas it previously did.

Conclusion, mostly revert the recent changes which makes the
cset very messy looking but using proper accessor in the
previous-like version.

The effective changes against the original can be viewed with:
  git-diff 91521944^ \
		net/ipv4/tcp_input.c | sed -n -e '57,70 p'
Signed-off-by: NIlpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2df9001e

29 5月, 2009 3 次提交

net: unset IFF_XMIT_DST_RELEASE in ipgre_tunnel_setup() · 108bfa89

由 Eric Dumazet 提交于 5月 28, 2009

ipgre_tunnel_xmit() might need skb->dst, so tell dev_hard_start_xmit()
to no release it.
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

108bfa89

net: unset IFF_XMIT_DST_RELEASE in ipip_tunnel_setup() · 28e72216

由 Eric Dumazet 提交于 5月 28, 2009

ipip_tunnel_xmit() might need skb->dst, so tell dev_hard_start_xmit()
to no release it.
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

28e72216

D
tcp: Use SKB queue and list helpers instead of doing it by-hand. · 91521944
由 David S. Miller 提交于 5月 28, 2009
```
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
```
91521944

27 5月, 2009 1 次提交

tcp: Do not check flush when comparing options for GRO · a2a804cd

由 Herbert Xu 提交于 5月 26, 2009

There is no need to repeatedly check flush when comparing TCP
options for GRO as it will be false 99% of the time where it
matters.
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a2a804cd

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功