提交 · dd7ba3b8b15f9c65366986d723ae83254d8d78b7 · openeuler / raspberrypi-kernel

25 3月, 2006 1 次提交

[IPV4]: Aggregate route entries with different TOS values · cef2685e

由 Ilia Sotnikov 提交于 3月 25, 2006

When we get an ICMP need-to-frag message, the original TOS value in the
ICMP payload cannot be used as a key to look up the routes to update.
This is because the TOS field may have been modified by routers on the
way.  Similarly, ip_rt_redirect should also ignore the TOS as the router
that gave us the message may have modified the TOS value.

The patch achieves this objective by aggregating entries with different
TOS values (but are otherwise identical) into the same bucket.  This
makes it easy to update them at the same time when an ICMP message is
received.

In future we should use a twin-hashing scheme where teh aggregation
occurs at the entry level.  That is, the TOS goes back into the hash
for normal lookups while ICMP lookups will end up with a node that
gives us a list that contains all other route entries that differ
only by TOS.
Signed-off-by: NIlia Sotnikov <hostcc@gmail.com>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

cef2685e

24 2月, 2006 1 次提交

[IPV4]: Fix garbage collection of multipath route entries · 85259878

由 Suresh Bhogavilli 提交于 2月 21, 2006

When garbage collecting route cache entries of multipath routes
in rt_garbage_collect(), entries were deleted from the hash bucket
'i' while holding a spin lock on bucket 'k' resulting in a system
hang.  Delete entries, if any, from bucket 'k' instead.
Signed-off-by: NSuresh Bhogavilli <sbhogavilli@verisign.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

85259878

18 1月, 2006 1 次提交

[IPV4]: RT_CACHE_STAT_INC() warning fix · dbd2915c

由 Andrew Morton 提交于 1月 17, 2006

BUG: using smp_processor_id() in preemptible [00000001] code: rpc.statd/2408

And it _is_ a bug, but I guess we don't care enough to add preempt_disable().
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

dbd2915c

17 1月, 2006 1 次提交

[IPV4]: rt_cache_stat can be statically defined · 2f970d83

由 Eric Dumazet 提交于 1月 17, 2006

Using __get_cpu_var(obj) is slightly faster than per_cpu_ptr(obj, 
raw_smp_processor_id()).

1) Smaller code and memory use
For static and small objects, DEFINE_PER_CPU(type, object) is preferred over a 
alloc_percpu() : Better and smaller code to access them, and no extra memory 
(storing the pointer, and the percpu array of pointers)

x86_64 code before patch

mov    1237577(%rip),%rax        # ffffffff803e5990 <rt_cache_stat>
not    %rax  # part of per_cpu machinery
mov    %gs:0x3c,%edx # get cpu number
movslq %edx,%rdx # extend 32 bits cpu number to 64 bits
mov    (%rax,%rdx,8),%rax # get the pointer for this cpu
incl   0x38(%rax)

x86_64 code after patch

mov    $per_cpu__rt_cache_stat,%rdx
mov    %gs:0x48,%rax # get percpu data offset
incl   0x38(%rax,%rdx,1)

2) False sharing avoidance for SMP :
For a small NR_CPUS, the array of per cpu pointers allocated in alloc_percpu() 
can be <= 32 bytes. This let slab code gives a part of a cache line. If the 
other part of this 64 bytes (or 128 bytes) cache line is used by a mostly 
written object, we can have false sharing and expensive per_cpu_ptr() operations.

Size of rt_cache_stat is 64 bytes, so this patch is not a danger of a too big 
increase of bss (in UP mode) or static per_cpu data for SMP 
(PERCPU_ENOUGH_ROOM is currently 32768 bytes)
Signed-off-by: NEric Dumazet <dada1@cosmosbay.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2f970d83

30 11月, 2005 2 次提交

[NET]: Add const markers to various variables. · 9b5b5cff

由 Arjan van de Ven 提交于 11月 29, 2005

the patch below marks various variables const in net/; the goal is to
move them to the .rodata section so that they can't false-share
cachelines with things that get written to, as well as potentially
helping gcc a bit with optimisations.  (these were found using a gcc
patch to warn about such variables)
Signed-off-by: NArjan van de Ven <arjan@infradead.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9b5b5cff

[IPV4] tcp/route: Another look at hash table sizes · 18955cfc

由 Mike Stroyan 提交于 11月 29, 2005

  The tcp_ehash hash table gets too big on systems with really big memory.
It is worse on systems with pages larger than 4KB.  It wastes memory that
could be better used.  It also makes the netstat command slow because reading
/proc/net/tcp and /proc/net/tcp6 needs to go through the full hash table.

  The default value should not be larger for larger page sizes.  It seems
that the effect of page size is an unintended error dating back a long
time.  I also wonder if the default value really should be a larger
fraction of memory for systems with more memory.  While systems with
really big ram can afford more space for hash tables, it is not clear to
me that they benefit from increasing the allocation ratio for this table.

  The amount of memory allocated is determined by net/ipv4/tcp.c:tcp_init and
mm/page_alloc.c:alloc_large_system_hash.

tcp_init calls alloc_large_system_hash passing parameters-
    bucketsize=sizeof(struct tcp_ehash_bucket)
    numentries=thash_entries
    scale=(num_physpages >= 128 * 1024) ? (25-PAGE_SHIFT) : (27-PAGE_SHIFT)
    limit=0

On i386, PAGE_SHIFT is 12 for a page size of 4K
On ia64, PAGE_SHIFT defaults to 14 for a page size of 16K

The num_physpages test above makes the allocation take a larger fraction
of the total memory on systems with larger memory.  The threshold size
for a i386 system is 512MB.  For an ia64 system with 16KB pages the
threshold is 2GB.

For smaller memory systems-
On i386, scale = (27 - 12) = 15
On ia64, scale = (27 - 14) = 13
For larger memory systems-
On i386, scale = (25 - 12) = 13
On ia64, scale = (25 - 14) = 11

  For the rest of this discussion, I'll just track the larger memory case.

  The default behavior has numentries=thash_entries=0, so the allocated
size is determined by either scale or by the default limit of 1/16 of
total memory.

In alloc_large_system_hash-
|	numentries = (flags & HASH_HIGHMEM) ? nr_all_pages : nr_kernel_pages;
|	numentries += (1UL << (20 - PAGE_SHIFT)) - 1;
|	numentries >>= 20 - PAGE_SHIFT;
|	numentries <<= 20 - PAGE_SHIFT;

  At this point, numentries is pages for all of memory, rounded up to the
nearest megabyte boundary.

|	/* limit to 1 bucket per 2^scale bytes of low memory */
|	if (scale > PAGE_SHIFT)
|		numentries >>= (scale - PAGE_SHIFT);
|	else
|		numentries <<= (PAGE_SHIFT - scale);

On i386, numentries >>= (13 - 12), so numentries is 1/8196 of
bytes of total memory.
On ia64, numentries <<= (14 - 11), so numentries is 1/2048 of
bytes of total memory.

|        log2qty = long_log2(numentries);
|
|        do {
|                size = bucketsize << log2qty;

bucketsize is 16, so size is 16 times numentries, rounded
down to a power of two.

On i386, size is 1/512 of bytes of total memory.
On ia64, size is 1/128 of bytes of total memory.

For smaller systems the results are
On i386, size is 1/2048 of bytes of total memory.
On ia64, size is 1/512 of bytes of total memory.

  The large page effect can be removed by just replacing
the use of PAGE_SHIFT with a constant of 12 in the calls to
alloc_large_system_hash.  That makes them more like the other uses of
that function from fs/inode.c and fs/dcache.c
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

18955cfc

04 10月, 2005 1 次提交

[IPV4]: Replace __in_dev_get with __in_dev_get_rcu/rtnl · e5ed6399

由 Herbert Xu 提交于 10月 03, 2005

The following patch renames __in_dev_get() to __in_dev_get_rtnl() and
introduces __in_dev_get_rcu() to cover the second case.

1) RCU with refcnt should use in_dev_get().
2) RCU without refcnt should use __in_dev_get_rcu().
3) All others must hold RTNL and use __in_dev_get_rtnl().

There is one exception in net/ipv4/route.c which is in fact a pre-existing
race condition.  I've marked it as such so that we remember to fix it.

This patch is based on suggestions and prior work by Suzanne Wood and
Paul McKenney.
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e5ed6399

09 9月, 2005 1 次提交

[IPV4]: Fix refcount damaging in net/ipv4/route.c · ce723d8e

由 Julian Anastasov 提交于 9月 08, 2005

	One such place that can damage the dst refcnts is route.c with
CONFIG_IP_ROUTE_MULTIPATH_CACHED enabled, i don't see the user's
.config. In this new code i see that rt_intern_hash is called before
dst->refcnt is set to 1, dst is the 2nd arg to rt_intern_hash.

Arg 2 of rt_intern_hash must come with refcnt 1 as it is added to
table or dropped depending on error/add/update. One such example is
ip_mkroute_input where __mkroute_input return rth with refcnt 0 which
is provided to rt_intern_hash. ip_mkroute_output looks like a 2nd such
place. Appending untested patch for comments and review.  The idea is
to put previous reference as we are going to return next result/error.
Signed-off-by: NJulian Anastasov <ja@ssi.bg>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ce723d8e

30 8月, 2005 2 次提交

[NET]: Export symbols needed by the current DCCP code · d8c97a94

由 Arnaldo Carvalho de Melo 提交于 8月 09, 2005

Signed-off-by: NArnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d8c97a94

[IPV4]: possible cleanups · 0742fd53

由 Adrian Bunk 提交于 8月 09, 2005

This patch contains the following possible cleanups:
- make needlessly global code static
- #if 0 the following unused global function:
  - xfrm4_state.c: xfrm4_state_fini
- remove the following unneeded EXPORT_SYMBOL's:
  - ip_output.c: ip_finish_output
  - ip_output.c: sysctl_ip_default_ttl
  - fib_frontend.c: ip_dev_find
  - inetpeer.c: inet_peer_idlock
  - ip_options.c: ip_options_compile
  - ip_options.c: ip_options_undo
  - net/core/request_sock.c: sysctl_max_syn_backlog
Signed-off-by: NAdrian Bunk <bunk@stusta.de>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0742fd53

12 7月, 2005 1 次提交

[IPV4]: Prevent oops when printing martian source · 0b7f22aa

由 Olaf Kirch 提交于 7月 11, 2005

In some cases, we may be generating packets with a source address that
qualifies as martian. This can happen when we're in the middle of setting
up the network, and netfilter decides to reject a packet with an RST.
The IPv4 routing code would try to print a warning and oops, because
locally generated packets do not have a valid skb->mac.raw pointer
at this point.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0b7f22aa

06 7月, 2005 3 次提交

[IPV4]: Bug fix in rt_check_expire() · bb1d23b0

由 Eric Dumazet 提交于 7月 05, 2005

- rt_check_expire() fixes (an overflow occured if size of the hash
  was >= 65536)

reminder of the bugfix:

The rt_check_expire() has a serious problem on machines with large
route caches, and a standard HZ value of 1000.

With default values, ie ip_rt_gc_interval = 60*HZ = 60000 ;

the loop count :

     for (t = ip_rt_gc_interval << rt_hash_log; t >= 0;


overflows (t is a 31 bit value) as soon rt_hash_log is >= 16  (65536
slots in route cache hash table).

In this case, rt_check_expire() does nothing at all
Signed-off-by: NEric Dumazet <dada1@cosmosbay.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

bb1d23b0

[IPV4]: Use the fancy alloc_large_system_hash() function for route hash table · 424c4b70

由 Eric Dumazet 提交于 7月 05, 2005

- rt hash table allocated using alloc_large_system_hash() function.
Signed-off-by: NEric Dumazet <dada1@cosmosbay.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

424c4b70

[NET]: Hashed spinlocks in net/ipv4/route.c · 22c047cc

由 Eric Dumazet 提交于 7月 05, 2005

- Locking abstraction
- Spinlocks moved out of rt hash table : Less memory (50%) used by rt 
  hash table. it's a win even on UP.
- Sizing of spinlocks table depends on NR_CPUS
Signed-off-by: NEric Dumazet <dada1@cosmosbay.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

22c047cc

29 6月, 2005 1 次提交

[IPV4]: Snmpv2 Mib IP counter ipInAddrErrors support · 2c2910a4

由 Dietmar Eggemann 提交于 6月 28, 2005

I followed Thomas' proposal to see every martian destination as a case
where the ipInAddrErrors counter has to be incremented. There are
two advantages by doing so: (1) The relation between the ipInReceive
counter and all the other ipInXXX counters is more accurate in the
case the RTN_UNICAST code check fails and (2) it makes the code in
ip_route_input_slow easier.
Signed-off-by: NDietmar Eggemann <dietmar.eggemann@gmx.de>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2c2910a4

23 6月, 2005 1 次提交

[IPV4]: Fix route.c gcc4 warnings · 7abaa27c

由 Chuck Short 提交于 6月 22, 2005

Signed-off by: Chuck Short <zulcss@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

7abaa27c

19 6月, 2005 1 次提交

[NETLINK]: Correctly set NLM_F_MULTI without checking the pid · b6544c0b

由 Jamal Hadi Salim 提交于 6月 18, 2005

This patch rectifies some rtnetlink message builders that derive the
flags from the pid. It is now explicit like the other cases
which get it right. Also fixes half a dozen dumpers which did not
set NLM_F_MULTI at all.
Signed-off-by: NJamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: NThomas Graf <tgraf@suug.ch>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b6544c0b

06 5月, 2005 1 次提交

[PATCH] update Ross Biro bouncing email address · 02c30a84

由 Jesper Juhl 提交于 5月 05, 2005

Ross moved.  Remove the bad email address so people will find the correct
one in ./CREDITS.
Signed-off-by: NJesper Juhl <juhl-lkml@dif.dk>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

02c30a84

29 4月, 2005 2 次提交

[NET]: /proc/net/stat/* header cleanup · 5bec0039

由 Olaf Rempel 提交于 4月 28, 2005

Signed-off-by: NOlaf Rempel <razzor@kopf-tisch.de>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5bec0039

[IPV4]: Incorrect permissions on route flush sysctl · 7e3e0360

由 Dave Jones 提交于 4月 28, 2005

This has been brought up before.. http://lkml.org/lkml/2000/1/21/116
but didnt seem to get resolved.  This morning I got someone
file a bugzilla about it breaking sysctl(8).
Signed-off-by: NDave Jones <davej@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

7e3e0360

20 4月, 2005 1 次提交

[NET]: skbuff: remove old NET_CALLER macro · 9c2b3328

由 Stephen Hemminger 提交于 4月 19, 2005

Here is a revised alternative that uses BUG_ON/WARN_ON
(as suggested by Herbert Xu) to eliminate NET_CALLER.
Signed-off-by: NStephen Hemminger <shemminger@osdl.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9c2b3328

17 4月, 2005 1 次提交

Linux-2.6.12-rc2 · 1da177e4

由 Linus Torvalds 提交于 4月 16, 2005

Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.

Let it rip!

1da177e4