Commits · 78fb2de711ec28997bf38bcf3e48e108e907be77 · openanolis / cloud-kernel

11 Sep, 2012 1 commit

netlink: Rename pid to portid to avoid confusion · 15e47304

Authored by Eric W. Biederman on Sep 07, 2012

It is a frequent mistake to confuse the netlink port identifier with a
process identifier.  Try to reduce this confusion by renaming fields
that hold port identifiers portid instead of pid.

I have carefully avoided changing the structures exported to
userspace to avoid changing the userspace API.

I have successfully built an allyesconfig kernel with this change.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

15e47304

10 Aug, 2012 1 commit

net: Loopback ifindex is constant now · 1fb9489b

Authored by Pavel Emelyanov on Aug 08, 2012

As pointed out, there are places that access net->loopback_dev->ifindex,
and after ifindex generation is made per-net this value becomes a constant
equal to 1. So go ahead and introduce the LOOPBACK_IFINDEX constant and use
it where appropriate.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1fb9489b

01 Aug, 2012 1 commit

ipv4: Restore old dst_free() behavior. · 54764bb6

Authored by Eric Dumazet on Jul 31, 2012

commit 404e0a8b (net: ipv4: fix RCU races on dst refcounts) tried
to solve a race but added a problem at device/fib dismantle time :

We really want to call dst_free() as soon as possible, even if sockets
still have dst in their cache.
dst_release() calls in free_fib_info_rcu() are not welcomed.

Root of the problem was that now we also cache output routes (in
nh_rth_output), we must use call_rcu() instead of call_rcu_bh() in
rt_free(), because output route lookups are done in process context.

Based on feedback and initial patch from David Miller (adding another
call_rcu_bh() call in fib, but it appears it was not the right fix)

I left the inet_sk_rx_dst_set() helper and added __rcu attributes
to nh_rth_output and nh_rth_input to better document what is going on in
this code.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

54764bb6

31 Jul, 2012 1 commit

net: ipv4: fix RCU races on dst refcounts · 404e0a8b

Authored by Eric Dumazet on Jul 29, 2012

commit c6cffba4 (ipv4: Fix input route performance regression.)
added various fatal races with dst refcounts.

crashes happen on tcp workloads if routes are added/deleted at the same
time.

The dst_free() calls from free_fib_info_rcu() are clearly racy.

We need instead regular dst refcounting (dst_release()) and make
sure dst_release() is aware of RCU grace periods :

Add DST_RCU_FREE flag so that dst_release() respects an RCU grace period
before dst destruction for cached dst

Introduce a new inet_sk_rx_dst_set() helper, using atomic_inc_not_zero()
to make sure we don't increase a zero refcount (on a dst currently
waiting for an RCU grace period before destruction)

rt_cache_route() must take a reference on the new cached route, and
release it if it was not able to install it.

With this patch, my machines survive various benchmarks.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

404e0a8b
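
The "increment unless zero" idea mentioned in commit 404e0a8b can be sketched in user-space C11. This is only an illustration under assumptions: `struct demo_dst`, `demo_ref_get_not_zero()` and `demo_ref_put()` are hypothetical names, and the kernel's own atomic_inc_not_zero() and dst refcounting are implemented differently.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a cached route entry; not struct dst_entry. */
struct demo_dst {
	atomic_int refcnt;
};

/*
 * Take a reference only while the count is still non-zero.
 * Returns false when the count already reached zero, i.e. the object
 * is waiting to be destroyed and must not be revived.
 */
static bool demo_ref_get_not_zero(struct demo_dst *dst)
{
	int old = atomic_load(&dst->refcnt);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&dst->refcnt, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true if the caller dropped the last one. */
static bool demo_ref_put(struct demo_dst *dst)
{
	return atomic_fetch_sub(&dst->refcnt, 1) == 1;
}
```

A caller that gets `false` back has to treat the cached entry as gone and fall back to a fresh lookup, which is the situation the commit message describes for a dst waiting out a grace period.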

24 Jul, 2012 1 commit

decnet: Don't set RTCF_DIRECTSRC. · 8acfaa94

Authored by David S. Miller on Jul 23, 2012

It's an ipv4 defined route flag, and only ipv4 uses it.
Signed-off-by: David S. Miller <davem@davemloft.net>

8acfaa94

21 Jul, 2012 1 commit

net: Document dst->obsolete better. · f5b0a874

Authored by David S. Miller on Jul 19, 2012

Add a big comment explaining how the field works, and use defines
instead of magic constants for the values assigned to it.

Suggested by Joe Perches.
Signed-off-by: David S. Miller <davem@davemloft.net>

f5b0a874

17 Jul, 2012 1 commit

net: Pass optional SKB and SK arguments to dst_ops->{update_pmtu,redirect}() · 6700c270

Authored by David S. Miller on Jul 17, 2012

This will be used so that we can compose a full flow key.

Even though we have a route in this context, we need more. In the
future the routes will be without destination address, source address,
etc. keying. One ipv4 route will cover entire subnets, etc.

In this environment we have to have a way to possess persistent storage
for redirects and PMTU information. This persistent storage will exist
in the FIB tables, and that's why we'll need to be able to rebuild a
full lookup flow key here. Using that flow key will do a fib_lookup()
and create/update the persistent entry.
Signed-off-by: David S. Miller <davem@davemloft.net>

6700c270

12 Jul, 2012 1 commit

net: Add dummy dst_ops->redirect method where needed. · b587ee3b

Authored by David S. Miller on Jul 12, 2012

Signed-off-by: David S. Miller <davem@davemloft.net>

b587ee3b

11 Jul, 2012 2 commits

rtnetlink: Remove ts/tsage args to rtnl_put_cacheinfo(). · 87a50699

Authored by David S. Miller on Jul 10, 2012

Nobody provides non-zero values any longer.
Signed-off-by: David S. Miller <davem@davemloft.net>

87a50699

net: Don't report route RTT metric value in cache dumps. · 794785bf

Authored by David S. Miller on Jul 10, 2012

We don't maintain it dynamically any longer, so reporting it would
be extremely misleading.  Report zero instead.
Signed-off-by: David S. Miller <davem@davemloft.net>

794785bf

05 Jul, 2012 2 commits

decnet: Use neighbours privately in dn_route struct. · fccd7d5c

Authored by David S. Miller on Jul 02, 2012

This allows an easy conversion away from dst_get_neighbour*().
Signed-off-by: David S. Miller <davem@davemloft.net>

fccd7d5c

net: Add optional SKB arg to dst_ops->neigh_lookup(). · f894cbf8

Authored by David S. Miller on Jul 02, 2012

Causes the handler to use the daddr in the ipv4/ipv6 header when
the route gateway is unspecified (local subnet).
Signed-off-by: David S. Miller <davem@davemloft.net>

f894cbf8

28 Jun, 2012 1 commit

decnet: Do not use RTA_PUT() macros · 6b60978f

Authored by Thomas Graf on Jun 26, 2012

Also, no need to trim on nlmsg_put() failure, nothing has been added
yet.  We also want to use nlmsg_end(), nlmsg_new() and nlmsg_free().
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

6b60978f

27 Jun, 2012 1 commit

decnet: dn_route: Move away from NLMSG_NEW(). · 737100e1

Authored by David S. Miller on Jun 26, 2012

And use nlmsg_data() while we're here too.
Signed-off-by: David S. Miller <davem@davemloft.net>

737100e1

16 May, 2012 1 commit

net: Convert net_ratelimit uses to net_<level>_ratelimited · e87cc472

Authored by Joe Perches on May 13, 2012

Standardize the net core ratelimited logging functions.

Coalesce formats, align arguments.
Change a printk then vprintk sequence to use printf extension %pV.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e87cc472

16 Apr, 2012 1 commit

net: cleanup unsigned to unsigned int · 95c96174

Authored by Eric Dumazet on Apr 15, 2012

Use of "unsigned int" is preferred to bare "unsigned" in net tree.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

95c96174

06 Feb, 2012 1 commit

decnet: remove unused variable from dn_output() · 22b6a2eb

Authored by Jesper Juhl on Feb 05, 2012

The variable 'neigh' is assigned to, but otherwise completely
unused. So let's remove it.
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

22b6a2eb

06 Dec, 2011 1 commit

net: Rename dst_get_neighbour{, _raw} to dst_get_neighbour_noref{, _raw}. · 27217455

Authored by David Miller on Dec 02, 2011

To reflect the fact that a reference is not obtained to the
resulting neighbour entry.
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Roland Dreier <roland@purestorage.com>

27217455

27 Nov, 2011 2 commits

net: Move mtu handling down to the protocol depended handlers · 618f9bc7

Authored by Steffen Klassert on Nov 23, 2011

We move all mtu handling from dst_mtu() down to the protocol
layer. So each protocol can implement the mtu handling in
a different manner.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

618f9bc7

net: Rename the dst_opt default_mtu method to mtu · ebb762f2

Authored by Steffen Klassert on Nov 23, 2011

We plan to invoke the dst_opt->default_mtu() method unconditionally
from dst_mtu(). So rename the method to dst_opt->mtu() to match
the name with the new meaning.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ebb762f2

01 Nov, 2011 1 commit

net: Add export.h for EXPORT_SYMBOL/THIS_MODULE to non-modules · bc3b2d7f

Authored by Paul Gortmaker on Jul 15, 2011

These files are non modular, but need to export symbols using
the macros now living in export.h -- call out the include so
that things won't break when we remove the implicit presence
of module.h from everywhere.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

bc3b2d7f

18 Jul, 2011 3 commits

net: Add ->neigh_lookup() operation to dst_ops · d3aaeb38

Authored by David S. Miller on Jul 18, 2011

In the future dst entries will be neigh-less.  In that environment we
need to have an easy transition point for current users of
dst->neighbour outside of the packet output fast path.
Signed-off-by: David S. Miller <davem@davemloft.net>

d3aaeb38

net: Abstract dst->neighbour accesses behind helpers. · 69cce1d1

Authored by David S. Miller on Jul 17, 2011

dst_{get,set}_neighbour()
Signed-off-by: David S. Miller <davem@davemloft.net>

69cce1d1

neigh: Pass neighbour entry to output ops. · 8f40b161

Authored by David S. Miller on Jul 17, 2011

This will get us closer to being able to do "neigh stuff"
completely independent of the underlying dst_entry for
protocols (ipv4/ipv6) that wish to do so.

We will also be able to make dst entries neigh-less.
Signed-off-by: David S. Miller <davem@davemloft.net>

8f40b161

02 Jul, 2011 1 commit

decnet: Reduce switch/case indent · 06f8fe11

Authored by Joe Perches on Jul 01, 2011

Make the case labels the same indent as the switch.

git diff -w shows differences for line wrapping.
(fit multiple lines to 80 columns, join where possible)
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

06f8fe11

10 Jun, 2011 1 commit

rtnetlink: Compute and store minimum ifinfo dump size · c7ac8679

Authored by Greg Rose on Jun 10, 2011

The message size allocated for rtnl ifinfo dumps was limited to
a single page.  This is not enough for additional interface info
available with devices that support SR-IOV and caused a bug in
which VF info would not be displayed if more than approximately
40 VFs were created per interface.

Implement a new function pointer for the rtnl_register service that will
calculate the amount of data required for the ifinfo dump and allocate
enough data to satisfy the request.
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

c7ac8679

29 Apr, 2011 2 commits

net: Use non-zero allocations in dst_alloc(). · cf911662

Authored by David S. Miller on Apr 28, 2011

Make dst_alloc() and its users explicitly initialize the entire
entry.

The zero'ing done by kmem_cache_zalloc() was almost entirely
redundant.
Signed-off-by: David S. Miller <davem@davemloft.net>

cf911662

net: Make dst_alloc() take more explicit initializations. · 5c1e6aa3

Authored by David S. Miller on Apr 28, 2011

Now the dst->dev, dev->obsolete, and dst->flags values can
be specified as well.
Signed-off-by: David S. Miller <davem@davemloft.net>

5c1e6aa3

13 Mar, 2011 2 commits

decnet: Convert to use flowidn where applicable. · bef55aeb

Authored by David S. Miller on Mar 12, 2011

Signed-off-by: David S. Miller <davem@davemloft.net>

bef55aeb

net: Put flowi_* prefix on AF independent members of struct flowi · 1d28f42c

Authored by David S. Miller on Mar 12, 2011

I intend to turn struct flowi into a union of AF specific flowi
structs.  There will be a common structure that each variant includes
first, much like struct sock_common.

This is the first step to move in that direction.
Signed-off-by: David S. Miller <davem@davemloft.net>

1d28f42c
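
The layout that commit 1d28f42c is heading toward, a union of address-family specific flow structs that each begin with a shared common part, might look roughly like the sketch below. Every name here (`demo_flowi_common`, `demo_flowi4`, `demo_flowidn`, `demo_flowi`, the `flowic_*` members) is made up for illustration and does not claim to match the kernel's definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical AF-independent part shared by every flow variant. */
struct demo_flowi_common {
	int     flowic_oif;    /* output interface index */
	int     flowic_iif;    /* input interface index */
	uint8_t flowic_proto;  /* transport protocol number */
};

/* Hypothetical IPv4 variant: common part first, then AF-specific keys. */
struct demo_flowi4 {
	struct demo_flowi_common c;
	uint32_t daddr;
	uint32_t saddr;
};

/* Hypothetical DECnet variant with 16-bit node addresses. */
struct demo_flowidn {
	struct demo_flowi_common c;
	uint16_t daddr;
	uint16_t saddr;
};

union demo_flowi {
	struct demo_flowi_common c;
	struct demo_flowi4       ip4;
	struct demo_flowidn      dn;
};

/* Generic code can read the common part without knowing the family. */
static int demo_flow_oif(const union demo_flowi *fl)
{
	return fl->c.flowic_oif;
}
```

Because every variant starts with the same struct, C allows the common members to be inspected through any union member, which is what lets AF-independent code stay unaware of the concrete family.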

03 Mar, 2011 1 commit

xfrm: Return dst directly from xfrm_lookup() · 452edd59

Authored by David S. Miller on Mar 02, 2011

Instead of on the stack.
Signed-off-by: David S. Miller <davem@davemloft.net>

452edd59

02 Mar, 2011 1 commit

xfrm: Kill XFRM_LOOKUP_WAIT flag. · 80c0bc9e

Authored by David S. Miller on Mar 01, 2011

This can be determined from the flow flags instead.
Signed-off-by: David S. Miller <davem@davemloft.net>

80c0bc9e

18 Feb, 2011 1 commit

net: Add initial_ref arg to dst_alloc(). · 3c7bd1a1

Authored by David S. Miller on Feb 16, 2011

This allows avoiding multiple writes to the initial __refcnt.

The simplest cases of wanting an initial reference of "1"
in ipv4 and ipv6 have been converted, the rest have been left
alone and kept at the existing "0".
Signed-off-by: David S. Miller <davem@davemloft.net>

3c7bd1a1

27 Jan, 2011 1 commit

net: Implement read-only protection and COW'ing of metrics. · 62fa8a84

Authored by David S. Miller on Jan 26, 2011

Routing metrics are now copy-on-write.

Initially a route entry points its metrics at a read-only location.
If a routing table entry exists, it will point there.  Else it will
point at the all zero metric place-holder called 'dst_default_metrics'.

The writeability state of the metrics is stored in the low bits of the
metrics pointer, we have two bits left to spare if we want to store
more states.

For the initial implementation, COW is implemented simply via kmalloc.
However future enhancements will change this to place the writable
metrics somewhere else, in order to increase sharing.  Very likely
this "somewhere else" will be the inetpeer cache.

Note also that this means that metrics updates may transiently fail
if we cannot COW the metrics successfully.

But even by itself, this patch should decrease memory usage and
increase cache locality especially for routing workloads.  In those
cases the read-only metric copies stay in place and never get written
to.

TCP workloads where metrics get updated, and those rare cases where
PMTU triggers occur, will take a very slight performance hit.  But
that hit will be alleviated when the long-term writable metrics
move to a more sharable location.

Since the metrics storage went from a u32 array of RTAX_MAX entries to
what is essentially a pointer, some retooling of the dst_entry layout
was necessary.

Most importantly, we need to preserve the alignment of the reference
count so that it doesn't share cache lines with the read-mostly state,
as per Eric Dumazet's alignment assertion checks.

The only non-trivial bit here is the move of the 'flags' member into
the writeable cacheline.  This is OK since we are always accessing the
flags around the same moment when we made a modification to the
reference count.
Signed-off-by: David S. Miller <davem@davemloft.net>

62fa8a84
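
The scheme described in commit 62fa8a84, a metrics pointer whose low bits carry the writeability state with a kmalloc-style copy on first write, can be approximated in user-space C as below. This is a sketch under assumptions, not the kernel code: `struct demo_dst`, the `DEMO_*` constants and all `demo_*` helpers are hypothetical, `malloc()` stands in for kmalloc, and no locking or atomic pointer swap is attempted.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_METRICS_MAX 4                 /* hypothetical; stands in for RTAX_MAX */
#define DEMO_READ_ONLY   ((uintptr_t)0x1)  /* low bit: metrics are shared, read-only */
#define DEMO_FLAG_MASK   ((uintptr_t)0x3)  /* uint32_t alignment leaves two low bits */

struct demo_dst {
	uintptr_t metrics;  /* pointer to uint32_t[DEMO_METRICS_MAX] plus flag bits */
};

/* All-zero placeholder used when no table entry supplies metrics. */
static const uint32_t demo_default_metrics[DEMO_METRICS_MAX];

static const uint32_t *demo_metrics_ptr(const struct demo_dst *dst)
{
	return (const uint32_t *)(dst->metrics & ~DEMO_FLAG_MASK);
}

/* Point the entry at shared read-only metrics (or the zero placeholder). */
static void demo_dst_init(struct demo_dst *dst, const uint32_t *shared)
{
	if (!shared)
		shared = demo_default_metrics;
	dst->metrics = (uintptr_t)shared | DEMO_READ_ONLY;
}

static uint32_t demo_metric_get(const struct demo_dst *dst, int idx)
{
	return demo_metrics_ptr(dst)[idx];
}

/* Write a metric, copying the shared array first if it is read-only. */
static int demo_metric_set(struct demo_dst *dst, int idx, uint32_t val)
{
	uint32_t *p;

	if (dst->metrics & DEMO_READ_ONLY) {
		p = malloc(sizeof(uint32_t) * DEMO_METRICS_MAX);
		if (!p)
			return -1;  /* the update fails, as the commit warns it may */
		memcpy(p, demo_metrics_ptr(dst), sizeof(uint32_t) * DEMO_METRICS_MAX);
		dst->metrics = (uintptr_t)p;  /* flag bits cleared: now writable */
	} else {
		p = (uint32_t *)(dst->metrics & ~DEMO_FLAG_MASK);
	}
	p[idx] = val;
	return 0;
}

/* Free the private copy, if one was ever made. */
static void demo_dst_release(struct demo_dst *dst)
{
	if (!(dst->metrics & DEMO_READ_ONLY))
		free((void *)(dst->metrics & ~DEMO_FLAG_MASK));
}
```

Entries that are never written keep pointing at the shared array, which is where the memory and cache-locality benefit described in the commit message comes from; only a writer pays for the copy.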

15 Dec, 2010 1 commit

net: Abstract default MTU metric calculation behind an accessor. · d33e4553

Authored by David S. Miller on Dec 14, 2010

Like RTAX_ADVMSS, make the default calculation go through a dst_ops
method rather than caching the computation in the routing cache
entries.

Now dst metrics are pretty much left as-is when new entries are
created, thus optimizing metric sharing becomes a real possibility.
Signed-off-by: David S. Miller <davem@davemloft.net>

d33e4553

14 Dec, 2010 1 commit

net: Abstract default ADVMSS behind an accessor. · 0dbaee3b

Authored by David S. Miller on Dec 13, 2010

Make all RTAX_ADVMSS metric accesses go through a new helper function,
dst_metric_advmss().

Leave the actual default metric as "zero" in the real metric slot,
and compute the actual default value dynamically via a new dst_ops
AF specific callback.

For stacked IPSEC routes, we use the advmss of the path which
preserves existing behavior.

Unlike ipv4/ipv6, DecNET ties the advmss to the mtu and thus updates
advmss on pmtu updates.  This inconsistency in advmss handling
results in more raw metric accesses than I wish we ended up with.
Signed-off-by: David S. Miller <davem@davemloft.net>

0dbaee3b
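
The accessor pattern from commit 0dbaee3b, where a stored value of zero means "no explicit metric" and the default comes from a per-family callback, can be sketched as follows. The struct layout, the `demo_*` names and the "MTU minus 40 bytes" default are assumptions made for this example; only the function name dst_metric_advmss() comes from the commit message, and its real implementation is not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

struct demo_dst;

/* Hypothetical per-address-family operations table. */
struct demo_dst_ops {
	uint32_t (*default_advmss)(const struct demo_dst *dst);
};

struct demo_dst {
	const struct demo_dst_ops *ops;
	uint32_t mtu;
	uint32_t advmss;  /* 0 means "not set, compute the default" */
};

/* Assumed IPv4-style default: MTU minus 40 bytes of IP and TCP headers. */
static uint32_t demo_ipv4_default_advmss(const struct demo_dst *dst)
{
	return dst->mtu > 40 ? dst->mtu - 40 : 0;
}

/* Single access point: an explicit metric wins, otherwise ask the family. */
static uint32_t demo_dst_metric_advmss(const struct demo_dst *dst)
{
	if (dst->advmss)
		return dst->advmss;
	return dst->ops->default_advmss(dst);
}
```

Keeping the slot at zero until someone sets it explicitly is what lets many entries share one unmodified metrics array.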

10 Dec, 2010 1 commit

net: Abstract away all dst_entry metrics accesses. · defb3519

Authored by David S. Miller on Dec 08, 2010

Use helper functions to hide all direct accesses, especially writes,
to dst_entry metrics values.

This will allow us to:

1) More easily change how the metrics are stored.

2) Implement COW for metrics.

In particular this will help us put metrics into the inetpeer
cache if that is what we end up doing.  We can make the _metrics
member a pointer instead of an array, initially have it point
at the read-only metrics in the FIB, and then on the first set
grab an inetpeer entry and point the _metrics member there.
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>

defb3519

18 Nov, 2010 1 commit

net: use the macros defined for the members of flowi · 5811662b

Authored by Changli Gao on Nov 12, 2010

Use the macros defined for the members of flowi to clean the code up.
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5811662b

12 Nov, 2010 1 commit

ipv4: Make rt->fl.iif tests lest obscure. · c7537967

Authored by David S. Miller on Nov 11, 2010

When we test rt->fl.iif against zero, we're seeing if it's
an output or an input route.

Make that explicit with some helper functions.
Signed-off-by: David S. Miller <davem@davemloft.net>

c7537967
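
The kind of helper commit c7537967 talks about can be illustrated with a small sketch. `struct demo_rtable` and both function names are hypothetical, and the only behavior assumed is what the commit message states: a zero input interface index marks an output route, a non-zero one an input route.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, stripped-down route entry. */
struct demo_rtable {
	int iif;  /* input interface index; 0 for locally generated traffic */
};

/* The route was created for a packet received on some interface. */
static bool demo_rt_is_input_route(const struct demo_rtable *rt)
{
	return rt->iif != 0;
}

/* The route was created for locally originated traffic. */
static bool demo_rt_is_output_route(const struct demo_rtable *rt)
{
	return rt->iif == 0;
}
```

Call sites then read as a statement of intent instead of a bare comparison against zero.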

09 Nov, 2010 1 commit

decnet: RCU conversion and get rid of dev_base_lock · fc766e4c

Authored by Eric Dumazet on Oct 29, 2010

While tracking dev_base_lock users, I found decnet used it in
dnet_select_source(), but for a wrong purpose:

Writers only hold RTNL, not dev_base_lock, so readers must use RCU if
they cannot use RTNL.

Adds an rcu_head in struct dn_ifaddr and handle proper RCU management.

Adds __rcu annotation in dn_route as well.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fc766e4c

openanolis / cloud-kernel · synced successfully over 1 year ago