提交 · 52e804c6dfaa5df1e4b0e290357b82ad4e4cda2c · openanolis / cloud-kernel

19 11月, 2012 3 次提交

net: Allow userns root to control ipv4 · 52e804c6

由 Eric W. Biederman 提交于 11月 16, 2012

Allow an unpriviled user who has created a user namespace, and then
created a network namespace to effectively use the new network
namespace, by reducing capable(CAP_NET_ADMIN) and
capable(CAP_NET_RAW) calls to be ns_capable(net->user_ns,
CAP_NET_ADMIN), or capable(net->user_ns, CAP_NET_RAW) calls.

Settings that merely control a single network device are allowed.
Either the network device is a logical network device where
restrictions make no difference or the network device is hardware NIC
that has been explicity moved from the initial network namespace.

In general policy and network stack state changes are allowed
while resource control is left unchanged.

Allow creating raw sockets.
Allow the SIOCSARP ioctl to control the arp cache.
Allow the SIOCSIFFLAG ioctl to allow setting network device flags.
Allow the SIOCSIFADDR ioctl to allow setting a netdevice ipv4 address.
Allow the SIOCSIFBRDADDR ioctl to allow setting a netdevice ipv4 broadcast address.
Allow the SIOCSIFDSTADDR ioctl to allow setting a netdevice ipv4 destination address.
Allow the SIOCSIFNETMASK ioctl to allow setting a netdevice ipv4 netmask.
Allow the SIOCADDRT and SIOCDELRT ioctls to allow adding and deleting ipv4 routes.

Allow the SIOCADDTUNNEL, SIOCCHGTUNNEL and SIOCDELTUNNEL ioctls for
adding, changing and deleting gre tunnels.

Allow the SIOCADDTUNNEL, SIOCCHGTUNNEL and SIOCDELTUNNEL ioctls for
adding, changing and deleting ipip tunnels.

Allow the SIOCADDTUNNEL, SIOCCHGTUNNEL and SIOCDELTUNNEL ioctls for
adding, changing and deleting ipsec virtual tunnel interfaces.

Allow setting the MRT_INIT, MRT_DONE, MRT_ADD_VIF, MRT_DEL_VIF, MRT_ADD_MFC,
MRT_DEL_MFC, MRT_ASSERT, MRT_PIM, MRT_TABLE socket options on multicast routing
sockets.

Allow setting and receiving IPOPT_CIPSO, IP_OPT_SEC, IP_OPT_SID and
arbitrary ip options.

Allow setting IP_SEC_POLICY/IP_XFRM_POLICY ipv4 socket option.
Allow setting the IP_TRANSPARENT ipv4 socket option.
Allow setting the TCP_REPAIR socket option.
Allow setting the TCP_CONGESTION socket option.
Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

52e804c6

net: Push capable(CAP_NET_ADMIN) into the rtnl methods · dfc47ef8

由 Eric W. Biederman 提交于 11月 16, 2012

- In rtnetlink_rcv_msg convert the capable(CAP_NET_ADMIN) check
  to ns_capable(net->user-ns, CAP_NET_ADMIN).  Allowing unprivileged
  users to make netlink calls to modify their local network
  namespace.

- In the rtnetlink doit methods add capable(CAP_NET_ADMIN) so
  that calls that are not safe for unprivileged users are still
  protected.

Later patches will remove the extra capable calls from methods
that are safe for unprivilged users.
Acked-by: NSerge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

dfc47ef8

net: Don't export sysctls to unprivileged users · 464dc801

由 Eric W. Biederman 提交于 11月 16, 2012

In preparation for supporting the creation of network namespaces
by unprivileged users, modify all of the per net sysctl exports
and refuse to allow them to unprivileged users.

This makes it safe for unprivileged users in general to access
per net sysctls, and allows sysctls to be exported to unprivileged
users on an individual basis as they are deemed safe.
Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

464dc801

02 11月, 2012 1 次提交

rtnl/ipv4: use netconf msg to advertise rp_filter status · cc535dfb

由 Nicolas Dichtel 提交于 10月 29, 2012

Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

cc535dfb

29 10月, 2012 2 次提交

rtnl/ipv4: add support of RTM_GETNETCONF · 9e551110

由 Nicolas Dichtel 提交于 10月 25, 2012

This message allows to get the devconf for an interface.
Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9e551110

rtnl/ipv4: use netconf msg to advertise forwarding status · edc9e748

由 Nicolas Dichtel 提交于 10月 25, 2012

Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

edc9e748

22 9月, 2012 1 次提交

net: change return values from -EACCES to -EPERM · bf5b30b8

由 Zhao Hongjiang 提交于 9月 20, 2012

Change return value from -EACCES to -EPERM when the permission check fails.
Signed-off-by: NZhao Hongjiang <zhaohongjiang@huawei.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

bf5b30b8

19 9月, 2012 1 次提交

ipv4/route: arg delay is useless in rt_cache_flush() · bafa6d9d

由 Nicolas Dichtel 提交于 9月 07, 2012

Since route cache deletion (89aef892), delay is no
more used. Remove it.
Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

bafa6d9d

11 9月, 2012 1 次提交

netlink: Rename pid to portid to avoid confusion · 15e47304

由 Eric W. Biederman 提交于 9月 07, 2012

It is a frequent mistake to confuse the netlink port identifier with a
process identifier.  Try to reduce this confusion by renaming fields
that hold port identifiers portid instead of pid.

I have carefully avoided changing the structures exported to
userspace to avoid changing the userspace API.

I have successfully built an allyesconfig kernel with this change.
Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: NStephen Hemminger <shemminger@vyatta.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

15e47304

08 9月, 2012 1 次提交

ipv4/route: arg delay is useless in rt_cache_flush() · 4ccfe6d4

由 Nicolas Dichtel 提交于 9月 07, 2012

Since route cache deletion (89aef892), delay is no
more used. Remove it.
Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4ccfe6d4

24 8月, 2012 1 次提交

net: reinstate rtnl in call_netdevice_notifiers() · 748e2d93

由 Eric Dumazet 提交于 8月 22, 2012

Eric Biederman pointed out that not holding RTNL while calling
call_netdevice_notifiers() was racy.

This patch is a direct transcription his feedback
against commit 0115e8e3 (net: remove delay at device dismantle)

Thanks Eric !
Signed-off-by: NEric Dumazet <edumazet@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Mahesh Bandewar <maheshb@google.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Gao feng <gaofeng@cn.fujitsu.com>
Acked-by: N"Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

748e2d93

23 8月, 2012 1 次提交

net: remove delay at device dismantle · 0115e8e3

由 Eric Dumazet 提交于 8月 22, 2012

I noticed extra one second delay in device dismantle, tracked down to
a call to dst_dev_event() while some call_rcu() are still in RCU queues.

These call_rcu() were posted by rt_free(struct rtable *rt) calls.

We then wait a little (but one second) in netdev_wait_allrefs() before
kicking again NETDEV_UNREGISTER.

As the call_rcu() are now completed, dst_dev_event() can do the needed
device swap on busy dst.

To solve this problem, add a new NETDEV_UNREGISTER_FINAL, called
after a rcu_barrier(), but outside of RTNL lock.

Use NETDEV_UNREGISTER_FINAL with care !

Change dst_dev_event() handler to react to NETDEV_UNREGISTER_FINAL

Also remove NETDEV_UNREGISTER_BATCH, as its not used anymore after
IP cache removal.

With help from Gao feng
Signed-off-by: NEric Dumazet <edumazet@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Mahesh Bandewar <maheshb@google.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0115e8e3

04 8月, 2012 1 次提交

ipv4: change inet_addr_hash() · 40384999

由 Eric Dumazet 提交于 8月 03, 2012

Use net_hash_mix(net) instead of hash_ptr(net, 8), and use
hash_32() instead of using a serie of XOR

Define IN4_ADDR_HSIZE_SHIFT for clarity

__ip_dev_find() can perform the net_eq() call only if ifa_local
matches the key, to avoid unneeded dereferences.

remove inline attributes

# size net/ipv4/devinet.o.before net/ipv4/devinet.o
   text	   data	    bss	    dec	    hex	filename
  17471	   2545	   2048	  22064	   5630	net/ipv4/devinet.o.before
  17335	   2545	   2048	  21928	   55a8	net/ipv4/devinet.o
Signed-off-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

40384999

13 6月, 2012 1 次提交

ipv4: Add interface option to enable routing of 127.0.0.0/8 · d0daebc3

由 Thomas Graf 提交于 6月 12, 2012

Routing of 127/8 is tradtionally forbidden, we consider
packets from that address block martian when routing and do
not process corresponding ARP requests.

This is a sane default but renders a huge address space
practically unuseable.

The RFC states that no address within the 127/8 block should
ever appear on any network anywhere but it does not forbid
the use of such addresses outside of the loopback device in
particular. For example to address a pool of virtual guests
behind a load balancer.

This patch adds a new interface option 'route_localnet'
enabling routing of the 127/8 address block and processing
of ARP requests on a specific interface.

Note that for the feature to work, the default local route
covering 127/8 dev lo needs to be removed.

Example:
  $ sysctl -w net.ipv4.conf.eth0.route_localnet=1
  $ ip route del 127.0.0.0/8 dev lo table local
  $ ip addr add 127.1.0.1/16 dev eth0
  $ ip route flush cache

V2: Fix invalid check to auto flush cache (thanks davem)
Signed-off-by: NThomas Graf <tgraf@suug.ch>
Acked-by: NNeil Horman <nhorman@tuxdriver.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d0daebc3

16 5月, 2012 1 次提交

net: ipv4 and ipv6: Convert printk(KERN_DEBUG to pr_debug · 91df42be

由 Joe Perches 提交于 5月 15, 2012

Use the current debugging style and enable dynamic_debug.
Signed-off-by: NJoe Perches <joe@perches.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

91df42be

21 4月, 2012 1 次提交

net ipv4: Convert devinet to use register_net_sysctl · 8607ddb8

由 Eric W. Biederman 提交于 4月 19, 2012

Using an ascii path to register_net_sysctl as opposed to the slightly
awkward ctl_path allows for much simpler code.

We no longer need to malloc dev_name to keep it alive the length of our
sysctl register instead we can use a small temporary buffer on the
stack.
Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
Acked-by: NPavel Emelyanov <xemul@parallels.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

8607ddb8

16 4月, 2012 1 次提交

net: cleanup unsigned to unsigned int · 95c96174

由 Eric Dumazet 提交于 4月 15, 2012

Use of "unsigned int" is preferred to bare "unsigned" in net tree.
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

95c96174

02 4月, 2012 1 次提交

ipv4: Stop using NLA_PUT*(). · f3756b79

由 David S. Miller 提交于 4月 01, 2012

These macros contain a hidden goto, and are thus extremely error
prone and make code hard to audit.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f3756b79

29 3月, 2012 1 次提交

Remove all #inclusions of asm/system.h · 9ffc93f2

由 David Howells 提交于 3月 28, 2012

Remove all #inclusions of asm/system.h preparatory to splitting and killing
it. Performed with the following command:

perl -p -i -e 's!^#\s*include\s*<asm/system[.]h>.*\n!!' `grep -Irl '^#\s*include\s*<asm/system[.]h>' *`
Signed-off-by: NDavid Howells <dhowells@redhat.com>

9ffc93f2

23 3月, 2012 1 次提交

bonding: remove entries for master_ip and vlan_ip and query devices instead · eaddcd76

由 Andy Gospodarek 提交于 3月 22, 2012

The following patch aimed to resolve an issue where secondary, tertiary,
etc. addresses added to bond interfaces could overwrite the
bond->master_ip and vlan_ip values.

        commit 917fbdb3
        Author: Henrik Saavedra Persson <henrik.e.persson@ericsson.com>
        Date:   Wed Nov 23 23:37:15 2011 +0000

            bonding: only use primary address for ARP

That patch was good because it prevented bonds using ARP monitoring from
sending frames with an invalid source IP address.  Unfortunately, it
didn't always work as expected.

When using an ioctl (like ifconfig does) to set the IP address and
netmask, 2 separate ioctls are actually called to set the IP and netmask
if the mask chosen doesn't match the standard mask for that class of
address.  The first ioctl did not have a mask that matched the one in
the primary address and would still cause the device address to be
overwritten.  The second ioctl that was called to set the mask would
then detect as secondary and ignored, but the damage was already done.

This was not an issue when using an application that used netlink
sockets as the setting of IP and netmask came down at once.  The
inconsistent behavior between those two interfaces was something that
needed to be resolved.

While I was thinking about how I wanted to resolve this, Ralf Zeidler
came with a patch that resolved this on a RHEL kernel by keeping a full
shadow of the entries in dev->ifa_list for the bonding device and vlan
devices in the bonding driver.  I didn't like the duplication of the
list as I want to see the 'bonding' struct and code shrink rather than
grow, but liked the general idea.

As the Subject indicates this patch drops the master_ip and vlan_ip
elements from the 'bonding' and 'vlan_entry' structs, respectively.
This can be done because a device's address-list is now traversed to
determine the optimal source IP address for ARP requests and for checks
to see if the bonding device has a particular IP address.  This code
could have all be contained inside the bonding driver, but it made more
sense to me to EXPORT and call inet_confirm_addr since it did exactly
what was needed.

I tested this and a backported patch and everything works as expected.
Ralf also helped with verification of the backported patch.

Thanks to Ralf for all his help on this.

v2: Whitespace and organizational changes based on suggestions from Jay
Vosburgh and Dave Miller.

v3: Fixup incorrect usage of rcu_read_unlock based on Dave Miller's
suggestion.
Signed-off-by: NAndy Gospodarek <andy@greyhouse.net>
CC: Ralf Zeidler <ralf.zeidler@nsn.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

eaddcd76

13 1月, 2012 1 次提交

net: reintroduce missing rcu_assign_pointer() calls · cf778b00

由 Eric Dumazet 提交于 1月 12, 2012

commit a9b3cd7f (rcu: convert uses of rcu_assign_pointer(x, NULL) to
RCU_INIT_POINTER) did a lot of incorrect changes, since it did a
complete conversion of rcu_assign_pointer(x, y) to RCU_INIT_POINTER(x,
y).

We miss needed barriers, even on x86, when y is not NULL.
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
CC: Stephen Hemminger <shemminger@vyatta.com>
CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

cf778b00

02 12月, 2011 1 次提交

ipv4: flush route cache after change accept_local · d01ff0a0

由 Peter Pan(潘卫平) 提交于 12月 01, 2011

After reset ipv4_devconf->data[IPV4_DEVCONF_ACCEPT_LOCAL] to 0,
we should flush route cache, or it will continue receive packets with local
source address, which should be dropped.
Signed-off-by: NWeiping Pan <panweiping3@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d01ff0a0

02 8月, 2011 1 次提交

rcu: convert uses of rcu_assign_pointer(x, NULL) to RCU_INIT_POINTER · a9b3cd7f

由 Stephen Hemminger 提交于 8月 01, 2011

When assigning a NULL value to an RCU protected pointer, no barrier
is needed. The rcu_assign_pointer, used to handle that but will soon
change to not handle the special case.

Convert all rcu_assign_pointer of NULL value.

//smpl
@@ expression P; @@

- rcu_assign_pointer(P, NULL)
+ RCU_INIT_POINTER(P, NULL)

// </smpl>
Signed-off-by: NStephen Hemminger <shemminger@vyatta.com>
Acked-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a9b3cd7f

26 7月, 2011 1 次提交

IPv4: Send gratuitous ARP for secondary IP addresses also · b76d0789

由 Zoltan Kiss 提交于 7月 24, 2011

If a device event generates gratuitous ARP messages, only primary
address is used for sending. This patch iterates through the whole
list. Tested with 2 IP addresses configuration on bonding interface.
Signed-off-by: NZoltan Kiss <schaman@sch.bme.hu>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b76d0789

10 6月, 2011 1 次提交

rtnetlink: Compute and store minimum ifinfo dump size · c7ac8679

由 Greg Rose 提交于 6月 10, 2011

The message size allocated for rtnl ifinfo dumps was limited to
a single page.  This is not enough for additional interface info
available with devices that support SR-IOV and caused a bug in
which VF info would not be displayed if more than approximately
40 VFs were created per interface.

Implement a new function pointer for the rtnl_register service that will
calculate the amount of data required for the ifinfo dump and allocate
enough data to satisfy the request.
Signed-off-by: NGreg Rose <gregory.v.rose@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

c7ac8679

11 5月, 2011 1 次提交

net: fix two lockdep splats · 1fc19aff

由 Eric Dumazet 提交于 5月 09, 2011

Commit e67f88dd (net: dont hold rtnl mutex during netlink dump
callbacks) switched rtnl protection to RCU, but we forgot to adjust two
rcu_dereference() lockdep annotations :

inet_get_link_af_size() or inet_fill_link_af() might be called with
rcu_read_lock or rtnl held, so use rcu_dereference_rtnl()
instead of rtnl_dereference()
Reported-by: NValdis Kletnieks <Valdis.Kletnieks@vt.edu>
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1fc19aff

03 5月, 2011 1 次提交

sysctl: net: call unregister_net_sysctl_table where needed · ff538818

由 Lucian Adrian Grijincu 提交于 5月 01, 2011

ctl_table_headers registered with register_net_sysctl_table should
have been unregistered with the equivalent unregister_net_sysctl_table
Signed-off-by: NLucian Adrian Grijincu <lucian.grijincu@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ff538818

30 4月, 2011 1 次提交

ipv4, ipv6, bonding: Restore control over number of peer notifications · ad246c99

由 Ben Hutchings 提交于 4月 26, 2011

For backward compatibility, we should retain the module parameters and
sysfs attributes to control the number of peer notifications
(gratuitous ARPs and unsolicited NAs) sent after bonding failover.
Also, it is possible for failover to take place even though the new
active slave does not have link up, and in that case the peer
notification should be deferred until it does.

Change ipv4 and ipv6 so they do not automatically send peer
notifications on bonding failover.

Change the bonding driver to send separate NETDEV_NOTIFY_PEERS
notifications when the link is up, as many times as requested.  Since
it does not directly control which protocols send notifications, make
num_grat_arp and num_unsol_na aliases for a single parameter.  Bump
the bonding version number and update its documentation.
Signed-off-by: NBen Hutchings <bhutchings@solarflare.com>
Signed-off-by: NJay Vosburgh <fubar@us.ibm.com>
Acked-by: NBrian Haley <brian.haley@hp.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ad246c99

18 4月, 2011 1 次提交

bonding, ipv4, ipv6, vlan: Handle NETDEV_BONDING_FAILOVER like NETDEV_NOTIFY_PEERS · 7c899432

由 Ben Hutchings 提交于 4月 15, 2011

It is undesirable for the bonding driver to be poking into higher
level protocols, and notifiers provide a way to avoid that.  This does
mean removing the ability to configure reptitition of gratuitous ARPs
and unsolicited NAs.
Signed-off-by: NBen Hutchings <bhutchings@solarflare.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

7c899432

24 3月, 2011 1 次提交

ipv4: Fallback to FIB local table in __ip_dev_find(). · 406b6f97

由 David S. Miller 提交于 3月 22, 2011

In commit 9435eb1c
("ipv4: Implement __ip_dev_find using new interface address hash.")
we reimplemented __ip_dev_find() so that it doesn't have to
do a full FIB table lookup.

Instead, it consults a hash table of addresses configured to
interfaces.

This works identically to the old code in all except one case,
and that is for loopback subnets.

The old code would match the loopback device for any IP address
that falls within a subnet configured to the loopback device.

Handle this corner case by doing the FIB lookup.

We could implement this via inet_addr_onlink() but:

1) Someone could configure many addresses to loopback and
   inet_addr_onlink() is a simple list traversal.

2) We know the old code works.
Reported-by: NJulian Anastasov <ja@ssi.bg>
Acked-by: NStephen Hemminger <shemminger@vyatta.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

406b6f97

22 3月, 2011 2 次提交

ipv4: optimize route adding on secondary promotion · 04024b93

由 Julian Anastasov 提交于 3月 19, 2011

Optimize the calling of fib_add_ifaddr for all
secondary addresses after the promoted one to start from
their place, not from the new place of the promoted
secondary. It will save some CPU cycles because we
are sure the promoted secondary was first for the subnet
and all next secondaries do not change their place.
Signed-off-by: NJulian Anastasov <ja@ssi.bg>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

04024b93

ipv4: remove the routes on secondary promotion · 2d230e2b

由 Julian Anastasov 提交于 3月 19, 2011

The secondary address promotion relies on fib_sync_down_addr
to remove all routes created for the secondary addresses when
the old primary address is deleted. It does not happen for cases
when the primary address is also in another subnet. Fix that
by deleting local and broadcast routes for all secondaries while
they are on device list and by faking that all addresses from
this subnet are to be deleted. It relies on fib_del_ifaddr being
able to ignore the IPs from the concerned subnet while checking
for duplication.
Signed-off-by: NJulian Anastasov <ja@ssi.bg>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2d230e2b

10 3月, 2011 1 次提交

ipv4: Fix erroneous uses of ifa_address. · 6c91afe1

由 David S. Miller 提交于 3月 09, 2011

In usual cases ifa_address == ifa_local, but in the case where
SIOCSIFDSTADDR sets the destination address on a point-to-point
link, ifa_address gets set to that destination address.

Therefore we should use ifa_local when we want the local interface
address.

There were two cases where the selection was done incorrectly:

1) When devinet_ioctl() does matching, it checks ifa_address even
   though gifconf correct reported ifa_local to the user

2) IN_DEV_ARP_NOTIFY handling sends a gratuitous ARP using
   ifa_address instead of ifa_local.
Reported-by: NJulian Anastasov <ja@ssi.bg>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

6c91afe1

04 3月, 2011 1 次提交

ipv4: Fix __ip_dev_find() to use ifa_local instead of ifa_address. · e066008b

由 David S. Miller 提交于 3月 03, 2011

Reported-by: NStephen Hemminger <shemminger@vyatta.com>
Reported-by: NJulian Anastasov <ja@ssi.bg>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e066008b

19 2月, 2011 2 次提交

D
ipv4: Implement __ip_dev_find using new interface address hash. · 9435eb1c
由 David S. Miller 提交于 2月 18, 2011
```
Much quicker than going through the FIB tables.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
```
9435eb1c

ipv4: Add hash table of interface addresses. · fd23c3b3

由 David S. Miller 提交于 2月 18, 2011

This will be used to optimize __ip_dev_find() and friends.

With help from Eric Dumazet.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

fd23c3b3

15 2月, 2011 1 次提交

arp_notify: unconditionally send gratuitous ARP for NETDEV_NOTIFY_PEERS. · d11327ad

由 Ian Campbell 提交于 2月 11, 2011

NETDEV_NOTIFY_PEER is an explicit request by the driver to send a link
notification while NETDEV_UP/NETDEV_CHANGEADDR generate link
notifications as a sort of side effect.

In the later cases the sysctl option is present because link
notification events can have undesired effects e.g. if the link is
flapping. I don't think this applies in the case of an explicit
request from a driver.

This patch makes NETDEV_NOTIFY_PEER unconditional, if preferred we
could add a new sysctl for this case which defaults to on.

This change causes Xen post-migration ARP notifications (which cause
switches to relearn their MAC tables etc) to be sent by default.
Signed-off-by: NIan Campbell <ian.campbell@citrix.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d11327ad

13 12月, 2010 1 次提交

ipv4: Don't pre-seed hoplimit metric. · 323e126f

由 David S. Miller 提交于 12月 12, 2010

Always go through a new ip4_dst_hoplimit() helper, just like ipv6.

This allowed several simplifications:

1) The interim dst_metric_hoplimit() can go as it's no longer
   userd.

2) The sysctl_ip_default_ttl entry no longer needs to use
   ipv4_doint_and_flush, since the sysctl is not cached in
   routing cache metrics any longer.

3) ipv4_doint_and_flush no longer needs to be exported and
   therefore can be marked static.

When ipv4_doint_and_flush_strategy was removed some time ago,
the external declaration in ip.h was mistakenly left around
so kill that off too.

We have to move the sysctl_ip_default_ttl declaration into
ipv4's route cache definition header net/route.h, because
currently net/ip.h (where the declaration lives now) has
a back dependency on net/route.h
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

323e126f

07 12月, 2010 1 次提交

net: kill an RCU warning in inet_fill_link_af() · f7fce74e

由 Eric Dumazet 提交于 12月 01, 2010

commits 9f0f7272 (ipv4: AF_INET link address family) and cf7afbfe
(rtnl: make link af-specific updates atomic) used incorrect
__in_dev_get_rcu() in RTNL protected contexts, triggering PROVE_RCU
warnings.

Switch to __in_dev_get_rtnl(), wich is more appropriate, since we hold
RTNL.

Based on a report and initial patch from Amerigo Wang.
Reported-by: NAmerigo Wang <amwang@redhat.com>
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Cc: Thomas Graf <tgraf@infradead.org>
Reviewed-by: NWANG Cong <amwang@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f7fce74e

28 11月, 2010 1 次提交

rtnl: make link af-specific updates atomic · cf7afbfe

由 Thomas Graf 提交于 11月 22, 2010

As David pointed out correctly, updates to af-specific attributes
are currently not atomic. If multiple changes are requested and
one of them fails, previous updates may have been applied already
leaving the link behind in a undefined state.

This patch splits the function parse_link_af() into two functions
validate_link_af() and set_link_at(). validate_link_af() is placed
to validate_linkmsg() check for errors as early as possible before
any changes to the link have been made. set_link_af() is called to
commit the changes later.

This method is not fail proof, while it is currently sufficient
to make set_link_af() inerrable and thus 100% atomic, the
validation function method will not be able to detect all error
scenarios in the future, there will likely always be errors
depending on states which are f.e. not protected by rtnl_mutex
and thus may change between validation and setting.

Also, instead of silently ignoring unknown address families and
config blocks for address families which did not register a set
function the errors EAFNOSUPPORT respectively EOPNOSUPPORT are
returned to avoid comitting 4 out of 5 update requests without
notifying the user.
Signed-off-by: NThomas Graf <tgraf@infradead.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

cf7afbfe

openanolis / cloud-kernel 大约 1 年 前同步成功

openanolis / cloud-kernel
大约 1 年前同步成功