Commits · 5c1469de7545a35a16ff2b902e217044a7d2f8a5 · openanolis / cloud-kernel

17 Jun, 2010 6 commits

user_ns: Introduce user_nsmap_uid and user_ns_map_gid. · 5c1469de

Authored by Eric W. Biederman on Jun 13, 2010

Define what happens when we view a uid from one user_namespace
in another user_namespace.

- If the user namespaces are the same no mapping is necessary.

- For most cases of difference use overflowuid and overflowgid,
  the uid and gid currently used for 16bit apis when we have a 32bit uid
  that does not fit in 16bits.  Effectively the situation is the same,
  we want to return a uid or gid that is not assigned to any user.

- For the case when we happen to be mapping the uid or gid of the
  creator of the target user namespace use uid 0 and gid 0, as confusing
  that user with root is not a problem.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Serge E. Hallyn <serue@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5c1469de
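
The three mapping rules in this commit message can be sketched in user-space C. This is only an illustration under simplifying assumptions, not the kernel code: `struct user_ns_sketch`, `map_uid_sketch()` and `OVERFLOW_UID_SKETCH` are hypothetical names, 65534 is assumed as the overflow uid, and only the direct creator of the target namespace is checked.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for a user namespace. */
struct user_ns_sketch {
    uint32_t creator_uid;                     /* uid of the user who created it */
    const struct user_ns_sketch *creator_ns;  /* namespace that creator lives in */
};

/* Assumed overflow uid: a value not assigned to any real user. */
#define OVERFLOW_UID_SKETCH 65534u

/* Map `uid`, valid in namespace `from`, to how it appears in namespace `to`. */
static uint32_t map_uid_sketch(const struct user_ns_sketch *to,
                               const struct user_ns_sketch *from,
                               uint32_t uid)
{
    if (to == from)
        return uid;                  /* same namespace: no mapping needed */
    if (to->creator_ns == from && to->creator_uid == uid)
        return 0;                    /* creator of the target appears as root */
    return OVERFLOW_UID_SKETCH;      /* no useful relationship */
}
```

The gid case would follow the same shape with an overflow gid in place of the overflow uid.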

scm: Reorder scm_cookie. · 812e876e

Authored by Eric W. Biederman on Jun 13, 2010

Reorder the fields in scm_cookie so they pack better on 64bit.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

812e876e
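
Why field order matters on 64-bit can be shown with a small, hypothetical pair of structs. The field names below are illustrative and are not the actual scm_cookie layout; on a typical LP64 ABI the first struct needs padding around the pointer and the second does not.

```c
#include <stdint.h>

/* Hypothetical layout: a 4-byte field on each side of a pointer.
 * On a typical 64-bit ABI the compiler pads after `pid` and after `secid`. */
struct cookie_before_sketch {
    uint32_t pid;
    void    *fp;
    uint32_t secid;
};

/* Same fields, with the two 4-byte members adjacent so they share one slot. */
struct cookie_after_sketch {
    void    *fp;
    uint32_t pid;
    uint32_t secid;
};
```

Comparing `sizeof` for the two structs shows the saving, if any, on a given target.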

qlcnic: Bumped up version number · 434d7b38

Authored by Anirban Chakraborty on Jun 16, 2010

Changed the driver version number to 5.0.4
Signed-off-by: Anirban Chakraborty <anirban.chakraborty@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

434d7b38

qlcnic: Fix a bug in setting up NIC partitioning mode · 0e33c664

Authored by Anirban Chakraborty on Jun 16, 2010

The driver was not detecting the presence of NIC partitioning capability of the
firmware properly. Now, it checks the eswitch set bit in the FW capabilities
register and accordingly sets the driver mode as NPAR capable or not.
Signed-off-by: Anirban Chakraborty <anirban.chakraborty@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0e33c664

syncookies: check decoded options against sysctl settings · 8c763681

Authored by Florian Westphal on Jun 16, 2010

Discard the ACK if we find options that do not match current sysctl
settings.

Previously it was possible to create a connection with sack, wscale,
etc. enabled even if the feature was disabled via sysctl.

Also remove an unneeded call to tcp_sack_reset() in
cookie_check_timestamp: Both call sites (cookie_v4_check,
cookie_v6_check) zero "struct tcp_options_received", hand it to
tcp_parse_options() (which does not change tcp_opt->num_sacks/dsack)
and then call cookie_check_timestamp().

Even if num_sacks/dsacks were changed, the structure is allocated on
the stack and after cookie_check_timestamp returns only a few selected
members are copied to the inet_request_sock.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

8c763681
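
The check described in this commit, rejecting a cookie ACK whose decoded options are disabled by sysctl, might look roughly like the sketch below. All type and function names are hypothetical stand-ins for illustration, not the kernel's structures.

```c
#include <stdbool.h>

/* Hypothetical subset of the options decoded from a syncookie ACK. */
struct tcp_opts_sketch {
    bool saw_tstamp;
    bool sack_ok;
    bool wscale_ok;
};

/* Hypothetical snapshot of the relevant sysctl switches. */
struct tcp_sysctl_sketch {
    bool timestamps;
    bool sack;
    bool window_scaling;
};

/* Returns false if the ACK carries an option that is currently disabled,
 * in which case the ACK should be discarded. */
static bool cookie_opts_allowed_sketch(const struct tcp_opts_sketch *o,
                                       const struct tcp_sysctl_sketch *s)
{
    if (o->saw_tstamp && !s->timestamps)
        return false;
    if (o->sack_ok && !s->sack)
        return false;
    if (o->wscale_ok && !s->window_scaling)
        return false;
    return true;
}
```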

inetpeer: restore small inet_peer structures · 317fe0e6

Authored by Eric Dumazet on Jun 16, 2010

Addition of rcu_head to struct inet_peer added 16bytes on 64bit arches.

That's a bit unfortunate, since old size was exactly 64 bytes.

This can be solved, using a union between this rcu_head and four fields,
that are normally used only when a refcount is taken on inet_peer.
rcu_head is used only when refcnt=-1, right before structure freeing.

Add an inet_peer_refcheck() function to check this assertion for a while.

We can bring back SLAB_HWCACHE_ALIGN qualifier in kmem cache creation.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

317fe0e6
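
The space-sharing idea can be sketched with a union whose two arms are never live at the same time. The struct below is a hypothetical, reduced model: the field types are assumptions, and `peer_fields_usable_sketch()` only imitates the kind of check the commit message attributes to inet_peer_refcheck().

```c
#include <stdint.h>

/* Stand-in for an RCU callback head: two pointers. */
struct rcu_head_sketch {
    struct rcu_head_sketch *next;
    void (*func)(struct rcu_head_sketch *);
};

struct peer_sketch {
    int refcnt;                        /* -1 marks an entry about to be freed */
    union {
        struct {                       /* meaningful only while a reference is held */
            int32_t  rid;
            int32_t  ip_id_count;
            uint32_t tcp_ts;
            uint32_t tcp_ts_stamp;
        } live;
        struct rcu_head_sketch rcu;    /* meaningful only once refcnt is -1 */
    } u;
};

/* The `live` fields may be touched only while a reference is held. */
static int peer_fields_usable_sketch(const struct peer_sketch *p)
{
    return p->refcnt > 0;
}
```

Because the arms overlap, the callback head costs no extra bytes beyond what the four fields already occupy.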

16 Jun, 2010 29 commits

gadget/rndis: dev_get_stats() now returns rtnl_link_stats64. · fdb93f8a

Authored by David S. Miller on Jun 15, 2010

Based upon a report by Stephen Rothwell.
Signed-off-by: David S. Miller <davem@davemloft.net>

fdb93f8a

inetpeer: do not use zero refcnt for freed entries · 5f2f8920

Authored by Eric Dumazet on Jun 15, 2010

Followup of commit aa1039e7 (inetpeer: RCU conversion)

Unused inet_peer entries have a null refcnt.

Using atomic_inc_not_zero() in rcu lookups is not going to work for
them, and slow path is taken.

Fix this using -1 marker instead of 0 for deleted entries.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5f2f8920
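
A rough user-space model of the -1 marker, written with C11 atomics rather than kernel primitives: a lockless reader may take a reference on an unused entry (count 0) but must fail on a deleted one (count -1). The helper names are hypothetical; `add_unless_sketch()` imitates the semantics of an "add unless equal to" operation.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Add `a` to *v unless *v currently equals `u`.
 * Returns true if the addition happened. */
static bool add_unless_sketch(atomic_int *v, int a, int u)
{
    int cur = atomic_load(v);
    while (cur != u) {
        /* On failure `cur` is refreshed with the current value; retry. */
        if (atomic_compare_exchange_weak(v, &cur, cur + a))
            return true;
    }
    return false;
}

/* Lockless reader: take a reference unless the entry is marked deleted. */
static bool peer_tryget_sketch(atomic_int *refcnt)
{
    return add_unless_sketch(refcnt, 1, -1);
}
```

With 0 as the deleted marker, an increment-if-not-zero test would also reject entries that are merely unused, which is the slow-path problem the message describes.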

netpoll: Use correct primitives for RCU dereferencing · d5f31fbf

Authored by Herbert Xu on Jun 15, 2010

Now that RCU debugging checks for matching rcu_dereference calls
and rcu_read_lock, we need to use the correct primitives or face
nasty warnings.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

d5f31fbf

bridge: Add const to dummy br_netpoll_send_skb · 9f70b0fc

Authored by Herbert Xu on Jun 15, 2010

The version of br_netpoll_send_skb used when netpoll is off is
missing a const, thus causing a warning.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

9f70b0fc

net: NET_SKB_PAD should depend on L1_CACHE_BYTES · 5933dd2f

Authored by Eric Dumazet on Jun 15, 2010

In old kernels, NET_SKB_PAD was defined to 16.

Then commit d6301d3d (net: Increase default NET_SKB_PAD to 32), and
commit 18e8c134 (net: Increase NET_SKB_PAD to 64 bytes) increased it
to 64.

While first patch was governed by network stack needs, second was more
driven by performance issues on current hardware. Real intent was to
align data on a cache line boundary.

So use max(32, L1_CACHE_BYTES) instead of 64, to be more generic.

Remove microblaze and powerpc own NET_SKB_PAD definitions.

Thanks to Alexander Duyck and David Miller for their comments.
Suggested-by: David Miller <davem@davemloft.net>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5933dd2f
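
The `max(32, L1_CACHE_BYTES)` rule can be written as a compile-time macro. The sketch below uses hypothetical `_SKETCH` names and assumes a 64-byte cache line when none is supplied; in the kernel the cache line size comes from the architecture.

```c
/* Assumed cache line size for this sketch. */
#ifndef L1_CACHE_BYTES_SKETCH
#define L1_CACHE_BYTES_SKETCH 64
#endif

#define MAX_SKETCH(a, b) ((a) > (b) ? (a) : (b))

/* At least 32 bytes of headroom, and a full cache line where lines are larger. */
#define NET_SKB_PAD_SKETCH MAX_SKETCH(32, L1_CACHE_BYTES_SKETCH)
```

On a machine with 16-byte cache lines the result stays at 32, while 128-byte lines raise it to 128.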

ipfrag : frag_kfree_skb() cleanup · a95d8c88

Authored by Eric Dumazet on Jun 13, 2010

Third param (work) is unused, remove it.

Remove __inline__ and inline qualifiers.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a95d8c88

ip_frag: Remove some atomic ops · d27f9b35

Authored by Eric Dumazet on Jun 13, 2010

Instead of doing one atomic operation per frag, we can factorize them.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d27f9b35

ipv6: syncookies: do not skip ->iif initialization · 2bbdf389

Authored by Florian Westphal on Jun 13, 2010

When syncookies are in effect, req->iif is left uninitialized.
In case of e.g. link-local addresses the route lookup then fails
and no syn-ack is sent.

Rearrange things so ->iif is also initialized in the syncookie case.

want_cookie can only be true when the isn was zero, thus move the want_cookie
check into the "!isn" branch.

Cc: Glenn Griffin <ggriffin.kernel@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

2bbdf389

net: Fix error in comment on net_device_ops::ndo_get_stats · 82695d9b

Authored by Ben Hutchings on Jun 15, 2010

ndo_get_stats still returns struct net_device_stats *; there is
no struct net_device_stats64.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

82695d9b

netdev:bfin_mac: reclaim and free tx skb as soon as possible after transfer · 4fcc3d34

Authored by Sonic Zhang on Jun 11, 2010

SKBs hold onto resources that can't be held indefinitely, such as TCP
socket references and netfilter conntrack state. So if a packet is left
in TX ring for a long time, there might be a TCP socket that cannot be
closed and freed up.

Current blackfin EMAC driver always reclaims and frees used tx skbs in future
transfers. The problem is that a future transfer may not come as soon as
possible. This patch starts a timer after transfer to reclaim and free skb.
There is nearly no performance drop with this patch.

TX interrupt is not enabled because of a strange behavior of the Blackfin EMAC.
If EMAC TX transfer control is turned on, endless TX interrupts are triggered
no matter if TX DMA is enabled or not. Since DMA walks down the ring automatically,
TX transfer control can't be turned off in the middle. The only way is to disable
TX interrupt completely.
Signed-off-by: Sonic Zhang <sonic.zhang@analog.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

4fcc3d34

inetpeer: RCU conversion · aa1039e7

Authored by Eric Dumazet on Jun 15, 2010

inetpeer currently uses an AVL tree protected by an rwlock.

It's possible to make most lookups use RCU:

1) Add a struct rcu_head to struct inet_peer

2) add a lookup_rcu_bh() helper to perform lockless and opportunistic
lookup. This is a normal function, not a macro like lookup().

3) Add a limit to number of links followed by lookup_rcu_bh(). This is
needed in case we fall in a loop.

4) add an smp_wmb() in link_to_pool() right before node insert.

5) make unlink_from_pool() use atomic_cmpxchg() to make sure it can take
last reference to an inet_peer, since lockless readers could increase
refcount, even while we hold peers.lock.

6) Delay struct inet_peer freeing after rcu grace period so that
lookup_rcu_bh() cannot crash.

7) inet_getpeer() first attempts lockless lookup.
   Note this lookup can fail even if the target is in the AVL tree, because a
concurrent writer can leave the tree in an inconsistent state.
   If this attempt fails, the lock is taken and a regular lookup is performed
again.

8) convert peers.lock from rwlock to a spinlock

9) Remove SLAB_HWCACHE_ALIGN when peer_cachep is created, because
rcu_head adds 16 bytes on 64bit arches, doubling effective size (64 ->
128 bytes)
In a future patch, it is probably possible to revert this part, if rcu
field is put in a union to share space with rid, ip_id_count, tcp_ts &
tcp_ts_stamp. These fields being manipulated only with refcnt > 0.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

aa1039e7

cnic: Fix cnic_cm_abort() error handling. · 7b34a464

Authored by Michael Chan on Jun 15, 2010

Fix the code that handles the error case when cnic_cm_abort() cannot
proceed normally.  We cannot just set the csk->state and we must
go through cnic_ready_to_close() to handle all the conditions.  We
also add error return code in cnic_cm_abort().
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Eddie Wai <waie@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7b34a464

cnic: Refactor and fix cnic_ready_to_close(). · 943189f1

Authored by Michael Chan on Jun 15, 2010

Combine RESET_RECEIVED and RESET_COMP logic and fix race condition
between these 2 events and cnic_cm_close().  In particular, we need
to (test_and_clear_bit(SK_F_OFFLD_COMPLETE, &csk->flags)) before we
update csk->state.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Eddie Wai <waie@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

943189f1

cnic: Refactor code in cnic_cm_process_kcqe(). · a1e621bf

Authored by Michael Chan on Jun 15, 2010

Move chip-specific code to the respective chip's ->close_conn() functions
for better code organization.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Eddie Wai <waie@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a1e621bf

cnic: Return error code in cnic_cm_close() if unsuccessful. · ed99daa5

Authored by Michael Chan on Jun 15, 2010

So that bnx2i can handle the error condition immediately and not have to
wait for timeout.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Eddie Wai <waie@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ed99daa5

ixgbe: update set_rx_mode to fix issues w/ macvlan · 2850062a

Authored by Alexander Duyck on Jun 15, 2010

This change corrects issues where macvlan was not correctly triggering
promiscuous mode on ixgbe due to the filters not being correctly set.  It
also corrects the fact that VF rar filters were being overwritten when the
PF was reset.

CC: Shirley Ma <xma@us.ibm.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Emil Tantilov <emil.s.tantilov@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2850062a

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6 · 16fb62b6

Authored by David S. Miller on Jun 15, 2010

16fb62b6

tcp: unify tcp flag macros · a3433f35

Authored by Changli Gao on Jun 12, 2010

unify tcp flag macros: TCPHDR_FIN, TCPHDR_SYN, TCPHDR_RST, TCPHDR_PSH,
TCPHDR_ACK, TCPHDR_URG, TCPHDR_ECE and TCPHDR_CWR. TCPCB_FLAG_* are replaced
with the corresponding TCPHDR_*.
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
----
 include/net/tcp.h                      |   24 ++++++-------
 net/ipv4/tcp.c                         |    8 ++--
 net/ipv4/tcp_input.c                   |    2 -
 net/ipv4/tcp_output.c                  |   59 ++++++++++++++++-----------------
 net/netfilter/nf_conntrack_proto_tcp.c |   32 ++++++-----------
 net/netfilter/xt_TCPMSS.c              |    4 --
 6 files changed, 58 insertions(+), 71 deletions(-)
Signed-off-by: David S. Miller <davem@davemloft.net>

a3433f35
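
For orientation, the eight flag names listed in this commit correspond to the bits of the TCP header flags byte. The sketch below defines them with the values from the TCP header layout; `is_syn_ack_sketch()` is a hypothetical helper added only to show how such masks are combined.

```c
#include <stdint.h>

/* One bit per TCP header flag, in header bit order. */
#define TCPHDR_FIN 0x01
#define TCPHDR_SYN 0x02
#define TCPHDR_RST 0x04
#define TCPHDR_PSH 0x08
#define TCPHDR_ACK 0x10
#define TCPHDR_URG 0x20
#define TCPHDR_ECE 0x40
#define TCPHDR_CWR 0x80

/* True if both SYN and ACK are set in the flags byte. */
static int is_syn_ack_sketch(uint8_t flags)
{
    const uint8_t want = TCPHDR_SYN | TCPHDR_ACK;
    return (flags & want) == want;
}
```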

bridge: use rx_handler_data pointer to store net_bridge_port pointer · f350a0a8

Authored by Jiri Pirko on Jun 15, 2010

Register net_bridge_port pointer as rx_handler data pointer. As br_port is
removed from struct net_device, another netdev priv_flag is added to indicate
the device serves as a bridge port. Also rcuized pointers are now correctly
dereferenced in br_fdb.c and in netfilter parts.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f350a0a8

macvlan: use rx_handler_data pointer to store macvlan_port pointer V2 · a35e2c1b

Authored by Jiri Pirko on Jun 15, 2010

Register macvlan_port pointer as rx_handler data pointer. As macvlan_port is
removed from struct net_device, another netdev priv_flag is added to indicate
the device serves as a macvlan port.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

a35e2c1b

net: add rx_handler data pointer · 93e2c32b

Authored by Jiri Pirko on Jun 10, 2010

Add the possibility to register an rx_handler data pointer along with an rx_handler.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

93e2c32b

bridge: Fix netpoll support · 91d2c34a

Authored by Herbert Xu on Jun 10, 2010

There are multiple problems with the newly added netpoll support:

1) Use-after-free on each netpoll packet.
2) Invoking unsafe code on netpoll/IRQ path.
3) Breaks when netpoll is enabled on the underlying device.

This patch fixes all of these problems.  In particular, we now
allocate proper netpoll structures for each underlying device.

We only allow netpoll to be enabled on the bridge when all the
devices underneath it support netpoll.  Once it is enabled, we
do not allow non-netpoll devices to join the bridge (until netpoll
is disabled again).

This allows us to do away with the npinfo juggling that caused
problem number 1.

Incidentally this patch fixes number 2 by bypassing unsafe code
such as multicast snooping and netfilter.
Reported-by: Qianfeng Zhang <frzhang@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

91d2c34a

netpoll: Add netpoll_tx_running · c18370f5

Authored by Herbert Xu on Jun 10, 2010

This patch adds the helper netpoll_tx_running for use within
ndo_start_xmit.  It returns non-zero if ndo_start_xmit is being
invoked by netpoll, and zero otherwise.

This is currently implemented by simply looking at the hardirq
count.  This is because for all non-netpoll uses of ndo_start_xmit,
IRQs must be enabled while netpoll always disables IRQs before
calling ndo_start_xmit.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

c18370f5

netpoll: Allow netpoll_setup/cleanup recursion · 8fdd95ec

Authored by Herbert Xu on Jun 10, 2010

This patch adds the functions __netpoll_setup/__netpoll_cleanup
which are designed to be called recursively through ndo_netpoll_setup.

They must be called with RTNL held, and the caller must initialise
np->dev and ensure that it has a valid reference count.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

8fdd95ec

netpoll: Add ndo_netpoll_setup · 4247e161

Authored by Herbert Xu on Jun 10, 2010

This patch adds ndo_netpoll_setup as the initialisation primitive
to complement ndo_netpoll_cleanup.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

4247e161

netpoll: Add locking for netpoll_setup/cleanup · dbaa1541

Authored by Herbert Xu on Jun 10, 2010

As it stands, netpoll_setup and netpoll_cleanup have no locking
protection whatsoever.  So chaos ensues if two entities try to
perform them on the same device.

This patch adds RTNL to the equation.  The code has been rearranged so
that bits that do not need RTNL protection are now moved to the top of
netpoll_setup.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

dbaa1541

netpoll: Fix RCU usage · de85d99e

Authored by Herbert Xu on Jun 10, 2010

The use of RCU in netpoll is incorrect in a number of places:

1) The initial setting is lacking a write barrier.
2) The synchronize_rcu is in the wrong place.
3) Read barriers are missing.
4) Some places are even missing rcu_read_lock.
5) npinfo is zeroed after freeing.

This patch fixes those issues.  As most users are in BH context,
this also converts the RCU usage to the BH variant.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

de85d99e

bridge: Remove redundant npinfo NULL setting · 36655042

Authored by Herbert Xu on Jun 10, 2010

Now that netpoll always zaps npinfo we no longer need to do it
in bridge.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

36655042

netpoll: Set npinfo to NULL even with ndo_netpoll_cleanup · c04ec806

Authored by Herbert Xu on Jun 10, 2010

Since we have to NULL npinfo regardless of whether there is a
ndo_netpoll_cleanup, it makes sense to do this unconditionally
in netpoll_cleanup rather than having every driver do it by
itself.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

c04ec806

15 Jun, 2010 5 commits

Merge branch 'master' of /repos/git/net-next-2.6 · f9181f4f

Authored by Patrick McHardy on Jun 15, 2010

Conflicts:
	include/net/netfilter/xt_rateest.h
	net/bridge/br_netfilter.c
	net/netfilter/nf_conntrack_core.c
Signed-off-by: Patrick McHardy <kaber@trash.net>

f9181f4f

netfilter: xtables: idletimer target implementation · 0902b469

Authored by Luciano Coelho on Jun 15, 2010

This patch implements an idletimer Xtables target that can be used to
identify when interfaces have been idle for a certain period of time.

Timers are identified by labels and are created when a rule is set with a new
label.  The rules also take a timeout value (in seconds) as an option.  If
more than one rule uses the same timer label, the timer will be restarted
whenever any of the rules get a hit.

One entry for each timer is created in sysfs.  This attribute contains the
time remaining for the timer to expire.  The attributes are located under
the xt_idletimer class:

/sys/class/xt_idletimer/timers/<label>

When the timer expires, the target module sends a sysfs notification to
userspace, which can then decide what to do (e.g. disconnect to save power).

Cc: Timo Teras <timo.teras@iki.fi>
Signed-off-by: Luciano Coelho <luciano.coelho@nokia.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>

0902b469

netfilter: CLUSTERIP: RCU conversion · d73f33b1

Authored by Eric Dumazet on Jun 15, 2010

- clusterip_lock becomes a spinlock
- lockless lookups
- kfree() deferred after RCU grace period
- rcu_barrier_bh() inserted in clusterip_tg_exit()

v2)
- As Patrick pointed out, we use atomic_inc_not_zero() in
clusterip_config_find_get().
- list_add_rcu() and list_del_rcu() variants are used.
- atomic_dec_and_lock() used in clusterip_config_entry_put()
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>

d73f33b1

bnx2x: Fix link problem with some DACs · 1ab6c163

Authored by Yaniv Rosner on Jun 14, 2010

Change 2wire transfer rate of SFP+ module EEPROM from 400 kHz to 100 kHz
since some DACs (direct attached cables) do not work at 400 kHz.
Reported-by: Krzysztof Oldzki <ole@ans.pl>
Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1ab6c163

inetpeer: various changes · d6cc1d64

Authored by Eric Dumazet on Jun 14, 2010

Try to reduce cache line contentions in peer management, to reduce IP
defragmentation overhead.

- peer_fake_node is marked 'const' to make sure it's not modified.
  (tested with CONFIG_DEBUG_RODATA=y)

- Group variables in two structures to reduce number of dirtied cache
lines. One named "peers" for avl tree root, its number of entries, and
associated lock. (candidate for RCU conversion)

- A second one named "unused_peers" for unused list and its lock

- Add a !list_empty() test in unlink_from_unused() to avoid taking lock
when entry is not unused.

- Use atomic_dec_and_lock() in inet_putpeer() to avoid taking lock in
some cases.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d6cc1d64
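
The "decrement, and take the lock only if this was the last reference" pattern mentioned in the final bullet of this commit can be modelled in user space as follows. This is a sketch with C11 atomics and a toy spinlock, not the kernel's atomic_dec_and_lock(); all names ending in `_sketch` are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Toy spinlock built on atomic_flag, for illustration only. */
static void lock_sketch(atomic_flag *l)
{
    while (atomic_flag_test_and_set(l))
        ;                               /* spin until the flag was clear */
}

static void unlock_sketch(atomic_flag *l)
{
    atomic_flag_clear(l);
}

/* Drop one reference. Returns true, with the lock held, only if the
 * count reached zero; otherwise returns false without holding the lock. */
static bool dec_and_lock_sketch(atomic_int *cnt, atomic_flag *l)
{
    int cur = atomic_load(cnt);

    /* Fast path: not the last reference, so no lock is needed. */
    while (cur > 1) {
        if (atomic_compare_exchange_weak(cnt, &cur, cur - 1))
            return false;
    }

    /* Slow path: this may be the last reference. */
    lock_sketch(l);
    if (atomic_fetch_sub(cnt, 1) == 1)
        return true;                    /* caller must unlock when done */
    unlock_sketch(l);
    return false;
}
```

The common case, where other references remain, never touches the lock, which is the contention saving the commit message is after.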
