Commits · 714f095f74582764d629785f03b459a3d0503624 · openanolis / cloud-kernel

19 Oct, 2010 2 commits

Authored by Hans Schillstrom on Oct 19, 2010

IPv6 encapsulation uses a bad source address for the tunnel.
i.e. VIP will be used as local-addr and encap. dst addr.
Decapsulation will not accept this.

Example
LVS (eth1 2003::2:0:1/96, VIP 2003::2:0:100)
   (eth0 2003::1:0:1/96)
RS  (ethX 2003::1:0:5/96)

tcpdump
2003::2:0:100 > 2003::1:0:5: IP6 (hlim 63, next-header TCP (6) payload length: 40)  2003::3:0:10.50991 > 2003::2:0:100.http: Flags [S], cksum 0x7312 (correct), seq 3006460279, win 5760, options [mss 1440,sackOK,TS val 1904932 ecr 0,nop,wscale 3], length 0

In Linux IPv6 impl. you can't have a tunnel with an anycast address
receiving packets (I have not tried to interpret RFC 2473).
To have receive capabilities the tunnel must have:
 - Local address set as a multicast addr or a unicast addr
 - Remote address set as a unicast addr.
 - Loopback address or link-local address are not allowed.

This causes us to set up a tunnel in the Real Server with the
LVS as the remote address; here you can't use the VIP address since it's
used inside the tunnel.

Solution
Use outgoing interface IPv6 address (match against the destination).
i.e. use ip6_route_output() to look up the route cache and
then use ipv6_dev_get_saddr(...) to set the source address of the
encapsulated packet.

Additionally, cache the results in new destination
fields: dst_cookie and dst_saddr and properly check the
returned dst from ip6_route_output. We now add xfrm_lookup
call only for the tunneling method where the source address
is a local one.
Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>

714f095f
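
The three receive-capability rules listed in the commit message above can be sketched as a small userspace check. This is only an illustration: `toy_tunnel_can_receive` is a hypothetical helper, not a kernel function, and it assumes the loopback and link-local restriction applies to both endpoints. Anycast cannot be recognised from the address bits alone, so it is not checked.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdbool.h>

/* Hypothetical helper: returns true if the (local, remote) pair passes the
 * address rules quoted in the commit message. Not the kernel's own check. */
static bool toy_tunnel_can_receive(const char *local, const char *remote)
{
    struct in6_addr l, r;

    /* Reject strings that are not valid IPv6 addresses. */
    if (inet_pton(AF_INET6, local, &l) != 1 ||
        inet_pton(AF_INET6, remote, &r) != 1)
        return false;

    /* Assumption: loopback and link-local are rejected on either end. */
    if (IN6_IS_ADDR_LOOPBACK(&l) || IN6_IS_ADDR_LINKLOCAL(&l) ||
        IN6_IS_ADDR_LOOPBACK(&r) || IN6_IS_ADDR_LINKLOCAL(&r))
        return false;

    /* The unspecified address (::) is neither unicast nor multicast. */
    if (IN6_IS_ADDR_UNSPECIFIED(&l) || IN6_IS_ADDR_UNSPECIFIED(&r))
        return false;

    /* Local may be multicast or unicast; remote must be unicast. */
    return !IN6_IS_ADDR_MULTICAST(&r);
}
```

With the example addresses from the commit, a Real Server tunnel with local `2003::1:0:5` and remote `2003::1:0:1` passes this check, while a link-local or multicast remote address does not.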

netfilter: ctnetlink: add expectation deletion events · ebbf41df

Authored by Pablo Neira Ayuso on Oct 19, 2010

This patch allows to listen to events that inform about
expectations destroyed.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>

ebbf41df

18 Oct, 2010 1 commit

netfilter: install missing ebtables headers for userspace · 43f974cd

Authored by Nick Bowler on Oct 18, 2010

The ebt_ip6.h and ebt_nflog.h headers are not known to Kbuild and
therefore not installed by make headers_install.  Fix that up.
Signed-off-by: Nick Bowler <nbowler@elliptictech.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>

43f974cd

14 Oct, 2010 5 commits

netfilter: xtables: remove unused defines · 9ecdafd8

Authored by Jan Engelhardt on Oct 13, 2010

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>

9ecdafd8

netfilter: xtables: unify {ip,ip6,arp}t_error_target · 75f0a0fd

Authored by Jan Engelhardt on Oct 13, 2010

Unification of struct *_error_target was forgotten in
v2.6.16-1689-g1e30a014.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>

75f0a0fd

netfilter: xtables: resolve indirect macros 3/3 · 243bf6e2

Authored by Jan Engelhardt on Oct 13, 2010

243bf6e2

netfilter: xtables: resolve indirect macros 2/3 · 87a2e70d

Authored by Jan Engelhardt on Oct 13, 2010

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>

87a2e70d

netfilter: xtables: resolve indirect macros 1/3 · 12b00c2c

Authored by Jan Engelhardt on Oct 13, 2010

Many of the used macros are just there for userspace compatibility.
Substitute the in-kernel code to directly use the terminal macro
and stuff the defines into #ifndef __KERNEL__ sections.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>

12b00c2c

05 Oct, 2010 2 commits

netfilter: add missing xt_log.h file · eecc5458

Authored by Patrick McHardy on Oct 04, 2010

Forgot to add xt_log.h in commit a8defca0 (netfilter: ipt_LOG:
add bufferisation to call printk() once)
Signed-off-by: Patrick McHardy <kaber@trash.net>

eecc5458

netfilter: nf_nat: make find/put static · 0c200d93

Authored by Stephen Hemminger on Oct 04, 2010

The functions nf_nat_proto_find_get and nf_nat_proto_put are
only used internally in nf_nat_core. This might break some out
of tree NAT module.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>

0c200d93

04 Oct, 2010 6 commits

IPVS: Allow configuration of persistence engines · 0d1e71b0

Authored by Simon Horman on Aug 22, 2010

Allow the persistence engine of a virtual service to be set, edited
and unset.

This feature only works with the netlink user-space interface.
Signed-off-by: Simon Horman <horms@verge.net.au>
Acked-by: Julian Anastasov <ja@ssi.bg>

0d1e71b0

IPVS: management of persistence engine modules · 8be67a66

Authored by Simon Horman on Aug 22, 2010

This is based heavily on the scheduler management code
Signed-off-by: Simon Horman <horms@verge.net.au>
Acked-by: Julian Anastasov <ja@ssi.bg>

8be67a66

IPVS: Add persistence engine data to /proc/net/ip_vs_conn · a3c918ac

Authored by Simon Horman on Aug 22, 2010

This shouldn't break compatibility with userspace as the new data
is at the end of the line.

I have confirmed that this doesn't break ipvsadm, the main (only?)
user-space user of this data.
Signed-off-by: Simon Horman <horms@verge.net.au>
Acked-by: Julian Anastasov <ja@ssi.bg>

a3c918ac

IPVS: Add struct ip_vs_pe · 85999283

Authored by Simon Horman on Aug 22, 2010

Signed-off-by: Simon Horman <horms@verge.net.au>
Acked-by: Julian Anastasov <ja@ssi.bg>

85999283

IPVS: Add struct ip_vs_conn_param · f11017ec

Authored by Simon Horman on Aug 22, 2010

Signed-off-by: Simon Horman <horms@verge.net.au>
Acked-by: Julian Anastasov <ja@ssi.bg>

f11017ec

netfilter: nf_conntrack_sip: Add callid parser · 001985b2

Authored by Simon Horman on Aug 22, 2010

Signed-off-by: Simon Horman <horms@verge.net.au>
Acked-by: Julian Anastasov <ja@ssi.bg>

001985b2

29 Sep, 2010 1 commit

netfilter: ctnetlink: add support for user-space expectation helpers · bc01befd

Authored by Pablo Neira Ayuso on Sep 28, 2010

This patch adds the basic infrastructure to support user-space
expectation helpers via ctnetlink and the netfilter queuing
infrastructure NFQUEUE. Basically, this patch:

* adds NF_CT_EXPECT_USERSPACE flag to identify user-space
  created expectations. I have also added a sanity check in
  __nf_ct_expect_check() to avoid that kernel-space helpers
  may create an expectation if the master conntrack has no
  helper assigned.
* adds some branches to check if the master conntrack helper
  exists, otherwise we skip the code that refers to kernel-space
  helper such as the local expectation list and the expectation
  policy.
* allows to set the timeout for user-space expectations with
  no helper assigned.
* a list of expectations created from user-space that depends
  on ctnetlink (if this module is removed, they are deleted).
* includes USERSPACE in the /proc output for expectations
  that have been created by a user-space helper.

This patch also modifies ctnetlink to skip including the helper
name in the Netlink messages if no kernel-space helper is set
(since no user-space expectation has a kernel-space helper
assigned).

You can access an example user-space FTP conntrack helper at:
http://people.netfilter.org/pablo/userspace-conntrack-helpers/nf-ftp-helper-userspace-POC.tar.bz
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>

bc01befd

22 Sep, 2010 1 commit

netfilter: ctnetlink: allow to specify the expectation flags · 8b008faf

Authored by Pablo Neira Ayuso on Sep 22, 2010

With this patch, you can specify the expectation flags for user-space
created expectations.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>

8b008faf

21 Sep, 2010 2 commits

ipvs: make rerouting optional with snat_reroute · 8a803040

Authored by Julian Anastasov on Sep 21, 2010

Add new sysctl flag "snat_reroute". Recent kernels use
ip_route_me_harder() to route LVS-NAT responses properly by
VIP when there are multiple paths to client. But setups
that do not have alternative default routes can skip this
routing lookup by using snat_reroute=0.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Patrick McHardy <kaber@trash.net>

8a803040

ipvs: netfilter connection tracking changes · f4bc17cd

Authored by Julian Anastasov on Sep 21, 2010

Add more code to IPVS to work with Netfilter connection
tracking and fix some problems.

- Allow IPVS to be compiled without connection tracking as in
2.6.35 and before. This can avoid keeping conntracks for all
IPVS connections because this costs memory. ip_vs_ftp still
depends on connection tracking and NAT as implemented for 2.6.36.

- Add sysctl var "conntrack" to enable connection tracking for
all IPVS connections. For loaded IPVS directors it needs
tuning of nf_conntrack_max limit.

- Add IP_VS_CONN_F_NFCT connection flag to request the connection
to use connection tracking. This allows user space to provide this
flag, for example, in dest->conn_flags. This can be useful to
request connection tracking per real server instead of forcing it
for all connections with the "conntrack" sysctl. This flag is
set currently only by ip_vs_ftp and of course by "conntrack" sysctl.

- Add ip_vs_nfct.c file to hold all connection tracking code,
this way the main code should not depend on netfilter conntrack
support.

- Return back the ip_vs_post_routing handler as in 2.6.35 and use
skb->ipvs_property=1 to allow IPVS to work without connection
tracking

Connection tracking:

- most of the code is already in 2.6.36-rc

- alter conntrack reply tuple for LVS-NAT connections when first packet
from client is forwarded and conntrack state is NEW or RELATED.
Additionally, alter reply for RELATED connections from real server,
again for packet in original direction.

- add IP_VS_XMIT_TUNNEL to confirm conntrack (without altering
reply) for LVS-TUN early because we want to call nf_reset. It is
needed because we add IPIP header and the original conntrack
should be preserved, not destroyed. The transmitted IPIP packets
can reuse same conntrack, so we do not set skb->ipvs_property.

- try to destroy conntrack when the IPVS connection is destroyed.
It is not fatal if conntrack disappears before that, it depends
on the used timers.

Fix long-standing problems:

- add skb->ip_summed = CHECKSUM_NONE for the LVS-TUN transmitters
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Patrick McHardy <kaber@trash.net>

f4bc17cd

17 Sep, 2010 1 commit

ipvs: extend connection flags to 32 bits · 3575792e

Authored by Julian Anastasov on Sep 17, 2010

- the sync protocol supports 16 bits only, so bits 0..15 should be
used only for flags that should go to backup server, bits 16 and
above should be allocated for flags not sent to backup.

- use IP_VS_CONN_F_DEST_MASK as mask of connection flags in
destination that can be changed by user space

- allow IP_VS_CONN_F_ONE_PACKET to be set in destination
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Patrick McHardy <kaber@trash.net>

3575792e
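
The bit split described in this commit (bits 0..15 go to the backup server, bits 16 and above stay local) can be sketched in userspace C. This is only an illustration: `SYNC_FLAGS_MASK` and both helper names are hypothetical and do not appear in the commit; the real flag definitions live in the kernel's IPVS headers.

```c
#include <stdint.h>

/* Hypothetical mask for illustration: bits 0..15 are the only ones the
 * 16-bit sync protocol can carry to the backup server. */
#define SYNC_FLAGS_MASK 0x0000FFFFu

/* Flags that would be sent to the backup: the low 16 bits only. */
static uint16_t flags_for_sync(uint32_t conn_flags)
{
    return (uint16_t)(conn_flags & SYNC_FLAGS_MASK);
}

/* Flags that stay local to this director: bits 16 and above. */
static uint32_t flags_local_only(uint32_t conn_flags)
{
    return conn_flags & ~SYNC_FLAGS_MASK;
}
```

For a flags word of `0x00030005`, the first helper yields `0x0005` and the second yields `0x00030000`, so a flag allocated at bit 16 or above never reaches the backup.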

09 Sep, 2010 10 commits

udp: add rehash on connect() · 719f8358

Authored by Eric Dumazet on Sep 08, 2010

commit 30fff923 introduced in linux-2.6.33 (udp: bind() optimisation)
added a secondary hash on UDP, hashed on (local addr, local port).

Problem is that following sequence :

fd = socket(...)
connect(fd, &remote, ...)

not only selects remote end point (address and port), but also sets
local address, while UDP stack stored in secondary hash table the socket
while its local address was INADDR_ANY (or ipv6 equivalent)

Sequence is :
 - autobind() : choose a random local port, insert socket in hash tables
              [while local address is INADDR_ANY]
 - connect() : set remote address and port, change local address to IP
              given by a route lookup.

When an incoming UDP frame comes, if more than 10 sockets are found in
primary hash table, we switch to secondary table, and fail to find
socket because its local address changed.

One solution to this problem is to rehash datagram socket if needed.

We add a new rehash(struct socket *) method in "struct proto", and
implement this method for UDP v4 & v6, using a common helper.

This rehashing only takes care of secondary hash table, since primary
hash (based on local port only) is not changed.
Reported-by: Krzysztof Piotr Oledzki <ole@ans.pl>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Tested-by: Krzysztof Piotr Oledzki <ole@ans.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>

719f8358
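
The failure mode described in this commit can be modelled with a toy secondary hash in userspace C. Everything here is a simplified assumption: the `toy_` names, the bucket count and the hash function are invented for illustration and are not the kernel's data structures. The model only shows why a socket inserted under INADDR_ANY is missed after connect() changes its local address, unless it is rehashed.

```c
#include <stdbool.h>
#include <stdint.h>

#define NBUCKETS 64u

struct toy_sock {
    uint32_t local_addr;   /* 0 stands for INADDR_ANY */
    uint16_t local_port;
    unsigned bucket2;      /* secondary-hash bucket the socket sits in */
};

/* Toy secondary hash on (local addr, local port); not the kernel's hash. */
static unsigned hash2(uint32_t addr, uint16_t port)
{
    return ((addr * 2654435761u) ^ port) % NBUCKETS;
}

/* autobind: pick a port and insert while the address is still INADDR_ANY. */
static void toy_autobind(struct toy_sock *sk, uint16_t port)
{
    sk->local_addr = 0;
    sk->local_port = port;
    sk->bucket2 = hash2(sk->local_addr, sk->local_port);
}

/* connect: a route lookup assigns a concrete local address. With
 * rehash == false the socket stays in its old bucket (the bug). */
static void toy_connect(struct toy_sock *sk, uint32_t routed_addr, bool rehash)
{
    sk->local_addr = routed_addr;
    if (rehash)
        sk->bucket2 = hash2(sk->local_addr, sk->local_port);
}

/* lookup: search only the bucket hashed from the packet's destination. */
static bool toy_lookup2(const struct toy_sock *sk, uint32_t daddr, uint16_t dport)
{
    return sk->bucket2 == hash2(daddr, dport) &&
           sk->local_addr == daddr && sk->local_port == dport;
}
```

Calling `toy_connect` with `rehash` set to false leaves the socket in the bucket computed for address 0, so a lookup keyed on the routed address misses it; with `rehash` set to true the lookup succeeds.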

RDS: Remove dead struct from rds.h · 905d64c8

Authored by Andy Grover on Sep 08, 2010

flows are an obsolete data type.
Signed-off-by: Andy Grover <andy.grover@oracle.com>

905d64c8

RDS: rds.h: Replace u_int[size]_t with uint[size]_t · a46f561b

Authored by Andy Grover on Aug 25, 2010

Replace e.g. u_int32_t types with the more common uint32_t.
Reported-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Andy Grover <andy.grover@oracle.com>

a46f561b

RDS: Add rds.h to exported headers list · fd128dfa

Authored by Andy Grover on Aug 25, 2010

Also, a number of changes were made based on the assumption that
rds.h wasn't exported, so roll these back.
Signed-off-by: Andy Grover <andy.grover@oracle.com>

fd128dfa

RDS: Implement masked atomic operations · 20c72bd5

Authored by Andy Grover on Aug 25, 2010

Add two CMSGs for masked versions of cswp and fadd. args
struct modified to use a union for different atomic op type's
arguments. Change IB to do masked atomic ops. Atomic op type
in rds_message similarly unionized.
Signed-off-by: Andy Grover <andy.grover@oracle.com>

20c72bd5
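
As a rough illustration of what a masked compare-and-swap does, the sketch below models it on a plain 64-bit word. It is not atomic and is not the RDS or InfiniBand implementation. The semantics are an assumption, since the commit message does not spell them out: only the bits under `compare_mask` are compared, and on a match only the bits under `swap_mask` are replaced. The function name is hypothetical.

```c
#include <stdint.h>

/* Plain (non-atomic) model of a masked compare-and-swap on one 64-bit word.
 * Assumed semantics: compare only the bits in compare_mask; on a match,
 * replace only the bits in swap_mask. Returns the original value. */
static uint64_t toy_masked_cswp(uint64_t *target,
                                uint64_t compare, uint64_t compare_mask,
                                uint64_t swap, uint64_t swap_mask)
{
    uint64_t old = *target;

    if ((old & compare_mask) == (compare & compare_mask))
        *target = (old & ~swap_mask) | (swap & swap_mask);
    return old;
}
```

Under these assumptions, a word holding `0x1234` compared against `0x0034` under mask `0x00FF` matches, and swapping `0xAB00` under mask `0xFF00` leaves `0xAB34`.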

RDS: Add flag for silent ops. Do atomic op before RDMA · 2c3a5f9a

Authored by Andy Grover on Mar 01, 2010

Add a flag to the API so users can indicate they want
silent operations. This is needed because silent ops
cannot be used with USE_ONCE MRs, so we can't just
assume silent.

Also, change send_xmit to do atomic op before rdma op if
both are present, and centralize the hairy logic to determine if
we want to attempt silent, or not.
Signed-off-by: Andy Grover <andy.grover@oracle.com>

2c3a5f9a

RDS: Implement atomic operations · 15133f6e

Authored by Andy Grover on Jan 12, 2010

Implement a CMSG-based interface to do FADD and CSWP ops.

Alter send routines to handle atomic ops.

Add atomic counters to stats.

Add xmit_atomic() to struct rds_transport

Inline rds_ib_send_unmap_rdma into unmap_rm
Signed-off-by: Andy Grover <andy.grover@oracle.com>

15133f6e

net: introduce rcu_dereference_rtnl · a6e0fc85

Authored by Eric Dumazet on Sep 08, 2010

We use rcu_dereference_check(p, rcu_read_lock_held() ||
lockdep_rtnl_is_held()) several times in network stack.

More usages to come too, so it's time to create a helper.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a6e0fc85

include/net/raw.h: Convert raw_seq_private macro to inline · e3634169

Authored by Joe Perches on Sep 07, 2010

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e3634169

ipvs: fix active FTP · 6523ce15

Authored by Julian Anastasov on Sep 05, 2010

- Do not create expectation when forwarding the PORT
  command to avoid blocking the connection. The problem is that
  nf_conntrack_ftp.c:help() tries to create the same expectation later in
  POST_ROUTING and drops the packet with "dropping packet" message after
  failure in nf_ct_expect_related.

- Change ip_vs_update_conntrack to alter the conntrack
  for related connections from real server. If we do not alter the reply in
  this direction the next packet from client sent to vport 20 comes as NEW
  connection. We alter it, but maybe some collision happens for both
  conntracks and the second conntrack gets destroyed immediately. The
  connection gets stuck too.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

6523ce15

07 Sep, 2010 1 commit

agp/intel: Fix cache control for Sandybridge · f8f235e5

Authored by Zhenyu Wang on Aug 27, 2010

Sandybridge GTT has new cache control bits in PTE, which control
graphics page cache in LLC or LLC/MLC, so we need to extend the mask
function to respect the new bits.

And set cache control to always LLC only by default on Gen6.
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: stable@kernel.org
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

f8f235e5

05 Sep, 2010 1 commit

cgroups: fix API thinko · 73457f0f

Authored by Michael S. Tsirkin on Aug 06, 2010

The cgroup_attach_task_current_cg API that we have upstream is backwards: we
really need an API to attach to the cgroups from another process A to
the current one.

In our case (vhost), a privileged user wants to attach its task to cgroups
from a less privileged one, the API makes us run it in the other
task's context, and this fails.

So let's make the API generic and just pass in 'from' and 'to' tasks.
Add an inline wrapper for cgroup_attach_task_current_cg to avoid
breaking bisect.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Paul Menage <menage@google.com>

73457f0f

04 Sep, 2010 2 commits

serial: fix port type conflict between NS16550A & U6_16550A · 71cad055

Authored by Philippe Langlais on Aug 31, 2010

Bug seen by Dr. David Alan Gilbert with sparse
Signed-off-by: Philippe Langlais <philippe.langlais@stericsson.com>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

71cad055

cls_cgroup: Fix rcu lockdep warning · 3fb5a991

Authored by Li Zefan on Sep 02, 2010

Dave reported an rcu lockdep warning on 2.6.35.4 kernel.

task->cgroups and task->cgroups->subsys[i] are protected by RCU.
So we avoid accessing invalid pointers here. This might happen,
for example, when you are deref-ing those pointers while someone
moves @task from one cgroup to another.
Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

3fb5a991

03 Sep, 2010 2 commits

drivers/net: avoid some skb->ip_summed initializations · bc8acf2c

Authored by Eric Dumazet on Sep 02, 2010

fresh skbs have ip_summed set to CHECKSUM_NONE (0)

We can avoid setting again skb->ip_summed to CHECKSUM_NONE in drivers.

Introduce skb_checksum_none_assert() helper so that we keep this
assertion documented in driver sources.

Change most occurrences of :

skb->ip_summed = CHECKSUM_NONE;

by :

skb_checksum_none_assert(skb);
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bc8acf2c
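
The substitution described in this commit can be illustrated with a userspace stand-in. The `toy_` types and functions below are hypothetical mock-ups written for this note, not the kernel's `struct sk_buff` or its helper. They only show the idea of replacing a redundant store with an assertion that documents the expected state.

```c
#include <assert.h>

/* Userspace stand-ins for illustration only; the real definitions live in
 * the kernel's skbuff headers. */
#define CHECKSUM_NONE 0

struct toy_skb {
    unsigned char ip_summed;
};

/* Documents (and, unless NDEBUG is defined, verifies) that a fresh buffer
 * still has ip_summed == CHECKSUM_NONE, instead of storing the value again. */
static void toy_checksum_none_assert(const struct toy_skb *skb)
{
    assert(skb->ip_summed == CHECKSUM_NONE);
}

/* Driver-style receive path: the buffer arrives zero-initialised. */
static int toy_rx(struct toy_skb *skb)
{
    /* before: skb->ip_summed = CHECKSUM_NONE; */
    toy_checksum_none_assert(skb);
    return skb->ip_summed;
}
```

The design point is that the assertion costs nothing in a build with checks disabled, while the source still records that the driver relies on the field being `CHECKSUM_NONE`.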

net: Improve comments in include/linux/phy.h · c6883996

Authored by Peter Meerwald on Sep 02, 2010

Correct state range of PHY bus addresses (i.e. 0-31) in comment,
make spelling of PHY consistent in comments.
Signed-off-by: Peter Meerwald <pmeerw@pmeerw.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

c6883996

02 Sep, 2010 1 commit

skge: add GRO support · 86cac58b

Authored by Eric Dumazet on Aug 31, 2010

- napi_gro_flush() is exported from net/core/dev.c, to avoid
  an irq_save/irq_restore in the packet receive path.
- use napi_gro_receive() instead of netif_receive_skb()
- use napi_gro_flush() before calling __napi_complete()
- turn on NETIF_F_GRO by default
- Tested on a Marvell 88E8001 Gigabit NIC
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

86cac58b

01 Sep, 2010 2 commits

powerpc/85xx: Add P1021 PCI IDs and quirks · a28dec2f

Authored by Anton Vorontsov on Aug 08, 2010

This is needed for proper PCI-E support on P1021 SoCs.
Signed-off-by: Anton Vorontsov <avorontsov@mvista.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>

a28dec2f

net: add a comment on netdev->last_rx · 4dc89133

Authored by Eric Dumazet on Aug 31, 2010

As some driver authors seem to reintroduce dev->last_rx use,
add a comment to strongly discourage this.

Since commit 6cf3f41e (bonding, net: Move last_rx update into bonding
recv logic), network drivers don't need to update last_rx themselves,
unless they use this field to implement a timeout.

Not updating last_rx helps not dirtying a cache line, improving
performance in SMP.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

4dc89133

openanolis / cloud-kernel synced successfully nearly 2 years ago