提交 · a2205472c3017bfe97b6cb6f5acd6ca141a97eda · openanolis / cloud-kernel

10 3月, 2009 1 次提交

net: fix warning about non-const string · a2205472

由 Stephen Hemminger 提交于 3月 09, 2009

Since dev_set_name takes a printf style string, new gcc complains
if arg is not const.
Signed-off-by: NStephen Hemminger <shemminger@vyatta.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a2205472

05 3月, 2009 2 次提交

vlan: Fix vlan-in-vlan crashes. · 9d40bbda

由 David S. Miller 提交于 3月 04, 2009

As analyzed by Patrick McHardy, vlan needs to reset it's
netdev_ops pointer in it's ->init() function but this
leaves the compat method pointers stale.

Add a netdev_resync_ops() and call it from the vlan code.

Any other driver which changes ->netdev_ops after register_netdevice()
will need to call this new function after doing so too.

With help from Patrick McHardy.
Tested-by: NPatrick McHardy <kaber@trash.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9d40bbda

D
net: Fix missing dev->neigh_setup in register_netdevice(). · 54acd0ef
由 David S. Miller 提交于 3月 04, 2009
```
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
```
54acd0ef

04 3月, 2009 1 次提交

neigh: Allow for user space users of the neighbour table · 0c5c2d30

由 Eric Biederman 提交于 3月 04, 2009

Currently it is possible to do just about everything with the arp table
from user space except treat an entry like you are using it. To that end
implement and a flag NTF_USE that when set in a netwlink update request
treats the neighbour table entry like the kernel does on the output path.

This allows user space applications to share the kernel's arp cache.
Signed-off-by: NEric Biederman <ebiederm@aristanetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0c5c2d30

03 3月, 2009 2 次提交

netns: Remove net_alive · 17edde52

由 Eric W. Biederman 提交于 2月 22, 2009

It turns out that net_alive is unnecessary, and the original problem
that led to it being added was simply that the icmp code thought
it was a network device and wound up being unable to handle packets
while there were still packets in the network namespace.

Now that icmp and tcp have been fixed to properly register themselves
this problem is no longer present and we have a stronger guarantee
that packets will not arrive in a network namespace then that provided
by net_alive in netif_receive_skb.  So remove net_alive allowing
packet reception run a little faster.

Additionally document the strong reason why network namespace cleanup
is safe so that if something happens again someone else will have
a chance of figuring it out.
Signed-off-by: NEric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

17edde52

net: Avoid race between network down and sysfs · 5a5990d3

由 Stephen Hemminger 提交于 2月 26, 2009

Signed-off-by: NStephen Hemminger <shemminger@vyatta.com>
Acked-by: N"Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5a5990d3

01 3月, 2009 1 次提交

netpoll: Add drop checks to all entry points · 4ead4431

由 Herbert Xu 提交于 3月 01, 2009

The netpoll entry checks are required to ensure that we don't
receive normal packets when invoked via netpoll.  Unfortunately
it only ever worked for the netif_receive_skb/netif_rx entry
points.  The VLAN (and subsequently GRO) entry point didn't
have the check and therefore can trigger all sorts of weird
problems.

This patch adds the netpoll check to all entry points.

I'm still uneasy with receiving at all under netpoll (which
apparently is only used by the out-of-tree kdump code).  The
reason is it is perfectly legal to receive all data including
headers into highmem if netpoll is off, but if you try to do
that with netpoll on and someone gets a printk in an IRQ handler                                             
you're going to get a nice BUG_ON.
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4ead4431

27 2月, 2009 4 次提交

RDS: Add RDS to AF key strings · cbd151bf

由 Andy Grover 提交于 2月 26, 2009

Signed-off-by: NAndy Grover <andy.grover@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

cbd151bf

sysctl: fix sparse warning: Should it be static? · 63d819ca

由 Hannes Eder 提交于 2月 25, 2009

Impact: Include header file.

Fix this sparse warning:
  net/core/sysctl_net_core.c:123:32: warning: symbol 'net_core_path' was not declared. Should it be static?
Signed-off-by: NHannes Eder <hannes@hanneseder.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

63d819ca

core: remove some pointless conditionals before kfree_skb() · f3fbbe0f

由 Wei Yongjun 提交于 2月 25, 2009

Remove some pointless conditionals before kfree_skb().
Signed-off-by: NWei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f3fbbe0f

pktgen: remove some pointless conditionals before kfree_skb() · 86dc1ad2

由 Wei Yongjun 提交于 2月 25, 2009

Remove some pointless conditionals before kfree_skb().
Signed-off-by: NWei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

86dc1ad2

25 2月, 2009 1 次提交

netlink: change nlmsg_notify() return value logic · 1ce85fe4

由 Pablo Neira Ayuso 提交于 2月 24, 2009

This patch changes the return value of nlmsg_notify() as follows:

If NETLINK_BROADCAST_ERROR is set by any of the listeners and
an error in the delivery happened, return the broadcast error;
else if there are no listeners apart from the socket that
requested a change with the echo flag, return the result of the
unicast notification. Thus, with this patch, the unicast
notification is handled in the same way of a broadcast listener
that has set the NETLINK_BROADCAST_ERROR socket flag.

This patch is useful in case that the caller of nlmsg_notify()
wants to know the result of the delivery of a netlink notification
(including the broadcast delivery) and take any action in case
that the delivery failed. For example, ctnetlink can drop packets
if the event delivery failed to provide reliable logging and
state-synchronization at the cost of dropping packets.

This patch also modifies the rtnetlink code to ignore the return
value of rtnl_notify() in all callers. The function rtnl_notify()
(before this patch) returned the error of the unicast notification
which makes rtnl_set_sk_err() reports errors to all listeners. This
is not of any help since the origin of the change (the socket that
requested the echoing) notices the ENOBUFS error if the notification
fails and should resync itself.
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
Acked-by: NPatrick McHardy <kaber@trash.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1ce85fe4

24 2月, 2009 2 次提交

net: amend the fix for SO_BSDCOMPAT gsopt infoleak · 50fee1de

由 Eugene Teo 提交于 2月 23, 2009

The fix for CVE-2009-0676 (upstream commit df0bca04) is incomplete. Note
that the same problem of leaking kernel memory will reappear if someone
on some architecture uses struct timeval with some internal padding (for
example tv_sec 64-bit and tv_usec 32-bit) --- then, you are going to
leak the padded bytes to userspace.
Signed-off-by: NEugene Teo <eugeneteo@kernel.sg>
Reported-by: NMikulas Patocka <mpatocka@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

50fee1de

netns: build fix for net_alloc_generic · ebe47d47

由 Clemens Noss 提交于 2月 23, 2009

net_alloc_generic was defined in #ifdef CONFIG_NET_NS, but used
unconditionally. Move net_alloc_generic out of #ifdef.
Signed-off-by: NClemens Noss <cnoss@gmx.de>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ebe47d47

23 2月, 2009 1 次提交

netns: Remove net_alive · ce16c533

由 Eric W. Biederman 提交于 2月 22, 2009

It turns out that net_alive is unnecessary, and the original problem
that led to it being added was simply that the icmp code thought
it was a network device and wound up being unable to handle packets
while there were still packets in the network namespace.

Now that icmp and tcp have been fixed to properly register themselves
this problem is no longer present and we have a stronger guarantee
that packets will not arrive in a network namespace then that provided
by net_alive in netif_receive_skb.  So remove net_alive allowing
packet reception run a little faster.

Additionally document the strong reason why network namespace cleanup
is safe so that if something happens again someone else will have
a chance of figuring it out.
Signed-off-by: NEric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ce16c533

22 2月, 2009 1 次提交

netns: fix double free at netns creation · 486a87f1

由 Daniel Lezcano 提交于 2月 22, 2009

This patch fix a double free when a network namespace fails.
The previous code does a kfree of the net_generic structure when
one of the init subsystem initialization fails.
The 'setup_net' function does kfree(ng) and returns an error.
The caller, 'copy_net_ns', call net_free on error, and this one
calls kfree(net->gen), making this pointer freed twice.

This patch make the code symetric, the net_alloc does the net_generic
allocation and the net_free frees the net_generic.
Signed-off-by: NDaniel Lezcano <daniel.lezcano@free.fr>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

486a87f1

21 2月, 2009 1 次提交

net: kernel panic in dev_hard_start_xmit: remove faulty software TX time stamping · cd4d8fda

由 Patrick Ohly 提交于 2月 21, 2009

The current implementation of the TX software time stamping fallback is
faulty because it accesses the skb after ndo_start_xmit() returns
successfully. This patch removes the fallback, which fixes kernel panics
seen during stress tests. Hardware time stamping is not affected by this
removal.
Signed-off-by: NPatrick Ohly <patrick.ohly@intel.com>
Signed-off-by: NEmil Tantilov <emil.s.tantilov@intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

cd4d8fda

20 2月, 2009 1 次提交

ethtool: Add RX pkt classification interface · 59089d8d

由 Santwona Behera 提交于 2月 20, 2009

Signed-off-by: NSantwona Behera <santwona.behera@sun.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

59089d8d

19 2月, 2009 1 次提交

net: Optimize skb_tx_hash() by eliminating a comparison · e88721f8

由 Krishna Kumar 提交于 2月 18, 2009

Optimize skb_tx_hash() by eliminating a comparison that executes for
every packet. skb_tx_hashrnd initialization is moved to a later part of
the startup sequence, namely after the "random" driver is initialized.

Rebooted the system three times and verified that the code generates
different random numbers each time.
Signed-off-by: NKrishna Kumar <krkumar2@in.ibm.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e88721f8

18 2月, 2009 1 次提交

net: Kill skb_truesize_check(), it only catches false-positives. · 92a0acce

由 David S. Miller 提交于 2月 17, 2009

A long time ago we had bugs, primarily in TCP, where we would modify
skb->truesize (for TSO queue collapsing) in ways which would corrupt
the socket memory accounting.

skb_truesize_check() was added in order to try and catch this error
more systematically.

However this debugging check has morphed into a Frankenstein of sorts
and these days it does nothing other than catch false-positives.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

92a0acce

16 2月, 2009 3 次提交

net: pass new SIOCSHWTSTAMP through to device drivers · d24fff22

由 Patrick Ohly 提交于 2月 12, 2009

Signed-off-by: NPatrick Ohly <patrick.ohly@intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d24fff22

net: socket infrastructure for SO_TIMESTAMPING · 20d49473

由 Patrick Ohly 提交于 2月 12, 2009

The overlap with the old SO_TIMESTAMP[NS] options is handled so
that time stamping in software (net_enable_timestamp()) is
enabled when SO_TIMESTAMP[NS] and/or SO_TIMESTAMPING_RX_SOFTWARE
is set.  It's disabled if all of these are off.
Signed-off-by: NPatrick Ohly <patrick.ohly@intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

20d49473

net: infrastructure for hardware time stamping · ac45f602

由 Patrick Ohly 提交于 2月 12, 2009

The additional per-packet information (16 bytes for time stamps, 1
byte for flags) is stored for all packets in the skb_shared_info
struct. This implementation detail is hidden from users of that
information via skb_* accessor functions. A separate struct resp.
union is used for the additional information so that it can be
stored/copied easily outside of skb_shared_info.

Compared to previous implementations (reusing the tstamp field
depending on the context, optional additional structures) this
is the simplest solution. It does not extend sk_buff itself.

TX time stamping is implemented in software if the device driver
doesn't support hardware time stamping.

The new semantic for hardware/software time stamping around
ndo_start_xmit() is based on two assumptions about existing
network device drivers which don't support hardware time
stamping and know nothing about it:
 - they leave the new skb_shared_tx unmodified
 - the keep the connection to the originating socket in skb->sk
   alive, i.e., don't call skb_orphan()

Given that skb_shared_tx is new, the first assumption is safe.
The second is only true for some drivers. As a result, software
TX time stamping currently works with the bnx2 driver, but not
with the unmodified igb driver (the two drivers this patch series
was tested with).
Signed-off-by: NPatrick Ohly <patrick.ohly@intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ac45f602

13 2月, 2009 2 次提交

net: 4 bytes kernel memory disclosure in SO_BSDCOMPAT gsopt try #2 · df0bca04

由 Clément Lecigne 提交于 2月 12, 2009

In function sock_getsockopt() located in net/core/sock.c, optval v.val
is not correctly initialized and directly returned in userland in case
we have SO_BSDCOMPAT option set.

This dummy code should trigger the bug:

int main(void)
{
	unsigned char buf[4] = { 0, 0, 0, 0 };
	int len;
	int sock;
	sock = socket(33, 2, 2);
	getsockopt(sock, 1, SO_BSDCOMPAT, &buf, &len);
	printf("%x%x%x%x\n", buf[0], buf[1], buf[2], buf[3]);
	close(sock);
}

Here is a patch that fix this bug by initalizing v.val just after its
declaration.
Signed-off-by: NClément Lecigne <clement.lecigne@netasq.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

df0bca04

net: Fix page seeking for skb_splice_bits(). · ce3dd395

由 Jarek Poplawski 提交于 2月 12, 2009

struct page walking should be done with proper accessor functions, not
directly.

With doubts from David S. Miller and Herbert Xu.
Signed-off-by: NJarek Poplawski <jarkao2@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ce3dd395

10 2月, 2009 1 次提交

net: Move skbuff symbol exports after each symbol's definition. · b4ac530f

由 David S. Miller 提交于 2月 10, 2009

net/core/skbuff.c is a hodge-podge of symbol export placement.
Some of the exports are right after the definition of the
symbol being exported, others are clumped together into a big
group at the end of the file.

Make things consistent.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b4ac530f

09 2月, 2009 2 次提交

gro: Optimise Ethernet header comparison · aa4b9f53

由 Herbert Xu 提交于 2月 08, 2009

This patch optimises the Ethernet header comparison to use 2-byte
and 4-byte xors instead of memcmp.  In order to facilitate this,
the actual comparison is now carried out by the callers of the
shared dev_gro_receive function.

This has a significant impact when receiving 1500B packets through
10GbE.
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

aa4b9f53

gro: Remember number of held packets instead of counting every time · 4ae5544f

由 Herbert Xu 提交于 2月 08, 2009

This patch prepares for the move of the same_flow checks out of
dev_gro_receive.  As such we need to remember the number of held
packets since doing a loop just to count them every time is silly.
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4ae5544f

07 2月, 2009 1 次提交

net_dma: call dmaengine_get only if NET_DMA enabled · b4bd07c2

由 David S. Miller 提交于 2月 06, 2009

Based upon a patch from Atsushi Nemoto <anemo@mba.ocn.ne.jp>

--------------------
The commit 649274d9 ("net_dma:
acquire/release dma channels on ifup/ifdown") added unconditional call
of dmaengine_get() to net_dma.  The API should be called only if
NET_DMA was enabled.
--------------------
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Acked-by: NDan Williams <dan.j.williams@intel.com>

b4bd07c2

06 2月, 2009 2 次提交

neigh: some entries can be skipped during dumping · efc683fc

由 Gautam Kachroo 提交于 2月 06, 2009

neightbl_dump_info and neigh_dump_table  can skip entries if the
*fill*info functions return an error. This results in an incomplete
dump ((invoked by netlink requests for RTM_GETNEIGHTBL or
RTM_GETNEIGH)

nidx and idx should not be incremented if the current entry was not
placed in the output buffer
Signed-off-by: NGautam Kachroo <gk@aristanetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

efc683fc

gro: Fix frag_list merging on imprecisely split packets · 56035022

由 Herbert Xu 提交于 2月 05, 2009

The previous fix ad0f9904 (gro:
Fix handling of imprecisely split packets) only fixed the case
of frags merging, frag_list merging in the same circumstances
were still broken.

In particular, the packet headers end up in the data stream.

This patch fixes this plus another issue where an imprecisely
split packet header may be read incorrectly (this is mostly
harmless since it'll simply cause the packet to not match and
be rejected for GRO).

Thanks to Emil Tantilov and Jeff Kirsher for helping to track
this down.
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

56035022

05 2月, 2009 2 次提交

net: Reexport sock_alloc_send_pskb · 4cc7f68d

由 Herbert Xu 提交于 2月 04, 2009

The function sock_alloc_send_pskb is completely useless if not
exported since most of the code in it won't be used as is.  In
fact, this code has already been duplicated in the tun driver.

Now that we need accounting in the tun driver, we can in fact
use this function as is.  So this patch marks it for export again.
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4cc7f68d

net: Partially allow skb destructors to be used on receive path · 9a279bcb

由 Herbert Xu 提交于 2月 04, 2009

As it currently stands, skb destructors are forbidden on the
receive path because the protocol end-points will overwrite
any existing destructor with their own.

This is the reason why we have to call skb_orphan in the loopback
driver before we reinject the packet back into the stack, thus
creating a period during which loopback traffic isn't charged
to any socket.

With virtualisation, we have a similar problem in that traffic
is reinjected into the stack without being associated with any
socket entity, thus providing no natural congestion push-back
for those poor folks still stuck with UDP.

Now had we been consistent in telling them that UDP simply has
no congestion feedback, I could just fob them off.  Unfortunately,
we appear to have gone to some length in catering for this on
the standard UDP path, with skb/socket accounting so that has
created a very unhealthy dependency.

Alas habits are difficult to break out of, so we may just have
to allow skb destructors on the receive path.

It turns out that making skb destructors useable on the receive path
isn't as easy as it seems.  For instance, simply adding skb_orphan
to skb_set_owner_r isn't enough.  This is because we assume all
over the IP stack that skb->sk is an IP socket if present.

The new transparent proxy code goes one step further and assumes
that skb->sk is the receiving socket if present.

Now all of this can be dealt with by adding simple checks such
as only treating skb->sk as an IP socket if skb->sk->sk_family
matches.  However, it turns out that for bridging at least we
don't need to do all of this work.

This is of interest because most virtualisation setups use bridging
so we don't actually go through the IP stack on the host (with
the exception of our old nemesis the bridge netfilter, but that's
easily taken care of).

So this patch simply adds skb_orphan to the point just before we
enter the IP stack, but after we've gone through the bridge on the
receive path.  It also adds an skb_orphan to the one place in
netfilter that touches skb->sk/skb->destructor, that is, tproxy.

One word of caution, because of the internal code structure, anyone
wishing to deploy this must use skb_set_owner_w as opposed to
skb_set_owner_r since many functions that create a new skb from
an existing one will invoke skb_set_owner_w on the new skb.
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9a279bcb

01 2月, 2009 2 次提交

gro: Fix handling of imprecisely split packets · ad0f9904

由 Herbert Xu 提交于 2月 01, 2009

The commit 89a1b249edcf9be884e71f92df84d48355c576aa (gro: Avoid
copying headers of unmerged packets) only worked for packets
which are either completely linear, completely non-linear, or
packets which exactly split at the boundary between headers and
payload.

Anything else would cause bits in the header to go missing if
the packet is held by GRO.

This may have broken drivers such as ixgbe.

This patch fixes the places that assumed or only worked with
the above cases.
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ad0f9904

net: Optimize memory usage when splicing from sockets. · 4fb66994

由 Jarek Poplawski 提交于 2月 01, 2009

The recent fix of data corruption when splicing from sockets uses
memory very inefficiently allocating a new page to copy each chunk of
linear part of skb. This patch uses the same page until it's full
(almost) by caching the page in sk_sndmsg_page field.

With changes from David S. Miller <davem@davemloft.net>
Signed-off-by: NJarek Poplawski <jarkao2@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4fb66994

30 1月, 2009 5 次提交

gro: Open-code memcpy in napi_fraginfo_skb · 80595d59

由 Herbert Xu 提交于 1月 29, 2009

This patch optimises napi_fraginfo_skb to only copy the bits
necessary.  We also open-code the memcpy so that the alignment
information is always available to gcc.
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

80595d59

gro: Do not merge paged packets into frag_list · 81705ad1

由 Herbert Xu 提交于 1月 29, 2009

gro: Do not merge paged packets into frag_list

Bigger is not always better :)

It was easy to continue to merged packets into frag_list after the
page array is full.  However, this turns out to be worse than LRO
because frag_list is a much less efficient form of storage than the
page array.  So we're better off stopping the merge and starting
a new entry with an empty page array.

In future we can optimise this further by doing frag_list merging
but making sure that we continue to fill in the page array.
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

81705ad1

gro: Avoid copying headers of unmerged packets · 86911732

由 Herbert Xu 提交于 1月 29, 2009

Unfortunately simplicity isn't always the best.  The fraginfo
interface turned out to be suboptimal.  The problem was quite
obvious.  For every packet, we have to copy the headers from
the frags structure into skb->head, even though for 99% of the
packets this part is immediately thrown away after the merge.

LRO didn't have this problem because it directly read the headers
from the frags structure.

This patch attempts to address this by creating an interface
that allows GRO to access the headers in the first frag without
having to copy it.  Because all drivers that use frags place the
headers in the first frag this optimisation should be enough.
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

86911732

gro: Move common completion code into helpers · 5d0d9be8

由 Herbert Xu 提交于 1月 29, 2009

Currently VLAN still has a bit of common code handling the aftermath
of GRO that's shared with the common path.  This patch moves them
into shared helpers to reduce code duplication.
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5d0d9be8

net: Fix OOPS in skb_seq_read(). · 71b3346d

由 Shyam Iyer 提交于 1月 29, 2009

It oopsd for me in skb_seq_read. addr2line said it was
linux-2.6/net/core/skbuff.c:2228, which is this line:


	while (st->frag_idx < skb_shinfo(st->cur_skb)->nr_frags) {


I added some printks in there and it looks like we hit this:

        } else if (st->root_skb == st->cur_skb &&
                   skb_shinfo(st->root_skb)->frag_list) {
                 st->cur_skb = skb_shinfo(st->root_skb)->frag_list;
                 st->frag_idx = 0;
                 goto next_skb;
        }



Actually I did some testing and added a few printks and found that the
st->cur_skb->data was 0 and hence the ptr used by iscsi_tcp was null.
This caused the kernel panic.

 	if (abs_offset < block_limit) {
-		*data = st->cur_skb->data + abs_offset;
+		*data = st->cur_skb->data + (abs_offset - st->stepped_offset);

I enabled the debug_tcp and with a few printks found that the code did
not go to the next_skb label and could find that the sequence being
followed was this -

It hit this if condition -

        if (st->cur_skb->next) {
                st->cur_skb = st->cur_skb->next;
                st->frag_idx = 0;
                goto next_skb;

And so, now the st pointer is shifted to the next skb whereas actually
it should have hit the second else if first since the data is in the
frag_list.

        else if (st->root_skb == st->cur_skb &&
                 skb_shinfo(st->root_skb)->frag_list) {
                st->cur_skb = skb_shinfo(st->root_skb)->frag_list;
                goto next_skb;
        }

Reversing the two conditions the attached patch fixes the issue for me
on top of Herbert's patches. 
Signed-off-by: NShyam Iyer <shyam_iyer@dell.com>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

71b3346d

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功