提交 · fc445084f185cdd877bec323bfe724a361e2292a · openeuler / raspberrypi-kernel

09 9月, 2010 9 次提交

RDS: Explicitly allocate rm in sendmsg() · fc445084

由 Andy Grover 提交于 1月 12, 2010

r_m_copy_from_user used to allocate the rm as well as kernel
buffers for the data, and then copy the data in. Now, sendmsg()
allocates the rm, although the data buffer alloc still happens
in r_m_copy_from_user.

SGs are still allocated with rm, but now r_m_alloc_sgs() is
used to reserve them. This allows multiple SG lists to be
allocated from the one rm -- this is important once we also
want to alloc our rdma sgl from this pool.
Signed-off-by: NAndy Grover <andy.grover@oracle.com>

fc445084

RDS: cleanup/fix rds_rdma_unuse · 3ef13f3c

由 Andy Grover 提交于 1月 12, 2010

First, it looks to me like the atomic_inc is wrong.
We should be decrementing refcount only once here, no? It's
already being done by the mr_put() at the end.

Second, simplify the logic a bit by bailing early (with a warning)
if !mr.
Signed-off-by: NAndy Grover <andy.grover@oracle.com>

3ef13f3c

RDS: break out rdma and data ops into nested structs in rds_message · e779137a

由 Andy Grover 提交于 1月 12, 2010

Clearly separate rdma-related variables in rm from data-related ones.
This is in anticipation of adding atomic support.
Signed-off-by: NAndy Grover <andy.grover@oracle.com>

e779137a

A
RDS: cleanup: remove "== NULL"s and "!= NULL"s in ptr comparisons · 8690bfa1
由 Andy Grover 提交于 1月 12, 2010
```
Favor "if (foo)" style over "if (foo != NULL)".
Signed-off-by: NAndy Grover <andy.grover@oracle.com>
```
8690bfa1
A
RDS: move rds_shutdown_worker impl. to rds_conn_shutdown · 2dc39357
由 Andy Grover 提交于 6月 11, 2010
```
This fits better in connection.c, rather than threads.c.
Signed-off-by: NAndy Grover <andy.grover@oracle.com>
```
2dc39357

RDS: Fix locking in send on m_rs_lock · 9de0864c

由 Andy Grover 提交于 3月 29, 2010

Do not nest m_rs_lock under c_lock

Disable interrupts in {rdma,atomic}_send_complete
Signed-off-by: NAndy Grover <andy.grover@oracle.com>

9de0864c

RDS: Rewrite rds_send_drop_to() for clarity · 7c82eaf0

由 Andy Grover 提交于 2月 19, 2010

This function has been the source of numerous bugs; it's just
too complicated. Simplified to nest spinlocks cleanly within
the second loop body, and kick out early if there are no
rms to drop.

This will be a little slower because conn lock is grabbed for
each entry instead of "caching" the lock across rms, but this
should be entirely irrelevant to fastpath performance.
Signed-off-by: NAndy Grover <andy.grover@oracle.com>

7c82eaf0

RDS: Fix corrupted rds_mrs · 35b52c70

由 Tina Yang 提交于 4月 01, 2010

On second look at this bug (OFED #2002), it seems that the
collision is not with the retransmission queue (packet acked
by the peer), but with the local send completion.  A theoretical
sequence of events (from time t0 to t3) is thought to be as
follows,

Thread #1
t0:
    sock_release
    rds_release
    rds_send_drop_to /* wait on send completion */
t2:
    rds_rdma_drop_keys()   /* destroy & free all mrs */

Thread #2
t1:
    rds_ib_send_cq_comp_handler
    rds_ib_send_unmap_rm
    rds_message_unmapped   /* wake up #1 @ t0 */
t3:
    rds_message_put
    rds_message_purge
    rds_mr_put   /* memory corruption detected */

The problem with the rds_rdma_drop_keys() is it could
remove a mr's refcount more than its due (i.e. repeatedly
as long as it still remains in the tree (mr->r_refcount > 0)).
Theoretically it should remove only one reference - reference
by the tree.

        /* Release any MRs associated with this socket */
        while ((node = rb_first(&rs->rs_rdma_keys))) {
                mr = container_of(node, struct rds_mr, r_rb_node);
                if (mr->r_trans == rs->rs_transport)
                        mr->r_invalidate = 0;
                rds_mr_put(mr);
        }

I think the correct way of doing it is to remove the mr from
the tree and rds_destroy_mr it first, then a rds_mr_put()
to decrement its reference count by one.  Whichever thread
holds the last reference will free the mr via rds_mr_put().
Signed-off-by: NTina Yang <tina.yang@oracle.com>
Signed-off-by: NAndy Grover <andy.grover@oracle.com>

35b52c70

RDS: Fix BUG_ONs to not fire when in a tasklet · 9e2effba

由 Andy Grover 提交于 3月 12, 2010

in_interrupt() is true in softirqs. The BUG_ONs are supposed
to check for if irqs are disabled, so we should use
BUG_ON(irqs_disabled()) instead, duh.
Signed-off-by: NAndy Grover <andy.grover@oracle.com>

9e2effba

19 8月, 2010 1 次提交

rds: fix a leak of kernel memory · f037590f

由 Eric Dumazet 提交于 8月 16, 2010

struct rds_rdma_notify contains a 32 bits hole on 64bit arches,
make sure it is zeroed before copying it to user.
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
CC: Andy Grover <andy.grover@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f037590f

29 5月, 2010 1 次提交

net/rds: Add missing mutex_unlock · 5daf47bb

由 Julia Lawall 提交于 5月 26, 2010

Add a mutex_unlock missing on the error path.  In each case, whenever the
label out is reached from elsewhere in the function, mutex is not locked.

The semantic match that finds this problem is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@@
expression E1;
@@

* mutex_lock(E1);
  <+... when != E1
  if (...) {
    ... when != E1
*   return ...;
  }
  ...+>
* mutex_unlock(E1);
// </smpl>
Signed-off-by: NJulia Lawall <julia@diku.dk>
Reviewed-by: NZach Brown <zach.brown@oracle.com>
Acked-by: NAndy Grover <andy.grover@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5daf47bb

18 5月, 2010 1 次提交

net: Remove unnecessary semicolons after switch statements · ccbd6a5a

由 Joe Perches 提交于 5月 14, 2010

Also added an explicit break; to avoid
a fallthrough in net/ipv4/tcp_input.c
Signed-off-by: NJoe Perches <joe@perches.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ccbd6a5a

23 4月, 2010 1 次提交

rdma: potential ERR_PTR dereference · 24acc689

由 Dan Carpenter 提交于 4月 21, 2010

In the original code, the "goto out" calls "rdma_destroy_id(cm_id);"
That isn't needed here and would cause problems because "cm_id" is an
ERR_PTR.  The new code just returns directly.
Signed-off-by: NDan Carpenter <error27@gmail.com>
Acked-by: NAndy Grover <andy.grover@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

24acc689

21 4月, 2010 1 次提交

net: sk_sleep() helper · aa395145

由 Eric Dumazet 提交于 4月 20, 2010

Define a new function to return the waitqueue of a "struct sock".

static inline wait_queue_head_t *sk_sleep(struct sock *sk)
{
	return sk->sk_sleep;
}

Change all read occurrences of sk_sleep by a call to this function.

Needed for a future RCU conversion. sk_sleep wont be a field directly
available.
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

aa395145

30 3月, 2010 1 次提交

include cleanup: Update gfp.h and slab.h includes to prepare for breaking... · 5a0e3ad6

由 Tejun Heo 提交于 3月 24, 2010

include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h

percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files.  percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed.  Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability.  As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

  http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
  only the necessary includes are there.  ie. if only gfp is used,
  gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
  blocks and try to put the new include such that its order conforms
  to its surrounding.  It's put in the include block which contains
  core kernel includes, in the same order that the rest are ordered -
  alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
  doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
  because the file doesn't have fitting include block), it prints out
  an error message indicating which .h file needs to be added to the
  file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
   over 4000 files, deleting around 700 includes and adding ~480 gfp.h
   and ~3000 slab.h inclusions.  The script emitted errors for ~400
   files.

2. Each error was manually checked.  Some didn't need the inclusion,
   some needed manual addition while adding it to implementation .h or
   embedding .c file was more appropriate for others.  This step added
   inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
   from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
   e.g. lib/decompress_*.c used malloc/free() wrappers around slab
   APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
   editing them as sprinkling gfp.h and slab.h inclusions around .h
   files could easily lead to inclusion dependency hell.  Most gfp.h
   inclusion directives were ignored as stuff from gfp.h was usually
   wildly available and often used in preprocessor macros.  Each
   slab.h inclusion directive was examined and added manually as
   necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
   were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
   distributed build env didn't work with gcov compiles) and a few
   more options had to be turned off depending on archs to make things
   build (like ipr on powerpc/64 which failed due to missing writeq).

   * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
   * powerpc and powerpc64 SMP allmodconfig
   * sparc and sparc64 SMP allmodconfig
   * ia64 SMP allmodconfig
   * s390 SMP allmodconfig
   * alpha SMP allmodconfig
   * um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
   a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.
Signed-off-by: NTejun Heo <tj@kernel.org>
Guess-its-ok-by: NChristoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

5a0e3ad6

25 3月, 2010 1 次提交

rds: cleanup: remove unneeded variable · 18062ca9

由 Dan Carpenter 提交于 3月 24, 2010

We never use "sk" so this patch removes it.
Signed-off-by: NDan Carpenter <error27@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

18062ca9

17 3月, 2010 13 次提交

RDS: Enable per-cpu workqueue threads · 768bbedf

由 Tina Yang 提交于 3月 11, 2010

Create per-cpu workqueue threads instead of a single
krdsd thread. This is a step towards better scalability.
Signed-off-by: NAndy Grover <andy.grover@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

768bbedf

RDS: Do not call set_page_dirty() with irqs off · 561c7df6

由 Andy Grover 提交于 3月 11, 2010

set_page_dirty() unconditionally re-enables interrupts, so
if we call it with irqs off, they will be on after the call,
and that's bad. This patch moves the call after we've re-enabled
interrupts in send_drop_to(), so it's safe.

Also, add BUG_ONs to let us know if we ever do call set_page_dirty
with interrupts off.
Signed-off-by: NAndy Grover <andy.grover@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

561c7df6

RDS: Properly unmap when getting a remote access error · 450d06c0

由 Sherman Pun 提交于 3月 11, 2010

If the RDMA op has aborted with a remote access error,
in addition to what we already do (tell userspace it has
completed with an error) also unmap it and put() the rm.

Otherwise, hangs may occur on arches that track maps and
will not exit without proper cleanup.
Signed-off-by: NAndy Grover <andy.grover@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

450d06c0

RDS: only put sockets that have seen congestion on the poll_waitq · b98ba52f

由 Andy Grover 提交于 3月 11, 2010

rds_poll_waitq's listeners will be awoken if we receive a congestion
notification. Bad performance may result because *all* polled sockets
contend for this single lock. However, it should not be necessary to
wake pollers when a congestion update arrives if they have never
experienced congestion, and not putting these on the waitq will
hopefully greatly reduce contention.
Signed-off-by: NAndy Grover <andy.grover@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b98ba52f

RDS: Fix locking in rds_send_drop_to() · 550a8002

由 Tina Yang 提交于 3月 11, 2010

It seems rds_send_drop_to() called
__rds_rdma_send_complete(rs, rm, RDS_RDMA_CANCELED)
with only rds_sock lock, but not rds_message lock. It raced with
other threads that is attempting to modify the rds_message as well,
such as from within rds_rdma_send_complete().
Signed-off-by: NTina Yang <tina.yang@oracle.com>
Signed-off-by: NAndy Grover <andy.grover@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

550a8002

RDS: Turn down alarming reconnect messages · 97069788

由 Andy Grover 提交于 3月 11, 2010

RDS's error messages when a connection goes down are a little
extreme. A connection may go down, and it will be re-established,
and everything is fine. This patch links these messages through
rdsdebug(), instead of to printk directly.
Signed-off-by: NAndy Grover <andy.grover@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

97069788

RDS: Workaround for in-use MRs on close causing crash · 571c02fa

由 Andy Grover 提交于 3月 11, 2010

if a machine is shut down without closing sockets properly, and
freeing all MRs, then a BUG_ON will bring it down. This patch
changes these to WARN_ONs -- leaking MRs is not fatal (although
not ideal, and there is more work to do here for a proper fix.)
Signed-off-by: NAndy Grover <andy.grover@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

571c02fa

RDS: Fix send locking issue · 048c15e6

由 Tina Yang 提交于 3月 11, 2010

Fix a deadlock between rds_rdma_send_complete() and
rds_send_remove_from_sock() when rds socket lock and
rds message lock are acquired out-of-order.
Signed-off-by: NTina Yang <Tina.Yang@oracle.com>
Signed-off-by: NAndy Grover <andy.grover@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

048c15e6

RDS: Fix congestion issues for loopback · 2e7b3b99

由 Andy Grover 提交于 3月 11, 2010

We have two kinds of loopback: software (via loop transport)
and hardware (via IB). sw is used for 127.0.0.1, and doesn't
support rdma ops. hw is used for sends to local device IPs,
and supports rdma. Both are used in different cases.

For both of these, when there is a congestion map update, we
want to call rds_cong_map_updated() but not actually send
anything -- since loopback local and foreign congestion maps
point to the same spot, they're already in sync.

The old code never called sw loop's xmit_cong_map(),so
rds_cong_map_updated() wasn't being called for it. sw loop
ports would not work right with the congestion monitor.

Fixing that meant that hw loopback now would send congestion maps
to itself. This is also undesirable (racy), so we check for this
case in the ib-specific xmit code.
Signed-off-by: NAndy Grover <andy.grover@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2e7b3b99

RDS/TCP: Wait to wake thread when write space available · 8e82376e

由 Andy Grover 提交于 3月 11, 2010

Instead of waking the send thread whenever any send space is available,
wait until it is at least half empty. This is modeled on how
sock_def_write_space() does it, and may help to minimize context
switches.
Signed-off-by: NAndy Grover <andy.grover@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

8e82376e

RDS: update copy_to_user state in tcp transport · b075cfdb

由 Andy Grover 提交于 3月 11, 2010

Other transports use rds_page_copy_user, which updates our
s_copy_to_user counter. TCP doesn't, so it needs to explicity
call rds_stats_add().
Reported-by: NRichard Frank <richard.frank@oracle.com>
Signed-off-by: NAndy Grover <andy.grover@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b075cfdb

RDS: sendmsg() should check sndtimeo, not rcvtimeo · 1123fd73

由 Andy Grover 提交于 3月 11, 2010

Most likely cut n paste error - sendmsg() was checking sock_rcvtimeo.
Signed-off-by: NAndy Grover <andy.grover@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1123fd73

RDS: Do not BUG() on error returned from ib_post_send · 735f61e6

由 Andy Grover 提交于 3月 11, 2010

BUGging on a runtime error code should be avoided. This
patch also eliminates all other BUG()s that have no real
reason to exist.
Signed-off-by: NAndy Grover <andy.grover@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

735f61e6

04 2月, 2010 1 次提交

net/rds: remove uses of NIPQUAD, use %pI4 · 6884b348

由 Joe Perches 提交于 2月 02, 2010

Signed-off-by: NJoe Perches <joe@perches.com>
Cc: Andy Grover <andy.grover@oracle.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

6884b348

30 11月, 2009 1 次提交

net: Move && and || to end of previous line · f64f9e71

由 Joe Perches 提交于 11月 29, 2009

Not including net/atm/

Compiled tested x86 allyesconfig only
Added a > 80 column line or two, which I ignored.
Existing checkpatch plaints willfully, cheerfully ignored.
Signed-off-by: NJoe Perches <joe@perches.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f64f9e71

20 11月, 2009 1 次提交

RDMA/cm: fix loopback address support · 6f8372b6

由 Sean Hefty 提交于 11月 19, 2009

The RDMA CM is intended to support the use of a loopback address
when establishing a connection; however, the behavior of the CM
when loopback addresses are used is confusing and does not always
work, depending on whether loopback was specified by the server,
the client, or both.

The defined behavior of rdma_bind_addr is to associate an RDMA
device with an rdma_cm_id, as long as the user specified a non-
zero address.  (ie they weren't just trying to reserve a port)
Currently, if the loopback address is passed to rdam_bind_addr,
no device is associated with the rdma_cm_id.  Fix this.

If a loopback address is specified by the client as the destination
address for a connection, it will fail to establish a connection.
This is true even if the server is listing across all addresses or
on the loopback address itself.  The issue is that the server tries
to translate the IP address carried in the REQ message to a local
net_device address, which fails.  The translation is not needed in
this case, since the REQ carries the actual HW address that should
be used.

Finally, cleanup loopback support to be more transport neutral.
Replace separate calls to get/set the sgid and dgid from the
device address to a single call that behaves correctly depending
on the format of the device address.  And support both IPv4 and
IPv6 address formats.
Signed-off-by: NSean Hefty <sean.hefty@intel.com>

[ Fixed RDS build by s/ib_addr_get/rdma_addr_get/  - Roland ]
Signed-off-by: NRoland Dreier <rolandd@cisco.com>

6f8372b6

19 11月, 2009 1 次提交

sysctl: Drop & in front of every proc_handler. · 6d456111

由 Eric W. Biederman 提交于 11月 16, 2009

For consistency drop & in front of every proc_handler.  Explicity
taking the address is unnecessary and it prevents optimizations
like stubbing the proc_handlers to NULL.

Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>

6d456111

12 11月, 2009 1 次提交

sysctl net: Remove unused binary sysctl code · f8572d8f

由 Eric W. Biederman 提交于 11月 05, 2009

Now that sys_sysctl is a compatiblity wrapper around /proc/sys
all sysctl strategy routines, and all ctl_name and strategy
entries in the sysctl tables are unused, and can be
revmoed.

In addition neigh_sysctl_register has been modified to no longer
take a strategy argument and it's callers have been modified not
to pass one.

Cc: "David Miller" <davem@davemloft.net>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: netdev@vger.kernel.org
Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>

f8572d8f

06 11月, 2009 1 次提交

net: pass kern to net_proto_family create function · 3f378b68

由 Eric Paris 提交于 11月 05, 2009

The generic __sock_create function has a kern argument which allows the
security system to make decisions based on if a socket is being created by
the kernel or by userspace. This patch passes that flag to the
net_proto_family specific create function, so it can do the same thing.
Signed-off-by: NEric Paris <eparis@redhat.com>
Acked-by: NArnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3f378b68

31 10月, 2009 5 次提交

RDS/IB+IW: Move recv processing to a tasklet · d521b63b