1. 06 Oct, 2017 1 commit
  2. 05 Jul, 2017 1 commit
  3. 15 Jun, 2016 1 commit
  4. 11 Jun, 2016 1 commit
  5. 03 Mar, 2016 5 commits
  6. 06 Oct, 2015 4 commits
  7. 01 Oct, 2015 1 commit
  8. 26 Aug, 2015 6 commits
  9. 15 Jul, 2015 1 commit
    • rds: rds_ib_device.refcount overflow · 4fabb594
      Committed by Wengang Wang
      Fixes: 3e0249f9 ("RDS/IB: add refcount tracking to struct rds_ib_device")
      
      A drop of rds_ib_device.refcount is missing when rds_ib_alloc_fmr
      fails (the MR pool running out). This leads to the refcount overflowing.
      
      The BUG_ON complaint at line 117 (see below) is seen. From the vmcore:
      s_ib_rdma_mr_pool_depleted is 2147485544 and rds_ibdev->refcount is -2147475448.
      That is evidence that the MR pool is used up, so rds_ib_alloc_fmr is very likely
      to return ERR_PTR(-EAGAIN).
      
      115 void rds_ib_dev_put(struct rds_ib_device *rds_ibdev)
      116 {
      117         BUG_ON(atomic_read(&rds_ibdev->refcount) <= 0);
      118         if (atomic_dec_and_test(&rds_ibdev->refcount))
      119                 queue_work(rds_wq, &rds_ibdev->free_work);
      120 }
      
      The fix is to drop the refcount when rds_ib_alloc_fmr fails; a sketch of
      the repaired error path follows the sign-offs below.
      Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
      Reviewed-by: Haggai Eran <haggaie@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
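      A minimal sketch of the shape of the fix, assuming the failing path sits
      in rds_ib_get_mr; apart from rds_ib_dev_put (quoted above), the helper
      names and surrounding logic are condensed assumptions, not the exact hunk:

      void *rds_ib_get_mr(struct scatterlist *sg, unsigned long nents,
                          struct rds_sock *rs, u32 *key_ret)
      {
              struct rds_ib_device *rds_ibdev;
              struct rds_ib_mr *ibmr;

              /* the lookup takes a reference on the device */
              rds_ibdev = rds_ib_get_device(rs->rs_bound_addr);
              if (!rds_ibdev)
                      return ERR_PTR(-ENODEV);

              ibmr = rds_ib_alloc_fmr(rds_ibdev);
              if (IS_ERR(ibmr)) {
                      /* the fix: balance the reference taken above on the
                       * error path instead of leaking it each time the
                       * pool is depleted */
                      rds_ib_dev_put(rds_ibdev);
                      return ibmr;
              }

              /* hypothetical helper standing in for the FMR mapping step */
              *key_ret = rds_ib_map_and_get_rkey(rds_ibdev, ibmr, sg, nents);
              return ibmr;
      }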
  10. 27 Aug, 2014 1 commit
  11. 16 Sep, 2011 1 commit
  12. 01 Feb, 2011 1 commit
    • rds/ib: use system_wq instead of rds_ib_fmr_wq · c534a107
      Committed by Tejun Heo
      With cmwq, there's no reason to use dedicated rds_ib_fmr_wq - it's not
      in the memory reclaim path and the maximum number of concurrent work
      items is bound by the number of devices.  Drop it and use system_wq
      instead.  This makes rds_ib_fmr_init/exit() noops.  Both removed
      (see the sketch below).
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Andy Grover <andy.grover@oracle.com>
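      A hedged before/after sketch of the conversion; the call site and the
      10-jiffy delay are illustrative, not the exact hunks:

      /* before: a dedicated workqueue, created and destroyed by
       * rds_ib_fmr_init()/exit():
       *
       *     rds_ib_fmr_wq = create_workqueue("rds_ib_fmr");
       *     queue_delayed_work(rds_ib_fmr_wq, &pool->flush_worker, 10);
       *
       * after: no setup or teardown at all - the shared cmwq pool
       * provides the concurrency */
      queue_delayed_work(system_wq, &pool->flush_worker, 10);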
  13. 21 Oct, 2010 1 commit
  14. 20 Sep, 2010 1 commit
  15. 09 Sep, 2010 14 commits
    • RDS/IB: protect the list of IB devices · ea819867
      Committed by Zach Brown
      The RDS IB device list wasn't protected by any locking.  Traversal in
      both the get_mr and FMR flushing paths could race with addition and
      removal.
      
      List manipulation is done with RCU primitives and is protected by the
      write side of a rwsem.  The list traversal in the get_mr fast path is
      protected by an RCU read-side critical section.  The FMR list traversal
      is more problematic because it can block while traversing the list.  We
      protect this with the read side of the rwsem (see the sketch below).
      Signed-off-by: Zach Brown <zach.brown@oracle.com>
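      A hedged sketch of that locking scheme; the lock, list, and helper names
      are assumptions based on the description rather than the exact patch:

      static DECLARE_RWSEM(rds_ib_devices_lock);
      static LIST_HEAD(rds_ib_devices);

      /* writers take the rwsem write side and use RCU-aware list ops,
       * so both classes of readers stay safe */
      static void rds_ib_add_device(struct rds_ib_device *rds_ibdev)
      {
              down_write(&rds_ib_devices_lock);
              list_add_tail_rcu(&rds_ibdev->list, &rds_ib_devices);
              up_write(&rds_ib_devices_lock);
      }

      /* the FMR flush traversal can sleep, so it holds the rwsem read
       * side instead of an RCU read-side critical section */
      static void rds_ib_flush_all_pools(void)
      {
              struct rds_ib_device *rds_ibdev;

              down_read(&rds_ib_devices_lock);
              list_for_each_entry(rds_ibdev, &rds_ib_devices, list)
                      rds_ib_flush_mr_pool(rds_ibdev->mr_pool, 0);
              up_read(&rds_ib_devices_lock);
      }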
    • RDS: flush fmrs before allocating new ones · 8576f374
      Committed by Chris Mason
      Flushing FMRs is somewhat expensive, and is currently kicked off when
      the interrupt handler notices that we are getting low.  The result of
      this is that FMR flushing only happens on the interrupt CPUs.
      
      This spreads the load more effectively by triggering flushes just before
      we allocate a new FMR (see the sketch below).
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
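      A hedged sketch of the idea; the threshold test, field names, and the
      reuse helper are illustrative assumptions:

      static struct rds_ib_mr *rds_ib_alloc_fmr(struct rds_ib_device *rds_ibdev)
      {
              struct rds_ib_mr_pool *pool = rds_ibdev->mr_pool;

              /* instead of waiting for the interrupt handler to notice
               * the pool is low, flush right before allocating, on
               * whatever CPU the allocating process is running */
              if (atomic_read(&pool->item_count) + 1 > pool->max_items_soft)
                      rds_ib_flush_mr_pool(pool, 0);

              /* hypothetical helper: reuse a recycled FMR or allocate one */
              return rds_ib_reuse_or_alloc_fmr(pool);
      }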
    • RDS: remove __init and __exit annotation · ef87b7ea
      Committed by Zach Brown
      The trivial amount of memory saved isn't worth the cost of dealing with section
      mismatches.
      Signed-off-by: Zach Brown <zach.brown@oracle.com>
    • RDS/IB: create a work queue for FMR flushing · 515e079d
      Committed by Zach Brown
      This patch moves the FMR flushing work into its own multi-threaded work
      queue (see the sketch below).  This is to maintain performance in
      preparation for returning the main krdsd work queue back to a single
      threaded work queue to avoid deep-rooted concurrency bugs.
      
      This is also good because it further separates FMRs, which might be removed
      some day, from the rest of the code base.
      Signed-off-by: Zach Brown <zach.brown@oracle.com>
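      A hedged sketch of the setup this implies; create_workqueue was the
      multi-threaded (one worker per CPU) variant of the era, and these are
      the init/exit hooks that a later commit (c534a107, listed above) turns
      into noops:

      static struct workqueue_struct *rds_ib_fmr_wq;

      int rds_ib_fmr_init(void)
      {
              rds_ib_fmr_wq = create_workqueue("rds_ib_fmr");
              if (!rds_ib_fmr_wq)
                      return -ENOMEM;
              return 0;
      }

      void rds_ib_fmr_exit(void)
      {
              destroy_workqueue(rds_ib_fmr_wq);
      }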
    • RDS/IB: destroy connections on rmmod · 8aeb1ba6
      Committed by Zach Brown
      IB connections were not being destroyed during rmmod.
      
      First, the IB device removal callback was recently changed to disconnect
      connections that used the removed device rather than destroy them.  So
      connections still tied to devices during rmmod were not being destroyed.
      
      Second, rds_ib_destroy_nodev_conns() was being called before connections
      were disassociated from their devices, so it would almost never find
      connections on the nodev list.
      
      We first get rid of rds_ib_destroy_conns(), which is no longer called,
      refactor the existing caller into the main body of the function, and get
      rid of the list and lock wrappers.
      
      Then we call rds_ib_destroy_nodev_conns() *after* ib_unregister_client() has
      removed the IB device from all the conns and put the conns on the nodev list.
      
      The result is that IB connections are destroyed by rmmod (see the
      ordering sketch below).
      Signed-off-by: Zach Brown <zach.brown@oracle.com>
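      A hedged sketch of the resulting teardown ordering; the exit path is
      condensed and the surrounding calls are assumptions:

      void rds_ib_exit(void)
      {
              /* unregistering the client runs the removal callback for
               * every device, disassociating conns from their devices
               * and moving them onto the nodev list */
              ib_unregister_client(&rds_ib_client);

              /* only now does the nodev list actually hold the IB
               * connections, so destroying them here finds them */
              rds_ib_destroy_nodev_conns();

              rds_trans_unregister(&rds_ib_transport);
      }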
    • RDS: whitespace · c9455d99
      Committed by Andy Grover
    • RDS: use delayed work for the FMR flushes · 7a0ff5db
      Committed by Chris Mason
      Using a delayed work queue helps us make sure a healthy number of FMRs
      have queued up over the limit before we flush.  It makes for a large
      improvement in RDMA iops (see the sketch below).
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
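      A hedged before/after sketch; the queue name and the 10-jiffy delay are
      illustrative of the change, not the exact hunk:

      /* before: the flush was queued to run as soon as possible
       *
       *     queue_work(rds_wq, &pool->flush_worker);
       *
       * after: a short delay lets FMRs accumulate past the limit, so
       * each flush batch is larger and the per-FMR cost is amortized */
      INIT_DELAYED_WORK(&pool->flush_worker, rds_ib_mr_pool_flush_worker);
      queue_delayed_work(rds_wq, &pool->flush_worker, 10);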
    • rds: recycle FMRs through lockless lists · 6fa70da6
      Committed by Chris Mason
      FMR allocation and recycling is performance critical and fairly lock
      intensive.  The current code has a per connection lock that all
      processes bang on and it becomes a major bottleneck on large systems.
      
      This changes things to use a number of cmpxchg based lists instead,
      allowing us to go through the whole FMR lifecycle without locking inside
      RDS.
      
      Zach Brown pointed out that our usage of cmpxchg for xlist removal is
      racy if someone manages to remove and re-add an FMR struct to the list
      while another CPU still sees the FMR's address at the head of the list
      (the classic ABA problem).
      
      The second CPU might assume the list hasn't changed when in fact any
      number of operations might have happened in between the deletion and
      reinsertion.
      
      This commit maintains a per cpu count of CPUs that are currently
      in xlist removal, and establishes a grace period to make sure that
      nobody can see an entry we have just removed from the list (see the
      sketch below).
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
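      A hedged sketch of the lockless push and the hazard described above;
      the real xlist code also supports pushing whole chains, so this is only
      the minimal single-entry shape:

      struct xlist_head {
              struct xlist_head *next;
      };

      /* lock-free push: point the new entry at the current head, then try
       * to swing the head pointer over with cmpxchg, retrying on contention */
      static void xlist_add(struct xlist_head *entry, struct xlist_head *head)
      {
              struct xlist_head *cur;

              do {
                      cur = head->next;
                      entry->next = cur;
              } while (cmpxchg(&head->next, cur, entry) != cur);
      }

      /* lock-free pop: this is the racy side -- between reading cur and the
       * cmpxchg, cur could be removed and re-added (ABA), which is why the
       * commit adds a per-CPU "in removal" count and a grace period */
      static struct xlist_head *xlist_del_head(struct xlist_head *head)
      {
              struct xlist_head *cur, *next;

              do {
                      cur = head->next;
                      if (!cur)
                              return NULL;
                      next = cur->next;
              } while (cmpxchg(&head->next, cur, next) != cur);

              return cur;
      }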
    • RDS/IB: add refcount tracking to struct rds_ib_device · 3e0249f9
      Committed by Zach Brown
      The RDS IB client .remove callback used to free the rds_ibdev for the given
      device unconditionally.  This could race with other users of the struct.
      This patch adds refcounting so that we only free the rds_ibdev once all of
      its users are done (see the get/put sketch below).
      
      Many rds_ibdev users are tied to connections.  We give the connection a
      reference and change these users to reference the device in the connection
      instead of looking it up in the IB client data.  The only user of the IB client
      data remaining is the first lookup of the device as connections are built up.
      
      Incrementing the reference count of a device found in the IB client data could
      race with final freeing so we use an RCU grace period to make sure that freeing
      won't happen until those lookups are done.
      
      MRs need the rds_ibdev to get at the pool that they're freed into.  They
      exist outside a connection and many MRs can reference different devices
      from one socket, so it was natural to have each MR hold a reference.  MR
      refs can be dropped from interrupt handlers and final device teardown can
      block, so we push it off to a work struct.  Pool teardown had to be fixed
      to cancel its pending work instead of deadlocking waiting for all queued
      work, including itself, to finish.
      
      MRs get their reference from the global device list, which gets a reference.
      It is left unprotected by locks and remains racy.  A simple global lock would
      be a significant bottleneck.  More scalable (complicated) locking should be
      done carefully in a later patch.
      Signed-off-by: Zach Brown <zach.brown@oracle.com>
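      The get/put pair this introduces has the following shape; the put side
      is quoted verbatim in the 2015 overflow fix earlier in this list, and the
      get side is its assumed counterpart:

      static inline void rds_ib_dev_get(struct rds_ib_device *rds_ibdev)
      {
              atomic_inc(&rds_ibdev->refcount);
      }

      void rds_ib_dev_put(struct rds_ib_device *rds_ibdev)
      {
              BUG_ON(atomic_read(&rds_ibdev->refcount) <= 0);
              /* final teardown can block, and the last put can come from
               * interrupt context, so freeing is pushed to a work struct */
              if (atomic_dec_and_test(&rds_ibdev->refcount))
                      queue_work(rds_wq, &rds_ibdev->free_work);
      }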
    • rds: Use RCU for the bind lookup searches · 38a4e5e6
      Committed by Chris Mason
      The RDS bind lookups are somewhat expensive in terms of CPU
      time and locking overhead.  This commit changes them into a
      faster RCU-based hash table instead of the rbtrees they were using
      before (a lookup sketch follows below).
      
      On large NUMA systems it is a significant improvement.
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
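      A hedged sketch of what an RCU-ized bind lookup looks like, written in
      the modern hlist_for_each_entry_rcu idiom; the bucket count, hash, and
      field names are assumptions:

      #define BIND_HASH_SIZE 1024
      static struct hlist_head bind_hash_table[BIND_HASH_SIZE];

      static struct hlist_head *hash_to_bucket(__be32 addr, __be16 port)
      {
              return bind_hash_table +
                     (jhash_2words((__force u32)addr, (__force u32)port, 0) &
                      (BIND_HASH_SIZE - 1));
      }

      static struct rds_sock *rds_bind_lookup(__be32 addr, __be16 port)
      {
              struct rds_sock *rs;
              struct hlist_head *head = hash_to_bucket(addr, port);

              rcu_read_lock();
              hlist_for_each_entry_rcu(rs, head, rs_bound_node) {
                      if (rs->rs_bound_addr == addr &&
                          rs->rs_bound_port == port) {
                              rds_sock_addref(rs);    /* pin before leaving RCU */
                              rcu_read_unlock();
                              return rs;
                      }
              }
              rcu_read_unlock();
              return NULL;
      }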
    • RDS/IB: add _to_node() macros for numa and use {k,v}malloc_node() · e4c52c98
      Committed by Andy Grover
      Allocate send/recv rings in memory that is node-local to the HCA.
      This significantly helps performance (see the sketch below).
      Signed-off-by: Andy Grover <andy.grover@oracle.com>
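      A hedged sketch of the pattern; the ibdev_to_node helper shown is an
      assumption modeled on the commit title, and exact definitions may differ:

      #include <linux/vmalloc.h>

      /* hypothetical helper: NUMA node of the device behind the HCA */
      #define ibdev_to_node(ibdev) dev_to_node((ibdev)->dma_device)

      /* allocate the receive ring on the HCA's node rather than wherever
       * the calling CPU happens to be */
      ic->i_recvs = vmalloc_node(ic->i_recv_ring.w_nr *
                                 sizeof(struct rds_ib_recv_work),
                                 ibdev_to_node(ic->i_cm_id->device));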
    • 4a81802b
    • rds: rcu-ize rds_ib_get_device() · 764f2dd9
      Committed by Chris Mason
      rds_ib_get_device is called very often as we turn an
      IP address into a corresponding device structure.  It currently
      takes a global spinlock as it walks different lists to find active
      devices.
      
      This commit changes the lists over to RCU, which isn't very complex
      because they are not updated very often at all (see the sketch below).
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
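      A hedged sketch of the RCU-ized lookup; the list and field names follow
      the RDS IB code of the era, but treat the details as assumptions:

      struct rds_ib_device *rds_ib_get_device(__be32 ipaddr)
      {
              struct rds_ib_device *rds_ibdev;
              struct rds_ib_ipaddr *i_ipaddr;

              rcu_read_lock();
              list_for_each_entry_rcu(rds_ibdev, &rds_ib_devices, list) {
                      list_for_each_entry_rcu(i_ipaddr,
                                              &rds_ibdev->ipaddr_list, list) {
                              if (i_ipaddr->ipaddr == ipaddr) {
                                      /* pin the device before leaving the
                                       * RCU read-side critical section */
                                      atomic_inc(&rds_ibdev->refcount);
                                      rcu_read_unlock();
                                      return rds_ibdev;
                              }
                      }
              }
              rcu_read_unlock();
              return NULL;
      }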
    • RDS: Implement atomic operations · 15133f6e
      Committed by Andy Grover
      Implement a CMSG-based interface to do FADD (fetch-and-add) and CSWP
      (compare-and-swap) ops (see the userspace sketch below).
      
      Alter send routines to handle atomic ops.
      
      Add atomic counters to stats.
      
      Add xmit_atomic() to struct rds_transport
      
      Inline rds_ib_send_unmap_rdma into unmap_rm
      Signed-off-by: Andy Grover <andy.grover@oracle.com>
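      A hedged userspace sketch of issuing a FADD over this interface; the
      struct rds_atomic_args field names follow the current <linux/rds.h>
      uapi, which may differ from the layout in the patch as first merged:

      #include <string.h>
      #include <stdint.h>
      #include <sys/socket.h>
      #include <netinet/in.h>
      #include <linux/rds.h>

      #ifndef SOL_RDS
      #define SOL_RDS 276
      #endif

      /* fetch-and-add on a remote MR named by an RDMA cookie; the pre-add
       * value is written back to local_addr when the op completes */
      static ssize_t rds_fadd(int fd, struct sockaddr_in *dest,
                              rds_rdma_cookie_t cookie,
                              uint64_t remote_addr, uint64_t local_addr,
                              uint64_t add)
      {
              struct rds_atomic_args args;
              char cbuf[CMSG_SPACE(sizeof(args))];
              struct msghdr msg;
              struct cmsghdr *cmsg;

              memset(&args, 0, sizeof(args));
              args.cookie = cookie;
              args.remote_addr = remote_addr;
              args.local_addr = local_addr;
              args.fadd.add = add;

              memset(cbuf, 0, sizeof(cbuf));
              memset(&msg, 0, sizeof(msg));
              msg.msg_name = dest;
              msg.msg_namelen = sizeof(*dest);
              msg.msg_control = cbuf;
              msg.msg_controllen = sizeof(cbuf);

              cmsg = CMSG_FIRSTHDR(&msg);
              cmsg->cmsg_level = SOL_RDS;
              cmsg->cmsg_type = RDS_CMSG_ATOMIC_FADD;
              cmsg->cmsg_len = CMSG_LEN(sizeof(args));
              memcpy(CMSG_DATA(cmsg), &args, sizeof(args));

              return sendmsg(fd, &msg, 0);
      }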