Commits · 00f1f36ffa29a6b8e0529770bce2a44585ab3af0 · openeuler / raspberrypi-kernel

22 Mar, 2012 24 commits

Committed by Alex Elder on Feb 07, 2012

A few blocks of code are rearranged a bit here:
    - In rbd_header_from_disk():
	- Don't bother computing snap_count until we're sure the
	  on-disk header starts with a good signature.
	- Move a few independent lines of code so they are *after* a
	  check for a failed memory allocation.
	- Get rid of unnecessary local variable "ret".
    - Make a few other changes in rbd_read_header(), similar to the
      above--just moving things around a bit while preserving the
      functionality.
    - In rbd_rq_fn(), just assign rq in the while loop's controlling
      expression rather than duplicating it before and at the end of
      the loop body.  This allows the use of "continue" rather than
      "goto next" in a number of spots.
    - Rearrange the logic in snap_by_name().  End result is the same.
Signed-off-by: Alex Elder <elder@dreamhost.com>

00f1f36f

rbd: fix module sysfs setup/teardown code · fed4c143

Committed by Alex Elder on Feb 07, 2012

Once rbd_bus_type is registered, it allows an "add" operation via
the /sys/bus/rbd/add bus attribute, and adding a new rbd device that
way establishes a connection between the device and rbd_root_dev.
But rbd_root_dev is not registered until after the rbd_bus_type
registration is complete.  This could (in principle anyway) result
in an invalid state.

Since rbd_root_dev has no tie to rbd_bus_type we can reorder these
two initializations and never be faced with this scenario.

In addition, unregister the device in the event the bus registration
fails at module init time.
Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>

fed4c143
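
The ordering described above can be sketched in userspace C. This is only an illustration of "register the device first, then the bus, and undo the device registration if the bus fails". The function names and the `fail_bus_register` hook are hypothetical stand-ins, not the driver's code.

```c
#include <assert.h>
#include <stdbool.h>

static bool root_dev_registered, bus_registered;
static bool fail_bus_register;	/* hypothetical hook to force a failure */

static int root_dev_register(void)
{
	root_dev_registered = true;
	return 0;
}

static void root_dev_unregister(void)
{
	root_dev_registered = false;
}

static int bus_register(void)
{
	if (fail_bus_register)
		return -1;
	bus_registered = true;
	return 0;
}

/*
 * Register the root device before the bus, so nothing reachable through
 * the bus can refer to an unregistered device.  If the bus registration
 * fails, undo the device registration before returning the error.
 */
static int sysfs_init(void)
{
	int ret;

	ret = root_dev_register();
	if (ret < 0)
		return ret;

	ret = bus_register();
	if (ret < 0)
		root_dev_unregister();

	return ret;
}
```

Teardown would run in the opposite order: bus first, then the device.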

rbd: don't allocate mon_addrs buffer in rbd_add() · 7ef3214a

Committed by Alex Elder on Feb 02, 2012

The mon_addrs buffer in rbd_add is used to hold a copy of the
monitor IP addresses supplied via /sys/bus/rbd/add.  That is
passed to rbd_get_client(), which never modifies it (nor do
any of the functions it gets passed to thereafter)--the mon_addr
parameter to rbd_get_client() is a pointer to constant data, so it
can't be modified.  Furthermore, rbd_get_client() has the length of
the mon_addrs buffer and that is used to ensure nothing goes beyond
its end.

Based on all this, there is no reason that a buffer needs to
be used to hold a copy of the mon_addrs provided via
/sys/bus/rbd/add.   Instead, the location within that passed-in
buffer can be provided, along with the length of the "token"
therein which represents the monitor IP's.

A small change to rbd_add_parse_args() allows the address within the
buffer to be passed back, and the length is already returned.  This
now means that, at least from the perspective of this interface,
there is no such thing as a list of monitor addresses that is too
long.
Signed-off-by: Alex Elder <elder@dreamhost.com>

7ef3214a

rbd: have rbd_parse_args() report found mon_addrs size · 5214ecc4

Committed by Alex Elder on Feb 02, 2012

The argument parsing routine already computes the size of the
mon_addrs buffer it extracts from the "command."  Pass it to the
caller so it can use it to provide the length to rbd_get_client().
Signed-off-by: Alex Elder <elder@dreamhost.com>

5214ecc4

rbd: do a few checks at build time · 81a89793

Committed by Alex Elder on Feb 02, 2012

This is a bit gratuitous, but there are a few things that can be
verified at build time rather than run time, so do that.
Signed-off-by: Alex Elder <elder@dreamhost.com>

81a89793
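
A build-time check of this kind can be sketched in userspace C as follows. The macro below is a stand-in for the kernel's BUILD_BUG_ON(), built on C11 `_Static_assert`. The `demo_*` and `DEMO_*` names are hypothetical, chosen only for illustration.

```c
#include <assert.h>

#define DEMO_MAX_NAME_LEN	32	/* hypothetical field size */
#define DEMO_DEFAULT_NAME	"-"	/* hypothetical default value */

struct demo_dev {
	char name[DEMO_MAX_NAME_LEN];
};

/* Userspace stand-in: compilation fails if cond is true. */
#define BUILD_BUG_ON(cond) \
	_Static_assert(!(cond), "BUILD_BUG_ON failed: " #cond)

/* The default name, including its terminating NUL, must fit in the field. */
BUILD_BUG_ON(sizeof(DEMO_DEFAULT_NAME) > sizeof(((struct demo_dev *)0)->name));
```

If someone later shrinks the field below the size of the default name, the build breaks instead of the problem surfacing at run time.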

rbd: don't use sscanf() in rbd_add_parse_args() · e28fff26

Committed by Alex Elder on Feb 02, 2012

Make use of a few simple helper routines to parse the arguments
rather than sscanf().  This will treat both missing and too-long
arguments as invalid input (rather than silently truncating the
input in the too-long case).  In time this can also be used by
rbd_add() to use the passed-in buffer in place, rather than copying
its contents into new buffers.

It appears to me that the sscanf() previously used would not
correctly handle a supplied snapshot--the two final "%s" conversion
specifications were not separated by a space, and I'm not sure
how sscanf() handles that situation.  It may not be well-defined.
So that may be a bug this change fixes (but I didn't verify that).

The sizes of the mon_addrs and options buffers are now passed to
rbd_add_parse_args(), so they can be supplied to copy_token().
Signed-off-by: Alex Elder <elder@dreamhost.com>

e28fff26
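
A minimal userspace sketch of token-based parsing in the spirit of this change is shown below. copy_token() is named in the message above; its signature here, the next_token() helper, and the separator set are assumptions made for illustration and are not taken from the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Characters treated as token separators (an assumption for this sketch). */
static const char spaces[] = " \f\n\r\t\v";

/*
 * Skip leading separators, leave *buf pointing at the start of the next
 * token, and return that token's length (0 if no token remains).
 */
static size_t next_token(const char **buf)
{
	*buf += strspn(*buf, spaces);
	return strcspn(*buf, spaces);
}

/*
 * Copy the next token into token[] only if it fits together with its
 * terminating NUL, and always advance *buf past the token.  The returned
 * length lets the caller reject a missing token (0) or a too-long one
 * (>= token_size) instead of silently truncating it.
 */
static size_t copy_token(const char **buf, char *token, size_t token_size)
{
	size_t len = next_token(buf);

	if (len < token_size) {
		memcpy(token, *buf, len);
		token[len] = '\0';
	}
	*buf += len;

	return len;
}
```

Because next_token() reports both the position and the length of a token inside the caller's buffer, a caller can also use the token in place without copying it.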

rbd: encapsulate argument parsing for rbd_add() · a725f65e

Committed by Alex Elder on Feb 02, 2012

Move the code that parses the arguments provided to rbd_add() (which
are supplied via /sys/bus/rbd/add) into a separate function.

Also rename the "mon_dev_name" variable in rbd_add() to be
"mon_addrs".   The variable represents a list of one or more
comma-separated monitor IP addresses, each with an optional port
number.  I think "mon_addrs" captures that notion a little better.
Signed-off-by: Alex Elder <elder@dreamhost.com>

a725f65e

rbd: simplify error handling in rbd_add() · 27cc2594

Committed by Alex Elder on Feb 02, 2012

If a couple of pointers are initialized to NULL then a single
"out_nomem" label can be used for all of the memory allocation
failure cases in rbd_add().

Also, get rid of the "irc" local variable there.  There is no
real need for "rc" to be type ssize_t, and it can be used in
the spot "irc" was.
Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>

27cc2594

rbd: reduce memory used for rbd_dev fields · 60571c7d

Committed by Alex Elder on Feb 02, 2012

The length of the string containing the monitor address
specification(s) will never exceed the length of the string passed
in to rbd_add().  The same holds true for the ceph + rbd options
string.  So reduce the amount of memory allocated for these to
that length rather than the maximum (1024 bytes).
Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>

60571c7d

rbd: have rbd_get_client() return a rbd_client · d720bcb0

Committed by Alex Elder on Feb 02, 2012

rbd_get_client() currently returns an error code, and assigns
the rbd_client field of the rbd_device structure it is passed if
successful.  Instead, have it return the created rbd_client
structure, or a pointer-coded error if there is an error.
This makes the assignment of the client pointer more obvious at the
call site.
Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>

d720bcb0
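
The "pointer-coded error" convention can be sketched in userspace C as below. ERR_PTR(), PTR_ERR() and IS_ERR() mirror the kernel helpers of the same names, re-implemented here so the example compiles outside the kernel. `struct client` and client_get() are hypothetical and only show the call-site shape.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO	4095

/* Encode a negative errno value in a pointer. */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

/* Recover the errno value from a pointer-coded error. */
static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True if ptr lies in the top MAX_ERRNO values of the address space. */
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct client {
	int id;
};

/* Hypothetical constructor: returns the object or a pointer-coded error. */
static struct client *client_get(int id)
{
	struct client *c;

	if (id < 0)
		return ERR_PTR(-EINVAL);

	c = malloc(sizeof(*c));
	if (!c)
		return ERR_PTR(-ENOMEM);
	c->id = id;

	return c;
}
```

At the call site the assignment and the error check sit next to each other, for example `c = client_get(id); if (IS_ERR(c)) return PTR_ERR(c);`, so it is evident that the pointer is always initialized.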

rbd: a few simple changes · f0f8cef5

Committed by Alex Elder on Jan 29, 2012

Here are a few very simple cleanups:
    - Add a "RBD_" prefix to the two driver name string definitions.
    - Move the definition of struct rbd_request below struct rbd_req_coll
      to avoid the need for an empty declaration of the latter.
    - Move and group the definitions of rbd_root_dev_release() and
      rbd_root_dev, as well as rbd_bus_type and rbd_bus_attrs[],
      close to the top of the file.  Arrange the latter so
      rbd_bus_type.bus_attrs can be initialized statically.
    - Get rid of an unnecessary local variable in rbd_open().
    - Rework some hokey logic in rbd_bus_add_dev(), so the value of
      "ret" at the end is either 0 or -ENOENT to avoid the need for
      the code duplication that was there.
    - Rename a goto target in rbd_add().
Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>

f0f8cef5

rbd: rename "node_lock" · 432b8587

Committed by Alex Elder on Jan 29, 2012

The spinlock used to protect rbd_client_list is named "node_lock".
Rename it to "rbd_client_list_lock" to make it more obvious what
it's for.
Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>

432b8587

rbd: move ctl_mutex lock inside rbd_client_create() · bc534d86

Committed by Alex Elder on Jan 29, 2012

Since rbd_client_create() is only called in one place, move the
acquisition of the mutex around that call inside that function.
Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>

bc534d86

rbd: move ctl_mutex lock inside rbd_get_client() · d97081b0

Committed by Alex Elder on Jan 29, 2012

Since rbd_get_client() is only called in one place, move the
acquisition of the mutex around that call inside that function.

Furthermore, within rbd_get_client(), it appears the mutex only
needs to be held while calling rbd_client_create().  (Moving
the lock inside that function will wait for the next patch.)
Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>

d97081b0

rbd: release client list lock sooner · e6994d3d

Committed by Alex Elder on Jan 29, 2012

In rbd_get_client(), if a client is reused, a number of things
get done while still holding the list lock unnecessarily.

This just moves a few things that need no lock protection outside
the lock.
Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>

e6994d3d

rbd: restore previous rbd id sequence behavior · d184f6bf

Committed by Alex Elder on Jan 29, 2012

It used to be that selecting a new unique identifier for an added
rbd device required searching all existing ones to find the highest
id in use.  A recent change made that unnecessary, but made it
so that id's used were monotonically non-decreasing.  It's a bit
more pleasant to have smaller rbd id's though, and this change
makes ids get allocated as they were before--each new id is one more
than the maximum currently in use.
Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>

d184f6bf

rbd: tie rbd_dev_list changes to rbd_id operations · 499afd5b

Committed by Alex Elder on Feb 02, 2012

The only time entries are added to or removed from the global
rbd_dev_list is exactly when a "put" or "get" operation is being
performed on a rbd_dev's id.  So just move the list management code
into get/put routines.
Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>

499afd5b

rbd: protect the rbd_dev_list with a spinlock · e124a82f

Committed by Alex Elder on Jan 29, 2012

The rbd_dev_list is just a simple list of all the current
rbd_devices.  Using the ctl_mutex as a concurrency guard is
overkill.  Instead, use a spinlock for that specific purpose.

This also reduces the window that the ctl_mutex needs to be held in
rbd_add().
Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>

e124a82f

rbd: rework calculation of new rbd id's · 1ddbe94e

Committed by Alex Elder on Jan 29, 2012

In order to select a new unique identifier for an added rbd device,
the list of all existing ones is searched and a value one greater
than the highest id is used.

The list search can be avoided by using an atomic variable that
keeps track of the current highest id.  Using a get/put model for
id's we can limit the boundless growth of id numbers a bit by
arranging to reuse the current highest id once it gets released.
Add these calls to "put" the id when an rbd is getting removed.

Note that this changes the pattern of device id's used--new values
will never be below the highest one seen so far (even if there
exists an unused lower one).  I assert this is OK because the key
property of an rbd id is its uniqueness, not its magnitude.

Regardless, a follow-on patch will restore the old way of doing
things; I just think this commit makes the incremental change
to atomics a little easier to understand.
Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>

1ddbe94e
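
The get/put model described above can be sketched with C11 atomics. This is a userspace illustration under assumptions, not the driver's code: the names id_max, id_get() and id_put() are hypothetical, and ids start at 1 here.

```c
#include <assert.h>
#include <stdatomic.h>

/* Highest id handed out so far; 0 means no id is in use. */
static atomic_long id_max;

/* "Get": claim the next id, one above the current maximum. */
static long id_get(void)
{
	return atomic_fetch_add(&id_max, 1) + 1;
}

/*
 * "Put": if the id being released is the current maximum, step the
 * maximum back by one so that value can be handed out again.  Any other
 * id is simply dropped, which is why an unused lower id is never reused
 * in this scheme.
 */
static void id_put(long id)
{
	long expected = id;

	atomic_compare_exchange_strong(&id_max, &expected, id - 1);
}
```

With ids 1 and 2 in use, releasing 1 changes nothing and the next id_get() returns 3, while releasing the highest id makes that same value available to the next caller.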

rbd: encapsulate new rbd id selection · b7f23c36

Committed by Alex Elder on Jan 29, 2012

Move the loop that finds a new unique rbd id to use into
its own helper function.
Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>

b7f23c36

rbd: use a single value of snap_name to mean no snap · cc9d734c

Committed by Josh Durgin on Nov 21, 2011

There's already a constant for this anyway.

Since rbd_header_set_snap() is only used to set the rbd device
snap_name field, just do that within that function rather than
having it take the snap_name as an argument.
Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>

v2: Changed interface rbd_header_set_snap() so it explicitly updates
    the snap_name in the rbd_device.  Also added a BUILD_BUG_ON()
    to verify the size of the snap_name field is sufficient for
    SNAP_HEAD_NAME.

cc9d734c

rbd: do not duplicate ceph_client pointer in rbd_device · 1dbb4399

Committed by Alex Elder on Jan 24, 2012

The rbd_device structure maintains a duplicate copy of the
ceph_client pointer maintained in its rbd_client structure.  There
appears to be no good reason for this, and its presence presents a
risk of them getting out of sync or otherwise misused.  So kill it
off, and use the rbd_client copy only.
Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>

1dbb4399

rbd: make ceph_parse_options() return a pointer · ee57741c

Committed by Alex Elder on Jan 24, 2012

ceph_parse_options() takes the address of a pointer as an argument
and uses it to return the address of an allocated structure if
successful.  With this interface it is not evident at call sites that
the pointer is always initialized.  Change the interface to return
the address instead (or a pointer-coded error code) to make the
validity of the returned pointer obvious.
Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>

ee57741c

rbd: a few small cleanups · 21079786

Committed by Alex Elder on Jan 24, 2012

Some minor cleanups in "drivers/block/rbd.c":
    - Use the more meaningful "RBD_MAX_OBJ_NAME_LEN" in place of "96"
      in the definition of RBD_MAX_MD_NAME_LEN.
    - Use DEFINE_SPINLOCK() to define and initialize node_lock.
    - Drop a needless (char *) cast in parse_rbd_opts_token().
    - Make a few minor formatting changes.
Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>

21079786

03 Feb, 2012 2 commits

rbd: fix safety of rbd_put_client() · d23a4b3f

Committed by Alex Elder on Jan 29, 2012

The rbd_client structure uses a kref to arrange for cleaning up and
freeing an instance when its last reference is dropped.  The cleanup
routine is rbd_client_release(), and one of the things it does is
delete the rbd_client from rbd_client_list.  It acquires node_lock
to do so, but the way it is done is still not safe.

The problem is that when attempting to reuse an existing rbd_client,
the structure found might already be in the process of getting
destroyed and cleaned up.

Here's the scenario, with "CLIENT" representing an existing
rbd_client that's involved in the race:

 Thread on CPU A                | Thread on CPU B
 ---------------                | ---------------
 rbd_put_client(CLIENT)         | rbd_get_client()
   kref_put()                   |   (acquires node_lock)
     kref->refcount becomes 0   |   __rbd_client_find() returns CLIENT
     calls rbd_client_release() |   kref_get(&CLIENT->kref);
                                |   (releases node_lock)
       (acquires node_lock)     |
       deletes CLIENT from list | ...and starts using CLIENT...
       (releases node_lock)     |
       and frees CLIENT         | <-- but CLIENT gets freed here

Fix this by having rbd_put_client() acquire node_lock.  The result
could still be improved, but at least it avoids this problem.
Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>

d23a4b3f
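
The idea behind the fix, dropping the reference while holding the same lock that protects lookups, can be sketched in userspace C. This is a simplified model under assumptions: it uses a plain integer count protected by a pthread mutex rather than a kref, it folds client creation into the lookup, and all names are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

struct client {
	int refcount;		/* protected by list_lock in this sketch */
	int key;
	struct client *next;
};

static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
static struct client *client_list;

/* Find (or create) a client and take a reference, all under the lock. */
static struct client *client_get(int key)
{
	struct client *c;

	pthread_mutex_lock(&list_lock);
	for (c = client_list; c; c = c->next)
		if (c->key == key)
			break;
	if (!c) {
		c = calloc(1, sizeof(*c));
		if (c) {
			c->key = key;
			c->next = client_list;
			client_list = c;
		}
	}
	if (c)
		c->refcount++;
	pthread_mutex_unlock(&list_lock);

	return c;
}

/*
 * Drop a reference.  The lock is taken before the count is decremented,
 * so the decrement to zero and the removal from the list happen in one
 * critical section.  A concurrent client_get() therefore can never find
 * an entry that is already on its way to being freed.
 */
static void client_put(struct client *c)
{
	struct client **pp;

	pthread_mutex_lock(&list_lock);
	if (--c->refcount == 0) {
		for (pp = &client_list; *pp != c; pp = &(*pp)->next)
			;
		*pp = c->next;
		free(c);
	}
	pthread_mutex_unlock(&list_lock);
}
```

In the race shown in the table above, the thread on CPU B would now either take its reference before CPU A's final put acquires the lock (so the count never reaches zero), or run after the client has already left the list.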

rbd: fix a memory leak in rbd_get_client() · 97bb59a0

Committed by Alex Elder on Jan 24, 2012

If an existing rbd client is found to be suitable for use in
rbd_get_client(), the rbd_options structure is not being
freed as it should.  Fix that.
Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>

97bb59a0

13 Jan, 2012 1 commit

rbd: initialize snap_rwsem in rbd_add() · 0e805a1d

Committed by Alex Elder on Jan 11, 2012

New rbd device structures get initialized in rbd_add().  Many of
the fields rely on being initially zero-filled.  However, lockdep
was noticing that the rw_semaphore embedded in the header field
was not getting properly initialized.  Fix that.
Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>

0e805a1d

08 Dec, 2011 2 commits

rbd: remove buggy rollback functionality · 51703306

Committed by Josh Durgin on Oct 24, 2011

This doesn't interact with resizing well, since it doesn't set the
size of the device to the size at the snapshot. It's also an expensive
operation to be synchronous. Rollback can still be done with the
userspace rbd tool.
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>

51703306

rbd: return an error when an invalid header is read · 81e759fb

Committed by Josh Durgin on Nov 15, 2011

This protects against opening future rbd images that have incompatible format changes.
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>

81e759fb

26 Oct, 2011 1 commit

libceph: create messenger with client · 6ab00d46

Committed by Sage Weil on Aug 09, 2011

This simplifies the init/shutdown paths, and makes client->msgr available
during the rest of the setup process.
Signed-off-by: Sage Weil <sage@newdream.net>

6ab00d46

15 Sep, 2011 1 commit

treewide: remove extra semicolons from various parts of the kernel · 69932487

Committed by Justin P. Mattock on Jul 26, 2011

This is a resend of the original, changing the title from PATCH to
RFC (since this is a review for commit, and I should have put that the
first go around), and also removing some of the commits with ia64 and
bash since it is significant.  Let me know if I might have missed
anything, etc.
Signed-off-by: Justin P. Mattock <justinmattock@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

69932487

27 Jul, 2011 2 commits

rbd: set blk_queue request sizes to object size · 029bcbd8

Committed by Josh Durgin on Jul 22, 2011

This improves performance since more requests can be merged.
Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>

029bcbd8

rbd: cancel watch request when releasing the device · 79e3057c

Committed by Yehuda Sadeh on Jul 12, 2011

We were missing this cleanup, so when a device was released
the osd didn't clean up its watchers list, and subsequent notifications
could be slow because the osd had to time out on the client.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>

79e3057c

25 May, 2011 3 commits

rbd: handle online resize of underlying rbd image · 9db4b3e3

Committed by Sage Weil on Apr 19, 2011

If we get a notification that the image header has changed, check for
a change in the image size.
Signed-off-by: Sage Weil <sage@newdream.net>

9db4b3e3

rbd: use snprintf for disk->disk_name · aedfec59

Committed by Sage Weil on May 12, 2011

Signed-off-by: Sage Weil <sage@newdream.net>

aedfec59

rbd: cleanup: make kfree match kmalloc · 916d4d67

Committed by Sage Weil on May 12, 2011

Signed-off-by: Sage Weil <sage@newdream.net>

916d4d67

20 May, 2011 1 commit

rbd: warn on update_snaps failure on notify · 13143d2d

Committed by Sage Weil on May 12, 2011

Signed-off-by: Sage Weil <sage@newdream.net>

13143d2d

14 May, 2011 1 commit

rbd: fix split bio handling · 1fec7093

Committed by Yehuda Sadeh on May 13, 2011

The rbd driver currently splits bios when they span an object boundary.
However, the blk_end_request expects the completions to roll up the results
in block device order, and the split rbd/ceph ops can complete in any
order.  This patch adds a struct rbd_req_coll to track completion of split
requests and ensures that the results are passed back up to the block layer
in order.

This fixes errors where the file system gets completion of a read operation
that spans an object boundary before the data has actually arrived.  The
bug is easily reproduced with iozone with a working set larger than
available RAM.
Reported-by: Fyodor Ustinov <ufm@ufm.su>
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

1fec7093
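
The in-order roll-up described above can be sketched in userspace C. This is a simplified model under assumptions: struct rbd_req_coll is named in the message, but the fields, the fixed MAX_PARTS bound and the coll_end_part() helper below are hypothetical. The `reported_bytes` counter stands in for reporting completion to the block layer, and the locking that a real driver would need is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PARTS	8

/* Completion state of one piece of a split request. */
struct part_status {
	bool done;
	int rc;			/* result code of this piece */
	size_t bytes;		/* bytes covered by this piece */
};

struct req_coll {
	int total;		/* number of pieces, at most MAX_PARTS */
	int num_done;		/* pieces already reported upward, in order */
	struct part_status status[MAX_PARTS];
	size_t reported_bytes;	/* stand-in for block-layer completion */
};

/*
 * Record completion of piece `index`, then report every piece from the
 * first unreported one up to the first piece that is still outstanding.
 * Returns how many pieces this call reported.
 */
static int coll_end_part(struct req_coll *coll, int index, int rc,
			 size_t bytes)
{
	int i, reported = 0;

	assert(index >= 0 && index < coll->total);
	coll->status[index].done = true;
	coll->status[index].rc = rc;
	coll->status[index].bytes = bytes;

	for (i = coll->num_done; i < coll->total && coll->status[i].done; i++) {
		coll->reported_bytes += coll->status[i].bytes;
		reported++;
	}
	coll->num_done = i;

	return reported;
}
```

If the last piece of a three-part request finishes first, nothing is reported until pieces 0 and 1 have also completed, so the upper layer never sees a later range completed ahead of an earlier one.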

13 May, 2011 1 commit

rbd: fix leak of ops struct · 11f77002

Committed by Sage Weil on May 12, 2011

The ops vector must be freed by the rbd_do_request caller.
Signed-off-by: Sage Weil <sage@newdream.net>

11f77002

04 May, 2011 1 commit

libceph: fix ceph_osdc_alloc_request error checks · 4ad12621

Committed by Sage Weil on May 03, 2011

ceph_osdc_alloc_request returns NULL on failure.
Signed-off-by: Sage Weil <sage@newdream.net>

4ad12621