- 04 7月, 2013 4 次提交
-
-
由 Alex Elder 提交于
When unmapping a device, its id is supplied, and that is used to look up which rbd device should be unmapped. Looking up the device involves searching the rbd device list while holding a spinlock that protects access to that list. Currently all of this is done under protection of the control lock, but that protection is going away soon. To ensure the rbd_dev is still valid (still on the list) while setting its REMOVING flag, do so while still holding the list lock. To do so, get rid of __rbd_get_dev(), and open code what it did in the one place it was used. Signed-off-by: NAlex Elder <elder@inktank.com> Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>
-
由 Alex Elder 提交于
If more than one rbd image has the same ceph cluster configuration (same options, same set of monitors, same keys) they normally share a single rbd client. When an image is getting mapped, rbd looks to see if an existing client can be used, and creates a new one if not. The lookup and creation are not done under a common lock though, so mapping two images concurrently could lead to duplicate clients getting set up needlessly. This isn't a major problem, but it's wasteful and different from what's intended. This patch fixes that by using the control mutex to protect both the lookup and (if needed) creation of the client. It was previously used just when creating. This resolves: http://tracker.ceph.com/issues/3094Signed-off-by: NAlex Elder <elder@inktank.com> Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>
-
由 Alex Elder 提交于
This includes a few relatively small fixes I found while examining the code that refreshes image information. This resolves: http://tracker.ceph.com/issues/5040Signed-off-by: NAlex Elder <elder@inktank.com> Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>
-
由 Alex Elder 提交于
Neither zero_bio_chain() nor zero_pages() contains a call to flush caches after zeroing a portion of a page. This can cause problems on architectures that have caches that allow virtual address aliasing. This resolves: http://tracker.ceph.com/issues/4777Signed-off-by: NAlex Elder <elder@inktank.com> Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>
-
- 02 7月, 2013 2 次提交
-
-
由 Alex Elder 提交于
The reference to the original request dropped at the end of rbd_img_obj_exists_callback() corresponds to the reference taken in rbd_img_obj_exists_submit() to account for the stat request referring to it. Move the put of that reference up right after clearing that pointer to make its purpose more obvious. Signed-off-by: NAlex Elder <elder@inktank.com> Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>
-
由 Geert Uytterhoeven 提交于
drivers/block/rbd.c: In function ‘zero_pages’: drivers/block/rbd.c:1102: warning: comparison of distinct pointer types lacks a cast Remove the hackish casts and use min_t() to fix this. Signed-off-by: NGeert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: NAlex Elder <elder@inktank.com>
-
- 27 6月, 2013 1 次提交
-
-
由 Josh Durgin 提交于
Sending the right snapshot context with each write is required for snapshots to work. Due to the ordering of calls, the snapshot context is never set for any requests. This causes writes to the current version of the image to be reflected in all snapshots, which are supposed to be read-only. This happens because rbd_osd_req_format_write() sets the snapshot context based on obj_request->img_request. At this point, however, obj_request->img_request has not been set yet, to the snapshot context is set to NULL. Fix this by moving rbd_img_obj_request_add(), which sets obj_request->img_request, before the osd request formatting calls. This resolves: http://tracker.ceph.com/issues/5465Reported-by: NKarol Jurak <karol.jurak@gmail.com> Signed-off-by: NJosh Durgin <josh.durgin@inktank.com> Reviewed-by: NSage Weil <sage@inktank.com> Reviewed-by: NAlex Elder <elder@linaro.org>
-
- 26 6月, 2013 1 次提交
-
-
由 Josh Durgin 提交于
rbd_dev_v2_header_onetime() fetches striping information, and checks whether the image can be read by compariing the stripe unit to the object size. It determines the object size by shifting the object order, which is 0 at this point since it has not been read yet. Move the call to get the image size and object order before rbd_dev_v2_header_onetime() so it is set before use. Signed-off-by: NJosh Durgin <josh.durgin@inktank.com> Reviewed-by: NSage Weil <sage@inktank.com>
-
- 13 6月, 2013 2 次提交
-
-
由 Josh Durgin 提交于
Format 2 objects use 16 characters for the object name suffix to be able to express the full 64-bit range of object numbers. Format 1 images only use 12 characters for this. Using 12-character names for format 2 caused userspace and kernel rbd clients to read differently named objects, which made an image written by one client look empty to the other client. CC: stable@vger.kernel.org # 3.9+ Reported-by: NChris Dunlop <chris@onthe.net.au> Signed-off-by: NJosh Durgin <josh.durgin@inktank.com> Reviewed-by: NSage Weil <sage@inktank.com>
-
由 Stephen M. Cameron 提交于
If a new logical drive is added and the CCISS_REGNEWD ioctl is invoked (as is normal with the Array Configuration Utility) the process will hang as below. It attempts to acquire the same mutex twice, once in do_ioctl() and once in cciss_unlocked_open(). The BKL was recursive, the mutex isn't. Linux version 3.10.0-rc2 (scameron@localhost.localdomain) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-3) (GCC) ) #1 SMP Fri May 24 14:32:12 CDT 2013 [...] acu D 0000000000000001 0 3246 3191 0x00000080 Call Trace: schedule+0x29/0x70 schedule_preempt_disabled+0xe/0x10 __mutex_lock_slowpath+0x17b/0x220 mutex_lock+0x2b/0x50 cciss_unlocked_open+0x2f/0x110 [cciss] __blkdev_get+0xd3/0x470 blkdev_get+0x5c/0x1e0 register_disk+0x182/0x1a0 add_disk+0x17c/0x310 cciss_add_disk+0x13a/0x170 [cciss] cciss_update_drive_info+0x39b/0x480 [cciss] rebuild_lun_table+0x258/0x370 [cciss] cciss_ioctl+0x34f/0x470 [cciss] do_ioctl+0x49/0x70 [cciss] __blkdev_driver_ioctl+0x28/0x30 blkdev_ioctl+0x200/0x7b0 block_ioctl+0x3c/0x40 do_vfs_ioctl+0x89/0x350 SyS_ioctl+0xa1/0xb0 system_call_fastpath+0x16/0x1b This mutex usage was added into the ioctl path when the big kernel lock was removed. As it turns out, these paths are all thread safe anyway (or can easily be made so) and we don't want ioctl() to be single threaded in any case. Signed-off-by: NStephen M. Cameron <scameron@beardog.cce.hp.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Mike Miller <mike.miller@hp.com> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 31 5月, 2013 1 次提交
-
-
由 Ramachandra Rao Gajula 提交于
Some devices only have support for MSI, not MSI-X. While MSI is more limited, it still provides better performance than line-based interrupts. Signed-off-by: NRamachandra Gajula <rama@fastorsystems.com> Signed-off-by: NMatthew Wilcox <matthew.r.wilcox@intel.com>
-
- 29 5月, 2013 2 次提交
-
-
由 Dan Carpenter 提交于
Static checkers complain about widening the binary "not" operations here because sectors are u64 and "(pd)->settings.size" is unsigned int. It unintentionally clears the high 32 bits of the sector. This means that the driver won't work for devices with over 2TB of space. Since this is a DVD drive, we're unlikely to reach that limit, but we may as well silence the warning. Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Signed-off-by: NJiri Kosina <jkosina@suse.cz>
-
由 Matthew Wilcox 提交于
In some circumstances setting a 64-bit DMA mask can fail, as explained in Documentation/DMA-API-HOWTO.txt. Use the recommended code sequence to set a 32-bit DMA mask if setting a 64-bit DMA mask fails. Reported-by: NChayan Biswas <Chayan.Biswas@sandisk.com> Signed-off-by: NMatthew Wilcox <matthew.r.wilcox@intel.com>
-
- 25 5月, 2013 2 次提交
-
-
由 Brian Behlendorf 提交于
The index on the page must be set before it is inserted in the radix tree. Otherwise there is a small race which can occur during lookup where the page can be found with the incorrect index. This will trigger the BUG_ON() in brd_lookup_page(). Signed-off-by: NBrian Behlendorf <behlendorf1@llnl.gov> Reported-by: NChris Wedgwood <cw@f00f.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Gernot Vormayr 提交于
If the port number is missing from the device-tree the device gets named xs` instead of xsa. This fixes the check for missing ids. Tested on ml507 board. Signed-off-by: NGernot Vormayr <gvormayr@gmail.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Grant Likely <grant.likely@linaro.org> Cc: Rob Herring <rob.herring@calxeda.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 24 5月, 2013 1 次提交
-
-
由 Chayan Biswas 提交于
We copy the result to user if the command is completed from the controller even if it completes with failure (non-zero) status. A return status of < 0 indicates the command was not completed by the controller. The user application may expect the error code in the result field in case of failure. Signed-off-by: NChayan Biswas <Chayan.Biswas@sandisk.com> Signed-off-by: NMatthew Wilcox <matthew.r.wilcox@intel.com>
-
- 18 5月, 2013 2 次提交
-
-
由 Alex Elder 提交于
Bjorn Helgaas pointed out that a recent commit introduced a use-after-free condition in an error path for rbd_add(). He correctly stated: I think b536f69a "rbd: set up devices only for mapped images" introduced a use-after-free error in rbd_add(): ... If rbd_dev_device_setup() returns an error, we call rbd_dev_image_release(), which ultimately kfrees rbd_dev. Then we call rbd_dev_destroy(), which references fields in the already-freed rbd_dev struct before kfreeing it again. The simple fix is to return the error code after the call to rbd_dev_image_release(). Closer examination revealed that there's no need to clean up rbd_opts in that function, so fix that too. Update some other comments that have also become out of date. Reported-by: NBjorn Helgaas <bhelgaas@google.com> Signed-off-by: NAlex Elder <elder@inktank.com> Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>
-
由 Alex Elder 提交于
Whether rbd_client_create() successfully creates a new client or not, it takes responsibility for getting the ceph_opts structure it's passed destroyed. If successful, the structure becomes associated with the created client; if not, rbd_client_create() will destroy it. Previously, rbd_get_client() would call ceph_destroy_options() if rbd_get_client() failed, and that meant it got called twice. That led freeing various pointers more than once, which is never a good idea. This resolves: http://tracker.ceph.com/issues/4559 Cc: stable@vger.kernel.org # 3.8+ Reported-by: NDan van der Ster <dan@vanderster.com> Signed-off-by: NAlex Elder <elder@inktank.com> Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>
-
- 17 5月, 2013 5 次提交
-
-
由 Keith Busch 提交于
Cancelling an already cancelled command does not do anything, so check the command context before cancelling it, continuing if had already been cancelled so we do not log the same problem every second if a device stops responding. Signed-off-by: NKeith Busch <keith.busch@intel.com> Signed-off-by: NMatthew Wilcox <matthew.r.wilcox@intel.com>
-
由 Wei Yongjun 提交于
nvme_submit_flush_data() might overwrite the initialisation of the return value with 0, so move the -ENOMEM setting close to the usage. Signed-off-by: NWei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: NMatthew Wilcox <matthew.r.wilcox@intel.com>
-
由 Dan Carpenter 提交于
You need to have CAP_SYS_ADMIN to trigger this overflow but it makes the static checkers complain so we should fix it. The worry is that "length" comes from copy_from_user() so we need to check that "length + offset" can't overflow. I also changed the min_t() cast to be unsigned instead of signed. Now that we cap "length" to INT_MAX it doesn't make a difference, but it's a little easier for reviewers to know that large values aren't cast to negative. Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Signed-off-by: NMatthew Wilcox <matthew.r.wilcox@intel.com>
-
由 Vishal Verma 提交于
nvme_trans_modesel_get_mp() was defined with a unsigned return type, but can return signed values. Reported-by: NDan Carpenter <dan.carpenter@oracle.com> Signed-off-by: NVishal Verma <vishal.l.verma@linux.intel.com> Signed-off-by: NMatthew Wilcox <matthew.r.wilcox@intel.com>
-
由 Sachin Kamat 提交于
version.h header inclusion is not necessary as detected by checkversion.pl. Signed-off-by: NSachin Kamat <sachin.kamat@linaro.org> Acked-by: NVishal Verma <vishal.l.verma@linux.intel.com> Signed-off-by: NMatthew Wilcox <matthew.r.wilcox@intel.com>
-
- 15 5月, 2013 2 次提交
-
-
由 Sam Bradshaw 提交于
Stacking drivers may append bvecs to existing bio's, resulting in non-zero bi_idx conditions. This patch counts the loops of bio_for_each_segment() rather than inheriting the bi_idx value to pass as a segment count to the hardware submission routine. Signed-off-by: NSam Bradshaw <sbradshaw@micron.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Sam Bradshaw 提交于
An open file-handle to one or more of the driver exported debugfs nodes causes raciness in recursive removal during module unload; sometimes a stale parent dentry is dereferenced when more than 1 pci device is present. Signed-off-by: NSam Bradshaw <sbradshaw@micron.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 14 5月, 2013 12 次提交
-
-
由 Alex Elder 提交于
Add code to rbd_img_obj_exists_callback() to detect when a clone's parent image has disappeared, and re-submit the original write request in that case. Kill off some redundant assertions. This completes the resolution for: http://tracker.ceph.com/issues/3763Signed-off-by: NAlex Elder <elder@inktank.com> Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>
-
由 Alex Elder 提交于
Add code to rbd_img_parent_read_full_callback() to detect when a clone's parent image has disappeared, and re-submit the original write request in that case. (See the previous commit for more reasoning about why this is appropriate.) Rename some variables in rbd_img_obj_parent_read_full_callback() to match the convention used in the previous patch. Signed-off-by: NAlex Elder <elder@inktank.com> Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>
-
由 Alex Elder 提交于
If a clone image gets flattened while a parent read request is underway, the original rbd object request needs to be resubmitted. The reason is that by the time we get the response to the parent read request, the data read from the parent may be out of date. In other words, we could see this sequence of events: rbd client parent image/osd ---------- ---------------- original object ENOENT; issue parent read respond to parent read child image flattened original image header refresh <--- original object written independently here parent read response received Add code to rbd_img_parent_read_callback() to detect when a clone's parent image has disappeared (as evidenced by its parent overlap becoming 0), and re-submit the original read request in that case. Signed-off-by: NAlex Elder <elder@inktank.com> Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>
-
由 Alex Elder 提交于
A format 2 clone image can be the subject of a "flatten" operation, during which all of its data gets "copied up" from its parent image, leaving the image fully populated. Once this is complete, the clone's association with the parent is abolished. Since this can occur when a clone is mapped, we need to detect when it has occurred and handle it accordingly. We know an image has been flattened when we know it at one time had a parent, but we have learned (via a "get_parent" object class method call) it no longer has one. There might be in-flight requests at the point we learn an image has been flattened, so we can't simply clean up parent data structures right away. Instead, we'll drop the initial parent reference when the parent has disappeared (rather than when the image gets destroyed), which will allow the last in-flight reference to clean things up when it's complete. We leverage the fact that a zero parent overlap renders an image effectively unlayered. We set the overlap to 0 at the point we detect the clone image has flattened, which allows the unlayered behavior to take effect immediately, while keeping other parent structures in place until in-flight requests to complete. This and the next few patches resolve: http://tracker.ceph.com/issues/3763Signed-off-by: NAlex Elder <elder@inktank.com> Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>
-
由 Alex Elder 提交于
Keep a reference count for uses of the parent information for an rbd device. An initial reference is set in rbd_img_request_create() if the target image has a parent (with non-zero overlap). Each image request for an image with a non-zero parent overlap gets another reference when it's created, and that reference is dropped when the request is destroyed. The initial reference is dropped when the image gets torn down. Signed-off-by: NAlex Elder <elder@inktank.com> Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>
-
由 Alex Elder 提交于
Define rbd_parent_request_create() and rbd_parent_request_destroy() to handle the creation of parent image requests submitted for layered image objects. For simplicity, let rbd_img_request_put() handle dropping the reference to any image request (parent or not), and call whichever destructor is appropriate on the last put. Signed-off-by: NAlex Elder <elder@inktank.com> Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>
-
由 Alex Elder 提交于
Define rbd_dev_unparent() to encapsulate cleaning up parent data structures from a layered rbd image. Signed-off-by: NAlex Elder <elder@inktank.com> Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>
-
由 Alex Elder 提交于
Previously when a layered write was going to involve a copyup request, the original osd request was released before submitting the parent full-object read. The osd request for the copyup would then be allocated in rbd_img_obj_parent_read_full_callback(). Shortly we will be handling the event of mapped layered images getting flattened, and when that occurs we need to resubmit the original request. We therefore don't want to release the osd request until we really konw we're going to replace it--in the callback function. Signed-off-by: NAlex Elder <elder@inktank.com> Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>
-
由 Alex Elder 提交于
Get parent info for format 2 images on every refresh (rather than just during the initial probe). This will be needed to detect the disappearance of the parent image in the event a mapped image becomes unlayered (i.e., flattened). Avoid leaking the previous parent spec on the second and subsequent times this information is requested by dropping the previous one (if any) before updating it. (Also, extract the pool id into a local variable before assigning it into the parent spec.) Switch to using a non-zero parent overlap value rather than the existence of a parent (a non-null parent_spec pointer) to determine whether to mark a request layered. It will soon be possible for a layered image to become unlayered while a request is in flight. This means that the layered flag for an image request indicates that there was a non-zero parent overlap at the time the image request was created. The parent overlap can change thereafter, which may lead to special handling at request submission or completion time. This and the next several patches are related to: http://tracker.ceph.com/issues/3763 NOTE: If an error occurs while refreshing the parent info (i.e., requesting it after initial probe), the old parent info will persist. This is not really correct, and is a scenario that needs to be addressed. For now we'll assert that the failure mode is unlikely, but the issue has been documented in tracker issue 5040. Signed-off-by: NAlex Elder <elder@inktank.com> Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>
-
由 Alex Elder 提交于
An rbd clone image that has an overlap with its parent of 0 is effectively not a layered image at all. Detect this case and treat such an image as non-layered. Issue a warning to be sure the user knows what's going on. This resolves: http://tracker.ceph.com/issues/5028Signed-off-by: NAlex Elder <elder@inktank.com> Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>
-
由 Alex Elder 提交于
Currently, rbd_img_obj_parent_read_full() assumes the incoming object request contains bio data. But if a layered image is part of a multi-layer stack of images it will result in read requests of page data to parent images. This is handling the same kind of issue as was resolved by this commit: 5b2ab72d rbd: support reading parent page data This resolves: http://tracker.ceph.com/issues/5027Signed-off-by: NAlex Elder <elder@inktank.com> Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>
-
由 Alex Elder 提交于
The code that reads object data from the parent for a copyup on write request currently assumes that the size of that request is the size of a "full" object from the original target image. That is not necessarily the case. The parent overlap could reduce the request size below that. To fix that assumption we need to record the number of pages in the copyup_pages array, for both an image request and an object request. Rename a local variable in rbd_img_obj_parent_read_full_callback() to reflect we're recording the length of the parent read request, not the size of the target object. This resolves: http://tracker.ceph.com/issues/5038Signed-off-by: NAlex Elder <elder@inktank.com> Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>
-
- 10 5月, 2013 1 次提交
-
-
由 Keith Busch 提交于
Signed-off-by: NKeith Busch <keith.busch@intel.com> Signed-off-by: NMatthew Wilcox <matthew.r.wilcox@intel.com>
-
- 09 5月, 2013 2 次提交
-
-
由 Alex Elder 提交于
Get rid of rbd_img_request_get(), because it isn't used, and maybe won't ever be needed. Signed-off-by: NAlex Elder <elder@inktank.com> Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>
-
由 Alex Elder 提交于
Any changes to parent images are immaterial to any mapped clone. So there is no need to have a watch event registered on header objects except for the header object of an image that is mapped. In fact, a watch request is a write operation, and we may only have read access to a parent image. We can't set up the watch request until we know the name of the header object though. So pass a flag to rbd_dev_image_probe() to indicate whether this probe is for a mapping or for a parent image. Change the second parameter to rbd_dev_header_watch_sync() be Boolean while we're at it. This resolves: http://tracker.ceph.com/issues/4941Signed-off-by: NAlex Elder <elder@inktank.com> Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>
-