- 29 Sep, 2011 2 commits
-
-
Committed by Sage Weil
The incremental map updates have a record for each pg_temp mapping that is to be added/updated (len > 0) or removed (len == 0). The old code was written as if the updates were a complete enumeration; that was just wrong. Update the code to remove 0-length entries and drop the rbtree traversal. This avoids misdirected (and hung) requests that manifest as server errors like [WRN] client4104 10.0.1.219:0/275025290 misdirected client4104.1:129 0.1 to osd0 not [1,0] in e11/11 Signed-off-by: Sage Weil <sage@newdream.net>
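The add/update/remove rule described in this commit can be sketched in userspace C. This is a minimal illustration, not the kernel code: the fixed-size table, `struct pg_temp_entry`, `pg_temp_find()` and `pg_temp_apply()` are hypothetical names, and a linear scan stands in for the rbtree.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_PGS  16  /* hypothetical table size for this sketch */
#define MAX_OSDS  8  /* hypothetical cap on OSDs per mapping */

/* Hypothetical stand-in for one pg_temp mapping. */
struct pg_temp_entry {
	int in_use;
	unsigned pgid;
	int len;
	int osds[MAX_OSDS];
};

struct pg_temp_table {
	struct pg_temp_entry slot[MAX_PGS];
};

/* Linear lookup by pgid; returns NULL if no mapping exists. */
static struct pg_temp_entry *pg_temp_find(struct pg_temp_table *t, unsigned pgid)
{
	for (size_t i = 0; i < MAX_PGS; i++)
		if (t->slot[i].in_use && t->slot[i].pgid == pgid)
			return &t->slot[i];
	return NULL;
}

/*
 * Apply ONE incremental record. len == 0 removes the mapping,
 * len > 0 adds or replaces it. Mappings not named by a record are
 * left untouched, because an incremental is not a full enumeration.
 * Returns 0 on success, -1 on a bad length or a full table.
 */
static int pg_temp_apply(struct pg_temp_table *t, unsigned pgid,
			 const int *osds, int len)
{
	struct pg_temp_entry *e = pg_temp_find(t, pgid);

	if (len < 0 || len > MAX_OSDS)
		return -1;
	if (len == 0) {
		if (e)
			e->in_use = 0;
		return 0;
	}
	assert(osds != NULL);
	for (size_t i = 0; i < MAX_PGS && !e; i++)
		if (!t->slot[i].in_use)
			e = &t->slot[i];
	if (!e)
		return -1;
	e->in_use = 1;
	e->pgid = pgid;
	e->len = len;
	memcpy(e->osds, osds, (size_t)len * sizeof(*osds));
	return 0;
}
```

Removing pgid 1 with a zero-length record leaves any mapping for pgid 2 in place, which is the behavior the old "complete enumeration" assumption got wrong.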
-
Committed by Sage Weil
We need to apply the modulo pg_num calculation before looking up a pgid in the pg_temp mapping rbtree. This fixes pg_temp mappings, and fixes (some) misdirected requests that result in messages like [WRN] client4104 10.0.1.219:0/275025290 misdirected client4104.1:129 0.1 to osd0 not [1,0] in e11/11 on the server and make the client block without getting a reply (at least until the pg_temp mapping goes away, but that can take a long, long time). Reorder calc_pg_raw() a bit to make more sense. Signed-off-by: Sage Weil <sage@newdream.net>
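A rough sketch of "fold the placement seed into the pg_num range first, then look it up". The `stable_mod()` helper is modeled on the stable-mod routine in the Ceph sources; `calc_mask()` and `pg_temp_lookup_key()` are hypothetical names for this illustration, and the exact kernel code may differ.

```c
#include <assert.h>

/*
 * Map x into [0, b) using a power-of-two mask, so that most values
 * keep their slot when b grows. bmask is the smallest (2^n - 1) that
 * covers 0..b-1.
 */
static int stable_mod(int x, int b, int bmask)
{
	if ((x & bmask) < b)
		return x & bmask;
	return x & (bmask >> 1);
}

/* Smallest (2^n - 1) mask covering the values 0..pg_num-1. */
static int calc_mask(int pg_num)
{
	int mask = 1;

	assert(pg_num > 0);
	while (mask < pg_num)
		mask <<= 1;
	return mask - 1;
}

/*
 * Key to use for the pg_temp lookup: the raw seed must be reduced
 * BEFORE the lookup, otherwise the key never matches the stored pgid.
 */
static int pg_temp_lookup_key(int raw_ps, int pg_num)
{
	return stable_mod(raw_ps, pg_num, calc_mask(pg_num));
}
```

With `pg_num = 6`, a raw seed of 14 folds to 2, so a lookup keyed on the raw value 14 would miss a pg_temp entry stored under 2.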
-
- 25 May, 2011 1 commit
-
-
Committed by Sage Weil
Old incrementals encode a 0 value (nearly always) when an osd goes down. Change that to allow any state bit(s) to be flipped. Special-case 0 to mean flip the CEPH_OSD_UP bit to mimic the old behavior. Signed-off-by: Sage Weil <sage@newdream.net>
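The bit-flip rule can be sketched as an XOR with a legacy fallback. The bit values `OSD_EXISTS` and `OSD_UP` and the function name `apply_state_delta()` are assumptions made for this illustration.

```c
#include <stdint.h>

#define OSD_EXISTS 0x1  /* assumed bit value for this sketch */
#define OSD_UP     0x2  /* assumed bit value for this sketch */

/*
 * Apply an incremental state record to an OSD's state byte.
 * A delta of 0 is the legacy encoding and means "toggle the up bit";
 * any other delta toggles exactly the bits that are set in it.
 */
static uint8_t apply_state_delta(uint8_t state, uint8_t delta)
{
	if (delta == 0)
		delta = OSD_UP;
	return state ^ delta;
}
```

An OSD in state `OSD_EXISTS | OSD_UP` that receives a legacy 0 record ends up as `OSD_EXISTS` only, which is the old "marked down" behavior.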
-
- 20 May, 2011 1 commit
-
-
Committed by Sage Weil
Signed-off-by: Sage Weil <sage@newdream.net>
-
- 13 Jan, 2011 1 commit
-
-
Committed by Jesper Juhl
Always free memory allocated to 'pi' in net/ceph/osdmap.c::osdmap_decode(). Signed-off-by: Jesper Juhl <jj@chaosbits.net> Signed-off-by: Sage Weil <sage@newdream.net>
-
- 21 Oct, 2010 2 commits
-
-
Committed by Yehuda Sadeh
This factors out protocol and low-level storage parts of ceph into a separate libceph module living in net/ceph and include/linux/ceph. This is mostly a matter of moving files around. However, a few key pieces of the interface change as well: - ceph_client becomes ceph_fs_client and ceph_client, where the latter captures the mon and osd clients, and the fs_client gets the mds client and file system specific pieces. - Mount option parsing and debugfs setup is correspondingly broken into two pieces. - The mon client gets a generic handler callback for otherwise unknown messages (mds map, in this case). - The basic supported/required feature bits can be expanded (and are by ceph_fs_client). No functional change, aside from some subtle error handling cases that got cleaned up in the refactoring process. Signed-off-by: Sage Weil <sage@newdream.net>
-
Committed by Yehuda Sadeh
Implement a pool lookup by name. This will be used by rbd. Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net> Signed-off-by: Sage Weil <sage@newdream.net>
-
- 04 Aug, 2010 1 commit
-
-
Committed by Sage Weil
Signed-off-by: Sage Weil <sage@newdream.net>
-
- 03 Aug, 2010 1 commit
-
-
Committed by Sage Weil
The pool info contains a vector for snap_info_t, not snap ids. This fixes the broken decoding, which would declare the update corrupt when a pool snapshot was created. Signed-off-by: Sage Weil <sage@newdream.net>
-
- 02 Aug, 2010 2 commits
-
-
Committed by Sage Weil
Include the crush_ruleset in the error message. Signed-off-by: Sage Weil <sage@newdream.net>
-
Committed by Yehuda Sadeh
Mainly fixing minor issues reported by sparse. Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net> Signed-off-by: Sage Weil <sage@newdream.net>
-
- 24 Jul, 2010 1 commit
-
-
Committed by Sage Weil
Free the ceph_pg_mapping structs when they are removed from the pg_temp rbtree. Also fix a leak in the __insert_pg_mapping() error path. Signed-off-by: Sage Weil <sage@newdream.net>
-
- 08 Jul, 2010 1 commit
-
-
Committed by Dan Carpenter
We leak a "pi" on this error path. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Sage Weil <sage@newdream.net>
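A common way to avoid this class of leak is to route every failure through one cleanup label. The sketch below is a userspace illustration only: `struct pool_info` and `decode_pool()` are hypothetical, the record layout (two host-endian 32-bit values) is invented for the example, and `malloc`/`free` stand in for the kernel allocators.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified pool record for this sketch. */
struct pool_info {
	uint32_t id;
	uint32_t pg_num;
};

/*
 * Decode one pool record from [p, end). Returns a heap-allocated
 * object the caller must free(), or NULL on short input, a zero
 * pg_num, or allocation failure.
 */
static struct pool_info *decode_pool(const unsigned char *p,
				     const unsigned char *end)
{
	struct pool_info *pi = malloc(sizeof(*pi));

	if (!pi)
		return NULL;
	if ((size_t)(end - p) < 2 * sizeof(uint32_t))
		goto bad;
	memcpy(&pi->id, p, sizeof(pi->id));
	memcpy(&pi->pg_num, p + sizeof(pi->id), sizeof(pi->pg_num));
	if (pi->pg_num == 0)
		goto bad;
	return pi;

bad:
	/* Single exit for every failure after allocation: pi is always freed. */
	free(pi);
	return NULL;
}
```

Because both failure checks jump to the same label, adding a third check later cannot reintroduce the leak as long as it also uses `goto bad`.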
-
- 18 Jun, 2010 1 commit
-
-
Committed by Sage Weil
If the incremental osdmap has a new crush map, advance the position after decoding so that we can parse the rest of the osdmap properly. Signed-off-by: Sage Weil <sage@newdream.net>
-
- 30 May, 2010 1 commit
-
-
Committed by Julia Lawall
Use ERR_CAST(x) rather than ERR_PTR(PTR_ERR(x)). The former makes the purpose of the operation clearer; the latter otherwise looks like a no-op. In the case of fs/ceph/inode.c, ERR_CAST is not needed, because the type of the returned value is the same as the type of the enclosing function. The semantic patch that makes this change is as follows (http://coccinelle.lip6.fr/): // <smpl> @@ type T; T x; identifier f; @@ T f (...) { <+... - ERR_PTR(PTR_ERR(x)) + x ...+> } @@ expression x; @@ - ERR_PTR(PTR_ERR(x)) + ERR_CAST(x) // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Sage Weil <sage@newdream.net>
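To show what ERR_CAST buys, here is a userspace approximation of the kernel's error-pointer helpers. The helper bodies are simplified re-creations written for this sketch (the real ones live in the kernel's err.h), and `decode_crush()` / `decode_osdmap()` with their `fail` parameter are hypothetical.

```c
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Simplified userspace re-creations of the kernel helpers. */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Pass an error pointer through unchanged, only dropping its type. */
static inline void *ERR_CAST(const void *ptr)
{
	return (void *)ptr;
}

struct crush_map { int dummy; };
struct osdmap { struct crush_map *crush; };

/* Hypothetical inner decoder: fails with -EINVAL when asked to. */
static struct crush_map *decode_crush(int fail)
{
	static struct crush_map map;

	return fail ? ERR_PTR(-EINVAL) : &map;
}

/* Hypothetical outer decoder with a DIFFERENT return type. */
static struct osdmap *decode_osdmap(int fail)
{
	static struct osdmap map;
	struct crush_map *c = decode_crush(fail);

	if (IS_ERR(c))
		return ERR_CAST(c);  /* same error code, new pointer type */
	map.crush = c;
	return &map;
}
```

`ERR_CAST(c)` and `ERR_PTR(PTR_ERR(c))` yield the same pointer value; the former states the intent (propagate the error across a type boundary) directly.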
-
- 12 May, 2010 1 commit
-
-
Committed by Sage Weil
OSD requests need to be resubmitted on any pg mapping change, not just when the pg primary changes. Resending only when the primary changes results in occasional 'hung' requests during osd cluster recovery or rebalancing. Signed-off-by: Sage Weil <sage@newdream.net>
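The difference between the two resend criteria can be sketched as two predicates over the list of OSDs a pg maps to. Both function names are hypothetical, and the sketch assumes the first array element is the primary.

```c
#include <string.h>

/*
 * Old criterion (for contrast): resend only if the primary, assumed
 * here to be the first OSD in the list, has changed.
 */
static int primary_changed(const int *old_osds, int old_n,
			   const int *new_osds, int new_n)
{
	if (old_n == 0 || new_n == 0)
		return old_n != new_n;
	return old_osds[0] != new_osds[0];
}

/*
 * New criterion: resend if ANYTHING about the mapping differs,
 * including its size, a replica, or the ordering.
 */
static int mapping_changed(const int *old_osds, int old_n,
			   const int *new_osds, int new_n)
{
	if (old_n != new_n)
		return 1;
	if (old_n == 0)
		return 0;
	return memcmp(old_osds, new_osds, (size_t)old_n * sizeof(int)) != 0;
}
```

For a change from `[1,0]` to `[1,2]` the first predicate reports nothing to do while the second requests a resend, which is the case the commit message describes as producing hung requests.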
-
- 10 Apr, 2010 1 commit
-
-
Committed by Sage Weil
Teach the client to decode an updated format for the osdmap. The new format includes pool names, which will be useful shortly. Get this change in earlier rather than later. Signed-off-by: Sage Weil <sage@newdream.net>
-
- 30 Mar, 2010 1 commit
-
-
Committed by Tejun Heo
include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities so that they include those headers directly instead of assuming availability. As this conversion needs to touch a large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the following. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and tries to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have a fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually widely available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build tests were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
-
- 23 Mar, 2010 1 commit
-
-
Committed by Sage Weil
The incremental map decoding of pg pool updates wasn't skipping the snaps and removed_snaps vectors. This caused osd requests to stall when pool snapshots were created or fs snapshots were deleted. Use a common helper for full and incremental map decoders that decodes pools properly. Signed-off-by: Sage Weil <sage@newdream.net>
-
- 02 Mar, 2010 1 commit
-
-
Committed by Sage Weil
Add missing pointer dereference (p is a void **). Signed-off-by: Sage Weil <sage@newdream.net>
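Decoders of this style take a pointer to the caller's cursor so they can advance it, which is why forgetting one level of dereference is an easy mistake. The following is a minimal userspace sketch; `decode_u32()` is a hypothetical helper, not the kernel's decode macro.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Decode a little-endian u32 at the caller's cursor and advance it.
 * p points TO the cursor, so the data is at *p, not at p.
 * Returns 0 on success, -1 if fewer than 4 bytes remain (in which
 * case the cursor is left unchanged).
 */
static int decode_u32(const void **p, const void *end, uint32_t *out)
{
	const unsigned char *c;

	assert(p != NULL && *p != NULL && out != NULL);
	c = *p;  /* the dereference that is easy to forget */
	if ((size_t)((const unsigned char *)end - c) < 4)
		return -1;
	*out = (uint32_t)c[0] |
	       (uint32_t)c[1] << 8 |
	       (uint32_t)c[2] << 16 |
	       (uint32_t)c[3] << 24;
	*p = c + 4;  /* advance the caller's cursor past the value */
	return 0;
}
```

Comparing or reading through `p` instead of `*p` would operate on the address of the cursor variable rather than on the buffer, which compiles without complaint when everything is `void *`.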
-
- 18 Feb, 2010 2 commits
-
-
Committed by Sage Weil
Since we can now create and destroy pg pools, the pool ids will be sparse, and an array no longer makes sense for looking up by pool id. Use an rbtree instead. The OSDMap encoding also no longer has a max pool count (previously used to allocate the array). There is a new pool_max, which is the largest pool id we've ever used, although we don't actually need it in the client. Signed-off-by: Sage Weil <sage@newdream.net>
-
Committed by Sage Weil
Also move _lookup_pg_mapping into a helper. Signed-off-by: Sage Weil <sage@newdream.net>
-
- 12 Feb, 2010 1 commit
-
-
Committed by Sage Weil
Also verify encoding version as we go. Signed-off-by: Sage Weil <sage@newdream.net>
-
- 26 Jan, 2010 1 commit
-
-
Committed by Sage Weil
Signed-off-by: Sage Weil <sage@newdream.net>
-
- 22 Dec, 2009 3 commits
-
-
Committed by Sage Weil
An incremental pg_temp wasn't being decoded properly (wrong bound on for loop). Also remove unused local variable, while we're at it. Signed-off-by: Sage Weil <sage@newdream.net>
-
Committed by Sage Weil
Both osdmap_decode() and osdmap_apply_incremental() should never return NULL. Signed-off-by: Sage Weil <sage@newdream.net>
-
Committed by Sage Weil
Also, print fsid using standard format, NOT hex dump. Signed-off-by: Sage Weil <sage@newdream.net>
-
- 10 Dec, 2009 1 commit
-
-
Committed by Sage Weil
Do not feed bad (large) device ids to CRUSH. Signed-off-by: Sage Weil <sage@newdream.net>
-
- 08 Nov, 2009 1 commit
-
-
Committed by Sage Weil
Make the integer hash function a property of the bucket it is used on. This allows us to gracefully add support for new hash functions without starting from scratch. Signed-off-by: Sage Weil <sage@newdream.net>
-
- 07 Nov, 2009 2 commits
-
-
Committed by Sage Weil
The object will be hashed to a placement seed (ps) based on the pg_pool's hash function. This allows new hashes to be introduced into an existing object store, or selection of a hash appropriate to the objects that will be stored in a particular pool. Signed-off-by: Sage Weil <sage@newdream.net>
-
Committed by Sage Weil
No ceph prefix. Signed-off-by: Sage Weil <sage@newdream.net>
-
- 05 Nov, 2009 1 commit
-
-
Committed by Sage Weil
The endian conversions don't quite work with the old union ceph_pg. Just make it a regular struct, and make each field __le. This is simpler and it has the added bonus of actually working. Signed-off-by: Sage Weil <sage@newdream.net>
-
- 04 Nov, 2009 1 commit
-
-
Committed by Sage Weil
We exchange struct ceph_entity_addr over the wire and store it on disk. The sockaddr_storage.ss_family field, however, is in host endianness. So, fix ss_family endianness to big endian when sending/receiving over the wire. Signed-off-by: Sage Weil <sage@newdream.net>
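The idea of pinning a 16-bit address family to big-endian byte order on the wire can be sketched with explicit byte shuffling. `encode_family()` and `decode_family()` are hypothetical helpers for this illustration; the kernel change itself converts the field in place rather than using functions like these.

```c
#include <stdint.h>

/*
 * Write a 16-bit address family in big-endian (network) order,
 * independent of the host's native byte order.
 */
static void encode_family(uint16_t family, unsigned char out[2])
{
	out[0] = (unsigned char)(family >> 8);
	out[1] = (unsigned char)(family & 0xff);
}

/* Read a big-endian 16-bit address family back into host order. */
static uint16_t decode_family(const unsigned char in[2])
{
	return (uint16_t)((in[0] << 8) | in[1]);
}
```

Because the wire bytes are defined by position rather than by the host's integer layout, a little-endian and a big-endian peer both see the same family value.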
-
- 31 Oct, 2009 1 commit
-
-
Committed by Noah Watkins
Commit 645a1025 fixes calculation of object offset for layouts with multiple stripes per object. This updates the calculation of the length written to take into account multiple stripes per object. Signed-off-by: Noah Watkins <noah@noahdesu.com> Signed-off-by: Sage Weil <sage@newdream.net>
-
- 29 Oct, 2009 3 commits
-
-
Committed by Sage Weil
We were incorrectly calculating the object offset. If we have multiple stripe units per object, we need to shift to the start of the current su in addition to the offset within the su. Also rename bno to ono (object number) to avoid some variable naming confusion. Signed-off-by: Sage Weil <sage@newdream.net>
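The offset calculation described here can be sketched as a small mapping function from a file offset to an object number and an offset within that object. This is an illustration under assumptions: `struct layout`, `struct extent` and `map_file_extent()` are hypothetical names, and the arithmetic follows the usual striping scheme (stripe units dealt round-robin across `sc` objects, `object_size / su` stripe units per object) rather than quoting the kernel function.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout description: sizes in bytes. */
struct layout {
	uint32_t su;           /* stripe unit size */
	uint32_t sc;           /* stripe count (objects per object set) */
	uint32_t object_size;  /* multiple of su */
};

struct extent {
	uint64_t ono;    /* object number */
	uint64_t oxoff;  /* offset within the object */
	uint64_t oxlen;  /* bytes that fit in the current stripe unit */
};

static struct extent map_file_extent(const struct layout *l,
				     uint64_t off, uint64_t len)
{
	struct extent e;
	uint64_t su_per_object, bl, stripeno, stripepos, objsetno, su_off;

	assert(l->su > 0 && l->sc > 0 && l->object_size >= l->su);
	su_per_object = l->object_size / l->su;
	bl = off / l->su;          /* index of the stripe unit in the file */
	stripeno = bl / l->sc;     /* which stripe (row) */
	stripepos = bl % l->sc;    /* which object within the row */
	objsetno = stripeno / su_per_object;
	su_off = off % l->su;      /* offset inside the stripe unit */

	e.ono = objsetno * l->sc + stripepos;
	/* Start of the current su inside the object, plus offset within it. */
	e.oxoff = (stripeno % su_per_object) * l->su + su_off;
	/* An extent never crosses the end of the current stripe unit. */
	e.oxlen = len < l->su - su_off ? len : l->su - su_off;
	return e;
}
```

With `su = 4`, `sc = 2` and `object_size = 8`, file offset 9 lands in object 0 at offset 5: the second stripe unit of that object starts at 4, and the byte sits 1 into it. Using only `off % su` would give offset 1, which is the bug the commit describes.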
-
Committed by Sage Weil
The object extent offset is the file offset _modulo_ the stripe unit. The code was correct, the comment was wrong. Reported-by: Noah Watkins <jayhawk@soe.ucsc.edu> Signed-off-by: Sage Weil <sage@newdream.net>
-
Committed by Noah Watkins
Use the stripe unit size already calculated and saved on the stack to avoid a redundant call to le32_to_cpu. Signed-off-by: Noah Watkins <noah@noahdesu.com> Signed-off-by: Sage Weil <sage@newdream.net>
-
- 20 Oct, 2009 1 commit
-
-
Committed by Sage Weil
Mix the preferred osd (if any) into the placement seed that is fed into the CRUSH object placement calculation. This prevents all the placement pgs from peering with the same osds. Rev the osd client protocol with this change. Signed-off-by: Sage Weil <sage@newdream.net>
-
- 15 Oct, 2009 1 commit
-
-
Committed by Sage Weil
This avoids the fugly pass by reference and makes the code a bit easier to read. Signed-off-by: Sage Weil <sage@newdream.net>
-
- 10 Oct, 2009 1 commit
-
-
Committed by Sage Weil
Return an error and report a corrupt map instead of crying BUG(). Signed-off-by: Sage Weil <sage@newdream.net>
-