- 03 Jun, 2011 2 commits
-
-
Committed by Artem Bityutskiy
UBIFS leaks memory on the error path in 'ubifs_jnl_update()' when a write fails, because it forgets to free the 'struct ubifs_dent_node *dent' object. Although the object is small, alignment can make it large, e.g. 2KiB if the min. I/O unit is 2KiB. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com> Cc: stable@kernel.org
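The leak pattern described above can be pictured with a small user-space sketch. It is not the UBIFS code: `update_journal()`, `fake_write()`, `ALIGN_UP` and the `live_allocs` counter are hypothetical names invented for illustration. It shows how a small entry grows to the min. I/O unit size after alignment, and how a single exit label frees the buffer on both the success and the failure path.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Round len up to a multiple of align (align must be a power of two). */
#define ALIGN_UP(len, align) (((len) + (align) - 1) & ~((size_t)(align) - 1))

int live_allocs; /* hypothetical counter of buffers not yet freed */

/* Hypothetical stand-in for a flash write; fails when asked to. */
static int fake_write(const void *buf, size_t len, int fail)
{
    (void)buf;
    (void)len;
    return fail ? -EIO : 0;
}

/* Allocates an aligned entry buffer and releases it on every exit path. */
int update_journal(size_t dent_len, size_t min_io, int fail)
{
    size_t aligned;
    void *dent;
    int err;

    assert(min_io != 0 && (min_io & (min_io - 1)) == 0);
    aligned = ALIGN_UP(dent_len, min_io); /* e.g. 60 bytes become 2048 */
    dent = malloc(aligned);
    if (!dent)
        return -ENOMEM;
    live_allocs++;

    err = fake_write(dent, aligned, fail);
    if (err)
        goto out_free; /* the error path must not skip the free */

    /* ... further work on success would go here ... */

out_free:
    free(dent);
    live_allocs--;
    return err;
}
```

Calling `update_journal(60, 2048, 1)` in this sketch returns `-EIO` and leaves `live_allocs` at zero, which is the property the fix restores.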
-
Committed by Artem Bityutskiy
Sometimes the VM asks the shrinker to return the number of objects it can shrink, and we return 'ubifs_clean_zn_cnt' in that case. However, this counter can be negative for a short period of time because of the way the UBIFS TNC code updates it, and I sometimes observe the following warning: shrink_slab: ubifs_shrinker+0x0/0x2b7 [ubifs] negative objects to delete nr=-8541616642706119788 This patch makes sure UBIFS never returns a negative count of objects. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com> Cc: stable@kernel.org
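A minimal user-space sketch of the idea follows. The names `clean_zn_cnt` and `shrinker_count()` are hypothetical, and clamping to zero is an assumption made for this example; the actual patch may report a different non-negative value.

```c
#include <assert.h>

/* Hypothetical global counter; other code may briefly drive it below zero. */
long clean_zn_cnt;

/*
 * Query path of a shrinker-like callback: nr_to_scan == 0 means
 * "report how many objects could be freed". Never report a negative value.
 */
long shrinker_count(long nr_to_scan)
{
    long cnt = clean_zn_cnt; /* read once so the check and the return agree */

    assert(nr_to_scan >= 0);
    if (nr_to_scan == 0)
        return cnt > 0 ? cnt : 0;
    /* A real shrinker would free up to nr_to_scan objects here. */
    return 0;
}
```

Reading the counter into a local variable first matters in a concurrent setting, because the value could otherwise change between the comparison and the return.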
-
- 01 Jun, 2011 4 commits
-
-
Committed by Artem Bityutskiy
Unfortunately, the recovery fix d1606a59b6be4ea392eabd40d1250aa1eeb19efb (UBIFS: fix extremely rare mount failure) broke recovery. That commit makes UBIFS drop the last min. I/O unit in all journal heads, but this is needed only for the GC head, and it does not work for non-GC heads. For example, suppose we have min. I/O units A and B, where A contains a valid node X, which was fsynced, followed by a group of nodes Y which spans the rest of A and B. In this case we'll drop not only Y, but also X, which is obviously incorrect. This patch fixes the issue and additionally makes recovery drop the last min. I/O unit only for the GC head, leaving things as they have been for ages for the other heads, which is safer. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
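The A/B example can be pictured with a little offset arithmetic. This is a deliberately simplified model and not the UBIFS recovery algorithm: `keep_up_to()`, the `KIND_*` constants and the concrete sizes are assumptions chosen for illustration. Here `bad_offs` marks where the unreadable group Y begins.

```c
#include <assert.h>
#include <stddef.h>

enum { KIND_GC, KIND_OTHER }; /* hypothetical journal head kinds */

/*
 * Returns the offset up to which data in the LEB is kept.
 * bad_offs is where the first unreadable node group starts.
 */
size_t keep_up_to(int head, size_t bad_offs, size_t min_io)
{
    assert(min_io != 0);
    if (head == KIND_GC)
        return bad_offs - bad_offs % min_io; /* drop the whole last unit */
    return bad_offs; /* keep everything written before the bad group */
}
```

With a 2048-byte unit, X occupying bytes 0..511 of unit A, and Y starting at offset 512, the GC-style rule keeps nothing (offset 0) and so would discard X, while the other rule keeps the first 512 bytes and preserves X.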
-
Committed by Artem Bityutskiy
Instead of passing a "grouped" parameter to 'ubifs_recover_leb()', which tells whether the nodes are grouped in the LEB to recover, pass the journal head number and let 'ubifs_recover_leb()' look at the journal head's 'grouped' flag. This patch is a preparation for a further fix where we'll need to know the journal head number for other purposes. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
-
Committed by Artem Bityutskiy
Journal heads differ in how UBIFS writes nodes to them. All normal journal heads receive grouped nodes, while the GC journal head receives ungrouped nodes. This patch adds a 'grouped' flag to 'struct ubifs_jhead' which describes this property. This patch is a preparation for a further recovery fix. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
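A per-head flag of this kind might look roughly like the sketch below. The structure, the head numbering and the helper names are illustrative assumptions, not the definitions from the UBIFS sources.

```c
#include <assert.h>
#include <stdbool.h>

/* Head numbers are illustrative; only the GC head is special here. */
enum { HEAD_GC, HEAD_BASE, HEAD_DATA, HEAD_COUNT };

struct jhead {
    bool grouped; /* true if nodes written to this head are grouped */
};

struct jhead heads[HEAD_COUNT];

/* Normal heads receive grouped nodes, the GC head receives ungrouped ones. */
void init_heads(void)
{
    for (int i = 0; i < HEAD_COUNT; i++)
        heads[i].grouped = (i != HEAD_GC);
}

/* A caller that knows only the head number can look the property up. */
bool head_is_grouped(int jhead)
{
    assert(jhead >= 0 && jhead < HEAD_COUNT);
    return heads[jhead].grouped;
}
```

Storing the property with the head means a function that receives only a head number can derive the flag itself, and the same number stays available for other decisions.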
-
Committed by Artem Bityutskiy
Commit ab51afe05273741f72383529ef488aa1ea598ec6 was a good clean-up, but it introduced a regression: UBIFS now prints scary error messages during recovery on all corrupted nodes, even though the corruptions are expected (due to a power cut). This patch fixes the issue. Additionally, fix a typo in a comment introduced by the same commit. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
-
- 30 May, 2011 34 commits
-
-
Committed by Tyler Hicks
Now that ecryptfs_lookup_interpose() is no longer using ecryptfs_header_cache_2 to read in metadata, the kmem_cache can be removed and the ecryptfs_header_cache_1 kmem_cache can be renamed to ecryptfs_header_cache. Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
-
Committed by Tyler Hicks
ecryptfs_lookup_interpose() has turned into spaghetti code over the years. This is an effort to clean it up. - Shorten overly descriptive variable names such as ecryptfs_dentry - Simplify gotos and error paths - Create helper function for reading plaintext i_size from metadata It also includes an optimization when reading i_size from the metadata. A complete page-sized kmem_cache_alloc() was being done to read in 16 bytes of metadata. The buffer for that is now statically declared. Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
-
Committed by Tyler Hicks
Instead of having the calling functions translate the true/false return code to either 0 or -EINVAL, have contains_ecryptfs_marker() return 0 or -EINVAL so that the calling functions can just reuse the return code. Also, rename the function to ecryptfs_validate_marker() to avoid callers mistakenly thinking that it returns true/false codes. Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
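The calling convention can be sketched in user space as follows. `validate_marker()`, `read_header()`, the `DEMO_MAGIC` constant and the XOR check are assumptions made for this example and are not taken from the eCryptfs sources; the point is only that the validator returns 0 or `-EINVAL` and the caller passes that code through unchanged.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAGIC 0x3c81b7f5u /* hypothetical marker constant */

/* Read a 32-bit big-endian value from a byte buffer. */
static uint32_t get_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Returns 0 if the 8-byte marker is consistent, -EINVAL otherwise. */
int validate_marker(const unsigned char *data)
{
    uint32_t m1, m2;

    assert(data != NULL);
    m1 = get_be32(data);
    m2 = get_be32(data + 4);
    return ((m1 ^ DEMO_MAGIC) == m2) ? 0 : -EINVAL;
}

/* A caller can pass the result straight through without translating it. */
int read_header(const unsigned char *data)
{
    int rc = validate_marker(data);

    if (rc)
        return rc;
    /* ... parsing of the rest of the header would go here ... */
    return 0;
}
```

Naming the function "validate" rather than "contains" signals the errno-style return, so a reader does not expect `if (validate_marker(...))` to mean "marker present".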
-
Committed by Tyler Hicks
Only unlock and d_add() new inodes after the plaintext inode size has been read from the lower filesystem. This fixes a race condition that was sometimes seen during a multi-job kernel build in an eCryptfs mount. https://bugzilla.kernel.org/show_bug.cgi?id=36002 Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com> Reported-by: David <david@unsolicited.net> Tested-by: David <david@unsolicited.net>
-
Committed by Al Viro
Commit 1495f230 ("vmscan: change shrinker API by passing shrink_control struct") changed the API of ->shrink(), but missed ubifs and cifs instances. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Boaz Harrosh
Implement pg_test vector to test for max IO sizes. We calculate a max_io_size member only once and cache it in the lseg, so as not to do so on every page insert. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> [simplify logic] Signed-off-by: Benny Halevy <bhalevy@panasas.com>
-
Committed by Boaz Harrosh
By default, unless pnfs is used, coalesce pages until pg_bsize (rsize or wsize) is reached. pnfs layout drivers define their own pg_test methods that use pnfs_generic_pg_test and need to define their own I/O size limits (e.g. based on the file stripe size). [Move a check from nfs_pageio_do_add_request to nfs_generic_pg_test] Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com>
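The default coalescing rule can be sketched as a size check performed before each request is added. The structure and function names below are hypothetical simplifications, not the NFS client's actual descriptor or pg_test signature.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical descriptor: bytes gathered so far and the size limit. */
struct pageio_desc {
    size_t count; /* bytes already coalesced into the current I/O */
    size_t bsize; /* limit, e.g. rsize or wsize */
};

/* Default test: accept the request only if the limit is not exceeded. */
bool generic_pg_test(const struct pageio_desc *desc, size_t req_bytes)
{
    return desc->count + req_bytes <= desc->bsize;
}

/* Adds a request, or reports that the current I/O must be flushed first. */
bool add_request(struct pageio_desc *desc, size_t req_bytes)
{
    assert(desc != NULL);
    if (!generic_pg_test(desc, req_bytes))
        return false;
    desc->count += req_bytes;
    return true;
}
```

A layout driver with its own limit, such as a stripe size, would substitute a different test function while keeping the same add-or-flush flow.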
-
Committed by Benny Halevy
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
-
Committed by Benny Halevy
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
-
Committed by Benny Halevy
Use common code for pnfs_pageio_init_{read,write} and use a common generic pg_test function. Note that this function always assumes that the layout driver's pg_test method is implemented. [Fix BUG] Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com>
-
Committed by Boaz Harrosh
* Define API for io-engines to report delta_space_used in IOs * Encode the osd-layout specific information of the layoutcommit XDR buffer. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com>
-
Committed by Benny Halevy
Add a layout driver method to encode the layout type specific opaque part of layout commit in-line in the xdr stream. Currently, the pnfs-objects layout driver uses it to encode metadata hints to the MDS and the blocks layout driver to commit provisionally allocated extents to the file. Signed-off-by: Benny Halevy <bhalevy@panasas.com>
-
Committed by Boaz Harrosh
An io_state pre-allocates an error information structure for each possible osd-device that might error during IO. When IO is done, if all was well, the io_state is freed (as today). If the I/O has ended with an error, the io_state is queued on a per-layout err_list. When encode_layoutreturn() is eventually called, each error is properly encoded on the XDR buffer, and only then is the io_state removed from err_list and de-allocated. It is up to the io_engine to fill in the segment that faulted and the type of osd_error that occurred, by calling objlayout_io_set_result() for each failing device. In objio_osd: * Allocate io-error descriptors space as part of io_state * Use generic objlayout error reporting at end of io. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com>
-
Committed by Andy Adamson
Add a layout driver method to encode the layout type specific opaque part of layout return in-line in the xdr stream. Currently the pnfs-objects layout driver uses it to encode i/o error information on LAYOUTRETURN. Signed-off-by: Andy Adamson <andros@netapp.com> [fixup layout header pointer for encode_layoutreturn] Signed-off-by: Benny Halevy <bhalevy@panasas.com>
-
Committed by Benny Halevy
With the objects layout security model, we have object capabilities that are associated with the layout and we anticipate that the server will issue a cb_layoutrecall for any setattr that changes security related attributes (user/group/mode/acl) or truncates the file. Therefore, the layout is returned before issuing the setattr to avoid the anticipated cb_layoutrecall. Signed-off-by: Benny Halevy <bhalevy@panasas.com>
-
Committed by Benny Halevy
NFSv4.1 LAYOUTRETURN implementation. Currently, it does not support layout-type payload encoding. Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> Signed-off-by: Andy Adamson <andros@citi.umich.edu> Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com> Signed-off-by: Fred Isaman <iisaman@citi.umich.edu> Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Marc Eshel <eshel@almaden.ibm.com> Signed-off-by: Zhang Jingwang <zhangjingwang@nrchpc.ac.cn> [call pnfs_return_layout right before pnfs_destroy_layout] [remove assert_spin_locked from pnfs_clear_lseg_list] [remove wait parameter from the layoutreturn path.] [remove return_type field from nfs4_layoutreturn_args] [remove range from nfs4_layoutreturn_args] [no need to send layoutcommit from _pnfs_return_layout] [don't wait on sync layoutreturn] [fix layout stateid in layoutreturn args] [fixed NULL deref in _pnfs_return_layout] [removed reclaim member of nfs4_layoutreturn_args] Signed-off-by: Benny Halevy <bhalevy@panasas.com>
-
Committed by Boaz Harrosh
With the use of the in-kernel osd library, implement read/write of data from/to osd-objects according to information specified in the objects-layout. Support for striping over mirrors with a received stripe_unit. There are however a few constraints: 1. Stripe unit must be a multiple of PAGE_SIZE 2. Stripe length (stripe_unit * number_of_stripes) cannot be bigger than 32 bits. Also support raid-groups and partial-layout. Partial-layout is when not all the groups are received on the line, addressing only a partial range of the file. TODO: Only raid0! raid 4/5/6 support will come at a later stage. An unsupported layout will send IO through the MDS. [Important fallout from the last rebase] Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> [gfp_flags] Signed-off-by: Benny Halevy <bhalevy@panasas.com>
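The two striping constraints can be expressed as a small validation routine. This is a sketch under stated assumptions: `check_striping()` and `DEMO_PAGE_SIZE` are hypothetical, a 4096-byte page is assumed, and the 32-bit limit is read as "the product must fit in an unsigned 32-bit value".

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE 4096u /* assumed page size for this sketch */

/* Returns 0 if the striping parameters are usable, -EINVAL otherwise. */
int check_striping(uint64_t stripe_unit, uint64_t num_stripes)
{
    if (stripe_unit == 0 || num_stripes == 0)
        return -EINVAL;
    if (stripe_unit % DEMO_PAGE_SIZE != 0)
        return -EINVAL; /* constraint 1: multiple of the page size */
    if (stripe_unit > UINT32_MAX / num_stripes)
        return -EINVAL; /* constraint 2: stripe length exceeds 32 bits */
    return 0;
}

/* Small self-check showing the intended results. */
void check_striping_demo(void)
{
    assert(check_striping(65536, 8) == 0);
    assert(check_striping(1000, 8) == -EINVAL);
}
```

Comparing against `UINT32_MAX / num_stripes` instead of multiplying first keeps the check free of overflow even for very large inputs. A layout that fails such a check would, per the message above, fall back to I/O through the MDS.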
-
Committed by Benny Halevy
Non-rpc layout drivers, such as those for objects and blocks, implement their own I/O path and error handling logic. Therefore bypass NFS-based error handling for these layout drivers. [fix lseg ref-count bugs, and null de-refs] [Fall out from: non-rpc layout drivers] Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> [get rid of PNFS_USE_RPC_CODE] [get rid of __nfs4_write_done_cb] [revert useless change in nfs4_write_done_cb] Signed-off-by: Benny Halevy <bhalevy@panasas.com>
-
Committed by Benny Halevy
Allocate and deallocate per-inode private pnfs_layout_hdr in preparation for I/O implementation. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com>
-
Committed by Benny Halevy
[gfp_flags] Signed-off-by: Benny Halevy <bhalevy@panasas.com>
-
Committed by Boaz Harrosh
When a new layout is received in objio_alloc_lseg, all device_ids referenced are retrieved. The device information is queried from the MDS and then the osd_device is looked up in the osd-initiator library. The devices are cached in a per-mount-point list for later use. At unmount all devices are "put" back to the library. objlayout_get_deviceinfo() and objlayout_put_deviceinfo() are the middleware API for retrieving device information given a device_id. TODO: The device cache can get big. Cap its size. Keep an LRU and start to return devices which were not used when the list gets too big, or when allocation of new entries fails. [pnfs-obj: Bugs in new global-device-cache code] Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> [gfp_flags] [use global device cache] [use layout driver in global device cache] Signed-off-by: Benny Halevy <bhalevy@panasas.com>
-
Committed by Boaz Harrosh
objlayout_alloc_lseg prepares an xdr_stream and calls the raid engine's objio_alloc_lseg() to allocate a private pnfs_layout_segment. objio_osd.c::objio_alloc_lseg() uses the passed xdr_stream to decode and store the layout_segment information in an objio_segment struct, using the pnfs_osd_xdr.h API for the actual parsing of the layout xdr. objlayout_free_lseg calls objio_free_lseg() to free the allocated space. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> [gfp_flags] [removed "extern" from function definitions] Signed-off-by: Benny Halevy <bhalevy@panasas.com>
-
Committed by Boaz Harrosh
* Add the fs/nfs/objlayout/pnfs_osd_xdr_cli.c file, which will include the XDR encode/decode implementations for the pNFS client objlayout driver. [Wrong type in comments] Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com>
-
Committed by Benny Halevy
* Define the PNFS_OBJLAYOUT Kconfig option in the nfs master Kconfig file. * Add the objlayout driver to the Kernel's Kbuild system. * Add the fs/nfs/objlayout/Kbuild file for building the objlayoutdriver.ko driver * Define fs/nfs/objlayout/objio_osd.c, register the driver on module initialization and unregister on exit. [pnfs-obj: remove of CONFIG_PNFS fallout] Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> [added "unsure" clause] [depend on NFS_V4_1] Signed-off-by: Benny Halevy <bhalevy@panasas.com>
-
Committed by J. Bruce Fields
A pNFS client auto-negotiates a lot of features (minorversion level, pNFS layout type, etc.). This is convenient, but makes certain kinds of failures hard for a user to detect. For example, if the client falls back on 4.0, or falls back to MDS IO because the user didn't connect to the right iscsi disks before mounting, the only symptoms may be reduced performance, which may not be noticed till long after the actual failure, and may be difficult for a user to diagnose. However, such "failures" may also be perfectly normal in some cases, so we don't want to spam the system logs with them. One approach would be to put some more information into /proc/self/mountstats. Signed-off-by: J. Bruce Fields <bfields@fieldses.org> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [pnfs: add commit client stats] [fixup data types for "ret" variables in pnfs_try_to* inline funcs.] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [fix definition of show_pnfs for !CONFIG_PNFS] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: Fix show_sessions in the not CONFIG_NFS_V4_1 case] There is a build error when CONFIG_NFS_V4 is set but CONFIG_NFS_V4_1 is *not* set. show_sessions() prototype was unbalanced between the two cases. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> [pnfs: super.c remove CONFIG_PNFS] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com>
-
Committed by Benny Halevy
Use recalled range to invalidate particular layout segments in the layout cache. Signed-off-by: Benny Halevy <bhalevy@panasas.com>
-
Committed by Benny Halevy
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
-
Committed by Benny Halevy
Add offset and count parameters to pnfs_update_layout and use them to get the layout in the pageio path. Order cache layout segments in the following order: * offset (ascending) * length (descending) * iomode (RW before READ) Test byte range against the layout segment in use in pnfs_{read,write}_pg_test so as not to coalesce pages not using the same layout segment. [fix lseg ordering] [clean up pnfs_find_lseg lseg arg] [remove unnecessary FIXME] [fix ordering in pnfs_insert_layout] [clean up pnfs_insert_layout] Signed-off-by: Benny Halevy <bhalevy@panasas.com>
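The three-key ordering described above can be written as a comparator. The sketch below sorts an array with `qsort()`, whereas the kernel inserts segments into a list; the `struct lseg` fields and the `DEMO_IOMODE_*` values are assumptions made for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

enum { DEMO_IOMODE_READ = 1, DEMO_IOMODE_RW = 2 }; /* illustrative values */

struct lseg {
    uint64_t offset;
    uint64_t length;
    int iomode;
};

/*
 * qsort() comparator: offset ascending, then length descending,
 * then RW segments before READ segments.
 */
int lseg_cmp(const void *pa, const void *pb)
{
    const struct lseg *a = pa, *b = pb;

    if (a->offset != b->offset)
        return a->offset < b->offset ? -1 : 1;
    if (a->length != b->length)
        return a->length > b->length ? -1 : 1;
    if (a->iomode != b->iomode)
        return a->iomode == DEMO_IOMODE_RW ? -1 : 1;
    return 0;
}

/* Sort a cache of n segments into the order described above. */
void sort_lsegs(struct lseg *segs, size_t n)
{
    assert(segs != NULL);
    qsort(segs, n, sizeof(*segs), lseg_cmp);
}
```

With this ordering, a lookup that scans from the front meets the longest segment at a given offset first, and prefers an RW segment over a READ one covering the same range.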
-
Committed by Benny Halevy
Initialize xdr_stream and xdr_buf using an array of page pointers and length of buffer. Signed-off-by: Benny Halevy <bhalevy@panasas.com>
-
Committed by Benny Halevy
pnfs deviceids are unique per server, per layout type. struct nfs_client is currently used to distinguish deviceids from different nfs servers, yet these may clash between different layout types on the same server. Therefore, use the layout driver associated with each deviceid at insertion time to look it up, unhash, or delete it. Signed-off-by: Benny Halevy <bhalevy@panasas.com>
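A lookup qualified by layout driver as well as by client might look like the sketch below. It uses a flat array and hypothetical structures (`layout_driver`, `client`, `deviceid_node`) instead of the kernel's hashed cache, and assumes a 16-byte opaque device id.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_DEVICEID_SIZE 16 /* assumed opaque id length */

struct layout_driver { int id; }; /* hypothetical layout type */
struct client { int id; };        /* hypothetical server identity */

struct deviceid_node {
    const struct layout_driver *ld;
    const struct client *clp;
    unsigned char id[DEMO_DEVICEID_SIZE];
};

/* Finds a node matching layout driver, client, and device id together. */
const struct deviceid_node *find_deviceid(const struct deviceid_node *tbl,
                                          size_t n,
                                          const struct layout_driver *ld,
                                          const struct client *clp,
                                          const unsigned char *id)
{
    assert(tbl && ld && clp && id);
    for (size_t i = 0; i < n; i++) {
        if (tbl[i].ld == ld && tbl[i].clp == clp &&
            memcmp(tbl[i].id, id, DEMO_DEVICEID_SIZE) == 0)
            return &tbl[i];
    }
    return NULL;
}
```

Two entries with identical id bytes from the same server stay distinct as long as their layout drivers differ, which is the clash the message describes.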
-
Committed by Marc Eshel
Note: This functionality is incomplete, as all layout segments referring to the 'to be removed device id' need to be reaped, and all in-flight I/O drained. [use be32 res in nfs4_callback_devicenotify] [use nfs_client to qualify deviceid for cb_notify_deviceid] [use global deviceid cache for CB_NOTIFY_DEVICEID] [refactor device cache _lookup_deviceid] [refactor device cache _find_get_deviceid] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [Bug in new global-device-cache code] [layout_driver MUST set free_deviceid_node if using dev-cache] Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com>
-
Committed by Tyler Hicks
The eCryptfs inode get, initialization, and dentry interposition code has two separate paths. One is for when dentry interposition is needed after doing things like a mkdir in the lower filesystem and the other is needed after a lookup. Unlocking new inodes and doing a d_add() needs to happen at different times, depending on which type of dentry interposing is being done. This patch cleans up the inode get and initialization code paths and splits them up so that the locking and d_add() differences mentioned above can be handled appropriately in a later patch. Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com> Tested-by: David <david@unsolicited.net>
-
Committed by Benny Halevy
Use the pnfs_layoutdriver_type both as a qualifier for the deviceid, distinguishing deviceids from different layout types on the server, and for freeing the layout-driver allocated structure containing the nfs4_deviceid_node. [BUG in _deviceid_purge_client] [layout_driver MUST set free_deviceid_node if using dev-cache] [let ver < 4.1 compile] Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> [removed EXPORT_SYMBOL_GPL(nfs4_deviceid_purge_client)] Signed-off-by: Benny Halevy <bhalevy@panasas.com>
-
Committed by Tyler Hicks
These functions should live in inode.c since their focus is on inodes and they're primarily used by functions in inode.c. Also does a simple cleanup of ecryptfs_inode_test() and rolls ecryptfs_init_inode() into ecryptfs_inode_set(). Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com> Tested-by: David <david@unsolicited.net>
-