- 04 11月, 2011 1 次提交
-
-
由 Jeff Layton 提交于
This service should not be registered with or unregistered from rpcbind. Signed-off-by: NJeff Layton <jlayton@redhat.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
- 03 11月, 2011 19 次提交
-
-
由 Boaz Harrosh 提交于
The ore need suplied a r4w_get_page/r4w_put_page API from Filesystem so it can get cache pages to read-into when writing parial stripes. Signed-off-by: NBoaz Harrosh <bharrosh@panasas.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Boaz Harrosh 提交于
Finally remove all the old raid engine, which is by now dead code. Signed-off-by: NBoaz Harrosh <bharrosh@panasas.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Boaz Harrosh 提交于
In this patch we are actually moving to the ORE. (Object Raid Engine). objio_state holds a pointer to an ore_io_state. Once we have an ore_io_state at hand we can call the ore for reading/writing. We register on the done path to kick off the nfs io_done mechanism. Again for Ease of reviewing the old code is "#if 0" but is not removed so the diff command works better. The old code will be removed in the next patch. fs/exofs/Kconfig::ORE is modified to also be auto-included if PNFS_OBJLAYOUT is set. Since we now depend on ORE. (See comments in fs/exofs/Kconfig) Signed-off-by: NBoaz Harrosh <bharrosh@panasas.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Boaz Harrosh 提交于
For Ease of reviewing I split the move to ore into 3 parts move to ore 01: ore_layout & ore_components move to ore 02: move to ORE move to ore 03: Remove old raid engine This patch modifies the objio_lseg, layout-segment level and devices and components arrays to use the ORE types. Though it will be removed soon, also the raid engine is modified to actually compile, possibly run, with the new types. So it is the same old raid engine but with some new ORE types. For Ease of reviewing, some of the old code is "#if 0" but is not removed so the diff command works better. The old code will be removed in the 3rd patch. Signed-off-by: NBoaz Harrosh <bharrosh@panasas.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Boaz Harrosh 提交于
* All instances of objlayout_io_state => objlayout_io_res * All instances of state => oir; * All instances of ol_state => oir; Big but nothing to it Signed-off-by: NBoaz Harrosh <bharrosh@panasas.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Boaz Harrosh 提交于
This is part of moving objio_osd to use the ORE. objlayout_io_state had two functions: 1. It was used in the error reporting mechanism at layout_return. This function is kept intact. (Later patch will rename objlayout_io_state => objlayout_io_res) 2. Carrier of rw io members into the objio_read/write_paglist API. This is removed in this patch. The {r,w}data received from NFS are passed directly to the objio_{read,write}_paglist API. The io_engine is now allocating it's own IO state as part of the read/write. The minimal functionality that was part of the generic allocation is passed to the io_engine. So part of this patch is rename of: ios->ol_state.foo => ios->foo At objlayout_{read,write}_done an objlayout_io_state is passed that denotes the result of the IO. (Hence the later name change). If the IO is successful objlayout calls an objio_free_result() API immediately (Which for objio_osd causes the release of the io_state). If the IO ended in an error it is hanged onto until reported in layout_return and is released later through the objio_free_result() API. (All this is not new just renamed and cleaned) Signed-off-by: NBoaz Harrosh <bharrosh@panasas.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Boaz Harrosh 提交于
objlayout driver was always returning PNFS_ATTEMPTED from it's read/write_pagelist operations. Even on error. Fix that. Start by establishing an error return API from io-engine, by not returning ssize_t (length-or-error) but returning "int" 0=OK, 0>Error. And clean up all return types in io-engine. Then if io-engine returned error return PNFS_NOT_ATTEMPTED to generic layer. (With a dprint) Signed-off-by: NBoaz Harrosh <bharrosh@panasas.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Boaz Harrosh 提交于
The EOF calculation was done on .read_pagelist(), cached in objlayout_io_state->eof, and set in objlayout_read_done() into nfs_read_data->res.eof. So set it directly into nfs_read_data->res.eof and avoid the extra member. Signed-off-by: NBoaz Harrosh <bharrosh@panasas.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Rakib Mullick 提交于
When CONFIG_NFS=y and CONFIG_NFS_V3_{,V4}=n we get the following warning. fs/nfs/write.c: In function ‘nfs_writeback_done’: fs/nfs/write.c:1246:21: warning: unused variable ‘server’ Remove the variable 'server' to fix the above warning. Signed-off-by: NRakib Mullick <rakib.mullick@gmail.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Rakib Mullick 提交于
Fix the following unused variable warning. fs/nfs/file.c: In function ‘nfs_file_release’: fs/nfs/file.c:140:17: warning: unused variable ‘dentry’ fs/nfs/file.c: In function ‘nfs_file_read’: fs/nfs/file.c:237:9: warning: unused variable ‘count’ Signed-off-by: NRakib Mullick <rakib.mullick@gmail.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Jeff Moyer 提交于
In testing aio on a fast storage device, I found that the context lock takes up a fair amount of cpu time in the I/O submission path. The reason is that we take it for every I/O submitted (see __aio_get_req). Since we know how many I/Os are passed to io_submit, we can preallocate the kiocbs in batches, reducing the number of times we take and release the lock. In my testing, I was able to reduce the amount of time spent in _raw_spin_lock_irq by .56% (average of 3 runs). The command I used to test this was: aio-stress -O -o 2 -o 3 -r 8 -d 128 -b 32 -i 32 -s 16384 <dev> I also tested the patch with various numbers of events passed to io_submit, and I ran the xfstests aio group of tests to ensure I didn't break anything. Signed-off-by: NJeff Moyer <jmoyer@redhat.com> Cc: Daniel Ehrenberg <dehrenberg@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Lucas De Marchi 提交于
Adding support for poll() in sysctl fs allows userspace to receive notifications of changes in sysctl entries. This adds a infrastructure to allow files in sysctl fs to be pollable and implements it for hostname and domainname. [akpm@linux-foundation.org: s/declare/define/ for definitions] Signed-off-by: NLucas De Marchi <lucas.demarchi@profusion.mobi> Cc: Greg KH <gregkh@suse.de> Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Vasiliy Kulikov 提交于
fd* files are restricted to the task's owner, and other users may not get direct access to them. But one may open any of these files and run any setuid program, keeping opened file descriptors. As there are permission checks on open(), but not on readdir() and read(), operations on the kept file descriptors will not be checked. It makes it possible to violate procfs permission model. Reading fdinfo/* may disclosure current fds' position and flags, reading directory contents of fdinfo/ and fd/ may disclosure the number of opened files by the target task. This information is not sensible per se, but it can reveal some private information (like length of a password stored in a file) under certain conditions. Used existing (un)lock_trace functions to check for ptrace_may_access(), but instead of using EPERM return code from it use EACCES to be consistent with existing proc_pid_follow_link()/proc_pid_readlink() return code. If they differ, attacker can guess what fds exist by analyzing stat() return code. Patched handlers: stat() for fd/*, stat() and read() for fdindo/*, readdir() and lookup() for fd/ and fdinfo/. Signed-off-by: NVasiliy Kulikov <segoon@openwall.com> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Cc: <stable@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Pavel Emelyanov 提交于
On reading sysctl dirs we should return -EISDIR instead of -EINVAL. Signed-off-by: NPavel Emelyanov <xemul@openvz.org> Signed-off-by: NCyrill Gorcunov <gorcunov@openvz.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Phillip Lougher 提交于
Clement Lecigne reports a filesystem which causes a kernel oops in hfs_find_init() trying to dereference sb->ext_tree which is NULL. This proves to be because the filesystem has a corrupted MDB extent record, where the extents file does not fit into the first three extents in the file record (the first blocks). In hfs_get_block() when looking up the blocks for the extent file (HFS_EXT_CNID), it fails the first blocks special case, and falls through to the extent code (which ultimately calls hfs_find_init()) which is in the process of being initialised. Hfs avoids this scenario by always having the extents b-tree fitting into the first blocks (the extents B-tree can't have overflow extents). The fix is to check at mount time that the B-tree fits into first blocks, i.e. fail if HFS_I(inode)->alloc_blocks >= HFS_I(inode)->first_blocks Note, the existing commit 47f365eb ("hfs: fix oops on mount with corrupted btree extent records") becomes subsumed into this as a special case, but only for the extents B-tree (HFS_EXT_CNID), it is perfectly acceptable for the catalog B-Tree file to grow beyond three extents, with the remaining extent descriptors in the extents overfow. This fixes CVE-2011-2203 Reported-by: NClement LECIGNE <clement.lecigne@netasq.com> Signed-off-by: NPhillip Lougher <plougher@redhat.com> Cc: Jeff Mahoney <jeffm@suse.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Namjae Jeon 提交于
Use mpage_readpages() instead of multiple calls to isofs_readpage() to reduce the CPU utilization and make performance higher. Signed-off-by: NNamjae Jeon <linkinjeon@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Jan Kara <jack@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Richard Weinberger 提交于
Since ramfs is hard-selected to "y", the module leftovers make no sense. Signed-off-by: NRichard Weinberger <richard@nod.at> Reviewed-by: NWANG Cong <xiyou.wangcong@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Jiri Kosina 提交于
The case of address space randomization being disabled in runtime through randomize_va_space sysctl is not treated properly in load_elf_binary(), resulting in SIGKILL coming at exec() time for certain PIE-linked binaries in case the randomization has been disabled at runtime prior to calling exec(). Handle the randomize_va_space == 0 case the same way as if we were not supporting .text randomization at all. Based on original patch by H.J. Lu and Josh Boyer. Signed-off-by: NJiri Kosina <jkosina@suse.cz> Cc: Ingo Molnar <mingo@elte.hu> Cc: Russell King <rmk@arm.linux.org.uk> Cc: H.J. Lu <hongjiu.lu@intel.com> Cc: <stable@kernel.org> Tested-by: NJosh Boyer <jwboyer@redhat.com> Acked-by: NNicolas Pitre <nicolas.pitre@linaro.org> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Phillip Lougher 提交于
This commit adds an option to set the device block size used to 4K. By default Squashfs sets the device block size (sb_min_blocksize) to 1K or the smallest block size supported by the block device (if larger). This, because blocks are packed together and unaligned in Squashfs, should reduce latency. This, however, gives poor performance on MTD NAND devices where the optimal I/O size is 4K (even though the devices can support smaller block sizes). Using a 4K device block size may also improve overall I/O performance for some file access patterns (e.g. sequential accesses of files in filesystem order) on all media. Signed-off-by: NPhillip Lougher <phillip@squashfs.org.uk>
-
- 02 11月, 2011 17 次提交
-
-
由 Al Viro 提交于
everything in USER_OBJ gets it via -include user.h Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk> Signed-off-by: NRichard Weinberger <richard@nod.at>
-
由 Sage Weil 提交于
This adds a d_prune dentry operation that is called by the VFS prior to pruning (i.e. unhashing and killing) a hashed dentry from the dcache. Wrap dentry_lru_del() and use the new _prune() helper in the cases where we are about to unhash and kill the dentry. This will be used by Ceph to maintain a flag indicating whether the complete contents of a directory are contained in the dcache, allowing it to satisfy lookups and readdir without addition server communication. Renumber a few DCACHE_* #defines to group DCACHE_OP_PRUNE with the other DCACHE_OP_ bits. Signed-off-by: NSage Weil <sage@newdream.net> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Miklos Szeredi 提交于
Prevent direct modification of i_nlink by making it const and adding a non-const __i_nlink alias. Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz> Tested-by: NToshiyuki Okajima <toshi.okajima@jp.fujitsu.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Miklos Szeredi 提交于
Replace remaining direct i_nlink updates with a new set_nlink() updater function. Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz> Tested-by: NToshiyuki Okajima <toshi.okajima@jp.fujitsu.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Miklos Szeredi 提交于
Replace direct i_nlink updates with the respective updater function (inc_nlink, drop_nlink, clear_nlink, inode_dec_link_count). Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>
-
由 Miklos Szeredi 提交于
alloc_inode() initializes i_nlink to 1. Remove unnecessary re-initialization. Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz> CC: Joern Engel <joern@logfs.org> CC: Prasad Joshi <prasadjoshi.linux@gmail.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Miklos Szeredi 提交于
alloc_inode() initializes i_nlink to 1. Remove unnecessary re-initialization. Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz> CC: Joel Becker <jlbec@evilplan.org> CC: Mark Fasheh <mfasheh@suse.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Miklos Szeredi 提交于
alloc_inode() initializes i_nlink to 1. Remove unnecessary re-initialization. Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz> Acked-by: NDave Kleikamp <dave.kleikamp@oracle.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Miklos Szeredi 提交于
On emergency remount we want to force MS_RDONLY on the super block even if ->remount_fs() failed for some reason. Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz> Tested-by: NToshiyuki Okajima <toshi.okajima@jp.fujitsu.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Andy Whitcroft 提交于
Since the commit below which added O_PATH support to the *at() calls, the error return for readlink/readlinkat for the empty pathname has switched from ENOENT to EINVAL: commit 65cfc672 Author: Al Viro <viro@zeniv.linux.org.uk> Date: Sun Mar 13 15:56:26 2011 -0400 readlinkat(), fchownat() and fstatat() with empty relative pathnames This is both unexpected for userspace and makes readlink/readlinkat inconsistant with all other interfaces; and inconsistant with our stated return for these pathnames. As the readlinkat call does not have a flags parameter we cannot use the AT_EMPTY_PATH approach used in the other calls. Therefore expose whether the original path is infact entry via a new user_path_at_empty() path lookup function. Use this to determine whether to default to EINVAL or ENOENT for failures. Addresses http://bugs.launchpad.net/bugs/817187 [akpm@linux-foundation.org: remove unused getname_flags()] Signed-off-by: NAndy Whitcroft <apw@canonical.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: <stable@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Konstantin Khlebnikov 提交于
put dentry if inode allocation failed, d_genocide() cannot release it Signed-off-by: NKonstantin Khlebnikov <khlebnikov@openvz.org> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Eryu Guan 提交于
Some jbd2 code prints out kernel messages with "JBD2: " prefix, at the same time other jbd2 code prints with "JBD: " prefix. Unify the prefix to "JBD2: ". Signed-off-by: NEryu Guan <guaneryu@gmail.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Eryu Guan 提交于
I hit a J_ASSERT(blocknr != 0) failure in cleanup_journal_tail() when mounting a fsfuzzed ext3 image. It turns out that the corrupted ext3 image has s_first = 0 in journal superblock, and the 0 is passed to journal->j_head in journal_reset(), then to blocknr in cleanup_journal_tail(), in the end the J_ASSERT failed. So validate s_first after reading journal superblock from disk in journal_get_superblock() to ensure s_first is valid. The following script could reproduce it: fstype=ext3 blocksize=1024 img=$fstype.img offset=0 found=0 magic="c0 3b 39 98" dd if=/dev/zero of=$img bs=1M count=8 mkfs -t $fstype -b $blocksize -F $img filesize=`stat -c %s $img` while [ $offset -lt $filesize ] do if od -j $offset -N 4 -t x1 $img | grep -i "$magic";then echo "Found journal: $offset" found=1 break fi offset=`echo "$offset+$blocksize" | bc` done if [ $found -ne 1 ];then echo "Magic \"$magic\" not found" exit 1 fi dd if=/dev/zero of=$img seek=$(($offset+23)) conv=notrunc bs=1 count=1 mkdir -p ./mnt mount -o loop $img ./mnt Cc: Jan Kara <jack@suse.cz> Signed-off-by: NEryu Guan <guaneryu@gmail.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Yongqiang Yang 提交于
The variable 'block' is removed by commit 750c9c47, so use the replacement ex_ee_block instead. Signed-off-by: NYongqiang Yang <xiaoqiangnk@gmail.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Yongqiang Yang 提交于
This patch fixes a syntax error which omits a comma. Besides this, logical block number is unsigend 32 bits, so printk should use %u instead %d. Signed-off-by: NYongqiang Yang <xiaoqiangnk@gmail.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Benny Halevy 提交于
Signed-off-by: NBenny Halevy <bhalevy@tonian.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Eric W. Biederman 提交于
In sysfs_rename we need to remove the optimization of not calling sysfs_unlink_sibling and sysfs_link_sibling if the renamed parent directory is not changing. This optimization is no longer valid now that sysfs dirents are stored in an rbtree sorted by name. Move the assignment of s_ns before the call of sysfs_link_sibling. With no sysfs_dirent fields changing after the call of sysfs_link_sibling this allows sysfs_link_sibling to take any of the directory entries into account when it builds the rbtrees, and s_ns looks like a prime canidate to be used in the rbtree in the future. Signed-off-by: NEric W. Biederman <ebiederm@xmission.com> Cc: Jiri Slaby <jirislaby@gmail.com> Cc: Greg KH <gregkh@suse.de> Cc: David Miller <davem@davemloft.net> Cc: Mikulas Patocka <mpatocka@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 01 11月, 2011 3 次提交
-
-
由 Nelson Elhage 提交于
epoll can acquire recursively acquire ep->mtx on multiple "struct eventpoll"s at once in the case where one epoll fd is monitoring another epoll fd. This is perfectly OK, since we're careful about the lock ordering, but it causes spurious lockdep warnings. Annotate the recursion using mutex_lock_nested, and add a comment explaining the nesting rules for good measure. Recent versions of systemd are triggering this, and it can also be demonstrated with the following trivial test program: --------------------8<-------------------- int main(void) { int e1, e2; struct epoll_event evt = { .events = EPOLLIN }; e1 = epoll_create1(0); e2 = epoll_create1(0); epoll_ctl(e1, EPOLL_CTL_ADD, e2, &evt); return 0; } --------------------8<-------------------- Reported-by: NPaul Bolle <pebolle@tiscali.nl> Tested-by: NPaul Bolle <pebolle@tiscali.nl> Signed-off-by: NNelson Elhage <nelhage@nelhage.com> Acked-by: NJason Baron <jbaron@redhat.com> Cc: Dave Jones <davej@redhat.com> Cc: Davide Libenzi <davidel@xmailserver.org> Cc: <stable@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Andy Shevchenko 提交于
There is no functional change. Signed-off-by: NAndy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: NOGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Joe Perches 提交于
Standardize the style for compiler based printf format verification. Standardized the location of __printf too. Done via script and a little typing. $ grep -rPl --include=*.[ch] -w "__attribute__" * | \ grep -vP "^(tools|scripts|include/linux/compiler-gcc.h)" | \ xargs perl -n -i -e 'local $/; while (<>) { s/\b__attribute__\s*\(\s*\(\s*format\s*\(\s*printf\s*,\s*(.+)\s*,\s*(.+)\s*\)\s*\)\s*\)/__printf($1, $2)/g ; print; }' [akpm@linux-foundation.org: revert arch bits] Signed-off-by: NJoe Perches <joe@perches.com> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-