- 09 5月, 2007 40 次提交
-
-
由 Randy Dunlap 提交于
Remove includes of <linux/smp_lock.h> where it is not used/needed. Suggested by Al Viro. Builds cleanly on x86_64, i386, alpha, ia64, powerpc, sparc, sparc64, and arm (all 59 defconfigs). Signed-off-by: NRandy Dunlap <randy.dunlap@oracle.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Adrian Bunk 提交于
remove_inode_dquot_ref() can now become static. Signed-off-by: NAdrian Bunk <bunk@stusta.de> Acked-by: NJan Kara <jack@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Alexey Dobriyan 提交于
Additions and removal from tty_drivers list were just done as well as iterating on it for /proc/tty/drivers generation. testing: modprobe/rmmod loop of simple module which does nothing but tty_register_driver() vs cat /proc/tty/drivers loop BUG: unable to handle kernel paging request at virtual address 6b6b6b6b printing eip: c01cefa7 *pde = 00000000 Oops: 0000 [#1] PREEMPT last sysfs file: devices/pci0000:00/0000:00:1d.7/usb5/5-0:1.0/bInterfaceProtocol Modules linked in: ohci_hcd af_packet e1000 ehci_hcd uhci_hcd usbcore xfs CPU: 0 EIP: 0060:[<c01cefa7>] Not tainted VLI EFLAGS: 00010297 (2.6.21-rc4-mm1 #4) EIP is at vsnprintf+0x3a4/0x5fc eax: 6b6b6b6b ebx: f6cb50f2 ecx: 6b6b6b6b edx: fffffffe esi: c0354700 edi: f6cb6000 ebp: 6b6b6b6b esp: f31f5e68 ds: 007b es: 007b fs: 00d8 gs: 0033 ss: 0068 Process cat (pid: 31864, ti=f31f4000 task=c1998030 task.ti=f31f4000) Stack: 00000000 c0103f20 c013003a c0103f20 00000000 f6cb50da 0000000a 00000f0e f6cb50f2 00000010 00000014 ffffffff ffffffff 00000007 c0354753 f6cb50f2 f73e39dc f73e39dc 00000001 c0175416 f31f5ed8 f31f5ed4 0ee00000 f32090bc Call Trace: [<c0103f20>] restore_nocheck+0x12/0x15 [<c013003a>] mark_held_locks+0x6d/0x86 [<c0103f20>] restore_nocheck+0x12/0x15 [<c0175416>] seq_printf+0x2e/0x52 [<c0192895>] show_tty_range+0x35/0x1f3 [<c0175416>] seq_printf+0x2e/0x52 [<c0192add>] show_tty_driver+0x8a/0x1d9 [<c01758f6>] seq_read+0x70/0x2ba [<c0175886>] seq_read+0x0/0x2ba [<c018d8e6>] proc_reg_read+0x63/0x9f [<c015e764>] vfs_read+0x7d/0xb5 [<c018d883>] proc_reg_read+0x0/0x9f [<c015eab1>] sys_read+0x41/0x6a [<c0103e4e>] sysenter_past_esp+0x5f/0x99 ======================= Code: 00 8b 4d 04 e9 44 ff ff ff 8d 4d 04 89 4c 24 50 8b 6d 00 81 fd ff 0f 00 00 b8 a4 c1 35 c0 0f 46 e8 8b 54 24 2c 89 e9 89 c8 eb 06 <80> 38 00 74 07 40 4a 83 fa ff 75 f4 29 c8 89 c6 8b 44 24 28 89 EIP: [<c01cefa7>] vsnprintf+0x3a4/0x5fc SS:ESP 0068:f31f5e68 Signed-off-by: NAlexey Dobriyan <adobriyan@sw.ru> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Mark Fasheh 提交于
Remove do_sync_file_range() and convert callers to just use do_sync_mapping_range(). Signed-off-by: NMark Fasheh <mark.fasheh@oracle.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Randy Dunlap 提交于
REISER_FS /proc option needs to depend on PROC_FS. fs/reiserfs/procfs.c: In function 'show_super': fs/reiserfs/procfs.c:134: error: 'reiserfs_proc_info_data_t' has no member named 'max_hash_collisions' fs/reiserfs/procfs.c:134: error: 'reiserfs_proc_info_data_t' has no member named 'breads' fs/reiserfs/procfs.c:135: error: 'reiserfs_proc_info_data_t' has no member named 'bread_miss' fs/reiserfs/procfs.c:135: error: 'reiserfs_proc_info_data_t' has no member named 'search_by_key' fs/reiserfs/procfs.c:136: error: 'reiserfs_proc_info_data_t' has no member named 'search_by_key_fs_changed' fs/reiserfs/procfs.c:136: error: 'reiserfs_proc_info_data_t' has no member named 'search_by_key_restarted' fs/reiserfs/procfs.c:137: error: 'reiserfs_proc_info_data_t' has no member named 'insert_item_restarted' fs/reiserfs/procfs.c:137: error: 'reiserfs_proc_info_data_t' has no member named 'paste_into_item_restarted' fs/reiserfs/procfs.c:138: error: 'reiserfs_proc_info_data_t' has no member named 'cut_from_item_restarted' fs/reiserfs/procfs.c:139: error: 'reiserfs_proc_info_data_t' has no member named 'delete_solid_item_restarted' fs/reiserfs/procfs.c:139: error: 'reiserfs_proc_info_data_t' has no member named 'delete_item_restarted' fs/reiserfs/procfs.c:140: error: 'reiserfs_proc_info_data_t' has no member named 'leaked_oid' fs/reiserfs/procfs.c:140: error: 'reiserfs_proc_info_data_t' has no member named 'leaves_removable' fs/reiserfs/procfs.c: In function 'show_per_level': fs/reiserfs/procfs.c:184: error: 'reiserfs_proc_info_data_t' has no member named 'balance_at' fs/reiserfs/procfs.c:185: error: 'reiserfs_proc_info_data_t' has no member named 'sbk_read_at' fs/reiserfs/procfs.c:186: error: 'reiserfs_proc_info_data_t' has no member named 'sbk_fs_changed' fs/reiserfs/procfs.c:187: error: 'reiserfs_proc_info_data_t' has no member named 'sbk_restarted' fs/reiserfs/procfs.c:188: error: 'reiserfs_proc_info_data_t' has no member named 'free_at' fs/reiserfs/procfs.c:189: error: 'reiserfs_proc_info_data_t' has no member named 'items_at' fs/reiserfs/procfs.c:190: error: 'reiserfs_proc_info_data_t' has no member named 'can_node_be_removed' fs/reiserfs/procfs.c:191: error: 'reiserfs_proc_info_data_t' has no member named 'lnum' fs/reiserfs/procfs.c:192: error: 'reiserfs_proc_info_data_t' has no member named 'rnum' fs/reiserfs/procfs.c:193: error: 'reiserfs_proc_info_data_t' has no member named 'lbytes' fs/reiserfs/procfs.c:194: error: 'reiserfs_proc_info_data_t' has no member named 'rbytes' fs/reiserfs/procfs.c:195: error: 'reiserfs_proc_info_data_t' has no member named 'get_neighbors' fs/reiserfs/procfs.c:196: error: 'reiserfs_proc_info_data_t' has no member named 'get_neighbors_restart' fs/reiserfs/procfs.c:197: error: 'reiserfs_proc_info_data_t' has no member named 'need_l_neighbor' fs/reiserfs/procfs.c:197: error: 'reiserfs_proc_info_data_t' has no member named 'need_r_neighbor' fs/reiserfs/procfs.c: In function 'show_bitmap': fs/reiserfs/procfs.c:224: error: 'reiserfs_proc_info_data_t' has no member named 'free_block' fs/reiserfs/procfs.c:225: error: 'reiserfs_proc_info_data_t' has no member named 'scan_bitmap' fs/reiserfs/procfs.c:226: error: 'reiserfs_proc_info_data_t' has no member named 'scan_bitmap' fs/reiserfs/procfs.c:227: error: 'reiserfs_proc_info_data_t' has no member named 'scan_bitmap' fs/reiserfs/procfs.c:228: error: 'reiserfs_proc_info_data_t' has no member named 'scan_bitmap' fs/reiserfs/procfs.c:229: error: 'reiserfs_proc_info_data_t' has no member named 'scan_bitmap' fs/reiserfs/procfs.c:230: error: 'reiserfs_proc_info_data_t' has no member named 'scan_bitmap' fs/reiserfs/procfs.c:230: error: 'reiserfs_proc_info_data_t' has no member named 'scan_bitmap' fs/reiserfs/procfs.c: In function 'show_journal': fs/reiserfs/procfs.c:384: error: 'reiserfs_proc_info_data_t' has no member named 'journal' fs/reiserfs/procfs.c:385: error: 'reiserfs_proc_info_data_t' has no member named 'journal' fs/reiserfs/procfs.c:386: error: 'reiserfs_proc_info_data_t' has no member named 'journal' fs/reiserfs/procfs.c:387: error: 'reiserfs_proc_info_data_t' has no member named 'journal' fs/reiserfs/procfs.c:388: error: 'reiserfs_proc_info_data_t' has no member named 'journal' fs/reiserfs/procfs.c:389: error: 'reiserfs_proc_info_data_t' has no member named 'journal' fs/reiserfs/procfs.c:390: error: 'reiserfs_proc_info_data_t' has no member named 'journal' fs/reiserfs/procfs.c:391: error: 'reiserfs_proc_info_data_t' has no member named 'journal' fs/reiserfs/procfs.c:392: error: 'reiserfs_proc_info_data_t' has no member named 'journal' fs/reiserfs/procfs.c:393: error: 'reiserfs_proc_info_data_t' has no member named 'journal' fs/reiserfs/procfs.c:394: error: 'reiserfs_proc_info_data_t' has no member named 'journal' fs/reiserfs/procfs.c:395: error: 'reiserfs_proc_info_data_t' has no member named 'journal' fs/reiserfs/procfs.c:395: error: 'reiserfs_proc_info_data_t' has no member named 'journal' fs/reiserfs/procfs.c:395: error: 'reiserfs_proc_info_data_t' has no member named 'journal' fs/reiserfs/procfs.c: In function 'reiserfs_proc_info_init': fs/reiserfs/procfs.c:504: warning: implicit declaration of function '__PINFO' fs/reiserfs/procfs.c:504: error: request for member 'lock' in something not a structure or union fs/reiserfs/procfs.c: In function 'reiserfs_proc_info_done': fs/reiserfs/procfs.c:544: error: request for member 'lock' in something not a structure or union fs/reiserfs/procfs.c:545: error: request for member 'exiting' in something not a structure or union fs/reiserfs/procfs.c:546: error: request for member 'lock' in something not a structure or union Signed-off-by: NRandy Dunlap <randy.dunlap@oracle.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Alexey Dobriyan 提交于
Eternal quest to make while true; do cat /proc/fs/xfs/stat >/dev/null 2>/dev/null; done while true; do find /proc -type f 2>/dev/null | xargs cat >/dev/null 2>/dev/null; done while true; do modprobe xfs; rmmod xfs; done work reliably continues and now kernel oopses in the following way: BUG: unable to handle ... at virtual address 6b6b6b6b EIP is at badness process: cat proc_oom_score proc_info_read sys_fstat64 vfs_read proc_info_read sys_read Failing code is prefetch hidden in list_for_each_entry() in badness(). badness() is reachable from two points. One is proc_oom_score, another is out_of_memory() => select_bad_process() => badness(). Second path grabs tasklist_lock, while first doesn't. Signed-off-by: NAlexey Dobriyan <adobriyan@sw.ru> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Eric Dumazet 提交于
1) Introduces a new method in 'struct dentry_operations'. This method called d_dname() might be called from d_path() to build a pathname for special filesystems. It is called without locks. Future patches (if we succeed in having one common dentry for all pipes/sockets) may need to change prototype of this method, but we now use : char *d_dname(struct dentry *dentry, char *buffer, int buflen); 2) Adds a dynamic_dname() helper function that eases d_dname() implementations 3) Defines d_dname method for sockets : No more sprintf() at socket creation. This is delayed up to the moment someone does an access to /proc/pid/fd/... 4) Defines d_dname method for pipes : No more sprintf() at pipe creation. This is delayed up to the moment someone does an access to /proc/pid/fd/... A benchmark consisting of 1.000.000 calls to pipe()/close()/close() gives a *nice* speedup on my Pentium(M) 1.6 Ghz : 3.090 s instead of 3.450 s Signed-off-by: NEric Dumazet <dada1@cosmosbay.com> Acked-by: NChristoph Hellwig <hch@infradead.org> Acked-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Miklos Szeredi 提交于
Add support for finding out the current file position, open flags and possibly other info in the future. These new entries are added: /proc/PID/fdinfo/FD /proc/PID/task/TID/fdinfo/FD For each fd the information is provided in the following format: pos: 1234 flags: 0100002 [bunk@stusta.de: make struct proc_fdinfo_file_operations static] Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz> Cc: Alexey Dobriyan <adobriyan@sw.ru> Signed-off-by: NAdrian Bunk <bunk@stusta.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Eric Dumazet 提交于
Change the order of fields of struct pid_entry (file fs/proc/base.c) in order to avoid a hole on 64bit archs. (8 bytes saved per object) Also change all pid_entry arrays to be const qualified, to make clear they must not be modified. Before (on x86_64) : # size fs/proc/base.o text data bss dec hex filename 15549 2192 0 17741 454d fs/proc/base.o After : # size fs/proc/base.o text data bss dec hex filename 17229 176 0 17405 43fd fs/proc/base.o Thats 336 bytes saved on kernel size on x86_64 Signed-off-by: NEric Dumazet <dada1@cosmosbay.com> Acked-by: N"Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Kees Cook 提交于
The /proc/pid/ "maps", "smaps", and "numa_maps" files contain sensitive information about the memory location and usage of processes. Issues: - maps should not be world-readable, especially if programs expect any kind of ASLR protection from local attackers. - maps cannot just be 0400 because "-D_FORTIFY_SOURCE=2 -O2" makes glibc check the maps when %n is in a *printf call, and a setuid(getuid()) process wouldn't be able to read its own maps file. (For reference see http://lkml.org/lkml/2006/1/22/150) - a system-wide toggle is needed to allow prior behavior in the case of non-root applications that depend on access to the maps contents. This change implements a check using "ptrace_may_attach" before allowing access to read the maps contents. To control this protection, the new knob /proc/sys/kernel/maps_protect has been added, with corresponding updates to the procfs documentation. [akpm@linux-foundation.org: build fixes] [akpm@linux-foundation.org: New sysctl numbers are old hat] Signed-off-by: NKees Cook <kees@outflux.net> Cc: Arjan van de Ven <arjan@infradead.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Christoph Hellwig 提交于
We don't have a routine called namei() anymore since at least 2.3.x, and the comment is just totally out of sync with the current lookup logic. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Christoph Hellwig 提交于
inode->i_sb is always set, not need to check for it. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Alexey Dobriyan 提交于
WARN_ON(de && de->deleted); is sooo unreliable. Why? proc_lookup remove_proc_entry =========== ================= lock_kernel(); spin_lock(&proc_subdir_lock); [find proc entry] spin_unlock(&proc_subdir_lock); spin_lock(&proc_subdir_lock); [find proc entry] proc_get_inode ============== WARN_ON(de && de->deleted); ... if (!atomic_read(&de->count)) free_proc_entry(de); else de->deleted = 1; So, if you have some strange oops [1], and doesn't see this WARN_ON it means nothing. [1] try_module_get() of module which doesn't exist, two lines below should suffice, or not? Signed-off-by: NAlexey Dobriyan <adobriyan@sw.ru> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Darrick J. Wong 提交于
Fix the following race: proc_readdir remove_proc_entry ============ ================= spin_lock(&proc_subdir_lock); [choose PDE to start filldir from] spin_unlock(&proc_subdir_lock); spin_lock(&proc_subdir_lock); [find PDE] [free PDE, refcount is 0] spin_unlock(&proc_subdir_lock); /* boom */ if (filldir(dirent, de->name, ... [de_put on error path --adobriyan] Signed-off-by: NDarrick J. Wong <djwong@us.ibm.com> Signed-off-by: NAlexey Dobriyan <adobriyan@sw.ru> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Alexey Dobriyan 提交于
proc_lookup remove_proc_entry =========== ================= lock_kernel(); spin_lock(&proc_subdir_lock); [find PDE with refcount 0] spin_unlock(&proc_subdir_lock); spin_lock(&proc_subdir_lock); [find PDE with refcount 0] [check refcount and free PDE] spin_unlock(&proc_subdir_lock); proc_get_inode: de_get(de); /* boom */ Signed-off-by: NAlexey Dobriyan <adobriyan@openvz.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Miklos Szeredi 提交于
There's a slight problem with filesystem type representation in fuse based filesystems. From the kernel's view, there are just two filesystem types: fuse and fuseblk. From the user's view there are lots of different filesystem types. The user is not even much concerned if the filesystem is fuse based or not. So there's a conflict of interest in how this should be represented in fstab, mtab and /proc/mounts. The current scheme is to encode the real filesystem type in the mount source. So an sshfs mount looks like this: sshfs#user@server:/ /mnt/server fuse rw,nosuid,nodev,... This url-ish syntax works OK for sshfs and similar filesystems. However for block device based filesystems (ntfs-3g, zfs) it doesn't work, since the kernel expects the mount source to be a real device name. A possibly better scheme would be to encode the real type in the type field as "type.subtype". So fuse mounts would look like this: /dev/hda1 /mnt/windows fuseblk.ntfs-3g rw,... user@server:/ /mnt/server fuse.sshfs rw,nosuid,nodev,... This patch adds the necessary code to the kernel so that this can be correctly displayed in /proc/mounts. Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Davide Libenzi 提交于
Epoll is doing multiple passes over the ready set at the moment, because of the constraints over the f_op->poll() call. Looking at the code again, I noticed that we already hold the epoll semaphore in read, and this (together with other locking conditions that hold while doing an epoll_wait()) can lead to a smarter way [1] to "ship" events to userspace (in a single pass). This is a stress application that can be used to test the new code. It spwans multiple thread and call epoll_wait() and epoll_ctl() from many threads. Stress tested on my dual Opteron 254 w/out any problems. http://www.xmailserver.org/totalmess.c This is not a benchmark, just something that tries to stress and exploit possible problems with the new code. Also, I made a stupid micro-benchmark: http://www.xmailserver.org/epwbench.c [1] Considering that epoll must be thread-safe, there are five ways we can be hit during an epoll_wait() transfer loop (ep_send_events()): 1) The epoll fd going away and calling ep_free This just can't happen, since we did an fget() in sys_epoll_wait 2) An epoll_ctl(EPOLL_CTL_DEL) This can't happen because epoll_ctl() gets ep->sem in write, and we're holding it in read during ep_send_events() 3) An fd stored inside the epoll fd going away This can't happen because in eventpoll_release_file() we get ep->sem in write, and we're holding it in read during ep_send_events() 4) Another epoll_wait() happening on another thread They both can be inside ep_send_events() at the same time, we get (splice) the ready-list under the spinlock, so each one will get its own ready list. Note that an fd cannot be at the same time inside more than one ready list, because ep_poll_callback() will not re-queue it if it sees it already linked: if (ep_is_linked(&epi->rdllink)) goto is_linked; Another case that can happen, is two concurrent epoll_wait(), coming in with a userspace event buffer of size, say, ten. Suppose there are 50 event ready in the list. The first epoll_wait() will "steal" the whole list, while the second, seeing no events, will go to sleep. But at the end of ep_send_events() in the first epoll_wait(), we will re-inject surplus ready fds, and we will trigger the proper wake_up to the second epoll_wait(). 5) ep_poll_callback() hitting us asyncronously This is the tricky part. As I said above, the ep_is_linked() test done inside ep_poll_callback(), will guarantee us that until the item will result linked to a list, ep_poll_callback() will not try to re-queue it again (read, write data on any of its members). When we do a list_del() in ep_send_events(), the item will still satisfy the ep_is_linked() test (whatever data is written in prev/next, it'll never be its own pointer), so ep_poll_callback() will still leave us alone. It's only after the eventual smp_mb()+INIT_LIST_HEAD(&epi->rdllink) that it'll become visible to ep_poll_callback(), but at the point we're already past it. [akpm@osdl.org: 80 cols] Signed-off-by: NDavide Libenzi <davidel@xmailserver.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Dmitriy Monakhov 提交于
- ext3_dx_find_entry() exit with out setting proper error pointer - do_split() exit with out setting proper error pointer it is realy painful because many callers contain folowing code: de = do_split(handle,dir, &bh, frame, &hinfo, &retval); if (!(de)) return retval; <<< WOW retval wasn't changed by do_split(), so caller failed <<< but return SUCCESS :) - Rearrange do_split() error path. Current error path is realy ugly, all this up and down jump stuff doesn't make code easy to understand. [dmonakhov@sw.ru: fix annoying fake error messages] Signed-off-by: NMonakhov Dmitriy <dmonakhov@openvz.org> Cc: Andreas Dilger <adilger@clusterfs.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: NMonakhov Dmitriy <dmonakhov@openvz.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Badari Pulavarty 提交于
sys_clone() and sys_unshare() both makes copies of nsproxy and its associated namespaces. But they have different code paths. This patch merges all the nsproxy and its associated namespace copy/clone handling (as much as possible). Posted on container list earlier for feedback. - Create a new nsproxy and its associated namespaces and pass it back to caller to attach it to right process. - Changed all copy_*_ns() routines to return a new copy of namespace instead of attaching it to task->nsproxy. - Moved the CAP_SYS_ADMIN checks out of copy_*_ns() routines. - Removed unnessary !ns checks from copy_*_ns() and added BUG_ON() just incase. - Get rid of all individual unshare_*_ns() routines and make use of copy_*_ns() instead. [akpm@osdl.org: cleanups, warning fix] [clg@fr.ibm.com: remove dup_namespaces() declaration] [serue@us.ibm.com: fix CONFIG_IPC_NS=n, clone(CLONE_NEWIPC) retval] [akpm@linux-foundation.org: fix build with CONFIG_SYSVIPC=n] Signed-off-by: NBadari Pulavarty <pbadari@us.ibm.com> Signed-off-by: NSerge Hallyn <serue@us.ibm.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: <containers@lists.osdl.org> Signed-off-by: NCedric Le Goater <clg@fr.ibm.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Nick Piggin 提交于
Petr Tesarik discovered a problem in remove_arg_zero(). He writes: When a script is loaded, load_script() replaces argv[0] with the name of the interpreter and the filename passed to the exec syscall. However, there is no guarantee that the length of the interpreter name plus the length of the filename is greater than the length of the original argv[0]. If the difference happens to cross a page boundary, setup_arg_pages() will call put_dirty_page() [aka install_arg_page()] with an address outside the VMA. Therefore, remove_arg_zero() must free all pages which would be unused after the argument is removed. So, rewrite the remove_arg_zero function without gotos, with a few comments, and with the commonly used explicit index/offset. This fixes the problem and makes it easier to understand as well. [a.p.zijlstra@chello.nl: add comment] Signed-off-by: NNick Piggin <npiggin@suse.de> Cc: Petr Tesarik <ptesarik@suse.cz> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Robert P. J. Day 提交于
Correct the misspelling of the preprocessor check of a Kconfig option to refer to CONFIG_REISERFS_PROC_INFO and not just the incorrect REISERFS_PROC_INFO. Signed-off-by: NRobert P. J. Day <rpjday@mindspring.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Alexey Dobriyan 提交于
This makes in-core superblock fit into one cacheline here. Before: struct dentry * xattr_root; /* 124 4 */ /* --- cacheline 1 boundary (128 bytes) --- */ struct rw_semaphore xattr_dir_sem; /* 128 12 */ int j_errno; /* 140 4 */ }; /* size: 144, cachelines: 2 */ /* sum members: 142, holes: 1, sum holes: 2 */ /* last cacheline: 16 bytes */ After: int j_errno; /* 124 4 */ /* --- cacheline 1 boundary (128 bytes) --- */ }; /* size: 128, cachelines: 1 */ /* sum members: 126, holes: 1, sum holes: 2 */ Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com> Cc: <reiserfs-dev@namesys.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Dmitriy Monakhov 提交于
sb_read may return NULL, let's explicitly check it. If so free new bitmap blocks array, after this we may safely exit as it done above during bitmap allocation. Signed-off-by: NDmitriy Monakhov <dmonakhov@openvz.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Dmitriy Monakhov 提交于
sb_read may return NULL, so let's explicitly check it. Signed-off-by: NDmitriy Monakhov <dmonakhov@openvz.org> Acked-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Vignesh Babu BM 提交于
Replace (n & (n-1)) in the context of power of 2 checks with is_power_of_2 Signed-off-by: Nvignesh babu <vignesh.babu@wipro.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Vignesh Babu BM 提交于
Replace (n & (n-1)) in the context of power of 2 checks with is_power_of_2 Signed-off-by: Nvignesh babu <vignesh.babu@wipro.com> Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Vignesh Babu BM 提交于
Replacing (n & (n-1)) in the context of power of 2 checks with is_power_of_2 Signed-off-by: Nvignesh babu <vignesh.babu@wipro.com> Acked-by: NOGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Florin Malita 提交于
Currently, devpts doesn't generate an fsnotify event upon pts creation because the regular vfs paths aren't involved. Deallocation, on the other hand, correctly generates a nameremove event thanks to the d_delete() invocation in devpts_pty_kill(). This patch adds the missing fsnotify_create() trigger in devpts_pty_new(). Signed-off-by: NFlorin Malita <fmalita@gmail.com> Acked-by: NH. Peter Anvin <hpa@zytor.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Chris Snook 提交于
Add SEEK_MAX and use it to validate lseek arguments from userspace. Signed-off-by: NChris Snook <csnook@redhat.com> Acked-by: NDavid Howells <dhowells@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Chris Snook 提交于
Convert magic numbers to SEEK_* values from fs.h Signed-off-by: NChris Snook <csnook@redhat.com> Acked-by: NDavid Howells <dhowells@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Andrew Morton 提交于
Teach the dentry slab shrinker to aggressively shrink parent dentries when shrinking the dentry cache. This is done to attempt to improve the situation where the dentry slab cache gets a lot of internal fragmentation due to pages containing directory dentries. It is expected that this change will cause some of those dentries to be reaped earlier, and with less scanning. Needs careful testing. Cc: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Miklos Szeredi 提交于
The time shrink_dcache_parent() takes, grows quadratically with the depth of the tree under 'parent'. This starts to get noticable at about 10,000. These kinds of depths don't occur normally, and filesystems which invoke shrink_dcache_parent() via d_invalidate() seem to have other depth dependent timings, so it's not even easy to expose this problem. However with FUSE it's easy to create a deep tree and d_invalidate() will also get called. This can make a syscall hang for a very long time. This is the original discovery of the problem by Russ Cox: http://article.gmane.org/gmane.comp.file-systems.fuse.devel/3826 The following patch fixes the quadratic behavior, by optionally allowing prune_dcache() to prune ancestors of a dentry in one go, instead of doing it one at a time. Common code in dput() and prune_one_dentry() is extracted into a new helper function d_kill(). shrink_dcache_parent() as well as shrink_dcache_sb() are converted to use the ancestry-pruner option. Only for shrink_dcache_memory() is this behavior not desirable, so it keeps using the old algorithm. Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Maneesh Soni <maneesh@in.ibm.com> Acked-by: N"Paul E. McKenney" <paulmck@us.ibm.com> Cc: Dipankar Sarma <dipankar@in.ibm.com> Cc: Neil Brown <neilb@suse.de> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 William Cohen 提交于
This past week I was playing around with that pahole tool (http://oops.ghostprotocols.net:81/acme/dwarves/) and looking at the size of various struct in the kernel. I was surprised by the size of the task_struct on x86_64, approaching 4K. I looked through the fields in task_struct and found that a number of them were declared as "unsigned long" rather than "unsigned int" despite them appearing okay as 32-bit sized fields. On x86_64 "unsigned long" ends up being 8 bytes in size and forces 8 byte alignment. Is there a reason there a reason they are "unsigned long"? The patch below drops the size of the struct from 3808 bytes (60 64-byte cachelines) to 3760 bytes (59 64-byte cachelines). A couple other fields in the task struct take a signficant amount of space: struct thread_struct thread; 688 struct held_lock held_locks[30]; 1680 CONFIG_LOCKDEP is turned on in the .config [akpm@linux-foundation.org: fix printk warnings] Cc: <linux-arch@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Markus Rechberger 提交于
Taken from http://bugzilla.kernel.org/show_bug.cgi?id=5079 signed long ranges from -2.147.483.648 to 2.147.483.647 on x86 32bit 10000011110110100100111110111101 .. -2,082,844,739 10000011110110100100111110111101 .. 2,212,122,557 <- this currently gets stored on the disk but when converting it to a 64bit signed long value it loses its sign and becomes positive. Cc: Andreas Dilger <adilger@dilger.ca> Cc: <linux-ext4@vger.kernel.org> Andreas says: This patch is now treating timestamps with the high bit set as negative times (before Jan 1, 1970). This means we lose 1/2 of the possible range of timestamps (lopping off 68 years before unix timestamp overflow - now only 30 years away :-) to handle the extremely rare case of setting timestamps into the distant past. If we are only interested in fixing the underflow case, we could just limit the values to 0 instead of storing negative values. At worst this will skew the timestamp by a few hours for timezones in the far east (files would still show Jan 1, 1970 in "ls -l" output). That said, it seems 32-bit systems (mine at least) allow files to be set into the past (01/01/1907 works fine) so it seems this patch is bringing the x86_64 behaviour into sync with other kernels. On the plus side, we have a patch that is ready to add nanosecond timestamps to ext3 and as an added bonus adds 2 high bits to the on-disk timestamp so this extends the maximum date to 2242. Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Alexey Dobriyan 提交于
/proc/$PID/fd has r-x------ permissions, so if process does setuid(), it will not be able to access /proc/*/fd/. This breaks fstatat() emulation in glibc. open("foo", O_RDONLY|O_DIRECTORY) = 4 setuid32(65534) = 0 stat64("/proc/self/fd/4/bar", 0xbfafb298) = -1 EACCES (Permission denied) Signed-off-by: NAlexey Dobriyan <adobriyan@openvz.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: James Morris <jmorris@namei.org> Cc: Chris Wright <chrisw@sous-sol.org> Cc: Ulrich Drepper <drepper@redhat.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Acked-By: NKirill Korotaev <dev@openvz.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Andrew Morton 提交于
block_write_full_page() forgot to propagate ENPSOC into the address_space. Cc: Guillaume Chazarain <guichaz@yahoo.fr> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Guillaume Chazarain 提交于
Cleanup: setting an outstanding error on a mapping was open coded too many times. Factor it out in mapping_set_error(). Signed-off-by: NGuillaume Chazarain <guichaz@yahoo.fr> Cc: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Jeff Dike 提交于
hostfs needed some style goodness. Signed-off-by: NJeff Dike <jdike@linux.intel.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Alberto Bertogli 提交于
This patch allows hostfs_setattr() to work on unlinked open files by calling set_attr() (the userspace part) with the inode's fd. Without this, applications that depend on doing attribute changes to unlinked open files will fail. It works by using the fd versions instead of the path ones (for example fchmod() instead of chmod(), fchown() instead of chown()) when an fd is available. Signed-off-by: NAlberto Bertogli <albertito@gmail.com> Signed-off-by: NJeff Dike <jdike@linux.intel.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Dmitriy Monakhov 提交于
[akpm@linux-foundation.org: cleanup] Signed-off-by: NMonakhov Dmitriy <dmonakhov@openvz.org> Cc: Christoph Hellwig <hch@lst.de> Acked-by: NAnton Altaparmakov <aia21@cam.ac.uk> Acked-by: NDavid Chinner <dgc@sgi.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-