- 07 3月, 2010 21 次提交
-
-
由 Neil Horman 提交于
Modify uid check in do_coredump so as to not apply it in the case of pipes. This just got noticed in testing. The end of do_coredump validates the uid of the inode for the created file against the uid of the crashing process to ensure that no one can pre-create a core file with different ownership and grab the information contained in the core when they shouldn' tbe able to. This causes failures when using pipes for a core dumps if the crashing process is not root, which is the uid of the pipe when it is created. The fix is simple. Since the check for matching uid's isn't relevant for pipes (a process can't create a pipe that the uermodehelper code will open anyway), we can just just skip it in the event ispipe is non-zero Reverts a pipe-affecting change which was accidentally made in : commit c46f739d : Author: Ingo Molnar <mingo@elte.hu> : AuthorDate: Wed Nov 28 13:59:18 2007 +0100 : Commit: Linus Torvalds <torvalds@woody.linux-foundation.org> : CommitDate: Wed Nov 28 10:58:01 2007 -0800 : : vfs: coredumping fix Signed-off-by: NNeil Horman <nhorman@tuxdriver.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Oleg Nesterov 提交于
User visible change. do_coredump() kills all threads which share the same ->mm but only the coredumping process gets the proper exit_code. Other tasks which share the same ->mm die "silently" and return status == 0 to parent. This is historical behaviour, not actually a bug. But I think Frank Heckenbach rightly dislikes the current behaviour. Simple test-case: #include <stdio.h> #include <unistd.h> #include <signal.h> #include <sys/wait.h> int main(void) { int stat; if (!fork()) { if (!vfork()) kill(getpid(), SIGQUIT); } wait(&stat); printf("stat=%x\n", stat); return 0; } Before this patch it prints "stat=0" despite the fact the child was killed by SIGQUIT. After this patch the output is "stat=3" which obviously makes more sense. Even with this patch, only the task which originates the coredumping gets "|= 0x80" if the core was actually dumped, but at least the coredumping signal is visible to do_wait/etc. Reported-by: NFrank Heckenbach <f.heckenbach@fh-soft.de> Signed-off-by: NOleg Nesterov <oleg@redhat.com> Acked-by: NWANG Cong <xiyou.wangcong@gmail.com> Cc: Roland McGrath <roland@redhat.com> Cc: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Masami Hiramatsu 提交于
Pass mm->flags as a coredump parameter for consistency. --- 1787 if (mm->core_state || !get_dumpable(mm)) { <- (1) 1788 up_write(&mm->mmap_sem); 1789 put_cred(cred); 1790 goto fail; 1791 } 1792 [...] 1798 if (get_dumpable(mm) == 2) { /* Setuid core dump mode */ <-(2) 1799 flag = O_EXCL; /* Stop rewrite attacks */ 1800 cred->fsuid = 0; /* Dump root private */ 1801 } --- Since dumpable bits are not protected by lock, there is a chance to change these bits between (1) and (2). To solve this issue, this patch copies mm->flags to coredump_params.mm_flags at the beginning of do_coredump() and uses it instead of get_dumpable() while dumping core. This copy is also passed to binfmt->core_dump, since elf*_core_dump() uses dump_filter bits in mm->flags. [akpm@linux-foundation.org: fix merge] Signed-off-by: NMasami Hiramatsu <mhiramat@redhat.com> Acked-by: NRoland McGrath <roland@redhat.com> Cc: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Reviewed-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Daisuke HATAYAMA 提交于
The current ELF dumper implementation can produce broken corefiles if program headers exceed 65535. This number is determined by the number of vmas which the process have. In particular, some extreme programs may use more than 65535 vmas. (If you google max_map_count, you can find some users facing this problem.) This kind of program never be able to generate correct coredumps. This patch implements ``extended numbering'' that uses sh_info field of the first section header instead of e_phnum field in order to represent upto 4294967295 vmas. This is supported by AMD64-ABI(http://www.x86-64.org/documentation.html) and Solaris(http://docs.sun.com/app/docs/doc/817-1984/). Of course, we are preparing patches for gdb and binutils. Signed-off-by: NDaisuke HATAYAMA <d.hatayama@jp.fujitsu.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: David Howells <dhowells@redhat.com> Cc: Greg Ungerer <gerg@snapgear.com> Cc: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Andi Kleen <andi@firstfloor.org> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: <linux-arch@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Daisuke HATAYAMA 提交于
By the next patch, elf_core_dump() and elf_fdpic_core_dump() will support extended numbering and so will produce the corefiles with section header table in a special case. The problem is the process of writing a file header offset of the section header table into e_shoff field of the ELF header. ELF header is positioned at the beginning of the corefile, while section header at the end. So, we need to take which of the following ways: 1. Seek backward to retry writing operation for ELF header after writing process for a whole part 2. Make offset calculation process and writing process totally sequential The clause 1. is not always possible: one cannot assume that file system supports seek function. Consider the no_llseek case. Therefore, this patch adopts the clause 2. Signed-off-by: NDaisuke HATAYAMA <d.hatayama@jp.fujitsu.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: David Howells <dhowells@redhat.com> Cc: Greg Ungerer <gerg@snapgear.com> Cc: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Andi Kleen <andi@firstfloor.org> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: <linux-arch@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Daisuke HATAYAMA 提交于
elf_core_dump() and elf_fdpic_core_dump() use #ifdef and the corresponding macro for hiding _multiline_ logics in functions. This patch removes #ifdef and replaces ELF_CORE_EXTRA_* by corresponding functions. For architectures not implemeonting ELF_CORE_EXTRA_*, we use weak functions in order to reduce a range of modification. This cleanup is for my next patches, but I think this cleanup itself is worth doing regardless of my firnal purpose. Signed-off-by: NDaisuke HATAYAMA <d.hatayama@jp.fujitsu.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: David Howells <dhowells@redhat.com> Cc: Greg Ungerer <gerg@snapgear.com> Cc: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Andi Kleen <andi@firstfloor.org> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: <linux-arch@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Daisuke HATAYAMA 提交于
My next patch will replace ELF_CORE_EXTRA_* macros by functions, putting them into other newly created *.c files. Then, each files will contain dump_write(), where each pair of binfmt_*.c and elfcore.c should be the same. So, this patch moves them into a header file with dump_seek(). Also, the patch deletes confusing DUMP_WRITE macros in each files. Signed-off-by: NDaisuke HATAYAMA <d.hatayama@jp.fujitsu.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: David Howells <dhowells@redhat.com> Cc: Greg Ungerer <gerg@snapgear.com> Cc: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Andi Kleen <andi@firstfloor.org> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: <linux-arch@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Daisuke HATAYAMA 提交于
The current ELF dumper can produce broken corefiles if program headers exceed 65535. In particular, the program in 64-bit environment often demands more than 65535 mmaps. If you google max_map_count, then you can find many users facing this problem. Solaris has already dealt with this issue, and other OSes have also adopted the same method as in Solaris. Currently, Sun's document and AMD 64 ABI include the description for the extension, where they call the extension Extended Numbering. See Reference for further information. I believe that linux kernel should adopt the same way as they did, so I've written this patch. I am also preparing for patches of GDB and binutils. How to fix ========== In new dumping process, there are two cases according to weather or not the number of program headers is equal to or more than 65535. - if less than 65535, the produced corefile format is exactly the same as the ordinary one. - if equal to or more than 65535, then e_phnum field is set to newly introduced constant PN_XNUM(0xffff) and the actual number of program headers is set to sh_info field of the section header at index 0. Compatibility Concern ===================== * As already mentioned in Summary, Sun and AMD64 has already adopted this. See Reference. * There are four combinations according to whether kernel and userland tools are respectively modified or not. The next table summarizes shortly for each combination. --------------------------------------------- Original Kernel | Modified Kernel --------------------------------------------- < 65535 | >= 65535 | < 65535 | >= 65535 ------------------------------------------------------------- Original Tools | OK | broken | OK | broken (#) ------------------------------------------------------------- Modified Tools | OK | broken | OK | OK ------------------------------------------------------------- Note that there is no case that `OK' changes to `broken'. (#) Although this case remains broken, O-M behaves better than O-O. That is, while in O-O case e_phnum field would be extremely small due to integer overflow, in O-M case it is guaranteed to be at least 65535 by being set to PN_XNUM(0xFFFF), much closer to the actual correct value than the O-O case. Test Program ============ Here is a test program mkmmaps.c that is useful to produce the corefile with many mmaps. To use this, please take the following steps: $ ulimit -c unlimited $ sysctl vm.max_map_count=70000 # default 65530 is too small $ sysctl fs.file-max=70000 $ mkmmaps 65535 Then, the program will abort and a corefile will be generated. If failed, there are two cases according to the error message displayed. * ``out of memory'' means vm.max_map_count is still smaller * ``too many open files'' means fs.file-max is still smaller So, please change it to a larger value, and then retry it. mkmmaps.c == #include <stdio.h> #include <stdlib.h> #include <sys/mman.h> #include <fcntl.h> #include <unistd.h> int main(int argc, char **argv) { int maps_num; if (argc < 2) { fprintf(stderr, "mkmmaps [number of maps to be created]\n"); exit(1); } if (sscanf(argv[1], "%d", &maps_num) == EOF) { perror("sscanf"); exit(2); } if (maps_num < 0) { fprintf(stderr, "%d is invalid\n", maps_num); exit(3); } for (; maps_num > 0; --maps_num) { if (MAP_FAILED == mmap((void *)NULL, (size_t) 1, PROT_READ, MAP_SHARED | MAP_ANONYMOUS, (int) -1, (off_t) NULL)) { perror("mmap"); exit(4); } } abort(); { char buffer[128]; sprintf(buffer, "wc -l /proc/%u/maps", getpid()); system(buffer); } return 0; } Tested on i386, ia64 and um/sys-i386. Built on sh4 (which covers fs/binfmt_elf_fdpic.c) References ========== - Sun microsystems: Linker and Libraries. Part No: 817-1984-17, September 2008. URL: http://docs.sun.com/app/docs/doc/817-1984 - System V ABI AMD64 Architecture Processor Supplement Draft Version 0.99., May 11, 2009. URL: http://www.x86-64.org/ This patch: There are three different definitions for dump_seek() functions in binfmt_aout.c, binfmt_elf.c and binfmt_elf_fdpic.c, respectively. The only for binfmt_elf.c. My next patch will move dump_seek() into a header file in order to share the same implementations for dump_write() and dump_seek(). As the first step, this patch unify these three definitions for dump_seek() by applying the past commits that have been applied only for binfmt_elf.c. Specifically, the modification made here is part of the following commits: * d025c9db * 7f14daa1 This patch does not change a shape of corefiles. Signed-off-by: NDaisuke HATAYAMA <d.hatayama@jp.fujitsu.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: David Howells <dhowells@redhat.com> Cc: Greg Ungerer <gerg@snapgear.com> Cc: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Andi Kleen <andi@firstfloor.org> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: <linux-arch@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Alexey Dobriyan 提交于
* warn if creation goes on to non-existent directory * warn if removal goes on from non-existing directory * warn if non-existing proc entry is removed Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Alexey Dobriyan 提交于
remove_proc_entry() does lock lookup parent unlock lock unlink proc entry from lists unlock which can be made bit more correct by doing parent translation + unlink without dropping lock. Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Andrew Morton 提交于
fs/compat_ioctl.c: In function 'do_ioctl_trans': fs/compat_ioctl.c:534: warning: 'karg' may be used uninitialized in this function fs/compat_ioctl.c:533: warning: 'kcmd' may be used uninitialized in this function fs/compat_ioctl.c:656: warning: 'ret' may be used uninitialized in this function Reduces text size by 44 bytes. If someone calls one of these functions with an unexpected argument, the code's buggy as-is. Amerigo Wang <amwang@redhat.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Acked-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Don Mullis 提交于
Build list_sort() only for configs that need it -- those that don't save ~581 bytes (i386). Signed-off-by: NDon Mullis <don.mullis@gmail.com> Cc: Dave Airlie <airlied@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Dave Chinner <david@fromorbit.com> Cc: Artem Bityutskiy <dedekind@infradead.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Michael Neuling 提交于
Currently we create the initial stack based on the PAGE_SIZE. This is unnecessary. This creates this initial stack independent of the PAGE_SIZE. It also bumps up the number of 4k pages allocated from 20 to 32, to align with 64K page systems. Signed-off-by: NMichael Neuling <mikey@neuling.org> Cc: Helge Deller <deller@gmx.de> Reviewed-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Americo Wang <xiyou.wangcong@gmail.com> Cc: Anton Blanchard <anton@samba.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Jiri Slaby 提交于
Make sure compiler won't do weird things with limits. E.g. fetching them twice may return 2 different values after writable limits are implemented. I.e. either use rlimit helpers added in commit 3e10e716 ("resource: add helpers for fetching rlimits") or ACCESS_ONCE if not applicable. Signed-off-by: NJiri Slaby <jslaby@suse.cz> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Rik van Riel 提交于
The old anon_vma code can lead to scalability issues with heavily forking workloads. Specifically, each anon_vma will be shared between the parent process and all its child processes. In a workload with 1000 child processes and a VMA with 1000 anonymous pages per process that get COWed, this leads to a system with a million anonymous pages in the same anon_vma, each of which is mapped in just one of the 1000 processes. However, the current rmap code needs to walk them all, leading to O(N) scanning complexity for each page. This can result in systems where one CPU is walking the page tables of 1000 processes in page_referenced_one, while all other CPUs are stuck on the anon_vma lock. This leads to catastrophic failure for a benchmark like AIM7, where the total number of processes can reach in the tens of thousands. Real workloads are still a factor 10 less process intensive than AIM7, but they are catching up. This patch changes the way anon_vmas and VMAs are linked, which allows us to associate multiple anon_vmas with a VMA. At fork time, each child process gets its own anon_vmas, in which its COWed pages will be instantiated. The parents' anon_vma is also linked to the VMA, because non-COWed pages could be present in any of the children. This reduces rmap scanning complexity to O(1) for the pages of the 1000 child processes, with O(N) complexity for at most 1/N pages in the system. This reduces the average scanning cost in heavily forking workloads from O(N) to 2. The only real complexity in this patch stems from the fact that linking a VMA to anon_vmas now involves memory allocations. This means vma_adjust can fail, if it needs to attach a VMA to anon_vma structures. This in turn means error handling needs to be added to the calling functions. A second source of complexity is that, because there can be multiple anon_vmas, the anon_vma linking in vma_adjust can no longer be done under "the" anon_vma lock. To prevent the rmap code from walking up an incomplete VMA, this patch introduces the VM_LOCK_RMAP VMA flag. This bit flag uses the same slot as the NOMMU VM_MAPPED_COPY, with an ifdef in mm.h to make sure it is impossible to compile a kernel that needs both symbolic values for the same bitflag. Some test results: Without the anon_vma changes, when AIM7 hits around 9.7k users (on a test box with 16GB RAM and not quite enough IO), the system ends up running >99% in system time, with every CPU on the same anon_vma lock in the pageout code. With these changes, AIM7 hits the cross-over point around 29.7k users. This happens with ~99% IO wait time, there never seems to be any spike in system time. The anon_vma lock contention appears to be resolved. [akpm@linux-foundation.org: cleanups] Signed-off-by: NRik van Riel <riel@redhat.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Larry Woodman <lwoodman@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com> Cc: Minchan Kim <minchan.kim@gmail.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Wu Fengguang 提交于
We'll introduce FMODE_RANDOM which will be runtime modified. So protect all runtime modification to f_mode with f_lock to avoid races. Signed-off-by: NWu Fengguang <fengguang.wu@intel.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@infradead.org> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Chuck Lever <chuck.lever@oracle.com> Cc: <stable@kernel.org> [2.6.33.x] Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 KAMEZAWA Hiroyuki 提交于
A frequent questions from users about memory management is what numbers of swap ents are user for processes. And this information will give some hints to oom-killer. Besides we can count the number of swapents per a process by scanning /proc/<pid>/smaps, this is very slow and not good for usual process information handler which works like 'ps' or 'top'. (ps or top is now enough slow..) This patch adds a counter of swapents to mm_counter and update is at each swap events. Information is exported via /proc/<pid>/status file as [kamezawa@bluextal memory]$ cat /proc/self/status Name: cat State: R (running) Tgid: 2910 Pid: 2910 PPid: 2823 TracerPid: 0 Uid: 500 500 500 500 Gid: 500 500 500 500 FDSize: 256 Groups: 500 VmPeak: 82696 kB VmSize: 82696 kB VmLck: 0 kB VmHWM: 432 kB VmRSS: 432 kB VmData: 172 kB VmStk: 84 kB VmExe: 48 kB VmLib: 1568 kB VmPTE: 40 kB VmSwap: 0 kB <=============== this. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Reviewed-by: NMinchan Kim <minchan.kim@gmail.com> Reviewed-by: NChristoph Lameter <cl@linux-foundation.org> Cc: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 KAMEZAWA Hiroyuki 提交于
Considering the nature of per mm stats, it's the shared object among threads and can be a cache-miss point in the page fault path. This patch adds per-thread cache for mm_counter. RSS value will be counted into a struct in task_struct and synchronized with mm's one at events. Now, in this patch, the event is the number of calls to handle_mm_fault. Per-thread value is added to mm at each 64 calls. rough estimation with small benchmark on parallel thread (2threads) shows [before] 4.5 cache-miss/faults [after] 4.0 cache-miss/faults Anyway, the most contended object is mmap_sem if the number of threads grows. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Minchan Kim <minchan.kim@gmail.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 KAMEZAWA Hiroyuki 提交于
Presently, per-mm statistics counter is defined by macro in sched.h This patch modifies it to - defined in mm.h as inlinf functions - use array instead of macro's name creation. This patch is for reducing patch size in future patch to modify implementation of per-mm counter. Signed-off-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Reviewed-by: NMinchan Kim <minchan.kim@gmail.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Akinobu Mita 提交于
Rename for_each_bit to for_each_set_bit in the kernel source tree. To permit for_each_clear_bit(), should that ever be added. The patch includes a macro to map the old for_each_bit() onto the new for_each_set_bit(). This is a (very) temporary thing to ease the migration. [akpm@linux-foundation.org: add temporary for_each_bit()] Suggested-by: NAlexey Dobriyan <adobriyan@gmail.com> Suggested-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NAkinobu Mita <akinobu.mita@gmail.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Russell King <rmk@arm.linux.org.uk> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Artem Bityutskiy <dedekind@infradead.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Al Viro 提交于
We managed to lose O_DIRECTORY testing due to a stupid typo in commit 1f36f774 ("Switch !O_CREAT case to use of do_last()") Reported-by: NWalter Sheets <w41ter@gmail.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 06 3月, 2010 15 次提交
-
-
由 Aneesh Kumar K.V 提交于
For regular file and directories we put the link count in th extension field in a tagged string format. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NEric Van Hensbergen <ericvh@gmail.com>
-
由 Sripathi Kodi 提交于
Removes 'dotu' variable and make everything dependent on 'proto_version' field. Signed-off-by: NSripathi Kodi <sripathik@in.ibm.com> Signed-off-by: NEric Van Hensbergen <ericvh@gmail.com>
-
由 Sripathi Kodi 提交于
Add 9P2000.u and 9P2010.L protocol flags to V9FS VFS This patch adds 9P2000.u and 9P2010.L protocol flags into V9FS VFS side code and removes the single flag used for 'extended'. Signed-off-by: NSripathi Kodi <sripathik@in.ibm.com> Signed-off-by: NEric Van Hensbergen <ericvh@gmail.com>
-
由 Trond Myklebust 提交于
Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Trond Myklebust 提交于
Remove the redundant call to filemap_write_and_wait(). Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Trond Myklebust 提交于
Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Trond Myklebust 提交于
Now that we have correct COMMIT semantics in writeback_single_inode, we can reduce and simplify nfs_wb_all(). Also replace nfs_wb_nocommit() with a call to filemap_write_and_wait(), which doesn't need to hold the inode->i_mutex. With that done, we can eliminate nfs_write_mapping() altogether. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Trond Myklebust 提交于
In all cases we should be able to just remove the request and call cancel_dirty_page(). Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Trond Myklebust 提交于
Since nfs_scan_list() doesn't wait for locked pages, we have a race in which it is possible to end up with an inode that needs to send a COMMIT, but which does not have the I_DIRTY_DATASYNC flag set. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Trond Myklebust 提交于
Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com> Acked-by: NPeter Zijlstra <peterz@infradead.org> Acked-by: NWu Fengguang <fengguang.wu@intel.com>
-
由 Trond Myklebust 提交于
If the caller is doing a non-blocking flush, and there are still writebacks pending on the wire, we can usually defer the COMMIT call until those writes are done. Also ensure that we honour the wbc->nonblocking flag. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Trond Myklebust 提交于
In order to know when we should do opportunistic commits of the unstable writes, when the VM is doing a background flush, we add a field to count the number of unstable writes. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Trond Myklebust 提交于
The sole purpose of nfs_write_inode is to commit unstable writes, so move it into fs/nfs/write.c, and make nfs_commit_inode static. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Christoph Hellwig 提交于
This gives the filesystem more information about the writeback that is happening. Trond requested this for the NFS unstable write handling, and other filesystems might benefit from this too by beeing able to distinguish between the different callers in more detail. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Christoph Hellwig 提交于
Similar to the fsync issue fixed a while ago in commit 2daea67e we need to write for data to actually hit the disk before writing out the metadata to guarantee data integrity for filesystems that modify the inode in the data I/O completion path. Currently XFS and NFS handle this manually, and AFS has a write_inode method that does nothing but waiting for data, while others are possibly missing out on this. Fortunately this change has a lot less impact than the fsync change as none of the write_inode methods starts data writeout of any form by itself. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 05 3月, 2010 4 次提交
-
-
由 Phillip Lougher 提交于
Signed-off-by: NPhillip Lougher <phillip@lougher.demon.co.uk>
-
由 Phillip Lougher 提交于
Signed-off-by: NPhillip Lougher <phillip@lougher.demon.co.uk>
-
由 Al Viro 提交于
... and now we have all intents crap well localized Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
Now that nd->last stays around until ->put_link() is called, we can just postpone that ->put_link() in do_filp_open() a bit and don't bother with copying. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-