1. 19 6月, 2009 3 次提交
  2. 17 6月, 2009 5 次提交
    • D
      oom: move oom_adj value from task_struct to mm_struct · 2ff05b2b
      David Rientjes 提交于
      The per-task oom_adj value is a characteristic of its mm more than the
      task itself since it's not possible to oom kill any thread that shares the
      mm.  If a task were to be killed while attached to an mm that could not be
      freed because another thread were set to OOM_DISABLE, it would have
      needlessly been terminated since there is no potential for future memory
      freeing.
      
      This patch moves oomkilladj (now more appropriately named oom_adj) from
      struct task_struct to struct mm_struct.  This requires task_lock() on a
      task to check its oom_adj value to protect against exec, but it's already
      necessary to take the lock when dereferencing the mm to find the total VM
      size for the badness heuristic.
      
      This fixes a livelock if the oom killer chooses a task and another thread
      sharing the same memory has an oom_adj value of OOM_DISABLE.  This occurs
      because oom_kill_task() repeatedly returns 1 and refuses to kill the
      chosen task while select_bad_process() will repeatedly choose the same
      task during the next retry.
      
      Taking task_lock() in select_bad_process() to check for OOM_DISABLE and in
      oom_kill_task() to check for threads sharing the same memory will be
      removed in the next patch in this series where it will no longer be
      necessary.
      
      Writing to /proc/pid/oom_adj for a kthread will now return -EINVAL since
      these threads are immune from oom killing already.  They simply report an
      oom_adj value of OOM_DISABLE.
      
      Cc: Nick Piggin <npiggin@suse.de>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Signed-off-by: NDavid Rientjes <rientjes@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2ff05b2b
    • K
      mm: remove CONFIG_UNEVICTABLE_LRU config option · 68377659
      KOSAKI Motohiro 提交于
      Currently, nobody wants to turn UNEVICTABLE_LRU off.  Thus this
      configurability is unnecessary.
      Signed-off-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      Acked-by: NMinchan Kim <minchan.kim@gmail.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Matt Mackall <mpm@selenic.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      68377659
    • W
      proc: export more page flags in /proc/kpageflags · 17797549
      Wu Fengguang 提交于
      Export all page flags faithfully in /proc/kpageflags.
      
      	11. KPF_MMAP		(pseudo flag) memory mapped page
      	12. KPF_ANON		(pseudo flag) memory mapped page (anonymous)
      	13. KPF_SWAPCACHE	page is in swap cache
      	14. KPF_SWAPBACKED	page is swap/RAM backed
      	15. KPF_COMPOUND_HEAD	(*)
      	16. KPF_COMPOUND_TAIL	(*)
      	17. KPF_HUGE		hugeTLB pages
      	18. KPF_UNEVICTABLE	page is in the unevictable LRU list
      	19. KPF_HWPOISON(TBD)	hardware detected corruption
      	20. KPF_NOPAGE		(pseudo flag) no page frame at the address
      	32-39.			more obscure flags for kernel developers
      
      	(*) For compound pages, exporting _both_ head/tail info enables
      	    users to tell where a compound page starts/ends, and its order.
      
      The accompanying page-types tool will handle the details like decoupling
      overloaded flags and hiding obscure flags to normal users.
      
      Thanks to KOSAKI and Andi for their valuable recommendations!
      Signed-off-by: NWu Fengguang <fengguang.wu@intel.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Matt Mackall <mpm@selenic.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      17797549
    • W
      proc: kpagecount/kpageflags code cleanup · ed7ce0f1
      Wu Fengguang 提交于
      Move increments of pfn/out to bottom of the loop.
      Signed-off-by: NWu Fengguang <fengguang.wu@intel.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Acked-by: NMatt Mackall <mpm@selenic.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ed7ce0f1
    • W
      mm: introduce PageHuge() for testing huge/gigantic pages · 20a0307c
      Wu Fengguang 提交于
      A series of patches to enhance the /proc/pagemap interface and to add a
      userspace executable which can be used to present the pagemap data.
      
      Export 10 more flags to end users (and more for kernel developers):
      
              11. KPF_MMAP            (pseudo flag) memory mapped page
              12. KPF_ANON            (pseudo flag) memory mapped page (anonymous)
              13. KPF_SWAPCACHE       page is in swap cache
              14. KPF_SWAPBACKED      page is swap/RAM backed
              15. KPF_COMPOUND_HEAD   (*)
              16. KPF_COMPOUND_TAIL   (*)
              17. KPF_HUGE		hugeTLB pages
              18. KPF_UNEVICTABLE     page is in the unevictable LRU list
              19. KPF_HWPOISON        hardware detected corruption
              20. KPF_NOPAGE          (pseudo flag) no page frame at the address
      
              (*) For compound pages, exporting _both_ head/tail info enables
                  users to tell where a compound page starts/ends, and its order.
      
      a simple demo of the page-types tool
      
      # ./page-types -h
      page-types [options]
                  -r|--raw                  Raw mode, for kernel developers
                  -a|--addr    addr-spec    Walk a range of pages
                  -b|--bits    bits-spec    Walk pages with specified bits
                  -l|--list                 Show page details in ranges
                  -L|--list-each            Show page details one by one
                  -N|--no-summary           Don't show summay info
                  -h|--help                 Show this usage message
      addr-spec:
                  N                         one page at offset N (unit: pages)
                  N+M                       pages range from N to N+M-1
                  N,M                       pages range from N to M-1
                  N,                        pages range from N to end
                  ,M                        pages range from 0 to M
      bits-spec:
                  bit1,bit2                 (flags & (bit1|bit2)) != 0
                  bit1,bit2=bit1            (flags & (bit1|bit2)) == bit1
                  bit1,~bit2                (flags & (bit1|bit2)) == bit1
                  =bit1,bit2                flags == (bit1|bit2)
      bit-names:
                locked              error         referenced           uptodate
                 dirty                lru             active               slab
             writeback            reclaim              buddy               mmap
             anonymous          swapcache         swapbacked      compound_head
         compound_tail               huge        unevictable           hwpoison
                nopage           reserved(r)         mlocked(r)    mappedtodisk(r)
               private(r)       private_2(r)   owner_private(r)            arch(r)
              uncached(r)       readahead(o)       slob_free(o)     slub_frozen(o)
            slub_debug(o)
                                         (r) raw mode bits  (o) overloaded bits
      
      # ./page-types
                   flags      page-count       MB  symbolic-flags                     long-symbolic-flags
      0x0000000000000000          487369     1903  _________________________________
      0x0000000000000014               5        0  __R_D____________________________  referenced,dirty
      0x0000000000000020               1        0  _____l___________________________  lru
      0x0000000000000024              34        0  __R__l___________________________  referenced,lru
      0x0000000000000028            3838       14  ___U_l___________________________  uptodate,lru
      0x0001000000000028              48        0  ___U_l_______________________I___  uptodate,lru,readahead
      0x000000000000002c            6478       25  __RU_l___________________________  referenced,uptodate,lru
      0x000100000000002c              47        0  __RU_l_______________________I___  referenced,uptodate,lru,readahead
      0x0000000000000040            8344       32  ______A__________________________  active
      0x0000000000000060               1        0  _____lA__________________________  lru,active
      0x0000000000000068             348        1  ___U_lA__________________________  uptodate,lru,active
      0x0001000000000068              12        0  ___U_lA______________________I___  uptodate,lru,active,readahead
      0x000000000000006c             988        3  __RU_lA__________________________  referenced,uptodate,lru,active
      0x000100000000006c              48        0  __RU_lA______________________I___  referenced,uptodate,lru,active,readahead
      0x0000000000004078               1        0  ___UDlA_______b__________________  uptodate,dirty,lru,active,swapbacked
      0x000000000000407c              34        0  __RUDlA_______b__________________  referenced,uptodate,dirty,lru,active,swapbacked
      0x0000000000000400             503        1  __________B______________________  buddy
      0x0000000000000804               1        0  __R________M_____________________  referenced,mmap
      0x0000000000000828            1029        4  ___U_l_____M_____________________  uptodate,lru,mmap
      0x0001000000000828              43        0  ___U_l_____M_________________I___  uptodate,lru,mmap,readahead
      0x000000000000082c             382        1  __RU_l_____M_____________________  referenced,uptodate,lru,mmap
      0x000100000000082c              12        0  __RU_l_____M_________________I___  referenced,uptodate,lru,mmap,readahead
      0x0000000000000868             192        0  ___U_lA____M_____________________  uptodate,lru,active,mmap
      0x0001000000000868              12        0  ___U_lA____M_________________I___  uptodate,lru,active,mmap,readahead
      0x000000000000086c             800        3  __RU_lA____M_____________________  referenced,uptodate,lru,active,mmap
      0x000100000000086c              31        0  __RU_lA____M_________________I___  referenced,uptodate,lru,active,mmap,readahead
      0x0000000000004878               2        0  ___UDlA____M__b__________________  uptodate,dirty,lru,active,mmap,swapbacked
      0x0000000000001000             492        1  ____________a____________________  anonymous
      0x0000000000005808               4        0  ___U_______Ma_b__________________  uptodate,mmap,anonymous,swapbacked
      0x0000000000005868            2839       11  ___U_lA____Ma_b__________________  uptodate,lru,active,mmap,anonymous,swapbacked
      0x000000000000586c              30        0  __RU_lA____Ma_b__________________  referenced,uptodate,lru,active,mmap,anonymous,swapbacked
                   total          513968     2007
      
      # ./page-types -r
                   flags      page-count       MB  symbolic-flags                     long-symbolic-flags
      0x0000000000000000          468002     1828  _________________________________
      0x0000000100000000           19102       74  _____________________r___________  reserved
      0x0000000000008000              41        0  _______________H_________________  compound_head
      0x0000000000010000             188        0  ________________T________________  compound_tail
      0x0000000000008014               1        0  __R_D__________H_________________  referenced,dirty,compound_head
      0x0000000000010014               4        0  __R_D___________T________________  referenced,dirty,compound_tail
      0x0000000000000020               1        0  _____l___________________________  lru
      0x0000000800000024              34        0  __R__l__________________P________  referenced,lru,private
      0x0000000000000028            3794       14  ___U_l___________________________  uptodate,lru
      0x0001000000000028              46        0  ___U_l_______________________I___  uptodate,lru,readahead
      0x0000000400000028              44        0  ___U_l_________________d_________  uptodate,lru,mappedtodisk
      0x0001000400000028               2        0  ___U_l_________________d_____I___  uptodate,lru,mappedtodisk,readahead
      0x000000000000002c            6434       25  __RU_l___________________________  referenced,uptodate,lru
      0x000100000000002c              47        0  __RU_l_______________________I___  referenced,uptodate,lru,readahead
      0x000000040000002c              14        0  __RU_l_________________d_________  referenced,uptodate,lru,mappedtodisk
      0x000000080000002c              30        0  __RU_l__________________P________  referenced,uptodate,lru,private
      0x0000000800000040            8124       31  ______A_________________P________  active,private
      0x0000000000000040             219        0  ______A__________________________  active
      0x0000000800000060               1        0  _____lA_________________P________  lru,active,private
      0x0000000000000068             322        1  ___U_lA__________________________  uptodate,lru,active
      0x0001000000000068              12        0  ___U_lA______________________I___  uptodate,lru,active,readahead
      0x0000000400000068              13        0  ___U_lA________________d_________  uptodate,lru,active,mappedtodisk
      0x0000000800000068              12        0  ___U_lA_________________P________  uptodate,lru,active,private
      0x000000000000006c             977        3  __RU_lA__________________________  referenced,uptodate,lru,active
      0x000100000000006c              48        0  __RU_lA______________________I___  referenced,uptodate,lru,active,readahead
      0x000000040000006c               5        0  __RU_lA________________d_________  referenced,uptodate,lru,active,mappedtodisk
      0x000000080000006c               3        0  __RU_lA_________________P________  referenced,uptodate,lru,active,private
      0x0000000c0000006c               3        0  __RU_lA________________dP________  referenced,uptodate,lru,active,mappedtodisk,private
      0x0000000c00000068               1        0  ___U_lA________________dP________  uptodate,lru,active,mappedtodisk,private
      0x0000000000004078               1        0  ___UDlA_______b__________________  uptodate,dirty,lru,active,swapbacked
      0x000000000000407c              34        0  __RUDlA_______b__________________  referenced,uptodate,dirty,lru,active,swapbacked
      0x0000000000000400             538        2  __________B______________________  buddy
      0x0000000000000804               1        0  __R________M_____________________  referenced,mmap
      0x0000000000000828            1029        4  ___U_l_____M_____________________  uptodate,lru,mmap
      0x0001000000000828              43        0  ___U_l_____M_________________I___  uptodate,lru,mmap,readahead
      0x000000000000082c             382        1  __RU_l_____M_____________________  referenced,uptodate,lru,mmap
      0x000100000000082c              12        0  __RU_l_____M_________________I___  referenced,uptodate,lru,mmap,readahead
      0x0000000000000868             192        0  ___U_lA____M_____________________  uptodate,lru,active,mmap
      0x0001000000000868              12        0  ___U_lA____M_________________I___  uptodate,lru,active,mmap,readahead
      0x000000000000086c             800        3  __RU_lA____M_____________________  referenced,uptodate,lru,active,mmap
      0x000100000000086c              31        0  __RU_lA____M_________________I___  referenced,uptodate,lru,active,mmap,readahead
      0x0000000000004878               2        0  ___UDlA____M__b__________________  uptodate,dirty,lru,active,mmap,swapbacked
      0x0000000000001000             492        1  ____________a____________________  anonymous
      0x0000000000005008               2        0  ___U________a_b__________________  uptodate,anonymous,swapbacked
      0x0000000000005808               4        0  ___U_______Ma_b__________________  uptodate,mmap,anonymous,swapbacked
      0x000000000000580c               1        0  __RU_______Ma_b__________________  referenced,uptodate,mmap,anonymous,swapbacked
      0x0000000000005868            2839       11  ___U_lA____Ma_b__________________  uptodate,lru,active,mmap,anonymous,swapbacked
      0x000000000000586c              29        0  __RU_lA____Ma_b__________________  referenced,uptodate,lru,active,mmap,anonymous,swapbacked
                   total          513968     2007
      
      # ./page-types --raw --list --no-summary --bits reserved
      offset  count   flags
      0       15      _____________________r___________
      31      4       _____________________r___________
      159     97      _____________________r___________
      4096    2067    _____________________r___________
      6752    2390    _____________________r___________
      9355    3       _____________________r___________
      9728    14526   _____________________r___________
      
      This patch:
      
      Introduce PageHuge(), which identifies huge/gigantic pages by their
      dedicated compound destructor functions.
      
      Also move prep_compound_gigantic_page() to hugetlb.c and make
      __free_pages_ok() non-static.
      Signed-off-by: NWu Fengguang <fengguang.wu@intel.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Matt Mackall <mpm@selenic.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      20a0307c
  3. 12 6月, 2009 1 次提交
  4. 29 5月, 2009 1 次提交
  5. 15 5月, 2009 1 次提交
  6. 11 5月, 2009 1 次提交
  7. 09 5月, 2009 1 次提交
  8. 05 5月, 2009 1 次提交
  9. 03 5月, 2009 2 次提交
    • K
      mm: fix Committed_AS underflow on large NR_CPUS environment · 00a62ce9
      KOSAKI Motohiro 提交于
      The Committed_AS field can underflow in certain situations:
      
      >         # while true; do cat /proc/meminfo  | grep _AS; sleep 1; done | uniq -c
      >               1 Committed_AS: 18446744073709323392 kB
      >              11 Committed_AS: 18446744073709455488 kB
      >               6 Committed_AS:    35136 kB
      >               5 Committed_AS: 18446744073709454400 kB
      >               7 Committed_AS:    35904 kB
      >               3 Committed_AS: 18446744073709453248 kB
      >               2 Committed_AS:    34752 kB
      >               9 Committed_AS: 18446744073709453248 kB
      >               8 Committed_AS:    34752 kB
      >               3 Committed_AS: 18446744073709320960 kB
      >               7 Committed_AS: 18446744073709454080 kB
      >               3 Committed_AS: 18446744073709320960 kB
      >               5 Committed_AS: 18446744073709454080 kB
      >               6 Committed_AS: 18446744073709320960 kB
      
      Because NR_CPUS can be greater than 1000 and meminfo_proc_show() does
      not check for underflow.
      
      But NR_CPUS proportional isn't good calculation.  In general,
      possibility of lock contention is proportional to the number of online
      cpus, not theorical maximum cpus (NR_CPUS).
      
      The current kernel has generic percpu-counter stuff.  using it is right
      way.  it makes code simplify and percpu_counter_read_positive() don't
      make underflow issue.
      Reported-by: NDave Hansen <dave@linux.vnet.ibm.com>
      Signed-off-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Eric B Munson <ebmunson@us.ibm.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: <stable@kernel.org>		[All kernel versions]
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      00a62ce9
    • V
      pagemap: require aligned-length, non-null reads of /proc/pid/pagemap · 08161786
      Vitaly Mayatskikh 提交于
      The intention of commit aae8679b
      ("pagemap: fix bug in add_to_pagemap, require aligned-length reads of
      /proc/pid/pagemap") was to force reads of /proc/pid/pagemap to be a
      multiple of 8 bytes, but now it allows to read 0 bytes, which actually
      puts some data to user's buffer.  According to POSIX, if count is zero,
      read() should return zero and has no other results.
      Signed-off-by: NVitaly Mayatskikh <v.mayatskih@gmail.com>
      Cc: Thomas Tuttle <ttuttle@google.com>
      Acked-by: NMatt Mackall <mpm@selenic.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: <stable@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      08161786
  10. 23 4月, 2009 1 次提交
    • M
      [S390] /proc/stat idle field for idle cpus · e1c80530
      Martin Schwidefsky 提交于
      The cpu idle field in the output of /proc/stat is too small for cpus
      that have been idle for more than a tick. Add the architecture hook
      arch_idle_time that allows to add the not accounted idle time of a
      sleeping cpu without waking the cpu.
      
      The s390 implementation of arch_idle_time uses the already existing
      s390_idle_data per_cpu variable to find the sleep time of a neighboring
      idle cpu.
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      e1c80530
  11. 17 4月, 2009 1 次提交
  12. 09 4月, 2009 1 次提交
  13. 07 4月, 2009 1 次提交
  14. 03 4月, 2009 1 次提交
    • D
      nommu: fix a number of issues with the per-MM VMA patch · 33e5d769
      David Howells 提交于
      Fix a number of issues with the per-MM VMA patch:
      
       (1) Make mmap_pages_allocated an atomic_long_t, just in case this is used on
           a NOMMU system with more than 2G pages.  Makes no difference on a 32-bit
           system.
      
       (2) Report vma->vm_pgoff * PAGE_SIZE as a 64-bit value, not a 32-bit value,
           lest it overflow.
      
       (3) Move the allocation of the vm_area_struct slab back for fork.c.
      
       (4) Use KMEM_CACHE() for both vm_area_struct and vm_region slabs.
      
       (5) Use BUG_ON() rather than if () BUG().
      
       (6) Make the default validate_nommu_regions() a static inline rather than a
           #define.
      
       (7) Make free_page_series()'s objection to pages with a refcount != 1 more
           informative.
      
       (8) Adjust the __put_nommu_region() banner comment to indicate that the
           semaphore must be held for writing.
      
       (9) Limit the number of warnings about munmaps of non-mmapped regions.
      Reported-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Cc: Greg Ungerer <gerg@snapgear.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      33e5d769
  15. 01 4月, 2009 4 次提交
  16. 31 3月, 2009 5 次提交
  17. 30 3月, 2009 1 次提交
  18. 29 3月, 2009 1 次提交
    • H
      fix setuid sometimes wouldn't · 7c2c7d99
      Hugh Dickins 提交于
      check_unsafe_exec() also notes whether the fs_struct is being
      shared by more threads than will get killed by the exec, and if so
      sets LSM_UNSAFE_SHARE to make bprm_set_creds() careful about euid.
      But /proc/<pid>/cwd and /proc/<pid>/root lookups make transient
      use of get_fs_struct(), which also raises that sharing count.
      
      This might occasionally cause a setuid program not to change euid,
      in the same way as happened with files->count (check_unsafe_exec
      also looks at sighand->count, but /proc doesn't raise that one).
      
      We'd prefer exec not to unshare fs_struct: so fix this in procfs,
      replacing get_fs_struct() by get_fs_path(), which does path_get
      while still holding task_lock, instead of raising fs->count.
      Signed-off-by: NHugh Dickins <hugh@veritas.com>
      Cc: stable@kernel.org
      ___
      
       fs/proc/base.c |   50 +++++++++++++++--------------------------------
       1 file changed, 16 insertions(+), 34 deletions(-)
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      7c2c7d99
  19. 28 3月, 2009 2 次提交
  20. 18 3月, 2009 1 次提交
    • L
      Avoid 64-bit "switch()" statements on 32-bit architectures · ee568b25
      Linus Torvalds 提交于
      Commit ee6f779b ("filp->f_pos not
      correctly updated in proc_task_readdir") changed the proc code to use
      filp->f_pos directly, rather than through a temporary variable.  In the
      process, that caused the operations to be done on the full 64 bits, even
      though the offset is never that big.
      
      That's all fine and dandy per se, but for some unfathomable reason gcc
      generates absolutely horrid code when using 64-bit values in switch()
      statements.  To the point of actually calling out to gcc helper
      functions like __cmpdi2 rather than just doing the trivial comparisons
      directly the way gcc does for normal compares.  At which point we get
      link failures, because we really don't want to support that kind of
      crazy code.
      
      Fix this by just casting the f_pos value to "unsigned long", which
      is plenty big enough for /proc, and avoids the gcc code generation issue.
      Reported-by: NAlexey Dobriyan <adobriyan@gmail.com>
      Cc: Zhang Le <r0bertz@gentoo.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ee568b25
  21. 16 3月, 2009 1 次提交
    • Z
      filp->f_pos not correctly updated in proc_task_readdir · ee6f779b
      Zhang Le 提交于
      filp->f_pos only get updated at the end of the function. Thus d_off of those
      dirents who are in the middle will be 0, and this will cause a problem in
      glibc's readdir implementation, specifically endless loop. Because when overflow
      occurs, f_pos will be set to next dirent to read, however it will be 0, unless
      the next one is the last one. So it will start over again and again.
      
      There is a sample program in man 2 gendents. This is the output of the program
      running on a multithread program's task dir before this patch is applied:
      
        $ ./a.out /proc/3807/task
        --------------- nread=128 ---------------
        i-node#  file type  d_reclen  d_off   d_name
          506442  directory    16          1  .
          506441  directory    16          0  ..
          506443  directory    16          0  3807
          506444  directory    16          0  3809
          506445  directory    16          0  3812
          506446  directory    16          0  3861
          506447  directory    16          0  3862
          506448  directory    16          8  3863
      
      This is the output after this patch is applied
      
        $ ./a.out /proc/3807/task
        --------------- nread=128 ---------------
        i-node#  file type  d_reclen  d_off   d_name
          506442  directory    16          1  .
          506441  directory    16          2  ..
          506443  directory    16          3  3807
          506444  directory    16          4  3809
          506445  directory    16          5  3812
          506446  directory    16          6  3861
          506447  directory    16          7  3862
          506448  directory    16          8  3863
      Signed-off-by: NZhang Le <r0bertz@gentoo.org>
      Acked-by: NAl Viro <viro@ZenIV.linux.org.uk>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ee6f779b
  22. 11 3月, 2009 1 次提交
  23. 25 2月, 2009 1 次提交
  24. 24 2月, 2009 1 次提交
  25. 09 1月, 2009 1 次提交
    • M
      vmcore: remove saved_max_pfn check · 921d58c0
      Magnus Damm 提交于
      Remove the saved_max_pfn check from the /proc/vmcore function
      read_from_oldmem().  No need to verify, we should be able to just trust
      that "elfcorehdr=" is correctly passed to the crash kernel on the kernel
      command line like we do with other parameters.
      
      The read_from_oldmem() function in fs/proc/vmcore.c is quite similar to
      read_from_oldmem() in drivers/char/mem.c, but only in the latter it makes
      sense to use saved_max_pfn.  For oldmem it is used to determine when to
      stop reading.  For vmcore we already have the elf header info pointing out
      the physical memory regions, no need to pass the end-of- old-memory twice.
      
      Removing the saved_max_pfn check from vmcore makes it possible for
      architectures to skip oldmem but still support crash dump through vmcore -
      without the need for the old saved_max_pfn cruft.
      
      Architectures that want to play safe can do the saved_max_pfn check in
      copy_oldmem_page().  Not sure why anyone would want to do that, but that's
      even safer than today - the saved_max_pfn check in vmcore removed by this
      patch only checks the first page.
      Signed-off-by: NMagnus Damm <damm@igel.co.jp>
      Acked-by: NVivek Goyal <vgoyal@redhat.com>
      Acked-by: NSimon Horman <horms@verge.net.au>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      921d58c0