1. 09 5月, 2017 2 次提交
    • M
      mm, vmalloc: properly track vmalloc users · 1f5307b1
      Michal Hocko 提交于
      __vmalloc_node_flags used to be static inline but this has changed by
      "mm: introduce kv[mz]alloc helpers" because kvmalloc_node needs to use
      it as well and the code is outside of the vmalloc proper.  I haven't
      realized that changing this will lead to a subtle bug though.  The
      function is responsible to track the caller as well.  This caller is
      then printed by /proc/vmallocinfo.  If __vmalloc_node_flags is not
      inline then we would get only direct users of __vmalloc_node_flags as
      callers (e.g.  v[mz]alloc) which reduces usefulness of this debugging
      feature considerably.  It simply doesn't help to see that the given
      range belongs to vmalloc as a caller:
      
        0xffffc90002c79000-0xffffc90002c7d000   16384 vmalloc+0x16/0x18 pages=3 vmalloc N0=3
        0xffffc90002c81000-0xffffc90002c85000   16384 vmalloc+0x16/0x18 pages=3 vmalloc N1=3
        0xffffc90002c8d000-0xffffc90002c91000   16384 vmalloc+0x16/0x18 pages=3 vmalloc N1=3
        0xffffc90002c95000-0xffffc90002c99000   16384 vmalloc+0x16/0x18 pages=3 vmalloc N1=3
      
      We really want to catch the _caller_ of the vmalloc function.  Fix this
      issue by making __vmalloc_node_flags static inline again.
      
      Link: http://lkml.kernel.org/r/20170502134657.12381-1-mhocko@kernel.orgSigned-off-by: NMichal Hocko <mhocko@suse.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      1f5307b1
    • M
      mm: introduce kv[mz]alloc helpers · a7c3e901
      Michal Hocko 提交于
      Patch series "kvmalloc", v5.
      
      There are many open coded kmalloc with vmalloc fallback instances in the
      tree.  Most of them are not careful enough or simply do not care about
      the underlying semantic of the kmalloc/page allocator which means that
      a) some vmalloc fallbacks are basically unreachable because the kmalloc
      part will keep retrying until it succeeds b) the page allocator can
      invoke a really disruptive steps like the OOM killer to move forward
      which doesn't sound appropriate when we consider that the vmalloc
      fallback is available.
      
      As it can be seen implementing kvmalloc requires quite an intimate
      knowledge if the page allocator and the memory reclaim internals which
      strongly suggests that a helper should be implemented in the memory
      subsystem proper.
      
      Most callers, I could find, have been converted to use the helper
      instead.  This is patch 6.  There are some more relying on __GFP_REPEAT
      in the networking stack which I have converted as well and Eric Dumazet
      was not opposed [2] to convert them as well.
      
      [1] http://lkml.kernel.org/r/20170130094940.13546-1-mhocko@kernel.org
      [2] http://lkml.kernel.org/r/1485273626.16328.301.camel@edumazet-glaptop3.roam.corp.google.com
      
      This patch (of 9):
      
      Using kmalloc with the vmalloc fallback for larger allocations is a
      common pattern in the kernel code.  Yet we do not have any common helper
      for that and so users have invented their own helpers.  Some of them are
      really creative when doing so.  Let's just add kv[mz]alloc and make sure
      it is implemented properly.  This implementation makes sure to not make
      a large memory pressure for > PAGE_SZE requests (__GFP_NORETRY) and also
      to not warn about allocation failures.  This also rules out the OOM
      killer as the vmalloc is a more approapriate fallback than a disruptive
      user visible action.
      
      This patch also changes some existing users and removes helpers which
      are specific for them.  In some cases this is not possible (e.g.
      ext4_kvmalloc, libcfs_kvzalloc) because those seems to be broken and
      require GFP_NO{FS,IO} context which is not vmalloc compatible in general
      (note that the page table allocation is GFP_KERNEL).  Those need to be
      fixed separately.
      
      While we are at it, document that __vmalloc{_node} about unsupported gfp
      mask because there seems to be a lot of confusion out there.
      kvmalloc_node will warn about GFP_KERNEL incompatible (which are not
      superset) flags to catch new abusers.  Existing ones would have to die
      slowly.
      
      [sfr@canb.auug.org.au: f2fs fixup]
        Link: http://lkml.kernel.org/r/20170320163735.332e64b7@canb.auug.org.au
      Link: http://lkml.kernel.org/r/20170306103032.2540-2-mhocko@kernel.orgSigned-off-by: NMichal Hocko <mhocko@suse.com>
      Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
      Reviewed-by: Andreas Dilger <adilger@dilger.ca>	[ext4 part]
      Acked-by: NVlastimil Babka <vbabka@suse.cz>
      Cc: John Hubbard <jhubbard@nvidia.com>
      Cc: David Miller <davem@davemloft.net>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a7c3e901
  2. 03 4月, 2017 1 次提交
    • M
      kernel-api.rst: fix a series of errors when parsing C files · 0e056eb5
      mchehab@s-opensource.com 提交于
      ./lib/string.c:134: WARNING: Inline emphasis start-string without end-string.
      ./mm/filemap.c:522: WARNING: Inline interpreted text or phrase reference start-string without end-string.
      ./mm/filemap.c:1283: ERROR: Unexpected indentation.
      ./mm/filemap.c:3003: WARNING: Inline interpreted text or phrase reference start-string without end-string.
      ./mm/vmalloc.c:1544: WARNING: Inline emphasis start-string without end-string.
      ./mm/page_alloc.c:4245: ERROR: Unexpected indentation.
      ./ipc/util.c:676: ERROR: Unexpected indentation.
      ./drivers/pci/irq.c:35: WARNING: Block quote ends without a blank line; unexpected unindent.
      ./security/security.c:109: ERROR: Unexpected indentation.
      ./security/security.c:110: WARNING: Definition list ends without a blank line; unexpected unindent.
      ./block/genhd.c:275: WARNING: Inline strong start-string without end-string.
      ./block/genhd.c:283: WARNING: Inline strong start-string without end-string.
      ./include/linux/clk.h:134: WARNING: Inline emphasis start-string without end-string.
      ./include/linux/clk.h:134: WARNING: Inline emphasis start-string without end-string.
      ./ipc/util.c:477: ERROR: Unknown target name: "s".
      Signed-off-by: NMauro Carvalho Chehab <mchehab@s-opensource.com>
      Acked-by: NBjorn Helgaas <bhelgaas@google.com>
      Signed-off-by: NJonathan Corbet <corbet@lwn.net>
      0e056eb5
  3. 18 3月, 2017 1 次提交
  4. 17 3月, 2017 1 次提交
  5. 16 3月, 2017 1 次提交
    • T
      x86/mm: Adapt MODULES_END based on fixmap section size · f06bdd40
      Thomas Garnier 提交于
      This patch aligns MODULES_END to the beginning of the fixmap section.
      It optimizes the space available for both sections. The address is
      pre-computed based on the number of pages required by the fixmap
      section.
      
      It will allow GDT remapping in the fixmap section. The current
      MODULES_END static address does not provide enough space for the kernel
      to support a large number of processors.
      Signed-off-by: NThomas Garnier <thgarnie@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Kosina <jikos@kernel.org>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Len Brown <len.brown@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Lorenzo Stoakes <lstoakes@gmail.com>
      Cc: Luis R . Rodriguez <mcgrof@kernel.org>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Rafael J . Wysocki <rjw@rjwysocki.net>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Stanislaw Gruszka <sgruszka@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
      Cc: kasan-dev@googlegroups.com
      Cc: kernel-hardening@lists.openwall.com
      Cc: kvm@vger.kernel.org
      Cc: lguest@lists.ozlabs.org
      Cc: linux-doc@vger.kernel.org
      Cc: linux-efi@vger.kernel.org
      Cc: linux-mm@kvack.org
      Cc: linux-pm@vger.kernel.org
      Cc: xen-devel@lists.xenproject.org
      Cc: zijun_hu <zijun_hu@htc.com>
      Link: http://lkml.kernel.org/r/20170314170508.100882-1-thgarnie@google.com
      [ Small build fix. ]
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      f06bdd40
  6. 10 3月, 2017 1 次提交
  7. 03 3月, 2017 1 次提交
    • I
      sched/headers: Move task_struct::signal and task_struct::sighand types and... · c3edc401
      Ingo Molnar 提交于
      sched/headers: Move task_struct::signal and task_struct::sighand types and accessors into <linux/sched/signal.h>
      
      task_struct::signal and task_struct::sighand are pointers, which would normally make it
      straightforward to not define those types in sched.h.
      
      That is not so, because the types are accompanied by a myriad of APIs (macros and inline
      functions) that dereference them.
      
      Split the types and the APIs out of sched.h and move them into a new header, <linux/sched/signal.h>.
      
      With this change sched.h does not know about 'struct signal' and 'struct sighand' anymore,
      trying to put accessors into sched.h as a test fails the following way:
      
        ./include/linux/sched.h: In function ‘test_signal_types’:
        ./include/linux/sched.h:2461:18: error: dereferencing pointer to incomplete type ‘struct signal_struct’
                          ^
      
      This reduces the size and complexity of sched.h significantly.
      
      Update all headers and .c code that relied on getting the signal handling
      functionality from <linux/sched.h> to include <linux/sched/signal.h>.
      
      The list of affected files in the preparatory patch was partly generated by
      grepping for the APIs, and partly by doing coverage build testing, both
      all[yes|mod|def|no]config builds on 64-bit and 32-bit x86, and an array of
      cross-architecture builds.
      
      Nevertheless some (trivial) build breakage is still expected related to rare
      Kconfig combinations and in-flight patches to various kernel code, but most
      of it should be handled by this patch.
      Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      c3edc401
  8. 25 2月, 2017 2 次提交
  9. 23 2月, 2017 2 次提交
  10. 25 12月, 2016 1 次提交
  11. 13 12月, 2016 8 次提交
  12. 08 10月, 2016 2 次提交
  13. 27 7月, 2016 1 次提交
    • V
      mm: charge/uncharge kmemcg from generic page allocator paths · 4949148a
      Vladimir Davydov 提交于
      Currently, to charge a non-slab allocation to kmemcg one has to use
      alloc_kmem_pages helper with __GFP_ACCOUNT flag.  A page allocated with
      this helper should finally be freed using free_kmem_pages, otherwise it
      won't be uncharged.
      
      This API suits its current users fine, but it turns out to be impossible
      to use along with page reference counting, i.e.  when an allocation is
      supposed to be freed with put_page, as it is the case with pipe or unix
      socket buffers.
      
      To overcome this limitation, this patch moves charging/uncharging to
      generic page allocator paths, i.e.  to __alloc_pages_nodemask and
      free_pages_prepare, and zaps alloc/free_kmem_pages helpers.  This way,
      one can use any of the available page allocation functions to get the
      allocated page charged to kmemcg - it's enough to pass __GFP_ACCOUNT,
      just like in case of kmalloc and friends.  A charged page will be
      automatically uncharged on free.
      
      To make it possible, we need to mark pages charged to kmemcg somehow.
      To avoid introducing a new page flag, we make use of page->_mapcount for
      marking such pages.  Since pages charged to kmemcg are not supposed to
      be mapped to userspace, it should work just fine.  There are other
      (ab)users of page->_mapcount - buddy and balloon pages - but we don't
      conflict with them.
      
      In case kmemcg is compiled out or not used at runtime, this patch
      introduces no overhead to generic page allocator paths.  If kmemcg is
      used, it will be plus one gfp flags check on alloc and plus one
      page->_mapcount check on free, which shouldn't hurt performance, because
      the data accessed are hot.
      
      Link: http://lkml.kernel.org/r/a9736d856f895bcb465d9f257b54efe32eda6f99.1464079538.git.vdavydov@virtuozzo.comSigned-off-by: NVladimir Davydov <vdavydov@virtuozzo.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      4949148a
  14. 04 6月, 2016 1 次提交
  15. 21 5月, 2016 1 次提交
    • C
      mm/vmalloc: keep a separate lazy-free list · 80c4bd7a
      Chris Wilson 提交于
      When mixing lots of vmallocs and set_memory_*() (which calls
      vm_unmap_aliases()) I encountered situations where the performance
      degraded severely due to the walking of the entire vmap_area list each
      invocation.
      
      One simple improvement is to add the lazily freed vmap_area to a
      separate lockless free list, such that we then avoid having to walk the
      full list on each purge.
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: NRoman Pen <r.peniaev@gmail.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Roman Pen <r.peniaev@gmail.com>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: Shawn Lin <shawn.lin@rock-chips.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      80c4bd7a
  16. 05 4月, 2016 1 次提交
    • C
      mm/vmap: Add a notifier for when we run out of vmap address space · 4da56b99
      Chris Wilson 提交于
      vmaps are temporary kernel mappings that may be of long duration.
      Reusing a vmap on an object is preferrable for a driver as the cost of
      setting up the vmap can otherwise dominate the operation on the object.
      However, the vmap address space is rather limited on 32bit systems and
      so we add a notification for vmap pressure in order for the driver to
      release any cached vmappings.
      
      The interface is styled after the oom-notifier where the callees are
      passed a pointer to an unsigned long counter for them to indicate if they
      have freed any space.
      
      v2: Guard the blocking notifier call with gfpflags_allow_blocking()
      v3: Correct typo in forward declaration and move to head of file
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Roman Peniaev <r.peniaev@gmail.com>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: linux-mm@kvack.org
      Cc: linux-kernel@vger.kernel.org
      Acked-by: Andrew Morton <akpm@linux-foundation.org> # for inclusion via DRM
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/1459777603-23618-3-git-send-email-chris@chris-wilson.co.ukReviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      4da56b99
  17. 18 3月, 2016 3 次提交
  18. 16 1月, 2016 1 次提交
  19. 15 1月, 2016 3 次提交
  20. 21 11月, 2015 1 次提交
  21. 07 11月, 2015 2 次提交
    • M
      mm: page_alloc: hide some GFP internals and document the bits and flag combinations · dd56b046
      Mel Gorman 提交于
      Andrew stated the following
      
      	We have quite a history of remote parts of the kernel using
      	weird/wrong/inexplicable combinations of __GFP_ flags.	I tend
      	to think that this is because we didn't adequately explain the
      	interface.
      
      	And I don't think that gfp.h really improved much in this area as
      	a result of this patchset.  Could you go through it some time and
      	decide if we've adequately documented all this stuff?
      
      This patches first moves some GFP flag combinations that are part of the MM
      internals to mm/internal.h. The rest of the patch documents the __GFP_FOO
      bits under various headings and then documents the flag combinations. It
      will not help callers that are brain damaged but the clarity might motivate
      some fixes and avoid future mistakes.
      Signed-off-by: NMel Gorman <mgorman@techsingularity.net>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Vitaly Wool <vitalywool@gmail.com>
      Cc: Rik van Riel <riel@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      dd56b046
    • M
      mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep... · d0164adc
      Mel Gorman 提交于
      mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep and avoiding waking kswapd
      
      __GFP_WAIT has been used to identify atomic context in callers that hold
      spinlocks or are in interrupts.  They are expected to be high priority and
      have access one of two watermarks lower than "min" which can be referred
      to as the "atomic reserve".  __GFP_HIGH users get access to the first
      lower watermark and can be called the "high priority reserve".
      
      Over time, callers had a requirement to not block when fallback options
      were available.  Some have abused __GFP_WAIT leading to a situation where
      an optimisitic allocation with a fallback option can access atomic
      reserves.
      
      This patch uses __GFP_ATOMIC to identify callers that are truely atomic,
      cannot sleep and have no alternative.  High priority users continue to use
      __GFP_HIGH.  __GFP_DIRECT_RECLAIM identifies callers that can sleep and
      are willing to enter direct reclaim.  __GFP_KSWAPD_RECLAIM to identify
      callers that want to wake kswapd for background reclaim.  __GFP_WAIT is
      redefined as a caller that is willing to enter direct reclaim and wake
      kswapd for background reclaim.
      
      This patch then converts a number of sites
      
      o __GFP_ATOMIC is used by callers that are high priority and have memory
        pools for those requests. GFP_ATOMIC uses this flag.
      
      o Callers that have a limited mempool to guarantee forward progress clear
        __GFP_DIRECT_RECLAIM but keep __GFP_KSWAPD_RECLAIM. bio allocations fall
        into this category where kswapd will still be woken but atomic reserves
        are not used as there is a one-entry mempool to guarantee progress.
      
      o Callers that are checking if they are non-blocking should use the
        helper gfpflags_allow_blocking() where possible. This is because
        checking for __GFP_WAIT as was done historically now can trigger false
        positives. Some exceptions like dm-crypt.c exist where the code intent
        is clearer if __GFP_DIRECT_RECLAIM is used instead of the helper due to
        flag manipulations.
      
      o Callers that built their own GFP flags instead of starting with GFP_KERNEL
        and friends now also need to specify __GFP_KSWAPD_RECLAIM.
      
      The first key hazard to watch out for is callers that removed __GFP_WAIT
      and was depending on access to atomic reserves for inconspicuous reasons.
      In some cases it may be appropriate for them to use __GFP_HIGH.
      
      The second key hazard is callers that assembled their own combination of
      GFP flags instead of starting with something like GFP_KERNEL.  They may
      now wish to specify __GFP_KSWAPD_RECLAIM.  It's almost certainly harmless
      if it's missed in most cases as other activity will wake kswapd.
      Signed-off-by: NMel Gorman <mgorman@techsingularity.net>
      Acked-by: NVlastimil Babka <vbabka@suse.cz>
      Acked-by: NMichal Hocko <mhocko@suse.com>
      Acked-by: NJohannes Weiner <hannes@cmpxchg.org>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Vitaly Wool <vitalywool@gmail.com>
      Cc: Rik van Riel <riel@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      d0164adc
  22. 06 11月, 2015 1 次提交
  23. 02 11月, 2015 1 次提交
    • L
      mm: get rid of 'vmalloc_info' from /proc/meminfo · a5ad88ce
      Linus Torvalds 提交于
      It turns out that at least some versions of glibc end up reading
      /proc/meminfo at every single startup, because glibc wants to know the
      amount of memory the machine has.  And while that's arguably insane,
      it's just how things are.
      
      And it turns out that it's not all that expensive most of the time, but
      the vmalloc information statistics (amount of virtual memory used in the
      vmalloc space, and the biggest remaining chunk) can be rather expensive
      to compute.
      
      The 'get_vmalloc_info()' function actually showed up on my profiles as
      4% of the CPU usage of "make test" in the git source repository, because
      the git tests are lots of very short-lived shell-scripts etc.
      
      It turns out that apparently this same silly vmalloc info gathering
      shows up on the facebook servers too, according to Dave Jones.  So it's
      not just "make test" for git.
      
      We had two patches to just cache the information (one by me, one by
      Ingo) to mitigate this issue, but the whole vmalloc information of of
      rather dubious value to begin with, and people who *actually* want to
      know what the situation is wrt the vmalloc area should just look at the
      much more complete /proc/vmallocinfo instead.
      
      In fact, according to my testing - and perhaps more importantly,
      according to that big search engine in the sky: Google - there is
      nothing out there that actually cares about those two expensive fields:
      VmallocUsed and VmallocChunk.
      
      So let's try to just remove them entirely.  Actually, this just removes
      the computation and reports the numbers as zero for now, just to try to
      be minimally intrusive.
      
      If this breaks anything, we'll obviously have to re-introduce the code
      to compute this all and add the caching patches on top.  But if given
      the option, I'd really prefer to just remove this bad idea entirely
      rather than add even more code to work around our historical mistake
      that likely nobody really cares about.
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a5ad88ce
  24. 16 4月, 2015 1 次提交
    • R
      mm/vmalloc: get rid of dirty bitmap inside vmap_block structure · 7d61bfe8
      Roman Pen 提交于
      In original implementation of vm_map_ram made by Nick Piggin there were
      two bitmaps: alloc_map and dirty_map.  None of them were used as supposed
      to be: finding a suitable free hole for next allocation in block.
      vm_map_ram allocates space sequentially in block and on free call marks
      pages as dirty, so freed space can't be reused anymore.
      
      Actually it would be very interesting to know the real meaning of those
      bitmaps, maybe implementation was incomplete, etc.
      
      But long time ago Zhang Yanfei removed alloc_map by these two commits:
      
        mm/vmalloc.c: remove dead code in vb_alloc
           3fcd76e8
        mm/vmalloc.c: remove alloc_map from vmap_block
           b8e748b6
      
      In this patch I replaced dirty_map with two range variables: dirty min and
      max.  These variables store minimum and maximum position of dirty space in
      a block, since we need only to know the dirty range, not exact position of
      dirty pages.
      
      Why it was made?  Several reasons: at first glance it seems that
      vm_map_ram allocator concerns about fragmentation thus it uses bitmaps for
      finding free hole, but it is not true.  To avoid complexity seems it is
      better to use something simple, like min or max range values.  Secondly,
      code also becomes simpler, without iteration over bitmap, just comparing
      values in min and max macros.  Thirdly, bitmap occupies up to 1024 bits
      (4MB is a max size of a block).  Here I replaced the whole bitmap with two
      longs.
      
      Finally vm_unmap_aliases should be slightly faster and the whole
      vmap_block structure occupies less memory.
      Signed-off-by: NRoman Pen <r.peniaev@gmail.com>
      Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Acked-by: NJoonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: WANG Chao <chaowang@redhat.com>
      Cc: Fabian Frederick <fabf@skynet.be>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Gioh Kim <gioh.kim@lge.com>
      Cc: Rob Jones <rob.jones@codethink.co.uk>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      7d61bfe8