1. 05 Jun, 2014 1 commit
    • D
      swap: change swap_info singly-linked list to list_head · adfab836
      Dan Streetman committed
      The logic controlling the singly-linked list of swap_info_struct entries
      for all active, i.e.  swapon'ed, swap targets is rather complex, because:
      
       - it stores the entries in priority order
       - there is a pointer to the highest priority entry
       - there is a pointer to the highest priority not-full entry
       - there is a highest_priority_index variable set outside the swap_lock
       - swap entries of equal priority should be used equally
      
      This complexity leads to bugs such as: https://lkml.org/lkml/2014/2/13/181
      where different priority swap targets are incorrectly used equally.
      
      That bug probably could be solved with the existing singly-linked lists,
      but I think it would only add more complexity to the already difficult to
      understand get_swap_page() swap_list iteration logic.
      
      The first patch changes from a singly-linked list to a doubly-linked list
      using list_heads; the highest_priority_index and related code are removed
      and get_swap_page() starts each iteration at the highest priority
      swap_info entry, even if it's full.  While this does introduce unnecessary
      list iteration (i.e.  Schlemiel the painter's algorithm) in the case where
      one or more of the highest priority entries are full, the iteration and
      manipulation code is much simpler and behaves correctly re: the above bug;
      and the fourth patch removes the unnecessary iteration.
      
      The second patch adds some minor plist helper functions; nothing new
      really, just functions to match existing regular list functions.  These
      are used by the next two patches.
      
      The third patch adds plist_requeue(), which is used by get_swap_page() in
      the next patch - it performs the requeueing of same-priority entries
      (which moves the entry to the end of its priority in the plist), so that
      all equal-priority swap_info_structs get used equally.
      
      The fourth patch converts the main list into a plist, and adds a new plist
      that contains only swap_info entries that are both active and not full.
      As Mel suggested using plists allows removing all the ordering code from
      swap - plists handle ordering automatically.  The list naming is also
      clarified now that there are two lists, with the original list changed
      from swap_list_head to swap_active_head and the new list named
      swap_avail_head.  A new spinlock is also added for the new list, so
      swap_info entries can be added or removed from the new list immediately as
      they become full or not full.
      
      This patch (of 4):
      
      Replace the singly-linked list tracking active, i.e.  swapon'ed,
      swap_info_struct entries with a doubly-linked list using struct
      list_heads.  Simplify the logic iterating and manipulating the list of
      entries, especially get_swap_page(), by using standard list_head
      functions, and removing the highest priority iteration logic.
      
      The change fixes the bug:
      https://lkml.org/lkml/2014/2/13/181
      in which different priority swap entries after the highest priority entry
      are incorrectly used equally in pairs.  The swap behavior is now as
      advertised, i.e. different priority swap entries are used in order, and
      equal priority swap targets are used concurrently.
      Signed-off-by: Dan Streetman <ddstreet@ieee.org>
      Acked-by: Mel Gorman <mgorman@suse.de>
      Cc: Shaohua Li <shli@fusionio.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Dan Streetman <ddstreet@ieee.org>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
      Cc: Weijie Yang <weijieut@gmail.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Bob Liu <bob.liu@oracle.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      adfab836
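The intended behavior described above (higher priority targets first, equal priority targets used in turn) can be sketched in userspace C. This is not the kernel code: `struct swap_target` and `pick_target()` are hypothetical names, and a sorted array stands in for the kernel's list_head list so the example stays self-contained.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for swap_info_struct; not a kernel type. */
struct swap_target {
    int id;
    int prio;        /* higher value is used first */
    int free_slots;  /* 0 means the target is full */
};

/*
 * Take one slot from the best available target.
 * t[0..n-1] must be sorted by descending prio.
 * Returns the chosen id, or -1 if every target is full.
 */
static int pick_target(struct swap_target *t, size_t n)
{
    assert(t != NULL || n == 0);

    /* Always start at the highest priority entry, even if it is full. */
    for (size_t i = 0; i < n; i++) {
        if (t[i].free_slots == 0)
            continue;

        struct swap_target chosen = t[i];
        chosen.free_slots--;

        /* Move the chosen entry behind its equal-priority peers so the
         * next call starts with a different target of that priority. */
        size_t j = i;
        while (j + 1 < n && t[j + 1].prio == chosen.prio) {
            t[j] = t[j + 1];
            j++;
        }
        t[j] = chosen;
        return chosen.id;
    }
    return -1;
}
```

Starting every scan at the head mirrors the first patch's trade-off: full high-priority entries are walked over again on each call, in exchange for much simpler ordering logic.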
  2. 13 Jun, 2013 1 commit
    • A
      frontswap: fix incorrect zeroing and allocation size for frontswap_map · 7b57976d
      Akinobu Mita committed
      The bitmap accessed by bitops must have enough size to hold the required
      numbers of bits rounded up to a multiple of BITS_PER_LONG.  And the
      bitmap must not be zeroed by memset() if the number of bits cleared is
      not a multiple of BITS_PER_LONG.
      
      This fixes incorrect zeroing and allocation size for frontswap_map.  The
      incorrect zeroing part doesn't cause any problem because frontswap_map
      is freed just after zeroing.  But the wrongly calculated allocation size
      may cause problems.
      
      For 32bit systems, the allocation size of frontswap_map is about twice
      as large as the required size.  For 64bit systems, the allocation size is
      smaller than required if the number of bits is not a multiple of
      BITS_PER_LONG.
      Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      7b57976d
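The sizing rule from this commit message can be sketched in userspace C. `bits_to_longs()` follows the same round-up idea as the kernel's `BITS_TO_LONGS()` macro. `old_bitmap_bytes()` is only an assumption about the shape of the earlier calculation, chosen because it reproduces the symptoms described above (about twice the needed size with 4-byte longs, too small with 8-byte longs); the commit message itself does not show the old expression.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Number of unsigned longs needed to hold nbits bits, rounded up. */
static size_t bits_to_longs(size_t nbits)
{
    assert(nbits <= SIZE_MAX - (BITS_PER_LONG - 1)); /* guard the round-up add */
    return (nbits + BITS_PER_LONG - 1) / BITS_PER_LONG;
}

/* Bytes to allocate (or zero) for a bitmap accessed through long-based bitops. */
static size_t bitmap_bytes(size_t nbits)
{
    return bits_to_longs(nbits) * sizeof(unsigned long);
}

/* ASSUMED shape of the old sizing: bit count divided by sizeof(long). */
static size_t old_bitmap_bytes(size_t nbits)
{
    return nbits / sizeof(unsigned long);
}
```

For example, with 8-byte longs a 100-bit map needs `bitmap_bytes(100)` = 16 bytes, while the assumed old formula yields 12. With 4-byte longs the need is still 16 bytes, but the assumed old formula yields 25.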
  3. 01 May, 2013 4 commits
  4. 21 Sep, 2012 2 commits
    • D
      frontswap: support exclusive gets if tmem backend is capable · e3483a5f
      Dan Magenheimer committed
      Tmem, as originally specified, assumes that "get" operations
      performed on persistent pools never flush the page of data out
      of tmem on a successful get, waiting instead for a flush
      operation.  This is intended to mimic the model of a swap
      disk, where a disk read is non-destructive.  Unlike a
      disk, however, freeing up the RAM can be valuable.  Over
      the years that frontswap was in the review process, several
      reviewers (and notably Hugh Dickins in 2010) pointed out that
      this would result, at least temporarily, in two copies of the
      data in RAM: one (compressed for zcache) copy in tmem,
      and one copy in the swap cache.  We wondered if this could
      be done differently, at least optionally.
      
      This patch allows tmem backends to instruct the frontswap
      code that this backend performs exclusive gets.  Zcache2
      already contains hooks to support this feature.  Other
      backends are completely unaffected unless/until they are
      updated to support this feature.
      
      While it is not clear that exclusive gets are a performance
      win on all workloads at all times, this small patch allows for
      experimentation by backends.
      
      P.S. Let's not quibble about the naming of "get" vs "read" vs
      "load" etc.  The naming is currently horribly inconsistent between
      cleancache and frontswap and existing tmem backends, so will need
      to be straightened out as a separate patch.  "Get" is used
      by the tmem architecture spec, existing backends, and
      all documentation and presentation material so I am
      using it in this patch.
      Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      e3483a5f
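The difference between a conventional get and an exclusive get can be illustrated with a toy backend in userspace C. Everything here (`struct toy_backend`, `toy_put()`, `toy_get()`, `NSLOTS`) is hypothetical and is not the frontswap or tmem API; it only models the semantics described above, where an exclusive get drops the backend's copy once the data has been handed back.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NSLOTS 8

/* Hypothetical toy backend; not a kernel or tmem structure. */
struct toy_backend {
    bool exclusive_gets;   /* backend drops its copy on a successful get */
    bool present[NSLOTS];
    int data[NSLOTS];
};

/* Store value in slot. Returns 0 on success, -1 on a bad slot. */
static int toy_put(struct toy_backend *b, size_t slot, int value)
{
    assert(b != NULL);
    if (slot >= NSLOTS)
        return -1;
    b->data[slot] = value;
    b->present[slot] = true;
    return 0;
}

/* Fetch slot into *out. Returns 0 on success, -1 if nothing is stored. */
static int toy_get(struct toy_backend *b, size_t slot, int *out)
{
    assert(b != NULL && out != NULL);
    if (slot >= NSLOTS || !b->present[slot])
        return -1;
    *out = b->data[slot];
    if (b->exclusive_gets)
        b->present[slot] = false;  /* free the backend copy right away */
    return 0;
}
```

With `exclusive_gets` unset, a second `toy_get()` on the same slot still succeeds, which mirrors a non-destructive disk read. With it set, the second call fails, so the caller holds the only copy and would have to store it again before dropping it.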
    • Z
      mm: frontswap: fix a wrong if condition in frontswap_shrink · a00bb1e9
      Zhenzhong Duan committed
      pages_to_unuse is set to 0 to unuse all frontswap pages.
      But that doesn't happen because a wrong condition in frontswap_shrink
      cancels it.
      
      -v2: Add comment to explain return value of __frontswap_shrink,
      as suggested by Dan Carpenter, thanks
      Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      a00bb1e9
  5. 14 Aug, 2012 1 commit
  6. 23 Jul, 2012 2 commits
  7. 20 Jul, 2012 3 commits
  8. 12 Jun, 2012 7 commits
  9. 15 May, 2012 2 commits
    • K
      frontswap: s/put_page/store/g s/get_page/load · 165c8aed
      Konrad Rzeszutek Wilk committed
      Sounds so much more natural.
      Suggested-by: Andrea Arcangeli <aarcange@redhat.com>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      165c8aed
    • D
      mm: frontswap: core frontswap functionality · 29f233cf
      Dan Magenheimer committed
      This patch, 3 of 4, provides the core frontswap code that interfaces between
      the hooks in the swap subsystem and a frontswap backend via frontswap_ops.
      
      ---
      New file added: mm/frontswap.c
      
      [v14: add support for writethrough, per suggestion by aarcange@redhat.com]
      [v11: sjenning@linux.vnet.ibm.com: s/puts/failed_puts/]
      [v10: sjenning@linux.vnet.ibm.com: fix debugfs calls on 32-bit]
      [v9: akpm@linux-foundation.org: change "flush" to "invalidate", part 1]
      [v9: akpm@linux-foundation.org: mark some statics __read_mostly]
      [v9: akpm@linux-foundation.org: add clarifying comments]
      [v9: akpm@linux-foundation.org: no need to loop repeating try_to_unuse]
      [v9: error27@gmail.com: remove superfluous check for NULL]
      [v8: rebase to 3.0-rc4]
      [v8: kamezawa.hiroyu@jp.fujitsu.com: add comment to clarify find_next_to_unuse]
      [v7: rebase to 3.0-rc3]
      [v7: JBeulich@novell.com: use new static inlines, no-ops if not config'd]
      [v6: rebase to 3.1-rc1]
      [v6: lliubbo@gmail.com: use vzalloc]
      [v6: lliubbo@gmail.com: fix null pointer deref if vzalloc fails]
      [v6: konrad.wilk@oracle.com: various checks and code clarifications/comments]
      [v4: rebase to 2.6.39]
      Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
      Acked-by: Jan Beulich <JBeulich@novell.com>
      Acked-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
      Cc: Jeremy Fitzhardinge <jeremy@goop.org>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Cc: Matthew Wilcox <matthew@wil.cx>
      Cc: Chris Mason <chris.mason@oracle.com>
      Cc: Rik Riel <riel@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      [v12: Squashed s/flush/invalidate/ in]
      [v15: A bit of cleanup and separate DEBUGFS]
      Signed-off-by: Konrad Wilk <konrad.wilk@oracle.com>
      29f233cf