1. 14 September 2018, 1 commit
    • mm: get rid of vmacache_flush_all() entirely · 7a9cdebd
      Authored by Linus Torvalds
      Jann Horn points out that the vmacache_flush_all() function is not only
      potentially expensive, it's buggy too.  It also happens to be entirely
      unnecessary, because the sequence number overflow case can be avoided by
      simply making the sequence number be 64-bit.  That doesn't even grow the
      data structures in question, because the other adjacent fields are
      already 64-bit.
      
      So simplify the whole thing by just making the sequence number overflow
      case go away entirely, which gets rid of all the complications and makes
      the code faster too.  Win-win.
      
      [ Oleg Nesterov points out that the VMACACHE_FULL_FLUSHES statistic
        also just goes away entirely with this ]
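      
      A minimal sketch of what the change amounts to, in kernel-style C
      (struct and helper names follow the upstream vmacache code;
      everything unrelated is elided):
      
          struct vmacache {
                  u64 seqnum;        /* was u32 before this commit */
                  struct vm_area_struct *vmas[VMACACHE_SIZE];
          };
      
          /* mm_struct's vmacache_seqnum becomes u64 as well, so
           * invalidation stays a plain increment: the counter can no
           * longer realistically wrap, and the old overflow fallback
           * into vmacache_flush_all() is simply deleted. */
          static inline void vmacache_invalidate(struct mm_struct *mm)
          {
                  mm->vmacache_seqnum++;
          }
      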
      Reported-by: Jann Horn <jannh@google.com>
      Suggested-by: Will Deacon <will.deacon@arm.com>
      Acked-by: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: stable@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 05 September 2018, 6 commits
  3. 01 September 2018, 1 commit
    • blkcg: delay blkg destruction until after writeback has finished · 59b57717
      Authored by Dennis Zhou (Facebook)
      Currently, blkcg destruction relies on a sequence of events:
        1. Destruction starts. blkcg_css_offline() is called and blkgs
           release their reference to the blkcg. This immediately destroys
           the cgwbs (writeback).
        2. With blkgs giving up their reference, the blkcg ref count should
           become zero, and eventually blkcg_css_free() is called, which
           finally frees the blkcg.
      
      Jiufei Xue reported that there is a race between blkcg_bio_issue_check()
      and cgroup_rmdir(). To remedy this, blkg destruction becomes contingent
      on the completion of all writeback associated with the blkcg. A count of
      the number of cgwbs is maintained and once that goes to zero, blkg
      destruction can follow. This should prevent premature blkg destruction
      related to writeback.
      
      The new process for blkcg cleanup is as follows:
        1. Destruction starts. blkcg_css_offline() is called which offlines
           writeback. Blkg destruction is delayed on the cgwb_refcnt count to
           avoid punting potentially large amounts of outstanding writeback
           to root while maintaining any ongoing policies. Here, the base
           cgwb_refcnt is put back.
        2. When the cgwb_refcnt becomes zero, blkcg_destroy_blkgs() is called
           and handles destruction of blkgs. This is where the css reference
           held by each blkg is released.
        3. Once the blkcg ref count goes to zero, blkcg_css_free() is called.
           This finally frees the blkcg.
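      
      A minimal userspace sketch of the refcount gating in steps 1 and 2
      (blkcg_destroy_blkgs and cgwb_refcnt are named in this message; the
      get/put helpers here are illustrative, and C11 atomics stand in for
      the kernel's refcount_t):
      
          #include <stdatomic.h>
      
          struct blkcg {
                  atomic_int cgwb_refcnt;   /* base ref + one per cgwb */
          };
      
          static void blkcg_destroy_blkgs(struct blkcg *blkcg)
          {
                  /* Safe to tear down blkgs here: all writeback tied to
                   * this blkcg has finished and dropped its reference. */
          }
      
          static void cgwb_get(struct blkcg *blkcg)
          {
                  atomic_fetch_add(&blkcg->cgwb_refcnt, 1);
          }
      
          static void cgwb_put(struct blkcg *blkcg)
          {
                  /* The final put (the base ref, dropped once writeback
                   * is offlined) is what triggers blkg destruction. */
                  if (atomic_fetch_sub(&blkcg->cgwb_refcnt, 1) == 1)
                          blkcg_destroy_blkgs(blkcg);
          }
      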
      
      It seems that in the past blk-throttle did some hard-to-follow
      things when taking data from a blkg while associating with current,
      and the recent simplification and unification of blk-throttle's
      behavior is what caused this race.
      
      Fixes: 08e18eab ("block: add bi_blkg to the bio for cgroups")
      Reviewed-by: Josef Bacik <josef@toxicpanda.com>
      Signed-off-by: Dennis Zhou <dennisszhou@gmail.com>
      Cc: Jiufei Xue <jiufei.xue@linux.alibaba.com>
      Cc: Joseph Qi <joseph.qi@linux.alibaba.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Josef Bacik <josef@toxicpanda.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  4. 31 August 2018, 1 commit
  5. 30 August 2018, 2 commits
  6. 26 August 2018, 1 commit
    • mm/cow: don't bother write protecting already write-protected pages · 1b2de5d0
      Authored by Linus Torvalds
      This is not normally noticeable, but repeated forks are unnecessarily
      expensive because they repeatedly dirty the parent page tables during
      the page table copy operation.
      
      It's trivial to just avoid write protecting the page table entry if it
      was already not writable.
      
      This patch was inspired by
      
          https://bugzilla.kernel.org/show_bug.cgi?id=200447
      
      which points to an ancient "waste time re-doing fork" issue in the
      presence of lots of signals.
      
      That bug was fixed by Eric Biederman's signal handling series
      culminating in commit c3ad2c3b ("signal: Don't restart fork when
      signals come in"), but the unnecessary work for repeated forks is
      still worth just fixing, particularly since the fix is trivial.
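      
      The fix itself is one extra condition in the pte-copy path; a sketch
      of the guarded write-protect, in kernel-style C with the surrounding
      copy loop elided:
      
          /* Only dirty the parent's page table when the entry actually
           * has write permission to remove; an already write-protected
           * pte is left untouched, so repeated forks stop re-dirtying
           * the parent's page tables. */
          if (is_cow_mapping(vm_flags) && pte_write(pte)) {
                  ptep_set_wrprotect(src_mm, addr, src_pte);
                  pte = pte_wrprotect(pte);
          }
      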
      
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  7. 24 August 2018, 9 commits
  8. 23 August 2018, 19 commits