1. 04 Jul, 2013 1 commit
    • mm: soft-dirty bits for user memory changes tracking · 0f8975ec
      Pavel Emelyanov committed
      The soft-dirty bit is a bit on a PTE that helps to track which pages a task
      writes to.  To do this tracking, one should:
      
        1. Clear soft-dirty bits from PTEs ("echo 4 > /proc/PID/clear_refs").
        2. Wait some time.
        3. Read soft-dirty bits (bit 55 in /proc/PID/pagemap2 entries).
      
      To make this tracking work, the writable bit is cleared from PTEs whenever
      the soft-dirty bit is cleared.  After that, when the task tries to modify a
      page at some virtual address, a #PF occurs and the kernel sets the
      soft-dirty bit on the respective PTE.
      
      Note that although the task's entire address space is marked as r/o after
      the soft-dirty bits are cleared, the #PF-s that occur after that are processed
      fast.  This is because the pages are still mapped to physical memory,
      so all the kernel does is find this fact out and put the
      writable, dirty and soft-dirty bits back on the PTE.
      
      Another thing to note is that when mremap moves PTEs they are marked
      as soft-dirty as well, since from the user's perspective mremap modifies
      the virtual memory at mremap's new address.
      Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
      Cc: Matt Mackall <mpm@selenic.com>
      Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
      Cc: Glauber Costa <glommer@parallels.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
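The three tracking steps in the commit message above can be sketched from user space roughly as follows. This is a minimal illustration, not code from the patch: the helper names are hypothetical, the 8-byte entry size and one-entry-per-page layout are assumptions about the pagemap-style file, and the file paths are passed in by the caller because the message refers to /proc/PID/pagemap2 while other kernels may expose the bit under a different name.

```c
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

#define PM_SOFT_DIRTY_BIT 55 /* bit position named in the commit message */
#define PM_ENTRY_BYTES     8 /* assumption: one 64-bit entry per virtual page */

/* Returns 1 if a raw pagemap entry has the soft-dirty bit set, else 0. */
static int pagemap_entry_soft_dirty(uint64_t entry)
{
    return (int)((entry >> PM_SOFT_DIRTY_BIT) & 1);
}

/* Byte offset of the entry that describes vaddr (hypothetical helper). */
static uint64_t pagemap_offset(uint64_t vaddr, uint64_t page_size)
{
    return (vaddr / page_size) * PM_ENTRY_BYTES;
}

/* Step 1: write "4" to the clear_refs file. Returns 0 on success, -1 on error. */
static int clear_soft_dirty(const char *clear_refs_path)
{
    int fd = open(clear_refs_path, O_WRONLY);
    if (fd < 0)
        return -1;
    ssize_t n = write(fd, "4", 1);
    close(fd);
    return n == 1 ? 0 : -1;
}

/* Step 3: read one entry. Returns 1 (soft-dirty), 0 (clean) or -1 (error). */
static int page_is_soft_dirty(const char *pagemap_path, uint64_t vaddr)
{
    uint64_t entry = 0;
    long page_size = sysconf(_SC_PAGESIZE);
    int fd = open(pagemap_path, O_RDONLY);
    if (fd < 0 || page_size <= 0) {
        if (fd >= 0)
            close(fd);
        return -1;
    }
    off_t off = (off_t)pagemap_offset(vaddr, (uint64_t)page_size);
    ssize_t n = -1;
    if (lseek(fd, off, SEEK_SET) != (off_t)-1)
        n = read(fd, &entry, sizeof entry);
    close(fd);
    if (n != (ssize_t)sizeof entry)
        return -1;
    return pagemap_entry_soft_dirty(entry);
}
```

A caller would invoke clear_soft_dirty() on the task's clear_refs file, let the task run for a while (step 2), and then call page_is_soft_dirty() for each address of interest.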
  2. 03 Jul, 2013 1 commit
  3. 29 Jun, 2013 6 commits
    • locks: give the blocked_hash its own spinlock · 7b2296af
      Jeff Layton committed
      There's no reason we have to protect the blocked_hash and file_lock_list
      with the same spinlock. With the tests I have, breaking it in two gives
      a barely measurable performance benefit, but it seems reasonable to make
      this locking as granular as possible.
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • locks: add a new "lm_owner_key" lock operation · 3999e493
      Jeff Layton committed
      Currently, the hash that the locking code uses to add these values
      to the blocked_hash is simply calculated from the fl_owner field. That's
      valid in most cases except for server-side lockd, which validates the
      owner of a lock based on fl_owner and fl_pid.
      
      In the case where you have a small number of NFS clients doing a lot
      of locking between different processes, you could end up with all
      the blocked requests sitting in a very small number of hash buckets.
      
      Add a new lm_owner_key operation to the lock_manager_operations that
      will generate an unsigned long to use as the key in the hashtable.
      That function is only implemented for server-side lockd, and simply
      XORs the fl_owner and fl_pid.
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Acked-by: J. Bruce Fields <bfields@fieldses.org>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
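The effect of XORing the owner with the pid, as described in the commit message above, can be illustrated with a small user-space sketch. The struct and function names below are hypothetical stand-ins rather than the kernel's definitions, and the modulo bucket selection is an assumption made only to show how keys spread across buckets.

```c
#include <stdint.h>

/* Hypothetical stand-in for the two fields the commit message mentions. */
struct demo_lock {
    const void  *owner; /* plays the role of fl_owner */
    unsigned int pid;   /* plays the role of fl_pid */
};

/* Default behaviour described in the message: key derived from the owner only. */
static unsigned long owner_key_default(const struct demo_lock *fl)
{
    return (unsigned long)(uintptr_t)fl->owner;
}

/* lockd-style key described in the message: owner XOR pid. */
static unsigned long owner_key_lockd(const struct demo_lock *fl)
{
    return (unsigned long)(uintptr_t)fl->owner ^ (unsigned long)fl->pid;
}

/* Assumption: a simple modulo picks the bucket; a real hash table may differ. */
static unsigned int bucket_of(unsigned long key, unsigned int nbuckets)
{
    return (unsigned int)(key % nbuckets);
}
```

With two locks that share an owner but differ in pid, owner_key_default() yields the same key for both, while owner_key_lockd() yields distinct keys, which is the bucket-crowding problem the commit message describes for a small number of NFS clients.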
    • locks: protect most of the file_lock handling with i_lock · 1c8c601a
      Jeff Layton committed
      Having a global lock that protects all of this code is a clear
      scalability problem. Instead of doing that, move most of the code to be
      protected by the i_lock instead. The exceptions are the global lists
      that the ->fl_link sits on, and the ->fl_block list.
      
      ->fl_link is what connects these structures to the
      global lists, so we must ensure that we hold those locks when iterating
      over or updating these lists.
      
      Furthermore, sound deadlock detection requires that we hold the
      blocked_list state steady while checking for loops. We also must ensure
      that the search and update to the list are atomic.
      
      For the checking and insertion side of the blocked_list, push the
      acquisition of the global lock into __posix_lock_file and ensure that
      checking and update of the blocked_list is done without dropping the
      lock in between.
      
      On the removal side, when waking up blocked lock waiters, take the
      global lock before walking the blocked list and dequeue the waiters from
      the global list prior to removal from the fl_block list.
      
      With this, deadlock detection should be race free while we minimize
      excessive file_lock_lock thrashing.
      
      Finally, in order to avoid a lock inversion problem when handling
      /proc/locks output we must ensure that manipulations of the fl_block
      list are also protected by the file_lock_lock.
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • Don't pass inode to ->d_hash() and ->d_compare() · da53be12
      Linus Torvalds committed
      Instances either don't look at it at all (the majority of cases) or
      only want it to find the superblock (which can be had as dentry->d_sb).
      A few cases that want more are actually safe with dentry->d_inode -
      the only precaution needed is to check that it hasn't been replaced with
      NULL by rmdir() or by an overwriting rename(), in which case it should
      simply be treated as a cache miss.
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • [readdir] ->readdir() is gone · 2233f31a
      Al Viro committed
      everything's converted to ->iterate()
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • [readdir] introduce iterate_dir() and dir_context · 5c0ba4e0
      Al Viro committed
      iterate_dir(): new helper, replacing vfs_readdir().
      
      struct dir_context: contains the readdir callback (and will get more stuff
      in it), embedded into whatever data that callback wants to deal with;
      eventually, we'll be passing it to ->readdir() replacement instead of
      (data,filldir) pair.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
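The idea of a context struct that carries the callback and is embedded in the callback's own data, as the commit message above describes, can be sketched in user-space C as follows. Every name here (demo_dir_context, demo_iterate_dir, count_ctx and so on) is hypothetical, and the callback signature is deliberately simplified; the kernel's actual definitions are not reproduced.

```c
#include <stddef.h>

struct demo_dir_context;

/* Simplified callback type: receives the context and one entry name. */
typedef int (*demo_filldir_t)(struct demo_dir_context *ctx, const char *name);

/* The context holds the callback; more fields could be added later. */
struct demo_dir_context {
    demo_filldir_t actor;
};

/* Caller-private data with the context embedded as a member. */
struct count_ctx {
    struct demo_dir_context ctx;
    int entries;
};

/* Recover the enclosing struct from a pointer to its embedded member. */
#define demo_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Callback: counts entries, finding its own data via the embedded context. */
static int count_actor(struct demo_dir_context *ctx, const char *name)
{
    struct count_ctx *c = demo_container_of(ctx, struct count_ctx, ctx);
    (void)name;
    c->entries++;
    return 0;
}

/* Stand-in for a directory walk: feeds each name to the context's callback. */
static int demo_iterate_dir(const char *const *names, size_t n,
                            struct demo_dir_context *ctx)
{
    for (size_t i = 0; i < n; i++) {
        int ret = ctx->actor(ctx, names[i]);
        if (ret)
            return ret; /* a nonzero return stops the walk early */
    }
    return 0;
}
```

The point of the layout is that only one pointer needs to be passed around instead of a (data, callback) pair: the walker sees the context, and the callback recovers its private data from it.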
  4. 28 Jun, 2013 1 commit
  5. 27 Jun, 2013 1 commit
  6. 26 Jun, 2013 4 commits
  7. 25 Jun, 2013 1 commit
  8. 24 Jun, 2013 2 commits
  9. 23 Jun, 2013 2 commits
    • x86: Add NMI duration tracepoints · 0c4df02d
      Dave Hansen committed
      This patch has been invaluable in my adventures finding
      issues in the perf NMI handler.  I'm as big a fan of
      printk() as anybody is, but using printk() in NMIs is
      deadly when they're happening frequently.
      
      Even hacking in trace_printk() ended up eating enough
      CPU to throw off some of the measurements I was making.
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: paulus@samba.org
      Cc: acme@ghostprotocols.net
      Cc: Dave Hansen <dave@sr71.net>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • perf: Drop sample rate when sampling is too slow · 14c63f17
      Dave Hansen committed
      This patch keeps track of how long perf's NMI handler is taking,
      and also calculates how many samples perf can take per second.  If
      the sample length times the expected max number of samples
      exceeds a configurable threshold, it drops the sample rate.
      
      This way, we don't have a runaway sampling process eating up the
      CPU.
      
      This patch can tend to drop the sample rate down to a level where
      perf doesn't work very well.  *BUT* the alternative is that my
      system hangs because it spends all of its time handling NMIs.
      
      I'll take a busted performance tool over an entire system that's
      busted and undebuggable any day.
      
      BTW, my suspicion is that there's still an underlying bug here.
      Using the HPET instead of the TSC is definitely a contributing
      factor, but I suspect there are some other things going on.
      But, I can't go dig down on a bug like that with my machine
      hanging all the time.
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: paulus@samba.org
      Cc: acme@ghostprotocols.net
      Cc: Dave Hansen <dave@sr71.net>
      [ Prettified it a bit. ]
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
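The throttling rule in the commit message above (sample length times maximum samples compared against a threshold) can be sketched as follows. This is only an illustration under stated assumptions: the struct, the field names, the 3:1 running average, and the choice to halve the rate are all hypothetical and are not taken from the patch itself.

```c
#include <stdint.h>

/* Hypothetical state for the sketch; not the kernel's data structure. */
struct demo_throttle {
    uint64_t avg_sample_ns;       /* running average of handler duration */
    uint64_t max_samples_per_sec; /* current upper bound on the sample rate */
    uint64_t budget_ns_per_sec;   /* configurable threshold: allowed ns per second */
};

/*
 * Record one handler duration and lower the rate if the projected cost
 * (average duration * max samples per second) exceeds the budget.
 * Returns 1 if the rate was lowered, 0 otherwise.
 */
static int demo_sample_took(struct demo_throttle *t, uint64_t sample_ns)
{
    /* Assumption: weighted running average, 3 parts old to 1 part new. */
    t->avg_sample_ns = (t->avg_sample_ns * 3 + sample_ns) / 4;

    if (t->avg_sample_ns * t->max_samples_per_sec <= t->budget_ns_per_sec)
        return 0; /* within budget, leave the rate alone */

    if (t->max_samples_per_sec <= 1)
        return 0; /* already at the floor, nothing left to drop */

    /* Assumption: halve the rate on each violation. */
    t->max_samples_per_sec /= 2;
    return 1;
}
```

For example, with a budget of 250,000,000 ns per second (25% of one CPU) and a limit of 100,000 samples per second, an average handler time of 1,000 ns stays within budget, while an average of 5,000 ns projects to 500,000,000 ns and triggers a drop to 50,000 samples per second.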
  10. 22 Jun, 2013 1 commit
  11. 21 Jun, 2013 1 commit
  12. 20 Jun, 2013 1 commit
    • clk: nomadik: implement the Nomadik clocks properly · ef6eb322
      Linus Walleij committed
      The Nomadik clock implementation was a stub just using
      fixed clocks.
      
      This implements the clocks properly instead of relying
      on them all being on at boot and leaving them all on.
      
      The PLLs are at the top, locking to the main crystal
      oscillator, then the HCLK for the peripherals are
      below PLL2.
      
      The gated clocks are implemented with zero cells and
      given the clock ID as a property of each node, so every
      gate needs to have its own node in the device tree.
      This is because the gate registers contain both HCLK
      gates and PCLK gates, where the latter has HCLK as
      parent. As can be seen from the register layout, this
      is a complete mixup, which means all these gates need
      their own node to properly model parent/child relations
      for PCLKs apart from the HCLKs.
      
      This driver also adds a helpful debugfs file to inspect
      the hardware state of the clock gates.
      
      This is the end result in <debugfs>/clk/clk_summary
      after applying a proper device tree:
      
      ulpiclk                0   0    60000000
      mxtal                  3   3    19200000
         pll2                1   1    864000000
            clk48            3   3    48000000
               rngcclk       1   1    48000000
               usbmclk       0   0    48000000
               mshcclk       0   0    48000000
               mspclk3       0   0    48000000
               x3dclk        0   0    48000000
               skeclk        0   0    48000000
               owmclk        0   0    48000000
               mspclk2       0   0    48000000
               mspclk1       0   0    48000000
               uart2clk      0   0    48000000
               ipbmcclk      0   0    48000000
               ipi2cclk      0   0    48000000
               usbclk        0   0    48000000
               mspclk0       0   0    48000000
               uart1clk      1   2    48000000
               i2c1clk       0   0    48000000
               i2c0clk       0   0    48000000
               sdiclk        1   1    48000000
               uart0clk      0   0    48000000
               sspiclk       0   0    48000000
               irdaclk       0   0    48000000
            clk72            0   0    72000000
               difclk        0   0    72000000
               clcdclk       0   0    72000000
            clk216           0   0    216000000
               hsiclkrx      0   0    216000000
               clk108        0   0    108000000
                  hsiclktx   0   0    108000000
                  clk27      0   0    27000000
         pll1                1   1    264000000
            hclk             3   3    264000000
               hclkrng       1   1    264000000
               hclkusbm      0   0    264000000
               hclkcryp      0   0    264000000
               hclkhash      0   0    264000000
               hclk3d        0   0    264000000
               hclkhpi       0   0    264000000
               hclksva       0   0    264000000
               hclksaa       0   0    264000000
               hclkdif       0   0    264000000
               hclkusb       0   0    264000000
               hclkclcd      0   0    264000000
               hclkdma1      0   0    264000000
               hclksdram     0   0    264000000
               hclksmc       1   1    264000000
               hclkdma0      0   0    264000000
               pclk          7   9    264000000
                  pclkmsp3   0   0    264000000
                  pclkmshc   0   0    264000000
                  pclkhsem   0   0    264000000
                  pclkske    0   0    264000000
                  pclkowm    0   0    264000000
                  pclkmsp2   0   0    264000000
                  pclkmsp1   0   0    264000000
                  pclkuart2  0   0    264000000
                  pclkxti    0   0    264000000
                  pclkhsi    0   0    264000000
                  pclkmsp0   0   0    264000000
                  pclkuart1  1   1    264000000
                  pclki2c1   0   0    264000000
                  pclki2c0   0   0    264000000
                  pclksdi    1   1    264000000
                  pclkuart0  1   1    264000000
                  pclkssp    0   0    264000000
                  pclkirda   0   0    264000000
         timclk              1   1    2400000
      Acked-by: Mike Turquette <mturquette@linaro.org>
      Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
  13. 19 Jun, 2013 14 commits
  14. 18 Jun, 2013 4 commits