1. 18 Jul, 2015  1 commit
  2. 22 Jun, 2015  1 commit
  3. 19 Jun, 2015  1 commit
    • timer: Replace timer base by a cpu index · 0eeda71b
      Authored by Thomas Gleixner
      Instead of storing a pointer to the per-cpu tvec_base we can simply
      cache a CPU index in the timer_list and use that to get hold of the
      correct per-cpu tvec_base. This is only used in lock_timer_base() and
      the slightly larger code is peanuts versus the spinlock operation and
      the d-cache footprint of the timer wheel.

      Aside from that, this allows us to get rid of the following nuisances:
      
       - boot_tvec_base
      
         That statically allocated 4k bss data is just kept around so the
         timer has a home when it gets statically initialized. It serves no
         other purpose.
      
         With the CPU index we assign the timer to CPU0 at static
         initialization time and can therefore avoid the whole boot_tvec_base
         dance.  That also simplifies the init code, which can just use the
         per-cpu base.
      
         Before:
           text	   data	    bss	    dec	    hex	filename
          17491	   9201	   4160	  30852	   7884	../build/kernel/time/timer.o
         After:
           text	   data	    bss	    dec	    hex	filename
          17440	   9193	      0	  26633	   6809	../build/kernel/time/timer.o
      
       - Overloading the base pointer with various flags
      
         The CPU index has enough space to hold the flags (deferrable,
         irqsafe) so we can get rid of the extra masking and bit fiddling
         with the base pointer.
      
      As a benefit we reduce the size of struct timer_list on 64-bit
      machines by 4-8 bytes, a size reduction of up to 15% per struct
      timer_list, which is a real win as we have tons of them embedded in
      other structs.
      
      This also changes the newly added deferrable printout of the timer
      start trace point to capture and print all of timer->flags, which
      allows us to decode the timer's target CPU as well.

      We might have used bitfields for this, but that would have changed the
      static initializers and the init function, for no added value, just to
      accommodate big-endian bitfields.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Viresh Kumar <viresh.kumar@linaro.org>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Joonwoo Park <joonwoop@codeaurora.org>
      Cc: Wenbo Wang <wenbo.wang@memblaze.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Badhri Jagan Sridharan <Badhri@google.com>
      Link: http://lkml.kernel.org/r/20150526224511.950084301@linutronix.de
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      0eeda71b
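      A hedged sketch of the resulting lookup, for illustration only (not the
      verbatim upstream code; TIMER_CPUMASK is assumed here to mask the CPU
      index out of timer->flags): with the index and the flag bits packed into
      one word, lock_timer_base() can find the per-cpu base directly instead
      of chasing a cached pointer.

          static struct tvec_base *lock_timer_base(struct timer_list *timer,
                                                   unsigned long *flags)
          {
                  for (;;) {
                          u32 tf = timer->flags;  /* snapshot of index + flag bits */
                          struct tvec_base *base;

                          /* The low bits select the owning CPU's tvec_base. */
                          base = per_cpu_ptr(&tvec_bases, tf & TIMER_CPUMASK);
                          spin_lock_irqsave(&base->lock, *flags);
                          if (timer->flags == tf) /* not migrated in the meantime */
                                  return base;
                          spin_unlock_irqrestore(&base->lock, *flags);
                  }
          }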
  4. 11 Jun, 2015  1 commit
  5. 09 Jun, 2015  1 commit
    • ext4: Add support FALLOC_FL_INSERT_RANGE for fallocate · 331573fe
      Authored by Namjae Jeon
      This patch implements fallocate's FALLOC_FL_INSERT_RANGE for Ext4.
      
      1) Make sure that both offset and len are block-size aligned.
      2) Update the inode's i_size by len bytes.
      3) Compute the file's logical block number for offset. If the computed
         block number is not the starting block of an extent, split the extent
         so that the block number becomes the starting block of an extent.
      4) Shift all the extents lying in [offset, last allocated extent]
         to the right by len bytes. This step creates a hole of len bytes
         at offset.
      Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
      Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com>
      331573fe
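      A hedged user-space illustration of the new mode (the file name and the
      4 KiB block size are assumptions for the example): data at and beyond
      offset is shifted right by len bytes and the inserted range reads back
      as a hole.

          #define _GNU_SOURCE
          #include <fcntl.h>
          #include <linux/falloc.h>
          #include <stdio.h>
          #include <unistd.h>

          int main(void)
          {
                  off_t blksz  = 4096;            /* assumed filesystem block size */
                  off_t offset = 8 * blksz;       /* must be block-size aligned */
                  off_t len    = 4 * blksz;       /* must be block-size aligned */
                  int fd = open("testfile", O_RDWR);

                  if (fd < 0) {
                          perror("open");
                          return 1;
                  }
                  /* Grow i_size by len and shift existing extents right. */
                  if (fallocate(fd, FALLOC_FL_INSERT_RANGE, offset, len) < 0)
                          perror("fallocate(FALLOC_FL_INSERT_RANGE)");

                  close(fd);
                  return 0;
          }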
  6. 02 Jun, 2015  3 commits
    • target: Minimize SCSI header #include directives · ba929992
      Authored by Bart Van Assche
      Only include SCSI initiator header files in target code that needs
      these header files, namely the SCSI pass-through code and the tcm_loop
      driver. Change SCSI_SENSE_BUFFERSIZE into TRANSPORT_SENSE_BUFFER in
      target code because the former is intended for initiator code and the
      latter for target code. With this patch the only initiator include
      directives in target code that remain are as follows:
      
      $ git grep -nHE 'include .scsi/(scsi.h|scsi_host.h|scsi_device.h|scsi_cmnd.h)' drivers/target drivers/infiniband/ulp/{isert,srpt} drivers/usb/gadget/legacy/tcm_*.[ch] drivers/{vhost,xen} include/{target,trace/events/target.h}
      drivers/target/loopback/tcm_loop.c:29:#include <scsi/scsi.h>
      drivers/target/loopback/tcm_loop.c:31:#include <scsi/scsi_host.h>
      drivers/target/loopback/tcm_loop.c:32:#include <scsi/scsi_device.h>
      drivers/target/loopback/tcm_loop.c:33:#include <scsi/scsi_cmnd.h>
      drivers/target/target_core_pscsi.c:39:#include <scsi/scsi_device.h>
      drivers/target/target_core_pscsi.c:40:#include <scsi/scsi_host.h>
      drivers/xen/xen-scsiback.c:52:#include <scsi/scsi_host.h> /* SG_ALL */
      Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: James Bottomley <JBottomley@Odin.com>
      ba929992
    • writeback: move global_dirty_limit into wb_domain · dcc25ae7
      Authored by Tejun Heo
      This patch is part of the series that defines wb_domain, which
      represents a domain that wb's (bdi_writeback's) belong to and are
      measured against each other in.  This will enable IO backpressure
      propagation for cgroup writeback.
      
      global_dirty_limit exists to regulate the global dirty threshold which
      is a property of the wb_domain.  This patch moves hard_dirty_limit,
      dirty_lock, and update_time into wb_domain.
      
      This is pure reorganization and doesn't introduce any behavioral
      changes.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Wu Fengguang <fengguang.wu@intel.com>
      Cc: Greg Thelen <gthelen@google.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      dcc25ae7
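      A hedged sketch of the direction (field names are approximations, not
      the exact upstream layout): the formerly global dirty-limit state is
      gathered into the domain so each domain can later be throttled
      independently.

          struct wb_domain {
                  spinlock_t    lock;               /* was the global dirty_lock  */
                  unsigned long dirty_limit;        /* was global_dirty_limit     */
                  unsigned long dirty_limit_tstamp; /* was the global update_time */
          };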
    • writeback: move bandwidth related fields from backing_dev_info into bdi_writeback · a88a341a
      Authored by Tejun Heo
      Currently, a bdi (backing_dev_info) embeds single wb (bdi_writeback)
      and the role of the separation is unclear.  For cgroup support for
      writeback IOs, a bdi will be updated to host multiple wb's where each
      wb serves writeback IOs of a different cgroup on the bdi.  To achieve
      that, a wb should carry all states necessary for servicing writeback
      IOs for a cgroup independently.
      
      This patch moves bandwidth related fields from backing_dev_info into
      bdi_writeback.
      
      * The moved fields are: bw_time_stamp, dirtied_stamp, written_stamp,
        write_bandwidth, avg_write_bandwidth, dirty_ratelimit,
        balanced_dirty_ratelimit, completions and dirty_exceeded.
      
      * writeback_chunk_size() and over_bground_thresh() now take @wb
        instead of @bdi.
      
      * bdi_writeout_fraction(bdi, ...)	-> wb_writeout_fraction(wb, ...)
        bdi_dirty_limit(bdi, ...)		-> wb_dirty_limit(wb, ...)
        bdi_position_ratio(bdi, ...)		-> wb_position_ratio(wb, ...)
        bdi_update_writebandwidth(bdi, ...)	-> wb_update_write_bandwidth(wb, ...)
        [__]bdi_update_bandwidth(bdi, ...)	-> [__]wb_update_bandwidth(wb, ...)
        bdi_{max|min}_pause(bdi, ...)		-> wb_{max|min}_pause(wb, ...)
        bdi_dirty_limits(bdi, ...)		-> wb_dirty_limits(wb, ...)
      
      * Init/exits of the relocated fields are moved to bdi_wb_init/exit()
        respectively.  Note that explicit zeroing is dropped in the process
        as wb's are cleared in entirety anyway.
      
      * As there's still only one bdi_writeback per backing_dev_info, all
        uses of bdi->stat[] are mechanically replaced with bdi->wb.stat[]
        introducing no behavior changes.
      
      v2: Typo in description fixed as suggested by Jan.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Wu Fengguang <fengguang.wu@intel.com>
      Cc: Jaegeuk Kim <jaegeuk@kernel.org>
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      a88a341a
  7. 29 May, 2015  5 commits
    • tracing/mm: don't trace mm_page_pcpu_drain on offline cpus · 649b8de2
      Authored by Shreyas B. Prabhu
      Since tracepoints use RCU for protection, they must not be called on
      offline CPUs.  trace_mm_page_pcpu_drain can be called on an offline CPU
      in the following scenario, caught by LOCKDEP:
      
           ===============================
           [ INFO: suspicious RCU usage. ]
           4.1.0-rc1+ #9 Not tainted
           -------------------------------
           include/trace/events/kmem.h:265 suspicious rcu_dereference_check() usage!
      
          other info that might help us debug this:
      
          RCU used illegally from offline CPU!
          rcu_scheduler_active = 1, debug_locks = 1
           1 lock held by swapper/5/0:
            #0:  (&(&zone->lock)->rlock){..-...}, at: [<c0000000002073b0>] .free_pcppages_bulk+0x70/0x920
      
          stack backtrace:
           CPU: 5 PID: 0 Comm: swapper/5 Not tainted 4.1.0-rc1+ #9
           Call Trace:
             .dump_stack+0x98/0xd4 (unreliable)
             .lockdep_rcu_suspicious+0x108/0x170
             .free_pcppages_bulk+0x60c/0x920
             .free_hot_cold_page+0x208/0x280
             .destroy_context+0x90/0xd0
             .__mmdrop+0x58/0x160
             .idle_task_exit+0xf0/0x100
             .pnv_smp_cpu_kill_self+0x58/0x2c0
             .cpu_die+0x34/0x50
             .arch_cpu_idle_dead+0x20/0x40
             .cpu_startup_entry+0x708/0x7a0
             .start_secondary+0x36c/0x3a0
             start_secondary_prolog+0x10/0x14
      
      Fix this by converting the mm_page_pcpu_drain trace point into a
      TRACE_EVENT_CONDITION whose condition is cpu_online(smp_processor_id()).
      Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com>
      Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      649b8de2
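      A hedged, simplified sketch of the fix pattern (the recorded fields are
      abbreviated here): TRACE_EVENT_CONDITION evaluates the tracepoint body
      only when the condition holds, so an offline CPU never enters the
      RCU-protected path.

          TRACE_EVENT_CONDITION(mm_page_pcpu_drain,

                  TP_PROTO(struct page *page, unsigned int order, int migratetype),

                  TP_ARGS(page, order, migratetype),

                  /* Skip the RCU-protected body entirely on offline CPUs. */
                  TP_CONDITION(cpu_online(smp_processor_id())),

                  TP_STRUCT__entry(
                          __field(unsigned long, pfn)
                          __field(unsigned int,  order)
                          __field(int,           migratetype)
                  ),

                  TP_fast_assign(
                          __entry->pfn         = page_to_pfn(page);
                          __entry->order       = order;
                          __entry->migratetype = migratetype;
                  ),

                  TP_printk("pfn=%lu order=%u migratetype=%d",
                            __entry->pfn, __entry->order, __entry->migratetype)
          );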
    • tracing/mm: don't trace mm_page_free on offline cpus · 1f0c27b5
      Authored by Shreyas B. Prabhu
      Since tracepoints use RCU for protection, they must not be called on
      offline CPUs.  trace_mm_page_free can be called on an offline CPU in the
      following scenario, caught by LOCKDEP:
      
           ===============================
           [ INFO: suspicious RCU usage. ]
           4.1.0-rc1+ #9 Not tainted
           -------------------------------
           include/trace/events/kmem.h:170 suspicious rcu_dereference_check() usage!
      
          other info that might help us debug this:
      
          RCU used illegally from offline CPU!
          rcu_scheduler_active = 1, debug_locks = 1
           no locks held by swapper/1/0.
      
          stack backtrace:
           CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.1.0-rc1+ #9
           Call Trace:
             .dump_stack+0x98/0xd4 (unreliable)
             .lockdep_rcu_suspicious+0x108/0x170
             .free_pages_prepare+0x494/0x680
             .free_hot_cold_page+0x50/0x280
             .destroy_context+0x90/0xd0
             .__mmdrop+0x58/0x160
             .idle_task_exit+0xf0/0x100
             .pnv_smp_cpu_kill_self+0x58/0x2c0
             .cpu_die+0x34/0x50
             .arch_cpu_idle_dead+0x20/0x40
             .cpu_startup_entry+0x708/0x7a0
             .start_secondary+0x36c/0x3a0
             start_secondary_prolog+0x10/0x14
      
      Fix this by converting the mm_page_free trace point into a
      TRACE_EVENT_CONDITION whose condition is cpu_online(smp_processor_id()).
      Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com>
      Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      1f0c27b5
    • tracing/mm: don't trace kmem_cache_free on offline cpus · e5feb1eb
      Authored by Shreyas B. Prabhu
      Since tracepoints use RCU for protection, they must not be called on
      offline CPUs.  trace_kmem_cache_free can be called on an offline CPU in
      the following scenario, caught by LOCKDEP:
      
          ===============================
          [ INFO: suspicious RCU usage. ]
          4.1.0-rc1+ #9 Not tainted
          -------------------------------
          include/trace/events/kmem.h:148 suspicious rcu_dereference_check() usage!
      
          other info that might help us debug this:
      
          RCU used illegally from offline CPU!
          rcu_scheduler_active = 1, debug_locks = 1
          no locks held by swapper/1/0.
      
          stack backtrace:
          CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.1.0-rc1+ #9
          Call Trace:
            .dump_stack+0x98/0xd4 (unreliable)
            .lockdep_rcu_suspicious+0x108/0x170
            .kmem_cache_free+0x344/0x4b0
            .__mmdrop+0x4c/0x160
            .idle_task_exit+0xf0/0x100
            .pnv_smp_cpu_kill_self+0x58/0x2c0
            .cpu_die+0x34/0x50
            .arch_cpu_idle_dead+0x20/0x40
            .cpu_startup_entry+0x708/0x7a0
            .start_secondary+0x36c/0x3a0
            start_secondary_prolog+0x10/0x14
      
      Fix this by converting the kmem_cache_free trace point into a
      TRACE_EVENT_CONDITION whose condition is cpu_online(smp_processor_id()).
      Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com>
      Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      e5feb1eb
    • f2fs: add f2fs_map_blocks · 003a3e1d
      Authored by Jaegeuk Kim
      This patch introduces an f2fs_map_blocks structure, analogous to
      ext4_map_blocks.  Now f2fs uses f2fs_map_blocks when handling get_block.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      003a3e1d
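      A hedged sketch of the structure's shape (field names are illustrative
      and may not match the final layout): a request/result pair is passed
      around instead of raw get_block arguments.

          struct f2fs_map_blocks {
                  block_t      m_pblk;   /* result: first mapped physical block */
                  block_t      m_lblk;   /* request: starting logical block     */
                  unsigned int m_len;    /* request/result: number of blocks    */
                  unsigned int m_flags;  /* result: mapping status flags        */
          };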
    • block: discard bdi_unregister() in favour of bdi_destroy() · aad653a0
      Authored by NeilBrown
      bdi_unregister() now contains very little functionality.
      
      It contains a "WARN_ON" if bdi->dev is NULL.  This warning is of no
      real consequence as bdi->dev isn't needed by anything else in the function,
      and it triggers if
         blk_cleanup_queue() -> bdi_destroy()
      is called before bdi_unregister, which happens since
        Commit: 6cd18e71 ("block: destroy bdi before blockdev is unregistered.")
      
      So this isn't wanted.
      
      It also calls bdi_set_min_ratio().  This needs to be called after
      writes through the bdi have all been flushed, and before the bdi is destroyed.
      Calling it early is better than calling it late as it frees up a global
      resource.
      
      Calling it immediately after bdi_wb_shutdown() in bdi_destroy()
      perfectly fits these requirements.
      
      So bdi_unregister() can be discarded, with the important content moved
      to bdi_destroy(), as can the writeback_bdi_unregister trace event, which
      is no longer used.
      Reported-by: Mike Snitzer <snitzer@redhat.com>
      Cc: stable@vger.kernel.org (v4.0)
      Fixes: c4db59d3 ("fs: don't reassign dirty inodes to default_backing_dev_info")
      Fixes: 6cd18e71 ("block: destroy bdi before blockdev is unregistered.")
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: Dan Williams <dan.j.williams@intel.com>
      Tested-by: Nicholas Moulin <nicholas.w.moulin@linux.intel.com>
      Signed-off-by: NeilBrown <neilb@suse.de>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      aad653a0
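      A hedged sketch of the resulting teardown order in bdi_destroy()
      (details elided; not the verbatim upstream body): flush writeback first,
      release the min_ratio reservation, then finish the teardown.

          void bdi_destroy(struct backing_dev_info *bdi)
          {
                  bdi_wb_shutdown(bdi);      /* writes through the bdi are flushed */
                  bdi_set_min_ratio(bdi, 0); /* frees the global ratio reservation */
                  /* ... remaining teardown (device unregister, counters) ... */
          }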
  8. 23 May, 2015  1 commit
  9. 19 May, 2015  1 commit
    • sched/wait: Introduce TASK_NOLOAD and TASK_IDLE · 80ed87c8
      Authored by Peter Zijlstra
      Currently people use TASK_INTERRUPTIBLE to idle kthreads and wait for
      'work' because TASK_UNINTERRUPTIBLE contributes to the loadavg. Having
      all idle kthreads contribute to the loadavg is somewhat silly.
      
      Now mostly this works OK, because kthreads have all their signals
      masked. However there are a few sites where this is causing problems and
      where TASK_UNINTERRUPTIBLE should be used, except for that loadavg issue.

      This patch adds TASK_NOLOAD which, when combined with
      TASK_UNINTERRUPTIBLE, avoids the loadavg accounting.

      As most of the imagined usage sites are loops where a thread wants to
      idle, waiting for work, a TASK_IDLE helper is introduced.
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Julian Anastasov <ja@ssi.bg>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: NeilBrown <neilb@suse.de>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      80ed87c8
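      A hedged sketch of the intended usage (the TASK_NOLOAD value and the
      helpers work_pending_for_us()/do_the_work() are assumptions for the
      example): TASK_IDLE lets a kthread sleep uninterruptibly without being
      counted in the loadavg.

          #define TASK_NOLOAD  1024                          /* assumed value */
          #define TASK_IDLE    (TASK_UNINTERRUPTIBLE | TASK_NOLOAD)

          static int my_kthread_fn(void *data)
          {
                  while (!kthread_should_stop()) {
                          /* Sleep without contributing to the loadavg. */
                          set_current_state(TASK_IDLE);
                          if (!work_pending_for_us())    /* hypothetical check */
                                  schedule();
                          __set_current_state(TASK_RUNNING);
                          do_the_work();                 /* hypothetical work  */
                  }
                  return 0;
          }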
  10. 14 May, 2015  16 commits
  11. 12 May, 2015  1 commit
  12. 08 May, 2015  1 commit
  13. 05 May, 2015  2 commits
  14. 17 Apr, 2015  1 commit
  15. 16 Apr, 2015  2 commits
  16. 15 Apr, 2015  1 commit
  17. 13 Apr, 2015  1 commit