1. 01 May, 2013 6 commits
    • epoll: trim epitem by one cache line · 39732ca5
      Eric Wong committed
      It is common for epoll users to have thousands of epitems, so saving a
      cache line on every allocation leads to large memory savings.
      
      Since epitem allocations are cache-aligned, reducing sizeof(struct
      epitem) from 136 bytes to 128 bytes will allow it to squeeze under a
      cache line boundary on x86_64.
      
      Via /sys/kernel/slab/eventpoll_epi, I see the following changes on my
      x86_64 Core2 Duo (which has 64-byte cache alignment):
      
      	object_size  :  192 => 128
      	objs_per_slab:   21 =>  32
      
      Also, add a BUILD_BUG_ON() to check for future accidental breakage.
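
The size arithmetic above can be sketched in userspace C. This is only an illustration: the 4096-byte slab size is an assumption that happens to reproduce the objs_per_slab figures quoted, `struct demo_item` is a hypothetical stand-in for struct epitem, and `static_assert` plays the role of the kernel's BUILD_BUG_ON().

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values: 64-byte cache alignment (as quoted above) and a 4096-byte slab. */
#define DEMO_CACHE_LINE 64u
#define DEMO_SLAB_BYTES 4096u

/* Round size up to the next multiple of align (align must be a power of two). */
static size_t demo_round_up(size_t size, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (size + align - 1) & ~(align - 1);
}

/* Number of cache-aligned objects that fit into one slab of slab_bytes. */
static size_t demo_objs_per_slab(size_t size, size_t align, size_t slab_bytes)
{
	return slab_bytes / demo_round_up(size, align);
}

/* Hypothetical stand-in for struct epitem; the real layout differs. */
struct demo_item {
	char payload[128];
};

/* Userspace analogue of BUILD_BUG_ON(): the build fails if the struct grows. */
static_assert(sizeof(struct demo_item) <= 128, "demo_item grew past 128 bytes");
```

With these assumptions, a 136-byte object rounds up to 192 bytes and 21 objects fit per slab, while a 128-byte object stays at 128 bytes and 32 fit.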
      
      [akpm@linux-foundation.org: use __packed, for all architectures]
      Signed-off-by: Eric Wong <normalperson@yhbt.net>
      Cc: Davide Libenzi <davidel@xmailserver.org>
      Cc: Al Viro <viro@ZenIV.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • binfmt_misc: reuse string_unescape_inplace() · 8d82e180
      Andy Shevchenko committed
      There is a string_unescape_inplace() function which decodes strings in a
      generic way. Let's use it.
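
For readers unfamiliar with in-place unescaping, here is a minimal userspace sketch of the general idea. It is not the kernel's string_unescape_inplace(): `demo_unescape_hex_inplace()` is a hypothetical helper that handles only `\xH` and `\xHH` sequences and leaves everything else untouched.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Convert one hex digit to its value; the caller guarantees isxdigit(c). */
static int demo_hex_val(unsigned char c)
{
	if (isdigit(c))
		return c - '0';
	return tolower(c) - 'a' + 10;
}

/*
 * Hypothetical helper: decode "\xH" / "\xHH" sequences in buf in place and
 * return the decoded length. The output is never longer than the input, so
 * the write pointer can safely trail the read pointer in the same buffer.
 */
static size_t demo_unescape_hex_inplace(char *buf)
{
	char *src = buf, *dst = buf;

	assert(buf != NULL);
	while (*src) {
		if (src[0] == '\\' && src[1] == 'x' &&
		    isxdigit((unsigned char)src[2])) {
			int v = demo_hex_val((unsigned char)src[2]);

			src += 3;
			if (isxdigit((unsigned char)*src))
				v = v * 16 + demo_hex_val((unsigned char)*src++);
			*dst++ = (char)v;
		} else {
			*dst++ = *src++;	/* ordinary byte: copy through */
		}
	}
	*dst = '\0';
	return (size_t)(dst - buf);
}
```

The returned length matters because a decoded `\x00` leaves a NUL byte inside the buffer, so callers cannot rely on strlen() afterwards.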
      Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: linux-fsdevel@vger.kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • writeback: set worker desc to identify writeback workers in task dumps · ef3b1019
      Tejun Heo committed
      Writeback has been recently converted to use workqueue instead of its
      private thread pool implementation.  One negative side effect of this
      conversion is that there's no easy way to tell which backing device a
      writeback work item was working on at the time of task dump, be it
      sysrq-t, BUG, WARN or whatever, which, according to our writeback
      brethren, is important in tracking down issues with a lot of mounted
      file systems on a lot of different devices.
      
      This patch restores that information using the new worker description
      facility.  bdi_writeback_workfn() calls set_worker_desc() to identify
      which bdi it's working on.  The description is printed out together with
      the workqueue name and worker function as in the following example dump.
      
       WARNING: at fs/fs-writeback.c:1015 bdi_writeback_workfn+0x2b4/0x3c0()
       Modules linked in:
       Pid: 28, comm: kworker/u18:0 Not tainted 3.9.0-rc1-work+ #24 empty empty/S3992
       Workqueue: writeback bdi_writeback_workfn (flush-8:16)
        ffffffff820a3a98 ffff88015b927cb8 ffffffff81c61855 ffff88015b927cf8
        ffffffff8108f500 0000000000000000 ffff88007a171948 ffff88007a1716b0
        ffff88015b49df00 ffff88015b8d3940 0000000000000000 ffff88015b927d08
       Call Trace:
        [<ffffffff81c61855>] dump_stack+0x19/0x1b
        [<ffffffff8108f500>] warn_slowpath_common+0x70/0xa0
        [<ffffffff8108f54a>] warn_slowpath_null+0x1a/0x20
        [<ffffffff81200144>] bdi_writeback_workfn+0x2b4/0x3c0
        [<ffffffff810b4c87>] process_one_work+0x1d7/0x660
        [<ffffffff810b5c72>] worker_thread+0x122/0x380
        [<ffffffff810bdfea>] kthread+0xea/0xf0
        [<ffffffff81c6cedc>] ret_from_fork+0x7c/0xb0
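
The "Workqueue:" line in the dump above combines three pieces: the workqueue name, the worker function, and the description in parentheses. A rough userspace sketch of that idea follows. All names here (`struct demo_worker`, `demo_set_worker_desc()`, `demo_format_dump_line()`) are hypothetical and only model the behaviour described; they are not the kernel's data structures, and the "flush-%s" format is an assumption taken from the example output.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define DEMO_DESC_LEN 24

/* Hypothetical stand-in for a workqueue worker. */
struct demo_worker {
	char desc[DEMO_DESC_LEN];
};

/* printf-style setter: formats the description into the worker's buffer. */
static void demo_set_worker_desc(struct demo_worker *w, const char *fmt, ...)
{
	va_list ap;

	assert(w != NULL && fmt != NULL);
	va_start(ap, fmt);
	/* vsnprintf truncates if needed and always NUL-terminates. */
	vsnprintf(w->desc, sizeof(w->desc), fmt, ap);
	va_end(ap);
}

/* Build a "Workqueue:" line laid out like the one in the dump above. */
static int demo_format_dump_line(const struct demo_worker *w, const char *wq_name,
				 const char *fn_name, char *out, size_t out_len)
{
	if (strlen(w->desc) > 0)
		return snprintf(out, out_len, "Workqueue: %s %s (%s)",
				wq_name, fn_name, w->desc);
	return snprintf(out, out_len, "Workqueue: %s %s", wq_name, fn_name);
}
```

A worker function would set the description once at the start of its work item, so that any later dump of the task identifies the device it was handling.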
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • fs/dcache.c: add cond_resched() to shrink_dcache_parent() · 421348f1
      Greg Thelen committed
      Call cond_resched() in shrink_dcache_parent() to maintain interactivity.
      
      Before this patch:
      
      	void shrink_dcache_parent(struct dentry * parent)
      	{
      		LIST_HEAD(dispose);	/* dentries collected for disposal */
      		int found;

      		while ((found = select_parent(parent, &dispose)) != 0)
      			shrink_dentry_list(&dispose);
      	}
      
      select_parent() populates the dispose list with dentries which
      shrink_dentry_list() then deletes.  select_parent() carefully uses
      need_resched() to avoid doing too much work at once.  But neither
      shrink_dcache_parent() nor its called functions call cond_resched().  So
      once need_resched() is set, select_parent() will return a single-dentry
      dispose list which is then deleted by shrink_dentry_list().  This is
      inefficient when there are many dentries to process.  This can cause
      a softlockup and hurts interactivity on non-preemptible kernels.
      
      This change adds cond_resched() in shrink_dcache_parent().  The benefit
      of this is that need_resched() is quickly cleared so that future calls
      to select_parent() are able to efficiently return a big batch of
      dentries.

      These additional cond_resched() calls do not seem to impact performance,
      at least for the workload below.
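
The batching effect described above can be modelled in userspace. The sketch below is a toy simulation under stated assumptions, not kernel code: `demo_select_parent()` collects items until a pending-reschedule flag stops it (always taking at least one), `demo_cond_resched()` merely clears that flag, and the flag is assumed to be set once before the loop starts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model state; none of these names exist in the kernel. */
struct demo_state {
	size_t remaining;	/* dentries still to be disposed of */
	bool need_resched;	/* models the "please yield" flag */
};

/* Collect a batch: take at least one item, then stop early if a yield is pending. */
static size_t demo_select_parent(struct demo_state *s)
{
	size_t found = 0;

	assert(s != NULL);
	while (s->remaining > 0) {
		s->remaining--;
		found++;
		if (s->need_resched)
			break;
	}
	return found;
}

/* Yielding clears the pending flag, as the commit text describes. */
static void demo_cond_resched(struct demo_state *s)
{
	s->need_resched = false;
}

/* Run the shrink loop and return how many select passes were needed. */
static size_t demo_shrink(size_t dentries, bool use_cond_resched)
{
	struct demo_state s = { .remaining = dentries, .need_resched = true };
	size_t passes = 0;

	while (demo_select_parent(&s) != 0) {
		passes++;	/* the batch would be disposed of here */
		if (use_cond_resched)
			demo_cond_resched(&s);
	}
	return passes;
}
```

In this model, 1000 dentries take 1000 single-item passes when the flag is never cleared, but only 2 passes (one item, then the remaining 999) once the loop yields after each batch.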
      
      Here is a program which can cause a soft lockup if other system activity
      sets need_resched().
      
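      	#include <err.h>
      	#include <fcntl.h>
      	#include <stdio.h>
      	#include <stdlib.h>
      	#include <sys/resource.h>
      	#include <sys/stat.h>
      	#include <sys/time.h>
      	#include <unistd.h>

      	int main(void)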
      	{
      	        struct rlimit rlim;
      	        int i;
      	        int f[100000];
      	        char buf[20];
      	        struct timeval t1, t2;
      	        double diff;
      
      	        /* cleanup past run */
      	        system("rm -rf x");
      
      	        /* boost nfile rlimit */
      	        rlim.rlim_cur = 200000;
      	        rlim.rlim_max = 200000;
      	        if (setrlimit(RLIMIT_NOFILE, &rlim))
      	                err(1, "setrlimit");
      
      	        /* make directory for files */
      	        if (mkdir("x", 0700))
      	                err(1, "mkdir");
      
      	        if (gettimeofday(&t1, NULL))
      	                err(1, "gettimeofday");
      
      	        /* populate directory with open files */
      	        for (i = 0; i < 100000; i++) {
      	                snprintf(buf, sizeof(buf), "x/%d", i);
      	                /* O_CREAT requires an explicit mode argument */
      	                f[i] = open(buf, O_CREAT | O_RDONLY, 0600);
      	                if (f[i] == -1)
      	                        err(1, "open");
      	        }
      
      	        /* close some of the files */
      	        for (i = 0; i < 85000; i++)
      	                close(f[i]);
      
      	        /* unlink all files, even open ones */
      	        system("rm -rf x");
      
      	        if (gettimeofday(&t2, NULL))
      	                err(1, "gettimeofday");
      
      	        diff = (((double)t2.tv_sec * 1000000 + t2.tv_usec) -
      	                ((double)t1.tv_sec * 1000000 + t1.tv_usec));
      
      	        printf("done: %g elapsed\n", diff/1e6);
      	        return 0;
      	}
      Signed-off-by: Greg Thelen <gthelen@google.com>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • fs/block_dev.c: no need to check inode->i_bdev in bd_forget() · b4ea2eaa
      Yan Hong committed
      Its only caller evict() has promised a non-NULL inode->i_bdev.
      Signed-off-by: Yan Hong <clouds.yan@gmail.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • inotify: invalid mask should return a error number but not set it · 04df32fa
      Zhao Hongjiang committed
      When we run the crackerjack testsuite, the inotify_add_watch test is
      stalled.
      
      This is caused by the invalid mask 0 - the task is waiting for the event
      but it never comes.  inotify_add_watch() should return -EINVAL as it did
      before commit 676a0675 ("inotify: remove broken mask checks causing
      unmount to be EINVAL").  That commit removes the invalid mask check, but
      that check is needed.
      
      Check the mask's ALL_INOTIFY_BITS before the inotify_arg_to_mask() call.
      If none are set, just return -EINVAL.
      
      Because IN_UNMOUNT is in ALL_INOTIFY_BITS, this change will not trigger
      the problem that the above commit fixed.
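
The check described above amounts to a single bit test. The following userspace sketch shows its shape. `DEMO_ALL_INOTIFY_BITS` is an assumption: an approximation of the kernel-internal ALL_INOTIFY_BITS assembled from the flags that `<sys/inotify.h>` exports, and `demo_check_mask()` is a hypothetical helper rather than the kernel function.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/inotify.h>

/*
 * Assumption: approximates the kernel-internal ALL_INOTIFY_BITS using the
 * userspace flag definitions; the kernel's own definition may differ.
 */
#define DEMO_ALL_INOTIFY_BITS (IN_ALL_EVENTS | IN_UNMOUNT | IN_Q_OVERFLOW | \
			       IN_IGNORED | IN_ONLYDIR | IN_DONT_FOLLOW | \
			       IN_EXCL_UNLINK | IN_MASK_ADD | IN_ISDIR | IN_ONESHOT)

static_assert(DEMO_ALL_INOTIFY_BITS != 0, "mask of valid bits must not be empty");

/* Reject a mask with no recognised bit set; accept everything else. */
static int demo_check_mask(uint32_t mask)
{
	if (!(mask & DEMO_ALL_INOTIFY_BITS))
		return -EINVAL;
	return 0;
}
```

A mask of 0 is rejected with -EINVAL, while a mask containing only IN_UNMOUNT passes, which is why the earlier unmount fix is not affected.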
      
      [akpm@linux-foundation.org: fix build]
      Signed-off-by: Zhao Hongjiang <zhaohongjiang@huawei.com>
      Acked-by: Jim Somerville <Jim.Somerville@windriver.com>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Jerome Marchand <jmarchan@redhat.com>
      Cc: Eric Paris <eparis@parisplace.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 30 April, 2013 23 commits
  3. 29 April, 2013 1 commit
  4. 26 April, 2013 1 commit
  5. 19 April, 2013 1 commit
  6. 18 April, 2013 5 commits
  7. 14 April, 2013 1 commit
  8. 13 April, 2013 1 commit
    • Btrfs: make sure nbytes are right after log replay · 4bc4bee4
      Josef Bacik committed
      While trying to track down a tree log replay bug I noticed that fsck was always
      complaining about nbytes not being right for our fsynced file.  That is because
      the new fsync code doesn't wait for ordered extents to complete, so the inode's
      nbytes is not necessarily updated properly when we log it.  To fix this we
      need to set nbytes to whatever it is on the on-disk inode, so that when we
      replay the extents we can just add the bytes that each replayed extent adds.
      This works both when the logged nbytes is wrong and when we logged everything
      and nbytes is actually correct.  With this I'm no longer getting nbytes errors
      out of btrfsck.
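
The accounting rule described above can be illustrated with a deliberately simplified model. This is not Btrfs code: `demo_replay_nbytes()` is a hypothetical function, and it assumes the caller already knows how many new bytes each replayed extent contributes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model of the replay accounting: start from the on-disk
 * inode's nbytes, ignore the (possibly stale) logged value, and add the
 * bytes contributed by each replayed extent.
 */
static uint64_t demo_replay_nbytes(uint64_t on_disk_nbytes, uint64_t logged_nbytes,
				   const uint64_t *added_bytes, size_t n_extents)
{
	uint64_t nbytes = on_disk_nbytes;
	size_t i;

	(void)logged_nbytes;	/* deliberately unused: it may be wrong */
	assert(added_bytes != NULL || n_extents == 0);
	for (i = 0; i < n_extents; i++)
		nbytes += added_bytes[i];
	return nbytes;
}
```

Because the logged value never enters the sum, the result is the same whether that value was stale or correct.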
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      Signed-off-by: Chris Mason <chris.mason@fusionio.com>
  9. 12 April, 2013 1 commit