1. 01 May, 2013 12 commits
    • binfmt_elf: PIE: make PF_RANDOMIZE check comment more accurate · c1d025e2
      Jiri Kosina committed
      The comment I originally added in commit a3defbe5 ("binfmt_elf: fix
      PIE execution with randomization disabled") is not entirely accurate:
      sysctl is not the only way PF_RANDOMIZE can be forcibly unset at
      runtime.
      
      Another option is direct modification of the personality flags
      (e.g. by running through the setarch wrapper).
      
      Make the comment more explicit and accurate.
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
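To make the personality route concrete, here is a minimal userspace sketch (not kernel code) that checks whether the calling process has ADDR_NO_RANDOMIZE set in its persona, which is the flag setarch -R sets. The helper names persona_disables_aslr() and aslr_disabled_by_personality() are hypothetical; personality(0xffffffff) is the documented way to query the persona without changing it.

```c
#include <sys/personality.h>

/* Pure helper (hypothetical name): does this persona value carry
 * the ADDR_NO_RANDOMIZE flag? */
int persona_disables_aslr(unsigned long persona)
{
	return (persona & ADDR_NO_RANDOMIZE) != 0;
}

/*
 * Query the calling process's persona without modifying it.
 * Returns 1 if randomization is disabled via personality flags,
 * 0 if it is not, and -1 if the query itself failed.
 */
int aslr_disabled_by_personality(void)
{
	int persona = personality(0xffffffff);

	if (persona == -1)
		return -1;
	return persona_disables_aslr((unsigned long)persona);
}
```

A process started through "setarch -R" would be expected to see 1 here even when the randomize_va_space sysctl is left at its default, which is the case the updated comment describes.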
    • fs: make binfmt support for #! scripts modular and removable · 2535e0d7
      Josh Triplett committed
      Add a new configuration option CONFIG_BINFMT_SCRIPT to configure support
      for interpreted scripts starting with "#!"; allow compiling out that
      support, or building it as a module.  Embedded systems running exclusively
      compiled binaries could leave this support out, and systems that don't
      need scripts before mounting the root filesystem can build this as a
      module.
      Signed-off-by: Josh Triplett <josh@joshtriplett.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
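For readers unfamiliar with what "#!" support involves, the following is a simplified userspace sketch of recognising a script header and extracting the interpreter path. It is an illustration only and is not the kernel's binfmt_script code; the function name parse_shebang() and its return convention are assumptions made for this example.

```c
#include <stddef.h>
#include <string.h>

/*
 * If buf starts with "#!", copy the interpreter path (the first
 * whitespace-delimited token after "#!") into out and return 0.
 * Return -1 if buf is not a script header or out is too small.
 */
int parse_shebang(const char *buf, char *out, size_t outlen)
{
	const char *p, *end;
	size_t n;

	if (outlen == 0 || strncmp(buf, "#!", 2) != 0)
		return -1;

	/* Skip optional blanks between "#!" and the path. */
	p = buf + 2;
	while (*p == ' ' || *p == '\t')
		p++;

	/* The path ends at the first blank, newline, or NUL. */
	end = p;
	while (*end && *end != ' ' && *end != '\t' && *end != '\n')
		end++;

	n = (size_t)(end - p);
	if (n == 0 || n >= outlen)
		return -1;

	memcpy(out, p, n);
	out[n] = '\0';
	return 0;
}
```

A system built without CONFIG_BINFMT_SCRIPT would skip this kind of check entirely, so only binaries handled by the remaining binfmt handlers could be executed directly.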
    • epoll: cleanup: use RCU_INIT_POINTER when nulling · d6d67e72
      Eric Wong committed
      It is always safe to use RCU_INIT_POINTER to NULL a pointer.  This results
      in slightly smaller/faster code.
      Signed-off-by: Eric Wong <normalperson@yhbt.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • epoll: cleanup: hoist out f_op->poll calls · 450d89ec
      Eric Wong committed
      This reduces the amount of code inside the ready list iteration loops for
      better readability IMHO.
      Signed-off-by: Eric Wong <normalperson@yhbt.net>
      Cc: Davide Libenzi <davidel@xmailserver.org>
      Cc: Al Viro <viro@ZenIV.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • epoll: lock ep->mtx in ep_free to silence lockdep · ddf676c3
      Eric Wong committed
      Technically we do not need to hold ep->mtx during ep_free since we are
      certain there are no other users of ep at that point.  However, lockdep
      complains with a "suspicious rcu_dereference_check() usage!" message; so
      lock the mutex before ep_remove to silence the warning.
      Signed-off-by: Eric Wong <normalperson@yhbt.net>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Arve Hjønnevåg <arve@android.com>
      Cc: Davide Libenzi <davidel@xmailserver.org>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: NeilBrown <neilb@suse.de>
      Cc: Rafael J. Wysocki <rjw@sisk.pl>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • epoll: use RCU to protect wakeup_source in epitem · eea1d585
      Eric Wong committed
      This prevents wakeup_source destruction when a user hits the item with
      EPOLL_CTL_MOD while ep_poll_callback is running.
      
      Tested with CONFIG_SPARSE_RCU_POINTER=y and "make fs/eventpoll.o C=2"
      Signed-off-by: Eric Wong <normalperson@yhbt.net>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Arve Hjønnevåg <arve@android.com>
      Cc: Davide Libenzi <davidel@xmailserver.org>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: NeilBrown <neilb@suse.de>
      Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
      Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • epoll: trim epitem by one cache line · 39732ca5
      Eric Wong committed
      It is common for epoll users to have thousands of epitems, so saving a
      cache line on every allocation leads to large memory savings.
      
      Since epitem allocations are cache-aligned, reducing sizeof(struct
      epitem) from 136 bytes to 128 bytes will allow it to squeeze under a
      cache line boundary on x86_64.
      
      Via /sys/kernel/slab/eventpoll_epi, I see the following changes on my
      x86_64 Core2 Duo (which has 64-byte cache alignment):
      
      	object_size  :  192 => 128
      	objs_per_slab:   21 =>  32
      
      Also, add a BUILD_BUG_ON() to check for future accidental breakage.
      
      [akpm@linux-foundation.org: use __packed, for all architectures]
      Signed-off-by: Eric Wong <normalperson@yhbt.net>
      Cc: Davide Libenzi <davidel@xmailserver.org>
      Cc: Al Viro <viro@ZenIV.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
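The slab numbers quoted in the epitem commit can be reproduced with simple arithmetic. The sketch below rounds an object size up to a cache-line multiple and divides a slab by the result. The 4096-byte slab size and the absence of per-object metadata are assumptions made for this example (the commit does not state them), and both function names are hypothetical.

```c
#include <stddef.h>

/* Round size up to the next multiple of the cache-line size. */
size_t round_up_to_line(size_t size, size_t line)
{
	return (size + line - 1) / line * line;
}

/* How many objects of object_size fit into a slab of slab_bytes,
 * assuming no per-object metadata (an assumption of this sketch). */
size_t objs_per_slab(size_t object_size, size_t slab_bytes)
{
	return slab_bytes / object_size;
}
```

With 64-byte lines, 136 bytes rounds up to 192 and 128 bytes stays at 128; under the 4096-byte slab assumption that gives 21 and 32 objects per slab, consistent with the figures reported from /sys/kernel/slab/eventpoll_epi.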
    • binfmt_misc: reuse string_unescape_inplace() · 8d82e180
      Andy Shevchenko committed
      The string_unescape_inplace() function decodes strings in a generic
      way.  Let's use it.
      Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: linux-fsdevel@vger.kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
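As an illustration of what in-place unescaping means, here is a simplified userspace sketch that decodes only two-digit "\xNN" hex escapes, overwriting the input buffer. It is not the kernel's string_unescape_inplace(), whose exact set of supported escapes is not described in this commit message; unescape_hex_inplace() and hex_val() are hypothetical names.

```c
#include <ctype.h>
#include <stddef.h>

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hex_val(unsigned char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	c = (unsigned char)tolower(c);
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	return -1;
}

/*
 * Decode "\xNN" sequences in s in place and return the decoded
 * length.  Malformed escapes are copied through unchanged.  The
 * output never outruns the input, so no extra buffer is needed.
 */
size_t unescape_hex_inplace(char *s)
{
	char *in = s, *out = s;

	while (*in) {
		if (in[0] == '\\' && in[1] == 'x') {
			int hi = hex_val((unsigned char)in[2]);
			int lo = hi >= 0 ? hex_val((unsigned char)in[3]) : -1;

			if (hi >= 0 && lo >= 0) {
				*out++ = (char)(hi * 16 + lo);
				in += 4;
				continue;
			}
		}
		*out++ = *in++;
	}
	*out = '\0';
	return (size_t)(out - s);
}
```

Because decoded bytes may include NUL, callers should rely on the returned length rather than strlen() when the result is treated as binary data such as a magic number.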
    • writeback: set worker desc to identify writeback workers in task dumps · ef3b1019
      Tejun Heo committed
      Writeback has been recently converted to use workqueue instead of its
      private thread pool implementation.  One negative side effect of this
      conversion is that there's no easy way to tell which backing device a
      writeback work item was working on at the time of task dump, be it
      sysrq-t, BUG, WARN or whatever, which, according to our writeback
      brethren, is important in tracking down issues with a lot of mounted
      file systems on a lot of different devices.
      
      This patch restores that information using the new worker description
      facility.  bdi_writeback_workfn() calls set_work_desc() to identify
      which bdi it's working on.  The description is printed out together with
      the workqueue name and worker function, as in the following example dump.
      
       WARNING: at fs/fs-writeback.c:1015 bdi_writeback_workfn+0x2b4/0x3c0()
       Modules linked in:
       Pid: 28, comm: kworker/u18:0 Not tainted 3.9.0-rc1-work+ #24 empty empty/S3992
       Workqueue: writeback bdi_writeback_workfn (flush-8:16)
        ffffffff820a3a98 ffff88015b927cb8 ffffffff81c61855 ffff88015b927cf8
        ffffffff8108f500 0000000000000000 ffff88007a171948 ffff88007a1716b0
        ffff88015b49df00 ffff88015b8d3940 0000000000000000 ffff88015b927d08
       Call Trace:
        [<ffffffff81c61855>] dump_stack+0x19/0x1b
        [<ffffffff8108f500>] warn_slowpath_common+0x70/0xa0
        [<ffffffff8108f54a>] warn_slowpath_null+0x1a/0x20
        [<ffffffff81200144>] bdi_writeback_workfn+0x2b4/0x3c0
        [<ffffffff810b4c87>] process_one_work+0x1d7/0x660
        [<ffffffff810b5c72>] worker_thread+0x122/0x380
        [<ffffffff810bdfea>] kthread+0xea/0xf0
        [<ffffffff81c6cedc>] ret_from_fork+0x7c/0xb0
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • fs/dcache.c: add cond_resched() to shrink_dcache_parent() · 421348f1
      Greg Thelen committed
      Call cond_resched() in shrink_dcache_parent() to maintain interactivity.
      
      Before this patch:
      
      	void shrink_dcache_parent(struct dentry * parent)
      	{
      		while ((found = select_parent(parent, &dispose)) != 0)
      			shrink_dentry_list(&dispose);
      	}
      
      select_parent() populates the dispose list with dentries which
      shrink_dentry_list() then deletes.  select_parent() carefully uses
      need_resched() to avoid doing too much work at once.  But neither
      shrink_dcache_parent() nor its called functions call cond_resched().  So
      once need_resched() is set, select_parent() will return a single-dentry
      dispose list which is then deleted by shrink_dentry_list().  This is
      inefficient when there are a lot of dentries to process.  This can cause
      softlockup and hurts interactivity on non-preemptible kernels.
      
      This change adds cond_resched() in shrink_dcache_parent().  The benefit
      of this is that need_resched() is quickly cleared so that future calls
      to select_parent() are able to efficiently return a big batch of dentries.
      
      These additional cond_resched() do not seem to impact performance, at
      least for the workload below.
      
      Here is a program which can cause soft lockup if other system activity
      sets need_resched().
      
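      	#include <err.h>
      	#include <fcntl.h>
      	#include <stdio.h>
      	#include <stdlib.h>
      	#include <sys/resource.h>
      	#include <sys/stat.h>
      	#include <sys/time.h>
      	#include <unistd.h>

      	int main(void)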
      	{
      	        struct rlimit rlim;
      	        int i;
      	        int f[100000];
      	        char buf[20];
      	        struct timeval t1, t2;
      	        double diff;
      
      	        /* cleanup past run */
      	        system("rm -rf x");
      
      	        /* boost nfile rlimit */
      	        rlim.rlim_cur = 200000;
      	        rlim.rlim_max = 200000;
      	        if (setrlimit(RLIMIT_NOFILE, &rlim))
      	                err(1, "setrlimit");
      
      	        /* make directory for files */
      	        if (mkdir("x", 0700))
      	                err(1, "mkdir");
      
      	        if (gettimeofday(&t1, NULL))
      	                err(1, "gettimeofday");
      
      	        /* populate directory with open files */
      	        for (i = 0; i < 100000; i++) {
      	                snprintf(buf, sizeof(buf), "x/%d", i);
      	                /* O_CREAT requires an explicit mode argument */
      	                f[i] = open(buf, O_CREAT | O_RDONLY, 0600);
      	                if (f[i] == -1)
      	                        err(1, "open");
      	        }
      
      	        /* close some of the files */
      	        for (i = 0; i < 85000; i++)
      	                close(f[i]);
      
      	        /* unlink all files, even open ones */
      	        system("rm -rf x");
      
      	        if (gettimeofday(&t2, NULL))
      	                err(1, "gettimeofday");
      
      	        diff = (((double)t2.tv_sec * 1000000 + t2.tv_usec) -
      	                ((double)t1.tv_sec * 1000000 + t1.tv_usec));
      
      	        printf("done: %g elapsed\n", diff/1e6);
      	        return 0;
      	}
      Signed-off-by: Greg Thelen <gthelen@google.com>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • fs/block_dev.c: no need to check inode->i_bdev in bd_forget() · b4ea2eaa
      Yan Hong committed
      Its only caller evict() has promised a non-NULL inode->i_bdev.
      Signed-off-by: Yan Hong <clouds.yan@gmail.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • inotify: invalid mask should return a error number but not set it · 04df32fa
      Zhao Hongjiang committed
      When we run the crackerjack testsuite, the inotify_add_watch test is
      stalled.
      
      This is caused by the invalid mask 0 - the task is waiting for the event
      but it never comes.  inotify_add_watch() should return -EINVAL as it did
      before commit 676a0675 ("inotify: remove broken mask checks causing
      unmount to be EINVAL").  That commit removes the invalid mask check, but
      that check is needed.
      
      Check the mask's ALL_INOTIFY_BITS before the inotify_arg_to_mask() call.
      If none are set, just return -EINVAL.
      
      Because IN_UNMOUNT is in ALL_INOTIFY_BITS, this change will not trigger
      the problem that the above commit fixed.
      
      [akpm@linux-foundation.org: fix build]
      Signed-off-by: Zhao Hongjiang <zhaohongjiang@huawei.com>
      Acked-by: Jim Somerville <Jim.Somerville@windriver.com>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Jerome Marchand <jmarchan@redhat.com>
      Cc: Eric Paris <eparis@parisplace.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
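The mask check described in the inotify commit can be sketched in userspace as follows. This is an illustration of the logic, not the kernel patch itself: SKETCH_ALL_INOTIFY_BITS is a local stand-in assembled from the public <sys/inotify.h> flags (the kernel's ALL_INOTIFY_BITS is not exported to userspace, and its exact membership is assumed here), and check_inotify_mask() is a hypothetical name.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/inotify.h>

/* Local stand-in for the kernel-internal ALL_INOTIFY_BITS; the
 * exact set of flags is an assumption of this sketch. */
#define SKETCH_ALL_INOTIFY_BITS \
	(IN_ALL_EVENTS | IN_UNMOUNT | IN_Q_OVERFLOW | IN_IGNORED | \
	 IN_ONLYDIR | IN_DONT_FOLLOW | IN_EXCL_UNLINK | IN_MASK_ADD | \
	 IN_ISDIR | IN_ONESHOT)

/*
 * Reject a mask that sets none of the known inotify bits, so the
 * caller gets -EINVAL instead of waiting for an event that can
 * never arrive.  Returns 0 when at least one known bit is set.
 */
int check_inotify_mask(uint32_t mask)
{
	if (!(mask & SKETCH_ALL_INOTIFY_BITS))
		return -EINVAL;
	return 0;
}
```

A mask of 0 is rejected, while a mask containing only IN_UNMOUNT passes, which is why the check does not reintroduce the problem addressed by commit 676a0675.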
  2. 30 Apr, 2013 23 commits
  3. 29 Apr, 2013 1 commit
  4. 26 Apr, 2013 1 commit
  5. 19 Apr, 2013 1 commit
  6. 18 Apr, 2013 2 commits