1. 05 Aug, 2010 15 commits
    • module: sysfs cleanup · 8f6d0378
      Rusty Russell committed
      We change the sysfs functions to take struct load_info, and call
      them all in mod_sysfs_setup().
      
      We also clean up the #ifdefs a little.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    • module: layout_and_allocate · d913188c
      Rusty Russell committed
      layout_and_allocate() does everything up to and including the final
      struct module placement inside the allocated module memory.  We have
      to store the symbol layout information in our struct load_info though.
      
      This avoids the nasty code we had before where 'mod' pointed first
      to the version inside the temporary allocation containing the entire
      file, then later was moved to point to the real struct module: now
      the main code only ever sees the final module address.
      
      (Includes fix for the Tony Luck-found Linus-diagnosed failure path
       error).
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    • module: fix crash in get_ksymbol() when oopsing in module init · 511ca6ae
      Rusty Russell committed
      Andrew had the sole pleasure of tickling this bug in linux-next; when we set
      up "info->strtab" it's pointing into the temporary copy of the module.  For
      most uses that is fine, but kallsyms keeps a pointer around during module
      load (inside mod->strtab).
      
      If we oops for some reason inside a module's init function, kallsyms will use
      the mod->strtab pointer into the now-freed temporary module copy.
      
      (Later oopses work fine: after init we overwrite mod->strtab to point to a
       compacted core-only strtab).
      Reported-by: Andrew "Grumpy" Morton <akpm@linux-foundation.org>
      Signed-off-by: Rusty "Buggy" Russell <rusty@rustcorp.com.au>
      Tested-by: Andrew "Happy" Morton <akpm@linux-foundation.org>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    • module: kallsyms functions take struct load_info · eded41c1
      Rusty Russell committed
      Simple refactor causes us to lift struct definition to top of file.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    • module: refactor out section header rewriting: FIX modversions · d6df72a0
      Rusty Russell committed
      We can't do the find_sec after removing the SHF_ALLOC flags; it won't
      find the sections.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    • module: refactor out section header rewriting · 8b5f61a7
      Rusty Russell committed
      Put all the "rewrite and check section headers" in one place.  This
      adds another iteration over the sections, but it's far clearer.  We
      iterate once for every find_section() so we already iterate over them
      many times.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    • module: add load_info · 3264d3f9
      Linus Torvalds committed
      Btw, here's a patch that _looks_ large, but it's really pretty trivial, and
      sets things up so that it would be way easier to split off pieces of the
      module loading.
      
      The reason it looks large is that it creates a "module_info" structure
      that contains all the module state that we're building up while loading,
      instead of having individual variables for all the indices etc.
      
      So the patch ends up being large, because every "symindex" access instead
      becomes "info.index.sym" etc. That may be a few characters longer, but it
      then means that we can just pass a pointer to that "info" structure
      around, and let all the pieces fill it in very naturally.
      
      As an example of that, the patch also moves the initialization of all
      those convenience variables into a "setup_module_info()" function. And at
      this point it really does become very natural to start to peel off some of
      the error labels and move them into the helper functions - now the
      "truncated" case is gone, and is handled inside that setup function
      instead.
      
      So maybe you don't like this approach, and it does make the variable
      accesses a bit longer, but I don't think unreadably so. And the patch
      really does look big and scary, but there really should be absolutely no
      semantic changes - most of it was a trivial and mindless rename.
      
      In fact, it was so mindless that I on purpose kept the existing helper
      functions looking like this:
      
      -       err = check_modinfo(mod, sechdrs, infoindex, versindex);
      +       err = check_modinfo(mod, info.sechdrs, info.index.info, info.index.vers);
      
      rather than changing them to just take the "info" pointer. IOW, a second
      phase (if you think the approach is ok) would change that calling
      convention to just do
      
      	err = check_modinfo(mod, &info);
      
      (and same for "layout_sections()", "layout_symtabs()" etc.) Similarly,
      while right now it makes things _look_ bigger, with things like this:
      
      	versindex = find_sec(hdr, sechdrs, secstrings, "__versions");
      
      becoming
      
      	info->index.vers = find_sec(info->hdr, info->sechdrs, info->secstrings, "__versions");
      
      in the new "setup_module_info()" function, that's again just a result of
      it being a search-and-replace patch. By using the 'info' pointer, we could
      just change the 'find_sec()' interface so that it ends up being
      
      	info->index.vers = find_sec(info, "__versions");
      
      instead, and then we'd actually have a shorter and more readable line. So
      for a lot of those mindless variable name expansions there would be room
      for separate cleanups.
      
      I didn't move quite everything in there - if we do this to layout_symtabs,
      for example, we'd want to move the percpu, symoffs, stroffs, *strmap
      variables to be fields in that module_info structure too. But that's a
      much smaller patch, I moved just the really core stuff that is currently
      being set up and used in various parts.
      
      But even in this rough form, it removes close to 70 lines from that
      function (but adds 22 lines overall, of course - the structure definition,
      the helper function declarations and call-sites etc etc).
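
The calling convention proposed above can be sketched in userspace C. This is only an illustration under stated assumptions: `load_info_sketch`, its `secnames` field, and `setup_indices()` are hypothetical stand-ins, and the real kernel structure carries ELF headers rather than a plain array of names. Returning 0 for "not found" is an assumption here, chosen because index 0 is the reserved null section in ELF.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily reduced stand-in for the loader state. */
struct load_info_sketch {
    const char *const *secnames;   /* section names; index 0 is the null section */
    unsigned int nsecs;            /* number of entries in secnames */
    struct {
        unsigned int sym, str, vers, info;
    } index;                       /* convenience indices filled in during setup */
};

/* Look up a section by name using only the info pointer.
 * Returns the section index, or 0 if no section has that name. */
static unsigned int find_sec(const struct load_info_sketch *info, const char *name)
{
    assert(info != NULL && name != NULL);
    for (unsigned int i = 1; i < info->nsecs; i++) {
        if (strcmp(info->secnames[i], name) == 0)
            return i;
    }
    return 0;
}

/* Fill in the convenience indices, the way a setup helper might. */
static void setup_indices(struct load_info_sketch *info)
{
    info->index.vers = find_sec(info, "__versions");
    info->index.info = find_sec(info, ".modinfo");
}
```

With a table such as `{ "", ".text", ".modinfo", "__versions" }`, `setup_indices()` would set `index.info` to 2 and `index.vers` to 3. Each helper needs one pointer argument instead of three or four loose variables.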
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    • module: reduce stack usage for each_symbol() · 44032e63
      Linus Torvalds committed
      And now that I'm looking at that call-chain (to see if it would make sense
      to use some other more specific lock - doesn't look like it: all the
      readers are using RCU and this is the only writer), I also give you this
      trivial one-liner. It changes each_symbol() to not put that constant array
      on the stack, resulting in changing
      
              movq    $C.388.31095, %rsi      #, tmp85
              subq    $376, %rsp      #,
              movq    %rdi, %rbx      # fn, fn
              leaq    -208(%rbp), %rdi        #, tmp84
              movq    %rbx, %rdx      # fn,
              rep movsl
              xorl    %esi, %esi      #
              leaq    -208(%rbp), %rdi        #, tmp87
              movq    %r12, %rcx      # data,
              call    each_symbol_in_section.clone.0  #
      
      into
      
              xorl    %esi, %esi      #
              subq    $216, %rsp      #,
              movq    %rdi, %rbx      # fn, fn
              movq    $arr.31078, %rdi        #,
              call    each_symbol_in_section.clone.0  #
      
      which is not just about being obviously shorter and simpler because we
      don't unnecessarily copy that constant array around onto the stack, but
      also about having a much smaller stack footprint (376 vs 216 bytes - see
      the update of 'rsp').
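
A minimal C sketch of the same idea, using hypothetical names (`range_sketch`, `span_total`) rather than the kernel's own table: an initialized automatic array is typically materialized on the stack on every call, while a `static const` array lives in read-only data and is passed by address. The exact code generated depends on the compiler and its flags, so the assembly above should not be expected to reproduce verbatim.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical element type; the kernel array holds symbol-table descriptors. */
struct range_sketch {
    int start;
    int stop;
};

/* Consumer that only reads the table. */
static int span_total(const struct range_sketch *arr, size_t n)
{
    int total = 0;

    assert(arr != NULL);
    for (size_t i = 0; i < n; i++)
        total += arr[i].stop - arr[i].start;
    return total;
}

/* Before: automatic storage, so the initializer may be copied onto the
 * stack each time the function is entered. */
static int total_with_local_array(void)
{
    const struct range_sketch arr[] = { {0, 4}, {4, 9}, {9, 15} };

    return span_total(arr, sizeof(arr) / sizeof(arr[0]));
}

/* After: static storage, a single read-only copy, no per-call stack copy. */
static int total_with_static_array(void)
{
    static const struct range_sketch arr[] = { {0, 4}, {4, 9}, {9, 15} };

    return span_total(arr, sizeof(arr) / sizeof(arr[0]));
}
```

Both functions compute the same result; only the storage class of the table differs, which is why the change is a one-liner.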
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    • module: refactor load_module part 5 · 22e268eb
      Rusty Russell committed
      1) Extract out the relocation loop into apply_relocations
      2) Extract license and version checks into check_module_license_and_versions
      3) Extract icache flushing into flush_module_icache
      4) Move __obsparm warning into find_module_sections
      5) Move license setting into check_modinfo.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    • module: refactor load_module part 4 · 9f85a4bb
      Rusty Russell committed
      Allocate references inside module_unload_init(), clean up inside
      module_unload_free().
      
      This version is fixed to do the allocation before __this_cpu_write, thanks to
      bug reports from linux-next from Dave Young <hidave.darkstar@gmail.com>
      and Stephen Rothwell <sfr@canb.auug.org.au>.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    • module: refactor load_module part 3 · 40dd2560
      Rusty Russell committed
      Extract out the allocation and copying in from userspace, and the
      first set of modinfo checks.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    • module: refactor load_module part 2 · 65b8a9b4
      Linus Torvalds committed
      Here's a second one. It's slightly less trivial - since we now have error
      cases - and equally untested so it may well be totally broken. But it also
      cleans up a bit more, and avoids one of the goto targets, because the
      "move_module()" helper now does both allocations or none.
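
The "both allocations or none" property can be sketched in userspace C. This is an assumption-laden illustration, not the kernel's move_module(): `module_mem_sketch` and `alloc_both_or_none()` are hypothetical names, `calloc()` stands in for the kernel allocator, and both sizes are assumed to be nonzero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical pair of regions, standing in for a module's core and init areas. */
struct module_mem_sketch {
    void *core;
    void *init;
};

/* Allocate both regions or neither.
 * Returns 0 on success, -1 on failure. On failure both pointers are NULL,
 * so the caller needs a single error path and no partial cleanup. */
static int alloc_both_or_none(struct module_mem_sketch *m,
                              size_t core_size, size_t init_size)
{
    assert(m != NULL);
    m->core = NULL;
    m->init = NULL;

    m->core = calloc(1, core_size);
    if (m->core == NULL)
        return -1;

    m->init = calloc(1, init_size);
    if (m->init == NULL) {
        /* Undo the first allocation so the helper stays all-or-nothing. */
        free(m->core);
        m->core = NULL;
        return -1;
    }
    return 0;
}
```

Because the helper cleans up after itself, a caller can drop the separate goto label that used to free the first region when the second allocation failed.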
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    • module: refactor load_module · f91a13bb
      Linus Torvalds committed
      I'd start from the trivial stuff. There's a fair amount of straight-line
      code that makes the function hard to read just because you have to
      page up and down so far. Some of it is trivial to just create a helper
      function for.
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    • module: module_unload_init() cleanup · 2409e742
      Eric Dumazet committed
      No need to clear mod->refptr in module_unload_init(), since
      alloc_percpu() already clears allocated chunks.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (removed unused var)
    • CRED: Fix RCU warning due to previous patch fixing __task_cred()'s checks · 694f690d
      David Howells committed
      Commit 8f92054e ("CRED: Fix __task_cred()'s lockdep check and banner
      comment") fixed the lockdep checks on __task_cred().  This has shown up
      a place in the signalling code where a lock should be held - namely that
      check_kill_permission() requires its callers to hold the RCU lock.
      
      Fix group_send_sig_info() to get the RCU read lock around its call to
      check_kill_permission().
      
      Without this patch, the following warning can occur:
      
        ===================================================
        [ INFO: suspicious rcu_dereference_check() usage. ]
        ---------------------------------------------------
        kernel/signal.c:660 invoked rcu_dereference_check() without protection!
        ...
      Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 03 Aug, 2010 1 commit
  3. 31 Jul, 2010 4 commits
  4. 30 Jul, 2010 1 commit
    • CRED: Fix get_task_cred() and task_state() to not resurrect dead credentials · de09a977
      David Howells committed
      It's possible for get_task_cred() as it currently stands to 'corrupt' a set of
      credentials by incrementing their usage count after their replacement by the
      task being accessed.
      
      What happens is that get_task_cred() can race with commit_creds():
      
      	TASK_1			TASK_2			RCU_CLEANER
      	-->get_task_cred(TASK_2)
      	rcu_read_lock()
      	__cred = __task_cred(TASK_2)
      				-->commit_creds()
      				old_cred = TASK_2->real_cred
      				TASK_2->real_cred = ...
      				put_cred(old_cred)
      				  call_rcu(old_cred)
      		[__cred->usage == 0]
      	get_cred(__cred)
      		[__cred->usage == 1]
      	rcu_read_unlock()
      							-->put_cred_rcu()
      							[__cred->usage == 1]
      							panic()
      
      However, since a task's credentials are generally not changed very often, we can
      reasonably make use of a loop involving reading the creds pointer and using
      atomic_inc_not_zero() to attempt to increment it if it hasn't already hit zero.
      
      If successful, we can safely return the credentials in the knowledge that, even
      if the task we're accessing has released them, they haven't gone to the RCU
      cleanup code.
      
      We then change task_state() in procfs to use get_task_cred() rather than
      calling get_cred() on the result of __task_cred(), as that suffers from the
      same problem.
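
The retry loop described above can be sketched with C11 atomics in userspace. This is a simplified illustration under stated assumptions: `cred_sketch`, `task_sketch`, `inc_not_zero()` and `get_task_cred_sketch()` are hypothetical names, and the sketch omits the RCU read-side protection that, in the kernel, keeps the credentials memory valid while the count is being examined.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct cred: only the usage count matters here. */
struct cred_sketch {
    atomic_int usage;
};

/* Hypothetical task with a published credentials pointer. */
struct task_sketch {
    struct cred_sketch *_Atomic real_cred;
};

/* Increment *v unless it is already zero.
 * Returns true if the increment happened, mirroring the contract of the
 * kernel's atomic_inc_not_zero(). */
static bool inc_not_zero(atomic_int *v)
{
    int old = atomic_load(v);

    while (old != 0) {
        /* On failure, 'old' is refreshed with the current value. */
        if (atomic_compare_exchange_weak(v, &old, old + 1))
            return true;
    }
    return false;
}

/* Re-read the pointer until we pin a cred whose count has not hit zero.
 * A cred at zero is already on its way to cleanup and must not be revived. */
static struct cred_sketch *get_task_cred_sketch(struct task_sketch *t)
{
    struct cred_sketch *c;

    assert(t != NULL);
    do {
        c = atomic_load(&t->real_cred);
    } while (!inc_not_zero(&c->usage));
    return c;
}
```

A plain unconditional increment would take the count from 0 to 1 in the race shown in the diagram; refusing to increment from zero is what prevents that.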
      
      Without this change, a BUG_ON in __put_cred() or in put_cred_rcu() can be
      tripped when it is noticed that the usage count is not zero as it ought to be,
      for example:
      
      kernel BUG at kernel/cred.c:168!
      invalid opcode: 0000 [#1] SMP
      last sysfs file: /sys/kernel/mm/ksm/run
      CPU 0
      Pid: 2436, comm: master Not tainted 2.6.33.3-85.fc13.x86_64 #1 0HR330/OptiPlex
      745
      RIP: 0010:[<ffffffff81069881>]  [<ffffffff81069881>] __put_cred+0xc/0x45
      RSP: 0018:ffff88019e7e9eb8  EFLAGS: 00010202
      RAX: 0000000000000001 RBX: ffff880161514480 RCX: 00000000ffffffff
      RDX: 00000000ffffffff RSI: ffff880140c690c0 RDI: ffff880140c690c0
      RBP: ffff88019e7e9eb8 R08: 00000000000000d0 R09: 0000000000000000
      R10: 0000000000000001 R11: 0000000000000040 R12: ffff880140c690c0
      R13: ffff88019e77aea0 R14: 00007fff336b0a5c R15: 0000000000000001
      FS:  00007f12f50d97c0(0000) GS:ffff880007400000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007f8f461bc000 CR3: 00000001b26ce000 CR4: 00000000000006f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process master (pid: 2436, threadinfo ffff88019e7e8000, task ffff88019e77aea0)
      Stack:
       ffff88019e7e9ec8 ffffffff810698cd ffff88019e7e9ef8 ffffffff81069b45
      <0> ffff880161514180 ffff880161514480 ffff880161514180 0000000000000000
      <0> ffff88019e7e9f28 ffffffff8106aace 0000000000000001 0000000000000246
      Call Trace:
       [<ffffffff810698cd>] put_cred+0x13/0x15
       [<ffffffff81069b45>] commit_creds+0x16b/0x175
       [<ffffffff8106aace>] set_current_groups+0x47/0x4e
       [<ffffffff8106ac89>] sys_setgroups+0xf6/0x105
       [<ffffffff81009b02>] system_call_fastpath+0x16/0x1b
      Code: 48 8d 71 ff e8 7e 4e 15 00 85 c0 78 0b 8b 75 ec 48 89 df e8 ef 4a 15 00
      48 83 c4 18 5b c9 c3 55 8b 07 8b 07 48 89 e5 85 c0 74 04 <0f> 0b eb fe 65 48 8b
      04 25 00 cc 00 00 48 3b b8 58 04 00 00 75
      RIP  [<ffffffff81069881>] __put_cred+0xc/0x45
       RSP <ffff88019e7e9eb8>
      ---[ end trace df391256a100ebdd ]---
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  5. 28 Jul, 2010 2 commits
  6. 26 Jul, 2010 3 commits
  7. 22 Jul, 2010 5 commits
  8. 21 Jul, 2010 1 commit
    • drop_monitor: convert some kfree_skb call sites to consume_skb · 70d4bf6d
      Neil Horman committed
      Convert a few calls from kfree_skb to consume_skb
      
      Noticed while I was working on dropwatch that I was detecting lots of internal
      skb drops in several places.  While some are legitimate, several were not,
      freeing skbs that were at the end of their life, rather than being discarded due
      to an error.  This patch converts those call sites from using kfree_skb to
      consume_skb, which quiets the in-kernel drop_monitor code from detecting them as
      drops.  Tested successfully by myself.
      Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  9. 19 Jul, 2010 8 commits