1. 18 10月, 2007 4 次提交
    • L
      9p: attach-per-user · ba17674f
      Latchesar Ionkov 提交于
      The 9P2000 protocol requires the authentication and permission checks to be
      done in the file server. For that reason every user that accesses the file
      server tree has to authenticate and attach to the server separately.
      Multiple users can share the same connection to the server.
      
      Currently v9fs does a single attach and executes all I/O operations as a
      single user. This makes using v9fs in multiuser environment unsafe as it
      depends on the client doing the permission checking.
      
      This patch improves the 9P2000 support by allowing every user to attach
      separately. The patch defines three modes of access (new mount option
      'access'):
      
      - attach-per-user (access=user) (default mode for 9P2000.u)
       If a user tries to access a file served by v9fs for the first time, v9fs
       sends an attach command to the server (Tattach) specifying the user. If
       the attach succeeds, the user can access the v9fs tree.
       As there is no uname->uid (string->integer) mapping yet, this mode works
       only with the 9P2000.u dialect.
      
      - allow only one user to access the tree (access=<uid>)
       Only the user with uid can access the v9fs tree. Other users that attempt
       to access it will get EPERM error.
      
      - do all operations as a single user (access=any) (default for 9P2000)
       V9fs does a single attach and all operations are done as a single user.
       If this mode is selected, the v9fs behavior is identical with the current
       one.
      Signed-off-by: NLatchesar Ionkov <lucho@ionkov.net>
      Signed-off-by: NEric Van Hensbergen <ericvh@gmail.com>
      ba17674f
    • L
      9p: rename uid and gid parameters · bd32b82d
      Latchesar Ionkov 提交于
      Change the names of 'uid' and 'gid' parameters to the more appropriate
      'dfltuid' and 'dfltgid'.  This also sets the default uid/gid to -2
      (aka nfsnobody)
      Signed-off-by: NLatchesar Ionkov <lucho@ionkov.net>
      Signed-off-by: NEric Van Hensbergen <ericvh@gmail.com>
      bd32b82d
    • E
      9p: Make transports dynamic · a80d923e
      Eric Van Hensbergen 提交于
      This patch abstracts out the interfaces to underlying transports so that
      new transports can be added as modules.  This should also allow kernel
      configuration of transports without ifdef-hell.
      Signed-off-by: NEric Van Hensbergen <ericvh@gmail.com>
      a80d923e
    • J
      x86: expand /proc/interrupts to include missing vectors, v2 · 38e760a1
      Joe Korty 提交于
      Add missing IRQs and IRQ descriptions to /proc/interrupts.
      
      /proc/interrupts is most useful when it displays every IRQ vector in use by
      the system, not just those somebody thought would be interesting.
      
      This patch inserts the following vector displays to the i386 and x86_64
      platforms, as appropriate:
      
      	rescheduling interrupts
      	TLB flush interrupts
      	function call interrupts
      	thermal event interrupts
      	threshold interrupts
      	spurious interrupts
      
      A threshold interrupt occurs when ECC memory correction is occuring at too
      high a frequency.  Thresholds are used by the ECC hardware as occasional
      ECC failures are part of normal operation, but long sequences of ECC
      failures usually indicate a memory chip that is about to fail.
      
      Thermal event interrupts occur when a temperature threshold has been
      exceeded for some CPU chip.  IIRC, a thermal interrupt is also generated
      when the temperature drops back to a normal level.
      
      A spurious interrupt is an interrupt that was raised then lowered by the
      device before it could be fully processed by the APIC.  Hence the apic sees
      the interrupt but does not know what device it came from.  For this case
      the APIC hardware will assume a vector of 0xff.
      
      Rescheduling, call, and TLB flush interrupts are sent from one CPU to
      another per the needs of the OS.  Typically, their statistics would be used
      to discover if an interrupt flood of the given type has been occuring.
      
      AK: merged v2 and v4 which had some more tweaks
      AK: replace Local interrupts with Local timer interrupts
      AK: Fixed description of interrupt types.
      
      [ tglx: arch/x86 adaptation ]
      [ mingo: small cleanup ]
      Signed-off-by: NJoe Korty <joe.korty@ccur.com>
      Signed-off-by: NAndi Kleen <ak@suse.de>
      Cc: Tim Hockin <thockin@hockin.org>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      38e760a1
  2. 17 10月, 2007 5 次提交
  3. 13 10月, 2007 1 次提交
    • A
      NTFS: Fix a mount time deadlock. · bfab36e8
      Anton Altaparmakov 提交于
      Big thanks go to Mathias Kolehmainen for reporting the bug, providing
      debug output and testing the patches I sent him to get it working.
      
      The fix was to stop calling ntfs_attr_set() at mount time as that causes
      balance_dirty_pages_ratelimited() to be called which on systems with
      little memory actually tries to go and balance the dirty pages which tries
      to take the s_umount semaphore but because we are still in fill_super()
      across which the VFS holds s_umount for writing this results in a
      deadlock.
      
      We now do the dirty work by hand by submitting individual buffers.  This
      has the annoying "feature" that mounting can take a few seconds if the
      journal is large as we have clear it all.  One day someone should improve
      on this by deferring the journal clearing to a helper kernel thread so it
      can be done in the background but I don't have time for this at the moment
      and the current solution works fine so I am leaving it like this for now.
      Signed-off-by: NAnton Altaparmakov <aia21@cantab.net>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      bfab36e8
  4. 10 10月, 2007 3 次提交
  5. 12 9月, 2007 2 次提交
  6. 23 8月, 2007 1 次提交
  7. 01 8月, 2007 1 次提交
  8. 20 7月, 2007 6 次提交
    • Y
      some kmalloc/memset ->kzalloc (tree wide) · dd00cc48
      Yoann Padioleau 提交于
      Transform some calls to kmalloc/memset to a single kzalloc (or kcalloc).
      
      Here is a short excerpt of the semantic patch performing
      this transformation:
      
      @@
      type T2;
      expression x;
      identifier f,fld;
      expression E;
      expression E1,E2;
      expression e1,e2,e3,y;
      statement S;
      @@
      
       x =
      - kmalloc
      + kzalloc
        (E1,E2)
        ...  when != \(x->fld=E;\|y=f(...,x,...);\|f(...,x,...);\|x=E;\|while(...) S\|for(e1;e2;e3) S\)
      - memset((T2)x,0,E1);
      
      @@
      expression E1,E2,E3;
      @@
      
      - kzalloc(E1 * E2,E3)
      + kcalloc(E1,E2,E3)
      
      [akpm@linux-foundation.org: get kcalloc args the right way around]
      Signed-off-by: NYoann Padioleau <padator@wanadoo.fr>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Acked-by: NRussell King <rmk@arm.linux.org.uk>
      Cc: Bryan Wu <bryan.wu@analog.com>
      Acked-by: NJiri Slaby <jirislaby@gmail.com>
      Cc: Dave Airlie <airlied@linux.ie>
      Acked-by: NRoland Dreier <rolandd@cisco.com>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Acked-by: NDmitry Torokhov <dtor@mail.ru>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Acked-by: NMauro Carvalho Chehab <mchehab@infradead.org>
      Acked-by: NPierre Ossman <drzeus-list@drzeus.cx>
      Cc: Jeff Garzik <jeff@garzik.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Acked-by: NGreg KH <greg@kroah.com>
      Cc: James Bottomley <James.Bottomley@steeleye.com>
      Cc: "Antonino A. Daplas" <adaplas@pol.net>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      dd00cc48
    • K
      coredump masking: documentation for /proc/pid/coredump_filter · bb90110d
      Kawai, Hidehiro 提交于
      This patch adds the documentation for /proc/<pid>/coredump_filter.
      Signed-off-by: NHidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Hugh Dickins <hugh@veritas.com>
      Cc: Nick Piggin <nickpiggin@yahoo.com.au>
      Cc: "Randy.Dunlap" <rdunlap@xenotime.net>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      bb90110d
    • P
      audit: rework execve audit · bdf4c48a
      Peter Zijlstra 提交于
      The purpose of audit_bprm() is to log the argv array to a userspace daemon at
      the end of the execve system call.  Since user-space hasn't had time to run,
      this array is still in pristine state on the process' stack; so no need to
      copy it, we can just grab it from there.
      
      In order to minimize the damage to audit_log_*() copy each string into a
      temporary kernel buffer first.
      
      Currently the audit code requires that the full argument vector fits in a
      single packet.  So currently it does clip the argv size to a (sysctl) limit,
      but only when execve auditing is enabled.
      
      If the audit protocol gets extended to allow for multiple packets this check
      can be removed.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: NOllie Wild <aaw@google.com>
      Cc: <linux-audit@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      bdf4c48a
    • N
      mm: fault feedback #1 · d0217ac0
      Nick Piggin 提交于
      Change ->fault prototype.  We now return an int, which contains
      VM_FAULT_xxx code in the low byte, and FAULT_RET_xxx code in the next byte.
       FAULT_RET_ code tells the VM whether a page was found, whether it has been
      locked, and potentially other things.  This is not quite the way he wanted
      it yet, but that's changed in the next patch (which requires changes to
      arch code).
      
      This means we no longer set VM_CAN_INVALIDATE in the vma in order to say
      that a page is locked which requires filemap_nopage to go away (because we
      can no longer remain backward compatible without that flag), but we were
      going to do that anyway.
      
      struct fault_data is renamed to struct vm_fault as Linus asked. address
      is now a void __user * that we should firmly encourage drivers not to use
      without really good reason.
      
      The page is now returned via a page pointer in the vm_fault struct.
      Signed-off-by: NNick Piggin <npiggin@suse.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      d0217ac0
    • M
      Document ->page_mkwrite() locking · ed2f2f9b
      Mark Fasheh 提交于
      There seems to be very little documentation about this callback in general.
      The locking in particular is a bit tricky, so it's worth having this in
      writing.
      Signed-off-by: NMark Fasheh <mark.fasheh@oracle.com>
      Cc: Nick Piggin <nickpiggin@yahoo.com.au>
      Cc: David Howells <dhowells@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ed2f2f9b
    • N
      mm: merge populate and nopage into fault (fixes nonlinear) · 54cb8821
      Nick Piggin 提交于
      Nonlinear mappings are (AFAIKS) simply a virtual memory concept that encodes
      the virtual address -> file offset differently from linear mappings.
      
      ->populate is a layering violation because the filesystem/pagecache code
      should need to know anything about the virtual memory mapping.  The hitch here
      is that the ->nopage handler didn't pass down enough information (ie.  pgoff).
       But it is more logical to pass pgoff rather than have the ->nopage function
      calculate it itself anyway (because that's a similar layering violation).
      
      Having the populate handler install the pte itself is likewise a nasty thing
      to be doing.
      
      This patch introduces a new fault handler that replaces ->nopage and
      ->populate and (later) ->nopfn.  Most of the old mechanism is still in place
      so there is a lot of duplication and nice cleanups that can be removed if
      everyone switches over.
      
      The rationale for doing this in the first place is that nonlinear mappings are
      subject to the pagefault vs invalidate/truncate race too, and it seemed stupid
      to duplicate the synchronisation logic rather than just consolidate the two.
      
      After this patch, MAP_NONBLOCK no longer sets up ptes for pages present in
      pagecache.  Seems like a fringe functionality anyway.
      
      NOPAGE_REFAULT is removed.  This should be implemented with ->fault, and no
      users have hit mainline yet.
      
      [akpm@linux-foundation.org: cleanup]
      [randy.dunlap@oracle.com: doc. fixes for readahead]
      [akpm@linux-foundation.org: build fix]
      Signed-off-by: NNick Piggin <npiggin@suse.de>
      Signed-off-by: NRandy Dunlap <randy.dunlap@oracle.com>
      Cc: Mark Fasheh <mark.fasheh@oracle.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      54cb8821
  9. 18 7月, 2007 2 次提交
  10. 17 7月, 2007 3 次提交
  11. 11 7月, 2007 3 次提交
    • J
      configfs: config item dependancies. · 631d1feb
      Joel Becker 提交于
      Sometimes other drivers depend on particular configfs items.  For
      example, ocfs2 mounts depend on a heartbeat region item.  If that
      region item is removed with rmdir(2), the ocfs2 mount must BUG or go
      readonly.  Not happy.
      
      This provides two additional API calls: configfs_depend_item() and
      configfs_undepend_item().  A client driver can call
      configfs_depend_item() on an existing item to tell configfs that it is
      depended on.  configfs will then return -EBUSY from rmdir(2) for that
      item.  When the item is no longer depended on, the client driver calls
      configfs_undepend_item() on it.
      
      These API cannot be called underneath any configfs callbacks, as
      they will conflict.  They can block and allocate.  A client driver
      probably shouldn't calling them of its own gumption.  Rather it should
      be providing an API that external subsystems call.
      
      How does this work?  Imagine the ocfs2 mount process.  When it mounts,
      it asks for a heart region item.  This is done via a call into the
      heartbeat code.  Inside the heartbeat code, the region item is looked
      up.  Here, the heartbeat code calls configfs_depend_item().  If it
      succeeds, then heartbeat knows the region is safe to give to ocfs2.
      If it fails, it was being torn down anyway, and heartbeat can gracefully
      pass up an error.
      
      [ Fixed some bad whitespace in configfs.txt. --Mark ]
      Signed-off-by: NJoel Becker <joel.becker@oracle.com>
      Signed-off-by: NMark Fasheh <mark.fasheh@oracle.com>
      631d1feb
    • J
      configfs: accessing item hierarchy during rmdir(2) · 299894cc
      Joel Becker 提交于
      Add a notification callback, ops->disconnect_notify(). It has the same
      prototype as ->drop_item(), but it will be called just before the item
      linkage is broken. This way, configfs users who want to do work while
      the object is still in the heirarchy have a chance.
      
      Client drivers will still need to config_item_put() in their
      ->drop_item(), if they implement it.  They need do nothing in
      ->disconnect_notify().  They don't have to provide it if they don't
      care.  But someone who wants to be notified before ci_parent is set to
      NULL can now be notified.
      Signed-off-by: NJoel Becker <joel.becker@oracle.com>
      Signed-off-by: NMark Fasheh <mark.fasheh@oracle.com>
      299894cc
    • J
      configfs: Convert subsystem semaphore to mutex · e6bd07ae
      Joel Becker 提交于
      Convert the su_sem member of struct configfs_subsystem to a struct
      mutex, as that's what it is. Also convert all the users and update
      Documentation/configfs.txt and Documentation/configfs_example.c
      accordingly.
      
      [ Conflict in fs/dlm/config.c with commit
        3168b078 manually resolved. --Mark ]
      Inspired-by: NSatyam Sharma <ssatyam@cse.iitk.ac.in>
      Signed-off-by: NJoel Becker <joel.becker@oracle.com>
      Signed-off-by: NMark Fasheh <mark.fasheh@oracle.com>
      e6bd07ae
  12. 09 6月, 2007 1 次提交
  13. 25 5月, 2007 1 次提交
  14. 09 5月, 2007 6 次提交
  15. 08 5月, 2007 1 次提交
    • D
      smaps: add clear_refs file to clear reference · b813e931
      David Rientjes 提交于
      Adds /proc/pid/clear_refs.  When any non-zero number is written to this file,
      pte_mkold() and ClearPageReferenced() is called for each pte and its
      corresponding page, respectively, in that task's VMAs.  This file is only
      writable by the user who owns the task.
      
      It is now possible to measure _approximately_ how much memory a task is using
      by clearing the reference bits with
      
      	echo 1 > /proc/pid/clear_refs
      
      and checking the reference count for each VMA from the /proc/pid/smaps output
      at a measured time interval.  For example, to observe the approximate change
      in memory footprint for a task, write a script that clears the references
      (echo 1 > /proc/pid/clear_refs), sleeps, and then greps for Pgs_Referenced and
      extracts the size in kB.  Add the sizes for each VMA together for the total
      referenced footprint.  Moments later, repeat the process and observe the
      difference.
      
      For example, using an efficient Mozilla:
      
      	accumulated time		referenced memory
      	----------------		-----------------
      		 0 s				 408 kB
      		 1 s				 408 kB
      		 2 s				 556 kB
      		 3 s				1028 kB
      		 4 s				 872 kB
      		 5 s				1956 kB
      		 6 s				 416 kB
      		 7 s				1560 kB
      		 8 s				2336 kB
      		 9 s				1044 kB
      		10 s				 416 kB
      
      This is a valuable tool to get an approximate measurement of the memory
      footprint for a task.
      
      Cc: Hugh Dickins <hugh@veritas.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Christoph Lameter <clameter@sgi.com>
      Signed-off-by: NDavid Rientjes <rientjes@google.com>
      [akpm@linux-foundation.org: build fixes]
      [mpm@selenic.com: rename for_each_pmd]
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b813e931