1. 02 Oct, 2006 23 commits
  2. 01 Oct, 2006 17 commits
    • [PATCH] Support piping into commands in /proc/sys/kernel/core_pattern · d025c9db
      Andi Kleen committed
      Using the infrastructure created in previous patches, implement support for
      piping core dumps into programs.
      
      This is done by overloading the existing core_pattern sysctl
      with a new syntax:
      
      |program
      
      When the first character of the pattern is a '|' the kernel will instead
      treat the rest of the pattern as a command to run.  The core dump will be
      written to the standard input of that program instead of to a file.
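
The '|' check described above can be illustrated with a small userspace sketch. The helper name `core_pattern_command` is hypothetical and this is not the kernel's code; it only shows how a leading '|' separates the "command" case from the ordinary file-name case.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not the kernel implementation): return the command
 * part of a core_pattern that starts with '|', or NULL when the pattern is
 * an ordinary file-name pattern.
 */
static const char *core_pattern_command(const char *pattern)
{
    assert(pattern != NULL);
    if (pattern[0] == '|')
        return pattern + 1;   /* everything after '|' is the command */
    return NULL;              /* plain file-name pattern */
}
```

With this sketch, a pattern such as `|program` yields `program`, while a pattern such as `core` yields NULL and would be handled as a file name.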
      
      This is useful for having automatic core dump analysis without filling up
      disks.  The program can do some simple analysis and save only a summary of
      the core dump.
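
A handler of the kind described above might look roughly like the following sketch. It assumes the dump arrives as a plain byte stream on a `FILE *` (standard input in a real handler), reads it strictly sequentially because a pipe cannot seek, and keeps only a tiny summary. The struct and function names are made up for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical summary record: all that is kept of the dump. */
struct core_summary {
    unsigned long long bytes;  /* total bytes read from the stream */
    int looks_like_elf;        /* 1 if the stream began with the ELF magic */
};

/* Read a dump sequentially (no seeking, so a pipe works) and summarize it. */
static int summarize_core(FILE *in, struct core_summary *out)
{
    static const unsigned char elf_magic[4] = { 0x7f, 'E', 'L', 'F' };
    unsigned char buf[4096];
    unsigned char head[4];
    size_t have = 0, n;

    out->bytes = 0;
    out->looks_like_elf = 0;
    while ((n = fread(buf, 1, sizeof(buf), in)) > 0) {
        /* Remember the first four bytes of the stream for the magic check. */
        for (size_t i = 0; i < n && have < sizeof(head); i++)
            head[have++] = buf[i];
        out->bytes += n;
    }
    if (ferror(in))
        return -1;
    out->looks_like_elf =
        (have == sizeof(head) && memcmp(head, elf_magic, sizeof(head)) == 0);
    return 0;
}
```

A real handler would call `summarize_core(stdin, &summary)` and write the result somewhere small; nothing here needs disk space proportional to the dump.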
      
      The core dump process will run with the privileges and in the name space of
      the process that caused the core dump.
      
      I also increased the core pattern size to 128 bytes so that longer command
      lines fit.
      
      Most of the changes come from allowing core dumps without seeks.  They are
      fairly straightforward though.
      
      One small incompatibility is that anyone who previously had a core pattern
      that started with '|' will suddenly get new behaviour.  I think that's
      unlikely to be a real problem though.
      
      Additional background:
      
      > Very nice, do you happen to have a program that can accept this kind of
      > input for crash dumps?  I'm guessing that the embedded people will
      > really want this functionality.
      
      I had a cheesy demo/prototype.  Basically it wrote the dump to a file again,
      ran gdb on it to get a backtrace and wrote the summary to a shared directory.
      Then there was a simple CGI script to generate a "top 10" crashes HTML
      listing.
      
      Unfortunately this still had the disadvantage of needing full disk space for a
      dump, even though the dump was deleted afterwards (in fact it was worse, because
      holes don't work over a pipe, so a holey address map would require more
      space).
      
      Fortunately gdb seems to be happy to handle /proc/pid/fd/xxx input pipes as
      cores (at least it worked with zsh's =(cat core) syntax), so it would
      likely be possible to do it without temporary space using a simple wrapper that
      calls it in the right way.  I ran out of time before doing that though.
      
      The demo prototype scripts weren't very good.  If there is really interest I
      can dig them out (they are currently on a laptop disk on the desk with the
      laptop itself being in service), but I would recommend rewriting them for any
      serious application of this and fix the disk space problem.
      
      Also to be really useful it should probably find a way to automatically fetch
      the debuginfos (I cheated and just installed them in advance).  If nobody else
      does it I can probably do the rewrite myself again at some point.
      
      My hope at some point was that desktops would support it in their builtin
      crash reporters, but at least the KDE people I talked to seemed to be happy
      with their user space only solution.
      
      Alan sayeth:
      
        I don't believe that piping as such is necessarily the right model, but
        the ability to intercept and process core dumps from user space is asked
        for by many enterprise users as well.  They want to know about, capture,
        analyse and process core dumps, often centrally and in automated form.
      
      [akpm@osdl.org: loff_t != unsigned long]
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] Some cleanup in the pipe code · d6cbd281
      Andi Kleen committed
      Split the big, hard-to-read do_pipe function into smaller pieces.
      
      This creates new create_write_pipe/free_write_pipe/create_read_pipe
      functions.  These functions are made global so that they can be used by
      other parts of the kernel.
      
      The resulting code is more generic and easier to read, and has cleaner error
      handling and fewer gotos.
      
      [akpm@osdl.org: cleanup]
      Signed-off-by: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] r/o bind mounts: monitor zeroing of i_nlink · ce71ec36
      Dave Hansen committed
      Some filesystems zero i_nlink outright during an unlink operation instead of
      decrementing it.  We need to catch these in addition to the
      decrement operations.
      Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
      Acked-by: Christoph Hellwig <hch@lst.de>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] r/o bind mounts: clean up OCFS2 nlink handling · 17ff7856
      Mark Fasheh committed
      OCFS2 does some operations on i_nlink, then reverts them if some of its
      operations fail to complete.  This does not fit in well with the
      drop_nlink() logic where we expect i_nlink to stay at zero once it gets
      there.
      
      So, delay all of the nlink operations until we're sure that the operations
      have completed.  Also, introduce a small helper to check whether an inode
      has proper "unlinkable" i_nlink counts no matter whether it is a directory
      or regular inode.
      
      This patch is broken out from the others because it does contain some
      logical changes.
      Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
      Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
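
The "unlinkable" helper mentioned in the OCFS2 commit above could be sketched along these lines. The expected counts are an assumption, not taken from the commit text: a removable directory is taken to hold 2 links and a removable regular inode 1. The struct and function names are hypothetical mock-ups, not OCFS2 code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mock inode for illustration; the real struct inode is far larger. */
struct demo_inode {
    bool is_dir;
    unsigned int i_nlink;
};

/*
 * Assumed rule: a directory is unlinkable at exactly 2 links, any other
 * inode at exactly 1. The check is done before touching i_nlink at all,
 * so a failed operation has nothing to revert.
 */
static bool demo_inode_is_unlinkable(const struct demo_inode *inode)
{
    assert(inode != NULL);
    return inode->i_nlink == (inode->is_dir ? 2u : 1u);
}
```

Checking first and modifying the count only once the operation is known to succeed matches the "delay all of the nlink operations" idea in the message.
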
    • [PATCH] r/o bind mount prepwork: inc_nlink() helper · d8c76e6f
      Dave Hansen committed
      This is mostly included for parity with dec_nlink(), where we will have some
      more hooks.  This one should stay pretty darn straightforward for now.
      Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
      Acked-by: Christoph Hellwig <hch@lst.de>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] r/o bind mounts: unlink: monitor i_nlink · 9a53c3a7
      Dave Hansen committed
      When a filesystem decrements i_nlink to zero, it means that a write must be
      performed in order to drop the inode from the filesystem.
      
      We're shortly going to have to keep filesystems from being remounted r/o between
      the time that this i_nlink decrement happens and the time that write occurs.
      
      So, add a little helper function to do the decrements.  We'll tie into it in a
      bit to note when i_nlink hits zero.
      Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
      Acked-by: Christoph Hellwig <hch@lst.de>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
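
The idea of routing every decrement through one helper, so that the drop to zero can be noticed, can be sketched in userspace as follows. The mock struct and the `pending_delete` marker are hypothetical; the commit only says a hook will be tied in later and does not say what it records.

```c
#include <assert.h>

/* Mock inode: just the link count plus a hypothetical "hit zero" marker. */
struct mock_inode {
    unsigned int i_nlink;
    int pending_delete;   /* assumption: set once i_nlink reaches zero */
};

/* The single place where the link count is decremented. */
static void mock_drop_nlink(struct mock_inode *inode)
{
    assert(inode->i_nlink > 0);
    inode->i_nlink--;
    if (inode->i_nlink == 0)
        inode->pending_delete = 1;  /* hook point: a write must follow */
}
```

Because callers never touch `i_nlink` directly, any later bookkeeping only has to be added in this one function.
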
    • [PATCH] r/o bind mount prepwork: move open_namei()'s vfs_create() · aab520e2
      Dave Hansen committed
      The code around vfs_create() in open_namei() is getting a bit too complex.
      Right now, there is at least the reference count on the dentry, and the
      i_mutex to worry about.  Soon, we'll also have mnt_writecount.
      
      So, break the vfs_create() call out of open_namei(), and into a helper
      function.  This duplicates the call to may_open(), but that isn't such a bad
      thing since the arguments (acc_mode and flag) were being heavily massaged
      anyway.
      Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
      Acked-by: Christoph Hellwig <hch@lst.de>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] r/o bind mounts: prepare for write access checks: collapse if() · 6902d925
      Dave Hansen committed
      We're shortly going to be adding a bunch more permission checks in these
      functions.  That requires adding either a bunch of new if() conditions, or
      some gotos.  This patch collapses existing if()s and uses gotos instead to
      prepare for the upcoming changes.
      Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
      Acked-by: Christoph Hellwig <hch@lst.de>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
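
The goto-based layout that the commit above prepares for can be shown with a generic sketch. The two checks and the function name are invented stand-ins for the permission tests; only the control-flow shape is the point.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical checks standing in for the real permission tests. */
static int check_not_negative(int x) { return x >= 0 ? 0 : -EINVAL; }
static int check_below_limit(int x)  { return x < 100 ? 0 : -EPERM; }

/* Each check bails out to one exit label instead of nesting another if(). */
static int do_operation(int x, int *result)
{
    int error;

    assert(result != NULL);
    error = check_not_negative(x);
    if (error)
        goto out;
    error = check_below_limit(x);
    if (error)
        goto out;
    /* Further checks can be added here without deepening the nesting. */
    *result = x * 2;
out:
    return error;
}
```

Adding another check later means adding three flat lines before the real work, rather than wrapping the body in one more conditional.
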
    • [PATCH] csa: convert CONFIG tag for extended accounting routines · 8f0ab514
      Jay Lan committed
      A few accounting data items and macros that are used in CSA are #ifdef'ed
      inside CONFIG_BSD_PROCESS_ACCT.  This patch changes those ifdefs from
      CONFIG_BSD_PROCESS_ACCT to CONFIG_TASK_XACCT.  A few defines are moved from
      kernel/acct.c and include/linux/acct.h to kernel/tsacct.c and
      include/linux/tsacct_kern.h.
      Signed-off-by: Jay Lan <jlan@sgi.com>
      Cc: Shailabh Nagar <nagar@watson.ibm.com>
      Cc: Balbir Singh <balbir@in.ibm.com>
      Cc: Jes Sorensen <jes@sgi.com>
      Cc: Chris Sturtivant <csturtiv@sgi.com>
      Cc: Tony Ernst <tee@sgi.com>
      Cc: Guillaume Thouvenin <guillaume.thouvenin@bull.net>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] Add vector AIO support · eed4e51f
      Badari Pulavarty committed
      This work was initially done by Zach Brown to add support for vectored AIO.
      These are the core changes for AIO to support
      IOCB_CMD_PREADV/IOCB_CMD_PWRITEV.
      
      [akpm@osdl.org: huge build fix]
      Signed-off-by: Zach Brown <zach.brown@oracle.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
      Acked-by: Benjamin LaHaise <bcrl@kvack.org>
      Acked-by: James Morris <jmorris@namei.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
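
For readers unfamiliar with vectored I/O, the sketch below shows what an iovec array means, using the synchronous POSIX calls `writev()` and `readv()` over a pipe. It is only a userspace analogue: it does not use the AIO interface or the IOCB_CMD_PREADV/IOCB_CMD_PWRITEV commands that the commit above adds, and the function name is made up.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Write two separate buffers with one writev() call, then scatter the
 * bytes into two caller-supplied buffers with one readv() call.
 * Returns the number of bytes read, or -1 on failure.
 */
static ssize_t vectored_round_trip(char *first, size_t first_len,
                                   char *second, size_t second_len)
{
    char part1[] = "AAAA";
    char part2[] = "BBBB";
    struct iovec out[2] = {
        { .iov_base = part1, .iov_len = strlen(part1) },
        { .iov_base = part2, .iov_len = strlen(part2) },
    };
    struct iovec in[2] = {
        { .iov_base = first,  .iov_len = first_len },
        { .iov_base = second, .iov_len = second_len },
    };
    int fds[2];
    ssize_t n = -1;

    assert(first != NULL && second != NULL);
    if (pipe(fds) != 0)
        return -1;
    if (writev(fds[1], out, 2) == 8)   /* both buffers in one call */
        n = readv(fds[0], in, 2);      /* fill the segments in order */
    close(fds[0]);
    close(fds[1]);
    return n;
}
```

The segment boundaries on the read side need not match those on the write side; the data is simply poured into the segments in order.
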
    • [PATCH] Streamline generic_file_* interfaces and filemap cleanups · 543ade1f
      Badari Pulavarty committed
      This patch cleans up the generic_file_*_read/write() interfaces.  Christoph
      Hellwig gave me the idea for these cleanups.
      
      In a nutshell, all filesystems should set .aio_read/.aio_write methods and use
      do_sync_read()/do_sync_write() as their .read/.write methods.  This allows us
      to clean up all variants of the generic_file_* routines.
      
      Final available interfaces:
      
      generic_file_aio_read() - read handler
      generic_file_aio_write() - write handler
      generic_file_aio_write_nolock() - no lock write handler
      
      __generic_file_aio_write_nolock() - internal worker routine
      Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
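
The arrangement described in the commit above, where a plain read is expressed through the vectored method, can be mimicked in userspace. Everything below is a mock: `struct mock_file`, `mock_aio_read` and `mock_sync_read` are invented names over an in-memory buffer and are not the kernel's struct file, aio_read or do_sync_read.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Mock file backed by memory. */
struct mock_file {
    const char *data;
    size_t size;
};

/* Vectored read method: fill each segment in turn, starting at pos. */
static ssize_t mock_aio_read(const struct mock_file *f, const struct iovec *iov,
                             unsigned long nr_segs, off_t pos)
{
    ssize_t total = 0;

    assert(pos >= 0);
    for (unsigned long i = 0; i < nr_segs; i++) {
        size_t off = (size_t)pos + (size_t)total;
        size_t left = off < f->size ? f->size - off : 0;
        size_t n = iov[i].iov_len < left ? iov[i].iov_len : left;

        if (n == 0)
            break;                      /* end of data */
        memcpy(iov[i].iov_base, f->data + off, n);
        total += (ssize_t)n;
    }
    return total;
}

/* Plain read built on the vectored method via a one-element iovec. */
static ssize_t mock_sync_read(const struct mock_file *f, char *buf, size_t len,
                              off_t *ppos)
{
    struct iovec iov = { .iov_base = buf, .iov_len = len };
    ssize_t ret = mock_aio_read(f, &iov, 1, *ppos);

    if (ret > 0)
        *ppos += ret;
    return ret;
}
```

With this shape a filesystem only has to supply the vectored method; the single-buffer entry point is a thin generic wrapper.
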
    • [PATCH] Remove readv/writev methods and use aio_read/aio_write instead · ee0b3e67
      Badari Pulavarty committed
      This patch removes readv() and writev() methods and replaces them with
      aio_read()/aio_write() methods.
      Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] Vectorize aio_read/aio_write fileop methods · 027445c3
      Badari Pulavarty committed
      This patch vectorizes aio_read() and aio_write() methods to prepare for
      collapsing all aio & vectored operations into one interface - which is
      aio_read()/aio_write().
      Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Cc: Michael Holzheu <HOLZHEU@de.ibm.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] reiserfs: eliminate minimum window size for bitmap searching · 9ea0f949
      Jeff Mahoney committed
      When a file system becomes fragmented (using MythTV, for example), the
      bigalloc window searching ends up causing huge performance problems.  In a
      file system presented by a user experiencing this bug, the file system was
      90% free, but no 32-block free windows existed on the entire file system.
      This causes the allocator to scan the entire file system for each 128k
      write before backing down to searching for individual blocks.
      
      In the end, finding a contiguous window for all the blocks in a write is an
      advantageous special case, but one that can be found naturally when such a
      window exists anyway.
      
      This patch removes the bigalloc window searching, and has been proven to
      fix the test case described above.
      Signed-off-by: Jeff Mahoney <jeffm@suse.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
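
To make the allocation behaviour described above concrete, here is a deliberately simplified sketch of a search with no minimum window: it walks the bitmap once and takes free blocks as it meets them. It is not the ReiserFS allocator; the byte-per-block bitmap and the function name are assumptions chosen for brevity.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Take up to `want` free blocks in a single pass starting at `hint`.
 * used[b] is 0 for a free block and 1 for a used one. The chosen block
 * numbers are stored in out[]; the return value is how many were taken.
 */
static size_t alloc_blocks(unsigned char *used, size_t nblocks, size_t hint,
                           size_t want, size_t *out)
{
    size_t got = 0;

    assert(used != NULL && out != NULL);
    for (size_t b = hint; b < nblocks && got < want; b++) {
        if (!used[b]) {
            used[b] = 1;      /* claim the block */
            out[got++] = b;
        }
    }
    return got;
}
```

On a fragmented bitmap this still succeeds in one pass, and when the free blocks met during the scan happen to be adjacent the result is contiguous without a separate window search.
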
    • [PATCH] reiserfs: use generic_file_open for open() checks · 5a2618e6
      Jeff Mahoney committed
      The other common disk-based file systems (I checked ext[23], xfs, jfs)
      check to ensure that opens of files > 2 GB fail unless O_LARGEFILE is
      specified.  They check via generic_file_open or their own open routine.
      
      ReiserFS doesn't have an f_op->open defined, and as such, it's possible to
      open files > 2 GB without O_LARGEFILE.
      
      This patch adds the f_op->open member to conform with the expected
      behavior.
      Signed-off-by: Jeff Mahoney <jeffm@suse.com>
      Cc: <reiserfs-dev@namesys.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
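
The open-time check described above amounts to a size comparison guarded by a flag. The sketch below mocks it in userspace: the flag bit, the limit macro, the function name and the choice of EFBIG as the error are all assumptions for illustration, not the kernel's definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MOCK_O_LARGEFILE 0x1u                  /* hypothetical flag bit */
#define MOCK_MAX_NON_LFS ((int64_t)INT32_MAX)  /* 2 GB - 1 */

/* Refuse to open a file above the limit unless the large-file flag is set. */
static int mock_file_open_check(int64_t file_size, unsigned int flags)
{
    assert(file_size >= 0);
    if (!(flags & MOCK_O_LARGEFILE) && file_size > MOCK_MAX_NON_LFS)
        return -EFBIG;   /* assumed error code */
    return 0;
}
```

A filesystem without such a check in its open path would let the oversized open succeed, which is the inconsistency the commit removes.
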
    • [PATCH] reiserfs: on-demand bitmap loading · 5065227b
      Jeff Mahoney committed
      This is the patch the three previous ones have been leading up to.
      
      It changes the behavior of ReiserFS from loading and caching all the bitmaps
      as special, to treating the bitmaps like any other bit of metadata and just
      letting the system-wide caches figure out what to hang on to.
      
      Buffer heads are allocated on the fly, so there is no need to retain pointers
      to all of them.  The caching of the metadata occurs when the data is read and
      updated, and is considered invalid and uncached until then.
      
      I needed to remove the vs-4040 check for performing a duplicate operation on a
      particular bit.  The reason is that while the other sites for working with
      bitmaps are allowed to schedule, is_reusable() is called from do_balance(),
      which will panic if a schedule occurs in certain places.
      
      The benefit of on-demand bitmaps clearly outweighs a sanity check that depends
      on a compile-time option that is discouraged.
      
      [akpm@osdl.org: warning fix]
      Signed-off-by: Jeff Mahoney <jeffm@suse.com>
      Cc: <reiserfs-dev@namesys.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] reiserfs: reorganize bitmap loading functions · 6f01046b
      Jeff Mahoney committed
      This patch moves the bitmap loading code from super.c to bitmap.c.
      
      The code is also restructured somewhat.  The only difference between new
      format bitmaps and old format bitmaps is where they are.  That's a two-liner
      before loading the block to use the correct one.  There's no need for an
      entirely separate code path.
      
      The load path is generally the same, with the pattern being to throw out a
      bunch of requests and then wait for them, then cache the metadata from the
      contents.
      
      Again, like the previous patches, the purpose is to set up for later ones.
      
      Update: There was a bug in the previously posted version of this that resulted
      in corruption.  The problem was that bitmap 0 on new format file systems must
      be treated specially, and wasn't.  A stupid bug with an easy fix.
      
      This is hopefully the last fix for the disaster that is the reiserfs bitmap
      patch set.
      
      If a bitmap block was full, first_zero_hint would end up at zero since it
      would never be changed from its zeroed-out value.  This just sets it
      beyond the end of the bitmap block.  If any bits are freed, it will be
      reset to a valid bit.  When info->free_count = 0, then we already know it's
      full.
      Signed-off-by: Jeff Mahoney <jeffm@suse.com>
      Cc: <reiserfs-dev@namesys.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
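
The first_zero_hint fix described at the end of the commit above can be sketched as follows. This is a simplified userspace version, assuming a byte-array bitmap with the least significant bit first; the function name reuses the field name from the message, but the code is not taken from ReiserFS.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the index of the first zero bit in a bitmap of nbits bits.
 * For a completely full bitmap, return nbits, a value beyond the last
 * valid bit, instead of leaving the hint at 0.
 */
static size_t first_zero_hint(const unsigned char *bitmap, size_t nbits)
{
    assert(bitmap != NULL);
    for (size_t i = 0; i < nbits; i++) {
        if (!(bitmap[i / 8] & (1u << (i % 8))))
            return i;
    }
    return nbits;   /* full: the hint points past the end */
}
```

A hint of 0 on a full block would wrongly suggest that bit 0 is free; an out-of-range hint cannot be mistaken for a valid bit.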