1. 10 Apr, 2008 1 commit
  2. 04 Apr, 2008 1 commit
    • splice: use mapping_gfp_mask · 4cd13504
      Committed by Hugh Dickins
      The loop block driver is careful to mask __GFP_IO|__GFP_FS out of its
      mapping_gfp_mask, to avoid hangs under memory pressure.  But nowadays
      it uses splice, usually going through __generic_file_splice_read.  That
      must use mapping_gfp_mask instead of GFP_KERNEL to avoid those hangs.
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      4cd13504
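A minimal sketch of the idea behind this change, not the kernel code itself. The flag values and the `demo_*` names are hypothetical stand-ins; only `mapping_gfp_mask`, `GFP_KERNEL`, `__GFP_IO` and `__GFP_FS` are named in the commit message. It shows why an allocation that honours the mapping's own mask cannot recurse into I/O, while one that hard-codes a `GFP_KERNEL`-style mask can.

```c
#include <stdbool.h>

/* Hypothetical flag bits for illustration only; the real values are
 * defined by the kernel headers and differ from these. */
#define DEMO_GFP_WAIT   0x1u
#define DEMO_GFP_IO     0x2u
#define DEMO_GFP_FS     0x4u
#define DEMO_GFP_KERNEL (DEMO_GFP_WAIT | DEMO_GFP_IO | DEMO_GFP_FS)

/* Stand-in for an address space that carries its own allocation mask. */
struct demo_mapping {
    unsigned int gfp_mask;
};

/* What a loop-style driver does to its backing file: forbid allocations
 * that may start I/O or re-enter the filesystem. */
static void demo_loop_restrict(struct demo_mapping *m)
{
    m->gfp_mask &= ~(DEMO_GFP_IO | DEMO_GFP_FS);
}

/* Stand-in for mapping_gfp_mask(): return the mask the mapping asks for. */
static unsigned int demo_mapping_gfp_mask(const struct demo_mapping *m)
{
    return m->gfp_mask;
}

/* True if an allocation with this mask is allowed to trigger I/O or
 * filesystem activity, which is what can hang under memory pressure. */
static bool demo_may_start_io(unsigned int gfp)
{
    return (gfp & (DEMO_GFP_IO | DEMO_GFP_FS)) != 0;
}
```

With a restricted mapping, `demo_may_start_io(demo_mapping_gfp_mask(&m))` is false, whereas `demo_may_start_io(DEMO_GFP_KERNEL)` is true regardless of what the driver asked for.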
  3. 04 Mar, 2008 1 commit
  4. 11 Feb, 2008 1 commit
  5. 09 Feb, 2008 1 commit
  6. 01 Feb, 2008 1 commit
  7. 30 Jan, 2008 1 commit
  8. 29 Jan, 2008 1 commit
  9. 25 Jan, 2008 1 commit
  10. 17 Oct, 2007 3 commits
    • Implement file posix capabilities · b5376771
      Committed by Serge E. Hallyn
      Implement file posix capabilities.  This allows programs to be given a
      subset of root's powers regardless of who runs them, without having to use
      setuid and giving the binary all of root's powers.
      
      This version works with Kaigai Kohei's userspace tools, found at
      http://www.kaigai.gr.jp/index.php.  For more information on how to use this
      patch, Chris Friedhoff has posted a nice page at
      http://www.friedhoff.org/fscaps.html.
      
      Changelog:
      	Nov 27:
      	Incorporate fixes from Andrew Morton
      	(security-introduce-file-caps-tweaks and
      	security-introduce-file-caps-warning-fix)
      	Fix Kconfig dependency.
      	Fix change signaling behavior when file caps are not compiled in.
      
      	Nov 13:
      	Integrate comments from Alexey: Remove CONFIG_ ifdef from
      	capability.h, and use %zd for printing a size_t.
      
      	Nov 13:
      	Fix endianness warnings by sparse as suggested by Alexey
      	Dobriyan.
      
      	Nov 09:
      	Address warnings of unused variables at cap_bprm_set_security
      	when file capabilities are disabled, and simultaneously clean
      	up the code a little, by pulling the new code into a helper
      	function.
      
      	Nov 08:
      	For pointers to required userspace tools and how to use
      	them, see http://www.friedhoff.org/fscaps.html.
      
      	Nov 07:
      	Fix the calculation of the highest bit checked in
      	check_cap_sanity().
      
      	Nov 07:
      	Allow file caps to be enabled without CONFIG_SECURITY, since
      	capabilities are the default.
      	Hook cap_task_setscheduler when !CONFIG_SECURITY.
      	Move capable(TASK_KILL) to end of cap_task_kill to reduce
      	audit messages.
      
      	Nov 05:
      	Add secondary calls in selinux/hooks.c to task_setioprio and
      	task_setscheduler so that selinux and capabilities with file
      	cap support can be stacked.
      
      	Sep 05:
      	As Seth Arnold points out, uid checks are out of place
      	for capability code.
      
      	Sep 01:
      	Define task_setscheduler, task_setioprio, cap_task_kill, and
      	task_setnice to make sure a user cannot affect a process in which
      	they called a program with some fscaps.
      
      	One remaining question is the note under task_setscheduler: are we
      	ok with CAP_SYS_NICE being sufficient to confine a process to a
      	cpuset?
      
      	It is a semantic change, as without fscaps, attach_task doesn't
      	allow CAP_SYS_NICE to override the uid equivalence check.  But since
      	it uses security_task_setscheduler, which elsewhere is used where
      	CAP_SYS_NICE can be used to override the uid equivalence check,
      	fixing it might be tough.
      
      	     task_setscheduler
      		 note: this also controls cpuset:attach_task.  Are we ok with
      		     CAP_SYS_NICE being used to confine to a cpuset?
      	     task_setioprio
      	     task_setnice
      		 sys_setpriority uses this (through set_one_prio) for another
      		 process.  Need same checks as setrlimit
      
      	Aug 21:
      	Updated secureexec implementation to reflect the fact that
      	euid and uid might be the same and nonzero, but the process
      	might still have elevated caps.
      
      	Aug 15:
      	Handle endianness of xattrs.
      	Enforce capability version match between kernel and disk.
      	Enforce that no bits beyond the known max capability are
      	set, else return -EPERM.
      	With this extra processing, it may be worth reconsidering
      	doing all the work at bprm_set_security rather than
      	d_instantiate.
      
      	Aug 10:
      	Always call getxattr at bprm_set_security, rather than
      	caching it at d_instantiate.
      
      [morgan@kernel.org: file-caps clean up for linux/capability.h]
      [bunk@kernel.org: unexport cap_inode_killpriv]
      Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
      Cc: Stephen Smalley <sds@tycho.nsa.gov>
      Cc: James Morris <jmorris@namei.org>
      Cc: Chris Wright <chrisw@sous-sol.org>
      Cc: Andrew Morgan <morgan@kernel.org>
      Signed-off-by: Andrew Morgan <morgan@kernel.org>
      Signed-off-by: Adrian Bunk <bunk@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      b5376771
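The Aug 15 changelog entry (endianness handling, version match, rejecting bits beyond the known maximum capability) can be illustrated with a small sketch. Everything here is an assumption made for illustration: the 12-byte layout, the `DEMO_*` constants, the function names, and the use of `-EINVAL` for a bad size or version. The only detail taken from the commit message is that unknown capability bits result in `-EPERM`.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_CAP_VERSION 1u   /* hypothetical on-disk format version */
#define DEMO_MAX_CAP     30u  /* hypothetical highest known capability bit */
#define DEMO_XATTR_LEN   12u  /* hypothetical: version + permitted + inheritable */

/* Decode a little-endian 32-bit word byte by byte, so the result is the
 * same on little-endian and big-endian hosts. */
static uint32_t demo_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Parse a capability xattr; return 0 on success or a negative errno. */
static int demo_parse_caps(const unsigned char *xattr, size_t len,
                           uint32_t *permitted, uint32_t *inheritable)
{
    /* Every bit above DEMO_MAX_CAP is unknown to this "kernel". */
    const uint32_t unknown = ~((1u << (DEMO_MAX_CAP + 1u)) - 1u);
    uint32_t p, i;

    if (len != DEMO_XATTR_LEN)
        return -EINVAL;               /* assumption: error code for bad size */
    if (demo_le32(xattr) != DEMO_CAP_VERSION)
        return -EINVAL;               /* assumption: error code for version mismatch */

    p = demo_le32(xattr + 4);
    i = demo_le32(xattr + 8);
    if ((p | i) & unknown)
        return -EPERM;                /* bits beyond the known max capability */

    *permitted = p;
    *inheritable = i;
    return 0;
}
```

Decoding bytes explicitly, instead of casting the buffer to an integer pointer, is one way to keep the on-disk format independent of host byte order.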
    • fs: introduce write_begin, write_end, and perform_write aops · afddba49
      Committed by Nick Piggin
      These are intended to replace prepare_write and commit_write with more
      flexible alternatives that are also able to avoid the buffered write
      deadlock problems efficiently (which prepare_write is unable to do).
      
      [mark.fasheh@oracle.com: API design contributions, code review and fixes]
      [akpm@linux-foundation.org: various fixes]
      [dmonakhov@sw.ru: new aop block_write_begin fix]
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
      Signed-off-by: Dmitriy Monakhov <dmonakhov@openvz.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      afddba49
    • readahead: combine file_ra_state.prev_index/prev_offset into prev_pos · f4e6b498
      Committed by Fengguang Wu
      Combine the file_ra_state members
      				unsigned long prev_index
      				unsigned int prev_offset
      into
      				loff_t prev_pos
      
      It is more consistent and better supports huge files.
      
      Thanks to Peter for the nice proposal!
      
      [akpm@linux-foundation.org: fix shift overflow]
      Cc: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Fengguang Wu <wfg@mail.ustc.edu.cn>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      f4e6b498
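A rough sketch of how a single byte position can stand in for the old index/offset pair. The `demo_*` names and the 4 KiB page size are assumptions for illustration; only `prev_index`, `prev_offset` and `prev_pos` come from the commit message. The widening cast before the shift reflects the kind of overflow the "fix shift overflow" note refers to, though the exact kernel fix is not shown in the source.

```c
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12                           /* assumes 4 KiB pages */
#define DEMO_PAGE_SIZE  ((int64_t)1 << DEMO_PAGE_SHIFT)

/* Stand-in for the readahead state: one 64-bit byte position replaces
 * the separate page index and in-page offset. */
struct demo_ra_state {
    int64_t prev_pos;
};

/* Store a (page index, in-page offset) pair as one byte position.
 * The index is widened to 64 bits before shifting so large indexes do
 * not overflow a narrower type. */
static void demo_set_prev(struct demo_ra_state *ra, uint64_t index,
                          unsigned int offset)
{
    ra->prev_pos = (int64_t)(index << DEMO_PAGE_SHIFT) | (int64_t)offset;
}

/* Recover the page index from the combined position. */
static uint64_t demo_prev_index(const struct demo_ra_state *ra)
{
    return (uint64_t)ra->prev_pos >> DEMO_PAGE_SHIFT;
}

/* Recover the offset within the page from the combined position. */
static unsigned int demo_prev_offset(const struct demo_ra_state *ra)
{
    return (unsigned int)(ra->prev_pos & (DEMO_PAGE_SIZE - 1));
}
```

For example, page index 5 with offset 100 is stored as byte position 5 * 4096 + 100 = 20580, and both values can be recovered from it.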
  11. 16 Oct, 2007 1 commit
    • splice: fix double kunmap() in vmsplice copy path · 6866bef4
      Committed by Jens Axboe
      The out label should not include the unmap, the only way to jump
      there already has unmapped the source.
      
      ------------[ cut here ]------------
      kernel BUG at mm/highmem.c:206!
      invalid opcode: 0000 [#1]
      SMP
      Modules linked in: netconsole autofs4 hidp nfs lockd nfs_acl rfcomm l2cap
      bluetooth sunrpc ipv6 ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core
      ib_addr iscsi_tcp libiscsi scsi_transport_iscsi dm_mirror dm_multipath
      dm_mod video output sbs battery ac parport_pc lp parport sg i2c_piix4
      i2c_core floppy cfi_probe gen_probe scb2_flash mtd chipreg tg3 e1000 button
      ide_cd serio_raw cdrom aic7xxx scsi_transport_spi sd_mod scsi_mod ext3 jbd
      ehci_hcd ohci_hcd uhci_hcd
      CPU:    3
      EIP:    0060:[<c045bbc3>]    Not tainted VLI
      EFLAGS: 00010246   (2.6.23 #1)
      EIP is at kunmap_high+0x51/0x8e
      00002000
             f7c21a00 00000000 00000000 c0489036 00018e32 00000002 00000000
      00001000
      Call Trace:
       [<c0487dd9>] pipe_to_user+0xca/0xd3
       [<c0488233>] __splice_from_pipe+0x53/0x1bd
       [<c0454947>] filemap_fault+0x221/0x380
       [<c0487d0f>] pipe_to_user+0x0/0xd3
       [<c0489036>] sys_vmsplice+0x3b7/0x422
       [<c045ec3f>] handle_mm_fault+0x4d5/0x8eb
       [<c041ed5b>] kmap_atomic+0x1c/0x20
       [<c045d33d>] unmap_vmas+0x3d1/0x584
       [<c045f717>] free_pgtables+0x90/0xa0
       [<c041d84b>] pgd_dtor+0x0/0x1
       [<c044d665>] audit_syscall_exit+0x2aa/0x2c6
       [<c0407817>] do_syscall_trace+0x124/0x169
       [<c0404df2>] syscall_call+0x7/0xb
       =======================
      Code: 2d 00 d0 5b 00 25 00 00 e0 ff 29
      c2 89 d0 c1 e8 0c 8b 14 85 a0 6c 7c c0 4a 85 d2 89 14 85 a0 6c 7c c0 74 07
      31 c9 4a 75 15 eb 04 <0f> 0b eb fe 31 c9 81 3d 78 38 6d c0 78 38 6d c0 0f
      95 c1 b0 01
      EIP: [<c045bbc3>] kunmap_high+0x51/0x8e SS:ESP 0068:f5960df0
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      6866bef4
  12. 02 Oct, 2007 1 commit
  13. 27 Jul, 2007 1 commit
  14. 21 Jul, 2007 1 commit
  15. 20 Jul, 2007 4 commits
  16. 16 Jul, 2007 1 commit
    • splice: direct splicing updates ppos twice · bcd4f3ac
      Committed by Jens Axboe
      OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> reported that he's noticed
      nfsd read corruption in recent kernels, and did the hard work of
      discovering that it's due to splice updating the file position twice.
      This means that the next operation would start further ahead than it
      should.
      
      nfsd_vfs_read()
          splice_direct_to_actor()
              while(len) {
                  do_splice_to()                     [update sd->pos]
                      -> generic_file_splice_read()  [read from sd->pos]
                  nfsd_direct_splice_actor()
                      -> __splice_from_pipe()        [update sd->pos]
      
      There's nothing wrong with the core splice code, but direct
      splicing is an add-on that calls both the input and output paths,
      so it has to cache the offset locally to keep it correct.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      bcd4f3ac
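The double update described in this commit can be modelled in a few lines. This is a simplified simulation, not the actual patch: the `demo_*` functions and the struct are hypothetical, and the real input and output paths do far more than advance an offset. It only illustrates why the input side should work on a local copy of the position.

```c
#include <stddef.h>

/* Stand-in for the splice descriptor shared by both paths. */
struct demo_desc {
    long long pos;      /* file position the caller will see afterwards */
    size_t total_len;   /* bytes still to transfer */
};

/* Input path: "reads" len bytes at *ppos and advances *ppos. */
static size_t demo_splice_to(long long *ppos, size_t len)
{
    *ppos += (long long)len;
    return len;
}

/* Output path (the actor): consumes len bytes and advances sd->pos. */
static size_t demo_actor(struct demo_desc *sd, size_t len)
{
    sd->pos += (long long)len;
    return len;
}

/* Buggy loop: both paths advance the same sd.pos, so the position moves
 * twice as far as the data actually transferred. */
static long long demo_direct_buggy(long long start, size_t len, size_t chunk)
{
    struct demo_desc sd = { start, len };

    while (sd.total_len) {
        size_t n = sd.total_len < chunk ? sd.total_len : chunk;

        n = demo_splice_to(&sd.pos, n);
        demo_actor(&sd, n);
        sd.total_len -= n;
    }
    return sd.pos;
}

/* Fixed loop: the input path works on a local copy of the offset, so
 * only the actor updates sd.pos. */
static long long demo_direct_fixed(long long start, size_t len, size_t chunk)
{
    struct demo_desc sd = { start, len };

    while (sd.total_len) {
        size_t n = sd.total_len < chunk ? sd.total_len : chunk;
        long long pos = sd.pos;   /* locally cached offset */

        n = demo_splice_to(&pos, n);
        demo_actor(&sd, n);
        sd.total_len -= n;
    }
    return sd.pos;
}
```

Transferring 8192 bytes from offset 0 leaves the buggy version at 16384 and the fixed version at 8192, which matches the symptom that the next operation starts further ahead than it should.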
  17. 13 Jul, 2007 2 commits
  18. 10 Jul, 2007 7 commits
  19. 15 Jun, 2007 3 commits
  20. 08 Jun, 2007 5 commits
  21. 08 May, 2007 2 commits
    • [PATCH] splice: always call into page_cache_readahead() · 86aa5ac5
      Committed by Jens Axboe
      Don't try to guess what the read-ahead logic will do, allow it
      to make its own decisions.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      86aa5ac5
    • [PATCH] splice(): fix interaction with readahead · 9ae9d68c
      Committed by Fengguang Wu
      Eric Dumazet, thank you for disclosing this bug.
      
      Readahead logic somehow fails to populate the page range with data.
      It can be because
      
      1) the readahead routine is not always called in the following lines of
      
      fs/splice.c:
              if (!loff || nr_pages > 1)
                      page_cache_readahead(mapping, &in->f_ra, in, index, nr_pages);
      
      2) even when called, page_cache_readahead() won't guarantee the pages are
      there. It won't submit readahead I/O for pages already in the radix tree,
      or when (ra_pages == 0), or after 256 cache hits.
      
      In your case, it should be because of the retried reads, which lead to
      excessive cache hits and disable readahead at some point.
      
      And that _one_ failure of readahead blocks the whole read process.
      The application receives EAGAIN and retries the read, but
      __generic_file_splice_read() refuses to make progress:
      
      - in the previous invocation, it has allocated a blank page and inserted it
        into the radix tree, but never has the chance to start I/O for it: the test
        of SPLICE_F_NONBLOCK goes before that.
      
      - in the retried invocation, the readahead code will neither get out of the
        cache hit mode, nor will it submit I/O for an already existing page.
      
      Cc: Eric Dumazet <dada1@cosmosbay.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      9ae9d68c