1. 30 4月, 2013 4 次提交
  2. 07 3月, 2013 1 次提交
  3. 04 3月, 2013 1 次提交
    • E
      fs: Limit sys_mount to only request filesystem modules. · 7f78e035
      Eric W. Biederman 提交于
      Modify the request_module to prefix the file system type with "fs-"
      and add aliases to all of the filesystems that can be built as modules
      to match.
      
      A common practice is to build all of the kernel code and leave code
      that is not commonly needed as modules, with the result that many
      users are exposed to any bug anywhere in the kernel.
      
      Looking for filesystems with a fs- prefix limits the pool of possible
      modules that can be loaded by mount to just filesystems trivially
      making things safer with no real cost.
      
      Using aliases means user space can control the policy of which
      filesystem modules are auto-loaded by editing /etc/modprobe.d/*.conf
      with blacklist and alias directives.  Allowing simple, safe,
      well understood work-arounds to known problematic software.
      
      This also addresses a rare but unfortunate problem where the filesystem
      name is not the same as it's module name and module auto-loading
      would not work.  While writing this patch I saw a handful of such
      cases.  The most significant being autofs that lives in the module
      autofs4.
      
      This is relevant to user namespaces because we can reach the request
      module in get_fs_type() without having any special permissions, and
      people get uncomfortable when a user specified string (in this case
      the filesystem type) goes all of the way to request_module.
      
      After having looked at this issue I don't think there is any
      particular reason to perform any filtering or permission checks beyond
      making it clear in the module request that we want a filesystem
      module.  The common pattern in the kernel is to call request_module()
      without regards to the users permissions.  In general all a filesystem
      module does once loaded is call register_filesystem() and go to sleep.
      Which means there is not much attack surface exposed by loading a
      filesytem module unless the filesystem is mounted.  In a user
      namespace filesystems are not mounted unless .fs_flags = FS_USERNS_MOUNT,
      which most filesystems do not set today.
      Acked-by: NSerge Hallyn <serge.hallyn@canonical.com>
      Acked-by: NKees Cook <keescook@chromium.org>
      Reported-by: NKees Cook <keescook@google.com>
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      7f78e035
  4. 28 2月, 2013 4 次提交
    • S
      hlist: drop the node parameter from iterators · b67bfe0d
      Sasha Levin 提交于
      I'm not sure why, but the hlist for each entry iterators were conceived
      
              list_for_each_entry(pos, head, member)
      
      The hlist ones were greedy and wanted an extra parameter:
      
              hlist_for_each_entry(tpos, pos, head, member)
      
      Why did they need an extra pos parameter? I'm not quite sure. Not only
      they don't really need it, it also prevents the iterator from looking
      exactly like the list iterator, which is unfortunate.
      
      Besides the semantic patch, there was some manual work required:
      
       - Fix up the actual hlist iterators in linux/list.h
       - Fix up the declaration of other iterators based on the hlist ones.
       - A very small amount of places were using the 'node' parameter, this
       was modified to use 'obj->member' instead.
       - Coccinelle didn't handle the hlist_for_each_entry_safe iterator
       properly, so those had to be fixed up manually.
      
      The semantic patch which is mostly the work of Peter Senna Tschudin is here:
      
      @@
      iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;
      
      type T;
      expression a,c,d,e;
      identifier b;
      statement S;
      @@
      
      -T b;
          <+... when != b
      (
      hlist_for_each_entry(a,
      - b,
      c, d) S
      |
      hlist_for_each_entry_continue(a,
      - b,
      c) S
      |
      hlist_for_each_entry_from(a,
      - b,
      c) S
      |
      hlist_for_each_entry_rcu(a,
      - b,
      c, d) S
      |
      hlist_for_each_entry_rcu_bh(a,
      - b,
      c, d) S
      |
      hlist_for_each_entry_continue_rcu_bh(a,
      - b,
      c) S
      |
      for_each_busy_worker(a, c,
      - b,
      d) S
      |
      ax25_uid_for_each(a,
      - b,
      c) S
      |
      ax25_for_each(a,
      - b,
      c) S
      |
      inet_bind_bucket_for_each(a,
      - b,
      c) S
      |
      sctp_for_each_hentry(a,
      - b,
      c) S
      |
      sk_for_each(a,
      - b,
      c) S
      |
      sk_for_each_rcu(a,
      - b,
      c) S
      |
      sk_for_each_from
      -(a, b)
      +(a)
      S
      + sk_for_each_from(a) S
      |
      sk_for_each_safe(a,
      - b,
      c, d) S
      |
      sk_for_each_bound(a,
      - b,
      c) S
      |
      hlist_for_each_entry_safe(a,
      - b,
      c, d, e) S
      |
      hlist_for_each_entry_continue_rcu(a,
      - b,
      c) S
      |
      nr_neigh_for_each(a,
      - b,
      c) S
      |
      nr_neigh_for_each_safe(a,
      - b,
      c, d) S
      |
      nr_node_for_each(a,
      - b,
      c) S
      |
      nr_node_for_each_safe(a,
      - b,
      c, d) S
      |
      - for_each_gfn_sp(a, c, d, b) S
      + for_each_gfn_sp(a, c, d) S
      |
      - for_each_gfn_indirect_valid_sp(a, c, d, b) S
      + for_each_gfn_indirect_valid_sp(a, c, d) S
      |
      for_each_host(a,
      - b,
      c) S
      |
      for_each_host_safe(a,
      - b,
      c, d) S
      |
      for_each_mesh_entry(a,
      - b,
      c, d) S
      )
          ...+>
      
      [akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
      [akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
      [akpm@linux-foundation.org: checkpatch fixes]
      [akpm@linux-foundation.org: fix warnings]
      [akpm@linux-foudnation.org: redo intrusive kvm changes]
      Tested-by: NPeter Senna Tschudin <peter.senna@gmail.com>
      Acked-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: NSasha Levin <sasha.levin@oracle.com>
      Cc: Wu Fengguang <fengguang.wu@intel.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Gleb Natapov <gleb@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b67bfe0d
    • T
      ocfs2: convert to idr_alloc() · 6b207ba3
      Tejun Heo 提交于
      Convert to the much saner new idr interface.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Acked-by: NJoel Becker <jlbec@evilplan.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      6b207ba3
    • X
      ocfs2: ac->ac_allow_chain_relink=0 won't disable group relink · 309a85b6
      Xiaowei.Hu 提交于
      ocfs2_block_group_alloc_discontig() disables chain relink by setting
      ac->ac_allow_chain_relink = 0 because it grabs clusters from multiple
      cluster groups.
      
      It doesn't keep the credits for all chain relink,but
      ocfs2_claim_suballoc_bits overrides this in this call trace:
      ocfs2_block_group_claim_bits()->ocfs2_claim_clusters()->
      __ocfs2_claim_clusters()->ocfs2_claim_suballoc_bits()
      ocfs2_claim_suballoc_bits set ac->ac_allow_chain_relink = 1; then call
      ocfs2_search_chain() one time and disable it again, and then we run out
      of credits.
      
      Fix is to allow relink by default and disable it in
      ocfs2_block_group_alloc_discontig.
      
      Without this patch, End-users will run into a crash due to run out of
      credits, backtrace like this:
      
        RIP: 0010:[<ffffffffa0808b14>]  [<ffffffffa0808b14>]
        jbd2_journal_dirty_metadata+0x164/0x170 [jbd2]
        RSP: 0018:ffff8801b919b5b8  EFLAGS: 00010246
        RAX: 0000000000000000 RBX: ffff88022139ddc0 RCX: ffff880159f652d0
        RDX: ffff880178aa3000 RSI: ffff880159f652d0 RDI: ffff880087f09bf8
        RBP: ffff8801b919b5e8 R08: 0000000000000000 R09: 0000000000000000
        R10: 0000000000001e00 R11: 00000000000150b0 R12: ffff880159f652d0
        R13: ffff8801a0cae908 R14: ffff880087f09bf8 R15: ffff88018d177800
        FS:  00007fc9b0b6b6e0(0000) GS:ffff88022fd40000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
        CR2: 000000000040819c CR3: 0000000184017000 CR4: 00000000000006e0
        DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
        Process dd (pid: 9945, threadinfo ffff8801b919a000, task ffff880149a264c0)
        Call Trace:
          ocfs2_journal_dirty+0x2f/0x70 [ocfs2]
          ocfs2_relink_block_group+0x111/0x480 [ocfs2]
          ocfs2_search_chain+0x455/0x9a0 [ocfs2]
          ...
      Signed-off-by: NXiaowei.Hu <xiaowei.hu@oracle.com>
      Reviewed-by: NSrinivas Eeda <srinivas.eeda@oracle.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      309a85b6
    • J
      ocfs2: fix ocfs2_init_security_and_acl() to initialize acl correctly · 32918dd9
      Jeff Liu 提交于
      We need to re-initialize the security for a new reflinked inode with its
      parent dirs if it isn't specified to be preserved for ocfs2_reflink().
      However, the code logic is broken at ocfs2_init_security_and_acl()
      although ocfs2_init_security_get() succeed.  As a result,
      ocfs2_acl_init() does not involked and therefore the default ACL of
      parent dir was missing on the new inode.
      
      Note this was introduced by 9d8f13ba ("security: new
      security_inode_init_security API adds function callback")
      
      To reproduce:
      
          set default ACL for the parent dir(ocfs2 in this case):
          $ setfacl -m default:user:jeff:rwx ../ocfs2/
          $ getfacl ../ocfs2/
          # file: ../ocfs2/
          # owner: jeff
          # group: jeff
          user::rwx
          group::r-x
          other::r-x
          default:user::rwx
          default:user:jeff:rwx
          default:group::r-x
          default:mask::rwx
          default:other::r-x
      
          $ touch a
          $ getfacl a
          # file: a
          # owner: jeff
          # group: jeff
          user::rw-
          group::rw-
          other::r--
      
      Before patching, create reflink file b from a, the user
      default ACL entry(user:jeff:rwx)was missing:
      
          $ ./ocfs2_reflink a b
          $ getfacl b
          # file: b
          # owner: jeff
          # group: jeff
          user::rw-
          group::rw-
          other::r--
      
      In this case, the end user can also observed an error message at syslog:
      
        (ocfs2_reflink,3229,2):ocfs2_init_security_and_acl:7193 ERROR: status = 0
      
      After applying this patch, create reflink file c from a:
      
          $ ./ocfs2_reflink a c
          $ getfacl c
          # file: c
          # owner: jeff
          # group: jeff
          user::rw-
          user:jeff:rwx			#effective:rw-
          group::r-x			#effective:r--
          mask::rw-
          other::r--
      
      Test program:
      /* Usage: reflink <source> <dest> */
      #include <stdio.h>
      #include <stdint.h>
      #include <stdbool.h>
      #include <string.h>
      #include <errno.h>
      #include <sys/types.h>
      #include <sys/stat.h>
      #include <fcntl.h>
      #include <sys/ioctl.h>
      
      static int
      reflink_file(char const *src_name, char const *dst_name,
      	     bool preserve_attrs)
      {
      	int fd;
      
      #ifndef REFLINK_ATTR_NONE
      #  define REFLINK_ATTR_NONE 0
      #endif
      #ifndef REFLINK_ATTR_PRESERVE
      #  define REFLINK_ATTR_PRESERVE 1
      #endif
      #ifndef OCFS2_IOC_REFLINK
      	struct reflink_arguments {
      		uint64_t old_path;
      		uint64_t new_path;
      		uint64_t preserve;
      	};
      
      #  define OCFS2_IOC_REFLINK _IOW ('o', 4, struct reflink_arguments)
      #endif
      	struct reflink_arguments args = {
      		.old_path = (unsigned long) src_name,
      		.new_path = (unsigned long) dst_name,
      		.preserve = preserve_attrs ? REFLINK_ATTR_PRESERVE :
      					     REFLINK_ATTR_NONE,
      	};
      
      	fd = open(src_name, O_RDONLY);
      	if (fd < 0) {
      		fprintf(stderr, "Failed to open %s: %s\n",
      			src_name, strerror(errno));
      		return -1;
      	}
      
      	if (ioctl(fd, OCFS2_IOC_REFLINK, &args) < 0) {
      		fprintf(stderr, "Failed to reflink %s to %s: %s\n",
      			src_name, dst_name, strerror(errno));
      		return -1;
      	}
      }
      
      int
      main(int argc, char *argv[])
      {
      	if (argc != 3) {
      		fprintf(stdout, "Usage: %s source dest\n", argv[0]);
      		return 1;
      	}
      
      	return reflink_file(argv[1], argv[2], 0);
      }
      Signed-off-by: NJie Liu <jeff.liu@oracle.com>
      Reviewed-by: NTao Ma <boyu.mt@taobao.com>
      Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      32918dd9
  5. 26 2月, 2013 5 次提交
  6. 23 2月, 2013 1 次提交
  7. 22 2月, 2013 3 次提交
  8. 13 2月, 2013 5 次提交
  9. 21 1月, 2013 1 次提交
  10. 09 1月, 2013 1 次提交
  11. 21 12月, 2012 1 次提交
  12. 18 12月, 2012 1 次提交
  13. 12 12月, 2012 1 次提交
  14. 09 10月, 2012 1 次提交
    • K
      mm: kill vma flag VM_CAN_NONLINEAR · 0b173bc4
      Konstantin Khlebnikov 提交于
      Move actual pte filling for non-linear file mappings into the new special
      vma operation: ->remap_pages().
      
      Filesystems must implement this method to get non-linear mapping support,
      if it uses filemap_fault() then generic_file_remap_pages() can be used.
      
      Now device drivers can implement this method and obtain nonlinear vma support.
      Signed-off-by: NKonstantin Khlebnikov <khlebnikov@openvz.org>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Carsten Otte <cotte@de.ibm.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>	#arch/tile
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Morris <james.l.morris@oracle.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Kentaro Takeda <takedakn@nttdata.co.jp>
      Cc: Matt Helsley <matthltc@us.ibm.com>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: Venkatesh Pallipadi <venki@google.com>
      Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      0b173bc4
  15. 03 10月, 2012 1 次提交
  16. 27 9月, 2012 2 次提交
  17. 18 9月, 2012 3 次提交
    • E
      userns: Convert struct dquot dq_id to be a struct kqid · 4c376dca
      Eric W. Biederman 提交于
      Change struct dquot dq_id to a struct kqid and remove the now
      unecessary dq_type.
      
      Make minimal changes to dquot, quota_tree, quota_v1, quota_v2, ext3,
      ext4, and ocfs2 to deal with the change in quota structures and
      signatures.  The ocfs2 changes are larger than most because of the
      extensive tracing throughout the ocfs2 quota code that prints out
      dq_id.
      
      quota_tree.c:get_index is modified to take a struct kqid instead of a
      qid_t because all of it's callers pass in dquot->dq_id and it allows
      me to introduce only a single conversion.
      
      The rest of the changes are either just replacing dq_type with dq_id.type,
      adding conversions to deal with the change in type and occassionally
      adding qid_eq to allow quota id comparisons in a user namespace safe way.
      
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andreas Dilger <adilger.kernel@dilger.ca>
      Cc: Theodore Tso <tytso@mit.edu>
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      4c376dca
    • E
      userns: Modify dqget to take struct kqid · aca645a6
      Eric W. Biederman 提交于
      Modify dqget to take struct kqid instead of a type and an identifier
      pair.
      
      Modify the callers of dqget in ocfs2 and dquot to take generate
      a struct kqid so they can continue to call dqget.  The conversion
      to create struct kqid should all be the final conversions that
      are needed in those code paths.
      
      Cc: Jan Kara <jack@suse.cz>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      aca645a6
    • E
      userns: Pass a userns parameter into posix_acl_to_xattr and posix_acl_from_xattr · 5f3a4a28
      Eric W. Biederman 提交于
       - Pass the user namespace the uid and gid values in the xattr are stored
         in into posix_acl_from_xattr.
      
       - Pass the user namespace kuid and kgid values should be converted into
         when storing uid and gid values in an xattr in posix_acl_to_xattr.
      
      - Modify all callers of posix_acl_from_xattr and posix_acl_to_xattr to
        pass in &init_user_ns.
      
      In the short term this change is not strictly needed but it makes the
      code clearer.  In the longer term this change is necessary to be able to
      mount filesystems outside of the initial user namespace that natively
      store posix acls in the linux xattr format.
      
      Cc: Theodore Tso <tytso@mit.edu>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andreas Dilger <adilger.kernel@dilger.ca>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      5f3a4a28
  18. 21 8月, 2012 1 次提交
    • T
      workqueue: deprecate flush[_delayed]_work_sync() · 43829731
      Tejun Heo 提交于
      flush[_delayed]_work_sync() are now spurious.  Mark them deprecated
      and convert all users to flush[_delayed]_work().
      
      If you're cc'd and wondering what's going on: Now all workqueues are
      non-reentrant and the regular flushes guarantee that the work item is
      not pending or running on any CPU on return, so there's no reason to
      use the sync flushes at all and they're going away.
      
      This patch doesn't make any functional difference.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Ian Campbell <ian.campbell@citrix.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Mattia Dongili <malattia@linux.it>
      Cc: Kent Yoder <key@linux.vnet.ibm.com>
      Cc: David Airlie <airlied@linux.ie>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Karsten Keil <isdn@linux-pingi.de>
      Cc: Bryan Wu <bryan.wu@canonical.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Alasdair Kergon <agk@redhat.com>
      Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
      Cc: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: linux-wireless@vger.kernel.org
      Cc: Anton Vorontsov <cbou@mail.ru>
      Cc: Sangbeom Kim <sbkim73@samsung.com>
      Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Eric Van Hensbergen <ericvh@gmail.com>
      Cc: Takashi Iwai <tiwai@suse.de>
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Cc: Petr Vandrovec <petr@vandrovec.name>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Avi Kivity <avi@redhat.com> 
      43829731
  19. 31 7月, 2012 2 次提交
  20. 30 7月, 2012 1 次提交