1. 01 Jul, 2015 2 commits
  2. 14 May, 2015 1 commit
    • mnt: Refactor the logic for mounting sysfs and proc in a user namespace · 1b852bce
      Eric W. Biederman committed
      Fresh mounts of proc and sysfs are a very special case that works very
      much like a bind mount.  Unfortunately the current structure can not
      preserve the MNT_LOCK... mount flags.  Therefore refactor the logic
      into a form that can be modified to preserve those lock bits.
      
      Add a new filesystem flag FS_USERNS_VISIBLE that requires some mount
      of the filesystem be fully visible in the current mount namespace,
      before the filesystem may be mounted.
      
      Move the logic for calling fs_fully_visible from proc and sysfs into
      fs/namespace.c where it has greater access to mount namespace state.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
  3. 17 Apr, 2015 1 commit
    • proc: show locks in /proc/pid/fdinfo/X · 6c8c9031
      Andrey Vagin committed
      Let's show locks which are associated with a file descriptor in
      its fdinfo file.
      
      Currently we don't have a reliable way to determine who holds a lock.  We
      can find some information in /proc/locks, but the PID reported there
      can be wrong.  For example, a process takes a lock, then forks a child and
      dies.  In this case /proc/locks contains the parent's PID, which can be
      reused by another process.
      
      $ cat /proc/locks
      ...
      6: FLOCK  ADVISORY  WRITE 324 00:13:13431 0 EOF
      ...
      
      $ ps -C rpcbind
        PID TTY          TIME CMD
        332 ?        00:00:00 rpcbind
      
      $ cat /proc/332/fdinfo/4
      pos:	0
      flags:	0100000
      mnt_id:	22
      lock:	1: FLOCK  ADVISORY  WRITE 324 00:13:13431 0 EOF
      
      $ ls -l /proc/332/fd/4
      lr-x------ 1 root root 64 Mar  5 14:43 /proc/332/fd/4 -> /run/rpcbind.lock
      
      $ ls -l /proc/324/fd/
      total 0
      lrwx------ 1 root root 64 Feb 27 14:50 0 -> /dev/pts/0
      lrwx------ 1 root root 64 Feb 27 14:50 1 -> /dev/pts/0
      lrwx------ 1 root root 64 Feb 27 14:49 2 -> /dev/pts/0
      
      You can see that the process with PID 324 doesn't hold the lock.
      
      This information is required for proper dumping and restoring file
      locks.
      Signed-off-by: Andrey Vagin <avagin@openvz.org>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Acked-by: Jeff Layton <jlayton@poochiereds.net>
      Acked-by: "J. Bruce Fields" <bfields@fieldses.org>
      Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Cc: Joe Perches <joe@perches.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
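A minimal Python sketch of how a user-space tool might read the new `lock:` lines. The field layout is inferred only from the sample output in the commit message above, and the function names `parse_fdinfo_locks` and `read_fd_locks` are hypothetical, not part of any kernel or library interface.

```python
def parse_fdinfo_locks(fdinfo_text):
    """Return one dict per 'lock:' line found in the text of an fdinfo file."""
    locks = []
    for line in fdinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep or key.strip() != "lock":
            continue
        fields = value.split()
        # Assumed layout, taken from the sample in the commit message:
        # ordinal, type, mode, access, pid, major:minor:inode, start, end
        if len(fields) != 8:
            continue  # unexpected layout; skip rather than guess
        device, _, inode = fields[5].rpartition(":")
        locks.append({
            "ordinal": int(fields[0].rstrip(":")),
            "type": fields[1],
            "mode": fields[2],
            "access": fields[3],
            "pid": int(fields[4]),
            "device": device,
            "inode": int(inode),
            "start": fields[6],
            "end": fields[7],
        })
    return locks


def read_fd_locks(pid, fd):
    """Read /proc/<pid>/fdinfo/<fd> and return the locks attached to that fd."""
    with open(f"/proc/{pid}/fdinfo/{fd}") as handle:
        return parse_fdinfo_locks(handle.read())
```

With the sample above, `read_fd_locks(332, 4)` would report the lock under the descriptor of process 332, which actually holds it, even though the recorded PID field still says 324.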
  4. 16 Apr, 2015 4 commits
  5. 18 Mar, 2015 1 commit
  6. 23 Feb, 2015 1 commit
  7. 18 Feb, 2015 1 commit
    • vmcore: fix PT_NOTE n_namesz, n_descsz overflow issue · 34b47764
      WANG Chao committed
      When updating the PT_NOTE header size (i.e. p_memsz), an overflow
      happens with the following bogus note entry:
      
        n_namesz = 0xFFFFFFFF
        n_descsz = 0x0
        n_type   = 0x0
      
      This kind of note entry should be dropped while updating p_memsz.  But
      because n_namesz is 32 bits wide, (n_namesz + 3) & (~3) overflows to
      0x0, so the note entry size looks sane and the entry is kept.
      
      When userspace (e.g. the crash utility) tries to access such a bogus
      note, it can lead to unexpected behavior (e.g. the crash utility hits a
      segmentation fault because it reads a bogus address).
      
      The source of the bogus note hasn't been identified yet.  At least we can
      drop the bogus note so that user space isn't surprised.
      Signed-off-by: WANG Chao <chaowang@redhat.com>
      Cc: Dave Anderson <anderson@redhat.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Randy Wright <rwright@hp.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Fabian Frederick <fabf@skynet.be>
      Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
      Cc: Rashika Kheria <rashika.kheria@gmail.com>
      Cc: Greg Pearson <greg.pearson@hp.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
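The wraparound described above can be reproduced with plain integer arithmetic. The Python sketch below emulates 32-bit unsigned math with a mask. It is an illustration only, not the kernel code: the helper names are hypothetical, and the 12-byte header size assumes a note header made of three 32-bit words (n_namesz, n_descsz, n_type).

```python
NHDR_SIZE = 12          # assumed: three 32-bit words in the note header
U32_MASK = 0xFFFFFFFF   # used to emulate 32-bit unsigned arithmetic


def align4_u32(value):
    """Round up to a multiple of 4 the way a 32-bit unsigned integer would."""
    return ((value + 3) & U32_MASK) & ~3 & U32_MASK


def align4_wide(value):
    """Round up to a multiple of 4 without truncating to 32 bits."""
    return (value + 3) & ~3


def note_size(n_namesz, n_descsz, align):
    """Total size of one note entry: header plus padded name and descriptor."""
    return NHDR_SIZE + align(n_namesz) + align(n_descsz)


if __name__ == "__main__":
    bogus_name, bogus_desc = 0xFFFFFFFF, 0x0
    # 32-bit math: the padded name size wraps to 0, so the entry looks tiny.
    print(hex(note_size(bogus_name, bogus_desc, align4_u32)))
    # Wider math: the entry is obviously oversized and can be rejected.
    print(hex(note_size(bogus_name, bogus_desc, align4_wide)))
```

Doing the size computation in a wider type is one way to make the bogus entry fail a bounds check instead of passing as a 12-byte note.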
  8. 14 Feb, 2015 1 commit
  9. 13 Feb, 2015 4 commits
  10. 12 Feb, 2015 10 commits
  11. 11 Feb, 2015 1 commit
  12. 26 Jan, 2015 1 commit
  13. 19 Dec, 2014 1 commit
    • fs/proc/meminfo.c: include cma info in proc/meminfo · 47f8f929
      Pintu Kumar committed
      This patch includes CMA info (CmaTotal, CmaFree) in /proc/meminfo.
      Currently, on a CMA-enabled system, there is no way to find out the
      total CMA size declared, other than the dmesg or
      /var/log/messages logs.
      
      With this patch we show the CMA info as part of meminfo, so that it
      can be determined at any point in time.  It is populated only when
      CMA is enabled.
      
      Below is sample output from an ARM-based device with RAM: 512 MB and CMA: 16 MB.
      
        MemTotal:         471172 kB
        MemFree:          111712 kB
        MemAvailable:     271172 kB
        .
        .
        .
        CmaTotal:          16384 kB
        CmaFree:            6144 kB
      
      This patch also fixes the checkpatch errors below, which were found while making these changes.
      
        ERROR: space required after that ',' (ctx:ExV)
        199: FILE: fs/proc/meminfo.c:199:
        +       ,atomic_long_read(&num_poisoned_pages) << (PAGE_SHIFT - 10)
                ^
      
        ERROR: space required after that ',' (ctx:ExV)
        202: FILE: fs/proc/meminfo.c:202:
        +       ,K(global_page_state(NR_ANON_TRANSPARENT_HUGEPAGES) *
                ^
      
        ERROR: space required after that ',' (ctx:ExV)
        206: FILE: fs/proc/meminfo.c:206:
        +       ,K(totalcma_pages)
                ^
      
        total: 3 errors, 0 warnings, 2 checks, 236 lines checked
      Signed-off-by: Pintu Kumar <pintu.k@samsung.com>
      Signed-off-by: Vishnu Pratap Singh <vishnu.ps@samsung.com>
      Acked-by: Michal Nazarewicz <mina86@mina86.com>
      Cc: Rafael Aquini <aquini@redhat.com>
      Cc: Jerome Marchand <jmarchan@redhat.com>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
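A short Python sketch of how a script might pick the two new fields out of /proc/meminfo. The helper names `parse_meminfo` and `cma_usage` are hypothetical. Because the commit says the fields are populated only when CMA is enabled, the sketch treats them as optional.

```python
def parse_meminfo(text):
    """Map each 'Name: value [unit]' line to its integer value (kB for most fields)."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        if not sep or not fields or not fields[0].isdigit():
            continue  # skip separators and anything that is not 'Name: number'
        values[name.strip()] = int(fields[0])
    return values


def cma_usage(values):
    """Return CMA totals in kB, or None when the kernel does not report them."""
    total = values.get("CmaTotal")
    free = values.get("CmaFree")
    if total is None or free is None:
        return None
    return {"total_kb": total, "free_kb": free, "used_kb": total - free}


if __name__ == "__main__":
    with open("/proc/meminfo") as handle:
        print(cma_usage(parse_meminfo(handle.read())))
```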
  14. 13 Dec, 2014 1 commit
    • genirq: Prevent proc race against freeing of irq descriptors · c291ee62
      Thomas Gleixner committed
      Since the rework of the sparse interrupt code to actually free the
      unused interrupt descriptors there exists a race between the /proc
      interfaces to the irq subsystem and the code which frees the interrupt
      descriptor.
      
      CPU0				CPU1
      				show_interrupts()
      				  desc = irq_to_desc(X);
      free_desc(desc)
        remove_from_radix_tree();
        kfree(desc);
      				  raw_spinlock_irq(&desc->lock);
      
      /proc/interrupts is the only interface which can actively corrupt
      kernel memory via the lock access. /proc/stat can only read from freed
      memory. Extremely hard to trigger, but possible.
      
      The interfaces in /proc/irq/N/ are not affected by this because the
      removal of the proc file is serialized in procfs against concurrent
      readers/writers. The removal happens before the descriptor is freed.
      
      For architectures which have CONFIG_SPARSE_IRQ=n this is a non issue
      as the descriptor is never freed. It's merely cleared out with the irq
      descriptor lock held. So any concurrent proc access will either see
      the old correct value or the cleared out ones.
      
      Protect the lookup and access to the irq descriptor in
      show_interrupts() with the sparse_irq_lock.
      
      Provide kstat_irqs_usr() which is protecting the lookup and access
      with sparse_irq_lock and switch /proc/stat to use it.
      
      Document the existing kstat_irqs interfaces so it's clear that the
      caller needs to take care about protection. The users of these
      interfaces are either not affected due to SPARSE_IRQ=n or already
      protected against removal.
      
      Fixes: 1f5a5b87 "genirq: Implement a sane sparse_irq allocator"
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
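The shape of the fix, doing the lookup and the access under the same lock that the freeing path takes, can be shown with a user-space analogy. The Python sketch below is not kernel code: `DescriptorTable` and its methods are hypothetical, a `threading.Lock` stands in for sparse_irq_lock, and returning 0 for a missing descriptor is an assumption made for the illustration.

```python
import threading


class DescriptorTable:
    """Toy table of per-IRQ records guarded by a single lock."""

    def __init__(self):
        self._lock = threading.Lock()  # stands in for sparse_irq_lock
        self._descs = {}

    def add(self, irq, count=0):
        with self._lock:
            self._descs[irq] = {"count": count}

    def free(self, irq):
        # Removal takes the same lock as readers, so a reader can never
        # hold a reference to a record that disappears mid-access.
        with self._lock:
            self._descs.pop(irq, None)

    def read_count(self, irq):
        # Lookup and access happen inside one critical section, which is
        # the property the racy sequence in the diagram above lacked.
        with self._lock:
            desc = self._descs.get(irq)
            if desc is None:
                return 0  # assumed behavior for a missing descriptor
            return desc["count"]
```

The racy variant would look the record up, drop the lock (or never take it), and only then touch the record, which is the window the diagram above shows between irq_to_desc() and the lock access.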
  15. 12 Dec, 2014 1 commit
    • userns: Add a knob to disable setgroups on a per user namespace basis · 9cc46516
      Eric W. Biederman committed
      - Expose the knob to user space through a proc file /proc/<pid>/setgroups
      
        A value of "deny" means the setgroups system call is disabled in the
        current process's user namespace and can not be enabled in the
        future in this user namespace.
      
        A value of "allow" means the setgroups system call is enabled.
      
      - Descendant user namespaces inherit the value of setgroups from
        their parents.
      
      - A proc file is used (instead of a sysctl) as sysctls currently do
        not allow checking the permissions at open time.
      
      - Writing to the proc file is restricted to before the gid_map
        for the user namespace is set.
      
        This ensures that disabling setgroups at a user namespace
        level will never remove the ability to call setgroups
        from a process that already has that ability.
      
        A process may opt in to the setgroups disable for itself by
        creating, entering and configuring a user namespace or by calling
        setns on an existing user namespace with setgroups disabled.
        Processes without privileges already can not call setgroups so this
        is a noop.  Processes with privilege become processes without
        privilege when entering a user namespace and as with any other path
        to dropping privilege they would not have the ability to call
        setgroups.  So this remains within the bounds of what is possible
        without a knob to disable setgroups permanently in a user namespace.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
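A minimal Python sketch of how a process might use the knob described above. The helper names and the `proc_root` parameter are hypothetical; `proc_root` exists only so the functions can be pointed at a test directory. On a real system the write succeeds only under the conditions the commit lists, in particular before gid_map has been set for the user namespace.

```python
import os


def setgroups_path(pid, proc_root="/proc"):
    """Path of the per-process knob; pid may also be the string 'self'."""
    return os.path.join(proc_root, str(pid), "setgroups")


def read_setgroups(pid, proc_root="/proc"):
    """Return the current value of the knob, 'allow' or 'deny'."""
    with open(setgroups_path(pid, proc_root)) as handle:
        return handle.read().strip()


def deny_setgroups(pid, proc_root="/proc"):
    """Disable setgroups for the user namespace of the given process.

    Per the commit message this is one-way: once "deny" is written,
    setgroups can not be re-enabled in that user namespace.
    """
    with open(setgroups_path(pid, proc_root), "w") as handle:
        handle.write("deny")
```

A typical sequence after creating a new user namespace would be to call `deny_setgroups("self")` first and write gid_map afterwards, since the commit restricts writes to the knob to the window before gid_map is set.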
  16. 11 Dec, 2014 9 commits