1. 10 10月, 2017 1 次提交
    • J
      x86/unwind: Use MSB for frame pointer encoding on 32-bit · 5c99b692
      Josh Poimboeuf 提交于
      On x86-32, Tetsuo Handa and Fengguang Wu reported unwinder warnings
      like:
      
        WARNING: kernel stack regs at f60bb9c8 in swapper:1 has bad 'bp' value 0ba00000
      
      And also there were some stack dumps with a bunch of unreliable '?'
      symbols after an apic_timer_interrupt symbol, meaning the unwinder got
      confused when it tried to read the regs.
      
      The cause of those issues is that, with GCC 4.8 (and possibly older),
      there are cases where GCC misaligns the stack pointer in a leaf function
      for no apparent reason:
      
        c124a388 <acpi_rs_move_data>:
        c124a388:       55                      push   %ebp
        c124a389:       89 e5                   mov    %esp,%ebp
        c124a38b:       57                      push   %edi
        c124a38c:       56                      push   %esi
        c124a38d:       89 d6                   mov    %edx,%esi
        c124a38f:       53                      push   %ebx
        c124a390:       31 db                   xor    %ebx,%ebx
        c124a392:       83 ec 03                sub    $0x3,%esp
        ...
        c124a3e3:       83 c4 03                add    $0x3,%esp
        c124a3e6:       5b                      pop    %ebx
        c124a3e7:       5e                      pop    %esi
        c124a3e8:       5f                      pop    %edi
        c124a3e9:       5d                      pop    %ebp
        c124a3ea:       c3                      ret
      
      If an interrupt occurs in such a function, the regs on the stack will be
      unaligned, which breaks the frame pointer encoding assumption.  So on
      32-bit, use the MSB instead of the LSB to encode the regs.
      
      This isn't an issue on 64-bit, because interrupts align the stack before
      writing to it.
      Reported-and-tested-by: NTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Reported-and-tested-by: NFengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
      Cc: Byungchul Park <byungchul.park@lge.com>
      Cc: LKP <lkp@01.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/279a26996a482ca716605c7dbc7f2db9d8d91e81.1507597785.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      5c99b692
  2. 01 9月, 2017 1 次提交
  3. 29 8月, 2017 4 次提交
  4. 10 8月, 2017 3 次提交
  5. 27 7月, 2017 1 次提交
  6. 18 7月, 2017 3 次提交
  7. 08 7月, 2017 1 次提交
    • T
      x86/syscalls: Check address limit on user-mode return · 5ea0727b
      Thomas Garnier 提交于
      Ensure the address limit is a user-mode segment before returning to
      user-mode. Otherwise a process can corrupt kernel-mode memory and elevate
      privileges [1].
      
      The set_fs function sets the TIF_SETFS flag to force a slow path on
      return. In the slow path, the address limit is checked to be USER_DS if
      needed.
      
      The addr_limit_user_check function is added as a cross-architecture
      function to check the address limit.
      
      [1] https://bugs.chromium.org/p/project-zero/issues/detail?id=990Signed-off-by: NThomas Garnier <thgarnie@google.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: kernel-hardening@lists.openwall.com
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Miroslav Benes <mbenes@suse.cz>
      Cc: Chris Metcalf <cmetcalf@mellanox.com>
      Cc: Pratyush Anand <panand@redhat.com>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: Will Drewry <wad@chromium.org>
      Cc: linux-api@vger.kernel.org
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Link: http://lkml.kernel.org/r/20170615011203.144108-1-thgarnie@google.com
      5ea0727b
  8. 21 6月, 2017 1 次提交
    • D
      ARM: 8683/1: ARM32: Support mremap() for sigpage/vDSO · 280e87e9
      Dmitry Safonov 提交于
      CRIU restores application mappings on the same place where they
      were before Checkpoint. That means, that we need to move vDSO
      and sigpage during restore on exactly the same place where
      they were before C/R.
      
      Make mremap() code update mm->context.{sigpage,vdso} pointers
      during VMA move. Sigpage is used for landing after handling
      a signal - if the pointer is not updated during moving, the
      application might crash on any signal after mremap().
      
      vDSO pointer on ARM32 is used only for setting auxv at this moment,
      update it during mremap() in case of future usage.
      
      Without those updates, current work of CRIU on ARM32 is not reliable.
      Historically, we error Checkpointing if we find vDSO page on ARM32
      and suggest user to disable CONFIG_VDSO.
      But that's not correct - it goes from x86 where signal processing
      is ended in vDSO blob. For arm32 it's sigpage, which is not disabled
      with `CONFIG_VDSO=n'.
      
      Looks like C/R was working by luck - because userspace on ARM32 at
      this moment always sets SA_RESTORER.
      Signed-off-by: NDmitry Safonov <dsafonov@virtuozzo.com>
      Acked-by: NAndy Lutomirski <luto@amacapital.net>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Pavel Emelyanov <xemul@virtuozzo.com>
      Cc: Christopher Covington <cov@codeaurora.org>
      Signed-off-by: NRussell King <rmk+kernel@armlinux.org.uk>
      280e87e9
  9. 13 6月, 2017 1 次提交
  10. 24 5月, 2017 1 次提交
    • J
      Revert "x86/entry: Fix the end of the stack for newly forked tasks" · ebd57499
      Josh Poimboeuf 提交于
      Petr Mladek reported the following warning when loading the livepatch
      sample module:
      
        WARNING: CPU: 1 PID: 3699 at arch/x86/kernel/stacktrace.c:132 save_stack_trace_tsk_reliable+0x133/0x1a0
        ...
        Call Trace:
         __schedule+0x273/0x820
         schedule+0x36/0x80
         kthreadd+0x305/0x310
         ? kthread_create_on_cpu+0x80/0x80
         ? icmp_echo.part.32+0x50/0x50
         ret_from_fork+0x2c/0x40
      
      That warning means the end of the stack is no longer recognized as such
      for newly forked tasks.  The problem was introduced with the following
      commit:
      
        ff3f7e24 ("x86/entry: Fix the end of the stack for newly forked tasks")
      
      ... which was completely misguided.  It only partially fixed the
      reported issue, and it introduced another bug in the process.  None of
      the other entry code saves the frame pointer before calling into C code,
      so it doesn't make sense for ret_from_fork to do so either.
      
      Contrary to what I originally thought, the original issue wasn't related
      to newly forked tasks.  It was actually related to ftrace.  When entry
      code calls into a function which then calls into an ftrace handler, the
      stack frame looks different than normal.
      
      The original issue will be fixed in the unwinder, in a subsequent patch.
      Reported-by: NPetr Mladek <pmladek@suse.com>
      Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
      Acked-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Dave Jones <davej@codemonkey.org.uk>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: live-patching@vger.kernel.org
      Fixes: ff3f7e24 ("x86/entry: Fix the end of the stack for newly forked tasks")
      Link: http://lkml.kernel.org/r/f350760f7e82f0750c8d1dd093456eb212751caa.1495553739.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      ebd57499
  11. 18 4月, 2017 1 次提交
    • A
      Remove compat_sys_getdents64() · 2611dc19
      Al Viro 提交于
      Unlike normal compat syscall variants, it is needed only for
      biarch architectures that have different alignement requirements for
      u64 in 32bit and 64bit ABI *and* have __put_user() that won't handle
      a store of 64bit value at 32bit-aligned address.  We used to have one
      such (ia64), but its biarch support has been gone since 2010 (after
      being broken in 2008, which went unnoticed since nobody had been using
      it).
      
      It had escaped removal at the same time only because back in 2004
      a patch that switched several syscalls on amd64 from private wrappers to
      generic compat ones had switched to use of compat_sys_getdents64(), which
      hadn't needed (or used) a compat wrapper on amd64.
      
      Let's bury it - it's at least 7 years overdue.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      2611dc19
  12. 11 4月, 2017 1 次提交
  13. 04 4月, 2017 1 次提交
  14. 24 3月, 2017 1 次提交
  15. 20 3月, 2017 1 次提交
    • K
      x86/syscalls/32: Wire up arch_prctl on x86-32 · 79170fda
      Kyle Huey 提交于
      Hook up arch_prctl to call do_arch_prctl() on x86-32, and in 32 bit compat
      mode on x86-64. This allows to have arch_prctls that are not specific to 64
      bits.
      
      On UML, simply stub out this syscall.
      Signed-off-by: NKyle Huey <khuey@kylehuey.com>
      Cc: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com>
      Cc: kvm@vger.kernel.org
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: linux-kselftest@vger.kernel.org
      Cc: Nadav Amit <nadav.amit@gmail.com>
      Cc: Robert O'Callahan <robert@ocallahan.org>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Len Brown <len.brown@intel.com>
      Cc: Shuah Khan <shuah@kernel.org>
      Cc: user-mode-linux-devel@lists.sourceforge.net
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: user-mode-linux-user@lists.sourceforge.net
      Cc: David Matlack <dmatlack@google.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Dmitry Safonov <dsafonov@virtuozzo.com>
      Cc: linux-fsdevel@vger.kernel.org
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Link: http://lkml.kernel.org/r/20170320081628.18952-7-khuey@kylehuey.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      79170fda
  16. 16 3月, 2017 1 次提交
    • T
      x86: Remap GDT tables in the fixmap section · 69218e47
      Thomas Garnier 提交于
      Each processor holds a GDT in its per-cpu structure. The sgdt
      instruction gives the base address of the current GDT. This address can
      be used to bypass KASLR memory randomization. With another bug, an
      attacker could target other per-cpu structures or deduce the base of
      the main memory section (PAGE_OFFSET).
      
      This patch relocates the GDT table for each processor inside the
      fixmap section. The space is reserved based on number of supported
      processors.
      
      For consistency, the remapping is done by default on 32 and 64-bit.
      
      Each processor switches to its remapped GDT at the end of
      initialization. For hibernation, the main processor returns with the
      original GDT and switches back to the remapping at completion.
      
      This patch was tested on both architectures. Hibernation and KVM were
      both tested specially for their usage of the GDT.
      
      Thanks to Boris Ostrovsky <boris.ostrovsky@oracle.com> for testing and
      recommending changes for Xen support.
      Signed-off-by: NThomas Garnier <thgarnie@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Kosina <jikos@kernel.org>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Len Brown <len.brown@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Lorenzo Stoakes <lstoakes@gmail.com>
      Cc: Luis R . Rodriguez <mcgrof@kernel.org>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Rafael J . Wysocki <rjw@rjwysocki.net>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Stanislaw Gruszka <sgruszka@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
      Cc: kasan-dev@googlegroups.com
      Cc: kernel-hardening@lists.openwall.com
      Cc: kvm@vger.kernel.org
      Cc: lguest@lists.ozlabs.org
      Cc: linux-doc@vger.kernel.org
      Cc: linux-efi@vger.kernel.org
      Cc: linux-mm@kvack.org
      Cc: linux-pm@vger.kernel.org
      Cc: xen-devel@lists.xenproject.org
      Cc: zijun_hu <zijun_hu@htc.com>
      Link: http://lkml.kernel.org/r/20170314170508.100882-2-thgarnie@google.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      69218e47
  17. 11 3月, 2017 1 次提交
  18. 08 3月, 2017 1 次提交
  19. 03 3月, 2017 1 次提交
    • D
      statx: Add a system call to make enhanced file info available · a528d35e
      David Howells 提交于
      Add a system call to make extended file information available, including
      file creation and some attribute flags where available through the
      underlying filesystem.
      
      The getattr inode operation is altered to take two additional arguments: a
      u32 request_mask and an unsigned int flags that indicate the
      synchronisation mode.  This change is propagated to the vfs_getattr*()
      function.
      
      Functions like vfs_stat() are now inline wrappers around new functions
      vfs_statx() and vfs_statx_fd() to reduce stack usage.
      
      ========
      OVERVIEW
      ========
      
      The idea was initially proposed as a set of xattrs that could be retrieved
      with getxattr(), but the general preference proved to be for a new syscall
      with an extended stat structure.
      
      A number of requests were gathered for features to be included.  The
      following have been included:
      
       (1) Make the fields a consistent size on all arches and make them large.
      
       (2) Spare space, request flags and information flags are provided for
           future expansion.
      
       (3) Better support for the y2038 problem [Arnd Bergmann] (tv_sec is an
           __s64).
      
       (4) Creation time: The SMB protocol carries the creation time, which could
           be exported by Samba, which will in turn help CIFS make use of
           FS-Cache as that can be used for coherency data (stx_btime).
      
           This is also specified in NFSv4 as a recommended attribute and could
           be exported by NFSD [Steve French].
      
       (5) Lightweight stat: Ask for just those details of interest, and allow a
           netfs (such as NFS) to approximate anything not of interest, possibly
           without going to the server [Trond Myklebust, Ulrich Drepper, Andreas
           Dilger] (AT_STATX_DONT_SYNC).
      
       (6) Heavyweight stat: Force a netfs to go to the server, even if it thinks
           its cached attributes are up to date [Trond Myklebust]
           (AT_STATX_FORCE_SYNC).
      
      And the following have been left out for future extension:
      
       (7) Data version number: Could be used by userspace NFS servers [Aneesh
           Kumar].
      
           Can also be used to modify fill_post_wcc() in NFSD which retrieves
           i_version directly, but has just called vfs_getattr().  It could get
           it from the kstat struct if it used vfs_xgetattr() instead.
      
           (There's disagreement on the exact semantics of a single field, since
           not all filesystems do this the same way).
      
       (8) BSD stat compatibility: Including more fields from the BSD stat such
           as creation time (st_btime) and inode generation number (st_gen)
           [Jeremy Allison, Bernd Schubert].
      
       (9) Inode generation number: Useful for FUSE and userspace NFS servers
           [Bernd Schubert].
      
           (This was asked for but later deemed unnecessary with the
           open-by-handle capability available and caused disagreement as to
           whether it's a security hole or not).
      
      (10) Extra coherency data may be useful in making backups [Andreas Dilger].
      
           (No particular data were offered, but things like last backup
           timestamp, the data version number and the DOS archive bit would come
           into this category).
      
      (11) Allow the filesystem to indicate what it can/cannot provide: A
           filesystem can now say it doesn't support a standard stat feature if
           that isn't available, so if, for instance, inode numbers or UIDs don't
           exist or are fabricated locally...
      
           (This requires a separate system call - I have an fsinfo() call idea
           for this).
      
      (12) Store a 16-byte volume ID in the superblock that can be returned in
           struct xstat [Steve French].
      
           (Deferred to fsinfo).
      
      (13) Include granularity fields in the time data to indicate the
           granularity of each of the times (NFSv4 time_delta) [Steve French].
      
           (Deferred to fsinfo).
      
      (14) FS_IOC_GETFLAGS value.  These could be translated to BSD's st_flags.
           Note that the Linux IOC flags are a mess and filesystems such as Ext4
           define flags that aren't in linux/fs.h, so translation in the kernel
           may be a necessity (or, possibly, we provide the filesystem type too).
      
           (Some attributes are made available in stx_attributes, but the general
           feeling was that the IOC flags were to ext[234]-specific and shouldn't
           be exposed through statx this way).
      
      (15) Mask of features available on file (eg: ACLs, seclabel) [Brad Boyer,
           Michael Kerrisk].
      
           (Deferred, probably to fsinfo.  Finding out if there's an ACL or
           seclabal might require extra filesystem operations).
      
      (16) Femtosecond-resolution timestamps [Dave Chinner].
      
           (A __reserved field has been left in the statx_timestamp struct for
           this - if there proves to be a need).
      
      (17) A set multiple attributes syscall to go with this.
      
      ===============
      NEW SYSTEM CALL
      ===============
      
      The new system call is:
      
      	int ret = statx(int dfd,
      			const char *filename,
      			unsigned int flags,
      			unsigned int mask,
      			struct statx *buffer);
      
      The dfd, filename and flags parameters indicate the file to query, in a
      similar way to fstatat().  There is no equivalent of lstat() as that can be
      emulated with statx() by passing AT_SYMLINK_NOFOLLOW in flags.  There is
      also no equivalent of fstat() as that can be emulated by passing a NULL
      filename to statx() with the fd of interest in dfd.
      
      Whether or not statx() synchronises the attributes with the backing store
      can be controlled by OR'ing a value into the flags argument (this typically
      only affects network filesystems):
      
       (1) AT_STATX_SYNC_AS_STAT tells statx() to behave as stat() does in this
           respect.
      
       (2) AT_STATX_FORCE_SYNC will require a network filesystem to synchronise
           its attributes with the server - which might require data writeback to
           occur to get the timestamps correct.
      
       (3) AT_STATX_DONT_SYNC will suppress synchronisation with the server in a
           network filesystem.  The resulting values should be considered
           approximate.
      
      mask is a bitmask indicating the fields in struct statx that are of
      interest to the caller.  The user should set this to STATX_BASIC_STATS to
      get the basic set returned by stat().  It should be noted that asking for
      more information may entail extra I/O operations.
      
      buffer points to the destination for the data.  This must be 256 bytes in
      size.
      
      ======================
      MAIN ATTRIBUTES RECORD
      ======================
      
      The following structures are defined in which to return the main attribute
      set:
      
      	struct statx_timestamp {
      		__s64	tv_sec;
      		__s32	tv_nsec;
      		__s32	__reserved;
      	};
      
      	struct statx {
      		__u32	stx_mask;
      		__u32	stx_blksize;
      		__u64	stx_attributes;
      		__u32	stx_nlink;
      		__u32	stx_uid;
      		__u32	stx_gid;
      		__u16	stx_mode;
      		__u16	__spare0[1];
      		__u64	stx_ino;
      		__u64	stx_size;
      		__u64	stx_blocks;
      		__u64	__spare1[1];
      		struct statx_timestamp	stx_atime;
      		struct statx_timestamp	stx_btime;
      		struct statx_timestamp	stx_ctime;
      		struct statx_timestamp	stx_mtime;
      		__u32	stx_rdev_major;
      		__u32	stx_rdev_minor;
      		__u32	stx_dev_major;
      		__u32	stx_dev_minor;
      		__u64	__spare2[14];
      	};
      
      The defined bits in request_mask and stx_mask are:
      
      	STATX_TYPE		Want/got stx_mode & S_IFMT
      	STATX_MODE		Want/got stx_mode & ~S_IFMT
      	STATX_NLINK		Want/got stx_nlink
      	STATX_UID		Want/got stx_uid
      	STATX_GID		Want/got stx_gid
      	STATX_ATIME		Want/got stx_atime{,_ns}
      	STATX_MTIME		Want/got stx_mtime{,_ns}
      	STATX_CTIME		Want/got stx_ctime{,_ns}
      	STATX_INO		Want/got stx_ino
      	STATX_SIZE		Want/got stx_size
      	STATX_BLOCKS		Want/got stx_blocks
      	STATX_BASIC_STATS	[The stuff in the normal stat struct]
      	STATX_BTIME		Want/got stx_btime{,_ns}
      	STATX_ALL		[All currently available stuff]
      
      stx_btime is the file creation time, stx_mask is a bitmask indicating the
      data provided and __spares*[] are where as-yet undefined fields can be
      placed.
      
      Time fields are structures with separate seconds and nanoseconds fields
      plus a reserved field in case we want to add even finer resolution.  Note
      that times will be negative if before 1970; in such a case, the nanosecond
      fields will also be negative if not zero.
      
      The bits defined in the stx_attributes field convey information about a
      file, how it is accessed, where it is and what it does.  The following
      attributes map to FS_*_FL flags and are the same numerical value:
      
      	STATX_ATTR_COMPRESSED		File is compressed by the fs
      	STATX_ATTR_IMMUTABLE		File is marked immutable
      	STATX_ATTR_APPEND		File is append-only
      	STATX_ATTR_NODUMP		File is not to be dumped
      	STATX_ATTR_ENCRYPTED		File requires key to decrypt in fs
      
      Within the kernel, the supported flags are listed by:
      
      	KSTAT_ATTR_FS_IOC_FLAGS
      
      [Are any other IOC flags of sufficient general interest to be exposed
      through this interface?]
      
      New flags include:
      
      	STATX_ATTR_AUTOMOUNT		Object is an automount trigger
      
      These are for the use of GUI tools that might want to mark files specially,
      depending on what they are.
      
      Fields in struct statx come in a number of classes:
      
       (0) stx_dev_*, stx_blksize.
      
           These are local system information and are always available.
      
       (1) stx_mode, stx_nlinks, stx_uid, stx_gid, stx_[amc]time, stx_ino,
           stx_size, stx_blocks.
      
           These will be returned whether the caller asks for them or not.  The
           corresponding bits in stx_mask will be set to indicate whether they
           actually have valid values.
      
           If the caller didn't ask for them, then they may be approximated.  For
           example, NFS won't waste any time updating them from the server,
           unless as a byproduct of updating something requested.
      
           If the values don't actually exist for the underlying object (such as
           UID or GID on a DOS file), then the bit won't be set in the stx_mask,
           even if the caller asked for the value.  In such a case, the returned
           value will be a fabrication.
      
           Note that there are instances where the type might not be valid, for
           instance Windows reparse points.
      
       (2) stx_rdev_*.
      
           This will be set only if stx_mode indicates we're looking at a
           blockdev or a chardev, otherwise will be 0.
      
       (3) stx_btime.
      
           Similar to (1), except this will be set to 0 if it doesn't exist.
      
      =======
      TESTING
      =======
      
      The following test program can be used to test the statx system call:
      
      	samples/statx/test-statx.c
      
      Just compile and run, passing it paths to the files you want to examine.
      The file is built automatically if CONFIG_SAMPLES is enabled.
      
      Here's some example output.  Firstly, an NFS directory that crosses to
      another FSID.  Note that the AUTOMOUNT attribute is set because transiting
      this directory will cause d_automount to be invoked by the VFS.
      
      	[root@andromeda ~]# /tmp/test-statx -A /warthog/data
      	statx(/warthog/data) = 0
      	results=7ff
      	  Size: 4096            Blocks: 8          IO Block: 1048576  directory
      	Device: 00:26           Inode: 1703937     Links: 125
      	Access: (3777/drwxrwxrwx)  Uid:     0   Gid:  4041
      	Access: 2016-11-24 09:02:12.219699527+0000
      	Modify: 2016-11-17 10:44:36.225653653+0000
      	Change: 2016-11-17 10:44:36.225653653+0000
      	Attributes: 0000000000001000 (-------- -------- -------- -------- -------- -------- ---m---- --------)
      
      Secondly, the result of automounting on that directory.
      
      	[root@andromeda ~]# /tmp/test-statx /warthog/data
      	statx(/warthog/data) = 0
      	results=7ff
      	  Size: 4096            Blocks: 8          IO Block: 1048576  directory
      	Device: 00:27           Inode: 2           Links: 125
      	Access: (3777/drwxrwxrwx)  Uid:     0   Gid:  4041
      	Access: 2016-11-24 09:02:12.219699527+0000
      	Modify: 2016-11-17 10:44:36.225653653+0000
      	Change: 2016-11-17 10:44:36.225653653+0000
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      a528d35e
  20. 02 3月, 2017 3 次提交
  21. 01 3月, 2017 2 次提交
  22. 25 2月, 2017 1 次提交
  23. 12 1月, 2017 1 次提交
    • J
      x86/entry: Fix the end of the stack for newly forked tasks · ff3f7e24
      Josh Poimboeuf 提交于
      When unwinding a task, the end of the stack is always at the same offset
      right below the saved pt_regs, regardless of which syscall was used to
      enter the kernel.  That convention allows the unwinder to verify that a
      stack is sane.
      
      However, newly forked tasks don't always follow that convention, as
      reported by the following unwinder warning seen by Dave Jones:
      
        WARNING: kernel stack frame pointer at ffffc90001443f30 in kworker/u8:8:30468 has bad value           (null)
      
      The warning was due to the following call chain:
      
        (ftrace handler)
        call_usermodehelper_exec_async+0x5/0x140
        ret_from_fork+0x22/0x30
      
      The problem is that ret_from_fork() doesn't create a stack frame before
      calling other functions.  Fix that by carefully using the frame pointer
      macros.
      
      In addition to conforming to the end of stack convention, this also
      makes related stack traces more sensible by making it clear to the user
      that ret_from_fork() was involved.
      Reported-by: NDave Jones <davej@codemonkey.org.uk>
      Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Miroslav Benes <mbenes@suse.cz>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/8854cdaab980e9700a81e9ebf0d4238e4bbb68ef.1483978430.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      ff3f7e24
  24. 25 12月, 2016 3 次提交
  25. 15 12月, 2016 1 次提交
  26. 09 12月, 2016 1 次提交
  27. 28 10月, 2016 1 次提交
    • D
      x86/vdso: Set vDSO pointer only after success · 67dece7d
      Dmitry Safonov 提交于
      Those pointers were initialized before call to _install_special_mapping()
      after the commit:
      
        f7b6eb3f ("x86: Set context.vdso before installing the mapping")
      
      This is not required anymore as special mappings have their vma name and
      don't use arch_vma_name() after commit:
      
        a62c34bd ("x86, mm: Improve _install_special_mapping and fix x86 vdso naming")
      
      So, this way to init looks less entangled.
      
      I even belive that we can remove NULL initializers:
      
       - on failure load_elf_binary() will not start a new thread;
       - arch_prctl will have the same pointers as before syscall.
      Signed-off-by: NDmitry Safonov <dsafonov@virtuozzo.com>
      Acked-by: NAndy Lutomirski <luto@kernel.org>
      Cc: 0x7f454c46@gmail.com
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-mm@kvack.org
      Cc: oleg@redhat.com
      Link: http://lkml.kernel.org/r/20161027141516.28447-3-dsafonov@virtuozzo.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      67dece7d
  28. 25 10月, 2016 1 次提交