1. 04 Dec, 2019 1 commit
    • M
      orangefs: posix open permission checking... · f9bbb682
      Mike Marshall committed
      Orangefs has no open, and orangefs checks file permissions
      on each file access. Posix requires that file permissions
      be checked on open and nowhere else. Orangefs-through-the-kernel
      needs to seem posix compliant.
      
      The VFS opens files, even if the filesystem provides no
      method. We can see if a file was successfully opened for
      read and/or for write by looking at file->f_mode.
      
      When writes are flowing from the page cache, file is no
      longer available. We can trust the VFS to have checked
      file->f_mode before writing to the page cache.
      
      The mode of a file might change between when it is opened
      and IO commences, or it might be created with an arbitrary mode.
      
      We'll make sure we don't hit EACCES during the IO stage by
      using UID 0. Some of the time we have access without changing
      to UID 0 - how to check?
      Signed-off-by: Mike Marshall <hubcap@omnibond.com>
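The open-time check described above can be sketched in plain C. This is a minimal userspace model, not the OrangeFS code: `struct sketch_file`, the `SKETCH_FMODE_*` bits and `sketch_io_allowed` are hypothetical stand-ins for `struct file`, its `f_mode` bits and whatever helper the module actually uses.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's read/write f_mode bits. */
#define SKETCH_FMODE_READ  0x1u
#define SKETCH_FMODE_WRITE 0x2u

/* Hypothetical, heavily simplified stand-in for struct file. */
struct sketch_file {
    unsigned int f_mode; /* access that was granted when the file was opened */
};

enum sketch_io { SKETCH_IO_READ, SKETCH_IO_WRITE };

/*
 * Decide whether an IO request may proceed using only what was recorded
 * at open time. Returns 0 if allowed, -EBADF otherwise. The current mode
 * bits of the file on the server are deliberately not consulted.
 */
int sketch_io_allowed(const struct sketch_file *file, enum sketch_io dir)
{
    unsigned int need = (dir == SKETCH_IO_READ) ? SKETCH_FMODE_READ
                                                : SKETCH_FMODE_WRITE;

    assert(file != NULL);
    return (file->f_mode & need) ? 0 : -EBADF;
}
```

A file opened read-only passes `SKETCH_IO_READ` and is refused for `SKETCH_IO_WRITE`, no matter how the file's mode changes after the open.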
  2. 01 Aug, 2019 1 commit
    • M
      docs: fs: convert porting to ReST · 25b532ce
      Mauro Carvalho Chehab committed
      This file has its own proper style, except that, after a while,
      the coding style gets violated and whitespace is placed in
      different ways.
      
      As Sphinx and ReST are very sensitive to whitespace differences,
      I had to decide whether each entry after the required/mandatory/...
      fields should start with zero spaces or with a tab. I opted to start
      them all from the zero position, in order to avoid needing to break
      lines with more than 80 columns, which would make review harder.
      
      Most of the other changes to porting.rst were made to use a unified
      notation which works nicely as a text file while also producing good
      HTML output after being parsed.
      Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
      Signed-off-by: Jonathan Corbet <corbet@lwn.net>
  3. 04 May, 2019 9 commits
    • M
      orangefs: pass slot index back to readpage. · 4077a0f2
      Mike Marshall committed
      When userspace deposits more than a page of data into the shared buffer,
      we'll need to know which slot it is in when we get back to readpage
      so that we can try to use the extra data to fill some extra pages.
      Signed-off-by: Mike Marshall <hubcap@omnibond.com>
      Signed-off-by: Martin Brandenburg <martin@omnibond.com>
    • M
      orangefs: remember count when reading. · c2549f8c
      Mike Marshall committed
      Orangefs wins when it can do IO on large (up to four meg) blocks at a time,
      and loses when it has to do tiny "small io" reads and writes. Accessing
      Orangefs through the pagecache with the kernel module helps with small io,
      both reading and writing, a great deal. Readpage generally tries to fetch a
      page (four k) at a time. We'll let users use "count" (as in read(2) or
      pread(2) for example) as a knob to control how much data they get from
      Orangefs at a time and we'll try to use the data to fill extra
      pagecache pages when we get to ->readpage, hopefully resulting in
      fewer calls to readpage and Orangefs userspace.
      
      We need a way to remember how they set count so that we can still have
      it available when we get to ->readpage.
      
       - We'll use file->private_data to keep track of "count".
         We'll wrap generic_file_open with orangefs_file_open and
         initialize private_data to NULL there.
      
       - In ->read_iter we have access to both "count" and file, so
         we'll kmalloc some space onto file->private_data and store
         "count" there.
      
       - We'll kfree file->private_data each time we visit ->flush and
         reinitialize it to NULL.
      Signed-off-by: Mike Marshall <hubcap@omnibond.com>
      Signed-off-by: Martin Brandenburg <martin@omnibond.com>
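The three steps above (NULL on open, store "count" in read_iter, free on flush) can be sketched in userspace C. Everything named `sketch_*` is hypothetical, `malloc`/`free` stand in for `kmalloc`/`kfree`, and two details are assumptions of this sketch rather than statements from the commit: the allocation is reused across reads, and readpage never asks for less than one page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct file; only private_data matters here. */
struct sketch_file {
    void *private_data;
};

/* Hypothetical record hung off private_data to remember "count". */
struct sketch_read_hint {
    size_t count;
};

/* open: start with no hint attached. */
void sketch_open(struct sketch_file *f)
{
    assert(f != NULL);
    f->private_data = NULL;
}

/* read_iter: both "count" and the file are visible, so record count. */
int sketch_read_iter(struct sketch_file *f, size_t count)
{
    struct sketch_read_hint *hint = f->private_data;

    if (hint == NULL) {
        hint = malloc(sizeof(*hint));
        if (hint == NULL)
            return -1;
        f->private_data = hint;
    }
    hint->count = count;
    return 0;
}

/* readpage: how many bytes to request (assumed: at least one page). */
size_t sketch_readpage_count(const struct sketch_file *f, size_t page_size)
{
    const struct sketch_read_hint *hint = f->private_data;

    return (hint != NULL && hint->count > page_size) ? hint->count
                                                     : page_size;
}

/* flush: drop the hint and reinitialize private_data to NULL. */
void sketch_flush(struct sketch_file *f)
{
    free(f->private_data);
    f->private_data = NULL;
}
```

With a 4096-byte page, a read of one megabyte makes `sketch_readpage_count` return one megabyte until the next flush, after which it falls back to a single page.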
    • M
      orangefs: add orangefs_revalidate_mapping · 8f04e1be
      Martin Brandenburg committed
      This is modeled after NFS, except our method is different.  We use a
      simple timer to determine whether to invalidate the page cache.  This
      is bound to perform.
      
      This adds a sysfs parameter cache_timeout_msecs which controls the time
      between page cache invalidations.
      Signed-off-by: Martin Brandenburg <martin@omnibond.com>
      Signed-off-by: Mike Marshall <hubcap@omnibond.com>
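A timer-driven invalidation of this kind can be sketched as follows. This is a userspace model under stated assumptions: `struct sketch_mapping` and `sketch_revalidate` are hypothetical names, time is passed in as plain milliseconds, and only the parameter name `cache_timeout_msecs` comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-inode state: when the cached pages stop being trusted. */
struct sketch_mapping {
    uint64_t expires_ms;
};

/*
 * Returns true when the caller should invalidate the page cache. In that
 * case the timer is re-armed so the next invalidation happens no sooner
 * than cache_timeout_msecs from now.
 */
bool sketch_revalidate(struct sketch_mapping *m, uint64_t now_ms,
                       uint64_t cache_timeout_msecs)
{
    assert(m != NULL);
    if (now_ms < m->expires_ms)
        return false; /* still inside the trusted window */
    m->expires_ms = now_ms + cache_timeout_msecs;
    return true;
}
```

The design trades coherency for simplicity: no server round trip is needed to decide, but stale pages can be served for up to one timeout interval.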
    • M
      orangefs: write range tracking · 52e2d0a3
      Martin Brandenburg committed
      Attach the actual range of bytes written to plus the responsible uid/gid
      to each dirty page.  This information must be sent to the server when
      the page is written out.
      
      Now write_begin, page_mkwrite, and invalidatepage keep up with this
      information.  There are several conditions where they must write out the
      page immediately to store the new range.  Two non-contiguous ranges
      cannot be stored on a single page.
      Signed-off-by: Martin Brandenburg <martin@omnibond.com>
      Signed-off-by: Mike Marshall <hubcap@omnibond.com>
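The rule that one page can carry only a single contiguous range, owned by one uid/gid, can be sketched like this. The struct and function names are hypothetical, and treating overlapping or touching writes as mergeable is an assumption of the sketch; the commit only states that non-contiguous ranges cannot share a page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical record attached to a dirty page. */
struct sketch_write_range {
    off_t  pos; /* first dirty byte */
    size_t len; /* number of dirty bytes */
    uid_t  uid; /* who dirtied them */
    gid_t  gid;
};

/*
 * Try to fold a new write [pos, pos + len) by uid/gid into the range
 * already attached to the page. Returns true if the range was extended.
 * Returns false if the page must be written out first, because the
 * writer differs or the two ranges would leave a gap between them.
 */
bool sketch_range_merge(struct sketch_write_range *wr, off_t pos, size_t len,
                        uid_t uid, gid_t gid)
{
    off_t old_end, new_end, start, end;

    assert(wr != NULL);
    if (wr->uid != uid || wr->gid != gid)
        return false;

    old_end = wr->pos + (off_t)wr->len;
    new_end = pos + (off_t)len;
    if (pos > old_end || new_end < wr->pos)
        return false; /* non-contiguous */

    start = pos < wr->pos ? pos : wr->pos;
    end = new_end > old_end ? new_end : old_end;
    wr->pos = start;
    wr->len = (size_t)(end - start);
    return true;
}
```

A `false` return corresponds to the "write out the page immediately" cases mentioned above; after the writeout the caller would attach a fresh range for the new write.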
    • M
      orangefs: migrate to generic_file_read_iter · c453dcfc
      Martin Brandenburg committed
      Remove orangefs_inode_read.  It was used by readpage.  Calling
      wait_for_direct_io directly serves the purpose just as well.  There is
      now no check of the bufmap size in the readpage path.  There are already
      other places the bufmap size is assumed to be greater than PAGE_SIZE.
      
      Important to call truncate_inode_pages now in the write path so a
      subsequent read sees the new data.
      Signed-off-by: Martin Brandenburg <martin@omnibond.com>
      Signed-off-by: Mike Marshall <hubcap@omnibond.com>
    • M
    • M
      orangefs: reorganize setattr functions to track attribute changes · afd9fb2a
      Martin Brandenburg committed
      OrangeFS accepts a mask indicating which attributes were changed.  The
      kernel must not set any bits except those that were actually changed.
      The kernel must set the uid/gid of the request to the actual uid/gid
      responsible for the change.
      
      Code path for notify_change initiated setattrs is
      
      orangefs_setattr(dentry, iattr)
      -> __orangefs_setattr(inode, iattr)
      
      In-kernel changes are initiated by calling __orangefs_setattr.
      
      Code path for writeback is
      
      orangefs_write_inode
      -> orangefs_inode_setattr
      
      attr_valid and attr_uid and attr_gid change together under i_lock.
      I_DIRTY changes separately.
      
      __orangefs_setattr
      	lock
      	if needs to be cleaned first, unlock and retry
      	set attr_valid
      	copy data in
      	unlock
      	mark_inode_dirty
      
      orangefs_inode_setattr
      	lock
      	copy attributes out
      	unlock
      	clear getattr_time
      	# __writeback_single_inode clears dirty
      
      orangefs_inode_getattr
      	# possible to get here with attr_valid set and not dirty
      	lock
      	if getattr_time ok or attr_valid set, unlock and return
      	unlock
      	do server operation
      	# another thread may getattr or setattr, so check for that
      	lock
      	if getattr_time ok or attr_valid, unlock and return
      	else, copy in
      	update getattr_time
      	unlock
      Signed-off-by: Martin Brandenburg <martin@omnibond.com>
      Signed-off-by: Mike Marshall <hubcap@omnibond.com>
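The getattr outline above checks its condition twice: once before the server operation and once after retaking the lock. A minimal userspace sketch of that pattern follows, with a pthread mutex standing in for i_lock and a boolean standing in for "getattr_time ok". All `sketch_*` names are hypothetical, and the setattr side is simplified (the "needs to be cleaned first" retry is left out).

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified inode state. */
struct sketch_inode {
    pthread_mutex_t lock;
    bool attr_valid;    /* a local setattr is waiting for writeback */
    bool getattr_fresh; /* stands in for "getattr_time ok" */
    long size;
};

/* Stand-in for the server operation; runs without the lock held. */
typedef long (*sketch_fetch_fn)(struct sketch_inode *);

/* Returns true only if attributes were copied in from the server. */
bool sketch_getattr(struct sketch_inode *in, sketch_fetch_fn fetch)
{
    long server_size;

    pthread_mutex_lock(&in->lock);
    if (in->getattr_fresh || in->attr_valid) {
        pthread_mutex_unlock(&in->lock);
        return false;
    }
    pthread_mutex_unlock(&in->lock);

    server_size = fetch(in); /* slow, so the lock is dropped */

    /* Another thread may have run getattr or setattr meanwhile. */
    pthread_mutex_lock(&in->lock);
    if (in->getattr_fresh || in->attr_valid) {
        pthread_mutex_unlock(&in->lock);
        return false; /* keep the newer local state */
    }
    in->size = server_size;
    in->getattr_fresh = true;
    pthread_mutex_unlock(&in->lock);
    return true;
}

/* Simplified setattr: set attr_valid and copy data in under the lock. */
void sketch_setattr_size(struct sketch_inode *in, long size)
{
    pthread_mutex_lock(&in->lock);
    in->attr_valid = true;
    in->size = size;
    pthread_mutex_unlock(&in->lock);
}

/* Fetch helpers: one quiet, one that simulates a racing setattr. */
long sketch_fetch_plain(struct sketch_inode *in)
{
    (void)in;
    return 42;
}

long sketch_fetch_racing(struct sketch_inode *in)
{
    sketch_setattr_size(in, 7);
    return 42;
}
```

With `sketch_fetch_racing`, the second check sees `attr_valid` set and discards the server's answer, so the locally set size survives.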
    • M
      orangefs: simplify orangefs_inode_getattr interface · 8b60785c
      Martin Brandenburg committed
      No need to store the received mask.  It is either STATX_BASIC_STATS or
      STATX_BASIC_STATS & ~STATX_SIZE.  If STATX_SIZE is requested, the cache
      is bypassed anyway, so the cached mask is unnecessary to decide whether
      to do a real getattr.
      
      This is a change.  Previously a getattr would want size and use the
      cached size.  All of the in-kernel callers that wanted size did not want
      a cached size.  Now a getattr cannot use the cached size if it wants
      size at all.
      Signed-off-by: Martin Brandenburg <martin@omnibond.com>
      Signed-off-by: Mike Marshall <hubcap@omnibond.com>
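The decision described above can be sketched with two small helpers. The constant values match STATX_SIZE and STATX_BASIC_STATS in the Linux UAPI header, but they are redefined under `SKETCH_` names so the example builds anywhere; the function names and the `cache_fresh` flag are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same values as STATX_SIZE and STATX_BASIC_STATS in <linux/stat.h>. */
#define SKETCH_STATX_SIZE        0x00000200u
#define SKETCH_STATX_BASIC_STATS 0x000007ffu

/* A request that wants the size always bypasses the attribute cache. */
bool sketch_can_use_cached_attrs(uint32_t request_mask, bool cache_fresh)
{
    if (request_mask & SKETCH_STATX_SIZE)
        return false;
    return cache_fresh;
}

/* Only two masks ever result, so there is nothing worth storing. */
uint32_t sketch_result_mask(uint32_t request_mask)
{
    return (request_mask & SKETCH_STATX_SIZE)
               ? SKETCH_STATX_BASIC_STATS
               : (SKETCH_STATX_BASIC_STATS & ~SKETCH_STATX_SIZE);
}
```

Because the result mask is a pure function of the request, the cached copy of it can be dropped without losing information.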
    • M
      orangefs: implement xattr cache · fc2e2e9c
      Martin Brandenburg committed
      This uses the same timeout as the getattr cache.  This substantially
      increases performance when writing files with smaller buffer sizes.
      
      When writing, the size is (often) changed, which causes a call to
      notify_change which calls security_inode_need_killpriv which needs a
      getxattr.  Caching it reduces traffic to the server.
      Signed-off-by: Martin Brandenburg <martin@omnibond.com>
      Signed-off-by: Mike Marshall <hubcap@omnibond.com>
  4. 06 Jun, 2018 1 commit
    • D
      vfs: change inode times to use struct timespec64 · 95582b00
      Deepa Dinamani committed
      struct timespec is not y2038 safe. Transition vfs to use
      y2038 safe struct timespec64 instead.
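
To illustrate the problem, here is a minimal sketch that assumes a platform where the seconds field is 32 bits wide, as on 32-bit architectures. The `sketch_*` types are hypothetical stand-ins, not the kernel definitions. The largest value a signed 32-bit field can hold is 2147483647, which corresponds to 03:14:07 UTC on 19 January 2038.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 32-bit layout, as struct timespec looks on 32-bit arches. */
struct sketch_timespec32 {
    int32_t tv_sec;
    int32_t tv_nsec;
};

/* Hypothetical 64-bit layout in the spirit of struct timespec64. */
struct sketch_timespec64 {
    int64_t tv_sec;
    long    tv_nsec;
};

/*
 * Narrow a 64-bit timestamp to the 32-bit layout. Returns false when the
 * seconds value cannot be represented, which is the y2038 failure case.
 */
bool sketch_to_timespec32(struct sketch_timespec64 in,
                          struct sketch_timespec32 *out)
{
    assert(out != NULL);
    if (in.tv_sec > INT32_MAX || in.tv_sec < INT32_MIN)
        return false;
    out->tv_sec = (int32_t)in.tv_sec;
    out->tv_nsec = (int32_t)in.tv_nsec;
    return true;
}
```

One second past the limit already fails the conversion, which is why the inode timestamps themselves are widened rather than converted at each use.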
      
      The change was made with the help of the following Coccinelle
      script. This catches about 80% of the changes.
      All the header file and logic changes are included in the
      first 5 rules. The rest are trivial substitutions.
      I avoid changing any of the function signatures or any other
      filesystem specific data structures to keep the patch simple
      for review.
      
      The script could be a little shorter by combining different cases,
      but this version was sufficient for my use case.
      
      virtual patch
      
      @ depends on patch @
      identifier now;
      @@
      - struct timespec
      + struct timespec64
        current_time ( ... )
        {
      - struct timespec now = current_kernel_time();
      + struct timespec64 now = current_kernel_time64();
        ...
      - return timespec_trunc(
      + return timespec64_trunc(
        ... );
        }
      
      @ depends on patch @
      identifier xtime;
      @@
       struct \( iattr \| inode \| kstat \) {
       ...
      -       struct timespec xtime;
      +       struct timespec64 xtime;
       ...
       }
      
      @ depends on patch @
      identifier t;
      @@
       struct inode_operations {
       ...
      int (*update_time) (...,
      -       struct timespec t,
      +       struct timespec64 t,
      ...);
       ...
       }
      
      @ depends on patch @
      identifier t;
      identifier fn_update_time =~ "update_time$";
      @@
       fn_update_time (...,
      - struct timespec *t,
      + struct timespec64 *t,
       ...) { ... }
      
      @ depends on patch @
      identifier t;
      @@
      lease_get_mtime( ... ,
      - struct timespec *t
      + struct timespec64 *t
        ) { ... }
      
      @te depends on patch forall@
      identifier ts;
      local idexpression struct inode *inode_node;
      identifier i_xtime =~ "^i_[acm]time$";
      identifier ia_xtime =~ "^ia_[acm]time$";
      identifier fn_update_time =~ "update_time$";
      identifier fn;
      expression e, E3;
      local idexpression struct inode *node1;
      local idexpression struct inode *node2;
      local idexpression struct iattr *attr1;
      local idexpression struct iattr *attr2;
      local idexpression struct iattr attr;
      identifier i_xtime1 =~ "^i_[acm]time$";
      identifier i_xtime2 =~ "^i_[acm]time$";
      identifier ia_xtime1 =~ "^ia_[acm]time$";
      identifier ia_xtime2 =~ "^ia_[acm]time$";
      @@
      (
      (
      - struct timespec ts;
      + struct timespec64 ts;
      |
      - struct timespec ts = current_time(inode_node);
      + struct timespec64 ts = current_time(inode_node);
      )
      
      <+... when != ts
      (
      - timespec_equal(&inode_node->i_xtime, &ts)
      + timespec64_equal(&inode_node->i_xtime, &ts)
      |
      - timespec_equal(&ts, &inode_node->i_xtime)
      + timespec64_equal(&ts, &inode_node->i_xtime)
      |
      - timespec_compare(&inode_node->i_xtime, &ts)
      + timespec64_compare(&inode_node->i_xtime, &ts)
      |
      - timespec_compare(&ts, &inode_node->i_xtime)
      + timespec64_compare(&ts, &inode_node->i_xtime)
      |
      ts = current_time(e)
      |
      fn_update_time(..., &ts,...)
      |
      inode_node->i_xtime = ts
      |
      node1->i_xtime = ts
      |
      ts = inode_node->i_xtime
      |
      <+... attr1->ia_xtime ...+> = ts
      |
      ts = attr1->ia_xtime
      |
      ts.tv_sec
      |
      ts.tv_nsec
      |
      btrfs_set_stack_timespec_sec(..., ts.tv_sec)
      |
      btrfs_set_stack_timespec_nsec(..., ts.tv_nsec)
      |
      - ts = timespec64_to_timespec(
      + ts =
      ...
      -)
      |
      - ts = ktime_to_timespec(
      + ts = ktime_to_timespec64(
      ...)
      |
      - ts = E3
      + ts = timespec_to_timespec64(E3)
      |
      - ktime_get_real_ts(&ts)
      + ktime_get_real_ts64(&ts)
      |
      fn(...,
      - ts
      + timespec64_to_timespec(ts)
      ,...)
      )
      ...+>
      (
      <... when != ts
      - return ts;
      + return timespec64_to_timespec(ts);
      ...>
      )
      |
      - timespec_equal(&node1->i_xtime1, &node2->i_xtime2)
      + timespec64_equal(&node1->i_xtime1, &node2->i_xtime2)
      |
      - timespec_equal(&node1->i_xtime1, &attr2->ia_xtime2)
      + timespec64_equal(&node1->i_xtime1, &attr2->ia_xtime2)
      |
      - timespec_compare(&node1->i_xtime1, &node2->i_xtime2)
      + timespec64_compare(&node1->i_xtime1, &node2->i_xtime2)
      |
      node1->i_xtime1 =
      - timespec_trunc(attr1->ia_xtime1,
      + timespec64_trunc(attr1->ia_xtime1,
      ...)
      |
      - attr1->ia_xtime1 = timespec_trunc(attr2->ia_xtime2,
      + attr1->ia_xtime1 =  timespec64_trunc(attr2->ia_xtime2,
      ...)
      |
      - ktime_get_real_ts(&attr1->ia_xtime1)
      + ktime_get_real_ts64(&attr1->ia_xtime1)
      |
      - ktime_get_real_ts(&attr.ia_xtime1)
      + ktime_get_real_ts64(&attr.ia_xtime1)
      )
      
      @ depends on patch @
      struct inode *node;
      struct iattr *attr;
      identifier fn;
      identifier i_xtime =~ "^i_[acm]time$";
      identifier ia_xtime =~ "^ia_[acm]time$";
      expression e;
      @@
      (
      - fn(node->i_xtime);
      + fn(timespec64_to_timespec(node->i_xtime));
      |
       fn(...,
      - node->i_xtime);
      + timespec64_to_timespec(node->i_xtime));
      |
      - e = fn(attr->ia_xtime);
      + e = fn(timespec64_to_timespec(attr->ia_xtime));
      )
      
      @ depends on patch forall @
      struct inode *node;
      struct iattr *attr;
      identifier i_xtime =~ "^i_[acm]time$";
      identifier ia_xtime =~ "^ia_[acm]time$";
      identifier fn;
      @@
      {
      + struct timespec ts;
      <+...
      (
      + ts = timespec64_to_timespec(node->i_xtime);
      fn (...,
      - &node->i_xtime,
      + &ts,
      ...);
      |
      + ts = timespec64_to_timespec(attr->ia_xtime);
      fn (...,
      - &attr->ia_xtime,
      + &ts,
      ...);
      )
      ...+>
      }
      
      @ depends on patch forall @
      struct inode *node;
      struct iattr *attr;
      struct kstat *stat;
      identifier ia_xtime =~ "^ia_[acm]time$";
      identifier i_xtime =~ "^i_[acm]time$";
      identifier xtime =~ "^[acm]time$";
      identifier fn, ret;
      @@
      {
      + struct timespec ts;
      <+...
      (
      + ts = timespec64_to_timespec(node->i_xtime);
      ret = fn (...,
      - &node->i_xtime,
      + &ts,
      ...);
      |
      + ts = timespec64_to_timespec(node->i_xtime);
      ret = fn (...,
      - &node->i_xtime);
      + &ts);
      |
      + ts = timespec64_to_timespec(attr->ia_xtime);
      ret = fn (...,
      - &attr->ia_xtime,
      + &ts,
      ...);
      |
      + ts = timespec64_to_timespec(attr->ia_xtime);
      ret = fn (...,
      - &attr->ia_xtime);
      + &ts);
      |
      + ts = timespec64_to_timespec(stat->xtime);
      ret = fn (...,
      - &stat->xtime);
      + &ts);
      )
      ...+>
      }
      
      @ depends on patch @
      struct inode *node;
      struct inode *node2;
      identifier i_xtime1 =~ "^i_[acm]time$";
      identifier i_xtime2 =~ "^i_[acm]time$";
      identifier i_xtime3 =~ "^i_[acm]time$";
      struct iattr *attrp;
      struct iattr *attrp2;
      struct iattr attr ;
      identifier ia_xtime1 =~ "^ia_[acm]time$";
      identifier ia_xtime2 =~ "^ia_[acm]time$";
      struct kstat *stat;
      struct kstat stat1;
      struct timespec64 ts;
      identifier xtime =~ "^[acmb]time$";
      expression e;
      @@
      (
      \( node->i_xtime2 \| attrp->ia_xtime2 \| attr.ia_xtime2 \) = node->i_xtime1  ;
      |
       node->i_xtime2 = \( node2->i_xtime1 \| timespec64_trunc(...) \);
      |
       node->i_xtime2 = node->i_xtime1 = node->i_xtime3 = \(ts \| current_time(...) \);
      |
       node->i_xtime1 = node->i_xtime3 = \(ts \| current_time(...) \);
      |
       stat->xtime = node2->i_xtime1;
      |
       stat1.xtime = node2->i_xtime1;
      |
      \( node->i_xtime2 \| attrp->ia_xtime2 \) = attrp->ia_xtime1  ;
      |
      \( attrp->ia_xtime1 \| attr.ia_xtime1 \) = attrp2->ia_xtime2;
      |
      - e = node->i_xtime1;
      + e = timespec64_to_timespec( node->i_xtime1 );
      |
      - e = attrp->ia_xtime1;
      + e = timespec64_to_timespec( attrp->ia_xtime1 );
      |
      node->i_xtime1 = current_time(...);
      |
       node->i_xtime2 = node->i_xtime1 = node->i_xtime3 =
      - e;
      + timespec_to_timespec64(e);
      |
       node->i_xtime1 = node->i_xtime3 =
      - e;
      + timespec_to_timespec64(e);
      |
      - node->i_xtime1 = e;
      + node->i_xtime1 = timespec_to_timespec64(e);
      )
      Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
      Cc: <anton@tuxera.com>
      Cc: <balbi@kernel.org>
      Cc: <bfields@fieldses.org>
      Cc: <darrick.wong@oracle.com>
      Cc: <dhowells@redhat.com>
      Cc: <dsterba@suse.com>
      Cc: <dwmw2@infradead.org>
      Cc: <hch@lst.de>
      Cc: <hirofumi@mail.parknet.co.jp>
      Cc: <hubcap@omnibond.com>
      Cc: <jack@suse.com>
      Cc: <jaegeuk@kernel.org>
      Cc: <jaharkes@cs.cmu.edu>
      Cc: <jslaby@suse.com>
      Cc: <keescook@chromium.org>
      Cc: <mark@fasheh.com>
      Cc: <miklos@szeredi.hu>
      Cc: <nico@linaro.org>
      Cc: <reiserfs-devel@vger.kernel.org>
      Cc: <richard@nod.at>
      Cc: <sage@redhat.com>
      Cc: <sfrench@samba.org>
      Cc: <swhiteho@redhat.com>
      Cc: <tj@kernel.org>
      Cc: <trond.myklebust@primarydata.com>
      Cc: <tytso@mit.edu>
      Cc: <viro@zeniv.linux.org.uk>
  5. 02 Jun, 2018 1 commit
    • M
      orangefs: revamp block sizes · 9f8fd53c
      Martin Brandenburg committed
      Now the superblock block size is PAGE_SIZE.  The inode block size is
      PAGE_SIZE for directories and symlinks, but is the server-reported
      block size for regular files.
      
      The block size in the OrangeFS private inode is now deleted.  Stat
      now reports PAGE_SIZE for directories and symlinks and the
      server-reported block size for regular files.
      
      The user-space visible change is that the block size for directories
      and symlinks and the superblock is now PAGE_SIZE rather than the size of
      the client-core shared memory buffers, which was typically four
      megabytes.
      Reported-by: Becky Ligon <ligon@clemson.edu>
      Signed-off-by: Martin Brandenburg <martin@omnibond.com>
      Cc: hubcap@omnibond.com
      Cc: walt@omnibond.com
      Signed-off-by: Mike Marshall <hubcap@omnibond.com>
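The block size rule above reduces to a single branch on the file type. A minimal sketch, assuming a hypothetical helper `sketch_blksize` whose caller supplies both the page size and the server-reported value; treating every non-regular file like a directory or symlink is an assumption of the sketch.

```c
#define _XOPEN_SOURCE 700 /* for the S_IF* file type bits */
#include <assert.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Block size that stat would report: the server-reported size for regular
 * files, the page size for directories and symlinks.
 */
unsigned long sketch_blksize(mode_t mode, unsigned long page_size,
                             unsigned long server_blksize)
{
    assert(page_size > 0);
    return S_ISREG(mode) ? server_blksize : page_size;
}
```

For example, with a 4096-byte page and a server block size of 65536, a regular file reports 65536 while a directory reports 4096 instead of the four-megabyte shared buffer size.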
  6. 04 Apr, 2018 2 commits
  7. 07 Feb, 2018 2 commits
  8. 26 Jan, 2018 1 commit
  9. 14 Nov, 2017 1 commit
    • M
      orangefs: stop setting atime on inode dirty · a55f2d86
      Martin Brandenburg committed
      The previous code path was to mark the inode dirty, let
      orangefs_inode_dirty set a flag in our private inode, then later during
      inode release call orangefs_flush_inode which notices the flag and
      writes the atime out.
      
      The code path worked almost identically for mtime, ctime, and mode
      except that those flags are set explicitly and not as side effects of
      dirty.
      
      Now orangefs_flush_inode is removed.  Marking an inode dirty does not
      imply an atime update.  Any place where flags were set before is now
      an explicit call to orangefs_inode_setattr.  Since OrangeFS does not
      utilize inode writeback, the attribute change should be written out
      immediately.
      
      Fixes generic/120.
      
      In namei.c, there are several places where the directory mtime and ctime
      are set, but only the mtime is sent to the server.  These don't seem
      right, but I've left them as is for now.
      Signed-off-by: Martin Brandenburg <martin@omnibond.com>
      Signed-off-by: Mike Marshall <hubcap@omnibond.com>
  10. 02 Nov, 2017 1 commit
    • G
      License cleanup: add SPDX GPL-2.0 license identifier to files with no license · b2441318
      Greg Kroah-Hartman committed
      Many source files in the tree are missing licensing information, which
      makes it harder for compliance tools to determine the correct license.
      
      By default all files without license information are under the default
      license of the kernel, which is GPL version 2.
      
      Update the files which contain no license information with the 'GPL-2.0'
      SPDX license identifier.  The SPDX identifier is a legally binding
      shorthand, which can be used instead of the full boiler plate text.
      
      This patch is based on work done by Thomas Gleixner and Kate Stewart and
      Philippe Ombredanne.
      
      How this work was done:
      
      Patches were generated and checked against linux-4.14-rc6 for a subset of
      the use cases:
       - file had no licensing information in it,
       - file was a */uapi/* one with no licensing information in it,
       - file was a */uapi/* one with existing licensing information,
      
      Further patches will be generated in subsequent months to fix up cases
      where non-standard license headers were used, and references to license
      had to be inferred by heuristics based on keywords.
      
      The analysis to determine which SPDX License Identifier to be applied to
      a file was done in a spreadsheet of side by side results from the
      output of two independent scanners (ScanCode & Windriver) producing SPDX
      tag:value files created by Philippe Ombredanne.  Philippe prepared the
      base worksheet, and did an initial spot review of a few 1000 files.
      
      The 4.13 kernel was the starting point of the analysis with 60,537 files
      assessed.  Kate Stewart did a file by file comparison of the scanner
      results in the spreadsheet to determine which SPDX license identifier(s)
      to be applied to the file. She confirmed any determination that was not
      immediately clear with lawyers working with the Linux Foundation.
      
      Criteria used to select files for SPDX license identifier tagging were:
       - Files considered eligible had to be source code files.
       - Make and config files were included as candidates if they contained >5
         lines of source
       - File already had some variant of a license header in it (even if <5
         lines).
      
      All documentation files were explicitly excluded.
      
      The following heuristics were used to determine which SPDX license
      identifiers to apply.
      
       - when both scanners couldn't find any license traces, file was
         considered to have no license information in it, and the top level
         COPYING file license applied.
      
         For non */uapi/* files that summary was:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|-------
         GPL-2.0                                              11139
      
         and resulted in the first patch in this series.
      
         If that file was a */uapi/* path one, it was "GPL-2.0 WITH
         Linux-syscall-note" otherwise it was "GPL-2.0".  Results of that was:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|-------
         GPL-2.0 WITH Linux-syscall-note                        930
      
         and resulted in the second patch in this series.
      
       - if a file had some form of licensing information in it, and was one
         of the */uapi/* ones, it was denoted with the Linux-syscall-note if
         any GPL family license was found in the file or had no licensing in
         it (per prior point).  Results summary:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|------
         GPL-2.0 WITH Linux-syscall-note                       270
         GPL-2.0+ WITH Linux-syscall-note                      169
         ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
         ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
         LGPL-2.1+ WITH Linux-syscall-note                      15
         GPL-1.0+ WITH Linux-syscall-note                       14
         ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
         LGPL-2.0+ WITH Linux-syscall-note                       4
         LGPL-2.1 WITH Linux-syscall-note                        3
         ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
         ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1
      
         and that resulted in the third patch in this series.
      
       - when the two scanners agreed on the detected license(s), that became
         the concluded license(s).
      
       - when there was disagreement between the two scanners (one detected a
         license but the other didn't, or they both detected different
         licenses) a manual inspection of the file occurred.
      
       - In most cases a manual inspection of the information in the file
         resulted in a clear resolution of the license that should apply (and
         which scanner probably needed to revisit its heuristics).
      
       - When it was not immediately clear, the license identifier was
         confirmed with lawyers working with the Linux Foundation.
      
       - If there was any question as to the appropriate license identifier,
         the file was flagged for further research and to be revisited later
         in time.
      
      In total, over 70 hours of logged manual review was done on the
      spreadsheet to determine the SPDX license identifiers to apply to the
      source files by Kate, Philippe, Thomas and, in some cases, confirmation
      by lawyers working with the Linux Foundation.
      
      Kate also obtained a third independent scan of the 4.13 code base from
      FOSSology, and compared selected files where the other two scanners
      disagreed against that SPDX file, to see if there were new insights.  The
      Windriver scanner is based on an older version of FOSSology in part, so
      they are related.
      
      Thomas did random spot checks in about 500 files from the spreadsheets
      for the uapi headers and agreed with SPDX license identifier in the
      files he inspected. For the non-uapi files Thomas did random spot checks
      in about 15000 files.
      
      In the initial set of patches against 4.14-rc6, 3 files were found to have
      copy/paste license identifier errors, and have been fixed to reflect the
      correct identifier.
      
      Additionally Philippe spent 10 hours this week doing a detailed manual
      inspection and review of the 12,461 patched files from the initial patch
      version early this week with:
       - a full scancode scan run, collecting the matched texts, detected
         license ids and scores
       - reviewing anything where there was a license detected (about 500+
         files) to ensure that the applied SPDX license was correct
       - reviewing anything where there was no detection but the patch license
         was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
         SPDX license was correct
      
      This produced a worksheet with 20 files needing minor correction.  This
      worksheet was then exported into 3 different .csv files for the
      different types of files to be modified.
      
      These .csv files were then reviewed by Greg.  Thomas wrote a script to
      parse the csv files and add the proper SPDX tag to the file, in the
      format that the file expected.  This script was further refined by Greg
      based on the output to detect more types of files automatically and to
      distinguish between header and source .c files (which need different
      comment types.)  Finally Greg ran the script using the .csv files to
      generate the patches.
      Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
      Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  11. 12 Oct, 2017 1 commit
  12. 27 Apr, 2017 2 commits
  13. 16 Apr, 2017 1 commit
    • M
      orangefs: free superblock when mount fails · 1ec1688c
      Martin Brandenburg committed
      Otherwise lockdep says:
      
      [ 1337.483798] ================================================
      [ 1337.483999] [ BUG: lock held when returning to user space! ]
      [ 1337.484252] 4.11.0-rc6 #19 Not tainted
      [ 1337.484423] ------------------------------------------------
      [ 1337.484626] mount/14766 is leaving the kernel with locks still held!
      [ 1337.484841] 1 lock held by mount/14766:
      [ 1337.485017]  #0:  (&type->s_umount_key#33/1){+.+.+.}, at: [<ffffffff8124171f>] sget_userns+0x2af/0x520
      
      Caught by xfstests generic/413 which tried to mount with the unsupported
      mount option dax.  Then xfstests generic/422 ran sync which deadlocks.
      Signed-off-by: Martin Brandenburg <martin@omnibond.com>
      Acked-by: Mike Marshall <hubcap@omnibond.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  14. 03 Mar, 2017 1 commit
    • D
      statx: Add a system call to make enhanced file info available · a528d35e
      David Howells committed
      Add a system call to make extended file information available, including
      file creation and some attribute flags where available through the
      underlying filesystem.
      
      The getattr inode operation is altered to take two additional arguments: a
      u32 request_mask and an unsigned int flags that indicate the
      synchronisation mode.  This change is propagated to the vfs_getattr*()
      function.
      
      Functions like vfs_stat() are now inline wrappers around new functions
      vfs_statx() and vfs_statx_fd() to reduce stack usage.
      
      ========
      OVERVIEW
      ========
      
      The idea was initially proposed as a set of xattrs that could be retrieved
      with getxattr(), but the general preference proved to be for a new syscall
      with an extended stat structure.
      
      A number of requests were gathered for features to be included.  The
      following have been included:
      
       (1) Make the fields a consistent size on all arches and make them large.
      
       (2) Spare space, request flags and information flags are provided for
           future expansion.
      
       (3) Better support for the y2038 problem [Arnd Bergmann] (tv_sec is an
           __s64).
      
       (4) Creation time: The SMB protocol carries the creation time, which could
           be exported by Samba, which will in turn help CIFS make use of
           FS-Cache as that can be used for coherency data (stx_btime).
      
           This is also specified in NFSv4 as a recommended attribute and could
           be exported by NFSD [Steve French].
      
       (5) Lightweight stat: Ask for just those details of interest, and allow a
           netfs (such as NFS) to approximate anything not of interest, possibly
           without going to the server [Trond Myklebust, Ulrich Drepper, Andreas
           Dilger] (AT_STATX_DONT_SYNC).
      
       (6) Heavyweight stat: Force a netfs to go to the server, even if it thinks
           its cached attributes are up to date [Trond Myklebust]
           (AT_STATX_FORCE_SYNC).
      
      And the following have been left out for future extension:
      
       (7) Data version number: Could be used by userspace NFS servers [Aneesh
           Kumar].
      
           Can also be used to modify fill_post_wcc() in NFSD which retrieves
           i_version directly, but has just called vfs_getattr().  It could get
           it from the kstat struct if it used vfs_xgetattr() instead.
      
           (There's disagreement on the exact semantics of a single field, since
           not all filesystems do this the same way).
      
       (8) BSD stat compatibility: Including more fields from the BSD stat such
           as creation time (st_btime) and inode generation number (st_gen)
           [Jeremy Allison, Bernd Schubert].
      
       (9) Inode generation number: Useful for FUSE and userspace NFS servers
           [Bernd Schubert].
      
           (This was asked for but later deemed unnecessary with the
           open-by-handle capability available and caused disagreement as to
           whether it's a security hole or not).
      
      (10) Extra coherency data may be useful in making backups [Andreas Dilger].
      
           (No particular data were offered, but things like last backup
           timestamp, the data version number and the DOS archive bit would come
           into this category).
      
      (11) Allow the filesystem to indicate what it can/cannot provide: A
           filesystem can now say it doesn't support a standard stat feature if
           that isn't available, so if, for instance, inode numbers or UIDs don't
           exist or are fabricated locally...
      
           (This requires a separate system call - I have an fsinfo() call idea
           for this).
      
      (12) Store a 16-byte volume ID in the superblock that can be returned in
           struct xstat [Steve French].
      
           (Deferred to fsinfo).
      
      (13) Include granularity fields in the time data to indicate the
           granularity of each of the times (NFSv4 time_delta) [Steve French].
      
           (Deferred to fsinfo).
      
      (14) FS_IOC_GETFLAGS value.  These could be translated to BSD's st_flags.
           Note that the Linux IOC flags are a mess and filesystems such as Ext4
           define flags that aren't in linux/fs.h, so translation in the kernel
           may be a necessity (or, possibly, we provide the filesystem type too).
      
           (Some attributes are made available in stx_attributes, but the general
            feeling was that the IOC flags were too ext[234]-specific and shouldn't
           be exposed through statx this way).
      
      (15) Mask of features available on file (eg: ACLs, seclabel) [Brad Boyer,
           Michael Kerrisk].
      
           (Deferred, probably to fsinfo.  Finding out if there's an ACL or
            seclabel might require extra filesystem operations).
      
      (16) Femtosecond-resolution timestamps [Dave Chinner].
      
           (A __reserved field has been left in the statx_timestamp struct for
           this - if there proves to be a need).
      
      (17) A set multiple attributes syscall to go with this.
      
      ===============
      NEW SYSTEM CALL
      ===============
      
      The new system call is:
      
      	int ret = statx(int dfd,
      			const char *filename,
      			unsigned int flags,
      			unsigned int mask,
      			struct statx *buffer);
      
      The dfd, filename and flags parameters indicate the file to query, in a
      similar way to fstatat().  There is no equivalent of lstat() as that can be
      emulated with statx() by passing AT_SYMLINK_NOFOLLOW in flags.  There is
      also no equivalent of fstat() as that can be emulated by passing a NULL
      filename to statx() with the fd of interest in dfd.
      
      Whether or not statx() synchronises the attributes with the backing store
      can be controlled by OR'ing a value into the flags argument (this typically
      only affects network filesystems):
      
       (1) AT_STATX_SYNC_AS_STAT tells statx() to behave as stat() does in this
           respect.
      
       (2) AT_STATX_FORCE_SYNC will require a network filesystem to synchronise
           its attributes with the server - which might require data writeback to
           occur to get the timestamps correct.
      
       (3) AT_STATX_DONT_SYNC will suppress synchronisation with the server in a
           network filesystem.  The resulting values should be considered
           approximate.
      
      mask is a bitmask indicating the fields in struct statx that are of
      interest to the caller.  The user should set this to STATX_BASIC_STATS to
      get the basic set returned by stat().  It should be noted that asking for
      more information may entail extra I/O operations.
      
      buffer points to the destination for the data.  This must be 256 bytes in
      size.
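
      A minimal userspace sketch of the call described above.  It assumes a
      C library that exposes a statx() wrapper and the STATX_*/AT_STATX_*
      constants through <sys/stat.h> and <fcntl.h> (glibc 2.28 or later
      does); query_basic() is a hypothetical helper name, not something
      this patch adds.  It emulates lstat() by passing AT_SYMLINK_NOFOLLOW.

```c
#define _GNU_SOURCE
#include <assert.h>     /* static_assert */
#include <errno.h>      /* errno, for callers inspecting a failure */
#include <fcntl.h>      /* AT_FDCWD, AT_SYMLINK_NOFOLLOW, AT_STATX_* */
#include <string.h>     /* memset */
#include <sys/stat.h>   /* statx(), struct statx, STATX_* (assumed: glibc >= 2.28) */

/* The text above says the destination buffer must be 256 bytes. */
static_assert(sizeof(struct statx) == 256, "struct statx should be 256 bytes");

/*
 * Hypothetical helper: an lstat()-like query through statx().
 * Returns 0 on success, or -1 with errno set (for example ENOSYS on a
 * kernel without the system call).
 */
int query_basic(const char *path, struct statx *out)
{
	memset(out, 0, sizeof(*out));
	return statx(AT_FDCWD, path,
		     AT_SYMLINK_NOFOLLOW | AT_STATX_SYNC_AS_STAT,
		     STATX_BASIC_STATS, out);
}
```

      Replacing AT_STATX_SYNC_AS_STAT with AT_STATX_DONT_SYNC or
      AT_STATX_FORCE_SYNC selects the lightweight or heavyweight behaviour
      on a network filesystem.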
      
      ======================
      MAIN ATTRIBUTES RECORD
      ======================
      
      The following structures are defined in which to return the main attribute
      set:
      
      	struct statx_timestamp {
      		__s64	tv_sec;
      		__s32	tv_nsec;
      		__s32	__reserved;
      	};
      
      	struct statx {
      		__u32	stx_mask;
      		__u32	stx_blksize;
      		__u64	stx_attributes;
      		__u32	stx_nlink;
      		__u32	stx_uid;
      		__u32	stx_gid;
      		__u16	stx_mode;
      		__u16	__spare0[1];
      		__u64	stx_ino;
      		__u64	stx_size;
      		__u64	stx_blocks;
      		__u64	__spare1[1];
      		struct statx_timestamp	stx_atime;
      		struct statx_timestamp	stx_btime;
      		struct statx_timestamp	stx_ctime;
      		struct statx_timestamp	stx_mtime;
      		__u32	stx_rdev_major;
      		__u32	stx_rdev_minor;
      		__u32	stx_dev_major;
      		__u32	stx_dev_minor;
      		__u64	__spare2[14];
      	};
      
      The defined bits in request_mask and stx_mask are:
      
      	STATX_TYPE		Want/got stx_mode & S_IFMT
      	STATX_MODE		Want/got stx_mode & ~S_IFMT
      	STATX_NLINK		Want/got stx_nlink
      	STATX_UID		Want/got stx_uid
      	STATX_GID		Want/got stx_gid
      	STATX_ATIME		Want/got stx_atime{,_ns}
      	STATX_MTIME		Want/got stx_mtime{,_ns}
      	STATX_CTIME		Want/got stx_ctime{,_ns}
      	STATX_INO		Want/got stx_ino
      	STATX_SIZE		Want/got stx_size
      	STATX_BLOCKS		Want/got stx_blocks
      	STATX_BASIC_STATS	[The stuff in the normal stat struct]
      	STATX_BTIME		Want/got stx_btime{,_ns}
      	STATX_ALL		[All currently available stuff]
      
      stx_btime is the file creation time, stx_mask is a bitmask indicating the
      data provided and __spare*[] are where as-yet undefined fields can be
      placed.
      
      Time fields are structures with separate seconds and nanoseconds fields
      plus a reserved field in case we want to add even finer resolution.  Note
      that times will be negative if before 1970; in such a case, the nanosecond
      fields will also be negative if not zero.
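
      The sign convention can be illustrated with a small sketch.  The
      struct below is a local mirror of the statx_timestamp layout quoted
      in this message (an assumption: an installed header may declare the
      fields differently), and ts_to_ns() is a hypothetical helper.  Because
      the seconds and nanoseconds share a sign, a plain sum gives the
      offset from the epoch.

```c
#include <stdint.h>

/* Local mirror of the layout quoted above; not the installed uapi type. */
struct ts_sketch {
	int64_t tv_sec;
	int32_t tv_nsec;
	int32_t reserved;
};

/*
 * Collapse a timestamp into signed nanoseconds since the epoch.
 * Under the convention described above, tv_nsec is negative (or zero)
 * whenever the time is before 1970, so no borrow step is needed.
 */
int64_t ts_to_ns(struct ts_sketch ts)
{
	return ts.tv_sec * INT64_C(1000000000) + ts.tv_nsec;
}
```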
      
      The bits defined in the stx_attributes field convey information about a
      file, how it is accessed, where it is and what it does.  The following
      attributes map to FS_*_FL flags and are the same numerical value:
      
      	STATX_ATTR_COMPRESSED		File is compressed by the fs
      	STATX_ATTR_IMMUTABLE		File is marked immutable
      	STATX_ATTR_APPEND		File is append-only
      	STATX_ATTR_NODUMP		File is not to be dumped
      	STATX_ATTR_ENCRYPTED		File requires key to decrypt in fs
      
      Within the kernel, the supported flags are listed by:
      
      	KSTAT_ATTR_FS_IOC_FLAGS
      
      [Are any other IOC flags of sufficient general interest to be exposed
      through this interface?]
      
      New flags include:
      
      	STATX_ATTR_AUTOMOUNT		Object is an automount trigger
      
      These are for the use of GUI tools that might want to mark files specially,
      depending on what they are.
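
      As a rough illustration of how a tool might consume stx_attributes,
      the sketch below turns the bits listed above into names.  It assumes
      the STATX_ATTR_* constants are visible through <sys/stat.h>;
      describe_attrs() and the lower-case names are hypothetical.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/stat.h>   /* STATX_ATTR_* (assumed visible via the C library) */

/*
 * Hypothetical helper: write a comma-separated list of the attribute
 * names set in attrs into buf (at most len bytes including the NUL)
 * and return how many of the listed attributes were set.
 */
size_t describe_attrs(uint64_t attrs, char *buf, size_t len)
{
	static const struct {
		uint64_t bit;
		const char *name;
	} table[] = {
		{ STATX_ATTR_COMPRESSED, "compressed" },
		{ STATX_ATTR_IMMUTABLE,  "immutable"  },
		{ STATX_ATTR_APPEND,     "append"     },
		{ STATX_ATTR_NODUMP,     "nodump"     },
		{ STATX_ATTR_ENCRYPTED,  "encrypted"  },
		{ STATX_ATTR_AUTOMOUNT,  "automount"  },
	};
	size_t i, n = 0;

	if (len == 0)
		return 0;
	buf[0] = '\0';
	for (i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
		if (!(attrs & table[i].bit))
			continue;
		if (n > 0)
			strncat(buf, ",", len - strlen(buf) - 1);
		strncat(buf, table[i].name, len - strlen(buf) - 1);
		n++;
	}
	return n;
}
```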
      
      Fields in struct statx come in a number of classes:
      
       (0) stx_dev_*, stx_blksize.
      
           These are local system information and are always available.
      
       (1) stx_mode, stx_nlink, stx_uid, stx_gid, stx_[amc]time, stx_ino,
           stx_size, stx_blocks.
      
           These will be returned whether the caller asks for them or not.  The
           corresponding bits in stx_mask will be set to indicate whether they
           actually have valid values.
      
           If the caller didn't ask for them, then they may be approximated.  For
           example, NFS won't waste any time updating them from the server,
           unless as a byproduct of updating something requested.
      
           If the values don't actually exist for the underlying object (such as
           UID or GID on a DOS file), then the bit won't be set in the stx_mask,
           even if the caller asked for the value.  In such a case, the returned
           value will be a fabrication.
      
           Note that there are instances where the type might not be valid, for
           instance Windows reparse points.
      
       (2) stx_rdev_*.
      
           This will be set only if stx_mode indicates we're looking at a
           blockdev or a chardev, otherwise will be 0.
      
       (3) stx_btime.
      
           Similar to (1), except this will be set to 0 if it doesn't exist.
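
      The classes above suggest that a caller should consult stx_mask
      before trusting a field.  A minimal sketch, assuming the glibc
      definitions of struct statx and makedev(); field_valid(),
      get_btime_sec() and rdev_of() are hypothetical helper names.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdbool.h>
#include <sys/stat.h>       /* struct statx, STATX_*, S_ISBLK, S_ISCHR */
#include <sys/sysmacros.h>  /* makedev() */

/* True if every requested bit was reported as valid in stx_mask. */
bool field_valid(const struct statx *stx, unsigned int bits)
{
	return (stx->stx_mask & bits) == bits;
}

/* Class (3): only hand back the creation time if the fs provided it. */
bool get_btime_sec(const struct statx *stx, long long *sec)
{
	if (!field_valid(stx, STATX_BTIME))
		return false;
	*sec = stx->stx_btime.tv_sec;
	return true;
}

/* Class (2): stx_rdev_* is meaningful only for block or char devices. */
dev_t rdev_of(const struct statx *stx)
{
	if (!field_valid(stx, STATX_TYPE))
		return 0;
	if (!S_ISBLK(stx->stx_mode) && !S_ISCHR(stx->stx_mode))
		return 0;
	return makedev(stx->stx_rdev_major, stx->stx_rdev_minor);
}
```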
      
      =======
      TESTING
      =======
      
      The following test program can be used to test the statx system call:
      
      	samples/statx/test-statx.c
      
      Just compile and run, passing it paths to the files you want to examine.
      The file is built automatically if CONFIG_SAMPLES is enabled.
      
      Here's some example output.  Firstly, an NFS directory that crosses to
      another FSID.  Note that the AUTOMOUNT attribute is set because transiting
      this directory will cause d_automount to be invoked by the VFS.
      
      	[root@andromeda ~]# /tmp/test-statx -A /warthog/data
      	statx(/warthog/data) = 0
      	results=7ff
      	  Size: 4096            Blocks: 8          IO Block: 1048576  directory
      	Device: 00:26           Inode: 1703937     Links: 125
      	Access: (3777/drwxrwxrwx)  Uid:     0   Gid:  4041
      	Access: 2016-11-24 09:02:12.219699527+0000
      	Modify: 2016-11-17 10:44:36.225653653+0000
      	Change: 2016-11-17 10:44:36.225653653+0000
      	Attributes: 0000000000001000 (-------- -------- -------- -------- -------- -------- ---m---- --------)
      
      Secondly, the result of automounting on that directory.
      
      	[root@andromeda ~]# /tmp/test-statx /warthog/data
      	statx(/warthog/data) = 0
      	results=7ff
      	  Size: 4096            Blocks: 8          IO Block: 1048576  directory
      	Device: 00:27           Inode: 2           Links: 125
      	Access: (3777/drwxrwxrwx)  Uid:     0   Gid:  4041
      	Access: 2016-11-24 09:02:12.219699527+0000
      	Modify: 2016-11-17 10:44:36.225653653+0000
      	Change: 2016-11-17 10:44:36.225653653+0000
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      a528d35e
  15. 02 Mar, 2017 1 commit
  16. 04 Feb, 2017 1 commit
  17. 25 Oct, 2016 1 commit
  18. 16 Aug, 2016 3 commits
  19. 15 Aug, 2016 1 commit
  20. 13 Aug, 2016 1 commit
    • M
      orangefs: add features op · 482664dd
      Martin Brandenburg authored
      This is a new userspace operation, which will be done if the client-core
      version is greater than or equal to 2.9.6. This will provide a way to
      implement optional features and to determine which features are
      supported by the client-core. If the client-core version is older than
      2.9.6, no optional features are supported and the op will not be done.
      
      The intent is to allow protocol extensions without relying on the
      client-core's current behavior of ignoring what it doesn't understand.
      Signed-off-by: Martin Brandenburg <martin@omnibond.com>
      482664dd
  21. 10 Aug, 2016 1 commit
  22. 03 Aug, 2016 2 commits
  23. 06 Jul, 2016 3 commits
    • J
      orangefs: fix namespace handling · 78fee0b6
      Jann Horn authored
      In orangefs_inode_getxattr(), an fsuid is written to dmesg. The kuid is
      converted to a userspace uid via from_kuid(current_user_ns(), [...]), but
      since dmesg is global, init_user_ns should be used here instead.
      
      In copy_attributes_from_inode(), op_alloc() and fill_default_sys_attrs(),
      upcall structures are populated with uids/gids that have been mapped into
      the caller's namespace. However, those upcall structures are read by
      another process (the userspace filesystem driver), and that process might
      be running in another namespace. This effectively lets any user spoof its
      uid and gid as seen by the userspace filesystem driver.
      
      To fix the second issue, I just construct the upcall structures with
      init_user_ns uids/gids and require the filesystem server to run in the
      init namespace. Since orangefs is full of global state anyway (as the error
      message in DUMP_DEVICE_ERROR explains, there can only be one userspace
      orangefs filesystem driver at once), that shouldn't be a problem.
      
      [
      Why does orangefs even exist in the kernel if everything does upcalls into
      userspace? What does orangefs do that couldn't be done with the FUSE
      interface? If there is no good answer to those questions, I'd prefer to see
      orangefs kicked out of the kernel. Can that be done for something that
      shipped in a release?
      
      According to commit f7ab093f ("Orangefs: kernel client part 1"), they
      even already have a FUSE daemon, and the only rational reason (apart from
      "but most of our users report preferring to use our kernel module instead")
      given for not wanting to use FUSE is one "in-the-works" feature that could
      probably be integrated into FUSE instead.
      ]
      
      This patch has been compile-tested.
      Signed-off-by: Jann Horn <jannh@google.com>
      Signed-off-by: Mike Marshall <hubcap@omnibond.com>
      78fee0b6
    • A
      orangefs: Remove useless xattr prefix arguments · d373a712
      Andreas Gruenbacher authored
      Mike,
      
      On Fri, Jun 3, 2016 at 9:44 PM, Mike Marshall <hubcap@omnibond.com> wrote:
      > We use the return value in this one line you changed, our userspace code gets
      > ill when we send it (-ENOMEM +1) as a key length...
      
      ah, my mistake.  Here's a fixed version.
      
      Thanks,
      Andreas
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
      Signed-off-by: Mike Marshall <hubcap@omnibond.com>
      d373a712
    • A
      orangefs: Remove useless defines · 972a7344
      Andreas Gruenbacher authored
      The ORANGEFS_XATTR_INDEX_ defines are unused; the ORANGEFS_XATTR_NAME_
      defines only obfuscate the code.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
      Signed-off-by: Mike Marshall <hubcap@omnibond.com>
      972a7344
  24. 30 May, 2016 1 commit