1. 29 5月, 2016 4 次提交
    • G
      Rename other copy of hash_string to hashlen_string · f2a031b6
      George Spelvin 提交于
      The original name was simply hash_string(), but that conflicted with a
      function with that name in drivers/base/power/trace.c, and I decided
      that calling it "hashlen_" was better anyway.
      
      But you have to do it in two places.
      
      [ This caused build errors for architectures that don't define
        CONFIG_DCACHE_WORD_ACCESS   - Linus ]
      Signed-off-by: NGeorge Spelvin <linux@sciencehorizons.net>
      Reported-by: NGuenter Roeck <linux@roeck-us.net>
      Fixes: fcfd2fbf ("fs/namei.c: Add hashlen_string() function")
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f2a031b6
    • G
      <linux/hash.h>: Add support for architecture-specific functions · 468a9428
      George Spelvin 提交于
      This is just the infrastructure; there are no users yet.
      
      This is modelled on CONFIG_ARCH_RANDOM; a CONFIG_ symbol declares
      the existence of <asm/hash.h>.
      
      That file may define its own versions of various functions, and define
      HAVE_* symbols (no CONFIG_ prefix!) to suppress the generic ones.
      
      Included is a self-test (in lib/test_hash.c) that verifies the basics.
      It is NOT in general required that the arch-specific functions compute
      the same thing as the generic, but if a HAVE_* symbol is defined with
      the value 1, then equality is tested.
      Signed-off-by: NGeorge Spelvin <linux@sciencehorizons.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greg Ungerer <gerg@linux-m68k.org>
      Cc: Andreas Schwab <schwab@linux-m68k.org>
      Cc: Philippe De Muyter <phdm@macq.eu>
      Cc: linux-m68k@lists.linux-m68k.org
      Cc: Alistair Francis <alistai@xilinx.com>
      Cc: Michal Simek <michal.simek@xilinx.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: uclinux-h8-devel@lists.sourceforge.jp
      468a9428
    • G
      fs/namei.c: Improve dcache hash function · 2a18da7a
      George Spelvin 提交于
      Patch 0fed3ac8 improved the hash mixing, but the function is slower
      than necessary; there's a 7-instruction dependency chain (10 on x86)
      each loop iteration.
      
      Word-at-a-time access is a very tight loop (which is good, because
      link_path_walk() is one of the hottest code paths in the entire kernel),
      and the hash mixing function must not have a longer latency to avoid
      slowing it down.
      
      There do not appear to be any published fast hash functions that:
      1) Operate on the input a word at a time, and
      2) Don't need to know the length of the input beforehand, and
      3) Have a single iterated mixing function, not needing conditional
         branches or unrolling to distinguish different loop iterations.
      
      One of the algorithms which comes closest is Yann Collet's xxHash, but
      that's two dependent multiplies per word, which is too much.
      
      The key insights in this design are:
      
      1) Barring expensive ops like multiplies, to diffuse one input bit
         across 64 bits of hash state takes at least log2(64) = 6 sequentially
         dependent instructions.  That is more cycles than we'd like.
      2) An operation like "hash ^= hash << 13" requires a second temporary
         register anyway, and on a 2-operand machine like x86, it's three
         instructions.
      3) A better use of a second register is to hold a two-word hash state.
         With careful design, no temporaries are needed at all, so it doesn't
         increase register pressure.  And this gets rid of register copying
         on 2-operand machines, so the code is smaller and faster.
      4) Using two words of state weakens the requirement for one-round mixing;
         we now have two rounds of mixing before cancellation is possible.
      5) A two-word hash state also allows operations on both halves to be
         done in parallel, so on a superscalar processor we get more mixing
         in fewer cycles.
      
      I ended up using a mixing function inspired by the ChaCha and Speck
      round functions.  It is 6 simple instructions and 3 cycles per iteration
      (assuming multiply by 9 can be done by an "lea" instruction):
      
      		x ^= *input++;
      	y ^= x;	x = ROL(x, K1);
      	x += y;	y = ROL(y, K2);
      	y *= 9;
      
      Not only is this reversible, two consecutive rounds are reversible:
      if you are given the initial and final states, but not the intermediate
      state, it is possible to compute both input words.  This means that at
      least 3 words of input are required to create a collision.
      
      (It also has the property, used by hash_name() to avoid a branch, that
      it hashes all-zero to all-zero.)
      
      The rotate constants K1 and K2 were found by experiment.  The search took
      a sample of random initial states (I used 1023) and considered the effect
      of flipping each of the 64 input bits on each of the 128 output bits two
      rounds later.  Each of the 8192 pairs can be considered a biased coin, and
      adding up the Shannon entropy of all of them produces a score.
      
      The best-scoring shifts also did well in other tests (flipping bits in y,
      trying 3 or 4 rounds of mixing, flipping all 64*63/2 pairs of input bits),
      so the choice was made with the additional constraint that the sum of the
      shifts is odd and not too close to the word size.
      
      The final state is then folded into a 32-bit hash value by a less carefully
      optimized multiply-based scheme.  This also has to be fast, as pathname
      components tend to be short (the most common case is one iteration!), but
      there's some room for latency, as there is a fair bit of intervening logic
      before the hash value is used for anything.
      
      (Performance verified with "bonnie++ -s 0 -n 1536:-2" on tmpfs.  I need
      a better benchmark; the numbers seem to show a slight dip in performance
      between 4.6.0 and this patch, but they're too noisy to quote.)
      
      Special thanks to Bruce fields for diligent testing which uncovered a
      nasty fencepost error in an earlier version of this patch.
      
      [checkpatch.pl formatting complaints noted and respectfully disagreed with.]
      Signed-off-by: NGeorge Spelvin <linux@sciencehorizons.net>
      Tested-by: NJ. Bruce Fields <bfields@redhat.com>
      2a18da7a
    • G
      fs/namei.c: Add hashlen_string() function · fcfd2fbf
      George Spelvin 提交于
      We'd like to make more use of the highly-optimized dcache hash functions
      throughout the kernel, rather than have every subsystem create its own,
      and a function that hashes basic null-terminated strings is required
      for that.
      
      (The name is to emphasize that it returns both hash and length.)
      
      It's actually useful in the dcache itself, specifically d_alloc_name().
      Other uses in the next patch.
      
      full_name_hash() is also tweaked to make it more generally useful:
      1) Take a "char *" rather than "unsigned char *" argument, to
         be consistent with hash_name().
      2) Handle zero-length inputs.  If we want more callers, we don't want
         to make them worry about corner cases.
      Signed-off-by: NGeorge Spelvin <linux@sciencehorizons.net>
      fcfd2fbf
  2. 17 5月, 2016 1 次提交
  3. 11 5月, 2016 2 次提交
    • M
      vfs: add lookup_hash() helper · 3c9fe8cd
      Miklos Szeredi 提交于
      Overlayfs needs lookup without inode_permission() and already has the name
      hash (in form of dentry->d_name on overlayfs dentry).  It also doesn't
      support filesystems with d_op->d_hash() so basically it only needs
      the actual hashed lookup from lookup_one_len_unlocked()
      
      So add a new helper that does unlocked lookup of a hashed name.
      Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
      3c9fe8cd
    • M
      vfs: rename: check backing inode being equal · 9409e22a
      Miklos Szeredi 提交于
      If a file is renamed to a hardlink of itself POSIX specifies that rename(2)
      should do nothing and return success.
      
      This condition is checked in vfs_rename().  However it won't detect hard
      links on overlayfs where these are given separate inodes on the overlayfs
      layer.
      
      Overlayfs itself detects this condition and returns success without doing
      anything, but then vfs_rename() will proceed as if this was a successful
      rename (detach_mounts(), d_move()).
      
      The correct thing to do is to detect this condition before even calling
      into overlayfs.  This patch does this by calling vfs_select_inode() to get
      the underlying inodes.
      Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
      Cc: <stable@vger.kernel.org> # v4.2+
      9409e22a
  4. 03 5月, 2016 19 次提交
  5. 01 5月, 2016 2 次提交
    • M
      ima: add support for creating files using the mknodat syscall · 05d1a717
      Mimi Zohar 提交于
      Commit 3034a146 "ima: pass 'opened' flag to identify newly created files"
      stopped identifying empty files as new files.  However new empty files
      can be created using the mknodat syscall.  On systems with IMA-appraisal
      enabled, these empty files are not labeled with security.ima extended
      attributes properly, preventing them from subsequently being opened in
      order to write the file data contents.  This patch defines a new hook
      named ima_post_path_mknod() to mark these empty files, created using
      mknodat, as new in order to allow the file data contents to be written.
      
      In addition, files with security.ima xattrs containing a file signature
      are considered "immutable" and can not be modified.  The file contents
      need to be written, before signing the file.  This patch relaxes this
      requirement for new files, allowing the file signature to be written
      before the file contents.
      
      Changelog:
      - defer identifying files with signatures stored as security.ima
        (based on Dmitry Rozhkov's comments)
      - removing tests (eg. dentry, dentry->d_inode, inode->i_size == 0)
        (based on Al's review)
      Signed-off-by: NMimi Zohar <zohar@linux.vnet.ibm.com>
      Cc: Al Viro <<viro@zeniv.linux.org.uk>
      Tested-by: NDmitry Rozhkov <dmitry.rozhkov@linux.intel.com>
      05d1a717
    • A
      atomic_open(): fix the handling of create_error · 10c64cea
      Al Viro 提交于
      * if we have a hashed negative dentry and either CREAT|EXCL on
      r/o filesystem, or CREAT|TRUNC on r/o filesystem, or CREAT|EXCL
      with failing may_o_create(), we should fail with EROFS or the
      error may_o_create() has returned, but not ENOENT.  Which is what
      the current code ends up returning.
      
      * if we have CREAT|TRUNC hitting a regular file on a read-only
      filesystem, we can't fail with EROFS here.  At the very least,
      not until we'd done follow_managed() - we might have a writable
      file (or a device, for that matter) bound on top of that one.
      Moreover, the code downstream will see that O_TRUNC and attempt
      to grab the write access (*after* following possible mount), so
      if we really should fail with EROFS, it will happen.  No need
      to do that inside atomic_open().
      
      The real logics is much simpler than what the current code is
      trying to do - if we decided to go for simple lookup, ended
      up with a negative dentry *and* had create_error set, fail with
      create_error.  No matter whether we'd got that negative dentry
      from lookup_real() or had found it in dcache.
      
      Cc: stable@vger.kernel.org # v3.6+
      Acked-by: NMiklos Szeredi <mszeredi@redhat.com>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      10c64cea
  6. 11 4月, 2016 1 次提交
  7. 06 4月, 2016 1 次提交
  8. 31 3月, 2016 2 次提交
  9. 28 3月, 2016 1 次提交
  10. 14 3月, 2016 7 次提交