1. 05 Sep, 2015 40 commits
    • Y
      ocfs2: fix a tiny case that inode can not removed · 928dda1f
      Yiwen Jiang committed
      When running dirop_fileop_racer we found a case in which an inode
      cannot be removed.
      
      Two nodes, say Node A and Node B, mount the same ocfs2 volume.  Create
      two dirs /race/1/ and /race/2/ in the filesystem.
      
        Node A                            Node B
        rm -r /race/2/
                                          mv /race/1/ /race/2/
        call ocfs2_unlink(), get
        the EX mode of /race/2/
                                          wait for A to unlock /race/2/
        decrease i_nlink of /race/2/ to 0,
        and add inode of /race/2/ into
        orphan dir, unlock /race/2/
                                          got EX mode of /race/2/. because
                                          /race/1/ is dir, so inc i_nlink
                                          of /race/2/ and update into disk,
                                          unlock /race/2/
        because i_nlink of /race/2/
        is not zero, this inode will
        always remain in orphan dir
      
      This patch fixes this case by testing whether i_nlink of the new dir is zero.
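The interleaving above can be modelled in a few lines. This is a toy sketch, not the ocfs2 code: `Inode`, `unlink_dir`, `move_dir_into` and `can_be_reclaimed` are hypothetical names, and raising `FileNotFoundError` merely stands in for whatever error the real patch returns.

```python
class Inode:
    """Toy stand-in for a directory inode (hypothetical, not the kernel struct)."""

    def __init__(self, nlink):
        self.nlink = nlink
        self.in_orphan_dir = False


def unlink_dir(inode):
    # Node A: drop the link count to zero and park the inode in the orphan dir.
    inode.nlink = 0
    inode.in_orphan_dir = True


def move_dir_into(new_dir, check_nlink):
    # Node B: moving a directory into new_dir adds one link to new_dir.
    if check_nlink and new_dir.nlink == 0:
        # Assumed behaviour of the fix: refuse to touch an already-unlinked dir.
        raise FileNotFoundError("target directory was removed")
    new_dir.nlink += 1


def can_be_reclaimed(inode):
    # In this model, orphan cleanup only frees inodes whose link count is zero.
    return inode.in_orphan_dir and inode.nlink == 0
```

Without the check, the inode ends up in the orphan dir with a non-zero link count and is never reclaimed; with the check, the rename fails and the orphan can still be cleaned up.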
      Signed-off-by: Yiwen Jiang <jiangyiwen@huawei.com>
      Reviewed-by: Mark Fasheh <mfasheh@suse.de>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Joseph Qi <joseph.qi@huawei.com>
      Cc: Xue jiufei <xuejiufei@huawei.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • W
      ocfs2: add ip_alloc_sem in direct IO to protect allocation changes · 6ab855a9
      WeiWei Wang committed
      In ocfs2, ip_alloc_sem is used to protect allocation changes on the
      node.  In direct IO, we take ip_alloc_sem to keep data consistent
      when direct IO races with ocfs2_truncate_file (buffered IO already
      uses ip_alloc_sem).  Although the inode->i_mutex lock is used to avoid
      concurrency in the above situation, I think ip_alloc_sem is still
      needed because protecting allocation changes is important.

      Other filesystems such as ext4 also use an rw_semaphore to keep data
      consistent in the get_block-vs-truncate race, albeit by other means, so
      ip_alloc_sem is needed in ocfs2 direct IO.
      Signed-off-by: Weiwei Wang <wangww631@huawei.com>
      Signed-off-by: Mark Fasheh <mfasheh@suse.de>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • G
      ocfs2: clear the rest of the buffers on error · 34237681
      Goldwyn Rodrigues committed
      In case a validation fails, clear the rest of the buffers and return the
      error to the calling function.
      
      This also facilitates bubbling up the error originating from ocfs2_error
      to calling functions.
      Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
      Reviewed-by: Mark Fasheh <mfasheh@suse.de>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • G
      ocfs2: acknowledge return value of ocfs2_error() · 17a5b9ab
      Goldwyn Rodrigues committed
      Caveat: This may return -EROFS for a read case, which seems wrong.  This
      is happening even without this patch series though.  Should we convert
      EROFS to EIO?
      Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • G
      ocfs2: add errors=continue · 7d0fb914
      Goldwyn Rodrigues committed
      OCFS2 is often used in high-availability systems.  However, ocfs2
      converts the filesystem to read-only at the drop of a hat.  This may
      not be necessary, since turning the filesystem read-only would affect
      other running processes as well, decreasing availability.
      
      This patch attempts to add errors=continue, which would return EIO to
      the calling process and terminate further processing so that the
      filesystem is not corrupted further.  However, the filesystem is not
      converted to read-only.
      
      As a future plan, I intend to create a small utility or extend
      fsck.ocfs2 to fix small errors such as in the inode.  The input to the
      utility such as the inode can come from the kernel logs so we don't have
      to schedule a downtime for fixing small-enough errors.
      
      The patch changes the ocfs2_error to return an error.  The error
      returned depends on the mount option set.  If none is set, the default
      is to turn the filesystem read-only.
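A minimal sketch of that policy, under stated assumptions: the function name `ocfs2_error_result`, the option string `"remount-ro"` for the default, and the mapping of the default to -EROFS are illustrative guesses, not taken from the patch. Only the EIO result for errors=continue is stated in the text above.

```python
import errno


def ocfs2_error_result(errors_opt="remount-ro"):
    """Return (error_code, filesystem_is_read_only) for an errors= option.

    Hypothetical model: "continue" reports -EIO and leaves the filesystem
    writable; the default turns it read-only (assumed here to report -EROFS).
    """
    if errors_opt == "continue":
        return -errno.EIO, False
    if errors_opt == "remount-ro":
        return -errno.EROFS, True
    raise ValueError(f"unsupported errors= option in this sketch: {errors_opt}")
```

The point of the design is that the caller always gets an error back, while the decision to sacrifice availability is left to the mount option.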
      
      Perhaps errors=continue is not the best option name.  Historically it is
      used for making an attempt to progress in the current process itself.
      Should we call it errors=eio? or errors=killproc? Suggestions/Comments
      welcome.
      
      Sources are available at:
        https://github.com/goldwynr/linux/tree/error-cont
      Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
      Signed-off-by: Mark Fasheh <mfasheh@suse.de>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • X
      ocfs2: flush inode data to disk and free inode when i_count becomes zero · 513e2dae
      Xue jiufei committed
      Disk inode deletion may be heavily delayed when one node unlinks a file
      after the same dentry has been freed on another node (say N1) because of
      memory shrinking while the inode is left in memory.  This inode can only
      be freed when N1 does the orphan scan work.

      However, N1 may skip the orphan scan several times because other nodes
      may do the work earlier.  In our tests, it may take 1 hour on a 4-node
      cluster, which hurts the user experience.  So we think the inode should
      be freed after the data is flushed to disk when i_count becomes zero to
      avoid such circumstances.
      Signed-off-by: Joyce.xue <xuejiufei@huawei.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Reviewed-by: Mark Fasheh <mfasheh@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • S
      ocfs2: trusted xattr missing CAP_SYS_ADMIN check · 0f5e7b41
      Sanidhya Kashyap committed
      The trusted extended attributes are only visible to processes which
      have the CAP_SYS_ADMIN capability, but the check is missing in the ocfs2
      xattr_handler trusted list.  The check is important because this will be
      used for implementing mechanisms in userspace to which other
      ordinary processes should not have access.
      Signed-off-by: Sanidhya Kashyap <sanidhya.gatech@gmail.com>
      Reviewed-by: Mark Fasheh <mfasheh@suse.de>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Taesoo kim <taesoo@gatech.edu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • J
      ocfs2: set filesytem read-only when ocfs2_delete_entry failed. · 807a7907
      jiangyiwen committed
      In ocfs2_rename, a failure of ocfs2_delete_entry(old) leaves an inode
      with two entries (old and new).  Thus, the filesystem will be inconsistent.
      
      The case is described below:
      
      ocfs2_rename
          -> ocfs2_start_trans
          -> ocfs2_add_entry(new)
          -> ocfs2_delete_entry(old)
              -> __ocfs2_journal_access *failed* because of -ENOMEM
          -> ocfs2_commit_trans
      
      So the filesystem should be set to read-only at that point.
      Signed-off-by: Yiwen Jiang <jiangyiwen@huawei.com>
      Cc: Joseph Qi <joseph.qi@huawei.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Reviewed-by: Mark Fasheh <mfasheh@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • J
      ocfs2/dlm: use list_for_each_entry instead of list_for_each · f83c7b5e
      Joseph Qi committed
      Use list_for_each_entry instead of list_for_each to simplify code.
      Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • J
      ocfs2: remove unneeded code in dlm_register_domain_handlers · 0e3d9eaf
      Joseph Qi committed
      The last goto statement is unneeded, so remove it.
      Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • J
      ocfs2: fix BUG when o2hb_register_callback fails · cdd09f49
      Joseph Qi committed
      In dlm_register_domain_handlers, if o2hb_register_callback fails, it
      will call dlm_unregister_domain_handlers to unregister.  This will
      trigger the BUG_ON in o2hb_unregister_callback because hc_magic is 0.
      So we should call o2hb_setup_callback to initialize hc first.
      Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • J
      ocfs2: remove unneeded code in ocfs2_dlm_init · 914a9b74
      Joseph Qi committed
      status is already initialized and can only be 0 or negative in the
      code flow.  So remove the unneeded assignment after the label 'local'.
      Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • J
      ocfs2: adjust code to match locking/unlocking order · 3cb2ec43
      Joseph Qi committed
      The unlocking order in ocfs2_unlink and ocfs2_rename does not match the
      corresponding locking order.  Although this won't cause issues, adjust
      the code so that it looks more reasonable.
      Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • J
      ocfs2: clean up unused local variables in ocfs2_file_write_iter · bf59e662
      Joseph Qi committed
      Since commit 86b9c6f3 ("ocfs2: remove filesize checks for sync I/O
      journal commit") removes filesize checks for sync I/O journal commit,
      variables old_size and old_clusters are not actually used any more.  So
      clean them up.
      Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • C
      ocfs2: do not log twice error messages · 372a447c
      Christophe JAILLET committed
      'o2hb_map_slot_data' and 'o2hb_populate_slot_data' are called from only
      one place, in 'o2hb_region_dev_write'.  Return value is checked and
      'mlog_errno' is called to log a message if it is not 0.
      
      So there is no need to call 'mlog_errno' directly within these functions.
      Doing so would result in the message being logged twice.
      Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • J
      ocfs2: do not BUG if buffer not uptodate in __ocfs2_journal_access · acf8fdbe
      Joseph Qi committed
      When storage network is unstable, it may trigger the BUG in
      __ocfs2_journal_access because of buffer not uptodate.  We can retry the
      write in this case or return error instead of BUG.
      Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
      Reported-by: Zhangguanghui <zhang.guanghui@h3c.com>
      Tested-by: Zhangguanghui <zhang.guanghui@h3c.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • J
      ocfs2: fix several issues of append dio · faaebf18
      Joseph Qi committed
      1) Take rw EX lock in case of append dio.
      2) Explicitly treat the error code -EIOCBQUEUED as normal.
      3) Set di_bh to NULL after brelse if it may be used again later.
      Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
      Cc: Yiwen Jiang <jiangyiwen@huawei.com>
      Cc: Weiwei Wang <wangww631@huawei.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • J
      ocfs2: fix race between dio and recover orphan · 512f62ac
      Joseph Qi committed
      During direct io the inode is first added to the orphan dir and then
      deleted from it.  There is a race window in which the orphan entry may
      be deleted twice, which triggers the BUG when validating
      OCFS2_DIO_ORPHANED_FL in ocfs2_del_inode_from_orphan.
      
      ocfs2_direct_IO_write
          ...
          ocfs2_add_inode_to_orphan
          >>>>>>>> race window.
                   1) another node may rm the file and then go down, so this
                   node takes care of orphan recovery and clears the flag
                   OCFS2_DIO_ORPHANED_FL.
                   2) since the rw lock is unlocked, it may race with another
                   orphan recovery and append dio.
          ocfs2_del_inode_from_orphan
      
      So take the inode mutex when recovering orphans, and release the rw lock
      at the end of the aio write in the append dio case.
      Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
      Reported-by: Yiwen Jiang <jiangyiwen@huawei.com>
      Cc: Weiwei Wang <wangww631@huawei.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • A
      sh: use PFN_DOWN macro · 81cf09ed
      Alexander Kuleshov committed
      Replace ((x) >> PAGE_SHIFT) with the predefined PFN_DOWN macro.
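For readers unfamiliar with the macro, here is a small model of what the shift computes. It is written in Python for illustration only; `PAGE_SHIFT = 12` (4 KiB pages) is an assumption, since the real value depends on the kernel configuration.

```python
PAGE_SHIFT = 12  # assumed: 4 KiB pages; the real value is configuration-dependent


def pfn_down(addr):
    # Same arithmetic as ((x) >> PAGE_SHIFT): the page frame number
    # containing addr, i.e. the address rounded down to a page boundary.
    return addr >> PAGE_SHIFT
```

With this assumption, addresses 0x1000 through 0x1fff all map to page frame 1.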
      Signed-off-by: Alexander Kuleshov <kuleshovmail@gmail.com>
      Acked-by: Geert Uytterhoeven <geert+renesas@glider.be>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • S
      ntfs: delete unnecessary checks before calling iput() · 917520e1
      SF Markus Elfring committed
      iput() tests whether its argument is NULL and then returns immediately.
      Thus the test around the call is not needed.
      
      This issue was detected by using the Coccinelle software.
      Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
      Cc: Julia Lawall <julia.lawall@lip6.fr>
      Reviewed-by: Anton Altaparmakov <anton@tuxera.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Z
      scripts/spelling.txt: add some typo-words · 35108d71
      Zhao Lei committed
      I wrote a small script to show word-pairs from all Linux spelling-typo
      commits, and got the following result from sort | uniq -c:
      
          181 occured -> occurred
           78 transfered -> transferred
           67 recieved -> received
           65 dependant -> dependent
           58 wether -> whether
           56 accomodate -> accommodate
           54 occured -> occurred
           51 recieve -> receive
           47 cant -> can't
           40 sucessfully -> successfully
           ...
      
      Some of them are not in spelling.txt; this patch adds the most common
      word-pairs to spelling.txt.
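The counting step can be sketched as follows. This is not the author's script (which is not shown); `count_pairs` is a hypothetical helper, and unlike a bare `sort | uniq -c` it also orders the result by frequency, as the listing above appears to be.

```python
from collections import Counter


def count_pairs(pairs):
    """Count (typo, correction) pairs and list them most frequent first.

    Returns a list of (count, typo, correction) tuples, mirroring the
    "count typo -> correction" layout of `uniq -c` output.
    """
    counts = Counter(pairs)
    return [(n, typo, fix) for (typo, fix), n in counts.most_common()]
```

Feeding it one pair per spelling-fix commit would give a ranked list from which the most common missing entries can be picked.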
      Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • R
      scripts: decode_stacktrace: fix ARM architecture decoding · e260fe01
      Robert Jarzmik committed
      Fix the stack decoder for the ARM architecture.
      An ARM stack trace looks like this:
      
      [   81.547704] [<c023eb04>] (bucket_find_contain) from [<c023ec88>] (check_sync+0x40/0x4f8)
      [   81.559668] [<c023ec88>] (check_sync) from [<c023f8c4>] (debug_dma_sync_sg_for_cpu+0x128/0x194)
      [   81.571583] [<c023f8c4>] (debug_dma_sync_sg_for_cpu) from [<c0327dec>] (__videobuf_s
      
      The current script doesn't expect the symbols to be enclosed in
      parentheses, and triggers the following errors:
      
        awk: cmd. line:1: error: Unmatched ( or \(: / (check_sync$/
        [   81.547704] (bucket_find_contain) from (check_sync+0x40/0x4f8)
      
      Fix it by chopping the leading and trailing parentheses from each symbol
      name.
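The chopping step amounts to the following. It is shown in Python purely as an illustration (the real script is a shell script, and `strip_parens` is a hypothetical name):

```python
def strip_parens(symbol):
    # Remove one leading "(" and one trailing ")" independently, so that
    # "(check_sync+0x40/0x4f8)" becomes "check_sync+0x40/0x4f8".
    if symbol.startswith("("):
        symbol = symbol[1:]
    if symbol.endswith(")"):
        symbol = symbol[:-1]
    return symbol
```

Symbols without parentheses pass through unchanged, so traces from other architectures are not affected by this step.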
      
      As a side note, this probably comes from the function
      dump_backtrace_entry(), which is implemented differently for each
      architecture.  That makes a single decoding script a bit of a challenge.
      Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Cc: Russell King <rmk+kernel@arm.linux.org.uk>
      Cc: Michal Marek <mmarek@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • J
      scripts/Lindent: handle missing indent gracefully · fa70900e
      Jean Delvare committed
      If indent is not found, bail out immediately instead of spitting random
      shell script error messages.
      Signed-off-by: Jean Delvare <jdelvare@suse.de>
      Cc: Joe Perches <joe@perches.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • B
      kerneldoc: Convert error messages to GNU error message format · d40e1e65
      Bart Van Assche committed
      Editors like emacs and vi recognize a number of error message formats.
      The format used by the kerneldoc tool is not recognized by emacs.
      
      Change the kerneldoc error message format to the GNU style such that the
      emacs prev-error and next-error commands can be used to navigate through
      kerneldoc error messages.  For more information about the GNU error
      message format, see also
        https://www.gnu.org/prep/standards/html_node/Errors.html.
      
      This patch has been generated via the following sed command:
      
        sed -i.orig 's/Error(\${file}:\$.):/\${file}:\$.: error:/g;s/Warning(\${file}:\$.):/\${file}:\$.: warning:/g;s/Warning(\${file}):/\${file}:1: warning:/g;s/Info(\${file}:\$.):/\${file}:\$.: info:/g' scripts/kernel-doc
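The sed command rewrites the format strings inside scripts/kernel-doc itself. The effect on an emitted message can be illustrated with the sketch below; `to_gnu_format` is a hypothetical helper, not part of kernel-doc. It also handles `Error(file)` and `Info(file)` without a line number, which is a generalisation beyond the four substitutions in the sed command.

```python
import re

# Old style: "Warning(path:line): text" or "Warning(path): text".
_OLD_PREFIX = re.compile(r"^(Error|Warning|Info)\(([^():]+)(?::(\d+))?\):")


def to_gnu_format(message):
    """Rewrite an old-style prefix as GNU style "path:line: kind:"."""

    def repl(match):
        kind, path, line = match.group(1), match.group(2), match.group(3)
        # A missing line number becomes line 1, as in the sed command.
        return f"{path}:{line or '1'}: {kind.lower()}:"

    return _OLD_PREFIX.sub(repl, message, count=1)
```

Messages that do not start with one of the old prefixes are returned unchanged.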
      Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
      Cc: Johannes Berg <johannes.berg@intel.com>
      Acked-by: Randy Dunlap <rdunlap@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • S
      scripts/spelling.txt: spelling of uninitialized · c22b6ae6
      Sudip Mukherjee committed
      I just misspelled uninitialized as unintialized.  Fortunately I
      noticed it in my final review.
      Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • M
      scripts/spelling.txt: add misspelled words for check · 779a6ce8
      Maninder Singh committed
      Misspelled words for check:
       chcek
       chck
       cehck

      I made these spelling mistakes myself in patch changelogs, so I
      suggest adding them to spelling.txt so that checkpatch.pl warns about
      them earlier.  References:
      
      ./arch/powerpc/kernel/exceptions-64e.S:456: . . . make sure you chcek
      https://lkml.org/lkml/2015/6/25/289
      ./arch/x86/mm/pageattr.c:1368: * No need to cehck in that case
      
      [akpm@linux-foundation.org: add whcih->which, whcih I always get wrong]
      Signed-off-by: Maninder Singh <maninder1.s@samsung.com>
      Acked-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • J
      fsnotify: get rid of fsnotify_destroy_mark_locked() · 4712e722
      Jan Kara committed
      fsnotify_destroy_mark_locked() is subtle to use because it temporarily
      releases group->mark_mutex.  To avoid future problems with this
      function, split it into two.
      
      fsnotify_detach_mark() is the part that needs group->mark_mutex and
      fsnotify_free_mark() is the part that must be called outside of
      group->mark_mutex.  This way it's much clearer what's going on and we
      also avoid some pointless acquisitions of group->mark_mutex.
      Signed-off-by: Jan Kara <jack@suse.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • J
      fsnotify: remove mark->free_list · 925d1132
      Jan Kara committed
      Free list is used when all marks on given inode / mount should be
      destroyed when inode / mount is going away.  However we can free all of
      the marks without using a special list with some care.
      Signed-off-by: Jan Kara <jack@suse.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • J
      fsnotify: document mark locking · 1e39fc01
      Jan Kara committed
      Signed-off-by: Jan Kara <jack@suse.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • J
      fsnotify: fix check in inotify fdinfo printing · 3c53e514
      Jan Kara committed
      A check in inotify_fdinfo() checking whether mark is valid was always
      true due to a bug.  Luckily we can never get to invalidated marks since
      we hold mark_mutex and invalidated marks get removed from the group list
      when they are invalidated under that mutex.
      
      Anyway fix the check to make code more future proof.
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • D
      fs/notify: optimize inotify/fsnotify code for unwatched files · 7c49b861
      Dave Hansen committed
      I have a _tiny_ microbenchmark that sits in a loop and writes single
      bytes to a file.  Writing one byte to a tmpfs file is around 2x slower
      than reading one byte from a file, which is a _bit_ more than I expected.
      This is a dumb benchmark, but I think it's hard to deny that write() is
      a hot path and we should avoid unnecessary overhead there.
      
      I did a 'perf record' of 30-second samples of read and write.  The top
      item in a diffprofile is srcu_read_lock() from fsnotify().  There are
      active inotify fd's from systemd, but nothing is actually listening to
      the file or its part of the filesystem.
      
      I *think* we can avoid taking the srcu_read_lock() for the common case
      where there are no actual marks on the file.  This both means that
      there will be nothing to notify *and* implies that there is no need to
      clear the ignore mask.
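The idea of the fast path can be modelled as below. This is a toy sketch under assumptions, not the kernel code: `Watched`, `Stats` and this `fsnotify` function are hypothetical stand-ins, and incrementing a counter stands in for srcu_read_lock().

```python
class Watched:
    """Toy object (inode or mount) carrying a list of notification marks."""

    def __init__(self, marks=()):
        self.marks = list(marks)


class Stats:
    """Counts how often the (simulated) SRCU read lock is taken."""

    def __init__(self):
        self.srcu_locks = 0


def fsnotify(inode, mount, stats):
    # Fast path: nobody watches this inode or its mount, so skip the lock.
    if not inode.marks and (mount is None or not mount.marks):
        return 0
    stats.srcu_locks += 1  # stands in for srcu_read_lock()
    watchers = len(inode.marks) + (len(mount.marks) if mount else 0)
    return watchers
```

In the common unwatched case the function returns before touching the lock, which is where the reported write() speedup would come from.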
      
      This patch gave a 13.1% speedup in writes/second on my test, which is an
      improvement from the 10.8% that I saw with the last version.
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Reviewed-by: Jan Kara <jack@suse.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: John McCutchan <john@johnmccutchan.com>
      Cc: Robert Love <rlove@rlove.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Y
      drivers/video/concole: add negative dependency for VGA_CONSOLE on ARC · 031e29b5
      Yuriy Kolerov committed
      Architectures which support the VGA console must define the screen_info
      structure from "uapi/linux/screen_info.h".  Otherwise an undefined
      symbol error occurs.  Usually it's defined in "setup.c" for each
      architecture.

      If an architecture does not support the VGA console (ARC's case) there
      are 2 ways: define a dummy instance of screen_info or add a negative
      dependency for VGA_CONSOLE to prevent selecting this option.
      
      I've implemented the second way.  However the best solution is to add
      HAVE_VGA_CONSOLE option for targets which support VGA console.  Then
      turn off VGA_CONSOLE by default and add dependency to HAVE_VGA_CONSOLE.
      But right now it's better to just add a negative dependency for ARC and
      then consider how to collaborate about this issue with maintainers of
      other architectures.
      Signed-off-by: Yuriy Kolerov <yuriy.kolerov@synopsys.com>
      Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com>
      Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
      Cc: Jaya Kumar <jayalk@intworks.biz>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • A
      capabilities: add a securebit to disable PR_CAP_AMBIENT_RAISE · 746bf6d6
      Andy Lutomirski committed
      Per Andrew Morgan's request, add a securebit to allow admins to disable
      PR_CAP_AMBIENT_RAISE.  This securebit will prevent processes from adding
      capabilities to their ambient set.
      
      For simplicity, this disables PR_CAP_AMBIENT_RAISE entirely rather than
      just disabling setting previously cleared bits.
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Acked-by: Andrew G. Morgan <morgan@kernel.org>
      Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Serge Hallyn <serge.hallyn@canonical.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Aaron Jones <aaronmdjones@gmail.com>
      Cc: Ted Ts'o <tytso@mit.edu>
      Cc: Andrew G. Morgan <morgan@kernel.org>
      Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
      Cc: Austin S Hemmelgarn <ahferroin7@gmail.com>
      Cc: Markku Savela <msa@moth.iki.fi>
      Cc: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: James Morris <james.l.morris@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • A
      selftests/capabilities: Add tests for capability evolution · 32ae976e
      Andy Lutomirski committed
      This test focuses on ambient capabilities.  It requires either root or
      the ability to create user namespaces.  Some of the test cases will be
      skipped for nonroot users.
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Acked-by: Kees Cook <keescook@chromium.org>
      Cc: Christoph Lameter <cl@linux.com> # Original author
      Cc: Serge E. Hallyn <serge.hallyn@ubuntu.com>
      Cc: James Morris <james.l.morris@oracle.com>
      Cc: Shuah Khan <shuahkh@osg.samsung.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • A
      capabilities: ambient capabilities · 58319057
      Andy Lutomirski committed
      Credit where credit is due: this idea comes from Christoph Lameter with
      a lot of valuable input from Serge Hallyn.  This patch is heavily based
      on Christoph's patch.
      
      ===== The status quo =====
      
      On Linux, there are a number of capabilities defined by the kernel.  To
      perform various privileged tasks, processes can wield capabilities that
      they hold.
      
      Each task has four capability masks: effective (pE), permitted (pP),
      inheritable (pI), and a bounding set (X).  When the kernel checks for a
      capability, it checks pE.  The other capability masks serve to modify
      what capabilities can be in pE.
      
      Any task can remove capabilities from pE, pP, or pI at any time.  If a
      task has a capability in pP, it can add that capability to pE and/or pI.
      If a task has CAP_SETPCAP, then it can add any capability to pI, and it
      can remove capabilities from X.
      
      Tasks are not the only things that can have capabilities; files can also
      have capabilities.  A file can have no capability information at all [1].
      If a file has capability information, then it has a permitted mask (fP)
      and an inheritable mask (fI) as well as a single effective bit (fE) [2].
      File capabilities modify the capabilities of tasks that execve(2) them.
      
      A task that successfully calls execve has its capabilities modified for
      the file ultimately being executed (i.e. the binary itself if that
      binary is ELF, or the interpreter if the binary is a script) [3].  In
      the capability evolution rules, for each mask Z, pZ represents the old
      value and pZ' represents the new value.  The rules are:
      
        pP' = (X & fP) | (pI & fI)
        pI' = pI
        pE' = (fE ? pP' : 0)
        X is unchanged
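These rules are plain bitmask arithmetic and can be checked with a short sketch. The function name `execve_caps` and the use of Python integers as capability masks are illustrative choices; the formulas are the ones listed above.

```python
def execve_caps(pI, X, fP, fI, fE):
    """Apply the pre-ambient execve rules to integer capability bitmasks.

    Returns (pP', pI', pE'); the bounding set X is unchanged.
    """
    new_pP = (X & fP) | (pI & fI)
    new_pI = pI
    new_pE = new_pP if fE else 0
    return new_pP, new_pI, new_pE
```

For example, a nonroot task holding a capability only in pI that executes a file with empty fP and fI ends up with an empty pP', which is the inheritance problem the commit message goes on to describe.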
      
      For setuid binaries, fP, fI, and fE are modified by a moderately
      complicated set of rules that emulate POSIX behavior.  Similarly, if
      euid == 0 or ruid == 0, then fP, fI, and fE are modified differently
      (primarily, fP and fI usually end up being the full set).  For nonroot
      users executing binaries with neither setuid nor file caps, fI and fP
      are empty and fE is false.
      
      As an extra complication, if you execute a process as nonroot and fE is
      set, then the "secure exec" rules are in effect: AT_SECURE gets set,
      LD_PRELOAD doesn't work, etc.
      
      This is rather messy.  We've learned that making any changes is
      dangerous, though: if a new kernel version allows an unprivileged
      program to change its security state in a way that persists across
      execution of a setuid program or a program with file caps, this
      persistent state is surprisingly likely to allow setuid or file-capped
      programs to be exploited for privilege escalation.
      
      ===== The problem =====
      
      Capability inheritance is basically useless.
      
      If you aren't root and you execute an ordinary binary, fI is zero, so
      your capabilities have no effect whatsoever on pP'.  This means that you
      can't usefully execute a helper process or a shell command with elevated
      capabilities if you aren't root.
      
      On current kernels, you can sort of work around this by setting fI to
      the full set for most or all non-setuid executable files.  This causes
      pP' = pI for nonroot, and inheritance works.  No one does this because
      it's a PITA and it isn't even supported on most filesystems.
      
      If you try this, you'll discover that every nonroot program ends up with
      secure exec rules, breaking many things.
      
      This is a problem that has bitten many people who have tried to use
      capabilities for anything useful.
      
      ===== The proposed change =====
      
      This patch adds a fifth capability mask called the ambient mask (pA).
      pA does what most people expect pI to do.
      
      pA obeys the invariant that no bit can ever be set in pA if it is not
      set in both pP and pI.  Dropping a bit from pP or pI drops that bit from
      pA.  This ensures that existing programs that try to drop capabilities
      still do so, with a complication.  Because capability inheritance is so
      broken, setting KEEPCAPS, using setresuid to switch to nonroot uids, and
      then calling execve effectively drops capabilities.  Therefore,
      setresuid from root to nonroot conditionally clears pA unless
      SECBIT_NO_SETUID_FIXUP is set.  Processes that don't like this can
      re-add bits to pA afterwards.
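The invariant described above (pA is always a subset of pP & pI) can be sketched in userspace C.  This is only an illustrative model, not the kernel's implementation: struct task_caps and the helper names below are hypothetical, and capability numbers are assumed to fit in a 64-bit mask.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a task's masks; this is not the kernel's struct cred. */
struct task_caps {
	uint64_t permitted;   /* pP */
	uint64_t inheritable; /* pI */
	uint64_t ambient;     /* pA */
};

/* Returns 1 if no bit is set in pA that is missing from pP or pI. */
static int ambient_invariant_ok(const struct task_caps *t)
{
	return (t->ambient & ~(t->permitted & t->inheritable)) == 0;
}

/* Raise one bit in pA; refuse (-1) unless the bit is in both pP and pI. */
static int raise_ambient(struct task_caps *t, int cap)
{
	uint64_t bit = UINT64_C(1) << cap;

	if (!(t->permitted & bit) || !(t->inheritable & bit))
		return -1;
	t->ambient |= bit;
	return 0;
}

/* Replace pP and pI, then trim pA so the invariant keeps holding. */
static void set_caps(struct task_caps *t, uint64_t new_p, uint64_t new_i)
{
	t->permitted = new_p;
	t->inheritable = new_i;
	t->ambient &= new_p & new_i;
	assert(ambient_invariant_ok(t));
}
```

In this model, a program that drops a bit from pP or pI through set_caps() also loses it from pA, which is why existing privilege-dropping code keeps working.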
      
      The capability evolution rules are changed:
      
        pA' = (file caps or setuid or setgid ? 0 : pA)
        pP' = (X & fP) | (pI & fI) | pA'
        pI' = pI
        pE' = (fE ? pP' : pA')
        X is unchanged
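The rules above can be written out as a small userspace C function.  This is a sketch for reasoning about the masks only: the struct names and the privileged_file flag (standing in for "file caps or setuid or setgid") are assumptions made for illustration, not kernel interfaces.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical containers for the masks named in the rules. */
struct proc_caps {
	uint64_t p; /* pP */
	uint64_t i; /* pI */
	uint64_t e; /* pE */
	uint64_t a; /* pA */
};

struct file_caps {
	uint64_t p; /* fP */
	uint64_t i; /* fI */
	bool e;     /* fE: a single bit, not a mask */
};

/*
 * Apply the evolution rules for one execve().  bset is X; privileged_file
 * is true when the file has file caps or is setuid or setgid.
 */
static struct proc_caps caps_after_exec(struct proc_caps old, struct file_caps f,
					uint64_t bset, bool privileged_file)
{
	struct proc_caps new;

	new.a = privileged_file ? 0 : old.a;          /* pA' */
	new.p = (bset & f.p) | (old.i & f.i) | new.a; /* pP' */
	new.i = old.i;                                /* pI' */
	new.e = f.e ? new.p : new.a;                  /* pE' */
	return new;
}
```

With fP, fI and fE all empty (an ordinary binary run by a nonroot user), the result reduces to pP' = pE' = pA, which is the inheritance behavior the old rules could not provide.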
      
      If you are nonroot but you have a capability, you can add it to pA.  If
      you do so, your children get that capability in pA, pP, and pE.  For
      example, you can set pA = CAP_NET_BIND_SERVICE, and your children can
      automatically bind low-numbered ports.  Hallelujah!
      
      Unprivileged users can create user namespaces, map themselves to a
      nonzero uid, and create both privileged (relative to their namespace)
      and unprivileged process trees.  This is currently more or less
      impossible.  Hallelujah!
      
      You cannot use pA to try to subvert a setuid, setgid, or file-capped
      program: if you execute any such program, pA gets cleared and the
      resulting evolution rules are unchanged by this patch.
      
      Users with nonzero pA are unlikely to unintentionally leak that
      capability.  If they run programs that try to drop privileges, dropping
      privileges will still work.
      
      It's worth noting that the degree of paranoia in this patch could
      possibly be reduced without causing serious problems.  Specifically, if
      we allowed pA to persist across executing non-pA-aware setuid binaries
      and across setresuid, then, naively, the only capabilities that could
      leak as a result would be the capabilities in pA, and any attacker
      *already* has those capabilities.  This would make me nervous, though --
      setuid binaries that tried to privilege-separate might fail to do so,
      and putting CAP_DAC_READ_SEARCH or CAP_DAC_OVERRIDE into pA could have
      unexpected side effects.  (Whether these unexpected side effects would
      be exploitable is an open question.) I've therefore taken the more
      paranoid route.  We can revisit this later.
      
      An alternative would be to require PR_SET_NO_NEW_PRIVS before setting
      ambient capabilities.  I think that this would be annoying and would
      make granting otherwise unprivileged users minor ambient capabilities
      (CAP_NET_BIND_SERVICE or CAP_NET_RAW for example) much less useful than
      it is with this patch.
      
      ===== Footnotes =====
      
      [1] Files that are missing the "security.capability" xattr or that have
      unrecognized values for that xattr end up with has_cap set to false.
      The code that does that appears to be complicated for no good reason.
      
      [2] The libcap capability mask parsers and formatters are dangerously
      misleading and the documentation is flat-out wrong.  fE is *not* a mask;
      it's a single bit.  This has probably confused every single person who
      has tried to use file capabilities.
      
      [3] Linux very confusingly processes both the script and the interpreter
      if applicable, for reasons that elude me.  The results from thinking
      about a script's file capabilities and/or setuid bits are mostly
      discarded.
      
      Preliminary userspace code is here, but it needs updating:
      https://git.kernel.org/cgit/linux/kernel/git/luto/util-linux-playground.git/commit/?h=cap_ambient&id=7f5afbd175d2
      
      Here is a test program that can be used to verify the functionality
      (from Christoph):
      
      /*
       * Test program for the ambient capabilities. This program spawns a shell
       * that allows running processes with a defined set of capabilities.
       *
       * (C) 2015 Christoph Lameter <cl@linux.com>
       * Released under: GPL v3 or later.
       *
       *
       * Compile using:
       *
       *	gcc -o ambient_test ambient_test.c -lcap-ng
       *
       * This program must have the following capabilities to run properly:
       * Permissions for CAP_NET_RAW, CAP_NET_ADMIN, CAP_SYS_NICE
       *
       * A command to equip the binary with the right caps is:
       *
       *	setcap cap_net_raw,cap_net_admin,cap_sys_nice+p ambient_test
       *
       *
       * To get a shell with additional caps that can be inherited by other processes:
       *
       *	./ambient_test /bin/bash
       *
       *
       * Verifying that it works:
       *
       * From the bash spawned by ambient_test run
       *
       *	cat /proc/$$/status
       *
       * and have a look at the capabilities.
       */
      
      #include <stdlib.h>
      #include <stdio.h>
      #include <errno.h>
      #include <cap-ng.h>
      #include <sys/prctl.h>
      #include <linux/capability.h>
      #include <unistd.h>	/* execv() */
      
      /*
       * Definitions from the kernel header files. These are going to be removed
       * when the /usr/include files have these defined.
       */
      #define PR_CAP_AMBIENT 47
      #define PR_CAP_AMBIENT_IS_SET 1
      #define PR_CAP_AMBIENT_RAISE 2
      #define PR_CAP_AMBIENT_LOWER 3
      #define PR_CAP_AMBIENT_CLEAR_ALL 4
      
      static void set_ambient_cap(int cap)
      {
      	int rc;
      
      	capng_get_caps_process();
      	rc = capng_update(CAPNG_ADD, CAPNG_INHERITABLE, cap);
      	if (rc) {
      		printf("Cannot add inheritable cap\n");
      		exit(2);
      	}
      	capng_apply(CAPNG_SELECT_CAPS);
      
      	/* Note the two 0s at the end. Kernel checks for these */
      	if (prctl(PR_CAP_AMBIENT, PR_CAP_AMBIENT_RAISE, cap, 0, 0)) {
      		perror("Cannot set cap");
      		exit(1);
      	}
      }
      
      int main(int argc, char **argv)
      {
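      	/* A program to execute is required as the first argument. */
      	if (argc < 2) {
      		fprintf(stderr, "Usage: %s <program> [args...]\n", argv[0]);
      		return 1;
      	}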
      
      	set_ambient_cap(CAP_NET_RAW);
      	set_ambient_cap(CAP_NET_ADMIN);
      	set_ambient_cap(CAP_SYS_NICE);
      
      	printf("Ambient_test forking shell\n");
      	if (execv(argv[1], argv + 1))
      		perror("Cannot exec");
      
      	return 0;
      }
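To check the result programmatically rather than by eye, the capability lines of /proc/<pid>/status can be parsed.  The helper below is a minimal sketch: parse_cap_field() is a hypothetical name, and it assumes the ambient set is reported on a "CapAmb:" line in the same hexadecimal format as the existing CapInh/CapPrm/CapEff lines.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Find "<field>:" at the start of a line in status-style text and parse
 * the hexadecimal mask that follows.  Returns 0 on success, -1 if the
 * field is missing or its value cannot be parsed.
 */
static int parse_cap_field(const char *text, const char *field, uint64_t *out)
{
	size_t flen = strlen(field);
	const char *line = text;

	while (line && *line) {
		if (strncmp(line, field, flen) == 0 && line[flen] == ':') {
			const char *val = line + flen + 1;
			char *end;
			unsigned long long v;

			errno = 0;
			v = strtoull(val, &end, 16); /* skips the leading tab */
			if (errno || end == val)
				return -1;
			*out = v;
			return 0;
		}
		/* Advance to the character after the next newline, if any. */
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}
```

A caller would read /proc/self/status into a buffer, call parse_cap_field(buf, "CapAmb", &mask), and test individual bits such as mask & (1ULL << CAP_NET_RAW).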
      
      Signed-off-by: Christoph Lameter <cl@linux.com> # Original author
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
      Acked-by: Kees Cook <keescook@chromium.org>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Aaron Jones <aaronmdjones@gmail.com>
      Cc: Ted Ts'o <tytso@mit.edu>
      Cc: Andrew G. Morgan <morgan@kernel.org>
      Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
      Cc: Austin S Hemmelgarn <ahferroin7@gmail.com>
      Cc: Markku Savela <msa@moth.iki.fi>
      Cc: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: James Morris <james.l.morris@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      58319057
    • A
      kernel/kthread.c:kthread_create_on_node(): clarify documentation · e9f06986
      Andrew Morton committed
      - Make it clear that the `node' arg refers to memory allocations only:
        kthread_create_on_node() does not pin the new thread to that node's
        CPUs.
      
      - Encourage the use of NUMA_NO_NODE.
      
      [nzimmer@sgi.com: use NUMA_NO_NODE in kthread_create() also]
      Cc: Nathan Zimmer <nzimmer@sgi.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      e9f06986
    • Y
      mm: check if section present during memory block registering · 04697858
      Yinghai Lu committed
      Tony Luck found that on his setup, a memory block size of 512M causes a
      crash during boot.
      
        BUG: unable to handle kernel paging request at ffffea0074000020
        IP: get_nid_for_pfn+0x17/0x40
        PGD 128ffcb067 PUD 128ffc9067 PMD 0
        Oops: 0000 [#1] SMP
        Modules linked in:
        CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.2.0-rc8 #1
        ...
        Call Trace:
           ? register_mem_sect_under_node+0x66/0xe0
           register_one_node+0x17b/0x240
           ? pci_iommu_alloc+0x6e/0x6e
           topology_init+0x3c/0x95
           do_one_initcall+0xcd/0x1f0
      
      The system has non-contiguous RAM address ranges:
       BIOS-e820: [mem 0x0000001300000000-0x0000001cffffffff] usable
       BIOS-e820: [mem 0x0000001d70000000-0x0000001ec7ffefff] usable
       BIOS-e820: [mem 0x0000001f00000000-0x0000002bffffffff] usable
       BIOS-e820: [mem 0x0000002c18000000-0x0000002d6fffefff] usable
       BIOS-e820: [mem 0x0000002e00000000-0x00000039ffffffff] usable
      
      So some leading sections of a memory block may not be present.  For example:
      
          memory block : [0x2c18000000, 0x2c20000000) 512M
      
      The first three sections are not present.
      
      The current register_mem_sect_under_node() assumes that the first section
      is present, but the memory block's section number range [start_section_nr,
      end_section_nr] can include sections that are not present.
      
      On architectures that support vmemmap, we don't set up the memmap (the
      struct page area) for sections that are not present.
      
      So skip the pfn ranges that belong to absent sections.
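The skipping logic can be illustrated with a toy userspace model.  This is a sketch under stated assumptions, not the patch itself: the presence array, the tiny PAGES_PER_SECTION value and both function names are invented for the example.

```c
#include <stdbool.h>
#include <stddef.h>

#define PAGES_PER_SECTION 8UL /* toy value chosen for the example */

/* True if the section containing pfn is marked present in the toy map. */
static bool section_present(const bool *present, size_t nr_sections,
			    unsigned long pfn)
{
	unsigned long nr = pfn / PAGES_PER_SECTION;

	return nr < nr_sections && present[nr];
}

/*
 * Walk [start_pfn, end_pfn] and count the pfns that belong to present
 * sections.  An absent section is skipped in one step instead of being
 * touched page by page.
 */
static unsigned long count_present_pfns(const bool *present, size_t nr_sections,
					unsigned long start_pfn,
					unsigned long end_pfn)
{
	unsigned long pfn, count = 0;

	for (pfn = start_pfn; pfn <= end_pfn; pfn++) {
		if (!section_present(present, nr_sections, pfn)) {
			/* Move to the last pfn of this section; pfn++ does the rest. */
			pfn = (pfn / PAGES_PER_SECTION + 1) * PAGES_PER_SECTION - 1;
			continue;
		}
		count++; /* stands in for the per-page work on a valid struct page */
	}
	return count;
}
```

With a block of four sections where only the last is present, the walk visits one pfn in each absent section and then every pfn of the present one, so no struct page in an absent section is ever dereferenced.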
      
      [akpm@linux-foundation.org: simplification]
      [rientjes@google.com: more simplification]
      Fixes: bdee237c ("x86: mm: Use 2GB memory block size on large memory x86-64 systems")
      Fixes: 982792c7 ("x86, mm: probe memory block size for generic x86 64bit")
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: David Rientjes <rientjes@google.com>
      Reported-by: Tony Luck <tony.luck@intel.com>
      Tested-by: Tony Luck <tony.luck@intel.com>
      Cc: Greg KH <greg@kroah.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Tested-by: David Rientjes <rientjes@google.com>
      Cc: <stable@vger.kernel.org>	[3.15+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      04697858
    • R
      ocfs2: direct write will call ocfs2_rw_unlock() twice when doing aio+dio · aa1057b3
      Ryan Ding committed
      ocfs2_file_write_iter() is using the wrong return value ('written').  This
      causes ocfs2_rw_unlock() to be called in both write_iter and end_io,
      triggering a BUG_ON.
      
      This issue was introduced by commit 7da839c4 ("ocfs2: use
      __generic_file_write_iter()").
      
      Orabug: 21612107
      Fixes: 7da839c4 ("ocfs2: use __generic_file_write_iter()")
      Signed-off-by: Ryan Ding <ryan.ding@oracle.com>
      Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      aa1057b3
    • T
      memory-hotplug: add hot-added memory ranges to memblock before allocate node_data for a node. · 7f36e3e5
      Tang Chen committed
      Commit f9126ab9 ("memory-hotplug: fix wrong edge when hot add a new
      node") adds the hot-added memory range to memblock after creating the
      pgdat for the new node.
      
      But there is a problem:
      
        add_memory()
        |--> hotadd_new_pgdat()
             |--> free_area_init_node()
                  |--> get_pfn_range_for_nid()
                       |--> find start_pfn and end_pfn in memblock
        |--> ......
        |--> memblock_add_node(start, size, nid)    --------    Here, just too late.
      
      get_pfn_range_for_nid() will find that start_pfn and end_pfn are both 0.
      As a result, when adding memory, dmesg will give the following wrong
      message.
      
        Initmem setup node 5 [mem 0x0000000000000000-0xffffffffffffffff]
        On node 5 totalpages: 0
        Built 5 zonelists in Node order, mobility grouping on.  Total pages: 32588823
        Policy zone: Normal
        init_memory_mapping: [mem 0x60000000000-0x607ffffffff]
      
      The solution is simple: add the memory range to memblock a little
      earlier, before hotadd_new_pgdat().
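The ordering problem can be reproduced in a toy model.  Everything below is hypothetical (the toy_ prefix marks names invented for this sketch); it only mimics the idea that a per-node pfn range lookup returns 0 and 0 when the node has no ranges registered yet.

```c
#include <stddef.h>

#define TOY_MAX_REGIONS 8

struct toy_region {
	unsigned long start_pfn;
	unsigned long end_pfn;
	int nid;
};

struct toy_memblock {
	struct toy_region regions[TOY_MAX_REGIONS];
	size_t count;
};

/* Register a pfn range for a node; returns -1 if the table is full. */
static int toy_memblock_add_node(struct toy_memblock *mb, unsigned long start_pfn,
				 unsigned long end_pfn, int nid)
{
	if (mb->count == TOY_MAX_REGIONS)
		return -1;
	mb->regions[mb->count].start_pfn = start_pfn;
	mb->regions[mb->count].end_pfn = end_pfn;
	mb->regions[mb->count].nid = nid;
	mb->count++;
	return 0;
}

/* Report the lowest start and highest end registered for nid, or 0 and 0. */
static void toy_get_pfn_range_for_nid(const struct toy_memblock *mb, int nid,
				      unsigned long *start_pfn,
				      unsigned long *end_pfn)
{
	size_t i;
	int found = 0;

	*start_pfn = 0;
	*end_pfn = 0;
	for (i = 0; i < mb->count; i++) {
		const struct toy_region *r = &mb->regions[i];

		if (r->nid != nid)
			continue;
		if (!found || r->start_pfn < *start_pfn)
			*start_pfn = r->start_pfn;
		if (!found || r->end_pfn > *end_pfn)
			*end_pfn = r->end_pfn;
		found = 1;
	}
}
```

Calling the lookup before toy_memblock_add_node() yields an empty range for the new node, while calling it afterwards yields the hot-added range, which is the effect of moving the memblock update ahead of hotadd_new_pgdat().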
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
      Cc: Xishi Qiu <qiuxishi@huawei.com>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Taku Izumi <izumi.taku@jp.fujitsu.com>
      Cc: Gu Zheng <guz.fnst@cn.fujitsu.com>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: David Rientjes <rientjes@google.com>
      Cc: <stable@vger.kernel.org>	[4.2.x]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      7f36e3e5
    • L
      Merge tag 'pinctrl-v4.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl · 88a99886
      Linus Torvalds committed
      Pull pin control updates from Linus Walleij:
       "This is the bulk of pin control changes for the v4.3 development
        cycle.
      
        Like with GPIO it's a lot of stuff.  If my subsystems are any sign of
        the overall tempo of the kernel, v4.3 will be a gigantic diff.
      
      [ It looks like 4.3 is calmer than 4.2 in most other subsystems, but
        we'll see - Linus ]
      
        Core changes:
      
         - It is now possible to configure groups in debugfs.
      
         - Consolidation of chained IRQ handler install/remove replacing all
           call sites where irq_set_handler_data() and
           irq_set_chained_handler() were done in succession with a combined
           call to irq_set_chained_handler_and_data().  This series was
           created by Thomas Gleixner after the problem was observed by
           Russell King.
      
         - Tglx also made another series of patches switching
           __irq_set_handler_locked() for irq_set_handler_locked() which is
           way cleaner.
      
         - Tglx also wrote a good bunch of patches to make use of
           irq_desc_get_xxx() accessors and avoid looking up irq_descs from
           IRQ numbers.  The goal is to get rid of the irq number from the
           handlers in the IRQ flow which is nice.
      
        Driver feature enhancements:
      
         - Power management support for the SiRF SoC Atlas 7.
      
         - Power down support for the Qualcomm driver.
      
         - Intel Cherryview and Baytrail: switch drivers to use raw spinlocks
           in IRQ handlers to play nice with the realtime patch set.
      
         - Rework and new modes handling for Qualcomm SPMI-MPP.
      
         - Pinconf power source config for SH PFC.
      
        New drivers and subdrivers:
      
         - A new driver for Conexant Digicolor CX92755.
      
         - A new driver for UniPhier PH1-LD4, PH1-Pro4, PH1-sLD8, PH1-Pro5,
           ProXtream2 and PH1-LD6b SoC pin control support.
      
         - Reverse-engineered the S/PDIF settings for the Allwinner sun4i
           driver.
      
         - Support for Qualcomm Technologies QDF2xxx ARM64 SoCs
      
         - A new Freescale i.mx6ul subdriver.
      
        Cleanup:
      
         - Remove platform data support in a number of SH PFC subdrivers"
      
      * tag 'pinctrl-v4.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: (95 commits)
        pinctrl: at91: fix null pointer dereference
        pinctrl: mediatek: Implement wake handler and suspend resume
        pinctrl: mediatek: Fix multiple registration issue.
        pinctrl: sh-pfc: r8a7794: add USB pin groups
        pinctrl: at91: Use generic irq_{request,release}_resources()
        pinctrl: cherryview: Use raw_spinlock for locking
        pinctrl: baytrail: Use raw_spinlock for locking
        pinctrl: imx6ul: Remove .owner field
        pinctrl: zynq: Fix typos in smc0_nand_grp and smc0_nor_grp
        pinctrl: sh-pfc: Implement pinconf power-source param for voltage switching
        clk: rockchip: add pclk_pd_pmu to the list of rk3288 critical clocks
        pinctrl: sun4i: add spdif to pin description.
        pinctrl: atlas7: clear ugly branch statements for pull and drivestrength
        pinctrl: baytrail: Serialize all register access
        pinctrl: baytrail: Drop FSF mailing address
        pinctrl: rockchip: only enable gpio clock when it setting
        pinctrl/mediatek: fix spelling mistake in dev_err error message
        pinctrl: cherryview: Serialize all register access
        pinctrl: UniPhier: PH1-Pro5: add I2C ch6 pin-mux setting
        pinctrl: nomadik: reflect current input value
        ...
      88a99886