1. 20 7月, 2016 1 次提交
    • A
      bdev: get rid of ->bd_inodes · a4a4f943
      Al Viro 提交于
      Since 2006 we have ->i_bdev pinning bdev in question, so there's no
      way to get to bdev ->evict_inode() while there's an aliasing inode
      anywhere.  In other words, the only place walking the list of aliases
      is guaranteed to do it only when the list is empty...
      
      Remove the detritus; it should've been done in "[PATCH] Fix a race
      condition between ->i_mapping and iput()", but nobody had noticed it
      back then.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      a4a4f943
  2. 30 6月, 2016 2 次提交
    • M
      vfs: document ->d_real() · e698b8a4
      Miklos Szeredi 提交于
      Add missing documentation for the d_op->d_real() method and d_real()
      helper.
      Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
      e698b8a4
    • M
      vfs: merge .d_select_inode() into .d_real() · 2d902671
      Miklos Szeredi 提交于
      The two methods essentially do the same: find the real dentry/inode
      belonging to an overlay dentry.  The difference is in the usage:
      
      vfs_open() uses ->d_select_inode() and expects the function to perform
      copy-up if necessary based on the open flags argument.
      
      file_dentry() uses ->d_real() passing in the overlay dentry as well as the
      underlying inode.
      
      vfs_rename() uses ->d_select_inode() but passes zero flags.  ->d_real()
      with a zero inode would have worked just as well here.
      
      This patch merges the functionality of ->d_select_inode() into ->d_real()
      by adding an 'open_flags' argument to the latter.
      
      [Al Viro] Make the signature of d_real() match that of ->d_real() again.
      And constify the inode argument, while we are at it.
      Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
      2d902671
  3. 25 6月, 2016 4 次提交
    • K
      Revert "mm: make faultaround produce old ptes" · 315d09bf
      Kirill A. Shutemov 提交于
      This reverts commit 5c0a85fa.
      
      The commit causes ~6% regression in unixbench.
      
      Let's revert it for now and consider other solution for reclaim problem
      later.
      
      Link: http://lkml.kernel.org/r/1465893750-44080-2-git-send-email-kirill.shutemov@linux.intel.comSigned-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Reported-by: N"Huang, Ying" <ying.huang@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Vinayak Menon <vinmenon@codeaurora.org>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      315d09bf
    • A
      mm: mempool: kasan: don't poot mempool objects in quarantine · 9b75a867
      Andrey Ryabinin 提交于
      Currently we may put reserved by mempool elements into quarantine via
      kasan_kfree().  This is totally wrong since quarantine may really free
      these objects.  So when mempool will try to use such element,
      use-after-free will happen.  Or mempool may decide that it no longer
      need that element and double-free it.
      
      So don't put object into quarantine in kasan_kfree(), just poison it.
      Rename kasan_kfree() to kasan_poison_kfree() to respect that.
      
      Also, we shouldn't use kasan_slab_alloc()/kasan_krealloc() in
      kasan_unpoison_element() because those functions may update allocation
      stacktrace.  This would be wrong for the most of the remove_element call
      sites.
      
      (The only call site where we may want to update alloc stacktrace is
       in mempool_alloc(). Kmemleak solves this by calling
       kmemleak_update_trace(), so we could make something like that too.
       But this is out of scope of this patch).
      
      Fixes: 55834c59 ("mm: kasan: initial memory quarantine implementation")
      Link: http://lkml.kernel.org/r/575977C3.1010905@virtuozzo.comSigned-off-by: NAndrey Ryabinin <aryabinin@virtuozzo.com>
      Reported-by: NKuthonuzo Luruo <kuthonuzo.luruo@hpe.com>
      Acked-by: NAlexander Potapenko <glider@google.com>
      Cc: Dmitriy Vyukov <dvyukov@google.com>
      Cc: Kostya Serebryany <kcc@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      9b75a867
    • L
      fix up initial thread stack pointer vs thread_info confusion · 7f1a00b6
      Linus Torvalds 提交于
      The INIT_TASK() initializer was similarly confused about the stack vs
      thread_info allocation that the allocators had, and that were fixed in
      commit b235beea ("Clarify naming of thread info/stack allocators").
      
      The task ->stack pointer only incidentally ends up having the same value
      as the thread_info, and in fact that will change.
      
      So fix the initial task struct initializer to point to 'init_stack'
      instead of 'init_thread_info', and make sure the ia64 definition for
      that exists.
      
      This actually makes the ia64 tsk->stack pointer be sensible for the
      initial task, but not for any other task.  As mentioned in commit
      b235beea, that whole pointer isn't actually used on ia64, since
      task_stack_page() there just points to the (single) allocation.
      
      All the other architectures seem to have copied the 'init_stack'
      definition, even if it tended to be generally unusued.
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      7f1a00b6
    • L
      Clarify naming of thread info/stack allocators · b235beea
      Linus Torvalds 提交于
      We've had the thread info allocated together with the thread stack for
      most architectures for a long time (since the thread_info was split off
      from the task struct), but that is about to change.
      
      But the patches that move the thread info to be off-stack (and a part of
      the task struct instead) made it clear how confused the allocator and
      freeing functions are.
      
      Because the common case was that we share an allocation with the thread
      stack and the thread_info, the two pointers were identical.  That
      identity then meant that we would have things like
      
      	ti = alloc_thread_info_node(tsk, node);
      	...
      	tsk->stack = ti;
      
      which certainly _worked_ (since stack and thread_info have the same
      value), but is rather confusing: why are we assigning a thread_info to
      the stack? And if we move the thread_info away, the "confusing" code
      just gets to be entirely bogus.
      
      So remove all this confusion, and make it clear that we are doing the
      stack allocation by renaming and clarifying the function names to be
      about the stack.  The fact that the thread_info then shares the
      allocation is an implementation detail, and not really about the
      allocation itself.
      
      This is a pure renaming and type fix: we pass in the same pointer, it's
      just that we clarify what the pointer means.
      
      The ia64 code that actually only has one single allocation (for all of
      task_struct, thread_info and kernel thread stack) now looks a bit odd,
      but since "tsk->stack" is actually not even used there, that oddity
      doesn't matter.  It would be a separate thing to clean that up, I
      intentionally left the ia64 changes as a pure brute-force renaming and
      type change.
      Acked-by: NAndy Lutomirski <luto@amacapital.net>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b235beea
  4. 24 6月, 2016 1 次提交
    • P
      locking/static_key: Fix concurrent static_key_slow_inc() · 4c5ea0a9
      Paolo Bonzini 提交于
      The following scenario is possible:
      
          CPU 1                                   CPU 2
          static_key_slow_inc()
           atomic_inc_not_zero()
            -> key.enabled == 0, no increment
           jump_label_lock()
           atomic_inc_return()
            -> key.enabled == 1 now
                                                  static_key_slow_inc()
                                                   atomic_inc_not_zero()
                                                    -> key.enabled == 1, inc to 2
                                                   return
                                                  ** static key is wrong!
           jump_label_update()
           jump_label_unlock()
      
      Testing the static key at the point marked by (**) will follow the
      wrong path for jumps that have not been patched yet.  This can
      actually happen when creating many KVM virtual machines with userspace
      LAPIC emulation; just run several copies of the following program:
      
          #include <fcntl.h>
          #include <unistd.h>
          #include <sys/ioctl.h>
          #include <linux/kvm.h>
      
          int main(void)
          {
              for (;;) {
                  int kvmfd = open("/dev/kvm", O_RDONLY);
                  int vmfd = ioctl(kvmfd, KVM_CREATE_VM, 0);
                  close(ioctl(vmfd, KVM_CREATE_VCPU, 1));
                  close(vmfd);
                  close(kvmfd);
              }
              return 0;
          }
      
      Every KVM_CREATE_VCPU ioctl will attempt a static_key_slow_inc() call.
      The static key's purpose is to skip NULL pointer checks and indeed one
      of the processes eventually dereferences NULL.
      
      As explained in the commit that introduced the bug:
      
        706249c2 ("locking/static_keys: Rework update logic")
      
      jump_label_update() needs key.enabled to be true.  The solution adopted
      here is to temporarily make key.enabled == -1, and use go down the
      slow path when key.enabled <= 0.
      Reported-by: NDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: <stable@vger.kernel.org> # v4.3+
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: 706249c2 ("locking/static_keys: Rework update logic")
      Link: http://lkml.kernel.org/r/1466527937-69798-1-git-send-email-pbonzini@redhat.com
      [ Small stylistic edits to the changelog and the code. ]
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      4c5ea0a9
  5. 23 6月, 2016 2 次提交
  6. 18 6月, 2016 2 次提交
  7. 16 6月, 2016 1 次提交
  8. 15 6月, 2016 2 次提交
  9. 10 6月, 2016 8 次提交
  10. 08 6月, 2016 6 次提交
    • R
      drivers: of: Fix of_pci.h header guard · 5c1d3310
      Robin Murphy 提交于
      The compilation of of_pci.c is governed by CONFIG_OF_PCI, but the
      corresponding declarations in of_pci.h are inconsistently guarded by
      CONFIG_OF, with the result that if CONFIG_PCI is disabled for an OF
      platform, the dangling external declarations are still active and the
      inline stub definitions not. So far this has managed to go unnoticed
      since it happens that the only references to these functions are from
      code which itself depends on CONFIG_PCI or CONFIG_OF_PCI.
      
      Fix this with the appropriate config guard so that any new callers
      outside PCI-specific code don't start unexpectedly breaking under
      certain configs.
      Signed-off-by: NRobin Murphy <robin.murphy@arm.com>
      Signed-off-by: NRob Herring <robh@kernel.org>
      5c1d3310
    • P
      locking/qspinlock: Fix spin_unlock_wait() some more · 2c610022
      Peter Zijlstra 提交于
      While this prior commit:
      
        54cf809b ("locking,qspinlock: Fix spin_is_locked() and spin_unlock_wait()")
      
      ... fixes spin_is_locked() and spin_unlock_wait() for the usage
      in ipc/sem and netfilter, it does not in fact work right for the
      usage in task_work and futex.
      
      So while the 2 locks crossed problem:
      
      	spin_lock(A)		spin_lock(B)
      	if (!spin_is_locked(B)) spin_unlock_wait(A)
      	  foo()			foo();
      
      ... works with the smp_mb() injected by both spin_is_locked() and
      spin_unlock_wait(), this is not sufficient for:
      
      	flag = 1;
      	smp_mb();		spin_lock()
      	spin_unlock_wait()	if (!flag)
      				  // add to lockless list
      	// iterate lockless list
      
      ... because in this scenario, the store from spin_lock() can be delayed
      past the load of flag, uncrossing the variables and loosing the
      guarantee.
      
      This patch reworks spin_is_locked() and spin_unlock_wait() to work in
      both cases by exploiting the observation that while the lock byte
      store can be delayed, the contender must have registered itself
      visibly in other state contained in the word.
      
      It also allows for architectures to override both functions, as PPC
      and ARM64 have an additional issue for which we currently have no
      generic solution.
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Boqun Feng <boqun.feng@gmail.com>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Giovanni Gherdovich <ggherdovich@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Waiman Long <waiman.long@hpe.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: stable@vger.kernel.org # v4.2 and later
      Fixes: 54cf809b ("locking,qspinlock: Fix spin_is_locked() and spin_unlock_wait()")
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      2c610022
    • T
      leds: core: Fix brightness setting upon hardware blinking enabled · 7cfe749f
      Tony Makkiel 提交于
      Commit 76931edd ("leds: fix brightness changing when software blinking
      is active") changed the semantics of led_set_brightness() which according
      to the documentation should disable blinking upon any brightness setting.
      Moreover it made it different for soft blink case, where it was possible
      to change blink brightness, and for hardware blink case, where setting
      any brightness greater than 0 was ignored.
      
      While the change itself is against the documentation claims, it was driven
      also by the fact that timer trigger remained active after turning blinking
      off. Fixing that would have required major refactoring in the led-core,
      led-class, and led-triggers because of cyclic dependencies.
      
      Finally, it has been decided that allowing for brightness change during
      blinking is beneficial as it can be accomplished without disturbing
      blink rhythm.
      
      The change in brightness setting semantics will not affect existing
      LED class drivers that implement blink_set op thanks to the LED_BLINK_SW
      flag introduced by this patch. The flag state will be from now on checked
      in led_set_brightness() which will allow to distinguish between software
      and hardware blink mode. In the latter case the control will be passed
      directly to the drivers which apply their semantics on brightness set,
      which is disable the blinking in case of most such drivers. New drivers
      will apply new semantics and just change the brightness while hardware
      blinking is on, if possible.
      
      The issue was smuggled by subsequent LED core improvements, which modified
      the code that originally introduced the problem.
      
      Fixes: f1e80c07 ("leds: core: Add two new LED_BLINK_ flags")
      Signed-off-by: NTony Makkiel <tony.makkiel@daqri.com>
      Signed-off-by: NJacek Anaszewski <j.anaszewski@samsung.com>
      7cfe749f
    • M
      coredump: fix dumping through pipes · 1607f09c
      Mateusz Guzik 提交于
      The offset in the core file used to be tracked with ->written field of
      the coredump_params structure. The field was retired in favour of
      file->f_pos.
      
      However, ->f_pos is not maintained for pipes which leads to breakage.
      
      Restore explicit tracking of the offset in coredump_params. Introduce
      ->pos field for this purpose since ->written was already reused.
      
      Fixes: a0083939 ("get rid of coredump_params->written").
      Reported-by: NZbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl>
      Signed-off-by: NMateusz Guzik <mguzik@redhat.com>
      Reviewed-by: NOmar Sandoval <osandov@fb.com>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      1607f09c
    • D
      net: sched: fix tc_should_offload for specific clsact classes · 92c075db
      Daniel Borkmann 提交于
      When offloading classifiers such as u32 or flower to hardware, and the
      qdisc is clsact (TC_H_CLSACT), then we need to differentiate its classes,
      since not all of them handle ingress, therefore we must leave those in
      software path. Add a .tcf_cl_offload() callback, so we can generically
      handle them, tested on ixgbe.
      
      Fixes: 10cbc684 ("net/sched: cls_flower: Hardware offloaded filters statistics support")
      Fixes: 5b33f488 ("net/flower: Introduce hardware offload support")
      Fixes: a1b7c5fd ("net: sched: add cls_u32 offload hooks for netdevs")
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NJohn Fastabend <john.r.fastabend@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      92c075db
    • C
      gtp: #define _UAPI_LINUX_GTP_H_ and not _UAPI_LINUX_GTP_H__ · 7b01b8e8
      Colin Ian King 提交于
      Fix clang build warning:
      
      ./include/uapi/linux/gtp.h:1:9: warning: '_UAPI_LINUX_GTP_H_' is
      used as a header guard here, followed by #define of a different
      macro [-Wheader-guard]
      
      fix by defining  _UAPI_LINUX_GTP_H_ and not _UAPI_LINUX_GTP_H__
      Signed-off-by: NColin Ian King <colin.king@canonical.com>
      Acked-by: NPablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7b01b8e8
  11. 07 6月, 2016 4 次提交
  12. 06 6月, 2016 2 次提交
    • M
      ipvs: update real-server binding of outgoing connections in SIP-pe · 3ec10d3a
      Marco Angaroni 提交于
      Previous patch that introduced handling of outgoing packets in SIP
      persistent-engine did not call ip_vs_check_template() in case packet was
      matching a connection template. Assumption was that real-server was
      healthy, since it was sending a packet just in that moment.
      
      There are however real-server fault conditions requiring that association
      between call-id and real-server (represented by connection template)
      gets updated. Here is an example of the sequence of events:
        1) RS1 is a back2back user agent that handled call-id1 and call-id2
        2) RS1 is down and was marked as unavailable
        3) new message from outside comes to IPVS with call-id1
        4) IPVS reschedules the message to RS2, which becomes new call handler
        5) RS2 forwards the message outside, translating call-id1 to call-id2
        6) inside pe->conn_out() IPVS matches call-id2 with existing template
        7) IPVS does not change association call-id2 <-> RS1
        8) new message comes from client with call-id2
        9) IPVS reschedules the message to a real-server potentially different
           from RS2, which is now the correct destination
      
      This patch introduces ip_vs_check_template() call in the handling of
      outgoing packets for SIP-pe. And also introduces a second optional
      argument for ip_vs_check_template() that allows to check if dest
      associated to a connection template is the same dest that was identified
      as the source of the packet. This is to change the real-server bound to a
      particular call-id independently from its availability status: the idea
      is that it's more reliable, for in->out direction (where internal
      network can be considered trusted), to always associate a call-id with
      the last real-server that used it in one of its messages. Think about
      above sequence of events where, just after step 5, RS1 returns instead
      to be available.
      
      Comparison of dests is done by simply comparing pointers to struct
      ip_vs_dest; there should be no cases where struct ip_vs_dest keeps its
      memory address, but represent a different real-server in terms of
      ip-address / port.
      
      Fixes: 39b97223 ("ipvs: handle connections started by real-servers")
      Signed-off-by: NMarco Angaroni <marcoangaroni@gmail.com>
      Acked-by: NJulian Anastasov <ja@ssi.bg>
      Signed-off-by: NSimon Horman <horms@verge.net.au>
      3ec10d3a
    • E
      devpts: Make each mount of devpts an independent filesystem. · eedf265a
      Eric W. Biederman 提交于
      The /dev/ptmx device node is changed to lookup the directory entry "pts"
      in the same directory as the /dev/ptmx device node was opened in.  If
      there is a "pts" entry and that entry is a devpts filesystem /dev/ptmx
      uses that filesystem.  Otherwise the open of /dev/ptmx fails.
      
      The DEVPTS_MULTIPLE_INSTANCES configuration option is removed, so that
      userspace can now safely depend on each mount of devpts creating a new
      instance of the filesystem.
      
      Each mount of devpts is now a separate and equal filesystem.
      
      Reserved ttys are now available to all instances of devpts where the
      mounter is in the initial mount namespace.
      
      A new vfs helper path_pts is introduced that finds a directory entry
      named "pts" in the directory of the passed in path, and changes the
      passed in path to point to it.  The helper path_pts uses a function
      path_parent_directory that was factored out of follow_dotdot.
      
      In the implementation of devpts:
       - devpts_mnt is killed as it is no longer meaningful if all mounts of
         devpts are equal.
       - pts_sb_from_inode is replaced by just inode->i_sb as all cached
         inodes in the tty layer are now from the devpts filesystem.
       - devpts_add_ref is rolled into the new function devpts_ptmx.  And the
         unnecessary inode hold is removed.
       - devpts_del_ref is renamed devpts_release and reduced to just a
         deacrivate_super.
       - The newinstance mount option continues to be accepted but is now
         ignored.
      
      In devpts_fs.h definitions for when !CONFIG_UNIX98_PTYS are removed as
      they are never used.
      
      Documentation/filesystems/devices.txt is updated to describe the current
      situation.
      
      This has been verified to work properly on openwrt-15.05, centos5,
      centos6, centos7, debian-6.0.2, debian-7.9, debian-8.2, ubuntu-14.04.3,
      ubuntu-15.10, fedora23, magia-5, mint-17.3, opensuse-42.1,
      slackware-14.1, gentoo-20151225 (13.0?), archlinux-2015-12-01.  With the
      caveat that on centos6 and on slackware-14.1 that there wind up being
      two instances of the devpts filesystem mounted on /dev/pts, the lower
      copy does not end up getting used.
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Greg KH <greg@kroah.com>
      Cc: Peter Hurley <peter@hurleysoftware.com>
      Cc: Peter Anvin <hpa@zytor.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Serge Hallyn <serge.hallyn@ubuntu.com>
      Cc: Willy Tarreau <w@1wt.eu>
      Cc: Aurelien Jarno <aurelien@aurel32.net>
      Cc: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>
      Cc: Jann Horn <jann@thejh.net>
      Cc: Jiri Slaby <jslaby@suse.com>
      Cc: Florian Weimer <fw@deneb.enyo.de>
      Cc: Konstantin Khlebnikov <koct9i@gmail.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      eedf265a
  13. 04 6月, 2016 1 次提交
  14. 03 6月, 2016 4 次提交
    • K
      of: add missing const for of_parse_phandle_with_args() in !CONFIG_OF · e93aeeae
      Kuninori Morimoto 提交于
      commit 93c667ca
      ("of: *node argument to of_parse_phandle_with_args should be const")
      changed to const for struct device node *np,
      but it cares CONFIG_OF case only, !CONFIG_OF case need it too.
      Signed-off-by: NKuninori Morimoto <kuninori.morimoto.gx@renesas.com>
      Signed-off-by: NRob Herring <robh@kernel.org>
      e93aeeae
    • V
      efi: Fix for_each_efi_memory_desc_in_map() for empty memmaps · 55f1ea15
      Vitaly Kuznetsov 提交于
      Commit:
      
        78ce248f ("efi: Iterate over efi.memmap in for_each_efi_memory_desc()")
      
      introduced a regression for systems booted with the 'noefi' kernel option.
      
      In particular, I observed an early kernel hang in efi_find_mirror()'s
      for_each_efi_memory_desc() call. As we don't have efi memmap on this
      system we enter this iterator with the following parameters:
      
        efi.memmap.map = 0, efi.memmap.map_end = 0, efi.memmap.desc_size = 28
      
      ... then for_each_efi_memory_desc_in_map() does the following comparison:
      
        (md) <= (efi_memory_desc_t *)((m)->map_end - (m)->desc_size);
      
      ... where md = 0, (m)->map_end = 0 and (m)->desc_size = 28 but when we subtract
      something from a NULL pointer wrap around happens and we end up returning
      invalid pointer and crash.
      
      Fix it by using the correct pointer arithmetics.
      Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: NMatt Fleming <matt@codeblueprint.co.uk>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: K. Y. Srinivasan <kys@microsoft.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-efi@vger.kernel.org
      Fixes: 78ce248f ("efi: Iterate over efi.memmap in for_each_efi_memory_desc()")
      Link: http://lkml.kernel.org/r/1464690224-4503-2-git-send-email-matt@codeblueprint.co.uk
      [ Made the changelog more readable. ]
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      55f1ea15
    • P
      locking/seqcount: Re-fix raw_read_seqcount_latch() · 55eed755
      Peter Zijlstra 提交于
      Commit 50755bc1 ("seqlock: fix raw_read_seqcount_latch()") broke
      raw_read_seqcount_latch().
      
      If you look at the comment that was modified; the thing that changes is
      the seq count, not the latch pointer.
      
       * void latch_modify(struct latch_struct *latch, ...)
       * {
       *	smp_wmb();	<- Ensure that the last data[1] update is visible
       *	latch->seq++;
       *	smp_wmb();	<- Ensure that the seqcount update is visible
       *
       *	modify(latch->data[0], ...);
       *
       *	smp_wmb();	<- Ensure that the data[0] update is visible
       *	latch->seq++;
       *	smp_wmb();	<- Ensure that the seqcount update is visible
       *
       *	modify(latch->data[1], ...);
       * }
       *
       * The query will have a form like:
       *
       * struct entry *latch_query(struct latch_struct *latch, ...)
       * {
       *	struct entry *entry;
       *	unsigned seq, idx;
       *
       *	do {
       *		seq = lockless_dereference(latch->seq);
      
      So here we have:
      
      		seq = READ_ONCE(latch->seq);
      		smp_read_barrier_depends();
      
      Which is exactly what we want; the new code:
      
      		seq = ({ p = READ_ONCE(latch);
      			 smp_read_barrier_depends(); p })->seq;
      
      is just wrong; because it looses the volatile read on seq, which can now
      be torn or worse 'optimized'. And the read_depend barrier is also placed
      wrong, we want it after the load of seq, to match the above data[]
      up-to-date wmb()s.
      
      Such that when we dereference latch->data[] below, we're guaranteed to
      observe the right data.
      
       *
       *		idx = seq & 0x01;
       *		entry = data_query(latch->data[idx], ...);
       *
       *		smp_rmb();
       *	} while (seq != latch->seq);
       *
       *	return entry;
       * }
      
      So yes, not passing a pointer is not pretty, but the code was correct,
      and isn't anymore now.
      
      Change to explicit READ_ONCE()+smp_read_barrier_depends() to avoid
      confusion and allow strict lockless_dereference() checking.
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: 50755bc1 ("seqlock: fix raw_read_seqcount_latch()")
      Link: http://lkml.kernel.org/r/20160527111117.GL3192@twins.programming.kicks-ass.netSigned-off-by: NIngo Molnar <mingo@kernel.org>
      55eed755
    • C
      cpuidle: Do not access cpuidle_devices when !CONFIG_CPU_IDLE · 9bd616e3
      Catalin Marinas 提交于
      The cpuidle_devices per-CPU variable is only defined when CPU_IDLE is
      enabled. Commit c8cc7d4d ("sched/idle: Reorganize the idle loop")
      removed the #ifdef CONFIG_CPU_IDLE around cpuidle_idle_call() with the
      compiler optimising away __this_cpu_read(cpuidle_devices). However, with
      CONFIG_UBSAN && !CONFIG_CPU_IDLE, this optimisation no longer happens
      and the kernel fails to link since cpuidle_devices is not defined.
      
      This patch introduces an accessor function for the current CPU cpuidle
      device (returning NULL when !CONFIG_CPU_IDLE) and uses it in
      cpuidle_idle_call().
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      Cc: 4.5+ <stable@vger.kernel.org> # 4.5+
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      9bd616e3