1. 01 Feb, 2018 9 commits
  2. 31 Jan, 2018 7 commits
  3. 29 Jan, 2018 7 commits
  4. 28 Jan, 2018 1 commit
  5. 27 Jan, 2018 8 commits
    • hrtimer: Reset hrtimer cpu base proper on CPU hotplug · d5421ea4
      Thomas Gleixner committed
      The hrtimer interrupt code contains a hang detection and mitigation
      mechanism, which prevents that a long delayed hrtimer interrupt causes a
      continuous retriggering of interrupts which prevent the system from making
      progress. If a hang is detected then the timer hardware is programmed with
      a certain delay into the future and a flag is set in the hrtimer cpu base
      which prevents newly enqueued timers from reprogramming the timer hardware
      prior to the chosen delay. The subsequent hrtimer interrupt after the delay
      clears the flag and resumes normal operation.
      
      If such a hang happens in the last hrtimer interrupt before a CPU is
      unplugged then the hang_detected flag is set and stays that way when the
      CPU is plugged in again. At that point the timer hardware is not armed and
      it cannot be armed because the hang_detected flag is still active, so
      nothing clears that flag. As a consequence the CPU does not receive hrtimer
      interrupts and no timers expire on that CPU which results in RCU stalls and
      other malfunctions.
      
      Clear the flag along with some other less critical members of the hrtimer
      cpu base to ensure starting from a clean state when a CPU is plugged in.
      
      Thanks to Paul, Sebastian and Anna-Maria for their help to get down to the
      root cause of that hard to reproduce heisenbug. Once understood it's
      trivial and certainly justifies a brown paperbag.
      
      Fixes: 41d2e494 ("hrtimer: Tune hrtimer_interrupt hang logic")
      Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sebastian Siewior <bigeasy@linutronix.de>
      Cc: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801261447590.2067@nanos
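The failure mode described in the hrtimer commit above can be illustrated with a toy model. This is only a sketch under stated assumptions: the class `CpuTimerBase`, its methods, and the `reset_state` switch are hypothetical names invented for illustration and are not the kernel's data structures or functions. Only the `hang_detected` flag name comes from the commit message.

```python
class CpuTimerBase:
    """Toy model of a per-CPU hrtimer base (hypothetical, not kernel code)."""

    def __init__(self):
        self.hang_detected = False  # set when the interrupt handler detects a hang
        self.hw_armed = False       # True while the timer hardware has an event programmed
        self.queue = []             # pending expiry times

    def enqueue(self, expires):
        self.queue.append(expires)
        # While hang_detected is set, newly enqueued timers must not
        # reprogram the timer hardware.
        if not self.hang_detected:
            self.hw_armed = True

    def interrupt(self, hang=False):
        self.hw_armed = False  # the programmed event has fired
        if hang:
            # Mitigation: program a delayed retry and block reprogramming.
            self.hang_detected = True
            self.hw_armed = True
            return
        # Normal path: the flag is cleared and expired timers are run.
        self.hang_detected = False
        self.queue.clear()

    def unplug(self):
        # The CPU goes offline: the hardware is no longer armed,
        # but the software state survives.
        self.hw_armed = False

    def plug(self, reset_state):
        # reset_state=True models the fix: start from a clean state.
        if reset_state:
            self.hang_detected = False


def timers_can_fire_after_replug(reset_state):
    base = CpuTimerBase()
    base.enqueue(100)
    base.interrupt(hang=True)  # hang in the last interrupt before unplug
    base.unplug()
    base.plug(reset_state)
    base.enqueue(200)          # a new timer after the CPU is back
    return base.hw_armed
```

In this model, `timers_can_fire_after_replug(False)` leaves the hardware unarmed because the stale flag blocks reprogramming and no interrupt will ever arrive to clear it, while `timers_can_fire_after_replug(True)` arms it again.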
    • x86: Mark hpa as a "Designated Reviewer" for the time being · 8a95b74d
      H. Peter Anvin committed
      Due to some unfortunate events, I have not been directly involved in
      the x86 kernel patch flow for a while now.  I have also not been able
      to ramp back up by now like I had hoped to, and after reviewing what I
      will need to work on both internally at Intel and elsewhere in the near
      term, it is clear that I am not going to be able to ramp back up until
      late 2018 at the very earliest.
      
      It is not acceptable to not recognize that this load is currently
      taken by Ingo and Thomas without my direct participation, so I mark
      myself as R: (designated reviewer) rather than M: (maintainer) until
      further notice.  This is in fact recognizing the de facto situation
      for the past few years.
      
      I have obviously no intention of going away, and I will do everything
      within my power to improve Linux on x86 and x86 for Linux.  This,
      however, puts credit where it is due and reflects a change of focus.
      
      This patch also removes stale entries for portions of the x86
      architecture which have not been maintained separately from arch/x86
      for a long time.  If there is a reason to re-introduce them then that
      can happen later.
      Signed-off-by: H. Peter Anvin <h.peter.anvin@intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Bruce Schlobohm <bruce.schlobohm@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20180125195934.5253-1-hpa@zytor.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • Merge tag 'riscv-for-linus-4.15-maintainers' of... · c4e0ca7f
      Linus Torvalds committed
      Merge tag 'riscv-for-linus-4.15-maintainers' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux
      
      Pull RISC-V update from Palmer Dabbelt:
       "RISC-V: We have a new mailing list and git repo!
      
        Sorry to send something essentially as late as possible (Friday after
        an rc9), but we managed to get a mailing list for the RISC-V Linux
        port. We've been using patches@groups.riscv.org for a while, but that
        list has some problems (it's Google Groups and it's shared over all
        RISC-V software projects). The new infradead.org list is much better.
        We just got it on Wednesday but I used it a bit on Thursday to shake
        out all the configuration problems and it appears to be in working
        order.
      
        When I updated the mailing list I noticed that the MAINTAINERS file
        was pointing to our github repo, but now that we have a kernel.org
        repo I'd like to point to that instead so I changed that as well.
        We'll be centralizing all RISC-V Linux related development here as
        that seems to be the saner way to go about it.
      
        I can understand if it's too late to get this into 4.15, but given
        that it's not a code change I was hoping it'd still be OK. It would be
        nice to have the new mailing list and git repo in the release tarballs
        so when people start to find bugs they'll get to the right place"
      
      * tag 'riscv-for-linus-4.15-maintainers' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux:
        Update the RISC-V MAINTAINERS file
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · ba804bb4
      Linus Torvalds committed
      Pull networking fixes from David Miller:
      
       1) The per-network-namespace loopback device, and thus its namespace,
          can have its teardown deferred for a long time if a kernel created
          TCP socket closes and the namespace is exiting meanwhile. The kernel
          keeps trying to finish the close sequence until it times out (which
          takes quite some time).
      
          Fix this by forcing the socket closed in this situation, from Dan
          Streetman.
      
       2) Fix regression where we're trying to invoke the update_pmtu method
          on route types (in this case metadata tunnel routes) that don't
          implement the dst_ops method. Fix from Nicolas Dichtel.
      
       3) Fix long standing memory corruption issues in r8169 driver by
          performing the chip statistics DMA programming more correctly. From
          Francois Romieu.
      
       4) Handle local broadcast sends over VRF routes properly, from David
          Ahern.
      
       5) Don't refire the DCCP CCID2 timer endlessly, otherwise the socket
          can never be released. From Alexey Kodanev.
      
       6) Set poll flags properly in VSOCK protocol layer, from Stefan
          Hajnoczi.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
        VSOCK: set POLLOUT | POLLWRNORM for TCP_CLOSING
        dccp: don't restart ccid2_hc_tx_rto_expire() if sk in closed state
        net: vrf: Add support for sends to local broadcast address
        r8169: fix memory corruption on retrieval of hardware statistics.
        net: don't call update_pmtu unconditionally
        net: tcp: close sock if net namespace is exiting
    • Merge tag 'drm-fixes-for-v4.15-rc10-2' of git://people.freedesktop.org/~airlied/linux · db218549
      Linus Torvalds committed
      Pull drm fixes from Dave Airlie:
       "A fairly urgent nouveau regression fix for broken irqs across
        suspend/resume came in. This was broken before but a patch in 4.15 has
        made it much more obviously broken and now s/r fails a lot more often.
      
        The fix removes freeing the irq across s/r which never should have
        been done anyways.
      
        Also two vc4 fixes for a NULL dereference and some misrendering /
        flickering on screen"
      
      * tag 'drm-fixes-for-v4.15-rc10-2' of git://people.freedesktop.org/~airlied/linux:
        drm/nouveau: Move irq setup/teardown to pci ctor/dtor
        drm/vc4: Fix NULL pointer dereference in vc4_save_hang_state()
        drm/vc4: Flush the caches before the bin jobs, as well.
    • VSOCK: set POLLOUT | POLLWRNORM for TCP_CLOSING · ba3169fc
      Stefan Hajnoczi committed
      select(2) with wfds but no rfds must return when the socket is shut down
      by the peer.  This way userspace notices socket activity and gets -EPIPE
      from the next write(2).
      
      Currently select(2) does not return for virtio-vsock when a SEND+RCV
      shutdown packet is received.  This is because vsock_poll() only sets
      POLLOUT | POLLWRNORM for TCP_CLOSE, not the TCP_CLOSING state that the
      socket is in when the shutdown is received.
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
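The poll-mask change in the VSOCK commit above can be sketched as follows. This is a simplified model and not the body of vsock_poll(): the function `write_mask` and its `fixed` parameter are hypothetical, and only the shutdown branch is modelled. The numeric values are assumed to be the Linux ones for x86 (`POLLOUT`, `POLLWRNORM`) and the kernel's TCP state enumeration (`TCP_CLOSE`, `TCP_CLOSING`).

```python
# Assumed Linux values; the helper below is illustrative only.
POLLOUT = 0x004
POLLWRNORM = 0x100

TCP_ESTABLISHED = 1
TCP_CLOSE = 7
TCP_CLOSING = 11


def write_mask(sk_state, fixed):
    """Return the write-related poll bits reported for a shut-down socket.

    fixed=False models the old behaviour (only TCP_CLOSE is treated as
    closed); fixed=True models the behaviour after the commit.
    """
    closed_states = {TCP_CLOSE, TCP_CLOSING} if fixed else {TCP_CLOSE}
    if sk_state in closed_states:
        # Report the socket as writable so that select(2) with only wfds
        # returns and the next write(2) can fail with -EPIPE.
        return POLLOUT | POLLWRNORM
    return 0
```

With `fixed=False`, a socket in `TCP_CLOSING` yields an empty mask, which is why a select(2) call watching only write descriptors never returned.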
    • dccp: don't restart ccid2_hc_tx_rto_expire() if sk in closed state · dd5684ec
      Alexey Kodanev committed
      ccid2_hc_tx_rto_expire() timer callback always restarts the timer
      again and can run indefinitely (unless it is stopped outside), and after
      commit 120e9dab ("dccp: defer ccid_hc_tx_delete() at dismantle time"),
      which moved ccid_hc_tx_delete() (also includes sk_stop_timer()) from
      dccp_destroy_sock() to sk_destruct(), this started to happen quite often.
      The timer prevents releasing the socket, as a result, sk_destruct() won't
      be called.
      
      Found with LTP/dccp_ipsec tests running on the bonding device,
      which later couldn't be unloaded after the tests were completed:
      
        unregister_netdevice: waiting for bond0 to become free. Usage count = 148
      
      Fixes: 2a91aa39 ("[DCCP] CCID2: Initial CCID2 (TCP-Like) implementation")
      Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
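Why a self-restarting timer keeps a socket alive can be shown with a small reference-counting model. It is a sketch under assumptions, not the DCCP code: `Sock`, `rto_expire`, `refs_after_close` and the `check_closed` switch are hypothetical names, and the rule that a pending timer holds one reference on the socket is a simplification.

```python
class Sock:
    """Toy socket with a reference count and one timer (hypothetical)."""

    def __init__(self):
        self.closed = False
        self.refcount = 1          # reference held by the owner
        self.timer_pending = False

    def start_timer(self):
        if not self.timer_pending:
            self.timer_pending = True
            self.refcount += 1     # a pending timer holds a reference

    def timer_fired(self):
        self.timer_pending = False
        self.refcount -= 1         # dropped when the timer fires


def rto_expire(sk, check_closed):
    """Timer callback. check_closed=True models the fix."""
    sk.timer_fired()
    if check_closed and sk.closed:
        return                     # do not restart on a closed socket
    sk.start_timer()               # old behaviour: always restart


def refs_after_close(check_closed, ticks=3):
    sk = Sock()
    sk.start_timer()
    sk.closed = True
    sk.refcount -= 1               # the owner drops its reference
    for _ in range(ticks):
        if sk.timer_pending:
            rto_expire(sk, check_closed)
    return sk.refcount
```

In this model the count never reaches zero without the closed-state check, which mirrors the reported symptom of the device usage count staying above zero.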
    • Update the RISC-V MAINTAINERS file · 6572cc2b
      Palmer Dabbelt committed
      Now that we're upstream in Linux we've been able to make some
      infrastructure changes so our port works a bit more like other ports.
      Specifically:
      
      * We now have a mailing list specific to the RISC-V Linux port, hosted
        at lists.infradead.org.
      * We now have a kernel.org git tree where work on our port is
        coordinated.
      
      This patch changes the RISC-V maintainers entry to reflect these new
      bits of infrastructure.
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
  6. 26 Jan, 2018 8 commits
    • x86/mm/64: Tighten up vmalloc_fault() sanity checks on 5-level kernels · 36b3a772
      Andy Lutomirski committed
      On a 5-level kernel, if a non-init mm has a top-level entry, it needs to
      match init_mm's, but the vmalloc_fault() code skipped over the BUG_ON()
      that would have checked it.
      
      While we're at it, get rid of the rather confusing 4-level folded "pgd"
      logic.
      
      Cleans-up: b50858ce ("x86/mm/vmalloc: Add 5-level paging support")
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Neil Berrington <neil.berrington@datacore.com>
      Link: https://lkml.kernel.org/r/2ae598f8c279b0a29baf75df207e6f2fdddc0a1b.1516914529.git.luto@kernel.org
    • x86/mm/64: Fix vmapped stack syncing on very-large-memory 4-level systems · 5beda7d5
      Andy Lutomirski committed
      Neil Berrington reported a double-fault on a VM with 768GB of RAM that uses
      large amounts of vmalloc space with PTI enabled.
      
      The cause is that load_new_mm_cr3() was never fixed to take the 5-level pgd
      folding code into account, so, on a 4-level kernel, the pgd synchronization
      logic compiles away to exactly nothing.
      
      Interestingly, the problem doesn't trigger with nopti.  I assume this is
      because the kernel is mapped with global pages if we boot with nopti.  The
      sequence of operations when we create a new task is that we first load its
      mm while still running on the old stack (which crashes if the old stack is
      unmapped in the new mm unless the TLB saves us), then we call
      prepare_switch_to(), and then we switch to the new stack.
      prepare_switch_to() pokes the new stack directly, which will populate the
      mapping through vmalloc_fault().  I assume that we're getting lucky on
      non-PTI systems -- the old stack's TLB entry stays alive long enough to
      make it all the way through prepare_switch_to() and switch_to() so that we
      make it to a valid stack.
      
      Fixes: b50858ce ("x86/mm/vmalloc: Add 5-level paging support")
      Reported-and-tested-by: Neil Berrington <neil.berrington@datacore.com>
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Cc: stable@vger.kernel.org
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Link: https://lkml.kernel.org/r/346541c56caed61abbe693d7d2742b4a380c5001.1516914529.git.luto@kernel.org
    • Merge branch 'linux-4.15' of git://github.com/skeggsb/linux into drm-fixes · baa35cc3
      Dave Airlie committed
      Single irq regression fix
      * 'linux-4.15' of git://github.com/skeggsb/linux:
        drm/nouveau: Move irq setup/teardown to pci ctor/dtor
    • net: vrf: Add support for sends to local broadcast address · 1e19c4d6
      David Ahern committed
      Sukumar reported that sends to the local broadcast address
      (255.255.255.255) are broken. Check for the address in vrf driver
      and do not redirect to the VRF device - similar to multicast
      packets.
      
      With this change sockets can use SO_BINDTODEVICE to specify an
      egress interface and receive responses. Note: the egress interface
      can not be a VRF device but needs to be the enslaved device.
      
      https://bugzilla.kernel.org/show_bug.cgi?id=198521
      Reported-by: Sukumar Gopalakrishnan <sukumarg1973@gmail.com>
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
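From userspace, the use case in the VRF commit above might look like the following sketch. The helper name `open_broadcast_socket`, the port number, and the interface name used in the usage note are placeholders chosen for illustration. The fallback value 25 for `SO_BINDTODEVICE` is the usual Linux constant and is only used when the Python build does not expose it. Binding to a device may require elevated privileges depending on the kernel version.

```python
import socket

# Destination for local broadcast; the port is an arbitrary placeholder.
LOCAL_BROADCAST = ("255.255.255.255", 9999)

# Linux-specific option; fall back to the usual Linux value if missing.
SO_BINDTODEVICE = getattr(socket, "SO_BINDTODEVICE", 25)


def open_broadcast_socket(ifname):
    """Open a UDP socket bound to one egress interface for broadcast sends.

    ifname should name the enslaved device, not the VRF device itself.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # Allow sending to a broadcast address.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        # Pin egress (and reception of responses) to the given interface.
        sock.setsockopt(socket.SOL_SOCKET, SO_BINDTODEVICE,
                        ifname.encode() + b"\0")
    except OSError:
        sock.close()
        raise
    return sock
```

A caller would then do something like `sock = open_broadcast_socket("eth1")` followed by `sock.sendto(b"probe", LOCAL_BROADCAST)`, where `eth1` stands for whichever device is enslaved to the VRF.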
    • r8169: fix memory corruption on retrieval of hardware statistics. · a78e9366
      Francois Romieu committed
      Hardware statistics retrieval hurts in tight invocation loops.
      
      Avoid extraneous write and enforce strict ordering of writes targeted to
      the tally counters dump area address registers.
      Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
      Tested-by: Oliver Freyermuth <o.freyermuth@googlemail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input · 993ca206
      Linus Torvalds committed
      Pull input fixes from Dmitry Torokhov:
       "The main item is that we try to better handle the newer trackpoints on
        Lenovo devices that are now being produced by Elan/ALPS/NXP and only
        implement a small subset of the original IBM trackpoint controls"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
        Revert "Input: synaptics_rmi4 - use devm_device_add_group() for attributes in F01"
        Input: trackpoint - only expose supported controls for Elan, ALPS and NXP
        Input: trackpoint - force 3 buttons if 0 button is reported
        Input: xpad - add support for PDP Xbox One controllers
        Input: stmfts,s6sy671 - add SPDX identifier
    • orangefs: fix deadlock; do not write i_size in read_iter · 6793f1c4
      Martin Brandenburg committed
      After do_readv_writev, the inode cache is invalidated anyway, so i_size
      will never be read.  It will be fetched from the server which will also
      know about updates from other machines.
      
      Fixes deadlock on 32-bit SMP.
      
      See https://marc.info/?l=linux-fsdevel&m=151268557427760&w=2
      Signed-off-by: Martin Brandenburg <martin@omnibond.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Mike Marshall <hubcap@omnibond.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • drm/nouveau: Move irq setup/teardown to pci ctor/dtor · 0fd189a9
      Lyude Paul committed
      For a while we've been having issues with seemingly random interrupts
      coming from nvidia cards when resuming them. Originally the fix for this
      was thought to be just re-arming the MSI interrupt registers right after
      re-allocating our IRQs, however it seems a lot of what we do is both
      wrong and not even necessary.
      
      This was made apparent by what appeared to be a regression in the
      mainline kernel that started introducing suspend/resume issues for
      nouveau:
      
              a0c9259d (irq/matrix: Spread interrupts on allocation)
      
      After this commit was introduced, we started getting interrupts from the
      GPU before we actually re-allocated our own IRQ (see references below)
      and assigned the IRQ handler. Investigating this turned out that the
      problem was not with the commit, but the fact that nouveau even
      free/allocates its irqs before and after suspend/resume.
      
      For starters: drivers in the linux kernel haven't had to handle
      freeing/re-allocating their IRQs during suspend/resume cycles for quite
      a while now. Nouveau seems to be one of the few drivers left that still
      does this, despite the fact there's no reason we actually need to since
      disabling interrupts from the device side should be enough, as the
      kernel is already smart enough to know to disable host-side interrupts
      for us before going into suspend. Since we were tearing down our IRQs by
      hand however, that means there was a short period during resume where
      interrupts could be received before we re-allocated our IRQ which would
      lead to us getting an unhandled IRQ. Since we never handle said IRQ and
      re-arm the interrupt registers, this would cause us to miss all of the
      interrupts from the GPU and cause our init process to start timing out
      on anything requiring interrupts.
      
      So, since this whole setup/teardown every suspend/resume cycle is
      useless anyway, move irq setup/teardown into the pci subdev's ctor/dtor
      functions instead so they're only called at driver load and driver
      unload. This should fix most of the issues with pending interrupts on
      resume, along with getting suspend/resume for nouveau to work again.
      
      As well, this probably means we can also just remove the msi rearm call
      inside nvkm_pci_init(). But since our main focus here is to fix
      suspend/resume before 4.15, we'll save that for a later patch.
      Signed-off-by: Lyude Paul <lyude@redhat.com>
      Cc: Karol Herbst <kherbst@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: stable@vger.kernel.org
      Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
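The window described in the nouveau commit above, where an interrupt arrives during resume before the handler has been re-registered, can be illustrated with a toy model. Everything here is hypothetical: `IrqLine`, `suspend_resume` and the `free_across_suspend` switch are names invented for this sketch and do not correspond to kernel or nouveau APIs.

```python
class IrqLine:
    """Toy interrupt line that counts handled and unhandled interrupts."""

    def __init__(self):
        self.handler = None
        self.handled = 0
        self.unhandled = 0

    def request(self, handler):
        self.handler = handler

    def free(self):
        self.handler = None

    def fire(self):
        if self.handler is None:
            self.unhandled += 1   # nobody acknowledges or re-arms the device
        else:
            self.handler()
            self.handled += 1


def suspend_resume(free_across_suspend):
    """Return the number of unhandled interrupts over one suspend/resume."""
    line = IrqLine()
    line.request(lambda: None)        # driver load
    if free_across_suspend:
        line.free()                   # old behaviour: tear down on suspend
    line.fire()                       # device interrupt early in resume
    if free_across_suspend:
        line.request(lambda: None)    # old behaviour: set up again on resume
    return line.unhandled
```

When the handler stays registered for the lifetime of the driver, the early interrupt is handled. With teardown on suspend, it is lost, which in the commit's description left the interrupt registers un-rearmed.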