1. 01 Apr, 2019 2 commits
  2. 30 Mar, 2019 5 commits
  3. 29 Mar, 2019 1 commit
    • M
      KVM: export <linux/kvm_para.h> and <asm/kvm_para.h> iif KVM is supported · 3d9683cf
      Masahiro Yamada committed
      I do not see any consistency about headers_install of <linux/kvm_para.h>
      and <asm/kvm_para.h>.
      
      According to my analysis of Linux 5.1-rc1, there are 3 groups:
      
       [1] Both <linux/kvm_para.h> and <asm/kvm_para.h> are exported
      
          alpha, arm, hexagon, mips, powerpc, s390, sparc, x86
      
       [2] <asm/kvm_para.h> is exported, but <linux/kvm_para.h> is not
      
          arc, arm64, c6x, h8300, ia64, m68k, microblaze, nios2, openrisc,
          parisc, sh, unicore32, xtensa
      
       [3] Neither <linux/kvm_para.h> nor <asm/kvm_para.h> is exported
      
          csky, nds32, riscv
      
      This does not match the actual KVM support. At least, [2] is
      half-baked.
      
      Nor do arch maintainers look like they care about this. For example,
      commit 0add5371 ("microblaze: Add missing kvm_para.h to Kbuild")
      exported <asm/kvm_para.h> to user-space in order to fix an in-kernel
      build error.
      
      We have two ways to make this consistent:
      
       [A] export both <linux/kvm_para.h> and <asm/kvm_para.h> for all
           architectures, irrespective of the KVM support
      
       [B] Match the header export of <linux/kvm_para.h> and <asm/kvm_para.h>
           to the KVM support
      
      My first attempt was [A] because the code looks cleaner, but Paolo
      suggested [B].
      
      So, this commit goes with [B].
      
      For most architectures, <asm/kvm_para.h> was moved to kernel space.
      I changed include/uapi/linux/Kbuild so that it checks generated
      asm/kvm_para.h as well as checked-in ones.
      
      After this commit, there will be two groups:
      
       [1] Both <linux/kvm_para.h> and <asm/kvm_para.h> are exported
      
          arm, arm64, mips, powerpc, s390, x86
      
       [2] Neither <linux/kvm_para.h> nor <asm/kvm_para.h> is exported
      
          alpha, arc, c6x, csky, h8300, hexagon, ia64, m68k, microblaze,
          nds32, nios2, openrisc, parisc, riscv, sh, sparc, unicore32, xtensa
      Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
      Acked-by: Cornelia Huck <cohuck@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  4. 28 Mar, 2019 1 commit
  5. 27 Mar, 2019 1 commit
  6. 26 Mar, 2019 2 commits
    • B
      proc/kcore: Remove unused kclist_add_remap() · db779ef6
      Bhupesh Sharma committed
      Commit
      
        bf904d27 ("x86/pti/64: Remove the SYSCALL64 entry trampoline")
      
      removed the sole usage of kclist_add_remap() but it did not remove the
      left-over definition from the include file.
      
      Fix the same.
      Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Dave Anderson <anderson@redhat.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: James Morse <james.morse@arm.com>
      Cc: Kairui Song <kasong@redhat.com>
      Cc: kexec@lists.infradead.org
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Omar Sandoval <osandov@fb.com>
      Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
      Cc: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: x86-ml <x86@kernel.org>
      Link: https://lkml.kernel.org/r/1553583028-17804-1-git-send-email-bhsharma@redhat.com
    • L
      Revert "parport: daisy: use new parport device model" · a3ac7917
      Linus Torvalds committed
      This reverts commit 1aec4211.
      
      Steven Rostedt reports that it causes a hang at bootup and bisected it
      to this commit.
      
      The trigger is apparently a module alias for "parport_lowlevel" that
      points to "parport_pc", which causes a hang with
      
          modprobe -q -- parport_lowlevel
      
      blocking forever with a backtrace like this:
      
          wait_for_completion_killable+0x1c/0x28
          call_usermodehelper_exec+0xa7/0x108
          __request_module+0x351/0x3d8
          get_lowlevel_driver+0x28/0x41 [parport]
          __parport_register_driver+0x39/0x1f4 [parport]
          daisy_drv_init+0x31/0x4f [parport]
          parport_bus_init+0x5d/0x7b [parport]
          parport_default_proc_register+0x26/0x1000 [parport]
          do_one_initcall+0xc2/0x1e0
          do_init_module+0x50/0x1d4
          load_module+0x1c2e/0x21b3
          sys_init_module+0xef/0x117
      
      Sudip says:
       "Due to the new device model daisy driver will now try to find the
        parallel ports while trying to register its driver so that it can bind
        with them. Now, since daisy driver is loaded while parport bus is
        initialising the list of parport is still empty and it tries to load
        the lowlevel driver, which has an alias set to parport_pc, now causes
        a deadlock"
      
      But I don't think the daisy driver should be loaded by the parport
      initialization in the first place, so let's revert the whole change.
      
      If the daisy driver can just initialize separately on its own (like a
      driver should), instead of hooking into the parport init sequence
      directly, this issue probably would go away.
      Reported-and-bisected-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Reported-by: Michal Kubecek <mkubecek@suse.cz>
      Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  7. 25 Mar, 2019 1 commit
  8. 23 Mar, 2019 2 commits
  9. 22 Mar, 2019 4 commits
    • T
      gpio: amd-fch: Fix bogus SPDX identifier · b45a02e1
      Thomas Gleixner committed
      spdxcheck.py complains:
      
       include/linux/platform_data/gpio/gpio-amd-fch.h: 1:28 Invalid License ID: GPL+
      
      which is correct because GPL+ is not a valid identifier. Of course this
      could have been caught by checkpatch.pl _before_ submitting or merging the
      patch.
      
       WARNING: 'SPDX-License-Identifier: GPL+ */' is not supported in LICENSES/...
       #271: FILE: include/linux/platform_data/gpio/gpio-amd-fch.h:1:
       +/* SPDX-License-Identifier: GPL+ */
      
      Fix it under the assumption that the author meant GPL-2.0+, which makes
      sense as the corresponding C file is using that identifier.
      
      Fixes: e09d168f ("gpio: AMD G-Series PCH gpio driver")
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
    • D
      net/sched: let actions use RCU to access 'goto_chain' · ee3bbfe8
      Davide Caratti committed
      use RCU when accessing the action chain, to avoid use after free in the
      traffic path when 'goto chain' is replaced on existing TC actions (see
      script below). Since the control action is read in the traffic path
      without holding the action spinlock, we need to explicitly ensure that
      a->goto_chain is not NULL before dereferencing (i.e. it's not sufficient
      to rely on the value of TC_ACT_GOTO_CHAIN bits). Not doing so caused NULL
      dereferences in tcf_action_goto_chain_exec() when the following script:
      
       # tc chain add dev dd0 chain 42 ingress protocol ip flower \
       > ip_proto udp action pass index 4
       # tc filter add dev dd0 ingress protocol ip flower \
       > ip_proto udp action csum udp goto chain 42 index 66
       # tc chain del dev dd0 chain 42 ingress
       (start UDP traffic towards dd0)
       # tc action replace action csum udp pass index 66
      
      was run repeatedly for several hours.
      Suggested-by: Cong Wang <xiyou.wangcong@gmail.com>
      Suggested-by: Vlad Buslov <vladbu@mellanox.com>
      Signed-off-by: Davide Caratti <dcaratti@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • D
      net/sched: don't dereference a->goto_chain to read the chain index · fe384e2f
      Davide Caratti committed
      callers of tcf_gact_goto_chain_index() can potentially read an old value
      of the chain index, or even dereference a NULL 'goto_chain' pointer,
      because 'goto_chain' and 'tcfa_action' are read in the traffic path
      without regard for concurrent writes in the control path. The most recent
      value of chain index can be read also from a->tcfa_action (it's encoded
      there together with TC_ACT_GOTO_CHAIN bits), so we don't really need to
      dereference 'goto_chain': just read the chain id from the control action.
      
      Fixes: e457d86a ("net: sched: add couple of goto_chain helpers")
      Signed-off-by: Davide Caratti <dcaratti@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
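The commit above notes that the chain index is encoded in the control action together with the TC_ACT_GOTO_CHAIN bits. A minimal userspace sketch of that kind of encoding follows. The shift and opcode values are local copies of what include/uapi/linux/pkt_cls.h is understood to define and should be treated as assumptions; the helper names (`make_goto_chain`, `is_goto_chain`, `goto_chain_index`) are hypothetical and are not the kernel's functions.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins for the extended-action encoding. The values are
 * assumed to mirror include/uapi/linux/pkt_cls.h; verify before reuse. */
#define ACT_EXT_SHIFT     28
#define ACT_EXT(local)    ((uint32_t)(local) << ACT_EXT_SHIFT)
#define ACT_EXT_VAL_MASK  ((1u << ACT_EXT_SHIFT) - 1)
#define ACT_GOTO_CHAIN    ACT_EXT(2)

/* Hypothetical helper: build a control action meaning "goto chain N". */
static uint32_t make_goto_chain(uint32_t chain_index)
{
    assert(chain_index <= ACT_EXT_VAL_MASK);
    return ACT_GOTO_CHAIN | chain_index;
}

/* True when the opcode bits (above the value mask) say "goto chain". */
static int is_goto_chain(uint32_t action)
{
    return (action & ~ACT_EXT_VAL_MASK) == ACT_GOTO_CHAIN;
}

/* Recover the chain index from the action word alone, without touching
 * any separately stored chain pointer. */
static uint32_t goto_chain_index(uint32_t action)
{
    return action & ACT_EXT_VAL_MASK;
}
```

Because the index lives in the same word as the opcode, a reader that loads the action once sees a matching opcode and index, which is the property the commit message relies on.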
    • D
      net/sched: prepare TC actions to properly validate the control action · 85d0966f
      Davide Caratti committed
      - pass a pointer to struct tcf_proto in each action's init() handler,
        to allow validating the control action, checking whether the chain
        exists and (eventually) refcounting it.
      - remove code that validates the control action after a successful call
        to the action's init() handler, and replace it with a test that forbids
        addition of actions having 'goto_chain' and NULL goto_chain pointer at
        the same time.
      - add tcf_action_check_ctrlact(), that will validate the control action
        and eventually allocate the action 'goto_chain' within the init()
        handler.
      - add tcf_action_set_ctrlact(), that will assign the control action and
        swap the current 'goto_chain' pointer with the new given one.
      
      This disallows 'goto_chain' on actions that don't initialize it properly
      in their init() handler, i.e. calling tcf_action_check_ctrlact() after
      successful IDR reservation and then calling tcf_action_set_ctrlact()
      to assign 'goto_chain' and 'tcf_action' consistently.
      
      With this change, the kernel no longer leaks refcounts when a valid
      'goto chain' handle is replaced in TC actions, which previously caused
      kmemleak splats like the following one:
      
       # tc chain add dev dd0 chain 42 ingress protocol ip flower \
       > ip_proto tcp action drop
       # tc chain add dev dd0 chain 43 ingress protocol ip flower \
       > ip_proto udp action drop
       # tc filter add dev dd0 ingress matchall \
       > action gact goto chain 42 index 66
       # tc filter replace dev dd0 ingress matchall \
       > action gact goto chain 43 index 66
       # echo scan >/sys/kernel/debug/kmemleak
       <...>
       unreferenced object 0xffff93c0ee09f000 (size 1024):
       comm "tc", pid 2565, jiffies 4295339808 (age 65.426s)
       hex dump (first 32 bytes):
         00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
         00 00 00 00 08 00 06 00 00 00 00 00 00 00 00 00  ................
       backtrace:
         [<000000009b63f92d>] tc_ctl_chain+0x3d2/0x4c0
         [<00000000683a8d72>] rtnetlink_rcv_msg+0x263/0x2d0
         [<00000000ddd88f8e>] netlink_rcv_skb+0x4a/0x110
         [<000000006126a348>] netlink_unicast+0x1a0/0x250
         [<00000000b3340877>] netlink_sendmsg+0x2c1/0x3c0
         [<00000000a25a2171>] sock_sendmsg+0x36/0x40
         [<00000000f19ee1ec>] ___sys_sendmsg+0x280/0x2f0
         [<00000000d0422042>] __sys_sendmsg+0x5e/0xa0
         [<000000007a6c61f9>] do_syscall_64+0x5b/0x180
         [<00000000ccd07542>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
         [<0000000013eaa334>] 0xffffffffffffffff
      
      Fixes: db50514f ("net: sched: add termination action to allow goto chain")
      Fixes: 97763dc0 ("net_sched: reject unknown tcfa_action values")
      Signed-off-by: Davide Caratti <dcaratti@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  10. 21 Mar, 2019 3 commits
  11. 20 Mar, 2019 1 commit
    • I
      libceph: wait for latest osdmap in ceph_monc_blacklist_add() · bb229bbb
      Ilya Dryomov committed
      Because map updates are distributed lazily, an OSD may not know about
      the new blacklist for quite some time after "osd blacklist add" command
      is completed.  This makes it possible for a blacklisted but still alive
      client to overwrite a post-blacklist update, resulting in data
      corruption.
      
      Waiting for latest osdmap in ceph_monc_blacklist_add() and thus using
      the post-blacklist epoch for all post-blacklist requests ensures that
      all such requests "wait" for the blacklist to come into force on their
      respective OSDs.
      
      Cc: stable@vger.kernel.org
      Fixes: 6305a3b4 ("libceph: support for blacklisting clients")
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
      Reviewed-by: Jason Dillaman <dillaman@redhat.com>
  12. 19 Mar, 2019 5 commits
    • D
      blk-mq: remove unused 'nr_expired' from blk_mq_hw_ctx · 9496c015
      Dongli Zhang committed
      There is no usage of 'nr_expired'.
      
      The 'nr_expired' was introduced by commit 1d9bd516 ("blk-mq: replace
      timeout synchronization with a RCU and generation based scheme"). Its usage
      was removed since commit 12f5b931 ("blk-mq: Remove generation
      seqeunce").
      Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • X
      sctp: get sctphdr by offset in sctp_compute_cksum · 273160ff
      Xin Long committed
      sctp_hdr(skb) only works when skb->transport_header is set properly.
      
      But in Netfilter, skb->transport_header for ipv6 is not guaranteed
      to be the right value for sctphdr. That would cause the checksum check
      for sctp packets to fail.
      
      So fix it by using the offset, which is always right in all places.
      
      v1->v2:
        - Fix the changelog.
      
      Fixes: e6d8b64b ("net: sctp: fix and consolidate SCTP checksumming code")
      Reported-by: Li Shuang <shuali@redhat.com>
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
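The idea in the commit above, reading the SCTP header at an offset the caller already knows instead of trusting a cached transport-header field, can be illustrated with a small userspace sketch. Everything here is a toy: `struct toy_pkt`, `struct toy_sctphdr` and `sctphdr_at()` are hypothetical names, not kernel types or functions, and the header layout is a simplified version of the 12-byte SCTP common header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified SCTP common header: ports, verification tag, checksum. */
struct toy_sctphdr {
    uint16_t source;
    uint16_t dest;
    uint32_t vtag;
    uint32_t checksum;
};

/* Toy packet. 'transport_header' plays the role of a cached offset that
 * may be stale, which is the situation the commit describes. */
struct toy_pkt {
    const uint8_t *data;
    size_t len;
    size_t transport_header;
};

/* Copy the header found at an explicit offset. Returns 0 on success and
 * -1 if the header would run past the end of the buffer. The cached
 * transport_header field is deliberately ignored. */
static int sctphdr_at(const struct toy_pkt *p, size_t offset,
                      struct toy_sctphdr *out)
{
    assert(p != NULL && out != NULL);
    if (offset > p->len || p->len - offset < sizeof(*out))
        return -1;
    memcpy(out, p->data + offset, sizeof(*out));
    return 0;
}
```

A caller that has just parsed the network-layer headers knows where the transport header starts, so passing that offset down avoids depending on whether some earlier layer set the cached field.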
    • M
      packets: Always register packet sk in the same order · a4dc6a49
      Maxime Chevallier committed
      When using fanouts with AF_PACKET, the demux functions such as
      fanout_demux_cpu will return an index in the fanout socket array, which
      corresponds to the selected socket.
      
      The ordering of this array depends on the order the sockets were added
      to a given fanout group, so for FANOUT_CPU this means sockets are bound
      to cpus in the order they are configured, which is OK.
      
      However, when stopping then restarting the interface these sockets are
      bound to, the sockets are reassigned to the fanout group in the reverse
      order, due to the fact that they were inserted at the head of the
      interface's AF_PACKET socket list.
      
      This means that traffic that was directed to the first socket in the
      fanout group is now directed to the last one after an interface restart.
      
      In the case of FANOUT_CPU, traffic from CPU0 will be directed to the
      socket that used to receive traffic from the last CPU after an interface
      restart.
      
      This commit introduces a helper to add a socket at the tail of a list,
      then uses it to register AF_PACKET sockets.
      
      Note that this changes the order in which sockets are listed in /proc and
      with sock_diag.
      
      Fixes: dc99f600 ("packet: Add fanout support")
      Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
      Acked-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
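The ordering problem described in the commit above comes down to head insertion versus tail insertion. The following is a toy sketch under stated assumptions: `struct toy_sock`, `struct toy_list`, `add_head()`, `add_tail()` and `rebuild_fanout()` are hypothetical names for illustration and do not correspond to the kernel's socket-list helpers.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SOCKS 8

/* Toy stand-in for a socket on an interface's socket list. */
struct toy_sock {
    int id;
    struct toy_sock *next;
};

struct toy_list {
    struct toy_sock *head;
    struct toy_sock *tail;
};

/* Head insertion: the most recently added socket is walked first. */
static void add_head(struct toy_list *l, struct toy_sock *s)
{
    assert(l != NULL && s != NULL);
    s->next = l->head;
    l->head = s;
    if (l->tail == NULL)
        l->tail = s;
}

/* Tail insertion: walk order equals insertion order. */
static void add_tail(struct toy_list *l, struct toy_sock *s)
{
    assert(l != NULL && s != NULL);
    s->next = NULL;
    if (l->tail != NULL)
        l->tail->next = s;
    else
        l->head = s;
    l->tail = s;
}

/* Walk the list, as an interface restart would, and record the order in
 * which sockets are re-added to the fanout array. Returns the count. */
static int rebuild_fanout(const struct toy_list *l, int *fanout)
{
    int n = 0;
    for (const struct toy_sock *s = l->head; s != NULL && n < MAX_SOCKS; s = s->next)
        fanout[n++] = s->id;
    return n;
}
```

With sockets 0, 1, 2 added in that order, walking a head-inserted list yields 2, 1, 0, so the rebuilt fanout array is reversed; the tail-inserted list yields 0, 1, 2 and the CPU-to-socket mapping survives the restart.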
    • J
      block: add BIO_NO_PAGE_REF flag · 399254aa
      Jens Axboe committed
      If bio_iov_iter_get_pages() is called on an iov_iter that is flagged
      with NO_REF, then we don't need to add a page reference for the pages
      that we add.
      
      Add BIO_NO_PAGE_REF to track this in the bio, so IO completion knows
      not to drop a reference to these pages.
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • J
      iov_iter: add ITER_BVEC_FLAG_NO_REF flag · 875f1d07
      Jens Axboe committed
      For ITER_BVEC, if we're holding on to kernel pages, the caller
      doesn't need to grab a reference to the bvec pages, and drop that
      same reference on IO completion. This is essentially safe for any
      ITER_BVEC, but some use cases end up reusing pages and unconditionally
      dropping a page reference on completion. An example of that is
      sendfile(2), that ends up being a splice_in + splice_out on the
      pipe pages.
      
      Add a flag that tells us it's fine to not grab a page reference
      to the bvec pages, since that caller knows not to drop a reference
      when it's done with the pages.
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
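The two commits above pair a flag on the iterator ("no reference needed") with a flag on the bio ("no reference was taken, so do not drop one"). A toy model of that pairing is sketched below. The types and functions (`struct toy_page`, `struct toy_bio`, `bio_add()`, `bio_complete()`) are hypothetical and only model the bookkeeping; they are not the block layer's API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy page with a bare reference count. */
struct toy_page {
    int refcount;
};

/* Toy bio holding one page. 'no_page_ref' stands in for the role that
 * BIO_NO_PAGE_REF plays in the commit message. */
struct toy_bio {
    struct toy_page *page;
    bool no_page_ref;
};

/* Attach a page. If the iterator was flagged as not needing a reference,
 * remember that on the bio instead of bumping the count. */
static void bio_add(struct toy_bio *bio, struct toy_page *page, bool iter_no_ref)
{
    assert(bio != NULL && page != NULL);
    bio->page = page;
    bio->no_page_ref = iter_no_ref;
    if (!iter_no_ref)
        page->refcount++;
}

/* Completion drops a reference only if one was taken in bio_add(). */
static void bio_complete(struct toy_bio *bio)
{
    assert(bio != NULL && bio->page != NULL);
    if (!bio->no_page_ref)
        bio->page->refcount--;
    bio->page = NULL;
}
```

The point of recording the decision on the bio is that submission and completion run at different times; completion has to know which choice was made so that gets and puts stay balanced.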
  13. 18 Mar, 2019 2 commits
  14. 17 Mar, 2019 2 commits
  15. 16 Mar, 2019 4 commits
    • B
      xsk: fix umem memory leak on cleanup · 044175a0
      Björn Töpel committed
      When the umem is cleaned up, the task that created it might already be
      gone. If the task was gone, the xdp_umem_release function did not free
      the pages member of struct xdp_umem.
      
      It turned out that the task lookup was not needed at all; the code was
      a left-over when we moved from task accounting to user accounting [1].
      
      This patch fixes the memory leak by removing the task lookup logic
      completely.
      
      [1] https://lore.kernel.org/netdev/20180131135356.19134-3-bjorn.topel@gmail.com/
      
      Link: https://lore.kernel.org/netdev/c1cb2ca8-6a14-3980-8672-f3de0bb38dfd@suse.cz/
      Fixes: c0c77d8f ("xsk: add user memory registration support sockopt")
      Reported-by: Jiri Slaby <jslaby@suse.cz>
      Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • P
      net: add documentation to socket.c · 8a3c245c
      Pedro Tammela committed
      Adds missing sphinx documentation to the functions in
      socket.c. Also fixes some whitespace.
      
      I also changed the style of older documentation in an
      effort to have a uniform documentation style.
      Signed-off-by: Pedro Tammela <pctammela@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Y
      appletalk: Fix potential NULL pointer dereference in unregister_snap_client · 9804501f
      YueHaibing committed
      register_snap_client may return NULL. All the callers
      check it, but only print a warning. This will result in
      a NULL pointer dereference in unregister_snap_client and other
      places.
      
      It has always been used like this since v2.6.
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: YueHaibing <yuehaibing@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • J
      filemap: kill page_cache_read usage in filemap_fault · a75d4c33
      Josef Bacik committed
      Patch series "drop the mmap_sem when doing IO in the fault path", v6.
      
      Now that we have proper isolation in place with cgroups2 we have started
      going through and fixing the various priority inversions.  Most are all
      gone now, but this one is sort of weird since it's not necessarily a
      priority inversion that happens within the kernel, but rather because of
      something userspace does.
      
      We have giant applications that we want to protect, and parts of these
      giant applications do things like watch the system state to determine how
      healthy the box is for load balancing and such.  This involves running
      'ps' or other such utilities.  These utilities will often walk
      /proc/<pid>/whatever, and these files can sometimes need to
      down_read(&task->mmap_sem).  Not usually a big deal, but we noticed when
      we are stress testing that sometimes our protected application has latency
      spikes trying to get the mmap_sem for tasks that are in lower priority
      cgroups.
      
      This is because any down_write() on a semaphore essentially turns it into
      a mutex, so even if we currently have it held for reading, any new readers
      will not be allowed on to keep from starving the writer.  This is fine,
      except a lower priority task could be stuck doing IO because it has been
      throttled to the point that its IO is taking much longer than normal.  But
      because a higher priority group depends on this completing it is now stuck
      behind lower priority work.
      
      In order to avoid this particular priority inversion we want to use the
      existing retry mechanism to stop from holding the mmap_sem at all if we
      are going to do IO.  This already exists in the read case sort of, but
      needed to be extended for more than just grabbing the page lock.  With
      io.latency we throttle at submit_bio() time, so the readahead stuff can
      block and even page_cache_read can block, so all these paths need to have
      the mmap_sem dropped.
      
      The other big thing is ->page_mkwrite.  btrfs is particularly shitty here
      because we have to reserve space for the dirty page, which can be a very
      expensive operation.  We use the same retry method as the read path, and
      simply cache the page and verify the page is still setup properly the next
      pass through ->page_mkwrite().
      
      I've tested these patches with xfstests and there are no regressions.
      
      This patch (of 3):
      
      If we do not have a page at filemap_fault time we'll do this weird forced
      page_cache_read thing to populate the page, and then drop it again and
      loop around and find it.  This makes for 2 ways we can read a page in
      filemap_fault, and it's not really needed.  Instead add a FGP_FOR_MMAP
      flag so that pagecache_get_page() will return an unlocked page that's in
      pagecache.  Then use the normal page locking and readpage logic already in
      filemap_fault.  This simplifies the no page in page cache case
      significantly.
      
      [akpm@linux-foundation.org: fix comment text]
      [josef@toxicpanda.com: don't unlock null page in FGP_FOR_MMAP case]
        Link: http://lkml.kernel.org/r/20190312201742.22935-1-josef@toxicpanda.com
      Link: http://lkml.kernel.org/r/20181211173801.29535-2-josef@toxicpanda.com
      Signed-off-by: Josef Bacik <josef@toxicpanda.com>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  16. 15 Mar, 2019 3 commits
  17. 14 Mar, 2019 1 commit