1. 02 May, 2018 1 commit
  2. 27 Apr, 2018 1 commit
    • errseq: Always report a writeback error once · b4678df1
      Committed by Matthew Wilcox
      The errseq_t infrastructure assumes that errors which occurred before
      the file descriptor was opened are of no interest to the application.
      This turns out to be a regression for some applications, notably Postgres.
      
      Before errseq_t, a writeback error would be reported exactly once (as
      long as the inode remained in memory), so Postgres could open a file,
      call fsync() and find out whether there had been a writeback error on
      that file from another process.
      
      This patch changes the errseq infrastructure to report errors to all
      file descriptors which are opened after the error occurred, but before
      it was reported to any file descriptor.  This restores the user-visible
      behaviour.
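
      The sampling rule described above can be sketched as a small,
      single-threaded userspace model. This is not the kernel code: the
      names (model_set, model_sample, model_check_and_advance) and the bit
      layout are assumptions made for illustration, and the atomic
      operations a real implementation needs are omitted.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified, single-threaded model of an errseq-style cursor.
 * Assumed layout for this sketch only: low 12 bits hold the errno,
 * bit 12 is a SEEN flag, the remaining bits are a counter. */
typedef uint32_t errseq_model_t;

#define MODEL_ERRNO_MASK 0xfffu
#define MODEL_SEEN       (1u << 12)
#define MODEL_CTR_INC    (1u << 13)

/* Record a writeback error (err is a positive errno such as 5 for EIO).
 * The counter only advances if the previous value was already reported. */
static void model_set(errseq_model_t *eseq, int err)
{
    errseq_model_t old = *eseq;
    errseq_model_t val = (old & ~(MODEL_ERRNO_MASK | MODEL_SEEN)) | (uint32_t)err;

    if (old & MODEL_SEEN)
        val += MODEL_CTR_INC;
    *eseq = val;
}

/* Cursor handed to a newly opened file descriptor. If the current error
 * has not been reported to anyone yet, start from 0 so that this opener
 * will still observe it. */
static errseq_model_t model_sample(const errseq_model_t *eseq)
{
    errseq_model_t cur = *eseq;

    if (!(cur & MODEL_SEEN))
        return 0;
    return cur;
}

/* fsync()-like check: returns the errno if something changed since the
 * caller's cursor, marks the error as seen, and advances the cursor. */
static int model_check_and_advance(errseq_model_t *eseq, errseq_model_t *since)
{
    errseq_model_t cur = *eseq;

    if (cur == *since)
        return 0;
    *eseq = cur | MODEL_SEEN;
    *since = *eseq;
    return (int)(cur & MODEL_ERRNO_MASK);
}
```

      In this model, a descriptor sampled after an unreported error gets the
      error on its first check, while a descriptor sampled after the error
      has been reported does not.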
      
      Cc: stable@vger.kernel.org
      Fixes: 5660e13d ("fs: new infrastructure for writeback error handling and reporting")
      Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
  3. 23 Apr, 2018 2 commits
  4. 17 Apr, 2018 1 commit
  5. 14 Apr, 2018 1 commit
  6. 13 Apr, 2018 1 commit
  7. 12 Apr, 2018 9 commits
  8. 11 Apr, 2018 1 commit
  9. 10 Apr, 2018 1 commit
  10. 06 Apr, 2018 3 commits
  11. 01 Apr, 2018 1 commit
  12. 31 Mar, 2018 4 commits
  13. 30 Mar, 2018 1 commit
  14. 28 Mar, 2018 2 commits
  15. 27 Mar, 2018 1 commit
  16. 26 Mar, 2018 7 commits
  17. 24 Mar, 2018 1 commit
  18. 23 Mar, 2018 1 commit
    • mm/vmalloc: add interfaces to free unmapped page table · b6bdb751
      Committed by Toshi Kani
      On architectures with CONFIG_HAVE_ARCH_HUGE_VMAP set, ioremap() may
      create pud/pmd mappings.  A kernel panic was observed on arm64 systems
      with Cortex-A75 in the following steps as described by Hanjun Guo.
      
       1. ioremap a 4K-sized region; a valid page table is built.
       2. iounmap it; pte0 is set to 0.
       3. ioremap the same address with a 2M size; pgd/pmd is unchanged,
          then a new value is set for the pmd.
       4. pte0 is leaked.
       5. The CPU may take an exception because the old pmd is still in the
          TLB, which leads to a kernel panic.
      
      This panic is not reproducible on x86.  INVLPG, called from iounmap,
      purges all levels of entries associated with the purged address on x86.
      x86 still has the memory leak, however.
      
      The patch changes the ioremap path to free unmapped page table(s) since
      doing so in the unmap path has the following issues:
      
       - The iounmap() path is shared with vunmap(). Since vmap() only
         supports pte mappings, making vunmap() free a pte page adds
         overhead for regular vmap users, as they do not need a pte page
         freed up.
      
       - Checking if all entries in a pte page are cleared in the unmap path
         is racy, and serializing this check is expensive.
      
       - The unmap path calls free_vmap_area_noflush() to do lazy TLB purges.
         Clearing a pud/pmd entry before the lazy TLB purges needs an extra
         TLB purge.
      
      Add two interfaces, pud_free_pmd_page() and pmd_free_pte_page(), which
      clear a given pud/pmd entry and free up a page for the lower level
      entries.
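
      The intent of the ioremap-side approach can be illustrated with a toy
      userspace model of a single pmd entry. Everything here is
      hypothetical (the toy_* names, the struct layout, heap allocation
      standing in for page allocation); it has no TLB and is not the
      kernel implementation of pmd_free_pte_page().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define TOY_PTRS_PER_PTE 512

/* Toy stand-ins for a pte page and a pmd entry; not the kernel's types. */
struct toy_pte_page {
    unsigned long pte[TOY_PTRS_PER_PTE];
};

struct toy_pmd {
    struct toy_pte_page *table; /* non-NULL: points to a pte page */
    unsigned long huge_phys;    /* non-zero: maps a 2M block directly */
};

/* Map one 4K page, allocating the pte page on first use. */
static bool toy_map_4k(struct toy_pmd *pmd, unsigned idx, unsigned long phys)
{
    if (pmd->huge_phys || idx >= TOY_PTRS_PER_PTE)
        return false;
    if (!pmd->table) {
        pmd->table = calloc(1, sizeof(*pmd->table));
        if (!pmd->table)
            return false;
    }
    pmd->table->pte[idx] = phys;
    return true;
}

/* Unmap one 4K page. As in the scenario above, only the pte is cleared;
 * the pte page itself stays allocated and the pmd still points to it. */
static void toy_unmap_4k(struct toy_pmd *pmd, unsigned idx)
{
    if (pmd->table && idx < TOY_PTRS_PER_PTE)
        pmd->table->pte[idx] = 0;
}

/* Hypothetical analogue of pmd_free_pte_page(): clear the pmd entry and
 * free the lower-level page. Returns true when the pmd may be reused. */
static bool toy_pmd_free_pte_page(struct toy_pmd *pmd)
{
    struct toy_pte_page *table = pmd->table;

    pmd->table = NULL;
    free(table); /* free(NULL) is a no-op */
    return true;
}

/* Install a 2M mapping, refusing to overwrite a live pte page pointer. */
static bool toy_pmd_set_huge(struct toy_pmd *pmd, unsigned long phys)
{
    if (pmd->table)
        return false;
    pmd->huge_phys = phys;
    return true;
}

/* Map path for 2M: release any leftover pte page first, then go huge. */
static bool toy_map_2m(struct toy_pmd *pmd, unsigned long phys)
{
    return toy_pmd_free_pte_page(pmd) && toy_pmd_set_huge(pmd, phys);
}
```

      Calling toy_map_4k(), toy_unmap_4k() and then toy_map_2m() on the same
      entry leaves no pte page behind, which is the leak the map-side free
      is meant to close.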
      
      This patch implements their stub functions on x86 and arm64, which work
      as a workaround.
      
      [akpm@linux-foundation.org: fix typo in pmd_free_pte_page() stub]
      Link: http://lkml.kernel.org/r/20180314180155.19492-2-toshi.kani@hpe.com
      Fixes: e61ce6ad ("mm: change ioremap to set up huge I/O mappings")
      Reported-by: Lei Li <lious.lilei@hisilicon.com>
      Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Wang Xuefeng <wxf.wang@hisilicon.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Hanjun Guo <guohanjun@huawei.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Chintan Pandya <cpandya@codeaurora.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  19. 22 Mar, 2018 1 commit
    • netns: send uevent messages · 692ec06d
      Committed by Christian Brauner
      This patch adds a receive method to NETLINK_KOBJECT_UEVENT netlink sockets
      to allow sending uevent messages into the network namespace the socket
      belongs to.
      
      Currently non-initial network namespaces are already isolated and don't
      receive uevents. There are a number of cases where it is beneficial for a
      sufficiently privileged userspace process to send a uevent into a network
      namespace.
      
      One such use case would be debugging and fuzzing of a piece of software
      which listens and reacts to uevents. By running a copy of that software
      inside a network namespace, specific uevents could then be presented to it.
      More concretely, this would allow for easy testing of udevd/ueventd.
      
      This will also allow some piece of software to run components inside a
      separate network namespace and then effectively filter what that software
      can receive. Some examples of software that do directly listen to uevents
      and that we have in the past attempted to run inside a network namespace
      are rbd (CEPH client) or the X server.
      
      Implementation:
      The implementation has been kept as simple as possible from the kernel's
      perspective. Specifically, a simple input method uevent_net_rcv() is added
      to NETLINK_KOBJECT_UEVENT sockets which completely reuses existing
      af_netlink infrastructure and neither adds an additional netlink family
      nor requires any user-visible changes.
      
      For example, by using netlink_rcv_skb() we can make use of existing netlink
      infrastructure to report back informative error messages to userspace.
      
      Furthermore, this implementation does not introduce any overhead for
      existing uevent generating codepaths. struct net gained a new uevent
      socket member that records the uevent socket associated with that network
      namespace including its position in the uevent socket list. Since we record
      the uevent socket for each network namespace in struct net we don't have to
      walk the whole uevent socket list. Instead we can directly retrieve the
      relevant uevent socket and send the message. At exit time we can now also
      trivially remove the uevent socket from the uevent socket list. This keeps
      the codepath very performant without introducing needless overhead and even
      makes older codepaths faster.
      
      Uevent sequence numbers are kept global. When a uevent message is sent to
      another network namespace the implementation will simply increment the
      global uevent sequence number and append it to the received uevent. This
      has the advantage that the kernel will never need to parse the received
      uevent message to replace any existing uevent sequence numbers. Instead it
      is up to the userspace process to remove any existing uevent sequence
      numbers in case the uevent message to be sent contains any.
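
      Since removing stale sequence numbers is left to userspace, a sender
      might filter its payload before transmitting it. The helper below is
      a sketch under assumptions: strip_seqnum() is a hypothetical name,
      and the payload is assumed to be a run of NUL-separated entries in
      which the sequence number appears as a "SEQNUM=" entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Copy a NUL-separated uevent payload from src to dst, dropping every
 * entry that starts with "SEQNUM=". On success returns 0 and stores the
 * number of bytes written in *out_len; returns -1 if dst is too small. */
static int strip_seqnum(const char *src, size_t src_len,
                        char *dst, size_t dst_cap, size_t *out_len)
{
    static const char key[] = "SEQNUM=";
    const size_t key_len = sizeof(key) - 1;
    size_t in = 0, out = 0;

    while (in < src_len) {
        const char *entry = src + in;
        const char *nul = memchr(entry, '\0', src_len - in);
        /* Entry length without its terminating NUL. */
        size_t entry_len = nul ? (size_t)(nul - entry) : src_len - in;
        /* Bytes consumed, including the NUL when one is present. */
        size_t total = nul ? entry_len + 1 : entry_len;
        bool is_seqnum = entry_len >= key_len &&
                         memcmp(entry, key, key_len) == 0;

        if (!is_seqnum) {
            if (total > dst_cap - out)
                return -1;
            memcpy(dst + out, entry, total);
            out += total;
        }
        in += total;
    }

    *out_len = out;
    return 0;
}
```

      The caller would pass the filtered buffer to its netlink send routine;
      the kernel then appends the current global sequence number itself.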
      
      Security:
      In order for a caller to send uevent messages to a target network namespace
      the caller must have CAP_SYS_ADMIN in the owning user namespace of the
      target network namespace. Additionally, any received uevent message is
      verified to not exceed size UEVENT_BUFFER_SIZE. This includes the space
      needed to append the uevent sequence number.
      
      Testing:
      This patch has been tested and verified to work with the following udev
      implementations:
      1. CentOS 6 with udevd version 147
      2. Debian Sid with systemd-udevd version 237
      3. Android 7.1.1 with ueventd
      Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>