1. 15 2月, 2016 1 次提交
  2. 12 2月, 2016 2 次提交
  3. 09 2月, 2016 1 次提交
  4. 07 2月, 2016 1 次提交
    • H
      pty: make sure super_block is still valid in final /dev/tty close · 1f55c718
      Herton R. Krzesinski 提交于
      Considering current pty code and multiple devpts instances, it's possible
      to umount a devpts file system while a program still has /dev/tty opened
      pointing to a previosuly closed pty pair in that instance. In the case all
      ptmx and pts/N files are closed, umount can be done. If the program closes
      /dev/tty after umount is done, devpts_kill_index will use now an invalid
      super_block, which was already destroyed in the umount operation after
      running ->kill_sb. This is another "use after free" type of issue, but now
      related to the allocated super_block instance.
      
      To avoid the problem (warning at ida_remove and potential crashes) for
      this specific case, I added two functions in devpts which grabs additional
      references to the super_block, which pty code now uses so it makes sure
      the super block structure is still valid until pty shutdown is done.
      I also moved the additional inode references to the same functions, which
      also covered similar case with inode being freed before /dev/tty final
      close/shutdown.
      Signed-off-by: NHerton R. Krzesinski <herton@redhat.com>
      Cc: stable@vger.kernel.org # 2.6.29+
      Reviewed-by: NPeter Hurley <peter@hurleysoftware.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1f55c718
  5. 06 2月, 2016 3 次提交
  6. 05 2月, 2016 4 次提交
  7. 04 2月, 2016 5 次提交
  8. 03 2月, 2016 1 次提交
    • R
      modules: fix longstanding /proc/kallsyms vs module insertion race. · 8244062e
      Rusty Russell 提交于
      For CONFIG_KALLSYMS, we keep two symbol tables and two string tables.
      There's one full copy, marked SHF_ALLOC and laid out at the end of the
      module's init section.  There's also a cut-down version that only
      contains core symbols and strings, and lives in the module's core
      section.
      
      After module init (and before we free the module memory), we switch
      the mod->symtab, mod->num_symtab and mod->strtab to point to the core
      versions.  We do this under the module_mutex.
      
      However, kallsyms doesn't take the module_mutex: it uses
      preempt_disable() and rcu tricks to walk through the modules, because
      it's used in the oops path.  It's also used in /proc/kallsyms.
      There's nothing atomic about the change of these variables, so we can
      get the old (larger!) num_symtab and the new symtab pointer; in fact
      this is what I saw when trying to reproduce.
      
      By grouping these variables together, we can use a
      carefully-dereferenced pointer to ensure we always get one or the
      other (the free of the module init section is already done in an RCU
      callback, so that's safe).  We allocate the init one at the end of the
      module init section, and keep the core one inside the struct module
      itself (it could also have been allocated at the end of the module
      core, but that's probably overkill).
      Reported-by: NWeilong Chen <chenweilong@huawei.com>
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=111541
      Cc: stable@kernel.org
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      8244062e
  9. 01 2月, 2016 1 次提交
  10. 31 1月, 2016 3 次提交
    • D
      block: use DAX for partition table reads · d1a5f2b4
      Dan Williams 提交于
      Avoid populating pagecache when the block device is in DAX mode.
      Otherwise these page cache entries collide with the fsync/msync
      implementation and break data durability guarantees.
      
      Cc: Jan Kara <jack@suse.com>
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Reported-by: NRoss Zwisler <ross.zwisler@linux.intel.com>
      Tested-by: NRoss Zwisler <ross.zwisler@linux.intel.com>
      Reviewed-by: NMatthew Wilcox <willy@linux.intel.com>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      d1a5f2b4
    • D
      block: revert runtime dax control of the raw block device · 9f4736fe
      Dan Williams 提交于
      Dynamically enabling DAX requires that the page cache first be flushed
      and invalidated.  This must occur atomically with the change of DAX mode
      otherwise we confuse the fsync/msync tracking and violate data
      durability guarantees.  Eliminate the possibilty of DAX-disabled to
      DAX-enabled transitions for now and revisit this for the next cycle.
      
      Cc: Jan Kara <jack@suse.com>
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Matthew Wilcox <willy@linux.intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      9f4736fe
    • D
      fs, block: force direct-I/O for dax-enabled block devices · 65f87ee7
      Dan Williams 提交于
      Similar to the file I/O path, re-direct all I/O to the DAX path for I/O
      to a block-device special file.  Both regular files and device special
      files can use the common filp->f_mapping->host lookup to determing is
      DAX is enabled.
      
      Otherwise, we confuse the DAX code that does not expect to find live
      data in the page cache:
      
          ------------[ cut here ]------------
          WARNING: CPU: 0 PID: 7676 at mm/filemap.c:217
          __delete_from_page_cache+0x9f6/0xb60()
          Modules linked in:
          CPU: 0 PID: 7676 Comm: a.out Not tainted 4.4.0+ #276
          Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
           00000000ffffffff ffff88006d3f7738 ffffffff82999e2d 0000000000000000
           ffff8800620a0000 ffffffff86473d20 ffff88006d3f7778 ffffffff81352089
           ffffffff81658d36 ffffffff86473d20 00000000000000d9 ffffea0000009d60
          Call Trace:
           [<     inline     >] __dump_stack lib/dump_stack.c:15
           [<ffffffff82999e2d>] dump_stack+0x6f/0xa2 lib/dump_stack.c:50
           [<ffffffff81352089>] warn_slowpath_common+0xd9/0x140 kernel/panic.c:482
           [<ffffffff813522b9>] warn_slowpath_null+0x29/0x30 kernel/panic.c:515
           [<ffffffff81658d36>] __delete_from_page_cache+0x9f6/0xb60 mm/filemap.c:217
           [<ffffffff81658fb2>] delete_from_page_cache+0x112/0x200 mm/filemap.c:244
           [<ffffffff818af369>] __dax_fault+0x859/0x1800 fs/dax.c:487
           [<ffffffff8186f4f6>] blkdev_dax_fault+0x26/0x30 fs/block_dev.c:1730
           [<     inline     >] wp_pfn_shared mm/memory.c:2208
           [<ffffffff816e9145>] do_wp_page+0xc85/0x14f0 mm/memory.c:2307
           [<     inline     >] handle_pte_fault mm/memory.c:3323
           [<     inline     >] __handle_mm_fault mm/memory.c:3417
           [<ffffffff816ecec3>] handle_mm_fault+0x2483/0x4640 mm/memory.c:3446
           [<ffffffff8127eff6>] __do_page_fault+0x376/0x960 arch/x86/mm/fault.c:1238
           [<ffffffff8127f738>] trace_do_page_fault+0xe8/0x420 arch/x86/mm/fault.c:1331
           [<ffffffff812705c4>] do_async_page_fault+0x14/0xd0 arch/x86/kernel/kvm.c:264
           [<ffffffff86338f78>] async_page_fault+0x28/0x30 arch/x86/entry/entry_64.S:986
           [<ffffffff86336c36>] entry_SYSCALL_64_fastpath+0x16/0x7a
          arch/x86/entry/entry_64.S:185
          ---[ end trace dae21e0f85f1f98c ]---
      
      Fixes: 5a023cdb ("block: enable dax for raw block devices")
      Reported-by: NDmitry Vyukov <dvyukov@google.com>
      Reported-by: NKirill A. Shutemov <kirill@shutemov.name>
      Suggested-by: NJan Kara <jack@suse.cz>
      Reviewed-by: NJan Kara <jack@suse.cz>
      Suggested-by: NMatthew Wilcox <willy@linux.intel.com>
      Tested-by: NRoss Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      65f87ee7
  11. 30 1月, 2016 1 次提交
    • T
      workqueue: skip flush dependency checks for legacy workqueues · 23d11a58
      Tejun Heo 提交于
      fca839c0 ("workqueue: warn if memory reclaim tries to flush
      !WQ_MEM_RECLAIM workqueue") implemented flush dependency warning which
      triggers if a PF_MEMALLOC task or WQ_MEM_RECLAIM workqueue tries to
      flush a !WQ_MEM_RECLAIM workquee.
      
      This assumes that workqueues marked with WQ_MEM_RECLAIM sit in memory
      reclaim path and making it depend on something which may need more
      memory to make forward progress can lead to deadlocks.  Unfortunately,
      workqueues created with the legacy create*_workqueue() interface
      always have WQ_MEM_RECLAIM regardless of whether they are depended
      upon memory reclaim or not.  These spurious WQ_MEM_RECLAIM markings
      cause spurious triggering of the flush dependency checks.
      
        WARNING: CPU: 0 PID: 6 at kernel/workqueue.c:2361 check_flush_dependency+0x138/0x144()
        workqueue: WQ_MEM_RECLAIM deferwq:deferred_probe_work_func is flushing !WQ_MEM_RECLAIM events:lru_add_drain_per_cpu
        ...
        Workqueue: deferwq deferred_probe_work_func
        [<c0017acc>] (unwind_backtrace) from [<c0013134>] (show_stack+0x10/0x14)
        [<c0013134>] (show_stack) from [<c0245f18>] (dump_stack+0x94/0xd4)
        [<c0245f18>] (dump_stack) from [<c0026f9c>] (warn_slowpath_common+0x80/0xb0)
        [<c0026f9c>] (warn_slowpath_common) from [<c0026ffc>] (warn_slowpath_fmt+0x30/0x40)
        [<c0026ffc>] (warn_slowpath_fmt) from [<c00390b8>] (check_flush_dependency+0x138/0x144)
        [<c00390b8>] (check_flush_dependency) from [<c0039ca0>] (flush_work+0x50/0x15c)
        [<c0039ca0>] (flush_work) from [<c00c51b0>] (lru_add_drain_all+0x130/0x180)
        [<c00c51b0>] (lru_add_drain_all) from [<c00f728c>] (migrate_prep+0x8/0x10)
        [<c00f728c>] (migrate_prep) from [<c00bfbc4>] (alloc_contig_range+0xd8/0x338)
        [<c00bfbc4>] (alloc_contig_range) from [<c00f8f18>] (cma_alloc+0xe0/0x1ac)
        [<c00f8f18>] (cma_alloc) from [<c001cac4>] (__alloc_from_contiguous+0x38/0xd8)
        [<c001cac4>] (__alloc_from_contiguous) from [<c001ceb4>] (__dma_alloc+0x240/0x278)
        [<c001ceb4>] (__dma_alloc) from [<c001cf78>] (arm_dma_alloc+0x54/0x5c)
        [<c001cf78>] (arm_dma_alloc) from [<c0355ea4>] (dmam_alloc_coherent+0xc0/0xec)
        [<c0355ea4>] (dmam_alloc_coherent) from [<c039cc4c>] (ahci_port_start+0x150/0x1dc)
        [<c039cc4c>] (ahci_port_start) from [<c0384734>] (ata_host_start.part.3+0xc8/0x1c8)
        [<c0384734>] (ata_host_start.part.3) from [<c03898dc>] (ata_host_activate+0x50/0x148)
        [<c03898dc>] (ata_host_activate) from [<c039d558>] (ahci_host_activate+0x44/0x114)
        [<c039d558>] (ahci_host_activate) from [<c039f05c>] (ahci_platform_init_host+0x1d8/0x3c8)
        [<c039f05c>] (ahci_platform_init_host) from [<c039e6bc>] (tegra_ahci_probe+0x448/0x4e8)
        [<c039e6bc>] (tegra_ahci_probe) from [<c0347058>] (platform_drv_probe+0x50/0xac)
        [<c0347058>] (platform_drv_probe) from [<c03458cc>] (driver_probe_device+0x214/0x2c0)
        [<c03458cc>] (driver_probe_device) from [<c0343cc0>] (bus_for_each_drv+0x60/0x94)
        [<c0343cc0>] (bus_for_each_drv) from [<c03455d8>] (__device_attach+0xb0/0x114)
        [<c03455d8>] (__device_attach) from [<c0344ab8>] (bus_probe_device+0x84/0x8c)
        [<c0344ab8>] (bus_probe_device) from [<c0344f48>] (deferred_probe_work_func+0x68/0x98)
        [<c0344f48>] (deferred_probe_work_func) from [<c003b738>] (process_one_work+0x120/0x3f8)
        [<c003b738>] (process_one_work) from [<c003ba48>] (worker_thread+0x38/0x55c)
        [<c003ba48>] (worker_thread) from [<c0040f14>] (kthread+0xdc/0xf4)
        [<c0040f14>] (kthread) from [<c000f778>] (ret_from_fork+0x14/0x3c)
      
      Fix it by marking workqueues created via create*_workqueue() with
      __WQ_LEGACY and disabling flush dependency checks on them.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Reported-and-tested-by: NThierry Reding <thierry.reding@gmail.com>
      Link: http://lkml.kernel.org/g/20160126173843.GA11115@ulmo.nvidia.com
      Fixes: fca839c0 ("workqueue: warn if memory reclaim tries to flush !WQ_MEM_RECLAIM workqueue")
      23d11a58
  12. 29 1月, 2016 3 次提交
  13. 27 1月, 2016 3 次提交
  14. 26 1月, 2016 2 次提交
    • M
      irqdomain: Allow domain lookup with DOMAIN_BUS_WIRED token · 530cbe10
      Marc Zyngier 提交于
      Let's take the (outlandish) example of an interrupt controller
      capable of handling both wired interrupts and PCI MSIs.
      
      With the current code, the PCI MSI domain is going to be tagged
      with DOMAIN_BUS_PCI_MSI, and the wired domain with DOMAIN_BUS_ANY.
      
      Things get hairy when we start looking up the domain for a wired
      interrupt (typically when creating it based on some firmware
      information - DT or ACPI).
      
      In irq_create_fwspec_mapping(), we perform the lookup using
      DOMAIN_BUS_ANY, which is actually used as a wildcard. This gives
      us one chance out of two to end up with the wrong domain, and
      we try to configure a wired interrupt with the MSI domain.
      Everything grinds to a halt pretty quickly.
      
      What we really need to do is to start looking for a domain that
      would uniquely identify a wired interrupt domain, and only use
      DOMAIN_BUS_ANY as a fallback.
      
      In order to solve this, let's introduce a new DOMAIN_BUS_WIRED
      token, which is going to be used exactly as described above.
      Of course, this depends on the irqchip to setup the domain
      bus_token, and nobody had to implement this so far.
      
      Only so far.
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Rob Herring <robh+dt@kernel.org>
      Cc: Frank Rowand <frowand.list@gmail.com>
      Cc: Grant Likely <grant.likely@linaro.org>
      Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
      Cc: Jiang Liu <jiang.liu@linux.intel.com>
      Link: http://lkml.kernel.org/r/1453816347-32720-2-git-send-email-marc.zyngier@arm.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      530cbe10
    • D
      drivers: ata: wake port before DMA stop for ALPM · fb329633
      Danesh Petigara 提交于
      The AHCI driver code stops and starts port DMA engines at will
      without considering the power state of the particular port. The
      AHCI specification isn't very clear on how to handle this scenario,
      leaving implementation open to interpretation.
      
      Broadcom's STB SATA host controller is unable to handle port DMA
      controller restarts when the port in question is in low power mode.
      When a port enters partial or slumber mode, its PHY is powered down.
      When a controller restart is requested, the controller's internal
      state machine expects the PHY to be brought back up by software which
      never happens in this case, resulting in failures.
      
      To avoid this situation, logic is added to manually wake up the port
      just before its DMA engine is stopped, if the port happens to be in
      a low power state. HBA initiated power management ensures that the port
      eventually returns to its configured low power state, when the link is
      idle (as per the conditions listed in the spec). A new host flag is also
      added to ensure this logic is only exercised for hosts with the above
      limitation.
      
      tj: Formatting changes.
      Signed-off-by: NDanesh Petigara <dpetigara@broadcom.com>
      Reviewed-by: NMarkus Mayer <mmayer@broadcom.com>
      Signed-off-by: NTejun Heo <tj@kernel.org>
      fb329633
  15. 25 1月, 2016 2 次提交
    • M
      of: drop symbols declared by _OF_DECLARE() from modules · 71f50c6d
      Masahiro Yamada 提交于
      The users of this macro (OF_EARLYCON_DECLARE, CLK_OF_DECLARE,
      IRQCHIP_DECLARE, etc.) are only parsed in the early boot stage.
      Such symbols contained in modules are never used.
      
      This commit fixes the link error introduced by commit b8d20e06
      ("serial: 8250_uniphier: add earlycon support"); the combination
      of CONFIG_SERIAL_8250_UNIPHIER=m and CONFIG_SERIAL_8250_CONSOLE=y
      fails to link:
      
      ERROR: "early_serial8250_setup" [drivers/tty/serial/8250/8250_uniphier.ko] undefined!
      
      Fixes: b8d20e06 ("serial: 8250_uniphier: add earlycon support")
      Signed-off-by: NMasahiro Yamada <yamada.masahiro@socionext.com>
      Acked-by: NArnd Bergmann <arnd@arndb.de>
      Signed-off-by: NRob Herring <robh@kernel.org>
      71f50c6d
    • A
      net: simplify napi_synchronize() to avoid warnings · facc432f
      Arnd Bergmann 提交于
      The napi_synchronize() function is defined twice: The definition
      for SMP builds waits for other CPUs to be done, while the uniprocessor
      variant just contains a barrier and ignores its argument.
      
      In the mvneta driver, this leads to a warning about an unused variable
      when we lookup the NAPI struct of another CPU and then don't use it:
      
      ethernet/marvell/mvneta.c: In function 'mvneta_percpu_notifier':
      ethernet/marvell/mvneta.c:2910:30: error: unused variable 'other_port' [-Werror=unused-variable]
      
      There are no other CPUs on a UP build, so that code never runs, but
      gcc does not know this.
      
      The nicest solution seems to be to turn the napi_synchronize() helper
      into an inline function for the UP case as well, as that leads gcc to
      not complain about the argument being unused. Once we do that, we can
      also combine the two cases into a single function definition and use
      if(IS_ENABLED()) rather than #ifdef to make it look a bit nicer.
      
      The warning first came up in linux-4.4, but I failed to catch it
      earlier.
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Fixes: f8642885 ("net: mvneta: Statically assign queues to CPUs")
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      facc432f
  16. 24 1月, 2016 5 次提交
  17. 23 1月, 2016 2 次提交
    • R
      dax: add support for fsync/sync · 9973c98e
      Ross Zwisler 提交于
      To properly handle fsync/msync in an efficient way DAX needs to track
      dirty pages so it is able to flush them durably to media on demand.
      
      The tracking of dirty pages is done via the radix tree in struct
      address_space.  This radix tree is already used by the page writeback
      infrastructure for tracking dirty pages associated with an open file,
      and it already has support for exceptional (non struct page*) entries.
      We build upon these features to add exceptional entries to the radix
      tree for DAX dirty PMD or PTE pages at fault time.
      
      [dan.j.williams@intel.com: fix dax_pmd_dbg build warning]
      Signed-off-by: NRoss Zwisler <ross.zwisler@linux.intel.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: "J. Bruce Fields" <bfields@fieldses.org>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Andreas Dilger <adilger.kernel@dilger.ca>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jan Kara <jack@suse.com>
      Cc: Jeff Layton <jlayton@poochiereds.net>
      Cc: Matthew Wilcox <willy@linux.intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      9973c98e
    • R
      mm: add find_get_entries_tag() · 7e7f7749
      Ross Zwisler 提交于
      Add find_get_entries_tag() to the family of functions that include
      find_get_entries(), find_get_pages() and find_get_pages_tag().  This is
      needed for DAX dirty page handling because we need a list of both page
      offsets and radix tree entries ('indices' and 'entries' in this
      function) that are marked with the PAGECACHE_TAG_TOWRITE tag.
      Signed-off-by: NRoss Zwisler <ross.zwisler@linux.intel.com>
      Reviewed-by: NJan Kara <jack@suse.cz>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: "J. Bruce Fields" <bfields@fieldses.org>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Andreas Dilger <adilger.kernel@dilger.ca>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jeff Layton <jlayton@poochiereds.net>
      Cc: Matthew Wilcox <willy@linux.intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      7e7f7749