1. 29 12月, 2018 1 次提交
    • J
      mm/mmu_notifier: use structure for invalidate_range_start/end callback · 5d6527a7
      Jérôme Glisse 提交于
      Patch series "mmu notifier contextual informations", v2.
      
      This patchset adds contextual information, why an invalidation is
      happening, to mmu notifier callback.  This is necessary for user of mmu
      notifier that wish to maintains their own data structure without having to
      add new fields to struct vm_area_struct (vma).
      
      For instance device can have they own page table that mirror the process
      address space.  When a vma is unmap (munmap() syscall) the device driver
      can free the device page table for the range.
      
      Today we do not have any information on why a mmu notifier call back is
      happening and thus device driver have to assume that it is always an
      munmap().  This is inefficient at it means that it needs to re-allocate
      device page table on next page fault and rebuild the whole device driver
      data structure for the range.
      
      Other use case beside munmap() also exist, for instance it is pointless
      for device driver to invalidate the device page table when the
      invalidation is for the soft dirtyness tracking.  Or device driver can
      optimize away mprotect() that change the page table permission access for
      the range.
      
      This patchset enables all this optimizations for device drivers.  I do not
      include any of those in this series but another patchset I am posting will
      leverage this.
      
      The patchset is pretty simple from a code point of view.  The first two
      patches consolidate all mmu notifier arguments into a struct so that it is
      easier to add/change arguments.  The last patch adds the contextual
      information (munmap, protection, soft dirty, clear, ...).
      
      This patch (of 3):
      
      To avoid having to change many callback definition everytime we want to
      add a parameter use a structure to group all parameters for the
      mmu_notifier invalidate_range_start/end callback.  No functional changes
      with this patch.
      
      [akpm@linux-foundation.org: fix drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c kerneldoc]
      Link: http://lkml.kernel.org/r/20181205053628.3210-2-jglisse@redhat.comSigned-off-by: NJérôme Glisse <jglisse@redhat.com>
      Acked-by: NJan Kara <jack@suse.cz>
      Acked-by: Jason Gunthorpe <jgg@mellanox.com>	[infiniband]
      Cc: Matthew Wilcox <mawilcox@microsoft.com>
      Cc: Ross Zwisler <zwisler@kernel.org>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krcmar <rkrcmar@redhat.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Christian Koenig <christian.koenig@amd.com>
      Cc: Felix Kuehling <felix.kuehling@amd.com>
      Cc: Ralph Campbell <rcampbell@nvidia.com>
      Cc: John Hubbard <jhubbard@nvidia.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      5d6527a7
  2. 23 8月, 2018 1 次提交
    • M
      mm, oom: distinguish blockable mode for mmu notifiers · 93065ac7
      Michal Hocko 提交于
      There are several blockable mmu notifiers which might sleep in
      mmu_notifier_invalidate_range_start and that is a problem for the
      oom_reaper because it needs to guarantee a forward progress so it cannot
      depend on any sleepable locks.
      
      Currently we simply back off and mark an oom victim with blockable mmu
      notifiers as done after a short sleep.  That can result in selecting a new
      oom victim prematurely because the previous one still hasn't torn its
      memory down yet.
      
      We can do much better though.  Even if mmu notifiers use sleepable locks
      there is no reason to automatically assume those locks are held.  Moreover
      majority of notifiers only care about a portion of the address space and
      there is absolutely zero reason to fail when we are unmapping an unrelated
      range.  Many notifiers do really block and wait for HW which is harder to
      handle and we have to bail out though.
      
      This patch handles the low hanging fruit.
      __mmu_notifier_invalidate_range_start gets a blockable flag and callbacks
      are not allowed to sleep if the flag is set to false.  This is achieved by
      using trylock instead of the sleepable lock for most callbacks and
      continue as long as we do not block down the call chain.
      
      I think we can improve that even further because there is a common pattern
      to do a range lookup first and then do something about that.  The first
      part can be done without a sleeping lock in most cases AFAICS.
      
      The oom_reaper end then simply retries if there is at least one notifier
      which couldn't make any progress in !blockable mode.  A retry loop is
      already implemented to wait for the mmap_sem and this is basically the
      same thing.
      
      The simplest way for driver developers to test this code path is to wrap
      userspace code which uses these notifiers into a memcg and set the hard
      limit to hit the oom.  This can be done e.g.  after the test faults in all
      the mmu notifier managed memory and set the hard limit to something really
      small.  Then we are looking for a proper process tear down.
      
      [akpm@linux-foundation.org: coding style fixes]
      [akpm@linux-foundation.org: minor code simplification]
      Link: http://lkml.kernel.org/r/20180716115058.5559-1-mhocko@kernel.orgSigned-off-by: NMichal Hocko <mhocko@suse.com>
      Acked-by: Christian König <christian.koenig@amd.com> # AMD notifiers
      Acked-by: Leon Romanovsky <leonro@mellanox.com> # mlx and umem_odp
      Reported-by: NDavid Rientjes <rientjes@google.com>
      Cc: "David (ChunMing) Zhou" <David1.Zhou@amd.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Alex Deucher <alexander.deucher@amd.com>
      Cc: David Airlie <airlied@linux.ie>
      Cc: Jani Nikula <jani.nikula@linux.intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
      Cc: Doug Ledford <dledford@redhat.com>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Mike Marciniszyn <mike.marciniszyn@intel.com>
      Cc: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Cc: Sudeep Dutt <sudeep.dutt@intel.com>
      Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
      Cc: Dimitri Sivanich <sivanich@sgi.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: "Jérôme Glisse" <jglisse@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Felix Kuehling <felix.kuehling@amd.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      93065ac7
  3. 13 7月, 2018 1 次提交
  4. 14 5月, 2018 1 次提交
  5. 08 5月, 2018 1 次提交
  6. 16 2月, 2018 1 次提交
  7. 08 2月, 2018 1 次提交
  8. 21 11月, 2017 1 次提交
    • C
      drm/i915: Mark the userptr invalidate workqueue as WQ_MEM_RECLAIM · 457db89b
      Chris Wilson 提交于
      Commit  21cc6431 ("drm/i915: Mark the userptr invalidate workqueue
      as WQ_MEM_RECLAIM") tried to fixup the check_flush_dependency warning
      for hitting i915_gem_userptr_mn_invalidate_range_start from within the
      shrinker, but I failed to notice userptr has 2 similarly named
      workqueues. I marked up i915-userptr-acquire as WQ_MEM_RECLAIM whereas
      we only wait upon i915-userptr-release from inside the reclaim paths.
      
      [62530.869510] workqueue: PF_MEMALLOC task 7983(gem_shrink) is flushing !WQ_MEM_RECLAIM i915-userptr-release:          (null)
      [62530.869515] ------------[ cut here ]------------
      [62530.869519] WARNING: CPU: 1 PID: 7983 at kernel/workqueue.c:2434 check_flush_dependency+0x7f/0x110
      [62530.869519] Modules linked in: pegasus mii ip6table_filter ip6_tables bnep iptable_filter snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic binfmt_misc nls_iso8859_1 intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp snd_hda_intel snd_hda_codec kvm_intel snd_hda_core snd_hwdep kvm snd_pcm irqbypass snd_seq_midi snd_seq_midi_event snd_rawmidi crct10dif_pclmul crc32_pclmul 8250_dw ghash_clmulni_intel snd_seq pcbc snd_seq_device snd_timer btusb aesni_intel btrtl btbcm aes_x86_64 iwlwifi btintel crypto_simd glue_helper cryptd bluetooth snd intel_cstate input_leds idma64 intel_rapl_perf ecdh_generic serio_raw soundcore cfg80211 wmi_bmof virt_dma intel_lpss_pci intel_lpss acpi_als kfifo_buf industrialio winbond_cir soc_button_array rc_core spidev tpm_crb intel_hid acpi_pad mac_hid sparse_keymap
      [62530.869546]  parport_pc ppdev lp parport ip_tables x_tables autofs4 hid_generic usbhid i915 i2c_algo_bit prime_numbers drm_kms_helper syscopyarea e1000e sysfillrect sysimgblt fb_sys_fops ahci ptp pps_core libahci drm wmi video i2c_hid hid
      [62530.869557] CPU: 1 PID: 7983 Comm: gem_shrink Tainted: G     U  W    L  4.14.0-rc8-drm-tip-ww45-commit-1342299+ #1
      [62530.869558] Hardware name: Intel Corporation CoffeeLake Client Platform/CoffeeLake H DDR4 RVP, BIOS CNLSFWR1.R00.X098.A00.1707301945 07/30/2017
      [62530.869559] task: ffffa1049dbeec80 task.stack: ffffae7d05c44000
      [62530.869560] RIP: 0010:check_flush_dependency+0x7f/0x110
      [62530.869561] RSP: 0018:ffffae7d05c473a0 EFLAGS: 00010286
      [62530.869562] RAX: 000000000000006e RBX: ffffa1049540f400 RCX: ffffffffa3e55788
      [62530.869562] RDX: 0000000000000000 RSI: 0000000000000092 RDI: 0000000000000202
      [62530.869563] RBP: ffffae7d05c473c0 R08: 000000000000006e R09: 000000000038bb0e
      [62530.869563] R10: 0000000000000000 R11: 000000000000006e R12: ffffa1049dbeec80
      [62530.869564] R13: 0000000000000000 R14: 0000000000000000 R15: ffffae7d05c473e0
      [62530.869565] FS:  00007f621b129880(0000) GS:ffffa1050b240000(0000) knlGS:0000000000000000
      [62530.869566] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [62530.869566] CR2: 00007f6214400000 CR3: 0000000353a17003 CR4: 00000000003606e0
      [62530.869567] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [62530.869567] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [62530.869568] Call Trace:
      [62530.869570]  flush_workqueue+0x115/0x3d0
      [62530.869573]  ? wake_up_process+0x15/0x20
      [62530.869596]  i915_gem_userptr_mn_invalidate_range_start+0x12f/0x160 [i915]
      [62530.869614]  ? i915_gem_userptr_mn_invalidate_range_start+0x12f/0x160 [i915]
      [62530.869616]  __mmu_notifier_invalidate_range_start+0x55/0x80
      [62530.869618]  try_to_unmap_one+0x791/0x8b0
      [62530.869620]  ? call_rwsem_down_read_failed+0x18/0x30
      [62530.869622]  rmap_walk_anon+0x10b/0x260
      [62530.869624]  rmap_walk+0x48/0x60
      [62530.869625]  try_to_unmap+0x93/0xf0
      [62530.869626]  ? page_remove_rmap+0x2a0/0x2a0
      [62530.869627]  ? page_not_mapped+0x20/0x20
      [62530.869629]  ? page_get_anon_vma+0x90/0x90
      [62530.869630]  ? invalid_mkclean_vma+0x20/0x20
      [62530.869631]  migrate_pages+0x946/0xaa0
      [62530.869633]  ? __ClearPageMovable+0x10/0x10
      [62530.869635]  ? isolate_freepages_block+0x3c0/0x3c0
      [62530.869636]  compact_zone+0x22f/0x970
      [62530.869638]  compact_zone_order+0xa3/0xd0
      [62530.869640]  try_to_compact_pages+0x1a5/0x2a0
      [62530.869641]  ? try_to_compact_pages+0x1a5/0x2a0
      [62530.869643]  __alloc_pages_direct_compact+0x50/0x110
      [62530.869644]  __alloc_pages_slowpath+0x4da/0xf30
      [62530.869646]  __alloc_pages_nodemask+0x262/0x280
      [62530.869648]  alloc_pages_vma+0x165/0x1e0
      [62530.869649]  shmem_alloc_hugepage+0xd0/0x130
      [62530.869651]  ? __radix_tree_insert+0x45/0x230
      [62530.869652]  ? __vm_enough_memory+0x29/0x130
      [62530.869654]  shmem_alloc_and_acct_page+0x10d/0x1e0
      [62530.869655]  shmem_getpage_gfp+0x426/0xc00
      [62530.869657]  shmem_fault+0xa0/0x1e0
      [62530.869659]  ? file_update_time+0x60/0x110
      [62530.869660]  __do_fault+0x1e/0xc0
      [62530.869661]  __handle_mm_fault+0xa35/0x1170
      [62530.869662]  handle_mm_fault+0xcc/0x1c0
      [62530.869664]  __do_page_fault+0x262/0x4f0
      [62530.869666]  do_page_fault+0x2e/0xe0
      [62530.869667]  page_fault+0x22/0x30
      [62530.869668] RIP: 0033:0x404335
      [62530.869669] RSP: 002b:00007fff7829e420 EFLAGS: 00010216
      [62530.869670] RAX: 00007f6210400000 RBX: 0000000000000004 RCX: 0000000000b80000
      [62530.869670] RDX: 0000000000002e01 RSI: 0000000000008000 RDI: 0000000000000004
      [62530.869671] RBP: 0000000000000019 R08: 0000000000000002 R09: 0000000000000000
      [62530.869671] R10: 0000000000000559 R11: 0000000000000246 R12: 0000000008000000
      [62530.869672] R13: 00000000004042f0 R14: 0000000000000004 R15: 000000000000007e
      [62530.869673] Code: 00 8b b0 18 05 00 00 48 8d 8b b0 00 00 00 48 8d 90 c0 06 00 00 4d 89 f0 48 c7 c7 40 c0 c8 a3 c6 05 68 c5 e8 00 01 e8 c2 68 04 00 <0f> ff 4d 85 ed 74 18 49 8b 45 20 48 8b 70 08 8b 86 00 01 00 00
      [62530.869691] ---[ end trace 01e01ad0ff5781f8 ]---
      
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103739
      Fixes: 21cc6431 ("drm/i915: Mark the userptr invalidate workqueue as WQ_MEM_RECLAIM")
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Michał Winiarski <michal.winiarski@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171114173520.8829-1-chris@chris-wilson.co.ukReviewed-by: NMatthew Auld <matthew.auld@intel.com>
      (cherry picked from commit 41729bf2)
      Signed-off-by: NJani Nikula <jani.nikula@intel.com>
      457db89b
  9. 17 11月, 2017 1 次提交
    • C
      drm/i915: Mark the userptr invalidate workqueue as WQ_MEM_RECLAIM · 41729bf2
      Chris Wilson 提交于
      Commit  21cc6431 ("drm/i915: Mark the userptr invalidate workqueue
      as WQ_MEM_RECLAIM") tried to fixup the check_flush_dependency warning
      for hitting i915_gem_userptr_mn_invalidate_range_start from within the
      shrinker, but I failed to notice userptr has 2 similarly named
      workqueues. I marked up i915-userptr-acquire as WQ_MEM_RECLAIM whereas
      we only wait upon i915-userptr-release from inside the reclaim paths.
      
      [62530.869510] workqueue: PF_MEMALLOC task 7983(gem_shrink) is flushing !WQ_MEM_RECLAIM i915-userptr-release:          (null)
      [62530.869515] ------------[ cut here ]------------
      [62530.869519] WARNING: CPU: 1 PID: 7983 at kernel/workqueue.c:2434 check_flush_dependency+0x7f/0x110
      [62530.869519] Modules linked in: pegasus mii ip6table_filter ip6_tables bnep iptable_filter snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic binfmt_misc nls_iso8859_1 intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp snd_hda_intel snd_hda_codec kvm_intel snd_hda_core snd_hwdep kvm snd_pcm irqbypass snd_seq_midi snd_seq_midi_event snd_rawmidi crct10dif_pclmul crc32_pclmul 8250_dw ghash_clmulni_intel snd_seq pcbc snd_seq_device snd_timer btusb aesni_intel btrtl btbcm aes_x86_64 iwlwifi btintel crypto_simd glue_helper cryptd bluetooth snd intel_cstate input_leds idma64 intel_rapl_perf ecdh_generic serio_raw soundcore cfg80211 wmi_bmof virt_dma intel_lpss_pci intel_lpss acpi_als kfifo_buf industrialio winbond_cir soc_button_array rc_core spidev tpm_crb intel_hid acpi_pad mac_hid sparse_keymap
      [62530.869546]  parport_pc ppdev lp parport ip_tables x_tables autofs4 hid_generic usbhid i915 i2c_algo_bit prime_numbers drm_kms_helper syscopyarea e1000e sysfillrect sysimgblt fb_sys_fops ahci ptp pps_core libahci drm wmi video i2c_hid hid
      [62530.869557] CPU: 1 PID: 7983 Comm: gem_shrink Tainted: G     U  W    L  4.14.0-rc8-drm-tip-ww45-commit-1342299+ #1
      [62530.869558] Hardware name: Intel Corporation CoffeeLake Client Platform/CoffeeLake H DDR4 RVP, BIOS CNLSFWR1.R00.X098.A00.1707301945 07/30/2017
      [62530.869559] task: ffffa1049dbeec80 task.stack: ffffae7d05c44000
      [62530.869560] RIP: 0010:check_flush_dependency+0x7f/0x110
      [62530.869561] RSP: 0018:ffffae7d05c473a0 EFLAGS: 00010286
      [62530.869562] RAX: 000000000000006e RBX: ffffa1049540f400 RCX: ffffffffa3e55788
      [62530.869562] RDX: 0000000000000000 RSI: 0000000000000092 RDI: 0000000000000202
      [62530.869563] RBP: ffffae7d05c473c0 R08: 000000000000006e R09: 000000000038bb0e
      [62530.869563] R10: 0000000000000000 R11: 000000000000006e R12: ffffa1049dbeec80
      [62530.869564] R13: 0000000000000000 R14: 0000000000000000 R15: ffffae7d05c473e0
      [62530.869565] FS:  00007f621b129880(0000) GS:ffffa1050b240000(0000) knlGS:0000000000000000
      [62530.869566] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [62530.869566] CR2: 00007f6214400000 CR3: 0000000353a17003 CR4: 00000000003606e0
      [62530.869567] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [62530.869567] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [62530.869568] Call Trace:
      [62530.869570]  flush_workqueue+0x115/0x3d0
      [62530.869573]  ? wake_up_process+0x15/0x20
      [62530.869596]  i915_gem_userptr_mn_invalidate_range_start+0x12f/0x160 [i915]
      [62530.869614]  ? i915_gem_userptr_mn_invalidate_range_start+0x12f/0x160 [i915]
      [62530.869616]  __mmu_notifier_invalidate_range_start+0x55/0x80
      [62530.869618]  try_to_unmap_one+0x791/0x8b0
      [62530.869620]  ? call_rwsem_down_read_failed+0x18/0x30
      [62530.869622]  rmap_walk_anon+0x10b/0x260
      [62530.869624]  rmap_walk+0x48/0x60
      [62530.869625]  try_to_unmap+0x93/0xf0
      [62530.869626]  ? page_remove_rmap+0x2a0/0x2a0
      [62530.869627]  ? page_not_mapped+0x20/0x20
      [62530.869629]  ? page_get_anon_vma+0x90/0x90
      [62530.869630]  ? invalid_mkclean_vma+0x20/0x20
      [62530.869631]  migrate_pages+0x946/0xaa0
      [62530.869633]  ? __ClearPageMovable+0x10/0x10
      [62530.869635]  ? isolate_freepages_block+0x3c0/0x3c0
      [62530.869636]  compact_zone+0x22f/0x970
      [62530.869638]  compact_zone_order+0xa3/0xd0
      [62530.869640]  try_to_compact_pages+0x1a5/0x2a0
      [62530.869641]  ? try_to_compact_pages+0x1a5/0x2a0
      [62530.869643]  __alloc_pages_direct_compact+0x50/0x110
      [62530.869644]  __alloc_pages_slowpath+0x4da/0xf30
      [62530.869646]  __alloc_pages_nodemask+0x262/0x280
      [62530.869648]  alloc_pages_vma+0x165/0x1e0
      [62530.869649]  shmem_alloc_hugepage+0xd0/0x130
      [62530.869651]  ? __radix_tree_insert+0x45/0x230
      [62530.869652]  ? __vm_enough_memory+0x29/0x130
      [62530.869654]  shmem_alloc_and_acct_page+0x10d/0x1e0
      [62530.869655]  shmem_getpage_gfp+0x426/0xc00
      [62530.869657]  shmem_fault+0xa0/0x1e0
      [62530.869659]  ? file_update_time+0x60/0x110
      [62530.869660]  __do_fault+0x1e/0xc0
      [62530.869661]  __handle_mm_fault+0xa35/0x1170
      [62530.869662]  handle_mm_fault+0xcc/0x1c0
      [62530.869664]  __do_page_fault+0x262/0x4f0
      [62530.869666]  do_page_fault+0x2e/0xe0
      [62530.869667]  page_fault+0x22/0x30
      [62530.869668] RIP: 0033:0x404335
      [62530.869669] RSP: 002b:00007fff7829e420 EFLAGS: 00010216
      [62530.869670] RAX: 00007f6210400000 RBX: 0000000000000004 RCX: 0000000000b80000
      [62530.869670] RDX: 0000000000002e01 RSI: 0000000000008000 RDI: 0000000000000004
      [62530.869671] RBP: 0000000000000019 R08: 0000000000000002 R09: 0000000000000000
      [62530.869671] R10: 0000000000000559 R11: 0000000000000246 R12: 0000000008000000
      [62530.869672] R13: 00000000004042f0 R14: 0000000000000004 R15: 000000000000007e
      [62530.869673] Code: 00 8b b0 18 05 00 00 48 8d 8b b0 00 00 00 48 8d 90 c0 06 00 00 4d 89 f0 48 c7 c7 40 c0 c8 a3 c6 05 68 c5 e8 00 01 e8 c2 68 04 00 <0f> ff 4d 85 ed 74 18 49 8b 45 20 48 8b 70 08 8b 86 00 01 00 00
      [62530.869691] ---[ end trace 01e01ad0ff5781f8 ]---
      
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103739
      Fixes: 21cc6431 ("drm/i915: Mark the userptr invalidate workqueue as WQ_MEM_RECLAIM")
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Michał Winiarski <michal.winiarski@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171114173520.8829-1-chris@chris-wilson.co.ukReviewed-by: NMatthew Auld <matthew.auld@intel.com>
      41729bf2
  10. 16 11月, 2017 1 次提交
  11. 18 10月, 2017 1 次提交
  12. 17 10月, 2017 2 次提交
  13. 10 10月, 2017 2 次提交
    • D
      drm/i915: Preallocate our mmu notifier workequeu to unbreak cpu hotplug deadlock · 7741b547
      Daniel Vetter 提交于
      4.14-rc1 gained the fancy new cross-release support in lockdep, which
      seems to have uncovered a few more rules about what is allowed and
      isn't.
      
      This one here seems to indicate that allocating a work-queue while
      holding mmap_sem is a no-go, so let's try to preallocate it.
      
      Of course another way to break this chain would be somewhere in the
      cpu hotplug code, since this isn't the only trace we're finding now
      which goes through msr_create_device.
      
      Full lockdep splat:
      
      ======================================================
      WARNING: possible circular locking dependency detected
      4.14.0-rc1-CI-CI_DRM_3118+ #1 Tainted: G     U
      ------------------------------------------------------
      prime_mmap/1551 is trying to acquire lock:
       (cpu_hotplug_lock.rw_sem){++++}, at: [<ffffffff8109dbb7>] apply_workqueue_attrs+0x17/0x50
      
      but task is already holding lock:
       (&dev_priv->mm_lock){+.+.}, at: [<ffffffffa01a7b2a>] i915_gem_userptr_init__mmu_notifier+0x14a/0x270 [i915]
      
      which lock already depends on the new lock.
      
      the existing dependency chain (in reverse order) is:
      
      -> #6 (&dev_priv->mm_lock){+.+.}:
             __lock_acquire+0x1420/0x15e0
             lock_acquire+0xb0/0x200
             __mutex_lock+0x86/0x9b0
             mutex_lock_nested+0x1b/0x20
             i915_gem_userptr_init__mmu_notifier+0x14a/0x270 [i915]
             i915_gem_userptr_ioctl+0x222/0x2c0 [i915]
             drm_ioctl_kernel+0x69/0xb0
             drm_ioctl+0x2f9/0x3d0
             do_vfs_ioctl+0x94/0x670
             SyS_ioctl+0x41/0x70
             entry_SYSCALL_64_fastpath+0x1c/0xb1
      
      -> #5 (&mm->mmap_sem){++++}:
             __lock_acquire+0x1420/0x15e0
             lock_acquire+0xb0/0x200
             __might_fault+0x68/0x90
             _copy_to_user+0x23/0x70
             filldir+0xa5/0x120
             dcache_readdir+0xf9/0x170
             iterate_dir+0x69/0x1a0
             SyS_getdents+0xa5/0x140
             entry_SYSCALL_64_fastpath+0x1c/0xb1
      
      -> #4 (&sb->s_type->i_mutex_key#5){++++}:
             down_write+0x3b/0x70
             handle_create+0xcb/0x1e0
             devtmpfsd+0x139/0x180
             kthread+0x152/0x190
             ret_from_fork+0x27/0x40
      
      -> #3 ((complete)&req.done){+.+.}:
             __lock_acquire+0x1420/0x15e0
             lock_acquire+0xb0/0x200
             wait_for_common+0x58/0x210
             wait_for_completion+0x1d/0x20
             devtmpfs_create_node+0x13d/0x160
             device_add+0x5eb/0x620
             device_create_groups_vargs+0xe0/0xf0
             device_create+0x3a/0x40
             msr_device_create+0x2b/0x40
             cpuhp_invoke_callback+0xa3/0x840
             cpuhp_thread_fun+0x7a/0x150
             smpboot_thread_fn+0x18a/0x280
             kthread+0x152/0x190
             ret_from_fork+0x27/0x40
      
      -> #2 (cpuhp_state){+.+.}:
             __lock_acquire+0x1420/0x15e0
             lock_acquire+0xb0/0x200
             cpuhp_issue_call+0x10b/0x170
             __cpuhp_setup_state_cpuslocked+0x134/0x2a0
             __cpuhp_setup_state+0x46/0x60
             page_writeback_init+0x43/0x67
             pagecache_init+0x3d/0x42
             start_kernel+0x3a8/0x3fc
             x86_64_start_reservations+0x2a/0x2c
             x86_64_start_kernel+0x6d/0x70
             verify_cpu+0x0/0xfb
      
      -> #1 (cpuhp_state_mutex){+.+.}:
             __lock_acquire+0x1420/0x15e0
             lock_acquire+0xb0/0x200
             __mutex_lock+0x86/0x9b0
             mutex_lock_nested+0x1b/0x20
             __cpuhp_setup_state_cpuslocked+0x52/0x2a0
             __cpuhp_setup_state+0x46/0x60
             page_alloc_init+0x28/0x30
             start_kernel+0x145/0x3fc
             x86_64_start_reservations+0x2a/0x2c
             x86_64_start_kernel+0x6d/0x70
             verify_cpu+0x0/0xfb
      
      -> #0 (cpu_hotplug_lock.rw_sem){++++}:
             check_prev_add+0x430/0x840
             __lock_acquire+0x1420/0x15e0
             lock_acquire+0xb0/0x200
             cpus_read_lock+0x3d/0xb0
             apply_workqueue_attrs+0x17/0x50
             __alloc_workqueue_key+0x1d8/0x4d9
             i915_gem_userptr_init__mmu_notifier+0x1fb/0x270 [i915]
             i915_gem_userptr_ioctl+0x222/0x2c0 [i915]
             drm_ioctl_kernel+0x69/0xb0
             drm_ioctl+0x2f9/0x3d0
             do_vfs_ioctl+0x94/0x670
             SyS_ioctl+0x41/0x70
             entry_SYSCALL_64_fastpath+0x1c/0xb1
      
      other info that might help us debug this:
      
      Chain exists of:
        cpu_hotplug_lock.rw_sem --> &mm->mmap_sem --> &dev_priv->mm_lock
      
       Possible unsafe locking scenario:
      
             CPU0                    CPU1
             ----                    ----
        lock(&dev_priv->mm_lock);
                                     lock(&mm->mmap_sem);
                                     lock(&dev_priv->mm_lock);
        lock(cpu_hotplug_lock.rw_sem);
      
       *** DEADLOCK ***
      
      2 locks held by prime_mmap/1551:
       #0:  (&mm->mmap_sem){++++}, at: [<ffffffffa01a7b18>] i915_gem_userptr_init__mmu_notifier+0x138/0x270 [i915]
       #1:  (&dev_priv->mm_lock){+.+.}, at: [<ffffffffa01a7b2a>] i915_gem_userptr_init__mmu_notifier+0x14a/0x270 [i915]
      
      stack backtrace:
      CPU: 4 PID: 1551 Comm: prime_mmap Tainted: G     U          4.14.0-rc1-CI-CI_DRM_3118+ #1
      Hardware name: Dell Inc. XPS 8300  /0Y2MRG, BIOS A06 10/17/2011
      Call Trace:
       dump_stack+0x68/0x9f
       print_circular_bug+0x235/0x3c0
       ? lockdep_init_map_crosslock+0x20/0x20
       check_prev_add+0x430/0x840
       __lock_acquire+0x1420/0x15e0
       ? __lock_acquire+0x1420/0x15e0
       ? lockdep_init_map_crosslock+0x20/0x20
       lock_acquire+0xb0/0x200
       ? apply_workqueue_attrs+0x17/0x50
       cpus_read_lock+0x3d/0xb0
       ? apply_workqueue_attrs+0x17/0x50
       apply_workqueue_attrs+0x17/0x50
       __alloc_workqueue_key+0x1d8/0x4d9
       ? __lockdep_init_map+0x57/0x1c0
       i915_gem_userptr_init__mmu_notifier+0x1fb/0x270 [i915]
       i915_gem_userptr_ioctl+0x222/0x2c0 [i915]
       ? i915_gem_userptr_release+0x140/0x140 [i915]
       drm_ioctl_kernel+0x69/0xb0
       drm_ioctl+0x2f9/0x3d0
       ? i915_gem_userptr_release+0x140/0x140 [i915]
       ? __do_page_fault+0x2a4/0x570
       do_vfs_ioctl+0x94/0x670
       ? entry_SYSCALL_64_fastpath+0x5/0xb1
       ? __this_cpu_preempt_check+0x13/0x20
       ? trace_hardirqs_on_caller+0xe3/0x1b0
       SyS_ioctl+0x41/0x70
       entry_SYSCALL_64_fastpath+0x1c/0xb1
      RIP: 0033:0x7fbb83c39587
      RSP: 002b:00007fff188dc228 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
      RAX: ffffffffffffffda RBX: ffffffff81492963 RCX: 00007fbb83c39587
      RDX: 00007fff188dc260 RSI: 00000000c0186473 RDI: 0000000000000003
      RBP: ffffc90001487f88 R08: 0000000000000000 R09: 00007fff188dc2ac
      R10: 00007fbb83efcb58 R11: 0000000000000246 R12: 0000000000000000
      R13: 0000000000000003 R14: 00000000c0186473 R15: 00007fff188dc2ac
       ? __this_cpu_preempt_check+0x13/0x20
      
      Note that this also has the minor benefit of slightly reducing the
      critical section where we hold mmap_sem.
      
      v2: Set ret correctly when we raced with another thread.
      
      v3: Use Chris' diff. Attach the right lockdep splat.
      
      v4: Repaint in Tvrtko's colors (aka don't report ENOMEM if we race and
      some other thread managed to not also get an ENOMEM and successfully
      install the mmu notifier. Note that the kernel guarantees that small
      allocations succeed, so this never actually happens).
      
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Sasha Levin <alexander.levin@verizon.com>
      Cc: Marta Lofstedt <marta.lofstedt@intel.com>
      Cc: Tejun Heo <tj@kernel.org>
      References: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3180/shard-hsw3/igt@prime_mmap@test_userptr.html
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102939Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171009164401.16035-1-daniel.vetter@ffwll.ch
      7741b547
    • M
      drm/i915: s/sg_mask/sg_page_sizes/ · 84e8978e
      Matthew Auld 提交于
      It's a little unclear what the sg_mask actually is, so prefer the more
      meaningful name of sg_page_sizes.
      Suggested-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Signed-off-by: NMatthew Auld <matthew.auld@intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171009110024.29114-1-matthew.auld@intel.comReviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      84e8978e
  14. 07 10月, 2017 2 次提交
  15. 15 9月, 2017 1 次提交
    • C
      drm/i915: Mark the userptr invalidate workqueue as WQ_MEM_RECLAIM · 21cc6431
      Chris Wilson 提交于
      To silence the critcs:
      
      [56532.161115] workqueue: PF_MEMALLOC task 36(khugepaged) is flushing !WQ_MEM_RECLAIM i915-userptr-release:          (null)
      [56532.161138] ------------[ cut here ]------------
      [56532.161144] WARNING: CPU: 1 PID: 36 at kernel/workqueue.c:2418 check_flush_dependency+0xe8/0xf0
      [56532.161145] Modules linked in: wmi_bmof
      [56532.161148] CPU: 1 PID: 36 Comm: khugepaged Not tainted 4.13.0-krejzi #1
      [56532.161149] Hardware name: HP HP ProBook 470 G3/8102, BIOS N78 Ver. 01.17 06/08/2017
      [56532.161150] task: ffff8802371ee200 task.stack: ffffc90000174000
      [56532.161152] RIP: 0010:check_flush_dependency+0xe8/0xf0
      [56532.161152] RSP: 0018:ffffc900001777b8 EFLAGS: 00010286
      [56532.161153] RAX: 000000000000006c RBX: ffff88022fc5a000 RCX: 0000000000000001
      [56532.161154] RDX: 0000000000000000 RSI: 0000000000000086 RDI: 00000000ffffffff
      [56532.161155] RBP: 0000000000000000 R08: 14f038bb55f6dae0 R09: 0000000000000516
      [56532.161155] R10: ffffc900001778a0 R11: 000000006c756e28 R12: ffff8802371ee200
      [56532.161156] R13: 0000000000000000 R14: 000000000000000b R15: ffffc90000177810
      [56532.161157] FS:  0000000000000000(0000) GS:ffff880240480000(0000) knlGS:0000000000000000
      [56532.161158] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [56532.161158] CR2: 0000000004795ff8 CR3: 000000000220a000 CR4: 00000000003406e0
      [56532.161159] Call Trace:
      [56532.161161]  ? flush_workqueue+0x136/0x3e0
      [56532.161178]  ? _raw_spin_unlock_irqrestore+0xf/0x30
      [56532.161179]  ? try_to_wake_up+0x1ce/0x3b0
      [56532.161183]  ? i915_gem_userptr_mn_invalidate_range_start+0x13f/0x150
      [56532.161184]  ? _raw_spin_unlock+0xd/0x20
      [56532.161186]  ? i915_gem_userptr_mn_invalidate_range_start+0x13f/0x150
      [56532.161189]  ? __mmu_notifier_invalidate_range_start+0x4a/0x70
      [56532.161191]  ? try_to_unmap_one+0x5e5/0x660
      [56532.161193]  ? rmap_walk_file+0xe4/0x240
      [56532.161195]  ? __ClearPageMovable+0x10/0x10
      [56532.161196]  ? try_to_unmap+0x8c/0xe0
      [56532.161197]  ? page_remove_rmap+0x280/0x280
      [56532.161199]  ? page_not_mapped+0x10/0x10
      [56532.161200]  ? page_get_anon_vma+0x90/0x90
      [56532.161202]  ? migrate_pages+0x6a5/0x940
      [56532.161203]  ? isolate_freepages_block+0x330/0x330
      [56532.161205]  ? compact_zone+0x593/0x6a0
      [56532.161206]  ? enqueue_task_fair+0xc3/0x1180
      [56532.161208]  ? compact_zone_order+0x9b/0xc0
      [56532.161210]  ? get_page_from_freelist+0x24a/0x900
      [56532.161212]  ? try_to_compact_pages+0xc8/0x240
      [56532.161213]  ? try_to_compact_pages+0xc8/0x240
      [56532.161215]  ? __alloc_pages_direct_compact+0x45/0xe0
      [56532.161216]  ? __alloc_pages_slowpath+0x845/0xb90
      [56532.161218]  ? __alloc_pages_nodemask+0x176/0x1f0
      [56532.161220]  ? wait_woken+0x80/0x80
      [56532.161222]  ? khugepaged+0x29e/0x17d0
      [56532.161223]  ? wait_woken+0x80/0x80
      [56532.161225]  ? collapse_shmem.isra.39+0xa60/0xa60
      [56532.161226]  ? kthread+0x10d/0x130
      [56532.161227]  ? kthread_create_on_node+0x60/0x60
      [56532.161228]  ? ret_from_fork+0x22/0x30
      [56532.161229] Code: 00 8b b0 10 05 00 00 48 8d 8b b0 00 00 00 48 8d 90 b8 06 00 00 49 89 e8 48 c7 c7 38 55 09 82 c6 05 f9 c6 1d 01 01 e8 0e a1 03 00 <0f> ff e9 6b ff ff ff 90 48 8b 37 40 f6 c6 04 75 1b 48 c1 ee 05
      [56532.161251] ---[ end trace 2ce2b4f5f69b803b ]---
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Link: https://patchwork.freedesktop.org/patch/msgid/20170911084135.22903-2-chris@chris-wilson.co.ukReviewed-by: NMichał Winiarski <michal.winiarski@intel.com>
      21cc6431
  16. 14 9月, 2017 1 次提交
    • M
      mm: treewide: remove GFP_TEMPORARY allocation flag · 0ee931c4
      Michal Hocko 提交于
      GFP_TEMPORARY was introduced by commit e12ba74d ("Group short-lived
      and reclaimable kernel allocations") along with __GFP_RECLAIMABLE.  It's
      primary motivation was to allow users to tell that an allocation is
      short lived and so the allocator can try to place such allocations close
      together and prevent long term fragmentation.  As much as this sounds
      like a reasonable semantic it becomes much less clear when to use the
      highlevel GFP_TEMPORARY allocation flag.  How long is temporary? Can the
      context holding that memory sleep? Can it take locks? It seems there is
      no good answer for those questions.
      
      The current implementation of GFP_TEMPORARY is basically GFP_KERNEL |
      __GFP_RECLAIMABLE which in itself is tricky because basically none of
      the existing caller provide a way to reclaim the allocated memory.  So
      this is rather misleading and hard to evaluate for any benefits.
      
      I have checked some random users and none of them has added the flag
      with a specific justification.  I suspect most of them just copied from
      other existing users and others just thought it might be a good idea to
      use without any measuring.  This suggests that GFP_TEMPORARY just
      motivates for cargo cult usage without any reasoning.
      
      I believe that our gfp flags are quite complex already and especially
      those with highlevel semantic should be clearly defined to prevent from
      confusion and abuse.  Therefore I propose dropping GFP_TEMPORARY and
      replace all existing users to simply use GFP_KERNEL.  Please note that
      SLAB users with shrinkers will still get __GFP_RECLAIMABLE heuristic and
      so they will be placed properly for memory fragmentation prevention.
      
      I can see reasons we might want some gfp flag to reflect shorterm
      allocations but I propose starting from a clear semantic definition and
      only then add users with proper justification.
      
      This was been brought up before LSF this year by Matthew [1] and it
      turned out that GFP_TEMPORARY really doesn't have a clear semantic.  It
      seems to be a heuristic without any measured advantage for most (if not
      all) its current users.  The follow up discussion has revealed that
      opinions on what might be temporary allocation differ a lot between
      developers.  So rather than trying to tweak existing users into a
      semantic which they haven't expected I propose to simply remove the flag
      and start from scratch if we really need a semantic for short term
      allocations.
      
      [1] http://lkml.kernel.org/r/20170118054945.GD18349@bombadil.infradead.org
      
      [akpm@linux-foundation.org: fix typo]
      [akpm@linux-foundation.org: coding-style fixes]
      [sfr@canb.auug.org.au: drm/i915: fix up]
        Link: http://lkml.kernel.org/r/20170816144703.378d4f4d@canb.auug.org.au
      Link: http://lkml.kernel.org/r/20170728091904.14627-1-mhocko@kernel.orgSigned-off-by: NMichal Hocko <mhocko@suse.com>
      Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
      Acked-by: NMel Gorman <mgorman@suse.de>
      Acked-by: NVlastimil Babka <vbabka@suse.cz>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Neil Brown <neilb@suse.de>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      0ee931c4
  17. 09 9月, 2017 1 次提交
  18. 07 9月, 2017 1 次提交
  19. 15 8月, 2017 1 次提交
  20. 16 6月, 2017 3 次提交
  21. 18 5月, 2017 1 次提交
  22. 16 3月, 2017 1 次提交
    • C
      drm/i915/userptr: Reinvent GGTT self-faulting protection · 15c344f4
      Chris Wilson 提交于
      lockdep doesn't like us taking the mm->mmap_sem inside the get_pages
      callback for a couple of reasons. The straightforward deadlock:
      
      [13755.434059] =============================================
      [13755.434061] [ INFO: possible recursive locking detected ]
      [13755.434064] 4.11.0-rc1-CI-CI_DRM_297+ #1 Tainted: G     U
      [13755.434066] ---------------------------------------------
      [13755.434068] gem_userptr_bli/8398 is trying to acquire lock:
      [13755.434070]  (&mm->mmap_sem){++++++}, at: [<ffffffffa00c988a>] i915_gem_userptr_get_pages+0x5a/0x2e0 [i915]
      [13755.434096]
                     but task is already holding lock:
      [13755.434098]  (&mm->mmap_sem){++++++}, at: [<ffffffff8104d485>] __do_page_fault+0x105/0x560
      [13755.434105]
                     other info that might help us debug this:
      [13755.434108]  Possible unsafe locking scenario:
      
      [13755.434110]        CPU0
      [13755.434111]        ----
      [13755.434112]   lock(&mm->mmap_sem);
      [13755.434115]   lock(&mm->mmap_sem);
      [13755.434117]
                      *** DEADLOCK ***
      
      [13755.434121]  May be due to missing lock nesting notation
      
      [13755.434126] 2 locks held by gem_userptr_bli/8398:
      [13755.434128]  #0:  (&mm->mmap_sem){++++++}, at: [<ffffffff8104d485>] __do_page_fault+0x105/0x560
      [13755.434135]  #1:  (&obj->mm.lock){+.+.+.}, at: [<ffffffffa00b887d>] __i915_gem_object_get_pages+0x1d/0x70 [i915]
      [13755.434156]
                     stack backtrace:
      [13755.434161] CPU: 3 PID: 8398 Comm: gem_userptr_bli Tainted: G     U          4.11.0-rc1-CI-CI_DRM_297+ #1
      [13755.434165] Hardware name: GIGABYTE GB-BKi7(H)A-7500/MFLP7AP-00, BIOS F4 02/20/2017
      [13755.434169] Call Trace:
      [13755.434174]  dump_stack+0x67/0x92
      [13755.434178]  __lock_acquire+0x133a/0x1b50
      [13755.434182]  lock_acquire+0xc9/0x220
      [13755.434200]  ? i915_gem_userptr_get_pages+0x5a/0x2e0 [i915]
      [13755.434204]  down_read+0x42/0x70
      [13755.434221]  ? i915_gem_userptr_get_pages+0x5a/0x2e0 [i915]
      [13755.434238]  i915_gem_userptr_get_pages+0x5a/0x2e0 [i915]
      [13755.434255]  ____i915_gem_object_get_pages+0x25/0x60 [i915]
      [13755.434272]  __i915_gem_object_get_pages+0x59/0x70 [i915]
      [13755.434288]  i915_gem_fault+0x397/0x6a0 [i915]
      [13755.434304]  ? i915_gem_fault+0x1a1/0x6a0 [i915]
      [13755.434308]  ? __lock_acquire+0x449/0x1b50
      [13755.434311]  ? __lock_acquire+0x449/0x1b50
      [13755.434315]  ? vm_mmap_pgoff+0xa9/0xd0
      [13755.434318]  __do_fault+0x19/0x70
      [13755.434321]  __handle_mm_fault+0x863/0xe50
      [13755.434325]  handle_mm_fault+0x17f/0x370
      [13755.434329]  ? handle_mm_fault+0x40/0x370
      [13755.434332]  __do_page_fault+0x279/0x560
      [13755.434336]  do_page_fault+0xc/0x10
      [13755.434339]  page_fault+0x22/0x30
      [13755.434342] RIP: 0033:0x7f5ab91b5880
      [13755.434345] RSP: 002b:00007fff62922218 EFLAGS: 00010216
      [13755.434348] RAX: 0000000000b74500 RBX: 00007f5ab7f81000 RCX: 0000000000000000
      [13755.434352] RDX: 0000000000100000 RSI: 00007f5ab7f81000 RDI: 00007f5aba61c000
      [13755.434355] RBP: 00007f5aba61c000 R08: 0000000000000007 R09: 0000000100000000
      [13755.434359] R10: 000000000000037d R11: 00007f5ab91b5840 R12: 0000000000000001
      [13755.434362] R13: 0000000000000005 R14: 0000000000000001 R15: 0000000000000000
      
      and cyclic deadlocks:
      
      [ 2566.458979] ======================================================
      [ 2566.459054] [ INFO: possible circular locking dependency detected ]
      [ 2566.459127] 4.11.0-rc1+ #26 Not tainted
      [ 2566.459194] -------------------------------------------------------
      [ 2566.459266] gem_streaming_w/759 is trying to acquire lock:
      [ 2566.459334]  (&obj->mm.lock){+.+.+.}, at: [<ffffffffa034bc80>] i915_gem_object_pin_pages+0x0/0xc0 [i915]
      [ 2566.459605]
      [ 2566.459605] but task is already holding lock:
      [ 2566.459699]  (&mm->mmap_sem){++++++}, at: [<ffffffff8106fd11>] __do_page_fault+0x121/0x500
      [ 2566.459814]
      [ 2566.459814] which lock already depends on the new lock.
      [ 2566.459814]
      [ 2566.459934]
      [ 2566.459934] the existing dependency chain (in reverse order) is:
      [ 2566.460030]
      [ 2566.460030] -> #1 (&mm->mmap_sem){++++++}:
      [ 2566.460139]        lock_acquire+0xfe/0x220
      [ 2566.460214]        down_read+0x4e/0x90
      [ 2566.460444]        i915_gem_userptr_get_pages+0x6e/0x340 [i915]
      [ 2566.460669]        ____i915_gem_object_get_pages+0x8b/0xd0 [i915]
      [ 2566.460900]        __i915_gem_object_get_pages+0x6a/0x80 [i915]
      [ 2566.461132]        __i915_vma_do_pin+0x7fa/0x930 [i915]
      [ 2566.461352]        eb_add_vma+0x67b/0x830 [i915]
      [ 2566.461572]        eb_lookup_vmas+0xafe/0x1010 [i915]
      [ 2566.461792]        i915_gem_do_execbuffer+0x715/0x2870 [i915]
      [ 2566.462012]        i915_gem_execbuffer2+0x106/0x2b0 [i915]
      [ 2566.462152]        drm_ioctl+0x36c/0x670 [drm]
      [ 2566.462236]        do_vfs_ioctl+0x12c/0xa60
      [ 2566.462317]        SyS_ioctl+0x41/0x70
      [ 2566.462399]        entry_SYSCALL_64_fastpath+0x1c/0xb1
      [ 2566.462477]
      [ 2566.462477] -> #0 (&obj->mm.lock){+.+.+.}:
      [ 2566.462587]        __lock_acquire+0x1602/0x1790
      [ 2566.462661]        lock_acquire+0xfe/0x220
      [ 2566.462893]        i915_gem_object_pin_pages+0x4c/0xc0 [i915]
      [ 2566.463116]        i915_gem_fault+0x2c2/0x8c0 [i915]
      [ 2566.463197]        __do_fault+0x42/0x130
      [ 2566.463276]        __handle_mm_fault+0x92c/0x1280
      [ 2566.463356]        handle_mm_fault+0x1e2/0x440
      [ 2566.463443]        __do_page_fault+0x1c4/0x500
      [ 2566.463529]        do_page_fault+0xc/0x10
      [ 2566.463613]        page_fault+0x1f/0x30
      [ 2566.463693]
      [ 2566.463693] other info that might help us debug this:
      [ 2566.463693]
      [ 2566.463820]  Possible unsafe locking scenario:
      [ 2566.463820]
      [ 2566.463918]        CPU0                    CPU1
      [ 2566.463988]        ----                    ----
      [ 2566.464068]   lock(&mm->mmap_sem);
      [ 2566.464143]                                lock(&obj->mm.lock);
      [ 2566.464226]                                lock(&mm->mmap_sem);
      [ 2566.464304]   lock(&obj->mm.lock);
      [ 2566.464378]
      [ 2566.464378]  *** DEADLOCK ***
      [ 2566.464378]
      [ 2566.464504] 1 lock held by gem_streaming_w/759:
      [ 2566.464576]  #0:  (&mm->mmap_sem){++++++}, at: [<ffffffff8106fd11>] __do_page_fault+0x121/0x500
      [ 2566.464699]
      [ 2566.464699] stack backtrace:
      [ 2566.464801] CPU: 0 PID: 759 Comm: gem_streaming_w Not tainted 4.11.0-rc1+ #26
      [ 2566.464881] Hardware name: GIGABYTE GB-BXBT-1900/MZBAYAB-00, BIOS F8 03/02/2016
      [ 2566.464983] Call Trace:
      [ 2566.465061]  dump_stack+0x68/0x9f
      [ 2566.465144]  print_circular_bug+0x20b/0x260
      [ 2566.465234]  __lock_acquire+0x1602/0x1790
      [ 2566.465323]  ? debug_check_no_locks_freed+0x1a0/0x1a0
      [ 2566.465564]  ? i915_gem_object_wait+0x238/0x650 [i915]
      [ 2566.465657]  ? debug_lockdep_rcu_enabled.part.4+0x1a/0x30
      [ 2566.465749]  lock_acquire+0xfe/0x220
      [ 2566.465985]  ? i915_sg_trim+0x1b0/0x1b0 [i915]
      [ 2566.466223]  i915_gem_object_pin_pages+0x4c/0xc0 [i915]
      [ 2566.466461]  ? i915_sg_trim+0x1b0/0x1b0 [i915]
      [ 2566.466699]  i915_gem_fault+0x2c2/0x8c0 [i915]
      [ 2566.466939]  ? i915_gem_pwrite_ioctl+0xce0/0xce0 [i915]
      [ 2566.467030]  ? __lock_acquire+0x642/0x1790
      [ 2566.467122]  ? __lock_acquire+0x642/0x1790
      [ 2566.467209]  ? debug_lockdep_rcu_enabled+0x35/0x40
      [ 2566.467299]  ? get_unmapped_area+0x1b4/0x1d0
      [ 2566.467387]  __do_fault+0x42/0x130
      [ 2566.467474]  __handle_mm_fault+0x92c/0x1280
      [ 2566.467564]  ? __pmd_alloc+0x1e0/0x1e0
      [ 2566.467651]  ? vm_mmap_pgoff+0x160/0x190
      [ 2566.467740]  ? handle_mm_fault+0x111/0x440
      [ 2566.467827]  handle_mm_fault+0x1e2/0x440
      [ 2566.467914]  ? handle_mm_fault+0x5d/0x440
      [ 2566.468002]  __do_page_fault+0x1c4/0x500
      [ 2566.468090]  do_page_fault+0xc/0x10
      [ 2566.468180]  page_fault+0x1f/0x30
      [ 2566.468263] RIP: 0033:0x557895ced32a
      [ 2566.468337] RSP: 002b:00007fffd6dd8a10 EFLAGS: 00010202
      [ 2566.468419] RAX: 00007f659a4db000 RBX: 0000000000000003 RCX: 00007f659ad032da
      [ 2566.468501] RDX: 0000000000000000 RSI: 0000000000100000 RDI: 0000000000000000
      [ 2566.468586] RBP: 0000000000000007 R08: 0000000000000003 R09: 0000000100000000
      [ 2566.468667] R10: 0000000000000001 R11: 0000000000000246 R12: 0000557895ceda60
      [ 2566.468749] R13: 0000000000000001 R14: 00007fffd6dd8ac0 R15: 00007f659a4db000
      
      By checking the status of the gup worker (serialized by the
      obj->mm.lock) we can determine whether it is still active, has failed or
      has succeeded. If the worker is still active (or failed), we know that
      it cannot be bound and so we can skip taking struct_mutex (risking
      potential recursion). As we check the worker status, we mark it to
      discard any partial results, forcing us to restart on the next
      get_pages.
      Reported-by: NSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Fixes: 1c8782dd ("drm/i915/userptr: Disallow wrapping GTT into a userptr")
      Testcase: igt/gem_userptr_blits/map-fixed-invalidate-gup
      Testcase: igt/gem_userptr_blits/dmabuf-sync
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Michał Winiarski <michal.winiarski@intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20170315140150.19432-1-chris@chris-wilson.co.ukReviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      15c344f4
  23. 09 3月, 2017 3 次提交
    • C
      drm/i915/userptr: Disallow wrapping GTT into a userptr · 1c8782dd
      Chris Wilson 提交于
      If we allow the user to convert a GTT mmap address into a userptr, we
      may end up in recursion hell, where currently we hit a mutex deadlock
      but other possibilities include use-after-free during the
      unbind/cancel_userptr.
      
      [  143.203989] gem_userptr_bli D    0   902    898 0x00000000
      [  143.204054] Call Trace:
      [  143.204137]  __schedule+0x511/0x1180
      [  143.204195]  ? pci_mmcfg_check_reserved+0xc0/0xc0
      [  143.204274]  schedule+0x57/0xe0
      [  143.204327]  schedule_timeout+0x383/0x670
      [  143.204374]  ? trace_hardirqs_on_caller+0x187/0x280
      [  143.204457]  ? trace_hardirqs_on_thunk+0x1a/0x1c
      [  143.204507]  ? usleep_range+0x110/0x110
      [  143.204657]  ? irq_exit+0x89/0x100
      [  143.204710]  ? retint_kernel+0x2d/0x2d
      [  143.204794]  ? trace_hardirqs_on_caller+0x187/0x280
      [  143.204857]  ? _raw_spin_unlock_irq+0x33/0x60
      [  143.204944]  wait_for_common+0x1f0/0x2f0
      [  143.205006]  ? out_of_line_wait_on_atomic_t+0x170/0x170
      [  143.205103]  ? wake_up_q+0xa0/0xa0
      [  143.205159]  ? flush_workqueue_prep_pwqs+0x15a/0x2c0
      [  143.205237]  wait_for_completion+0x1d/0x20
      [  143.205292]  flush_workqueue+0x2e9/0xbb0
      [  143.205339]  ? flush_workqueue+0x163/0xbb0
      [  143.205418]  ? __schedule+0x533/0x1180
      [  143.205498]  ? check_flush_dependency+0x1a0/0x1a0
      [  143.205681]  i915_gem_userptr_mn_invalidate_range_start+0x1c7/0x270 [i915]
      [  143.205865]  ? i915_gem_userptr_dmabuf_export+0x40/0x40 [i915]
      [  143.205955]  __mmu_notifier_invalidate_range_start+0xc6/0x120
      [  143.206044]  ? __mmu_notifier_invalidate_range_start+0x51/0x120
      [  143.206123]  zap_page_range_single+0x1c7/0x1f0
      [  143.206171]  ? unmap_single_vma+0x160/0x160
      [  143.206260]  ? unmap_mapping_range+0xa9/0x1b0
      [  143.206308]  ? vma_interval_tree_subtree_search+0x75/0xd0
      [  143.206397]  unmap_mapping_range+0x18f/0x1b0
      [  143.206444]  ? zap_vma_ptes+0x70/0x70
      [  143.206524]  ? __pm_runtime_resume+0x67/0xa0
      [  143.206723]  i915_gem_release_mmap+0x1ba/0x1c0 [i915]
      [  143.206846]  i915_vma_unbind+0x5c2/0x690 [i915]
      [  143.206925]  ? __lock_is_held+0x52/0x100
      [  143.207076]  i915_gem_object_set_tiling+0x1db/0x650 [i915]
      [  143.207236]  i915_gem_set_tiling_ioctl+0x1d3/0x3b0 [i915]
      [  143.207377]  ? i915_gem_set_tiling_ioctl+0x5/0x3b0 [i915]
      [  143.207457]  drm_ioctl+0x36c/0x670
      [  143.207535]  ? debug_lockdep_rcu_enabled.part.0+0x1a/0x30
      [  143.207730]  ? i915_gem_object_set_tiling+0x650/0x650 [i915]
      [  143.207793]  ? drm_getunique+0x120/0x120
      [  143.207875]  ? __handle_mm_fault+0x996/0x14a0
      [  143.207939]  ? vm_insert_page+0x340/0x340
      [  143.208028]  ? up_write+0x28/0x50
      [  143.208086]  ? vm_mmap_pgoff+0x160/0x190
      [  143.208163]  do_vfs_ioctl+0x12c/0xa60
      [  143.208218]  ? debug_lockdep_rcu_enabled+0x35/0x40
      [  143.208267]  ? ioctl_preallocate+0x150/0x150
      [  143.208353]  ? __do_page_fault+0x36a/0x6e0
      [  143.208400]  ? mark_held_locks+0x23/0xc0
      [  143.208479]  ? up_read+0x1f/0x40
      [  143.208526]  ? entry_SYSCALL_64_fastpath+0x5/0xc6
      [  143.208669]  ? __fget_light+0xa7/0xc0
      [  143.208747]  SyS_ioctl+0x41/0x70
      
      To prevent the possibility of a deadlock, we defer scheduling the worker
      until after we have proven that given the current mm, the userptr range
      does not overlap a GGTT mmaping. If another thread tries to remap the
      GGTT over the userptr before the worker is scheduled, it will be stopped
      by its invalidate-range flushing the current work, before the deadlock
      can occur.
      
      v2: Improve discussion of how we end up in the deadlock.
      v3: Don't forget to mark the userptr as active after a successful
      gup_fast. Rename overlaps_ggtt to noncontiguous_or_overlaps_ggtt.
      v4: Fix test ordering between invalid GTT mmaping and range completion
      (Tvrtko)
      Reported-by: NMichał Winiarski <michal.winiarski@intel.com>
      Testcase: igt/gem_userptr_blits/map-fixed-invalidate-gup
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Michał Winiarski <michal.winiarski@intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20170308215903.24171-1-chris@chris-wilson.co.ukReviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      1c8782dd
    • C
      drm/i915/userptr: Only flush the workqueue if required · d151e9ce
      Chris Wilson 提交于
      To avoid waiting for work from other invalidate-range threads where
      not required, only wait on the userptr cancel workqueue if we have added
      some work to it.
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Michał Winiarski <michal.winiarski@intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20170307205851.32578-2-chris@chris-wilson.co.ukReviewed-by: NMichał Winiarski <michal.winiarski@intel.com>
      Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      d151e9ce
    • C
      drm/i915/userptr: Deactivate a failed userptr if the worker reports an EFAULT · 42953b3c
      Chris Wilson 提交于
      If the worker fails, it no longer has pages to release and can be
      immediately removed from the invalidate-tree.
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Michał Winiarski <michal.winiarski@intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20170307205851.32578-1-chris@chris-wilson.co.ukReviewed-by: NMichał Winiarski <michal.winiarski@intel.com>
      Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      42953b3c
  24. 02 3月, 2017 1 次提交
    • I
      sched/headers: Prepare for new header dependencies before moving code to <linux/sched/mm.h> · 6e84f315
      Ingo Molnar 提交于
      We are going to split <linux/sched/mm.h> out of <linux/sched.h>, which
      will have to be picked up from other headers and a couple of .c files.
      
      Create a trivial placeholder <linux/sched/mm.h> file that just
      maps to <linux/sched.h> to make this patch obviously correct and
      bisectable.
      
      The APIs that are going to be moved first are:
      
         mm_alloc()
         __mmdrop()
         mmdrop()
         mmdrop_async_fn()
         mmdrop_async()
         mmget_not_zero()
         mmput()
         mmput_async()
         get_task_mm()
         mm_access()
         mm_release()
      
      Include the new header in the files that are going to need it.
      Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      6e84f315
  25. 28 2月, 2017 2 次提交
  26. 15 12月, 2016 1 次提交
    • L
      mm: add locked parameter to get_user_pages_remote() · 5b56d49f
      Lorenzo Stoakes 提交于
      Patch series "mm: unexport __get_user_pages_unlocked()".
      
      This patch series continues the cleanup of get_user_pages*() functions
      taking advantage of the fact we can now pass gup_flags as we please.
      
      It firstly adds an additional 'locked' parameter to
      get_user_pages_remote() to allow for its callers to utilise
      VM_FAULT_RETRY functionality.  This is necessary as the invocation of
      __get_user_pages_unlocked() in process_vm_rw_single_vec() makes use of
      this and no other existing higher level function would allow it to do
      so.
      
      Secondly existing callers of __get_user_pages_unlocked() are replaced
      with the appropriate higher-level replacement -
      get_user_pages_unlocked() if the current task and memory descriptor are
      referenced, or get_user_pages_remote() if other task/memory descriptors
      are referenced (having acquiring mmap_sem.)
      
      This patch (of 2):
      
      Add a int *locked parameter to get_user_pages_remote() to allow
      VM_FAULT_RETRY faulting behaviour similar to get_user_pages_[un]locked().
      
      Taking into account the previous adjustments to get_user_pages*()
      functions allowing for the passing of gup_flags, we are now in a
      position where __get_user_pages_unlocked() need only be exported for his
      ability to allow VM_FAULT_RETRY behaviour, this adjustment allows us to
      subsequently unexport __get_user_pages_unlocked() as well as allowing
      for future flexibility in the use of get_user_pages_remote().
      
      [sfr@canb.auug.org.au: merge fix for get_user_pages_remote API change]
        Link: http://lkml.kernel.org/r/20161122210511.024ec341@canb.auug.org.au
      Link: http://lkml.kernel.org/r/20161027095141.2569-2-lstoakes@gmail.comSigned-off-by: NLorenzo Stoakes <lstoakes@gmail.com>
      Acked-by: NMichal Hocko <mhocko@suse.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krcmar <rkrcmar@redhat.com>
      Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      5b56d49f
  27. 02 12月, 2016 1 次提交
  28. 11 11月, 2016 1 次提交
  29. 02 11月, 2016 1 次提交
  30. 01 11月, 2016 1 次提交
  31. 29 10月, 2016 2 次提交