1. 20 10月, 2017 3 次提交
  2. 14 10月, 2017 1 次提交
    • Z
      mm: only display online cpus of the numa node · 064f0e93
      Zhen Lei 提交于
      When I execute numactl -H (which reads /sys/devices/system/node/nodeX/cpumap
      and displays cpumask_of_node for each node), I get different result
      on X86 and arm64.  For each numa node, the former only displayed online
      CPUs, and the latter displayed all possible CPUs.  Unfortunately, both
      Linux documentation and numactl manual have not described it clear.
      
      I sent a mail to ask for help, and Michal Hocko replied that he
      preferred to print online cpus because it doesn't really make much sense
      to bind anything on offline nodes.
      
      Will said:
       "I suspect the vast majority (if not all) code that reads this file was
        developed for x86, so having the same behaviour for arm64 sounds like
        something we should do ASAP before people try to special case with
        things like #ifdef __aarch64__. I'd rather have this in 4.14 if
        possible."
      
      Link: http://lkml.kernel.org/r/1506678805-15392-2-git-send-email-thunder.leizhen@huawei.comSigned-off-by: NZhen Lei <thunder.leizhen@huawei.com>
      Acked-by: NMichal Hocko <mhocko@suse.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Tianhong Ding <dingtianhong@huawei.com>
      Cc: Hanjun Guo <guohanjun@huawei.com>
      Cc: Libin <huawei.libin@huawei.com>
      Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      064f0e93
  3. 12 10月, 2017 1 次提交
  4. 10 10月, 2017 1 次提交
    • J
      device property: Track owner device of device property · 5ab894ae
      Jarkko Nikula 提交于
      Deletion of subdevice will remove device properties associated to parent
      when they share the same firmware node after commit 478573c9 (driver
      core: Don't leak secondary fwnode on device removal).  This was observed
      with a driver adding subdevice that driver wasn't able to read device
      properties after rmmod/modprobe cycle.
      
      Consider the lifecycle of it:
      
      parent device registration
      	ACPI_COMPANION_SET()
      	device_add_properties()
      		pset_copy_set()
      		set_secondary_fwnode(dev, &p->fwnode)
      	device_add()
      
      parent probe
      	read device properties
      	ACPI_COMPANION_SET(subdevice, ACPI_COMPANION(parent))
      	device_add(subdevice)
      
      parent remove
      	device_del(subdevice)
      		device_remove_properties()
      			set_secondary_fwnode(dev, NULL);
      			pset_free()
      
      Parent device will have its primary firmware node pointing to an ACPI
      node and secondary firmware node point to device properties.
      
      ACPI_COMPANION_SET() call in parent probe will set the subdevice's
      firmware node to point to the same 'struct fwnode_handle' and the
      associated secondary firmware node, i.e. the device properties as the
      parent.
      
      When subdevice is deleted in parent remove that will remove those
      device properties and attempt to read device properties in next
      parent probe call will fail.
      
      Fix this by tracking the owner device of device properties and delete
      them only when owner device is being deleted.
      
      Fixes: 478573c9 (driver core: Don't leak secondary fwnode on device removal)
      Cc: 4.9+ <stable@vger.kernel.org> # 4.9+
      Signed-off-by: NJarkko Nikula <jarkko.nikula@linux.intel.com>
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      5ab894ae
  5. 26 9月, 2017 1 次提交
  6. 20 9月, 2017 1 次提交
    • R
      PM: core: Fix device_pm_check_callbacks() · 157c460e
      Rafael J. Wysocki 提交于
      The device_pm_check_callbacks() function doesn't check legacy
      ->suspend and ->resume callback pointers under the device's
      bus type, class and driver, so in some cases it may set the
      no_pm_callbacks flag for the device incorrectly and then the
      callbacks may be skipped during system suspend/resume, which
      shouldn't happen.
      
      Fixes: aa8e54b5 (PM / sleep: Go direct_complete if driver has no callbacks)
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Cc: 4.5+ <stable@vger.kernel.org> # 4.5+
      157c460e
  7. 18 9月, 2017 3 次提交
  8. 17 9月, 2017 2 次提交
  9. 11 9月, 2017 1 次提交
  10. 09 9月, 2017 2 次提交
    • A
      treewide: make "nr_cpu_ids" unsigned · 9b130ad5
      Alexey Dobriyan 提交于
      First, number of CPUs can't be negative number.
      
      Second, different signnnedness leads to suboptimal code in the following
      cases:
      
      1)
      	kmalloc(nr_cpu_ids * sizeof(X));
      
      "int" has to be sign extended to size_t.
      
      2)
      	while (loff_t *pos < nr_cpu_ids)
      
      MOVSXD is 1 byte longed than the same MOV.
      
      Other cases exist as well. Basically compiler is told that nr_cpu_ids
      can't be negative which can't be deduced if it is "int".
      
      Code savings on allyesconfig kernel: -3KB
      
      	add/remove: 0/0 grow/shrink: 25/264 up/down: 261/-3631 (-3370)
      	function                                     old     new   delta
      	coretemp_cpu_online                          450     512     +62
      	rcu_init_one                                1234    1272     +38
      	pci_device_probe                             374     399     +25
      
      				...
      
      	pgdat_reclaimable_pages                      628     556     -72
      	select_fallback_rq                           446     369     -77
      	task_numa_find_cpu                          1923    1807    -116
      
      Link: http://lkml.kernel.org/r/20170819114959.GA30580@avx2Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      9b130ad5
    • K
      mm: change the call sites of numa statistics items · 3a321d2a
      Kemi Wang 提交于
      Patch series "Separate NUMA statistics from zone statistics", v2.
      
      Each page allocation updates a set of per-zone statistics with a call to
      zone_statistics().  As discussed in 2017 MM summit, these are a
      substantial source of overhead in the page allocator and are very rarely
      consumed.  This significant overhead in cache bouncing caused by zone
      counters (NUMA associated counters) update in parallel in multi-threaded
      page allocation (pointed out by Dave Hansen).
      
      A link to the MM summit slides:
        http://people.netfilter.org/hawk/presentations/MM-summit2017/MM-summit2017-JesperBrouer.pdf
      
      To mitigate this overhead, this patchset separates NUMA statistics from
      zone statistics framework, and update NUMA counter threshold to a fixed
      size of MAX_U16 - 2, as a small threshold greatly increases the update
      frequency of the global counter from local per cpu counter (suggested by
      Ying Huang).  The rationality is that these statistics counters don't
      need to be read often, unlike other VM counters, so it's not a problem
      to use a large threshold and make readers more expensive.
      
      With this patchset, we see 31.3% drop of CPU cycles(537-->369, see
      below) for per single page allocation and reclaim on Jesper's
      page_bench03 benchmark.  Meanwhile, this patchset keeps the same style
      of virtual memory statistics with little end-user-visible effects (only
      move the numa stats to show behind zone page stats, see the first patch
      for details).
      
      I did an experiment of single page allocation and reclaim concurrently
      using Jesper's page_bench03 benchmark on a 2-Socket Broadwell-based
      server (88 processors with 126G memory) with different size of threshold
      of pcp counter.
      
      Benchmark provided by Jesper D Brouer(increase loop times to 10000000):
        https://github.com/netoptimizer/prototype-kernel/tree/master/kernel/mm/bench
      
         Threshold   CPU cycles    Throughput(88 threads)
            32        799         241760478
            64        640         301628829
            125       537         358906028 <==> system by default
            256       468         412397590
            512       428         450550704
            4096      399         482520943
            20000     394         489009617
            30000     395         488017817
            65533     369(-31.3%) 521661345(+45.3%) <==> with this patchset
            N/A       342(-36.3%) 562900157(+56.8%) <==> disable zone_statistics
      
      This patch (of 3):
      
      In this patch, NUMA statistics is separated from zone statistics
      framework, all the call sites of NUMA stats are changed to use
      numa-stats-specific functions, it does not have any functionality change
      except that the number of NUMA stats is shown behind zone page stats
      when users *read* the zone info.
      
      E.g. cat /proc/zoneinfo
          ***Base***                           ***With this patch***
      nr_free_pages 3976                         nr_free_pages 3976
      nr_zone_inactive_anon 0                    nr_zone_inactive_anon 0
      nr_zone_active_anon 0                      nr_zone_active_anon 0
      nr_zone_inactive_file 0                    nr_zone_inactive_file 0
      nr_zone_active_file 0                      nr_zone_active_file 0
      nr_zone_unevictable 0                      nr_zone_unevictable 0
      nr_zone_write_pending 0                    nr_zone_write_pending 0
      nr_mlock     0                             nr_mlock     0
      nr_page_table_pages 0                      nr_page_table_pages 0
      nr_kernel_stack 0                          nr_kernel_stack 0
      nr_bounce    0                             nr_bounce    0
      nr_zspages   0                             nr_zspages   0
      numa_hit 0                                *nr_free_cma  0*
      numa_miss 0                                numa_hit     0
      numa_foreign 0                             numa_miss    0
      numa_interleave 0                          numa_foreign 0
      numa_local   0                             numa_interleave 0
      numa_other   0                             numa_local   0
      *nr_free_cma 0*                            numa_other 0
          ...                                        ...
      vm stats threshold: 10                     vm stats threshold: 10
          ...                                        ...
      
      The next patch updates the numa stats counter size and threshold.
      
      [akpm@linux-foundation.org: coding-style fixes]
      Link: http://lkml.kernel.org/r/1503568801-21305-2-git-send-email-kemi.wang@intel.comSigned-off-by: NKemi Wang <kemi.wang@intel.com>
      Reported-by: NJesper Dangaard Brouer <brouer@redhat.com>
      Acked-by: NMel Gorman <mgorman@techsingularity.net>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Christopher Lameter <cl@linux.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Andi Kleen <andi.kleen@intel.com>
      Cc: Ying Huang <ying.huang@intel.com>
      Cc: Aaron Lu <aaron.lu@intel.com>
      Cc: Tim Chen <tim.c.chen@intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      3a321d2a
  11. 07 9月, 2017 2 次提交
    • M
      mm, memory_hotplug: remove zone restrictions · c6f03e29
      Michal Hocko 提交于
      Historically we have enforced that any kernel zone (e.g ZONE_NORMAL) has
      to precede the Movable zone in the physical memory range.  The purpose
      of the movable zone is, however, not bound to any physical memory
      restriction.  It merely defines a class of migrateable and reclaimable
      memory.
      
      There are users (e.g.  CMA) who might want to reserve specific physical
      memory ranges for their own purpose.  Moreover our pfn walkers have to
      be prepared for zones overlapping in the physical range already because
      we do support interleaving NUMA nodes and therefore zones can interleave
      as well.  This means we can allow each memory block to be associated
      with a different zone.
      
      Loosen the current onlining semantic and allow explicit onlining type on
      any memblock.  That means that online_{kernel,movable} will be allowed
      regardless of the physical address of the memblock as long as it is
      offline of course.  This might result in moveble zone overlapping with
      other kernel zones.  Default onlining then becomes a bit tricky but
      still sensible.  echo online > memoryXY/state will online the given
      block to
      
      	1) the default zone if the given range is outside of any zone
      	2) the enclosing zone if such a zone doesn't interleave with
      	   any other zone
              3) the default zone if more zones interleave for this range
      
      where default zone is movable zone only if movable_node is enabled
      otherwise it is a kernel zone.
      
      Here is an example of the semantic with (movable_node is not present but
      it work in an analogous way). We start with following memblocks, all of
      them offline:
      
        memory34/valid_zones:Normal Movable
        memory35/valid_zones:Normal Movable
        memory36/valid_zones:Normal Movable
        memory37/valid_zones:Normal Movable
        memory38/valid_zones:Normal Movable
        memory39/valid_zones:Normal Movable
        memory40/valid_zones:Normal Movable
        memory41/valid_zones:Normal Movable
      
      Now, we online block 34 in default mode and block 37 as movable
      
        root@test1:/sys/devices/system/node/node1# echo online > memory34/state
        root@test1:/sys/devices/system/node/node1# echo online_movable > memory37/state
        memory34/valid_zones:Normal
        memory35/valid_zones:Normal Movable
        memory36/valid_zones:Normal Movable
        memory37/valid_zones:Movable
        memory38/valid_zones:Normal Movable
        memory39/valid_zones:Normal Movable
        memory40/valid_zones:Normal Movable
        memory41/valid_zones:Normal Movable
      
      As we can see all other blocks can still be onlined both into Normal and
      Movable zones and the Normal is default because the Movable zone spans
      only block37 now.
      
        root@test1:/sys/devices/system/node/node1# echo online_movable > memory41/state
        memory34/valid_zones:Normal
        memory35/valid_zones:Normal Movable
        memory36/valid_zones:Normal Movable
        memory37/valid_zones:Movable
        memory38/valid_zones:Movable Normal
        memory39/valid_zones:Movable Normal
        memory40/valid_zones:Movable Normal
        memory41/valid_zones:Movable
      
      Now the default zone for blocks 37-41 has changed because movable zone
      spans that range.
      
        root@test1:/sys/devices/system/node/node1# echo online_kernel > memory39/state
        memory34/valid_zones:Normal
        memory35/valid_zones:Normal Movable
        memory36/valid_zones:Normal Movable
        memory37/valid_zones:Movable
        memory38/valid_zones:Normal Movable
        memory39/valid_zones:Normal
        memory40/valid_zones:Movable Normal
        memory41/valid_zones:Movable
      
      Note that the block 39 now belongs to the zone Normal and so block38
      falls into Normal by default as well.
      
      For completness
      
        root@test1:/sys/devices/system/node/node1# for i in memory[34]?
        do
      	echo online > $i/state 2>/dev/null
        done
      
        memory34/valid_zones:Normal
        memory35/valid_zones:Normal
        memory36/valid_zones:Normal
        memory37/valid_zones:Movable
        memory38/valid_zones:Normal
        memory39/valid_zones:Normal
        memory40/valid_zones:Movable
        memory41/valid_zones:Movable
      
      Implementation wise the change is quite straightforward.  We can get rid
      of allow_online_pfn_range altogether.  online_pages allows only offline
      nodes already.  The original default_zone_for_pfn will become
      default_kernel_zone_for_pfn.  New default_zone_for_pfn implements the
      above semantic.  zone_for_pfn_range is slightly reorganized to implement
      kernel and movable online type explicitly and MMOP_ONLINE_KEEP becomes a
      catch all default behavior.
      
      Link: http://lkml.kernel.org/r/20170714121233.16861-3-mhocko@kernel.orgSigned-off-by: NMichal Hocko <mhocko@suse.com>
      Acked-by: NJoonsoo Kim <iamjoonsoo.kim@lge.com>
      Acked-by: NVlastimil Babka <vbabka@suse.cz>
      Acked-by: NReza Arbab <arbab@linux.vnet.ibm.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Yasuaki Ishimatsu <yasu.isimatu@gmail.com>
      Cc: Xishi Qiu <qiuxishi@huawei.com>
      Cc: Kani Toshimitsu <toshi.kani@hpe.com>
      Cc: <slaoub@gmail.com>
      Cc: Daniel Kiper <daniel.kiper@oracle.com>
      Cc: Igor Mammedov <imammedo@redhat.com>
      Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
      Cc: Wei Yang <richard.weiyang@gmail.com>
      Cc: <linux-api@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c6f03e29
    • M
      mm, memory_hotplug: display allowed zones in the preferred ordering · e5e68930
      Michal Hocko 提交于
      Prior to commit f1dd2cd1 ("mm, memory_hotplug: do not associate
      hotadded memory to zones until online") we used to allow to change the
      valid zone types of a memory block if it is adjacent to a different zone
      type.
      
      This fact was reflected in memoryNN/valid_zones by the ordering of
      printed zones.  The first one was default (echo online > memoryNN/state)
      and the other one could be onlined explicitly by online_{movable,kernel}.
      
      This behavior was removed by the said patch and as such the ordering was
      not all that important.  In most cases a kernel zone would be default
      anyway.  The only exception is movable_node handled by "mm,
      memory_hotplug: support movable_node for hotpluggable nodes".
      
      Let's reintroduce this behavior again because later patch will remove
      the zone overlap restriction and so user will be allowed to online
      kernel resp.  movable block regardless of its placement.  Original
      behavior will then become significant again because it would be
      non-trivial for users to see what is the default zone to online into.
      
      Implementation is really simple.  Pull out zone selection out of
      move_pfn_range into zone_for_pfn_range helper and use it in
      show_valid_zones to display the zone for default onlining and then both
      kernel and movable if they are allowed.  Default online zone is not
      duplicated.
      
      Link: http://lkml.kernel.org/r/20170714121233.16861-2-mhocko@kernel.orgSigned-off-by: NMichal Hocko <mhocko@suse.com>
      Acked-by: NJoonsoo Kim <iamjoonsoo.kim@lge.com>
      Acked-by: NVlastimil Babka <vbabka@suse.cz>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Reza Arbab <arbab@linux.vnet.ibm.com>
      Cc: Yasuaki Ishimatsu <yasu.isimatu@gmail.com>
      Cc: Xishi Qiu <qiuxishi@huawei.com>
      Cc: Kani Toshimitsu <toshi.kani@hpe.com>
      Cc: <slaoub@gmail.com>
      Cc: Daniel Kiper <daniel.kiper@oracle.com>
      Cc: Igor Mammedov <imammedo@redhat.com>
      Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
      Cc: Wei Yang <richard.weiyang@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      e5e68930
  12. 05 9月, 2017 1 次提交
    • A
      dma-coherent: fix dma_declare_coherent_memory() logic error · d35b0996
      Arnd Bergmann 提交于
      A recent change interprets the return code of dma_init_coherent_memory
      as an error value, but it is instead a boolean, where 'true' indicates
      success. This leads causes the caller to always do the wrong thing,
      and also triggers a compile-time warning about it:
      
      drivers/base/dma-coherent.c: In function 'dma_declare_coherent_memory':
      drivers/base/dma-coherent.c:99:15: error: 'mem' may be used uninitialized in this function [-Werror=maybe-uninitialized]
      
      I ended up changing the code a little more, to give use the usual
      error handling, as this seemed the best way to fix up the warning
      and make the code look reasonable at the same time.
      
      Fixes: 2436bdcd ("dma-coherent: remove the DMA_MEMORY_MAP and DMA_MEMORY_IO flags")
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      d35b0996
  13. 04 9月, 2017 1 次提交
  14. 01 9月, 2017 3 次提交
  15. 29 8月, 2017 1 次提交
  16. 28 8月, 2017 2 次提交
  17. 25 8月, 2017 1 次提交
  18. 11 8月, 2017 7 次提交
    • R
      PM / s2idle: Rename freeze_state enum and related items · f02f4f9d
      Rafael J. Wysocki 提交于
      Rename the freeze_state enum representing the suspend-to-idle state
      machine states to s2idle_states and rename the related variables and
      functions accordingly.
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      f02f4f9d
    • L
      firmware: enable a debug print for batched requests · 30172bea
      Luis R. Rodriguez 提交于
      Otherwise there is no easy way this actually happened.
      Signed-off-by: NLuis R. Rodriguez <mcgrof@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      30172bea
    • L
      firmware: define pr_fmt · 73da4b4b
      Luis R. Rodriguez 提交于
      For some reason we have always forgotten this. Without this
      we don't get a nice prefix on our pr_debug() / pr_*() messages.
      Signed-off-by: NLuis R. Rodriguez <mcgrof@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      73da4b4b
    • L
      firmware: send -EINTR on signal abort on fallback mechanism · 76098b36
      Luis R. Rodriguez 提交于
      Right now we send -EAGAIN to a syfs write which got interrupted.
      Userspace can't tell what happened though, send -EINTR if we
      were killed due to a signal so userspace can tell things apart.
      
      This is only applicable to the fallback mechanism.
      Reported-by: NMartin Fuzzey <mfuzzey@parkeon.com>
      Signed-off-by: NLuis R. Rodriguez <mcgrof@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      76098b36
    • L
      firmware: avoid invalid fallback aborts by using killable wait · 260d9f2f
      Luis R. Rodriguez 提交于
      Commit 0cb64249 ("firmware_loader: abort request if wait_for_completion
      is interrupted") added via 4.0 added support to abort the fallback mechanism
      when a signal was detected and wait_for_completion_interruptible() returned
      -ERESTARTSYS -- for instance when a user hits CTRL-C. The abort was overly
      *too* effective.
      
      When a child process terminates (successful or not) the signal SIGCHLD can
      be sent to the parent process which ran the child in the background and
      later triggered a sync request for firmware through a sysfs interface which
      relies on the fallback mechanism. This signal in turn can be recieved by the
      interruptible wait we constructed on firmware_class and detects it as an
      abort *before* userspace could get a chance to write the firmware. Upon
      failure -EAGAIN is returned, so userspace is also kept in the dark about
      exactly what happened.
      
      We can reproduce the issue with the fw_fallback.sh selftest:
      
      Before this patch:
      $ sudo tools/testing/selftests/firmware/fw_fallback.sh
      ...
      tools/testing/selftests/firmware/fw_fallback.sh: error - sync firmware request cancelled due to SIGCHLD
      
      After this patch:
      $ sudo tools/testing/selftests/firmware/fw_fallback.sh
      ...
      tools/testing/selftests/firmware/fw_fallback.sh: SIGCHLD on sync ignored as expected
      
      Fix this by making the wait killable -- only killable by SIGKILL (kill -9).
      We loose the ability to allow userspace to cancel a write with CTRL-C
      (SIGINT), however its been decided the compromise to require SIGKILL is
      worth the gains.
      
      Chances of this issue occuring are low due to the number of drivers upstream
      exclusively relying on the fallback mechanism for firmware (2 drivers),
      however this is observed in the field with custom drivers with sysfs
      triggers to load firmware. Only distributions relying on the fallback
      mechanism are impacted as well. An example reported issue was on Android,
      as follows:
      
      1) Android init (pid=1) fork()s (say pid=42) [this child process is totally
         unrelated to firmware loading, it could be sleep 2; for all we care ]
      2) Android init (pid=1) does a write() on a (driver custom) sysfs file which
         ends up calling request_firmware() kernel side
      3) The firmware loading fallback mechanism is used, the request is sent to
         userspace and pid 1 waits in the kernel on wait_*
      4) before firmware loading completes pid 42 dies (for any reason, even
         normal termination)
      5) Kernel delivers SIGCHLD to pid=1 to tell it a child has died, which
         causes -ERESTARTSYS to be returned from wait_*
      6) The kernel's wait aborts and return -EAGAIN for the
         request_firmware() caller.
      
      Cc: stable <stable@vger.kernel.org> # 4.0
      Fixes: 0cb64249 ("firmware_loader: abort request if wait_for_completion is interrupted")
      Suggested-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      Suggested-by: NDmitry Torokhov <dmitry.torokhov@gmail.com>
      Tested-by: NMartin Fuzzey <mfuzzey@parkeon.com>
      Reported-by: NMartin Fuzzey <mfuzzey@parkeon.com>
      Signed-off-by: NLuis R. Rodriguez <mcgrof@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      260d9f2f
    • L
      firmware: fix batched requests - send wake up on failure on direct lookups · 90d41e74
      Luis R. Rodriguez 提交于
      Fix batched requests from waiting forever on failure.
      
      The firmware API batched requests feature has been broken since the API call
      request_firmware_direct() was introduced on commit bba3a87e ("firmware:
      Introduce request_firmware_direct()"), added on v3.14 *iff* the firmware
      being requested was not present in *certain kernel builds* [0].
      
      When no firmware is found the worker which goes on to finish never informs
      waiters queued up of this, so any batched request will stall in what seems
      to be forever (MAX_SCHEDULE_TIMEOUT). Sadly, a reboot will also stall, as
      the reboot notifier was only designed to kill custom fallback workers. The
      issue seems to the user as a type of soft lockup, what *actually* happens
      underneath the hood is a wait call which never completes as we failed to
      issue a completion on error.
      
      For device drivers with optional firmware schemes (ie, Intel iwlwifi, or
      Netronome -- even though it uses request_firmware() and not
      request_firmware_direct()), this could mean that when you boot a system with
      multiple cards the firmware will seem to never load on the system, or that
      the card is just not responsive even the driver initialization. Due to
      differences in scheduling possible this should not always trigger --
      one would need to to ensure that multiple requests are in place at the
      right time for this to work, also release_firmware() must not be called
      prior to any other incoming request. The complexity may not be worth
      supporting batched requests in the future given the wait mechanism is
      only used also for the fallback mechanism. We'll keep it for now and
      just fix it.
      
      Its reported that at least with the Intel WiFi cards on one system this
      issue was creeping up 50% of the boots [0].
      
      Before this commit batched requests testing revealed:
      ============================================================================
      CONFIG_FW_LOADER_USER_HELPER_FALLBACK=n
      CONFIG_FW_LOADER_USER_HELPER=y
      
      Most common Linux distribution setup.
      
      API-type                               no-firmware-found   firmware-found
      ----------------------------------------------------------------------
      request_firmware()                     FAIL                OK
      request_firmware_direct()              FAIL                OK
      request_firmware_nowait(uevent=true)   FAIL                OK
      request_firmware_nowait(uevent=false)  FAIL                OK
      ============================================================================
      CONFIG_FW_LOADER_USER_HELPER_FALLBACK=n
      CONFIG_FW_LOADER_USER_HELPER=n
      
      Only possible if CONFIG_DELL_RBU=n and CONFIG_LEDS_LP55XX_COMMON=n, rare.
      
      API-type                               no-firmware-found   firmware-found
      ----------------------------------------------------------------------
      request_firmware()                     FAIL                OK
      request_firmware_direct()              FAIL                OK
      request_firmware_nowait(uevent=true)   FAIL                OK
      request_firmware_nowait(uevent=false)  FAIL                OK
      ============================================================================
      CONFIG_FW_LOADER_USER_HELPER_FALLBACK=y
      CONFIG_FW_LOADER_USER_HELPER=y
      
      Google Android setup.
      
      API-type                               no-firmware-found   firmware-found
      ----------------------------------------------------------------------
      request_firmware()                     OK                  OK
      request_firmware_direct()              FAIL                OK
      request_firmware_nowait(uevent=true)   OK                  OK
      request_firmware_nowait(uevent=false)  OK                  OK
      ============================================================================
      
      Ater this commit batched testing results:
      ============================================================================
      CONFIG_FW_LOADER_USER_HELPER_FALLBACK=n
      CONFIG_FW_LOADER_USER_HELPER=y
      
      Most common Linux distribution setup.
      
      API-type                               no-firmware-found   firmware-found
      ----------------------------------------------------------------------
      request_firmware()                     OK                  OK
      request_firmware_direct()              OK                  OK
      request_firmware_nowait(uevent=true)   OK                  OK
      request_firmware_nowait(uevent=false)  OK                  OK
      ============================================================================
      CONFIG_FW_LOADER_USER_HELPER_FALLBACK=n
      CONFIG_FW_LOADER_USER_HELPER=n
      
      Only possible if CONFIG_DELL_RBU=n and CONFIG_LEDS_LP55XX_COMMON=n, rare.
      
      API-type                               no-firmware-found   firmware-found
      ----------------------------------------------------------------------
      request_firmware()                     OK                  OK
      request_firmware_direct()              OK                  OK
      request_firmware_nowait(uevent=true)   OK                  OK
      request_firmware_nowait(uevent=false)  OK                  OK
      ============================================================================
      CONFIG_FW_LOADER_USER_HELPER_FALLBACK=y
      CONFIG_FW_LOADER_USER_HELPER=y
      
      Google Android setup.
      
      API-type                               no-firmware-found   firmware-found
      ----------------------------------------------------------------------
      request_firmware()                     OK                  OK
      request_firmware_direct()              OK                  OK
      request_firmware_nowait(uevent=true)   OK                  OK
      request_firmware_nowait(uevent=false)  OK                  OK
      ============================================================================
      
      [0] https://bugzilla.kernel.org/show_bug.cgi?id=195477
      
      Cc: stable <stable@vger.kernel.org> # v3.14
      Fixes: bba3a87e ("firmware: Introduce request_firmware_direct()"
      Reported-by: NNicolas <nbroeking@me.com>
      Reported-by: NJohn Ewalt  <jewalt@lgsinnovations.com>
      Reported-by: NJakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: NLuis R. Rodriguez <mcgrof@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      90d41e74
    • L
      firmware: fix batched requests - wake all waiters · e44565f6
      Luis R. Rodriguez 提交于
      The firmware cache mechanism serves two purposes, the secondary purpose is
      not well documented nor understood. This fixes a regression with the
      secondary purpose of the firmware cache mechanism: batched requests on
      successful lookups. Without this fix *any* time a batched request is
      triggered, secondary requests for which the batched request mechanism
      was designed for will seem to last forver and seem to never return.
      This issue is present for all kernel builds possible, and a hard reset
      is required.
      
      The firmware cache is used for:
      
      1) Addressing races with file lookups during the suspend/resume cycle
         by keeping firmware in memory during the suspend/resume cycle
      
      2) Batched requests for the same file rely only on work from the first file
         lookup, which keeps the firmware in memory until the last
         release_firmware() is called
      
      Batched requests *only* take effect if secondary requests come in prior to
      the first user calling release_firmware(). The devres name used for the
      internal firmware cache is used as a hint other pending requests are
      ongoing, the firmware buffer data is kept in memory until the last user of
      the buffer calls release_firmware(), therefore serializing requests and
      delaying the release until all requests are done.
      
      Batched requests wait for a wakup or signal so we can rely on the first file
      fetch to write to the pending secondary requests. Commit 5b029624
      ("firmware: do not use fw_lock for fw_state protection") ported the firmware
      API to use swait, and in doing so failed to convert complete_all() to
      swake_up_all() -- it used swake_up(), loosing the ability for *some* batched
      requests to take effect.
      
      We *could* fix this by just using swake_up_all() *but* swait is now known
      to be very special use case, so its best to just move away from it. So we
      just go back to using completions as before commit 5b029624 ("firmware:
      do not use fw_lock for fw_state protection") given this was using
      complete_all().
      
      Without this fix it has been reported plugging in two Intel 6260 Wifi cards
      on a system will end up enumerating the two devices only 50% of the time
      [0]. The ported swake_up() should have actually handled the case with two
      devices, however, *if more than two cards are used* the swake_up() would
      not have sufficed. This change is only part of the required fixes for
      batched requests. Another fix is provided in the next patch.
      
      This particular change should fix the cases where more than three requests
      with the same firmware name is used, otherwise batched requests will wait
      for MAX_SCHEDULE_TIMEOUT and just timeout eventually.
      
      Below is a summary of tests triggering batched requests on different
      kernel builds.
      
      Before this patch:
      ============================================================================
      CONFIG_FW_LOADER_USER_HELPER_FALLBACK=n
      CONFIG_FW_LOADER_USER_HELPER=y
      
      Most common Linux distribution setup.
      
      API-type                               no-firmware-found   firmware-found
      ----------------------------------------------------------------------
      request_firmware()                     FAIL                FAIL
      request_firmware_direct()              FAIL                FAIL
      request_firmware_nowait(uevent=true)   FAIL                FAIL
      request_firmware_nowait(uevent=false)  FAIL                FAIL
      ============================================================================
      CONFIG_FW_LOADER_USER_HELPER_FALLBACK=n
      CONFIG_FW_LOADER_USER_HELPER=n
      
      Only possible if CONFIG_DELL_RBU=n and CONFIG_LEDS_LP55XX_COMMON=n, rare.
      
      API-type                               no-firmware-found   firmware-found
      ----------------------------------------------------------------------
      request_firmware()                     FAIL                FAIL
      request_firmware_direct()              FAIL                FAIL
      request_firmware_nowait(uevent=true)   FAIL                FAIL
      request_firmware_nowait(uevent=false)  FAIL                FAIL
      ============================================================================
      CONFIG_FW_LOADER_USER_HELPER_FALLBACK=y
      CONFIG_FW_LOADER_USER_HELPER=y
      
      Google Android setup.
      
      API-type                               no-firmware-found   firmware-found
      ----------------------------------------------------------------------
      request_firmware()                     FAIL                FAIL
      request_firmware_direct()              FAIL                FAIL
      request_firmware_nowait(uevent=true)   FAIL                FAIL
      request_firmware_nowait(uevent=false)  FAIL                FAIL
      ============================================================================
      
      After this patch:
      ============================================================================
      CONFIG_FW_LOADER_USER_HELPER_FALLBACK=n
      CONFIG_FW_LOADER_USER_HELPER=y
      
      Most common Linux distribution setup.
      
      API-type                               no-firmware-found   firmware-found
      ----------------------------------------------------------------------
      request_firmware()                     FAIL                OK
      request_firmware_direct()              FAIL                OK
      request_firmware_nowait(uevent=true)   FAIL                OK
      request_firmware_nowait(uevent=false)  FAIL                OK
      ============================================================================
      CONFIG_FW_LOADER_USER_HELPER_FALLBACK=n
      CONFIG_FW_LOADER_USER_HELPER=n
      
      Only possible if CONFIG_DELL_RBU=n and CONFIG_LEDS_LP55XX_COMMON=n, rare.
      
      API-type                               no-firmware-found   firmware-found
      ----------------------------------------------------------------------
      request_firmware()                     FAIL                OK
      request_firmware_direct()              FAIL                OK
      request_firmware_nowait(uevent=true)   FAIL                OK
      request_firmware_nowait(uevent=false)  FAIL                OK
      ============================================================================
      CONFIG_FW_LOADER_USER_HELPER_FALLBACK=y
      CONFIG_FW_LOADER_USER_HELPER=y
      
      Google Android setup.
      
      API-type                               no-firmware-found   firmware-found
      ----------------------------------------------------------------------
      request_firmware()                     OK                  OK
      request_firmware_direct()              FAIL                OK
      request_firmware_nowait(uevent=true)   OK                  OK
      request_firmware_nowait(uevent=false)  OK                  OK
      ============================================================================
      
      [0] https://bugzilla.kernel.org/show_bug.cgi?id=195477
      
      CC: <stable@vger.kernel.org>    [4.10+]
      Cc: Ming Lei <ming.lei@redhat.com>
      Fixes: 5b029624 ("firmware: do not use fw_lock for fw_state protection")
      Reported-by: NJakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: NLuis R. Rodriguez <mcgrof@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e44565f6
  19. 08 8月, 2017 1 次提交
    • R
      PM / wakeup: Set power.can_wakeup if wakeup_sysfs_add() fails · 820b9b0c
      Rafael J. Wysocki 提交于
      Currently, an error from wakeup_sysfs_add() in
      device_set_wakeup_capable() causes the device's power.can_wakeup
      flag to remain unset even though the device technically is capable
      of signaling wakeup.
      
      If wakeup_sysfs_add() fails user space may not be able to enable
      the device to wake up the system from sleep states, but at least
      for some devices that does not matter.
      
      For this reason, set or clear power.can_wakeup upfront and if
      wakeup_sysfs_add() returns an error, print a message to the log.
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      820b9b0c
  20. 04 8月, 2017 1 次提交
    • T
      initcall_debug: add deferred probe times · 1f5000bd
      Todd Poynor 提交于
      initcall_debug attributes all deferred device probe retries for the
      late_initcall level to function deferred_probe_initcall.  Add logs of
      the individual device probe routines called, to identify which drivers
      are executing for how long during the initcall path.  Deferred probes
      that occur after initcall processing are not shown.
      
      Example log messages added:
      
      [    0.505119] deferred probe my-sound-device @ 6
      [    0.517656] deferred probe my-sound-device returned after 1227 usecs
      Signed-off-by: NTodd Poynor <toddpoynor@google.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1f5000bd
  21. 01 8月, 2017 1 次提交
  22. 25 7月, 2017 3 次提交