1. 14 11月, 2017 2 次提交
  2. 13 11月, 2017 3 次提交
  3. 12 11月, 2017 1 次提交
    • D
      timers: Add a function to start/reduce a timer · b24591e2
      David Howells 提交于
      Add a function, similar to mod_timer(), that will start a timer if it isn't
      running and will modify it if it is running and has an expiry time longer
      than the new time.  If the timer is running with an expiry time that's the
      same or sooner, no change is made.
      
      The function looks like:
      
      	int timer_reduce(struct timer_list *timer, unsigned long expires);
      
      This can be used by code such as networking code to make it easier to share
      a timer for multiple timeouts.  For instance, in upcoming AF_RXRPC code,
      the rxrpc_call struct will maintain a number of timeouts:
      
      	unsigned long	ack_at;
      	unsigned long	resend_at;
      	unsigned long	ping_at;
      	unsigned long	expect_rx_by;
      	unsigned long	expect_req_by;
      	unsigned long	expect_term_by;
      
      each of which is set independently of the others.  With timer reduction
      available, when the code needs to set one of the timeouts, it only needs to
      look at that timeout and then call timer_reduce() to modify the timer,
      starting it or bringing it forward if necessary.  There is no need to refer
      to the other timeouts to see which is earliest and no need to take any lock
      other than, potentially, the timer lock inside timer_reduce().
      
      Note, that this does not protect against concurrent invocations of any of
      the timer functions.
      
      As an example, the expect_rx_by timeout above, which terminates a call if
      we don't get a packet from the server within a certain time window, would
      be set something like this:
      
      	unsigned long now = jiffies;
      	unsigned long expect_rx_by = now + packet_receive_timeout;
      	WRITE_ONCE(call->expect_rx_by, expect_rx_by);
      	timer_reduce(&call->timer, expect_rx_by);
      
      The timer service code (which might, say, be in a work function) would then
      check all the timeouts to see which, if any, had triggered, deal with
      those:
      
      	t = READ_ONCE(call->ack_at);
      	if (time_after_eq(now, t)) {
      		cmpxchg(&call->ack_at, t, now + MAX_JIFFY_OFFSET);
      		set_bit(RXRPC_CALL_EV_ACK, &call->events);
      	}
      
      and then restart the timer if necessary by finding the soonest timeout that
      hasn't yet passed and then calling timer_reduce().
      
      The disadvantage of doing things this way rather than comparing the timers
      each time and calling mod_timer() is that you *will* take timer events
      unless you can finish what you're doing and delete the timer in time.
      
      The advantage of doing things this way is that you don't need to use a lock
      to work out when the next timer should be set, other than the timer's own
      lock - which you might not have to take.
      
      [ tglx: Fixed weird formatting and adopted it to pending changes ]
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: keyrings@vger.kernel.org
      Cc: linux-afs@lists.infradead.org
      Link: https://lkml.kernel.org/r/151023090769.23050.1801643667223880753.stgit@warthog.procyon.org.uk
      b24591e2
  4. 10 11月, 2017 3 次提交
  5. 09 11月, 2017 6 次提交
  6. 08 11月, 2017 4 次提交
  7. 07 11月, 2017 10 次提交
  8. 06 11月, 2017 5 次提交
    • R
      ACPI / PM: Take SMART_SUSPEND driver flag into account · 05087360
      Rafael J. Wysocki 提交于
      Make the ACPI PM domain take DPM_FLAG_SMART_SUSPEND into account in
      its system suspend callbacks.
      
      [Note that the pm_runtime_suspended() check in acpi_dev_needs_resume()
      is an optimization, because if is not passed, all of the subsequent
      checks may be skipped and some of them are much more overhead in
      general.]
      
      Also use the observation that if the device is in runtime suspend
      at the beginning of the "late" phase of a system-wide suspend-like
      transition, its state cannot change going forward (runtime PM is
      disabled for it at that time) until the transition is over and the
      subsequent system-wide PM callbacks should be skipped for it (as
      they generally assume the device to not be suspended), so add
      checks for that in acpi_subsys_suspend_late/noirq() and
      acpi_subsys_freeze_late/noirq().
      
      Moreover, if acpi_subsys_resume_noirq() is called during the
      subsequent system-wide resume transition and if the device was left
      in runtime suspend previously, its runtime PM status needs to be
      changed to "active" as it is going to be put into the full-power
      state going forward, so add a check for that too in there.
      
      In turn, if acpi_subsys_thaw_noirq() runs after the device has been
      left in runtime suspend, the subsequent "thaw" callbacks need
      to be skipped for it (as they may not work correctly with a
      suspended device), so set the power.direct_complete flag for the
      device then to make the PM core skip those callbacks.
      
      On top of the above, make the analogous changes in the acpi_lpss
      driver that uses the ACPI PM domain callbacks.
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Acked-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      05087360
    • R
      PCI / PM: Take SMART_SUSPEND driver flag into account · c4b65157
      Rafael J. Wysocki 提交于
      Make the PCI bus type take DPM_FLAG_SMART_SUSPEND into account in its
      system-wide PM callbacks and make sure that all code that should not
      run in parallel with pci_pm_runtime_resume() is executed in the "late"
      phases of system suspend, freeze and poweroff transitions.
      
      [Note that the pm_runtime_suspended() check in pci_dev_keep_suspended()
      is an optimization, because if is not passed, all of the subsequent
      checks may be skipped and some of them are much more overhead in
      general.]
      
      Also use the observation that if the device is in runtime suspend
      at the beginning of the "late" phase of a system-wide suspend-like
      transition, its state cannot change going forward (runtime PM is
      disabled for it at that time) until the transition is over and the
      subsequent system-wide PM callbacks should be skipped for it (as
      they generally assume the device to not be suspended), so add checks
      for that in pci_pm_suspend_late/noirq(), pci_pm_freeze_late/noirq()
      and pci_pm_poweroff_late/noirq().
      
      Moreover, if pci_pm_resume_noirq() or pci_pm_restore_noirq() is
      called during the subsequent system-wide resume transition and if
      the device was left in runtime suspend previously, its runtime PM
      status needs to be changed to "active" as it is going to be put
      into the full-power state, so add checks for that too to these
      functions.
      
      In turn, if pci_pm_thaw_noirq() runs after the device has been
      left in runtime suspend, the subsequent "thaw" callbacks need
      to be skipped for it (as they may not work correctly with a
      suspended device), so set the power.direct_complete flag for the
      device then to make the PM core skip those callbacks.
      
      In addition to the above add a core helper for checking if
      DPM_FLAG_SMART_SUSPEND is set and the device runtime PM status is
      "suspended" at the same time, which is done quite often in the new
      code (and will be done elsewhere going forward too).
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Acked-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Acked-by: NBjorn Helgaas <bhelgaas@google.com>
      c4b65157
    • R
      PM / core: Add SMART_SUSPEND driver flag · 0eab11c9
      Rafael J. Wysocki 提交于
      Define and document a SMART_SUSPEND flag to instruct bus types and PM
      domains that the system suspend callbacks provided by the driver can
      cope with runtime-suspended devices, so from the driver's perspective
      it should be safe to leave devices in runtime suspend during system
      suspend.
      
      Setting that flag may also cause middle-layer code (bus types,
      PM domains etc.) to skip invocations of the ->suspend_late and
      ->suspend_noirq callbacks provided by the driver if the device
      is in runtime suspend at the beginning of the "late" phase of
      the system-wide suspend transition, in which case the driver's
      system-wide resume callbacks may be invoked back-to-back with
      its ->runtime_suspend callback, so the driver has to be able to
      cope with that too.
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Acked-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Reviewed-by: NUlf Hansson <ulf.hansson@linaro.org>
      0eab11c9
    • R
      PCI / PM: Use the NEVER_SKIP driver flag · c2eac4d3
      Rafael J. Wysocki 提交于
      Replace the PCI-specific flag PCI_DEV_FLAGS_NEEDS_RESUME with the
      PM core's DPM_FLAG_NEVER_SKIP one everywhere and drop it.
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Acked-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Acked-by: NBjorn Helgaas <bhelgaas@google.com>
      Reviewed-by: NUlf Hansson <ulf.hansson@linaro.org>
      c2eac4d3
    • R
      PM / core: Add NEVER_SKIP and SMART_PREPARE driver flags · 08810a41
      Rafael J. Wysocki 提交于
      The motivation for this change is to provide a way to work around
      a problem with the direct-complete mechanism used for avoiding
      system suspend/resume handling for devices in runtime suspend.
      
      The problem is that some middle layer code (the PCI bus type and
      the ACPI PM domain in particular) returns positive values from its
      system suspend ->prepare callbacks regardless of whether the driver's
      ->prepare returns a positive value or 0, which effectively prevents
      drivers from being able to control the direct-complete feature.
      Some drivers need that control, however, and the PCI bus type has
      grown its own flag to deal with this issue, but since it is not
      limited to PCI, it is better to address it by adding driver flags at
      the core level.
      
      To that end, add a driver_flags field to struct dev_pm_info for flags
      that can be set by device drivers at the probe time to inform the PM
      core and/or bus types, PM domains and so on on the capabilities and/or
      preferences of device drivers.  Also add two static inline helpers
      for setting that field and testing it against a given set of flags
      and make the driver core clear it automatically on driver remove
      and probe failures.
      
      Define and document two PM driver flags related to the direct-
      complete feature: NEVER_SKIP and SMART_PREPARE that can be used,
      respectively, to indicate to the PM core that the direct-complete
      mechanism should never be used for the device and to inform the
      middle layer code (bus types, PM domains etc) that it can only
      request the PM core to use the direct-complete mechanism for
      the device (by returning a positive value from its ->prepare
      callback) if it also has been requested by the driver.
      
      While at it, make the core check pm_runtime_suspended() when
      setting power.direct_complete so that it doesn't need to be
      checked by ->prepare callbacks.
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Acked-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Acked-by: NBjorn Helgaas <bhelgaas@google.com>
      Reviewed-by: NUlf Hansson <ulf.hansson@linaro.org>
      08810a41
  9. 04 11月, 2017 4 次提交
  10. 03 11月, 2017 2 次提交
    • H
      mm, swap: fix race between swap count continuation operations · 2628bd6f
      Huang Ying 提交于
      One page may store a set of entries of the sis->swap_map
      (swap_info_struct->swap_map) in multiple swap clusters.
      
      If some of the entries has sis->swap_map[offset] > SWAP_MAP_MAX,
      multiple pages will be used to store the set of entries of the
      sis->swap_map.  And the pages are linked with page->lru.  This is called
      swap count continuation.  To access the pages which store the set of
      entries of the sis->swap_map simultaneously, previously, sis->lock is
      used.  But to improve the scalability of __swap_duplicate(), swap
      cluster lock may be used in swap_count_continued() now.  This may race
      with add_swap_count_continuation() which operates on a nearby swap
      cluster, in which the sis->swap_map entries are stored in the same page.
      
      The race can cause wrong swap count in practice, thus cause unfreeable
      swap entries or software lockup, etc.
      
      To fix the race, a new spin lock called cont_lock is added to struct
      swap_info_struct to protect the swap count continuation page list.  This
      is a lock at the swap device level, so the scalability isn't very well.
      But it is still much better than the original sis->lock, because it is
      only acquired/released when swap count continuation is used.  Which is
      considered rare in practice.  If it turns out that the scalability
      becomes an issue for some workloads, we can split the lock into some
      more fine grained locks.
      
      Link: http://lkml.kernel.org/r/20171017081320.28133-1-ying.huang@intel.com
      Fixes: 235b6217 ("mm/swap: add cluster lock")
      Signed-off-by: N"Huang, Ying" <ying.huang@intel.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Shaohua Li <shli@kernel.org>
      Cc: Tim Chen <tim.c.chen@intel.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Aaron Lu <aaron.lu@intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: <stable@vger.kernel.org>	[4.11+]
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2628bd6f
    • G
      crypto: introduce crypto wait for async op · ada69a16
      Gilad Ben-Yossef 提交于
      Invoking a possibly async. crypto op and waiting for completion
      while correctly handling backlog processing is a common task
      in the crypto API implementation and outside users of it.
      
      This patch adds a generic implementation for doing so in
      preparation for using it across the board instead of hand
      rolled versions.
      Signed-off-by: NGilad Ben-Yossef <gilad@benyossef.com>
      CC: Eric Biggers <ebiggers3@gmail.com>
      CC: Jonathan Cameron <Jonathan.Cameron@huawei.com>
      Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
      ada69a16