1. 28 1月, 2016 4 次提交
    • A
      cpufreq: cpufreq-dt: avoid uninitialized variable warnings: · b331bc20
      Arnd Bergmann 提交于
      gcc warns quite a bit about values returned from allocate_resources()
      in cpufreq-dt.c:
      
      cpufreq-dt.c: In function 'cpufreq_init':
      cpufreq-dt.c:327:6: error: 'cpu_dev' may be used uninitialized in this function [-Werror=maybe-uninitialized]
      cpufreq-dt.c:197:17: note: 'cpu_dev' was declared here
      cpufreq-dt.c:376:2: error: 'cpu_clk' may be used uninitialized in this function [-Werror=maybe-uninitialized]
      cpufreq-dt.c:199:14: note: 'cpu_clk' was declared here
      cpufreq-dt.c: In function 'dt_cpufreq_probe':
      cpufreq-dt.c:461:2: error: 'cpu_clk' may be used uninitialized in this function [-Werror=maybe-uninitialized]
      cpufreq-dt.c:447:14: note: 'cpu_clk' was declared here
      
      The problem is that it's slightly hard for gcc to follow return
      codes across PTR_ERR() calls.
      
      This patch uses explicit assignments to the "ret" variable to make
      it easier for gcc to verify that the code is actually correct,
      without the need to add a bogus initialization.
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Acked-by: NViresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      b331bc20
    • A
      cpufreq: pxa2xx: fix pxa_cpufreq_change_voltage prototype · fb2a24a1
      Arnd Bergmann 提交于
      There are two definitions of pxa_cpufreq_change_voltage, with slightly
      different prototypes after one of them had its argument marked 'const'.
      Now the other one (for !CONFIG_REGULATOR) produces a harmless warning:
      
      drivers/cpufreq/pxa2xx-cpufreq.c: In function 'pxa_set_target':
      drivers/cpufreq/pxa2xx-cpufreq.c:291:36: warning: passing argument 1 of 'pxa_cpufreq_change_voltage' discards 'const' qualifier from pointer target type [-Wdiscarded-qualifiers]
         ret = pxa_cpufreq_change_voltage(&pxa_freq_settings[idx]);
                                          ^
      drivers/cpufreq/pxa2xx-cpufreq.c:205:12: note: expected 'struct pxa_freqs *' but argument is of type 'const struct pxa_freqs *'
       static int pxa_cpufreq_change_voltage(struct pxa_freqs *pxa_freq)
                  ^
      
      This changes the prototype in the same way as the other, which
      avoids the warning.
      
      Fixes: 03c22990 (cpufreq: pxa: make pxa_freqs arrays const)
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Acked-by: NViresh Kumar <viresh.kumar@linaro.org>
      Cc: 4.2+ <stable@vger.kernel.org> # 4.2+
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      fb2a24a1
    • G
      cpufreq: Use list_is_last() to check last entry of the policy list · 2dadfd75
      Gautham R Shenoy 提交于
      Currently next_policy() explicitly checks if a policy is the last
      policy in the cpufreq_policy_list. Use the standard list_is_last
      primitive instead.
      Signed-off-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com>
      Acked-by: NViresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      2dadfd75
    • V
      cpufreq: Fix NULL reference crash while accessing policy->governor_data · e4b133cc
      Viresh Kumar 提交于
      There is a race discovered by Juri, where we are able to:
      - create and read a sysfs file before policy->governor_data is being set
        to a non NULL value.
        OR
      - set policy->governor_data to NULL, and reading a file before being
        destroyed.
      
      And so such a crash is reported:
      
      Unable to handle kernel NULL pointer dereference at virtual address 0000000c
      pgd = edfc8000
      [0000000c] *pgd=bfc8c835
      Internal error: Oops: 17 [#1] SMP ARM
      Modules linked in:
      CPU: 4 PID: 1730 Comm: cat Not tainted 4.5.0-rc1+ #463
      Hardware name: ARM-Versatile Express
      task: ee8e8480 ti: ee930000 task.ti: ee930000
      PC is at show_ignore_nice_load_gov_pol+0x24/0x34
      LR is at show+0x4c/0x60
      pc : [<c058f1bc>]    lr : [<c058ae88>]    psr: a0070013
      sp : ee931dd0  ip : ee931de0  fp : ee931ddc
      r10: ee4bc290  r9 : 00001000  r8 : ef2cb000
      r7 : ee4bc200  r6 : ef2cb000  r5 : c0af57b0  r4 : ee4bc2e0
      r3 : 00000000  r2 : 00000000  r1 : c0928df4  r0 : ef2cb000
      Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
      Control: 10c5387d  Table: adfc806a  DAC: 00000051
      Process cat (pid: 1730, stack limit = 0xee930210)
      Stack: (0xee931dd0 to 0xee932000)
      1dc0:                                     ee931dfc ee931de0 c058ae88 c058f1a4
      1de0: edce3bc0 c07bfca4 edce3ac0 00001000 ee931e24 ee931e00 c01fcb90 c058ae48
      1e00: 00000001 edce3bc0 00000000 00000001 ee931e50 ee8ff480 ee931e34 ee931e28
      1e20: c01fb33c c01fcb0c ee931e8c ee931e38 c01a5210 c01fb314 ee931e9c ee931e48
      1e40: 00000000 edce3bf0 befe4a00 ee931f78 00000000 00000000 000001e4 00000000
      1e60: c00545a8 edce3ac0 00001000 00001000 befe4a00 ee931f78 00000000 00001000
      1e80: ee931ed4 ee931e90 c01fbed8 c01a5038 ed085a58 00020000 00000000 00000000
      1ea0: c0ad72e4 ee931f78 ee8ff488 ee8ff480 c077f3fc 00001000 befe4a00 ee931f78
      1ec0: 00000000 00001000 ee931f44 ee931ed8 c017c328 c01fbdc4 00001000 00000000
      1ee0: ee8ff480 00001000 ee931f44 ee931ef8 c017c65c c03deb10 ee931fac ee931f08
      1f00: c0009270 c001f290 c0a8d968 ef2cb000 ef2cb000 ee8ff480 00000020 ee8ff480
      1f20: ee8ff480 befe4a00 00001000 ee931f78 00000000 00000000 ee931f74 ee931f48
      1f40: c017d1ec c017c2f8 c019c724 c019c684 ee8ff480 ee8ff480 00001000 befe4a00
      1f60: 00000000 00000000 ee931fa4 ee931f78 c017d2a8 c017d160 00000000 00000000
      1f80: 000a9f20 00001000 befe4a00 00000003 c000ffe4 ee930000 00000000 ee931fa8
      1fa0: c000fe40 c017d264 000a9f20 00001000 00000003 befe4a00 00001000 00000000
      Unable to handle kernel NULL pointer dereference at virtual address 0000000c
      1fc0: 000a9f20 00001000 befe4a00 00000003 00000000 00000000 00000003 00000001
      pgd = edfc4000
      [0000000c] *pgd=bfcac835
      1fe0: 00000000 befe49dc 000197f8 b6e35dfc 60070010 00000003 3065b49d 134ac2c9
      
      [<c058f1bc>] (show_ignore_nice_load_gov_pol) from [<c058ae88>] (show+0x4c/0x60)
      [<c058ae88>] (show) from [<c01fcb90>] (sysfs_kf_seq_show+0x90/0xfc)
      [<c01fcb90>] (sysfs_kf_seq_show) from [<c01fb33c>] (kernfs_seq_show+0x34/0x38)
      [<c01fb33c>] (kernfs_seq_show) from [<c01a5210>] (seq_read+0x1e4/0x4e4)
      [<c01a5210>] (seq_read) from [<c01fbed8>] (kernfs_fop_read+0x120/0x1a0)
      [<c01fbed8>] (kernfs_fop_read) from [<c017c328>] (__vfs_read+0x3c/0xe0)
      [<c017c328>] (__vfs_read) from [<c017d1ec>] (vfs_read+0x98/0x104)
      [<c017d1ec>] (vfs_read) from [<c017d2a8>] (SyS_read+0x50/0x90)
      [<c017d2a8>] (SyS_read) from [<c000fe40>] (ret_fast_syscall+0x0/0x1c)
      Code: e5903044 e1a00001 e3081df4 e34c1092 (e593300c)
      ---[ end trace 5994b9a5111f35ee ]---
      
      Fix that by making sure, policy->governor_data is updated at the right
      places only.
      
      Cc: 4.2+ <stable@vger.kernel.org> # 4.2+
      Reported-and-tested-by: NJuri Lelli <juri.lelli@arm.com>
      Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      e4b133cc
  2. 05 1月, 2016 2 次提交
    • A
      cpufreq-dt: fix handling regulator_get_voltage() result · 929ca89c
      Andrzej Hajda 提交于
      The function can return negative values so it should be assigned
      to signed type.
      
      The problem has been detected using proposed semantic patch
      scripts/coccinelle/tests/unsigned_lesser_than_zero.cocci.
      
      Link: http://permalink.gmane.org/gmane.linux.kernel/2038576Signed-off-by: NAndrzej Hajda <a.hajda@samsung.com>
      Acked-by: NViresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      929ca89c
    • C
      cpufreq: governor: Fix negative idle_time when configured with CONFIG_HZ_PERIODIC · 0df35026
      Chen Yu 提交于
      It is reported that, with CONFIG_HZ_PERIODIC=y cpu stays at the
      lowest frequency even if the usage goes to 100%, neither ondemand
      nor conservative governor works, however performance and
      userspace work as expected. If set with CONFIG_NO_HZ_FULL=y,
      everything goes well.
      
      This problem is caused by improper calculation of the idle_time
      when the load is extremely high(near 100%). Firstly, cpufreq_governor
      uses get_cpu_idle_time to get the total idle time for specific cpu, then:
      
      1.If the system is configured with CONFIG_NO_HZ_FULL, the idle time is
        returned by ktime_get, which is always increasing, it's OK.
      2.However, if the system is configured with CONFIG_HZ_PERIODIC,
        get_cpu_idle_time might not guarantee to be always increasing,
        because it will leverage get_cpu_idle_time_jiffy to calculate the
        idle_time, consider the following scenario:
      
      At T1:
      idle_tick_1 = total_tick_1 - user_tick_1
      
      sample period(80ms)...
      
      At T2: ( T2 = T1 + 80ms):
      idle_tick_2 = total_tick_2 - user_tick_2
      
      Currently the algorithm is using (idle_tick_2 - idle_tick_1) to
      get the delta idle_time during the past sample period, however
      it CAN NOT guarantee that idle_tick_2 >= idle_tick_1, especially
      when cpu load is high.
      (Yes, total_tick_2 >= total_tick_1, and user_tick_2 >= user_tick_1,
      but how about idle_tick_2 and idle_tick_1? No guarantee.)
      So governor might get a negative value of idle_time during the past
      sample period, which might mislead the system that the idle time is
      very big(converted to unsigned int), and the busy time is nearly zero,
      which causes the governor to always choose the lowest cpufreq,
      then cause this problem.
      
      In theory there are two solutions:
      
      1.The logic should not rely on the idle tick during every sample period,
        but be based on the busy tick directly, as this is how 'top' is
        implemented.
      
      2.Or the logic must make sure that the idle_time is strictly increasing
        during each sample period, then there would be no negative idle_time
        anymore. This solution requires minimum modification to current code
        and this patch uses method 2.
      
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=69821Reported-by: NJan Fikar <j.fikar@gmail.com>
      Signed-off-by: NChen Yu <yu.c.chen@intel.com>
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      0df35026
  3. 02 1月, 2016 2 次提交
  4. 01 1月, 2016 3 次提交
  5. 28 12月, 2015 3 次提交
  6. 24 12月, 2015 1 次提交
  7. 20 12月, 2015 2 次提交
    • S
      rtc: da9063: fix access ordering error during RTC interrupt at system power on · 77535ace
      Steve Twiss 提交于
      This fix alters the ordering of the IRQ and device registrations in the RTC
      driver probe function. This change will apply to the RTC driver that supports
      both DA9063 and DA9062 PMICs.
      
      A problem could occur with the existing RTC driver if:
      
      A system is started from a cold boot using the PMIC RTC IRQ to initiate a
      power on operation. For instance, if an RTC alarm is used to start a
      platform from power off.
      The existing driver IRQ is requested before the device has been properly
      registered.
      i.e.
          ret = devm_request_threaded_irq()
      comes before
          rtc->rtc_dev = devm_rtc_device_register();
      
      In this case, the interrupt can be called before the device has been
      registered and the handler can be called immediately. The IRQ handler
      da9063_alarm_event() contains the function call
      
          rtc_update_irq(rtc->rtc_dev, 1, RTC_IRQF | RTC_AF);
      
      which in turn tries to access the unavailable rtc->rtc_dev.
      
      The fix is to reorder the functions inside the RTC probe. The IRQ is
      requested after the RTC device resource has been registered so that
      get_irq_byname is the last thing to happen.
      Signed-off-by: NSteve Twiss <stwiss.opensource@diasemi.com>
      Signed-off-by: NAlexandre Belloni <alexandre.belloni@free-electrons.com>
      77535ace
    • J
      rtc: rk808: Compensate for Rockchip calendar deviation on November 31st · f076ef44
      Julius Werner 提交于
      In A.D. 1582 Pope Gregory XIII found that the existing Julian calendar
      insufficiently represented reality, and changed the rules about
      calculating leap years to account for this. Similarly, in A.D. 2013
      Rockchip hardware engineers found that the new Gregorian calendar still
      contained flaws, and that the month of November should be counted up to
      31 days instead. Unfortunately it takes a long time for calendar changes
      to gain widespread adoption, and just like more than 300 years went by
      before the last Protestant nation implemented Greg's proposal, we will
      have to wait a while until all religions and operating system kernels
      acknowledge the inherent advantages of the Rockchip system. Until then
      we need to translate dates read from (and written to) Rockchip hardware
      back to the Gregorian format.
      
      This patch works by defining Jan 1st, 2016 as the arbitrary anchor date
      on which Rockchip and Gregorian calendars are in sync. From that we can
      translate arbitrary later dates back and forth by counting the number
      of November/December transitons since the anchor date to determine the
      offset between the calendars. We choose this method (rather than trying
      to regularly "correct" the date stored in hardware) since it's the only
      way to ensure perfect time-keeping even if the system may be shut down
      for an unknown number of years. The drawback is that other software
      reading the same hardware (e.g. mainboard firmware) must use the same
      translation convention (including the same anchor date) to be able to
      read and write correct timestamps from/to the RTC.
      Signed-off-by: NJulius Werner <jwerner@chromium.org>
      Reviewed-by: NDouglas Anderson <dianders@chromium.org>
      Signed-off-by: NAlexandre Belloni <alexandre.belloni@free-electrons.com>
      f076ef44
  8. 19 12月, 2015 10 次提交
  9. 18 12月, 2015 13 次提交
    • K
      xen/pciback: Don't allow MSI-X ops if PCI_COMMAND_MEMORY is not set. · 408fb0e5
      Konrad Rzeszutek Wilk 提交于
      commit f598282f ("PCI: Fix the NIU MSI-X problem in a better way")
      teaches us that dealing with MSI-X can be troublesome.
      
      Further checks in the MSI-X architecture shows that if the
      PCI_COMMAND_MEMORY bit is turned of in the PCI_COMMAND we
      may not be able to access the BAR (since they are memory regions).
      
      Since the MSI-X tables are located in there.. that can lead
      to us causing PCIe errors. Inhibit us performing any
      operation on the MSI-X unless the MEMORY bit is set.
      
      Note that Xen hypervisor with:
      "x86/MSI-X: access MSI-X table only after having enabled MSI-X"
      will return:
      xen_pciback: 0000:0a:00.1: error -6 enabling MSI-X for guest 3!
      
      When the generic MSI code tries to setup the PIRQ without
      MEMORY bit set. Which means with later versions of Xen
      (4.6) this patch is not neccessary.
      
      This is part of XSA-157
      
      CC: stable@vger.kernel.org
      Reviewed-by: NJan Beulich <jbeulich@suse.com>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      408fb0e5
    • K
      xen/pciback: For XEN_PCI_OP_disable_msi[|x] only disable if device has MSI(X) enabled. · 7cfb905b
      Konrad Rzeszutek Wilk 提交于
      Otherwise just continue on, returning the same values as
      previously (return of 0, and op->result has the PIRQ value).
      
      This does not change the behavior of XEN_PCI_OP_disable_msi[|x].
      
      The pci_disable_msi or pci_disable_msix have the checks for
      msi_enabled or msix_enabled so they will error out immediately.
      
      However the guest can still call these operations and cause
      us to disable the 'ack_intr'. That means the backend IRQ handler
      for the legacy interrupt will not respond to interrupts anymore.
      
      This will lead to (if the device is causing an interrupt storm)
      for the Linux generic code to disable the interrupt line.
      
      Naturally this will only happen if the device in question
      is plugged in on the motherboard on shared level interrupt GSI.
      
      This is part of XSA-157
      
      CC: stable@vger.kernel.org
      Reviewed-by: NDavid Vrabel <david.vrabel@citrix.com>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      7cfb905b
    • K
      xen/pciback: Do not install an IRQ handler for MSI interrupts. · a396f3a2
      Konrad Rzeszutek Wilk 提交于
      Otherwise an guest can subvert the generic MSI code to trigger
      an BUG_ON condition during MSI interrupt freeing:
      
       for (i = 0; i < entry->nvec_used; i++)
              BUG_ON(irq_has_action(entry->irq + i));
      
      Xen PCI backed installs an IRQ handler (request_irq) for
      the dev->irq whenever the guest writes PCI_COMMAND_MEMORY
      (or PCI_COMMAND_IO) to the PCI_COMMAND register. This is
      done in case the device has legacy interrupts the GSI line
      is shared by the backend devices.
      
      To subvert the backend the guest needs to make the backend
      to change the dev->irq from the GSI to the MSI interrupt line,
      make the backend allocate an interrupt handler, and then command
      the backend to free the MSI interrupt and hit the BUG_ON.
      
      Since the backend only calls 'request_irq' when the guest
      writes to the PCI_COMMAND register the guest needs to call
      XEN_PCI_OP_enable_msi before any other operation. This will
      cause the generic MSI code to setup an MSI entry and
      populate dev->irq with the new PIRQ value.
      
      Then the guest can write to PCI_COMMAND PCI_COMMAND_MEMORY
      and cause the backend to setup an IRQ handler for dev->irq
      (which instead of the GSI value has the MSI pirq). See
      'xen_pcibk_control_isr'.
      
      Then the guest disables the MSI: XEN_PCI_OP_disable_msi
      which ends up triggering the BUG_ON condition in 'free_msi_irqs'
      as there is an IRQ handler for the entry->irq (dev->irq).
      
      Note that this cannot be done using MSI-X as the generic
      code does not over-write dev->irq with the MSI-X PIRQ values.
      
      The patch inhibits setting up the IRQ handler if MSI or
      MSI-X (for symmetry reasons) code had been called successfully.
      
      P.S.
      Xen PCIBack when it sets up the device for the guest consumption
      ends up writting 0 to the PCI_COMMAND (see xen_pcibk_reset_device).
      XSA-120 addendum patch removed that - however when upstreaming said
      addendum we found that it caused issues with qemu upstream. That
      has now been fixed in qemu upstream.
      
      This is part of XSA-157
      
      CC: stable@vger.kernel.org
      Reviewed-by: NDavid Vrabel <david.vrabel@citrix.com>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      a396f3a2
    • K
      xen/pciback: Return error on XEN_PCI_OP_enable_msix when device has MSI or MSI-X enabled · 5e0ce145
      Konrad Rzeszutek Wilk 提交于
      The guest sequence of:
      
        a) XEN_PCI_OP_enable_msix
        b) XEN_PCI_OP_enable_msix
      
      results in hitting an NULL pointer due to using freed pointers.
      
      The device passed in the guest MUST have MSI-X capability.
      
      The a) constructs and SysFS representation of MSI and MSI groups.
      The b) adds a second set of them but adding in to SysFS fails (duplicate entry).
      'populate_msi_sysfs' frees the newly allocated msi_irq_groups (note that
      in a) pdev->msi_irq_groups is still set) and also free's ALL of the
      MSI-X entries of the device (the ones allocated in step a) and b)).
      
      The unwind code: 'free_msi_irqs' deletes all the entries and tries to
      delete the pdev->msi_irq_groups (which hasn't been set to NULL).
      However the pointers in the SysFS are already freed and we hit an
      NULL pointer further on when 'strlen' is attempted on a freed pointer.
      
      The patch adds a simple check in the XEN_PCI_OP_enable_msix to guard
      against that. The check for msi_enabled is not stricly neccessary.
      
      This is part of XSA-157
      
      CC: stable@vger.kernel.org
      Reviewed-by: NDavid Vrabel <david.vrabel@citrix.com>
      Reviewed-by: NJan Beulich <jbeulich@suse.com>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      5e0ce145
    • K
      xen/pciback: Return error on XEN_PCI_OP_enable_msi when device has MSI or MSI-X enabled · 56441f3c
      Konrad Rzeszutek Wilk 提交于
      The guest sequence of:
      
       a) XEN_PCI_OP_enable_msi
       b) XEN_PCI_OP_enable_msi
       c) XEN_PCI_OP_disable_msi
      
      results in hitting an BUG_ON condition in the msi.c code.
      
      The MSI code uses an dev->msi_list to which it adds MSI entries.
      Under the above conditions an BUG_ON() can be hit. The device
      passed in the guest MUST have MSI capability.
      
      The a) adds the entry to the dev->msi_list and sets msi_enabled.
      The b) adds a second entry but adding in to SysFS fails (duplicate entry)
      and deletes all of the entries from msi_list and returns (with msi_enabled
      is still set).  c) pci_disable_msi passes the msi_enabled checks and hits:
      
      BUG_ON(list_empty(dev_to_msi_list(&dev->dev)));
      
      and blows up.
      
      The patch adds a simple check in the XEN_PCI_OP_enable_msi to guard
      against that. The check for msix_enabled is not stricly neccessary.
      
      This is part of XSA-157.
      
      CC: stable@vger.kernel.org
      Reviewed-by: NDavid Vrabel <david.vrabel@citrix.com>
      Reviewed-by: NJan Beulich <jbeulich@suse.com>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      56441f3c
    • K
      xen/pciback: Save xen_pci_op commands before processing it · 8135cf8b
      Konrad Rzeszutek Wilk 提交于
      Double fetch vulnerabilities that happen when a variable is
      fetched twice from shared memory but a security check is only
      performed the first time.
      
      The xen_pcibk_do_op function performs a switch statements on the op->cmd
      value which is stored in shared memory. Interestingly this can result
      in a double fetch vulnerability depending on the performed compiler
      optimization.
      
      This patch fixes it by saving the xen_pci_op command before
      processing it. We also use 'barrier' to make sure that the
      compiler does not perform any optimization.
      
      This is part of XSA155.
      
      CC: stable@vger.kernel.org
      Reviewed-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: NJan Beulich <JBeulich@suse.com>
      Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      8135cf8b
    • D
      xen-scsiback: safely copy requests · be69746e
      David Vrabel 提交于
      The copy of the ring request was lacking a following barrier(),
      potentially allowing the compiler to optimize the copy away.
      
      Use RING_COPY_REQUEST() to ensure the request is copied to local
      memory.
      
      This is part of XSA155.
      
      CC: stable@vger.kernel.org
      Reviewed-by: NJuergen Gross <jgross@suse.com>
      Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      be69746e
    • R
      xen-blkback: read from indirect descriptors only once · 18779149
      Roger Pau Monné 提交于
      Since indirect descriptors are in memory shared with the frontend, the
      frontend could alter the first_sect and last_sect values after they have
      been validated but before they are recorded in the request.  This may
      result in I/O requests that overflow the foreign page, possibly
      overwriting local pages when the I/O request is executed.
      
      When parsing indirect descriptors, only read first_sect and last_sect
      once.
      
      This is part of XSA155.
      
      CC: stable@vger.kernel.org
      Signed-off-by: NRoger Pau Monné <roger.pau@citrix.com>
      Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      18779149
    • R
      xen-blkback: only read request operation from shared ring once · 1f13d75c
      Roger Pau Monné 提交于
      A compiler may load a switch statement value multiple times, which could
      be bad when the value is in memory shared with the frontend.
      
      When converting a non-native request to a native one, ensure that
      src->operation is only loaded once by using READ_ONCE().
      
      This is part of XSA155.
      
      CC: stable@vger.kernel.org
      Signed-off-by: NRoger Pau Monné <roger.pau@citrix.com>
      Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      1f13d75c
    • D
      xen-netback: use RING_COPY_REQUEST() throughout · 68a33bfd
      David Vrabel 提交于
      Instead of open-coding memcpy()s and directly accessing Tx and Rx
      requests, use the new RING_COPY_REQUEST() that ensures the local copy
      is correct.
      
      This is more than is strictly necessary for guest Rx requests since
      only the id and gref fields are used and it is harmless if the
      frontend modifies these.
      
      This is part of XSA155.
      
      CC: stable@vger.kernel.org
      Reviewed-by: NWei Liu <wei.liu2@citrix.com>
      Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      68a33bfd
    • D
      xen-netback: don't use last request to determine minimum Tx credit · 0f589967
      David Vrabel 提交于
      The last from guest transmitted request gives no indication about the
      minimum amount of credit that the guest might need to send a packet
      since the last packet might have been a small one.
      
      Instead allow for the worst case 128 KiB packet.
      
      This is part of XSA155.
      
      CC: stable@vger.kernel.org
      Reviewed-by: NWei Liu <wei.liu2@citrix.com>
      Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      0f589967
    • G
      Fix remove_and_add_spares removes drive added as spare in slot_store · cb01c549
      Goldwyn Rodrigues 提交于
      Commit 2910ff17
      introduced a regression which would remove a recently added spare via
      slot_store. Revert part of the patch which touches slot_store() and add
      the disk directly using pers->hot_add_disk()
      
      Fixes: 2910ff17 ("md: remove_and_add_spares() to activate specific
      rdev")
      Signed-off-by: NGoldwyn Rodrigues <rgoldwyn@suse.com>
      Signed-off-by: NPawel Baldysiak <pawel.baldysiak@intel.com>
      Signed-off-by: NNeilBrown <neilb@suse.com>
      cb01c549
    • M
      md: fix bug due to nested suspend · 0dc10e50
      Mikulas Patocka 提交于
      The patch c7bfced9 committed to 4.4-rc
      causes crash in LVM test shell/lvchange-raid.sh. The kernel crashes with
      this BUG, the reason is that we attempt to suspend a device that is
      already suspended. See also
      https://bugzilla.redhat.com/show_bug.cgi?id=1283491
      
      This patch fixes the bug by changing functions mddev_suspend and
      mddev_resume to always nest.
      The number of nested calls to mddev_nested_suspend is kept in the
      variable mddev->suspended.
      [neilb: made mddev_suspend() always nest instead of introduce mddev_nested_suspend]
      
      kernel BUG at drivers/md/md.c:317!
      CPU: 3 PID: 32754 Comm: lvm Not tainted 4.4.0-rc2 #1
      task: 0000000047076040 ti: 0000000047014000 task.ti: 0000000047014000
      
           YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
      PSW: 00001000000001000000000000001111 Not tainted
      r00-03  000000000804000f 00000000102c5280 0000000010c7522c 000000007e3d1810
      r04-07  0000000010c6f000 000000004ef37f20 000000007e3d1dd0 000000007e3d1810
      r08-11  000000007c9f1600 0000000000000000 0000000000000001 ffffffffffffffff
      r12-15  0000000010c1d000 0000000000000041 00000000f98d63c8 00000000f98e49e4
      r16-19  00000000f98e49e4 00000000c138fd06 00000000f98d63c8 0000000000000001
      r20-23  0000000000000002 000000004ef37f00 00000000000000b0 00000000000001d1
      r24-27  00000000424783a0 000000007e3d1dd0 000000007e3d1810 00000000102b2000
      r28-31  0000000000000001 0000000047014840 0000000047014930 0000000000000001
      sr00-03  0000000007040800 0000000000000000 0000000000000000 0000000007040800
      sr04-07  0000000000000000 0000000000000000 0000000000000000 0000000000000000
      
      IASQ: 0000000000000000 0000000000000000 IAOQ: 00000000102c538c 00000000102c5390
       IIR: 03ffe01f    ISR: 0000000000000000  IOR: 00000000102b2748
       CPU:        3   CR30: 0000000047014000 CR31: 0000000000000000
       ORIG_R28: 00000000000000b0
       IAOQ[0]: mddev_suspend+0x10c/0x160 [md_mod]
       IAOQ[1]: mddev_suspend+0x110/0x160 [md_mod]
       RP(r2): raid1_add_disk+0xd4/0x2c0 [raid1]
      Backtrace:
       [<0000000010c7522c>] raid1_add_disk+0xd4/0x2c0 [raid1]
       [<0000000010c20078>] raid_resume+0x390/0x418 [dm_raid]
       [<00000000105833e8>] dm_table_resume_targets+0xc0/0x188 [dm_mod]
       [<000000001057f784>] dm_resume+0x144/0x1e0 [dm_mod]
       [<0000000010587dd4>] dev_suspend+0x1e4/0x568 [dm_mod]
       [<0000000010589278>] ctl_ioctl+0x1e8/0x428 [dm_mod]
       [<0000000010589518>] dm_compat_ctl_ioctl+0x18/0x68 [dm_mod]
       [<0000000040377b88>] compat_SyS_ioctl+0xd0/0x1558
      
      Fixes: c7bfced9 ("md: suspend i/o during runtime blk_integrity_unregister")
      Signed-off-by: NMikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: NNeilBrown <neilb@suse.com>
      0dc10e50