1. 03 12月, 2014 2 次提交
  2. 02 12月, 2014 3 次提交
    • T
      sb_edac: Fix discovery of top-of-low-memory for Haswell · f7cf2a22
      Tony Luck 提交于
      Haswell moved the TOLM/TOHM registers to a different device and offset.
      The sb_edac driver accounted for the change of device, but not for the
      new offset.  There was also a typo in the constant to fill in the low
      26 bits (was 0x1ffffff, should be 0x3ffffff).
      
      This resulted in a bogus value for the top of low memory:
      
        EDAC DEBUG: get_memory_layout: TOLM: 0.032 GB (0x0000000001ffffff)
      
      which would result in EDAC refusing to translate addresses for
      errors above the bogus value and below 4GB:
      
         sbridge MC3: HANDLING MCE MEMORY ERROR
         sbridge MC3: CPU 0: Machine Check Event: 0 Bank 7: 8c00004000010090
         sbridge MC3: TSC 0
         sbridge MC3: ADDR 2000000
         sbridge MC3: MISC 523eac86
         sbridge MC3: PROCESSOR 0:306f3 TIME 1414600951 SOCKET 0 APIC 0
         MC3: 1 CE Error at TOLM area, on addr 0x02000000 on any memory ( page:0x0 offset:0x0 grain:32 syndrome:0x0)
      
      With the fix we see the correct TOLM value:
      
         DEBUG: get_memory_layout: TOLM: 2.048 GB (0x000000007fffffff)
      
      and we decode address 2000000 correctly:
      
         sbridge MC3: HANDLING MCE MEMORY ERROR
         sbridge MC3: CPU 0: Machine Check Event: 0 Bank 7: 8c00004000010090
         sbridge MC3: TSC 0
         sbridge MC3: ADDR 2000000
         sbridge MC3: MISC 523e1086
         sbridge MC3: PROCESSOR 0:306f3 TIME 1414601319 SOCKET 0 APIC 0
         DEBUG: get_memory_error_data: SAD interleave package: 0 = CPU socket 0, HA 0, shiftup: 0
         DEBUG: get_memory_error_data: TAD#0: address 0x0000000002000000 < 0x000000007fffffff, socket interleave 1, channel interleave 4 (offset 0x00000000), index 0, base ch: 0, ch mask: 0x01
         DEBUG: get_memory_error_data: RIR#0, limit: 4.095 GB (0x00000000ffffffff), way: 1
         DEBUG: get_memory_error_data: RIR#0: channel address 0x00200000 < 0xffffffff, RIR interleave 0, index 0
         DEBUG: sbridge_mce_output_error:  area:DRAM err_code:0001:0090 socket:0 channel_mask:1 rank:0
         MC3: 1 CE memory read error on CPU_SrcID#0_Channel#0_DIMM#0 (channel:0 slot:0 page:0x2000 offset:0x0 grain:32 syndrome:0x0 -  area:DRAM err_code:0001:0090 socket:0 channel_mask:1 rank:0)
      Signed-off-by: NTony Luck <tony.luck@intel.com>
      Acked-by: NAristeu Rozanski <aris@redhat.com>
      Signed-off-by: NMauro Carvalho Chehab <mchehab@osg.samsung.com>
      f7cf2a22
    • J
    • J
      sb_edac: Fix off-by-one error in number of channels · 50043e25
      Jim Snow 提交于
      This prevented edac sysfs code from properly handling 6 channels
      per memory controller.
      Signed-off-by: NJim Snow <jim.snow@intel.com>
      Signed-off-by: NLukasz Anaczkowski <lukasz.anaczkowski@intel.com>
      Signed-off-by: NMauro Carvalho Chehab <mchehab@osg.samsung.com>
      50043e25
  3. 28 11月, 2014 1 次提交
  4. 27 11月, 2014 8 次提交
  5. 26 11月, 2014 10 次提交
  6. 25 11月, 2014 4 次提交
  7. 24 11月, 2014 8 次提交
    • V
      drm/i915: Ignore SURFLIVE and flip counter when the GPU gets reset · bdfa7542
      Ville Syrjälä 提交于
      During a GPU reset we need to get pending page flip cleared out
      since the ring contents are gone and flip will never complete
      on its own. This used to work until the mmio vs. CS flip race
      detection came about. That piece of code is looking for a
      specific surface address in the SURFLIVE register, but as
      a flip to that address may never happen the check may never
      pass. So we should just skip the SURFLIVE and flip counter
      checks when the GPU gets reset.
      
      intel_display_handle_reset() tries to effectively complete
      the flip anyway by calling .update_primary_plane(). But that
      may not satisfy the conditions of the mmio vs. CS race
      detection since there's no guarantee that a modeset didn't
      sneak in between the GPU reset and intel_display_handle_reset().
      Such a modeset will not wait for pending flips due to the ongoing GPU
      reset, and then the primary plane updates performed by
      intel_display_handle_reset() will already use the new surface
      address, and thus the surface address the flip is waiting for
      might never appear in SURFLIVE. The result is that the flip
      will never complete and attempts to perform further page flips
      will fail with -EBUSY.
      
      During the GPU reset intel_crtc_has_pending_flip() will return
      false regardless, so the deadlock with a modeset vs. the error
      work acquiring crtc->mutex was avoided. And the reset_counter
      check in intel_crtc_has_pending_flip() actually made this bug
      even less severe since it allowed normal modesets to go through
      even though there's a pending flip.
      
      This is a regression introduced by me here:
       commit 75f7f3ec
       Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
       Date:   Tue Apr 15 21:41:34 2014 +0300
      
          drm/i915: Fix mmio vs. CS flip race on ILK+
      
      Testcase: igt/kms_flip/flip-vs-panning-vs-hang
      Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
      Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      Cc: stable@vger.kernel.org
      Signed-off-by: NJani Nikula <jani.nikula@intel.com>
      bdfa7542
    • B
      gpu/radeon: Set flag to indicate broken 64-bit MSI · 91ed6fd2
      Benjamin Herrenschmidt 提交于
      Some radeon ASICs don't support all 64 address bits of MSIs despite
      advertising support for 64-bit MSIs in their configuration space.
      
      This breaks on systems such as IBM POWER7/8, where 64-bit MSIs can
      be assigned with some of the high address bits set.
      
      This makes use of the newly introduced "no_64bit_msi" flag in structure
      pci_dev to allow the MSI allocation code to fallback to 32-bit MSIs
      on those adapters.
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Reviewed-by: NAlex Deucher <alexander.deucher@amd.com>
      CC: <stable@vger.kernel.org>
      ---
      
      Adding Alex's review tag. Patch to the driver is identical to the
      reviewed one, I dropped the arch/powerpc hunk rewrote the subject
      and cset comment.
      91ed6fd2
    • B
      PCI/MSI: Add device flag indicating that 64-bit MSIs don't work · f144d149
      Benjamin Herrenschmidt 提交于
      This can be set by quirks/drivers to be used by the architecture code
      that assigns the MSI addresses.
      
      We additionally add verification in the core MSI code that the values
      assigned by the architecture do satisfy the limitation in order to fail
      gracefully if they don't (ie. the arch hasn't been updated to deal with
      that quirk yet).
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      CC: <stable@vger.kernel.org>
      Acked-by: NBjorn Helgaas <bhelgaas@google.com>
      f144d149
    • L
      iwlwifi: mvm: check TLV flag before trying to use hotspot firmware commands · 5ac6c72e
      Luciano Coelho 提交于
      Older firmwares do not provide support for the HOT_SPOT_CMD command.
      Check for the appropriate TLV flag that declares hotspot support in
      the firmware to prevent a firmware assertion failure that can be
      triggered from the userspace,
      
      Cc: stable@vger.kernel.org [3.17+]
      Signed-off-by: NLuciano Coelho <luciano.coelho@intel.com>
      Signed-off-by: NEmmanuel Grumbach <emmanuel.grumbach@intel.com>
      5ac6c72e
    • J
      solos-pci: fix error return code · 73112f9b
      Julia Lawall 提交于
      Return a negative error code on failure.
      
      A simplified version of the semantic match that finds this problem is as
      follows: (http://coccinelle.lip6.fr/)
      
      // <smpl>
      @@
      identifier ret; expression e1,e2;
      @@
      (
      if (\(ret < 0\|ret != 0\))
       { ... return ret; }
      |
      ret = 0
      )
      ... when != ret = e1
          when != &ret
      *if(...)
      {
        ... when != ret = e2
            when forall
       return ret;
      }
      // </smpl>
      Signed-off-by: NJulia Lawall <Julia.Lawall@lip6.fr>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      73112f9b
    • C
      igb: Fixes needed for surprise removal support · 17a402a0
      Carolyn Wyborny 提交于
      This patch adds some checks in order to prevent panic's on surprise
      removal of devices during S0, S3, S4.  Without this patch, Thunderbolt
      type device removal will panic the system.
      Signed-off-by: NYanir Lubetkin <yanirx.lubetkin@intel.com>
      Signed-off-by: NCarolyn Wyborny <carolyn.wyborny@intel.com>
      Tested-by: NAaron Brown <aaron.f.brown@intel.com>
      Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      17a402a0
    • D
      ixgbe: fix use after free adapter->state test in ixgbe_remove/ixgbe_probe · b5b2ffc0
      Daniel Borkmann 提交于
      While working on a different issue, I noticed an annoying use
      after free bug on my machine when unloading the ixgbe driver:
      
      [ 8642.318797] ixgbe 0000:02:00.1: removed PHC on p2p2
      [ 8642.742716] ixgbe 0000:02:00.1: complete
      [ 8642.743784] BUG: unable to handle kernel paging request at ffff8807d3740a90
      [ 8642.744828] IP: [<ffffffffa01c77dc>] ixgbe_remove+0xfc/0x1b0 [ixgbe]
      [ 8642.745886] PGD 20c6067 PUD 81c1f6067 PMD 81c15a067 PTE 80000007d3740060
      [ 8642.746956] Oops: 0002 [#1] SMP DEBUG_PAGEALLOC
      [ 8642.748039] Modules linked in: [...]
      [ 8642.752929] CPU: 1 PID: 1225 Comm: rmmod Not tainted 3.18.0-rc2+ #49
      [ 8642.754203] Hardware name: Supermicro X10SLM-F/X10SLM-F, BIOS 1.1b 11/01/2013
      [ 8642.755505] task: ffff8807e34d3fe0 ti: ffff8807b7204000 task.ti: ffff8807b7204000
      [ 8642.756831] RIP: 0010:[<ffffffffa01c77dc>]  [<ffffffffa01c77dc>] ixgbe_remove+0xfc/0x1b0 [ixgbe]
      [...]
      [ 8642.774335] Stack:
      [ 8642.775805]  ffff8807ee824098 ffff8807ee824098 ffffffffa01f3000 ffff8807ee824000
      [ 8642.777326]  ffff8807b7207e18 ffffffff8137720f ffff8807ee824098 ffff8807ee824098
      [ 8642.778848]  ffffffffa01f3068 ffff8807ee8240f8 ffff8807b7207e38 ffffffff8144180f
      [ 8642.780365] Call Trace:
      [ 8642.781869]  [<ffffffff8137720f>] pci_device_remove+0x3f/0xc0
      [ 8642.783395]  [<ffffffff8144180f>] __device_release_driver+0x7f/0xf0
      [ 8642.784876]  [<ffffffff814421f8>] driver_detach+0xb8/0xc0
      [ 8642.786352]  [<ffffffff814414a9>] bus_remove_driver+0x59/0xe0
      [ 8642.787783]  [<ffffffff814429d0>] driver_unregister+0x30/0x70
      [ 8642.789202]  [<ffffffff81375c65>] pci_unregister_driver+0x25/0xa0
      [ 8642.790657]  [<ffffffffa01eb38e>] ixgbe_exit_module+0x1c/0xc8e [ixgbe]
      [ 8642.792064]  [<ffffffff810f93a2>] SyS_delete_module+0x132/0x1c0
      [ 8642.793450]  [<ffffffff81012c61>] ? do_notify_resume+0x61/0xa0
      [ 8642.794837]  [<ffffffff816d2029>] system_call_fastpath+0x12/0x17
      
      The issue is that test_and_set_bit() done on adapter->state is being
      performed *after* the netdevice has been freed via free_netdev().
      
      When netdev is being allocated on initialization time, it allocates
      a private area, here struct ixgbe_adapter, that resides after the
      net_device structure. In ixgbe_probe(), the device init routine,
      we set up the adapter after alloc_etherdev_mq() on the private area
      and add a reference for the pci_dev as well via pci_set_drvdata().
      
      Both in the error path of ixgbe_probe(), but also on module unload
      when ixgbe_remove() is being called, commit 41c62843 ("ixgbe:
      Fix rcu warnings induced by LER") accesses adapter after free_netdev().
      The patch stores the result in a bool and thus fixes above oops on my
      side.
      
      Fixes: 41c62843 ("ixgbe: Fix rcu warnings induced by LER")
      Cc: stable <stable@vger.kernel.org>
      Cc: Mark Rustad <mark.d.rustad@intel.com>
      Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
      Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b5b2ffc0
    • V
      ixgbe: Correctly disable VLAN filter in promiscuous mode · 4556dc59
      Vlad Yasevich 提交于
      IXGBE adapter seems to require that VLAN filtering be enabled if
      VMDQ or SRIOV are enabled.  When those functions are disabled,
      VLAN filtering may be disabled in promiscuous mode.
      
      Prior to commit a9b8943e ("ixgbe: remove vlan_filter_disable
      and enable functions")
      
      The logic was correct.  However, after the commit the logic
      got reversed and VLAN filtered in now turned on when VMDQ/SRIOV
      is disabled.
      
      This patch changes the condition to enable hw vlan filtered
      when VMDQ or SRIOV is enabled.
      
      Fixes: a9b8943e ("ixgbe: remove vlan_filter_disable and enable functions")
      Cc: stable <stable@vger.kernel.org>
      CC: Jacob Keller <jacob.e.keller@intel.com>
      Signed-off-by: NVladislav Yasevich <vyasevic@redhat.com>
      Acked-by: NEmil Tantilov <emil.s.tantilov@intel.com>
      Tested-by: NPhil Schmitt <phillip.j.schmitt@intel.com>
      Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4556dc59
  8. 22 11月, 2014 4 次提交