1. 22 Aug, 2016 1 commit
  2. 08 Aug, 2016 1 commit
    • EDAC, sb_edac: Fix channel reporting on Knights Landing · c5b48fa7
      Lukasz Odzioba committed
      On the Intel Xeon Phi Knights Landing processor family, the channels of
      the memory controllers have an atypical arrangement: MC0 is mapped to
      CH3,4,5 and MC1 is mapped to CH0,1,2. This causes the EDAC driver to
      report the channel name incorrectly.
      
      We missed this change earlier, so the code already contains a similar
      comment, but the translation function is incorrect.
      
      Without this patch:
        errors in DIMM_A and DIMM_D were reported in DIMM_D
        errors in DIMM_B and DIMM_E were reported in DIMM_E
        errors in DIMM_C and DIMM_F were reported in DIMM_F
      
      Correct this.
      
      Hubert Chrzaniuk:
       - rebased to 4.8
       - comments and code cleanup
      
      Fixes: d0cdf900 ("sb_edac: Add Knights Landing (Xeon Phi gen 2) support")
      Reviewed-by: Tony Luck <tony.luck@intel.com>
      Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
      Cc: Hubert Chrzaniuk <hubert.chrzaniuk@intel.com>
      Cc: linux-edac <linux-edac@vger.kernel.org>
      Cc: lukasz.anaczkowski@intel.com
      Cc: lukasz.odzioba@intel.com
      Cc: mchehab@kernel.org
      Cc: <stable@vger.kernel.org> # v4.5..
      Link: http://lkml.kernel.org/r/1469231089-22837-1-git-send-email-lukasz.odzioba@intel.com
      Signed-off-by: Lukasz Odzioba <lukasz.odzioba@intel.com>
      [ Boris: Simplify a bit by removing char mc. ]
      Signed-off-by: Borislav Petkov <bp@suse.de>
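The channel mapping described in this commit message (MC0 to CH3,4,5 and MC1 to CH0,1,2) can be sketched as a small translation function. This is a minimal illustration, not the code from sb_edac.c: the function name `knl_logical_channel` and the range checks are assumptions made for the example.

```c
/*
 * Illustrative sketch only: knl_logical_channel() is a hypothetical name,
 * not the function used in sb_edac.c.
 *
 * mc:   memory controller index (0 or 1)
 * chan: channel within that controller (0-2)
 *
 * Returns the logical channel (0-5), or -1 if the input is out of range.
 */
int knl_logical_channel(int mc, int chan)
{
	if (mc < 0 || mc > 1 || chan < 0 || chan > 2)
		return -1;

	/* MC0 is mapped to CH3,4,5; MC1 is mapped to CH0,1,2. */
	return mc == 0 ? chan + 3 : chan;
}
```

With this mapping, channel 0 of MC0 and channel 0 of MC1 resolve to distinct logical channels (3 and 0) rather than collapsing onto one channel name.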
  3. 16 Jul, 2016 1 commit
  4. 25 Jun, 2016 2 commits
  5. 24 Jun, 2016 4 commits
  6. 16 Jun, 2016 2 commits
    • EDAC: Correct channel count limit · bba14295
      Borislav Petkov committed
      c44696ff ("EDAC: Remove arbitrary limit on number of channels")
      lifted the arbitrary limit on memory controller channels in EDAC.
      However, the dynamic channel attribute arrays dynamic_csrow_dimm_attr
      and dynamic_csrow_ce_count_attr remained limited to 6 entries.
      
      This wasn't a problem except channels 6 and 7 weren't visible in sysfs
      on machines with more than 6 channels after the conversion to static
      attr groups with
      
        2c1946b6 ("EDAC: Use static attribute groups for managing sysfs entries")
      
       [ without that, we're exploding in edac_create_sysfs_mci_device()
         because we're dereferencing out of the bounds of the
         dynamic_csrow_dimm_attr array. ]
      
      Add attributes for channels 6 and 7 along with a guard for the
      future, should more channels be required and/or to sanity check for
      misconfigured machines.
      
      We still need to check against the number of channels present on the MC
      first, as Thor reported.
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Reported-by: Hironobu Ishii <ishii.hironobu@jp.fujitsu.com>
      Tested-by: Thor Thayer <tthayer@opensource.altera.com>
      Cc: <stable@vger.kernel.org> # 4.2
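The two checks this commit message describes (first against the number of channels present on the MC, then against the size of the attribute arrays) might look roughly like the following. This is a sketch under assumptions: `MAX_CHANNEL_ATTRS` and `channel_attr_visible` are hypothetical names, and the value 8 only reflects the message's statement that attributes now exist for channels 0 through 7.

```c
/* Hypothetical constant: attributes exist for channels 0-7. */
#define MAX_CHANNEL_ATTRS 8

/*
 * Illustrative sketch only, not the EDAC sysfs code.
 *
 * Returns 1 if a sysfs attribute for channel idx should be exposed,
 * 0 otherwise.
 */
int channel_attr_visible(unsigned int idx, unsigned int nr_channels)
{
	/* First check against the channels actually present on the MC. */
	if (idx >= nr_channels)
		return 0;

	/* Guard: never index past the end of the attribute arrays. */
	if (idx >= MAX_CHANNEL_ATTRS)
		return 0;

	return 1;
}
```

The second check is what keeps a machine reporting more channels than there are attributes from indexing out of bounds.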
    • EDAC, amd64_edac: Init opstate at the proper time during init · 6ba92fea
      Borislav Petkov committed
      It is useless to do it if we're loaded on unsupported hardware so do
      that only after we have detected at least 1 supported AMD northbridge.
      Signed-off-by: Borislav Petkov <bp@suse.de>
  7. 08 Jun, 2016 2 commits
  8. 03 Jun, 2016 3 commits
    • EDAC, sb_edac: Readd accidentally dropped Broadwell-D support · 665f05e0
      Tony Luck committed
      In commit
      
        2c1ea4c7 ("EDAC, sb_edac: Use cpu family/model in driver detection")
      
      we switched from using PCI ids to determine which platform we are
      running on to using CPU model instead.
      
      I forgot that Broadwell-DE has its own distinct model number different
      from Broadwell-EP or -EX.
      
      Fixing this isn't just adding a line to the array of cpuids - the
      existing code assumed a 1:1 mapping between entries in that array and
      the "enum type" values. Add the type to the pci_id_table structure to
      remove this dependency and allow two Broadwell CPU models.
      Signed-off-by: Tony Luck <tony.luck@intel.com>
      Cc: Aristeu Rozanski <arozansk@redhat.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
      Cc: linux-edac <linux-edac@vger.kernel.org>
      Fixes: 2c1ea4c7 ("EDAC, sb_edac: Use cpu family/model in driver detection")
      Link: http://lkml.kernel.org/r/b3cffe40dec6dfe0235a5d52a504f0ba86a07ce7.1464902605.git.tony.luck@intel.com
      Signed-off-by: Borislav Petkov <bp@suse.de>
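The idea of storing the platform type in each table entry, so that two CPU models can share one type instead of relying on array position, can be sketched as follows. Everything here is an assumption for illustration: the struct, enum, and function names are hypothetical, and the model numbers are placeholders rather than values taken from the driver.

```c
#include <stddef.h>

/* Hypothetical platform types; the driver's real "enum type" is not shown here. */
enum platform_type { PLAT_HASWELL, PLAT_BROADWELL };

struct model_entry {
	unsigned int model;		/* CPU model number (placeholder values below) */
	enum platform_type type;	/* explicit type, independent of array index */
};

/* Two different models map to the same type; order no longer matters. */
static const struct model_entry model_table[] = {
	{ 0x3f, PLAT_HASWELL },
	{ 0x4f, PLAT_BROADWELL },
	{ 0x56, PLAT_BROADWELL },
};

/* Returns the platform type for a model, or -1 if the model is unknown. */
int lookup_platform_type(unsigned int model)
{
	size_t i;

	for (i = 0; i < sizeof(model_table) / sizeof(model_table[0]); i++) {
		if (model_table[i].model == model)
			return (int)model_table[i].type;
	}
	return -1;
}
```

With an index-based scheme, the third entry would have implied a third type; carrying the type in the entry removes that coupling.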
    • EDAC: Fix workqueues poll period resetting · fbedcaf4
      Nicholas Krause committed
      After the workqueue cleanup, we're registering workqueues based on
      the presence of an ->edac_check function. When that is the case,
      we're setting OP_RUNNING_POLL. But we forgot to check that in
      edac_mc_reset_delay_period(), leading to:
      
        BUG: unable to handle kernel paging request at 0000000000015d10
        IP: [ .. ] queued_spin_lock_slowpath
        PGD 3ffcc8067 PUD 3ffc56067 PMD 0
        Oops: 0002 [#1] SMP
        Modules linked in: ...
        CPU: 1 PID: 2792 Comm: edactest Not tainted 4.6.0-dirty #1
        Hardware name: HP ProLiant MicroServer, BIOS O41     10/01/2013
        Stack:
        Call Trace:
          ? _raw_spin_lock_irqsave
          ? lock_timer_base.isra.34
          ? del_timer
          ? try_to_grab_pending
          ? mod_delayed_work_on
          ? edac_mc_reset_delay_period
          ? edac_set_poll_msec
          ? param_attr_store
          ? module_attr_store
          ? kernfs_fop_write
          ? __vfs_write
          ? __vfs_read
          ? __alloc_fd
          ? vfs_write
          ? SyS_write
          ? entry_SYSCALL_64_fastpath
        Code:
        RIP  [ .. ] queued_spin_lock_slowpath
         RSP <>
        CR2: 0000000000015d10
        ---[ end trace 3f286bc71cca15d1 ]---
        Kernel panic - not syncing: Fatal exception
      
      Fix it.
      Signed-off-by: Nicholas Krause <xerofoify@gmail.com>
      Cc: <stable@vger.kernel.org> # 4.5
      Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
      Cc: linux-edac <linux-edac@vger.kernel.org>
      Link: http://lkml.kernel.org/r/1463697958-13406-1-git-send-email-xerofoify@gmail.com
      [ Rewrite commit message. ]
      Signed-off-by: Borislav Petkov <bp@suse.de>
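The missing check described in this commit message can be modelled in user space as below. This is a simplified sketch, not the kernel function: only `OP_RUNNING_POLL` comes from the message, while the other enum value, `struct mc_info`, and `reset_delay_period` are hypothetical stand-ins, and updating a field stands in for rescheduling the delayed work.

```c
/* OP_RUNNING_POLL is named in the commit message; the other value is a placeholder. */
enum op_state { OP_RUNNING_POLL, OP_RUNNING_INTERRUPT };

/* Hypothetical stand-in for a memory controller instance. */
struct mc_info {
	enum op_state op_state;
	unsigned long poll_msec;	/* current poll period in milliseconds */
};

/*
 * Illustrative sketch only. Returns 1 if the poll period was updated,
 * 0 if the controller is not polling and was left untouched.
 */
int reset_delay_period(struct mc_info *mci, unsigned long msec)
{
	/*
	 * Only polling controllers have a work item set up. Touching the
	 * work item of a non-polling controller is the crash shown above.
	 */
	if (mci->op_state != OP_RUNNING_POLL)
		return 0;

	mci->poll_msec = msec;	/* stands in for rescheduling the delayed work */
	return 1;
}
```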
    • EDAC, sb_edac: Fix rank lookup on Broadwell · c7103f65
      Tony Luck committed
      Broadwell made a small change to the rank target register moving the
      target rank ID field up from bits 16:19 to bits 20:23.
      
      Also found that the offset field grew by one bit in the IVY_BRIDGE to
      HASWELL transition, so fix the RIR_OFFSET() macro too.
      Signed-off-by: Tony Luck <tony.luck@intel.com>
      Cc: stable@vger.kernel.org # v3.19+
      Cc: Aristeu Rozanski <arozansk@redhat.com>
      Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
      Cc: linux-edac <linux-edac@vger.kernel.org>
      Link: http://lkml.kernel.org/r/2943fb819b1f7e396681165db9c12bb3df0e0b16.1464735623.git.tony.luck@intel.com
      Signed-off-by: Borislav Petkov <bp@suse.de>
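The move of the target rank ID field from bits 16:19 to bits 20:23 can be illustrated with a generic bit-field extraction helper. This is a sketch under assumptions: `get_bitfield` and `rir_rank_target` are hypothetical names for this example, and the boolean flag stands in for whatever platform check the driver uses.

```c
#include <stdint.h>

/* Extract bits lo..hi (inclusive, 0-based) from a 32-bit register value. */
static inline uint32_t get_bitfield(uint32_t v, unsigned int lo, unsigned int hi)
{
	unsigned int width = hi - lo + 1;
	uint32_t mask = width >= 32 ? 0xffffffffu : ((1u << width) - 1u);

	return (v >> lo) & mask;
}

/*
 * Illustrative sketch only: pick the target rank ID from the field
 * position that matches the platform.
 */
uint32_t rir_rank_target(int is_broadwell, uint32_t reg)
{
	return is_broadwell ? get_bitfield(reg, 20, 23)
			    : get_bitfield(reg, 16, 19);
}
```

Reading the old field position on Broadwell would return whatever happens to sit in bits 16:19, which is why the rank lookup went wrong.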
  9. 12 May, 2016 1 commit
  10. 10 May, 2016 1 commit
    • EDAC, amd64_edac: Drop pci_register_driver() use · 3f37a36b
      Borislav Petkov committed
      - remove homegrown instance counting.
      - take the F3 PCI device from amd_nb caching instead of F2, which was
        used with the PCI core.
      
      With those changes, the driver doesn't need to register a PCI driver and
      relies on the northbridges caching which we do anyway on AMD.
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: Yazen Ghannam <yazen.ghannam@amd.com>
  11. 07 May, 2016 1 commit
  12. 03 May, 2016 1 commit
    • EDAC, sb_edac: Use cpu family/model in driver detection · 2c1ea4c7
      Tony Luck committed
      Instead of picking a random PCI ID from the dozen or so we need to
      access, just use x86_match_cpu() to pick based on CPU model number. The
      choosing of PCI devices has been problematic in the past, see
      
        11249e73 ("sb_edac: Fix detection on SNB machines")
      
      which fixed problems introduced by
      
        d0585cd8 ("sb_edac: Claim a different PCI device").
      
      This is especially ugly if future hardware might not even have
      EDAC-relevant registers in PCI config space and we would still be
      required to choose some "random" PCI devices to scan for just so our
      driver loads.
      
      Is this cleaner/clearer? It deletes much more code than it adds. Only
      tested on Broadwell. The driver loads/unloads and loads again. Still
      decodes errors too.
      Signed-off-by: Tony Luck <tony.luck@intel.com>
      Suggested-by: Borislav Petkov <bp@alien8.de>
      Signed-off-by: Borislav Petkov <bp@suse.de>
  13. 29 Apr, 2016 2 commits
  14. 27 Apr, 2016 1 commit
  15. 24 Apr, 2016 1 commit
  16. 23 Apr, 2016 4 commits
  17. 22 Apr, 2016 2 commits
  18. 18 Apr, 2016 1 commit
  19. 07 Apr, 2016 1 commit
  20. 02 Apr, 2016 3 commits
  21. 29 Mar, 2016 5 commits