1. 20 Feb, 2014 1 commit
  2. 16 Dec, 2013 1 commit
  3. 12 Dec, 2013 1 commit
  4. 06 Dec, 2013 1 commit
  5. 30 Nov, 2013 1 commit
  6. 15 Nov, 2013 11 commits
  7. 22 Oct, 2013 1 commit
  8. 29 Apr, 2013 1 commit
    • edac: sb_edac.c should not require prescence of IMC_DDRIO device · de4772c6
      Luck, Tony committed
      The Sandy Bridge EDAC driver uses a register in the IMC_DDRIO CSR
      space to determine the type of DIMMs (registered or unregistered).
      But this device does not exist on some single socket Sandy Bridge
      servers.  While the type of DIMMs is nice to know, it is not essential
      for this driver's other functions. So it seems harsh to have it
      refuse to load at all when it cannot find this device.
      
      Make the check for this device optional. If it isn't present,
      just report the memory type as "MEM_UNKNOWN".
      Signed-off-by: Tony Luck <tony.luck@intel.com>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
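The idea of treating the device as optional can be sketched in plain C. This is a minimal userspace illustration, not the driver's code: the `sketch_*` names, the enum values, and the choice of bit 0 as the "registered DIMM" flag are all assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the memory type values a driver reports. */
enum sketch_mem_type {
    SKETCH_MEM_UNKNOWN,
    SKETCH_MEM_DDR3,   /* unregistered DIMMs */
    SKETCH_MEM_RDDR3   /* registered DIMMs */
};

/* Hypothetical handle for the optional device; NULL means "not found". */
struct sketch_ddrio_dev {
    uint32_t rdimm_reg; /* pretend register; bit 0 set = registered (assumed) */
};

/* Report the memory type, falling back to "unknown" instead of failing
 * when the optional device is absent. */
static enum sketch_mem_type
sketch_get_memory_type(const struct sketch_ddrio_dev *ddrio)
{
    if (ddrio == NULL)
        return SKETCH_MEM_UNKNOWN;
    return (ddrio->rdimm_reg & 1u) ? SKETCH_MEM_RDDR3 : SKETCH_MEM_DDR3;
}
```

A caller that previously aborted when the lookup returned NULL would instead pass the NULL handle through and carry on with the rest of its setup.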
  9. 04 Jan, 2013 1 commit
    • Drivers: edac: remove __dev* attributes. · 9b3c6e85
      Greg Kroah-Hartman committed
      CONFIG_HOTPLUG is going away as an option.  As a result, the __dev*
      markings need to be removed.
      
      This change removes the use of __devinit, __devexit_p, and __devexit
      from these drivers.
      
      Based on patches originally written by Bill Pemberton, but redone by me
      in order to handle some of the coding style issues better, by hand.
      
      Cc: Bill Pemberton <wfp5p@virginia.edu>
      Cc: Doug Thompson <dougthompson@xmission.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Mark Gross <mark.gross@intel.com>
      Cc: Jason Uhlenkott <juhlenko@akamai.com>
      Cc: Mauro Carvalho Chehab <mchehab@redhat.com>
      Cc: Tim Small <tim@buttersideup.com>
      Cc: Ranganathan Desikan <ravi@jetztechnologies.com>
      Cc: "Arvind R." <arvino55@gmail.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: David Daney <david.daney@cavium.com>
      Cc: Egor Martovetsky <egor@pasemi.com>
      Cc: Olof Johansson <olof@lixom.net>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  10. 21 Dec, 2012 1 commit
  11. 25 Sep, 2012 1 commit
    • sb_edac: Avoid overflow errors at memory size calculation · deb09dda
      Mauro Carvalho Chehab committed
      The Sandy Bridge EDAC driver overflows when calculating the memory
      size: both the size field and the integer arithmetic use 32 bits,
      which is not enough for high-density DIMMs.
      
      As a result, memory sizes are misreported when high-density DIMMs
      are used:
      
      EDAC DEBUG: in drivers/edac/sb_edac.c, line at 591: mc#0: channel 0, dimm 0, -16384 Mb (-4194304 pages) bank: 8, rank: 2, row: 0x10000, col: 0x800
      EDAC DEBUG: in drivers/edac/sb_edac.c, line at 591: mc#0: channel 1, dimm 0, -16384 Mb (-4194304 pages) bank: 8, rank: 2, row: 0x10000, col: 0x800
      
      Because the EDAC core handles the number of pages as an unsigned
      int, the driver shows the 16 GB DIMMs in the sysfs interface as
      16760832 MB! The fix is simple: calculate the number of pages as an
      unsigned 64-bit integer.
      
      After the patch, the memory size (16 GB) is properly detected:
      
      EDAC DEBUG: in drivers/edac/sb_edac.c, line at 592: mc#0: channel 0, dimm 0, 16384 Mb (4194304 pages) bank: 8, rank: 2, row: 0x10000, col: 0x800
      EDAC DEBUG: in drivers/edac/sb_edac.c, line at 592: mc#0: channel 1, dimm 0, 16384 Mb (4194304 pages) bank: 8, rank: 2, row: 0x10000, col: 0x800
      
      Cc: stable@kernel.org
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
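The numbers in the debug output above can be reproduced with a small userspace sketch. It assumes that each row/column/bank/rank location holds 8 bytes, so that megabytes are the product shifted right by 17, and that pages are 4 KiB; the function names are hypothetical and the code is an illustration, not the driver's implementation.

```c
#include <stdint.h>

/* 32-bit arithmetic: for the DIMM in the log the product is 2^31, which
 * does not fit in a signed 32-bit value. */
static int32_t size_mb_32(uint32_t rows, uint32_t cols,
                          uint32_t banks, uint32_t ranks)
{
    uint32_t product = rows * cols * banks * ranks;
    /* Reinterpret as signed, the way an int-typed size would be read.
     * The conversion is implementation-defined, but yields the
     * two's-complement value on common platforms. */
    int32_t as_signed = (int32_t)product;
    return as_signed / (1 << 17); /* assumed: 8 bytes per location */
}

/* 64-bit arithmetic: the same product fits comfortably. */
static uint64_t size_mb_64(uint32_t rows, uint32_t cols,
                           uint32_t banks, uint32_t ranks)
{
    uint64_t product = (uint64_t)rows * cols * banks * ranks;
    return product >> 17;
}

/* 4 KiB pages: 256 pages per megabyte. */
static uint64_t mb_to_pages(uint64_t mb)
{
    return mb << 8;
}
```

With row 0x10000, col 0x800, 8 banks and 2 ranks, the 32-bit version gives -16384 and the 64-bit version gives 16384 MB (4194304 pages). Reading -4194304 pages as an unsigned 32-bit count and dividing by 256 gives 16760832, which matches the bogus sysfs value quoted in the commit message.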
  12. 12 Jun, 2012 5 commits
    • sb_edac: properly handle error count · c1053839
      Mauro Carvalho Chehab committed
      Instead of reporting the error count via driver-specific details,
      use the new mechanism provided by edac_mc_handle_error().
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
    • edac: edac_mc_handle_error(): add an error_count parameter · 9eb07a7f
      Mauro Carvalho Chehab committed
      In order to avoid losing error events, it is desirable to group
      error events together and generate a single trace for several identical
      errors.
      
      The trace API already allows reporting multiple errors. Change the
      handle_error function to allow that as well.
      
      The changes to the drivers were made by this small script:
      
      	$file .=$_ while (<>);
      	$file =~ s/(edac_mc_handle_error)\s*\(([^\,]+)\,([^\,]+)\,/$1($2,$3, 1,/g;
      	print $file;
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
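What a count parameter buys can be shown with a tiny sketch. The types and the function below are hypothetical stand-ins, not the EDAC core API; the only detail taken from the commit is that the count becomes the third argument, and that the script passes a literal 1 for existing callers.

```c
#include <stdint.h>

enum sketch_err_type { SKETCH_ERR_CORRECTED, SKETCH_ERR_UNCORRECTED };

/* Hypothetical per-controller counters. */
struct sketch_mc {
    uint32_t ce_count;
    uint32_t ue_count;
};

/* One call can account for several identical errors at once. */
static void sketch_handle_error(enum sketch_err_type type,
                                struct sketch_mc *mc,
                                uint16_t error_count)
{
    if (type == SKETCH_ERR_CORRECTED)
        mc->ce_count += error_count;
    else
        mc->ue_count += error_count;
}
```

A driver whose hardware reports "5 corrected errors since the last poll" can then issue one call with a count of 5 rather than five calls, or one call that silently drops four events.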
    • edac: remove arch-specific parameter for the error handler · 03f7eae8
      Mauro Carvalho Chehab committed
      Remove the arch-dependent parameter: it was never used, because the
      MCE tracepoint was never implemented. It probably doesn't make sense
      to have an MCE-specific tracepoint anyway, as it would cost more
      bytes per trace event, and tracepoints are not free.
      
      The changes to the EDAC drivers were made by this small Perl script:
      
      	$file .=$_ while (<>);
      	$file =~ s/(edac_mc_handle_error)\s*\(([^\;]+)\,([^\,\)]+)\s*\)/$1($2)/g;
      	print $file;
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
    • edac: Convert debugfX to edac_dbg(X, · 956b9ba1
      Joe Perches committed
      Use a more common debugging style.
      
      Remove __FILE__ uses, add missing newlines,
      coalesce formats and align arguments.
      Signed-off-by: Joe Perches <joe@perches.com>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
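The change in style, from one macro per level to a single macro that takes the level as its first argument, can be sketched as follows. This is a userspace approximation under assumed names (`sketch_dbg`, `sketch_debug_level`); the real macro lives in the EDAC headers and prints through the kernel log rather than stderr.

```c
#include <stdio.h>

static int sketch_debug_level = 1; /* hypothetical runtime threshold */
static int sketch_emitted;         /* counts printed messages, for the demo only */

/* The level is an argument, and the macro adds the function name itself,
 * so callers pass neither __FILE__ nor __func__. */
#define sketch_dbg(level, ...)                      \
    do {                                            \
        if ((level) <= sketch_debug_level) {        \
            fprintf(stderr, "%s: ", __func__);      \
            fprintf(stderr, __VA_ARGS__);           \
            sketch_emitted++;                       \
        }                                           \
    } while (0)

static void sketch_probe(int channel)
{
    sketch_dbg(0, "probing channel %d\n", channel);            /* printed */
    sketch_dbg(4, "verbose detail for channel %d\n", channel); /* filtered out */
}
```

Because the prefix comes from the macro, each call site only has to supply a format string ending in a newline, which is the kind of cleanup the commit message describes.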
    • edac: Don't add __func__ or __FILE__ for debugf[0-9] msgs · dd23cd6e
      Mauro Carvalho Chehab committed
      The debug macro already adds that. Most of the work here was
      done by this small script:
      
      $f .=$_ while (<>);
      
      $f =~ s/(debugf[0-9]\s*\(\s*)__FILE__\s*": /\1"/g;
      $f =~ s/(debugf[0-9]\s*\(\s*)__FILE__\s*/\1/g;
      $f =~ s/(debugf[0-9]\s*\(\s*)__FILE__\s*"MC: /\1"/g;
      
      $f =~ s/(debugf[0-9]\s*\(\")\%s[\:\,\(\)]*\s*([^\"]*\s*[^\)]+)__func__\s*\,\s*/\1\2/g;
      $f =~ s/(debugf[0-9]\s*\(\")\%s[\:\,\(\)]*\s*([^\"]*\s*[^\)]+),\s*__func__\s*\)/\1\2)/g;
      $f =~ s/(debugf[0-9]\s*\(\"MC\:\s*)\%s[\:\,\(\)]*\s*([^\"]*\s*[^\)]+)__func__\s*\,\s*/\1\2/g;
      $f =~ s/(debugf[0-9]\s*\(\"MC\:\s*)\%s[\:\,\(\)]*\s*([^\"]*\s*[^\)]+),\s*__func__\s*\)/\1\2)/g;
      
      $f =~ s/\"MC\: \\n\"/"MC:\\n"/g;
      
      print $f;
      
      After running the script, manual cleanups were done to fix the
      remaining places.
      
      While here, __LINE__ was also removed in most places, as it rarely
      gives useful information.
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
  13. 11 Jun, 2012 3 commits
    • edac: Rename the parent dev to pdev · fd687502
      Mauro Carvalho Chehab committed
      As EDAC doesn't use struct device itself, it created a parent dev
      pointer called "pdev".  Now that EDAC will be converted to use
      struct device instead of struct devsys, this needs to be fixed.
      
      No functional changes.
      Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
      Acked-by: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Doug Thompson <norsk5@yahoo.com>
      Cc: Borislav Petkov <borislav.petkov@amd.com>
      Cc: Mark Gross <mark.gross@intel.com>
      Cc: Jason Uhlenkott <juhlenko@akamai.com>
      Cc: Tim Small <tim@buttersideup.com>
      Cc: Ranganathan Desikan <ravi@jetztechnologies.com>
      Cc: "Arvind R." <arvino55@gmail.com>
      Cc: Olof Johansson <olof@lixom.net>
      Cc: Egor Martovetsky <egor@pasemi.com>
      Cc: Michal Marek <mmarek@suse.cz>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Joe Perches <joe@perches.com>
      Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Hitoshi Mitake <h.mitake@gmail.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: "Niklas Söderlund" <niklas.soderlund@ericsson.com>
      Cc: Shaohui Xie <Shaohui.Xie@freescale.com>
      Cc: Josh Boyer <jwboyer@gmail.com>
      Cc: linuxppc-dev@lists.ozlabs.org
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
    • edac: fix the error about memory type detection on SandyBridge · 2cbb587d
      Chen Gong committed
      On SandyBridge, DDRIOA (Dev: 17 Func: 0 Offset: 328) is used
      to detect whether a DIMM is an RDIMM/LRDIMM, not TA (Dev: 15 Func: 0).
      Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
    • edac: avoid mce decoding crash after edac driver unloaded · e35fca47
      Chen Gong committed
      Some edac drivers register themselves as mce decoders via
      notifier_chain. But the current notifier_chain implementation does
      not support registering the same notifier twice: doing so corrupts
      the list when elements are added or removed. For example, on a
      SandyBridge platform, removing the sb_edac module and then triggering
      an error causes an oops, because no mce decoder is registered any
      more but the notifier_chain still points to an invalid callback
      function. Here is an example:
      
      Call Trace:
       [<ffffffff8150ef6a>] atomic_notifier_call_chain+0x1a/0x20
       [<ffffffff8102b936>] mce_log+0x46/0x180
       [<ffffffff8102eaea>] apei_mce_report_mem_error+0x4a/0x60
       [<ffffffff812e19d2>] ghes_do_proc+0x192/0x210
       [<ffffffff812e2066>] ghes_proc+0x46/0x70
       [<ffffffff812e20d8>] ghes_notify_sci+0x48/0x80
       [<ffffffff8150ef05>] notifier_call_chain+0x55/0x80
       [<ffffffff81076f1a>] __blocking_notifier_call_chain+0x5a/0x80
       [<ffffffff812aea11>] ? acpi_os_wait_events_complete+0x23/0x23
       [<ffffffff81076f56>] blocking_notifier_call_chain+0x16/0x20
       [<ffffffff812ddc4d>] acpi_hed_notify+0x19/0x1b
       [<ffffffff812b16bd>] acpi_device_notify+0x19/0x1b
       [<ffffffff812beb38>] acpi_ev_notify_dispatch+0x67/0x7f
       [<ffffffff812aea3a>] acpi_os_execute_deferred+0x29/0x36
       [<ffffffff81069dc2>] process_one_work+0x132/0x450
       [<ffffffff8106bbcb>] worker_thread+0x17b/0x3c0
       [<ffffffff8106ba50>] ? manage_workers+0x120/0x120
       [<ffffffff81070aee>] kthread+0x9e/0xb0
       [<ffffffff81514724>] kernel_thread_helper+0x4/0x10
       [<ffffffff81070a50>] ? kthread_freezable_should_stop+0x70/0x70
       [<ffffffff81514720>] ? gs_change+0x13/0x13
      Code: f3 49 89 d4 45 85 ed 4d 89 c6 48 8b 0f 74 48 48 85 c9 75 17 eb 41
      0f 1f 80 00 00 00 00 41 83 ed 01 4c 89 f9 74 22 4d 85 ff 74 1d <4c> 8b
      79 08 4c 89 e2 48 89 de 48 89 cf ff 11 4d 85 f6 74 04 41
      RIP  [<ffffffff8150eef6>] notifier_call_chain+0x46/0x80
       RSP <ffff88042868fb20>
      CR2: ffffffffa01af838
      ---[ end trace 0100930068e73e6f ]---
      BUG: unable to handle kernel paging request at fffffffffffffff8
      IP: [<ffffffff810705b0>] kthread_data+0x10/0x20
      PGD 1a0d067 PUD 1a0e067 PMD 0
      Oops: 0000 [#2] SMP
      
      Only i7core_edac and sb_edac have this issue, because they support
      more than one memory controller, which means they register the mce
      decoder multiple times.
      
      Cc: <stable@vger.kernel.org> # 3.2 and upper
      Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
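Why registering the same notifier twice is harmful can be shown with a simplified, userspace model of a priority-ordered singly linked chain. All names below are hypothetical, and the use-count guard at the end is only one possible way to avoid the problem; it is an assumption for illustration and not necessarily how this commit fixes it.

```c
#include <stddef.h>

/* Simplified notifier node: a callback, a link and a priority. */
struct sketch_notifier {
    int (*call)(void *data);
    struct sketch_notifier *next;
    int priority;
};

/* Insert in priority order, with no check for duplicates. */
static void chain_register(struct sketch_notifier **head,
                           struct sketch_notifier *n)
{
    struct sketch_notifier **nl = head;

    while (*nl != NULL) {
        if (n->priority > (*nl)->priority)
            break;
        nl = &(*nl)->next;
    }
    n->next = *nl;
    *nl = n; /* if n is already the tail, this makes n->next point at n */
}

/* Unlink the first occurrence of n. */
static int chain_unregister(struct sketch_notifier **head,
                            struct sketch_notifier *n)
{
    struct sketch_notifier **nl = head;

    while (*nl != NULL) {
        if (*nl == n) {
            *nl = n->next;
            return 0;
        }
        nl = &(*nl)->next;
    }
    return -1;
}

/* One possible guard: only touch the chain for the first and last user. */
struct guarded_decoder {
    struct sketch_notifier nb;
    int users;
};

static void decoder_get(struct sketch_notifier **head,
                        struct guarded_decoder *d)
{
    if (d->users++ == 0)
        chain_register(head, &d->nb);
}

static void decoder_put(struct sketch_notifier **head,
                        struct guarded_decoder *d)
{
    if (d->users > 0 && --d->users == 0)
        chain_unregister(head, &d->nb);
}
```

In this model, registering a node twice leaves it linked to itself, so a single unregister does not remove it from the chain. If the module owning the callback is then unloaded, the chain still references it, which is consistent with the stale-callback oops shown above.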
  14. 29 May, 2012 7 commits
    • edac: Cleanup the logs for i7core and sb edac drivers · e17a2f42
      Mauro Carvalho Chehab committed
      Remove some information that is duplicated in the MCE log and is of
      little use for the error report. This data will be added back when
      creating a trace function that outputs both memory errors and MCE
      fields.
      
      Cc: Aristeu Rozanski <arozansk@redhat.com>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
    • edac: Remove the legacy EDAC ABI · ca0907b9
      Mauro Carvalho Chehab committed
      Now that all drivers have been converted to use the new ABI, we can
      drop the old one.
      Acked-by: Chris Metcalf <cmetcalf@tilera.com>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
    • sb_edac: convert driver to use the new edac ABI · c36e3e77
      Mauro Carvalho Chehab committed
      The legacy edac ABI is going to be removed. Port the driver to use
      and benefit from the new API functionality.
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
    • edac: move nr_pages to dimm struct · a895bf8b
      Mauro Carvalho Chehab committed
      The number of pages is a dimm property. Move it to the dimm struct.
      
      After this change, it is possible to add sysfs nodes for the DIMMs that
      properly represent the DIMM stick properties, including its size.
      
      A remaining TODO is to properly represent dual-rank/quad-rank DIMMs when
      the memory controller represents the memory via chip select rows.
      Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
      Acked-by: Borislav Petkov <borislav.petkov@amd.com>
      Acked-by: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Doug Thompson <norsk5@yahoo.com>
      Cc: Mark Gross <mark.gross@intel.com>
      Cc: Jason Uhlenkott <juhlenko@akamai.com>
      Cc: Tim Small <tim@buttersideup.com>
      Cc: Ranganathan Desikan <ravi@jetztechnologies.com>
      Cc: "Arvind R." <arvino55@gmail.com>
      Cc: Olof Johansson <olof@lixom.net>
      Cc: Egor Martovetsky <egor@pasemi.com>
      Cc: Michal Marek <mmarek@suse.cz>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Joe Perches <joe@perches.com>
      Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Hitoshi Mitake <h.mitake@gmail.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: "Niklas Söderlund" <niklas.soderlund@ericsson.com>
      Cc: Shaohui Xie <Shaohui.Xie@freescale.com>
      Cc: Josh Boyer <jwboyer@gmail.com>
      Cc: linuxppc-dev@lists.ozlabs.org
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
    • edac: Don't initialize csrow's first_page & friends when not needed · 5e2af0c0
      Mauro Carvalho Chehab committed
      Almost all edac drivers initialize csrow_info->first_page,
      csrow_info->last_page and csrow_info->page_mask. Those variables are
      used inside the EDAC core to calculate the csrow affected
      by an error, via the routine edac_mc_find_csrow_by_page().
      
      However, very few drivers actually use it:
              e752x_edac.c
              e7xxx_edac.c
              i3000_edac.c
              i82443bxgx_edac.c
              i82860_edac.c
              i82875p_edac.c
              i82975x_edac.c
              r82600_edac.c
      
      There are also a few other drivers that use those variables in
      their own internal calculations.
      
      All the others are just wasting time by initializing this
      data.
      
      While initializing data without using it won't cause any trouble,
      this information is stored in the wrong place (in the csrows
      structure), so it is better to remove what is unused, in order to
      simplify the next patch.
      Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
      Acked-by: Borislav Petkov <borislav.petkov@amd.com>
      Acked-by: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Doug Thompson <norsk5@yahoo.com>
      Cc: Hitoshi Mitake <h.mitake@gmail.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: "Niklas Söderlund" <niklas.soderlund@ericsson.com>
      Cc: Josh Boyer <jwboyer@gmail.com>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
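How a page-to-csrow lookup can use those three fields is sketched below. This is a hypothetical userspace approximation and not the body of edac_mc_find_csrow_by_page(); in particular, the way page_mask is applied here (comparing the masked page against the masked first page) is an assumption made for the example.

```c
#include <stddef.h>

/* Hypothetical per-csrow address range description. */
struct sketch_csrow {
    unsigned long first_page; /* first page frame covered by the row */
    unsigned long last_page;  /* last page frame covered by the row */
    unsigned long page_mask;  /* bits used to tell interleaved rows apart */
};

/* Return the index of the row containing the page, or -1 if none does. */
static int sketch_find_csrow_by_page(const struct sketch_csrow *rows,
                                     size_t nr_rows, unsigned long page)
{
    for (size_t i = 0; i < nr_rows; i++) {
        const struct sketch_csrow *r = &rows[i];

        if (page >= r->first_page && page <= r->last_page &&
            (page & r->page_mask) == (r->first_page & r->page_mask))
            return (int)i;
    }
    return -1;
}
```

A driver that never performs this lookup, directly or through the core, gains nothing from filling in the three fields, which is the point the commit message makes.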
    • edac: move dimm properties to struct dimm_info · 084a4fcc
      Mauro Carvalho Chehab committed
      On systems based on chip select rows, all channels need to use memories
      with the same properties, otherwise the memories on channels A and B
      won't be recognized.
      
      However, this assumption does not hold for all types of memory
      controllers.
      
      Controllers for FB-DIMMs don't have such requirements.
      
      Also, modern Intel controllers seem to be capable of handling such
      differences.
      
      So, we need to stop storing the DIMM information in per-csrow
      data and store it in the right place instead.
      
      The first step is to move grain, mtype, dtype and edac_mode to the
      per-dimm struct.
      Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
      Reviewed-by: Borislav Petkov <borislav.petkov@amd.com>
      Acked-by: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Doug Thompson <norsk5@yahoo.com>
      Cc: Borislav Petkov <borislav.petkov@amd.com>
      Cc: Mark Gross <mark.gross@intel.com>
      Cc: Jason Uhlenkott <juhlenko@akamai.com>
      Cc: Tim Small <tim@buttersideup.com>
      Cc: Ranganathan Desikan <ravi@jetztechnologies.com>
      Cc: "Arvind R." <arvino55@gmail.com>
      Cc: Olof Johansson <olof@lixom.net>
      Cc: Egor Martovetsky <egor@pasemi.com>
      Cc: Michal Marek <mmarek@suse.cz>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Joe Perches <joe@perches.com>
      Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Hitoshi Mitake <h.mitake@gmail.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: James Bottomley <James.Bottomley@parallels.com>
      Cc: "Niklas Söderlund" <niklas.soderlund@ericsson.com>
      Cc: Shaohui Xie <Shaohui.Xie@freescale.com>
      Cc: Josh Boyer <jwboyer@gmail.com>
      Cc: Mike Williams <mike@mikebwilliams.com>
      Cc: linuxppc-dev@lists.ozlabs.org
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
    • edac: Create a dimm struct and move the labels into it · a7d7d2e1
      Mauro Carvalho Chehab committed
      The way a DIMM is currently represented implies that DIMMs are
      linked into a per-csrow struct. However, some drivers don't see
      csrows, as they're hidden behind some chip, like the AMBs
      on FB-DIMMs, for example.
      
      This forced drivers to fake^Wvirtualize a csrow struct, and to make
      a mess of the original csrow/channel concept.
      
      Move the DIMM labels into a per-DIMM struct, and add there
      the real location of the socket, in terms of csrow/channel.
      Later patches will modify the location to properly represent the
      memory architecture.
      
      All other drivers will use a per-csrow type of location.
      Some of those drivers will require a later conversion, as
      they also fake the csrows internally.
      
      TODO: While this patch doesn't change the existing behavior, on
      csrows-based memory controllers, a csrow/channel pair points to a memory
      rank. There's a known bug in the EDAC core that allows having different
      labels for the same DIMM, if it has more than one rank. A later patch
      is needed to merge the several ranks of a DIMM into the same dimm_info
      struct, in order to avoid having different labels for the same DIMM.
      
      The edac_mc_alloc() function will now contain a per-dimm initialization
      loop that will be changed by later patches in order to match other types
      of memory architectures.
      Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
      Reviewed-by: Borislav Petkov <borislav.petkov@amd.com>
      Cc: Doug Thompson <norsk5@yahoo.com>
      Cc: Ranganathan Desikan <ravi@jetztechnologies.com>
      Cc: "Arvind R." <arvino55@gmail.com>
      Cc: "Niklas Söderlund" <niklas.soderlund@ericsson.com>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
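Taken together, the three dimm-struct commits in this group (a7d7d2e1, 084a4fcc, a895bf8b) move the label, the location, the grain/mtype/dtype/edac_mode properties and nr_pages out of the csrow and into a per-DIMM structure. A rough sketch of the resulting shape follows; the `sketch_*` types, field sizes and the helper are assumptions for illustration and do not reproduce the kernel's actual definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-DIMM record holding what used to live in the csrow. */
struct sketch_dimm_info {
    char label[32];       /* DIMM label */
    unsigned int csrow;   /* location, in csrow/channel terms */
    unsigned int channel;
    uint32_t grain;       /* properties moved by 084a4fcc */
    int mtype;
    int dtype;
    int edac_mode;
    uint32_t nr_pages;    /* moved by a895bf8b */
};

/* A channel now only points at the DIMM that backs it. */
struct sketch_channel {
    struct sketch_dimm_info *dimm;
};

struct sketch_csrow_info {
    size_t nr_channels;
    struct sketch_channel *channels;
};

/* The size of a csrow becomes a derived value: the sum over its DIMMs. */
static uint64_t sketch_csrow_pages(const struct sketch_csrow_info *row)
{
    uint64_t total = 0;

    for (size_t i = 0; i < row->nr_channels; i++) {
        if (row->channels[i].dimm != NULL)
            total += row->channels[i].dimm->nr_pages;
    }
    return total;
}
```

Keeping the properties per DIMM lets two channels of the same csrow carry modules of different sizes or types, which the per-csrow layout could not express.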
  15. 30 Apr, 2012 1 commit
  16. 22 Mar, 2012 3 commits