The following errors show up in dmesg 10-20 times per day:
MCA: Bank 5, Status 0x8c00004000010092
MCA: Global Cap 0x0000000001000c10, Status 0x0000000000000000
MCA: Vendor "GenuineIntel", ID 0x206d7, APIC ID 0
MCA: CPU 0 COR (1) RD channel 2 memory error
MCA: Address 0xbb5561e80 (Mode: Physical Address, LSB: 6)
MCA: Misc 0x2140109086
The CPU is always 0, and the "bank" is always 5. The "Misc" and "Address" values vary, but the same values often recur.
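For reference, here is a minimal sketch of how the Status word can be decoded by hand. It assumes the architectural IA32_MCi_STATUS bit layout described in the Intel SDM; the helper name `decode_mci_status` is made up for illustration and is not part of any tool.

```python
# Sketch: decode an IA32_MCi_STATUS value (architectural layout, Intel SDM).
# decode_mci_status is a hypothetical helper name, not an existing API.

# MMM field of the memory-controller error code (bits 6:4).
TRANSACTION = {0: "generic", 1: "read", 2: "write",
               3: "address/command", 4: "scrub"}


def decode_mci_status(status: int) -> dict:
    def bit(n: int) -> bool:
        return bool((status >> n) & 1)

    mca_code = status & 0xFFFF
    fields = {
        "valid": bit(63),
        "overflow": bit(62),          # an earlier error was overwritten
        "uncorrected": bit(61),
        "enabled": bit(60),
        "misc_valid": bit(59),
        "addr_valid": bit(58),
        "context_corrupt": bit(57),
        # Bits 52:38 count corrected errors; meaningful only when
        # Global Cap bit 10 (CMCI_P) is set, as in 0x1000c10.
        "corrected_count": (status >> 38) & 0x7FFF,
        "model_specific_code": (status >> 16) & 0xFFFF,
        "mca_code": mca_code,
    }
    # Memory-controller errors use the pattern 0000 0000 1MMM CCCC.
    if (mca_code & 0xFF80) == 0x0080:
        channel = mca_code & 0xF
        fields["transaction"] = TRANSACTION.get((mca_code >> 4) & 0x7, "reserved")
        # CCCC == 1111 means the channel is not specified.
        fields["channel"] = None if channel == 0xF else channel
    return fields


if __name__ == "__main__":
    for name, value in decode_mci_status(0x8C00004000010092).items():
        print(f"{name}: {value}")
```

For the value above this gives a valid, corrected (not uncorrected) error with a corrected count of 1 on a read from channel 2, which agrees with the "COR (1) RD channel 2 memory error" line.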
The CPU and motherboard are identified at boot thus:
CPU: Intel(R) Xeon(R) CPU E5-1620 0 @ 3.60GHz (3591.44-MHz K8-class CPU)
Origin="GenuineIntel" Id=0x206d7 Family=0x6 Model=0x2d Stepping=7
Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
Features2=0x1fbee3ff<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,x2APIC,POPCNT,TSCDLT,AESNI,XSAVE,OSXSAVE,AVX>
AMD Features=0x2c100800<SYSCALL,NX,Page1GB,RDTSCP,LM>
AMD Features2=0x1<LAHF>
XSAVE Features=0x1<XSAVEOPT>
VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID
TSC: P-state invariant, performance statistics
real memory = 137438953472 (131072 MB)
avail memory = 133741539328 (127545 MB)
Event timer "LAPIC" quality 600
ACPI APIC Table: <LENOVO TC-A0 >
FreeBSD/SMP: Multiprocessor System Detected: 8 CPUs
FreeBSD/SMP: 1 package(s) x 4 core(s) x 2 hardware threads
Should I replace a DIMM (and if so, how do I identify which one?), or is ECC doing its job and there is no need to worry yet?
Adding output of mcelog:
Hardware event. This is not a software error.
MCE 458
CPU 0 BANK 5 TSC 10283dbf8f01bc
MISC 21401e9e86 ADDR bb5561e80
TIME 1665418335 Mon Oct 10 12:12:15 2022
MCG status:
STATUS cc00010000010092 MCGSTATUS 0
MCGCAP 1000c10 APICID 0 SOCKETID 0
CPUID Vendor Intel Family 6 Model 45 Step 7
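The MISC and ADDR fields can be read the same way. The sketch below assumes the architectural IA32_MCi_MISC layout from the Intel SDM (bits 5:0 give the least significant valid address bit, bits 8:6 the address mode); the remaining MISC bits are model-specific and are not interpreted here. The function names are hypothetical.

```python
# Sketch: decode the architectural part of IA32_MCi_MISC and derive the
# address range covered by a reported error. Helper names are made up.

ADDRESS_MODE = {0: "segment offset", 1: "linear address",
                2: "physical address", 3: "memory address", 7: "generic"}


def decode_mci_misc(misc: int) -> dict:
    return {
        # Bits 5:0: position of the least significant valid address bit.
        "addr_lsb": misc & 0x3F,
        # Bits 8:6: how the value in the address register is to be read.
        "addr_mode": ADDRESS_MODE.get((misc >> 6) & 0x7, "reserved"),
    }


def affected_range(addr: int, addr_lsb: int) -> tuple:
    """Return (first, last) byte address of the granule containing addr."""
    size = 1 << addr_lsb
    first = addr & ~(size - 1)
    return first, first + size - 1


if __name__ == "__main__":
    misc = decode_mci_misc(0x21401E9E86)
    first, last = affected_range(0xBB5561E80, misc["addr_lsb"])
    print(misc["addr_mode"], misc["addr_lsb"], hex(first), hex(last))
```

With the MISC and ADDR values from the mcelog record this reports a physical address with LSB 6, i.e. a 64-byte granule from 0xbb5561e80 to 0xbb5561ebf, matching the "Mode: Physical Address, LSB: 6" line in dmesg. Mapping that physical address to a particular DIMM slot depends on the memory controller's interleaving and is not something this sketch attempts.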