Hope this is the right place.
I'm trying to upgrade my Altix 350 to the maximum of 24GB per node (12GB per CPU). Unfortunately, I'm getting memory errors during the memory self-test. The machine works fine with the mix of 1GB and 512MB PC2700R modules it came with; however, I've tried a couple of sets of 2GB modules and neither works. First I tried some Samsung PC3200R, then some Micron PC2700R.
I've had to edit the log of the failed boot (with the Micron modules) in order to include it, as the forum seemed to think it contained contact details.
Code:
001c02-L1>pwr u
returning to console mode 001c02 CPU0, <CTRL_T> to escape to L1
001c02#0c: SGI SAL Version 5.04 rel081111 IP41 built 07:25:21 AM Nov 11, 2008
001c02#0a: SGI SAL Version 5.04 rel081111 IP41 built 07:25:21 AM Nov 11, 2008
Found I/O brick attached to module/001c02/slab/0/node
Probing memory DIMMs ...................... DONE
Initializing memory controller ............ DONE
Testing memory .........................
MBIST FAILURE Byte Address = 0x000000000 (Approximate)
MBIST_REG:X_RP_L, ACT:0x000f0f0f0f, EXP:0x00ffffffff, DIF:0x00f0f0f0f0
MBIST_ERR, FRU:001c02.0 DIMM0_N0_R_BUS_X, ADDR:0x000000000, DIMM_BITS:44,45,46,47,52,53,54,55, ...
MBIST_REG:X_RS_L, ACT:0x0f0f2f9f0f, EXP:0xffffffffff, DIF:0xf0f0d060f0
MBIST_ERR, FRU:001c02.0 DIMM0_N0_R_BUS_X, ADDR:0x000000000, DIMM_BITS:4,5,6,7,13,14,20,22, ...
MBIST_REG:X_LP_L, ACT:0x000f0f0f0f, EXP:0x00ffffffff, DIF:0x00f0f0f0f0
MBIST_ERR, FRU:001c02.0 DIMM0_N0_L_BUS_X, ADDR:0x000000000, DIMM_BITS:44,45,46,47,52,53,54,55, ...
MBIST_REG:X_LS_L, ACT:0x0f0f0f1f1f, EXP:0xffffffffff, DIF:0xf0f0f0e0e0
MBIST_ERR, FRU:001c02.0 DIMM0_N0_L_BUS_X, ADDR:0x000000000, DIMM_BITS:5,6,7,13,14,15,20,21, ...
MBIST_REG:Y_RP_L, ACT:0x000f0f0f0f, EXP:0x00ffffffff, DIF:0x00f0f0f0f0
MBIST_ERR, FRU:001c02.0 DIMM1_N0_R_BUS_Y, ADDR:0x000000000, DIMM_BITS:44,45,46,47,52,53,54,55, ...
MBIST_REG:Y_RS_L, ACT:0x0f0f0f0f0f, EXP:0xffffffffff, DIF:0xf0f0f0f0f0
MBIST_ERR, FRU:001c02.0 DIMM1_N0_R_BUS_Y, ADDR:0x000000000, DIMM_BITS:4,5,6,7,12,13,14,15, ...
MBIST_REG:Y_LP_L, ACT:0x000f4f0f0f, EXP:0x00ffffffff, DIF:0x00f0b0f0f0
MBIST_ERR, FRU:001c02.0 DIMM1_N0_L_BUS_Y, ADDR:0x000000000, DIMM_BITS:44,45,46,47,52,53,54,55, ...
MBIST_REG:Y_LS_L, ACT:0x1f0f8f0f0f, EXP:0xffffffffff, DIF:0xe0f070f0f0
MBIST_ERR, FRU:001c02.0 DIMM1_N0_L_BUS_Y, ADDR:0x000000000, DIMM_BITS:4,5,6,7,12,13,14,15, ...
MULTI-BIT DIFFERENCE DETECTED DURING MEMORY BIST:
Location 001c02#0 DIMM0 N0_L_BUS_X received 2 multi-bit mbist error(s).
Location 001c02#0 DIMM1 N0_L_BUS_Y received 2 multi-bit mbist error(s).
Location 001c02#0 DIMM0 N0_R_BUS_X received 2 multi-bit mbist error(s).
Location 001c02#0 DIMM1 N0_R_BUS_Y received 2 multi-bit mbist error(s).
Module 001c02#0 Bank 0 failed memory tests.
MBIST FAILURE Byte Address = 0x400000000 (Approximate)
MBIST_REG:X_RP_L, ACT:0x000f0f1f8f, EXP:0x00ffffffff, DIF:0x00f0f0e070
MBIST_ERR, FRU:001c02.0 DIMM2_N0_R_BUS_X, ADDR:0x400000000, DIMM_BITS:44,45,46,53,54,55,60,61, ...
MBIST_REG:X_RS_L, ACT:0x8f0f0f2f0f, EXP:0xffffffffff, DIF:0x70f0f0d0f0
MBIST_ERR, FRU:001c02.0 DIMM2_N0_R_BUS_X, ADDR:0x400000000, DIMM_BITS:4,5,6,7,12,14,15,20, ...
MBIST_REG:X_LP_L, ACT:0x000f4f0f0f, EXP:0x00ffffffff, DIF:0x00f0b0f0f0
escaping to L1 system controller
001c02-L1>pwr d
WARNING: power appears off, console unavailable
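Looking at the MBIST_REG lines above, DIF appears to be the bitwise XOR of ACT and EXP (for example, 0x000f0f0f0f ^ 0x00ffffffff = 0x00f0f0f0f0). Here is a minimal Python sketch to check that and list which bits differ. The regex and the function name failing_bits are my own, not part of any SGI tool, and the bit indices are positions within the register value, which may not match the DIMM_BITS numbering one-to-one.

```python
import re

# Hypothetical pattern for the MBIST_REG lines as they appear in this log.
MBIST_REG = re.compile(
    r"MBIST_REG:(\w+), ACT:(0x[0-9a-fA-F]+), EXP:(0x[0-9a-fA-F]+), DIF:(0x[0-9a-fA-F]+)"
)

def failing_bits(line):
    """Return (register, dif_matches_xor, differing_bit_indices) or None."""
    m = MBIST_REG.search(line)
    if m is None:
        return None
    reg = m.group(1)
    act, exp, dif = (int(m.group(i), 16) for i in (2, 3, 4))
    xor = act ^ exp  # bits where actual and expected disagree
    bits = [i for i in range(xor.bit_length()) if (xor >> i) & 1]
    return reg, xor == dif, bits

sample = "MBIST_REG:Y_RS_L, ACT:0x0f0f0f0f0f, EXP:0xffffffffff, DIF:0xf0f0f0f0f0"
print(failing_bits(sample))
```

For every MBIST_REG line in this log, the differing bits fall in the upper four bits of a byte; the lower nibble always reads back as expected.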
Altix systems belong in the Itanium section. They're not MIPS.
I found that my 350 won't accept any generic memory, even when it matches the specs of the original memory. I cross-checked these modules in HP Itanium machines and they worked fine. The Altix just wouldn't accept them.
Thanks, I wonder what the issue is. Maybe it's expecting something very specific in the SPD EEPROMs.
I'm sure the memory will find its way into something; quite possibly my HP Itanium will be a beneficiary.