OH ...boy...this is bad. I got the new card...SAME THING..SAME!!!!! URGGGHHHHH
Okay so I cobbled together a multi-headed hydra of serial adapters into an old laptop because I couldn't make console mode work on the Fuel before (switching serial ports did NOTHING).
I got two terminal windows running, one on the internal header, one on serial #1 at 9600 instead of the higher speed, and I was able to enter CAC and see stuff...I did an error_dump...OH MAN...I hope someone knows what the heck this is because it LOOKS LIKE A SHOW STOPPER. This would mean the board had BOTH real V10 damage and real mainboard/PIMM damage!? Wow, my luck is NOT going well on this Fuel.
Okay L1 log (changed to flash storage B just to see if that made a diff):
Code:
SGI SN1 L1 Controller
Firmware Image B: Rev. 1.28.3, Built 03/20/2004 00:01:57
001a01-L1>env
Environmental monitoring is enabled and running.
Description State Warning Limits Fault Limits Current
-------------- ---------- ----------------- ----------------- -------
12V Wait Pwr 10% 10.80/ 13.20 20% 9.60/ 14.40 0.25
12V IO Wait Pwr 10% 10.80/ 13.20 20% 9.60/ 14.40 0.25
5V Wait Pwr 10% 4.50/ 5.50 20% 4.00/ 6.00 0.08
3.3V Wait Pwr 10% 2.97/ 3.63 20% 2.64/ 3.96 0.62
2.5V Wait Pwr 10% 2.25/ 2.75 20% 2.00/ 3.00 0.00
1.5V Wait Pwr 10% 1.35/ 1.65 20% 1.20/ 1.80 0.00
5V aux Wait Pwr 10% 4.50/ 5.50 20% 4.00/ 6.00 5.04
3.3V aux Wait Pwr 10% 2.97/ 3.63 20% 2.64/ 3.96 3.29
PIMM0 12V bias Wait Pwr 10% 10.80/ 13.20 20% 9.60/ 14.40 0.25
Fuel SRAM Wait Pwr 10% 2.25/ 2.75 20% 2.00/ 3.00 0.05
Fuel CPU Wait Pwr 10% 1.13/ 1.38 20% 1.00/ 1.50 0.01
PIMM0 1.5V Wait Pwr 10% 1.35/ 1.65 20% 1.20/ 1.80 0.04
PIMM0 3.3V aux Wait Pwr 10% 2.97/ 3.63 20% 2.64/ 3.96 3.27
PIMM0 5V aux Wait Pwr 10% 4.50/ 5.50 20% 4.00/ 6.00 5.02
XIO 12V bias Wait Pwr 10% 10.80/ 13.20 20% 9.60/ 14.40 0.25
XIO 5V Wait Pwr 10% 4.50/ 5.50 20% 4.00/ 6.00 0.08
XIO 2.5V Wait Pwr 10% 2.25/ 2.75 20% 2.00/ 3.00 0.00
XIO 3.3V aux Wait Pwr 10% 2.97/ 3.63 20% 2.64/ 3.96 3.30
Description State Warning RPM Current RPM
-------------- ---------- ----------- -----------
FAN 0 EXHAUST Wait Pwr 920 0
FAN 1 HD Wait Pwr 1560 0
FAN 2 PCI Wait Pwr 1120 0
FAN 3 XIO 1 Wait Pwr 1600 0
FAN 4 XIO 2 Wait Pwr 1600 0
FAN 5 PS Wait Pwr 1349 0
Advisory Critical Fault Current
Description State Temp Temp Temp Temp
----------------- ---------- --------- --------- --------- ---------
0 NODE 0 Wait Pwr [Autofan Control] 75C/167F 26C/ 78F
1 NODE 1 Wait Pwr [Autofan Control] 75C/167F 25C/ 77F
2 NODE 2 Wait Pwr [Autofan Control] 75C/167F 20C/ 68F
3 PIMM Wait Pwr [Autofan Control] 75C/167F 29C/ 84F
4 ODYSSEY Wait Pwr [Autofan Control] 75C/167F 23C/ 73F
5 BEDROCK Wait Pwr Not currently available
001a01-L1>INFO: 001a01 will power up system in 5 seconds...
INFO: 001a01 powering up the system.
001a01-L1>env
Environmental monitoring is enabled and running.
Description State Warning Limits Fault Limits Current
-------------- ---------- ----------------- ----------------- -------
12V Enabled 10% 10.80/ 13.20 20% 9.60/ 14.40 12.00
12V IO Enabled 10% 10.80/ 13.20 20% 9.60/ 14.40 12.06
5V Enabled 10% 4.50/ 5.50 20% 4.00/ 6.00 5.07
3.3V Enabled 10% 2.97/ 3.63 20% 2.64/ 3.96 3.35
2.5V Enabled 10% 2.25/ 2.75 20% 2.00/ 3.00 2.47
1.5V Enabled 10% 1.35/ 1.65 20% 1.20/ 1.80 1.47
5V aux Enabled 10% 4.50/ 5.50 20% 4.00/ 6.00 5.02
3.3V aux Enabled 10% 2.97/ 3.63 20% 2.64/ 3.96 3.29
PIMM0 12V bias Enabled 10% 10.80/ 13.20 20% 9.60/ 14.40 12.00
Fuel SRAM Enabled 10% 2.25/ 2.75 20% 2.00/ 3.00 2.51
Fuel CPU Enabled 10% 1.13/ 1.38 20% 1.00/ 1.50 1.25
PIMM0 1.5V Enabled 10% 1.35/ 1.65 20% 1.20/ 1.80 1.49
PIMM0 3.3V aux Enabled 10% 2.97/ 3.63 20% 2.64/ 3.96 3.29
PIMM0 5V aux Enabled 10% 4.50/ 5.50 20% 4.00/ 6.00 5.02
XIO 12V bias Enabled 10% 10.80/ 13.20 20% 9.60/ 14.40 12.00
XIO 5V Enabled 10% 4.50/ 5.50 20% 4.00/ 6.00 5.07
XIO 2.5V Enabled 10% 2.25/ 2.75 20% 2.00/ 3.00 2.48
XIO 3.3V aux Enabled 10% 2.97/ 3.63 20% 2.64/ 3.96 3.30
Description State Warning RPM Current RPM
-------------- ---------- ----------- -----------
FAN 0 EXHAUST Enabled 920 1196
FAN 1 HD Enabled 1560 2205
FAN 2 PCI Enabled 1120 1534
FAN 3 XIO 1 Enabled 1600 2191
FAN 4 XIO 2 Enabled 1600 2057
FAN 5 PS Enabled 1349 2122
Advisory Critical Fault Current
Description State Temp Temp Temp Temp
----------------- ---------- --------- --------- --------- ---------
0 NODE 0 Enabled [Autofan Control] 75C/167F 25C/ 77F
1 NODE 1 Enabled [Autofan Control] 75C/167F 25C/ 77F
2 NODE 2 Enabled [Autofan Control] 75C/167F 20C/ 68F
3 PIMM Enabled [Autofan Control] 75C/167F 30C/ 86F
4 ODYSSEY Enabled [Autofan Control] 75C/167F 23C/ 73F
5 BEDROCK Enabled [Autofan Control] 85C/185F 27C/ 80F
001a01-L1>log
11/12/21 15:15:14 power up (COMMAND)
11/12/21 15:15:21 reset again MIPS
11/12/21 15:17:40 SMP unregistering events
11/12/21 15:17:40 UNREG: 300064c4 0 4
11/12/21 15:17:41 SMP-R: UART:UART_NO_CONNECTION
11/12/21 15:18:30 3.3V low fault limit reached 1.926V.
11/12/21 15:19:02 L1 booting 1.28.3
11/12/21 15:19:02 PSC: 0x09
11/12/21 15:19:02 USB0: waiting on open
11/12/21 15:19:02 auto power up countdown initiated
11/12/21 15:19:08 auto power up initiated
11/12/21 15:19:08 power up (COMMAND)
11/12/21 15:19:14 reset again MIPS
11/12/21 15:19:20 power down (COMMAND)
11/12/21 15:19:20 power down (COMMAND)
11/12/21 15:20:36 SMP unregistering events
11/12/21 15:20:36 UNREG: 300064c4 0 4
11/12/21 15:20:37 SMP-R: UART:UART_NO_CONNECTION
11/12/21 15:21:11 L1 booting 1.28.3
11/12/21 15:21:11 PSC: 0x09
11/12/21 15:21:11 USB0: waiting on open
11/12/21 15:21:11 auto power up countdown initiated
11/12/21 15:21:16 auto power up initiated
11/12/21 15:21:16 power up (COMMAND)
11/12/21 15:21:23 reset again MIPS
11/12/21 15:22:35 PIMM0 1.5V low fault limit reached 1.142V.
11/12/21 15:23:06 L1 booting 1.28.3
11/12/21 15:23:07 PSC: 0x09
11/12/21 15:23:07 USB0: waiting on open
11/12/21 15:23:07 auto power up countdown initiated
11/12/21 15:23:12 auto power up initiated
11/12/21 15:23:12 power up (COMMAND)
11/12/21 15:23:18 reset again MIPS
11/12/21 15:23:29 12V IO low fault limit reached 9.500V.
11/12/21 15:23:35 L1 booting 1.28.3
11/12/21 15:23:35 PSC: 0x09
11/12/21 15:23:35 USB0: waiting on open
11/12/21 15:23:35 auto power up countdown initiated
11/12/21 15:23:40 auto power up initiated
11/12/21 15:23:40 power up (COMMAND)
11/12/21 15:23:47 reset again MIPS
11/12/21 15:23:58 1.5V low fault limit reached 1.199V.
11/12/21 15:24:34 L1 booting 1.28.3
11/12/21 15:24:35 PSC: 0x09
11/12/21 15:24:35 USB0: waiting on open
11/12/21 15:24:35 auto power up countdown initiated
11/12/21 15:24:40 auto power up initiated
11/12/21 15:24:40 power up (COMMAND)
11/12/21 15:24:47 reset again MIPS
001a01-L1>env
Environmental monitoring is enabled and running.
Description State Warning Limits Fault Limits Current
-------------- ---------- ----------------- ----------------- -------
12V Enabled 10% 10.80/ 13.20 20% 9.60/ 14.40 11.94
12V IO Enabled 10% 10.80/ 13.20 20% 9.60/ 14.40 12.00
5V Enabled 10% 4.50/ 5.50 20% 4.00/ 6.00 5.07
3.3V Enabled 10% 2.97/ 3.63 20% 2.64/ 3.96 3.35
2.5V Enabled 10% 2.25/ 2.75 20% 2.00/ 3.00 2.47
1.5V Enabled 10% 1.35/ 1.65 20% 1.20/ 1.80 1.47
5V aux Enabled 10% 4.50/ 5.50 20% 4.00/ 6.00 5.02
3.3V aux Enabled 10% 2.97/ 3.63 20% 2.64/ 3.96 3.29
PIMM0 12V bias Enabled 10% 10.80/ 13.20 20% 9.60/ 14.40 11.94
Fuel SRAM Enabled 10% 2.25/ 2.75 20% 2.00/ 3.00 2.51
Fuel CPU Enabled 10% 1.13/ 1.38 20% 1.00/ 1.50 1.25
PIMM0 1.5V Enabled 10% 1.35/ 1.65 20% 1.20/ 1.80 1.49
PIMM0 3.3V aux Enabled 10% 2.97/ 3.63 20% 2.64/ 3.96 3.29
PIMM0 5V aux Enabled 10% 4.50/ 5.50 20% 4.00/ 6.00 5.02
XIO 12V bias Enabled 10% 10.80/ 13.20 20% 9.60/ 14.40 11.88
XIO 5V Enabled 10% 4.50/ 5.50 20% 4.00/ 6.00 5.07
XIO 2.5V Enabled 10% 2.25/ 2.75 20% 2.00/ 3.00 2.47
XIO 3.3V aux Enabled 10% 2.97/ 3.63 20% 2.64/ 3.96 3.30
Description State Warning RPM Current RPM
-------------- ---------- ----------- -----------
FAN 0 EXHAUST Enabled 920 1196
FAN 1 HD Enabled 1560 2205
FAN 2 PCI Enabled 1120 1534
FAN 3 XIO 1 Enabled 1600 2191
FAN 4 XIO 2 Enabled 1600 2057
FAN 5 PS Enabled 1349 2122
Advisory Critical Fault Current
Description State Temp Temp Temp Temp
----------------- ---------- --------- --------- --------- ---------
0 NODE 0 Enabled [Autofan Control] 75C/167F 27C/ 80F
1 NODE 1 Enabled [Autofan Control] 75C/167F 25C/ 77F
2 NODE 2 Enabled [Autofan Control] 75C/167F 20C/ 68F
3 PIMM Enabled [Autofan Control] 75C/167F 31C/ 87F
4 ODYSSEY Enabled [Autofan Control] 75C/167F 22C/ 71F
5 BEDROCK Enabled [Autofan Control] 85C/185F 29C/ 84F
001a01-L1>debug 0x010d
debug switches set to 0x010d
001a01-L1>cac
ERROR: command not found.
001a01-L1>
entering console mode 001a01 CPU0, <CTRL_T> to escape to L1
The low voltage errors in the log are odd, they don't show in the env...I think that's because they are momentary, based on start-up load. But odd anyway.
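To put numbers on that, here is a rough Python sketch that lines up the four fault entries from the L1 log against the low fault limits shown in the env table. The values are copied by hand from the output above; the names (`LOW_FAULT_LIMITS`, `parse_faults`) and the regex are just my own helpers, not anything from the L1 firmware.

```python
import re

# Low fault limits (volts), copied from the "Fault Limits" column of the env output.
LOW_FAULT_LIMITS = {
    "3.3V": 2.64,
    "PIMM0 1.5V": 1.20,
    "12V IO": 9.60,
    "1.5V": 1.20,
}

# The four fault lines, copied from the L1 log.
LOG_LINES = [
    "11/12/21 15:18:30 3.3V low fault limit reached 1.926V.",
    "11/12/21 15:22:35 PIMM0 1.5V low fault limit reached 1.142V.",
    "11/12/21 15:23:29 12V IO low fault limit reached 9.500V.",
    "11/12/21 15:23:58 1.5V low fault limit reached 1.199V.",
]

# date, time, rail name, measured voltage
FAULT_RE = re.compile(r"^(\S+) (\S+) (.+?) low fault limit reached ([\d.]+)V\.$")


def parse_faults(lines):
    """Return (time, rail, measured, limit, shortfall) for each fault line."""
    faults = []
    for line in lines:
        match = FAULT_RE.match(line)
        if not match:
            continue  # not a low-voltage fault entry
        _date, time, rail, volts = match.groups()
        measured = float(volts)
        limit = LOW_FAULT_LIMITS[rail]
        faults.append((time, rail, measured, limit, round(limit - measured, 3)))
    return faults


faults = parse_faults(LOG_LINES)
for time, rail, measured, limit, shortfall in faults:
    print(f"{time}  {rail:<11} {measured:6.3f} V  limit {limit:5.2f} V  short by {shortfall:.3f} V")
```

Each fault hit a different rail, and the shortfalls range from almost nothing (1.5V) to quite a lot (3.3V), so this at least doesn't look like one single rail that is permanently low.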
Okay so the BIG guy..CAC...I cleared all the logs and reset everything...error_dump remained the same and it's pretty core..I really don't know how I'd look into this.
(FYI, I see the memory layout error and I have corrected that. I don't want to upload a new log because it didn't change anything. I also F*D with the debug mode and restarted a bunch of times, so the output will be odd.):
Code:
IP35 PROM SGI Version 6.210 built 02:33:51 PM Aug 26, 2004
built for bedrock rev. 1.1 or greater
running in IP34 mode
Running in DDR mode
Local master CPU A revision: f41
PROM length: 0x1686a8, BSS length: 0xa7a0, flash count: 9
Configured bedrock clock: 200.0 MHz
Status of local IO: 0x1 0x3fc0fff6403
Bedrock Rev: 2, Module: 1 (001c01) from Sys Ctlr
On PROM entry: ERR_EPC=0xc00000001fc02ad4 (0xc00000001fc02ad4)
Configuring memory
*** Memory sizing failure:
*** Bank 0 (256 MB) and bank 1 (512 MB) sizes differ,
*** treating them both as 256 MB
Local memory configured: 1024 MB (standard)
*** Warning: System controller debug switches are non-zero (0x10d)
*** Diag level set to None (2)
*** Info level set to verbose
*** Boot stop requested at Global (2)
before reading NICHub NIC: 0x5455827f
SR1 set to 0x0000080690349000
SR0 set to 0x000000005455827f
Testing/Initializing memory ............... DONE
Copying PROM code to memory ............... Copy PROM (0x9000000018000000) to RAM (0x9600000001a00000), len 0x1686a8
Done
DONE
Skipping secondary cache diags
CPU A switching stack into UALIAS and invalidating D-cache
CPU A switching into node 0 cached RAM
CPU A running cached
Initializing kldir.
Done initializing kldir.
Initializing klconfig.
init_klcfg: nasid 0 start 9600000000030000 size 10000
Done initializing klconfig.
Discovering local IO ...................... Check_master: link 10 is master
Check_master: link 10 is master
DONE
CPU A initialized subnode
Discovering NUMAlink connectivity .........
Local hub NUMAlink is down.
*** Local network link down
DONE
Found 1 objects (1 hubs, 0 routers) in 30357 usec
Waiting for peers to complete discovery.... Discovery results:
ENTRY 0: HUB(5455827f)
NASID=-1 Mod=1 Flg=0x1500000 PROM=6.210 Route=N/A
MODULE=001c01 PARTITION=0 SPACE=RESET
Port 1 connection: Not connected
Port status: NF
DONE
No other nodes present; becoming global master
Global master is entry 0, NIC 0x5455827f, /hw/rack/001/bay/01
Global master is /hw/rack/001/bay/01
Global barrier (line 4315) \Global barrier passed.
Global barrier (line 4348) \Global barrier passed.
Master System Topology Graph (pre-nasid_assign):
ENTRY 0: HUB(5455827f)
NASID=-1 Mod=1 Flg=0x1500000 PROM=6.210 Route=N/A
MODULE=001c01 PARTITION=0 SPACE=RESET
Port 1 connection: Not connected
Port status: NF
Calculating NASIDs
num_routers is 0
Master System Topology Graph:
ENTRY 0: HUB(5455827f)
NASID=0 Mod=1 Flg=0x1500000 PROM=6.210 Route=N/A
MODULE=001c01 PARTITION=0 SPACE=RESET
Port 1 connection: Not connected
Port status: NF
Distributing routing tables
Distributing NASIDs
*** NASID assigned to 0
CPU A switching to UALIAS
CPU A running in UALIAS
Changing node ID to 0
Global barrier (line 4823) \Global barrier passed.
CPU A Flushing and invalidating caches
Global barrier (line 4928) \Global barrier passed.
CPU A switching to node 0 cached RAM
CPU A running cached
Nasids in partition: +0
Regions in partition: +0
Intializing any CPUless nodes.............. Global barrier (line 7714) \Global barrier passed.
Global barrier (line 7715) \Global barrier passed.
DONE
Global barrier (line 5089) \Global barrier passed.
hubii_link_good: A-brick attached to module 001c01.
Checking partitioning information ......... DONE
No other nodes present; becoming partition master
*** After partitioning ***
ENTRY 0: HUB(5455827f)
NASID=0 Mod=1 Flg=0x1500000 PROM=6.210 Route=N/A
MODULE=001c01 PARTITION=0 SPACE=RESET
Port 1 connection: Not connected
Port status: FE
Erecting partition fences ................ DONE
Update config for routers connected to hubs
Update config for hubs and hubless routers
CPU A flushing cache
check_router_cfg: nasid 0 is_voyager 0 check_cfg = 0
Global barrier (line 5300) \Global barrier passed.
Nasids in partition: +0
Regions in partition: +0
A 000: *** Entering POD mode on node 0
A 000: POD IOC3 Cac> go cac
Must be in Dex mode before switching to Cac or Unc.
A 000: POD IOC3 Cac> clearalllogs
*** This must be run only after NUMAlink discovery is complete.
*** This will clear all previous log variables such as:
*** moduleids, nodeids, etc. for all nodes.
Clear all logs? [n] n
Aborted
A 000: POD IOC3 Cac> y clearalllogs
*** This must be run only after NUMAlink discovery is complete.
*** This will clear all previous log variables such as:
*** moduleids, nodeids, etc. for all nodes.
Clear all logs? [n] y
Checking 1 entries for promlogs
.DONE
All PROM logs cleared!
A 000: POD IOC3 Cac> reset
Resetting the system...
IP35 PROM SGI Version 6.210 built 02:33:51 PM Aug 26, 2004
built for bedrock rev. 1.1 or greater
running in IP34 mode
Running in DDR mode
Local master CPU A revision: f41
PROM length: 0x1686a8, BSS length: 0xa7a0, flash count: 9
Configured bedrock clock: 200.0 MHz
Status of local IO: 0x1 0x3fc0fff6403
Bedrock Rev: 2, Module: 1 (001c01) from Sys Ctlr
On PROM entry: ERR_EPC=0xc00000001fc6b5e8 (0xc00000001fc6b5e8)
Configuring memory
*** Memory sizing failure:
*** Bank 0 (256 MB) and bank 1 (512 MB) sizes differ,
*** treating them both as 256 MB
Local memory configured: 1024 MB (standard)
*** Warning: System controller debug switches are non-zero (0x10d)
*** Diag level set to None (2)
*** Info level set to verbose
*** Boot stop requested at Global (2)
before reading NICHub NIC: 0x5455827f
SR1 set to 0x0000080690349000
SR0 set to 0x000000005455827f
Testing/Initializing memory ............... DONE
Copying PROM code to memory ............... Copy PROM (0x9000000018000000) to RAM (0x9600000001a00000), len 0x1686a8
Done
DONE
Skipping secondary cache diags
CPU A switching stack into UALIAS and invalidating D-cache
CPU A switching into node 0 cached RAM
CPU A running cached
Initializing kldir.
Done initializing kldir.
Initializing klconfig.
init_klcfg: nasid 0 start 9600000000030000 size 10000
Done initializing klconfig.
Discovering local IO ...................... Check_master: link 10 is master
Check_master: link 10 is master
DONE
CPU A initialized subnode
Discovering NUMAlink connectivity .........
Local hub NUMAlink is down.
*** Local network link down
DONE
Found 1 objects (1 hubs, 0 routers) in 30358 usec
Waiting for peers to complete discovery.... Discovery results:
ENTRY 0: HUB(5455827f)
NASID=-1 Mod=1 Flg=0x1500000 PROM=6.210 Route=N/A
MODULE=001c01 PARTITION=0 SPACE=RESET
Port 1 connection: Not connected
Port status: NF
DONE
No other nodes present; becoming global master
Global master is entry 0, NIC 0x5455827f, /hw/rack/001/bay/01
Global master is /hw/rack/001/bay/01
Global barrier (line 4315) \Global barrier passed.
Global barrier (line 4348) \Global barrier passed.
Master System Topology Graph (pre-nasid_assign):
ENTRY 0: HUB(5455827f)
NASID=-1 Mod=1 Flg=0x1500000 PROM=6.210 Route=N/A
MODULE=001c01 PARTITION=0 SPACE=RESET
Port 1 connection: Not connected
Port status: NF
Calculating NASIDs
num_routers is 0
Master System Topology Graph:
ENTRY 0: HUB(5455827f)
NASID=0 Mod=1 Flg=0x1500000 PROM=6.210 Route=N/A
MODULE=001c01 PARTITION=0 SPACE=RESET
Port 1 connection: Not connected
Port status: NF
Distributing routing tables
Distributing NASIDs
*** NASID assigned to 0
CPU A switching to UALIAS
CPU A running in UALIAS
Changing node ID to 0
Global barrier (line 4823) \Global barrier passed.
CPU A Flushing and invalidating caches
Global barrier (line 4928) \Global barrier passed.
CPU A switching to node 0 cached RAM
CPU A running cached
Nasids in partition: +0
Regions in partition: +0
Intializing any CPUless nodes.............. Global barrier (line 7714) \Global barrier passed.
Global barrier (line 7715) \Global barrier passed.
DONE
Global barrier (line 5089) \Global barrier passed.
hubii_link_good: A-brick attached to module 001c01.
Checking partitioning information ......... DONE
No other nodes present; becoming partition master
*** After partitioning ***
ENTRY 0: HUB(5455827f)
NASID=0 Mod=1 Flg=0x1500000 PROM=6.210 Route=N/A
MODULE=001c01 PARTITION=0 SPACE=RESET
Port 1 connection: Not connected
Port status: FE
Erecting partition fences ................ DONE
Update config for routers connected to hubs
Update config for hubs and hubless routers
CPU A flushing cache
check_router_cfg: nasid 0 is_voyager 0 check_cfg = 0
Global barrier (line 5300) \Global barrier passed.
Nasids in partition: +0
Regions in partition: +0
A 000: *** Entering POD mode on node 0
A 000: POD IOC3 Cac> error_dump
Hardware Error State: (Forced error dump)
+ Errors on node Nasid 0x0 (0)
+ XBow in /hw/module/174562
+ BEDROCK signalled following errors.
+ XBow Link a status register: 0xffffffff80020000
+ 17: Illegal destination
+ XBow error command word register: 0xffffffffaa018000
+ XBow error upper address register: 0x0
+ XBow error lower address register: 0x0
END Hardware Error State (Forced error dump)
A 000: POD IOC3 Cac> reset all
This will reset ALL partitions, OK? [n] y
Local hub NUMAlink is down.
*** Local network link down
DONE
Found 1 objects (1 hubs, 0 routers) in 30360 usec
Resetting the system...
IP35 PROM SGI Version 6.210 built 02:33:51 PM Aug 26, 2004
built for bedrock rev. 1.1 or greater
running in IP34 mode
Running in DDR mode
Local master CPU A revision: f41
PROM length: 0x1686a8, BSS length: 0xa7a0, flash count: 9
Configured bedrock clock: 200.0 MHz
Status of local IO: 0x1 0x3fc0fff6403
Bedrock Rev: 2, Module: 1 (001c01) from Sys Ctlr
On PROM entry: ERR_EPC=0xc00000001fc6b5e8 (0xc00000001fc6b5e8)
Configuring memory
*** Memory sizing failure:
*** Bank 0 (256 MB) and bank 1 (512 MB) sizes differ,
*** treating them both as 256 MB
Local memory configured: 1024 MB (standard)
*** Warning: System controller debug switches are non-zero (0x4d)
*** Diag level set to None (2)
*** Info level set to verbose
*** Boot stop requested at Global (2)
*** Bypassing first IO7
before reading NICHub NIC: 0x5455827f
SR1 set to 0x0000080690349000
SR0 set to 0x000000005455827f
Testing/Initializing memory ............... DONE
Copying PROM code to memory ............... Copy PROM (0x9000000018000000) to RAM (0x9600000001a00000), len 0x1686a8
Done
DONE
Skipping secondary cache diags
CPU A switching stack into UALIAS and invalidating D-cache
CPU A switching into node 0 cached RAM
CPU A running cached
Initializing kldir.
Done initializing kldir.
Initializing klconfig.
init_klcfg: nasid 0 start 9600000000030000 size 10000
Done initializing klconfig.
Discovering local IO ...................... Check_master: link 10 is master
Check_master: link 10 is master
DONE
CPU A initialized subnode
Discovering NUMAlink connectivity .........
Local hub NUMAlink is down.
*** Local network link down
DONE
Found 1 objects (1 hubs, 0 routers) in 30359 usec
Waiting for peers to complete discovery.... Discovery results:
ENTRY 0: HUB(5455827f)
NASID=-1 Mod=1 Flg=0x1500000 PROM=6.210 Route=N/A
MODULE=001c01 PARTITION=0 SPACE=RESET
Port 1 connection: Not connected
Port status: NF
DONE
No other nodes present; becoming global master
Global master is entry 0, NIC 0x5455827f, /hw/rack/001/bay/01
Global master is /hw/rack/001/bay/01
Global barrier (line 4315) \Global barrier passed.
Global barrier (line 4348) \Global barrier passed.
Master System Topology Graph (pre-nasid_assign):
ENTRY 0: HUB(5455827f)
NASID=-1 Mod=1 Flg=0x1500000 PROM=6.210 Route=N/A
MODULE=001c01 PARTITION=0 SPACE=RESET
Port 1 connection: Not connected
Port status: NF
Calculating NASIDs
num_routers is 0
Master System Topology Graph:
ENTRY 0: HUB(5455827f)
NASID=0 Mod=1 Flg=0x1500000 PROM=6.210 Route=N/A
MODULE=001c01 PARTITION=0 SPACE=RESET
Port 1 connection: Not connected
Port status: NF
Distributing routing tables
Distributing NASIDs
*** NASID assigned to 0
CPU A switching to UALIAS
CPU A running in UALIAS
Changing node ID to 0
Global barrier (line 4823) \Global barrier passed.
CPU A Flushing and invalidating caches
Global barrier (line 4928) \Global barrier passed.
CPU A switching to node 0 cached RAM
CPU A running cached
Nasids in partition: +0
Regions in partition: +0
Intializing any CPUless nodes.............. Global barrier (line 7714) \Global barrier passed.
Global barrier (line 7715) \Global barrier passed.
DONE
Global barrier (line 5089) \Global barrier passed.
hubii_link_good: A-brick attached to module 001c01.
*** Nasid 0: Memory bank 2 was previously Present & Enabled but is now Present & Disabled
*** Nasid 0: Memory bank 2 previously had 256 MB but now has 512 MB
Checking partitioning information ......... DONE
No other nodes present; becoming partition master
*** After partitioning ***
ENTRY 0: HUB(5455827f)
NASID=0 Mod=1 Flg=0x1500000 PROM=6.210 Route=N/A
MODULE=001c01 PARTITION=0 SPACE=RESET
Port 1 connection: Not connected
Port status: FE
Erecting partition fences ................ DONE
Update config for routers connected to hubs
Update config for hubs and hubless routers
CPU A flushing cache
check_router_cfg: nasid 0 is_voyager 0 check_cfg = 0
Global barrier (line 5300) \Global barrier passed.
Nasids in partition: +0
Regions in partition: +0
A 000: *** Entering POD mode on node 0
A 000: POD IOC3 Cac> error_dump
Hardware Error State: (Forced error dump)
+ Errors on node Nasid 0x0 (0)
+ XBow in /hw/module/174562
+ BEDROCK signalled following errors.
+ XBow Link a status register: 0xffffffff80020000
+ 17: Illegal destination
+ XBow error command word register: 0xffffffffaa008000
+ XBow error upper address register: 0x0
+ XBow error lower address register: 0x0
END Hardware Error State (Forced error dump)
A 000: POD IOC3 Cac> error
A 000: POD IOC3 Cac>
Soooooo...is this my error?
Code:
Errors on node Nasid 0x0 (0)
XBow in /hw/module/174562
BEDROCK signalled following errors.
XBow Link a status register: 0xffffffff80020000
17: Illegal destination
XBow error command word register: 0xffffffffaa008000
XBow error upper address register: 0x0
XBow error lower address register: 0x0
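To see which bits are actually set in those registers, here is a rough Python sketch. I'm assuming the leading ffffffff is just sign extension of a 32-bit register value when the PROM prints it, so only the low 32 bits matter. The only bit meaning I know is the one the dump itself names (17: Illegal destination); everything else is just raw bit positions.

```python
# Register values copied from the two error_dump outputs above.
STATUS = 0xFFFFFFFF80020000      # XBow Link a status register (same in both dumps)
CMD_FIRST = 0xFFFFFFFFAA018000   # error command word, first dump
CMD_SECOND = 0xFFFFFFFFAA008000  # error command word, after "reset all"


def low32(value):
    """Keep only the low 32 bits (assumes the upper half is sign extension)."""
    return value & 0xFFFFFFFF


def set_bits(value):
    """List the positions of the bits that are set in a 32-bit value."""
    return [bit for bit in range(32) if (value >> bit) & 1]


status_bits = set_bits(low32(STATUS))
changed_bits = set_bits(low32(CMD_FIRST) ^ low32(CMD_SECOND))

print("status bits set:", status_bits)            # [17, 31]
print("command word bits that changed:", changed_bits)  # [16]
```

So the status register has bit 17 set (the "Illegal destination" the dump reports) plus bit 31, and the only difference in the command word between the two dumps is bit 16. I don't know what bit 31 or bit 16 mean without the register documentation.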
Also, while using "leds" on the L1 during all this, there was one instance where it said something like "invalid icache", but I didn't get it in a capture.
So, worst fears realized...damage to V10, damage to mainboard? Possibly the PIMM...I could still stick in the low voltage PIMM I got from noguri as a test PIMM?
Can anyone shed light on this error? I mean from a "modular" perspective...it's NOT the V10 holding me back (right now). So either mainboard or PIMM. I have a sort of working PIMM I can test with...do you think that would yield any positive results? I'm asking only due to the power log errors...as one was 12V IO but another WAS the PIMM0 1.5V rail. Maybe the PIMM is shorted and during startup it's connected, then it gets disconnected due to not being needed yet, and 1.5V returns to normal before any real OS boot?
Grasping at straws...but there's nothing else to grasp right now...