Hi Weblacky,
Here is my log; you can see the clear difference in behavior.
First, with debug 0x7890:
>> 001a01:
>> debug switches set to 0x7890
>> ?-XXX.XXX.XXX.143-L2>l1 power up
>> ?-XXX.XXX.XXX.143-L2>
>> entering system console mode (001a01 CPU0), <CTRL_T> to escape to L2
>> *** DIP switch 15 set. Will skip IO and NUMAlink discovery
>>
>>
>> IP35 PROM SGI Version 6.211 built 04:16:18 PM Jan 25, 2008
>> Running in DDR mode
>> *** Warning: System controller debug switches are non-zero (0x7890)
>> *** Boot stop requested at Local (1)
>> *** Giving up global master status
>> Testing/Initializing memory ............... DONE
>> Copying PROM code to memory ............... DONE
>> Discovering NUMAlink connectivity .........
>> Local hub NUMAlink is down.
>> *** Local network link down
>> DONE
>> Found 1 objects (1 hubs, 0 routers) in 5886 usec
>> Waiting for peers to complete discovery.... DONE
>> No other nodes present; becoming global master
>> Global master is /hw/rack/001/bay/01
>> Intializing any CPUless nodes.............. DONE
>> Checking partitioning information ......... DONE
>> No other nodes present; becoming partition master
>> Suppressing error state display (system just powered on).
>> A 000 001c01:
>> A 000 001c01: *** Entering POD mode on node 0
>> A 000 001c01: POD SysCt Cac>
And now with what we have pretty much always used, debug 0x10d:
>> escaping to L2 system controller
>> ?-XXX.XXX.XXX.143-L2>debug 0x10d
>> 001a01:
>> debug switches set to 0x010d
>>
>> re-entering system console mode (001a01 CPU0), <CTRL_T> to escape to L2
>>
>> A 000 001c01: POD SysCt Cac> reset
>> A 000 001c01: Resetting the system...
>> Starting PROM Boot process
>> hubii_link_good: A-brick attached to module 001c01.
>> HUB at 0x0 attached as widget 0xa
>> 001c01/0xa/xbow_arb: nasid= 0x0 xbow_base= 0x9200000000000000
>> 001c01/0xa/xbow_arb: 622 master is 0xa
>> Check_master: link 10 is master
>> hubii_link_good: A-brick attached to module 001c01.
>> Check_master: link 10 is master
>>
>>
>> IP35 PROM SGI Version 6.211 built 04:16:18 PM Jan 25, 2008
>> built for bedrock rev. 1.1 or greater
>> running in IP34 mode
>> Running in DDR mode
>> Local master CPU A revision: f42
>> PROM length: 0x168648, BSS length: 0xa7a0, flash count: 16
>> Configured bedrock clock: 200.0 MHz
>> Status of local IO: 0x1 0x3fc0fff6403
>> Bedrock Rev: 2, Module: 1 (001c01) from Sys Ctlr
>> On PROM entry: ERR_EPC=0xffffffffbfc00300 (0xc00000001fc00300)
>> Configuring memory
>> Local memory configured: 4096 MB (premium)
>> *** Warning: System controller debug switches are non-zero (0x10d)
>> *** Diag level set to None (2)
>> *** Info level set to verbose
>> *** Boot stop requested at Global (2)
>> before reading NICHub NIC: 0x52275dad
>> SR1 set to 0x0000081698349000
>> SR0 set to 0x0000000052275dad
>> Testing/Initializing memory ............... DONE
>> Copying PROM code to memory ............... Copy PROM (0x9000000018000000) to RAM (0x9600000001a00000), len 0x168648
>> Done
>> DONE
>> Skipping secondary cache diags
>> CPU A switching stack into UALIAS and invalidating D-cache
>> CPU A switching into node 0 cached RAM
>> CPU A running cached
>> Initializing kldir.
>> Done initializing kldir.
>> Initializing klconfig.
>> init_klcfg: nasid 0 start 9600000000030000 size 10000
>> Done initializing klconfig.
>> Discovering local IO ...................... Check_master: link 10 is master
>> Check_master: link 10 is master
>> DONE
>> CPU A initialized subnode
>> Discovering NUMAlink connectivity .........
>> Local hub NUMAlink is down.
>> *** Local network link down
>> DONE
>> Found 1 objects (1 hubs, 0 routers) in 5889 usec
>> Waiting for peers to complete discovery.... Discovery results:
>> ENTRY 0: HUB(52275dad)
>> NASID=-1 Mod=1 Flg=0x9500000 PROM=6.211 Route=N/A
>> MODULE=001c01 PARTITION=0 SPACE=RESET
>> Port 1 connection: Not connected
>> Port status: NF
>> DONE
>> No other nodes present; becoming global master
>> Global master is entry 0, NIC 0x52275dad, /hw/rack/001/bay/01
>> Global master is /hw/rack/001/bay/01
>> Global barrier (line 4315)Global barrier passed.
>> Global barrier (line 4348)Global barrier passed.
>> Master System Topology Graph (pre-nasid_assign):
>> ENTRY 0: HUB(52275dad)
>> NASID=-1 Mod=1 Flg=0x9500000 PROM=6.211 Route=N/A
>> MODULE=001c01 PARTITION=0 SPACE=RESET
>> Port 1 connection: Not connected
>> Port status: NF
>> Calculating NASIDs
>> num_routers is 0
>> Master System Topology Graph:
>> ENTRY 0: HUB(52275dad)
>> NASID=0 Mod=1 Flg=0x9500000 PROM=6.211 Route=N/A
>> MODULE=001c01 PARTITION=0 SPACE=RESET
>> Port 1 connection: Not connected
>> Port status: NF
>> Distributing routing tables
>> Distributing NASIDs
>> *** NASID assigned to 0
>> CPU A switching to UALIAS
>> CPU A running in UALIAS
>> Changing node ID to 0
>> Global barrier (line 4823)Global barrier passed.
>> CPU A Flushing and invalidating caches
>> Global barrier (line 4928)Global barrier passed.
>> CPU A switching to node 0 cached RAM
>> CPU A running cached
>> Nasids in partition: +0
>> Regions in partition: +0
>> Intializing any CPUless nodes.............. Global barrier (line 7714)Global barrier passed.
>> Global barrier (line 7715)Global barrier passed.
>> DONE
>> Global barrier (line 5089)Global barrier passed.
>> hubii_link_good: A-brick attached to module 001c01.
>> Checking partitioning information ......... DONE
>> No other nodes present; becoming partition master
>> *** After partitioning ***
>> ENTRY 0: HUB(52275dad)
>> NASID=0 Mod=1 Flg=0x9500000 PROM=6.211 Route=N/A
>> MODULE=001c01 PARTITION=0 SPACE=RESET
>> Port 1 connection: Not connected
>> Port status: FE
>> Erecting partition fences ................ DONE
>> Update config for routers connected to hubs
>> Update config for hubs and hubless routers
>> CPU A flushing cache
>> check_router_cfg: nasid 0 is_voyager 0 check_cfg = 0
>> Global barrier (line 5300)Global barrier passed.
>> Nasids in partition: +0
>> Regions in partition: +0
>> A 000 001c01:
>> A 000 001c01: *** Entering POD mode on node 0
>> A 000 001c01: POD SysCt Cac>
You can clearly see all the extra output that occurs with the more complete boot sequence.
EDIT #2: On further checking of the set values, the change is due to flags suppressing log output. I fixed the debug flag calculator to generate the right flags.
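To make the difference between the two debug values concrete, here is a minimal Python sketch that lists which bit positions are set in each value. The function names (set_bits, compare_flags) are made up for illustration, and bit positions are counted from 0 at the least significant end. It does not claim to know what any individual bit means to the PROM; it only shows which bits differ.

```python
def set_bits(value: int) -> list[int]:
    """Return the 0-based positions of the bits set in value, lowest first."""
    return [i for i in range(value.bit_length()) if (value >> i) & 1]


def compare_flags(a: int, b: int) -> dict[str, list[int]]:
    """Split the set bits of two flag words into three groups."""
    return {
        "only_in_a": set_bits(a & ~b),  # bits set in a but not in b
        "only_in_b": set_bits(b & ~a),  # bits set in b but not in a
        "in_both": set_bits(a & b),     # bits set in both values
    }


if __name__ == "__main__":
    # The two values from the logs above.
    for name, bits in compare_flags(0x7890, 0x010D).items():
        print(f"{name}: {bits}")
```

For 0x7890 against 0x010d this reports bits 4, 7, 11, 12, 13 and 14 only in the first value, bits 0, 2, 3 and 8 only in the second, and no bits in common, so the two settings do not overlap at all.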
I have created a "debug flag" cheat sheet. I will post it once I can render it as an easily readable graphic.
And here is my "Dip Switch Calculator" (Excel spreadsheet).
EDIT #1: Looking at the more complete boot log, see "Configured bedrock clock: 200.0 MHz". This would appear to be the point where it gets data from the PROM speed configuration settings. Given the default values provided when doing a "flash" PROM update, if the machine does change clock as part of the boot process, then it is likely to start at 400 MHz (which is the default, and a lower speed than any sold configuration of Fuel).
EDIT #3: I found an error in the DIP switch calculations, caused by not ensuring the Least Significant Bit (LSB) order was correct (DIP switch IDs run LSB-first from left to right, but binary convention puts the LSB on the right, so the switches need to be read right to left). I fixed that in v0.2 of the spreadsheet.
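The LSB ordering issue can be sketched in a few lines of Python. This is not the spreadsheet's actual formula, just an illustration of the same idea. It assumes a 16-switch row written left to right as a string of '0' and '1' with the least significant switch first; the function names and the 16-bit width are assumptions for the example.

```python
def switches_to_value(switch_row: str) -> int:
    """Convert a row of switches (leftmost = LSB) into an integer."""
    value = 0
    for position, state in enumerate(switch_row):
        if state == "1":
            value |= 1 << position  # leftmost switch is bit 0
    return value


def value_to_switches(value: int, width: int = 16) -> str:
    """Convert an integer into a switch row with the LSB on the left."""
    return "".join("1" if (value >> i) & 1 else "0" for i in range(width))


if __name__ == "__main__":
    row = value_to_switches(0x010D)
    print(row)                                # switch row, LSB on the left
    print(f"{switches_to_value(row):#06x}")   # round trip back to hex
    # The original mistake: reading the row as an ordinary binary
    # number (MSB on the left) gives a different value.
    print(f"{int(row, 2):#06x}")
```

Reading the row for 0x010d as a normal binary number instead of LSB-first gives 0xb080, which is the kind of wrong value the ordering mistake produces.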
EDIT #4: When I set the debug flag to boot "memoryless" (debug 0x011f), the boot hangs. I tried with the keyboard/mouse plugged in and pulled out, and with the console via both L1 USB and the first serial port, but could not get the Fuel to boot directly into POD/DEX mode and so avoid the cache. It might be that you actually have to have a machine with NO RAM for this to work.
Cheers from Oz,
jwhat/John.