PIMM shot? -
s0ke - 02-03-2024
Is my pimm done for?
Recently picked up a Fuel from @kshuff at a fantastic price. The PSU was replaced (kuba + weblacky), and they also replaced the monitoring chip on my V10. After troubleshooting I then purchased a new V10, as it seemed the GFX was borked. But I'm still having issues after replacing all the parts I thought would fix it. Seems like the CPU may be the problem.
Off:
Code:
001a01-L1>env
Environmental monitoring is enabled and running.
Description     State       Warning Limits     Fault Limits       Current
--------------  ----------  -----------------  -----------------  -------
12V             Wait Pwr    10% 10.80/ 13.20   20%  9.60/ 14.40      0.00
12V IO          Wait Pwr    10% 10.80/ 13.20   20%  9.60/ 14.40      0.00
5V              Wait Pwr    10%  4.50/  5.50   20%  4.00/  6.00      0.18
3.3V            Wait Pwr    10%  2.97/  3.63   20%  2.64/  3.96      0.58
2.5V            Wait Pwr    10%  2.25/  2.75   20%  2.00/  3.00      0.00
1.5V            Wait Pwr    10%  1.35/  1.65   20%  1.20/  1.80      0.00
5V aux          Wait Pwr    10%  4.50/  5.50   20%  4.00/  6.00      5.02
3.3V aux        Wait Pwr    10%  2.97/  3.63   20%  2.64/  3.96      3.30
PIMM0 12V bias  Wait Pwr    10% 10.80/ 13.20   20%  9.60/ 14.40      0.00
Asterix SRAM    Wait Pwr    10%  2.25/  2.75   20%  2.00/  3.00      0.05
Asterix CPU     Wait Pwr    10%  1.44/  1.76   20%  1.28/  1.92      0.01
PIMM0 1.5V      Wait Pwr    10%  1.35/  1.65   20%  1.20/  1.80      0.04
PIMM0 3.3V aux  Wait Pwr    10%  2.97/  3.63   20%  2.64/  3.96      3.30
PIMM0 5V aux    Wait Pwr    10%  4.50/  5.50   20%  4.00/  6.00      4.99
XIO 12V bias    Wait Pwr    10% 10.80/ 13.20   20%  9.60/ 14.40      0.00
XIO 5V          Wait Pwr    10%  4.50/  5.50   20%  4.00/  6.00      0.18
XIO 2.5V        Wait Pwr    10%  2.25/  2.75   20%  2.00/  3.00      0.00
XIO 3.3V aux    Wait Pwr    10%  2.97/  3.63   20%  2.64/  3.96      3.30

Description     State       Warning RPM  Current RPM
--------------  ----------  -----------  -----------
FAN 0 EXHAUST   Wait Pwr            920            0
FAN 1 HD        Wait Pwr           1560            0
FAN 2 PCI       Wait Pwr           1120            0
FAN 3 XIO 1     Wait Pwr           1600            0
FAN 4 XIO 2     Wait Pwr           1600            0
FAN 5 PS        Wait Pwr           1600            0

                            Advisory   Critical   Fault      Current
Description     State       Temp       Temp       Temp       Temp
--------------  ----------  ---------  ---------  ---------  ---------
NODE 0          Wait Pwr    60C/140F   65C/149F   70C/158F   30C/ 86F
NODE 1          Wait Pwr    60C/140F   65C/149F   70C/158F   29C/ 84F
NODE 2          Wait Pwr    60C/140F   65C/149F   70C/158F   24C/ 75F
PIMM            Wait Pwr    60C/140F   65C/149F   70C/158F   32C/ 89F
ODYSSEY         Wait Pwr    60C/140F   65C/149F   70C/158F   26C/ 78F
BEDROCK         Wait Pwr    Not currently available
On:
Code:
001a01-L1>env
Environmental monitoring is enabled and running.
Description     State       Warning Limits     Fault Limits       Current
--------------  ----------  -----------------  -----------------  -------
12V             Enabled     10% 10.80/ 13.20   20%  9.60/ 14.40     11.94
12V IO          Enabled     10% 10.80/ 13.20   20%  9.60/ 14.40     12.00
5V              Enabled     10%  4.50/  5.50   20%  4.00/  6.00      5.07
3.3V            Enabled     10%  2.97/  3.63   20%  2.64/  3.96      3.35
2.5V            Enabled     10%  2.25/  2.75   20%  2.00/  3.00      2.47
1.5V            Enabled     10%  1.35/  1.65   20%  1.20/  1.80      1.47
5V aux          Enabled     10%  4.50/  5.50   20%  4.00/  6.00      4.99
3.3V aux        Enabled     10%  2.97/  3.63   20%  2.64/  3.96      3.30
PIMM0 12V bias  Enabled     10% 10.80/ 13.20   20%  9.60/ 14.40     12.00
Asterix SRAM    Enabled     10%  2.25/  2.75   20%  2.00/  3.00      2.54
Asterix CPU     Enabled     10%  1.44/  1.76   20%  1.28/  1.92      1.59
PIMM0 1.5V      Enabled     10%  1.35/  1.65   20%  1.20/  1.80      1.49
PIMM0 3.3V aux  Enabled     10%  2.97/  3.63   20%  2.64/  3.96      3.30
PIMM0 5V aux    Enabled     10%  4.50/  5.50   20%  4.00/  6.00      4.99
XIO 12V bias    Enabled     10% 10.80/ 13.20   20%  9.60/ 14.40     11.88
XIO 5V          Enabled     10%  4.50/  5.50   20%  4.00/  6.00      5.07
XIO 2.5V        Enabled     10%  2.25/  2.75   20%  2.00/  3.00      2.47
XIO 3.3V aux    Enabled     10%  2.97/  3.63   20%  2.64/  3.96      3.30

Description     State       Warning RPM  Current RPM
--------------  ----------  -----------  -----------
FAN 0 EXHAUST   Enabled             920         1210
FAN 1 HD        Enabled            1560         2229
FAN 2 PCI       Enabled            1120         1558
FAN 3 XIO 1     Enabled            1600         2185
FAN 4 XIO 2     Enabled            1600         4611
FAN 5 PS        Enabled            1600         2214

                            Advisory   Critical   Fault      Current
Description     State       Temp       Temp       Temp       Temp
--------------  ----------  ---------  ---------  ---------  ---------
NODE 0          Enabled     60C/140F   65C/149F   70C/158F   29C/ 84F
NODE 1          Enabled     60C/140F   65C/149F   70C/158F   29C/ 84F
NODE 2          Enabled     60C/140F   65C/149F   70C/158F   24C/ 75F
PIMM            Enabled     60C/140F   65C/149F   70C/158F   31C/ 87F
ODYSSEY         Enabled     60C/140F   65C/149F   70C/158F   26C/ 78F
BEDROCK         Enabled     60C/140F   65C/149F   70C/158F   31C/ 87F
001a01-L1>log
02/02/24 17:12:26 checking power status
02/02/24 17:12:26 power up (COMMAND)
02/02/24 17:12:31 reset again MIPS
02/02/24 17:12:40 SMP_WQUE Q full (evt)
02/02/24 17:12:40 SMP_WQUE Q avail (lost 0 req, 0 rsp, 3 evt, 0 repl evt, 3 TOTAL)
No issues when viewed from L1 but when I switch to attempt to view the PROM I see the following:
Code:
returning to console mode 001a01 CPU0, <CTRL_T> to escape to L1
no response from 001a01 CPU0, system not responding
no response from 001a01 CPU0, system not responding
Code:
001a01-L1>leds
CPU A: 0x80: POD Mode (0x80/0xBC=okay, solid 0x80=possibly hung polling UART)
0x8c: FLED_GENERAL: General exception.
CPU B: 0x7b: unknown LED status.
CPU C: 0x33: PLED_INV_SCACHE
CPU D: 0xfa: unknown LED status.
Is the PIMM done for?
RE: PIMM shot? -
weblacky - 02-03-2024
Hi,
FYI, the "CPU0" label is a generic way of saying "the running system" as opposed to the L1; it's not literally a CPU error. I've seen this before when the PROM doesn't boot for some reason or the PROM output has been redirected. Your PIMM voltages look good, so I doubt it's literally the PIMM. It's the POST/boot processing stalling out. On all Fuels the PROM output screen that is supposed to be there is called "001a01 CPU0", so you get that message whenever the PROM is just not running where expected.
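To illustrate why those readings count as healthy: each rail in the `env` output has a 10% warning band and a 20% fault band around a nominal value. Below is a minimal Python sketch of that check using a few of the "On" readings from the first post. The function name `classify` is hypothetical, and the nominal values are assumptions inferred from the limits columns (for example, 1.44/1.76 implies 1.6 V nominal for Asterix CPU). It mirrors the limits as printed, not the L1 firmware's actual logic.

```python
def classify(nominal, reading, warn=0.10, fault=0.20):
    """Return 'ok', 'warning', or 'fault' for a rail reading.

    warn/fault are fractional deviations from nominal, matching the
    10% and 20% columns shown by the L1 'env' command.
    """
    deviation = abs(reading - nominal) / nominal
    if deviation > fault:
        return "fault"
    if deviation > warn:
        return "warning"
    return "ok"


# (assumed nominal, reading from the powered-on env output)
readings = {
    "12V": (12.0, 11.94),
    "3.3V": (3.3, 3.35),
    "Asterix SRAM": (2.5, 2.54),
    "Asterix CPU": (1.6, 1.59),
    "PIMM0 1.5V": (1.5, 1.49),
}

for rail, (nominal, reading) in readings.items():
    print(f"{rail:14s} {reading:6.2f} V  {classify(nominal, reading)}")
```

Every rail listed here comes out as `ok`, which is consistent with the PIMM power being fine while the boot process stalls for some other reason.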
Did the Front case LED change behavior?
There is no PROM until stage 2 startup, though the fans should only be running at stage 2. Try the "reset button" idea I mentioned. Also try putting the serial cable for the PROM output on both external serial ports in case there is some form of output.
I can tell you concretely that on the problem Fuel mainboards I'm working on, the PROM almost NEVER appears on the L1 connector using the Ctrl+T/D commands. It's always either hiding entirely or on one of the external serial ports. On some I still haven't found it, but luckily I got past my issue and into the graphical PROM by using reset-button restarts in cases where it was stopping due to SNAPHAT battery failure and NVRAM reloads.
Refer to threads like this:
https://gainos.org/~elf/sgi/nekonomicon/forum/users/JohnK/1.html
Same "error" you are getting, yet it's only because the PROM output had been redirected! I swear the VAST majority of the Fuel mainboards I see don't have L1 PROM output. Also, PROM output is very tricky: you MUST have the terminal software open and "off hook" BEFORE you turn on the Fuel, or often the Fuel doesn't see that it has a terminal hooked up and doesn't output anything for PROM serial output. It happens so often to me that I nearly pull my hair out over PROM issues where I cannot get the terminal "up".
RE: PIMM shot? -
s0ke - 02-03-2024
So everything seems ok when hooked to the L1 on the mainboard:
Code:
001a01-L1>log
02/02/24 18:30:13 checking power status
02/02/24 18:30:13 power up (COMMAND)
02/02/24 18:30:18 reset again MIPS
I attempted the reset but nothing changes.
Code:
001a01-L1>log
02/02/24 18:30:13 checking power status
02/02/24 18:30:13 power up (COMMAND)
02/02/24 18:30:18 reset again MIPS
02/02/24 18:32:09 NMI...
02/02/24 18:32:09 NMI (PANEL)
02/02/24 18:32:09 NMI done
Code:
001a01-L1>log clear
log reset.
But then hooking up to serial 1 I get nothing. I power on and still see nothing. So I power off, connect back to the L1 on the mainboard, then power up again.
Code:
001a01-L1>log
02/02/24 18:33:20 Asterix SRAM low fault limit reached 0.000V.
02/02/24 18:33:34 L1 booting...
02/02/24 18:33:34 USB0: waiting on open
02/02/24 18:33:49 Reset...
02/02/24 18:33:49 reset (PANEL)
02/02/24 18:33:50 Reset done
02/02/24 18:33:54 Reset...
02/02/24 18:33:54 reset (PANEL)
02/02/24 18:33:54 Reset done
02/02/24 18:33:55 NMI...
02/02/24 18:33:55 NMI (PANEL)
02/02/24 18:33:55 NMI done
02/02/24 18:33:56 Reset...
02/02/24 18:33:56 reset (PANEL)
02/02/24 18:33:56 Reset done
02/02/24 18:33:57 checking power status
02/02/24 18:33:57 power up (PANEL)
02/02/24 18:34:02 reset again MIPS
02/02/24 18:35:09 1.5V low fault limit reached 0.719V.
02/02/24 18:35:44 L1 booting...
02/02/24 18:35:44 USB0: waiting on open
Code:
001a01-L1>env
Environmental monitoring is enabled and running.
Description     State       Warning Limits     Fault Limits       Current
--------------  ----------  -----------------  -----------------  -------
12V             Enabled     10% 10.80/ 13.20   20%  9.60/ 14.40     11.94
12V IO          Enabled     10% 10.80/ 13.20   20%  9.60/ 14.40     12.00
5V              Enabled     10%  4.50/  5.50   20%  4.00/  6.00      5.07
3.3V            Enabled     10%  2.97/  3.63   20%  2.64/  3.96      3.35
2.5V            Enabled     10%  2.25/  2.75   20%  2.00/  3.00      2.47
1.5V            Enabled     10%  1.35/  1.65   20%  1.20/  1.80      1.47
5V aux          Enabled     10%  4.50/  5.50   20%  4.00/  6.00      4.99
3.3V aux        Enabled     10%  2.97/  3.63   20%  2.64/  3.96      3.30
PIMM0 12V bias  Enabled     10% 10.80/ 13.20   20%  9.60/ 14.40     11.94
Asterix SRAM    Enabled     10%  2.25/  2.75   20%  2.00/  3.00      2.54
Asterix CPU     Enabled     10%  1.44/  1.76   20%  1.28/  1.92      1.61
PIMM0 1.5V      Enabled     10%  1.35/  1.65   20%  1.20/  1.80      1.49
PIMM0 3.3V aux  Enabled     10%  2.97/  3.63   20%  2.64/  3.96      3.30
PIMM0 5V aux    Enabled     10%  4.50/  5.50   20%  4.00/  6.00      4.97
XIO 12V bias    Enabled     10% 10.80/ 13.20   20%  9.60/ 14.40     11.81
XIO 5V          Enabled     10%  4.50/  5.50   20%  4.00/  6.00      5.07
XIO 2.5V        Enabled     10%  2.25/  2.75   20%  2.00/  3.00      2.47
XIO 3.3V aux    Enabled     10%  2.97/  3.63   20%  2.64/  3.96      3.30

Description     State       Warning RPM  Current RPM
--------------  ----------  -----------  -----------
FAN 0 EXHAUST   Enabled             920         1228
FAN 1 HD        Enabled            1560         2229
FAN 2 PCI       Enabled            1120         1573
FAN 3 XIO 1     Enabled            1600         2185
FAN 4 XIO 2     Enabled            1600         4549
FAN 5 PS        Enabled            1600         2229

                            Advisory   Critical   Fault      Current
Description     State       Temp       Temp       Temp       Temp
--------------  ----------  ---------  ---------  ---------  ---------
NODE 0          Enabled     60C/140F   65C/149F   70C/158F   36C/ 96F
NODE 1          Enabled     60C/140F   65C/149F   70C/158F   33C/ 91F
NODE 2          Enabled     60C/140F   65C/149F   70C/158F   27C/ 80F
PIMM            Enabled     60C/140F   65C/149F   70C/158F   45C/113F
ODYSSEY         Enabled     60C/140F   65C/149F   70C/158F   31C/ 87F
BEDROCK         Enabled     60C/140F   65C/149F   70C/158F   39C/102F
I have another serial adapter I will try. But all I see is junk when using serial 1/2 on the back. But I'm assuming the baud rate is the same.
RE: PIMM shot? -
weblacky - 02-03-2024
I've PMed you.
"Junk" on serial output is normally a good sign! It normally means it's trying to talk to you, but with different baud settings than what you've used in the terminal application. Try other baud settings, starting at 9600 and going higher, on the external serial port you saw junk on until things look normal.
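A rough way to judge "looks normal" without eyeballing it is to score how much of a capture is printable ASCII at each baud setting. The Python sketch below assumes you have already saved a short capture per baud rate from your terminal program. The rate list, the 0.9 threshold, the function names, and the sample bytes are all made up for illustration, and actually reading the port (for example with a third-party serial library) is not shown.

```python
import string
from typing import Optional

# Common serial rates to step through, lowest first (an assumed list).
CANDIDATE_BAUDS = [9600, 19200, 38400, 57600, 115200]

# Byte values that count as readable text (letters, digits, punctuation, whitespace).
PRINTABLE = set(string.printable.encode("ascii"))


def printable_ratio(data: bytes) -> float:
    """Fraction of bytes in the capture that are printable ASCII."""
    if not data:
        return 0.0
    return sum(b in PRINTABLE for b in data) / len(data)


def looks_like_text(data: bytes, threshold: float = 0.9) -> bool:
    """True when the capture is mostly readable, i.e. the baud rate is probably right."""
    return printable_ratio(data) >= threshold


def first_readable(captures: dict) -> Optional[int]:
    """Return the lowest candidate baud rate whose capture reads as text, if any."""
    for baud in CANDIDATE_BAUDS:
        if baud in captures and looks_like_text(captures[baud]):
            return baud
    return None


# Illustrative captures only: readable menu text versus mismatched-rate junk.
good = b"System Maintenance Menu\r\n1) Start System\r\n"
junk = bytes([0xF8, 0x80, 0x00, 0xFE, 0x78, 0x80, 0xE0, 0x06])

print(first_readable({9600: junk, 38400: good}))
```

With these sample captures the function skips the junk at 9600 and reports 38400, the first rate at which the output reads as text.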
RE: PIMM shot? -
s0ke - 02-03-2024
(02-03-2024, 01:56 AM)weblacky Wrote: I've PMed you.
"Junk" on serial output is normally a good sign! It normally means it's trying to talk to you, but with different baud settings than what you've used in the terminal application. Try other baud settings, starting at 9600 and going higher, on the external serial port you saw junk on until things look normal.
Thanks weblacky! Helped me out with the POD.
Code:
L1_CMD> pod
POD>go cac
POD>clearalllogs
POD>initalllogs
POD>flush
POD>reset
Good to go now. Much appreciated.
Code:
System Maintenance Menu
1) Start System
2) Install System Software
3) Run Diagnostics
4) Recover System
5) Enter Command Monitor
RE: PIMM shot? -
weblacky - 02-03-2024
When you get things up I recommend you:
A. Set the correct date, time, and TZ (timezone) in L1 for future ease, using the L1 commands TZ (PST, EST, DST) and DATE; use the help command on L1 to find the syntax.
B. Do HINV at the terminal and check RAM and all that (no odd sizes); verify CPU speed as well.
C. Use the PROM terminal to set DATE for the OS/SNAPHAT date and time, so the OS install doesn't use an erroneous install date.
D. Install IRIX 6.5.30.
E. Check the OS clock for correct date/time, then shut down the system and pull wall power with the rocker switch on your ATX mod PSU. Check if the OS loses date and time. If it does, you need a new SNAPHAT (yellow chip to the right of the RAM slots). They are still made and can often be purchased via Digikey or Mouser; you'll want the "bigger, beefier" version than the one it came with (longer life). Check back with us on that.
F. Afterwards, contact me for help with L1 firmware upgrading. jwhat made a great file archive for this and I can get it to you via Google Drive share or something; it makes it so much easier to get the upgrades!
G. Save the L1 env output from before and AFTER booting for your records in case of future issues.
H. Enjoy, it only gets better from here.
Cheers!
RE: PIMM shot? -
weblacky - 02-03-2024
For posterity: the issue ended up being that the PROM console had moved to one of the external serial ports and changed settings, as we frequently find. The "junk" that s0ke saw on one of the external serial ports was (as I had hoped) the PROM's attempt at serial communications, but at different settings. I advised hooking back up to the external serial port and trying 9600 and up on that port until the text looked normal. Upon trying 9600 baud, s0ke said it all looked normal and could see the booting PROM.
What was then discovered is that the PROM was stopping at POD. It was unclear whether it was stopping due to its own error or due to a previous request for debug on the L1. Regardless, a full reset was performed in PROM, which of course lost the ability to see the PROM at the current port and settings.
I directed that the null modem serial cable be reattached to the internal L1 serial port, and that the PROM should now appear through that interface at the normal 38,400 speed. That ended up being the case, and now s0ke is at the point where both L1 and PROM come through the L1 port using the Ctrl+T/D controls and it boots all the way into PROM, presenting the PROM options shown above.
DVI Monitor and peripheral hook up as well as OS loading and further firmware/settings are TBD at this point.
RE: PIMM shot? -
Raion - 02-03-2024
Please do not pyramid quotes. If you're going to quote a post that already contains a quote pyramid, you need to remove all of the previous quotes from it.
RE: PIMM shot? -
kshuff - 02-03-2024
Gee and I thought it was just a dead PS. Sorry s0ke, you didn't pull all your hair out, did you?
RE: PIMM shot? -
s0ke - 02-04-2024
Was worth the effort and extra parts for the price. Needs a new snaphat too. But it lives!
Code:
hinv -vm
Location: /hw/module/001c01/node
IP34 Board: barcode MNW845 part 030-1707-003 rev -H
Location: /hw/module/001c01/node/cpubus/0
IP34PIMM Board: barcode MEP885 part 030-1730-001 rev -F
Location: /hw/module/001c01/Ibrick/xtalk/13
ASTODYB Board: barcode MSE145 part 030-1725-001 rev -F
Location: /hw/module/001c01/Ibrick/xtalk/14
IP34 Board: barcode MNW845 part 030-1707-003 rev -H
Location: /hw/module/001c01/Ibrick/xtalk/15
IP34 Board: barcode MNW845 part 030-1707-003 rev -H
1 600 MHZ IP35 Processor
CPU: MIPS R14000 Processor Chip Revision: 2.3
FPU: MIPS R14010 Floating Point Chip Revision: 2.3
CPU 0 at Module 001c01/Slot 0/Slice A: 600 Mhz MIPS R14000 Processor Chip (enabled)
Processor revision: 2.3. Scache: Size 4 MB Speed 300 Mhz Tap 0xa
Main memory size: 1536 Mbytes
Instruction cache size: 32 Kbytes
Data cache size: 32 Kbytes
Secondary unified instruction/data cache size: 4 Mbytes
Memory at Module 001c01/Slot 0: 1536 MB (enabled)
Bank 0 contains 256 MB (Standard) DIMMS (enabled)
Bank 1 contains 256 MB (Standard) DIMMS (enabled)
Bank 2 contains 512 MB (Standard) DIMMS (enabled)
Bank 3 contains 512 MB (Standard) DIMMS (enabled)
Integral SCSI controller 2: Version IEEE1394 SBP2
Integral SCSI controller 0: Version QL12160, low voltage differential
Disk drive: unit 1 on SCSI controller 0 (unit 1)
Integral SCSI controller 1: Version QL12160, single ended
CDROM: unit 6 on SCSI controller 1
Integral SCSI controller 3: Version SAS/SATA LS1068
Disk drive: unit 3 on SCSI controller 3 (unit 3)
IOC3/IOC4 serial port: tty3
IOC3/IOC4 serial port: tty4
IOC3 parallel port: plp1
Graphics board: V10
Integral Fast Ethernet: ef0, version 1, module 001c01, pci 4
Iris Audio Processor: version MAD revision 1, number 1
PCI Adapter ID (vendor 0x104c, device 0x8024) PCI slot 1
PCI Adapter ID (vendor 0x1077, device 0x1216) PCI slot 1
PCI Adapter ID (vendor 0x1000, device 0x0054) PCI slot 2
PCI Adapter ID (vendor 0x1412, device 0x1724) PCI slot 3
PCI Adapter ID (vendor 0x10a9, device 0x0003) PCI slot 4
PCI Adapter ID (vendor 0x11c1, device 0x5802) PCI slot 5
HUB in Module 001c01/Slot 0: Revision 2 Speed 200.00 Mhz (enabled)
IP35prom in Module 001c01/Slot n0: Revision 6.210
DMediaPro DM10 FW option: unit 0, revision 1.1.0
USB controller: type OHCI
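As a quick cross-check of the "no odd sizes" advice from step B, the bank sizes in the hinv output can be summed and compared against the reported main memory. This is a minimal Python sketch working on lines copied from the output above; the function name `memory_summary` and the regular expressions are assumptions fitted to this one listing, not a general hinv parser.

```python
import re

# Lines copied from the hinv -vm output in this post.
HINV_EXCERPT = """\
Main memory size: 1536 Mbytes
Bank 0 contains 256 MB (Standard) DIMMS (enabled)
Bank 1 contains 256 MB (Standard) DIMMS (enabled)
Bank 2 contains 512 MB (Standard) DIMMS (enabled)
Bank 3 contains 512 MB (Standard) DIMMS (enabled)
"""


def memory_summary(hinv_text):
    """Return (reported total in MB, list of per-bank sizes in MB)."""
    total = int(re.search(r"Main memory size: (\d+) Mbytes", hinv_text).group(1))
    banks = [int(mb) for mb in re.findall(r"Bank \d+ contains (\d+) MB", hinv_text)]
    return total, banks


total, banks = memory_summary(HINV_EXCERPT)
print(f"reported {total} MB, banks {banks}, match: {sum(banks) == total}")
```

Here the four banks (256 + 256 + 512 + 512) add up to the reported 1536 MB, so all installed memory is being counted.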