(06-03-2022, 02:06 AM)jwhat Wrote: Hi Weblacky and HarryT,
I have not commented on this thread as I have never had "General Exception Error" just the "*** TLB Refill Exception on node 0" one.
Given that you have replaced both the Snaphat & DALLAS, before proceeding can you confirm that the L1 config looks OK:
- Get L1 prompt and check: "date", "serial", "serial all" & "conf" .
- Then on the L1 check log: "log".
To get into POD/CAC mode you have to be able to power up the machine.
Typically this is done by first setting L1 "debug" flags "debug 0x10d" and then doing "power up".
The machine will boot into POD mode, but if "power up" fails then that will not work.
To get machine to then boot normally you need to reset the debug flag (via L1) back to 0 (zero).
If playing with L1 does not help, then you need to revisit Power Supply (as per Weblacky) or borrow an IP53 board from someone and retry with this.
I have had cases where my working SGI machine failed to reboot for no apparent reason (i.e. no physical/software change on my part) because of some undiagnosed component failure that is just a result of these being old machines.
I hope you can get the machine going again.
Cheers from Oz,
jwhat/John.
Yo Jwhat,
In the thread HarryT said he saw his full serial when using "serial all" that matched what was on his case label.
I don't see a geographic location listed on HarryT's profile, but if he's in the USA there may be hope for getting a test PSU in the future.
The Ctrl+D console isn't running at all (which I assume is the PROM code?).
I thought POD worked below the PROM; on my Fuel I got to POD at second-stage power-on (fan running and working ENV) but before the PROM loads...I thought?
Outside the complete reset, yeah, I'm not too certain here...the error has no information whatsoever. If I'm remembering right, his logs were unremarkable.
Your symptoms KIND OF match this old recovered thread:
http://archive.irixnet.org/apocrypha/nek...awa/1.html
I'm nearly 100% sure you CAN get into debug mode with your system as it bypasses CPU and DIMM checks...I thought.
HarryT claimed he hasn't moved his Tezro, hasn't upgraded anything, used it last Christmas and then now it's acting this way.
I really was thinking...disabled CPU...but his CPU command CLAIMS no CPUs have been disabled.
Regardless, I agree with jwhat, if a "full reset" doesn't "magically" fix it...there must be a hardware reason that will require some parts swapping.
HarryT, if you're up for it...why not just try the FULL board reset.
Enter DEBUG mode on the L1 (before power-up...really fast typing) with : debug 0x10d
pwr up
<TRY Ctrl+D now> if you see A 000 001c01: POD SysCt Cac>
go cac
clearalllogs
y
initalllogs
flush
debug 0
<Ctrl+T> back to L1 command prompt.
reset
This is the trick we know to fix a firmware-confused late-model SGI. If it's not a firmware issue and it's a hardware issue...well, then that's a more expansive problem, since there is zero documentation in that regard. Your power ENV generally looked fine, temps looked realistic, fan speeds as well. I don't have any rational explanation...see if the reset trick works for you. If it doesn't, then I guess the next step is either getting/borrowing a CHEAP CPU board from someone to test on your system, or a new mainboard (the L1 lives on the mainboard...I assume the PROM too?).
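If the fast typing is the hard part, the reset sequence above could be pre-built as the raw bytes a terminal program would send. This is only a rough sketch: the names (RESET_STEPS, encode_sequence) are made up for illustration, the carriage-return line ending is my assumption, and it only prints the bytes rather than talking to the L1. You would still need to watch for the POD prompt by hand before sending the "go cac" part.

```python
# Standard ASCII control characters: Ctrl+D is 0x04, Ctrl+T is 0x14.
CTRL_D = b"\x04"
CTRL_T = b"\x14"

# Assumption: each typed command is terminated with a carriage return.
LINE_END = b"\r"

# The steps from the post, in order. "cmd" is a typed line, "key" is a
# single control keystroke. Hypothetical structure, for illustration only.
RESET_STEPS = [
    ("cmd", "debug 0x10d"),
    ("cmd", "pwr up"),
    ("key", CTRL_D),       # only continue if the POD prompt shows up
    ("cmd", "go cac"),
    ("cmd", "clearalllogs"),
    ("cmd", "y"),
    ("cmd", "initalllogs"),
    ("cmd", "flush"),
    ("cmd", "debug 0"),
    ("key", CTRL_T),       # back to the L1 command prompt
    ("cmd", "reset"),
]


def encode_step(kind, value):
    """Return the bytes for one step: raw keystroke, or command plus line ending."""
    if kind == "key":
        return value
    return value.encode("ascii") + LINE_END


def encode_sequence(steps):
    """Encode every step, preserving the order given."""
    return [encode_step(kind, value) for kind, value in steps]


if __name__ == "__main__":
    for chunk in encode_sequence(RESET_STEPS):
        print(repr(chunk))
```

Keeping the steps as a plain list makes it easy to check the order against the post before anything gets sent to the machine.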