sgi tezro L1 General Exception on node 0
#41
RE: sgi tezro L1 General Exception on node 0
Hi HarryT,

Yes, there are three screws in front (near the array connector), three in back (from memory, next to the memory banks), and two on the side.

Just unscrew and lift out.

The Blue Plastic alignment posts ensure the IP53/IP59 drops back in and aligns with connectors when you put it back in.

I have done this many, many times and never had an issue, except in one case where the array connector itself was badly damaged and I needed to add a special spacer to compensate.

BTW, Geoman is a Germany-based member, I think, and he has a Tezro. If you need to test the board in another machine, he might be able to help.

Cheers from Oz,

jwhat/John
(This post was last modified: 06-26-2023, 06:22 PM by jwhat.)
jwhat
Octane/O350/Fuel User

Trade Count: (0)
Posts: 513
Threads: 29
Joined: Jul 2018
Location: Australia
06-09-2022, 11:24 PM
#42
RE: sgi tezro L1 General Exception on node 0
I had my own L1 nightmare: both the Snaphat and the Dallas ran out of power. Changing the Snaphat was no problem at all.
But changing the Dallas resulted in the loss of the serial number and a notice to email SGI for a new encrypted hardware key! So initially I thought all was lost, but here was the solution: connect a Windows NT/XP/7 notebook running the original "HyperTerminal" to the Tezro's serial port named "Console" and set a new serial number matching the one on the back of the machine. This might not be relevant to HarryT's problem, perhaps... Well, at least I had success connecting not to "L1-Port", "OIOI 1" or "OIOI 2" but to "Console" and entering commands there.



SGI - the legend will never die!!

Indy Indigo Crimson Indigo2 R10000/IMPACT Indigo2 R10000/IMPACT O2 O2 Octane Octane2 Octane2 Tezro
Geoman
Crimson to Tezro

Trade Count: (0)
Posts: 162
Threads: 13
Joined: May 2018
Location: Germany
06-10-2022, 05:57 PM
#43
RE: sgi tezro L1 General Exception on node 0
how did you make out with this?

Indigo2 IMPACT  : R10K-195MHz, 1GB RAM, 146GB 15K, CD-ROM, AudioDAT, MaxImpact w/ TRAM.  IRIX 6.5.22

O2 : R12K-400MHz, 1GB RAM, 300GB 15K, DVD-ROM, CRM Graphics, AV1/2 Media Boards & O2 Cam, DV-Link, FPA & SW1600.  IRIX 6.5.30

 : 2 x R14K-600MHz, 6GB RAM, V12 Graphics, PCI Shoebox.  IRIX 6.5.30

IBM  : 7012-39H, 7043-140

chulofiasco
Hardware Junkie

Trade Count: (0)
Posts: 328
Threads: 51
Joined: May 2019
Location: New York, NY
06-26-2023, 03:35 PM
#44
RE: sgi tezro L1 General Exception on node 0
That's weird, I've had the Dallas battery on my Octane2 go out on several occasions, and just replacing it solved, basically, what was just an epoch problem. It never complained about the system serial number. I have no idea what a "Snaphat" is, guess I'd better dig into the documentation in case I ever have to deal with that. SGI was notorious for making system administration as goofy as possible to try to earn more money from field service being sent out to fix stuff the end users couldn't figure out.

Project: Temporarily lost at sea
Plan: World domination! Or something...
vishnu
Tezro, Octane2, 2 x Onyx4

Trade Count: (0)
Posts: 1,247
Threads: 42
Joined: Dec 2017
Location: Minneapolis, Minnesota USA
06-26-2023, 03:43 PM
#45
RE: sgi tezro L1 General Exception on node 0
We spent a great deal of effort and time documenting this area:

http://forums.irixnet.org/thread-2535.html


Jwhat even did a good summary post on his website. Basically, the problem when you replace both RTCs is that they were never designed to have both go out at the same time. It's certainly doable to have them set themselves up again, but some of the automation doesn't always work. The link above was mainly for Fuel, but Tezro has fewer problems.

But they suggested the Snaphat was always supposed to be the second thing to address, not the first. So you were supposed to get the L1 RTC working, and then put in the Snaphat and get it working, as it caches from a similar location.

There is a socketed Atmel identity chip very close to the L1 RTC that carries the system ID information in Fuel. In Tezro, I'm not 100% sure because it's been a while. But Tezro has been known to lose the serial number because it's cached in the L1. We think it's cached in a couple of places, but both of those rely on the batteries to work. So you just have to reinitialize it if it gets lost. Just use the set serial command in the L1 to do that. If you lose your serial, the system will complain and normally won't proceed to PROM, so you'll know just from that that you have an issue and have to go into your L1 console.

Again, the Snaphat is used on an RTC that's on the IO9 controller on the Tezro, which holds the boot parameters as well as the date and time used by the operating system. There is a date and time under the L1 system, which you're free to set separately, as well as the time zone. But it may not matter for most things. The NVRAM in the L1 RTC is used to store board configurations, hardware inventory, and event logs for the L1, among other data.
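
To keep the two clocks straight, here is a small Python summary of what the post above says each one holds, plus the suggested replacement order. It only restates this thread and is not taken from SGI documentation; the names `STATE_BY_CLOCK`, `REPLACEMENT_ORDER` and `at_risk` are made up for illustration.

```python
# Where the post above says each piece of state lives on a Tezro.
# This restates the thread only; it is not from SGI documentation.
STATE_BY_CLOCK = {
    "L1 RTC": [
        "system serial number (cached)",
        "board configurations",
        "hardware inventory",
        "L1 event logs",
        "L1 date, time and time zone",
    ],
    "IO9 RTC (Snaphat battery)": [
        "boot parameters",
        "operating system date and time",
    ],
}

# Suggested order when both batteries are dead: L1 RTC first, Snaphat second.
REPLACEMENT_ORDER = ["L1 RTC", "IO9 RTC (Snaphat battery)"]


def at_risk(dead_clocks):
    """Return the state that may be lost when the given clocks lose power."""
    lost = []
    for clock in REPLACEMENT_ORDER:
        if clock in dead_clocks:
            lost.extend(STATE_BY_CLOCK[clock])
    return lost


if __name__ == "__main__":
    # Both batteries dead: list everything that may need to be set up again.
    for item in at_risk(REPLACEMENT_ORDER):
        print(item)
```

Calling `at_risk(["L1 RTC"])` lists only the L1-side items, which is the case where the serial number may need to be set again from the L1 console.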
weblacky
I play an SGI Doctor, on daytime TV.

Trade Count: (10)
Posts: 1,716
Threads: 88
Joined: Jan 2019
Location: Seattle, WA
06-26-2023, 04:14 PM
#46
RE: sgi tezro L1 General Exception on node 0
(06-03-2022, 09:09 PM)HarryT Wrote:  
(06-03-2022, 07:01 PM)weblacky Wrote:  Yeah, it uses the same cable, as it is the exact same signal.  Make sure you're actually hooked up to the serial 1 port, then open your terminal and go off-hook, and THEN turn the Tezro on.

The L1 terminal signal will recover mid-talk with plugging and unplugging an active terminal signal.  The console output will not!  You have to be fully hooked and paying attention right at power on, if you attempt to attach later, you’ll see nothing on the additional serial port (if the output has been actually moved). 

Did you make sure to do this?  Fully connect on the terminal side and then plug the Tezro into the wall.  Do not attach the serial 1 cable while L1 is already running; that's too late.  You have to be hooked up and signaling long before L1 even starts.


At least I had to do this on my fuel when it was redirected to serial 1 or I saw nothing. Had to have it all started and blinking at the terminal before power was applied to system…so I assume the same is true for Tezro. 


Please make sure that’s what you’ve done.


If you’ve done that and you’ve still not gotten output once your system had tried to auto-start (no output if you’ve disabled auto start of course). 

I assume you hooked up your Tezro to serial1, started the terminal software, gone off hook on the terminal (connect), plugged in your Tezro to power cable, and pressed the on button at front face to actually attempt power up. 


Then wait about 30 seconds and if the console was going to appear, it would.


Also I forgot to ask, does your Tezro front LCD say anything interesting during power up attempts?


OK, I unplugged the Tezro power cord, moved the null-modem cable from the Tezro console port to the serial 1 port (or serial 2 port), started PuTTY at 38400 baud (same settings as for the L1 console connection), pressed Enter in PuTTY, plugged in the Tezro power cord, and pushed the Tezro power button. I get
Code:
001c01
Powered up
on lcd display tezro (like always).

But nothing happens in PuTTY, sorry: not on the serial 1 port nor on the serial 2 port of the Tezro.  There is no output.

I think this works only on Fuel not on tezro.
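
For anyone repeating the test quoted above from a Linux or other Unix-like machine instead of PuTTY, here is a minimal Python sketch of the same order of operations: open the port and start listening first, then apply power. Only the 38400 baud rate and the roughly 30-second wait come from this thread; the device path `/dev/ttyUSB0`, the raw 8-bit framing, and the function names `open_console` and `capture` are assumptions for illustration.

```python
import os
import select
import termios
import time
import tty


def open_console(path: str) -> int:
    """Open a serial device in raw mode at 38400 baud and return its fd.

    The path (for example /dev/ttyUSB0) is an assumption; use whatever
    device your null-modem cable is attached to.
    """
    # O_NONBLOCK keeps open() from waiting for a carrier-detect signal.
    fd = os.open(path, os.O_RDWR | os.O_NOCTTY | os.O_NONBLOCK)
    tty.setraw(fd)  # no echo, no line editing, 8-bit characters
    attrs = termios.tcgetattr(fd)
    attrs[2] |= termios.CLOCAL | termios.CREAD  # ignore modem lines, enable receiver
    attrs[4] = termios.B38400  # input speed
    attrs[5] = termios.B38400  # output speed
    termios.tcsetattr(fd, termios.TCSANOW, attrs)
    return fd


def capture(fd: int, seconds: float) -> bytes:
    """Collect everything that arrives on fd for the given number of seconds."""
    deadline = time.monotonic() + seconds
    chunks = []
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        ready, _, _ = select.select([fd], [], [], remaining)
        if not ready:
            continue  # select timed out; the deadline check above ends the loop
        try:
            data = os.read(fd, 4096)
        except BlockingIOError:
            continue  # nothing to read after all; keep waiting
        if not data:
            break  # the other end closed the connection
        chunks.append(data)
    return b"".join(chunks)
```

The idea is to start something like `capture(open_console("/dev/ttyUSB0"), 30.0)` before plugging in the power cord and pressing the power button. An empty result would match what HarryT reports here: nothing arrived on that port during the power-up attempt.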

I had the same problem as HarryT (red light and the PROM not starting) until I swapped the node board today, so the Tezro runs again.
I replaced the RTC first and set the serial number, but that changed nothing; here it seems to be a problem with the node board.

I will try the old node board in my O350 (where I took the new one from); maybe I'll find a way to repair it.

What makes me nervous (maybe someone knows whether this is normal):
There is an LED on the interface board near the RTC.
Before the machine starts it is green; after the machine starts it turns red.
Is this normal behavior?



Tezro  O2
(This post was last modified: 12-22-2023, 11:05 PM by lohmos.)
lohmos
3D Modelling & Rendering

Trade Count: (0)
Posts: 70
Threads: 20
Joined: Jan 2021
Location: Germany
12-22-2023, 11:03 PM
#47
RE: sgi tezro L1 General Exception on node 0
(12-22-2023, 11:03 PM)lohmos Wrote:  
I had the same problem as HarryT (red light and the PROM not starting) until I swapped the node board today, so the Tezro runs again.
I replaced the RTC first and set the serial number, but that changed nothing; here it seems to be a problem with the node board.

I will try the old node board in my O350 (where I took the new one from); maybe I'll find a way to repair it.

What makes me nervous (maybe someone knows whether this is normal):
There is an LED on the interface board near the RTC.
Before the machine starts it is green; after the machine starts it turns red.
Is this normal behavior?


Since Tezros were the last workstations, they are the newest SGIs. So the truth is they haven't really aged enough for us to see what's going to happen with them, until now. We're already seeing slight VRM issues that should be fixable, but who knows what else that entire series holds.  If a node board swap changed things, then obviously the node board has a problem. I would assume it is probably a shorted component on the board that prevented the machine from ever really reaching POST.

While not on a Tezro, I have recently seen the same L1 behavior on a Fuel mainboard (that I'm still working on): the L1 is not really a big help when it comes to an outright shorted power rail. I've seen a Fuel L1 claim to be starting a system when it turned out one entire rail was shorted; the VRM had turned off its switching to protect itself from the short, and the L1 still didn't actually say anything about it. It just said "powering up" and then was wondering why things weren't powering up. This appears to be a similar situation with this Tezro, which leads me to believe that yes, it's got to be something pretty basic at this point, like a power rail, an SMD electrolytic capacitor, or a malfunctioning VRM circuit.

You have to remember that the Tezro mainboard has low-voltage VRMs on it, while the node board has much higher-powered VRMs. So there's actually voltage creation happening on both boards for different reasons. If memory serves, only one of the node board VRMs is removable; the others are soldered onto the board more permanently.

But that's the take I would have on this, given my experience. You most likely have a short on the board past the primary power rails, that is, on the other side of a first- or second-level buck converter.

If you're comfortable poking around, you can do some stuff. Otherwise, if you have a working setup now, just hang onto the board and gently put it away so the very delicate connectors don't get damaged; it's something we might be able to look into in the future, after a few other things. I have a quad 700 node board in a Tezro myself, so if I can wrangle things in the future I might be able to upgrade that Tezro, remove the board, and then have something to compare against, assuming the two are the same part number and revision. Then there would be a way to track down what happened on yours. I would assume that your CPUs are probably OK and that it probably is fixable; it's just a little dense and it's going to take us some time.

I'm certainly glad to hear that you found a solution by swapping hardware. Like I said, please hold onto your malfunctioning board in a good anti-static bag that protects those megarray board connectors, and possibly in a year or two we'll be able to help you out with that.  We're ramping up on stuff like this now, so give us some time, keep the faith, and we'll see what comes up in the next few years.

I hope OP can do something similar.
weblacky
I play an SGI Doctor, on daytime TV.

Trade Count: (10)
Posts: 1,716
Threads: 88
Joined: Jan 2019
Location: Seattle, WA
12-23-2023, 02:08 AM
#48
RE: sgi tezro L1 General Exception on node 0
OK weblacky,

the broken node board is free now.
It served as a placeholder and dust protection in an Origin 350 and was removed today,
so you could start your experiments on it now.

I hope we can find out what happened there and protect or repair others in the future.

It is in Germany at the moment, so we have to see how we can get it to your home.

Tezro  O2
(This post was last modified: 05-27-2024, 01:22 AM by lohmos.)
lohmos
3D Modelling & Rendering

Trade Count: (0)
Posts: 70
Threads: 20
Joined: Jan 2021
Location: Germany
05-27-2024, 01:15 AM
#49
RE: sgi tezro L1 General Exception on node 0
Refresh my memory, were you able to find a replacement for yours?
weblacky
I play an SGI Doctor, on daytime TV.

Trade Count: (10)
Posts: 1,716
Threads: 88
Joined: Jan 2019
Location: Seattle, WA
05-27-2024, 03:17 AM
#50
RE: sgi tezro L1 General Exception on node 0
(05-27-2024, 03:17 AM)weblacky Wrote:  Refresh my memory, were you able to find a replacement for yours?
I got one of Ian's 1 GHz node boards.

Tezro  O2
lohmos
3D Modelling & Rendering

Trade Count: (0)
Posts: 70
Threads: 20
Joined: Jan 2021
Location: Germany
05-28-2024, 05:36 AM

