Tezro node/hub error
#1
Hi all,

Recently I have been getting the following error on the Tezzie:

WARNING : Excessive II Sn errors (77575/min) on /hw/module/001c01/node/hub
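For anyone who wants to track how often this warning shows up in a log, here is a minimal sketch that pulls the error count and the hardware graph path out of a line in this format. The regex is inferred only from the single message quoted above, and `parse_hub_warning` is a made-up name for illustration, so other IRIX messages may need a different pattern.

```python
import re

# Pattern inferred from the one warning line quoted in this thread;
# other kernel messages may be laid out differently.
WARNING_RE = re.compile(
    r"WARNING\s*:\s*Excessive (?P<kind>.+?) errors "
    r"\((?P<rate>\d+)/min\) on (?P<path>\S+)"
)

def parse_hub_warning(line):
    """Return (error kind, errors per minute, hwgraph path), or None."""
    m = WARNING_RE.search(line)
    if m is None:
        return None
    return m.group("kind"), int(m.group("rate")), m.group("path")

line = "WARNING : Excessive II Sn errors (77575/min) on /hw/module/001c01/node/hub"
print(parse_hub_warning(line))
```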

I assume this is the USB hub? Weird error if so because I have no USB devices plugged in.

If I leave it alone sometimes it's OK, sometimes it crashes with the L1 signaling a kernel panic.

Any thoughts? Loose connection somewhere??

A secondary, related issue: if I try to run the offline diagnostics, I get the error "autoboot failed. Dksc(0,1,0):/stand/smdk/smdk : no such file or directory"

Thanks in advance 👍

[Image: bjcqtPvq_o.png]
(This post was last modified: 11-23-2020, 11:00 AM by BackPlaner.)
BackPlaner
Jurassic Technologist

Trade Count: (1)
Posts: 262
Threads: 39
Joined: Sep 2018
Location: Lost Angeles
11-23-2020, 10:47 AM
#2
RE: Tezro node/hub error
Could be more serious than a USB HUB. The HUB ASIC is the CPU interconnect in the Origin 2000 series. In the 3K series it's called Bedrock.

In the Tezro, the nodeboard (IP53 or IP59) is on one side of the chassis divider, the rest is on the side of the PCI cards. It's been mentioned in the past that the connection between the two can come undone, but I think that was while shipping a system. It's a fragile high density connector, so beware if you undo anything there. Also take ESD precautions.

I had a HUB go bad on an Onyx2 node some years ago and had to replace the nodeboard. Fortunately, the CPUs on the Onyx2 are not soldered to the nodeboard, which mattered because it was a 500MHz R14K node.


jan-jaap
SGI Collector

Trade Count: (0)
Posts: 1,048
Threads: 37
Joined: Jun 2018
Location: Netherlands
11-23-2020, 12:32 PM
#3
RE: Tezro node/hub error
I'm sad to hear this! (Luckily quad 700MHz nodes are rather easy to come by, the last time I checked anyway!)

I've seated different node boards into my Tezro quite a few times in the time that I've had it and the trick is not to force it into position.

Lay the Tezro on its side and place the node board lightly into position, allowing the FCI Meg-Array connectors to align properly. (You're able to feel if they're properly aligned, as they do fit into each other.)

This image shows how the FCI Meg-Array connectors interlock:

[Image: FjYnG32.jpg]

Once you're certain that the connectors are aligned, press GENTLY and EVENLY down on the "connector" side of the node board and DON'T FORCE IT! (And whatever you do, don't rush this process!)

Make sure that the connectors have interlocked before attempting to screw the node board into place!
(This post was last modified: 11-23-2020, 05:43 PM by Irinikus.)
Irinikus
Hardware Connoisseur

Trade Count: (0)
Posts: 3,475
Threads: 319
Joined: Dec 2017
Location: South Africa
11-23-2020, 05:10 PM
#4
RE: Tezro node/hub error
Hi Backplaner,

Your other problem, the diagnostics not running, is because your IRIX install does not include the "Diagnostic Package".


On Fuel (a variation of the same architecture) you can see (using SW Manager) that the diagnostics SW is part of:
"IRIX Based Fuel Offline Diagnostic Environment for 6.5". I presume that there is a similar package for Tezro.

Cheers from Oz,


John.
(This post was last modified: 11-24-2020, 02:03 AM by jwhat.)
jwhat
Octane/O350/Fuel User

Trade Count: (0)
Posts: 513
Threads: 29
Joined: Jul 2018
Location: Australia
11-24-2020, 02:01 AM
#5
RE: Tezro node/hub error
OK, thanks chaps, I shall report back after undertaking these investigations!

BackPlaner
Jurassic Technologist

11-24-2020, 04:27 AM
#6
RE: Tezro node/hub error
So I've reseated all the boards, but I can't find the diagnostics install package on my 6.5.5 CDs. Does anyone know what disk it's on?

BackPlaner
Jurassic Technologist

12-19-2020, 06:54 AM
#7
RE: Tezro node/hub error
1. Does the error occur with whatever you were running before? You had an error; is it gone?

2. IRIX 6.5.5 is WAAAAAAAYYYYYY too old for a Tezro! Tezro support was added starting in version 6.5.15....so there's that. Source: http://archive.irixnet.org/sgistuff/hard...tezro.html
weblacky
I play an SGI Doctor, on daytime TV.

Trade Count: (10)
Posts: 1,716
Threads: 88
Joined: Jan 2019
Location: Seattle, WA
12-19-2020, 07:15 AM
#8
RE: Tezro node/hub error
If the problem persists after reseating everything, then it unfortunately looks like you may have a node board crossbar-router (bedrock) problem!

[Image: DNdV9Dv.jpg]

I certainly hope this is not the case as the O350 nodes have become rather expensive at this point. (This certainly was not the case a year or so ago!)

It may be worth trying to get the Bedrock re-balled. (Speak to someone who does deep gaming console repairs, as they would probably be able to help you with this; faulty gaming consoles often end up having their SoCs re-balled!) [They have the required specialised equipment, and if it's cost effective to re-ball the SoC in a PS3 then it may not cost that much!]

The cheapest O350 node with quad 700MHz R16Ks and 8MB of cache that I can currently find is going for $600!

These guys may be worth contacting to get the required re-balling stencil made up:

(This post was last modified: 12-19-2020, 09:23 AM by Irinikus.)
Irinikus
Hardware Connoisseur

12-19-2020, 07:20 AM
#9
RE: Tezro node/hub error
(12-19-2020, 07:15 AM)weblacky Wrote:  1. Does the error occur with whatever you were running before?  You had an error; is it gone?

2. IRIX 6.5.5 is WAAAAAAAYYYYYY too old for a Tezro!  Tezro support was added starting in version 6.5.15....so there's that.  Source: http://archive.irixnet.org/sgistuff/hard...tezro.html

Yeah still getting the error unfortunately. And sorry that was a typo, meant 6.5.30

BackPlaner
Jurassic Technologist

12-19-2020, 08:07 AM
#10
RE: Tezro node/hub error
Well, I think Jan-Jaap mentioned on the O2 CPU mod thread that real reballing takes money and machines. These boards are thick and huge, so you'll need very controlled board preheating. Even though you're only soldering a connector (and not an IC), you don't want to melt it, and I have no idea if the connector housing has an alignment pin through the PCB or something else to align it correctly. After all, it has to actually line up with its mating connector.

While nothing really stops you from trying to buy a Meg-ARRAY connector and doing the work (hopefully only on one side), I'd ask you to please stop and consider my original fix suggestion:

I asked before about doing a basic pin connection check. I was under the assumption that both boards have tiny test pads for every pin on that connector, on the rear (opposite) side of each board. Before you go playing with that, I suggest you try my old idea: take power off the system and do a pin-to-pin continuity check while the connected halves are in place in the case, right now.

You should be able to use a test lead on each side and check that "trouble" pin area. If you are getting a repeatable connection...then you know the connector is actually working. If you find a SINGLE pin that does have an issue...then just solder (gently) a single jumper wire to BOTH test points (connecting both rear sides of the connection with a jumper wire instead of the pin). You can always undo it (the test pad will have solder on it forever...but that's it)! If you find a pin that MAY BE touching the wrong pad, then I might break that pin off and do a manual jumper as well (if I was sure the pin was touching the wrong thing on the other side).

But I doubt that's what you'll see. Likely you'll see an erratic ohms reading (point to point, NOT point to ground) or one that's reading higher than its neighboring connections (poor surface contact).
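If you write the readings down, a rough sketch like this can pick out the pins matching that description. Everything here is an assumption for illustration: the pin labels, the sample values, the function name `suspect_pins`, and the two thresholds (`ratio` for "higher than its neighbors", `spread` for "erratic") would all need tuning against your own meter and board.

```python
from statistics import median

def suspect_pins(readings, ratio=3.0, spread=0.5):
    """readings maps pin label -> list of repeated ohm readings.

    A pin is flagged "high" when its typical reading is well above the
    median of all pins (poor contact), and "erratic" when its own
    repeated readings disagree by more than `spread` ohms.
    """
    typical = {pin: median(vals) for pin, vals in readings.items()}
    baseline = median(typical.values())
    flagged = {}
    for pin, vals in readings.items():
        reasons = []
        if typical[pin] > ratio * baseline:
            reasons.append("high")
        if max(vals) - min(vals) > spread:
            reasons.append("erratic")
        if reasons:
            flagged[pin] = reasons
    return flagged

# Made-up example values in ohms, three passes per pin.
example = {
    "A1": [0.3, 0.3, 0.4],
    "A2": [0.3, 0.4, 0.3],
    "A3": [2.5, 2.6, 2.5],  # consistently higher than its neighbours
    "A4": [0.3, 5.0, 0.4],  # jumps around between passes
}
print(suspect_pins(example))
```

Taking each pin several times matters more than the exact thresholds, since an intermittent contact can look fine on a single pass.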

Installing one or two jumper wires between boards (back of one connector's test pad to back of the other connector's test pad) is a heck of a lot easier than reballing the entire connector! Even if I'm wrong about where the test pads are, you could just trace the bent pin(s) manually to other board points and solder there.

While you might argue the signals might need high frequency blah blah blah...it can be undone and it's a heck of a better patch...and if everything works afterwards...do you think you'll ever be separating the two boards again? No, you won't.

So, my last attempt: please test which pin is actually failing, then find a way to jumper the pin's connection from both sides of the boards...and be done with this. No one is going to give you crap for doing this because the connector was damaged, so what are your real options? Patch the broken connection lines or resolder the entire connector. I know which one I'd try first.

I wish you best on this, I hope the hardware can be saved.
weblacky
I play an SGI Doctor, on daytime TV.

12-19-2020, 09:14 AM

