O350 Interface Board L1 Issues
#1
Hey, folks,

I've managed to acquire and mostly rebuild what appears to have been an Onyx 350 prototype of some kind. The system was gutted to just the 2U Interface board and a bunch of associated fans and other such pieces. I've been obtaining the necessary parts to rebuild it and I've even had it up and running for a little bit. I was very hopeful that I'd be able to snap some pictures and get a post up on the forums all about it. :)

However, the system is now, quite sadly, not functioning at all. The L1 controller started to get flaky, not responding to commands. As of now, it doesn't respond at all and there is no message on the front LCD. The two amber lights on the node board come on and do not turn off, as they should when the controller activates. No other lights are lit on any other boards. I've also tried removing everything down to just the power assembly and the 2U Interface. Still no L1 communication.

As a bit of additional background: during shipping, one of the capacitors on the board was sheared off, including the majority of the underlying pads. I've replaced that capacitor since and managed to get it to attach to those remaining pads. I have replaced it again since the failure, thinking that maybe I didn't get it in place well enough. I'm fairly confident that it's okay as of now, though, as I do get voltage across it while the system is plugged in.

I'm also pretty confident that the Dallas chip is okay, as I've tested it in other systems, and also tried known-good ones in it.

With all of that out of the way, I'm hoping that someone might have some advice on what to even start testing. There are a *ton* of components on this board and I don't really know where to even start. I have access to a good number of electronics tools, including a multimeter, oscilloscope, logic probe, and others.

Any advice would be greatly appreciated. If I can't repair it, I'll probably have to source a replacement through one of our friendly neighborhood SGI vendors. I'm already asking for prices just in case. I'd like that to be a last effort, though, as this is really the only remaining original hardware from the prototype. It's very likely that this system was just a demo unit or a qualifying sample, so it's not totally unique. Still, I want to save what I can. (Also, don't worry, even if I can't fix it, the board will be saved and kept safe for someone more knowledgeable to try to fix in the future.)

Personaliris Indigo Indigo2 Indy Onyx2 Origin 200 Origin Vault O2 Octane2 (VW 320) (VW 540) (VW 550) Fuel Tezro Tezro Rack Origin 350 Onyx4 Altix 350 (Prism Rackmount)
kaigan
Site Admin and SGI Tinkerer

Trade Count: (2)
Posts: 262
Threads: 31
Joined: May 2019
Location: Omaha, NE
06-05-2020, 08:43 PM
#2
RE: O350 Interface Board L1 Issues
Well, this advice could be just dumb, but assuming you have access to another similar system with the same boards, you could do the old diode and resistance check between ALL board external interface pins and compare against known-good samples. This is what cell phone repair shops tend to do to start a repair; that way they don't have to buy expensive gear for each technician, they just have a chart of known-good resistance and diode check values for a given model of phone at specific test points. Any difference is looked into. It doesn't check everything, but it does show if external communications, ports, or board-to-board interfaces are blown out or affected. You could go a step further and run the same tests on all the major IC legs you find and compare. The number of combinations is up to you; the more info, the better. But it's manually tedious...

Basically, what I'm suggesting is that you investigate via comparison alone. Inventory each board/assembly (removed from the system) from its interface pins to known board ground, checking diode values (both polarities) and basic resistance values (both polarities). While this may not show semiconductor damage, it will give you confidence in the basic passive components and pathway integrity.

Write down your findings and compare them with values measured at the exact same locations on the suspect boards. Anything more than 5%-8% off your measured good values should be treated as suspect. These findings can be used to create a "heat map" of problem pins, and those pins lead to problem components. Even bad or cracked solder joints show a resistance difference.
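To make the comparison step concrete, here is a minimal sketch of how the two sets of readings could be checked once they are written down. The pin names and values are made up for illustration, and the 8% default is simply the upper end of the 5%-8% range mentioned above, not a verified limit for this board.

```python
# Hypothetical pin names and readings; replace them with your own measurements.
def flag_suspect_pins(good, suspect, tolerance=0.08):
    """Return {pin: percent deviation} for pins differing by more than `tolerance`."""
    flagged = {}
    for pin, good_value in good.items():
        if pin not in suspect:
            continue  # this pin was not measured on the suspect board
        if good_value == 0:
            # Avoid dividing by zero: any nonzero reading counts as a difference.
            deviation = float("inf") if suspect[pin] != 0 else 0.0
        else:
            deviation = abs(suspect[pin] - good_value) / abs(good_value)
        if deviation > tolerance:
            flagged[pin] = round(deviation * 100, 1)
    return flagged


# Resistance to board ground in ohms (illustrative numbers only).
good_board = {"J1-1": 10_000.0, "J1-2": 470.0, "J1-3": 0.55}
suspect_board = {"J1-1": 10_150.0, "J1-2": 512.0, "J1-3": 0.12}

for pin, percent in flag_suspect_pins(good_board, suspect_board).items():
    print(f"{pin}: {percent}% off the known-good value")
```

The same function works for diode-check readings; just keep one table per measurement type and polarity so like is compared with like.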

It's really only of value as a comparator, so if you don't have identical working boards in another system, then it's not doing much for you.


But it is a cheap way of potentially finding more faulty paths, connections, or components. Also, if you get VERY LOW readings (under about 200 mOhm), try to use a high-end multimeter with 10 mOhm resolution. Any variance under 20 mOhm is within the margin of error, but greater differences can tell you a lot as well.

Let us know what you decide and discover.
(This post was last modified: 06-05-2020, 11:50 PM by weblacky.)
weblacky
I play an SGI Doctor, on daytime TV.

Trade Count: (10)
Posts: 1,716
Threads: 88
Joined: Jan 2019
Location: Seattle, WA
06-05-2020, 11:48 PM
#3
RE: O350 Interface Board L1 Issues
Sadly, this is the only O350 2U Interface board that I have. I have a couple of Altix 350 boards at this point, but given the architecture change, they're radically different. Still, it's a good idea to examine components and get values. The components on the board are very, very tiny, so it is a little hard to even probe all the legs of an IC, for example. It can be done, though. I won't have anything to directly compare the values to until I get a replacement, but it could also let me find the fault if a component is operating strangely or very out-of-spec.

I'm hopeful that the greater knowledgebase of the community might be aware of specific issues that these boards can encounter beyond the Dallas. If not, though, hopefully I'll be able to generate some more insight over time. :)

kaigan
Site Admin and SGI Tinkerer

06-06-2020, 12:28 AM
#4
RE: O350 Interface Board L1 Issues
Hmm,
Well, with that said, you can scale the idea back: just get the aforementioned values from the interface pins (backplane connectors, etc.) and see if any are actually shorted. The odd thing about what qualifies as a short is that it's not always black and white, but if you find any connection at ~200 mOhm or lower, then that is a short and you have something to follow, for example a bad SMD cap dragging down a power line. So if you find a shorted interface pin, then:

I recently refurbished an old ToneOhm 700 that can do that kind of stuff. But you can emulate its short-finding ability with a voltmeter (it needs 10 mV accuracy; 100 mV is too coarse, and most Flukes can do this) and a bench power supply set to about 0.55v DC. A shorted component takes nearly all of the surrounding current. After you find a short, use a current-limited power supply at about 0.55v DC @ 1A (maybe more amps) to apply power between the suspect “hot” track and PCB ground. Then measure the voltage drop along that same (+) line, serially jumping components as you go: as you move along the “hot” path, both leads stay on the positive side and you measure the voltage between the two leads on the same (+) track. When the shorted component is between your two leads, you'll see a large millivolt difference. When your "origin lead" (the lead bringing up the rear along the track) is past the short, so that both leads have jumped over the shorted component, you'll see almost nothing. That's how you know you just passed the shorted cap in a series of parallel caps.

Most of the time you can keep your origin lead at the start of the track and keep extending your measurement by moving only the secondary lead farther out. Once you start seeing climbing millivolt values, you can move your origin lead to skip over earlier track components and they will “fall off” the measured track area, which removes their voltage drop from the reading on the meter. You can combine extending the reach of the secondary lead with pulling the origin lead up closer to the secondary lead's current location (along the same PCB track) and watch the meter reading increase or decrease; a huge decrease means your origin lead just jumped over the biggest short on that track. Pulling the origin and secondary test leads closer together helps you focus on a smaller area of investigation, using the rising and falling meter reading as your guide.
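As a rough illustration of reading those numbers, the sketch below assumes a simplified model: one short on the track, current injected at the start of the track, and one millivolt reading per track segment taken with the leads on either end of that segment. Segments between the injection point and the short carry current and show a drop; segments beyond it show almost nothing. The readings and the 1 mV noise floor are invented for the example.

```python
def locate_short(segment_drops_mv, noise_floor_mv=1.0):
    """Return the index of the last segment that still carries current.

    segment_drops_mv[i] is the drop measured across segment i, ordered
    outward from the point where the bench supply is connected. Under the
    single-short assumption, the short sits at the far end of the returned
    segment. Returns None if no segment shows a drop above the noise floor.
    """
    last_live = None
    for index, drop in enumerate(segment_drops_mv):
        if abs(drop) > noise_floor_mv:
            last_live = index
    return last_live


# Illustrative readings in millivolts, moving outward along the track.
readings_mv = [12.4, 11.9, 12.1, 0.3, 0.1]
segment = locate_short(readings_mv)
print(f"Look for the shorted part at the far end of segment {segment}")
```

Real boards branch, so treat the result as a hint about where to look next rather than a verdict.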

Basic idea is like this: https://www.allaboutcircuits.com/textboo...-analysis/
weblacky
I play an SGI Doctor, on daytime TV.

06-06-2020, 12:56 AM
#5
RE: O350 Interface Board L1 Issues
(06-05-2020, 08:43 PM)kaigan Wrote:  If I can't repair it, I'll probably have to source a replacement

Sadly, I think this is what you are going to have to do. I had one do the same thing and there were similar reports from other O350 owners.

It seems that if the O350 sits around long enough for the Dallas battery to die, this can happen. You would think putting in a new Dallas would fix that, but it doesn't. That whole L1 thing is not as clearcut or simple as it looks and no one who reported this failure was able to get the board to work again :(

(More lost information from nekochan, dammit :( :( )
(This post was last modified: 06-06-2020, 03:47 AM by hamei.)
hamei
broke-down old clunker

Trade Count: (0)
Posts: 380
Threads: 3
Joined: Jul 2019
Location: 上海
06-06-2020, 03:46 AM
#6
RE: O350 Interface Board L1 Issues
Not being familiar with this board, does anyone know where the L1 actually lives? It occurs to me that if we knew which IC constitutes the L1, we could just check VCC during power-on. When you say flaky...I think power...not "my L1 is wearing out".

Chips go flakey for a number of reasons, but stable power is almost always at the top of the list.

I'd actually assume that the L1 doesn't just take power directly from the PSU as-is. For example, most PC-style mainboards have a chipset or something that starts off the POST process. These are normally 3.3v, but could be anything. In PCs, they tend to sit on a power rail downstream of the mainboard’s own 5V and 3.3v rail generation, which is provided by a pair of MOSFETs in a buck converter circuit driven by a management IC, whose job it is to make these lower voltages from the higher 12V from the PSU at power-on. These lower voltages then run the mainboard’s core systems.

From what I’ve seen, 5v embedded peripherals (not expansion card slots, I think) and 3.3v ICs aren’t usually powered directly from PSUs in PCs (I know this isn’t a guarantee). Normally the mainboard of a modern computer generates the 5v (like for USB), 3.3v (for logic and chipsets), and ~1.35v for RAM, and finally a VRM (or series of built-in phases) creates the power for the main CPU. All from 12V PSU rails.
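For a feel of what such a converter is doing, the textbook relation for an ideal buck converter in continuous conduction is Vout = D × Vin, where D is the fraction of each switching cycle the high-side MOSFET is on. The sketch below only applies that idealized formula to the PC rail voltages mentioned above; real converters have losses, and nothing here is specific to the O350 board.

```python
def ideal_buck_duty_cycle(v_in, v_out):
    """Duty cycle of an ideal (lossless) buck converter: D = Vout / Vin."""
    if v_in <= 0 or not 0 <= v_out <= v_in:
        raise ValueError("a buck converter can only step down: need 0 <= v_out <= v_in")
    return v_out / v_in


# Rails from the PC example above, all derived from a 12 V input.
for rail in (5.0, 3.3, 1.35):
    duty = ideal_buck_duty_cycle(12.0, rail)
    print(f"{rail:>5.2f} V from 12 V: duty cycle of roughly {duty:.1%}")
```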

The fastest approach I know is to look for the telltale signs of a buck converter: two MOSFETs, followed by an inductor (coil), followed by a capacitor. Anywhere you see an SMD coil...you likely have a voltage converter. Measure the live voltage after the cap and before the cap (but after the SMD coil) and see what's coming out. A cap that is "going out" in this configuration causes LOW (unstable) voltage on that output line. These caps are often solid electrolytic SMD caps (not SMD ceramics) due to the ripple characteristics required for filtering.

I guess what I'm saying is, I'd check my LOW voltage rails (if I could find them) on the board by looking for SMD coils and measuring after the coil and at any SMD electrolytic caps right next to them. You should get voltages that make sense. If you get something like under 3.1v or around there on a 3.3v rail...I'd say you have a failing SMD cap; just replace that cap and you'll likely see the voltage come right back up!
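A minimal sketch of that sanity check, for anyone logging readings as they go. The rail labels and measurements are hypothetical, and the 6% sag limit is only an assumption chosen so that a 3.3v rail is flagged at roughly the "under 3.1v" mark mentioned above; it is not a datasheet figure.

```python
def check_rails(measurements, max_sag=0.06):
    """Classify each rail as "ok" or "suspect".

    measurements maps a label to (nominal_volts, measured_volts). A rail is
    "suspect" when it reads more than `max_sag` (as a fraction) below nominal.
    """
    report = {}
    for label, (nominal, measured) in measurements.items():
        floor = nominal * (1 - max_sag)
        report[label] = "ok" if measured >= floor else "suspect"
    return report


# Hypothetical readings taken just after each SMD coil.
readings = {
    "3.3V rail after coil": (3.3, 3.05),
    "5V rail after coil": (5.0, 4.98),
}
for label, verdict in check_rails(readings).items():
    print(f"{label}: {verdict}")
```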

Check out some YouTube videos on dead laptop power repair. They show this kind of visually recognizable DC-DC circuit. Once you know how to recognize the small MOSFETs and SMD coils...you'll start "seeing" where power conversion happens on your board, then you can target those regions to tell if the board itself is generating unstable power...which would cause startup issues.

Any high-resolution images of the board you can give us would be great. If I can see a good close series of pictures, I could highlight what I THINK are DC-DC conversion areas for you to test with a multimeter to check yourself.

Especially with this older hardware, where they likely aren't integrated into a single IC (power driver and dual MOSFETs in one; they didn't have that when these were made). So seeing all the pieces needed to make this circuit is MUCH easier on older boards.

If you get stuck, take some good closeup pictures for us...maybe we can point you in a couple places to check.

Thanks.
weblacky
I play an SGI Doctor, on daytime TV.

06-06-2020, 07:03 AM
#7
RE: O350 Interface Board L1 Issues
The board is dual-sided, and I've posted a number of images up on Silicon Image now. In general, it looks like most of the connectivity and power circuitry is on the front and the back has the main brains of the board. Pictures can be seen here: http://siliconimage.irixnet.org/index.ph...ard-Repair

For the moment, we can turn our attention back to that capacitor I was talking about as a culprit. It turns out that when I removed the board for pictures, that cap was happy to slide back off the board like it was never attached. :-/ The remaining little bits of the pads are so small that maintaining an electrical connection, much less a structural one, seems to be a very difficult task. Like I said, I was getting voltage across the capacitor while it was installed, but if that connection was unstable, that could easily prevent the L1 from functioning properly.

Given that this is the mess I'm working with...

[Image: 20200606_091559.jpg?]

...what the heck should I do about it? I can still get electrical continuity from both of those little remaining areas, and there are actually test points for both pads on the back of the board. Kits for repairing this sort of issue appear to be fairly expensive. Is there a good way to repair this economically?

Once this is fixed, I'll see if the board is working properly again. If not, we can proceed from there. Thanks for all the help and advice, everyone!

kaigan
Site Admin and SGI Tinkerer

06-06-2020, 02:26 PM
#8
RE: O350 Interface Board L1 Issues
Addition to the above: tracing the positive side of that capacitor pad out, it ties in directly to the 3.3v line present on one of the voltage regulators and to the same line on the Dallas. I'm thinking that this capacitor may act as a filter for that line, then. With that as the case, unless the board has pretty tight tolerances, it may not be strictly necessary for operation, indicating that the fault could be elsewhere.

For the moment, that's just additional musing. I'm curious to hear what better-informed folks think. :D

kaigan
Site Admin and SGI Tinkerer

06-06-2020, 02:46 PM
#9
RE: O350 Interface Board L1 Issues
Another update: I did manage to get a capacitor back in place and connected electrically. It's a little wiggly, but as long as I'm careful it seems to be okay. I'll secure it properly in place before all is said and done. It really does seem to just be a filter cap for the 3.3v line.

Speaking of the 3.3v line, it seems to be kind of low. At various places on the board, I'm seeing voltages anywhere from 3.28v to 3.22v, with the lower end of that seeming a little bit concerning. As far as I can tell, the 3.3v is generated by a MIC 29302BU voltage regulator. The input is coming in at just over 5.0v, but I'm seeing only 3.28v as the output. While that isn't too far out of line for a 3.3v line, I'm surprised that it's not outputting a voltage a little higher than 3.3v directly out of the regulator. Maybe the voltage drop is enough that some chips aren't operating properly?
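To put those readings in perspective, here is the arithmetic as a small sketch. It only uses the numbers quoted above (a nominal 3.3v rail, 3.28v at the regulator output, 3.22v at the lowest point seen, just over 5.0v in). Whether the resulting percentages matter depends on each chip's datasheet limits, which have not been established in this thread.

```python
NOMINAL_V = 3.3


def percent_below(nominal, measured):
    """How far `measured` sits below `nominal`, as a percentage of nominal."""
    return (nominal - measured) / nominal * 100


regulator_in = 5.0    # "just over 5.0v" at the regulator input
regulator_out = 3.28  # measured at the regulator output
lowest_seen = 3.22    # lowest reading found elsewhere on the board

print(f"Regulator output: {percent_below(NOMINAL_V, regulator_out):.2f}% below nominal")
print(f"Lowest reading:   {percent_below(NOMINAL_V, lowest_seen):.2f}% below nominal")
print(f"Drop across the board: {(regulator_out - lowest_seen) * 1000:.0f} mV")
print(f"Input-to-output headroom: {regulator_in - regulator_out:.2f} V")
```

So the regulator itself is well under 1% low, and the worst point on the board is about 2.4% low, with roughly 60 mV lost between the two.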

Again, all thoughts are appreciated! :)

kaigan
Site Admin and SGI Tinkerer

06-06-2020, 05:41 PM
#10
RE: O350 Interface Board L1 Issues
Yo,
OK, a lot of stuff here and to be fair it's going to take several posts, likely over time, to work through this.

I thank you for posting such good images and I can clearly see your current enthusiasm and that you want to continue with this right now. So I'll try to explain my thoughts as "tasks" that will give us data and work us forward here.

1. Regarding the pictures, I only need the "barely visible" lettering on the "transistor/MOSFET-looking" ICs (please just type it into a reply, don't try to photograph it). As you can see, the coils are the large grey components whose markings start with "2R0LA"; they have components next to them with 3 legs but the center leg cut off, for example in the regions labeled E8F5 & F1E6. I need the text from those 3-leg (center-cut) ICs. Those areas look like DC generation (spot on). If I can find the data sheet, I can hopefully figure out where to measure voltage production (close to the source).

2. Regarding your fallen-off capacitor: I’m worried it’s damaged. Can you please use an ESR meter (if you have one) and a multimeter with a capacitance function to verify that it still works? Being physically knocked off with that much force, then re-heated over and over, may have damaged it…you don’t want a damaged cap put back in for troubleshooting purposes…it may give you bad results and throw off your investigation. Please test it, though I would honestly prefer that you order new caps with the same markings and install a fresh one on the next attempt. Get an SMD cap, not a through-hole metal-top cap; get a cap that looks like the old one.

3. Once you’re sure you have a known-working cap to refit into the broken pad space you indicated, you can move forward. Let’s talk about exposing more copper trace. I’d suggest you practice on a junk PCB until you get the hang of it. I recommend you buy a nylon/fiberglass scratch pen from Amazon:

https://www.amazon.com/dp/B003NHDITW/ref...2Eb13BFGFW

While you must be careful using this pen (for your breathing), this is what is commonly used to gently (but quickly) remove solder mask (the green coating on the PCB) through abrasive action. What you want to do is slowly remove the green mask at either extreme end of the broken pad regions until you hit copper from the two traces. Once you have clean copper, go no further! You can actually grind through the copper traces if you keep sanding with this pen (it doesn’t happen instantly, but don’t be aggressive). So what I expect is that you’ll do a single line (repeated swipes of the pen) parallel to the white outline, travelling through the test point vias you tried to solder to: the outline region where your old solder point and “+” are, and the region where the “B7G4” text is. You’ll be uncovering more trace from those regions. It should be very obvious once you see it that it’s where your broken pads were connected.

You need clean copper, but do two small areas and don’t hurry.

**HEALTH ISSUES** Whether you get a nylon-based pen or a genuine fiberglass pen doesn’t matter: shedding of fine fibers will occur with EVERY STROKE. That’s normal. You’ll need to blow the fibers away and NEVER breathe in from the work site. I recommend extreme caution; you can run a desk fan on high, close to your breathing area, to move as much air as is reasonable and make sure the shed fiber is being pushed away from you as you work. You don’t want to breathe it in; if you want to wear a mask, fine. I’ve only done a little bit of this and never worn a mask...but I’ve always made sure the “fiber dust” that the pen produces is cleared away mechanically and with air while I’m working…safety first!

4. Once you have good, usable areas of copper exposed at either end, you’ll need to try to pre-tin them. Remember, just like duct tape, solder likes to stick to solder. So, don’t try to take the cap and just solder it to the new copper. You’ll need to focus on successfully applying solder to the newly exposed copper areas. That will be difficult, but once you do it, then you just have to do a solder-to-solder joint…which is much easier.

5. I have a few different ideas on doing this, but I’ll pick an unorthodox one that should make it easier for someone starting out on a board like this. I think it might be better to approach this by NOT putting the cap back the way it was (that is very difficult and would require pad material to attach the cap leads to the new copper areas). I might have to draw out this idea to best describe it.

Before you start, get two 3 inch lengths of wire. I’m unsure what gauge we really need, but I’d say the larger the better to be safe, though obviously not bigger than the copper traces you’ll be uncovering, so maybe 20awg or around there? If you have old 18awg wiring from a lighting fixture install or something, that might work as well. We're just looking for easily workable wire that can carry, let’s say, 4 amps? Feel it out. Strip the ends about 1/4 inch and pre-tin the leads on both wires correctly (flux, then solder) first.

But to try to put it into words: you need to preheat the board, because the reason you’ve had issues soldering so far is that the board is a mass of copper, and the heat from your soldering iron is being taken away and diffused throughout the copper layers. You may need a helper for this. The board really only needs to get to about 120F to help you; you want to take the fight out of this heatsink problem. You can use a heat gun or a hair dryer set on high and try to evenly heat the board from about 7 inches above it. I don’t mean concentrate heat on one spot…that won’t work and you’ll burn things. Patiently wave the heat gun around the entire board from 7 inches or higher; just spend 20-30 minutes waving, tracing, moving, all around. You’re warming the board like you’re trying to melt an open-faced sandwich (evenly and slowly).

The board will get just into the “uncomfortable to hold/touch” situation. That should mean you’ve reached the right temperature! No solder should melt anywhere…you shouldn’t be near that kind of temperature. Just a little uncomfortable to hold in your hand or touch…little too hot to hold it.

Now that the board’s warmed up, stop heating it and set the heat gun aside. Quickly, before the board cools down, apply paste or liquid flux to the “newly exposed” copper trace areas you uncovered with the abrasive pen, then attempt to get a nice, thin layer of solder on them with your iron set HOT, like 750F or maybe even higher…not a huge blob, just a nice little hump/hill on both exposed areas.

If you manage to get a nice little hill of solder successfully on both pads, then comes the kind of crazy idea (mine) part. I want you to now use your soldering iron to attach each of the pre-tinned wires (one end of each wire) to the pads (one wire to one pad). So in the end you have the empty footprint with two short wires (perhaps already bent into a semi-circular shape) laying on their sides (like IC legs) across your soldering region and curving upwards. Think upside-down dead bug legs. Each wire curves upwards, with its pre-tinned lead laying on the pad. You want as much pad-to-wire contact as you can get, so lay the wire lead across the pre-tinned copper pad in whatever orientation makes the most contact.

Reheat the joint and solder the wire lead to the pre-tinned pad area. Due to the board pre-heating the solder may not cool fast enough and the wire may fall or move if you let it go after soldering. So be prepared to hold the wire in place with pliers or something for like about 5-7 minutes at most as the board is cooling, eventually it should solidify and you’ll have two wires coming out of your pad regions (one at “+” area and one around the “B7G4” area).

6. Take a break, let the board cool to room temp, get a sandwich or call it a day and do this next step the next day.

7. I think you should install the cap UPSIDE DOWN. That is, like a magician pulling a rabbit out of a hat, place the top of the SMD cap on the PCB with its legs in the air. Then you can flux and pre-tin the cap legs, then solder each of your wires to the legs. Watch polarity (orientation and polarity must match…so + wire to + cap side).

And you’re done. You can lightly silicone or epoxy the cap’s top to the PCB if you want; I’d go very light on any adhesive. Don’t use hot glue or crazy glue; they can deform plastics and shrink when curing, which puts stress on the cap and board.

Now you have it: you didn’t have to reheat the board and focus on it so hard that you risked burning or ruining other components, and you have a very good connection with your little 3” wires. You can trim them or bend them (gently) or whatever to connect to the new cap. Attach the wire leads to the cap and you’re done. Now you COULD lay the cap on its side, or in some other orientation if you want, whatever you can get to fit on the board.
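For anyone whose tools read in Celsius, the two temperatures from step 5 (about 120F for the board preheat and about 750F for the iron) convert as in this small sketch. The labels are just descriptive names for the example.

```python
def fahrenheit_to_celsius(deg_f):
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (deg_f - 32) * 5 / 9


for label, deg_f in (("board preheat", 120), ("soldering iron", 750)):
    print(f"{label}: {deg_f}F is about {fahrenheit_to_celsius(deg_f):.0f}C")
```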


This should hopefully be the easiest way to work on this board. All the heating and caution is very real, so don’t try to do a blind SMD cap hot air solder to it…chances aren’t in your favor. I suggest trying my wire lead idea, then you can attach the wire leads in another step without reheating the board.

Anyway, that’s what I’d try. It just has to look neat and well done, you don’t have to place components back in the same spots…just the same connection points.
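On the wire gauge question from step 5, a quick estimate suggests either 18awg or 20awg is comfortable for a 3 inch jumper. The sketch uses the standard AWG diameter formula and an approximate room-temperature resistivity for copper; the 4 amp figure is the guess from step 5, not a measured load on this board.

```python
import math

COPPER_RESISTIVITY_OHM_M = 1.68e-8  # approximate value at room temperature


def awg_diameter_mm(gauge):
    """Diameter of a solid AWG wire in millimetres (standard AWG formula)."""
    return 0.127 * 92 ** ((36 - gauge) / 39)


def wire_resistance_ohm(gauge, length_m):
    """DC resistance of a solid copper wire of the given gauge and length."""
    radius_m = awg_diameter_mm(gauge) / 2 / 1000
    area_m2 = math.pi * radius_m ** 2
    return COPPER_RESISTIVITY_OHM_M * length_m / area_m2


length_m = 3 * 0.0254  # 3 inches in metres
current_a = 4.0        # assumed worst-case current from step 5

for gauge in (18, 20):
    r = wire_resistance_ohm(gauge, length_m)
    print(f"{gauge}awg: {r * 1000:.2f} mOhm, "
          f"{current_a * r * 1000:.1f} mV drop at {current_a:.0f} A")
```

Both gauges come out at a few milliohms, so the drop across each jumper at 4 amps is on the order of 10 mV or less, small next to the variation already seen on the 3.3v line.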

Let us know how you end up.

In terms of the voltage fluctuation, I don't think 3.28v is an issue. I'm unsure about 3.22v; the fact that it's wiggling strongly suggests a filtering cap isn't working well anymore, as that's a classic symptom of a cap not doing its job. If you get me the IC text I asked for above, I can hopefully point you to the other voltage generation areas to probe to see if you see the same wiggling. I suspect the caps in those areas are the real issue (e.g. F2F1, F2E2, H0E9, H6E9).

Thanks,
(This post was last modified: 06-06-2020, 08:41 PM by weblacky.)
weblacky
I play an SGI Doctor, on daytime TV.

06-06-2020, 08:35 PM

