tezro problems
#21
RE: tezro problems
(06-03-2021, 06:18 PM)weblacky Wrote:  
(06-03-2021, 03:12 PM)kaigan Wrote:  A Dallas that isn't dead dead should still work fine. The battery can be shot as long as the chip itself works.

Maybe try rerunning the POD mode commands without the IO9, then run enableall, update and reset from the PROM to see how the system reacts?

Now that's an IDEA!

KOKOBOI:
Where did you get this Dallas and do you still have your OLD dallas?  If your Dallas is FAKE...could this happen?  Kaigan is right, a dead Dallas battery on a working Dallas IC will work fine with power...use that as a troubleshooting guide.  I've personally NEVER seen a "dead" Dallas Chip (not working), only a dead battery...your L1 keeps trying to make alterations to NVRAM, I assume UNsuccessfully.  Perhaps this could be the reason? 

I know you said things were still a problem with your OLD L1 Dallas, but try the old one with NEW snaphat on IO9 anyway and try reset procedure again.  See if there is a difference.

Let us Know.


ALSO, there was an experiment not long ago from jwhat that proved a Tezro will happily accept a BLANK L1 Dallas...but will NOT ACCEPT, NOR ERASE, a USED DALLAS!  So there are two options there: either use a Dallas that was drilled and fitted with a battery and remove the battery to blank it, or use a programmer to "erase" the old NVRAM data through programming commands.  If you got a USED Dallas instead of a blank one, the Tezro would reject it with serial errors, though I didn't see that in the output you posted.


I am not sure how different the desk Tezro is from the rack one, but I bought several rack Tezros that were all missing the Dallas chips. I put in some used ones I had from old O350s and all came up and worked.  I also bought some new Dallas chips with the same part number; they were slightly bigger and had a slightly different pin spacing, but worked once I used a socket to adapt them.  I was able to do the "clear, init, flush" reset procedure and then assign a serial number.

On a separate note, I have seen a V10 card with a bad env chip cause the same errors you are seeing.

I have extra known-good used Dallas chips, so I could send you one if you wanted to try that.
(This post was last modified: 06-03-2021, 07:28 PM by mopar5150.)
mopar5150
Octane

Trade Count: (6)
Posts: 125
Threads: 37
Joined: May 2018
Location: Palm Springs, California
06-03-2021, 07:25 PM
#22
RE: tezro problems
(06-03-2021, 07:25 PM)mopar5150 Wrote:  
I am not sure how different the desk Tezro is from the rack one, but I bought several rack Tezros that were all missing the Dallas chips. I put in some used ones I had from old O350s and all came up and worked.  I also bought some new Dallas chips with the same part number; they were slightly bigger and had a slightly different pin spacing, but worked once I used a socket to adapt them.  I was able to do the "clear, init, flush" reset procedure and then assign a serial number.

On a separate note, I have seen a V10 card with a bad env chip cause the same errors you are seeing.

I have extra known-good used Dallas chips, so I could send you one if you wanted to try that.



Must be because we spent a huge amount of time on this:

https://forums.irixnet.org/thread-2488.html
https://forums.irixnet.org/thread-2413.html
https://forums.irixnet.org/thread-2535.html

Basically, if you have a used RTC from a Fuel, it works fine (reinits) on a Tezro.  If you have a BLANK chip, it inits fine on a Tezro.  If you have a chip with OTHER data, nope...the L1 has issues and cannot/refuses to allow the serial number to be set.  Tezro to Tezro was hit/miss.  I think khral (thread-2413) was able to "reprogram" an RTC with another Tezro's valid image, and a second Tezro WAS able to overwrite and init everything.  But depending on WHAT was on the NVRAM, it's not a slam dunk.
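
For quick reference, the outcomes reported in this post can be written down as a small lookup table. This is only a sketch of the anecdotal results described above, not SGI documentation; the `ChipState` names and the `reported_outcome` helper are made up for illustration.

```python
from enum import Enum


class ChipState(Enum):
    """State of the replacement RTC/Dallas chip (labels are illustrative)."""
    BLANK = "blank"
    USED_FROM_FUEL = "used, pulled from a Fuel"
    USED_FROM_TEZRO = "used, pulled from another Tezro"
    OTHER_DATA = "holds other data"


# Outcomes as reported in the thread; anecdotal, not guaranteed.
REPORTED = {
    ChipState.BLANK: "inits fine",
    ChipState.USED_FROM_FUEL: "works, the Tezro reinits it",
    ChipState.USED_FROM_TEZRO: "hit or miss",
    ChipState.OTHER_DATA: "L1 refuses to allow the serial number to be set",
}


def reported_outcome(state: ChipState) -> str:
    """Return what forum members reported for a chip in the given state."""
    return REPORTED[state]
```

For example, `reported_outcome(ChipState.OTHER_DATA)` returns the "L1 refuses" case, which is the one to rule out first with a chip of unknown origin.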
weblacky
I play an SGI Doctor, on daytime TV.

Trade Count: (10)
Posts: 1,716
Threads: 88
Joined: Jan 2019
Location: Seattle, WA
06-04-2021, 02:13 AM
#23
RE: tezro problems
Quote:KOKOBOI:

Where did you get this Dallas and do you still have your OLD dallas?  If your Dallas is FAKE...could this happen?  Kaigan is right, a dead Dallas battery on a working Dallas IC will work fine with power...use that as a troubleshooting guide.  I've personally NEVER seen a "dead" Dallas Chip (not working), only a dead battery...your L1 keeps trying to make alterations to NVRAM, I assume UNsuccessfully.  Perhaps this could be the reason?
Quote:ALSO, there was an experiment not long ago from jwhat that proved a Tezro will happily accept a BLANK L1 Dallas...but will NOT ACCEPT, NOR ERASE, a USED DALLAS! So there are two options there. Either use a dallas that was drilled with a battery and remove the battery to blank it, or use a programmer to "erase" the old NVRAM data through programming commands. If you got a USED dallas instead of the blank one, the Tezro would reject it with serial errors, though I didn't see that in your output you posted.
I purchased it from AliExpress, so it's probably not blank. My programmer doesn't support this model, so I have no way to clear it.

Quote:I know you said things were still a problem with your OLD L1 Dallas, but try the old one with NEW snaphat on IO9 anyway and try reset procedure again. See if there is a difference.
The snaphat has been replaced with a brand new chip at the beginning, no difference.
kokoboi
O2

Trade Count: (0)
Posts: 46
Threads: 5
Joined: May 2018
06-04-2021, 07:18 AM
#24
RE: tezro problems
Your programmer doesn't support the DS1220? That was the profile you need to use (it was in the second link I posted). You might want to check again. I'm extremely skeptical you got genuine product from aliexpress, just saying. I'd put the old one back until you get it figured out.
weblacky
06-04-2021, 08:14 AM
#25
RE: tezro problems
(06-04-2021, 08:14 AM)weblacky Wrote:  Your programmer doesn't support the DS1220? That was the profile you need to use (it was in the second link I posted).  You might want to check again.  I'm extremely skeptical you got genuine product from aliexpress, just saying.  I'd put the old one back until you get it figured out.

With the DS1220 profile I'm able to read it (can't write, though). The Dallas chip is blanked.
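
If the programmer can save the read-back as a binary file, a few lines of Python can confirm whether the dump really looks blank. This is a rough sketch under two assumptions: that "blank" means every byte reads the same value (for example 0x00 or 0xFF), and that the dump is the full 2048 bytes of a DS1220 (a 2K x 8 part). A chip that simply lost its battery may read back arbitrary contents, so treat the result as a hint only. The function names are made up for this example.

```python
from collections import Counter

DS1220_SIZE = 2048  # assumed dump size for a 2K x 8 DS1220


def describe_dump(data: bytes) -> str:
    """Classify a dump as blank (one repeated byte value) or holding data."""
    if not data:
        return "empty dump"
    counts = Counter(data)
    if len(counts) == 1:
        value = next(iter(counts))
        return f"blank (all bytes 0x{value:02X})"
    # Report how many bytes differ from the most common value.
    value, hits = counts.most_common(1)[0]
    return f"holds data ({len(data) - hits} bytes differ from 0x{value:02X})"


def describe_file(path: str) -> str:
    """Read a dump from disk, warn on an unexpected size, then classify it."""
    with open(path, "rb") as handle:
        data = handle.read()
    note = "" if len(data) == DS1220_SIZE else f" [unexpected size: {len(data)} bytes]"
    return describe_dump(data) + note
```

Call `describe_file()` with the path of whatever file your programmer software saved; the path and file format depend on the programmer, so check that it writes raw binary rather than Intel HEX.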

Are there any easily obtainable SCSI controllers for the Tezro? I will not use the IO9 SCSI controller for the moment.
(This post was last modified: 06-05-2021, 06:40 AM by kokoboi.)
kokoboi
06-05-2021, 06:37 AM
#26
RE: tezro problems
Bought two IO9 cards, but drives are not detected (will check over serial tomorrow). I believe the same IO9 error occurred.
This time I'm pretty confident the NVRAM on the mainboard is completely dead and my replacement DS1220s are fake.
I've got this as a replacement: https://www.ebay.de/itm/271942434255
(This post was last modified: 08-29-2022, 07:05 PM by kokoboi.)
kokoboi
08-29-2022, 06:22 PM
#27
RE: tezro problems
Even with a new Dallas chip on the mainboard and another IO9 card, I got the same error:

io_config_space: Found 0 Qlogic devices BASEIO; expected 1 or more.
io_config_space failed:
RSLT io_config_spac FAIL diag_rc = 53
diag_io6confSpace_sanity: /hw/module/001c01/xtalk/15: FAILED
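
When comparing serial captures between card swaps, it can help to pull the key facts out of the log automatically. The sketch below parses only the four lines quoted above; the regular expressions are written against this exact wording and may not match other PROM revisions, and the `summarize` function is a made-up helper, not part of any SGI tool.

```python
import re

# Patterns written against the PROM lines quoted in this thread.
FOUND_RE = re.compile(r"io_config_space: Found (\d+) (\w+) devices (\w+); expected (\d+) or more")
RC_RE = re.compile(r"diag_rc\s*=\s*(\d+)")
FAILED_RE = re.compile(r"^(\S+): (/hw/\S+): FAILED")


def summarize(log: str) -> dict:
    """Extract device counts, the diag return code, and failed hardware paths."""
    summary = {"found": None, "expected_min": None, "diag_rc": None, "failed_paths": []}
    for raw in log.splitlines():
        line = raw.strip()
        match = FOUND_RE.search(line)
        if match:
            summary["found"] = int(match.group(1))
            summary["expected_min"] = int(match.group(4))
            continue
        match = RC_RE.search(line)
        if match:
            summary["diag_rc"] = int(match.group(1))
            continue
        match = FAILED_RE.search(line)
        if match:
            summary["failed_paths"].append(match.group(2))
    return summary
```

Running `summarize()` on two captures and comparing the dictionaries shows at a glance whether a swap changed anything.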


I gave up on this :(
kokoboi
08-30-2022, 11:14 AM
#28
RE: tezro problems
(05-02-2021, 09:29 AM)kokoboi Wrote:  I haven't powered up the tezro for a long time. The last time I've used the machine it booted IRIX fine.

First I noticed some VRM problem messages(don't know if this is a problem):
Quote:INFO: Cannot enable VRM: 9
INFO: Cannot enable VRM: 10
INFO: Cannot enable VRM: 11


SGI SN1 L1 Controller
Firmware Image B: Rev. 1.40.6, Built 01/06/2006 13:16:50


001c01-L1>

This is normal -- I think those VRMs might be on an IP59 board, and as you've got the IP53, you won't have them.

Quote:But it doesn't detect the IO9 board also:
001c01-L1>pwr up
001c01-L1>
entering console mode  001c01 CPU0, <CTRL_T> to escape to L1
Starting PROM Boot process
io_config_space: Found 0 Qlogic devices BASEIO; expected 1 or more.
io_config_space failed:
RSLT io_config_spac FAIL                diag_rc = 53
diag_io6confSpace_sanity: /hw/module/001c01/xtalk/15: FAILED


IP35 PROM SGI Version 6.210  built 02:33:51 PM Aug 26, 2004
Testing/Initializing memory ...............            DONE
Copying PROM code to memory ...............            DONE
Discovering local IO ......................            DONE
Discovering NUMAlink connectivity .........
Local hub NUMAlink is down.
*** Local network link down
DONE
Found 1 objects (1 hubs, 0 routers) in 5887 usec
Waiting for peers to complete discovery....            DONE
No other nodes present; becoming global master
Global master is /hw/rack/001/bay/01
Intializing any CPUless nodes..............            DONE
Checking partitioning information .........            DONE
No other nodes present; becoming partition master
Local slave entering slave loop
Local slave entering slave loop
Local slave entering slave loop
Loading BASEIO prom .......................            DONE

BASEIO PROM Monitor SGI Version 6.210  built 02:30:38 PM Aug 26, 2004 (BE64)
4 CPUs on 1 nodes found.

NVRAM checksum is incorrect: reinitializing.
Automatic update of PROM environment disabled
Graphics diagnostics

Odyssey board #0 found on nasid 0
Running Odyssey xtalk sanity diag...
        Board version 1 - Buzz revision 3B
        On board sdram size: 128 Mb
        Cas latency: CAS 3
        4 banks by sdram module
Running Odyssey Buzz registers diag...
Device passed diagnostics

Installing PROM Device drivers ............
On-board (IO9) tigon3 1000BaseT interface
Base I/O Ethernet set to /dev/ethernet/tg0
Installing Graphics Console...
graphics install: searching for pipe 0
Probing IOC4 ATA adapter 2
IOC4 RevId = 83
Initializing PROM Device drivers ..........
  Initializing Base I/O Ethernet Interface...Failed.  MII Status Register = 0x7949
Done.
  ---------------Interface Configuration Summary----------------
  ASIC|Revision|MAC Address      : 5701|B5|08:00:69:11:dc:4b
  Link Negotiation|Advertisement  : On|<H10 F10 H100 F100 F1000>
  Link|Speed|Duplex|Rx/Tx FlowCtrl: Down|10|Half|Off/Off
  --------------------------------------------------------------
DONE
Cannot connect to keyboard -- check the cable.
Cannot open /dev/input/ioc4pckm0 for input
Cannot connect to keyboard -- check the cable.
Cannot open /dev/input/ioc4pckm0 for input
Checking hardware inventory ...............
WARNING: hardware inventory is invalid.  Reinitializing...
Writing 5 records..... DONE
Updated new configuration. Wrote 5 records.

**** System Configuration and Diagnostics Summary ****
CONFIG:
        No. of NODEs enabled    = 1
        No. of NODEs disabled  = 0
        No. of CPUs enabled    = 4
        No. of CPUs disabled    = 0
        Mem enabled            = 4096 MB
        Mem disabled            = 0 MB
        No. of RTRs enabled    = 0
        No. of RTRs disabled    = 0

DIAG RESULTS:
        ALL DIAGS PASSED.
**** End System Configuration and Diagnostics Summary ****

Any help will be appreciated.

Best regards,
kokoboi

This might sound dumb, but have you checked the cable connection on the node board side to ensure the HDD bays are actually connected to the cable you've plugged into the IO9?  Check the power as well... if the drives aren't powering up, they won't be seen.

Also, the Dallas on the PCI side of the Tezro, I'm pretty sure that's used only for the L1 controller -- if this is bad, you'll know it, the LCD display will not power on or you'll get different errors on the console.

Indigo2 IMPACT  : R10K-195MHz, 1GB RAM, 146GB 15K, CD-ROM, AudioDAT, MaxImpact w/ TRAM.  IRIX 6.5.22

O2 : R12K-400MHz, 1GB RAM, 300GB 15K, DVD-ROM, CRM Graphics, AV1/2 Media Boards & O2 Cam, DV-Link, FPA & SW1600.  IRIX 6.5.30

 : 2 x R14K-600MHz, 6GB RAM, V12 Graphics, PCI Shoebox.  IRIX 6.5.30

IBM  : 7012-39H, 7043-140

(This post was last modified: 08-30-2022, 04:14 PM by chulofiasco.)
chulofiasco
Hardware Junkie

Trade Count: (0)
Posts: 328
Threads: 51
Joined: May 2019
Location: New York, NY
08-30-2022, 04:12 PM
#29
RE: tezro problems
I double-checked the SCSI bay, but even with the SCSI cable disconnected I get the same error.
What has puzzled me is that I'm able to boot from the IDE CD-ROM and the LAN works; only the Qlogic SCSI controller is not detected.
And this problem has occurred with 2 different IO9 boards, so it should be something else.
I installed a new Dallas on the mainboard and adjusted the serial to the one on the back of the machine, and it keeps the new settings.

One possible solution is to hook up another PCI HVD SCSI controller, but I'll leave that as a last resort.
(This post was last modified: 08-30-2022, 06:43 PM by kokoboi.)
kokoboi
08-30-2022, 06:36 PM
#30
RE: tezro problems
Okay, clearly this is a very unusual situation. I’m going to just put this out there even though I have no basis for it: I think you have a power or VRM problem.

In my mind, unless you have a cracked solder joint, a lifted trace, or a damaged board (scratches from improper handling, things like that), a local power problem is what would leave ICs unresponsive or otherwise not operational.

Now, I fully realize that a lot of the Qlogic stuff is BGA, so it’s not easy to just find the right pin, put a multimeter on it, and say yes, I’m getting correct VCC on this chip.

But since you’ve gone through multiple cards, I think you have a power issue either on the mainboard or at the socket.

Here’s what I’d like you to try. Before you start swapping things, can you get to an L1 prompt to issue the ENV command?

Because you’re, maybe, not booting far enough to know whether the ENV system is on, off, or working at all when it comes to the monitoring.

Let’s start with what your monitoring says for everything. If you’ve already posted this information, then I just don’t remember seeing it. If you could re-post it for me, let’s start there.

I haven’t taken a good look at an IO9, but mine is in fact removed right now, so I can take a glance at it easily enough and see if there are any voltage conversion circuits on it that convert power from the slot into power for the IO9 itself. If I don’t see such a thing, then I’m going to assume that power comes from the slot and is used directly by the ICs on the IO9.

I’m going to propose that a voltage being sent to your SCSI controller is missing or low. Basically, the chip is simply not turning on, and that’s why it’s gone. I would hope the opposite isn’t true: that it’s being sent too much power and is actually being damaged.

But either way this is where I would start.

Let’s also see if we can find the Qlogic chip on the IO9 and find its data sheet. That’ll tell us exactly what voltage it takes and potentially what VRM we’re using. I do know that some of the removable or semi-removable VRMs are used to produce the lower voltages. But I’d normally expect the Qlogic chip to use the normal power rails from the power supply.

Why don’t we proceed under this assumption and see if we can find ourselves an errant or missing voltage that might explain loss of power on part of the IO9 board?
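
Once readings are in hand (from the L1 monitoring or a multimeter), flagging an errant rail is simple arithmetic. Here is a minimal sketch, assuming you supply the nominal voltages yourself from the data sheet or the L1 output; the 5% default tolerance is a placeholder, not a figure from any Qlogic or SGI document, and both function names are invented for this example.

```python
def rail_ok(nominal: float, measured: float, tolerance: float = 0.05) -> bool:
    """True when measured is within +/- tolerance (a fraction) of nominal."""
    return abs(measured - nominal) <= abs(nominal) * tolerance


def out_of_spec(readings: dict, tolerance: float = 0.05) -> list:
    """readings maps a rail label to (nominal_volts, measured_volts).

    Returns the labels whose measured value falls outside the tolerance band.
    """
    return [
        label
        for label, (nominal, measured) in readings.items()
        if not rail_ok(nominal, measured, tolerance)
    ]
```

As a made-up illustration, `out_of_spec({"3.3V": (3.3, 2.9), "5V": (5.0, 5.02)})` flags only the 3.3V entry, since 2.9 V is more than 5% below nominal. Both a low and a high reading are reported, which covers the undervoltage and overvoltage cases raised above.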
weblacky
08-30-2022, 07:02 PM

