My Octane just died -
Shiunbird - 12-01-2023
With quite bizarre timing, I was just taking photos of my lightbar for Weblacky and thought, hey, the Octane is on, I'll play some Doom. This was literally 15 seconds after I sent the pictures.
I went to switch user, clicked Logout and the screen scrolled up a bit, distorted and went blank.
I hit the reset button and got a red (I think; colorblind here) lightbar, so I attached the serial cable and got:
Code:
Exception: <vector=Normal>
Status register: 0x34005083<CU1,CU0,FR,IM7,IM5,IPL=???,KX,MODE=KERNEL>
Cause register: 0xc01c<CE=0,IP8,IP7,EXC=DBE>
Exception PC: 0xffffffff9fc3f020, Exception RA: 0x0
Data Bus error
HEART ISR : 0x8000000000000000<HEART_EXC>
HEART IMSR: 0x8000000000000000<HEART_EXC>
Cause: 0x10000<WIDGET_ERR>
Widget Error type: 0x8<PIO_RD_TIMEOUT>
PIO rd timeout address: 0xc02c<CPU=0,IO_SPACE=0x0,DIDN=0xc,ADDR=0x2c>
VID 0's saved user regs in hex (gpda=0xa800000020f1a3a8):
arg: 1 195c0400000000 1 195c0408000000
1800000000 ffffffffffff0000 195d0404120c08 0
tmp: 900000001c030000 5 3 8000000
sve: 900000001c020000 900000001c120000 19120400000000 2
195d0400000244 195d04040a1c10 195d0400000344 0
t8 10000000 t9 0 at 195c0400000024
v0 1 v1 1 k1 ffffffffbad11bad
gp 0 fp 195c0428000000 sp a800000020ffed10 ra 0
PANIC: Unexpected exception
[Press reset or ENTER to restart.]
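For readers following along, the Cause register line in the dump can be cross-checked by hand. Below is a minimal sketch, assuming the standard MIPS Cause register layout (ExcCode in bits 6:2, interrupt-pending bits in 15:8, CE in bits 29:28) and the IP1..IP8 naming the PROM output appears to use. The function name `decode_cause` and the lookup table are made up for illustration and are not part of any SGI tool.

```python
# Hypothetical helper: decode a MIPS Cause register value.
# Assumes the standard layout: ExcCode = bits 6:2, IP = bits 15:8, CE = bits 29:28.
EXC_NAMES = {
    0: "Int", 1: "Mod", 2: "TLBL", 3: "TLBS", 4: "AdEL", 5: "AdES",
    6: "IBE", 7: "DBE", 8: "Sys", 9: "Bp", 10: "RI", 11: "CpU",
    12: "Ov", 13: "Tr",
}

def decode_cause(cause):
    exc = (cause >> 2) & 0x1F      # exception code
    ip = (cause >> 8) & 0xFF       # interrupt-pending bits
    ce = (cause >> 28) & 0x3       # coprocessor number for CpU exceptions
    # Name the pending bits IP8..IP1, highest first, to match the dump.
    pending = [f"IP{n + 1}" for n in range(7, -1, -1) if ip & (1 << n)]
    return {"CE": ce, "pending": pending, "EXC": EXC_NAMES.get(exc, f"code {exc}")}

print(decode_cause(0xC01C))
```

With the value from the dump (0xc01c) this gives CE=0, IP8 and IP7 pending, and exception code 7 (DBE, a bus error on a data access), which matches the `<CE=0,IP8,IP7,EXC=DBE>` annotation the PROM printed.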
The error comes less than half a second after pushing the power button.
Pressing enter does nothing, reset gives me the same error.
I googled around a bit, quickly checked the forums... and...
Before I go and replace my graphics card (it's the only spare part I own)... since I hate the connector, I thought I'd ask around first.
I am sad. =(
Forgot to mention, I saw this post:
https://forums.irixnet.org/archive/index.php?thread-2405.html
RE: My Octane just died -
johnnym - 12-01-2023
(12-01-2023, 09:22 PM)Shiunbird Wrote: I am sad. =(
I can fully understand that.
Does it work again after it was powered off (plus removing power and hitting the power button to "empty" the PSU) and has cooled down for an hour maybe?
If you decide to swap the graphics card, you could first try running the machine with the possibly faulty card removed, to see if the error is still there. If it is, that would rule out the graphics card IMO.
RE: My Octane just died -
weblacky - 12-01-2023
This looks very similar in many respects to the link you posted. I'd check the fans or do an outright replacement of all the fans! I looked through Octane case fans recently and I KNOW the HEART ASIC fan is still made, and I thought the others were too.
Perhaps, as a preventative measure, a "fan kit" needs to be made so Octane users can just whole-hog replace their fans after all this time, since the Octane should have had thermal monitoring... but doesn't. They run so hot that the fans are critical.
But I'd agree it looks like damage has likely been done. I do hope it's the graphics, as implied, so you can get back to work.
RE: My Octane just died -
vishnu - 12-02-2023
And just as a side note, don't ever, ever, *ever* shutdown an Octane from its power switch. Use 'shutdown -p' and then sit there for a minute until it asks you if you really mean to shut it down (or not), and type "yes." The Octane will then shut itself off. The power button's only valid use is to turn an Octane on.
RE: My Octane just died -
weblacky - 12-02-2023
(12-02-2023, 04:40 AM)vishnu Wrote: And just as a side note, don't ever, ever, *ever* shutdown an Octane from its power switch. Use 'shutdown -p' and then sit there for a minute until it asks you if you really mean to shut it down (or not), and type "yes." The Octane will then shut itself off. The power button's only valid use is to turn an Octane on.
I've never heard this before. May I inquire as to your experience that led to this advice?
RE: My Octane just died -
vishnu - 12-02-2023
(12-02-2023, 07:28 AM)weblacky Wrote: I've never heard this before. May I inquire as to your experience that led to this advice?
While I fully admit, XFS is probably one of the best journaling filesystems, filesystem corruption is not your friend. Let the kernel take care of shutting down. If IRIX works the way it's supposed to, pushing the power button will transfer the shutdown to the kernel, but it's not worth the risk. If you really want to experiment with how good XFS is, just yank the power plug, and see how well it comes back up.
RE: My Octane just died -
weblacky - 12-02-2023
(12-02-2023, 09:25 PM)vishnu Wrote: While I fully admit, XFS is probably one of the best journaling filesystems, filesystem corruption is not your friend. Let the kernel take care of shutting down. If IRIX works the way it's supposed to, pushing the power button will transfer the shutdown to the kernel, but it's not worth the risk. If you really want to experiment with how good XFS is, just yank the power plug, and see how well it comes back up.
So if I'm understanding you correctly, you've had experiences where you press the power button and, instead of the kernel starting to shut down, the message doesn't get passed along correctly and the system board just times out and pulls power? Is that what you're saying? Otherwise I'm a little confused about how a software shutdown of the kernel via the button is different from a software shutdown via the command. So I can only assume you're saying there's a chance that the command doesn't make it to the kernel, but the system shuts down anyway after some number of seconds?
RE: My Octane just died -
vishnu - 12-04-2023
All I'm saying is, the power button is only there to turn the machine on. Proper Unix etiquette has always been to have the kernel turn the machine off via the 'shutdown' command which needs to be issued as root. Now, admittedly, if you don't have root access, you do have to use the power button, and if everything goes according to plan, the kernel will take over and do what it needs to when it sees the button has been actuated. To me, it's just not worth the risk. That's why, for example on Windows, you have to push and hold the power button down for five seconds to get the kernel to take you seriously and shut the system down.
RE: My Octane just died -
Shiunbird - 12-04-2023
Okay - I am wrapping up work in a bit, will try without the graphics.
AFAIK, the fans are ok. I don't notice any difference from when I got the Octane 2 years ago, in terms of air flow. It sits on the desk and I do a dust off every 6 months, last one having been 3 months ago.
Thanks for the tips - will report back.
RE: My Octane just died -
Shiunbird - 12-04-2023
SHE LIVES!
Test 1: Boot without graphics, serial console. All fine.
Test 2: I wanted to test my current card but without TRAM. When I went about removing the TRAM module, I found out it's bolted to the frame with a plastic seal. I didn't want to break it off, so I swapped cards and she is fine.
Fans: They blow more air than my girlfriend's hairdryer, so I doubt the problem is there. I should probably get spares, though.
One thing I found out browsing preterhuman is that Octanes are sensitive to power cycles due to thermal expansion, and I usually use my Octane 1h-2h per day. =( We live, we learn. Disaster averted.
Life without TRAM will suck, but hey...
Thanks for the tips!