o350 env chip locations
#11
RE: o350 env chip locations
Great, thank you both for confirming the PSU voltages! And jwhat thanks for those awesome images! Based on jan-jaap's env log I'm going to try an idea and just pull the single PSU, whichever one is responsible for 12V #1. Might be enough, or maybe it really is just a bad power supply.

But either way, the chips look like they're in an easy position to remove and replace. Lots of room on these boards. Some kapton tape and chipquik should do the job.
vvuk
O2

Trade Count: (0)
Posts: 43
Threads: 4
Joined: Aug 2021
Location: California
03-01-2022, 05:39 AM
#12
RE: o350 env chip locations
Hi vvuk,

Looking at the picture closely, I think there might be a third DS1780 at N4K4, just near the part number label.

Inspection of another spare board confirms this. I also checked the back of the board and could not see any more DS1780 chips on the underside.

So the O350 board appears to have 3 DS1780 env monitoring chips on board.

Cheers from Oz,

jwhat/John
(This post was last modified: 03-01-2022, 12:22 PM by jwhat.)
jwhat
Octane/O350/Fuel User

Trade Count: (0)
Posts: 513
Threads: 29
Joined: Jul 2018
Location: Australia
03-01-2022, 12:17 PM
#13
RE: o350 env chip locations
There is something very strange going on here.  I have three o350 bricks.  The ambient temp is 19C, and it's been a little warmer earlier in the day.  Of the three bricks, the first one hasn't been powered up in days.  The second one has been up and down a bit as I've been working on the third one that has the env issue (they're linked together right now).  The machines are all currently powered down.  These are all 100% bogus:

Code:
one:
                              Advisory   Critical   Fault      Current
Description       State       Temp       Temp       Temp       Temp
----------------- ----------  ---------  ---------  ---------  ---------
0 INTERFACE 0      Wait Pwr   31C/ 87F   48C/118F   55C/131F    6C/ 42F
1 INTERFACE 1      Wait Pwr   31C/ 87F   48C/118F   55C/131F    6C/ 42F
2 INTERFACE 2      Wait Pwr   31C/ 87F   48C/118F   55C/131F   16C/ 60F
3 PCI RISER        Wait Pwr   31C/ 87F   48C/118F   55C/131F   15C/ 59F
4 ODYSSEY        <not present>
5 NODE             Wait Pwr   31C/ 87F   48C/118F   55C/131F   15C/ 59F
6 BEDROCK          Wait Pwr  Not currently available


two:
                              Advisory   Critical   Fault      Current
Description       State       Temp       Temp       Temp       Temp
----------------- ----------  ---------  ---------  ---------  ---------
0 INTERFACE 0      Wait Pwr    [Autofan Control]    75C/167F   37C/ 98F
1 INTERFACE 1      Wait Pwr    [Autofan Control]    75C/167F   36C/ 96F
2 INTERFACE 2      Wait Pwr    [Autofan Control]    75C/167F   38C/100F
3 PCI RISER        Wait Pwr    [Autofan Control]    75C/167F   28C/ 82F
4 ODYSSEY          Wait Pwr    [Autofan Control]    75C/167F   31C/ 87F
5 NODE             Wait Pwr    [Autofan Control]    75C/167F   32C/ 89F


three:
                              Advisory   Critical   Fault      Current
Description       State       Temp       Temp       Temp       Temp
----------------- ----------  ---------  ---------  ---------  ---------
0 INTERFACE 0      Disabled   Disabled   Disabled   Disabled    9C/ 48F
1 INTERFACE 1      Disabled   Disabled   Disabled   Disabled    8C/ 46F
2 INTERFACE 2      Disabled   Disabled   Disabled   Disabled   77C/170F
3 PCI RISER        Disabled   Disabled   Disabled   Disabled   12C/ 53F
4 ODYSSEY        <not present>
5 NODE             Disabled   Disabled   Disabled   Disabled   18C/ 64F
6 BEDROCK          Disabled   Disabled   Disabled   Disabled    4C/ 39F

There is no way anything in that first brick is 6C. Likewise, there is no way anything in that second brick is close to 40C; I can physically touch the DS1780 and it's room temperature. As for the third, suspect brick, nothing in it is 77C for sure :) Powering up doesn't change the numbers, so I don't think it's an issue with whether the chips are actually getting power or not (I assume they must be, through aux, otherwise the voltage readings wouldn't work).

Do people's o350 temp readings actually look sane? jwhat & jan-jaap, all of your readings look normal (and most importantly, your two bricks jan-jaap match). I'm at a loss.
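
One way to make this comparison less eyeball-driven is to pull the Current Temp column out of each brick's env output and flag anything far from room temperature. A minimal sketch, assuming the column-aligned row layout of the listing above (the last `NNC/NNNF` pair on a row is the current reading) and a made-up 5C tolerance for a brick that has been off long enough to settle; the function names are illustrative only:

```python
import re

# Matches a Celsius/Fahrenheit pair such as "31C/ 87F" or " 6C/ 42F".
TEMP_PAIR = re.compile(r"(-?\d+)C/\s*(-?\d+)F")
# Matches a sensor row: index, sensor name, then a gap of two or more spaces.
ROW = re.compile(r"\s*\d+\s+([A-Z][A-Z0-9 ]*?)\s{2,}")

def current_temps(env_text):
    """Return {sensor name: current temp in C} for rows that carry a reading."""
    temps = {}
    for line in env_text.splitlines():
        row = ROW.match(line)
        pairs = TEMP_PAIR.findall(line)
        if row and pairs:
            # The last C/F pair on a row is the Current Temp column.
            temps[row.group(1)] = int(pairs[-1][0])
    return temps

def suspicious(temps, ambient_c, tolerance_c=5):
    """Return the readings that differ from ambient by more than the tolerance."""
    return {name: t for name, t in temps.items()
            if abs(t - ambient_c) > tolerance_c}

# Two rows copied from brick three above; 19C is the stated room temperature.
SAMPLE = """\
2 INTERFACE 2      Disabled   Disabled   Disabled   Disabled   77C/170F
5 NODE             Disabled   Disabled   Disabled   Disabled   18C/ 64F
"""
print(suspicious(current_temps(SAMPLE), ambient_c=19))  # {'INTERFACE 2': 77}
```

Rows marked `<not present>` or `Not currently available` carry no C/F pair and are skipped. The tolerance only makes sense for a machine that is powered down; a running brick will legitimately read above ambient.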
vvuk
O2

Trade Count: (0)
Posts: 43
Threads: 4
Joined: Aug 2021
Location: California
03-02-2022, 01:22 AM
#14
RE: o350 env chip locations
Hi vvuk,

Yes, the env results do look incorrect. Looking at mine I was also a bit wary, but I think the variation could be due to the different machine types:

>> M200XXXX-001-L2>env
>> 001c01:
>> Environmental monitoring is enabled and running.
>>
>> Description State Warning Limits Fault Limits Current
>> -------------- ---------- ----------------- ----------------- -------
>> 1.8V Enabled 10% 1.62/ 1.98 20% 1.44/ 2.16 1.777
>> 12V Enabled 10% 10.80/ 13.20 20% 9.60/ 14.40 12.063
>> 12V #2 Enabled 10% 10.80/ 13.20 20% 9.60/ 14.40 12.063
>> 3.3V Enabled 10% 2.97/ 3.63 20% 2.64/ 3.96 3.320
>> 12V IO Enabled 10% 10.80/ 13.20 20% 9.60/ 14.40 12.000
>> 5V AUX Enabled 10% 4.50/ 5.50 20% 4.00/ 6.00 5.096
>> 3.3V AUX Enabled 10% 2.97/ 3.63 20% 2.64/ 3.96 3.302
>> PCI 5V AUX Enabled 10% 4.50/ 5.50 20% 4.00/ 6.00 5.070
>> PCI 3.3V Enabled 10% 2.97/ 3.63 20% 2.64/ 3.96 3.285
>> PCI 2.5V Enabled 10% 2.25/ 2.75 20% 2.00/ 3.00 2.496
>> PCI 5V Enabled 10% 4.50/ 5.50 20% 4.00/ 6.00 4.940
>> XIO 12V BIAS Enabled 10% 10.80/ 13.20 20% 9.60/ 14.40 12.000
>> XIO 5V Enabled 10% 4.50/ 5.50 20% 4.00/ 6.00 4.914
>> XIO 2.5V Enabled 10% 2.25/ 2.75 20% 2.00/ 3.00 2.457
>> XIO 3.3V AUX Enabled 10% 2.97/ 3.63 20% 2.64/ 3.96 3.285
>> IP53 3.3V AUX Enabled 10% 2.97/ 3.63 20% 2.64/ 3.96 3.302
>> IP53 5V AUX Enabled 10% 4.50/ 5.50 20% 4.00/ 6.00 5.070
>> IP53 12V Enabled 10% 10.80/ 13.20 20% 9.60/ 14.40 12.000
>> IP53 VCPU Enabled 10% 1.13/ 1.38 20% 1.00/ 1.50 1.241
>> IP53 SRAM Enabled 10% 2.25/ 2.75 20% 2.00/ 3.00 2.457
>> IP53 1.5V Enabled 10% 1.35/ 1.65 20% 1.20/ 1.80 1.495
>>
>> Description State Warning RPM Current RPM
>> --------------- ---------- ----------- -----------
>> FAN 0 EXHST 1 Enabled 1980 2410
>> FAN 1 PS Enabled 3200 4891
>> FAN 2 PCI 1 Enabled 1980 2445
>> FAN 3 PCI 2 Enabled 1980 2657
>> FAN 4 ODY Enabled 1679 4066
>>
>> Advisory Critical Fault Current
>> Description State Temp Temp Temp Temp
>> ----------------- ---------- --------- --------- --------- ---------
>> 0 INTERFACE 0 Enabled [Autofan Control] 75C/167F 43C/109F
>> 1 INTERFACE 1 Enabled [Autofan Control] 75C/167F 44C/111F
>> 2 INTERFACE 2 Enabled [Autofan Control] 75C/167F 36C/ 96F
>> 3 PCI RISER Enabled [Autofan Control] 75C/167F 36C/ 96F
>> 4 ODYSSEY Enabled [Autofan Control] 75C/167F 36C/ 96F
>> 5 NODE Enabled [Autofan Control] 75C/167F 37C/ 98F
>> 6 BEDROCK Enabled [Autofan Control] 75C/167F 50C/122F
>>
>> Zone Temp Target Current Zone Fan Curr/Min
>> Zone Name State Sensors Average Average Index Fan %
>> --------- -------- ------------ -------- -------- --------- ---------
>> NODE Enabled 0,1,2,5,6 47C/116F 42C/107F 0 18%/ 18%
>> PS Enabled 0,1,2,5,6 47C/116F 42C/107F 1 55%/ 55%
>> PCI Enabled 3 45C/113F 36C/ 96F 2,3 55%/ 55%
>> ODY Enabled 4 48C/118F 36C/ 96F 4 55%/ 55%
>>
>> 001c02:
>> Environmental monitoring is enabled and running.
>>
>> Description State Warning Limits Fault Limits Current
>> -------------- ---------- ----------------- ----------------- -------
>> 1.8V Enabled 10% 1.62/ 1.98 20% 1.44/ 2.16 1.791
>> 12V Enabled 10% 10.80/ 13.20 20% 9.60/ 14.40 12.063
>> 12V #2 Enabled 10% 10.80/ 13.20 20% 9.60/ 14.40 12.063
>> 3.3V Enabled 10% 2.97/ 3.63 20% 2.64/ 3.96 3.320
>> 12V IO Enabled 10% 10.80/ 13.20 20% 9.60/ 14.40 12.063
>> 5V AUX Enabled 10% 4.50/ 5.50 20% 4.00/ 6.00 5.070
>> 3.3V AUX Enabled 10% 2.97/ 3.63 20% 2.64/ 3.96 3.268
>> PCI 5V AUX Enabled 10% 4.50/ 5.50 20% 4.00/ 6.00 5.070
>> PCI 3.3V Enabled 10% 2.97/ 3.63 20% 2.64/ 3.96 3.302
>> PCI 2.5V Enabled 10% 2.25/ 2.75 20% 2.00/ 3.00 2.509
>> PCI 5V Enabled 10% 4.50/ 5.50 20% 4.00/ 6.00 4.940
>> XIO 12V BIAS <not present>
>> XIO 5V <not present>
>> XIO 2.5V <not present>
>> XIO 3.3V AUX <not present>
>> IP53 3.3V AUX Enabled 10% 2.97/ 3.63 20% 2.64/ 3.96 3.268
>> IP53 5V AUX Enabled 10% 4.50/ 5.50 20% 4.00/ 6.00 5.070
>> IP53 12V Enabled 10% 10.80/ 13.20 20% 9.60/ 14.40 12.000
>> IP53 VCPU Enabled 10% 1.13/ 1.38 20% 1.00/ 1.50 1.241
>> IP53 SRAM Enabled 10% 2.25/ 2.75 20% 2.00/ 3.00 2.470
>> IP53 1.5V Enabled 10% 1.35/ 1.65 20% 1.20/ 1.80 1.480
>>
>> Description State Warning RPM Current RPM
>> --------------- ---------- ----------- -----------
>> FAN 0 EXHST 1 Enabled 1980 2280
>> FAN 1 PS Enabled 3200 5037
>> FAN 2 PCI 1 Enabled 1980 2360
>> FAN 3 PCI 2 Enabled 1980 2596
>> FAN 4 ODY Enabled 1679 1844
>>
>> Advisory Critical Fault Current
>> Description State Temp Temp Temp Temp
>> ----------------- ---------- --------- --------- --------- ---------
>> 0 INTERFACE 0 Enabled [Autofan Control] 75C/167F 43C/109F
>> 1 INTERFACE 1 Enabled [Autofan Control] 75C/167F 45C/113F
>> 2 INTERFACE 2 Enabled [Autofan Control] 75C/167F 36C/ 96F
>> 3 PCI RISER Enabled [Autofan Control] 75C/167F 36C/ 96F
>> 4 ODYSSEY <not present>
>> 5 NODE Enabled [Autofan Control] 75C/167F 38C/100F
>> 6 BEDROCK Enabled [Autofan Control] 75C/167F 48C/118F
>>
>> Zone Temp Target Current Zone Fan Curr/Min
>> Zone Name State Sensors Average Average Index Fan %
>> --------- -------- ------------ -------- -------- --------- ---------
>> EXHST Enabled 0,1,2,5,6 47C/116F 42C/107F 0 18%/ 18%
>> PS Enabled 0,1,2,5,6 47C/116F 42C/107F 1 55%/ 55%
>> PCI Enabled 3 45C/113F 36C/ 96F 2,3 55%/ 55%
>> SNO Enabled 4 35C/ 95F 0C/ 32F 4 55%/ 55%
>>
>> 001c03:
>> Environmental monitoring is enabled and running.
>>
>> Description State Warning Limits Fault Limits Current
>> -------------- ---------- ----------------- ----------------- -------
>> 1.8V Enabled 10% 1.62/ 1.98 20% 1.44/ 2.16 1.777
>> 12V Enabled 10% 10.80/ 13.20 20% 9.60/ 14.40 12.063
>> 12V #2 Enabled 10% 10.80/ 13.20 20% 9.60/ 14.40 12.000
>> 3.3V Enabled 10% 2.97/ 3.63 20% 2.64/ 3.96 3.320
>> 12V IO Enabled 10% 10.80/ 13.20 20% 9.60/ 14.40 12.063
>> 5V AUX Enabled 10% 4.50/ 5.50 20% 4.00/ 6.00 5.044
>> 3.3V AUX Enabled 10% 2.97/ 3.63 20% 2.64/ 3.96 3.285
>> PCI 5V AUX Enabled 10% 4.50/ 5.50 20% 4.00/ 6.00 5.044
>> PCI 3.3V Enabled 10% 2.97/ 3.63 20% 2.64/ 3.96 3.320
>> PCI 2.5V Enabled 10% 2.25/ 2.75 20% 2.00/ 3.00 2.496
>> PCI 5V Enabled 10% 4.50/ 5.50 20% 4.00/ 6.00 4.940
>> XIO 12V BIAS <not present>
>> XIO 5V <not present>
>> XIO 2.5V <not present>
>> XIO 3.3V AUX <not present>
>> IP59 3.3V AUX Enabled 10% 2.97/ 3.63 20% 2.64/ 3.96 3.285
>> IP59 5V AUX Enabled 10% 4.50/ 5.50 20% 4.00/ 6.00 5.070
>> IP59 12V Enabled 10% 10.80/ 13.20 20% 9.60/ 14.40 12.000
>> IP59 VCPU Enabled 10% 1.14/ 1.40 20% 1.02/ 1.52 1.283
>> IP59 SRAM Enabled 10% 2.25/ 2.75 20% 2.00/ 3.00 2.470
>> IP59 1.5V Enabled 10% 1.35/ 1.65 20% 1.20/ 1.80 1.495
>>
>> Description State Warning RPM Current RPM
>> --------------- ---------- ----------- -----------
>> FAN 0 EXHST 1 Enabled 1980 2636
>> FAN 1 EXHST 2 Enabled 1980 2636
>> FAN 2 PS Enabled 3200 4066
>> FAN 3 PCI 1 Enabled 1980 2576
>> FAN 4 PCI 2 Enabled 1980 2836
>> FAN 5 N0 LEFT Enabled 1980 4054
>> FAN 6 N0 CNTR Enabled 1980 3797
>> FAN 7 N0 RIGHT Enabled 1980 4109
>>
>> Advisory Critical Fault Current
>> Description State Temp Temp Temp Temp
>> ----------------- ---------- --------- --------- --------- ---------
>> 0 INTERFACE 0 Enabled 31C/ 87F 48C/118F 55C/131F 20C/ 68F
>> 1 INTERFACE 1 Enabled 31C/ 87F 48C/118F 55C/131F 20C/ 68F
>> 2 INTERFACE 2 Enabled 31C/ 87F 48C/118F 55C/131F 21C/ 69F
>> 3 PCI RISER Enabled 31C/ 87F 48C/118F 55C/131F 24C/ 75F
>> 4 ODYSSEY <not present>
>> 5 NODE Enabled 31C/ 87F 48C/118F 55C/131F 23C/ 73F
>> 6 BEDROCK Enabled 31C/ 87F 48C/118F 55C/131F 22C/ 71F
>>
>> 001c04:
>> Environmental monitoring is enabled and running.
>>
>> Description State Warning Limits Fault Limits Current
>> -------------- ---------- ----------------- ----------------- -------
>> 1.8V Enabled 10% 1.62/ 1.98 20% 1.44/ 2.16 1.777
>> 12V Enabled 10% 10.80/ 13.20 20% 9.60/ 14.40 12.063
>> 12V #2 Enabled 10% 10.80/ 13.20 20% 9.60/ 14.40 12.063
>> 3.3V Enabled 10% 2.97/ 3.63 20% 2.64/ 3.96 3.337
>> 12V IO Enabled 10% 10.80/ 13.20 20% 9.60/ 14.40 12.063
>> 5V AUX Enabled 10% 4.50/ 5.50 20% 4.00/ 6.00 5.070
>> 3.3V AUX Enabled 10% 2.97/ 3.63 20% 2.64/ 3.96 3.302
>> PCI 5V AUX Enabled 10% 4.50/ 5.50 20% 4.00/ 6.00 5.096
>> PCI 3.3V Enabled 10% 2.97/ 3.63 20% 2.64/ 3.96 3.320
>> PCI 2.5V Enabled 10% 2.25/ 2.75 20% 2.00/ 3.00 2.509
>> PCI 5V Enabled 10% 4.50/ 5.50 20% 4.00/ 6.00 4.966
>> XIO 12V BIAS <not present>
>> XIO 5V <not present>
>> XIO 2.5V <not present>
>> XIO 3.3V AUX <not present>
>> IP59 3.3V AUX Enabled 10% 2.97/ 3.63 20% 2.64/ 3.96 3.302
>> IP59 5V AUX Enabled 10% 4.50/ 5.50 20% 4.00/ 6.00 5.070
>> IP59 12V Enabled 10% 10.80/ 13.20 20% 9.60/ 14.40 12.000
>> IP59 VCPU Enabled 10% 1.14/ 1.40 20% 1.02/ 1.52 1.283
>> IP59 SRAM Enabled 10% 2.25/ 2.75 20% 2.00/ 3.00 2.470
>> IP59 1.5V Enabled 10% 1.35/ 1.65 20% 1.20/ 1.80 1.495
>>
>> Description State Warning RPM Current RPM
>> --------------- ---------- ----------- -----------
>> FAN 0 EXHST 1 Enabled 1980 2700
>> FAN 1 EXHST 2 Enabled 1980 2721
>> FAN 2 PS Enabled 3200 4891
>> FAN 3 PCI 1 Enabled 1980 2657
>> FAN 4 PCI 2 Enabled 1980 2884
>> FAN 5 N0 LEFT Enabled 1980 4054
>> FAN 6 N0 CNTR Enabled 1980 3846
>> FAN 7 N0 RIGHT Enabled 1980 4166
>>
>> Advisory Critical Fault Current
>> Description State Temp Temp Temp Temp
>> ----------------- ---------- --------- --------- --------- ---------
>> 0 INTERFACE 0 Enabled 31C/ 87F 48C/118F 55C/131F 21C/ 69F
>> 1 INTERFACE 1 Enabled 31C/ 87F 48C/118F 55C/131F 20C/ 68F
>> 2 INTERFACE 2 Enabled 31C/ 87F 48C/118F 55C/131F 21C/ 69F
>> 3 PCI RISER Enabled 31C/ 87F 48C/118F 55C/131F 25C/ 77F
>> 4 ODYSSEY <not present>
>> 5 NODE Enabled 31C/ 87F 48C/118F 55C/131F 23C/ 73F
>> 6 BEDROCK Enabled 31C/ 87F 48C/118F 55C/131F 22C/ 71F
>>
>> 001c05:
>> Environmental monitoring is enabled and running.
>>
>> Description State Warning Limits Fault Limits Current
>> -------------- ---------- ----------------- ----------------- -------
>> 1.8V Enabled 10% 1.62/ 1.98 20% 1.44/ 2.16 1.777
>> 12V Enabled 10% 10.80/ 13.20 20% 9.60/ 14.40 12.063
>> 12V #2 Enabled 10% 10.80/ 13.20 20% 9.60/ 14.40 12.000
>> 3.3V Enabled 10% 2.97/ 3.63 20% 2.64/ 3.96 3.320
>> 12V IO Enabled 10% 10.80/ 13.20 20% 9.60/ 14.40 12.000
>> 5V AUX Enabled 10% 4.50/ 5.50 20% 4.00/ 6.00 5.070
>> 3.3V AUX Enabled 10% 2.97/ 3.63 20% 2.64/ 3.96 3.268
>> PCI 5V AUX Enabled 10% 4.50/ 5.50 20% 4.00/ 6.00 5.070
>> PCI 3.3V Enabled 10% 2.97/ 3.63 20% 2.64/ 3.96 3.302
>> PCI 2.5V Enabled 10% 2.25/ 2.75 20% 2.00/ 3.00 2.509
>> PCI 5V Enabled 10% 4.50/ 5.50 20% 4.00/ 6.00 4.966
>> XIO 12V BIAS <not present>
>> XIO 5V <not present>
>> XIO 2.5V <not present>
>> XIO 3.3V AUX <not present>
>> IP59 3.3V AUX Enabled 10% 2.97/ 3.63 20% 2.64/ 3.96 3.268
>> IP59 5V AUX Enabled 10% 4.50/ 5.50 20% 4.00/ 6.00 5.070
>> IP59 12V Enabled 10% 10.80/ 13.20 20% 9.60/ 14.40 11.938
>> IP59 VCPU Enabled 10% 1.14/ 1.40 20% 1.02/ 1.52 1.283
>> IP59 SRAM Enabled 10% 2.25/ 2.75 20% 2.00/ 3.00 2.444
>> IP59 1.5V Enabled 10% 1.35/ 1.65 20% 1.20/ 1.80 1.480
>>
>> Description State Warning RPM Current RPM
>> --------------- ---------- ----------- -----------
>> FAN 0 EXHST 1 Enabled 1980 2596
>> FAN 1 EXHST 2 Enabled 1980 2596
>> FAN 2 PS Enabled 3200 5192
>> FAN 3 PCI 1 Enabled 1980 2537
>> FAN 4 PCI 2 Enabled 1980 2884
>> FAN 5 N0 LEFT Enabled 1980 3896
>> FAN 6 N0 CNTR Enabled 1980 3797
>> FAN 7 N0 RIGHT Enabled 1980 4054
>>
>> Advisory Critical Fault Current
>> Description State Temp Temp Temp Temp
>> ----------------- ---------- --------- --------- --------- ---------
>> 0 INTERFACE 0 Enabled 31C/ 87F 48C/118F 55C/131F 20C/ 68F
>> 1 INTERFACE 1 Enabled 31C/ 87F 48C/118F 55C/131F 20C/ 68F
>> 2 INTERFACE 2 Enabled 31C/ 87F 48C/118F 55C/131F 21C/ 69F
>> 3 PCI RISER Enabled 31C/ 87F 48C/118F 55C/131F 26C/ 78F
>> 4 ODYSSEY <not present>
>> 5 NODE Enabled 31C/ 87F 48C/118F 55C/131F 23C/ 73F
>> 6 BEDROCK Enabled 31C/ 87F 48C/118F 55C/131F 22C/ 71F
>>
>> 001r06:
>> Environmental monitoring is enabled and running.
>>
>> Description State Warning Limits Fault Limits Current
>> -------------- ---------- ----------------- ----------------- -------
>> 12 BIAS Enabled 10% 10.80/ 13.20 20% 9.60/ 14.40 11.750
>> 2.5V Enabled 10% 2.25/ 2.75 20% 2.00/ 3.00 2.496
>> 3.3V AUX Enabled 10% 2.97/ 3.63 20% 2.64/ 3.96 3.268
>> 5V AUX Enabled 10% 4.50/ 5.50 20% 4.00/ 6.00 5.044
>>
>> Description State Warning RPM Current RPM
>> --------------- ---------- ----------- -----------
>> FAN 0 LEFT Enabled 2160 5443
>> FAN 1 RIGHT Enabled 2160 5625
>>
>> Advisory Critical Fault Current
>> Description State Temp Temp Temp Temp
>> ----------------- ---------- --------- --------- --------- ---------
>> 0 POWER Enabled 30C/ 86F 40C/104F 50C/122F 23C/ 73F
>>
>> 001c07:
>> Environmental monitoring is enabled and running.
>>
>> Description State Warning Limits Fault Limits Current
>> -------------- ---------- ----------------- ----------------- -------
>> 1.8V Enabled 10% 1.62/ 1.98 20% 1.44/ 2.16 1.777
>> 12V Enabled 10% 10.80/ 13.20 20% 9.60/ 14.40 12.063
>> 12V #2 Enabled 10% 10.80/ 13.20 20% 9.60/ 14.40 12.000
>> 3.3V Enabled 10% 2.97/ 3.63 20% 2.64/ 3.96 3.320
>> 12V IO Enabled 10% 10.80/ 13.20 20% 9.60/ 14.40 12.063
>> 5V AUX Enabled 10% 4.50/ 5.50 20% 4.00/ 6.00 5.070
>> 3.3V AUX Enabled 10% 2.97/ 3.63 20% 2.64/ 3.96 3.268
>> PCI 5V AUX Enabled 10% 4.50/ 5.50 20% 4.00/ 6.00 5.070
>> PCI 3.3V Enabled 10% 2.97/ 3.63 20% 2.64/ 3.96 3.302
>> PCI 2.5V Enabled 10% 2.25/ 2.75 20% 2.00/ 3.00 2.496
>> PCI 5V Enabled 10% 4.50/ 5.50 20% 4.00/ 6.00 4.940
>> XIO 12V BIAS <not present>
>> XIO 5V <not present>
>> XIO 2.5V <not present>
>> XIO 3.3V AUX <not present>
>> IP59 3.3V AUX Enabled 10% 2.97/ 3.63 20% 2.64/ 3.96 3.268
>> IP59 5V AUX Enabled 10% 4.50/ 5.50 20% 4.00/ 6.00 5.070
>> IP59 12V Enabled 10% 10.80/ 13.20 20% 9.60/ 14.40 12.000
>> IP59 VCPU Enabled 10% 1.14/ 1.40 20% 1.02/ 1.52 1.283
>> IP59 SRAM Enabled 10% 2.25/ 2.75 20% 2.00/ 3.00 2.457
>> IP59 1.5V Enabled 10% 1.35/ 1.65 20% 1.20/ 1.80 1.480
>>
>> Description State Warning RPM Current RPM
>> --------------- ---------- ----------- -----------
>> FAN 0 EXHST 1 Enabled 1980 2721
>> FAN 1 EXHST 2 Enabled 1980 2700
>> FAN 2 PS Enabled 3200 4166
>> FAN 3 PCI 1 Enabled 1980 2678
>> FAN 4 PCI 2 Enabled 1980 2960
>> FAN 5 N0 LEFT Enabled 1980 3947
>> FAN 6 N0 CNTR Enabled 1980 3797
>> FAN 7 N0 RIGHT Enabled 1980 4166
>>
>> Advisory Critical Fault Current
>> Description State Temp Temp Temp Temp
>> ----------------- ---------- --------- --------- --------- ---------
>> 0 INTERFACE 0 Enabled 31C/ 87F 48C/118F 55C/131F 21C/ 69F
>> 1 INTERFACE 1 Enabled 31C/ 87F 48C/118F 55C/131F 21C/ 69F
>> 2 INTERFACE 2 Enabled 31C/ 87F 48C/118F 55C/131F 20C/ 68F
>> 3 PCI RISER Enabled 31C/ 87F 48C/118F 55C/131F 25C/ 77F
>> 4 ODYSSEY <not present>
>> 5 NODE Enabled 31C/ 87F 48C/118F 55C/131F 24C/ 75F
>> 6 BEDROCK Enabled 31C/ 87F 48C/118F 55C/131F 23C/ 73F
>>
>>
>> re-entering system console mode (001c01 console), <CTRL_T> to escape to L2
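
The Warning and Fault columns in the voltage tables above are consistent with plain percentage bands around each rail's nominal value (for example, 12V with 10% gives 10.80/13.20 and with 20% gives 9.60/14.40). A minimal sketch of that arithmetic, handy for checking a logged reading by hand. The function names are illustrative and not part of the L1/L2 firmware, and treating the band edges as inclusive is an assumption:

```python
def limits(nominal, pct):
    """Return (low, high) for a +/- pct percent band around nominal volts."""
    delta = nominal * pct / 100.0
    return nominal - delta, nominal + delta

def classify(nominal, reading, warn_pct=10, fault_pct=20):
    """Classify a reading as 'ok', 'warning' or 'fault' against both bands."""
    f_lo, f_hi = limits(nominal, fault_pct)
    w_lo, w_hi = limits(nominal, warn_pct)
    if not f_lo <= reading <= f_hi:
        return "fault"
    if not w_lo <= reading <= w_hi:
        return "warning"
    return "ok"

lo, hi = limits(12.0, 10)
print(f"{lo:.2f}/{hi:.2f}")      # 10.80/13.20, as in the 12V rows
print(classify(12.0, 12.063))    # ok
```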

The room temperature is likely to be at least 25C, but some machines are reporting temperatures below this:
- low 20s: all of the 4 x 1G machines (with big blowers and extra fans on the IP59 board)
- 30-50C range: all of the 4 x 800MHz machines (with V12 & DM3 and so fewer blowers)
- low 20s: the NUMAlink router, which does not contain much circuitry and which I would expect to run cooler.


Cheers from Oz,


jwhat/John.
(This post was last modified: 03-02-2022, 09:52 AM by jwhat.)
jwhat
Octane/O350/Fuel User

Trade Count: (0)
Posts: 513
Threads: 29
Joined: Jul 2018
Location: Australia
03-02-2022, 09:40 AM
#15
RE: o350 env chip locations
At the time I captured the environmental data on my O350, it was running in the top slots of a 19" rack in a non-air-conditioned room in an office building, so I would take any numbers below 25 degrees with a grain of salt. The 15C and 18C numbers in there are definitely bogus.
jan-jaap
SGI Collector

Trade Count: (0)
Posts: 1,048
Threads: 37
Joined: Jun 2018
Location: Netherlands
03-02-2022, 10:19 AM
#16
RE: o350 env chip locations
@jwhat even with the big blowers, they shouldn't be able to get the temperature below ambient temp -- and the numalink router wouldn't be running lower than ambient. Hmmmm. Did we just prove that the temp sensors are mostly junk in these?

I'm going to snoop the i2c bus of one of the sensors sometime this week and see what it's actually reporting, see if it's maybe a software issue misinterpreting the data.
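
When comparing snooped bytes against what env prints, the first thing to rule out is a signed/unsigned or scaling mix-up. A minimal sketch, assuming the sensor returns temperature as a single 8-bit two's-complement byte at 1C per LSB (that format is an assumption to verify against the DS1780 datasheet, not something established in this thread). The Fahrenheit helper uses integer truncation, which does match the C/F pairs in the listings above (6C/42F, 77C/170F):

```python
def decode_temp_c(raw):
    """Interpret one raw byte (0..255) as signed two's-complement degrees C."""
    if not 0 <= raw <= 0xFF:
        raise ValueError("expected a single byte")
    # Bit 7 set means a negative value in two's complement.
    return raw - 0x100 if raw & 0x80 else raw

def to_fahrenheit(temp_c):
    """Integer C to F conversion, dropping the fraction as the env pairs do."""
    return temp_c * 9 // 5 + 32

print(decode_temp_c(0x4D), to_fahrenheit(77))  # 77 170
print(decode_temp_c(0xFA))                     # -6 (250 if misread as unsigned)
```

If the raw byte on the wire already disagrees with a thermometer on the chip, the fault is in the sensor or its supply rather than in how the L1 interprets the data.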
vvuk
O2

Trade Count: (0)
Posts: 43
Threads: 4
Joined: Aug 2021
Location: California
03-02-2022, 07:18 PM
#17
RE: o350 env chip locations
I've already replaced three DS1780s because the temperature and voltage readings were wrong (along with a few more malfunctions).

Challenge L Indy Indigo2 Indigo2 R10000/IMPACT O2 Octane Origin2000 Deskside Fuel Challenge S Origin 200 Origin 2000 Rack Origin 2000 Rack Origin 3200 Origin 350
fleedwood
O2

Trade Count: (0)
Posts: 24
Threads: 1
Joined: Dec 2020
Location: Germany
03-02-2022, 07:37 PM
#18
RE: o350 env chip locations
Hi vvuk, fleedwood and others,

Yes, I agree that having temps in the 20-25C range does not make sense when the ambient temp is above 25C.

Just funny how all the 4 x 1G and 4 x 800 MHz machines fall into some ranges...

On my system, the 001c04 brick has a voltage problem: https://forums.irixnet.org/thread-2563.html

Is this potentially the result of a faulty Dallas env monitoring chip?

Regards,


jwhat/John
(This post was last modified: 03-03-2022, 04:35 AM by jwhat.)
jwhat
Octane/O350/Fuel User

Trade Count: (0)
Posts: 513
Threads: 29
Joined: Jul 2018
Location: Australia
03-03-2022, 04:34 AM
#19
RE: o350 env chip locations
While this COULD be true, I'm going into my Tezro believing it DOES in fact have this 1.8V problem. I have a THEORY, which may or may not be true, based on the work I did on that Fuel PIMM VRM a while back. The 1.8V VRM is documented on SGUG; I think the problem is a faulty diode on the LOW-SIDE MOSFET gate control, on top of an aged output cap. That is, it's a signal or error-correction issue.

My "version" of the 1.8V error isn't that the voltage disappears, nor that it's instant. The system starts fine, then within a few minutes the voltage slowly dives lower and lower until it hits about 1.18V and the system shuts down. The slide itself takes about 20 seconds once it starts.

I have two things I'll be trying.

1. Change out the VRM output cap to get back the MAX filtering of output.
2. Remove and check/change the diodes on the circuit.

What I found in that PIMM project (https://forums.irixnet.org/thread-3238.html) is that there are 1-2 pairs of SMB diodes used to clean up the MOSFET gate signalling. Without them the signals are often less defined/sharp, which causes FALSE turn-ons or turn-offs of the MOSFET. Since my voltage goes DOWN, my error must be on the LOW-SIDE (grounding) MOSFET signalling. When things are fine it's okay, but as the on/off duty cycle changes (at the same switching frequency) errors start to occur, so the LDO IC tries to compensate by making more changes. The error from those constant changes starts to drift and gets out of control (wrangling the error).

Once it starts to slide, the more drastic the changes, the faster the error grows until it falls off the edge.

Because I can "start out" okay, I think this is what is happening to me. So right now I'm planning as if I have a simple (dirty) signalling issue. I will simply pluck the diodes off the MOSFET areas and test them. It's likely I'll find one with an issue, and then I need to replace entire paired sets to clean up the gate operation.

My hope is that this is my answer. With the Fuel PIMM 5V VRM the output was instantly (and always) high; mine isn't instant but occurs shortly after startup. This is why I think it's signalling and NOT the power side of the circuit. If the signalling is going wild (and the LDO is trying to error-correct), then it makes a lot more sense why it goes from in spec to out of spec like it does for me.
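
The distinction drawn above, a rail that is out of spec from the first reading versus one that starts in spec and then slides, can be checked from a log of timestamped readings. A minimal sketch, assuming a hypothetical list of (seconds, volts) samples, the 10% warning band for the 1.8V rail (1.62 to 1.98V) from the env listings earlier in the thread, and the roughly 1.18V shutdown point mentioned above. The sample values are invented for illustration:

```python
def classify_failure(samples, low=1.62, high=1.98):
    """Return 'in spec', 'instant' or 'drift' for (seconds, volts) samples."""
    if not samples:
        raise ValueError("no samples")
    out_of_band = [not low <= volts <= high for _, volts in samples]
    if not any(out_of_band):
        return "in spec"
    # Out of band on the very first sample means the fault is immediate.
    return "instant" if out_of_band[0] else "drift"

def slide_duration(samples, low=1.62, shutdown=1.18):
    """Seconds from first dropping below the band to reaching shutdown."""
    start = next((t for t, v in samples if v < low), None)
    end = next((t for t, v in samples if v <= shutdown), None)
    if start is None or end is None:
        return None
    return end - start

# Invented log: fine for a few minutes, then a 20 second slide to shutdown.
log = [(0, 1.78), (120, 1.77), (180, 1.60), (190, 1.40), (200, 1.18)]
print(classify_failure(log), slide_duration(log))  # drift 20
```

A result of "instant" would point toward the power side of the circuit, as with the Fuel PIMM case described above, while "drift" fits the signalling theory.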
weblacky
I play an SGI Doctor, on daytime TV.

Trade Count: (10)
Posts: 1,716
Threads: 88
Joined: Jan 2019
Location: Seattle, WA
03-03-2022, 05:25 AM

