Power Supply Units (PSUs) on Sun Fire V440 (Non-RoHS) systems may fail prematurely due to PSU fan degradation. In some cases, this may lead to an orderly thermal shutdown of the system.
Contributing Factors
Power supplies (Sun p/n 300-1501) shipped on Sun Fire V440 (Non-RoHS) systems before December 01, 2005 may fail prematurely because they use a Minebea (NMB) fan that may be susceptible to premature degradation under certain environmental conditions.
Symptoms
The affected V440 PSU (Sun p/n 300-1501) initially works in a degraded state with increased fan noise, increased fan vibration levels, and/or decreased fan RPMs, eventually leading to fan failure.
If the PSU fan speed drops below 2,520 RPM (slow fan), the system triggers a predictive fan fault signal similar to the following example:
"PS0 fan is operating below its normal threshold"
However, if the PSU fan actually fails and the PSU turns off, the system outputs the message:
"PSU @ PS0 has FAILED"
...and, depending on the environment, the system may initiate an orderly shutdown to avoid overheating. The primary impact to the customer in such a scenario is unplanned downtime.
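The two symptom stages described above can be told apart when reviewing collected console or log output. The following is a minimal sketch only: the 2,520 RPM threshold and the two message strings are taken from the text above, while the function names, the regular expressions, and the assumption that PSU names always look like PS0, PS1, and so on are illustrative and not part of any Sun tool.

```python
import re

# Threshold taken from the text above: below 2,520 RPM counts as a slow fan.
SLOW_FAN_RPM = 2520

# Patterns modelled on the two example messages quoted above. The exact
# wording on a real system may differ, so treat these as assumptions.
SLOW_FAN_RE = re.compile(r"(PS\d+) fan is operating below its normal threshold")
FAILED_RE = re.compile(r"PSU @ (PS\d+) has FAILED")


def classify_rpm(rpm):
    """Return 'slow' for a reading below the threshold, else 'ok'."""
    return "slow" if rpm < SLOW_FAN_RPM else "ok"


def scan_messages(lines):
    """Map each PSU name to the worst state seen: 'slow' or 'failed'."""
    states = {}
    for line in lines:
        failed = FAILED_RE.search(line)
        if failed:
            states[failed.group(1)] = "failed"
            continue
        slow = SLOW_FAN_RE.search(line)
        if slow:
            # Do not downgrade a PSU already marked as failed.
            states.setdefault(slow.group(1), "slow")
    return states
```

A PSU reported as "slow" is still running in a degraded state, whereas "failed" corresponds to the PSU having turned off.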
Root Cause
The Sun Fire V440 PSU (Sun part number 300-1501) used a Minebea (NMB) fan with an open-bearing motor design, which is susceptible to dust accumulation and loss of lubrication over time under certain environmental conditions. The resulting increase in friction degrades the open-bearing fan motor, leading to increased fan noise, increased fan vibration levels, and decreased fan RPMs, with eventual fan failure.
Sun started using a closed-bearing fan, acquired from a different vendor (Nidec), during production of the 300-1501-10 PSU in November of 2005 (date code WW 45 and later). The follow-on substitute part for this PSU is the RoHS V440 PSU, part number 300-1851-01, which also uses this closed-bearing Nidec fan. The closed-bearing fan design eliminates this degradation mode.
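The part-number and date-code cutover described above can be expressed as a simple screening check. This is a sketch under stated assumptions: the function name is hypothetical, the date code is assumed to be available as a separate year and work-week number, and the result is only an indication, since repaired 300-1501 units may have had their fan replaced regardless of date code.

```python
def likely_has_open_bearing_fan(part_number, year, week):
    """Screen a V440 PSU by part number and date code (year, work week).

    Assumption: part_number looks like '300-1501-10' or '300-1851-01'.
    Returns True when the PSU probably carries the original open-bearing fan.
    """
    base = "-".join(part_number.split("-")[:2])
    if base == "300-1851":
        # The RoHS follow-on part uses the closed-bearing fan.
        return False
    if base == "300-1501":
        # Closed-bearing fans were introduced at date code WW 45 of 2005.
        return (year, week) < (2005, 45)
    raise ValueError("not a V440 PSU part number covered here: " + part_number)
```

For example, a 300-1501 unit with a date code of WW 44, 2005 would be flagged, while one built in WW 45, 2005 or any 300-1851 unit would not.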
Replace a failed 300-1501 PSU that uses the Minebea (NMB) fan with a 300-1501-10 or a 300-1851-01 (or above). RSL stock consists of 300-1501-10 (with Nidec fan) and 300-1851 PSUs.
Only the failed PSU should be replaced; the second, working PSU should not be proactively replaced. Our analysis shows that one PSU failing does not necessarily lead to the second PSU in the same system failing soon thereafter, so there is little support for proactive replacement beyond the failed PSU.
Failed 300-1501 PSUs are being updated at the repair vendor by replacing the fan.
For more information on PSU replacement procedures please review Canned Action Plan DocID