KWSN Orbiting Fortress
KWSN Distributed Computing Teams forum
 

Platform Failures- Black screen followed by BIOS restore

 
lvanst
Baron

Joined: 31 Jan 2013
Posts: 152
Location: Phoenix, AZ (yes, it's hot here)

Posted: Sun Nov 20, 2016 3:11 pm    Post subject: Platform Failures- Black screen followed by BIOS restore

I'm getting platform failures under heavy load. No failures if the platform is idle. The screen goes black, followed by a BIOS restore/recovery, so it's a pretty hard crash.

I removed the Radeon card, and it still crashes running only the Nvidia.
I removed the Nvidia card, and it still crashes running only the Radeon.
So it doesn't look like the video cards, or their drivers, are the problem...

I use hwinfo64 to monitor/log platform sensors, and the temps are stable right up to the crash:

CPU: 60°C (AMD FX-8350)
Motherboard: 50°C (Gigabyte 990FXA-UD3)
Northbridge: 46°C
Radeon: 78°C (R9 270X)
Nvidia: 75°C (GTX 760)
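
One way to check "stable right up to the crash" from a sensor log rather than by eye is sketched below. It assumes the log was exported as CSV with one column per sensor; the column names `CPU` and `GPU` and the sample rows are hypothetical stand-ins, not real hwinfo64 output, so the names would need adjusting to match an actual log.

```python
import csv
import io

# Hypothetical sample log; a real export will have different column names.
SAMPLE_LOG = (
    "Time,CPU,GPU\n"
    "14:00,58,77\n"
    "14:01,59,78\n"
    "14:02,60,78\n"
    "14:03,60,78\n"
    "14:04,60,78\n"
)


def summarize_before_crash(csv_text, columns, tail=3):
    """For each column, report the overall max plus the max and the
    first-to-last change over the final `tail` readings in the log."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    summary = {}
    for col in columns:
        values = []
        for row in rows:
            try:
                values.append(float(row[col]))
            except (KeyError, TypeError, ValueError):
                continue  # skip missing or non-numeric cells
        if not values:
            continue  # nothing usable logged for this column
        last = values[-tail:]
        summary[col] = {
            "overall_max": max(values),
            "tail_max": max(last),
            "tail_rise": last[-1] - last[0],
        }
    return summary


report = summarize_before_crash(SAMPLE_LOG, ["CPU", "GPU"])
for name, stats in report.items():
    print(name, stats)
```

A `tail_rise` near zero on every sensor just before the log stops would back up the reading that this is not a thermal runaway.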

CPU and both GPUs have had their thermal paste replaced within the past six months.

I also ran a Windows memory test, to confirm my Corsair memory was healthy.

I'm out of ideas. It's definitely load related, as it will idle for a day without failures, but detonates within a few hours under heavy load.

Any help would be much appreciated...
lvanst
Baron

Joined: 31 Jan 2013
Posts: 152
Location: Phoenix, AZ (yes, it's hot here)

Posted: Tue Nov 22, 2016 9:23 pm

33 views and no ideas or suggestions?

Help me out here my fellow knights. Rolling Eyes
JumpinJohnny
Prince

Joined: 28 Mar 2013
Posts: 1245
Location: Western New Hamster

Posted: Tue Nov 22, 2016 10:58 pm

lvanst wrote:
33 views and no ideas or suggestions?

Help me out here my fellow knights. Rolling Eyes


OK make that 34 views and a no clue response.

If it were I ... I would backup everything and flash the BIOS.
It just seems like a MB issue and that might be a good place to start.
PhastPhred
Prince

Joined: 22 Mar 2006
Posts: 6010
Location: Northwest AR (USA)

Posted: Wed Nov 23, 2016 3:06 pm

I assume you already did the vacuum cleaner stuff and have a clean CPU, power supply, case vent holes, etc.? Sorry, I have nothing else...
lvanst
Baron

Joined: 31 Jan 2013
Posts: 152
Location: Phoenix, AZ (yes, it's hot here)

Posted: Wed Nov 23, 2016 7:06 pm

Thanks guys. I too was thinking MB or power supply. I'll check for a BIOS update.
JumpinJohnny
Prince

Joined: 28 Mar 2013
Posts: 1245
Location: Western New Hamster

Posted: Wed Nov 23, 2016 7:54 pm

lvanst wrote:
Thanks guys. I too was thinking MB or power supply. I'll check for a BIOS update.


Even if there is no "update" for the BIOS, it might have a tiny corruption that could be fixed by a BIOS flash.
But maybe I'm confusing flashing with streaking. Do you need clothes on for BIOS operations? Laughing
KWSN-HoC
KWSN ArchBishop

Joined: 18 May 2002
Posts: 1348
Location: German Quadrant

Posted: Fri Nov 25, 2016 7:43 am

Sounds to me like the power supply has aged capacitors and isn't stable at max load anymore...
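
A rough way to sanity-check the power supply theory is to add up nominal component power and compare it with what the supply can comfortably deliver. The sketch below is only a back-of-the-envelope estimate: the part wattages are the vendors' nominal TDP or board-power ratings rather than measurements, and the 650 W supply rating, the 75 W allowance for everything else, and the 80% derating factor for an aged unit are all assumptions, since the thread never states them.

```python
def load_headroom(psu_watts, component_watts, derate=0.8):
    """Return (total nominal draw, headroom) in watts, treating only
    `derate` of the PSU's label rating as comfortably usable."""
    total = sum(component_watts.values())
    usable = psu_watts * derate
    return total, usable - total


# Nominal vendor ratings for the parts named in the first post;
# the last entry is an assumed allowance, not a measured figure.
parts = {
    "AMD FX-8350": 125,
    "GTX 760": 170,
    "R9 270X": 180,
    "board, RAM, drives, fans (assumed)": 75,
}

total, headroom = load_headroom(650, parts)  # 650 W is a hypothetical rating
print(f"nominal draw {total} W, headroom {headroom:.0f} W")
```

A negative headroom here would only say the supply deserves suspicion under full load; a wall-socket power meter gives a far better answer than nominal ratings.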

Cheers

HoC
_________________
Housekeeper of Camelot

Mastrumistulo de Kameloto

(also a Member of the Migratory Coconuts)

'My name is Homer from Borg. Resistance is fu ..... Oh doughnuts!'


sir spuddly buddly
Prince

Joined: 27 Nov 2004
Posts: 1048
Location: here, I think.

Posted: Sat Nov 26, 2016 1:47 pm

Yeah, I agree: start by getting the BIOS right, then slowly crank the load back up.
_________________
a monster can be excused for his behaviour . . . The problem is not how a monster could do it, but how a human being did it.
Scaring 11 year olds since 2005
"There is only one path to Ni!-dom - through the shubbery"
Click every day at http://naturarvet.se/en/ to save a forest!
lvanst
Baron

Joined: 31 Jan 2013
Posts: 152
Location: Phoenix, AZ (yes, it's hot here)

Posted: Tue Nov 29, 2016 2:46 pm

BIOS flashed and still failing.

Trying a new power supply next...
lvanst
Baron

Joined: 31 Jan 2013
Posts: 152
Location: Phoenix, AZ (yes, it's hot here)

Posted: Tue Dec 13, 2016 5:20 pm

This stopped happening when I stopped running the Einstein project. It seems to have been CPU-centric, as I took turns removing the Nvidia and Radeon cards to no avail. The platform has been failure-free for two weeks now, whereas it had been failing multiple times per day...
JumpinJohnny
Prince

Joined: 28 Mar 2013
Posts: 1245
Location: Western New Hamster

Posted: Tue Dec 13, 2016 6:11 pm

Soooo...
We were right!
Gravitational waves from the Einstein project were overloading the PS capacitors and causing the BIOS to cause CPU errors when communicating with the GPUs.
It was probably that newly discovered double pulsar that threw a whammy on the CPU. Very tricky, those gravitational waves... popping capacitors, unflashing BIOS and throwing GPUs into interdimensional quanta.
Rolling Eyes

If it's possibly related:
I get excessive 'hardware errors' from my U3 Bitminer whilst running the Einstein "Multi-Directed Continuous Gravitational Wave Search" WUs and also long-running PrimeGrid WUs like the PSP (LLR), even when I limit the concurrent WUs to half of the cores.
If I give the CPU a rest for a day, or reboot and put something different on the CPU like Asteroids, then the Bitmain U3 ASIC miner gets almost no errors and hums along just fine.
I think some of these programs write their applications to 'hold open' memory without actually using it.
Just an observation. It would be more coherent if I knew what I was talking about.
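
For anyone wanting to repeat the "limit the WUs to half of the cores" experiment for a single project, BOINC reads an `app_config.xml` file from the project's folder, and its `project_max_concurrent` element caps how many of that project's tasks run at once. The sketch below just generates that file's contents; the half-of-the-cores figure mirrors the post above, and the helper name `build_app_config` is made up for illustration.

```python
import os
import xml.etree.ElementTree as ET


def build_app_config(max_tasks):
    """Return app_config.xml text capping a project's concurrent tasks."""
    root = ET.Element("app_config")
    ET.SubElement(root, "project_max_concurrent").text = str(max_tasks)
    return ET.tostring(root, encoding="unicode")


# Cap the project at half of the logical cores, with a floor of one task.
cores = os.cpu_count() or 2
xml_text = build_app_config(max(1, cores // 2))
print(xml_text)
```

The output would be saved as `app_config.xml` in that project's folder under the BOINC data directory, after which the client needs to re-read its config files or be restarted. Whether the cap changes the error rate is exactly the thing to test, not a given.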
sir spuddly buddly
Prince

Joined: 27 Nov 2004
Posts: 1048
Location: here, I think.

Posted: Thu Dec 15, 2016 12:41 am

Glad to hear you found the source of the problem - hope you've told the people at Einstein.


All times are GMT - 5 Hours

 