I thought Embedded USB install was 'safe' from corruption
Posted: 16 Mar 2017 22:53
by NeilP
I built an Embedded USB boot system two days ago and it ran fine until last night. I could not access the WebGUI this morning, and on fitting a screen to the headless box I saw it was in an infinite reboot loop.
Booting it again from the live CD, or from a fresh install on a new USB stick or even on the old USB stick, gives a perfectly booting system every time.
So what is going on here? What can cause this?
After I built the system two days ago, I backed up the config file. I restored it to the two new USB embedded installs... and both booted correctly.
So what could have gone wrong overnight to cause this USB Embedded boot stick to 'fail'?
See below:
Screen Shot 2017-03-16 at 19.15.46.jpg
IMG_8462.JPG
Re: I thought Embedded USB install was 'safe' from corruption
Posted: 17 Mar 2017 04:38
by Snufkin
Faulty RAM?
Aging PSU?
Cooling?
Re: I thought Embedded USB install was 'safe' from corruption
Posted: 17 Mar 2017 08:00
by raulfg3
bad luck?
Re: I thought Embedded USB install was 'safe' from corruption
Posted: 17 Mar 2017 08:24
by NeilP
It has now been running again with the fresh Embedded install on the same USB stick, since last night.
It ran pretty much 24/7 as a WinXP system for... well, a few years, so yes, it is ageing, but no, I have not had issues with it before.
Immediately after I noticed the failure and could not reboot it, I booted from my 'Everyday Carry' Tails Live USB stick, and while still hot it booted perfectly to Tails.
Put the N4F embedded USB back in, and it was back to the boot loop.
Put the N4F Live USB and Live CD in, and both booted perfectly.
So I think I can rule out CPU/RAM heat issues.
Been lying awake thinking about this during the night.
It must be something that happens late in the boot process, after the system is running. It goes through the boot, obtains an IP address and does the NTP time server update. Something must have been written to the USB stick's user config file during the night; the boot hits that entry, crashes and reboots.
Now I recall, I had enabled the Fuppes/DLNA service, with the Fuppes.db on the internal HDD.
On a Live CD boot just now, I was unable to add the mount point for that HDD. I kept getting 'Error - Retry'.
I had to reformat the HDD before I could add it as a mount point again.
Would that trigger a crash? The config file processing an instruction to mount a faulty partition on the HDD?
Re: I thought Embedded USB install was 'safe' from corruption
Posted: 17 Mar 2017 08:52
by Snufkin
Have you checked the SMART data of the 'failed to mount' HDD?
Re: I thought Embedded USB install was 'safe' from corruption
Posted: 17 Mar 2017 09:23
by NeilP
It is saying 'No Errors', but the max temp has got pretty warm: 51 deg C.
Umm, I remembered... I also had OneButton Installer and Extended GUI installed, but NOT enabled.
Something must have got written to the USB config file, I guess, and as the system tried to use it (like mounting a drive with issues) it crashed.
Created a Dell BIOS diag disk and updated the BIOS to A09 from the 2005 A05.
Running full extended Dell Diag hardware testing now, including a full surface scan of the HDD.
Code:
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status:  (0x80) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 ( 949) seconds.
Offline data collection
capabilities:                    (0x1b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        No Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        (  16) minutes.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   016    Pre-fail  Always       -       0
  2 Throughput_Performance  0x0004   157   157   050    Old_age   Offline      -       217
  3 Spin_Up_Time            0x0006   117   117   024    Old_age   Always       -       174 (Average 167)
  4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       693
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000a   100   100   067    Old_age   Always       -       0
  8 Seek_Time_Performance   0x0004   132   132   020    Old_age   Offline      -       33
  9 Power_On_Hours          0x0012   100   100   000    Old_age   Always       -       3045
 10 Spin_Retry_Count        0x0012   100   100   060    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       681
192 Power-Off_Retract_Count 0x0032   100   100   050    Old_age   Always       -       785
193 Load_Cycle_Count        0x0012   100   100   050    Old_age   Always       -       785
194 Temperature_Celsius     0x0002   250   250   000    Old_age   Always       -       22 (Min/Max 6/51)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%         1         -
# 2  Short offline       Completed without error       00%         0         -
Selective Self-tests/Logging not supported
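
For anyone who wants to check a dump like the one above without reading every row by eye, here is a rough Python sketch. It is only an illustration, not part of N4F or smartmontools: the names `parse_attributes`, `find_warnings` and `CRITICAL_IDS` are made up for this example, and the choice of attributes 5, 197 and 198 as the ones worth flagging on a non-zero raw value is my assumption. It treats an attribute as failed when its normalized VALUE is at or below THRESH, which is how smartmontools describes it. The sample rows are copied from the table above.

```python
import re

# One row of the smartctl attribute table:
# ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]+\s+(\d+)\s+(\d+)\s+(\d+)\s+"
    r"\S+\s+\S+\s+\S+\s+(.+)$"
)

# Assumption: a non-zero raw count on these attributes deserves a closer look.
CRITICAL_IDS = {5, 197, 198}


def parse_attributes(text):
    """Return a list of dicts, one per attribute row found in the text."""
    attrs = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            attrs.append({
                "id": int(m.group(1)),
                "name": m.group(2),
                "value": int(m.group(3)),
                "worst": int(m.group(4)),
                "thresh": int(m.group(5)),
                "raw": m.group(6).strip(),
            })
    return attrs


def find_warnings(attrs):
    """Flag attributes at or below threshold, or critical ones with raw > 0."""
    warnings = []
    for a in attrs:
        if a["value"] <= a["thresh"]:
            warnings.append(
                f"{a['name']}: value {a['value']} at or below threshold {a['thresh']}"
            )
        first = a["raw"].split()[0]  # raw may carry extras like "(Min/Max 6/51)"
        if a["id"] in CRITICAL_IDS and first.isdigit() and int(first) > 0:
            warnings.append(f"{a['name']}: raw count {first} is non-zero")
    return warnings


SAMPLE = """\
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
5 Reallocated_Sector_Ct 0x0033 100 100 005 Pre-fail Always - 0
194 Temperature_Celsius 0x0002 250 250 000 Old_age Always - 22 (Min/Max 6/51)
197 Current_Pending_Sector 0x0022 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0008 100 100 000 Old_age Offline - 0
"""

if __name__ == "__main__":
    problems = find_warnings(parse_attributes(SAMPLE))
    for p in problems:
        print("WARNING:", p)
    if not problems:
        print("No SMART attribute warnings found")
```

A clean result here only says that the drive's own counters look healthy. It does not rule out a damaged filesystem on the partition, which would fit with having to reformat before the mount point could be added again.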