I thought Embedded USB install was 'safe' from corruption

Posted: 16 Mar 2017 22:53
by NeilP
I built an Embedded USB boot system two days ago and it had been running until last night. I could not access the WebGUI this morning, and on fitting a screen to the headless box, I saw it was in an infinite reboot loop.

Starting it again from the Live CD, from a fresh install on a new USB stick, or even from a fresh install on the old USB stick all results in a perfectly booting system.
So what is going on here? What can cause this?

After I built the system two days ago, I backed up the config file. I restored it to the two new USB embedded installs, and they both booted correctly.

So what could have gone wrong overnight to cause this USB Embedded boot stick to 'fail'?


See below:

Screen Shot 2017-03-16 at 19.15.46.jpg
IMG_8462.JPG

Re: I thought Embedded USB install was 'safe' from corruption

Posted: 17 Mar 2017 04:38
by Snufkin
Faulty RAM?
Aging PSU?
Cooling?

Re: I thought Embedded USB install was 'safe' from corruption

Posted: 17 Mar 2017 08:00
by raulfg3
bad luck?

Re: I thought Embedded USB install was 'safe' from corruption

Posted: 17 Mar 2017 08:24
by NeilP
It has now been running again with the fresh Embedded install on the same USB stick, since last night.

It was running pretty much 24/7 as a WinXP system for... well, a few years, so yes, it is ageing, but no, I have not had issues with it before.

Immediately after I noticed the failure and could not reboot it, I booted from my 'Everyday Carry' Tails Live USB stick, and while still hot it booted perfectly to Tails.
I put the N4F embedded USB back in, and it was back to the boot loop.

I put the N4F Live USB and Live CDs in, and they booted perfectly.
So I think I can rule out CPU/RAM heat issues.

Been lying awake thinking about this during the night.

It must be something that happens late in the boot process, after the system is running. It goes through the boot, obtains an IP address and does the NTP time server update. Something must have been written to the user config file on the USB stick during the night, and the boot is hitting that entry, crashing and rebooting.

Now I recall I had enabled the Fuppes/DLNA service, with the Fuppes.db on the internal HDD.

On a Live CD boot just now, I was unable to add the mount point for that HDD. I kept getting 'Error - Retry'.
I had to reformat the HDD before I could add it as a mount point again.

Would that trigger a crash? The config file processing an instruction to mount a faulty partition on the HDD?
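
One way to test the theory that something was written to the config overnight would be to diff the config backup taken after the build against the config from the failing stick. A minimal Python sketch follows, assuming both files have been copied somewhere readable. The function name `config_diff` and the sample XML are made up for illustration and are not real NAS4Free config content.

```python
import difflib


def config_diff(backup_text, current_text):
    """Return unified-diff lines between two config file contents."""
    return list(
        difflib.unified_diff(
            backup_text.splitlines(),
            current_text.splitlines(),
            fromfile="backup",
            tofile="current",
            lineterm="",
        )
    )


# Placeholder contents only; a real comparison would read the two files,
# e.g. with pathlib.Path(...).read_text().
BACKUP = "<config>\n  <example>old</example>\n</config>"
CURRENT = "<config>\n  <example>new</example>\n</config>"

if __name__ == "__main__":
    for line in config_diff(BACKUP, CURRENT):
        print(line)
```

An empty result would suggest the config did not change overnight and the cause lies elsewhere.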

Re: I thought Embedded USB install was 'safe' from corruption

Posted: 17 Mar 2017 08:52
by Snufkin
Have you checked the SMART data of the 'failed to mount' HDD?

Re: I thought Embedded USB install was 'safe' from corruption

Posted: 17 Mar 2017 09:23
by NeilP
It is saying 'No Errors',
but the max temp has got pretty warm: 51 deg C.


Umm, just remembered: I also had OneButton Installer and Extended GUI installed, but NOT enabled.


Something must have got written to the USB config file, I guess, and as the system tried to use it (like mounting a drive with issues) it crashed.

Created a Dell BIOS Diag disk and updated the BIOS to A09 from the 2005 A05.
Running full extended Dell Diag hardware testing now, including a full surface scan of the HDD.

Code:

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x80)	Offline data collection activity
					was never started.
					Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)	The previous self-test routine completed
					without error or no self-test has ever
					been run.
Total time to complete Offline
data collection: 		(  949) seconds.
Offline data collection
capabilities: 			 (0x1b) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					No Conveyance Self-test supported.
					No Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine
recommended polling time: 	 (   1) minutes.
Extended self-test routine
recommended polling time: 	 (  16) minutes.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   016    Pre-fail  Always       -       0
  2 Throughput_Performance  0x0004   157   157   050    Old_age   Offline      -       217
  3 Spin_Up_Time            0x0006   117   117   024    Old_age   Always       -       174 (Average 167)
  4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       693
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000a   100   100   067    Old_age   Always       -       0
  8 Seek_Time_Performance   0x0004   132   132   020    Old_age   Offline      -       33
  9 Power_On_Hours          0x0012   100   100   000    Old_age   Always       -       3045
 10 Spin_Retry_Count        0x0012   100   100   060    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       681
192 Power-Off_Retract_Count 0x0032   100   100   050    Old_age   Always       -       785
193 Load_Cycle_Count        0x0012   100   100   050    Old_age   Always       -       785
194 Temperature_Celsius     0x0002   250   250   000    Old_age   Always       -       22 (Min/Max 6/51)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%         1         -
# 2  Short offline       Completed without error       00%         0         -

Selective Self-tests/Logging not supported
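
The attribute table above can also be checked mechanically rather than by eye. The following is a rough Python sketch, not a replacement for smartctl's own assessment: it parses rows in the layout shown and flags any attribute whose normalized VALUE is at or below its THRESH, plus any non-zero raw count on the sector-related attributes. The choice of IDs 5, 196, 197 and 198 as "should be zero" is a common rule of thumb and an assumption here, and the function names are made up for the example.

```python
import re

# Sector-related attribute IDs whose raw value is normally 0 on a healthy
# disk (assumption: rule of thumb, not a vendor specification).
SECTOR_IDS = {5, 196, 197, 198}

# ID, NAME, FLAG, VALUE, WORST, THRESH, TYPE, UPDATED, WHEN_FAILED, RAW
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]+\s+(\d+)\s+(\d+)\s+(\d+)\s+"
    r"(\S+)\s+(\S+)\s+(\S+)\s+(\d+)"
)


def parse_attributes(text):
    """Extract attribute rows from smartctl-style output."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            rows.append({
                "id": int(m.group(1)),
                "name": m.group(2),
                "value": int(m.group(3)),
                "thresh": int(m.group(5)),
                "raw": int(m.group(9)),  # leading number of RAW_VALUE only
            })
    return rows


def find_warnings(rows):
    """Return human-readable warnings for suspicious attributes."""
    out = []
    for r in rows:
        if r["thresh"] > 0 and r["value"] <= r["thresh"]:
            out.append(f"{r['name']}: value {r['value']} <= threshold {r['thresh']}")
        if r["id"] in SECTOR_IDS and r["raw"] > 0:
            out.append(f"{r['name']}: raw count {r['raw']} is non-zero")
    return out


# A few rows copied from the output above.
SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0012   100   100   000    Old_age   Always       -       3045
194 Temperature_Celsius     0x0002   250   250   000    Old_age   Always       -       22 (Min/Max 6/51)
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
"""

if __name__ == "__main__":
    for warning in find_warnings(parse_attributes(SAMPLE)):
        print(warning)
```

For the rows above this reports nothing, which matches the PASSED result and the empty error log.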