
I thought Embedded USB install was 'safe' from corruption

NeilP
Advanced User
Posts: 215
Joined: 15 Jul 2012 11:45
Location: Jersey, Channel Islands, Europe
Status: Offline

I thought Embedded USB install was 'safe' from corruption

Post by NeilP »

I built an Embedded USB boot system two days ago and it ran fine until last night. I could not access the WebGUI this morning, and on connecting a screen to the headless box I saw it was stuck in an infinite reboot loop.

Starting it again from the Live CD, from a fresh install to a new USB stick, or even from a fresh install to the old USB stick all result in a perfectly booting system.
So what is going on here? What can cause this?

After I built the system two days ago, I backed up the config file. I restored it to the two new USB embedded installs, and they all booted correctly.

So what could have gone wrong overnight to cause this USB Embedded boot stick to 'fail'?


See below:

Screen Shot 2017-03-16 at 19.15.46.jpg
IMG_8462.JPG

Snufkin
Advanced User
Posts: 317
Joined: 01 Jul 2012 11:27
Location: Etc/GMT-3 (BSD style)
Status: Offline

Re: I thought Embedded USB install was 'safe' from corruption

Post by Snufkin »

Faulty RAM?
Aging PSU?
Cooling?
XNAS 11.4.0.4 embedded, ASUS P5B-E, Intel DC E6600, 4 GB DDR2
ZFS 2 x HGST HDN726040ALE614, L2ARC PLEXTOR PX-128M5S

raulfg3
Site Admin
Posts: 4865
Joined: 22 Jun 2012 22:13
Location: Madrid (ESPAÑA)
Contact:
Status: Offline

Re: I thought Embedded USB install was 'safe' from corruption

Post by raulfg3 »

bad luck?
12.1.0.4 - Ingva (revision 7743) on SUPERMICRO X8SIL-F 8GB of ECC RAM, 11x3TB disk in 1 vdev = Vpool = 32TB Raw size , so 29TB usable size (I Have other NAS as Backup)


NeilP
Advanced User
Posts: 215
Joined: 15 Jul 2012 11:45
Location: Jersey, Channel Islands, Europe
Status: Offline

Re: I thought Embedded USB install was 'safe' from corruption

Post by NeilP »

It has now been running again with the fresh Embedded install on the same USB stick, since last night.

Since it was running pretty much 24/7 as a WinXP system for, well, a few years, yes, it is ageing, but no, I have not had issues with it before.

Immediately after I noticed the failure and could not reboot it, I booted from my 'Everyday Carry' Tails Live USB stick, and while still hot it booted perfectly into Tails.
I put the N4F embedded USB back in, and it was back to the boot loop.

I put the N4F Live USB and Live CDs in, and they booted perfectly.
So I think I can rule out CPU/RAM heat issues.

I have been lying awake thinking about this during the night.

It must be something that happens late in the boot process, after the system is running. It goes through the boot, obtains an IP address, and does the NTP time server update. Something must have been written to the user config file on the USB stick during the night, and it is hitting that entry and causing the crash and reboot.

Now I recall, I had enabled the Fuppes/DLNA service, with the Fuppes.db on the internal HDD.

On a Live CD boot just now, I was unable to add the mount point for that HDD. I kept getting 'Error - Retry'.
I had to reformat the HDD before I could add it as a mount point again.

Would that trigger a crash? The config file processing an instruction to mount a faulty partition on the HDD?

Snufkin
Advanced User
Posts: 317
Joined: 01 Jul 2012 11:27
Location: Etc/GMT-3 (BSD style)
Status: Offline

Re: I thought Embedded USB install was 'safe' from corruption

Post by Snufkin »

Have you checked the SMART data of the HDD that failed to mount?
XNAS 11.4.0.4 embedded, ASUS P5B-E, Intel DC E6600, 4 GB DDR2
ZFS 2 x HGST HDN726040ALE614, L2ARC PLEXTOR PX-128M5S

NeilP
Advanced User
Posts: 215
Joined: 15 Jul 2012 11:45
Location: Jersey, Channel Islands, Europe
Status: Offline

Re: I thought Embedded USB install was 'safe' from corruption

Post by NeilP »

It is saying 'No Errors',
but the max temp has got pretty warm: 51 deg C.


Umm, I remembered... I also had OneButton Installer and Extended GUI installed, but NOT enabled.


Something must have got written to the USB config file, I guess, and as it tried to use it (like mounting a drive with issues) it crashed.

I created a Dell BIOS diag disk and updated the BIOS to A09 from the 2005 A05.
Running full extended Dell Diag hardware testing now, including a full surface scan of the HDD.


=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x80)	Offline data collection activity
					was never started.
					Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)	The previous self-test routine completed
					without error or no self-test has ever
					been run.
Total time to complete Offline
data collection: 		(  949) seconds.
Offline data collection
capabilities: 			 (0x1b) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					No Conveyance Self-test supported.
					No Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine
recommended polling time: 	 (   1) minutes.
Extended self-test routine
recommended polling time: 	 (  16) minutes.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   016    Pre-fail  Always       -       0
  2 Throughput_Performance  0x0004   157   157   050    Old_age   Offline      -       217
  3 Spin_Up_Time            0x0006   117   117   024    Old_age   Always       -       174 (Average 167)
  4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       693
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000a   100   100   067    Old_age   Always       -       0
  8 Seek_Time_Performance   0x0004   132   132   020    Old_age   Offline      -       33
  9 Power_On_Hours          0x0012   100   100   000    Old_age   Always       -       3045
 10 Spin_Retry_Count        0x0012   100   100   060    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       681
192 Power-Off_Retract_Count 0x0032   100   100   050    Old_age   Always       -       785
193 Load_Cycle_Count        0x0012   100   100   050    Old_age   Always       -       785
194 Temperature_Celsius     0x0002   250   250   000    Old_age   Always       -       22 (Min/Max 6/51)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%         1         -
# 2  Short offline       Completed without error       00%         0         -

Selective Self-tests/Logging not supported
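
For anyone wanting to check a dump like the one above without reading every row by eye, here is a minimal Python sketch. It is not a XigmaNAS tool: the names `parse_attributes`, `suspicious`, `CRITICAL_IDS` and `SAMPLE` are made up for illustration, and the choice of which attributes to treat as critical (5, 196, 197, 198) is an assumption based on common practice, not something stated in this thread. It only parses text in the attribute table layout shown in the post.

```python
import re

# Assumption: these IDs (reallocated / pending / uncorrectable sectors) are the
# ones where a non-zero raw count is worth a closer look.
CRITICAL_IDS = {5, 196, 197, 198}

# One attribute row: ID, name, flag, value, worst, thresh, type, updated,
# when_failed, raw value (which may carry a trailing comment in brackets).
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]+\s+(\d+)\s+(\d+)\s+(\d+)\s+"
    r"(\S+)\s+(\S+)\s+(\S+)\s+(.*)$"
)


def parse_attributes(text):
    """Return one dict per attribute row found in the given SMART text."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if not m:
            continue  # header lines, self-test log rows, blank lines
        raw = m.group(9).strip()
        lead = re.match(r"\d+", raw)
        rows.append({
            "id": int(m.group(1)),
            "name": m.group(2),
            "value": int(m.group(3)),
            "worst": int(m.group(4)),
            "thresh": int(m.group(5)),
            "raw": raw,
            # Leading integer of the raw field, or None if it has none.
            "raw_int": int(lead.group(0)) if lead else None,
        })
    return rows


def suspicious(rows):
    """Return (name, reason) pairs for rows that look unhealthy."""
    flagged = []
    for r in rows:
        if r["thresh"] > 0 and r["value"] <= r["thresh"]:
            flagged.append((r["name"], "normalized value at or below threshold"))
        elif r["id"] in CRITICAL_IDS and r["raw_int"]:
            flagged.append((r["name"], "non-zero raw count"))
    return flagged


# A few rows copied from the output posted above.
SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
194 Temperature_Celsius     0x0002   250   250   000    Old_age   Always       -       22 (Min/Max 6/51)
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
"""

if __name__ == "__main__":
    for name, reason in suspicious(parse_attributes(SAMPLE)):
        print(name, "-", reason)
```

On the rows from this post nothing gets flagged, which matches the 'No Errors' reading. A clean attribute table does not rule out filesystem damage on the partition, so this is only one piece of the picture.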
