
[SOLVED] Disk Corruption on Ubuntu VM

VirtualBox, VM config and HDD images.
Tweak
NewUser
Posts: 5
Joined: 11 Oct 2018 05:36
Location: Phoenix, AZ
Status: Offline

[SOLVED] Disk Corruption on Ubuntu VM

#1

Post by Tweak » 03 Dec 2019 07:54

Friends--
I need help, but could not find this same symptom/failure anywhere on the forum.

I have a VM running under XigmaNAS, and it has Ubuntu Server 18.04.3 LTS installed.
It provides some services on my home network, and has been running flawlessly (and updating without incident) for more than a year... until recently.

I just updated XigmaNAS from 11.2 to 12.1 as the host OS.
[I believe this update also included an update to the VirtualBox version included in XigmaNAS... This might be part of the problem.]
After the Host OS update, I re-started the VM and logged into it to update the Guest OS (Ubuntu 18.04.3).
However, I have a significant problem every time apt (dpkg) tries to unpack the linux-headers package for the update.
It freezes at that point, and then, after some number of minutes (varies between 3 and 15), with the VM's load meter climbing to 4 or 5 or <gulp!> 10 or more, I get error messages on the screen that there is an I/O error on /dev/sda (the virtual hard disk for the VM).
When this happens, I have to "force quit" (via a sudo kill command for the VBoxHeadless task) and re-start the VM.
This has happened consistently, for at least 20 attempts to "resurrect" the VM and get it updated.
Strangely, the VM will otherwise run happily for at least 2 hours without giving an I/O error. Yet the error happens 100% of the time when I try to update the Guest OS, once the apt update gets to the point of unpacking the linux-headers package. (?!?!?)
Since the VM's /dev/sda is just a file on the host's ZFS pool, I am struggling to figure out how to diagnose whether it is filesystem corruption INSIDE the VM, filesystem corruption on the Host OS, or a potential impending mechanical failure of one of the drives in the pool.
(All of the SMART reports show that the constituent HDDs in the ZFS pool on the Host machine are A-OK.)

Can any of the experts in the Forum groups help me with some troubleshooting tips/techniques/procedures, and/or some advice on how to get the VM root disk healthy again?

Many thanks, in advance!

Cheers,
Mike
Last edited by Tweak on 08 Dec 2019 02:39, edited 2 times in total.

cookiemonster
Advanced User
Posts: 177
Joined: 23 Mar 2014 02:58
Location: UK
Status: Offline

Re: Disk Corruption on Ubuntu VM

#2

Post by cookiemonster » 04 Dec 2019 23:08

No expert on this, but my thoughts are that to XigmaNAS the VM is just a file, probably a vdi or vmdk type. Inside it are all the files the VM requires. If there is a problem with that internal filesystem, that could be the cause, and unless ZFS detects a problem via its checksums against memory or pool state, it is none the wiser.
In light of that, I would first do a ZFS scrub with the VM switched off.
Then I would start the VM in single-user mode and force an fsck, followed by a restart from within Ubuntu once it finishes.
Please note I'm not suggesting you do that, just sharing my thoughts. Maybe someone more experienced on these will chime in.
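For reference, a rough command-line sketch of those two steps — this is a generic outline, not specific to this setup: `tank` is a placeholder pool name, and the guest's root partition is assumed to be /dev/sda1:

```shell
# On the XigmaNAS host, with the VM powered off:
# start a scrub of the pool, then watch its progress and error counts
zpool scrub tank          # 'tank' is a placeholder pool name
zpool status -v tank      # shows scrub progress and any checksum errors

# Inside the Ubuntu guest, booted into single-user/recovery mode
# (root filesystem unmounted or mounted read-only):
fsck -f /dev/sda1         # assumes root is on /dev/sda1
shutdown -r now           # then reboot the guest cleanly
```

Run the scrub first; if ZFS reports no errors, the corruption is more likely inside the guest's own filesystem.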
Main: Xigmanas 11.2.0.4 x64-full-RootOnZFS on Supermicro X8DT3. zroot on mirrored pair of CRUCIAL_CT64M225. Memory: 72GB ECC; 2 Xeon E5645 CPUs; Storage: (HBA) - LSI SAS 9211-4i with 3 SATA x 1 Tb in raidZ1, 1 x 3 Tb SAS drive as single stripe.
Spare1: HP DL580 G5; 128 GB ECC RAM; 4 CPU; 8 x 500 GB disks on H210i
Spare2: HP DL360 G7; 6 GB ECC RAM; 1 Xeon CPU; 5 x 500 GB disks on H210i
Spare3: HP DL380 G7; 24 GB ECC RAM; 2 Xeon E5645 CPUs; 8 x 500 GB disks on IBM M1015 flashed to LSI9211-IT

Tweak
NewUser
Posts: 5
Joined: 11 Oct 2018 05:36
Location: Phoenix, AZ
Status: Offline

Re: Disk Corruption on Ubuntu VM

#3

Post by Tweak » 05 Dec 2019 16:13

@cookiemonster--
Thanks for the quick reply!

I have the same mental concept as you explained in your post. (I think that's a good sign <for me>.)
I have a concern, though, because the VM can update *other* packages (via apt) with NO problems [i.e., it does NOT crash out with disk I/O errors]. For some reason, it is only the "linux-headers-4.15.0-72" package that **always** makes it crash.
I even forced a removal of the .deb file and then re-downloaded it for another installation attempt -- trying to rule out a corrupted/flawed source file. Still, no luck.

Because these troubles began after I upgraded the XigmaNAS system (to the 12.1 release), I am trying to discern whether there's any kind of problem introduced in the VirtualBox OSE which could lead to memory overflows/faults ... which might manifest as though there were a "hardware" fault {like a buggy disk I/O controller, for instance} in the virtualized environment for the VM.

Does that make sense??

>> Also, since I am more of a Deb/Buntu and Arch kinda guy...and am NOT conversant in BSD...can you give me a quick "how to" for accomplishing your recommendation of "...do a zfs scrub"??

Again, many thanks, my friend!!

Cheers,
Mike
XigmaNAS 12.1.0.4 - Ingva embedded on SanDisk Ultra USB
HP Z400, 2x Xeon W3565, 24 GB ECC, 4x 4TB WD Red (WD40EFRX) in ZFS RAIDz1 pool
SAMBA/CIFS, AFP (Time Machine), NFS, Web server, and 2x VMs
Rsync to off-board 10TB HDD backup drive.

cookiemonster
Advanced User
Posts: 177
Joined: 23 Mar 2014 02:58
Location: UK
Status: Offline

Re: Disk Corruption on Ubuntu VM

#4

Post by cookiemonster » 05 Dec 2019 22:30

Hi. I'm thinking that if the VM bombs out in the same place, the likely problem is internal to its filesystem, not in XigmaNAS. Or at least that's what it makes me think. For the scrub: Disks > ZFS > Tools > Scrub a pool > Start, choose pool, Next.
And then I would continue troubleshooting from Ubuntu.
I'm sure a dev will correct me, but the VM files would not have been touched by the upgrade of XN. The OS upgrade happened around it, so I think the upgrade is a red herring. Would it be feasible to export it and import it into another host, to see if it happens there too?

cookiemonster
Advanced User
Posts: 177
Joined: 23 Mar 2014 02:58
Location: UK
Status: Offline

Re: Disk Corruption on Ubuntu VM

#5

Post by cookiemonster » 05 Dec 2019 22:41

Ah, I re-read your post. I see what you're saying about the update to the new version of VirtualBox OSE. Fair point. That reminds me, there was a user on the forum who posted that after his upgrade to one of these recent releases he had to recreate his VMs. It might be something else, though.
Also, for the .deb file for a headers update... I would try installing it via apt-get, although that's probably what you did.
And me too, I'm more Ubuntu than FreeBSD.

Tweak
NewUser
Posts: 5
Joined: 11 Oct 2018 05:36
Location: Phoenix, AZ
Status: Offline

Re: Disk Corruption on Ubuntu VM

#6

Post by Tweak » 06 Dec 2019 04:41

@cookiemonster--

Thank you, again, for sharing some of your mental horsepower with me for this problem. Much appreciated, out here in the dark!!

I ran a "scrub" as you described [Scrub a Pool -> select my main ZFS storage pool]. It returned with "Command execution was successful."
>> I hope that means "the outcome was positive (no errors)," and not merely that the command ran, regardless of whether it had to 'fix' anything while it was running (as fsck does).

Thanks for taking the time to re-read my previous reply.
I can't figure out the *intermittent* failure mode -- an error only with the one LARGE file (the linux-headers .deb) -- rather than a constant stream of I/O errors.
That's why I suspect it's a *software* hiccup (in the OSE) masquerading as a <virtual> hardware hiccup.

Yes...I am using the 'vanilla' upgrade path (sudo apt update && sudo apt upgrade && sudo apt full-upgrade).
I have to execute a "dpkg -r --force-remove-reinstreq" every time I resurrect the VM, just to get the file locks and ghost 'handles' cleaned up.
No fun.
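For anyone stuck in the same loop, the recovery cycle described above looks roughly like this — a sketch only; the package name is the one from this thread, and you should substitute whichever package dpkg reports as half-installed:

```shell
# Inside the Ubuntu guest, after a forced VM restart:
# remove the half-installed package that blocks further apt runs
sudo dpkg --remove --force-remove-reinstreq linux-headers-4.15.0-72
# finish configuring any packages left in an interrupted state
sudo dpkg --configure -a
# then retry the normal upgrade path
sudo apt update && sudo apt upgrade && sudo apt full-upgrade
```

`--force-remove-reinstreq` tells dpkg to remove a package even though it is flagged as requiring reinstallation, which is what clears the "ghost handles" after a crash mid-unpack.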

Frankly -- and I throw up in my mouth a little bit when I think about this -- I've got a Win Server 2010 VM running rock-steady as a guest in an Ubuntu host.
I was trying to work my way toward fully divesting of the MS-Win ecosystem, now that my wife is migrated to Apple (Linux-like ... AMEN!).
...but I'm still leery of pulling the plug on my AD and net-management services on the Win machine if I can't keep the VM guest running through an unattended-upgrades cycle.
(&%*#@!)

It hurts my heart to hear that others have had significant headaches. (Yes...I should have a pristine backup...but I don't have one, because it was still a "Work In Progress," and I had taken a fairly -erm- 'meandering' path to the limited success I had already achieved. I hate to see it disappear.)

All the same, I truly appreciate your insights and help! I hope you have a fantastic holiday!!

Cheers,
Mike

cookiemonster
Advanced User
Posts: 177
Joined: 23 Mar 2014 02:58
Location: UK
Status: Offline

Re: Disk Corruption on Ubuntu VM

#7

Post by cookiemonster » 06 Dec 2019 23:47

Just a thought here. You could install the previous version of XN on a spare USB stick, just to attempt to rule out the new VirtualBox version in the new XN update.
I take it that neither dmesg nor the filesystem utilities show any abnormalities? A bit out there, but would it be possible to create a dump device and capture a core dump to analyse? That is beyond my capabilities to guide you on; the one time I did it before, with the time available it was faster to reinstall and restore a backup of my personal files. (That was a physical machine whose HD was failing; I created a block-device backup with Clonezilla to recover some files.) All in all, faster than learning to debug core dumps.
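For the dmesg check, something like the following inside the guest would surface the I/O errors described earlier — a generic sketch; the grep patterns are just common ATA/block-layer error strings, not anything specific to this VM:

```shell
# Inside the Ubuntu guest: scan the kernel log for disk-related errors
dmesg --ctime | grep -iE 'i/o error|ata[0-9]|blk_update_request'

# and check the persistent journal from the previous (crashed) boot
journalctl -k -b -1 | grep -i error
```

If the errors only ever appear on the virtual ata device while the host's logs stay clean, that points at the virtualization layer rather than the physical disks.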

Tweak
NewUser
Posts: 5
Joined: 11 Oct 2018 05:36
Location: Phoenix, AZ
Status: Offline

Re: Disk Corruption on Ubuntu VM

#8

Post by Tweak » 08 Dec 2019 02:36

@cookiemonster--

Well...I don't know how I 'stumbled' upon this solution, but this is how I got past the hurdle:

I made another VM (from within the webfront [phpvirtualbox/index.html]), and when I set it up I tried to start it and build a new 18.04.3 LTS server instance.
I noticed an error message that popped up about how the USB function was not available -- because the VirtualBox Extensions were not installed (in the OSE).
^^ THAT caught my attention. So, I disabled the USB in the control window and re-started.
I then built the new instance and got it up and running/updated to the latest kernel without incident.

** Note, I have been managing/running the extant VMs through the VBoxManage CLI, since I had been using scripts to execute on certain triggers.

As I was working to get all of the same packages/software loaded onto the new instance, I took a moment when BOTH machines were in a "down" (powered off) state, and I checked both of the 'showvminfo --details' outputs from the VBoxManage (CLI) interface.
I saw that the *problem* VM had the USB function *enabled* in its machine header...so I thought that might be a problem (based on the above error message).
Through the CLI interface, I disabled the USB function on the original VM (the one which had been causing the problem).
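For anyone following along, the check-and-fix via VBoxManage looks roughly like this — "UbuntuVM" is a placeholder for your VM's name:

```shell
# On the XigmaNAS host: list registered VMs and inspect the problem VM
VBoxManage list vms
VBoxManage showvminfo "UbuntuVM" --details | grep -i usb

# With the VM powered off, disable the USB controller(s)
VBoxManage modifyvm "UbuntuVM" --usb off
VBoxManage modifyvm "UbuntuVM" --usbehci off   # only if EHCI (USB 2.0) was enabled
```

Since the OSE build ships without the Extension Pack's USB support, turning the controller off brings the VM's config back in line with what the hypervisor can actually provide.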

Voila -- on the next reboot (after executing a dpkg cleaning/re-homing), the apt update/upgrade cycle went through smoothly.
Yay!!!
The 'old' VM has been running smoothly for the last 8 hours.

Easy, peasy, lemon-squeezy.

Whew! (All of the work is still intact.)

Hopefully this thread might help someone else who faces the same enigma in the future when upgrading the XigmaNAS OS (and, therefore, the VBox OSE).

...now...I just wish I could remember how I got the USB function enabled in the FIRST place (over a year ago).
I must've found some way to hack the Extensions into the OSE, even though they're not available via conventional/default installation.
>> Any thoughts on my last question?

Again...MANY THANKS for sticking with me through several days' discernment and muddling-about in the OSE inquiry...!

Best regards,
Mike

netware5
Experienced User
Posts: 123
Joined: 31 Jan 2017 21:39
Location: Sofia, BULGARIA
Status: Offline

Re: [SOLVED] Disk Corruption on Ubuntu VM

#9

Post by netware5 » 08 Dec 2019 13:00

I am using an Ubuntu Server 16.04 LTS VM under XigmaNAS. The VM provides print services to my home network via a USB-connected printer, so I definitely use the USB function. No issues detected during the last two years. As far as I remember, the "USB function" was added by the devs at some later stage. Before that I tried to play with the VBox guest extensions, but without success. I really don't remember the whole story, but it is clear that I currently run my VM with the USB function enabled, and this does not affect the VM update process.
XigmaNAS 12.1.0.4 - Ingva (rev.7091) embedded on HP Proliant Microserver Gen8, Xeon E3-1265L, 16 GB ECC, 2x4TB WD Red ZFS Mirror

cookiemonster
Advanced User
Posts: 177
Joined: 23 Mar 2014 02:58
Location: UK
Status: Offline

Re: [SOLVED] Disk Corruption on Ubuntu VM

#10

Post by cookiemonster » 08 Dec 2019 23:28

That was a really good spot @Tweak, and thanks for sharing your findings.

Tweak
NewUser
Posts: 5
Joined: 11 Oct 2018 05:36
Location: Phoenix, AZ
Status: Offline

Re: [SOLVED] Disk Corruption on Ubuntu VM

#11

Post by Tweak » 09 Dec 2019 09:09

@netware5--

Your use-case is (for me) "the exception that proves the rule." :D
netware5 wrote:
08 Dec 2019 13:00
I am using Ubuntu Server 16.04 LTS VM under Xigmanas. The VM provides print services to my home network via USB connected printer. So I definitely use the USB function. No issues detected during last two years. According to my memory the "USB function" has been added by devs on some later stage. Before that I tried to play with Vbox guest extensions, but without success. I really don't remember the whole story, but it is clear that currently I run my VM with USB function enabled and this does not affect the VM update process.
I, too, had **somehow** gotten to a position where USB functionality was initially enabled in a previous iteration, via some -- now-forgotten -- admin "rain dance."
I did not mean to imply that correlation determined causation. :P

I only intended to highlight how/where I spotted a difference between the environment my VM was running under in the previous (11.2) OSE and the current (12.1) OSE.
When I changed the VM attributes to *match* what is 'required' under the 12.1 OSE (as presented in the php API webfront), the disk I/O errors {on VM device ata3, where my SATA hdd image is mounted} went away, and the updates proceeded without incident.

It solved *my* problem ..... and, maybe, it could be the "hidden gem" that might help someone else solve theirs.

Thanks for your reply! I appreciate hearing all the ways that others are using their servers (and VMs)! 8-)

Cheers, and Happy Holidays to you! :)

Mike
