Power Failure & FSCK Fail -- SoftRAID UFS

Post by JHM001 » 13 Jun 2019 05:17

Hi, we have a UFS software RAID set up as a fileserver on a LAN: 2 x 3 TB WD Red drives running a fairly recent version of NAS4Free/XigmaNAS (almost 10 years in total, starting with NAS4Free). After a power failure the system will not boot.

GOAL: Save the data and get a functioning XigmaNAS system again.

INCIDENT DATA

(ada1:ahcich1:0:0:0) Error 5, Retries exhausted
GEOM_MIRROR: Request failed (error=5), ada1(READ(offset- . . . , length= . . .)
. . . Error reading journal block nnnnn
. . . Unexpected SU+J Inconsistency
. . . Internal Error: Got To reply()
. . . . Unexpected Soft Update Inconsistency: RUN FSCK Manually.

EXPLORE SYSTEM

gpart show --> gives raw partitions AND this . . . in case it is helpful

40 5860533088 mirror/RaidXY GPT (2.7T)
40 5860533080 1 freebsd-ufs (2.7T)
5860533080 8 - free - (4.0K)

geom disk list --> gives ada0 and ada1
cat /etc/fstab --> gives /dev/da0p2 /cf ufs ro 1 1

ATTEMPT FIXES


fsck on anything does nothing -- everything is "clean"
fsck_ufs does nothing

YOUR ADVICE

* What would be the step-by-step? If we make another XigmaNAS USB, can we rebuild from there? Is the problem on the USB key?
* Are there some commands you'd like me to run to find out something?

Thanks much!!

JHM

Post by ms49434 » 13 Jun 2019 09:32

The FreeBSD manpages are a very good source of information: GEOM-Mirror
gmirror status will show you the status of each member disk and it tells you if your data is compromised.
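
A minimal sketch of how that status check could be scripted. It assumes the script-friendly, one-line-per-component layout of `gmirror status -s` (mirror name, status, component); the function name `degraded_mirrors` is purely illustrative and not part of any tool.

```shell
#!/bin/sh
# Sketch: read `gmirror status -s` output on stdin and print every
# component line whose mirror is not COMPLETE.
# Assumed line shape (an assumption, one line per component):
#   mirror/RaidXY  DEGRADED  ada1 (ACTIVE)
# The plain multi-line `gmirror status` layout is NOT handled here.
degraded_mirrors() {
    awk '
        $1 == "Name" { next }                          # skip a header line if present
        NF >= 2 && $2 != "COMPLETE" { print; bad = 1 } # report anything not COMPLETE
        END { exit bad + 0 }                           # exit 1 if a problem was seen
    '
}
```

On the NAS console this would be used as `gmirror status -s | degraded_mirrors`; a non-zero exit status means at least one mirror is reported as something other than COMPLETE.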
Please read and follow the forum rules, it will help you to get better answers to your questions: Forum Rules
1) XigmaNAS 12.0.0.4 amd64-embedded on a Dell T20 running in a VM on ESXi 6.7U2, 22GB out of 32GB ECC RAM, LSI 9300-8i IT mode in passthrough mode. Pool 1: 2x HGST 10TB, mirrored, SLOG: Samsung 850 Pro, L2ARC: Samsung 850 Pro, Pool 2: 1x Samsung 860 EVO 1TB , services: Samba AD, CIFS/SMB, ftp, ctld, rsync, syncthing, zfs snapshots.
2) XigmaNAS 12.0.0.4 amd64-embedded on a Dell T20 running in a VM on ESXi 6.7U2, 8GB out of 32GB ECC RAM, IBM M1215 crossflashed, IT mode, passthrough mode, 2x HGST 10TB , services: rsync.

Post by JHM001 » 13 Jun 2019 13:33

ms49434: thanks for the note. If you would be so kind, I don't see what Forum rules I violated? I have more than 10 posts, it was an informative subject header, and in fact I was already reading the GEOM man pages. I also searched for other answers: there is a lot of power-failure driven GEOM-related discussion in various places on the web - but the problem seemed very specific to XigmaNAS. Anyway, per separate reply, total success for doing nothing, via "rebuilding provider finished". :)

Post by JHM001 » 13 Jun 2019 13:36

Pleased to report TOTAL SUCCESS. For "doing nothing". Left the monitor hooked up to the machine overnight, and in the morning was greeted with "# GEOM_MIRROR: Device NAME: rebuilding provider ada0 finished."

Rebooted and everything worked. If I get a chance I'll find out more about what went on overnight.

Post by ms49434 » 13 Jun 2019 13:51

JHM001 wrote:
13 Jun 2019 13:33
If you would be so kind, I don't see what Forum rules I violated?
Just one example. The forum rules ask for:
a) XigmaNAS version, platform (Embedded/Full/LiveCD), and revision number.
versus what was posted:
"fairly recent version of NAS4Free/XigmaNAS (total almost 10 years, starting with NAS4Free)."

Post by JHM001 » 13 Jun 2019 15:22

Another update -- in fact one of the two RAID mirror disks was "not a consumer", and the RAID was degraded. The fix is to "forget" (sounds scary, but it does NOT affect the RAID itself, only non-functional disks) and then "insert" the disk again. "Status" will show the rebuild progress. These things can all be done either from the XigmaNAS GUI or from a shell via the console. It currently looks like a 10 hour rebuild job for 3 TB.
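
For anyone following along from the console, the forget-then-insert sequence can be sketched roughly as below. The mirror name RaidXY comes from the gpart output in the first post and ada0 from the rebuild message; both are only examples, and the helper `readd_disk` is a hypothetical wrapper. It prints the commands by default and only runs them when APPLY=1 is set.

```shell
#!/bin/sh
# Sketch: print (or, with APPLY=1, run) the gmirror steps that re-add a
# disk which dropped out of a mirror. Arguments: mirror name, disk name.
readd_disk() {
    mirror=$1
    disk=$2
    for cmd in "gmirror forget $mirror" \
               "gmirror insert $mirror $disk" \
               "gmirror status $mirror"; do
        if [ "${APPLY:-0}" = 1 ]; then
            $cmd || return 1      # stop at the first failing step
        else
            echo "$cmd"           # dry run: only show what would be done
        fi
    done
}
```

Calling `readd_disk RaidXY ada0` lists the three steps; the final `gmirror status` is the one to repeat while the rebuild runs. For scale, 3 TB in 10 hours works out to roughly 3,000,000 MB / 36,000 s, or about 83 MB/s.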

Question: Would ZFS have better protected against power failure? (And yes there is a UPS, but currently not triggering a shutdown. That needs to be configured.)

Post by JHM001 » 13 Jun 2019 15:30

XigmaNAS 11.1.0.4 x64-embedded on Intel Core2Duo E6750 on a Dell Optiplex 755, 6 GB non-ECC RAM, GEOM Software RAID-1 mirror on 2X WD Red 3.0 TB HD, CIFS, SSH, VirtualBox

Post by raulfg3 » 13 Jun 2019 15:47

JHM001 wrote:
13 Jun 2019 15:22

Question: Would ZFS have better protected against power failure? (And yes there is a UPS, but currently not triggering a shutdown. That needs to be configured.)
Yes. ZFS is really good at this. It was not designed specifically for it, but it is robust enough to survive a power failure without corrupting data.
12.0.0.4 (revision 6766)+OBI on SUPERMICRO X8SIL-F 8GB of ECC RAM, 12x3TB disk in 3 vdev in RaidZ1 = 32TB Raw size only 22TB usable

Post by JoseMR » 14 Jun 2019 12:32

JHM001 wrote:
13 Jun 2019 15:22
Hi, we have a UFS software RAID setup as a fileserver on a LAN, 2 X 3 TB WD Red drives running a fairly recent version of NAS4Free/XigmaNAS (total almost 10 years, starting with NAS4Free). On a power failure the system will not boot.
...
Question: Would ZFS have better protected against power failure? (And yes there is a UPS, but currently not triggering a shutdown. That needs to be configured.)
Hello, you are not the only one having major UFS corruption problems after power failures, ungraceful shutdowns, etc. I have been recommending a tuned ZFS setup for quite some time, but unfortunately the FUD spread around the Web about ECC being a strict requirement for ZFS pushes new XigmaNAS users to the wrong decision when choosing a filesystem for their valuable data storage. The right choice is, hands down, ZFS.


Some references that could help others take the right direction before deploying serious data storage on UFS filesystems:
* Recent XigmaNAS boot problem after power failure (supports RootOnZFS)
* pfSense boot problems after power failure (ships ZFS since v2.4)
* Stability of UFS and ZFS on FreeBSD
* OPNsense power outage tolerance (supports ZFS)
* And a lot more on the topic around the Web



Some professional/authoritative advice on the supposed strict ECC requirement for ZFS, for reference:
* Matthew Ahrens
* Allan Jude
* JRS Systems

Just to be clear, I do recommend ECC for anything serious about data storage, regardless of the filesystem of choice, though it is certainly not mandatory. P.S. I also recommend the RootOnZFS platform if server high availability and reliability are a concern.

Regards
System: FreeBSD 12 RootOnZFS, MB: Supermicro X8SI6-F, Xeon X3450, 16GB DDR3 ECC RDIMMs.

Post by JHM001 » 14 Jun 2019 17:38

JoseMR -- Super thanks for your notes on ZFS. Ironically, about four years ago, after reading the ECC-paranoia material, I switched off ZFS and went back to UFS! And just now I did in fact read some of the same material you are sharing -- and have concluded two things:

1) ZFS is fine with non-ECC RAM
2) ZFS does NOT need more RAM than I can put in the current box (8GB for 3TB ZFS RAID1).

Your point about sharing this information is a good one. Super thanks for reinforcing the analysis.

Plan now is to convert back to ZFS.

JHM
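
As a rough illustration of what that conversion means at the command level, here is a sketch that only prints the commands for a two-disk ZFS mirror. The pool name `tank` and the helper `zfs_mirror_plan` are hypothetical; ada0 and ada1 are the disk names reported earlier in the thread. Note that `zpool create` takes over the disks, so the UFS data has to be copied elsewhere first, and XigmaNAS normally manages pools through its WebGUI, so treat this as a view of the underlying steps rather than the recommended route.

```shell
#!/bin/sh
# Sketch: print the zpool commands for a mirrored pool; nothing is executed.
# Arguments: pool name, then the member disks.
zfs_mirror_plan() {
    pool=$1
    shift
    echo "zpool create $pool mirror $*"   # creates the mirrored pool (destructive)
    echo "zpool status $pool"             # confirms both disks show as ONLINE
}
```

For example, `zfs_mirror_plan tank ada0 ada1` prints the two commands for review before anything is typed on the real system.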
