HA script using CARP and ZFS snapshots

#1

Post by thulium89 » 14 Feb 2018 00:43

I apologize in advance if someone has already made something like this; when I searched I could not find anything that did exactly what I wanted while being "mostly" easy to deploy. I did find zrep, which looked like a great tool, but I was not able to get it to run on nas4free. I know that there is HAST/CARP, which is a wonderful product. However, I found that it destroyed my performance when running in a hypervisor environment: read speeds went from 300Mbs to about 100Mbs. So instead I use ZFS to send the snapshots over to a remote host.

What it is:
A script, scheduled by cron, that uses ZFS to send incremental snapshots to a remote host. It turns iscsi_target on and off to prevent data locks, and uses CARP to detect whether the node is the master or the backup so it can act accordingly.
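
The master/backup check can be sketched roughly as below. This is only an illustration, assuming FreeBSD-style `ifconfig` output in which the CARP status line contains the word MASTER or BACKUP. The function name `carp_role` is hypothetical and does not appear in the script.

```shell
#!/bin/sh
# carp_role: hypothetical helper, not part of the original script.
# Takes the text printed by `ifconfig <interface>` and prints the CARP role.
carp_role() {
	case "$1" in
		*MASTER*) echo master ;;
		*BACKUP*) echo backup ;;
		*)        echo unknown ;;
	esac
}

# Usage (em1 is only an example interface name):
#   role=$(carp_role "$(ifconfig em1)")
#   [ "$role" = master ] && echo "this node serves iSCSI and sends snapshots"
```

Note that the script itself only looks for MASTER and treats every other state as backup; the separate "unknown" result here is an extra assumption.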

What is required to get it running:
Configure SSH password-less / key authentication using https://www.nas4free.org/wiki/documenta ... entication
Set up CARP and configure iSCSI to use CARP. Preferably do not enable iSCSI once it is configured, as the script already does that for you; there shouldn't be any issues if you do, as the script should stop it. It is also preferable to use a secondary network for CARP / iSCSI.
Create a cron job that calls the script at whatever interval you feel is needed (I run it every minute).
Create the script using viewtopic.php?f=70&t=2805&sid=049f5debc ... b2ba7cdd99 on each node.
Edit remoteName to be the name of the other node: if you called the nodes serverA and serverB, set remoteName to serverA in serverB's script. You can also use an IP address.
Edit zfsVolumeName to be the name of the ZFS volume that you want replicated.
Edit carpDevice to be the name of the network interface that CARP runs on (em1 in the script below).
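
Before scheduling the script it may help to confirm that key-based SSH really works without a prompt, since every remote step depends on it. The following is a minimal sketch, not part of the script: the function name `remote_ok` and the 5 second timeout are assumptions, while `BatchMode` and `ConnectTimeout` are standard OpenSSH client options.

```shell
#!/bin/sh
# remote_ok: hypothetical helper, not part of the original script.
# Returns 0 only if SSH to the given host works without asking for a password.
# BatchMode=yes makes ssh fail instead of prompting;
# ConnectTimeout=5 (an arbitrary choice) keeps a dead host from hanging the run.
remote_ok() {
	ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true >/dev/null 2>&1
}

# Usage (host name taken from the script's default remoteName):
#   if remote_ok nas4free01.local; then
#       echo "SSH keys OK"
#   else
#       echo "fix SSH key authentication first"
#   fi
```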

Other thoughts:
I wrote the script the way I did because when I first started I was trying to make it work as a daemon that runs on boot using rc.conf, and failed miserably. At the time I was not aware that the Advanced tab has a command script option. So if you want to run the code every few seconds for more real-time replication, change the value of keepLooping to 1 and the value of scriptPath to the full path and name of the script. For example, if you created the script inside the /mnt/mount/ folder and named it my_script.sh, you'd change the value to /mnt/mount/my_script.sh
After that, create a command script with the command /mnt/mount/my_script.sh and make it a PostInit type.
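
An alternative to having the script call itself is a small wrapper loop around it. The sketch below is an assumption about how that could look, not something the script does; `run_every` and its arguments are hypothetical names.

```shell
#!/bin/sh
# run_every: hypothetical helper, not part of the original script.
# Runs a command repeatedly with a pause between runs.
#   $1 = seconds to sleep between runs
#   $2 = number of runs (0 means loop forever)
#   remaining arguments = the command to run
run_every() {
	interval=$1
	runs=$2
	shift 2
	count=0
	while [ "$runs" -eq 0 ] || [ "$count" -lt "$runs" ]; do
		"$@"
		count=$((count + 1))
		sleep "$interval"
	done
}

# Usage, for example as the PostInit command (path from the example above):
#   run_every 5 0 /mnt/mount/my_script.sh
```

With a wrapper like this, keepLooping would stay at 0 so that each run of the script exits normally.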

Changes:
Made a few quick changes as I noticed some errors and possible issues. One of them was that if the main host detected the backup host had iSCSI running, it would turn off iSCSI and not sync. I also added an option to allow iSCSI to keep running on the backup host. The reason is that the cron job only runs once a minute, so if the main host fails you would have to wait up to a full minute for failover. By turning off iSCSI on the backup only during replication and turning it back on once replication has finished, CARP will kick in during failover and iSCSI will resume. If you want the old-fashioned way of waiting a full minute, change the value of keepRemoteiSCSIUp to 0.

Added the ability to switch between looping, so you can use a command script to start it, and non-looping for cron.

Wish List:
I would love to add a feature to set how often the first (base) snapshot is redone, based on date difference, size, or both. Currently the script just redoes the first snapshot whenever the reported size of the base snapshot contains a G (gigabytes), but that was laziness / not being sure how to compare it.
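
One possible way to make that comparison numeric is sketched below. It assumes the installed `zfs` supports the `-p` flag, which prints exact byte counts and, for the `creation` property, seconds since the epoch. The function name `needs_rebase` and the limits in the usage example are hypothetical.

```shell
#!/bin/sh
# needs_rebase: hypothetical helper, not part of the original script.
# Succeeds (returns 0) when the base snapshot is too big or too old.
#   $1 = used size of the base snapshot in bytes
#   $2 = creation time of the base snapshot (seconds since the epoch)
#   $3 = current time (seconds since the epoch)
#   $4 = size limit in bytes
#   $5 = age limit in seconds
needs_rebase() {
	[ "$1" -ge "$4" ] || [ $(( $3 - $2 )) -ge "$5" ]
}

# Usage sketch (1 GiB and 7 days are arbitrary example limits):
#   used=$(zfs list -Hp -o used "$zfsVolumeName"@sh01)
#   made=$(zfs get -Hp -o value creation "$zfsVolumeName"@sh01)
#   if needs_rebase "$used" "$made" "$(date +%s)" 1073741824 604800; then
#       echo "base snapshot should be redone"
#   fi
```

Passing the values in as arguments keeps the comparison separate from the `zfs` calls, so the size and age rules can be checked on their own.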

The script:

Code:

#!/bin/sh
remoteName="nas4free01.local"
zfsVolumeName="ZFSpool/ZFS-Volume"
carpDevice="em1"
keepLooping=0
keepRemoteiSCSIUp=1
sleepTimeForBackupDuringLoop=5
PIDFILE="/usr/local/sbin/zfssync.pid"
scriptPath="/usr/local/sbin/zfssync.sh"

isFailed=1
reSyncFailed=1

zfsSync() {
	zfs send -D -i "$zfsVolumeName"@sh02 "$zfsVolumeName"@sh03 | ssh $remoteName zfs recv -F "$zfsVolumeName" || return
	isFailed=0
}

zfsReSync() {
	zfs send -D -i "$zfsVolumeName"@sh01 "$zfsVolumeName"@sh02 | ssh $remoteName zfs recv -F "$zfsVolumeName" || return
	reSyncFailed=0
}

zfsReapplyBase() {
	echo $$ > $PIDFILE
	chmod 555 $PIDFILE
	ssh $remoteName ifconfig "$carpDevice" down
	zfs destroy "$zfsVolumeName"@sh01
	zfs rename "$zfsVolumeName"@sh02 "$zfsVolumeName"@sh01
	ssh $remoteName zfs destroy "$zfsVolumeName"@sh01
	zfs send -D "$zfsVolumeName"@sh01 | ssh $remoteName zfs recv -F "$zfsVolumeName"
	ssh $remoteName ifconfig "$carpDevice" up
	rm $PIDFILE
}

fullSync() {
	case "$vriSCSI" in
		*"pid"*)
				ssh $remoteName service iscsi_target onestop
		;;
		*)
		;;
	esac
	zfs snapshot "$zfsVolumeName"@sh03
	zfsDif=$(zfs list -H -o used "$zfsVolumeName"@sh02)
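	# A missing sh02 snapshot yields an empty string; that case goes to the else branch
	if [ -n "$zfsDif" ] && [ "$zfsDif" != 0 ]; then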
		zfsSync
		zfs destroy "$zfsVolumeName"@sh02
		zfs rename "$zfsVolumeName"@sh03 "$zfsVolumeName"@sh02
		ssh $remoteName zfs destroy "$zfsVolumeName"@sh02
		if [ "$isFailed" = 1 ]; then
			zfsReSync
			if [ "$reSyncFailed" = 1 ]; then
				zfsReapplyBase
			fi
		else
			ssh $remoteName zfs rename "$zfsVolumeName"@sh03 "$zfsVolumeName"@sh02
			baseSize=$(zfs list -H -o used "$zfsVolumeName"@sh01)
			case "$baseSize" in
				*"G"*)
					zfs destroy "$zfsVolumeName"@sh01
					ssh $remoteName zfs destroy "$zfsVolumeName"@sh01
					zfs rename "$zfsVolumeName"@sh02 "$zfsVolumeName"@sh01
					ssh $remoteName zfs rename "$zfsVolumeName"@sh02 "$zfsVolumeName"@sh01
					zfs snapshot "$zfsVolumeName"@sh02
					zfs send -D -i "$zfsVolumeName"@sh01 "$zfsVolumeName"@sh02 | ssh $remoteName zfs recv -F "$zfsVolumeName"
				;;
				*)
				;;
			esac
		fi
	else
		zfs destroy "$zfsVolumeName"@sh03
		# zfs reports a missing snapshot on stderr, so capture that too
		isMissingSH02=$(ssh $remoteName zfs list -H -o name "$zfsVolumeName"@sh02 2>&1)
		case "$isMissingSH02" in
			*"exist"*)
				zfsReSync
				if [ "$reSyncFailed" = 1 ]; then
					zfsReapplyBase
				fi
			;;
			*)
			;;
		esac
	fi
	if [ "$keepRemoteiSCSIUp" = 1 ]; then
		ssh $remoteName service iscsi_target onestart
	fi
}

ccarp=$(ifconfig "$carpDevice")
viscsi=$(service iscsi_target onestatus)
case "$ccarp" in
	*"MASTER"*)
		# Capture stderr too, so an ssh connection error can be matched below
		vriSCSI=$(ssh $remoteName service iscsi_target onestatus 2>&1)
		case "$viscsi" in
			*"not"*)
				service iscsi_target onestart
			;;
			*)
			;;
		esac
		case "$vriSCSI" in
			*"ssh"*)
			;;
			*)
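				# zfs reports a missing snapshot on stderr, so capture that too
				isSyncing=$(ssh $remoteName zfs list -H -o name "$zfsVolumeName"@sh01 2>&1)
				case "$isSyncing" in
					*"exist"*)
						# Remote base snapshot is missing: only start over if the
						# process recorded in the PID file is no longer running
						pid=$(cat $PIDFILE)
						ps -p $pid > /dev/null 2>&1
						if [ $? -eq 1 ]; then
							rm $PIDFILE
							fullSync
						fi
					;;
					*)
						fullSync
					;;
				esac
			;;
		esac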
	;;
	*)
		case "$viscsi" in
			*"pid"*)
				if [ "$keepRemoteiSCSIUp" = 0 ]; then
					service iscsi_target onestop
				fi
			;;
			*)
			;;
		esac
		ssh $remoteName ls || ifconfig "$carpDevice" up
		if [ "$keepLooping" = 1 ]; then
			sleep "$sleepTimeForBackupDuringLoop"
		fi
	;;
esac
if [ "$keepLooping" = 1 ]; then
	# exec replaces the current process, so looping does not pile up nested shells
	exec "$scriptPath"
fi

Final Thoughts:
I am by no means an expert at FreeBSD, NAS4Free, or even shell scripting. I actually just learned what shell scripting was a week ago, so my code could probably use lots of improvements. Also, if something like this already exists, please delete this, as I'm sure the other product is much better. Finally, use at your own risk; all I can say is that it currently works for me.


Re: HA script using CARP and ZFS snapshots

#2

Post by mauricio.viola » 17 Apr 2018 01:41

Hi thulium89,
This is great. I've been thinking of using your script because I want to accomplish something similar.
The main difference is that I already have the two NAS hosts using ZFS replication and up to date.
What would I need to change to avoid resyncing all the data?
Thanks in advance, I appreciate you sharing this script.
Mauricio
