CARP + HAST both nodes remain as "MASTER"



Post by zdenyx » 17 Feb 2016 08:54

Description:

Two servers in a HAST configuration with CARP
Version 10.2.0.2 - Prester (revision 2067)

master: lagg0 VHID 7 192.168.100.31/24 advskew 0
slave: lagg0 VHID 7 192.168.100.31/24 advskew 10

server1 IP: 192.168.100.12
server2 IP: 192.168.100.14

CARP interface IP 192.168.100.31

both have lagg0 interface (em1+em2)
and em0 for HAST sync
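
For reference, the two advskew values translate into timers roughly as follows. This is a minimal sketch based on my understanding of FreeBSD's carp(4): a node advertises every advbase + advskew/256 seconds, and a backup declares "master down" after about 3 * advbase + advskew/256 seconds without an advertisement. The advbase value is not given in the post, so the FreeBSD default of 1 second is assumed here, and the function names are made up for illustration.

```python
# Assumption: advbase is not stated in the post; 1 second is the FreeBSD default.
ASSUMED_ADVBASE = 1


def advertisement_interval(advbase, advskew):
    """Seconds between CARP advertisements sent by a node acting as MASTER."""
    return advbase + advskew / 256.0


def master_down_timeout(advbase, advskew):
    """Seconds a BACKUP waits without hearing an advertisement before taking over."""
    return 3 * advbase + advskew / 256.0


# advskew values taken from the post: 0 on the master, 10 on the slave.
for name, advskew in (("master", 0), ("slave", 10)):
    interval = advertisement_interval(ASSUMED_ADVBASE, advskew)
    timeout = master_down_timeout(ASSUMED_ADVBASE, advskew)
    print(f"{name}: advertises every {interval:.3f} s, "
          f"takes over after {timeout:.3f} s of silence")
```

If these numbers are right, the nodes only need to miss each other's advertisements for about three seconds (for example while the lagg or the switch port-channel is busy or renegotiating) for both to claim MASTER.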

Whenever I switch master/secondary in the WebGUI, everything usually works, but sometimes I get this problem:

Both nodes have this status of Virtual IP address: 192.168.100.31 (MASTER)

but HAST resources are OK:

diskX secondary complete
diskY primary complete

dmesg shows this:

nas4free-slave: ~#
carp: VHID 7@lagg0: MASTER -> BACKUP (more frequent advertisement received)
carp: VHID 7@lagg0: BACKUP -> MASTER (master down)

nas4free-master: ~#
carp: VHID 7@lagg0: BACKUP -> MASTER (master down)
carp: VHID 7@lagg0: MASTER -> BACKUP (more frequent advertisement received)
carp: VHID 7@lagg0: BACKUP -> MASTER (master down)
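
To spot this state automatically, the CARP lines from dmesg can be parsed and the last transition of each node compared. The following is only a sketch: the regular expression is written against the log lines quoted above, the helper names are hypothetical, and collecting the dmesg text from each node (for example over ssh) is left out.

```python
import re

# Matches CARP state-change lines in the form quoted in the dmesg excerpts above.
CARP_LINE = re.compile(
    r"carp: VHID (?P<vhid>\d+)@(?P<ifname>\S+): "
    r"(?P<old>\w+) -> (?P<new>\w+) \((?P<reason>[^)]*)\)"
)


def carp_transitions(log_text):
    """Return (vhid, ifname, old_state, new_state, reason) for every CARP line."""
    return [
        (int(m["vhid"]), m["ifname"], m["old"], m["new"], m["reason"])
        for m in CARP_LINE.finditer(log_text)
    ]


def final_state(log_text):
    """State after the last logged transition, or None if nothing was logged."""
    transitions = carp_transitions(log_text)
    return transitions[-1][3] if transitions else None


# Log excerpts copied from the post.
slave_log = """\
carp: VHID 7@lagg0: MASTER -> BACKUP (more frequent advertisement received)
carp: VHID 7@lagg0: BACKUP -> MASTER (master down)
"""
master_log = """\
carp: VHID 7@lagg0: BACKUP -> MASTER (master down)
carp: VHID 7@lagg0: MASTER -> BACKUP (more frequent advertisement received)
carp: VHID 7@lagg0: BACKUP -> MASTER (master down)
"""

states = {
    "nas4free-slave": final_state(slave_log),
    "nas4free-master": final_state(master_log),
}
masters = [node for node, state in states.items() if state == "MASTER"]
if len(masters) > 1:
    print("Both nodes ended as MASTER:", ", ".join(masters))
```

Both logs end with "BACKUP -> MASTER (master down)", which suggests that each node stopped receiving the other's advertisements rather than that the advskew values are wrong.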

iSCSI, NFS and SMB resources are disconnected, and backups of all servers fail.
Sometimes the WebGUI of one server stops responding: SSH login, dmesg and top still work, but other commands hang and cannot be interrupted with Ctrl+C.

I think the automatic backups running against this HAST cluster cause the problem.


The servers are connected to a Cisco switch (C3560X) with this config:

interface Port-channel10
switchport access vlan 100
!
interface Port-channel20
switchport access vlan 100

interface GigabitEthernet0/39
description << LAN1 - 192.168.100.12 >>
switchport access vlan 100
channel-protocol lacp
channel-group 10 mode active
!
interface GigabitEthernet0/40
description << LAN2 NAS4Free - 192.168.100.12 >>
switchport access vlan 100
channel-protocol lacp
channel-group 10 mode active
!
interface GigabitEthernet0/41
description << LAN1 NAS4Free - 192.168.100.14 >>
switchport access vlan 100
channel-protocol lacp
channel-group 20 mode active
!
interface GigabitEthernet0/42
description << LAN2 NAS4Free - 192.160.100.14 >>
switchport access vlan 100
channel-protocol lacp
channel-group 20 mode active
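
A quick way to sanity-check a pasted switch config like this is to group the physical ports by channel-group and compare the IP addresses mentioned in their descriptions. The sketch below is not a Cisco tool; the function names are invented for illustration, and it only understands the few IOS lines that appear in the config above. Descriptions are free-text labels, so a mismatch there does not change switch behaviour, but it can point to a port documented against the wrong host.

```python
import re
from collections import defaultdict


def channel_group_members(config_text):
    """Map channel-group number -> list of (interface, lacp_mode, description)."""
    groups = defaultdict(list)
    current = None
    for raw in config_text.splitlines():
        line = raw.strip()
        m = re.match(r"interface (\S+)$", line)
        if m:
            current = {"name": m.group(1), "description": ""}
            continue
        if current is None:
            continue
        if line.startswith("description "):
            current["description"] = line[len("description "):]
        m = re.match(r"channel-group (\d+) mode (\S+)$", line)
        if m:
            groups[int(m.group(1))].append(
                (current["name"], m.group(2), current["description"])
            )
    return dict(groups)


def described_ips(members):
    """All IPv4-looking strings found in the descriptions of a group's ports."""
    return {
        ip
        for _, _, desc in members
        for ip in re.findall(r"\d+\.\d+\.\d+\.\d+", desc)
    }


# Physical-port section copied from the config in the post.
switch_config = """\
interface GigabitEthernet0/39
description << LAN1 - 192.168.100.12 >>
channel-group 10 mode active
!
interface GigabitEthernet0/40
description << LAN2 NAS4Free - 192.168.100.12 >>
channel-group 10 mode active
!
interface GigabitEthernet0/41
description << LAN1 NAS4Free - 192.168.100.14 >>
channel-group 20 mode active
!
interface GigabitEthernet0/42
description << LAN2 NAS4Free - 192.160.100.14 >>
channel-group 20 mode active
"""

for group, members in sorted(channel_group_members(switch_config).items()):
    ports = ", ".join(name for name, _, _ in members)
    ips = described_ips(members)
    note = "  <- descriptions disagree" if len(ips) > 1 else ""
    print(f"channel-group {group}: {ports} {sorted(ips)}{note}")
```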


The problem also occurs with a different server configuration running version 10.2.0.2 - Prester (revision 2268).

How can I diagnose this?

Is something wrong with the CARP advertisements?

Thanks


Zdenyx
HAST config with 2x Supermicro X9SRL-F, Xeon(R) CPU E5-1620 v2 @ 3.70GHz, 32 GB ECC RAM, 16x HDD ST3000VN0001



Post by swissiws » 09 Mar 2016 12:11

I have a similar issue.

I think the mDNSresponderPosix service resets the NICs when it restarts, which initiates the CARP down sequence if CARP is configured. Under load this seems to cause a kernel panic: my Dell R610s just crash.

I have a similar setup: LACP for CARP, LACP on OPT1 for HAST-to-HAST communication, SMB and an AD environment, and Cisco 3750G stacks.


I would highly recommend disabling mDNSresponderPosix if you do not use Macs in your environment.

System /Advanced / untick 'Enable Zeroconf/Bonjour to advertise services of this device'


In the meantime I stick to manual switchover, since I have no way to stop load from unpredictably triggering a state change on the CARP interface.

2 x Dell R610 96GB RAM,Perc H800, MD1200 DAS, 12x3TB SAS Z1 - HAST Cluster - 10.2.0.2.2332
