Before configuring the HA setup, please ensure that you have read this document completely.
This document contains the following sections:
Steps to Configure High Availability
Steps to be Performed in Primary Probe Server
Steps to be Performed in Standby Probe Server
Both the Primary and the Standby servers should have the same IT360 Build version.
The Primary and Standby servers should be set to the same time zone and the same system time.
If a license is applied to the Primary server, the same license must also be applied to the Standby server. Click here for the steps to apply the license on both servers.
Before configuring the HA setup, take a backup of the IT360 database.
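The build-version and time prerequisites above can be compared mechanically once you have collected the values from each server. The following is a minimal sketch, not an IT360 utility: the function name `check_ha_prerequisites`, the dictionary keys, and the 5-second clock tolerance are assumptions made for illustration only.

```python
def check_ha_prerequisites(primary, standby, max_clock_skew_s=5.0):
    """Return a list of problems; an empty list means the checks passed.

    `primary` and `standby` are dicts with the hypothetical keys
    'build' (str), 'utc_offset_min' (int) and 'epoch_s' (float),
    where 'epoch_s' was sampled on both servers at the same moment.
    """
    problems = []
    if primary["build"] != standby["build"]:
        problems.append("build versions differ")
    if primary["utc_offset_min"] != standby["utc_offset_min"]:
        problems.append("time zones differ")
    # The tolerance is an assumption; the document only asks for "the same time".
    if abs(primary["epoch_s"] - standby["epoch_s"]) > max_clock_skew_s:
        problems.append("clocks differ")
    return problems
```

An empty result only covers these two prerequisites; the license and the database backup still have to be handled manually.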
Receiving Netflows in HA Enabled IT360 Setup:
Exporting flows to both servers: Configure the IP addresses of both the Primary and the Standby servers in the router to export the flows. In this case, the Standby server runs in hot-standby mode and does not provide any service until it takes over the role of the Primary.
Note: This approach consumes additional bandwidth, since the flows are also exported to the Standby server.
Exporting flows only to the current Primary server: To avoid the additional bandwidth utilization, manually update the Primary server's IP address configured in the routers when the Standby server takes over the role of the Primary. The flows are then exported only to the current Primary server.
Receiving Traps in HA Enabled IT360 Setup:
Exporting traps to both servers: Configure the IP addresses of both the Primary and the Standby servers in the router to export the traps. In this case, the Standby server runs in hot-standby mode and does not provide any service until it takes over the role of the Primary.
Note: This approach consumes additional bandwidth, since the traps are also exported to the Standby server.
Follow the steps listed below, separately for the Primary and Standby servers, in order to configure the HA for each of the servers.
Install the Primary Probe server.
During installation, give a prefix for the Database, e.g. IT360Probe.
Start and then stop the Primary Probe server.
Notifications will be sent when the Standby server takes over the role of the Primary. Click here for the steps to configure the notifications.
Install the Standby Probe server.
During installation, give the same database prefix that was given for the Primary database, i.e., IT360Probe.
Deselect the Create Database check box, as this database was already created when the Primary Probe server was installed.
Do not start the Standby Probe. (Important)
If you have configured HA for the Central server, enter the following Standby Central server details on both the Primary and Standby Probe servers, in the file <IT360HOME>/applications/conf/AMServer.properties:
am.standbyadminserver.host=StandbyCentralServerHostName (this is the host name of the Standby Central server)
am.standbyadminserver.port=StandbyCentralServerPort (this is the SSL Port of the Standby Central server)
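Editing the two keys by hand on both Probe servers is easy to get wrong, so the change can be scripted. The sketch below is an illustration under assumptions, not a tool shipped with IT360: the helper name `set_properties` and the sample host and port values are hypothetical, and only the two property keys come from this document. It works on the file's text, so you would read and write AMServer.properties yourself.

```python
def set_properties(text, updates):
    """Return `text` with each key in `updates` set.

    An existing `key=value` line is replaced in place; keys that are not
    present are appended at the end. Comment lines are left untouched.
    """
    remaining = dict(updates)
    out = []
    for line in text.splitlines():
        key = line.split("=", 1)[0].strip()
        is_comment = line.lstrip().startswith("#")
        if "=" in line and not is_comment and key in remaining:
            out.append(f"{key}={remaining.pop(key)}")
        else:
            out.append(line)
    out.extend(f"{k}={v}" for k, v in remaining.items())
    return "\n".join(out) + "\n"


# Hypothetical values; substitute your Standby Central server's host name and SSL port.
standby_central = {
    "am.standbyadminserver.host": "standby-central.example.com",
    "am.standbyadminserver.port": "8443",
}
print(set_properties("", standby_central), end="")
```

Take a copy of the original file before overwriting it, and apply the same values on both the Primary and the Standby Probe.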
Now, connect to the Central client, go to the Admin page and click on the Probe link. In the page displayed, the Primary Probe will be listed.
Click on the Edit icon corresponding to the Primary Probe. Do the following in the resulting Edit Probe page.
Enable the Failover Server Details check box.
Enter the Standby Probe details, i.e., the Host Name, Web Server Port, and SSL Port.
Notifications will be sent when the Standby Probe server takes over the role of the Primary. Click here for the steps to configure the notifications.
Start the Primary Probe.
Make sure that the Primary Probe startup process is complete, and then start the Standby Probe.
The Standby Probe now runs in hot-standby mode, which means that it does not provide any service and cannot be reached from the web client. It only monitors the running status of the Primary Probe server. Once the Primary server goes down, the Standby Probe server is ready to take over and start providing the service.
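The monitoring behaviour described above can be pictured as a loop that counts failed status checks. This is a conceptual sketch only and not IT360's actual implementation: the function name `takeover_index` is hypothetical, the threshold corresponds loosely to the RETRY_COUNT attribute shown in the FailOver.xml sample later in this document, and resetting the counter after a successful check is an assumption.

```python
def takeover_index(heartbeats, retry_count):
    """Return the index of the status check at which a standby would take over.

    `heartbeats` is a sequence of booleans, one per status check
    (True = Primary responded). Returns None if no takeover happens.
    """
    misses = 0
    for i, alive in enumerate(heartbeats):
        # Assumption: a successful check resets the count of failed attempts.
        misses = 0 if alive else misses + 1
        if misses >= retry_count:
            return i
    return None
```

With a threshold of 1, the first failed check triggers the takeover; a larger threshold tolerates short outages of the Primary at the cost of a slower failover.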
You will receive email notifications when the Standby server takes over the functions that were being performed by the Primary. For this, you need to manually add the email ID(s) in the file '<IT360_HOME>\applications\working\conf\FailOver.xml'.
Sample entries are as follows:
<PRIMARY HEART_BEAT_INTERVAL="60" />
<STANDBY FAIL_OVER_INTERVAL="60" RETRY_COUNT="1">
<BACKUP ENABLED="TRUE" BACKUP_INTERVAL="600"/>
SUBJECT="Primary Server Failed"
BODY="Primary Server is failed and taken over by the StandBy Server"/>
The optimal value for the RETRY_COUNT is 10.
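To confirm what a server will actually use, the STANDBY settings can be read back programmatically. The sketch below uses Python's standard `xml.etree.ElementTree` on a fragment modelled on the sample entries. It assumes that STANDBY wraps BACKUP and is closed with `</STANDBY>`; the real FailOver.xml contains further elements that are not reproduced here, and the function name `read_standby_settings` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: STANDBY encloses BACKUP and has a closing tag.
FRAGMENT = """
<STANDBY FAIL_OVER_INTERVAL="60" RETRY_COUNT="1">
  <BACKUP ENABLED="TRUE" BACKUP_INTERVAL="600"/>
</STANDBY>
"""


def read_standby_settings(xml_text):
    """Parse a STANDBY element and return its settings as a dict."""
    standby = ET.fromstring(xml_text.strip())
    backup = standby.find("BACKUP")
    return {
        "fail_over_interval": int(standby.get("FAIL_OVER_INTERVAL")),
        "retry_count": int(standby.get("RETRY_COUNT")),
        "backup_enabled": backup is not None
        and backup.get("ENABLED", "").upper() == "TRUE",
    }


print(read_standby_settings(FRAGMENT))
```

For the full file, you would parse it with `ET.parse` and locate the STANDBY element from the document root instead of parsing a fragment.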
If MIBs are added to the Primary server, the same MIBs should also be added to the Standby server once the Standby has taken over the role of the Primary. (Click here to know about the MIB upload procedure for Networks and click here to know about the MIB upload procedure for Applications & Servers)
How do I specify multiple email IDs while configuring the notifications?
Type in the email ids, separated by commas.
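If the recipient list is maintained elsewhere, the comma-separated value can be generated rather than typed. This is a small sketch with a hypothetical helper name, `join_email_ids`; it emits no spaces after the commas, since this document does not say whether surrounding whitespace is accepted, and its validity check is deliberately crude.

```python
def join_email_ids(addresses):
    """Join email IDs into one comma-separated string, skipping blank entries."""
    cleaned = [a.strip() for a in addresses if a.strip()]
    for a in cleaned:
        # Crude sanity check only; it does not validate the address fully.
        if "," in a or a.count("@") != 1:
            raise ValueError(f"not a single email id: {a!r}")
    return ",".join(cleaned)


print(join_email_ids(["admin@example.com", " ops@example.com ", ""]))
```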
What are the recommended values for HEART_BEAT_INTERVAL and FAIL_OVER_INTERVAL while configuring the notifications?
It is recommended to set both HEART_BEAT_INTERVAL and FAIL_OVER_INTERVAL above 20 seconds. Lower values may lead to unexpected behaviour.
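The recommendation can be turned into a quick check before the servers are started. A minimal sketch, assuming the interval values have already been read from the configuration as integers in seconds; the function name `low_intervals` is hypothetical, and the keys follow the attribute names used in the sample entries.

```python
MIN_INTERVAL_S = 20  # the document recommends values above 20 seconds


def low_intervals(settings):
    """Return the names of interval settings that are not above the recommended minimum."""
    return sorted(name for name, value in settings.items() if value <= MIN_INTERVAL_S)


print(low_intervals({"HEART_BEAT_INTERVAL": 60, "FAIL_OVER_INTERVAL": 15}))
```

An empty list means that both intervals satisfy the recommendation.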
What is the purpose of setting RETRY_COUNT while configuring the notifications? Is the value in the sample entries above fixed, or can it be changed?
The Standby server reads its parameters from the 'FailOver.xml' file in the '<IT360_HOME>/applications/working/conf' directory. By specifying the RETRY_COUNT value in this file before starting the service, you instruct the Standby server to retry that many times before taking over the role of the Primary.
By default, this attribute is set to 1, which means that the Standby tries only once before taking over the role of the Primary. Subsequent retry attempts happen at an interval of 60 seconds. For instance, if you need the Standby server to wait for 5 minutes before taking over the role of the Primary, set RETRY_COUNT to 5. Please note that the optimal value for RETRY_COUNT is 10.
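The arithmetic behind that example is simply the retry count multiplied by the retry interval. The sketch below makes this explicit; the function name `takeover_wait_seconds` is hypothetical, and the result is an approximation that follows this document's own example (5 retries at 60 seconds giving 5 minutes) rather than a measured failover time.

```python
def takeover_wait_seconds(retry_count, retry_interval_s=60):
    """Approximate how long the Standby waits before taking over.

    Assumes one attempt per `retry_interval_s` seconds, as in the
    example above; the real delay may differ slightly.
    """
    if retry_count < 1:
        raise ValueError("retry_count must be at least 1")
    return retry_count * retry_interval_s


for count in (1, 5, 10):
    print(count, takeover_wait_seconds(count) / 60, "minutes")
```

Under this approximation, the default of 1 gives a wait of about one minute and the value of 10 gives about ten minutes.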