Configuring Failover Support for OpManager


Failover (redundancy) support for OpManager is necessary to achieve uninterrupted service. If the OpManager DB crashes or loses network connectivity, OpManager stops monitoring your network. Regular backups help you recover from DB crashes, but it takes time for OpManager to resume its service. In the meantime your network is left unmonitored, and critical devices such as routers and mail servers may go down and affect your business. Implementing a redundancy system helps you overcome such failures.


Failover support requires you to configure an OpManager Secondary (Standby) server that keeps monitoring the OpManager Primary server. If the Primary server fails, the Standby server automatically starts monitoring the network. The transition is quick and smooth enough that the end user does not feel the impact of the Primary server's failure or of the subsequent takeover by the Standby. In parallel, the Standby server triggers an email alert about the Primary's failure (sent to the email ID configured in the mail server settings). Once the Primary server is restored to operation, the Standby server automatically goes back to standby mode.


Working Mechanism

The Primary server records its presence by updating a symbolic count in the BEFailover table at a specified interval known as the HEART_BEAT_INTERVAL. The count, known as LASTCOUNT, is incremented with every update. Similarly, the Standby server records its presence by updating its own LASTCOUNT in the BEFailover table.


When the Primary server fails, it stops updating its LASTCOUNT. The Standby server checks the Primary's LASTCOUNT at a periodic interval known as FAIL_OVER_INTERVAL. By default the FAIL_OVER_INTERVAL value is 60 seconds. If required, you can modify it in the Failover.xml file (<OpManager_Standby_home>\conf). For example, if you specify FAIL_OVER_INTERVAL as 50 seconds, the Standby checks the Primary's LASTCOUNT every 50 seconds. Each time the Standby server looks up the LASTCOUNT, it compares the previous and present counts. When the Primary server has failed to update the LASTCOUNT, consecutive counts are the same, so the Standby assumes that the Primary server has failed and starts monitoring the network.
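The comparison the Standby performs can be sketched as follows. This is a minimal in-memory model, not OpManager's implementation: the FailoverTable and StandbyMonitor classes and their method names are hypothetical, and the real BEFailover table lives in the backend DB. The sketch shows only the rule described above, namely that two consecutive identical LASTCOUNT readings are treated as a Primary failure.

```python
class FailoverTable:
    """Hypothetical in-memory stand-in for the BEFailover table."""

    def __init__(self):
        self.last_count = {"primary": 0, "standby": 0}

    def heartbeat(self, server):
        # Each server increments its own LASTCOUNT once per HEART_BEAT_INTERVAL.
        self.last_count[server] += 1


class StandbyMonitor:
    """Compares the Primary's LASTCOUNT across consecutive checks."""

    def __init__(self, table):
        self.table = table
        self.previous = None  # no reading taken yet

    def primary_has_failed(self):
        # Intended to be called once per FAIL_OVER_INTERVAL (60 seconds by default).
        current = self.table.last_count["primary"]
        failed = self.previous is not None and current == self.previous
        self.previous = current
        return failed


table = FailoverTable()
monitor = StandbyMonitor(table)

table.heartbeat("primary")
first_check = monitor.primary_has_failed()   # first reading, nothing to compare against

table.heartbeat("primary")
second_check = monitor.primary_has_failed()  # count advanced, Primary is alive

# The Primary stops updating, so the next check sees the same count twice.
third_check = monitor.primary_has_failed()

print(first_check, second_check, third_check)
```

Under this model, a check interval shorter than the heartbeat interval could let two checks fall between heartbeats and make a healthy Primary look failed, so FAIL_OVER_INTERVAL presumably needs to stay comfortably longer than HEART_BEAT_INTERVAL.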



Installing the Primary Server
If you are already running OpManager, first upgrade to build 7260 before applying build 8000. If you are installing OpManager for the first time, install build 8000 directly. While installing OpManager (build 8000) on the Primary server, select Primary server in the installation wizard and complete the installation process. Then start the Primary server.

         

Installing the Standby Server

While installing OpManager on the standby server,

  1. Select Standby server mode in the installation wizard.


  2. Enter the Primary webserver host, port and login details and complete the installation. Do not start the Standby server.



Note: The Date and Time settings of the Primary and the Standby should be the same.

Configuring Failover:
The procedure for configuring failover support varies according to the backend DB used:


Using MySQL as the backend DB

Using MySQL bundled with OpManager:



If you are using MySQL bundled with OpManager as the backend DB, then follow the steps given below to copy the contents from the primary DB to the standby DB.

  1. Stop the OpManager Primary server.
  2. Open the command prompt on the Primary server and execute startMysql.bat/startMysql.sh (<OpManager_Primary_home>\bin).
  3. On the Standby server, open the command prompt and execute ReplicateDB.bat/ReplicateDB.sh (<OpManager_Standby_home>\bin).
  4. After the DB has been replicated successfully from the Primary to the Standby server, execute stopMysql.bat/stopMysql.sh (<OpManager_Primary_home>\bin) on the Primary server.
  5. Start the OpManager Primary server.
  6. Start the OpManager Standby server.


Using Standalone MySQL:
Steps to be followed on the Primary server:

  1. Stop the Primary server.
  2. Apply build 7260 before applying build 8000.
  3. Upgrade the Primary's remote MySQL (standalone MySQL server) to version 5.0.46.
  4. Copy the my.huge.ini/my.huge.cnf files from the Primary server and paste them under the Primary's remote MySQL installation directory.
  5. Copy the Primary OpManager DB and its details to the Primary's remote MySQL server by
    1. Copying the OpManager DB folder and the ibdata and ib_logs files (available under <OpManager_Primary_home>\mysql\data) and pasting them under the Primary's remote MySQL installation directory.

      or by using the mysqldump utility:

      Configurations to be done in the Primary server:
    1. Run startMySQL.bat or startMySQL.sh (<OpManager_Primary_home>\bin) to start the MySQL server.
    2. Connect to the MySQL bundled with OpManager using a MySQL client and grant the remote MySQL machine access to this DB by entering the following command:
      grant all privileges on *.* TO root@'<Primary's_remote_mysql_machine_name>';
    3. Now go to the remote MySQL server and verify whether it is able to connect to the Primary server by entering the following command:
      mysql -u root -P 13306 -h <Primary_server_name>

      Configurations to be done in the Primary's remote MySQL server:
    4. Start the remote MySQL application 5.0.46 if not started after the upgrade.
    5. From the command prompt, go to the bin directory of the MySQL installation.
    6. Connect with the MySQL client and create a database named OpManagerDB. The name of the database should be the same as the Primary's.
    7. Backup the MySQL data in the Primary server using the following command:
      mysqldump -u root -P 13306 -h <Primary_server_name> OpManagerDB > opm.sql
    8. Restore the data into the new installation using the following command:
      mysql -u root -P mysqlport OpManagerDB < opm.sql

  6. Ensure that OpManager Primary server has started successfully.
  7. Stop the OpManager Primary server.
  8. Now apply the build 8000.
  9. After successfully upgrading to build 8000, add the following option to the MySQL server startup script on the Primary's remote MySQL server.
    For Windows installations: --defaults-file=<mysql_installation_path>\my.huge.ini
    For Linux installations: --defaults-file=<mysql_installation_path>/my.huge.cnf

Steps to be followed on the Standby server:

  1. Install build 8000 directly.
  2. Install MySQL version 5.0.46 or above on the Standby's remote MySQL DB machine and start MySQL.
  3. Copy the my.huge.ini/my.huge.cnf files from the Standby server and paste them under the Standby's remote MySQL installation directory.
  4. Copy the Standby OpManager DB and its details to the Standby's remote MySQL installation directory by
    1. Copying the OpManager DB folder and the ibdata and ib_logs files (available under <OpManager_Standby_home>\mysql\data) and pasting them under the Standby's remote MySQL installation directory.

      or by using the mysqldump utility:

      Configurations to be done in the Standby server:
    1. Run startMySQL.bat or startMySQL.sh (<OpManager_Standby_home>\bin) to start the MySQL server.
    2. Connect to the server using a MySQL client and grant the remote MySQL machine access to this DB by entering the following command:
      grant all privileges on *.* TO root@'<Standby's_remote_mysql_machine_name>';
    3. Now go to the Standby's remote MySQL server and verify whether it is able to connect to the Standby server by entering the following command:
      mysql -u root -P 13306 -h <Standby_server_name>

      Configurations to be done in the Standby's remote mysql server:
    4. From the command prompt, go to the bin directory of the MySQL installation.
    5. Connect with the MySQL client and create a database named OpManagerDB. The name of the database should be the same as the Standby's.
    6. Backup the MySQL data in the Standby server using the following command:
      mysqldump -u root -P 13306 -h <Standby_server_name> OpManagerDB > opm.sql
    7. Restore the data into the new installation using the following command:
      mysql -u root -P mysqlport OpManagerDB < opm.sql

  5. After successfully restoring the Standby OpManager DB to its remote MySQL server, add the following option to the MySQL server startup script on the Standby's remote MySQL server.
    For Windows installations: --defaults-file=<mysql_installation_path>\my.huge.ini
    For Linux installations: --defaults-file=<mysql_installation_path>/my.huge.cnf
Now start the Primary and Standby OpManager servers and their respective remote MySQL servers.
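
The mysqldump backup and restore commands used in the steps above can also be assembled programmatically, for example when scripting the same procedure for both the Primary and the Standby. The following is a minimal sketch, assuming the port 13306 and the database name OpManagerDB shown above. The function names dump_command and restore_command, the host name opm-primary and the port 3306 are hypothetical, and the sketch only builds and prints the commands without running them.

```python
import shlex


def dump_command(server_name, port=13306, database="OpManagerDB", user="root"):
    """Argument list for dumping the DB bundled with an OpManager server."""
    return ["mysqldump", "-u", user, "-P", str(port), "-h", server_name, database]


def restore_command(mysql_port, database="OpManagerDB", user="root"):
    """Argument list for loading a dump into the remote MySQL instance."""
    return ["mysql", "-u", user, "-P", str(mysql_port), database]


# Hypothetical host name and port, for illustration only.
dump = dump_command("opm-primary")
restore = restore_command(3306)

# Printed with the same redirections as in the steps above.
print(shlex.join(dump) + " > opm.sql")
print(shlex.join(restore) + " < opm.sql")
```

To execute the commands instead of printing them, the argument lists could be passed to subprocess.run with opm.sql opened as the stdout (for the dump) or stdin (for the restore) file handle.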


Using MSSQL as the backend DB

If you are running OpManager with MSSQL as the backend DB, implement clustering. Clustering refers to an array of databases that store the data and share a single virtual IP. If any DB in the cluster fails, the other DBs still have the data, which provides high availability. The Primary server sends all its data to the virtual IP and the data gets stored in multiple locations. If the Primary fails and the Standby server takes over monitoring the network, the Standby also sends its data to the same virtual IP.


For configuring MSSQL server clustering, see the following link published by Microsoft.
http://www.microsoft.com/technet/prodtechnol/sql/2000/maintain/failclus.mspx#EDAAC

MSSQL

For MSSQL, the Standby OpManager server can be started once the installation is completed, provided you have already configured MSSQL clustering for Primary server.


Once the Primary server fails, the Standby server takes over the Primary role and starts monitoring the network. Once the Primary server is up again, the Standby server goes back to standby mode and resumes monitoring the Primary server.

Copyright © 2012, ZOHO Corp. All Rights Reserved.