High Availability Best Practices


This document describes best practices for using Sawmill in a high availability or cluster environment.

To achieve high availability with Sawmill, create a cluster of Sawmill nodes that share a disk. Each node in the cluster mounts the shared drive, which contains the LogAnalysisInfo folder of the Sawmill installation. No other data sharing is necessary.

Sawmill keeps its data and configuration information in the LogAnalysisInfo folder (the one exception is the database itself, when a MySQL database is used). Sharing the LogAnalysisInfo folder in this manner therefore creates a Sawmill cluster, in which each node runs its own Sawmill binary from the directory containing LogAnalysisInfo.
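Before starting Sawmill on a node, it can be useful to confirm that the node actually sees the shared folder. The following Python sketch is one possible check, not part of Sawmill itself. The function name and the install path in the usage line are hypothetical, and the read/write test reflects an assumption that each node needs both kinds of access to the folder.

```python
import os


def node_sees_shared_folder(install_dir: str) -> bool:
    """Return True if install_dir contains an accessible LogAnalysisInfo folder.

    Assumption: a node needs read and write access to the shared folder.
    """
    folder = os.path.join(install_dir, "LogAnalysisInfo")
    return os.path.isdir(folder) and os.access(folder, os.R_OK | os.W_OK)


if __name__ == "__main__":
    # Hypothetical install location; substitute the directory that holds
    # the Sawmill binary on this node.
    install_dir = "/opt/sawmill"
    if node_sees_shared_folder(install_dir):
        print("Shared LogAnalysisInfo folder is available on this node.")
    else:
        print("Shared LogAnalysisInfo folder is missing or not accessible.")
```

Running a check like this on every node, for example from a startup script, catches a failed mount before the node is added to the cluster.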

Each node in the cluster will provide the same profile list, reports, and administrative interface, so any node can be accessed at any time, and will give the same results as any other node.

With a load balancer in front of the cluster, clients can access a single IP address, and each request will be forwarded to an active node in the cluster. If one node goes down, the load balancer will send traffic to another node instead.

An HTTP session in Sawmill can move between nodes without problems, so the load balancer can distribute each request randomly and does not need to keep a particular session on the same node.
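The routing behavior described above can be illustrated with a short Python sketch. It is only a model of what a load balancer does, not the configuration of any particular product: the function name, the node names, and the is_up health-check callback are all hypothetical. Each request is sent to a randomly chosen active node, with no session affinity, and a node that is down is simply skipped.

```python
import random
from typing import Callable, Optional, Sequence


def pick_node(nodes: Sequence[str], is_up: Callable[[str], bool]) -> Optional[str]:
    """Choose a random active node for one request.

    No session affinity is kept: every call is independent of earlier calls.
    Returns None if no node is currently active.
    """
    active = [node for node in nodes if is_up(node)]
    if not active:
        return None
    return random.choice(active)


if __name__ == "__main__":
    # Hypothetical cluster of three nodes, one of which has gone down.
    cluster = ["node-a", "node-b", "node-c"]
    down = {"node-b"}
    for request_id in range(5):
        target = pick_node(cluster, lambda node: node not in down)
        print(f"request {request_id} -> {target}")
```

Because any node gives the same results as any other, a simple policy like this is sufficient; sticky sessions are not required.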

This approach works with any Sawmill installation, whether it uses internal databases, MySQL databases, or a combination of the two.