Profile Tuning For High Volume


A default profile created in Sawmill is optimized for maximum report generation speed. These settings can be problematic for very large datasets (more than 200 million lines of log data), because the database then requires a very large amount of time, memory, and disk space to build. They should be modified before any large import begins.

It is best to start by turning off most of the database indices (all fields are indexed by default) in Config->Database Fields, and most of the cross-references in Config->Cross-reference Groups. Turning off a cross-reference group will make the database build faster, but will make the corresponding report much slower (minutes instead of seconds). Turning off an index will make the database build faster and the database smaller, but will make filtering on the corresponding field slower. The date/time cross-reference group should generally be left enabled, as it is used in nearly every report and is not very expensive.

Field complexity must also be managed for large datasets. Any normalized (non-numerical) field with more than about 10 million unique values can become a performance liability, and should be simplified or removed if possible. Fields can be simplified using a Log Filter, for instance by setting the field to a single constant value if it is not needed. Database fields can be removed completely using the command line “remove_database_field” action. During a large database build, monitor the web interface progress display for the build (click Show Reports on the profile) to see whether any performance warnings appear. These warnings also appear in the TaskLog.
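Before importing, it can help to estimate how many unique values each field is likely to have. The following is a minimal Python sketch, not part of Sawmill, that counts distinct values per column in a sample of log lines. It assumes a simple delimiter-separated log format; the function name unique_value_counts and the sample lines are hypothetical. Run it on a sample rather than the full dataset, since it holds every distinct value in memory.

```python
def unique_value_counts(lines, delimiter=None):
    """Return {column index: number of distinct values seen in that column}.

    Assumption: fields are separated by `delimiter` (whitespace when None).
    """
    seen = {}
    for line in lines:
        for index, value in enumerate(line.split(delimiter)):
            seen.setdefault(index, set()).add(value)
    return {index: len(values) for index, values in seen.items()}


# Hypothetical sample; in practice, read a few hundred thousand lines from a log file.
sample = [
    "10.0.0.1 GET /index.html 200",
    "10.0.0.2 GET /index.html 200",
    "10.0.0.1 GET /about.html 404",
]

counts = unique_value_counts(sample)
for index, count in sorted(counts.items()):
    print(f"column {index}: {count} unique values")
```

A column whose unique count keeps growing in step with the sample size (session IDs or full URLs with query strings, for example) is a likely candidate for simplification or removal.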

Single monolithic databases can also be a performance issue. Whenever possible, very large datasets should be segmented into separate profiles, each with its own database: for instance, one profile per server, one profile per month, or one profile per region.
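If the source logs are not already separated along these lines, they can be split before import. The following is a minimal Python sketch of month-based segmentation, not a Sawmill feature. It assumes timestamps in the Apache Common Log Format style, such as [10/Oct/2000:13:55:36 -0700], with English month abbreviations; the function names month_key and segment_by_month are hypothetical.

```python
import re
from collections import defaultdict
from datetime import datetime

# Assumption: timestamps look like [10/Oct/2000:13:55:36 -0700].
TIMESTAMP = re.compile(r"\[(\d{2}/[A-Za-z]{3}/\d{4}):")


def month_key(line):
    """Return the line's month as 'YYYY-MM', or None if no timestamp is found."""
    match = TIMESTAMP.search(line)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%d/%b/%Y").strftime("%Y-%m")


def segment_by_month(lines):
    """Group log lines by month; lines without a timestamp go to 'unparsed'."""
    segments = defaultdict(list)
    for line in lines:
        segments[month_key(line) or "unparsed"].append(line)
    return dict(segments)


# Hypothetical sample lines.
sample = [
    '10.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET / HTTP/1.0" 200 2326',
    '10.0.0.2 - - [02/Nov/2000:08:01:12 -0700] "GET /a HTTP/1.0" 200 512',
    "line without a timestamp",
]
segments = segment_by_month(sample)
for month, month_lines in sorted(segments.items()):
    print(month, len(month_lines))
```

Each resulting group could then be written to its own file and used as the log source for a separate profile.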

Pre-Sales Support Is Strongly Recommended For Large Datasets

If you are considering Sawmill for a very large dataset (more than 200 million lines of data), it is recommended that you contact Sawmill technical support in advance to get expert guidance on your profile configuration. There is no charge for pre-sales technical consultations, and they are very likely to improve your initial experience of Sawmill.

Contact support@sawmill.net for pre-sales support.