{=
  include("docs.util");
  start_docs_page(docs.technical_manual.page_titles.realtimelog);
=}

<p>$PRODUCT_NAME typically imports log data in batches, and reports are not available while an import is running. This is efficient, but it has two disadvantages:</p>

<p>1. Reporting is not available at all times; reports requested during a database build or update must wait until the build is complete. For large datasets, this can mean significant reporting downtime.</p>
<p>2. Reports are not up to date in the period between updates: new log data has arrived in the log source, but it will not appear in the reports until the next update.</p>

<p>To address these issues, $PRODUCT_NAME has a "real-time" option. When "Real-time" is checked for a profile in the Create New Profile wizard, $PRODUCT_NAME allows reporting during database import: the import frequently checks for report requests, and if it sees one, it temporarily halts, allows the report to run, and then resumes.</p>
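<p>The check-halt-resume behavior described above can be illustrated with a minimal Python sketch. This is not $PRODUCT_NAME's implementation; the names <code>import_with_reports</code>, <code>serve_pending</code>, and <code>check_every</code> are hypothetical, the "database" is a plain list, and a report request is modeled as a callable placed on a queue.</p>

```python
import queue
from typing import Callable, Iterable, List

# Hypothetical model: a report request is a callable that reads the database.
ReportRequest = Callable[[List[str]], None]


def serve_pending(database: List[str], report_requests: "queue.Queue") -> None:
    """Run every report request that is currently waiting, then return."""
    while True:
        try:
            report = report_requests.get_nowait()
        except queue.Empty:
            return  # nothing left to serve; the import can resume
        report(database)


def import_with_reports(
    lines: Iterable[str],
    database: List[str],
    report_requests: "queue.Queue",
    check_every: int = 1000,
) -> None:
    """Import lines, pausing every `check_every` lines to serve reports."""
    for count, line in enumerate(lines, start=1):
        database.append(line)  # stand-in for the real import work
        if count % check_every == 0:
            # Import halts here while pending reports run, then continues.
            serve_pending(database, report_requests)
    # Serve anything that arrived after the last periodic check.
    serve_pending(database, report_requests)
```

<p>In this sketch a smaller <code>check_every</code> makes reports more responsive at the cost of more frequent checks, which mirrors the build-side overhead of watching for report requests.</p>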

<p>Real-time reporting is most useful with a continuous log source, such as a command-line log source that watches for new data and "prints" it to standard output as soon as it sees it. A very simple example of this is the UNIX "tail -f" command, which can be used to stream a single file into the database as new lines arrive. However, a robust implementation of a real-time log source must be more sophisticated, because it must ensure that each line is sent once, and only once, even if $PRODUCT_NAME restarts. A more robust implementation might monitor a directory, dump any files it sees there to standard output, and then move them to another directory to mark them as processed.</p>
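<p>The directory-monitoring approach described above might look like the following Python sketch. It is an illustration under stated assumptions, not a script shipped with $PRODUCT_NAME: the function names <code>drain_once</code> and <code>watch</code>, the directory arguments, and the one-second polling interval are all hypothetical, and it assumes files are complete when they appear in the incoming directory.</p>

```python
import shutil
import sys
import time
from pathlib import Path


def drain_once(incoming: Path, processed: Path, out=sys.stdout) -> int:
    """Write every file in `incoming` to `out`, then move it to `processed`.

    Returns the number of files handled in this pass.
    """
    processed.mkdir(parents=True, exist_ok=True)
    handled = 0
    # Sort by name so files are emitted in a predictable order.
    for path in sorted(p for p in incoming.iterdir() if p.is_file()):
        with path.open("r", encoding="utf-8", errors="replace") as f:
            for line in f:
                out.write(line)
        out.flush()
        # Moving the file marks it as processed, so a later pass (or a
        # restart of this script) will not send it again.
        shutil.move(str(path), str(processed / path.name))
        handled += 1
    return handled


def watch(incoming: Path, processed: Path, poll_seconds: float = 1.0) -> None:
    """Poll `incoming` forever, streaming new files to standard output."""
    while True:
        if drain_once(incoming, processed) == 0:
            time.sleep(poll_seconds)  # nothing new; check again shortly
```

<p>A command-line log source could call <code>watch()</code> with the two directories of its choice. Note that this sketch only approximates the once-and-only-once requirement: if the script is interrupted partway through a file, that file is still in the incoming directory and will be sent again from the start on the next run.</p>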

<p>Real-time profiles have additional overhead that makes both database building and report generation slower than in non-real-time profiles. On the build side, there is a slight overhead from watching for report requests, which is not needed in non-real-time profiles. The reporting performance difference is more significant: because a real-time database build never reaches a "clean up" phase, none of the aggregation tables (cross-reference tables, indices, sessions, hierarchies, etc.) are built during database building, and they must be built on demand when a report is requested. This introduces a delay before the report appears, which can be significant for large datasets. Therefore, if maximum reporting speed is required, it is best to leave real-time processing off.</p>


{= end_docs_page() =}