


Sawmill Newsletter

  July 15, 2007



Welcome to the Sawmill Newsletter!

You’re receiving this newsletter because you checked the box to join our mailing list when you downloaded or purchased Sawmill. If you wish to be removed from this list, please send an email with the subject line “UNSUBSCRIBE” to newsletter@sawmill.net .


News

We are currently shipping Sawmill 7.2.9. You can get it from http://sawmill.net/download.html .

This issue of the Sawmill Newsletter describes sequential scheduling: how to run Sawmill tasks one after another, using a script and an external scheduler.


Get the Most out of Sawmill with Professional Services

Looking to get more out of your statistics from Sawmill? Running short on time, but need the information now to make critical business decisions? Our Professional Services experts are available for just this situation and many others. We will assist in the initial installation of Sawmill using best practices, and work with you to integrate and configure Sawmill to generate reports in the shortest possible time. We will tailor Sawmill to your environment, create a customized solution, be sensitive to your requirements, and stay focused on your business needs. We will also show you areas of Sawmill you may not be aware of, and demonstrate streamlined methods for getting the information you need more quickly. Often you'll find that Sawmill's deep analysis can provide information you've been after but never knew how to reach, or never realized was readily available in reports. Sawmill is an extremely powerful tool for your business, and most users exercise only a fraction of this power. That's where our experts can really make the difference. They have many years of experience with Sawmill, across a large cross section of devices and business sectors. Our promise is to quickly come up with a cost-effective solution that fits your business, and to greatly expand your ROI with only a few hours of fee-based Sawmill Professional Services. For more information, a quote, or to speak directly with a Professional Services expert, contact consulting@flowerfire.com.



Tips & Techniques: Sequential Scheduling

Sawmill's built-in Scheduler provides basic task scheduling capabilities. You can configure it to run a particular task, for a particular profile, at a particular time. For instance, you can configure it to update the databases for all your profiles at midnight every night, or to email yourself a Single-page Summary for a particular profile, every day at 8 AM. The Scheduler is available in the Admin page of the web interface.

However, there are some restrictions on what tasks can be run simultaneously. Database builds and updates, and "remove data" tasks, modify the database, and can conflict with each other and with reports if they are run simultaneously on the same profile. Depending on the number of processors (or cores) in the system, and the speed of the disk, you may not be able to run more than a few simultaneous tasks--each task generally uses as much as a full processor (or core), so on a four-processor system, performance will suffer if there are more than four simultaneous processes, even if they are on different profiles.

Therefore, it is often useful to run tasks sequentially rather than simultaneously. The Sawmill 7 Scheduler supports this in a few cases; you can rebuild or update databases for "all profiles," and it will rebuild or update them in sequence, starting the next task when the previous one completes (using one processor at all times). Also, some degree of sequencing is possible by spacing the scheduled tasks so they cannot overlap; for instance, if a database update is to be followed by a report generation, and the database update takes 1 hour, then scheduling the report generation two hours after the start of the database update will generally ensure that it runs after the update completes. But this is problematic, because the time taken for a task can never really be predicted; if the log data suddenly gets larger, or if the system slows down for some other reason, that database update might take 3 hours, in which case the report generation will start while the update is still running, and will fail. What is sometimes needed is true sequencing of arbitrary tasks, running each task when the previous one completes.

To perform sequencing of arbitrary tasks, it is easiest to use a script (a .BAT file on Windows), which executes the tasks with command line syntax, one after another. For instance, this .BAT file would do the database update, and then email the report:

  "C:\Program Files\Sawmill 7\SawmillCL" -p profilename -a ud
  "C:\Program Files\Sawmill 7\SawmillCL" -p profilename -a srbe -ss mail -rca me@here.com -rna you@there.com -rn overview

This script runs a database update of profilename and, as soon as the update completes, emails the Overview report. To create it, make a new text file (for instance, with Notepad), name it update_and_email.bat, and paste the two lines above into it. On non-Windows systems, the script would be very similar, but with the pathname of the "sawmill" binary instead of C:\Program Files\Sawmill 7\SawmillCL . There, you might call it update_and_email.sh, and make it executable with "chmod a+x update_and_email.sh".
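
As a rough sketch of the non-Windows version, update_and_email.sh might look something like the following. The path /opt/sawmill/bin/sawmill is an assumption (use the actual location of your "sawmill" binary), and the SAWMILL variable and the update_and_email function are illustrative names only; the command line options are the ones used in the .BAT file.

```shell
#!/bin/sh
# update_and_email.sh: run two Sawmill tasks one after the other.
# ASSUMPTION: this path is only an example; set it to your own "sawmill" binary.
SAWMILL="${SAWMILL:-/opt/sawmill/bin/sawmill}"

# Hypothetical helper; $1 is the profile name.
update_and_email() {
    # Update the database. The shell waits here until the update has finished.
    "$SAWMILL" -p "$1" -a ud
    # Then email the Overview report, with the same options as the .BAT file.
    "$SAWMILL" -p "$1" -a srbe -ss mail -rca me@here.com -rna you@there.com -rn overview
}

# Run the sequence only if the binary is where we expect it.
if [ -x "$SAWMILL" ]; then
    update_and_email profilename
else
    echo "sawmill binary not found at $SAWMILL" >&2
fi
```

Because the shell does not start a command until the previous one has exited, the report is sent only after the update has finished, however long the update takes.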

The Sawmill Scheduler cannot run an arbitrary script, so to schedule this script it is necessary to use an external scheduler. On Windows, the Windows Scheduler is usually the best choice. Go to Control Panel, choose Scheduled Tasks, and choose Add Scheduled Task. This will start the Scheduled Task Wizard. In the wizard, select the .BAT file as the program to run, and schedule it to run daily at midnight.

Now, the .BAT file will run every day at midnight, and it will run its two tasks sequentially. Any number of tasks can be added to this script, and they will all be run sequentially, with no gap in between.

On Linux, MacOS, UNIX, or other operating systems, this type of scheduling is usually done with cron, the built-in scheduler. The cron table can be edited through the graphical interface of the operating system, if one is available, or it can be edited from the command line with the command "crontab -e", adding a line like this to the cron table:

  0 0 * * * /opt/sawmill/bin/update_and_email.sh >> /opt/sawmill/log/update_and_email.log 2>&1

This runs the update_and_email.sh script every day at midnight, logging the output to a file.


Sawmill 8 Scheduler Features

The next major release of Sawmill, version 8, will include direct support for sequential scheduling in the Sawmill Scheduler, so it will be possible to do this sort of "A then B then C etc." scheduling directly from the Sawmill Scheduler.


Advanced Topic: Optimal Scheduling for Multiple Processors/Cores

If you have a multiprocessor (or multi-core) system, the approach above does not take full advantage of all your processors, because the .BAT file (or script) runs only one task at a time. It is possible to configure Sawmill to use multiple processors for database builds or updates (using the Log Processing Threads option), but report generation always uses one processor, and multi-processor database builds/updates are less efficient than single-processor builds (for example, running on two processors is faster, but not twice as fast).

If you have many tasks, the optimal scheduling for multiple processors is to use single-threaded builds and updates, but to keep one task running per processor at all times. For instance, if there are four processors, you start four single-threaded tasks, and as each task completes, you start another one, always ensuring that there are four tasks running. This can be approximated by running four scripts (or four .BAT files), like the one above, at the same time, as long as each script takes roughly the same amount of time as the others. That splits the work into four roughly equal pieces, and runs them simultaneously.

It is also possible to write a script which does this sort of scheduling for you, and we have one, written in Perl. The script, called multisawmill.pl, is available by emailing support@sawmill.net. At this point, it is limited to one type of task per run, so for instance it can run 100 database builds, split over four processors, or 1000 report generations, split over 8 processors.
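
For a sense of what keeping one task running per processor looks like in practice, here is a minimal shell sketch. It is not multisawmill.pl; it relies on the -P option that GNU and BSD versions of xargs provide. The binary path, the default of four jobs, and the names SAWMILL, JOBS, and update_all are all assumptions made for illustration, and it assumes each profile is configured for single-threaded updates.

```shell
#!/bin/sh
# A sketch of "keep N tasks running at all times"; this is not multisawmill.pl.
# ASSUMPTIONS: the binary path and the default of four jobs are examples only.
SAWMILL="${SAWMILL:-/opt/sawmill/bin/sawmill}"
JOBS="${JOBS:-4}"

# Hypothetical helper: reads profile names on standard input, one per line,
# and runs one database update per profile.
update_all() {
    # xargs -P keeps up to $JOBS commands running, starting the next one
    # as soon as any running one exits. -I {} substitutes each input line.
    xargs -P "$JOBS" -I {} "$SAWMILL" -p {} -a ud
}

# When a file of profile names is given as the first argument, process it.
if [ -n "${1:-}" ]; then
    update_all < "$1"
fi
```

Saved as, say, update_all.sh, this could be run as "./update_all.sh profiles.txt", where profiles.txt is a hypothetical file listing one profile name per line. Unlike four fixed scripts, this approach does not need the tasks to take similar amounts of time, because a new update starts whenever any running one finishes.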


Questions or suggestions? Contact support@sawmill.net. If you would like a Sawmill Professional Services expert to implement this, or another customization, contact consulting@sawmill.net.

