We've gotten quite a number of requests along these lines lately. We will be profiling on SPARC very soon, to try to determine the bottleneck, if any.
Multiprocessing support has been contemplated before, and indeed was implemented in version 6.0. But it never worked well: it gave only a 2x speedup on 6 processors, and usually crashed to boot, so we disabled it. We've been tentatively planning to reimplement it in version 7, but using a more distributed model (in version 6, it used SMP and required a lot of locking, which is why it was so slow). This model will involve splitting the log data into separate chunks, processing each chunk with a completely separate Sawmill process, and then adding the processed results into the database. The final stage (adding into the database) will be single-threaded, but the rest of it (the majority of the time spent) will be done in parallel, so it should give a better speedup than the previous model. Even cooler, it should be possible to distribute it across multiple machines if desired (DMP). Given the dull roar of the crowds clamoring for this feature, I believe it will be part of version 7, but we have not yet begun implementing it.
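To make the split/process/merge idea concrete, here is a rough sketch in Python. It is not Sawmill code and says nothing about how version 7 will actually be built; the function names, the toy log format (request path in the second whitespace-separated field), and the use of a hit counter as a stand-in for the database are all assumptions made for illustration.

```python
from collections import Counter


def split_into_chunks(lines, n_chunks):
    """Split the log lines into at most n_chunks contiguous pieces."""
    size = max(1, -(-len(lines) // n_chunks))  # ceiling division
    return [lines[i:i + size] for i in range(0, len(lines), size)]


def process_chunk(chunk):
    """Independent per-chunk work: count hits per requested path.

    Assumes a toy format where the path is the second field of each line.
    Needs no shared state, so it requires no locking.
    """
    counts = Counter()
    for line in chunk:
        fields = line.split()
        if len(fields) >= 2:
            counts[fields[1]] += 1
    return counts


def merge_results(partials):
    """Single-threaded final stage: fold partial results into one 'database'."""
    database = Counter()
    for partial in partials:
        database.update(partial)  # adds counts key by key
    return database


def analyze(lines, n_chunks, pool=None):
    """Split, process each chunk, then merge.

    `pool` is any object with a map(func, iterable) method, for example a
    multiprocessing.Pool; with pool=None the chunks are processed one after
    another in the current process.
    """
    chunks = split_into_chunks(lines, n_chunks)
    mapper = pool.map if pool is not None else map
    return merge_results(mapper(process_chunk, chunks))
```

To process the chunks in separate processes, one would create a `multiprocessing.Pool` inside an `if __name__ == "__main__":` guard and pass it as `pool`. Only `merge_results` touches the combined result, which mirrors the single-threaded final stage described above.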
Version 7, by the way, is well underway, but at this point it's a mess, because we're changing so many things about the underlying database and configuration architecture that it's not even beta quality yet. When it reaches beta quality, it will be released as a beta, probably a public beta. That may be several months yet, though, and the first beta will have only a small set of the version 7 features, and will probably not include multiprocessing.
-
Greg Ferrar, Sawmill Product Manager
http://www.flowerfire.com/sawmill/