These are the changes in the beta branch which have been added since the previous release. These changes are not available in the current beta release, but they are available in the current beta pre-release.
Extended the existing SNARE log format plug-in to support SNARE v2.0, which uses tabs as separators.
Added support for Tipping Point log format.
Added support for Blue Socket log format.
Added support for Cyberguard log format.
Added support for %I and %O fields in Apache Custom log format.
Added support for PHP Error Log Format.
Added support for %{%Y-%m-%d %H:%M:%S}t date format in Apache Custom log files.
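As a rough illustration of the date layout this Apache format string produces, the following Python sketch parses one such timestamp. The function name parse_custom_time and the sample value are assumptions made for the example; this is not Sawmill's own parser.

```python
from datetime import datetime

# strftime-style layout taken from the Apache directive %{%Y-%m-%d %H:%M:%S}t
APACHE_TIME_LAYOUT = "%Y-%m-%d %H:%M:%S"


def parse_custom_time(field: str) -> datetime:
    """Parse one timestamp field that was logged with the layout above."""
    return datetime.strptime(field, APACHE_TIME_LAYOUT)


# Hypothetical sample value, not taken from real log data.
stamp = parse_custom_time("2005-03-14 09:26:53")
```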
Added support for Trend Micro InterScan Messaging Security Suite eManager Log Format.
Added support for SAS Firewall Log Format.
Added support for Squarespace Log Format.
Added support for a variant line layout in Trend Micro InterScan Messaging Security Suite
eManager Log Format.
Improved Fortigate support to track many additional fields, and to track all
numerical fields.
Added support for Enterasys Dragon IDS Log Format.
Added detection of Symantec Gateway Security Binary Log Format.
Sawmill cannot process this log format
directly because it is a binary format, but it now generates a useful error message about that,
describing how to convert it to a format which Sawmill does support (using remotelogfile8).
Added detection of Watchguard Binary (WGL) Log Format.
Sawmill cannot process this log format
directly because it is a binary format, but it now generates a useful error message about that,
describing how to convert it to a format which Sawmill does support (using
WSEP Text Export or Historical Reports Exports).
Improved support for Symantec Gateway Security Log Format; added support for some new
fields and for two different formats, tab separated and divided listed
fields and comma and tab separated and equal divided listed fields.
Added support for CFT Account Log Format.
Added support for Watchguard Firebox Export Header (dd/mm/yy dates) Log Format.
Added support for Who's Clicking Who Log Format.
Improved robustness by changing progress prediction errors to warnings.
Progress prediction errors occur when the order of steps that are predicted for a task
does not match the actual steps which occur. This occurs due to bugs, but it is
difficult to predict in every case what steps will occur during a task, so there have
been many bugs of this sort. This change works around this sort of bug by displaying a
warning message and patching the progress prediction to match the actual steps
being taken by the task. This may mean that the steps in the progress page will change
in some cases, but this sort of issue will no longer be a fatal issue, terminating
the database build or report.
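The idea of downgrading a prediction mismatch to a warning and patching the predicted steps can be sketched roughly as follows. The function name reconcile_steps, the step names, and the exact patching rule are assumptions for illustration only; the actual behavior inside Sawmill may differ.

```python
import warnings


def reconcile_steps(predicted, actual_step, position):
    """Return a step list in which the step at `position` is `actual_step`.

    A mismatch produces a warning rather than raising an error.
    """
    expected = predicted[position] if position < len(predicted) else None
    if expected == actual_step:
        return list(predicted)
    warnings.warn(
        f"progress prediction mismatch at step {position}: "
        f"expected {expected!r}, got {actual_step!r}"
    )
    # Keep the steps already completed, insert the step that actually
    # occurred, then keep the remaining predicted steps.
    patched = list(predicted[:position]) + [actual_step]
    patched += [step for step in predicted[position:] if step != actual_step]
    return patched
```

With this approach the task keeps running after a mismatch; only the list of steps shown to the user changes.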
Added support for 8e6 Content Appliance Log Format.
Added support for Cumulus Digital Asset Management Actions Log Format.
Improved Win2K Performance Monitor log format to handle a date variant,
and to track many additional numerical fields.
Added support for Netopia 4553 Log Format.
Added support for Solar Winds Syslog Log Format.
Added support for nnBackup Log Format.
Greatly improved support for Microsoft ISA WebProxy CSV Log Format; added
tracking of all fields, including all numerical fields, and categorized
reports.
Improved memory usage by moving itemnum tables to disk, rather than keeping them in memory.
In earlier versions, itemnum tables could take huge amounts of memory (gigabytes)
for very large datasets, when using the internal database; this change makes it possible
to process more data with limited memory.
Improved memory usage by moving the session analysis to disk, rather than performing it in memory.
In earlier versions, the session analysis could take huge amounts of memory (gigabytes)
for very large datasets, when using the internal database; this change makes it possible
to report session statistics for very large datasets with limited memory.
Added a new 'table' type for the internal language, which is now used to keep report tables
on disk. In earlier versions, report tables could take huge amounts of memory (gigabytes)
for extremely large tables; this change makes it possible to display extremely large
tables (millions of lines) with limited memory.
Added support for BigFire / Babylon accounting Log Format.
Improved memory usage by moving integer list information to disk, rather than keeping it in memory.
Integer Lists are used to represent field hierarchies, database indices, and unique item lists.
In earlier versions, integer lists could take up large amounts of memory
for very large datasets, when using the internal database; this change makes it possible
to process more data with limited memory.
Improved memory usage by moving cross-reference tables to disk, rather than keeping them
in memory. Cross-reference tables provide fast access to top-level table queries.
In earlier versions, cross-reference tables could take up large amounts of memory
for very large datasets, when using the internal database; this change makes it possible
to process more data with limited memory.
Added support for DNSone DHCP Log Format.
Added support for Windows Event (Comma Delimited) dd.mm.yyyy Log Format.
Added support for a new variant of Trend Micro InterScan Messaging Security Suite eManager Log Format.
Enhanced Windows Event Log Format (dumpevt.exe export) to handle a variant,
and to track much more information.
Added support for Apache/NCSA Combined Format (NetTracker).
Added support for Squid Log Format With ncsa_auth Package.
Improved performance of SQL database updates by building a complete
separate set of tables (with different suffixes) for the update, then
merging this "database" into the main "database". This is *much* faster
than the old approach, which appended the new data to the main table,
then completely recomputed the xref, subitem, and bottomlevelitem tables.
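A minimal sketch of the build-separately-then-merge approach, using Python's sqlite3 module as a stand-in for the SQL backend. The table names (main_table, main_table_update) and columns are hypothetical; Sawmill's real schema and suffix scheme are not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Existing "main" database table (hypothetical schema).
conn.execute("CREATE TABLE main_table (day TEXT, hits INTEGER)")
conn.execute("INSERT INTO main_table VALUES ('2005-01-01', 10)")

# Build the update in a separate table with a different suffix.
conn.execute("CREATE TABLE main_table_update (day TEXT, hits INTEGER)")
conn.executemany(
    "INSERT INTO main_table_update VALUES (?, ?)",
    [("2005-01-02", 7), ("2005-01-03", 4)],
)

# Merge the update tables into the main tables in one statement,
# then discard the temporary set.
conn.execute("INSERT INTO main_table SELECT day, hits FROM main_table_update")
conn.execute("DROP TABLE main_table_update")
conn.commit()

row_count = conn.execute("SELECT COUNT(*) FROM main_table").fetchone()[0]
total_hits = conn.execute("SELECT SUM(hits) FROM main_table").fetchone()[0]
```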
Improved performance of SQL log processing by keeping itemnum maps local.
This eliminates a huge number of SQL queries during database builds,
improving lines-per-second, especially in situations where the SQL database
is not on the same system as Sawmill.
Improved performance of date range queries with a SQL database by using
xref tables to handle the days selected, when possible. This greatly
improves the performance of queries for large datasets, because only the
smaller xref table needs to be queried; previously, all date range
queries used the huge "main table" for the query.
Improved the performance of SQL queries which use the "main table" of the
database. Previously, these queries required one pass through the main table
for each numerical field which was a "unique" field (e.g. visitors or
unique client IPs), plus one pass through the main table for all the other
fields together. This is now done in a single pass which collects data from
all fields.
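The single-pass idea can be illustrated in Python as below: one loop over the rows accumulates the ordinary sums and the "unique" counts together, instead of one pass per unique field. The field names (bytes, visitor, client_ip) and the function name summarize are assumptions for the example, and the real implementation works in SQL rather than in application code.

```python
def summarize(rows):
    """Collect sums and unique counts in one pass over the main table rows."""
    total_bytes = 0
    visitors = set()
    client_ips = set()
    for row in rows:
        total_bytes += row["bytes"]
        visitors.add(row["visitor"])
        client_ips.add(row["client_ip"])
    return {
        "bytes": total_bytes,
        "visitors": len(visitors),
        "unique_client_ips": len(client_ips),
    }
```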
Improved performance of SQL queries which include "uniques" columns, and
are queried from cross-reference tables.
Previously these were always computed by counting the number of unique
items in the "uniques" table for a particular xref table. Now, the main
xref table for each field contains precomputed values of each unique field,
so it can be queried directly for simple filters, including the Overview
and table reports filtered with a single value per field (e.g. zooms).
This greatly improves the performance of these queries, since table rows
can essentially be read directly from the xref table, without having to be
computed by counting distinct values in a potentially huge list.
Added support for Selenia SAS log format.
Added support for Oracle Express Authentication Log Format.
Added support for per-column subitem level setting for table reports.
This means that it is now possible to generate a "months" report which shows
all months for all years in the database, or a "regions" report which shows all regions in
all countries, etc. Since this can be done per-column, it is now possible (it wasn't before)
to have a multiple column table where one column shows bottom-level items, another column
shows level-1 items, another shows level-2 items, etc. For instance, it is possible
to have a days by regions by pages report, or any other combination of columns and levels.
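One way to picture per-column subitem levels is to truncate each hierarchical value to the level chosen for its column before grouping. The sketch below assumes "/"-separated hierarchical values and hypothetical field names; the function names truncate and build_table are illustrative and are not part of Sawmill.

```python
from collections import Counter


def truncate(value, level, sep="/"):
    """Keep the first `level` components; level=None keeps the bottom-level item."""
    if level is None:
        return value
    return sep.join(value.split(sep)[:level])


def build_table(rows, columns):
    """Count events per combination of columns.

    `columns` is a list of (field_name, level) pairs, one level per column.
    """
    counts = Counter()
    for row in rows:
        key = tuple(truncate(row[field], level) for field, level in columns)
        counts[key] += 1
    return counts
```

For example, a months-by-countries table would use level 2 for a year/month/day field and level 1 for a country/region field.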
Enhanced support for NetScreen FR328 log format to handle a new variant.
Enhanced support for Kiwi Comma-separated format to allow for optional timezone.
Added support for 3Com 3CRGPOE10075 Log Format.
Improved performance of multiprocessor SQL database builds by building
separate databases (actually, separate sets of tables in the same database,
with different suffixes) for each thread, and then merging them to get the
final result. This is much faster than the old approach,
where all threads were writing to the same set of tables at the same time,
which was slow due to table locking and database query latency.
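The per-thread build followed by a merge can be sketched in Python as follows. Here each worker fills its own in-memory counter (standing in for a separate set of tables), so no worker waits on a shared lock, and the partial results are merged at the end. The names build_partial and build_merged and the line layout are assumptions for the example.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def build_partial(lines):
    """Build one worker's private table: hits per first field of each line."""
    table = Counter()
    for line in lines:
        table[line.split()[0]] += 1
    return table


def build_merged(chunks):
    """Build one private table per chunk in parallel, then merge them."""
    with ThreadPoolExecutor(max_workers=max(1, len(chunks))) as pool:
        partials = list(pool.map(build_partial, chunks))
    merged = Counter()
    for partial in partials:
        merged.update(partial)
    return merged
```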
Added support for Norstar PRELUDE and CINPHONY ADC Agent Log Format.
Improved Neoteris format to handle a variant.
Added support for GeoIP tracking in Blue Coat Squid log format.
Added support for SFTP download of log data.
Added support for Mikrotek Router Log Format.
Added support for Kiwi (yyyy/mm/dd, space-separated) Syslog.
Improved support for Gene6 FTP log format to handle uploads.
Added support for praudit log format.
Added support for Norstar PRELUDE and CINPHONY ADC Log Format.
Improved error reporting for command-line errors to eliminate a few HTML
tags and other oddities which could appear there, and to eliminate the
source code filename and line number, for simpler and friendlier error
messages.
Added support for Kaspersky Labs AVP Client (Spanish) Log Format.
Added support for Kaspersky Labs AVP Server (Spanish) Log Format.
Added support for AM/PM dates in FileZilla Server format.
Enhanced support for Netscreen log format to report attacks.
Added support for honeyd Log Format.
Added a run_command action to the Scheduler, which executes an arbitrary command-line program.
Added a "sequence" action to the Scheduler, which executes a sequence of other
tasks. This is useful for creating tasks which run immediately after other tasks have completed,
e.g. a sequence might be 1) update database, 2) expire old data, 3) send a report by email.
Each step will begin immediately after the previous one completes, so there's no danger that
they will overlap.
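Conceptually, a sequence runs its steps strictly one after another, as in this small Python sketch. The step functions are placeholders invented for the example; they do not correspond to real Scheduler actions.

```python
def run_sequence(tasks):
    """Run each task only after the previous one has returned."""
    results = []
    for task in tasks:
        results.append(task())
    return results


# Placeholder steps that only record the order in which they ran.
log = []
steps = [
    lambda: log.append("update database"),
    lambda: log.append("expire old data"),
    lambda: log.append("send report by email"),
]
run_sequence(steps)
```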
Added support for Novell GWIA Log Format.
Added support for PNG images for graphs. This allows images to display many more
colors per pixel, resulting in higher quality graphs.
Added antialiasing of graphs. This makes edges look much smoother, and much nicer.
Added support for Kiwi (yyyy/m/d hh:mm, tab separated) Syslog.
Enhanced Nortel Contivity log format to handle a variant, and to report more information.
Added support for DLink DFL-700 Log Format.
Improved extraction of "Accessing URL" lines in PIX log data, to handle lines where
the username appears.
Added extraction of year from Netscreen messages; removed reporting of start_time
(which isn't needed since it's used as the main date/time field).
Added support for customizing the "decimal divider" (e.g. decimal point)
on a per-profile basis, so it is now possible to have some profiles show numbers
as 10,445.45 while other profiles show them as 10.445,45 .
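The two display styles mentioned above can be reproduced with a small Python helper. The function name format_number and the thousands_divider parameter are assumptions for the example; the changelog itself only describes a configurable decimal divider.

```python
def format_number(value, decimal_divider=".", thousands_divider=","):
    """Format with two decimals, using the given divider characters."""
    text = f"{value:,.2f}"  # always uses "," for thousands and "." for decimals
    # Swap through a placeholder so the two replacements do not collide.
    return (
        text.replace(",", "\0")
        .replace(".", decimal_divider)
        .replace("\0", thousands_divider)
    )


us_style = format_number(10445.45)
eu_style = format_number(10445.45, decimal_divider=",", thousands_divider=".")
```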
Enhanced Intermapper Outages Log Format to support a different date format
(in addition to the two supported), and to track duration numerically.
Added support for Microsoft ISA 2004 IIS Log Format.
Added a "recurse subdirectories" option for FTP and SFTP log sources,
to process all files in subdirectories of the specified directory,
looking inside all subdirectories, and their subdirectories, etc.
Added progress reporting during autodetection of log data.
This avoids a timeout problem when processing very large compressed files.
Added support for Symantec Web Security CSV Log Format.
Enhanced GFI Attachment & Content Log Format to handle a variant.
Enhanced Trend Micro InterScan Messaging Security Suite eManager Log Format to handle a variant.
Enhanced McAfee Web Shield Log Format to handle the latest version ("detections log").
Added support for Sendmail (no syslog) Log Format.
Added support for Intermapper Event Log Format.
Improved IronPort format to handle a variant, and to track more information.
Added support for Vircom log format.
Improved IronPort plug-in to handle "bounce" log lines. This obsoletes the
"IronPort bounce" plug-in, since the "IronPort" plug-in now handles both formats.
Added support for 3D pie charts, and added antialiasing (smoothing) to pie chart edges.
Added support for ODBC. Sawmill can now use ODBC to connect to a MS SQL database, and use that as its backend database.
Added support for "-p *" on the command line to perform the command for all profiles (in sequence),
and "-p pattern:X" to perform the command for all profiles matching wildcard pattern X.
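How a "-p" argument might be expanded into a list of profiles can be sketched as below. This assumes shell-style wildcards (via Python's fnmatch module); the exact wildcard syntax Sawmill accepts is not stated here, and select_profiles and the profile names are hypothetical.

```python
from fnmatch import fnmatchcase


def select_profiles(spec, profiles):
    """Expand a -p style argument into the profiles it refers to, in order."""
    if spec == "*":
        return list(profiles)
    if spec.startswith("pattern:"):
        pattern = spec[len("pattern:"):]
        return [name for name in profiles if fnmatchcase(name, pattern)]
    return [name for name in profiles if name == spec]
```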
Added support for Oracle. Sawmill can now use ODBC to connect to an Oracle database, and use that as its backend database.
Moved default_profile.cfg from LogAnalysisInfo/profiles to LogAnalysisInfo. This makes upgrades
easier, because the whole profiles directory can be copied from the old installation to the new.
Increased the granularity of database locking from per-database to per-table,
and changed building of cross-reference tables, hierarchy tables, and indices,
so they occur on demand, rather than as part of the main build. This makes
the build or update much faster, since indices and xref tables do not need to be built,
which makes it possible to do much faster updates, for much closer to real-time reporting.
Added true real-time reporting. Technically, this is implemented by adding
automatic release of database locks by a database build process, when
a report is requested. In conjunction with other new features listed above,
this means it is now possible to view reports while a database builds or updates.
I.e., reloading a report will show a new report based on the latest information
added into the database by a simultaneous build. Used in conjunction with a
"standard input" log source which continually pipes data into the database,
this makes it possible to set up true real-time reporting, by piping
log data into the database build (e.g., through "tail -f" or a similar command);
the database build can then run all the time, piping the latest lines into the database,
and reports can be generated at any time, always showing the latest events.
Added new {}, [], ?{} and ? operators to Salang, which are syntactic shortcuts for
subnode_by_name(), subnode_by_number(), subnode_exists(), and node_exists().
Improved xref build performance for SQL databases, using a new multistage algorithm.
In one example, this resulted in an improvement in xref build times (for a 6 million
line dataset), from 2:05 (two hours, five minutes) to 0:35 (35 minutes), a more than 3x
improvement.
Changed SQL databases so the hierarchies and xref tables are built on-demand.
Previously, each database build or update resulted in a rebuild or update of
all xref tables, and all hierarchy tables. Now, these builds are deferred until
reports are requested. This makes database builds much faster, and enables
real-time reporting in SQL, similarly to how it works with the internal database (see above).
Implemented "merge sort" for sorting large tables. This makes it possible to sort
tables of arbitrary size, while using a fixed amount of memory. The previous sort
implementation always sorted in memory, so very large tables would require very
large amounts of memory for sorting; the new approach puts an upper bound on the amount
of memory required for a sort, and therefore allows sorting of arbitrarily large tables
without exceeding the available physical memory.
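The general technique (an external merge sort) can be sketched in Python as follows: sort fixed-size chunks in memory, write each sorted run to a temporary file, then merge the runs while reading only one line per run at a time. This is a generic illustration, not Sawmill's implementation; external_sort and chunk_size are names chosen for the example, and the input lines are assumed not to contain newline characters.

```python
import heapq
import tempfile


def external_sort(lines, chunk_size):
    """Yield `lines` in sorted order, holding at most `chunk_size` lines
    in memory while the sorted runs are being created."""
    runs = []
    chunk = []

    def flush():
        # Sort the current chunk and spill it to disk as one sorted run.
        chunk.sort()
        run = tempfile.TemporaryFile(mode="w+")
        run.writelines(line + "\n" for line in chunk)
        run.seek(0)
        runs.append(run)
        chunk.clear()

    for line in lines:
        chunk.append(line)
        if len(chunk) >= chunk_size:
            flush()
    if chunk:
        flush()

    # Merge the sorted runs; heapq.merge reads them lazily.
    readers = [(line.rstrip("\n") for line in run) for run in runs]
    try:
        yield from heapq.merge(*readers)
    finally:
        for run in runs:
            run.close()
```

The memory bound is set by chunk_size during run creation and by the number of runs during the merge, not by the total table size.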
Added -su and -sp options which provide SMTP username and password information for SMTP authentication.
This allows Sawmill to send email via an SMTP server which requires "LOGIN" authentication.
Added basic network functionality to Salang, in the form of built-in functions to create outbound sockets,
send data to them, and read data from them. Created a new built-in "data" type to represent a chunk
of arbitrary binary data.
Implemented a new function get_title_by_http(), in util.get_title_by_http, which makes an HTTP connection
to the specified server, grabs the specified URI, and parses it to extract the
<title> tag. This can be useful
for sites where the URL is mostly unintelligible, but the page title is informative; Sawmill can now
extract the title as it processes log data.
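For readers who want to see the idea outside Salang, here is a rough Python equivalent that fetches a page and pulls out its title. It is only an analogy: the names TitleParser, extract_title, and fetch_title are invented for this sketch and say nothing about how get_title_by_http() is written.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class TitleParser(HTMLParser):
    """Collect the text found between <title> and </title>."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title_parts.append(data)


def extract_title(html_text):
    """Return the page title from an HTML document, or "" if there is none."""
    parser = TitleParser()
    parser.feed(html_text)
    return "".join(parser.title_parts).strip()


def fetch_title(url):
    """Fetch `url` over HTTP and return its page title (requires network access)."""
    with urlopen(url) as response:
        return extract_title(response.read().decode("utf-8", errors="replace"))
```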
Added a new "dump_main_table" action (i.e., "-a dmt"), which dumps a tab-separated version of the main database table to standard output. This dump is affected by filters (the -f option), and is much faster than exporting Log Detail.
Added support for report filtering on numerical fields. For instance, it is now possible to create a filter which selects only those entries where the "bytes" field is more than 1000, or less than 1000.
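The effect of such a numerical filter can be shown with a short Python sketch. The function name filter_rows and the row layout are assumptions for the example; Sawmill's own filter syntax is not shown here.

```python
import operator

# Comparison operators supported by this sketch.
COMPARISONS = {">": operator.gt, "<": operator.lt}


def filter_rows(rows, field, comparison, threshold):
    """Keep only the rows whose numerical `field` passes the comparison."""
    compare = COMPARISONS[comparison]
    return [row for row in rows if compare(row[field], threshold)]
```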
Added support for reading log data from an ODBC database (e.g., processing IIS logs which were logged to a database, rather than to text log files).
Added a new log format plug-in, IIS ODBC, which handles reading IIS log data from an ODBC database.
Added support for "parsing servers," a new feature where a potentially remote installation of Sawmill can provide log parsing services to another installation. Used this new feature to implement multiprocessor log processing, by using multiple parsing servers on the main system.
Added support for three methods of distributed parsing, specified in the profile: "none", which does not distribute parsing at all, but does it all in one thread; "listed", where parsing servers are listed explicitly in the profile, and parsing is distributed to them; and "auto" (the default), which does not distribute on one-processor systems, and spawns N+1 local parsing servers on N-processor systems, and distributes parsing to them.
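The rule for how many parsing servers each method uses can be summarized in a small Python sketch. The function name parsing_server_count and its parameters are hypothetical; only the three method names and the N+1 rule come from the description above.

```python
import os


def parsing_server_count(method, listed_servers=(), cpu_count=None):
    """Return how many parsing servers a profile would use under each method."""
    if cpu_count is None:
        cpu_count = os.cpu_count() or 1
    if method == "none":
        return 0  # everything is parsed in one thread
    if method == "listed":
        return len(listed_servers)  # servers named explicitly in the profile
    if method == "auto":
        # No distribution on one processor; N+1 local servers on N processors.
        return 0 if cpu_count == 1 else cpu_count + 1
    raise ValueError(f"unknown distributed parsing method: {method!r}")
```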
Added a feature to split SSQL queries across multiple processors, for better performance of some queries on multiprocessor systems.
Added support for different integer types in tables (and as database fields): integers can now be 8-, 16-, 32-, and 64-bit integers;
or they can be generic integers which are chosen to have the maximum native word length. This allows database size to be reduced by
using fewer words per integer when appropriate.
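To show why narrower integers save space, the sketch below picks the smallest of the listed widths that can hold a given value range. It assumes signed integers, which the description above does not specify, and smallest_int_bits is a name invented for the example.

```python
def smallest_int_bits(lowest, highest):
    """Return the smallest width (8, 16, 32, or 64 bits) that can hold every
    value from `lowest` to `highest`, assuming signed integers."""
    for bits in (8, 16, 32, 64):
        minimum = -(1 << (bits - 1))
        maximum = (1 << (bits - 1)) - 1
        if minimum <= lowest and highest <= maximum:
            return bits
    raise OverflowError("range does not fit in a 64-bit signed integer")
```

A field whose values never exceed 100, for instance, needs one byte per value in this scheme rather than eight.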
Added an option to extract session IDs from already-sessionized log data,
rather than using timeouts.
Added an option to have per-profile prefixes and suffixes on the internal SQL table names.
Added support for sqlldr for fast Oracle import.
Added a "Run Now" button in the Scheduler, to run any task immediately.
Added support for custom session ID values. This allows a log format plug-in to override the session ID, specifying the session ID itself
rather than having it computed by the session algorithm (with sessionization, timeouts, etc.). This allows for more accurate session
reporting, in log data where a session ID is computed and logged by the device or server.
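The difference between a logged session ID and a timeout-based one can be sketched as follows. The event layout, the 1800-second timeout, and the function name assign_sessions are all assumptions for illustration; Sawmill's session algorithm is not reproduced here.

```python
def assign_sessions(events, timeout=1800):
    """Return one session label per event.

    `events` is a time-ordered list of dicts with "visitor" and "time"
    (seconds), plus an optional "session_id" supplied by the log data.
    """
    last_seen = {}
    session_number = {}
    labels = []
    for event in events:
        # A session ID logged by the device or server overrides the timeouts.
        if event.get("session_id") is not None:
            labels.append(event["session_id"])
            continue
        visitor = event["visitor"]
        if visitor not in last_seen or event["time"] - last_seen[visitor] > timeout:
            session_number[visitor] = session_number.get(visitor, 0) + 1
        last_seen[visitor] = event["time"]
        labels.append(f"{visitor}#{session_number[visitor]}")
    return labels
```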