Using Report Filters


Report Filters

Where can I apply a report filter?

  You can apply a report filter in the Professional and Enterprise versions in any of the following ways:
  • a.) from the command line or in Scheduler with the -f option
  • b.) for all reports in a profile in Config/Report Options
  • c.) for all reports in a profile in Reports via the Filters window
  • d.) per report in Config/Reports
  • e.) per report element in Config/Reports

    A report filter is an expression applied when a report is generated. It filters out all data that does not match the expression, so only part of the data is reported. It can be entered in the Report Filter field of a report or a report element when editing a report, or passed on the command line.

    The value of this option is an expression written in Salang: The Sawmill Language (see examples below). Only a subset of the language is available for this option. Specifically, the option can use:

    • within: e.g. "(page within '/directory')" or "(date_time within '__/Jan/2004 __:__:__')"

    • <, >, <=, >=: for date/time fields and for numerical, aggregating fields only, e.g. "(date_time < '01/Jan/2004 00:00:00')"

    • and: between any two expressions to perform the boolean "and" of those expressions

    • or: between any two expressions to perform the boolean "or" of those expressions

    • not: before any expression to perform the boolean "not" of that expression

    • matches: wildcard matching, e.g. "(page matches '/index.*')"

    • matches_regexp: regular expression matching, e.g. "(page matches_regexp '^/index\\..*\$')"
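
    The difference between the wildcard and regular expression forms can be illustrated outside Sawmill. The following Python sketch is not Sawmill code: it assumes that the "matches" wildcard behaves like a shell-style glob (where * matches any run of characters), and uses Python's fnmatch and re modules as stand-ins. The function names are hypothetical.

```python
import fnmatch
import re

# Assumption: Sawmill's "matches" wildcard acts like a shell-style glob.
# fnmatch is only a stand-in for illustration.
WILDCARD = "/index.*"
# In a regular expression, the literal dot must be escaped and the
# pattern anchored at both ends to get the same effect.
REGEXP = r"^/index\..*$"


def wildcard_match(value: str, pattern: str = WILDCARD) -> bool:
    """Glob-style match of the whole value against the pattern."""
    return fnmatch.fnmatchcase(value, pattern)


def regexp_match(value: str, pattern: str = REGEXP) -> bool:
    """Regular expression match; anchoring comes from the pattern itself."""
    return re.search(pattern, value) is not None


if __name__ == "__main__":
    for page in ["/index.html", "/index.php", "/indexes/a.html", "/dir/index.html"]:
        print(page, wildcard_match(page), regexp_match(page))
```

    Under these assumptions, both forms accept /index.html and /index.php and reject the other two pages.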

    Date/time values are always in the format dd/mmm/yyyy hh:mm:ss; underscores are used as wildcards, to match any value in that position. For instance, '15/Feb/2003 __:__:__' refers to a single day, '__/Feb/2003 __:__:__' refers to a month, and '__/___/2003 __:__:__' refers to a year.
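
    As a rough model of how the underscore wildcards select a day, month, or year, the following Python sketch compares a timestamp to a pattern position by position. It is an illustration only, not Sawmill's implementation; the function name date_within is hypothetical, and it assumes each underscore matches exactly one character.

```python
def date_within(value: str, pattern: str) -> bool:
    """Return True if value fits pattern, treating '_' as a one-character wildcard.

    Both strings are expected in the form dd/mmm/yyyy hh:mm:ss.
    """
    if len(value) != len(pattern):
        return False
    return all(p == "_" or p == v for p, v in zip(pattern, value))


if __name__ == "__main__":
    stamp = "15/Feb/2003 08:30:00"
    print(date_within(stamp, "15/Feb/2003 __:__:__"))  # single day
    print(date_within(stamp, "__/Feb/2003 __:__:__"))  # whole month
    print(date_within(stamp, "__/___/2003 __:__:__"))  # whole year
    print(date_within(stamp, "__/Mar/2003 __:__:__"))  # a different month
```

    The first three calls return True and the last returns False, since only the month position differs.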

    Special case: Default Page. To filter on the default page of a directory, use: "(page within '/dirname/{default}')".

    Detailed explanation of Default Page: For proxy and web servers, an access to a directory, e.g., a hit on http://mysite.com/ , displays the index page, as though the access were actually on http://mysite.com/index.html . The exact name of the index page varies and is not shown in the log data. In its URL reports, Sawmill differentiates between aggregated hits on http://mysite.com/ (totals of all events on any page or subdirectory of the whole site) and hits on the page http://mysite.com/ (events on the top-level "index" page, or default page).

    In reports, the default page appears as "http://mysite.com/ (default page)", and zooming on that row is one way to filter on it. But to use "within" or wildcard filtering on the default page, it is necessary to filter on the internal (language-independent) representation of the default page, which in this case is "http://mysite.com/{default}". In other words, you can't just filter '(url within "http://mysite.com/")' to get the default page; that will get you the whole site. And you can't filter '(url within "http://mysite.com/ (default page)")'; that won't get you anything, because "(default page)" is just how Sawmill displays it in the reports (in English; it uses an equivalent phrase when displaying reports in other languages).

    Instead, filter '(url within "http://mysite.com/{default}")' to select events on the default page of the site. Similarly, filter '(page within "/mydir/{default}")' in web logs to select the default page of /mydir/. (The name of the field may not be "url" or "page"; it depends on the particular log format. Use the correct one for your format.)

    Examples


    NOTE: To use these examples in a command line, or in the Extra Options of the Scheduler, use

      -f "filter"

    where filter is one of the examples below, e.g.,

      -f "(date_time within '__/Feb/2005 __:__:__')"

    Use double quotes (") around the entire filter expression; use single quotes (') within the filter if necessary.
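
    When a filter itself contains double quotes (as some of the computed date examples do), those inner quotes have to be escaped before the expression is wrapped in double quotes. The following Python sketch shows one way to build the -f option text. It assumes a POSIX-style shell; the helper name filter_option is hypothetical and is not part of Sawmill.

```python
import shlex


def filter_option(expression: str) -> str:
    """Wrap a filter expression as a double-quoted -f option for a POSIX shell.

    Assumption: inside double quotes, the shell treats backslash, double
    quote, dollar sign and backtick specially, so each is backslash-escaped.
    """
    escaped = expression
    # Escape the backslash first so the backslashes added below stay single.
    for ch in ("\\", '"', "$", "`"):
        escaped = escaped.replace(ch, "\\" + ch)
    return f'-f "{escaped}"'


if __name__ == "__main__":
    simple = "(date_time within '__/Feb/2005 __:__:__')"
    print(filter_option(simple))

    nested = '(page within ("/pic" . "ts/"))'
    option = filter_option(nested)
    print(option)
    # shlex.split applies POSIX quoting rules, so it recovers the original text.
    print(shlex.split(option)[1] == nested)
```

    A filter that uses only single quotes internally passes through unchanged apart from the surrounding double quotes.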


    Example: To show only events from February, 2005 (it's easier to use date filters for this; see Using Date Filters):

      (date_time within '__/Feb/2005 __:__:__')

    Example: To show only events within the page directory /picts/:

      (page within '/picts/')

    Example: To show only events from February, 2004, and within the page directory /picts/:

      ((date_time within '__/Feb/2004 __:__:__') and (page within '/picts/'))

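    Example: To show only events from roughly the last calendar month. This takes the month of the date 30 days before now, so it only approximates the previous full month (the 1st through end-of-month of the calendar month prior to the current month) and can select the wrong month near month boundaries (it's easier to use date filters for this; see Using Date Filters):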

      (date_time within ("__/" . substr(epoc_to_date_time(now() - 30*24*60*60), 3, 8) . " __:__:__"))

    Example: To show only events from last month. This is more sophisticated than the example above and works at any time in the month, for any month: it computes the first second of the current month, subtracts one second to get the last second of the previous month, and uses that to compute a filter for the previous month (it's easier to use date filters for this; see Using Date Filters):

      (date_time within ("__/" . substr(epoc_to_date_time(date_time_to_epoc("01/" . substr(epoc_to_date_time(now()), 3, 8) . " 00:00:00") - 1), 3, 8) . " __:__:__"))
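
    The arithmetic behind the two "last month" approaches can be checked outside Sawmill. The following Python sketch mirrors both computations with the standard datetime module and returns the resulting filter string. It is an illustration, not Sawmill code; the function names and the MONTHS table are hypothetical.

```python
from datetime import datetime, timedelta

# English month abbreviations, as used in the dd/mmm/yyyy format.
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def month_filter(moment: datetime) -> str:
    """Build a filter selecting the whole calendar month containing moment."""
    return "(date_time within '__/%s/%04d __:__:__')" % (
        MONTHS[moment.month - 1], moment.year)


def thirty_days_ago_filter(now: datetime) -> str:
    """Simple approach: use the month of the date 30 days before now."""
    return month_filter(now - timedelta(days=30))


def previous_month_filter(now: datetime) -> str:
    """Robust approach: first second of this month, minus one second."""
    first_second = now.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    return month_filter(first_second - timedelta(seconds=1))


if __name__ == "__main__":
    now = datetime(2005, 3, 31, 12, 0, 0)
    print(thirty_days_ago_filter(now))  # lands on 1 March, still the current month
    print(previous_month_filter(now))   # selects February 2005
```

    On 31 March 2005 the 30-day version still points at March, while the first-second-minus-one version points at February, which is why the second form is the safer of the two.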

    Example: To show only events from February 4, 2004 through February 10, 2004 (it's easier to use date filters for this; see Using Date Filters):

      ((date_time >= '04/Feb/2004 00:00:00') and (date_time <= '10/Feb/2004 23:59:59'))

    Example: To show only events in the past 30 days (it's easier to use date filters for this; see Using Date Filters):

      (date_time >= date_time_to_epoc(substr(epoc_to_date_time(now() - 30*24*60*60), 0, 11) . " 00:00:00"))

    Example: To show only events with source port ending with 00:

      (source_port matches '*00')

    Example: To show only events with source port ending with 00, or with destination port not ending in 00:

      ((source_port matches '*00') or not (destination_port matches '*00'))

    Example: To show only events with server_response 404, and on pages whose names contain three consecutive digits:

      ((server_response within '404') and (page matches_regexp '[0-9][0-9][0-9]'))

    Example: To show only events with more than 100 in a numerical "bytes" field (this works only for numerical, aggregating fields):

      (bytes > 100)

    Advanced Example: Arbitrary Salang expressions may appear in comparison filters, which makes it possible to create very sophisticated expressions to do any type of date filtering. For instance, the following filter expression selects everything in the current week (starting on the most recent Sunday, which may be today). It does this by pulling in some Salang utility functions to compute the weekday from the date, and the month number from the date, and then iterating backward one day at a time until it reaches a day where the weekday is Su (Sunday). Then it uses that date to construct a date_time value in the standard format "dd/mmm/yyyy hh:mm:ss", which is then used in the filter.

      date_time >= (
        include 'templates.shared.util.date_time.get_weekday';
        include 'templates.shared.util.date_time.get_month_as_number';
        int t = now();
        string weekday = '';
        while (weekday ne 'Su') (
          bool m = matches_regular_expression(epoc_to_date_time(t), '^([0-9]+)/([A-Za-z]+)/([0-9]+) ');
          int day = $1;
          int month = get_month_as_number(epoc_to_date_time(t));
          int year = $3;
          weekday = get_weekday(year, month, day);
          t -= 24*60*60;
        );
        t += 24*60*60;
        string first_second_of_week = date_time_to_epoc(substr(epoc_to_date_time(t), 0, 11) . ' 00:00:00');
        first_second_of_week;
      )
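
    The backward walk to the start of the week can be checked with a short Python sketch. It is an illustration only, not Sawmill code: it uses the standard datetime module in place of the Salang utility functions, and the function names and MONTHS table are hypothetical.

```python
from datetime import datetime, timedelta

# English month abbreviations, as used in the dd/mmm/yyyy format.
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def first_second_of_week(now: datetime) -> datetime:
    """Step back one day at a time until a Sunday is reached, then use midnight."""
    day = now
    # datetime.weekday() numbers Monday as 0 and Sunday as 6.
    while day.weekday() != 6:
        day -= timedelta(days=1)
    return day.replace(hour=0, minute=0, second=0, microsecond=0)


def current_week_filter(now: datetime) -> str:
    """Format the week start as dd/mmm/yyyy hh:mm:ss inside a >= comparison."""
    start = first_second_of_week(now)
    stamp = "%02d/%s/%04d 00:00:00" % (start.day, MONTHS[start.month - 1], start.year)
    return "(date_time >= '%s')" % stamp


if __name__ == "__main__":
    # 4 February 2004 was a Wednesday; the preceding Sunday was 1 February.
    print(current_week_filter(datetime(2004, 2, 4, 15, 30, 0)))
```

    If the current day is already a Sunday, the loop does not step back at all, so the week starts at midnight of the same day.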