FAQ: Creating Custom Fields


How can I group my events in broad categories (like "internal" vs. "external" or "monitoring" vs. "actual"), and see the events on each category separately, or see them combined? How can I create content groups? How can I include information from an external database in my reports, e.g. include the full names of users based on the logged username, or the full names of pages based on the logged URL? How can I extract parts of the URL and report them as separate fields?

Short Answer

Create a new log field, database field, report, and report menu item to track and show the category or custom value, and then use a log filter to set the log field appropriately for each entry.

Long Answer

It is often useful to show information in the reports which is not in the logs, but which can be derived from the information in the logs. For instance, it is useful to see events in categories other than those which naturally fall out of the data. Natural categories for web logs include page directories (the page field), months (the date/time field), or visitor domains (the hostname field). Similarly, it is useful to derive related values from the log field values, and report them as though they were in the log data; for instance, if you have a username, you may want to report the full name, organization, and other information about the username.

Sawmill treats every value of every field as a category, so you can categorize by any field in your log data. You can take advantage of this feature to make your own categories, even if those categories are not immediately clear in the log data. Categories like these are called "custom fields." One common use of custom fields is to separate internal hits (hits from you) from external hits (hits from other people). Another use is to separate monitoring hits (hits from programs you use to monitor your own site) from actual hits (hits by browsing people). Another similar categorization is spider hits (hits from search engine robots and other robots) vs. human hits (hits by browsing people). Custom fields are also used to show metadata associated with a particular item, for instance to show whois information from an IP address, full name from a username, and other information.

Sawmill creates some common custom fields for you (geographic location derived from IP, hostname derived from IP, web browser derived from user-agent, and many more), but if you need to derive your own custom field, Sawmill also provides you with the "hooks" you need to do it.

There are five steps to this (described in detail below):

  1. Step 1: Create a log field

  2. Step 2: Create a database field based on that log field

  3. Step 3: Create a report based on that database field

  4. Step 4: Create a report menu item for that report

  5. Step 5: Create a log filter to populate the log field

Here are the details:

Step 1: Create a log field

Edit the profile .cfg file, in the profiles directory of the LogAnalysisInfo directory, using a text editor. Search for "log = {" and then search from there for "fields = {", to find the log fields list. Create a new field as shown below; enter the internal field name before the = sign (use only lowercase letters, numbers, and underscores in the internal name), and enter the display label in the "label =" line. For instance, if you name the field category, the name and label will be the same; if you name it "my category", the name will be my_category and the label will be "my category". For this example, we will use "my category" as the field label throughout, and my_category as the field name.

      my_category = {
        label = "my category"
        type = "flat"
        index = "0"
        subindex = "0"
      } # my_category
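
The naming rule above (lowercase letters, numbers, and underscores only) can be sketched as a small helper. This is a minimal illustration in Python, not part of Sawmill: the function names internal_name and log_field_node are hypothetical, and the rendered node simply mirrors the example field shown above.

```python
import re


def internal_name(label: str) -> str:
    """Hypothetical helper: turn a display label into an internal field name."""
    # Lowercase the label, then replace each run of characters other than
    # lowercase letters, digits, and underscores with a single underscore.
    name = re.sub(r"[^a-z0-9_]+", "_", label.lower())
    # Trim underscores left over from leading or trailing punctuation.
    return name.strip("_")


def log_field_node(label: str) -> str:
    """Render a log field node shaped like the example above (illustrative only)."""
    name = internal_name(label)
    return (
        f"{name} = {{\n"
        f'  label = "{label}"\n'
        '  type = "flat"\n'
        '  index = "0"\n'
        '  subindex = "0"\n'
        f"}} # {name}\n"
    )


print(log_field_node("my category"))
```

The output still has to be pasted by hand into the log fields list of the profile .cfg; the sketch only keeps the name and label consistent with each other.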

Step 2: Create a database field based on that log field

Still editing the profile .cfg from above, search for "database = {" and then search from there for "fields = {", to find the database fields list. Add a field like this:

      my_category = {
        label = "my category"
        log_field = "my_category"
        type = "string"
        suppress_top = "0"
        suppress_bottom = "2"
      } # my_category

Step 3: Create a report based on that database field

Still editing the profile .cfg from above, search for "statistics = {" and then search from there for "reports = {", to find the reports list. Find an existing table report; the file_type report may be a good choice; otherwise pick any report with 'type = "table"'. Copy this entire report and paste it to create a duplicate. Now edit the duplicate to customize it for the new field. The edited version is shown below. The modifications are: 1) the report name and report element name have been changed to my_category, 2) the database_field_name has been changed so the table is generated from the my_category field, 3) the labels on the report element and table column have been changed to "My Category", 4) the field_name for the first table column has been changed to my_category so the first column displays the my_category field values. The comments (#) have also been changed, though this is not essential.

      my_category = {
        report_elements = {
          my_category = {
            label = "My Category"
            type = "table"
            database_field_name = "my_category"
            sort_by = "hits"
            sort_direction = "descending"
            show_omitted_items_row = "true"
            omit_parenthesized_items = "true"
            show_totals_row = "true"
            starting_row = "1"
            ending_row = "10"
            only_bottom_level_items = "false"
            show_graph = "false"
            columns = {
              0 = {
                type = "string"
                visible = "true"
                field_name = "my_category"
                data_type = "string"
                header_label = "My Category"
                display_format_type = "string"
                main_column = "true"
              } # 0
              1 = {
                header_label = "{=capitalize(database.fields.hits.label)=}"
                type = "number"
                show_number_column = "true"
                show_percent_column = "true"
                show_bar_column = "true"
                visible = "true"
                field_name = "hits"
                data_type = "int"
                display_format_type = "integer"
              } # 1
              2 = {
                header_label = "{=capitalize(database.fields.page_views.label)=}"
                type = "number"
                show_number_column = "true"
                show_percent_column = "false"
                show_bar_column = "false"
                visible = "true"
                field_name = "page_views"
                data_type = "int"
                display_format_type = "integer"
              } # 2
              3 = {
                header_label = "{=capitalize(database.fields.visitors.label)=}"
                type = "number"
                show_number_column = "true"
                show_percent_column = "false"
                show_bar_column = "false"
                visible = "true"
                field_name = "visitors"
                data_type = "unique"
                display_format_type = "integer"
              } # 3
              4 = {
                header_label = "{=capitalize(database.fields.size.label)=}"
                type = "number"
                show_number_column = "true"
                show_percent_column = "false"
                show_bar_column = "false"
                visible = "true"
                field_name = "size"
                data_type = "float"
                display_format_type = "bandwidth"
              } # 4
            } # columns
          } # my_category
        } # report_elements
        label = "My Category"
      } # my_category

Step 4: Create a report menu item for that report

Still editing the profile .cfg from above, search for "reports_menu = {" to find the reports menu. This node describes the layout of the menu at the left of the reports. It includes hierarchical groups and report nodes within each group. Find a report menu item in there with 'type = "view"' (which means that clicking it opens a view of a report); duplicate that item and edit it so it looks like the node below. Again, the changes are to the name of the node, the label, the view_name (which specifies which report it should click through to view), and optionally the comment:

          my_category = {
            type = "view"
            label = "My Category"
            view_name = "my_category"
            visible = "true"
            visible_if_files = "true"
          } # my_category

If you want the report to be in a different group from the one it's in, you can move it inside the "items =" list of any other group, or directly into the reports_menu node to make it a top-level report (not in any group).

Step 5: Create a log filter to populate the log field

This step varies greatly depending on what you're doing. Broadly, you need to create a log filter, either in the Log Filters editor of the Config section of the web interface, or in the log.filters section of the profile .cfg (search for "log = {" and then "filters = "). The log filter you create should set the value of your new field. It could be something as simple as this:

  my_category = "some value"

to set the my_category field to the same constant value for every line, but that's not very useful. A slightly more useful example is to set it to part of another field, e.g.

  my_category = substring(file_type, 1)

In this example, my_category is set to the same value as file_type, but without the first character. Much more complex manipulations are possible; you can use any expression here. You could set it like this:

  my_category = agent . c_ip

to set my_category to the concatenation of the agent field and the c_ip field (which makes a pretty good "unique visitor" identifier for web logs).
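
To make the effect of the two expressions above concrete, here is a rough Python equivalent of what each one computes, following the description in the text. The function names and sample values are hypothetical; the real filters are written in Salang, as shown above.

```python
def drop_first_char(file_type: str) -> str:
    # Mirrors substring(file_type, 1) as described in the text:
    # everything after the first character.
    return file_type[1:]


def visitor_key(agent: str, c_ip: str) -> str:
    # Mirrors agent . c_ip: plain concatenation, with no separator added.
    return agent + c_ip


# Sample values are made up for illustration.
print(drop_first_char("GIF"))
print(visitor_key("Mozilla/5.0", "10.0.0.1"))
```

Because the concatenation adds no separator, two different agent/IP pairs could in principle produce the same string; the text treats this as a good-enough visitor identifier rather than a guaranteed unique one.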

Here's one real-world example of the way you might create a lookup map to set the my_category field from the username field in web logs. Start by creating a file my_category_map.cfg in the LogAnalysisInfo directory, using a text editor. In that file, create an entry for each possible username, giving its category as the value, like this:

  my_category_map = {
    jack = "Sales"
    jill = "Sales"
    bob = "Marketing"
    sue = "Marketing"
    sara = "Legal"
    ken = "Engineering"
  }
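
If the username-to-category pairs live in an external source (a database export, for example), a map file of the shape above could be generated rather than typed by hand. The following is a minimal Python sketch under stated assumptions: render_category_map is a hypothetical helper, and the restrictions it enforces (usernames made only of letters, digits, and underscores; no double quotes in category values) are conservative assumptions, not documented Sawmill rules.

```python
import re

# Assumption: only accept usernames that are plainly safe as node names.
_SAFE_NAME = re.compile(r"[A-Za-z0-9_]+")


def render_category_map(pairs, node_name="my_category_map"):
    """Render (username, category) pairs as text shaped like the map file above."""
    lines = [f"{node_name} = {{"]
    for username, category in pairs:
        if not _SAFE_NAME.fullmatch(username):
            raise ValueError(f"unsupported username for a node name: {username!r}")
        if '"' in category:
            raise ValueError(f"category contains a double quote: {category!r}")
        lines.append(f'  {username} = "{category}"')
    lines.append("}")
    return "\n".join(lines) + "\n"


# Hypothetical rows, e.g. read from a database or CSV export.
rows = [("jack", "Sales"), ("bob", "Marketing")]
print(render_category_map(rows), end="")
```

The returned text would then be saved as my_category_map.cfg in the LogAnalysisInfo directory, as described above.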

Then you can use this log filter:

  if (subnode_exists("my_category_map", username)) then
    my_category = node_value(subnode_by_name("my_category_map", username))
  else
    my_category = "Unknown Category" 

This works because when you create a file my_category_map.cfg in LogAnalysisInfo, you're automatically creating a variable that Sawmill can access as "my_category_map" (as an aside, you can also use directories; e.g. if you create a file "LogAnalysisInfo/log_filter_maps/my_category_map.cfg" you can access it from log filters as log_filter_maps.my_category_map). The function subnode_exists() checks if there is a subnode of its first parameter node whose name matches the second parameter, so it will be true if the username exists in my_category_map. If it does exist, the filter gets that subnode's value (e.g. "Sales") and puts it in the my_category field; otherwise, it sets the field to "Unknown Category".
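
The lookup-with-fallback logic of this filter can be sketched in Python, with a dictionary standing in for the my_category_map node. This is only an illustration of the control flow; the function name categorize is hypothetical, and the actual filter runs in Salang as shown above.

```python
# Stand-in for the my_category_map node from my_category_map.cfg.
MY_CATEGORY_MAP = {
    "jack": "Sales",
    "jill": "Sales",
    "bob": "Marketing",
    "sue": "Marketing",
    "sara": "Legal",
    "ken": "Engineering",
}


def categorize(username: str, category_map=MY_CATEGORY_MAP) -> str:
    # Plays the role of subnode_exists(): is there an entry for this username?
    if username in category_map:
        # Plays the role of node_value(subnode_by_name(...)).
        return category_map[username]
    # No entry: fall back to the default category.
    return "Unknown Category"


for user in ("jack", "sara", "zoe"):
    print(user, "->", categorize(user))
```

Any username missing from the map lands in "Unknown Category", which makes gaps in the map easy to spot in the resulting report.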

This is a fairly simple example; almost infinite flexibility is possible -- see Salang: The Sawmill Language.