Linux File Monitoring

You can configure MetricsHub to monitor files on Linux systems.

The example below configures MetricsHub to:

  • monitor the file located at /var/log/myapp/file.log
  • search for the patterns "Error", "Exception", and "Failure"
  • count how many times these patterns occur
  • expose the result through the system.file.match.count metric.

This use case allows you to track:

  • the number of errors in the log file
  • the number of exceptions
  • the number of specific events or keywords.

Procedure

To achieve this use case:

  • Declare the resource to be monitored (prod-linux-web) and its attributes (host.name, host.type)

    resources:
      prod-linux-web:
        attributes:
          host.name: prod-linux-web
          host.type: linux
  • Configure the SSH protocol with credentials and timeout

    protocols:
      ssh:
        username: USERNAME
        password: PASSWORD
        timeout: 240

Important: Monitoring large files may take time. Specify a sufficient timeout (for example, 240 seconds) to prevent connection timeouts.

  • Configure the monitor job to target the desired files

    monitors:
      file:
        simple:
  • Configure the File source

    sources:
      # FileSource: a string containing the data added to the file since the last polling.
      fileSource:
        type: file
        paths: "/var/log/myapp/file.log" # File path to monitor
        mode: log # File fetching mode. Use log for log file monitoring
        maxSizePerPoll: 1MB # Maximum size (in MB) to read per polling cycle.
  • Create an awk script to count how many times specific patterns appear in the file

    computes:
      - type: awk # The awk script counts how many times the content pattern is found
        script: |
          BEGIN {
            pattern = "Error|Exception|Failure"
            occurrence_count = 0
          }

          function count_occurrences(line, regex, count, pos, part) {
            count = 0
            pos = 1

            while (pos <= length(line)) {
              part = substr(line, pos)

              if (match(part, regex)) {
                count++
                pos += RSTART + RLENGTH - 1
              } else {
                break
              }
            }

            return count
          }

          {
            occurrence_count += count_occurrences($0, pattern)
          }

          END {
            print occurrence_count
          }
  • Define the identification attributes

    mapping:
      # Mapping is executed on the result produced by the source (after computes are applied).
      source: ${source::fileSource}
      attributes:
        id: "/var/log/myapp/file.log"
        system.file.path: "/var/log/myapp/file.log"
        system.file.keyword: "Error|Exception|Failure"
  • Expose the number of matches using the system.file.match.count metric

    metrics:
      # Number of pattern matches found.
      system.file.match.count: $1
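
Before deploying the configuration, it can help to check the counting logic locally. The sketch below wraps the same awk script in a hypothetical shell function, count_matches, and runs it on a small sample file. The function name and the sample lines are illustrative assumptions and are not part of MetricsHub.

```shell
#!/bin/sh
# Hypothetical helper: runs the counting script from the awk compute on a local file.
count_matches() {
  awk '
    BEGIN {
      pattern = "Error|Exception|Failure"
      occurrence_count = 0
    }

    # Counts the non-overlapping matches of regex in line.
    # count, pos and part are local variables (awk convention: extra parameters).
    function count_occurrences(line, regex,    count, pos, part) {
      count = 0
      pos = 1
      while (pos <= length(line)) {
        part = substr(line, pos)
        if (match(part, regex)) {
          count++
          # RSTART is relative to part, so this moves pos just past the match.
          pos += RSTART + RLENGTH - 1
        } else {
          break
        }
      }
      return count
    }

    { occurrence_count += count_occurrences($0, pattern) }

    END { print occurrence_count }
  ' "$1"
}

# Illustrative sample: 2 matches on the second line, 2 on the third, 0 elsewhere.
sample=$(mktemp)
printf '%s\n' \
  'INFO service started' \
  'Error: disk full, Error: retrying' \
  'java.lang.IllegalStateException: Failure to bind' \
  'error in lowercase is not counted' > "$sample"
count_matches "$sample"   # prints 4
rm -f "$sample"
```

The patterns are case-sensitive and are matched anywhere in a line, so "IllegalStateException" counts as one "Exception" match. The same total should be reported by grep -oE 'Error|Exception|Failure' FILE | wc -l, which is a convenient cross-check.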

Here is the complete YAML configuration:

resources:
  prod-linux-web:
    attributes:
      host.name: prod-linux-web
      host.type: linux
    protocols:
      ssh:
        username: USERNAME
        password: PASSWORD
        timeout: 240
    monitors:
      file:
        keys:
          - id
          - system.file.keyword
        simple:
          sources:
            # FileSource: a string containing the data added to the file since the last polling.
            fileSource:
              type: file
              paths: "/var/log/myapp/file.log"
              mode: log # File fetching mode. Use log for log file monitoring
              maxSizePerPoll: 1MB # Maximum size (in MB) to read per polling cycle.
              computes:
                - type: awk # The awk script counts how many times the content pattern is found
                  script: |
                    BEGIN {
                      pattern = "Error|Exception|Failure"
                      occurrence_count = 0
                    }

                    function count_occurrences(line, regex, count, pos, part) {
                      count = 0
                      pos = 1

                      while (pos <= length(line)) {
                        part = substr(line, pos)

                        if (match(part, regex)) {
                          count++
                          pos += RSTART + RLENGTH - 1
                        } else {
                          break
                        }
                      }

                      return count
                    }

                    {
                      occurrence_count += count_occurrences($0, pattern)
                    }

                    END {
                      print occurrence_count
                    }
          mapping:
            # Mapping is executed on the result produced by the source (after computes are applied).
            source: ${source::fileSource}
            attributes:
              id: "/var/log/myapp/file.log"
              system.file.path: "/var/log/myapp/file.log"
              system.file.keyword: "Error|Exception|Failure"
            metrics:
              # Number of pattern matches found on each polling cycle.
              system.file.match.count: $1
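
In log mode, the file source returns only the data added since the last polling, up to maxSizePerPoll, so the metric reflects the matches found in each polling cycle rather than a running total. As a rough illustration of that idea only, the sketch below reads the bytes appended after a stored offset. It is not MetricsHub's implementation: the read_new_bytes function, the offset file, the restart-from-zero behavior on truncation, and the 1048576-byte limit are all assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical sketch of incremental ("log"-style) reading; NOT MetricsHub's code.
# Usage: read_new_bytes FILE STATE_FILE MAX_BYTES
read_new_bytes() {
  file=$1
  state=$2
  max=$3

  # Offset reached by the previous call (0 on the first call).
  offset=0
  [ -f "$state" ] && offset=$(cat "$state")

  size=$(wc -c < "$file" | tr -d ' ')

  # Assumption: if the file shrank (truncated or rotated), start over.
  [ "$size" -lt "$offset" ] && offset=0

  # Read at most MAX_BYTES of the newly appended data.
  avail=$((size - offset))
  [ "$avail" -gt "$max" ] && avail=$max

  tail -c "+$((offset + 1))" "$file" | head -c "$avail"

  # Remember where this read stopped.
  echo $((offset + avail)) > "$state"
}

# Two consecutive "polls" on a growing file.
log=$(mktemp)
st="$log.offset"
printf 'Error: first\n' > "$log"
read_new_bytes "$log" "$st" 1048576   # prints "Error: first"
printf 'Failure: second\n' >> "$log"
read_new_bytes "$log" "$st" 1048576   # prints only "Failure: second"
rm -f "$log" "$st"
```

Under these assumptions, a pattern that appears once in the file is counted in the polling cycle where it was written and not in later ones.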

Supporting Resources