November 15, 2025

Argus: Fast File Integrity Monitoring in Rust

You've deployed code to production. How do you know it hasn't been tampered with? A rootkit could modify binaries, a supply chain attack could swap libraries, or an insider threat could alter configuration files. By the time you notice, the damage is done.

File integrity monitoring (FIM) is a fundamental security control, but most tools are either:

  • Too slow for large directories (minutes to scan)
  • Too complex to configure and deploy
  • Too resource-intensive to run continuously

I built Argus to solve this: a lightweight, fast FIM tool that can scan thousands of files per second and detect changes in real time.

The Problem: Traditional FIM is Slow

Most FIM tools are built for enterprise environments with complex policies, compliance reporting, and extensive configuration. For developers and security researchers who just need:

  • "Has anything changed in this directory?"
  • "What files were modified since my last scan?"
  • "Alert me when critical files are altered"

Existing tools are overkill.

And they're slow. Scanning a large codebase with tools like AIDE or Tripwire can take minutes because they:

  • Run on a single thread
  • Perform unnecessary operations (ACL checks, extended attributes)
  • Use inefficient file I/O patterns
  • Generate verbose logs

My Approach: Parallel SHA-256 at Scale

Argus is built on three principles:

1. Parallel Everything. Modern machines have multiple cores, so use them. Argus parallelizes:

  • Directory traversal
  • File reads
  • Checksum calculation
  • Output generation

2. Minimal Overhead. Only compute what you need:

  • SHA-256 checksums (industry standard)
  • File size
  • Modification timestamp
  • Path

No unnecessary metadata, no complex policies.

3. Structured Output. NDJSON (Newline Delimited JSON) for easy parsing:

```json
{"path":"./src/main.rs","checksum":"a3f5...","size":2048,"timestamp":"2025-12-10T15:30:00Z"}
{"path":"./src/lib.rs","checksum":"b2e1...","size":4096,"timestamp":"2025-12-10T15:31:00Z"}
```

Perfect for scripting, monitoring systems, or feeding into SIEMs.
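To make the record format concrete, here is a minimal, std-only sketch that renders one record as an NDJSON line. The `Record` struct, `escape_json`, and `to_ndjson_line` are hypothetical names for illustration, not Argus's actual types; a real implementation would more likely lean on a JSON library.

```rust
/// Hypothetical record type mirroring the four fields shown above.
struct Record {
    path: String,
    checksum: String,
    size: u64,
    timestamp: String, // RFC 3339 string, e.g. "2025-12-10T15:30:00Z"
}

/// Escape the characters JSON requires to be escaped inside a string literal.
fn escape_json(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for c in input.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \uXXXX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Render one record as a single JSON object on one line (no trailing newline).
fn to_ndjson_line(r: &Record) -> String {
    format!(
        "{{\"path\":\"{}\",\"checksum\":\"{}\",\"size\":{},\"timestamp\":\"{}\"}}",
        escape_json(&r.path),
        escape_json(&r.checksum),
        r.size,
        escape_json(&r.timestamp)
    )
}
```

Writing one such line per file, each followed by a newline, is all NDJSON requires, which is why the output can be consumed line by line without loading a whole scan into memory.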

Technical Deep Dive

Parallel File Processing

The core of Argus is a work-stealing thread pool:

```rust
use std::fs::File;
use std::io::Read;
use std::path::{Path, PathBuf};

use anyhow::Result; // assumed: any single-parameter `Result` alias works here
use rayon::prelude::*;
use sha2::{Digest, Sha256};
use walkdir::WalkDir;

pub fn scan_directory(path: &Path, threads: usize) -> Result<Vec<FileRecord>> {
    // Configure a dedicated thread pool
    let pool = rayon::ThreadPoolBuilder::new()
        .num_threads(threads)
        .build()?;

    pool.install(|| {
        // Collect all file paths (entries that cannot be read are skipped)
        let files: Vec<PathBuf> = WalkDir::new(path)
            .into_iter()
            .filter_map(|e| e.ok())
            .filter(|e| e.file_type().is_file())
            .map(|e| e.path().to_owned())
            .collect();

        // Hash in parallel; the first error aborts the scan
        files
            .par_iter()
            .map(|file_path| compute_checksum(file_path))
            .collect()
    })
}

fn compute_checksum(path: &Path) -> Result<FileRecord> {
    let mut file = File::open(path)?;
    let metadata = file.metadata()?; // fetch once, reuse for size and mtime
    let mut hasher = Sha256::new();
    let mut buffer = vec![0u8; 8192]; // 8KB buffer

    // Stream the file through the hasher so memory use stays constant
    loop {
        let bytes_read = file.read(&mut buffer)?;
        if bytes_read == 0 {
            break;
        }
        hasher.update(&buffer[..bytes_read]);
    }

    Ok(FileRecord {
        path: path.display().to_string(),
        checksum: format!("{:x}", hasher.finalize()),
        size: metadata.len(),
        timestamp: metadata.modified()?,
    })
}
```

Why Rayon?

Rayon is a data-parallelism library that makes parallel iteration trivial. The genius is work-stealing:

  • Each thread has a queue of tasks
  • When a thread finishes, it "steals" work from another thread
  • Automatic load balancing with no manual scheduling

This means Argus automatically adapts to:

  • Mixed file sizes (small configs + large binaries)
  • I/O latency variations
  • Number of available cores
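To show why dynamic scheduling copes with mixed file sizes, here is a deliberately simplified, std-only sketch. It is not Rayon's algorithm: Rayon gives each worker its own deque and lets idle workers steal from the others, whereas this version has every worker pull from one shared queue. The load-balancing effect is similar, since no thread sits idle while work remains. `run_balanced`, `simulate_work`, and the task costs are made up for illustration.

```rust
use std::collections::VecDeque;
use std::sync::Mutex;
use std::thread;

/// Process `tasks` on `workers` threads that all pull from one shared queue.
/// Returns, for each worker, the list of task costs it handled.
fn run_balanced(tasks: Vec<u64>, workers: usize) -> Vec<Vec<u64>> {
    let queue = Mutex::new(VecDeque::from(tasks));

    thread::scope(|scope| {
        let mut handles = Vec::new();
        for _ in 0..workers {
            handles.push(scope.spawn(|| {
                let mut done = Vec::new();
                loop {
                    // Hold the lock only long enough to pop one task.
                    let next = queue.lock().unwrap().pop_front();
                    match next {
                        Some(cost) => {
                            simulate_work(cost);
                            done.push(cost);
                        }
                        None => break, // queue drained
                    }
                }
                done
            }));
        }
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}

/// Stand-in for hashing a file: busy work proportional to `cost`.
fn simulate_work(cost: u64) {
    let mut acc = 0u64;
    for i in 0..cost * 10_000 {
        acc = acc.wrapping_add(i);
    }
    std::hint::black_box(acc);
}
```

Calling `run_balanced(vec![50, 1, 1, 1, 40, 2, 3, 1], 4)` typically leaves the workers that drew the expensive tasks with short lists while the others mop up the cheap ones. A static split of two tasks per thread would instead leave most threads waiting on whichever thread got the expensive task.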

Ignore Pattern Support

Security-focused FIM should respect .gitignore patterns. No one wants to checksum node_modules or .git directories.

Argus supports both .gitignore and .argusignore:

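```rust
use std::path::Path;

use anyhow::Result; // assumed single-parameter `Result` alias
use ignore::WalkBuilder;
use rayon::prelude::*;

pub fn scan_with_ignores(path: &Path) -> Result<Vec<FileRecord>> {
    // WalkBuilder honors .gitignore by default; register .argusignore
    // as an additional per-directory ignore file name.
    // Note: the default settings also skip hidden files.
    let walker = WalkBuilder::new(path)
        .add_custom_ignore_filename(".argusignore")
        .build();

    // Walk respects ignore patterns automatically
    walker
        .filter_map(|e| e.ok())
        .filter(|e| e.file_type().map_or(false, |t| t.is_file()))
        .par_bridge() // Hand entries to Rayon's thread pool
        .map(|entry| compute_checksum(entry.path()))
        .collect()
}
```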

This dramatically reduces scan time for large projects with many dependencies.
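The filtering idea itself is simple. The sketch below is a much-reduced stand-in for what the ignore crate does: it only skips paths that contain an exactly named component, with no globs, negation, or nested ignore files. `is_ignored`, `filter_paths`, and the name list are hypothetical and exist only to illustrate the principle.

```rust
use std::path::Path;

/// Returns true if any component of `path` exactly matches an ignored name.
fn is_ignored(path: &Path, ignored_names: &[&str]) -> bool {
    path.components().any(|component| {
        component
            .as_os_str()
            .to_str()
            .map_or(false, |name| ignored_names.contains(&name))
    })
}

/// Keep only the paths that do not pass through an ignored directory.
fn filter_paths<'a>(paths: &[&'a Path], ignored_names: &[&str]) -> Vec<&'a Path> {
    paths
        .iter()
        .copied()
        .filter(|p| !is_ignored(p, ignored_names))
        .collect()
}
```

With `["node_modules", ".git"]` as the ignored names, `./node_modules/left-pad/index.js` is skipped while `./src/main.rs` is kept. Matching happens on whole components, so a file named `node_modules_notes.txt` is not affected.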

Real-Time Monitoring

The watch mode uses the notify crate for filesystem event monitoring:

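```rust
use std::path::Path;
use std::sync::mpsc::channel;

use anyhow::Result; // assumed single-parameter `Result` alias
use notify::{Config, EventKind, RecommendedWatcher, RecursiveMode, Watcher};

pub fn watch_directory(path: &Path, baseline: &[FileRecord]) -> Result<()> {
    let (tx, rx) = channel();
    let mut watcher = RecommendedWatcher::new(tx, Config::default())?;

    watcher.watch(path, RecursiveMode::Recursive)?;

    for res in rx {
        let event = res?;
        match event.kind {
            EventKind::Modify(_) | EventKind::Create(_) => {
                // One event can carry several paths; only hash regular files
                for changed in event.paths.iter().filter(|p| p.is_file()) {
                    let new_record = compute_checksum(changed)?;
                    let baseline_record = baseline
                        .iter()
                        .find(|r| Path::new(&r.path) == changed.as_path());

                    if let Some(old) = baseline_record {
                        if old.checksum != new_record.checksum {
                            alert_change(changed, old, &new_record);
                        }
                    }
                }
            }
            EventKind::Remove(_) => {
                for removed in &event.paths {
                    alert_deletion(removed);
                }
            }
            _ => {}
        }
    }
    Ok(())
}
```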

This enables real-time alerting:

```bash
argus watch /var/www/html --baseline production.ndjson
# Alerts instantly when files change
```
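The watch loop above searches the baseline linearly for every event. For large baselines, one possible refinement is to index the baseline once before entering the loop. The following is a minimal sketch of that idea, not the current Argus code; `BaselineEntry`, `Verdict`, `index_baseline`, and `classify` are hypothetical names.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};

/// Hypothetical slimmed-down baseline entry: path plus hex checksum.
struct BaselineEntry {
    path: PathBuf,
    checksum: String,
}

/// What a single filesystem event means relative to the baseline.
#[derive(Debug, PartialEq)]
enum Verdict {
    Unchanged,
    Modified,
    NotInBaseline,
}

/// Build the index once, before entering the watch loop.
fn index_baseline(entries: &[BaselineEntry]) -> HashMap<&Path, &str> {
    entries
        .iter()
        .map(|e| (e.path.as_path(), e.checksum.as_str()))
        .collect()
}

/// Classify a changed path given its freshly computed checksum.
fn classify(index: &HashMap<&Path, &str>, path: &Path, new_checksum: &str) -> Verdict {
    match index.get(path) {
        Some(old) if *old == new_checksum => Verdict::Unchanged,
        Some(_) => Verdict::Modified,
        None => Verdict::NotInBaseline,
    }
}
```

Each lookup becomes a hash probe instead of a scan over the whole baseline. The explicit `NotInBaseline` case also surfaces something the loop above passes over silently: a newly created file with no baseline entry produces no alert there.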

Comparison Reports

Detecting what changed between two scans is critical for incident response:

```bash
# Baseline scan
argus scan --directory /srv/app --output baseline.ndjson

# Later, compare
argus scan --directory /srv/app --compare baseline.ndjson
```

Output shows:

  • Modified: Files with different checksums
  • Added: New files not in baseline
  • Deleted: Files in baseline but missing now

Implementation:

```rust
use std::collections::{HashMap, HashSet};

// `ComparisonReport<'a>` is assumed to hold references into both scans.
pub fn compare_scans<'a>(
    current: &'a [FileRecord],
    baseline: &'a [FileRecord],
) -> ComparisonReport<'a> {
    let baseline_map: HashMap<&str, &FileRecord> = baseline
        .iter()
        .map(|r| (r.path.as_str(), r))
        .collect();

    let mut modified = Vec::new();
    let mut added = Vec::new();

    for record in current {
        match baseline_map.get(record.path.as_str()) {
            Some(old) if old.checksum != record.checksum => {
                modified.push((*old, record));
            }
            None => {
                added.push(record);
            }
            _ => {} // Unchanged
        }
    }

    // Index current paths so the deletion check is O(1) per baseline entry
    let current_paths: HashSet<&str> = current.iter().map(|r| r.path.as_str()).collect();
    let deleted = baseline
        .iter()
        .filter(|r| !current_paths.contains(r.path.as_str()))
        .collect();

    ComparisonReport { modified, added, deleted }
}
```
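A small worked example makes the three categories easier to see. This is a self-contained sketch of the same three-way diff over plain path-to-checksum maps rather than Argus's own record types; `Diff`, `diff_scans`, and the `scan` helper are hypothetical names.

```rust
use std::collections::BTreeMap;

/// Paths grouped by how they differ between two scans.
#[derive(Debug, Default, PartialEq)]
struct Diff {
    modified: Vec<String>,
    added: Vec<String>,
    deleted: Vec<String>,
}

/// Compare two scans given as path -> checksum maps.
/// BTreeMap iteration keeps each output list in sorted path order.
fn diff_scans(baseline: &BTreeMap<String, String>, current: &BTreeMap<String, String>) -> Diff {
    let mut diff = Diff::default();

    for (path, checksum) in current {
        match baseline.get(path) {
            Some(old) if old != checksum => diff.modified.push(path.clone()),
            Some(_) => {} // unchanged
            None => diff.added.push(path.clone()),
        }
    }

    for path in baseline.keys() {
        if !current.contains_key(path) {
            diff.deleted.push(path.clone());
        }
    }

    diff
}

/// Helper for building a scan from string literals.
fn scan(entries: &[(&str, &str)]) -> BTreeMap<String, String> {
    entries
        .iter()
        .map(|(p, c)| (p.to_string(), c.to_string()))
        .collect()
}
```

Given a baseline of `a.conf`, `b.bin`, and `c.txt`, and a current scan where `b.bin` has a new checksum, `c.txt` is gone, and `d.log` has appeared, `diff_scans` reports `b.bin` as modified, `d.log` as added, and `c.txt` as deleted.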

Performance Benchmarks

Tested on a MacBook Pro (M1, 8 cores) scanning a large codebase:

| Files | Size | Single Thread | 8 Threads (Argus) | Speedup |
|---|---|---|---|---|
| 1,000 | 50MB | 2.3s | 0.4s | 5.8x |
| 10,000 | 500MB | 23.1s | 3.2s | 7.2x |
| 50,000 | 2GB | 118s | 15.7s | 7.5x |
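For anyone who wants to check the arithmetic, the speedup column is the single-thread time divided by the 8-thread time. The helpers below are a small sketch with hypothetical names that also derive two figures the table does not list: parallel efficiency (speedup divided by thread count) and throughput in files per second.

```rust
/// Speedup: single-thread time divided by parallel time.
fn speedup(single_secs: f64, parallel_secs: f64) -> f64 {
    single_secs / parallel_secs
}

/// Parallel efficiency: the fraction of ideal linear scaling achieved.
fn efficiency(speedup: f64, threads: u32) -> f64 {
    speedup / f64::from(threads)
}

/// Throughput in files per second.
fn files_per_second(files: u64, secs: f64) -> f64 {
    files as f64 / secs
}
```

Plugging in the table's numbers, the first row works out to 5.75, which the table shows as 5.8x. Efficiency on 8 threads comes to roughly 72%, 90%, and 94% for the three rows, and the 8-thread throughput lands between about 2,500 and 3,200 files per second.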

Real-world usage on a production web server (5,000 files):

  • Initial scan: 1.2 seconds
  • Incremental comparison: 0.3 seconds
  • Watch mode overhead: <1% CPU

Real-World Use Cases

Production Monitoring

I run Argus on production servers to detect unauthorized changes:

```bash
# Cron job every 5 minutes
*/5 * * * * argus scan /var/www --compare /var/baseline.ndjson && notify_slack
```

If anything changes, I get a Slack alert with a diff of the modified files.

Supply Chain Security

Verify vendor-provided binaries haven't been tampered with:

```bash
# Generate checksums from trusted source
argus scan /opt/vendor-software --output trusted.ndjson

# Periodically verify
argus scan /opt/vendor-software --compare trusted.ndjson
```

Incident Response

After detecting a breach, quickly identify what was modified:

```bash
# Compare current state to pre-incident baseline
argus scan /compromised-system --compare pre-incident.ndjson > changes.txt
```

The output shows exactly which files changed relative to the pre-incident baseline.

Git Alternative for Non-Code

Track changes in directories that aren't under version control:

```bash
# Configuration directories
argus watch /etc --baseline /backups/etc-baseline.ndjson

# Data directories
argus watch /var/lib/important-data --baseline data-baseline.ndjson
```

Limitations & Future Work

Current Limitations:

  • Large files: 1GB file size limit (configurable)
  • No cryptographic signing: An attacker with root access can rewrite the baseline so that forged checksums match
  • Basic alerting: No built-in notification system
  • Single machine: Doesn't scale across distributed systems

Planned Features:

  • HMAC-authenticated baselines, so tampering with a baseline is detectable
  • Built-in alerting (email, Slack, webhooks)
  • Distributed scanning for cluster deployments
  • SQLite storage option for faster comparisons
  • Filter by file patterns (only watch *.so files)

Why Rust for FIM?

  • Performance: Parallel processing with zero-cost abstractions
  • Safety: No buffer overflows or data races when reading files concurrently
  • Single Binary: Deploy one executable with no runtime dependencies
  • Cross-Platform: Runs on Linux, macOS, and Windows with the same codebase

Many FIM tools are written in C or C++ (AIDE and Tripwire, for example), and others are Python scripts. Python is too slow for large scans. C and C++ require careful memory management and are hard to parallelize safely. Rust gives you C-like performance with Python-like ergonomics.

Try It Yourself

Argus is open source and ready to use:

```bash
# Install from source
git clone https://github.com/abendrothj/Argus
cd Argus
cargo install --path .

# Or download pre-built binary from releases

# Basic usage
argus scan --directory /path/to/monitor --output baseline.ndjson

# Compare scans
argus scan --directory /path/to/monitor --compare baseline.ndjson

# Watch mode
argus watch --directory /path/to/monitor --baseline baseline.ndjson

# Custom thread count
argus scan --directory /large/directory --threads 16 --output scan.ndjson
```

For production use, I recommend:

  1. Generate baseline from known-good state
  2. Store baseline in immutable storage (S3, write-once filesystem)
  3. Run comparison scans via cron
  4. Alert on any differences
  5. Regenerate baseline after verified changes

Lessons Learned

Building Argus taught me:

  • Parallel I/O: How to efficiently read files concurrently without thrashing disk
  • Filesystem APIs: Deep dive into metadata, inodes, and platform differences
  • Benchmarking: Profiling parallel code is hard - learned to use cargo flamegraph
  • Rust Performance: Where async helps vs where thread pools are better
  • Security Operations: What practitioners actually need vs what vendors sell

The hardest part was optimizing for both many small files (configs) and few large files (binaries). Different access patterns need different strategies.
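As a rough illustration of what "different strategies" can mean, one option is to pick the read path from the file size: read small files in one call and stream large ones through a bigger buffer. This is a sketch under assumed thresholds, not how Argus is tuned; `SMALL_FILE_LIMIT`, `LARGE_FILE_BUFFER`, `ReadPlan`, `plan_for`, and `consume` are all hypothetical.

```rust
use std::io::{self, Read};

/// Hypothetical cutoff: files at or below this size are read in one go.
const SMALL_FILE_LIMIT: u64 = 64 * 1024;
/// Hypothetical buffer size for streaming larger files.
const LARGE_FILE_BUFFER: usize = 1024 * 1024;

#[derive(Debug, PartialEq)]
enum ReadPlan {
    WholeFile,
    Chunked { buffer_size: usize },
}

/// Choose a read strategy from the file size alone.
fn plan_for(size: u64) -> ReadPlan {
    if size <= SMALL_FILE_LIMIT {
        ReadPlan::WholeFile
    } else {
        ReadPlan::Chunked { buffer_size: LARGE_FILE_BUFFER }
    }
}

/// Feed every byte from `reader` to `sink` (for example, a hasher's update
/// method) using the plan for `size`. Returns the number of bytes consumed.
fn consume<R: Read>(mut reader: R, size: u64, mut sink: impl FnMut(&[u8])) -> io::Result<u64> {
    match plan_for(size) {
        ReadPlan::WholeFile => {
            // Small file: a single allocation and a single sink call.
            let mut data = Vec::with_capacity(size as usize);
            reader.read_to_end(&mut data)?;
            sink(&data);
            Ok(data.len() as u64)
        }
        ReadPlan::Chunked { buffer_size } => {
            // Large file: constant memory, fewer read calls than a tiny buffer.
            let mut buffer = vec![0u8; buffer_size];
            let mut total = 0u64;
            loop {
                let n = reader.read(&mut buffer)?;
                if n == 0 {
                    break;
                }
                sink(&buffer[..n]);
                total += n as u64;
            }
            Ok(total)
        }
    }
}
```

The right cutoff and buffer size depend on the storage and the workload, so values like these are something to measure rather than assume.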

Closing Thoughts

File integrity monitoring shouldn't be complex or slow. Argus proves you can have:

  • Sub-second scans for around 1,000 files, and seconds rather than minutes for larger trees
  • Simple deployment (one binary)
  • Minimal configuration (just specify a directory)
  • Structured output for automation

If you need to monitor files for changes - whether for security, compliance, or just peace of mind - give Argus a try.

And if you're working on systems programming in Rust, the codebase is a good example of parallel I/O, filesystem operations, and CLI design.


Security tools should be fast, simple, and transparent. Complex tools don't get deployed. Slow tools don't get run.