hervad/log-analyse

log-analyse

log-analyse is a portable uv-managed project for preparing support bundles for retrieval workflows.

The repository currently contains:

  • run-based bundle ingestion under data/input, data/work, and data/output
  • bundle discovery and profile-driven content preprocessing entry points
  • a small internal Python module under src/log_analyse/ for bundle handlers and generic orchestration
  • bundle discovery, content preprocessing, content chunking, chunks validation, embedding preparation, lexical vectorization, semantic vectorization, embeddings packaging, vector store load, evidence retrieval, and analysis stages
  • incident-centric metadata propagation for run_id, incident_id, bundle_id, bundle_type, system_id, and related payload fields
  • a new local dense embedding stage powered by sentence-transformers
  • an optional cloud semantic embedding backend for Google Vertex AI
  • a hierarchical JSON configuration file for switching embedding providers and models
  • a Textual-based terminal config editor for day-to-day configuration work
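The incident-centric metadata propagation mentioned in the list above can be pictured with a small sketch. It is illustrative only: the `IncidentContext` class and `attach_context` helper are hypothetical names, not part of `src/log_analyse/`, and the identifier values are made up. Only the field names come from the list.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class IncidentContext:
    """Hypothetical container for the metadata fields named in the list above."""

    run_id: str
    incident_id: str
    bundle_id: str
    bundle_type: str
    system_id: str


def attach_context(payload: dict, context: IncidentContext) -> dict:
    """Return a copy of payload with the context fields merged in.

    Context values win on key collisions, so every record produced
    within a run carries the same identifiers.
    """
    return {**payload, **asdict(context)}


context = IncidentContext(
    run_id="run-001",
    incident_id="incident-42",  # illustrative value
    bundle_id="bundle-a",  # illustrative value
    bundle_type="linux-sosreport",
    system_id="host-01",  # illustrative value
)
chunk = attach_context({"text": "kernel: oom-killer invoked"}, context)
print(chunk["run_id"], chunk["bundle_type"])
```

Keeping the context immutable (`frozen=True`) is one way to ensure that later stages cannot accidentally change identifiers that earlier stages already wrote to disk.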

Quick start

  1. Install Python 3.14 or newer.
  2. Install uv.
  3. Sync the environment:
uv sync

To use Google Vertex AI embeddings, install the matching optional extra:

uv sync --extra google
  4. Check the privacy status and run the dense embedding smoke test:
uv run python scripts/extra/show-privacy-status.py
uv run python scripts/extra/smoke-test.py
  5. Review or edit the config.

You can edit config/pipeline.json directly or use the TUI:

uv run log-analyse-config-ui

The TUI saves pipeline.json, rebuilds the compiled cache automatically, and stores its own local UI preferences in config/pipeline.ui.json.

Useful TUI actions:

  • Save: writes pipeline.json and refreshes pipeline.config.cache.bin
  • Revert: discards unsaved form changes
  • Default: loads the shipped safe baseline preset
  • Secure: loads a stricter local-only privacy-first preset
  6. Place a bundle under data/input/run-001/bundle/ or pass --bundle-path. The current pipeline processes one unpacked bundle per run.

  7. Run the idempotent main pipeline:

uv run log-analyse --run-id run-001

If you want to rebuild the local compiled config cache explicitly:

uv run log-analyse-config-compile
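The run-based layout used in the steps above can be sketched in Python. This is a minimal sketch under stated assumptions: only data/input/<run-id>/bundle/ is documented here, so the per-run subdirectories under data/work and data/output are guesses, and `run_paths` and `prepare_run` are hypothetical helpers rather than functions shipped with the project.

```python
from pathlib import Path


def run_paths(root: Path, run_id: str) -> dict[str, Path]:
    """Build the per-run directories for a given run id."""
    return {
        # Documented location for the unpacked bundle.
        "bundle": root / "data" / "input" / run_id / "bundle",
        # Assumption: work and output are also keyed by run id.
        "work": root / "data" / "work" / run_id,
        "output": root / "data" / "output" / run_id,
    }


def prepare_run(root: Path, run_id: str) -> dict[str, Path]:
    """Create the run directories if missing; safe to call repeatedly."""
    paths = run_paths(root, run_id)
    for path in paths.values():
        path.mkdir(parents=True, exist_ok=True)
    return paths


paths = run_paths(Path("repo"), "run-001")
print(paths["bundle"].as_posix())
```

Because `mkdir` is called with `exist_ok=True`, calling `prepare_run` twice for the same run id has the same effect as calling it once, which mirrors the idempotent behaviour the main pipeline aims for.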

Configuration

The dense embedding stage reads its defaults from config/pipeline.json.

The main dense-provider setting looks like this:

{
  "embedding": {
    "semantic": {
      "provider": {
        "kind": "sentence-transformers",
        "model_name": "BAAI/bge-small-en-v1.5"
      }
    }
  }
}

Change provider.kind to switch providers and provider.model_name to switch models; no code changes are needed.
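If you prefer to script the change instead of editing the file by hand, something like the following could work. It is a sketch, not a project API: `set_semantic_provider` is a hypothetical helper, it relies only on the embedding.semantic.provider structure shown above, and the replacement model name is just an example.

```python
import copy
import json
from pathlib import Path


def set_semantic_provider(config: dict, kind: str, model_name: str) -> dict:
    """Return a copy of config with embedding.semantic.provider updated."""
    updated = copy.deepcopy(config)
    provider = (
        updated.setdefault("embedding", {})
        .setdefault("semantic", {})
        .setdefault("provider", {})
    )
    provider["kind"] = kind
    provider["model_name"] = model_name
    return updated


def update_config_file(path: Path, kind: str, model_name: str) -> None:
    """Rewrite the JSON file at path in place with the new provider values."""
    config = json.loads(path.read_text(encoding="utf-8"))
    updated = set_semantic_provider(config, kind, model_name)
    path.write_text(json.dumps(updated, indent=2) + "\n", encoding="utf-8")


# In-memory demonstration using the structure shown above.
config = {
    "embedding": {
        "semantic": {
            "provider": {
                "kind": "sentence-transformers",
                "model_name": "BAAI/bge-small-en-v1.5",
            }
        }
    }
}
updated = set_semantic_provider(
    config, "sentence-transformers", "BAAI/bge-base-en-v1.5"
)
print(updated["embedding"]["semantic"]["provider"]["model_name"])
```

After a scripted edit of config/pipeline.json, running uv run log-analyse-config-compile rebuilds the compiled cache explicitly.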

The two currently implemented provider families are:

  • sentence-transformers
  • google-vertex-ai

Provider-specific settings live inside the provider object. For example, local_files_only and trust_remote_code apply to sentence-transformers, while project, location, and credentials_env apply to google-vertex-ai.
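To illustrate how provider-specific settings relate to provider.kind, here is a hedged sketch of a check that flags keys belonging to the other provider family. The key sets are taken from the paragraph above and may be incomplete. `misplaced_keys` is a hypothetical helper, and the project may validate its config differently or not at all.

```python
# Keys taken from the paragraph above; the real schema may contain more.
PROVIDER_SPECIFIC_KEYS = {
    "sentence-transformers": {"local_files_only", "trust_remote_code"},
    "google-vertex-ai": {"project", "location", "credentials_env"},
}


def misplaced_keys(provider: dict) -> set[str]:
    """Return keys in provider that belong to a different provider kind."""
    kind = provider.get("kind")
    if kind not in PROVIDER_SPECIFIC_KEYS:
        raise ValueError(f"unknown provider kind: {kind!r}")
    foreign: set[str] = set()
    for other_kind, keys in PROVIDER_SPECIFIC_KEYS.items():
        if other_kind != kind:
            foreign |= keys
    return set(provider) & foreign


provider = {
    "kind": "sentence-transformers",
    "model_name": "BAAI/bge-small-en-v1.5",
    "local_files_only": True,
    "project": "demo-project",  # illustrative value; belongs to google-vertex-ai
}
print(sorted(misplaced_keys(provider)))
```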

config/pipeline.json is the source of truth. The runtime may generate config/pipeline.config.cache.bin as a local compiled cache, but that file is not meant for editing, review, or committing to git. It is safe to delete; the next run will recreate it from pipeline.json.
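Since the cache is disposable, clearing it can be as simple as removing the file. The sketch below assumes the file name given above; `drop_config_cache` is a hypothetical helper, and the demonstration runs against a temporary directory rather than the real config/ folder.

```python
import tempfile
from pathlib import Path

CACHE_NAME = "pipeline.config.cache.bin"


def drop_config_cache(config_dir: Path) -> bool:
    """Delete the compiled cache if present; return True if a file was removed."""
    cache = config_dir / CACHE_NAME
    if cache.exists():
        cache.unlink()
        return True
    return False


# Demonstration against a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    config_dir = Path(tmp)
    (config_dir / CACHE_NAME).write_bytes(b"stale")
    removed_first = drop_config_cache(config_dir)  # file existed
    removed_second = drop_config_cache(config_dir)  # nothing left to remove
print(removed_first, removed_second)
```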

The TUI may also generate config/pipeline.ui.json for local interface preferences such as the selected theme. That file is also local-only and is not meant for committing to git.

Documentation

The main log-analyse command prepares retrieval-ready artifacts on disk. Publishing those artifacts into Qdrant and running retrieval/evaluation workflows are separate commands.

Notes

  • The repository is now configured in offline-first mode for dense embeddings.
  • Dense embedding runs should use only local model files and should not try to contact Hugging Face.
  • If the dense model is missing locally, the dense stage should fail instead of downloading it.
  • Today linux-sosreport is the only implemented bundle profile and the only fully implemented bundle handler.
  • Future bundle types already planned in config and docs include ESXi vm-support, vCenter support bundles, storage support bundles, and Jenkins log bundles.
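The offline-first notes above describe fail-fast behaviour: a missing local model should raise an error rather than trigger a download. Below is a minimal sketch of that idea using only the standard library. `require_local_model` and the model directory are hypothetical; the project itself exposes this through the provider's local_files_only setting. HF_HUB_OFFLINE is the huggingface_hub environment variable that disables network calls.

```python
import os
import tempfile
from pathlib import Path

# Ask huggingface_hub to stay offline unless the caller already decided otherwise.
os.environ.setdefault("HF_HUB_OFFLINE", "1")


def require_local_model(model_dir: Path) -> Path:
    """Return model_dir if it holds files; raise instead of downloading."""
    if not model_dir.is_dir() or not any(model_dir.iterdir()):
        raise FileNotFoundError(
            f"dense model not found locally at {model_dir}; "
            "refusing to download in offline-first mode"
        )
    return model_dir


# Demonstration: an empty directory stands in for a missing model.
with tempfile.TemporaryDirectory() as tmp:
    try:
        require_local_model(Path(tmp))
        missing_detected = False
    except FileNotFoundError:
        missing_detected = True
print(missing_detected)
```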
