SonarQube KPI Runner is an end-to-end, automated code quality analysis and visualization pipeline designed for large-scale codebases, especially C/C++ multi-module systems.
It extracts Bugs and Issues from SonarQube, enriches them with source-code context (snippets / bug blocks / callsites), uses LLMs (e.g., GPT-4.1 / Copilot) to generate structured explanations and remediation advice, and exports the results to CSV, which can be automatically synchronized to OneDrive → Power BI for continuous quality governance, trend analysis, and KPI reporting.
- ✅ Turn SonarQube issues into analyzable data assets (JSON / JSONL / CSV)
- ✅ Automatically generate explanations and remediation suggestions for each issue using LLMs
- ✅ Optionally enrich issues with stronger context (bug blocks / callsites) to improve suggestion quality
- ✅ Sync results to OneDrive as a stable data source for Power BI
- ✅ Archive outputs by project and timestamp for historical replay and auditing
    SonarQube
       ↓ (list issues)
    Issues Snapshot (raw)
       ↓ (optional: snippets / bug blocks / callsites)
    Context Enrichment
       ↓ (LLM review)
    Advice JSON / JSONL
       ↓ (export)
    CSV
       ↓ (sync)
    OneDrive → Power BI
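The first step of the flow (listing issues) could look roughly like the sketch below. It assumes the SonarQube Web API endpoint `api/issues/search` with the `componentKeys`, `types`, `ps`, and `p` parameters (newer server versions may prefer `components`) and bearer-token authentication. The names `build_search_url`, `http_fetch`, and `iter_issues` are hypothetical and are not taken from this repository.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Dict, Iterator


def build_search_url(base_url: str, project_key: str, page: int, page_size: int = 100) -> str:
    """Build an api/issues/search URL for one page of BUG-type issues."""
    query = urllib.parse.urlencode(
        {"componentKeys": project_key, "types": "BUG", "ps": page_size, "p": page}
    )
    return f"{base_url.rstrip('/')}/api/issues/search?{query}"


def http_fetch(url: str, token: str) -> Dict:
    """Fetch one page over HTTP. Bearer auth is an assumption; adjust to your server."""
    request = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def iter_issues(base_url: str, project_key: str,
                fetch_page: Callable[[str], Dict], page_size: int = 100) -> Iterator[Dict]:
    """Yield issues page by page until a short (or empty) page is returned."""
    page = 1
    while True:
        payload = fetch_page(build_search_url(base_url, project_key, page, page_size))
        issues = payload.get("issues", [])
        yield from issues
        if len(issues) < page_size:
            break
        page += 1
```

In real use, `fetch_page` would be something like `lambda url: http_fetch(url, token)`; keeping it injectable makes the paging logic testable without a server.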
- End-to-End Automation: from SonarQube ingestion → LLM review → CSV export → OneDrive → Power BI
- Context-Aware Analysis: supports bug-block and callsite-level context (function / call location granularity) to improve remediation quality
- Robust Synchronization: uses temporary files and atomic replacement to prevent partial OneDrive syncs
- Modular Architecture: clear separation between ingestion, context extraction, LLM review, export, and publishing
- Replayable Outputs: timestamped outputs enable historical trend analysis and governance reviews
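To illustrate what context enrichment means in practice, here is a deliberately simplified sketch. It uses a fixed window of lines around the reported line, which is a stand-in for (not a description of) the function-level extraction done by `bug_block_extractor.py`. The function name `extract_snippet` and the `radius` parameter are hypothetical.

```python
from typing import List


def extract_snippet(source_lines: List[str], issue_line: int, radius: int = 3) -> str:
    """Return numbered lines around a 1-based issue line, marking the issue line with '>>'."""
    start = max(1, issue_line - radius)
    end = min(len(source_lines), issue_line + radius)
    rendered = []
    for number in range(start, end + 1):
        marker = ">>" if number == issue_line else "  "
        rendered.append(f"{marker} {number:4d} | {source_lines[number - 1]}")
    return "\n".join(rendered)
```

A snippet like this can be attached to each issue record before the LLM review step, so the model sees the code rather than only the rule message.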
The structure below reflects the actual layout under backend/src/.
    SONARQUBEKPIRUNNER/
    ├── backend/
    │   └── src/
    │       ├── config/
    │       │   └── github_models.local.json          # LLM configuration (local/example)
    │       │
    │       ├── data_io/
    │       │   ├── file_reader.py                    # Generic file reader
    │       │   ├── file_writer.py                    # Generic file writer
    │       │   └── jsonl_to_csv_exporter.py          # JSONL → CSV exporter
    │       │
    │       ├── dependency/
    │       │   ├── bug_block_extractor.py            # Bug block (function/context) extraction
    │       │   ├── bug_reference_scanner.py          # Reference / linkage scanning
    │       │   └── cpp_dependency_extractor.py       # C++ dependency extraction (optional)
    │       │
    │       ├── doc/
    │       │   └── issue_intelligence_keypoints.md   # Design notes and key ideas
    │       │
    │       ├── evaluations/
    │       │   └── bugs/
    │       │       ├── sq_bug_block_advisor.py       # ✅ Entry: Bug-block-level LLM advisor
    │       │       └── sq_bug_callsite_advisor.py    # ✅ Entry: Callsite-level LLM advisor
    │       │
    │       ├── sonar/
    │       │   └── sq_issue_advisor.py               # Issue-level advisor (legacy/optional)
    │       │
    │       ├── llm/
    │       │   ├── config_loader.py                  # Model config loader
    │       │   ├── copilot_client.py                 # Copilot / OpenAI client wrapper
    │       │   └── llm_handler.py                    # Unified LLM invocation layer
    │       │
    │       ├── powerbi/
    │       │   ├── one_drive_publisher.py            # OneDrive publisher
    │       │   └── power_bi_refresher.py             # Power BI refresh (optional)
    │       │
    │       ├── prompts/
    │       │   ├── system.bug_block.review.txt
    │       │   ├── system.bug_callsite.review.txt
    │       │   ├── system.sonar.review.txt
    │       │   ├── user.bug_block.review.txt
    │       │   ├── user.bug_callsite.review.txt
    │       │   └── user.sonar.review.txt
    │       │
    │       ├── outputs/                              # All generated artifacts (archived)
    │       │   ├── HysysEngine.Engine.issues/
    │       │   ├── HysysEngine.Engine.bugs/
    │       │   ├── HysysEngine.Engine.dependency/
    │       │   └── evaluations/
    │       │       └── /
    │       │           ├── bug_blocks//
    │       │           └── bug_callsites//
    │       │
    │       └── main.py                               # Future unified pipeline entry
    │
    ├── Flowchart_cn.md
    ├── Flowchart_en.md
    ├── README.md
    └── requirements.txt
Typical output paths (archived by project and timestamp):
- backend/src/outputs/HysysEngine.Engine.issues/issues_raw.jsonl
- backend/src/outputs/HysysEngine.Engine.issues/issues_with_snippets.jsonl
- backend/src/outputs/HysysEngine.Engine.bugs/bugs_with_anchors_and_calls.json
- Bug Block level: backend/src/outputs/evaluations/<PROJECT_KEY>/bug_blocks/<timestamp>/bugs_with_bug_block_advice.jsonl
- Callsite level: backend/src/outputs/evaluations/<PROJECT_KEY>/bug_callsites/<timestamp>/bugs_with_bug_callsite_advice.jsonl
- issues_with_advice.csv
- bugs_with_*_advice.csv (exact schema depends on the exporter)
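As a rough illustration of the JSONL → CSV step, the sketch below writes one CSV row per JSONL record. It is not the code in `jsonl_to_csv_exporter.py`; the function name `jsonl_to_csv` and the choice to serialize nested values as JSON text are assumptions, and the column list is supplied by the caller because the real schema depends on the exporter.

```python
import csv
import json
from pathlib import Path
from typing import List


def jsonl_to_csv(jsonl_path: Path, csv_path: Path, columns: List[str]) -> int:
    """Write one CSV row per JSONL record, keeping only `columns`. Returns the row count."""
    rows = 0
    with jsonl_path.open(encoding="utf-8") as src, \
            csv_path.open("w", newline="", encoding="utf-8") as dst:
        # Keys not listed in `columns` are ignored; missing keys become empty cells.
        writer = csv.DictWriter(dst, fieldnames=columns, extrasaction="ignore")
        writer.writeheader()
        for line in src:
            if not line.strip():
                continue  # tolerate blank lines
            record = json.loads(line)
            # Nested values are stored as JSON text so each stays in a single cell.
            flat = {
                key: json.dumps(value, ensure_ascii=False) if isinstance(value, (dict, list)) else value
                for key, value in record.items()
            }
            writer.writerow(flat)
            rows += 1
    return rows
```

Fixing the column list up front keeps the CSV header stable across runs, which matters when Power BI reads the same file repeatedly.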
At the moment, the most stable and direct way to run the pipeline is via:
- sq_bug_block_advisor.py
- sq_bug_callsite_advisor.py
These scripts:
- Read input JSON / JSONL (e.g., bugs_with_anchors_and_calls.json)
- Load system and user prompts
- Invoke the LLM to generate advice
- Write results to outputs/evaluations/.../<timestamp>/
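The steps above can be sketched as a single loop. This is a minimal illustration under stated assumptions, not the advisor scripts themselves: `run_advisor`, the `{{BUG_JSON}}` placeholder, the timestamp format, and the injected `call_llm` callable are all hypothetical.

```python
import json
from datetime import datetime
from pathlib import Path
from typing import Callable, Dict, List


def run_advisor(bugs: List[Dict], system_prompt: str, user_template: str,
                call_llm: Callable[[str, str], str], out_root: Path, filename: str) -> Path:
    """Ask the LLM for advice on each bug and write JSONL under a timestamped directory."""
    out_dir = out_root / datetime.now().strftime("%Y%m%d_%H%M%S")
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / filename
    with out_path.open("w", encoding="utf-8") as out:
        for bug in bugs:
            # The placeholder name is an assumption; real templates may differ.
            user_prompt = user_template.replace("{{BUG_JSON}}", json.dumps(bug, ensure_ascii=False))
            try:
                record = {**bug, "advice": call_llm(system_prompt, user_prompt)}
            except Exception as exc:
                # Record the failure and continue so one bad call does not lose the run.
                record = {**bug, "advice": None, "error": str(exc)}
            out.write(json.dumps(record, ensure_ascii=False) + "\n")
    return out_path
```

Writing one line per bug as results arrive means a partially completed run still leaves usable output.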
pip install -r requirements.txt
Make sure you have:
- SonarQube access token / base URL (if required by your setup)
- LLM configuration (backend/src/config/github_models.local.json)
- A local OneDrive sync folder (if Power BI publishing is enabled)
python backend/src/evaluations/bugs/sq_bug_block_advisor.py
Best for:
- Minimal but sufficient context (function / class vicinity)
- Lower token usage
- Faster execution and lower failure rate
Output example:
backend/src/outputs/evaluations/<PROJECT_KEY>/bug_blocks/<timestamp>/bugs_with_bug_block_advice.jsonl
python backend/src/evaluations/bugs/sq_bug_callsite_advisor.py
Best for:
- Complex bugs spanning multiple modules
- Scenarios where callsite or reference context improves accuracy
Output example:
backend/src/outputs/evaluations/<PROJECT_KEY>/bug_callsites/<timestamp>/bugs_with_bug_callsite_advice.jsonl
All prompts are stored under:
- backend/src/prompts/system.*.txt
- backend/src/prompts/user.*.txt
This ensures:
- Prompt versions are tracked in Git
- Output quality changes can be traced back to prompt updates
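One way to make that traceability concrete is to record a short content hash of the prompt pair alongside each output. The sketch below follows the `system.<name>.review.txt` / `user.<name>.review.txt` naming seen in the prompts folder; the function `load_prompt_pair` and the `prompt_version` field are hypothetical additions, not existing project code.

```python
import hashlib
from pathlib import Path
from typing import Dict


def load_prompt_pair(prompts_dir: Path, name: str) -> Dict[str, str]:
    """Load system.<name>.review.txt and user.<name>.review.txt plus a short content hash."""
    system = (prompts_dir / f"system.{name}.review.txt").read_text(encoding="utf-8")
    user = (prompts_dir / f"user.{name}.review.txt").read_text(encoding="utf-8")
    # Hash both files together; any edit to either prompt changes the version string.
    digest = hashlib.sha256((system + "\x00" + user).encode("utf-8")).hexdigest()[:12]
    return {"system": system, "user": user, "prompt_version": digest}
```

For example, `load_prompt_pair(Path("backend/src/prompts"), "bug_block")` would return both prompt texts and a version string that can be written into every advice record.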
Local / enterprise model settings are defined in:
backend/src/config/github_models.local.json
Note:
Tokens and sensitive credentials should be stored locally or via environment variables and excluded via .gitignore.
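A possible pattern for keeping the token out of the JSON file is shown below. The actual schema of `github_models.local.json` is not documented here, so the `api_token` key, the `LLM_API_TOKEN` variable name, and the function `load_llm_config` are all assumptions made for illustration.

```python
import json
import os
from pathlib import Path
from typing import Dict


def load_llm_config(path: Path, token_env: str = "LLM_API_TOKEN") -> Dict:
    """Load a JSON config and let an environment variable supply or override the token."""
    config = json.loads(path.read_text(encoding="utf-8"))
    token = os.environ.get(token_env)
    if token:
        config["api_token"] = token  # hypothetical key name
    if not config.get("api_token"):
        raise ValueError(f"No API token found in {path.name} or ${token_env}")
    return config
```

With this approach the committed example config can leave the token empty, and each machine provides its own value through the environment.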
After exporting JSONL to CSV:
- one_drive_publisher.py performs an atomic sync to a local OneDrive folder
- power_bi_refresher.py (optional) triggers a dataset refresh via the Power BI REST API
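The temporary-file-plus-atomic-replace idea can be sketched as follows. This is an illustration of the general technique, not the contents of `one_drive_publisher.py`; the function name `publish_atomically` is hypothetical.

```python
import os
import shutil
import tempfile
from pathlib import Path


def publish_atomically(source: Path, target_dir: Path) -> Path:
    """Copy `source` into `target_dir` via a temp file in the same directory, then rename."""
    target_dir.mkdir(parents=True, exist_ok=True)
    final_path = target_dir / source.name
    # The temp file lives in the target directory so the rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=str(target_dir), prefix=".", suffix=".tmp")
    os.close(fd)
    try:
        shutil.copyfile(source, tmp_name)
        os.replace(tmp_name, final_path)  # replaces any existing file in one step
    except BaseException:
        if os.path.exists(tmp_name):
            os.remove(tmp_name)
        raise
    return final_path
```

Because the final name only ever points at a complete file, a reader of that path should not observe a half-written CSV. Whether the OneDrive client briefly uploads the temporary file is a separate question that depends on its sync settings.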
Typical use cases:
- Near-real-time dashboard updates
- Turning code quality governance into measurable KPIs
This roadmap describes the evolution of the system beyond the current stable analysis and recommendation capabilities,
toward a collaborative, verifiable, and governable AI-assisted code quality remediation loop.
- Introduce a unified PipelineRunner entry point (orchestrating: SonarQube ingestion → context enrichment → LLM review → CSV export → publishing)
- Support scheduled execution (GitHub Actions / Windows Task Scheduler)
- Strengthen CSV / JSON schema validation (to ensure long-term stability of Power BI ingestion)
- Add failure categorization and metrics (token limits, LLM invocation failures, parsing errors, etc.)
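As a hint of what CSV schema validation might involve, the sketch below checks a header row against a list of required columns before publishing. The function `validate_csv_header` is hypothetical, and the required column names must come from whatever schema the exporter actually produces.

```python
import csv
import io
from typing import Iterable, List


def validate_csv_header(csv_text: str, required: Iterable[str]) -> List[str]:
    """Return a list of problems found in the header row; an empty list means it passed."""
    header = next(csv.reader(io.StringIO(csv_text)), None)
    if header is None:
        return ["file is empty"]
    problems = []
    missing = [column for column in required if column not in header]
    if missing:
        problems.append(f"missing columns: {', '.join(missing)}")
    duplicates = sorted({column for column in header if header.count(column) > 1})
    if duplicates:
        problems.append(f"duplicate columns: {', '.join(duplicates)}")
    return problems
```

Running a check like this before the OneDrive sync would stop a renamed or dropped column from silently breaking a Power BI report.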
- Generate temporary patch / diff artifacts based on LLM review results
  - Containing proposed code changes
  - Bundled with original code snippets and contextual explanations
- Patches are treated as candidate drafts, not direct modifications to the main branch
- Deliver patches back to developer repositories via PRs or temporary branches
- Developers review patches directly in their IDEs
- Perform necessary refinements (style, edge cases, business semantics)
- Introduce an explicit Developer Feedback mechanism
  - accept / partial accept / reject
  - Lightweight reason tags (insufficient context, logic mismatch, style concerns, etc.)
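Since this feedback mechanism is still on the roadmap, the following is only one possible shape for a feedback record. The class names, field names, and tag spellings are hypothetical.

```python
from dataclasses import asdict, dataclass, field
from enum import Enum
from typing import List


class Verdict(str, Enum):
    ACCEPT = "accept"
    PARTIAL_ACCEPT = "partial_accept"
    REJECT = "reject"


@dataclass
class DeveloperFeedback:
    issue_key: str
    verdict: Verdict
    reason_tags: List[str] = field(default_factory=list)  # e.g. "insufficient_context"

    def to_record(self) -> dict:
        """Return a plain dict suitable for a JSONL line or CSV row."""
        record = asdict(self)
        record["verdict"] = self.verdict.value
        return record
```

Restricting the verdict to a small enum keeps the data easy to aggregate later, while free-form detail stays in the optional tags.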
- Automatically trigger after PR merge or revision
  - CI builds
  - SonarQube re-analysis
- If issues are not fully resolved or new issues are introduced
  - Capture failure signals
  - Enter the next round of LLM-assisted remediation with failure-aware context
- Mark issues that fail across multiple iterations as hard cases
- Persist failure types, contextual features, and module distribution
- Perform time-based aggregation and analysis
  - Identify rules or modules with low remediation success rates
  - Determine which issue categories are better handled manually
- Distill key metrics into governance-level KPIs
  - Fix success rate
  - Developer acceptance rate
  - Mean Time To Resolution (MTTR)
  - Number of remediation iterations
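The KPIs listed above could be computed from per-issue remediation records along these lines. The record fields (`fixed`, `verdict`, `opened_at`, `resolved_at`, `iterations`) and the exact definitions, such as measuring MTTR only over fixed issues, are assumptions for illustration rather than an agreed specification.

```python
from datetime import datetime
from typing import Dict, List, Optional


def compute_kpis(records: List[Dict]) -> Dict[str, Optional[float]]:
    """Aggregate remediation records into the four governance KPIs."""
    total = len(records)
    if total == 0:
        return {"fix_success_rate": None, "developer_acceptance_rate": None,
                "mttr_hours": None, "mean_iterations": None}
    fixed = [r for r in records if r["fixed"]]
    accepted = [r for r in records if r["verdict"] == "accept"]
    # MTTR is measured only over issues that were actually resolved.
    hours = [(r["resolved_at"] - r["opened_at"]).total_seconds() / 3600 for r in fixed]
    return {
        "fix_success_rate": len(fixed) / total,
        "developer_acceptance_rate": len(accepted) / total,
        "mttr_hours": sum(hours) / len(hours) if hours else None,
        "mean_iterations": sum(r["iterations"] for r in records) / total,
    }
```

Each value maps to one Power BI measure, so the same records can drive both the trend charts and the headline KPIs.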
This project is intended for internal engineering quality governance and automation.
If open-sourcing, consider removing or anonymizing internal project names, paths, tokens, and SharePoint / OneDrive references.