Commit cbf1ac6

refine llmcc-rust
1 parent 4fdc3c6 commit cbf1ac6

11 files changed

Lines changed: 531 additions & 621 deletions

crates/llmcc-core/src/ir_query.rs

Lines changed: 6 additions & 1 deletion
@@ -323,7 +323,12 @@ impl<'hir, 'unit> HirQuery<'hir, 'unit> {
     /// through the type child to return the generic callee (`Repository` in
     /// `Repository<User>`).
     pub fn try_ident_with_field(self, field_id: u16) -> Option<&'hir HirIdent<'hir>> {
-        debug_assert!(!self.node.is_kind(HirKind::Identifier));
+        if self.node.is_kind(HirKind::Identifier) {
+            return (self.node.try_field_id() == Some(field_id))
+                .then(|| self.node.as_ident())
+                .flatten();
+        }
+
         for child in self.node.children(self.unit) {
             if child
                 .try_base()
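
The hunk above replaces a `debug_assert!` with an early return for the case where the queried node is itself an identifier. A minimal sketch of that `bool::then` plus `Option::flatten` pattern follows. `Node`, `Kind`, and `Ident` are hypothetical stand-ins, not the llmcc types, and the child lookup is a simplified placeholder for the loop that the diff truncates.

```rust
// Hypothetical stand-ins for the HIR types; not the real llmcc-core API.
#[derive(Debug, PartialEq)]
enum Kind {
    Identifier,
    Other,
}

#[derive(Debug, PartialEq)]
struct Ident {
    name: String,
}

struct Node {
    kind: Kind,
    field_id: Option<u16>,
    ident: Option<Ident>,
    children: Vec<Node>,
}

impl Node {
    fn as_ident(&self) -> Option<&Ident> {
        self.ident.as_ref()
    }

    fn try_ident_with_field(&self, field_id: u16) -> Option<&Ident> {
        if self.kind == Kind::Identifier {
            // `then` yields Some(..) only when the field matches, giving an
            // Option<Option<&Ident>>; `flatten` collapses it to Option<&Ident>.
            return (self.field_id == Some(field_id))
                .then(|| self.as_ident())
                .flatten();
        }

        // Simplified placeholder (assumption): first direct child with the field.
        self.children
            .iter()
            .find(|child| child.field_id == Some(field_id))
            .and_then(|child| child.as_ident())
    }
}
```

With this shape, calling the method on an identifier node returns that identifier when its field id matches and `None` otherwise, instead of tripping an assertion in debug builds.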

crates/llmcc-rust/README.md

Lines changed: 27 additions & 54 deletions
@@ -1,62 +1,35 @@
 # llmcc-rust
 
-This crate provides Rust language support for the **llmcc** project. It implements the language-specific logic for parsing, symbol collection, and semantic analysis (binding) of Rust code.
-
-## Overview
-
-`llmcc-rust` integrates with the core compiler infrastructure to provide:
-- **Parsing**: Uses `tree-sitter-rust` to generate an AST.
-- **Symbol Collection**: A first pass to declare all symbols (structs, functions, variables) in the scope graph.
-- **Symbol Binding**: A second pass to resolve references, infer types, and build the dependency graph.
-
-## Architecture
-
-The analysis pipeline consists of three main stages, orchestrated by the `LangRust` implementation in `src/token.rs`.
-
-### 1. Parsing (`src/token.rs`)
-The entry point for the crate. It defines the `LangRust` struct which implements the `LanguageImpl`.
-- Wraps `tree-sitter-rust` to produce a concrete syntax tree.
-- Maps Tree-sitter nodes to LLMCC's internal HIR (High-level Intermediate Representation).
-- Auto-generates token definitions via `build.rs`.
-
-### 2. Symbol Collection (`src/collect.rs`)
-The **Collection Pass** walks the AST to identify and declare definitions.
-- **Visitor**: `CollectorVisitor` traverses the AST.
-- **Scopes**: Creates scopes for modules, functions, structs, and blocks.
-- **Declarations**: Registers symbols for:
-  - Primitives (`i32`, `bool`, etc.)
-  - Modules and Crates (parsing `Cargo.toml` via `src/util.rs`)
-  - Functions, Structs, Enums, Traits
-  - Variables (via pattern matching in `let` bindings and parameters)
-- **Visibility**: Handles `pub` and `pub(crate)` modifiers to determine global symbol visibility.
-
-### 3. Symbol Binding (`src/bind/`)
-The **Binding Pass** resolves identifiers to their definitions and builds the call graph. This module is split into focused components:
-
-- **Visitor (`src/bind/visitor.rs`)**: The main driver, `BinderVisitor`, walks the AST again.
-- **Resolution (`src/bind/resolution.rs`)**: `SymbolResolver` handles complex name lookups, including:
-  - Lexical scoping (variables).
-  - Path resolution (`std::collections::HashMap`).
-  - Method resolution (looking up methods in `impl` blocks).
-- **Inference (`src/bind/inference.rs`)**: `ExprResolver` determines the types of expressions to support accurate method resolution.
-- **Linking (`src/bind/linker.rs`)**: `SymbolLinker` connects usage sites to definition sites, forming the dependency graph used by downstream LLM tasks.
-
-### Utilities (`src/util.rs`)
-Helper functions for filesystem and project structure analysis:
-- `parse_crate_name`: Extracts crate names from `Cargo.toml`.
-- `parse_module_name`: Handles Rust's module system conventions (e.g., `mod.rs`).
+Rust language support for llmcc.
 
-## Development
+The public API is intentionally small: use `LangRust` with the generic APIs from `llmcc-core` and `llmcc-resolver`. The collector, binder, inference, and pattern helpers are implementation details of the language adapter.
+
+## Pipeline
+
+`LangRust` implements the core language contract in `src/token.rs`:
+
+- parses Rust source with `tree-sitter-rust`
+- maps tree-sitter nodes and fields to llmcc HIR/block kinds from `src/token_map.toml`
+- creates Rust primitive symbols in the initial global scope
+- dispatches symbol collection and binding to the internal passes
+
+The internal passes are split by responsibility:
 
-### Testing
-The crate includes extensive unit tests ensuring correct symbol resolution and dependency tracking.
+- `collect.rs`: declares Rust symbols and attaches lexical/semantic scopes
+- `bind.rs`: resolves references, associates symbols with types, and records graph-relevant relationships
+- `infer.rs`: infers local expression/type symbols needed by binding
+- `pattern.rs`: propagates known types through Rust binding patterns
+
+## Conventions
+
+- Keep Rust-specific syntax decisions in this crate, not in `llmcc-core` or `llmcc-resolver`.
+- Prefer collection-time publication of global symbols; binding may run per unit in parallel.
+- Avoid panics for recoverable HIR shape drift. Skip or warn when a tree-sitter node is not shaped as expected.
+- Add token-map entries before implementing visitors for new Rust syntax.
+
+## Development
 
 ```bash
-# Run all tests for this crate
 cargo test -p llmcc-rust
+cargo clippy -p llmcc-rust --all-targets -- -D warnings
 ```
-
-### Adding New Features
-1. **New Syntax**: Update `src/token.rs` (or the build script) if new token types are needed.
-2. **New Declarations**: Update `CollectorVisitor` in `src/collect.rs` to register new symbol kinds.
-3. **New Resolution Logic**: Update `src/bind/` modules to handle new scoping rules or reference types.
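
The new README's convention of skipping or warning on unexpected node shapes, rather than panicking, might look roughly like the sketch below. Everything here is an assumption made for illustration: `SketchNode`, the `"function_item"` kind string, and the `warnings` sink are not part of llmcc.

```rust
// Hypothetical node type; the real visitors operate on llmcc HIR nodes.
struct SketchNode {
    kind: &'static str,
    name: Option<String>,
}

/// Collects function names, recording a warning for malformed nodes
/// instead of unwrapping and panicking.
fn collect_function_names(nodes: &[SketchNode], warnings: &mut Vec<String>) -> Vec<String> {
    let mut names = Vec::new();
    for node in nodes {
        if node.kind != "function_item" {
            continue;
        }
        // Shape drift: a function node without a name is skipped, not fatal.
        let Some(name) = node.name.as_deref() else {
            warnings.push("function_item without a name; skipping".to_string());
            continue;
        };
        names.push(name.to_string());
    }
    names
}
```

The `let ... else` form keeps the recoverable case visible at the point where the shape is checked, so one malformed node does not abort the pass for the whole unit.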

crates/llmcc-rust/build.rs

Lines changed: 6 additions & 6 deletions
@@ -2,8 +2,8 @@ use std::env;
 use std::fs;
 use std::path::PathBuf;
 
-fn main() {
-    let manifest_dir = PathBuf::from(env::var("CARGO_MANIFEST_DIR").unwrap());
+fn main() -> Result<(), Box<dyn std::error::Error>> {
+    let manifest_dir = PathBuf::from(env::var("CARGO_MANIFEST_DIR")?);
     let config_path = manifest_dir.join("./src/token_map.toml");
 
     println!("cargo:rerun-if-changed={}", config_path.display());
@@ -14,10 +14,10 @@ fn main() {
         tree_sitter_rust::LANGUAGE.into(),
         tree_sitter_rust::NODE_TYPES,
         &config_path,
-    )
-    .unwrap();
+    )?;
 
-    let out_dir = PathBuf::from(env::var("OUT_DIR").unwrap());
+    let out_dir = PathBuf::from(env::var("OUT_DIR")?);
     let out_file = out_dir.join("rust_tokens.rs");
-    fs::write(&out_file, contents).unwrap();
+    fs::write(&out_file, contents)?;
+    Ok(())
 }
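
The `build.rs` change swaps `unwrap()` for `?` in a `main` that returns `Result`. A self-contained sketch of the same error-propagation style is shown below. The helper names (`dir_from_env`, `write_generated`, `run`) and the output file name are hypothetical and do not appear in the commit.

```rust
use std::env;
use std::error::Error;
use std::fs;
use std::path::{Path, PathBuf};

/// Reads a directory path from an environment variable.
/// `env::VarError` converts into `Box<dyn Error>` through `?`.
fn dir_from_env(var: &str) -> Result<PathBuf, Box<dyn Error>> {
    Ok(PathBuf::from(env::var(var)?))
}

/// Writes generated contents into `out_dir` and returns the file path.
/// `io::Error` converts into `Box<dyn Error>` the same way.
fn write_generated(
    out_dir: &Path,
    file_name: &str,
    contents: &str,
) -> Result<PathBuf, Box<dyn Error>> {
    let out_file = out_dir.join(file_name);
    fs::write(&out_file, contents)?;
    Ok(out_file)
}

/// Mirrors the shape of a build script body: any failure surfaces as `Err`
/// rather than a panic.
fn run(dir_var: &str) -> Result<(), Box<dyn Error>> {
    let out_dir = dir_from_env(dir_var)?;
    write_generated(&out_dir, "sketch_tokens.rs", "// generated\n")?;
    Ok(())
}
```

When `main` itself has this return type, an `Err` is printed with its `Debug` representation and the process exits with a non-zero status, which Cargo reports as a build script failure.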
