Commit 0d8565c

Merge pull request #50 from google-ai-edge:smilingday-more-links
PiperOrigin-RevId: 918176226
2 parents 4da4d39 + 901bf84 commit 0d8565c

1 file changed

Lines changed: 27 additions & 20 deletions

README.md

@@ -6,9 +6,17 @@ including converting, quantizing, compiling, running, benchmarking and
 visualizing LiteRT (TFLite) models on various hardware (CPU / GPU / NPU) across
 platforms (desktop, mobile, or cloud).
 
-🚀 [Installation](#-installation)  | [Quick start](#-quick-start)  |  💡 [Common commands](#-common-commands)
-<br>
-📓 [Try Colab](#-try-colab) &nbsp;|&nbsp; 🌟 [Quick demos](#-quick-demos) &nbsp;|&nbsp; 🤖 [Use in coding agent](#-use-in-coding-agent)
+🚀 [Installation](#-installation) | ⚡ [Quick start](#-quick-start) | 💡
+[Common commands](#-common-commands) | 📓 [Try Colab](#-try-colab) | 🌟
+[Quick demos](#-quick-demos) | 🤖 [Use in coding agent](#-use-in-coding-agent)
+
+LiteRT CLI is built on top of [Google AI Edge](https://ai.google.dev/edge)
+stacks, including [LiteRT](https://github.com/google-ai-edge/LiteRT),
+[LiteRT-LM](https://github.com/google-ai-edge/LiteRT-LM),
+[LiteRT Torch](https://github.com/google-ai-edge/LiteRT-Torch),
+[AI Edge Quantizer](https://github.com/google-ai-edge/ai-edge-quantizer),
+[AI Edge Portal](https://ai.google.dev/edge/ai-edge-portal), and
+[Model Explorer](https://ai.google.dev/edge/model-explorer).
 
 > [!NOTE] It's still an early preview under active development, thus has limited
 > platform and feature support, plus possible bugs. We appreciate your patience
@@ -124,26 +132,25 @@ Add the LiteRT CLI skill
 into your coding agent (like [Google Antigravity](https://antigravity.google/))
 and try prompts such as:
 
-* "Download LiteRT model `litert-community/efficientnet_b1` and run it on CPU"
-* "Benchmark LiteRT model `litert-community/efficientnet_b1` on my Android
-GPU"
-* "Compile LiteRT model `litert-community/efficientnet_b1` for NPU target
-`sm8750`"
-* "Visualize LiteRT model `litert-community/efficientnet_b1`"
-* "Download the FP32 EfficientNet model `litert-community/efficientnet_b1`
-from HuggingFace. Quantize it to INT8 dynamic range (`--recipe
-dynamic_wi8_afp32`), then benchmark both the original FP32 model and the
-newly quantized INT8 model on the GPU of my connected Android device.
-Compare the average latency and report the throughput speedup."
-* "Convert the model `Qwen/Qwen1.5-0.5B-Chat` from HuggingFace Hub to LiteRT
-format, and run it locally using the prompt 'Explain edge machine learning
-in one sentence'."
-* "Download EfficientNet from huggingface repo
-`litert-community/efficientnet_b1` . Offline compile (AOT) the model for the
+* *Download LiteRT model `litert-community/efficientnet_b1` and run it on CPU*
+* *Benchmark LiteRT model `litert-community/efficientnet_b1` on my Android
+GPU*
+* *Compile LiteRT model `litert-community/efficientnet_b1` for NPU target
+`sm8750`*
+* *Visualize LiteRT model `litert-community/efficientnet_b1`*
+* *Download the FP32 model `litert-community/efficientnet_b1` , quantize it to
+INT8 dynamic range (`--recipe dynamic_wi8_afp32`), then benchmark both the
+original FP32 model and the newly quantized INT8 model on the GPU of my
+connected Android device. Compare the average latency and report the
+throughput speedup.*
+* *Convert the model `Qwen/Qwen1.5-0.5B-Chat` from HuggingFace, and run it
+locally using the prompt 'Explain edge machine learning one sentence'*
+* *Download EfficientNet from huggingface repo
+`litert-community/efficientnet_b1`, offline compile (AOT) the model for the
 `sm8750` target NPU, and output the compiled model into `./models/compiled`.
 Then, run an on-device inference and benchmark using this newly compiled AOT
 model on the connected Android device's NPU (`--npu`). Confirm that the
-graph loads directly without dynamic JIT compilation warmup latency."
+graph loads directly without dynamic JIT compilation warmup latency.*
 
 The agent will automatically install the necessary tools, including Python
 virtual environments, `litert-cli-nightly`, and all required dependencies.
