A low-cost low-energy approach to VQA on traffic signs problems

For VLSP 2025 MLQA-TSR, we propose a simple retrieval-based pipeline that requires no model training. Text and image features are extracted using Jina Embeddings v3, C-RADIOv2-B, and Owlv2, then stored in Qdrant for cosine similarity search. Retrieved examples directly provide legal terms for subtask 1 and serve as few-shot prompts for Llama 4 Maverick in subtask 2.
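The retrieval step described above can be illustrated with a minimal, dependency-free sketch. It is not the repository's code: the real pipeline stores embeddings in Qdrant and searches there, whereas this example uses an in-memory list, toy 3-dimensional vectors, and hypothetical field names (`vector`, `legal_terms`) and article labels purely for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, index, k=2):
    """Return the k indexed items most similar to the query vector."""
    scored = [(cosine_similarity(query, item["vector"]), item) for item in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [item for _, item in scored[:k]]


def collect_legal_terms(neighbours):
    """Union of the neighbours' legal terms, in order of first appearance."""
    seen = []
    for item in neighbours:
        for term in item["legal_terms"]:
            if term not in seen:
                seen.append(term)
    return seen


# Toy index: vectors and labels are made up (assumption, not real data).
index = [
    {"id": "ex1", "vector": [1.0, 0.0, 0.0], "legal_terms": ["article_A"]},
    {"id": "ex2", "vector": [0.9, 0.1, 0.0], "legal_terms": ["article_A", "article_B"]},
    {"id": "ex3", "vector": [0.0, 0.0, 1.0], "legal_terms": ["article_C"]},
]

neighbours = top_k([1.0, 0.0, 0.0], index, k=2)
terms = collect_legal_terms(neighbours)
print([n["id"] for n in neighbours], terms)
```

In this sketch the legal terms attached to the nearest neighbours are returned directly as the prediction, which mirrors the idea of reusing retrieved examples without training a model.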

Our method achieved a top-5 ranking (F2 = 0.54) for retrieval and a top-1 ranking (accuracy = 0.86) for question answering.
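For readers unfamiliar with the F2 metric, the standard F-beta definition with beta = 2 weights recall more heavily than precision. The sketch below shows that formula only; how the shared task aggregates scores across queries is not stated here, so treat the function as a generic illustration rather than the official scorer.

```python
def f_beta(precision, recall, beta=2.0):
    """Standard F-beta score; beta > 1 gives recall more weight than precision."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Same pair of values, swapped: high recall is rewarded more than high precision.
high_recall = f_beta(precision=0.5, recall=1.0)
high_precision = f_beta(precision=1.0, recall=0.5)
print(round(high_recall, 4), round(high_precision, 4))
```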

Setup

Conda environment

sh scripts/conda_setup.sh
conda activate mlqa-tsr
sh scripts/pip_setup.sh

Docker services

Docker Compose file for Qdrant:

name: qdrant
services:
    qdrant:
        ports:
            - 6333:6333
            - 6334:6334
        volumes:
            - ${PWD}/qdrant_storage:/qdrant/storage:z
        image: qdrant/qdrant
        restart: always
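Once the container is running, a quick reachability check can save debugging time before the indexing notebook is run. The following is a minimal sketch using only the Python standard library and Qdrant's REST endpoint `GET /collections` on port 6333 (the port mapped above). The function name `list_collections` and the default base URL are assumptions for illustration; the repository itself may connect to Qdrant differently.

```python
import json
import urllib.error
import urllib.request


def list_collections(base_url="http://localhost:6333", timeout=2.0):
    """Return the collection names reported by Qdrant, or None if unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/collections", timeout=timeout) as resp:
            payload = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, or a non-JSON response.
        return None
    return [c["name"] for c in payload.get("result", {}).get("collections", [])]


if __name__ == "__main__":
    names = list_collections()
    if names is None:
        print("Qdrant is not reachable on port 6333")
    else:
        print("Qdrant is up; collections:", names)
```

A freshly started instance is expected to report an empty list until the indexing step has created a collection.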

The vLLM setup depends on your GPU infrastructure; see the vLLM and Qdrant documentation for details.

Run

In the notebooks folder, read and then run the notebooks in this order:

1. detect_image_object.ipynb and extract_image_feature.ipynb
2. index_qdrant.ipynb
3. naive_vector_search.ipynb (subtask 1)
4. vlm_answer.ipynb (subtask 2)
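For subtask 2, retrieved examples serve as few-shot prompts for the language model. The sketch below shows one way such a prompt could be assembled from retrieved question-answer pairs. The function name, the field names (`question`, `answer`), the instruction wording, and the sample questions are all hypothetical; the actual prompt format used in vlm_answer.ipynb, including how images are passed to the model, may differ.

```python
def build_few_shot_prompt(examples, question):
    """Assemble a text prompt from retrieved examples followed by the new question."""
    lines = ["Answer the traffic-sign question using the examples as guidance."]
    for i, ex in enumerate(examples, start=1):
        lines.append(f"Example {i}")
        lines.append(f"Question: {ex['question']}")
        lines.append(f"Answer: {ex['answer']}")
    lines.append("Now answer")
    lines.append(f"Question: {question}")
    lines.append("Answer:")
    return "\n".join(lines)


# Made-up retrieved examples (assumption, not taken from the dataset).
retrieved = [
    {"question": "Is parking allowed next to this sign?", "answer": "No"},
    {"question": "Does this sign apply to motorcycles?", "answer": "Yes"},
]

prompt = build_few_shot_prompt(retrieved, "Is a left turn allowed here?")
print(prompt)
```

The resulting string would then be sent to the model server, for example a vLLM deployment of Llama 4 Maverick, together with the image when the serving setup supports it.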

The src folder contains reusable code for data processing, model inference, and database connections.

Citation

@inproceedings{anh-etal-2025-low,
    title = "A low-cost low-energy approach to {VQA} on traffic signs problems",
    author = "Anh, Vu Dinh  and
      Tran, Khiem Vinh  and
      Ha, Tran Thi",
    editor = "Mai, Luong Chi  and
      Huyen, Nguyen Thi Minh  and
      Trang, Nguyen Thi Thu",
    booktitle = "Proceedings of the 11th International Workshop on Vietnamese Language and Speech Processing",
    month = oct,
    year = "2025",
    address = "Hanoi, Vietnam",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.vlsp-1.53/",
    pages = "451--458"
}
