Toward Stable Semi-Supervised Remote Sensing Segmentation via Co-Guidance and Co-Fusion
Note: We recommend reading the IEEE Xplore paper version instead of the arXiv preprint. The arXiv paper is not the final version, while the IEEE version is revised based on reviewer feedback.
- Clone the Repository
git clone https://github.com/XavierJiezou/Co2S.git
cd Co2S
- Install Dependencies
You can either set up the environment manually or use our pre-configured environment for convenience:
- Option 1: Manual Installation
conda create -n Co2S python=3.8.20
conda activate Co2S
pip install -r requirements.txt
- Option 2: Use Pre-configured Environment
We provide a pre-configured environment (env.tar.gz) hosted on Hugging Face. Follow the instructions on that page to set up and activate the environment.
You can use the following command to download the package:
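wget -O env.tar.gz https://huggingface.co/XavierJiezou/co2s-models/resolve/main/env.tar.gz
Once env.tar.gz has been downloaded, extract it and activate the environment using the following commands:
# Create the target directory first; tar -C fails if it does not exist
mkdir -p envs
tar -xzf env.tar.gz -C envs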
source envs/bin/activate
conda-unpack
We have open-sourced all datasets used in the paper; they are hosted on Hugging Face Datasets. Please follow the instructions on the dataset page to download the data.
You can use the following script to download the specific datasets and automatically rename the folders to lowercase:
pip install -U "huggingface_hub[cli]"
# Download and rename datasets (GID, LOVEDA, MER, MSL, POTSDAM, WHDLD) to lowercase
for DATASET in GID LOVEDA MER MSL POTSDAM WHDLD; do
    echo "Downloading $DATASET..."
    hf download XavierJiezou/co2s-datasets --repo-type dataset --include "$DATASET/*" --local-dir data
    # Rename directory to lowercase (e.g., data/GID -> data/gid)
    mv "data/$DATASET" "data/$(echo $DATASET | tr '[:upper:]' '[:lower:]')"
done
After downloading, organize the dataset as follows:
Co2S
├── ...
├── data
│ ├── whdld
│ │ ├── images
│ │ ├── labels
│ ├── potsdam
│ │ ├── images
│ │ ├── labels
│ ├── loveda
│ │ ├── images
│ │ ├── labels
│ ├── gid
│ │ ├── images
│ │ ├── labels
│ ├── mer
│ │ ├── images
│ │ ├── labels
│ ├── msl
│ │ ├── images
│ │ ├── labels
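The layout above can be checked before training with a small script. This is a minimal sketch, not part of the repository: the function name `missing_dataset_dirs` is hypothetical, and it assumes only what the tree shows, namely that each dataset folder under `data` contains `images` and `labels` subfolders.

```python
from pathlib import Path

# Dataset folder names taken from the directory tree above.
DATASETS = ("whdld", "potsdam", "loveda", "gid", "mer", "msl")
# Subfolders each dataset is expected to contain, per the tree above.
SUBDIRS = ("images", "labels")


def missing_dataset_dirs(data_root="data"):
    """Return the expected dataset subdirectories that are absent under data_root.

    Hypothetical helper for illustration; it is not part of the Co2S code base.
    """
    root = Path(data_root)
    return [
        str(root / name / sub)
        for name in DATASETS
        for sub in SUBDIRS
        if not (root / name / sub).is_dir()
    ]


if __name__ == "__main__":
    missing = missing_dataset_dirs()
    if missing:
        print("Missing directories:")
        for path in missing:
            print("  " + path)
    else:
        print("Dataset layout matches the expected structure.")
```

Run it from the repository root; an empty result means every `data/<dataset>/images` and `data/<dataset>/labels` folder is present.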
We provide the pre-trained weights (including converted CLIP and DINOv3) directly on Hugging Face. You can download them to the pretrained/ directory using the following command:
# Download the 'pretrained' folder content to the local current directory
hf download XavierJiezou/co2s-models --include "pretrained/*" --local-dir .
After downloading, make sure the pre-trained backbones are arranged in the following folder structure:
Co2S
├── ...
├── pretrained
│ ├── clip2mmseg_ViT16_clip_backbone.pth
│ ├── dinov3_vitb16_pretrain_lvd1689m.pth
Once the converted backbone weights are in place, make sure the path to the weights is specified correctly in the model configuration file.
For example:
nano configs/_base_/models/dual_model_clip.py
# configs/_base_/models/dual_model_clip.py
model = dict(
    type='Co2S',
    pretrained='pretrained/clip2mmseg_ViT16_clip_backbone.pth',  # you can set weight path here
    backbone=dict(
        ...
    ),
)
Update the configs directory with your training configuration, or use one of the provided example configurations. You can customize the backbone, dataset paths, and hyperparameters in the configuration file.
Use the following command to begin training:
python experiments.py --exp EXP_ID --run RUN_ID
# e.g. EXP_ID=40; RUN_ID=0 for Co2S on WHDLD with 1/24 labels
# WHDLD exp_id == 40: splits = ['1_24', '1_16', '1_8', '1_4']
# LoveDA exp_id == 41: splits = ['1_40', '1_16', '1_8', '1_4']
# Potsdam exp_id == 42: splits = ['1_32', '1_16', '1_8', '1_4']
# GID exp_id == 43: splits = ['1_8', '1_4']
# MER exp_id == 44: splits = ['1_8', '1_4']
# MSL exp_id == 45: splits = ['1_8', '1_4']
All model weights used in the paper have been open-sourced and are available on Hugging Face Models.
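To relate the experiment IDs in the training command to the released checkpoint folders, the following minimal Python sketch may help. It rests on two assumptions that this document does not confirm: that RUN_ID is a zero-based index into the split list (consistent with the example of EXP_ID=40, RUN_ID=0 for WHDLD with 1/24 labels), and that checkpoint folders follow the pattern `exp/<DATASET>/<split>-<suffix>`, as in the example path `exp/POTSDAM/1_32-74.30/`. The names `resolve_run` and `find_checkpoint_dirs` are hypothetical helpers, not part of the repository.

```python
from pathlib import Path

# Experiment table copied from the comments in the training command.
EXPERIMENTS = {
    40: ("WHDLD", ["1_24", "1_16", "1_8", "1_4"]),
    41: ("LOVEDA", ["1_40", "1_16", "1_8", "1_4"]),
    42: ("POTSDAM", ["1_32", "1_16", "1_8", "1_4"]),
    43: ("GID", ["1_8", "1_4"]),
    44: ("MER", ["1_8", "1_4"]),
    45: ("MSL", ["1_8", "1_4"]),
}


def resolve_run(exp_id, run_id):
    """Map (EXP_ID, RUN_ID) to (dataset, split).

    Assumption: RUN_ID is a zero-based index into the split list.
    """
    dataset, splits = EXPERIMENTS[exp_id]
    if not 0 <= run_id < len(splits):
        raise ValueError(f"run_id {run_id} out of range for exp_id {exp_id}")
    return dataset, splits[run_id]


def find_checkpoint_dirs(exp_id, run_id, exp_root="exp"):
    """List folders matching <exp_root>/<DATASET>/<split>-* (assumed naming pattern)."""
    dataset, split = resolve_run(exp_id, run_id)
    candidates = Path(exp_root, dataset).glob(f"{split}-*")
    return sorted(p for p in candidates if p.is_dir())


if __name__ == "__main__":
    print(resolve_run(42, 0))
    for folder in find_checkpoint_dirs(42, 0):
        print(folder)
```

If no matching folder exists locally, `find_checkpoint_dirs` returns an empty list rather than raising an error.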
You can use the following script to download the checkpoints for the ablation studies and all dataset experiments (GID, POTSDAM, etc.) into the exp/ directory:
# Download Ablation_Experiment and dataset checkpoints (GID, LOVEDA, etc.) to the local 'exp' folder
for DIR in Ablation_Experiment GID LOVEDA MER MSL POTSDAM WHDLD; do
    echo "Downloading $DIR..."
    huggingface-cli download XavierJiezou/co2s-models --include "$DIR/*" --local-dir exp
done
Use the following command to evaluate the trained model:
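# Example values (kept out of the command itself, since a comment after a trailing backslash breaks line continuation):
#   --config:    exp/POTSDAM/1_32-74.30/config.yaml
#   --save-path: exp/POTSDAM/1_32-74.30/
#   --pred-path: POTSDAM/1_32/
python -m third_party.unimatch.eval \
    --config PATH/TO/CONFIG \
    --save-path PATH/TO/CHECKPOINT_DIR \
    --pred-path PATH/TO/PREDICTION_OUTPUT
Use the following command to run inference with the trained model: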
# Example values (kept out of the command itself, since a comment after a trailing backslash breaks line continuation):
#   --config:     exp/POTSDAM/1_32-74.30/config.yaml
#   --model:      clip (or dinov3)
#   --checkpoint: exp/POTSDAM/1_32-74.30/best_clip.pth
#   --pred-path:  splits/potsdam/pred.txt
#   --output-dir: output/POTSDAM/1_32/
python inference.py \
    --config PATH/TO/CONFIG \
    --model MODEL_NAME \
    --checkpoint PATH/TO/CHECKPOINT \
    --pred-path PATH/TO/PRED_LIST \
    --output-dir PATH/TO/OUTPUT
We have published the pre-trained models' visualization results for the various datasets on Hugging Face. If you prefer not to run the code, you can visit the repository and download the visualization results directly.
Alternatively, you can download the results directly to your local machine using the following command:
# Download the 'Visualization_results' folder from the dataset repository
huggingface-cli download XavierJiezou/co2s-datasets --repo-type dataset --include "Visualization_results/*" --local-dir .
We provide tools to visualize the 12 individual attention heads, as well as the Average and Max attention projections, for both the CLIP and DINOv3 models. This helps in understanding the global vs. local feature focus of the backbone networks.
Use the following command to generate attention maps:
# Example values (kept out of the command itself, since a comment after a trailing backslash breaks line continuation):
#   --image-path: tools/clip_dinov3_attention_map/A.png
#   --model:      clip (or dinov3)
#   --checkpoint: pretrained/clip2mmseg_ViT16_clip_backbone.pth
#   --output-dir: tools/clip_dinov3_attention_map/clip_attention/
python tools/clip_dinov3_attention_map/attention_map.py \
    --image-path PATH/TO/IMAGE \
    --model MODEL_NAME \
    --checkpoint PATH/TO/CHECKPOINT \
    --output-dir PATH/TO/OUTPUT
The results include individual head images (.png) and a complete summary table (.pdf), which are saved in the specified output directory.
The sample visualization results are as follows:
| CLIP Attention Maps | DINOv3 Attention Maps |
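For readers unfamiliar with the Average and Max projections mentioned above, here is a minimal illustrative sketch. It assumes that the projections are the element-wise mean and maximum across the per-head attention maps; the actual tool may compute them differently. It uses plain Python lists so that it is self-contained, whereas the real tool presumably operates on model tensors. The function name `project_heads` is hypothetical.

```python
def project_heads(head_maps):
    """Collapse per-head attention maps into average and max projections.

    head_maps: list of per-head maps, each a list of rows of floats,
    all with the same shape.
    Returns (avg_map, max_map), each shaped like a single head map.
    Assumption: the projections are element-wise mean and max over heads.
    """
    if not head_maps:
        raise ValueError("need at least one head map")
    num_heads = len(head_maps)
    avg_map, max_map = [], []
    # zip(*head_maps) groups the same row index from every head.
    for rows in zip(*head_maps):
        # zip(*rows) groups the same pixel position from every head.
        pixels = list(zip(*rows))
        avg_map.append([sum(p) / num_heads for p in pixels])
        max_map.append([max(p) for p in pixels])
    return avg_map, max_map


if __name__ == "__main__":
    # Two toy 2x2 "heads"; a ViT-B/16 backbone would have 12.
    heads = [
        [[0.0, 1.0], [2.0, 3.0]],
        [[4.0, 1.0], [0.0, 5.0]],
    ]
    avg_map, max_map = project_heads(heads)
    print(avg_map)  # [[2.0, 1.0], [1.0, 4.0]]
    print(max_map)  # [[4.0, 1.0], [2.0, 5.0]]
```

Under this reading, the average projection highlights regions that most heads attend to, while the max projection keeps any region that at least one head attends to strongly.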
If you use our code or models in your research, please cite with:
@ARTICLE{co2s,
author={Zhou, Yi and Zou, Xuechao and Zhang, Shun and Li, Kai and Wang, Shiying and Chen, Jingming and Lang, Congyan and Cao, Tengfei and Tao, Pin and Shi, Yuanchun},
journal={IEEE Transactions on Geoscience and Remote Sensing},
title={Toward Stable Semi-Supervised Remote Sensing Segmentation via Co-Guidance and Co-Fusion},
year={2026},
volume={},
number={},
pages={1-14}
}