
pretrained model for crosslingual #6


@Ella77

First, thank you for a great project with data in multiple languages for persona chat.

The referenced XNLG link explains things well, but I am writing my steps down here for anyone who has difficulty with training.

I think it is somewhat confusing that the link for the cross-lingual model is the same as the one for the multi-lingual model below.
CZWin32768/XNLG#11

We provided the Pre-trained XNLG models for you to skip the XNLG pre-training process.

I wanted to build an en-ko model and skip the pre-training steps.
After some trial and error, I was able to run the fine-tuning script (run.sh).

fine-tune Xpersona on English and test on Korean (using XNLG based on XLM-R)
python xnlg-ft.py \
  --exp_name xpersona \
  --exp_id ftOnKo \
  --dump_path ./dump \
  --model_path /home/zihan/XNLG/xnlg/dump/stage2_en-ko/debug2/best-valid_en-ko_mt_bleu.pth \
  --data_path ./data/processed/XNLG \
  --optimizer adam,lr=0.00001 \
  --batch_size 1 \
  --n_epochs 4 \
  --epoch_size 3000 \
  --max_len 120 \
  --max_vocab 200000 \
  --train_layers 1,5 \
  --decode_with_vocab False \
  --n_enc_layers 10 \
  --n_dec_layers 6 \
  --ds_name xpersona \
  --train_directions en-en \
  --eval_directions ko-ko
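
The --model_path in that command is an absolute path on someone else's machine, so it most likely has to be changed to wherever the downloaded checkpoint lives locally. Here is a minimal sketch, assuming Python 3.8+, that assembles the same arguments with the checkpoint path as a parameter. The helper name build_finetune_command and the example path /path/to/best-valid_en-ko_mt_bleu.pth are hypothetical; the flags and values are copied from the command above.

```python
import shlex

# Flags copied verbatim from the run.sh command; None marks the one value
# that is filled in by the caller (--model_path).
FINETUNE_ARGS = [
    ("--exp_name", "xpersona"),
    ("--exp_id", "ftOnKo"),
    ("--dump_path", "./dump"),
    ("--model_path", None),
    ("--data_path", "./data/processed/XNLG"),
    ("--optimizer", "adam,lr=0.00001"),
    ("--batch_size", "1"),
    ("--n_epochs", "4"),
    ("--epoch_size", "3000"),
    ("--max_len", "120"),
    ("--max_vocab", "200000"),
    ("--train_layers", "1,5"),
    ("--decode_with_vocab", "False"),
    ("--n_enc_layers", "10"),
    ("--n_dec_layers", "6"),
    ("--ds_name", "xpersona"),
    ("--train_directions", "en-en"),
    ("--eval_directions", "ko-ko"),
]


def build_finetune_command(model_path, script="xnlg-ft.py"):
    """Return the argv list for the fine-tuning run (hypothetical helper)."""
    argv = ["python", script]
    for flag, value in FINETUNE_ARGS:
        # Substitute the caller's checkpoint path for the placeholder.
        argv += [flag, model_path if value is None else value]
    return argv


if __name__ == "__main__":
    # Placeholder path: replace with the location of your own checkpoint.
    cmd = build_finetune_command("/path/to/best-valid_en-ko_mt_bleu.pth")
    print(shlex.join(cmd))
```

The printed line can be pasted into a shell, or the list can be passed to subprocess.run directly.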

To do this, I had to download the XLM 17- or 100-language model from the link you provided, put the BPE codes and vocab files (*_xnli_100) in the data folder,
and then run get-data-xpersona.sh.
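
Before running get-data-xpersona.sh it can help to confirm that the BPE codes and vocab files are really in place. This is a minimal sketch of such a check, assuming the two files are named codes_xnli_100 and vocab_xnli_100 (my reading of the *_xnli_100 pattern; adjust the suffix to xnli_17 for the 17-language model) and that the data folder is crosslingual/data. The helper name missing_files is hypothetical.

```python
from pathlib import Path


def missing_files(data_dir, suffix="xnli_100", prefixes=("codes", "vocab")):
    """Return the expected file names that are absent from data_dir.

    The "<prefix>_<suffix>" naming is an assumption based on the
    *_xnli_100 pattern mentioned in this issue.
    """
    data_dir = Path(data_dir)
    expected = [f"{prefix}_{suffix}" for prefix in prefixes]
    return [name for name in expected if not (data_dir / name).is_file()]


if __name__ == "__main__":
    # Assumed location of the data folder, relative to the repository root.
    missing = missing_files("crosslingual/data")
    if missing:
        print("Missing before running get-data-xpersona.sh:", ", ".join(missing))
    else:
        print("BPE codes and vocab found.")
```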

My crosslingual/data folder looks like this, and it finally works with the training script:
[Screenshot 2021-01-25, 6:21:59 PM]
[Screenshot 2021-01-25, 6:22:04 PM]
