Add support for protein sequences #94

Merged
cherman2 merged 6 commits into qiime2:dev from fethalen:protein-alignments
Jan 20, 2026

@fethalen fethalen commented Nov 7, 2025

Contributor

This PR addresses issue #93 by introducing support for amino acid sequences.

  1. qiime alignment mafft now accepts either FeatureData[Sequence] or FeatureData[ProteinSequence] as input. The output is FeatureData[AlignedSequence] or FeatureData[AlignedProteinSequence], respectively.
  2. qiime alignment mafft-add works in a similar manner, but takes an alignment as an additional input. The sequence type of the alignment and of the sequences to be added must match.
  3. Additionally, qiime alignment mask now accepts either FeatureData[AlignedSequence] or FeatureData[AlignedProteinSequence] as input.
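
The input-to-output relationship for mafft, and the inputs accepted by mask, can be modelled as a small lookup table. This is only a plain-Python illustration of the list above, not the q2-alignment registration code: the names MAFFT_OUTPUT, MASK_INPUTS, mafft_output_type and mask_accepts are hypothetical, and semantic types are represented as plain strings.

```python
# Plain-Python model of the behaviour described in the list above.
# All names here are hypothetical; real QIIME 2 semantic types are objects,
# not strings, and are resolved by the framework rather than by a dict lookup.

MAFFT_OUTPUT = {
    "FeatureData[Sequence]": "FeatureData[AlignedSequence]",
    "FeatureData[ProteinSequence]": "FeatureData[AlignedProteinSequence]",
}

# mask accepts exactly the aligned types that mafft can produce.
MASK_INPUTS = frozenset(MAFFT_OUTPUT.values())


def mafft_output_type(input_type: str) -> str:
    """Return the aligned type that corresponds to an unaligned input type."""
    try:
        return MAFFT_OUTPUT[input_type]
    except KeyError:
        raise TypeError(f"mafft does not accept {input_type}") from None


def mask_accepts(input_type: str) -> bool:
    """Report whether mask would accept the given aligned type."""
    return input_type in MASK_INPUTS
```

In this model, a protein input always yields a protein alignment, and that alignment is in turn a valid input to mask.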

@gregcaporaso gregcaporaso moved this to Needs Review in 2026.1 ❄️ Nov 14, 2025
@gregcaporaso gregcaporaso requested review from gregcaporaso and removed request for gregcaporaso December 12, 2025 17:42
Comment thread q2_alignment/plugin_setup.py Outdated
Comment on lines +79 to +89
inputs={'alignment': (
FeatureData[AlignedSequence] |
FeatureData[AlignedProteinSequence]
),
'sequences': T_GenericSequenceInput},
parameters={'n_threads': Threads,
'parttree': Bool,
'addfragments': Bool,
'keeplength': Bool,
'large': Bool},
- outputs=[('expanded_alignment', FeatureData[AlignedSequence])],
+ outputs=[('expanded_alignment', T_GenericAlignedSequenceOutput)],

Contributor

Let's do a larger typemap here. https://develop.qiime2.org/en/latest/plugins/references/api/types.html#qiime2.plugin.TypeMap

I think this will look something like:

T_alignment, T_sequence, T_expanded_alignment = (FeatureData[AlignedSequence], FeatureData[AlignedSequence]) : FeatureData[AlignedSequence]

I think with this typemap, you can remove the _validate_sequence_pair() helper.
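
To illustrate why a type map over input pairs could make a separate validation helper unnecessary, here is a minimal plain-Python sketch. It does not import qiime2 and is not the TypeMap API itself. It only models the idea that each allowed (alignment, sequences) pair is one branch that fixes the output type. The names MAFFT_ADD_BRANCHES and resolve_expanded_alignment are hypothetical, and pairing an aligned type with its unaligned counterpart is an assumption based on the PR description rather than on the placeholder line above.

```python
from typing import Dict, Tuple

# Each key is one allowed (alignment type, sequences type) pair; the value is
# the type of expanded_alignment for that branch. The specific pairs are an
# assumption drawn from the PR description, not from the merged code.
MAFFT_ADD_BRANCHES: Dict[Tuple[str, str], str] = {
    ("FeatureData[AlignedSequence]", "FeatureData[Sequence]"):
        "FeatureData[AlignedSequence]",
    ("FeatureData[AlignedProteinSequence]", "FeatureData[ProteinSequence]"):
        "FeatureData[AlignedProteinSequence]",
}


def resolve_expanded_alignment(alignment: str, sequences: str) -> str:
    """Pick the output type for a pair of input types, or reject the pair."""
    key = (alignment, sequences)
    if key not in MAFFT_ADD_BRANCHES:
        # A mixed DNA/protein pair matches no branch, so it is rejected here
        # without any pair-specific validation logic.
        raise TypeError(f"no branch matches inputs {key}")
    return MAFFT_ADD_BRANCHES[key]
```

Because mismatched pairs simply have no branch, the matching rule lives in the table rather than in a hand-written check.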

Comment thread q2_alignment/_mafft.py Outdated
Comment thread q2_alignment/_mafft.py Outdated
Comment thread q2_alignment/_mafft.py

@cherman2 cherman2 left a comment

Contributor

Hi @fethalen,
Thanks for all your work on this. We have some ideas for how to simplify this code. Also, since we merged your alignment strategy PR, there are merge conflicts for this PR.

Let us know if you have any questions or need anything!

@cherman2 cherman2 moved this from Needs Review to In Development in 2026.1 ❄️ Jan 14, 2026
@fethalen

Contributor Author

I've redefined the dependent types as suggested and was able to remove the _validate_sequence_pair() helper function. In addition, I've simplified the way the sequence type is inferred and have removed the SequenceType enum. Many of the tests have been removed as well, since these were related either to the helper function or to the enum. I don't see a way forward that uses two directory formats as the output of the _mafft helper function.
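
The merged implementation is not shown in this thread. As a rough sketch of what enum-free inference could look like, the sequence type can be read directly from the class of the input object. The four format classes below are hypothetical stand-ins, not the q2-types formats, and aligned_format_for is an illustrative name only.

```python
# Hypothetical stand-in classes; the real plugin would use q2-types formats.
class DNAFASTA:
    pass


class ProteinFASTA:
    pass


class AlignedDNAFASTA:
    pass


class AlignedProteinFASTA:
    pass


# Maps each unaligned input class to the aligned class of the same kind.
ALIGNED_FORMAT = {
    DNAFASTA: AlignedDNAFASTA,
    ProteinFASTA: AlignedProteinFASTA,
}


def aligned_format_for(sequences: object) -> type:
    """Infer the result class from the input's class, with no separate enum."""
    for input_format, output_format in ALIGNED_FORMAT.items():
        if isinstance(sequences, input_format):
            return output_format
    raise TypeError(f"unsupported input: {type(sequences).__name__}")
```

With this shape, the helper still returns one concrete format per call, chosen at run time from the input it was given.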

@fethalen fethalen requested a review from cherman2 January 20, 2026 10:13
@cherman2

Contributor

Makes sense regarding the union not being a valid output. I didn't think about that. Your original code makes sense given that limitation.

Let's Get This Merged!

@cherman2 cherman2 merged commit 288aab1 into qiime2:dev Jan 20, 2026
4 checks passed
@colinvwood colinvwood moved this from In Development to Changelog Needed in 2026.1 ❄️ Jan 22, 2026
@Oddant1 Oddant1 moved this from Changelog Needed to Completed in 2026.1 ❄️ Jan 26, 2026

Development

Successfully merging this pull request may close these issues.

Add support for aligning protein sequences in q2-alignment using MAFFT

6 participants