Add support for protein sequences #94
  inputs={'alignment': (
              FeatureData[AlignedSequence] |
              FeatureData[AlignedProteinSequence]
          ),
          'sequences': T_GenericSequenceInput},
  parameters={'n_threads': Threads,
              'parttree': Bool,
              'addfragments': Bool,
              'keeplength': Bool,
              'large': Bool},
- outputs=[('expanded_alignment', FeatureData[AlignedSequence])],
+ outputs=[('expanded_alignment', T_GenericAlignedSequenceOutput)],
Let's do a larger typemap here. https://develop.qiime2.org/en/latest/plugins/references/api/types.html#qiime2.plugin.TypeMap
I think this will look something like:
T_alignment, T_sequence, T_expanded_alignment = (FeatureData[AlignedSequence], FeatureData[AlignedSequence]) : FeatureData[AlignedSequence]
I think with this typemap, you can remove the _validate_sequence_pair() helper.
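To illustrate the idea behind the suggestion, here is a minimal plain-Python sketch of a lookup keyed on the (alignment type, sequences type) pair. It does not use the qiime2 API: the `ALLOWED_COMBINATIONS` table and the `resolve_output_type()` function are hypothetical names made up for this example, and the type names are plain strings copied from this PR. The assumption is that only matching nucleotide/nucleotide and protein/protein pairs are valid, as the PR description states.

```python
# Illustration only: plain Python, not the qiime2.plugin.TypeMap API.
# Each allowed (alignment, sequences) input pair maps to one output type.
ALLOWED_COMBINATIONS = {
    ("FeatureData[AlignedSequence]", "FeatureData[Sequence]"):
        "FeatureData[AlignedSequence]",
    ("FeatureData[AlignedProteinSequence]", "FeatureData[ProteinSequence]"):
        "FeatureData[AlignedProteinSequence]",
}


def resolve_output_type(alignment_type: str, sequences_type: str) -> str:
    """Return the output type for a valid input pair.

    Raises TypeError when the pair is not listed, for example when a
    nucleotide alignment is combined with protein sequences.
    """
    try:
        return ALLOWED_COMBINATIONS[(alignment_type, sequences_type)]
    except KeyError:
        raise TypeError(
            f"Incompatible inputs: {alignment_type} and {sequences_type}"
        ) from None
```

Because mismatched pairs are simply absent from the table, no separate pair-validation step is needed, which is presumably why a typemap would make a helper like _validate_sequence_pair() redundant.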
cherman2
left a comment
Hi @fethalen,
Thanks for all your work on this. We have some ideas for how to simplify this code. Also, since we merged your alignment strategy PR, there are merge conflicts for this PR.
Let us know if you have any questions or need anything!
I've redefined the dependent types as suggested and was able to remove the

Makes sense regarding the union not being a valid output. I didn't think about that. Your original code makes sense given that limitation. Let's Get This Merged!
This PR addresses issue #93 by introducing support for amino acid sequences.
- qiime alignment mafft can now accept either FeatureData[Sequence] or FeatureData[ProteinSequence] as input; the resulting output is of type FeatureData[AlignedSequence] or FeatureData[AlignedProteinSequence], depending on the input type.
- qiime alignment mafft-add works in a similar manner, but takes an alignment as an additional input. The sequence type of the alignment and of the sequences to be added must match.
- qiime alignment mask can now accept either FeatureData[AlignedSequence] or FeatureData[AlignedProteinSequence] as input.