Add additional alignment strategy parameters#92
Conversation
|
Thanks @fethalen! Is this one ready for a review? |
Yes, it would be great if someone could have a look at it! And for the person who does, please consider my question whether we should implement our own logic to handle mutually exclusive parameters or if we should just let MAFFT do its thing. |
|
Sounds great, thanks @fethalen! We'll get someone to take a look at this either this week or next 🙂 |
|
Hi @lizgehret, I'm going on a Christmas break soon, and it would be great if someone could have a look at this PR. |
|
Hey @fethalen, sorry for the delay on this- I'm starting my OOO in preparation for the big move, but I just pinged the rest of the team to see if anyone has bandwidth either today or Monday! |
cherman2
left a comment
There was a problem hiding this comment.
Hi @fethalen,
Thanks for your work on this!
Our major concern is that 'localpair', 'globalpair', 'auto', and 'genafpair' seem like mutually exclusive algorithms. I am pretty sure that MAFFT allows you to pass in all the parameters, but under the hood, it selects the last parameter passed in and uses that.
I see two ways to approach this, so our users understand they can't pass in 2 different algorithms into MAFFT.
-
Instead of seperate 'localpair', 'globalpair', 'auto', and 'genafpair' params, we could have a "alignment-strategy" param with choices of 'localpair', 'globalpair', 'auto', and 'genafpair'. This would make it obvious to users that these are mutually exclusive selections, but diverges from MAFFTS syntax. I think this is our preference because it seems kinda fool-proof
-
We can keep these parameters separate, but we should error if they are passed in together. I think this is a more complex approach, but it does mimic MAFTT's syntax. If you feel really strongly that we should keep MAFFT's syntax, I think this is a fine solution.
Let me know what you think!
Hi @cherman2, and thank you for looking into this PR! I agree that adding an additional parameter with all of the alignment strategies is the best solution. I've now implemented that under the parameter |
|
Hi @fethalen, I will review this today. Thank you! |
| valid_strategies_str = ", ".join( | ||
| [s for s in valid_strategies]) | ||
|
|
||
| if strategy == "auto": | ||
| return ["--auto"] | ||
| if strategy == "fftns": | ||
| return [] | ||
| if strategy == "nofft": | ||
| return ["--nofft"] | ||
| if strategy == "localpair": | ||
| return ["--localpair"] | ||
| if strategy == "globalpair": | ||
| return ["--globalpair"] | ||
| if strategy == "genafpair": | ||
| return ["--genafpair"] | ||
| if strategy is None: | ||
| return [] | ||
| else: | ||
| raise ValueError( | ||
| f"Invalid alignment strategy '{strategy}'. " |
There was a problem hiding this comment.
We could avoid this code if we used the 'Choices' predicate. Docs are here https://develop.qiime2.org/en/stable/plugins/references/api/types.html#qiime2.plugin.Choices. That way, we can let the framework validate and yell at the user lol.
Then we can prepend a '--' to the strategy param. Ex: 'auto' to '--auto'
There was a problem hiding this comment.
We could avoid this code if we used the 'Choices' predicate. Docs are here https://develop.qiime2.org/en/stable/plugins/references/api/types.html#qiime2.plugin.Choices. That way, we can let the framework validate and yell at the user lol.
Then we can prepend a '--' to the strategy param. Ex: 'auto' to '--auto'
Thanks for the suggestion. I've removed the parsing function and replaced it with Choices. The only thing that is different now is that fftns is no longer an option. This simplifies the code a bit and it is also more consistent with how MAFFT works.
| if strategy: | ||
| strategy_flag = _validate_alignment_strategy(strategy) | ||
| cmd += strategy_flag |
There was a problem hiding this comment.
Since Choices will handle the validation, we can probably get rid of the _validate_alignment_strategy function and prepend here instead.
| "maxiterate": Int, | ||
| "retree": Int, |
There was a problem hiding this comment.
@fethalen,
We were so focused on the alignment strategy that we didn't notice that the same principles could be applied here.
Let's use the. 'Range' predicate here because MAFFT doesn't accept Negative numbers, and 'Range' will safeguard against that. https://develop.qiime2.org/en/latest/plugins/references/api/types.html#qiime2.plugin.Range
|
|
||
| # --maxiterate is set to 0 by default, so we only pass this argument onto | ||
| # MAFFT if it deviates from this value. | ||
| if maxiterate not in (None, 0): | ||
| cmd += ['--maxiterate', str(maxiterate)] | ||
|
|
||
| # --retree is set to 2 by default, so we only pass this argument onto | ||
| # MAFFT if it deviates from this value. | ||
| if retree not in (None, 2): | ||
| cmd += ['--retree', str(retree)] | ||
|
|
There was a problem hiding this comment.
Let's pass in the defaults to MAFFT.
I am worried about an edge case here where MAFFT changes its defaults, and we don't notice. Then our code would be misleading and say the default is 2 or 0, but it would really be using whatever MAFFT changed its default to.
Kind of a long shot because MAFFT feels pretty stable, but this way we always control what the default is.
|
Hi @cherman2, great suggestion. I've limited |
cherman2
left a comment
There was a problem hiding this comment.
Looks Great To Me! Let's Get This Merged!
This PR addresses issue #91 by exposing the following set of additional alignment strategy parameters:
globalpairlocalpairgenafpairmaxiterateretreenofftautoThese flags allows the user to choose between different alignment algorithms implemented in MAFFT. There is also the
--autoflag, which automatically decides on the most appropriate algorithm depending on the input data size.globalpair,localpair, andgenafpairare mutually exclusive but the default behavior of MAFFT is to just pick one based on its own internal logic. I've chosen to retain this behavior in the q2-alignment implementation, but a question I have is whether providing more than one of these flags should raise a warning or an error.