I'm taking a look at some neural machine translation links and have a couple of questions...


I've been hunting around for an NMT (Neural Machine Translation) toolkit that I can use with a limited corpus and that is "ready to go" in the sense that I can give it a corpus, run a pipeline, pull relationship data out, and run (probably very bad) translations.

So far, Sockeye is the only system I've found with any kind of pipeline whose instructions I can make heads or tails of. I have previously used the Moses SMT system.
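For reference, this is roughly the kind of "ready to go" pipeline I mean, sketched in Python around the commands from the Sockeye tutorial linked below. The file names (train.chr, train.en, etc.) are just placeholders, and the flag names can differ between Sockeye versions, so check the --help output before trusting this:

```python
# Rough sketch of a train/translate pipeline driven from Python, wrapping the
# Sockeye commands shown in the AWS tutorial linked below. File names such as
# train.chr / train.en are placeholders; flag names can differ between Sockeye
# versions, so check `python -m sockeye.train --help` first.
import subprocess

def train(model_dir="chr_en_model"):
    subprocess.run([
        "python", "-m", "sockeye.train",
        "--source", "train.chr",
        "--target", "train.en",
        "--validation-source", "dev.chr",
        "--validation-target", "dev.en",
        "--output", model_dir,
        "--use-cpu",  # drop this if a GPU is available
    ], check=True)

def translate(model_dir="chr_en_model"):
    subprocess.run([
        "python", "-m", "sockeye.translate",
        "--models", model_dir,
        "--input", "test.chr",
        "--output", "test.hyp.en",
        "--use-cpu",
    ], check=True)

if __name__ == "__main__":
    train()
    translate()
```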

I also saw a paper once about using some sort of neural system with badly aligned data to improve the alignments before running them through a full pipeline, but I've had no luck finding it again.

Ideally I'd like to get something like the "phrase tables" you can get out of Moses.
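I realize an NMT system won't hand me a real phrase table, but even rough word-level "relationship data" would help. Something along the lines of this Dice-coefficient co-occurrence lexicon (not what Moses does, which extracts phrases from GIZA++ word alignments; file names are again placeholders) is what I'm picturing:

```python
# Not a real Moses phrase table, just a word-level co-occurrence lexicon
# scored with the Dice coefficient over a line-aligned parallel corpus.
from collections import Counter
from itertools import product

def dice_lexicon(src_path="train.chr", tgt_path="train.en", top_n=5):
    src_counts, tgt_counts, pair_counts = Counter(), Counter(), Counter()
    with open(src_path, encoding="utf-8") as fs, open(tgt_path, encoding="utf-8") as ft:
        for s_line, t_line in zip(fs, ft):
            s_words, t_words = set(s_line.split()), set(t_line.split())
            src_counts.update(s_words)
            tgt_counts.update(t_words)
            pair_counts.update(product(s_words, t_words))
    lexicon = {}
    for (s, t), c in pair_counts.items():
        score = 2 * c / (src_counts[s] + tgt_counts[t])
        lexicon.setdefault(s, []).append((score, t))
    # keep only the top_n English candidates per Cherokee word
    return {s: sorted(cands, reverse=True)[:top_n] for s, cands in lexicon.items()}

if __name__ == "__main__":
    for chr_word, candidates in list(dice_lexicon().items())[:20]:
        print(chr_word, candidates)
```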

I also wonder how feasible it would be to feed it Cherokee sentences that each have multiple possible English translations.
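The obvious workaround I can think of is to repeat each Cherokee line once per English variant so that every (source, target) pair becomes its own training example, something like this (variants.tsv is a made-up name for a tab-separated file with the Cherokee sentence in the first column and the English variants in the rest):

```python
# Repeat each Cherokee sentence once per English variant so that every
# (source, target) pair becomes its own training example.
def expand_variants(tsv_in="variants.tsv", src_out="train.chr", tgt_out="train.en"):
    with open(tsv_in, encoding="utf-8") as fin, \
         open(src_out, "w", encoding="utf-8") as fs, \
         open(tgt_out, "w", encoding="utf-8") as ft:
        for line in fin:
            fields = line.rstrip("\n").split("\t")
            chr_sentence, english_variants = fields[0], fields[1:]
            for variant in english_variants:
                fs.write(chr_sentence + "\n")
                ft.write(variant + "\n")

if __name__ == "__main__":
    expand_variants()
```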

My corpora are very small, and since Cherokee is polysynthetic, is there a way to tell it to "keep all words" instead of requiring a minimum count of, say, 5 for each word?
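What I mean is building the vocabulary with a minimum count of 1 so nothing gets replaced by an unknown-word token; I believe Sockeye exposes this as --word-min-count, but check the --help output. The idea, sketched:

```python
# Build a vocabulary that keeps every word type (minimum count 1) instead of
# replacing rare words with <unk>, which matters when most Cherokee word
# forms appear only once in a small corpus.
from collections import Counter

def build_vocab(path="train.chr", min_count=1):
    counts = Counter()
    with open(path, encoding="utf-8") as f:
        for line in f:
            counts.update(line.split())
    kept = [word for word, count in counts.most_common() if count >= min_count]
    return {word: idx for idx, word in enumerate(kept)}

if __name__ == "__main__":
    print(len(build_vocab()), "word types kept")
```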

Can such a system be abused to "stem" words if fed words in syllable form?
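Since each syllabary character is one syllable, I'm picturing splitting words into characters with a continuation marker (the "@@" here just mimics the subword-nmt convention, it's not required) and letting a subword model merge frequent syllable sequences back into morpheme-like chunks, which is roughly the "stemming by abuse" I have in mind:

```python
# Split each Cherokee word into its syllabary characters (one character = one
# syllable), marking word-internal syllables with "@@" in the style of the
# subword-nmt convention so words can be reassembled after translation.
def syllable_split(sentence):
    out_tokens = []
    for word in sentence.split():
        chars = list(word)
        out_tokens.extend(c + "@@" for c in chars[:-1])  # word-internal syllables
        out_tokens.append(chars[-1])                     # word-final syllable
    return " ".join(out_tokens)

if __name__ == "__main__":
    print(syllable_split("ᎣᏏᏲ ᏣᎳᎩ"))  # prints: Ꭳ@@ Ꮟ@@ Ᏺ Ꮳ@@ Ꮃ@@ Ꭹ
```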

Here are the links:

https://slator.com/technology/amazon-pits-neural-machine-translation-framework-google-facebook-others/

https://aws.amazon.com/blogs/machine-learning/train-neural-machine-translation-models-with-sockeye/

https://github.com/awslabs/sockeye/tree/master/tutorials

https://www.cfbtranslations.com/my-neural-machine-translation-project-overview-over-open-source-toolkits/
