How does the ultra-pure multilingual data lake of Guildhawk Aided produce high quality machine translation?
What is Guildhawk Aided?
Guildhawk Aided is an AI translation tool, designed to translate vast amounts of content faster than a human linguist.
Other tools use statistical machine translation (MT), which takes only the immediate context of a word into account (six words either side, to be exact). Unlike those tools, Guildhawk Aided uses neural MT, supported by Microsoft technology, which is loosely modelled on the neurons of the human brain. This means the tool takes the entire context of a sentence into account, resulting in better word choice, better syntax, and more natural, flowing content.
How exactly does it work?
The Guildhawk Aided service consists of a unique combination of 3 tried and trusted elements:
1. Translation Memory
Every time we translate something for you, our technology remembers key elements and builds a translation memory (TM) of your favoured terms, key terminology, translation preferences, and phrases.
Within the Guildhawk Aided process, TMs are applied to content first, to leverage as much previously translated content as possible. If we have not previously translated content for you, we can train the machine before commencing the process using previously translated and approved content provided by you in advance.
2. Microsoft AI
MT is applied to any remaining untranslated segments. Content is translated automatically by Guildhawk Aided using matches from the Microsoft AI’s sector-specific translation bases.
3. Client Reference Materials
Depending on the service level (see below), the fully translated document is then reviewed and edited by a human linguist, who checks it against style guides and glossaries to ensure that client-specific terminology and phraseology are used within the matches from Microsoft AI.
This machine-translated and human-reviewed content, once approved, is saved to your client-specific TM, ensuring that this review work does not need to be performed on these sentences and phrases should they occur again. This is how we train the machine on an ongoing basis.
Once we have reached a critical mass of content in your TM (200,000 words per language combination), we can create an AI completely specific to you, i.e. one that has been trained on your existing content and can extrapolate from it, using neural AI, to translate new terms and phrases in the manner you prefer. In this way, human involvement can decrease over time as the content produced by the machine improves and becomes more bespoke.
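As a rough illustration of the flow described above (TM first, MT for whatever remains, optional linguist review, then saving approved output back to the TM), here is a minimal Python sketch. It is not Guildhawk's implementation: the `TranslationMemory` class, the `translate_document` function, and the `machine_translate` and `review` callables are all hypothetical names, and word counting by whitespace is a simplifying assumption.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Threshold quoted in the text: approved words per language combination
# needed before a client-specific AI can be created.
CLIENT_ENGINE_THRESHOLD_WORDS = 200_000


@dataclass
class TranslationMemory:
    """Hypothetical in-memory TM: approved source segment -> target segment."""
    entries: Dict[str, str] = field(default_factory=dict)

    def lookup(self, segment: str) -> Optional[str]:
        return self.entries.get(segment)

    def store(self, source: str, target: str) -> None:
        self.entries[source] = target

    def word_count(self) -> int:
        # Assumption: words are counted on the source side, split on whitespace.
        return sum(len(source.split()) for source in self.entries)

    def ready_for_client_engine(self) -> bool:
        return self.word_count() >= CLIENT_ENGINE_THRESHOLD_WORDS


def translate_document(
    segments: List[str],
    tm: TranslationMemory,
    machine_translate: Callable[[str], str],
    review: Optional[Callable[[str, str], str]] = None,
) -> List[str]:
    """Apply the TM first, MT to untranslated segments, then optional review."""
    results = []
    for segment in segments:
        target = tm.lookup(segment)
        if target is None:
            # No TM match: fall back to machine translation.
            target = machine_translate(segment)
            if review is not None:
                # Reviewed (approved) output is saved so it is not reviewed again.
                target = review(segment, target)
                tm.store(segment, target)
        results.append(target)
    return results
```

In this sketch, passing `review=None` corresponds to an AI-only run, in which nothing new is written to the TM because no linguist has approved the output.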
Is there human input, or does the AI work independently?
As mentioned above, this depends on the Guildhawk Aided service level you choose. There are 3 Guildhawk Aided service levels:
Silver – This is our AI-only service level. Content is processed by AI (and by TMs, where these already exist or can be compiled in advance), without any linguist intervention. This is the quickest, cheapest version of the service. It is useful for large volumes with tight deadlines when the client will not publish the translation and only wants an overview of what a text is about, i.e. cases where the translation doesn’t need to be 100% accurate or client-specific to be useful.
Gold – This service level involves AI translation and linguist review. A single linguist reviews and edits the content after AI translation to ensure errors are removed and any content not coming from client-specific TMs is consistent with reference materials. In terms of quality, this is equivalent to our human translation-only option.
Platinum – This is our highest Guildhawk Aided service level. Following the same steps as the Gold service, the output is then fully reviewed by a second linguist. This produces content of publishable quality, equivalent to our full human translation and proofreading service.
We recommend human involvement in the process, especially if content is for anything other than basic information purposes. Human review is also essential to building a robust translation memory and, in turn, a reliable client-specific AI.
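One way to picture the three service levels is as the number of linguist review passes applied after AI translation: none for Silver, one for Gold, two for Platinum. The Python sketch below encodes just that; `ServiceLevel` and `apply_reviews` are hypothetical names used for illustration, not part of any Guildhawk system.

```python
from enum import Enum
from typing import Callable, Sequence


class ServiceLevel(Enum):
    """Value = number of linguist review passes after AI translation."""
    SILVER = 0    # AI (and TM) only
    GOLD = 1      # AI plus one linguist review
    PLATINUM = 2  # AI plus two linguist reviews


def apply_reviews(
    draft: str,
    level: ServiceLevel,
    reviewers: Sequence[Callable[[str], str]],
) -> str:
    """Run the AI draft through as many review passes as the level requires."""
    if len(reviewers) < level.value:
        raise ValueError(f"{level.name} needs {level.value} reviewer(s)")
    for reviewer in reviewers[: level.value]:
        draft = reviewer(draft)
    return draft
```

Each reviewer here is simply a function that takes text and returns edited text, standing in for a human linguist.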
How long does it take to train Guildhawk Aided?
This depends on how much reference material we have to train it with. Training in advance (i.e. before beginning the first translation) can take between 2 and 6 days, depending on the volume of content we are provided with.
Whether in advance of translation, or over the course of live translation projects, in order to create a fully client-specific AI (requiring minimal linguist intervention on an ongoing basis), we will need to process, approve, and feed into the system a minimum of 200,000 words per language combination.
What kind of content do we need to provide for you to train Guildhawk Aided?
In order to train the machine in advance of commencing work, we need a minimum of 200,000 words of bilingual data, i.e. existing approved translations, along with their corresponding source files. Source files should be less than 100 MB in size.
Note: In the absence of this advance bilingual data, Guildhawk Aided works on the basis of a set of sector- and language-specific terminology bases in the first instance. If you have chosen our Gold service level or above, we will then need to set preferences based on your feedback on the output, as we anticipate that a number of iterations will be needed to ensure the content is fit for purpose. The preferences you indicate are saved to the translation memory to train the machine for the next iteration. This iterative process typically needs to take place around 5 times before the output reaches the desired level.
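If you want to check a batch of material against the advance-training requirements before sending it, the two figures quoted above (at least 200,000 words of bilingual data, and source files under 100 MB) can be tested with a short script. The Python sketch below is illustrative only: the function name and the `(name, size_bytes, word_count)` tuple layout are hypothetical, and treating 1 MB as 1024 × 1024 bytes is an assumption.

```python
from typing import List, Tuple

MIN_TRAINING_WORDS = 200_000
# Assumption: 1 MB is taken as 1024 * 1024 bytes.
MAX_SOURCE_FILE_BYTES = 100 * 1024 * 1024


def check_training_corpus(files: List[Tuple[str, int, int]]) -> List[str]:
    """Return a list of problems; an empty list means the batch qualifies.

    Each entry in `files` is (name, size_bytes, word_count).
    """
    problems = []
    for name, size_bytes, _ in files:
        # Source files should be less than 100 MB.
        if size_bytes >= MAX_SOURCE_FILE_BYTES:
            problems.append(f"{name}: file is not under 100 MB")
    total_words = sum(word_count for _, _, word_count in files)
    if total_words < MIN_TRAINING_WORDS:
        problems.append(
            f"only {total_words} words supplied; {MIN_TRAINING_WORDS} needed"
        )
    return problems
```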
How quickly can you translate my content?
Guildhawk Aided Silver is very quick, processing thousands of pages within 24-48 hours, depending on format. By way of comparison, a single human linguist would generally be able to translate about 10-20 pages in this time.
What does Guildhawk Aided cost?
Costs depend on the service level. Prices for our Silver service level start from £0.002/word, while our Gold service is priced about 20% lower than standard single-linguist translation.
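To give a feel for these figures, the short Python sketch below works out a lower-bound Silver cost and an approximate Gold cost. It is an illustration, not a quote: Silver prices only "start from" £0.002/word, the Gold discount is "about" 20%, and the standard single-linguist rate passed to `gold_estimate` is a hypothetical input that you would need to supply.

```python
from decimal import Decimal

SILVER_FLOOR_GBP_PER_WORD = Decimal("0.002")  # "start from" price in the text
GOLD_DISCOUNT = Decimal("0.20")               # "about 20% lower" in the text


def silver_floor_cost(word_count: int) -> Decimal:
    """Lowest possible Silver cost in GBP for a given word count."""
    return SILVER_FLOOR_GBP_PER_WORD * word_count


def gold_estimate(word_count: int, standard_rate_per_word: Decimal) -> Decimal:
    """Approximate Gold cost, given a (hypothetical) standard per-word rate."""
    return standard_rate_per_word * (1 - GOLD_DISCOUNT) * word_count
```

For example, `silver_floor_cost(200_000)` comes to £400, the minimum cost of processing a 200,000-word volume at the Silver starting price.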
How does Guildhawk Aided integrate with our existing systems?
Guildhawk Aided is currently compatible with all of the following file types and programs.
Note: Should content need to be pushed automatically from a particular portal or platform directly to Guildhawk Aided for translation, we can create an API to enable this automated process.
|HTML documents (.html, .htm)||Portable Document Format (PDF)|
|Microsoft® Word® documents and Rich Text files (.doc, .docx, .rtf)||Java properties files (.properties)|
|Text files (.txt, .inf, .ini, .reg, etc.)||AuthorIT projects (.xml)|
|Microsoft® Excel® files (.xls, .xlsx, .xlt)||DITA documents (.dita)|
|Microsoft® PowerPoint® files (.pptx, .ppt, .pps, .pot, .potx)||Excel 2003 XML spreadsheets (.xml)|
|OpenDocument Text documents (OpenOffice.org Writer; .odt)||FreeMind mindmaps (.mm)|
|Adobe® FrameMaker® files (.mif)||Microsoft Visio files (.vsd, .vsdx)|
|Adobe® InDesign® files (.inx, .idml, .indd)||Microsoft Help Workshop files (.hhc, .hhk)|
|XML and SGML files (.xml, .sgml)||Scalable Vector Graphics drawings (.svg)|
|XLIFF files (.xlf, .xlz)||Typo3 pages (.xml)|
|.NET resource files (.resx)||JSON (.json)|
|YAML (.yaml)||WPML XLIFF files (.xliff)|
Which languages does Guildhawk Aided support?
Guildhawk Aided currently supports 61 languages:
Is my data confidential?
Guildhawk maintains UKAS-accredited ISO 27001 certification, which is focused on information security and includes processes for protecting the integrity and confidentiality of data. Any content that you share with us as part of the Guildhawk Aided offering will be handled according to these secure information management practices.