Monthly Archives: July 2009

Buying Translation – Preparing for the best results

Filed under Language Translation Advice

As a translation buyer, you can do your bit to ensure a high quality translation, and your ‘bit’ comes before you set out to find the best translator for your project. You might ask why any other preparation is needed if the best translator is already working on your project. The answer is that even the best translator can only produce the best possible translation of the source document they are given. If the source document itself is not good enough, it would not be fair to expect a high quality translation from it. Keeping that in mind, let’s move on to our first point.

Simplicity is the name of the game

The source document should be simple in every way. The text should use simple words and the active voice. Avoid colloquial or regional terms and phrases, including slang, as they are very difficult to translate, especially if the target language or culture is significantly different.

Ensure that the source document is neat, clear and adequately spaced; double spacing is recommended. This is especially useful when the target language tends to have longer words and sentences than the source language. Also remember to finalize the document before sending it out for translation: make sure that any tracked changes have been accepted and any comments resolved.

Eliminate errors

On a technical level, ensure that the source document is free from errors of any kind, including typos and grammatical mistakes. The formatting should be easy to handle. In fact, it’s preferable to send Word documents, which can be edited easily, rather than PDF files, which can’t.

Selection of typeface and font size

It might surprise some, but your selection of typeface and font size can influence document translation. For example, there are languages like Greek or Arabic among others that require particular characters and accents. Certain fonts like Times New Roman are not recommended for the web because they may not be legible on the screen.

When it comes to font sizes, text below 12pt is generally difficult to read. If your source document uses a font smaller than 10pt and your target language has longer words and phrases than the source language, the translated text may have to be shrunk further to fit the same layout and may end up too small to read.

Glossary and other reference materials

This is especially recommended for technical documents that use specific terminology or jargon. Building a glossary of such terms helps translators maintain accuracy and speed. The same goes for acronyms: explain each acronym the first time it is used.

Furthermore, if you can send in any reference materials that would make it easier for your translator to understand your source document, it would be very helpful in ensuring a high quality translation. For example, if your translation is related to the services or products provided by your company, you can send in reference materials to the translator which would provide a better overview of the products and services, as well as your company’s profile. These would help the translator turn out a translation that is consistent with the image and spirit of your company.

Provide detailed instructions

Think over your requirements in detail, and convey them to your translation service provider at the outset. You can include specific instructions about formatting, additional information or changes suited to the target language readers. In the case of web site translation, you might need to specify how you would like the various parts of your website, including Flash images, links etc., to be translated.

Free and Useful translation tools - I

Filed under Translation Tools

Anaphraseus

Anaphraseus is a Computer Aided Translation (CAT) tool that can be used for creating, using and managing bilingual Translation Memories in various languages.

Anaphraseus offers the following features -

  • Inexact (Fuzzy) search in Translation Memory
  • OpenOffice.org Extension
  • Plain text and Unicode UTF-16LE TM
  • Terminology Recognition
  • Text Segmentation
  • Unicode UTF-16LE TMX Export/Import
  • User Glossary
  • Can be readily localized in any language

MT2007

MT2007 offers the following features -

  • Machine Translation (with Google online translator)
  • 21 dictionaries, including many specialized ones
  • Translation Memory
  • Access to Multitran online dictionary from MT2007 interface (www.multitran.ru)
  • Enables working with tags
  • Creation of user segments

At present the program only supports a single source file for each project. Supported formats include MS Word 2007, MS Word 2003, TMX files and XML files.

Transolution

Transolution is a free CAT tool that supports the XLIFF standard. It includes an XLIFF editor, translation memory and filters for converting various formats to and from XLIFF.

Benefits offered by Transolution include -

  • Interactive Translation Memory
  • Works for documentation as well as software
  • Sentence segmentation
  • Platform independent
  • Tag protection
  • Open source

The XLIFF editor offered by Transolution is especially helpful when the content that is required to be translated contains placeholders and tags. It helps the translator in changing only the translatable content keeping the tags protected.

Measuring machine translation quality

Filed under Uncategorized

The importance of Measuring Translation Quality – BLEU

In this guest post, Mr. Kirti Vashee, VP of Enterprise Translation Sales at Asia Online, writes about the assessment of translation quality.

Anybody who has tried to measure translation quality will understand the difficulty of doing this in a way that has any general credibility. Developers of statistical machine translation systems, in particular have to grapple with this issue on a constant basis to understand how to evolve the state of the technology. MT developers are constantly trying new techniques to improve the technology, and need quick feedback on whether a particular strategy is working or not.

The question of quality is difficult to answer because there is no widely accepted, entirely objective way to measure the quality or accuracy of automated translation software, or of any translation for that matter. The localization industry has struggled for years to establish an objective measure of human translation quality and has yet to really succeed. Competent and objective humans are usually the surest measure of quality but, as we all know, objectivity and real rigor are hard to define.

In Statistical Machine Translation (SMT), a standardized, objective and relatively rapid means of assessing quality is a necessary part of the system development process. The oddly named BLEU (BiLingual Evaluation Understudy) is an approach developed by IBM that is widely used in the general MT arena, and especially actively by developers in the SMT community. http://domino.watson.ibm.com/library/cyberdig.nsf/1e4115aea78b6e7c85256b360066f0d4/5c651a88cb24938185256acb0055e548?OpenDocument&Highlight=0,BLEU

Why Quality is Difficult to Measure: What is a BLEU score?

Measuring translation quality is difficult because there is not an absolute way to measure how “correct” a translation is. Many “correct” answers are possible, and there can be as many “correct” answers as there are translators. The most common way to measure quality is to compare the output of automated translation to a human translation of the same document. The problem is that one human translator will translate the document significantly differently than another human translator. This inconsistency in the human reference translations leads to problems when using these human references to measure the quality of an automated translation solution. A document translated by an automated software solution may have 60% of the words overlap with one translator’s translation, and only 40% with the other translator’s translation; even though both human reference translations can be technically correct, the one with the 60% overlap with machine translation provides a higher “quality” score for the automated translation than the other translator’s translation did. Therefore, although humans are the true test of correctness, they do not provide an entirely objective and consistent measurement for quality.

The BLEU metric scores a translation on a scale of 0 to 1. The closer the score is to 1, the more overlap there is with a human reference translation and thus the better the system is taken to be. In a nutshell, the BLEU metric measures how many words overlap, giving higher scores to sequential words. For example, a string of four words in the translation that matches the human reference translation (in the same order) will have a positive impact on the BLEU score and is weighted more heavily (and scored higher) than a one or two word match. It is very unlikely that you would ever score 1, as that would mean that the compared output is exactly the same as the reference output.
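To make the description above concrete, here is a minimal sketch of a sentence-level BLEU calculation in Python. It is a simplified illustration, not the code of any particular measurement tool: it assumes whitespace tokenization, a single reference, equal weights for 1- to 4-word sequences and no smoothing, and the function names `ngrams` and `bleu` are made up for this example. Real tools score a whole test set at once and may differ in these details.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count every run of n consecutive words in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Simplified single-reference BLEU for one sentence pair (0.0 to 1.0)."""
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        # A word sequence is credited at most as often as it occurs in the reference.
        matches = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if matches == 0 or total == 0:
            return 0.0  # this sketch applies no smoothing
        log_precision_sum += math.log(matches / total)
    # Brevity penalty: output shorter than the reference is scaled down.
    brevity_penalty = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    # Geometric mean of the 1- to max_n-gram precisions, times the penalty.
    return brevity_penalty * math.exp(log_precision_sum / max_n)


reference = "the cat sat on the mat"
print(bleu("the cat sat on the mat", reference))  # identical output scores 1.0
print(bleu("the cat sat on a mat", reference))    # one changed word lowers the score
```

Because longer matching sequences feed into the 2-, 3- and 4-word precisions as well as the single-word precision, a four-word match helps the score far more than four isolated word matches.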

BLEU

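  • The scoring algorithm penalizes unnecessary repetition of high-frequency words like “the” (matches are capped at the number of times a word appears in the reference) and applies a brevity penalty to output that is shorter than the reference.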
  • Studies have shown that there is a high correlation between BLEU and human judgments of quality when properly used.
  • BLEU scores are often stated on a scale of 0 to 100 to simplify communication but should not be confused with a percentage of accuracy.
  • Even two competent human translations of the exact same material may only score in the 0.6 to 0.7 range if they use different vocabulary and phrasing.
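The point about repeated high-frequency words can be illustrated with a few lines of Python. This is only a sketch under simple assumptions (whitespace tokenization, single words only); `unigram_precision` is a hypothetical helper written for this example, not part of any tool mentioned here.

```python
from collections import Counter


def unigram_precision(candidate, reference, clip=True):
    """Share of candidate words that are found in the reference.

    With clip=True a word is credited at most as many times as it
    appears in the reference, which is how BLEU counts matches.
    """
    cand = Counter(candidate.split())
    ref = Counter(reference.split())
    total = sum(cand.values())
    if total == 0:
        return 0.0
    if clip:
        matched = sum(min(count, ref[word]) for word, count in cand.items())
    else:
        matched = sum(count for word, count in cand.items() if word in ref)
    return matched / total


reference = "the cat is on the mat"
degenerate = "the the the the the the the"
print(unigram_precision(degenerate, reference, clip=False))  # every word "matches"
print(unigram_precision(degenerate, reference, clip=True))   # only two are credited
```

Without the cap, output consisting of nothing but “the” would look perfect; with it, only the two occurrences present in the reference are counted.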

To conduct a BLEU measurement the following data is necessary:

  1. One or more human reference translations. (In the case of SMT, this should be data that has NOT been used as training data in building the system, and ideally it should be unknown to the SMT system developer. It is generally recommended that 1,000 or more sentences be used to get a meaningful measurement.)
  2. Automated translation output of the exact same source data set.
  3. A measurement utility like Language Studio Lite™ that performs the comparison and calculation for you. http://www.asiaonline.net/ToolsAndDownloads.aspx#SoftwareDownloads

As would be expected, using multiple human reference translations will always result in higher scores, as the SMT output has more human variations to match against. NIST (the National Institute of Standards & Technology) uses BLEU as an approximate measure of quality in its annual MT competitions, with four human reference sets to ensure that some of the variance in human translation is captured, allowing more accurate quality evaluations of the MT solutions being evaluated. Thus, when companies claim they have the “best” MT system, all they are really saying is that they got the highest BLEU score on a single reference set comparison. The same system could do quite poorly with a different test set, so this information should be used with some care.
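A small sketch shows why extra references raise the score. It assumes whitespace tokenization and counts single words only; `clipped_matches` is a hypothetical helper and the sentences are invented for illustration.

```python
from collections import Counter


def clipped_matches(candidate, references):
    """Count candidate words that can be matched against any of the references.

    Each word is credited up to the largest number of times it occurs
    in any single reference.
    """
    cand = Counter(candidate.split())
    max_ref = Counter()
    for reference in references:
        for word, count in Counter(reference.split()).items():
            max_ref[word] = max(max_ref[word], count)
    return sum(min(count, max_ref[word]) for word, count in cand.items())


candidate = "the committee approved the budget quickly"
ref_a = "the committee passed the budget fast"
ref_b = "the panel approved the spending plan quickly"

print(clipped_matches(candidate, [ref_a]))         # 4 of 6 words matched
print(clipped_matches(candidate, [ref_a, ref_b]))  # 6 of 6 words matched
```

The machine output did not change between the two calls; only the number of references did. This is why scores obtained with different numbers of references should not be compared directly.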


What is BLEU useful for?

SMT systems are built by “training” a computer with examples of human translations. As more human translation data is added, systems should generally get better in quality. Asia Online provides a development environment that allows users to build an SMT translation system and make many adjustments along the way. Often, new data can be added with beneficial results, but sometimes new data can have a negative effect. Thus, system developers need to be able to measure quality quickly and regularly to make sure they are in fact improving the system.

Competent and dispassionate human judgment is always the best gauge of a system’s translation quality. However, users and developers need immediate and rapid feedback on development strategies, so using human translators for every test is not an efficient solution. SMT system developers will experiment with many different approaches and data combinations to find the one that produces the best results.

During the development process, an automatic test is necessary to quickly see the impact of a development strategy. This utility will help to measure BLEU and, in time, other measures that will provide quick feedback on development strategies and the current quality of an SMT system. BLEU gives developers a way “to monitor the effect of daily changes to their systems in order to weed out bad ideas from good ideas.”

When used to evaluate the relative merit of different system building strategies, BLEU can be quite effective as it provides very quick feedback and this enables SMT developers to quickly refine and improve translation systems they are building and continue to improve quality on a long term basis.

Asia Online provides a periodically updated table showing the BLEU scores of 506 different language combinations: http://www.asiaonline.net/translation.aspx . The table is shown below, where the first column is the source language code and the first row is the target language code. This is useful, since for the most part the same amount, quality and type of core data has been used to build all the SMT systems shown in the table, and the test sets used to measure the quality are basically comparable. As you can see, the darker green combinations produce the best systems (given the same amount of data). The table also shows that English to Romance language and Romance to Romance language combinations produce the best quality systems, other things being equal.

[Image: table of BLEU scores by source and target language]

What is BLEU not useful for?

BLEU scores are always very directly related to a specific “test set” and a specific language pair. Thus, BLEU should not be used as an absolute measure of translation quality because the BLEU score can vary even for one language depending on the test and subject domain. In most cases comparing BLEU scores across different languages is meaningless unless very strict protocols have been followed.

Because of this, Asia Online always uses human translators to measure fluency and verify the accuracy of the systems. Also, most industry leaders will always vet the BLEU score readings with human assessments before production use.

In competitive comparisons it is important to carry out the comparison tests in an unbiased, scientific manner to get a true view of where you stand against competitive alternatives. Thus it is important to use the exact same test set AND the same BLEU measurement tool. The Test Set should be unknown to all the systems that are involved in the measurement. As the basic calculations used in determining the final BLEU can also vary, it is important to use the same tool when measuring several different systems.

BLEU score comparisons between two systems presented by some companies can be misleading because:

  1. companies may use different test sets and one may be simpler than the other
  2. different BLEU measurement tools are used
  3. if more human references are used to calculate the BLEU score, the scores will be higher (i.e., scoring one system with 4 human reference translations will increase the number of overlapping words versus a score calculated with 1 human reference translation)

Because of this, Asia Online recommends that you:

  1. use blind test sets (that is, test sets previously unseen by the system developers) to generate the BLEU scores
  2. use the same BLEU measurement tool
  3. adjust and normalize the scores so that a translation scored using 4 human reference translations is not compared to a translation scored with only one human reference translation.

If you are looking at BLEU scores that compare two different translation systems, you should always understand how the results were generated. Comparing systems that were tested on different test sets will be somewhat meaningless and could lead to very misleading and erroneous conclusions.

Problems with BLEU

There are several criticisms of BLEU that should also be understood if you are to use the metric effectively. BLEU only measures direct word-by-word similarity, and looks to match and measure the extent to which word clusters in two documents are identical. Accurate translations that use different words may score poorly since there is no match in the human reference. There is no understanding of paraphrases and synonyms so scores can be somewhat misleading in terms of overall accuracy. Also, nonsensical language that contains the right phrases in the wrong order can score high. E.g.

“Wander” doesn’t get partial credit for “stroll,” nor “sofa” for “couch.”

“Appeared calm when he was taken to the American plane, which will to Miami, Florida” would get the very same score as: “was being led to the calm as he was would take carry him seemed quite when taken”
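The synonym problem is easy to reproduce. The sketch below uses exact word matching on single words only, which is the basic operation BLEU builds on; `exact_unigram_overlap` is a hypothetical helper and the sentences are invented for illustration.

```python
from collections import Counter


def exact_unigram_overlap(candidate, reference):
    """Fraction of candidate words found in the reference by exact string match."""
    cand = Counter(candidate.split())
    ref = Counter(reference.split())
    total = sum(cand.values())
    if total == 0:
        return 0.0
    return sum(min(count, ref[word]) for word, count in cand.items()) / total


reference = "they sat on the sofa"
print(exact_unigram_overlap("they sat on the couch", reference))  # valid synonym
print(exact_unigram_overlap("they sat on the roof", reference))   # wrong meaning
```

Both candidates receive the same overlap, because exact matching cannot tell an acceptable synonym from a word that changes the meaning.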

These problems are further discussed in this article: http://www.theregister.co.uk/2007/05/15/google_translation/page2.html

This link is an academic critique of the BLEU that clearly points out many of the shortcomings of the metric: http://www.iccs.inf.ed.ac.uk/~miles/papers/eacl06.pdf

Despite all these shortcomings, the BLEU metric is still a very useful tool for practitioners engaged in the difficult task of creating automated translation systems that continually improve. Careful and informed use of BLEU can drive the development and evolution of systems and allow researchers to test many different hypotheses to determine whether they favorably affect the performance of a translation engine’s output.

What are best-practices in using BLEU?

  • BLEU is best used as a way to evaluate development strategies and most useful to developers engaged in the SMT system building process.
  • Take care to develop a comprehensive and “blind” set of test data (500 to 1,000+ sentences) that covers the domain of interest.
  • Remember that a system developed to translate software knowledge base material is unlikely to do well on a test set with sentences that are common in general political news. So keep your test set focused on your business purpose.
  • Use BLEU measurements frequently when adding new data to your system to understand if it is beneficial or not.
  • When measuring competitive systems ensure that you are using:
    The same test set
    The same measurement tool
  • Remember that BLEU is not useful as an absolute measure for quality as it only focuses on matching word clusters in two similar documents.

What other measures are available in addition to BLEU?

In addition to the BLEU score, there are other measures that may be useful to developers of SMT engines. These include the METEOR metric, F-scores and edit measures (how many changes must be made to a set of data to bring it to human quality), such as the word error rate (WER). The Language Studio Lite™ tool provides F-scores and WER at this time and will add more quality metrics in future.
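As an illustration of an edit measure, here is a minimal sketch of word error rate (WER) in Python: the smallest number of word insertions, deletions and substitutions needed to turn the machine output into the reference, divided by the length of the reference. It assumes whitespace tokenization, and `word_error_rate` is a name chosen for this example; it is not taken from Language Studio Lite or any other tool.

```python
def word_error_rate(hypothesis, reference):
    """Word-level edit distance between hypothesis and reference, per reference word."""
    hyp, ref = hypothesis.split(), reference.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # previous[j]: edits needed to turn the first j hypothesis words
    # into the reference words processed so far (none yet).
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            current.append(min(previous[j] + 1,      # reference word missing from output
                               current[j - 1] + 1,   # extra word in output
                               substitution))        # words differ (or match at no cost)
        previous = current
    return previous[-1] / len(ref)


reference = "the cat sat on the mat"
print(word_error_rate("the cat sat on the mat", reference))  # 0.0, nothing to fix
print(word_error_rate("the cat sat", reference))             # 0.5, three words missing
```

Unlike BLEU, a lower WER is better, and the value can exceed 1 when the output is much longer than the reference.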

Other approaches, based largely on human judgment, are also used to measure translation quality. Some that are worth mentioning include:

  • SAE-J2450, a primarily human judgment based approach that was originally used in the automobile industry.
  • ASTM F2575, a new translation quality assurance standard published by ASTM International (an ANSI-accredited standards body), which defines quality as the degree to which the characteristics of a translation fulfill the requirements of the agreed upon specifications.
  • EN 15038, a new European (CEN) standard that defines translation services, outlines the competencies a translator and reviser must have, and describes quality controls.

While these different approaches are very useful for different purposes, they are not useful to a developer of an SMT engine as they cannot provide the rapid feedback necessary to guide new experimentation and continued evolution of the translation system.

Is translation quality the most important measure?

While translation quality is a key driver of the use of automated translation technology, it is important to ask whether it is the most important measure. We understand that it is unlikely that we are going to see automated translation that is completely equivalent to competent human translation in the near or even distant future. The core objective of the Asia Online platform is to enable large amounts of high value content to be converted to other languages in as close to human quality as possible, as rapidly as possible. We are already seeing that our technology, even in its imperfect state, dramatically enhances the productivity of enterprise translation efforts.

Thus, we believe that a much more important question is: is the system good enough to boost the overall productivity of the translation process? Or: is the automatically translated output good enough to be useful to a potential user? The focus then shifts away from linguistic quality exclusively to the impact the technology has on the work process for completing very large translation projects, or to end-user usability. There is now evidence that suggests that customized Asia Online translation systems can boost the productivity of a translation operation by anywhere from 25% to 350%. This productivity boost can come in terms of throughput and speed and/or in terms of reduced overall costs to complete projects. Companies like Microsoft are servicing the technical support needs of hundreds of millions of customers with raw SMT output.

These translation systems will also enable organizations to start translating material that would never get translated if the technology were not available. The use of this technology extends the reach of translated material to new areas. More information can be shared with global customers, and the dialog with the global customer can be expanded and intensified. It is reasonable to presume that this can enhance international business initiatives in general. I believe that human driven automated translation technology is a pivotal technology that will increasingly be seen as a key requirement to facilitate business initiatives in international markets.

As we are still at a phase where SMT technology is sometimes seen as a threat by human translators, it is important to take care in measuring productivity and in defusing the negative mentality that some translators may bring to an evaluation. Several LSP studies now validate that MT can improve productivity, reduce costs and accelerate throughput. Asia Online is committed to man-machine collaboration, and we see that our best systems are clearly a product of an intensive and structured human feedback loop that enables these systems to continually improve. We expect that this will drive an expansion of the market rather than a reduction of rates and revenues for the current players in the professional translation market.


Facebook adds translation support for Connect

Filed under Translation News

Facebook, which is already available in 64 languages thanks to its user-driven translation program, is now extending the same service to Facebook Connect, which lets third-party websites offer users the option to log in with their Facebook ID.

Facebook has posted a simple code snippet on its developer blog that third-party websites can use to change the language of Facebook Connect. This option will also be available in Fan Box and Live Stream Box, the recently released Connect options.

This development comes after Google’s Friend Connect provided support for 47 languages in the week gone by. Google also revealed that nearly 5 million sites have been using its identity tool. It is expected that Facebook would be able to increase its reach substantially with this new offering.

P.S. - Don’t forget to vote for Tomedes here - http://www.lexiophiles.com/language-blog-toplist/top-100-language-blogs-2009-voting-language-professionals. Thanks!

New Urdu language translation software

Filed under Translation News

Good news for Urdu translation: the National Language Authority (NLA) of Pakistan recently released Urdu computer software for use in computing and processing. The software has been developed by NLA’s Centre of Excellence for Urdu Informatics.

The software includes Microsoft Windows and Office in the Urdu language. The NLA is also introducing a keyboard and font (Pak Nastaleeq Font) for use in writing Urdu and other languages of Pakistan. The new font is Unicode-based and built on an internationalized, unified and standardized class, which makes it suitable for all types of data processing and computing.

Furthermore, the NLA has also released machine translation software that can automatically translate from English to Urdu. This will help make all kinds of information available in Urdu and go a long way in contributing to the cause of Urdu translation.

Seminar to enhance writing skills

Filed under Translation Events

Translators looking for an opportunity to polish their writing skills have a good one coming up: a two-day seminar for translators is going to be held at the Catskill Mountain Foundation from the 21st to the 23rd of August.

The seminar will focus on improving writing skills in the target language. Every serious translator knows that writing skills play a very important role in turning out high quality translation. Excellent writing skills can go a long way in building a substantial client base of customers who value quality. Such customers frequently pay good rates for high quality work. Moreover, a high degree of job satisfaction is derived from doing quality work.

The Catskills Seminar would have some of the best people in the field guiding the participants on how to improve the quality of their work and court success. The workshop will mostly use translation examples from the French-English language pair but it would be useful for translators working in other language pairs as well. Interested? Check out their site here.

Google Wave picking up steam

Filed under Translation News

I had talked about Google’s new project Wave a few weeks back (Google Wave and Human Translation), and had also mentioned that I had high hopes for this venture. Well it seems the project is moving on with great gusto; till last week Google had 6,000 developers working on Wave, but according to latest updates the company has plans to invite another 20,000.

Developers who could not get an opportunity to work on this project are now thrilled to have got it and the Internet, specifically Twitter, is buzzing with excitement as well as tips about what’s in store.

Now why should developers have all the fun? Google is also planning to invite about 100,000 users from the 30th of September onwards through the main page of Wave, wave.google.com. So if you are the non-developer type and dying to use Wave, get yourself a Developer Sandbox account as soon as possible; otherwise you will not be among the 100,000 lucky initial users of Wave.

Society for Editors and Proofreaders’ Annual Conference

Filed under Translation Events

There is another conference in the pipeline which is meant for proofreaders and editors. However, since most professional translators perform proofreading and editing, the conference would be helpful to them as well. The event in question is the annual conference of The Society for Editors and Proofreaders which is going to be held from the 13th to 15th September 2009 at Vanbrugh College, University of York.

The conference will include seminars, presentations and lectures around the topic ‘Editing in the 21st Century’. A wide range of issues will be covered, including on-screen editing, copywriting techniques, grammar, information and advice about useful software like Adobe InDesign, and tips to help make a smooth transition to Microsoft Word 2007.

Some of the seminars would be dealing with issues related to freelancers, marketing oneself, advantages of blogging, relaxation techniques and work-life balance. Overall the conference promises to be highly informative; you can check out The Society for Editors and Proofreaders’ website for more details.

And please don’t forget to vote for us in Lexiophile’s Top 100 Language Blogs :)

The importance of bilingual police officers

Filed under Translation Musings

Just came across this piece on the Internet. It was about William Gonzalez, a bilingual police officer who works part time for the Gettysburg and Waynesboro, Pa., police departments. Gonzalez helps bridge the communication gap between residents of the area who are largely Hispanic and the police department.

According to Mark King, the Chief of the Waynesboro police department, having Gonzalez in the department helps them build a better relationship with Spanish-speaking people. Even the police department of Cumberland Township has been known to borrow Gonzalez whenever it needs help with Spanish translation.

There are police departments in the neighboring areas which do not have any Spanish speaking police officers. Chambersburg on the other hand does have a bilingual officer who works part-time, but considering the requirement the borough council is planning to recruit a full time police officer who is fluent in Spanish.

French newspaper uses machine translation with hilarious results

Filed under Translation News

La Tribune, a leading French business newspaper, has risked public ridicule by using automatic translation for its website. A sample of some weird headlines follows:

“Internet Explorer: mistrust!”
“Ryanair loan to make travel of the passengers upright”
“Assets of the continental right in management of the crisis”
“The Chinese car in ambush”

In spite of such unfortunate results, La Tribune’s editors are pretty sure that their project will be successful. According to them, it is still at an experimental stage and the translation software is being refined. A human translation expert will soon be hired to modify the text as required. To be fair, most of the articles translated into English were reasonably comprehensible though odd to read, but that may not be true for other languages. Right now the website is being translated from French into English, Italian, Spanish and German in real time; Chinese and Japanese will be added by the end of the year.

In contrast, the BBC website is translated into 30 languages with several hundred journalists working on them. The BBC does not, as of now, intend to follow La Tribune’s model for web site translation, though it would mean substantial cost savings.

La Tribune is the second highest selling business daily, and this translation project has been set in motion to make it accessible to new readers across the world. However, critics fear that the project may tarnish the image of La Tribune, which is otherwise highly respected in France.