How far are we from an accurate machine language translation service?

Another question from Quora:

How far are we from an accurate machine language translation service?

As a professional translator, I find this topic greatly interesting. I agree with Steven’s assessment of what a machine should be able to do, and I think all of us in the linguistic field believe that machines will eventually be able to do it. The question remains: when?

It is amusing to go back and review science fiction in the past 60 years, which, since the advent of computers, has always believed that we are on the cusp of this breakthrough. “Within a decade” seems to be a common response, but it has been wrong every time and will continue to be for the foreseeable future. Advances like IBM’s Watson are encouraging, and show that a computer that is well trained and given copious amounts of data can decipher “what” we mean in most cases. This is half of the battle for translation.

The other half is correctly translating into a given context, and as Steven shrewdly points out, even humans cannot do it properly every time. However, we are good at putting ourselves in the user’s context and deciding to change English units to metric, fixing mailing addresses, and even explaining terms that make no sense in another country. For instance, imagine you are a rental car company in the US: how useful would it be for me to tell you that I have 26 points on my driving record in Italy? Points are good in Italy, with a maximum of 30. You often need a translator or a native to point such things out.

One thing I often tell translators is that the key to translating is not just what you know, but being able to see that you don’t know something. Idioms often make some sense when translated literally, but it takes someone well versed in a language and in their own abilities to identify a phrase that might have a second or third meaning. In these cases, translators research in specialized glossaries, search for examples in articles and search engines, etc. Computer architects will need to teach machines this same skill of double-checking their work.

An interesting departure from the typical approach to translation is Google’s online translator. In large part, it works like any other translator. However, Google is also trying to gather proper translations (from humans) for everything in every language. For instance, it recently acquired rights to the European patent catalog. Using such information, Google continually improves its translator in the hope of one day offering translations based on what it “knows” is correct. Even this has its limitations and seems a long way off. Still, it shows that a lot of computing power and human intellect are being applied to the problem. Ask Google, and its engineers might tell you they’ll be there “within a decade,” but we all know this is unlikely.

When will machines be able to translate for us? For getting the gist of something, online translators are already there. They will be much better in 10 years’ time, and perhaps good enough for many more common uses. But to do a professional-quality translation, where we truly rely on the computer: that might take a lifetime.

What languages are worth localizing your app into?

Another question from Quora:

What languages are worth localizing your app into?

Babble-on localizes iOS apps and software on a regular basis. What I’ve learned from companies doing it is that they aren’t just guessing. Here are some ways to help you decide which language(s) to localize into:

Go where your users are
For iOS App Store projects, you can look in iTunes Connect to determine where your app is gaining traction. For example, you may see a spike in downloads from Italy — that tells you that localizing into Italian is going to get you even more users, even if Italian is not the most widely spoken language on Earth.

Go where your users will be
Another example is a company that has a good idea for an app that will sell abroad. A client of ours had a beautiful app that showed tranquil background scenes and it was a hit in various countries (so he applied rule #1 and translated into the appropriate languages). However, he then had the idea that a “cherry blossom” scene would do well in Japan. He had us translate the app into Japanese. He was right — he knew users in that country would love it, and they did.

Go for the big ones (that pay)
China has a huge market and translating into Chinese is tempting, but only if you really feel users will PAY for your app there. Otherwise, you are better off going with the most popular: Spanish, French, German, and Portuguese (Brazil is a surprisingly good market). Other big localization markets are Japanese and Russian (but again, Russia, like China, may not yield paying customers).

One last factor to keep in mind is that simply translating your App Store description can get you a lot of users. It’s cheap to do and has a good result. If your app is not heavy with text, even trying multiple languages won’t cost you a lot.

The return on investment can be huge.

How do two cultures with different languages learn to communicate at first contact?

Original question from Quora:

How do two cultures with extremely divergent languages learn to communicate with each other at first contact?

My thesis work was in Latin American studies, and as such I read a lot of the “first encounter” diaries and biographies that were written. The answer is actually surprisingly easy: CHILDREN.

Columbus, Cortez, and likely many of the other explorers kidnapped young children or took them on as servants. As children learn language incredibly quickly, it was not long before they could work as interpreters for the Spanish. By the time of the second voyage to the Americas, Columbus had fluent interpreters.

Other accounts of first meetings between the Spanish and the Native Americans show similar patterns, with communication spreading outward from there. The Indian tongues could be divided into several groups, but even if you had an interpreter from only one Indian language, that person could communicate with other tribes not too far off, much the way Spaniards and Italians could (and still can) communicate even though they don’t speak exactly the same language.

When no interpreter was available, they were reduced to drawing pictures and using gestures, just as you would imagine. However, language is learned rather quickly, especially when you have a child’s brain to help you!


Google gets more translations to ponder from European Patents

Via Google in Translation Pact for European Patents – ABC News.

Google said Thursday it has reached an agreement with European patent authorities to use its online technology to translate some 50 million patents.

Mountain View, California-based Google will gain access to all the translated patents – more than 1.5 million documents and 50,000 new patents each year – which will help improve its machine translation technology. Moreover, it will also deal with the growing amount of technology-related information in Japanese, Chinese, Korean and Russian.

It’s no secret that Google’s ambition for cataloging the world’s information encompasses every language. Through many efforts, including its translator toolkit, Google has been gathering raw data on translations from professional translators. Since teaching a computer grammar and syntax logic has not brought new gains, the best approach seems to be to mimic. That is, if Google’s computers can see enough examples of proper translations done by professional translators, eventually the computer can simply cut and paste phrases and put it all together.

As a patent translator, I wouldn’t be scared by this just yet. Google’s machine translation still has a long way to go before it can truly understand us mere mortals. Take a look at the following machine translation for a shipping product in Apple’s iPhone App Store.

Don't hire Google Translate to do your App Store marketing

For those who don’t read French, it says “Now available in unemployment insurance French!”

<joke>Insert French unemployment joke here</joke>


How much does it cost to translate a book?

Pen and paper
"It's a handwritten 100,000-word book, you say? How long, you say? No money for my work right now, you say?"

I like to answer the phone for my translation business, and that leads to some repetitive questions and answers, but also excellent insight into the people who use my services. 99% of the clients who call me are great, and grateful for honest and straightforward answers. I give them a fair price and an accurate time estimate in which I can guarantee delivery on time (often it arrives earlier, but I like to under-promise and over-deliver).

Then there are the outliers: the calls every translator dreads receiving. For instance:

“Hi, I need my book translated.”

Surprising, at least to me, is the number of calls I receive asking me to translate a book. I would love to begin a book translation, especially if it is by an author I love, or a children’s book. Inevitably, however, a few more details emerge:

  1. Shockingly, the caller wrote the book himself or herself.
  2. It’s unfathomably long — at least 100,000 words if not 300,000.
  3. There is no publishing deal in place — for any language.
  4. The total budget is smaller than my monthly utility bill.
  5. The person needs it ASAP — no, wait a sec’ — make that next Monday.

I’m a bit of a writer myself, so I understand the temptation to publish and the lure of the pen (or laptop). For the same reason, I also know how long it takes to come up with, and subsequently type out, 100,000 or more words. It boggles my mind that callers often expect the translation to take less time than it would to simply retype the book, and that, when I divide the budget by the time required, it comes out to something like $0.85/hour. That’s 1/10 the minimum wage in San Francisco.

“Hi again, I’ll pay you with proceeds from my book sales.”

This follow-up call is my least favorite. The unrealistic budget and time allotment have failed to hook any translator in the sea. After a brief pause following the crushing reality of the time and space required to do the work, the author has entered again into a mind distortion field. He (or she, publishingitis affects both genders equally) has reemerged with a seemingly brilliant solution: pay with hypothetical future proceeds from the publication of a translation of a book that has never been published and may never be.

At this point, I still try to be polite and graciously decline. That doesn’t work. I’m quoted an outrageously high figure which supposedly corresponds to my potential virtually-guaranteed-not-to-be-missed-for-anything royalty figure in the not-too-distant future. It’s not that the offer isn’t flattering, I say, it’s just that I have so many other obligations.

And then another client calls on the other line (thank you! thank you! thank you!). I regretfully end the call with the budding author and talk to another potential client who restores my faith in all that is true and good.

“Hi, I need to get something notarized.”

“Well, it’s not exactly what I do, but thank you for the call. You really saved me.”

“Huh?”

Did your mother name you “At the time”?


A birth certificate in Hindi
They don't call it a name certificate for a reason.

Most blogs begin with an explanation about why the blogger began his blogging. This one will leave that subject to mystery (or privacy, perhaps). What I prefer to talk about is how difficult it is to come up with a name.

Don’t name your child

Just today I was helping someone translate a birth certificate from Hindi into English. Since Hindi isn’t a language I specialize in, or know even a single word of (is paneer Hindi?), I partner with a trusted translator in India I’ve worked with for the past several years. When he sent it back to me, the baby’s name was written as Tatsamay. That wasn’t the client’s name, and as it turns out, this is simply Hindi for “At the time” — meaning, the baby’s name had not yet been decided at the time of birth.

Imagine that: nine months to think up a name and…nothing. (It could be laziness, but likely it is just the tradition in India to leave the naming of a child until later.)

Don’t name your blog

It seems fairly auspicious to begin a blog on the same day with the wise and ancient advice that a name is better left to the future, when you know something about the baby blog being born. After all, if Indian tradition doesn’t know about blogging, I don’t know who does.

So, let’s leave it at that. Language and names are very important, too important even for a blog about language.