Google Probem Writeup
Physics Today Ad for a Job at Google ("the Google FAX problem")
August 27, 2004; posted here 110805
Solution by T.D. Gutierrez.I never did apply for this job, but had fun doing the problem over a weekend back in August of 2004. Ironically, I solved the problem in a few hours but spent the rest of the time hunting down a bug which turned out to be spelling error in my word-to-number lookup table! Ouch. Silly, silly me. Anyway, I'm only now (Nov 2005) getting around to posting my solution.
There are certainly better solutions out there. I like my solution (I keep calling it "my" solution, but it is clearly just "a" solution) for a few reasons discussed below. If it doesn't make sense to you then I'm sure there are other variations out there that do. Just Google it (which is probably how you found this page in the first place)!
For example, a really great writup by Patrick Mock (that independently used a similar method as me, but doesn't involve lots of personal ranting and asides like I'm doing now) can be found here. Also, some nice clean code by Samuel Santiago (is it python?? Sorry, I'm still living in the 90's, man) outlining and solving the problem can be found here.
A Mathematica notebook to do the calculation is provided below. Details of how to use it are poorly noted in the notebook as comments. I used Mathematica, but in no way do I believe this was the best code forum for this project. It was the first thing I grabbed and I just ran with it.
Be warned of a couple things. There are spoilers in the comments of the
notebook. My answer is at the bottom of this page and in the header
of the notebook.
Also, while I think my core solution is "reasonably" elegant, my code is not. The notebook is not well written and makes the problem look more complex than it really is. The notebook doesn't really explain how to use itself (when you see the damn thing, you'll see why I am a physicist and could never be a programmer at Google!), but if you know Mathematica, you can probably muddle your way through it and get it to work. There are some fun plots of some of the functions at the bottom of the html version. But the html conversion truncated at a fixed tab, so a bunch of information is missing. Crap. The text output version seems to show promise initially, but then starts adding lots of spacing commands, making the output unreadable. Good luck trying to sort it all out. Sorry about the mess. Look for my summary at the bottom of the page.
Note on number standards: I adopted the standard that "and" would not be used in the name of the number. For example, 1234 is "one thousand thirty four" not "one thousand and thirty four". I don't use "hundred" to describe numbers above 900 (however, I did try it for fun at one point). Nor did I count hyphens. For example, 1450 and 4576 are "one thousand four hundred fifty" (27 letters) and "four thousand five hundred seventy six" (33 letters) not "fourteen hundred fifty" (20 letters) and not "forty five hundred seventy six" (26 letters)
Sounds pretty convoluted, but let's do an example:
Let's say you are given f(x)=71 and you want to find x. In the case a=71. The test value of xh=3E^3-71.
Now, iterate over values of E and calculate xh. Then, with a given value of xh, calculate E(xh) and compare to a.
E=1; xh=-68; E(-68)=something nasty (x will probably be greater than zero anyway) is E(-68)=71? No.
E=2; xh=-47; E(-47)=same as above; is E(-47)=71? No.
E=3; xh=10; E(10)=71; is E(10)=71? Yes! We have inverted f(x)=71 and got x=10.
The advantage to this method is you cover a lot of ground quickly by iterating over E and using the constraint of a rather than doing a brute force iteration over x.
The brute force method, even in this simple case where x was only 10, would require you to do 10 calculations and a logical comparison (to a). Iterating over E only required 3 interations and a logical comparison.
The reason this method is generally faster is that the number of letters in a number grows very slowly with the value of the number. Even really giant numbers might only have 300 letters in it (more on this below). This, of course, assumes there is actually a name for the number you are interested in.
Here is another example:
If you were given f(x)=-997000, a=-997000 so the test xh=3E^3-(-997000) or just xh=3E^3+997000.
E=1; xh=997003; is E(997003)=-997000? No.
E=2; xh=997024; is E(997024)=-997000? No.
E=10; xh=1000000; is E(1000000)=-997000?. Yes! So the inverse of f(x)=-997000 is x=1000000. After only 10 iterations, you have found an answer. There could be degeneracy (multiple x answers for a given value of a). But this method would quick find them. If you were willing to iterate up to E=300 you would cover all answers for x less than about 10^27 (assuming your database had a number-to-name lookup table that went that high).
Issues and tricks:
To help really set the scale with the size of the names and the size of
the number (as discussed above), the number with the longest name in
English is tied between:
777, 777, 777, 777, 777, 777, 777, 777, 777, 777, 777, 777, 777, 777, 777, 777, 777, 777, 777, 777, 777
888, 888, 888, 888, 888, 888, 888, 888, 888, 888, 888, 888, 888, 888, 888, 888, 888, 888, 888, 888, 888
and the rest of the 2^65-2 permutations of 7s and 8s. Seven and eight have the most letters for numbers less than ten. The second number above is same number with sevens replaced by eights. The name for the first number is
Seven hundred and seventy seven vigintillion
seven hundred and seventy seven novemdecillion
seven hundred and seventy seven ocodecillion
seven hundred and seventy seven septendecillion
seven hundred and seventy seven sexdecillion
seven hundred and seventy seven quindecillion
seven hundred and seventy seven quattrodecillion
seven hundred and seventy seven tredecillion
seven hundred and seventy seven duodecillion
seven hundred and seventy seven undecillion
seven hundred and seventy seven decillion
seven hundred and seventy seven nonillion
seven hundred and seventy seven octillion
seven hundred and seventy seven septillion
seven hundred and seventy seven sextillion
seven hundred and seventy seven quintillion
seven hundred and seventy seven quadrillion
seven hundred and seventy seven trillion
seven hundred and seventy seven billion
seven hundred and seventy seven million
seven hundred and seventy seven thousand
seven hundred and seventy seven
According to my count this has 822 characters. Compare this to the 10^65 order of magnatude! By iterating E over less than a thousand numbers, you cover all the numbers in the English language that have writable names. Of course, there are numbers like google (10^100), centillion (10^303) and googleplex (10^google). But since there aren't names for 10^3 steps down from these, you can't easily give a name to a number larger than or equal to 1000 vigintillion (you could, naturally, start saying "one thousand vigintillion", "one quadrillion vigintllion" etc., but that convention isn't used elsewhere -- even if you did, the scaling rule-of-thumb I describe would just continue). You could include negative numbers (with the word "negative" or "minus" out front), but this would just be a perturbation. You might try getting fancy and introducing complex numbers too (1+2i="one plus two i"), but this will give you only roughly a factor of two in name size