Google Probem Writeup
Physics Today Ad for a Job at Google ("the Google FAX problem")
August 27, 2004; posted here 110805
Solution by T.D. Gutierrez.
I never did apply for this job, but had fun doing the problem over a weekend back in August of 2004. Ironically, I solved the problem in a few hours but spent the rest of the time hunting down a bug which turned out to be spelling error in my word-to-number lookup table! Ouch. Silly, silly me. Anyway, I'm only now (Nov 2005) getting around to posting my solution.There are certainly better solutions out there. I like my solution (I keep calling it "my" solution, but it is clearly just "a" solution) for a few reasons discussed below. If it doesn't make sense to you then I'm sure there are other variations out there that do. Just Google it (which is probably how you found this page in the first place)!
For example, a really great writup by Patrick Mock (that independently used a similar method as me, but doesn't involve lots of personal ranting and asides like I'm doing now) can be found here. Also, some nice clean code by Samuel Santiago (is it python?? Sorry, I'm still living in the 90's, man) outlining and solving the problem can be found here.
Helpful code
A Mathematica notebook to do the calculation is provided below. Details of how to use it are poorly noted in the notebook as comments. I used Mathematica, but in no way do I believe this was the best code forum for this project. It was the first thing I grabbed and I just ran with it.
Be warned of a couple things. There are spoilers in the comments of the
notebook. My answer is at the bottom of this page and in the header
of the notebook.
Also, while I think my core solution is "reasonably" elegant, my code is
not. The notebook is not well written and makes the problem look more
complex than it really is. The notebook doesn't really explain how to use
itself (when you see the damn thing, you'll see why I am a physicist
and could never be a programmer at Google!), but if you
know Mathematica, you can probably muddle your way through
it and get it to work. There are some fun plots of some of the functions
at the bottom of the html version. But the html conversion truncated at a
fixed tab, so a bunch of information is missing. Crap. The text
output version
seems to show promise initially, but then starts adding lots of spacing
commands, making the output unreadable. Good luck trying to sort it all
out. Sorry about the mess. Look for my summary at the bottom of the
page.
Note on number standards: I adopted the standard that "and" would not be
used in the name of
the number. For example, 1234 is "one thousand thirty four" not "one
thousand and thirty four". I don't use "hundred" to describe numbers
above 900 (however, I did try it for fun at one point). Nor did I
count hyphens. For example, 1450
and 4576 are "one thousand four hundred fifty" (27 letters) and "four
thousand five hundred seventy
six" (33 letters) not
"fourteen hundred fifty" (20 letters) and not "forty five hundred seventy
six"
(26 letters)
[googleFinA.nb][googleFinA.txt][googleFinA.html]
Given:
Find:
Solution outline:
Sounds pretty convoluted, but let's do an example:
Let's say you are given f(x)=71 and you want to find x. In the case a=71. The test value of xh=3E^3-71.
Now, iterate over values of E and calculate xh. Then, with a given value of xh, calculate E(xh) and compare to a.
E=1; xh=-68; E(-68)=something nasty (x will probably be greater than zero anyway) is E(-68)=71? No.
E=2; xh=-47; E(-47)=same as above; is E(-47)=71? No.
E=3; xh=10; E(10)=71; is E(10)=71? Yes! We have inverted f(x)=71 and got x=10.
The advantage to this method is you cover a lot of ground quickly by iterating over E and using the constraint of a rather than doing a brute force iteration over x.
The brute force method, even in this simple case where x was only 10, would require you to do 10 calculations and a logical comparison (to a). Iterating over E only required 3 interations and a logical comparison.
The reason this method is generally faster is that the number of letters in a number grows very slowly with the value of the number. Even really giant numbers might only have 300 letters in it (more on this below). This, of course, assumes there is actually a name for the number you are interested in.
Here is another example:
If you were given f(x)=-997000, a=-997000 so the test xh=3E^3-(-997000) or just xh=3E^3+997000.
E=1; xh=997003; is E(997003)=-997000? No.
E=2; xh=997024; is E(997024)=-997000? No.
...
E=10; xh=1000000; is E(1000000)=-997000?. Yes! So the inverse of f(x)=-997000 is x=1000000. After only 10 iterations, you have found an answer. There could be degeneracy (multiple x answers for a given value of a). But this method would quick find them. If you were willing to iterate up to E=300 you would cover all answers for x less than about 10^27 (assuming your database had a number-to-name lookup table that went that high).
Issues and tricks:
To help really set the scale with the size of the names and the size of
the number (as discussed above), the number with the longest name in
English is tied between:
777, 777, 777, 777, 777, 777, 777, 777, 777, 777, 777, 777, 777, 777, 777,
777, 777, 777, 777, 777, 777
and
888, 888, 888, 888, 888, 888, 888, 888, 888, 888, 888, 888, 888, 888, 888,
888, 888, 888, 888, 888, 888
and the rest of the 2^65-2 permutations of 7s and 8s. Seven and eight
have the most letters for numbers less than ten. The second number
above is same number with sevens replaced by eights. The name for
the first number is
Seven hundred and seventy seven vigintillion
seven hundred and seventy seven novemdecillion
seven hundred and seventy seven ocodecillion
seven hundred and seventy seven septendecillion
seven hundred and seventy seven sexdecillion
seven hundred and seventy seven quindecillion
seven hundred and seventy seven quattrodecillion
seven hundred and seventy seven tredecillion
seven hundred and seventy seven duodecillion
seven hundred and seventy seven undecillion
seven hundred and seventy seven decillion
seven hundred and seventy seven nonillion
seven hundred and seventy seven octillion
seven hundred and seventy seven septillion
seven hundred and seventy seven sextillion
seven hundred and seventy seven quintillion
seven hundred and seventy seven quadrillion
seven hundred and seventy seven trillion
seven hundred and seventy seven billion
seven hundred and seventy seven million
seven hundred and seventy seven thousand
seven hundred and seventy seven
According to my count this has 822 characters. Compare this to the 10^65
order of magnatude! By iterating E over less than a thousand numbers, you
cover all the numbers in the English language that have writable names. Of
course, there are numbers like google (10^100), centillion (10^303) and
googleplex (10^google). But since there aren't names for 10^3 steps down
from these, you can't easily give a name to a number larger than or equal
to 1000 vigintillion (you could, naturally, start saying "one thousand
vigintillion", "one quadrillion vigintllion" etc., but that convention
isn't used elsewhere -- even if you did, the scaling rule-of-thumb I
describe would just continue). You could
include negative numbers (with the word
"negative" or "minus" out front), but this would just be a perturbation.
You might try getting fancy and introducing complex numbers too (1+2i="one
plus two i"), but this will
give you only roughly a factor of two in name size