Distributed Computing

13 April 2004, 12:23 am

Distributed Computing is merely splitting a complex process (an algorithm) into parts that can be shared by several different computers.

An Easy Example

A (not too realistic) example would be if you were asked to add up all the integers from one (1) to a thousand (1,000). You could get out a calculator or a big sheet of paper and start adding up the numbers. 1 + 2 + 3 + 4 + 5 + 6 + etc. Eventually you would ask someone to help you. Maybe get them to start adding from 900 to 1,000. 900 + 901 + 902 + 903 +904 + etc. And then they get someone else to take care of adding the numbers from 800 to 899. And so on until eventually you’ve got several dozen people adding up a range of numbers.

When each person finishes their range, they send them back to one person who keeps track of what work has been done. Once all the ranges are done, each total is added together to get the grand total of the original range. Tada! You’re finished! This process would work for any range of numbers that needs to be added.

But wait, some bright soul notices a pattern in the range of numbers. Instead of going through the trouble of adding everything you can just do a few math tricks. The formula would be (low of range + high of range) x (number in range / 2) or (1 + 1,000) x (1,000 / 2) = 1,001 times 500 = 500,500 (five hundred thousand, five hundred). This improvement (or optimization) to the algorithm will help us in the future if a similar problem occurs.

A Real World Example

The movie industry uses computer animations for movies, such as Toy Story. Instead of having just one computer working to create the thousands of graphics required for the movie, dozens of computers were used to render hundreds of graphics each.

A More Realistic Example

A Mersenne Prime Number is a special form of a prime number. A Mersenne Prime is a prime number expressed in the form 2^P-1, where P is a prime number. There are currently only 40 known Mersenne primes. A special mathematical test can be done to see if a number is a Mersenne Prime. This highly optimized test takes up to several weeks on a fast Pentium computer. This effort is called the Great Internet Mersenne Prime Search, known as GIMPS.

If you wanted to look for a Mersenne prime, you could do research in mathematical text books, or on the Internet, learn how to program, and then write your own program to run the test. Or you could join in with thousnds of people working with the GIMPS effort. You would be assigned a range of numbers to test on your computer. My computer is testing a prime number for Mersenne primeness. Is 2^3,624,139-1 a prime number?

I don’t know but I will know in about two weeks.

I don’t have to do anything, because my computer is doing all the work. The program runs in the background at the same time as all of my other programs. I can check my e-mail, surf the web, write papers for class, program, play games, and I don’t see any effect on the speed of my computer while it is testing for prime numbers.

In fact, the program will automatically even ask for numbers to test when it is completed testing my range. It knows when it is almost finished and will check with the main PrimeNet server and get the next range to test. But if it does find a prime, you get the credit for the work. Is that cool or what? You gain your place in history as the discoverer of a Mersenne Prime number!

A Non-Mathematical Example

So, mathematics doesn’t cut it for you. How about trying to win cash?

RSA Laboratories has decided to give a prize of $10,000 to whomever decrypts a secret message that they’ve put on their web page. They have even said what the first few words of the message are, and have even said what formula they have used to encrypt the message. No problem, right?

Wrong! There are only 18,446,744,073,710,000,000 different possibilities of what the key is that will decrypt the original message. Think of it like this:

I have a password on my computer so that no one can read my e-mail. But because you know that my program only allows passwords to be a letter of the alphabet, there are only 26 different passwords that I could use. If my program were improved to allow two letters, there would be 26², or 676 different possibilities. You could try those in a couple of hours probably.

So a group of programmers felt that they could search all of the possible keys for this encrypted message. They have created a non-profit research organization to break the code and get the money. The name of the organization is Distributed Computing Technologies.

There are thousands of people running the software to try and break the code. The program sits in the background and communicates over the Internet to get a range of keys to test. Eventually, every single key will be tested, and the message will be decrypted. If your computer is the one that finds the magic key, say hello to your share of the $10,000, which would be $1,000.

But it is going to take forever to test that many different possible keys!

True, the current estimate is that it will take about 25 years to solve the puzzle. But the RSA Laboratories had an earlier contest that also had a $10,000 prize. The number of keys for that contest was much smaller. When the contest first started, it was estimated that it would take 75 years to find the key. But people recruited friends to run the program. And the program was optimized nearly every month. Distributed Computing Technologies found that key in only 250 days.

The key to the problem of a long time to finish a task in distributed computing is to get more and more computers working on the problem. The programmers could spend a lot of time tweaking their programs, or every person that is involved in the effort could try to get just one more computer involved.

So what’s the Big Deal?

The big deal is that anyone with a computer can participate in either or both of these efforts. And you can help too.

If you’re interested in helping, go to one of these pages and download the software. Each of these efforts has a mailing list to offer support and discussion about the issues involved.

Still, not impressed? Look a few years into the future. More than likely all computers will be connected to the Internet. All the time. Never turned off. They might as well be doing something useful while you are at work. Or asleep. Or eating. What if Steven Spielberg wants to create super-colossal graphics for Jurassic Park XII? Instead of buying several dozen computers to render the graphics, he could buy time from you to render some of the frames on your computer.

Your computer runs a special graphic program that reads a data file from the main server. Your computer takes a couple of days to render the image. And then sends the final image back to the server. And Steve Spielberg’s server automatically transfers a couple of dollars into your checking account for every image you create. You may even get credit in the movie’s screen credits. And you didn’t have to do a thing but install some software.

Until we get to the point of getting reimbursed for helping in these efforts, we will be helping figure out the logistics of the process. Have you ever tried to coordinate over 12,000 people using over 20,000 computers to solve a math problem? And then tried to reimburse them? What we are doing today will help future programmers, administrators, and network designers.

Other Distributed Computing Tasks

There are many other distributed computing tasks that you could participate in, but they aren’t as user-friendly as GIMPS or the RSA Laboratories challenge. Most deal with number crunching, but one other unique project is SETI, the Search for Extra Terrestrial Intelligence. This will allow you to download actual radio transmisions from the mega-radio telescopes and search for messages from aliens. If this interests you, you can get more information at the SETI web page.

Filed under Technology

Comments are closed | Permalink

Journal of PlanetMike