HomeProjectsPeoplePublicatons
Search:
   
 

Kevin Camera
CS267 Assignment 0
Overview of the RSA Factoring Challenge


Problem

RSA Laboratories researches and develops cryptographic techniques for use by the public and (particularly) standardization groups. Current encryption technology is based on the concept that all information can be decrypted in a finite amount of time -- the difference is in the amount of time it would take to break the code. As long as the amount of processing time it takes to guess the encryption key is longer than the amount of time the information is useful (or if the cost of the resources required to break the code is higher than the total value of the information), the data is logically considered secure.

In order to quantify the safety of data protected by a given encryption technique, RSA tries to determine the amount of time in which the encryption key could be guessed by state-of-the-art computers using best known algorithms. To accomplish this, they created a series of challenges, such as the RSA Factoring Challenge. A cash prize is awarded to the team that finds the prime factors of RSA keys of varying strengths. The best known algorithm for factoring numbers is presently the General Number Field Sieve, and therefore the contest involves creating the fastest computing system for performing this algorithm.

Solution

Distributed Computing Technologies, Inc. (more commonly known as distributed.net) uses a pyramid architecture of key servers which assign untested key values to clients spread throughout the Internet. The computation performed on different key values is completely independent of one another, so only a list of outstanding results needs to be kept to prevent rework. A master key server resides at the top of the pyramid, and has authority over the status of all key values. A layer of proxy servers exists below the master key servers which request large blocks of key values and pass small collections on to the clients. This allows for a highly scalable architecture for adding additional clients to the system, and provides a level of fault tolerance and load balancing. There is also support for personal proxy servers between proxy servers and clients. These servers are typically used by participating organizations for traversing firewalls or collecting different flavors of statistics.

The client application is distributed in both source and binary form. The binary client is available for Windows, MacOS, and many flavors of UNIX (the Windows version also includes some accessories for running as a service and throttling screensaver priority). Each package has a small collection of core loops, of which the fastest routine for a given system is automatically selected during startup. This allows a pleasing level of optimization to take place for a large number of systems without requiring Ph.D. students from hand optimizing the source and assembly. Of course, the source is still available for compilation, but doing so is somewhat discouraged since the validity of the results is at risk.

The results returned by each client to a key server are not verified -- they are assumed to be correct. This places a lot of trust in the users, which is probably well placed since it's unlikely someone would spend the time to set up a client just to sabotage the effort by missing the correct key. The flip side of this policy is that users are highly discouraged from overclocking or otherwise modifying their machines, since incorrect results could waste an enormous amount of work.

Results

The distributed.net infrastructure has been around since 1997, when it started as a improved replacement for an initial RC5 56-bit effort by New Media Laboratories (genx.net). The performance achieved by the first distributed.net server hit 100 Mkeys/s on April 17, 1997 (about 1/5th the performance of the genx.net system at its best). At this rate, the estimated time to keyspace exhaustion was 22.1 years. On June 25, 1997 the system hit 1 Gkey/s for the first time, with keyspace exhaustion estimated at 2.1 years. On October 22, 1997 the RC5-56 challenge was won by distributed.net when the encryption was cracked after a total of 221 days. At the time of completion, the system was completing 5.3 Gkeys/s and would have exhausted the rest of the keyspace in 83 days.

Work began immediately on RC5-64 at an intial rate of 475 Mkeys/s, with an exhaustion time of 1,235 years. As you read this, the system is cranking out almost 200 Gkeys/s and doubling its performance every 261 days (an effective increase of  151.2 Mkeys/s per day). The keyspace will be exhausted in 581 days, thanks to the participation of 305,125 total participants. Even more impressive is that distributed.net is also hosting three other challenges, so one could imagine even higher key rates if their efforts were not divided.

Conclusion

The goal of the project was to prove that key strengths around 56 bits were completely unsafe for security applications, since they could be cracked quickly by a reasonably unorganized group of workstations throughout the world. Clearly, they are correct -- if distributed.net hired out their services (with income trickling down to the participants, of course), present-day secured data could be deciphered in a couple months by the highest bidder. The use of tens of thousands of workstations throughout the world creates an extremely powerful parallel processing machine.

Of course, the use of such a system is only applicable in such a case where there is no communication of data between processing nodes. If there were data dependencies between calculations requiring inter-node communication, such scalability would not be achieved. It's easy to imagine the server architecture being improved to handle a large number of nodes (perhaps thousands) as long as the computation/communication ratio was very high. Still, with the advent of widespread high-speed internet access and lots of excess processing power, it's easy to believe that massive use of Internet computing could match or exceed the capability of a massively parallel (and massively expensive) supercomputer.