Kevin Camera
CS267 Assignment 0
Overview of the RSA Factoring Challenge
Problem
RSA Laboratories researches and develops cryptographic techniques for use by
the public and (particularly) standardization groups. Current encryption
technology is based on the concept that all information can be decrypted in a
finite amount of time -- the difference is in the amount of time it would take
to break the code. As long as the amount of processing time it takes to guess
the encryption key is longer than the amount of time the information is useful
(or if the cost of the resources required to break the code is higher than the
total value of the information), the data is logically considered secure.
In order to quantify the safety of data protected by a given encryption
technique, RSA tries to determine the amount of time in which the encryption key
could be guessed by state-of-the-art computers using best known algorithms. To
accomplish this, they created a series of challenges, such as the RSA Factoring
Challenge. A cash prize is awarded to the team that finds the prime factors of
RSA keys of varying strengths. The best known algorithm for factoring numbers is
presently the General Number Field Sieve, and therefore the contest involves
creating the fastest computing system for performing this algorithm.
Solution
Distributed Computing Technologies, Inc. (more commonly known as
distributed.net) uses a pyramid architecture of key servers which assign
untested key values to clients spread throughout the Internet. The computation
performed on different key values is completely independent of one another, so
only a list of outstanding results needs to be kept to prevent rework. A master
key server resides at the top of the pyramid, and has authority over the status
of all key values. A layer of proxy servers exists below the master key servers
which request large blocks of key values and pass small collections on to the
clients. This allows for a highly scalable architecture for adding additional
clients to the system, and provides a level of fault tolerance and load
balancing. There is also support for personal proxy servers between proxy
servers and clients. These servers are typically used by participating
organizations for traversing firewalls or collecting different flavors of
statistics.
The client application is distributed in both source and binary form. The
binary client is available for Windows, MacOS, and many flavors of UNIX (the
Windows version also includes some accessories for running as a service and
throttling screensaver priority). Each package has a small collection of core
loops, of which the fastest routine for a given system is automatically selected
during startup. This allows a pleasing level of optimization to take place for a
large number of systems without requiring Ph.D. students from hand optimizing
the source and assembly. Of course, the source is still available for
compilation, but doing so is somewhat discouraged since the validity of the
results is at risk.
The results returned by each client to a key server are not verified -- they
are assumed to be correct. This places a lot of trust in the users, which is
probably well placed since it's unlikely someone would spend the time to set up
a client just to sabotage the effort by missing the correct key. The flip side
of this policy is that users are highly discouraged from overclocking or
otherwise modifying their machines, since incorrect results could waste an
enormous amount of work.
Results
The distributed.net infrastructure has been around since 1997, when it
started as a improved replacement for an initial RC5 56-bit effort by New Media
Laboratories (genx.net). The performance achieved by the first distributed.net
server hit 100 Mkeys/s on April 17, 1997 (about 1/5th the performance of the
genx.net system at its best). At this rate, the estimated time to keyspace
exhaustion was 22.1 years. On June 25, 1997 the system hit 1 Gkey/s for the
first time, with keyspace exhaustion estimated at 2.1 years. On October 22, 1997
the RC5-56 challenge was won by distributed.net when the encryption was cracked
after a total of 221 days. At the time of completion, the system was completing
5.3 Gkeys/s and would have exhausted the rest of the keyspace in 83 days.
Work began immediately on RC5-64 at an intial rate of 475 Mkeys/s, with an
exhaustion time of 1,235 years. As you read this, the system is cranking out
almost 200 Gkeys/s and doubling its performance every 261 days (an effective
increase of 151.2 Mkeys/s per day). The keyspace will be exhausted in 581
days, thanks to the participation of 305,125 total participants. Even more
impressive is that distributed.net is also hosting three other challenges, so
one could imagine even higher key rates if their efforts were not divided.
Conclusion
The goal of the project was to prove that key strengths around 56 bits were
completely unsafe for security applications, since they could be cracked quickly
by a reasonably unorganized group of workstations throughout the world. Clearly,
they are correct -- if distributed.net hired out their services (with income
trickling down to the participants, of course), present-day secured data could
be deciphered in a couple months by the highest bidder. The use of tens of
thousands of workstations throughout the world creates an extremely powerful
parallel processing machine.
Of course, the use of such a system is only applicable in such a case where
there is no communication of data between processing nodes. If there were
data dependencies between calculations requiring inter-node communication, such scalability
would not be achieved. It's easy to imagine the server architecture being
improved to handle a large number of nodes (perhaps thousands) as long as the
computation/communication ratio was very high. Still, with the advent of
widespread high-speed internet access and lots of excess processing power, it's
easy to believe that massive use of Internet computing could match or exceed the
capability of a massively parallel (and massively expensive) supercomputer.
|