PUBLIC-KEY CRYPTOGRAPHY NIST Special Publication 800-2 James Nechvatal Security Technology Group National Computer Systems Laboratory National Institute of Standards and Technology Gaithersburg, MD 20899 April 1991 PREFACE This publication presents a state-of-the-art survey of public- key cryptography circa 1988 - 1990. In doing so, it covers a number of different topics including: 1. The theory of public-key cryptography. 2. Comparisons to conventional (secret-key) cryptography. 3. A largely self-contained summary of relevant mathematics. 4. A survey of major existing public-key systems. 5. An exploration of digital signatures and hash functions. 6. A survey of public-key implementations in networks. 7. An introduction to zero-knowledge protocols and probabilistic encryption. 8. An exploration of security issues and key sizes. The treatment of public-key cryptography in this publication includes both theory and practice. Much of the existing published work, including those documents listed in the references, treats either the theory or specific systems/implementations, but not both. The viewpoint here is that the theory and practice are inseparable. Any mention of commercial products is for purposes of explanation and illustration only. Also, the selection of cryptosystems and hash functions mentioned in this publication serve only to provide examples. Such identification does not imply recommendation or endorsement by the National Institute of Standards and Technology, nor does it imply that systems or functions identified are necessarily the best available for the purpose. The focus is on issues such as criteria for systems and protocols for usage. These are presumably long-term, in contrast, to the set of existing public-key systems which is more volatile. Thus we provide information which will hopefully be of use to implementors of systems, but the frameworks we develop are versatile enough to be relevant in a variety of settings. The latter may include, for example, both electronic mail systems and electronic fund transfer systems. The core of this exposition is sections 1 to 5. Sections 1 to 3 cover the fundamentals of public-key cryptography and the related topics of hash functions and digital signatures. Extensive coverage of key management is also included, with a focus on certificate- based management. Section 4 gives some examples of public-key systems and hash functions. Section 5 gives some examples of actual or proposed implementations of public-key cryptography. The major example is the International Organization for Standardization (ISO) authentication framework. Section 6 gives a sample proposal for a local-area network implementation of public-key cryptography. It draws heavily on the work of ISO. A variety of topics are covered in the appendices, including a summary of relevant mathematics and algorithms. Also included is a brief introduction to zero-knowledge protocols, probabilistic encryption and identity-based public-key systems. In the following, letters refer to appendices; e.g. lemma G.2.1 refers to a lemma appearing in section 2 of appendix G. The author wishes to thank Dr. Ronald L. Rivest, Dr. Gustavus Simmons, and Dr. Dennis Branstad for providing many comments and suggestions, and Dr. Burton S. Kaliski Jr. for providing information on implementations of the RSA public-key system. The paper was edited by Miles Smid. This paper was supported in part by the United States Department of Computer-Aided Logistics Supports, Department of Defense. CONTENTS 1. Cryptosystems and cryptanalysis...............................1 1.1 Requirements for secrecy..................................2 1.2 Requirements for authenticity and integrity...............4 1.3 Conventional systems......................................5 1.4 Example of a conventional cipher: DES.....................5 1.5 Another conventional cipher: exponentiation...............6 1.6 Public-key cryptosystems..................................7 1.6.1 Secrecy and authenticity...........................8 1.6.2 Applicability and limitations.....................10 2. Key management...............................................12 2.1 Secret-key management....................................12 2.2 Public distribution of secret keys.......................13 2.3 Management of public components in a public-key system...15 2.3.1 Use of certificates...............................16 2.3.2 Generation and storage of component pairs.........17 2.3.3 Hardware support for key management...............18 2.4 Using public-key systems for secret key distribution.....19 2.4.1 A protocol for key exchange.......................20 2.5 Protocols for certificate-based key management...........22 2.5.1 Certificate management by a central authority.....22 2.5.2 Decentralized management..........................23 2.5.3 A phone-book approach to certificates.............24 3. Digital signatures and hash functions........................25 3.1 Public-key implementation of signatures..................27 3.1.1 Signing messages..................................27 3.1.2 The issue of nonrepudiation.......................29 3.1.3 The issue of proof of delivery....................30 3.2 Hash functions and message digests.......................31 3.2.1 Usage of hash functions...........................33 3.2.2 Relation to one-way functions.....................33 3.2.3 Weak and strong hash functions....................34 3.3 Digital signatures and certificate-based systems.........35 4. Examples of public-key systems and hash functions............37 4.1 The RSA public-key scheme................................39 4.1.1 Choice of p and q.................................41 4.1.2 Further notes on implementation...................42 4.1.3 Security of RSA...................................43 4.1.3.1 Restrictions on p and q..................43 4.1.3.2 Notes on factoring.......................44 4.1.4 Low-exponent versions of RSA......................45 4.2 Other public-key systems.................................46 4.2.1 Knapsack systems..................................47 4.2.2 The ElGamal signature scheme......................49 4.3 Examples of hash functions...............................53 4.3.1 Merkle's meta-method..............................53 4.3.2 Coppersmith's attack on Rabin-type functions......56 4.3.3 Quadratic congruential hash functions.............57 4.4 Hardware and software support............................58 4.4.1 Design considerations for RSA chips...............58 4.4.2 Proposed designs for RSA chips....................59 5. Implementations of public-key cryptography...................61 5.1 MITRENET.................................................61 5.2 ISDN.....................................................62 5.2.1 Keys..............................................62 5.2.2 Calling...........................................63 5.3 ISO Authentication Framework.............................64 5.3.1 Use of certificates...............................64 5.3.2 Certification paths...............................65 5.3.3 Expiration and revocation of certificates.........66 5.3.4 Authentication protocols..........................67 5.3.5 Further notes.....................................71 5.4 DARPA-Internet...........................................71 6. A sample proposal for a LAN implementation...................73 6.1 Integration into a network...............................73 6.2 Security threats.........................................74 6.3 Security services........................................74 6.4 Security mechanisms......................................75 6.5 Criteria for cryptosystems...............................76 6.5.1 Security..........................................77 6.5.2 Numerical criteria................................77 6.5.3 Other criteria....................................78 6.6 Criteria for hash functions..............................78 6.7 Example of a LAN security framework......................78 6.7.1 Key management....................................79 6.7.2 Component generation and storage....................79 6.7.3 Secret-key generation.............................79 6.7.4 Issuance and distribution of certificates.........80 6.7.5 Compromised or invalidated certificates...........80 6.7.6 Authentication....................................81 Appendix A. Mathematical and computational aspects..............83 A.1 Computational complexity and cryptocomplexity............83 A.2 Classical complexity theory..............................84 A.3 Public-key systems and cryptocomplexity..................84 A.4 Probabilistic algorithms.................................85 A.5 Status of some relevant problems.........................86 Appendix B. Algorithms and architectures........................89 B.1 Technology...............................................89 B.2 Computing modes..........................................90 B.3 Some relevant algorithms and implementation..............92 B.3.1 Quadratic sieve factoring algorithm................92 B.3.2 Computations in finite fields......................93 B.3.3 Other algorithms...................................94 B.4 Application-specific architectures.......................94 B.4.1 Systolic and wavefront arrays......................94 B.4.2 Proposal for a quadratic sieve machine.............95 B.4.3 Massively parallel machines........................95 Appendix C. The classical theory of computation.................97 C.1 Turing machines..........................................97 C.2 Nondeterministic Turing machines.........................98 C.3 Computational complexity.................................99 Appendix D. The theory of probabilistic computing..............101 Appendix E. Breaking knapsacks.................................103 Appendix F. Birthday attacks...................................105 Appendix G. Modular arithmetic and Galois fields...............107 G.1 The Euler Phi function..................................108 G.2 The Euler-Fermat Theorem................................108 G.3 Galois fields...........................................110 Appendix H. Euclid's algorithm.................................111 Appendix I. The Chinese Remainder Theorem......................113 Appendix J. Quadratic residues and the Jacobi symbol...........115 J.1 Quadratic residues modulo a prime.......................115 J.2 The Jacobi symbol.......................................116 J.3 Square roots modulo a prime.............................117 J.4 Quadratic residuosity modulo a prime....................118 Appendix K. Primitive roots and discrete logarithms............119 Appendix L. Primality testing..................................123 L.1 The Solovay/Strassen test...............................124 L.2 Lehman's test...........................................125 L.3 The Miller/Rabin test...................................126 Appendix M. Mathematics of RSA and other exponential systems...127 Appendix N. Quadratic residuosity modulo a composite...........129 N.1 Characterizing quadratic residues.......................129 N.2 The Jacobi symbol once more.............................130 N.3 Quadratic residuosity and factoring.....................132 N.4 Quadratic residuosity and Blum integers.................133 Appendix O. An introduction to zero-knowledge..................137 Appendix P. Alternatives to the Diffie/Hellman model...........143 P.1 Probabilistic encryption................................143 P.2 Identity-based schemes..................................145 References.....................................................147 LIST OF ILLUSTRATIONS Figure 1. Adaptive Chosen Plaintext Attack........................3 Figure 2. Using Public-Key for Secrecy and Authenticity..........10 Figure 3. The Diffie/Hellman Key Exchange........................15 Figure 4. A Protocol for Real-Time Authentication................21 Figure 5. A Protocol for Signing with Hash Function and Secrecy..28 Figure 6. Using RSA for Authenticity and Secrecy.................40 Figure 7. The ElGamal Signature Algorithm........................51 Figure 8. One-Way Authentication Protocol........................69 1. Cryptosystems and cryptanalysis. Cryptography deals with the transformation of ordinary text (plaintext) into coded form (ciphertext) by encryption, and transformation of ciphertext into plaintext by decryption. Normally these transformations are parameterized by one or more keys. The motive for encrypting text is security for transmissions over insecure channels. Three of the most important services provided by cryptosystems are secrecy, authenticity, and integrity. Secrecy refers to denial of access to information by unauthorized individuals. Authenticity refers to validating the source of a message; i.e., that it was transmitted by a properly identified sender and is not a replay of a previously transmitted message. Integrity refers to assurance that a message was not modified accidentally or deliberately in transit, by replacement, insertion or deletion. A fourth service which may be provided is nonrepudiation of origin, i.e., protection against a sender of a message later denying transmission. Variants of these services, and other services, are discussed in [ISO-87]. Classical cryptography deals mainly with the secrecy aspect. It also treats keys as secret. In the past 15 years two new trends have emerged: a. Authenticity as a consideration which rivals and sometimes exceeds secrecy in importance. b. The notion that some key material need not be secret. The first trend has arisen in connection with applications such as electronic mail systems and electronic funds transfers. In such settings an electronic equivalent of the handwritten signature may be desirable. Also, intruders into a system often gain entry by masquerading as legitimate users; cryptography presents an alternative to password systems for access control. The second trend addresses the difficulties which have traditionally accompanied the management of secret keys. This may entail the use of couriers or other costly, inefficient and not really secure methods. In contrast, if keys are public the task of key management may be substantially simplified. An ideal system might solve all of these problems concurrently, i.e., using public keys; providing secrecy; and providing authenticity. Unfortunately no single technique proposed to date has met all three criteria. Conventional systems such as DES require management of secret keys; systems using public key components may provide authenticity but are inefficient for bulk encryption of data due to low bandwidths. Fortunately, conventional and public-key systems are not mutually exclusive; in fact they can complement each other. Public- key systems can be used for signatures and also for the distribution of keys used in systems such as DES. Thus it is possible to construct hybrids of conventional and public-key systems which can meet all of the above goals: secrecy, authenticity and ease of key management. For surveys of the preceding and related topics see, e.g., ([BRAS88],[COPP87],[DENN83],[DIFF82],[DIFF79],[KLIN79],[KONH81], [LEMP79],[MASS88],[MERK82],[POPE78],[POPE79],[SALO85],[SIMM79]). More specialized discussions of public-key cryptography are given, e.g., in ([DIFF88],[LAKS83],[MERK82b]). Mathematical aspects are covered, e.g., in ([KRAN86],[PATT87]). In the following, E and D represent encryption and decryption transformations, respectively. It is always required that D(E(M)) = M. It may also be the case that E(D(M)) = M; in this event E or D can be employed for encryption. Normally D is assumed to be secret, but E may be public. In addition it may be assumed that E and D are relatively easy to compute when they are known. 1.1 Requirements for secrecy. Secrecy requires that a cryptanalyst (i.e., a would-be intruder into a cryptosystem) should not be able to determine the plaintext corresponding to given ciphertext, and should not be able to reconstruct D by examining ciphertext for known plaintext. This translates into two requirements for a cryptosystem to provide secrecy: a. A cryptanalyst should not be able to determine M from E(M); i.e., the cryptosystem should be immune to ciphertext-only attacks. b. A cryptanalyst should not be able to determine D given {E(Mi)} for any sequence of plaintexts {M1,M2,...}; i.e. the cryptosystem should be immune to known-plaintext attacks. This should remain true even when the cryptanalyst can choose {Mi} (chosen-plaintext attack), including the case in which the cryptanalyst can inspect {E(M1),...,E(Mj)} before specifying Mj+1 (adaptive chosen-plaintext attack). Adaptive chosen plaintext attack is illustrated below. ____________________ | | | i = 0 | |____________________| | ____________ | | | |/___| i := i + 1 |/___ |\ |____________|\ /|\ | | ________\|/_________ | | | | | choose M(i+1) | | |____________________| | | | | | ________\|/_________ | | | | | compute E(M(i+1)) | | |____________________| | | | | | ____________\|/_______________ | | | | | inspect E(M(1)),..,E(M(i+1)) | | |______________________________| | | | | | \|/ | / \ | / \___________________\| \ / continue / \ / | stop | \|/ Figure 1. Adaptive Chosen Plaintext Attack To illustrate the difference between these two categories, we use two examples. First, suppose E(M) = M3 mod N, N = p * q, where p and q are large secret primes. Then (cf. sec. 4) it is infeasible for a cryptanalyst to determine D, even after inspecting numerous pairs of the form {M,E(M)}. However, an eavesdropper who intercepts E(M) = 8 can conclude M = 2. Thus a ciphertext-only attack may be feasible in an instance where known- or chosen-plaintext attack is not useful. On the other hand, suppose E(M) = 5M mod N where N is secret. Then interception of E(M) would not reveal M or N; this would remain true even if several ciphertexts were intercepted. However, an intruder who learns that E(12) = 3 and E(16) = 4 could conclude N = 19. Thus a known- or chosen-plaintext attack may succeed where a ciphertext-only attack fails. Deficiencies in (a), i.e., vulnerability to ciphertext-only attack, can frequently be corrected by slight modifications of the encoding scheme, as in the M3 mod N encoding above. Adaptive chosen-plaintext is often regarded as the strongest attack. Secrecy ensures that decryption of messages is infeasible. However, the enciphering transformation E is not covered by the above requirements; it could even be public. Thus secrecy, per se, leaves open the possibility that an intruder could masquerade as a legitimate user, or could compromise the integrity of a message by altering it. That is, secrecy does not imply authenticity/ integrity. 1.2 Requirements for authenticity and integrity. Authenticity requires that an intruder should not be able to masquerade as a legitimate user of a system. Integrity requires that an intruder should not be able to substitute false ciphertext for legitimate ciphertext. Two minimal requirements should be met for a cryptosystem to provide these services: a. It should be possible for the recipient of a message to ascertain its origin. b. It should be possible for the recipient of a message to verify that it has not been modified in transit. These requirements are independent of secrecy. For example, a message M could be encoded by using D instead of E. Then assuming D is secret, the recipient of C = D(M) is assured that this message was not generated by an intruder. However, E might be public; C could then be decoded by anyone intercepting it. A related service which may be provided is nonrepudiation; i.e., we may add a third requirement if desired: c. A sender should not be able to deny later that he sent a message. We might also wish to add: d. It should be possible for the recipient of a message to detect whether it is a replay of a previous transmission. 1.3 Conventional systems. In a conventional cryptosystem, E and D are parameterized by a single key K, so that we have DK(EK(M)) = M. It is often the case that the algorithms for obtaining DK and EK from K are public, although both EK and DK are secret. In this event the security of a conventional system depends entirely on keeping K a secret. Then secrecy and authenticity are both provided: if two parties share a secret K, they can send messages to one another which are both private (since an eavesdropper cannot compute DK(C)) and authenticated (since a would-be masquerader cannot compute EK(M)). In some cases (e.g., transmission of a random bit string), this does not assure integrity; i.e., modification of a message en route may be undetected. Typically integrity is provided by sending a compressed form of the message (a message digest) along with the full message as a check. Conventional systems are also known as one-key or symmetric systems [SIMM79]. 1.4 Example of a conventional cipher: DES. The most notable example of a conventional cryptosystem is DES (Data Encryption Standard). It is well-documented (e.g. [DENN83],[EHRS78],[NATI77],[NATI80],[NATI81],[SMID81],[SMID88b]) and will not be discussed in detail here, except to contrast it with other ciphers. It is a block cipher, operating on 64-bit blocks using a 56-bit key. Essentially the same algorithm is used to encipher or decipher. The transformation employed can be written P-1(F(P(M))), where P is a certain permutation and F is a certain function which combines permutation and substitution. Substitution is accomplished via table lookups in so-called S-boxes. The important characteristics of DES from the standpoint of this exposition are its one-key feature and the nature of the operations performed during encryption/decryption. Both permutations and table lookups are easily implemented, especially in hardware. Thus encryption rates exceeding 40 Mbit/sec have been obtained (e.g., [BANE82],[MACM81]). This makes DES an efficient bulk encryptor, especially when implemented in hardware. The security of DES is produced in classical fashion: alternation of substitutions and permutations. The function F is obtained by cascading a certain function f(x,y), where x is 32 bits and y is 48 bits. Each stage of the cascade is called a round. A sequence of 48-bit strings {Ki} is generated from the key. Let L(x) and R(x) denote the left and right halves of x, and let XOR denote exclusive-or. Then if Mi denotes the output of stage i, we have L(Mi) = R(Mi-1), R(Mi) = L(Mi-1) XOR f(L(Mi),Ki). The hope is that after 16 rounds, the output will be statistically flat; i.e., all patterns in the initial data will be undetectable. 1.5 Another conventional cipher: exponentiation. Pohlig and Hellman [POHL78] noted a type of cipher which deviated from the classical methods such as transposition and substitution. Their technique was conceptually much simpler. Let GCD denote greatest common divisor (app. G). Suppose p > 2 is a prime and suppose K is a key in [1,p-1) with GCD(K,p-1) = 1 (i.e., K is relatively prime to p-1). If M is plaintext in [1,p- 1], an encryption function E may be defined by E(M) = MK mod p. Now the condition GCD(K,p-1) = 1 implies (lem. G.1) that we can find I with I * K ð 1 (mod p-1). Note that I is not a separate key; I is easily derived from K or vice-versa (app. H). We may set D(C) = CI mod p. It may then be shown (cor. G.2.3) that D(E(M)) = M, as required. Furthermore, E(D(C)) = C as well; i.e., E and D are inverse functions. This makes exponentiation a versatile cipher; in particular, we can encipher with D as well as E. Later we will note that this can be useful in authentication. However, Pohlig and Hellman used the above only as an example of a conventional cryptosystem. That is, since I is easily derived from K, both E and D are generated by the single, secret, shared key K. In section 4.1 we will see that if p were replaced by a product of two primes, derivation of I from K would be non-trivial. This would cause the key material to be split into two portions. Despite the relative simplicity of the definitions of E and D, they are not as easy to compute as their counterparts in DES. This is because exponentiation mod p is a more time-consuming operation than permutations or table lookups if p is large. Security of the system in fact requires that p be large. This is because K should be nondeterminable even in the event of a known-plaintext attack. Suppose a cryptanalyst knows p, M, C, and furthermore knows that C = MK mod p for some K. He should still be unable to find K; i.e., he should not be able to compute discrete logarithms modulo p. At present there are no efficient algorithms for the latter operation: the best techniques now available take time (e.g., [ADLE79],[COPP86]) exp(c((log p)(log log p))1/2). Computing modulo p is equivalent to using the Galois field GF(p); it is possible to use GF(pn) instead (app. G). There are both advantages and disadvantages to the extension. Arithmetic in GF(pn) is generally easier if n > 1, and especially so in GF(2n). On the other hand, taking discrete logarithms in GF(pn) is also easier. The case of small n is treated in [ELGA85b]. The greatest progress has been made for the case p = 2 (e.g., [BLAK84],[COPP84]). In [COPP84] it is shown that discrete logarithms in GF(2n) can be computed in time exp(c * n1/3 * (log n)2/3). For a survey of the discrete logarithm problem see, e.g., [ADLE86],[COPP87],[MCCU89],[ODLY84b]. 1.6 Public-key cryptosystems. The notion of public-key cryptography was introduced by Diffie and Hellman [DIFF76b]; for a history see [DIFF88]. Public-key systems, also called two-key or asymmetric [SIMM79], differ from conventional systems in that there is no longer a single secret key shared by a pair of users. Rather, each user has his own key material. Furthermore, the key material of each user is divided into two portions, a private component and a public component. The public component generates a public transformation E, and the private component generates a private transformation D. In analogy to the conventional case E and D might be termed encryption and decryption functions, respectively. However, this is imprecise: in a given system we may have D(E(M)) = M, E(D(M)) = M, or both. A requirement is that E must be a trapdoor one-way function. One-way refers to the fact that E should be easy to compute from the public key material but hard to invert unless one possesses the corresponding D, or equivalently, the private key material generating D. The private component thus yields a "trapdoor" which makes the problem of inverting E seem difficult from the point of view of the cryptanalyst, but easy for the (sole legitimate) possessor of D. For example, a trapdoor may be the knowledge of the factorization of an integer (see sec. 4.1). We remark that the trapdoor functions employed as public transformations in public-key systems are only a subclass of the class of one-way functions. The more general case will be discussed in section 3.2.2. We note also that public/private dichotomy of E and D in public- key systems has no analogue in a conventional cryptosystem: in the latter, both EK and DK are parameterized by a single key K. Hence if EK is known then it may be assumed that K has been compromised, whence it may also be assumed that DK is also known, or vice- versa. For example, in DES, both E and D are computed by essentially the same public algorithm from a common key; so E and D are both known or both unknown, depending on whether the key has been compromised. 1.6.1 Secrecy and authenticity. To support secrecy, the transformations of a public-key system must satisfy D(E(M)) = M. Suppose A wishes to send a secure message M to B. Then A must have access to EB, the public transformation of B (note that subscripts refer to users rather than keys in this context). Now A encrypts M via C = EB(M) and sends C to B. On receipt, B employs his private transformation DB for decryption; i.e., B computes DB(C) = DB(EB(M)) = M. If A's transmission is overheard, the intruder cannot decrypt C since DB is private. Thus secrecy is ensured. However, presumably anyone can access EB; B has no way of knowing the identity of the sender per se. Also, A's transmission could have been altered. Thus authenticity and integrity are not assured. To support authentication/integrity, the transformations in a public-key system must satisfy E(D(M)) = M. Suppose A wishes to send an authenticated message M to B. That is, B is to be able to verify that the message was sent by A and was not altered. In this case A could use his private transformation DA to compute C = DA(M) and send C to B. Now B can use A's public transformation EA to find EA(C) = EA(DA(M)) = M. Assuming M is valid plaintext, B knows that C was in fact sent by A, and was not altered in transit. This follows from the one-way nature of EA: if a cryptanalyst, starting with a message M, could find C' such that EA(C') = M, this would imply that he can invert EA, a contradiction. If M, or any portion of M, is a random string, then it may be difficult for B to ascertain that C is authentic and unaltered merely by examining EA(C). Actually, however, a slightly more complex procedure is generally employed: an auxiliary public function H is used to produce a digital signature S = DA(H(M)) which A sends to B along with M. On receipt B can compute H(M) directly. The latter may be checked against EA(S) to ensure authenticity and integrity, since once again the ability of a cryptanalyst to find a valid S' for a given M would violate the one-way nature of EA. Actually H must also be one-way; we return to this subject in section 3.2. Sending C or S above ensures authenticity, but secrecy is nonexistent. In the second scheme M was sent in the clear along with S; in the first scheme, an intruder who intercepts C = DA(M) presumably has access to EA and hence can compute M = EA(C). Thus in either case M is accessible to an eavesdropper. It may be necessary to use a combination of systems to provide secrecy, authenticity and integrity. However, in some cases it is possible to employ the same public-key system for these services simultaneously. We note that for authenticity/integrity purposes, D is applied to M or H(M); for secrecy, E is applied to M. If the same public-key system is to be used in both cases, then D(E(M)) = M and E(D(M)) = M must both hold; i.e., D and E are inverse functions. A requirement is that the plaintext space (i.e., the domain of E) must be the same as the ciphertext space (i.e., the domain of D). Suppose that in addition to E and D being inverses for each user, for each pair of users A and B the functions EA, DA, EB, and DB all have a common domain. Then both secrecy and authenticity can be accomplished with a single transmission: A sends C = EB(DA(M)) to B; then B computes EA(DB(C)) = EA(DA(M)) = M. An intruder cannot decrypt C since he lacks DB; hence secrecy is assured. If the intruder sends C' instead of C, C' cannot produce a valid M since DA is needed to produce a valid C. This assures authenticity. In actuality there are no common systems versatile enough for the last usage without modification. In fact there is only one major system (RSA) which satisfies E(D(M)) = D(E(M)) = M. The lack of a common domain between two users creates a technical problem in using such a system for secrecy and authenticity. We discuss some approaches to this problem in section 4.1. If the question of domains is ignored, the usage of a system satisfying E(D(M)) = D(E(M)) = M with a hash function H is illustrated below. There EA,DA,EB,DB are the public transformations of A and B, respectively. A: B: ____________________ _________________ | | _____\| | | compute H(M) | /|\ /| receive (C,S) | |____________________| | |_________________| | | | | | | ________\|/_________ | _________\|/_________ | | | | | | compute C = EB(M) | | | compute M' = DB(C) | |____________________| | |_____________________| | | | | | | _________\|/__________ | _________\|/_________ | | | | | | compute S = DA(H(M)) | | | compute H' = EA(S) | |______________________| | |_____________________| | | | | | | ________\|/_________ | _________\|/_________ | | | | | | send (C,S) to B | | | verify H' = H(M') | |____________________| | |_____________________| | | | | \|/________________\| / Figure 2. Using Public-Key for Secrecy and Authenticity 1.6.2 Applicability and limitations. The range of applicability of public-key systems is limited in practice by the relatively low bandwidths associated with public- key ciphers, compared to their conventional counterparts. It has not been proven that time or space complexity must necessarily be greater for public key systems than for conventional systems. However, the public-key systems which have withstood cryptanalytic attacks are all characterized by relatively low efficiency. For example, some are based on modular exponentiation, a relatively slow operation. Others are characterized by high data expansion (ciphertext much larger than plaintext). This inefficiency, under the conservative assumption that it is in fact inherent, seems to preclude the use of public-key systems as replacements for conventional systems utilizing fast encryption techniques such as permutations and substitutions. That is, using public-key systems for bulk data encryption is not feasible, at least for the present. On the other hand, there are two major application areas for public-key cryptosystems: a. Distribution of secret keys. b. Digital signatures. The first involves using public-key systems for secure and authenticated exchange of data-encrypting keys between two parties (sec. 2). Data-encrypting keys are secret shared keys connected with a conventional system used for bulk data encryption. This permits users to establish common keys for use with a system such as DES. Classically, users have had to rely on a mechanism such as a courier service or a central authority for assistance in the key exchange process. Use of a public-key system permits users to establish a common key which does not need to be generated by or revealed to any third party, providing both enhanced security and greater convenience and robustness. Digital signatures are a second major application (sec. 3). They provide authentication, nonrepudiation and integrity checks. As noted earlier, in some settings authentication is a major consideration; in some cases it is desirable even when secrecy is not a consideration (e.g., [SIMM88]). We have already seen an example of a digital signature, i.e., when A sent DA(M) to B. This permitted B to conclude that A did indeed send the message. As we will note in section 3, nonrepudiation is another property desirable for digital signatures. Public key cryptosystems provide this property as well. No bulk encryption is needed when public-key cryptography is used to distribute keys, since the latter are generally short. Also, digital signatures are generally applied only to outputs of hash functions (sec. 3). In both cases the data to be encrypted or decrypted is restricted in size. Thus the bandwidth limitation of public-key is not a major restriction for either application. 2. Key management. Regardless of whether a conventional or public-key cryptosystem is used, it is necessary for users to obtain other users' keys. In a sense this creates a circular problem: in order to communicate securely over insecure channels, users must first exchange key information. If no alternative to the insecure channel exists, then secure exchange of key information presents essentially the same security problem as subsequent secure communication. In conventional cryptosystems this circle can be broken in several ways. For example, it might be assumed that two users can communicate over a supplementary secure channel, such as a courier service. In this case it is often the case that the secure channel is costly, inconvenient, low-bandwidth and slow; furthermore, use of a courier cannot be considered truly secure. An alternative is for the two users to exchange key information via a central authority. This presumes that each user individually has established a means of communicating securely with the central authority. Use of a central authority has several disadvantages as noted below. In public-key systems the key management problem is simpler because of the public nature of the key material exchanged between users, or between a user and a central authority. Also, alternatives to the insecure channel may be simpler; e.g., a physical mail system might suffice, particularly if redundant information is sent via the insecure (electronic) channel. 2.1 Secret-key management. In a conventional (one-key) system, security is dependent on the secrecy of the key which is shared by two users. Thus two users who wish to communicate securely must first securely establish a common key. As noted above, one possibility is to employ a third party such as a courier; but as remarked there are various disadvantages to this implementation. The latter is the case even for a single key exchange. In practice, it may be necessary to establish a new key from time to time for security reasons. This may make use of a courier or similar scheme costly and inefficient. An alternative is for the two users to obtain a common key from a central issuing authority [BRAN75]. Security is then a major consideration: a central authority having access to keys is vulnerable to penetration. Due to the concentration of trust, a single security breach would compromise the entire system. In particular, a central authority could engage in passive eavesdropping for a long period of time before the practice was discovered; even then it might be difficult to prove. A second disadvantage of a central issuing authority is that it would probably need to be on-line. In large networks it might also become a bottleneck, since each pair of users needing a key must access a central node at least once. If the number of users is n then the number of pairs of users wishing to communicate could theoretically be as high as n(n-1)/2, although this would be highly unlikely in large networks. Each time a new key is needed, at least two communications involving the central authority are needed for each pair of users. Furthermore, such a system may not be robust: failure of the central authority could disrupt the key distribution system. Some of the disadvantages of a central authority can be mitigated by distributing key distribution via a network of issuing authorities. One possibility is a hierarchical (tree-structured) system, with users at the leaves and key distribution centers at intermediate nodes. However, distributing key management creates a new security problem, since a multiplicity of entry points for intruders is created. Furthermore, such a modification might be inefficient unless pairs of users communicating frequently were associated to a common subtree; otherwise the root of the tree would be a bottleneck as before. Another alternative is a key translation center which requires only one key-encrypting key exchange through a certification authority. Some of these disadvantages can also be mitigated by a public- key approach to key distribution. 2.2 Public distribution of secret keys. It was noted by Diffie and Hellman [DIFF76b] and Merkle [MERK78] that users can publicly exchange secret keys. This is not related a priori to public-key cryptography; however, a public-key system can be used to implement public distribution of secret keys (often referred to as public key distribution). The underlying public- key system must support secrecy, for use in key encryption. The notion of public distribution of secret keys was originated in 1974 by Merkle, who proposed a "puzzle" system to implement it [MERK78]. However, even before this work was published it was superseded by a public-key scheme supporting public key distribution [DIFF76b]. This has come to be known as the Diffie/Hellman exponential key exchange. To begin with, users A and B are assumed to have agreed upon a common prime p and a common primitive root g modulo p (app. K). Then (lem. K.2) each number in [1,p) can be expressed as gx mod p for some x. Both p and g may be public. To establish a common secret key, A chooses a random x(A) in [0,p-1] and B chooses a random x(B) in [0,p-1]; these are kept secret. Then A computes y(A) = gx(A) mod p while B computes y(B) = gx(B) mod p. These need not be secret. Finally A sends y(A) to B and B sends y(B) to A. Now A knows x(A) and y(B) and can compute K = y(B)x(A) mod p = gx(B)x(A). Similarly, B knows x(B) and y(A) and can compute K = y(A)x(B) mod p = gx(A)x(B). The Diffie/Hellman algorithm is illustrated below. A: B: _______________ _______________ | | | | | choose xA | | choose xB | |_______________| |_______________| | | | | ________\|/_________ ________\|/_________ | | | | | compute yA = g**xA | | compute yB = g**xB | |____________________| |____________________| | | | | _______\|/________ ________\|/_________ | | | | | send yA to B |------------->| receive yA from A | |__________________| |____________________| | | | | ________\|/________ ________\|/_________ | | | | | receive yB from B |<-------------| send yB to A | |___________________| |____________________| | | | | ________\|/_________ ________\|/_________ | | | | | compute K = yB**xA | | compute K = yA**xB | |____________________| |____________________| Figure 3. The Diffie/Hellman Key Exchange This establishes K as a secret common quantity which can be used in some agreed-upon fashion to generate a secret key. The secrecy of this scheme depends, as in section 1.5, on the difficulty of computing discrete logarithms. That is, knowing p, g, y and the fact that y = gx mod p for some x does not yield x. Thus, if an intruder knows p and g and intercepts y(A) or y(B) or both, he cannot find x(A) or x(B) and hence cannot compute K. A disadvantage of this scheme is lack of support for authenticity. If an intruder C intercepts y(B), sends y(C) = gx(C) mod p to B and B thinks he is receiving y(A), B will inadvertently establish a secret key with C. That is, the underlying public-key cryptosystem supports secrecy but not authentication. Thus it may be desirable to augment a secrecy-providing system with one which provides authentication. Examples of the latter will be given later. 2.3 Management of public components in a public-key system. Prior to using a public-key cryptosystem for establishing secret keys, signing messages etc., users A and B must exchange their public transformations (EA and EB in sec. 1.6), or equivalently their public components. The latter may be regarded as a key management problem. It is a simpler problem than establishment of secret keys, since public components do not require secrecy in storage or transit. Public components can, e.g., be managed by an on-line or off-line directory service; they can also be exchanged directly by users. However, authenticity is an issue as in the previous section. If A thinks that EC is really EB then A might encrypt using EC and inadvertently allow C to decrypt using DC. A second problem is integrity: any error in transmission of a public component will render it useless. Hence some form of error detection is desirable. Regardless of the scheme chosen for public component distribution, at some point a central authority is likely to be involved, as in conventional systems. However, it may not be necessary for the central authority to be on-line; exchange of public components between users need not involve the central authority. An example of how this can be implemented is given in section 2.5. We also note there that the implications of compromise of the central authority are not as severe as in the conventional case. Validity is an additional consideration: a user's public component may be invalidated because of compromise of the corresponding private component, or for some other reason such as expiration. This creates a stale-data problem in the event that public components are cached or accessed through a directory. Similar problems have been encountered in management of multi- cache systems and distributed databases, and various solutions have been suggested. For example, users could be notified immediately of key compromises or invalidations. This may be impractical, in which case either users should be informed within a given time period of changes needed for cached information on public components, or users should periodically check with a central authority to update validity information. 2.3.1 Use of certificates. A technique to gain a partial solution to both authenticity and integrity in distribution of public components is use of certificates [KOHN78]. A certificate-based system assumes a central issuing authority CA as in the secret-key case. Again it must be assumed that each user can communicate securely with the CA. This is relatively simple in the present instance: it merely requires that each user possess ECA, the public transformation of the CA. Then each user A may register EA with the CA. Since EA is public, this might be done via the postal service, an insecure electronic channel, a combination of these, etc. Normally A will follow some form of authentication procedure in registering with the CA. Alternatively, registration can be handled by a tree-structured system: the CA issues certificates to local representatives (e.g., of employing organizations), who then act as intermediaries in the process of registering users at lower levels of the hierarchy. An example of such a system is given in section 5.4. In any case, in return A receives a certificate signed by the CA (see sec. 3 for a more thorough discussion of signatures) and containing EA. That is, the CA constructs a message M containing EA, identification information for A, a validity period, etc. Then the CA computes CERTA = DCA(M) which becomes A's certificate. The latter is then a public document which both contains EA and authenticates it, since the certificate is signed by the CA. Certificates can be distributed by the CA or by users, or used in a hierarchical system which we return to later. The inclusion of the validity period is a generalization of timestamping. The latter was not treated in [KOHN78], but Denning [DENN81] notes the importance of timestamping in guarding against the use of compromised keys. In general, the problem of stale data is not wholly solved by timestamping: a certificate may be invalidated before its expiration date, because of compromise or administrative reasons. Hence if certificates are cached by users (as opposed to being re- distributed by the CA each time they are used), the CA must periodically issue lists of invalidated certificates. Popek and Kline [POPE79] have noted that certificates are analogous to capabilities in operating systems. Management of certificates creates analogous problems, such as selective revocation of capabilities. The public nature of certificates, however, a priori precludes an analogue of a useful feature of a capability system: users cannot be restricted to communicating with a subset of other users. If a selective capability feature is desired it must be implemented through a controlled distribution of certificates, which would be difficult and contrary to the spirit of public-key. 2.3.2 Generation and storage of component pairs. Generation of public/private component pairs is an important consideration. If this service is offered by a central facility, it may result in certain advantages; e.g., keys might be longer or chosen more carefully. However, this introduces the same problem noted in section 2.1: a central authority holding users' private components would be vulnerable to penetration. Also, the central facility would have to be trusted not to abuse its holdings by monitoring communications between users. Both problems are circumvented if users generate and store their own private/public components. 2.3.3 Hardware support for key management. If users store their own components, a management problem is created. One possibility is software storage, e.g., in a file with read protection. If greater security is desired, a second option is hardware keys ([DENN79],[FLYN78],[RIVE78]). In the classical version of this scheme, each public/private key pair is stored on a pair of ROM chips. The public key is revealed to the owner of the keys. However, the private key need not be known to any user, including the owner. There are two advantages to this mode of storage, namely the fact that neither the user nor a central authority need know the private component. We have previously noted the advantage of not having users reveal private keys to a central authority. Also, as noted in [FLYN78], it may also be desirable to protect keys from disloyal employees. Entering keys from a keyboard rather than from ROM may be insecure; moreover, the large size of public keys makes this impractical. However, there is also an obvious disadvantage to the ROM mode of hardware key storage, namely the fact that the party which loads a private key to ROM could retain a copy. An alternative, but still classical, scheme is for the user to be provided with a means of writing to a memory chip and then sealing it from further writing. A more modern and more secure hardware adjunct is a smart token, smart card or other device which contains both memory and a microprocessor. Such a device can both generate and store user keys. Ideally, it should be implemented via a single chip which stores both cryptographic algorithms and data. In the classical case, encryption and decryption are handled by separate units in a hardware device acting as an interface between a user's computer and the network. As noted in [DENN79], this creates a security risk: an intruder may rig the device. If a token or smart card is used, it is possible (at least in theory; the technology is still evolving at the time of this writing) to generate and store keys on the same device which executes encryption and decryption algorithms. Again, security is optimal if all of these functions are combined onto one or a minimal number of chips. Use of such auxiliary devices mitigates some of the problems encountered in the classical case. However, a token or smart card may be stolen, permitting the thief to masquerade as the user. Passwords (PINs) or biometrics used to control access to devices can substantially lessen this threat. Capabilities such as signature generation should become feasible on devices such as smart cards in the near future. This will provide an excellent solution to the problem of using private components without exposing them. 2.4 Using public-key systems for secret key distribution. As noted earlier, one of the major applications of public-key cryptosystems is in regard to public distribution of secret keys. Thus, a public-key system can be used in conjunction with a conventional system such as DES. We have already seen an example of a public-key cryptosystem implementing public key distribution in section 2.2. Both secrecy and authentication can be provided simultaneously in the distribution process via public-key systems. This may be achieved using a single public-key system if its encryption and decryption functions are inverses, or via a hybrid of two public key systems providing secrecy and authentication separately. The essence of this usage is very simple: a secret key is merely a particular form of message. Assuming that a system has been established for distribution of public components of users as discussed in the previous section, users can then establish secret keys at will by employing the public system for encryption and signing of secret keys. Thus the latter can be exchanged or changed without difficulty. This is in contradistinction to a system in which secret keys must be established via couriers or a central authority. The use of public-key cryptography thus reduces the problem of secret key distribution to the problem of binding user identities and user public keys. The latter is a simpler problem in that no communication of secret information is necessary. That is, users can generate their own private/public key pairs and need not expose their private keys to anyone, including themselves. However, it is critical that user public keys be properly authenticated. This is a nontrivial consideration in large networks. Since only integrity and authenticity are considerations, users may be able to register their public components without use of a secure channel. In a large network this might require a distributed system of certifying authorities, arranged in hierarchical fashion. The feasibility of such an approach requires transitivity of trust. For example, a central certifying authority could certify a second level of certifying agents who in turn certify user public components. Other levels could be added as well. Thus a user is certified via an authentication path. Such a path need not require use of insecure channels if the nodes in the path can communicate securely. For example, a user might register with a local certifying authority in person. The local authority might be affiliated with an organization and may be able to use the user's organization ID to certify the user's public key. If the local authority can communicate securely with the central authority, the user's public key may then be forwarded to the central authority for distribution. Assuming that user public components can be certified and distributed, two users may use each other's public components to establish common keys for use in message encryption, again without resort to a secure channel. 2.4.1 A protocol for key exchange. Suppose A and B wish to establish a shared secret key. Suppose further that they have obtained each other's public components. Some possible modes for binding user public keys to user identities and distributing certified user public keys are discussed in sections 5.3.1 - 5.3.2 and 5.4.6. In any case, A may now generate a secret key K. For example, this may be a key-encrypting key to be used to exchange session keys and possibly initialization vectors if, e.g., DES is used in cipher feedback mode or cipher block-chaining mode [NATI80]. Then A might form an expanded message which contains K and other data, which could include an identification for A, a timestamp, a sequence number, a random number, etc. This added material is needed for A to authenticate himself to B in real-time, and thus prevent replays. For example, Needham and Schroeder [NEED78] suggest a three-way handshake protocol: First of all A may send to B the message C = EB(RA,IDA), where EB is B's public transformation, IDA is the identification of A, and RA is a random number. Now B can decrypt C and obtain IDA. Now B generates a random number RB and sends C' = EA(RA,RB) to A. Upon decryption of C', A can verify in real time that B has received RA, since only B could decrypt C. Finally A sends C" = EB(RB) to B; when B decrypts C" he can verify in real time that A has received RB, since only A could decrypt C'. Thus A and B are authenticated to each other in real time; i.e., each knows that the other is really there. Now A sends EB(DA(K)) to B, who can decrypt to obtain K. This ensures both secrecy and authenticity in exchange of K. The authentication stage of the preceding protocol is illustrated below. A: B: __________________ ____________________ | | ______\| | | choose RA | /|\ /| receive M from A | |__________________| | |____________________| | | | | | | ___________\|/__________ | ___________\|/___________ | | | | | | compute M = EB(RA,IA) | | | compute (RA,IA) = EA(M) | |________________________| | |_________________________| | | | | | | _________\|/_________ | _______\|/_______ | | | | | | send M to B |______\| | choose RB | |_____________________| / |_________________| | | | | ________\|/________ ___________\|/__________ | | | | | receive M' from B |/_______ | compute M' = EA(RA,RB) | |___________________|\ /|\ |________________________| | | | | | | ___________\|/____________ | ________\|/________ | | | | | | compute (RA,RB) = DA(M') | |/______| send M' to A | |__________________________| \ |___________________| | | | | _______\|/______ ________\|/_________ | | _____\| | | verify RA | /|\ /| receive M" from A | |________________| | |____________________| | | | | | | _________\|/_________ | _________\|/__________ | | | | | | compute M" = EB(RB) | | | compute RB = DB(M") | |_____________________| | |______________________| | | | | | | _______\|/________ | ______\|/______ | | | | | | send M" to B |________\| | verify RB | |__________________| / |_______________| Figure 4. A Protocol for Real-Time Authentication 2.5 Protocols for certificate-based key management. There are a number of different approaches to public-key component management and the application to secret key distribution (e.g., [KLIN79],[NEED78]). For simplicity we assume a certificate- based system. We also assume that a system has been established for a central authority to create certificates. 2.5.1 Certificate management by a central authority. In a certificate-based public-key system, one possibility is to have the central issuing authority manage certificate distribution. Suppose the authority CA has created certificates CERTA and CERTB for A and B, respectively. These contain EA and EB (or equivalently, the public components of A and B) as noted in section 2.3.1. We may assume that A and B have cached CERTA and CERTB, respectively. If A has exchanged certificates with B previously, A and B may have each other's certificates cached. If not, then there are two ways in which A could obtain B's certificate. The simplest is for A to request B's certificate directly from B; this is explored in the next section. The advantage of this simple approach is that on-line access to a central authority is avoided. On the other hand, if A requests B's certificate directly from B, or obtains it from a directory listing, security and integrity considerations arise. For example, the certificate A obtains may have been recently invalidated. Thus A may wish to access B's certificate through the CA, assuming that the latter is on-line. In this event A requests CERTA and CERTB from S. We recall that each CERT has the form DS(M) where M contains an E. Thus when A receives the requested certificates, he can validate CERTA by computing ES(CERTA) and recovering EA. This validates the source of the certificates, thus validating CERTB as well. Hence A may retrieve a duly authenticated EB from CERTB. The disadvantage is the requirement of an on-line central node; the advantage is that A is assured of the instantaneous validity of the certificate. Later we will note that this has implications for nonrepudiation. Caching of certificates implies that authentication involving the CA will not be done regularly. Thus the above procedure will only be necessary when two users first communicate, or when components are compromised or certificates invalidated. As long as the public components involved are valid, secret keys can be exchanged at will, although a handshake protocol may still be used to ensure real-time authenticity and integrity. An advantage of this mode of key management is that the entire process is conducted over insecure channels with excellent security. A second advantage is that distribution of certificates by a central authority assures that certificates are valid at time of receipt. A prerequisite for the proper functioning of such a key management system is the existence of a system for binding user public keys and identities. Such a system requires distributed trust. A disadvantage of this mode of management is that as in section 2.1, the central authority may be a bottleneck. The severity is mitigated by the fact that a secure channel is not needed, but the essential problem is similar. A more significant disadvantage arises from the concentration of trust in one entity [KLIN79]. The security of the central authority might be compromised, e.g., by penetration. The fact that the central authority does not generally access users' private components means that existing messages are not compromised, as might be the case in a secret-key system [DIFF82]. However, if an intruder accesses the central authority's private component, the intruder could forge certificates. Some scenarios for intrusions of this sort, explorations of consequences, and means for minimizing damage are explored in [DENN83b] and [DIFF88]. Merkle [MERK82b] has suggested a tree-structured system as a means of coping with the compromise of a central authority. However, this scheme is not practical for large networks since the tree must be restructured each time the user population changes. 2.5.2 Decentralized management. Users may be responsible for managing their own certificates. In this event the protocol is much simpler. When A wishes to initiate communication (e.g., exchange of a secret key) with B, he sends a message to B containing A's certificate, A's identification, and other information such as a date, random number etc. as described in the protocol in the previous section. This message also requests B's certificate. Upon completion of the certificate exchange, employing some protocol such as the handshake above, A and B will possess each other's authenticated certificates. A can validate the certificate CB by computing ES(CB) as usual. Then EB may be retrieved. The certificate must contain information properly identifying B to A, so that an intruder cannot masquerade as B. The certificate must also have a validity period. In turn B may proceed similarly. The central authority must periodically issue lists of certificates which have become invalid before their expiration dates due to key compromise or administrative reasons. It is likely that in a large system this would be done, e.g., on a monthly basis. Hence a user receiving a certificate from another user would not have complete assurance of its validity, even though it is signed by S. Thus this system trades higher efficiency for some security loss, compared to the previous scheme. For greater assurance of validity, a user could access a centrally-managed list of invalid certificates; presumably these would be very current. This would require on-line access to a central facility, which would create the same type of bottleneck we have noted previously. However, the accesses would be very quick, since presumably only a certificate sequence number would be transmitted. Yet another on-line possibility would be for the central authority to enforce coherence by a global search of cached certificates each time a certificate is invalidated. Again there is a trade-off of validity assurance and efficiency. 2.5.3 A phone-book approach to certificates. Some of the features of the previous schemes could be combined in a phone-book approach, using an electronic equivalent such as a floppy disk containing certificates. This would optimize ease of use since a user could communicate securely with another by accessing the latter's certificate very rapidly. However, again the central authority would have to issue "hot lists". Periodic distribution of the compilations of certificates would be a separate management process. Security of such a scheme is clearly in question [KLIN79]. Phone-book entries might contain errors, or entries could be altered. 3. Digital signatures and hash functions. Digital signatures were introduced by Diffie and Hellman [DIFF76b]; for surveys see, e.g., [AKL-83],[POPE79],[RABI78]. A digital signature is the electronic analogue of a handwritten signature. A common feature is that they must provide the following: a. A receiver must be able to validate the sender's signature. b. A signature must not be forgeable. c. The sender of a signed message must not be able to repudiate it later. We have already seen an example of the usage of digital signatures, namely when a central authority signs certificates. In this section we are concerned with the signing of arbitrary messages. A major difference between handwritten and digital signatures is that a digital signature cannot be a constant; it must be a function of the document which it signs. If this were not the case then a signature, due to its electronic nature, could be attached to any document. Furthermore, a signature must be a function of the entire document; changing even a single bit should produce a different signature. Thus a signed message cannot be altered. There are two major variants of implementation: a. true signatures. b. arbitrated signatures. In a true signature system, signed messages are forwarded directly from signer to recipient. In an arbitrated system, a witness (human or automated) validates a signature and transmits the message on behalf of the sender. The use of an arbitrator may be helpful in event of key compromise as noted below. The notion of authentication by some form of signature can be traced back as far as 1952, as noted by Diffie [DIFF88]. Cryptography was used by the Air Force to identify friendly aircraft through a challenge/response system. The protocols utilized influenced the development of public-key cryptography by Diffie and Hellman [DIFF76b]. As we note below, hash functions are useful auxiliaries in this context, i.e., in validating the identity of a sender. They can also serve as cryptographic checksums (i.e., error-detecting codes), thereby validating the contents of a message. Use of signatures and hash functions can thus provide authentication and verification of message integrity at the same time. Numerous digital signature schemes have been proposed. As noted in [DAVI83], a major disadvantage of signature schemes in conventional systems is that they are generally one-time schemes. A signature is generated randomly for a specific message, typically using a large amount of key material, and is not re-usable. Furthermore, later resolution of disputes over signed documents requires written agreements and substantial bookkeeping on the part of the sender and receiver, making it more difficult for a third party to adjudicate. A conventional scheme was noted in [DIFF76b] (the Lamport/Diffie One-Time Signature) and is refined in [MERK82]. Some other conventional schemes are DES-based. Public-key schemes supporting authentication permit generation of signatures algorithmically from the same key repeatedly, although the actual signatures are of course different. That is, in a public-key scheme, signatures are a function of the message and a long-term key. Hence key material can be re-used many times before replacement. This obviates the necessity of special written agreements between individual senders and receivers. It also makes it easy for a third party to resolve disputes (e.g., involving repudiation) later, and simplifies storage and bookkeeping. As we note below, the bandwidth limitation of public-key systems is minimized by the use of hash functions as auxiliaries. In passing we remark that Goldwasser, Micali and Rivest [GOLD88] have proposed a nondeterministic public-key signature scheme which incorporates an aspect of conventional schemes: a message does not have a unique signature. This scheme is intended to deflect adaptive chosen-text attacks, in which an attacker is permitted to use a signer as an oracle. In such an attack, the attacker specifies a sequence of messages to be signed, and is able to study all previous signatures before requesting the next. In the GMR scheme, it can be proved that an attacker can gain no information by studying any number of signatures no matter how they are generated. However, this security gain is offset to some extent by data expansion (high ratio of signature to message size) and a negative effect on nonrepudiation. As in the case of the one-time schemes discussed earlier, the signatures generated in nondeterministic public-key schemes tend to be large in comparison to signatures produced by deterministic public-key schemes (i.e., those which produce a unique signature for a given message). Adjudication of disputes is made difficult by increased bookkeeping and the necessity of storing large databases of messages and their signatures. In the following we discuss only deterministic public-key signature schemes, in which a message has a unique signature. 3.1 Public-key implementation of signatures. We have noted previously that, like secret-key systems, public- key systems can be used for authentication. There are several distinctive features of a public-key implementation of signatures, including: a. The possibility of incorporating both secrecy and authenticity simultaneously, assuming that the system supports both services. b. The re-use of key material in generating signatures algorithmically. c. Nonrepudiation of sending is essentially a built-in service. These features make public-key implementation of signatures very efficient and versatile. 3.1.1 Signing messages. There are several possible protocols for signing in public-key cryptography, depending on the services desired. In the simplest case, A simply signs a document M by sending S = DA(M) to B. If a hash function H is used, A computes S = DA(H(M)) and sends both M and S to B. B validates A's signature S by computing H(M) = EA(S). We also noted earlier that because EA is a trapdoor one- way function, it should not be possible for an intruder to find S' such that H(M) = EA(S') for a given message M. Thus A's signature cannot be forged. Also, if A attempts to repudiate the M sent to B above, B may present M and S to a judge. The judge can access EA and hence can verify H(M) = EA(S); assuming the trapdoor DA is indeed secret, only A could have sent S. Thus a priori we have nonrepudiation (but see below). For both authenticity and secrecy, A could send M' = EB(M,DA(H(M))) to B; B computes (M,DA(H(M))) = DB(M') and then verifies that H(M) = EA(DA(H(M))). The last protocol is illustrated below. A: B: ____________________ ___________________ | | ____\| | | compute H = H(M) | /|\ /| receive M' from A | |____________________| | |___________________| | | | | | | ________\|/_________ | __________\|/___________ | | | | | | compute S = DA(H) | | | compute (M,S) = DB(M') | |____________________| | |________________________| | | | | | | _________\|/__________ | _________\|/_________ | | | | | | compute M' = EB(M,S) | | | compute H' = EA(S) | |______________________| | |_____________________| | | | | | | ________\|/_________ | _________\|/_________ | | | | | | send M' to B | | | verify H' = H(M) | |____________________| | |_____________________| | | | | \|/________________\| / Figure 5. A Protocol for Signing with Hash Function and Secrecy For nonrepudiation in the last case, B retains DB(M') = (M,DA(H(M))). A judge can apply EA to DA(H(M)) and compare to H(M). This again assumes common domains for the E's and D's; in section 4.1 we will note an example of how this point may be treated in practice. In passing we remark that the preceding schemes satisfy another desirable property: in the adjudication process, they do not compromise security by exposing private components to a judge. 3.1.2 The issue of nonrepudiation. Nonrepudiation, i.e., preventing senders from denying they have sent messages, is contingent upon users keeping their private components secret (e.g., [NEED78],[POPE79]). In the above, if DA should be compromised, then A might be able to repudiate messages sent even before the compromise. On the other hand, as in the case of lost or stolen credit cards, A may still be held liable for messages signed before the compromise was reported to a central authority. This is an administrative issue, and ultimately a matter for litigation; hence it is beyond the scope of cryptography per se. However, some partial solutions have been proposed which can be incorporated into protocols. Most of these involve use of some form of arbitrator, as noted in [DEMI83]. For example, in [POPE79], notary public machines are employed as arbitrators. A general protocol for usage of an arbitrator A when U is sending a message M to R is given in [AKL-83]: a. U computes S1 = ER(DU(M)). b. U computes a header m = identifying information. c. U computes S2 = DU(m,S1). d. U sends m and S2 to A. e. A applies EU to S2 and checks the identifying information in m, thereby verifying the origin of M. f. A computes M' = (m,S1,timestamp). g. A computes S' = DA(M'). h. A sends S' to R. As noted above, a copy of S' may also be sent to U. As noted in [POPE78], this type of scheme violates a desirable criterion for network protocols, namely point-to-point communication. It also requires trusted software, which is undesirable. In [BOOT81] the use of a central authority is suggested for this purpose. In this scheme, the receiver of a message sends a copy to the central authority. The latter can attest to the instantaneous validity of the sender's signature; i.e., that it has not been reported that the sender's private component has been compromised at the time of sending. The value of such testimony is mitigated by the necessity of users rapidly reporting key compromise to the central authority. Also, the central authority becomes a bottleneck. Another partial solution involves timestamps ([DENN81], [MERK82b]). This again may involve a network of automated arbitrators, but very simple in nature, since they need merely timestamp messages. In contrast to the notary publics above, however, in [MERK82b] it is suggested that receivers obtain timestamps. If a receiver needs to be sure of the validity of a signature, he may check the validity of the sender's private component by checking with a central authority. As long as the received message is timestamped before the validity check, the receiver is assured of nonrepudiation. To be legally cohesive, however, this system requires the sender to be responsible for signing until a compromise of his private component is reported to the central authority. In analogy to lost credit cards, this may not be a desirable system. Also, it requires an on-line central authority and real-time validity checks and timestamps, which may again create bottlenecks. Furthermore, such schemes require a network-wide clock and are vulnerable to forgery of timestamps, as noted in [BOOT81]. If users are permitted to change their components, a central authority should retain past components for use in disputes which may arise later. Neither compromise nor change of components of a user should cause catastrophic problems in practice, since credit card systems have established precedents for protocols, both administrative and legal. For example, as noted above, conventions have been established for treatment of cases of lost or stolen credit cards; presumably analogous procedures could be established for component compromise. 3.1.3 The issue of proof of delivery. The literature on public-key protocols concentrates on the issues of validity of signatures and the effects of key compromise on nonrepudiation. However, DeMillo and Merritt [DEMI83] note that the inverse of a signature, namely protection against the recipient of a message denying receipt, may also be a desired feature of a system. It is mentioned as an optional security service in [ISO- 87]. Both nonrepudiation and proof of delivery must be implemented via adjudicable protocols, i.e., protocols which can be verified later by a third party. In [DEMI83] it is noted that nonarbitrated adjudicable reception protocols are difficult to design. A simple handshaking protocol might proceed as follows: a. A computes X = EB(DA(M)). b. A sends X to B. c. B computes M = EA(DB(X)). d. B computes Y = EA(DB(M)). e. B acknowledges receipt of M by sending Y to A. If this protocol is standard procedure then A can assume nonreception unless he receives acknowledgement from B. However, to serve as an adjudicable proof-of-reception protocol, B is required to acknowledge anything A sends. Then an intruder C could proceed as follows: a. C intercepts Y = EA(DB(M)). b. C sends Y to A. c. A thinks that this is a legitimate message from C, unrelated to his communication with B. Following protocol: c.1. A computes M' = EC(DA(Y)). c.2. A computes Z = EC(DA(M')). c.3. A acknowledges receipt of M' by sending Z to C. d. C computes EB(DC(EA(DC(Z)))) = EB(DC(M')) = EB(DA(Y)) = M. This of course occurs because of step c, in which A is required by protocol to acknowledge M' = gibberish. This shows that such an automatic protocol may be undesirable. On the other hand, selective acknowledgement might have an adverse effect on adjudicability. 3.2 Hash functions and message digests. We noted that public-key systems generally encrypt more slowly than conventional ciphers such as DES. Other digital signature schemes are also relatively slow in general. Furthermore, some schemes produce signatures comparable in size to, and in some cases larger than, the messages they sign. This results in data expansion and effectively lower bandwidth of transmission. Thus it is usually not desirable to apply a digital signature directly to a long message. On the other hand, we remarked that the entire message must be signed. This is seemingly contradictory, but a heuristic solution can be obtained by using a hash function as an intermediary. A hash function H accepts a variable-size message M as input and outputs a fixed-size representation H(M) of M, sometimes called a message digest [DAVI80]. In general H(M) will be much smaller than M; e.g., H(M) might be 64 or 128 bits, whereas M might be a megabyte or more. A digital signature may be applied to H(M) in relatively quick fashion. That is, H(M) is signed rather than M. Both M and the signed H(M) may be encapsulated in another message which may then be encrypted for secrecy. The receiver may validate the signature on H(M) and then apply the public function H directly to M and check to see that it coincides with the forwarded signed version of H(M). This validates both the authenticity and integrity of M simultaneously. If H(M) were unsigned only integrity would be assured. A hash function can also serve to detect modification of a message, independent of any connection with signatures. That is, it can serve as a cryptographic checksum (also known as an MDC = manipulation detection code or MAC = message authentication code). This may be useful in a case where secrecy and authentication are unimportant but accuracy is paramount. For example, if a key is sent in unencrypted form, even if the key is only a few hundred bits it might be useful to transmit a checksum along with it. Another instance where this case arises is when secrecy is provided by a system such as DES which does not provide a signature capability. An important distinction is that a hash function such as a MAC used in connection with a conventional system is typically parameterized by a secret shared key, although the latter may be distinct from the session key used in transmitting the message and its MAC. In contrast, hash functions used in connection with public-key systems should be public and hence keyless. Earlier we referred to the use of hash functions as a "heuristic" solution to the problem of signing long messages. This refers to the fact that the signature for a message should be unique, but it is theoretically possible that two distinct messages could be compressed into the same message digest (a collision). The security of hash functions thus requires collision avoidance. Collisions cannot be avoided entirely, since in general the number of possible messages will exceed the number of possible outputs of the hash function. However, the probability of collisions must be low. If a function has a reasonably random output, the probability of collisions is determined by the output size. A hash function must meet at least the following minimal requirement to serve the authentication process properly: it must not be computationally feasible to find a message which hashes to the same digest as a given message. Thus, altering a message will change the message digest. This is important to avoid forgery. In many settings this minimal requirement is regarded as too weak. Instead, the requirement is added that it should not be possible to find any two strings which hash to the same value. We return to the distinction between these two criteria in section 3.2.3. 3.2.1 Usage of hash functions. In a public-key system augmented by a hash function H, A might send a message M to B as follows: for simplicity ignore secrecy considerations. Then A sends M and S = DA(H(M)) to B. The latter uses EA to retrieve H(M) from S, then computes H(M) directly and compares the two values for authentication. For nonrepudiation, B retains M, H(M) and S. If A attempts to repudiate M, a judge can use the three items to resolve the dispute as before: he computes H(M) = EA(S) and recomputes H(M) from M. If B could find M' with H(M') = H(M), B could claim A sent M'. A judge receiving M', H(M) and S would reach a false conclusion. 3.2.2 Relation to one-way functions. Merkle [MERK82] defines a hash function F to be a transformation with the following properties: a. F can be applied to an argument of any size. b. F produces a fixed-size output. c. F(x) is relatively easy to compute for any given x. d. For any given y it is computationally infeasible to find x with F(x) = y. e. For any fixed x it is computationally infeasible to find x' þ x with F(x') = F(x). The most important properties are (d) and (e). In particular, (e) guarantees that an alternative message hashing to the same value as a given message cannot be found. This prevents forgery and also permits F to function as a cryptographic checksum for integrity. Actually (e) can be strengthened as we note momentarily. Property (d) states that F is a one-way function (e.g., [DIFF76b]). One-way functions are used in various other contexts as well. For example, we noted earlier that the security of public- key systems depends on the fact that the public transformations are trapdoor one-way functions. Trapdoors permit decoding by recipients. In contrast, hash functions are one-way functions which do not have trapdoors. The concept of (non-trapdoor) one-way functions originated in connection with log-in procedures ([WILK68] p. 91): Needham noted that if passwords were stored encrypted under a one-way function, a list of encrypted passwords would be of no use to an intruder since the original passwords could not be recovered. When a user logged in, the string entered would be encrypted and compared to the stored version for authenticity. Essentially the same system was rediscovered later in [EVAN74]. Usage of one-way functions to produce message digests or encrypt passwords is very different from use of trapdoor one-way functions to generate encryption functions {EA} in a public-key cryptosystem. In the latter, (d) above becomes a dichotomy: it is computationally infeasible for anyone except A to find M from C = EA(M); but it is easy for A, the unique holder of the trapdoor DA, to compute M = DA(C). An example of functions which are not one-way are the private transformations in public-key systems: anyone can solve S = D(M) for M; i.e., M = E(S). This shows that signature schemes are not one-way functions; i.e., they do not satisfy (d) above. On the other hand, a deterministic signature function S satisfies (e) above; i.e., messages have unique signatures. For example, if a signature is generated via S = D(M) where D is a decryption function of a public-key system, then D(M) = D(M') implies E(D(M)) = M = E(D(M')) = M', so (e) is trivially satisfied. Merkle's requirements above are that a hash function must be both one-way (d) and effectively collisionless (e). In order to satisfy (a) and (b) concurrently, collisions must exist; (e) requires that a cryptanalyst should not be able to find them. This notion is amenable to further refinement. 3.2.3 Weak and strong hash functions. The security of hash functions can be refined further: we may distinguish between weak and strong hash functions. A function satisfying (a) - (e) of the previous section may be termed a weak hash function. Property (e) characterizes weak hash functions; it states that a cryptanalyst cannot find a second message producing the same message digest as a fixed message. On the other hand, H may be termed a strong hash function if (a) - (d) still hold, but (e) is modified as follows: it is computationally infeasible to find any {x1,x2} with H(x1) = H(x2). Strong and weak functions may often be obtained from the same class of functions. Strong functions are then characterized by larger message digests, which reduce the probability of collisions. Strong functions are thus more secure, but the longer message digests are likely to increase time needed to hash. Although the two definitions above are superficially similar, computationally they are quite different. For example, suppose H is a hash function with an 80-bit output. Suppose a cryptanalyst starts with a fixed message M and wishes to find a second message M' with H(M) = H(M'). Assuming that the 280 outputs of H are totally random, any candidate M' has a probability of only 2-80 of meeting the required condition. More generally, if the cryptanalyst tries k candidates, the probability that at least one candidate satisfies H(M) = H(M') is 1-(1-2-80)k which is about 2-80k by the binomial theorem if the latter is small. For example, the cryptanalyst will probably have to compute H for about k = 274 = 1022 values of M' to have even one chance in a million of finding one M' which collides with M. Thus H is secure in the weak sense. Suppose for the same H the cryptanalyst merely seeks any values {M1,M2} with H(M1) = H(M2). By example F.1, if he examines H(M) for at least 1.17 * 240 < 2 * 1012 random values of M, the probability of at least one collision exceeds 1/2. A supercomputer could probably find M1 and M2 with H(M1) = H(M2) in at most a few days. Thus H is not secure in the strong sense. The latter attack is called a birthday attack (app. F). It should be noted that finding a collision H(x) = H(y) gives the cryptanalyst no valuable information if x and y are random bit strings. For a purpose such as forgery an adversary may need to generate a large number (e.g. 1012 above) of variations of a message to find two which collide. Inclusion of timestamps or sequence numbers in messages, according to a fixed format, may make it computationally infeasible to find a collision of use to an adversary. 3.3 Digital signatures and certificate-based systems. We previously noted that digital signatures can be employed for authentication in the process of distributing public components in public-key systems. In particular, if the system is certificate- based, the central issuing authority can sign certificates containing public components. The notion of a central authority can be extended to a hierarchical (tree) structure. The central authority serves as the root of the tree; leaves represent users. Intermediate nodes may also be users. Typically the intermediate nodes will be arranged in several levels representing, e.g., organizations and organizational units. Each node of the tree is responsible for authenticating its children. Thus an authentication path is created for each node, ending at the root. For example, the central authority may certify an organization; the organization may certify a unit; the unit may certify an individual user. Certification may be accomplished by having a parent node sign a certificate for the child node. To validate another user's certificate, a user may request the entire authentication path. It is also possible for the tree to be replaced by a forest. For example, in a multinational system there may be a different tree for each country. In this event the roots must certify each other. An authentication path may then pass through two or more roots. More generally, an authentication structure can be an arbitrary directed graph, with a directed edge from A to B if A certifies B. Then authentication paths may be constructed by conjoining directed paths from two users to a common trusted node; an example of this more complex structure is given in section 5.3. 4. Examples of public-key systems and hash functions. Numerous public-key systems have been proposed. These may be divided into several categories: a. Systems which have been broken. b. Systems which are considered secure. b.1. Systems which are of questionable practicality. b.2. Systems which are practical. b.2.1. Systems suitable for key distribution only. b.2.2. Systems suitable for digital signatures only. b.2.3. Systems suitable for key distribution and digital signatures. From the standpoint of sheer numbers, most systems have proven to be insecure. This is unfortunate, since some of the techniques producing insecure systems have relatively high bandwidths, where bandwidth refers to rates of encryption/decryption. Knapsack ciphers are an example (sec. 4.2.1, app. E)) of high-bandwidth but generally insecure systems. Of the systems which have not been broken, many are regarded as impractical for reasons such as large key sizes or large data expansion (ciphertext much larger than plaintext). Only a relative handful of systems are widely-regarded as secure and practical. In particular, a cryptosystem is usually regarded as secure if breaking it is essentially equivalent to solving a long-standing mathematical problem which has defied all attempts at solution. Of the well-known systems which are generally regarded as secure and practical, some are limited to digital signatures and hence unsuitable for public key distribution (e.g. sec. 4.2.2). On the other hand, in instances such as the Diffie/Hellman exponential exchange scheme (sec. 2.2), public key distribution is supported but authentication is not. Such systems may need augmentation by a system which supports signatures. The only well-known system discovered to date which is secure, practical and suitable for both secrecy and authentication is described in section 4.1. Category (b.2.3) above could in theory be further subdivided into systems with relatively low and high bandwidths, but there are no well-known examples with high bandwidths. There does not exist a secure, practical system suitable for key distribution and digital signatures, and with high enough bandwidth to support bulk data encryption. Prospects for creating radically new systems seem dim; in fact, as noted in [DIFF88], most of the extant systems are based on a small number of hard mathematical problems: a. discrete exponentiation. b. knapsack problems. c. factoring. According to Diffie, discrete exponentiation was suggested by J. Gill, and the use of the knapsack problem by Diffie. A suggestion to use factoring was apparently made by Knuth to Diffie during the early years of public-key cryptography, but factoring was not incorporated into a cryptosystem until several years later (sec. 4.1). The knapsack problem is noted briefly in section 4.2.1. The other two mathematical problems above are easy to state (though also hard to solve): a. If p is a given prime and g and M are given integers, find x such that gx ð M (mod p). b. If N is a product of two secret primes: 1. factor N. 2. Given integers M and C, find d such that Md ð C (mod N). 3. Given integers e and C, find M such that Me ð C (mod N). 4. Given an integer x, decide whether there exists an integer y such that x ð y2 (mod N). Gill's suggestion was based on (a), i.e., on the difficulty of computing discrete logarithms modulo a prime. This has generated systems (e.g. sec. 2.2) suitable for key distribution, and others suitable for signatures (e.g. sec. 4.2.2). Exploitation of the difficulty of factoring has proven to be the most versatile approach to design of public-key systems. Factoring is widely-regarded as very difficult. If a modulus has unknown factorization, various problems in modular arithmetic become difficult. For example, discrete logarithm ((b.2) above) is difficult, as is taking roots (b.3) and deciding quadratic residuosity (b.4). These problems have generated the most widely- known public-key system (sec. 4.1) and various others. We remark in passing that the probabilistic scheme mentioned in section 3 and appendix P are based on (b.4); more generally, quadratic residuosity forms the basis of many zero-knowledge schemes. Diffie's knapsack suggestion is one of many attempts to utilize NP-complete problems for cryptosystems. Such attempts have met with very limited success. We return to this topic and further discussion of the above problems later. In the meantime we discuss a few of the most well-known public-key systems, along with some hash functions. 4.1 The RSA public-key scheme. Rivest, Shamir and Adleman [RIVE78] obtained the best-known and most versatile public-key cryptosystem. It supports both secrecy and authentication, and hence can provide complete and self- contained support for public key distribution and signatures. A user chooses primes p and q and computes n = p * q and m = (p- 1)(q-1). He then chooses e to be an integer in [1,m-1] with GCD(e,m) = 1. He finds the multiplicative inverse of e, modulo m; i.e. he finds (app. H) d in [1,m-1] with e * d ð 1 (mod m). Now n and e are public; d, p, q and m are secret. That is, for a user A, nA and eA constitute the public component, and dA, pA, qA the private component. After a user has computed p,q,e,d, the private transformation D and public transformation E are defined by (see app. M): E(M) = Me mod n, D(C) = Cd mod n. In the above, M and C are in [0,n-1]. As in section 1.5 we have D(E(M)) = M and E(D(C)) = C; i.e. D and E are inverses. Since d is private, so is D; and since e and n are public, so is E. This constitutes a cryptosystem which can be used for both secrecy and authentication. That is, for secrecy, A sends EB(M) to B as usual; for authentication, A sends DA(M) as usual. For both secrecy and authentication, suppose first that message digests are not employed. Assuming nA ó nB, A computes C = EB(DA(M)) and sends C to B. Then B recovers M as usual by M = EA(DB(EB(DA(M)))). This process is illustrated below: A: B: ___________________________ __________________ | | | | | compute M' = M**eA mod nA | |----->| receive C from A | |___________________________| | |__________________| | | | | | | _____________\|/___________ | ____________\|/____________ | | | | | | compute C = M'**dB mod nB | | | compute M' = C**eB mod nB | |___________________________| | |___________________________| | | | | | | __________\|/__________ | ____________\|/____________ | | | | | | send C to B | | | compute M = M'**dA mod nA | |_______________________| | |___________________________| | | | | \|/_____________\| / Figure 6. Using RSA for Authenticity and Secrecy As noted in section 3.1.1, for nonrepudiation B may retain DA(M). This implementation only works because the range of DA is a subset of the domain of EB; i.e. [0,nA-1] is a subset of [0,nB-1]. In the event that nA ò nB, Kohnfelder [KOHN78b] notes that A can instead transmit C' = DA(EB(M)). Then B can recover M as M = DB(EA(DA(EB(M)))). This works since the range of EB is a subset of the domain of DA. For adjudication of possible disputes, B must retain C' and M. Then to prove that A sent M, the judge can compute EA(C') and EB(M) and test for equality. However, in [DAVI83] and [DENN83b] it is noted that there is a disadvantage to this solution: A signs EB(M) rather than M. In section 3.1.1 the judge was able to apply EA to the stored quantity DA(M) and M is obtained. In Kohnfelder's protocol both C' and M must be stored, doubling the storage requirement. The use of message digests eliminates the preceding problem. Suppose H is a hash function. Then to send M to B, A can create a new message M' containing M and DA(H(M)) and send C = EB(M') to B. This gives both secrecy and authentication; also, C may be retained for nonrepudiation. Moreover, M and DA(H(M)) are reblocked in forming M', so that no problem arises from the sizes of nA and nb. 4.1.1 Choice of p and q. To use the RSA system, a user must first choose primes p and q. As noted in section 4.1.3, p and q should at least 75 to 80 decimal digits; for the sake of discussion we assume p and q to be on the order of about 100 digits. The user begins by choosing a random odd b of around 100 digits; b is a candidate for p. Now b is acceptable only if b is prime. There are two approaches to this problem. The deterministic approach decides with certainty whether b is prime (see below). An alternative is to use the probabilistic approach as implemented, e.g., by Solovay and Strassen [SOLO77]: randomly choose a set of about 100 numbers in [1,b-1]. For each number a in the set, test to see if GCD(a,b) = 1, where GCD = greatest common divisor. If this condition holds, compute the Jacobi symbol (a/b) (app. J). Both GCDs and Jacobi symbols are easy to compute via well-known algorithms (app. H,J). Now check to see if (a/b) ð a(b-1)/2 (mod b). If b is not prime, this check will fail with probability at least 1/2 (app. L). Thus, if 100 random a's are tested and all pass the above test, the probability of b not being prime is about 1 in 2100. Hence it is possible to test in polynomial time whether b is probably a prime. The probability of b being prime is about 1 in 115; hence repeated applications of the preceding algorithm will probably quickly yield b which is probably a prime. This can be taken to be p; a second application of the algorithm will yield q. The Solovay/Strassen algorithm is discussed further in appendix L, which also mentions alternative algorithms of Lehman [LEHM82] and Miller and Rabin ([MILL76],[RABI80]). The advantage of such probabilistic algorithms is that the algorithms run in polynomial time. They do not guarantee a correct answer, but the possibility of error is negligible as noted above. If absolute certainty were required, deterministic algorithms for primality testing could be used ([ADLE83],[COHE87],[COHE84]). These guarantee primality, but have runtimes of the form exp(c(log log n)(log log log n)). If a sufficient amount of computing power is available, primality of 100-digit numbers can usually be established deterministically in minutes. However, the code needed is complicated; moreover, most users will not have access to high- performance computers. The probabilistic algorithms have acceptable run times even on small computers, particularly if hardware support is available. 4.1.2 Further notes on implementation. Once p and q have been chosen, e is chosen to be relatively prime to m = (p-1)(q-1); i.e. GCD(e,m) = 1. For example, e can be chosen to be a random small prime > 2. If e is relatively small, E(M) can be computed relatively rapidly; i.e. only a small number of modular multiplications are needed. The condition GCD(e,m) = 1 presents no problem; in fact the probability that two random numbers are relatively prime is about 3/5 ([KNUT81] p. 324). Now d can be found in polynomial time; however, d may be large. Hence a version of the Chinese Remainder Theorem is employed for decryption (app. M). There are some restrictions on p and q in addition to the size specification mentioned above. Some of these are noted in section 4.1.3.1. Also, the time and space requirements of RSA are considerably larger than, e.g., DES. Storage for each of p, q, and d will require at least 300 bits; n is about 600 bits. Only e can be chosen to be small. Exponentiation mod n is a relatively slow operation for large n, especially in software. Efficient algorithms for computing products mod n have been given, e.g., in [BLAK83], [BRIC82], [MONT85] and [QUIS82]. In particular, [BRIC82] shows how multiplication mod n can be done in m+7 clock pulses if n is m bits. In [MONT85] it is shown how modular multiplication can avoid division. At the present time RSA decryption (encryption is slower) with a 508-bit modulus has been performed at about 225 Kbits/sec in hardware [SHAN90]. Decryption with a 512-bit modulus has been effected at about 11 Kbits/sec in software [DUSS90]. 4.1.3 Security of RSA. The security of the private components in the RSA cryptosystem depends on the difficulty of factoring large integers. No efficient algorithms are known for this problem. Consequently, knowing n = p * q does not yield p and q. Thus it is computationally infeasible to find (p-1)(q-1) = m, and hence the relation e * d ð 1 (mod m) is useless for determining d from e. The difficulty of computing discrete logarithms (cf. sec. 1.5) deflects known-plaintext attacks. Suppose an intruder knows M, C and M = D(C) for some plaintext/ciphertext pair {M,C}. To find d he must solve M = Cd mod n; i.e., he must find a discrete logarithm. Ciphertext-only attacks are equivalent to taking roots modulo a composite number with unknown factorization, i.e. solving C = Me mod n for M. This is probably about as difficult as discrete logarithms, even in the case of square roots (e.g., [ADLE86]). In fact (lem. N.3.2; cf. [KNUT81] p. 389, [RABI79]), taking square roots modulo such n is, loosely speaking, as hard as factoring n. It should be noted that the security of RSA is not provably equivalent to the difficulty of factoring or the other problems mentioned here. Also, security of RSA or other public-key systems does not preclude breaking specific protocols for their use; see e.g. [MOOR88] or [DOLE81]. An example of a protocol failure is given in [MOOR88]: suppose two users use a common modulus n and encryption exponents e and e' with GCD(e,e') = 1. If the same message M is sent to both, then let C = Me mod n and C' = Me' mod n. If both C and C' are intercepted, an intruder can find (app. H) r and s with r * e + s * e' = 1; now M = Cr * C's mod n. 4.1.3.1 Restrictions on p and q. There are two major restrictions on p and q in order to foil attempts at factoring n = p * q: a. p and q should be about the same length. b. p and q should be at least 75 digits in length. Present factoring methods will be foiled by such a choice. For longer-term security, p and q should be, e.g., around 100 digits. Condition (a) is needed against the elliptic curve method (see below). An attempt to predict the period of time for which a given modulus length will remain secure is necessarily guesswork. Intelligent guesses are the best that can be hoped for by implementors; these are connected with the status of progress in factoring, which is highly dynamic. 4.1.3.2 Notes on factoring. Choosing n to be about 200 digits will foil any present attempt at factoring n. The most powerful general-purpose factoring algorithm is Pomerance's quadratic sieve [POME84] and its variations. Using a parallel version of the quadratic sieve, Caron and Silverman [CARO87] factored 90-digit integers in several weeks on a network of several dozen Sun-3's. Recently, 106-digit integers have been factored on a larger network [LENS89], using the multiple polynomial variant of the quadratic sieve. Recent announcements of work in progress indicate that integers of around 116 digits may be factored shortly, using the same network used in the 106-digit case and another variant of the quadratic sieve. Pomerance, Smith and Tuler [POME88] discuss a prototype of a special pipelined architecture, optimized for the quadratic sieve, which they claim is extensible to a machine which might factor 125- digit integers in a year. However, this presents no threat to 200- digit moduli, since all of the best general-purpose factoring algorithms known, including the quadratic sieve, run in time roughly exp(((log n)(log log n))1/2). The fastest present computers only execute on the order of 1010 operations per second, at a cost of $10 million or more. Even if a future computer reached 1012 operations per second (and presumably a cost exceeding $100 million) it would take on the order of 1000 years to factor a 200-digit number using existing algorithms. The strongest results on factoring obtained to date have utilized networks of computers. The power of such networks is more difficult to extrapolate than for supercomputers, since they are expandable and the power of personal computers and workstations has been rising at a higher rate than at the supercomputer level. In theory it would be preferable to know that a cryptosystem would be immune to an attack by an assemblage of every computer in the universe. It seems difficult to estimate the total amount of computing power this would bring to bear; furthermore, the question in practice is what fraction of this total amount can be realistically allotted to a given factoring problem (cf. app. B). It follows that implementors may have a high level of confidence in 200-digit moduli at the present. An interesting issue at the time of this writing is whether a modulus with a length between 150 and 200 digits will provide security for a given length of time. It seems certain that 154-digit (i.e., 512-bit) integers will be factored eventually, but the question is when. If the preceding runtime estimate is used, the conclusion is that 154 digits is likely to be secure until the late 1990s, barring major algorithmic advances (cf. app. B). RSA moduli of around 512 bits are generally preferable to 200 digits (around 660 bits) from a hardware- implementation standpoint. A potential qualifier to the preceding analysis is that the above runtime estimate is generic; in particular it assumes that runtime is essentially independent of the integer being factored. However, there are algorithms which have the above as their worst- case time. For example, the Schnorr/Lenstra algorithm [SCHN84] factors n by using quadratic forms with discriminant -n. The equivalence classes of such forms form a group (the class group) whose order is denoted by h(-n). The algorithm factors n quickly if h(-n) is free of large prime divisors. The algorithm is Monte Carlo in the sense that the n's which will be factored quickly are evidently fairly random. No algorithms are known to generate class numbers with large prime divisors, although interestingly RSA comes close: class numbers h(-n) of discriminants n with one or two prime factors have a higher probability of having large prime factors than the class numbers of discriminants with only small prime factors. The net result is that, e.g., perhaps one in 1000 n's will factor 1000 times more quickly than average. If this method were practicable it would argue for the 200-digit modulus in RSA. However, the Monte Carlo phenomenon has not produced noticeably lower factoring times in practice as yet. Some other factoring methods are applicable to numbers which do not have the form required for an RSA modulus. For example, the elliptic curve method [LENS87] is oriented to integers whose second-largest prime factor is considerably smaller than the largest prime factor. This motivates the condition that an RSA modulus should have factors roughly equal in size. A recent algorithm, the number field sieve [LENS90], applies to integers of the form rd + s or rd - s where r and s are small. Again this will not affect properly-chosen RSA moduli. 4.1.4 Low-exponent versions of RSA. A priori the encryption exponent e in the RSA scheme is arbitrary. However, it need not be random. It has been suggested that e be chosen to have a common value by all users of an RSA- based system. Using e = 2 is not feasible per se, since 2 is not relatively prime to p-1 or q-1. However, extensions of this type have been made. These are of some theoretical interest, since Rabin [RABI79] and Williams [WILL80] have noted modifications of RSA with exponent 2 for which successful ciphertext-only attacks are essentially as difficult as factoring the modulus (app. N). The exponent 3 can be used directly with RSA. It has the advantage of greatly simplifying the encryption (but not the decryption) process. In fact Knuth [KNUT81], Rivest [RIVE84] and others recommend its usage. However, 3 and other very low exponents suffer from some flaws. The case e = 3 is an illustration. For one thing, as Knuth notes, users must make certain that each block M to be encrypted satisfies M3 >> n. This is because C = M3 mod n becomes simply C = M3 if M3 < n; in this event finding M reduces to the trivial problem of taking ordinary cube roots of integers. Thus the use of e = 3 causes the domain of E to be a subset of [0,n) rather than the whole interval as would be preferable. A related but more subtle flaw has also been noted if, e.g., e = 3 [HAST88]. Suppose A sends the same message M to each of {Bi}, i = 1,2,3. Suppose Bi uses modulus ni, and M < ni for each i (this will be true, e.g., if the sender uses a modulus smaller than each of the {ni}). Assuming that each of {n1,n2,n3} is generated as a product of two random primes, the probability of duplicates among the 6 primes used is near zero. Hence it may be safely assumed that the {ni} are pairwise relatively prime. Let Ci = M3 mod ni. Suppose all 3 {Ci} are intercepted. Using the Chinese Remainder Theorem (app. I), the intruder can find x in [0,n'), n' = n1 * n2 * n3, with x ð Ci (mod ni), i = 1,2,3. Thus x ð M3 (mod n'). But M3 < n', so M3 = x, so M = x1/3 (ordinary cube root). Hence the plaintext M can be easily recovered from the three ciphertexts. Thus the use of e = 3 or other small exponents makes RSA vulnerable to ciphertext-only attacks. The sender may attempt to modify M for each recipient to deflect this attack, but Hastad shows that more generally, a low exponent is vulnerable to linear dependencies in portions of messages. 4.2 Other public-key systems. Several public-key systems other than RSA have been proposed. These are briefly surveyed here. As noted earlier, most of these are either insecure, of questionable practicality, or limited to one of signatures/secrecy but not both. In some cases their security and/or practicality has not been established. None of these systems rivals RSA if a combination of versatility, security and practicality is the criterion. However, this does not preclude the use of the secure systems for specific applications such as digital signatures. 4.2.1 Knapsack systems. These were proposed by Merkle and Hellman [MERK78b]. However, the essence is the use of NP-complete problems (e.g., [GARE79],[HORO78]) which had been suggested for use in designing public-key systems in [DIFF76b]. The Merkle/Hellman approach was to employ a superincreasing sequence {ai}, i.e., a sequence of positive integers for which ai > ai-1 + ... + a1. The associated knapsack problem [HORO78] is to find nonnegative integers {Mi} such that for a given Y, Y = a1 * M1 + ... + an * Mn. The above is an instance of integer programming. In the present context the {Mi} are binary; thus the above reduces to sum-of- subsets. Solving the above is easy for superincreasing {ai}; in fact the solution is unique. However, even deciding existence of a solution of sum-of-subsets, or more generally integer programming, is NP-complete ([GARE79], pp. 223, 245). Now for any w and u satisfying u > a1 + ... + an, GCD(u,w) = 1, a disguised knapsack problem may be produced from {ai} by defining bi = w * ai mod u. Then the associated knapsack problem C = b1 * M1 + ... + bn * Mn appears to a cryptanalyst to be hard since the {bi} appear random. In actuality, it is easily solved using the connection with the {ai}: since GCD(w,u) = 1, there exists W (app. H) such that w * W ð 1 (mod u). Then W * C ð a1 * M1 + ... + an * Mn (mod u). Since 0 ó Mi ó 1, 0 ó a1 * M1 + ... + an * Mn < u, from which W * C = a1 * M1 + ... + an * Mn. As noted above, the latter knapsack problem has an easily-found unique solution, from which C may be retrieved. This is called a trapdoor; it permits a legitimate user to easily solve what appears to be a hard problem. The above modular disguise can be iterated; i.e., new w' and u' chosen and the {bi} disguised, etc. The trapdoor knapsack yields a public-key system: if the binary representation of plaintext M is Mn...M1, let E(M) = b1 * M1 + ... + bn * Mn. If C = E(M) then decrypting C is equivalent to solving what appears to be a general knapsack problem, although decryption is easy using the trapdoor. The public component is {bi} and the private component is u, w and {ai} (see [MERK78b] for details). The advantage is that the arithmetic is much quicker than in RSA: about 200 additions are needed, as opposed to about 500 squarings and 150 multiplications in RSA. In fact knapsacks may rival DES in speed [HENR81]. Unfortunately the above knapsack and most variations on the above have been broken (Appendix E). It should also be noted that even if a knapsack approach proves to be secure and practical, such schemes are generally limited to support for either secrecy or authentication but not both. In contrast to RSA, the encryption and decryption functions are not inverses, although D(E(M)) = M. A trivial example suffices to show this: suppose E(M) = 2M1 + 3M2. To be invertible a function must be both injective and surjective. However, E is not surjective (onto). For example there is no M such that E(M) = 1; hence D(1) is undefined. Thus we could not employ D for signatures as in RSA, since, e.g., 1 could not be signed by computing D(1). 4.2.2 The ElGamal signature scheme. A signature scheme derived from a modification of exponentiation ciphers was proposed by ElGamal [ELGA85]. First of all a prime p and a base a which is a primitive root modulo p (app. K) are public. Now suppose A wishes to send a signed message to B. The private component of A consists of two portions, one fixed and the other message-dependent. The fixed portion is a random x(A) from [1,p-2]. The public component of A also has fixed and message-dependent components. The fixed portion is y(A) = ax(A) mod p. Now for a given message M in [0,p-1], A chooses a secret k in [0,p-1] with GCD(k,p-1) = 1. Thus k is the message-dependent portion of A's private component. Also, A finds (app. H) I with k * I ð 1 (mod (p-1)). The message-dependent portion of A's public component consists of r and s, where r = ak mod p, s ð I * (M - r * x(A)) (mod (p-1)). Now A sends M, r and s to B. For authentication B computes C = aM mod p, C' = y(A)r * rs mod p. We note that M ð r * x(A) + k * s (mod (p-1)). Hence if M is authentic and valid, C = (ax(a))r * (ak)s mod p = C'. The ElGamal algorithm is illustrated below. A: B: ____________________ _________________ | | _____\| | | choose k | /|\ /| receive (M,r,s) | |____________________| | |_________________| | | | | | | _________\|/__________ | __________\|/___________ | | | | | | find I*k ð 1 mod p-1 | | | compute C = a**M mod p | |______________________| | |________________________| | | | | | | __________\|/___________ | _________\|/_________ | | | | | | compute r = a**k mod p | | | compute C' = yA**r | |________________________| | | * r**s mod p | | | |_____________________| | | | | | | __________\|/___________ | | | | | | | compute s = I*(M-r*xA) | | | | mod p-1 | | | |________________________| | | | | | | | | ________\|/_________ | _________\|/_________ | | | | | | send (M,r,s) to B | | | verify C' = C | |____________________| | |_____________________| | | | | \|/________________\| / Figure 7. The ElGamal Signature Algorithm A different k must be chosen for each M; using any k twice determines x(A) uniquely. The security of the method then depends mainly on the difficulty of computing discrete logarithms in GF(p): suppose an intruder intercepts M, r and s; a and p are public. Let d = ar mod p and u * rs ð 1 (mod p). Then the intruder knows aM ð y(A)r * rs (mod p). Hence u * aM ð y(A)r (mod p). Thus dx(A) ð (ax(A))r ð y(A)r (mod p). Finally dx(A) ð u * aM (mod p). Finding x(A), which permits the intruder to masquerade as A, is thus equivalent to the above discrete logarithm problem. It is easy to solve this problem if p-1 has only small factors [POHL78], hence p-1 should have a large prime factor as in RSA. An alternative approach is to seek k and m satisfying M = r * x(A) + s * k + (p-1) * m, but this is underdetermined, so that there are an exponential number of possible solutions to check. A third approach at cryptanalysis seeks r and s satisfying aM ð y(A)r * rs (mod p). This is presumably easier than discrete logarithm since r and s can both be varied, but it is not clear how much easier. In comparing this scheme to RSA, we note that both employ exponentiation, so that their speeds of encryption/decryption should be comparable. Generating keys in the two methods is similar; finding appropriate primes is the main step in either. Security of ElGamal is more difficult to assess. The major attack against RSA, i.e., factorization, is a problem which has been studied for centuries. It is safe to say that far less time has been invested in attempts to cryptanalyze the ElGamal scheme. Cryptanalytically, the last attack (varying r and s simultaneously) may have a complexity substantially lower than factoring or discrete logarithm; hence security of ElGamal is at best no better than RSA, and possibly much inferior, with respect to secrecy of components (it must be emphasized that this is a conservative characterization; it could also turn out that no substantial advantage is gained by varying r and s simultaneously). The use of message-dependent portions of components is a plus and minus: it increases bookkeeping, and makes it more difficult for a third party to adjudicate a dispute. On the other hand it may produce greater security against certain types of attacks. In fact the k above could be chosen differently for different blocks of one message. In passing we note that, like knapsacks, this scheme goes in one direction only: we have in effect M = E(r,s) = (x(A) * r + k * s) mod (p-1). This maps the ordered pair (r,s) to a unique M. The condition GCD(k,p-1) = 1 guarantees that an (r,s) can always be found; i.e., E is surjective. But it is not injective (one-to-one): e.g., if p = 7, k = 5, x(A) = 5, then E(1,2) = E(2,1) = 3. Thus text M cannot be represented uniquely by a pair (r,s), which would lead to ambiguity in an attempt to use this scheme in two directions. Also, ElGamal notes that extension to GF(pn) (app. G) is feasible. However, it is not clear that extensions to finite fields are desirable (cf. sec. 1.5); gains in efficiency may be offset by lower security for comparable key sizes. 4.3 Examples of hash functions. To be useful in conjunction with public-key systems, a hash function should ideally be keyless. This is in contradistinction to message authentication codes used in connection with secret- key systems. Several hash functions have been proposed for use in public-key settings. A few are mentioned here, but it seems to be difficult to produce secure, keyless, efficient hash functions. Thus we include some examples of functions which have keys, are insecure, or both. 4.3.1 Merkle's meta-method. Merkle ([MERK82],[MERK89]) has proposed a general technique for obtaining hash functions and digital signatures from existing conventional cryptosystems such as DES. Although secret-key systems are used as adjuncts, the resulting hash f