Diffstat (limited to 'botan/doc/api.tex')
| -rw-r--r-- | botan/doc/api.tex | 3103 |
1 files changed, 0 insertions, 3103 deletions
diff --git a/botan/doc/api.tex b/botan/doc/api.tex deleted file mode 100644 index 556e76a..0000000 --- a/botan/doc/api.tex +++ /dev/null @@ -1,3103 +0,0 @@ -\documentclass{article} - -\setlength{\textwidth}{6.5in} -\setlength{\textheight}{9in} - -\setlength{\headheight}{0in} -\setlength{\topmargin}{0in} -\setlength{\headsep}{0in} - -\setlength{\oddsidemargin}{0in} -\setlength{\evensidemargin}{0in} - -\title{\textbf{Botan API Reference}} -\author{} -\date{2009/2/19} - -\newcommand{\filename}[1]{\texttt{#1}} -\newcommand{\manpage}[2]{\texttt{#1}(#2)} - -\newcommand{\macro}[1]{\texttt{#1}} - -\newcommand{\function}[1]{\textbf{#1}} -\newcommand{\keyword}[1]{\texttt{#1}} -\newcommand{\type}[1]{\texttt{#1}} -\renewcommand{\arg}[1]{\textsl{#1}} -\newcommand{\namespace}[1]{\texttt{#1}} - -\newcommand{\url}[1]{\texttt{#1}} - -\newcommand{\ie}[0]{\emph{i.e.}} -\newcommand{\eg}[0]{\emph{e.g.}} - -\begin{document} - -\maketitle - -\tableofcontents - -\parskip=5pt - -\pagebreak -\section{Introduction} - -Botan is a C++ library that attempts to provide the most common -cryptographic algorithms and operations in an easy to use, efficient, -and portable way. It runs on a wide variety of systems, and can be -used with a number of different compilers. - -The base library is written in ISO C++, so it can be ported with -minimal fuss, but Botan also supports a modules system. This system -exposes system dependent code to the library through portable -interfaces, extending the set of services available to users. - -\subsection{Targets} - -Botan's primary targets (system-wise) are 32 and 64-bit CPUs, with a -flat memory address space of at least 32 bits. Generally, given the -choice between optimizing for 32-bit systems and 64-bit systems, Botan -is written to prefer 64-bit, simply on the theory that where -performance is a real concern, modern 64-bit processors are the -obvious choice. 
However, in most cases this is not an issue, as many -algorithms are specified in terms of 32-bit operations precisely to -target commodity processors. - -Smaller handhelds, set-top boxes, and the bigger smart phones and smart -cards are also capable of using Botan. However, Botan uses a fairly -large amount of code space (up to several megabytes, depending upon -the compiler and options used), which could be prohibitive in some -systems. Usage of RAM is fairly modest, usually under 64K. - -Botan's design makes it quite easy to remove unused algorithms in such -a way that applications do not need to be recompiled to work, even -applications that use the algorithms in question. They can simply ask -Botan if the algorithm exists, and if Botan says yes, ask the library -for an object implementing that algorithm. - -\subsection{Why Botan?} - -Botan may be the perfect choice for your application. Or it might be a -terribly bad idea. This section will make clear what Botan is -and is not. - -First, let's cover the major strengths: - -\begin{list}{$\cdot$} - \item Support is (usually) quickly available on the project mailing lists. - Commercial support licenses are available for those that desire them. - - \item Is written in a (fairly) clean object-oriented style, and the usual - API works in terms of reasonably high-level abstractions. - - \item Supports a huge variety of algorithms, including most of the major - public key algorithms and standards (such as IEEE 1363, PKCS, and - X.509v3). - - \item Supports a name-based lookup scheme, so you can get a hold of any - algorithm on the fly. - - \item You can easily extend much of the system at application compile time or - at run time. - - \item Works well with a wide variety of compilers, operating systems, and - CPUs, and more all the time. 
- - \item Is the only open source crypto library (that I know of) that has - support for memory allocation techniques that prevent an attacker from - reading swap in an attempt to gain access to keys or other secrets. In - fact, several different such methods are supported, depending on the - system (two methods for Unix, another for Windows). - - \item Has (optional) support for Zlib and Bzip2 compression/decompression - integrated completely into the system -- it only takes a line or two of - code to add compression to your application. -\end{list} - -\noindent -And the major downsides and deficiencies are: - -\begin{list}{$\cdot$} - \item It's written in C++. If your application isn't, Botan is probably - going to be more pain than it's worth. - - \item Botan doesn't directly support higher-level protocols and - formats like SSL or OpenPGP. SSH support is available from a - third party, and there is an alpha-level SSL/TLS library - currently available. - - \item Doesn't currently support any very high level 'envelope' style - processing; support for this will probably be added once support for - CMS is available, so code using the high level interface will produce - data readable by many other libraries. -\end{list} - -\pagebreak -\section{Getting Started} - -\subsection{Basic Conventions} - -With a very small number of exceptions, declarations in the library -are contained within the namespace \namespace{Botan}. Botan declares -several typedef'ed types to help buffer it against changes in machine -architecture. These types are used extensively in the interface, -so it is often convenient to use them without the -\namespace{Botan} prefix. You can do so by \keyword{using} the -namespace \namespace{Botan::types} (this way you can use the type -names without the namespace prefix, but the remainder of the library -stays out of the global namespace). The included types are \type{byte} -and \type{u32bit}, which are unsigned integer types. 
- -The headers for Botan are usually available in the form -\filename{botan/headername.h}. For brevity in this documentation, -headers are always just called \filename{headername.h}, but they -should be used with the \filename{botan/} prefix in your actual code. - -\subsection{Initializing the Library} - -There is a set of core services that the library needs access to -while it is performing requests. To ensure these are set up, you must -create a \type{LibraryInitializer} object (usually called 'init' in -Botan example code; 'botan\_library' or 'botan\_init' may make more -sense in real applications) prior to making any calls to Botan. This -object's lifetime must exceed that of all other Botan objects your -application creates; for this reason the best place to create the -\type{LibraryInitializer} is at the start of your \function{main} -function, since this guarantees that it will be created first and -destroyed last (via standard C++ RAII rules). The initializer does -things like setting up the memory allocation system and algorithm -lookup tables, finding out if there is a high resolution timer -available to use, and similar matters. With no arguments, the -library is initialized with various default settings. So most of the -time (unless you are writing threaded code; see below), all you need -is: - -\texttt{Botan::LibraryInitializer init;} - -at the start of your \texttt{main}. - -The constructor takes an optional string that specifies arguments. -Currently the only possible argument is ``thread\_safe'', which must -have a Boolean argument (for instance ``thread\_safe=false'' or -``thread\_safe=true''). If ``thread\_safe'' is specified as true, the -library will attempt to register a mutex type to properly guard access -to shared resources. However, these locks do not protect individual -Botan objects: explicit locking must be used for those. 
- -If you do not create a \type{LibraryInitializer} object, pretty much -any Botan operation will fail, because it will be unable to do basic -things like allocate memory or get random bits. Note too, that you -should be careful to only create one such object. - -It is not strictly necessary to create a \type{LibraryInitializer}; -the actual code performing the initialization and shutdown are in -static member functions of \type{LibraryInitializer}, called -\function{initialize} and \function{deinitialize}. A -\type{LibraryInitializer} merely provides a convenient RAII wrapper -for the operations (thus for the internal library state as well). - -\subsection{Pitfalls} - -There are a few things to watch out for to prevent problems when using Botan. - -Never allocate any kind of Botan object globally. The problem with -doing this is that the constructor for such an object will be called -before the library is initialized. Many Botan objects will, in their -constructor, make one or more calls into the library global state -object. Access to this object is checked, so an exception should be -thrown (rather than a memory access violation or undetected -uninitialized object access). A rough equivalent that will work is to -keep a global pointer to the object, initializing it after creating -your \type{LibraryInitializer}. Merely making the -\type{LibraryInitializer} also global will probably not help, because -C++ does not make very strong guarantees about the order that such -objects will be created. - -The same rule applies for making sure the destructors of all your -Botan objects are called before the \type{LibraryInitializer} is -destroyed. This implies you can't have static variables that are Botan -objects inside functions or classes (since in most C++ runtimes, these -objects will be destroyed after main has returned). This is inelegant, -but seems to not cause many problems in practice. 
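The lifetime rules above can be illustrated with a small self-contained sketch. It does not use Botan: \type{ToyInitializer} and \type{ToyObject} are hypothetical stand-ins for \type{LibraryInitializer} and for an arbitrary Botan object, and the sketch only models the ordering constraints (construct after initialization, destroy before shutdown), assuming the checked-access behavior described above.

\begin{verbatim}
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for the library's global state (not Botan code).
static bool g_library_ready = false;

// Stand-in for LibraryInitializer: an RAII wrapper around init/shutdown.
struct ToyInitializer
   {
   ToyInitializer()  { g_library_ready = true; }
   ~ToyInitializer() { g_library_ready = false; }
   };

// Stand-in for a library object whose constructor needs the global state.
struct ToyObject
   {
   ToyObject()
      {
      if(!g_library_ready)
         throw std::runtime_error("library not initialized");
      }
   };

// A global *pointer* is fine: no ToyObject is constructed before main runs.
static ToyObject* g_object = 0;

// Models the body of main(); returns true if the object could be created.
bool run_application()
   {
   ToyInitializer init;        // created first, destroyed last
   g_object = new ToyObject;   // safe: the library is ready
   const bool ok = (g_object != 0);
   delete g_object;            // destroy before 'init' goes out of scope
   g_object = 0;
   return ok;
   }

// Models what a global ToyObject would do: construct with no initializer.
bool constructing_too_early_throws()
   {
   try { ToyObject early; }
   catch(std::exception&) { return true; }
   return false;
   }
\end{verbatim}

In a real program the body of \function{run\_application} would sit directly in \function{main}, with the initializer as its first local variable.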
- -Botan's memory object classes (\type{MemoryVector}, -\type{SecureVector}, \type{SecureBuffer}) are extremely primitive, and -do not (currently) meet the requirements for an STL container -object. After Botan starts adopting C++0x features, they will be -replaced by typedefs of \type{std::vector} with a custom allocator. - -Use a \function{try}/\function{catch} block inside your -\function{main} function, and catch any \type{std::exception} that is thrown -(remember to catch by reference, as \type{std::exception}'s -\function{what} method is polymorphic). This is not strictly required, -but if you don't, and Botan throws an exception, the runtime will call -\function{std::terminate}, which usually calls \function{abort} or -something like it, leaving you (or worse, a user of your application) -wondering what went wrong. - -\subsection{Information Flow: Pipes and Filters} - -Many common uses of cryptography involve processing one or more -streams of data (be it from sockets, files, or a hardware device). -Botan provides services that make it easy to set up data flows through -various operations, such as compression, encryption, and base64 -encoding. Each of these operations is implemented in what are called -\emph{filters} in Botan. A set of filters is created and placed into -a \emph{pipe}, and information ``flows'' through the pipe until it -reaches the end, where the output is collected for retrieval. If -you're familiar with the Unix shell environment, this design will -sound quite familiar. 
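As a rough mental model of this design (not Botan's implementation), a pipe can be thought of as an ordered list of transformations applied one after another. The sketch below uses hypothetical names (\type{ToyFilter}, \function{run\_pipe}) and plain \type{std::string} buffers; the two toy filters are chosen only because they are easy to check by hand.

\begin{verbatim}
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical filter type: a function from input bytes to output bytes.
typedef std::string (*ToyFilter)(const std::string&);

// Toy filter: encode each byte as two uppercase hex digits.
std::string hex_encode(const std::string& in)
   {
   static const char digits[] = "0123456789ABCDEF";
   std::string out;
   for(std::size_t i = 0; i != in.size(); ++i)
      {
      const unsigned char c = static_cast<unsigned char>(in[i]);
      out += digits[c >> 4];
      out += digits[c & 0x0F];
      }
   return out;
   }

// Toy filter: reverse the order of the bytes.
std::string reverse_bytes(const std::string& in)
   {
   return std::string(in.rbegin(), in.rend());
   }

// Toy "pipe": feed the input through each filter in order and
// return whatever comes out of the last one.
std::string run_pipe(const std::vector<ToyFilter>& filters,
                     const std::string& input)
   {
   std::string data = input;
   for(std::size_t i = 0; i != filters.size(); ++i)
      data = filters[i](data);
   return data;
   }
\end{verbatim}

The order of the filters matters: reversing and then hex encoding gives a different result from hex encoding and then reversing, just as the order of commands matters in a shell pipeline. With no filters at all, the input comes back unchanged.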
- -Here is an example that uses a pipe to base64 encode some strings: - -\begin{verbatim} - Pipe pipe(new Base64_Encoder); // pipe owns the pointer - pipe.start_msg(); - pipe.write("message 1"); - pipe.end_msg(); // flushes buffers, increments message number - - // process_msg(x) is start_msg(); write(x); end_msg(); - pipe.process_msg("message 2"); - - std::string m1 = pipe.read_all_as_string(0); // base64 of "message 1" - std::string m2 = pipe.read_all_as_string(1); // base64 of "message 2" -\end{verbatim} - -Bytestreams in the pipe are grouped into messages: blocks of data that -are processed in an identical fashion (\ie, with the same sequence of -\type{Filter}s). Messages are delimited by calls to -\function{start\_msg} and \function{end\_msg}. Each message in a pipe -has its own identifier, which currently is an integer that increments -up from zero. - -As you can see, the \type{Base64\_Encoder} was allocated using -\keyword{new}; but where was it deallocated? When a filter object is -passed to a \type{Pipe}, the pipe takes ownership of the object, and -will deallocate it when it is no longer needed. - -There are two different ways to make use of messages. One is to send -several messages through a \type{Pipe} without changing the -\type{Pipe}'s configuration, so you end up with a sequence of -messages; one use of this would be to send a sequence of identically -encrypted UDP packets, for example (note that the \emph{data} need not -be identical; it is just that each is encrypted, encoded, signed, etc. -in an identical fashion). Another is to change the filters that are -used in the \type{Pipe} between each message, by adding or removing -\type{Filter}s; functions that let you do this are documented in the -Pipe API section. - -Most operations in Botan have a corresponding filter for use in Pipe. 
-Here's code that encrypts a string with AES-128 in CBC mode: - -\begin{verbatim} - AutoSeeded_RNG rng; - SymmetricKey key(rng, 16); // a random 128-bit key - InitializationVector iv(rng, 16); // a random 128-bit IV - - // Notice the algorithm we want is specified by a string - Pipe pipe(get_cipher("AES-128/CBC", key, iv, ENCRYPTION)); - - pipe.process_msg("secrets"); - pipe.process_msg("more secrets"); - - MemoryVector<byte> c1 = pipe.read_all(0); - - byte c2[4096] = { 0 }; - u32bit got_out = pipe.read(c2, sizeof(c2), 1); - // use c2[0] through c2[got_out-1] -\end{verbatim} - -Note the use of \type{AutoSeeded\_RNG}, which is a random number -generator. If you wish, you can explicitly set up the random number -generators and entropy sources yourself; however, in the vast majority of cases -\type{AutoSeeded\_RNG} is preferable. - -\type{Pipe} also has convenience methods for dealing with -\type{std::iostream}s. Here is an example of those, using the -\type{Bzip\_Compression} filter (included as a module; if you have -bzlib available, check \filename{building.pdf} for how to enable it) -to compress a file: - -\begin{verbatim} - std::ifstream in("data.bin", std::ios::binary); - std::ofstream out("data.bin.bz2", std::ios::binary); - - Pipe pipe(new Bzip_Compression); - - pipe.start_msg(); - in >> pipe; - pipe.end_msg(); - out << pipe; -\end{verbatim} - -However, there is a hitch in the code above: the complete contents of -the compressed data will be held in memory until the entire message -has been compressed, at which time the statement \verb|out << pipe| is -executed, and the data is freed as it is read from the pipe and -written to the file. But if the file is very large, we might not have -enough physical memory (or even enough virtual memory!) for that to be -practical. 
So instead of storing the compressed data in the pipe and -reading it out later, we divert it directly to the file: - -\begin{verbatim} - std::ifstream in("data.bin", std::ios::binary); - std::ofstream out("data.bin.bz2", std::ios::binary); - - Pipe pipe(new Bzip_Compression, new DataSink_Stream(out)); - - pipe.start_msg(); - in >> pipe; - pipe.end_msg(); -\end{verbatim} - -This is the first code we've seen so far that uses more than one -filter in a pipe. The output of the compressor is sent to the -\type{DataSink\_Stream}. Anything written to a \type{DataSink\_Stream} -is written to a file; the filter produces no output. As soon as the -compression algorithm finishes up a block of data, it will send it along, -at which point it will immediately be written to disk; if you were to -call \verb|pipe.read_all()| after \verb|pipe.end_msg()|, you'd get an -empty vector out. - -Here's an example using two computational filters: - -\begin{verbatim} - AutoSeeded_RNG rng; - SymmetricKey key(rng, 32); - InitializationVector iv(rng, 16); - - Pipe encryptor(get_cipher("AES/CBC/PKCS7", key, iv, ENCRYPTION), - new Base64_Encoder); - - encryptor.start_msg(); - file >> encryptor; // file is an already-open std::ifstream - encryptor.end_msg(); // flush buffers, complete computations - std::cout << encryptor; -\end{verbatim} - -\subsection{Fork} - -It is fairly common that you might receive some data and want to -perform more than one operation on it (\eg, encrypt it with Serpent -and calculate the SHA-256 hash of the plaintext at the same -time). That's where \type{Fork} comes in. \type{Fork} is a filter that -takes input and passes it on to \emph{one or more} \type{Filter}s -that are attached to it. \type{Fork} changes the nature of the pipe -system completely. Instead of being a linked list, it becomes a tree. - -Each \type{Filter} in the fork is given its own output buffer, and -thus its own message. 
For example, if you had previously written two -messages into a \type{Pipe}, then you start a new one with a -\type{Fork} that has three paths of \type{Filter}'s inside it, you -add three new messages to the \type{Pipe}. The data you put into the -\type{Pipe} is duplicated and sent into each set of \type{Filter}s, -and the eventual output is placed into a dedicated message slot in the -\type{Pipe}. - -Messages in the \type{Pipe} are allocated in a depth-first manner. This is only -interesting if you are using more than one \type{Fork} in a single \type{Pipe}. -As an example, consider the following: - -\begin{verbatim} - Pipe pipe(new Fork( - new Fork( - new Base64_Encoder, - new Fork( - NULL, - new Base64_Encoder - ) - ), - new Hex_Encoder - ) - ); -\end{verbatim} - -In this case, message 0 will be the output of the first \type{Base64\_Encoder}, -message 1 will be a copy of the input (see below for how \type{Fork} interprets -NULL pointers), message 2 will be the output of the second -\type{Base64\_Encoder}, and message 3 will be the output of the -\type{Hex\_Encoder}. As you can see, this results in message numbers being -allocated in a top to bottom fashion, when looked at on the screen. However, -note that there could be potential for bugs if this is not anticipated. For -example, if your code is passed a \type{Filter}, and you assume it is a -``normal'' one that only uses one message, your message offsets would be -wrong, leading to some confusion during output. - -If Fork's first argument is a null pointer, but a later argument is -not, then Fork will feed a copy of its input directly through. 
Here's -a case where that is useful: - -\begin{verbatim} - // have std::string ciphertext, auth_code, key, iv, mac_key; - - Pipe pipe(new Base64_Decoder, - get_cipher("AES-128/CBC", key, iv, DECRYPTION), - new Fork( - 0, - new MAC_Filter("HMAC(SHA-1)", mac_key) - ) - ); - - pipe.process_msg(ciphertext); - std::string plaintext = pipe.read_all_as_string(0); - SecureVector<byte> mac = pipe.read_all(1); - - if(mac != auth_code) - error(); -\end{verbatim} - -Here we wanted to not only decrypt the message, but send the decrypted -text through an additional computation, in order to compute the -authentication code. - -Any \type{Filter}s that are attached to the \type{Pipe} after the -\type{Fork} are implicitly attached onto the first branch created by -the fork. For example, let's say you created this \type{Pipe}: - -\begin{verbatim} -Pipe pipe(new Fork(new Hash_Filter("MD5"), new Hash_Filter("SHA-1")), - new Hex_Encoder); -\end{verbatim} - -Suppose you then called \function{start\_msg}, wrote some data, and called -\function{end\_msg}. Then \arg{pipe} would contain two messages. The -first one (message number 0) would contain the MD5 sum of the input in -hex encoded form, and the other would contain the SHA-1 sum of the -input in raw binary. However, it's much better to use a \type{Chain} -instead. - -\subsubsection{Chain} - -A \type{Chain} filter creates a chain of \type{Filter}s and -encapsulates them inside a single filter (itself). This allows a -sequence of filters to become a single filter, to be passed into or -out of a function, or to a \type{Fork} constructor. - -You can call \type{Chain}'s constructor with up to 4 \type{Filter*}s -(they will be added in order), or with an array of \type{Filter*}s and -a \type{u32bit} that tells \type{Chain} how many \type{Filter*}s are -in the array (again, they will be attached in order). Here's the -example from the last section, using \type{Chain} instead of relying on the -obscure rule that version used: 
- -\begin{verbatim} - Pipe pipe(new Fork( - new Chain(new Hash_Filter("MD5"), new Hex_Encoder), - new Hash_Filter("SHA-1") - ) - ); -\end{verbatim} - -\subsection{The Pipe API} - -\subsubsection{Initializing Pipe} - -By default, \type{Pipe} will do nothing at all; any input placed into -the \type{Pipe} will be read back unchanged. Obviously, this has -limited utility, and presumably you want to use one or more -\type{Filter}s to somehow process the data. First, you can choose a -set of \type{Filter}s to initialize the \type{Pipe} via the -constructor. You can pass it either a set of up to 4 \type{Filter*}s, -or a pre-defined array and a length: - -\begin{verbatim} - Pipe pipe1(new Filter1(/*args*/), new Filter2(/*args*/), - new Filter3(/*args*/), new Filter4(/*args*/)); - Pipe pipe2(new Filter1(/*args*/), new Filter2(/*args*/)); - - Filter* filters[5] = { - new Filter1(/*args*/), new Filter2(/*args*/), new Filter3(/*args*/), - new Filter4(/*args*/), new Filter5(/*args*/) /* more if desired... */ - }; - Pipe pipe3(filters, 5); -\end{verbatim} - -This is by far the most common way to initialize a \type{Pipe}. However, -occasionally a more flexible initialization strategy is necessary; this is -supported by four member functions: \function{prepend}(\type{Filter*}), -\function{append}(\type{Filter*}), \function{pop}(), and \function{reset}(). -These functions may only be used while the \type{Pipe} in question is not in -use; that is, either before calling \function{start\_msg}, or after -\function{end\_msg} has been called (and no new calls to \function{start\_msg} -have been made yet). - -The function \function{reset}() simply removes all the \type{Filter}s -that the \type{Pipe} is currently using~--~it is reset to an -initial, ``empty'' state. Any data that is being retained by the -\type{Pipe} is retained after a \function{reset}(), and -\function{reset}() does not affect the message numbers (discussed -later). 
- -Calling \function{prepend} and \function{append} will either prepend -or append the passed \type{Filter} object to the list of -transformations. For example, if you \function{prepend} a -\type{Filter} implementing encryption, and the \type{Pipe} already had -a \type{Filter} that hex encoded the input, then the next set of -input would be first encrypted, then hex encoded. Alternatively, if you -called \function{append}, then the input would first be hex -encoded, and then encrypted (which is not terribly useful in this -particular example). - -Finally, calling \function{pop}() will remove the first transformation -of the \type{Pipe}. Say we had called \function{prepend} to put an -encryption \type{Filter} into a \type{Pipe}; calling \function{pop}() -would remove this \type{Filter} and return the \type{Pipe} to its -state before we called \function{prepend}. - -\subsubsection{Giving Data to a Pipe} - -Input to a \type{Pipe} is delimited into messages, which can be read from -independently (\ie, you can read 5 bytes from one message, and then all of -another message, without either read affecting any other messages). The -messages are delimited by calls to \function{start\_msg} and -\function{end\_msg}. In between these two calls, you can write data into a -\type{Pipe}, and it will be processed by the \type{Filter}(s) that it -contains. Writes at any other time are invalid, and will result in an -exception. - -To write, call one of the \function{write}() overloads, -which accept any of: a \type{byte[]}/\type{u32bit} pair, a -\type{SecureVector<byte>}, a \type{std::string}, a \type{DataSource\&}, or a -single \type{byte}. - -Sometimes, you may want to do only a single write per message. In this case, -you can use the \function{process\_msg} series of functions, which start a -message, write their argument into the \type{Pipe}, and then end the -message. 
In this case you would not make any explicit calls to -\function{start\_msg}/\function{end\_msg}. The version of \function{write} -that takes a single \type{byte} is not supported by \function{process\_msg}, -but all the other variants are. - -\type{Pipe} can also be used with the \verb|>>| operator, and will accept a -\type{std::istream}, (or on Unix systems with the \verb|fd_unix| module), a -Unix file descriptor. In either case, the entire contents of the file will be -read into the \type{Pipe}. - -\subsubsection{Getting Output from a Pipe} - -Retrieving the processed data from a \type{Pipe} is a bit more complicated, for -various reasons. In particular, because \type{Pipe} will separate each message -into a separate buffer, you have to be able to retrieve data from each message -independently. Each of \type{Pipe}'s read functions has a final parameter that -specifies what message to read from (as a 32-bit integer). If this parameter is -set to \type{Pipe::DEFAULT\_MESSAGE}, it will read the current default message -(\type{DEFAULT\_MESSAGE} is also the default value of this parameter). The -parameter will not be mentioned in further discussion of the reading API, but -it is always there (unless otherwise noted). - -Reading is done with a variety of functions. The most basic are \type{u32bit} -\function{read}(\type{byte} \arg{out}[], \type{u32bit} \arg{len}) and -\type{u32bit} \function{read}(\type{byte\&} \arg{out}). Each reads into -\arg{out} (either up to \arg{len} bytes, or a single byte for the one taking a -\type{byte\&}), and returns the total number of bytes read. There is a variant -of these functions, all named \function{peek}, which performs the same -operations, but does not remove the bytes from the message (reading is a -destructive operation with a \type{Pipe}). 
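The difference between \function{read} and \function{peek}, and the independence of messages from one another, can be modelled with one buffer per message. This is a sketch with hypothetical names (\type{ToyMessages} and its members), not Botan's \type{Pipe}: it performs no filtering and uses \type{std::string} buffers, and only mimics the buffer semantics described above.

\begin{verbatim}
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model: one independent output buffer per message.
class ToyMessages
   {
   public:
      // Store a finished message; returns its number (counting from 0).
      std::size_t add(const std::string& data)
         {
         buffers.push_back(data);
         return buffers.size() - 1;
         }

      // Copy up to len bytes of message msg into out, consuming nothing.
      std::size_t peek(std::string& out, std::size_t len,
                       std::size_t msg) const
         {
         out = buffers.at(msg).substr(0, len);
         return out.size();
         }

      // Like peek, but also removes the returned bytes (destructive).
      std::size_t read(std::string& out, std::size_t len, std::size_t msg)
         {
         const std::size_t got = peek(out, len, msg);
         buffers.at(msg).erase(0, got);
         return got;
         }

      // Number of bytes of message msg that have not been read yet.
      std::size_t remaining(std::size_t msg) const
         {
         return buffers.at(msg).size();
         }

   private:
      std::vector<std::string> buffers;
   };
\end{verbatim}

Peeking at a message leaves its remaining byte count unchanged, reading reduces it, and neither operation touches any other message.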
- -There are also the functions \type{SecureVector<byte>} \function{read\_all}(), -and \type{std::string} \function{read\_all\_as\_string}(), which return the -entire contents of the message, either as a memory buffer, or a -\type{std::string} (which is generally only useful if the \type{Pipe} has -encoded the message into a text string, such as when a \type{Base64\_Encoder} -is used). - -To determine how many bytes are left in a message, call \type{u32bit} -\function{remaining}() (which can also take an optional message -number). Finally, there are some functions for managing the default message -number: \type{u32bit} \function{default\_msg}() will return the current default -message, \type{u32bit} \function{message\_count}() will return the total number -of messages (0...\function{message\_count}()-1), and -\function{set\_default\_msg}(\type{u32bit} \arg{msgno}) will set a new default -message number (which must be a valid message number for that \type{Pipe}). The -ability to set the default message number is particularly important in the case -of using the file output operations (\verb|<<| with a \type{std::ostream} or -Unix file descriptor), because there is no way to specify it explicitly when -using the output operator. - -\subsection{A Filter Example} - -Here is some code that takes one or more filenames in \arg{argv} and -calculates the result of several hash functions for each file. The complete -program can be found as \filename{hasher.cpp} in the Botan distribution. For -brevity, most error checking has been removed. 
- -\begin{verbatim} - string name[3] = { "MD5", "SHA-1", "RIPEMD-160" }; - Botan::Filter* hash[3] = { - new Botan::Chain(new Botan::Hash_Filter(name[0]), - new Botan::Hex_Encoder), - new Botan::Chain(new Botan::Hash_Filter(name[1]), - new Botan::Hex_Encoder), - new Botan::Chain(new Botan::Hash_Filter(name[2]), - new Botan::Hex_Encoder) }; - - Botan::Pipe pipe(new Botan::Fork(hash, 3)); - - for(u32bit j = 1; argv[j] != 0; j++) - { - ifstream file(argv[j]); - pipe.start_msg(); - file >> pipe; - pipe.end_msg(); - file.close(); - for(u32bit k = 0; k != 3; k++) - { - pipe.set_default_msg(3*(j-1)+k); - cout << name[k] << "(" << argv[j] << ") = " << pipe << endl; - } - } -\end{verbatim} - - -\subsection{Filter Catalog} - -This section contains descriptions of every \type{Filter} included in -the portable sections of Botan. \type{Filter}s provided by modules -are documented elsewhere. - -\subsubsection{Keyed Filters} - -A few sections ago, it was mentioned that \type{Pipe} can process multiple -messages, treating each of them exactly the same. Well, that was a bit of a -lie. There are some algorithms (in particular, block ciphers not in ECB mode, -and all stream ciphers) that change their state as data is put through them. - -Naturally, you might well want to reset the keys or (in the case of block -cipher modes) IVs used by such filters, so multiple messages can be processed -using completely different keys, or new IVs, or new keys and IVs. -And in fact, even for a MAC or an ECB block cipher, you might well want to -change the key used from message to message. - -Enter \type{Keyed\_Filter}, which acts as an abstract interface for -any filter that uses keys: block cipher modes, stream ciphers, -MACs, and so on. It has two functions, \function{set\_key} and -\function{set\_iv}. Calling \function{set\_key} will, naturally, set -(or reset) the key used by the algorithm. 
Setting the IV only makes -sense for certain algorithms -- a call to \function{set\_iv} on an -object that doesn't support IVs will be ignored. You \emph{must} call -\function{set\_key} before calling \function{set\_iv}: while not all -\type{Keyed\_Filter} objects require this, you should assume it is -required anytime you are using a \type{Keyed\_Filter}. - -Here's an example: - -\begin{verbatim} - Keyed_Filter *cast, *hmac; - Pipe pipe(new Base64_Decoder, - // Note the assignments to the cast and hmac variables - cast = new CBC_Decryption("CAST-128", "PKCS7", cast_key, iv), - new Fork( - 0, // Read the section 'Fork' to understand this - new Chain( - hmac = new MAC_Filter("HMAC(SHA-1)", mac_key, 12), - new Base64_Encoder - ) - ) - ); - pipe.start_msg(); - // ... use pipe for a while, decrypt some stuff, derive new keys and IVs ... - pipe.end_msg(); - - cast->set_key(cast_key2); - cast->set_iv(iv2); - hmac->set_key(mac_key2); - - pipe.start_msg(); - // ... use pipe for some other things ... - pipe.end_msg(); -\end{verbatim} - -There are some requirements for using \type{Keyed\_Filter} that you must -follow. If you call \function{set\_key} or \function{set\_iv} on a filter that -is owned by a \type{Pipe}, you must do so while the \type{Pipe} is -``unlocked''. This refers to the times when no messages are being processed by -\type{Pipe} -- either before \type{Pipe}'s \function{start\_msg} is called, or -after \function{end\_msg} is called (and no new call to \function{start\_msg} -has happened yet). Doing otherwise will result in undefined behavior, most likely -silently producing invalid output. - -And remember: if you're resetting both values, reset the key \emph{first}. - -\subsubsection{Cipher Filters} - -Getting a hold of a \type{Filter} implementing a cipher is very easy. Simply -make sure you're including the header \filename{lookup.h}, and call -\function{get\_cipher}. Generally you will pass the return value directly into -a \type{Pipe}. 
There are actually two overloads, which do -much the same thing: - -\function{get\_cipher}(\type{std::string} \arg{cipher\_spec}, - \type{SymmetricKey} \arg{key}, - \type{InitializationVector} \arg{iv}, - \type{Cipher\_Dir} \arg{dir}); - -\function{get\_cipher}(\type{std::string} \arg{cipher\_spec}, - \type{SymmetricKey} \arg{key}, - \type{Cipher\_Dir} \arg{dir}); - -The version that doesn't take an IV is useful for things that don't use them, -like block ciphers in ECB mode, or most stream ciphers. If you specify a -\arg{cipher\_spec} that requires an IV, and you use the version that doesn't -take one, an exception will be thrown. The \arg{dir} argument can be either -\type{ENCRYPTION} or \type{DECRYPTION}. In a few cases, like most (but not all) -stream ciphers, these are equivalent, but even then it provides a way of -showing the ``intent'' of the operation to readers of your code. - -The \arg{cipher\_spec} is a string that specifies what cipher is to be -used. The general syntax for \arg{cipher\_spec} is ``STREAM\_CIPHER'', -``BLOCK\_CIPHER/MODE'', or ``BLOCK\_CIPHER/MODE/PADDING''. In the case of -stream ciphers, no mode is necessary, so just the name is sufficient. A block -cipher requires a mode of some sort, which can be ``ECB'', ``CBC'', ``CFB(n)'', -``OFB'', ``CTR-BE'', or ``EAX(n)''. The argument to CFB mode is how many bits -of feedback should be used. If you just use ``CFB'' with no argument, it will -default to using a feedback equal to the block size of the cipher. EAX mode -also takes an optional bit argument, which tells EAX how large a tag size to -use~--~generally this is the block size of the cipher, which is the -default if you don't specify any argument. - -In the case of the ECB and CBC modes, a padding method can also be -specified. If it is not supplied, ECB defaults to not padding, and CBC defaults -to using PKCS \#5/\#7 compatible padding. 
The padding methods currently -available are ``NoPadding'', ``PKCS7'', ``OneAndZeros'', and ``CTS''. CTS -padding is currently only available for CBC mode, but the others can also be -used in ECB mode. - -Some example \arg{cipher\_spec} arguments are: ``DES/CFB(32)'', -``TripleDES/OFB'', ``Blowfish/CBC/CTS'', ``SAFER-SK(10)/CBC/OneAndZeros'', -``AES/EAX'', ``ARC4'' - -``CTR-BE'' refers to counter mode where the counter is incremented as if it -were a big-endian encoded integer. This is compatible with most other -implementations, but it is possible some will use the incompatible little -endian convention. This version would be denoted as ``CTR-LE'' if it were -supported. - -``EAX'' is a new cipher mode designed by Wagner, Rogaway, and Bellare. It is an -authenticated cipher mode (that is, no separate authentication is needed), has -provable security, and is free from patent entanglements. It runs about half as -fast as most of the other cipher modes (like CBC, OFB, or CTR), which is not -bad considering you don't need to use an authentication code. - -\subsubsection{Hashes and MACs} - -Hash functions and MACs don't need anything special when it comes to -filters. Both just take their input and produce no output until -\function{end\_msg()} is called, at which time they complete the hash or MAC -and send that as output. - -These \type{Filter}s take a string naming the type to be used. If for some -reason you name something that doesn't exist, an exception will be thrown. - -\noindent -\function{Hash\_Filter}(\type{std::string} \arg{hash}, - \type{u32bit} \arg{outlength}): - -This type hashes its input with \arg{hash}. When \function{end\_msg} is called -on the owning \type{Pipe}, the hash is completed and the digest is sent on to -the next thing in the pipe. The argument \arg{outlength} specifies how much of -the output of the hash will be passed along to the next filter when -\function{end\_msg} is called. By default, it will pass the entire hash. 
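The behaviour described above (no output until the end of the message, then the digest, optionally truncated to \arg{outlength} bytes) can be sketched in a few lines of standalone C++. This is only an illustration and not Botan's implementation: \texttt{ToyHashFilter} and \texttt{toy\_digest} are hypothetical names, the ``hash'' is a non-cryptographic 64-bit FNV-1a checksum used so the sketch needs no library, and treating an \arg{outlength} of 0 as ``pass the entire digest'' is an assumption of the sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy stand-in for a real hash: 64-bit FNV-1a, returned as 8 big-endian bytes.
// NOT cryptographically secure; it only keeps the sketch self-contained.
std::vector<std::uint8_t> toy_digest(const std::vector<std::uint8_t>& data)
{
    std::uint64_t h = 14695981039346656037ULL;   // FNV offset basis
    for (std::uint8_t b : data) {
        h ^= b;
        h *= 1099511628211ULL;                   // FNV prime
    }
    std::vector<std::uint8_t> out(8);
    for (int i = 0; i < 8; ++i)
        out[i] = static_cast<std::uint8_t>(h >> (56 - 8 * i));
    return out;
}

// Hypothetical class mimicking the described behaviour of a hashing filter.
class ToyHashFilter {
public:
    // Assumption of this sketch: outlength == 0 means "emit the whole digest".
    explicit ToyHashFilter(std::size_t outlength = 0) : outlength_(outlength) {}

    // Input is only buffered; nothing is produced yet.
    void write(const std::uint8_t* in, std::size_t len)
    {
        buffer_.insert(buffer_.end(), in, in + len);
    }

    // At end of message: finish the digest, truncate if requested, and
    // reset the buffer so a fresh message can be processed.
    std::vector<std::uint8_t> end_msg()
    {
        std::vector<std::uint8_t> digest = toy_digest(buffer_);
        buffer_.clear();
        if (outlength_ != 0 && outlength_ < digest.size())
            digest.resize(outlength_);
        return digest;
    }

private:
    std::size_t outlength_;
    std::vector<std::uint8_t> buffer_;
};
```

A truncated filter yields a prefix of what the untruncated filter would produce for the same input, which is why a short \arg{outlength} (such as the 12 passed to \type{MAC\_Filter} in the earlier example) can be checked against a full-length value by comparing only the leading bytes.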
- -Examples of names for \function{Hash\_Filter} are ``SHA-1'' and ``Whirlpool''. - -\noindent -\function{MAC\_Filter}(\type{std::string} \arg{mac}, - \type{const SymmetricKey\&} \arg{key}, - \type{u32bit} \arg{outlength}): - -The constructor for a \type{MAC\_Filter} takes a key, used in calculating the -MAC, and a length parameter, which has semantics exactly the same as the one -passed to \type{Hash\_Filter}'s constructor. - -Examples for \arg{mac} are ``HMAC(SHA-1)'', ``CMAC(AES-128)'', and the -exceptionally long, strange, and probably useless name -``CMAC(Lion(Tiger(20,3),MARK-4,1024))''. - -\subsubsection{PK Filters} - -There are four classes in this category, \type{PK\_Encryptor\_Filter}, -\type{PK\_Decryptor\_Filter}, \type{PK\_Signer\_Filter}, and -\type{PK\_Verifier\_Filter}. Each takes a pointer to an object of the -appropriate type (\type{PK\_Encryptor}, \type{PK\_Decryptor}, etc) that is -deleted by the destructor. These classes are found in \filename{pk\_filts.h}. - -Three of these, for encryption, decryption, and signing, are pretty much -identical conceptually. Each of them buffers its input until the end of the -message is marked with a call to the \function{end\_msg} function. Then they -encrypt, decrypt, or sign their input and send the output (the ciphertext, the -plaintext, or the signature) into the next filter. - -Signature verification works a little differently, because it needs to know -what the signature is in order to check it. You can either pass this in along -with the constructor, or call the function \function{set\_signature} -- with -this second method, you need to keep a pointer to the filter around so you can -send it this command. In either case, after \function{end\_msg} is called, it -will try to verify the signature (if the signature has not been set by either -method, an exception will be thrown here).
It will then send a single byte onto -the next filter -- a 1 or a 0, which specifies whether the signature verified -or not (respectively). - -For more information about PK algorithms (including creating the appropriate -objects to pass to the constructors), read the section ``Public Key -Cryptography'' in this manual. - -\subsubsection{Encoders} - -Often you want your data to be in some form of text (for sending over channels -that aren't 8-bit clean, printing it, etc). The filters \type{Hex\_Encoder} -and \type{Base64\_Encoder} will convert arbitrary binary data into hex or -base64 formats. Not surprisingly, you can use \type{Hex\_Decoder} and -\type{Base64\_Decoder} to convert it back into its original form. - -Both of the encoders can take a few options about how the data should be -formatted (all of which have defaults). The first is a \type{bool} which simply -says if the encoder should insert line breaks. This defaults to -false. Line breaks don't matter either way to the decoder, but they make the -output a bit more appealing to the human eye, and a few transport mechanisms -(notably some email systems) limit the maximum line length. - -The second encoder option is an integer specifying how long such lines will be -(obviously this will be ignored if line-breaking isn't being used). The default -tends to be in the range of 60-80 characters, but is not specified exactly. If -you want a specific value, set it. Otherwise the default should be fine. - -Lastly, \type{Hex\_Encoder} takes an argument of type \type{Case}, which can be -\type{Uppercase} or \type{Lowercase} (default is \type{Uppercase}). This -specifies what case the characters A-F should be output as. The base64 encoder -has no such option, because it uses both upper and lower case letters for its -output. - -The decoders both take a single option, which tells the decoder how it should -behave in the case of invalid input.
The enum (called \type{Decoder\_Checking}) -can take on any of three values: \type{NONE}, \type{IGNORE\_WS}, and -\type{FULL\_CHECK}. With \type{NONE} (the default, for compatibility with -previous releases), invalid input (for example, a ``z'' character in supposedly -hex input) will simply be ignored. With \type{IGNORE\_WS}, whitespace will be -ignored by the decoder, but receiving other non-valid data will raise an -exception. Finally, \type{FULL\_CHECK} will raise an exception for \emph{any} -characters not in the encoded character set, including whitespace. - -You can find the declarations for these types in \filename{hex.h} and -\filename{base64.h}. - -\subsection{Rolling Your Own} - -The system of filters and pipes was designed in an attempt to make it -as simple as possible to write new \type{Filter} objects. There are -essentially four functions that need to be implemented by an object -deriving from \type{Filter}: - -\noindent -\type{void} \function{write}(\type{byte} \arg{input}[], \type{u32bit} -\arg{length}): - -The \function{write} function is what is called when a filter receives input -for it to process. The filter is \emph{not} required to process it right away; -many filters buffer their input before producing any output. A filter will -usually have \function{write} called many times during its lifetime. - -\noindent -\type{void} \function{send}(\type{byte} \arg{output}[], \type{u32bit} -\arg{length}): - -Eventually, a filter will want to produce some output to send along to the next -filter in the pipeline. It does so by calling \function{send} with whatever it -wants to send along to the next filter. There is also a version of -\function{send} taking a single byte argument, as a convenience. - -\noindent -\type{void} \function{start\_msg()}: - -This function is optional. Implement it if your \type{Filter} would like to do -some processing or setup at the start of each message (for an example, see the -Zlib compression module). 
- -\noindent -\type{void} \function{end\_msg()}: - -Implementing the \function{end\_msg} function is optional. It is called when it -has been requested that filters finish up their computations. Note that they -must \emph{not} deallocate their resources; this should be done by their -destructor. They should simply finish up with whatever computation they have -been working on (for example, a compressing filter would flush the compressor -and \function{send} the final block), and empty any buffers in preparation for -processing a fresh new set of input. It is essentially the inverse of -\function{start\_msg}. - -Additionally, if necessary, filters can define a constructor that takes any -needed arguments, and a destructor to deal with deallocating memory, closing -files, etc. - -There is also a \type{BufferingFilter} class (in \filename{buf\_filt.h}) that -will take a message and split it up into an initial block that can be of any -size (including zero), a sequence of fixed sized blocks of any non-zero size, -and a last (possibly zero-sized) final block. This might make a useful base class -for your filters, depending on what you have in mind. - - -\pagebreak -\section{Public Key Cryptography} - -Let's create a 1024-bit RSA private key and encode the public key as a -PKCS \#1 file with PEM encoding (which can be understood by many other -cryptographic programs): - -\begin{verbatim} -// everyone does: -AutoSeeded_RNG rng; - -// Alice -RSA_PrivateKey priv_rsa(rng, 1024 /* bits */); - -std::string alice_pem = X509::PEM_encode(priv_rsa); - -// send alice_pem to Bob, who does - -// Bob -// load_key(std::string) expects a filename, so wrap the PEM text in a DataSource -DataSource_Memory alice_src(alice_pem); -std::auto_ptr<X509_PublicKey> alice(X509::load_key(alice_src)); - -// dynamic_cast needs a pointer type and a raw pointer argument -RSA_PublicKey* alice_rsa = dynamic_cast<RSA_PublicKey*>(alice.get()); -if(alice_rsa) - { - /* ...
*/ - } - -\end{verbatim} - -\subsection{Creating PK Algorithm Key Objects} - -The library has interfaces for encryption, signatures, etc that do not require -knowing the exact algorithm in use (for example RSA and Rabin-Williams -signatures are handled by the exact same code path). - -One place where we \emph{do} need to know exactly what kind of algorithm is in -use is when we are creating a key (\emph{But}: read the section ``Importing and -Exporting PK Keys'', later in this manual). - -There are (currently) two kinds of public key algorithms in Botan: ones based -on integer factorization (RSA and Rabin-Williams), and ones based on the -discrete logarithm problem (DSA, Diffie-Hellman, Nyberg-Rueppel, and -ElGamal). Since discrete logarithm parameters (primes and generators) can be -shared among many keys, there is the notion of these being a combined type -(called \type{DL\_Group}). - -There are two ways to create a DL private key (such as -\type{DSA\_PrivateKey}). One is to pass in just a \type{DL\_Group} object -- a -new key will automatically be generated. The other involves passing in a group -to use, along with both the public and private values (private value first). - -Since in integer factorization algorithms, the modulus used isn't shared by -other keys, we don't use this notion. You can create a new key by passing in a -\type{u32bit} telling how long (in bits) the key should be, or you can copy a -pre-existing key by passing in the appropriate parameters (primes, exponents, -etc). For RSA and Rabin-Williams (the two IF schemes in Botan), the parameters -are all \type{BigInt}s: prime 1, prime 2, encryption exponent, decryption -exponent, modulus. The last two are optional, since they can easily be derived -from the first three. - -\subsubsection{Creating a DL\_Group} - -There are quite a few ways to get a \type{DL\_Group} object.
The best is to use -the function \function{get\_dl\_group}, which takes a string naming a group; it -will either return that group, if it knows about it, or throw an -exception. Names it knows about include ``IETF-n'' where n is 768, 1024, 1536, -2048, 3072, or 4096, and ``DSA-n'', where n is 512, 768, or 1024. The IETF -groups are the ones specified for use with IPSec, and the DSA ones are the -default DSA parameters specified by Java's JCE. For DSA and Nyberg-Rueppel, you -should only use the ``DSA-n'' groups, while Diffie-Hellman and ElGamal can use -either type (keep in mind that some applications/standards require DH/ELG to -use DSA-style primes, while others require strong prime groups). - -You can also generate a new random group. This is not recommended, because it is -quite slow, especially for safe primes. - -\subsection{Key Checking} - -Most public key algorithms have limitations or restrictions on their -parameters. For example RSA requires an odd exponent, and algorithms based on -the discrete logarithm problem need a generator $> 1$. - -Each low-level public key type has a function named \function{check\_key} that -takes a \type{bool}. This function returns a Boolean value that declares -whether or not the key is valid (from an algorithmic standpoint). For example, -it will check to make sure that the prime parameters of a DSA key are, in fact, -prime. It does not have anything to do with the validity of the key for any -particular use, nor does it have anything to do with certificates that link a -key (which, after all, is just some numbers) with a user or other entity. If -\function{check\_key}'s argument is \type{true}, then it does ``strong'' -checking, which includes fairly expensive operations like primality checking. - -Keys are always checked when they are loaded or generated, so typically there -is no reason to use this function directly.
However, you can disable or reduce -the checks for particular cases (public keys, loaded private keys, generated -private keys) by setting the right config toggle (see the section on the -configuration subsystem for details). - -\subsection{Getting a PK algorithm object} - -The key types, like \type{RSA\_PrivateKey}, do not implement any kind -of padding or encoding (which is generally necessary for security). To -get an object that does, the easiest thing to do is call the functions -found in \filename{look\_pk.h}. Generally these take a key, followed -by a string that specifies what hashing and encoding method(s) to -use. Examples of such strings are ``EME1(SHA-256)'' for OAEP -encryption and ``EMSA4(SHA-256)'' for PSS signatures (where the -message is hashed using SHA-256). - -Here are some basic examples (using an RSA key) to give you a feel for the -possibilities. These examples assume \type{rsakey} is an -\type{RSA\_PrivateKey}, since otherwise we would not be able to create a -decryption or signature object with it (you can create encryption or signature -verification objects with public keys, naturally). Remember to delete these -objects when you're done with them.
- -\begin{verbatim} - // PKCS #1 v2.0 / IEEE 1363 compatible encryption - PK_Encryptor* rsa_enc1 = get_pk_encryptor(rsakey, "EME1(RIPEMD-160)"); - // PKCS #1 v1.5 compatible encryption - PK_Encryptor* rsa_enc2 = get_pk_encryptor(rsakey, "PKCS1v15"); - - // Raw encryption: no padding, input is directly encrypted by the key - // Don't use this unless you know what you're doing - PK_Encryptor* rsa_enc3 = get_pk_encryptor(rsakey, "Raw"); - - // This object can decrypt things encrypted by rsa_enc1 - PK_Decryptor* rsa_dec1 = get_pk_decryptor(rsakey, "EME1(RIPEMD-160)"); - - // PKCS #1 v1.5 compatible signatures - PK_Signer* rsa_sig = get_pk_signer(rsakey, "EMSA3(MD5)"); - PK_Verifier* rsa_verify = get_pk_verifier(rsakey, "EMSA3(MD5)"); - - // PKCS #1 v2.1 compatible signatures - PK_Signer* rsa_sig2 = get_pk_signer(rsakey, "EMSA4(SHA-1)"); - PK_Verifier* rsa_verify2 = get_pk_verifier(rsakey, "EMSA4(SHA-1)"); - - // Hash input with SHA-1, but don't pad the input in any way; usually - // used with DSA/NR, not RSA (named rsa_sig3 so it does not redeclare rsa_sig) - PK_Signer* rsa_sig3 = get_pk_signer(rsakey, "EMSA1(SHA-1)"); -\end{verbatim} - -\subsection{Encryption} - -The \type{PK\_Encryptor} and \type{PK\_Decryptor} classes are the interface for -encryption and decryption, respectively. - -Calling \function{encrypt} with a \type{byte} array, a length -parameter, and an RNG object will return the input encrypted with -whatever scheme is being used. Calling the similar \function{decrypt} -will perform the inverse operation. You can also do these operations -with \type{SecureVector<byte>}s. In all cases, the output is returned -via a \type{SecureVector<byte>}. - -If you attempt an operation with a larger size than the key can -support (this limit varies based on the algorithm, the key size, and -the padding method used (if any)), an exception will be -thrown. Alternately, you can call \function{maximum\_input\_size}, -which returns the maximum size you can safely encrypt.
In fact, -you can often encrypt an object that is one byte longer, but only if -enough of the high bits of the leading byte are set to zero. Since -this is pretty dicey, it's best to stick with the advertised maximum. - -Available public key encryption algorithms in Botan are RSA and ElGamal. The -encoding methods are EME1, denoted by ``EME1(HASHNAME)'', PKCS \#1 v1.5, -called ``PKCS1v15'' or ``EME-PKCS1-v1\_5'', and raw encoding (``Raw''). - -For compatibility reasons, PKCS \#1 v1.5 is recommended for use with -ElGamal (most other implementations of ElGamal do not support any -other encoding format). RSA can also be used with PKCS \#1 encoding, -but because of various possible attacks, EME1 is the preferred -encoding. EME1 requires the use of a hash function: unless a competent -applied cryptographer tells you otherwise, you should use SHA-256 or -SHA-512. - -Don't use ``Raw'' encoding unless you need it for backward -compatibility with old protocols. There are many possible attacks -against both ElGamal and RSA when they are used in this way. - -\subsection{Signatures} - -The signature algorithms look quite a bit like the hash functions. You -can repeatedly call \function{update}, giving more and more of a -message you wish to sign, and then call \function{signature}, which -will return a signature for that message. If you want to do it all in -one shot, call \function{sign\_message}, which will just call -\function{update} with its argument and then return whatever -\function{signature} returns. Generating a signature requires random -numbers with some schemes, so \function{signature} and -\function{sign\_message} both take a \type{RandomNumberGenerator\&}. - -You can validate a signature by updating the verifier class, and finally seeing -if the value returned from \function{check\_signature} is true (you pass -the supposed signature to the \function{check\_signature} function as a byte -array and a length or as a \type{MemoryRegion<byte>}).
There is another -function, \function{verify\_message}, which takes a pair of byte array/length -pairs (or a pair of \type{MemoryRegion<byte>} objects), the first of which is -the message, the second being the (supposed) signature. It returns true if the -signature is valid and false otherwise. - -Available public key signature algorithms in Botan are RSA, DSA, -Nyberg-Rueppel, and Rabin-Williams. Signature encoding methods include EMSA1, -EMSA2, EMSA3, EMSA4, and Raw. All of them, except Raw, take a parameter naming -a message digest function to hash the message with. Raw actually signs the -input directly; if the message is too big, the signing operation will fail. Raw -is not useful except in very specialized applications. - -There are various interactions that make certain encoding schemes and signing -algorithms more or less useful. - -EMSA2 is the usual method for encoding Rabin-Williams signatures, so for -compatibility with other implementations you may have to use that. EMSA4 (also -called PSS) also works with Rabin-Williams. EMSA1 and EMSA3 do \emph{not} work -with Rabin-Williams. - -RSA can be used with any of the available encoding methods. EMSA4 is by far the -most secure, but is not (as of now) widely implemented. EMSA3 (also called -``EMSA-PKCS1-v1\_5'') is commonly used with RSA (for example in SSL). EMSA1 -signs the message digest directly, without any extra padding or encoding. This -may be useful, but is not as secure as either EMSA3 or EMSA4. EMSA2 may be used -but is not recommended. - -For DSA and Nyberg-Rueppel, you should use EMSA1. None of the other encoding -methods are particularly useful for these algorithms. - -\subsection{Key Agreement} - -You can get a hold of a \type{PK\_Key\_Agreement\_Scheme} object by calling -\function{get\_pk\_kas} with a key that is of a type that supports key -agreement (such as a Diffie-Hellman key stored in a \type{DH\_PrivateKey} -object), and the name of a key derivation function.
This can be ``Raw'', -meaning the output of the primitive itself is returned as the key, or -``KDF1(hash)'' or ``KDF2(hash)'' where ``hash'' is any string you happen to -like (hopefully you like strings like ``SHA-256'' or ``RIPEMD-160''), or -``X9.42-PRF(keywrap)'', which uses the PRF specified in ANSI X9.42. It takes -the name or OID of the key wrap algorithm that will be used to encrypt a -content encryption key. - -How key agreement generally works is that you trade public values with some -other party, and then each of you runs a computation with the other's value and -your key (this should return the same result to both parties). This computation -can be called by using \function{derive\_key} with either a byte array/length -pair, or a \type{SecureVector<byte>} that holds the public value of the other -party. The last argument to either call is a number that specifies how long a -key you want. - -Depending on the key derivation function you're using, you may not -\emph{actually} get back a key of that size. In particular, ``Raw'' will return -a number about the size of the Diffie-Hellman modulus, and KDF1 can only return -a key that is the same size as the output of the hash. KDF2, on the other -hand, will always give you a key exactly as long as you request, regardless of -the underlying hash used with it. The key returned is a \type{SymmetricKey}, -ready to pass to a block cipher, MAC, or other symmetric algorithm. - -The public value that should be used can be obtained by calling -\function{public\_data}, which exists for any key that is associated with a -key agreement algorithm. It returns a \type{SecureVector<byte>}. - -``KDF2(SHA-256)'' is by far the preferred algorithm for key derivation -in new applications. The X9.42 algorithm may be useful in some -circumstances, but unless you need X9.42 compatibility, KDF2 is easier -to use. - -There is a Diffie-Hellman example included in the distribution, which you may -want to examine.
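The exchange described in this section (trade public values, then combine the other party's value with your own private key) can be sketched with toy numbers in standalone C++. This is not the Botan API and not the example shipped with the distribution: \texttt{ToyDHParty} and \texttt{mod\_pow} are hypothetical names, the parameters are far too small to be secure, and the result corresponds only to the ``Raw'' output of the primitive, before any key derivation function is applied.

```cpp
#include <cassert>
#include <cstdint>

// Modular exponentiation by repeated squaring: (base^exp) mod m.
// Assumes m < 2^32 so that every intermediate product fits in 64 bits.
std::uint64_t mod_pow(std::uint64_t base, std::uint64_t exp, std::uint64_t m)
{
    std::uint64_t result = 1 % m;
    base %= m;
    while (exp > 0) {
        if (exp & 1)
            result = (result * base) % m;
        base = (base * base) % m;
        exp >>= 1;
    }
    return result;
}

// Hypothetical toy party: shared group (p, g) plus a private value x.
struct ToyDHParty {
    std::uint64_t p;  // prime modulus (shared)
    std::uint64_t g;  // generator (shared)
    std::uint64_t x;  // private value (kept secret)

    // The value sent to the other party: g^x mod p.
    std::uint64_t public_value() const { return mod_pow(g, x, p); }

    // Combine the other party's public value with our private value.
    std::uint64_t derive(std::uint64_t other_public) const
    {
        return mod_pow(other_public, x, p);
    }
};
```

With the toy group p = 23, g = 5 and private values 4 and 3, the public values are 4 and 10, and both parties derive the same value, 18. In the library this raw value would then be run through the chosen key derivation function (for instance ``KDF2(SHA-256)'') to obtain a key of the requested length.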
- -\subsection{Importing and Exporting PK Keys} - -[This section mentions \type{Pipe} and \type{DataSource}, which are not covered -until later in the manual. Please read those sections for more about -\type{Pipe} and \type{DataSource} and their uses.] - -There are many, many different (often conflicting) standards surrounding public -key cryptography. There are, thankfully, only two major standards surrounding -the representation of a public or private key: X.509 (for public keys), and -PKCS \#8 (for private keys). Other crypto libraries, like OpenSSL and B-SAFE, -also support these formats, so you can easily exchange keys with software that -doesn't use Botan. - -In addition to ``plain'' public keys, Botan also supports X.509 certificates. -These are documented in the section ``Certificate Handling'', later in this -manual. - -\subsubsection{Public Keys} - -The interfaces for doing either of these are quite similar. Let's look at the -X.509 stuff first: -\begin{verbatim} -namespace X509 { - void encode(const X509_PublicKey& key, Pipe& out, X509_Encoding enc = PEM); - std::string PEM_encode(const X509_PublicKey& out); - - X509_PublicKey* load_key(DataSource& in); - X509_PublicKey* load_key(const std::string& file); - X509_PublicKey* load_key(const SecureVector<byte>& buffer); -} -\end{verbatim} - -Basically, \function{X509::encode} will take an \type{X509\_PublicKey} -(as of now, that's any RSA, DSA, or Diffie-Hellman key) and encode it -using \arg{enc}, which can be either \type{PEM} or -\type{RAW\_BER}. Using \type{PEM} is \emph{highly} recommended for -many reasons, including compatibility with other software, for -transmission over channels that are not 8-bit clean, because it can be identified -by a human without special tools, and because it sometimes allows more -sane behavior of tools that process the data. It will place the -encoding into \arg{out}.
Remember that if you have just created the -\type{Pipe} that you are passing to \function{X509::encode}, you need -to call \function{start\_msg} first. Particularly with public keys, -about 99\% of the time you just want to PEM encode the key and then -write it to a file or something. In this case, it's probably easier to -use \function{X509::PEM\_encode}. This function will simply return the -PEM encoding of the key as a \type{std::string}. - -For loading a public key, the preferred method is one of the variants -of \function{load\_key}. This function will return a newly allocated -key based on the data from whatever source it is using (assuming, of -course, the source is in fact storing a representation of a public -key). The encoding used (PEM or BER) need not be specified; the format -will be detected automatically. The key is allocated with -\function{new}, and should be released with \function{delete} when you -are done with it. The first takes a generic \type{DataSource} that -you have to allocate~--~the others are simple wrapper functions that -take either a filename or a memory buffer. - -So what can you do with the return value of \function{load\_key}? On -its own, a \type{X509\_PublicKey} isn't particularly useful; you can't -encrypt messages or verify signatures, or much else. But, using -\function{dynamic\_cast}, you can figure out what kind of operations -the key supports. Then, you can cast the key to the appropriate type -and pass it to a higher-level class. For example: - -\begin{verbatim} - /* Might be RSA, might be ElGamal, might be ... 
*/ - X509_PublicKey* key = X509::load_key("pubkey.asc"); - /* You MUST use dynamic_cast to convert, because of virtual bases */ - PK_Encrypting_Key* enc_key = dynamic_cast<PK_Encrypting_Key*>(key); - if(!enc_key) - throw Some_Exception(); - PK_Encryptor* enc = get_pk_encryptor(*enc_key, "EME1(SHA-256)"); - /* encrypt takes an RNG; rng is assumed to be e.g. an AutoSeeded_RNG */ - SecureVector<byte> cipher = enc->encrypt(some_message, size_of_message, rng); -\end{verbatim} - -\subsubsection{Private Keys} - -There are two different options for private key import/export. The first is a -plaintext version of the private key. This is supported by the following -functions: - -\begin{verbatim} -namespace PKCS8 { - void encode(const PKCS8_PrivateKey& key, Pipe& to, X509_Encoding enc = PEM); - - std::string PEM_encode(const PKCS8_PrivateKey& key); -} -\end{verbatim} - -These functions are basically the same as the X.509 functions described -previously. The only difference is that they take a \type{PKCS8\_PrivateKey} -type (which, again, can be either RSA, DSA, or Diffie-Hellman, but this time -the key must be a private key). In most situations, using these is a bad idea, -because anyone can come along and grab the private key without having to know -any passwords or other secrets. Unless you have very particular security -requirements, always use the versions that encrypt the key based on a -passphrase. For importing, the same functions can be used for encrypted and -unencrypted keys. - -The other way to export a PKCS \#8 key is to first encode it in the same manner -as done above, then encrypt it (using a passphrase and the techniques of PKCS -\#5), and store the whole thing into another structure. This method is -definitely preferred, since otherwise the private key is unprotected.
The -following functions support this technique: - -\begin{verbatim} -namespace PKCS8 { - void encrypt_key(const PKCS8_PrivateKey& key, Pipe& out, - std::string passphrase, std::string pbe = "", - X509_Encoding enc = PEM); - - std::string PEM_encode(const PKCS8_PrivateKey& key, std::string passphrase, - std::string pbe = ""); -} -\end{verbatim} - -To export an encrypted private key, call \function{PKCS8::encrypt\_key}. The -\arg{key}, \arg{out}, and \arg{enc} arguments are similar in usage to the ones -for \function{PKCS8::encode}. As you might notice, there are two new arguments -for \function{PKCS8::encrypt\_key}, however. The first is a passphrase (which -you presumably got from a user somehow). This will be used to encrypt the key. -The second new argument is \arg{pbe}; this specifies a particular password -based encryption (or PBE) algorithm. - -The \function{PEM\_encode} version shown here is similar to the one that -doesn't take a passphrase. Essentially it encrypts the key (using the default -PBE algorithm), and then returns a C++ string with the PEM encoding of the key. - -If \arg{pbe} is blank, then the default algorithm (controlled by the -``base/default\_pbe'' option) will be used. As shipped, this default is -``PBE-PKCS5v20(SHA-1,TripleDES/CBC)''. This is among the more secure options -of PKCS \#5, and is widely supported among implementations of PKCS \#5 v2.0. It -uses a 168-bit key (meet-in-the-middle attacks reduce the effective strength to -about 112 bits), which should be more than sufficient. If you need compatibility -with systems that only support PKCS \#5 -v1.5, pass ``PBE-PKCS5v15(MD5,DES/CBC)'' as \arg{pbe}. However, be warned that -this PBE algorithm only has 56 bits of security against brute force attacks. As -of 1.4.5, all three keylengths of AES are also available as options, which can -be used by specifying a PBE algorithm of -``PBE-PKCS5v20(SHA-1,AES-256/CBC)'' (or ``AES-128'' or ``AES-192'').
Support -for AES is slightly non-standard, and some applications or libraries might not -handle it. It is known that OpenSSL (0.9.7 and later) does handle AES for private -key encryption. - -There may be some strange programs out there that support the v2.0 extensions -to PBES1 but not PBES2; if you need to inter-operate with a program like that, -use ``PBE-PKCS5v15(MD5,RC2/CBC)''. For example, OpenSSL supports this format -(though since it also supports the v2.0 schemes, there is no reason not to just -use TripleDES or AES). This scheme uses a 64-bit key that, while -significantly better than a 56-bit key, is a bit too small for comfort. - -Last but not least, there are some functions that are basically identical to -\function{X509::load\_key} that will load, and possibly decrypt, a PKCS \#8 -private key: - -\begin{verbatim} -namespace PKCS8 { - PKCS8_PrivateKey* load_key(DataSource& in, - RandomNumberGenerator& rng, - const User_Interface& ui); - PKCS8_PrivateKey* load_key(DataSource& in, - RandomNumberGenerator& rng, - std::string passphrase = ""); - - PKCS8_PrivateKey* load_key(const std::string& filename, - RandomNumberGenerator& rng, - const User_Interface& ui); - PKCS8_PrivateKey* load_key(const std::string& filename, - RandomNumberGenerator& rng, - const std::string& passphrase = ""); -} -\end{verbatim} - -The versions that take \type{std::string} \arg{passphrase}s are primarily for -compatibility, but they are useful in limited circumstances. The -\type{User\_Interface} versions are how \function{load\_key} is actually -implemented, and provide for much more flexibility. Essentially, if the -passphrase given to the function is not correct, then an exception is thrown -and that is that. However, if you pass in a UI object instead, then the UI -object can keep asking the user for the passphrase until they get it right (or -until they cancel the action, through the UI interface).
A -\type{User\_Interface} has very little to do with talking to users; it's just a -way to glue together Botan and whatever user interface you happen to be -using. You can think of it as a user interface interface. The default -\type{User\_Interface} is actually very dumb, and effectively acts just like -the versions taking the \type{std::string}. - -All versions need access to a \type{RandomNumberGenerator} in order to -perform probabilistic tests on the loaded key material. - -After loading a key, you can use \function{dynamic\_cast} to find out what -operations it supports, and use it appropriately. Remember to \function{delete} -it once you are done with it. - -\subsubsection{Limitations} - -As of now Nyberg-Rueppel and Rabin-Williams keys cannot be imported or -exported, because they have no official ASN.1 OID or definition. ElGamal keys -can (as of Botan 1.3.8) be imported and exported, but the only other -implementation that supports the format is Peter Gutmann's Cryptlib. If you -can help it, stick to RSA and DSA. - -\emph{Note}: Currently NR and RW are given basic ASN.1 key formats (which -mirror DSA and RSA, respectively), which means that, if they are assigned an -OID, they can be imported and exported just as easily as RSA and DSA. You can -assign them an OID by putting a line in a Botan configuration file, calling -\function{OIDS::add\_oid}, or editing \filename{src/policy.cpp}. Be warned that -it is possible that a future version will use a format that is different from -the current one (\ie, a newly standardized format). - -\pagebreak -\section{Certificate Handling} - -A certificate is essentially a binding between some identifying information of -a person or other entity (called a \emph{subject}) and a public key. 
This -binding is asserted by a signature on the certificate, which is placed there by -some authority (the \emph{issuer}) that at least claims that it knows the -subject named in the certificate really ``owns'' the private key -corresponding to the public key in the certificate. - -The major certificate format in use today is X.509v3, designed by ISO and -further hacked on by dozens (hundreds?) of other organizations. - -When working with certificates, the main class to remember is -\type{X509\_Certificate}. You can read an object of this type, but you can't -create one on the fly; a CA object is necessary for actually making a new -certificate. So for the most part, you only have to worry about reading them -in, verifying the signatures, and getting the bits of data in them (most -commonly the public key, and the information about the user of that key). An -X.509v3 certificate can contain a literally infinite number of items related to -all kinds of things. Botan doesn't support a lot of them, simply because nobody -uses them and they're an impossible mess to work with. This section only -documents the most commonly used ones of the ones that are supported; for the -rest, read \filename{x509cert.h} and \filename{asn1\_obj.h} (which has the -definitions of various common ASN.1 constructs used in X.509). - -\subsection{So what's in an X.509 certificate?} - -Obviously, you want to be able to get the public key. This is achieved by -calling the member function \function{subject\_public\_key}, which will return -a \type{X509\_PublicKey*}. As to what to do with this, read about -\function{load\_key} in the section ``Importing and Exporting PK Keys''. In the -general case, this could be any kind of public key, though 99\% of the time it -will be an RSA key. However, Diffie-Hellman and DSA keys are also supported, so -be careful about how you treat this. 
It is also a wise idea to examine the -value returned by \function{constraints}, to see what uses the public key is -approved for. - -The second major piece of information you'll want is the name/email/etc of the -person to whom this certificate is assigned. Here is where things get a little -nasty. X.509v3 has two (well, mostly just two $\ldots$) different places where -you can stick information about the user: the \emph{subject} field, and in an -extension called \emph{subjectAlternativeName}. The \emph{subject} field is -supposed to only include the following information: country, organization -(possibly), an organizational sub-unit name (possibly), and a so-called common -name. The common name is usually the name of the person, or it could be a title -associated with a position of some sort in the organization. It may also -include fields for state/province and locality. What exactly a locality is, -nobody knows, but it's usually given as a city name. - -Botan doesn't currently support any of the Unicode variants used in ASN.1 -(UTF-8, UCS-2, and UCS-4), any of which could be used for the fields in the -DN. This could be problematic, particularly in Asia and other areas where -non-ASCII characters are needed for most names. The UTF-8 and UCS-2 string -types \emph{are} accepted (in fact, UTF-8 is used when encoding much of the -time), but if any of the characters included in the string are not in ISO -8859-1 (\ie 0 \ldots 255), an exception will get thrown. Currently the -\type{ASN1\_String} type holds its data as ISO 8859-1 internally (regardless -of local character set); this would have to be changed to hold UCS-2 or UCS-4 -in order to support Unicode (also, many interfaces in the X.509 code would have -to accept or return a \type{std::wstring} instead of a \type{std::string}). - -Like the distinguished names, subject alternative names can contain a lot of -things that Botan will flat out ignore (most of which you would never actually -want to use). 
However, there are three very useful pieces of information that -this extension might hold: an email address (``person@site1.com''), a DNS name -(``somehost.site2.com''), or a URI (``http://www.site3.com''). - -So, how to get the information? Simply call \function{subject\_info} with the -name of the piece of information you want, and it will return a -\type{std::string} that is either empty (signifying that the certificate -doesn't have this information), or has the information requested. There are -several names for each possible item, but the most easily readable ones are: -``Name'', ``Country'', ``Organization'', ``Organizational Unit'', ``Locality'', -``State'', ``RFC822'', ``URI'', and ``DNS''. These values are returned as a -\type{std::string}. - -You can also get information about the issuer of the certificate in the same -way, using \function{issuer\_info}. - -\subsubsection{X.509v3 Extensions} - -X.509v3 specifies a large number of possible extensions. Botan supports some, -but by no means all of them. This section lists which ones are supported, and -notes areas where there may be problems with the handling. You have to be -pretty familiar with X.509 in order to understand what this is talking about. - -\begin{list}{$\cdot$} - \item Key Usage and Extended Key Usage: No problems known. - \item - - \item Basic Constraints: No problems known. The default for a v1/v2 - certificate is to assume it's a CA if and only if the option - ``x509/default\_to\_ca'' is set. A v3 certificate is marked as a CA if - (and only if) the basic constraints extension is present and set for a - CA cert. - - \item Subject Alternative Names: Only the ``rfc822Name'', ``dNSName'', and - ``uniformResourceIdentifier'' fields will be stored; all others are - ignored. - - \item Issuer Alternative Names: Same restrictions as the Subject Alternative - Names extension. New certificates generated by Botan never include the - issuer alternative name. 
- - \item Authority Key Identifier: Only the version using KeyIdentifier is - supported. If the GeneralNames version is used and the extension is - critical, an exception is thrown. If both the KeyIdentifier and - GeneralNames versions are present, then the KeyIdentifier will be - used, and the GeneralNames ignored. - - \item Subject Key Identifier: No problems known. -\end{list} - -\subsubsection{Revocation Lists} - -It will occasionally happen that a certificate must be revoked before its -expiration date. Examples of this happening include the private key being -compromised, or the user to whom it has been assigned leaving an -organization. Certificate revocation lists are an answer to this problem -(though online certificate validation techniques are starting to become -somewhat more popular). Essentially, every once in a while the CA will release -a CRL, listing all certificates that have been revoked. Also included are -various pieces of information like what time a particular certificate was -revoked, and for what reason. In most systems, it is wise to support some form -of certificate revocation, and CRLs handle this fairly easily. - -For most users, processing a CRL is quite easy. All you have to do is call the -constructor, which will take a filename (or a \type{DataSource\&}). The CRLs -can either be in raw BER/DER, or in PEM format; the constructor will figure out -which format without any extra information. For example: - -\begin{verbatim} - X509_CRL crl1("crl1.der"); - - DataSource_Stream in("crl2.pem"); - X509_CRL crl2(in); -\end{verbatim} - -After that, pass the \type{X509\_CRL} object to a \type{X509\_Store} object -with \type{X509\_Code} \function{add\_crl}(\type{X509\_CRL}), and all future -verifications will take into account the certificates listed, assuming -\function{add\_crl} returns \type{VERIFIED}. 
If it doesn't return -\type{VERIFIED}, then the return value is an error code signifying that the CRL -could not be processed due to some problem (which could range from the issuing -certificate not being found, to the CRL having some format problem). For more -about the \type{X509\_Store} API, read the section later in this chapter. - -\subsection{Reading Certificates} - -\type{X509\_Certificate} has two constructors, each of which takes a source of -data: a filename to read, or a \type{DataSource\&}. - -\subsection{Storing and Using Certificates} - -If you read a certificate, you probably want to verify the signature on -it. However, consider that to do so, we may have to verify the signature on the -certificate that we used to verify the first certificate, and on and on until -we hit the top of the certificate tree somewhere. It would be a mighty huge pain -to have to handle all of that manually in every application, so there is -something that does it for you: \type{X509\_Store}. - -This is a pretty easy thing to use. The basic operations are: put certificates -and CRLs into it, search for certificates, and attempt to verify -certificates. That's about it. In the future, there will be support for online -retrieval of certificates and CRLs (\eg with the HTTP cert-store interface -currently under consideration by PKIX). - -\subsubsection{Adding Certificates} - -You can add new certificates to a certificate store using any of these -functions: - -\function{add\_cert}(\type{const X509\_Certificate\&} \arg{cert}, - \type{bool} \arg{trusted} \type{= false}) - -\function{add\_certs}(\type{DataSource\&} \arg{source}) - -\function{add\_trusted\_certs}(\type{DataSource\&} \arg{source}) - -The versions that take a \type{DataSource\&} will add all the certificates -that it can find in that source. - -All of them add the cert(s) to the store. The 'trusted' certificates are the -ones that you have some reason to trust are genuine. 
For example, say your -application is working with certificates that are owned by employees of some -company, and all of their certificates are signed by the company CA, whose -certificate is in turn signed by a commercial root CA. What you would then do -is include the certificate of the commercial CA with your application, and read -it in as a trusted certificate. From there, you could verify the company CA's -certificate, and then use that to verify the end user's certificates. Only -self-signed certificates may be considered trusted. - -\subsubsection{Adding CRLs} - -\type{X509\_Code} \function{add\_crl}(\type{const X509\_CRL\&} \arg{crl}); - -This will process the CRL and mark the revoked certificates. This will also -work if a revoked certificate is added to the store sometime after the CRL is -processed. The function can return an error code (listed later), or will return -\type{VERIFIED} if everything completed successfully. - -\subsubsection{Storing Certificates} - -You can output a set of certificates by calling \function{PEM\_encode}, which -will return a \type{std::string} containing each of the certificates in the -store, PEM encoded and concatenated. This simple format can easily be read by -both Botan and other libraries/applications. - -\subsubsection{Searching for Certificates} - -You can find certificates in the store with a series of functions contained -in the \function{X509\_Store\_Search} namespace: - -\begin{verbatim} -namespace X509_Store_Search { -std::vector<X509_Certificate> by_email(const X509_Store& store, - const std::string& email_addr); -std::vector<X509_Certificate> by_name(const X509_Store& store, - const std::string& name); -std::vector<X509_Certificate> by_dns(const X509_Store&, - const std::string& dns_name); -} -\end{verbatim} - -These functions will return a (possibly empty) vector of certificates from -\arg{store} matching your search criteria. 
The email address and DNS name -searches are case-insensitive but are sensitive to extra whitespace and so -on. The name search will do case-insensitive substring matching, so, for -example, calling \function{X509\_Store\_Search::by\_name}(\arg{your\_store}, -``dob'') will return certificates for ``J.R. 'Bob' Dobbs'' and -``H. Dobbertin'', assuming both of those certificates are in \arg{your\_store}. - -You could then display the results to a user, and allow them to select the -appropriate one. Searching using an email address as the key is usually more -effective than the name, since email addresses are rarely shared. - -\subsubsection{Certificate Stores} - -An object of type \type{Certificate\_Store} is a generalized interface to an -external source for certificates (and CRLs). Examples of such a store would be -one that looked up the certificates in a SQL database, or by contacting a CGI -script running on an HTTP server. There are currently three mechanisms for -looking up a certificate, and one for retrieving CRLs. By default, most of -these mechanisms will simply return an empty \type{std::vector} of -\type{X509\_Certificate}. This storage mechanism is \emph{only} queried when -doing certificate validation: it allows you to distribute only the root key -with an application, and let some online method handle getting all the other -certificates that are needed to validate an end entity certificate. In -particular, the search routines will not attempt to access the external -database. - -The three certificate lookup methods are \function{by\_SKID} (Subject Key -Identifier), \function{by\_name} (the CommonName DN entry), and -\function{by\_email} (stored in either the distinguished name, or in a -subjectAlternativeName extension). The name and email versions take a -\type{std::string}, while the SKID version takes a \type{SecureVector<byte>} -containing the subject key identifier in raw binary. 
You can choose not to -implement \function{by\_name} or \function{by\_email}, but \function{by\_SKID} -is mandatory to implement, and, currently, is the only version that is used by -\type{X509\_Store}. - -Finally, there is a method for finding CRLs, called \function{get\_crls\_for}, -that takes an \type{X509\_Certificate} object, and returns a -\type{std::vector} of \type{X509\_CRL}. While generally there will be only one -CRL, the use of the vector makes it easy to return no CRLs (\eg, if the -certificate store doesn't support retrieving them), or return multiple ones -(for example, if the certificate store can't determine precisely which key was -used to sign the certificate). Implementing the function is optional, and by -default will return no CRLs. If it is available, it will be used by -\type{X509\_CRL}. - -As for actually using such a store, you have to tell \type{X509\_Store} about -it, by calling the \type{X509\_Store} member function - -\function{add\_new\_certstore}(\type{Certificate\_Store}* \arg{new\_store}) - -The argument, \arg{new\_store}, will be deleted by \type{X509\_Store}'s -destructor, so make sure to allocate it with \function{new}. - -\subsubsection{Verifying Certificates} - -There is a single function in \type{X509\_Store} related to verifying a -certificate: - -\type{X509\_Code} -\function{validate\_cert}(\type{const X509\_Certificate\&} \arg{cert}, - \type{Cert\_Usage} \arg{usage} = \type{ANY}) - -To sum things up simply, it returns \type{VERIFIED} if the certificate can -safely be considered valid for the usage(s) described by \arg{usage}, and an -error code if it is not. Naturally, things are a bit more complicated than -that. 
The enum \type{Cert\_Usage} is defined inside the \type{X509\_Store} -class; it (currently) can take on any of the values \type{ANY} (any usage is -OK), \type{TLS\_SERVER} (for SSL/TLS server authentication), \type{TLS\_CLIENT} -(for SSL/TLS client authentication), \type{CODE\_SIGNING}, -\type{EMAIL\_PROTECTION} (email encryption, usually this means S/MIME), -\type{TIME\_STAMPING} (in theory any time stamp application, usually IETF -PKIX's Time Stamp Protocol), or \type{CRL\_SIGNING}. Note that Microsoft's code -signing system, certainly the most widely used, uses a completely different -(and basically undocumented) method for marking certificates for code signing. - -First, how does it know if a certificate is valid? Basically, a certificate is -valid if both of the following hold: a) the signature in the certificate can be -verified using the public key in the issuer's certificate, and b) the issuer's -certificate is a valid CA certificate. Note that this definition is -recursive. We get out of this by ``bottoming out'' when we reach a certificate -that we consider trusted. In general this will either be a commercial root CA, -or an organization or application specific CA. - -There are actually a few other restrictions (validity periods, key usage -restrictions, etc), but the above summarizes the major points of the validation -algorithm. In theory, Botan implements the certificate path validation -algorithm given in RFC 2459, but in practice it does not (yet), because we -don't support the X.509v3 policy or name constraint extensions. - -Possible values for \arg{usage} are \type{TLS\_SERVER}, \type{TLS\_CLIENT}, -\type{CODE\_SIGNING}, \type{EMAIL\_PROTECTION}, \type{CRL\_SIGNING}, -\type{TIME\_STAMPING}, and \type{ANY}. The default \type{ANY} does not mean -valid for any use; it means ``is valid for some usage''. 
This is generally -fine, and in fact requiring that a random certificate support a particular -usage will likely result in a lot of failures, unless your application is very -careful to always issue certificates with the proper extensions, and you never -use certificates generated by other apps. - -Return values for \function{validate\_cert} (and \function{add\_crl}) include: - -\begin{list}{$\cdot$} - \item VERIFIED: The certificate is valid for the specified use. - \item - \item INVALID\_USAGE: The certificate cannot be used for the specified use. - - \item CANNOT\_ESTABLISH\_TRUST: The root certificate was not marked as - trusted. - \item CERT\_CHAIN\_TOO\_LONG: The certificate chain exceeded the length - allowed by a basicConstraints extension. - \item SIGNATURE\_ERROR: An invalid signature was found. - \item POLICY\_ERROR: Some problem with the certificate policies was found. - - \item CERT\_FORMAT\_ERROR: Some format problem was found in a certificate. - \item CERT\_ISSUER\_NOT\_FOUND: The issuer of a certificate could not be - found. - \item CERT\_NOT\_YET\_VALID: The certificate is not yet valid. - \item CERT\_HAS\_EXPIRED: The certificate has expired. - \item CERT\_IS\_REVOKED: The certificate has been revoked. - - \item CRL\_FORMAT\_ERROR: Some format problem was found in a CRL. - \item CRL\_ISSUER\_NOT\_FOUND: The issuer of a CRL could not be found. - \item CRL\_NOT\_YET\_VALID: The CRL is not yet valid. - \item CRL\_HAS\_EXPIRED: The CRL has expired. - - \item CA\_CERT\_CANNOT\_SIGN: The CA certificate found does not - contain a public key that allows signature verification. - \item CA\_CERT\_NOT\_FOR\_CERT\_ISSUER: The CA cert found is not allowed to - issue certificates. - \item CA\_CERT\_NOT\_FOR\_CRL\_ISSUER: The CA cert found is not allowed to - issue CRLs. - - \item UNKNOWN\_X509\_ERROR: Some other error occurred. 
- -\end{list} - -\subsection{Certificate Authorities} - -Setting up a CA for X.509 certificates is actually probably the easiest thing -to do related to X.509. A CA is represented by the type \type{X509\_CA}, which -can be found in \filename{x509\_ca.h}. A CA always needs its own certificate, -which can either be a self-signed certificate (see below on how to create one) -or one issued by another CA (see the section on PKCS \#10 requests). Creating -a CA object is done by the following constructor: - -\begin{verbatim} - X509_CA(const X509_Certificate& cert, const PKCS8_PrivateKey& key); -\end{verbatim} - -The private key is the private key corresponding to the public key in the -CA's certificate. - -Generally, requests for new certificates are supplied to a CA in the form of -PKCS \#10 certificate requests (called a \type{PKCS10\_Request} object in -Botan). These are decoded in a similar manner to -certificates/CRLs/etc. Generally, a request is vetted by humans (who somehow -verify that the name in the request corresponds to the name of the person who -requested it), and then signed by a CA key, generating a new certificate. - -\begin{verbatim} - X509_Certificate sign_request(const PKCS10_Request&) const; -\end{verbatim} - -\subsubsection{Generating CRLs} - -As mentioned previously, the ability to process CRLs is highly important in -many PKI systems. In fact, according to strict X.509 rules, you must not -validate any certificate if the appropriate CRLs are not available (though -hardly any systems are that strict). In any case, a CA should have a valid CRL -available at all times. - -Of course, you might be wondering what to do if no certificates have been -revoked. In fact, CRLs can be issued without any actually revoked certificates -- the list of certs will simply be empty. To generate a new, empty CRL, just -call \type{X509\_CRL} -\function{X509\_CA::new\_crl}(\type{u32bit}~\arg{seconds}~=~0)~--~it will -create a new, empty, CRL. 
If \arg{seconds} is the default 0, then the normal -default CRL next update time (the value of the ``x509/crl/next\_update'' option) will -be used. If not, then \arg{seconds} specifies how long (in seconds) it will be -until the CRL's next update time (after this time, most clients will reject the -CRL as too old). - -On the other hand, you may have issued a CRL before. In that case, you will -want to issue a new CRL that contains all previously revoked -certificates, along with any new ones. This is done by calling the -\type{X509\_CA} member function -\function{update\_crl}(\type{X509\_CRL}~\arg{old\_crl}, -\type{std::vector<CRL\_Entry>}~\arg{new\_revoked}, -\type{u32bit}~\arg{seconds}~=~0), where \arg{old\_crl} is the last CRL this -CA issued, and \arg{new\_revoked} is a list of any newly revoked certificates. -The function returns a new \type{X509\_CRL} to make available for clients. The -semantics of the \arg{seconds} argument are the same as for \function{new\_crl}. - -The \type{CRL\_Entry} type is a structure that contains, at a minimum, the -serial number of the revoked certificate. As serial numbers are never repeated, -the pairing of an issuer and a serial number (should) distinctly identify any -certificate. In this case, we represent the serial number as a -\type{SecureVector<byte>} called \arg{serial}. There are two additional -(optional) values, an enumeration called \type{CRL\_Code} that specifies the -reason for revocation (\arg{reason}), and an object that represents the time -that the certificate became invalid (if this information is known). - -If you wish to remove an old entry from the CRL, insert a new entry for the -same cert, with a \arg{reason} code of \type{DELETE\_CRL\_ENTRY}. For example, -if a revoked certificate has expired 'normally', there is no reason to continue -to explicitly revoke it, since clients will reject the cert as expired in any -case. 
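The bookkeeping behind such an update can be sketched without Botan. The following is a minimal illustration under stated assumptions, not Botan's implementation: \type{Entry}, \type{Reason}, and \function{merge\_revoked} are hypothetical stand-ins for \type{CRL\_Entry}, \type{CRL\_Code}, and the list handling inside \function{update\_crl}. It assumes that a new entry replaces any earlier entry with the same serial number, and that an entry with reason \type{DELETE\_CRL\_ENTRY} only removes the earlier entry and is not itself listed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for CRL_Code and CRL_Entry; these are not Botan types.
enum Reason { UNSPECIFIED, KEY_COMPROMISE, DELETE_CRL_ENTRY };

struct Entry {
    std::vector<unsigned char> serial; // certificate serial number, raw binary
    Reason reason;                     // why the certificate was revoked
};

// Build the entry list for the next CRL from the previous list plus the
// newly submitted entries (assumed semantics, see the text above).
std::vector<Entry> merge_revoked(std::vector<Entry> revoked,
                                 const std::vector<Entry>& new_revoked) {
    for (std::size_t i = 0; i != new_revoked.size(); ++i) {
        const Entry& e = new_revoked[i];
        assert(!e.serial.empty()); // every entry must name a certificate

        // Drop any existing entry for the same serial number.
        revoked.erase(
            std::remove_if(revoked.begin(), revoked.end(),
                           [&e](const Entry& x) { return x.serial == e.serial; }),
            revoked.end());

        // A deletion request only removes; anything else is (re)listed.
        if (e.reason != DELETE_CRL_ENTRY)
            revoked.push_back(e);
    }
    return revoked;
}
```

With this sketch, merging an old list containing serial 01 with a new list containing a revocation for serial 02 and a \type{DELETE\_CRL\_ENTRY} for serial 01 leaves only serial 02 listed.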
- -\subsubsection{Self-Signed Certificates} - -Generating a new self-signed certificate can often be useful, for example when -setting up a new root CA, or for use in email applications. In this case, -the solution is summed up simply as: - -\begin{verbatim} -namespace X509 { - X509_Certificate create_self_signed_cert(const X509_Cert_Options& opts, - const PKCS8_PrivateKey& key); -} -\end{verbatim} - -Where \arg{key} is obviously the private key you wish to use (the public key, -used in the certificate itself, is extracted from the private key), and -\arg{opts} is a structure that has various bits of information that will be -used in creating the certificate (this structure, and its use, is discussed -below). This function is found in the header \filename{x509self.h}. There is an -example of using this function in the \filename{self\_sig} example. - -\subsubsection{Creating PKCS \#10 Requests} - -Also in \filename{x509self.h}, there is a function for generating new PKCS \#10 -certificate requests. - -\begin{verbatim} -namespace X509 { - PKCS10_Request create_cert_req(const X509_Cert_Options&, - const PKCS8_PrivateKey&); -} -\end{verbatim} - -This function acts quite similarly to \function{create\_self\_signed\_cert}, -except it instead returns a PKCS \#10 certificate request. After creating it, -one would typically transmit it to a CA, who signs it and returns a freshly -minted X.509 certificate. There is an example of using this function in the -\filename{pkcs10} example. - -\subsubsection{Certificate Options} - -So what is this \type{X509\_Cert\_Options} thing we've been passing around? -Basically, it's a bunch of information that will end up being stored into the -certificate. This information comes in 3 major flavors: information about the -subject (CA or end-user), the validity period of the certificate, and -restrictions on the usage of the certificate. 
- -First and foremost are a number of \type{std::string} members, which contain -various bits of information about the user: \arg{common\_name}, -\arg{serial\_number}, \arg{country}, \arg{organization}, \arg{org\_unit}, -\arg{locality}, \arg{state}, \arg{email}, \arg{dns\_name}, and \arg{uri}. As -many of these as possible should be filled in (especially an email address), -though the only required ones are \arg{common\_name} and \arg{country}. - -There is another value that is only useful when creating a PKCS \#10 request, -which is called \arg{challenge}. This is a challenge password, which you can -later use to request certificate revocation (\emph{if} the CA supports doing -revocations in this manner). - -Then there is the validity period; these are set with \function{not\_before} -and \function{not\_after}. Both of these functions also take a -\type{std::string}, which specifies when the certificate should start being -valid, and when it should stop being valid. If you don't set the starting -validity period, it will automatically choose the current time. If you don't -set the ending time, it will choose the starting time plus a default time -period. The arguments to these functions specify the time in the following -format: ``2002/11/27 1:50:14''. The time is in 24-hour format, and the date is -encoded as year/month/day. The date must be specified, but you can omit the -time or trailing parts of it, for example ``2002/11/27 1:50'' or -``2002/11/27''. - -Lastly, you can set constraints on a key. The one you're most likely to want -to use is to create (or request) a CA certificate, which can be done by calling -the member function \function{CA\_key}. This should only be used when needed. - -Other constraints can be set by calling the member functions -\function{add\_constraints} and \function{add\_ex\_constraints}. The first -takes a \type{Key\_Constraints} value, and replaces any previously set -value. 
If no value is set, then the certificate key is marked as being valid -for any usage. You can set it to any of the following (for more than one -usage, OR them together): \type{DIGITAL\_SIGNATURE}, \type{NON\_REPUDIATION}, -\type{KEY\_ENCIPHERMENT}, \type{DATA\_ENCIPHERMENT}, \type{KEY\_AGREEMENT}, -\type{KEY\_CERT\_SIGN}, \type{CRL\_SIGN}, \type{ENCIPHER\_ONLY}, -\type{DECIPHER\_ONLY}. Many of these have quite special semantics, so you -should either consult the appropriate standards document (such as RFC 3280), or -simply not call \function{add\_constraints}, in which case the appropriate -values will be chosen for you. - -The second function, \function{add\_ex\_constraints}, allows you to specify an -OID that has some meaning with regards to restricting the key to particular -usages. You can, if you wish, specify any OID you like, but there is a set of -standard ones that other applications will be able to understand. These are -the ones specified by the PKIX standard, and are named ``PKIX.ServerAuth'' (for -TLS server authentication), ``PKIX.ClientAuth'' (for TLS client -authentication), ``PKIX.CodeSigning'', ``PKIX.EmailProtection'' (most likely -for use with S/MIME), ``PKIX.IPsecUser'', ``PKIX.IPsecTunnel'', -``PKIX.IPsecEndSystem'', and ``PKIX.TimeStamping''. You can call -\function{add\_ex\_constraints} any number of times~--~each new OID will be -added to the list to include in the certificate. - -\pagebreak -\section{The Low-Level Interface} - -Botan has two different interfaces. The one documented in this section is meant -more for implementing higher-level types (see the section on filters, earlier in -this manual) than for use by applications. Using it safely requires a solid -knowledge of encryption techniques and best practices, so unless you know, for -example, what CBC mode and nonces are, and why PKCS \#1 padding is important, -you should avoid this interface in favor of something working at a higher level -(such as the CMS interface). 
- -\subsection{Basic Algorithm Abilities} - -There are a small handful of functions implemented by most of Botan's -algorithm objects. Among these are: - -\noindent -\type{std::string} \function{name}(): - -Returns a human-readable string of the name of this algorithm. Examples of -names returned are ``Blowfish'' and ``HMAC(MD5)''. You can turn names back into -algorithm objects using the functions in \filename{lookup.h}. - -\noindent -\type{void} \function{clear}(): - -Clear out the algorithm's internal state. A block cipher object will ``forget'' -its key, a hash function will ``forget'' any data put into it, etc. Basically, -the object will look exactly as it did when you initially allocated it. - -\noindent -\function{clone}(): - -This function is central to Botan's name-based interface. The \function{clone} -has many different return types, such as \type{BlockCipher*} and -\type{HashFunction*}, depending on what kind of object it is called on. Note -that unlike Java's clone, this returns a new object in a ``pristine'' state; -that is, operations done on the initial object before calling \function{clone} -do not affect the initial state of the new clone. - -Cloned objects can (and should) be deallocated with the C++ \texttt{delete} -operator. - -\subsection{Keys and IVs} - -Both symmetric keys and initialization values can simply be considered byte (or -octet) strings. These are represented by the classes \type{SymmetricKey} and -\type{InitializationVector}, which are subclasses of \type{OctetString}. - -Since often it's hard to distinguish between a key and IV, many things (such as -key derivation mechanisms) return \type{OctetString} instead of -\type{SymmetricKey} to allow its use as a key or an IV. - -\noindent -\function{OctetString}(\type{u32bit} \arg{length}): - -This constructor creates a new random key of size \arg{length}. 
- -\noindent -\function{OctetString}(\type{std::string} \arg{str}): - -The argument \arg{str} is assumed to be a hex string; it is converted to binary -and stored. Whitespace is ignored. - -\noindent -\function{OctetString}(\type{const byte} \arg{input}[], \type{u32bit} -\arg{length}): - -This constructor simply copies its input. - -\subsection{Symmetrically Keyed Algorithms} - -Block ciphers, stream ciphers, and MACs all handle keys in pretty much the same -way. To make this similarity explicit, all algorithms of those types are -derived from the \type{SymmetricAlgorithm} base class. This type has three -functions: - -\noindent -\type{void} \function{set\_key}(\type{const byte} \arg{key}[], \type{u32bit} -\arg{length}): - -Most algorithms only accept keys of certain lengths. If you attempt to call -\function{set\_key} with a key length that is not supported, the exception -\type{Invalid\_Key\_Length} will be thrown. There is also another version of -\function{set\_key} that takes a \type{SymmetricKey} as an argument. - -\noindent -\type{bool} \function{valid\_keylength}(\type{u32bit} \arg{length}) const: - -This function returns true if a key of the given length will be accepted by -the cipher. - -There are also three constant data members of every \type{SymmetricAlgorithm} -object, which specify exactly what limits there are on keys which that object -can accept: - -MAXIMUM\_KEYLENGTH: The maximum length of a key. Usually, this is at most 32 -(256 bits), even if the algorithm actually supports more. In a few rare cases -larger keys will be supported. - -MINIMUM\_KEYLENGTH: The minimum length of a key. This is at least 1. - -KEYLENGTH\_MULTIPLE: The length of the key must be a multiple of this value. - -In all cases, \function{set\_key} must be called on an object before any data -processing (encryption, decryption, etc) is done by that object. 
If this is not -done, the results are undefined -- that is to say, Botan reserves the right in -this situation to do anything from printing a nasty, insulting message on the -screen to dumping core. - -\subsection{Block Ciphers} - -Block ciphers implement the interface \type{BlockCipher}, found in -\filename{base.h}, as well as the \type{SymmetricAlgorithm} interface. - -\noindent -\type{void} \function{encrypt}(\type{const byte} \arg{in}[BLOCK\_SIZE], - \type{byte} \arg{out}[BLOCK\_SIZE]) const - -\noindent -\type{void} \function{encrypt}(\type{byte} \arg{block}[BLOCK\_SIZE]) const - -These functions apply the block cipher transformation to \arg{in} and -place the result in \arg{out}, or encrypts \arg{block} in place -(\arg{in} may be the same as \arg{out}). BLOCK\_SIZE is a constant -member of each class, which specifies how much data a block cipher can -process at one time. Note that BLOCK\_SIZE is not a static class -member, meaning you can (given a \type{BlockCipher*} named -\arg{cipher}), call \verb|cipher->BLOCK_SIZE| to get the block size of -that particular object. \type{BlockCipher}s have similar functions -\function{decrypt}, which perform the inverse operation. - -\begin{verbatim} -AES_128 cipher; -SymmetricKey key(cipher.MAXIMUM_KEYLENGTH); // randomly created -cipher.set_key(key); - -byte in[16] = { /* secrets */ }; -byte out[16]; -cipher.encrypt(in, out); -\end{verbatim} - -\subsection{Stream Ciphers} - -Stream ciphers are somewhat different from block ciphers, in that encrypting -data results in changing the internal state of the cipher. Also, you may -encrypt any length of data in one go (in byte amounts). 
- -\noindent -\type{void} \function{encrypt}(\type{const byte} \arg{in}[], \type{byte} -\arg{out}[], \type{u32bit} \arg{length}) - -\noindent -\type{void} \function{encrypt}(\type{byte} \arg{data}[], \type{u32bit} -\arg{length}): - -These functions encrypt the arbitrary length (well, less than 4 gigabyte long) -string \arg{in} and place it into \arg{out}, or encrypts it in place in -\arg{data}. The \function{decrypt} functions look just like -\function{encrypt}. - -Stream ciphers implement the \type{SymmetricAlgorithm} interface. - -Some stream ciphers support random access to any point in their cipher -stream. For such ciphers, calling \type{void} \function{seek}(\type{u32bit} -\arg{byte}) will change the cipher's state so that it is as if the cipher had been -keyed as normal, then encrypted \arg{byte} -- 1 bytes of data (so the next byte -in the cipher stream is byte number \arg{byte}). - -\subsection{Hash Functions / Message Authentication Codes} - -Hash functions take their input without producing any output, only producing -anything when all input has already taken place. MACs are very similar, but are -additionally keyed. Both of these are derived from the base class -\type{BufferedComputation}, which has the following functions. - -\noindent -\type{void} \function{update}(\type{const byte} \arg{input}[], \type{u32bit} -\arg{length}) - -\noindent -\type{void} \function{update}(\type{byte} \arg{input}) - -\noindent -\type{void} \function{update}(\type{const std::string \&} \arg{input}) - -Updates the hash/mac calculation with \arg{input}. - -\noindent -\type{void} \function{final}(\type{byte} \arg{out}[OUTPUT\_LENGTH]) - -\noindent -\type{SecureVector<byte>} \function{final}(): - -Complete the hash/MAC calculation and place the result into \arg{out}. -OUTPUT\_LENGTH is a public constant in each object that gives the length of the -hash in bytes. After you call \function{final}, the hash function is reset to -its initial state, so it may be reused immediately. 
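The \function{update}/\function{final} contract described above can be sketched with a toy class. This is a minimal, self-contained illustration and not Botan code: \type{Toy\_Adler32} and \function{final\_value} are hypothetical names, and the Adler-32 checksum stands in for a real hash function only because it is short enough to show in full. It illustrates two points: feeding the input in pieces gives the same result as feeding it all at once, and the object returns to its initial state once the result has been read.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Toy stand-in for a buffered computation (NOT a Botan class).
// It computes Adler-32 and mimics only the update()/final() contract.
class Toy_Adler32
   {
   public:
      Toy_Adler32() : m_a(1), m_b(0) {}

      // Absorb more input; no output is produced yet
      void update(const unsigned char input[], std::size_t length)
         {
         for(std::size_t i = 0; i != length; ++i)
            {
            m_a = (m_a + input[i]) % 65521;
            m_b = (m_b + m_a) % 65521;
            }
         }

      // Convenience overload, like the std::string version of update()
      void update(const std::string& input)
         {
         update(reinterpret_cast<const unsigned char*>(input.data()),
                input.size());
         }

      // Finish the computation, then reset so the object can be reused
      std::uint32_t final_value()
         {
         const std::uint32_t result = (m_b << 16) | m_a;
         m_a = 1;
         m_b = 0;
         return result;
         }

   private:
      std::uint32_t m_a, m_b;
   };
```

Calling \function{update} with ``Wiki'' and then ``pedia'' before \function{final\_value} yields the same value as a single \function{update} with ``Wikipedia'', and the second computation needs no explicit reset.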
- -The second method of using \function{final} is to call it with no arguments at all, as -shown in the second prototype. It will return the hash/MAC value in a memory -buffer, which will have size OUTPUT\_LENGTH. - -There is also a pair of functions called \function{process}. They are -essentially a combination of a single \function{update}, and \function{final}. -Both versions return the final value, rather than placing it in an array. Calling -\function{process} with a single byte value isn't available, mostly because it -would rarely be useful. - -A MAC can be viewed (in most cases) as simply a keyed hash function, so classes -that are derived from \type{MessageAuthenticationCode} have \function{update} -and \function{final} functions just like a \type{HashFunction} (and like a -\type{HashFunction}, after \function{final} is called, it can be used to make a -new MAC right away; the key is kept around). - -A MAC has the \type{SymmetricAlgorithm} interface in addition to the -\type{BufferedComputation} interface. - -\pagebreak -\section{Random Number Generators} - -The random number generators provided in Botan are meant for creating keys, -IVs, padding, nonces, and anything else that requires ``random'' data. It is -important to remember that the output of these classes will vary, even if they -are supplied with exactly the same seed (\ie, two \type{Randpool} objects with -identical initial states will not produce the same output, because the value of -high resolution timers is added to the state at various points). - -To ensure good quality output, a PRNG needs to be seeded with truly random data -(such as that produced by a hardware RNG). Typically, you will use an -\type{EntropySource} (see below). To add entropy to a PRNG, you can use -\type{void} \function{add\_entropy}(\type{const byte} \arg{data}[], -\type{u32bit} \arg{length}) or (better), use the \type{EntropySource} -interface.
- -Once a PRNG has been initialized, you can get a single byte of random data by -calling \type{byte} \function{random()}, or get a large block by calling -\type{void} \function{randomize}(\type{byte} \arg{data}[], \type{u32bit} -\arg{length}), which will put random bytes into each member of the array from -indexes 0 $\ldots$ \arg{length} -- 1. - -You can avoid all the problems inherent in seeding the PRNG by using the -globally shared PRNG, described later in this section. - -\subsection{Randpool} - -\type{Randpool} is the primary PRNG within Botan. In recent versions all uses -of it have been wrapped by an implementation of the X9.31 PRNG (see below). If -for some reason you should have cause to create a PRNG instead of using the -``global'' one owned by the library, it would be wise to wrap it in the same way on -the grounds of general caution; while \type{Randpool} is designed with known -attacks and PRNG weaknesses in mind, it is not a standard/official PRNG. The -remainder of this section is a (fairly technical, though high-level) description -of the algorithms used in this PRNG. Unless you have a specific interest in -this subject, the rest of this section might prove somewhat uninteresting. - -\type{Randpool} has an internal state called pool, which is 512 bytes -long. Entropy is mixed into, and output is extracted from, this pool. There is also a -small output buffer (called buffer), which holds the data which has already -been generated but has not yet been output. - -It is based around a MAC and a block cipher (which are currently HMAC(SHA-256) -and AES-256). Where a specific size is mentioned, it should be taken as a -multiple of the cipher's block size. For example, if a 256-bit block cipher -were used instead of AES, all the sizes internally would double. Every time -some new output is needed, we compute the MAC of a counter and a high -resolution timer.
The resulting MAC is XORed into the output buffer (wrapping -as needed), and the output buffer is then encrypted with AES, producing 16 -bytes of output. - -After 8 blocks (or 128 bytes) have been produced, we mix the pool. To do this, -we first rekey both the MAC and the cipher; the new MAC key is the MAC of the -current pool under the old MAC key, while the new cipher key is the MAC of the -current pool under the just-chosen MAC key. We then encrypt the entire pool in -CBC mode, using the current (unused) output buffer as the IV. We then generate -a new output buffer, using the mechanism described in the previous paragraph. - -To add randomness to the PRNG, we compute the MAC of the input and XOR the -output into the start of the pool. Then we remix the pool and produce a new -output buffer. The initial MAC operation should make it very hard for chosen -inputs to harm the security of \type{Randpool}, and as HMAC should be able to -hold roughly 256 bits of state, it is unlikely that we are wasting much input -entropy (or, if we are, it doesn't matter, because we have a very abundant -supply). - -\subsection{ANSI X9.31} - -\type{ANSI\_X931\_PRNG} is the standard issue X9.31 Appendix A.2.4 PRNG, though -using AES-256 instead of 3DES as the block cipher. This PRNG implementation has -been checked against official X9.31 test vectors. - -Internally, the PRNG holds a pointer to another PRNG (typically -Randpool). This internal PRNG generates the key and seed used by the -X9.31 algorithm, as well as the date/time vectors. Each time an X9.31 -PRNG object receives entropy, it simply passes it along to the PRNG it -is holding, and then pulls out some random bits to generate a new key -and seed. This PRNG considers itself seeded as soon as the internal -PRNG is seeded. - -As of version 1.4.7, the X9.31 PRNG is by default used for all random number -generation. 
- -\subsection{Entropy Sources} - -An \type{EntropySource} is an abstract representation of some method of gathering -``real'' entropy. This tends to be very system dependent. The \emph{only} way -you should use an \type{EntropySource} is to pass it to a PRNG that will -extract entropy from it -- never use the output directly for any kind of key or -nonce generation! - -\type{EntropySource} has a pair of functions for getting entropy from some -external source, called \function{fast\_poll} and \function{slow\_poll}. Each -is passed a buffer of bytes to write into; the functions then return how many bytes -of entropy were actually gathered. \type{EntropySource}s are usually used to -seed the global PRNG using the functions found in the \namespace{Global\_RNG} -namespace. - -Note for writers of \type{EntropySource}s: it isn't necessary to use any kind -of cryptographic hash on your output. The data produced by an \type{EntropySource} is -only used by an application after it has been hashed by the -\type{RandomNumberGenerator} that asked for the entropy, thus any hashing -you do will be wasteful of both CPU cycles and possibly entropy. - -\pagebreak -\section{User Interfaces} - -Botan has recently changed some infrastructure to better accommodate more -complex user interfaces, in particular ones that are based on event -loops. Primary among the reasons was the fact that when doing something like loading -a PKCS \#8 encoded private key, a passphrase might be needed, but then again it -might not (a PKCS \#8 key doesn't have to be encrypted). Asking for a -passphrase to decrypt an unencrypted key is rather pointless. Not only that, -but the way to handle the user typing the wrong passphrase was complicated, -undocumented, and inefficient. - -So now Botan has an object called \type{UI}, which provides a simple interface -for the aspects of user interaction the library has to be concerned -with.
Currently, this means getting a passphrase from the user, and that's it -(\type{UI} will probably be extended in the future to support other operations -as they are needed). The base \type{UI} class is very stupid, because the -library can't directly assume anything about the environment that it's running -under (for example, if there will be someone sitting at the terminal, if the -application is even \emph{attached} to a terminal, and so on). But since you -can subclass \type{UI} to use whatever method happens to be appropriate for -your application, this isn't a big deal. - -There is (currently) a single function that can be overridden by subclasses of -\type{UI} (the \type{std::string} arguments are actually \type{const -std::string\&}, but shown as simply \type{std::string} to keep the line from -wrapping): - -\noindent -\type{std::string} \function{get\_passphrase}(\type{std::string} \arg{what}, - \type{std::string} \arg{source}, - \type{UI\_Result\&} \arg{result}) const; - -The \arg{what} argument specifies what the passphrase is needed for (for -example, PKCS \#8 key loading passes \arg{what} as ``PKCS \#8 private -key''). This lets you provide the user with some indication of \emph{why} your -application is asking for a passphrase; feel free to pass the string through -\function{gettext(3)} or moral equivalent for i18n purposes. Similarly, -\arg{source} specifies where the data in question came from, if available (for -example, a file name). If the source is not available for whatever reason, then -\arg{source} will be an empty string; be sure to account for this possibility -when writing a \type{UI} subclass. - -The function returns the passphrase as the return value, and a status code in -\arg{result} (either \type{OK} or \type{CANCEL\_ACTION}). 
If -\type{CANCEL\_ACTION} is returned in \arg{result}, then the return value will -be ignored, and the caller will take whatever action is necessary (typically, -throwing an exception stating that the passphrase couldn't be determined). In -the specific case of PKCS \#8 key decryption, a \type{Decoding\_Error} -exception will be thrown; your UI should assume this can happen, and provide -appropriate error handling (such as putting up a dialog box informing the user -of the situation, and canceling the operation in progress). - -There is an example \type{UI} that uses GTK+ available on the web site. The -\type{GTK\_UI} code is cleanly separated from the rest of the example, so if -you happen to be using GTK+, you can copy (and/or adapt) that code for your -application. If you write a \type{UI} object for another windowing system -(Win32, Qt, wxWidgets, FOX, etc), and would like to make it available to users -in general (ideally under a permissive license such as public domain or -MIT/BSD), feel free to send in a copy. - -\pagebreak -\section{Botan's Modules} - -Botan comes with a variety of modules that can be compiled into the system. -These will not be available on all installations of the library, but you can -check for their availability based on whether or not certain macros are -defined. - -\subsection{Pipe I/O for Unix File Descriptors} - -This is a fairly minor feature, but it comes in handy sometimes. In all -installations of the library, Botan's \type{Pipe} object overloads the -\keyword{<<} and \keyword{>>} operators for C++ iostream objects, which is -usually more than sufficient for doing I/O. - -However, there are cases where the iostream hierarchy does not map well to -local 'file types', so there is also the ability to do I/O directly with Unix -file descriptors. 
This is most useful when you want to read from or write to -something like a TCP or Unix-domain socket, or a pipe, since for simple file -access it's usually easier to just use C++'s file streams. - -If \macro{BOTAN\_EXT\_PIPE\_UNIXFD\_IO} is defined, then you can use the -overloaded I/O operators with Unix file descriptors. For an example of this, -check out the \filename{hash\_fd} example, included in the Botan distribution. - -\subsection{Entropy Sources} - -All of these are used by the \function{Global\_RNG::seed} function if they are -available. Since this function is called by the \type{LibraryInitializer} class -when it is created, it is fairly rare that you will need to deal with any of -these classes directly. Even in the case of a long-running server that needs to -renew its entropy pool, it is easier to simply call -\function{Global\_RNG::seed} (see the section entitled ``The Global PRNG'' for -more details). - -\noindent -\type{EGD\_EntropySource}: Query an EGD socket. If the macro -\macro{BOTAN\_EXT\_ENTROPY\_SRC\_EGD} is defined, it can be found in -\filename{es\_egd.h}. The constructor takes a \type{std::vector<std::string>} -that specifies the paths to search for an EGD socket. - -\noindent -\type{Unix\_EntropySource}: This entropy source executes programs common on -Unix systems (such as \filename{uptime}, \filename{vmstat}, and \filename{df}) -and adds their output to a buffer. It's quite slow due to process overhead, and (roughly) -1 bit of real entropy is in each byte that is output. It is declared in -\filename{es\_unix.h}, if \macro{BOTAN\_EXT\_ENTROPY\_SRC\_UNIX} is -defined. If you don't have \filename{/dev/urandom} \emph{or} EGD, this is -probably the thing to use. For a long-running process on Unix, keep an object -of this type around and run fast polls every few minutes. - -\noindent -\type{FTW\_EntropySource}: Walk through a filesystem (the root to start -searching is passed as a string to the constructor), reading files.
This tends -to only be useful on things like \filename{/proc} that have a great deal of -variability over time, and even then there is only a small amount of entropy -gathered: about 1 bit of entropy for every 16 bits of output (and many hundreds -of bits are read in order to get those 16 bits). It is declared in -\filename{es\_ftw.h}, if \macro{BOTAN\_EXT\_ENTROPY\_SRC\_FTW} is defined. Only -use this as a last resort. I don't really trust it, and neither should you. - -\noindent -\type{Win32\_CAPI\_EntropySource}: This routine gathers entropy from a Win32 -CAPI module. It takes an optional \type{std::string} that will specify what -type of CAPI provider to use. Generally the CAPI RNG is the same -software-based PRNG, but there are a few providers that may use a hardware RNG. By -default it will use the first provider listed in the option -``rng/ms\_capi\_prov\_type'' that is available on the machine (currently the -providers ``RSA\_FULL'', ``INTEL\_SEC'', ``FORTEZZA'', and ``RNG'' are -recognized). - -\noindent -\type{BeOS\_EntropySource}: Query system statistics using various BeOS-specific -APIs. - -\noindent -\type{Pthread\_EntropySource}: Attempt to gather entropy based on jitter -between a number of threads competing for a single mutex. This entropy source -is \emph{very} slow, and highly questionable in terms of security. However, it -provides a worst-case fallback on systems that don't have Unix-like features, -but do support POSIX threads. This module is currently unavailable due to -problems on some systems. - -\subsection{Compressors} - -There are two compression algorithms supported by Botan, Zlib and Bzip2 (Gzip -and Zip encoding will be supported in future releases). Only lossless -compression algorithms are currently supported by Botan, because they tend to -be the most useful for cryptography. However, it is very reasonable to consider -supporting something like GSM speech encoding (which is lossy), for use in -encrypted voice applications.
- -You should always compress \emph{before} you encrypt, because encryption seeks -to hide the redundancy that compression is supposed to try to find and remove. - -\subsubsection{Bzip2} - -To test for Bzip2, check to see if \macro{BOTAN\_EXT\_COMPRESSOR\_BZIP2} is -defined. If so, you can include \filename{bzip2.h}, which will declare a pair -of \type{Filter} objects: \type{Bzip2\_Compression} and -\type{Bzip2\_Decompression}. - -You should be prepared to take an exception when using the decompressing -filter, for if the input is not valid Bzip2 data, that is what you will -receive. You can specify the desired level of compression to -\type{Bzip2\_Compression}'s constructor as an integer between 1 and 9, 1 -meaning worst compression, and 9 meaning the best. The default is to use 9, -since small values take the same amount of time, just use a little less memory. - -The Bzip2 module was contributed by Peter J. Jones. - -\subsubsection{Zlib} - -Zlib compression works pretty much like Bzip2 compression. The only differences -in this case are that the macro is \macro{BOTAN\_EXT\_COMPRESSOR\_ZLIB}, the -header you need to include is called \filename{botan/zlib.h} (remember that you -shouldn't just \verb|#include <zlib.h>|, or you'll get the regular zlib API, -which is not what you want). The Botan classes for Zlib -compression/decompression are called \type{Zlib\_Compression} and -\type{Zlib\_Decompression}. - -Like Bzip2, a \type{Zlib\_Decompression} object will throw an exception if -invalid (in the sense of not being in the Zlib format) data is passed into it. - -In the case of zlib's algorithm, a worse compression level will be faster than -a very high compression ratio. For this reason, the Zlib compressor will -default to using a compression level of 6. This tends to give a good trade off -in terms of time spent to compression achieved. 
There are several factors you -need to consider in order to decide if you should use a higher compression -level: - -\begin{list}{$\cdot$}{} - \item Better security: the less redundancy in the source text, the harder it - is to attack your ciphertext. This is not too much of a concern, - because with decent algorithms using sufficiently long keys, it doesn't - really matter \emph{that} much (but it certainly can't hurt). - - \item Decreasing returns. Some simple experiments by the author showed - minimal decreases in the size between level 6 and level 9 compression - with large (1 to 3 megabyte) files. There was some difference, but it - wasn't that much. - - \item CPU time. Level 9 zlib compression is often two to four times as slow - as level 6 compression. This can make a substantial difference in the - overall runtime of a program. -\end{list} - -While the zlib compression library uses the same compression algorithm as the -gzip and zip programs, the format is different. The zlib format is defined in -RFC 1950. - -\subsubsection{Data Sources} - -A \type{DataSource} is a simple abstraction for a thing that stores bytes. This -type is used fairly heavily in the areas of the API related to ASN.1 -encoding/decoding. The following types are \type{DataSource}s: \type{Pipe}, -\type{SecureQueue}, and a couple of special purpose ones: -\type{DataSource\_Memory} and \type{DataSource\_Stream}. - -You can create a \type{DataSource\_Memory} with an array of bytes and a length -field. The object will make a copy of the data, so you don't have to worry -about keeping that memory allocated. This is mostly for internal use, but if it -comes in handy, feel free to use it. - -A \type{DataSource\_Stream} is probably more useful than the memory based -one. Its constructors take either a \type{std::istream} or a -\type{std::string}.
If it's a stream, the data source will use the -\type{istream} to satisfy read requests (this is particularly useful -with \type{std::cin}). If the string version is used, it will attempt to open -up a file with that name and read from it. - -\subsubsection{Data Sinks} - -A \type{DataSink} (in \filename{data\_snk.h}) is a \type{Filter} that takes -arbitrary amounts of input, and produces no output. Generally, this means it's -doing something with the data outside the realm of what -\type{Filter}/\type{Pipe} can handle, for example, writing it to a file (which -is what the \type{DataSink\_Stream} does). There is no need for -\type{DataSink}s that write to a \type{std::string} or memory buffer, because -\type{Pipe} can handle that by itself. - -Here's a quick example of using a \type{DataSink}, which encrypts -\filename{in.txt} and sends the output to \filename{out.txt}. There is -no explicit output operation; the writing of \filename{out.txt} is -implicit. - -\begin{verbatim} - DataSource_Stream in("in.txt"); - Pipe pipe(new CBC_Encryption("Blowfish", "PKCS7", key, iv), - new DataSink_Stream("out.txt")); - pipe.process_msg(in); -\end{verbatim} - -A real advantage of this is that even if \filename{in.txt} is large, only as -much memory as is needed for internal I/O buffers will actually be used. - -\subsection{Writing Modules} - -It's a lot simpler to write modules for Botan than it is to write code -in the core library, for several reasons. First, a module can rely on -external libraries and services beyond the base ISO C++ libraries, and -also machine dependent features. Also, the code can be added at -configuration time on the user's end with very little effort (\ie the -code can be distributed separately, and included by the user without -needing to patch any existing source files). - -Each module lives in a subdirectory of the \filename{modules} -directory, which exists at the top-level of the Botan source tree.
The -``short name'' of the module is the same as the name of this -directory. The only required file in this directory is -\filename{info.txt}, which contains directives that specify what a -particular module does, what systems it runs on, and so on. Comments -in \filename{info.txt} start with a \verb|#| character and continue -to end of line. - -Recognized directives include: - -\newcommand{\directive}[2]{ - \vskip 4pt - \noindent - \texttt{#1}: #2 -} - -\directive{realname <name>}{Specify that the 'real world' name of this module - is \texttt{<name>}.} - -\directive{note <note>}{Add a note that will be seen by the end-user at -configure time if the module is included into the library.} - -\directive{require\_version <version>}{Require at configure time that -the version of Botan in use be at least \texttt{<version>}.} - -\directive{define <macro>[,<macro>[,...]]}{Cause the macro - \macro{BOTAN\_EXT\_<macro>} (for each instance of \macro{<macro>} - in the directive) to be defined in \filename{build.h}. This should - only be used if the module creates user-visible changes. There is a - set of conventions that should be followed in deciding what to call - this macro (where xxx denotes some descriptive and distinguishing - characteristic of the thing implemented, such as - \macro{ALLOC\_MLOCK} or \macro{MUTEX\_PTHREAD}): - -\begin{itemize} -\item Allocator: \macro{ALLOC\_xxx} -\item Compressors: \macro{COMPRESSOR\_xxx} -\item EntropySource: \macro{ENTROPY\_SRC\_xxx} -\item Engines: \macro{ENGINE\_xxx} -\item Mutex: \macro{MUTEX\_xxx} -\item Timer: \macro{TIMER\_xxx} -\end{itemize} -} - -\directive{<libs> / </libs>}{This specifies any extra libraries to be -linked in. It is a mapping from OS to library name, for example -\texttt{linux -> rt}, which means that on Linux librt should be linked -in. 
You can also use ``all'' to force the library to be linked in on -all systems.} - -\directive{<add> / </add>}{Tell the configuration script to add the - files named between these two tags into the source tree. All these - files must exist in the current module directory.} - -\directive{<ignore> / </ignore>}{Tell the configuration script to - ignore the files named in the main source tree. This is useful, for - example, when replacing a C++ implementation with a pure assembly - version.} - -\directive{<replace> / </replace>}{Tell the configuration script to - ignore the file given in the main source tree, and instead use the - one in the module's directory.} - -Additionally, the module file can contain blocks, delimited by the -following pairs: - -\texttt{<os> / </os>}, \texttt{<arch> / </arch>}, \texttt{<cc> / </cc>} - -\noindent -For example, putting ``alpha'' and ``ia64'' in a \texttt{<arch>} block will -make the configuration script only allow the module to be compiled on those -architectures. Not having a block means any value is acceptable. - -\pagebreak -\section{Miscellaneous} - -This section has documentation for anything that just didn't fit into any of -the major categories. Many of them (Timers, Allocators) will rarely be used in -actual application code, but others, like the S2K algorithms, have a wide -degree of applicability. - -\subsection{S2K Algorithms} - -There are various procedures (usually fairly ad-hoc) for turning a passphrase -into a (mostly) arbitrary length key for a symmetric cipher. A general -interface for such algorithms is presented in \filename{s2k.h}. The main -function is \function{derive\_key}, which takes a passphrase, and the desired -length of the output key, and returns a key of that length, deterministically -produced from the passphrase. 
If an algorithm can't produce a key of that size, -it will throw an exception (most notably, PKCS \#5's PBKDF1 can only produce -strings between 1 and $n$ bytes, where $n$ is the output size of the underlying -hash function). - -Most such algorithms allow the use of a ``salt'', which provides some extra -randomness and helps against dictionary attacks on the passphrase. Simply call -\function{change\_salt} (there are variations of it for most of the ways you -might wish to specify a salt; check the header for details) with a block of -random data. You can also have the class generate a new salt for you with -\function{new\_random\_salt}; the salt that was generated can be retrieved with -\function{current\_salt}. - -Additionally some algorithms allow you to set some sort of iteration -count, which will make the algorithm take longer to compute the final -key (reducing the speed of brute-force attacks of various kinds). This -can be changed with the \function{set\_iterations} function. Most -standards recommend an iteration count of at least 1000. Currently -defined S2K algorithms are ``PBKDF1(digest)'', ``PBKDF2(digest)'', and -``OpenPGP-S2K(digest)''; you can retrieve any of these using the -\function{get\_s2k} function, found in \filename{lookup.h}. As of this writing, -``PBKDF2(SHA-256)'' with 10000 iterations and an 8 byte salt is -recommended for new applications. - -\subsubsection{OpenPGP S2K} - -There are some oddities about OpenPGP's S2K algorithms that are documented -here. For one thing, OpenPGP uses the iteration count in a strange manner; instead -of specifying how many times to iterate the hash, it tells how many -\emph{bytes} should be hashed in total (including the salt). So the exact -iteration count will depend on the size of the salt (which is fixed at 8 bytes -by the OpenPGP standard, though the implementation will allow any salt size) -and the size of the passphrase.
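To make the byte-count semantics concrete, the sketch below works out how many times the salt and passphrase are hashed for a given count. It is only an arithmetic illustration, not Botan's implementation: \type{S2K\_Passes} and \function{openpgp\_s2k\_passes} are hypothetical names, and the rule that at least one full copy of salt plus passphrase is always hashed is taken from the OpenPGP specification (RFC 4880) rather than from this manual.

```cpp
#include <cstddef>

// Hypothetical helper type: how the OpenPGP byte count splits into
// whole passes over (salt || passphrase) plus a trailing partial pass.
struct S2K_Passes
   {
   std::size_t full_passes;    // complete copies of salt || passphrase hashed
   std::size_t partial_bytes;  // leading bytes of one more copy, if any
   };

// byte_count is the OpenPGP "iteration count": total bytes to hash,
// not the number of hash iterations.
S2K_Passes openpgp_s2k_passes(std::size_t byte_count,
                              std::size_t salt_len,
                              std::size_t passphrase_len)
   {
   const std::size_t unit = salt_len + passphrase_len;
   S2K_Passes result = { 0, 0 };

   if(unit == 0)
      return result; // nothing to hash

   // Assumption from RFC 4880: a count smaller than one copy still
   // hashes the whole salt and passphrase once.
   if(byte_count < unit)
      byte_count = unit;

   result.full_passes = byte_count / unit;
   result.partial_bytes = byte_count % unit;
   return result;
   }
```

With a count of 65536 and an 8-byte salt, an 8-byte passphrase is hashed 4096 times, while a 12-byte passphrase is hashed 3276 times plus 16 bytes of one further copy, so the same count gives different effective iteration counts.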
- -To get what OpenPGP calls ``Simple S2K'', set iterations to 0 (the default for -OpenPGP S2K), and do not specify a salt. To get ``Salted S2K'', again leave the -iteration count at 0, but give an 8-byte salt. ``Salted and Iterated S2K'' -requires an 8-byte salt and some iteration count (this should be significantly -larger than the size of the longest passphrase that might reasonably be used; -somewhere from 1024 to 65536 would probably be about right). Using both a -reasonably sized salt and a large iteration count is highly recommended to -prevent password guessing attempts. - -\subsection{Checksums} - -Checksums are very similar to hash functions, and in fact share the same -interface. But there are some significant differences, the major ones being -that the output size is very small (usually in the range of 2 to 4 bytes), and -is not cryptographically secure. But for their intended purpose (error -checking), they perform very well. Some examples of checksums included in Botan -are the Adler32 and CRC32 checksums. - -\subsection{Exceptions} - -Sooner or later, something is going to go wrong. Botan's behavior when -something unusual occurs, like most C++ software, is to throw an exception. -Exceptions in Botan are derived from the \type{Exception} class. You can see -most of the major varieties of exceptions used in Botan by looking at -\filename{exceptn.h}. The only function you really need to concern yourself -with is \type{const char*} \function{what()}. This will return an error message -relevant to the error that occurred. For example: - -\begin{verbatim} -try { - // various Botan operations - } -catch(Botan::Exception& e) - { - cout << "Botan exception caught: " << e.what() << endl; - // error handling, or just abort - } -\end{verbatim} - -Botan's exceptions are derived from \type{std::exception}, so you don't need -to explicitly check for Botan exceptions if you're already catching the ISO -standard ones. 
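The point about \type{std::exception} can be shown without the library. In this minimal sketch, \type{Demo\_Exception} is a hypothetical stand-in for a library exception class (it is not Botan's \type{Exception}), and \function{risky\_operation} and \function{run\_and\_report} are illustrative names. The handler names only the standard base class, yet it still receives the derived class's message through \function{what()}.

```cpp
#include <exception>
#include <string>

// Hypothetical stand-in for a library exception type; the only
// property that matters here is that it derives from std::exception.
class Demo_Exception : public std::exception
   {
   public:
      explicit Demo_Exception(const std::string& msg) : m_msg("Demo: " + msg) {}

      const char* what() const noexcept override { return m_msg.c_str(); }

   private:
      std::string m_msg;
   };

// Illustrative operation that always fails
void risky_operation()
   {
   throw Demo_Exception("invalid key length");
   }

// Catches the standard base class only and returns the error message
std::string run_and_report()
   {
   try
      {
      risky_operation();
      }
   catch(const std::exception& e)
      {
      return e.what();
      }
   return "no error";
   }
```

Catching by reference matters here: catching \type{std::exception} by value would slice off the derived part and lose the message.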
- -\subsection{Threads and Mutexes} - -Botan includes a mutex system, which is used internally to lock some shared -data structures that must be kept shared for efficiency reasons (mostly, these -are in the allocation systems~--~handing out 1000 separate allocators hurts -performance and makes caching memory blocks useless). This system is supported -by the \texttt{mux\_pthr} module, implementing the \type{Mutex} interface for -systems that have POSIX threads. - -If your application is using threads, you \emph{must} add the option -``thread\_safe'' to the options string when you create the -\type{LibraryInitializer} object. If you specify this option and no mutex type -is available, an exception is thrown, since otherwise you would probably be -facing a nasty crash. - -\subsection{Secure Memory} - -A major concern with mixing modern multiuser OSes and cryptographic -code is that at any time the process's memory (including secret keys) could be -swapped to disk, where it can later be read by an attacker. Botan -stores almost everything (and especially anything sensitive) in memory -buffers that a) clear out their contents when their destructors are -called, and b) have easy plugins for various memory locking functions, -such as the \function{mlock}(2) call on many Unix systems. - -Two of the allocation methods used (``malloc'' and ``mmap'') don't -require any extra privileges on Unix, but locking memory does. At -startup, each allocator type will attempt to allocate a few blocks -(typically totaling 128k), so if you want, you can run your -application \texttt{setuid} \texttt{root}, and then drop privileges -immediately after creating your \type{LibraryInitializer}. If you end -up using more than what's been allocated, some of your sensitive data -might end up being swappable, but that beats running as \texttt{root} -all the time.
BTW, I would note that, at least on Linux, you can use a -kernel module to give your process extra privileges (such as the -ability to call \function{mlock}) without being root. For example, -check out my Capability Override LSM -(\url{http://www.randombit.net/projects/cap\_over/}), which makes this -pretty easy to do. - -These classes should also be used within your own code for storing sensitive -data. They are only meant for primitive data types (int, long, etc.): if you -want a container of higher level Botan objects, you can just use a -\verb|std::vector|, since these objects know how to clear themselves when they -are destroyed. You cannot, however, have a \verb|std::vector| (or any other -container) of \type{Pipe}s or \type{Filter}s, because these types have pointers -to other \type{Filter}s, and implementing copy constructors for these types -would be both hard and quite expensive (vectors of pointers to such objects are -fine, though). - -These types are not described in any great detail: for more information, -consult the definitive sources~--~the header files \filename{secmem.h} and -\filename{allocate.h}. - -\type{SecureBuffer} is a simple array type, whose size is specified at compile -time. It will automatically convert to a pointer of the appropriate type, and -has a number of useful functions, including \function{clear()}, and -\type{u32bit} \function{size()}, which returns the length of the array. It is a -template that takes as parameters a type, and a constant integer which is how -long the array is (for example: \verb|SecureBuffer<byte, 8> key;|). - -\type{SecureVector} is a variable length array. Its size can be increased or -decreased as need be, and it has a wide variety of functions useful for copying -data into its buffer. Like \type{SecureBuffer}, it implements \function{clear} -and \function{size}. - -\subsection{Allocators} - -The containers described above get their memory from allocators. 
As a user of -the library, you can add new allocator methods at run time for use by -containers, including the ones used internally by the library. The interface to -this is in \filename{allocate.h}. Code needing -an allocator calls \function{get\_allocator}, which returns a pointer to an -allocator. This pointer should not be freed: the caller does not own the -allocator (it is shared among multiple users, and locks itself as needed). It -is possible to call \function{get\_allocator} with a specific name to request a -particular type of allocator; otherwise, a default allocator type is returned. - -At start time, the only allocator known is a \type{Default\_Allocator}, which -just allocates memory using \function{malloc}, and \function{memset}s it to 0 -when the memory is released. It is known by the name ``malloc''. If you ask for -another type of allocator (``locking'' and ``mmap'' are currently used), and it -is not available, some other allocator will be returned. - -You can add in a new allocator type using \function{add\_allocator\_type}. This -function takes a string and a pointer to an allocator. The string gives this -allocator type a name by which it can be requested -with \function{get\_allocator}. If an error occurs (such as the name being -already registered), this function returns false. It will return true if the -allocator was successfully registered. If you ask it to, -\type{LibraryInitializer} will do this for you. - -Finally, you can set the default allocator type that will be returned by setting -the policy setting ``default\_alloc'' to the name of any previously registered -allocator. - -\subsection{BigInt} - -\type{BigInt} is Botan's implementation of a multiple-precision -integer. Thanks to C++'s operator overloading features, using \type{BigInt} is -often quite similar to using a native integer type. The number of functions -related to \type{BigInt} is quite large. 
You can find most of them in -\filename{bigint.h} and \filename{numthry.h}. - -Due to the sheer number of functions involved, only a few, which a regular user -of the library might have to deal with, are mentioned here. Fully documenting -the MPI library would take a significant while, so if you need to use it now, -the best way to learn is to look at the headers. - -Probably the most important are the encoding/decoding functions, which -transform the normal representation of a \type{BigInt} into some other form, -such as a decimal string. The most useful of these functions are - -\type{SecureVector<byte>} \function{BigInt::encode}(\type{BigInt}, -\type{Encoding}) - -\noindent -and - -\type{BigInt} \function{BigInt::decode}(\type{SecureVector<byte>}, -\type{Encoding}) - -\type{Encoding} is an enum that has values \type{Binary}, \type{Octal}, -\type{Decimal}, and \type{Hexadecimal}. The parameter will default to -\type{Binary}. These functions are static member functions, so they would be -called like this: - -\begin{verbatim} - BigInt n1; // some number - SecureVector<byte> n1_encoded = BigInt::encode(n1); - BigInt n2 = BigInt::decode(n1_encoded); - // now n1 == n2 -\end{verbatim} - -There are also C++-style I/O operators defined for use with \type{BigInt}. The -input operator understands negative numbers, hexadecimal numbers (marked with a -leading ``0x''), and octal numbers (marked with a leading '0'). The '-' must -come before the ``0x'' or '0' marker. The output operator will never adorn the -output; for example, when printing a hexadecimal number, there will not be a -leading ``0x'' (though a leading '-' will be printed if the number is -negative). If you want such things, you'll have to do them yourself. - -\type{BigInt} has constructors that can create a \type{BigInt} from an unsigned -integer or a string. You can also decode a \type{byte}[] / length pair into a -BigInt. 
There are several other \type{BigInt} constructors, which I would -seriously recommend you avoid, as they are only intended for use internally by -the library, and may arbitrarily change, or be removed, in a future release. - -An essentially random sampling of \type{BigInt} related functions: - -\type{u32bit} \function{BigInt::bytes}(): Returns the size of this \type{BigInt} -in bytes. - -\type{BigInt} \function{random\_prime}(\type{u32bit} \arg{b}): Returns a prime -number \arg{b} bits long. - -\type{BigInt} \function{gcd}(\type{BigInt} \arg{x}, \type{BigInt} \arg{y}): -Returns the greatest common divisor of \arg{x} and \arg{y}. Uses the binary -GCD algorithm. - -\type{bool} \function{is\_prime}(\type{BigInt} \arg{x}): Returns true if -\arg{x} is a (possible) prime number. Uses the Miller-Rabin probabilistic -primality test with fixed bases. For higher assurance, use -\function{verify\_prime}, which uses more rounds and randomized 48-bit bases. - -\subsubsection{Efficiency Hints} - -If you can, always use expressions of the form \verb|a += b| over -\verb|a = a + b|. The difference can be \emph{very} substantial, because the -first form prevents at least one needless memory allocation, and possibly as -many as three. - -If you're doing repeated modular exponentiations with the same modulus, create -a \type{BarrettReducer} ahead of time. If the exponent or base is a constant, -use the classes in \filename{mod\_exp.h}. This stuff is all handled for you by -the normal high-level interfaces, of course. - -Never use the low-level MPI functions (those that begin with -\texttt{bigint\_}). These are completely internal to the library, and -may make arbitrarily strange and undocumented assumptions about their -inputs without checking that they actually hold, because only the -library itself is expected to call them, and the -library knows what the assumptions are. The interfaces for these -functions can change completely without notice. 
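The first efficiency hint above (prefer \verb|a += b| over \verb|a = a + b|) can be illustrated with a toy type. The sketch below does not use \type{BigInt}: \type{Toy\_Int} and the two counting functions are hypothetical names invented here, and the type only counts how often a fresh digit buffer is constructed. It shows the general reason the compound form is cheaper, not Botan's actual allocation behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy number type (NOT Botan's BigInt). It only counts how
// many objects with a fresh digit buffer are constructed.
struct Toy_Int
   {
   static int allocations;
   std::vector<unsigned> digits;

   explicit Toy_Int(std::size_t n) : digits(n, 1) { ++allocations; }
   Toy_Int(const Toy_Int& other) : digits(other.digits) { ++allocations; }

   // Assignment copies into the existing object; it is not counted.
   Toy_Int& operator=(const Toy_Int& other)
      {
      digits = other.digits;
      return *this;
      }

   // In-place addition: works on the existing buffer, no new object.
   Toy_Int& operator+=(const Toy_Int& other)
      {
      for(std::size_t i = 0; i != digits.size() && i != other.digits.size(); ++i)
         digits[i] += other.digits[i];
      return *this;
      }
   };

int Toy_Int::allocations = 0;

// Out-of-place addition: must build a temporary to hold the result.
Toy_Int operator+(const Toy_Int& x, const Toy_Int& y)
   {
   Toy_Int result(x);   // counted: a new buffer for the temporary
   result += y;
   return result;
   }

// Number of counted constructions caused by "a += b".
int allocations_for_compound()
   {
   Toy_Int a(4), b(4);
   Toy_Int::allocations = 0;
   a += b;
   return Toy_Int::allocations;
   }

// Number of counted constructions caused by "a = a + b".
int allocations_for_plain()
   {
   Toy_Int a(4), b(4);
   Toy_Int::allocations = 0;
   a = a + b;
   return Toy_Int::allocations;
   }
```

With this toy type, the compound form constructs no new objects, while the plain form constructs at least one temporary (the exact number depends on whether the compiler elides the copy of the return value).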
- -\pagebreak -\section{Algorithms} - -\subsection{Recommended Algorithms} - -This section is by no means the last word on selecting which algorithms to use. -However, Botan includes a sometimes bewildering array of possible algorithms, -and unless you're familiar with the latest developments in the field, it can be -hard to know what is secure and what is not. The following attributes of the -algorithms were evaluated when making this list: security, standardization, -patent status, support by other implementations, and efficiency (in roughly -that order). - -It is intended as a set of simple guidelines for developers, and nothing more. -It's entirely possible that there are algorithms in Botan that will turn out to -be more secure than the ones listed, but the algorithms listed here are -(currently) thought to be safe. - -\begin{list}{$\cdot$} - \item Block ciphers: AES or Serpent in CBC or CTR mode - - \item Hash functions: SHA-256, SHA-512 - - \item MACs: HMAC with any recommended hash function - - \item Public Key Encryption: RSA with ``EME1(SHA-256)'' - - \item Public Key Signatures: RSA with EMSA4 and any recommended hash, or DSA - with ``EMSA1(SHA-256)'' - - \item Key Agreement: Diffie-Hellman, with ``KDF2(SHA-256)'' -\end{list} - -\subsection{Compliance with Standards} - -Botan is/should be at least roughly compatible with many cryptographic -standards, including the following: - -\newcommand{\standard}[2]{ - \vskip 4pt - * #1: \textbf{#2} -} - -\standard{RSA}{PKCS \#1 v2.1, ANSI X9.31} - -\standard{DSA}{ANSI X9.30, FIPS 186-2} - -\standard{Diffie-Hellman}{ANSI X9.42, PKCS \#3} - -\standard{Certificates}{ITU X.509, RFC 3280/3281 (PKIX), PKCS \#9 v2.0, -PKCS \#10} - -\standard{Private Key Formats}{PKCS \#5 v2.0, PKCS \#8} - -\standard{DES/DES-EDE}{FIPS 46-3, ANSI X3.92, ANSI X3.106} - -\standard{SHA-1}{FIPS 180-2} - -\standard{HMAC}{ANSI X9.71, FIPS 198} - -\standard{ANSI X9.19 MAC}{ANSI X9.9, ANSI X9.19} - -\vskip 8pt -\noindent -There is also support for 
the very general standards of \textbf{IEEE 1363-2000} -and \textbf{1363a}. Most of their contents are included in the standards -mentioned above, in various forms (usually with extra restrictions that 1363 -does not impose). - -\subsection{Algorithms Listing} - -Botan includes a very sizable number of cryptographic algorithms. In -nearly all cases, you never need to know the header file or type name -to use them. However, you do need to know what string (or strings) are -used to identify that algorithm. Generally, these names conform to -those set out by SCAN (Standard Cryptographic Algorithm Naming), which -is a document that specifies how strings are mapped onto algorithm -objects, which is useful for a wide variety of crypto APIs (SCAN is -oriented towards Java, but Botan and several other non-Java libraries -also make at least some use of it). For full details, read the SCAN -document, which can be found at -\url{http://www.users.zetnet.co.uk/hopwood/crypto/scan/}. - -Many of these algorithms can take options (such as the number of -rounds in a block cipher, the output size of a hash function, -etc). These are shown in the following list; all of them default to -reasonable values (unless otherwise marked). There are -algorithm-specific limits on most of them. When you see something like -``HASH'' or ``BLOCK'', that means you should insert the name of some -algorithm of that type. There are no defaults for those options. - -A few very obscure algorithms are skipped; if you need one of them, -you'll know it, and you can look in the appropriate header to see what -that class's \function{name} function returns (the names tend to -match those in SCAN, if they are defined there). - -\begin{list}{$\cdot$} - \item ROUNDS: The number of rounds in a block cipher. - \item OUTSZ: The output size of a hash function or MAC. - \item PASS: The number of passes in a hash function (more passes generally - mean more security). 
-\end{list} - -\vskip .05in -\noindent -\textbf{Block Ciphers:} ``AES'', ``Blowfish'', ``CAST-128'', -``CAST-256'', ``DES'', ``DESX'', ``TripleDES'', ``GOST'', ``IDEA'', -``MARS'', ``MISTY1(ROUNDS)'', ``RC2'', ``RC5(ROUNDS)'', ``RC6'', -``SAFER-SK(ROUNDS)'', ``SEED'', ``Serpent'', ``Skipjack'', ``Square'', -``TEA'', ``Twofish'', ``XTEA'' - -\noindent -\textbf{Stream Ciphers:} ``ARC4'', ``MARK4'', ``Turing'', ``WiderWake4+1-BE'' - -\noindent -\textbf{Hash Functions:} ``FORK-256'', ``HAS-160'', ``GOST-34.11'', -``MD2'', ``MD4'', ``MD5'', ``RIPEMD-128'', ``RIPEMD-160'', -``SHA-160'', ``SHA-256'', ``SHA-384'', ``SHA-512'', ``Skein-512'', -``Tiger(OUTSZ,PASS)'', ``Whirlpool'' - -\noindent -\textbf{MACs:} ``HMAC(HASH)'', ``CMAC(BLOCK)'', ``X9.19-MAC'' - -\subsection{Compatibility} - -Generally, cryptographic algorithms are well standardized, thus -compatibility between implementations is relatively simple (of course, not all -algorithms are supported by all implementations). But there are a few -algorithms that are poorly specified, and these should be avoided if you wish -your data to be processed in the same way by another implementation (including -future versions of Botan). - -The block cipher GOST has a particularly poor specification: there are no -standard Sboxes, and the specification does not give test vectors even for -sample boxes, which leads to issues of endian conventions, etc. - -If you wish maximum portability between different implementations of an -algorithm, it's best to stick to strongly defined and well standardized -algorithms, TripleDES, AES, HMAC, and SHA-256 all being good examples. - -\pagebreak -\section{Support and Further Information} - -\subsection{Patents} - -Some of the algorithms implemented by Botan may be covered by patents in some -locations. Algorithms known to have patent claims on them in the United States -and that are not available in a license-free/royalty-free manner include: -IDEA, MISTY1, RC5, RC6, and Nyberg-Rueppel. 
- -You must not assume that, just because an algorithm is not listed here, it is -not encumbered by patents. If you have any concerns about the patent status of -any algorithm you are considering using in an application, please discuss it -with your attorney. - -\subsection{Recommended Reading} - -It's a very good idea to have some knowledge of cryptography before -trying to use this library. You really should read one or more of -these books before seriously using the library (note that the Handbook -of Applied Cryptography is available for free online): - -\setlength{\parskip}{5pt} - -\noindent -\textit{Handbook of Applied Cryptography}, Alfred J. Menezes, -Paul C. Van Oorschot, and Scott A. Vanstone; CRC Press - -\noindent -\textit{Security Engineering -- A Guide to Building Dependable Distributed -Systems}, Ross Anderson; Wiley - -\noindent -\textit{Cryptography: Theory and Practice}, Douglas R. Stinson; CRC Press - -\noindent -\textit{Applied Cryptography, 2nd Ed.}, Bruce Schneier; Wiley - -\noindent -Once you've got the basics down, these are good things to at least take a look -at: IEEE 1363 and 1363a, SCAN, NESSIE, PKCS \#1 v2.1, the security related FIPS -documents, and the CFRG RFCs. - -\subsection{Support} - -Questions or problems you have with Botan can be directed to the -development mailing list. Joining this list is highly recommended if -you're going to be using Botan, since advance notice of upcoming -changes is often sent there. ``Philosophical'' bug reports, announcements of -programs using Botan, and basically anything else having to do with -Botan are also welcome. - -The lists can be found at -\url{http://lists.randombit.net/mailman/listinfo/}. - -\subsection{Contact Information} - -A PGP key with a fingerprint of -\verb|621D AF64 11E1 851C 4CF9 A2E1 6211 EBF1 EFBA DFBC| is used to sign all -Botan releases. 
This key can be found in the file \filename{doc/pgpkeys.asc}; -PGP keys for the developers are also stored there. - -\vskip 5pt \noindent -Web Site: \url{http://botan.randombit.net} - -\subsection{License} - -Copyright \copyright 2000-2008, Jack Lloyd - -Licensed under the same terms as the Botan source - -\end{document}
