stlencoders  1.1.3
stlencoders - Generic Base 16/32/64 Encoding for C++

Introduction

stlencoders is a C++ implementation of the Base16, Base32 and Base64 encoding schemes as defined in RFC 4648.

The stlencoders library provides several codec classes, which are class templates designed to support the transformation of arbitrary octet sequences into sequences of characters of any character type (encoding), and vice versa (decoding), for a particular encoding scheme. Codec classes are parameterized by the character type and a character encoding traits class, which defines the encoding alphabet, i.e. the mapping of integral values to individual characters of the character type.

Character Encoding Traits

Classes representing character encoding traits define an encoding alphabet for a particular encoding scheme and a given character type. These classes define the properties of the character type used in encoding and decoding algorithms, e.g. the types used for representing characters and integral values, the necessary operations for mapping integral values to individual characters, as well as the character used for padding of encoded data. Different traits classes may exist for the same encoding scheme and character type, representing alternative encoding alphabets, such as the ones used in the base64url and base32hex encodings defined in RFC 4648.

Character Encoding Traits Requirements

In the following tables, traits denotes a class template defining types and functions for a character type charT; c and d denote values of type charT; e and f denote values of type traits::int_type. Operations on character encoding traits shall not throw exceptions.

Expression | Return Type | Description | Complexity
--- | --- | --- | ---
traits::char_type | charT | The encoding character type. | compile-time
traits::int_type | implementation-defined | An integral type large enough to hold all values of an octet [0..255]. | compile-time
traits::eq(c, d) | bool | Returns whether the character c is to be treated equal to the character d. | constant
traits::eq_int_type(e, f) | bool | For all characters c and d in the encoding alphabet, traits::eq_int_type(traits::to_int_type(c), traits::to_int_type(d)) is equal to traits::eq(c, d); otherwise, returns true if e and f are both copies of traits::inv(); otherwise, returns false if one of e and f is a copy of traits::inv() and the other is not; otherwise the return value is unspecified. | constant
traits::to_char_type(e) | traits::char_type | If the integral value e has a character representation in the encoding, returns that character; otherwise, the return value is unspecified. If the encoding supports both upper- and lowercase alphabets, it is implementation-defined whether an upper- or lowercase character will be returned. | constant
traits::to_int_type(c) | traits::int_type | For all characters c in the encoding alphabet, returns the integral value represented by c; otherwise, returns traits::inv(). | constant
traits::inv() | traits::int_type | Returns a value e such that traits::eq_int_type(e, traits::to_int_type(c)) is false for all characters c in the encoding alphabet. | constant
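
To make the requirements above concrete, here is a minimal sketch of a traits class for a lowercase-only Base16 alphabet. The name hex_lower_traits and the choice of -1 for inv() are assumptions made for illustration; this is not one of the traits classes shipped with stlencoders.

```cpp
#include <cassert>

// Hypothetical traits class, written only to illustrate the required members.
struct hex_lower_traits {
    typedef char char_type;
    typedef int  int_type;   // holds [0..255] plus the sentinel returned by inv()

    static const char* alphabet() { return "0123456789abcdef"; }

    static bool eq(char_type c, char_type d) { return c == d; }
    static bool eq_int_type(int_type e, int_type f) { return e == f; }

    // The result is unspecified for e outside [0..15]; masking is one possible choice.
    static char_type to_char_type(int_type e) { return alphabet()[e & 0x0f]; }

    // At most 16 comparisons, so the cost is bounded by a constant.
    static int_type to_int_type(char_type c) {
        for (int_type e = 0; e < 16; ++e)
            if (eq(alphabet()[e], c)) return e;
        return inv();
    }

    // Never equal to the value of any character in the alphabet.
    static int_type inv() { return -1; }
};

inline void hex_lower_traits_example() {
    assert(hex_lower_traits::to_char_type(10) == 'a');
    assert(hex_lower_traits::to_int_type('f') == 15);
    // 'A' is outside this lowercase-only alphabet, so it maps to inv().
    assert(hex_lower_traits::to_int_type('A') == hex_lower_traits::inv());
}
```

None of the functions throws, in line with the rule that operations on character encoding traits shall not throw exceptions.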

Optional Character Encoding Traits Members

A character encoding traits class may define the following members depending on the characteristics of the particular encoding, i.e. whether the encoding uses padding or supports both upper- and lowercase alphabets.

Expression | Return Type | Description | Complexity
--- | --- | --- | ---
traits::to_char_type_upper(e) | traits::char_type | If the encoding supports both upper- and lowercase alphabets and e has a character representation in the encoding, returns the uppercase character representing e; otherwise the return value is unspecified. | constant
traits::to_char_type_lower(e) | traits::char_type | If the encoding supports both upper- and lowercase alphabets and e has a character representation in the encoding, returns the lowercase character representing e; otherwise the return value is unspecified. | constant
traits::pad() | traits::char_type | If padding is used by the encoding, returns the character used to perform padding at the end of a character range. | constant
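
The optional members can be sketched for a Base32-style alphabet that accepts both cases and pads with '='. The class name base32_case_traits and the decision to return uppercase from to_char_type are assumptions for this example, not the library's own definitions. Because 'a' and 'A' map to the same integral value, eq has to treat them as equal to stay consistent with eq_int_type.

```cpp
#include <cassert>

// Hypothetical traits class illustrating the optional members.
struct base32_case_traits {
    typedef char char_type;
    typedef int  int_type;

    static const char* upper() { return "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"; }
    static const char* lower() { return "abcdefghijklmnopqrstuvwxyz234567"; }

    static int_type inv() { return -1; }

    // Accepts either case; at most 32 iterations, so the cost is bounded.
    static int_type to_int_type(char_type c) {
        for (int_type e = 0; e < 32; ++e)
            if (c == upper()[e] || c == lower()[e]) return e;
        return inv();
    }

    static char_type to_char_type_upper(int_type e) { return upper()[e & 0x1f]; }
    static char_type to_char_type_lower(int_type e) { return lower()[e & 0x1f]; }

    // Which case to return here is implementation-defined; uppercase is assumed.
    static char_type to_char_type(int_type e) { return to_char_type_upper(e); }

    static bool eq_int_type(int_type e, int_type f) { return e == f; }

    // Alphabet characters compare by value, so 'a' and 'A' are equal.
    static bool eq(char_type c, char_type d) {
        const int_type e = to_int_type(c);
        const int_type f = to_int_type(d);
        return (e != inv() && f != inv()) ? e == f : c == d;
    }

    static char_type pad() { return '='; }
};

inline void base32_case_traits_example() {
    assert(base32_case_traits::to_char_type_upper(1) == 'B');
    assert(base32_case_traits::to_char_type_lower(1) == 'b');
    assert(base32_case_traits::eq('b', 'B'));
    assert(base32_case_traits::to_int_type(base32_case_traits::pad())
           == base32_case_traits::inv());
}
```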

Codecs

A codec is a class template that implements a particular encoding scheme for a given character type and encoding alphabet, and is parameterized by the character type and a character encoding traits class. Codec classes provide operations for encoding octet sequences to character sequences and decoding character sequences to octet sequences. These operations are implemented as static encode and decode template member functions of the codec class, and are parameterized by iterator types.

Codec Requirements

In the following tables, codec denotes a codec class defining types and functions for a particular encoding with character type charT and character encoding traits traits; i and j denote iterators satisfying input iterator requirements and refer to elements implicitly convertible to codec::int_type; k and l denote iterators satisfying input iterator requirements and refer to elements implicitly convertible to codec::char_type; [i, j) and [k, l) denote valid ranges; r denotes an iterator satisfying output iterator requirements; s denotes a function object that, when applied to a value of type codec::char_type, returns a value testable as true; n denotes a value of integral type; and f denotes a boolean flag.

Expression | Return Type | Description | Complexity
--- | --- | --- | ---
codec::char_type | charT | The encoding character type. | compile-time
codec::traits_type | traits | The character encoding traits type. | compile-time
codec::int_type | implementation-defined | An integral type large enough to hold all values of an octet [0..255]. | compile-time
codec::encode(i, j, r) | type of r | Encodes the octet range [i, j) into the output range beginning at r, performing padding if supported by the encoding. Returns an iterator referring to one past the last value assigned to the output range. | linear
codec::decode(k, l, r) | type of r | Decodes the character range [k, l) into the output range beginning at r. For any character c not in the encoding alphabet, if padding is used by the encoding, stops at the first occurrence of traits::pad(); otherwise, throws stlencoders::invalid_character. Returns an iterator referring to one past the last value assigned to the output range, or throws stlencoders::invalid_length if the input range does not constitute a valid encoding sequence (e.g. an odd number of Base16 characters). | linear
codec::decode(k, l, r, s) | type of r | Decodes the character range [k, l) into the output range beginning at r. For any character c not in the encoding alphabet, if s(c) evaluates to true, that character is ignored; otherwise, if padding is used by the encoding, stops at the first occurrence of traits::pad(); otherwise, throws stlencoders::invalid_character. Returns an iterator referring to one past the last value assigned to the output range, or throws stlencoders::invalid_length if the input range does not constitute a valid encoding sequence (e.g. an odd number of Base16 characters). | linear
codec::max_encode_size(n) | type of n | Returns the largest possible result length when encoding an octet range of length n. | constant
codec::max_decode_size(n) | type of n | Returns the largest possible result length when decoding a character range of length n. | constant
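
As an illustration of how these requirements fit together, the following is a self-contained sketch of a Base16 codec. Everything in the sketch namespace (base16_codec, hex_traits and the two exception types) is a stand-in invented for this example; it is not the stlencoders implementation, and the library's own exception classes may be defined differently. Base16 has no padding, so decode throws on any character outside the alphabet.

```cpp
#include <cassert>
#include <iterator>
#include <stdexcept>
#include <string>

namespace sketch {

// Stand-ins for stlencoders::invalid_character and stlencoders::invalid_length.
struct invalid_character : std::runtime_error {
    invalid_character() : std::runtime_error("invalid character") {}
};
struct invalid_length : std::runtime_error {
    invalid_length() : std::runtime_error("invalid length") {}
};

// Minimal uppercase Base16 traits (hypothetical).
struct hex_traits {
    typedef char char_type;
    typedef int  int_type;
    static const char* alphabet() { return "0123456789ABCDEF"; }
    static bool eq(char_type c, char_type d) { return c == d; }
    static bool eq_int_type(int_type e, int_type f) { return e == f; }
    static char_type to_char_type(int_type e) { return alphabet()[e & 0x0f]; }
    static int_type inv() { return -1; }
    static int_type to_int_type(char_type c) {
        for (int_type e = 0; e < 16; ++e)
            if (eq(alphabet()[e], c)) return e;
        return inv();
    }
};

template<class charT, class traits>
struct base16_codec {
    typedef charT                     char_type;
    typedef traits                    traits_type;
    typedef typename traits::int_type int_type;

    // Two characters per octet; each input element is reduced to its low eight bits.
    template<class In, class Out>
    static Out encode(In i, In j, Out r) {
        for (; i != j; ++i) {
            const int_type v = static_cast<int_type>(*i & 0xff);
            *r++ = traits::to_char_type(v >> 4);
            *r++ = traits::to_char_type(v & 0x0f);
        }
        return r;
    }

    // One octet per character pair; a dangling single character is a length error.
    template<class In, class Out>
    static Out decode(In k, In l, Out r) {
        while (k != l) {
            const int_type hi = traits::to_int_type(*k++);
            if (traits::eq_int_type(hi, traits::inv())) throw invalid_character();
            if (k == l) throw invalid_length();
            const int_type lo = traits::to_int_type(*k++);
            if (traits::eq_int_type(lo, traits::inv())) throw invalid_character();
            *r++ = (hi << 4) | lo;
        }
        return r;
    }

    template<class Size> static Size max_encode_size(Size n) { return n * 2; }
    template<class Size> static Size max_decode_size(Size n) { return n / 2; }
};

inline void base16_example() {
    typedef base16_codec<char, hex_traits> codec;
    const std::string in = "Hi";
    std::string text, back;
    codec::encode(in.begin(), in.end(), std::back_inserter(text));
    codec::decode(text.begin(), text.end(), std::back_inserter(back));
    assert(text == "4869" && back == in);
}

} // namespace sketch
```

Because encode and decode take iterator pairs and an output iterator, the same code works with strings, vectors or stream iterators; max_encode_size and max_decode_size let a caller reserve space before writing through a raw pointer.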

Note that, although iterators referencing almost any integral type can be passed where an octet range is expected, these iterators will always be treated as if referring to octet types. Given the following example:

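#include <iterator>  // std::back_inserter
#include <string>
#include <vector>

std::vector<unsigned long> in;   // octet source; wider than eight bits on purpose
std::vector<unsigned long> out;  // receives the decoded octets
std::string s;                   // receives the Base64 text
stlencoders::base64<char>::encode(in.begin(), in.end(), std::back_inserter(s));
stlencoders::base64<char>::decode(s.begin(), s.end(), std::back_inserter(out));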

in's values will be truncated to eight bits when encoding, and out will only hold values in the range [0..255] after decoding.
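
Since the truncation is silent, a caller holding wide integers may want to check for data loss before encoding. The helpers below are hypothetical and not part of stlencoders; they assume that truncation keeps the low-order eight bits of each value.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// The octet an encoder would see, assuming the low-order eight bits are kept.
inline unsigned char low_octet(unsigned long value) {
    return static_cast<unsigned char>(value & 0xffUL);
}

// True if every element already fits in [0..255], so encoding loses no information.
inline bool all_octets(const std::vector<unsigned long>& values) {
    for (std::size_t n = 0; n < values.size(); ++n)
        if (values[n] > 0xffUL) return false;
    return true;
}

inline void truncation_example() {
    std::vector<unsigned long> in;
    in.push_back(0x41UL);    // fits in an octet
    in.push_back(0x141UL);   // does not; its low octet is also 0x41
    assert(low_octet(in[0]) == low_octet(in[1]));
    assert(!all_octets(in));
}
```

Under that assumption, 0x41 and 0x141 encode identically, so the original values cannot be recovered by decoding.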

Optional Codec Members

A codec class may define the following members depending on the characteristics of the particular encoding, e.g. whether the encoding uses padding or supports both upper- and lowercase alphabets.

Expression | Return Type | Description | Complexity
--- | --- | --- | ---
codec::encode(i, j, r, f) | type of r | Encodes the octet range [i, j) into the output range beginning at r. Performs padding at the end of the output range if and only if f is true. Returns an iterator referring to one past the last value assigned to the output range. | linear
codec::encode_upper(i, j, r) | type of r | Encodes the octet range [i, j) into the output range beginning at r, using the uppercase encoding alphabet as defined by traits::to_char_type_upper and performing padding if supported by the encoding. Returns an iterator referring to one past the last value assigned to the output range. | linear
codec::encode_lower(i, j, r) | type of r | Encodes the octet range [i, j) into the output range beginning at r, using the lowercase encoding alphabet as defined by traits::to_char_type_lower and performing padding if supported by the encoding. Returns an iterator referring to one past the last value assigned to the output range. | linear
codec::encode_upper(i, j, r, f) | type of r | Encodes the octet range [i, j) into the output range beginning at r, using the uppercase encoding alphabet as defined by traits::to_char_type_upper. Performs padding at the end of the output range if and only if f is true. Returns an iterator referring to one past the last value assigned to the output range. | linear
codec::encode_lower(i, j, r, f) | type of r | Encodes the octet range [i, j) into the output range beginning at r, using the lowercase encoding alphabet as defined by traits::to_char_type_lower. Performs padding at the end of the output range if and only if f is true. Returns an iterator referring to one past the last value assigned to the output range. | linear
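
The effect of the padding flag f can be sketched with a standalone Base64 encoder. The function sketch::base64_encode is a hypothetical stand-in for codec::encode(i, j, r, f), using the standard RFC 4648 Base64 alphabet with '=' as the padding character; it is not the library's implementation.

```cpp
#include <cassert>
#include <iterator>
#include <string>

namespace sketch {

// Encodes [i, j) as Base64; '=' padding is appended if and only if pad is true.
template<class In, class Out>
Out base64_encode(In i, In j, Out r, bool pad) {
    static const char alphabet[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    while (i != j) {
        unsigned long group = 0;  // up to three octets, first octet in the top bits
        int octets = 0;
        for (; octets < 3 && i != j; ++octets, ++i)
            group |= static_cast<unsigned long>(*i & 0xff) << (16 - 8 * octets);
        // n octets yield n + 1 significant characters.
        for (int c = 0; c <= octets; ++c)
            *r++ = alphabet[(group >> (18 - 6 * c)) & 0x3f];
        // Fill the final group up to four characters when padding is requested.
        if (pad)
            for (int c = octets; c < 3; ++c)
                *r++ = '=';
    }
    return r;
}

inline void padding_example() {
    const std::string in = "f";
    std::string padded, unpadded;
    base64_encode(in.begin(), in.end(), std::back_inserter(padded), true);
    base64_encode(in.begin(), in.end(), std::back_inserter(unpadded), false);
    assert(padded == "Zg==");
    assert(unpadded == "Zg");
}

} // namespace sketch
```

Unpadded output is common where '=' is inconvenient, for example in URLs; a decoder for such data cannot rely on the padding character to mark the end of the input.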