bases.encoding.fixchar

Fixed-character base encodings, generalising those described by RFC 4648.

Constructor options:

char_nbits: Union[int, Literal["auto"]], number of bits per character (default: "auto")
pad_char: Optional[str], optional padding character (default: None)
padding: PaddingOptions, padding style (default: "ignore")

If char_nbits is set to "auto" (the default), it is computed automatically as:

int(math.ceil(math.log2(base)))

From char_nbits, the size of a block (in bytes and in characters) is computed as:

block_nbytes = lcm(char_nbits, 8)//8
block_nchars = lcm(char_nbits, 8)//char_nbits

The value block_nbytes is presently not used, while block_nchars is used for padding.
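As a quick check of the formulas above, the following sketch evaluates them for a few common bases. The helper name block_sizes is hypothetical (it is not part of this module), and the least common multiple is computed through math.gcd so that the snippet only depends on the standard library.

```python
import math
from typing import Tuple


def block_sizes(base: int) -> Tuple[int, int, int]:
    """Return (char_nbits, block_nbytes, block_nchars) for a given base.

    Hypothetical helper mirroring the formulas in the text.
    """
    # Same expression as the "auto" setting of char_nbits.
    char_nbits = int(math.ceil(math.log2(base)))
    # lcm(char_nbits, 8), computed via gcd.
    lcm = char_nbits * 8 // math.gcd(char_nbits, 8)
    return char_nbits, lcm // 8, lcm // char_nbits


for base in (2, 8, 16, 32, 64):
    print(base, block_sizes(base))
```

For base 32 this gives 5 bits per character and blocks of 5 bytes or 8 characters; for base 64 it gives 6 bits per character and blocks of 3 bytes or 4 characters.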

The padding option must be set to "ignore" if a padding character is not specified (i.e. if pad_char is None). If a padding character is specified, it must be a single character (a string of length 1) that is not in the encoding alphabet: it is allowed in strings to be decoded, but only at the end (so that s.rstrip(pad_char) removes all padding).
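The constraints on pad_char and padding can be summarised by a small validation sketch. The function check_padding_options is hypothetical, the use of ValueError is an assumption (the exceptions raised by the actual constructor may differ), and the alphabet is treated as a plain case-sensitive string.

```python
from typing import Optional


def check_padding_options(alphabet: str, pad_char: Optional[str], padding: str) -> None:
    """Raise ValueError if the options violate the constraints in the text.

    Hypothetical helper; ValueError is an assumed stand-in.
    """
    if padding not in ("ignore", "include", "require"):
        raise ValueError(f"unknown padding style: {padding!r}")
    if pad_char is None:
        # Without a padding character, only "ignore" makes sense.
        if padding != "ignore":
            raise ValueError("padding must be 'ignore' when pad_char is None")
        return
    if len(pad_char) != 1:
        raise ValueError("pad_char must be a string of length 1")
    if pad_char in alphabet:
        raise ValueError("pad_char must not be in the encoding alphabet")


# Accepted: RFC 4648 base32 alphabet with "=" as padding character.
check_padding_options("ABCDEFGHIJKLMNOPQRSTUVWXYZ234567", "=", "include")
```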

The padding behaviour is determined by the value of padding:

"ignore": no padding included on encoding, no padding required on decoding
"include": padding included on encoding, no padding required on decoding
"require": padding included on encoding, padding required on decoding

Encoding of a bytestring b:

compute the minimum number extra_nbits of additional bits necessary to make 8*len(b) an integral multiple of char_nbits
convert b to an unsigned integer i (big-endian)
left-shift i by extra_nbits bits, introducing the necessary zero pad bits
convert i to the encoding base, using the encoding alphabet for digits
if padding is "include" or "require", append the minimum number of padding characters necessary to make the encoded string length an integral multiple of block_nchars
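The encoding steps above can be sketched in plain Python as follows. This is an illustrative reimplementation, not the library's code: the function name fixchar_encode is hypothetical, the alphabet is assumed to be a string whose length is exactly 2**char_nbits, and the sketch emits a fixed number of digits so that leading zero bytes are preserved (a detail the step list does not spell out).

```python
import math
from typing import Optional

BASE32 = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"  # RFC 4648 base32 alphabet


def fixchar_encode(b: bytes, alphabet: str, pad_char: Optional[str] = None,
                   padding: str = "ignore") -> str:
    """Illustrative fixed-character encoder (hypothetical helper)."""
    char_nbits = int(math.ceil(math.log2(len(alphabet))))
    if len(alphabet) != 2 ** char_nbits:
        raise ValueError("this sketch assumes a power-of-two alphabet size")
    nbits = 8 * len(b)
    # Minimum number of zero bits to reach a multiple of char_nbits.
    extra_nbits = (-nbits) % char_nbits
    # Big-endian integer, left-shifted to append the zero pad bits.
    i = int.from_bytes(b, "big") << extra_nbits
    nchars = (nbits + extra_nbits) // char_nbits
    mask = (1 << char_nbits) - 1
    # Read off the digits from most significant to least significant.
    s = "".join(alphabet[(i >> (char_nbits * k)) & mask]
                for k in reversed(range(nchars)))
    if padding in ("include", "require"):
        if pad_char is None:
            raise ValueError("padding requires a padding character")
        block_nchars = 8 // math.gcd(char_nbits, 8)  # lcm(char_nbits, 8)//char_nbits
        s += pad_char * ((-len(s)) % block_nchars)
    return s


print(fixchar_encode(b"foo", BASE32, pad_char="=", padding="include"))
```

With the RFC 4648 base32 alphabet, the bytestring b"foo" has 24 bits, so one zero pad bit is added to reach 25 bits (5 characters), and three padding characters complete the block of 8.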

Decoding of a string s:

if pad_char is not None, count the number N of contiguous padding characters at the end of s and strip them, obtaining s_stripped (the steps that follow operate on the stripped string, which they continue to call s)
if padding is "require", ensure that N is exactly the minimum number of padding characters that must be appended to s_stripped to make its length an integral multiple of block_nchars
convert s to an unsigned integer i, using the encoding alphabet for digits of the encoding base
compute the number extra_nbits = (char_nbits*len(s))%8 of pad bits: if this is not smaller than char_nbits, raise bases.encoding.errors.DecodingError
extract the value i%(2**extra_nbits) of the pad bits: if this is not zero, raise DecodingError
compute the number of bytes in the decoded bytestring as original_nbytes = (char_nbits*len(s))//8
right-shift i by extra_nbits bits, removing the zero pad bits
convert i to its minimal byte representation (big-endian), then zero-pad on the left to reach original_nbytes bytes
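The decoding steps above can be sketched in the same way. Again, this is an illustrative reimplementation under stated assumptions: fixchar_decode is a hypothetical name, the local DecodingError class is a stand-in for bases.encoding.errors.DecodingError (and is also used here for padding mismatches), the alphabet is a case-sensitive string of length 2**char_nbits, and all steps after stripping operate on the stripped string.

```python
import math
from typing import Optional

BASE32 = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"  # RFC 4648 base32 alphabet


class DecodingError(ValueError):
    """Local stand-in for the library's decoding error."""


def fixchar_decode(s: str, alphabet: str, pad_char: Optional[str] = None,
                   padding: str = "ignore") -> bytes:
    """Illustrative fixed-character decoder (hypothetical helper)."""
    char_nbits = int(math.ceil(math.log2(len(alphabet))))
    if pad_char is not None:
        s_stripped = s.rstrip(pad_char)
        n_pad = len(s) - len(s_stripped)
        if padding == "require":
            block_nchars = 8 // math.gcd(char_nbits, 8)
            if n_pad != (-len(s_stripped)) % block_nchars:
                raise DecodingError("wrong number of padding characters")
        s = s_stripped
    # Accumulate the digits into an unsigned integer.
    i = 0
    for c in s:
        digit = alphabet.find(c)
        if digit < 0:
            raise DecodingError(f"invalid character: {c!r}")
        i = (i << char_nbits) | digit
    extra_nbits = (char_nbits * len(s)) % 8
    if extra_nbits >= char_nbits:
        raise DecodingError("invalid string length")
    if i % (2 ** extra_nbits) != 0:
        raise DecodingError("pad bits must be zero")
    original_nbytes = (char_nbits * len(s)) // 8
    # Drop the pad bits, then emit exactly original_nbytes bytes.
    return (i >> extra_nbits).to_bytes(original_nbytes, "big")


print(fixchar_decode("MZXW6===", BASE32, pad_char="=", padding="require"))
```

Passing original_nbytes to int.to_bytes performs the left zero-padding of the final step, so leading zero bytes survive the round trip.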

FixcharBaseEncoding

class FixcharBaseEncoding(alphabet, *, case_sensitive=None, char_nbits='auto', pad_char=None, padding='ignore')

Bases: BaseEncoding

Fixed-character encodings.

Parameters:

alphabet (str, range or Alphabet) – the alphabet to use for the encoding
case_sensitive (bool or None, optional) – optional case sensitivity (if None, the one from the alphabet is used)
char_nbits (int or "auto", optional) – number of bits per character (default: "auto")
pad_char (str or None, optional) – optional padding character (default: None)
padding (PaddingOptions, optional) – padding style (default: "ignore")

property block_nchars

Number of characters in a block.

Return type: int

canonical_bytes(b)

Returns a canonical version of the bytestring b: this is the bytestring obtained by first encoding b and then decoding it.

(This method is overridden by subclasses with more efficient implementations.)

Parameters: b (BytesLike) – the bytestring
Return type: bytes

canonical_string(s)

Returns a canonical version of the string s: this is the string obtained by first decoding s and then encoding it.

(This method is overridden by subclasses with more efficient implementations.)
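The decode-then-encode idea behind the canonical string can be illustrated with Python's standard base64 module, used here as a stand-in rather than this library. The helper canonical_b32_string is hypothetical; it accepts lowercase input and returns the uppercase, padded form that the standard-library encoder produces.

```python
import base64


def canonical_b32_string(s: str) -> str:
    """Decode s as RFC 4648 base32 (case-insensitively), then re-encode it.

    Hypothetical helper built on the standard library.
    """
    raw = base64.b32decode(s, casefold=True)
    return base64.b32encode(raw).decode("ascii")


print(canonical_b32_string("mzxw6==="))
```

Strings that differ only in letter case decode to the same bytes, so they share one canonical string.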

Parameters: s (str) – the string
Return type: str

property char_nbits

Number of bits per character.

Return type: int

property effective_base

The effective base used when decoding, equal to 2**char_nbits.

Return type: int

property include_padding

Whether padding is included on encoding (derived from padding).
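Going by the descriptions of the three padding styles, the two derived flags can be sketched as below. The free functions are hypothetical stand-ins for the include_padding and require_padding properties, and the mapping is an assumption read off the style descriptions rather than taken from the implementation.

```python
def include_padding(padding: str) -> bool:
    """Padding is emitted on encoding for both "include" and "require"."""
    return padding in ("include", "require")


def require_padding(padding: str) -> bool:
    """Padding is mandatory on decoding only for "require"."""
    return padding == "require"


for style in ("ignore", "include", "require"):
    print(style, include_padding(style), require_padding(style))
```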

Return type: bool

nopad(allow=True)

Returns a copy of this encoding which does not include/require padding (and optionally disallows it by removing the padding character).

Example usage:

>>> encoding.base32
FixcharBaseEncoding(
    StringAlphabet('ABCDEFGHIJKLMNOPQRSTUVWXYZ234567',
                   case_sensitive=False),
    pad_char='=', padding='include')
>>> encoding.base32.nopad()
FixcharBaseEncoding(
    StringAlphabet('ABCDEFGHIJKLMNOPQRSTUVWXYZ234567',
                   case_sensitive=False),
    pad_char='=')
>>> encoding.base32.nopad(allow=False)
FixcharBaseEncoding(
    StringAlphabet('ABCDEFGHIJKLMNOPQRSTUVWXYZ234567',
                   case_sensitive=False))

Parameters: allow (bool) – whether padding is to be allowed on decoding
Return type: FixcharBaseEncoding

options(skip_defaults=False)

The options used to construct this particular encoding.

Example usage:

>>> encoding.base32.options()
{'char_nbits': 'auto', 'pad_char': '=', 'padding': 'include'}
>>> encoding.base32.options(skip_defaults=True)
{'pad_char': '=', 'padding': 'include'}

Parameters: skip_defaults (bool, optional) – if set to True, only options with non-default values are included in the mapping
Return type: Mapping[str, Any]

pad(require=False)

Returns a copy of this encoding which includes padding (and optionally requires it).

Example usage, from "include" to "require":

>>> encoding.base32
FixcharBaseEncoding(
    StringAlphabet('ABCDEFGHIJKLMNOPQRSTUVWXYZ234567',
                   case_sensitive=False),
    pad_char='=', padding='include')
>>> encoding.base32.pad(require=True)
FixcharBaseEncoding(
    StringAlphabet('ABCDEFGHIJKLMNOPQRSTUVWXYZ234567',
                   case_sensitive=False),
    pad_char='=', padding='require')

Example usage, from "ignore" to "include":

>>> encoding.base32z
FixcharBaseEncoding(
    StringAlphabet('ybndrfg8ejkmcpqxot1uwisza345h769',
                   case_sensitive=False))
>>> encoding.base32z.with_pad_char("=")
FixcharBaseEncoding(
    StringAlphabet('ybndrfg8ejkmcpqxot1uwisza345h769',
                   case_sensitive=False),
    pad_char='=')
>>> encoding.base32z.with_pad_char("=").pad()
FixcharBaseEncoding(
    StringAlphabet('ybndrfg8ejkmcpqxot1uwisza345h769',
                   case_sensitive=False),
    pad_char='=', padding='include')

Parameters: require (bool) – whether padding is to be required on decoding
Return type: FixcharBaseEncoding

property pad_char

An optional character to be used for padding of encoded strings. In RFC 4648, this is "=" for both base64 and base32.

Return type: Optional[str]

pad_string(s)

If no padding character is specified for this encoding, returns the input string unchanged. If a padding character is specified for this encoding, pads the input string by appending the minimum number of padding characters necessary to make its length an integral multiple of the block char size (given by block_nchars).

The value of padding is irrelevant to this method.
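The behaviour described above amounts to a one-line length computation. The following standalone sketch takes pad_char and block_nchars as explicit arguments (an assumption made for self-containment; the method reads them from the encoding instance).

```python
from typing import Optional


def pad_string(s: str, pad_char: Optional[str], block_nchars: int) -> str:
    """Append the minimum padding needed to reach a multiple of block_nchars."""
    if pad_char is None:
        return s
    # (-len(s)) % block_nchars is 0 when the length is already a multiple.
    return s + pad_char * ((-len(s)) % block_nchars)


print(pad_string("MZXW6", "=", 8))
```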

Parameters: s (str) – the string
Return type: str

property padding

Padding style:

"ignore": no padding included on encoding, no padding required on decoding
"include": padding included on encoding, no padding required on decoding
"require": padding included on encoding, padding required on decoding

Return type: PaddingOptions

property require_padding

Whether padding is required on decoding (derived from padding).

Return type: bool

strip_string(s)

If no padding character is specified for this encoding, returns the input string unchanged. If a padding character is specified for this encoding, strips any padding characters from the right of the input string. If padding is set to "require", checks that the correct number of padding characters was included and raises PaddingError if not.
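A standalone sketch of this stripping logic is shown below. The function takes pad_char, block_nchars and padding as explicit arguments (an assumption made for self-containment), and the local PaddingError class is a stand-in for the library's exception of the same name.

```python
from typing import Optional


class PaddingError(ValueError):
    """Local stand-in for the library's padding error."""


def strip_string(s: str, pad_char: Optional[str], block_nchars: int,
                 padding: str = "ignore") -> str:
    """Strip trailing padding, validating its length if padding is required."""
    if pad_char is None:
        return s
    stripped = s.rstrip(pad_char)
    if padding == "require":
        n_pad = len(s) - len(stripped)
        expected = (-len(stripped)) % block_nchars
        if n_pad != expected:
            raise PaddingError(f"expected {expected} padding characters, found {n_pad}")
    return stripped


print(strip_string("MZXW6===", "=", 8, padding="require"))
```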

Parameters: s (str) – the string
Return type: str

with_pad_char(pad_char)

Returns a copy of this encoding with a different padding character (or without a padding character if pad_char is None).

Example usage:

>>> encoding.base32
FixcharBaseEncoding(
    StringAlphabet('ABCDEFGHIJKLMNOPQRSTUVWXYZ234567',
                   case_sensitive=False),
    pad_char='=', padding='include')
>>> encoding.base32.with_pad_char("~")
FixcharBaseEncoding(
    StringAlphabet('ABCDEFGHIJKLMNOPQRSTUVWXYZ234567',
                   case_sensitive=False),
    pad_char='~', padding='include')

Parameters: pad_char (str or None) – padding character (default: None)
Return type: FixcharBaseEncoding

PaddingOptions

PaddingOptions

Type of allowed padding options for fixed-character encoding. See FixcharBaseEncoding.padding.

alias of Literal['ignore', 'include', 'require']