bases.encoding.fixchar

Fixed-character base encodings, generalising those described by RFC 4648.

Constructor options:

  • char_nbits: Union[int, Literal["auto"]], number of bits per character (default: "auto")

  • pad_char: Optional[str], optional padding character (default: None)

  • padding: PaddingOptions, padding style (default: "ignore")

If char_nbits is set to "auto" (the default), it is automatically computed as:

int(math.ceil(math.log2(base)))

From char_nbits, the size of a block (in bytes and in chars) is computed as:

block_nbytes = lcm(char_nbits, 8)//8
block_nchars = lcm(char_nbits, 8)//char_nbits

The value block_nbytes is presently not used, while block_nchars is used for padding.
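
As an illustration of the two formulas above, the following minimal sketch computes the block sizes for a few common bases. The helper name block_sizes is hypothetical and not part of the library; math.lcm assumes Python 3.9 or later.

```python
import math


def block_sizes(base, char_nbits="auto"):
    """Return (char_nbits, block_nbytes, block_nchars) for a given base.

    Hypothetical helper, written only to illustrate the formulas above.
    """
    if char_nbits == "auto":
        char_nbits = int(math.ceil(math.log2(base)))
    block_nbits = math.lcm(char_nbits, 8)  # math.lcm needs Python 3.9+
    block_nbytes = block_nbits // 8
    block_nchars = block_nbits // char_nbits
    return char_nbits, block_nbytes, block_nchars


print(block_sizes(16))  # (4, 1, 2)
print(block_sizes(32))  # (5, 5, 8)
print(block_sizes(64))  # (6, 3, 4)
```

For base 32, for instance, a block is 5 bytes or 8 characters, which is why padded base32 strings have a length that is a multiple of 8.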

The padding option must be set to "ignore" if no padding character is specified (i.e. if pad_char is None). If a padding character is specified, it must be a single character (a string of length 1) that is not in the encoding alphabet. It is allowed in strings to be decoded, but only at the end (so that s.rstrip(pad_char) removes all padding).

The padding behaviour is determined by the value of padding:

  • "ignore": no padding included on encoding, no padding required on decoding

  • "include": padding included on encoding, no padding required on decoding

  • "require": padding included on encoding, padding required on decoding
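
The three styles can be read as two independent flags, which is how the include_padding and require_padding properties are described further down. A minimal sketch of that derivation follows; the function name padding_flags is hypothetical.

```python
def padding_flags(padding):
    """Return (include_padding, require_padding) for a padding style.

    Hypothetical helper mirroring the three bullet points above.
    """
    if padding not in ("ignore", "include", "require"):
        raise ValueError(f"unknown padding style: {padding!r}")
    include_padding = padding in ("include", "require")  # pad when encoding
    require_padding = padding == "require"  # insist on padding when decoding
    return include_padding, require_padding


for style in ("ignore", "include", "require"):
    print(style, padding_flags(style))
```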

Encoding of a bytestring b:

  1. compute the minimum number extra_nbits of additional bits necessary to make 8*len(b) an integral multiple of char_nbits

  2. convert b to an unsigned integer i (big-endian)

  3. left-shift i by extra_nbits bits, introducing the necessary zero pad bits

  4. convert i to the encoding base, using the encoding alphabet for digits

  5. if padding is "include" or "require", append the minimum number of padding characters necessary to make the encoded string length an integral multiple of block_nchars
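
The encoding steps above can be sketched in plain Python as follows. This is not the library's implementation: the function fixchar_encode is hypothetical, it assumes an alphabet whose length is a power of two, and it assumes that step 4 emits a fixed number of digits so that leading zero bytes survive.

```python
import math

BASE32 = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"  # RFC 4648 base32 alphabet


def fixchar_encode(b, alphabet, pad_char=None, padding="ignore"):
    """Illustrative encoder (assumes len(alphabet) is a power of two)."""
    char_nbits = int(math.ceil(math.log2(len(alphabet))))
    block_nchars = math.lcm(char_nbits, 8) // char_nbits
    # Step 1: bits needed to reach a multiple of char_nbits
    extra_nbits = (-8 * len(b)) % char_nbits
    # Step 2: bytes to unsigned big-endian integer
    i = int.from_bytes(b, "big")
    # Step 3: introduce the zero pad bits on the right
    i <<= extra_nbits
    # Step 4: emit a fixed number of digits (assumption: this keeps
    # leading zero bytes, which a minimal conversion would drop)
    nchars = (8 * len(b) + extra_nbits) // char_nbits
    mask = (1 << char_nbits) - 1
    digits = []
    for _ in range(nchars):
        digits.append(alphabet[i & mask])
        i >>= char_nbits
    s = "".join(reversed(digits))
    # Step 5: pad to a whole number of blocks
    if padding in ("include", "require"):
        s += pad_char * ((-len(s)) % block_nchars)
    return s


print(fixchar_encode(b"foobar", BASE32))  # MZXW6YTBOI
print(fixchar_encode(b"foobar", BASE32, pad_char="=", padding="include"))
# MZXW6YTBOI======
```

The padded output matches the RFC 4648 test vector for "foobar".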

Decoding of a string s:

  1. if pad_char is not None, count the number N of contiguous padding characters at the end of s and strip them, obtaining s_stripped

  2. if padding is "require", ensure that N is exactly the minimum number of padding characters that must be appended to s_stripped to make its length an integral multiple of block_nchars

  3. convert s (with any padding characters stripped, here and in the steps below) to an unsigned integer i, using the encoding alphabet for digits of the encoding base

  4. compute the number extra_nbits = (char_nbits*len(s))%8 of pad bits: if this is not smaller than char_nbits, raise bases.encoding.errors.DecodingError

  5. extract the value i%(2**extra_nbits) of the pad bits: if this is not zero, raise DecodingError

  6. compute the number of bytes in the decoded bytestring as original_nbytes = (char_nbits*len(s))//8

  7. right-shift i by extra_nbits bits, removing the zero pad bits

  8. convert i to its minimal byte representation (big-endian), then zero-pad on the left to reach original_nbytes bytes
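
The decoding steps above can be sketched in the same way. Again this is only an illustration under stated assumptions: fixchar_decode is a hypothetical function, the local DecodingError class is a stand-in for the library's own error types, the alphabet length is assumed to be a power of two, and characters are matched case-sensitively.

```python
import math

BASE32 = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"  # RFC 4648 base32 alphabet


class DecodingError(ValueError):
    """Local stand-in for the library's decoding errors."""


def fixchar_decode(s, alphabet, pad_char=None, padding="ignore"):
    """Illustrative decoder (assumes len(alphabet) is a power of two)."""
    char_nbits = int(math.ceil(math.log2(len(alphabet))))
    block_nchars = math.lcm(char_nbits, 8) // char_nbits
    if pad_char is not None:
        # Step 1: count and strip trailing padding
        stripped = s.rstrip(pad_char)
        npad = len(s) - len(stripped)
        # Step 2: with "require", the padding must be exact
        if padding == "require" and npad != (-len(stripped)) % block_nchars:
            raise DecodingError("wrong number of padding characters")
        s = stripped
    # Step 3: digits to unsigned integer
    i = 0
    for c in s:
        digit = alphabet.find(c)
        if digit < 0:
            raise DecodingError(f"character {c!r} not in alphabet")
        i = (i << char_nbits) | digit
    # Step 4: number of pad bits
    extra_nbits = (char_nbits * len(s)) % 8
    if extra_nbits >= char_nbits:
        raise DecodingError("invalid length for encoded string")
    # Step 5: pad bits must all be zero
    if i % (2 ** extra_nbits) != 0:
        raise DecodingError("non-zero pad bits")
    # Step 6: size of the decoded bytestring
    original_nbytes = (char_nbits * len(s)) // 8
    # Step 7: drop the pad bits
    i >>= extra_nbits
    # Step 8: big-endian bytes, zero-padded on the left
    return i.to_bytes(original_nbytes, "big")


print(fixchar_decode("MZXW6YTBOI======", BASE32, pad_char="=", padding="require"))
# b'foobar'
```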

FixcharBaseEncoding

class FixcharBaseEncoding(alphabet, *, case_sensitive=None, char_nbits='auto', pad_char=None, padding='ignore')

Bases: BaseEncoding

Fixed-character encodings.

Parameters:
  • alphabet (str, range or Alphabet) – the alphabet to use for the encoding

  • case_sensitive (bool or None, optional) – optional case sensitivity (if None, the one from the alphabet is used)

  • char_nbits (int or "auto", optional) – number of bits per character (default: "auto")

  • pad_char (str or None, optional) – optional padding character (default: None)

  • padding (PaddingOptions, optional) – padding style (default: "ignore")

property block_nchars

Number of characters in a block.

Return type:

int

canonical_bytes(b)

Returns a canonical version of the bytestring b: this is the bytestring obtained by first encoding b and then decoding it.

(This method is overridden by subclasses with more efficient implementations.)

Parameters:

b (BytesLike) – the bytestring

Return type:

bytes

canonical_string(s)

Returns a canonical version of the string s: this is the string obtained by first decoding s and then encoding it.

(This method is overridden by subclasses with more efficient implementations.)
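
To illustrate the decode-then-encode idea behind canonical strings without relying on this library, the following sketch uses the standard library's base64 module for RFC 4648 base32. The helper canonical_b32_string is hypothetical, and base64's handling of case and padding is that module's own, not necessarily identical to this class.

```python
import base64


def canonical_b32_string(s):
    """Decode a base32 string (accepting lowercase), then re-encode it."""
    decoded = base64.b32decode(s, casefold=True)
    return base64.b32encode(decoded).decode("ascii")


print(canonical_b32_string("mzxw6ytboi======"))  # MZXW6YTBOI======
```

Two strings that decode to the same bytes therefore share one canonical form.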

Parameters:

s (str) – the string

Return type:

str

property char_nbits

Number of bits per character.

Return type:

int

property effective_base

The effective base used when decoding, equal to 2**char_nbits.

Return type:

int

property include_padding

Whether padding is included on encoding (derived from padding).

Return type:

bool

nopad(allow=True)

Returns a copy of this encoding which does not include/require padding (and optionally disallows it by removing the padding character).

Example usage:

>>> encoding.base32
FixcharBaseEncoding(
    StringAlphabet('ABCDEFGHIJKLMNOPQRSTUVWXYZ234567',
                   case_sensitive=False),
    pad_char='=', padding='include')
>>> encoding.base32.nopad()
FixcharBaseEncoding(
    StringAlphabet('ABCDEFGHIJKLMNOPQRSTUVWXYZ234567',
                   case_sensitive=False),
    pad_char='=')
>>> encoding.base32.nopad(allow=False)
FixcharBaseEncoding(
    StringAlphabet('ABCDEFGHIJKLMNOPQRSTUVWXYZ234567',
                   case_sensitive=False))

Parameters:

allow (bool) – whether padding is to be allowed on decoding

Return type:

FixcharBaseEncoding

options(skip_defaults=False)

The options used to construct this particular encoding.

Example usage:

>>> encoding.base32.options()
{'char_nbits': 'auto', 'pad_char': '=', 'padding': 'include'}
>>> encoding.base32.options(skip_defaults=True)
{'pad_char': '=', 'padding': 'include'}

Parameters:

skip_defaults (bool, optional) – if set to True, only options with non-default values are included in the mapping

Return type:

Mapping[str, Any]

pad(require=False)

Returns a copy of this encoding which includes padding (and optionally requires it).

Example usage, from "include" to "require":

>>> encoding.base32
FixcharBaseEncoding(
    StringAlphabet('ABCDEFGHIJKLMNOPQRSTUVWXYZ234567',
                   case_sensitive=False),
    pad_char='=', padding='include')
>>> encoding.base32.pad(require=True)
FixcharBaseEncoding(
    StringAlphabet('ABCDEFGHIJKLMNOPQRSTUVWXYZ234567',
                   case_sensitive=False),
    pad_char='=', padding='require')

Example usage, from "ignore" to "include":

>>> encoding.base32z
FixcharBaseEncoding(
    StringAlphabet('ybndrfg8ejkmcpqxot1uwisza345h769',
                   case_sensitive=False))
>>> encoding.base32z.with_pad_char("=")
FixcharBaseEncoding(
    StringAlphabet('ybndrfg8ejkmcpqxot1uwisza345h769',
                   case_sensitive=False),
    pad_char='=')
>>> encoding.base32z.with_pad_char("=").pad()
FixcharBaseEncoding(
    StringAlphabet('ybndrfg8ejkmcpqxot1uwisza345h769',
                   case_sensitive=False),
    pad_char='=', padding='include')

Parameters:

require (bool) – whether padding is to be required on decoding

Return type:

FixcharBaseEncoding

property pad_char

An optional character to be used for padding of encoded strings. In RFC 4648, this is "=" for both base64 and base32.

Return type:

Optional[str]

pad_string(s)

If no padding character is specified for this encoding, returns the input string unchanged. If a padding character is specified for this encoding, pads the input string by appending the minimum number of padding characters necessary to make its length an integral multiple of the block char size (given by block_nchars).

The value of padding is irrelevant to this method.
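
The behaviour described above amounts to a short computation. The sketch below is a standalone function with hypothetical parameters (block_nchars and pad_char are passed explicitly rather than read from an encoding object).

```python
def pad_string(s, block_nchars, pad_char=None):
    """Append the minimum padding that makes len(s) a multiple of block_nchars."""
    if pad_char is None:
        return s  # no padding character: string returned unchanged
    return s + pad_char * ((-len(s)) % block_nchars)


print(pad_string("MZXW6YTBOI", 8, "="))  # MZXW6YTBOI======
print(pad_string("MZXW6YTBOI", 8))       # MZXW6YTBOI
```

A string whose length is already a whole number of blocks receives no padding.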

Parameters:

s (str) – the string

Return type:

str

property padding

Padding style:

  • "ignore": no padding included on encoding, no padding required on decoding

  • "include": padding included on encoding, no padding required on decoding

  • "require": padding included on encoding, padding required on decoding

Return type:

PaddingOptions

property require_padding

Whether padding is required on decoding (derived from padding).

Return type:

bool

strip_string(s)

If no padding character is specified for this encoding, returns the input string unchanged. If a padding character is specified for this encoding, strips the input string of any padding characters it might have to the right. If padding is set to "require", checks that the correct number of padding characters were included and raises PaddingError if not.
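
A minimal standalone sketch of this behaviour follows. The function takes block_nchars, pad_char and padding as explicit, hypothetical parameters, and the local PaddingError class is a stand-in for the library's own error type.

```python
class PaddingError(ValueError):
    """Local stand-in for the library's PaddingError."""


def strip_string(s, block_nchars, pad_char=None, padding="ignore"):
    """Strip trailing padding, checking its amount when padding is "require"."""
    if pad_char is None:
        return s  # no padding character: string returned unchanged
    stripped = s.rstrip(pad_char)
    if padding == "require":
        found = len(s) - len(stripped)
        expected = (-len(stripped)) % block_nchars
        if found != expected:
            raise PaddingError(
                f"expected {expected} padding characters, found {found}"
            )
    return stripped


print(strip_string("MZXW6YTBOI======", 8, "=", "require"))  # MZXW6YTBOI
print(strip_string("MZXW6YTBOI=", 8, "=", "include"))       # MZXW6YTBOI
```

With "include" or "ignore", any amount of trailing padding is accepted; only "require" checks the count.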

Parameters:

s (str) – the string

Return type:

str

with_pad_char(pad_char)

Returns a copy of this encoding with a different padding character (or without a padding character if pad_char is None).

Example usage:

>>> encoding.base32
FixcharBaseEncoding(
    StringAlphabet('ABCDEFGHIJKLMNOPQRSTUVWXYZ234567',
                   case_sensitive=False),
    pad_char='=', padding='include')
>>> encoding.base32.with_pad_char("~")
FixcharBaseEncoding(
    StringAlphabet('ABCDEFGHIJKLMNOPQRSTUVWXYZ234567',
                   case_sensitive=False),
    pad_char='~', padding='include')

Parameters:

pad_char (str or None) – the new padding character, or None for no padding character

Return type:

FixcharBaseEncoding

PaddingOptions

PaddingOptions

Type of allowed padding options for fixed-character encoding. See FixcharBaseEncoding.padding.

alias of Literal['ignore', 'include', 'require']