bases.encoding.block
Block base encodings.
Splits the bytestring to encode (resp. string to decode) into blocks,
then encodes (resp. decodes) each block individually using an underlying encoding.
By default, the underlying encoding is a simple base encoding.
Constructor options:
- block_size: Union[int, Mapping[int, int]] – cf. below
- sep_char: str – an optional separator character for encoded string blocks (default: "")
- reverse_blocks: bool – an optional flag to reverse individual char blocks in the encoded string (default: False)
The block_size option is mandatory and determines the allowed block sizes for encoding and decoding:
- if block_size is a strictly increasing mapping of positive integers to positive integers, its keys are taken to be the allowed block byte sizes and its values are taken to be the corresponding block char sizes.
- if block_size is an integer, all block byte sizes in range(1, block_size+1) are allowed, and the corresponding block char sizes are computed by:
char_size = int(math.floor(math.log(256**byte_size, base)))+1
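The formula above can be tried out directly. The following is a minimal sketch, not library code: the helper name block_char_sizes is hypothetical, and base is assumed to be the number of characters in the alphabet.

```python
import math


def block_char_sizes(block_size: int, base: int) -> dict[int, int]:
    """Map every byte size in range(1, block_size+1) to its char size.

    Hypothetical helper: it applies the documented formula as written.
    """
    return {
        byte_size: int(math.floor(math.log(256**byte_size, base))) + 1
        for byte_size in range(1, block_size + 1)
    }


# For a 45-character alphabet, an integer block size of 2 yields the
# same sizes as the explicit mapping {1: 2, 2: 3}.
print(block_char_sizes(2, 45))  # {1: 2, 2: 3}
```

Because the formula relies on floating-point logarithms, results for bases where 256**byte_size is an exact power of the base may be worth double-checking by hand.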
The property nbytes2nchars has all valid block byte sizes as keys and the corresponding block char sizes as values.
The property nchars2nbytes has all valid block char sizes as keys and the corresponding block byte sizes as values.
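Read together, the two properties describe mutually inverse mappings. A minimal sketch of that relationship, assuming a strictly increasing mapping as required for block_size (the function name invert_block_sizes is hypothetical and not part of the library):

```python
def invert_block_sizes(nbytes2nchars: dict[int, int]) -> dict[int, int]:
    """Build the char-size-to-byte-size mapping from its inverse.

    Hypothetical helper; it rejects mappings that are not strictly
    increasing, since those could not be inverted unambiguously.
    """
    byte_sizes = sorted(nbytes2nchars)
    char_sizes = [nbytes2nchars[n] for n in byte_sizes]
    if any(a >= b for a, b in zip(char_sizes, char_sizes[1:])):
        raise ValueError("block char sizes must be strictly increasing")
    return {nchars: nbytes for nbytes, nchars in nbytes2nchars.items()}


print(invert_block_sizes({1: 2, 2: 3}))  # {2: 1, 3: 2}
```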
Each pair of corresponding block byte and char sizes is assessed to ensure that encoding and decoding are unambiguous,
using the static methods max_block_nchars and
max_block_nbytes from the zeropad base encoding implementation
(cf. class ZeropadBaseEncoding).
The maximum valid block byte (resp. char) size is used on encoding (resp. decoding) for all blocks except at most the last one: if the number of bytes (resp. chars) in the last block is not valid, the bytestring (resp. string) is not valid overall.
As a concrete example, the following is the constructor for the base45 encoding:
base45 = BlockBaseEncoding(alphabet.base45, block_size={1: 2, 2: 3})
In this case, encoding uses blocks of 2 bytes, with the final block allowed to be 1 or 2 bytes. Decoding uses blocks of 3 chars, with the final block allowed to be 2 or 3 chars (but not 1 char). Because no encoding was explicitly specified, the encoding used is the simple encoding for the base45 alphabet.
Encoding of a bytestring b:
- split b into blocks of size block_nbytes, with the final block allowed to be any size in nbytes2nchars (raise EncodingError if it isn’t)
- encode each block individually using the block_encoding
- check that no encoded block string exceeds the block char size corresponding to the original block byte size
- prepend zero chars to each encoded block string until it reaches the designated block char size
- if reverse_blocks, reverse each individual char block
- join the blocks into the final encoded string (using the separator character sep_char, if specified)
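The encoding steps above can be sketched in plain Python. This is an illustrative re-implementation under stated assumptions, not the library's code: the names BASE45_ALPHABET, encode_block and block_encode are hypothetical, the per-block encoding is assumed to be a big-endian integer conversion that drops leading zeros, the block size is taken to be the largest key of nbytes2nchars, and ValueError stands in for EncodingError.

```python
BASE45_ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:"


def encode_block(block: bytes, alphabet: str) -> str:
    """Simple base encoding of one block, without leading zero chars."""
    n = int.from_bytes(block, "big")
    base = len(alphabet)
    chars = []
    while n > 0:
        n, digit = divmod(n, base)
        chars.append(alphabet[digit])
    return "".join(reversed(chars))


def block_encode(b: bytes, alphabet: str = BASE45_ALPHABET,
                 nbytes2nchars: dict[int, int] = {1: 2, 2: 3},
                 sep_char: str = "", reverse_blocks: bool = False) -> str:
    # Split into blocks of the maximum valid byte size.
    block_nbytes = max(nbytes2nchars)
    blocks = [b[i:i + block_nbytes] for i in range(0, len(b), block_nbytes)]
    # Only the final block may be shorter, and its size must be valid.
    if blocks and len(blocks[-1]) not in nbytes2nchars:
        raise ValueError("invalid size for final block")  # EncodingError stand-in
    encoded = []
    for block in blocks:
        nchars = nbytes2nchars[len(block)]
        s = encode_block(block, alphabet)
        # The encoded block must fit in the designated char size.
        if len(s) > nchars:
            raise ValueError("encoded block too long")
        # Prepend zero chars (the first alphabet char) up to the char size.
        s = s.rjust(nchars, alphabet[0])
        if reverse_blocks:
            s = s[::-1]
        encoded.append(s)
    return sep_char.join(encoded)


print(block_encode(b"AB"))                       # 8BB
print(block_encode(b"AB", reverse_blocks=True))  # BB8
```

With reverse_blocks=True the sketch reproduces the "AB" to "BB8" example from the base45 specification (RFC 9285), which places the least significant character of each block first.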
Decoding of a string s:
- split s into blocks of size block_nchars, with the final block allowed to be any size in nchars2nbytes (raise DecodingError if it isn’t)
- if reverse_blocks, reverse each individual char block
- decode each block individually using the block_encoding
- check that no decoded block bytestring exceeds the block byte size corresponding to the original block char size
- prepend zero bytes to each decoded block bytestring until it reaches the designated block byte size
- join the blocks into the final decoded bytestring
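The decoding steps above can be sketched the same way. Again this is an illustrative re-implementation rather than the library's code: the names BASE45_ALPHABET, decode_block and block_decode are hypothetical, the separator is assumed to be empty, the block size is taken to be the largest key of nchars2nbytes, and ValueError stands in for DecodingError.

```python
BASE45_ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:"


def decode_block(s: str, alphabet: str) -> bytes:
    """Simple base decoding of one block, without leading zero bytes."""
    base = len(alphabet)
    n = 0
    for char in s:
        # str.index raises ValueError for chars outside the alphabet.
        n = n * base + alphabet.index(char)
    return n.to_bytes((n.bit_length() + 7) // 8, "big")


def block_decode(s: str, alphabet: str = BASE45_ALPHABET,
                 nchars2nbytes: dict[int, int] = {2: 1, 3: 2},
                 reverse_blocks: bool = False) -> bytes:
    # Split into blocks of the maximum valid char size.
    block_nchars = max(nchars2nbytes)
    blocks = [s[i:i + block_nchars] for i in range(0, len(s), block_nchars)]
    # Only the final block may be shorter, and its size must be valid.
    if blocks and len(blocks[-1]) not in nchars2nbytes:
        raise ValueError("invalid size for final block")  # DecodingError stand-in
    decoded = []
    for block in blocks:
        nbytes = nchars2nbytes[len(block)]
        if reverse_blocks:
            block = block[::-1]
        raw = decode_block(block, alphabet)
        # The decoded block must fit in the designated byte size.
        if len(raw) > nbytes:
            raise ValueError("decoded block too long")
        # Prepend zero bytes up to the designated byte size.
        decoded.append(raw.rjust(nbytes, b"\x00"))
    return b"".join(decoded)


print(block_decode("BB8", reverse_blocks=True))  # b'AB'
```

A trailing block of a single char is rejected, matching the base45 example in which the final block may be 2 or 3 chars but not 1.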
BlockBaseEncoding
- class BlockBaseEncoding(encoding, *, case_sensitive=None, block_size, sep_char='', reverse_blocks=False)[source]
Bases: BaseEncoding
Block base encodings.
- Parameters:
alphabet (str, range or Alphabet) – the alphabet to use for the encoding
case_sensitive (bool or None, optional) – optional case sensitivity (if None, the one from the alphabet is used)
block_size (int or Mapping[int, int]) – allowed block size(s) for encoding/decoding
sep_char (str, optional) – an optional separator character for encoded string blocks (default: "")
reverse_blocks (bool, optional) – an optional flag to reverse individual char blocks in the encoded string (default: False)
- property block_encoding
The encoding used for individual blocks.
- Return type:
- canonical_bytes(b)[source]
Returns a canonical version of the bytestring b: this is the bytestring obtained by first encoding b and then decoding it.
(This method is overridden by subclasses with more efficient implementations.)
- Parameters:
b (BytesLike) – the bytestring
- Return type:
- canonical_string(s)[source]
Returns a canonical version of the string s: this is the string obtained by first decoding s and then encoding it.
(This method is overridden by subclasses with more efficient implementations.)
- property nbytes2nchars
Mapping of bytes block sizes to char block sizes.
- property nchars2nbytes
Mapping of char block sizes to byte block sizes.
- options(skip_defaults=False)[source]
The options used to construct this particular encoding.
Example usage:
>>> encoding.base32.options()
{'char_nbits': 'auto', 'pad_char': '=', 'padding': 'include'}
>>> encoding.base32.options(skip_defaults=True)
{'pad_char': '=', 'padding': 'include'}
- property reverse_blocks
Whether individual char blocks should be reversed when encoding, e.g. as done by the base45 spec.
- Return type:
- property sep_char
Optional block separation character. It is either the empty string, or a string of length 1.
- Return type:
BlockBaseEncodingSubclass
- BlockBaseEncodingSubclass = ~BlockBaseEncodingSubclass
Type variable for subclasses of BlockBaseEncoding.