bases.encoding.block
Block base encodings.
Splits the bytestring to encode (resp. string to decode) into blocks,
then encodes (resp. decodes) each block individually using an underlying encoding.
By default, the underlying encoding is a simple base encoding.
Constructor options:
block_size: Union[int, Mapping[int, int]]
see below
sep_char: str
an optional separator character for encoded string blocks (default: "")
reverse_blocks: bool
an optional flag to reverse individual char blocks in the encoded string (default: False)
The block_size option is mandatory and determines the allowed block sizes for encoding and decoding:
- if block_size is a strictly increasing mapping of positive integers to positive integers, its keys are taken to be the allowed block byte sizes and its values are taken to be the corresponding block char sizes.
- if block_size is an integer, all block byte sizes in range(1, block_size+1) are allowed, and the corresponding block char sizes are computed by:
char_size = int(math.floor(math.log(256**byte_size, base)))+1
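As a quick check of the formula above, the following sketch computes the char sizes for an integer block_size. The helper name char_sizes is hypothetical and not part of the library; base is assumed to be the number of characters in the alphabet.

```python
import math
from typing import Dict


def char_sizes(block_size: int, base: int) -> Dict[int, int]:
    """Map each byte size in range(1, block_size+1) to its char size.

    Hypothetical helper: it only restates the formula from the text.
    """
    return {
        byte_size: int(math.floor(math.log(256**byte_size, base))) + 1
        for byte_size in range(1, block_size + 1)
    }


# For a 45-character alphabet this gives the same sizes as the
# explicit mapping {1: 2, 2: 3} used for base45 on this page.
print(char_sizes(2, 45))  # {1: 2, 2: 3}
```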
The property nbytes2nchars has all valid block byte sizes as keys and the corresponding block char sizes as values.
The property nchars2nbytes has all valid block char sizes as keys and the corresponding block byte sizes as values.
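The relationship between the two properties can be illustrated with plain dictionaries. This is a minimal sketch, not the library's implementation; invert_block_sizes is a hypothetical name, and the strictly increasing check is an assumption based on the requirement stated for block_size mappings.

```python
from typing import Dict, Mapping


def invert_block_sizes(nbytes2nchars: Mapping[int, int]) -> Dict[int, int]:
    """Build the char-size -> byte-size mapping (hypothetical helper)."""
    items = sorted(nbytes2nchars.items())
    # A strictly increasing mapping has distinct values, so it can be inverted.
    for (_, prev_nchars), (_, nchars) in zip(items, items[1:]):
        if nchars <= prev_nchars:
            raise ValueError("block size mapping must be strictly increasing")
    return {nchars: nbytes for nbytes, nchars in items}


print(invert_block_sizes({1: 2, 2: 3}))  # {2: 1, 3: 2}
```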
Each pair of corresponding block byte and char sizes is checked to ensure that encoding and decoding are unambiguous, using the static methods max_block_nchars and max_block_nbytes from the zeropad base encoding implementation (cf. class ZeropadBaseEncoding).
The maximum valid block byte (resp. char) size is used on encoding (resp. decoding) for all blocks except at most the last one: if the number of bytes (resp. chars) in the last block is not valid, the bytestring (resp. string) is not valid overall.
As a concrete example, the following is the constructor for the base45 encoding:
base45 = BlockBaseEncoding(alphabet.base45, block_size={1: 2, 2: 3})
In this case, encoding uses blocks of 2 bytes, with the final block allowed to be 1 or 2 bytes. Decoding uses blocks of 3 chars, with the final block allowed to be 2 or 3 chars (but not 1 char). Because no encoding was explicitly specified, the encoding used is the simple encoding for the base45 alphabet.
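The block layout described above can be reproduced without the library. The sketch below splits a bytestring using the base45 sizes {1: 2, 2: 3}; split_blocks is a hypothetical helper, and ValueError stands in for the library's EncodingError.

```python
from typing import List, Mapping


def split_blocks(b: bytes, nbytes2nchars: Mapping[int, int]) -> List[bytes]:
    """Split b into blocks of the maximum valid byte size (hypothetical helper)."""
    block_nbytes = max(nbytes2nchars)
    blocks = [b[i:i + block_nbytes] for i in range(0, len(b), block_nbytes)]
    # Only the final block may be shorter, and its size must still be valid.
    if blocks and len(blocks[-1]) not in nbytes2nchars:
        raise ValueError(f"invalid final block size: {len(blocks[-1])} bytes")
    return blocks


sizes = {1: 2, 2: 3}
blocks = split_blocks(b"Hello!!", sizes)
print(blocks)                           # [b'He', b'll', b'o!', b'!']
print([sizes[len(x)] for x in blocks])  # [3, 3, 3, 2]
```

With these sizes, seven bytes become three 3-char blocks followed by one 2-char block, so the encoded string has 11 chars.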
Encoding of a bytestring b:
- split b into blocks of size block_nbytes, with the final block allowed to be any size in nbytes2nchars (raise EncodingError if it isn’t)
- encode each block individually using the block_encoding
- check that no encoded block string exceeds the block char size corresponding to the original block byte size
- prepend zero chars to each encoded block string until it reaches the designated block char size
- if reverse_blocks, reverse each individual char block
- join the blocks into the final encoded string (using the separator character sep_char, if specified)
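The encoding steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the library's code: encode_blocks and BASE45 are hypothetical names, the block encoding is assumed to read each block as a big-endian integer, and ValueError stands in for EncodingError.

```python
from typing import Mapping

# The 45-character base45 alphabet (assumed to match alphabet.base45).
BASE45 = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:"


def encode_blocks(b: bytes, alphabet: str, nbytes2nchars: Mapping[int, int],
                  sep_char: str = "", reverse_blocks: bool = False) -> str:
    base = len(alphabet)
    block_nbytes = max(nbytes2nchars)
    out = []
    for i in range(0, len(b), block_nbytes):
        block = b[i:i + block_nbytes]
        if len(block) not in nbytes2nchars:
            raise ValueError(f"invalid final block size: {len(block)} bytes")
        nchars = nbytes2nchars[len(block)]
        # Simple base encoding of the block, read as a big-endian integer.
        n = int.from_bytes(block, "big")
        chars = ""
        while n > 0:
            n, digit = divmod(n, base)
            chars = alphabet[digit] + chars
        if len(chars) > nchars:
            raise ValueError("encoded block exceeds its block char size")
        # Prepend zero chars up to the designated block char size.
        chars = chars.rjust(nchars, alphabet[0])
        out.append(chars[::-1] if reverse_blocks else chars)
    return sep_char.join(out)


sizes = {1: 2, 2: 3}
print(encode_blocks(b"AB", BASE45, sizes))                            # 8BB
print(encode_blocks(b"Hello!!", BASE45, sizes, reverse_blocks=True))  # %69 VD92EX0
```

With reverse_blocks=True the second result agrees with the "Hello!!" example in the base45 specification, which stores the digits of each block least significant first.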
Decoding of a string s:
- split s into blocks of size block_nchars, with the final block allowed to be any size in nchars2nbytes (raise DecodingError if it isn’t)
- if reverse_blocks, reverse each individual char block
- decode each block individually using the block_encoding
- check that no decoded block bytestring exceeds the block byte size corresponding to the original block char size
- prepend zero bytes to each decoded block bytestring until it reaches the designated block byte size
- join the blocks into the final decoded bytestring
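The decoding steps above can be sketched the same way. As before, decode_blocks and BASE45 are hypothetical names, the block encoding is assumed to be big-endian, ValueError stands in for DecodingError, and separator handling is left out because the steps above do not describe it.

```python
from typing import Mapping

# The 45-character base45 alphabet (assumed to match alphabet.base45).
BASE45 = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:"


def decode_blocks(s: str, alphabet: str, nchars2nbytes: Mapping[int, int],
                  reverse_blocks: bool = False) -> bytes:
    base = len(alphabet)
    block_nchars = max(nchars2nbytes)
    out = bytearray()
    for i in range(0, len(s), block_nchars):
        block = s[i:i + block_nchars]
        if len(block) not in nchars2nbytes:
            raise ValueError(f"invalid final block size: {len(block)} chars")
        nbytes = nchars2nbytes[len(block)]
        if reverse_blocks:
            block = block[::-1]
        # Simple base decoding; str.index raises ValueError for foreign chars.
        n = 0
        for c in block:
            n = n * base + alphabet.index(c)
        if n >= 256**nbytes:
            raise ValueError("decoded block exceeds its block byte size")
        # to_bytes pads with leading zero bytes up to the block byte size.
        out += n.to_bytes(nbytes, "big")
    return bytes(out)


print(decode_blocks("%69 VD92EX0", BASE45, {2: 1, 3: 2}, reverse_blocks=True))
# b'Hello!!'
```

A single trailing char is rejected here because 1 is not a key of the char-size mapping, matching the "2 or 3 chars (but not 1 char)" rule for base45.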
BlockBaseEncoding
- class BlockBaseEncoding(encoding, *, case_sensitive=None, block_size, sep_char='', reverse_blocks=False)
Bases:
BaseEncoding
Block base encodings.
- Parameters:
alphabet (str, range or Alphabet) – the alphabet to use for the encoding
case_sensitive (bool or None, optional) – optional case sensitivity (if None, the one from the alphabet is used)
block_size (int or Mapping[int, int]) – allowed block size(s) for encoding/decoding
sep_char (str, optional) – an optional separator character for encoded string blocks (default: "")
reverse_blocks (bool, optional) – an optional flag to reverse individual char blocks in the encoded string (default: False)
- property block_encoding
The encoding used for individual blocks.
- Return type:
- canonical_bytes(b)
Returns a canonical version of the bytestring b: this is the bytestring obtained by first encoding b and then decoding it.
(This method is overridden by subclasses with more efficient implementations.)
- Parameters:
b (BytesLike) – the bytestring
- Return type:
- canonical_string(s)
Returns a canonical version of the string s: this is the string obtained by first decoding s and then encoding it.
(This method is overridden by subclasses with more efficient implementations.)
- property nbytes2nchars
Mapping of bytes block sizes to char block sizes.
- property nchars2nbytes
Mapping of char block sizes to byte block sizes.
- options(skip_defaults=False)
The options used to construct this particular encoding.
Example usage:
>>> encoding.base32.options()
{'char_nbits': 'auto', 'pad_char': '=', 'padding': 'include'}
>>> encoding.base32.options(skip_defaults=True)
{'pad_char': '=', 'padding': 'include'}
- property reverse_blocks
Whether individual char blocks should be reversed when encoding, e.g. as done by the base45 spec.
- Return type:
- property sep_char
Optional block separation character. It is either the empty string, or a string of length 1.
- Return type:
BlockBaseEncodingSubclass
- BlockBaseEncodingSubclass = ~BlockBaseEncodingSubclass
Type variable for subclasses of BlockBaseEncoding.