The pyzstd module provides classes and functions for compressing and decompressing data, using Facebook's
`Zstandard <http://www.zstd.net>`_ (zstd for short) algorithm.
The API style is similar to Python's bz2/lzma/zlib modules.
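As a minimal sketch of that API style, the one-shot functions below follow the same ``compress()`` / ``decompress()`` call pattern as the stdlib modules. The fallback to ``lzma`` is an assumption made only so the snippet runs where pyzstd is not installed; it is not part of pyzstd.

```python
import lzma

try:
    import pyzstd as codec  # provides one-shot compress() / decompress()
except ImportError:
    codec = lzma  # stand-in (assumption): same one-shot call shape

data = b"pyzstd round-trip example " * 1000

# Compress the whole buffer in one call, then restore it.
packed = codec.compress(data)
restored = codec.decompress(packed)

print(f"{len(data)} bytes -> {len(packed)} bytes using {codec.__name__}")
```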
this note <https://pyzstd.readthedocs.io/en/latest/#build-pyzstd>_.
python -m pyzstd --help
0.15.4 (Feb 24, 2023)
v1.5.4 <https://github.com/facebook/zstd/releases/tag/v1.5.4>_. v1.5.3 is a non-public release.
``pyproject.toml`` build mechanism (PEP-517). Note that build options specified in the old way may be invalid, see
0.15.3 (Aug 3, 2022)
ZstdError object can't be pickled.
0.15.2 (Jan 22, 2022)
Update bundled zstd source code from v1.5.1 to
0.15.1 (Dec 25, 2021)
``finalize_dict()`` may use the wrong length for some buffer protocol objects, see
this issue <https://github.com/animalize/pyzstd/issues/4>_.
* Setting ``CParameter.nbWorkers`` to ``1`` now means "1-thread multi-threaded mode", rather than "single-threaded mode".
* If the underlying zstd library doesn't support multi-threaded compression, pyzstd no longer falls back to "single-threaded mode" automatically; it now raises a ``ZstdError`` exception.
this note <https://pyzstd.readthedocs.io/en/latest/#build-pyzstd>_.
0.15.0 (May 18, 2021)
0.14.4 (Mar 24, 2021)
0.14.3 (Mar 4, 2021)
Update bundled zstd source code from v1.4.8 to
0.14.2 (Feb 24, 2021)
0.14.1 (Dec 19, 2020)
* v1.4.6 is a non-public release for Linux kernel.
* v1.4.8 is a hotfix for `v1.4.7 <https://github.com/facebook/zstd/releases/tag/v1.4.7>`_.
0.13.0 (Nov 7, 2020)
``ZstdDecompressor`` class: it now has the same API and behavior as the BZ2Decompressor / LZMADecompressor classes in the Python standard library; it stops after a frame is decompressed.
``EndlessZstdDecompressor`` class, it accepts multiple concatenated frames. It is renamed from previous
``True`` when both the input and output streams are at a frame edge.
open(), consistent with Python standard library.
* ~9% faster when there is one frame and the decompressed size was recorded in the frame header.
* raises ZstdError when input **or** output data is not at a frame edge. Previously, it was raised only when output data was not at a frame edge.
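The "stops after a frame is decompressed" behavior of the decompressor class described in this release can be sketched as follows. The sketch assumes pyzstd's ``ZstdDecompressor`` exposes ``eof`` and ``unused_data`` like its stdlib counterparts; when pyzstd is not installed it falls back to ``LZMADecompressor``, which is only a stand-in with the same pattern.

```python
try:
    from pyzstd import ZstdDecompressor as Decompressor, compress
except ImportError:
    # Stand-in (assumption): the stdlib class shows the same pattern.
    from lzma import LZMADecompressor as Decompressor, compress

first = compress(b"first frame")
second = compress(b"second frame")

# Feed two concatenated frames to a single decompressor object.
d = Decompressor()
out = d.decompress(first + second)

# The object stops at the end of the first frame; the rest is left over.
leftover = d.unused_data

# A fresh object is needed to decode the next frame.
d2 = Decompressor()
out2 = d2.decompress(leftover)
```

For input that may contain several concatenated frames, the changelog points to ``EndlessZstdDecompressor`` instead of creating a new object per frame.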
0.12.5 (Oct 12, 2020)
No longer use
Argument Clinic <https://docs.python.org/3/howto/clinic.html>_, now supports Python 3.5+, previously 3.7+.
0.12.4 (Oct 7, 2020)
It seems the API is stable.
0.2.4 (Sep 2, 2020)
The first version uploaded to PyPI.
v1.4.5 <https://github.com/facebook/zstd/releases/tag/v1.4.5>_ source code.
Find project collaborator(s) to release new versions in my absence.
About 2~3 versions of
zstd are released every year, and a new version of
pyzstd needs to be released at this time.
I recently changed the status of the project from Beta to Stable. As I said in https://github.com/animalize/pyzstd/pull/3#issuecomment-825365829, there is basically no need for other maintenance work:
I used to spend time checking such details, and manually triggering exceptions to see if they can be handled correctly. Once the development of the pyzstd module is completed, almost no maintenance is needed. Basically just update the zstd source code, and use new APIs in major version updates.
Other precautions have been written in tech memo. If you are interested, I can explain more, such as what I have tried.
Please ensure that:
This module was originally written for Python stdlib:
And use a script to convert the code to this
After Oct-20-2020, all development was transferred to this module, and it no longer uses CPython's internal feature, Argument Clinic. It now uses only CPython's public API for C extensions.
In mid-March 2021, the code seemed stable, so a CFFI implementation was added.
After exploring some API/implementation changes, the conclusion was always that the current design is better. So in Jan 2023, the Development Status was changed from Beta to Stable. It has far surpassed its stdlib siblings.
zstd modules: https://github.com/animalize/pyzstd/discussions/19#discussioncomment-4702814
[Feature Request] Add zstd module to stdlib, on Python issue tracker. https://bugs.python.org/issue37095
A discussion about adding zstd to Python standard library, on Python-Ideas mail-list. https://mail.python.org/archives/list/[email protected]/thread/VQIFA7WTNRAOYZGTVP4WZC2CD36KYIVY/
Include zstd library source code, without any changes.
Zstd lib source code is in
zstd/ folder, if someone wants to upgrade/downgrade the bundled zstd lib, just replace this folder.
The code supports zstd v1.4.0+ (released in Apr 2019).
Only use zstd's "stable" API, don't use "experimental" API.
When statically linking to zstd lib, use
ZSTD_MULTITHREAD build macro (in
setup.py) for enabling multi-threaded compression. MT is enabled by default in zstd v1.5.0+,
pyzstd still defines it for zstd v1.4.x.
No more zstd macros are defined except this one.
See this note: https://pyzstd.readthedocs.io/en/latest/#build-pyzstd
The API is similar to Python's bz2/lzma/zlib modules.
Try to expose all major functionality provided by zstd's "stable" API.
🔴 If "skippable frame" is used more, related API may be added. (unlikely. It's not difficult to implement "skippable frame" functions at user side.)
ZSTD_c_stableInBuffer parameter is moved from "experimental" API to "stable" API, it can be used to speed up
.FLUSH_FRAME compression if (
.last_mode == .FLUSH_FRAME).
No plan to use
ZSTD_c_stableOutBuffer, because it raises an error when the output buffer is not enough.
ZSTD_getFrameHeader() function is moved from "experimental" API to "stable" API, more items can be added to
ZSTD_d_refMultipleDDicts parameter is moved from "experimental" API to "stable" API,
zstd_dict parameter may accept a tuple that contains multiple dictionaries.
(not very likely, few people use it, and it makes the API complex a bit. This functionality can be implemented via
get_frame_info() function and dispatching to different decompressors.)
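The dispatching idea mentioned above can be sketched roughly as below. To keep the snippet runnable without pyzstd, it reads the dictionary ID with a small hand-written parser based on the Zstandard frame format; that parser is a stand-in for ``get_frame_info()``, and ``read_dictionary_id``, ``pick_decoder``, and the ``decoders`` mapping are hypothetical names, not pyzstd API.

```python
import struct

ZSTD_FRAME_MAGIC = 0xFD2FB528  # stored little-endian at the start of a frame


def read_dictionary_id(frame: bytes) -> int:
    """Return the dictionary ID in a zstd frame header, or 0 if none is recorded."""
    if len(frame) < 5 or struct.unpack_from("<I", frame)[0] != ZSTD_FRAME_MAGIC:
        raise ValueError("not a zstd frame")
    descriptor = frame[4]
    single_segment = (descriptor >> 5) & 1
    # Dictionary_ID_flag (low two bits) selects the field width in bytes.
    did_size = (0, 1, 2, 4)[descriptor & 3]
    # The Window_Descriptor byte is present only when Single_Segment_flag is 0.
    offset = 5 + (0 if single_segment else 1)
    raw = frame[offset:offset + did_size]
    if len(raw) != did_size:
        raise ValueError("truncated frame header")
    return int.from_bytes(raw, "little")


def pick_decoder(frame: bytes, decoders: dict, default=None):
    """Choose a decoder (hypothetical objects) keyed by dictionary ID."""
    return decoders.get(read_dictionary_id(frame), default)


# Hand-built header bytes, used only to exercise the parser.
magic = struct.pack("<I", ZSTD_FRAME_MAGIC)
frame_a = magic + bytes([0x01, 0x00, 0x2A])              # 1-byte dictionary ID
frame_b = magic + bytes([0x22]) + (1000).to_bytes(2, "little")  # 2-byte ID
frame_c = magic + bytes([0x20, 0x05])                    # no dictionary ID

decoders = {42: "decoder-a", 1000: "decoder-b"}
```

In real use, each value in ``decoders`` would be a decompressor (or a callable) already configured with the matching dictionary.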
ZDICT_finalizeDictionary() support training dict (no custom dict), the first arg can be
finalize_dict(zstd_dict, samples, dict_size, level)
train_dict(samples, dict_size), it can specify level.
🟢~~Use multi-phase init when it matures, then pyzstd module can support CPython sub-interpreters.~~ (implemented in 0.15.4, support subclass well.)
Depends on the progress of CPython:
- Subinterpreters for Python, https://lwn.net/Articles/820424/
METH_METHOD flag mentioned in PEP 573 can be used with more flags, otherwise have to disable subclass for ZstdDict/Compressor/Decompressor. Maybe need to wait until at least 3.11.
PEP 489 -- Multi-phase extension module initialization
PEP 573 -- Module State Access from C Extension Methods
🟢 If the minimum version is 3.6:
- use f-string. Its performance is better than
% a bit. Currently string formatting is only used for exception message, so it's not a big problem.
#include "pythread.h" in
- try to add
-fvisibility=hidden compile option, it reduces ~12KiB .so size. see commit https://github.com/animalize/pyzstd/commit/ab21add1e8d9b93e90eb49e62811159846934178.
- remove this
__init__.py, and related code in unit-test:
from os import PathLike
# For Python 3.5
🟢 If the minimum version is 3.7:
- consider using
#define Py_UNREACHABLE() assert(0)
- remove this code in
if size < 0:
size = _32_KiB
🟢 If the minimum version is 3.8:
:= operator in ZstdFile, it's a bit faster.
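For illustration, this is the kind of read loop the ``:=`` operator (Python 3.8+) tightens. It is a generic sketch over any file-like object, not the actual ``ZstdFile`` code; ``read_chunks`` and the chunk size are assumptions.

```python
import io


def read_chunks(fileobj, chunk_size=32 * 1024):
    """Collect chunks until read() returns an empty bytes object."""
    chunks = []
    # Assignment and emptiness test happen in one expression.
    while chunk := fileobj.read(chunk_size):
        chunks.append(chunk)
    return chunks


source = io.BytesIO(b"a" * 70000)
sizes = [len(c) for c in read_chunks(source)]
```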
ZstdDict.__init__(self, dict_content, is_raw=False)
dict_content is a normal dictionary, and set
is_raw to True, the dictionary is NOT treated as raw dictionary.
Very rare cases. If it has a magic number, it's probably a normal dict.
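A rough sketch of the magic-number check referred to above, assuming the zstd dictionary magic number ``0xEC30A437`` stored little-endian at the start of the content. ``looks_like_normal_dict`` is a hypothetical helper, and whether the library performs exactly this test internally is not stated here.

```python
# zstd dictionary magic number, as it appears on disk (little-endian).
ZSTD_DICT_MAGIC = (0xEC30A437).to_bytes(4, "little")


def looks_like_normal_dict(dict_content) -> bool:
    """True if the content starts with the zstd dictionary magic number."""
    return bytes(dict_content[:4]) == ZSTD_DICT_MAGIC


normal = b"\x37\xa4\x30\xec" + b"rest of a trained dictionary"
raw = b"arbitrary bytes used as raw dictionary content"
```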
🟡 ~~When dynamically linking to zstd lib,
compressionLevel_values.default may be wrong, it uses the value of
ZSTD_CLEVEL_DEFAULT macro from
Very rare cases. Very few people modify
ZSTD_CLEVEL_DEFAULT when building zstd lib.
zstd_version >= 1.5 and pyzstd_version >= 0.15