2. Unicode Versions
2.1. unicodedata2 and uniseg
unicodedata2 [1] is a backport of the standard unicodedata [2] library. It provides the same functionality as unicodedata, but is based on the (almost) latest Unicode version on every Python version.
uniseg uses some unicodedata functions (category(), east_asian_width(), etc.) in its internal processing. In the current release, these functions appear to return the same results across all the supported Python versions, so text segmentation works correctly even though the algorithm runs against a different version of the Unicode data. This compatibility is not guaranteed in future releases of the module, though. It is good practice to install a unicodedata2 release that supports the same Unicode version as uniseg does.
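
As an illustration of the two unicodedata functions named above, the following minimal sketch prints the general category and East Asian width of a few characters. The helper name describe() and the sample characters are chosen here for illustration only; they are not part of uniseg.

```python
import unicodedata


def describe(ch):
    """Return (general category, East Asian width) for one character."""
    return unicodedata.category(ch), unicodedata.east_asian_width(ch)


# Sample characters picked for illustration: a Latin letter, a Hiragana
# letter, a combining mark, and a space.
for ch in ["A", "あ", "\u0301", " "]:
    category, width = describe(ch)
    print(f"U+{ord(ch):04X} category={category} east_asian_width={width}")
```

Running the same loop under different Python versions (or with unicodedata2 in place of unicodedata) is one way to spot characters whose properties changed between Unicode versions.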
uniseg uses unicodedata2 instead of the built-in unicodedata module when unicodedata2 is found on the system. Note that it does not check whether the Unicode versions of the two match.
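
The fallback described above follows a common import pattern. The sketch below shows that general pattern under the assumption that unicodedata2 exposes the same interface as the standard module; it is not taken from uniseg's source, and the alias ucd is a name chosen here.

```python
try:
    # Third-party backport; used only when it is installed.
    import unicodedata2 as ucd
except ImportError:
    # Standard library module, always available.
    import unicodedata as ucd

# Show which module was picked and which Unicode version it carries.
print(ucd.__name__, ucd.unidata_version)
```

As in the text, nothing in this pattern compares Unicode versions; it only selects whichever module can be imported.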
You can see which Unicode version uniseg supports by checking uniseg.unidata_version.
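
Because the version check is left to the user, a small script can do the comparison. This is a sketch only: load_unidata_version() and versions_match() are hypothetical helpers written for this example, and it assumes unicodedata2 exposes unidata_version in the same way as the standard module.

```python
import importlib
import unicodedata


def load_unidata_version(module_name):
    """Return the module's unidata_version, or None if unavailable."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "unidata_version", None)


def versions_match(a, b):
    """Compare two dotted version strings such as '15.1.0' numerically."""
    return tuple(int(p) for p in a.split(".")) == tuple(int(p) for p in b.split("."))


uniseg_version = load_unidata_version("uniseg")
# Prefer the backport's version when it is installed, mirroring the
# module preference described in the text.
data_version = load_unidata_version("unicodedata2") or unicodedata.unidata_version

if uniseg_version is None:
    print("uniseg is not installed (or exposes no unidata_version)")
elif versions_match(uniseg_version, data_version):
    print(f"OK: both use Unicode {data_version}")
else:
    print(f"Mismatch: uniseg targets {uniseg_version}, character data is {data_version}")
```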
2.2. Python Versions and unicodedata.unidata_version
Python | unicodedata.unidata_version |
---|---|
3.13 | “15.1.0” |
3.12 | “15.0.0” |
3.11 | “14.0.0” |
3.10 | “13.0.0” |
3.9 | “13.0.0” |
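
To see where the running interpreter falls in this table, the value can be read directly from the standard library. In the sketch below, the TABLE dictionary is copied from the table above, and the name TABLE itself is chosen for this example.

```python
import sys
import unicodedata

# Values copied from the table above: (major, minor) -> unidata_version.
TABLE = {
    (3, 13): "15.1.0",
    (3, 12): "15.0.0",
    (3, 11): "14.0.0",
    (3, 10): "13.0.0",
    (3, 9): "13.0.0",
}

python_version = sys.version_info[:2]
actual = unicodedata.unidata_version
listed = TABLE.get(python_version)

print(f"Python {python_version[0]}.{python_version[1]}: unidata_version = {actual}")
if listed is None:
    print("This Python version is not listed in the table.")
else:
    print(f"Table value: {listed}")
```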