pycrossword
0.4
Pure-Python implementation of a crossword puzzle generator and editor
Main interface to handle downloads and imports of Hunspell dictionaries as SQLite databases.
Public Member Functions
def __init__(self, settings, dbmanager=None, dicfolder=DICFOLDER)
def pool_running(self)
    Checks if there are tasks running in the pool.
def pool_threadcount(self)
    Gets the number of active threads (tasks) in the pool.
def pool_wait(self)
    Waits for all the tasks in the pool to complete.
def get_installed_info(self, lang)
    Gets the information about an existing SQLite database: full path and number of words.
def list_hunspell(self, stopcheck=None)
    Retrieves the list of Hunspell dictionaries available for download from the public GitHub repo.
def list_all_dics(self, stopcheck=None)
    Retrieves the information for all available Hunspell dictionaries.
def download_hunspell(self, url, lang, overwrite=True, on_stopcheck=None, on_start=None, on_getfilesize=None, on_progress=None, on_complete=None, on_error=None, wait=False)
    Downloads a single Hunspell dictionary (*.dic file) and stores it locally.
def download_hunspell_all(self, dics, on_stopcheck=None, on_start=None, on_getfilesize=None, on_progress=None, on_complete=None, on_error=None)
    Downloads all the Hunspell dictionaries specified by the user.
def add_from_hunspell(self, lang, posrules=None, posrules_strict=False, posdelim='/', lcase=True, replacements=None, remove_hyphens=True, filter_out=None, rows=None, commit_each=1000, on_checkstop=None, on_start=None, on_word=None, on_commit=None, on_finish=None, on_error=None, wait=False)
    Imports a Hunspell-formatted dictionary file into the DB.
def add_all_from_hunspell(self, dics, posrules=None, posrules_strict=True, posdelim='/', lcase=True, replacements=None, remove_hyphens=True, filter_out=None, rows=None, commit_each=1000, on_stopcheck=None, on_start=None, on_word=None, on_commit=None, on_finish=None, on_error=None)
    Imports all Hunspell dictionaries specified by the user.
Public Attributes
settings
    dict pointer to the app global settings (utils::guisettings::CWSettings::settings)
db
    Sqlitedb | None DB object
dicfolder
    str root path of the dictionaries, default = utils::globalvars::DICFOLDER
pool
    QtCore.QThreadPool thread pool to run tasks
timeout_
    int timeout for HTTP(S) requests (in milliseconds)
proxies_
    dict HTTP(S) proxy server settings
Main interface to handle downloads and imports of Hunspell dictionaries as SQLite databases.
Can start download and import tasks either synchronously (start and wait for completion) or asynchronously (in a thread pool).
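The difference between the two modes can be illustrated with a small standard-library analogy. This is only a sketch: it uses concurrent.futures instead of the QtCore.QThreadPool the class actually holds, and fake_task and run_task are hypothetical names that are not part of pycross.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def fake_task(name, delay=0.01):
    """Hypothetical stand-in for a download or import task."""
    time.sleep(delay)
    return f"{name} done"


def run_task(pool, name, wait=False):
    """Submit a task; block for its result only when wait is True."""
    future = pool.submit(fake_task, name)
    if wait:
        return future.result()  # synchronous: start and wait for completion
    return future               # asynchronous: the caller waits or polls later


with ThreadPoolExecutor(max_workers=2) as pool:
    sync_result = run_task(pool, "en", wait=True)
    pending = run_task(pool, "de", wait=False)
    # Blocking on the pending task is comparable in spirit to pool_wait().
    async_result = pending.result()
```

In the class itself, the wait parameter of download_hunspell() and add_from_hunspell() selects between the two modes for a single task.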
def pycross.dbapi.HunspellImport.__init__(self, settings, dbmanager=None, dicfolder=DICFOLDER)
settings | dict pointer to the app global settings (utils::guisettings::CWSettings::settings ) |
dbmanager | Sqlitedb | None DB object (None to create a new one) |
dicfolder | str root path of the dictionaries, default = utils::globalvars::DICFOLDER |
def pycross.dbapi.HunspellImport.add_all_from_hunspell(self, dics, posrules=None, posrules_strict=True, posdelim='/', lcase=True, replacements=None, remove_hyphens=True, filter_out=None, rows=None, commit_each=1000, on_stopcheck=None, on_start=None, on_word=None, on_commit=None, on_finish=None, on_error=None)
Imports all Hunspell dictionaries specified by the user.
The import tasks are started asynchronously in the thread pool, each task using HunspelImportTask::signals to signal its status and check for interruption requests.
dics | list list of dict objects, each representing a single Hunspell dictionary: its URL, language, etc. See list_hunspell() for the dict structure. |
See add_from_hunspell() for the other parameters.
def pycross.dbapi.HunspellImport.add_from_hunspell(self, lang, posrules=None, posrules_strict=False, posdelim='/', lcase=True, replacements=None, remove_hyphens=True, filter_out=None, rows=None, commit_each=1000, on_checkstop=None, on_start=None, on_word=None, on_commit=None, on_finish=None, on_error=None, wait=False)
Imports a Hunspell-formatted dictionary file into the DB.
lang | str short name of the imported dictionary language, e.g. 'en', 'de' etc. |
posrules | dict part-of-speech regular expression parsing rules in the format: {'N': 'regex for nouns', 'V': 'regex for verbs', ...}
Possible keys are: 'N' [noun], 'V' [verb], 'ADV' [adverb], 'ADJ' [adjective], 'P' [participle], 'PRON' [pronoun], 'I' [interjection], 'C' [conjunction], 'PREP' [preposition], 'PROP' [proposition], 'MISC' [miscellaneous / other], 'NONE' [no POS] |
posrules_strict | bool if True, only the parts of speech present in the posrules dict will be imported [all other words will be skipped]. If False, such words will be imported with 'MISC' and 'NONE' POS markers. Note that the signature of this method defaults to False, whereas add_all_from_hunspell() defaults to True. |
posdelim | str delimiter separating the word from its part of speech [default = '/'] |
lcase | bool if True (default), found words will be imported in lower case; otherwise, the original case will remain |
replacements | dict character replacement rules in the format: {'char_from': 'char_to', ...}, or None (no replacements) |
remove_hyphens | bool if True (default), all hyphens ['-'] will be removed from the words |
filter_out | dict regex-based rules to filter out [exclude] words in the format: {'word': ['regex1', 'regex2', ...], 'pos': ['regex1', 'regex2', ...]}, or None (no filter rules apply) |
rows | 2-tuple | None the start and end rows (indices) of the words to import; e.g. (20, 100) means start import from row 20 and end import after row 100. If the second element in the tuple is negative (e.g. -1), only the start row will be considered and the import will go on till the last word in the source DIC file. None means ALL available words. |
commit_each | int threshold of insert operations after which the transaction will be committed (default = 1000) |
on_checkstop | callback callback function called periodically to check for interrupt condition; takes 3 parameters:
|
on_start | callback Qt slot (callback) for HunspellImportSignals::sigStart |
on_word | callback Qt slot (callback) for HunspellImportSignals::sigWordWritten |
on_commit | callback Qt slot (callback) for HunspellImportSignals::sigCommit |
on_finish | callback Qt slot (callback) for HunspellImportSignals::sigComplete |
on_error | callback Qt slot (callback) for HunspellImportSignals::sigError |
wait | bool True to wait for the task to complete; False to start the task asynchronously (without waiting for the result) |
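The way the text-processing parameters above could interact is sketched below in plain Python. This is not the library's implementation. All function names (normalize_word, select_rows, classify, is_filtered, parse_lines) are hypothetical, the sample lines are invented, and two points are assumptions: rows are treated as 0-based inclusive indices, and an unmatched word gets 'MISC' when it carries a POS part and 'NONE' when it has none.

```python
import re


def normalize_word(word, lcase=True, replacements=None, remove_hyphens=True):
    """Apply the lcase, replacements and remove_hyphens options to one word."""
    if lcase:
        word = word.lower()
    for char_from, char_to in (replacements or {}).items():
        word = word.replace(char_from, char_to)
    if remove_hyphens:
        word = word.replace('-', '')
    return word


def select_rows(lines, rows=None):
    """Pick the (start, end) row range; a negative end means 'to the last row'."""
    if rows is None:
        return list(lines)
    start, end = rows
    if end < 0:
        return list(lines[start:])
    return list(lines[start:end + 1])  # assumption: the end row is included


def classify(flags, posrules=None, posrules_strict=False):
    """Return a POS key, or None when the word should be skipped."""
    for pos, pattern in (posrules or {}).items():
        if re.search(pattern, flags):
            return pos
    if posrules_strict:
        return None
    return 'MISC' if flags else 'NONE'  # assumption, see the lead-in


def is_filtered(word, flags, filter_out=None):
    """True if any 'word' or 'pos' regex in filter_out matches."""
    rules = filter_out or {}
    return (any(re.search(rx, word) for rx in rules.get('word', []))
            or any(re.search(rx, flags) for rx in rules.get('pos', [])))


def parse_lines(lines, posrules=None, posrules_strict=False, posdelim='/',
                lcase=True, replacements=None, remove_hyphens=True,
                filter_out=None, rows=None):
    """Turn raw DIC-style lines into (word, pos) pairs."""
    result = []
    for line in select_rows(lines, rows):
        word, _, flags = line.strip().partition(posdelim)
        if not word or is_filtered(word, flags, filter_out):
            continue
        pos = classify(flags, posrules, posrules_strict)
        if pos is None:
            continue
        word = normalize_word(word, lcase, replacements, remove_hyphens)
        result.append((word, pos))
    return result


# Invented sample lines in a simplified "word/flags" layout.
sample = ["Well-Being/N", "run/V", "oops", "x1/N", "quick/A"]
rules = {'N': r'^N$', 'V': r'^V$'}
parsed = parse_lines(sample, posrules=rules, filter_out={'word': [r'\d']})
strict = parse_lines(sample, posrules=rules, posrules_strict=True)
```

With posrules_strict=True the unmatched entries are dropped entirely rather than tagged, which is the main practical difference between the two defaults noted above.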
def pycross.dbapi.HunspellImport.download_hunspell(self, url, lang, overwrite=True, on_stopcheck=None, on_start=None, on_getfilesize=None, on_progress=None, on_complete=None, on_error=None, wait=False)
Downloads a single Hunspell dictionary (*.dic file) and stores it locally.
url | str URL of the DIC file to download (generally, https://raw.githubusercontent.com/wooorm/dictionaries/main/dictionaries/<LANG>/index.dic) |
lang | str short name of the language, e.g. 'en' |
overwrite | bool whether to overwrite the existing file (if any) |
on_stopcheck | callback callback function called periodically to check for interrupt condition; takes 4 parameters:
|
on_start | callback Qt slot (callback) for HunspellDownloadSignals::sigStart |
on_getfilesize | callback Qt slot (callback) for HunspellDownloadSignals::sigGetFilesize |
on_progress | callback Qt slot (callback) for HunspellDownloadSignals::sigProgress |
on_complete | callback Qt slot (callback) for HunspellDownloadSignals::sigComplete |
on_error | callback Qt slot (callback) for HunspellDownloadSignals::sigError |
wait | bool True to wait for the task to complete; False to start the task asynchronously (without waiting for the result) |
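As a small illustration of the url and lang parameters, the helper below fills the URL pattern quoted above for a given language code and collects the keyword arguments for a call. The names DIC_URL_TEMPLATE, dic_url and download_kwargs are hypothetical and not part of pycross; whether every language code in the repository follows this exact pattern is an assumption.

```python
from urllib.parse import quote

# URL pattern taken from the description of the url parameter.
DIC_URL_TEMPLATE = (
    "https://raw.githubusercontent.com/wooorm/dictionaries/"
    "main/dictionaries/{lang}/index.dic"
)


def dic_url(lang):
    """Build the DIC file URL for a short language name such as 'en'."""
    return DIC_URL_TEMPLATE.format(lang=quote(lang, safe=''))


def download_kwargs(lang, overwrite=True, wait=False):
    """Collect keyword arguments matching the download_hunspell() signature."""
    return {
        "url": dic_url(lang),
        "lang": lang,
        "overwrite": overwrite,
        "wait": wait,
    }


kwargs = download_kwargs("en", wait=True)
```

Given an existing HunspellImport instance (here called importer, an assumed variable), the call would then read importer.download_hunspell(**kwargs). In practice the URLs returned by list_hunspell() in the 'dic_url' key are the safer source.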
def pycross.dbapi.HunspellImport.download_hunspell_all(self, dics, on_stopcheck=None, on_start=None, on_getfilesize=None, on_progress=None, on_complete=None, on_error=None)
Downloads all the Hunspell dictionaries specified by the user.
The download tasks are started asynchronously in the thread pool, each task using HunspellDownloadTask::signals to signal its status and check for interruption requests.
dics | list list of dict objects, each representing a single Hunspell dictionary: its URL, language, etc. See list_hunspell() for the dict structure. |
See download_hunspell() for the other parameters.
def pycross.dbapi.HunspellImport.get_installed_info(self, lang)
Gets the information about an existing SQLite database: full path and number of words.
lang | str short name of the language, e.g. 'en' |
Returns | dict info about the database (full path and number of words) |
def pycross.dbapi.HunspellImport.list_all_dics(self, stopcheck=None)
Retrieves the information for all available Hunspell dictionaries.
Does everything that list_hunspell() does, but adds DB information (number of entries and path to the DB file) to each dictionary in the list.
stopcheck | callback callback that returns True to stop the operation or False to continue (takes no parameters) |
Returns | list list of dictionaries representing language-specific dictionary info:
* 'dic_url': URL of the dictionary file
* 'lang': short language name, e.g. 'en' / 'ru' / 'it'
* 'lang_full': full language name, e.g. 'Russian', 'English (US)'
* 'license': name of applicable license, e.g. 'GPL-3.0' / 'MIT and BSD'
* 'license_url': URL of applicable license file
* 'entries': number of entries in the existing DB (0 if no DB exists or is empty)
* 'path': full path to the existing DB file (empty string if no DB exists)
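A returned list of this shape can be split into dictionaries that already have a populated database and those that do not. The sketch below uses only the 'entries' and 'path' keys described above. The function name split_by_installed and the sample records (including the path and entry count) are invented for illustration.

```python
def split_by_installed(dics):
    """Separate entries with a non-empty existing DB from the rest."""
    installed, missing = [], []
    for dic in dics:
        # 'entries' is 0 and 'path' is '' when no DB exists.
        if dic.get('entries', 0) > 0 and dic.get('path'):
            installed.append(dic)
        else:
            missing.append(dic)
    return installed, missing


# Invented sample records using the documented keys.
sample_dics = [
    {'lang': 'en', 'lang_full': 'English (US)', 'entries': 1200,
     'path': '/tmp/en.db'},
    {'lang': 'ru', 'lang_full': 'Russian', 'entries': 0, 'path': ''},
]
installed, missing = split_by_installed(sample_dics)
installed_langs = [d['lang'] for d in installed]
missing_langs = [d['lang'] for d in missing]
```

The missing list is in the same format as the input, so it could be passed on as the dics argument of download_hunspell_all().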
def pycross.dbapi.HunspellImport.list_hunspell(self, stopcheck=None)
Retrieves the list of Hunspell dictionaries available for download from the public GitHub repo.
stopcheck | callback callback that returns True to stop the operation or False to continue (takes no parameters) |
Returns | list list of dictionaries representing language-specific dictionary info:
* 'dic_url': URL of the dictionary file
* 'lang': short language name, e.g. 'en' / 'ru' / 'it'
* 'lang_full': full language name, e.g. 'Russian', 'English (US)'
* 'license': name of applicable license, e.g. 'GPL-3.0' / 'MIT and BSD'
* 'license_url': URL of applicable license file
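The stopcheck parameter expects a callable that takes no parameters and returns True to stop. One way to build such a callback is to wrap a threading.Event, as sketched below. The helper make_stopcheck and the loop list_until_stopped are hypothetical; the loop only mimics how a long-running listing might poll the callback and is not the library's code.

```python
import threading


def make_stopcheck(event):
    """Return a no-argument callback that reports True once event is set."""
    def stopcheck():
        return event.is_set()
    return stopcheck


def list_until_stopped(items, stopcheck=None):
    """Hypothetical polling loop: collect items until stopcheck() is True."""
    collected = []
    for item in items:
        if stopcheck is not None and stopcheck():
            break
        collected.append(item)
    return collected


cancel = threading.Event()
stopcheck = make_stopcheck(cancel)
before = list_until_stopped(['en', 'de', 'it'], stopcheck)
cancel.set()  # e.g. triggered from a Cancel button in another thread
after = list_until_stopped(['en', 'de', 'it'], stopcheck)
```

A callback built this way could be passed as stopcheck to list_hunspell() or list_all_dics(), with cancel.set() called from the GUI thread to request interruption.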
def pycross.dbapi.HunspellImport.pool_running(self)
Checks if there are tasks running in the pool.
Returns | bool True if there are active tasks, False if none |
def pycross.dbapi.HunspellImport.pool_threadcount(self)
Gets the number of active threads (tasks) in the pool.
Returns | int number of active tasks |
def pycross.dbapi.HunspellImport.pool_wait(self)
Waits for all the tasks in the pool to complete.
pycross.dbapi.HunspellImport.db
Sqlitedb | None DB object
pycross.dbapi.HunspellImport.dicfolder
str root path of the dictionaries, default = utils::globalvars::DICFOLDER
pycross.dbapi.HunspellImport.pool
QtCore.QThreadPool thread pool to run tasks
pycross.dbapi.HunspellImport.proxies_
dict HTTP(S) proxy server settings
pycross.dbapi.HunspellImport.settings
dict pointer to the app global settings (utils::guisettings::CWSettings::settings)
pycross.dbapi.HunspellImport.timeout_
int timeout for HTTP(S) requests (in milliseconds)