lodstorage package

Submodules

lodstorage.csv module

class lodstorage.csv.CSV(name)[source]

Bases: LOD

helper for converting data in csv format to list of dicts (LoD) and vice versa

static fixTypes(lod: list)[source]

fixes the types of the given LoD.

static fromCSV(csvString: str, fields: list | None = None, delimiter=',', quoting=2, **kwargs)[source]

convert given csv string to list of dicts (LOD)

Args:

csvString(str): csv string that should be converted to a LoD
fields(list): names of the headers that should be used; if None it is assumed that the header is given

Returns:

list of dicts (LoD) containing the content of the given csv string
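The behavior of fromCSV can be approximated with the standard library's csv.DictReader — a minimal sketch of the conversion, not the library's actual implementation (the helper name csv_to_lod is hypothetical):

```python
import csv
import io

def csv_to_lod(csv_string: str, delimiter: str = ",") -> list:
    """Convert a csv string to a list of dicts (LoD),
    assuming the first line contains the header names."""
    reader = csv.DictReader(io.StringIO(csv_string), delimiter=delimiter)
    return [dict(row) for row in reader]

lod = csv_to_lod("name,age\nAlice,42\nBob,7\n")
# note: all values come back as strings; a step like CSV.fixTypes
# would be needed to restore int/float/bool types
```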

static readFile(filename: str) str[source]

Reads the given filename and returns its content as a string

Args:

filename: Name of the file that should be returned as string

Returns:

Content of the file as string

static restoreFromCSVFile(filePath: str, headerNames: list | None = None, withPostfix: bool = False)[source]

restore LOD from given csv file

Args:

filePath(str): file name
headerNames(list): Names of the headers that should be used. If None it is assumed that the header is given.
withPostfix(bool): If False the file type is appended to given filePath. Otherwise file type MUST be given with filePath.

Returns:

list of dicts (LoD) containing the content of the given csv file

static storeToCSVFile(lod: list, filePath: str, withPostfix: bool = False)[source]

converts the given lod to a CSV file.

Args:

lod(list): lod that should be converted to a csv file
filePath(str): file name the csv should be stored to
withPostfix(bool): If False the file type is appended to given filePath. Otherwise file type MUST be given with filePath.

Returns:

csv string of the given lod

static toCSV(lod: list, includeFields: list | None = None, excludeFields: list | None = None, delimiter=',', quoting=2, **kwargs)[source]

converts the given lod to CSV string. For details about the csv dialect parameters see https://docs.python.org/3/library/csv.html#csv-fmt-params

Args:

lod(list): lod that should be converted to csv string
includeFields(list): list of fields that should be included in the csv (positive list)
excludeFields(list): list of fields that should be excluded from the csv (negative list)
kwargs: csv dialect parameters

Returns:

csv string of the given lod
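The include/exclude field handling of toCSV can likewise be sketched with csv.DictWriter from the standard library — an illustrative approximation (lod_to_csv is a hypothetical name), not the documented method itself:

```python
import csv
import io

def lod_to_csv(lod: list, include_fields=None, exclude_fields=None,
               delimiter: str = ",") -> str:
    """Convert a list of dicts to a csv string, optionally limiting the
    output to include_fields (positive list) or removing
    exclude_fields (negative list)."""
    fields = include_fields or list(lod[0].keys())
    if exclude_fields:
        fields = [f for f in fields if f not in exclude_fields]
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fields, extrasaction="ignore",
                            delimiter=delimiter, lineterminator="\r\n")
    writer.writeheader()
    writer.writerows(lod)
    return buf.getvalue()

csv_str = lod_to_csv([{"name": "Alice", "age": 42}], exclude_fields=["age"])
```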

static writeFile(content: str, filename: str) str[source]

Write the given string to the given filename

Args:

content(str): string that should be written into the file
filename(str): name of the file the given string should be written to

Returns:

Nothing

lodstorage.docstring_parser module

Created on 2024-01-21

@author: wf

class lodstorage.docstring_parser.DocstringParser[source]

Bases: object

A Python docstring parser.

parse(docstring: str)[source]

Parse the given docstring.

lodstorage.entity module

Created on 2020-08-19

@author: wf

class lodstorage.entity.EntityManager(name, entityName, entityPluralName: str, listName: str | None = None, clazz=None, tableName: str | None = None, primaryKey: str | None = None, config=None, handleInvalidListTypes=False, filterInvalidListTypes=False, listSeparator='⇹', debug=False)[source]

Bases: YamlAbleMixin, JsonPickleMixin, JSONAbleList

generic entity manager

fromCache(force: bool = False, getListOfDicts=None, append=False, sampleRecordCount=-1)[source]

get my entries from the cache or from the callback provided

Args:

force(bool): force ignoring the cache
getListOfDicts(callable): a function to call for getting the data
append(bool): True if records should be appended
sampleRecordCount(int): the number of records to analyze for type information

Returns:

the list of Dicts and as a side effect setting self.cacheFile

fromStore(cacheFile=None, setList: bool = True) list[source]

restore me from the store

Args:

cacheFile(str): the cacheFile to use; if None use the pre-configured cacheFile
setList(bool): if True set my list with the data from the cache file

Returns:

list: list of dicts or JSON entitymanager

getCacheFile(config=None, mode=StoreMode.SQL)[source]

get the cache file for this entity manager

Args:

config(StorageConfig): if None get the cache for my mode
mode(StoreMode): the storeMode to use

getLoD()[source]

Return the LoD of the entities in the list

Return:

list: a list of Dicts

getSQLDB(cacheFile)[source]

get the SQL database for the given cacheFile

Args:

cacheFile(string): the file to get the SQL db from

initSQLDB(sqldb, listOfDicts=None, withCreate: bool = True, withDrop: bool = True, sampleRecordCount=-1)[source]

initialize my sql DB

Args:

listOfDicts(list): the list of dicts to analyze for type information
withCreate(boolean): true if the create Table command should be executed - false if only the entityInfo should be returned
withDrop(boolean): true if the existing Table should be dropped
sampleRecordCount(int): the number of records to analyze for type information

Return:

EntityInfo: the entity information such as CREATE Table command

isCached()[source]

check whether there is a file containing cached data for me

removeCacheFile()[source]

remove my cache file

setNone(record, fields)[source]

make sure the given fields in the given record are set to None

Args:

record(dict): the record to work on
fields(list): the list of fields to set to None

showProgress(msg)[source]

display a progress message

Args:

msg(string): the message to display

store(limit=10000000, batchSize=250, append=False, fixNone=True, sampleRecordCount=-1, replace: bool = False) str[source]

store my list of dicts

Args:

limit(int): maximum number of records to store per batch
batchSize(int): size of batch for storing
append(bool): True if records should be appended
fixNone(bool): if True make sure the dicts are filled with None references for each record
sampleRecordCount(int): the number of records to analyze for type information
replace(bool): if True allow replace for insert

Return:

str: The cachefile being used

storeLoD(listOfDicts, limit=10000000, batchSize=250, cacheFile=None, append=False, fixNone=True, sampleRecordCount=1, replace: bool = False) str[source]

store my entities

Args:

listOfDicts(list): the list of dicts to store
limit(int): maximum number of records to store
batchSize(int): size of batch for storing
cacheFile(string): the name of the storage e.g. path to JSON or sqlite3 file
append(bool): True if records should be appended
fixNone(bool): if True make sure the dicts are filled with None references for each record
sampleRecordCount(int): the number of records to analyze for type information
replace(bool): if True allow replace for insert

Return:

str: The cachefile being used

storeMode()[source]

return my store mode

lodstorage.jsonable module

This module has a class JSONAble for serialization of tables/list of dicts to and from JSON encoding

Created on 2020-09-03

@author: wf

class lodstorage.jsonable.JSONAble[source]

Bases: object

mixin to allow classes to be JSON serializable

asJSON(asString=True, data=None)[source]

recursively return my dict elements

Args:

asString(boolean): if True return my result as a string

checkExtension(jsonFile: str, extension: str = '.json') str[source]

make sure the jsonFile has the given extension e.g. “.json”

Args:

jsonFile(str): the jsonFile name - potentially without “.json” suffix

Returns:

str: the jsonFile name with “.json” as an extension guaranteed
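The documented contract of checkExtension can be sketched in a few lines — a minimal stdlib-only approximation (check_extension is a hypothetical name):

```python
def check_extension(json_file: str, extension: str = ".json") -> str:
    """make sure the given file name ends with the given extension,
    appending it only when it is missing"""
    if not json_file.endswith(extension):
        json_file += extension
    return json_file

name = check_extension("royals")
```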

fromDict(data: dict)[source]

initialize me from the given data

Args:

data(dict): the dictionary to initialize me from

fromJson(jsonStr)[source]

initialize me from the given JSON string

Args:

jsonStr(str): the JSON string

getJSONValue(v)[source]

get the value of the given v as JSON

Args:

v(object): the value to get

Returns:

the value, making sure objects are returned as dicts

getJsonTypeSamples()[source]

does my class provide a “getSamples” method?

static getJsonTypeSamplesForClass(cls)[source]

return the type samples for the given class

Return:

list: a list of dict that specify the types by example

classmethod getPluralname()[source]
static readJsonFromFile(jsonFilePath)[source]

read json string from the given jsonFilePath

Args:

jsonFilePath(string): the path of the file where to read the result from

Returns:

the JSON string read from the file

reprDict(srcDict)[source]

get the given srcDict as new dict with fields being converted with getJSONValue

Args:

srcDict(dict): the source dictionary

Returns:

dict: the converted dictionary

restoreFromJsonFile(jsonFile: str)[source]

restore me from the given jsonFile

Args:

jsonFile(string): the jsonFile to restore me from

static singleQuoteToDoubleQuote(singleQuoted, useRegex=False)[source]

convert a single quoted string to a double quoted one

Args:

singleQuoted (str): a single quoted string e.g.

{‘cities’: [{‘name’: “Upper Hell’s Gate”}]}

useRegex (boolean): True if a regular expression shall be used for matching

Returns:

string: the double quoted version of the string

Note:

see - https://stackoverflow.com/questions/55600788/python-replace-single-quotes-with-double-quotes-but-leave-ones-within-double-q
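With useRegex=True the conversion can be approximated with the pattern documented below as JSONAbleSettings.singleQuoteRegex — a sketch of the regex variant only, which (as the linked discussion notes) does not protect apostrophes inside double-quoted values; the bracket-loop variant exists for that case:

```python
import re

# same pattern as JSONAbleSettings.singleQuoteRegex: a single quote
# that is not preceded by a backslash
single_quote_regex = re.compile(r"(?<!\\)'")

def single_to_double_quote(single_quoted: str) -> str:
    """replace unescaped single quotes with double quotes"""
    return single_quote_regex.sub('"', single_quoted)

fixed = single_to_double_quote("{'name': 'Bob'}")
```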

static singleQuoteToDoubleQuoteUsingBracketLoop(singleQuoted)[source]

convert a single quoted string to a double quoted one using a bracket loop

Args:

singleQuoted(string): a single quoted string e.g. {‘cities’: [{‘name’: “Upper Hell’s Gate”}]}

Returns:

string: the double quoted version of the string

Note:

see https://stackoverflow.com/a/63862387/1497139

static singleQuoteToDoubleQuoteUsingRegex(singleQuoted)[source]

convert a single quoted string to a double quoted one using a regular expression

Args:

singleQuoted(string): a single quoted string e.g. {‘cities’: [{‘name’: “Upper Hell’s Gate”}]}

Returns:

string: the double quoted version of the string

Note:

see https://stackoverflow.com/a/50257217/1497139

static storeJsonToFile(jsonStr, jsonFilePath)[source]

store the given json string to the given jsonFilePath

Args:

jsonStr(string): the string to store
jsonFilePath(string): the path of the file where to store the result

storeToJsonFile(jsonFile: str, extension: str = '.json', limitToSampleFields: bool = False)[source]

store me to the given jsonFile

Args:

jsonFile(str): the JSON file name (optionally without extension)
extension(str): the extension to use if not part of the jsonFile name
limitToSampleFields(bool): If True the returned JSON is limited to the attributes/fields that are present in the samples. Otherwise all attributes of the object will be included. Default is False.

toJSON(limitToSampleFields: bool = False)[source]
Args:

limitToSampleFields(bool): If True the returned JSON is limited to the attributes/fields that are present in the samples. Otherwise all attributes of the object will be included. Default is False.

Returns:

a recursive JSON dump of the dicts of my objects

toJsonAbleValue(v)[source]

return the JSON able value of the given value v

Args:

v(object): the value to convert

class lodstorage.jsonable.JSONAbleList(listName: str | None = None, clazz=None, tableName: str | None = None, initList: bool = True, handleInvalidListTypes=False, filterInvalidListTypes=False)[source]

Bases: JSONAble

Container class

asJSON(asString=True)[source]

recursively return my dict elements

Args:

asString(boolean): if True return my result as a string

fromJson(jsonStr, types=None)[source]

initialize me from the given JSON string

Args:

jsonStr(str): the JSON string
types(Types): the types to be fixed

fromLoD(lod, append: bool = True, debug: bool = False)[source]

load my entityList from the given list of dicts

Args:

lod(list): the list of dicts to load
append(bool): if True append to my existing entries

Return:

list: a list of errors (if any)

getJsonData()[source]

get my Jsondata

getList()[source]

get my list

getLoDfromJson(jsonStr: str, types=None, listName: str | None = None)[source]

get a list of dicts from the given JSON string

Args:

jsonStr(str): the JSON string
types(Types): the types to be fixed
listName(str): the name of the list

Returns:

list: a list of dicts

getLookup(attrName: str, withDuplicates: bool = False)[source]

create a lookup dictionary by the given attribute name

Args:

attrName(str): the attribute to lookup
withDuplicates(bool): whether to retain single values or lists

Return:

a dictionary for lookup, or a tuple (dictionary, list of duplicates), depending on withDuplicates
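The lookup semantics can be sketched as follows — a stdlib-only approximation of the documented behavior (get_lookup is a hypothetical name; the exact duplicate handling of the library may differ):

```python
def get_lookup(lod: list, attr_name: str, with_duplicates: bool = False):
    """create a lookup dict by the given attribute name; with
    with_duplicates=True keep lists of records per value, otherwise
    keep single records and collect duplicates separately"""
    lookup, duplicates = {}, []
    for record in lod:
        value = record.get(attr_name)
        if with_duplicates:
            lookup.setdefault(value, []).append(record)
        elif value in lookup:
            duplicates.append(record)
        else:
            lookup[value] = record
    return lookup if with_duplicates else (lookup, duplicates)

lod = [{"name": "Paris"}, {"name": "Berlin"}, {"name": "Paris"}]
lookup, dups = get_lookup(lod, "name")
```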

readLodFromJsonFile(jsonFile: str, extension: str = '.json')[source]

read the list of dicts from the given jsonFile

Args:

jsonFile(string): the jsonFile to read from

Returns:

list: a list of dicts

readLodFromJsonStr(jsonStr) list[source]

restore me from the given jsonStr

Args:

jsonStr(str): the JSON string to restore from

restoreFromJsonFile(jsonFile: str) list[source]

read my list of dicts and restore it

restoreFromJsonStr(jsonStr: str) list[source]

restore me from the given jsonStr

Args:

jsonStr(str): the json string to restore me from

setListFromLoD(lod: list) list[source]

set my list from the given list of dicts

Args:

lod(list): a raw record list of dicts

Returns:

list: a list of dicts if no clazz is set, otherwise a list of objects

toJsonAbleValue(v)[source]

make sure we don’t store our meta information clazz, tableName and listName but just the list we are holding

class lodstorage.jsonable.JSONAbleSettings[source]

Bases: object

settings for JSONAble - put in a separate class so they would not be serialized

indent = 4

singleQuoteRegex = re.compile("(?<!\\\\)'")

regular expression to be used for conversion from singleQuote to doubleQuote see https://stackoverflow.com/a/50257217/1497139

class lodstorage.jsonable.Types(name: str, warnOnUnsupportedTypes=True, debug=False)[source]

Bases: JSONAble

holds entity meta Info

Variables:

name(string) – entity name = table name

addType(listName, field, valueType)[source]

add the python type for the given field to the typeMap

Args:

listName(string): the name of the list of the field
field(string): the name of the field
valueType(type): the python type of the field

fixListOfDicts(typeMap, listOfDicts)[source]

fix the type in the given list of Dicts

fixTypes(lod: list, listName: str)[source]

fix the types in the given data structure

Args:

lod(list): a list of dicts
listName(str): the types to lookup by list name

static forTable(instance, listName: str, warnOnUnsupportedTypes: bool = True, debug=False)[source]

get the types for the list of dicts (table) in the given instance with the given listName

Args:

instance(object): the instance to inspect
listName(string): the list of dicts to inspect
warnOnUnsupportedTypes(bool): if True warn if an item value has an unsupported type
debug(bool): True if debugging information should be shown

Returns:

Types: a types object

getType(typeName)[source]

get the type for the given type name

getTypes(listName: str, sampleRecords: list, limit: int = 10)[source]

determine the types for the given sample records

Args:

listName(str): the name of the list
sampleRecords(list): a list of items
limit(int): the maximum number of items to check

getTypesForItems(listName: str, items: list, warnOnNone: bool = False)[source]

get the types for the given items side effect is setting my types

Args:

listName(str): the name of the list
items(list): a list of items
warnOnNone(bool): if True warn if an item value is None

typeName2Type = {'bool': <class 'bool'>, 'date': <class 'datetime.date'>, 'datetime': <class 'datetime.datetime'>, 'float': <class 'float'>, 'int': <class 'int'>, 'str': <class 'str'>}
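The sample-based type detection behind getTypes can be sketched against the typeName2Type mapping shown above — an illustrative stdlib-only approximation (get_types is a hypothetical name), not the library's actual inference logic:

```python
import datetime

# the python types from the documented typeName2Type mapping
SUPPORTED_TYPES = (bool, datetime.date, datetime.datetime, float, int, str)

def get_types(sample_records: list, limit: int = 10) -> dict:
    """determine the python type per field from the given sample
    records, inspecting at most limit records and skipping None values"""
    type_map = {}
    for record in sample_records[:limit]:
        for field, value in record.items():
            if value is not None and isinstance(value, SUPPORTED_TYPES):
                type_map[field] = type(value)
    return type_map

types = get_types([{"name": "Paris", "pop": 2.2}, {"name": "Berlin", "pop": None}])
```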

lodstorage.jsonpicklemixin module

class lodstorage.jsonpicklemixin.JsonPickleMixin[source]

Bases: object

allow reading and writing derived objects from a jsonpickle file

asJsonPickle() str[source]

convert me to JSON

Returns:

str: a JSON String with my JSON representation

static checkExtension(jsonFile: str, extension: str = '.json') str[source]

make sure the jsonFile has the given extension e.g. “.json”

Args:

jsonFile(str): the jsonFile name - potentially without “.json” suffix

Returns:

str: the jsonFile name with “.json” as an extension guaranteed

debug = False
static readJsonPickle(jsonFileName, extension='.jsonpickle')[source]
Args:

jsonFileName(str): name of the file (optionally without “.json” postfix)
extension(str): default file extension

writeJsonPickle(jsonFileName: str, extension: str = '.jsonpickle')[source]

write me to the json file with the given name (optionally without postfix)

Args:

jsonFileName(str): name of the file (optionally without “.json” postfix)
extension(str): default file extension

lodstorage.linkml module

Created on 2024-01-28

@author: wf

class lodstorage.linkml.PythonTypes[source]

Bases: object

python type handling

classmethod get_linkml_range(ptype: Type) str[source]

Determines the LinkML range for a given Python type.

Args:

ptype (Type): The Python type for which the LinkML range is required.

Returns:

str: The corresponding LinkML range as a string. Defaults to “string” if the type is not found.

classmethod get_rdf_datatype(ptype: Type) XSD | None[source]

Determines the RDF (XSD) datatype for a given Python type.

Args:

ptype (Type): The Python type for which the RDF (XSD) datatype is required.

Returns:

XSD: The corresponding RDF (XSD) datatype. Returns None if the type is not found.

to_linkml_ranges = {<class 'bool'>: 'boolean', <class 'dict'>: 'dictionary', <class 'float'>: 'float', <class 'int'>: 'integer', <class 'list'>: 'list', <class 'str'>: 'string'}
to_rdf_datatypes = {<class 'bool'>: rdflib.term.URIRef('http://www.w3.org/2001/XMLSchema#boolean'), <class 'float'>: rdflib.term.URIRef('http://www.w3.org/2001/XMLSchema#float'), <class 'int'>: rdflib.term.URIRef('http://www.w3.org/2001/XMLSchema#integer'), <class 'str'>: rdflib.term.URIRef('http://www.w3.org/2001/XMLSchema#string')}

lodstorage.linkml_gen module

Created on 2024-01-21

@author: wf

class lodstorage.linkml_gen.LinkMLGen(schema: Schema)[source]

Bases: object

Class for generating LinkML YAML schema from Python data models using dataclasses.

check_value(value)[source]
ensure_consistency(name, declared_type, actual_type, doc_attributes)[source]
gen_schema(data_model_class) Schema[source]
gen_schema_from_instance(data_model_instance) Schema[source]

Generate a LinkML YAML schema from a Python data model using dataclasses.

Args:

data_model_instance: An instance of the Python data model.

Returns:

Schema: The LinkML schema generated from the data model.

lodstorage.lod module

Created on 2021-01-31

@author: wf

class lodstorage.lod.LOD(name)[source]

Bases: object

list of Dict aka Table

static addLookup(lookup, duplicates, record, value, withDuplicates: bool)[source]

add a single lookup result

Args:

lookup(dict): the lookup map
duplicates(list): the list of duplicates
record(dict): the current record
value(object): the current value to lookup
withDuplicates(bool): if True duplicates should be allowed and lists returned; if False a separate duplicates list is created

static filterFields(lod: list, fields: list, reverse: bool = False)[source]

filter the given LoD with the given list of fields by either limiting the LoD to the fields or removing the fields contained in the list depending on the state of the reverse parameter

Args:

lod(list): list of dicts from which the fields should be excluded
fields(list): list of fields that should be excluded from the lod
reverse(bool): If True limit dict to the list of given fields. Otherwise exclude the fields from the dict.

Returns:

LoD
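The include/exclude semantics of filterFields reduce to a pair of dict comprehensions — a minimal sketch under the documented reverse parameter (filter_fields is a hypothetical name):

```python
def filter_fields(lod: list, fields: list, reverse: bool = False) -> list:
    """remove the given fields from each dict of the LoD, or with
    reverse=True limit each dict to exactly those fields"""
    if reverse:
        return [{k: v for k, v in d.items() if k in fields} for d in lod]
    return [{k: v for k, v in d.items() if k not in fields} for d in lod]

lod = [{"name": "Alice", "age": 42, "secret": "x"}]
public = filter_fields(lod, ["secret"])
```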

static getFields(listOfDicts, sampleCount: int | None = None)[source]
static getLookup(lod: list, attrName: str, withDuplicates: bool = False)[source]

create a lookup dictionary by the given attribute name for the given list of dicts

Args:

lod(list): the list of dicts to get the lookup dictionary for
attrName(str): the attribute to lookup
withDuplicates(bool): whether to retain single values or lists

Return:

a dictionary for lookup

classmethod handleListTypes(lod, doFilter=False, separator=',')[source]

handle list types in the given list of dicts

Args:

cls: this class
lod(list): a list of dicts
doFilter(bool): True if records containing list value items should be filtered
separator(str): the separator to use when converting lists

static intersect(listOfDict1, listOfDict2, key=None)[source]

get the intersection of the two lists of Dicts by the given key

static setNone(record, fields)[source]

make sure the given fields in the given record are set to None

Args:

record(dict): the record to work on
fields(list): the list of fields to set to None

static setNone4List(listOfDicts, fields)[source]

set the given fields to None for the records in the given listOfDicts if they are not set

Args:

listOfDicts(list): the list of records to work on
fields(list): the list of fields to set to None

static sortKey(d, key=None)[source]

get the sort key for the given dict d with the given key

lodstorage.mwTable module

Created on 2020-08-21

@author: wf

class lodstorage.mwTable.MediaWikiTable(wikiTable=True, colFormats=None, sortable=True, withNewLines=False)[source]

Bases: object

helper for https://www.mediawiki.org/wiki/Help:Tables

addHeader(record)[source]

add the given record as a “sample” header

addRow4Dict(record)[source]
asWikiMarkup()[source]

convert me to MediaWiki markup

Returns:

string: the MediaWiki markup for this table
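The markup this produces follows the MediaWiki table conventions linked above — a minimal stdlib-only sketch of the same idea (as_wiki_markup is a hypothetical name; the library's exact output, e.g. sortable classes and column formats, may differ):

```python
def as_wiki_markup(lod: list, sortable: bool = True) -> str:
    """render a list of dicts as MediaWiki table markup,
    see https://www.mediawiki.org/wiki/Help:Tables"""
    css = ' class="wikitable sortable"' if sortable else ' class="wikitable"'
    lines = ["{|" + css]
    headers = list(lod[0].keys())
    lines.append("!" + "!!".join(headers))
    for record in lod:
        lines.append("|-")
        lines.append("|" + "||".join(str(record.get(h, "")) for h in headers))
    lines.append("|}")
    return "\n".join(lines)

markup = as_wiki_markup([{"name": "Paris", "country": "France"}], sortable=False)
```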

fromListOfDicts(listOfDicts)[source]
noneReplace(value)[source]

lodstorage.plot module

Created on 2020-07-05

@author: wf

class lodstorage.plot.Plot(valueList, title, xlabel=None, ylabel=None, gformat='.png', fontsize=12, plotdir=None, debug=False)[source]

Bases: object

create Plot based on counters see https://stackoverflow.com/questions/19198920/using-counter-in-python-to-build-histogram

barchart(mode='show')[source]

barchart based histogram for the given counter

hist(mode='show')[source]

create histogram for the given counter

showDebug()[source]
showMe(mode='show', close=True)[source]

show me in the given mode

titleMe()[source]

set my title and labels

lodstorage.query module

Created on 2020-08-22

@author: wf

class lodstorage.query.Endpoint[source]

Bases: JSONAble

a query endpoint

classmethod getDefault()[source]
static getSamples()[source]
class lodstorage.query.EndpointManager[source]

Bases: object

manages a set of SPARQL endpoints

static getEndpointNames(endpointPath=None, lang: str | None = None) list[source]

Returns a list of all available endpoint names

Args:

endpointPath(str): the path to the yaml file with the endpoint configurations
lang(str): if lang is given filter by the given language

static getEndpoints(endpointPath: str | None = None, lang: str | None = None)[source]

get the endpoints for the given endpointPath

Args:

endpointPath(str): the path to the yaml file with the endpoint configurations
lang(str): if lang is given filter by the given language

class lodstorage.query.Format(value)[source]

Bases: Enum

the supported formats for the results to be delivered

csv = 'csv'
github = 'github'
json = 'json'
latex = 'latex'
mediawiki = 'mediawiki'
tsv = 'tsv'
xml = 'xml'
class lodstorage.query.Query(name: str, query: str, lang='sparql', endpoint: str | None = None, database: str = 'blazegraph', title: str | None = None, description: str | None = None, limit: int | None = None, prefixes=None, tryItUrl: str | None = None, formats: list | None = None, debug=False)[source]

Bases: object

a Query e.g. for SPARQL

addFormatCallBack(callback)[source]
asWikiMarkup(listOfDicts)[source]

convert the given listOfDicts result to MediaWiki markup

Args:

listOfDicts(list): the list of Dicts to convert to MediaWiki markup

Returns:

string: the markup

asWikiSourceMarkup()[source]

convert me to Mediawiki markup for syntax highlighting using the “source” tag

Returns:

string: the Markup

asYaml()[source]
documentQueryResult(qlod: list, limit=None, tablefmt: str = 'mediawiki', tryItUrl: str | None = None, withSourceCode=True, **kwArgs)[source]

document the given query results - note that a copy of the whole list is created to allow formatting

Args:

qlod: the list of dicts result
limit(int): the maximum number of records to display in result tabulate
tablefmt(str): the table format to use
tryItUrl: the “try it!” url to show
withSourceCode(bool): if True document the source code

Return:

str: the documentation tabular text for the given parameters

formatWithValueFormatters(lod, tablefmt: str)[source]

format the given list of Dicts with the ValueFormatters

convert the given url and title to a link for the given tablefmt

Args:

url(str): the url to convert
title(str): the title to show
tablefmt(str): the table format to use

getTryItUrl(baseurl: str, database: str = 'blazegraph')[source]

return the “try it!” url for the given baseurl

Args:

baseurl(str): the baseurl to used

Returns:

str: the “try it!” url for the given query

preFormatWithCallBacks(lod, tablefmt: str)[source]

run the configured call backs to pre-format the given list of dicts for the given tableformat

Args:

lod(list): the list of dicts to handle
tablefmt(str): the table format (according to tabulate) to apply

convert url prefixes to link according to the given table format TODO - refactor as preFormat callback

Args:

lod(list): the list of dicts to convert
prefix(str): the prefix to strip
tablefmt(str): the tabulate tableformat to use

class lodstorage.query.QueryManager(lang: str | None = None, debug=False, queriesPath=None)[source]

Bases: object

manages pre packaged Queries

static getQueries(queriesPath=None)[source]

get the queries for the given queries Path

class lodstorage.query.QueryResultDocumentation(query, title: str, tablefmt: str, tryItMarkup: str, sourceCodeHeader: str, sourceCode: str, resultHeader: str, result: str)[source]

Bases: object

documentation of a query result

asText()[source]

return my text representation

Returns:

str: description, sourceCodeHeader, sourceCode, tryIt link and result table

static uniCode2Latex(text: str, withConvert: bool = False) str[source]

converts unicode text to latex and fixes UTF-8 chars for latex in a certain range:

₀:$_0$ … ₉:$_9$

see https://github.com/phfaist/pylatexenc/issues/72

Args:

text(str): the string to fix
withConvert(bool): True if unicode to latex library conversion should be used

Return:

str: latex presentation of UTF-8 char
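The subscript-digit fix described above (₀:$_0$ … ₉:$_9$) can be sketched as a simple character replacement over the U+2080..U+2089 range — an illustrative approximation (subscript_to_latex is a hypothetical name), not the full uniCode2Latex implementation:

```python
def subscript_to_latex(text: str) -> str:
    """replace the unicode subscript digits ₀..₉ (U+2080..U+2089)
    with their latex equivalents $_0$..$_9$"""
    for digit in range(10):
        text = text.replace(chr(0x2080 + digit), f"$_{digit}$")
    return text

latex = subscript_to_latex("H₂O")
```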

class lodstorage.query.QuerySyntaxHighlight(query, highlightFormat: str = 'html')[source]

Bases: object

Syntax highlighting for queries with pygments

highlight()[source]
Returns:

str: the result of the syntax highlighting with pygments

class lodstorage.query.ValueFormatter(name: str, formatString: str, regexps: list | None = None)[source]

Bases: object

a value Formatter

applyFormat(record, key, resultFormat: Format)[source]

apply the given format to the given record

Args:

record(dict): the record to handle
key(str): the property key
resultFormat(Format): the resultFormat style to apply

formatsPath = '/Users/wf/Library/Python/3.10/lib/python/site-packages/lodstorage/../sampledata/formats.yaml'
classmethod fromDict(name: str, record: dict)[source]

create a ValueFormatter from the given dict

classmethod getFormats(formatsPath: str | None = None) dict[source]

get the available ValueFormatters

Args:

formatsPath(str): the path to the yaml file to read the format specs from

Returns:

dict: a map for ValueFormatters by formatter Name

home = '/Users/wf'
valueFormats = None
class lodstorage.query.YamlPath[source]

Bases: object

static getPaths(yamlFileName: str, yamlPath: str | None = None)[source]

lodstorage.querymain module

Created on 2022-02-13

@author: wf

class lodstorage.querymain.QueryMain[source]

Bases: object

Commandline handler

classmethod main(args)[source]

command line activation with parsed args

Args:

args(list): the command line arguments

static rawQuery(endpointConf, query, resultFormat, mimeType)[source]

returns raw result of the endpoint

Args:

endpointConf: EndPoint
query(str): query
resultFormat(str): format of the result
mimeType(str): mimeType

Returns:

raw result of the query

lodstorage.querymain.main(argv=None, lang=None)[source]

main program.

commandline access to List of Dicts / Linked Open Data Queries

lodstorage.querymain.mainSPARQL(argv=None)[source]

commandline for SPARQL queries

lodstorage.querymain.mainSQL(argv=None)[source]

commandline for SQL queries

lodstorage.rdf module

Created on 2024-01-27

@author: wf, using ChatGPT-4 prompting

class lodstorage.rdf.RDFDumper(schema: Schema, instance: object)[source]

Bases: object

A class to convert instances of data models (based on a LinkML schema) into an RDF graph.

convert_to_literal(value, slot_obj)[source]

Converts a value to an RDFLib Literal with appropriate datatype.

Args:

value: The value to be converted.
slot_obj: The slot object containing information about the field.

Returns:

An RDFLib Literal with the value and appropriate datatype.

convert_to_rdf()[source]

Converts the provided instance into RDF triples based on the LinkML schema.

get_instance_uri(instance_data)[source]

Generates a URI for an instance. If the instance has an ‘identifier’ property, it uses that as part of the URI. Otherwise, it generates or retrieves a unique URI.

process_class(class_name: str, instance_data: object)[source]
serialize(rdf_format: str = 'turtle') str[source]

Serializes the RDF graph into a string representation in the specified format.

Args:

rdf_format (str): The serialization format (e.g., ‘turtle’, ‘xml’, ‘json-ld’).

Returns:

str: The serialized RDF graph.

value_iterator(value: Any)[source]

Iterates over values in a mapping or iterable.

Args:

value: The value to iterate over. It can be a mapping, iterable, or a single value.

Yields:

Tuples of (key, value) from the input value. For single values, key is None.

lodstorage.sample module

Created on 2020-08-24

@author: wf

class lodstorage.sample.Cities(load=False)[source]

Bases: JSONAbleList

class lodstorage.sample.Royal[source]

Bases: JSONAble

i am a single Royal

classmethod getSamples()[source]
class lodstorage.sample.Royals(load=False)[source]

Bases: JSONAbleList

a non ORM Royals list

class lodstorage.sample.RoyalsORMList(load=False)[source]

Bases: JSONAbleList

class lodstorage.sample.Sample[source]

Bases: object

Sample dataset generator

cityList = None
static dob(isoDateString)[source]

get the date of birth from the given iso date string

static getCities()[source]

get a list of cities

static getCountries()[source]
static getRoyals()[source]
static getRoyalsInstances()[source]
static getSample(size)[source]

lodstorage.sample2 module

Created on 2024-01-21

@author: wf

class lodstorage.sample2.Sample[source]

Bases: object

Sample dataset provider

static get(dataset_name: str)[source]

Get the given sample dataset name

lodstorage.schema module

Created on 2021-01-26

@author: wf

class lodstorage.schema.Schema(name: str, title: str)[source]

Bases: object

a relational Schema

static generalizeColumn(tableList, colName: str)[source]

remove the column with the given name from all tables in the tablelist and return it

Args:

tableList(list): a list of Tables
colName(string): the name of the column to generalize

Returns:

string: the column having been generalized and removed

static getGeneral(tableList, name: str, debug: bool = False)[source]

derive a general table from the given table list

Args:

tableList(list): a list of tables
name(str): name of the general table
debug(bool): True if column names should be shown

Returns:

a table dict for the generalized table

static getGeneralViewDDL(tableList, name: str, debug=False) str[source]

get the DDL statement to create a general view

Args:

tableList: the list of tables
name(str): the name of the view
debug(bool): True if debug should be set

class lodstorage.schema.SchemaManager(schemaDefs=None, baseUrl: str | None = None)[source]

Bases: object

a manager for schemas

lodstorage.sparql module

Created on 2020-08-14

@author: wf

class lodstorage.sparql.SPARQL(url, mode='query', debug=False, isFuseki=False, typedLiterals=False, profile=False, agent='PyLodStorage', method='POST')[source]

Bases: object

wrapper for SPARQL e.g. Apache Jena, Virtuoso, Blazegraph

Variables:
  • url – full endpoint url (including mode)

  • mode – ‘query’ or ‘update’

  • debug – True if debugging is active

  • typedLiterals – True if INSERT should be done with typedLiterals

  • profile(boolean) – True if profiling / timing information should be displayed

  • sparql – the SPARQLWrapper2 instance to be used

  • method(str) – the HTTP method to be used ‘POST’ or ‘GET’

addAuthentication(username: str, password: str, method: BASIC | DIGEST = 'BASIC')[source]

Add HTTP Authentication credentials to the sparql wrapper

Args:

username: name of the user
password: password of the user
method: HTTP Authentication method

asListOfDicts(records, fixNone: bool = False, sampleCount: int | None = None)[source]

convert SPARQL result back to python native

Args:

records(list): the list of bindings
fixNone(bool): if True add None values for empty columns in Dict
sampleCount(int): the number of samples to check

Returns:

list: a list of Dicts
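The conversion can be sketched in plain Python: SPARQL JSON results carry one dict per binding, each value wrapped in a {"type": …, "value": …} object. The helper bindings_to_lod below is an illustrative assumption, not the library's asListOfDicts implementation.

```python
# Sketch of converting SPARQL JSON result bindings to a list of dicts,
# optionally adding None for missing columns as fixNone=True would.
def bindings_to_lod(bindings, fix_none=False):
    lod = [{name: b[name]["value"] for name in b} for b in bindings]
    if fix_none:
        all_keys = set()
        for record in lod:
            all_keys.update(record.keys())
        for record in lod:
            for key in all_keys:
                record.setdefault(key, None)
    return lod

bindings = [
    {"name": {"type": "literal", "value": "Ada"}, "born": {"type": "literal", "value": "1815"}},
    {"name": {"type": "literal", "value": "Alan"}},
]
lod = bindings_to_lod(bindings, fix_none=True)
```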

controlChars = ['\x00', '\x01', '\x02', '\x03', '\x04', '\x05', '\x06', '\x07', '\x08', '\t', '\n', '\x0b', '\x0c', '\r', '\x0e', '\x0f', '\x10', '\x11', '\x12', '\x13', '\x14', '\x15', '\x16', '\x17', '\x18', '\x19', '\x1a', '\x1b', '\x1c', '\x1d', '\x1e', '\x1f']
static controlEscape(s)[source]

escape control characters

see https://stackoverflow.com/a/9778992/1497139
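A minimal sketch of such escaping is shown below; the \uXXXX replacement form is an assumption for illustration and may differ from the library's actual escape format.

```python
# Sketch: replace ASCII control characters (0x00-0x1F) with a \uXXXX
# escape so they can be embedded safely in a string literal.
CONTROL_CHARS = [chr(i) for i in range(0x20)]

def control_escape(s):
    for c in CONTROL_CHARS:
        s = s.replace(c, "\\u%04X" % ord(c))
    return s

escaped = control_escape("line1\nline2\ttabbed")
```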

fix_comments(query_string: str) str[source]

make sure broken SPARQLWrapper will find comments

classmethod fromEndpointConf(endpointConf) SPARQL[source]

create a SPARQL endpoint from the given EndpointConfiguration

Args:

endpointConf(Endpoint): the endpoint configuration to be used

getFirst(qLod: list, attr: str)[source]

get the column attr of the first row of the given qLod list

Args:

qLod(list): the list of dicts (returned by a query)
attr(str): the attribute to retrieve

Returns:

object: the value

getLocalName(name)[source]

retrieve a valid local name from a string-based primary key https://www.w3.org/TR/sparql11-query/#prefNames

Args:

name(string): the name to convert

Returns:

string: a valid local name

getResults(jsonResult)[source]

get the result from the given jsonResult

Args:

jsonResult: the JSON encoded result

Returns:

list: the list of bindings

getValue(sparqlQuery: str, attr: str)[source]

get the value for the given SPARQL query using the given attr

Args:

sparqlQuery(str): the SPARQL query to run
attr(str): the attribute to get

getValues(sparqlQuery: str, attrList: list)[source]

get values for the given sparqlQuery and attribute list

Args:

sparqlQuery(str): the SPARQL query to run
attrList(list): the list of attributes

insert(insertCommand)[source]

run an insert

Args:

insertCommand(string): the SPARQL INSERT command

Returns:

a response

insertListOfDicts(listOfDicts, entityType, primaryKey, prefixes, limit=None, batchSize=None, profile=False)[source]

insert the given list of dicts mapping datatypes

Args:

entityType(string): the entityType to use
primaryKey(string): the name of the primary key attribute to use
prefixes(string): any PREFIX statements to be used
limit(int): maximum number of records to insert
batchSize(int): number of records to send per request

Return:

a list of errors which should be empty on full success

datatype mapping according to https://www.w3.org/TR/xmlschema-2/#built-in-datatypes

mapped from https://docs.python.org/3/library/stdtypes.html

compare to https://www.w3.org/2001/sw/rdb2rdf/directGraph/ http://www.bobdc.com/blog/json2rdf/ https://www.w3.org/TR/json-ld11-api/#data-round-tripping https://stackoverflow.com/questions/29030231/json-to-rdf-xml-file-in-python

insertListOfDictsBatch(listOfDicts, entityType, primaryKey, prefixes, title='batch', batchIndex=None, total=None, startTime=None)[source]

insert a Batch part of listOfDicts

Args:

entityType(string): the entityType to use
primaryKey(string): the name of the primary key attribute to use
prefixes(string): any PREFIX statements to be used
title(string): the title to display for the profiling (if any)
batchIndex(int): the start index of the current batch
total(int): the total number of records for all batches
startTime(datetime): the start of the batch processing

Return:

a list of errors which should be empty on full success

printErrors(errors)[source]

print the given list of errors

Args:

errors(list): a list of error strings

Returns:

boolean: True if the list is empty else False

query(queryString, method='POST')[source]

get a list of results for the given query

Args:

queryString(string): the SPARQL query to execute
method(string): the method to use, e.g. POST

Returns:

list: list of bindings

queryAsListOfDicts(queryString, fixNone: bool = False, sampleCount: int | None = None)[source]

get a list of dicts for the given query (to allow round-trip results for insertListOfDicts)

Args:

queryString(string): the SPARQL query to execute
fixNone(bool): if True add None values for empty columns in Dict
sampleCount(int): the number of samples to check

Returns:

list: a list of Dicts

rawQuery(queryString, method='POST')[source]

query with the given query string

Args:

queryString(string): the SPARQL query to be performed
method(string): POST or GET - POST is mandatory for update queries

Returns:

list: the raw query result as bindings

static strToDatetime(value, debug=False)[source]

convert a string to a datetime

Args:

value(str): the value to convert

Returns:

datetime: the datetime
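A minimal stdlib sketch of this conversion, assuming ISO timestamps as SPARQL endpoints typically return them (e.g. with a trailing "Z"); the exact formats the library accepts are an assumption.

```python
from datetime import datetime

# Sketch: parse an ISO timestamp string to a datetime;
# normalize a trailing "Z" (UTC), which fromisoformat rejects before 3.11.
def str_to_datetime(value):
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)

dt = str_to_datetime("2020-08-14T10:00:00Z")
```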

lodstorage.sql module

Created on 2020-08-24

@author: wf

class lodstorage.sql.EntityInfo(sampleRecords, name, primaryKey=None, debug=False)[source]

Bases: object

holds entity meta Info

Variables:
  • name(string) – entity name = table name

  • primaryKey(string) – the name of the primary key column

  • typeMap(dict) – maps column names to python types

  • debug(boolean) – True if debug information should be shown

addType(column, valueType, sqlType)[source]

add the python type for the given column to the typeMap

Args:

column(string): the name of the column

valueType(type): the python type of the column

fixDates(resultList)[source]

fix date entries in the given resultList by parsing the date content e.g. converting ‘1926-04-21’ back to datetime.date(1926, 4, 21)

Args:

resultList(list): the list of records to be fixed

getCreateTableCmd(sampleRecords)[source]

get the CREATE TABLE DDL command for the given sample records

Args:

sampleRecords(list): a list of Dicts of sample Records

Returns:

string: CREATE TABLE DDL command for this entity info

Example:

CREATE TABLE Person(name TEXT PRIMARY KEY,born DATE,numberInLine INTEGER,wikidataurl TEXT,age FLOAT,ofAge BOOLEAN)
getInsertCmd(replace: bool = False) str[source]

get the INSERT command for this entityInfo

Args:

replace(bool): if True allow replace for insert

Returns:

str: the INSERT INTO SQL command for this entityInfo e.g.

Example:

INSERT INTO Person (name,born,numberInLine,wikidataurl,age,ofAge) values (?,?,?,?,?,?)
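The derivation of such command pairs from a sample record can be sketched with the standard library as follows; ddl_for and its type map are illustrative assumptions, not the actual EntityInfo implementation.

```python
import sqlite3

# Sketch: derive CREATE TABLE / INSERT commands from one sample record
# by mapping Python types to SQL types, then run them against sqlite3.
SQL_TYPES = {str: "TEXT", int: "INTEGER", float: "FLOAT", bool: "BOOLEAN"}

def ddl_for(sample, name, primary_key=None):
    cols = []
    for col, value in sample.items():
        sql_type = SQL_TYPES[type(value)]
        pk = " PRIMARY KEY" if col == primary_key else ""
        cols.append(f"{col} {sql_type}{pk}")
    create = f"CREATE TABLE {name}({','.join(cols)})"
    insert = f"INSERT INTO {name} ({','.join(sample)}) values ({','.join('?' * len(sample))})"
    return create, insert

sample = {"name": "Elizabeth II", "numberInLine": 0, "age": 94.0, "ofAge": True}
create, insert = ddl_for(sample, "Person", primary_key="name")
db = sqlite3.connect(":memory:")
db.execute(create)
db.execute(insert, tuple(sample.values()))
count = db.execute("SELECT count(*) FROM Person").fetchone()[0]
```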
class lodstorage.sql.SQLDB(dbname: str = ':memory:', connection=None, check_same_thread=True, timeout=5, debug=False, errorDebug=False)[source]

Bases: object

Structured Query Language Database wrapper

Variables:
  • dbname(string) – name of the database

  • debug(boolean) – True if debug info should be provided

  • errorDebug(boolean) – True if debug info should be provided on errors (should not be used for production since it might reveal data)

RAM = ':memory:'
backup(backupDB, action='Backup', profile=False, showProgress: int = 200, doClose=True)[source]

create backup of this SQLDB to the given backup db

see https://stackoverflow.com/a/59042442/1497139

Args:

backupDB(string): the path to the backupdb or SQLDB.RAM for in memory
action(string): the action to display
profile(boolean): True if timing information shall be shown
showProgress(int): show progress at each showProgress page (0=show no progress)

backupProgress(status, remaining, total)[source]
close()[source]

close my connection

copyTo(copyDB, profile=True)[source]

copy my content to another database

Args:

copyDB(Connection): the target database
profile(boolean): if True show profile information

createTable(listOfRecords, entityName: str, primaryKey: str | None = None, withCreate: bool = True, withDrop: bool = False, sampleRecordCount=1, failIfTooFew=True)[source]

derive a Data Definition Language CREATE TABLE command from a list of records by examining the first record as the defining sample record, and execute the DDL command

auto detect column types see e.g. https://stackoverflow.com/a/57072280/1497139

Args:

listOfRecords(list): a list of Dicts
entityName(string): the entity / table name to use
primaryKey(string): the key/column to use as a primary key
withDrop(boolean): True if the existing table should be dropped
withCreate(boolean): True if the CREATE TABLE command should be executed - False if only the entityInfo should be returned
sampleRecordCount(int): number of sample records expected and to be inspected
failIfTooFew(boolean): raise an Exception if too few sample records are given, else warn only

Returns:

EntityInfo: meta data information for the created table

execute(ddlCmd)[source]

execute the given Data Definition Command

Args:

ddlCmd(string): e.g. a CREATE TABLE or CREATE View command

executeDump(connection, dump, title, maxErrors=100, errorDisplayLimit=12, profile=True)[source]

execute the given dump for the given connection

Args:

connection(Connection): the sqlite3 connection to use
dump(string): the SQL commands for the dump
title(string): the title of the dump
maxErrors(int): maximum number of errors to be tolerated before stopping and doing a rollback
profile(boolean): True if profiling information should be shown

Returns:

a list of errors

getDebugInfo(record, index, executeMany)[source]

get the debug info for the given record at the given index depending on the state of executeMany

Args:

record(dict): the record to show
index(int): the index of the record
executeMany(boolean): if True the record may be valid else not

getTableDict(tableType='table')[source]

get the schema information from this database as a dict

Args:

tableType(str): table or view

Returns:

dict: Lookup map of tables with columns also being converted to dict

getTableList(tableType='table')[source]

get the schema information from this database

Args:

tableType(str): table or view

Return:

list: a list as derived from PRAGMA table_info

logError(msg)[source]

log the given error message to stderr

Args:

msg(str): the error message to display

progress(action, status, remaining, total)[source]

show progress

query(sqlQuery, params=None)[source]

run the given sqlQuery and return a list of Dicts

Args:

sqlQuery(string): the SQL query to be executed
params(tuple): the query params, if any

Returns:

list: a list of Dicts
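The dict-per-row behavior can be sketched against a plain sqlite3 connection; query_as_lod below is an illustrative helper, not the SQLDB implementation.

```python
import sqlite3

# Sketch: run a parameterized query and return its rows as a list of
# dicts by using sqlite3.Row as the row factory.
def query_as_lod(connection, sql_query, params=()):
    connection.row_factory = sqlite3.Row
    cursor = connection.execute(sql_query, params)
    return [dict(row) for row in cursor.fetchall()]

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE Person(name TEXT, age INTEGER)")
db.executemany("INSERT INTO Person values (?,?)", [("Ada", 36), ("Alan", 41)])
lod = query_as_lod(db, "SELECT * FROM Person WHERE age > ?", (40,))
```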

queryAll(entityInfo, fixDates=True)[source]

query all records for the given entityName/tableName

Args:

entityName(string): name of the entity/table to query
fixDates(boolean): True if date entries should be returned as such and not as strings

queryGen(sqlQuery, params=None)[source]

run the given sqlQuery as a generator of dicts

Args:

sqlQuery(string): the SQL query to be executed
params(tuple): the query params, if any

Returns:

a generator of dicts

static restore(backupDB, restoreDB, profile=False, showProgress=200, debug=False)[source]

restore the restoreDB from the given backup DB

Args:

backupDB(string): path to the backupDB e.g. backup.db
restoreDB(string): path to the restoreDB or in memory SQLDB.RAM
profile(boolean): True if timing information should be shown
showProgress(int): show progress at each showProgress page (0=show no progress)

restoreProgress(status, remaining, total)[source]
showDump(dump, limit=10)[source]

show the given dump up to the given limit

Args:

dump(string): the SQL dump to show
limit(int): the maximum number of lines to display

store(listOfRecords, entityInfo, executeMany=False, fixNone=False, replace=False)[source]

store the given list of records based on the given entityInfo

Args:

listOfRecords(list): the list of Dicts to be stored
entityInfo(EntityInfo): the meta data to be used for storing
executeMany(bool): if True the insert command is done with many/all records at once
fixNone(bool): if True make sure empty columns in the listOfDict are filled with “None” values
replace(bool): if True allow replace for insert

lodstorage.storageconfig module

Created on 2020-08-29

@author: wf

class lodstorage.storageconfig.StorageConfig(mode=StoreMode.SQL, cacheRootDir: str | None = None, cacheDirName: str = 'lodstorage', cacheFile=None, withShowProgress=True, profile=True, debug=False, errorDebug=True)[source]

Bases: object

a storage configuration

getCachePath(ensureExists=True) str[source]

get the path to the default cache

Args:

ensureExists(bool): if True make sure the cache directory exists

static getDefault(debug=False)[source]
static getJSON(debug=False)[source]
static getJsonPickle(debug=False)[source]
static getSPARQL(prefix, endpoint, host, debug=False)[source]
static getSQL(debug=False)[source]
static getYaml(debug=False)[source]
class lodstorage.storageconfig.StoreMode(value)[source]

Bases: Enum

possible supported storage modes

JSON = 2
JSONPICKLE = 1
SPARQL = 4
SQL = 3
YAML = 5

lodstorage.sync module

Created on 2023-12-27

@author: wf

class lodstorage.sync.Sync(pair: SyncPair)[source]

Bases: object

A class to help with synchronization between two sets of data, each represented as a list of dictionaries.

get_keys(direction: str) set[source]

Get the keys for a given direction of synchronization.

get_record_by_key(side: str, key: str) dict[source]

Retrieves a record by the given unique key from the appropriate data source as specified by side.

Args:

side (str): The side of the data source: “←”, “l” or “left” for left; “→”, “r” or “right” for right.
key (str): The unique key of the record to retrieve.

Returns:

Optional[Dict[str, Any]]: The record if found, otherwise None.

Raises:

ValueError: If the provided direction is invalid.

get_record_by_pkey(side: str, pkey: str) Dict[str, Any] | None[source]

Retrieves a record by primary key from the appropriate data source as specified by side.

Args:

side (str): The side of the data source: “←”, “l” or “left” for left; “→”, “r” or “right” for right.
pkey (str): The primary key of the record to retrieve.

Returns:

Optional[Dict[str, Any]]: The record if found, otherwise None.

handle_direction_error(direction: str)[source]
handle_side_error(side: str)[source]
status_table(tablefmt: str = 'grid') str[source]

Create a table representing the synchronization status.

class lodstorage.sync.SyncPair(title: str, l_name: str, r_name: str, l_data: List[Dict[str, Any]], r_data: List[Dict[str, Any]], l_key: str, r_key: str, l_pkey: str | None = None, r_pkey: str | None = None)[source]

Bases: object

A class to represent a pair of data sources for synchronization.

Attributes:

title (str): The title of the synchronization pair.
l_name (str): Name of the left data source (e.g., ‘local’).
r_name (str): Name of the right data source (e.g., ‘wikidata’).
l_data (List[Dict[str, Any]]): A list of dictionaries from the left data source.
r_data (List[Dict[str, Any]]): A list of dictionaries from the right data source.
l_key (str): The field name in the left data source dictionaries used as a unique identifier for synchronization.
r_key (str): The field name in the right data source dictionaries used as a unique identifier for synchronization.
l_pkey (str): The primary key field of the left data source.
r_pkey (str): The primary key field of the right data source.

Example usage:

l_data = [{'id_l': '1', 'value': 'a'}, {'id_l': '2', 'value': 'b'}]
r_data = [{'id_r': '2', 'value': 'b'}, {'id_r': '3', 'value': 'c'}]
pair = SyncPair("Title", "local", "wikidata", l_data, r_data, 'id_l', 'id_r')
sync = Sync(pair)
print(sync.status_table())

l_by_pkey: Dict[str, Dict[str, Any]]
l_data: List[Dict[str, Any]]
l_key: str
l_name: str
l_pkey: str | None = None
r_by_pkey: Dict[str, Dict[str, Any]]
r_data: List[Dict[str, Any]]
r_key: str
r_name: str
r_pkey: str | None = None
title: str
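The key-set logic behind Sync.get_keys can be sketched with plain set operations; sync_keys and the interpretation of the direction symbols are illustrative assumptions based on the SyncPair documentation above, not the library's implementation.

```python
# Sketch: compute which keys exist only on the left, only on the
# right, or on both sides of a synchronization pair.
def sync_keys(l_data, r_data, l_key, r_key):
    l_keys = {record[l_key] for record in l_data}
    r_keys = {record[r_key] for record in r_data}
    return {
        "→": l_keys - r_keys,  # only left: candidates to push right
        "←": r_keys - l_keys,  # only right: candidates to pull left
        "↔": l_keys & r_keys,  # present on both sides
    }

l_data = [{"id_l": "1", "value": "a"}, {"id_l": "2", "value": "b"}]
r_data = [{"id_r": "2", "value": "b"}, {"id_r": "3", "value": "c"}]
keys = sync_keys(l_data, r_data, "id_l", "id_r")
```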

lodstorage.tabulateCounter module

Created on 2021-06-13

@author: wf

class lodstorage.tabulateCounter.TabulateCounter(counter)[source]

Bases: object

helper for tabulating Counters

mostCommonTable(headers=['#', 'key', 'count', '%'], tablefmt='pretty', limit=50)[source]

get the most common Table

lodstorage.trulytabular module

Created on 2022-04-14

@author: wf

class lodstorage.trulytabular.TrulyTabular(itemQid, propertyLabels: list = [], propertyIds: list = [], subclassPredicate='wdt:P31', where: str | None = None, endpointConf=None, lang='en', debug=False)[source]

Bases: object

truly tabular SPARQL/RDF analysis

checks how “tabular” a query based on a list of properties of an item class is

addStatsColWithPercent(m: dict, col: str, value: int | float, total: int | float)[source]

add a statistics column

Args:

m(dict): the dict to add the column to
col(str): name of the column
value: the value
total: the total value

asText(long: bool = True)[source]

returns my content as a text representation

Args:

long(bool): True if a long format including url is wished

Returns:

str: a text representation of my content

count()[source]

get my count

genPropertyStatistics()[source]

generate the property Statistics

Returns:

generator: a generator of statistic dict rows

genWdPropertyStatistic(wdProperty: WikidataProperty, itemCount: int, withQuery=True) dict[source]

generate a property Statistics Row for the given wikidata Property

Args:

wdProperty(WikidataProperty): the property to get the statistics for
itemCount(int): the total number of items to check
withQuery(bool): if True include the sparql query

Returns:

dict: a statistics row

generateSparqlQuery(genMap: dict, listSeparator: str = '⇹', naive: bool = True, lang: str = 'en') str[source]

generate a SPARQL Query

Args:

genMap(dict): a dictionary of generation items aggregates/ignores/labels
listSeparator(str): the symbol to use as a list separator for GROUP_CONCAT
naive(bool): if True generate a naive straightforward SPARQL query; if False generate a proper truly tabular aggregate query
lang(str): the language to generate for

Returns:

str: the generated SPARQL Query

getItemText()[source]
getPropertyStatistics()[source]

get the property Statistics

classmethod getQueryManager(lang='sparql', name='trulytabular', debug=False)[source]

get the query manager for the given language and fileName

Args:

lang(str): the language of the queries to extract
name(str): the name of the manager containing the query specifications
debug(bool): if True set debugging on

mostFrequentPropertiesQuery(whereClause: str | None = None, minCount: int = 0)[source]

get the most frequently used properties

Args:

whereClause(str): an extra WHERE clause to use

noneTabular(wdProperty: WikidataProperty)[source]

get the non-tabular result for the given Wikidata property

Args:

wdProperty(WikidataProperty): the Wikidata property

noneTabularQuery(wdProperty: WikidataProperty, asFrequency: bool = True)[source]

get the non-tabular entries for the given property

Args:

wdProperty(WikidataProperty): the property to analyze
asFrequency(bool): if True do a frequency analysis

class lodstorage.trulytabular.Variable[source]

Bases: object

Variable e.g. name handling

classmethod validVarName(varStr: str) str[source]

convert the given potential variable name string to a valid variable name

see https://stackoverflow.com/a/3305731/1497139

Args:

varStr(str): the string to convert

Returns:

str: a valid variable name
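A regex-based sketch of such a conversion, in the spirit of the referenced Stack Overflow answer; the exact character policy of Variable.validVarName (dropping vs. replacing invalid characters) is an assumption.

```python
import re

# Sketch: strip everything that is not a word character so the
# result is usable as a SPARQL variable name.
def valid_var_name(var_str):
    return re.sub(r"\W+", "", var_str)

name = valid_var_name("birth place (P19)")
```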

class lodstorage.trulytabular.WikidataItem(qid: str, lang: str = 'en', sparql: SPARQL | None = None, debug: bool = False)[source]

Bases: object

a wikidata Item

asText(long: bool = True, wrapAt: int = 0)[source]

returns my content as a text representation

Args:

long(bool): True if a long format including url is wished
wrapAt(int): wrap long lines at the given width (if >0)

Returns:

str: a text representation of my content

classmethod getItemsByLabel(sparql: SPARQL, itemLabel: str, lang: str = 'en', debug: bool = False) list[source]

get Wikidata items by the given label

Args:

sparql(SPARQL): the SPARQL endpoint to use
itemLabel(str): the label of the items
lang(str): the language of the label
debug(bool): if True show debugging information

Returns:

a list of potential items

classmethod getLabelAndDescription(sparql: SPARQL, itemId: str, lang: str = 'en', debug: bool = False)[source]

get the label for the given item and language

Args:

itemId(str): the wikidata Q/P id
lang(str): the language of the label
debug(bool): if True output debug information

Returns:

(str,str): the label and description as a tuple

classmethod getPrefixes(prefixes=['rdf', 'rdfs', 'schema', 'wd', 'wdt', 'wikibase', 'xsd'])[source]
class lodstorage.trulytabular.WikidataProperty(pid: str)[source]

Bases: object

a WikidataProperty

classmethod addPropertiesForQuery(wdProperties: list, sparql, query)[source]

add properties from the given query’s result to the given wdProperties list using the given sparql endpoint

Args:

wdProperties(list): the list of wikidata properties
sparql(SPARQL): the SPARQL endpoint to use
query(str): the SPARQL query to perform

classmethod from_id(property_id: str, sparql, lang: str = 'en') WikidataProperty[source]

construct a WikidataProperty from the given property_id

Args:

property_id(str): a property ID e.g. “P6375”
sparql(SPARQL): the SPARQL endpoint to use
lang(str): the language for the label

getPredicate()[source]

get me as a Predicate

classmethod getPropertiesByIds(sparql, propertyIds: list, lang: str = 'en')[source]

get a list of Wikidata properties by the given id list

Args:

sparql(SPARQL): the SPARQL endpoint to use
propertyIds(list): a list of ids of the properties
lang(str): the language of the label

classmethod getPropertiesByLabels(sparql, propertyLabels: list, lang: str = 'en')[source]

get a list of Wikidata properties by the given label list

Args:

sparql(SPARQL): the SPARQL endpoint to use
propertyLabels(list): a list of labels of the properties
lang(str): the language of the label

lodstorage.uml module

Created on 2020-09-04

@author: wf

class lodstorage.uml.UML(debug=False)[source]

Bases: object

UML diagrams via plantuml

mergeSchema(schemaManager, tableList, title=None, packageName=None, generalizeTo=None, withSkin=True)[source]

merge Schema and tableList to PlantUml notation

Args:

schemaManager(SchemaManager): a schema manager to be used
tableList(list): the tableList list of Dicts from getTableList() to convert
title(string): optional title to be added
packageName(string): optional packageName to be added
generalizeTo(string): optional name of a general table to be derived
withSkin(boolean): if True add default BITPlan skin parameters

Returns:

string: the Plantuml notation for the entities in columns of the given tablelist

skinparams = "\n' BITPlan Corporate identity skin params\n' Copyright (c) 2015-2020 BITPlan GmbH\n' see http://wiki.bitplan.com/PlantUmlSkinParams#BITPlanCI\n' skinparams generated by com.bitplan.restmodelmanager\nskinparam note {\n  BackGroundColor #FFFFFF\n  FontSize 12\n  ArrowColor #FF8000\n  BorderColor #FF8000\n  FontColor black\n  FontName Technical\n}\nskinparam component {\n  BackGroundColor #FFFFFF\n  FontSize 12\n  ArrowColor #FF8000\n  BorderColor #FF8000\n  FontColor black\n  FontName Technical\n}\nskinparam package {\n  BackGroundColor #FFFFFF\n  FontSize 12\n  ArrowColor #FF8000\n  BorderColor #FF8000\n  FontColor black\n  FontName Technical\n}\nskinparam usecase {\n  BackGroundColor #FFFFFF\n  FontSize 12\n  ArrowColor #FF8000\n  BorderColor #FF8000\n  FontColor black\n  FontName Technical\n}\nskinparam activity {\n  BackGroundColor #FFFFFF\n  FontSize 12\n  ArrowColor #FF8000\n  BorderColor #FF8000\n  FontColor black\n  FontName Technical\n}\nskinparam classAttribute {\n  BackGroundColor #FFFFFF\n  FontSize 12\n  ArrowColor #FF8000\n  BorderColor #FF8000\n  FontColor black\n  FontName Technical\n}\nskinparam interface {\n  BackGroundColor #FFFFFF\n  FontSize 12\n  ArrowColor #FF8000\n  BorderColor #FF8000\n  FontColor black\n  FontName Technical\n}\nskinparam class {\n  BackGroundColor #FFFFFF\n  FontSize 12\n  ArrowColor #FF8000\n  BorderColor #FF8000\n  FontColor black\n  FontName Technical\n}\nskinparam object {\n  BackGroundColor #FFFFFF\n  FontSize 12\n  ArrowColor #FF8000\n  BorderColor #FF8000\n  FontColor black\n  FontName Technical\n}\nhide Circle\n' end of skinparams '\n"
tableListToPlantUml(tableList, title=None, packageName=None, generalizeTo=None, withSkin=True)[source]

convert tableList to PlantUml notation

Args:

tableList(list): the tableList list of Dicts from getTableList() to convert
title(string): optional title to be added
packageName(string): optional packageName to be added
generalizeTo(string): optional name of a general table to be derived
withSkin(boolean): if True add default BITPlan skin parameters

Returns:

string: the Plantuml notation for the entities in columns of the given tablelist

lodstorage.version module

Created on 2022-03-06

@author: wf

class lodstorage.version.Version[source]

Bases: object

Version handling for pyLoDStorage

date = '2020-09-10'
description = 'python List of Dict (Table) Storage library'
name = 'pylodstorage'
updated = '2024-01-28'
version = '0.8.1'

lodstorage.xml module

Created on 2022-06-20

see

https://github.com/tyleradams/json-toolkit https://stackoverflow.com/questions/36021526/converting-an-array-dict-to-xml-in-python

@author: tyleradams @author: wf

class lodstorage.xml.Lod2Xml(lod, root: str = 'root', node_name: callable = <function Lod2Xml.<lambda>>)[source]

Bases: object

convert a list of dicts to XML

asXml(pretty: bool = True)[source]

convert result to XML

Args:

pretty(bool): if True pretty print the result
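An illustrative stdlib sketch of converting a list of dicts to XML; the element names mirror the Lod2Xml defaults (root, and a node-name callable per record), but the library's actual output format is an assumption.

```python
import xml.etree.ElementTree as ET

# Sketch: build one XML element per record, with one child element
# per dict key, and serialize the tree to a string.
def lod_to_xml(lod, root="root", node_name=lambda d: "node"):
    root_el = ET.Element(root)
    for record in lod:
        node = ET.SubElement(root_el, node_name(record))
        for key, value in record.items():
            child = ET.SubElement(node, key)
            child.text = str(value)
    return ET.tostring(root_el, encoding="unicode")

xml_str = lod_to_xml([{"name": "Ada", "born": 1815}])
```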

lodstorage.yamlable module

Created on 2023-12-08, Extended on 2023-12-16 and 2024-01-25

@author: wf, ChatGPT

Prompts for the development and extension of the ‘YamlAble’ class within the ‘yamlable’ module:

  1. Develop ‘YamlAble’ class in ‘yamlable’ module. It should convert dataclass instances to/from YAML.

  2. Implement methods for YAML block scalar style and exclude None values in ‘YamlAble’ class.

  3. Add functionality to remove None values from dataclass instances before YAML conversion.

  4. Ensure ‘YamlAble’ processes only dataclass instances, with error handling for non-dataclass objects.

  5. Extend ‘YamlAble’ for JSON serialization and deserialization.

  6. Add methods for saving/loading dataclass instances to/from YAML and JSON files in ‘YamlAble’.

  7. Implement loading of dataclass instances from URLs for both YAML and JSON in ‘YamlAble’.

  8. Write tests for ‘YamlAble’ within the pyLodStorage context. Use ‘samples 2’ example from pyLoDStorage https://github.com/WolfgangFahl/pyLoDStorage/blob/master/lodstorage/sample2.py as a reference.

  9. Ensure tests cover YAML/JSON serialization, deserialization, and file I/O operations, using the sample-based approach.

  10. Use Google-style docstrings, comments, and type hints in ‘YamlAble’ class and tests.

  1. Adhere to instructions and seek clarification for any uncertainties.

  2. Add @lod_storable annotation support that will automatically add YamlAble support plus @dataclass and @dataclass_json prerequisite behavior to a class

class lodstorage.yamlable.DateConvert[source]

Bases: object

date converter

classmethod iso_date_to_datetime(iso_date: str) date[source]
class lodstorage.yamlable.YamlAble[source]

Bases: Generic[T]

An extended YAML handler class for converting dataclass objects to and from YAML format, and handling loading from and saving to files and URLs.

classmethod from_dict2(data: dict) T[source]

Creates an instance of a dataclass from a dictionary, typically used in deserialization.

classmethod from_yaml(yaml_str: str) T[source]

Deserializes a YAML string to a dataclass instance.

Args:

yaml_str (str): A string containing YAML formatted data.

Returns:

T: An instance of the dataclass.

classmethod load_from_json_file(filename: str) T[source]

Loads a dataclass instance from a JSON file.

Args:

filename (str): The path to the JSON file.

Returns:

T: An instance of the dataclass.

classmethod load_from_json_url(url: str) T[source]

Loads a dataclass instance from a JSON string obtained from a URL.

Args:

url (str): The URL pointing to the JSON data.

Returns:

T: An instance of the dataclass.

classmethod load_from_yaml_file(filename: str) T[source]

Loads a dataclass instance from a YAML file.

Args:

filename (str): The path to the YAML file.

Returns:

T: An instance of the dataclass.

classmethod load_from_yaml_url(url: str) T[source]

Loads a dataclass instance from a YAML string obtained from a URL.

Args:

url (str): The URL pointing to the YAML data.

Returns:

T: An instance of the dataclass.

classmethod read_from_url(url: str) str[source]

Helper method to fetch content from a URL.

classmethod remove_ignored_values(value: Any, ignore_none: bool = True, ignore_underscore: bool = False, ignore_empty: bool = True) Any[source]

Recursively removes specified types of values from a dictionary or list. By default, it removes keys with None values. Optionally, it can also remove keys starting with an underscore.

Args:

value: The value to process (dictionary, list, or other).
ignore_none: Flag to indicate whether None values should be removed.
ignore_underscore: Flag to indicate whether keys starting with an underscore should be removed.
ignore_empty: Flag to indicate whether empty collections should be removed.
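The recursive cleanup can be sketched as follows; remove_ignored is an illustrative simplification (the ignore_empty handling of the real method is omitted for brevity), not the library's implementation.

```python
# Sketch: recursively drop None values and, optionally,
# underscore-prefixed keys from nested dicts and lists.
def remove_ignored(value, ignore_none=True, ignore_underscore=False):
    if isinstance(value, dict):
        return {
            k: remove_ignored(v, ignore_none, ignore_underscore)
            for k, v in value.items()
            if not (ignore_none and v is None)
            and not (ignore_underscore and k.startswith("_"))
        }
    if isinstance(value, list):
        return [remove_ignored(v, ignore_none, ignore_underscore) for v in value]
    return value

record = {"name": "Ada", "note": None, "_cache": 1, "tags": [{"x": None, "y": 2}]}
cleaned = remove_ignored(record, ignore_underscore=True)
```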

represent_literal(dumper: Dumper, data: str) Node[source]

Custom representer for block scalar style for strings.

represent_none(_, __) Node[source]

Custom representer for ignoring None values in the YAML output.

save_to_json_file(filename: str)[source]

Saves the current dataclass instance to a JSON file.

Args:

filename (str): The path where the JSON file will be saved.

save_to_yaml_file(filename: str)[source]

Saves the current dataclass instance to a YAML file.

Args:

filename (str): The path where the YAML file will be saved.

to_yaml(ignore_none: bool = True, ignore_underscore: bool = True, allow_unicode: bool = True, sort_keys: bool = False) str[source]

Converts this dataclass object to a YAML string, with options to omit None values and/or underscore-prefixed variables, and using block scalar style for strings.

Args:

ignore_none: Flag to indicate whether None values should be removed from the YAML output.
ignore_underscore: Flag to indicate whether attributes starting with an underscore should be excluded from the YAML output.
allow_unicode: Flag to indicate whether to allow unicode characters in the output.
sort_keys: Flag to indicate whether to sort the dictionary keys in the output.

Returns:

A string representation of the dataclass object in YAML format.

lodstorage.yamlable.lod_storable(cls)[source]

Decorator to make a class LoDStorable by inheriting from YamlAble. This decorator also ensures the class is a dataclass and has JSON serialization/deserialization capabilities.

lodstorage.yamlablemixin module

class lodstorage.yamlablemixin.YamlAbleMixin[source]

Bases: object

allow reading and writing derived objects from a yaml file

debug = False
static readYaml(name)[source]
writeYaml(name)[source]

Module contents