cmoncrawl.common.types

Classes

DomainCrawl([domain, cdx_server, page])

DomainRecord(filename, url, offset, length)

ExtractConfig(extractors_path, routes)

Configuration for an extraction run.

ExtractorConfig(name[, since, to])

Configuration for an extractor.

PipeMetadata(domain_record, article_data, ...)

Metadata for a pipe.

RetrieveResponse(status, content, reason)

RoutesConfig(regexes, extractors)

Configuration that pairs URL regexes with extractors.
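
The three configuration classes appear to nest: an ExtractConfig holds routes, each route pairs regexes with extractors, and each extractor has a name and an optional since/to window. The sketch below illustrates that shape using stand-in dataclasses that only mirror the constructor signatures listed above. They are not the library's own classes. The field types, the `matching_extractors` helper, the path and extractor names, and the rule that `since` is inclusive and `to` is exclusive are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


# Stand-ins mirroring the listed signatures; field types are assumed.
@dataclass
class ExtractorConfig:
    name: str
    since: Optional[datetime] = None
    to: Optional[datetime] = None


@dataclass
class RoutesConfig:
    regexes: List[str]
    extractors: List[ExtractorConfig] = field(default_factory=list)


@dataclass
class ExtractConfig:
    extractors_path: str
    routes: List[RoutesConfig]


def matching_extractors(config: ExtractConfig, url: str, when: datetime) -> List[str]:
    """Hypothetical helper: names of extractors whose route matches `url`
    and whose [since, to) window contains `when` (window rule is assumed)."""
    names = []
    for route in config.routes:
        # Skip routes where no regex matches the URL.
        if not any(re.search(pattern, url) for pattern in route.regexes):
            continue
        for extractor in route.extractors:
            if extractor.since is not None and when < extractor.since:
                continue
            if extractor.to is not None and when >= extractor.to:
                continue
            names.append(extractor.name)
    return names


# Example configuration: two extractors for one site, split at 2020-01-01.
config = ExtractConfig(
    extractors_path="./extractors",  # hypothetical path
    routes=[
        RoutesConfig(
            regexes=[r"example\.com"],
            extractors=[
                ExtractorConfig("old_layout", to=datetime(2020, 1, 1)),
                ExtractorConfig("new_layout", since=datetime(2020, 1, 1)),
            ],
        )
    ],
)

selected = matching_extractors(config, "https://example.com/post", datetime(2021, 5, 1))
```

With this sample configuration, a page from example.com crawled in 2021 selects only `new_layout`, while a URL that matches no route selects nothing.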