scitex_scholar.formatting

Citation formatting for plain paper dicts.

Provides BibTeX, RIS, EndNote, CSV, and text-style (APA, MLA, Chicago, Vancouver) formatters. Each formatter accepts a plain paper dict with the keys listed below; the module has no ORM or framework dependencies.

Standard paper dict keys:

title, authors_str, journal, year, doi, pmid, arxiv_id, url,
abstract, document_type, citation_count, impact_factor,
is_open_access, source, volume, number, pages, cite_key
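
For orientation, a paper dict with every standard key might look like the sketch below. All values are made-up placeholders, and the value types (for example `year` as an int, `pages` as a string) are assumptions, not something this reference specifies.

```python
# Illustrative placeholder values only; this is not a real publication.
paper = {
    "title": "An Example Paper Title",
    "authors_str": "Smith, J.; Doe, A.",  # separator style is an assumption
    "journal": "Journal of Examples",
    "year": 2024,
    "doi": "10.0000/example.doi",
    "pmid": "",
    "arxiv_id": "",
    "url": "https://example.org/paper",
    "abstract": "A short placeholder abstract.",
    "document_type": "article",
    "citation_count": 0,
    "impact_factor": None,
    "is_open_access": True,
    "source": "example",
    "volume": "1",
    "number": "2",
    "pages": "10-20",
    "cite_key": "smith2024",
}

# The key names below are copied from the list above.
STANDARD_KEYS = (
    "title", "authors_str", "journal", "year", "doi", "pmid", "arxiv_id",
    "url", "abstract", "document_type", "citation_count", "impact_factor",
    "is_open_access", "source", "volume", "number", "pages", "cite_key",
)

# Quick sanity check that no standard key is missing.
missing = [key for key in STANDARD_KEYS if key not in paper]
```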

scitex_scholar.formatting.clean_text(text)

Remove characters that break citation formats and normalise whitespace.

Return type:

str
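
The kind of cleaning described here can be sketched as follows. This is a hypothetical stand-in, not the library's implementation: which characters count as format-breaking is an assumption (braces and backslashes are used here), and `sketch_clean_text` is an illustrative name.

```python
import re


def sketch_clean_text(text):
    """Hypothetical stand-in for the cleaning step described above."""
    if not text:
        return ""
    # Assumption: braces and backslashes are the characters to remove.
    text = re.sub(r"[{}\\]", "", str(text))
    # Collapse runs of whitespace (spaces, tabs, newlines) into one space.
    return re.sub(r"\s+", " ", text).strip()


cleaned = sketch_clean_text("A  {Title}\n with \\breaks ")
```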

scitex_scholar.formatting.generate_cite_key(paper)

Generate a BibTeX citation key from a paper dict.

Return type:

str

scitex_scholar.formatting.paper_normalize(data)

Normalise a raw dict (e.g. API search result) to a standard paper dict.

Return type:

dict

scitex_scholar.formatting.paper_from_search_result(result)

Normalise a raw search-API result dict to the standard paper format.

Handles field aliases from different search engines (externalUrl, snippet, etc.) and fills missing fields with safe defaults.

Return type:

dict
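
Alias handling of this kind can be sketched with a small lookup table. The code below is an illustration only: the target fields chosen for `externalUrl` and `snippet`, the reduced set of defaults, and the name `sketch_from_search_result` are all assumptions.

```python
# Assumed alias targets; the reference names the aliases but not their targets.
ALIASES = {"externalUrl": "url", "snippet": "abstract"}

# A reduced set of standard keys with assumed "safe" defaults.
DEFAULTS = {"title": "", "authors_str": "", "url": "", "abstract": "", "year": ""}


def sketch_from_search_result(result):
    """Map aliased keys onto standard keys and fill gaps with defaults."""
    paper = dict(DEFAULTS)
    for key, value in result.items():
        target = ALIASES.get(key, key)
        # Unknown keys (e.g. a relevance score) are dropped in this sketch.
        if target in paper and value is not None:
            paper[target] = value
    return paper


normalized = sketch_from_search_result(
    {"title": "T", "externalUrl": "https://example.org", "snippet": "S", "score": 0.9}
)
```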

scitex_scholar.formatting.make_citation_key(last_name, year=None)

Generate a citation key from author last name and year.

Parameters:
  • last_name (str) – Author's last name; special characters are stripped.

  • year – Publication year (optional).

Return type:

str

Returns:

Citation key string, e.g. smith2024.
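
A minimal sketch of this behaviour, assuming that "special chars" means anything other than ASCII letters and digits and that the key is lowercased (inferred from the `smith2024` example). `sketch_citation_key` is an illustrative name, not the library function.

```python
import re


def sketch_citation_key(last_name, year=None):
    """Hypothetical re-creation of the key format shown above."""
    # Keep only ASCII letters and digits, then lowercase (both assumptions).
    base = re.sub(r"[^A-Za-z0-9]", "", last_name).lower()
    # Append the year only when one is given.
    return f"{base}{year}" if year is not None else base


key_with_year = sketch_citation_key("Smith", 2024)
key_without_year = sketch_citation_key("O'Brien")
```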

scitex_scholar.formatting.sanitize_filename(filename, max_length=50)

Sanitize a string for use as a download filename.

Replaces shell-unsafe characters with underscores, collapses whitespace, and truncates to max_length characters.

Return type:

str
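
The three steps above (replace, collapse, truncate) can be sketched as follows. The exact set of "shell-unsafe" characters is not documented here, so this sketch assumes that anything other than word characters, dots, hyphens, and whitespace is unsafe; `sketch_sanitize_filename` is an illustrative name.

```python
import re


def sketch_sanitize_filename(filename, max_length=50):
    """Hypothetical three-step sanitiser: replace, collapse, truncate."""
    # Assumption: everything except word chars, '.', '-', and whitespace is unsafe.
    safe = re.sub(r"[^\w.\-\s]", "_", filename)
    # Collapse whitespace runs into a single space and trim the ends.
    safe = re.sub(r"\s+", " ", safe).strip()
    # Truncate to at most max_length characters.
    return safe[:max_length]


sanitized = sketch_sanitize_filename("my paper: draft?.pdf")
```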

scitex_scholar.formatting.to_bibtex(paper)

Format a standard paper dict as a BibTeX entry.

Return type:

str
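
To show the general shape of such an entry, here is a simplified sketch. It is not the library's implementation: the `@article` entry type, the field order, and the fallback key `unknown` are assumptions, and `authors_str` is passed through unchanged although BibTeX normally expects authors joined by " and ".

```python
def sketch_to_bibtex(paper):
    """Hypothetical formatter producing an @article entry from a paper dict."""
    fields = [
        ("title", paper.get("title")),
        ("author", paper.get("authors_str")),
        ("journal", paper.get("journal")),
        ("year", paper.get("year")),
        ("volume", paper.get("volume")),
        ("number", paper.get("number")),
        ("pages", paper.get("pages")),
        ("doi", paper.get("doi")),
    ]
    # Skip empty fields and wrap each value in braces.
    body = ",\n".join(f"  {name} = {{{value}}}" for name, value in fields if value)
    cite_key = paper.get("cite_key", "unknown")
    return f"@article{{{cite_key},\n{body}\n}}"


bibtex_entry = sketch_to_bibtex({"cite_key": "smith2024", "title": "T", "year": 2024})
```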

scitex_scholar.formatting.to_ris(paper)

Format a standard paper dict as a RIS entry.

Return type:

str
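
A RIS record is a sequence of `TAG  - value` lines that starts with `TY` and ends with `ER`. The sketch below illustrates that layout under stated assumptions: the `JOUR` reference type and the key-to-tag mapping are choices made for this example, and `authors_str` is emitted as a single `AU` line although RIS usually carries one `AU` line per author.

```python
# Assumed mapping from paper dict keys to RIS tags.
RIS_TAGS = [
    ("TI", "title"), ("AU", "authors_str"), ("JO", "journal"),
    ("PY", "year"), ("VL", "volume"), ("IS", "number"),
    ("SP", "pages"), ("DO", "doi"), ("UR", "url"),
]


def sketch_to_ris(paper):
    """Hypothetical formatter producing a single RIS record."""
    lines = ["TY  - JOUR"]  # record type; JOUR is an assumption
    for tag, key in RIS_TAGS:
        value = paper.get(key)
        if value:
            lines.append(f"{tag}  - {value}")
    lines.append("ER  - ")  # end-of-record marker
    return "\n".join(lines)


ris_entry = sketch_to_ris({"title": "T", "year": 2024})
```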

scitex_scholar.formatting.to_endnote(paper)

Format a standard paper dict as an EndNote entry.

Return type:

str
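
This reference does not say which EndNote flavour is produced. Assuming the tagged (Refer-style) import format, where each line starts with a `%` tag, an entry could be sketched as below; the tag mapping and the name `sketch_to_endnote` are illustrative.

```python
# Assumed mapping from EndNote tagged-format tags to paper dict keys.
ENDNOTE_TAGS = [
    ("%T", "title"), ("%A", "authors_str"), ("%J", "journal"),
    ("%D", "year"), ("%V", "volume"), ("%N", "number"),
    ("%P", "pages"), ("%R", "doi"), ("%U", "url"),
]


def sketch_to_endnote(paper):
    """Hypothetical formatter for the EndNote tagged format."""
    lines = ["%0 Journal Article"]  # reference type; an assumption
    for tag, key in ENDNOTE_TAGS:
        value = paper.get(key)
        if value:
            lines.append(f"{tag} {value}")
    return "\n".join(lines)


endnote_entry = sketch_to_endnote({"title": "T", "year": 2024})
```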

scitex_scholar.formatting.to_csv_row(paper)

Format a standard paper dict as a CSV row dict.

Return type:

dict
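
Returning a row dict rather than a string pairs naturally with `csv.DictWriter` from the standard library, which handles quoting. The sketch below shows that pattern; the column selection and the name `sketch_to_csv_row` are assumptions.

```python
import csv
import io

# Assumed column set; the library's actual columns are not listed here.
COLUMNS = ("title", "authors_str", "journal", "year", "doi")


def sketch_to_csv_row(paper, columns=COLUMNS):
    """Hypothetical row builder: one dict entry per CSV column."""
    return {column: paper.get(column, "") for column in columns}


rows = [sketch_to_csv_row({"title": "Hello, World", "year": 2024})]

# DictWriter quotes fields that contain the delimiter.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(COLUMNS))
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
```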

scitex_scholar.formatting.to_text_citation(paper, style='apa', doc_type='article')

Format a paper dict as a text citation in the given style.

Parameters:
  • paper (dict) – Standard paper dict.

  • style (str) – One of apa, mla, chicago, vancouver.

  • doc_type (str) – One of article, dataset.

Returns:

Formatted citation string.

Return type:

str
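
Style selection of this kind is often a simple branch on the style name. The sketch below covers only two of the four styles, ignores `doc_type`, and uses heavily simplified templates that do not follow the full APA or Vancouver rules; raising `ValueError` for an unknown style is also an assumption.

```python
def sketch_text_citation(paper, style="apa"):
    """Hypothetical, heavily simplified citation templates."""
    authors = str(paper.get("authors_str", ""))
    title = paper.get("title", "")
    journal = paper.get("journal", "")
    year = paper.get("year", "")
    if style == "apa":
        return f"{authors} ({year}). {title}. {journal}."
    if style == "vancouver":
        # Drop a trailing period so the author list ends with exactly one.
        trimmed = authors.rstrip(".")
        return f"{trimmed}. {title}. {journal}. {year}."
    raise ValueError(f"unsupported style in this sketch: {style!r}")


example = {
    "authors_str": "Smith, J.",
    "title": "An Example",
    "journal": "Journal of Examples",
    "year": 2024,
}
apa = sketch_text_citation(example, "apa")
vancouver = sketch_text_citation(example, "vancouver")
```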

scitex_scholar.formatting.clean_bibtex_for_arxiv(bibtex_entry)

Clean a BibTeX entry for arXiv compatibility.

Converts biblatex fields to their standard BibTeX equivalents and removes fields that arXiv does not support (url, urldate, file, abstract).

Return type:

str
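
The clean-up described above can be sketched line by line. This illustration assumes one field per line (multi-line values would need a real BibTeX parser), and the rename table lists two common biblatex-to-BibTeX field pairs chosen for this example; the library's actual conversions are not documented here.

```python
import re

# Fields to drop, as listed in the description above.
DROP_FIELDS = ("url", "urldate", "file", "abstract")
# Assumed renames: biblatex field name -> classic BibTeX field name.
RENAME_FIELDS = {"journaltitle": "journal", "location": "address"}


def sketch_clean_for_arxiv(entry):
    """Hypothetical cleaner; assumes each field sits on its own line."""
    kept = []
    for line in entry.splitlines():
        match = re.match(r"\s*(\w+)\s*=", line)
        if match:
            name = match.group(1).lower()
            if name in DROP_FIELDS:
                continue  # skip unsupported fields entirely
            if name in RENAME_FIELDS:
                line = line.replace(match.group(1), RENAME_FIELDS[name], 1)
        kept.append(line)
    return "\n".join(kept)


raw_entry = (
    "@article{smith2024,\n"
    "  title = {T},\n"
    "  journaltitle = {J},\n"
    "  url = {https://example.org},\n"
    "  abstract = {A},\n"
    "}"
)
cleaned_entry = sketch_clean_for_arxiv(raw_entry)
```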

scitex_scholar.formatting.papers_to_format(papers, fmt)

Format a list of paper dicts as a single string in the format named by fmt.

Return type:

str
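
A batch formatter like this is commonly a dispatch from a format name to a per-paper formatter. The sketch below shows that pattern with a stand-in formatter; the accepted format names, the blank-line separator, and the error raised for an unknown name are all assumptions.

```python
def sketch_papers_to_format(papers, fmt, formatters):
    """Hypothetical dispatcher: pick a formatter by name, join the results."""
    try:
        formatter = formatters[fmt]
    except KeyError:
        raise ValueError(f"unknown format: {fmt!r}") from None
    # Separate entries with a blank line (an assumption).
    return "\n\n".join(formatter(paper) for paper in papers)


# A stand-in formatter so the sketch runs without the library.
formatters = {"plain": lambda paper: f"{paper['title']} ({paper['year']})"}
papers = [{"title": "A", "year": 2023}, {"title": "B", "year": 2024}]
combined = sketch_papers_to_format(papers, "plain", formatters)
```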