Coverage for src/autoencodix/utils/prompts.py: 100%

1 statements  

coverage.py v7.14.0, created at 2026-05-21 10:09 +0200

PROMPT = """
You are a bioinformatics expert.
I will provide a latent dimension name and a list of genes identified as top contributors to that dimension (derived using Captum). Your task is to analyze only the genes provided and derive biologically plausible interpretations of what the latent dimension may represent.
Please produce the following:
1. A concise explanation of the dominant biological themes represented by the provided gene set.
2. One to three mechanistic hypotheses describing potential biological processes, regulatory programs, or cellular states captured by this latent dimension.
3. A summary of pathways, processes, or molecular functions that may be implicated (using only GO, KEGG, or other standard pathway resources if accessible).
4. A TLDR summarizing the key biological insight.
Output Format:
Output ONLY a valid JSON object with the following keys. Do not include any other text, explanations, or markdown outside the JSON. Use double quotes around text blocks (strings) but never inside a text block. Ensure no trailing commas.
- "TLDR": a one-sentence high-level summary (string without quotes).
- "DETAILS": an object with these exact keys:
  - "dominant_themes": concise explanation of the dominant biological themes (string without quotes).
  - "hypotheses": array of 1-3 mechanistic hypotheses (array of strings without quotes).
  - "pathways_summary": summary of implicated pathways/processes/functions (string without quotes).
Constraints:
- Do not invent gene names or functions. Use only the genes provided.
- Base all interpretations strictly on the supplied gene list and standard biological knowledge resources.
Input genes:
{gene_block}
"""
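The template above has a single `{gene_block}` placeholder and asks for a fixed JSON schema in return. The following is a minimal sketch of how such a template might be filled and how a reply could be validated. It is not taken from the autoencodix code base: `PROMPT_TEMPLATE`, `build_prompt`, `parse_response`, and the one-gene-per-line layout of the gene block are all hypothetical and used only for illustration.

```python
import json

# Hypothetical shortened stand-in for the PROMPT template; only the
# {gene_block} placeholder matters for this sketch.
PROMPT_TEMPLATE = "Input genes:\n{gene_block}\n"

# Keys the prompt requests inside the "DETAILS" object.
REQUIRED_DETAIL_KEYS = {"dominant_themes", "hypotheses", "pathways_summary"}


def build_prompt(template: str, genes: list[str]) -> str:
    """Fill the {gene_block} placeholder (assumed layout: one gene per line)."""
    gene_block = "\n".join(genes)
    # str.format is safe here because the template contains no other braces.
    return template.format(gene_block=gene_block)


def parse_response(raw: str) -> dict:
    """Parse a model reply and check it against the schema the prompt requests."""
    data = json.loads(raw)  # raises json.JSONDecodeError if the reply is not JSON
    if not isinstance(data, dict):
        raise ValueError("reply is not a JSON object")
    if not isinstance(data.get("TLDR"), str):
        raise ValueError("'TLDR' must be a string")
    details = data.get("DETAILS")
    if not isinstance(details, dict) or set(details) != REQUIRED_DETAIL_KEYS:
        raise ValueError("'DETAILS' must hold exactly the three requested keys")
    for key in ("dominant_themes", "pathways_summary"):
        if not isinstance(details[key], str):
            raise ValueError(f"'{key}' must be a string")
    hypotheses = details["hypotheses"]
    if not isinstance(hypotheses, list) or not 1 <= len(hypotheses) <= 3:
        raise ValueError("'hypotheses' must be an array of 1-3 items")
    if not all(isinstance(h, str) for h in hypotheses):
        raise ValueError("every hypothesis must be a string")
    return data
```

Validating the reply on the caller's side is a hedge against the model ignoring the "Output ONLY a valid JSON object" instruction; a failed check can then trigger a retry instead of propagating malformed data.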