================================================================================
RING PERCEPTION / SSSR TEST CASES — HARDEST KNOWN EDGE CASES
================================================================================
Compiled from CDK/RDKit GitHub issues, published papers, and PubChem.
All SMILES verified against PubChem REST API where possible.

================================================================================
SECTION 1: PLATONIC / CAGE HYDROCARBONS (High Symmetry, Non-Unique SSSR)
================================================================================

1a. CUBANE (C8H8)
    SMILES: C12C3C4C1C5C4C3C25
    Atoms: 8  |  Bonds: 12  |  SSSR: 5 (Frèrejacque number: 12-8+1=5)
    WHY HARD: Oh symmetry. All 8 atoms are equivalent, but an SSSR can
    hold only 5 of the 6 equivalent square faces, because the 6th face is
    the sum (XOR) of the other five. Any choice of 5 breaks the molecular
    symmetry, so the SSSR is non-unique. The classic counterexample from
    Berger et al. (2004).
    FAILS: Any SSSR algorithm returns an arbitrary subset. The SMARTS
    ring-count primitive R<n> then reports 2 rings for the four atoms of
    the omitted face and 3 rings for the other four, although all atoms
    are equivalent. RDKit GetSymmSSSR returns all 6 faces as a workaround.
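    SKETCH: The linear dependence of the six cubane faces can be checked
    without any toolkit. The snippet below is a minimal illustration, not
    any library's algorithm: it labels the cube vertices 0..7 (an
    arbitrary labelling chosen here), encodes each face as a bitmask over
    the 12 edges, and computes the rank over GF(2).

```python
from itertools import combinations

# Cube graph: vertices are 3-bit labels; an edge joins labels differing in one bit.
edges = [(u, v) for u, v in combinations(range(8), 2)
         if bin(u ^ v).count("1") == 1]
edge_index = {e: i for i, e in enumerate(edges)}

def face_mask(bit, value):
    """Bitmask of the 4 edges of the face where `bit` of the label equals `value`."""
    verts = [v for v in range(8) if (v >> bit) & 1 == value]
    mask = 0
    for pair in combinations(verts, 2):
        if pair in edge_index:
            mask |= 1 << edge_index[pair]
    return mask

faces = [face_mask(bit, value) for bit in range(3) for value in (0, 1)]

def gf2_rank(vectors):
    """Rank over GF(2), using an XOR basis keyed by the highest set bit."""
    basis = {}
    for vec in vectors:
        while vec:
            top = vec.bit_length() - 1
            if top not in basis:
                basis[top] = vec
                break
            vec ^= basis[top]
    return len(basis)

print(len(edges))        # 12 bonds
print(gf2_rank(faces))   # 5: six faces, one linear dependency
```

    Every subset of five faces has the same rank as all six, which is why
    no choice of five is preferred over another.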

1b. PRISMANE (C6H6)
    SMILES: C12C3C1C4C2C34
    Atoms: 6  |  Bonds: 9  |  SSSR: 4 (9-6+1=4)
    WHY HARD: D3h symmetry. Has 5 faces (2 triangular + 3 square) but
    SSSR is only 4. An SSSR keeps both triangles and two of the three
    squares, so dropping the third square breaks the threefold symmetry.
    FAILS: Same non-unique SSSR problem as cubane.

1c. ADAMANTANE (C10H16)
    SMILES: C1C2CC3CC1CC(C2)C3
    Atoms: 10  |  Bonds: 12 (heavy-atom bonds)  |  SSSR: 3 (12-10+1=3)
    WHY HARD: Td symmetry cage. Has 4 equivalent 6-membered chair rings
    but only 3 in SSSR. Classic non-unique SSSR example.
    FAILS: SMARTS Rn primitives give asymmetric results on symmetric atoms.
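    SKETCH: The asymmetry in per-atom ring counts can be reproduced with
    a hand-built model of adamantane. The atom labels (B0..B3 for the CH
    bridgeheads, Mij for the CH2 between bridgeheads i and j) are
    illustrative names introduced here, not toolkit output.

```python
from itertools import combinations

bridgeheads = ["B0", "B1", "B2", "B3"]
# One CH2 group sits between every pair of bridgeheads (6 in total).
methylenes = {pair: "M%d%d" % pair for pair in combinations(range(4), 2)}

def chair_ring(omitted):
    """The six-membered ring built from the three bridgeheads other than `omitted`."""
    kept = [i for i in range(4) if i != omitted]
    atoms = {bridgeheads[i] for i in kept}
    atoms |= {methylenes[pair] for pair in combinations(kept, 2)}
    return frozenset(atoms)

rings = [chair_ring(k) for k in range(4)]

def membership(chosen):
    """Number of chosen rings that contain each atom."""
    counts = {}
    for ring in chosen:
        for atom in ring:
            counts[atom] = counts.get(atom, 0) + 1
    return counts

# Every possible 3-ring SSSR gives the bridgeheads unequal ring counts.
for chosen in combinations(rings, 3):
    counts = membership(chosen)
    print(sorted(counts[b] for b in bridgeheads))   # [2, 2, 2, 3]

# Using all four rings restores the symmetry.
print(sorted(membership(rings)[b] for b in bridgeheads))   # [3, 3, 3, 3]
```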

1d. DODECAHEDRANE (C20H20)
    SMILES: C12C3C4C5C1C6C7C2C8C3C9C4C1C5C6C2C7C8C9C12
    Atoms: 20  |  Bonds: 30  |  SSSR: 11 (30-20+1=11)
    WHY HARD: Ih symmetry (icosahedral). Has 12 equivalent pentagonal
    faces but SSSR is only 11. Very high symmetry makes ring selection
    completely arbitrary. The SMILES needs 11 ring closures (22
    ring-closure digits, with labels 1 and 2 reused), which tests parser
    limits.
    FAILS: Non-unique SSSR. Expensive to compute all rings.
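    SKETCH: The Atoms / Bonds / SSSR lines in this file can be
    cross-checked straight from a SMILES string. The counter below is a
    deliberately small sketch, assuming plain organic-subset SMILES
    (bracket atoms are counted as one atom, hydrogens are ignored, and
    ring closures are assumed not to span a "." separator). The function
    names are illustrative, not from any toolkit.

```python
import re

TOKEN = re.compile(
    r"\[[^\]]*\]|Cl|Br|[BCNOPSFI]|[bcnops]|%\d\d|\d|[()\.]|[-=#:/\\]"
)
BOND_SYMBOLS = set("-=#:/\\")

def count_smiles(smiles):
    """Return (heavy atoms, bonds, fragments) for a SMILES string."""
    atoms = bonds = fragments = 0
    prev = None            # index of the atom the next atom bonds to
    stack = []             # saved `prev` values for open branches
    open_labels = set()    # ring-closure labels waiting for a partner
    pos = 0
    while pos < len(smiles):
        match = TOKEN.match(smiles, pos)
        if match is None:
            raise ValueError("unsupported SMILES at position %d" % pos)
        tok = match.group()
        pos = match.end()
        if tok == "(":
            stack.append(prev)
        elif tok == ")":
            prev = stack.pop()
        elif tok == ".":
            prev = None
        elif tok[0] == "%" or tok.isdigit():
            label = tok.lstrip("%")
            if label in open_labels:      # second occurrence closes the ring
                open_labels.remove(label)
                bonds += 1
            else:
                open_labels.add(label)
        elif tok in BOND_SYMBOLS:
            continue                      # bond order does not change the count
        else:                             # an atom token
            if prev is None:
                fragments += 1
            else:
                bonds += 1
            prev = atoms
            atoms += 1
    return atoms, bonds, fragments

def cyclomatic(smiles):
    """Cyclomatic (Frerejacque) number: bonds - atoms + fragments."""
    atoms, bonds, fragments = count_smiles(smiles)
    return bonds - atoms + fragments

dodecahedrane = "C12C3C4C5C1C6C7C2C8C3C9C4C1C5C6C2C7C8C9C12"
print(count_smiles(dodecahedrane))   # (20, 30, 1)
print(cyclomatic(dodecahedrane))     # 11
```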

1e. BUCKMINSTERFULLERENE C60
    SMILES: C12=C3C4=C5C6=C1C7=C8C9=C1C%10=C%11C(=C29)C3=C2C3=C4C4=C5C5=C9C6=C7C6=C7C8=C1C1=C8C%10=C%10C%11=C2C2=C3C3=C4C4=C5C5=C%11C%12=C(C6=C95)C7=C1C1=C%12C5=C%11C4=C3C3=C5C(=C81)C%10=C23
    Atoms: 60  |  Bonds: 90  |  SSSR: 31 (90-60+1=31)
    WHY HARD: Ih symmetry. 12 pentagons + 20 hexagons = 32 faces, but
    SSSR is 31. An SSSR keeps all 12 pentagons and drops one of the 20
    equivalent hexagons arbitrarily. The number of simple cycles is
    astronomical, so computing ALL rings is infeasible (exponential
    blowup).
    CDK paper (May & Steinbeck 2014) notes "the number of cycles can be
    very large and infeasible to compute for fullerene-like structures."
    FAILS: AllRingsFinder in CDK can time out. Old SSSRFinder was
    unstable. Ring bond count for atoms is non-trivial.
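    SKETCH: For the polyhedral cages in this section, the "one face must
    be omitted" rule follows from Euler's formula V - E + F = 2. The
    helper names below are illustrative.

```python
def polyhedral_faces(vertices, edges):
    """Face count of a convex polyhedron from Euler's formula V - E + F = 2."""
    return edges - vertices + 2

def cyclomatic_number(vertices, edges, components=1):
    """Size of any cycle basis (and hence of the SSSR)."""
    return edges - vertices + components

cages = {"cubane": (8, 12), "dodecahedrane": (20, 30), "C60": (60, 90)}
for name, (v, e) in cages.items():
    faces = polyhedral_faces(v, e)
    sssr = cyclomatic_number(v, e)
    # The SSSR size is always exactly one less than the number of faces.
    print(name, faces, sssr)
# cubane 6 5
# dodecahedrane 12 11
# C60 32 31
```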

1f. BASKETANE (C10H12)
    SMILES: C1CC2C3C4C1C5C2C3C45
    Atoms: 10  |  Bonds: 14  |  SSSR: 5 (14-10+1=5)
    WHY HARD: Pentacyclic cage topology. The SMILES needs 5 ring
    closures for only 10 atoms.

1g. PAGODANE (C20H20)
    SMILES: C1C9C4C5C1C6C2CCC3C2C7CC3C8(C4CC5C678)C9
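    Atoms: 20  |  Bonds: 30  |  SSSR: 11 (30-20+1=11)
    WHY HARD: D2h symmetry pagoda-shaped cage (undecacyclic). Complex
    ring-closure pattern in SMILES.
    (Note: The SMILES above encodes only 28 bonds, i.e. C20H24 with
    cyclomatic number 9. Re-verify it before use.)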

================================================================================
SECTION 2: FENESTRANE (4 Rings Sharing One Atom)
================================================================================

2a. [4.4.4.4]FENESTRANE (C9H12)
    SMILES: C1C2CC3CC4CC1C234
    Atoms: 9  |  Bonds: 12  |  SSSR: 4 (12-9+1=4)
    WHY HARD: Central quaternary carbon is shared by ALL four rings.
    Potential for planar tetracoordinate carbon. The shared atom creates
    unusual ring membership counts (atom in 4 rings). Stresses algorithms
    that use ring membership as an invariant.
    (Note: The [4.4.4.4]fenestrane is hypothetical/highly strained.)

2b. [4.4.4.5]FENESTRANE (synthesized)
    SMILES: C1C2CC3CC4CCC1C234
    Atoms: 10  |  Bonds: 13  |  SSSR: 4 (13-10+1=4)
    WHY HARD: Same shared-atom problem as above, but actually synthesized.
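    SKETCH: The "atom in 4 rings" count for [4.4.4.4]fenestrane can be
    confirmed on a hand-built graph. The labels (X for the central
    carbon, A-D for its CH neighbours, m1-m4 for the perimeter CH2
    groups) are names chosen for this illustration.

```python
from itertools import combinations

bonds = [("X", "A"), ("X", "B"), ("X", "C"), ("X", "D"),
         ("A", "m1"), ("m1", "B"), ("B", "m2"), ("m2", "C"),
         ("C", "m3"), ("m3", "D"), ("D", "m4"), ("m4", "A")]

adj = {}
for u, v in bonds:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

def four_rings_through(atom):
    """4-membered rings atom-p-q-r: p and r are neighbours of `atom`
    and q is a neighbour they share besides `atom` itself."""
    rings = set()
    for p, r in combinations(sorted(adj[atom]), 2):
        for q in (adj[p] & adj[r]) - {atom}:
            rings.add(frozenset((atom, p, q, r)))
    return rings

print(len(bonds) - len(adj) + 1)          # 4 (cyclomatic number)
print(len(four_rings_through("X")))       # 4: the centre is in every ring
print(len(four_rings_through("A")))       # 2
print(len(four_rings_through("m1")))      # 1
```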

================================================================================
SECTION 3: LARGE FUSED POLYCYCLIC AROMATICS (Many Fused Rings)
================================================================================

3a. CORONENE (C24H12) — 7 fused rings
    SMILES: C1=CC2=C3C4=C1C=CC5=C4C6=C(C=C5)C=CC7=C6C3=C(C=C2)C=C7
    Atoms: 24  |  Bonds: 30  |  SSSR: 7 (30-24+1=7)
    WHY HARD: D6h symmetry. Baseline fused aromatic test.

3b. CORANNULENE (C20H10) — bowl-shaped [5]circulene
    SMILES: C1=CC2=C3C4=C1C=CC5=C4C6=C(C=C5)C=CC(=C36)C=C2
    Atoms: 20  |  Bonds: 25  |  SSSR: 6 (25-20+1=6)
    WHY HARD: Non-planar bowl shape. Central 5-membered ring + 5 fused
    6-membered rings. Curved pi-system.

3c. KEKULENE (C48H24) — [12]circulene, 12 fused benzene rings
    SMILES: C1=CC2=CC3=C4C=C2C5=CC6=C(C=CC7=CC8=C(C=C76)C9=CC2=C(C=CC6=C2C=C2C(=C6)C=CC6=C2C=C4C(=C6)C=C3)C=C9C=C8)C=C51
    Atoms: 48  |  Bonds: 60  |  SSSR: 13 (60-48+1=13)
    WHY HARD: 12 fused benzene rings enclosing a central cavity. The SSSR
    holds the 12 six-membered rings plus the inner 18-membered cycle. The
    outer 30-membered perimeter is the sum of all 13 and is therefore
    excluded. Classic test for ring perception because the "obvious"
    chemical rings (twelve benzenes) fall one short of the MCB.
    FAILS: Algorithms may miss the inner macrocycle or report a
    chemically unintuitive ring set.

3d. CIRCUMCORONENE (C54H18) — larger coronene analog
    SMILES: C1=CC2=CC3=C4C5=C6C7=C(C=CC8=CC9=C%10C%11=C(C=C9)C=C9C=CC%12=C%13C9=C%11C9=C%11C%13=C%13C(=C%12)C=CC%12=C%13C%13=C%11C(=C4C2=C%13C1=C%12)C6=C9C%10=C87)C=C5C=C3
    Atoms: 54  |  Bonds: 72  |  SSSR: 19 (72-54+1=19)
    WHY HARD: 19 fused six-membered rings, all of them in the SSSR. Very
    large aromatic system. High ring-closure label count (%10, %11, %12,
    %13) in SMILES stresses parsers.

3e. [6]HELICENE (C26H16) — helical fused benzene rings
    SMILES: C1=CC=C2C(=C1)C=CC3=C2C4=C(C=C3)C=CC5=C4C6=CC=CC=C6C=C5
    Atoms: 26  |  Bonds: 31  |  SSSR: 6 (31-26+1=6)
    WHY HARD: Non-planar helical topology. Tests 3D vs 2D assumptions in
    ring algorithms.

3f. [14]HELICENE (C58H32) — very large helicene
    SMILES: C1=CC=C2C(=C1)C=CC3=C2C4=C(C=CC5=C4C6=C(C=C5)C=CC7=C6C8=C(C=C7)C=CC9=C8C1=C(C=CC2=C1C1=C(C=C2)C=CC2=C1C1=CC=CC=C1C=C2)C=C9)C=C3
    Atoms: 58  |  Bonds: 71  |  SSSR: 14 (71-58+1=14)
    WHY HARD: Very long helical chain of 14 ortho-fused 6-membered rings.
    Non-planar. Large SMILES string.
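    SKETCH: For the hydrocarbons in this section, the bond count and SSSR
    size follow from the molecular formula alone. This is a sketch under
    the assumption that every carbon is sp2 with exactly three sigma
    neighbours (C or H). The function name is illustrative.

```python
def benzenoid_counts(n_carbon, n_hydrogen):
    """Return (C-C bonds, cyclomatic number) for an all-sp2 hydrocarbon."""
    sigma_ends = 3 * n_carbon             # each carbon has three sigma neighbours
    cc_ends = sigma_ends - n_hydrogen     # every C-H bond uses one of them
    if cc_ends % 2:
        raise ValueError("formula is inconsistent with an all-sp2 skeleton")
    bonds = cc_ends // 2                  # each C-C bond was counted from both ends
    return bonds, bonds - n_carbon + 1

print(benzenoid_counts(24, 12))   # coronene     -> (30, 7)
print(benzenoid_counts(26, 16))   # [6]helicene  -> (31, 6)
print(benzenoid_counts(20, 10))   # corannulene  -> (25, 6)
```

    The same function applies to any entry above once its formula is
    known, which makes it a quick sanity check on hand-entered counts.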

================================================================================
SECTION 4: NATURAL PRODUCTS WITH COMPLEX RING SYSTEMS
================================================================================

4a. STRYCHNINE (C21H22N2O2) — 7 fused rings
    SMILES: C1CN2CC3=CCOC4CC(=O)N5C6C4C3CC2C61C7=CC=CC=C75
    Atoms: 25 (heavy)  |  Bonds: 31  |  SSSR: 7 (31-25+1=7)
    WHY HARD: 7 fused rings of varying sizes (5,5,6,6,6,6,7) including
    bridged and spiro-fused systems. Heterocyclic (N, O). Classic
    benchmark for total synthesis and ring perception alike.

4b. RESERPINE (C33H40N2O9): 5 fused rings + 1 pendant aryl ring
    SMILES: COC1C(CC2CN3CCC4=C(C3CC2C1C(=O)OC)NC5=C4C=CC(OC)=C5)OC(=O)C6=CC(OC)=C(OC)C(OC)=C6
    Atoms: 44 (heavy)  |  SSSR: 6
    WHY HARD: Pentacyclic fused core (indole + three further rings, one
    of them containing the tertiary amine) plus a separate
    trimethoxybenzoate ring. Multiple heteroatoms.

4c. PACLITAXEL / TAXOL (C47H51NO14) — tetracyclic taxane skeleton
    SMILES: CC1=C2C(C(=O)C3(C(CC4C(C3C(C(C2(C)C)(CC1OC(=O)C(C(C5=CC=CC=C5)NC(=O)C6=CC=CC=C6)O)O)OC(=O)C7=CC=CC=C7)(CO4)OC(=O)C)O)C)OC(=O)C
    Atoms: 62 (heavy)  |  SSSR: 7 (4 core taxane rings + 3 phenyl)
    WHY HARD: Large molecule. Multiple ester side chains with phenyl
    rings. Bridged 6/8 bicyclic core plus a fused oxetane (4-membered)
    ring. Complex stereochemistry.

4d. VANCOMYCIN (C66H75Cl2N9O24) — glycopeptide with cross-linked rings
    SMILES: CC1C(C(CC(O1)OC2C(C(C(OC2OC3=C4C=C5C=C3OC6=C(C=C(C=C6)C(C(C(=O)NC(C(=O)NC5C(=O)NC7C8=CC(=C(C=C8)O)C9=C(C=C(C=C9O)O)C(NC(=O)C(C(C1=CC(=C(O4)C=C1)Cl)O)NC7=O)C(=O)O)CC(=O)N)NC(=O)C(CC(C)C)NC)O)Cl)CO)O)O)(C)N)O
    Atoms: 101 (heavy; 176 including H)  |  SSSR: 10 (5 aromatic +
    3 macrocyclic + 2 sugar rings)
    WHY HARD: Extremely large. Multiple macrocyclic rings formed by
    cross-links between amino acid side chains. Biaryl ether linkages
    form additional rings. Sugar moieties. One of the hardest natural
    products for ring perception, because the macrocyclic rings span
    many atoms.
    FAILS: Ring finders may miss the large macrocyclic rings or
    be very slow computing all cycles.
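    SKETCH: Heavy-atom counts for the entries in this section can be
    derived from the molecular formula rather than entered by hand. A
    minimal parser, assuming simple Hill-style formulas without
    parentheses, charges, or isotopes (the function name is illustrative):

```python
import re

def heavy_atom_count(formula):
    """Count all non-hydrogen atoms in a formula such as 'C33H40N2O9'."""
    total = 0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        if element != "H":
            total += int(count) if count else 1   # a missing count means 1
    return total

print(heavy_atom_count("C33H40N2O9"))    # reserpine  -> 44
print(heavy_atom_count("C47H51NO14"))    # paclitaxel -> 62
```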

================================================================================
SECTION 5: DOCUMENTED RDKit RING PERCEPTION FAILURES
================================================================================

5a. RDKit Issue #5055 — Hash collision in ring perception
    SMILES: CC1(C)NC(=O)CN2C=C(C[C@H](C(=O)NC)NC(=O)CN3CCN(C(=O)[C@H]4Cc5c([nH]c6ccccc56)CN4C(=O)CN4CN(c5ccccc5)C5(CCN(CC5)C1=O)C4=O)[C@@H](Cc1ccccc1)C3=O)[N-][NH2+]2
    WHY HARD: Complex macrocyclic peptide-like structure. A hash collision
    in ring invariants caused the algorithm to fail to recognize a phenyl
    sidechain as a ring. Error: "non-ring atom 45 marked aromatic."
    FAILS: RDKit MolFromSmiles returned None. Fixed in later release.

5b. RDKit Issue #4266 — Multi-fragment SSSR failure
    SMILES: c1ccccc1.C123C45C16C21C34C561
    WHY HARD: Benzene (aromatic) plus an octahedral C6 cage (6 atoms,
    12 bonds, every atom bonded to four others) as separate fragments in
    one SMILES. The fallback ring-finding algorithm could not handle
    multi-fragment molecules with aromatic rings.
    FAILS: RDKit MolFromSmiles returned None. Warning: "could not find
    number of expected rings."
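    SKETCH: For multi-fragment input the expected ring count is
    bonds - atoms + fragments, not bonds - atoms + 1. The sketch below
    counts fragments with a small union-find. It is an illustration of
    the formula only, not RDKit's implementation, and the example graph
    (benzene plus a separate three-membered ring) is hypothetical.

```python
def cyclomatic_number(n_atoms, bonds):
    """bonds - atoms + connected components, for bonds given as index pairs."""
    parent = list(range(n_atoms))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    components = n_atoms
    for u, v in bonds:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:                # bond joins two fragments
            parent[root_u] = root_v
            components -= 1
    return len(bonds) - n_atoms + components

six_ring = [(i, (i + 1) % 6) for i in range(6)]   # atoms 0..5
three_ring = [(6, 7), (7, 8), (8, 6)]             # atoms 6..8, separate fragment
print(cyclomatic_number(9, six_ring + three_ring))   # 2
```

    Applying the single-fragment formula to the same input would give
    9 - 9 + 1 = 1, one ring short.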

5c. RDKit Issue #6915 — Cucurbituril barrel missing cap ring
    SMILES: C1N2C3C4N(C2=O)CN5C6C7N(C5=O)CN8C9C2N(C8=O)CN5C8C%10N(C5=O)CN5C%11C%12N(C5=O)CN5C%13C(N1C5=O)N1CN3C(=O)N4CN6C(=O)N7CN9C(=O)N2CN8C(=O)N%10CN%11C(=O)N%12CN%13C1=O
    Atoms: 72 (heavy)  |  Bonds: 90  |  Formula: C36H36N24O12
    SSSR expected: 19 (90-72+1=19, including barrel-cap macrocycle of
    24+ atoms)
    WHY HARD: Barrel-shaped molecule (cucurbit[6]uril). GetSSSR found
    only the small wall rings (12 five-membered + 6 eight-membered) but
    MISSED the large macrocyclic ring forming the barrel cap.
    FAILS: RDKit GetSSSR. The large macrocycle is part of the true MCB
    but was not reported.

5d. RDKit Issue #7482 — GetSymmSSSR wrong count on adamantane-like
    SMILES: C1CC2CCCC3CCC(C1)CCC(CC2)CC3
    Atoms: 18  |  Expected SSSR: 3 (but user expected 4)
    WHY HARD: Adamantane-like topology with rings of sizes 9, 10, 10, 11.
    Demonstrates how SSSR ring count (bonds-atoms+1) can be
    counterintuitive — looks like 4 rings visually but only 3 are
    independent.

5e. RDKit Issue #4144 — EnumerateStereoisomers corrupts ring info
    WHY HARD: EnumerateStereoisomers clears ring info, then
    FastFindRings produces incomplete set. GetSymmSSSR then reuses
    stale data. Not a specific molecule — affects any polycyclic
    molecule processed through stereoisomer enumeration.

================================================================================
SECTION 6: MECHANICALLY INTERLOCKED & TOPOLOGICAL MOLECULES
================================================================================

NOTE: Standard SMILES cannot represent mechanical/topological bonds
(catenanes, rotaxanes, knots). These are listed for completeness as
they represent the ultimate ring perception challenge. Extensions like
BigSMILES (Sudhakar & Olsen, 2026) are being developed.

6a. [2]CATENANE — two interlocked macrocycles
    Cannot be represented in SMILES (no mechanical bond notation).
    The graph is two disconnected cycles, but they are physically linked.
    Any ring algorithm operating on the molecular graph will see two
    separate rings, missing the topological entanglement.

6b. TREFOIL KNOT — molecular knot with 3 crossings
    All-benzene trefoil knot synthesized by Segawa et al. (Science, 2019).
    No standard SMILES — the topology is encoded in 3D coordinates only.

6c. BORROMEAN RINGS — three mutually interlocked rings
    Synthesized by Stoddart group. No pairwise links but collectively
    linked. Cannot be captured by any graph-based ring algorithm.

================================================================================
SECTION 7: CYCLOPHANE-LIKE STRESS-TEST STRUCTURES
================================================================================

These are the structures noted in the CDK ring perception paper (May &
Steinbeck, J. Cheminformatics 2014) and Unique Ring Families paper
(Kolodzik et al., JCIM 2012) as worst-case inputs.

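7a. [2.2]PARACYCLOPHANE (simplest)
    SMILES: C1CC2=CC=C(CCC3=CC=C1C=C3)C=C2
    Atoms: 16  |  Bonds: 18  |  SSSR: 3 (18-16+1=3)
    WHY HARD: Macrocyclic ring + aromatic rings create a non-unique SSSR:
    four equivalent 12-membered macrocycles compete for the single
    non-benzene slot.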

7b. MULTIPLY-BRIDGED CYCLOPHANE — n para-bridged 6-rings in a macrocycle
    For a cyclophane with n para-bridged benzene rings:
    - Cyclomatic number = n + 1
    - Number of relevant cycles = 2^n + n
    - Number of URFs = n + 1
    Each benzene ring offers two equivalent paths around the macrocycle,
    so relevant cycles grow EXPONENTIALLY with n, making exhaustive ring
    enumeration expensive for large n, while the URF count stays linear.
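    SKETCH: The cycle counts can be checked by brute force on a toy
    graph. The model below is an assumption made for illustration: n
    six-membered rings joined para-to-para through single bridging atoms.
    In this graph every simple cycle is either a six-membered ring or a
    shortest macrocycle, so for n >= 2 counting simple cycles gives the
    relevant-cycle count, which comes out as 2^n + n.

```python
def cyclophane(n):
    """Adjacency lists: ring i uses atoms 6i..6i+5; bridge i is atom 6n+i."""
    adj = [[] for _ in range(7 * n)]

    def bond(u, v):
        adj[u].append(v)
        adj[v].append(u)

    for i in range(n):
        base = 6 * i
        for k in range(6):
            bond(base + k, base + (k + 1) % 6)
        bridge = 6 * n + i
        bond(base + 3, bridge)                 # para position of ring i
        bond(bridge, 6 * ((i + 1) % n))        # position 0 of the next ring
    return adj

def count_simple_cycles(adj):
    """Count simple cycles; each is found from its lowest-numbered atom."""
    total = 0

    def extend(start, v, on_path):
        nonlocal total
        for w in adj[v]:
            if w == start:
                if len(on_path) >= 3:
                    total += 1
            elif w > start and w not in on_path:
                on_path.add(w)
                extend(start, w, on_path)
                on_path.discard(w)

    for start in range(len(adj)):
        extend(start, start, {start})
    return total // 2                          # both directions were counted

for n in (2, 3, 4):
    adj = cyclophane(n)
    n_bonds = sum(len(nbrs) for nbrs in adj) // 2
    print(n, n_bonds - len(adj) + 1, count_simple_cycles(adj))
# 2 3 6
# 3 4 11
# 4 5 20
```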

================================================================================
SECTION 8: BIPHENYLENE AND RELATED NON-UNIQUE SSSR CASES
================================================================================

8a. BIPHENYLENE (C12H8) — fused 6-4-6 ring system
    SMILES: C1=CC2=C3C=CC=CC3=C2C=C1
    Atoms: 12  |  Bonds: 14  |  SSSR: 3 (14-12+1=3)
    WHY HARD: Two 6-membered rings fused to a central 4-membered ring.
    The SSSR (6, 4, 6) is unique, but the 12-membered outer perimeter and
    the two 8-membered 6+4 envelopes are also valid cycles, so all-ring
    enumeration reports 6 cycles where the SSSR reports 3.
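    SKETCH: The outer perimeter is not an independent ring: it is the
    XOR (symmetric difference of bond sets) of the three SSSR rings. The
    atom indices below follow the order of atoms in the SMILES above
    (0-based), and the helper name is illustrative.

```python
def ring_edges(atoms):
    """Bond set of a ring given its atoms in cyclic order."""
    return {frozenset((atoms[i], atoms[(i + 1) % len(atoms)]))
            for i in range(len(atoms))}

six_a = ring_edges([0, 1, 2, 9, 10, 11])
four = ring_edges([2, 3, 8, 9])
six_b = ring_edges([3, 4, 5, 6, 7, 8])

# Shared bonds (2-9 and 3-8) cancel; what remains is the outer envelope.
perimeter = six_a ^ four ^ six_b
print(len(perimeter))        # 12
print(len(six_a ^ four))     # 8: the 6+4 envelope
```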

8b. NAPHTHALENE (C10H8)
    SMILES: c1ccc2ccccc2c1
    Atoms: 10  |  SSSR: 2
    WHY HARD: Simple case but demonstrates: SSSR contains both 6-rings,
    but the 10-membered outer envelope is also a valid ring. In aromatic
    SMILES, correct kekulization depends on correct ring perception.

================================================================================
SECTION 9: SUMMARY OF KEY BENCHMARK PAPERS
================================================================================

1. Berger, Flamm, Gleiss, Leydold, Stadler (2004)
   "Counterexamples in Chemical Ring Perception"
   J. Chem. Inf. Comput. Sci. 44(2):323-31
   DOI: 10.1021/ci030405d
   KEY FINDING: Many published claims about ring sets are WRONG. Provides
   counterexamples using real molecules (cubane, etc.).

2. May & Steinbeck (2014)
   "Efficient ring perception for the Chemistry Development Kit"
   J. Cheminformatics 6:3
   DOI: 10.1186/1758-2946-6-3
   KEY FINDING: Old CDK SSSRFinder was unstable. New algorithms much
   faster. Fullerene-like and cyclophane-like structures are worst cases.

3. Kolodzik, Urbaczek, Rarey (2012)
   "Unique Ring Families: A Chemically Meaningful Description of
   Molecular Ring Topologies"
   J. Chem. Inf. Model. 52(8):2013-2021
   KEY FINDING: URFs combine uniqueness + chemical meaning + efficiency.
   SSSR is neither unique nor chemically meaningful.

4. Flachsenberg, Andresen, Rarey (2017)
   "RingDecomposerLib: An Open-Source Implementation of Unique Ring
   Families and Other Cycle Bases"
   J. Chem. Inf. Model. 57(4):627-636
   DOI: 10.1021/acs.jcim.6b00736
   KEY FINDING: First open-source URF implementation. Benchmarked on
   entire PubChem. URFs should replace SSSR as the standard.

5. Figueras (1996)
   "Ring Perception Using Breadth-First Search"
   J. Chem. Inf. Comput. Sci. 36(5):986-991
   KEY FINDING: BFS-based ring perception. Faster than earlier DFS methods.

6. OpenEye "SSSR Considered Harmful"
   https://docs.eyesopen.com/toolkits/python/oechemtk/ring.html
   KEY FINDING: OpenEye intentionally does NOT implement SSSR. All ring-
   related queries use topological ring membership instead.

================================================================================
SECTION 10: QUICK-REFERENCE TABLE
================================================================================

Molecule              | Atoms | SSSR | Symmetry | Key Difficulty
----------------------|-------|------|----------|---------------------------
Prismane              |   6   |  4   | D3h      | Non-unique SSSR
Cubane                |   8   |  5   | Oh       | Non-unique SSSR (classic)
[4.4.4.4]Fenestrane   |   9   |  4   | -        | 4 rings sharing 1 atom
Adamantane            |  10   |  3   | Td       | Non-unique SSSR
Basketane             |  10   |  5   | -        | Cage topology
Corannulene           |  20   |  6   | C5v      | Bowl-shaped, non-planar
Dodecahedrane         |  20   |  11  | Ih       | Non-unique SSSR, high ring-closures
Pagodane              |  20   |  11  | D2h      | Complex cage
Strychnine            |  25   |  7   | -        | 7 heterogeneous fused rings
[6]Helicene           |  26   |  6   | C2       | Helical, non-planar
Reserpine             |  44   |  6   | -        | Fused heterocyclic + pendant aryl
Kekulene              |  48   |  13  | D6h      | Inner 18-ring + 12 fused 6-rings
Circumcoronene        |  54   |  19  | D6h      | 19 fused rings, large SMILES
[14]Helicene          |  58   |  14  | C2       | Very large helicene
C60 Fullerene         |  60   |  31  | Ih       | 32 faces, SSSR=31, AllRings infeasible
Taxol                 |  62   |  7   | -        | Bridged core, oxetane, large side chains
CB6 Cucurbituril      |  72   |  19  | D6h      | Barrel; SSSR misses cap macrocycle
Vancomycin            | 101   |  10  | -        | Multiple macrocyclic + aromatic rings

================================================================================
END OF FILE
================================================================================
