The 12 Faces of Autism

A Gene-Phenotype Atlas

Research Data Only - Not For Clinical Use

This resource contains literature-derived associations, NOT clinical data. It cannot provide penetrance estimates, severity predictions, or diagnostic guidance. Absence of a gene-phenotype link means "not reported in literature," not "clinically absent." See Methods → Critical Limitations before interpreting any data.

Autism is not one condition. It's twelve.

AI analysis of 1,000+ papers reveals distinct genetic subtypes with unique symptom profiles

Evidence: Strong
Genes Analyzed: 241
Distinct Subtypes: 12
Confidence: 1.00
Entropy: 0.00

What We Did: A Plain-Language Summary

Purpose

Autism spectrum disorder (ASD) is highly heterogeneous: no two individuals present in exactly the same way. We wanted to see whether distinct subtypes of autism could be identified from the symptoms that tend to be reported together in people with variants in specific genes.

Methods

We used AI to read 1,010 scientific papers about genes in the SFARI autism database and extract which symptoms (such as seizures, language delay, or intellectual disability) were reported for each gene. We then applied a clustering algorithm to the symptom profiles of the 241 genes that had sufficient data, out of the 556 genes covered.
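To make the clustering step concrete, here is a minimal sketch in Python. The source does not name the clustering algorithm or the distance measure, so the choices below (Jaccard distance on sets of reported traits, average-linkage agglomerative merging) are assumptions for illustration only. The gene names and trait sets are hypothetical toy data, not values from the atlas.

```python
from itertools import combinations

# Hypothetical toy data: gene -> set of traits reported as Present.
PROFILES = {
    "GENE_A": {"seizures", "language_delay"},
    "GENE_B": {"seizures", "language_delay", "motor_delay"},
    "GENE_C": {"intellectual_disability"},
    "GENE_D": {"intellectual_disability", "macrocephaly"},
    "GENE_E": {"anxiety", "adhd"},
    "GENE_F": {"anxiety", "adhd", "sleep_problems"},
}


def jaccard_distance(a, b):
    """1 - |intersection| / |union|; 0.0 means identical trait sets."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def average_linkage(c1, c2, profiles):
    """Mean pairwise distance between the genes of two clusters."""
    dists = [jaccard_distance(profiles[g1], profiles[g2]) for g1 in c1 for g2 in c2]
    return sum(dists) / len(dists)


def cluster_genes(profiles, n_clusters):
    """Start with one cluster per gene, then repeatedly merge the two
    closest clusters until only n_clusters remain."""
    clusters = [[gene] for gene in sorted(profiles)]
    while len(clusters) > n_clusters:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: average_linkage(clusters[ij[0]], clusters[ij[1]], profiles),
        )
        clusters[i] = clusters[i] + clusters[j]  # i < j, so deleting j is safe
        del clusters[j]
    return clusters


clusters = cluster_genes(PROFILES, n_clusters=3)
for members in clusters:
    print(members)
```

On this toy input the genes that share most of their traits end up together. With real data the number of clusters would have to be chosen and validated rather than fixed in advance.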

Results

We found 12 distinct clusters: 6 major and 6 minor subtypes. For example, one cluster is dominated by intellectual disability, another combines seizures with language problems, and another shows behavioral and anxiety features. A machine-learning model predicted cluster membership correctly 86% of the time, suggesting that the clusters capture consistent patterns rather than noise.
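One way to check whether cluster labels can be predicted from symptom profiles is a held-out evaluation. The sketch below is an assumption about how such a check could look, not the method behind the 86% figure: the source does not say which model or validation scheme was used. It uses a leave-one-out nearest-neighbor classifier on hypothetical toy genes and labels.

```python
# Hypothetical toy data: gene -> (reported traits, cluster label).
GENES = {
    "GENE_A": ({"seizures", "language_delay"}, 0),
    "GENE_B": ({"seizures", "language_delay", "motor_delay"}, 0),
    "GENE_C": ({"intellectual_disability"}, 1),
    "GENE_D": ({"intellectual_disability", "macrocephaly"}, 1),
    "GENE_H": ({"intellectual_disability", "macrocephaly", "hypotonia"}, 1),
    "GENE_E": ({"anxiety", "adhd"}, 2),
    "GENE_F": ({"anxiety", "adhd", "sleep_problems"}, 2),
    # Deliberate boundary case: labeled 2, but its profile resembles cluster 0.
    "GENE_G": ({"seizures"}, 2),
}


def jaccard_similarity(a, b):
    """|intersection| / |union| of two trait sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def leave_one_out_accuracy(genes):
    """Hold out each gene in turn and predict its cluster as the label
    of the most similar remaining gene (1-nearest-neighbor)."""
    correct = 0
    for name, (traits, label) in genes.items():
        others = [g for g in genes if g != name]
        nearest = max(others, key=lambda g: jaccard_similarity(traits, genes[g][0]))
        if genes[nearest][1] == label:
            correct += 1
    return correct / len(genes)


accuracy = leave_one_out_accuracy(GENES)
print(f"Leave-one-out accuracy: {accuracy:.3f}")
```

Because the labels come from clustering the same profiles, a high score mainly shows that the clusters are well separated in symptom space. It does not by itself show that the subtypes are clinically meaningful.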

Conclusion

Autism isn't one condition; it's many. Different genes tend to be associated with different constellations of symptoms. This could eventually help doctors predict what to watch for based on a patient's genetic results, though much more research is needed before any clinical use.

Explore the Phenotype Data
Papers Analyzed: 1,010 (from PubMed literature)
Phenotype Records: ~6,000 (extracted observations)
Unique Genes: 556 (from SFARI database)
Phenotype Traits: 56 (standardized schema)
[Chart: Phenotype Status Distribution. Breakdown of Present, Absent, and Not Reported across all extractions.]

Data Quality

Matrix Density: 8.6%
High Confidence (>=0.8): 61.6%
Multi-paper Genes: 44%
Note: The 2% Absent rate reflects publication bias toward positive findings.
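
As a rough illustration of how summary figures like these could be computed from extraction records, here is a minimal sketch. The record layout, field names, and the definition of matrix density (the share of gene-trait cells with at least one Present record) are assumptions; the source does not define them. All values are hypothetical toy data.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    gene: str
    trait: str
    paper: str         # hypothetical paper identifier
    status: str        # "Present", "Absent", or "Not Reported"
    confidence: float  # extraction confidence in [0, 1]


# Hypothetical toy records, not data from the atlas.
RECORDS = [
    Record("GENE_A", "seizures", "P1", "Present", 0.90),
    Record("GENE_A", "language_delay", "P2", "Present", 0.70),
    Record("GENE_B", "seizures", "P3", "Present", 0.85),
    Record("GENE_B", "anxiety", "P3", "Absent", 0.60),
    Record("GENE_C", "intellectual_disability", "P4", "Present", 0.95),
]


def quality_metrics(records, n_traits):
    """Summarize density, confidence, paper coverage, and status shares."""
    genes = {r.gene for r in records}
    status_counts = Counter(r.status for r in records)
    present_cells = {(r.gene, r.trait) for r in records if r.status == "Present"}
    papers_per_gene = defaultdict(set)
    for r in records:
        papers_per_gene[r.gene].add(r.paper)
    return {
        # Assumed definition: filled (Present) cells over all gene-trait cells.
        "matrix_density": len(present_cells) / (len(genes) * n_traits),
        "high_confidence": sum(r.confidence >= 0.8 for r in records) / len(records),
        "multi_paper_genes": sum(len(p) >= 2 for p in papers_per_gene.values()) / len(genes),
        "status_share": {s: n / len(records) for s, n in status_counts.items()},
    }


metrics = quality_metrics(RECORDS, n_traits=4)
for name, value in metrics.items():
    print(name, value)
```

A low Absent share in such a summary is what one would expect if papers mostly report the symptoms they observed and rarely list the ones they did not.
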
[Chart: Research Timeline. Publications, genes, and phenotype observations over time.]

Explore Data

Genes: 556 SFARI genes
Phenotypes: 56 traits
Explorer: Gene-trait matrix
Literature: 1,010 papers

Analysis

Clustering: 12 subtypes (6+6)
Co-occurrence: Trait correlations
Pathways: Enrichment
ML Models: Prediction

Advanced Analysis

Subtypes
Gene Interactions
Trait Specificity
Biclusters
NMF Factors
