Phenotype Factors (NMF)
Research Use Only: NMF factors are statistical decompositions of
phenotype patterns. They represent co-occurring trait combinations but are not
clinical diagnoses or validated subtypes.
About Non-negative Matrix Factorization
NMF approximates the non-negative gene-phenotype matrix as the product of two smaller non-negative matrices, one holding gene loadings and one holding trait loadings. Each latent factor captures a distinct pattern of co-occurring traits (a phenotype constellation). A gene's or trait's "loading" indicates how strongly it is associated with a given factor.
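To make the decomposition concrete, here is a minimal, dependency-free sketch of NMF using the classic multiplicative update rules. It is an illustration only, not necessarily the solver behind this page: the toy matrix `V`, the helper names, the iteration count, and the seed are all assumptions. In practice a library implementation such as scikit-learn's `NMF` would normally be used.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def squared_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for rv, rw in zip(v, wh) for x, y in zip(rv, rw))

def nmf(v, n_factors, n_iter=200, seed=0, eps=1e-9):
    """Factor V (genes x traits) into W (genes x factors) and H (factors x traits)."""
    rng = random.Random(seed)
    n_genes, n_traits = len(v), len(v[0])
    # Strictly positive random start so no entry is stuck at zero from the outset.
    w = [[rng.random() + 0.1 for _ in range(n_factors)] for _ in range(n_genes)]
    h = [[rng.random() + 0.1 for _ in range(n_traits)] for _ in range(n_factors)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n_traits)]
             for k in range(n_factors)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(n_factors)]
             for i in range(n_genes)]
    return w, h

# Hypothetical toy matrix: 4 genes x 4 traits with two obvious trait groups.
V = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
W, H = nmf(V, n_factors=2)
print("squared reconstruction error:", squared_error(V, W, H))
```

Rows of `W` are gene loadings and rows of `H` are trait loadings. Because every update multiplies a non-negative value by a non-negative ratio, the loadings never become negative.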
Factors Extracted: - (latent components)
Variance Explained: - (total captured)
Reconstruction Error: - (lower is better)
Total Genes: - (in analysis)
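The four summary values above can be derived from the input matrix and the two loading matrices. The sketch below shows one plausible way to compute them. The definitions are assumptions, since the page does not state its formulas: reconstruction error is taken as the Frobenius norm of V - WH, and variance explained as one minus the ratio of squared error to the squared magnitude of V (uncentered). The matrices are hand-made examples, not real results.

```python
import math

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]

def summary_stats(v, w, h):
    """Summarize a factorization V ~ WH (assumed definitions, see lead-in)."""
    wh = matmul(w, h)
    sq_err = sum((x - y) ** 2 for rv, rw in zip(v, wh) for x, y in zip(rv, rw))
    sq_total = sum(x ** 2 for row in v for x in row)
    return {
        "factors_extracted": len(h),            # rows of H = latent components
        "total_genes": len(v),                  # rows of V = genes in analysis
        "reconstruction_error": math.sqrt(sq_err),
        "variance_explained": 1.0 - sq_err / sq_total,
    }

# Hypothetical 2-gene x 3-trait example with a deliberately imperfect fit.
V = [[2.0, 0.0, 1.0],
     [0.0, 3.0, 0.0]]
W = [[1.0, 0.0],
     [0.0, 1.0]]
H = [[2.0, 0.0, 0.0],
     [0.0, 3.0, 0.0]]

stats = summary_stats(V, W, H)
print(stats)
```

Here the only mismatch is the single entry of 1.0 that the factors fail to reproduce, so the reconstruction error is 1.0 and the variance explained is 13/14.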
What you're seeing: Each card represents a "factor", a pattern of phenotypes that tend to occur together.
NMF discovered these patterns automatically from the data. Each card shows the factor's top contributing
phenotypes (left) and genes (right). "Variance explained" indicates how much of the overall data pattern the factor captures.
What it means: Each factor represents a distinct phenotype profile. For example, Factor 0 (Intellectual Disability)
groups genes whose variants commonly cause cognitive impairment. If a gene loads strongly on this factor, patients with variants in that gene
are likely to show ID-related features.
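The "top contributing" lists on each card amount to sorting a factor's loadings and keeping the largest few. A minimal sketch, assuming one row of trait loadings per factor; the trait names and loading values below are made up for illustration.

```python
def top_loadings(names, loadings, k=3):
    """Return the k (name, loading) pairs with the largest loadings."""
    ranked = sorted(zip(names, loadings), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

# Hypothetical trait names and loadings for a single factor.
traits = ["intellectual disability", "seizures", "hypotonia", "short stature"]
factor0_trait_loadings = [0.9, 0.4, 0.1, 0.0]

for name, loading in top_loadings(traits, factor0_trait_loadings, k=2):
    print(f"{name}: {loading}")
```

The same function applies to the gene side of a card by passing gene names and that factor's column of gene loadings.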
Factor × Trait Loading Matrix
How strongly each trait contributes to each factor
What you're seeing: Each row is a factor (F0-F5), each column is a phenotype. Darker blue
indicates stronger "loading", that is, how much that phenotype defines the factor.
What it means: This reveals the complete structure of each factor. A phenotype with high loading on multiple factors
is common across different gene groups; a phenotype with high loading on just one factor is more specific
to that particular pattern.
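One way to quantify whether a phenotype is shared or factor-specific is the fraction of its total loading that falls on its strongest factor. This score is an illustrative assumption, not a metric the page reports; the trait names and matrix values are hypothetical.

```python
def trait_specificity(trait_names, h):
    """Map each trait to (largest loading) / (sum of its loadings across factors).

    h is a factors x traits matrix. A value near 1.0 means the trait is
    concentrated on a single factor; lower values mean it is spread out.
    """
    scores = {}
    for j, name in enumerate(trait_names):
        column = [row[j] for row in h]
        total = sum(column)
        scores[name] = max(column) / total if total > 0 else 0.0
    return scores

# Hypothetical 2-factor x 3-trait loading matrix.
traits = ["trait_A", "trait_B", "trait_C"]
H = [[0.8, 0.5, 0.0],
     [0.1, 0.5, 0.9]]

scores = trait_specificity(traits, H)
print(scores)
```

In this example `trait_B` is split evenly across both factors (score 0.5), while `trait_C` loads on one factor only (score 1.0).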
Factor Similarity
Cosine similarity between factor trait profiles
What you're seeing: How similar each factor is to every other factor. Higher values (warmer colors)
indicate factors that share similar phenotype profiles.
What it means: Factors with low similarity
to others represent more distinct clinical presentations. High similarity between two factors suggests they capture
related (but not identical) phenotype patterns.
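Cosine similarity between two factors is the dot product of their trait-loading vectors divided by the product of their lengths. The sketch below computes the full factor-by-factor matrix; the loading values are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two loading vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def similarity_matrix(h):
    """Pairwise cosine similarity between the rows (factors) of h."""
    return [[cosine_similarity(r1, r2) for r2 in h] for r1 in h]

# Hypothetical 3-factor x 3-trait loading matrix.
H = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [1.0, 1.0, 0.0]]

sim = similarity_matrix(H)
for row in sim:
    print([round(value, 3) for value in row])
```

Because NMF loadings are non-negative, these similarities fall between 0 (no shared traits) and 1 (identical profile up to scale). Here F0 and F1 share nothing, while F2 overlaps partly with both.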
Gene Factor Loadings
Each point is a gene positioned by its loading on selected factors
What you're seeing: Each dot is a gene, positioned by how strongly it loads on two factors (chosen via dropdowns).
Colors indicate each gene's "dominant" factor, the one it loads most strongly on.
What it means: Genes clustered together have similar phenotype profiles. Genes near the origin (0,0) don't strongly match either factor.
Genes far along one axis strongly match that factor. This helps visualize which genes belong to which clinical profile.
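A rough sketch of how the scatter-plot data could be assembled from a genes x factors loading matrix: the two selected factors supply the x and y coordinates, and the dominant factor (the index of the largest loading) supplies the color. The gene names, values, and field names are placeholders, not taken from the actual dataset or page code.

```python
def dominant_factor(gene_loadings):
    """Index of the factor with the largest loading (first one wins on ties)."""
    return max(range(len(gene_loadings)), key=lambda k: gene_loadings[k])

def scatter_points(gene_names, w, x_factor, y_factor):
    """Build one plot record per gene for the two selected factors."""
    return [
        {
            "gene": name,
            "x": row[x_factor],
            "y": row[y_factor],
            "dominant": dominant_factor(row),
        }
        for name, row in zip(gene_names, w)
    ]

# Hypothetical loadings for three genes on three factors.
genes = ["GENE_A", "GENE_B", "GENE_C"]
W = [[0.90, 0.10, 0.00],
     [0.05, 0.02, 0.70],
     [0.10, 0.80, 0.20]]

points = scatter_points(genes, W, x_factor=0, y_factor=1)
for point in points:
    print(point)
```

`GENE_B` lands near the origin when F0 and F1 are selected because its dominant factor is F2, which illustrates why a gene near (0,0) may still load strongly on a factor that is not currently on either axis.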