Phenotype Factors (NMF)

Research Use Only: NMF factors are statistical decompositions of phenotype patterns. They represent co-occurring trait combinations but are not clinical diagnoses or validated subtypes.

About Non-negative Matrix Factorization

NMF decomposes the gene-phenotype matrix into latent factors representing coherent phenotype constellations. Each factor captures a distinct pattern of co-occurring traits. Genes and traits have "loadings" indicating their association strength with each factor.

Factors Extracted

Latent components

Variance Explained

Total captured

Reconstruction Error

Lower is better

Total Genes

In analysis
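The page does not state how the summary figures above are computed. One plausible reading, shown here purely as an assumption, takes the reconstruction error to be the Frobenius norm of the residual and "variance explained" to be the fraction of the matrix's total sum of squares that the reconstruction recovers. The function name `reconstruction_summary` and the matrices are hypothetical.

```python
import numpy as np

def reconstruction_summary(V, W, H):
    """Return (Frobenius reconstruction error, explained fraction)."""
    residual = V - W @ H
    error = np.linalg.norm(residual)  # Frobenius norm; lower is better
    # Assumed definition: uncentered share of the total sum of squares.
    explained = 1.0 - (residual ** 2).sum() / (V ** 2).sum()
    return error, explained

# Small assumed example in which one entry is under-reconstructed.
V = np.array([[1.0, 1.0, 0.0],
              [0.0, 0.0, 2.0]])
W = np.array([[1.0, 0.0],
              [0.0, 1.0]])
H = np.array([[1.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
error, explained = reconstruction_summary(V, W, H)
print(error, explained)
```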

What you're seeing: Each card represents a "factor", a pattern of phenotypes that tend to occur together. NMF discovered these patterns from the data without predefined labels. Each card lists the factor's top contributing phenotypes (left) and genes (right). "Variance explained" indicates how much of the overall data pattern this factor captures. What it means: Each factor summarizes a distinct phenotype profile. For example, Factor 0 (Intellectual Disability) groups genes that are commonly associated with cognitive impairment. If a gene loads strongly on this factor, variants in that gene tend to be associated with ID-related features.
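The "top contributing" lists on each card can be read as a ranking of loadings. A minimal sketch, assuming `H` is the factor-by-trait loading matrix and `W` the gene-by-factor loading matrix; the gene names, trait names, and values below are invented for illustration.

```python
import numpy as np

def top_items(loadings, names, k=2):
    """Return the k names with the largest loadings, strongest first."""
    order = np.argsort(loadings)[::-1][:k]
    return [names[i] for i in order]

# Hypothetical labels and loadings.
trait_names = ["trait_a", "trait_b", "trait_c"]
gene_names = ["GENE1", "GENE2", "GENE3", "GENE4"]
H = np.array([[0.9, 0.7, 0.1],    # factor 0 trait loadings
              [0.0, 0.2, 1.1]])   # factor 1 trait loadings
W = np.array([[1.0, 0.1],
              [0.8, 0.0],
              [0.1, 0.9],
              [0.3, 0.6]])        # gene loadings, one column per factor

for f in range(H.shape[0]):
    traits = top_items(H[f], trait_names)
    genes = top_items(W[:, f], gene_names)
    print(f"Factor {f}: traits={traits} genes={genes}")
```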

Factor × Trait Loading Matrix

How strongly each trait contributes to each factor

What you're seeing: Each row is a factor (F0-F5), each column is a phenotype. Darker blue indicates stronger "loading"—how much that phenotype defines the factor. What it means: This reveals the complete structure of each factor. A phenotype with high loading on multiple factors is common across different gene groups; a phenotype with high loading on just one factor is more specific to that particular pattern.
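The distinction between shared and factor-specific phenotypes can be made numeric. The score below is an illustrative assumption, not something the page reports: for each trait, it measures the share of that trait's total loading carried by its strongest factor. A value near 1 means the trait is specific to one factor. A value near 1 divided by the number of factors means it is spread evenly. The function name `trait_specificity` and the matrix are hypothetical.

```python
import numpy as np

def trait_specificity(H, eps=1e-12):
    """Share of each trait's total loading carried by its strongest factor."""
    return H.max(axis=0) / (H.sum(axis=0) + eps)

# Assumed factor x trait loadings: the middle trait is shared equally.
H = np.array([[0.8, 0.5, 0.0],
              [0.0, 0.5, 1.0]])
spec = trait_specificity(H)
print(np.round(spec, 2))
```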

Factor Similarity

Cosine similarity between factor trait profiles

What you're seeing: How similar each factor is to every other factor. Higher values (warmer colors) indicate factors that share similar phenotype profiles. What it means: Factors with low similarity to others represent more distinct clinical presentations. High similarity between two factors suggests they capture related (but not identical) phenotype patterns.
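Cosine similarity between factor trait profiles can be computed directly from the rows of the loading matrix. A minimal sketch, assuming `H` holds one row per factor; the function name and values are hypothetical. Because NMF loadings are non-negative, the similarities fall between 0 and 1.

```python
import numpy as np

def factor_cosine_similarity(H, eps=1e-12):
    """Pairwise cosine similarity between the rows (factors) of H."""
    norms = np.linalg.norm(H, axis=1, keepdims=True)
    unit = H / np.maximum(norms, eps)  # guard against all-zero factors
    return unit @ unit.T

# Assumed profiles: factors 0 and 1 share no traits; factor 2 overlaps both.
H = np.array([[1.0, 1.0, 0.0],
              [0.0, 0.0, 2.0],
              [1.0, 1.0, 1.0]])
S = factor_cosine_similarity(H)
print(np.round(S, 3))
```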

Gene Factor Loadings

Each point is a gene positioned by its loading on selected factors

What you're seeing: Each dot is a gene, positioned by how strongly it loads on two factors (chosen via dropdowns). Colors indicate each gene's "dominant" factor, the one it loads most strongly on. What it means: Genes clustered together have similar loadings on the two selected factors. Genes near the origin (0,0) don't strongly match either factor. Genes far along one axis strongly match that factor. This helps visualize which genes belong to which phenotype profile.
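The scatter plot's coordinates and colors can be derived from the gene loading matrix. A minimal sketch, assuming `W` has one row per gene and one column per factor; the function name `scatter_points` and the values are hypothetical, and plotting itself is left out.

```python
import numpy as np

def scatter_points(W, x_factor, y_factor):
    """Return (x, y) coordinates and the dominant factor for every gene."""
    xy = W[:, [x_factor, y_factor]]  # loadings on the two selected factors
    dominant = W.argmax(axis=1)      # factor with the largest loading per gene
    return xy, dominant

# Assumed loadings for 4 genes on 3 factors.
W = np.array([[1.00, 0.10, 0.00],
              [0.05, 0.02, 0.01],   # near the origin on every axis
              [0.10, 0.90, 0.20],
              [0.00, 0.30, 0.70]])
xy, dominant = scatter_points(W, x_factor=0, y_factor=1)
print(xy)
print(dominant)
```

The dominant factor is taken across all factors, not only the two plotted, so a gene such as the last one can sit at a modest position on both axes while being colored for a third factor.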