PCA Visualization
Principal Component Analysis of 51 states across 30 indicators for 2025.
PCA Methodology
What This Visualization Shows
Principal Component Analysis (PCA) reduces the complexity of many indicators into a simpler 2D visualization while preserving as much information as possible.
Instead of comparing states across all 30 individual indicators, PCA creates new composite dimensions (principal components) that capture the main patterns of variation.
How PCA Works
PCA finds new axes (principal components) that best explain the variation in the data:
- Center the data: Subtract the mean from each indicator
- Find PC1: The direction of maximum variance in the data
- Find PC2: The direction of maximum remaining variance, perpendicular to PC1
- Continue: Each subsequent PC captures the largest share of the remaining variance, perpendicular to all previous components
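The steps above can be sketched for the simplest case of two indicators, where the principal axes have a closed-form solution. This is only an illustration: the data values are made up, and the closed-form angle is a shortcut that works for two indicators; it is not how the dashboard computes its components.

```python
import math

# Hypothetical toy data: two indicators measured for five made-up states.
data = [(2.0, 1.0), (3.0, 2.5), (4.0, 3.0), (5.0, 4.5), (6.0, 5.0)]
n = len(data)

# Step 1: center each indicator by subtracting its mean.
mean_x = sum(x for x, _ in data) / n
mean_y = sum(y for _, y in data) / n
centered = [(x - mean_x, y - mean_y) for x, y in data]

# Entries of the symmetric 2x2 sample covariance matrix.
cxx = sum(x * x for x, _ in centered) / (n - 1)
cyy = sum(y * y for _, y in centered) / (n - 1)
cxy = sum(x * y for x, y in centered) / (n - 1)

# Steps 2 and 3: with two indicators, the direction of maximum variance
# is given by a single rotation angle; PC2 is the perpendicular direction.
theta = 0.5 * math.atan2(2 * cxy, cxx - cyy)
pc1 = (math.cos(theta), math.sin(theta))
pc2 = (-math.sin(theta), math.cos(theta))

def variance_along(direction):
    """Sample variance of the centered data projected onto a unit direction."""
    projected = [x * direction[0] + y * direction[1] for x, y in centered]
    return sum(p * p for p in projected) / (n - 1)

var1 = variance_along(pc1)
var2 = variance_along(pc2)
print(f"PC1 variance: {var1:.3f}, PC2 variance: {var2:.3f}")
```

The two variances add up to the total variance of the original indicators, which is why later components can only capture what earlier ones leave behind.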
Understanding the Scatter Plot
Each point on the scatter plot represents a state, positioned by its PC1 and PC2 scores:
- X-axis (PC1): Position along the first principal component
- Y-axis (PC2): Position along the second principal component
- Close states: Similar overall civic engagement profiles
- Distant states: Different profiles along these dimensions
- High PC1 scores: States scoring high on the indicators that load positively on PC1
- High PC2 scores: States scoring high on the indicators that load positively on PC2
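As a rough illustration of what "close" means on the plot, the sketch below finds the nearest neighbor of a state by straight-line distance in the PC1/PC2 plane. The state names and coordinates are hypothetical, and `nearest_state` is an illustrative helper, not part of the dashboard.

```python
import math

# Hypothetical (PC1, PC2) scores for three made-up states.
positions = {
    "State A": (-2.1, 0.4),
    "State B": (-1.8, 0.7),
    "State C": (3.0, -1.2),
}

def nearest_state(name, positions):
    """Return the other state whose point lies closest in the PC1/PC2 plane."""
    origin = positions[name]
    others = (other for other in positions if other != name)
    return min(others, key=lambda other: math.dist(origin, positions[other]))

closest_to_a = nearest_state("State A", positions)
print(closest_to_a)
```

Distances measured this way only reflect the two plotted components, so two states that look close here may still differ on PC3 or PC4.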
Reading the Loadings
Loadings show how much each original indicator contributes to each principal component:
- Large positive loading: Indicator strongly contributes to high scores on this PC
- Large negative loading: Indicator contributes to low scores (moves in opposite direction)
- Near-zero loading: Indicator doesn't contribute much to this PC
Example: If PC1 has high positive loadings for voting, volunteering, and group membership, then PC1 might represent "general civic engagement" — states on the right are high on all three.
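To make the example concrete, a state's score on a component can be viewed as a weighted sum of its standardized indicator values, with the loadings as weights. The loadings, indicator names, and z-scores below are invented for illustration and do not come from the dashboard's data.

```python
# Hypothetical PC1 loadings for three indicators (illustrative values only).
pc1_loadings = {"voting": 0.60, "volunteering": 0.60, "group_membership": 0.52}

# Hypothetical z-scores for two made-up states.
state_z = {
    "State A": {"voting": 1.2, "volunteering": 0.8, "group_membership": 1.0},
    "State B": {"voting": -0.9, "volunteering": -1.1, "group_membership": -0.5},
}

def pc_score(z_scores, loadings):
    """Weighted sum of z-scores, using the loadings as weights."""
    return sum(loadings[name] * z_scores[name] for name in loadings)

scores = {state: pc_score(z, pc1_loadings) for state, z in state_z.items()}
for state, score in scores.items():
    print(f"{state}: PC1 score {score:+.2f}")
```

Here State A is above average on all three indicators and lands on the positive side of PC1, while State B lands on the negative side.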
Explained Variance
Explained variance tells you how much information each component captures:
- PC1: Always captures the most variance (the dominant pattern)
- PC2: Captures the next most (the second pattern)
- Cumulative: PC1 + PC2 together show how much of the total picture you're seeing
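The percentages come from dividing each component's variance (its eigenvalue) by the total variance. The following minimal sketch uses hypothetical eigenvalues chosen so the arithmetic is easy to follow; they are not the eigenvalues behind this page.

```python
from itertools import accumulate

# Hypothetical eigenvalues of a covariance matrix, largest first.
eigenvalues = [3.0, 1.5, 0.5]

total = sum(eigenvalues)
explained = [value / total for value in eigenvalues]  # share per component
cumulative = list(accumulate(explained))              # running total

for index, (share, running) in enumerate(zip(explained, cumulative), start=1):
    print(f"PC{index}: {share:.1%} of variance, {running:.1%} cumulative")
```

With these made-up values, the first two components together would cover about 90% of the variance, and all components together always sum to 100%.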
What This Data DOES Tell You
- Overall similarity: Which states have similar civic engagement profiles
- Main dimensions: The key patterns that differentiate states
- Outliers: States with unusual combinations of indicators
- Indicator groupings: Which indicators tend to vary together (via loadings)
What This Data Does NOT Tell You
- Component meaning: PCs are mathematical constructs — naming them (e.g., "civic engagement") is interpretation
- Causation: Proximity shows that states have similar profiles, not why they are similar
- Raw values: Position shows relative standing, not actual participation rates
- Full picture: 2D shows PC1 + PC2 only; other dimensions may reveal different patterns
- Policy implications: States at similar positions may have arrived there through very different paths
Interpreting with Caution
- Naming components: Resist the urge to over-interpret PC labels. Look at what actually loads highly before naming.
- Low explained variance: If PC1 + PC2 explain only 40% of variance, the remaining 60% is not visible in the plot and may tell a different story
- Indicator sensitivity: The exact position of points depends on which indicators are included; adding or removing an indicator can shift the components
- Scale dependence: We use z-scores so all indicators contribute equally; different scaling choices would change results
Best practice: Use PCA as an exploratory tool to generate hypotheses, then verify patterns using the original indicators in the Matrix or Correlation views.
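The scale-dependence point can be illustrated with two indicators measured in very different units. The sketch below assumes the population standard deviation is used for the z-scores (the page does not say which variant is used), and the indicator values are hypothetical.

```python
import statistics

def z_scores(values):
    """Standardize to mean 0 and standard deviation 1 (population std assumed)."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(value - mean) / std for value in values]

# Hypothetical indicators on very different scales for five made-up states.
voting_rate = [0.52, 0.61, 0.58, 0.66, 0.49]      # proportion of residents
volunteer_hours = [12.0, 30.0, 25.0, 41.0, 9.0]   # hours per year

raw_variances = (
    statistics.pvariance(voting_rate),
    statistics.pvariance(volunteer_hours),
)
z_variances = (
    statistics.pvariance(z_scores(voting_rate)),
    statistics.pvariance(z_scores(volunteer_hours)),
)
print("raw variances:", raw_variances)
print("z-score variances:", z_variances)
```

On the raw values, the hours indicator has a variance thousands of times larger and would dominate PC1; after standardization both indicators have variance 1 and contribute on equal footing.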
Technical Implementation
Our PCA implementation:
- Uses z-score standardized data (mean=0, std=1 for each indicator)
- Computes the covariance matrix from the centered data (for z-scored indicators this is effectively the correlation matrix)
- Extracts principal components via power iteration (100 iterations)
- Deflates covariance matrix after each component
- Extracts up to 4 principal components
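A minimal sketch of power iteration with deflation, following the steps listed above, might look like the following. The iteration count (100) and the limit of 4 components come from the list; the starting vector, the Rayleigh-quotient eigenvalue estimate, the function names, and the 3x3 example matrix are assumptions made for illustration and may differ from the actual implementation.

```python
import math

def mat_vec(matrix, vec):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(row[j] * vec[j] for j in range(len(vec))) for row in matrix]

def power_iteration(matrix, iterations=100):
    """Approximate the dominant eigenvector and eigenvalue of a symmetric matrix."""
    size = len(matrix)
    vec = [1.0 / math.sqrt(size)] * size  # assumed starting vector
    for _ in range(iterations):
        nxt = mat_vec(matrix, vec)
        norm = math.sqrt(sum(v * v for v in nxt))
        if norm == 0.0:
            break  # nothing left to extract
        vec = [v / norm for v in nxt]
    # Rayleigh quotient v^T A v gives the eigenvalue for a unit vector v.
    eigenvalue = sum(a * b for a, b in zip(vec, mat_vec(matrix, vec)))
    return vec, eigenvalue

def deflate(matrix, vec, eigenvalue):
    """Remove the component just found: A - lambda * v * v^T."""
    size = len(matrix)
    return [
        [matrix[i][j] - eigenvalue * vec[i] * vec[j] for j in range(size)]
        for i in range(size)
    ]

def top_components(cov, count=4):
    """Extract up to `count` (eigenvector, eigenvalue) pairs, largest first."""
    current = [row[:] for row in cov]
    found = []
    for _ in range(min(count, len(cov))):
        vec, eigenvalue = power_iteration(current)
        found.append((vec, eigenvalue))
        current = deflate(current, vec, eigenvalue)
    return found

# Hypothetical covariance matrix for three standardized indicators.
cov = [
    [1.0, 0.8, 0.2],
    [0.8, 1.0, 0.3],
    [0.2, 0.3, 1.0],
]
components = top_components(cov, count=2)
for index, (vec, eigenvalue) in enumerate(components, start=1):
    print(f"PC{index}: eigenvalue {eigenvalue:.3f}, loadings {[round(v, 3) for v in vec]}")
```

Deflation subtracts the variance already explained, so the next run of power iteration converges to the next-largest component. Power iteration can converge slowly when two eigenvalues are nearly equal, which is one reason a fixed iteration count is a trade-off rather than a guarantee.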
Questions about methodology? Contact the NCoC research team.
State Positions (PC1 vs PC2): 55.5% + 25.2% = 80.7% of variance
PC1 Loadings (55.5%)
Indicators that most influence position on the horizontal axis
PC2 Loadings (25.2%)
Indicators that most influence position on the vertical axis
All Loadings
| Indicator | PC1 | PC2 | PC3 | PC4 |
|---|---|---|---|---|
| politicaldonor | -0.258 | +0.006 | +0.138 | +0.026 |
| nissues | -0.256 | +0.128 | -0.059 | -0.017 |
| action | -0.254 | +0.009 | -0.033 | +0.178 |
| views | -0.244 | +0.123 | +0.114 | -0.054 |
| ntalk | -0.244 | +0.091 | +0.026 | -0.047 |
| meeting | -0.244 | -0.082 | +0.022 | +0.021 |
| favors | -0.215 | +0.216 | -0.165 | -0.032 |
| nissues_detailed | -0.215 | +0.169 | -0.010 | -0.170 |
| ffissues | -0.210 | -0.216 | -0.063 | -0.092 |
| worksat | -0.199 | +0.221 | +0.088 | +0.261 |
| views_detailed | -0.193 | +0.168 | +0.192 | -0.195 |
| org | -0.190 | -0.218 | +0.031 | +0.137 |
| workicont | -0.188 | +0.106 | -0.103 | +0.237 |
| news | -0.184 | -0.210 | +0.259 | +0.007 |
| ffissues_detailed | -0.180 | -0.194 | -0.084 | -0.237 |
| contact | -0.178 | -0.189 | -0.078 | +0.007 |
| ntalk_detailed | -0.177 | +0.177 | -0.157 | -0.241 |
| workvol | -0.176 | -0.013 | +0.157 | +0.113 |
| volunteer | -0.175 | -0.162 | -0.181 | +0.317 |
| donate | -0.167 | -0.221 | -0.099 | +0.098 |
| boycott | -0.150 | -0.266 | +0.190 | -0.035 |
| workpride | -0.145 | +0.170 | -0.262 | +0.153 |
| volfreq | -0.144 | +0.130 | -0.022 | -0.249 |
| news_detailed | -0.134 | -0.200 | +0.289 | -0.191 |
| voterstatus | -0.113 | -0.269 | +0.103 | -0.179 |
| workcont | -0.113 | -0.087 | -0.404 | +0.266 |
| favors_detailed | -0.100 | +0.298 | -0.198 | -0.208 |
| volfreq_detailed | -0.052 | +0.206 | +0.095 | -0.246 |
| fftalk_detailed | +0.027 | -0.199 | -0.407 | -0.342 |
| fftalk | -0.001 | -0.255 | -0.356 | -0.241 |
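One way to read a table like this is to rank indicators by the absolute size of their loading on a given component. The sketch below copies a handful of rows from the table; the `strongest` helper is a hypothetical name introduced for illustration.

```python
# A few rows copied from the table: indicator -> (PC1, PC2, PC3, PC4).
loadings = {
    "politicaldonor": (-0.258, 0.006, 0.138, 0.026),
    "favors_detailed": (-0.100, 0.298, -0.198, -0.208),
    "voterstatus": (-0.113, -0.269, 0.103, -0.179),
    "workcont": (-0.113, -0.087, -0.404, 0.266),
    "fftalk_detailed": (0.027, -0.199, -0.407, -0.342),
    "fftalk": (-0.001, -0.255, -0.356, -0.241),
}

def strongest(loadings, pc_index, top=3):
    """Indicators with the largest absolute loading on one component."""
    ranked = sorted(
        loadings,
        key=lambda name: abs(loadings[name][pc_index]),
        reverse=True,
    )
    return ranked[:top]

top_pc2 = strongest(loadings, pc_index=1)
top_pc3 = strongest(loadings, pc_index=2)
print("PC2:", top_pc2)
print("PC3:", top_pc3)
```

Ranking by absolute value matters because a large negative loading is as influential as a large positive one; only the direction differs.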