Civic Health Measurement Dashboard

PCA Visualization

Principal Component Analysis of 51 states across 22 indicators for 2023.

PCA Methodology

What This Visualization Shows

Principal Component Analysis (PCA) reduces the complexity of many indicators into a simpler 2D visualization while preserving as much information as possible.

Instead of viewing states across 20+ individual indicators, PCA creates new composite dimensions (principal components) that capture the main patterns of variation.

How PCA Works

PCA finds new axes (principal components) that best explain the variation in the data:

  1. Center the data: Subtract the mean from each indicator
  2. Find PC1: The direction of maximum variance in the data
  3. Find PC2: The direction of maximum remaining variance, perpendicular to PC1
  4. Continue: Each subsequent PC captures remaining variance, perpendicular to all previous

Technical note: PCA is computed using eigenvalue decomposition of the covariance matrix. We use power iteration to find the principal eigenvectors.
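
In symbols, the steps above can be sketched as follows. The notation (Z, C, v_k, lambda_k, t_ik) and the n-1 divisor are illustrative assumptions, not confirmed details of the dashboard's code. Here x_ij is the value of indicator j for state i, and each v_k is a unit-length vector perpendicular to the earlier ones.

```latex
\begin{aligned}
z_{ij} &= \frac{x_{ij} - \bar{x}_j}{s_j}
  && \text{(standardize indicator } j \text{ for state } i\text{)} \\
C &= \frac{1}{n-1}\, Z^{\top} Z
  && \text{(covariance matrix of the standardized data)} \\
C\, v_k &= \lambda_k v_k, \quad \lambda_1 \ge \lambda_2 \ge \dots \ge 0
  && \text{(principal directions and the variance along each)} \\
t_{ik} &= \sum_{j} z_{ij}\, v_{jk}
  && \text{(score of state } i \text{ on component } k\text{)}
\end{aligned}
```

Under these assumptions, the share of variance explained by component k is lambda_k divided by the sum of the diagonal of C.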

Understanding the Scatter Plot

Each point on the scatter plot represents a state, positioned by its PC1 and PC2 scores:

  • X-axis (PC1): Position along the first principal component
  • Y-axis (PC2): Position along the second principal component
  • Close states: Similar overall civic engagement profiles
  • Distant states: Different profiles along these dimensions

  • Right side (high PC1): States scoring high on the indicators that load positively on PC1
  • Top (high PC2): States scoring high on the indicators that load positively on PC2
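
As a rough illustration of how positions and closeness on the plot arise, the sketch below projects standardized indicator values onto two components and compares distances. The state names, z-scores, and loadings are all hypothetical and chosen for easy arithmetic; they are not the dashboard's data.

```python
import math

def scores(z_row, components):
    """Project one state's z-scored indicators onto each component."""
    return [sum(z * w for z, w in zip(z_row, comp)) for comp in components]

# Hypothetical unit-length, mutually perpendicular loadings for 3 indicators.
components = [
    [0.6, 0.8, 0.0],  # PC1 (x-axis)
    [0.0, 0.0, 1.0],  # PC2 (y-axis)
]

# Hypothetical z-scores for three made-up states.
states = {
    "State A": [1.0, 1.0, 0.5],
    "State B": [0.9, 1.1, 0.4],
    "State C": [-1.0, -1.5, 2.0],
}

# Each state's (PC1, PC2) position on the scatter plot.
positions = {name: scores(z, components) for name, z in states.items()}

def distance(a, b):
    """Euclidean distance between two states on the 2D plot."""
    return math.dist(positions[a], positions[b])

print(positions)
print(distance("State A", "State B"), distance("State A", "State C"))
```

In this made-up example, State A and State B land close together because their indicator profiles are nearly identical, while State C sits far from both.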

Reading the Loadings

Loadings show how much each original indicator contributes to each principal component:

  • Large positive loading: Indicator strongly contributes to high scores on this PC
  • Large negative loading: Indicator contributes to low scores (moves in opposite direction)
  • Near-zero loading: Indicator doesn't contribute much to this PC

Example: If PC1 has high positive loadings for voting, volunteering, and group membership, then PC1 might represent "general civic engagement" — states on the right are high on all three.
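
To make the role of loadings concrete, the sketch below multiplies three PC1 loadings from the All Loadings table on this page by z-scores for one state. The z-scores are hypothetical, and a real PC1 score would sum over all 22 indicators rather than three.

```python
# PC1 loadings for three indicators, copied from the All Loadings table.
pc1_loadings = {"org": 0.338, "volunteer": 0.304, "voterstatus": 0.229}

# Hypothetical z-scores for one state (illustrative only, not real data).
z_scores = {"org": 1.0, "volunteer": 0.5, "voterstatus": -1.0}

def contributions(loadings, z):
    """Contribution of each indicator to the component score: loading * z-score."""
    return {name: loadings[name] * z[name] for name in loadings}

parts = contributions(pc1_loadings, z_scores)
partial_score = sum(parts.values())  # partial: only three indicators included

for name, value in parts.items():
    print(f"{name}: {value:+.3f}")
print(f"partial PC1 score: {partial_score:+.3f}")
```

An above-average value on a positively loaded indicator pushes the state to the right; a below-average value on the same indicator pulls it to the left.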

Explained Variance

Explained variance tells you how much information each component captures:

  • PC1: Always captures the most variance (the dominant pattern)
  • PC2: Captures the next most (the second pattern)
  • Cumulative: PC1 + PC2 together show how much of the total picture you're seeing

Rule of thumb: If PC1 + PC2 explain less than 50% of variance, the 2D plot may be missing important information. Consider looking at PC3/PC4 loadings too.
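
A minimal sketch of how explained-variance percentages can be derived from eigenvalues. The eigenvalues and the function name `explained_variance` are hypothetical. The total is passed in separately because, when only a few components are extracted, the shares should be taken against the total variance of all indicators, not against the sum of the extracted eigenvalues.

```python
def explained_variance(eigenvalues, total_variance):
    """Return per-component shares and cumulative shares of total variance."""
    shares = [ev / total_variance for ev in eigenvalues]
    cumulative = []
    running = 0.0
    for share in shares:
        running += share
        cumulative.append(running)
    return shares, cumulative

# Hypothetical eigenvalues for a dataset of 8 standardized indicators.
# Assumption: each z-scored indicator has variance 1, so the total is 8
# (this holds when the covariance uses the same divisor as the z-scores).
eigenvalues = [4.0, 2.0, 1.0, 0.5]
shares, cumulative = explained_variance(eigenvalues, total_variance=8.0)

print(shares)      # share of variance per component
print(cumulative)  # running total across components
```

In this made-up case the first two components together cover 75% of the variance, which would clear the 50% rule of thumb.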

What This Data DOES Tell You

  • Overall similarity: Which states have similar civic engagement profiles
  • Main dimensions: The key patterns that differentiate states
  • Outliers: States with unusual combinations of indicators
  • Indicator groupings: Which indicators tend to vary together (via loadings)

What This Data Does NOT Tell You

  • Component meaning: PCs are mathematical constructs — naming them (e.g., "civic engagement") is interpretation
  • Causation: Proximity on the plot shows that states have similar profiles, not why they are similar
  • Raw values: Position shows relative standing, not actual participation rates
  • Full picture: 2D shows PC1 + PC2 only; other dimensions may reveal different patterns
  • Policy implications: States at similar positions may have arrived there through very different paths

Interpreting with Caution

  • Naming components: Resist the urge to over-interpret PC labels. Look at what actually loads highly before naming.
  • Low explained variance: If PC1 + PC2 explain only 40% of variance, the remaining 60% of the variation is not visible in the 2D plot
  • Indicator sensitivity: The exact position of points depends on which indicators are included; adding or removing indicators changes the components
  • Scale dependence: We use z-scores so all indicators contribute equally; different scaling choices would change results
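
The effect of z-scoring can be sketched as below. The values are hypothetical, and the use of the population standard deviation is an assumption; the dashboard may use the sample version.

```python
import statistics

def zscore(values):
    """Standardize one indicator to mean 0 and standard deviation 1."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # assumption: population standard deviation
    if std == 0:
        raise ValueError("indicator has no variation; z-score is undefined")
    return [(v - mean) / std for v in values]

# Hypothetical indicator recorded two ways: as percentages and as proportions.
as_percent = [40.0, 50.0, 60.0]
as_proportion = [0.40, 0.50, 0.60]

# Raw variances differ by a factor of 10,000, so the percentage version
# would dominate an unstandardized PCA.
print(statistics.pvariance(as_percent), statistics.pvariance(as_proportion))

# After z-scoring, both versions are numerically the same.
z_percent = zscore(as_percent)
z_proportion = zscore(as_proportion)
print(z_percent)
print(z_proportion)
```

Without this step, indicators measured in larger units would pull the principal components toward themselves regardless of how informative they are.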

Best practice: Use PCA as an exploratory tool to generate hypotheses, then verify patterns using the original indicators in the Matrix or Correlation views.

Technical Implementation

Our PCA implementation:

  • Uses z-score standardized data (mean=0, std=1 for each indicator)
  • Computes covariance matrix from centered data
  • Extracts principal components via power iteration (100 iterations)
  • Deflates covariance matrix after each component
  • Extracts up to 4 principal components
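
The steps listed above could look roughly like the following pure-Python sketch. It is not the dashboard's actual code: the function names, the n-1 divisor, and the fixed starting vector are assumptions made for illustration. Only the 100 iterations, the deflation step, and the limit of 4 components come from the list above.

```python
import math

def covariance(Z):
    """Covariance matrix of column-centered data (rows = states, cols = indicators)."""
    n, p = len(Z), len(Z[0])
    return [[sum(Z[r][i] * Z[r][j] for r in range(n)) / (n - 1)  # assumed divisor
             for j in range(p)] for i in range(p)]

def matvec(C, v):
    """Multiply matrix C by vector v."""
    return [sum(c * x for c, x in zip(row, v)) for row in C]

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def power_iteration(C, iterations=100):
    """Approximate the dominant eigenvalue and unit eigenvector of symmetric C."""
    p = len(C)
    # Assumed deterministic start; it must not be perpendicular to the target.
    v = [1.0 + 0.01 * i for i in range(p)]
    length = norm(v)
    v = [x / length for x in v]
    for _ in range(iterations):
        w = matvec(C, v)
        length = norm(w)
        if length < 1e-12:  # nothing left to extract
            break
        v = [x / length for x in w]
    eigenvalue = sum(x * y for x, y in zip(v, matvec(C, v)))  # Rayleigh quotient
    return eigenvalue, v

def deflate(C, eigenvalue, v):
    """Remove the extracted component: C - eigenvalue * v v^T."""
    p = len(C)
    return [[C[i][j] - eigenvalue * v[i] * v[j] for j in range(p)] for i in range(p)]

def top_components(C, n_components=4, iterations=100):
    """Extract up to n_components (eigenvalue, eigenvector) pairs, largest first."""
    result = []
    for _ in range(min(n_components, len(C))):
        eigenvalue, v = power_iteration(C, iterations)
        result.append((eigenvalue, v))
        C = deflate(C, eigenvalue, v)
    return result

# Hypothetical 2x2 covariance matrix whose eigenvalues are 3 and 1.
C = [[2.0, 1.0], [1.0, 2.0]]
components = top_components(C, n_components=2)
for eigenvalue, vector in components:
    print(round(eigenvalue, 6), [round(x, 4) for x in vector])
```

The sign of each eigenvector is arbitrary, so a component and its loadings may come out flipped without changing the analysis. Power iteration converges slowly when two eigenvalues are close together, which is one reason an implementation might fix a generous iteration count.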

Questions about methodology? Contact the NCoC research team.

State Positions (PC1 vs PC2)

Variance explained: 50.5% (PC1) + 19.7% (PC2) = 70.2%

PC1 Loadings (50.5%)

Indicators that most influence position on the horizontal axis

Organizational Membership       +0.34
Contacting Public Officials     +0.31
Formal Volunteering             +0.30
Buycotting or Boycotting        +0.30
Charitable Giving               +0.30
Attending Public Meetings       +0.27
Taking Action with Neighbors    +0.25
Learning About Issues           +0.24

PC2 Loadings (19.7%)

Indicators that most influence position on the vertical axis

Contribute to Community               +0.46
Workplace Pride                       +0.45
Informal Helping                      +0.37
Workplace Contributes to Community    +0.34
Talking with Neighbors                +0.25
Learning About Issues                 -0.23
Main Satisfaction Comes From Work     +0.22
Volunteer Frequency                   +0.21

All Loadings

Indicator         PC1      PC2      PC3      PC4
org               +0.338   -0.007   -0.117   +0.014
contact           +0.310   +0.087   -0.048   +0.093
volunteer         +0.304   +0.043   -0.217   +0.003
boycott           +0.301   -0.024   -0.083   +0.093
donate            +0.301   -0.091   -0.156   +0.058
meeting           +0.272   +0.046   +0.023   +0.087
action            +0.253   +0.176   -0.104   -0.073
news              +0.240   -0.226   +0.190   +0.023
voterstatus       +0.229   -0.034   +0.143   +0.131
workvol           +0.226   -0.089   -0.182   -0.075
politicaldonor    +0.220   -0.067   +0.374   +0.071
fftalk            +0.209   -0.145   +0.026   -0.115
ffissues          +0.178   -0.154   +0.408   +0.048
ntalk             +0.171   +0.251   -0.066   -0.477
worksat           -0.163   +0.215   +0.262   +0.248
workcont          +0.135   +0.338   -0.020   +0.370
nissues           +0.116   +0.130   +0.391   -0.384
workpride         +0.047   +0.454   +0.070   +0.201
volfreq           -0.038   +0.213   -0.151   +0.038
workicont         +0.033   +0.457   +0.192   +0.172
views             +0.023   -0.025   +0.450   -0.171
favors            -0.005   +0.371   -0.060   -0.498