protocol-virtues-study/text_coding/analysis/analysis_summary.md
2026-03-29 15:25:34 -06:00


Multivariate Analysis of coding.csv: Virtue Clustering and Associations

Date: 2026-03-28
Dataset: coding.csv
Texts Analyzed: 134
Unique Virtue Categories: 74
Average Virtues per Text: 2.78 (range: 1-5)


1. Executive Summary

This analysis examines 134 coded texts from two sources (AFP and PR) across 74 unique virtue categories. Three multivariate techniques (clustering, network analysis, and association metrics) yield four main findings:

  • 4 distinct text clusters with one dominant cluster containing 86% of texts
  • 3 major virtue communities representing different conceptual frameworks
  • Strong ethical pairings such as Care+Consent (Jaccard = 0.750), which co-occur in 3 of the 4 texts that mention either virtue
  • Source differences in conceptual complexity (AFP: more interconnected; PR: more focused)

2. Cluster Analysis of Texts

Using K-means clustering on binary virtue presence/absence vectors:

| Cluster | Size | Key Virtues | Sources | Interpretation |
|---|---|---|---|---|
| 1 | 5 texts | Memory, Imitation, Inheritance, Tradition | AFP, PR | Memory-focused texts - Historical and temporal continuity themes |
| 2 | 4 texts | Refusal, Embodiment, Resistance, Subversion | AFP only | Resistance discourse - Tactical opposition to systems |
| 3 | 115 texts | Adaptability, Tension Management, Accessibility, Design | AFP, PR | Core protocol cluster - Dominant protocol ethics discourse |
| 4 | 10 texts | Authenticity, Alignment, Inheritance | AFP, PR | Authenticity/Alignment cluster - Self-determination and tradition |

Key Finding: Cluster 3 represents the overwhelming majority (86%) of texts, suggesting a shared "protocol ethics" discourse across sources. Cluster 2 represents a distinct "resistance" discourse found only in AFP texts.
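The clustering step can be sketched as follows. This is a minimal standard-library illustration of K-means on binary presence/absence vectors, not the script used for this report. The initialization rule (first k distinct vectors) is an assumption standing in for the "deterministic starting points" noted under Limitations, and the function names are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, max_iter=100):
    """Cluster binary virtue vectors; returns (labels, centroids).

    Assumption: centroids start at the first k distinct vectors, which
    makes the result deterministic for a fixed input order.
    """
    centroids = []
    for v in vectors:
        point = [float(x) for x in v]
        if point not in centroids:
            centroids.append(point)
        if len(centroids) == k:
            break
    if len(centroids) < k:
        raise ValueError("fewer distinct vectors than clusters")

    labels = [-1] * len(vectors)
    for _ in range(max_iter):
        # Assignment step: nearest centroid (ties go to the lower index).
        new_labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties out
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids
```

With one row per text and one column per virtue, `kmeans(vectors, 4)` would return a cluster label for each text; cluster sizes then follow from counting the labels.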


3. Strongest Virtue Associations

By Co-occurrence Count (raw frequency):

| Rank | Virtue Pair | Count | Notes |
|---|---|---|---|
| 1 | Accessibility + Situational Awareness | 4 | Practical context-sensitivity |
| 2 | Equity + Inclusivity | 3 | Justice framework |
| 3 | Balance + Tension Management | 3 | Managing contradictions |
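Raw co-occurrence counts of this kind can be produced by counting every unordered virtue pair within each text. The sketch below is illustrative only; the function name and the list-of-lists input shape are assumptions, not the report's actual code.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(texts):
    """Count how many texts contain each unordered pair of virtues.

    `texts` is an iterable of virtue lists, one list per coded text.
    """
    counts = Counter()
    for virtues in texts:
        # sorted(set(...)) drops duplicate codes and fixes the pair order,
        # so ("Equity", "Inclusivity") and the reverse count as one pair.
        for pair in combinations(sorted(set(virtues)), 2):
            counts[pair] += 1
    return counts
```

Calling `cooccurrence_counts(texts).most_common(3)` would then give the three most frequent pairs.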

By Jaccard Similarity (normalized association strength):

| Rank | Virtue Pair | Jaccard Index | Interpretation |
|---|---|---|---|
| 1 | Care + Consent | 0.750 | Nearly inseparable - Ethical foundation pair |
| 2 | Resistance + Subversion | 0.400 | Tactical cluster |
| 3 | Refusal + Subversion | 0.400 | Resistance tactics |
| 4 | Equity + Inclusivity | 0.375 | Justice-oriented |
| 5 | Refusal + Resistance | 0.333 | Activism tactics |
| 6 | Embodiment + Groundedness | 0.333 | Material presence |
| 7 | Agency + Freedom | 0.300 | Autonomy cluster |

Key Finding: The Care+Consent pairing (Jaccard = 0.750) is the strongest association in the dataset: the two virtues appear together in 3 of the 4 texts in which either one appears. This suggests an ethical foundation where care practices are closely bound to consent frameworks, although the small number of texts involved calls for caution.
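The Jaccard index is the number of texts containing both virtues divided by the number containing at least one. A short sketch, with hypothetical function names, shows both a set-based form and a count-based form. The Equity + Inclusivity check uses figures reported in this document (6 and 5 texts, 3 shared); the text IDs in the Care + Consent comment are made up for illustration.

```python
def jaccard(texts_a, texts_b):
    """Jaccard index of two collections of text IDs: |A & B| / |A | B|."""
    a, b = set(texts_a), set(texts_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def jaccard_from_counts(n_a, n_b, n_both):
    """Same index from frequencies: the union size is n_a + n_b - n_both."""
    union = n_a + n_b - n_both
    return n_both / union if union else 0.0


# Equity (6 texts) and Inclusivity (5 texts) co-occur in 3 texts:
# 3 / (6 + 5 - 3) = 3 / 8 = 0.375, matching the table.
equity_inclusivity = jaccard_from_counts(6, 5, 3)

# Hypothetical IDs: 3 shared texts out of 4 in the union gives 0.75.
care_consent = jaccard({"t1", "t2", "t3", "t4"}, {"t1", "t2", "t3"})
```

Because the denominator is the union, a pair with few total occurrences can score high on only a handful of shared texts, which is why the raw counts are worth reading alongside the index.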


4. Virtue Communities (Network Analysis)

Using network thresholding on co-occurrence patterns, three major virtue communities were identified:

Community 1: "Protocol Mechanics" (~40 virtues)

Core operational virtues for protocol design and implementation

Central Members:

  • Adaptability, Agency, Balance, Capture Resistance
  • Care, Complex Systems Tolerance, Consent
  • Constraint, Curiosity, Design, Emergent Properties
  • Equity, Freedom, Institutional Critique, Iterative Development
  • Networked Intelligence, Plurality, Replicability, Systems Thinking

Characteristics:

  • Largest community spanning practical and ethical dimensions
  • High connectivity to Adaptability and Systems Thinking (hub virtues)
  • Brings together ethics (Care, Consent, Equity) with operational concepts (Design, Iterative Development)

Community 2: "Collective Intelligence" (3 virtues)

Focused on collaborative knowledge production

Members: Alignment, Collaboration, Networked Intelligence

Characteristics:

  • Small but distinct community
  • Emphasizes distributed, collaborative approach
  • Connected to Community 1 through Networked Intelligence

Community 3: "Relational Ethics" (~9 virtues)

Focus on social and cultural connection

Members:

  • Collectivity, Cultural Awareness, Empathy, Interdependence
  • Plurality, Relationality, Respect, Spatial Awareness
  • Plus contextual concepts

Characteristics:

  • Strong ties to Community 1 through Relationality
  • Emphasizes interpersonal and cultural dimensions
  • Includes Plurality, suggesting diversity and multiplicity
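One simple way to realize threshold-based community detection is to keep only pairs whose association meets a cutoff and then take the connected components of the remaining graph. The sketch below follows that reading. The report does not state its threshold or grouping rule, so both the cutoff and the connected-components rule are assumptions, and the function name is hypothetical.

```python
from collections import defaultdict, deque


def threshold_communities(pair_weights, threshold):
    """Group virtues linked by pairs with weight >= threshold.

    `pair_weights` maps (virtue_a, virtue_b) to an association score.
    Returns a list of sorted member lists, one per connected component.
    Virtues with no edge at or above the threshold are left out.
    """
    graph = defaultdict(set)
    for (a, b), weight in pair_weights.items():
        if weight >= threshold:
            graph[a].add(b)
            graph[b].add(a)

    seen, communities = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        seen.add(start)
        queue, component = deque([start]), []
        while queue:  # breadth-first search over the thresholded graph
            node = queue.popleft()
            component.append(node)
            for neighbor in sorted(graph[node] - seen):
                seen.add(neighbor)
                queue.append(neighbor)
        communities.append(sorted(component))
    return communities
```

A rule of this kind assigns each virtue to exactly one group, so results are sensitive to the cutoff: lowering it merges groups, raising it splits them.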

5. Network Centrality Analysis

"Hub" Virtues (ranked by number of connections to other virtue types):

| Rank | Virtue | Connections | Key Neighbors |
|---|---|---|---|
| 1 | Adaptability | 25 | Agency, Resistance, Long-Term Vision, Design, Systems Thinking |
| 2 | Design | 23 | Agency, Equity, Emergent Properties, Inheritance, Constraint |
| 3 | Agency | 23 | Resistance, Inheritance, Refusal, Autonomy, Systems Thinking |
| 4 | Temporal Awareness | 19 | Emergent Properties, Long-Term Vision, Adaptability |
| 5 | Systems Thinking | 19 | Agency, Design, Long-Term Vision, Constraint |
| 6 | Collectivity | 17 | Interdependence, Agency, Shared Responsibility |
| 7 | Transgression | 17 | Refusal, Subversion, Care, Capture Resistance |
| 8 | Institutional Critique | 16 | Refusal, Design, Subversion, Agency |
| 9 | Plurality | 16 | Interdependence, Agency, Systems Thinking |
| 10 | Relationality | 16 | Interdependence, Accessibility, Care, Curiosity |

Key Finding: Adaptability is the most connected virtue in the network, linking to 25 other virtue concepts, narrowly ahead of Design and Agency (23 each). This suggests it functions as a bridging concept across multiple ethical and practical domains.
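Degree, as used in this section, is the number of distinct other virtues a virtue co-occurs with at least once. A minimal sketch, assuming the input is an iterable of co-occurring virtue pairs (the function name is hypothetical):

```python
from collections import defaultdict


def degree_centrality(pairs):
    """Rank virtues by number of distinct co-occurring neighbors.

    Returns (virtue, degree) tuples, highest degree first; ties are
    broken alphabetically so the ranking is deterministic.
    """
    neighbors = defaultdict(set)
    for a, b in pairs:
        if a != b:  # ignore self-pairs
            neighbors[a].add(b)
            neighbors[b].add(a)
    degrees = {virtue: len(found) for virtue, found in neighbors.items()}
    return sorted(degrees.items(), key=lambda item: (-item[1], item[0]))
```

Degree counts breadth of association, not frequency: a virtue coded in few texts can still rank high if those texts pair it with many different virtues.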


6. Source Comparison (AFP vs. PR)

| Metric | AFP (62 texts) | PR (72 texts) | Interpretation |
|---|---|---|---|
| Unique virtue pairs | 221 | 143 | AFP texts show more conceptual diversity |
| Avg pairs per text | 4.06 | 2.22 | AFP texts are more conceptually dense |
| Network density | 8.2% | 5.3% | AFP has more interconnected virtue networks |
| Top virtues | Adaptability (8), Temporal Awareness (7), Collectivity (7), Institutional Critique (7) | Tension Management (10), Adaptability (9), Systems Thinking (9), Infrastructural Awareness (8) | AFP: critical/social; PR: technical/systemic |

AFP Code Profile (Academic/Critical)

  • Dominant themes: Adaptability, Temporal Awareness, Collectivity, Institutional Critique
  • Emphasis: Social processes, critical engagement, collective action
  • Pattern: Higher virtue co-occurrence suggests more conceptually complex texts

PR Code Profile (Practical/Technical)

  • Dominant themes: Tension Management, Adaptability, Systems Thinking, Infrastructural Awareness
  • Emphasis: Technical complexity, managing contradictions, system design
  • Pattern: More focused virtue profiles (fewer pairs per text), with Adaptability second only to Tension Management

Key Finding: Both sources prioritize Adaptability, but AFP has more distributed emphasis across critical/social virtues, while PR emphasizes technical/systemic concepts. The 8.2% vs 5.3% network density difference suggests AFP texts engage with more complex conceptual interconnections.
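The density figures can be checked by hand. Network density is the number of observed edges divided by the number of possible edges, n(n-1)/2 for n nodes. The reported values are consistent with taking n as all 74 virtue categories for both sources; that denominator is inferred from the numbers and is not stated in the analysis.

```python
from math import comb


def network_density(n_edges, n_nodes):
    """Observed edges as a fraction of all possible undirected edges."""
    possible = comb(n_nodes, 2)
    return n_edges / possible if possible else 0.0


# Assumption: both networks are measured against all 74 virtue categories,
# giving comb(74, 2) = 2701 possible pairs.
afp_density = network_density(221, 74)  # 221 unique AFP pairs
pr_density = network_density(143, 74)   # 143 unique PR pairs
print(f"AFP {afp_density:.1%}, PR {pr_density:.1%}")  # AFP 8.2%, PR 5.3%
```

Under this reading, the density gap is simply the unique-pair gap (221 vs. 143) rescaled by a shared denominator, so the two rows of the table carry the same information.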


7. Frequency Distribution

Top 30 Virtues by Frequency:

| Rank | Virtue | Count | % of Texts |
|---|---|---|---|
| 1 | Adaptability | 17 | 12.7% |
| 2 | Tension Management | 13 | 9.7% |
| 3 | Accessibility | 13 | 9.7% |
| 4 | Temporal Awareness | 11 | 8.2% |
| 5 | Design | 11 | 8.2% |
| 6 | Institutional Critique | 10 | 7.5% |
| 7 | Agency | 10 | 7.5% |
| 8 | Relationality | 10 | 7.5% |
| 9 | Infrastructural Awareness | 10 | 7.5% |
| 10 | Systems Thinking | 10 | 7.5% |
| 11 | Plurality | 9 | 6.7% |
| 12 | Transgression | 9 | 6.7% |
| 13 | Collectivity | 8 | 6.0% |
| 14 | Inheritance | 8 | 6.0% |
| 15 | Authenticity | 7 | 5.2% |
| 16 | Long-Term Vision | 7 | 5.2% |
| 17 | Equity | 6 | 4.5% |
| 18 | Capture Resistance | 6 | 4.5% |
| 19 | Respect | 6 | 4.5% |
| 20 | Cultural Awareness | 6 | 4.5% |
| 21 | Spatial Awareness | 6 | 4.5% |
| 22 | Interdependence | 6 | 4.5% |
| 23 | Shared Responsibility | 6 | 4.5% |
| 24 | Situational Awareness | 6 | 4.5% |
| 25 | Memory | 5 | 3.7% |
| 26 | Embodiment | 5 | 3.7% |
| 27 | Inclusivity | 5 | 3.7% |
| 28 | Balance | 5 | 3.7% |
| 29 | Reciprocity | 5 | 3.7% |
| 30 | Emergent Properties | 5 | 3.7% |
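A frequency table of this shape can be derived by counting, for each virtue, the number of texts that carry it and dividing by the total number of texts (for example, 17 / 134 = 12.7% for Adaptability). The sketch below is illustrative; the function name and input shape are assumptions.

```python
from collections import Counter


def virtue_frequencies(texts):
    """Return (virtue, count, percent_of_texts) rows, most frequent first.

    Each text contributes at most once per virtue; ties are ordered
    alphabetically.
    """
    counts = Counter(v for virtues in texts for v in set(virtues))
    total = len(texts)
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(virtue, n, round(100 * n / total, 1)) for virtue, n in ranked]
```

Because texts carry several virtues each, the percentages describe coverage per virtue and do not sum to 100.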

8. Key Insights and Implications

8.1 The Three Pillars of Protocol Ethics

The analysis reveals three conceptual pillars that structure this discourse:

  1. Adaptive Ethics (centered on Adaptability and Design): The capacity to adjust, learn, and evolve protocols in response to changing conditions

  2. Relational Justice (centered on Care, Consent, Equity, Inclusivity): Ethical frameworks emphasizing relationship, respect, and justice

  3. Systemic Resistance (centered on Refusal, Subversion, Institutional Critique): Tactical opposition and critique of existing systems

8.2 The Adaptability Paradigm

Adaptability ranks first in both frequency (17 texts, 12.7%) and connectivity (25 connections), which suggests it is the core organizing concept. It bridges:

  • Ethical dimensions: Equity, Care, Consent
  • Operational dimensions: Design, Iterative Development, Systems Thinking
  • Resistance dimensions: Capture Resistance, Resistance, Agency

8.3 Source Convergence and Divergence

  • Convergence: Both sources treat Adaptability as central, suggesting a shared understanding that protocols must be capable of change
  • Divergence: AFP emphasizes critical/social dimensions (Institutional Critique, Collectivity), while PR emphasizes technical/systemic dimensions (Tension Management, Systems Thinking)
  • Integration: The AFP co-occurrence network is denser (8.2% vs. 5.3%) and AFP texts average more virtue pairs (4.06 vs. 2.22), suggesting that the critical texts draw more interconnections between concepts

8.4 Unexpected Pairings

Several virtue pairs show unexpected strength:

  • Care + Consent (0.750): Suggests that, in these texts, an ethics of care is rarely articulated without consent frameworks (based on 3 of 4 texts)
  • Refusal + Subversion (0.400): Tactical language clusters together
  • Equity + Inclusivity (0.375): Justice requires both fair distribution and openness

8.5 The Resistance Cluster

The small cluster of resistance-focused texts (4 texts in Cluster 2) represents a distinct discourse that:

  • Appears only in AFP texts
  • Coheres around Refusal, Resistance, Subversion, Embodiment
  • Serves as a strategic counterpoint to the dominant protocol design discourse
  • May represent the critical "edge cases" that test protocol boundaries

9. Methodological Notes

Analytic Techniques Used:

  1. K-Means Clustering (k=4): Identified text groups based on virtue profile similarity
  2. Network Analysis: Mapped virtue co-occurrences and calculated centrality (degree = number of connections)
  3. Jaccard Similarity: Normalized measure of virtue pair association (intersection/union)
  4. Community Detection: Threshold-based clustering of highly connected virtue groups
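All four techniques start from the same input: one set of virtues per text, encoded as a binary presence/absence vector. The sketch below shows one way to build that encoding with the standard library. The `Virtue_1` through `Virtue_5` column names come from this report; the function names, the handling of blank cells, and the in-memory sample rows are assumptions for illustration.

```python
import csv
import io

VIRTUE_COLUMNS = [f"Virtue_{i}" for i in range(1, 6)]


def read_virtue_sets(handle):
    """Read one set of virtues per CSV row, skipping blank or missing cells."""
    return [
        {row[col].strip() for col in VIRTUE_COLUMNS if row.get(col) and row[col].strip()}
        for row in csv.DictReader(handle)
    ]


def binary_vectors(virtue_sets):
    """Return (vocabulary, vectors) with a 1 wherever a text has the virtue."""
    vocabulary = sorted(set().union(*virtue_sets))
    vectors = [[1 if v in s else 0 for v in vocabulary] for s in virtue_sets]
    return vocabulary, vectors


# Made-up sample rows standing in for coding.csv.
sample = io.StringIO(
    "Virtue_1,Virtue_2,Virtue_3,Virtue_4,Virtue_5\n"
    "Care,Consent,,,\n"
    "Design,,,,\n"
)
vocabulary, vectors = binary_vectors(read_virtue_sets(sample))
```

For the real file, the `io.StringIO` sample would be replaced by `open("coding.csv", newline="")`.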

Limitations:

  • Small dataset (134 texts) limits statistical power
  • K-means clustering is sensitive to initialization (used deterministic starting points)
  • Binary coding (presence/absence) doesn't capture intensity or salience
  • Limited to the Virtue_1 through Virtue_5 columns; other coding dimensions not analyzed

Generated Files:

| File | Description |
|---|---|
| cooccurrence_matrix.csv | 25×25 matrix of virtue co-occurrence counts |
| jaccard_similarity_matrix.csv | 25×25 similarity matrix (Jaccard indices) |
| strong_associations.csv | Top 50 virtue pairs with association metrics |
| virtue_profiles.json | Individual virtue profiles for each text |

10. Recommendations for Further Analysis

  1. Qualitative Deep Dive: Examine the 4 resistance-focused texts (Cluster 2) and the 10 authenticity-focused texts (Cluster 4) to understand the distinct discourses

  2. Temporal Analysis: If dates are available, analyze how virtue frequencies change over time

  3. Semantic Mapping: The Care+Consent pairing could be explored through close reading to understand the conceptual linkage

  4. Source-Specific Models: Consider whether different theoretical frameworks might be needed for AFP vs. PR texts

  5. Expand to Other Codes: Analysis currently limited to Virtue_1 through Virtue_5; expanding to other coding categories could reveal additional patterns

  6. Visualization: Generate network graphs of virtue communities to make relationships visually explicit


Analysis generated using Python standard library (no external packages required). All calculations are fully reproducible.