Protocol Bicorder Analysis Workflow
This directory contains scripts for analyzing protocols using the Protocol Bicorder framework with LLM assistance.
Scripts
- bicorder_batch.py - [RECOMMENDED] Process entire CSV with one command
- bicorder_analyze.py - Prepares CSV with gradient columns
- bicorder_query.py - Queries LLM for each gradient value and updates CSV (each query is a new chat)
Quick Start (Recommended)
Process All Protocols with One Command
python3 bicorder_batch.py protocols_edited.csv -o analysis_output.csv
This will:
- Create the analysis CSV with gradient columns
- For each protocol row, query all gradients (each query is a new chat with full protocol context)
- Update the CSV automatically with the results
- Show progress and summary
Common Options
# Process only rows 1-5 (useful for testing)
python3 bicorder_batch.py protocols_edited.csv -o analysis_output.csv --start 1 --end 5
# Use specific LLM model
python3 bicorder_batch.py protocols_edited.csv -o analysis_output.csv -m mistral
# Add analyst metadata
python3 bicorder_batch.py protocols_edited.csv -o analysis_output.csv \
-a "Your Name" -s "Your analytical standpoint"
Manual Workflow (Advanced)
Step 1: Prepare the Analysis CSV
Create a CSV with empty gradient columns:
python3 bicorder_analyze.py protocols_edited.csv -o analysis_output.csv
Optional: Add analyst metadata:
python3 bicorder_analyze.py protocols_edited.csv -o analysis_output.csv \
-a "Your Name" -s "Your analytical standpoint"
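As a rough illustration of what "empty gradient columns" means, the sketch below copies a CSV and appends one blank column per gradient. It is not the code in bicorder_analyze.py: the function name `add_gradient_columns` and the two column names are hypothetical placeholders, and the real gradient names come from the Bicorder framework.

```python
import csv

# Hypothetical gradient column names, for illustration only.
GRADIENT_COLUMNS = ["explicit_vs_implicit", "rigid_vs_flexible"]


def add_gradient_columns(src_path, dst_path, gradient_columns=GRADIENT_COLUMNS):
    """Copy a CSV, appending one empty column per gradient."""
    with open(src_path, newline="") as src:
        reader = csv.DictReader(src)
        fieldnames = list(reader.fieldnames or [])
        rows = list(reader)

    # Skip columns that already exist so a second run does not duplicate them.
    new_columns = [c for c in gradient_columns if c not in fieldnames]

    with open(dst_path, "w", newline="") as dst:
        writer = csv.DictWriter(dst, fieldnames=fieldnames + new_columns)
        writer.writeheader()
        for row in rows:
            for col in new_columns:
                row[col] = ""  # left blank until a rating is written
            writer.writerow(row)
    return fieldnames + new_columns
```

Leaving the cells blank rather than filling in a default makes it easy to tell later which gradients have not been rated yet.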
Step 2: Query Gradients for a Protocol Row
Query all gradients for a specific protocol:
python3 bicorder_query.py analysis_output.csv 1
- Replace 1 with the row number you want to analyze
- Each gradient is queried in a new chat with full protocol context
- Each response is automatically parsed and written to the CSV
- Progress is shown for each gradient
Optional: Specify a model:
python3 bicorder_query.py analysis_output.csv 1 -m mistral
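Step 2 says each response is parsed before it is written to the CSV. One plausible way to pull a 1-9 rating out of free-form LLM text is sketched below; `parse_rating` is a hypothetical helper, and the actual parsing in bicorder_query.py may differ.

```python
import re


def parse_rating(response_text):
    """Return the first standalone integer 1-9 in the response, or None."""
    # The lookarounds reject digits that are part of a longer number such as 10 or 42.
    match = re.search(r"(?<!\d)[1-9](?!\d)", response_text)
    return int(match.group()) if match else None
```

Returning None instead of raising lets the caller leave the cell empty and retry later. Note that this sketch takes the first matching digit, so a reply that restates the scale before giving the rating would be misread.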
Step 3: Repeat for All Protocols
For each protocol in your CSV:
python3 bicorder_query.py analysis_output.csv 1
python3 bicorder_query.py analysis_output.csv 2
python3 bicorder_query.py analysis_output.csv 3
# ... and so on
# OR: Use bicorder_batch.py to automate all of this!
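If you want to script the repetition yourself instead of using bicorder_batch.py, a minimal sketch is to count the data rows and build one command per row. The helper name `query_commands` is hypothetical, and the sketch assumes row 1 refers to the first data row after the header.

```python
import csv
import sys


def query_commands(csv_path, script="bicorder_query.py", model=None):
    """Build one bicorder_query.py command per data row (rows numbered from 1)."""
    with open(csv_path, newline="") as f:
        row_count = sum(1 for _ in csv.DictReader(f))

    commands = []
    for row_number in range(1, row_count + 1):
        cmd = [sys.executable, script, csv_path, str(row_number)]
        if model:
            cmd += ["-m", model]  # same -m option shown in Step 2
        commands.append(cmd)
    return commands
```

Each returned list can be passed to `subprocess.run(cmd, check=True)`, so that a failing row stops the loop rather than being silently skipped.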
Architecture
How It Works
Each gradient query is sent to the LLM as a new, independent chat. Every query includes:
- The protocol descriptor (name)
- The protocol description
- The gradient definition (left term, right term, and their descriptions)
- Instructions to rate 1-9
This approach:
- Simplifies the code - No conversation state management
- Reduces carry-over bias - Each evaluation is independent, not influenced by previous responses
- Enables parallelization - Queries could theoretically run concurrently
- Makes debugging easier - Each query/response pair is self-contained
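The independent-query design described above can be sketched roughly as follows. This is an illustration, not the scripts' actual code: the dictionary keys (`descriptor`, `description`, `left`, `right`, `left_description`, `right_description`), the prompt wording, and the `ask_llm` callable are all assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def build_prompt(protocol, gradient):
    """Assemble one self-contained prompt for a single gradient."""
    return (
        f"Protocol: {protocol['descriptor']}\n"
        f"Description: {protocol['description']}\n\n"
        f"Gradient: {gradient['left']} vs {gradient['right']}\n"
        f"- {gradient['left']}: {gradient['left_description']}\n"
        f"- {gradient['right']}: {gradient['right_description']}\n\n"
        "Rate where this protocol falls on the gradient, from 1 to 9. "
        "Reply with a single integer."
    )


def rate_all(protocol, gradients, ask_llm, max_workers=4):
    """Send one independent prompt per gradient; results keep gradient order."""
    prompts = [build_prompt(protocol, g) for g in gradients]
    # No prompt depends on another's answer, so they can run concurrently.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(ask_llm, prompts))
```

Because every prompt carries the full protocol context, `ask_llm` can be any function that takes a prompt string and returns the reply text, which also makes it easy to substitute a stub when testing.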
Tips
Dry Run Mode
Test prompts without calling the LLM:
python3 bicorder_query.py analysis_output.csv 1 --dry-run
This shows you exactly what prompt will be sent for each gradient, including the full protocol context.
Check Your Progress
Count how many gradients are still empty in each row:
python3 -c "
import csv
with open('analysis_output.csv') as f:
    reader = csv.DictReader(f)
    for i, row in enumerate(reader, 1):
        empty = sum(1 for k, v in row.items() if 'vs' in k and not v)
        print(f'Row {i}: {empty}/23 gradients empty')
"
Batch Processing
Use the bicorder_batch.py script (see Quick Start section above) for processing multiple protocols.