# Protocol Bicorder Analysis Workflow

This directory contains scripts for analyzing protocols using the Protocol Bicorder framework with LLM assistance.

The scripts automatically draw the gradients from the current state of the [bicorder.json](../bicorder.json) file.

## Scripts

1. **bicorder_batch.py** - **[RECOMMENDED]** Process entire CSV with one command
2. **bicorder_analyze.py** - Prepares CSV with gradient columns
3. **bicorder_query.py** - Queries LLM for each gradient value and updates CSV (each query is a new chat)

## Quick Start (Recommended)

### Process All Protocols with One Command

```bash
python3 bicorder_batch.py protocols_edited.csv -o analysis_output.csv
```

This will:

1. Create the analysis CSV with gradient columns
2. For each protocol row, query all gradients (each query is a new chat with full protocol context)
3. Update the CSV automatically with the results
4. Show progress and summary

### Common Options

```bash
# Process only rows 1-5 (useful for testing)
python3 bicorder_batch.py protocols_edited.csv -o analysis_output.csv --start 1 --end 5

# Use specific LLM model
python3 bicorder_batch.py protocols_edited.csv -o analysis_output.csv -m mistral

# Add analyst metadata
python3 bicorder_batch.py protocols_edited.csv -o analysis_output.csv \
  -a "Your Name" -s "Your analytical standpoint"
```

---

## Manual Workflow (Advanced)

### Step 1: Prepare the Analysis CSV

Create a CSV with empty gradient columns:

```bash
python3 bicorder_analyze.py protocols_edited.csv -o analysis_output.csv
```

Optional: Add analyst metadata:

```bash
python3 bicorder_analyze.py protocols_edited.csv -o analysis_output.csv \
  -a "Your Name" -s "Your analytical standpoint"
```

### Step 2: Query Gradients for a Protocol Row

Query all gradients for a specific protocol:

```bash
python3 bicorder_query.py analysis_output.csv 1
```

- Replace `1` with the row number you want to analyze
- Each gradient is queried in a new chat with full protocol context
- Each response is automatically parsed and written to the CSV
- Progress is shown for each gradient

Optional: Specify a model:

```bash
python3 bicorder_query.py analysis_output.csv 1 -m mistral
```

### Step 3: Repeat for All Protocols

For each protocol in your CSV:

```bash
python3 bicorder_query.py analysis_output.csv 1
python3 bicorder_query.py analysis_output.csv 2
python3 bicorder_query.py analysis_output.csv 3
# ... and so on

# OR: Use bicorder_batch.py to automate all of this!
```

## Architecture

### How It Works

Each gradient query is sent to the LLM as a **new, independent chat**. Every query includes:

- The protocol descriptor (name)
- The protocol description
- The gradient definition (left term, right term, and their descriptions)
- Instructions to rate 1-9

This approach:

- **Simplifies the code** - No conversation state management
- **Prevents bias** - Each evaluation is independent, not influenced by previous responses
- **Enables parallelization** - Queries could theoretically run concurrently
- **Makes debugging easier** - Each query/response pair is self-contained

## Tips

### Dry Run Mode

Test prompts without calling the LLM:

```bash
python3 bicorder_query.py analysis_output.csv 1 --dry-run
```

This shows you exactly what prompt will be sent for each gradient, including the full protocol context.

### Check Your Progress

View completed values:

```bash
python3 -c "
import csv
with open('analysis_output.csv') as f:
    reader = csv.DictReader(f)
    for i, row in enumerate(reader, 1):
        empty = sum(1 for k, v in row.items() if 'vs' in k and not v)
        print(f'Row {i}: {empty}/23 gradients empty')
"
```

### Batch Processing

Use the `bicorder_batch.py` script (see Quick Start section above) for processing multiple protocols.
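
### Sketch of an Independent Query

To make the design described under Architecture more concrete, here is a minimal sketch of how a single self-contained prompt could be assembled and how a 1-9 rating could be pulled out of the reply. This is not the code in `bicorder_query.py`: the function names `build_gradient_prompt` and `parse_rating`, the prompt wording, and the assumption that 1 maps to the left term and 9 to the right term are all hypothetical and only illustrate the idea.

```python
import re
from typing import Optional


def build_gradient_prompt(descriptor: str, description: str,
                          left_term: str, left_desc: str,
                          right_term: str, right_desc: str) -> str:
    """Assemble one self-contained prompt (hypothetical wording).

    Everything the model needs is in this one string, so no
    conversation state has to be carried between gradients.
    """
    return (
        f"Protocol: {descriptor}\n"
        f"Description: {description}\n\n"
        f"Gradient: {left_term} vs {right_term}\n"
        # Assumption: 1 is the left end of the scale, 9 the right end.
        f"  1 = {left_term}: {left_desc}\n"
        f"  9 = {right_term}: {right_desc}\n\n"
        "Rate this protocol on the gradient from 1 to 9. "
        "Reply with a single integer."
    )


def parse_rating(response: str) -> Optional[int]:
    """Return the first standalone digit 1-9 in the reply, or None.

    The lookarounds reject digits that are part of a longer number,
    so a reply such as "10" is treated as unparseable.
    """
    match = re.search(r"(?<!\d)[1-9](?!\d)", response)
    return int(match.group()) if match else None


if __name__ == "__main__":
    # Placeholder values, not taken from bicorder.json.
    prompt = build_gradient_prompt(
        "Example Protocol", "A placeholder description.",
        "left term", "description of the left term",
        "right term", "description of the right term",
    )
    print(prompt)
    print(parse_rating("I would rate this a 7."))
```

Because each prompt is built from the protocol row and one gradient definition alone, a failed or odd response can be reproduced by rebuilding just that prompt, which is the debugging benefit noted above. Use `--dry-run` to see the prompts the real script produces.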