Flatten data/readings/ → data/

Remove the intermediate readings/ subdirectory level: dataset naming (synthetic_YYYYMMDD, manual_YYYYMMDD) already encodes what the data is. Update all path references across scripts and docs accordingly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

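As a rough illustration of the path update described in this commit message, the rewrite of references could be sketched in Python as follows. The `flatten_references` helper and its regular expression are hypothetical and not part of the repository; the sketch assumes every old reference has the form `data/readings/<dataset>` with a `synthetic_YYYYMMDD` or `manual_YYYYMMDD` dataset name.

```python
import re

# Assumption: old references always look like data/readings/<dataset>, where
# <dataset> is synthetic_YYYYMMDD or manual_YYYYMMDD. Other paths are left alone.
OLD_PREFIX = re.compile(r"data/readings/(?=(?:synthetic|manual)_\d{8})")


def flatten_references(text: str) -> str:
    """Rewrite data/readings/<dataset> references to data/<dataset>."""
    return OLD_PREFIX.sub("data/", text)


line = "scripts/sync_readings.sh data/readings/manual_20260320"
print(flatten_references(line))
# prints: scripts/sync_readings.sh data/manual_20260320
```

Because the pattern only matches the old prefix, running the helper a second time on already-flattened text leaves it unchanged.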
@@ -27,10 +27,10 @@ The scripts automatically draw the gradients from the current state of the [bico
 
 ## Syncing a manual readings dataset
 
-If the dataset has a `.sync_source` file (e.g., `data/readings/manual_20260320/`), one command handles everything:
+If the dataset has a `.sync_source` file (e.g., `data/manual_20260320/`), one command handles everything:
 
 ```bash
-scripts/sync_readings.sh data/readings/manual_20260320
+scripts/sync_readings.sh data/manual_20260320
 ```
 
 This fetches new JSON files from the remote repo, regenerates `readings.csv`, runs multivariate analysis (with `--min-coverage 0.8` to handle shortform readings), generates the LDA visualization, and saves cluster classifications to `analysis/classifications.csv`.
@@ -39,15 +39,15 @@ This fetches new JSON files from the remote repo, regenerates `readings.csv`, ru
 
 ```bash
 # Full analysis pipeline
-python3 scripts/multivariate_analysis.py data/readings/manual_20260320/readings.csv \
+python3 scripts/multivariate_analysis.py data/manual_20260320/readings.csv \
 --min-coverage 0.8 \
 --analyses clustering pca correlation importance
 
 # LDA visualization (cluster separation plot)
-python3 scripts/lda_visualization.py data/readings/manual_20260320/readings.csv
+python3 scripts/lda_visualization.py data/manual_20260320/readings.csv
 
 # Classify all readings (uses synthetic dataset as training data by default)
-python3 scripts/classify_readings.py data/readings/manual_20260320/readings.csv
+python3 scripts/classify_readings.py data/manual_20260320/readings.csv
 ```
 
 Use `--min-coverage` (0.0–1.0) to drop dimension columns below the given coverage fraction before analysis. This is important for datasets with many shortform readings where most dimensions are sparsely filled.
@@ -57,8 +57,8 @@ Use `--min-coverage` (0.0–1.0) to drop dimension columns below the given cover
 If you have a directory of individual bicorder JSON reading files:
 
 ```bash
-python3 scripts/json_to_csv.py data/readings/manual_20260320/json/ \
--o data/readings/manual_20260320/readings.csv
+python3 scripts/json_to_csv.py data/manual_20260320/json/ \
+-o data/manual_20260320/readings.csv
 ```
 
 ---
@@ -68,7 +68,7 @@ python3 scripts/json_to_csv.py data/readings/manual_20260320/json/ \
 ### Process All Protocols with One Command
 
 ```bash
-python3 scripts/bicorder_batch.py data/readings/synthetic_20251116/protocols_edited.csv -o analysis_output.csv
+python3 scripts/bicorder_batch.py data/synthetic_20251116/protocols_edited.csv -o analysis_output.csv
 ```
 
 This will:
@@ -81,13 +81,13 @@ This will:
 
 ```bash
 # Process only rows 1-5 (useful for testing)
-python3 scripts/bicorder_batch.py data/readings/synthetic_20251116/protocols_edited.csv -o analysis_output.csv --start 1 --end 5
+python3 scripts/bicorder_batch.py data/synthetic_20251116/protocols_edited.csv -o analysis_output.csv --start 1 --end 5
 
 # Use specific LLM model
-python3 scripts/bicorder_batch.py data/readings/synthetic_20251116/protocols_edited.csv -o analysis_output.csv -m mistral
+python3 scripts/bicorder_batch.py data/synthetic_20251116/protocols_edited.csv -o analysis_output.csv -m mistral
 
 # Add analyst metadata
-python3 scripts/bicorder_batch.py data/readings/synthetic_20251116/protocols_edited.csv -o analysis_output.csv \
+python3 scripts/bicorder_batch.py data/synthetic_20251116/protocols_edited.csv -o analysis_output.csv \
 -a "Your Name" -s "Your analytical standpoint"
 ```
 
@@ -100,12 +100,12 @@ python3 scripts/bicorder_batch.py data/readings/synthetic_20251116/protocols_edi
 Create a CSV with empty gradient columns:
 
 ```bash
-python3 scripts/bicorder_analyze.py data/readings/synthetic_20251116/protocols_edited.csv -o analysis_output.csv
+python3 scripts/bicorder_analyze.py data/synthetic_20251116/protocols_edited.csv -o analysis_output.csv
 ```
 
 Optional: Add analyst metadata:
 ```bash
-python3 scripts/bicorder_analyze.py data/readings/synthetic_20251116/protocols_edited.csv -o analysis_output.csv \
+python3 scripts/bicorder_analyze.py data/synthetic_20251116/protocols_edited.csv -o analysis_output.csv \
 -a "Your Name" -s "Your analytical standpoint"
 ```
 
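
The README text in this diff says `--min-coverage` drops dimension columns whose coverage falls below the given fraction. The implementation in `scripts/multivariate_analysis.py` is not shown here, so the following is only a minimal sketch of that idea. The helper name `columns_meeting_coverage`, the sample column names, the treatment of empty strings as missing values, and the inclusive threshold are all assumptions.

```python
def columns_meeting_coverage(rows, columns, min_coverage):
    """Return the columns whose share of non-empty values is >= min_coverage."""
    if not 0.0 <= min_coverage <= 1.0:
        raise ValueError("min_coverage must be between 0.0 and 1.0")
    if not rows:
        return []
    kept = []
    for column in columns:
        # Assumption: a missing key, None, or "" counts as "not filled".
        filled = sum(1 for row in rows if row.get(column) not in (None, ""))
        if filled / len(rows) >= min_coverage:
            kept.append(column)
    return kept


# Hypothetical readings: dim_a is filled in 4 of 5 rows, dim_b in 1 of 5.
rows = [
    {"dim_a": "3", "dim_b": "7"},
    {"dim_a": "5", "dim_b": ""},
    {"dim_a": "2", "dim_b": ""},
    {"dim_a": "8", "dim_b": ""},
    {"dim_a": "", "dim_b": ""},
]
print(columns_meeting_coverage(rows, ["dim_a", "dim_b"], 0.8))
# prints: ['dim_a']
```

With many shortform readings, most dimensions would look like `dim_b` here, which is why a threshold such as 0.8 removes them before analysis.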