# AI-Unix Tools Test Suite
This test suite validates the functionality of the AI-powered Unix tools through multiple testing strategies that account for the non-deterministic nature of LLM outputs.
## Test Structure

```
tests/
├── run-tests.sh              # Main test runner
├── data/                     # Test data files
│   ├── sample-log.txt        # Sample log entries
│   ├── contacts.txt          # Contact information
│   ├── feedback.txt          # User feedback samples
│   └── empty.txt             # Empty file for edge cases
├── unit/                     # Deterministic unit tests
│   ├── test-basic.sh         # Exit codes, option parsing
│   └── test-formats.sh       # Output format validation
└── integration/              # Semantic integration tests
    └── test-semantic.sh      # AI-powered semantic validation
```
## Test Categories

### 1. Deterministic Tests (`unit/test-basic.sh`)
Tests properties that should always be consistent:
- **Exit codes**: Correct error codes for various scenarios
- **Option parsing**: Help flags, invalid options, missing arguments
- **File handling**: Non-existent files, empty files
- **Input/output**: Stdin vs. file input behavior
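Because these properties do not depend on what the model generates, they can be asserted directly on exit statuses. The sketch below uses a hypothetical `assert_exit` helper; the commented line shows where a real tool invocation would go (the expected exit code there is an assumption, not documented behavior), and standard commands stand in for demonstration.

```shell
#!/bin/sh
# Sketch of a deterministic exit-code check (hypothetical helper; the real
# suite lives in unit/test-basic.sh). assert_exit runs a command and compares
# its exit status against the expected value.
fail=0
assert_exit() {
  expected="$1"; shift
  "$@" >/dev/null 2>&1
  actual=$?
  if [ "$actual" -ne "$expected" ]; then
    echo "FAIL: '$*' exited $actual, expected $expected"
    fail=1
  fi
}

# A real test would target the AI tools, e.g.:
#   assert_exit 2 ../ai-grep "errors" /no/such/file
# Demonstrated here with standard commands:
assert_exit 0 true
assert_exit 1 false
[ "$fail" -eq 0 ] && echo "exit-code checks passed"
```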
### 2. Format Validation Tests (`unit/test-formats.sh`)
Tests output format correctness:
- **JSON validation**: `ai-cut -j` produces valid JSON
- **TSV format**: Tab-separated values in default `ai-cut` output
- **Line numbers**: `ai-grep -n` prefixes matches with line numbers
- **Count format**: `ai-grep -c` outputs a single number
- **Category prefixes**: `ai-class` outputs in "category:" format
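Format checks are also deterministic even when the content varies: parse the output rather than eyeballing it. In the sketch below a canned string stands in for captured tool output (in the real test you would capture it with something like `json_output=$(./ai-cut -j ... < input)`; that invocation is an assumption about your layout).

```shell
#!/bin/sh
# Validate JSON structurally with python3, as the format tests do.
# The canned string below is a stand-in for real ai-cut -j output.
json_output='[{"name": "Alice", "email": "alice@example.com"}]'

if printf '%s' "$json_output" | python3 -c 'import json, sys; json.load(sys.stdin)' 2>/dev/null; then
  echo "valid JSON"
else
  echo "invalid JSON"
fi
```

The same pattern works for TSV: assert that each output line contains a literal tab (`grep -q "$(printf '\t')"`).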
### 3. Semantic Tests (`integration/test-semantic.sh`)
Uses `ai-test` itself to validate semantic properties:
- **Semantic matching**: `ai-grep` finds semantically related content
- **Extraction quality**: `ai-cut` extracts appropriate fields
- **Categorization accuracy**: `ai-class` assigns reasonable categories
- **Transformation preservation**: `ai-tr` maintains semantic meaning
- **Pipeline behavior**: The tools work together semantically
## Running Tests

### Run Specific Test Suites

```bash
./run-tests.sh --basic        # Only deterministic tests
./run-tests.sh --format       # Only format validation tests
./run-tests.sh --semantic     # Only semantic tests (requires claude)
./run-tests.sh --no-semantic  # Skip semantic tests
```
## Requirements

- All ai-unix tools must be built and executable in the parent directory
- `python3` (for JSON validation in format tests)
- `claude` CLI tool (for semantic tests)
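A small preflight sketch can verify these prerequisites before any test runs. The tool list below matches the requirements above; extending it to check that each built ai-unix binary is executable in the parent directory is left as an adjustment for your checkout.

```shell
#!/bin/sh
# Preflight sketch: report which prerequisite commands are on PATH.
missing=0
for cmd in python3 claude; do
  if command -v "$cmd" >/dev/null 2>&1; then
    echo "found: $cmd"
  else
    echo "missing: $cmd"
    missing=1
  fi
done
[ "$missing" -eq 0 ] || echo "note: semantic tests will be skipped without claude"
```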
## Understanding Test Results
### Deterministic Tests
These should always pass if the tools are correctly implemented. Failures indicate:
- Incorrect exit codes
- Broken option parsing
- Basic functionality problems
### Format Validation Tests
These validate that output formats match the specifications. Failures indicate:
- Invalid JSON output from `ai-cut -j`
- Missing tab separators in TSV output
- Incorrect line-number formatting
- Wrong count output format
### Semantic Tests
These use AI to validate AI output quality. Failures may indicate:
- Poor semantic matching by `ai-grep`
- Incorrect field extraction by `ai-cut`
- Bad categorization by `ai-class`
- Loss of meaning in `ai-tr` transformations
## Interpreting Non-Deterministic Results
Since AI outputs vary, semantic tests may occasionally fail even when the tools are correct. Consider:
- **Single failures**: May be normal variation; re-run to confirm
- **Consistent failures**: Likely indicate real issues
- **Pattern analysis**: Look for patterns across multiple runs
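One way to turn "re-run to confirm" into data is to repeat a flaky semantic test and report a pass rate. The loop below is a sketch: `run_semantic_test` is a stub standing in for a real invocation (e.g. an `ai-test` call against a data file), and the threshold for "likely a real issue" is yours to choose.

```shell
#!/bin/sh
# Estimate a pass rate for a non-deterministic test by repeating it.
run_semantic_test() {
  # Stub: replace with the real call, e.g.
  #   ai-test "contains realistic log entries" tests/data/sample-log.txt
  true
}

runs=5
passes=0
i=1
while [ "$i" -le "$runs" ]; do
  run_semantic_test && passes=$((passes + 1))
  i=$((i + 1))
done
echo "passed $passes/$runs"
```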
## Extending the Test Suite

### Adding New Test Data
Place new test files in `tests/data/` and reference them in the test scripts.

### Adding Deterministic Tests
Add new test functions to `unit/test-basic.sh` or `unit/test-formats.sh`.
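New deterministic tests can follow a simple ok/not-ok pattern. The helper names below (`check`, `test_empty_file_handling`) are hypothetical, not the harness's actual API; mirror whatever conventions `unit/test-basic.sh` already uses.

```shell
#!/bin/sh
# Sketch of a test-function pattern: check() runs an assertion command and
# tallies results; each test_* function groups related assertions.
tests_run=0
tests_failed=0

check() {
  desc="$1"; shift
  tests_run=$((tests_run + 1))
  if "$@"; then
    echo "ok - $desc"
  else
    echo "not ok - $desc"
    tests_failed=$((tests_failed + 1))
  fi
}

test_empty_file_handling() {
  tmp=$(mktemp)
  check "empty input file is empty" test ! -s "$tmp"
  rm -f "$tmp"
}

test_empty_file_handling
echo "$tests_run run, $tests_failed failed"
```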
### Adding Semantic Tests
Add new semantic validation tests to `integration/test-semantic.sh` using `ai-test`.
### Self-Improving Tests
Use `ai-test` creatively to validate its own assumptions:

```bash
# Example: Test if the test data is appropriate
ai-test "contains realistic log entries" tests/data/sample-log.txt
```
## Best Practices

1. **Test exit codes first** - They're the most reliable indicators
2. **Validate formats before semantics** - Structural problems are easier to debug
3. **Use ai-test liberally** - Let AI validate AI output quality
4. **Create focused test data** - Each file should test specific scenarios
5. **Document expected behaviors** - Especially for semantic edge cases