A unified Ruby interface for multiple Language Model providers from Ekohe. Seamlessly integrate OpenAI GPT, Anthropic Claude, Google Gemini, Groq, Ollama, OpenRouter, and Z.ai (Zhipu AI) with a single, consistent API.
- 🚀 Multi-Provider Support - 7+ LLM providers with automatic vendor detection
- 🎯 Unified API - Same interface across all providers
- 🖼️ Vision Support - Send images alongside text (OpenAI, Anthropic, OpenRouter, Z.ai, Gemini)
- 🔧 Custom Parameters - Fine-tune with temperature, top_p, and more
- 💰 Cost Tracking - Automatic token counting and cost estimation
- ⚡ Smart Configuration - Environment variables or code-based setup
- 📊 Model Evaluation - Compare models/vendors on cost, latency, tokens & LLM-judged quality (opt-in)
gem 'llm_conductor'bundle installrequire 'llm_conductor'
# Set up your API key (or use ENV variables)
LlmConductor.configure do |config|
config.openai(api_key: 'your-api-key')
end
# Generate text
response = LlmConductor.generate(
model: 'gpt-4o-mini',
prompt: 'Explain quantum computing in simple terms'
)
puts response.output # Generated text
puts response.total_tokens # Token count
puts response.estimated_cost # Cost in USD# Control creativity with temperature
response = LlmConductor.generate(
model: 'llama2',
prompt: 'Write a creative story',
vendor: :ollama,
params: { temperature: 0.9 }
)# Send images with your prompt
response = LlmConductor.generate(
model: 'gpt-4o',
prompt: {
text: 'What is in this image?',
images: ['https://example.com/image.jpg']
}
)Set these environment variables and the gem auto-configures:
export OPENAI_API_KEY=your-key-here
export ANTHROPIC_API_KEY=your-key-here
export GEMINI_API_KEY=your-key-here
export GROQ_API_KEY=your-key-here
export OLLAMA_ADDRESS=http://localhost:11434 # Optional
export OPENROUTER_API_KEY=your-key-here
export ZAI_API_KEY=your-key-hereLlmConductor.configure do |config|
config.default_model = 'gpt-4o-mini'
config.openai(api_key: ENV['OPENAI_API_KEY'])
config.anthropic(api_key: ENV['ANTHROPIC_API_KEY'])
config.gemini(api_key: ENV['GEMINI_API_KEY'])
config.groq(api_key: ENV['GROQ_API_KEY'])
config.ollama(base_url: 'http://localhost:11434')
config.openrouter(api_key: ENV['OPENROUTER_API_KEY'])
config.zai(api_key: ENV['ZAI_API_KEY'])
end| Provider | Auto-Detect | Vision | Custom Params |
|---|---|---|---|
| OpenAI (GPT) | ✅ gpt-* |
✅ | 🔜 |
| Anthropic (Claude) | ✅ claude-* |
✅ | 🔜 |
| Google (Gemini) | ✅ gemini-* |
✅ | 🔜 |
| Groq | ✅ llama/mixtral |
❌ | 🔜 |
| Ollama | ✅ (default) | ❌ | ✅ |
| OpenRouter | 🔧 Manual | ✅ | 🔜 |
| Z.ai (Zhipu) | ✅ glm-* |
✅ | 🔜 |
response = LlmConductor.generate(
model: 'gpt-4o-mini',
prompt: 'What is Ruby programming language?'
)response = LlmConductor.generate(
model: 'claude-3-5-sonnet-20241022',
type: :summarize_text,
data: {
text: 'Long article content here...',
max_length: '100 words',
style: 'professional'
}
)response = LlmConductor.generate(
model: 'llama2',
prompt: 'Extract email addresses from: contact@example.com',
vendor: :ollama,
params: { temperature: 0.0, seed: 42 }
)response = LlmConductor.generate(
model: 'gpt-4o',
prompt: {
text: 'Describe this image in detail',
images: [
'https://example.com/photo.jpg',
'https://example.com/diagram.png'
]
}
)response = LlmConductor.generate(...)
response.output # String - Generated text
response.success? # Boolean - Success status
response.model # String - Model used
response.input_tokens # Integer - Input token count
response.output_tokens # Integer - Output token count
response.total_tokens # Integer - Total tokens
response.estimated_cost # Float - Cost in USD (if available)
response.metadata # Hash - Additional info
# Parse JSON responses
response.parse_json # Hash - Parsed JSON output
# Extract code blocks
response.extract_code_block('ruby') # String - Code contentCreate reusable, testable prompt templates:
class AnalysisPrompt < LlmConductor::Prompts::BasePrompt
def render
<<~PROMPT
Analyze: #{title}
Content: #{truncate_text(content, max_length: 500)}
Provide insights in JSON format.
PROMPT
end
end
# Register and use
LlmConductor::PromptManager.register(:analyze, AnalysisPrompt)
response = LlmConductor.generate(
model: 'gpt-4o-mini',
type: :analyze,
data: { title: 'Article', content: '...' }
)Structure complex data for LLM consumption:
class CompanyDataBuilder < LlmConductor::DataBuilder
def build
{
name: source_object.name,
description: format_for_llm(source_object.description, max_length: 500),
metrics: build_metrics,
summary: build_company_summary
}
end
private
def build_metrics
{
employees: format_number(source_object.employee_count),
revenue: format_number(source_object.annual_revenue, format: :currency)
}
end
endresponse = LlmConductor.generate(...)
if response.success?
puts response.output
else
puts "Error: #{response.metadata[:error]}"
puts "Error class: #{response.metadata[:error_class]}"
endWhich model/vendor is best for your prompt? The eval harness runs the same
prompt across N (model, vendor) pairs over M inputs and compares them on
cost, latency, tokens, and LLM-judged quality — three of which generate
already produces for free.
It's behind a separate require so core users pay nothing:
require 'llm_conductor/eval'
# 1. Describe how to evaluate your feature (the only feature-specific code).
class ArticleSummarySpec < LlmConductor::Eval::Spec
def prompt_type = :summarize_text # a registered prompt type
def input_id(article) = article[:id]
def input_label(article) = article[:title]
def build_data(article) = { content: article[:body] } # payload for generate(type:, data:)
# { score:, bucket: } — bucket is any discrete label; powers disagreement detection.
def output_summary(parsed) = { score: parsed['rating'], bucket: parsed['verdict'] }
def judge_rubric_excerpt = 'A good summary is faithful, concise, and covers the key points.'
def judge_dimensions
[{ key: 'faithfulness', description: 'no hallucinations vs. the source' },
{ key: 'coverage', description: 'captures the key points' }]
end
end
# 2. Run it. `inputs` is ANY enumerable — selection is YOUR job, never the gem's.
report = LlmConductor::Eval.run(
spec: ArticleSummarySpec.new,
inputs: my_articles,
models: [ # caller-owned; no baked-in defaults
{ model: 'phi4-mini', vendor: :ollama },
{ model: 'gemini-2.5-flash', vendor: :gemini },
{ model: 'gpt-4o-mini', vendor: :openai }
],
judge: { model: 'llama-3.3-70b-versatile', vendor: :groq }, # default; needs Groq creds
store: LlmConductor::Eval::Store::FileStore.new('tmp/llm_eval') # or in-memory (default)
)
report.summary # per-model aggregates: parse-OK%, mean quality, p50/p95 latency, cost
report.to_markdown # decision-aid report (you persist it)
report.to_csv # full per-row data
report.needs_review # rows flagged for a human (bucket disagreement / borderline / errors)Judge bias matters. The judge defaults to Groq's llama-3.3-70b-versatile
precisely because it sits outside the Gemini/OpenAI/Ollama candidate families —
a model grading its own family scores it high. Any row where the judged model
equals the judge model is flagged self_judge=true so you can discount it.
Cheap re-runs reuse stored candidate outputs — no re-calling the candidates:
LlmConductor::Eval.judge_only(run_id:, spec:, store:, judge: { model: 'gemini-2.5-pro', vendor: :gemini })
LlmConductor::Eval.report_only(run_id:, spec:, store:)See examples/model_eval_usage.rb for a complete runnable example.
- Custom Parameters Guide - Temperature, top_p, and more
- Vision Support Guide - Using images with LLMs
- Model Evaluation - Compare models on cost, latency & LLM-judged quality
- Examples - Working code examples for all providers
Check the examples/ directory for comprehensive examples:
simple_usage.rb- Basic text generationollama_params_usage.rb- Custom parameters with Ollamagpt_vision_usage.rb- Vision with OpenAIclaude_vision_usage.rb- Vision with Anthropicgemini_vision_usage.rb- Vision with Geminiopenrouter_vision_usage.rb- Vision with OpenRouterzai_usage.rb- Using Z.ai GLM modelsdata_builder_usage.rb- Data builder patternsprompt_registration.rb- Custom prompt classesrag_usage.rb- Retrieval-Augmented Generationmodel_eval_usage.rb- Model evaluation harness (cost/latency/quality comparison)
Run any example:
ruby examples/simple_usage.rb# Clone and setup
git clone https://github.com/ekohe/llm-conductor.git
cd llm-conductor
bundle install
# Run tests
bundle exec rspec
# Run linter
bundle exec rubocop
# Interactive console
bin/console- Fork it
- Create your feature branch (
git checkout -b my-new-feature) - Commit your changes (
git commit -am 'Add some feature') - Push to the branch (
git push origin my-new-feature) - Create new Pull Request
Ensure tests pass and RuboCop is clean before submitting.
The gem is available as open source under the terms of the MIT License.
Developed with ❤️ by Ekohe - Making AI practical, achievable, and useful.