Transform your AI interactions. Master prompt engineering with professional-grade tools and interactive learning.
Complete API documentation for all prompt engineering tools in this repository.
Enhanced prompt validation system with multi-factor weighted scoring.
### PromptValidator

```python
from notebooks.prompt_validator import PromptValidator

validator = PromptValidator()
```
#### `score_prompt(prompt: str) -> Dict`

Scores a prompt using weighted criteria and returns detailed feedback.
Parameters:
- prompt (str): The prompt text to validate

Returns:
Dict with keys:
- overall_score (float): 0-100 score
- breakdown (Dict): Individual component scores
- weighted_scores (Dict): Weighted component scores
- feedback (List[str]): Improvement suggestions
- suggestions (List[str]): Quick-win recommendations
- grade (str): Letter grade (A+ to F)

Example:
```python
validator = PromptValidator()

result = validator.score_prompt(
    "You are a copywriter. Write a 200-word blog post about AI."
)

print(f"Score: {result['overall_score']}%")
print(f"Grade: {result['grade']}")
for feedback in result['feedback']:
    print(f"- {feedback}")
```
| Criteria | Weight | Description |
|---|---|---|
| Clarity | 25% | Clear, actionable language |
| Specificity | 25% | Specific vs vague language |
| Context | 20% | Role and context setting |
| Structure | 15% | Well-structured prompt |
| Examples | 15% | Examples and format guidance |
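For intuition, here is a minimal sketch of how weights like these could combine component scores into an overall score; the weight keys and the 0-100 component scale are assumptions for illustration, not the validator's internal representation.

```python
# Hypothetical illustration of weighted scoring; PromptValidator's internal
# breakdown keys and scales may differ.
WEIGHTS = {
    "clarity": 0.25,
    "specificity": 0.25,
    "context": 0.20,
    "structure": 0.15,
    "examples": 0.15,
}

def weighted_overall(breakdown: dict) -> float:
    """Combine 0-100 component scores into a 0-100 overall score."""
    return sum(WEIGHTS[name] * breakdown[name] for name in WEIGHTS)

print(weighted_overall(
    {"clarity": 90, "specificity": 85, "context": 70, "structure": 60, "examples": 50}
))  # 74.25
```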
Production validation framework for real-world deployment scenarios.
### ProductionValidator

```python
from notebooks.production_validator import ProductionValidator, TestCase

validator = ProductionValidator()
```
#### `add_test_case(prompt_id: str, test_case: TestCase)`

Add a test case for validation.
Parameters:
- prompt_id (str): Unique identifier for the prompt
- test_case (TestCase): Test case configuration

Example:
```python
test_case = TestCase(
    input_data="Test input",
    expected_output_type="text",
    expected_keywords=["response", "answer"],
    min_length=10,
    max_length=500
)

validator.add_test_case("my_prompt", test_case)
```
#### `test_consistency(prompt_id: str, run_prompt_fn, num_runs: int = 5) -> Dict`

Test prompt consistency across multiple runs.
Parameters:
- prompt_id (str): Prompt identifier
- run_prompt_fn (callable): Function that takes input and returns output
- num_runs (int): Number of runs to test (default: 5)

Returns:
Dict with consistency metrics

Example:
```python
def run_prompt(input_data: str) -> str:
    llm_response = ...  # Your LLM call here, e.g. client.generate(input_data).content
    return llm_response

result = validator.test_consistency("my_prompt", run_prompt, num_runs=5)
print(f"Consistency Score: {result['consistency_score']}")
```
#### `test_robustness(prompt_id: str, run_prompt_fn) -> Dict`

Test prompt robustness with edge cases.
Parameters:
- prompt_id (str): Prompt identifier
- run_prompt_fn (callable): Function that takes input and returns output

Returns:
Dict with robustness metrics
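A usage sketch, reusing the run_prompt function from the consistency example above; the keys of the returned dict are not documented here, so the result is printed as-is.

```python
# Exercises the prompt against edge-case inputs via run_prompt.
result = validator.test_robustness("my_prompt", run_prompt)
print(result)  # robustness metrics (key names depend on the implementation)
```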
#### `test_performance(prompt_id: str, run_prompt_fn, num_iterations: int = 10) -> Dict`

Test prompt performance (execution time).

Parameters:
- prompt_id (str): Prompt identifier
- run_prompt_fn (callable): Function to test
- num_iterations (int): Number of iterations (default: 10)

Returns:
Dict with performance metrics
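A usage sketch under the same assumptions as above:

```python
# Times num_iterations calls to run_prompt for this prompt.
result = validator.test_performance("my_prompt", run_prompt, num_iterations=10)
print(result)  # execution-time metrics (key names depend on the implementation)
```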
#### `validate_in_production(prompt_id: str, run_prompt_fn, include_edge_cases: bool = True, include_performance: bool = True) -> Dict`

Complete production validation suite.

Returns:
Dict with:
- production_ready_score (float): 0-100 score
- production_ready (bool): Ready for deployment
- test_results (Dict): All test results
- recommendations (List[str]): Improvement recommendations
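A usage sketch of the full suite; it assumes the test cases and run_prompt function from the earlier examples are already in place and uses only the return keys documented above.

```python
report = validator.validate_in_production(
    "my_prompt",
    run_prompt,
    include_edge_cases=True,
    include_performance=True,
)

print(f"Production ready: {report['production_ready']}")
print(f"Score: {report['production_ready_score']}/100")
for recommendation in report['recommendations']:
    print(f"- {recommendation}")
```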
Model-agnostic LLM provider support for OpenAI, Anthropic, and Ollama.

### UnifiedLLMClient

```python
from notebooks.model_providers import UnifiedLLMClient, ModelProviderFactory

# Auto-detect provider
client = UnifiedLLMClient()

# Or specify provider
provider = ModelProviderFactory.create_provider("openai", model="gpt-4")
client = UnifiedLLMClient(provider=provider)
```
#### `generate(prompt: str, **kwargs) -> LLMResponse`

Generate a response from the LLM.
Parameters:
- prompt (str): Input prompt
- model (str, optional): Override default model
- max_tokens (int, optional): Maximum tokens (default: 1000)
- temperature (float, optional): Temperature 0-2 (default: 0.7)

Returns:
LLMResponse with:
- content (str): Generated text
- model (str): Model used
- provider (str): Provider name
- tokens_used (int, optional): Tokens consumed
- latency_ms (float, optional): Response time in milliseconds
- error (str, optional): Error message if failed

Example:
```python
response = client.generate(
    "Write a haiku about AI",
    model="gpt-4",
    max_tokens=100,
    temperature=0.8
)

if response.error:
    print(f"Error: {response.error}")
else:
    print(response.content)
    print(f"Latency: {response.latency_ms:.0f}ms")
```
#### `get_provider_name() -> str`

Get the current provider name.
#### `get_available_models() -> List[str]`

Get the list of available models for the current provider.
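A brief usage sketch for the two helpers above (the printed values are illustrative):

```python
print(client.get_provider_name())      # e.g. "openai"
print(client.get_available_models())   # e.g. ["gpt-4", "gpt-3.5-turbo", ...]
```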
#### `switch_provider(provider_type: str, **kwargs)`

Switch to a different provider.
Example:
```python
client.switch_provider("anthropic", model="claude-3-opus-20240229")
```
### ModelProviderFactory

#### `create_provider(provider_type: str, **kwargs) -> LLMProvider`

Create a provider instance.
Supported providers:

- "openai": OpenAI GPT models
- "anthropic": Anthropic Claude models
- "ollama": Local Ollama models

Example:
```python
# OpenAI
provider = ModelProviderFactory.create_provider(
    "openai",
    api_key="sk-...",  # Optional if set in env
    model="gpt-4"
)

# Anthropic
provider = ModelProviderFactory.create_provider(
    "anthropic",
    api_key="sk-ant-...",  # Optional if set in env
    model="claude-3-sonnet-20240229"
)

# Ollama (local)
provider = ModelProviderFactory.create_provider(
    "ollama",
    base_url="http://localhost:11434",
    model="llama2"
)
```
### PromptABTester

```python
from notebooks.ab_testing_framework import PromptABTester

tester = PromptABTester()
```
#### `create_test(test_id: str, prompt_a: str, prompt_b: str, metric: str, description: str) -> PromptTest`

Create a new A/B test.
Example:
```python
test = tester.create_test(
    test_id="email_subject_test",
    prompt_a="Write a subject line",
    prompt_b="You are an email marketing expert. Write a compelling subject line...",
    metric="click_through_rate",
    description="Testing generic vs specific prompt"
)
```
#### `record_result(test_id: str, prompt_version: str, score: float, response_text: str, notes: str = None)`

Record a test result.
Example:
```python
tester.record_result(
    "email_subject_test",
    "A",
    4.2,
    "Weekly Newsletter #47",
    notes="Low engagement"
)
```
#### `analyze_test(test_id: str) -> Dict`

Analyze test results.
Returns:
Dict with statistical analysis

#### `generate_report(test_id: str) -> str`

Generate a formatted report.
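A usage sketch combining the two methods above; the structure of the analysis dict is not documented here, so it is printed as-is.

```python
analysis = tester.analyze_test("email_subject_test")
print(analysis)  # statistical comparison of prompts A and B

report = tester.generate_report("email_subject_test")
print(report)
```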
### ProgressTracker

```python
from notebooks.progress_tracker import ProgressTracker

tracker = ProgressTracker(student_name="Your Name")
```
#### `update_skill(week: str, skill: str, score: float)`

Update a skill score (0-1).
Example:
```python
tracker.update_skill("week1_foundations", "prompt_debugging", 0.8)
```
#### `record_assessment(week: str, assessment_type: str, score: float, details: Dict = None)`

Record assessment results.
#### `complete_project(project_name: str, description: str, github_link: str = None)`

Mark a project as completed.
#### `get_overall_progress() -> Dict`

Get overall progress statistics.
#### `generate_skill_report() -> str`

Generate a detailed skill progress report.
#### `generate_certificate(week: str) -> str`

Generate a completion certificate.
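A usage sketch covering the remaining methods; the argument values (and the details keys) are illustrative only.

```python
# Record an assessment and a completed project (example values)
tracker.record_assessment("week1_foundations", "quiz", 0.85, details={"attempts": 1})
tracker.complete_project(
    "Prompt Library",
    "Reusable prompt templates for support workflows",
    github_link=None,
)

# Inspect progress and generate reports
print(tracker.get_overall_progress())
print(tracker.generate_skill_report())
print(tracker.generate_certificate("week1_foundations"))
```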
All classes handle errors gracefully; for example, `UnifiedLLMClient.generate()` reports failures through the `error` field of `LLMResponse`, which callers can check before using the response (see the `generate()` example above).
See the examples/ directory for complete working examples of all tools.