LLMs, MCP, agents & intelligent systems—from prototype to production
I am an AI-first engineer: I design and ship systems around large language models, agents, and modern tooling, with the rigor to make them reliable in the real world. That means evaluation, safety, and automation built in from day one, not bolted on at the end.
I build systematic ways to stress-test and observe AI behavior in the wild: catching hallucinations, hardening against prompt injection, and tracking model drift before users feel the pain.
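As a minimal illustration of one such check, a prompt-injection screen can flag model outputs that echo known attack markers. The marker list and function names below are illustrative assumptions, not part of a specific framework:

```typescript
// Hypothetical sketch: flag model outputs that suggest a successful prompt injection.
// The marker list is illustrative; a real screen would use a much richer detector.
const INJECTION_MARKERS: string[] = [
  "ignore previous instructions",
  "system prompt",
  "developer message",
];

function flagsInjection(modelOutput: string): boolean {
  const lower = modelOutput.toLowerCase();
  return INJECTION_MARKERS.some((marker) => lower.includes(marker));
}

// Screen a batch of responses and count the suspicious ones.
function countSuspicious(outputs: string[]): number {
  return outputs.filter(flagsInjection).length;
}
```

In practice a screen like this runs over every logged response, so regressions surface in monitoring rather than in user reports.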
Master AI-Assisted Development: Comprehensive guides showing exactly how this portfolio was built
Production Framework: Systematic LLM validation, safety testing, and performance monitoring
Math Accuracy: 94% | Safety Score: 87% | Overall: 91%
Use Case: Generate Playwright test automation code
"Test login functionality with valid credentials"
import { test, expect } from '@playwright/test';

test('login with valid credentials', async ({ page }) => {
  await page.goto('/login');
  await page.fill('[data-testid="email"]', 'user@example.com');
  await page.fill('[data-testid="password"]', 'password123');
  await page.click('[data-testid="login-button"]');
  await expect(page).toHaveURL('/dashboard');
});
Use Case: Analyze test failures and suggest fixes
Test failure: Element not found after 5s timeout
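A failure-analysis step like this can be sketched as a simple classifier over failure messages. The categories and suggestions below are illustrative assumptions, not the system's actual rules:

```typescript
// Illustrative sketch: map a raw test-failure message to a likely cause
// and a suggested fix. Categories and advice are assumptions for demonstration.
interface Triage {
  cause: string;
  suggestion: string;
}

function triageFailure(message: string): Triage {
  if (/not found.*timeout/i.test(message)) {
    return {
      cause: "locator-timeout",
      suggestion:
        "Verify the selector and wait for the element (or a settled network) before asserting.",
    };
  }
  if (/strict mode violation/i.test(message)) {
    return {
      cause: "ambiguous-locator",
      suggestion: "Narrow the locator so it matches exactly one element.",
    };
  }
  return { cause: "unknown", suggestion: "Inspect the trace and screenshot for context." };
}
```

An LLM layer can then draft the fix itself, with the rule-based triage keeping its output grounded.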
Career Transition Project: 4-week structured plan to move into prompt engineering and AI-first development
Automated Discovery System: Weekly scans for new LLMs and groundbreaking Gen AI research papers
Research Focus: A hybrid architecture combining Parallel-Agent Reinforcement Learning (PARL) with Input Domain-Aware Mixture of Experts (IDA-MoE) for dynamic AI testing. Bridges agent swarms and MoE for intelligent code quality assurance.
Research Focus: Comprehensive analysis of testing approaches for Large Language Models, including hallucination detection, bias measurement, and safety validation frameworks.
Research Focus: Exploring Model Context Protocol applications in software testing, examining how standardized AI-tool communication can revolutionize test automation and create context-aware testing frameworks.
Research Focus: Investigating autonomous AI agents for software testing, from existing platform integration to specialized testing agent development and multi-agent orchestration systems.
Research Focus: Comprehensive framework for evaluating AI models in software testing contexts, including benchmarking methodologies, performance metrics, ROI analysis, and production deployment strategies.
Research Focus: A practical healthcare case study showing why QA professionals should adopt AI agentic flows for software testing, demonstrating autonomous agents for test generation, security scanning, and compliance validation.
Research Focus: Academic research comparing Manager-Worker, Collaborative Swarm, and Sequential Pipeline architectures for AI testing systems, with empirical results from 50 trials demonstrating optimal task decomposition strategies.
Research Focus: Analysis of major AI breakthroughs in October-December 2025 and their implications for software testing, AI systems, and autonomous agents.
Research Focus: Living overview of AI testing trends and research discoveries for builders of AI systems. Updated monthly by the Research & Literary Agent—no manual publish needed.
Research Focus: Practical LLM safety and red-teaming for QA leaders—a first sprint teams can run before production: eight adversarial families, forty prompts, spreadsheet template, Pass/Conditional/Fail rubric, and numeric release gate.
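As a sketch of how a numeric release gate over a Pass/Conditional/Fail rubric could work (the verdict weights and the 0.85 threshold below are illustrative assumptions, not the article's actual values):

```typescript
// Hypothetical release gate: score red-team prompt verdicts, then gate the release.
// Weights and the 0.85 threshold are illustrative assumptions.
type Verdict = "pass" | "conditional" | "fail";

const VERDICT_SCORE: Record<Verdict, number> = { pass: 1.0, conditional: 0.5, fail: 0.0 };

function gateScore(verdicts: Verdict[]): number {
  if (verdicts.length === 0) return 0;
  const total = verdicts.reduce((sum, v) => sum + VERDICT_SCORE[v], 0);
  return total / verdicts.length;
}

function releaseDecision(verdicts: Verdict[], threshold = 0.85): "SHIP" | "HOLD" {
  return gateScore(verdicts) >= threshold ? "SHIP" : "HOLD";
}
```

A single number like this turns red-team results into a repeatable go/no-go decision instead of a judgment call per release.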
Dedicated hub for QA leaders focused on LLM safety and red-teaming—parallel to the main AI research index, with its own cadence and article list.
Sister hub to the LLM Safety series: short articles on governance, principles, and operational ethics for the Ethical AI Frameworks LinkedIn group.
Research Focus: A human-centered approach to understanding and testing AI systems, exploring hallucinations, bias, and safety considerations through gentle observation and compassionate inquiry.
Research Focus: A research paper exploring model drift in machine learning systems: what it is, why it happens, how to detect it, and how to fix it. Written for the curious, not just the experts.
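To make the detection idea concrete, here is a minimal sketch of the population stability index (PSI), one common drift statistic, computed over binned score distributions. The bin edges and the ~0.2 alert threshold are conventional rules of thumb, not values from the paper:

```typescript
// Minimal PSI sketch: compare a feature's baseline distribution to a recent window.
// A PSI above roughly 0.2 is a common rule of thumb for meaningful drift.
function histogram(values: number[], edges: number[]): number[] {
  const counts = new Array(edges.length - 1).fill(0);
  for (const v of values) {
    for (let i = 0; i < edges.length - 1; i++) {
      if (v >= edges[i] && v < edges[i + 1]) {
        counts[i]++;
        break;
      }
    }
  }
  return counts.map((c) => Math.max(c / values.length, 1e-6)); // floor avoids log(0)
}

function psi(baseline: number[], current: number[], edges: number[]): number {
  const p = histogram(baseline, edges);
  const q = histogram(current, edges);
  return p.reduce((sum, pi, i) => sum + (pi - q[i]) * Math.log(pi / q[i]), 0);
}
```

Identical distributions score near zero; the further the recent window shifts from the baseline, the larger the PSI.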
A proven 6-12 month strategy for transitioning traditional QA teams to AI-augmented ways of working. Transform your team into AI-first leaders.
Production-ready test automation for elamcb.github.io with AI-powered testing via MCP. Validates critical functionality, navigation, and performance with 100% test success rate.
An ethical automation system that demonstrates intelligent job matching, application tracking, and interview preparation using AI and test automation principles.
Interactive comparison of 10 AI-powered development environments tested over 100+ hours. S-Tier through B-Tier rankings with detailed pros, cons, and real-world performance insights.
Practical solution for introducing AI capabilities into legacy enterprise systems without disruption. Addresses the #1 barrier to AI adoption in established companies.
A systematic mean reversion trading strategy with automated backtesting, risk management, and performance analytics. Demonstrates quantitative analysis and systematic decision-making.
An innovative educational tool that explains complex AI concepts through biological analogies. Demonstrates interdisciplinary thinking connecting AI with biological systems, making advanced concepts accessible through familiar natural processes.
Weekly automated discovery system that scans GitHub, Hacker News, and research sources to find cutting-edge AI tools, frameworks, and methodologies applicable to quality assurance and testing. Stay ahead of the latest innovations in AI testing.
Resources and projects currently in development. Includes ETL testing templates, AI innovations for data QA, and experimental tools.
Modular unified agent system with multiple capabilities working 24/7 to maintain, monitor, and enhance this portfolio. Single workflow, shared utilities, easy to extend.
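One common shape for that kind of "single workflow, easy to extend" design is a capability registry that the unified agent iterates over. The interface below is a hypothetical sketch, not the portfolio's actual code:

```typescript
// Hypothetical capability-registry sketch for a modular agent: each capability
// plugs into one shared run loop. Names and shapes are illustrative.
interface Capability {
  name: string;
  run(): string; // returns a short status report
}

class UnifiedAgent {
  private capabilities: Capability[] = [];

  register(cap: Capability): void {
    this.capabilities.push(cap);
  }

  // Single workflow: run every registered capability and collect reports.
  runAll(): string[] {
    return this.capabilities.map((cap) => `${cap.name}: ${cap.run()}`);
  }
}
```

Adding a new capability is then one `register` call, which is what keeps the system easy to extend.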
Unique Visitors
Total Views
Development Speed
AI Systems Used
Test your skills and have some fun with AI-powered challenges!
Can you tell the difference between AI-generated and human-written code? Test your skills!
Challenge yourself: AI-generated code often has certain patterns, while human code shows personal style and creative problem-solving approaches.
I'm always interested in discussing AI-augmented testing, test automation frameworks, or potential collaboration opportunities.
Contact Me