Autonomous CI Fix Agent Guide - Auto-Fix GitHub Actions Failures

Overview
What It Does
Supported Auto-Fixes
Setup Instructions
How It Works
Enhancing with AI
Customization
Example: Complete AI-Powered Version
Monitoring & Alerts
Best Practices
Troubleshooting
Cost Considerations

Overview

This guide describes an autonomous agent that monitors your GitHub repository, detects CI/CD failures, analyzes the errors, and automatically fixes common issues without human intervention.

What It Does

Core Capabilities

Monitors: Watches GitHub Actions workflows for failures
Analyzes: Uses pattern matching and AI to understand errors
Fixes: Automatically applies fixes for common issues
Reports: Creates summaries and issues for complex problems

Key Benefits

No manual intervention needed for common errors
Faster CI/CD pipeline recovery
Reduced developer context switching
Consistent fix quality
24/7 monitoring and fixing

Supported Auto-Fixes

1. NPM Lock File Sync Issues

Error: npm ci can only install packages when your package.json and package-lock.json are in sync

Common Cause: Someone updated package.json but forgot to commit the updated package-lock.json

Auto-Fix: Runs npm install to regenerate the lock file, then commits and pushes the change

# The agent automatically runs:
npm install
git add package-lock.json
git commit -m "🤖 Auto-fix: Update package-lock.json to sync with package.json"
git push

2. Missing Dependencies

Error: Missing: [package] from lock file

Auto-Fix: Installs missing dependencies and updates lock file

# The agent automatically runs:
npm install [missing-package]
git add package-lock.json package.json
git commit -m "🤖 Auto-fix: Install missing dependencies"
git push

3. Other Errors

Error: Unknown or complex errors

Action: Creates a GitHub issue with error details for manual review

Smart Behavior: The agent only auto-fixes errors it's confident about. Complex or unknown errors are flagged for human review.

Setup Instructions

Step 1: Enable the Workflow

The workflow file is already created at .github/workflows/autonomous-ci-fix-agent.yml

Verify the File Exists

# Check if the workflow file exists
ls .github/workflows/autonomous-ci-fix-agent.yml

Step 2: Configure Permissions

The workflow needs these permissions (already configured):

contents: write - To commit fixes
pull-requests: write - To create PRs (if needed)
issues: write - To create issues for complex errors

Note: These permissions are already set in the workflow file. No action needed unless you want to modify them.

Step 3: Test the Agent

Option A: Manual Trigger

Go to your GitHub repository
Navigate to Actions tab
Select Autonomous CI Fix Agent
Click Run workflow

Option B: Automatic Trigger

The agent will trigger automatically when CI workflows fail. To test:

Intentionally break a workflow in a way the agent can fix (e.g., change a dependency version in package.json without updating package-lock.json)
Push the change
Wait for the workflow to fail
The agent will automatically analyze and attempt to fix

How It Works

Workflow Triggers

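on:
  workflow_run:
    workflows: ["CI", "Tests", "Build"]
    types:
      - completed
  workflow_dispatch:

The agent runs when:

A workflow named "CI", "Tests", or "Build" completes with a failure. The workflow_run event fires on every completion, so the job checks the run's conclusion and proceeds only on failure.
It is triggered manually from the Actions tab, which requires the workflow_dispatch trigger.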

Customization: You can modify the workflows list to monitor different workflow names.

Error Analysis

The agent uses pattern matching to identify common errors:

# NPM lock file sync
# grep needs an input file; without one it waits on stdin
if grep -q "npm ci.*can only install packages" workflow_logs.txt; then
  fix_action="run_npm_install"
fi

# Missing dependencies
if grep -q "Missing:.*from lock file" workflow_logs.txt; then
  fix_action="run_npm_install"
fi

How It Works:

Downloads the failed workflow's logs
Searches for known error patterns
Identifies the error type
Determines the appropriate fix action
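
The four steps above can be sketched as one small shell function. This is a minimal sketch, not the agent's actual code: the function name classify_error and the create_issue fallback value are hypothetical, while the two patterns and the run_npm_install action come from the snippet above.

```shell
#!/usr/bin/env bash
# Sketch: map known error patterns in a downloaded log file to a fix action.
# classify_error and the "create_issue" fallback are hypothetical names.

classify_error() {
  local log_file="$1"

  if grep -q "npm ci.*can only install packages" "$log_file"; then
    echo "run_npm_install"   # lock file out of sync with package.json
  elif grep -q "Missing:.*from lock file" "$log_file"; then
    echo "run_npm_install"   # dependency missing from the lock file
  else
    echo "create_issue"      # unknown error: leave it for a human
  fi
}
```

A workflow step could call it as fix_action=$(classify_error workflow_logs.txt) and branch on the result.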

Auto-Fix Process

Detect Error: Analyze workflow logs
Identify Type: Match error patterns
Apply Fix: Run appropriate fix command
Commit: Automatically commit the fix
Report: Create summary or issue

Example Fix Flow

CI workflow fails with "npm ci can only install packages..."
Agent detects the error pattern
Agent runs npm install
Agent checks if package-lock.json changed
If changed, agent commits and pushes the fix
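CI workflow runs again automatically (if configured). Pushes made with the default GITHUB_TOKEN do not trigger new workflow runs, so an automatic re-run requires pushing with a personal access token or a GitHub App token.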
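
The "checks if package-lock.json changed" step in this flow could look roughly like the helper below. It is a sketch under assumptions: file_changed is a hypothetical name, and it relies on git diff --quiet exiting with a non-zero status when the working copy of a tracked file differs from the index.

```shell
#!/usr/bin/env bash
# Sketch: succeed (exit 0) when a tracked file has unstaged modifications.
# file_changed is a hypothetical helper name.

file_changed() {
  local repo_dir="$1" file="$2"
  # git diff --quiet exits 1 when the working tree differs from the index,
  # so invert it to get "true when changed".
  ! git -C "$repo_dir" diff --quiet -- "$file"
}

# Possible use after running npm install:
#   if file_changed . package-lock.json; then
#     git add package-lock.json
#     git commit -m "🤖 Auto-fix: Update package-lock.json to sync with package.json"
#     git push
#   fi
```

Guarding the commit this way avoids a failed step when npm install leaves the lock file untouched.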

Enhancing with AI

Option 1: Use OpenAI API (More Intelligent)

Add this step to analyze errors with GPT:

- name: Analyze error with OpenAI
  env:
    OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
  run: |
    ERROR_LOG=$(cat workflow_logs.txt)

    # Build the request body with jq so quotes and newlines in the log are escaped
    PAYLOAD=$(jq -n --arg log "$ERROR_LOG" '{
      model: "gpt-4",
      messages: [
        {role: "system", content: "You are a CI/CD error analyzer. Analyze the error and suggest a fix."},
        {role: "user", content: ("Error: " + $log)}
      ]
    }')

    RESPONSE=$(curl -s https://api.openai.com/v1/chat/completions \
      -H "Authorization: Bearer $OPENAI_API_KEY" \
      -H "Content-Type: application/json" \
      -d "$PAYLOAD")

    # Multi-line values need the delimiter syntax in $GITHUB_OUTPUT
    {
      echo "analysis<<EOF_ANALYSIS"
      echo "$RESPONSE"
      echo "EOF_ANALYSIS"
    } >> "$GITHUB_OUTPUT"

Benefits: More intelligent error analysis, can handle complex errors, suggests better fixes

Cost: ~$0.01-0.10 per analysis

Option 2: Use Ollama (Free, Local)

If you have a self-hosted runner with Ollama:

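- name: Analyze with Ollama
  run: |
    ERROR_LOG=$(cat workflow_logs.txt)

    ANALYSIS=$(ollama run llama3.2:3b "Analyze this CI error and suggest a fix: $ERROR_LOG")

    # Multi-line values need the delimiter syntax in $GITHUB_OUTPUT
    {
      echo "analysis<<EOF_ANALYSIS"
      echo "$ANALYSIS"
      echo "EOF_ANALYSIS"
    } >> "$GITHUB_OUTPUT"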

Benefits: Completely free, runs locally, no API costs, private

Requirement: Self-hosted GitHub Actions runner with Ollama installed

Option 3: Use GitHub Copilot API

- name: Analyze with GitHub Copilot
  uses: actions/github-script@v7
  with:
    github-token: ${{ secrets.GITHUB_TOKEN }}
    script: |
      // fs is not predefined in github-script, so load it explicitly
      const fs = require('fs');
      const errorLog = fs.readFileSync('workflow_logs.txt', 'utf8');
      // Use GitHub API to analyze
      // Implementation depends on Copilot API availability

Note: GitHub Copilot API integration may require additional setup. Check GitHub's documentation for current availability.

Customization

Add More Error Patterns

Edit the workflow to add more patterns:

- name: Analyze error with AI
  id: analyze  # later steps read steps.analyze.outputs
  run: |
    ERROR_LOG=$(cat workflow_logs.txt)

    # Add your custom patterns
    if echo "$ERROR_LOG" | grep -q "Your custom error pattern"; then
      echo "error_type=custom_error" >> "$GITHUB_OUTPUT"
      echo "fix_action=custom_fix" >> "$GITHUB_OUTPUT"
    fi

Example: Add Python Dependency Error

# Detect: "ERROR: Could not find a version that satisfies the requirement"
if echo "$ERROR_LOG" | grep -q "Could not find a version"; then
  echo "error_type=python_dependency" >> $GITHUB_OUTPUT
  echo "fix_action=update_requirements" >> $GITHUB_OUTPUT
fi

Add More Auto-Fixes

Add new fix steps:

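- name: Auto-fix custom error
  if: steps.analyze.outputs.error_type == 'custom_error'
  run: |
    # Your fix commands here
    npm run fix-custom-issue
    # Commits need an author identity on the runner
    git config user.name "github-actions[bot]"
    git config user.email "41898282+github-actions[bot]@users.noreply.github.com"
    git add .
    # Commit and push only when the fix changed something
    if ! git diff --cached --quiet; then
      git commit -m "🤖 Auto-fix: Custom error"
      git push
    fi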

Example: Auto-fix Python Dependencies

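- name: Auto-fix Python dependencies
  if: steps.analyze.outputs.error_type == 'python_dependency'
  run: |
    pip install --upgrade pip
    # pip install alone never edits requirements.txt, so regenerate it first.
    # Assumption: the project uses pip-tools with a requirements.in file.
    pip install pip-tools
    pip-compile --upgrade requirements.in
    pip install -r requirements.txt
    git add requirements.txt
    # Commit and push only when requirements.txt changed
    if ! git diff --cached --quiet; then
      git commit -m "🤖 Auto-fix: Update Python dependencies"
      git push
    fi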

Monitor Different Workflows

Change which workflows trigger the agent:

on:
  workflow_run:
    workflows: ["Your-Workflow-Name", "Another-Workflow"]
    types:
      - completed

Tip: You can monitor all workflows by using workflows: ["*"], but be careful: the agent will then start after every workflow completion, which may include its own runs, so keep the failure check on the job.

Example: Complete AI-Powered Version

Here's a more advanced version using OpenAI:

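name: AI-Powered CI Fix Agent

on:
  workflow_run:
    workflows: ["CI"]
    types:
      - completed

jobs:
  ai-fix:
    if: github.event.workflow_run.conclusion == 'failure'
    runs-on: ubuntu-latest
    permissions:
      contents: write
      pull-requests: write
      actions: read  # needed to download the failed run's logs

    steps:
      - uses: actions/checkout@v4
        with:
          # Check out the branch that failed, not the default branch
          ref: ${{ github.event.workflow_run.head_branch }}

      - name: Get error logs
        env:
          GH_TOKEN: ${{ github.token }}  # the gh CLI reads its token from GH_TOKEN
        run: |
          # Keep the log outside the checkout so `git add .` never commits it
          gh run view ${{ github.event.workflow_run.id }} --log-failed > "$RUNNER_TEMP/error.log"

      - name: AI Analysis
        id: ai
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
        run: |
          # Send only the tail of the log to keep the request small
          ERROR=$(tail -n 200 "$RUNNER_TEMP/error.log")

          # Build the request with jq so quotes and newlines in the log are escaped
          PAYLOAD=$(jq -n --arg log "$ERROR" '{
            model: "gpt-4",
            messages: [
              {role: "system", content: "Analyze CI errors and return JSON: {\"error_type\": \"...\", \"fix_commands\": [\"...\"], \"confidence\": 0.9}"},
              {role: "user", content: $log}
            ]
          }')

          curl -s https://api.openai.com/v1/chat/completions \
            -H "Authorization: Bearer $OPENAI_API_KEY" \
            -H "Content-Type: application/json" \
            -d "$PAYLOAD" | jq -r '.choices[0].message.content' > "$RUNNER_TEMP/analysis.json"

          # Set a flag only when the reply parses as JSON and contains fix commands
          if jq -e '.fix_commands | length > 0' "$RUNNER_TEMP/analysis.json" > /dev/null; then
            echo "has_fix=true" >> "$GITHUB_OUTPUT"
          fi

      - name: Apply AI-suggested fix
        if: steps.ai.outputs.has_fix == 'true'
        run: |
          # Read one command per line so commands containing spaces stay intact
          jq -r '.fix_commands[]' "$RUNNER_TEMP/analysis.json" | while IFS= read -r cmd; do
            eval "$cmd" < /dev/null
          done

          # Expressions cannot pipe to jq, so extract the error type in the shell
          ERROR_TYPE=$(jq -r '.error_type' "$RUNNER_TEMP/analysis.json")
          git config user.name "github-actions[bot]"
          git config user.email "41898282+github-actions[bot]@users.noreply.github.com"
          git add .
          git diff --cached --quiet || git commit -m "🤖 AI Auto-fix: $ERROR_TYPE"
          git push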

Security Note: Be very careful when executing AI-suggested commands. Always review the commands before execution, or add a safety check to only execute commands from a whitelist.
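
One way to implement the safety check mentioned above is an exact-match allowlist. The sketch below is illustrative only: is_allowed_command, run_allowed_commands, and the listed commands are hypothetical, and a real list should contain only commands you have reviewed.

```shell
#!/usr/bin/env bash
# Sketch: run AI-suggested commands only when they exactly match an allowlist.
# Function names and the allowlisted commands are hypothetical examples.

is_allowed_command() {
  # Exact string match only, so "npm install; rm -rf ." is rejected
  case "$1" in
    "npm install"|"npm ci"|"npm install --package-lock-only") return 0 ;;
    *) return 1 ;;
  esac
}

# Reads one command per line from stdin and runs only the allowlisted ones.
run_allowed_commands() {
  local cmd
  while IFS= read -r cmd; do
    if is_allowed_command "$cmd"; then
      sh -c "$cmd" < /dev/null
    else
      echo "Skipping command not on allowlist: $cmd" >&2
    fi
  done
}
```

In the workflow above, the eval loop could be replaced by piping the jq output into run_allowed_commands. Matching whole strings rather than prefixes is deliberate: it stops a suggested command from smuggling extra shell syntax after an approved prefix.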

Monitoring & Alerts

Get Notifications

Add Slack/Discord notifications:

- name: Notify on fix
  if: steps.analyze.outputs.error_type != 'no_logs'
  uses: slackapi/slack-github-action@v1
  env:
    # v1 of this action reads the webhook from environment variables
    SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK }}
    SLACK_WEBHOOK_TYPE: INCOMING_WEBHOOK
  with:
    payload: |
      {
        "text": "🤖 Auto-fixed CI error: ${{ steps.analyze.outputs.error_type }}"
      }

Email Notifications

- name: Send email notification
  uses: dawidd6/action-send-mail@v3
  with:
    server_address: smtp.gmail.com
    server_port: 465
    username: ${{ secrets.EMAIL_USERNAME }}
    password: ${{ secrets.EMAIL_PASSWORD }}
    subject: "CI Auto-Fix: ${{ steps.analyze.outputs.error_type }}"
    body: "The agent fixed: ${{ steps.analyze.outputs.error_type }}"
    to: your-email@example.com
    from: CI Fix Agent  # sender name, required by this action

Best Practices

Start Simple: Begin with pattern matching, add AI later
Test Thoroughly: Test on non-critical branches first
Monitor Results: Review auto-fixes to improve patterns
Set Boundaries: Only auto-fix safe, common errors
Document: Keep track of what the agent fixes

Safety Recommendations

Only auto-fix errors you're 100% confident about
Require manual approval for complex fixes
Set up alerts for all auto-fixes
Review agent actions regularly
Have a rollback plan

Troubleshooting

Agent Not Triggering

Check workflow names: Ensure the workflows list matches the name: field of each monitored workflow exactly (case-sensitive)
Verify permissions: Check that permissions are set correctly
Check the workflow_run event: It only fires when the agent's workflow file is on the repository's default branch
Check Actions tab: Look for any error messages

Fixes Not Working

Review error logs: Check Actions logs for details
Check fix commands: Verify commands are correct
Verify git permissions: Ensure the agent can commit
Check branch protection: Protected branches may reject direct pushes; in that case have the agent open a pull request instead

Too Many Auto-Fixes

Add confidence thresholds: Only fix if confidence is high
Require manual approval: For certain fix types
Limit to specific error types: Only auto-fix known safe errors
Add rate limiting: Limit number of fixes per day
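
The confidence threshold and rate limit above could be sketched as two small shell checks. Everything here is an assumption for illustration: the function names are hypothetical, and the rate limit counts commits from the last 24 hours whose message contains "Auto-fix", which matches the commit messages used in this guide.

```shell
#!/usr/bin/env bash
# Sketch: gate auto-fixes on a confidence score and a daily cap.
# confident_enough and under_daily_limit are hypothetical helper names.

confident_enough() {
  # $1 = confidence reported by the analysis, $2 = minimum required
  awk -v c="$1" -v t="$2" 'BEGIN { exit !(c + 0 >= t + 0) }'
}

under_daily_limit() {
  # $1 = repository directory, $2 = maximum auto-fix commits per 24 hours
  local count
  count=$(git -C "$1" log --since="24 hours ago" --grep="Auto-fix" --oneline 2>/dev/null | wc -l | tr -d ' ')
  [ "$count" -lt "$2" ]
}

# Possible use before applying a fix:
#   if confident_enough "$CONFIDENCE" 0.8 && under_daily_limit . 5; then
#     ...apply the fix...
#   fi
```

The 0.8 threshold and the cap of 5 in the usage comment are placeholders to tune for your repository.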

Cost Considerations

Free Option (Current)

Uses GitHub Actions (free for public repos)
Pattern matching (no API costs)
Basic error detection

Perfect for: Most use cases, especially if you have a public repository

AI-Powered Option

Service	Cost per Analysis	Best For
OpenAI API	~$0.01-0.10	Complex error analysis
Ollama	Free	Self-hosted runners
GitHub Copilot	Included with subscription	GitHub Enterprise users

Cost Estimate: If you have 10 CI failures per week and use OpenAI API, that's approximately $0.10-1.00 per week, or $5-50 per year.
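
The estimate above is failures per week times cost per analysis times 52 weeks. A small helper makes it easy to plug in your own numbers; yearly_cost is a hypothetical name, and the per-analysis prices are the rough figures quoted in this guide, not current API pricing.

```shell
#!/usr/bin/env bash
# Sketch: yearly API cost = failures per week x cost per analysis x 52 weeks.
# yearly_cost is a hypothetical helper name.

yearly_cost() {
  # $1 = CI failures per week, $2 = cost per analysis in dollars
  awk -v n="$1" -v c="$2" 'BEGIN { printf "%.2f\n", n * c * 52 }'
}

yearly_cost 10 0.01   # low end of the range
yearly_cost 10 0.10   # high end of the range
```

With 10 failures per week this gives about $5.20 to $52.00 per year, in line with the rounded $5-50 figure.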

Next Steps

Enable the workflow in your repository
Test it by triggering a known failure
Monitor the first few auto-fixes
Enhance with AI if needed
Expand to more error types

Conclusion

This agent autonomously fixes CI failures, saving you time and keeping your builds green. Start with the basic pattern matching version, then enhance with AI as needed.

Remember:

Start simple: Pattern matching works for most common errors
Test first: Always test on non-critical branches
Monitor closely: Review agent actions regularly
Expand gradually: Add more error types over time
