Add release-doctor agentic workflow for automated release pipeline monitoring #500

Merged
Mossaka merged 4 commits into main from copilot/add-release-doctor-workflow
Nov 2, 2025

Conversation

Copilot AI commented Nov 2, 2025

The release pipeline has grown complex (prepare-release → PR → release → CHANGELOG update → package manifests), making failures difficult to diagnose. This adds automated monitoring and diagnostic issue creation.

Implementation

New agentic workflow (.github/workflows/release-doctor.md):

  • Triggers on completion of prepare-release.yml, release.yml, update-package-manifests.yml
  • Exits immediately on success; investigates only failures
  • Uses GitHub Copilot engine with GitHub MCP server, cache-memory, and web-fetch tools
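
The trigger and early-exit behavior described above can be sketched as a small predicate. This is only an illustration, not the workflow's actual logic: it assumes the event payload exposes the run under `workflow_run` with `path` and `conclusion` fields, and the function name `should_investigate` is hypothetical.

```python
# Hypothetical sketch: decide whether a completed workflow run needs investigation.
# Assumes a workflow_run-style event payload with "path" and "conclusion" fields.

MONITORED_WORKFLOWS = {
    ".github/workflows/prepare-release.yml",
    ".github/workflows/release.yml",
    ".github/workflows/update-package-manifests.yml",
}


def should_investigate(event: dict) -> bool:
    """Return True only for failed runs of the three monitored workflows."""
    run = event.get("workflow_run", {})
    if run.get("path") not in MONITORED_WORKFLOWS:
        return False  # not part of the release pipeline
    # Exit early on success (and anything else that is not a failure).
    return run.get("conclusion") == "failure"
```

Treating only `failure` as actionable is a design choice made for this sketch; a real implementation might also want to look at timed-out or cancelled runs.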

Seven investigation phases:

  1. Early exit if workflow succeeded (no noise)
  2. Triage failed jobs and extract context
  3. Deep log analysis with pattern recognition
  4. Release integrity verification (binaries, CHANGELOG sync, version consistency)
  5. Historical pattern analysis via cache-memory
  6. Root cause classification (build errors, auth issues, manifest failures)
  7. Automated issue creation with diagnostic report
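
As a rough illustration of phases 3 and 6, log lines can be bucketed into the three root-cause categories with simple pattern matching. In the PR the agent performs this analysis itself, so the regular expressions and the name `classify_root_cause` below are assumptions made for the sketch, and the patterns would need tuning against real logs.

```python
import re

# Hypothetical patterns, checked in order; the first match wins for each line.
ROOT_CAUSE_PATTERNS = [
    ("auth", re.compile(r"\b(401|403)\b|bad credentials|permission denied", re.I)),
    ("manifest", re.compile(r"homebrew|winget|manifest", re.I)),
    ("build", re.compile(r"error(\[E\d+\])?:|could not compile|linker", re.I)),
]


def classify_root_cause(log_lines):
    """Return the category that matches the most log lines, or "unknown"."""
    counts = {}
    for line in log_lines:
        for category, pattern in ROOT_CAUSE_PATTERNS:
            if pattern.search(line):
                counts[category] = counts.get(category, 0) + 1
                break
    if not counts:
        return "unknown"
    return max(counts, key=counts.get)
```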

Verification checks:

  • All 6 platform binaries present in GitHub release (Linux/macOS/Windows × AMD64/ARM64)
  • CHANGELOG.md synchronized on main and release branches
  • Cargo.toml version matches release tag
  • Package manifest update PR created

Issue structure:

# Release Pipeline Failure: [Brief Description]

**Workflow Run ID**: 12345
**Commit**: abc1234

## Failure Summary
[2-3 sentence overview]

## Release Integrity Status
✅/❌ GitHub Release Created
✅/❌ All 6 Binaries Present
✅/❌ CHANGELOG Synchronized
✅/❌ Version Consistency

## Root Cause Analysis
[Detailed explanation with error logs]

## Recommended Actions
1. [Specific command or fix]
2. [Next action]
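
One way to fill in the template above from structured findings is sketched below. The `Diagnosis` dataclass and `render_issue` function are hypothetical helpers for illustration; in the PR the agent writes the issue body itself.

```python
from dataclasses import dataclass


@dataclass
class Diagnosis:
    """Hypothetical container for the findings of one investigation."""
    title: str
    run_id: int
    commit: str
    summary: str
    integrity: dict  # check name -> True (passed) / False (failed)
    root_cause: str
    actions: list


def _mark(ok):
    return "✅" if ok else "❌"


def render_issue(d):
    """Render a Diagnosis into the issue structure shown above."""
    lines = [
        f"# Release Pipeline Failure: {d.title}",
        "",
        f"**Workflow Run ID**: {d.run_id}",
        f"**Commit**: {d.commit}",
        "",
        "## Failure Summary",
        d.summary,
        "",
        "## Release Integrity Status",
        *[f"{_mark(ok)} {name}" for name, ok in d.integrity.items()],
        "",
        "## Root Cause Analysis",
        d.root_cause,
        "",
        "## Recommended Actions",
        *[f"{i}. {action}" for i, action in enumerate(d.actions, 1)],
    ]
    return "\n".join(lines)
```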

Permissions: Read-only except issues: write (via safe-outputs for security)

Documentation: Added release-doctor to the agentic workflows list in docs/design/agentic-workflows.md alongside other agentic workflows (Issue Triage Bot, Scout Research Agent, CI Doctor).

Example Scenario

When the release workflow fails to build the ARM64 macOS binary:

  1. Doctor analyzes build job logs
  2. Verifies 5 of 6 binaries uploaded (missing darwin_arm64)
  3. Checks if target was added (rustup target add aarch64-apple-darwin)
  4. Creates issue with specific remediation: runner availability or toolchain configuration
  5. Searches for similar past failures via cache-memory
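
The lookup of similar past failures in the last step can be pictured as a small persistent index keyed by a normalized error signature. The sketch below stands in for cache-memory with a plain JSON file; the file layout, the normalization rule, and both function names are assumptions and do not describe how cache-memory actually stores data.

```python
import hashlib
import json
import re
from pathlib import Path


def failure_signature(workflow, error_line):
    """Hash a workflow name plus an error line with digits masked out."""
    normalized = re.sub(r"\d+", "N", error_line.strip().lower())
    digest = hashlib.sha256(f"{workflow}:{normalized}".encode()).hexdigest()
    return digest[:12]


def record_failure(store: Path, workflow, error_line, run_id):
    """Append run_id under its signature; return earlier runs with the same one."""
    history = json.loads(store.read_text()) if store.exists() else {}
    signature = failure_signature(workflow, error_line)
    previous = list(history.get(signature, []))
    history[signature] = previous + [run_id]
    store.write_text(json.dumps(history, indent=2))
    return previous
```

Masking digits means two failures that differ only in run numbers or timestamps map to the same signature, so a non-empty return value signals a recurring problem.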
Original prompt

Since the release pipeline is getting increasingly complicated:

pre-release -> PR -> release -> update Homebrew and WinGet

We need a separate agentic workflow called release-doctor that can detect whether the release process failed at some point (if one of the pipeline jobs failed, the entire process should be considered a failure) and then raise an issue in this repository describing the failure.

You should figure out the right permissions it needs to examine the workflows.

It should have an understanding of the release process and visibility into job logs and artifacts so it can propose the right fix.

It should also check that the release binaries are on the GitHub release page, that the release notes reflect the changes in the changelog, and that the version on the main branch is the right version.

You should think about how this agentic workflow is triggered (perhaps at any point of failure in the release pipelines).


Copilot AI and others added 2 commits November 2, 2025 18:26
…ring

Add comprehensive release-doctor workflow that monitors all stages of the release
pipeline and creates diagnostic issues when failures occur. The workflow:

- Monitors prepare-release, release, and update-package-manifests workflows
- Analyzes failed jobs and extracts error logs
- Verifies release integrity (binaries, CHANGELOG sync, version consistency)
- Checks package manifest update PRs
- Creates detailed GitHub issues with root cause analysis and remediation steps
- Only triggers on workflow failures to avoid noise

The workflow uses GitHub Copilot as the AI engine and has appropriate permissions
to inspect workflows, read repository contents, search issues/PRs, and create
diagnostic issues.

Co-authored-by: Mossaka <5447827+Mossaka@users.noreply.github.com>
Add comprehensive documentation for the release-doctor agentic workflow including:
- What workflows it monitors (prepare-release, release, update-package-manifests)
- What it checks (binaries, CHANGELOG sync, version consistency, manifest PRs)
- How it works (automatic triggering, smart detection, deep analysis)
- Benefits and usage instructions
- How to view results and historical patterns

Co-authored-by: Mossaka <5447827+Mossaka@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Add release-doctor workflow for failure detection" to "Add release-doctor agentic workflow for automated release pipeline monitoring" Nov 2, 2025
Copilot AI requested a review from Mossaka November 2, 2025 18:35
Comment thread CHANGELOG.md Outdated
Comment on lines +7 to +9
### Added

- Release Doctor agentic workflow that monitors the entire release pipeline (prepare-release, release, update-package-manifests) and automatically creates diagnostic issues when failures occur, including verification of release binaries, CHANGELOG synchronization, version consistency, and package manifest updates

Contributor

@copilot delete

Contributor Author

Done in e55e3ed. Removed the CHANGELOG entry for release-doctor.

Comment thread RELEASE.md Outdated
Comment on lines +120 to +139
+## Release Pipeline Monitoring

-If the automated workflows fail, you can follow the manual process:
+The [Release Doctor workflow](.github/workflows/release-doctor.md) automatically monitors the entire release pipeline and creates diagnostic issues when failures occur.

-1. **Create and push the release tag manually** (if `auto-tag-release.yml` fails):
-```bash
-# Checkout the main branch and pull the latest changes
-git checkout main
-git pull origin main
+### What It Monitors

-# Create a new tag (e.g., v0.4.0)
-git tag -a v<version> -m "Release v<version>"

-# Push the tag
-git push origin v<version>
-```
+The Release Doctor monitors three critical workflows:

+1. **Prepare Release** (`prepare-release.yml`): Version bump PR creation
+2. **Release** (`release.yml`): Binary builds, GitHub release creation, CHANGELOG updates
+3. **Update Package Manifests** (`update-package-manifests.yml`): Homebrew and WinGet updates

+### What It Checks

+When a release workflow completes, the Release Doctor performs comprehensive verification:

+**For All Workflows:**
+- Analyzes failed jobs and extracts error logs
+- Identifies root causes and error patterns
+- Searches for similar historical failures

Contributor

@copilot The monitoring pipeline should not be in RELEASE.md, but it should be documented in the docs that describe which agentic workflows we use.

Contributor Author

Done in e55e3ed. Moved the release-doctor documentation from RELEASE.md to docs/design/agentic-workflows.md where it's listed alongside other agentic workflows.

@Mossaka Mossaka marked this pull request as ready for review November 2, 2025 18:48
- Remove CHANGELOG entry as requested
- Remove "Release Pipeline Monitoring" section from RELEASE.md
- Add release-doctor to the list of workflows in docs/design/agentic-workflows.md

Addresses feedback from @Mossaka to document agentic workflows in the docs
rather than in RELEASE.md

Co-authored-by: Mossaka <5447827+Mossaka@users.noreply.github.com>
Copilot AI requested a review from Mossaka November 2, 2025 18:51
@Mossaka Mossaka merged commit ed9464d into main Nov 2, 2025
42 checks passed
@Mossaka Mossaka deleted the copilot/add-release-doctor-workflow branch November 2, 2025 23:53
Mossaka added a commit to Mossaka/wassette that referenced this pull request Feb 2, 2026
…nitoring (microsoft#500)

* Initial plan

* feat: Add release-doctor agentic workflow for release pipeline monitoring

Add comprehensive release-doctor workflow that monitors all stages of the release
pipeline and creates diagnostic issues when failures occur. The workflow:

- Monitors prepare-release, release, and update-package-manifests workflows
- Analyzes failed jobs and extracts error logs
- Verifies release integrity (binaries, CHANGELOG sync, version consistency)
- Checks package manifest update PRs
- Creates detailed GitHub issues with root cause analysis and remediation steps
- Only triggers on workflow failures to avoid noise

The workflow uses GitHub Copilot as the AI engine and has appropriate permissions
to inspect workflows, read repository contents, search issues/PRs, and create
diagnostic issues.

Co-authored-by: Mossaka <5447827+Mossaka@users.noreply.github.com>

* docs: Document release-doctor workflow in RELEASE.md

Add comprehensive documentation for the release-doctor agentic workflow including:
- What workflows it monitors (prepare-release, release, update-package-manifests)
- What it checks (binaries, CHANGELOG sync, version consistency, manifest PRs)
- How it works (automatic triggering, smart detection, deep analysis)
- Benefits and usage instructions
- How to view results and historical patterns

Co-authored-by: Mossaka <5447827+Mossaka@users.noreply.github.com>

* docs: Move release-doctor documentation to agentic-workflows.md

- Remove CHANGELOG entry as requested
- Remove "Release Pipeline Monitoring" section from RELEASE.md
- Add release-doctor to the list of workflows in docs/design/agentic-workflows.md

Addresses feedback from @Mossaka to document agentic workflows in the docs
rather than in RELEASE.md

Co-authored-by: Mossaka <5447827+Mossaka@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: Mossaka <5447827+Mossaka@users.noreply.github.com>