[copilot-cli-research] Copilot CLI Deep Research - 2026-04-01 #23952

2026-04-01T21:15:07Z

github-actions[bot]
bot Apr 1, 2026

Analysis Date: 2026-04-01
Repository: github/gh-aw
Scope: 179 total workflows, 86 using Copilot engine (48%)
Triggered by: @pelikhan | Run: §23870837785

📊 Executive Summary

This first-run deep research report identifies 10 missed opportunities across the 86 Copilot-engine workflows in this repository. The most impactful finding is that 55% of Copilot workflows (47 of 86) still require the manually provisioned COPILOT_GITHUB_TOKEN secret, even though the copilot-requests feature removes that dependency. Additionally, only 13% of Copilot workflows (11 of 86) use the AWF sandbox for network isolation, leaving the majority of agentic runs with unrestricted network access.

On a positive note, the repository shows strong discipline in areas like timeout configuration (99% compliance), tool permission specificity (34% using specific bash commands), and memory utilization (cache-memory in 33%, repo-memory in 21% of workflows).

Primary Recommendation: Migrate the remaining 47 workflows to copilot-requests: true, which eliminates the need for the COPILOT_GITHUB_TOKEN secret and simplifies onboarding for new repositories forking this project.

🔴 High Priority Issues

1. Low `copilot-requests` Adoption (45%)

39 of 86 Copilot workflows enable copilot-requests: true. The remaining 47 workflows still require the COPILOT_GITHUB_TOKEN org-level secret. Migrating them would reduce secret management overhead and improve security posture.

Workflows currently missing copilot-requests include many daily/scheduled workflows that don't need elevated token permissions.

2. Limited AWF Sandbox Coverage (13%)

Only 11 of 86 Copilot workflows use the AWF network firewall (sandbox: agent: awf). The other 75 run without network restrictions, potentially allowing unexpected outbound network calls. High-sensitivity workflows (code analysis, security scanning, PR review) especially benefit from network isolation.

🟡 Medium Priority Opportunities

3. Autopilot (`max-continuations`) Barely Used

Only 1 workflow (smoke-copilot.md) uses max-continuations. Complex multi-step workflows that frequently hit timeout limits could benefit from breaking work into continuation sessions. For example, code-scanning-fixer.md, dead-code-remover.md, and similar long-running agents could be configured with:

engine:
  id: copilot
  max-continuations: 3

4. Custom Agent Files Underutilized

9 agent files exist in .github/agents/, but only 3 are referenced by workflows, leaving 6 with no workflow consumers. The unused agent files include:

grumpy-reviewer.agent.md — could improve code review quality in review workflows
interactive-agent-designer.agent.md — useful for interactive design tasks
w3c-specification-writer.agent.md — for spec/documentation writing tasks
contribution-checker.agent.md — for contribution validation workflows
create-safe-output-type.agent.md — for development tooling workflows

5. Over-Reliance on `default` GitHub Toolset

Many workflows use toolsets: [default] when they only need a subset (e.g., just issues or pull_requests). The default toolset expands to all of: context + repos + issues + pull_requests, granting more permissions than necessary.

Examples:

auto-triage-issues.md needs only issues and uses [issues] ✅ (a good example to follow)
Many analysis workflows use [default] even though they only read issues or PRs

6. MCP Scripts Feature Severely Underused

Only 1 workflow (security-review.md) references mcp-scripts, yet this feature allows injecting custom Node.js MCP server scripts without standing up separate infrastructure. For workflows needing specialized data processing (Slack, JIRA, internal APIs), this is a powerful zero-infrastructure option.

📈 Feature Usage Matrix

| Feature Category | Available Features | Used in Workflows | Usage Rate |
| --- | --- | --- | --- |
| `copilot-requests` | ✅ | 39/86 | 45% |
| `timeout-minutes` | ✅ | 85/86 | 99% ✅ |
| `strict: true` | ✅ | 51/86 | 59% |
| AWF Sandbox | ✅ | 11/86 | 13% ⚠️ |
| `cache-memory` | ✅ | 28/86 | 33% ✅ |
| `repo-memory` | ✅ | 18/86 | 21% |
| `web-fetch` tool | ✅ | 20/179 | 11% |
| `playwright` tool | ✅ | 20/179 | 11% |
| `edit` tool | ✅ | 79/86 | 92% ✅ |
| `max-continuations` | ✅ | 1/86 | 1% ⚠️ |
| `engine.model` | ✅ | 2/86 | 2% |
| `engine.agent` (custom) | ✅ | 3/86 | 3% ⚠️ |
| `engine.args` | ✅ | 1/179 | 0.6% |
| `engine.env` | ✅ | 0/86 | 0% |
| `engine.version` pin | ✅ | 0/86 | 0% |
| `mcp-scripts` | ✅ | 1/179 | 0.6% |
| `network.blocked` | ✅ | 1/179 | 0.6% |
| `mcp-gateway` flag | ✅ | 0/179 | 0% |
| `difc-proxy` flag | ✅ | 0/179 | 0% |
| `observability` | ✅ | 1/179 | 0.6% |

3️⃣ Missed Opportunities Detail

View High Priority Details

🔴 Opportunity 1: Migrate Remaining 47 Workflows to `copilot-requests`

What: 47 Copilot workflows lack features: copilot-requests: true, requiring a manually provisioned COPILOT_GITHUB_TOKEN org-level secret.

Why It Matters: The copilot-requests feature uses the built-in ${{ github.token }}, eliminating the need to manage COPILOT_GITHUB_TOKEN. This is a significant security and onboarding improvement.

How to Implement: Add to each workflow's frontmatter:

features:
  copilot-requests: true

Expected Benefits: Simpler setup for forks, reduced secret rotation burden, improved security.

🔴 Opportunity 2: Expand AWF Sandbox for Sensitive Workflows

What: 75 Copilot workflows run without network isolation.

Why It Matters: Without AWF, the agent can make unrestricted outbound network calls. Security-sensitive workflows (code scanning, PR review, security analysis) especially need this protection.

Workflows that should adopt AWF:

security-review.md — security analysis without network restrictions
code-scanning-fixer.md — handles security alerts without isolation
breaking-change-checker.md — code analysis running without firewall

How to Implement:

network:
  allowed:
    - defaults
    - github
sandbox:
  agent: awf

View Medium Priority Details

🟡 Opportunity 3: Use `max-continuations` for Long-Running Agents

What: The max-continuations feature (autopilot mode) allows Copilot to break complex tasks into multiple runs, continuing where it left off.

Why It Matters: Complex workflows like code-scanning-fixer.md, dead-code-remover.md, and daily-doc-healer.md often time out or complete only partial work.

How to Implement:

engine:
  id: copilot
  max-continuations: 3   # run up to 3 continuation sessions

Recommended candidates: Any workflow with timeout-minutes: 20+ that handles large batches.

🟡 Opportunity 4: Use Available Custom Agent Files

What: 6 of 9 agent files in .github/agents/ are never referenced.

Agent files available but unused include:

grumpy-reviewer.agent.md — stricter code review persona
interactive-agent-designer.agent.md — interactive UX for workflow design
w3c-specification-writer.agent.md — spec writing style
contribution-checker.agent.md — contribution validation
create-safe-output-type.agent.md — dev tooling persona

How to Implement:

engine:
  id: copilot
  agent: grumpy-reviewer    # for strict code review workflows

Example improvement: pr-nitpick-reviewer.md could benefit from the grumpy-reviewer agent for more consistent review style.

🟡 Opportunity 5: Narrow GitHub MCP Toolsets

What: Many workflows use toolsets: [default] which includes context + repos + issues + pull_requests.

Why It Matters: Principle of least privilege — request only what the workflow needs.

Examples:

Workflows only reading issues: use toolsets: [issues]
Workflows only analyzing PRs: use toolsets: [pull_requests]
Workflows doing CI analysis: use toolsets: [default, actions] (explicitly call out actions)

Good examples already in the repo (to follow):

auto-triage-issues.md: toolsets: [issues] ✅
breaking-change-checker.md: toolsets: [repos] ✅
code-scanning-fixer.md: toolsets: [context, repos, code_security, pull_requests] ✅
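
For an issues-only workflow, the narrowing described above might look like the following frontmatter fragment. This is a minimal sketch: nesting toolsets under tools.github is an assumption about the frontmatter layout and should be checked against the frontmatter reference before use.

```yaml
# Sketch only: the tools.github nesting is assumed, not confirmed by this report.
tools:
  github:
    # Replaces [default], which expands to context + repos + issues + pull_requests
    toolsets: [issues]
```

A PR-analysis workflow would swap in [pull_requests] the same way.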

🟡 Opportunity 6: Adopt `engine.env` for Runtime Configuration

What: The engine.env field allows injecting custom environment variables into the agent's execution environment. Currently 0 Copilot workflows use it.

Why It Matters: Rather than hardcoding configuration in prompts, workflows can pass dynamic values as env vars (API endpoints, feature toggles, debug flags).

How to Implement:

engine:
  id: copilot
  env:
    ANALYSIS_DEPTH: "full"
    TARGET_BRANCH: "${{ github.base_ref }}"

View Low Priority Details

🟢 Opportunity 7: Model Selection Optimization

What: Only 2 Copilot workflows specify a model (gpt-5.1-codex-mini for lightweight tasks). The rest use the default model.

Why It Matters: Using a lighter model for simple workflows (triage, labeling, formatting) would reduce latency and cost.

How to Implement — for simple classification/labeling tasks:

engine:
  id: copilot
  model: gpt-5.1-codex-mini

Candidates: auto-triage-issues.md, daily-assign-issue-to-user.md, bot-detection.md

🟢 Opportunity 8: Use Domain Blocklist for Defense in Depth

What: Only 1 workflow uses network.blocked. Explicit blocking of known-bad domains adds defense in depth alongside the allowlist.

How to Implement:

network:
  allowed:
    - defaults
    - github
  blocked:
    - pastebin.com
    - requestbin.net

🟢 Opportunity 9: Version Pinning for Production-Critical Workflows

What: All 86 Copilot workflows use version: latest (the default). For workflows that are customer-visible or production-critical, pinning prevents surprise breakage from CLI updates.

How to Implement:

engine:
  id: copilot
  version: "0.0.422"   # pin to tested version

Candidates: ci-coach.md, security-review.md, code-scanning-fixer.md

🟢 Opportunity 10: Expand `mcp-scripts` for Custom Server Logic

What: Only 1 workflow uses mcp-scripts, which allows embedding custom Node.js MCP server scripts inline without external infrastructure.

Why It Matters: Useful for workflows needing specialized integrations (Slack notifications, JIRA, internal metrics APIs) without deploying separate MCP servers.

4️⃣ Specific Workflow Recommendations

View Workflow-Specific Recommendations

`code-scanning-fixer.md`

Missing: sandbox: agent: awf (handles security-sensitive code without network isolation)
Missing: strict: true (strict mode should be on for security workflows)
Add: max-continuations: 3 (processes many alerts, benefits from autopilot continuation)
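
Taken together, the three changes above could be sketched as one frontmatter fragment. The individual keys come from the examples earlier in this report; the top-level placement of strict and the tools.github nesting for the existing toolsets are assumptions, and the rest of the workflow's frontmatter is omitted.

```yaml
# Sketch for code-scanning-fixer.md; key placement is assumed, other fields omitted.
strict: true               # strict mode for a security workflow
engine:
  id: copilot
  max-continuations: 3     # up to 3 continuation sessions for large alert batches
network:
  allowed:
    - defaults
    - github
sandbox:
  agent: awf               # AWF network firewall
tools:
  github:
    # Toolsets this workflow already uses, kept unchanged
    toolsets: [context, repos, code_security, pull_requests]
```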

`pr-nitpick-reviewer.md`

Missing: strict: true
Consider: engine.agent: grumpy-reviewer for more consistent review quality
Consider: max-continuations: 2 for large PRs

`auto-triage-issues.md`

Already Good: Uses specific toolsets: [issues] ✅
Opportunity: engine.model: gpt-5.1-codex-mini (simple classification task)

`daily-architecture-diagram.md`

Already Good: Uses cache-memory ✅, strict: true ✅, copilot-requests: true ✅
This is a model workflow to copy from

`breaking-change-checker.md`

Missing: strict: true (important safety for API change detection)
Missing: copilot-requests: true

`dead-code-remover.md`

Add: max-continuations: 3 (complex refactoring benefits from continuation)
Missing: copilot-requests: true

5️⃣ Trends & Insights

View Historical Trends

This is the first comprehensive analysis run. Future research will track:

Adoption rate changes for copilot-requests
Growth in AWF sandbox usage
Custom agent file utilization
max-continuations adoption for complex workflows

Baseline established: 2026-04-01 (run §23870837785)

6️⃣ Best Practice Guidelines

Based on this research, here are recommended best practices for all new Copilot workflows:

Always use copilot-requests: true: Eliminates manual secret provisioning and uses the built-in GitHub token
Use strict: true: Enables proper permission validation at compile time
Set explicit timeout-minutes: Almost all workflows do this already (99%) — keep it up
Use specific GitHub toolsets: Avoid [default] when only a subset is needed — reduce permissions to what's actually used
Use AWF for security-sensitive workflows: Any workflow handling code changes, security alerts, or external content should use sandbox: agent: awf
Use cache-memory for incremental work: Any workflow analyzing changing data (commits, issues) benefits from caching state between runs
Leverage existing agent files: Before writing persona logic in prompts, check .github/agents/ for reusable agent definitions
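
As a starting point for a new workflow, the guidelines above could be combined into a frontmatter skeleton like this one. It is a hedged sketch: the timeout value is illustrative, and the placement of cache-memory under tools and of strict at the top level are assumptions to verify against the frontmatter reference.

```yaml
# Illustrative skeleton for a new Copilot workflow; values and key placement are assumptions.
timeout-minutes: 15        # illustrative value; set per workflow
strict: true
features:
  copilot-requests: true   # uses the built-in GitHub token instead of COPILOT_GITHUB_TOKEN
engine:
  id: copilot
tools:
  github:
    toolsets: [issues]     # request only the toolsets the workflow needs
  cache-memory: true       # assumed boolean form; keeps state between runs
```

Security-sensitive workflows would additionally add the network and sandbox settings shown under Opportunity 2.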

7️⃣ Action Items

Immediate Actions (this week):

Migrate the remaining 47 Copilot workflows to copilot-requests: true
Add strict: true to the 35 Copilot workflows missing it

Short-term (this month):

Enable AWF sandbox on security-sensitive workflows (security-review.md, code-scanning-fixer.md)
Wire grumpy-reviewer.agent.md and w3c-specification-writer.agent.md to appropriate workflows
Audit workflows using toolsets: [default] and narrow to specific toolsets

Long-term (this quarter):

Add max-continuations: 2-3 to complex long-running workflows
Define standard model tiers (lightweight: gpt-5.1-codex-mini, standard: default)
Create a workflow health dashboard tracking these metrics over time

View Supporting Evidence & Methodology

Research Methodology

Phase 1 — Examined Copilot engine source files: copilot_engine.go, copilot_engine_execution.go, copilot_engine_tools.go, copilot_mcp.go, engine.go, and feature constants in pkg/constants/
Phase 2 — Surveyed all 179 workflow markdown files using grep pattern analysis across 20+ configuration dimensions
Phase 3 — Cross-referenced available features (from source) vs. actual usage (from frontmatter analysis)
Phase 4 — Prioritized by security impact, developer experience, and implementation ease

Data Sources

pkg/workflow/copilot_engine_execution.go — CLI flags and sandbox implementation
pkg/workflow/copilot_engine_tools.go — Tool permission system
pkg/constants/feature_constants.go — Available feature flags
pkg/workflow/engine.go — EngineConfig struct definition
.github/workflows/*.md — 179 workflow files analyzed
.github/agents/*.agent.md — 9 custom agent files inventoried
docs/src/content/docs/reference/engines.md — Feature documentation

Research saved to repo-memory

copilot-cli-research-latest.json — structured metrics
copilot-cli-research-notes.md — ongoing notes and tracking

References:

§23870837785 — This analysis run
Engines Reference Doc
AGENTS.md — Repository instructions

AI generated by Copilot CLI Deep Research Agent · history

expires on Apr 2, 2026, 9:15 PM UTC
