
fix: treat protocol-relative URLs as blocked domains in safe-outputs sanitizer#23930

Merged
pelikhan merged 5 commits into main from copilot/fix-protocol-relative-url-sanitization on Apr 2, 2026

Conversation

Contributor

Copilot AI commented Apr 1, 2026

sanitizeUrlDomains() was anchored to https:// and sanitizeUrlProtocols() required an explicit scheme, so //evil.com/path passed both filters unchanged. Browsers on HTTPS pages resolve //hostname/path to https://hostname/path, fully bypassing the domain allowlist, including in image embeds, where camo proxy requests leak to the attacker.

Changes

  • sanitize_content_core.cjs: sanitizeUrlDomains()

    • Extracted shared allowlist logic into inner helper applyDomainFilter(match, hostnameWithPort)
    • Added a second regex pass for protocol-relative URLs using a delimiter-based precondition (^|[\s([{"']) that matches // only at start-of-string or after clear URL delimiters (whitespace, brackets, quotes), avoiding false matches on // inside HTTPS URL paths (e.g. https://github.com//issues)
    • The path stop-condition (?!\/\/) is symmetric with the existing https:// stop-condition
  • sanitize_content.test.cjs

    • New "protocol-relative URL sanitization" describe block with 10 regression tests: plain text, markdown links ([text](//evil.com/path)), image embeds (![](//evil.com/pixel.gif)), allowed domains, subdomains, custom domain env var, mixed content, path+query, redaction logging, and a regression case ensuring https:// URLs with double-slash paths (e.g. https://github.com//issues) are not incorrectly redacted
// Before
sanitizeContentCore('[click](//evil.com/steal)')
// → '[click](//evil.com/steal)'   ← not redacted

// After
sanitizeContentCore('[click](//evil.com/steal)')
// → '[click]((evil.com/redacted)'
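The delimiter-based pass described in the Changes list can be sketched in isolation. applyDomainFilter below is a stand-in for the real helper in sanitize_content_core.cjs: it redacts every matched host, whereas the shipped helper consults the allowlist.

```javascript
// Delimiter-based protocol-relative URL pass (sketch; stand-in filter).
const protoRelativeUrlRegex =
  /(^|[\s([{"'])(\/\/([\w.-]+(?::\d+)?)(\/(?:(?!\/\/)[^\s,])*)?)/gi;

function redactProtocolRelative(s) {
  // Stand-in: redact every matched hostname (no allowlist lookup).
  const applyDomainFilter = (url, hostnameWithPort) => `(${hostnameWithPort}/redacted)`;
  return s.replace(
    protoRelativeUrlRegex,
    (match, prefix, url, hostnameWithPort) => prefix + applyDomainFilter(url, hostnameWithPort)
  );
}

console.log(redactProtocolRelative("see //evil.com/path for details"));
// → "see (evil.com/redacted) for details"
console.log(redactProtocolRelative("https://github.com//issues"));
// → "https://github.com//issues" (double-slash path left alone: no delimiter before //)
```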

…sanitizer

Protocol-relative URLs (//hostname/path) were not matched by either
sanitizeUrlProtocols() (which requires an explicit scheme://) or
sanitizeUrlDomains() (which was anchored to https://). Browsers on HTTPS
pages resolve //hostname/path to https://hostname/path, so these URLs must
be subject to the same domain allowlist check.

Extends sanitizeUrlDomains() with a second regex pass for //hostname/path
patterns. A negative lookbehind (?<![:/]) prevents the new pattern from
matching the // inside https://. The shared applyDomainFilter() helper is
used by both the existing https:// pass and the new protocol-relative pass,
ensuring identical allow/redact logic.

Fixes: //evil.com/steal, [click](//evil.com/steal), and
![img](//evil.com/pixel.gif) are now redacted to (evil.com/redacted).

Adds 9 regression tests covering plain text, markdown links, image embeds,
allowed/disallowed domains, custom allowed domains, mixed content, path +
query strings, and logging.

Agent-Logs-Url: https://github.com/github/gh-aw/sessions/0cad179d-8061-4756-945e-9233c95fdae7

Co-authored-by: szabta89 <1330202+szabta89@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Fix safe-outputs sanitizer to block protocol-relative URLs" to "fix: treat protocol-relative URLs as blocked domains in safe-outputs sanitizer" on Apr 1, 2026
Copilot AI requested a review from szabta89 April 1, 2026 16:47
@pelikhan pelikhan marked this pull request as ready for review April 1, 2026 23:52
Copilot AI review requested due to automatic review settings April 1, 2026 23:52
Contributor

Copilot AI left a comment


Pull request overview

Fixes a sanitizer bypass where protocol-relative URLs (//host/path) were not subject to domain allowlisting, allowing disallowed domains to pass unchanged (including in markdown links/images).

Changes:

  • Refactors sanitizeUrlDomains() to share allowlist/redaction logic via an inner helper.
  • Adds a second domain-filtering pass to redact protocol-relative URLs (//hostname/...).
  • Adds regression tests covering protocol-relative URLs in plain text, markdown links, and image embeds.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File | Description
actions/setup/js/sanitize_content_core.cjs | Adds protocol-relative URL domain filtering using a second regex pass and shared allowlist logic.
actions/setup/js/sanitize_content.test.cjs | Adds regression tests ensuring protocol-relative URLs are redacted/allowed appropriately and logged.


Comment on lines +288 to +295
// The negative lookbehind (?<![:/]) avoids matching the // inside https://.
// The path stop-condition (?!\/\/) stops before the next protocol-relative URL
// (analogous to how the httpsUrlRegex stops before the next https:// URL).
// The path capture group is present for structural symmetry with httpsUrlRegex
// but is not needed by applyDomainFilter — the full match is used instead.
const protoRelativeUrlRegex = /(?<![:/])\/\/([\w.-]+(?::\d+)?)(\/(?:(?!\/\/)[^\s,])*)?/gi;

s = s.replace(protoRelativeUrlRegex, (match, hostnameWithPort) => applyDomainFilter(match, hostnameWithPort));

Copilot AI Apr 1, 2026


The protocol-relative regex will also match "//" sequences that occur inside the path of an allowed https:// URL (because the lookbehind only excludes ':' and '/'). For example, in "https://github.com//issues", the "//issues" part is preceded by "m" and will be treated as a protocol-relative URL, potentially redacting/altering an otherwise-allowed URL. Consider tightening the precondition so protocol-relative matches only occur at the start of the string or after clear URL delimiters (whitespace, '(', '[', quotes, etc.), rather than after hostname/path characters.

Suggested change
// The negative lookbehind (?<![:/]) avoids matching the // inside https://.
// The path stop-condition (?!\/\/) stops before the next protocol-relative URL
// (analogous to how the httpsUrlRegex stops before the next https:// URL).
// The path capture group is present for structural symmetry with httpsUrlRegex
// but is not needed by applyDomainFilter — the full match is used instead.
const protoRelativeUrlRegex = /(?<![:/])\/\/([\w.-]+(?::\d+)?)(\/(?:(?!\/\/)[^\s,])*)?/gi;
s = s.replace(protoRelativeUrlRegex, (match, hostnameWithPort) => applyDomainFilter(match, hostnameWithPort));
// We only treat // as a protocol-relative URL when it appears at the start of
// the string or immediately after a clear delimiter (whitespace, brackets,
// or quotes). This avoids matching // segments inside the path of an allowed
// https:// URL, such as "https://github.com//issues".
// The path stop-condition (?!\/\/) stops before the next protocol-relative URL
// (analogous to how the httpsUrlRegex stops before the next https:// URL).
// Capture groups:
// 1: prefix (start-of-string or delimiter)
// 2: full protocol-relative URL (starting with //)
// 3: hostname (and optional port)
// 4: optional path
const protoRelativeUrlRegex = /(^|[\s([{"'])(\/\/([\w.-]+(?::\d+)?)(\/(?:(?!\/\/)[^\s,])*)?)/gi;
s = s.replace(
protoRelativeUrlRegex,
(match, prefix, url, hostnameWithPort) => prefix + applyDomainFilter(url, hostnameWithPort)
);
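The false match this comment describes can be reproduced standalone with the pre-fix lookbehind regex quoted above (a sketch for illustration, not the shipped code):

```javascript
// The lookbehind (?<![:/]) only inspects the single preceding character,
// so the // inside an https URL *path* (preceded by "m" in "github.com//")
// still matches and would be treated as a protocol-relative URL.
const lookbehindRegex = /(?<![:/])\/\/([\w.-]+(?::\d+)?)(\/(?:(?!\/\/)[^\s,])*)?/gi;

const input = "https://github.com//issues and //evil.com/path";
const hits = [...input.matchAll(lookbehindRegex)].map(m => m[0]);
console.log(hits);
// → ["//issues", "//evil.com/path"]  ← "//issues" is the false positive
```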

expect(result).toContain("(evil.com/redacted)");
expect(result).not.toContain("//evil.com");
});


Copilot AI Apr 1, 2026


Consider adding a regression test for an allowed https:// URL that contains a double-slash in the path (e.g., "https://github.com//issues") to ensure the new protocol-relative pass doesn’t accidentally treat path "//..." segments as protocol-relative URLs and redact them.

Suggested change
it("should not treat double slashes in https URL paths as protocol-relative URLs", () => {
const result = sanitizeContent("https://github.com//issues and //evil.com/path");
expect(result).toContain("https://github.com//issues");
expect(result).toContain("(evil.com/redacted)");
expect(result).not.toContain("//evil.com");
});

@pelikhan
Collaborator

pelikhan commented Apr 2, 2026

@copilot review comments


… lookbehind

Agent-Logs-Url: https://github.com/github/gh-aw/sessions/faa1705f-42c1-4f84-9165-6067c54c13e4

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Contributor Author

Copilot AI commented Apr 2, 2026

@copilot review comments

Addressed both review comments in ae6a1b8:

  1. Regex tightened (sanitize_content_core.cjs): Replaced the lookbehind (?<![:/]) approach with a delimiter-based match (^|[\s([{"']). The regex now only treats // as a protocol-relative URL at start-of-string or after whitespace/brackets/quotes, preventing false matches on // inside HTTPS URL paths like https://github.com//issues.

  2. Regression test added (sanitize_content.test.cjs): Added a test confirming that https://github.com//issues is preserved unchanged while //evil.com/path in the same string is still redacted.

@pelikhan
Collaborator

pelikhan commented Apr 2, 2026

@copilot add more positive and negative test cases

Contributor Author

Copilot AI commented Apr 2, 2026

@copilot add more positive and negative test cases

Added 8 more tests in d03fdc8:

Negative (redacted):

  • //evil.com:8080/api — port number
  • //evil.com/a and //bad.org/b — multiple URLs in one string, all redacted
  • src="//evil.com/img.png" — HTML attribute with double-quote delimiter
  • //evil.com — hostname only (no path)

Positive (allowed):

  • //raw.githubusercontent.com/... with *.githubusercontent.com wildcard domain
  • //trusted.net/path with custom GH_AW_ALLOWED_DOMAINS=trusted.net
  • {//github.com/repo} — curly-brace delimited context

Boundary (not treated as protocol-relative):

  • word//evil.com/path: the // is not preceded by a delimiter, so it passes through unchanged
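These positive, negative, and boundary cases can be exercised with a minimal sketch. The allowlist here is a tiny stand-in; the real sanitizer reads GH_AW_ALLOWED_DOMAINS and supports wildcard domains.

```javascript
const protoRelativeUrlRegex =
  /(^|[\s([{"'])(\/\/([\w.-]+(?::\d+)?)(\/(?:(?!\/\/)[^\s,])*)?)/gi;
const allowed = new Set(["github.com"]); // stand-in allowlist

function sanitizeSketch(s) {
  return s.replace(protoRelativeUrlRegex, (match, prefix, url, host) =>
    // Strip an optional :port before the allowlist lookup.
    allowed.has(host.replace(/:\d+$/, "")) ? match : `${prefix}(${host}/redacted)`
  );
}

console.log(sanitizeSketch("//evil.com:8080/api"));  // → "(evil.com:8080/redacted)"
console.log(sanitizeSketch("word//evil.com/path"));  // → unchanged: no delimiter before //
console.log(sanitizeSketch("{//github.com/repo}"));  // → unchanged: host is allowlisted
```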

@pelikhan pelikhan merged commit 0725126 into main Apr 2, 2026
56 checks passed
@pelikhan pelikhan deleted the copilot/fix-protocol-relative-url-sanitization branch April 2, 2026 04:38


Development

Successfully merging this pull request may close these issues.

safe-outputs sanitizer must treat //hostname protocol-relative URLs as blocked domains

4 participants