Robust URL Validation Regex

Executive Summary

  • Clarifies the main production use case and where regex fits in the workflow.
  • Provides implementation boundaries that prevent over-matching and fragile behavior.
  • Highlights testing and rollout practices to reduce regressions.

In Short

Use narrowly scoped regex patterns, validate with fixture-driven tests, and verify behavior in the target engine before deployment.

Engine Caveats

  • Flag semantics vary by engine.
  • Named groups and lookbehind support differ across runtimes.
  • Replacement syntax is not portable across all languages.
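
As a rough illustration of these caveats in one engine, the sketch below uses Python's re module. The pattern, the group names scheme and host, and the helper upgrade_to_https are hypothetical and exist only for this example; the host character class is deliberately simplified.

```python
import re

# Python's re module spells named groups as (?P<name>...).
# JavaScript writes the same group as (?<name>...).
# The IGNORECASE flag is passed as an argument here rather than as a /i suffix.
SCHEME_HOST = re.compile(
    r"(?P<scheme>https?)://(?P<host>[a-z0-9.-]+)",
    re.IGNORECASE,
)


def upgrade_to_https(text):
    """Rewrite http:// and https:// prefixes to lowercase https://."""
    # Python replacement templates reference named groups with \g<name>.
    # JavaScript's String.prototype.replace would use $<name> instead.
    return SCHEME_HOST.sub(r"https://\g<host>", text)
```

Porting this to another runtime means rewriting both the group syntax and the replacement template, which is why the pattern should be re-tested in each target engine.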

Validating a URL is a common requirement. The generic URL syntax (RFC 3986) is very broad and covers schemes such as mailto: and ftp:, but most web applications only care about standard HTTP and HTTPS links.

A Robust Pattern

This pattern is widely used for detecting URLs in text:

https?:\/\/(www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_\+.~#?&//=]*)

What makes this work?

  • https?: Matches http or https.
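  • [-a-zA-Z0-9@:%._\+~#=]{1,256}: Permissive host character set. It also accepts characters that are not valid in a hostname, such as @, :, %, and =.
  • \.[a-zA-Z0-9()]{1,6}: Catches TLDs like .com, .museum, or .io. Because of the {1,6} limit, a URL such as https://example.technology does not match.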
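  • \b: Requires a word boundary right after the TLD, so the TLD cannot end in the middle of a longer word.
  • ([-a-zA-Z0-9()@:%_\+.~#?&//=]*): Matches the optional path, query string, and fragment that follow the host.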
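
The breakdown above can be tried out with a minimal Python sketch. It assumes Python's re module as the engine; the helper names find_urls and is_url are hypothetical. The pattern is the one from this article with the \/ and \+ escapes dropped, since they are not needed in a Python raw string, and with the duplicated / in the last character class removed. The sketch separates detection (finding URLs inside text) from validation (requiring the whole string to be a URL).

```python
import re

# The article's pattern, written as a Python raw string.
URL_PATTERN = re.compile(
    r"https?://(www\.)?[-a-zA-Z0-9@:%._+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b"
    r"([-a-zA-Z0-9()@:%_+.~#?&/=]*)"
)


def find_urls(text):
    """Detection: return every URL-like substring found in text."""
    return [m.group(0) for m in URL_PATTERN.finditer(text)]


def is_url(candidate):
    """Validation: the whole string must match, not just part of it."""
    return URL_PATTERN.fullmatch(candidate) is not None


print(find_urls("See https://www.example.com/path?x=1 and http://test.io today"))
print(is_url("https://example.com"))         # True
print(is_url("visit https://example.com"))   # False: leading text
print(is_url("https://example.technology"))  # False: TLD longer than 6 characters
```

Using fullmatch for validation avoids having to add ^ and $ anchors to the pattern itself, so the same compiled pattern can serve both purposes.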

FAQ

What problem does this guide solve?

It explains a practical regex for matching HTTP and HTTPS URLs, along with a testing workflow that can be applied directly in production codebases.

Which regex engines should I verify?

Validate behavior in the exact runtime engines your product uses before rollout.

How do I avoid regressions?

Add explicit passing and failing fixtures in CI for every key pattern introduced in the guide.
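
A fixture check of this kind might look like the following sketch. It assumes Python's re module and uses the URL pattern from this article with unneeded escapes removed; the fixture lists and the helper check_fixtures are hypothetical examples, not a complete test suite.

```python
import re

URL_PATTERN = re.compile(
    r"https?://(www\.)?[-a-zA-Z0-9@:%._+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b"
    r"([-a-zA-Z0-9()@:%_+.~#?&/=]*)"
)

# Hypothetical fixtures: strings that must validate, and strings that must not.
SHOULD_MATCH = [
    "http://example.com",
    "https://www.example.com/path?query=1#frag",
    "https://sub.domain.example.io/a/b",
]
SHOULD_NOT_MATCH = [
    "ftp://example.com",        # unsupported scheme
    "https://",                 # no host
    "example.com",              # no scheme
    "https://example",          # no TLD
    "https://example.com/a b",  # unescaped space
]


def check_fixtures(pattern, should_match, should_not_match):
    """Return the fixture strings whose result differs from the expectation."""
    failures = []
    for s in should_match:
        if pattern.fullmatch(s) is None:
            failures.append(s)
    for s in should_not_match:
        if pattern.fullmatch(s) is not None:
            failures.append(s)
    return failures


if __name__ == "__main__":
    failed = check_fixtures(URL_PATTERN, SHOULD_MATCH, SHOULD_NOT_MATCH)
    # A non-empty list makes the script exit with an error, which fails a CI job.
    raise SystemExit(f"fixture failures: {failed}" if failed else 0)
```

Each bug report about a missed or wrongly accepted URL can then be added to one of the two lists before the pattern is changed.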
