What is a good regular expression to match a URL?

There’s no perfect, one-size-fits-all regex to match every possible valid URL according to all standards. However, for practical purposes (like validating typical http:// or https:// URLs with optional paths, ports, etc.), the following pattern is commonly used:

<details> <summary>Example “Good Enough” Regex</summary>
^https?:\/\/
(?:[a-zA-Z0-9-]+\.)+[a-zA-Z]{2,}       # Domain name
(?::\d+)?                              # Optional port
(?:\/[^\s]*)?$                         # Optional path/query/fragment
</details>

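The pattern is split across lines with # comments for readability. Unless your regex engine supports a free-spacing (verbose) mode, join the lines and drop the comments before using it. It matches strings that: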

  1. Start with http:// or https://.
  2. Have a “domain-like” structure of one or more labels ([a-zA-Z0-9-]+) separated by dots, ending with a TLD of at least 2 letters.
  3. Optionally include a :port.
  4. Optionally include a path (anything that’s not whitespace), beginning with a slash.

1. Explanation

^https?:\/\/
  • ^ asserts the start of the string.
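  • https? matches http or https (the s is optional), and :\/\/ matches the literal ://. Escaping the slashes is only required in regex literals delimited by /.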
(?:[a-zA-Z0-9-]+\.)+[a-zA-Z]{2,}
  • [a-zA-Z0-9-]+ matches a label (letters, digits, or hyphens).
  • \. requires a literal dot.
  • (?: ... )+ means we expect one or more such domain labels.
  • [a-zA-Z]{2,} requires the top-level domain (TLD) to be at least two ASCII letters (e.g., .com, .net, .org). Real TLD rules are more complex, but this is a practical compromise.
(?::\d+)?
  • An optional :port. For example, :8080.
(?:\/[^\s]*)?
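  • An optional slash followed by any non-whitespace characters. This covers paths, query parameters, and fragments (like /some/path?foo=bar#section). A query or fragment that follows the host directly, without a slash (like https://example.com?foo=bar), is not matched.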
$
  • End of the string.

Example Matches

  • http://example.com
  • https://example.co.uk/
  • https://example.com:8080/path/to/resource
  • http://sub.domain-example.org

Example Non-Matches

  • ftp://site.com (not http:// or https://)
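  • https://example_site.com (underscores are not allowed in a domain label). Note that https://-domain.com does match, because [a-zA-Z0-9-]+ permits a hyphen anywhere in a label, even though real hostnames cannot start a label with one.
  • https://example (no dot-separated TLD after the domain label)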
  • just text (no scheme)
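
To check the lists above, the pattern can be run against each sample. The following is a minimal sketch: the pattern is the one from this section joined into a single line, and the names urlPattern, samples, and results are illustrative only.

```javascript
// Same pattern as above, joined into one line with the comments removed.
const urlPattern = /^https?:\/\/(?:[a-zA-Z0-9-]+\.)+[a-zA-Z]{2,}(?::\d+)?(?:\/[^\s]*)?$/;

// Inputs taken from the example lists in this section.
const samples = [
  'http://example.com',
  'https://example.co.uk/',
  'https://example.com:8080/path/to/resource',
  'http://sub.domain-example.org',
  'ftp://site.com',
  'https://example',
  'just text',
];

// Pair each sample with whether the pattern accepts it.
const results = samples.map((s) => [s, urlPattern.test(s)]);

for (const [input, ok] of results) {
  console.log(`${ok ? 'match   ' : 'no match'}  ${input}`);
}
```

The first four samples should be reported as matches and the last three as non-matches.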

2. Variations & Caveats

  1. Include www.

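    • If you want to explicitly allow an optional www. prefix, you might use ^(?:https?:\/\/)?(?:www\.)?[...rest...]$. Note that this form also makes the scheme optional, so www.example.com matches without http://. Typically, though, the [a-zA-Z0-9-]+\. pattern already covers www. as an ordinary subdomain.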
  2. More Schemes

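    • If you need ftp:// or other schemes, you can expand the scheme portion: ^(?:[a-zA-Z][a-zA-Z0-9+\-.]*):\/\/... Schemes such as mailto: have no // after the colon, so they need the \/\/ to be optional or a separate branch of their own.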
  3. International Domains / IDNs

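    • Internationalized domain names can contain Unicode characters in any label, including the TLD, and none of them match [a-zA-Z0-9-]. Even punycode-encoded TLDs (xn--...) fail the [a-zA-Z]{2,} check because they contain hyphens and sometimes digits. Handling IDNs requires a Unicode-aware pattern or converting the hostname to punycode first, so a single “basic” regex often falls short.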
  4. Use a Parser for Complex Cases

    • For more robust or guaranteed correctness, consider using a real URL parser (like new URL() in JavaScript or libraries in other languages). Regex alone can get unwieldy for corner cases.
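
As a rough illustration of the parser-based approach, the sketch below validates with the built-in URL constructor and an allow-list of schemes. The helper name isAllowedUrl and its allowedProtocols parameter are assumptions made for this example, not part of any standard API.

```javascript
// Hypothetical helper: validate with the built-in WHATWG URL parser instead of a regex.
// allowedProtocols is an illustrative parameter; protocol values include the trailing colon.
function isAllowedUrl(str, allowedProtocols = ['http:', 'https:']) {
  let url;
  try {
    url = new URL(str); // throws a TypeError on input it cannot parse
  } catch {
    return false;
  }
  return allowedProtocols.includes(url.protocol);
}

console.log(isAllowedUrl('https://example.com:8080/path?foo=bar#section')); // true
console.log(isAllowedUrl('ftp://site.com'));                                // false
console.log(isAllowedUrl('ftp://site.com', ['ftp:']));                      // true
console.log(isAllowedUrl('just text'));                                     // false

// The parser converts internationalized hostnames to their punycode (xn--) form.
console.log(new URL('https://münchen.de').hostname);
```

The parser is more permissive than the regex in some respects. For example, it accepts hosts without a TLD such as https://example, so extra checks on url.hostname may still be needed depending on the use case.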

3. Example in JavaScript

function isValidHttpUrl(str) {
  const pattern = new RegExp(
    '^https?:\\/\\/' +                    // http:// or https://
    '(?:[a-zA-Z0-9-]+\\.)+[a-zA-Z]{2,}' + // domain
    '(?::\\d+)?' +                        // optional port
    '(?:\\/[^\\s]*)?$'                    // optional path
  );
  return pattern.test(str);
}

console.log(isValidHttpUrl('https://example.com'));       // true
console.log(isValidHttpUrl('https://example.com:3000/')); // true
console.log(isValidHttpUrl('ftp://site.com'));            // false
console.log(isValidHttpUrl('just text'));                 // false
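
The domain part of the pattern accepts a hyphen at any position in a label, including the first and last character, which real hostnames do not allow. One possible tightening is sketched below. The names label, strictPattern, and isValidHttpUrlStrict are hypothetical, and the variant still ignores label length limits and IDNs.

```javascript
// A label must start and end with a letter or digit; hyphens may appear only in between.
const label = '[a-zA-Z0-9](?:[a-zA-Z0-9-]*[a-zA-Z0-9])?';

const strictPattern = new RegExp(
  '^https?:\\/\\/' +                    // http:// or https://
  '(?:' + label + '\\.)+[a-zA-Z]{2,}' + // stricter domain labels, then TLD
  '(?::\\d+)?' +                        // optional port
  '(?:\\/[^\\s]*)?$'                    // optional path
);

// Hypothetical stricter counterpart to isValidHttpUrl.
function isValidHttpUrlStrict(str) {
  return strictPattern.test(str);
}

console.log(isValidHttpUrlStrict('http://sub.domain-example.org')); // true
console.log(isValidHttpUrlStrict('https://-domain.com'));           // false
console.log(isValidHttpUrlStrict('https://domain-.com'));           // false
```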

4. Final Thoughts

  • No single regex can handle all possible URLs (IDNs, exotic TLDs, custom schemes, etc.) precisely.
  • For production or international usage, using a URL parser or a trusted library is preferable.
  • If you just need to filter or quickly validate http:// or https:// with typical domain structures, the above regex is “good enough” for many use cases.

Summary: A practical regex for http:// or https:// might look like:

^https?:\/\/
(?:[a-zA-Z0-9-]+\.)+[a-zA-Z]{2,}
(?::\d+)?
(?:\/[^\s]*)?$

Use a real parser if you need to handle all possible URL edge cases.

CONTRIBUTOR
TechGrind