What is the best regular expression to check if a string is a valid URL?
There's no single “best” regular expression that flawlessly validates every possible URL according to the full RFC standards. URLs can have complex structures (e.g., IPv6 addresses, punycode, unusual schemes, and more). That said, many people use a practical regex that covers the most common URL formats:
Example of a Practical Regex
^https?:\/\/
(?:
(?:[a-zA-Z0-9-]+\.)+[a-zA-Z]{2,} # domain...
|
(?:\d{1,3}\.){3}\d{1,3} # or IPv4
|
\[[A-Fa-f0-9:]+\] # or bracketed IPv6 (simple version)
)
(?::\d+)? # optional port
(?:\/[^\s]*)? # optional slash + path
$
(Expanded for clarity; you’d typically remove the whitespace and comments or use the x (extended) flag if supported.)
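JavaScript has no extended flag, so the whitespace and comments have to be stripped before the pattern is compiled. The following is a minimal sketch of that step, not a general-purpose tool: the helper `fromExtended` and the variable `urlPattern` are hypothetical names, the IPv6 branch is assumed to require square brackets, and the stripping logic assumes the pattern contains no literal spaces or `#` characters.

```javascript
// Hypothetical helper that emulates the "x" flag for simple patterns.
// Assumption: the pattern itself contains no literal whitespace or "#".
function fromExtended(source, flags = '') {
  const compact = source
    .replace(/\s+#.*$/gm, '') // drop trailing "# comment" text on each line
    .replace(/\s+/g, '');     // drop remaining indentation and newlines
  return new RegExp(compact, flags);
}

// String.raw keeps the backslashes exactly as written.
const urlPattern = fromExtended(String.raw`
  ^https?:\/\/
  (?:
    (?:[a-zA-Z0-9-]+\.)+[a-zA-Z]{2,}   # domain...
    |
    (?:\d{1,3}\.){3}\d{1,3}            # or IPv4
    |
    \[[A-Fa-f0-9:]+\]                  # or bracketed IPv6 (simple version)
  )
  (?::\d+)?                            # optional port
  (?:\/[^\s]*)?                        # optional slash + path
  $
`);

console.log(urlPattern.test('https://example.com'));  // true
console.log(urlPattern.test('http://[::1]:8080/'));   // true
console.log(urlPattern.test('ftp://example.com'));    // false
```

Keeping the commented form in source and compiling it once at startup keeps the pattern readable without changing what it matches.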
This regex attempts to handle:
- HTTP/HTTPS scheme (^https?://).
- Common domain patterns (e.g., example.com).
- IPv4 addresses (like 192.168.0.1).
- A simplistic approach to IPv6 ([::1], etc.).
- Optional port (:8080).
- Optional path after a slash.
Caveats
- This pattern won’t catch every possible valid URL (e.g., custom schemes like ftp://, mailto:, or some exotic internationalized domains).
- Full RFC-compliant URL regexes become extremely large and complex, or incomplete.
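The caveats above, plus one false positive worth knowing about, can be checked directly. This is a rough sketch that assumes the expanded pattern has been collapsed into a single JavaScript regex literal with a bracketed IPv6 branch. The names `practicalUrlPattern`, `samples`, and `results` are illustrative, and the sample inputs are made up for the demonstration.

```javascript
// Collapsed form of the expanded pattern (JavaScript has no "x" flag).
// Assumption: the IPv6 branch requires square brackets.
const practicalUrlPattern =
  /^https?:\/\/(?:(?:[a-zA-Z0-9-]+\.)+[a-zA-Z]{2,}|(?:\d{1,3}\.){3}\d{1,3}|\[[A-Fa-f0-9:]+\])(?::\d+)?(?:\/[^\s]*)?$/;

// Hypothetical sample inputs, one per caveat.
const samples = [
  'https://example.com/docs',    // common case: accepted
  'ftp://example.com/file.txt',  // valid URL, but the scheme is not http/https
  'mailto:someone@example.com',  // valid URL with no "//" part at all
  'https://bücher.example',      // internationalized domain in Unicode form
  'http://999.999.999.999',      // not a real IPv4 address, yet accepted
];

const results = samples.map((s) => practicalUrlPattern.test(s));
samples.forEach((s, i) => console.log(s, '->', results[i]));
// results: [true, false, false, false, true]
```

The last sample passes because the IPv4 branch only counts digits and does not range-check each octet, so the pattern has false positives as well as false negatives.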
Alternative Approaches
- Use a Parser
  - For example, in JavaScript:
    function isValidUrl(str) {
      try {
        new URL(str);
        return true;
      } catch (e) {
        return false;
      }
    }
  - The built-in URL constructor will attempt to parse the string. If it fails, it’s not a valid URL.
- Use a Library
  - Many languages have robust libraries that parse URLs for you (e.g., urllib in Python, java.net.URL in Java). These handle edge cases that can be missed by hand-crafted regexes.
- Restrict Format
  - If your use case only needs to accept typical http://example.com URLs (no custom schemes, no exotic domain endings), a simpler regex might suffice.
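One detail about the parser approach is easy to miss: the `URL` constructor accepts any syntactically valid absolute URL, including `mailto:` or `ftp:` ones. If only web links should pass, a protocol check can be layered on top of the parse. The following is a minimal sketch, assuming a runtime that provides the standard `URL` global (modern browsers, Node.js). The helper name `isHttpUrlViaParser` is hypothetical.

```javascript
// Hypothetical helper: parse with the built-in URL class, then
// accept only the http: and https: schemes.
function isHttpUrlViaParser(str) {
  let parsed;
  try {
    parsed = new URL(str); // throws a TypeError on unparseable input
  } catch (e) {
    return false;
  }
  // URL.protocol includes the trailing colon, e.g. "https:".
  return parsed.protocol === 'http:' || parsed.protocol === 'https:';
}

console.log(isHttpUrlViaParser('https://example.com/path?q=1')); // true
console.log(isHttpUrlViaParser('http://[::1]:8080/'));           // true
console.log(isHttpUrlViaParser('mailto:someone@example.com'));   // false
console.log(isHttpUrlViaParser('not a url'));                    // false
```

Combining the parser with an explicit scheme allow-list gives the strictness of the regex approach without hand-maintaining the host and path grammar.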
Example JS Usage with the Above Regex
function isHttpUrl(str) {
  const pattern = new RegExp(
    '^https?:\\/\\/' +                        // protocol
    '(?:' +
      '(?:[a-zA-Z0-9-]+\\.)+[a-zA-Z]{2,}' +   // domain
      '|(?:\\d{1,3}\\.){3}\\d{1,3}' +         // or IPv4
      '|\\[[A-Fa-f0-9:]+\\]' +                // or bracketed IPv6 (simple version)
    ')' +
    '(?::\\d+)?' +                            // optional port
    '(?:\\/[^\\s]*)?' +                       // optional path
    '$'
  );
  return pattern.test(str);
}

console.log(isHttpUrl('https://example.com'));       // true
console.log(isHttpUrl('http://example.com/path'));   // true
console.log(isHttpUrl('https://192.168.0.1:3000'));  // true
console.log(isHttpUrl('ftp://somewhere.com'));       // false (ftp not allowed)
console.log(isHttpUrl('not a url'));                 // false
Final Thoughts
- For comprehensive URL validation (handling all international domain scenarios, custom schemes, etc.), it’s best to use an actual URL parser.
- For common http:// or https:// patterns, a simpler or slightly extended regex is enough in many cases.
- Validate according to your needs. If you only allow https://, your regex can be simpler.
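As an illustration of that last point, an https-only check for plain domain names can drop the scheme alternation and both IP branches. This is a sketch under those assumptions rather than a recommended pattern, and `httpsOnlyPattern` and `isHttpsDomainUrl` are illustrative names.

```javascript
// Hypothetical https-only validator: no http, no IP literals.
const httpsOnlyPattern =
  /^https:\/\/(?:[a-zA-Z0-9-]+\.)+[a-zA-Z]{2,}(?::\d+)?(?:\/\S*)?$/;

function isHttpsDomainUrl(str) {
  return httpsOnlyPattern.test(str);
}

console.log(isHttpsDomainUrl('https://example.com:8443/login')); // true
console.log(isHttpsDomainUrl('http://example.com'));             // false
console.log(isHttpsDomainUrl('https://192.168.0.1'));            // false
```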
Ultimately, no single regex can handle every valid URL, but a “good enough” pattern for common http/https cases plus a URL parser is usually the best approach.