How do I remove all non-alphanumeric characters from a string except dash?
Use a regular expression that matches everything except letters, digits, and dash, then replace those characters with an empty string. For example (in JavaScript):
    const str = "Hello@World - Example!";
    const cleaned = str.replace(/[^a-zA-Z0-9-]+/g, "");
    console.log(cleaned); // "HelloWorld-Example"
- [^a-zA-Z0-9-] is a character class that matches any character not in the set [a-zA-Z0-9-].
- Adding the + quantifier ([^...]+) matches one or more such characters in a row.
- The g flag applies the replacement to all occurrences.
- Everything except letters (a-z, A-Z), digits (0-9), and dash (-) is removed.
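As a quick check of the points above, here is a minimal sketch that wraps the pattern in a small helper. The name keepAlnumAndDash is made up for illustration and is not part of any library. It suggests that the + quantifier changes how many replacements are performed but not the result, while dropping the g flag does change the result.

```javascript
// Hypothetical helper name, used only for this illustration.
function keepAlnumAndDash(input) {
  // Remove every run of characters that are not ASCII letters, digits, or dash.
  return input.replace(/[^a-zA-Z0-9-]+/g, "");
}

const sample = "Hello@World - Example!";

// Same pattern without the + quantifier: each character is replaced one at a time.
const withoutPlus = sample.replace(/[^a-zA-Z0-9-]/g, "");

console.log(keepAlnumAndDash(sample)); // "HelloWorld-Example"
console.log(withoutPlus === keepAlnumAndDash(sample)); // true

// Without the g flag, only the first match ("@") is removed.
console.log(sample.replace(/[^a-zA-Z0-9-]+/, "")); // "HelloWorld - Example!"
```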
1. Details & Caveats
- Placement of Dash in Bracket
  - In many regex engines, you can safely include - as the first or last character in your bracket class (e.g., [A-Za-z0-9-] or [-A-Za-z0-9]) without escaping. In a negated class, "first" means directly after the ^, as in [^-A-Za-z0-9].
  - If you want - between other characters in the bracket, escape it as \- (e.g., [A-Za-z\-0-9]); otherwise it may be interpreted as a range operator.
- Case Insensitivity
  - Here, [a-zA-Z0-9-] explicitly covers upper and lower case letters, so you don’t need a separate flag for case-insensitive matching.
- Unicode / Accented Characters
  - If your string might contain accented letters or other Unicode characters, [a-zA-Z] might be too restrictive. You can consider something like [\p{L}\p{N}-] in regex flavors that support Unicode properties (PCRE, etc.). In JavaScript, \p{...} escapes are recognized only when the regex has the u flag, e.g., /[^\p{L}\p{N}-]+/gu.
- Cross-Language Implementation
  - The concept is similar in Python, Java, Ruby, and others. For instance, in Python:

    import re

    s = "Hello@World - Example!"
    cleaned = re.sub(r'[^a-zA-Z0-9-]+', '', s)
    print(cleaned)  # HelloWorld-Example
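The caveats above can be sketched in JavaScript as follows. This is an illustrative example rather than a definitive reference: the sample strings and variable names are made up, and the Unicode part assumes a runtime that supports \p{...} property escapes with the u flag.

```javascript
const input = "a-b+c,d.e";

// Dash last, first (right after ^), or escaped: all three treat "-" literally.
const dashLast = input.replace(/[^a-z0-9-]+/g, "");
const dashFirst = input.replace(/[^-a-z0-9]+/g, "");
const dashEscaped = input.replace(/[^a-z\-0-9]+/g, "");
console.log(dashLast, dashFirst, dashEscaped); // "a-bcde a-bcde a-bcde"

// Pitfall: "+-." is read as the range "+" through ".", which also contains ",".
const accidentalRange = input.replace(/[^a-z+-.]+/g, "");
console.log(accidentalRange); // "a-b+c,d.e" (the comma survives)

// Moving the dash to the end keeps only "+", ".", and "-" besides letters.
const dashAtEnd = input.replace(/[^a-z+.-]+/g, "");
console.log(dashAtEnd); // "a-b+cd.e"

// Case insensitivity: the i flag lets you write the letter range once.
const viaFlag = "Hello@World - Example!".replace(/[^a-z0-9-]+/gi, "");
console.log(viaFlag); // "HelloWorld-Example"

// Unicode: normalize to NFC so accented letters are single code points.
const accented = "Crème brûlée - déjà vu!".normalize("NFC");
const asciiOnly = accented.replace(/[^a-zA-Z0-9-]+/g, "");
const unicodeAware = accented.replace(/[^\p{L}\p{N}-]+/gu, "");
console.log(asciiOnly); // "Crmebrle-djvu"
console.log(unicodeAware); // "Crèmebrûlée-déjàvu"
```

If the input may arrive in decomposed form (NFD), the combining marks are not letters in the \p{L} sense, so normalizing first, as sketched here, is one way to avoid stripping accents.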
2. Example Variations
- If You Also Want to Keep Spaces

    const cleaned = str.replace(/[^a-zA-Z0-9\s-]+/g, "");
    // Adds \s (whitespace, including tabs and newlines) to the "allowed" set;
    // the dash stays last so it cannot be read as a range
- If You Allow Underscores (_)

    const cleaned = str.replace(/[^a-zA-Z0-9_-]+/g, "");
    // Keeps underscores as well; the dash stays last so it cannot be read as a range
- If You Only Want to Remove Specific Characters (i.e., Keep Dashes + Everything Else)
  - Then list just the characters to remove in a regular (non-negated) character class, rather than removing everything outside an allowed set. (Or invert the logic, depending on your exact needs.)
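The variations above might look like this in practice. It is a rough sketch with made-up sample strings and variable names; the \w shorthand is shown as one possible alternative for the underscore case, not as the only way to write it.

```javascript
const spaced = "tab\there, space here!";

// \s keeps every kind of whitespace, so the tab survives.
const keepWhitespace = spaced.replace(/[^a-zA-Z0-9\s-]+/g, "");
console.log(JSON.stringify(keepWhitespace)); // "tab\there space here"

// A literal space keeps only the space character, so the tab is removed.
const keepSpacesOnly = spaced.replace(/[^a-zA-Z0-9 -]+/g, "");
console.log(keepSpacesOnly); // "tabhere space here"

const address = "user_name@example-site.com!";

// \w is shorthand for [A-Za-z0-9_], so [^\w-] keeps letters, digits, "_" and "-".
const keepWordAndDash = address.replace(/[^\w-]+/g, "");
console.log(keepWordAndDash); // "user_nameexample-sitecom"

// Non-negated class: remove only "@" and "!" and keep everything else.
const onlySpecific = address.replace(/[@!]+/g, "");
console.log(onlySpecific); // "user_nameexample-site.com"
```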
Summary
- A negated character class [^a-zA-Z0-9-] is your go-to solution, removing everything that isn’t letters, digits, or dash.
- Adjust the pattern for additional allowed characters, such as spaces or underscores, by adding them to the bracket.
- This approach is language-agnostic: just use the appropriate regex and string-replacement function in your target environment.