How to do a regular expression replace in MySQL?
If you’re using MySQL 8.0 or later, you can use the built-in REGEXP_REPLACE()
function to perform a regular-expression-based replacement. However, in older versions of MySQL (before 8.0), there is no native function for regex replacements. In such cases, you either:
- Implement a workaround via stored procedures, user-defined functions, or external processing (e.g., in an application layer), or
- Upgrade or switch to MySQL 8.0+ for direct usage of
REGEXP_REPLACE()
.
Below are examples and details for both scenarios.
1. MySQL 8.0+: Using REGEXP_REPLACE()
Starting from MySQL 8.0.4, you can call REGEXP_REPLACE(subject, pattern, replacement [, flags ])
. For instance:
SELECT REGEXP_REPLACE('Hello 123 world', '[0-9]+', 'XYZ');
- Subject:
'Hello 123 world'
- Pattern:
'[0-9]+'
(matches one or more digits) - Replacement:
'XYZ'
- flags (optional):
'i'
,'c'
,'m'
,'n'
,'u'
, or combinations, to control case-sensitivity, multiline, etc.
Result: Hello XYZ world
You can also do this on column values. For example, to update a table:
UPDATE my_table SET col_name = REGEXP_REPLACE(col_name, '[0-9]+', 'XYZ') WHERE id = 123;
This replaces all numeric substrings in col_name
with 'XYZ'
for the row(s) matching id = 123
.
1.1 Flags in REGEXP_REPLACE()
'i'
– Case-insensitive matching.'c'
– Case-sensitive matching (default).'m'
– Multiline mode (affects^
and$
).'n'
– Name “.” does not match newline.'u'
– Enables Unicode mode.
Example with flags:
SELECT REGEXP_REPLACE('Hello123', '[a-z]+', 'X', 'i'); -- Replaces 'Hello' (case-insensitive) with 'X', yields "X123"
2. Older MySQL Versions (Pre-8.0)
Prior to MySQL 8.0, there’s no built-in regex replace function. Possible approaches:
-
Client-Side Replacement
- Fetch the data from MySQL, do the regex replace in your application (e.g., in PHP, Python, Java, etc.), and then store the result back if needed.
-
User-Defined Functions (UDF)
- You can write or install a custom UDF in C/C++ that exposes a function like
REGEXP_REPLACE()
. This requires server configuration changes and compiling the UDF, which might not be feasible in shared hosting environments.
- You can write or install a custom UDF in C/C++ that exposes a function like
-
Stored Procedure Workarounds
- Some people attempt complex stored procedures that step through the string and do partial replacements, but it’s generally cumbersome and slower than doing it externally.
-
External Tools
- For large-scale text transformations, exporting data, running it through a script with e.g.
sed
,awk
, or a programming language, and then re-importing can be more practical.
- For large-scale text transformations, exporting data, running it through a script with e.g.
3. Example: Client-Side or UDF Approach (Pre-8.0)
3.1 Client-Side (e.g., in a Python Script)
import re import mysql.connector conn = mysql.connector.connect(host="localhost", user="root", password="pwd", database="testdb") cursor = conn.cursor() cursor.execute("SELECT id, col_name FROM my_table") for row in cursor: row_id, text_val = row new_val = re.sub(r'[0-9]+', 'XYZ', text_val) update_sql = "UPDATE my_table SET col_name = %s WHERE id = %s" cursor.execute(update_sql ### Recommended Resources 1. **[Grokking Python Fundamentals](https://www.designgurus.io/course/grokking-python-fundamentals)** Dive into Python essentials. 2. **[Grokking SQL for Tech Interviews](https://www.designgurus.io/course/grokking-sql-for-tech-interviews)**