Regular expressions (regex) are powerful patterns for searching, matching, and manipulating text. They might look intimidating at first, but the basics are surprisingly simple.
What is Regex?
A regular expression is a sequence of characters that defines a search pattern. It's used in programming, text editors, and command-line tools for pattern matching.
Basic Patterns
- `\d` matches any digit (0-9)
- `\w` matches any word character (letters, digits, underscore)
- `\s` matches any whitespace character (space, tab, newline)
- `.` matches any single character except a newline
- `^` matches the start of a string
- `$` matches the end of a string
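To see these tokens in action, here is a minimal sketch using Python's built-in `re` module (the choice of Python and the sample text are assumptions for illustration; other engines behave similarly for these basics). One caveat: in Python 3, `\d` and `\w` also match non-ASCII digits and letters unless you pass `re.ASCII`.

```python
import re

text = "Order 66\nBay_7"

# \d matches one digit; findall returns every non-overlapping match.
digits = re.findall(r"\d", text)             # ['6', '6', '7']

# \w matches one word character (letter, digit, or underscore).
word_chars = re.findall(r"\w", "a_1 !")      # ['a', '_', '1']

# \s matches one whitespace character (here a space and a newline).
spaces = re.findall(r"\s", text)             # [' ', '\n']

# . matches any character except a newline, so the "\n" is skipped.
dots = re.findall(r".", "a\nb")              # ['a', 'b']

# ^ anchors to the start of the string, $ to the end.
starts_with_order = re.search(r"^Order", text) is not None   # True
starts_with_bay = re.search(r"^Bay", text) is not None       # False
ends_with_7 = re.search(r"7$", text) is not None             # True
```

Note that `^Bay` does not match even though "Bay" begins the second line: without the multiline flag, `^` only matches at the very start of the string.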
Quantifiers
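- `*` matches the preceding token 0 or more times
- `+` matches the preceding token 1 or more times
- `?` matches the preceding token 0 or 1 time
- `{3}` matches the preceding token exactly 3 times
- `{2,5}` matches the preceding token 2 to 5 times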
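Each quantifier applies to the token immediately before it. The sketch below checks a few strings against each one using Python's `re.fullmatch`, which requires the whole string to fit the pattern. The `matches` helper is a hypothetical convenience function, not part of any library.

```python
import re

def matches(pattern: str, text: str) -> bool:
    """Hypothetical helper: True if the entire text fits the pattern."""
    return re.fullmatch(pattern, text) is not None

# * : zero or more b's after the a
star_zero = matches(r"ab*", "a")            # True
star_many = matches(r"ab*", "abbb")         # True

# + : at least one b is required
plus_zero = matches(r"ab+", "a")            # False
plus_many = matches(r"ab+", "abbb")         # True

# ? : the u is optional
opt_without = matches(r"colou?r", "color")  # True
opt_with = matches(r"colou?r", "colour")    # True

# {3} : exactly three digits
exact_ok = matches(r"\d{3}", "123")         # True
exact_short = matches(r"\d{3}", "12")       # False

# {2,5} : between two and five digits
range_low = matches(r"\d{2,5}", "1")        # False
range_ok = matches(r"\d{2,5}", "12345")     # True
range_high = matches(r"\d{2,5}", "123456")  # False
```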
Practical Examples
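1. **Email validation**: `[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}` (a loose format check; it does not prove the address exists or cover every valid address)
2. **Phone number**: `\d{3}[-.]?\d{3}[-.]?\d{4}` (10 digits in 3-3-4 grouping, with an optional `-` or `.` between groups)
3. **URL**: `https?://[\w.-]+(?:\.[\w.-]+)+[\w.,@?^=%&:/~+#-]*` (http and https only)
4. **IP address**: `\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}` (checks the IPv4 shape only; it also accepts out-of-range values such as 999.1.1.1)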
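Here is a rough sketch of how these four patterns might be used from Python. The sample strings are made up, and `is_ipv4` is a hypothetical helper added to show one way to cover the range check that the IP pattern alone does not do.

```python
import re

# Patterns copied from the list above.
EMAIL = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")
PHONE = re.compile(r"\d{3}[-.]?\d{3}[-.]?\d{4}")
URL = re.compile(r"https?://[\w.-]+(?:\.[\w.-]+)+[\w.,@?^=%&:/~+#-]*")
IPV4 = re.compile(r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}")

# search() finds the pattern anywhere in a longer text.
email = EMAIL.search("Write to ada@example.com today").group()
phone = PHONE.search("Call 555-123-4567 now").group()
url = URL.search("Docs: https://example.com/guide?id=7 (new)").group()

# fullmatch() requires the whole string to fit, which suits validation.
shape_ok = IPV4.fullmatch("192.168.0.1") is not None
shape_also_ok = IPV4.fullmatch("999.1.1.1") is not None  # True: range is not checked

def is_ipv4(candidate: str) -> bool:
    """Hypothetical helper: shape check plus a 0-255 range check per octet."""
    if IPV4.fullmatch(candidate) is None:
        return False
    return all(0 <= int(octet) <= 255 for octet in candidate.split("."))
```

Splitting the work this way keeps the regex readable: the pattern handles the shape, and ordinary code handles the numeric range.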
Flags
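- g (global): Find all matches, not just the first (a JavaScript-style flag; some languages use a separate find-all function instead)
- i (case insensitive): Ignore case when matching
- m (multiline): ^ and $ match at the start and end of each line, not only of the whole string
- s (dotAll): . matches newline characters too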
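The flag letters above are how JavaScript-style regex literals spell them. As a sketch of the same ideas in Python (an assumption about which language you are using), `re.IGNORECASE`, `re.MULTILINE`, and `re.DOTALL` correspond to i, m, and s. Python has no g flag: `re.search` returns the first match, while `re.findall` returns all of them.

```python
import re

text = "Cat\ncat\nCATS"

# "g": search() stops at the first match, findall() collects every match.
first = re.search(r"cat", text, flags=re.IGNORECASE).group()   # 'Cat'
every = re.findall(r"cat", text, flags=re.IGNORECASE)          # ['Cat', 'cat', 'CAT']

# "i": without IGNORECASE only the lowercase occurrence matches.
exact_case = re.findall(r"cat", text)                          # ['cat']

# "m": with MULTILINE, ^ matches at the start of every line.
first_line_only = re.findall(r"^\w+", text)                    # ['Cat']
each_line = re.findall(r"^\w+", text, flags=re.MULTILINE)      # ['Cat', 'cat', 'CATS']

# "s": with DOTALL, . can match the newline between the two words.
without_dotall = re.search(r"Cat.cat", text)                   # None
with_dotall = re.search(r"Cat.cat", text, flags=re.DOTALL).group()  # 'Cat\ncat'
```

Flags can be combined with `|`, for example `re.IGNORECASE | re.MULTILINE`.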
Tips for Learning
1. Start with simple patterns and build complexity gradually
2. Use an online regex tester (like ToolBox AI's) to experiment
3. Read regex patterns left to right, one token at a time
4. Practice with real-world text extraction tasks
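One way to practice reading a pattern token by token is to write it out in verbose mode, where whitespace is ignored and each piece can carry a comment. The sketch below does this in Python with `re.VERBOSE`, using the phone number pattern from earlier. The sample inputs are made up for illustration.

```python
import re

COMPACT = re.compile(r"\d{3}[-.]?\d{3}[-.]?\d{4}")

# The same pattern, one token per line. In verbose mode, whitespace and
# anything after # are ignored outside character classes.
VERBOSE = re.compile(
    r"""
    \d{3}    # three digits
    [-.]?    # optional separator: hyphen or dot
    \d{3}    # three more digits
    [-.]?    # optional separator again
    \d{4}    # final four digits
    """,
    re.VERBOSE,
)

samples = ["555-123-4567", "555.123.4567", "5551234567", "555-12-4567"]

# Both spellings should agree on every sample.
compact_results = [COMPACT.fullmatch(s) is not None for s in samples]
verbose_results = [VERBOSE.fullmatch(s) is not None for s in samples]
```

The last sample fails in both versions because its middle group has only two digits.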