Python Regular Expressions Exercises

Which of the following patterns will match any 3-letter string consisting of lowercase letters only?

r'^[a-z]{3}$'

r'[a-z]{3}'

r'^[a-z]{2,4}$'

r'[a-z]{3,}'

This question tests your understanding of basic regex, anchors, and quantifiers.

Option 1: Correct. ^[a-z]{3}$ means:
- ^ → start of string
- [a-z] → any lowercase letter
- {3} → exactly 3 letters
- $ → end of string
This pattern ensures that only strings with exactly 3 lowercase letters match.
Option 2: Matches 3 consecutive lowercase letters anywhere in the string, so strings with extra characters will also match.
Option 3: Matches 2–4 letters, not exactly 3.
Option 4: Matches 3 or more letters; strings longer than 3 will also match.

Example usage:

import re

pattern = r'^[a-z]{3}$'
print(bool(re.match(pattern, 'cat')))  # True
print(bool(re.match(pattern, 'cats'))) # False
print(bool(re.match(pattern, 'ca')))   # False

Tip: Use ^ and $ anchors to ensure the whole string matches the pattern exactly. Without anchors, regex may match substrings.
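
One subtlety behind this tip: in Python, $ also matches just before a trailing newline, so an anchored pattern can still accept 'cat\n'. The sketch below compares that behavior with re.fullmatch, which is a stricter alternative. The helper name is_three_lowercase is a hypothetical one chosen for illustration.

```python
import re

def is_three_lowercase(s: str) -> bool:
    """Hypothetical helper: True only for exactly three lowercase ASCII letters."""
    # fullmatch requires the whole string to match, so no anchors are needed.
    return re.fullmatch(r'[a-z]{3}', s) is not None

# $ matches at the very end of the string OR just before a final newline.
anchored_accepts_newline = re.match(r'^[a-z]{3}$', 'cat\n') is not None

print(is_three_lowercase('cat'))    # True
print(is_three_lowercase('cat\n'))  # False
print(anchored_accepts_newline)     # True
```

If a trailing newline must be rejected, re.fullmatch (or \Z instead of $) is probably the safer choice.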

You want to check if a string starts with “py” and is followed by any two characters. Which regex pattern should you use?

r'^py..$'

r'py..'

r'^py.{2}$'

r'^.py..$'

This exercise checks understanding of anchors (^), wildcards (.), and quantifiers in a simple pattern.

Option 1: Correct. ^py..$ means:
- ^ → string starts with
- py → literal characters “py”
- . → any single character (used twice)
- $ → end of string
This ensures the string starts with “py” and has exactly 2 more characters, making total length 4.
Option 2: Matches any substring starting with “py” followed by 2 characters, not necessarily the whole string.
Option 3: Equivalent to option 1. .{2} means exactly two of any character, the same as two dots, so it accepts the same strings; option 1 is the expected answer only because it writes the two dots out directly.
Option 4: Incorrect. The leading . requires one extra character before “py”, so a string that starts with “py” cannot match.

Example usage:

import re

pattern = r'^py..$'

print(bool(re.match(pattern, 'pych')))  # True
print(bool(re.match(pattern, 'python'))) # False (too long)
print(bool(re.match(pattern, 'apyx')))   # False (doesn't start with py)

Tip: Use ^ and $ for exact string patterns and . to represent any character. Two dots = exactly two characters.
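
As a quick check of the claim that two dots and .{2} behave the same, the following sketch runs both patterns over a few sample strings. The sample list is made up for illustration.

```python
import re

# Illustrative sample strings (not from the exercise itself).
samples = ['pych', 'py', 'python', 'apyx', 'py12']

two_dots = [bool(re.match(r'^py..$', s)) for s in samples]
counted = [bool(re.match(r'^py.{2}$', s)) for s in samples]

print(two_dots)             # [True, False, False, False, True]
print(two_dots == counted)  # True
```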

Which pattern will match a string that starts with a digit, followed by exactly two lowercase letters, and ends with either “x” or “y”?

r'^\d[a-z]{2}[xy]$'

r'\d[a-z]{2}[xy]'

r'^[0-9][a-z]{2}(x|y)$'

r'^[0-9][a-z]{2}[xy]$'

This exercise combines digits, character classes, quantifiers, and alternation for a slightly tricky pattern.

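Option 1: Also matches the required strings: \d for the digit, [a-z]{2} for the two letters, and [xy] for the last character. The difference from option 4 is that, for str patterns in Python 3, \d matches any Unicode decimal digit unless re.ASCII is set, whereas [0-9] accepts only the ASCII digits 0–9. That stricter range is why option 4 is the expected answer.
Option 2: Missing ^ and $ anchors, so it may match a substring instead of the whole string.
Option 3: Uses (x|y), which matches the same strings but creates an unneeded capturing group; the character class [xy] is simpler.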
Option 4: Correct. ^[0-9][a-z]{2}[xy]$ ensures:
- Start of string: ^
- First character is a digit: [0-9]
- Next two characters are lowercase letters: [a-z]{2}
- Last character is either 'x' or 'y': [xy]
- End of string: $

Example usage:

import re

pattern = r'^[0-9][a-z]{2}[xy]$'

print(bool(re.match(pattern, '1abx')))  # True
print(bool(re.match(pattern, '2cdz')))  # False (last char not x or y)
print(bool(re.match(pattern, '12ax')))  # False (second char is not letter)

Tip: Use [xy] for simple alternation instead of parentheses when matching single characters. Always anchor your pattern with ^ and $ for exact matches.
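
The difference between \d and [0-9] only shows up with non-ASCII digits. A small sketch, using the Arabic-Indic digit three (U+0663) as an arbitrary test character:

```python
import re

# U+0663 is ARABIC-INDIC DIGIT THREE: a Unicode decimal digit, but not in 0-9.
s = '\u0663' + 'abx'

unicode_digit = bool(re.fullmatch(r'\d[a-z]{2}[xy]', s))
ascii_range = bool(re.fullmatch(r'[0-9][a-z]{2}[xy]', s))
ascii_flag = bool(re.fullmatch(r'\d[a-z]{2}[xy]', s, re.ASCII))

print(unicode_digit)  # True
print(ascii_range)    # False
print(ascii_flag)     # False
```

Either [0-9] or the re.ASCII flag restricts matching to the ASCII digits.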

Consider the following code:

import re

pattern = r'^[A-Z][a-z]{2,4}\d$'
test_strings = ['Cat1', 'Dog12', 'Apple5', 'Bat9', 'cat1']
matches = [s for s in test_strings if re.match(pattern, s)]
print(matches)

Which of the following is the correct output?

['Cat1', 'Dog12', 'Apple5', 'Bat9', 'cat1']

['Cat1', 'Bat9']

['Dog12', 'Apple5']

['Cat1', 'Apple5', 'Bat9']

This exercise combines anchors, character classes, quantifiers, and string filtering with regex.

Pattern Analysis: ^[A-Z][a-z]{2,4}\d$
- ^ → start of string
- [A-Z] → first character is uppercase
- [a-z]{2,4} → next 2 to 4 lowercase letters
- \d → ends with a digit
- $ → end of string
Check each string:
- 'Cat1' → C (uppercase), a t (2 lowercase), 1 → matches
- 'Dog12' → D (uppercase), o g (2 lowercase), then '12' is two digits, but \d allows only one digit before $ → does not match
- 'Apple5' → A (uppercase), p p l e (4 lowercase letters), 5 → matches
- 'Bat9' → B (uppercase), a t (2 lowercase), 9 → matches
- 'cat1' → starts with lowercase → does not match

Thus, matches = ['Cat1', 'Apple5', 'Bat9']

Tip: Pay attention to anchors and exact counts with curly braces. re.match matches from the start of the string, so ^ is optional here but keeps intent clear.

Fill in the blank in the following statement:

The Python `re` function `_____________` returns a match object only if the pattern matches at the beginning of the string, otherwise it returns `None`.

re.search()

re.match()

re.findall()

re.fullmatch()

This exercise tests your conceptual understanding of the key difference between re.match and re.search in Python.

Option 1: re.search() searches for the pattern anywhere in the string, not just at the beginning.
Option 2: re.match() is correct. It attempts to match the pattern starting from the beginning of the string.
Option 3: re.findall() returns all non-overlapping matches as a list, not a match object.
Option 4: re.fullmatch() matches the entire string; the pattern must cover the whole string exactly.

Example:

import re

pattern = r'cat'
print(re.match(pattern, 'catapult'))     # <re.Match object; span=(0, 3), match='cat'>
print(re.match(pattern, 'concatenate'))  # None
print(re.search(pattern, 'concatenate')) # <re.Match object; span=(3, 6), match='cat'>

Key Takeaway: Use re.match() when you want to ensure the pattern matches at the start of the string. For general search anywhere in the string, use re.search().

You have the following text:

Order ID: 12345, Customer: Alice
Order ID: 67890, Customer: Bob

You want to extract all Order IDs using Python regex. Which code snippet will correctly do this?

import re
re.match(r'Order ID: (\d+)', text)

import re
re.search(r'Order ID: (\d+)', text)

import re
re.findall(r'Order ID: (\d+)', text)

import re
re.match(r'(\d+)', text)

This exercise focuses on using groups and the appropriate re function to extract multiple matches from text.

Option 1: re.match only checks at the start of the string, so it will only match the first Order ID if it is at the very beginning.
Option 2: re.search finds the first occurrence of the pattern, but will not extract all Order IDs.
Option 3: Correct. re.findall returns a list of all matches for the pattern, using the captured group (\d+).
Option 4: re.match(r'(\d+)', text) tries to match digits at the start of the string. It will fail because the string starts with “Order ID:”.

Example usage:

import re

text = """Order ID: 12345, Customer: Alice
Order ID: 67890, Customer: Bob"""

order_ids = re.findall(r'Order ID: (\d+)', text)
print(order_ids)  # ['12345', '67890']

Tip: Use re.findall when you need all occurrences of a pattern. Parentheses (...) define the group you want to capture.
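
When more than one field is needed per record, re.finditer with named groups is one option. The sketch below assumes the same two-line text; the group names order_id and customer are chosen here for illustration.

```python
import re

text = """Order ID: 12345, Customer: Alice
Order ID: 67890, Customer: Bob"""

# Named groups (?P<name>...) make each captured field accessible by name.
pattern = re.compile(r'Order ID: (?P<order_id>\d+), Customer: (?P<customer>\w+)')

# finditer yields one match object per occurrence.
orders = [m.groupdict() for m in pattern.finditer(text)]
print(orders)
# [{'order_id': '12345', 'customer': 'Alice'}, {'order_id': '67890', 'customer': 'Bob'}]
```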

You want to match a string that contains either “cat” or “dog” anywhere in it. Which regex pattern will achieve this correctly?

r'cat|dog'

r'(cat|dog)^'

r'^(cat|dog)$'

r'[cat|dog]'

This exercise focuses on alternation in regex, which allows matching one of multiple patterns.

Option 1: Correct. cat|dog matches either “cat” or “dog” anywhere in the string.
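Option 2: Incorrect. The ^ comes after the alternation, so it asserts start-of-string (or start-of-line) at a position where “cat” or “dog” has already been consumed. That can never be true, so the pattern matches nothing.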
Option 3: Incorrect. ^(cat|dog)$ matches the entire string exactly equal to “cat” or “dog”, not when they appear inside other text.
Option 4: Incorrect. Square brackets denote a character class, so [cat|dog] matches any single character that is 'c', 'a', 't', '|', 'd', 'o', or 'g'.

Example usage:

import re

pattern = r'cat|dog'

texts = ['I have a cat', 'My dog is cute', 'Birds are flying']

matches = [s for s in texts if re.search(pattern, s)]
print(matches)  # ['I have a cat', 'My dog is cute']

Tip: Use the | operator for alternation. Remember: parentheses group alternatives, while square brackets match individual characters.
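
Note that cat|dog also matches inside longer words. If only whole words should count, one possible refinement (an addition, not part of the exercise) is to wrap the alternation in a non-capturing group with \b word boundaries:

```python
import re

loose = re.compile(r'cat|dog')
whole_word = re.compile(r'\b(?:cat|dog)\b')

# Illustrative sample strings.
texts = ['I have a cat', 'concatenate the lists', 'hotdog stand', 'My dog is cute']

loose_hits = [s for s in texts if loose.search(s)]
word_hits = [s for s in texts if whole_word.search(s)]

print(loose_hits)  # all four strings
print(word_hits)   # ['I have a cat', 'My dog is cute']
```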

Consider the following text:

<p>Title 1</p><p>Title 2</p>

You want to extract each title separately using regex. Which pattern will work correctly?

r'<p>.*</p>'

r'<p>.+</p>'

r'<p>.*?</p>'

r'<p>[^p]*</p>'

This exercise tests understanding of greedy vs non-greedy quantifiers.

Option 1: .* is greedy, so it matches from the first <p> to the last </p>, resulting in one big match: <p>Title 1</p><p>Title 2</p>.
Option 2: Similar to option 1 but requires at least one character; still greedy, so same problem.
Option 3: Correct. .*? is non-greedy, so it matches the shortest possible string between <p> and </p>, giving two separate matches: <p>Title 1</p> and <p>Title 2</p>.
Option 4: Not reliable. [^p]* matches any run of characters other than 'p'. It happens to split this sample correctly, but it fails as soon as a title contains the letter 'p' (for example <p>Chapter 1</p>).

Example usage:

import re

text = '<p>Title 1</p><p>Title 2</p>'
pattern = r'<p>.*?</p>'

matches = re.findall(pattern, text)
print(matches)  # ['<p>Title 1</p>', '<p>Title 2</p>']

Tip: Use ? after a quantifier like * or + to make it non-greedy when you want the shortest match possible, especially in nested or repeated patterns.
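
To see the greedy and non-greedy behavior side by side, this sketch adds a capturing group so that findall returns only the text between the tags:

```python
import re

text = '<p>Title 1</p><p>Title 2</p>'

# Greedy: .* runs to the last </p> in the string.
greedy = re.findall(r'<p>(.*)</p>', text)
# Non-greedy: .*? stops at the first </p> it can.
lazy = re.findall(r'<p>(.*?)</p>', text)

print(greedy)  # ['Title 1</p><p>Title 2']
print(lazy)    # ['Title 1', 'Title 2']
```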

You want to match a string that literally contains `a+b`. Which regex pattern will work correctly in Python?

r'a+b'

r'a\+b'

r'a\\b'

r'[a+b]'

This exercise tests understanding of escaping special characters in regex.

Option 1: a+b is incorrect because + is a regex quantifier meaning “one or more of the preceding character”. It will match 'ab', 'aab', 'aaab', etc.
Option 2: Correct. a\+b escapes the +, so it matches the string 'a+b' literally.
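Option 3: a\\b is incorrect. In a raw string, \\ is an escaped (literal) backslash, so this pattern matches 'a', a backslash, then 'b'; with a single backslash, \b would be a word boundary. Neither form matches a literal '+'.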
Option 4: [a+b] is a character class that matches a single character: 'a', '+', or 'b'. It does not match the whole string 'a+b'.

Example usage:

import re

pattern = r'a\+b'

print(bool(re.match(pattern, 'a+b')))  # True
print(bool(re.match(pattern, 'aaab'))) # False
print(bool(re.match(pattern, 'a-b')))  # False

Tip: Special regex characters such as + * ? . ^ $ | ( ) [ ] { } and \ itself must be escaped with a backslash (\) if you want to match them literally.
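
When the literal text comes from a variable, re.escape can add the backslashes for you. A minimal sketch, with made-up sample strings:

```python
import re

literal = 'a+b'
pattern = re.escape(literal)  # escapes the + so it is matched literally

matches_literal = bool(re.fullmatch(pattern, 'a+b'))
matches_repeat = bool(re.fullmatch(pattern, 'aab'))

# The dot is escaped too, so '1.5' does not match '105'.
versions = re.findall(re.escape('1.5'), 'v1.5 and v105')

print(matches_literal)  # True
print(matches_repeat)   # False
print(versions)         # ['1.5']
```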

Consider the following code:

import re

text = "Hello world"
pattern = r"world"

match_result = re.match(pattern, text)
search_result = re.search(pattern, text)
fullmatch_result = re.fullmatch(pattern, text)

results = [match_result, search_result, fullmatch_result]
print([bool(r) for r in results])

Which of the following is the correct output?

[True, True, True]

[True, True, False]

[False, False, False]

[False, True, False]

This exercise tests understanding of the difference between re.match, re.search, and re.fullmatch.

re.match(pattern, text) → matches only at the beginning of the string. Since "world" is not at the start, it returns None.
re.search(pattern, text) → searches the entire string and returns the first occurrence. "world" is present, so it returns a match object.
re.fullmatch(pattern, text) → requires the whole string to exactly match the pattern. "world" is only part of the string, so it returns None.

Step-by-step reasoning:

match_result → None      → bool(None) = False
search_result → match obj → bool(match obj) = True
fullmatch_result → None   → bool(None) = False

Output → [False, True, False]

Tip: Use re.match when matching from the start, re.search for anywhere in the string, and re.fullmatch when the entire string must match the pattern.

You have the following text:

"hello hello world world test"

You want to find all consecutive repeated words using Python regex. Which pattern will do this correctly?

r'(\b\w+\b)\1'

r'(\w+)\1+'

r'(\b\w+\b)\s+\1'

r'\b(\w+)\b\b\1\b'

This exercise focuses on capturing groups and backreferences in regex.

Option 1: (\b\w+\b)\1 fails because \1 is immediately after the word, but repeated words are separated by a space.
Option 2: (\w+)\1+ requires the repeat to follow immediately, with no space, and ignores word boundaries, so it matches fragments such as the 'll' in "hello" instead of repeated words.
Option 3: Correct. (\b\w+\b)\s+\1 means:
- (\b\w+\b) → capture a whole word
- \s+ → one or more spaces
- \1 → the same word captured before
This matches "hello hello" and "world world".
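Option 4: Incorrect. The syntax is valid, but \b(\w+)\b\b\1\b requires a word boundary between the word and an immediate repeat with no separator. Two adjacent word characters have no boundary between them, so the pattern can never match.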

Example usage:

import re

text = "hello hello world world test"
pattern = r'(\b\w+\b)\s+\1'

matches = re.findall(pattern, text)
print(matches)  # ['hello', 'world']

Tip: Use \1, \2, etc., to refer to previously captured groups. Remember to handle spaces or separators between repeated patterns.
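
One edge case worth knowing: without a word boundary after \1, the backreference can match the start of a longer word. The sketch below compares the exercise's pattern with a stricter variant; the stricter pattern and the sample text are illustrative additions.

```python
import re

text = "the then hello hello"

# Exercise pattern: \1 may match the prefix 'the' of 'then'.
loose = re.findall(r'(\b\w+\b)\s+\1', text)
# Stricter variant: \b after \1 requires the repeat to be a whole word.
strict = re.findall(r'\b(\w+)\s+\1\b', text)
# With re.IGNORECASE the backreference also compares case-insensitively.
ignore_case = re.findall(r'\b(\w+)\s+\1\b', "Hello hello", re.IGNORECASE)

print(loose)        # ['the', 'hello']
print(strict)       # ['hello']
print(ignore_case)  # ['Hello']
```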

You want to match either "cat" or "dog" without creating a capturing group. Fill in the blank in the following regex:

pattern = r'^(?:_____)$'

Which should replace `<blank>`?

cat|dog

(cat|dog)

[cat|dog]

c(a|o)t|dog

This exercise tests understanding of non-capturing groups in regex.

Option 1: Correct. (?:cat|dog) matches "cat" or "dog" without creating a capturing group. So the full pattern is ^(?:cat|dog)$.
Option 2: (cat|dog) would create a capturing group. It works functionally, but the exercise specifically asks for a non-capturing group.
Option 3: [cat|dog] is a character class; it matches a single character among 'c', 'a', 't', '|', 'd', 'o', or 'g', which is not intended.
Option 4: Incorrect. c(a|o)t|dog matches "cat", "cot", or "dog", so it accepts an extra word, and (a|o) is a capturing group, which the exercise asks you to avoid.

Example usage:

import re

pattern = r'^(?:cat|dog)$'

print(bool(re.match(pattern, 'cat')))  # True
print(bool(re.match(pattern, 'dog')))  # True
print(bool(re.match(pattern, 'bat')))  # False

Tip: Use ?: inside parentheses to create a group that groups alternatives without storing a match. This is useful for performance or when you do not need the captured value.
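
The practical difference between the two kinds of group shows up in groups() and in what re.findall returns. A short sketch with made-up sample text:

```python
import re

capturing = re.fullmatch(r'(cat|dog)', 'cat')
non_capturing = re.fullmatch(r'(?:cat|dog)', 'cat')

print(capturing.groups())      # ('cat',)
print(non_capturing.groups())  # ()

text = 'cats and dogs'
# With a capturing group, findall returns the group, not the whole match.
with_group = re.findall(r'(cat|dog)s', text)
# With a non-capturing group, findall returns the whole match.
without_group = re.findall(r'(?:cat|dog)s', text)

print(with_group)     # ['cat', 'dog']
print(without_group)  # ['cats', 'dogs']
```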

You want to validate a password that must:

Be at least 6 characters long
Contain at least one digit

Which regex pattern will correctly enforce this using a positive lookahead?

r'\d{1,6}'

r'^\d.*$'

r'^(?=.*\d).{6,}$'

r'.{6,}\d'

This exercise demonstrates a real-world practical scenario using positive lookahead to validate passwords.

Option 1: \d{1,6} only matches a run of 1 to 6 digits; it enforces neither the minimum length nor a whole-string match.
Option 2: ^\d.*$ requires the password to start with a digit and does not enforce a minimum length.
Option 3: Correct. ^(?=.*\d).{6,}$ means:
- (?=.*\d) → positive lookahead ensures at least one digit exists somewhere in the string
- .{6,} → total length is at least 6 characters
- ^ ... $ → anchors match the whole string
Option 4: .{6,}\d requires a digit after at least six other characters, so a valid password such as 'abc123' is rejected; it also uses no lookahead.

Example usage:

import re

pattern = r'^(?=.*\d).{6,}$'

print(bool(re.match(pattern, 'abc123')))   # True
print(bool(re.match(pattern, 'abcdef')))   # False (no digit)
print(bool(re.match(pattern, '12ab')))     # False (less than 6 chars)

Tip: Positive lookahead (?=...) is a powerful way to enforce rules without consuming characters. It’s especially useful in password validation or complex multi-rule patterns.
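
Lookaheads can be stacked, one per rule. The sketch below assumes a stricter, hypothetical policy (at least 8 characters, one digit, one uppercase letter) purely to illustrate the technique; the names STRONG and is_strong are not from the exercise.

```python
import re

# Each lookahead checks one rule from the start of the string without
# consuming characters; .{8,} then enforces the minimum length.
STRONG = re.compile(r'(?=.*\d)(?=.*[A-Z]).{8,}')

def is_strong(password: str) -> bool:
    """Hypothetical check: >= 8 chars, at least one digit and one uppercase letter."""
    return STRONG.fullmatch(password) is not None

print(is_strong('Passw0rd'))   # True
print(is_strong('password1'))  # False (no uppercase letter)
print(is_strong('Pass1'))      # False (too short)
```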

You have the following text:

Prices: $100, $250, 300, $400

You want to extract only the numbers preceded by `$` using regex. Which pattern will do this correctly?

r'\$\d+'

r'(?=\$\d+)\d+'

r'(?<=\$)\d+'

r'\d+\$'

This exercise demonstrates positive lookbehind to extract numbers that are immediately preceded by a specific character.

Option 1: \$\d+ works, but it includes the $ in the match. Sometimes you may want only the number.
Option 2: (?=\$\d+)\d+ is a positive lookahead. It asserts that $ and digits start at the current position, but then \d+ must match at that same position, where the character is $. The pattern therefore never matches.
Option 3: Correct. (?<=\$)\d+ means:
- (?<=\$) → positive lookbehind ensures the number is preceded by a $
- \d+ → matches the digits only
This will extract 100, 250, 400 only.
Option 4: \d+\$ matches numbers that end with $, which is not what we want.

Example usage:

import re

text = 'Prices: $100, $250, 300, $400'
pattern = r'(?<=\$)\d+'

matches = re.findall(pattern, text)
print(matches)  # ['100', '250', '400']

Tip: Use positive lookbehind (?<=...) to match content only if it is preceded by a specific character or pattern, without including the preceding part in the match.
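
A capturing group is a common alternative to lookbehind and gives the same result here. The sketch also shows a limitation of Python's re module: a lookbehind must have a fixed width, so a variable-width one is rejected at compile time.

```python
import re

text = 'Prices: $100, $250, 300, $400'

with_lookbehind = re.findall(r'(?<=\$)\d+', text)
with_group = re.findall(r'\$(\d+)', text)  # findall returns only the group

print(with_lookbehind)  # ['100', '250', '400']
print(with_group)       # ['100', '250', '400']

# Variable-width lookbehind is not supported by the re module.
try:
    re.compile(r'(?<=\$\s*)\d+')
    variable_width_rejected = False
except re.error:
    variable_width_rejected = True

print(variable_width_rejected)  # True
```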

You want to validate an email address that must:

Start with letters or digits
Contain exactly one @
Domain contains only letters and at least one dot

Which regex pattern is correct?

r'\w+@\w+\.\w+'

r'[a-z]+@[a-z]+\.[a-z]+'

r'\w+@[\w]+\.[\w]+'

r'^[a-zA-Z0-9]+@[a-zA-Z]+\.[a-zA-Z]+$'

This exercise demonstrates a complex real-world validation using regex for emails.

Option 1: Has no anchors, so it can match part of a longer string, and \w allows digits and underscores in both the username and the domain.
Option 2: Only allows lowercase letters in the username and domain, so digits and uppercase letters are rejected; it is also unanchored.
Option 3: Equivalent to option 1; putting \w inside square brackets changes nothing.
Option 4: Correct. ^[a-zA-Z0-9]+@[a-zA-Z]+\.[a-zA-Z]+$ ensures:
- Username: letters and digits only ([a-zA-Z0-9]+)
- Exactly one @ symbol
- Domain: letters only ([a-zA-Z]+)
- Dot between domain and TLD
- Anchors ^ and $ ensure the entire string matches

Example usage:

import re

pattern = r'^[a-zA-Z0-9]+@[a-zA-Z]+\.[a-zA-Z]+$'

emails = ['test123@domain.com', 'user@domain', 'user_name@domain.com', 'user@domain1.com']
valid_emails = [e for e in emails if re.match(pattern, e)]

print(valid_emails)  # ['test123@domain.com']

Tip: For strict email validation, carefully define allowed characters in username and domain. Use anchors to match the entire string and avoid partial matches.

Which of the following statements about Python regex backreferences is TRUE?

The backreference \1 refers to the first character in the string.

Backreferences can only refer to non-capturing groups.

Backreferences allow you to match the same text that was previously captured by a capturing group.

Backreferences are used to repeat a pattern any number of times like {n}.

This is a conceptual exercise about backreferences in regex.

Option 1: Incorrect. \1 does not refer to the first character in the string; it refers to the content captured by the first capturing group.
Option 2: Incorrect. Backreferences cannot refer to non-capturing groups; they only refer to capturing groups (defined by parentheses (...)).
Option 3: Correct. Backreferences allow the regex to match the same text that was previously captured by a capturing group. This is useful for finding repeated words, patterns, or mirrored sequences.
Option 4: Incorrect. Quantifiers like `{n}` are used to repeat a pattern a fixed number of times; this is different from backreferences.

Key Takeaway: Use backreferences (\1, \2, ...) when you want to ensure that a later part of the string matches exactly what was captured earlier. They are particularly useful for repeated words or mirrored patterns.
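
As a small illustration of the idea, the sketch below uses \1 to require that a closing tag names the same tag as the opening one. The sample text is made up, and this is only a demonstration of backreferences, not a way to parse real HTML.

```python
import re

text = '<b>bold</b> <i>italic</i> <b>broken</i>'

# Group 1 captures the tag name; \1 must repeat exactly that name.
pattern = re.compile(r'<(\w+)>(.*?)</\1>')

pairs = pattern.findall(text)
print(pairs)  # [('b', 'bold'), ('i', 'italic')]
```

The mismatched <b>broken</i> is skipped because \1 would have to match "b" where the text has "i".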

You have the following text:

"Love #Python3! #Regex is #fun, but #coding123 is awesome."

You want to extract all hashtags (words starting with `#` and containing letters and digits only, no punctuation). Which regex pattern will work correctly?

r'#\w+!'

r'#\w+'

r'#[a-zA-Z]+'

r'#\w+\b[^\W_]'

This exercise focuses on multi-step extraction using regex for a practical scenario: extracting hashtags.

Option 1: #\w+! only matches hashtags that end with an exclamation mark, missing normal hashtags.
Option 2: Correct. #\w+ matches # followed by letters, digits, or underscores, capturing all hashtags like #Python3, #Regex, #fun, #coding123.
Option 3: #[a-zA-Z]+ only matches letters, missing digits in hashtags like #Python3 or #coding123.
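Option 4: #\w+\b[^\W_] is valid syntax but can never match: \b after \w+ requires the next character to be a non-word character (or the end of the string), while [^\W_] requires a letter or digit at that same position.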

Example usage:

import re

text = "Love #Python3! #Regex is #fun, but #coding123 is awesome."
pattern = r'#\w+'

hashtags = re.findall(pattern, text)
print(hashtags)  # ['#Python3', '#Regex', '#fun', '#coding123']

Tip: Use \w+ to match letters, digits, and underscores. For hashtags, this captures typical alphanumeric hashtags without punctuation.
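
Because \w also accepts underscores, a stricter reading of "letters and digits only" could use an explicit character class instead. A sketch with a made-up sample string:

```python
import re

text = 'Try #snake_case and #Python3'

with_word_class = re.findall(r'#\w+', text)
letters_digits_only = re.findall(r'#[A-Za-z0-9]+', text)

print(with_word_class)      # ['#snake_case', '#Python3']
print(letters_digits_only)  # ['#snake', '#Python3']
```

Which behavior is preferable depends on whether underscores should count as part of a hashtag.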

You have the following text:

"Total: $100 USD, Discount: $20 USD, Tax: 5 USD"

You want to extract numbers that are preceded by `$` and followed by `USD`. Which regex pattern will work correctly?

r'\$\d+USD'

r'(?<=\$)\d+'

r'(?<=\$)\d+(?=\sUSD)'

r'\d+(?=USD)'

This exercise demonstrates combining positive lookbehind and positive lookahead for precise matching.

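Option 1: \$\d+USD expects USD immediately after the digits, so it does not match this text, where a space separates them. Even with the space added, the match would include the $ and USD rather than the number alone.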
Option 2: (?<=\$)\d+ matches numbers after $, but does not ensure they are followed by USD.
Option 3: Correct. (?<=\$)\d+(?=\sUSD) means:
- (?<=\$) → number must be preceded by $
- \d+ → match one or more digits
- (?=\sUSD) → number must be followed by a space and "USD" without including it in the match
Option 4: \d+(?=USD) requires USD immediately after the digits, so it also fails on the space here, and it ignores the requirement for a preceding $.

Example usage:

import re

text = "Total: $100 USD, Discount: $20 USD, Tax: 5 USD"
pattern = r'(?<=\$)\d+(?=\sUSD)'

matches = re.findall(pattern, text)
print(matches)  # ['100', '20']

Tip: Combining lookahead (?=...) and lookbehind (?<=...) allows precise extraction without including surrounding characters in the match.

You have the following text:

"func(a, b), func2(c, func3(d, e))"

You want to extract the content inside the first level of parentheses for each function (not deeply nested). Which regex pattern will work correctly?

r'\(([^()]*)\)'

r'\((.*)\)'

r'\((.+?)\)'

r'\(([^)]*)\([^)]*\)[^)]*\)'

This exercise demonstrates handling nested-like structures with regex, specifically extracting content inside parentheses at the first level.

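Option 1: Correct for groups that contain no nested parentheses. \(([^()]*)\) works as follows:
- \( and \) → match the literal parentheses
- [^()]* → match any characters except parentheses, so a captured group can never span a nested pair
Note that for func2(c, func3(d, e)) this captures the inner 'd, e', not the outer content, because the outer pair contains parentheses.
Option 2: \((.*)\) is greedy and will match from the first '(' to the last ')', consuming nested parentheses as well.
Option 3: \((.+?)\) is non-greedy and stops at the first ')', so with nested parentheses it captures a truncated group such as 'c, func3(d, e'.
Option 4: Overly complex and not reliable; trying to manually account for nested parentheses is error-prone in regex.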
Option 4: Overly complex and not reliable; trying to manually account for nested parentheses is error-prone in regex.

Example usage:

import re

text = "func(a, b), func2(c, func3(d, e))"
pattern = r'\(([^()]*)\)'

matches = re.findall(pattern, text)
print(matches)  # ['a, b', 'd, e']  → only groups with no nested parentheses

Tip: Python's re module has no recursion, so [^()]* can only match parenthesized groups that contain no nested parentheses. For nested structures, consider a parser instead of regex.
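
For the nested case, a small depth counter is one way to collect the content of each top-level pair of parentheses. This is a minimal sketch, not a full parser; the function name top_level_groups is hypothetical, and unbalanced closing parentheses are simply ignored.

```python
import re

def top_level_groups(s: str) -> list[str]:
    """Hypothetical helper: content of each outermost (...) pair in s."""
    groups = []
    depth = 0
    start = 0
    for i, ch in enumerate(s):
        if ch == '(':
            if depth == 0:
                start = i + 1  # content begins after the opening parenthesis
            depth += 1
        elif ch == ')' and depth > 0:
            depth -= 1
            if depth == 0:
                groups.append(s[start:i])
    return groups

text = "func(a, b), func2(c, func3(d, e))"
print(top_level_groups(text))               # ['a, b', 'c, func3(d, e)']
print(re.findall(r'\(([^()]*)\)', text))    # ['a, b', 'd, e']
```

The regex returns only groups without nested parentheses, while the counter returns the outermost content intact.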

You want to validate URLs with the following rules:

Start with http:// or https://
Domain contains letters, digits, or hyphens
Optional path and query parameters
No spaces allowed

Which regex pattern will correctly validate such URLs?

r'https?://[a-zA-Z]+\.[a-zA-Z]+'

r'https?://\w+\.\w+(/\w+)?'

r'https?://[a-zA-Z0-9-]+\.[a-zA-Z]+(/\w+)?'

r'^https?://[a-zA-Z0-9-]+\.[a-zA-Z]+(?:/\S*)?$'

This exercise demonstrates regex optimization and handling edge cases for URL validation.

Option 1: Only matches the domain part; does not allow digits, hyphens, or optional paths.
Option 2: Uses \w+ for domain and optional path, but does not allow hyphens and allows only simple paths.
Option 3: Improves domain matching with digits and hyphens, allows simple paths, but paths are restricted to word characters only.
Option 4: Correct. ^https?://[a-zA-Z0-9-]+\.[a-zA-Z]+(?:/\S*)?$:
- ^https?:// → matches http or https at start
- [a-zA-Z0-9-]+\.[a-zA-Z]+ → one domain label of letters, digits, or hyphens, then a dot and a letters-only TLD
- (?:/\S*)? → optional path or query parameters, no spaces allowed
- $ → ensures full string match

Example usage:

import re

pattern = r'^https?://[a-zA-Z0-9-]+\.[a-zA-Z]+(?:/\S*)?$'

urls = [
    'http://example.com',
    'https://my-site.com/path/to/page?query=1',
    'ftp://invalid.com',
    'https://space inurl.com'
]

valid_urls = [u for u in urls if re.match(pattern, u)]
print(valid_urls)  
# ['http://example.com', 'https://my-site.com/path/to/page?query=1']

Tip: For URL validation, carefully handle optional paths and query parameters, use anchors to match the full string, and disallow spaces. Non-capturing groups (?: ... ) are useful for grouping without capturing.
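
One edge case the pattern above does not cover is subdomains: it allows only a single label before the TLD, so https://www.example.com is rejected. The sketch below is a possible variant that repeats the label group; it goes beyond the stated rules, and the names URL_RE and is_valid_url are hypothetical.

```python
import re

# (?:[a-zA-Z0-9-]+\.)+ allows one or more dot-terminated labels before the TLD.
URL_RE = re.compile(r'https?://(?:[a-zA-Z0-9-]+\.)+[a-zA-Z]+(?:/\S*)?')

def is_valid_url(url: str) -> bool:
    """Hypothetical validator; fullmatch replaces the ^ and $ anchors."""
    return URL_RE.fullmatch(url) is not None

original = r'^https?://[a-zA-Z0-9-]+\.[a-zA-Z]+(?:/\S*)?$'

print(bool(re.match(original, 'https://www.example.com')))  # False
print(is_valid_url('https://www.example.com/docs?page=2'))  # True
print(is_valid_url('http://example.com'))                   # True
print(is_valid_url('ftp://invalid.com'))                    # False
print(is_valid_url('https://space inurl.com'))              # False
```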

Function	Description
`re.search()`	Finds the first occurrence of the pattern anywhere in the string.
`re.match()`	Checks whether the pattern matches from the beginning of the string.
`re.findall()`	Returns a list of all non-overlapping matches.
`re.finditer()`	Returns an iterator containing match objects for all matches.
`re.sub()`	Replaces matched patterns with the specified replacement text.
`re.split()`	Splits a string based on the given regex pattern.
`re.compile()`	Precompiles a pattern for better performance when used multiple times.
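
The functions in this table that the exercises do not cover can be sketched together. The date strings below are made-up sample data.

```python
import re

text = "2024-01-15, 2023-12-31"

# re.compile: build the pattern once and reuse it.
date = re.compile(r'(\d{4})-(\d{2})-(\d{2})')

# finditer: one match object per occurrence, with position information.
found = [(m.group(0), m.start()) for m in date.finditer(text)]

# sub: \1, \2, \3 in the replacement refer to the captured groups.
swapped = date.sub(r'\3/\2/\1', text)

# split: split on a comma followed by optional whitespace.
parts = re.split(r',\s*', text)

print(found)    # [('2024-01-15', 0), ('2023-12-31', 12)]
print(swapped)  # 15/01/2024, 31/12/2023
print(parts)    # ['2024-01-15', '2023-12-31']
```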

Pattern	Meaning
`.`	Matches any single character except newline.
`\d`	Matches any digit (0–9).
`\w`	Matches any word character (letters, digits, underscore).
`\s`	Matches any whitespace character.
`^`	Matches the start of a string.
`$`	Matches the end of a string.
`*`	Matches 0 or more repetitions.
`+`	Matches 1 or more repetitions.
`?`	Matches 0 or 1 repetition.
`{m,n}`	Matches the previous pattern between m and n times.
`[]`	Defines a character set.
`|`	Acts as an OR operator.
`()`	Groups patterns together.

Metacharacter	Meaning	Example
`.`	Matches any character except newline	`a.c` matches abc, a1c, a-c
`^`	Matches the start of a string	`^Hello` matches any string starting with “Hello”
`$`	Matches the end of a string	`world$` matches any string ending with “world”
`*`	Matches 0 or more repetitions	`go*` matches g, go, goo, ...
`+`	Matches 1 or more repetitions	`go+` matches go, goo...
`?`	Matches 0 or 1 repetition (optional)	`colou?r` matches color and colour
`[]`	Character set	`[abc]` matches a, b, or c
`()`	Grouping or capturing	`(abc)+` matches repeated “abc”
`|`	Alternation (OR)	`cat|dog` matches cat or dog
`\`	Escape special characters	`\.` matches a literal dot (.)

Function	Description	Example
`re.search()`	Searches for the first occurrence of a pattern	`import re; text = "Welcome to Python"; result = re.search("Python", text); print(result)`
`re.findall()`	Returns all matches as a list	`import re; text = "cat, dog, cat, tiger"; matches = re.findall("cat", text); print(matches)  # ['cat', 'cat']`
`re.match()`	Checks only the beginning of a string	`import re; text = "Python is fun"; result = re.match("Python", text); print(result)`
`re.split()`	Splits a string by the matched pattern	`import re; text = "one-two-three"; parts = re.split("-", text); print(parts)`
`re.sub()`	Replaces pattern occurrences with another string	`import re; text = "blue sky, blue ocean"; result = re.sub("blue", "green", text); print(result)`

Pattern	Description	Example Match
`\d`	Matches any digit (0–9)	“7” in `"Grade7"`
`\D`	Matches any non-digit	“G” in `"G7"`
`\w`	Matches letters, digits, and underscore	“A”, “_”, or “1” in `"A_10"`
`\W`	Matches any non-word character (anything other than letters, digits, and underscore)	“#” in `"A#1"`
`\s`	Matches whitespace (space, tab, newline)	space in `"Hello World"`
`\S`	Matches any non-whitespace	“H” in `"Hello"`
`\b`	Word boundary	Match before “Python” in `"Python code"`
`\B`	Not a word boundary	Middle of a word

Anchor	Meaning	Example Use
`^`	Matches the start of the string	Check if text begins with a word
`$`	Matches the end of the string	Ensure a string ends with digits
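
The two example uses in the anchor table can be sketched as follows. The function names and sample strings are hypothetical.

```python
import re

def starts_with_capital(s: str) -> bool:
    """True if the text begins with an uppercase ASCII letter."""
    return re.search(r'^[A-Z]', s) is not None

def ends_with_three_digits(s: str) -> bool:
    """True if the text ends with (at least) three digits."""
    return re.search(r'\d{3}$', s) is not None

print(starts_with_capital('Python'))       # True
print(starts_with_capital('python'))       # False
print(ends_with_three_digits('order123'))  # True
print(ends_with_three_digits('order12'))   # False
```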
