TrashPanda Wiki

A RegEx, or Regular Expression, is a sequence of characters that forms a search pattern.

It is mainly useful for two tasks:

  1. Verifying that strings match a pattern (for instance, that a string has the format of an email address).
  2. Performing substitutions in a string (such as changing American spellings to British ones).
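
The two tasks above can be sketched as follows. This is only an illustration: the names EMAIL_PATTERN, looks_like_email and to_british are made up for this example, the email pattern is deliberately loose and would accept many strings that are not valid addresses, and a single word stands in for a full spelling conversion.

```python
import re

# Hypothetical, deliberately loose pattern: something@something.something
EMAIL_PATTERN = r"[^@\s]+@[^@\s]+\.[^@\s]+"


def looks_like_email(text):
    """Return True if the whole string fits the simplified email shape."""
    return re.fullmatch(EMAIL_PATTERN, text) is not None


def to_british(text):
    """Replace the whole word 'color' with 'colour' (one sample spelling only)."""
    return re.sub(r"\bcolor\b", "colour", text)


print(looks_like_email("renee@example.com"))
print(looks_like_email("not an email"))
print(to_british("My favorite color is red."))
```

Here re.fullmatch requires the entire string to fit the pattern, which is usually what a format check needs.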

Some basics

The example below checks whether the pattern "spam" matches at the start of the string and prints "Match" if it does.

import re

# Define a regular expression
pattern = r"spam"

# re.match checks whether the pattern matches at the beginning of the string.
if re.match(pattern, "spamspamspam"):
    print("Match")
else:
    print("No Match")



>>>
Match
>>>

Other functions we can use to match patterns are:

re.search: finds a match of a pattern anywhere in the string.

re.findall: returns a list of all substrings that match a pattern.

if re.match(pattern, "eggspamsausagespam"):
    print("Match")
else:
    print("No match")

if re.search(pattern, "eggspamsausagespam"):
    print("Match")
else:
    print("No Match")


print(re.findall(pattern, "eggspamsausagespam"))


>>>
No match
Match
['spam', 'spam']
>>>

re.search returns a match object (or None if nothing matches). The match object has several methods that give details about the match.

group: returns the string matched.

start/end: return the start and end positions of the first match, respectively.

span: returns the start and end positions of the first match as a tuple.

import re

pattern = r"pam"

match = re.search(pattern, "eggspamsausage")

if match:
    print(match.group())
    print(match.start())
    print(match.end())
    print(match.span())


>>>
pam
4
7
(4, 7)
>>>
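
The "if match:" check in the example above matters because re.search returns None when the pattern is not found, and None has no group or span methods. A minimal sketch of handling both cases, using a helper name (describe_match) invented for this illustration:

```python
import re


def describe_match(pattern, text):
    """Return (matched text, span) for the first match, or None if there is none."""
    match = re.search(pattern, text)
    if match is None:
        return None
    start, end = match.span()
    # Slicing the original string with the span gives the same text as match.group().
    return text[start:end], (start, end)


print(describe_match(r"pam", "eggspamsausage"))
print(describe_match(r"ham", "eggspamsausage"))
```

The first call prints ('pam', (4, 7)) and the second prints None.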

Search and Replace

One of the most important functions in the re module is sub, which replaces matches of a pattern in a string.

re.sub(<pattern>, <repl>, <string>, count=0)

pattern = the pattern you are matching.

repl = the text that replaces each match.

string = the string you are running the substitution on.

count = the maximum number of replacements to make; the default, 0, replaces every match.

For example:

import re

my_string = "My name is Renee. Hi Renee."
pattern = r"Renee"
new_string = re.sub(pattern, "Daniel", my_string)
print(new_string)


>>>
My name is Daniel. Hi Daniel.
>>>
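
The optional fourth argument of re.sub, named count in Python 3's re module, caps how many matches are replaced. A short sketch reusing the string from the example above (the variable names first_only and all_matches are chosen just for this illustration):

```python
import re

my_string = "My name is Renee. Hi Renee."

# count=1 replaces only the first occurrence, scanning left to right.
first_only = re.sub(r"Renee", "Daniel", my_string, count=1)
print(first_only)

# Leaving count at its default of 0 replaces every occurrence.
all_matches = re.sub(r"Renee", "Daniel", my_string)
print(all_matches)
```

The first line printed is "My name is Daniel. Hi Renee." and the second is "My name is Daniel. Hi Daniel."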