Ruby
Regular Expression with Capture Groups
Note: Chilkat uses PCRE2. See PCRE2 Regular Expressions
Also see: PCRE2 Performance
Demonstrates the following PCRE2 regular expression (see the sample code below):
Name:\s+(\w+)\s+(\w+),\s+Email:\s+(\S+)
The expression is applied to this string:
Name: John Smith, Email: john.smith@example.com
Regex Components Explained
| Part | Meaning | Matched Text |
|---|---|---|
| "Name:" | Matches the literal text "Name:" | "Name:" |
| "\s+" | Matches one or more whitespace characters (spaces, tabs, etc.) | (space) |
| "(\w+)" | Capture Group 1: One or more word characters ("a-zA-Z0-9_") | "John" |
| "\s+" | More whitespace | (space) |
| "(\w+)" | Capture Group 2: Another word (the last name) | "Smith" |
| "," | A literal comma | "," |
| "\s+" | Whitespace again | (space) |
| "Email:" | Matches the literal "Email:" | "Email:" |
| "\s+" | Whitespace | (space) |
| "(\S+)" | Capture Group 3: One or more non-whitespace characters | "john.smith@example.com" |
Matches for the Example String
String:
"Name: John Smith, Email: john.smith@example.com"
Regex Match Groups:
| Group | Captured Value |
|---|---|
| Group 1 | "John" |
| Group 2 | "Smith" |
| Group 3 | "john.smith@example.com" |
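As a quick cross-check of the tables above, the same pattern can be tried with Ruby's built-in Regexp class. This is only a sketch under an assumption: Ruby's own engine (Onigmo) is not PCRE2 and is not part of Chilkat, although for a simple ASCII pattern like this one the two should agree. The helper name `parse_contact` is hypothetical and exists only for this illustration.

```ruby
# Sketch using Ruby's built-in Regexp (Onigmo engine), NOT Chilkat/PCRE2.
# parse_contact is a hypothetical helper name used only for this example.
def parse_contact(text)
  m = /Name:\s+(\w+)\s+(\w+),\s+Email:\s+(\S+)/.match(text)
  return nil unless m

  # m[0] is the entire match; m[1]..m[3] are the three capture groups.
  { whole: m[0], first: m[1], last: m[2], email: m[3] }
end

result = parse_contact("Name: John Smith, Email: john.smith@example.com")
if result
  puts "Group 1: #{result[:first]}"
  puts "Group 2: #{result[:last]}"
  puts "Group 3: #{result[:email]}"
else
  puts "No match"
end
```

Returning nil when nothing matches keeps the "no match" case separate from a match whose groups happen to be short.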
Notes on Character Classes
\w matches [a-zA-Z0-9_], so it does not include punctuation such as a period.
\S matches any non-whitespace character, so it is a good fit for capturing an email address.
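The difference between the two character classes can be seen by applying each one to the email address on its own. This is a minimal sketch that assumes Ruby's built-in Regexp rather than Chilkat's PCRE2 engine; the variable names are illustrative only.

```ruby
# Sketch using Ruby's built-in Regexp, NOT Chilkat/PCRE2.
email = "john.smith@example.com"

# \w+ stops at the first character outside [a-zA-Z0-9_], here the period.
word_match = email[/\w+/]

# \S+ keeps going until whitespace or the end of the string.
nonspace_match = email[/\S+/]

puts word_match      # john
puts nonspace_match  # john.smith@example.com
```

This is why the pattern uses (\S+) rather than (\w+) for the third capture group.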
require 'chilkat'
success = false
subject = "Name: John Smith, Email: john.smith@example.com"
pattern = "Name:\\s+(\\w+)\\s+(\\w+),\\s+Email:\\s+(\\S+)"
sb = Chilkat::CkStringBuilder.new()
sb.Append(subject)
json = Chilkat::CkJsonObject.new()
json.put_EmitCompact(false)
timeoutMs = 2000
numMatches = sb.RegexMatch(pattern,json,timeoutMs)
if (numMatches < 0)
  # Probably an error in the regular expression.
  # Suggestion: Use AI to help create and/or diagnose regular expressions.
  print sb.lastErrorText() + "\n";
  exit
end
# Examine the matches:
print json.emit() + "\n";
# This is the JSON with the match information.
# See the JSON parsing code below to get the matched capture group values.
# Important: Capture group 0 always contains the entire match — that is, the portion of the input string that matches the full regular expression.
# {
#   "match": [
#     {
#       "group": [
#         {
#           "cap": "Name: John Smith, Email: john.smith@example.com",
#           "idx": 0,
#           "len": 47
#         },
#         {
#           "cap": "John",
#           "idx": 6,
#           "len": 4
#         },
#         {
#           "cap": "Smith",
#           "idx": 11,
#           "len": 5
#         },
#         {
#           "cap": "john.smith@example.com",
#           "idx": 25,
#           "len": 22
#         }
#       ]
#     }
#   ]
# }
i = 0
matchCount = json.SizeOfArray("match")
while i < matchCount
  print "Match " + (i + 1).to_s() + ":" + "\n";
  json.put_I(i)
  j = 0
  numCaptureGroups = json.SizeOfArray("match[i].group")
  while j < numCaptureGroups
    json.put_J(j)
    cap = json.stringOf("match[i].group[j].cap")
    print j.to_s() + ": " + cap + "\n";
    j = j + 1
  end
  i = i + 1
end
# Capture group 0 always contains the entire match — that is, the portion of the input string that matches the full regular expression.
# Output
# Match 1:
# 0: Name: John Smith, Email: john.smith@example.com
# 1: John
# 2: Smith
# 3: john.smith@example.com
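The "idx" and "len" fields in the JSON above appear to be a zero-based offset and a length into the subject string. The sketch below checks that reading against the sample output using only Ruby's standard json library. It does not call Chilkat: the JSON text is copied by hand from the sample above, and treating "idx" and "len" as character offsets is an assumption that holds for this ASCII input.

```ruby
require 'json'

subject = "Name: John Smith, Email: john.smith@example.com"

# JSON copied from the sample output above (not produced by this script).
match_json = <<~EOS
  {
    "match": [
      {
        "group": [
          { "cap": "Name: John Smith, Email: john.smith@example.com", "idx": 0, "len": 47 },
          { "cap": "John", "idx": 6, "len": 4 },
          { "cap": "Smith", "idx": 11, "len": 5 },
          { "cap": "john.smith@example.com", "idx": 25, "len": 22 }
        ]
      }
    ]
  }
EOS

groups = JSON.parse(match_json)["match"][0]["group"]

# Assumption: idx/len index into the subject string (true for this ASCII sample).
slices = groups.map { |g| subject[g["idx"], g["len"]] }

groups.each_with_index do |g, n|
  puts "#{n}: #{g['cap']} (slice agrees: #{slices[n] == g['cap']})"
end
```

For input containing multi-byte characters, the offsets may need different handling, so the slice check is worth repeating on real data before relying on it.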