Title: | Match Regular Expressions with a Nicer 'API' |
---|---|
Description: | A small wrapper on 'regexpr' to extract the matches and captured groups from the match of a regular expression to a character vector. |
Authors: | Gabor Csardi |
Maintainer: | Gabor Csardi <[email protected]> |
License: | MIT + file LICENSE |
Version: | 2.0.0.9000 |
Built: | 2024-12-01 06:10:55 UTC |
Source: | https://github.com/gaborcsardi/rematch |
This function is a small wrapper on the regexpr
base R function, to provide an API that is easier to use.
re_match(pattern, text, ...)
pattern | Regular expression, by default a PCRE expression. See |
text | Character vector. |
... | Additional arguments to pass to |
Currently only the first occurrence of the pattern is used.
A character matrix of the matched (sub)strings.
The first column is always the full match. This column is
named .match. The rest of the columns are capture groups,
with appropriate column names, if the groups are named.
dates <- c("2016-04-20", "1977-08-08", "not a date", "2016", "76-03-02",
           "2012-06-30", "2015-01-21 19:58")
isodate <- "([0-9]{4})-([0-1][0-9])-([0-3][0-9])"
re_match(text = dates, pattern = isodate)

# The same with named groups
isodaten <- "(?<year>[0-9]{4})-(?<month>[0-1][0-9])-(?<day>[0-3][0-9])"
re_match(text = dates, pattern = isodaten)
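To make the "wrapper on regexpr" idea concrete, the following is a minimal base R sketch of how the full match and the capture groups can be pulled out of a regexpr result. The helper name sketch_re_match is hypothetical and the code is an illustration under that assumption, not the package's actual implementation; it only relies on the match.length, capture.start, capture.length and capture.names attributes that regexpr returns when perl = TRUE.

```r
# Hypothetical helper for illustration only; not the package's own code.
sketch_re_match <- function(pattern, text, ...) {
  text <- as.character(text)
  m <- regexpr(pattern, text, perl = TRUE, ...)

  # Full match: start position plus match length; -1 marks "no match".
  start <- as.vector(m)
  len <- attr(m, "match.length")
  full <- substring(text, start, start + len - 1L)
  full[start == -1L] <- NA_character_
  res <- cbind(.match = full)

  # Capture groups are reported as matrices, one column per group.
  cs <- attr(m, "capture.start")
  if (!is.null(cs)) {
    cl <- attr(m, "capture.length")
    groups <- substring(text, cs, cs + cl - 1L)
    groups[cs == -1L] <- NA_character_
    dim(groups) <- dim(cs)
    colnames(groups) <- attr(m, "capture.names")  # "" for unnamed groups
    res <- cbind(res, groups)
  }
  res
}

dates <- c("2016-04-20", "1977-08-08", "not a date", "2016", "76-03-02",
           "2012-06-30", "2015-01-21 19:58")
isodaten <- "(?<year>[0-9]{4})-(?<month>[0-1][0-9])-(?<day>[0-3][0-9])"
res <- sketch_re_match(isodaten, dates)
res[, "year"]
```

Strings without a match yield NA in every column of this sketch, so the result always has one row per input string.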
This function is a thin wrapper on the gregexpr
base R function, to provide an API that is easier to use. It is
similar to re_match, but extracts all matches, including
potentially named capture groups.
re_match_all(pattern, text, ...)
pattern | Regular expression, by default a PCRE expression. See |
text | Character vector. |
... | Additional arguments to pass to |
A list of character matrices. Each list element contains the
matches of one string in the input character vector. Each matrix
has a .match column that contains the matching part of the
string. Additional columns are added for capture groups. For named
capture groups, the columns are named.
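The list-of-matrices shape described above can be illustrated with a short base R sketch built on gregexpr. The helper name sketch_re_match_all and the sample input are hypothetical, and the code is an approximation for illustration rather than the package's implementation; in particular, returning a zero-row matrix for a string without matches is an assumption of this sketch.

```r
# Hypothetical helper for illustration only; not the package's own code.
sketch_re_match_all <- function(pattern, text, ...) {
  text <- as.character(text)
  matches <- gregexpr(pattern, text, perl = TRUE, ...)

  one_string <- function(str, m) {
    cap_names <- attr(m, "capture.names")
    if (m[1] == -1L) {
      # No match in this string: a matrix with zero rows (sketch assumption).
      out <- matrix(character(0), nrow = 0, ncol = 1L + length(cap_names))
    } else {
      start <- as.vector(m)
      len <- attr(m, "match.length")
      out <- cbind(substring(str, start, start + len - 1L))
      cs <- attr(m, "capture.start")
      if (!is.null(cs)) {
        cl <- attr(m, "capture.length")
        groups <- substring(str, cs, cs + cl - 1L)
        dim(groups) <- dim(cs)  # one row per match, one column per group
        out <- cbind(out, groups)
      }
    }
    colnames(out) <- c(".match", cap_names)
    out
  }

  # One matrix per input string.
  mapply(one_string, text, matches, SIMPLIFY = FALSE, USE.NAMES = FALSE)
}

pairs <- c("a=1, b=22", "nothing")
all_res <- sketch_re_match_all("(?<key>[a-z]+)=(?<val>[0-9]+)", pairs)
all_res[[1]]
```

Each matrix in this sketch has one row per match, so the number of rows differs between list elements, which is why the result is a list rather than a single matrix.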
A small wrapper on 'regexpr' to extract the matches and captured
groups from the match of a regular expression to a character vector.
See re_match.