System Feature
RE
Purpose
Regular Expression
Aliases
None.
Syntax
<fragment> [ <fragment> [ <fragment> ... ] ]
A regular expression consists of one or more <fragment>s. A <fragment> is also called
an REF (Regular Expression Fragment). Each <fragment> has the following syntax.
[ . | , | @ | : | ; | & | <string> | <character_set> ]
Options
None.
Arguments
. The period (.). Matches exactly one printable character.
, The comma (,). Matches exactly one non-printable character. Space, tab, newline,
and carriage return are examples of non-printable characters.
@ The at symbol (@). Matches exactly one character of any kind (printable or non-printable).
: The colon (:). Matches any number (including 0) of printable characters.
; The semicolon (;). Matches any number (including 0) of non-printable characters.
& The ampersand symbol (&). Matches any number (including 0) of characters of any
kind (printable or non-printable).
<string>
The exact string to match. This is an exact sequence of exact individual characters.
<character_set>
A set of characters from which exactly one character will be matched. This has the following syntax.
( [#] <character> | <character_range> [ <character> | <character_range> [ <character> | <character_range> ... ]] )
If the leading sharp symbol (#) is not present, exactly one character in the
<character_set> will be matched. If the leading sharp symbol (#) is present,
exactly one character NOT in the <character_set> will be matched.
The opening and closing parentheses are required around a <character_set>. Also,
no commas or spaces should be used within a <character_set> unless they are part
of the <character_set>.
If a sharp symbol (#) is present but is not the character immediately
following the opening parenthesis, it is treated as just another
character in the <character_set>.
<character> An exact character.
<character_range> A range of characters. This has the following syntax.
<character1>><character2>
<character1> An exact character which starts the <character_range>.
<character2> An exact character which ends the <character_range>.
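Note that, in the syntax above, the second > character (the one between <character1>
and <character2>) is what indicates the range. For example, a>z is the range from a to z.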
If a <string> or <character_set> contains a backslash (\), a double quote ("),
a caret (^), an opening square bracket ([), a closing square bracket (]),
a period (.), a comma (,), an at symbol (@),
a colon (:), a semicolon (;), an ampersand symbol (&),
an opening parenthesis ((), a closing parenthesis ()),
a sharp symbol (#), a greater than sign (>), or a formatting character,
escape it with a backslash, such as
\\, \^, \[, \], \", \., \,, \@, \:, \;, \&, \(, \), \#, \>, etc.
See the help page on escape for more details.
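The fragment rules above can be sketched as a small translator from this syntax into a Python regular expression. This is only an illustration of the rules as described on this page, not the actual implementation. The names translate, translate_set and read_char are hypothetical. The sketch assumes that "printable" means the visible ASCII characters ! through ~, that only \t, \n and \r are formatting escapes, and that an empty set without # is an error; none of these is confirmed by this page. It handles only the fragments themselves, not the surrounding ^ delimiters or instance markers used by the string editor commands, and Python's greedy * may not match the behavior of : ; and &.

```python
import re

# Assumption: "printable" means visible ASCII, '!' through '~'. The page only
# says that space, tab, newline and carriage return are non-printable.
PRINTABLE = "!-~"

# The six wildcard fragments and a rough Python equivalent for each.
WILDCARDS = {
    ".": f"[{PRINTABLE}]",     # exactly one printable character
    ",": f"[^{PRINTABLE}]",    # exactly one non-printable character
    "@": r"[\s\S]",            # exactly one character of any kind
    ":": f"[{PRINTABLE}]*",    # zero or more printable characters
    ";": f"[^{PRINTABLE}]*",   # zero or more non-printable characters
    "&": r"[\s\S]*",           # zero or more characters of any kind
}

# Assumption: only these formatting escapes are recognized.
FORMATTING = {"t": "\t", "n": "\n", "r": "\r"}


def read_char(src, i):
    """Return (character, next_index, was_escaped) for the item at src[i]."""
    if src[i] == "\\" and i + 1 < len(src):
        nxt = src[i + 1]
        return FORMATTING.get(nxt, nxt), i + 2, True
    return src[i], i + 1, False


def translate_set(src, i):
    """Translate a <character_set> whose '(' is at src[i]; return (regex, next_index)."""
    i += 1  # skip the opening parenthesis
    # '#' negates the set only when it immediately follows '('.
    negated = i < len(src) and src[i] == "#"
    if negated:
        i += 1
    members = []
    while True:
        if i >= len(src):
            raise ValueError("unterminated <character_set>")
        ch, i, escaped = read_char(src, i)
        if ch == ")" and not escaped:
            break
        # An unescaped '>' after a character introduces a range such as a>z.
        if i + 1 < len(src) and src[i] == ">" and src[i + 1] != ")":
            end, i, _ = read_char(src, i + 1)
            members.append(re.escape(ch) + "-" + re.escape(end))
        else:
            members.append(re.escape(ch))
    body = "".join(members)
    if not body:
        if negated:
            return r"[\s\S]", i  # "(#)": nothing is excluded, so any character matches
        raise ValueError("empty <character_set>")  # assumption, not stated on this page
    return "[" + ("^" if negated else "") + body + "]", i


def translate(expr):
    """Translate a sequence of fragments into a Python regular expression string."""
    out = []
    i = 0
    while i < len(expr):
        c = expr[i]
        if c in WILDCARDS:
            out.append(WILDCARDS[c])
            i += 1
        elif c == "(":
            piece, i = translate_set(expr, i)
            out.append(piece)
        else:
            # Part of a <string>: a literal or backslash-escaped character.
            ch, i, _ = read_char(expr, i)
            out.append(re.escape(ch))
    return "".join(out)
```

Under these assumptions, translate("(a>zA>Z)") produces the Python character class [a-zA-Z], and the result of translate(r"a\.b") matches only the literal text a.b, because the escaped period is no longer a wildcard.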
Stream Input
Not applicable.
Stream Output
Not applicable.
Stream Error
Not applicable. Any errors are listed in the Stream Error of the command of which
this regular expression is a part.
Description
Regular expressions provide a way to search and manipulate strings using complex
search criteria. They can be used with any string editor command. The string editor command
must include the -r option, so that the command treats the search string as a regular
expression.
Restrictions
Valid Examples
var str str
# Assign $str
...
sen -r "^.^" $str
Will output the number of printable characters in $str.
sen -r "^@^" $str
Will output the number of all characters (printable and non-printable) in $str.
stex -r "[^webpage statistics^" $str
The above will find the first instance of the exact string "webpage statistics" in $str and extract
the part of $str beginning with (and including) that instance.
stex -c -r "[^webpage statistics^" $str
This will extract the corresponding part of $str, but it will find the instance of the string "webpage statistics"
irrespective of case. So, Webpage statistics, WEBPAGE STATISTICS, etc. will all match.
sin -r "^:^" "BEGIN PRINTABLE PORTION" $str
The above will insert "BEGIN PRINTABLE PORTION" before the first printable portion within $str.
sap -r "^:^" "END PRINTABLE PORTION" $str
The above will append "END PRINTABLE PORTION" after the first printable portion within $str.
sap -r "^:^l" "END PRINTABLE PORTION" $str
The above will append "END PRINTABLE PORTION" after the last printable portion within $str.
sen -r -c "^(aeiou)^" $str
The above will output the number of vowels (lower and upper case) in $str.
(a>z)
This is a <character_set>. It will match exactly one lower case alphabetic character.
(a>zA>Z)
This is also a <character_set>. It will match exactly one lower case or upper case
alphabetic character.
while ( { sen -r "^( \t\n\r)^" $str } > 0 )
sex -r "^( \t\n\r)^" $str >null
The above will remove all formatting characters (space, tab, newline, carriage return) from $str.
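For readers who know Python better, the effect of the loop above can be approximated as follows. This is a rough sketch of the described result, not of how the commands work internally. The names count_formatting and strip_formatting are hypothetical, and the Python character class is assumed to be equivalent to the ( \t\n\r) <character_set>.

```python
import re

# Assumed equivalent of the "( \t\n\r)" <character_set>:
# space, tab, newline, carriage return.
FORMATTING_CHARS = re.compile(r"[ \t\n\r]")


def count_formatting(text):
    """Count the formatting characters, like the sen call in the loop condition."""
    return len(FORMATTING_CHARS.findall(text))


def strip_formatting(text):
    """Remove every formatting character in one pass instead of one per loop turn."""
    return FORMATTING_CHARS.sub("", text)
```

The script version removes one match per iteration and stops when the count reaches 0. The Python sketch removes all matches in a single substitution.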
Invalid Examples
(##)
This is not quite an invalid example. The first # negates the set and the second # is its only member,
so this will match one character which is not #.
customer (#)
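is invalid if the # is meant literally. Because the # immediately follows the opening
parenthesis, it is read as the NOT marker, so the expression will match "customer "
followed by any character. If the # is meant to be matched as a character, since it is
by itself, use it outside the parentheses, as follows: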
customer #
The above will match "customer #".
See Also
escape
sen
stex
sin
sap
sal