Editor - Word extractor
wordextractor, wordext, wex, awk
wex [ <options> ] " [<start_bounder>] <n> [<end_bounder>] " <input_string>
-p Preserve the input string. Without this option, when
a part of the input string (called the extraction target,
or just target) is extracted, that part is removed from
the input string. (This is done so that each subsequent
extract command will produce subsequent words.) With this
option, the input string is left unchanged.
-e Count and return empty words. This option is useful if you
have fields separated by tabs, commas, etc. on a line, and
you want to receive empty fields.
-c Case insensitive. Case will be ignored when considering separator characters.
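As a minimal sketch of how the -p option changes the effect on a variable, consider the commands below. The variable name sentence and its contents are made up for illustration, and we assume the default $wsep treats a space as a word separator.

```biterscript
# Hypothetical variable and contents, for illustration only.
var str sentence
set $sentence = "red green blue"

# With -p, the first word is written to stream output
# and $sentence is left unchanged.
wex -p "1" $sentence

# Without -p, the first word is written to stream output and
# removed from $sentence, so repeating the same command
# returns the next word each time.
wex "1" $sentence
wex "1" $sentence
```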
<input_string> The input string on which this command will operate. It
can be specified as a str constant, a str variable, or an
expression resulting in a str value.
If a str constant is used, we highly recommend enclosing it
in double quotes, such as "John Doe".
Without the double quotes, the spaces in the input string
will produce errors. For a str constant or a str
expression, the -p option is assumed.
<n> The instance number. The input string will be searched for this
instance of the target. Instances are counted from 1. <n> must be either
a number higher than 0 or the letter l (which indicates the last instance).
<start_bounder>, <end_bounder> Each of these arguments can be absent, the character [, or the character ].
The <start_bounder> appears before the <n>. The <end_bounder>
appears after the <n>.
We will now explain the role of these bounders with an example.
We will assume that we are looking for the fifth word.
"5" Extract only the fifth word.
"5[" Extract everything after, but excluding, the fifth word.
"5]" Extract everything up to and including the fifth word.
"[5" Extract everything beginning with and including the fifth word.
"[5[" This combination is INVALID.
"[5]" Extract only the fifth word. This is the same as "5".
"]5" Extract everything up to, but excluding, the fifth word.
"]5[" Extract everything except the fifth word.
"]5]" This combination is INVALID.
The quotes in the command syntax are required. Without the double quotes, an error
or erroneous output may be produced.
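The bounder combinations above can be tried side by side with a sketch like the following. The variable name sentence and its seven-word value are hypothetical, and we assume the default $wsep treats a space as a word separator. The -p option is used so that every command sees the same input.

```biterscript
# Hypothetical variable and contents, for illustration only.
var str sentence
set $sentence = "one two three four five six seven"

wex -p "5" $sentence     # Only the fifth word.
wex -p "]5" $sentence    # Everything up to, but excluding, the fifth word.
wex -p "5]" $sentence    # Everything up to and including the fifth word.
wex -p "[5" $sentence    # Everything beginning with and including the fifth word.
wex -p "5[" $sentence    # Everything after, but excluding, the fifth word.
wex -p "]5[" $sentence   # Everything except the fifth word.
wex -p "l" $sentence     # Only the last word.
```

Note that each specification is enclosed in double quotes, as the syntax requires.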
Stream input is ignored.
The extracted content is added to stream output.
Any errors are written to the stream error.
The command extracts the target word(s) from the input string and writes them to
the stream output or the redirected output target. If <input_string> is a constant or an expression,
it remains unchanged. If <input_string> is a variable and the -p option is not
specified, the target is removed from <input_string>. If <input_string>
is a variable and the -p option is specified, <input_string> remains unchanged.
The following system variable plays an important role.
$wsep Word separator
This variable is used to identify, number, and extract distinct words.
See the 'systemvar' help topic for its description.
You can change its value to control how words are identified and counted.
We highly recommend that, if you change any system variable's value, you restore it
after the search is complete, as many system variables are often
used by more than one command.
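A minimal sketch of this save, change, and restore pattern on a single comma-separated line follows. The variable names csv_line and saved_wsep, and the sample data, are hypothetical.

```biterscript
# Hypothetical variables and data, for illustration only.
var str saved_wsep
var str csv_line
set $csv_line = "Doe,John,,Boston"

# Save the original $wsep, then treat the comma as the word separator.
set $saved_wsep = $wsep
set $wsep = ","

# With -e, the empty field between the two adjacent commas counts as a word,
# so instance 3 refers to that empty field.
wex -p -e "3" $csv_line

# Restore the original $wsep for any commands that follow.
set $wsep = $saved_wsep
```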
The command CAN ALSO BE USED WITH FILES. Simply read in the contents
of the file using the repro command into a str variable. Perform editing operations on
that variable. Then, write the variable back to the file.
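The read, edit, and write-back steps can be sketched as follows. The file name myfile.txt and the variable name content are placeholders. Writing the variable back by redirecting echo to a file name is an assumption of this sketch, so please check the 'echo' help topic before relying on it.

```biterscript
# Hypothetical file and variable names, for illustration only.
var str content

# Read the whole file into a str variable.
repro myfile.txt > $content

# Remove the first word from $content (no -p option).
# The removed word is written to stream output.
wex "1" $content

# Write the edited content back to the file.
# ASSUMPTION: echo output can be redirected to a file name in this way.
echo $content > "myfile.txt"
```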
If the <input_string> is specified as a constant or as a str expression,
the presence or absence of option -p is ignored. A constant never changes its value.
We have records in a file myfile.txt. Each record is on a new line. Each record has
fields separated by commas. The following code will produce a tabular listing
of the record-set.
# We will set $wsep to comma, since the fields are comma-separated.
# Save the original $wsep.
var str saved_wsep
set $saved_wsep = $wsep
# Set $wsep to comma.
set $wsep = ","
# Get the record-set in string format.
var str record_set
repro myfile.txt > $record_set
while ($record_set <> "")
do
    # Get the next record. Each record is on a new line, so we will use lex.
    # We will use the -e option, since we want empty records.
    var str record
    lex -e "1" $record_set > $record
    # If this record is empty, we will just write a newline.
    if ($record == "")
        echo # echo always writes a newline
    else
    do
        # We will collect our output line into a variable, field by field.
        var str line
        while ($record <> "")
        do
            # Process fields one by one. Each field is in a new word, so we will use wex.
            # We will use the -e option, since we want empty fields.
            var str field
            wex -e "1" $record > $field
            # We have a field. Add it to line. We will separate fields with tabs in our output.
            set $line = $line + $field + "\t"
        done
        # We are done with this record. Echo the line.
        echo $line
        # Empty the line for the next record.
        set $line = ""
    done
    endif
done
# Restore original $wsep.
set $wsep = $saved_wsep
var int i
wex "3" $i # Get the third word.
This will produce an error because $i is of type int, not str.
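For contrast with the invalid example above, a minimal sketch of a valid call declares the input variable as str. The variable name s and its contents are hypothetical, and we assume the default $wsep treats a space as a word separator.

```biterscript
# Hypothetical variable and contents, for illustration only.
var str s
set $s = "one two three four"
wex "3" $s # Get the third word. $s is of type str, so no type error is expected.
```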
© 2008-2013, biterScripting.com. All rights reserved.