  Sample Script - SS_WebPageToText

( Some sample scripts may not be reproduced correctly in HTML because those scripts, especially web-related ones, contain HTML tags such as < tr > in their code.

For an accurate text version of this script that can be copied and pasted, see SS_WebPageToText.txt . )



#####################################################################
# SCRIPT: SS_WebPageToText.txt
#
# This script reads a web page and creates a corresponding plain text
# version. The plain text version thus created can then be stored in
# a local file and used for spell-checking, excerpting, review by
# legal department, inclusion into legal documents, inclusion into
# requirements/specifications documents, keeping the page lengths within limits, or other
# purposes.
#
# The name of the web page is passed as input argument/FVA $page to this script.
# It can be either a web page or a local file and has one of the following forms -
# "http://www.xxx.yyy/.../zzz.html", "C:/.../file.html" . We are using the extension
# of .html as an example only, the script will accept any extension such as .asp, .php, etc.
#
# Download this script into directory C:/Scripts to a file named SS_WebPageToText.txt.
# Then call it as below.
#
# script "C:/Scripts/SS_WebPageToText.txt" page("http://www.xxx.yyy/.../zzz.mmm")
#
# The above will produce text output on screen. If you want to store the output in a file,
# simply redirect the script output to a local file, as below -
#
# script "C:/Scripts/SS_WebPageToText.txt" page("http://www.xxx.yyy/.../zzz.mmm") > "C:/page.txt"
#
# The script can be edited to meet your requirements more precisely.
#
# IMPORTANT: As a sample of producing debugging messages, we have left the debugging
# echo calls in this script. The -e option in these echo commands indicates that their
# output is written to the standard error stream. To suppress these debugging statements,
# simply add the following when calling this script -
# 2>null
# Alternatively, you can remove the debugging echo calls.
#
# If you don't have biterscripting, you can download it from biterscripting.com .
#
#####################################################################

var str page # name of input page or file

# Read the file contents into a variable web_version.
# We will create the text version in variable text_version.
var str web_version, text_version
echo -e "DEBUG Reading page " $page
cat $page > $web_version

# Remove all <...> tags. To do this, we will use the script SS_RemoveTags.
echo -e "DEBUG Removing <...> Tags"
script SS_RemoveTags.txt input($web_version) start_tag("<") end_tag(">") > $web_version

# Remove all {...} tags.
echo -e "DEBUG Removing {...} Tags"
script SS_RemoveTags.txt input($web_version) start_tag("{") end_tag("}") > $web_version

# Remove &...; formatting tags
echo -e "DEBUG Removing &...; Tags"
script SS_RemoveTags.txt input($web_version) start_tag("&") end_tag(";") > $web_version

# Remove all extra spaces, tabs, etc. We will replace them with one space.
# We will use a temporary str variable for collecting intermediate output.
echo -e "DEBUG Removing extra formatting characters"
var str temp_str
while ( { sen -r "^,^" $web_version } > 0 )
do
    stex -r "]^,^" $web_version > $temp_str
    set $text_version = $text_version + $temp_str + " "
    stex -r "^;^]" $web_version > null # We will discard this output.
done

# There may be something left in $web_version
set $text_version = $text_version + $web_version

# For easy reading, we will cut lines at every 12 words. (You can change this number of words,
# pass the number of words through an input variable, or format it entirely differently.)
# We will create the formatted version in $formatted_version .
echo -e "DEBUG Inserting line breaks for easy reading"
var str formatted_version
while ( { wen $text_version } > 12 )
do
    wex "12]" $text_version > $temp_str
    set $formatted_version = $formatted_version + $temp_str + "\n"
done

# There may be something left in $text_version
set $formatted_version = $formatted_version + $text_version

# Write out the formatted version.
echo $formatted_version
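
For readers who want to compare the steps above with another language, the following is a rough Python sketch of the same pipeline: remove <...>, {...} and &...; spans, collapse runs of whitespace into single spaces, and cut lines at every 12 words. It is not a port of SS_RemoveTags.txt, whose code is not shown on this page. The function names strip_delimited, page_to_text and read_page are hypothetical, and replacing each removed span with a space (so that neighboring words do not run together) is an assumption of this sketch.

```python
import re
import urllib.request


def strip_delimited(text, start, end):
    """Replace every start...end span (shortest match) with a single space."""
    pattern = re.escape(start) + r".*?" + re.escape(end)
    return re.sub(pattern, " ", text, flags=re.DOTALL)


def page_to_text(source, words_per_line=12):
    """Turn page source into plain text, words_per_line words per line."""
    stripped = source
    # Same order as the script above: <...>, then {...}, then &...; spans.
    for start, end in (("<", ">"), ("{", "}"), ("&", ";")):
        stripped = strip_delimited(stripped, start, end)
    # split() with no arguments collapses spaces, tabs and newlines.
    words = stripped.split()
    lines = [
        " ".join(words[i:i + words_per_line])
        for i in range(0, len(words), words_per_line)
    ]
    return "\n".join(lines)


def read_page(page):
    """Read a page from an http(s) URL or from a local file path."""
    if page.startswith(("http://", "https://")):
        with urllib.request.urlopen(page) as response:
            return response.read().decode("utf-8", errors="replace")
    with open(page, encoding="utf-8", errors="replace") as handle:
        return handle.read()


# Small offline demonstration on an inline sample (no network access needed).
sample = "<html><body><h1>Hello</h1><p>plain&nbsp;text {x} here</p></body></html>"
print(page_to_text(sample))
```

To convert a real page, call page_to_text(read_page(page)) with a URL or local file path of your own. As with the script above, the &...; step is deliberately simple: a stray & followed later by a ; will remove everything between them.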

2008-2014, biterScripting.com. All rights reserved.
biterScripting, biterScript, biterBrowser, biterMobile, biterScripting.com, FVA (Forward Variable Assignment) are trademarks of biterScripting.com. Is it biterScripting-compatible ? is a service mark of biterScripting.com. Explorer, Unix, Windows are trademarks, service marks or other forms of intellectual property of their respective owners.