  Sample Script - SS_SearchWeb

( Some sample scripts, especially web-related ones, may not be reproduced correctly in HTML because their code contains HTML tags such as < tr >.

For an accurate text version of this script that can be copied and pasted, see SS_SearchWeb.txt . )



#####################################################################
# SCRIPT: SS_SearchWeb
#
# This script searches the web for a specified search string or a
# Regular Expression.
#
# A seed URL is assigned using FVA (Forward Variable Assignment)
# for str variable $seedURL. The value of $seedURL is of the form
# "http://www.abc.def" or "http://www.abc.def/.../page.html" .
#
# The search string or regular expression is passed using FVA for
# str variable $search_string. Note that even though this variable is called
# search_string, you can pass a regular expression through it.
#
# The list of domains to be ignored is passed using FVA for str variable
# $ignore_domains. The format is <domain>|<domain>|<domain> ...
# Each <domain> is in the form "http://www.abc.def" .
#
# All URLs and domains passed to this script need to be in the standard format
# http://www.abc.def or http://www.abc.def/...../page.html
# The http:// part IS REQUIRED. The page name extension can be anything such as
# .html, .aspx, .js, etc.
#
# The script writes the found instances of the <search_string> to a file out.txt.
# This file then may be viewed using any text editor. You can also write the output to a
# file called out.html, and view it with a web browser, but the formatting will not be correct.
#
# NOTE: Depending on the seedURL, this script may take a long time to execute and may collect
# a very large amount of data. Writing large amounts of data to screen can slow down execution
# even further.
#
# This script calls other sample scripts as follows.
#     SS_URLs         Collects URLs from $seedURL and from URLs thus collected.
#     SS_SearchURL    Searches each URL thus found.
#
# This script can be stored and edited as necessary, in a text file
# called SS_SearchWeb.txt . The script can then be called as
#
# script SS_SearchWeb.txt seedURL("http://www.abc.def/.../page.html") search_string("Computer Training")
#
#####################################################################

var str seedURL # Name of the seed URL
var str search_string # string or regular expression to search for
var str ignore_domains # List of domains to be ignored. Domains are separated by |.

var str URLList # We collect a list of URLs found along
# the way in this variable. We repeatedly
# call script SS_SearchURL for each
# URL in this list.

var str foundURL # The URL we are currently processing.

var str processedURLList # We keep the list of processed URLs in here.
# Before processing a new URL, we always check whether
# we have already processed that URL. This way, we do not
# go into an infinite loop, because URLs do mutually
# refer to each other.

# Add the seedURL to URLList. We will start the list with
# just this one URL.
echo $seedURL >> $URLList

while ($URLList <> "")
do
# Get the next URL.
lex "1" $URLList > $foundURL

# Did we already process this URL ?
# Create dynamic argument for the sen command.

var str sen_arg
set $sen_arg = "^````"+$foundURL+"````^"

# The resulting value of $sen_arg will look like this: ^````<URL>````^
# That means we are looking for the exact URL with the "````" before and after.
# If there is any character before/after the URL which is not `, that's a different
# URL. This makes sure that, for example, http://www.abc.def does not match http://www.abc.def/xyz.

if ( { sen -c $sen_arg $processedURLList } == 0 )
do
# No, we did not process this URL.

# Output this URL.
echo $foundURL >> out.txt

# Search this URL first.
script SS_SearchURL.txt URL($foundURL) search_string($search_string) >> out.txt

# Add URLs found in this URL to our list of URLs to process.
script SS_URLs.txt URL($foundURL) ignore_domains($ignore_domains) >> $URLList

# Add this URL to processedURLList. We will add a marker "````" before and after the URL.
echo ("````"+$foundURL+"````") >> $processedURLList

done
endif

done
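
For readers who want to compare the control flow with a general-purpose language, the loop above can be sketched in Python. This is only an illustrative analogue under stated assumptions, not a translation of the biterScripting commands: the function name search_web, the fetch callable, the max_pages limit and the link-extraction regular expression are all hypothetical and are not part of SS_SearchWeb, SS_URLs or SS_SearchURL. A set takes the place of $processedURLList, so membership is an exact match and no "````" markers are needed.

```python
import re
from collections import deque
from urllib.parse import urljoin

# Assumption: links are found with a simple href pattern; SS_URLs may work differently.
LINK_RE = re.compile(r'href\s*=\s*["\']([^"\']+)["\']', re.IGNORECASE)


def search_web(seed_url, pattern, fetch, ignore_domains=(), max_pages=100):
    """Search pages reachable from seed_url for a regular expression.

    fetch is a hypothetical callable that takes a URL and returns the page
    text, or None if the page cannot be read. Returns (url, line) pairs.
    """
    regex = re.compile(pattern)
    pending = deque([seed_url])   # plays the role of $URLList
    processed = set()             # plays the role of $processedURLList
    hits = []

    while pending and len(processed) < max_pages:
        url = pending.popleft()   # take the next URL off the list
        if url in processed:      # exact match, so /xyz never matches its parent
            continue
        processed.add(url)

        text = fetch(url)
        if text is None:
            continue

        # Search this URL first.
        for line in text.splitlines():
            if regex.search(line):
                hits.append((url, line.strip()))

        # Then queue the URLs found in this page, skipping ignored domains.
        for href in LINK_RE.findall(text):
            link = urljoin(url, href)
            if not link.startswith(("http://", "https://")):
                continue
            if any(link.startswith(domain) for domain in ignore_domains):
                continue
            pending.append(link)

    return hits
```

Passing fetch in as an argument keeps the sketch independent of any particular HTTP library and lets the loop be tried against a small in-memory dictionary of pages before it is pointed at a real site.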

© 2008-2014, biterScripting.com. All rights reserved.
biterScripting, biterScript, biterBrowser, biterMobile, biterScripting.com, FVA (Forward Variable Assignment) are trademarks of biterScripting.com. Is it biterScripting-compatible ? is a service mark of biterScripting.com. Explorer, Unix, Windows are trademarks, service marks or other forms of intellectual property of their respective owners.