
 

Tentative Layout for an Automatic Script Generator

 

This page describes a design prototype for an automatic Perl script generation program that takes a web-browser-based query form as input and outputs a Perl script. The generated script can be used to submit queries in place of the web-browser form. Examples of the kinds of scripts that may be generated by this program can be found at the Perl Scripts Page. The advantage of a script is that it can submit many queries automatically, whereas a web-browser-based form only allows manual entry of one query at a time. While this automatic script generator is intended for use with various types of protein structure prediction programs, it may also find use in other areas where automatic query submission is of value.

 

The minimal information necessary to access a given site may consist of the following data:

 

WWW Form location URL: Here the user must input the URL of the web page where the query form resides. For example, a user-specified URL might be http://pbil.ibcp.fr/cgi-bin/npsa_automat.pl?page=/NPSA/npsa_blast.html, the location of the form for making a Blast Search using the NPSA site described on the main page. The script generator will download the HTML source from this location to extract the information required by the query engine.

 

Protein String File: Here the user may input the name of the text file that contains the list of protein strings to be submitted.
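As a rough illustration of how such a file might be read, the following Python sketch assumes one protein string per line. The file layout, the treatment of lines starting with "#" as comments, and the function names are assumptions made for this example; the page does not specify them, and the generated scripts themselves would be Perl.

```python
from pathlib import Path


def parse_protein_strings(text):
    """Return the protein strings found in text, assuming one per line."""
    sequences = []
    for line in text.splitlines():
        line = line.strip()
        # Assumption: blank lines and lines starting with '#' are not sequences.
        if line and not line.startswith("#"):
            sequences.append(line)
    return sequences


def read_protein_strings(path):
    """Read a protein string file (hypothetical layout) from disk."""
    return parse_protein_strings(Path(path).read_text())
```

Each returned string would then be submitted as one query.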

 

Email Address: Some protein structure prediction sites return results via email. A preliminary version of the script generator may ask the user whether this is the case and, if so, to input an email address to which the results should be returned. Later implementations of the generator should be able to deduce automatically whether an email address is required.
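One possible way to deduce this automatically is to look at the names of the form fields. The Python sketch below is only a heuristic under that assumption: the function name is hypothetical, and a field whose name merely contains "mail" may be unrelated to result delivery.

```python
def needs_email(field_names):
    """Guess whether a form asks for an email address, judging only by field names."""
    # Heuristic assumption: any field name containing "mail" (email, e-mail,
    # usermail, ...) is taken to mean that the site wants an address.
    return any("mail" in name.lower() for name in field_names)
```

A generator could fall back to asking the user whenever this guess is uncertain.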

 

Eventually a system will be in place that takes the output from the Perl scripts and transfers it to an SQL database (see Main Page for more details) to allow effective and convenient retrieval and analysis of the mined data. This should be useful given the potentially huge quantities of data that may end up being collected.
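The database design is not described on this page, so the following Python sketch is purely illustrative. It uses SQLite from the standard library, and the table name, column names, and function names are all assumptions.

```python
import sqlite3


def store_result(conn, site, sequence, output):
    """Save one query result; the 'results' table layout is hypothetical."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS results ("
        "id INTEGER PRIMARY KEY, site TEXT, sequence TEXT, output TEXT)"
    )
    conn.execute(
        "INSERT INTO results (site, sequence, output) VALUES (?, ?, ?)",
        (site, sequence, output),
    )
    conn.commit()


def results_for_site(conn, site):
    """Return (sequence, output) pairs stored for one site, oldest first."""
    rows = conn.execute(
        "SELECT sequence, output FROM results WHERE site = ? ORDER BY id",
        (site,),
    )
    return rows.fetchall()
```

For example, `store_result(sqlite3.connect("results.db"), "NPSA", sequence, text)` would append one row per submitted query.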

 

Brief Description of Program Strategy/Logic

 

The script generator will likely include the following steps:

 

•      Download the HTML source for the query form from the URL supplied by the user

•      Extract the location of the server CGI script from the ACTION attribute of the FORM tag in the HTML code

•      Extract the required request fields from the NAME and VALUE attributes of the form's input elements in the HTML code

•      Select appropriate values from user inputs, default values and HTML code data to create an appropriate request object

•      Use a pre-defined script template with the above information plugged in. The template will contain generic code for creating a user agent that sends out a request and brings back a result from the server
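The steps above can be sketched roughly as follows. This is a minimal Python illustration, not the planned implementation: the class and function names, the @@URL@@ and @@FIELDS@@ placeholders, and the Perl template text are assumptions. The sketch handles only the first form on a page, takes default values only from VALUE attributes (SELECT options and TEXTAREA contents are ignored), always emits a POST request, and leaves out the download step so that it runs offline.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FormParser(HTMLParser):
    """Collect the ACTION, METHOD and named fields of the first FORM."""

    def __init__(self):
        super().__init__()
        self.action = None
        self.method = "get"
        self.fields = {}
        self._in_form = False
        self._done = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)  # HTMLParser lowercases tag and attribute names
        if tag == "form" and not self._done:
            self._in_form = True
            self.action = attrs.get("action") or ""
            self.method = (attrs.get("method") or "get").lower()
        elif self._in_form and tag in ("input", "textarea", "select"):
            if attrs.get("name"):
                # Keep the first value seen for a name (e.g. radio buttons).
                self.fields.setdefault(attrs["name"], attrs.get("value") or "")

    def handle_endtag(self, tag):
        if tag == "form" and self._in_form:
            self._in_form = False
            self._done = True


def build_request(page_url, html, user_values):
    """Combine form defaults with user inputs into a simple request description."""
    parser = FormParser()
    parser.feed(html)
    data = dict(parser.fields)
    data.update(user_values)  # user inputs override the form defaults
    return {
        "url": urljoin(page_url, parser.action or ""),
        "method": parser.method,
        "data": data,
    }


def perl_quote(value):
    """Quote a string as a Perl single-quoted literal."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


# Hypothetical template; the generator's real template is not shown on this page.
PERL_TEMPLATE = """\
use strict;
use warnings;
use LWP::UserAgent;

my $ua = LWP::UserAgent->new;
my $response = $ua->post(@@URL@@, [ @@FIELDS@@ ]);
print $response->decoded_content;
"""


def render_script(request):
    """Plug the request URL and fields into the Perl template."""
    fields = ", ".join(
        f"{perl_quote(name)} => {perl_quote(value)}"
        for name, value in request["data"].items()
    )
    script = PERL_TEMPLATE.replace("@@URL@@", perl_quote(request["url"]))
    return script.replace("@@FIELDS@@", fields)
```

Given the downloaded HTML of a form page, `render_script(build_request(page_url, html, {"notice": sequence}))` would return the text of a Perl script for one query, where "notice" stands in for whatever field name the form actually uses for the sequence.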

 

 

NOTE: A paper dealing with related subject matter may be found at the following URL: http://www3.interscience.wiley.com/cgi-bin/abstract?ID=40000289 While the site requires users to log on, registration is free.

 

The title of this paper is “A script-based WWW information integration support system and its application to genome databases”. It was presented by Yasuhiko Kitamura, Tetsuya Nozaki and Shoji Tatsumi from the Department of Information and Communication Engineering, Faculty of Engineering, Osaka City University, Osaka, Japan 558

 
