Introduction to Ori-Finder
   
 
Ori-Finder is an online system for finding oriCs in bacterial genomes. It uses an integrated method that combines the analysis of base composition asymmetry by the Z-curve method, the distribution of DnaA boxes, and the occurrence of genes frequently found close to oriCs. The program can also handle unannotated sequences by integrating the gene-finding program ZCURVE 1.02. The predicted results are exported to an HTML report, which presents them in both graphical and tabular formats.
 
I Website description
   
 

The Ori-Finder web server runs on an Apache server, and the web interface is built with CGI (Common Gateway Interface) Perl scripts. The algorithms that predict oriC regions of bacterial genomes in silico are implemented in C++. The output graphs are generated by the gnuplot graphing utility (http://www.gnuplot.info/).

The software kit consists of the following programs.
(1) The program to calculate the coordinates of the RY, MK, GC and AT disparity curves.
(2) The program to calculate the distribution of DnaA boxes for the input genome sequence.
(3) The program to assign and rank the priority of every intergenic sequence.
(4) The gene-finding program ZCURVE 1.02.
(5) The Perl script integrating the above programs into one system.
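The disparity curves computed by program (1) can be illustrated with a short Python sketch. It is not the C++ code used by the server. It assumes the usual Z-curve definitions, in which each curve is a cumulative count difference along the sequence: RY = (A + G) - (C + T), MK = (A + C) - (G + T), AT = A - T and GC = G - C. The function names disparity_curves and extremes are hypothetical and chosen only for this example.

```python
def disparity_curves(seq):
    """Return the cumulative RY, MK, AT and GC disparity curves of a DNA sequence.

    Assumed definitions (standard Z-curve convention, not taken from the
    Ori-Finder source code):
        RY = (A + G) - (C + T)    MK = (A + C) - (G + T)
        AT = A - T                GC = G - C
    where A, C, G, T are the base counts in the first n positions.
    """
    counts = {"A": 0, "C": 0, "G": 0, "T": 0}
    curves = {"RY": [], "MK": [], "AT": [], "GC": []}
    for base in seq.upper():
        if base in counts:  # ambiguous bases such as N leave the curves flat
            counts[base] += 1
        a, c, g, t = counts["A"], counts["C"], counts["G"], counts["T"]
        curves["RY"].append((a + g) - (c + t))
        curves["MK"].append((a + c) - (g + t))
        curves["AT"].append(a - t)
        curves["GC"].append(g - c)
    return curves


def extremes(curve):
    """Return the 1-based positions of the first minimum and first maximum."""
    return curve.index(min(curve)) + 1, curve.index(max(curve)) + 1


if __name__ == "__main__":
    curves = disparity_curves("AGCT")
    for name, values in curves.items():
        print(name, values, "min/max at", extremes(values))
```

For a real genome the curves would be plotted against position, and their extremes inspected as candidate locations of the replication origin and terminus.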

 
II Inputs and parameters
   
 

Ori-Finder has a user-friendly and intuitive input interface. Users can either paste the sequence into the input box or upload it as a file in FASTA format. In addition, the server accepts the optional parameters listed below.

(1) Select 'species-specific' DnaA boxes according to the species. The default is the Escherichia coli perfect DnaA box (TTATCCACA). 'Species-specific' DnaA boxes are also provided for some organisms, such as Chlamydiae (TTTTCCACA), Dehalococcoides (TTATCGAAA) and Bradyrhizobiaceae (TGTTTCACG). Users can also search for DnaA boxes that they define themselves. By default, only nonamers that differ from the selected DnaA box in no more than one position are reported, but users can change this threshold.

(2) Whether to upload a ptt file containing the coordinates of the genes in the input genome sequence. If no ptt file is uploaded, ZCURVE 1.02 is run to generate the gene list. Note that ZCURVE 1.02, like other gene-finding software, cannot annotate gene function. Therefore, after running the gene-finding software, users are strongly recommended to run a BLAST search of the genome sequence against a database of indicator genes (such as dnaA, dnaN, hemE and gidA) collected by us, in order to obtain a gene list annotated with the gene functions related to oriCs.

(3) Whether to display the RY, MK, GC and AT disparity curves and the DnaA box distribution in the output graphs. By default, all of them are displayed.
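The mismatch rule in parameter (1) can be sketched in Python as follows. This is only an illustration, not the server's implementation. It assumes that a hit is any nonamer whose Hamming distance to the selected DnaA box is at most the allowed number of mismatches, and that both strands are scanned by also comparing against the reverse complement of the box. The function names find_dnaa_boxes and reverse_complement are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_dnaa_boxes(seq, box="TTATCCACA", max_mismatches=1):
    """List windows that match a DnaA box with at most max_mismatches mismatches.

    Each hit is (1-based start, strand, matched window, mismatch count).
    Scanning both strands is an assumption made for this sketch.
    """
    seq = seq.upper()
    k = len(box)
    motifs = {"+": box, "-": reverse_complement(box)}
    hits = []
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        for strand, motif in motifs.items():
            # Hamming distance between the window and the motif
            mismatches = sum(1 for x, y in zip(window, motif) if x != y)
            if mismatches <= max_mismatches:
                hits.append((i + 1, strand, window, mismatches))
    return hits


if __name__ == "__main__":
    # Toy sequence: one perfect box on the forward strand and one box
    # with a single mismatch on the reverse strand.
    for hit in find_dnaa_boxes("GGTTATCCACAGGTGTGGAAAACC"):
        print(hit)
```

Setting max_mismatches to 0 restricts the search to perfect boxes, while passing another nonamer as box corresponds to choosing a 'species-specific' or user-defined DnaA box.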

 

 

III Outputs
     
 
The output web page reports the progress of the Ori-Finder run and provides links to the results: (i) an HTML table giving the genome size and GC content, the location, length, DnaA box number, DnaA box motif and sequence of the identified oriC regions, the conserved genes adjacent to the predicted oriCs, and the precise coordinates of the extremes of the four disparity curves; (ii) text files containing the DnaA box distribution and the coordinates of the four disparity curves; (iii) an integrated plot for the original sequence and (iv) an integrated plot for the rotated sequence, both in PNG format, which display the general genome information, the four disparity curves, the distribution of DnaA boxes, and the locations of the indicator genes and oriC regions. If no ptt file is uploaded, the gene list file generated by ZCURVE 1.02 is also provided. To check the reliability of a prediction, users can run a BLAST search of each predicted oriC region against DoriC, a database of oriC regions.