Table of Contents
Example 4.1. StandAloneBlast run
use Bio::SeqIO;
use Bio::Tools::Run::StandAloneBlast;
my $Seq_in = Bio::SeqIO->new (-file => $ARGV[0], -format => 'fasta');
my $query = $Seq_in->next_seq();
my $factory = Bio::Tools::Run::StandAloneBlast->new('program' => 'blastp',
'database' => 'swissprot',
_READMETHOD => "Blast"
);
my $blast_report = $factory->blastall($query);
my $result = $blast_report->next_result;
while( my $hit = $result->next_hit()) {
print "\thit name: ", $hit->name(), " significance: ", $hit->significance(), "\n";}
You can try it with this file.
![]() | Exercise 4.1. Running Blast on a Swissprot entry |
|
Run the above example with the TAUD_ECOLI entry from Swissprot (see Section 2.2.2) ). Solution A.10 | |
![]() | Exercise 4.2. Running Blast: Setting parameters |
|
Change the Expectation value (E) Blast parameter to 1.0 (instead of default value, 10.0) (query). Solution A.11 | |
![]() | Exercise 4.3. Running Blast: Saving output |
|
Save the blast report in a local file (e.g blast.out). Solution A.12 | |
![]() | Exercise 4.4. Running a Remote Blast |
|
Find out from the bioperl Web site documentation how to run a Blast at NCBI. Solution A.13 | |
pseudocode:
For all hits (class Bio::Search::Hit::GenericHit)
for all HSP (class Bio::Search::HSP::GenericHSP)
Example 4.2. StandAloneBlast parsing
use Bio::SeqIO;
use Bio::Tools::Run::StandAloneBlast;
my $Seq_in = Bio::SeqIO->new (-file => $ARGV[0],
-format => 'fasta');
my $query = $Seq_in->next_seq();
my $factory = Bio::Tools::Run::StandAloneBlast->new( 'program' => 'blastp',
'database' => 'swissprot',
_READMETHOD => "Blast" );
my $blast_report = $factory->blastall($query);
my $result = $blast_report->next_result;
while( my $hit = $result->next_hit()) {
print "\thit name: ", $hit->name(), " significance: ", $hit->significance(), "\n";
while( my $hsp = $hit->next_hsp()) {
print "E: ", $hsp->evalue(), "frac_identical: ", $hsp->frac_identical(), "\n"; }}
$result belongs to the Bio::Search::Result::GenericResult class, where you can find the documentation about available methods on results.
![]() | Exercise 4.5. Display Blast hits |
|
Display hit accession and hsp length. Solution A.14 | |
![]() | Exercise 4.6. Class of a Blast report |
|
What is the class of $blast_report? (hint: use the ref() perl function). Solution A.15 | |
The $blast_report that is created by the Bio::Tools::Run::StandAloneBlast new method actually belongs to the Bio::SearchIO class (see HOWTO). So, you can directly create such an object from a file.
Example 4.3. Parsing from a Blast file
use Bio::SearchIO;
my $blast_report = new Bio::SearchIO ('-format' => 'blast',
'-file' => $ARGV[0]);
my $result = $blast_report->next_result;
while( my $hit = $result->next_hit()) {
print "\thit name: ", $hit->name(), "\n";
while( my $hsp = $hit->next_hsp()) {
print "E: ", $hsp->evalue(), "frac_identical: ", $hsp->frac_identical(), "\n"; }}
![]() | Exercise 4.7. Parse a Blast output file |
|
Run Example 4.3 with the file you have just created in Exercise 4.3. | |
![]() | Exercise 4.8. Parse Blast results on standard input |
|
Run Example 4.3 with the blast report given on standard input. Solution A.16 | |
![]() | Exercise 4.9. Filtering hits by length |
|
Only display hits of length > to a given value (create a blast report from this file). Solution A.17 | |
![]() | Exercise 4.10. Filtering hits by position |
|
Only display hits between two positions. Solution A.18 | |
![]() | Exercise 4.11. Display the best hit by databank |
|
Display the best hit for each database in a Blast NCBI report (example).
Solution A.19 | |
![]() | Exercise 4.12. Multiple queries |
|
Submit two sequences and display the best hit for each. (2nd query) Solution A.20 | |
![]() | Exercise 4.13. Extracting the subject sequence |
|
Extract as a string the subject sequence from HSP with identity above a given level. Solution A.21 | |
![]() | Exercise 4.14. Extracting alignments |
|
Extract HSP alignment from hits which name contains a given pattern, and print this alignment in Clustalw format. Solution A.22 | |
![]() | Exercise 4.15. Locate EST in a genome |
|
Locate EST in a genome: Solution A.23 | |
![]() | Exercise 4.16. Record Blast hits as sequence features |
|
Iterate a local Blast on databases given on the command line and record each corresponding best hit as a feature for the query sequence. Solution A.24 | |
The old Bio::Tools::BPlite family of parsers (BPlite, Bio::Tools::BPpsilite, Bio::Tools::BPbl2seq) is still maintained in the 1.0 release, although is will be deprecated one day. It is actually still necessary for blasting two sequences and to perform a psiblast (Position Specific Iterative Blast). It is also still the default parser returned by Bio::Tools::Run::StandAloneBlast, as shown below (although this might change soon).
Example 4.4. Parsing with BPLite
In the following example (notice that we have removed the _READMETHOD parameter), the $blast_report does not belong to the Bio::SearchIO class, but to the Bio::Tools::BPlite class. That is why the method to use in order to get informations are different, namely nextSbjct and nextHSP of class Bio::Tools::BPlite::Sbjct instead of next_hit and next_hsp. There is no next_result method, since these parsers were not able to deal with multiple reports.
use Bio::SeqIO;
use Bio::Tools::Run::StandAloneBlast;
my $Seq_in = Bio::SeqIO->new (-file => $ARGV[0],
-format => 'fasta');
my $query = $Seq_in->next_seq();;
my $factory = Bio::Tools::Run::StandAloneBlast->new(
'program' => 'blastp',
'database' => 'swissprot'
);
my $blast_report = $factory->blastall($query);
while (my $subject = $blast_report->nextSbjct()) {
print $subject->name(), "\n";
while (my $hsp = $subject->nextHSP()) {
print join("\t",
$hsp->P,
$hsp->percent,
$hsp->score), "\n";
}
}
![]() | Exercise 4.17. Print informations from a BPLite report |
|
Take Example 4.4 and print hit names, and for all HSP, its P value, percent identity and score (see Bio::Tools::BPlite::HSP). Solution A.25 | |
![]() | Exercise 4.18. Create a Bio::Tools::BPlite from a file |
|
Create a Bio::Tools::BPlite from a file. Solution A.26 | |
![]() | Exercise 4.19. Create a Bio::Tools::BPlite from standard input |
|
Create a Bio::Tools::BPlite from standard input. Solution A.27 | |
See documentation or tutorial at NCBI.
Example 4.5. Running PSI-blast
The following example run a PSI-blast:
use Bio::SeqIO;
use Bio::Tools::Run::StandAloneBlast;
my $query_in = Bio::SeqIO->newFh ( -file => $ARGV[0],
-format => 'fasta' );
my $query = <$query_in>;
$factory = Bio::Tools::Run::StandAloneBlast->new('database' => 'swissprot');
$factory->j(2);
$blast_report = $factory->blastpgp($query);
for my $iteration (1 .. $blast_report->number_of_iterations()) {
print "iteration: $iteration\n";
my $result = $blast_report->round($iteration);
while( my $hit = $result->nextSbjct()) {
print "\thit name: ", $hit->name(), "\n";
foreach my $hsp ($hit->nextHSP) {
print "\tstart: ", $hsp->hit->start(), " end: ",$hsp->hit->end(),"\n";
print "\tscore: ", $hsp->score(), "\n"; } }}
You can also load a report from a file, or from standard input:
my $blast_report = Bio::Tools::BPpsilite->new(-fh=>\*FH);
Example 4.6. PSI-Blast SearchIO class
There is also a SearchIO class for PSI-blast report, but it does not give access to the specific iterations:
use Bio::SearchIO;
my $in = Bio::SearchIO->new( -format => 'psiblast',
-file => $ARGV[0] );
while ( my $result = $in->next_result() ) {
foreach my $hit ( $result->hits ) {
print "Hit: $hit\n"; }}
![]() | Exercise 4.20. Parse a PSI-blast report |
|
Load a PSI-blast report (example ) and at each iteration, display:
Solution A.28 | |
Example 4.7. Running bl2seq
use Bio::SeqIO;use Bio::Tools::Run::StandAloneBlast;
my $query1_in = Bio::SeqIO->newFh ( -file => $ARGV[0],
-format => 'fasta' );
my $query1 = <$query1_in>;
my $query2_in = Bio::SeqIO->newFh ( -file => $ARGV[1],
-format => 'fasta' );
my $query2 = <$query2_in>;
$factory = Bio::Tools::Run::StandAloneBlast->new('program' => 'blastp');
$report = $factory->bl2seq($query1, $query2);
print "ref: ", ref($blast_report), "\n";
while(my $hsp = $report->next_feature) {
print "homology seq :\n", $hsp->homologySeq, "\n";
print "sbjctSeq :\n", $hsp->sbjctSeq, "\n";}
The call to bl2seq will return a Bio::Tools::BPbl2seq object.
![]() | Exercise 4.21. Build a Bio::SimpleAlign object |
|
Build a Bio::SimpleAlign object from the subject and query sequences, and display this alignment in Clustalw format. Solution A.29 | |