nct.service.homology
Class BlastXMLFileFilterInputStream

java.lang.Object
  extended by java.io.InputStream
      extended by java.io.FilterInputStream
          extended by nct.service.homology.BlastXMLFileFilterInputStream
All Implemented Interfaces:
java.io.Closeable

public class BlastXMLFileFilterInputStream
extends java.io.FilterInputStream

This class extends FilterInputStream. It filters out the extra XML version declarations and DOCTYPE declarations that appear in Blast "XML" files containing multiple queries, and wraps the entire document in blast_aggregate tags. The result is an InputStream that yields well-formed XML. This class can be used in place of a FileInputStream as follows.

 InputStream is = new BlastXMLFileFilterInputStream(blastFileName, true);
 
 // blast parser setup - see Biojava in Anger

 parser.parse(new InputSource(is));
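
The transformation described above can be sketched on a plain string. This is a minimal illustration, not this class's implementation: the two regular expressions are assumed stand-ins for xmlRegEx and doctypeRegEx (their actual values are not listed on this page), the names BlastAggregateSketch, aggregate and countOutputs are hypothetical, and only the case where no declarations are kept is modelled.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class BlastAggregateSketch {

    // Strip every XML declaration and DOCTYPE declaration, then wrap the
    // remaining BlastOutput sections in a single root element.
    // Both patterns are assumptions made for this sketch.
    static String aggregate(String raw) {
        String body = raw
            .replaceAll("<\\?xml[^>]*\\?>\\s*", "")
            .replaceAll("<!DOCTYPE[^>]*>\\s*", "");
        return "<blast_aggregate>" + body + "</blast_aggregate>";
    }

    // Parse the text with the JDK DOM parser and count BlastOutput elements.
    // Parsing fails if the text is not well-formed XML.
    static int countOutputs(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getElementsByTagName("BlastOutput").getLength();
        } catch (Exception e) {
            throw new IllegalArgumentException("not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        // One query result as it might appear in a multi-query file.
        String one = "<?xml version=\"1.0\"?>\n"
            + "<!DOCTYPE BlastOutput PUBLIC \"-//NCBI//NCBI BlastOutput/EN\" \"NCBI_BlastOutput.dtd\">\n"
            + "<BlastOutput><BlastOutput_program>blastp</BlastOutput_program></BlastOutput>\n";

        // Two results concatenated, as in a multi-query file.
        String wrapped = aggregate(one + one);
        System.out.println(countOutputs(wrapped)); // prints 2
    }
}
```

Without the wrapping element, the concatenated text has two root elements and repeated declarations, which a conforming XML parser rejects.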
 

Author:
Michael Smoot

Field Summary
static java.lang.String doctypeRegEx
          The string defining the regular expression used to identify the DOCTYPE tags in the Blast output.
protected  boolean keepFirst
          Whether or not to keep the first instances of xmlRegEx and doctypeRegEx.
static java.lang.String wrappingTag
          The tag used to wrap the multiple BlastOutput sections.
static java.lang.String xmlRegEx
          The string defining the regular expression used to identify the XML version tags in the Blast output.
 
Fields inherited from class java.io.FilterInputStream
in
 
Constructor Summary
BlastXMLFileFilterInputStream(java.io.InputStream ins, boolean keepFirst)
          Creates a filtered stream over an existing input stream of a Blast XML file.
BlastXMLFileFilterInputStream(java.lang.String fileName, boolean keepFirst)
          Creates a filtered stream that reads the Blast XML file with the given name.
 
Method Summary
 
Methods inherited from class java.io.FilterInputStream
available, close, mark, markSupported, read, read, read, reset, skip
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

wrappingTag

public static java.lang.String wrappingTag
The tag used to wrap the multiple BlastOutput sections.


xmlRegEx

public static java.lang.String xmlRegEx
The string defining the regular expression used to identify the XML version tags in the Blast output.


doctypeRegEx

public static java.lang.String doctypeRegEx
The string defining the regular expression used to identify the DOCTYPE tags in the Blast output.


keepFirst

protected boolean keepFirst
Whether or not to keep the first instances of xmlRegEx and doctypeRegEx. Parsers differ in how tolerant they are of these declarations, so the appropriate value depends on the parser in use.
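
The effect of a keep-first flag can be illustrated with a small helper that removes every match of a pattern except, optionally, the first. This is only a sketch under assumptions: KeepFirstSketch and dropMatches are hypothetical names, and the regular expression is an assumed stand-in for xmlRegEx, not its actual value.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class KeepFirstSketch {

    // Remove all matches of regex from text. When keepFirst is true,
    // the first match is copied through unchanged.
    static String dropMatches(String text, String regex, boolean keepFirst) {
        Matcher m = Pattern.compile(regex).matcher(text);
        StringBuilder out = new StringBuilder();
        int last = 0;          // end of the previous match
        boolean first = true;  // true until the first match has been seen
        while (m.find()) {
            out.append(text, last, m.start()); // text between matches
            if (first && keepFirst) {
                out.append(m.group());
            }
            first = false;
            last = m.end();
        }
        out.append(text, last, text.length()); // trailing text
        return out.toString();
    }

    public static void main(String[] args) {
        String decl = "<?xml version=\"1.0\"?>";
        String text = decl + "<a/>" + decl + "<b/>";
        String regex = "<\\?xml[^>]*\\?>"; // assumed pattern

        System.out.println(dropMatches(text, regex, true));  // <?xml version="1.0"?><a/><b/>
        System.out.println(dropMatches(text, regex, false)); // <a/><b/>
    }
}
```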

Constructor Detail

BlastXMLFileFilterInputStream

public BlastXMLFileFilterInputStream(java.lang.String fileName,
                                     boolean keepFirst)
                              throws java.io.IOException
Creates a filtered stream that reads the Blast XML file with the given name.

Parameters:
fileName - The XML file name that needs to be processed.
keepFirst - Whether or not to keep the first instance of the doctype and xml version declarations.
Throws:
java.io.IOException

BlastXMLFileFilterInputStream

public BlastXMLFileFilterInputStream(java.io.InputStream ins,
                                     boolean keepFirst)
                              throws java.io.IOException
Creates a filtered stream over an existing input stream of a Blast XML file.

Parameters:
ins - The input stream of the XML file that needs to be processed.
keepFirst - Whether or not to keep the first instance of the doctype and xml version declarations.
Throws:
java.io.IOException
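
For callers that already hold an InputStream, wrapping at the stream level can be approximated with the JDK's SequenceInputStream. The sketch below is a swapped-in technique for illustration only and is not how this class is implemented: StreamWrapSketch, wrap and readAll are hypothetical names, and no declaration filtering is performed.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Collections;

class StreamWrapSketch {

    // Return a stream that yields <tag>, then the body, then </tag>.
    static InputStream wrap(InputStream body, String tag) {
        InputStream open = new ByteArrayInputStream(
            ("<" + tag + ">").getBytes(StandardCharsets.UTF_8));
        InputStream close = new ByteArrayInputStream(
            ("</" + tag + ">").getBytes(StandardCharsets.UTF_8));
        return new SequenceInputStream(
            Collections.enumeration(Arrays.asList(open, body, close)));
    }

    // Drain a stream into a UTF-8 string (helper for the demonstration).
    static String readAll(InputStream in) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buf.write(chunk, 0, n);
            }
            return new String(buf.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        InputStream body = new ByteArrayInputStream(
            "<BlastOutput/>".getBytes(StandardCharsets.UTF_8));
        System.out.println(readAll(wrap(body, "blast_aggregate")));
        // prints <blast_aggregate><BlastOutput/></blast_aggregate>
    }
}
```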