org.apache.solr.analysis
Class BufferedTokenStream

java.lang.Object
  extended by org.apache.lucene.analysis.TokenStream
      extended by org.apache.solr.analysis.BufferedTokenStream
Direct Known Subclasses:
RemoveDuplicatesTokenFilter

public abstract class BufferedTokenStream
extends org.apache.lucene.analysis.TokenStream

Handles input and output buffering for a TokenStream, so that subclasses can look ahead in the input and emit more than one token per input token.

 // Example of a class implementing the rule "A" "B" => "Q" "B"
 class MyTokenStream extends BufferedTokenStream {
   public MyTokenStream(TokenStream input) {super(input);}
   protected Token process(Token t) throws IOException {
     if ("A".equals(t.termText())) {
       Token t2 = read();
       if (t2!=null && "B".equals(t2.termText())) t.setTermText("Q");
       if (t2!=null) pushBack(t2);
     }
     return t;
   }
 }

 // Example of a class implementing "A" "B" => "A" "A" "B"
 class MyTokenStream extends BufferedTokenStream {
   public MyTokenStream(TokenStream input) {super(input);}
   protected Token process(Token t) throws IOException {
     Token next = peek(1);  // may be null when no further token is available
     if ("A".equals(t.termText()) && next != null && "B".equals(next.termText()))
       write(t);
     return t;
   }
 }
 

Author:
yonik

Constructor Summary
BufferedTokenStream(org.apache.lucene.analysis.TokenStream input)
           
 
Method Summary
 org.apache.lucene.analysis.Token next()
           
protected  java.lang.Iterable<org.apache.lucene.analysis.Token> output()
          Provides direct access to the buffered output stream as an Iterable.
protected  org.apache.lucene.analysis.Token peek(int n)
          Peek n tokens ahead in the buffered input stream, without modifying the stream.
protected abstract  org.apache.lucene.analysis.Token process(org.apache.lucene.analysis.Token t)
          Process a token.
protected  void pushBack(org.apache.lucene.analysis.Token t)
          Push a token back into the buffered input stream, so that it will be returned by a future call to read().
protected  org.apache.lucene.analysis.Token read()
          Read a token from the buffered input stream.
protected  void write(org.apache.lucene.analysis.Token t)
          Write a token to the buffered output stream.
 
Methods inherited from class org.apache.lucene.analysis.TokenStream
close, next, reset
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

BufferedTokenStream

public BufferedTokenStream(org.apache.lucene.analysis.TokenStream input)
Method Detail

process

protected abstract org.apache.lucene.analysis.Token process(org.apache.lucene.analysis.Token t)
                                                     throws java.io.IOException
Process a token. Subclasses may read more tokens from the input stream, write more tokens to the output stream, or simply return the next token to be output. Subclasses may return null if the token is to be dropped. If a subclass writes tokens to the output stream and returns a non-null Token, the returned Token is considered to be at the head of the token output stream.

Throws:
java.io.IOException
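
The contract above can be illustrated with a small, self-contained model. This is only a sketch: `StringBufferedStream` and `DropAndExpandDemo` are hypothetical names, plain strings stand in for Token objects, and the `next()` loop is an assumption chosen to match the documented behavior (a returned token is emitted ahead of anything written during the same call, and a null return drops the token). It is not the Solr implementation.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for BufferedTokenStream that uses Strings instead of Tokens.
abstract class StringBufferedStream {
    private final Iterator<String> input;
    private final Deque<String> inQueue = new ArrayDeque<>();
    private final Deque<String> outQueue = new ArrayDeque<>();

    StringBufferedStream(Iterator<String> input) { this.input = input; }

    // Subclasses return the token to emit, or null to drop it.
    protected abstract String process(String t);

    // Read from the buffered input; null signals end of stream.
    protected String read() {
        if (!inQueue.isEmpty()) return inQueue.removeFirst();
        return input.hasNext() ? input.next() : null;
    }

    // Queue a token for output after the one returned by process().
    protected void write(String t) { outQueue.addLast(t); }

    // Assumed driver loop: drain written tokens first, then process new input.
    public final String next() {
        while (true) {
            if (!outQueue.isEmpty()) return outQueue.removeFirst();
            String t = read();
            if (t == null) return null;
            String out = process(t);
            if (out != null) return out;
            // process() dropped the token; loop again for the next one
        }
    }
}

class DropAndExpandDemo {
    static List<String> run(String... terms) {
        StringBufferedStream s = new StringBufferedStream(Arrays.asList(terms).iterator()) {
            @Override protected String process(String t) {
                if ("the".equals(t)) return null;   // dropped
                if ("wifi".equals(t)) {
                    write("fi");                    // emitted after the returned token
                    return "wi";                    // head of the output
                }
                return t;                           // passed through unchanged
            }
        };
        List<String> out = new ArrayList<>();
        for (String t = s.next(); t != null; t = s.next()) out.add(t);
        return out;
    }
}
```

In this model, `DropAndExpandDemo.run("the", "wifi", "works")` yields the list `[wi, fi, works]`: the first token is dropped, the second is split with the returned part ahead of the written part, and the third passes through.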

next

public final org.apache.lucene.analysis.Token next()
                                            throws java.io.IOException
Overrides:
next in class org.apache.lucene.analysis.TokenStream
Throws:
java.io.IOException

read

protected org.apache.lucene.analysis.Token read()
                                         throws java.io.IOException
Read a token from the buffered input stream.

Returns:
the next token, or null at end of stream (EOS)
Throws:
java.io.IOException

pushBack

protected void pushBack(org.apache.lucene.analysis.Token t)
Push a token back into the buffered input stream, so that it will be returned by a future call to read().


peek

protected org.apache.lucene.analysis.Token peek(int n)
                                         throws java.io.IOException
Peek n tokens ahead in the buffered input stream, without modifying the stream.

Parameters:
n - Number of tokens ahead in the input stream to peek; 1-based, so 1 is the next token and 0 is invalid
Returns:
a Token which exists in the input stream. Any modifications made to this Token will be "real" if/when the Token is read() from the stream.
Throws:
java.io.IOException
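
The lookahead semantics described above can be sketched with a self-contained model of the input side. Everything here is hypothetical: `Tok` stands in for Token, `PeekableInput` models only `read()` and `peek(int)`, and two behaviors are assumptions rather than documented facts, namely returning null when fewer than n tokens remain and throwing IllegalArgumentException for n < 1.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical mutable token; stands in for org.apache.lucene.analysis.Token.
class Tok {
    String term;
    Tok(String term) { this.term = term; }
}

// Hypothetical model of the buffered input side only.
class PeekableInput {
    private final Iterator<Tok> input;
    private final List<Tok> inQueue = new ArrayList<>();

    PeekableInput(Iterator<Tok> input) { this.input = input; }

    // Consume the next token: buffered tokens first, then the underlying input.
    Tok read() {
        if (!inQueue.isEmpty()) return inQueue.remove(0);
        return input.hasNext() ? input.next() : null;
    }

    // Look n tokens ahead (1-based) without consuming anything.
    Tok peek(int n) {
        if (n < 1) throw new IllegalArgumentException("n is 1-based"); // assumption
        while (inQueue.size() < n) {
            if (!input.hasNext()) return null;  // assumption: null past end of stream
            inQueue.add(input.next());
        }
        return inQueue.get(n - 1);
    }
}

class PeekDemo {
    static String run() {
        PeekableInput in = new PeekableInput(
                Arrays.asList(new Tok("A"), new Tok("B")).iterator());
        Tok second = in.peek(2);      // look past "A" at "B"
        second.term = "Q";            // edit the buffered token in place
        String first = in.read().term;
        String next = in.read().term; // the edit made through peek() is visible here
        boolean atEnd = in.peek(1) == null;
        return first + " " + next + " " + atEnd;
    }
}
```

Because `peek` hands back the same object that a later `read()` returns, the in-place edit shows up in the stream; in this model `PeekDemo.run()` returns `"A Q true"`.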

write

protected void write(org.apache.lucene.analysis.Token t)
Write a token to the buffered output stream.


output

protected java.lang.Iterable<org.apache.lucene.analysis.Token> output()
Provides direct access to the buffered output stream as an Iterable. Modifying any token obtained from it will affect the resulting stream.



Copyright © 2006 - 2009 The Apache Software Foundation