net.sf.saxon.expr
Class Tokenizer
java.lang.Object
net.sf.saxon.expr.Tokenizer
public final class Tokenizer
extends java.lang.Object
Tokenizer for expressions and inputs.
This code was originally derived from James Clark's xt, though it has been greatly modified since.
See copyright notice at end of file.
static int | BARE_NAME_STATE - State in which a name is NOT to be merged with what comes next, for example "("
static int | DEFAULT_STATE - Initial default state of the Tokenizer
static int | OPERATOR_STATE - State in which the next thing to be read is an operator
static int | SEQUENCE_TYPE_STATE - State in which the next thing to be read is a SequenceType
int | currentToken - The number identifying the most recently read token
int | currentTokenStartOffset - The position in the input expression where the current token starts
String | currentTokenValue - The string value of the most recently read token
String | input - The string being parsed
int | inputOffset - The current position within the input string
int | startLineNumber - The starting line number (for XPath in XSLT, the line number in the stylesheet)
int | getColumnNumber() - Get the column number of the current token
int | getColumnNumber(int offset) - Return the column number corresponding to a given offset in the expression
long | getLineAndColumn(int offset) - Get the line and column number corresponding to a given offset in the input expression, as a long value with the line number in the top half and the column number in the lower half
int | getLineNumber() - Get the line number of the current token
int | getLineNumber(int offset) - Return the line number corresponding to a given offset in the expression
int | getState() - Get the current tokenizer state
void | lookAhead() - Look ahead by one token.
void | next() - Get the next token from the input expression.
char | nextChar() - Read next character directly.
String | recentText() - Get the most recently read text (for use in an error message)
void | setState(int state) - Set the tokenizer into a special state
void | tokenize(String input, int start, int end, int lineNumber) - Prepare a string for tokenization.
void | treatCurrentAsOperator() - Force the current token to be treated as an operator if possible
void | unreadChar() - Step back one character.
BARE_NAME_STATE
public static final int BARE_NAME_STATE
State in which a name is NOT to be merged with what comes next, for example "("
DEFAULT_STATE
public static final int DEFAULT_STATE
Initial default state of the Tokenizer
OPERATOR_STATE
public static final int OPERATOR_STATE
State in which the next thing to be read is an operator
SEQUENCE_TYPE_STATE
public static final int SEQUENCE_TYPE_STATE
State in which the next thing to be read is a SequenceType
currentToken
public int currentToken
The number identifying the most recently read token
currentTokenStartOffset
public int currentTokenStartOffset
The position in the input expression where the current token starts
currentTokenValue
public String currentTokenValue
The string value of the most recently read token
input
public String input
The string being parsed
inputOffset
public int inputOffset
The current position within the input string
startLineNumber
public int startLineNumber
The starting line number (for XPath in XSLT, the line number in the stylesheet)
getColumnNumber
public int getColumnNumber()
Get the column number of the current token
getColumnNumber
public int getColumnNumber(int offset)
Return the column number corresponding to a given offset in the expression
Parameters:
offset - the byte offset in the expression
getLineAndColumn
public long getLineAndColumn(int offset)
Get the line and column number corresponding to a given offset in the input expression,
as a long value with the line number in the top half
and the column number in the lower half
Parameters:
offset - the byte offset in the expression
Returns:
the line and column number, packed together
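The packed return value can be split back into its two parts with shifts and masks. The sketch below is a hypothetical helper, not part of Saxon, and it assumes that "top half" and "lower half" mean the upper and lower 32 bits of the long; check the Saxon source before relying on that layout.

```java
// Hypothetical helper, not part of Saxon. Assumes the line number occupies
// the upper 32 bits and the column number the lower 32 bits.
final class LineAndColumn {
    private LineAndColumn() {}

    /** Pack the line into the upper 32 bits and the column into the lower 32 bits. */
    static long pack(int line, int column) {
        return ((long) line << 32) | (column & 0xFFFFFFFFL);
    }

    /** Recover the line number from the upper half. */
    static int line(long packed) {
        return (int) (packed >> 32);
    }

    /** Recover the column number from the lower half. */
    static int column(long packed) {
        return (int) (packed & 0xFFFFFFFFL);
    }
}
```

Under that assumption, a caller holding the result of getLineAndColumn(offset) could pass it to LineAndColumn.line and LineAndColumn.column to report an error location.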
getLineNumber
public int getLineNumber()
Get the line number of the current token
getLineNumber
public int getLineNumber(int offset)
Return the line number corresponding to a given offset in the expression
Parameters:
offset - the byte offset in the expression
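One way to map an offset to a line and column is to scan the input for newline characters. The sketch below is illustrative only: the class OffsetLocator and its methods are hypothetical, it treats the offset as an index into the Java String, and it assumes 1-based columns. Saxon's own implementation may work differently, for example by recording newline positions during tokenization.

```java
// Hypothetical helper, not part of Saxon. Treats the offset as an index
// into the input String and assumes 1-based column numbers.
final class OffsetLocator {
    private OffsetLocator() {}

    /** Line containing the given offset, counting from startLineNumber. */
    static int lineOf(String input, int offset, int startLineNumber) {
        int line = startLineNumber;
        for (int i = 0; i < offset && i < input.length(); i++) {
            if (input.charAt(i) == '\n') {
                line++;
            }
        }
        return line;
    }

    /** Column of the given offset within its line. */
    static int columnOf(String input, int offset) {
        // lastIndexOf returns -1 when no newline precedes the offset,
        // which makes the first character of the input column 1.
        int lastNewline = input.lastIndexOf('\n', offset - 1);
        return offset - lastNewline;
    }
}
```

For the input "a +\n  b" with a starting line number of 10, the offset of "b" (6) maps to line 11, column 3.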
getState
public int getState()
Get the current tokenizer state
lookAhead
public void lookAhead()
throws XPathException
Look ahead by one token. This method does the real tokenization work.
The method is normally called internally, but the XQuery parser also
calls it to resume normal tokenization after dealing with pseudo-XML
syntax.
next
public void next()
throws XPathException
Get the next token from the input expression. The type of token is returned in the
currentToken variable, the string value of the token in currentTokenValue.
nextChar
public char nextChar()
throws StringIndexOutOfBoundsException
Read next character directly. Used by the XQuery parser when parsing pseudo-XML syntax.
Returns:
the next character from the input
recentText
public String recentText()
Get the most recently read text (for use in an error message)
Returns:
a chunk of text leading up to the error
setState
public void setState(int state)
Set the tokenizer into a special state
tokenize
public void tokenize(String input,
int start,
int end,
int lineNumber)
throws XPathException
Prepare a string for tokenization.
The actual tokens are obtained by calls on next()
Parameters:
input - the string to be tokenized
start - start point within the string
end - end point within the string (last character not read): -1 means end of string
lineNumber - the line number in the source where the expression appears
Throws:
XPathException - if a lexical error occurs, e.g. unmatched string quotes
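The calling pattern described above (prepare the input with tokenize, then pull tokens with next and read the public fields) can be sketched with a miniature stand-in. MiniTokenizer, its token codes, and its use of IllegalArgumentException are all hypothetical; the sketch only mirrors the shape of the API, including the convention that end == -1 means "to the end of the string", and is not Saxon's implementation.

```java
// Hypothetical miniature tokenizer, not Saxon's implementation.
final class MiniTokenizer {
    // Assumed token codes, for illustration only.
    static final int EOF = 0, NAME = 1, NUMBER = 2, STRING = 3, SYMBOL = 4;

    int currentToken = EOF;
    String currentTokenValue = null;

    private String input;
    private int inputOffset;
    private int inputLength;
    private int lineNumber;

    /** Prepare a string for tokenization; tokens are then obtained via next(). */
    void tokenize(String input, int start, int end, int lineNumber) {
        this.input = input;
        this.inputOffset = start;
        this.inputLength = (end == -1) ? input.length() : end;
        this.lineNumber = lineNumber;
    }

    /** Read the next token into currentToken and currentTokenValue. */
    void next() {
        while (inputOffset < inputLength && Character.isWhitespace(input.charAt(inputOffset))) {
            inputOffset++;
        }
        if (inputOffset >= inputLength) {
            currentToken = EOF;
            currentTokenValue = null;
            return;
        }
        int start = inputOffset;
        char c = input.charAt(inputOffset);
        if (Character.isLetter(c)) {
            while (inputOffset < inputLength && Character.isLetterOrDigit(input.charAt(inputOffset))) {
                inputOffset++;
            }
            currentToken = NAME;
            currentTokenValue = input.substring(start, inputOffset);
        } else if (Character.isDigit(c)) {
            while (inputOffset < inputLength && Character.isDigit(input.charAt(inputOffset))) {
                inputOffset++;
            }
            currentToken = NUMBER;
            currentTokenValue = input.substring(start, inputOffset);
        } else if (c == '"' || c == '\'') {
            int close = input.indexOf(c, inputOffset + 1);
            if (close < 0 || close >= inputLength) {
                // Lexical error: the opening quote has no matching close.
                throw new IllegalArgumentException("Unmatched quote in expression on line " + lineNumber);
            }
            currentToken = STRING;
            currentTokenValue = input.substring(inputOffset + 1, close);
            inputOffset = close + 1;
        } else {
            currentToken = SYMBOL;
            currentTokenValue = String.valueOf(c);
            inputOffset++;
        }
    }
}
```

A caller would typically loop, calling next() until currentToken reaches the end-of-input code, inspecting currentTokenValue on each pass.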
treatCurrentAsOperator
public void treatCurrentAsOperator()
Force the current token to be treated as an operator if possible
unreadChar
public void unreadChar()
Step back one character. If this steps back to a previous line, adjust the line number.
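The line-number adjustment can be illustrated with a small character cursor. CharCursor below is a hypothetical class, not Saxon code; it assumes that lines are separated by '\n' and that the line counter advances when a newline is read and retreats when one is unread.

```java
// Hypothetical character cursor, not part of Saxon. Assumes '\n' ends a line.
final class CharCursor {
    private final String input;
    private int inputOffset = 0;
    private int lineNumber;

    CharCursor(String input, int startLineNumber) {
        this.input = input;
        this.lineNumber = startLineNumber;
    }

    /** Read the next character; throws StringIndexOutOfBoundsException at end of input. */
    char nextChar() {
        char c = input.charAt(inputOffset);
        inputOffset++;
        if (c == '\n') {
            lineNumber++;
        }
        return c;
    }

    /** Step back one character, undoing the line increment if a newline is unread. */
    void unreadChar() {
        if (inputOffset == 0) {
            return; // nothing to unread
        }
        inputOffset--;
        if (input.charAt(inputOffset) == '\n') {
            lineNumber--;
        }
    }

    int lineNumber() {
        return lineNumber;
    }
}
```

Keeping the increment and decrement symmetric means a parser can read ahead past a line break, decide it went too far, and step back without the reported line number drifting.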