Extending the parser
Modules such as page3 extend the CSS 2.1 parser to add support for CSS 3 syntax. They do so by sub-classing css21.CSS21Parser and overriding or extending some of its methods. In fact, the parser is made of methods in a class (rather than a set of functions) solely to enable this kind of sub-classing.
tinycss is designed to let you keep parser subclasses outside of tinycss, without monkey-patching. If, however, the syntax you added is for a W3C specification, consider including your subclass in a new tinycss module and sending a pull request: see Hacking tinycss.
Example: star hack
The star hack uses invalid declarations that are only parsed by some versions of Internet Explorer. By default, tinycss ignores invalid declarations and logs an error.
>>> from tinycss.css21 import CSS21Parser
>>> css = '#elem { width: [W3C Model Width]; *width: [BorderBox Model]; }'
>>> stylesheet = CSS21Parser().parse_stylesheet(css)
>>> stylesheet.errors
[ParseError('Parse error at 1:35, expected a property name, got DELIM',)]
>>> [decl.name for decl in stylesheet.rules[0].declarations]
['width']
If, for example, a minifier based on tinycss wants to support the star hack, it can do so by extending the parser:
>>> class CSSStarHackParser(CSS21Parser):
...     def parse_declaration(self, tokens):
...         has_star_hack = (tokens[0].type == 'DELIM' and tokens[0].value == '*')
...         if has_star_hack:
...             tokens = tokens[1:]
...         declaration = super(CSSStarHackParser, self).parse_declaration(tokens)
...         declaration.has_star_hack = has_star_hack
...         return declaration
...
>>> stylesheet = CSSStarHackParser().parse_stylesheet(css)
>>> stylesheet.errors
[]
>>> [(d.name, d.has_star_hack) for d in stylesheet.rules[0].declarations]
[('width', False), ('width', True)]
This class extends the parse_declaration() method. It removes any * delimiter Token at the start of a declaration, and adds a has_star_hack boolean attribute on parsed Declaration objects: True if a * was removed, False for “normal” declarations.
Parser methods
In addition to methods of the user API (see Parsing a stylesheet), here are the methods of the CSS 2.1 parser that can be overridden or extended:
- CSS21Parser.parse_rules(tokens, context)
  Parse a sequence of rules (rulesets and at-rules).
  Parameters:
  - tokens – An iterable of tokens.
  - context – Either 'stylesheet' or an at-keyword such as '@media'. (Most at-rules are only allowed in some contexts.)
  Returns: A tuple of a list of parsed rules and a list of ParseError.
- CSS21Parser.read_at_rule(at_keyword_token, tokens)
  Read an at-rule from a token stream.
  Parameters:
  - at_keyword_token – The ATKEYWORD token that starts this at-rule. You may have read it already to distinguish the rule from a ruleset.
  - tokens – An iterator of subsequent tokens. Will be consumed just enough for one at-rule.
  Returns: An unparsed AtRule.
  Raises: ParseError if the head is invalid for the core grammar. The body is not validated. See AtRule.
- CSS21Parser.parse_at_rule(rule, previous_rules, errors, context)
  Parse an at-rule.
  Subclasses that override this method must use super() and pass its return value for at-rules they do not know.
  In CSS 2.1, this method handles @charset, @import, @media and @page rules.
  Parameters:
  - rule – An unparsed AtRule.
  - previous_rules – The list of at-rules and rulesets that have been parsed so far in this context. This list can be used to decide if the current rule is valid. (For example, @import rules are only allowed before anything but a @charset rule.)
  - context – Either 'stylesheet' or an at-keyword such as '@media'. (Most at-rules are only allowed in some contexts.)
  Raises: ParseError if the rule is invalid.
  Returns: A parsed at-rule
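The requirement to defer to super() for unknown at-rules can be illustrated without tinycss installed. This is only a minimal sketch: BaseParser, FontFaceParser, the @font-face handling, and the simplified AtRule and ParseError classes are hypothetical stand-ins, not tinycss code. Only the parse_at_rule() signature is taken from the description above.

```python
class ParseError(ValueError):
    """Simplified stand-in for tinycss's ParseError (assumption)."""


class AtRule(object):
    """Simplified stand-in for an unparsed at-rule (assumption)."""
    def __init__(self, at_keyword, head, body):
        self.at_keyword = at_keyword
        self.head = head
        self.body = body


class BaseParser(object):
    """Stand-in for CSS21Parser; in this sketch it knows no at-rules."""
    def parse_at_rule(self, rule, previous_rules, errors, context):
        raise ParseError('unknown at-rule in %s context: %s'
                         % (context, rule.at_keyword))


class FontFaceParser(BaseParser):
    """Hypothetical subclass that adds support for one at-rule."""
    def parse_at_rule(self, rule, previous_rules, errors, context):
        if rule.at_keyword == '@font-face':
            if context != 'stylesheet':
                raise ParseError('@font-face rule not allowed in %s' % context)
            # A real parser would parse rule.body here; the sketch
            # returns it unchanged.
            return ('font-face', rule.body)
        # Unknown at-rule: defer to the parent class and pass on its result.
        return super(FontFaceParser, self).parse_at_rule(
            rule, previous_rules, errors, context)
```

Because the subclass falls through to the parent for everything it does not recognize, several such subclasses can be stacked, each handling only its own at-keywords.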
- CSS21Parser.parse_media(tokens)
  For CSS 2.1, parse a list of media types.
  Media Queries are expected to override this.
  Parameters: tokens – A list of tokens
  Raises: ParseError on invalid media types/queries
  Returns: For CSS 2.1, a list of media types as strings
- CSS21Parser.parse_page_selector(tokens)
  Parse an @page selector.
  Parameters: tokens – An iterable of tokens, typically from the head attribute of an unparsed AtRule.
  Returns: A page selector. For CSS 2.1, this is 'first', 'left', 'right' or None.
  Raises: ParseError on invalid selectors
- CSS21Parser.parse_declarations_and_at_rules(tokens, context)
  Parse a mixed list of declarations and at-rules, as found e.g. in the body of an @page rule.
  Note that to add supported at-rules inside @page, CSSPage3Parser extends parse_at_rule(), not this method.
  Parameters:
  - tokens – An iterable of tokens, typically from the body attribute of an unparsed AtRule.
  - context – An at-keyword such as '@page'. (Most at-rules are only allowed in some contexts.)
  Returns: A tuple of:
  - A list of Declaration
  - A list of parsed at-rules (empty for CSS 2.1)
  - A list of ParseError
- CSS21Parser.parse_ruleset(first_token, tokens)
  Parse a ruleset: a selector followed by a declaration block.
  Parameters:
  - first_token – The first token of the ruleset (probably of the selector). You may have read it already to distinguish the rule from an at-rule.
  - tokens – An iterator of subsequent tokens. Will be consumed just enough for one ruleset.
  Returns: A tuple of a RuleSet and an error list. The errors are ParseError objects recovered in declarations. (Parsing continues from the next declaration on such errors.)
  Raises: ParseError if the selector is invalid for the core grammar. Note that a selector can be valid for the core grammar but not for CSS 2.1 or another level.
- CSS21Parser.parse_declaration_list(tokens)
  Parse a ;-separated declaration list.
  If the same context can contain more than just declarations, you may want to use parse_declarations_and_at_rules() (or some other method that uses parse_declaration() directly) instead.
  Parameters: tokens – An iterable of tokens. Should stop at (before) the end of the block, as marked by }.
  Returns: A tuple of the list of valid Declaration and a list of ParseError
- CSS21Parser.parse_declaration(tokens)
  Parse a single declaration.
  Parameters: tokens – An iterable of at least one token. Should stop at (before) the end of the declaration, as marked by a ; or }. Empty declarations (i.e. consecutive ; with only white space in-between) should be skipped earlier and not passed to this method.
  Returns: A Declaration
  Raises: ParseError if the tokens do not match the ‘declaration’ production of the core grammar.
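The division of labour between parse_declaration_list() and parse_declaration() described above (split on ;, skip empty declarations, recover from invalid ones) can be sketched with a toy grammar. Everything below is a hypothetical stand-in rather than tinycss code: the Token tuple, the simplified ParseError, and both functions are written for illustration, and the token type names 'S', ':' and ';' are assumptions.

```python
from collections import namedtuple

# Hypothetical stand-ins; real tinycss tokens and errors carry more data.
Token = namedtuple('Token', ['type', 'value'])


class ParseError(ValueError):
    pass


def parse_declaration(tokens):
    """Toy version: expects IDENT, ':', then at least one value token."""
    if tokens[0].type != 'IDENT':
        raise ParseError('expected a property name, got %s' % tokens[0].type)
    if len(tokens) < 3 or tokens[1].type != ':':
        raise ParseError("expected ':' and a value")
    return tokens[0].value, [token.value for token in tokens[2:]]


def parse_declaration_list(tokens):
    """Split on ';', skip empty parts, recover from invalid declarations."""
    parts = [[]]
    for token in tokens:
        if token.type == ';':
            parts.append([])
        elif token.type != 'S':  # the toy grammar ignores all whitespace
            parts[-1].append(token)
    declarations, errors = [], []
    for part in parts:
        if not part:
            continue  # empty declaration: never passed to parse_declaration()
        try:
            declarations.append(parse_declaration(part))
        except ParseError as exc:
            errors.append(exc)  # record the error, go on with the next part
    return declarations, errors


# Tokens for "width: 10px; *width: 20px;;"
tokens = [
    Token('IDENT', 'width'), Token(':', ':'), Token('S', ' '),
    Token('DIMENSION', '10px'), Token(';', ';'), Token('S', ' '),
    Token('DELIM', '*'), Token('IDENT', 'width'), Token(':', ':'),
    Token('S', ' '), Token('DIMENSION', '20px'), Token(';', ';'),
    Token(';', ';'),
]
declarations, errors = parse_declaration_list(tokens)
```

In this sketch the star-hack declaration ends up in errors while the valid width declaration is still returned, which mirrors the recovery behaviour shown in the star hack example.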
Unparsed at-rules
- class tinycss.css21.AtRule(at_keyword, head, body, line, column)
  An unparsed at-rule.
  - at_keyword
    The normalized (lower-case) at-keyword as a string. E.g.: '@page'
  - head
    The part of the at-rule between the at-keyword and the { marking the body, or the ; marking the end of an at-rule without a body. A TokenList.
  - body
    The content of the body between { and } as a TokenList, or None if there is no body (i.e. if the rule ends with ;).
  The head was validated against the core grammar but not the body, as the body might contain declarations. In case of an error in a declaration, parsing should continue from the next declaration. The whole rule should not be ignored as it would be for an error in the head.
  These at-rules are expected to be parsed further before reaching the user API.
Parsing helper functions
The tinycss.parsing module contains helper functions for parsing tokens into a more structured form:
- tinycss.parsing.strip_whitespace(tokens)
  Remove whitespace at the beginning and end of a token list.
  Whitespace tokens in-between other tokens in the list are preserved.
  Parameters: tokens – A list of Token or ContainerToken.
  Returns: A new sub-sequence of the list.
- tinycss.parsing.split_on_comma(tokens)
  Split a list of tokens on commas, i.e. , DELIM tokens.
  Only “top-level” comma tokens are splitting points, not commas inside a function or other ContainerToken.
  Parameters: tokens – An iterable of Token or ContainerToken.
  Returns: A list of lists of tokens
- tinycss.parsing.validate_value(tokens)
  Validate a property value.
  Parameters: tokens – An iterable of tokens
  Raises: ParseError if there is any invalid token for the ‘value’ production of the core grammar.
- tinycss.parsing.validate_block(tokens, context)
  Raises: ParseError if there is any invalid token for the ‘block’ production of the core grammar.
  Parameters:
  - tokens – An iterable of tokens
  - context – A string for the ‘unexpected in ...’ message
- tinycss.parsing.validate_any(token, context)
  Raises: ParseError if this is an invalid token for the ‘any’ production of the core grammar.
  Parameters:
  - token – A single token
  - context – A string for the ‘unexpected in ...’ message