3.12.1. sciexp2.common.text.Extractor
Methods
extract(text) | Apply extraction to given text.
- class Extractor(template)
Bases: object
Extract a dict with the variable values that match a given template.
Variables and sections in the template are translated into regular expressions that follow Python’s syntax.
- Parameters:
- template : str
Template text that defines what to extract.
- extract(text)
Apply extraction to given text.
- Parameters:
- text : str
Text to extract from.
Examples
You can perform simple text extractions, where variables correspond to the simple regex .+:
>>> e = Extractor('Hello {{a}}')
>>> e.extract('Hello world')
{'a': 'world'}
>>> e.extract('Hello 123!')
{'a': '123!'}
More complex regexes can be specified using section tags:
>>> Extractor('Hello {{#a}}[0-9]+{{/a}}.*').extract('Hello 123!')
{'a': 123}
And using the same variable on multiple tags ensures they all match the same contents:
>>> extracted = Extractor('{{#a}}[0-9]+{{/a}}.*{{a}}{{b}}').extract('123-123456')
>>> extracted == {'a': 123, 'b': 456}
True
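To make the template-to-regex idea above more concrete, here is a minimal standalone sketch built only on Python's re module. It is not sciexp2's implementation. The helper names template_to_regex and extract_sketch are hypothetical, and the sketch assumes that a plain variable becomes a named group matching .+, a section becomes a named group wrapping its body, a repeated name becomes a backreference, and matching is anchored at the start of the text. Unlike the documented examples, it leaves all values as strings and does not convert numeric matches to integers.

```python
import re

# Hypothetical helpers for illustration only; not part of sciexp2.
# Matches either a section {{#name}}body{{/name}} or a variable {{name}}.
_TAG = re.compile(r"\{\{#(\w+)\}\}(.*?)\{\{/\1\}\}|\{\{(\w+)\}\}")


def template_to_regex(template):
    """Translate a template into a compiled regex (assumed mapping)."""
    seen = set()
    parts = []
    pos = 0
    for m in _TAG.finditer(template):
        # Text between tags is treated as regex syntax and copied unchanged.
        parts.append(template[pos:m.start()])
        section_name, section_body, var_name = m.groups()
        name = section_name or var_name
        if name in seen:
            # A repeated name must match the same contents: backreference.
            parts.append("(?P=%s)" % name)
        elif section_name:
            # A section supplies its own regex for the variable.
            parts.append("(?P<%s>%s)" % (name, section_body))
            seen.add(name)
        else:
            # A plain variable defaults to the simple regex ".+".
            parts.append("(?P<%s>.+)" % name)
            seen.add(name)
        pos = m.end()
    parts.append(template[pos:])
    return re.compile("".join(parts))


def extract_sketch(template, text):
    """Return a dict of matched variables, or None if the text does not match."""
    m = template_to_regex(template).match(text)
    return m.groupdict() if m else None
```

With this sketch, extract_sketch('Hello {{a}}', 'Hello world') yields a dict mapping 'a' to 'world', and the repeated-variable template from the last example splits '123-123456' into the strings '123' and '456'.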