decomp.semantics.predpatt.corpus
Container classes for collections of PredPatt extractions and integration with UDS corpora.
Corpus management for PredPatt semantic extractions.
This module provides functionality for loading and managing collections of PredPatt semantic graphs from CoNLL-U format dependency corpora.
Classes
- PredPattCorpus
Container class extending the base Corpus for managing PredPatt semantic extractions paired with their dependency graphs.
- class PredPattCorpus
Bases: Corpus[tuple[PredPattEngine, DiGraph], DiGraph]
Container for managing collections of PredPatt semantic graphs.
This class extends the base Corpus class to handle PredPatt extractions paired with their dependency graphs. It provides methods for loading corpora from CoNLL-U format and converting them to NetworkX graphs with semantic annotations.
- classmethod from_conll(corpus, name='ewt', options=None)
Load a CoNLL-U dependency corpus and extract predicate-argument structures.
Parses Universal Dependencies format data and applies PredPatt extraction rules to identify predicates and their arguments. Each sentence in the corpus is processed to create a semantic graph.
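For orientation, the sketch below shows what "each sentence in the corpus" means at the level of the input format: CoNLL-U separates sentences with blank lines, marks comments with a leading `#`, and gives each token ten tab-separated fields. The helper `split_sentences` and the sample data are hypothetical and written only for illustration; this is not how decomp or PredPatt parses the data internally.

```python
import re

# The ten standard CoNLL-U column names, in order.
FIELDS = ("id", "form", "lemma", "upos", "xpos",
          "feats", "head", "deprel", "deps", "misc")

# Hand-written two-sentence sample (illustrative, not from a real treebank).
SAMPLE = (
    "# text = Dogs bark\n"
    "1\tDogs\tdog\tNOUN\t_\t_\t2\tnsubj\t_\t_\n"
    "2\tbark\tbark\tVERB\t_\t_\t0\troot\t_\t_\n"
    "\n"
    "# text = Cats sleep\n"
    "1\tCats\tcat\tNOUN\t_\t_\t2\tnsubj\t_\t_\n"
    "2\tsleep\tsleep\tVERB\t_\t_\t0\troot\t_\t_\n"
)


def split_sentences(conllu_text):
    """Split raw CoNLL-U text into sentences, each a list of token dicts."""
    sentences = []
    # Sentences are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", conllu_text.strip()):
        tokens = []
        for line in block.splitlines():
            # Skip empty lines and sentence-level comments.
            if not line.strip() or line.startswith("#"):
                continue
            cols = line.split("\t")
            if len(cols) != len(FIELDS):
                raise ValueError(f"expected 10 columns, got {len(cols)}: {line!r}")
            tokens.append(dict(zip(FIELDS, cols)))
        if tokens:
            sentences.append(tokens)
    return sentences


sentences = split_sentences(SAMPLE)
```

Each entry of `sentences` corresponds to one dependency parse. In the terms used above, this is the unit for which one semantic graph is created.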
- Parameters:
corpus (str | TextIO) – Path to a .conllu file, raw CoNLL-U formatted string, or open file handle
name (str, optional) – Corpus name used as prefix for graph identifiers. Default is ‘ewt’
options (PredPattOpts | None, optional) – Configuration options for PredPatt extraction. If None, uses default options with relative clause resolution and argument borrowing enabled
- Returns:
Corpus containing PredPatt extractions and their graphs
- Return type:
- Raises:
ValueError – If PredPatt cannot parse the provided CoNLL-U data, likely due to incompatible Universal Dependencies version
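A minimal usage sketch, based only on the signature and the ValueError documented above. The import path is assumed from the module name at the top of this page, and the helper `load_predpatt`, the corpus name "demo", and the sample sentence are hypothetical. Running the call requires decomp to be installed.

```python
# A tiny hand-written CoNLL-U sentence (illustrative data, not from a treebank).
CONLLU = (
    "# sent_id = demo-1\n"
    "# text = Dogs bark .\n"
    "1\tDogs\tdog\tNOUN\tNNS\t_\t2\tnsubj\t_\t_\n"
    "2\tbark\tbark\tVERB\tVBP\t_\t0\troot\t_\t_\n"
    "3\t.\t.\tPUNCT\t.\t_\t2\tpunct\t_\t_\n"
    "\n"
)


def load_predpatt(corpus, name="demo"):
    """Try to build a PredPattCorpus; return None if the data cannot be parsed.

    `corpus` may be a path to a .conllu file, a raw CoNLL-U string,
    or an open file handle, as listed under Parameters.
    """
    # Imported lazily so the sketch can be read without decomp installed.
    # The import path is an assumption taken from this page's module name.
    from decomp.semantics.predpatt.corpus import PredPattCorpus

    try:
        # options is left as None, so the documented defaults apply.
        return PredPattCorpus.from_conll(corpus, name=name)
    except ValueError as err:
        # Documented failure mode: PredPatt cannot parse the CoNLL-U data.
        print(f"PredPatt could not parse the data: {err}")
        return None
```

Calling `load_predpatt(CONLLU)` passes a raw string. A file path or an open handle such as `io.StringIO(CONLLU)` can be passed in the same position. Catching ValueError at the call site is a design choice for this sketch; letting it propagate may be preferable when an incompatible Universal Dependencies version should stop processing.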