decomp.syntax.dependency

Module for building and containing dependency trees from CoNLL format.

This module provides functionality to parse CoNLL-U and CoNLL-X formatted dependency parse data and convert it into NetworkX DiGraph structures for further processing within the decomp package.

Classes

CoNLLDependencyTreeCorpus

Corpus containing dependency trees built from CoNLL data.

DependencyGraphBuilder

Builder class for constructing dependency graphs from CoNLL format.

Type Aliases

ConllRow

Type alias for a single row of CoNLL data as a list of strings.

ConllData

Type alias for complete CoNLL data as a list of ConllRow entries.
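
The two aliases can be pictured with plain Python lists. The following is a minimal sketch, assuming tab-separated CoNLL-U input. The helper `parse_conll` and the sample sentence are hypothetical and not part of decomp, and the aliases are redefined locally rather than imported from the module.

```python
# Local stand-ins for the module's aliases (assumption: they resolve to
# these list types, as the descriptions above state).
ConllRow = list[str]
ConllData = list[ConllRow]


def parse_conll(text: str) -> ConllData:
    """Split CoNLL text into rows of column strings (hypothetical helper)."""
    rows: ConllData = []
    for line in text.splitlines():
        # Skip blank lines and comment lines such as "# text = ...".
        if not line.strip() or line.startswith("#"):
            continue
        rows.append(line.split("\t"))
    return rows


# Two-token sample sentence in CoNLL-U layout (ten columns per row).
SAMPLE = (
    "# text = Dogs bark\n"
    "1\tDogs\tdog\tNOUN\tNNS\tNumber=Plur\t2\tnsubj\t_\t_\n"
    "2\tbark\tbark\tVERB\tVBP\t_\t0\troot\t_\t_\n"
)

rows = parse_conll(SAMPLE)
print(len(rows), len(rows[0]))
```

Each element of `rows` plays the role of a ConllRow, and `rows` as a whole plays the role of ConllData.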

Constants

CONLL_HEAD

Column headers for CoNLL-U (‘u’) and CoNLL-X (‘x’) formats.

CONLL_NODE_ATTRS

Node attribute mappings for different CoNLL format versions.

CONLL_EDGE_ATTRS

Edge attribute mappings for different CoNLL format versions.
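
To illustrate what per-format column headers are used for, the sketch below labels one row with column names. The names are taken from the CoNLL-U and CoNLL-X format specifications; the dictionary `COLUMNS` and the helper `label_row` are hypothetical, and the exact strings stored in CONLL_HEAD may differ.

```python
# Column names from the CoNLL-U ("u") and CoNLL-X ("x") format specifications.
# Assumption: decomp's CONLL_HEAD is keyed by "u" and "x" in a similar way,
# but its exact contents are not reproduced here.
COLUMNS = {
    "u": ["id", "form", "lemma", "upos", "xpos",
          "feats", "head", "deprel", "deps", "misc"],
    "x": ["id", "form", "lemma", "cpostag", "postag",
          "feats", "head", "deprel", "phead", "pdeprel"],
}


def label_row(row, spec="u"):
    """Pair each column value with its header name (hypothetical helper)."""
    return dict(zip(COLUMNS[spec], row))


row = "2\tbark\tbark\tVERB\tVBP\t_\t0\troot\t_\t_".split("\t")
labelled = label_row(row, spec="u")
print(labelled["form"], labelled["head"], labelled["deprel"])
```

In both formats the `head` column holds the index of the governing token and `deprel` holds the relation label. That is the information a dependency edge needs, while the remaining columns describe the token itself.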

class CoNLLDependencyTreeCorpus

Bases: Corpus[ConllData, DiGraph]

Class for building and containing dependency trees from CoNLL-U.

graphs

Trees constructed from annotated sentences.

graphids

Identifiers for the trees constructed from annotated sentences.

ngraphs

Number of graphs in the corpus.

class DependencyGraphBuilder

Bases: object

A dependency graph builder.

classmethod from_conll(conll, treeid='', spec='u')

Build DiGraph from a CoNLL representation.

Parameters:
  • conll (ConllData) – the CoNLL representation, given as a list of rows where each row is a list of strings

  • treeid (str, default: '') – a unique identifier for the tree

  • spec (str, default: 'u') – the specification the CoNLL representation is assumed to follow: “u” for CoNLL-U or “x” for CoNLL-X

Return type:

DiGraph
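
A call to from_conll might look like the sketch below. It assumes decomp is installed and that the class can be imported from decomp.syntax.dependency, as the module name at the top of this page suggests. The wrapper `build_tree`, the tree identifier, and the sample rows are hypothetical.

```python
def build_tree(rows, treeid="example-1", spec="u"):
    """Call the documented classmethod (hypothetical wrapper)."""
    # Imported lazily so the sketch still loads when decomp is absent.
    from decomp.syntax.dependency import DependencyGraphBuilder

    return DependencyGraphBuilder.from_conll(rows, treeid=treeid, spec=spec)


# ConllData for a two-token sentence in CoNLL-U layout.
rows = [
    ["1", "Dogs", "dog", "NOUN", "NNS", "Number=Plur", "2", "nsubj", "_", "_"],
    ["2", "bark", "bark", "VERB", "VBP", "_", "0", "root", "_", "_"],
]

try:
    tree = build_tree(rows)
except ImportError:
    tree = None  # decomp (or one of its dependencies) is not installed
else:
    # The return type is documented as a NetworkX DiGraph.
    print(type(tree).__name__, tree.number_of_nodes(), tree.number_of_edges())
```

Passing spec="x" instead would tell the builder to read the same row layout using the CoNLL-X column order.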