decomp.semantics.uds.document

Module for representing UDS documents with sentence-level and document-level graphs.

This module provides the UDSDocument class for managing Universal Decompositional Semantics (UDS) documents. Each document contains:

  • A collection of sentence-level graphs (UDSSentenceGraph)

  • A document-level graph (UDSDocumentGraph) connecting nodes across sentences

  • Metadata including document name, genre, and timestamp

  • Methods for adding sentences and annotations to the document

The document structure preserves the hierarchical relationship between documents and their constituent sentences while enabling document-level semantic annotations.

class UDSDocument[source]

Bases: object

A Universal Decompositional Semantics document.

Parameters:
  • sentence_graphs (SentenceGraphDict) – the UDSSentenceGraphs associated with each sentence in the document

  • sentence_ids (SentenceIDDict) – the UD sentence IDs for each graph

  • name (str) – the name of the document (i.e. the UD document ID)

  • genre (str) – the genre of the document (e.g. weblog)

  • timestamp (str | None, default: None) – the timestamp of the UD document on which this UDSDocument is based

  • doc_graph (UDSDocumentGraph | None, default: None) – the NetworkX DiGraph for the document. If not provided, a document graph with no edges is initialized from sentence_graphs

__init__(sentence_graphs, sentence_ids, name, genre, timestamp=None, doc_graph=None)[source]

to_dict()[source]

Convert the document graph to a dictionary.

Returns:

NetworkX adjacency data format for the document graph

Return type:

NetworkXGraphData
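For orientation, the adjacency data format is a plain dictionary of JSON-friendly types. The sketch below hand-builds such a dictionary for a two-node document graph instead of calling decomp or NetworkX. The top-level keys follow networkx.adjacency_data; the node IDs and the edges helper are hypothetical and not part of the library.

```python
import json

# Hand-written stand-in for what to_dict() might return for a tiny
# document graph. The node IDs below are hypothetical.
doc_graph_data = {
    "directed": True,
    "multigraph": False,
    "graph": [],
    "nodes": [
        {"id": "example-doc-1-document-pred-2"},
        {"id": "example-doc-2-document-arg-3"},
    ],
    # adjacency[i] lists the out-neighbours of nodes[i]
    "adjacency": [
        [{"id": "example-doc-2-document-arg-3"}],
        [],
    ],
}

# Only JSON-friendly types are used, so the data survives a round trip.
restored = json.loads(json.dumps(doc_graph_data))


def edges(data):
    """Recover (source, target) pairs from adjacency-format data."""
    return [
        (node["id"], neighbour["id"])
        for node, neighbours in zip(data["nodes"], data["adjacency"])
        for neighbour in neighbours
    ]
```

Because the sentence graphs are not part of this structure, a dictionary like this one is not enough on its own to rebuild a full document.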

classmethod from_dict(document, sentence_graphs, sentence_ids, name='UDS')[source]

Construct a UDSDocument from a dictionary.

Because only the document graph is serialized, the sentence graphs must also be passed to this method so that they can be associated with the document.

Parameters:
  • document (dict[str, dict]) – a dictionary constructed by networkx.adjacency_data, containing the graph for the document

  • sentence_graphs (dict[str, UDSSentenceGraph]) – a dictionary containing (possibly a superset of) the sentence-level graphs for the sentences in the document

  • sentence_ids (dict[str, str]) – a dictionary containing (possibly a superset of) the UD sentence IDs for each graph

  • name (str, default: 'UDS') – identifier to prepend to node IDs

Return type:

UDSDocument

add_sentence_graphs(sentence_graphs, sentence_ids)[source]

Add sentence graphs to the document.

Creates document-level nodes for each semantics node in the sentence graphs and updates the sentence graph metadata with document information.

Parameters:
  • sentence_graphs (SentenceGraphDict) – Dictionary mapping graph names to UDSSentenceGraph objects

  • sentence_ids (SentenceIDDict) – Dictionary mapping graph names to UD sentence identifiers

Return type:

None
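The bookkeeping described above can be pictured with plain dictionaries. This is a minimal sketch, not the library's implementation: sketch_add_sentence_graphs is a hypothetical helper, sentence graphs are reduced to dicts of semantics nodes, and both the node-ID naming scheme and the metadata keys are assumptions.

```python
def sketch_add_sentence_graphs(doc_nodes, sentence_graphs, sentence_ids, doc_name):
    """Register one document-level node per semantics node.

    sentence_graphs maps graph name -> {semantics node ID: attributes},
    a plain-dict stand-in for UDSSentenceGraph objects.
    Returns per-sentence metadata (hypothetical keys).
    """
    sentence_meta = {}
    for graph_name, semantics_nodes in sentence_graphs.items():
        # Record which document and UD sentence this graph belongs to.
        sentence_meta[graph_name] = {
            "document_id": doc_name,
            "sentence_id": sentence_ids[graph_name],
        }
        for node_id in semantics_nodes:
            # Assumed naming scheme: swap the domain marker in the ID.
            doc_node_id = node_id.replace("semantics", "document")
            doc_nodes[doc_node_id] = {
                "domain": "document",
                # Back-reference to the sentence graph and its node.
                "semantics": {"graph": graph_name, "node": node_id},
            }
    return sentence_meta


doc_nodes = {}
meta = sketch_add_sentence_graphs(
    doc_nodes,
    {"example-doc-1": {"example-doc-1-semantics-pred-2": {}}},
    {"example-doc-1": "example-sent-0001"},
    doc_name="example-doc",
)
```

No edges are created in this step; in the sketch, as in the description above, document-level edges arrive only through annotations.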

add_annotation(node_attrs, edge_attrs)[source]

Add annotations to the document-level graph.

Delegates to the document graph’s add_annotation method, passing along the sentence IDs for validation.

Parameters:
  • node_attrs (dict[str, NodeAttributes]) – Node annotations keyed by node ID

  • edge_attrs (dict[EdgeKey, EdgeAttributes]) – Edge annotations keyed by (source, target) tuples

Return type:

None

semantics_node(document_node)[source]

Get the semantics node corresponding to a document node.

Document nodes maintain references to their corresponding semantics nodes through the ‘semantics’ attribute, which contains the graph name and node ID.

Parameters:

document_node (str) – The document domain node ID

Returns:

Single-item dict mapping node ID to its attributes

Return type:

dict[str, BasicNodeAttrs]

Raises:
  • TypeError – If the semantics attribute is not a dictionary

  • KeyError – If required keys are missing from semantics dict
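The lookup and its two failure modes can be sketched with plain dictionaries. This is an illustration only: sketch_semantics_node is a hypothetical helper, the key names "graph" and "node" inside the semantics attribute are assumptions, and the node attributes are invented.

```python
def sketch_semantics_node(doc_nodes, sentence_graphs, document_node):
    """Resolve a document node to its semantics node.

    Returns a single-item dict mapping the semantics node ID to its
    attributes, mirroring the documented return shape.
    """
    semantics = doc_nodes[document_node]["semantics"]
    if not isinstance(semantics, dict):
        raise TypeError("the 'semantics' attribute must be a dictionary")
    # Indexing raises KeyError if either assumed key is missing.
    graph_name = semantics["graph"]
    node_id = semantics["node"]
    return {node_id: sentence_graphs[graph_name][node_id]}


# Plain-dict stand-ins for sentence graphs and document nodes.
sentence_graphs = {
    "example-doc-1": {
        "example-doc-1-semantics-pred-2": {
            "domain": "semantics",
            "type": "predicate",
        }
    }
}
doc_nodes = {
    "example-doc-1-document-pred-2": {
        "semantics": {
            "graph": "example-doc-1",
            "node": "example-doc-1-semantics-pred-2",
        }
    },
    "bad-type": {"semantics": "not-a-dict"},
    "missing-key": {"semantics": {"graph": "example-doc-1"}},
}

result = sketch_semantics_node(
    doc_nodes, sentence_graphs, "example-doc-1-document-pred-2"
)
```

Returning a one-entry dictionary rather than the bare attributes keeps the semantics node ID available to the caller.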

property text: str

The full document text reconstructed from sentences.

Concatenates the text from all sentence graphs in sorted order with space separation.

Returns:

The complete document text

Return type:

str
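A minimal sketch of the concatenation, assuming that "sorted order" means sorting by graph name as a string; document_text and the sentence data are hypothetical stand-ins for the property and the sentence graphs.

```python
def document_text(sentences):
    """Join sentence texts in sorted key order, separated by spaces.

    sentences maps graph name -> sentence text.
    """
    return " ".join(text for _, text in sorted(sentences.items()))


sentences = {
    "doc-2": "It rained.",
    "doc-1": "We left early.",
    "doc-10": "Nobody minded.",
}
text = document_text(sentences)
```

Under that assumption the ordering is lexicographic, so "doc-10" sorts before "doc-2" and the sketch yields "We left early. Nobody minded. It rained." Whether the library orders sentences this way should be checked against its source.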