decomp.semantics.uds.metadata

Metadata structures for Universal Decompositional Semantics (UDS) annotations.

This module defines the metadata infrastructure used to describe and validate UDS semantic annotations across sentence and document graphs. It provides a flexible type system that supports both categorical and continuous values with optional bounds and ordering constraints.

Key Components

Type System
  • PrimitiveType: Base types supported in UDS (str, int, bool, float)

  • UDSDataTypeDict: Dictionary format for serializing data types

  • UDSDataType: Wrapper for primitive types with categorical support

Property Metadata
  • PropertyMetadataDict: Dictionary format for property metadata

  • UDSPropertyMetadata: Metadata for individual semantic properties

Annotation Metadata
  • AnnotationMetadataDict: Dictionary format for annotation metadata

  • UDSAnnotationMetadata: Collection of properties organized by subspace

  • UDSCorpusMetadata: Complete metadata for sentence and document graphs

The metadata system ensures consistency across UDS corpora by tracking:
  • Property names and their expected data types

  • Categorical values and their ordering

  • Numeric bounds for continuous properties

  • Confidence score types for uncertain annotations

  • Subspace organization of semantic properties

See also

decomp.semantics.uds.annotation

Annotation classes that use this metadata

decomp.semantics.uds.corpus

Corpus classes that store metadata

class UDSDataType

Bases: object

A wrapper around builtin datatypes with support for categorical values.

This class provides a minimal extension of basic builtin datatypes for representing categorical datatypes with optional ordering and bounds. It serves as a lightweight alternative to pandas categorical types.

Parameters:
  • datatype (type[PrimitiveType]) – A builtin datatype (str, int, bool, or float).

  • categories (list[PrimitiveType] | None, optional) – The allowed values for categorical datatypes. Required if ordered is True.

  • ordered (bool | None, optional) – Whether this categorical datatype has an ordering. Required if categories is specified.

  • lower_bound (float | None, optional) – The lower bound value for numeric types. Can be specified independently of categories. If both categories and lower_bound are specified, the datatype must be ordered and bounds must match category bounds.

  • upper_bound (float | None, optional) – The upper bound value for numeric types. Can be specified independently of categories. If both categories and upper_bound are specified, the datatype must be ordered and bounds must match category bounds.
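
The interplay between categories, ordered, and the two bounds is easier to see written out as checks. The following is a minimal sketch of the rules as stated above, not the library's implementation: check_datatype_args is a hypothetical helper, raising ValueError is an assumption, and "bounds must match category bounds" is read here as "the bounds equal the first and last category of the ordered list".

```python
from typing import Optional, Sequence, Union

Primitive = Union[str, int, bool, float]


def check_datatype_args(
    categories: Optional[Sequence[Primitive]] = None,
    ordered: Optional[bool] = None,
    lower_bound: Optional[float] = None,
    upper_bound: Optional[float] = None,
) -> None:
    """Hypothetical checker for the documented argument rules (not decomp code)."""
    # categories and ordered depend on each other.
    if ordered and categories is None:
        raise ValueError("ordered=True requires categories")
    if categories is not None and ordered is None:
        raise ValueError("categories requires ordered to be set")

    # Bounds may stand alone; the remaining rules only apply with categories.
    has_bound = lower_bound is not None or upper_bound is not None
    if categories is None or not has_bound:
        return
    if not ordered:
        raise ValueError("bounds with categories require an ordered datatype")

    # Assumption: "category bounds" means the first and last ordered category
    # of a non-empty category list.
    if lower_bound is not None and categories[0] != lower_bound:
        raise ValueError("lower_bound does not match the first category")
    if upper_bound is not None and categories[-1] != upper_bound:
        raise ValueError("upper_bound does not match the last category")
```

Under these assumptions, check_datatype_args(lower_bound=0.0, upper_bound=1.0) passes because bounds can be given without categories, while check_datatype_args(categories=[1, 2], ordered=False, lower_bound=1) is rejected.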

datatype

The underlying primitive type.

Type:

type[PrimitiveType]

is_categorical

Whether this represents a categorical datatype.

Type:

bool

is_ordered_categorical

Whether this is an ordered categorical datatype.

Type:

bool

is_ordered_noncategorical

Whether this is ordered but not categorical (has bounds).

Type:

bool

lower_bound

The lower bound if specified.

Type:

float | None

upper_bound

The upper bound if specified.

Type:

float | None

categories

The categories as a set (unordered) or list (ordered).

Type:

set[PrimitiveType] | list[PrimitiveType] | None

__init__(datatype, categories=None, ordered=None, lower_bound=None, upper_bound=None)

__eq__(other)

Check equality based on dictionary representation.

Parameters:

other (object) – Object to compare with.

Returns:

True if both objects have the same dictionary representation.

Return type:

bool

classmethod from_dict(datatype)

Build a UDSDataType from a dictionary.

Parameters:

datatype (UDSDataTypeDict) – A dictionary representing a datatype. This dictionary must have at least a "datatype" key. It may also have "categories" and "ordered" keys, in which case it must have both.

Return type:

UDSDataType
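
As a rough illustration of the dictionary shape described above, the sketch below shows two candidate inputs and a key check. Several details are assumptions rather than documented facts: the primitive type is spelled as a string, the key holding the category list is written "categories" to mirror the constructor parameter, and has_consistent_keys is a hypothetical helper that is not part of decomp.

```python
# Hypothetical inputs; the string spelling of the primitive type is assumed.
plain = {"datatype": "float"}
ordinal = {"datatype": "int", "categories": [1, 2, 3, 4, 5], "ordered": True}


def has_consistent_keys(datatype: dict) -> bool:
    """Check the key rule stated above: 'datatype' is required, and the
    two categorical keys appear together or not at all."""
    if "datatype" not in datatype:
        return False
    return ("categories" in datatype) == ("ordered" in datatype)
```

Both plain and ordinal satisfy the rule, whereas a dictionary that carries "ordered" without the category list does not.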

to_dict()

Convert to dictionary representation.

Returns:

Dictionary with datatype info, excluding None values.

Return type:

UDSDataTypeDict
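
Dropping None values and comparing by dictionary representation can be sketched in a few lines. DataTypeSketch below is an illustrative stand-in with only three fields, not the decomp class; it assumes that "excluding None values" means unset fields are omitted from the result.

```python
from typing import Any, Optional


class DataTypeSketch:
    """Illustrative stand-in for a datatype wrapper (not the decomp class)."""

    def __init__(
        self,
        datatype: str,
        lower_bound: Optional[float] = None,
        upper_bound: Optional[float] = None,
    ) -> None:
        self.datatype = datatype
        self.lower_bound = lower_bound
        self.upper_bound = upper_bound

    def to_dict(self) -> dict[str, Any]:
        raw = {
            "datatype": self.datatype,
            "lower_bound": self.lower_bound,
            "upper_bound": self.upper_bound,
        }
        # Test against None explicitly so a bound of 0.0 is kept.
        return {key: val for key, val in raw.items() if val is not None}

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, DataTypeSketch):
            return NotImplemented
        # Equality is delegated to the dictionary representation.
        return self.to_dict() == other.to_dict()
```

Comparing through to_dict() keeps equality and serialization in agreement: two objects are equal exactly when they would serialize identically.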

class UDSPropertyMetadata

Bases: object

Metadata for a UDS property including value and confidence datatypes.

This class encapsulates the metadata for a single UDS property, including the datatypes for both the property value and the confidence score, as well as optional annotator information.

Parameters:
  • value (UDSDataType) – The datatype for property values.

  • confidence (UDSDataType) – The datatype for confidence scores.

  • annotators (set[str] | None, optional) – Set of annotator identifiers who provided annotations for this property.

value

The value datatype.

Type:

UDSDataType

confidence

The confidence datatype.

Type:

UDSDataType

annotators

The annotator identifiers.

Type:

set[str] | None

__init__(value, confidence, annotators=None)

__eq__(other)

Whether the value and confidence datatypes match and annotators are equal.

Parameters:

other (object) – The object to compare with, typically another UDSPropertyMetadata.

Return type:

bool

__add__(other)

Return a UDSPropertyMetadata with the union of annotators.

If the value and confidence datatypes don’t match, this raises an error.

Parameters:

other (UDSPropertyMetadata) – The other UDSPropertyMetadata.

Raises:

ValueError – Raised if the value and confidence datatypes don’t match.

Return type:

UDSPropertyMetadata
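
The addition semantics described above (union the annotators, refuse mismatched datatypes) can be sketched with a small stand-in class. PropertyMetaSketch is hypothetical and uses plain strings in place of UDSDataType objects; how the real class treats a missing annotator set is not stated here, so the None handling below is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PropertyMetaSketch:
    """Illustrative stand-in; strings replace the UDSDataType fields."""

    value: str
    confidence: str
    annotators: Optional[frozenset] = None

    def __add__(self, other: "PropertyMetaSketch") -> "PropertyMetaSketch":
        if (self.value, self.confidence) != (other.value, other.confidence):
            raise ValueError("value and confidence datatypes must match")
        # Assumption: None means "no annotator information"; the result is
        # None only when neither operand carries annotators.
        if self.annotators is None and other.annotators is None:
            merged = None
        else:
            merged = (self.annotators or frozenset()) | (
                other.annotators or frozenset()
            )
        return PropertyMetaSketch(self.value, self.confidence, merged)
```

Returning a new object rather than mutating either operand matches the documented return type and keeps both inputs usable after the merge.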

classmethod from_dict(metadata)

Build UDSPropertyMetadata from a dictionary.

Parameters:

metadata (PropertyMetadataDict) – A mapping from "value" and "confidence" to datatype dictionaries. May optionally include "annotators" mapping to a set of annotator identifiers.

Returns:

The constructed metadata object.

Return type:

UDSPropertyMetadata

Raises:
  • ValueError – If required fields (value, confidence) are missing.

  • TypeError – If fields have incorrect types.

to_dict()

Convert to dictionary representation.

Returns:

Dictionary with value, confidence, and optional annotators.

Return type:

PropertyMetadataDict
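
For orientation, the dictionary form of a single property might be assembled as in the sketch below. property_metadata_to_dict is a hypothetical helper, the nested datatype dictionaries are passed in ready-made, and omitting the "annotators" key when no annotators are recorded is an assumption about what "optional" means here.

```python
from typing import Any, Optional


def property_metadata_to_dict(
    value: dict[str, Any],
    confidence: dict[str, Any],
    annotators: Optional[set] = None,
) -> dict[str, Any]:
    """Hypothetical helper: build the dictionary form of one property."""
    out: dict[str, Any] = {"value": value, "confidence": confidence}
    # Assumption: the key is left out entirely when there are no annotators.
    if annotators is not None:
        out["annotators"] = annotators
    return out
```

A property with float values and float confidences but no annotator record would then serialize to a dictionary with just the "value" and "confidence" keys.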

class UDSAnnotationMetadata

Bases: object

The metadata for UDS properties by subspace.

Parameters:

metadata (dict[UDSSubspace, dict[str, UDSPropertyMetadata]]) – A mapping from subspaces to properties to datatypes and possibly annotators.

__init__(metadata)

__getitem__(k)

Get metadata by subspace or (subspace, property) tuple.

Parameters:

k (UDSSubspace | tuple[UDSSubspace, str]) – Either a subspace name or a (subspace, property) tuple.

Returns:

Property dict for subspace or specific property metadata.

Return type:

dict[str, UDSPropertyMetadata] | UDSPropertyMetadata

Raises:
  • TypeError – If key is not a string or 2-tuple.

  • KeyError – If subspace or property not found.
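
The two accepted key forms can be illustrated with a small dispatch on the key's type. AnnotationMetaSketch is a stand-in class written for this example, not the decomp class, and it stores arbitrary values where the real class stores UDSPropertyMetadata objects.

```python
from typing import Any, Union


class AnnotationMetaSketch:
    """Illustrative stand-in for subspace/property lookup (not decomp code)."""

    def __init__(self, metadata: dict[str, dict[str, Any]]) -> None:
        self._metadata = metadata

    def __getitem__(self, k: Union[str, tuple[str, str]]) -> Any:
        if isinstance(k, str):
            # A bare subspace name returns the whole property dict.
            return self._metadata[k]
        if isinstance(k, tuple) and len(k) == 2:
            # A (subspace, property) pair returns one property's metadata.
            subspace, prop = k
            return self._metadata[subspace][prop]
        raise TypeError("key must be a string or a 2-tuple")
```

With this shape, meta["factuality"] and meta["factuality", "factual"] both work, an unknown subspace or property surfaces as the dictionary's own KeyError, and any other key type raises TypeError.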

__eq__(other)

Check equality by comparing all subspaces and properties.

Parameters:

other (object) – Object to compare with.

Returns:

True if all subspaces, properties, and metadata match.

Return type:

bool

__add__(other)

Merge two metadata objects, combining annotators for shared properties.

Parameters:

other (UDSAnnotationMetadata) – Metadata to merge with this one.

Returns:

New metadata with merged properties and annotators.

Return type:

UDSAnnotationMetadata
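
The merge can be pictured on a simplified structure that maps subspaces to properties to annotator sets. The sketch below covers only the annotator-combining part described above; merge_annotators is a hypothetical function, and it leaves out datatype comparison entirely.

```python
def merge_annotators(
    left: dict[str, dict[str, set]],
    right: dict[str, dict[str, set]],
) -> dict[str, dict[str, set]]:
    """Hypothetical merge over subspace -> property -> annotator set."""
    # Copy the left side so neither input is modified.
    merged = {
        subspace: {prop: set(annotators) for prop, annotators in props.items()}
        for subspace, props in left.items()
    }
    for subspace, props in right.items():
        target = merged.setdefault(subspace, {})
        for prop, annotators in props.items():
            # Shared properties get the union; new ones are simply added.
            target[prop] = target.get(prop, set()) | set(annotators)
    return merged
```

Properties that appear on only one side are carried over unchanged, so the result covers every subspace and property found in either input.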

property metadata: dict[UDSSubspace, dict[str, UDSPropertyMetadata]]

The underlying metadata dictionary.

Returns:

Mapping from subspaces to properties to metadata.

Return type:

dict[UDSSubspace, dict[str, UDSPropertyMetadata]]

property subspaces: set[UDSSubspace]

Set of all subspace names.

Returns:

The subspace identifiers.

Return type:

set[UDSSubspace]

properties(subspace=None)

Return the properties in a subspace.

Parameters:

subspace (UDSSubspace | None, default: None) – The subspace to get the properties of.

Return type:

set[str]

has_annotators(subspace=None, prop=None)

Check if annotators exist for a subspace and/or property.

Parameters:
  • subspace (UDSSubspace | None, optional) – Subspace to check.

  • prop (str | None, optional) – Property to check.

Returns:

True if any annotators exist.

Return type:

bool
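
One plausible reading of the optional filters is that each argument narrows the search and None means "do not filter on this axis". The sketch below implements that reading on a plain nested dictionary; any_annotators is a hypothetical function, and the treatment of None is an assumption rather than documented behavior.

```python
from typing import Optional


def any_annotators(
    metadata: dict[str, dict[str, Optional[set]]],
    subspace: Optional[str] = None,
    prop: Optional[str] = None,
) -> bool:
    """Hypothetical check over subspace -> property -> annotator set."""
    for current_subspace, props in metadata.items():
        if subspace is not None and current_subspace != subspace:
            continue
        for current_prop, annotators in props.items():
            if prop is not None and current_prop != prop:
                continue
            # Both None and an empty set count as "no annotators".
            if annotators:
                return True
    return False
```

Called without arguments, this variant reports whether any property anywhere has annotators; passing a subspace, a property, or both restricts the check accordingly.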

classmethod from_dict(metadata)

Build from nested dictionary structure.

Parameters:

metadata (AnnotationMetadataDict) – Nested dict mapping subspaces to properties to metadata dicts.

Returns:

The constructed metadata object.

Return type:

UDSAnnotationMetadata

to_dict()

Convert to nested dictionary structure.

Returns:

Nested dict representation.

Return type:

AnnotationMetadataDict
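
The nested structure has three levels: subspace, property, then the property's metadata dictionary. The sketch below shows one such structure and two helpers that read subspace and property names from it. The subspace and property names are illustrative, the string spelling of datatypes is assumed, and both helpers are hypothetical, including the choice to return properties from all subspaces when none is named.

```python
from typing import Any, Optional

# Illustrative nested structure: subspace -> property -> metadata dict.
annotation_metadata: dict[str, dict[str, dict[str, Any]]] = {
    "factuality": {
        "factual": {
            "value": {"datatype": "float"},
            "confidence": {"datatype": "float"},
        },
    },
    "genericity": {
        "arg-kind": {
            "value": {"datatype": "float"},
            "confidence": {"datatype": "float"},
        },
    },
}


def subspace_names(nested: dict[str, dict[str, Any]]) -> set[str]:
    """Hypothetical helper: the top-level keys are the subspaces."""
    return set(nested)


def property_names(
    nested: dict[str, dict[str, Any]], subspace: Optional[str] = None
) -> set[str]:
    """Hypothetical helper: properties of one subspace, or of all of them
    when no subspace is given (an assumption)."""
    if subspace is None:
        return {prop for props in nested.values() for prop in props}
    return set(nested[subspace])
```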

class UDSCorpusMetadata

Bases: object

The metadata for UDS properties by subspace.

This is a thin wrapper around a pair of UDSAnnotationMetadata objects: one for sentence annotations and one for document annotations.

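Parameters:
  • sentence_metadata (UDSAnnotationMetadata | None, optional) – The metadata for sentence annotations.

  • document_metadata (UDSAnnotationMetadata | None, optional) – The metadata for document annotations.

__init__(sentence_metadata=None, document_metadata=None)

classmethod from_dict(metadata)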

Build from dictionary with sentence and document metadata.

Parameters:

metadata (dict[Literal[‘sentence_metadata’, ‘document_metadata’], AnnotationMetadataDict]) – Dict with ‘sentence_metadata’ and ‘document_metadata’ keys.

Returns:

The constructed corpus metadata.

Return type:

UDSCorpusMetadata

to_dict()

Convert to dictionary with sentence and document metadata.

Returns:

Dict with ‘sentence_metadata’ and ‘document_metadata’ keys.

Return type:

dict[Literal[‘sentence_metadata’, ‘document_metadata’], AnnotationMetadataDict]
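
At the corpus level the dictionary simply pairs two annotation-metadata dictionaries under fixed keys. The sketch below shows that outer shape and a hypothetical helper, split_corpus_metadata, that separates the two halves; the subspace and property names inside are placeholders, and requiring both keys to be present is an assumption.

```python
from typing import Any

# Illustrative corpus-level dictionary; inner names are placeholders.
corpus_metadata: dict[str, dict[str, Any]] = {
    "sentence_metadata": {
        "factuality": {
            "factual": {
                "value": {"datatype": "float"},
                "confidence": {"datatype": "float"},
            },
        },
    },
    "document_metadata": {},
}


def split_corpus_metadata(
    metadata: dict[str, dict[str, Any]],
) -> tuple[dict[str, Any], dict[str, Any]]:
    """Hypothetical helper: return (sentence, document) metadata dicts."""
    expected = {"sentence_metadata", "document_metadata"}
    if set(metadata) != expected:
        raise KeyError(f"expected exactly the keys {sorted(expected)}")
    return metadata["sentence_metadata"], metadata["document_metadata"]
```

Each half can then be handled as an ordinary nested annotation-metadata dictionary, independent of the other.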

__add__(other)

Merge two corpus metadata objects.

Parameters:

other (UDSCorpusMetadata) – Metadata to merge.

Returns:

New metadata with merged sentence and document metadata.

Return type:

UDSCorpusMetadata

add_sentence_metadata(metadata)

Add sentence annotation metadata.

Parameters:

metadata (UDSAnnotationMetadata) – Metadata to merge with existing sentence metadata.

Return type:

None

add_document_metadata(metadata)

Add document annotation metadata.

Parameters:

metadata (UDSAnnotationMetadata) – Metadata to merge with existing document metadata.

Return type:

None

property sentence_metadata: UDSAnnotationMetadata

The sentence-level annotation metadata.

Returns:

Metadata for sentence annotations.

Return type:

UDSAnnotationMetadata

property document_metadata: UDSAnnotationMetadata

The document-level annotation metadata.

Returns:

Metadata for document annotations.

Return type:

UDSAnnotationMetadata

property sentence_subspaces: set[UDSSubspace]

Set of sentence-level subspaces.

Returns:

Sentence subspace identifiers.

Return type:

set[UDSSubspace]

property document_subspaces: set[UDSSubspace]

Set of document-level subspaces.

Returns:

Document subspace identifiers.

Return type:

set[UDSSubspace]

sentence_properties(subspace=None)

Return the properties in a sentence subspace.

Parameters:

subspace (UDSSubspace | None, default: None) – The subspace to get the properties of.

Return type:

set[str]

document_properties(subspace=None)

Return the properties in a document subspace.

Parameters:

subspace (UDSSubspace | None, default: None) – The subspace to get the properties of.

Return type:

set[str]

sentence_annotators(subspace=None, prop=None)

Return the annotators for a property in a sentence subspace.

Parameters:
  • subspace (UDSSubspace | None, default: None) – The subspace to get the annotators of.

  • prop (str | None, default: None) – The property to get the annotators of.

Return type:

set[str] | None

document_annotators(subspace=None, prop=None)

Return the annotators for a property in a document subspace.

Parameters:
  • subspace (UDSSubspace | None, default: None) – The subspace to get the annotators of.

  • prop (str | None, default: None) – The property to get the annotators of.

Return type:

set[str] | None

has_sentence_annotators(subspace=None, prop=None)

Check if sentence-level annotators exist.

Parameters:
  • subspace (UDSSubspace | None, optional) – Subspace to check.

  • prop (str | None, optional) – Property to check.

Returns:

True if annotators exist.

Return type:

bool

has_document_annotators(subspace=None, prop=None)

Check if document-level annotators exist.

Parameters:
  • subspace (UDSSubspace | None, optional) – Subspace to check.

  • prop (str | None, optional) – Property to check.

Returns:

True if annotators exist.

Return type:

bool