decomp.semantics.uds.metadata¶
Metadata structures for Universal Decompositional Semantics (UDS) annotations.
This module defines the metadata infrastructure used to describe and validate UDS semantic annotations across sentence and document graphs. It provides a flexible type system that supports both categorical and continuous values with optional bounds and ordering constraints.
Key Components¶
- Type System
PrimitiveType: Base types supported in UDS (str, int, bool, float)
UDSDataTypeDict: Dictionary format for serializing data types
UDSDataType: Wrapper for primitive types with categorical support
- Property Metadata
PropertyMetadataDict: Dictionary format for property metadata
UDSPropertyMetadata: Metadata for individual semantic properties
- Annotation Metadata
AnnotationMetadataDict: Dictionary format for annotation metadata
UDSAnnotationMetadata: Collection of properties organized by subspace
UDSCorpusMetadata: Complete metadata for sentence and document graphs
The metadata system ensures consistency across UDS corpora by tracking:
- Property names and their expected data types
- Categorical values and their ordering
- Numeric bounds for continuous properties
- Confidence score types for uncertain annotations
- Subspace organization of semantic properties
See also
decomp.semantics.uds.annotation: Annotation classes that use this metadata
decomp.semantics.uds.corpus: Corpus classes that store metadata
- class UDSDataType[source]¶
Bases: object
A wrapper around builtin datatypes with support for categorical values.
This class provides a minimal extension of basic builtin datatypes for representing categorical datatypes with optional ordering and bounds. It serves as a lightweight alternative to pandas categorical types.
- Parameters:
datatype (type[PrimitiveType]) – A builtin datatype (str, int, bool, or float).
categories (list[PrimitiveType] | None, optional) – The allowed values for categorical datatypes. Required if ordered is True.
ordered (bool | None, optional) – Whether this categorical datatype has an ordering. Required if categories is specified.
lower_bound (float | None, optional) – The lower bound value for numeric types. Can be specified independently of categories. If both categories and lower_bound are specified, the datatype must be ordered and bounds must match category bounds.
upper_bound (float | None, optional) – The upper bound value for numeric types. Can be specified independently of categories. If both categories and upper_bound are specified, the datatype must be ordered and bounds must match category bounds.
- categories¶
The categories as a set (unordered) or list (ordered).
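The parameter constraints above can be sketched as a small standalone validator. This is an illustrative stand-in and not the library's implementation: the class name `SketchDataType`, the `category_view` property, and the error messages are hypothetical, and "bounds must match category bounds" is assumed here to mean the first and last entries of the ordered category list.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class SketchDataType:
    """Hypothetical stand-in mirroring the documented UDSDataType constraints."""

    datatype: type
    categories: list | None = None
    ordered: bool | None = None
    lower_bound: float | None = None
    upper_bound: float | None = None

    def __post_init__(self) -> None:
        # "categories ... Required if ordered is True."
        if self.ordered and self.categories is None:
            raise ValueError("ordered datatypes must specify categories")
        # "ordered ... Required if categories is specified."
        if self.categories is not None and self.ordered is None:
            raise ValueError("categorical datatypes must specify ordered")

        has_bounds = self.lower_bound is not None or self.upper_bound is not None
        if self.categories is not None and has_bounds:
            # With both categories and bounds, the datatype must be ordered.
            if not self.ordered:
                raise ValueError("bounded categorical datatypes must be ordered")
            # Assumption: category bounds are the first and last list entries.
            if self.lower_bound is not None and self.lower_bound != self.categories[0]:
                raise ValueError("lower_bound does not match the first category")
            if self.upper_bound is not None and self.upper_bound != self.categories[-1]:
                raise ValueError("upper_bound does not match the last category")

    @property
    def category_view(self) -> set | list | None:
        """Categories as a set (unordered) or a list (ordered)."""
        if self.categories is None:
            return None
        return list(self.categories) if self.ordered else set(self.categories)


# An ordered five-point scale whose bounds agree with its categories.
likert = SketchDataType(int, categories=[1, 2, 3, 4, 5], ordered=True,
                        lower_bound=1, upper_bound=5)
# A continuous value with bounds and no categories.
probability = SketchDataType(float, lower_bound=0.0, upper_bound=1.0)
```

Validating at construction time means an inconsistent combination of arguments fails immediately instead of when the datatype is first used.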
- classmethod from_dict(datatype)[source]¶
Build a UDSDataType from a dictionary.
- Parameters:
datatype (TypeAliasType) – A dictionary representing a datatype. This dictionary must at least have a "datatype" key. It may also have a "categorical" and an "ordered" key, in which case it must have both.
- Return type:
UDSDataType
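The key rules for the input dictionary can be sketched as a standalone check. The helper name `check_datatype_dict` is hypothetical, and the example values (such as `"str"` for the `"datatype"` key) are assumptions about the serialized form; only the key-presence rules come from the description of from_dict.

```python
def check_datatype_dict(datatype: dict) -> None:
    """Hypothetical helper: enforce only the documented key rules."""
    # A "datatype" key is always required.
    if "datatype" not in datatype:
        raise ValueError('a "datatype" key is required')
    # "categorical" and "ordered" must appear together or not at all.
    if ("categorical" in datatype) != ("ordered" in datatype):
        raise ValueError('"categorical" and "ordered" must be given together')


# Accepted: only the required key (value format is an assumption).
check_datatype_dict({"datatype": "str"})
# Accepted: both optional keys present.
check_datatype_dict({"datatype": "int", "categorical": True, "ordered": True})
```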
- class UDSPropertyMetadata[source]¶
Bases: object
Metadata for a UDS property including value and confidence datatypes.
This class encapsulates the metadata for a single UDS property, including the datatypes for both the property value and the confidence score, as well as optional annotator information.
- Parameters:
value (UDSDataType) – The datatype for property values.
confidence (UDSDataType) – The datatype for confidence scores.
annotators (set[str] | None, optional) – Set of annotator identifiers who provided annotations for this property.
- value¶
The value datatype.
- Type: UDSDataType
- confidence¶
The confidence datatype.
- Type: UDSDataType
- __add__(other)[source]¶
Return a UDSPropertyMetadata with the union of annotators.
If the value and confidence datatypes don’t match, this raises an error.
- Parameters:
other (UDSPropertyMetadata) – The other UDSPropertyMetadata.
- Raises:
ValueError – Raised if the value and confidence datatypes don’t match.
- Return type:
UDSPropertyMetadata
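The merge behavior described for `__add__` can be sketched with a standalone class. `SketchPropertyMetadata` is a hypothetical stand-in: plain strings take the place of the value and confidence datatypes, and the handling of missing annotators (treated as an empty set unless both sides are `None`) is an assumption, not documented behavior.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class SketchPropertyMetadata:
    """Hypothetical stand-in for a property's metadata."""

    value: str        # stands in for the value datatype
    confidence: str   # stands in for the confidence datatype
    annotators: frozenset[str] | None = None

    def __add__(self, other: SketchPropertyMetadata) -> SketchPropertyMetadata:
        # Merging is only defined when both datatypes agree.
        if self.value != other.value or self.confidence != other.confidence:
            raise ValueError("value and confidence datatypes must match")
        # Assumption: None means "no annotator information".
        if self.annotators is None and other.annotators is None:
            merged = None
        else:
            merged = (self.annotators or frozenset()) | (other.annotators or frozenset())
        return SketchPropertyMetadata(self.value, self.confidence, merged)


first = SketchPropertyMetadata("float", "float", frozenset({"annotator-1"}))
second = SketchPropertyMetadata("float", "float", frozenset({"annotator-2"}))
combined = first + second  # annotators from both operands
```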
- classmethod from_dict(metadata)[source]¶
Build UDSPropertyMetadata from a dictionary.
- Parameters:
metadata (PropertyMetadataDict) – A mapping from "value" and "confidence" to datatype dictionaries. May optionally include "annotators" mapping to a set of annotator identifiers.
- Returns:
The constructed metadata object.
- Return type:
UDSPropertyMetadata
- Raises:
ValueError – If required fields (value, confidence) are missing.
TypeError – If fields have incorrect types.
- class UDSAnnotationMetadata[source]¶
Bases: object
The metadata for UDS properties by subspace.
- Parameters:
metadata (dict[UDSSubspace, dict[str, UDSPropertyMetadata]]) – A mapping from subspaces to properties to datatypes and possibly annotators.
- __getitem__(k)[source]¶
Get metadata by subspace or (subspace, property) tuple.
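Lookup by either a subspace or a (subspace, property) tuple can be sketched as follows. `SketchAnnotationMetadata` is a hypothetical stand-in built on a nested dictionary; the subspace and property names in the usage lines are illustrative only, and the error raised for other key shapes is an assumption.

```python
from __future__ import annotations


class SketchAnnotationMetadata:
    """Hypothetical stand-in: nested mapping subspace -> property -> metadata."""

    def __init__(self, metadata: dict[str, dict[str, object]]) -> None:
        self._metadata = metadata

    def __getitem__(self, k: str | tuple[str, str]):
        # A bare string selects every property in one subspace.
        if isinstance(k, str):
            return self._metadata[k]
        # A 2-tuple selects a single property within a subspace.
        if isinstance(k, tuple) and len(k) == 2:
            subspace, prop = k
            return self._metadata[subspace][prop]
        raise TypeError("key must be a subspace or a (subspace, property) tuple")


meta = SketchAnnotationMetadata({"factuality": {"factual": "property-metadata"}})
by_subspace = meta["factuality"]            # dict of properties
by_property = meta["factuality", "factual"]  # a single property's metadata
```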
- __add__(other)[source]¶
Merge two metadata objects, combining annotators for shared properties.
- Parameters:
other (UDSAnnotationMetadata) – Metadata to merge with this one.
- Returns:
New metadata with merged properties and annotators.
- Return type: UDSAnnotationMetadata
- property metadata: dict[UDSSubspace, dict[str, UDSPropertyMetadata]]¶
The underlying metadata dictionary.
- Returns:
Mapping from subspaces to properties to metadata.
- Return type:
dict[UDSSubspace, dict[str, UDSPropertyMetadata]]
- property subspaces: set[UDSSubspace]¶
Set of all subspace names.
- Returns:
The subspace identifiers.
- Return type:
set[UDSSubspace]
- has_annotators(subspace=None, prop=None)[source]¶
Check if annotators exist for a subspace and/or property.
- class UDSCorpusMetadata[source]¶
Bases: object
The metadata for UDS properties by subspace.
This is a thin wrapper around a pair of UDSAnnotationMetadata objects: one for sentence annotations and one for document annotations.
- Parameters:
sentence_metadata (UDSAnnotationMetadata | None, default: None) – The metadata for sentence annotations.
document_metadata (UDSAnnotationMetadata | None, default: None) – The metadata for document annotations.
- classmethod from_dict(metadata)[source]¶
Build from dictionary with sentence and document metadata.
- Parameters:
metadata (dict[Literal[‘sentence_metadata’, ‘document_metadata’], AnnotationMetadataDict]) – Dict with ‘sentence_metadata’ and ‘document_metadata’ keys.
- Returns:
The constructed corpus metadata.
- Return type: UDSCorpusMetadata
- to_dict()[source]¶
Convert to dictionary with sentence and document metadata.
- Returns:
Dict with ‘sentence_metadata’ and ‘document_metadata’ keys.
- Return type:
dict[Literal[‘sentence_metadata’, ‘document_metadata’], AnnotationMetadataDict]
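The two-key dictionary shape shared by from_dict and to_dict can be sketched with a standalone wrapper. `SketchCorpusMetadata` is a hypothetical stand-in that treats each annotation-metadata dictionary as opaque; the inner structure shown in the usage lines is a placeholder and not the actual AnnotationMetadataDict layout.

```python
from __future__ import annotations


class SketchCorpusMetadata:
    """Hypothetical stand-in wrapping sentence and document metadata dicts."""

    def __init__(self, sentence_metadata: dict | None = None,
                 document_metadata: dict | None = None) -> None:
        # Missing metadata is stored as an empty mapping (an assumption).
        self.sentence_metadata = dict(sentence_metadata or {})
        self.document_metadata = dict(document_metadata or {})

    @classmethod
    def from_dict(cls, metadata: dict) -> SketchCorpusMetadata:
        # Both top-level keys are expected to be present.
        return cls(metadata["sentence_metadata"], metadata["document_metadata"])

    def to_dict(self) -> dict:
        # Return copies so callers cannot mutate the wrapper's state.
        return {
            "sentence_metadata": dict(self.sentence_metadata),
            "document_metadata": dict(self.document_metadata),
        }


raw = {
    "sentence_metadata": {"subspace-a": {"prop-1": "placeholder"}},
    "document_metadata": {},
}
round_tripped = SketchCorpusMetadata.from_dict(raw).to_dict()
```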
- __add__(other)[source]¶
Merge two corpus metadata objects.
- Parameters:
other (UDSCorpusMetadata) – Metadata to merge.
- Returns:
New metadata with merged sentence and document metadata.
- Return type: UDSCorpusMetadata
- add_sentence_metadata(metadata)[source]¶
Add sentence annotation metadata.
- Parameters:
metadata (UDSAnnotationMetadata) – Metadata to merge with existing sentence metadata.
- Return type:
- add_document_metadata(metadata)[source]¶
Add document annotation metadata.
- Parameters:
metadata (UDSAnnotationMetadata) – Metadata to merge with existing document metadata.
- Return type:
- property sentence_metadata: UDSAnnotationMetadata¶
The sentence-level annotation metadata.
- Returns:
Metadata for sentence annotations.
- Return type: UDSAnnotationMetadata
- property document_metadata: UDSAnnotationMetadata¶
The document-level annotation metadata.
- Returns:
Metadata for document annotations.
- Return type: UDSAnnotationMetadata
- property sentence_subspaces: set[UDSSubspace]¶
Set of sentence-level subspaces.
- Returns:
Sentence subspace identifiers.
- Return type:
set[UDSSubspace]
- property document_subspaces: set[UDSSubspace]¶
Set of document-level subspaces.
- Returns:
Document subspace identifiers.
- Return type:
set[UDSSubspace]
- sentence_annotators(subspace=None, prop=None)[source]¶
Return the annotators for a property in a sentence subspace.
- document_annotators(subspace=None, prop=None)[source]¶
Return the annotators for a property in a document subspace.