decomp.semantics.uds.annotation
Module for representing UDS property annotations.
- class decomp.semantics.uds.annotation.NormalizedUDSAnnotation(metadata, data)
A normalized Universal Decompositional Semantics annotation
Properties in a NormalizedUDSAnnotation may have only a single str, int, or float value and a single str, int, or float confidence.
- Parameters
metadata (UDSAnnotationMetadata) – The metadata for the annotations
data (Dict[str, Dict[str, Dict[str, Dict[str, Dict[str, Union[str, int, bool, float]]]]]]) – A mapping from graph identifiers to node/edge identifiers to property subspaces to property to value and confidence. Edge identifiers must be represented as NODEID1%%NODEID2, and node identifiers must not contain %%.
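The nested mapping and the %% convention are easier to see with a concrete value. The following is a minimal sketch in plain Python that does not call decomp itself; the graph, node, subspace, and property names, and the two helper functions, are hypothetical and chosen only for illustration.

```python
from typing import Dict, Tuple, Union

Value = Union[str, int, bool, float]
NormalizedData = Dict[str, Dict[str, Dict[str, Dict[str, Dict[str, Value]]]]]

# Hypothetical identifiers and property names, for illustration only.
data: NormalizedData = {
    "graph-1": {
        # Node annotation: the identifier contains no "%%".
        "graph-1-node-1": {
            "subspace_a": {"prop_x": {"value": 1.2, "confidence": 1.0}},
        },
        # Edge annotation: NODEID1%%NODEID2.
        "graph-1-node-1%%graph-1-node-2": {
            "subspace_b": {"prop_y": {"value": "yes", "confidence": 0.8}},
        },
    },
}


def is_edge_identifier(identifier: str) -> bool:
    """An identifier names an edge exactly when it contains the %% separator."""
    return "%%" in identifier


def split_edge_identifier(identifier: str) -> Tuple[str, str]:
    """Split NODEID1%%NODEID2 into its two node identifiers."""
    source, separator, target = identifier.partition("%%")
    if not separator or not source or not target or "%%" in target:
        raise ValueError(f"not a valid edge identifier: {identifier!r}")
    return source, target


# Collect the edge identifiers of one graph.
edge_ids = [key for key in data["graph-1"] if is_edge_identifier(key)]
```

Because node identifiers may not contain %%, the presence of the separator is enough to tell the two kinds of key apart within a single graph's mapping.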
- classmethod from_json(jsonfile)
Generates a dataset of normalized annotations from a JSON file
For node annotations, the format of the JSON passed to this class method must be:
{GRAPHID_1: {NODEID_1_1: DATA, ...}, GRAPHID_2: {NODEID_2_1: DATA, ...}, ... }
Edge annotations should be of the form:
{GRAPHID_1: {NODEID_1_1%%NODEID_1_2: DATA, ...}, GRAPHID_2: {NODEID_2_1%%NODEID_2_2: DATA, ...}, ... }
Graph and node identifiers must match the graph and node identifiers of the PredPatt graphs to which the annotations will be added.
DATA in the above is assumed to have the following structure:
{SUBSPACE_1: {PROP_1_1: {'value': VALUE, 'confidence': VALUE}, ...}, SUBSPACE_2: {PROP_2_1: {'value': VALUE, 'confidence': VALUE}, ...}, ...}
VALUE in the above is assumed to be unstructured.
- Return type
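As an illustration of the DATA structure described for this method, the sketch below parses a small JSON document with the standard json module and checks each node's DATA. It does not use decomp; the helper check_normalized_data and all identifiers are hypothetical, and reading "unstructured" as "a scalar rather than a list or mapping" is an assumption of this sketch.

```python
import json
from typing import Any


def check_normalized_data(data: Any) -> None:
    """Raise ValueError unless data is {SUBSPACE: {PROP: {'value': V, 'confidence': C}}}."""
    if not isinstance(data, dict):
        raise ValueError("DATA must be a mapping from subspaces to properties")
    for subspace, props in data.items():
        if not isinstance(props, dict):
            raise ValueError(f"subspace {subspace!r} must map properties to annotations")
        for prop, annotation in props.items():
            if not isinstance(annotation, dict) or set(annotation) != {"value", "confidence"}:
                raise ValueError(f"{subspace}.{prop} needs exactly 'value' and 'confidence'")
            for key, entry in annotation.items():
                # Assumption: "unstructured" means a scalar, not a list or mapping.
                if not isinstance(entry, (str, int, float)):
                    raise ValueError(f"{subspace}.{prop}[{key!r}] must be a scalar")


# A hypothetical node-annotation document in the format shown above.
json_text = """
{
  "graph-1": {
    "graph-1-node-1": {
      "subspace_a": {"prop_x": {"value": 1.2, "confidence": 1.0}}
    }
  }
}
"""

annotations = json.loads(json_text)
for graph_id, items in annotations.items():
    for item_id, item_data in items.items():
        check_normalized_data(item_data)
```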
- class decomp.semantics.uds.annotation.RawUDSAnnotation(metadata, data)
A raw Universal Decompositional Semantics dataset
Unlike decomp.semantics.uds.NormalizedUDSAnnotation, objects of this class may have multiple annotations for a particular attribute. Each annotation is associated with an annotator ID, and different annotators may have annotated different numbers of items.
- Parameters
annotation – A mapping from graph identifiers to node/edge identifiers to property subspaces to property to value and confidence for each annotator. Edge identifiers must be represented as NODEID1%%NODEID2, and node identifiers must not contain %%.
- annotators(subspace=None, prop=None)
Annotator IDs for a subspace and property
If neither subspace nor property is specified, all annotator IDs are returned. If only the subspace is specified, all annotator IDs for the subspace are returned.
- Parameters
subspace (Optional[str]) – The subspace to constrain to
prop (Optional[str]) – The property to constrain to
- Return type
Set[str]
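The filtering described for annotators can be sketched over a plain nested dict. This is not the library's implementation: collect_annotators and the sample data are hypothetical, and raising ValueError when prop is given without subspace is an assumption of this sketch rather than documented behavior.

```python
from typing import Any, Dict, Optional, Set

# Hypothetical raw annotations: 'value' and 'confidence' map annotator IDs to entries.
raw: Dict[str, Dict[str, Dict[str, Dict[str, Dict[str, Dict[str, Any]]]]]] = {
    "graph-1": {
        "graph-1-node-1": {
            "subspace_a": {
                "prop_x": {
                    "value": {"ann-1": 1, "ann-2": 0},
                    "confidence": {"ann-1": 4, "ann-2": 3},
                },
            },
            "subspace_b": {
                "prop_y": {
                    "value": {"ann-3": "yes"},
                    "confidence": {"ann-3": 2},
                },
            },
        },
    },
}


def collect_annotators(
    raw_data: Dict[str, Any],
    subspace: Optional[str] = None,
    prop: Optional[str] = None,
) -> Set[str]:
    """Gather annotator IDs, optionally restricted to a subspace and property."""
    if prop is not None and subspace is None:
        # Assumption of this sketch: a property only makes sense within a subspace.
        raise ValueError("subspace must be given when prop is given")
    found: Set[str] = set()
    for items in raw_data.values():
        for subspaces in items.values():
            for subspace_name, props in subspaces.items():
                if subspace is not None and subspace_name != subspace:
                    continue
                for prop_name, annotation in props.items():
                    if prop is not None and prop_name != prop:
                        continue
                    # The keys of the 'value' mapping are the annotator IDs.
                    found.update(annotation["value"])
    return found
```

With the sample data, calling collect_annotators(raw) gathers all three annotators, while passing subspace="subspace_b" narrows the result to the single annotator in that subspace.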
- classmethod from_json(jsonfile)
Generates a dataset for raw annotations from a JSON file
For node annotations, the format of the JSON passed to this class method must be:
{GRAPHID_1: {NODEID_1_1: DATA, ...}, GRAPHID_2: {NODEID_2_1: DATA, ...}, ... }
Edge annotations should be of the form:
{GRAPHID_1: {NODEID_1_1%%NODEID_1_2: DATA, ...}, GRAPHID_2: {NODEID_2_1%%NODEID_2_2: DATA, ...}, ... }
Graph and node identifiers must match the graph and node identifiers of the PredPatt graphs to which the annotations will be added.
DATA in the above is assumed to have the following structure:
{SUBSPACE_1: {PROP_1_1: {'value': { ANNOTATOR1: VALUE1, ANNOTATOR2: VALUE2, ... }, 'confidence': { ANNOTATOR1: CONF1, ANNOTATOR2: CONF2, ... } }, PROP_1_2: {'value': { ANNOTATOR1: VALUE1, ANNOTATOR2: VALUE2, ... }, 'confidence': { ANNOTATOR1: CONF1, ANNOTATOR2: CONF2, ... } }, ...}, SUBSPACE_2: {PROP_2_1: {'value': { ANNOTATOR3: VALUE1, ANNOTATOR4: VALUE2, ... }, 'confidence': { ANNOTATOR3: CONF1, ANNOTATOR4: CONF2, ... } }, ...}, ...}
VALUEi and CONFi are assumed to be unstructured.
- Return type
- items(annotation_type=None, annotator_id=None)
Dictionary-like items generator for attributes
This method behaves exactly like UDSAnnotation.items, except that, if an annotator ID is passed, it generates only items annotated by the specified annotator.
- Parameters
annotation_type (Optional[str]) – Whether to return node annotations, edge annotations, or both (default)
annotator_id (Optional[str]) – The annotator whose annotations will be returned by the generator (defaults to all annotators)
- Raises
ValueError – If both annotation_type and annotator_id are passed and the specified annotator has no annotations of the specified type
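One way to picture the annotator filter is the sketch below, which works on a plain dict of raw attributes for a single annotation type. The function items_for_annotator, the sample data, and the shape of the yielded entries (one value and one confidence per property) are assumptions made for illustration; the library's actual output may be structured differently.

```python
from typing import Any, Dict, Iterator, Tuple

RawAttrs = Dict[str, Dict[str, Dict[str, Dict[str, Dict[str, Dict[str, Any]]]]]]


def items_for_annotator(
    attrs: RawAttrs, annotator_id: str
) -> Iterator[Tuple[str, Dict[str, Any]]]:
    """Iterate over (graph_id, attributes) pairs restricted to one annotator."""
    filtered: Dict[str, Dict[str, Any]] = {}
    for graph_id, items in attrs.items():
        for item_id, subspaces in items.items():
            for subspace, props in subspaces.items():
                for prop, annotation in props.items():
                    if annotator_id not in annotation["value"]:
                        continue
                    # Assumes the annotator appears under 'confidence' as well.
                    entry = {
                        "value": annotation["value"][annotator_id],
                        "confidence": annotation["confidence"][annotator_id],
                    }
                    graph_level = filtered.setdefault(graph_id, {})
                    item_level = graph_level.setdefault(item_id, {})
                    item_level.setdefault(subspace, {})[prop] = entry
    if not filtered:
        # Mirrors the documented ValueError for an annotator with no annotations.
        raise ValueError(f"annotator {annotator_id!r} has no annotations of this type")
    return iter(filtered.items())


# Hypothetical raw node attributes with two annotators.
node_attrs: RawAttrs = {
    "graph-1": {
        "graph-1-node-1": {
            "subspace_a": {
                "prop_x": {
                    "value": {"ann-1": 1, "ann-2": 0},
                    "confidence": {"ann-1": 4, "ann-2": 3},
                },
            },
        },
    },
}

ann1_items = list(items_for_annotator(node_attrs, "ann-1"))
```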
- class decomp.semantics.uds.annotation.UDSAnnotation(metadata, data)
A Universal Decompositional Semantics annotation
This is an abstract base class. See its RawUDSAnnotation and NormalizedUDSAnnotation subclasses.
The __init__ method for this class is abstract to ensure that it cannot be initialized directly, even though it is used by the subclasses and has a valid default implementation. The from_json class method is abstract to force the subclass to define more specific constraints on its JSON inputs.
- Parameters
metadata (UDSAnnotationMetadata) – The metadata for the annotations
data (Dict[str, Dict[str, Any]]) – A mapping from graph identifiers to node/edge identifiers to property subspaces to properties to annotations. Edge identifiers must be represented as NODEID1%%NODEID2, and node identifiers must not contain %%.
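The pattern of an abstract __init__ that still has a usable body, combined with an abstract from_json, can be sketched with the standard abc module. BaseAnnotation and ConcreteAnnotation are hypothetical stand-ins and not the decomp classes; the simplified from_json here takes only an open text stream and assumes it holds just the graph-to-annotations mapping.

```python
import io
import json
from abc import ABC, abstractmethod
from typing import Any, Dict, TextIO


class BaseAnnotation(ABC):
    """Hypothetical stand-in for an abstract annotation base class."""

    @abstractmethod
    def __init__(self, metadata: Dict[str, Any], data: Dict[str, Any]) -> None:
        # A usable default body: subclasses reach it through super().__init__.
        self.metadata = metadata
        self.data = data

    @classmethod
    @abstractmethod
    def from_json(cls, jsonfile: TextIO) -> "BaseAnnotation":
        """Each subclass decides what JSON it accepts."""
        raise NotImplementedError


class ConcreteAnnotation(BaseAnnotation):
    def __init__(self, metadata: Dict[str, Any], data: Dict[str, Any]) -> None:
        # Reuse the default implementation from the abstract base.
        super().__init__(metadata, data)

    @classmethod
    def from_json(cls, jsonfile: TextIO) -> "ConcreteAnnotation":
        # Assumption: the stream holds only the graph-to-annotations mapping.
        return cls(metadata={}, data=json.load(jsonfile))


# The abstract base cannot be instantiated directly.
try:
    BaseAnnotation({}, {})
    base_is_instantiable = True
except TypeError:
    base_is_instantiable = False

loaded = ConcreteAnnotation.from_json(io.StringIO('{"graph-1": {}}'))
```

Marking __init__ abstract blocks direct construction of the base while leaving its body available to subclasses, which is the trade-off the class description refers to.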
- property edge_attributes
The edge attributes
- property edge_graphids: Set[str]
The identifiers for graphs with edge annotations
- Return type
Set[str]
- property edge_subspaces: Set[str]
The subspaces for edge annotations
- Return type
Set[str]
- abstract classmethod from_json(jsonfile)
Load Universal Decompositional Semantics dataset from JSON
For node annotations, the format of the JSON passed to this class method must be:
{GRAPHID_1: {NODEID_1_1: DATA, ...}, GRAPHID_2: {NODEID_2_1: DATA, ...}, ... }
Edge annotations should be of the form:
{GRAPHID_1: {NODEID_1_1%%NODEID_1_2: DATA, ...}, GRAPHID_2: {NODEID_2_1%%NODEID_2_2: DATA, ...}, ... }
Graph and node identifiers must match the graph and node identifiers of the PredPatt graphs to which the annotations will be added. The subclass determines the form of DATA in the above.
- Parameters
jsonfile (Union[str, TextIO]) – (path to) file containing annotations as JSON
- Return type
- property graphids: Set[str]
The identifiers for graphs with either node or edge annotations
- Return type
Set[str]
- items(annotation_type=None)
Dictionary-like items generator for attributes
If annotation_type is specified as “node” or “edge”, this generator yields a graph identifier and its node or edge attributes (respectively); otherwise, this generator yields a graph identifier and a tuple of its node and edge attributes.
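The three cases described for items can be sketched as a generator over two plain dicts. The function iter_items and the sample data are hypothetical; visiting graphs in sorted order, substituting an empty dict when a graph has only one kind of annotation, and raising ValueError for any other annotation_type are assumptions of this sketch.

```python
from typing import Any, Dict, Iterator, Optional, Tuple

Attrs = Dict[str, Dict[str, Any]]


def iter_items(
    node_attrs: Attrs,
    edge_attrs: Attrs,
    annotation_type: Optional[str] = None,
) -> Iterator[Tuple[str, Any]]:
    """Yield (graph_id, attributes) pairs for nodes, edges, or both."""
    if annotation_type == "node":
        yield from node_attrs.items()
    elif annotation_type == "edge":
        yield from edge_attrs.items()
    elif annotation_type is None:
        # Assumptions: sorted graph order, and an empty dict for a missing side.
        for graph_id in sorted(set(node_attrs) | set(edge_attrs)):
            yield graph_id, (
                node_attrs.get(graph_id, {}),
                edge_attrs.get(graph_id, {}),
            )
    else:
        raise ValueError('annotation_type must be None, "node", or "edge"')


# Hypothetical attributes for a single graph.
nodes: Attrs = {"graph-1": {"graph-1-node-1": {"subspace_a": {}}}}
edges: Attrs = {"graph-1": {"graph-1-node-1%%graph-1-node-2": {"subspace_b": {}}}}

node_items = list(iter_items(nodes, edges, "node"))
both_items = list(iter_items(nodes, edges))
```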
- property metadata: UDSAnnotationMetadata
All metadata for this annotation
- Return type
UDSAnnotationMetadata
- property node_attributes
The node attributes
- property node_graphids: Set[str]
The identifiers for graphs with node annotations
- Return type
Set[str]
- property node_subspaces: Set[str]
The subspaces for node annotations
- Return type
Set[str]
- properties(subspace=None)
The properties in a subspace
- Return type
Set[str]
- property_metadata(subspace, prop)
The metadata for a property in a subspace
- Parameters
subspace (str) – The subspace the property is in
prop (str) – The property in the subspace
- Return type
- property subspaces: Set[str]
The subspaces for node and edge annotations
- Return type
Set[str]