decomp.semantics.uds.annotation

Module for representing UDS property annotations.

class decomp.semantics.uds.annotation.NormalizedUDSAnnotation(metadata, data)

A normalized Universal Decompositional Semantics annotation

Properties in a NormalizedUDSAnnotation may have only a single str, int, or float value and a single str, int, or float confidence.

Parameters
  • metadata (UDSAnnotationMetadata) – The metadata for the annotations

  • data (Dict[str, Dict[str, Dict[str, Dict[str, Dict[str, Union[str, int, bool, float]]]]]]) – A mapping from graph identifiers to node/edge identifiers to property subspaces to property to value and confidence. Edge identifiers must be represented as NODEID1%%NODEID2, and node identifiers must not contain %%.
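The identifier convention above can be illustrated with a small sketch. The helper name parse_identifier and the sample identifiers are hypothetical and are not part of the decomp API; the only thing taken from the text is the %% separator rule.

```python
from typing import Tuple, Union

# Separator documented above for edge identifiers.
EDGE_SEPARATOR = "%%"


def parse_identifier(identifier: str) -> Union[str, Tuple[str, str]]:
    """Return a node ID unchanged, or an edge ID as a (NODEID1, NODEID2) pair.

    Hypothetical helper for illustration; not a function provided by decomp.
    """
    if EDGE_SEPARATOR not in identifier:
        # No separator: this is a node identifier.
        return identifier
    parts = identifier.split(EDGE_SEPARATOR)
    if len(parts) != 2 or not all(parts):
        # More than one separator, or an empty node ID on either side.
        raise ValueError(f"malformed edge identifier: {identifier!r}")
    return parts[0], parts[1]
```

Because node identifiers may not contain %%, the presence of the separator is enough to tell the two kinds of key apart.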

classmethod from_json(jsonfile)

Generates a dataset of normalized annotations from a JSON file

For node annotations, the format of the JSON passed to this class method must be:

{GRAPHID_1: {NODEID_1_1: DATA,
             ...},
 GRAPHID_2: {NODEID_2_1: DATA,
             ...},
 ...
}

Edge annotations should be of the form:

{GRAPHID_1: {NODEID_1_1%%NODEID_1_2: DATA,
             ...},
 GRAPHID_2: {NODEID_2_1%%NODEID_2_2: DATA,
             ...},
 ...
}

Graph and node identifiers must match the graph and node identifiers of the predpatt graphs to which the annotations will be added.

DATA in the above is assumed to have the following structure:

{SUBSPACE_1: {PROP_1_1: {'value': VALUE,
                         'confidence': VALUE},
              ...},
 SUBSPACE_2: {PROP_2_1: {'value': VALUE,
                         'confidence': VALUE},
              ...},
}

VALUE in the above is assumed to be unstructured.
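As a rough illustration of the layout described above, the following sketch builds one node annotation and one edge annotation as plain Python dictionaries and serializes them with the standard json module. All graph, node, subspace, and property names are made up for the example, and is_normalized_entry is a hypothetical check, not something decomp provides.

```python
import json

# Hypothetical identifiers and property names, for illustration only.
node_annotations = {
    "graph-1": {
        "graph-1-node-a": {
            "subspace_1": {
                "prop_1": {"value": 1.2, "confidence": 1.0},
            },
        },
    },
}

edge_annotations = {
    "graph-1": {
        # Edge keys join two node identifiers with %%.
        "graph-1-node-a%%graph-1-node-b": {
            "subspace_2": {
                "prop_2": {"value": "yes", "confidence": 0.5},
            },
        },
    },
}


def is_normalized_entry(entry: object) -> bool:
    """Check that an entry has exactly one scalar value and one scalar confidence."""
    scalar = (str, int, float)
    return (
        isinstance(entry, dict)
        and set(entry) == {"value", "confidence"}
        and isinstance(entry["value"], scalar)
        and isinstance(entry["confidence"], scalar)
    )


# The serialized text is the kind of content a JSON annotation file would hold.
serialized_nodes = json.dumps(node_annotations, indent=2)
serialized_edges = json.dumps(edge_annotations, indent=2)
```

The check mirrors the constraint stated for NormalizedUDSAnnotation: a single value and a single confidence per property.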

Return type

NormalizedUDSAnnotation

class decomp.semantics.uds.annotation.RawUDSAnnotation(metadata, data)

A raw Universal Decompositional Semantics dataset

Unlike decomp.semantics.uds.NormalizedUDSAnnotation, objects of this class may have multiple annotations for a particular attribute. Each annotation is associated with an annotator ID, and different annotators may have annotated different numbers of items.

Parameters

  • metadata (UDSAnnotationMetadata) – The metadata for the annotations

  • data – A mapping from graph identifiers to node/edge identifiers to property subspaces to properties to the value and confidence given by each annotator. Edge identifiers must be represented as NODEID1%%NODEID2, and node identifiers must not contain %%.

annotators(subspace=None, prop=None)

Annotator IDs for a subspace and property

If neither subspace nor property is specified, all annotator IDs are returned. If only the subspace is specified, all annotator IDs for that subspace are returned.

Parameters
  • subspace (Optional[str]) – The subspace to constrain to

  • prop (Optional[str]) – The property to constrain to

Return type

Set[str]
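The filtering behavior described for annotators can be sketched over a plain nested dictionary in the raw layout. This is not the library's implementation: collect_annotators and the sample data are hypothetical, and rejecting a property given without a subspace is an assumption, since the text does not say what happens in that case.

```python
from typing import Any, Dict, Optional, Set


def collect_annotators(
    data: Dict[str, Dict[str, Any]],
    subspace: Optional[str] = None,
    prop: Optional[str] = None,
) -> Set[str]:
    """Collect annotator IDs, optionally restricted to a subspace and property.

    Hypothetical helper for illustration; not part of decomp.
    """
    if prop is not None and subspace is None:
        # Assumption: a property is only meaningful inside a subspace.
        raise ValueError("subspace must be given when prop is given")
    found: Set[str] = set()
    for items in data.values():              # graph ID -> node/edge IDs
        for subspaces in items.values():     # node/edge ID -> subspaces
            for sub_name, props in subspaces.items():
                if subspace is not None and sub_name != subspace:
                    continue
                for prop_name, entry in props.items():
                    if prop is not None and prop_name != prop:
                        continue
                    # Annotator IDs are the keys of the 'value' mapping.
                    found.update(entry["value"])
    return found


# Made-up raw annotations for one node in one graph.
example_raw = {
    "graph-1": {
        "node-a": {
            "sub_1": {
                "prop_1": {
                    "value": {"ann-1": 1, "ann-2": 0},
                    "confidence": {"ann-1": 0.9, "ann-2": 0.4},
                },
            },
            "sub_2": {
                "prop_2": {
                    "value": {"ann-3": 1},
                    "confidence": {"ann-3": 1.0},
                },
            },
        },
    },
}
```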

classmethod from_json(jsonfile)

Generates a dataset for raw annotations from a JSON file

For node annotations, the format of the JSON passed to this class method must be:

{GRAPHID_1: {NODEID_1_1: DATA,
             ...},
 GRAPHID_2: {NODEID_2_1: DATA,
             ...},
 ...
}

Edge annotations should be of the form:

{GRAPHID_1: {NODEID_1_1%%NODEID_1_2: DATA,
             ...},
 GRAPHID_2: {NODEID_2_1%%NODEID_2_2: DATA,
             ...},
 ...
}

Graph and node identifiers must match the graph and node identifiers of the predpatt graphs to which the annotations will be added.

DATA in the above is assumed to have the following structure:

{SUBSPACE_1: {PROP_1_1: {'value': {
                            ANNOTATOR1: VALUE1,
                            ANNOTATOR2: VALUE2,
                            ...
                                  },
                         'confidence': {
                            ANNOTATOR1: CONF1,
                            ANNOTATOR2: CONF2,
                            ...
                                       }
                        },
              PROP_1_2: {'value': {
                            ANNOTATOR1: VALUE1,
                            ANNOTATOR2: VALUE2,
                            ...
                                  },
                         'confidence': {
                            ANNOTATOR1: CONF1,
                            ANNOTATOR2: CONF2,
                            ...
                                       }
                        },
              ...},
 SUBSPACE_2: {PROP_2_1: {'value': {
                            ANNOTATOR3: VALUE1,
                            ANNOTATOR4: VALUE2,
                            ...
                                  },
                         'confidence': {
                            ANNOTATOR3: CONF1,
                            ANNOTATOR4: CONF2,
                            ...
                                       }
                        },
             ...},
...}

VALUEi and CONFi are assumed to be unstructured.
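A concrete instance of the raw DATA layout may make the nesting easier to read. In the sketch below the subspace, property, and annotator names are invented, and mismatched_properties is a hypothetical helper. It assumes, based only on the template above, that 'value' and 'confidence' are keyed by the same annotators.

```python
from typing import Any, Dict, List, Tuple

# Made-up DATA for a single node or edge, following the raw layout above.
raw_data = {
    "subspace_1": {
        "prop_1": {
            "value": {"annotator-1": 1, "annotator-2": 0},
            "confidence": {"annotator-1": 0.9, "annotator-2": 0.4},
        },
    },
    "subspace_2": {
        "prop_2": {
            "value": {"annotator-3": "yes"},
            "confidence": {"annotator-3": 1.0},
        },
    },
}


def mismatched_properties(data: Dict[str, Dict[str, Any]]) -> List[Tuple[str, str]]:
    """List (subspace, property) pairs whose value and confidence annotators differ.

    Hypothetical helper for illustration; not part of decomp.
    """
    mismatches = []
    for subspace, props in data.items():
        for prop, entry in props.items():
            if set(entry["value"]) != set(entry["confidence"]):
                mismatches.append((subspace, prop))
    return mismatches
```

A check like this can be run on a file's contents before loading it, to catch entries where an annotator supplied a value without a confidence or the reverse.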

Return type

RawUDSAnnotation

items(annotation_type=None, annotator_id=None)

Dictionary-like items generator for attributes

This method behaves exactly like UDSAnnotation.items, except that, if an annotator ID is passed, it generates only items annotated by the specified annotator.

Parameters
  • annotation_type (Optional[str]) – Whether to return node annotations, edge annotations, or both (default)

  • annotator_id (Optional[str]) – The annotator whose annotations will be returned by the generator (defaults to all annotators)

Raises

ValueError – If both annotation_type and annotator_id are passed and the specified annotator has no annotations of the specified type
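The idea of restricting raw annotations to one annotator can be sketched on plain dictionaries. The function filter_by_annotator, its return shape, and the sample data are all hypothetical; the shape that RawUDSAnnotation.items actually yields is not specified in this text.

```python
from typing import Any, Dict


def filter_by_annotator(
    attrs: Dict[str, Dict[str, Any]], annotator_id: str
) -> Dict[str, Dict[str, Any]]:
    """Keep only the annotations that one annotator provided.

    attrs maps node/edge ID -> subspace -> property -> raw entry.
    Hypothetical helper for illustration; not part of decomp.
    """
    result: Dict[str, Dict[str, Any]] = {}
    for item_id, subspaces in attrs.items():
        kept_subspaces = {}
        for subspace, props in subspaces.items():
            kept_props = {
                prop: {
                    "value": entry["value"][annotator_id],
                    "confidence": entry["confidence"][annotator_id],
                }
                for prop, entry in props.items()
                if annotator_id in entry["value"]
                and annotator_id in entry["confidence"]
            }
            if kept_props:
                kept_subspaces[subspace] = kept_props
        if kept_subspaces:
            # Items the annotator never touched are dropped entirely.
            result[item_id] = kept_subspaces
    return result


# Made-up raw attributes for two nodes of one graph.
example_attrs = {
    "node-a": {
        "sub_1": {
            "prop_1": {
                "value": {"ann-1": 1, "ann-2": 0},
                "confidence": {"ann-1": 0.9, "ann-2": 0.4},
            },
        },
    },
    "node-b": {
        "sub_1": {
            "prop_1": {
                "value": {"ann-2": 1},
                "confidence": {"ann-2": 0.7},
            },
        },
    },
}
```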

class decomp.semantics.uds.annotation.UDSAnnotation(metadata, data)

A Universal Decompositional Semantics annotation

This is an abstract base class. See its RawUDSAnnotation and NormalizedUDSAnnotation subclasses.

The __init__ method for this class is abstract to ensure that it cannot be initialized directly, even though it is used by the subclasses and has a valid default implementation. The from_json class method is abstract to force the subclass to define more specific constraints on its JSON inputs.

Parameters
  • metadata (UDSAnnotationMetadata) – The metadata for the annotations

  • data (Dict[str, Dict[str, Any]]) – A mapping from graph identifiers to node/edge identifiers to property subspaces to properties to annotations. Edge identifiers must be represented as NODEID1%%NODEID2, and node identifiers must not contain %%.

property edge_attributes

The edge attributes

property edge_graphids: Set[str]

The identifiers for graphs with edge annotations

Return type

Set[str]

property edge_subspaces: Set[str]

The subspaces for edge annotations

Return type

Set[str]

abstract classmethod from_json(jsonfile)

Load Universal Decompositional Semantics dataset from JSON

For node annotations, the format of the JSON passed to this class method must be:

{GRAPHID_1: {NODEID_1_1: DATA,
             ...},
 GRAPHID_2: {NODEID_2_1: DATA,
             ...},
 ...
}

Edge annotations should be of the form:

{GRAPHID_1: {NODEID_1_1%%NODEID_1_2: DATA,
             ...},
 GRAPHID_2: {NODEID_2_1%%NODEID_2_2: DATA,
             ...},
 ...
}

Graph and node identifiers must match the graph and node identifiers of the predpatt graphs to which the annotations will be added. The subclass determines the form of DATA in the above.

Parameters

jsonfile (Union[str, TextIO]) – (path to) file containing annotations as JSON

Return type

UDSAnnotation
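Accepting either a path or an open file, as the jsonfile parameter type suggests, might look like the sketch below. The function load_annotation_json is hypothetical, and treating every str as a filesystem path is an assumption; the library's own handling of string arguments may differ.

```python
import json
from typing import Any, TextIO, Union


def load_annotation_json(jsonfile: Union[str, TextIO]) -> Any:
    """Load annotation JSON from a path or from an open text file.

    Hypothetical helper for illustration; not part of decomp.
    Assumption: a str argument is always a path to a file.
    """
    if isinstance(jsonfile, str):
        with open(jsonfile, encoding="utf-8") as handle:
            return json.load(handle)
    # Anything else is treated as a readable text stream.
    return json.load(jsonfile)
```

For example, passing io.StringIO('{"graph-1": {}}') would return the dictionary {"graph-1": {}} under this sketch.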

property graphids: Set[str]

The identifiers for graphs with either node or edge annotations

Return type

Set[str]

items(annotation_type=None)

Dictionary-like items generator for attributes

If annotation_type is specified as “node” or “edge”, this generator yields a graph identifier and its node or edge attributes (respectively); otherwise, this generator yields a graph identifier and a tuple of its node and edge attributes.
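The three cases described above can be sketched as a generator over two plain dictionaries. This is an illustration under assumptions, not the library's code: iter_items is a hypothetical function, node and edge attributes are assumed to be stored as separate mappings keyed by graph identifier, and rejecting other annotation_type values is also an assumption.

```python
from typing import Any, Dict, Iterator, Optional, Tuple

Attrs = Dict[str, Dict[str, Any]]


def iter_items(
    node_attrs: Attrs, edge_attrs: Attrs, annotation_type: Optional[str] = None
) -> Iterator[Tuple[str, Any]]:
    """Yield (graph ID, attributes) pairs by annotation type.

    Hypothetical helper for illustration; not part of decomp.
    """
    if annotation_type == "node":
        yield from node_attrs.items()
    elif annotation_type == "edge":
        yield from edge_attrs.items()
    elif annotation_type is None:
        # Default: pair node and edge attributes for every annotated graph.
        for graph_id in sorted(set(node_attrs) | set(edge_attrs)):
            yield graph_id, (
                node_attrs.get(graph_id, {}),
                edge_attrs.get(graph_id, {}),
            )
    else:
        raise ValueError("annotation_type must be None, 'node', or 'edge'")


# Made-up attributes: graph-1 has node annotations, graph-2 has edge annotations.
example_nodes = {"graph-1": {"node-a": {}}}
example_edges = {"graph-2": {"node-a%%node-b": {}}}
```

Because this is a generator, the ValueError for an unknown type is raised when iteration starts, not when the function is called.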

property metadata: UDSAnnotationMetadata

All metadata for this annotation

Return type

UDSAnnotationMetadata

property node_attributes

The node attributes

property node_graphids: Set[str]

The identifiers for graphs with node annotations

Return type

Set[str]

property node_subspaces: Set[str]

The subspaces for node annotations

Return type

Set[str]

properties(subspace=None)

The properties in a subspace

Return type

Set[str]

property_metadata(subspace, prop)

The metadata for a property in a subspace

Parameters
  • subspace (str) – The subspace the property is in

  • prop (str) – The property in the subspace

Return type

UDSPropertyMetadata

property subspaces: Set[str]

The subspaces for node and edge annotations

Return type

Set[str]