PredPatt Sentence Graphs
The semantic graphs that form the second layer of annotation in the dataset are produced by the PredPatt system. PredPatt takes as input a UD parse for a single sentence and produces a set of predicates and, for each predicate, a set of arguments in that sentence. Each predicate and argument is associated with a single head token in the sentence as well as a set of tokens that make up the predicate or argument (its span). A predicate or argument span may be trivial, containing only the head token.
For example, given the dependency parse for the sentence Chris gave the book to Pat ., PredPatt produces the following.
?a gave ?b to ?c
?a: Chris
?b: the book
?c: Pat
Assuming UD’s 1-indexation, the single predicate in this sentence (gave…to) has a head at position 2 and a span over positions {2, 5}. This predicate has three arguments, one headed by Chris at position 1, with span over position {1}; one headed by book at position 4, with span over positions {3, 4}; and one headed by Pat at position 6, with span over position {6}.
See the PredPatt documentation tests for examples.
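The head and span positions in the example above can be checked with a short sketch. The tuple layout and the helper name span_text are illustrative only and are not part of PredPatt's API.

```python
# Tokens of the example sentence; UD positions are 1-indexed.
tokens = ["Chris", "gave", "the", "book", "to", "Pat", "."]

# (head position, span positions) for the predicate and its arguments,
# copied from the description above. This layout is illustrative only.
predicate = (2, {2, 5})
arguments = [(1, {1}), (4, {3, 4}), (6, {6})]


def span_text(span):
    """Join the tokens at the given 1-indexed positions in sentence order."""
    return " ".join(tokens[i - 1] for i in sorted(span))


print(span_text(predicate[1]))               # gave to
print([span_text(s) for _, s in arguments])  # ['Chris', 'the book', 'Pat']
```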
Each predicate and argument produced by PredPatt is associated with a node in a digraph with identifier ewt-SPLIT-SENTNUM-semantics-TYPE-HEADTOKNUM, where TYPE is always either pred or arg and HEADTOKNUM is the ordinal position of the head token within the sentence (1-indexed, following the convention in UD-EWT). At minimum, each such node has the following attributes.
domain (str): the subgraph this node is part of (always semantics)
type (str): the type of the object in the particular domain (either predicate or argument)
frompredpatt (bool): whether this node is associated with a predicate or argument output by PredPatt (always True)
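As a rough illustration of the identifier scheme and the minimum node attributes, the sketch below builds both from plain Python values. The function names are hypothetical, and "train" and 1 are made-up stand-ins for SPLIT and SENTNUM.

```python
def semantics_node_id(split, sentnum, node_type, head_position):
    """Build an id of the form ewt-SPLIT-SENTNUM-semantics-TYPE-HEADTOKNUM."""
    if node_type not in ("pred", "arg"):
        raise ValueError("node_type must be 'pred' or 'arg'")
    return f"ewt-{split}-{sentnum}-semantics-{node_type}-{head_position}"


def predpatt_node_attrs(node_type):
    """Return the minimum attributes of a PredPatt-produced semantics node."""
    full_type = {"pred": "predicate", "arg": "argument"}[node_type]
    return {"domain": "semantics", "type": full_type, "frompredpatt": True}


# "train" and 1 are made-up values for SPLIT and SENTNUM.
node_id = semantics_node_id("train", 1, "pred", 2)
print(node_id)  # ewt-train-1-semantics-pred-2
print(predpatt_node_attrs("pred"))
```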
Predicate and argument nodes produced by PredPatt furthermore always have at least one outgoing instance edge. These edges point to the nodes in the syntax domain that correspond to the tokens in the span of the predicate or argument. At minimum, each such edge has the following attributes.
domain (str): the subgraph this edge is part of (always interface)
type (str): the type of the object in the particular domain (either head or nonhead)
frompredpatt (bool): whether this edge is associated with a predicate or argument output by PredPatt (always True)
Because PredPatt produces a unique head for each predicate and argument, there is always exactly one instance edge of type head from any particular node in the semantics domain. There may or may not be instance edges of type nonhead.
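The relationship between a span and its instance edges can be sketched as follows. This is a minimal illustration, not decomp's implementation: the function name is hypothetical, and the naming of syntax nodes (a prefix followed by the token position) is an assumption made for the example.

```python
def instance_edges(semantics_node, head, span, syntax_prefix):
    """Map (source, target) pairs to the minimum instance-edge attributes."""
    edges = {}
    for position in sorted(span | {head}):
        # Assumed syntax node naming: prefix plus 1-indexed token position.
        target = f"{syntax_prefix}-{position}"
        edges[(semantics_node, target)] = {
            "domain": "interface",
            "type": "head" if position == head else "nonhead",
            "frompredpatt": True,
        }
    return edges


# The argument "the book": head at position 4, span over positions {3, 4}.
edges = instance_edges(
    "ewt-train-1-semantics-arg-4", head=4, span={3, 4},
    syntax_prefix="ewt-train-1-syntax",
)
heads = [edge for edge, attrs in edges.items() if attrs["type"] == "head"]
print(heads)
```

However large the span, exactly one of the resulting edges has type head.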
In addition to instance edges, predicate nodes always have exactly one outgoing edge connecting them to each of the nodes corresponding to their arguments. At minimum, each such edge has the following attributes.
domain (str): the subgraph this edge is part of (always semantics)
type (str): the type of the object in the particular domain (always dependency)
frompredpatt (bool): whether this edge is associated with a predicate or argument output by PredPatt (always True)
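For the earlier example with gave, the dependency edges could be represented roughly as below. The function name and the SPLIT/SENTNUM values ("train", 1) are made up for illustration.

```python
def dependency_edges(predicate_node, argument_nodes):
    """Create one semantics dependency edge from the predicate to each argument."""
    return {
        (predicate_node, argument): {
            "domain": "semantics",
            "type": "dependency",
            "frompredpatt": True,
        }
        for argument in argument_nodes
    }


# Made-up SPLIT and SENTNUM values; heads at positions 2 (gave), 1, 4 and 6.
prefix = "ewt-train-1-semantics"
edges = dependency_edges(
    f"{prefix}-pred-2", [f"{prefix}-arg-{i}" for i in (1, 4, 6)]
)
for (source, target), attrs in edges.items():
    print(source, "->", target, attrs["type"])
```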
There is one special case where an argument node has an outgoing edge that points to a predicate node: clausal subordination.
For example, given the dependency parse for the sentence Gene thinks that Chris gave the book to Pat ., PredPatt produces the following.
?a thinks ?b
?a: Gene
?b: SOMETHING := that Chris gave the book to Pat
?a gave ?b to ?c
?a: Chris
?b: the book
?c: Pat
In this case, the second argument of the predicate headed by thinks is the argument that Chris gave the book to Pat, which is headed by gave. This argument is associated with a node of type argument with span over positions {3, 4, 5, 6, 7, 8, 9} and identifier ewt-SPLIT-SENTNUM-semantics-arg-5. In addition, there is a predicate headed by gave. This predicate is associated with a node with span over positions {5, 8} and identifier ewt-SPLIT-SENTNUM-semantics-pred-5. Node ewt-SPLIT-SENTNUM-semantics-arg-5 then has an outgoing edge pointing to ewt-SPLIT-SENTNUM-semantics-pred-5. At minimum, each such edge has the following attributes.
domain (str): the subgraph this edge is part of (always semantics)
type (str): the type of the object in the particular domain (always head)
frompredpatt (bool): whether this edge is associated with a predicate or argument output by PredPatt (always True)
The type attribute in this case has the same value as on instance edges of type head, but crucially the domain attribute is distinct. In the case of instance edges, it is interface, and in the case of clausal subordination, it is semantics. This matters when making queries against the graph.
If the frompredpatt attribute has value True, it is guaranteed that the only semantics edges of type head are ones that involve clausal subordination like the above. This is not guaranteed for nodes for which the frompredpatt attribute has value False.
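To see why the domain attribute matters for queries, consider this minimal sketch over plain dictionaries. It does not use decomp's query interface; the helper edges_where is hypothetical, and the identifiers use made-up SPLIT/SENTNUM values and an assumed naming for syntax nodes.

```python
# Minimum attributes on three edges from the clausal subordination example.
# Identifiers use made-up SPLIT/SENTNUM values and an assumed syntax naming.
edges = {
    ("ewt-train-1-semantics-arg-5", "ewt-train-1-syntax-5"): {
        "domain": "interface", "type": "head", "frompredpatt": True},
    ("ewt-train-1-semantics-arg-5", "ewt-train-1-semantics-pred-5"): {
        "domain": "semantics", "type": "head", "frompredpatt": True},
    ("ewt-train-1-semantics-pred-2", "ewt-train-1-semantics-arg-5"): {
        "domain": "semantics", "type": "dependency", "frompredpatt": True},
}


def edges_where(edges, **conditions):
    """Return the edges whose attributes match every key/value condition."""
    return [
        edge for edge, attrs in edges.items()
        if all(attrs.get(key) == value for key, value in conditions.items())
    ]


# Filtering on type alone mixes instance edges with clausal subordination.
print(len(edges_where(edges, type="head")))  # 2
# Adding the domain isolates the clausal subordination edge.
print(edges_where(edges, domain="semantics", type="head"))
```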
Every semantic graph contains at least four additional performative nodes that are not produced by PredPatt (and for which the frompredpatt attribute thus has value False).
ewt-SPLIT-SENTNUM-semantics-arg-0: an argument node representing the entire sentence in the same way complement clauses are represented
ewt-SPLIT-SENTNUM-semantics-pred-root: a predicate node representing the author’s production of the entire sentence directed at the addressee
ewt-SPLIT-SENTNUM-semantics-arg-speaker: an argument node representing the author
ewt-SPLIT-SENTNUM-semantics-arg-addressee: an argument node representing the addressee
All of these nodes have a domain attribute with value semantics. Unlike nodes associated with PredPatt predicates and arguments, ewt-SPLIT-SENTNUM-semantics-pred-root, ewt-SPLIT-SENTNUM-semantics-arg-speaker, and ewt-SPLIT-SENTNUM-semantics-arg-addressee have no instance edges connecting them to syntactic nodes. In contrast, ewt-SPLIT-SENTNUM-semantics-arg-0 has an instance head edge to ewt-SPLIT-SENTNUM-root-0.
The ewt-SPLIT-SENTNUM-semantics-arg-0 node has semantics head edges to each of the predicate nodes in the graph that are not dominated by any other semantics node. This node, in addition to ewt-SPLIT-SENTNUM-semantics-arg-speaker and ewt-SPLIT-SENTNUM-semantics-arg-addressee, has a dependency edge to ewt-SPLIT-SENTNUM-semantics-pred-root.
These nodes are included for purposes of forward compatibility. None of them currently has annotation attributes, but future releases of decomp will include annotations on either them or their edges.
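The performative nodes and their edges can be sketched as plain dictionaries. This is an illustration under stated assumptions rather than decomp's actual construction: the function name is hypothetical, "ewt-train-1" stands in for ewt-SPLIT-SENTNUM, the dependency edges are assumed to run from the predicate to its arguments as they do for PredPatt predicates, and only the domain and type attributes of each edge are shown.

```python
def performative_structure(prefix, top_level_predicates):
    """Return (nodes, edges) for the four performative nodes of one sentence."""
    arg0 = f"{prefix}-semantics-arg-0"
    root = f"{prefix}-semantics-pred-root"
    speaker = f"{prefix}-semantics-arg-speaker"
    addressee = f"{prefix}-semantics-arg-addressee"

    nodes = {
        node: {"domain": "semantics", "frompredpatt": False}
        for node in (arg0, root, speaker, addressee)
    }

    # Only arg-0 is tied to the syntax, via an instance head edge to root-0.
    edges = {(arg0, f"{prefix}-root-0"): {"domain": "interface", "type": "head"}}

    # arg-0 points to every predicate not dominated by another semantics node.
    for predicate in top_level_predicates:
        edges[(arg0, predicate)] = {"domain": "semantics", "type": "head"}

    # Assumed direction: predicate -> argument, as for PredPatt dependency edges.
    for argument in (arg0, speaker, addressee):
        edges[(root, argument)] = {"domain": "semantics", "type": "dependency"}

    return nodes, edges


nodes, edges = performative_structure(
    "ewt-train-1", ["ewt-train-1-semantics-pred-2"]
)
print(len(nodes), len(edges))  # 4 5
```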