Queries with negation
Oftentimes we encounter situations where we would like to form queries with the use of negation. Examples of such queries include:
a) List all people that were unemployed in the last 6 months
b) Show me text-only timelines (no videos or images)
c) All single people who have posted a photo in a forum
The intuitive meaning of a negated pattern is that of a complement. However, a relation complement is not a clearly defined term as it requires the definition of a domain of values with respect to which the complement is computed. Even in this case, we end up with an infinite relation which leaves the projection or join operations unapplicable. As a result, we understand pattern negation in terms of computation of set differences. The set-difference semantics are different to the perhaps familiar semantics of Negation-as-Failure of Prolog and in this chapter, we will attempt to provide a clear explanation of the meaning of pattern negation in TypeDB.
Let us consider the relation of unemployment which can be trivially understood as the absence of employment. We define negation blocks by enclosing patterns with
curly braces and preceding them with a not
keyword:
not {
...
};
Therefore, to retrieve people that are unemployed we want to express:
Person($x), ¬Employment($x, employer: $y)
i.e. we look for the following TypeQL pattern:
Please note that the $y
variable inside the negation block is not bound. This is of great importance when discussing the meaning of negation.
If we were to interpret the query using the simple complement semantics we would arrive at a conclusion that unemployed people are people for which there exists
a company that doesn’t hire them - it would get evaluated to all ($x, $y)
pairs where $x
is a person and $y
is a company that is not
in an employment relation with $x
. What we do instead is we interpret the negation block as some relation of arity equal to the number of variables that are bound
to non-negated statements. This imposes a requirement of at least one variable in the negation block being bound to a variable outside the block.
Here our only bound variable is $x
. Consequently we can think of the query as:
Person($x), ¬SomeRelation($x)
SomeRelation($x) :- Employment(employee: $x, employer: $y)
In this way, we have no problems defining the projection or join operations as these are handled by the native rule semantics. Consequently, we can proceed with the set difference semantics unambiguously. As a result, the unemployment is evaluated according to our expectation of unemployment as an absence of being part of any employment - from the set of people we subtract the set of people being in employment relations. Consequently, as a user the only thing we need to type is our query pattern:
{
$x isa person;
not {
(employee: $x, employer: $y) isa employment;
};
}
The variables in the negation block are local to the negation block. Consequently, executing the query:
will yield a list of concepts assigned to the $x
variable.
Should we decide that our unemployment pattern is a common one, we might decide to express it via a rule. In the case when the set of bound variables has only one element, it is more convenient to define it in terms of a type. Defining the unemployment in terms of a rule and the freshly introduced negation block we can then write:
Consequently, our unemployment query pattern simply becomes:
Negation blocks are not tied to relations. We are free to define our patterns inside the negation blocks as long as they are conjunctive. For example, we can use negation blocks to perform exclusions, e.g. the query pattern to list all non-English speaking employees can read:
We shall now see how we can form more complex patterns with negation. Let’s say we want to find people that are orphans:
Person($x), ¬Parentship($x, father: $y), ¬Parentship($x, mother: $y)
To express that in TypeQL, we require two negation blocks:
which is equivalent to having two extra specific types in the schema:
and then defining a pattern:
Now let’s look at how we arrive at the answers if we execute this as a match query. To do this we will go through
the set difference computation procedure. Let’s define a set P
to be the set of all people, i.e. a set that is the answer set to a query:
a set F
as the set of people having a father, i. e. a set that is the answer set of a query:
and finally a set M
as the set of people having a mother which can be defined in terms of a query:
We can illustrate the relationships between the sets with a suitable Venn diagram:
Consequently, our set of answers of our match pattern is defined as A = P \ F \ M
. As a result, from the set of all people we subtract those who have a father and those who have a mother.
Please note that the scope of variables in a negation block is local to the negation block. As a result, the above pattern does not look for people that do not have a mother and a father that is the same person.
Now let’s say we have three people, Alice (A), Bob (B) and Charlie (C). Consequently, our set P
reads:
P = {Alice, Bob, Charlie}
Additionally, let’s say we know that the following two parentship
relations hold:
Parentship(child: Alice, father: Bob)
Parentship(child: Bob, mother: Charlie)
i.e. Bob is the father of Alice and Charlie is the mother of Bob.
This results in our sets F
and M
being defined as:
F = {Alice}
M = {Bob}
Consequently, the final result of the match query:
will only have the Charlie concept as the result obtained by computing the set difference:
A = P \ F \ M = {Charlie}
Now let’s complicate things a little and see what happens if we bind the $y
variable for parents, i. e. if we consider a query pattern:
Upon inspection, we can see that in this case our three sets are different. Let’s define them as P'
, M'
and F'
.
The first difference is that an element of each set is a pair (2-tuple). Our set of people P
corresponds to the answer set of a match query:
As a result, we have (abbreviating the names for clarity):
P' = {
(A, A), (A, B), (A, C),
(B, A), (B, B), (B, C),
(C, A), (C, B), (C, C)
}
Now the set F'
is a set of pairs {($x, $y)}
such that the concepts in the pair are connected via parentship
relation with the concept assigned to $y
variable playing the role of a father.
Similarly, the set M'
is a set of pairs {($x, $y)}
such that the concepts in the pair are connected via parentship
relation with the concept assigned to $y
variable playing the role of a mother.
Consequently we have:
F' = {(A, B)}
M' = {(B, C)}
Now, executing our query pattern as an ordinary match-get query:
will yield the following concept pairs mapped to $x
and $y
variables:
A' = P' \ F' \ M' =
{
(A, A), (A, C),
(B, A), (B, B),
(C, A), (C, B), (C, C)
}
However! If we now do a projection onto the $x
variable to compare the results with our previous query variant where
the $y
variable is unbounded, i.e. if we execute:
we will get the following concepts in return:
A' = P' \ F' \ M' | x = {A, B, C}
which is clearly different to the anticipated result of A = {C}
- our answer set is not a set of orphans, instead it is a projection from a set of people pairs where pairs playing in a parentship
relation are excluded.
As a result, extra care should be taken and thought given when formulating queries with negation blocks.
One might be tempted to put the two negation blocks into one. Let’s look at the outcome of that. If we define:
noting that this time we need to pick a fresh variable for mother as we are in the same negation block. The meaning of this pattern is the following. From all people, remove the people that have both a mother and a father. As a result, our answer set will contain people that have at most one parent.
We can go further than that. Negation blocks in queries can be nested. Consequently, if we wanted to find people whose father is unemployed we can write something like this:
Please note, nesting of negation blocks is only allowed in queries, but not in rules.
Negation blocks: DOs and DONTs
Please note, the following restrictions apply to negation blocks:
- for each negation block in a query, at least one variable in the negation block must be bound to a statement outside of the negation block. This ensures that set difference operations are performed on sets that are not disjoint
- variables in negation blocks are local to the block they are defined in
Negation in rules
As we have already mentioned, we can use negation blocks within rules.
However, when inserting negation blocks in rules, currently the following restrictions apply:
- all restrictions applying to queries with negation blocks
- only conjunctive statements are allowed within rule negations (i.e. no nested
not
oror
statements) - rules with negations may not contradict themselves (i.e. recurse back to themselves, even indirectly. An error will be thrown if this is possible.)
We will illustrate the use of negation with rules with a graphical example.
Let us define a network of nodes with possible edges between nodes:
Then we can define two nodes as being reachable if there exists an edge between them:
Consequently, with the use of negation we can define edges that are indirect:
We can mark the unreachable nodes by defining the following rule:
Please note the explicit addition of the node
types in the body of the rule. This is to ensure the boundedness condition
is satisfied as well as to maintain the expected meaning of the unreachable
relation.