The simplest form of example in ILASP is a (positive or negative) partial interpretation. A positive partial interpretation is of the following form:
#pos({inc_1, ..., inc_m}, {exc_1, ..., exc_n}).
where each inc_i and exc_i is a ground atom; these atoms are called (respectively) the inclusions and exclusions of the partial interpretation. A
complete interpretation is said to extend the partial interpretation
if it includes all of the inclusions and none of the exclusions. A
positive partial interpretation is covered if there is at least one
answer set of the learned program (the hypothesis combined with the
background knowledge) that extends the partial interpretation.
Similarly, a negative partial interpretation is of the form:
#neg({inc_1, ..., inc_m}, {exc_1, ..., exc_n}).
A negative partial interpretation is covered if there is no answer set of the learned program that extends the partial interpretation.
Positive examples express properties that should be satisfied in at least one answer set, and so naturally capture brave reasoning in ASP. Negative examples, on the other hand, express properties that should hold in no answer set; they can therefore express properties that should be true in all answer sets, thus capturing cautious reasoning in ASP.
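These coverage conditions can be phrased directly as checks over a set of answer sets. The following Python sketch is purely illustrative (the answer sets are invented, and ILASP itself works very differently):

```python
# Hypothetical answer sets of some learned program (invented for illustration).
answer_sets = [{"p", "q"}, {"p", "r"}]

def covers_pos(inclusions, exclusions):
    """#pos(I, E) is covered if SOME answer set contains all of I and
    none of E (brave reasoning)."""
    return any(inclusions <= m and not (exclusions & m) for m in answer_sets)

def covers_neg(inclusions, exclusions):
    """#neg(I, E) is covered if NO answer set extends the partial
    interpretation; e.g. #neg({s}, {}) demands that s is false in
    every answer set (cautious reasoning)."""
    return not covers_pos(inclusions, exclusions)
```

Here covers_pos({"q"}, set()) holds because the first answer set contains q, while covers_neg({"p"}, set()) fails because p is true in every answer set.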
Consider a very simple setting, where we want to learn a program that
describes the behaviour of three (specific) coins, by flipping them and
observing which sides they land on. We can use a very simple mode bias
to describe the rules we are allowed to learn. The predicates heads/1 and tails/1 are both allowed to appear in the head with a single argument, which is either a variable or a constant of type coin (where there are three constants of type coin in the domain: c1, c2 and c3). In the body, we can use three predicates (both positively and negatively): heads/1, tails/1 and coin/1. The mode bias expressing this language is shown below.
#modeh(heads(var(coin))).
#modeh(tails(var(coin))).
#modeb(heads(var(coin))).
#modeb(tails(var(coin))).
#modeb(coin(var(coin))).
#modeh(heads(const(coin))).
#modeh(tails(const(coin))).
#constant(coin, c1).
#constant(coin, c2).
#constant(coin, c3).
We flip the coins twice, and see the following combinations of observations: {heads(c1), tails(c2), heads(c3)} and {heads(c1), heads(c2), tails(c3)}. We can encode these “observations” in ILASP
using positive examples. Positive examples specify properties which
should hold in at least one answer set of the learned program. The two
observations are represented by the following two examples:
#pos({heads(c1), tails(c2), heads(c3)},
{tails(c1), heads(c2), tails(c3)}).
#pos({heads(c1), heads(c2), tails(c3)},
{tails(c1), tails(c2), heads(c3)}).
For a positive example to be covered, there must be at least one answer
set of the learned program that contains all of the inclusions and none
of the exclusions. In this case, these examples mean that there must be
(at least) two answer sets: one which contains heads(c1), tails(c2) and heads(c3), and does not contain tails(c1), heads(c2) or tails(c3); and another which contains heads(c1), heads(c2) and tails(c3), and does not contain tails(c1), tails(c2) or heads(c3). Although these particular examples completely describe the values of all coins, this does not need to be the case in general. Partial examples allow us to represent uncertainty; for example, we could have a fourth coin c4 for which we do not know the value.
Together with the above examples, we can also give the following, very simple, background knowledge, which defines the set of coins we have:
coin(c1). coin(c2). coin(c3).
If we run this task in ILASP, then ILASP returns the following solution:
heads(V1) :- coin(V1), not tails(V1).
tails(V1) :- coin(V1), not heads(V1).
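As an independent sanity check (outside ILASP), the stable models of this particular learned program can be brute-forced in Python using the Gelfond-Lifschitz reduct; this is a sketch specialised to these two rules, not a general ASP solver:

```python
from itertools import product

coins = ["c1", "c2", "c3"]

def answer_sets():
    """Enumerate the stable models of
        heads(V1) :- coin(V1), not tails(V1).
        tails(V1) :- coin(V1), not heads(V1).
    (projected onto the heads/tails atoms) by guess-and-check: an
    interpretation is stable iff it equals the minimal model of its
    own reduct."""
    models = []
    atoms = [("heads", c) for c in coins] + [("tails", c) for c in coins]
    for bits in product([False, True], repeat=len(atoms)):
        interp = {a for a, b in zip(atoms, bits) if b}
        # Reduct: the rule with head heads(c) survives (and fires,
        # since coin(c) is a fact) iff tails(c) is not in the guess;
        # symmetrically for tails(c).
        reduct_model = set()
        for c in coins:
            if ("tails", c) not in interp:
                reduct_model.add(("heads", c))
            if ("heads", c) not in interp:
                reduct_model.add(("tails", c))
        if interp == reduct_model:
            models.append(interp)
    return models

def covers(models, incs, excs):
    """A positive example is covered if some answer set extends it."""
    return any(incs <= m and not (excs & m) for m in models)
```

The enumeration yields exactly eight answer sets (each coin lands on exactly one side), and both positive examples above are covered.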
This program states that every coin must land on either heads or tails, but not both. Although the first coin c1 has never landed on tails in the scenarios we have observed, ILASP has generalised to learn first-order rules that apply to all coins, rather than specific ground rules that only explain the specific instances we have seen. The ability to generalise in this way is a huge advantage of ILP systems over other forms of machine learning, because it usually means that ILP techniques require very few examples to learn general concepts. Note that both positive examples are required for ILASP to learn this general program. Neither positive example on its own is sufficient because in both cases there is a shorter program that explains the example: the set of facts { heads(c1). tails(c2). heads(c3). } covers the first example and, similarly, the set of facts { heads(c1). heads(c2). tails(c3). } covers the second.
It may be that after many more observations, we still have not witnessed c1 landing on tails, and we could be convinced that it never will. In this case, we can use ILASP’s negative examples to specify that there should be no answer set that contains tails(c1). This example is expressed in ILASP as follows:
#neg({tails(c1)}, {}).
Given this extra example, ILASP learns the slightly larger program:
heads(V1) :- coin(V1), not tails(V1).
tails(V1) :- coin(V1), not heads(V1).
heads(c1).
This program states that all coins must land on either heads or tails, but not both, except for c1, which can only land on heads. Note that negative examples often cause ILASP to learn programs with rules that eliminate answer sets. In this case, the fact heads(c1) eliminates all answer sets that contain tails(c1). Negative examples are often used to learn constraints. The constraint :- tails(c1). would have the same effect; however, it is not permitted because the mode bias does not allow constants to be used in the body of a rule.
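That the fact heads(c1) and the constraint :- tails(c1). have the same effect here can be checked with the same kind of brute-force reduct sketch (Python, illustrative only, specialised to this program):

```python
from itertools import product

coins = ["c1", "c2", "c3"]

def answer_sets(with_fact=False):
    """Stable models of the two coin rules, optionally extended with
    the fact heads(c1), by guess-and-check over the heads/tails atoms."""
    models = set()
    atoms = [("heads", c) for c in coins] + [("tails", c) for c in coins]
    for bits in product([False, True], repeat=len(atoms)):
        interp = {a for a, b in zip(atoms, bits) if b}
        # Minimal model of the reduct; the fact (if present) always holds.
        reduct_model = {("heads", "c1")} if with_fact else set()
        for c in coins:
            if ("tails", c) not in interp:
                reduct_model.add(("heads", c))
            if ("heads", c) not in interp:
                reduct_model.add(("tails", c))
        if interp == reduct_model:
            models.add(frozenset(interp))
    return models

# The constraint ":- tails(c1)." instead filters out the answer sets of
# the original program that contain tails(c1).
with_fact = answer_sets(with_fact=True)
with_constraint = {m for m in answer_sets() if ("tails", "c1") not in m}
```

Both variants yield the same four answer sets, none of which contains tails(c1).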
Positive and negative examples of partial answer sets are targeted at
learning a fixed program. When ASP is used in practice, however, a
program representing a general problem definition is often combined with
another set of rules (usually just facts) describing a particular
instance of the problem to be solved. For instance, a general program
defining what it means for a graph to be Hamiltonian (i.e. the general
problem definition) can be combined with a set of facts describing a
particular graph (i.e. a problem instance). The combined program is
satisfiable if and only if the graph represented by the set of facts is
Hamiltonian. The context-dependent behaviour of the general Hamilton
program cannot be captured by positive and negative examples of partial
answer sets. Instead, we need an extension, called a context-dependent
example. This allows each example to come with its own extra bit of
background knowledge, called a context C, which applies only to that example. It is now the learned program combined with this context (i.e. the union of B, H and C) that has to satisfy the semantic properties of the example, rather than the learned program alone. In ILASP, the context of an example is expressed by adding an extra set to the example, containing the context.
Consider the following two context-dependent examples, which each represent a graph with four nodes and various edges:
#pos({}, {}, {
node(1..4).
edge(1, 2).
edge(2, 3).
edge(3, 4).
edge(4, 1).
}).
#neg({}, {}, {
node(1..4).
edge(1, 2).
edge(2, 1).
edge(2, 3).
edge(3, 4).
edge(4, 2).
}).
Both examples have empty inclusions and exclusions. In the case of a
positive example, this simply means that there must exist at least one
answer set of the learned program combined with the context – any
answer set is consistent with the empty partial interpretation – and in
the case of a negative example, it means that there should be no answer
set of the learned program combined with the context. Given a sufficient
number of examples of this form, ILASP can be used to learn a program
that corresponds to the definition of a Hamiltonian graph; i.e. the
learned program combined with a set of facts C is satisfiable if and only if C represents a Hamiltonian graph. The full program learned by ILASP is:
0 { in(V0, V1) } 1 :- edge(V0, V1).
reach(V0) :- in(1, V0).
reach(V1) :- reach(V0), in(V0, V1).
:- not reach(V0), node(V0).
:- V1 != V2, in(V0, V2), in(V0, V1).
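The target behaviour can also be checked independently of ASP. The Python sketch below is a plain brute-force Hamiltonicity test (a hypothetical helper, not part of the ILASP task); it agrees with the two context-dependent examples above:

```python
from itertools import permutations

def is_hamiltonian(nodes, edges):
    """True iff the directed graph has a Hamiltonian cycle: a cycle
    visiting every node exactly once. This is the property the learned
    program captures with its in/2 atoms."""
    edge_set = set(edges)
    first, *rest = nodes
    # Fixing the start node avoids re-checking rotations of a cycle.
    for perm in permutations(rest):
        cycle = [first, *perm, first]
        if all((a, b) in edge_set for a, b in zip(cycle, cycle[1:])):
            return True
    return False

# Context of the positive example: must have an answer set, so the
# graph should be Hamiltonian.
pos_graph = is_hamiltonian([1, 2, 3, 4], [(1, 2), (2, 3), (3, 4), (4, 1)])
# Context of the negative example: must have no answer set.
neg_graph = is_hamiltonian([1, 2, 3, 4],
                           [(1, 2), (2, 1), (2, 3), (3, 4), (4, 2)])
```

The first graph is a single directed cycle through all four nodes; in the second, node 1 can only be entered from and exited to node 2, so no Hamiltonian cycle exists.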
This example shows the high expressive power of ILASP, compared to many other
ILP systems, which are only able to learn definite logic programs. In this
case, ILASP has learned a choice rule, constraints and a recursive definition
of reachability. The full learning task, hamilton.las, used to learn this program is available here.