The simplest form of example in ILASP is a (positive or negative) partial interpretation. A positive partial interpretation is of the following form:
#pos({inc_1, ..., inc_m}, {exc_1, ..., exc_n}).
where each inc_i and exc_i is a ground atom; these atoms are called (respectively) the inclusions and exclusions of the partial interpretation. A
complete interpretation is said to extend the partial interpretation
if it includes all of the inclusions and none of the exclusions. A
positive partial interpretation is covered if there is at least one
answer set of the learned program (the hypothesis combined with the
background knowledge) that extends the partial interpretation.
Similarly, a negative partial interpretation is of the form:
#neg({inc_1, ..., inc_m}, {exc_1, ..., exc_n}).
A negative partial interpretation is covered if there is no answer set of the learned program that extends the partial interpretation.
Positive examples express properties that should be satisfied in at least one answer set, and so naturally capture brave reasoning in ASP. Negative examples, on the other hand, express properties that should hold in no answer set; they can therefore express properties that should be true in all answer sets, thus capturing cautious reasoning in ASP.
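These coverage conditions can be phrased directly as checks over a set of answer sets. The following Python sketch is purely illustrative (the answer sets are invented, and ILASP itself works very differently):

```python
# Hypothetical answer sets of some learned program (invented for illustration).
answer_sets = [{"p", "q"}, {"p", "r"}]

def covers_pos(inclusions, exclusions):
    """#pos(I, E) is covered if SOME answer set contains all of I and
    none of E (brave reasoning)."""
    return any(inclusions <= m and not (exclusions & m) for m in answer_sets)

def covers_neg(inclusions, exclusions):
    """#neg(I, E) is covered if NO answer set extends the partial
    interpretation; e.g. #neg({s}, {}) demands that s is false in
    every answer set (cautious reasoning)."""
    return not covers_pos(inclusions, exclusions)
```

Here covers_pos({"q"}, set()) holds because the first answer set contains q, while covers_neg({"p"}, set()) fails because p is true in every answer set.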
Consider a very simple setting, where we want to learn a program that
describes the behaviour of three (specific) coins, by flipping them and
observing which sides they land on. We can use a very simple mode bias
to describe the rules we are allowed to learn. The predicates heads/1 and tails/1 are both allowed to appear in the head with a single argument, which is either a variable or a constant of type coin (where there are three constants of type coin in the domain: c1, c2 and c3). In the body, we can use three predicates (both positively and negatively): heads/1, tails/1 and coin/1. The mode bias expressing this language is shown below.
#modeh(heads(var(coin))).
#modeh(tails(var(coin))).
#modeb(heads(var(coin))).
#modeb(tails(var(coin))).
#modeb(coin(var(coin))).
#modeh(heads(const(coin))).
#modeh(tails(const(coin))).
#constant(coin, c1).
#constant(coin, c2).
#constant(coin, c3).
We flip the coins twice, and see the following combinations of observations: {heads(c1), tails(c2), heads(c3)} and {heads(c1), heads(c2), tails(c3)}. We can encode these “observations” in ILASP
using positive examples. Positive examples specify properties which
should hold in at least one answer set of the learned program. The two
observations are represented by the following two examples:
#pos({heads(c1), tails(c2), heads(c3)},
{tails(c1), heads(c2), tails(c3)}).
#pos({heads(c1), heads(c2), tails(c3)},
{tails(c1), tails(c2), heads(c3)}).
For a positive example to be covered, there must be at least one answer
set of the learned program that contains all of the inclusions and none
of the exclusions. In this case, these examples mean that there must be
(at least) two answer sets: one which contains heads(c1), tails(c2) and heads(c3), and does not contain tails(c1), heads(c2) or tails(c3); and another which contains heads(c1), heads(c2) and tails(c3), and does not contain tails(c1), tails(c2) or heads(c3). Although these particular examples completely describe the values of all coins, this does not need to be the case in general. Partial examples allow us to represent uncertainty; for example, we could have a fourth coin c4 for which we do not know the value.
Together with the above examples, we can also give the following, very simple, background knowledge, which defines the set of coins we have:
coin(c1). coin(c2). coin(c3).
If we run this task in ILASP, then ILASP returns the following solution:
heads(V1) :- coin(V1), not tails(V1).
tails(V1) :- coin(V1), not heads(V1).
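As an independent sanity check (outside ILASP), the stable models of this particular learned program can be brute-forced in Python using the Gelfond-Lifschitz reduct; this is a sketch specialised to these two rules, not a general ASP solver:

```python
from itertools import product

coins = ["c1", "c2", "c3"]

def answer_sets():
    """Enumerate the stable models of
        heads(V1) :- coin(V1), not tails(V1).
        tails(V1) :- coin(V1), not heads(V1).
    (projected onto the heads/tails atoms) by guess-and-check: an
    interpretation is stable iff it equals the minimal model of its
    own reduct."""
    models = []
    atoms = [("heads", c) for c in coins] + [("tails", c) for c in coins]
    for bits in product([False, True], repeat=len(atoms)):
        interp = {a for a, b in zip(atoms, bits) if b}
        # Reduct: the rule with head heads(c) survives (and fires,
        # since coin(c) is a fact) iff tails(c) is not in the guess;
        # symmetrically for tails(c).
        reduct_model = set()
        for c in coins:
            if ("tails", c) not in interp:
                reduct_model.add(("heads", c))
            if ("heads", c) not in interp:
                reduct_model.add(("tails", c))
        if interp == reduct_model:
            models.append(interp)
    return models

def covers(models, incs, excs):
    """A positive example is covered if some answer set extends it."""
    return any(incs <= m and not (excs & m) for m in models)
```

The enumeration yields exactly eight answer sets (each coin lands on exactly one side), and both positive examples above are covered.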
This program states that every coin must land on either heads or tails, but not both. Although the first coin c1 has never landed on tails in the scenarios we have observed, ILASP has generalised to learn first-order rules that apply to all coins, rather than specific ground rules that only explain the specific instances we have seen. The ability to generalise in this way is a huge advantage of ILP systems over other forms of machine learning, because it usually means that ILP techniques require very few examples to learn general concepts. Note that both positive examples are required for ILASP to learn this general program. Neither positive example on its own is sufficient because in both cases there is a shorter program that explains the example: the set of facts { heads(c1). tails(c2). heads(c3). } covers the first example and, similarly, the set of facts { heads(c1). heads(c2). tails(c3). } covers the second.
It may be that after many more observations, we still have not witnessed c1 landing on tails, and we could be convinced that it never will. In this case, we can use ILASP’s negative examples to specify that there should be no answer set that contains tails(c1). This example is expressed in ILASP as follows:
#neg({tails(c1)}, {}).
Given this extra example, ILASP learns the slightly larger program:
heads(V1) :- coin(V1), not tails(V1).
tails(V1) :- coin(V1), not heads(V1).
heads(c1).
This program states that all coins must land on either heads or tails, but not both, except for c1, which can only land on heads. Note that negative examples often cause ILASP to learn programs with rules that eliminate answer sets. In this case, the fact heads(c1) eliminates all answer sets that contain tails(c1). Negative examples are often used to learn constraints. The constraint :- tails(c1). would have the same effect; however, it is not permitted because the mode bias does not allow constants to be used in the body of a rule.
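That the fact heads(c1) and the constraint :- tails(c1). have the same effect here can be checked with the same kind of brute-force reduct sketch (Python, illustrative only, specialised to this program):

```python
from itertools import product

coins = ["c1", "c2", "c3"]

def answer_sets(with_fact=False):
    """Stable models of the two coin rules, optionally extended with
    the fact heads(c1), by guess-and-check over the heads/tails atoms."""
    models = set()
    atoms = [("heads", c) for c in coins] + [("tails", c) for c in coins]
    for bits in product([False, True], repeat=len(atoms)):
        interp = {a for a, b in zip(atoms, bits) if b}
        # Minimal model of the reduct; the fact (if present) always holds.
        reduct_model = {("heads", "c1")} if with_fact else set()
        for c in coins:
            if ("tails", c) not in interp:
                reduct_model.add(("heads", c))
            if ("heads", c) not in interp:
                reduct_model.add(("tails", c))
        if interp == reduct_model:
            models.add(frozenset(interp))
    return models

# The constraint ":- tails(c1)." instead filters out the answer sets of
# the original program that contain tails(c1).
with_fact = answer_sets(with_fact=True)
with_constraint = {m for m in answer_sets() if ("tails", "c1") not in m}
```

Both variants yield the same four answer sets, none of which contains tails(c1).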
Positive and negative examples of partial answer sets are targeted at
learning a fixed program. When ASP is used in practice, however, a
program representing a general problem definition is often combined with
another set of rules (usually just facts) describing a particular
instance of the problem to be solved. For instance, a general program
defining what it means for a graph to be Hamiltonian (i.e. the general
problem definition) can be combined with a set of facts describing a
particular graph (i.e. a problem instance). The combined program is
satisfiable if and only if the graph represented by the set of facts is
Hamiltonian. The context-dependent behaviour of the general Hamilton
program cannot be captured by positive and negative examples of partial
answer sets. Instead, we need an extension, called a context-dependent
example. This allows each example to come with its own extra bit of
background knowledge, called a context C, which applies only to that example. It is now the learned program combined with this context (i.e. the union of B, H and C) that has to satisfy the semantic properties of the example, rather than the learned program alone. In ILASP, the context of an example is expressed by adding an extra set to the example, containing the context.
Consider the following two context-dependent examples, which each represent a graph with four nodes and various edges:
#pos({}, {}, {
node(1..4).
edge(1, 2).
edge(2, 3).
edge(3, 4).
edge(4, 1).
}).
#neg({}, {}, {
node(1..4).
edge(1, 2).
edge(2, 1).
edge(2, 3).
edge(3, 4).
edge(4, 2).
}).
Both examples have empty inclusions and exclusions. In the case of a
positive example, this simply means that there must exist at least one
answer set of the learned program combined with the context – any
answer set is consistent with the empty partial interpretation – and in
the case of a negative example, it means that there should be no answer
set of the learned program combined with the context. Given a sufficient
number of examples of this form, ILASP can be used to learn a program
that corresponds to the definition of a Hamiltonian graph; i.e. the
learned program combined with a set of facts C is satisfiable if and only if C represents a Hamiltonian graph. The full program learned by ILASP is:
0 { in(V0, V1) } 1 :- edge(V0, V1).
reach(V0) :- in(1, V0).
reach(V1) :- reach(V0), in(V0, V1).
:- not reach(V0), node(V0).
:- V1 != V2, in(V0, V2), in(V0, V1).
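The target behaviour can also be checked independently of ASP. The Python sketch below is a plain brute-force Hamiltonicity test (a hypothetical helper, not part of the ILASP task); it agrees with the two context-dependent examples above:

```python
from itertools import permutations

def is_hamiltonian(nodes, edges):
    """True iff the directed graph has a Hamiltonian cycle: a cycle
    visiting every node exactly once. This is the property the learned
    program captures with its in/2 atoms."""
    edge_set = set(edges)
    first, *rest = nodes
    # Fixing the start node avoids re-checking rotations of a cycle.
    for perm in permutations(rest):
        cycle = [first, *perm, first]
        if all((a, b) in edge_set for a, b in zip(cycle, cycle[1:])):
            return True
    return False

# Context of the positive example: must have an answer set, so the
# graph should be Hamiltonian.
pos_graph = is_hamiltonian([1, 2, 3, 4], [(1, 2), (2, 3), (3, 4), (4, 1)])
# Context of the negative example: must have no answer set.
neg_graph = is_hamiltonian([1, 2, 3, 4],
                           [(1, 2), (2, 1), (2, 3), (3, 4), (4, 2)])
```

The first graph is a single directed cycle through all four nodes; in the second, node 1 can only be entered from and exited to node 2, so no Hamiltonian cycle exists.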
This example shows the high expressive power of ILASP, compared to many other
ILP systems, which are only able to learn definite logic programs. In this
case, ILASP has learned a choice rule, constraints and a recursive definition
of reachability. The full learning task, hamilton.las, used to learn this program is available here.