Inductive Learning of Answer Set Programs


In recent years, some ILP systems, such as Metagol, have abandoned mode biases in favour of a new form of bias called metarules. A metarule is essentially a template for a rule that may be learned. Metarules can be simulated in ILASP (from version 3.5.0, as an "experimental" feature) by adding rules to the background knowledge in which the predicate symbols are replaced with variables. For these rules to be safe, all "predicate variables" must occur as one of the arguments of at least one positive body literal. Usually, all predicate variables will occur together in a single positive literal which "names" the metarule. Consider the following metarules from the Metagol GitHub repository:

metarule([P,Q], [P,A,B], [[Q,A,B]]). % identity
metarule([P,Q], [P,A,B], [[Q,B,A]]). % inverse
metarule([P,Q,R], [P,A,B], [[Q,A],[R,A,B]]). % precon
metarule([P,Q,R], [P,A,B], [[Q,A,B],[R,B]]). % postcon
metarule([P,Q,R], [P,A,B], [[Q,A,C],[R,C,B]]). % chain

These can be represented in ILASP by adding the following rules to the background knowledge:

P(A, B) :- Q(A, B), identity(P, Q).
P(A, B) :- Q(B, A), inverse(P, Q).
P(A, B) :- Q(A), R(A, B), precon(P, Q, R).
P(A, B) :- Q(A, B), R(B), postcon(P, Q, R).
P(A, B) :- Q(A, C), R(C, B), chain(P, Q, R).

For these metarules to be used, it is necessary to define the set of predicates that can be used to instantiate the predicate variables. For this purpose, we use the new #predicate declaration. This declaration takes two arguments: the type of the predicate, and the predicate schema (specified as predicate_name/predicate_arity). The predicate types do not have any semantic meaning, and are only used to restrict which predicates can be used to instantiate which predicate variables.

#predicate(base, edge/2).
#predicate(invented, p1/2).
#predicate(invented, p2/2).

Finally, the new #modem declarations are used to link the #predicate declarations with the metarules. These are similar to mode declarations, with the main difference being that their arguments are abstracted predicate schemas of the form predicate_type/predicate_arity. The other difference is that the first argument of a #modem declaration denotes not the recall, but the cost of using the metarule. Note that the special predicate type any means that a predicate of any type can be used.

For example, consider the following bias:

P(A, B) :- Q(B, A), inverse(P, Q).
P(A, B) :- Q(A, B), R(B), postcon(P, Q, R).

#predicate(base, node/1).
#predicate(base, edge/2).
#predicate(invented, p1/2).
#predicate(invented, p2/2).

#modem(2, inverse(invented/2, base/2)).
#modem(3, postcon(invented/2, any/2, base/1)).

The above bias is equivalent to expressing the following hypothesis space:

2 ~ p1(A, B) :- edge(B, A).
2 ~ p2(A, B) :- edge(B, A).
3 ~ p1(A, B) :- edge(A, B), node(B).
3 ~ p1(A, B) :- p1(A, B), node(B).
3 ~ p1(A, B) :- p2(A, B), node(B).
3 ~ p2(A, B) :- edge(A, B), node(B).
3 ~ p2(A, B) :- p1(A, B), node(B).
3 ~ p2(A, B) :- p2(A, B), node(B).

It should be noted that metarules have been introduced as a purely experimental feature. Due to some technical details of how they work internally in ILASP, metarule-based biases can sometimes be less efficient than equivalent hypothesis spaces defined explicitly or through mode declarations. We hope to address this in future versions of ILASP.

More expressive metarules

One of the interesting features of combining metarules with ILASP is that more expressive rule templates can be defined. For example, the following metarule represents a choice rule:

0 { P(X, Y) } 1 :- Q(X, Y), subset(P, Q).

#modem(3, subset(any/2,any/2)).

Metarules can similarly be used to represent hard constraints and even weak constraints.
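As a sketch of how this might look (the predicate name incompatible and the cost are illustrative, not part of ILASP), a hard-constraint metarule ruling out two unary predicates ever holding for the same element could be written as:

:- P(X), Q(X), incompatible(P, Q).

#modem(2, incompatible(any/1, any/1)).

Note that the rule remains safe because both predicate variables P and Q occur as arguments of the positive body literal incompatible(P, Q), exactly as required for the rule metarules above.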

Because metarules have been implemented in a general way, they can be used not only for single rules, but also for program templates. For instance, the following closure metarule represents that the predicate P is the transitive closure of Q:

P(X, Y) :- Q(X, Y), closure(P, Q).
P(X, Z) :- P(X, Y), Q(Y, Z), closure(P, Q).
#modem(5, closure(any/2,any/2)).

This can be useful in restricting the structure of the learned program; for instance, it can be used to control how recursion can be used in the hypothesis (e.g. to enforce stratification).
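For instance, combining the closure metarule with the predicate declarations from earlier (the choice of p1 and the cost of 5 are illustrative), the only way the bias below allows recursion is through the transitive-closure pattern, so p1 can only be learned as the transitive closure of edge:

P(X, Y) :- Q(X, Y), closure(P, Q).
P(X, Z) :- P(X, Y), Q(Y, Z), closure(P, Q).

#predicate(base, edge/2).
#predicate(invented, p1/2).

#modem(5, closure(invented/2, base/2)).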