Everything presented so far in this manual assumes that all examples are
correctly labelled, and therefore that all examples should be covered by
the learned program. In real applications, of course, this is often not
the case; examples may be noisy (i.e. mislabelled), so finding a
program that covers all examples may be impossible, or even undesirable
(as this may overfit to the examples). In ILASP, each example
can be given a penalty, which is a cost for not covering that example.
The search for an optimal learned program now minimises |H| + cost,
where |H| is the length of the hypothesis and cost is the sum of the
penalties of all examples that are not covered by the learned program.
The penalty of an example is a positive integer, and is specified for
each of the four example types as follows:
#pos(id@penalty, { inclusions }, { exclusions }, { context }).
#neg(id@penalty, { inclusions }, { exclusions }, { context }).
#brave_ordering(id@penalty, eg_1, eg_2, ordering_operator).
#cautious_ordering(id@penalty, eg_1, eg_2, ordering_operator).
Examples with no specified penalty are considered non-noisy and are treated as having an infinite penalty, meaning that they must be covered by the learned program.
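As an illustration, the following sketch shows a small hypothetical noisy task; the example identifiers, penalties, and context atoms are invented for this illustration and are not part of any standard task.

```
% Hypothetical task: p_penguin is (we assume) a mislabelled positive
% example of a flying bird, so it is given a small finite penalty.
#pos(p_sparrow@10, { flies }, {}, { bird. }).
#pos(p_penguin@2,  { flies }, {}, { bird. penguin. }).
#neg(n_stone,      { flies }, {}, {}).   % no penalty: must be covered

% A hypothesis H such as
%   flies :- bird.
% (length 2) covers p_sparrow and n_stone but not p_penguin, so its
% score is |H| + cost = 2 + 2 = 4. A hypothesis covering p_penguin as
% well would only be preferred if it were at most 2 literals longer.
```

Under this scoring, a large penalty (such as 10 above) makes leaving an example uncovered expensive, while a small penalty allows ILASP to discard the example cheaply if covering it would require a longer hypothesis.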