Kamanja has a core function library, also called the user-defined function (UDF) library, of over 700 functions that can be used as needed within PMML models developed to run on a Kamanja cluster. The library includes a rich set of date and time functions and other common programming operations.

GROUP BY and Aggregations

Kamanja PMML currently supports GROUP BY with a single key only; multiple GROUP BY keys are not yet supported. The available aggregation functions are those defined in the DMG specification, namely Sum, Min, Max, Avg, Median, and Product. GROUP BY is implemented by a function named GroupBy.

When the GroupBy function executes, …
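The single-key grouping and the six aggregations named above can be illustrated outside Kamanja with a short Python sketch. This is not the Kamanja GroupBy function or its PMML syntax, neither of which is shown in this excerpt. The `group_by` helper, its parameters, and the sample `transactions` data are all hypothetical names chosen for illustration.

```python
from collections import defaultdict
from math import prod
from statistics import median

# Aggregation names follow the list in the text; the Python
# implementations behind them are illustrative assumptions.
AGGREGATIONS = {
    "Sum": sum,
    "Min": min,
    "Max": max,
    "Avg": lambda values: sum(values) / len(values),
    "Median": median,
    "Product": prod,
}


def group_by(rows, key, value, aggregation):
    """Group rows (dicts) by exactly one key and aggregate one numeric field."""
    aggregate = AGGREGATIONS[aggregation]
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[value])
    return {group: aggregate(values) for group, values in groups.items()}


# Hypothetical sample data.
transactions = [
    {"account": "A", "amount": 10.0},
    {"account": "B", "amount": 4.0},
    {"account": "A", "amount": 30.0},
    {"account": "B", "amount": 6.0},
    {"account": "A", "amount": 20.0},
]

totals = group_by(transactions, "account", "amount", "Sum")
averages = group_by(transactions, "account", "amount", "Avg")
```

Because only one key is accepted, grouping on two fields at once would require combining them into a single derived key first, which matches the single-key limitation described above.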

Implementation ToDo

While what is available now provides adequate support for most needs, it does not address the more complex object initialization that can be expressed with Kamanja PMML. For example, Kamanja can create complex structures that are more difficult to initialize. For these cases, another sort of instruction extension is used that allows the initialization to be accomplished, …


Pattern – Using missingValueReplacement on Mining Field

This section describes how to provide default values for PMML model variables that remain unset after execution completes. Intent: DataField and DerivedField element values that are intended to be emitted as part of the model output should ideally have some value, even when the model execution did not set the field explicitly. In PMML, the DMG language …
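The idea of filling unset output fields from a declared default can be sketched in Python. The excerpt is truncated before it shows how Kamanja applies the pattern, so this sketch only assumes that the defaults are read from the standard PMML `missingValueReplacement` attribute on `MiningField` elements. The field names, the `usageType` values, and both helper functions are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical MiningSchema fragment; field names are illustrative only.
MINING_SCHEMA = """
<MiningSchema>
  <MiningField name="riskScore" usageType="predicted" missingValueReplacement="0"/>
  <MiningField name="riskLabel" usageType="predicted" missingValueReplacement="UNKNOWN"/>
  <MiningField name="accountId" usageType="active"/>
</MiningSchema>
"""


def replacement_values(schema_xml):
    """Collect the declared replacement value for each MiningField that has one."""
    root = ET.fromstring(schema_xml)
    return {
        field.get("name"): field.get("missingValueReplacement")
        for field in root.iter("MiningField")
        if field.get("missingValueReplacement") is not None
    }


def apply_defaults(output, defaults):
    """Return a copy of the output with unset (absent or None) fields filled in."""
    filled = dict(output)
    for name, default in defaults.items():
        if filled.get(name) is None:
            filled[name] = default
    return filled


defaults = replacement_values(MINING_SCHEMA)
# The model set riskScore but never set riskLabel.
result = apply_defaults({"accountId": "42", "riskScore": 0.7}, defaults)
```

Replacement values are kept as the strings found in the XML; a real consumer would convert them to each field's declared data type.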


Predictive Model Markup Language. A language written in XML that describes the rules that a company wants to give the engine. LigaData uses an augmented version of the Predictive Model Markup Language (PMML) standard from the Data Mining Group (DMG) for writing LigaData models. PMML is an XML-based language for specifying a wide variety of data …

PMML Model – Testing

A New Command to Test a Generated PMML Model. Introduction: This section contains a simple application for those who have some familiarity with the PMML source providers (for example, R/Rattle, KNIME, RapidMiner, SAS) and consumers (KNIME, Zementis’ Adapa). This tool does not use Kamanja, but allows the user to quickly test …

PMML Models – Adding, Updating, and Removing

PMML models are ingested differently from the standard Kamanja mechanism used for Scala, Java, and kPMML models. The model, model version, and message must be supplied on the command line. Here are a few Kamanja script examples to illustrate the basic commands. The value of the KAMANJA_HOME variable in the examples is the Kamanja install …