Cluster (Data Mining)

In data mining, cluster analysis is a type of unsupervised learning that groups together records with similar attributes (fields). A cluster center is the per-field average over the records in that cluster, taken for each field used to define the cluster. For example, if the data were the X, Y positions of points on a barn door, …
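The cluster-center definition above can be sketched in a few lines of Python. This is only an illustration: the function name `cluster_centers`, the sample points, and the labels "a" and "b" are hypothetical and do not come from Kamanja or any particular data mining library. It assumes each record has already been assigned to a cluster.

```python
from collections import defaultdict


def cluster_centers(records, labels):
    """Return {label: tuple of per-field means} for already-labelled records.

    Hypothetical helper for illustration; not part of any library.
    """
    sums = {}
    counts = defaultdict(int)
    for record, label in zip(records, labels):
        if label not in sums:
            sums[label] = [0.0] * len(record)
        # Accumulate each field separately so the center has one value per field.
        for i, value in enumerate(record):
            sums[label][i] += value
        counts[label] += 1
    return {
        label: tuple(total / counts[label] for total in field_sums)
        for label, field_sums in sums.items()
    }


# Made-up (X, Y) points with made-up cluster assignments.
points = [(1.0, 2.0), (3.0, 4.0), (10.0, 10.0), (12.0, 14.0)]
labels = ["a", "a", "b", "b"]
centers = cluster_centers(points, labels)
print(centers)
```

Each center has one value per field, so two-field (X, Y) records give two-value centers.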

Cluster (Kamanja)

When installing and configuring Kamanja, a cluster is a group of Kamanja nodes working together on the same task. Clusters typically do not overlap. A cluster consists of one or more nodes (engines).


DAG

Directed acyclic graph. The graph edges represent the directional flow of data. The graph nodes represent programs that operate on the data, such as models or rules. One or more edges can flow into a node and one or more edges can flow out of a node, but no cycles are possible. For example, the data would …
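As a rough sketch of the "no cycles" property, the following Python uses the standard-library `graphlib` module (Python 3.9+) to order the nodes of a small graph so that every node runs after the nodes feeding it, and to reject a graph that contains a cycle. The node names (`input`, `model_a`, `rule_b`, `model_c`) and the helper `execution_order` are hypothetical; this is not how Kamanja itself represents or schedules its graph.

```python
from graphlib import CycleError, TopologicalSorter


def execution_order(edges):
    """Order nodes so each one comes after every node that feeds it.

    edges: iterable of (source, target) pairs describing data flow.
    Raises ValueError if the edges contain a cycle (i.e. not a DAG).
    Hypothetical helper for illustration only.
    """
    predecessors = {}
    for source, target in edges:
        predecessors.setdefault(source, set())
        predecessors.setdefault(target, set()).add(source)
    try:
        return list(TopologicalSorter(predecessors).static_order())
    except CycleError as exc:
        raise ValueError("graph contains a cycle, so it is not a DAG") from exc


# Two edges flow out of "input" and two edges flow into "model_c".
edges = [
    ("input", "model_a"),
    ("input", "rule_b"),
    ("model_a", "model_c"),
    ("rule_b", "model_c"),
]
order = execution_order(edges)
print(order)
```

Adding an edge from `model_c` back to `input` would close a cycle, and `execution_order` would raise `ValueError` instead of returning an order.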


DMG

Data Mining Group.


DSL

Domain-specific language.


Engine

An instance of Kamanja (specifically the KamanjaManager). With respect to Kamanja, the terms node and engine are used interchangeably.


Graphviz

Graph Visualization Software.


Model Validation Process

Which Model Types Were Validated?

The model types listed below were taken from the diagram called “Kamanja Integration with Software and Algorithms” (updated 5/20/2016).

The following model types from PMML producers (vendors) were validated:

SAS (vendor), using PMML generated by their Enterprise Miner software: Decision Trees (CART, C5.0, Cubist) Neural Net Regression (linear, …