Cluster (Data Mining)

In data mining, cluster analysis is a type of unsupervised learning that groups together records with similar attributes (fields). A cluster center is the mean of the records in that cluster, computed separately for each field used to define the cluster. For example, if the data were the X, Y positions of points on a barn door, …
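The definition of a cluster center can be illustrated with a minimal sketch. It assumes the records have already been assigned to clusters; the function name `cluster_centers`, the sample points, and the labels are hypothetical and chosen only for illustration. The sketch computes the per-field mean for each cluster and does not perform the clustering itself.

```python
from collections import defaultdict


def cluster_centers(records, labels):
    """Return {label: tuple of per-field means} for pre-assigned clusters.

    records: sequence of equal-length numeric tuples (one value per field)
    labels:  sequence of cluster labels, one per record (hypothetical input)
    """
    sums = {}
    counts = defaultdict(int)
    for record, label in zip(records, labels):
        if label not in sums:
            sums[label] = [0.0] * len(record)
        # Accumulate each field separately.
        for i, value in enumerate(record):
            sums[label][i] += value
        counts[label] += 1
    # Divide each field total by the number of records in the cluster.
    return {
        label: tuple(total / counts[label] for total in field_totals)
        for label, field_totals in sums.items()
    }


# Illustrative X, Y points with made-up cluster assignments.
points = [(1.0, 2.0), (3.0, 4.0), (10.0, 10.0), (12.0, 14.0)]
labels = [0, 0, 1, 1]
centers = cluster_centers(points, labels)
```

With these sample values, cluster 0 has its center at (2.0, 3.0) and cluster 1 at (11.0, 12.0).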

Cluster (Kamanja)

In a Kamanja context, a cluster is a group of Kamanja nodes working together on the same task, as set up when installing and configuring Kamanja. Clusters typically do not overlap. A cluster is composed of one or more nodes (engines).

DAG

Directed acyclic graph. The graph edges represent the directional flow of data, and the graph nodes represent programs operating on the data, such as models or rules. One or more edges can flow into a node and one or more edges can flow out of it, but no cycles are possible. For example, the data would …
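The "no cycles" property can be checked with a short sketch. It uses Kahn's algorithm, a standard topological sort, which is an illustrative choice and not taken from Kamanja itself. The function name `topological_order` and the node names in `pipeline` are hypothetical. If every node can be placed in an order where data flows only forward, the graph is a DAG; otherwise a cycle exists.

```python
from collections import deque


def topological_order(edges):
    """Return the nodes in an order that respects edge direction.

    edges: iterable of (source, destination) pairs.
    Raises ValueError if the graph contains a cycle (not a DAG).
    """
    successors = {}
    indegree = {}
    for src, dst in edges:
        for node in (src, dst):
            successors.setdefault(node, [])
            indegree.setdefault(node, 0)
        successors[src].append(dst)
        indegree[dst] += 1

    # Start with nodes that have no incoming edges.
    ready = deque(sorted(n for n, d in indegree.items() if d == 0))
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for nxt in successors[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)

    # Any node left unplaced sits on a cycle or downstream of one.
    if len(order) != len(indegree):
        raise ValueError("graph contains a cycle, so it is not a DAG")
    return order


# Hypothetical pipeline: one input feeding a rule and a model,
# both of which feed a final model.
pipeline = [
    ("input", "rule_a"),
    ("input", "model_b"),
    ("rule_a", "model_c"),
    ("model_b", "model_c"),
]
order = topological_order(pipeline)
```

Here `order` starts with `input` and ends with `model_c`, while an edge list such as `[("a", "b"), ("b", "a")]` raises `ValueError`.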

DMG

Data Mining Group – www.dmg.org.

DSL

Domain-specific language.

Engine

An instance of Kamanja (specifically, the KamanjaManager). In Kamanja, the terms node and engine are used interchangeably.

GraphViz

Graph Visualization Software.

HMEQ-SAS doc

http://kamanja.org/wp-content/uploads/2016/07/HMEQ-SAS-doc.pdf

Model Validation Process

Which Model Types Were Validated?

The model types listed below were taken from the diagram called “Kamanja Integration with Software and Algorithms” (updated 5/20/2016). The following model types from PMML producers (vendors) were validated:

SAS (vendor), using PMML generated by their Enterprise Miner software:
Decision Trees (CART, C5.0, Cubist)
Neural Net
Regression (linear, …