PMML models: Remove unused columns from data before submitting to Kamanja

This topic contains 0 replies, has 1 voice, and was last updated by  Bill Bruns 1 year, 4 months ago.


    Bill Bruns
    Participant

    This applies only when the model is in PMML form.

    Often one or more data columns are not used in the model, so they are not mentioned in the PMML file. These columns (or fields) must be removed from the data before it is submitted to Kamanja; otherwise the processing will fail.

    For example, suppose the fields (or columns) are Age, Name, and Salary. A predictive model would likely not use the Name field, so the PMML file would not refer to it. The Message Definition for this message type should not mention that field either: both the Message Definition and the data itself should contain only the fields that the model uses. See also the report by Rich in the GitHub issues at https://github.com/LigaData/Kamanja/issues/989
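
    As a rough illustration of the Age, Name, Salary example, here is a minimal Python sketch of a preprocessing step that could run before the data reaches Kamanja. It is not part of Kamanja: the function names `pmml_field_names` and `keep_model_columns` and the sample PMML are made up for this post. It assumes the data is a CSV file with a header row and that the fields the model uses are the DataField entries declared in the PMML file. Whether your input should keep the header row depends on your own Message Definition and adapter setup.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical PMML fragment: only Age and Salary are declared, Name is not.
SAMPLE_PMML = """<PMML xmlns="http://www.dmg.org/PMML-4_2" version="4.2">
  <DataDictionary numberOfFields="2">
    <DataField name="Age" optype="continuous" dataType="integer"/>
    <DataField name="Salary" optype="continuous" dataType="double"/>
  </DataDictionary>
</PMML>"""


def pmml_field_names(pmml_text):
    """Return the names of the DataField elements declared in a PMML document."""
    root = ET.fromstring(pmml_text)
    names = []
    for elem in root.iter():
        # Strip any XML namespace so the check works across PMML versions.
        tag = elem.tag.rsplit("}", 1)[-1]
        if tag == "DataField":
            names.append(elem.attrib["name"])
    return names


def keep_model_columns(csv_text, wanted):
    """Drop every CSV column whose header is not in `wanted`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    kept = [name for name in reader.fieldnames if name in wanted]
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=kept, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        # extrasaction="ignore" silently skips the columns we are dropping.
        writer.writerow(row)
    return out.getvalue()


if __name__ == "__main__":
    raw = "Age,Name,Salary\n34,Alice,50000\n"
    fields = set(pmml_field_names(SAMPLE_PMML))
    print(keep_model_columns(raw, fields))
```

    The same list of field names can be used to check that the Message Definition mentions only those fields. Note that a real PMML file may also declare target or output fields in its DataDictionary, so review the list before relying on it.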

     
