A Classification Model for Water Quality Analysis Using Decision Tree
Abstract
A classification algorithm assigns predefined classes to test instances (for
evaluation) or to future instances (in an application). This study presents a classification
model based on decision trees for analyzing water quality data from different
counties in Kenya. Water quality is critical to ensuring that citizens have clean water to
drink. Applying decision trees as a data mining method to predict clean water from
water quality parameters can ease the work of laboratory technologists by predicting
which water samples should proceed to the next step of analysis. Secondary data from
the Kenya Water Institute were used to build the model, which was implemented
in the WEKA software. Decision tree classification was applied to predict whether water is
clean or not clean. Analysis of water alkalinity, pH level, and conductivity can play a
major role in assessing water quality. Five decision tree classifiers (J48, LMT,
Random Forest, Hoeffding Tree, and Decision Stump) were used to build the model, and
their accuracies were compared. J48 had the highest accuracy at 94%, and Decision Stump
had the lowest at 83%.
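
To make the simplest of the classifiers named above concrete, the following is a minimal sketch of a decision stump, that is, a one-level decision tree that splits on a single attribute and threshold. It is not the WEKA implementation used in the study. The sample values, the function names (fit_stump, predict, accuracy), and the units are all assumptions invented for illustration; they are not taken from the Kenya Water Institute data, and the accuracy computed here is on the toy training set only, so it is not comparable to the figures reported above.

```python
# Hypothetical water samples: (alkalinity, pH, conductivity) and a class label.
# All values are invented for illustration; they are NOT the study's data.
SAMPLES = [
    ((120.0, 7.2, 250.0), "clean"),
    ((95.0, 7.0, 180.0), "clean"),
    ((140.0, 7.6, 300.0), "clean"),
    ((110.0, 6.9, 220.0), "clean"),
    ((310.0, 9.1, 1200.0), "not clean"),
    ((280.0, 5.2, 950.0), "not clean"),
    ((350.0, 8.9, 1500.0), "not clean"),
    ((90.0, 4.8, 800.0), "not clean"),
]
FEATURES = ["alkalinity", "pH", "conductivity"]
LABELS = ("clean", "not clean")


def fit_stump(samples):
    """Return (correct, feature_index, threshold, left_label, right_label).

    Exhaustively tries every feature, every midpoint between consecutive
    distinct values, and both label orientations, and keeps the split with
    the most correctly classified training samples.
    """
    best = None
    for f in range(len(FEATURES)):
        values = sorted({x[f] for x, _ in samples})
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            for left, right in (LABELS, LABELS[::-1]):
                correct = sum(
                    (left if x[f] <= threshold else right) == y
                    for x, y in samples
                )
                if best is None or correct > best[0]:
                    best = (correct, f, threshold, left, right)
    return best


def predict(stump, x):
    """Classify one sample with a fitted stump."""
    _, f, threshold, left, right = stump
    return left if x[f] <= threshold else right


def accuracy(stump, samples):
    """Fraction of samples the stump classifies correctly."""
    return sum(predict(stump, x) == y for x, y in samples) / len(samples)


stump = fit_stump(SAMPLES)
print("split on", FEATURES[stump[1]], "at", stump[2])
print("training accuracy:", accuracy(stump, SAMPLES))
```

A deeper tree such as J48 applies the same kind of split recursively to each resulting subset, which is one reason a single-split stump can be expected to score lower on data that need more than one attribute to separate the classes.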