A Novel Data Mining Approach for Health Care Applications

S Jaganathan, V Krishnaveni

Abstract


In the health care industry, diagnosing a disease typically requires the patient to undergo a number of tests. As the number of tests increases, the dimensionality of the medical dataset increases and, as a consequence, so does the complexity of the prediction process. Data mining now plays a crucial role in analyzing high-dimensional data, and hence in disease prediction. The health care industry uses data mining techniques to uncover information hidden in its datasets. In this work, classification-based data mining methods, namely C4.5, the RIPPER (Repeated Incremental Pruning to Produce Error Reduction) algorithm, and Artificial Neural Networks (ANN), are applied to healthcare datasets and their performance is examined. Their limitations are then identified, and the proposed work improves their performance using several heuristics.

In the proposed work, an Enhanced RIPPER algorithm (En Ripper) for fast, effective rule induction is proposed, and it is shown to obtain lower error rates than the existing RIPPER. The Enhanced RIPPER algorithm has three components: an alternative pruning phase, new stopping heuristics for rule adding, and post-pruning of the whole rule set. An Enhanced ANN algorithm (En ANN) is proposed as the second part of this work to minimize the global error rate, and the error is shown to evolve to small values during training. In the Enhanced ANN algorithm, when the error does not decrease by more than one percent of its previous value, a new hidden unit is added and the connection weights are randomly reinitialized over the previously defined interval. This process is repeated until the network converges to an acceptable global error value. The performance of the existing systems and of the proposed algorithms has been analysed and compared on three healthcare datasets. Performance is measured in terms of true positives, true negatives, false positives, false negatives, and the overall accuracy of the system. The experimental results show that the error rate of the proposed systems is lower than that of the existing methods.
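The growth rule described for the Enhanced ANN can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `stalled`, `init_weights`, `grow_network` and `train_epoch`, the default interval (-0.5, 0.5), the single hidden layer, and the choice to reinitialize all weights (rather than only those of the new unit) are hypothetical, since the abstract does not specify them. The training step itself is left to a caller-supplied function.

```python
import random
from typing import Callable, List, Optional, Tuple

Weights = Tuple[List[List[float]], List[float]]  # (hidden-layer rows, output row)


def stalled(prev_error: float, error: float, min_rel_drop: float = 0.01) -> bool:
    """True when the error fell by no more than 1% of its previous value."""
    return (prev_error - error) <= min_rel_drop * prev_error


def init_weights(n_inputs: int, n_hidden: int,
                 interval: Tuple[float, float], rng: random.Random) -> Weights:
    """Draw every connection weight uniformly from the given interval."""
    low, high = interval
    # One row per hidden unit: n_inputs weights plus a bias.
    hidden = [[rng.uniform(low, high) for _ in range(n_inputs + 1)]
              for _ in range(n_hidden)]
    # Output unit: one weight per hidden unit plus a bias.
    output = [rng.uniform(low, high) for _ in range(n_hidden + 1)]
    return hidden, output


def grow_network(train_epoch: Callable[[Weights], Tuple[Weights, float]],
                 n_inputs: int, target_error: float,
                 interval: Tuple[float, float] = (-0.5, 0.5),  # assumed interval
                 max_hidden: int = 20, max_epochs: int = 10_000,
                 seed: int = 0) -> Tuple[int, float]:
    """Train, adding a hidden unit whenever the error stops improving.

    train_epoch is a hypothetical callable that runs one training pass and
    returns the updated weights and the resulting global error.
    """
    rng = random.Random(seed)
    n_hidden = 1
    weights = init_weights(n_inputs, n_hidden, interval, rng)
    prev_error: Optional[float] = None
    error = float("inf")
    for _ in range(max_epochs):
        weights, error = train_epoch(weights)
        if error <= target_error:
            break  # acceptable global error reached
        if (prev_error is not None and stalled(prev_error, error)
                and n_hidden < max_hidden):
            # Assumption: all weights are redrawn, not only the new unit's.
            n_hidden += 1
            weights = init_weights(n_inputs, n_hidden, interval, rng)
            prev_error = None  # restart the comparison for the larger network
        else:
            prev_error = error
    return n_hidden, error
```

In practice `train_epoch` would wrap one pass of backpropagation over the training set; the `max_hidden` and `max_epochs` caps are added here only so the loop is guaranteed to terminate.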

 






Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.


ISSN 2279-0381 | Copyright IST 2012-13
