Validating Synthetic Health Datasets for Longitudinal Clustering

Date
2013
Authors
Ghassem Pour, S
Maeder, Anthony
Jorm, L
Rights
Copyright (C) 2013, Australian Computer Society, Inc.
Rights Holder
Australian Computer Society, Inc
Abstract
Clustering methods partition datasets into subgroups with some homogeneous properties, with information about the number and particular characteristics of each subgroup unknown a priori. The problem of predicting the number of clusters and quality of each cluster might be overcome by using cluster validation methods. This paper presents such an approach incorporating quantitative methods for comparison between original and synthetic versions of longitudinal health datasets. The use of the methods is demonstrated by using two different clustering algorithms, K-means and Latent Class Analysis, to perform clustering on synthetic data derived from the 45 and Up Study baseline data, from NSW in Australia.
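As a rough illustration of the kind of comparison the abstract describes, the sketch below clusters an "original" and a "synthetic" dataset with a small K-means routine and compares the resulting cluster profiles. It is not the authors' procedure: the function names (`kmeans`, `cluster_profile`, `compare_profiles`, `toy_blobs`), the two comparison measures, and the toy two-dimensional data are all assumptions made for illustration, and Latent Class Analysis is not covered.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))


def kmeans(points, k, max_iter=100):
    """Plain Lloyd's K-means with deterministic farthest-point initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from its nearest existing centroid.
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:  # assignments stable, so converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = mean_point(members)
    return centroids, labels


def cluster_profile(points, k):
    """Sorted cluster proportions and mean within-cluster squared distance."""
    centroids, labels = kmeans(points, k)
    n = len(points)
    proportions = sorted(labels.count(j) / n for j in range(k))
    wcss = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels)) / n
    return proportions, wcss


def compare_profiles(original, synthetic, k):
    """Hypothetical agreement measures between two clusterings."""
    prop_o, wcss_o = cluster_profile(original, k)
    prop_s, wcss_s = cluster_profile(synthetic, k)
    return {
        "max_proportion_gap": max(abs(a - b) for a, b in zip(prop_o, prop_s)),
        "wcss_ratio": wcss_s / wcss_o,
    }


def toy_blobs(seed, sizes=(60, 40), centres=((0.0, 0.0), (10.0, 10.0)), sd=0.5):
    """Toy stand-in data (an assumption; not the 45 and Up Study data)."""
    rng = random.Random(seed)
    return [
        (rng.gauss(cx, sd), rng.gauss(cy, sd))
        for n, (cx, cy) in zip(sizes, centres)
        for _ in range(n)
    ]


if __name__ == "__main__":
    original = toy_blobs(seed=1)
    synthetic = toy_blobs(seed=2)
    print(compare_profiles(original, synthetic, k=2))
```

In this sketch, a small proportion gap and a within-cluster dispersion ratio near 1 would suggest that the synthetic data reproduce the cluster structure of the original; what counts as "small" or "near" is a judgment call that the abstract does not specify.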
Description
This paper appeared at the Australasian Workshop on Health Informatics and Knowledge Management (HIKM 2013), Adelaide, Australia. Conferences in Research and Practice in Information Technology (CRPIT), Vol. 142. K. Gray and A. Koronios, Eds. Reproduction for academic, not-for-profit purposes permitted provided this text is included.
Keywords
Cluster analysis, Longitudinal synthetic data, Cluster validation
Citation
Ghassem Pour, S., Maeder, A., & Jorm, L. (2013, January). Validating synthetic health datasets for longitudinal clustering. In Proceedings of the Sixth Australasian Workshop on Health Informatics and Knowledge Management, Volume 142 (pp. 15-19). Australian Computer Society, Inc.