Many clustering algorithms proposed in recent years can identify clusters of arbitrary shape, varying density, and varying size. This creates a need for benchmark datasets that can evaluate clustering algorithms on aspects such as scalability, accuracy, and robustness to noise. Real-life datasets are few in number and do not come with the "original" clustering results. This underscores the need for a toolkit that generates datasets that mimic real-life data and supplies the actual clustering results along with them. In this paper, we propose several algorithms and methodologies that generate high-dimensional datasets together with their original clustering results. We have developed a toolkit, SynDECA, that generates synthetic datasets based on the proposed algorithms.