Cloud-Scale Genomic Signals Processing for Robust Large-Scale Cancer Genomic Microarray Data Analysis

Benjamin Simeon Harvey, Soo Yeon Ji

Research output: Contribution to journalArticlepeer-review

7 Scopus citations


As microarray data available to scientists continues to increase in size and complexity, it has become overwhelmingly important to find multiple ways to bring forth oncological inference to the bioinformatics community through the analysis of large-scale cancer genomic (LSCG) DNA and mRNA microarray data that is useful to scientists. Though there have been many attempts to elucidate the issue of bringing forth biological interpretation by means of wavelet preprocessing and classification, there has not been a research effort that focuses on a cloud-scale distributed parallel (CSDP) separable 1-D wavelet decomposition technique for denoising through differential expression thresholding and classification of LSCG microarray data. This research presents a novel methodology that utilizes a CSDP separable 1-D method for wavelet-based transformation in order to initialize a threshold which will retain significantly expressed genes through the denoising process for robust classification of cancer patients. Additionally, the overall study was implemented and encompassed within CSDP environment. The utilization of cloud computing and wavelet-based thresholding for denoising was used for the classification of samples within the Global Cancer Map, Cancer Cell Line Encyclopedia, and The Cancer Genome Atlas. The results proved that separable 1-D parallel distributed wavelet denoising in the cloud and differential expression thresholding increased the computational performance and enabled the generation of higher quality LSCG microarray datasets, which led to more accurate classification results.

Original languageEnglish (US)
Pages (from-to)238-245
Number of pages8
JournalIEEE Journal of Biomedical and Health Informatics
Issue number1
StatePublished - Jan 2017
Externally publishedYes


  • Cancer
  • cloud
  • decomposition
  • distributed
  • genomics
  • parallelization
  • processing
  • scale
  • thresholding signal

ASJC Scopus subject areas

  • Biotechnology
  • Computer Science Applications
  • Electrical and Electronic Engineering
  • Health Information Management


Dive into the research topics of 'Cloud-Scale Genomic Signals Processing for Robust Large-Scale Cancer Genomic Microarray Data Analysis'. Together they form a unique fingerprint.

Cite this