Data-intensive spatial filtering in large numerical simulation datasets

Kalin Kanov, Randal Burns, Greg Eyink, Charles Meneveau, Alexander Szalay

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

We present a query processing framework for the efficient evaluation of spatial filters on large numerical simulation datasets stored in a data-intensive cluster. Previously, filtering of large numerical simulations stored in scientific databases has been impractical owing to the immense data requirements. Rather, filtering is done during simulation or by loading snapshots into the aggregate memory of an HPC cluster. Our system performs filtering within the database and supports large filter widths. We present two complementary methods of execution: I/O streaming computes a batch filter query in a single sequential pass using incremental evaluation of decomposable kernels, summed volumes generates an intermediate data set and evaluates each filtered value by accessing only eight points in this dataset. We dynamically choose between these methods depending upon workload characteristics. The system allows us to perform filters against large data sets with little overhead: query performance scales with the cluster's aggregate I/O throughput.

Original languageEnglish (US)
Title of host publication2012 International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2012
DOIs
StatePublished - 2012
Event2012 24th International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2012 - Salt Lake City, UT, United States
Duration: Nov 10 2012Nov 16 2012

Publication series

NameInternational Conference for High Performance Computing, Networking, Storage and Analysis, SC
ISSN (Print)2167-4329
ISSN (Electronic)2167-4337

Other

Other2012 24th International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2012
Country/TerritoryUnited States
CitySalt Lake City, UT
Period11/10/1211/16/12

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Computer Science Applications
  • Hardware and Architecture
  • Software

Fingerprint

Dive into the research topics of 'Data-intensive spatial filtering in large numerical simulation datasets'. Together they form a unique fingerprint.

Cite this