Probabilistic Machine Learning with Low-Cost Sensor Networks for Occupational Exposure Assessment and Industrial Hygiene Decision Making

Andrew N. Patton, Konstantin Medvedovsky, Christopher Zuidema, Thomas M. Peters, Kirsten Koehler

Research output: Contribution to journal › Article › peer-review


Occupational exposure assessments are dominated by small sample sizes and low spatial and temporal resolution with a focus on conducting Occupational Safety and Health Administration regulatory compliance sampling. However, this style of exposure assessment is likely to underestimate true exposures and their variability in sampled areas, and entirely fail to characterize exposures in unsampled areas. The American Industrial Hygiene Association (AIHA) has developed a more realistic system of exposure ratings based on estimating the 95th percentiles of the exposures that can be used to better represent exposure uncertainty and exposure variability for decision-making; however, the ratings can still fail to capture realistic exposure with small sample sizes. Therefore, low-cost sensor networks consisting of numerous lower-quality sensors have been used to measure occupational exposures at a high spatiotemporal scale. However, the sensors must be calibrated in the laboratory or field to a reference standard. Using data from carbon monoxide (CO) sensors deployed in a heavy equipment manufacturing facility for eight months from August 2017 to March 2018, we demonstrate that machine learning with probabilistic gradient boosted decision trees (GBDT) can model raw sensor readings to reference data highly accurately, entirely removing the need for laboratory calibration. Further, we indicate how the machine learning models can produce probabilistic hazard maps of the manufacturing floor, creating a visual tool for assessing facility-wide exposures. Additionally, the ability to have a fully modeled prediction distribution for each measurement enables the use of the AIHA exposure ratings, which provide an enhanced industrial decision-making framework as opposed to simply determining if a small number of measurements were above or below a pertinent occupational exposure limit. 
Lastly, we show how a probabilistic modeling exposure assessment with high spatiotemporal resolution data can prevent exposure misclassifications associated with traditional models that rely exclusively on mean or point predictions.
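The last two points of the abstract (rating exposures from a full prediction distribution, and avoiding the misclassification that comes from point predictions) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a normal predictive distribution for a single calibrated CO reading, which the abstract does not specify. The category cut points (1%, 10%, 50%, and 100% of the occupational exposure limit, applied to the 95th percentile) follow the commonly cited AIHA scheme, and the handling of values exactly on a boundary is an assumption. The function names, the limit of 25 ppm, and the mean and standard deviation are hypothetical values chosen only for illustration.

```python
from statistics import NormalDist

# Assumed AIHA-style cut points: upper bound (fraction of the OEL) -> category.
# Anything at or above 100% of the OEL falls into category 4.
AIHA_CUTS = [(0.01, 0), (0.10, 1), (0.50, 2), (1.00, 3)]


def aiha_category(value_ppm, oel_ppm):
    """Map a concentration to an exposure category relative to the OEL."""
    ratio = value_ppm / oel_ppm
    for upper, category in AIHA_CUTS:
        if ratio < upper:
            return category
    return 4


def rate_prediction(mean_ppm, sd_ppm, oel_ppm):
    """Rate one probabilistic prediction, assuming it is normally distributed."""
    dist = NormalDist(mu=mean_ppm, sigma=sd_ppm)
    x95 = dist.inv_cdf(0.95)  # 95th percentile of the predictive distribution
    return {
        "x95": x95,
        "category_from_x95": aiha_category(x95, oel_ppm),
        "category_from_mean": aiha_category(mean_ppm, oel_ppm),
        "prob_over_oel": 1.0 - dist.cdf(oel_ppm),
    }


# Hypothetical prediction: mean 20 ppm, standard deviation 5 ppm, OEL 25 ppm.
result = rate_prediction(mean_ppm=20.0, sd_ppm=5.0, oel_ppm=25.0)
print(result)
```

With these illustrative numbers the mean alone stays below the limit, while the 95th percentile exceeds it, so a rating based only on the point prediction lands in a lower category than one based on the full distribution.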

Original language: English (US)
Pages (from-to): 580-590
Number of pages: 11
Journal: Annals of Work Exposures and Health
Issue number: 5
State: Published - Jun 1 2022


Keywords

  • carbon monoxide
  • exposure assessment
  • machine learning
  • occupational exposure assessment
  • sensor networks

ASJC Scopus subject areas

  • General Medicine

