Dense Depth Estimation in Monocular Endoscopy with Self-Supervised Learning Methods

Xingtong Liu, Ayushi Sinha, Masaru Ishii, Gregory D. Hager, Austin Reiter, Russell H. Taylor, Mathias Unberath

Research output: Contribution to journal › Article › peer-review

13 Scopus citations


We present a self-supervised approach to training convolutional neural networks for dense depth estimation from monocular endoscopy data without a priori modeling of anatomy or shading. Our method requires only monocular endoscopic videos and a multi-view stereo method, e.g., structure from motion, to supervise learning in a sparse manner. Consequently, our method requires neither manual labeling nor a patient computed tomography (CT) scan in the training and application phases. In a cross-patient experiment using CT scans as ground truth, the proposed method achieved submillimeter mean residual error. In a comparison study with recent self-supervised depth estimation methods designed for natural video, evaluated on in vivo sinus endoscopy data, we demonstrate that the proposed approach outperforms the previous methods by a large margin. The source code for this work is publicly available online at
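The idea of supervising a dense prediction "in a sparse manner" can be illustrated with a small masked loss. The sketch below is not the authors' loss function; it is a minimal illustration under stated assumptions: the name `sparse_depth_loss`, the least-squares scale alignment, and the relative-error form are all hypothetical choices made here. It assumes that a structure-from-motion reconstruction has been projected into the image to give depth values at a few pixels, marked by a boolean mask, and that these depths are known only up to a global scale.

```python
import numpy as np


def sparse_depth_loss(pred_depth, sparse_depth, mask, eps=1e-8):
    """Illustrative masked depth loss (hypothetical, not the paper's loss).

    pred_depth   : dense depth map predicted by a network, shape (H, W)
    sparse_depth : depths from a sparse reconstruction, valid where mask is True
    mask         : boolean array, True at pixels that have a sparse depth value
    """
    pred_depth = np.asarray(pred_depth, dtype=np.float64)
    sparse_depth = np.asarray(sparse_depth, dtype=np.float64)
    mask = np.asarray(mask, dtype=bool)

    # With no sparse points there is nothing to supervise.
    if not mask.any():
        return 0.0

    p = pred_depth[mask]
    s = sparse_depth[mask]

    # Assumption: the sparse depths are defined only up to a global scale,
    # so fit the single scalar that best maps the prediction onto them
    # in the least-squares sense before comparing.
    scale = np.dot(p, s) / (np.dot(p, p) + eps)

    # Mean relative error, evaluated only at the supervised pixels.
    return float(np.mean(np.abs(scale * p - s) / (s + eps)))
```

Because the error is evaluated only where the mask is set, pixels without a reconstructed point contribute nothing to the loss, and a prediction that differs from the sparse depths by a constant factor is not penalized.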

Original language: English (US)
Article number: 8889760
Pages (from-to): 1438-1447
Number of pages: 10
Journal: IEEE Transactions on Medical Imaging
Issue number: 5
State: Published - May 2020


  • Endoscopy
  • depth estimation
  • self-supervised learning
  • unsupervised learning

ASJC Scopus subject areas

  • Software
  • Radiological and Ultrasound Technology
  • Computer Science Applications
  • Electrical and Electronic Engineering

