Understanding the deformation of the tongue during human speech is important for head and neck surgeons and for speech and language scientists. Tagged magnetic resonance (MR) imaging can be used to image 2D motion, and data from multiple image planes can be combined via post-processing to yield estimates of 3D motion. However, because this approach lacks boundary information, its estimates are inaccurate near the tongue surface. This paper describes a method that combines two sources of information to yield improved estimation of 3D tongue motion. The method uses the harmonic phase (HARP) algorithm to extract motion from tags and diffeomorphic demons to provide surface deformation. It then uses an incompressible deformation estimation algorithm to incorporate both sources of displacement information into an estimate of 3D whole-tongue motion. Experimental results show that using the combined information improves motion estimation near the tongue surface, a previously reported weakness of HARP analysis, while preserving accurate internal motion estimates. Results on both normal and abnormal tongue motions are shown.