When computing visual saliency on natural scenes, many current models do not consider temporal information present in the visual stimuli; most are designed to predict salient regions of static images only. However, the world is dynamic and constantly changing, and motion is a naturally occurring phenomenon that plays an essential role in both human and computer visual processing. Hence, an effective model of visual saliency should account for motion exhibited within the visual scene. In this paper, we investigate the most advantageous and biologically plausible manner in which to incorporate motion into our current model of proto-object based visual saliency. We examine both the type of motion that should be extracted in such a bottom-up, feed-forward model and the stage at which motion should be incorporated into the model. Two final approaches are proposed and compared by how well each predicts human eye saccades on a set of videos from the Itti dataset, with each validated using the KL divergence metric. We conclude by selecting the model that better predicts saccades across the videos in the dataset. Our results also give general insight into how motion should be integrated into a visual saliency model.