Perception of dynamic scenes in our environment results from the evaluation of visual features such as the fundamental spatial and temporal frequency components of a moving object. The ratio of temporal to spatial frequency gives the object's speed of motion. The human middle temporal complex (hMT+) plays a crucial role in the direct encoding of object speed. However, the link between hMT+ speed encoding and the spatiotemporal frequency components of a moving object remains underexplored. Here, we recorded high-resolution 7T blood oxygen level-dependent (BOLD) responses to different visual motion stimuli as a function of their fundamental spatial and temporal frequency components. We fitted each hMT+ BOLD response with a 2D Gaussian model that allows for two different speed encoding mechanisms: (1) distinct and independent selectivity for the spatial and temporal frequencies of the visual motion stimuli; (2) pure tuning for the speed of motion. We show that both mechanisms occur, but in different neuronal groups within hMT+, with the largest subregion of the complex showing separable tuning for the spatial and temporal frequency of the visual stimuli. Both mechanisms were highly reproducible within participants, reconciling single-cell recordings from MT in animals, which have shown both encoding mechanisms. Our findings confirm that the perception of speed involves a more complex process than initially thought and suggest that hMT+ plays a primary role in the evaluation of the spatial features of the moving visual input.
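As an illustration of how a single 2D Gaussian can accommodate the two mechanisms described above, the sketch below uses a Gaussian in log2 spatial and temporal frequency whose preferred temporal frequency may shift with spatial frequency. The abstract does not specify the model's parameterisation, so everything here is an assumption made for illustration: the function names, the log2 axes, and the exponent `q` (0 for separable tuning, 1 for a tuning ridge that follows a constant preferred speed) are hypothetical and may differ from the fitted model.

```python
import math


def preferred_tf(sf, sf0, tf0, q):
    """Preferred temporal frequency (Hz) at spatial frequency sf (cycles/deg).

    Assumed form: q = 0 keeps the preferred temporal frequency fixed at tf0
    (separable tuning); q = 1 scales it with sf so that the preferred speed
    tf/sf stays equal to tf0/sf0 (speed tuning).
    """
    return tf0 * (sf / sf0) ** q


def gaussian_2d_response(sf, tf, sf0, tf0, sigma_sf, sigma_tf, q, amplitude=1.0):
    """Hypothetical 2D Gaussian tuning surface in log2 frequency space."""
    # Distance from the preferred spatial frequency, in octaves.
    d_sf = math.log2(sf) - math.log2(sf0)
    # Distance from the (possibly sf-dependent) preferred temporal frequency.
    d_tf = math.log2(tf) - math.log2(preferred_tf(sf, sf0, tf0, q))
    return (
        amplitude
        * math.exp(-(d_sf ** 2) / (2 * sigma_sf ** 2))
        * math.exp(-(d_tf ** 2) / (2 * sigma_tf ** 2))
    )


if __name__ == "__main__":
    # Illustrative values only: peak at 1 cycle/deg and 4 Hz (speed 4 deg/s).
    params = dict(sf0=1.0, tf0=4.0, sigma_sf=1.0, sigma_tf=1.0)
    for q, label in [(0.0, "separable"), (1.0, "speed-tuned")]:
        # Probe one octave above the preferred spatial frequency.
        same_tf = gaussian_2d_response(2.0, 4.0, q=q, **params)
        same_speed = gaussian_2d_response(2.0, 8.0, q=q, **params)
        print(label, "same TF:", same_tf, "same speed:", same_speed)
```

Under these assumptions, a separable response at a higher spatial frequency is strongest when the temporal frequency is unchanged, whereas a speed-tuned response is strongest when the temporal frequency rises in proportion, keeping the ratio tf/sf constant. Intermediate values of `q` would describe partial speed tuning.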