How humans judge the duration of multisensory stimuli is a matter of ongoing debate. Here, we investigated how sub-second duration judgments are achieved by asking participants to compare the duration of a continuous sound to the duration of an empty interval whose onset and offset were marked by signals of different modalities, using all combinations of visual, auditory and tactile stimuli. The pattern of perceived durations across five stimulus durations (ranging from 100 ms to 900 ms) follows Vierordt's law. Furthermore, intervals with a sound as onset (audio-visual, audio-tactile) are perceived as longer than intervals with a sound as offset. No modality-order effect is found for visual-tactile intervals. To infer whether a single modality-independent or multiple modality-dependent time-keeping mechanisms exist, we tested whether perceived duration follows a summative or a multiplicative distortion pattern by fitting a model to all modality combinations and durations. The results confirm that perceived duration depends on sensory latency (summative distortion). In contrast, we did not find evidence for multiplicative distortions. Together, the model results and the behavioural data support the concept of a single time-keeping mechanism that allows for judgments of durations marked by multisensory stimuli.
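
The two distortion patterns contrasted above can be written schematically as follows. This is only an illustrative sketch: the symbols and the exact functional forms are assumptions introduced here for clarity, not the parameterisation of the fitted model.

```latex
% Illustrative notation only; all symbols are assumptions, not the fitted model.
% D                 : physical duration of the empty interval
% \hat{D}           : perceived duration
% \ell_{\mathrm{on}}  : sensory latency of the onset-marker modality
% \ell_{\mathrm{off}} : sensory latency of the offset-marker modality
% k_{m}             : hypothetical modality-dependent gain
\begin{align}
  \text{summative:}      \quad & \hat{D} = D + \ell_{\mathrm{off}} - \ell_{\mathrm{on}} \\
  \text{multiplicative:} \quad & \hat{D} = k_{m}\, D
\end{align}
```

In this sketch, a summative distortion shifts perceived duration by the same amount at every stimulus duration, whereas a multiplicative distortion produces a shift that grows in proportion to $D$. Comparing the size of the modality effect across the five durations is therefore what would distinguish the two patterns.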