Caption Style
Karaoke Captions
Karaoke captions show all words at once and highlight each word as it's spoken — the most readable caption style for educational content, podcasts, and lyric videos.
What Are Karaoke Captions?
Karaoke captions display the entire sentence or phrase on screen from the start, then highlight each word individually as it's spoken. Unlike the Build or Pop categories, where words appear and disappear, Karaoke keeps all text visible at all times; only the styling of the active word changes. This creates a reading experience similar to a teleprompter or lyric video, where viewers can read ahead, follow along at their own pace, or glance away and easily find their place again. The highlighting is driven by word-level timestamps from Whisper AI, which keeps the visual highlight closely synced to the spoken audio. VideoCaptions.AI offers four highlight sub-styles: Scale (the active word grows slightly), Background (a colored block appears behind the active word), Bounce (the active word bounces vertically), and ColorChange (the active word switches to your chosen highlight color). Each sub-style can be combined with a custom highlight color for full creative control.
How It Works
Karaoke captions work differently from other categories because they don't use entrance/exit animations. All words in a page are rendered from frame zero with their default styling. The computeHighlightValues() function runs every frame, comparing the current playback frame against each word's start time (wg.from) to determine which word is 'active.' The active word receives the selected highlight sub-style transformation — scale, background, bounce, or colorChange — while all other words retain their default appearance. This is implemented as a per-frame calculation, not a CSS transition, because Remotion renders frames independently (a frame at position 100 must look identical whether you seek there or play there). The highlight color is stored in the page's metadata (pages[pageTag].highlightColor) and the sub-style in pages[pageTag].highlightSubStyle.
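The per-frame lookup described above can be sketched roughly as follows. This is a minimal illustration, not the actual computeHighlightValues() implementation: the function names, the `WordGroup` and `PageMeta` shapes, and the sample values are assumptions. Only `from`, `highlightColor`, and `highlightSubStyle` come from the description above. It also assumes words are sorted by start frame and that a word stays active until the next one starts.

```typescript
// Hypothetical shapes for illustration; only `from`, `highlightColor`,
// and `highlightSubStyle` are named in the text above.
type HighlightSubStyle = "scale" | "background" | "bounce" | "colorChange";

interface WordGroup {
  text: string;
  from: number; // start frame of the word
}

interface PageMeta {
  highlightColor: string;
  highlightSubStyle: HighlightSubStyle;
}

interface WordHighlight {
  text: string;
  active: boolean;
  color: string | null;
  subStyle: HighlightSubStyle | null;
}

// Index of the last word whose start frame has been reached,
// or -1 if the first word has not started yet.
// Assumes `words` is sorted by `from`.
function findActiveIndex(words: WordGroup[], frame: number): number {
  let active = -1;
  words.forEach((wg, i) => {
    if (wg.from <= frame) active = i;
  });
  return active;
}

// Pure function of (words, page, frame): the same frame always yields the
// same result, whether it is reached by seeking or by playing.
function computeHighlightSketch(
  words: WordGroup[],
  page: PageMeta,
  frame: number
): WordHighlight[] {
  const activeIndex = findActiveIndex(words, frame);
  return words.map((wg, i) => {
    const active = i === activeIndex;
    return {
      text: wg.text,
      active,
      color: active ? page.highlightColor : null,
      subStyle: active ? page.highlightSubStyle : null,
    };
  });
}

// Stand-in for pages[pageTag]; the values are made up.
const page1: PageMeta = { highlightColor: "#FFD400", highlightSubStyle: "colorChange" };
const sampleWords: WordGroup[] = [
  { text: "Karaoke", from: 0 },
  { text: "keeps", from: 12 },
  { text: "text", from: 20 },
  { text: "visible", from: 31 },
];

// At frame 25 the word that started at frame 20 is the active one.
console.log(computeHighlightSketch(sampleWords, page1, 25).filter((w) => w.active));
```

Because the result depends only on the inputs and the frame number, no state carries over between frames. This is the property that independent frame rendering relies on.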
Best For
- Podcast clips and interview highlights where viewers follow extended dialogue
- Educational and tutorial videos where comprehension benefits from seeing full sentences
- Lyric videos and music content where timing is everything
- Accessibility-focused content for hearing-impaired audiences
- Long-form talking-head content on YouTube and LinkedIn
Best Platforms for Karaoke Captions
YouTube
YouTube's longer format benefits from Karaoke's readability — viewers can follow multi-sentence explanations without text appearing and disappearing.
LinkedIn
The professional, clean look of Karaoke captions matches LinkedIn's aesthetic. All text visible at once feels structured and organized, like presentation slides.
Podcast apps
Podcast clip videos shared on social media work perfectly with Karaoke — listeners-turned-viewers can follow the conversation naturally with highlighted words tracking the speaker.
Karaoke Captions vs. Traditional Subtitles
Traditional subtitles show one or two lines of text that change every few seconds. Karaoke captions take this further by keeping the text stable and moving a highlight through the words. This distinction matters because stable text is significantly easier to read than text that constantly changes position and content. Research on reading comprehension shows that when text moves or changes, the reader's eye must re-orient — a process that takes 200-300 milliseconds per change. Karaoke eliminates this re-orientation cost because the text never moves. The viewer's eye simply follows the highlight, which is a much simpler tracking task. This makes Karaoke the best category for content where comprehension matters more than visual excitement: educational content, tutorials, podcast clips, and professional presentations. It's also the most accessible option for viewers with dyslexia or reading difficulties, as the stable text and moving highlight provide a consistent reading anchor.
Customizing Karaoke Highlight Styles
VideoCaptions.AI offers four karaoke highlight sub-styles, each creating a different visual effect. The Scale sub-style enlarges the active word slightly (typically 110-120% of normal size), creating a gentle 'pulse' that draws the eye without disrupting the text layout. Background adds a colored block behind the active word — effective for high-contrast highlighting and particularly popular for lyric videos. Bounce makes the active word jump vertically for a playful, musical feel that works well on TikTok and Instagram. ColorChange swaps the active word to a different color (your chosen highlightColor), which is the subtlest option and works well for professional or educational content. You can combine any sub-style with any highlight color. For maximum readability, pair Scale with a bright highlight color. For the most visually engaging look, try Bounce with a contrasting neon color and glow enabled. Each sub-style is computed per-frame by the Remotion renderer, ensuring perfect timing in both the live preview and the exported MP4.
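As a rough illustration of how the four sub-styles could each map to per-frame style values, here is a small sketch. The function name, the `ActiveWordStyle` shape, the 1.15 scale factor (picked from the 110-120% range mentioned above), and the bounce height and duration are all assumptions made for the example, not the product's actual values.

```typescript
type HighlightSubStyle = "scale" | "background" | "bounce" | "colorChange";

// Hypothetical style values for the active word; a renderer would
// translate these into CSS properties.
interface ActiveWordStyle {
  scale: number; // 1 = normal size
  translateY: number; // vertical offset in pixels, negative = upward
  color: string | null; // text color override, if any
  backgroundColor: string | null; // block behind the word, if any
}

const BASE_STYLE: ActiveWordStyle = {
  scale: 1,
  translateY: 0,
  color: null,
  backgroundColor: null,
};

// Computes the active word's style from the sub-style, the highlight color,
// and how many frames have passed since the word started. There is no stored
// animation state, so any frame can be rendered on its own.
function activeWordStyle(
  subStyle: HighlightSubStyle,
  highlightColor: string,
  framesSinceWordStart: number
): ActiveWordStyle {
  switch (subStyle) {
    case "scale":
      return { ...BASE_STYLE, scale: 1.15 }; // assumed value within 110-120%
    case "background":
      return { ...BASE_STYLE, backgroundColor: highlightColor };
    case "bounce": {
      const bounceFrames = 10; // assumed bounce duration in frames
      const bounceHeight = 8; // assumed peak height in pixels
      const t = Math.min(Math.max(framesSinceWordStart / bounceFrames, 0), 1);
      // Half sine wave: rises, peaks at the midpoint, returns to baseline.
      return { ...BASE_STYLE, translateY: -bounceHeight * Math.sin(Math.PI * t) };
    }
    case "colorChange":
      return { ...BASE_STYLE, color: highlightColor };
  }
}

// Example: the Bounce sub-style five frames into the active word.
console.log(activeWordStyle("bounce", "#39FF14", 5));
```

Deriving the bounce offset from the number of frames since the word started, rather than from a running animation, is what lets the preview and the exported video agree on every frame.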
Frequently Asked Questions
Everything you need to know before you start.
How accurate is the karaoke timing?
Karaoke timing is driven by Whisper AI's word-level timestamps, which are typically accurate to within 50-100 milliseconds. This is precise enough that the highlight feels naturally synced to speech. If any word's timing feels off, you can manually adjust it in the editor by shifting the word's start frame.
Can I use different highlight colors in the same video?
Yes. Each page has its own highlightColor setting, so you can use different highlight colors for different sections of your video. You can also choose one of the four highlight sub-styles (Scale, Background, Bounce, ColorChange) independently for each page. These settings are found in the clip inspector panel.
Does Karaoke work for fast-paced content?
Karaoke works best for moderate to slow-paced content where viewers have time to read ahead and follow the highlight. For very fast speech or high-energy content, the Pop or Flash categories tend to perform better because they match the rapid visual pacing. Karaoke excels when you want viewers to absorb information, not just feel energy.
Can I use Karaoke captions for lyric videos?
Absolutely. Karaoke was designed with lyric videos in mind. Upload your music track, let Whisper transcribe the lyrics, review and correct any misheard words, then choose Karaoke with the Background or Bounce sub-style. The word-level highlighting will follow the vocal timing, creating a professional lyric video without any video editing experience.
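To make the timing answer above more concrete, here is a small sketch of how word timestamps could be converted to start frames and how one word's start frame could be shifted. It assumes timestamps arrive in seconds and that the frame rate is known. The type and function names are hypothetical and do not describe the editor's actual API.

```typescript
// Hypothetical shapes: a transcribed word with a start time in seconds,
// and the same word expressed as a start frame.
interface TimedWord {
  text: string;
  startSeconds: number;
}

interface FramedWord {
  text: string;
  from: number; // start frame
}

// Converts a timestamp in seconds to the nearest frame at the given frame rate.
function secondsToFrame(seconds: number, fps: number): number {
  return Math.round(seconds * fps);
}

function toFramedWords(words: TimedWord[], fps: number): FramedWord[] {
  return words.map((w) => ({ text: w.text, from: secondsToFrame(w.startSeconds, fps) }));
}

// Returns a new list with one word's start frame shifted by `deltaFrames`
// (negative = earlier). The start frame is clamped so it never goes below 0.
function shiftWordStart(
  words: FramedWord[],
  index: number,
  deltaFrames: number
): FramedWord[] {
  return words.map((w, i) =>
    i === index ? { ...w, from: Math.max(0, w.from + deltaFrames) } : w
  );
}

const fps = 30;
const framed = toFramedWords(
  [
    { text: "follow", startSeconds: 0 },
    { text: "the", startSeconds: 0.4 },
    { text: "highlight", startSeconds: 1 },
  ],
  fps
);

// Pull the second word two frames earlier if its highlight feels late.
const adjusted = shiftWordStart(framed, 1, -2);
console.log(adjusted);
```

At 30 frames per second one frame lasts about 33 milliseconds, so a 50-100 millisecond timing error corresponds to roughly 1.5 to 3 frames. A manual correction is therefore usually a shift of only a few frames.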