Resolve Transcript to Readable Dialogue
When you’re working on interviews or documentaries, DaVinci Resolve’s transcription tool is a lifesaver. It can automatically generate a text transcript of a timeline, complete with speaker detection.

But the output format isn’t exactly something you’d want to hand over to a client or drop straight into an article.
For example, exporting a transcript with “Detect Speaker” enabled gives you something like this:
[00:00:43:03 - 00:00:48:08]
Speaker 1
So can you tell me what something you love or are interested in people might not expect?
[00:00:50:23 - 00:01:01:15]
Speaker 2
Something I love that people would not expect is my diverse taste in music and especially my passion for electronic dance music.
This is great for editing reference, but clunky for storytelling.
It has timecodes, generic “Speaker 1 / Speaker 2” labels, and breaks every few seconds.
What we actually want is something more natural, like a written Q&A:
Justin: So can you tell me what something you love or are interested in people might not expect?
Vasileios: Something I love that people would not expect is my diverse taste in music and especially my passion for electronic dance music.
The Solution: Clean Transcript Script
To fix this, I wrote a Python script that:
- Removes timecodes like [00:00:43:03 - 00:00:48:08]
- Maps speaker numbers to real names (e.g. Speaker 1 → Justin, Speaker 2 → Vasileios)
- Merges consecutive blocks from the same speaker, so the dialogue flows naturally
You can find the full script here: 👉 GitHub – clean_transcript.py
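To make the three steps concrete before reading the full script, here is a minimal sketch of the same idea. It is not the script itself: the names `to_pairs`, `merge`, and `NAMES` are hypothetical, the sample text is a shortened, made-up transcript in the export format shown above, and the merge step uses `itertools.groupby` instead of the script's manual loop.

```python
import re
from itertools import groupby

# Shortened, made-up sample in the shape Resolve exports with "Detect Speaker" on.
RAW = """\
[00:00:43:03 - 00:00:48:08]
Speaker 1
So can you tell me
[00:00:48:09 - 00:00:50:00]
Speaker 1
what people might not expect?
[00:00:50:23 - 00:01:01:15]
Speaker 2
My passion for electronic dance music.
"""

TIMECODE = re.compile(r"^\[\d{2}(?::\d{2}){3} - \d{2}(?::\d{2}){3}\]$")
SPEAKER = re.compile(r"^Speaker \d+$")
NAMES = {"Speaker 1": "Justin", "Speaker 2": "Vasileios"}  # hypothetical mapping


def to_pairs(raw):
    """Yield (speaker, text) pairs, skipping timecode and blank lines."""
    speaker = None
    for line in raw.splitlines():
        line = line.strip()
        if not line or TIMECODE.match(line):
            continue  # step 1: drop timecodes
        if SPEAKER.match(line):
            speaker = NAMES.get(line, line)  # step 2: map the label to a name
        elif speaker is not None:
            yield speaker, line


def merge(pairs):
    """Step 3: join consecutive pairs that share a speaker."""
    return [
        (speaker, " ".join(text for _, text in group))
        for speaker, group in groupby(pairs, key=lambda pair: pair[0])
    ]


for speaker, text in merge(to_pairs(RAW)):
    print(f"{speaker}: {text}")
```

The two Speaker 1 blocks come out as a single "Justin:" line, followed by one "Vasileios:" line.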
How to Use It
Save your Resolve transcript as a text file, then run the script from the terminal.
Basic syntax:
python clean_transcript.py input.txt -o output.txt --s1 "Justin" --s2 "Vasileios"
- input.txt → the raw transcript exported from Resolve
- -o output.txt → the cleaned transcript output file (optional; if omitted, prints to console)
- --s1 "Justin" → replaces Speaker 1 with Justin
- --s2 "Vasileios" → replaces Speaker 2 with Vasileios
You can also adjust:
- --sep → separator between merged speaker blocks (default is a blank line)
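One thing to watch with --sep: argparse does not interpret escape sequences, and in most shells typing --sep "\n" passes a literal backslash followed by an "n" rather than a newline. The sketch below shows one possible way to handle that with a `type=` converter. The helper `unescape_sep` is hypothetical and is not part of the script as published.

```python
import argparse


def unescape_sep(value: str) -> str:
    """Turn the two-character sequences \\n and \\t into a real newline and tab."""
    return value.replace("\\n", "\n").replace("\\t", "\t")


parser = argparse.ArgumentParser()
# Mirrors the script's --sep option, with the hypothetical converter added.
parser.add_argument("--sep", default="\n\n", type=unescape_sep)

# Simulates typing --sep '\n' in a shell: the value arrives as backslash + "n".
args = parser.parse_args(["--sep", r"\n"])
print(repr(args.sep))
```

With the converter in place the separator is a single real newline; without it, the merged blocks would be joined by the visible characters \n.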
Full Python Script
#!/usr/bin/env python3
"""
Author: Matteo Curcio
Website: https://matteocurcio.com
Email: [email protected]
Description:
Clean Resolve transcripts by:
- Removing timecode lines like [00:00:43:03 - 00:00:48:08]
- Mapping 'Speaker 1' / 'Speaker 2' to user-defined names
- Merging consecutive blocks from the same speaker
Usage:
python clean_transcript.py input.txt -o output.txt --s1 "Name1" --s2 "Name2"
Example:
python clean_transcript.py raw.txt -o clean.txt --s1 "Justin" --s2 "Vasileios"
"""
import argparse
import re
from pathlib import Path

TIMECODE_RE = re.compile(
    r'^\s*\[\d{2}:\d{2}:\d{2}:\d{2}\s*-\s*\d{2}:\d{2}:\d{2}:\d{2}\]\s*$'
)
SPEAKER_LINE_RE = re.compile(r'^\s*(Speaker)\s+(\d+)\s*$', re.IGNORECASE)


def parse_args():
    ap = argparse.ArgumentParser()
    ap.add_argument("input", type=Path, help="Input transcript text file")
    ap.add_argument("-o", "--output", type=Path, help="Output file (default: stdout)")
    ap.add_argument("--s1", default="Speaker 1", help="Name for Speaker 1")
    ap.add_argument("--s2", default="Speaker 2", help="Name for Speaker 2")
    ap.add_argument("--sep", default="\n\n", help="Separator between merged blocks")
    return ap.parse_args()


def speaker_name(raw: str, s1: str, s2: str) -> str:
    m = SPEAKER_LINE_RE.match(raw)
    if not m:
        return raw.strip()
    num = m.group(2)
    if num == "1":
        return s1
    if num == "2":
        return s2
    # Leave Speaker 3+ unchanged but normalized
    return f"Speaker {num}"


def clean_lines(lines, s1, s2):
    blocks = []  # list of (speaker, text_string)
    cur_speaker = None
    cur_text_parts = []

    def flush():
        nonlocal cur_speaker, cur_text_parts
        if cur_speaker is not None:
            text = " ".join(" ".join(cur_text_parts).split())  # collapse whitespace nicely
            blocks.append((cur_speaker, text))
        cur_speaker = None
        cur_text_parts = []

    i = 0
    n = len(lines)
    while i < n:
        line = lines[i].rstrip("\n")
        # Skip pure timecode lines
        if TIMECODE_RE.match(line):
            i += 1
            # Next non-empty line should be the speaker label
            while i < n and lines[i].strip() == "":
                i += 1
            if i < n:
                sp_line = lines[i].rstrip("\n")
                if SPEAKER_LINE_RE.match(sp_line):
                    sp = speaker_name(sp_line, s1, s2)
                    # If new block speaker differs from current speaker, flush
                    if cur_speaker is None:
                        cur_speaker = sp
                    elif sp != cur_speaker:
                        flush()
                        cur_speaker = sp
                    # Consume speaker line and continue collecting text
                    i += 1
                    # Collect text lines until next timecode or blank+timecode pattern
                    while i < n and not TIMECODE_RE.match(lines[i]):
                        txt = lines[i].strip()
                        if txt != "":
                            cur_text_parts.append(txt)
                        i += 1
                    # Do not flush yet; we might merge with next block if same speaker
                    continue
                else:
                    # Unexpected line, treat as text under current speaker (if any)
                    if sp_line.strip():
                        if cur_speaker is None:
                            # Unknown speaker: label as 'Unknown'
                            cur_speaker = "Unknown"
                        cur_text_parts.append(sp_line.strip())
                    i += 1
                    continue
            else:
                # End after a timecode line: flush whatever we had
                break
        else:
            # Non-timecode line outside the expected structure.
            # If it's a speaker line, handle similarly.
            if SPEAKER_LINE_RE.match(line):
                sp = speaker_name(line, s1, s2)
                if cur_speaker is None:
                    cur_speaker = sp
                elif sp != cur_speaker:
                    flush()
                    cur_speaker = sp
            else:
                if line.strip():
                    if cur_speaker is None:
                        cur_speaker = "Unknown"
                    cur_text_parts.append(line.strip())
            i += 1
    # Flush at end
    flush()
    return blocks


def format_blocks(blocks, sep="\n\n"):
    out_lines = []
    last_speaker = None
    for sp, text in blocks:
        if sp != last_speaker:
            out_lines.append(f"{sp}: {text}")
            last_speaker = sp
        else:
            out_lines[-1] = out_lines[-1] + " " + text
    return sep.join(out_lines).strip() + "\n"


def main():
    args = parse_args()
    raw = args.input.read_text(encoding="utf-8").splitlines()
    blocks = clean_lines(raw, args.s1, args.s2)
    result = format_blocks(blocks, sep=args.sep)
    if args.output:
        args.output.write_text(result, encoding="utf-8")
    else:
        print(result, end="")


if __name__ == "__main__":
    main()
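
As written, the script only renames Speaker 1 and Speaker 2; anything from Speaker 3 upward keeps its generic label. If your interview has more voices, one possible extension is to look names up in a mapping instead of using two fixed arguments. The sketch below is an assumption about how that could look, not part of the published script: `rename_speaker`, the `names` mapping, and the "Narrator" entry are all hypothetical.

```python
import re
from typing import Mapping

SPEAKER_LINE_RE = re.compile(r"^\s*Speaker\s+(\d+)\s*$", re.IGNORECASE)


def rename_speaker(raw: str, names: Mapping[int, str]) -> str:
    """Return the mapped name for a 'Speaker N' line, or a normalized label."""
    m = SPEAKER_LINE_RE.match(raw)
    if not m:
        return raw.strip()
    num = int(m.group(1))
    # Fall back to the generic label when no name was supplied for this number.
    return names.get(num, f"Speaker {num}")


# Hypothetical mapping; any speaker number can be added here.
names = {1: "Justin", 2: "Vasileios", 3: "Narrator"}
print(rename_speaker("Speaker 3", names))
print(rename_speaker("speaker 4", names))
```

Unmapped speakers still fall back to "Speaker N", which matches how the script treats Speaker 3+ today.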
Why This Matters
Instead of spending hours cleaning transcripts by hand, you can run this script and have the result in seconds. You get a clean, readable dialogue format ready for:
- Blog posts
- Interview articles
- Subtitles or captions
- Documentary scripts
This keeps the creative process moving instead of bogging down in formatting work.
🔗 Full code and documentation: clean_transcript.py on GitHub