How to Clean Transcripts

What Happens During 'Transcript Cleaning'?

During the cleaning stage, Junior transforms speaker-labeled transcripts into client- or partner-ready call notes. He does this by putting each transcript through multiple transformations:

  • [Audio files only] Turn the speech into a highly accurate verbatim text of the conversation

  • Parse out question and answer pairs ("QA Blocks")

  • Clean the language of grammatical and syntactical errors, interruptions, and repetitions

    • Remove informal language, colloquialisms, and filler words ('ums', 'ahs')

  • Extract Entities from every QA block

Junior sees all interviews as a series of questions and answers and chunks the conversation into "QA Blocks" that comprise the base unit of insight in the platform.
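To make the idea of a QA Block concrete, here is a minimal sketch of how a speaker-labeled transcript could be chunked into question and answer pairs with filler words stripped out. It is purely illustrative and not Junior's actual implementation: the `QABlock` class, the `clean` and `to_qa_blocks` functions, the "Interviewer" speaker label, and the filler-word list are all assumptions made for this example.

```python
import re
from dataclasses import dataclass

# Hypothetical filler-word pattern; Junior's real cleaning rules are not documented here.
FILLERS = re.compile(r"\b(?:um+|uh+|ah+|er+)\b[,.]?\s*", re.IGNORECASE)


@dataclass
class QABlock:
    question: str
    answer: str


def clean(text: str) -> str:
    """Strip filler words and collapse any leftover whitespace."""
    without_fillers = FILLERS.sub("", text)
    return re.sub(r"\s+", " ", without_fillers).strip()


def to_qa_blocks(turns: list[tuple[str, str]]) -> list[QABlock]:
    """Pair each interviewer turn with the turns that follow it."""
    blocks: list[QABlock] = []
    for speaker, text in turns:
        if speaker == "Interviewer":
            # Assumption: every interviewer turn opens a new QA Block.
            blocks.append(QABlock(question=clean(text), answer=""))
        elif blocks:
            # Any other speaker extends the answer of the current block.
            blocks[-1].answer = f"{blocks[-1].answer} {clean(text)}".strip()
    return blocks


turns = [
    ("Interviewer", "Um, how big is the market?"),
    ("Expert", "Uh, roughly two billion dollars."),
    ("Expert", "It is ah growing quickly."),
    ("Interviewer", "Who leads it?"),
    ("Expert", "Acme does."),
]
blocks = to_qa_blocks(turns)
```

In this sketch the five speaker turns collapse into two QA Blocks, with the two consecutive expert turns merged into a single answer. Note that it does not restore capitalization after removing a leading filler word, which a real cleaning step would need to handle.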

The end result is that Junior typically removes ~30-50% of the verbatim text from a transcript, giving you a set of notes that reads more like a synopsis of the conversation and less like pure reference material.
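If you want to sanity-check how much a cleaned transcript has been condensed, a rough calculation like the one below can help. This is a sketch only: measuring by word count is an assumption (the figure above does not say how the reduction is measured), and `reduction_pct` is a hypothetical helper, not part of Junior.

```python
def reduction_pct(verbatim: str, cleaned: str) -> float:
    """Percentage of words removed between the verbatim and cleaned text."""
    before = len(verbatim.split())
    after = len(cleaned.split())
    if before == 0:
        return 0.0  # Avoid dividing by zero on an empty transcript.
    return 100.0 * (before - after) / before


# Made-up example sentences, for illustration only.
verbatim = "So, um, I think, you know, the market is, like, roughly two billion dollars"
cleaned = "The market is roughly two billion dollars"
pct = reduction_pct(verbatim, cleaned)  # 14 words down to 7 words
```

A result far above the typical range may be a sign that nuance or anecdotes were trimmed and are worth restoring during review.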

Why Do We Have a Review Stage?

Junior does a great job, but can't do a perfect job out of the box. He still requires a human-in-the-loop to guide him, for 3 main reasons:

  1. AI-generated transcripts are liable to mishear / mistranscribe. While our transcription service is consistently best in class, the quality of output is still dependent on the quality and clarity of the input. It is therefore possible for the AI to mistranscribe words like "impossible" as "possible" if there was interference at the wrong time in the audio file.

  2. Proper Nouns are still the most difficult problem to solve with respect to transcription. Even the best transcription models still struggle with industry technical terms and competitor names. As part of the Review Transcript flow, we've built functionality around Junior to solve this.

  3. Large Language Models (LLMs) tend to oversummarise. While they anchor very well to ideas, concepts and arguments, they tend to remove some nuance and anecdotes during the cleaning process. Some of these details and anecdotes may well be parts of the conversation you would like to include in a 'cleaned' transcript.

It is critical that you review all output that has undergone transformation via AI.

We have adopted a key design principle throughout Junior: users have the ability to quickly and seamlessly double-check and approve output created by Junior. Ultimately, it is up to you to ensure that work product meets your firm's standards.

The importance of the cleaning process should not be understated. By reviewing and approving Junior's output, you are contributing to the creation of a ‘single source of truth’: the workflow tools in Junior are built on top of the cleaned version of the transcripts - errors or omissions at this stage will cascade through the application.

There are additional benefits to the cleaning process:

  • it ensures your comprehension following a call

  • it gives you the opportunity to tag the best insights for use later on

  • by correcting Proper Nouns, you reduce the time taken to clean future transcripts, increase the accuracy of workflow tools and contribute to your firm's knowledge graph

Last updated 1 year ago
