Voiced by Amazon Polly |
Introduction
This post presents a cost-effective solution for video auto-dubbing using Amazon Translate for initial translations, Amazon Bedrock for post-editing, and Amazon Polly for generating synthetic voices.
Solution Overview
The solution utilizes AWS services to streamline the dubbing process. The inputs include the original video and caption file, target language, and toggles for idiom detection and formality tone. These inputs are specified in an Excel template and uploaded to an Amazon S3 bucket to launch the entire pipeline. The final outputs are a dubbed video file and a translated caption file.
The process involves:
- Amazon Translate: Translates the video captions.
- Amazon Bedrock: Enhances translation quality and synchronizes audio and video.
- Amazon Augmented AI: Allows editors to review the content.
- Amazon Polly: Generates synthetic voices for the video.
- AWS Step Functions: Orchestrates the steps using AWS Lambda or AWS Batch.
By utilizing AWS CloudFormation, the pipeline is reusable for dubbing new foreign languages.
Pioneers in Cloud Consulting & Migration Services
- Reduced infrastructural costs
- Accelerated application deployment
The Role of Amazon Translate
Amazon Translate is pivotal in translating video captions efficiently and accurately. It supports over 75 languages, providing a broad range of options for global reach. Three primary reasons underpin the choice of Amazon Translate for this solution:
- Language Support: Amazon Translate supports many languages, ensuring comprehensive coverage for global audiences.
- Translation Accuracy: Rigorous evaluations by translation professionals have confirmed the commendable accuracy of Amazon Translate.
- Custom Terminology: The ability to input custom terminology dictionaries ensures that translations align with specific organizational vocabulary.
Custom Terminology in Amazon Translate
Custom terminology ensures translations maintain the intended meaning, especially for specialized terms.
1 2 3 4 5 6 7 8 9 10 11 |
import boto3 translate = boto3.client(service_name='translate', region_name='us-east-1', use_ssl=True) def translate_text(text, source_lang, target_lang): result = translate.translate_text(Text=text, SourceLanguageCode=source_lang, TargetLanguageCode=target_lang) return result.get('TranslatedText') text = "(speaking in a foreign language)" output = translate_text(text, "en", "de") print(output) |
By using custom terminology, the translation output reflects the intended meaning accurately:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 |
import boto3 import json translate = boto3.client('translate') with open('translation_custom_terminology_de.csv', 'rb') as ct_file: translate.import_terminology( Name='CustomTerminology_boto3', MergeStrategy='OVERWRITE', Description='Terminology for Demo through boto3', TerminologyData={ 'File': ct_file.read(), 'Format': 'CSV', 'Directionality': 'MULTI' } ) text = "(speaking in foreign language)" result = translate.translate_text( Text=text, SourceLanguageCode="en", TargetLanguageCode="de" ) print(result['TranslatedText']) # Output: (Person spricht in einer Fremdsprache) |
Setting Formality Tone in Amazon Translate
Amazon Translate allows for setting the formal tone, which is crucial for different documentary genres. Some genres tend to be more formal than others, and Amazon Translate provides the flexibility to adjust the translation’s tone accordingly.
1 2 3 4 5 6 7 8 9 10 |
translate = boto3.client(service_name='translate', region_name='us-east-1', use_ssl=True) def translate_text(text, source_lang, target_lang, formality='FORMAL'): result = translate.translate_text(Text=text, SourceLanguageCode=source_lang, TargetLanguageCode=target_lang, Settings={'Formality': formality}) return result.get('TranslatedText') text = "[Speaker 1] Let me show you something." output = translate_text(text, "en", "de", formality='FORMAL') print(output) # Output: [Sprecher 1] Lassen Sie mich Ihnen etwas zeigen. |
Enhancing Quality with Amazon Bedrock
Amazon Bedrock significantly enhances the quality of video dubbing through idiom detection and sentence shortening, ensuring that translations are contextually accurate and culturally appropriate.
Idiom Detection and Replacement
Idiom detection and replacement are essential for accurately conveying cultural nuances in dubbing. This feature can be toggled on or off, depending on the content genre. For instance, idioms are more prevalent in casual conversations and less so in scientific documentaries.
1 2 3 4 5 6 |
text_rephrased = bedrock_api_idiom(text) print(text_rephrased) # Output: I work hard response = translate_text(text_rephrased, "en", "es-MX") print(response) # Output: yo trabajo duro |
Sentence Shortening
Sentence shortening aids in automatic time scaling, reducing manual effort and improving efficiency. This is particularly useful for ensuring the dubbed audio matches the original video’s timing.
Original sentence:
1 |
“<code>A large portion of the solar energy that reaches our planet </code><span class="hljs-keyword">is</span><code> reflected back </code><span class="hljs-keyword">into</span><code> space </code><span class="hljs-keyword">or</span><code> absorbed </code><span class="hljs-keyword">by</span><code> dust </code><span class="hljs-keyword">and</span><code> clouds.” |
Shortened sentence:
1 |
“A large part of solar energy </code><span class="hljs-keyword">is</span><code> reflected </code><span class="hljs-keyword">into</span><code> space </code><span class="hljs-keyword">or</span><code> absorbed </code><span class="hljs-keyword">by</span><code> dust </code><span class="hljs-keyword">and</span><code> clouds.” |
Amazon Polly
Using Amazon Polly to convert text to speech
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 |
def polly(text,key, bucket_name): pollyclient = boto3.client('polly') s3client = boto3.client('s3') s3_key = 'response audio/speech'+{key}+'.mp3' try: response = pollyclient.synthesize_speech( Engine='standard', LanguageCode='en-GB', OutputFormat='mp3', Text=text, TextType='text', VoiceId='Aditi') except (BotoCoreError, ClientError) as error: print(error) sys.exit(-1) if "AudioStream" in response: with closing(response["AudioStream"]) as stream: output = os.path.join(gettempdir(), f"speech{key}.mp3") try: # Open a file for writing the output as a binary stream with open(output, "wb") as file: file.write(stream.read()) except IOError as error: # Could not write to file, exit gracefully print(error) sys.exit(-1) try: delete_object_response = s3client.delete_object( Bucket=bucket_name, Key=s3_key) url=s3client.upload_file(output, bucket_name, s3_key) print(url) print(f"File successfully uploaded to s3://{bucket_name}/{s3_key}") except (BotoCoreError, ClientError) as error: print(f"Failed to upload file to S3: {error}") sys.exit(-1) else: # The response didn't contain audio data, exit gracefully print("Could not stream audio") sys.exit(-1) s="https://{}.s3.ap-south-1.amazonaws.com/response+audio/speech{}.mp3".format(bucket_name,key) return s |
Conclusion
The innovative pipeline developed by Mission Cloud and powered by AWS services has transformed video dubbing for MagellanTV, offering a cost-effective, efficient, and high-quality solution. This approach opens new opportunities for global content distribution, allowing companies to reach wider audiences without the traditional high costs associated with dubbing.
This constantly evolving solution addresses the challenges faced by the Media & Entertainment industry, providing a new frontier of opportunities for content localization.
For more information or to consult with the Mission team about your generative AI use case, request a session through AWS Marketplace.
Drop a query if you have any questions regarding Video Dubbing and we will get back to you quickly.
Making IT Networks Enterprise-ready – Cloud Management Services
- Accelerated cloud migration
- End-to-end view of the cloud environment
About CloudThat
CloudThat is a leading provider of Cloud Training and Consulting services with a global presence in India, the USA, Asia, Europe, and Africa. Specializing in AWS, Microsoft Azure, GCP, VMware, Databricks, and more, the company serves mid-market and enterprise clients, offering comprehensive expertise in Cloud Migration, Data Platforms, DevOps, IoT, AI/ML, and more.
CloudThat is the first Indian Company to win the prestigious Microsoft Partner 2024 Award and is recognized as a top-tier partner with AWS and Microsoft, including the prestigious ‘Think Big’ partner award from AWS and the Microsoft Superstars FY 2023 award in Asia & India. Having trained 650k+ professionals in 500+ cloud certifications and completed 300+ consulting projects globally, CloudThat is an official AWS Advanced Consulting Partner, Microsoft Gold Partner, AWS Training Partner, AWS Migration Partner, AWS Data and Analytics Partner,AWS DevOps Competency Partner, Amazon QuickSight Service Delivery Partner, Amazon EKS Service Delivery Partner, AWS Microsoft Workload Partners, Amazon EC2 Service Delivery Partner, Amazon ECS Service Delivery Partner, AWS Glue Service Delivery Partner, Amazon Redshift Service Delivery Partner, AWS Control Tower Service Delivery Partner, AWS WAF Service Delivery Partner and many more.
To get started, go through our Consultancy page and Managed Services Package, CloudThat’s offerings.
FAQs
1. Why use Amazon Translate for video dubbing?
ANS: – Amazon Translate offers several advantages for video dubbing:
- Supports over 75 languages: Ensures wide global reach.
- High translation accuracy: Evaluated and confirmed by translation professionals.
- Custom terminology: Allows input of custom terminology to reflect organizational vocabulary accurately.
2. What role does Amazon Bedrock play in video dubbing?
ANS: – Amazon Bedrock enhances the translation quality by:
- Idiom detection and replacement: Ensures cultural nuances are accurately conveyed.
- Sentence shortening: Improves synchronization of dubbed audio with the video, reducing the need for manual time scaling.
WRITTEN BY Shantanu Singh
Shantanu Singh works as a Research Associate at CloudThat. His expertise lies in Data Analytics. Shantanu's passion for technology has driven him to pursue data science as his career path. Shantanu enjoys reading about new technologies to develop his interpersonal skills and knowledge. He is very keen to learn new technology. His dedication to work and love for technology make him a valuable asset.
Click to Comment