How To Make AI Song Covers with Anyone's Voice for FREE

AI Search

Overview

This video tutorial demonstrates how to create AI song covers using anyone's voice for free. It presents three distinct methods: RVC (a powerful, locally run tool), Kits AI (an online platform with a free tier), and AI Cover Gen (a simplified, Google Colab-based option). The process generally involves separating vocals from instrumentals, converting the vocals to a target voice using AI models, and then rejoining the converted vocals with the original instrumentals. The tutorial uses "Shape of You" by Ed Sheeran and "Idol" by YOASOBI as examples, showcasing conversions to Gura and SpongeBob voices, respectively.

Chapters

The video covers three free methods for creating AI song covers, which vary in complexity and hardware requirements.
Two songs, "Shape of You" (Ed Sheeran) and "Idol" (YOASOBI), are used as examples to demonstrate different styles and languages.
Acoustic or unplugged versions of songs are recommended because their vocals are cleaner; AI tools struggle to separate harmonies from lead vocals.
The first step is to separate the original song's vocals from its instrumentals using a tool like Vocal Remover.

Understanding the preparation steps ensures you have the right source material and tools to begin the AI voice conversion process effectively.

Using the acoustic version of "Shape of You" by Ed Sheeran to ensure cleaner vocal separation.

RVC is presented as the gold standard for AI voice conversion, with various versions available (e.g., Applio RVC, Mangio RVC).
To use RVC, you need to download and install a voice model (e.g., Gura's voice from voicemodels.com) and place its .pth file into the RVC 'weights' folder.
The RVC interface allows you to input the path to the vocals you want to convert and select the desired voice model.
Pitch shifting is crucial for male-to-female or female-to-male conversions; for male-to-female, a shift of +8 to +12 semitones is suggested.
After conversion, the new vocals must be merged with the original instrumentals using an audio editor like Audacity, ensuring the instrumentals are also pitch-shifted to match the converted vocals.
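The model installation step above amounts to copying one file into one folder. A minimal sketch, assuming the RVC install has a 'weights' folder directly under its root; the function name and the example paths are hypothetical and not taken from the video:

```python
import shutil
from pathlib import Path


def install_voice_model(model_file: Path, rvc_root: Path) -> Path:
    """Copy a downloaded .pth voice model into <rvc_root>/weights.

    Assumption: the weights folder sits directly under the RVC root.
    """
    if model_file.suffix.lower() != ".pth":
        raise ValueError(f"expected a .pth file, got {model_file.name}")

    weights_dir = rvc_root / "weights"
    if not weights_dir.is_dir():
        # Fail loudly rather than create the folder: a missing folder
        # usually means rvc_root points at the wrong place.
        raise FileNotFoundError(f"no weights folder at {weights_dir}")

    destination = weights_dir / model_file.name
    shutil.copy2(model_file, destination)
    return destination


# Hypothetical usage; adjust both paths to your own machine:
# install_voice_model(Path("Downloads/gura.pth"), Path("RVC"))
```

After the copy, the model should appear in the RVC interface's model list, possibly after refreshing it.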

RVC offers high control and quality for AI covers, but requires local installation and manual audio mixing, making it suitable for users comfortable with more technical steps.
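Conceptually, merging the converted vocals with the instrumentals means adding the two signals sample by sample. The sketch below shows only that arithmetic on plain lists of floats in the range -1.0 to 1.0; it is not how Audacity is implemented, and the function name and gain parameters are illustrative assumptions:

```python
def mix_tracks(vocals, instrumental, vocal_gain=1.0, instrumental_gain=1.0):
    """Sum two mono tracks sample by sample and clip to [-1.0, 1.0].

    The shorter track is treated as silence once it runs out.
    """
    length = max(len(vocals), len(instrumental))
    mixed = []
    for i in range(length):
        v = vocals[i] if i < len(vocals) else 0.0
        m = instrumental[i] if i < len(instrumental) else 0.0
        sample = vocal_gain * v + instrumental_gain * m
        # Clip so the sum never exceeds the valid sample range.
        mixed.append(max(-1.0, min(1.0, sample)))
    return mixed


print(mix_tracks([0.5, 0.5], [0.25]))  # [0.75, 0.5]
```

Lowering one of the gains is the equivalent of pulling down a track's volume slider before exporting the mix.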

Converting Ed Sheeran's vocals to Gura's voice and then using Audacity to pitch-shift the instrumentals by 8 semitones to match Gura's converted vocals.
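To get a feel for what an 8-semitone shift means, recall that in standard equal temperament each semitone multiplies frequency by the twelfth root of two. The short sketch below computes that ratio; the function name is illustrative, and how any particular editor applies the shift internally is not covered here:

```python
def semitone_ratio(semitones: float) -> float:
    """Frequency multiplier for a pitch shift of the given number of semitones."""
    return 2.0 ** (semitones / 12.0)


print(f"{semitone_ratio(8):.3f}")   # 1.587
print(f"{semitone_ratio(12):.3f}")  # 2.000
```

A shift of +12 semitones is a full octave (double the frequency), so the suggested +8 to +12 range for male-to-female conversion raises the pitch by roughly 1.6 to 2 times.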

Kits AI provides an online, browser-based solution for AI voice conversion, eliminating the need for local downloads or powerful hardware.
Users upload their desired voice models (e.g., Gura, SpongeBob) to their Kits AI account.
The platform allows direct input of original song vocals and offers advanced settings for pitch shifting, conversion strength, and audio effects like reverb and chorus.
Kits AI simplifies the process by handling vocal separation and re-joining internally, requiring only the original vocals and the target voice model.
A free starter plan offers limited conversion minutes per month, suitable for occasional use.

Kits AI offers a user-friendly, accessible alternative for creating AI covers without complex local setups, ideal for those who prefer an online workflow and have limited technical expertise.

Uploading Gura's voice model to Kits AI, then using the platform to convert Ed Sheeran's vocals and generate the AI cover song.

AI Cover Gen is presented as the easiest and 'laziest' method, runnable via Google Colab to avoid local installation.
Users paste a YouTube link of the song they want to cover and provide the link to the desired voice model zip file.
The tool automatically handles vocal separation, AI voice conversion, pitch adjustment, and re-joining with instrumentals.
It offers options for algorithm quality (RMVPE) and audio mixing, along with reverb settings.
Running in Google Colab requires connecting to a GPU and running all cells, and the setup must be repeated for every new session.

This method is the most streamlined, requiring minimal technical knowledge and no local setup, making AI song cover creation accessible to almost anyone.

Pasting a YouTube link for "Shape of You" and Gura's voice model zip file into the AI Cover Gen Google Colab interface to generate the cover.
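The order of the automated stages can be sketched as a small pipeline. This is not AI Cover Gen's actual code: the class, the stage names, and the toy stand-in functions are all hypothetical, and real stages would call a vocal separator, a voice conversion model, and an audio mixer.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Audio = List[float]  # placeholder type for a mono track


@dataclass
class CoverPipeline:
    """Hypothetical stage ordering: separate -> convert -> mix."""
    separate: Callable[[Audio], Tuple[Audio, Audio]]  # song -> (vocals, instrumental)
    convert: Callable[[Audio, int], Audio]            # (vocals, pitch shift) -> new vocals
    mix: Callable[[Audio, Audio], Audio]              # (vocals, instrumental) -> cover

    def run(self, song: Audio, pitch_semitones: int = 0) -> Audio:
        vocals, instrumental = self.separate(song)
        converted = self.convert(vocals, pitch_semitones)
        return self.mix(converted, instrumental)


# Toy stand-ins that only demonstrate the data flow between stages.
pipeline = CoverPipeline(
    separate=lambda song: (song[:1], song[1:]),
    convert=lambda vocals, pitch: [x + pitch for x in vocals],
    mix=lambda vocals, instrumental: vocals + instrumental,
)
print(pipeline.run([1.0, 2.0], pitch_semitones=8))  # [9.0, 2.0]
```

Keeping each stage as a separate callable mirrors the manual RVC workflow, where each step is done in a different tool.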

Ekus Vocal Remover is highlighted as another free online tool for separating vocals and instrumentals, offering various separation options (e.g., drums, bass).
It supports direct YouTube link input for vocal/instrumental separation.
The video concludes by emphasizing the rapid advancement of AI in voice generation and encourages viewers to ask questions in the comments.
A resource for finding AI tools (ai-search.com) is also mentioned.

This section provides additional tool recommendations and reinforces the accessibility and rapid progress of AI cover generation technology.

Using Ekus Vocal Remover by pasting a YouTube link to a song to extract its vocals and instrumentals.

Key takeaways

1. AI song cover creation can be achieved for free using various tools, ranging from complex local software to simple online platforms.
2. The core process involves isolating vocals, converting them to a target voice using AI models, and then reintegrating them with instrumentals.
3. Choosing acoustic versions of songs can significantly improve the quality of vocal separation for AI processing.
4. Pitch shifting is a critical step when converting between different voice types (e.g., male to female) to ensure the converted vocals match the instrumental key.
5. RVC offers the most control but requires technical setup and manual audio mixing, while Kits AI and AI Cover Gen provide more user-friendly, automated solutions.
6. AI voice models can be found on platforms like voicemodels.com, and their compatibility with different tools should be considered.
7. The quality of AI covers is rapidly improving, making it an increasingly accessible creative outlet.

Key terms

AI Song Cover, Vocal Separator, Instrumental, Voice Model, RVC (Retrieval-based Voice Conversion), Mangio RVC, Kits AI, AI Cover Gen, Google Colab, Pitch Shifting, Semitones, PTH file, Audacity

Test your understanding

1. What are the three main methods presented for creating AI song covers, and what are their primary differences in terms of user experience and technical requirements?
2. Why is it often recommended to use acoustic versions of songs when creating AI voice covers, and how does this relate to the limitations of current AI vocal separation tools?
3. How does pitch shifting work in the context of AI voice conversion, and why is it a necessary step when changing a male voice to a female voice (or vice versa)?
4. Compare and contrast the workflows of RVC, Kits AI, and AI Cover Gen. Which method would be most suitable for a beginner, and which for an advanced user, and why?
5. What is the role of a voice model in AI song cover generation, and where can users find these models?