Comments

CubeAce wrote on 2/11/2021, 2:38 PM

@jak.willis

Hi Jak.

Yes, you can alter the pitch but not the timbre, so it would not really sound like a female voice without some work from the performer. It may also add some artifacts to the recording. MEP and VPX are not as smooth in their abilities as a dedicated music DAW in that regard. Whether an individual can hear the difference or not is a different matter. I can hear it if the pitch is taken beyond a certain point, but then I'm sensitive to things like auto-tune.

Right click an audio track and select Timestretch / resample.
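
As a rough illustration of the arithmetic behind a pitch change, here is a minimal Python sketch. It only shows the general equal-temperament relationship between semitones and frequency ratio, not how MEP or VPX implement Timestretch / resample internally, and the function names are made up for this example.

```python
def semitone_ratio(semitones: float) -> float:
    """Frequency ratio for a shift of the given number of semitones
    (equal temperament: 12 semitones doubles the frequency)."""
    return 2.0 ** (semitones / 12.0)


def resampled_duration(seconds: float, semitones: float) -> float:
    """Length of a clip after a plain resample, where pitch and speed
    change together (hypothetical helper for illustration)."""
    return seconds / semitone_ratio(semitones)


if __name__ == "__main__":
    for shift in (2, 4, 7, 12):
        ratio = semitone_ratio(shift)
        new_length = resampled_duration(10.0, shift)
        print(f"+{shift} semitones: ratio {ratio:.3f}, "
              f"10 s clip becomes {new_length:.2f} s")
```

A plain resample changes the clip length along with the pitch, as the second function shows; a time-stretch keeps the length the same and has to make up or discard audio to do so, which is where artifacts tend to come from.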

Ray.

 

Last changed by CubeAce on 2/11/2021, 2:49 PM, changed a total of 1 times.

 

Windows 10 Enterprise. Version 22H2 OS build 19045.5737

Direct X 12.1 latest hardware updates for Western Digital hard drives.

Asus ROG STRIX Z390-F Gaming motherboard Rev 1.xx with Supreme FX inboard audio using the S1220A code. Driver No 6.0.8960.1 Bios version 1401

Intel i9900K Coffee Lake 3.6 to 5.1GHz CPU with Intel UHD 630 Graphics .Driver version Graphics Driver 31.0.101.2135 for 7th-10th Gen Intel® with 64GB of 3200MHz Corsair DDR4 ram.

1000 watt EVGA modular power supply.

1 x 250GB Evo 970 NVMe: drive for C: drive backup 1 x 1TB Sabrent NVMe drive for Operating System / Programs only. 1X WD BLACK 1TB internal SATA 7,200rpm hard drives.1 for internal projects, 1 for Library clips/sounds/music/stills./backup of working projects. 1x500GB SSD current project only drive, 2x WD RED 2TB drives for latest footage storage. Total 31TB of 10 external WD drives for backup.

ASUS NVIDIA GeForce RTX 3060 12GB. nVidia Studio driver version 572.60 - 3584xCUDA cores Direct X 12.1. Memory interface 192bit Memory bandwidth 360.05GB/s 12GB of dedicated GDDR6 video memory, shared system memory 16307MB PCi Express x8 Gen3. Two Samsung 27" LED SA350 monitors with 5000000:1 contrast ratios at 60Hz.

Running MMS 2024 Suite v 23.0.1.182 (UDP3) and VPX 14 - v20.0.3.180 (UDP3)

M Audio Axiom AIR Mini MIDI keyboard Ver 5.10.0.3507

VXP 14, MMS 2024 Suite, Vegas Studio 16, Vegas Pro 18, Vegas Pro 21,Cubase 4. CS6, NX Studio, Mixcraft 9 Recording Studio. Mixcraft Pro 10 Studio. CS6 and DXO Photolab 8, OBS Studio.

Audio System 5 x matched bi-wired 150 watt Tannoy Reveal speakers plus one Tannoy 15" 250 watt sub with 5.1 class A amplifier. Tuned to room with Tannoy audio application.

Ram Acoustic Studio speakers amplified by NAD amplifier.

Rogers LS7 speakers run from Cambridge Audio P50 amplifier

Schrodinger's Backup. "The condition of any backup is unknown until a restore is attempted."

jak.willis wrote on 2/11/2021, 3:06 PM

Hi Ray,

I actually do my own voice-overs for my videos. I’m quite good at doing female voices; however, I am trying to find a way to make each female voice sound slightly different so that it doesn’t become obvious that it’s the same person doing all of the voices, if that makes sense.

CubeAce wrote on 2/11/2021, 4:36 PM

@jak.willis

Hi Jak.

One of the most complex sounds to alter is the human voice. It is not easily sculpted or synthesized. The best way to alter a voice is to use the human voice. You can alter its pitch and boost or cut frequency bands, but you can't alter its timbre or harmonic structure convincingly by artificial means. Most AI text to speech starts by recording various actors' performances, building a library of hundreds of thousands of recordings. So if you are looking for a plugin that could do that, there are specialist companies like Sonantic that produce products for game developers and the like, but their products are not for the consumer.

Beyond pitch correction, I can't think of any effect or plug-in that would be remotely useful for this.

Maybe someone else has come across something I haven't seen. I would be interested in looking at something that could do that myself but I doubt it would be cheap.

Ray.

 

jak.willis wrote on 2/11/2021, 6:30 PM

There are a couple of programs I briefly looked at called VoiceMod & Voxal Voice Changer, but it doesn’t look like they do what I want. They’re more cartoony/silly voice effects, which I don’t want.

CubeAce wrote on 2/12/2021, 1:54 AM

@jak.willis

Yes, those types of programs add a ring modulator module (think of a Moog synthesizer module), which rapidly fades the samples in and out following a waveform (sine, square or sawtooth), to a pitch changer / re-sampler. The pitch changer does not work that well either, as it literally slices the waveform at the sampling frequency and then repeats the slice, takes some of it away, speeds it up or slows it down, depending on whether you choose to resample or stretch the audio. Pitch changers / re-samplers that can work at higher sampling rates can do a better job if the recording is made at a higher sampling rate, but even then they are not ideal.

That is the limit of what can be done at current technology levels without the use of artificial intelligence and a huge bank of real-life samples to work with. Vocoders just add extra synthesis modules to get the 'Sparky the magic piano' sort of effect.
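
To make the ring modulator part concrete, here is a minimal Python sketch of the basic idea: every input sample is multiplied by a carrier waveform. It is only an illustration under stated assumptions, not how VoiceMod or Voxal actually work; the sample rate, the sine carrier and the function names are all made up for this example.

```python
import math

SAMPLE_RATE = 44100  # samples per second (assumed for this example)


def sine(freq_hz, n_samples, sample_rate=SAMPLE_RATE):
    """Generate a sine tone as a list of floats in the range -1..1."""
    return [math.sin(2.0 * math.pi * freq_hz * n / sample_rate)
            for n in range(n_samples)]


def ring_modulate(signal, carrier_hz, sample_rate=SAMPLE_RATE):
    """Multiply each input sample by a sine carrier at carrier_hz.

    The carrier swings between -1 and 1, so the input is faded in and
    out (and inverted) at the carrier frequency.
    """
    return [s * math.sin(2.0 * math.pi * carrier_hz * n / sample_rate)
            for n, s in enumerate(signal)]


if __name__ == "__main__":
    tone = sine(220.0, SAMPLE_RATE)        # one-second stand-in for a voice
    effected = ring_modulate(tone, 30.0)   # 30 Hz carrier
    print(len(effected), "samples, peak", max(abs(x) for x in effected))
```

Multiplying two sine waves produces tones at the sum and the difference of their frequencies in place of the originals, which is why the result sounds metallic or robotic instead of like a different person.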

Ray.

 

jak.willis wrote on 2/14/2021, 5:53 PM

What you need is a vocal processor VST that allows you to change the formants.

MeldaProduction MTransformer is an example of a plugin that has this capability. You can try the 14-day demo version to see if this fits your needs.

https://www.meldaproduction.com/MTransformer

Thank you, I’ll check it out. Have you ever used it yourself?

Former user wrote on 2/18/2021, 12:59 AM

Yes, I have this plugin and I have used it myself (albeit in a DAW) to change the timbre of a singer's voice in a music track.

CubeAce wrote on 3/23/2021, 5:02 PM

@jak.willis

Hi Jak.

If you are still interested, have a look at the Watson speech to text program.

Ray.

 

jak.willis wrote on 3/24/2021, 6:41 PM

Thanks Ray, I’ll have a look at it.

CubeAce wrote on 3/24/2021, 6:57 PM

@jak.willis

Hi Jak.

That was meant to read text to speech. 😆

It's supposed to be one of the better ones. To get more convincing gaps, you need to add commas and full stops where you want pauses. There should be a selection of voices to choose from, which could also be pitch shifted for additional variation. Some are more convincing than others. There is also Google's offering. I have found others, but some need commercial licensing for use on publicly viewable material. Wideo is another free program that creates an MP3 file to download. I found quite a few in the end, of varying quality. Most, though, are not that natural. Watson has been the best I have found so far, I think.

Also be aware that it does not always work well. 'I am having a row with Bill' will come out with 'row' sounding as it does in 'I saw the two people row a boat', so 'row' will be pronounced 'roe'. Other words also get mispronounced.

Ray.

Last changed by CubeAce on 3/24/2021, 7:03 PM, changed a total of 2 times.

 
