Saturday, 4 July 2015

Talk like Hawking: Teaching with a synthetic voice

Hiya! I have been pretty busy and also a bit disappointed in life lately (soo dramatic!), but blogging has always been good medicine, so I'm back! 

I have always found voice matters interesting, and I do pay a lot of attention to people's voices in general. They have the power of generating feelings, immediate gut feelings, in me. 

For many years, I was concerned about my own voice, as it would invariably become hoarse and then turn to a whisper two or three times a year. I was teaching over 55 periods a week and, as we all know, our teaching activity is 100% dependent on our voice (there's a limit to the emergency video sessions you can plan when your voice is "on the blink"!). So I was basically "killing my voice softly".

Things changed for the better when I started training in vocal technique, and I will always be grateful to Carlos and Liliana at Instituto de la Voz for having saved me from potential hiatus and nodule problems. And I have now got the huge responsibility of passing on these techniques and voice care tips to my teacher trainees, which I try to do on a regular basis. 

Now. This whole babble is related to today's blog post, since in spite of all my care, my voice was gone yesterday due to an irritating bout of flu or something (and the winter temperatures hitting our country like we haven't seen for a number of years!). With several classes a day to teach, all pronunciation-related, how do you keep going?

Well, the obvious answer is: "You don't. You go home, you see a doctor, you get some rest for a few days till you recover". And that would be the right answer. But it's not always possible. So here's what I did yesterday.


Teaching Phonetics Hawking-style

We are all familiar with Stephen Hawking. We all know what a miracle it is that he can share all his wisdom and knowledge with us through a computer. Therefore, I used him as an inspiration to try to teach my class without having to strain my voice.

These are the steps I followed:

1) I typed part of my lecture at home, on my tablet (it runs Android KitKat).

2) I downloaded an app called Talk and selected English (GB) as the output voice. (I was in a bit of a hurry, so this was the first app that worked for me, but I compare it with others below.)

3) I tried the app out, and noticed that to make it less "robotic", I had to "teach" the system to read the way I wanted it to read. That is, I had to input extra punctuation marks, such as "" and ; to get the machine to separate chunks or produce some basic intonation patterns that my audience could follow more easily. I also had to simplify my clauses and exploit thematic marking in the text to make sure my synthetic voice would signpost topics and paratones more clearly to my students.

4) I gave my tablet Internet access via my phone, using it as a router, since this app needs a working connection, and of course, I connected my tablet to my portable speakers.
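
The extra punctuation in step 3 was added by hand, but part of that job could be automated. Here is a minimal Python sketch of the idea, under clearly stated assumptions: the function name `overpunctuate`, the list of words and the choice of `;` as the pause mark are all hypothetical, picked only for illustration. How a given app actually reacts to the added punctuation still has to be checked by ear.

```python
import re

# Hypothetical list of words that should be preceded by a pause.
# This is an assumption for illustration; adjust it to your own text.
PAUSE_BEFORE = ("but", "because", "so", "which", "whereas")


def overpunctuate(text, words=PAUSE_BEFORE, mark=";"):
    """Insert `mark` before each listed word, unless the word is
    already preceded by a punctuation mark."""
    pattern = re.compile(
        # No punctuation or space right before the gap, then the word itself.
        r"(?<![,;:.!?\s])\s+(" + "|".join(map(re.escape, words)) + r")\b",
        flags=re.IGNORECASE,
    )
    # Keep the word as typed and put the pause mark in front of it.
    return pattern.sub(lambda m: mark + " " + m.group(1), text)


if __name__ == "__main__":
    print(overpunctuate("The pitch falls here but it rises there"))
```

Places that already carry punctuation are left alone, so text you have marked up by hand is not changed twice.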

The immediate result? Giggles, of course! It was fun. I got called "Marina Hawking". But it was, of course, not the same as being able to respond to students' queries in real time. I eventually had to swap my tablet for the whiteboard whenever I needed to answer a question on the spot.

I wanted to brag and I finished my lesson by typing in "That's all, folks!", but the awful rendering by the system (giving "folks" a separate IP and using a fall) spoiled all the fun! (Try it!)

However, I was teaching a theoretical class, so the app did the trick. I had a Lab lesson right afterwards, and it was really difficult to make it work this way. I was not in a position to deliver feedback on the spot since, obviously, my app would not read IPA symbols, and my typing speed is no match for the synchronous comments needed while a student is reading aloud or speaking. In this particular case, gestures and the whiteboard did the job. Sort of. To be honest, it was my teaching assistant who made the class possible!

(Had I had a data projector, I would have projected the text my students were reading, and I would have highlighted their slips and mistakes on the spot, to allow them to at least see what I was trying to comment on! I could have also projected some vids and audio files from some of the pronunciation apps I reviewed in the past.)

So all in all, a text-to-speech act may save the day, but you defo need to take care of yourself and try to get your voice back asap!

Free Text-to-Speech Apps

After yesterday's experience and while resting in bed, I decided to review a few other apps that may do the job nicely.

(I will review them together, as they are very similar and use the same voice database. The same text was read by both apps identically, and with the same voice. They only differ in their interface.)
  • These apps allow you to paste your text, or type it in directly, and play it out. (Easy Text to Speech does not require any text-pasting; it just reads from your clipboard.)
  • You are able to choose from a long list of languages and accents.
  • You can export or save your audio file.
  • You need an active data or Internet connection.
  • Tip! You need to "overpunctuate" your text to avoid a robotic output.

This application uses its own voice database. You choose an accent, and a speaker, and you download the voice database for that voice in particular. I have chosen "Amy, UK English". So far, it sounds like the most natural reading option, because the voice is somehow soothing. 

It has not, of course, reached the state of artificial intelligence, in that it does not chunk my text the way I would like it to, and it does not always employ the intonation I expect, but it does recognise a few chunks and it appears to be able to recognise patterns of declination and paratones. 

This application also allows you to set your preferred pitch and speech rate for the chosen voice.

From what I have been able to make out, it does not need an active Internet connection. The voice data files I have downloaded were 150 MB, so I guess it should all be "there".

Even though it is not a text-to-speech app but rather a translator, I cannot help feeling surprised at the changes that Google has introduced to their voices and to the whole reading-aloud experience. It still sounds pretty "robotic", but there are some chunks of language that the system automatically recognises and reads out with some fixed intonation patterns. This creates an overall "weird" effect, since you hear upsteps, downsteps and sudden tone changes that make the whole text sound incohesive at times, though less so than last year!

BTW! I have found a tip on how to get Google to read a phrase out for you. Type this address into your browser:

http://translate.google.com/translate_tts?tl=en&q=type your phrase here

And replace the bit that says "type your phrase here" with your own phrase, and voilà!
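
If you would rather build that address in a script than type it into a browser, the spaces and punctuation in your phrase have to be URL-encoded. Here is a small Python sketch using only the standard library. `tts_url` is a hypothetical helper name, and the address with its `tl` and `q` parameters is simply the one from the tip above. I cannot promise the service will keep answering at that address.

```python
from urllib.parse import urlencode

# Address taken from the tip above; whether it stays available is not guaranteed.
BASE_URL = "http://translate.google.com/translate_tts"


def tts_url(phrase, lang="en"):
    """Return the read-aloud address for `phrase`, with the phrase URL-encoded."""
    # urlencode escapes spaces and punctuation so the phrase survives in the URL.
    return BASE_URL + "?" + urlencode({"tl": lang, "q": phrase})


if __name__ == "__main__":
    print(tts_url("That's all, folks!"))
```

Paste the printed address into your browser and listen to the result.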

Much as I love Google, Microsoft has done a better job of making their read-out texts sound natural. The Bing Translator option offers, in my humble opinion, better chunking and a more natural selection of intonation patterns.

See for yourself! Compare the output versions of a short paragraph of my lesson by both Google Translate and Bing Translator. Look out for some interesting word stress choices as well!


Final remarks and a few extras...

  • A full selection of Android text-to-speech apps is available here.
  • For a list of other possible uses for text-to-speech for language learning, and ideas to aid students with special needs, you can check out these resources:

Hope this blog post has been of use! It has certainly inspired me to start considering writing a paper on synthetic-speech talk and intonation!