Text to Speech using Java ~ Vinay's Blog

Nov 7, 2009

Text to Speech using Java


Introduction

In this article we will see how to convert text input into speech using Java. This can be done in two ways:
JSAPI 1.0 (Java Speech Application Programming Interface)
FreeTTS (Free Text To Speech)

JSAPI 1.0

JSAPI 1.0 was developed by Sun Microsystems. It covers two core speech technologies: speech synthesis and speech recognition.

A speech synthesizer is a speech engine that converts text into speech.

A speech recognizer converts speech into text.

FreeTTS

FreeTTS is an open source package written entirely in the Java programming language. It can also be used to convert text into speech.

Necessity for the conversion of Text into Speech

Speech synthesis of text in a word processor is an aid to proofreading, because grammatical and stylistic problems are often easier to detect by ear. Note that if the synthesized speech is saved in an audio format, the file will be much larger than the original text file. TTS is also useful on mobile phones, where a received SMS can be listened to instead of read.

Methods to convert Text to Speech

Structure analysis: Structure analysis processes the input text to determine where paragraphs, sentences and other structures start and end.

Text pre-processing: This expands any abbreviations or other forms of shorthand.

E.g. "St. John PG" becomes "Saint John PG".

Text-to-phoneme conversion: This converts each word to phonemes. A phoneme is a basic unit of sound in a language.

Prosody analysis: It processes the sentence structure, words and phonemes to determine appropriate prosody for the sentence.

Waveform production: Finally, the phonemes and prosody information are used to produce the audio waveform for each sentence.
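The first two stages can be illustrated with a small, self-contained sketch. It is not how FreeTTS implements them: the class name TextFrontEndSketch, both method names and the two-entry abbreviation table are hypothetical, and the sentence splitter is a naive regular expression. Because that splitter would treat the full stop in "St." as the end of a sentence, the sketch expands abbreviations before it splits, which is the reverse of the order listed above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: the names and the abbreviation table are assumptions,
// not part of JSAPI or FreeTTS.
class TextFrontEndSketch {

    private static final Map<String, String> ABBREVIATIONS = new HashMap<String, String>();
    static {
        ABBREVIATIONS.put("St.", "Saint");
        ABBREVIATIONS.put("Dr.", "Doctor");
    }

    // Text pre-processing: replace each known abbreviation with its expansion.
    static String expandAbbreviations(String text) {
        StringBuilder out = new StringBuilder();
        for (String token : text.split(" ")) {
            if (out.length() > 0) {
                out.append(' ');
            }
            String expansion = ABBREVIATIONS.get(token);
            out.append(expansion != null ? expansion : token);
        }
        return out.toString();
    }

    // Structure analysis (sentences only): split after '.', '!' or '?'
    // when it is followed by whitespace.
    static List<String> splitSentences(String text) {
        List<String> sentences = new ArrayList<String>();
        for (String s : text.trim().split("(?<=[.!?])\\s+")) {
            if (s.length() > 0) {
                sentences.add(s);
            }
        }
        return sentences;
    }

    public static void main(String[] args) {
        String input = "St. John PG lives here. He is well.";
        for (String sentence : splitSentences(expandAbbreviations(input))) {
            System.out.println(sentence);
        }
    }
}
```

The table lookup is context-free, so it would also turn the "St." in "Main St." into "Saint". A real synthesizer needs context to choose between "Saint" and "Street".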

Requirements for Text to Speech:

The following jar files are required:

cmu_us_kal.jar
cmulex.jar
en_us.jar
freetts.jar
jsapi.jar

These files come from the FreeTTS 1.2.1 binary distribution. FreeTTS is an open source package. Download freetts-1.2.1-bin.zip from http://sourceforge.net/projects/freetts/

Unzip the FreeTTS binary package and check that the lib directory contains all of the above jar files except jsapi.jar. In its place you will find jsapi.exe.

Run jsapi.exe to produce jsapi.jar.

Copy all the jar files mentioned above to the working folder or

C:\Program Files\java\jdk1.6.0_03\jre\lib\ext
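A quick way to confirm that the jar files are visible to Java is to try loading one class from each of the two main jars. The helper below is only a sketch: the class name SpeechClasspathCheck and the method isOnClasspath are made up for illustration, while javax.speech.Central (from jsapi.jar) and com.sun.speech.freetts.VoiceManager (from freetts.jar) are the classes used by the demo programs in this article.

```java
// Illustrative helper; SpeechClasspathCheck and isOnClasspath are hypothetical names.
class SpeechClasspathCheck {

    // Returns true if the named class can be located, without initializing it.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, SpeechClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        } catch (LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] required = {
            "javax.speech.Central",                // expected in jsapi.jar
            "com.sun.speech.freetts.VoiceManager"  // expected in freetts.jar
        };
        for (String name : required) {
            System.out.println(name + (isOnClasspath(name) ? " found" : " MISSING"));
        }
    }
}
```

If javax.speech.Central is reported as missing, the compiler error "package javax.speech does not exist" is the likely result, and jsapi.jar should be checked first.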

Important classes in javax.speech package:

import javax.speech.*;
import javax.speech.synthesis.*;

Engine:

The Engine interface, in the javax.speech package, is the parent interface for all speech engines, including Recognizer and Synthesizer. "Speech engine" is the generic term for a system designed to deal with either speech input or speech output.

It is imported with: import javax.speech.Engine;

The basic steps for using a speech engine in an application are as follows.

Identify the application's functional requirements for an engine (e.g. language or dictation capability).

Locate and create an engine that meets those functional requirements.

Allocate the resources for the engine.

Set up the engine.

Begin operation of the engine by resuming it.

Use the engine.

Deallocate the resources of the engine.

Central

The Central class is the initial access point to all speech input and output capabilities. Central provides the ability to locate, select and create speech recognizers and speech synthesizers.

SynthesizerModeDesc

SynthesizerModeDesc extends EngineModeDesc with properties that are specific to speech synthesizers. A SynthesizerModeDesc inherits the engine name, mode name, locale and running properties from EngineModeDesc, and adds two properties:
The list of voices provided by the synthesizer
The voice to be loaded when the synthesizer is started

Synthesizer

The Synthesizer interface provides primary access to speech synthesis capabilities.

Voice

A Voice object describes one output voice of a speech synthesizer.

Voice objects can be used to select synthesis engines (through the SynthesizerModeDesc). The current speaking voice of a Synthesizer can be changed during operation with the setVoice method of the SynthesizerProperties object.

Create a simple program using JSAPI speech synthesis.

Step 1: Import the following packages.

 import javax.speech.*;
 import javax.speech.synthesis.*;

Step 2: Create a Synthesizer.

Synthesizer syn = Central.createSynthesizer(null);

Passing null creates a default Synthesizer for the default locale.

To select a particular locale and a particular voice instead, describe the required synthesizer with a SynthesizerModeDesc (this also needs import java.util.Locale):

 SynthesizerModeDesc desc = new SynthesizerModeDesc();
 desc.setLocale(new Locale("de", ""));
 desc.addVoice(new Voice(null, Voice.GENDER_FEMALE, Voice.AGE_DONT_CARE, null));
 Synthesizer synthesizer = Central.createSynthesizer(desc);

The above code requests a German-language synthesizer with a female voice of any age.

Step 3: The following code is to allocate and resume the synthesizer.

synthesizer.allocate();
synthesizer.resume();

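Step 4: Get the available voices.

 desc = (SynthesizerModeDesc) synthesizer.getEngineModeDesc();
 Voice[] voices = desc.getVoices();

The getVoices() method returns all the voices available from the selected Synthesizer.

Step 5: Set the voice.

 synthesizer.getSynthesizerProperties().setVoice(voice);

The setVoice() method makes the given Voice, chosen from the voices array, the current speaking voice.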

Step 6:

Speak the text.

synthesizer.speakPlainText(speaktext, null);
synthesizer.waitEngineState(Synthesizer.QUEUE_EMPTY);

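The speakPlainText(speaktext, null) method places the given text on the synthesizer's output queue, and waitEngineState(Synthesizer.QUEUE_EMPTY) blocks until everything in the queue has been spoken.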

Step 7:

Deallocate the Synthesizer.

 synthesizer.deallocate();

Demo Programs:

Demo 1:

Our first demo program uses jsapi.jar. We will call it Demojsapi.java

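 import javax.speech.*;
 import java.util.*;
 import javax.speech.synthesis.*;

 public class Demojsapi
  {
    public void doSpeak(String speakText, String voiceName)
     {
       try
        {
         // Request a general-purpose US English synthesizer.
         SynthesizerModeDesc desc = new SynthesizerModeDesc(null, "general", Locale.US, null, null);
         Synthesizer synthesizer = Central.createSynthesizer(desc);
         if (synthesizer == null)
          {
           // createSynthesizer returns null when no matching engine is available,
           // for example when speech.properties has not been found.
           System.out.println("No synthesizer found; check speech.properties in " + System.getProperty("user.home"));
           return;
          }
         synthesizer.allocate();
         synthesizer.resume();

         // Look up the requested voice among those the engine offers.
         desc = (SynthesizerModeDesc) synthesizer.getEngineModeDesc();
         Voice[] voices = desc.getVoices();
         Voice voice = null;
         for (int i = 0; i < voices.length; i++)
          {
           if (voices[i].getName().equals(voiceName))
            {
             voice = voices[i];
             break;
            }
          }
         if (voice == null)
          {
           System.out.println("Voice not found: " + voiceName);
           synthesizer.deallocate();
           return;
          }
         synthesizer.getSynthesizerProperties().setVoice(voice);

         // Queue the text, wait until it has been spoken, then release the engine.
         synthesizer.speakPlainText(speakText, null);
         synthesizer.waitEngineState(Synthesizer.QUEUE_EMPTY);
         synthesizer.deallocate();
        }
       catch (Exception e)
        {
         System.out.println("Speech synthesis failed: " + e);
        }
     }

    public static void main(String[] args)
     {
      if (args.length == 0)
       {
        System.out.println("Usage: java Demojsapi \"text to speak\"");
        return;
       }
      Demojsapi obj = new Demojsapi();
      // The FreeTTS voice is named kevin16 (lower case k, no l).
      obj.doSpeak(args[0], "kevin16");
     }
  }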

Demo 2:

Our second program uses freetts.jar directly, without going through JSAPI.

We will call it Demofreetts.java

 import com.sun.speech.freetts.Voice;
 import com.sun.speech.freetts.VoiceManager;

 public class Demofreetts
  {
   public void doSpeak(String speakText, String voiceName)
    {
     try
      {
       VoiceManager voiceManager = VoiceManager.getInstance();
       Voice voice = voiceManager.getVoice(voiceName);
       if (voice == null)
        {
         // Stop here instead of calling methods on a null voice.
         System.out.println("No Voice Available");
         return;
        }
       // Load the voice, speak the text, then release the voice's resources.
       voice.allocate();
       voice.speak(speakText);
       voice.deallocate();
      }
     catch (Exception e)
      {
       e.printStackTrace();
      }
    }

   public static void main(String[] args)
    {
     if (args.length == 0)
      {
       System.out.println("Usage: java Demofreetts \"text to speak\"");
       return;
      }
     Demofreetts obj = new Demofreetts();
     // The FreeTTS voice is named kevin16 (lower case k, no l).
     obj.doSpeak(args[0], "kevin16");
    }
  }

Running Procedure:

The running procedure is the same for both programs. Here we will see the running procedure for the first program.

Create a folder named texttospeech.

Copy all the jars (jsapi.jar, freetts.jar, cmu_time_awb.jar, cmu_us_kal.jar, etc.) to that folder or to C:\Program Files\java\jdk1.6.0_03\jre\lib\ext

Create the program named Demojsapi.java

If all the jar files are in your working folder, set the class path as follows. Do not put spaces around the = sign, and keep the leading . so that the working folder itself stays on the class path.

set classpath=.;cmu_us_kal.jar;en_us.jar;freetts.jar;cmulex.jar;jsapi.jar

If the jar files are not in your working folder, set the class path using the full path of each jar. For example, if they were copied to the ext folder mentioned above:

set classpath=.;C:\Program Files\java\jdk1.6.0_03\jre\lib\ext\cmu_us_kal.jar;C:\Program Files\java\jdk1.6.0_03\jre\lib\ext\en_us.jar;C:\Program Files\java\jdk1.6.0_03\jre\lib\ext\freetts.jar;C:\Program Files\java\jdk1.6.0_03\jre\lib\ext\cmulex.jar;C:\Program Files\java\jdk1.6.0_03\jre\lib\ext\jsapi.jar

Set the Java path

E.g. set path=C:\windows\system32;C:\jdk1.6\bin

Compile the program

javac Demojsapi.java

Run the program

java Demojsapi "Web Technology I/O Blog Welcomes You"

Note that the java command takes the class name, without the .java extension.

(If you get the error "missing speech.properties in user home: C:\Documents and Settings\User", copy speech.properties from the FreeTTS 1.2.1 folder into your user home. On Windows XP, the user home is C:\Documents and Settings\User.)

You will get the required result.

By following the above 8 steps, we can run the second program as well.
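If you are unsure whether speech.properties is in place, a small check such as the following can help. It is only a sketch: the class name SpeechPropertiesCheck and the method expectedLocation are hypothetical, and it looks only in the user home directory mentioned above.

```java
import java.io.File;

// Illustrative helper; SpeechPropertiesCheck and expectedLocation are hypothetical names.
class SpeechPropertiesCheck {

    // Builds the path where this article expects speech.properties to be.
    static File expectedLocation(String userHome) {
        return new File(userHome, "speech.properties");
    }

    public static void main(String[] args) {
        File file = expectedLocation(System.getProperty("user.home"));
        System.out.println(file.getAbsolutePath() + (file.isFile() ? " exists" : " is missing"));
    }
}
```

Run it with java SpeechPropertiesCheck from the same account that will run the demo programs, since the user home differs from one account to another.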

Summary

In this article we saw a simple program that converts text input into speech using Java. An effective speech application, however, is one that uses speech to enhance a user's performance of a task or to enable an activity that cannot be done without it. Designing an application with speech in mind from the outset is a key success factor.

For More Information

The following sources provide additional information on speech user interface design.

http://java.sun.com/products/java-media/speech/forDevelopers/jsapi-guide/index.html

Fraser, N.M. and G.N. Gilbert, "Simulating Speech Systems," Computer Speech and Language, Vol. 5, Academic Press Limited, 1991.

Raman, T.V. Auditory User Interfaces: Towards the Speaking Computer. Kluwer Academic Publishers, Boston, MA, 1997.

Roe, D.B. and N.M. Wilpon, editors. Voice Communication Between Humans and Machines. National Academy Press, Washington D.C., 1994.

Schmandt, C. Voice Communication with Computers: Conversational Systems. Van Nostrand Reinhold, New York, 1994.

9 comments:

fell0206 said...: Hi, I learn you comment, and do my self, but when I run it, I got the error message:
System property "mbrola.base" is undefined.
I search in google, but I still not correct it, can you tell me what happen? Thank you!; March 22, 2010 at 10:20 PM
ganesh said...: I like ur way of approach...I have used ur guide that how to execute speech recognition program.I was worked perfectly.My suggestion is ,u can include some visual demo also.; October 7, 2010 at 10:19 AM
madhulatha said...: hii
i have tried this procedure but i'm getting the error as
"Error: Could not find or load main class demojsapi.java"; September 12, 2011 at 5:51 AM
Priyanka said...: hello,
i tried to run your programe but getting error as package javax.speech does not exist... so what i need to do for that?...
if possible pls send me that package...asap
Thanks..; January 8, 2012 at 8:09 AM
Priyanka said...: Hello,
i tried to run your programe but getting error as package javax.speech does not exist...so what i need to do for that?
if possible pls send me that package..as son as possible...; January 8, 2012 at 8:13 AM
Anonymous said...: just change the name Kelvin16 to kevin16 (k is lower case and no l ); February 17, 2012 at 12:19 PM
Anonymous said...: just change voicename Kelvin16 to kevin16; February 17, 2012 at 12:25 PM
Anonymous said...: I am getting the error in compiling the first program Demojsapi.java. Error is at the line SysthesizerModeDesc desc = new SysthesizerModeDesc(null,"general","Locale",null,null)
plz help as soon as possible; October 7, 2012 at 5:41 PM
Anonymous said...: hai sir,i am tried to execute these, but it shows NullPointerException.i didn't know how could these solution to be solved.please give me suggestion for solveing these problem.; November 13, 2012 at 2:18 AM
