Problem Statement

Highlight the text you want to listen to, then press the Selected Text button on the Text Toolbar of Firefox, and it will be read aloud for you.

Reading from paper is generally easier than reading from a computer screen. CoolSpeaking can read any text on your computer for you.

Just highlight the text you want to hear, and then press the play button to hear it.

Reads any text on your web page out loud.
It is a text to speech program which will read aloud any text.

Typical Applications

Relax while your PC reads long documents
Listen to an ebook on your way to work
Text to speech software frees up the eyes and hands to perform other tasks simultaneously
Great for people with low vision, or reading disabilities
Reduce eye strain from too much reading
Listen to information while you work on something else

Text to Speech Features

Reads text using “espeak” text to speech technology
Change the voice characteristics: speed, pitch and volume
Reads selected text from the Firefox browser

Recommended Hardware

Sound Card
Headsets
Speakers

System Requirements

Linux (for example, Ubuntu).
“eSpeak” installed on the system.

Motivation

Information and communication technology is rapidly evolving as an effective tool for spreading information and making it available online to many communities. Industrial society is turning into an information society. The increased use of information technology is enabling people across the world to participate in the knowledge network; however, visually impaired people in developing countries like Mongolia are being deprived of the benefits of computer systems. One of the main reasons is the lack of a suitable human-computer interface and of software designed and developed to meet local needs. Designing and developing a computer interface for a person who cannot see what the computer displays is a challenging task for many software developers.

Developed countries like Japan have many public projects and commercial software companies addressing this issue. Many software companies in India are developing commercial software such as content management systems and financial software; however, given current market demand, they do not recognize the need for a text to speech (TTS) converter. There is a great need to develop a text to speech converter with a simple human-computer interface in the local language, to meet the needs of visually impaired people and to lay a foundation for related applications. A TTS conversion tool can effectively address the needs of visually impaired people in India. In addition, prolonged use of computer displays, TVs and video games is a common source of eye strain.

Objective

General Objective

To make PCs more user friendly by developing a text to speech synthesizer, and to meet the needs of visually impaired people in the Mongolian language.

Specific Objective:

To develop and implement a Mongolian text to speech synthesizer.
To provide a simple TTS interface for blind people.

Initial Proposal and Achievement

Initially I planned only an icon in the toolbar that would speak the text selected in the Firefox web browser.
I was able to implement it with the help of eSpeak, a speech synthesizer.
I had planned to copy the eSpeak source files directly to the required location,
but for lack of time I was not able to do so. To use the tool, eSpeak must therefore first be installed on the system; without it the tool is of no use.
I was also able to add some features beyond my initial proposal, such as the volume control buttons, speed buttons and speech control buttons.
In addition, one button shows how to access the clipboard,
and another shows how to create a temporary file containing the selected text.

The breakdown of the development is as follows:

Much of the implementation time went into searching the net.
About 50% of the implementation time was spent reading tutorials and learning new languages,
and 40% was spent searching the net.
The remaining 10% was used for coding the actual tool.

Module	Time
Reading the tutorials	30 Hrs
Copy-paste mechanism	15 Hrs
Invoking Java from JavaScript	20 Hrs
Invoking a shell script from JavaScript	10 Hrs

Project Methodology

The Development Methodology

The add-on as a whole comprises two subsystems: the interface and the text to speech conversion engine. The interface is the ordinary Mozilla Firefox window with some text selected. The conversion engine is “espeak”, which takes its input in text format. The general architecture of the add-on is shown below.

“eSpeak” (Speech Synthesizer)

eSpeak is a compact open source software speech synthesizer for English and other languages.

It can run as a command line program to speak text from a file or from stdin.

Features of eSpeak:

•Includes different Voices, whose characteristics can be altered.

•Can produce speech output as a WAV file.

•SSML (Speech Synthesis Markup Language) is supported (not complete), and also HTML.

•Compact size. The program and its data, including many languages, total about 1 Mbyte.

•Can translate text to phoneme codes, so it could be adapted as a front end for another speech synthesis engine.

•Potential for other languages. Several are included in varying stages of progress. Help from native speakers for these or other languages is welcomed.

•Development tools available for producing and tuning phoneme data.

•Written in C++.

Command Line options for eSpeak

espeak [options] ["text words"]

Text input can be taken either from a file, from a string in the command, or from stdin.

-f <text file>

Speaks a text file.

-a <integer>

Sets amplitude (volume) in a range of 0 to 200. The default is 100.

-p <integer>

Adjusts the pitch in a range of 0 to 99. The default is 50.

-s <integer>

Sets the speed in words per minute (approximate values for the default English voice; others may differ slightly). The range is 80 to 390 and the default value is 170. I generally use a faster speed of 190.

-h or --help

Prints the help text. The first line of output gives the eSpeak version number.

-q

Quiet. No sound is generated. This may be useful with the -x option.
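
The options above can be combined on one command line. As a minimal sketch, assuming a hypothetical helper named buildEspeakArgs (it is not part of eSpeak or of the add-on), the following JavaScript turns a settings object into an argument list and clamps each value to the range given above:

```javascript
// Hypothetical helper, for illustration only.
// Uses only the options documented above: -a, -p, -s, -q and -f.
function clamp(value, min, max) {
  return Math.min(max, Math.max(min, value));
}

function buildEspeakArgs(settings) {
  const args = [];
  if (settings.amplitude !== undefined) {
    args.push("-a", String(clamp(settings.amplitude, 0, 200)));
  }
  if (settings.pitch !== undefined) {
    args.push("-p", String(clamp(settings.pitch, 0, 99)));
  }
  if (settings.speed !== undefined) {
    args.push("-s", String(clamp(settings.speed, 80, 390)));
  }
  if (settings.quiet) {
    args.push("-q");
  }
  // Text comes either from a file (-f) or from a string on the command line.
  if (settings.file !== undefined) {
    args.push("-f", settings.file);
  } else if (settings.text !== undefined) {
    args.push(settings.text);
  }
  return args;
}
```

For example, buildEspeakArgs({ speed: 190, file: "/tmp/Speak_temp.txt" }) yields the arguments for espeak -s 190 -f /tmp/Speak_temp.txt.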

Installation

Linux and other Posix systems

There are two versions of the command line program. They both have the same command parameters (see above).

1. espeak uses the speech engine in the libespeak shared library. The libespeak library must first be installed.

2. speak is a stand-alone version which includes its own copy of the speech engine.

Place the espeak or speak executable file in the command path, e.g. in /usr/local/bin.

Place the "espeak-data" directory in /usr/share as /usr/share/espeak-data.

Dependencies

espeak uses the PortAudio sound library (version 18), so you will need to have the libportaudio0 library package installed. It may be already, since it's used by other software, such as OpenOffice.org and the Audacity sound editor.

The speak program may be compiled without using PortAudio, by removing the line

#define USE_PORTAUDIO

in the file speech.h.

Official website of eSpeak: http://espeak.sourceforge.net/

Mozilla Firefox Add-on

Technologies used to develop Firefox extensions

Firefox is largely built using four technologies:

1 XUL

2 CSS

3 JavaScript

4 XPCOM.


Extensions are also built using these four technologies.

XML:

Extensible Markup Language (XML) is a meta-language for expressing various kinds of data. It was specified in 1998 by W3C, the organization that sets standards for web-related technologies. It has a number of useful qualities: it is generic, extensible, and easy to validate as well-formed.

CSS: A style language to alter the display of XML documents

It is a style-description language defining the display of data marked up in XML and HTML. By separating the structure of the data, expressed through HTML or XML, from the display style, indicated by CSS, data can be reused better than when structural and stylistic markup are both embedded in HTML. There are three CSS specifications (Level 1 through Level 3), with progressively more powerful features. The Gecko rendering engine handles nearly all of CSS Level 2 and some of CSS Level 3.

JavaScript

JavaScript is a prototype-based object-oriented language that also permits independent class definitions. Unlike Java, it does not have strict typing, which makes it extremely flexible and gives it qualities that in some senses could be considered similar to Lisp.

[Figure: hierarchy]

XUL

XUL is an XML-based language, and was developed to be the GUI markup language for the Mozilla browser. Experiments in building user interfaces from a combination of HTML and scripting languages go back a long way, and XUL can be considered an evolutionary step from them.

For more on XUL: https://developer.mozilla.org/En/Firefox_addons_developer_guide/Introduction_to_XUL%E2%80%94How_to_build_a_more_intuitive_UI

Using XPCOM

XPCOM is a framework for developing platform-independent components. Components developed in line with that framework are referred to as XPCOM components, and sometimes the components are simply referred to as XPCOMs.

It is mainly used here for creating and executing files.

For more on XPCOM:

https://developer.mozilla.org/En/Firefox_addons_developer_guide/Using_XPCOM%E2%80%94Implementing_advanced_processes

Implementation

Contents of the package:

Chrome

“Chrome” is the word used to describe all the GUI structural elements that go into an XUL application.

Three kinds of packages make up chrome:

The content package

This package is used to contain the main XUL and JavaScript source files. Most extensions consist of a single content package.

The locale package

This package is used to contain language data that can be translated. To make an extension’s GUI support multiple languages, you can include multiple locale packages, one for each language.

The skin package

This is used to include source files used as visual elements in the GUI, including style sheets and images. Most extensions include only one skin package, but you can include multiple skin packages to allow the GUI to change with different themes.

Chrome URL

Use a file called “chrome.manifest” to register chrome packages with Firefox and start using them. To register a package, you use a special URI scheme called a “Chrome URL” to represent the path to the file. Chrome URLs are structured as:

chrome://<package name>/<package type>/<relative path to source file>
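
A Chrome URL has three parts: a package name, a package type (content, locale or skin) and a path relative to that package. The sketch below assembles one from those parts. The helper name chromeUrl and the package name "speak" are assumptions made for illustration; the real package name is whatever the add-on registers in chrome.manifest.

```javascript
// Hypothetical helper, for illustration only.
// The three package types are the ones described above.
const PACKAGE_TYPES = ["content", "locale", "skin"];

function chromeUrl(packageName, packageType, relativePath) {
  if (!PACKAGE_TYPES.includes(packageType)) {
    throw new Error("unknown package type: " + packageType);
  }
  return "chrome://" + packageName + "/" + packageType + "/" + relativePath;
}

// "speak" is an assumed package name.
const scriptUrl = chromeUrl("speak", "content", "speak.js");
// scriptUrl is "chrome://speak/content/speak.js"
```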


File name	Role
install.rdf	Called the install manifest, this gives basic information about the extension, and is required in order for the extension to be installed in Firefox.
chrome.manifest	This is the chrome manifest described in the earlier section. Registers packages and invokes cross-package overlays.
overlay.xul	XUL file that will be overlaid on the Firefox browser window, adding buttons, menu items, etc.
speak.xul speak.js	The XUL for the add-on's interface, and the JavaScript to control its operation.

[Figure: folders of the package]

Main components of the speak.js file

function Speak_speak()

Accessing files: I created an XPCOM component to handle file input and output. The basic idea is to create a temporary file in the /tmp folder and then have a shell script read it aloud.

Reference for creating a temporary file: https://developer.mozilla.org/En/Firefox_addons_developer_guide/Using_XPCOM%e2%80%94Implementing_advanced_processes

After creating the temporary file, I pass it to the shell script Speak_shell.sh, which runs the command

espeak -f /tmp/Speak_temp.txt

Speak_temp.txt is the file that was created by the XPCOM component.
The shell script will output the audio.
The shell script is invoked from JavaScript.

Reference to do this: https://developer.mozilla.org/en/Java_in_Firefox_Extensions
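
The flow described above (write the selected text to a temporary file, then hand that file to espeak -f) can be sketched outside Firefox. The code below is not the add-on's code: it uses Node.js modules (fs, os, path, child_process) as stand-ins for the XPCOM file component and for Speak_shell.sh, and all function names are hypothetical.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");
const { spawn } = require("child_process");

// Write the selected text to a temporary file and return the file's path.
function writeTempText(text, dir = os.tmpdir(), name = "Speak_temp.txt") {
  const file = path.join(dir, name);
  fs.writeFileSync(file, text, "utf8");
  return file;
}

// Describe the command the shell script runs: espeak -f <file>.
function speakCommand(file) {
  return { command: "espeak", args: ["-f", file] };
}

// Speak the given text. Requires espeak to be installed and on the PATH;
// the "error" handler reports the case where it is not.
function speakSelection(text) {
  const { command, args } = speakCommand(writeTempText(text));
  const child = spawn(command, args, { stdio: "ignore" });
  child.on("error", (err) => console.error("could not run espeak:", err.message));
  return child;
}
```

Keeping the file writing and the command description in separate functions means each step can be checked without producing any sound.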

function CopyToClipboard()

Before copying to the clipboard, the script must first be given permission:

netscape.security.PrivilegeManager.enablePrivilege('UniversalXPConnect');

References for this piece of code:

https://developer.mozilla.org/en/Using_the_Clipboard

For security: http://www.mozilla.org/editor/midasdemo/securityprefs.html

http://ntt.cc/2008/01/19/copy-paste-javascript-codes-ie-firefox-opera.html#more-33

function CreateFile()

An XPCOM component is used to create the file. Reference for creating a temporary file: https://developer.mozilla.org/En/Firefox_addons_developer_guide/Using_XPCOM%e2%80%94Implementing_advanced_processes

function KeepQuiet()

This silences the voice. It simply calls a shell script from JavaScript.

It runs the Speak_quiet.sh file, which contains the command

espeak -q

function Speakpitchinc() & Speakpitchdec()

To increase and decrease the pitch.

They run the Speak_pitch.sh file, which contains the command

espeak -p <integer>

function SpeakIncrease() & SpeakDecrease()

To increase and decrease the volume.

They run the Speak_Incr.sh file, passing it an integer parameter that is incremented or decremented on every button click. The script contains the command

espeak -a <integer>
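
The increment and decrement behaviour described above can be sketched as a small stepper that keeps the value inside eSpeak's documented range. The name makeStepper and the step size of 10 are assumptions; the source does not say how much each click changes the value.

```javascript
// Hypothetical stepper, for illustration only: holds a value inside
// [min, max] while buttons raise or lower it by a fixed step.
function makeStepper(initial, min, max, step) {
  let value = initial;
  const set = (next) => {
    value = Math.min(max, Math.max(min, next));
    return value;
  };
  return {
    increase: () => set(value + step),
    decrease: () => set(value - step),
    current: () => value,
  };
}

// Volume: the -a option accepts 0 to 200 and defaults to 100.
const volume = makeStepper(100, 0, 200, 10);
volume.increase(); // one click on the "increase" button
const volumeArgs = ["-a", String(volume.current())]; // ["-a", "110"]
```

The same stepper would serve the pitch (-p, 0 to 99) and speed (-s, 80 to 390) buttons with different limits.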

function SpeakSpeedinc() & SpeakSpeeddec()

To increase and decrease the speed.