Speech Dispatcher 0.9
=====================

Announcing the availability of Speech Dispatcher 0.9 developed as a part of
the Free(b)Soft project.

* What is new in 0.9?

 - Add configuration file for espeak-ng + mbrola
 - Add support for Baratinoo (VoxyGen), Kali, and Mary-TTS
 - Better manage volume for generic-based modules.
 - Auto-detect module availability.
 - Make generic module provide voice list.
 - Add systemd unit file.
* Where to get it?
You can get the distribution tarball of the released version from
https://github.com/brailcom/speechd/archive/0.9.tar.gz
We recommend the use of sound icons with Speech Dispatcher.
They are available at
http://www.freebsoft.org/pub/projects/sound-icons/sound-icons-0.1.tar.gz
Corresponding distribution packages should soon be available at
your distribution mirrors.
The home page of the project is https://github.com/brailcom/speechd
* What is Speech Dispatcher?
@@ -67,9 +58,27 @@ project.
* How to report bugs?
Please report bugs at https://github.com/brailcom/speechd/issues .
For other contact please use either the above link or our mailing list
<speechd@lists.freebsoft.org> .
Happy synthesizing!
Copyright (C) 2004-2013 Brailcom, o.p.s
Copyright (C) 2010 William Hubbs <w.d.hubbs@gmail.com>
Copyright (C) 2014-2017 Luke Yelavich <themuso@ubuntu.com>
Copyright (C) 2018 Samuel Thibault <samuel.thibault@ens-lyon.org>
This program is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free Software
Foundation; either version 2 of the License, or (at your option) any later
version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE. See the GNU General Public License for more details (file
COPYING in the root directory).
You should have received a copy of the GNU General Public License
along with this program. If not, see <https://www.gnu.org/licenses/>.
AUTHORS
=======
Samuel Thibault <samuel point thibault at ens dash lyon point org>
Luke Yelavich <themuso at themuso point com>
Jan Buchal <buchal at brailcom point org>
Tomas Cerha <cerha at brailcom point org>
@@ -12,8 +15,6 @@ Milan Zamazal <pdm at brailcom point org>
Christopher Brannon <cmbrannon79 at gmail point com>
Andrei Kholodnyi <andrei point kholodnyi at gmail point com>
THANKS
@@ -44,10 +45,32 @@ work on the internationalization of Speech Dispatcher.
Thanks to Trevor Saunders for various improvements and bugfixes.
Thanks to Samuel Thibault for various bugfixes.
Thanks to Halim Sahin for fixes in the audio subsystem.
Thanks to William Jon McCann for XDG Base Dir patches.
Thanks to Didier Spaier for cleaning up lists with generic modules.
Thanks to Colomban Wendling for the Baratinoo support.
Thanks to Raphaël Poitevin for the Kali support.
...and to many others who have contributed.
Copyright (C) 2001-2012 Brailcom, o.p.s
Copyright (C) 2014 Luke Yelavich <themuso@ubuntu.com>
Copyright (C) 2018 Samuel Thibault <samuel.thibault@ens-lyon.org>
This program is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free Software
Foundation; either version 2 of the License, or (at your option) any later
version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE. See the GNU General Public License for more details (file
COPYING in the root directory).
You should have received a copy of the GNU General Public License
along with this program. If not, see <https://www.gnu.org/licenses/>.
@@ -4,3 +4,19 @@ A list of known bugs:
as there is no support for that in those modules.
* Espeak generic module doesn't support punctuation.
Copyright (C) 2003-2008 Brailcom, o.p.s
This program is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free Software
Foundation; either version 2 of the License, or (at your option) any later
version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE. See the GNU General Public License for more details (file
COPYING in the root directory).
You should have received a copy of the GNU General Public License
along with this program. If not, see <https://www.gnu.org/licenses/>.
Messages documenting the modifications during development are now in Git. This file
is here only as a placeholder for Autotools.
@@ -54,3 +54,19 @@ A: Are you sure you have installed festival-freebsoft-utils 0.3 or higher as
A: Are you sure you are not running into the famous 'server_access_list'
problem? Please see the file INSTALL.
Copyright (C) 2001-2008 Brailcom, o.p.s
This program is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free Software
Foundation; either version 2 of the License, or (at your option) any later
version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE. See the GNU General Public License for more details (file
COPYING in the root directory).
You should have received a copy of the GNU General Public License
along with this program. If not, see <https://www.gnu.org/licenses/>.
@@ -4,7 +4,8 @@ The requirements:
You will need these components to compile Speech Dispatcher:
- glib 2.0 (http://www.gtk.org)
- libdotconf 1.3 (http://github.com/williamh/dotconf)
- libsndfile 1.0.2 (http://www.mega-nerd.com/libsndfile/)
- pthreads
- gcc and GNU make (http://www.gnu.org)
- festival-freebsoft-utils 0.3+ (optional)
@@ -13,13 +14,13 @@ You will need these components to compile Speech Dispatcher:
(http://www.python.org)
Only if you are building from Git, you also need these:
- automake 1.11+, autoconf, pkg-config, libtool, texinfo and gettext
We also recommend installing these packages:
- PulseAudio (http://www.pulseaudio.org)
- Festival (http://www.cstr.ed.ac.uk/projects/festival/)
- Espeak (http://espeak.sourceforge.net/)
- Gettext 0.19.8+ (https://www.gnu.org/software/gettext/) for internationalization support.
These packages are known to work with Speech Dispatcher:
Software synthesizers:
@@ -31,13 +32,11 @@ These packages are known to work with Speech Dispatcher:
synthesizer with very good support for various Speech Dispatcher
features.
- Flite, Cicero, IBM TTS, Dectalk Software, Ivona, SVOX Pico
- Pulse Audio (http://www.pulseaudio.org/)
User applications:
- Orca (http://www.gnome.org/projects/orca/) -- AT-SPI screen reader (Gnome)
- speechd-el (http://www.freebsoft.org/) -- Emacs speech interface
- Linux Screen Reader (LSR) (http://live.gnome.org/LSR)
- gnome-speech/Gnopernicus (http://www.gnome.org/)
- BrlTTY (http://mielke.cc/brltty/)
- ...
@@ -60,15 +59,16 @@ speech-dispatcher-<version> directory and run "make all" command as follows:
$ cd speech-dispatcher-[version]
Notes about Pulse, LIBAO, ALSA, OSS and NAS support
===================================================
Speech Dispatcher's default audio output system is the first one discovered
during configuration, starting with PulseAudio and followed by libao, ALSA,
OSS and NAS. It is also possible to set the default audio output system
explicitly while building Speech Dispatcher; see the configure options.
This value can be overridden at run time in speechd.conf.
If several audio output systems, separated by commas, are defined as the
default (e.g. "pulse,alsa"), an automatic fallback from one to the next
will be done.
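For example, the chosen method and its fallback order can be set in
speechd.conf; a minimal sketch, assuming the stock AudioOutputMethod option:

    # Try PulseAudio first and fall back to ALSA when it is not available
    AudioOutputMethod "pulse,alsa"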
Notes about Espeak support
==========================
@@ -136,7 +136,16 @@ It includes three manuals:
- SVOX_Pico_Manual.pdf
- SVOX_Pico_architecture_and_design.pdf
To build SVOX Pico go to svox/pico subdirectory and type
To build SVOX Pico, fetch the debian-sid branch with
git checkout debian-sid
then apply Debian patches with
QUILT_PATCHES=debian/patches quilt push -a
then go to the pico subdirectory and type
chmod +x autogen.sh
./autogen.sh
./configure
make install
@@ -259,3 +268,23 @@ Now the configure file should be created and you can proceed as with
an ordinary installation.
Copyright (C) 2001-2012 Brailcom, o.p.s
Copyright (C) 2010 Rui Batista <ruiandrebatista@gmail.com>
Copyright (C) 2010 William Hubbs <w.d.hubbs@gmail.com>
Copyright (C) 2010-2016 Andrei Kholodnyi <andrei.kholodnyi@gmail.com>
Copyright (C) 2015 Luke Yelavich <themuso@themuso.com>
Copyright (C) 2018 Samuel Thibault <samuel.thibault@ens-lyon.org>
This program is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free Software
Foundation; either version 2 of the License, or (at your option) any later
version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE. See the GNU General Public License for more details (file
COPYING in the root directory).
You should have received a copy of the GNU General Public License
along with this program. If not, see <https://www.gnu.org/licenses/>.
#
# Copyright (C) 2002 - 2018 Brailcom, o.p.s.
#
# This is free software; you can redistribute it and/or modify it
# under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2, or (at your option)
# any later version.
#
# This software is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
#
## Process this file with automake to produce Makefile.in
pkgconfigdir = $(libdir)/pkgconfig
pkgconfig_DATA = speech-dispatcher.pc
BUILT_SOURCES = $(top_srcdir)/.version
SUBDIRS= include locale src config doc po
EXTRA_DIST= config.rpath COPYING.LGPL ANNOUNCE BUGS FAQ README.md README.packagers README.translators README.overview.md README.style.md build.sh git-version-gen split-version.sh speech-dispatcherd.service.in po/Makevars.in po/README import-nvda.sh
CLEANFILES=
MAINTAINERCLEANFILES=configure
AM_DISTCHECK_CONFIGURE_FLAGS=--without-systemdsystemunitdir
nodist_systemdsystemunit_DATA = \
	speech-dispatcherd.service

CLEANFILES += \
	speech-dispatcherd.service

%.service: %.service.in
	$(AM_V_GEN)sed -e 's,@bindir\@,$(bindir),g' $< > $@

testinstall: install check
	cd src/tests && $(MAKE) $(AM_MAKEFLAGS) testinstall

$(top_srcdir)/.version:
	echo $(VERSION) > $@-t && mv $@-t $@

dist-hook:
	echo $(VERSION) > $(distdir)/.tarball-version
	echo $(VERSION) > $(distdir)/.version
	rm $(distdir)/po/speech-dispatcher.pot
-include $(top_srcdir)/git.mk
GITIGNOREFILES = $(GITIGNORE_MAINTAINERCLEANFILES_TOPLEVEL) py-compile
ACLOCAL_AMFLAGS = -I m4
Version 0.9
* Add modules for non-free Baratinoo (VoxyGen) and Kali speech syntheses.
* Add configuration file for the Mary-TTS system.
* Add configuration file for espeak-ng + mbrola.
* Set the pulse client name when using the generic module with paplay.
* espeak-*-mbrola-generic: Update voice list.
* Auto-detect module availability.
* Make generic module provide voice list.
* Add systemd service file.
Version 0.8.8
* Add German translation, thanks to Chris Leick for the patch
* Fix some spelling mistakes, thanks to Paul Gevers for the patch
* Some slight code improvements
Version 0.8.7
* Further fixes to spd-conf, which should now work properly.
* Split the espeak-ng driver code into its own source file.
* Add a work-around to the espeak-ng driver to account for spaces in voice
names which recently appeared in espeak-ng git master. This will properly
be fixed in 0.9.
* Voice names are not forced to lower case, due to espeak-ng git master now
having multi-case voice names.
* Fix stripped audio output from the flite module, thanks to Samuel Thibault.
* Further code and build improvements.
Version 0.8.6
* Various internal code improvements.
* Fix more compiler warnings.
* Python bug fixes with thanks to Sebastian Humenda.
Version 0.8.5
* More unused code removal.
* Fix more compiler warnings.
* Use GLib main loop for the main server thread.
* Implement a shutdown timer in the server, which activates after 5 seconds
with no clients connected.
* Add support for espeak-ng.
* Configuration documentation for the ibmtts module.
* Removal of unused configuration options from the ibmtts module.
* Add command-line argument to allow for custom modules location
Version 0.8.4
* Updated documentation for required dependencies and where to find them.
* Removed unused code.
* Fixed compiler and GLib warnings.
* Cleanup header definitions and inclusions.
* Enabled silent rules by default
* Fix language identification references.
Version 0.8.3
* Add API methods to get language, rate, pitch, and volume.
* A lot of code cleanup, and compatibility improvements.
* Removed all references to GNOME Speech, since it has long since been
deprecated.
* Fix some inconsistency in the SSIP API for voice type.
* The SET VOICE SSIP command is now deprecated, and will be removed in 0.9.
* The C library API now provides macro definitions for major, minor, and micro
versions in libspeechd_versions.h.
* The libsndfile library is now a mandatory dependency to improve the user
experience around sound icons.
* Fix a possible crash in the festival driver.
* Add a configuration option to the espeak driver to show voice variants in the
voice list. This will remain until a proper variants retrieval API is added
for compatible synthesizers.
Version 0.8.2
* Add convenience methods to the libspeech API to free module list and voice
data structures.
* Add method to the libspeechd API to get the current output module, and
update the documentation accordingly.
* The API is now licensed under the GNU Lesser General Public License v2.1
or later.
* The spdconf configuration utility is now translatable.
* Fixed a bug where speech-dispatcher would fail to start if the user
configuration directory existed but did not contain a config file.
* Install the spdconf desktop file.
Version 0.8.1
* User dictionaries support added to the IBMTTS driver
* Added a pico configuration file for use with the generic driver
* Better support for multi-arch enabled distros to facilitate the use of the
i386 only IBMTTS driver being easily installable on an amd64 system
* Bug fixes, and documentation cleanup
Version 0.8
* Python 3 compatibility of the Python bindings
@@ -230,3 +314,22 @@ Version 0.0.1
* pre API designed
* simple server core and client example written
* CVS repository and other project stuff set up
Copyright (C) 2002-2012 Brailcom, o.p.s
Copyright (C) 2018 Samuel Thibault <samuel.thibault@ens-lyon.org>
Copyright (C) 2018 Florian Steinhardt <no.known.email@example.com>
This program is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free Software
Foundation; either version 2 of the License, or (at your option) any later
version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE. See the GNU General Public License for more details (file
COPYING in the root directory).
You should have received a copy of the GNU General Public License
along with this program. If not, see <https://www.gnu.org/licenses/>.
speech-dispatcher
=================
*Common interface to speech synthesis*
Introduction
------------
This is the Speech Dispatcher project (speech-dispatcher). It is a part of the
[Free(b)soft project](https://devel.freebsoft.org/), which is intended to allow
blind and visually impaired people to work with computers and the Internet
based on free software.

The Speech Dispatcher project provides a high-level *device independent* layer
for access to speech synthesis through a simple, stable and well documented
interface.
Documentation
-------------
Complete documentation may be found in the doc directory. Read
[doc/README](doc/README) for more information. This [documentation is also
available online](https://devel.freebsoft.org/doc/speechd/speech-dispatcher.html).
The [SSIP communication protocol is also
documented](https://devel.freebsoft.org/doc/speechd/ssip.html).
The key features, the supported TTS engines, output subsystems, client
interfaces and client applications known to work with Speech Dispatcher are
listed in the [overview of speech-dispatcher](README.overview.md), together
with voice settings and where to look in case of a sound or speech issue.
Mailing-lists
-------------
There is a public mailing-list speechd@lists.freebsoft.org for this project.
This list is for Speech Dispatcher developers, as well as for users. If you
want to contribute to the development, propose a new feature, get help or just be
informed about the latest news, don't hesitate to subscribe. The communication
on this list is held in English.
Development
-----------
Various versions of speech-dispatcher can be downloaded from the [project
archive](https://github.com/brailcom/speechd/releases).
Bug reports, issues, and patches can be submitted to [the github
tracker](https://github.com/brailcom/speechd/issues).
The source code is freely available. It is managed using Git. You can use
the [GitHub web interface](https://github.com/brailcom/speechd) or clone the
repository from:
https://github.com/brailcom/speechd.git
A Java library is currently developed separately. You can use the [GitHub web
interface](https://github.com/brailcom/speechd-java) or clone the repository
from:
https://github.com/brailcom/speechd-java.git
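For example, to get local working copies of both repositories:

    git clone https://github.com/brailcom/speechd.git
    git clone https://github.com/brailcom/speechd-java.git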
To build and install speech-dispatcher and all of its components, read the
file [INSTALL](INSTALL).
People
------
Speech Dispatcher is being developed in close cooperation between the Brailcom
company and external developers; both are equally important parts of the
development team. The development team also accepts and processes contributions
from other developers, for which we are always very thankful! See more details
about our development model in Cooperation. Below is a list of current inner
development team members and people who have contributed to Speech Dispatcher in
the past:
Development team:
* Samuel Thibault
* Jan Buchal
* Tomas Cerha
* Hynek Hanke
* Milan Zamazal
* Luke Yelavich
* C.M. Brannon
* William Hubbs
* Andrei Kholodnyi
Contributors: Trevor Saunders, Lukas Loehrer, Gary Cramblitt, Olivier Bert, Jacob
Schmude, Steve Holmes, Gilles Casse, Rui Batista, Marco Skambraks ...and many
others.
License
-------
Copyright (C) 2001-2009 Brailcom, o.p.s
Copyright (C) 2018 Samuel Thibault <samuel.thibault@ens-lyon.org>
Copyright (C) 2018 Didier Spaier <didier@slint.fr>
This program is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free Software
Foundation; either version 2 of the License, or (at your option) any later
version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE. See the GNU General Public License for more details (file
COPYING in the root directory).
You should have received a copy of the GNU General Public License
along with this program. If not, see <https://www.gnu.org/licenses/>.
Note:
- The speech-dispatcher server (src/server/ + src/common/) contains
GPLv2-or-later and LGPLv2.1-or-later source code, but is linked against
libdotconf, which is LGPLv2.1-only at the time of writing.
- The speech-dispatcher modules (src/modules/ + src/common/ + src/audio/)
contain GPLv2-or-later, LGPLv2.1-or-later, and LGPLv2-or-later source code,
but are also linked against libdotconf, which is LGPLv2.1-only at the time of
writing.
- The spd-conf tool (src/api/python/speechd_config/) and spd-say tool
(src/clients/say) are GPLv2-or-later.
- The spdsend tool (src/clients/spdsend/) contains both GPLv2-or-later and
GPLv2-only source code.
- The C API library (src/api/c/) is LGPLv2.1-or-later.
- The Common Lisp API library (src/api/cl/) is LGPLv2.1-or-later.
- The Guile API library (src/api/guile/) contains GPLv2-or-later,
LGPLv2.1-or-later, and LGPLv2.1-only source code.
- The Python API library (src/api/python/speechd/) is LGPLv2.1-or-later.
- src/tests/spd_cancel_long_message.c and
src/tests/spd_set_notifications_all.c are GPLv2-only.
- other tests in src/tests/ are GPLv2-or-later.
Overview of Speech Dispatcher
=============================
Key features:
-------------
* Common interface to different Text To Speech (TTS) engines
* Handling concurrent synthesis requests — requests may come asynchronously
from multiple sources within an application and/or from different applications
* Subsequent serialization, resolution of conflicts and priorities of incoming
requests
* Context switching — state is maintained for each client connection
independently, even for connections from within one application
* High-level client interfaces for popular programming languages
* Common sound output handling — audio playback is handled by Speech
Dispatcher rather than the TTS engine, since most engines have limited sound
output capabilities
What a very high level GUI library is to graphics, Speech Dispatcher is to
speech synthesis. The application neither needs to talk to the devices directly
nor to handle concurrent access, sound output and other tricky aspects of the
speech subsystem.
Supported TTS engines:
----------------------
* Festival
* Flite
* Espeak
* Cicero
* IBM TTS
* Espeak+MBROLA (through a generic driver)
* Epos (through a generic driver)
* DecTalk software (through a generic driver)
* Cepstral Swift (through a generic driver)
* Ivona
* Pico (possibly through a generic driver)
* Espeak NG
* Kali TTS
* Baratinoo (Voxygen)
* Mary-TTS
Supported sound output subsystems:
----------------------------------
* OSS
* ALSA
* PulseAudio
* NAS
* Libao
The architecture is based on a client/server model. The clients are all the
applications in the system that want to produce speech (typically assisting
technologies). The basic means of client communication with Speech Dispatcher
is through a Unix socket or Inet TCP connection using the Speech Synthesis
Interface Protocol (See the SSIP documentation for more information). High-level
client libraries for many popular programming languages implement this protocol
to make its usage as simple as possible.
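As a small illustration, a client using the C API (libspeechd) can be as short
as the following sketch; the client name, the include path and the build
command shown in the comment are assumptions that may differ on your system:

    /* Minimal Speech Dispatcher client sketch using the C API (libspeechd).
     * Possible build command (flags may vary):
     *   gcc hello-spd.c $(pkg-config --cflags --libs speech-dispatcher)
     */
    #include <stdio.h>
    #include <speech-dispatcher/libspeechd.h> /* on some installs: <libspeechd.h> */

    int main(void)
    {
            /* client name, connection name, user name, connection mode */
            SPDConnection *conn = spd_open("hello-client", "main", NULL, SPD_MODE_SINGLE);
            if (conn == NULL) {
                    fprintf(stderr, "Could not connect to Speech Dispatcher\n");
                    return 1;
            }
            /* Queue a message with the ordinary text priority. */
            spd_say(conn, SPD_TEXT, "Hello from Speech Dispatcher");
            spd_close(conn);
            return 0;
    }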
Supported client interfaces:
----------------------------
* C/C++ API
* Python 3 API
* Java API
* Emacs Lisp API
* Common Lisp API
* Guile API
* Simple command line client
Existing assistive technologies known to work with Speech Dispatcher:
* speechd-el (see https://devel.freebsoft.org/speechd-el)
* Orca (see http://live.gnome.org/Orca/SpeechDispatcher)
* Yasr (see http://yasr.sourceforge.net/)
* BrlTTY (see http://brltty.com)
* Chromevox (extension of the Chrome and Chromium browsers)
Voices settings
---------------
The available voices depend on the TTS engines and voices installed.
The voice to use can be set in speech-dispatcher itself, at the system and user
level, and from the client application, like Orca, speechd-el or Chromevox.
The settings in each application and in speech-dispatcher are independent of
each other.
The settings in speech-dispatcher at the user level override those
made at the system level.
In speech-dispatcher, the system settings are recorded in the file
/etc/speech-dispatcher/speechd.conf; they include, among others, a default
synthesizer, a voice type or symbolic name (e.g. MALE1) and a default language.
In turn, each installed voice is associated with a voice type and a language,
so with these defaults a voice matching these characteristics (voice type,
language, synthesizer) will be chosen if one is available.
The default values of these voice parameters can also be set at the system
level and customized at the user level: rate, pitch, pitch range and volume.
It is also possible to make the synthesizer depend on the language used.
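As an illustration, such system-wide defaults might look as follows in
/etc/speech-dispatcher/speechd.conf; the option names used here (DefaultModule,
DefaultVoiceType, DefaultLanguage, LanguageDefaultModule) are assumptions to be
checked against the comments shipped in that file:

    DefaultModule espeak-ng
    DefaultVoiceType "FEMALE1"
    DefaultLanguage "en"
    # Use a different synthesizer when a client asks for French
    LanguageDefaultModule "fr" "festival"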
The user settings are written in the file ~/.config/speech-dispatcher/spd.conf
using the application spd-conf, which can also modify the system settings.
spd-conf can set the synthesizer, the language and other voice parameters,
but it cannot directly select a specific voice.
Instead, a specific voice can be chosen from the client application by
selecting it by name in a proposed list that depends on the chosen synthesizer.
The voice name can be a first name like 'bob' or 'viginie', a locale code in the
form language_COUNTRY or a language code followed by a number, for instance.
The language code associated with each name is listed alongside it between
parentheses, like (fr) for French.
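From the command line, spd-say can be used to explore what is installed; the
option letters below follow its --help output and are given as an illustration,
so double-check them against your version:

    $ spd-say -O                           # list the available output modules (synthesizers)
    $ spd-say -o espeak-ng -L              # list the synthesis voices of one module (example name)
    $ spd-say -l fr -t female1 "Bonjour"   # speak with a given language and voice type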
Where to look in case of a sound or speech issue
------------------------------------------------
Speech Dispatcher links together all the components that contribute to speaking
a text, so if you get no speech at all, or something is not spoken, or not
spoken the way you expect, the problem can come from Speech Dispatcher itself
or from any of these components (or their absence) and their settings:
- the audio subsystem in use, e.g. ALSA or PulseAudio,
- the synthesizer in use, e.g. espeak-ng or pico,
- the client application, like Orca or speechd-el, or an underlying software
layer like AT-SPI,
- the application that provides the text to be spoken, like Firefox.
How to investigate a specific issue goes far beyond this document, but bear in
mind that all the listed components can be involved, as well as the audio
equipment in use and the way it is connected to the computer.
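A simple first check (a sketch, not a full procedure) is to bypass the client
application and address Speech Dispatcher directly:

    # If this is spoken, the server, the synthesizer and the audio subsystem work,
    # and the problem more likely lies in the client application or above it.
    $ spd-say "test"

    # Trying another output module helps rule the synthesizer in or out
    # (the module name is only an example).
    $ spd-say -o espeak-ng "test"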
Copyright (C) 2001-2009 Brailcom, o.p.s
Copyright (C) 2018 Samuel Thibault <samuel.thibault@ens-lyon.org>
Copyright (C) 2018 Didier Spaier <didier@slint.fr>
Copyright (C) 2018 Alex ARNAUD <alexarnaud@hypra.fr>
This program is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free Software
Foundation; either version 2 of the License, or (at your option) any later
version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE. See the GNU General Public License for more details (file
COPYING in the root directory).
You should have received a copy of the GNU General Public License
along with this program. If not, see <https://www.gnu.org/licenses/>.
@@ -7,4 +7,21 @@ bindings, such as the Orca screen reader, to be compatible with this
release, a version must be used which runs on Python 3.
Please make sure to set up the package dependencies correctly
to avoid possible problems.
Copyright (C) 2006 Gary Cramblitt <garycramblitt@comcast.net>
Copyright (C) 2006-2012 Brailcom, o.p.s
This program is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free Software
Foundation; either version 2 of the License, or (at your option) any later
version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE. See the GNU General Public License for more details (file
COPYING in the root directory).
You should have received a copy of the GNU General Public License
along with this program. If not, see <https://www.gnu.org/licenses/>.
Speech Dispatcher Coding Guidelines
===================================
This document describes the coding style used in this repository. All
patches or changes must follow this style. If you have any questions,
please contact the speech dispatcher mailing list.
Coding Style Guidelines for C Code
----------------------------------
The indenting style we use is the same as the linux kernel, with the following exceptions and extensions:
* Goto statements should not be used unless in very special cases.
* Function names should be
  * lowercase words separated by underscores.
  * functions which are only implementation details of the given source file
    should be declared as static
* Variable names
  * global variables should follow the same conventions as functions (e.g.
    `output_modules`)
  * the verbosity of the name of local variables should be appropriate to its
    scope
* Macro names
  * Macro names should be in uppercase, words separated by underscores
    (e.g. `SPEECHD_OPTION_CB_STR`)
* Type names
  * New types are defined in mixed uppercase (e.g. MessageType)
If you use GNU indent 2.2.10 or later, you should run it as follows:

    indent -npro -kr -i8 -ts8 -sob -l80 -ss -ncs -cp1 -il0

For versions of indent earlier than 2.2.10, drop the -il0 from the parameters.
In the Emacs environment the following can be used (untested):

    (defun speechd-c-mode ()
      "C mode with adjusted defaults for use with Speech Dispatcher."
      (interactive)
      (c-mode)
      (c-set-style "K&R"))
Coding Style Guideline for other code
-------------------------------------
Please respect the coding style of the given component.
Copyright (C) 2001-2018 Brailcom, o.p.s
Copyright (C) 2011 William Hubbs <w.d.hubbs@gmail.com>
This program is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free Software
Foundation; either version 2 of the License, or (at your option) any later
version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE. See the GNU General Public License for more details (file
COPYING in the root directory).
You should have received a copy of the GNU General Public License
along with this program. If not, see <https://www.gnu.org/licenses/>.
This file contains instructions for translators to translate the Speech
Dispatcher interface to their languages. Speech Dispatcher uses
gettext[1] for its internationalization support.
If you're not familiar with gettext po files we recommend that you
read the Gettext manual[2] before continuing.
@@ -13,7 +10,7 @@ read the Gettext manual[2] before continuing.
To translate Speech Dispatcher you need a git clone of speech
dispatcher. If you are not reading this file from a git checkout
please check the `INSTALL` file for detailed instructions. You will
also need the gettext package installed, as recommended in
the `INSTALL` file.
== Adding a new Language ==
@@ -26,7 +23,7 @@ messages. The following steps explain the process.
2. Create a pot template for Speech Dispatcher by running:

    $ make -C po speech-dispatcher.pot-update
This will create a file called speech-dispatcher.pot with all
translatable Speech Dispatcher messages.
@@ -65,9 +62,9 @@ them before continuing.
When there are new Speech Dispatcher messages for translation or some
messages are changed, you need to update your <locale>.po file.
To update an existing po file with new messages please run

    $ make -C po <locale>.po-update
Where <locale> is the locale po file to update.
@@ -91,5 +88,23 @@ speechd@lists.freebsoft.org
== References ==
[1] Gettext: https://www.gnu.org/software/gettext/
[2] Gettext Manual: https://www.gnu.org/software/gettext/manual/gettext.html
Copyright (C) 2010 Rui Batista <ruiandrebatista@gmail.com>
Copyright (C) 2012 Brailcom, o.p.s
Copyright (C) 2017 Jan Tojnar <jtojnar@gmail.com>
This program is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free Software
Foundation; either version 2 of the License, or (at your option) any later
version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE. See the GNU General Public License for more details (file
COPYING in the root directory).
You should have received a copy of the GNU General Public License
along with this program. If not, see <https://www.gnu.org/licenses/>.
Speech Dispatcher TODO
======================
The release versions are not final and could change. The targeted release is
based on demand from users and the difficulty of the work involved.
* Add pitch as an option for capitalization presentation
see https://github.com/brailcom/speechd/issues/24
* Allow setting a synthesis voice in the user config using spd-conf.
* Separate voice variants out of language / country
see https://github.com/brailcom/speechd/issues/22
* Pronunciation dictionaries
see https://github.com/brailcom/speechd/issues/57
* Emoji support (from CLDR)
see https://github.com/brailcom/speechd/issues/49
* Add support for Mimic
see https://github.com/brailcom/speechd/issues/19
(0.10) Migrate to GSettings.
(0.10) Synthesizer specific settings API.
(0.10) Use more GLib in the server.
(0.10) Move audio into server.
(0.10) Client audio retrieval API.
(0.10) Server to module protocol documentation.
(0.11) Server to module protocol improvements.
* Move synth modules to plugin architecture with plugin host.
* Synth plugin API.
* Allow for building synth plugins out of tree.
(0.10) Integrate with logind/consolekit.
(0.10) Properly support system-wide mode.
* Support spawning the server via Systemd socket activation.
The above improvements are documented in detail below. If work has started on
a particular project, a git branch will be noted. These git branches are
located at https://github.com/TheMuso/speechd-wip.git. To read the most
up-to-date copy of this file, please clone the master Speech Dispatcher git
repository, located at git://git.freebsoft.org/git/speechd.git, and check out
the master branch.
Migrate to GSettings
--------------------
* Write the GSettings metadata XML file.
* Migrate the server to GSettings.
* Listen to GSettings changes.
* Migrate synthesizer modules to GSettings.
* Write a program to migrate user settings to GSettings.
Synthesizer specific settings API
Depends on: Migration to GSettings
---------------------------------
Background:
* Currently have API for espeak pitch range in git master, but this is only
useful for espeak.
* Espeak module has a config option to show variants along with available
voices, which can be a very long list and can choke some clients.
* Implement server to module protocol to support:
- Request available settings.
- Request available settings and their current value.
- Request the value of a setting.
- Set a setting.
- Reset a setting to its default.
* Implement SSIP protocol support.
* Implement C API, see synthesizer specific settings C API draft.
* Implement python API.
Synthesizer specific settings C API draft

    typedef struct {
            char *name;
            char *description; /* This should be localized */
            enum SynthSettingValueType get_type;
            enum SynthSettingValueType set_type;
            int min_value;
            int max_value;
            char **value_list;
            void *cur_value;
    } SynthSetting;
In the C API, a NULL terminated array of this structure would be returned for
all settings a synth offers.
The SynthSettingValueType enum would look something like this:
    typedef enum {
            SYNTH_SETTING_VALUE_UNKNOWN = 0,
            SYNTH_SETTING_VALUE_NUMBER = 1,
            SYNTH_SETTING_VALUE_STRING = 2,
            SYNTH_SETTING_VALUE_STRING_LIST = 3 /* A list of strings for the user
                                                   to choose from, i.e voice variants */
    } SynthSettingValueType;
C API methods to work with these data types could be as follows:
    SynthSetting **spd_synth_get_settings(SPDConnection *connection);
    int spd_synth_set_setting(SPDConnection *connection, SynthSetting *setting,
                              void *value);
    void free_synth_settings(SynthSetting **settings);
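A client-side usage sketch of this draft (purely illustrative, since none of
these calls exist yet):

    /* Enumerate the settings a synthesizer advertises (per the draft above),
     * print them, and reset numeric ones to their minimum value. */
    SynthSetting **settings = spd_synth_get_settings(connection);
    int i;
    for (i = 0; settings != NULL && settings[i] != NULL; i++) {
            printf("%s: %s\n", settings[i]->name, settings[i]->description);
            if (settings[i]->set_type == SYNTH_SETTING_VALUE_NUMBER) {
                    int value = settings[i]->min_value;
                    spd_synth_set_setting(connection, settings[i], &value);
            }
    }
    free_synth_settings(settings);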
Use more GLib in the server
---------------------------
* Use GLib event loops where possible.
* use GLib GThreads and GAsyncQueues for thread communication.
* Use g_spawn calls for executing modules.
* Support multiple client connection methods, unix socket, inet socket.
* Use g_debug and other relevant GLib logging facilities for
messages/logging.
* Use GThreadedSocketService for handling client connections.
* Replace custom implementations of parsing buffers with GLib equivalent
methods where possible.
Move audio into server
----------------------
* Extend the server to module protocol to receive audio from modules.
* Consider using a separate socket for audio transfer, however this may be
difficult when attempting to synchronise with index marks. An alternative is
to send index mark data via the audio socket as well.
* Implement a playback queue supporting the following types:
(The Espeak module is a good reference)
- Begin event
- End event
- Index mark event
- Audio event
- Sound icon event
* Rework modules supporting audio output to not use any advanced internal
playback queueing, and simply send the audio in relatively small buffers to
the server. Smaller buffers to allow the server to stop/pause the audio more
responsively.
* Implement a mechanism to allow modules to signal that they do not support
audio output.
* Support multiple clients using different audio output devices on the one
backend.
* Extend priority system to be either global priority, or priority per audio
output device.
* Run audio in separate thread, possibly using 2 threads, a controller
thread, and a playback thread, one playback thread per audio device. Again,
the espeak module does something similar.
* Rework pulseaudio output to use a GLib event loop.
* Rework other audio output modules to better work within an event loop.
Client audio retrieval API
--------------------------
* Allow client to either request audio directly, or have audio written to a
designated file on disk.
* Allow modules to decline the use of direct audio retrieval. I know of one
speech synth that is not supported by speech dispatcher, whose licensing
model doesn't allow for direct audio retrieval. If this module is ever
supported, its code will likely remain closed to prevent people working
around the implementation, but it would still be nice to support this synth
in the longer term. (Luke Yelavich)
* Load a new instance of the requested synth module, and spin up a worker
thread to handle audio file writing or sending to client, to allow the server
to dispatch other speech messages, as direct audio retrieval should be
independent of the priority system.
Server to module protocol documentation
---------------------------------------
* Similar to the SSIP documentation, write up a texi document that explains
the server to module protocol, currently over stdin/stdout, but may use other
IPC in the future.
Server to module protocol improvements
--------------------------------------
* Consider using sockets for IPC, with a dedicated socket per module.
* Consider implementing shared memory support, particularly for audio data
transfer, but this may depend on whether GLib has a shared memory API. The
GMappedFile API may be useful, if the initiator can change the contents of
the GMappedFile, and the other side can notice changes. Needs investigation.
* Support the launching of modules via systems other than Speech Dispatcher,
useful where containers of some sort are being used, and the environment
requires that any separate processes are run in containers/other kind of
sandbox, hence the use of sockets as per above.
Integrate with logind/consolekit
(Depends on migration to GSettings, GLib main event loops everywhere)
--------------------------------
* Query current user, and currently running sessions for that user.
* Subscribe to tty change events and cork audio playback and synthesis flow
if none of the user's sessions are active.
* Allow the enabling/disabling of logind/consolekit via GSettings and at
runtime, enabled being the default.
* Allow the disabling of consolekit/logind at build time.
* Consider abstracting this functionality into plugins, or at the very least
separate code with an internal API to more easily support any future
session/seat monitoring systems.
Properly support system-wide mode
---------------------------------
* Set a default user and group for the system wide instance to run under, at
build time, and runtime.
* Add a systemd unit to allow the use of system wide mode, disabled by
default.
Support spawning the server via Systemd socket activation
---------------------------------------------------------
* Allow this to be enabled/disabled at build time.
Copyright (C) 2001 Brailcom, o.p.s
Copyright (C) 2016 Luke Yelavich <themuso@themuso.com>
Copyright (C) 2018 Samuel Thibault <samuel.thibault@ens-lyon.org>
This program is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free Software
Foundation; either version 2 of the License, or (at your option) any later
version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE. See the GNU General Public License for more details (file
COPYING in the root directory).
You should have received a copy of the GNU General Public License
along with this program. If not, see <https://www.gnu.org/licenses/>.