How It Works

At the heart of the private voice exchange is the concept of a producer farm. A producer farm is a collection of devices capable of producing either STT (speech-to-text) or TTS (text-to-speech) output. What a producer node connects to, and how it connects, varies, but all producer nodes adhere to the same basic behavior. Put simply, TTS producer nodes take in a text string and a few voice parameters and produce a WAV file as output. Similarly, STT producer nodes take in a WAV file and a model name and produce a text string.
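The two producer behaviors can be sketched as a pair of small classes. This is only an illustration of the input and output shapes described above, not the PriVox implementation: the names VoiceParams, DummyTTSProducer, and DummySTTProducer are hypothetical, the voice parameters shown are assumed, and the "engines" are stand-ins that emit silence and a placeholder string.

```python
import io
import wave
from dataclasses import dataclass


@dataclass
class VoiceParams:
    # Hypothetical voice parameters; the real parameter set is not specified here.
    voice: str = "default"
    rate: float = 1.0


def make_silent_wav(seconds: float, sample_rate: int = 16000) -> bytes:
    """Build a valid mono 16-bit WAV file (as bytes) containing only silence."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(sample_rate)
        w.writeframes(b"\x00\x00" * int(seconds * sample_rate))
    return buf.getvalue()


class DummyTTSProducer:
    """Stand-in TTS producer: text and voice parameters in, WAV bytes out."""

    def produce(self, text: str, params: VoiceParams) -> bytes:
        # A real node would run a speech synthesis engine here.
        duration = 0.05 * max(1, len(text.split())) / params.rate
        return make_silent_wav(duration)


class DummySTTProducer:
    """Stand-in STT producer: WAV bytes and a model name in, text out."""

    def produce(self, wav_bytes: bytes, model: str) -> str:
        with wave.open(io.BytesIO(wav_bytes), "rb") as w:
            seconds = w.getnframes() / w.getframerate()
        # A real node would run a speech recognition model here.
        return f"[{model}] {seconds:.2f}s of audio"
```

Because both producer types reduce to a single call with fixed input and output types, the rest of the exchange does not need to know which engine or machine is behind a given node.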

Some producers connect via a direct CGI interface and perform the conversion (either TTS or STT) in the CGI script itself. Some connect to a socket server, and others act as relay servers, presenting a CGI interface to the user but forwarding the request on to a producer node. The PriVox infrastructure, for example, uses separate machines for the API interface, the socket servers, and the producer nodes. The entire infrastructure could also run on a single machine if desired.
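The difference between a direct producer and a relay can be sketched with plain functions. This is a simplified model under stated assumptions: a request is a dict, a response is bytes, and the forwarding step is an ordinary function call rather than a real CGI or socket hop. The names direct_producer and make_relay are hypothetical.

```python
from typing import Callable, Dict

# A request is modeled as a plain dict and a response as bytes; both are assumptions.
Request = Dict[str, str]
Handler = Callable[[Request], bytes]


def direct_producer(request: Request) -> bytes:
    """Direct style: the interface script does the conversion itself."""
    # Placeholder output; a real producer would return WAV data.
    return ("audio for: " + request["text"]).encode("utf-8")


def make_relay(upstream: Handler) -> Handler:
    """Relay style: same interface to the caller, work done by another node."""

    def relay(request: Request) -> bytes:
        # A real relay would forward over HTTP or a socket; here it is a function call.
        return upstream(request)

    return relay
```

Since the relay exposes the same call signature as the producer it wraps, a client cannot tell the two apart, which is what allows the same setup to be spread across several machines or collapsed onto one.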

What makes the PriVox voice exchange unique is that it grows or shrinks based on user contributions. If no users contribute, requests fall back to the PriVox servers. As user contributions grow, overall network capacity grows with them. This can be seen on the about page, which shows the number of active nodes and the available QSS.
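The fallback behavior can be sketched as a small node selector. This is a minimal sketch, not the exchange's actual scheduler: the Exchange class and its method names are hypothetical, nodes are represented by plain strings, and round-robin selection is an assumption chosen for simplicity.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Exchange:
    fallback_nodes: List[str]  # stand-in for the PriVox servers
    user_nodes: List[str] = field(default_factory=list)
    _next: int = 0  # round-robin counter (assumed selection policy)

    def register(self, node: str) -> None:
        """A user starts contributing as a producer node."""
        if node not in self.user_nodes:
            self.user_nodes.append(node)

    def unregister(self, node: str) -> None:
        """A user stops contributing."""
        if node in self.user_nodes:
            self.user_nodes.remove(node)

    def pick(self) -> str:
        """Choose a node for the next request, falling back if no users contribute."""
        pool = self.user_nodes or self.fallback_nodes
        node = pool[self._next % len(pool)]
        self._next += 1
        return node
```

With no registered user nodes, every call to pick() returns a fallback node; as users register, requests are spread across them, and capacity shrinks again as they leave.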

It takes less than 5 minutes to get set up and start contributing as a temporary producer node. Nearly any desktop or laptop will work, though currently only Linux machines are supported. To use the voice exchange, see the PVX API.