Website: http://speex.org
Comparison of Speex, iLBC, AMR-*, GSM-* and G.7* codecs by: sampling rate (kHz), bitrate (kbps), delay (frame + lookahead, in ms), multi-rate support, embedded coding, VBR, PLC, bit-robustness and license.
Vocoders are voice codecs: audio codecs that are optimized for coding human speech at very low bitrates.
There are 9 articles in this category, as of 2006.05.24.
ACELP.net is a voice codec created by VoiceAge (http://www.voiceage.com/acelpnet.php). RealNetworks uses this codec under the name "RealAudio sipro".
Windows Media Audio Voice, the voice codec from Microsoft's line of Windows Media audio codecs, is actually ACELP.net as well.
GSM 06.10 is a GSM vocoder standard that also occurs in some multimedia files.
Speech codec found in older Flash Video files.
Qualcomm Voice Codec is most commonly found in Apple QuickTime. The version in QuickTime is not the same as the standardized one.
Qualcomm PureVoice is standardized as TIA IS-733; the specs can be downloaded from the 3GPP2 specs site.
RealNetworks' 14.4 kbits/sec codec. Actual data rate is 8 kbits/sec. Equivalent to EIA/TIA IS-54 VSELP.
Speex is an open audio compression format designed for speech that claims to be patent-free. It is available under the Xiph variant of the BSD license.
Website: http://speex.org
Equivalent to RealAudio 14.4.
ACELP.net is the preferred low bit rate speech codec in RealAudio and is widely deployed in both Windows Media Player and Audible ready equipment.
Encoded bandwidth: Dual-rate & fixed-rate: ~200-3400 Hz; Wideband: ~50-7000 Hz
Coding type: ACELP® (Algebraic Code Excited Linear Prediction)
Bit rate: Dual-rate: 8.5/6.5 kbps; Fixed-rate: 5.0 kbps; Wideband: 16 kbps
Delay (ms):
Frame size: Dual-rate: 18; Fixed-rate: 30; Wideband: 10
Lookahead: Half a frame
Quality: Toll at 8.5 kbps; near-toll at 6.5 kbps
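The ACELP.net figures listed above can be combined into two derived quantities: the coded bits per frame (bit rate times frame size) and the frame-plus-lookahead delay. The sketch below is only arithmetic on those listed figures. It assumes that "half a frame" of lookahead applies to every mode and that delay is simply frame size plus lookahead, ignoring processing and transport time. The mode labels are made up for illustration, and the results have not been checked against any VoiceAge specification.

```python
# (label, bit rate in kbps, frame size in ms), copied from the figures above.
# The labels themselves are hypothetical, chosen only for this example.
MODES = [
    ("dual-rate (high)", 8.5, 18),
    ("dual-rate (low)", 6.5, 18),
    ("fixed-rate", 5.0, 30),
    ("wideband", 16.0, 10),
]

# "Lookahead: Half a frame"; assumed here to hold for every mode.
LOOKAHEAD_FRACTION = 0.5


def bits_per_frame(kbps, frame_ms):
    """kbps multiplied by milliseconds gives bits directly."""
    return kbps * frame_ms


def frame_plus_lookahead_ms(frame_ms):
    """Delay taken as one frame plus the lookahead (an assumption)."""
    return frame_ms * (1 + LOOKAHEAD_FRACTION)


for label, kbps, frame_ms in MODES:
    print(f"{label}: {bits_per_frame(kbps, frame_ms):g} bits/frame, "
          f"{frame_plus_lookahead_ms(frame_ms):g} ms frame + lookahead")
```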
Vector Sum Excited Linear Prediction, or VSELP, is a speech coding method used in the IS-54 standard. This codec was used in early TDMA cell phones in the United States. It was also used in the first version of RealAudio for audio over the Internet. The IS-54 VSELP standard was published by the Telecommunications Industry Association in 1989.
IS-54 VSELP specifies an encoding of each 20 ms of speech into 159-bit frames, thus achieving a raw data rate of 7.95 kbit/s. In an actual TDMA cell phone, the vocoder output is packaged with error correction and signaling information, resulting in an over-the-air data rate of 16.2 kbit/s. For internet audio, each 159-bit frame is stored in 20 bytes, leaving 1 bit unused. The resulting file thus has a data rate of exactly 8 kbit/s.
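The rates quoted above follow directly from the frame layout. As a minimal sketch, using only the figures in the text (20 ms per frame, 159 coded bits, 20 bytes of storage per frame), the arithmetic can be checked like this; the variable and function names are illustrative only.

```python
FRAME_MS = 20       # speech duration covered by one frame
FRAME_BITS = 159    # coded bits per frame
STORED_BYTES = 20   # bytes used to store one frame in a file


def rate_kbit_s(bits_per_frame, frame_ms):
    """Bits per millisecond is numerically the same as kbit/s."""
    return bits_per_frame / frame_ms


raw_rate = rate_kbit_s(FRAME_BITS, FRAME_MS)      # vocoder output rate
stored_bits = STORED_BYTES * 8                    # bits occupied on disk per frame
unused_bits = stored_bits - FRAME_BITS            # padding per stored frame
file_rate = rate_kbit_s(stored_bits, FRAME_MS)    # rate of the stored stream

# Storage needed for one minute of speech at the file rate.
frames_per_minute = 60_000 // FRAME_MS
bytes_per_minute = frames_per_minute * STORED_BYTES

print(f"raw rate:   {raw_rate:.2f} kbit/s")
print(f"unused:     {unused_bits} bit per frame")
print(f"file rate:  {file_rate:.2f} kbit/s")
print(f"one minute: {bytes_per_minute} bytes")
```

At the stored rate, one minute of speech takes 60,000 bytes, not counting any container overhead.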
A major drawback of VSELP is its limited ability to encode non-speech sounds, so that it performs poorly when encoding speech in the presence of background noise. For this reason, use of VSELP has been gradually phased out in favor of newer codecs.
Newsgroups: gmane.comp.video.ffmpeg.user
Date: Tue, 23 May 2006 18:34:53 -0700
> I know "ffmpeg -formats" can give me a huge list of supported codecs.
> I'm wondering, among those codecs, which one is best for human voice
> recording? By "best" I mean the one that would produce the smallest file
> size -- I don't mind the quality degraded, as long as I can make out of
> it, phone quality is enough.
>
> Maybe gsm? Any better ones?
You are interested in the general category called voice codecs or "vocoders":
GSM 06.10 is a possibility. Speex is another open source vocoder solution.
Mike Melanson