Slothologist/AudioSegmenter


AudioSegmenter

Segmentation of audio for a speech pipeline using JACK audio. Intended for use as a VAD (Voice Activity Detection) stage in a speech recognition pipeline.

Disclaimer

The AudioSegmenter is in development and currently not recommended for use. Requirements will most likely change in the near future: we will move away from JACK as the sound framework to a more dedicated framework called esiaf, which is currently being worked on.

Current Segmentation Methods

A basic interface allows for different VAD algorithms. At this point in time, only one is implemented.

Double Threshold Segmenter

Reads audio and writes it back unchanged as long as all of the following hold:

  • The level has reached db_min, which opens a segment
  • The level does not stay below db_keep_alive for more than time_keep_alive ms
  • The segment is not longer than time_max ms

Otherwise, it outputs a zero signal.
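The three conditions above can be sketched as a small state machine. This is only an illustrative Python sketch, not the project's implementation: the class name `DoubleThresholdGate`, the `process` method, and the per-frame dB input are hypothetical. It also assumes that a segment closed by time_max may reopen as soon as db_min is reached again, which the description above does not specify.

```python
from dataclasses import dataclass


@dataclass
class DoubleThresholdGate:
    """Hypothetical sketch of a double-threshold gate (names are assumptions)."""
    db_min: float              # level that opens a segment
    db_keep_alive: float       # level that keeps an open segment alive
    time_keep_alive_ms: float  # longest tolerated stretch below db_keep_alive
    time_max_ms: float         # longest allowed segment
    active: bool = False
    segment_ms: float = 0.0
    quiet_ms: float = 0.0

    def process(self, frame_db: float, frame_ms: float) -> bool:
        """Return True to pass the frame through, False to output zeros."""
        if not self.active:
            if frame_db < self.db_min:
                return False
            # db_min reached: open a new segment
            self.active = True
            self.segment_ms = 0.0
            self.quiet_ms = 0.0

        # Track how long the level has stayed below the keep-alive threshold
        if frame_db < self.db_keep_alive:
            self.quiet_ms += frame_ms
        else:
            self.quiet_ms = 0.0
        self.segment_ms += frame_ms

        # Close the segment on a too-long quiet stretch or a too-long segment
        if (self.quiet_ms > self.time_keep_alive_ms
                or self.segment_ms > self.time_max_ms):
            self.active = False
            return False
        return True


gate = DoubleThresholdGate(db_min=-20.0, db_keep_alive=-40.0,
                           time_keep_alive_ms=20.0, time_max_ms=100.0)
levels = [-50.0, -10.0, -30.0, -50.0, -50.0, -50.0, -10.0]  # one value per 10 ms frame
print([gate.process(db, 10.0) for db in levels])
# → [False, True, True, True, True, False, True]
```

In this sketch, a frame for which `process` returns False would be replaced by zeros before being written to the output.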

Requirements
