I tried this with the latest version of the notebook. The evals do well and the predictions have high confidence, but when I try the live recognition cell, all the predictions are completely wrong and seem random. It particularly likes the numbers 5 and 9.
I tried running the diagnostic cells, but they weren’t much help. The predictions there were also random.
I tried re-recording with a different microphone, but the same thing happened.

The readings were interesting. I was surprised that vocal analysis is not a more widely known or widely used tool in surveillance and law enforcement. It was also interesting to hear about how vocoders work: using one signal to program another signal’s filter envelope. Cool!