User requests integration of the Parakeet audio-to-text model and the addition of long-term memory features to improve context-aware translation, providing links to the model, an ASR leaderboard, sherpa-onnx deployment docs, and several candidate memory libraries.
Can you try this new audio-to-text model? Is it possible to add long-term memory features for better context-aware translation?

Parakeet:
- https://huggingface.co/nvidia/parakeet-tdt-0.6b-v2
- https://huggingface.co/spaces/hf-audio/open_asr_leaderboard
- https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-transducer/nemo-transducer-models.html#sherpa-onnx-nemo-parakeet-tdt-0-6b-v2-int8-english
- https://k2-fsa.github.io/sherpa/onnx/lazarus/generate-subtitles.html
- https://github.com/k2-fsa/sherpa-onnx

Long-term memory:
- Letta: https://github.com/letta-ai/letta
- mem0: https://github.com/mem0ai/mem0
- memary: https://github.com/kingjulio8238/Memary
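To make the "long-term memory for context-aware translation" idea concrete, here is a minimal stdlib-only sketch. It is not the API of Letta, mem0, or memary; the class and method names (`TranslationMemory`, `remember_term`, `context_for`) are hypothetical. It shows the basic shape those libraries would fill in with real retrieval: remember glossary decisions and recent transcript lines, then surface the relevant ones as context for the next translation call.

```python
from collections import deque

class TranslationMemory:
    """Hypothetical long-term memory sketch: keeps glossary choices and a
    rolling window of recent transcript lines so later translations of an
    audio stream stay consistent with earlier ones."""

    def __init__(self, max_recent=5):
        self.glossary = {}                      # term -> chosen translation
        self.recent = deque(maxlen=max_recent)  # rolling transcript context

    def remember_term(self, term, translation):
        # Pin a translation decision so the same term is rendered the same way later.
        self.glossary[term] = translation

    def remember_line(self, line):
        # Append a transcribed line; deque drops the oldest beyond max_recent.
        self.recent.append(line)

    def context_for(self, text):
        # Collect glossary entries whose term appears in the new text,
        # plus the recent lines, to prepend as translator context.
        hits = {t: g for t, g in self.glossary.items() if t.lower() in text.lower()}
        return {"glossary": hits, "recent": list(self.recent)}

memory = TranslationMemory()
memory.remember_term("Parakeet", "Parakeet")  # keep the model name untranslated
memory.remember_line("We are testing the new ASR model.")
ctx = memory.context_for("Parakeet transcribed the audio correctly.")
```

A library such as mem0 or Letta would replace the substring match in `context_for` with embedding-based retrieval over a persistent store, but the integration point is the same: fetch memory, feed it to the translator, write new facts back.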