sueden.social is one of many independent Mastodon servers you can use to take part in the Fediverse.
A community for everyone who feels drawn to the South. We can do everything except standard German.

#speechrecognition

Continued thread

(Linux news in previous posts of thread)

FOSS NEWS

VirtualBox 7.2 released with initial support for Linux kernel 6.16 and 6.17, improved Linux Guest Additions support for Oracle Linux 10 and Red Hat Enterprise Linux 10 guests, improved handling of the vboxvideo kernel module in the init script for Linux guests, video decoding acceleration is enabled for Linux hosts when the 3D option is active in settings, GUI improvements, bug fixes:
9to5linux.com/virtualbox-7-2-o

Organic Maps now displays popular hiking and cycling routes, agricultural and forestry roads are excluded from routing, bookmark names are displayed directly on the map for faster identification, Android app gets track elevation graph and track selection on the map:
alternativeto.net/news/2025/8/

CoMaps v2025.08.13-8 released with UI improvements, support for Irish postcodes, various bug fixes:
alternativeto.net/news/2025/8/

Immich 1.137 released with beta timeline fixes, option for custom URLs when generating shared links, new utility to quickly locate large files, fine-grained permissions extended to more API endpoints, etc.:
alternativeto.net/news/2025/8/

Immich 1.138 released with ability to reset PIN code by entering current password, option to reset OAuth IDs, swipe-to-delete functionality for albums for beta timeline users, improved upload and sync capabilities, etc.:
alternativeto.net/news/2025/8/

Ghostty terminal GTK build is rewritten to fix various issues on Linux and BSD, including memory issues:
omgubuntu.co.uk/2025/08/ghostt

FFmpeg 8.0 will include OpenAI Whisper filter for automatic speech recognition and transcription if built with --enable-whisper flag:
phoronix.com/news/FFmpeg-Lands

(more FOSS news in comment)
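On the FFmpeg 8.0 item above: since the Whisper filter is only present when FFmpeg was built with `--enable-whisper`, it can help to check a local build first. The following is a minimal sketch, not an official check. It assumes an `ffmpeg` binary on the PATH and relies on the `-buildconf` option, which prints the configure flags of the build. The helper names `has_whisper` and `ffmpeg_buildconf` are made up for this example.

```python
import subprocess


def has_whisper(buildconf: str) -> bool:
    """Return True if the build configuration text lists --enable-whisper."""
    # Compare whole whitespace-separated tokens so that a longer flag
    # sharing the same prefix does not count as a match.
    return "--enable-whisper" in buildconf.split()


def ffmpeg_buildconf(binary: str = "ffmpeg") -> str:
    """Return the text printed by `<binary> -hide_banner -buildconf`."""
    result = subprocess.run(
        [binary, "-hide_banner", "-buildconf"],
        capture_output=True,
        text=True,
        check=True,
    )
    # Read both streams so the check does not depend on which one is used.
    return result.stdout + result.stderr


if __name__ == "__main__":
    try:
        print("whisper filter built in:", has_whisper(ffmpeg_buildconf()))
    except (FileNotFoundError, subprocess.CalledProcessError) as exc:
        print("could not query ffmpeg:", exc)
```

If this reports `False`, the filter is not available in that build, whatever the FFmpeg version.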

Journal of Open Source Software: voice: A Comprehensive R Package for Audio Analysis
{voice}
"...a free, open-source toolkit designed to streamline audio analysis by integrating music theory and advanced computational techniques. It enables researchers to extract, summarize, and analyze voice data efficiently, supporting applications such as speech recognition, speaker identification, and mood inference..."

joss.theoj.org/papers/10.21105

Zabala et al. (2025). voice: A Comprehensive R Package for Audio Analysis. Journal of Open Source Software, 10(111), 8420. https://doi.org/10.21105/joss.08420
Replied in thread

"#KarenHao only really gets her teeth into this point in the book’s epilogue, “How the Empire Falls.” She takes inspiration from #TeHiku, a #Māori AI #speechrecognition project. Te Hiku seeks to revitalize the #te_reo language through putting archived audio tapes of te reo speakers into an AI model, teaching new generations of Māori.
The tech has been developed on consent and active participation from the Māori community, and it is only licensed to organizations that respect Māori values"

Replied in thread

@thelinuxEXP I really like Speech Note! It's a fantastic tool for quick and local voice transcription in multiple languages, created by @mkiol

It's incredibly handy for capturing thoughts on the go, conducting interviews, or making voice memos without worrying about language barriers. The app uses strictly locally running LLMs, and its ease of use makes it a standout choice for anyone needing offline transcription services.

I primarily use #WhisperAI for transcription and Piper for voice, but many other models are available as well.

It is available as flatpak and github.com/mkiol/dsnote

#TTS #transcription #TextToSpeech #translator #translation #offline #machinetranslation #sailfishos #SpeechSynthesis #SpeechRecognition #speechtotext #nmt #linux-desktop #stt #asr #flatpak-applications #SpeechNote

🌟 Excited to share Thorsten-Voice's YouTube channel! 🎥 🗣️🔊 ♿ 💬

Thorsten presents innovative TTS solutions and a variety of voice technologies, making it an excellent starting point for anyone interested in open-source text-to-speech. Whether you're a developer, accessibility advocate, or tech enthusiast, his channel offers valuable insights and resources. Don't miss out on this fantastic content! 🎬

Follow him here: @thorstenvoice
or on YouTube: youtube.com/@ThorstenMueller

#Accessibility #FLOSS #TTS

I'm exploring ways to improve audio preprocessing for speech recognition for my [midi2hamlib](github.com/DO9RE/midi2hamlib) project. Do any of my followers have expertise with **SoX** or **speech recognition**? Specifically, I’m seeking advice on: 1️⃣ Best practices for audio preparation for speech recognition. 2️⃣ SoX command-line parameters that can optimize audio during recording or playback.
github.com/DO9RE/midi2hamlib/b #SoX #SpeechRecognition #OpenSource #AudioProcessing #ShellScripting #Sphinx #PocketSphinx #Audio Retoot appreciated.
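As one possible starting point for the question above (not taken from the midi2hamlib project), here is a sketch that builds a SoX command line to convert a recording to 16 kHz, mono, 16-bit audio, a format commonly expected by PocketSphinx's default models, followed by a high-pass filter and peak normalization. The function name `sox_asr_command` and the default values (100 Hz cutoff, -3 dB normalization level) are assumptions chosen for illustration. The SoX pieces used are the output format options `-r`, `-c`, `-b` and the `highpass` and `norm` effects.

```python
import shlex


def sox_asr_command(src, dst, rate=16000, highpass_hz=100, norm_db=-3):
    """Build a SoX argument list that prepares `src` for speech recognition.

    The defaults are illustrative starting values, not tuned recommendations.
    """
    # Format options placed before the output file apply to the output:
    # sample rate, one channel (mono), 16 bits per sample.
    cmd = ["sox", src, "-r", str(rate), "-c", "1", "-b", "16", dst]
    # Effects follow the output file name.
    cmd += ["highpass", str(highpass_hz)]  # attenuate low-frequency rumble and hum
    cmd += ["norm", str(norm_db)]          # normalize the peak level to norm_db dB
    return cmd


if __name__ == "__main__":
    # Print the command for review; pass the list to subprocess.run() to execute it.
    print(shlex.join(sox_asr_command("input.wav", "prepared.wav")))
```

Building the command as a list keeps file names with spaces intact and makes it easy to try other effects by appending them.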
