Voice to Text Feature

petemeister · May 15, 2025, 11:47am

Ah, well, here we run the risk of the blind leading the blind because I only got it to work by stumbling (ass over head) through aimless tapping, repetition and pure chance with Jose’s words ricocheting in my head. So, with that caveat, I believe I began by downloading FUTO Keyboard from F-Droid. Search for ‘FUTO’ and you should see ‘FUTO Keyboard’ and ‘FUTO Voice Input’. Install both. Then, if you open FUTO Keyboard>Languages & Models and tap on ‘Voice Input’, you can choose from the available voice-to-text models (each optimised slightly differently). You may be taken to a webpage view where you can download the model you prefer. I had a few goes at this. Anyway, choose one and tap the ‘left’ arrow on your MK to go back to the Languages & Models section. It should now show ‘Externally imported model’. I’m not sure if you need to have ‘Transformer’ set to English v1, but mine is now selected and it works. I had many failed attempts where I believe it wasn’t selected. Throw in a few handset restarts and that should get you either close to victory or to the point of frustration. I’m sorry I can’t be definitive. Someone eminently more qualified than me can probably provide a better guide.

Cheerio.

PS. I nearly forgot – for some reason the F-Droid version of FUTO Keyboard didn’t play well with the F-Droid version of FUTO Voice Input. In desperation I downloaded FUTO Keyboard from the Aurora Store and it worked. Coming from iOS, this Android world is a dizzying roller coaster.

anon80904485 · May 15, 2025, 11:52am

Wow! Thank you for your explanation! Yep, coming from iOS myself.

kirkmahoneyphd · May 15, 2025, 10:35pm

Thank you for the clarification about how to make FUTO Keyboard accept voice input! I set mine up with the English-39 model (“Best for quick processing. The default model for English.”).

I have been spoiled for three years by the fast and accurate voice-to-text feature in the US$40/year Premium Service associated with my Sunbeam Wireless F1 Orchid, which I am replacing with my Mudita Kompakt. I frequently have used that feature to dictate text messages on my Orchid. I appreciate that Sunbeam anonymizes my identity before Sunbeam very quickly – through cellular data – pushes what I speak to a Microsoft transcription service, which very quickly returns the transcription to Sunbeam, which very quickly pushes that transcription to my Orchid.

So, I was dreading losing voice-to-text for texting when I moved from the Orchid to the Kompakt.

FUTO Keyboard + its voice-input feature + your instructions to the rescue!

Fun Experiment

Sample to Be Transcribed from Voice to Text

I spoke the following earlier-received text message into the FUTO Keyboard and into my Orchid:

344461 is your Amazon OTP. Do not share it with anyone.

FUTO Keyboard with English-39 Model

When I spoke it into my FUTO Keyboard on my Kompakt, 12 seconds elapsed between when I started speaking and when the FUTO Keyboard concluded its transcription of what I spoke. Here is the transcript:

3, 4, 4, 4, 6, 1 is your Amazon OTP. Do not share it with anyone.

Sunbeam Wireless Premium Service

When I spoke it into my Orchid, six seconds elapsed between when I started speaking and when my Orchid concluded its transcription of what I spoke. Here is the transcript:

344461 is your Amazon OTP. Do not share it with anyone.

Summary

Sunbeam’s voice-to-text transcription of the sample was twice as fast as the FUTO Keyboard with its fastest English-language model.
Sunbeam’s voice-to-text transcription of the sample was 100% accurate, whereas the FUTO Keyboard with its fastest English-language model did well but stumbled, not surprisingly, on the sequence of six digits.

Conclusion

FUTO Keyboard + the English-39 voice-to-text transcription model is an adequate, half-as-fast, but free – and 100%-offline – substitute on the Mudita Kompakt for the accurate, fast, but US$40/year – and identity-anonymized – voice-to-text transcription feature within the Premium Service on Sunbeam Wireless flip-phones.

reed-OKC · May 16, 2025, 3:59am

The F1 Orchid is currently my daily driver and I use voice-to-text most of the time. This has helped me set my expectations for the FUTO Keyboard. Thanks a lot!

kirkmahoneyphd · May 16, 2025, 12:07pm

You are quite welcome! My two other tips to you as a fellow F1 Orchid user who obviously subscribes to the Premium Service:

If you ever use the Navigation feature in the Orchid – even just for looking up a business – then plan to sideload HERE WeGo.
If you ever use the Orchid’s Weather feature, then consider sideloading Breezy Weather.

reed-OKC · May 16, 2025, 2:22pm

I have that one in my notes!

Breezy Weather… From a quick AI prompt, it seems Breezy does not have radar, which I appreciated on the Orchid, although I see that users like the UI. I’ll definitely look into it! Can I ask what makes you want a sideloaded weather app? More accuracy? UI?

kirkmahoneyphd · May 16, 2025, 2:29pm

Yes, I miss the colorful, animated, multi-hour RADAR map!

I sideloaded Breezy Weather because I wanted more details for each of the upcoming 24 hours and more details for each of the upcoming 14 days. I still use the built-in Weather app, too. It is less detailed, but it is optimized for E Ink (unlike Breezy Weather), and I can access it directly through the control panel.

kirkmahoneyphd · May 16, 2025, 8:05pm

Comparing Language Models for the FUTO Keyboard

Following up on the earlier experiment, here is a side-by-side comparison of the three English language models for the voice-to-text feature of the sideloaded FUTO Keyboard app. I spoke the same source text for all three models.

“Description” refers to the description of the voice-input model at FUTO.
“Output” shows you what I got in the transcription.
“Duration” refers to the difference between when I started to speak the source text and when the FUTO Keyboard completed its transcription of what I spoke.

Source Text

344461 is your Amazon OTP. Do not share it with anyone.

Results with English-39 Model

Description: “Best for quick processing. The default model for English.”
Output:

3, 4, 4, 4, 6, 1 is your Amazon OTP. Do not share it with anyone.

Duration: 12 seconds

Results with English-74 Model

Description: “Strikes a balance between speed and accuracy.”
Output:

3 4 4 4 6 1 is your Amazon OTP. Do not share it with anyone.

Duration: 14 seconds

Results with English-244 Model

Description: “Best for the most accurate results, but more demanding.”
Output:

344-461 is your Amazon OTP. Do not share it with anyone.

Duration: 24 seconds
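For anyone who wants a rough number for how close each transcript is to the source text, here is a minimal Python sketch. It is only an illustration: the `similarity` helper and the choice of `difflib.SequenceMatcher` from the standard library are assumptions of this sketch, not something FUTO provides, and a character-level ratio is a crude stand-in for real transcription accuracy.

```python
from difflib import SequenceMatcher

# Source text and the three transcripts, copied from the results above.
SOURCE = "344461 is your Amazon OTP. Do not share it with anyone."
OUTPUTS = {
    "English-39": "3, 4, 4, 4, 6, 1 is your Amazon OTP. Do not share it with anyone.",
    "English-74": "3 4 4 4 6 1 is your Amazon OTP. Do not share it with anyone.",
    "English-244": "344-461 is your Amazon OTP. Do not share it with anyone.",
}


def similarity(expected: str, actual: str) -> float:
    """Character-level similarity between 0 and 1 (hypothetical helper)."""
    return SequenceMatcher(None, expected, actual).ratio()


# Each transcript is the source plus inserted separators, so the length
# difference equals the number of characters to delete by hand.
extra_chars = {model: len(text) - len(SOURCE) for model, text in OUTPUTS.items()}

for model, text in OUTPUTS.items():
    score = similarity(SOURCE, text)
    print(f"{model}: similarity {score:.3f}, {extra_chars[model]} characters to delete")
```

The characters-to-delete count is the more practical figure here: 10 for English-39, 5 for English-74 and 1 for English-244.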

Discussion

The English-74 model in this simple side-by-side comparison gave a less clumsy (i.e., easier-to-edit) transcription than did the English-39 model but required only about 17% more time (14 vs. 12 seconds).
The English-244 model produced an even better transcription than did the English-74 model but required 71% more time (24 vs. 14 seconds).
Based on these results (with an admittedly simple comparison), I recommend the English-74 model over the other two models for everyday use.
I recommend the English-244 model for the greatest transcription accuracy, but you must be willing to wait much longer than you would wait for either of the other two models!
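For anyone who wants to rerun the timing arithmetic, here is a small Python sketch using the three durations reported above. The `percent_more` helper is a name made up for this illustration. Keep in mind that these are single stopwatch-style measurements that include speaking time, so the percentages are rough.

```python
# Durations in seconds, copied from the results above.
DURATIONS = {"English-39": 12, "English-74": 14, "English-244": 24}


def percent_more(slower: float, faster: float) -> float:
    """How much longer `slower` took than `faster`, in percent (hypothetical helper)."""
    return (slower - faster) / faster * 100


comparisons = [
    ("English-74", "English-39"),
    ("English-244", "English-74"),
    ("English-244", "English-39"),
]

for slower, faster in comparisons:
    extra = percent_more(DURATIONS[slower], DURATIONS[faster])
    print(f"{slower} vs. {faster}: {extra:.0f}% more time")
```

This works out to roughly 17% more time for English-74 over English-39, 71% more for English-244 over English-74, and 100% more (double) for English-244 over English-39.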

rockinrobin · January 29, 2026, 2:01pm

Another vote for voice to text! (: