Croatian Speech Recognition
Most Accurate Transcription & Voice to Text for Croatian Language
We built and curated a large Croatian speech dataset and used it to fine-tune Whisper. The resulting models give our voice AI assistants the best possible speech recognition, and we also offer them as a standalone product.
Why Our Speech Recognition & Transcription?
Our models are the result of extensive research and development. They are tailored to the specifics of the Croatian language and trained on our large Croatian speech dataset.
Lowest WER
Lowest Word Error Rate (WER) for Croatian on every benchmark listed on this page, compared with the openai/whisper-large-v3-turbo baseline
Call Center Optimized
Specially tuned for phone recordings and CC environments
API Access
Easy integration via inference endpoints with API keys
Production Ready
Powering our voice AI assistants in production
Automatic Transcription
Convert voice to text automatically - perfect for transcribing meetings, calls, and audio files
Audio Transcription Service
High-quality audio to text conversion with support for various audio formats and real-time processing
Croatian Speech Recognition & Transcription API (Voice to Text)
While our public model achieves top-tier results on public datasets, these often do not reflect real-world challenges. Our private model is further fine-tuned on an extensive proprietary dataset (call centers, telephony), where it significantly outperforms the public model, as clearly demonstrated in the benchmarks on our internal test sets.
Contact us for API access and enterprise pricing.
Model Comparison
Lower is better
Datasets marked with an Internal badge represent realistic environments: call centers, phone recordings, natural speech...
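For readers unfamiliar with the metric, WER is the number of word substitutions, deletions, and insertions needed to turn the model output into the reference transcript, divided by the number of words in the reference. The sketch below is a minimal illustration of that definition, not the evaluation script behind the figures on this page. The lowercase-and-split normalization and the example sentences are assumptions made for the illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    # Assumed normalization: lowercase and split on whitespace only.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1        # reference word missing from output
            insertion = curr[j - 1] + 1   # extra word in output
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one of six reference words is dropped.
reference = "dobar dan kako vam mogu pomoći"
hypothesis = "dobar dan kako mogu pomoći"
print(f"WER: {word_error_rate(reference, hypothesis) * 100:.2f}%")
```

Published WER figures depend heavily on text normalization (casing, punctuation, numerals), so scores are only comparable when every model is evaluated with the same normalization.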
SL99 Dataset
(sl99 test)

| Model | WER (%) |
|---|---|
| openai/whisper-large-v3-turbo | 22.93 |
| SL Public Model (GoranS/whisper-large-v3-turbo-hr-parla) | 18.44 |
| SL Private Model | 11.53 |
SL31 Dataset
(sl31 test)

| Model | WER (%) |
|---|---|
| openai/whisper-large-v3-turbo | 21.62 |
| SL Public Model (GoranS/whisper-large-v3-turbo-hr-parla) | 16.97 |
| SL Private Model | 8.81 |
Fleurs Dataset
(google/fleurs hr_hr test)

| Model | WER (%) |
|---|---|
| openai/whisper-large-v3-turbo | 12.73 |
| SL Public Model (GoranS/whisper-large-v3-turbo-hr-parla) | 8.66 |
| SL Private Model | 9.93 |
Parla Dataset
(parla_867k test)

| Model | WER (%) |
|---|---|
| openai/whisper-large-v3-turbo | 10.23 |
| SL Public Model (GoranS/whisper-large-v3-turbo-hr-parla) | 3.52 |
| SL Private Model | 4.59 |
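To put the tables in perspective, it can help to express each result as a relative WER reduction against the openai/whisper-large-v3-turbo baseline. The short sketch below does that arithmetic using only the figures from the tables above. The dictionary layout and the function name are illustrative choices, not part of any product API.

```python
# WER (%) values copied from the benchmark tables on this page.
WER = {
    "sl99":   {"baseline": 22.93, "public": 18.44, "private": 11.53},
    "sl31":   {"baseline": 21.62, "public": 16.97, "private": 8.81},
    "fleurs": {"baseline": 12.73, "public": 8.66,  "private": 9.93},
    "parla":  {"baseline": 10.23, "public": 3.52,  "private": 4.59},
}


def relative_reduction(baseline_wer: float, model_wer: float) -> float:
    """Relative WER reduction in percent; positive means fewer errors."""
    return (baseline_wer - model_wer) / baseline_wer * 100


for dataset, scores in WER.items():
    for model in ("public", "private"):
        reduction = relative_reduction(scores["baseline"], scores[model])
        print(f"{dataset:7s} {model:8s} {reduction:5.1f}% fewer errors than baseline")
```

On the internal sl31 set, for example, the private model's 8.81% against the 21.62% baseline works out to a relative reduction of roughly 59%.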
Free Public Model
We've released a fine-tuned model trained on the Croatian Parliament (Parla) dataset, available freely on Hugging Face.
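As a rough sketch of how the public model might be tried locally, the snippet below loads it through the Hugging Face `transformers` automatic-speech-recognition pipeline. It assumes the repository `GoranS/whisper-large-v3-turbo-hr-parla` (the model ID from the benchmark tables) loads as a standard Whisper checkpoint, that `transformers`, a PyTorch backend, and `ffmpeg` are installed, and that 30-second chunking is acceptable for longer recordings. None of these details are confirmed on this page.

```python
# Model ID taken from the benchmark tables on this page.
MODEL_ID = "GoranS/whisper-large-v3-turbo-hr-parla"


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Transcribe a Croatian audio file with the public fine-tuned model."""
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import pipeline

    asr = pipeline(
        "automatic-speech-recognition",
        model=model_id,
        chunk_length_s=30,  # assumption: split long recordings into 30 s windows
    )
    result = asr(
        audio_path,
        # Ask Whisper to transcribe in Croatian rather than auto-detect or translate.
        generate_kwargs={"language": "croatian", "task": "transcribe"},
    )
    return result["text"]
```

Calling `transcribe("meeting.wav")` with a hypothetical local file would download the model on first use and return the recognized text as a single string.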
Need Croatian Transcription or Voice to Text for Your Project?
Get access to our state-of-the-art speech recognition and automatic transcription API.

Premium Infrastructure & Performance
Powered by Omonia & Exoscale Zagreb
Our AI services run on enterprise-grade infrastructure located directly in Zagreb, hosted by Omonia and A1 Croatia (Exoscale hr-zag-1). This ensures ultra-low latency for real-time voice applications and full data sovereignty.
Ultra-Low Latency
<10 ms round-trip time in Croatia via Omonia's optimized BGP routing.
Data Sovereignty
All data stays in Croatia. GDPR compliant processing on local servers.
10Gbit+ Connectivity
Multiple redundant 10 Gbit/s upstreams ensure uninterrupted service.
Tier 3 Reliability
N+1 redundancy on power and cooling for 99.99% uptime.
Get in touch
We're here to help you transform the way you connect with your customers. Whether you have questions, need a demo, or are ready to get started, our team is just a message away. Let's work together to create seamless, intelligent interactions that drive success.