Croatian Speech Recognition

Most Accurate Transcription & Voice to Text for the Croatian Language

We built and curated a large Croatian speech dataset and used it to fine-tune Whisper. The resulting models deliver the speech recognition accuracy our voice AI assistants need, and we also offer them as a standalone product.

Why Our Speech Recognition & Transcription?

Our models are the result of extensive research and development, tailored to the specifics of the Croatian language and trained on our large Croatian speech dataset.

Lowest WER

Lowest Word Error Rate (WER) for Croatian on every benchmark in our model comparison

Call Center Optimized

Specially tuned for phone recordings and call center environments

API Access

Easy integration via inference endpoints with API keys

Production Ready

Powering our voice AI assistants in production

Automatic Transcription

Convert voice to text automatically - perfect for transcribing meetings, calls, and audio files

Audio Transcription Service

High-quality audio to text conversion with support for various audio formats and real-time processing

Enterprise

Croatian Speech Recognition & Transcription API (Voice to Text)

While our public model achieves top-tier results on public datasets, these often do not reflect real-world challenges. Our private model is further fine-tuned on an extensive proprietary dataset (call centers, telephony), where it significantly outperforms the public model, as clearly demonstrated in the benchmarks on our internal test sets.

Model Comparison

Lower is better

Datasets marked with an Internal badge represent realistic environments: call centers, phone recordings, natural speech...
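For readers new to the metric, WER is conventionally the number of word substitutions, deletions, and insertions needed to turn the model output into the reference transcript, divided by the number of reference words. The sketch below is a minimal illustration of that standard definition, not our evaluation code. The function name and the example sentences are made up for illustration, and it applies no text normalization (casing, punctuation), which published benchmarks usually do.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent, using word-level Levenshtein distance."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return 100.0 * prev[-1] / len(ref)


# Illustrative sentences only: one of six reference words is missing.
reference = "dobar dan kako vam mogu pomoći"
hypothesis = "dobar dan kako mogu pomoći"
print(f"WER: {word_error_rate(reference, hypothesis):.2f}%")
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.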

SL99 Dataset

(sl99 test)

Internal (Private)

Model	WER (%)
openai/whisper-large-v3-turbo	22.93
SL Public Model (GoranS/whisper-large-v3-turbo-hr-parla)	18.44
SL Private Model	11.53

SL31 Dataset

(sl31 test)

Internal (Private)

Model	WER (%)
openai/whisper-large-v3-turbo	21.62
SL Public Model (GoranS/whisper-large-v3-turbo-hr-parla)	16.97
SL Private Model	8.81

Fleurs Dataset

(google/fleurs hr_hr test)

Model	WER (%)
openai/whisper-large-v3-turbo	12.73
SL Public Model (GoranS/whisper-large-v3-turbo-hr-parla)	8.66
SL Private Model	9.93

Parla Dataset

(parla_867k test)

Model	WER (%)
openai/whisper-large-v3-turbo	10.23
SL Public Model (GoranS/whisper-large-v3-turbo-hr-parla)	3.52
SL Private Model	4.59
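To put the four tables on a common scale, the sketch below computes the relative WER reduction of the better of our two models against openai/whisper-large-v3-turbo on each dataset. The numbers are copied from the tables above. The dictionary layout and function name are illustrative choices, not part of any published tooling.

```python
# WER (%) values copied from the comparison tables above.
RESULTS = {
    "SL99":   {"base": 22.93, "public": 18.44, "private": 11.53},
    "SL31":   {"base": 21.62, "public": 16.97, "private": 8.81},
    "Fleurs": {"base": 12.73, "public": 8.66,  "private": 9.93},
    "Parla":  {"base": 10.23, "public": 3.52,  "private": 4.59},
}


def relative_reduction(baseline_wer: float, model_wer: float) -> float:
    """Fraction of the baseline's errors that the model removes."""
    return (baseline_wer - model_wer) / baseline_wer


for dataset, wer in RESULTS.items():
    best = min(wer["public"], wer["private"])
    reduction = relative_reduction(wer["base"], best)
    print(f"{dataset}: best SL model cuts WER by {reduction:.1%}")
```

On these figures the reduction is roughly 50% on SL99, 59% on SL31, 32% on Fleurs, and 66% on Parla. The private model leads on the two internal sets and the public model leads on the two public ones.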

Free Public Model

We've released a fine-tuned model trained on the Croatian Parliament (Parla) dataset, available freely on Hugging Face.

View on Hugging Face
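As a starting point, the public model can likely be loaded like any other Whisper checkpoint on Hugging Face. The sketch below assumes the `transformers` library, a PyTorch backend, and `ffmpeg` are installed. The helper names and the audio path are hypothetical, and the settings (30-second chunks, forced Croatian) are reasonable defaults rather than a recommended configuration.

```python
MODEL_ID = "GoranS/whisper-large-v3-turbo-hr-parla"


def build_transcriber(model_id: str = MODEL_ID):
    """Create a Hugging Face ASR pipeline for the public model."""
    # Imported here so this file can be loaded without transformers installed.
    from transformers import pipeline

    # chunk_length_s lets the pipeline handle audio longer than 30 seconds.
    return pipeline(
        "automatic-speech-recognition",
        model=model_id,
        chunk_length_s=30,
    )


def transcribe(audio_path: str, transcriber=None) -> str:
    """Transcribe one audio file to Croatian text."""
    if transcriber is None:
        transcriber = build_transcriber()
    result = transcriber(
        audio_path,
        # Force Croatian transcription instead of relying on language detection.
        generate_kwargs={"language": "croatian", "task": "transcribe"},
    )
    return result["text"].strip()


if __name__ == "__main__":
    # "call_recording.wav" is a placeholder path, not a file we ship.
    print(transcribe("call_recording.wav"))
```

The first call downloads the model weights, so when transcribing many files it is better to build the transcriber once and pass it in.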

Need Croatian Transcription or Voice to Text for Your Project?

Get access to our state-of-the-art speech recognition and automatic transcription API.

hr-zag-1: latency < 10 ms

INFRASTRUCTURE

Premium Infrastructure & Performance
Powered by Omonia & Exoscale Zagreb

Our AI services run on enterprise-grade infrastructure located directly in Zagreb, hosted by Omonia and A1 Croatia (Exoscale hr-zag-1). This ensures ultra-low latency for real-time voice applications and full data sovereignty.

Ultra-Low Latency

<10ms round-trip time in Croatia via Omonia's optimized BGP routing.

Data Sovereignty

All data stays in Croatia. GDPR-compliant processing on local servers.

10Gbit+ Connectivity

Multiple redundant 10Gbit upstreams ensure uninterrupted service.

Tier 3 Reliability

N+1 redundancy on power and cooling for 99.99% uptime.
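To make the uptime figure concrete, the short calculation below converts an availability percentage into the downtime it allows. It is plain arithmetic on the 99.99% figure above, assuming a 365-day year. The function name is illustrative.

```python
def allowed_downtime_minutes(uptime_percent: float, days: float = 365.0) -> float:
    """Minutes of downtime permitted over `days` at the given uptime."""
    unavailable_fraction = 1.0 - uptime_percent / 100.0
    return unavailable_fraction * days * 24 * 60


per_year = allowed_downtime_minutes(99.99)
per_month = allowed_downtime_minutes(99.99, days=30)
print(f"99.99% uptime allows about {per_year:.1f} minutes of downtime per year")
print(f"or about {per_month:.1f} minutes in a 30-day month")
```

At 99.99% this works out to a little under 53 minutes per year, or about 4 minutes in a 30-day month.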

Get in touch

We're here to help you transform the way you connect with your customers. Whether you have questions, need a demo, or are ready to get started, our team is just a message away. Let's work together to create seamless, intelligent interactions that drive success.