Pakistan has long been on the receiving end of AI, using fancy global models rather than helping build them. While tools like ChatGPT changed how the world writes, studies, and gets stuff done, they mostly assume you know English and come from a Western bubble. Qalb LLM is here to flip that script entirely.
The driving force behind Qalb LLM, AI engineer Taimoor Hassan, sat down exclusively with TechJuice to talk about his Urdu model. He said:
“I had the idea of making Pakistan’s own LLM for a while now. Back in my fourth undergraduate semester, during the initial ChatGPT hype, I experienced the product’s consumer side and, as an entrepreneur, saw the difficulties faced by students and other people using it in a region where English is not the first language.
“It was just an idea until a few months ago, when I started working on my research paper on an SLM (small language model) centric playbook and how different layers of an SLM can help different use cases in different regions and languages.”
That everyday hassle planted the seed for Qalb: a foundational model made to actually understand and respond like someone from Pakistan (or India), with our language, our way of thinking, and our culture. Things were not as smooth as Hassan hoped.
“What makes this special for me is that this wasn’t backed by funding. No grants. No sponsors. Just my own savings, time and good friends. All built to give back to the community, open source, and keep it accessible for everyone,” he shared.
While he was working on his research paper about small language models and how they could better serve specific regions and languages, Hassan’s idea moved from “what if” to “let’s actually do this.”
Hassan shared that he spent months figuring out the tools, the approach, and the whole roadmap to pull off a real foundation model for Pakistan. Then he rounded up his old undergrad roommates (now pursuing master’s degrees in Germany) and they got to work. There was no grant or sponsorship; just Hassan’s own savings, late nights, and friends who believed in the mission. He said his idea was to finish an open-source LLM, and “give back to the community… keep it accessible for everyone.”
Most big models gobble up whatever junk is on the internet. Qalb’s team went the careful route instead. They hand-picked high-quality data across seven or eight key areas, including poetry, official government docs, news, blogs, video transcripts, and textbooks, each broken down into proper sub-categories.
As the creator put it simply:
“If you want to make sure a CSS candidate doesn’t just get knowledge but gets wisdom as well and is top of the line, he/she should know everything from the basics to the top.”
That careful curation pays off big time, as Qalb hallucinates way less and comes across as “saleesi,” i.e., balanced, thoughtful, not cocky or over-the-top.
Right now, Qalb is a text-generation model. Hassan says “think early GPT-1 level”. He adds that later phases will build on this foundation gradually, bringing in a little of everything over time. Under the hood, it’s a continued pre-training setup with right-to-left tokenization that plays nice with Nastaliq script and Urdu fonts. The team stuck with proven, professionally tested foundations instead of risky hacks, so it stays stable.
“As Qalb is a continuous pre-training model, it uses the RTL [right-to-left] handling and the font structure, style, and features of the base models, as that is a professional and pre-tested approach,” said Hassan.
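To make the right-to-left point concrete, here is a minimal Python sketch of how Urdu text looks to software before any tokenizer touches it. It is not Qalb’s tokenizer, which has not been described in detail; the helper names `bidi_classes` and `is_rtl` are hypothetical and exist only for this illustration. The sketch relies on the standard library’s `unicodedata.bidirectional`, which reports each character’s Unicode bidirectional class (`AL` for Arabic-script letters, `L` for Latin letters).

```python
import unicodedata


def bidi_classes(text):
    """Return (character, Unicode bidi class) pairs in stored (logical) order."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in text]


def is_rtl(text):
    """Return True if the first strongly directional character is right-to-left."""
    for ch in text:
        cls = unicodedata.bidirectional(ch)
        if cls in ("R", "AL"):  # Hebrew-style RTL or Arabic-script letter
            return True
        if cls == "L":  # strong left-to-right character
            return False
    return False  # no strong character found; treat as not RTL


# "Qalb" written in Urdu script: qaf, lam, beh, stored in reading order.
qalb_urdu = "\u0642\u0644\u0628"

for ch, cls in bidi_classes(qalb_urdu):
    print(f"U+{ord(ch):04X} {cls}")

print(is_rtl(qalb_urdu), is_rtl("Qalb"))
```

The takeaway is that Urdu text is stored in reading order and only drawn right to left at display time, so a base model that already handles Arabic-script text can be reused without reversing anything by hand.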
Internal tests shared with TechJuice show killer numbers: Qalb excels in Translation (94.41/100), Classification (96.38/100), and Sentiment Analysis (95.79/100), demonstrating superior linguistic precision and semantic understanding. It also shows strong performance in Reasoning (88.59/100) and Question Answering (80.40/100), with notable improvements over old base models. The model’s balanced performance across diverse tasks indicates robust language understanding beyond Urdu-specific capabilities.
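For a quick sense of how those five figures sit together, the short Python sketch below tabulates the reported scores and pulls out the strongest and weakest task. The unweighted average is an assumption made here purely for illustration; no overall score was reported, and the variable names are hypothetical.

```python
from statistics import mean

# Scores out of 100, as reported from the internal tests.
scores = {
    "Translation": 94.41,
    "Classification": 96.38,
    "Sentiment Analysis": 95.79,
    "Reasoning": 88.59,
    "Question Answering": 80.40,
}

# Assumption: every task counts equally (simple unweighted mean).
average = round(mean(scores.values()), 2)

best = max(scores, key=scores.get)
weakest = min(scores, key=scores.get)
spread = round(scores[best] - scores[weakest], 2)

print(f"Unweighted average: {average}")
print(f"Strongest task: {best} ({scores[best]})")
print(f"Weakest task: {weakest} ({scores[weakest]})")
print(f"Gap between them: {spread}")
```

On these numbers the unweighted average works out to 91.11, with Classification on top and Question Answering trailing by 15.98 points.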
Like pretty much every solid LLM, Qalb is rolling out in stages. First, developer access to lock in a reliable foundation. Next, consumer-facing apps and tools. Hassan sums it up perfectly:
“All the LLMs in the world launch in two phases. It starts with the developer phase and then moves to the consumer end. We have successfully completed the first phase, and that gives us a baseline to start working on consumer-end products.
“As we see all over the world, with API access available to the general public and developers, integrating AI into day-to-day applications has become essential. Technology exists to lift the burden of work and make life easier. For example, boats started with rowing and ended up as giants running on engines, going from thousands of shipmen rowing to just one giant engine.”
So Qalb gives local industry every chance to become AI-driven. This base can unlock whole new lanes next, e.g. TTS (text-to-speech), STT (speech-to-text), voice agents, local-language tools, and more.
