Small Language Model: Mixture of Experts (3B Parameters)

Built by SKT AI LABS, India
SKT 3B-MoE is a compact Small Language Model (SLM) built using ST-X-0, which is derived from Mixtral for better MoE stability. It delivers efficient, intelligent responses while maintaining a small footprint.
| Property | Value |
|---|---|
| Architecture | Mixture of Experts (MoE) |
| Total Parameters | ~3B |
| Active Parameters | ~1.1B (2 experts/token) |
| Hidden Size | 2048 |
| Number of Experts | 4 |
| Context Length | 8K tokens |
| Training Tokens | 40B |
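
The table lists 4 experts with 2 active per token. The card does not describe the router, so the following is only a minimal sketch of one common top-k gating scheme (softmax over the router logits, keep the k largest, renormalize). The function name `top_k_route` and the example logits are hypothetical and are not taken from the model's code.

```python
import math

def top_k_route(gate_logits, k=2):
    """Return [(expert_index, weight), ...] for the k highest-scoring experts."""
    # Softmax over all experts (subtract the max for numerical stability).
    m = max(gate_logits)
    exps = [math.exp(x - m) for x in gate_logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Keep the k most probable experts, highest first.
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    # Renormalize so the kept weights sum to 1.
    kept = sum(probs[i] for i in top)
    return [(i, probs[i] / kept) for i in top]

# Hypothetical router logits for one token over 4 experts.
routes = top_k_route([0.1, 2.0, -1.0, 1.0], k=2)
print(routes)
```

Under this scheme each token's output is the weighted sum of only the selected experts' outputs, which is why the active parameter count is lower than the total.
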

| Capability | Description |
|---|---|
| Bilingual | English & Hindi |
| Basic Coding | Python, logic, algorithms |
| Reasoning | Logical thinking, problem solving |
| Creative Writing | Stories, poems, roleplay |
| Knowledge QA | General knowledge, facts |
| Personality | Friendly, helpful, cute |
pip install transformers accelerate torch
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load the model in half precision; device_map="auto" places it on the available device(s)
model = AutoModelForCausalLM.from_pretrained(
    "sKT-Ai-Labs/SKT-ST-X-0-3B",
    device_map="auto",
    torch_dtype=torch.float16,
)
tokenizer = AutoTokenizer.from_pretrained("sKT-Ai-Labs/SKT-ST-X-0-3B")

# Chat: wrap the prompt in the user/assistant template
prompt = "What is quantum physics?"
formatted = f"<|user|>\n{prompt}\n<|assistant|>\n"
# Move inputs to wherever device_map placed the model instead of hard-coding "cuda"
inputs = tokenizer(formatted, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
# Keep only the text after the last assistant tag
print(response.split("<|assistant|>")[-1].strip())
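
The single-turn template and the reply extraction used in the snippet above can be wrapped in two small helpers. This is a sketch only: `format_prompt` and `extract_reply` are hypothetical names, and the card does not document a multi-turn format, so only the single-turn case is covered.

```python
USER_TAG = "<|user|>"
ASSISTANT_TAG = "<|assistant|>"

def format_prompt(prompt: str) -> str:
    """Wrap a user message in the single-turn template shown above."""
    return f"{USER_TAG}\n{prompt}\n{ASSISTANT_TAG}\n"

def extract_reply(decoded: str) -> str:
    """Return the text after the last assistant tag, stripped of whitespace."""
    return decoded.split(ASSISTANT_TAG)[-1].strip()

formatted = format_prompt("Explain quantum computing")
# Simulate a decoded generation: the prompt followed by the model's text.
reply = extract_reply(formatted + "Quantum computing uses qubits.\n")
print(reply)
```
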
Q: Write a short story about a cat
A: Once upon a time, there was a little brown cat named Jake.
He was very small but very brave. One day, Jake saw a beautiful bird...
Q: Explain quantum computing
A: Quantum computing is a type of computing that uses quantum mechanics
to process information. Unlike classical computers that use bits (0 or 1),
quantum computers use qubits that can be both 0 and 1 simultaneously...
from peft import LoraConfig, get_peft_model

lora_config = LoraConfig(
    r=8,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj", "k_proj", "o_proj"],
    lora_dropout=0.1,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # Only ~0.06% trainable!
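
To see why the trainable fraction is so small, the adapter size can be estimated by hand: a LoRA adapter of rank `r` on a `d_in × d_out` linear layer adds `r * (d_in + d_out)` parameters. The sketch below assumes all four attention projections are square `hidden × hidden` matrices, which the card does not confirm (grouped-query attention would make `k_proj` and `v_proj` smaller). The helper `lora_params` is hypothetical, and the layer count is not stated in the card, so only the per-layer figure is computed.

```python
def lora_params(d_in: int, d_out: int, r: int) -> int:
    """Parameters added by one LoRA adapter: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)

hidden = 2048  # Hidden Size from the spec table
rank = 8       # r in the LoRA config above

# Assumption: q_proj, k_proj, v_proj and o_proj are all hidden x hidden.
per_layer = 4 * lora_params(hidden, hidden, rank)
print(per_layer)  # 131072 trainable parameters per transformer layer
```

Multiplying `per_layer` by the model's number of layers gives the total adapter size under these assumptions.
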
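# Requires the bitsandbytes package: pip install bitsandbytes
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
import torch

# Store weights in 4-bit; run matrix multiplications in float16
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(
    "sKT-Ai-Labs/SKT-ST-X-0-3B",
    quantization_config=quant_config,
    device_map="auto",
)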
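
As a rough guide to what 4-bit loading saves, weight memory can be approximated as parameter count times bits per parameter. This is a back-of-the-envelope sketch, assuming the ~3B total from the spec table and ignoring quantization constants, activations, and the KV cache, so real usage will be higher. `weight_memory_gb` is a hypothetical helper.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9

total_params = 3e9  # ~3B total parameters, from the spec table

fp16_gb = weight_memory_gb(total_params, 16)
int4_gb = weight_memory_gb(total_params, 4)
print(f"fp16: {fp16_gb:.1f} GB, 4-bit: {int4_gb:.1f} GB")
# fp16: 6.0 GB, 4-bit: 1.5 GB
```

All experts must be resident in memory even though only two run per token, so the estimate uses the total parameter count rather than the active one.
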
Both the code repository and the model weights are released under the Apache-2.0 License.
See License
If you have any questions, please reach out at support@sktailabs.in.
@misc{SKT-ST-X-0-3B,
  author = {SKT AI LABS, India},
  title = {SKT-ST-X-0-3B: A Compact Mixture of Experts Model},
  year = {2026},
  publisher = {Hugging Face},
  url = {https://huggingface.co/sKT-Ai-Labs/SKT-ST-X-0-3B}
}