Qwopus3.6-35B-A3B-v1 MLX 6bit

MLX 6-bit conversion of Jackrong/Qwopus3.6-35B-A3B-v1.

Converted directly from the original HF bf16 safetensors. Not from GGUF, not chained from another quant.

The full MLX ladder

Variant Repo Disk ~Min unified RAM Role
MLX bf16 Qwopus3.6-35B-A3B-v1-MLX-bf16 69.3 GB ~72 GB Reference
MLX 8bit Qwopus3.6-35B-A3B-v1-MLX-8bit 36.8 GB ~40 GB Near-lossless
MLX 6bit (this repo) this 28.2 GB ~32 GB Quality / size middle
MLX 4bit Qwopus3.6-35B-A3B-v1-MLX-4bit 19.5 GB ~22 GB Standard daily-use tier
MLX 3bit Qwopus3.6-35B-A3B-v1-MLX-3bit 15.2 GB ~18 GB Smallest practical

Collection: Qwopus3.6-35B-A3B-v1 MLX

Use

pip install mlx-lm
mlx_lm.generate --model zaydiscold/Qwopus3.6-35B-A3B-v1-MLX-6bit \
  --prompt "Explain quantum entanglement in one paragraph" --max-tokens 200

Conversion

python -m mlx_lm convert \
  --hf-path Jackrong/Qwopus3.6-35B-A3B-v1 \
  --mlx-path ./Qwopus3.6-35B-A3B-v1-MLX-6bit \
  -q --q-bits 6

Notes

  • GGUF Q4_K_M is a llama.cpp format. MLX has no literal Q4_K_M — MLX 4-bit is the practical peer at a different quantizer.
  • See the sibling repos for other bit budgets.

Credits

Downloads last month
168
Safetensors
Model size
35B params
Tensor type
BF16
·
U32
·
MLX
Hardware compatibility
Log In to add your hardware

6-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for zaydiscold/Qwopus3.6-35B-A3B-v1-MLX-6bit

Quantized
(20)
this model

Collection including zaydiscold/Qwopus3.6-35B-A3B-v1-MLX-6bit