Moonlight-16B-A3B-Instruct-abliterated

This is an abliterated version of moonshotai/Moonlight-16B-A3B-Instruct with reduced refusals.

Model Details

  • Base Model: moonshotai/Moonlight-16B-A3B-Instruct
  • Architecture: Mixture-of-Experts (MoE) - 16B total, 3B active
  • Modification: Abliteration (refusal direction removal)
  • Context Length: 8,192 tokens
  • Abliteration Tool: Bruno

Abliteration Results

| Metric        | Baseline | Post-Abliteration | Change |
|---------------|----------|-------------------|--------|
| Refusal Rate  | 100%     | 41%               | -59%   |
| MMLU Average  | 7.5%     | 7.9%              | +0.4%  |
| KL Divergence | N/A      | 8.94              | -      |
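For intuition, abliteration removes a "refusal direction" from the model's weights so activations can no longer be written along it. The following is a minimal toy sketch of that projection step, not the actual recipe used for this model; the matrix `W` and direction `r` are illustrative stand-ins.

```python
import torch

# Toy sketch of directional ablation: given a unit "refusal direction" r
# (in practice estimated from hidden-state differences between harmful and
# harmless prompts), project it out of a weight matrix W so that nothing
# the layer writes has a component along r.
torch.manual_seed(0)

hidden = 8
W = torch.randn(hidden, hidden)  # stand-in for an output projection matrix
r = torch.randn(hidden)
r = r / r.norm()                 # unit-norm refusal direction

# W' = (I - r r^T) W removes the component of W's output along r
W_ablit = W - torch.outer(r, r) @ W

# The ablated matrix no longer writes anything in the r direction
assert torch.allclose(r @ W_ablit, torch.zeros(hidden), atol=1e-5)
```

Applying this projection to every layer that writes into the residual stream is what drives the refusal rate down while leaving other capabilities (e.g. the MMLU scores above) largely intact.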

Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "quanticsoul4772/Moonlight-16B-A3B-Instruct-abliterated"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)

messages = [{"role": "user", "content": "Hello!"}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512, temperature=0.7, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Requirements

  • Python 3.10+
  • transformers >= 4.51.0
  • torch >= 2.1.0
  • trust_remote_code=True (required)

Hardware Requirements

| Precision | VRAM Needed |
|-----------|-------------|
| BF16/FP16 | ~32GB       |
| 8-bit     | ~16GB       |
| 4-bit     | ~8GB        |

Disclaimer

This model has been modified to reduce refusals. Use responsibly and in accordance with applicable laws and regulations.
