Pricing

SiliconCloud

Fine-tuning

Model | Price
Qwen/Qwen2.5-72B-Instruct | $5.9/M tokens
Qwen/Qwen2.5-7B-Instruct | $0.5/M tokens

*SiliconCloud charges based on the total number of tokens in your fine-tuning dataset.
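
As a rough illustration of the per-token billing described above, the following sketch estimates a fine-tuning charge from a dataset's token count. The two prices are copied from the table; the helper name `estimate_fine_tuning_cost` is hypothetical, and the sketch assumes the charge is simply price times tokens divided by one million. How epochs, rounding, or minimum charges are handled is not stated on this page.

```python
from decimal import Decimal

# USD per million tokens, copied from the fine-tuning price table.
FINE_TUNING_PRICE_PER_M_TOKENS = {
    "Qwen/Qwen2.5-72B-Instruct": Decimal("5.9"),
    "Qwen/Qwen2.5-7B-Instruct": Decimal("0.5"),
}


def estimate_fine_tuning_cost(model: str, dataset_tokens: int) -> Decimal:
    """Hypothetical helper: estimate the USD charge for one fine-tuning job.

    Assumes cost = (tokens in dataset / 1,000,000) * listed price.
    Actual billing rules (epochs, rounding) are not specified in the source.
    """
    if dataset_tokens < 0:
        raise ValueError("dataset_tokens must be non-negative")
    price = FINE_TUNING_PRICE_PER_M_TOKENS[model]  # KeyError for unlisted models
    return price * Decimal(dataset_tokens) / Decimal(1_000_000)


if __name__ == "__main__":
    # A 20M-token dataset on the 7B model, under the assumptions above.
    print(estimate_fine_tuning_cost("Qwen/Qwen2.5-7B-Instruct", 20_000_000))
```

`Decimal` is used instead of `float` so that prices such as 5.9 are represented exactly in the arithmetic.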

Deployment Service

Serverless Deployment

Model | Price
Base model | Playground
Fine-tuned model | 1.5 × the price of the corresponding base model
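
The 1.5× rule for fine-tuned models can be sketched as below. The base price is taken as an input because this page does not list serverless base-model prices; the function names are hypothetical, the example price of $0.40 is made up for illustration, and per-million-token billing for serverless inference is an assumption.

```python
from decimal import Decimal

# Multiplier applied to the base model's price, as stated in the table.
FINE_TUNED_MULTIPLIER = Decimal("1.5")


def fine_tuned_price(base_price_per_m_tokens: Decimal) -> Decimal:
    """Hypothetical helper: serverless price of a fine-tuned model."""
    return base_price_per_m_tokens * FINE_TUNED_MULTIPLIER


def serverless_cost(base_price_per_m_tokens: Decimal, tokens: int,
                    fine_tuned: bool = False) -> Decimal:
    """Estimate a usage charge, assuming per-million-token billing."""
    if tokens < 0:
        raise ValueError("tokens must be non-negative")
    price = base_price_per_m_tokens
    if fine_tuned:
        price = fine_tuned_price(price)
    return price * Decimal(tokens) / Decimal(1_000_000)


if __name__ == "__main__":
    base = Decimal("0.40")  # illustrative value only, not a listed price
    print(fine_tuned_price(base))
    print(serverless_cost(base, 3_000_000, fine_tuned=True))
```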

On-demand Deployment & Reserved Capacity

Prepaid Duration

GPU Type*

Price

Pay-as-you-go GPU instance

Compute Unit A

Compute Unit B

$0.35/h

Pay-as-you-go Inference Engine

Compute Unit A

Compute Unit B

$0.30/h

$0.23/h

$0.80/h

Reserved Capacity GPU instance with Inference Engine

Compute Unit A

Compute Unit B

$0.20/h

*Compute Unit A is lower cost; Compute Unit B is lower latency.

Software Subscription

Dedicated email support.

Software

Price

OneDiff / NVIDIA Hopper architecture GPUs

OneDiff / Other GPUs

*Current prices apply only to use outside mainland China.

Frequently Asked Questions

Can I try SiliconCloud before subscribing?

Can I try OneDiff or SiliconLLM before subscribing?

Can I get a refund if I cancel my subscription before it expires?

Accelerate AGI to Benefit Humanity
