Product

En

Log in

Product

En

Log in

SiliconCloud, Production Ready
Cloud with Low Cost

Teaming up with excellent open-source foundation models.

Product

En

Log in

SiliconCloud, Production Ready
Cloud with Low Cost

Teaming up with excellent open-source foundation models.

Top-quality model services

01.

Chat

SiliconCloud delivers efficient, user-friendly, and scalable LLM models, with an out-of-the-box inference acceleration capability, including Llama3, Mixtral, Qwen, Deepseek, etc.

02.

Image

SiliconCloud encompasses a diverse range of text-to-image and text-to-video models, such as SDXL, SDXL lightning, photomaker, instantid, and so on.

01.

Chat

SiliconCloud delivers efficient, user-friendly, and scalable LLM models, with an out-of-the-box inference acceleration capability, including Llama3, Mixtral, Qwen, Deepseek, etc.

02.

Image

SiliconCloud encompasses a diverse range of text-to-image and text-to-video models, such as SDXL, SDXL lightning, photomaker, instantid, and so on.

Serverless GenAI services

Serverless GenAI services

01.

Chat

SiliconCloud delivers efficient, user-friendly, and scalable LLM models, with an out-of-the-box inference acceleration capability, including Llama3, Mixtral, Qwen, Deepseek, etc.

02.

Image

03.

more

01.

Chat

SiliconCloud delivers efficient, user-friendly, and scalable LLM models, with an out-of-the-box inference acceleration capability, including Llama3, Mixtral, Qwen, Deepseek, etc.

02.

Image

03.

more

One-Stop: From Fine-Tune to Deploying

One-Stop: From Fine-Tune to Deploying

designed for large-scale model fine-tuning and deploying. Through the platform, users can quickly and seamlessly deploy custom models as services and fine-tune them based on the data uploaded.

designed for large-scale model fine-tuning and deploying. Through the platform, users can quickly and seamlessly deploy custom models as services and fine-tune them based on the data uploaded.

designed for large-scale model fine-tuning and deploying. Through the platform, users can quickly and seamlessly deploy custom models as services and fine-tune them based on the data uploaded.

Blazing fsat model inference

Blazing fsat model inference
Blazing fsat model inference
Time latency of LLM is reduced by up to 2.7 times
Time latency of LLM is reduced by up to 2.7 times
Time latency of LLM is reduced by up to 2.7 times
Speed of text to image is increased by 3 times
Speed of text to image is increased by 3 times
Speed of text to image is increased by 3 times

Auto-scaling on demand

Auto-scaling on demand
Auto-scaling on demand

1.

1.

Create an auto-scaling group which contains a collection of SiliconCloud instances.

Create an auto-scaling group which contains a collection of SiliconCloud instances.

2.

2.

Specify minimum and maximum numbers of instance in that group.

Specify minimum and maximum numbers of instance in that group.

1.

Create an auto-scaling group which contains a collection of SiliconCloud instances.

2.

Specify minimum and maximum numbers of instance in that group.

3.

Specify desired capacity and auto-scaling policies.

3.

3.

4.

Specify desired capacity and auto-scaling policies.

Specify desired capacity and auto-scaling policies.

Created successfully. The platform will automatically scales the service on demand.

4.

4.

Created successfully. The platform will automatically scales the service on demand.

Created successfully. The platform will automatically scales the service on demand.

Fine-tuning to deployment one-stop service

Fine-tuning to deployment one-stop service
Fine-tuning to deployment one-stop service

Data Upload

Data Upload

Data Upload

Build a suitable dataset and upload it for creating fine-tuning jobs. The data set consists of a single JSONL file, where each line is a separate training example.

Build a suitable dataset and upload it for creating fine-tuning jobs. The data set consists of a single JSONL file, where each line is a separate training example.

Build a suitable dataset and upload it for creating fine-tuning jobs. The data set consists of a single JSONL file, where each line is a separate training example.

Step.01→

Step.01→

Step.01→

Fine-tuning

Fine-tuning

Fine-tuning

Select the appropriate dataset and adjust the relevant parameters to improve the model effect and meet the customization needs.

Select the appropriate dataset and adjust the relevant parameters to improve the model effect and meet the customization needs.

Select the appropriate dataset and adjust the relevant parameters to improve the model effect and meet the customization needs.

Step.02→

Step.02→

Step.02→

Effect Evaluation

Effect Evaluation

Effect Evaluation

Upload the evaluation dataset. Evaluate the effect of the trained model, and choose tge best one for deployment.

Upload the evaluation dataset. Evaluate the effect of the trained model, and choose tge best one for deployment.

Upload the evaluation dataset. Evaluate the effect of the trained model, and choose tge best one for deployment.

Step.03→

Step.03→

Step.03→

Model Deploying

Model Deploying

Model Deploying

Deploy the fine-tuned model on the cloud platform and call it through APIs.

Deploy the fine-tuned model on the cloud platform and call it through APIs.

Deploy the fine-tuned model on the cloud platform and call it through APIs.

Step.04

Step.04

Step.04

Easy to use

Easy to use
Easy to use

from openai import OpenAI

client = OpenAI(api_key="YOUR_API_KEY", base_url="https://api.siliconflow.cn/v1")

response = client.chat.completions.create(

model='alibaba/Qwen1.5-110B-Chat',

messages=[

{'role': 'user', 'content': "抛砖引玉是什么意思呀"}

],

stream=True

)

for chunk in response:

print(chunk.choices[0].delta.content)

from openai import OpenAI

client = OpenAI(api_key="YOUR_API_KEY", base_url="https://api.siliconflow.cn/v1")

response = client.chat.completions.create(

model='alibaba/Qwen1.5-110B-Chat',

messages=[

{'role': 'user', 'content': "抛砖引玉是什么意思呀"}

],

stream=True

)

for chunk in response:

print(chunk.choices[0].delta.content)

from openai import OpenAI

client = OpenAI(api_key="YOUR_API_KEY", base_url="https://api.siliconflow.cn/v1")

response = client.chat.completions.create(

model='alibaba/Qwen1.5-110B-Chat',

messages=[

{'role': 'user', 'content': "抛砖引玉是什么意思呀"}

],

stream=True

)

for chunk in response:

print(chunk.choices[0].delta.content)

Model Inference

Model Inference
Model Inference

With just a single line of code, developers can seamlessly integrate the fastest model services from SiliconCloud.

With just a single line of code, developers can seamlessly integrate the fastest model services from SiliconCloud.

With just a single line of code, developers can seamlessly integrate the fastest model services from SiliconCloud.

Model Deploy

Model Deploy
Model Deploy

·


·


Upload your workflow and Download the callable Model Service API.

Upload your workflow and Download the callable Model Service API.

Upload your workflow and Download the callable Model Service API.

·


·


Reduce the chances of application downtime with auto scaling.

Reduce the chances of application downtime with auto scaling.

Reduce the chances of application downtime with auto scaling.

·

·

Accelerate your workflow as needed.

Accelerate your workflow as needed.

Accelerate your workflow as needed.

Multiple service modes
meet enterprise-level standardized delivery

Serverless Deployment

Built for developers

High-performance inference, industry-leading speed

Diverse models, covering multiple scenarios

Pay-as-you-go, per-token pricing

Serverless rate limits

On-demand Deployment

Enhanced for enterprises

Custom models tailored to your needs

Configurable strategies optimization

Isolated resources for high QoS

Custom enterprise rate limiting

Reserved Capacity

Enhanced for enterprises

Custom models tailored to your needs

Configurable strategies optimization

Isolated resources for high QoS

Custom enterprise rate limiting

Competitive Unit Pricing

Prioritize using the latest product features

Multiple service mode
meet enterprise-level standardized delivery

Serverless Deployment

Built for developers

High-performance inference, industry-leading speed

Diverse models, covering multiple scenarios

Pay-as-you-go, per-token pricing

Serverless rate limits

On-demand Deployment

Enhanced for enterprises

Custom models tailored to your needs

Configurable strategies optimization

Isolated resources for high QoS

Custom enterprise rate limiting

Reserved Capacity

Enhanced for enterprises

Custom models tailored to your needs

Configurable strategies optimization

Isolated resources for high QoS

Custom enterprise rate limiting

Competitive Unit Pricing

Prioritize using the latest product features

Multiple service mode
meet enterprise-level standardized delivery

Serverless Deployment

Built for developers

High-performance inference, industry-leading speed

Diverse models, covering multiple scenarios

Pay-as-you-go, per-token pricing

Serverless rate limits

On-demand Deployment

Enhanced for enterprises

Custom models tailored to your needs

Configurable strategies optimization

Isolated resources for high QoS

Custom enterprise rate limiting

Reserved Capacity

Enhanced for enterprises

Custom models tailored to your needs

Configurable strategies optimization

Isolated resources for high QoS

Custom enterprise rate limiting

Competitive Unit Pricing

Prioritize using the latest product features

Multiple service mode

meet enterprise-level standardized delivery

Serverless Deployment

Built for developers

High-performance inference, industry-leading speed

Diverse models, covering multiple scenarios

Pay-as-you-go, per-token pricing

Serverless rate limits



On-demand Deployment

Built For enterprises

Custom models tailored to your needs

Configurable strategies optimization

Isolated resources for high QoS

Custom enterprise rate limiting


Reserved Capacity

Enhanced for enterprises

Custom models tailored to your needs

Configurable strategies optimization

Isolated resources for high QoS

Custom enterprise rate limiting

Competitive Unit Pricing

Prioritize using the latest product features

OneDiff, High-performance
Image Generation Engine

Teaming up with excellent open-source foundation models.

En

Log in

En

Log in

SiliconCloud, Production Ready
Cloud with Low Cost