The GPU sharing scheduling strategy lets multiple tasks share a single GPU, supporting up to 64 concurrent tasks per GPU. Allocation and isolation are supported at fine granularity, and users can dynamically request GPU resources by GPU memory size.
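The core of memory-based GPU sharing is a packing decision: place each task on a GPU that has enough free memory and has not hit the per-GPU task limit. The sketch below illustrates that idea in plain Python; it is a hypothetical simplification, not AIStation's actual scheduler, and all names (`GPU`, `schedule`) are invented for illustration.

```python
from dataclasses import dataclass, field

MAX_TASKS_PER_GPU = 64  # sharing limit described above


@dataclass
class GPU:
    gpu_id: int
    total_mem_mb: int
    free_mem_mb: int = field(init=False)
    tasks: list = field(default_factory=list)

    def __post_init__(self):
        self.free_mem_mb = self.total_mem_mb


def schedule(gpus, task_id, mem_mb):
    """Place a task on the first GPU with enough free memory and
    fewer than MAX_TASKS_PER_GPU tasks; return the chosen GPU id,
    or None if no GPU can host the request."""
    for gpu in gpus:
        if gpu.free_mem_mb >= mem_mb and len(gpu.tasks) < MAX_TASKS_PER_GPU:
            gpu.free_mem_mb -= mem_mb
            gpu.tasks.append((task_id, mem_mb))
            return gpu.gpu_id
    return None
```

For example, on two 16 GB GPUs, a 4 GB task lands on GPU 0, a subsequent 14 GB task no longer fits there and goes to GPU 1, and a further 13 GB request is rejected because neither GPU has enough free memory left.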
The "zero-copy" transmission, multi-threaded fetching, incremental data update, and affinity scheduling strategies for training data greatly shorten the data caching cycle and improve the efficiency of model development and training.
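Of these strategies, multi-threaded fetching is the easiest to illustrate: I/O-bound reads from remote storage overlap instead of running one by one. The sketch below shows only that one idea with a thread pool; `fetch` is a hypothetical stand-in for a remote-storage read, and zero-copy transfer and affinity scheduling are not modeled.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch(item_id):
    # Stand-in for a remote-storage read; a real data cache would
    # pull the sample from shared storage into the local cache.
    return f"data-{item_id}"


def prefetch_all(item_ids, workers=8):
    """Fetch many training samples concurrently so slow, I/O-bound
    reads overlap; results come back in the original order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fetch, item_ids))
```

With real network reads the wall-clock win comes from overlapping the waits, so throughput scales roughly with the worker count until storage bandwidth saturates.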
Supports extending training to distributed mode via MPI in TensorFlow, PyTorch, and other mainstream frameworks, and provides standard UI operations, so users can submit distributed training jobs with only simple GPU resource and training script configurations.
Provides fault tolerance for training tasks, enabling the platform to keep tasks training continuously and to reduce recovery time in the event of a server crash or GPU failure.
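A common way to bound recovery time after a crash is periodic checkpointing: on restart, training resumes from the last saved step instead of from scratch. The sketch below is a minimal, hypothetical illustration of that pattern (the checkpoint file name and `train` function are invented), not AIStation's actual recovery mechanism.

```python
import json
import os

CKPT = "checkpoint.json"  # hypothetical checkpoint path


def train(total_steps):
    """Resume from the last checkpoint if one exists, then checkpoint
    after every step so a crash loses at most one step of work.
    Returns the step the run resumed from."""
    start = 0
    if os.path.exists(CKPT):
        with open(CKPT) as f:
            start = json.load(f)["step"]
    for step in range(start, total_steps):
        # ... one training step would run here ...
        with open(CKPT, "w") as f:
            json.dump({"step": step + 1}, f)
    return start
```

Real systems checkpoint less often (every N steps or minutes) to trade checkpoint overhead against the amount of work lost on failure.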
The ResNet-50 benchmark test shows that AI training efficiency with the AIStation data cache strategy improves significantly as the number of concurrent tasks grows; with 70 concurrent tasks, model training efficiency improved by 72%.
For distributed ResNet-50 training, as task concurrency increases, the GPU speedup ratio of multi-card distributed training with AIStation improves by up to 90%.