API Reference
Create Model
Create Model allows you to create a copy of an existing model (called the parent model). Once created, your new model is independent and can be customized through training. You can experiment with different training parameters, datasets, and fine-tuning techniques without any risk to the original parent model.
Note: You can only train models that you have created yourself. The standard base models (like Llama-3.1-8b or DeepSeek-R1-8b) are read-only, so you must first create a copy of these models before you can train them.
Parameters
name
The name for your new model. This will be used to identify the model in API calls and the model list.
Note: Model names must be unique within your account.
description
A description of what this model is intended for or what makes it special. This helps you organize and identify your models.
parent_model
The name of the parent model to base this new model on. This must be an existing model available in your model list.
Common parent models include:
- Llama-3.1-8b - Meta's Llama 3.1 model
- DeepSeek-R1-8b - DeepSeek's R1 model
- Any model you have previously created
rank
The rank parameter for LoRA (Low-Rank Adaptation) fine-tuning. This controls the dimensionality of the low-rank matrices used in the adaptation.
- Lower values (4-16) - Fewer parameters, faster training, less memory usage, but potentially less expressive
- Higher values (32-128) - More parameters, slower training, more memory usage, but potentially more expressive
Note: If not specified, the model will use a default rank value based on the parent model architecture.
alpha
The alpha parameter for LoRA scaling. This controls the scaling factor applied to the LoRA weights.
Typically set as a multiple of rank (e.g., if rank=16, alpha=32). A common rule of thumb is alpha = 2 * rank.
Note: If not specified, the model will use a default alpha value based on the rank.
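The interplay between rank and alpha can be sketched as follows. This is an illustrative pure-Python example of the conventional LoRA formulation, not this API's internal implementation; `lora_param_count` is a hypothetical helper, not part of this API.

```python
# Hypothetical helper (not part of this API): estimate the extra trainable
# parameters LoRA adds to one d_out x d_in weight matrix at a given rank.
# LoRA replaces a dense d_out x d_in update with two low-rank factors,
# B (d_out x rank) and A (rank x d_in), scaled by alpha / rank.
def lora_param_count(d_out: int, d_in: int, rank: int) -> int:
    return rank * (d_out + d_in)

full = 512 * 512  # a dense update to a 512x512 matrix: 262,144 parameters
for r in (4, 16, 32, 128):
    added = lora_param_count(512, 512, r)
    print(f"rank={r:>3}: {added:>7} LoRA params ({100 * added / full:.1f}% of dense)")
```

This is why lower ranks train faster and use less memory: at rank 16 the adapter for a 512x512 matrix has only about 6% as many parameters as a dense update. The alpha value does not change the parameter count; it only scales the adapter's contribution by alpha / rank, which is why alpha is usually set relative to rank.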
dropout
The dropout rate applied during training. Dropout helps prevent overfitting by randomly setting a fraction of input units to 0 during training.
- 0.0 - No dropout (may lead to overfitting)
- 0.1 - Light regularization (recommended for most cases)
- 0.2-0.5 - Stronger regularization (use when overfitting is a concern)
Note: If not specified, the model will use a default dropout value.
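As a sketch of what the dropout rate means, here is the conventional "inverted dropout" formulation (illustrative only; this API's training internals are not exposed): each unit is zeroed with probability p, and survivors are scaled by 1 / (1 - p) so the expected activation is unchanged.

```python
import random

def dropout(values, p, rng):
    # Zero each value with probability p; rescale survivors by 1 / (1 - p)
    # so the expected output matches the input (inverted dropout).
    if p == 0.0:
        return list(values)
    keep = 1.0 - p
    return [v / keep if rng.random() >= p else 0.0 for v in values]

rng = random.Random(0)
out = dropout([1.0] * 10000, p=0.1, rng=rng)
zeroed = sum(1 for v in out if v == 0.0)
print(f"zeroed fraction: {zeroed / len(out):.3f}")  # close to 0.1
```

At p=0.1, roughly 10% of units are dropped each step, which is the light regularization recommended above.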
learning_rate
The initial learning rate for training this model. This can be overridden when you actually train the model, but setting it here provides a default value.
See the Train endpoint documentation for more details on learning rates.
Note: If not specified, the model will use a default learning rate based on the parent model.
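Putting the parameters together, a request body can be assembled in Python as below. This is a sketch: `create_model_payload` is a hypothetical helper, and the field names are taken from the curl example on this page; any field omitted falls back to its server-side default.

```python
import json

def create_model_payload(name, parent_model, description="", **options):
    # Required fields plus whichever optional tuning fields are supplied.
    payload = {"name": name, "parent_model": parent_model,
               "description": description}
    allowed = {"rank", "alpha", "dropout", "learning_rate"}
    payload.update({k: v for k, v in options.items() if k in allowed})
    return payload

body = create_model_payload(
    "MyCustomModel", "Llama-3.1-8b",
    description="A fine-tuned model for customer support tasks",
    rank=16, alpha=32, dropout=0.1, learning_rate=1e-5)
print(json.dumps(body, indent=2))
```

The resulting JSON would then be POSTed to the models endpoint with an Authorization header, as shown in the curl example on this page.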
Output
ModelInfo
Contains information about a model, including its configuration, training parameters, and metadata.
name
The model name, given when the model was created.
id
The model id, generated when the model is created.
description
The model description, set when the model was created.
read_only
Whether the model is read-only. Only models that you have created (that is, models that are not read-only) can be trained.
parent_model
The name of the parent model, i.e. the base model this model is derived from.
revision
The revision number of the current model. It starts at 0 and is incremented by 1 for every training run.
last_modified
Timestamp of the latest training.
default_model
Whether this is the default model, i.e. the one selected at startup.
temperature
Default temperature setting; this is the value used in getResponse if no other value is specified.
Controls the randomness and creativity of the model's responses. Lower values make output more focused and deterministic, while higher values increase randomness and creativity.
- 0.0 - Most deterministic, repeatable responses
- 1.0 - Balanced creativity and coherence (recommended for most use cases)
- 2.0 - Maximum randomness and creativity
Note: For tasks requiring consistency (like data extraction or classification), use lower values (0.0-0.3). For creative tasks (like brainstorming or storytelling), higher values (0.7-1.5) work better.
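The effect of temperature can be sketched with the conventional temperature-scaled softmax (illustrative only, not this server's internal sampler): logits are divided by the temperature before being converted to probabilities.

```python
import math

def softmax_with_temperature(logits, temperature):
    if temperature == 0.0:
        # Degenerate case: greedy decoding -- all mass on the top logit.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]
for t in (0.2, 1.0, 2.0):
    print(t, [round(p, 3) for p in softmax_with_temperature(logits, t)])
```

Low temperatures concentrate probability mass on the top token (deterministic output); high temperatures flatten the distribution (more random output).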
top_p
Default top_p setting; this is the value used in getResponse if no other value is specified.
Also known as "nucleus sampling," this parameter controls the diversity of responses by limiting the model to consider only the most probable tokens whose cumulative probability reaches the specified threshold.
- 0.1 - Very focused, only highly probable tokens
- 0.5 - Moderately diverse output
- 1.0 - Considers all tokens based on their probability
Note: It's generally recommended to adjust either temperature OR top_p, but not both simultaneously. When top_p is less than 1.0, the model samples from the smallest set of tokens whose cumulative probability exceeds the threshold.
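The truncation step described in the note can be sketched as follows (the conventional nucleus-sampling definition, illustrative only): keep the smallest set of highest-probability tokens whose cumulative probability reaches the threshold, then renormalize.

```python
def top_p_filter(probs, p):
    # Walk tokens from most to least probable, stopping once the
    # cumulative probability reaches the threshold p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    # Renormalize the surviving tokens so they sum to 1.
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}

probs = [0.5, 0.3, 0.15, 0.05]
print(top_p_filter(probs, 0.5))  # only the most probable token survives
print(top_p_filter(probs, 0.9))  # the top three tokens survive
```

With p=0.5 only the first token (probability 0.5) is needed to reach the threshold; with p=0.9 the top three tokens (cumulative 0.95) are kept and the tail is discarded.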
curl http://localhost:45678/v1/models \
  -X POST \
  -H "Authorization: Bearer $TIGER_API_KEY" \
  -H 'Content-Type: application/json' \
  -d "{
    \"name\": \"MyCustomModel\",
    \"description\": \"A fine-tuned model for customer support tasks\",
    \"parent_model\": \"Llama-3.1-8b\",
    \"rank\": 16,
    \"alpha\": 32,
    \"dropout\": 0.1,
    \"learning_rate\": 0.00001
  }"

Response:

{
"beta" : 0.8,
"default_model" : false,
"description" : "A fine-tuned model for customer support tasks",
"epsilon" : 0.2,
"id" : "mycustommodel",
"last_modified" : "2025-12-28T15:23:15.632Z",
"learning_rate" : 0.00001,
"learning_steps" : 5,
"name" : "MyCustomModel",
"parent_model" : "Llama-3.1-8b",
"read_only" : false,
"revision" : 0,
"temperature" : 0.7,
"top_p" : 0.95
}