Create an automation project
A user starts using the service by creating an automation project. The Machine Learning Pipeline Automation API takes a user-supplied project name, a data table URI, a target variable, and other attributes. From these, the service creates an underlying analytics (VDMML) project, runs several pre-processing steps, builds a data analysis pipeline, and runs the pipeline to completion, with no further user intervention. The request body is structured in three sections.
- Key information of the automation project
  - Automation project ID (automatically created by the service, read-only)
  - Name of the project (a random string is appended to ensure uniqueness)
  - Description of the project
  - Project type (only "predictive" type is supported)
  - Data table URI
  - Project state
  - The pipeline build method (either "automatic" or "template")
- Automation project settings, grouped under "settings"
  - A properties bag through which a user can pass arbitrary key/value pairs, regardless of whether they are used. The properties currently used are as follows.
    - applyGlobalMetadata (a flag that indicates whether to apply global metadata during project creation. Default is true.)
    - autoRun (a flag that indicates whether to start the pipeline run automatically when the project is created. Default is true.)
    - numberOfModels (a positive integer that sets the maximum number of top models to return. Default is 5.)
    - consider (a list of machine learning algorithms, specified by ID, to additionally consider when generating the pipeline)
    - exclude (a list of machine learning algorithms, specified by ID, to exclude from consideration when generating the pipeline)
    - forceInclude (a list of machine learning algorithms, specified by ID, to force into the generated pipeline)
    - algorithms (a list of machine learning algorithms used to inform pipeline generation)
    - subsampleCount (number of training observations to sample for pipeline generation)
    - subsamplePercent (percent of training observations to sample for pipeline generation)
    - subsampleSeed (seed to use when subsampling training data during pipeline generation)
- Analytics project attributes, grouped under "analyticsProjectAttributes"
  - Analytics project ID (automatically created by the service, read-only)
  - Target variable
  - Target level (a string that indicates the target level. The accepted values are listed below.)
    - binary: Binary target level
    - interval: Interval target level
    - nominal: Nominal target level
  - Target event level
  - classSelectionStatistic (a string that indicates the class selection statistic, which depends on the type of the target variable. For 'BINARY' and 'NOMINAL' target variable types, the accepted values are listed below, and the default is 'ks'. The 'ORDINAL' target variable type is not supported.)
    - ase: Average squared error
    - c: Area under curve (C statistic)
    - capturedResponse: Captured response
    - cumulativeCapturedResponse: Cumulative captured response
    - cumulativeLift: Cumulative lift
    - f1: F1 score
    - fdr: False discovery rate
    - fpr: False positive rate
    - gain: Gain
    - gini: Gini
    - ks: Kolmogorov-Smirnov statistic
    - lift: Lift
    - misclassificationEvent: Misclassification (Event)
    - mce: Misclassification (MCE)
    - mcll: Multiclass log loss
    - ks2: ROC separation
    - rase: Root average squared error
    - misclassificationRateCutoff: Misclassification at cutoff
  - intervalSelectionStatistic (a string that indicates the interval selection statistic, which depends on the type of the target variable. For 'INTERVAL' target variable types, the accepted values are listed below, and the default is 'ase'. This field is ignored for all other types, such as 'BINARY', 'NOMINAL', or 'ORDINAL'.)
    - ase: Average squared error
    - rase: Root average squared error
    - rmae: Root mean absolute error
    - rmsle: Root mean squared logarithmic error
  - Partition enabled flag
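The three sections above can be assembled into a request body along the following lines. This is a minimal sketch rather than the service's own client code: `build_project_body` is a hypothetical helper, and the `targetLevel` field name is an assumption (the list above names the attribute but not its JSON key). The accepted values and defaults are copied from the lists above.

```python
# Accepted values copied from the attribute lists above.
TARGET_LEVELS = {"binary", "interval", "nominal"}
CLASS_STATISTICS = {
    "ase", "c", "capturedResponse", "cumulativeCapturedResponse",
    "cumulativeLift", "f1", "fdr", "fpr", "gain", "gini", "ks", "lift",
    "misclassificationEvent", "mce", "mcll", "ks2", "rase",
    "misclassificationRateCutoff",
}
INTERVAL_STATISTICS = {"ase", "rase", "rmae", "rmsle"}


def build_project_body(name, data_table_uri, target_variable, target_level,
                       selection_statistic=None, settings=None):
    """Assemble a create-project body (hypothetical helper, not part of the API)."""
    if target_level not in TARGET_LEVELS:
        raise ValueError(f"unsupported target level: {target_level!r}")

    # "targetLevel" is an assumed JSON key for the "Target level" attribute.
    attributes = {"targetVariable": target_variable, "targetLevel": target_level}

    if target_level == "interval":
        statistic = selection_statistic or "ase"  # documented default
        if statistic not in INTERVAL_STATISTICS:
            raise ValueError(f"invalid interval selection statistic: {statistic!r}")
        attributes["intervalSelectionStatistic"] = statistic
    else:
        statistic = selection_statistic or "ks"  # documented default
        if statistic not in CLASS_STATISTICS:
            raise ValueError(f"invalid class selection statistic: {statistic!r}")
        attributes["classSelectionStatistic"] = statistic

    body = {
        "name": name,
        "type": "predictive",  # the only supported project type
        "dataTableUri": data_table_uri,
        "pipelineBuildMethod": "automatic",
        "analyticsProjectAttributes": attributes,
    }
    if settings:
        body["settings"] = dict(settings)
    return body


body = build_project_body(
    "project name",
    "/dataTables/dataSources/cas~fs~cas-shared-default~fs~Public/tables/DATA",
    "BAD",
    "binary",
    settings={"autoRun": True, "numberOfModels": 7},
)
```

Validating the selection statistic on the client side is optional; it only surfaces a mismatch between target level and statistic before the request is sent.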
{
  "creationTimeStamp": "2022-03-01T01:29:46.014052Z",
  "createdBy": "username",
  "modifiedTimeStamp": "2022-03-01T01:29:46.014039Z",
  "modifiedBy": "username",
  "revision": 0,
  "id": "4f3a766b-78b4-4bf2-bbc4-fe5ea38503ef",
  "name": "project name",
  "links": [
    {
      "method": "GET",
      "rel": "up",
      "href": "/mlPipelineAutomation/projects",
      "uri": "/mlPipelineAutomation/projects",
      "type": "application/vnd.sas.collection",
      "itemType": "application/vnd.sas.analytics.ml.pipeline.automation.project"
    },
    {
      "method": "GET",
      "rel": "self",
      "href": "/mlPipelineAutomation/projects/4f3a766b-78b4-4bf2-bbc4-fe5ea38503ef",
      "uri": "/mlPipelineAutomation/projects/4f3a766b-78b4-4bf2-bbc4-fe5ea38503ef",
      "type": "application/vnd.sas.analytics.ml.pipeline.automation.project"
    },
    {
      "method": "PUT",
      "rel": "update",
      "href": "/mlPipelineAutomation/projects/4f3a766b-78b4-4bf2-bbc4-fe5ea38503ef",
      "uri": "/mlPipelineAutomation/projects/4f3a766b-78b4-4bf2-bbc4-fe5ea38503ef",
      "type": "application/vnd.sas.analytics.ml.pipeline.automation.project"
    },
    {
      "method": "DELETE",
      "rel": "delete",
      "href": "/mlPipelineAutomation/projects/4f3a766b-78b4-4bf2-bbc4-fe5ea38503ef",
      "uri": "/mlPipelineAutomation/projects/4f3a766b-78b4-4bf2-bbc4-fe5ea38503ef"
    },
    {
      "method": "DELETE",
      "rel": "propagateDelete",
      "href": "/mlPipelineAutomation/projects/4f3a766b-78b4-4bf2-bbc4-fe5ea38503ef?propagate=true",
      "uri": "/mlPipelineAutomation/projects/4f3a766b-78b4-4bf2-bbc4-fe5ea38503ef?propagate=true"
    },
    {
      "method": "GET",
      "rel": "state",
      "href": "/mlPipelineAutomation/projects/4f3a766b-78b4-4bf2-bbc4-fe5ea38503ef/state",
      "uri": "/mlPipelineAutomation/projects/4f3a766b-78b4-4bf2-bbc4-fe5ea38503ef/state",
      "type": "text/plain"
    },
    {
      "method": "PUT",
      "rel": "updateState",
      "href": "/mlPipelineAutomation/projects/4f3a766b-78b4-4bf2-bbc4-fe5ea38503ef/state?value={value}",
      "uri": "/mlPipelineAutomation/projects/4f3a766b-78b4-4bf2-bbc4-fe5ea38503ef/state?value={value}",
      "responseType": "text/plain"
    }
  ],
  "version": 3,
  "dataTableUri": "/dataTables/dataSources/cas~fs~cas-shared-default~fs~Public/tables/DATA",
  "type": "predictive",
  "state": "pending",
  "settings": {
    "applyGlobalMetadata": true,
    "autoRun": true,
    "locale": "en",
    "maxModelingTime": 1,
    "modelingMode": "Standard",
    "numberOfModels": 7
  },
  "analyticsProjectAttributes": {
    "analyticsProjectId": "2876465b-3cff-4b58-8176-5cd7cbb793dc",
    "targetVariable": "BAD",
    "partitionEnabled": true,
    "overrideClassificationCutoffEnabled": false,
    "samplingEnabled": "AUTO",
    "samplingPercentage": 50,
    "intervalSelectionStatistic": "ase",
    "classSelectionStatistic": "ks",
    "selectionDepth": 10,
    "selectionPartition": "default",
    "overrideClassificationCutoffValue": 0.5,
    "cutoffPercentage": 50,
    "numberOfCutoffValues": 100
  },
  "championModel": {},
  "pipelineBuildMethod": "automatic"
}
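The `links` array in the project representation lets a client discover follow-up operations by their `rel` value instead of building URLs by hand. A minimal sketch, assuming the representation has already been received as JSON text; `find_link` is a hypothetical helper, and the sample is a trimmed copy of the representation above.

```python
import json


def find_link(project, rel):
    """Return the first link whose "rel" matches, or None (hypothetical helper)."""
    for link in project.get("links", []):
        if link.get("rel") == rel:
            return link
    return None


# Trimmed copy of the project representation shown above.
response_text = """
{
  "id": "4f3a766b-78b4-4bf2-bbc4-fe5ea38503ef",
  "state": "pending",
  "links": [
    {"method": "GET", "rel": "self",
     "href": "/mlPipelineAutomation/projects/4f3a766b-78b4-4bf2-bbc4-fe5ea38503ef"},
    {"method": "GET", "rel": "state",
     "href": "/mlPipelineAutomation/projects/4f3a766b-78b4-4bf2-bbc4-fe5ea38503ef/state",
     "type": "text/plain"}
  ]
}
"""

project = json.loads(response_text)
state_link = find_link(project, "state")
print(state_link["method"], state_link["href"])
# prints: GET /mlPipelineAutomation/projects/4f3a766b-78b4-4bf2-bbc4-fe5ea38503ef/state
```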
Name | Type | Required | Description |
---|---|---|---|
Accept-Language | string | false | Used to set the project locale. Default: en |
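The header above can be set when the create request is built. The sketch below only constructs the request and does not send it. Several details are assumptions not stated in this section: the host name, the use of `POST` on the `/mlPipelineAutomation/projects` collection, the `+json` media type (derived from the `type` values in the sample links), and bearer-token authorization. `make_create_request` is a hypothetical helper.

```python
import json
import urllib.request

BASE_URL = "https://viya.example.com"  # hypothetical host
# Assumed media type, derived from the "type" values in the sample links.
MEDIA_TYPE = "application/vnd.sas.analytics.ml.pipeline.automation.project+json"


def make_create_request(body, token, locale="en"):
    """Build (but do not send) a create-project request (hypothetical helper)."""
    return urllib.request.Request(
        url=BASE_URL + "/mlPipelineAutomation/projects",
        data=json.dumps(body).encode("utf-8"),
        method="POST",  # assumed verb for creating a project
        headers={
            "Authorization": f"Bearer {token}",  # assumed auth scheme
            "Content-Type": MEDIA_TYPE,
            "Accept": MEDIA_TYPE,
            "Accept-Language": locale,  # sets the project locale; default is en
        },
    )


request = make_create_request(
    {"name": "project name", "type": "predictive"},
    token="<access-token>",
    locale="fr",
)
# To send it: urllib.request.urlopen(request)
```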
project
This object contains metadata and information about an automation project.
Name | Type | Required | Description |
---|---|---|---|
analyticsProjectAttributes | Analytics Project Attributes | true | This object contains a list of analytics project attributes related to Model Studio project settings. |
dataTableUri | string | true | Data table URI |
description | string | false | Description of the automation project |
name | string | true | Name of the automation project |
revision | integer<int64> | false | Revision of the automation project instance |
settings | Automation Project Settings | false | A collection of optional settings that configure an automation project after the Model Studio project is created based on analyticsProjectAttributes. |
state | string | false | Automation project state. Allowed values: pending, preparing, waiting, ready, modeling, constructingPipeline, runningPipeline, quiescing, quiesced, completed, canceled, failed, oversampling, retraining |
type | string | true | Automation project type. Allowed value: predictive |
pipelineBuildMethod | string | false | The method used to generate the project pipeline. Allowed values: automatic, template |
version | integer<int32> | false | Version of the resource |
championModel | Champion Model | false | This object contains a summary of the champion model. |
customProperties | object | false | Custom properties expressed as a map of strings that are associated with the automation project. These properties are added to the ProviderSpecificProperties map of the generated analytics project (prefixed with 'customMLPA_'). |
links | array [Link] | false | The links that apply to this automation project |
Status | Meaning | Description |
---|---|---|
202 | Accepted | The request is accepted. |
400 | Bad Request | One or more parameters were invalid. |
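Because a 202 response only means the request was accepted, a client usually has to poll the project state until the run finishes. A minimal polling sketch follows. The state names come from the `state` field above, but treating `completed`, `canceled`, and `failed` as the terminal states is an assumption, as are the helper name `wait_for_project` and its parameters. The caller supplies `fetch_state`, for example a function that issues a GET on the project's `state` link and returns the plain-text body.

```python
import time

# State names copied from the "state" field of the project object.
KNOWN_STATES = {
    "pending", "preparing", "waiting", "ready", "modeling",
    "constructingPipeline", "runningPipeline", "quiescing", "quiesced",
    "completed", "canceled", "failed", "oversampling", "retraining",
}
# Assumption: these are the states after which no further progress occurs.
TERMINAL_STATES = {"completed", "canceled", "failed"}


def wait_for_project(fetch_state, poll_seconds=10.0, max_polls=360,
                     sleep=time.sleep):
    """Poll until a terminal state is reached (hypothetical helper).

    fetch_state: zero-argument callable returning the current state as text.
    sleep: injectable so the loop can be exercised without real delays.
    """
    for _ in range(max_polls):
        state = fetch_state().strip()
        if state not in KNOWN_STATES:
            raise ValueError(f"unexpected project state: {state!r}")
        if state in TERMINAL_STATES:
            return state
        sleep(poll_seconds)
    raise TimeoutError(f"no terminal state after {max_polls} polls")


# Usage with a canned sequence of states instead of a live service.
canned = iter(["pending", "preparing", "runningPipeline", "completed"])
final_state = wait_for_project(lambda: next(canned), poll_seconds=0.0)
print(final_state)  # prints: completed
```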