

Build End-to-End Machine Learning (ML) Workflows with Amazon SageMaker and Apache Airflow

This repository contains the assets for the Amazon SageMaker and Apache Airflow integration sample described in this ML blog post.

This repository shows an example of how to build, manage, and orchestrate ML workflows using Amazon SageMaker and Apache Airflow. We will build a recommender system that predicts a customer's rating for a given video based on that customer's historical ratings of similar videos as well as the behavior of other similar customers. We'll use historical star ratings from over 2M Amazon customers on over 160K digital videos. More details on this dataset can be found at its AWS Public Datasets page.

The repository contains:

- AWS CloudFormation Templates to launch the AWS services required to create the components.
- Airflow DAG Python Script that integrates and orchestrates all the ML tasks in an ML workflow for building a recommender system.
- A companion Jupyter Notebook to understand the individual ML tasks in detail, such as data exploration, data preparation, model training/tuning, and inference.

    │   └── airflow-ec2.yaml                              CloudFormation for installing Airflow instance backed by RDS
    │   └── amazon-video-recommender_using_fm_algo.ipynb
    └── src                                               Source code for Airflow DAG definition
        ├── config.py                                     Config file to configure SageMaker jobs and other ML tasks
        ├── dag_ml_pipeline_amazon_video_reviews.py       Airflow DAG definition for ML workflow
        └── pipeline                                      Python module used in Airflow DAG for data preparation
            └── preprocess.py                             Data pre-processing script

The high-level ML workflow we will implement for building the recommender system performs the following tasks:

1. Data Pre-processing: Extract and pre-process data from S3 to prepare the training data.
2. Prepare Training Data: To build the recommender system, we will use SageMaker's built-in Factorization Machines algorithm. The algorithm expects training data only in RecordIO Protobuf format with Float32 tensors. In this task, the pre-processed data will be transformed to RecordIO Protobuf format.
3. Training the Model: Train SageMaker's built-in Factorization Machine model with the training data and generate model artifacts. The training job will be launched by the Airflow SageMaker operator SageMakerTrainingOperator.
4. Tune the Model Hyper-parameters: A conditional/optional task to tune the hyper-parameters of the Factorization Machine to find the best model. The hyper-parameter tuning job will be launched by the Airflow SageMaker operator SageMakerTuningOperator.
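
As a rough illustration of the model training task, the sketch below builds the kind of request dictionary that SageMakerTrainingOperator accepts through its `config` argument; it has the same shape as a SageMaker `CreateTrainingJob` API request. The helper name `build_fm_training_config`, the S3 paths, role ARN, image URI, instance settings, and hyper-parameter values are all placeholders assumed for this example. They are not taken from this repository's `config.py`.

```python
from typing import Any, Dict


def build_fm_training_config(
    job_name: str,
    image_uri: str,
    role_arn: str,
    train_s3_uri: str,
    output_s3_uri: str,
    feature_dim: int,
    num_factors: int = 64,
) -> Dict[str, Any]:
    """Return a CreateTrainingJob-style request for Factorization Machines.

    Hypothetical helper: every value is supplied by the caller or is an
    illustrative default, not a setting confirmed by this repository.
    """
    return {
        "TrainingJobName": job_name,
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,
            "TrainingInputMode": "File",
        },
        "RoleArn": role_arn,
        "InputDataConfig": [
            {
                "ChannelName": "train",
                "DataSource": {
                    "S3DataSource": {
                        "S3DataType": "S3Prefix",
                        "S3Uri": train_s3_uri,
                        "S3DataDistributionType": "FullyReplicated",
                    }
                },
                # The training data is expected in RecordIO Protobuf format.
                "ContentType": "application/x-recordio-protobuf",
            }
        ],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        "ResourceConfig": {
            "InstanceCount": 1,
            "InstanceType": "ml.c5.xlarge",  # assumed instance type
            "VolumeSizeInGB": 30,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
        # The SageMaker API expects hyper-parameter values as strings.
        "HyperParameters": {
            "feature_dim": str(feature_dim),
            "num_factors": str(num_factors),
            "predictor_type": "regressor",  # assumed: ratings as regression
            "epochs": "10",
            "mini_batch_size": "200",
        },
    }


# Placeholder values only; replace with real resources before use.
example_config = build_fm_training_config(
    job_name="fm-video-recommender-example",
    image_uri="<factorization-machines-image-uri>",
    role_arn="arn:aws:iam::123456789012:role/ExampleSageMakerRole",
    train_s3_uri="s3://example-bucket/prepare/train/",
    output_s3_uri="s3://example-bucket/output/",
    feature_dim=1000,
)
```

In a DAG, a dictionary like `example_config` would be handed to the operator as `SageMakerTrainingOperator(task_id=..., config=example_config)`. Keeping the builder a plain function makes the job settings easy to check without an Airflow or AWS environment.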

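
The conditional hyper-parameter tuning task can be sketched with a small decision function of the kind Airflow's `BranchPythonOperator` takes as its `python_callable`: the function returns the `task_id` of the path the DAG should follow. The function name `choose_training_path`, the `tune_model` flag, and the two task ids are assumptions made for illustration, not names confirmed by this repository.

```python
from typing import Any, Mapping


def choose_training_path(
    settings: Mapping[str, Any],
    tuning_task_id: str = "model_tuning",      # hypothetical task id
    training_task_id: str = "model_training",  # hypothetical task id
) -> str:
    """Return the id of the downstream task that should run next.

    Tuning runs only when the (assumed) 'tune_model' flag is truthy;
    otherwise the workflow goes straight to a single training job.
    """
    if settings.get("tune_model", False):
        return tuning_task_id
    return training_task_id


print(choose_training_path({"tune_model": True}))  # prints "model_tuning"
print(choose_training_path({}))                    # prints "model_training"
```

Because the branch logic is a pure function, it can be unit-tested on its own and then wired into the DAG, with the tuning and training operators declared as the two downstream tasks of the branch.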