{ "cells": [ { "cell_type": "raw", "metadata": {}, "source": [ "---\n", "title: \"Petastorm Training Data Create\"\n", "date: 2021-02-24\n", "type: technical_note\n", "draft: false\n", "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Create MNIST Petastorm Dataset\n", "\n", "This notebook shows how to create a Petastorm dataset from the MNIST images of handwritten digits and how to save it as a documented, reusable training dataset in the Hopsworks Feature Store. The Petastorm dataset can later be used to train models with TensorFlow, PyTorch, or SparkML." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Starting Spark application\n" ] }, { "data": { "text/html": [ "
ID | YARN Application ID | Kind | State | Spark UI | Driver log | Current session? |
---|---|---|---|---|---|---|
1 | application_1551196216588_0003 | pyspark | idle | Link | Link | ✔ |