{ "cells": [ { "cell_type": "raw", "metadata": {}, "source": [ "---\n", "title: \"Dataset Images on the Feature Store\"\n", "date: 2021-02-24\n", "type: technical_note\n", "draft: false\n", "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Example of Using a Raw Image Dataset in the Feature Store\n", "\n", "Images for training machine learning models are often stored in binary formats such as TFRecord or Parquet. However, it can sometimes be useful to store a large image dataset as a folder with one file per image, in a format such as .jpg or .png.\n", "\n", "This notebook demonstrates how to create a training dataset from .jpg files in the Hopsworks Feature Store." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Starting Spark application\n" ] }, { "data": { "text/html": [ "
ID | YARN Application ID | Kind | State | Spark UI | Driver log | Current session? |
---|---|---|---|---|---|---|
18 | application_1549717352737_0020 | pyspark | idle | Link | Link | ✔ |