{ "cells": [ { "cell_type": "markdown", "source": [ "---\n", "title: \"Distributed Hyperparameter Optimization on MNIST Dataset\"\n", "date: 2021-05-03\n", "type: technical_note\n", "draft: false\n", "---" ], "metadata": { "collapsed": false } }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Maggy Distributed Hyper Parameters Optimization example\n", "---\n", "Created: 24/04/2019\n", "Updated: 2021" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This notebook illustrates the usage of the maggy framework for asynchronous hyperparameter optimization on the famous MNIST dataset. \n", "\n", "In this specific example we are using random search over three parameters and we are deploying the median early stopping rule in order to make use of the asynchrony of the framework. The Median Stopping Rule implements the simple strategy of stopping a trial if its performance falls below the median of other trials at similar points in time.\n", "\n", "We are using Keras for this example. This notebook works with any Spark cluster given that you are using maggy 0.1. In future versions we will add functionality that relies on Hopsworks.\n", "\n", "This notebook has been tested with TensorFlow 1.11.0 and Spark 2.4.0. \n", "Requires Python 3.6 or higher." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 1. Spark Session" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Make sure you have a running Spark Session/Context available. On Hopsworks just execute a simple command to start the spark application." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Starting Spark application\n" ] }, { "data": { "text/html": [ "
| ID | YARN Application ID | Kind | State | Spark UI | Driver log | Current session? |
|---|---|---|---|---|---|---|
| 9 | application_1556201759536_0001 | pyspark | idle | Link | Link | ✔ |