Maze Environment

A gridworld maze where the agent must navigate from a start cell to an exit while avoiding walls.

Maze Layout

The default environment ships with a single 8x8 maze:

0 0 0 0 0 1 0 0
1 1 0 1 0 1 0 1
0 0 0 1 0 0 0 1
0 1 1 1 1 1 0 0
0 0 0 0 0 0 1 0
1 0 1 1 1 0 0 0
0 0 0 0 1 0 1 0
0 1 1 0 0 0 0 0

0 is an empty cell and 1 is a wall. Coordinates are (col, row) with (0, 0) at the upper-left.

Quick Start

The simplest way to use the Maze environment is through the MazeEnv client:

from maze_env import MazeAction, MazeEnv

try:
    # Create environment from Docker image
    env = MazeEnv.from_docker_image("maze_env-env:latest")

    # Reset to start a new episode
    result = env.reset()
    print(f"Start position: {result.observation.current_position}")
    print(f"Legal actions: {result.observation.legal_actions}")

    # Play until done
    while not result.done:
        action_id = result.observation.legal_actions[0]
        result = env.step(MazeAction(action=action_id))
        print(f"Position: {result.observation.current_position}")
        print(f"Reward: {result.reward}, Done: {result.done}")

finally:
    # Always clean up
    env.close()

That’s it! The MazeEnv.from_docker_image() method handles:

Starting the Docker container
Waiting for the server to be ready
Connecting to the environment
Container cleanup when you call close()

Building the Docker Image

From the environment directory (envs/maze_env/):

docker build -t maze_env-env:latest -f server/Dockerfile .

Deploying to Hugging Face Spaces

You can easily deploy your OpenEnv environment to Hugging Face Spaces using the openenv push command:

# From the environment directory (envs/maze_env/)
openenv push

# Or specify options
openenv push --namespace my-org --private

The openenv push command will:

Validate that the directory is an OpenEnv environment (checks for openenv.yaml)
Prepare a custom build for Hugging Face Docker space (enables web interface)
Upload to Hugging Face (ensuring you’re logged in)

Prerequisites

Authenticate with Hugging Face: The command will prompt for login if not already authenticated

Options

--directory, -d: Directory containing the OpenEnv environment (defaults to current directory)
--repo-id, -r: Repository ID in format ‘username/repo-name’ (defaults to ‘username/env-name’ from openenv.yaml)
--base-image, -b: Base Docker image to use (overrides Dockerfile FROM)
--private: Deploy the space as private (default: public)

Examples

# Push to your personal namespace (defaults to username/env-name from openenv.yaml)
openenv push

# Push to a specific repository
openenv push --repo-id my-org/maze-env

# Push with a custom base image
openenv push --base-image ghcr.io/huggingface/openenv-base:latest

# Push as a private space
openenv push --private

# Combine options
openenv push --repo-id my-org/maze-env --base-image custom-base:latest --private

After deployment, your space will be available at: https://huggingface.co/spaces/<repo-id>

Running the Default Maze

docker run -p 8000:8000 maze_env-env:latest

Environment Details

Action

MazeAction: Contains a single field

action (int) - Movement action
- 0 = move up
- 1 = move down
- 2 = move left
- 3 = move right

Observation

MazeObservation: Per-step observation payload

current_position (List[int]) - Agent position as [col, row]
legal_actions (List[int]) - Valid actions from the current position
metadata (dict) - Extra info:
- maze: 2D grid (0 = empty, 1 = wall)
- status: “playing”, “win”, or “lose”
- exit_cell: Exit position as [col, row]
- step: Current step count

State

MazeState: Server-side state snapshot

episode_id (str) - Unique identifier for the current episode
step_count (int) - Number of steps taken
done (bool) - Whether the episode has ended
current_position (List[int]) - Current agent position
exit_cell (List[int]) - Exit position
status (str) - “playing”, “win”, or “lose”

Reward

The reward follows the underlying maze rules:

Small penalty for a move (-0.05)
Penalty for revisiting a cell (-0.25)
Penalty for invalid move (-0.75)
Reward for reaching the exit (+10.0)

Configuration

Environment Variables

This environment does not rely on environment variables. Customize the maze in code (see Development & Testing).

Advanced Usage

Connecting to an Existing Server

If you already have a Maze environment server running:

from maze_env import MazeAction, MazeEnv

# Connect to existing server
env = MazeEnv(base_url="http://localhost:8000")

# Use as normal
result = env.reset()
result = env.step(MazeAction(action=result.observation.legal_actions[0]))

# Close connection (does NOT stop the server)
env.close()

Connecting to HuggingFace Space

from maze_env import MazeAction, MazeEnv

# Connect to remote Space
env = MazeEnv(base_url="https://your-username-maze-env.hf.space")

result = env.reset()
print(f"Position: {result.observation.current_position}")
print(f"Legal actions: {result.observation.legal_actions}")

result = env.step(MazeAction(action=result.observation.legal_actions[0]))
env.close()

Project Structure

maze_env/
├── __init__.py            # Module exports
├── README.md              # This file
├── openenv.yaml           # OpenEnv manifest
├── pyproject.toml         # Project metadata and dependencies
├── client.py              # MazeEnv client implementation
├── models.py              # Action, Observation, and State models
└── server/
    ├── __init__.py        # Server module exports
    ├── maze.py            # Maze logic and rewards
    ├── maze_env_environment.py  # Core environment implementation
    ├── app.py             # FastAPI application
    └── Dockerfile         # Environment container

References

Reinforcement-Learning-Maze (original implementation)

Update on GitHub