Algorithm Output Schema

This page gives the complete JSON schema for algo_output.json, the output file every algorithm must write.

Schema Overview

The output file must contain:

results - Array of results (typically one per AOI). Each result contains:
- source_aoi_version - AOI version the result applies to
- data_type - Output Data Type name
- observations - Array of time-based observations. Each observation contains:
  - observation_start_ts - Unix timestamp (required)
  - observation_values - Summary metrics (optional)
  - measurement_path - Path to detailed data file (optional)

Complete JSON Schema

{
  "title": "Algorithm Output Schema",
  "description": "JSON schema to define an algorithm output file. The JSON file should be named algo_output.json and located under the directory specified by output_path in algorithm_input.json.",
  "type": "object",
  "required": [
    "results"
  ],
  "properties": {
    "results": {
      "description": "A list of Algorithm output Results.",
      "type": "array",
      "minItems": 1,
      "items": {
        "type": "object",
        "$ref": "#/$defs/result"
      }
    }
  },
  "$defs": {
    "result": {
      "description": "An Algorithm Result.",
      "type": "object",
      "required": [
        "source_aoi_version",
        "data_type",
        "observations"
      ],
      "properties": {
        "source_aoi_version": {
          "description": "AOIIdentifier Version of the associated Result.",
          "type": "integer",
          "example": 14538000
        },
        "dest_aoi_version": {
          "description": "Destination AOIIdentifier Version of the associated Result, if applicable. E.g. Used for Cross AOI Analysis requiring multiple AOIs to generate a single Result.",
          "type": "integer",
          "example": 10241660
        },
        "data_type": {
          "description": "Identifies the output data_type, retrieved using DataTypesGetRequest and specified in the outputs section of the manifest",
          "type": "string"
        },
        "algo_config_class": {
          "description": "Class type of the algorithm, if applicable",
          "type": "string",
          "examples": ["car", "truck", "large_commercial_aircraft", "building", "water", "grass"]
        },
        "algo_config_subclass": {
          "description": "Sub-class type of the algorithm, if applicable. If specified, an associated algo_config_class must also be specified.",
          "type": "string",
          "examples": ["boeing-737-300", "airbus_a-300"]
        },
        "observations": {
          "description": "A list of Observations for the associated Result.",
          "type": "array",
          "minItems": 1,
          "items": {
            "type": "object",
            "$ref": "#/$defs/observation"
          }
        }
      }
    },
    "observation": {
      "description": "An Observation",
      "type": "object",
      "required": [
        "observation_start_ts"
      ],
      "properties": {
        "observation_start_ts": {
          "description": "Starting timestamp of the Observation in Unix time.",
          "type": "integer",
          "example": 1577865600
        },
        "measurement_path": {
          "description": "File path to the Measurements data that comprise the Observation. Each Measurement file should have a unique name/path, and should conform to the Data Type Schema of the Algorithm. Currently, only Parquet files are supported as measurements. Algorithms that output measurements without any data should still write an empty Parquet file with necessary columns.",
          "type": "string",
          "example": "/algorithm/output_data/measurement_data_0.parquet"
        },
        "observation_values": {
          "description": "List of aggregated values of the Measurements data, if applicable. Each property in the observation_values is specified by the 'observation_value_columns' of the 'algorithm_parameters' in the Algorithm Manifest. Algorithms that produce detections/counts should still populate this field when the value is zero.",
          "type": "array",
          "items": {
            "$ref": "#/$defs/observationValueColumnDef"
          }
        }
      }
    },
    "observationValueColumnDef": {
      "type": "object",
      "patternProperties": {
        ".*": {
          "anyOf": [
            {"type": "string"},
            {"type": "number"},
            {"type": "boolean"},
            {"type": "object"}
          ]
        }
      }
    }
  }
}

Output Examples

Example 1: Simple Aggregated Results

An algorithm that outputs only summary values:

{
  "results": [
    {
      "source_aoi_version": 36810008,
      "data_type": "device_visits",
      "observations": [
        {
          "observation_start_ts": 1614607200,
          "observation_values": [{
            "unique_device_count": 10
          }]
        }
      ]
    }
  ]
}

Example 2: Detailed Measurements Only

An algorithm that outputs only detailed data files:

{
  "results": [
    {
      "source_aoi_version": 36809989,
      "data_type": "device_visits",
      "observations": [
        {
          "observation_start_ts": 1546832297,
          "measurement_path": "meas_34e2dc28-0a28-4dbd-891c-7e3ae49b614b.parquet"
        },
        {
          "observation_start_ts": 1546835897,
          "measurement_path": "meas_37549e93-00d6-473e-bfb0-00aec9d4addf.parquet"
        },
        {
          "observation_start_ts": 1546839497,
          "measurement_path": "meas_f85e9245-1f01-4b2f-886c-b367603c1c6b.parquet"
        }
      ]
    }
  ]
}

Example 3: Both Aggregates and Detailed Data

An algorithm that outputs both counts and detailed detections:

{
  "results": [
    {
      "source_aoi_version": 36809989,
      "data_type": "object_detections",
      "algo_config_class": "car",
      "observations": [
        {
          "observation_start_ts": 1546832297,
          "measurement_path": "meas_34e2dc28-0a28-4dbd-891c-7e3ae49b614b.parquet",
          "observation_values": [{
            "count": 29
          }]
        },
        {
          "observation_start_ts": 1546835897,
          "measurement_path": "meas_37549e93-00d6-473e-bfb0-00aec9d4addf.parquet",
          "observation_values": [{
            "count": 31
          }]
        },
        {
          "observation_start_ts": 1546839497,
          "measurement_path": "meas_f85e9245-1f01-4b2f-886c-b367603c1c6b.parquet",
          "observation_values": [{
            "count": 5
          }]
        }
      ]
    }
  ]
}

Example 4: One Result per Class

An algorithm that detects multiple object types writes one result per class:

{
  "results": [
    {
      "source_aoi_version": 36809989,
      "data_type": "vehicle_detections",
      "algo_config_class": "car",
      "observations": [
        {
          "observation_start_ts": 1546832297,
          "observation_values": [{"count": 15}]
        }
      ]
    },
    {
      "source_aoi_version": 36809989,
      "data_type": "vehicle_detections",
      "algo_config_class": "truck",
      "observations": [
        {
          "observation_start_ts": 1546832297,
          "observation_values": [{"count": 8}]
        }
      ]
    }
  ]
}
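
The one-result-per-class layout in Example 4 can be generated from a mapping of class names to counts. The following is a minimal sketch, not platform code: the helper name results_per_class and the single count observation value are assumptions made for illustration.

```python
def results_per_class(source_aoi_version, data_type, observation_start_ts, counts_by_class):
    """Build one result per class, mirroring Example 4 (hypothetical helper)."""
    results = []
    for class_name, count in counts_by_class.items():
        results.append({
            "source_aoi_version": int(source_aoi_version),
            "data_type": data_type,
            "algo_config_class": class_name,
            "observations": [
                {
                    "observation_start_ts": int(observation_start_ts),
                    # A zero count is still written, as the schema description asks.
                    "observation_values": [{"count": int(count)}],
                }
            ],
        })
    return {"results": results}
```

Calling results_per_class(36809989, "vehicle_detections", 1546832297, {"car": 15, "truck": 8}) yields the same structure as Example 4.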

Key Requirements

Required Fields

Every result must have:

source_aoi_version - AOI version the result applies to
data_type - Must match manifest outputs
observations - At least one observation

Every observation must have:

observation_start_ts - Unix timestamp
At least one of the following (the JSON schema itself does not enforce this):
- observation_values - Summary metrics
- measurement_path - Detailed data file
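
The required-field rules above can be checked before the file is written. The following is a rough sketch using only the standard library; find_missing_fields is a hypothetical name, and the function covers only the rules listed in this section, not the full schema.

```python
def find_missing_fields(output):
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    results = output.get("results")
    if not isinstance(results, list) or not results:
        return ["'results' must be a non-empty array"]
    for i, result in enumerate(results):
        for field in ("source_aoi_version", "data_type", "observations"):
            if field not in result:
                problems.append(f"results[{i}] is missing '{field}'")
        observations = result.get("observations", [])
        if "observations" in result and not observations:
            problems.append(f"results[{i}].observations must not be empty")
        for j, obs in enumerate(observations):
            where = f"results[{i}].observations[{j}]"
            if "observation_start_ts" not in obs:
                problems.append(f"{where} is missing 'observation_start_ts'")
            # At least one of the two payload fields must be present.
            if "observation_values" not in obs and "measurement_path" not in obs:
                problems.append(f"{where} needs 'observation_values' or 'measurement_path'")
    return problems
```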

Observation Values

Must match the observation_value_columns declared in the manifest
Can be numbers, strings, booleans, or objects
Should be provided even when the value is zero (e.g., [{"count": 0}])
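
One way to catch mismatches early is to compare each observation_values entry against the declared column names. The sketch below makes several assumptions: observation_values is a list of objects (as in the examples on this page), declared_columns is a plain list of names taken from the manifest's observation_value_columns, and "match" means every declared column is present with no extras. The function name is hypothetical.

```python
def check_observation_values(observation_values, declared_columns):
    """Compare observation_values entries against declared column names."""
    # Value types the schema allows: string, number, boolean, object.
    allowed_types = (str, int, float, bool, dict)
    declared = set(declared_columns)
    problems = []
    for entry in observation_values:
        missing = sorted(declared - set(entry))
        extra = sorted(set(entry) - declared)
        if missing:
            problems.append(f"missing declared columns: {missing}")
        if extra:
            problems.append(f"undeclared columns: {extra}")
        for name, value in entry.items():
            if not isinstance(value, allowed_types):
                problems.append(f"unsupported value type for '{name}': {type(value).__name__}")
    return problems
```

A zero value passes the check; a missing key does not, which is why zero counts should be written explicitly.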

Measurement Files

Must be in Parquet format
Must conform to the Data Type schema
Must have unique filenames
Should be written to the output_path directory
An observation with no measurement rows should still get an empty Parquet file containing the required columns
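
The naming and empty-file rules can be sketched as two small helpers. Both function names are hypothetical, and column_dtypes stands in for whatever columns the algorithm's Data Type schema actually defines. The second helper assumes pandas and a Parquet engine such as pyarrow are installed.

```python
import uuid
from pathlib import Path


def new_measurement_filename():
    """Return a unique file name of the form meas_<uuid4>.parquet."""
    return f"meas_{uuid.uuid4()}.parquet"


def write_empty_measurement(output_dir, column_dtypes):
    """Write a zero-row Parquet file that still carries the expected columns.

    column_dtypes is a hypothetical mapping of column name -> pandas dtype;
    the real columns come from the algorithm's Data Type schema.
    """
    # Imported lazily so the naming helper works without pandas installed.
    import pandas as pd

    empty = pd.DataFrame(
        {name: pd.Series(dtype=dtype) for name, dtype in column_dtypes.items()}
    )
    filename = new_measurement_filename()
    # to_parquet needs a Parquet engine such as pyarrow or fastparquet.
    empty.to_parquet(Path(output_dir) / filename, index=False)
    return filename
```

The returned file name is what goes into measurement_path, matching the relative paths used in Examples 2 and 3.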

Optional Fields

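dest_aoi_version - Destination AOI version, for cross-AOI analysis
algo_config_class - Class of the result, for classification outputs
algo_config_subclass - Sub-class for fine-grained classification; requires algo_config_class to be set as well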

Writing Output

Python Example

import json
import uuid
from pathlib import Path

import pandas as pd  # DataFrame.to_parquet needs a Parquet engine such as pyarrow


def write_algorithm_output(
    output_dir,
    results_data: dict[int, dict[int, pd.DataFrame]],
    data_type='device_visits',
):
    """Write measurement files and algo_output.json to output_dir.

    results_data maps source_aoi_version -> {observation_start_ts: DataFrame}.
    data_type must match an output declared in the manifest.
    """
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)

    results = []

    for aoi_version, aoi_data in results_data.items():
        observations = []

        for start_ts, data_df in aoi_data.items():
            # Write one uniquely named measurement file per observation.
            # An empty DataFrame still yields a Parquet file with its columns.
            measurement_filename = f"meas_{uuid.uuid4()}.parquet"
            data_df.to_parquet(output_dir / measurement_filename, index=False)

            # Record the file name and the summary count, even when it is zero
            observations.append({
                'observation_start_ts': int(start_ts),
                'measurement_path': measurement_filename,
                'observation_values': [{
                    'count': len(data_df)
                }]
            })

        # Create result; int() keeps numpy integer keys JSON-serializable
        results.append({
            'source_aoi_version': int(aoi_version),
            'data_type': data_type,
            'observations': observations
        })

    # Write output file
    output_path = output_dir / 'algo_output.json'
    with open(output_path, 'w') as f:
        json.dump({'results': results}, f, indent=2)

    print(f"Output written to {output_path}")

Validation

The platform validates:

Output file exists at {output_path}/algo_output.json
JSON is valid and matches schema
All required fields are present
Measurement files exist and are valid Parquet
Measurement files conform to Data Type schema
observation_values match manifest declarations
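
Some of these checks can be approximated locally before submitting a run. The sketch below is not the platform's validator. It assumes relative measurement_path values resolve against the output directory, and it only tests for the 4-byte PAR1 marker that begins and ends every Parquet file rather than parsing the file. The function name preflight is hypothetical.

```python
import json
from pathlib import Path

PARQUET_MAGIC = b"PAR1"  # Parquet files start and end with these 4 bytes


def preflight(output_dir, output_filename="algo_output.json"):
    """Return a list of problems found in the output directory."""
    output_dir = Path(output_dir)
    output_file = output_dir / output_filename
    if not output_file.is_file():
        return [f"{output_file} does not exist"]
    try:
        output = json.loads(output_file.read_text())
    except json.JSONDecodeError as exc:
        return [f"{output_file} is not valid JSON: {exc}"]

    problems = []
    for result in output.get("results", []):
        for obs in result.get("observations", []):
            rel = obs.get("measurement_path")
            if rel is None:
                continue
            path = Path(rel)
            if not path.is_absolute():
                # Assumption: relative paths are relative to the output directory.
                path = output_dir / path
            if not path.is_file():
                problems.append(f"measurement file not found: {path}")
                continue
            # Smallest possible layout: magic + 4-byte footer length + magic.
            if path.stat().st_size < 12:
                problems.append(f"not a Parquet file: {path}")
                continue
            with path.open("rb") as f:
                head = f.read(4)
                f.seek(-4, 2)  # 4 bytes before end of file
                tail = f.read(4)
            if head != PARQUET_MAGIC or tail != PARQUET_MAGIC:
                problems.append(f"not a Parquet file: {path}")
    return problems
```

An empty list means the file-level checks passed. Schema conformance and Data Type checks still happen on the platform.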

Next Steps

See Algorithm Input Schema for input format
Review Algorithm Input/Output guide for implementation
Check Creating Algorithms for complete workflow