Algorithm Output Schema

Complete specification of algorithm output format

This page gives the complete JSON schema for the algo_output.json file that every algorithm must write.

Schema Overview

The output file must contain:

  • results - Array of results (typically one per AOI)
    • source_aoi_version - AOI version this result belongs to
    • data_type - Name of the output Data Type
    • observations - Array of time-based observations
      • observation_start_ts - Unix timestamp of the observation start (integer seconds)
      • observation_values - Summary metrics (optional)
      • measurement_path - Path to detailed data file (optional)
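
The nesting above can be sketched as plain Python dictionaries. The helper names make_observation and make_result are hypothetical and exist only for illustration; the sample values are copied from Example 1 further down this page, including its list-wrapped observation_values.

```python
from typing import Any, Optional


def make_observation(start_ts: int,
                     values: Optional[Any] = None,
                     measurement_path: Optional[str] = None) -> dict:
    """Build one observation; optional keys are added only when provided."""
    observation = {"observation_start_ts": int(start_ts)}
    if values is not None:
        observation["observation_values"] = values
    if measurement_path is not None:
        observation["measurement_path"] = measurement_path
    return observation


def make_result(source_aoi_version: int, data_type: str, observations: list) -> dict:
    """Build one result entry for a single AOI version."""
    return {
        "source_aoi_version": int(source_aoi_version),
        "data_type": data_type,
        "observations": list(observations),
    }


# Sample values taken from Example 1 on this page.
observation = make_observation(1614607200, values=[{"unique_device_count": 10}])
output = {"results": [make_result(36810008, "device_visits", [observation])]}
```

Serializing output with json.dump then yields a file with the same shape as Example 1.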

Complete JSON Schema

{
  "title": "Algorithm Output Schema",
  "description": "JSON schema to define an algorithm output file. The JSON file should be named algorithm_output.json and located under the directory specified by output_path in algorithm_input.json.",
  "type": "object",
  "required": [
    "results"
  ],
  "properties": {
    "results": {
      "description": "A list of Algorithm output Results.",
      "type": "array",
      "minItems": 1,
      "items": {
        "type": "object",
        "$ref": "#/$defs/result"
      }
    }
  },
  "$defs": {
    "result": {
      "description": "An Algorithm Result.",
      "type": "object",
      "required": [
        "source_aoi_version",
        "data_type",
        "observations"
      ],
      "properties": {
        "source_aoi_version": {
          "description": "AOIIdentifier Version of the associated Result.",
          "type": "integer",
          "example": 14538000
        },
        "dest_aoi_version": {
          "description": "Destination AOIIdentifier Version of the associated Result, if applicable. E.g. Used for Cross AOI Analysis requiring multiple AOIs to generate a single Result.",
          "type": "integer",
          "example": 10241660
        },
        "data_type": {
          "description": "Identifies the output data_type, retrieved using DataTypesGetRequest and specified in the outputs section of the manifest",
          "type": "string"
        },
        "algo_config_class": {
          "description": "Class type of the algorithm, if applicable",
          "type": "string",
          "examples": ["car", "truck", "large_commercial_aircraft", "building", "water", "grass"]
        },
        "algo_config_subclass": {
          "description": "Sub-class type of the algorithm, if applicable. If specified, an associated algo_config_class must also be specified.",
          "type": "string",
          "examples": ["boeing-737-300", "airbus_a-300"]
        },
        "observations": {
          "description": "A list of Observations for the associated Result.",
          "type": "array",
          "minItems": 1,
          "items": {
            "type": "object",
            "$ref": "#/$defs/observation"
          }
        }
      }
    },
    "observation": {
      "description": "An Observation",
      "type": "object",
      "required": [
        "observation_start_ts"
      ],
      "properties": {
        "observation_start_ts": {
          "description": "Starting timestamp of the Observation in Unix time.",
          "type": "integer",
          "example": 1577865600
        },
        "measurement_path": {
          "description": "File path to the Measurements data that comprise the Observation. Each Measurement file should have a unique name/path, and should conform to the Data Type Schema of the Algorithm. Currently, only Parquet files are supported as measurements. Algorithms that output measurements without any data should still write an empty Parquet file with necessary columns.",
          "type": "string",
          "example": "/algorithm/output_data/measurement_data_0.parquet"
        },
        "observation_values": {
          "description": "List of aggregated values of the Measurements data, if applicable. Each property in the observation_values is specified by the 'observation_value_columns' of the 'algorithm_parameters' in the Algorithm Manifest. Algorithms that produce detections/counts should still populate this field when the value is zero.",
          "type": "object",
          "items": {
            "$ref": "#/$defs/observationValueColumnDef"
          }
        }
      }
    },
    "observationValueColumnDef": {
      "type": "object",
      "patternProperties": {
        ".*": {
          "anyOf": [
            {"type": "string"},
            {"type": "number"},
            {"type": "boolean"},
            {"type": "object"}
          ]
        }
      }
    }
  }
}

Output Examples

Example 1: Simple Aggregated Results

Algorithm that only outputs summary values:

{
  "results": [
    {
      "source_aoi_version": 36810008,
      "data_type": "device_visits",
      "observations": [
        {
          "observation_start_ts": 1614607200,
          "observation_values": [{
            "unique_device_count": 10
          }]
        }
      ]
    }
  ]
}

Example 2: Detailed Measurements Only

Algorithm that outputs detailed data files:

{
  "results": [
    {
      "source_aoi_version": 36809989,
      "data_type": "device_visits",
      "observations": [
        {
          "observation_start_ts": 1546832297,
          "measurement_path": "meas_34e2dc28-0a28-4dbd-891c-7e3ae49b614b.parquet"
        },
        {
          "observation_start_ts": 1546835897,
          "measurement_path": "meas_37549e93-00d6-473e-bfb0-00aec9d4addf.parquet"
        },
        {
          "observation_start_ts": 1546839497,
          "measurement_path": "meas_f85e9245-1f01-4b2f-886c-b367603c1c6b.parquet"
        }
      ]
    }
  ]
}

Example 3: Both Aggregates and Detailed Data

Algorithm that outputs counts and detailed detections:

{
  "results": [
    {
      "source_aoi_version": 36809989,
      "data_type": "object_detections",
      "algo_config_class": "car",
      "observations": [
        {
          "observation_start_ts": 1546832297,
          "measurement_path": "meas_34e2dc28-0a28-4dbd-891c-7e3ae49b614b.parquet",
          "observation_values": [{
            "count": 29
          }]
        },
        {
          "observation_start_ts": 1546835897,
          "measurement_path": "meas_37549e93-00d6-473e-bfb0-00aec9d4addf.parquet",
          "observation_values": [{
            "count": 31
          }]
        },
        {
          "observation_start_ts": 1546839497,
          "measurement_path": "meas_f85e9245-1f01-4b2f-886c-b367603c1c6b.parquet",
          "observation_values": [{
            "count": 5
          }]
        }
      ]
    }
  ]
}

Example 4: One Result per Class

Algorithm that detects multiple object types and writes a separate result for each algo_config_class:

{
  "results": [
    {
      "source_aoi_version": 36809989,
      "data_type": "vehicle_detections",
      "algo_config_class": "car",
      "observations": [
        {
          "observation_start_ts": 1546832297,
          "observation_values": [{"count": 15}]
        }
      ]
    },
    {
      "source_aoi_version": 36809989,
      "data_type": "vehicle_detections",
      "algo_config_class": "truck",
      "observations": [
        {
          "observation_start_ts": 1546832297,
          "observation_values": [{"count": 8}]
        }
      ]
    }
  ]
}
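
A minimal sketch of how such per-class results might be assembled from a mapping of class name to count. The function name results_per_class and its arguments are hypothetical; the observation_values shape mirrors the example above.

```python
def results_per_class(source_aoi_version, data_type, start_ts, counts_by_class):
    """Return one result per class, each holding a single count observation."""
    results = []
    for class_name, count in counts_by_class.items():
        results.append({
            "source_aoi_version": source_aoi_version,
            "data_type": data_type,
            "algo_config_class": class_name,
            "observations": [{
                "observation_start_ts": start_ts,
                # Same list-wrapped shape as in Example 4.
                "observation_values": [{"count": int(count)}],
            }],
        })
    return results


output = {
    "results": results_per_class(
        36809989, "vehicle_detections", 1546832297, {"car": 15, "truck": 8}
    )
}
```

Because a class with zero detections still appears in counts_by_class, its result is written with a count of 0 rather than being dropped.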

Key Requirements

Required Fields

Every result must have:

  • source_aoi_version - Which AOI this is for
  • data_type - Must match manifest outputs
  • observations - At least one observation

Every observation must have:

  • observation_start_ts - Unix timestamp
  • At least one of:
    • observation_values - Summary metrics
    • measurement_path - Detailed data file
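
The rules above can be checked locally before the output file is written. The following is a minimal sketch, not the platform's validator; check_required_fields is a hypothetical helper that only covers the required-field rules listed in this section.

```python
def check_required_fields(output):
    """Return a list of human-readable problems; an empty list means none found."""
    results = output.get("results")
    if not isinstance(results, list) or not results:
        return ["'results' must be a non-empty list"]

    problems = []
    for r_idx, result in enumerate(results):
        for key in ("source_aoi_version", "data_type", "observations"):
            if key not in result:
                problems.append(f"results[{r_idx}]: missing '{key}'")

        observations = result.get("observations")
        if observations is None:
            continue  # already reported as missing above
        if not isinstance(observations, list) or not observations:
            problems.append(f"results[{r_idx}]: 'observations' must be a non-empty list")
            continue

        for o_idx, obs in enumerate(observations):
            where = f"results[{r_idx}].observations[{o_idx}]"
            if "observation_start_ts" not in obs:
                problems.append(f"{where}: missing 'observation_start_ts'")
            if "observation_values" not in obs and "measurement_path" not in obs:
                problems.append(f"{where}: needs observation_values or measurement_path")
    return problems
```

Returning a list of messages instead of raising on the first failure lets an algorithm log every problem in one pass.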

Observation Values

  • Keys must match the observation_value_columns declared under algorithm_parameters in the manifest
  • Values can be numbers, strings, booleans, or objects
  • Should be provided even when the value is zero (e.g., {"count": 0})
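
A sketch of comparing the keys in observation_values against the declared columns. The manifest layout is not shown on this page, so declared_columns is assumed to be a plain list of column names already extracted from it, and compare_value_keys is a hypothetical helper. The examples on this page wrap the values in a one-element list while the schema declares an object, so the sketch accepts either shape.

```python
def compare_value_keys(observation_values, declared_columns):
    """Return declared columns that are missing and used keys that are undeclared."""
    # Accept a single object or a list of objects.
    if isinstance(observation_values, list):
        entries = observation_values
    else:
        entries = [observation_values]

    used = set()
    for entry in entries:
        used.update(entry.keys())

    declared = set(declared_columns)
    return {
        "missing": sorted(declared - used),
        "undeclared": sorted(used - declared),
    }


# A zero count is still a provided value, so nothing is reported as missing.
report = compare_value_keys([{"count": 0}], ["count"])
```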

Measurement Files

  • Must be Parquet format
  • Must conform to the Data Type schema
  • Must have unique filenames
  • Should be in the output_path directory
  • Algorithms with no data to report should still write an empty Parquet file with the required columns
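
The uniqueness rule is easy to break when filenames are derived from timestamps or counters. The following sketch lists any measurement_path that is used more than once in an output structure; duplicate_measurement_paths is a hypothetical helper, not part of the platform.

```python
from collections import Counter


def duplicate_measurement_paths(output):
    """Return measurement_path values that appear more than once, sorted."""
    paths = [
        obs["measurement_path"]
        for result in output.get("results", [])
        for obs in result.get("observations", [])
        if "measurement_path" in obs
    ]
    return sorted(path for path, seen in Counter(paths).items() if seen > 1)
```

The check spans all results, so two AOIs that accidentally share a filename are reported as well.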

Optional Fields

  • dest_aoi_version - For cross-AOI analysis
  • algo_config_class - For classification results
  • algo_config_subclass - For fine-grained classification (requires algo_config_class)
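
The schema description states that algo_config_subclass may only be set together with algo_config_class. A minimal sketch of that check follows; subclass_without_class is a hypothetical helper name.

```python
def subclass_without_class(results):
    """Return indexes of results that set algo_config_subclass but not algo_config_class."""
    return [
        idx
        for idx, result in enumerate(results)
        if "algo_config_subclass" in result and "algo_config_class" not in result
    ]


# The class and subclass values below come from the schema's own examples.
results = [
    {"algo_config_class": "large_commercial_aircraft", "algo_config_subclass": "boeing-737-300"},
    {"algo_config_subclass": "airbus_a-300"},
    {"algo_config_class": "car"},
]
offending = subclass_without_class(results)
```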

Writing Output

Python Example

import json
import uuid
from pathlib import Path

import pandas as pd  # to_parquet also needs a Parquet engine such as pyarrow


def write_algorithm_output(output_dir, results_data, data_type='device_visits'):
    """Write measurement files and algo_output.json into output_dir.

    results_data maps each source AOI version to a dict that maps an
    observation start time (Unix seconds) to a pandas DataFrame of measurements.
    """
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)

    results = []

    for aoi_version, aoi_data in results_data.items():
        observations = []

        for start_ts, data_df in aoi_data.items():
            # Write one uniquely named measurement file per observation
            measurement_filename = f"meas_{uuid.uuid4()}.parquet"
            data_df.to_parquet(output_dir / measurement_filename, index=False)

            # Create the observation; the count is reported even when it is zero
            observations.append({
                'observation_start_ts': int(start_ts),
                'measurement_path': measurement_filename,
                'observation_values': [{
                    'count': len(data_df)
                }]
            })

        # Create one result per AOI version
        results.append({
            'source_aoi_version': int(aoi_version),
            'data_type': data_type,
            'observations': observations
        })

    # Write the output file
    output_path = output_dir / 'algo_output.json'
    with open(output_path, 'w') as f:
        json.dump({'results': results}, f, indent=2)

    print(f"Output written to {output_path}")

Validation

The platform validates:

  1. Output file exists at {output_path}/algo_output.json
  2. JSON is valid and matches schema
  3. All required fields are present
  4. Measurement files exist and are valid Parquet
  5. Measurement files conform to Data Type schema
  6. observation_values match manifest declarations
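
Some of these checks can be approximated locally before the platform runs them. The sketch below covers only the first steps (the file exists, the JSON parses, and every referenced measurement file is present on disk); it does not validate Parquet contents or manifest declarations. The preflight function is hypothetical, and resolving relative measurement_path values against the output directory is an assumption.

```python
import json
from pathlib import Path


def preflight(output_dir, filename="algo_output.json"):
    """Return a list of problems found on disk; an empty list means none found."""
    output_dir = Path(output_dir)
    output_file = output_dir / filename
    if not output_file.is_file():
        return [f"missing output file: {output_file}"]

    try:
        output = json.loads(output_file.read_text())
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]
    if not isinstance(output, dict):
        return ["top-level JSON value must be an object"]

    problems = []
    for result in output.get("results", []):
        for obs in result.get("observations", []):
            path = obs.get("measurement_path")
            if path is None:
                continue
            # Assumption: relative paths are resolved against output_dir.
            candidate = Path(path) if Path(path).is_absolute() else output_dir / path
            if not candidate.is_file():
                problems.append(f"missing measurement file: {path}")
    return problems
```

Calling preflight(output_dir) at the end of an algorithm run and logging the returned list gives an early warning before the platform performs its own validation.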

Next Steps