Debug ETL

Easy SQL includes a debugger interface for ETLs.

Start to debug

We recommend debugging ETLs from Jupyter. Follow the steps below to start debugging your ETL.

  1. Install JupyterLab with the command python3 -m pip install jupyterlab.

  2. Create a file named debugger.py with contents like the following (a more detailed sample can be found here):

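from typing import Any, Dict, Optional

def create_debugger(sql_file_path: str, vars: Optional[Dict[str, Any]] = None, funcs: Optional[Dict[str, Any]] = None):
    # Import lazily so that importing debugger.py does not load Spark.
    from pyspark.sql import SparkSession
    from easy_sql.sql_processor.backend import SparkBackend
    from easy_sql.sql_processor_debugger import SqlProcessorDebugger
    # Reuse the active Spark session if there is one, with Hive support enabled.
    spark = SparkSession.builder.enableHiveSupport().getOrCreate()
    backend = SparkBackend(spark)
    # vars and funcs are passed through to the debugger unchanged.
    debugger = SqlProcessorDebugger(sql_file_path, backend, vars, funcs)
    return debugger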
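
Because create_debugger imports pyspark and easy_sql only when it is called, a missing package shows up late, inside the notebook. A minimal sketch of an optional pre-flight check follows. The helper name missing_packages is illustrative and not part of Easy SQL, and the default package list is an assumption based on the imports used in this guide.

```python
import importlib.util
from typing import Iterable, List


def missing_packages(
    names: Iterable[str] = ("jupyterlab", "pyspark", "easy_sql"),
) -> List[str]:
    """Return the top-level package names that cannot be found for import."""
    # importlib.util.find_spec returns None when a top-level module
    # is not installed in the current environment.
    return [name for name in names if importlib.util.find_spec(name) is None]
```

Calling missing_packages() in the same Python environment that runs jupyter lab returns an empty list when everything needed is importable.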

  3. Create a file named test.sql with the contents shown here.

  4. Start JupyterLab with the command jupyter lab.

  5. Start debugging as shown below:

[Screenshot: ETL Debugging]
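
As a minimal sketch of the first notebook cell, assuming debugger.py and test.sql sit in the directory where jupyter lab was started, the debugger can be created as follows. The helper open_debugger and its parameter names are hypothetical; only create_debugger comes from the debugger.py file described above.

```python
from typing import Any, Dict, Optional


def open_debugger(sql_file_path: str = "test.sql",
                  variables: Optional[Dict[str, Any]] = None):
    """Create a debugger for the given ETL file (hypothetical helper)."""
    # Imported inside the function so this cell can be defined first;
    # assumes debugger.py is importable from the notebook's directory.
    from debugger import create_debugger

    # create_debugger is the function defined in debugger.py; it obtains
    # a Spark session and returns a SqlProcessorDebugger for the file.
    return create_debugger(sql_file_path, vars=variables)
```

In a notebook this would be called as debugger = open_debugger("test.sql"). The methods available on the returned object are described in the API doc.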

Debugger API

Please refer to the API doc here.