
Incorrect validation_result["results"]["exception_info"] structure when raised_exception == True #10849

@vasilijyaromenka


Describe the bug
When raised_exception == True, exception_info has the wrong structure.
Instead of the flat dict {'raised_exception': True, 'exception_traceback': 'The traceback', 'exception_message': 'some message'}, the same dict is nested under an additional key:
{"additional_key": {'raised_exception': True, 'exception_traceback': 'The traceback', 'exception_message': 'some message'}}

To Reproduce

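# Assumes `spark` (the Databricks SparkSession), `context` (a GX data context)
# and `data_source_configs` (a dict of data source names) are already defined.
import great_expectations as gx

# df to validate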
df = spark.sql("""
            SELECT  id , CASE WHEN id%4 = 0 THEN "NOT NULL" END AS colname
            FROM  range(1, 100)""")

# update expectation suite
suite_name = "e_simple_unit_test"

suite = context.suites.add_or_update(gx.ExpectationSuite(name=suite_name))

correct_column_name = gx.expectations.ExpectColumnValuesToNotBeNull(
    column="colname", mostly=1, row_condition="id%2 = 0", condition_parser="spark")

incorrect_column_name = gx.expectations.ExpectColumnValuesToNotBeNull(
    column="___colname___", mostly=1, row_condition="id%2 = 0", condition_parser="spark")

suite.add_expectation(correct_column_name)
suite.add_expectation(incorrect_column_name)

suite.save()

# update validation
data_source_name = data_source_configs["data_source_name"]
data_asset_name = data_source_configs["data_asset_name"]
batch_definition_name = data_source_configs["batch_definition_name"]

batch_definition = context.data_sources.get(data_source_name).get_asset(data_asset_name).get_batch_definition(batch_definition_name)
validation_definition_name = "unit_test_validation_definition"

validation_definition = gx.ValidationDefinition(
    data=batch_definition, suite=suite, name=validation_definition_name
)

unit_test_validation_definition = context.validation_definitions.add_or_update(validation_definition)

# run the ValidationDefinition
validation_results = unit_test_validation_definition.run(
    batch_parameters={"dataframe": df},
    result_format="COMPLETE")
results_dict = validation_results.to_json_dict()

# Report whether each result's exception_info is flat (expected) or nested
for dct in results_dict["results"]:
    if "exception_message" in dct["exception_info"]:
        print("\nCorrect exception_info structure:")
    else:
        print("\nIncorrect exception_info structure:")

    print(dct["exception_info"])

Output:

Incorrect exception_info structure:
{"('column_values.nonnull.condition', '242ce27d28b7ac28fe08ad7be0377b1a', ())": {'exception_traceback': 'Traceback.......', 'exception_message': 'Error: The column "___colname___" in BatchData does not exist.', 'raised_exception': True}}

Correct exception_info structure:
{'raised_exception': False, 'exception_traceback': None, 'exception_message': None}

Expected behavior

Correct exception_info structure:
{'raised_exception': True, 'exception_traceback': 'Traceback.......', 'exception_message': 'Error: The column "___colname___" in BatchData does not exist.'}

Correct exception_info structure:
{'raised_exception': False, 'exception_traceback': None, 'exception_message': None}
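
Until the structure is consistent, a caller can normalize both shapes on their side. The following is only a minimal workaround sketch, not part of Great Expectations: the helper name normalize_exception_info and the EXPECTED_KEYS constant are hypothetical, and it assumes the nested form is always a dict whose values are the flat dicts shown above.

```python
from typing import Any

# Keys that a flat exception_info dict is expected to carry (taken from the report above).
EXPECTED_KEYS = {"raised_exception", "exception_traceback", "exception_message"}


def normalize_exception_info(exception_info: dict[str, Any]) -> list[dict[str, Any]]:
    """Return a list of flat exception_info dicts (hypothetical helper).

    A flat dict comes back as a single-element list. A dict nested under
    additional keys is unwrapped into one flat dict per key.
    """
    if EXPECTED_KEYS.issubset(exception_info):
        return [exception_info]
    return [
        inner
        for inner in exception_info.values()
        if isinstance(inner, dict) and EXPECTED_KEYS.issubset(inner)
    ]


# The two shapes observed in the output above.
nested = {
    "('column_values.nonnull.condition', '242ce27d28b7ac28fe08ad7be0377b1a', ())": {
        "exception_traceback": "Traceback.......",
        "exception_message": 'Error: The column "___colname___" in BatchData does not exist.',
        "raised_exception": True,
    }
}
flat = {"raised_exception": False, "exception_traceback": None, "exception_message": None}

for info in (nested, flat):
    for entry in normalize_exception_info(info):
        print(entry["raised_exception"], entry["exception_message"])
```

Returning a list rather than a single dict leaves room for the nested form carrying more than one key, which this report does not confirm or rule out.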

Environment (please complete the following information):

  • Great Expectations Version: [e.g. 1.3.1]
  • Data Source: Spark
  • Cloud environment: Databricks

Metadata

  • Assignees: No one assigned
  • Labels: bug
  • Status: In progress