Open
Description
Describe the bug
After upgrading to DataFusion 49 (#1154), the expression cache (cached_exprs_evaluator) fails to deduplicate identical extension functions.
This is because DataFusion's SimpleScalarUDF::equals now enforces pointer equality (Arc::ptr_eq) for function implementations (DataFusion PR #16781). Auron currently creates a new Arc for every UDF instance (native-engine/datafusion-ext-functions/src/lib.rs), causing logically identical functions to have different memory addresses.
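The effect can be reproduced without DataFusion. The following is a minimal sketch, not Auron's or DataFusion's actual code: `ToyUdf`, `fresh_udf`, and `shared_udf` are hypothetical names, and `ToyUdf` only imitates an equality check based on `Arc::ptr_eq`. It contrasts allocating a new `Arc` per instance with handing out clones of a single shared `Arc`.

```rust
use std::sync::{Arc, OnceLock};

// Hypothetical stand-in for a scalar function implementation.
type ScalarImpl = Arc<dyn Fn(i64) -> i64 + Send + Sync>;

// Hypothetical UDF wrapper; NOT DataFusion's SimpleScalarUDF.
struct ToyUdf {
    name: &'static str,
    fun: ScalarImpl,
}

// Equality that imitates the pointer-based comparison described above:
// two UDFs are equal only if they share the same Arc allocation.
impl PartialEq for ToyUdf {
    fn eq(&self, other: &Self) -> bool {
        self.name == other.name && Arc::ptr_eq(&self.fun, &other.fun)
    }
}

// Allocates a new Arc on every call, so two results never compare equal.
fn fresh_udf() -> ToyUdf {
    let fun: ScalarImpl = Arc::new(|x: i64| x + 1);
    ToyUdf { name: "add_one", fun }
}

// Creates the Arc once and returns clones of it, so results compare equal.
fn shared_udf() -> ToyUdf {
    static IMPL: OnceLock<ScalarImpl> = OnceLock::new();
    let fun = Arc::clone(IMPL.get_or_init(|| {
        let f: ScalarImpl = Arc::new(|x: i64| x + 1);
        f
    }));
    ToyUdf { name: "add_one", fun }
}
```

Under this assumption, `fresh_udf() == fresh_udf()` is false while `shared_udf() == shared_udf()` is true. One possible direction for a fix would be to cache each function's `Arc` in the same way, so that repeated lookups of the same extension function return the same pointer.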
To Reproduce
Run the following query with logging of the duplicates ("dups") enabled in cached_exprs_evaluator.rs. The log shows 0 duplicates found.
test("my test") {
  withTable("my_cache_table") {
    sql("""
      |create table my_cache_table using parquet as
      |select col1 from values ('{"a":"1", "b":"2"}'), ('{"a":"3", "b":"4"}'), ('{"a":"5", "b":"6"}')
      |""".stripMargin)
    sql("""
      |select
      |  get_json_object(col1, '$.a'),
      |  get_json_object(col1, '$.b')
      |from my_cache_table
      |""".stripMargin).show()
  }
}
Expected behavior
Screenshots
Additional context