Hive UDFs in views

You can create user-defined functions (UDFs) in Hive. Writing a simple one is straightforward, and so is the syntax for declaring it:

CREATE TEMPORARY FUNCTION my_func AS 'in.sinking.udf.MyFunction';

What’s that TEMPORARY doing there? Well, it means that my_func is only available during the current Hive session.

I found myself creating a VIEW that uses my_func. How does that work? Pretty well, as long as you only query it from the same Hive session in which you declared the function. The view stores only the name my_func, and Hive looks that name up again every time the view is queried, so when you next fire up hive you’ll find your VIEW mysteriously fails with:

SemanticException Line 16:4 Invalid function '`my_func`' in definition of VIEW ...
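
Spelled out, the sequence that triggers this looks roughly like the sketch below. The jar path, some_table, some_column and my_view are made-up names for illustration, and the ADD JAR line is an assumption (skip it if the jar is already on Hive's classpath).

```sql
-- Session 1: declare the function, then build a view on top of it.
-- The jar path, some_table, some_column and my_view are hypothetical.
ADD JAR /path/to/my-udfs.jar;
CREATE TEMPORARY FUNCTION my_func AS 'in.sinking.udf.MyFunction';

CREATE VIEW my_view AS
SELECT my_func(some_column) AS result
FROM some_table;

-- Works: my_func is registered in this session.
SELECT * FROM my_view LIMIT 10;

-- Session 2: a fresh hive CLI where my_func has not been declared.
-- The view still exists in the metastore, but the function does not,
-- so this is the query that raises the SemanticException.
SELECT * FROM my_view LIMIT 10;
```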

Gah, that took me a while to figure out. The workaround seems to be to bung the CREATE TEMPORARY FUNCTION ... statement into your .hiverc, so that it runs at the start of every hive CLI session, thereby making it a bit more permanent. There is an old ticket on a related subject on Hive’s Jira.
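
For reference, a minimal .hiverc along those lines might look like the sketch below. It assumes the file lives at ~/.hiverc, which the hive CLI reads on startup, and that the UDF jar needs adding explicitly. The jar path is a placeholder, not a real location.

```sql
-- ~/.hiverc: run by the hive CLI at the start of each session.
-- The jar path is hypothetical; point it at wherever your UDF jar lives.
ADD JAR /path/to/my-udfs.jar;

-- Re-declare the function so views that reference my_func keep working.
CREATE TEMPORARY FUNCTION my_func AS 'in.sinking.udf.MyFunction';
```

Bear in mind this only helps sessions that actually read that .hiverc, so anyone else querying the view needs the same lines in theirs. Newer Hive versions (0.13 onwards) also have a non-temporary CREATE FUNCTION ... USING JAR form that registers the function in the metastore, which may make the .hiverc step unnecessary.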
