task-6
Tue Oct 14 2025 18:50:09 GMT+0000 (Coordinated Universal Time)
Saved by @rcb
STEP 1: Launch Hive
Open the Terminal in Cloudera and type:
hive
You should see:
Logging initialized using configuration in /etc/hive/conf/hive-log4j.properties
hive>
STEP 2: Create or Use a Database
SHOW DATABASES;
CREATE DATABASE IF NOT EXISTS company;
USE company;
Confirm:
SELECT current_database();
STEP 3: Create a Table
Create a simple table to hold employee data.
CREATE TABLE employees (
id INT,
name STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
TBLPROPERTIES ("skip.header.line.count"="1");
STEP 4: Create a CSV File
Exit Hive (type exit;) and in the terminal run:
cd /home/cloudera/
gedit employees.csv
Paste the data below:
id,name
101,satvik
102,rahul
103,rishi
104,nithish
Save and close the file.
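LOAD DATA only moves the file into the table's directory; it does not validate the rows. A value that cannot be read as the declared column type shows up as NULL at query time. The sketch below is an optional pre-load check for that, written in plain Java. The class name EmployeesCsvCheck and its method are hypothetical helpers, not part of Hive, and the integer test only approximates how Hive parses an INT column.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical pre-load check for employees.csv; not part of Hive.
class EmployeesCsvCheck {

    // Returns one message per data line that would not map cleanly to
    // (id INT, name STRING). The first line is treated as the header.
    static List<String> findProblems(List<String> lines) {
        List<String> problems = new ArrayList<String>();
        for (int i = 1; i < lines.size(); i++) {
            // Limit -1 keeps trailing empty fields, e.g. "105," gives two fields.
            String[] fields = lines.get(i).split(",", -1);
            if (fields.length != 2) {
                problems.add("line " + (i + 1) + ": expected 2 fields, found " + fields.length);
                continue;
            }
            try {
                Integer.parseInt(fields[0]);
            } catch (NumberFormatException e) {
                problems.add("line " + (i + 1) + ": id is not an integer: " + fields[0]);
            }
        }
        return problems;
    }

    public static void main(String[] args) throws Exception {
        // Default path is the one used in this step; pass another path as the first argument.
        String path = args.length > 0 ? args[0] : "/home/cloudera/employees.csv";
        List<String> problems = findProblems(Files.readAllLines(Paths.get(path)));
        if (problems.isEmpty()) {
            System.out.println("No problems found in " + path);
        } else {
            for (String p : problems) {
                System.out.println(p);
            }
        }
    }
}
```

To try it, save the code as EmployeesCsvCheck.java, then run javac EmployeesCsvCheck.java followed by java EmployeesCsvCheck.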
STEP 5: Load Data into the Hive Table
Reopen Hive:
hive
USE company;
LOAD DATA LOCAL INPATH '/home/cloudera/employees.csv' INTO TABLE employees;
Check the data:
SELECT * FROM employees;
✅ Output:
101 satvik
102 rahul
103 rishi
104 nithish
STEP 6: Create a Hive UDF (Java File)
Exit Hive and go back to the terminal.
Create a new Java file:
gedit CapitalizeUDF.java
Paste the code:
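import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hive UDF that capitalizes the first letter of a string and lowercases the rest.
public class CapitalizeUDF extends UDF {
    // Hive calls evaluate() for each row; a NULL column value arrives as null.
    public Text evaluate(Text input) {
        if (input == null) return null;
        String str = input.toString().trim();
        if (str.isEmpty()) return new Text("");
        // Uppercase the first character, lowercase the remainder.
        String result = str.substring(0, 1).toUpperCase() + str.substring(1).toLowerCase();
        return new Text(result);
    }
}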
Save and close.
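Before compiling against the Hive libraries, the capitalization rule can be checked on its own. The sketch below repeats the logic of CapitalizeUDF.evaluate using String instead of Text, so it needs no Hadoop or Hive JARs. The class name CapitalizeLogicCheck is a hypothetical helper and is not packaged into the UDF JAR.

```java
// Plain-Java check of the capitalization rule; no Hive or Hadoop classes needed.
// CapitalizeLogicCheck is a hypothetical helper name, not part of the UDF JAR.
class CapitalizeLogicCheck {

    // Same rule as CapitalizeUDF.evaluate, but on String instead of Text.
    static String capitalize(String input) {
        if (input == null) return null;
        String str = input.trim();
        if (str.isEmpty()) return "";
        return str.substring(0, 1).toUpperCase() + str.substring(1).toLowerCase();
    }

    public static void main(String[] args) {
        // Sample inputs: lowercase, uppercase, padded with spaces, empty, and null.
        String[] samples = {"satvik", "RAHUL", "  rishi ", "", null};
        for (String s : samples) {
            System.out.println("[" + s + "] -> [" + capitalize(s) + "]");
        }
    }
}
```

Save it as CapitalizeLogicCheck.java, then run javac CapitalizeLogicCheck.java followed by java CapitalizeLogicCheck. Surrounding spaces are trimmed and null passes through unchanged, matching the UDF.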
STEP 7: Compile the Java File
In the terminal:
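javac -classpath "$(hadoop classpath):/usr/lib/hive/lib/*" -d . CapitalizeUDF.java
The quotes keep the shell from expanding the * wildcards, so javac receives them as classpath entries.
If compilation succeeds, javac prints nothing and creates CapitalizeUDF.class in the current directory.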
STEP 8: Create a JAR File
jar -cvf CapitalizeUDF.jar CapitalizeUDF.class
Check:
ls
You should see:
CapitalizeUDF.java CapitalizeUDF.class CapitalizeUDF.jar
STEP 9: Add JAR to Hive
Open Hive again:
hive
USE company;
ADD JAR /home/cloudera/CapitalizeUDF.jar;
You’ll get a confirmation similar to:
Added resources: /home/cloudera/CapitalizeUDF.jar
STEP 10: Create a Temporary Function
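CREATE TEMPORARY FUNCTION capitalize AS 'CapitalizeUDF';
A temporary function exists only for the current Hive session, so repeat Steps 9 and 10 after restarting Hive.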
STEP 11: Use the Function
SELECT id, capitalize(name) AS capitalized_name FROM employees;
Expected output:
101 Satvik
102 Rahul
103 Rishi
104 Nithish


