Technology Analyst - Big Data at ZS Associates
Oct 2014 - Present
Working on various client engagements:
• Extract data from multiple source systems in Oracle DB into HDFS on Hadoop using Sqoop
• Automated the data migration process with Python scripts
• Process source data arriving as CSV and gzipped CSV
• Perform data validations and create Hive external tables and Impala Parquet tables with Snappy compression, as required
• Analyse datasets with Impala queries using various joins
• Implemented the Hadoop data security model using Sentry, LDAP and ACLs
• Used Sentry to assign role-based permissions and table-level security
• Use Amazon Web Services (AWS) including S3, EC2, EMR, EBS and AMIs
• Implemented the HDFS snapshot-and-restore process, the Hive metastore backup process, and archival of data to S3 and then to Glacier
• Developed a module to make data available to business users with specific requirements; automated the Create/Delete/Assign Roles process for project areas and implemented daily data backups, logging and Sentry authorization
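The Python-scripted Sqoop ingestion described above might be automated along these lines (a minimal sketch: the table name, JDBC URL, credentials and paths are all hypothetical placeholders, and the command is assembled and printed rather than executed):

```python
import shlex

def build_sqoop_import(table, target_dir,
                       jdbc_url="jdbc:oracle:thin:@//db-host:1521/ORCL",
                       username="etl_user",
                       password_file="/user/etl/.sqoop-pwd"):
    """Assemble a Sqoop command to import one Oracle table into HDFS.

    All connection details here are made-up placeholders; in practice they
    would come from configuration, and the command would be launched via
    subprocess on a cluster edge node.
    """
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--username", username,
        "--password-file", password_file,  # keep the password off the CLI
        "--table", table,
        "--target-dir", target_dir,
        "--num-mappers", "4",
        "--compress",                      # gzip-compressed output files
    ]

if __name__ == "__main__":
    cmd = build_sqoop_import("SALES", "/data/raw/sales")
    print(" ".join(shlex.quote(part) for part in cmd))
```

Looping such a builder over a list of tables is one straightforward way to turn per-table Sqoop invocations into a repeatable migration script.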
Systems Engineer (Big Data) at Infosys
Jul 2012 - Aug 2014
Automation using the Big Data/Hadoop framework: automated an end-to-end process of ingesting data from multiple sources into HDFS, applying business logic on top of it, and publishing the processed data to a MySQL table for business-user review. Used Core Java alongside Hadoop, HDFS and MapReduce; scheduled actions through Oozie; used Hive to run queries and process data; and used Sqoop to push/pull data to/from MySQL and Teradata.
Hadoop cluster setup engagement: set up an in-house 3-node cluster and installed components including Hive, Pig, Sqoop, ZooKeeper and Oozie.
Infosys Certified Hadoop Developer
Bachelor of Engineering (B.E.), Computer Science at Chitkara University
Dec 2007 - Dec 2011