3.5 years of experience as an architect and design lead in big data technologies: Hive, MapReduce, Spark, Sqoop, Flume, Pig and Python. 8 years of experience in database technologies (Oracle, SQL Server) and scripting languages (Unix shell scripting, Pro*C). Involved across architecture, design, analysis, development and testing.

Key Skills
________________________________________
• Hortonworks Certified Hadoop Developer (HDPCD certification).
• Certified Scrum Master, certified in Oracle SQL and PL/SQL, and specialized in consultative selling techniques.
• Over 6 years of onsite experience with British Gas (United Kingdom), Hatton National Bank (Sri Lanka) and Bank of Baroda (Uganda).
• Oversee and take responsibility for the creation of architectural blueprints (roadmaps) and high-level solution designs, elaborating and developing them throughout the full project lifecycle.
• Develop and manage big data projects involving Hive, MapReduce, Sqoop, shell scripting, Python and Spark.
• Experience leading data professionals in analysis using Hadoop technologies and in fixing data across multiple systems.
• Experience deploying applications to production and other environments on Hadoop and database platforms.
• In-depth knowledge of Oracle and SQL Server, with experience designing and implementing databases and writing queries, stored procedures, functions, triggers and views.
• Act as the internal Subject Matter Expert (SME) for the team’s architecture blueprint, using strong stakeholder management skills to communicate with, influence and, where needed, challenge both internal customers and suppliers.
• Thorough Agile practitioner: detail-oriented and organized, with the ability to manage multiple parallel projects.
• Passionate about following best practices and writing clean, readable and consistent code.
• Responsible for providing cost-effective, optimal solutions across projects and presenting them effectively to customers.
Big Data Lead & Developer, Cognizant Technology Solutions
Jul 2015 to Present
Annual service for British Gas Services customers must happen yearly, based on each customer’s visit dates. Because multiple systems are involved and owing to replication issues, the Annual Service Visit (ASV) was not happening as required for customers with data issues. In this project, data was extracted from the different systems into Hadoop and analyzed, and the data issues were fixed on the source systems.

Tools used: Hadoop (HDFS, Hive), shell scripting, SAP (CRM & ISU), QlikView.

Role and Responsibilities
• Led a team of data analysts, coordinating with the business on requirements and solution design.
• Implemented the extracts in Hadoop using Hive and shell scripting (a minimal sketch follows this list).
• Created Java UDFs for additional functionality where needed.
• Coordinated with the support team on deployment and support.
• Fine-tuned overall system performance for the ASV and P0 extracts and resolved performance bottlenecks effectively.
• Accountable for deploying applications into production, from strategic design through development, and for the daily operations of Hadoop environments.
• Coordinated with the SAP team on the fixes file requirement and automated the fixing.
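The extract step could look like the following minimal sketch: a Python wrapper that submits an inline HiveQL statement through the hive CLI. The table, columns and output path (asv_customer, visit_date, /data/asv/extract) are hypothetical placeholders, not the project’s actual schema.

```python
#!/usr/bin/env python
# Minimal sketch of an extract job: build a HiveQL statement and run it
# through the hive CLI. Table, column and path names are hypothetical.
import subprocess

EXTRACT_HQL = """
INSERT OVERWRITE DIRECTORY '/data/asv/extract'
SELECT c.customer_id, c.visit_date, c.source_system
FROM asv_customer c
WHERE c.visit_date < CURRENT_DATE;  -- visits that are now overdue
"""

def run_extract():
    # 'hive -e' executes an inline HiveQL string; check=True raises
    # if the job exits non-zero, so failures surface to the caller.
    subprocess.run(["hive", "-e", EXTRACT_HQL], check=True)

if __name__ == "__main__":
    run_extract()
```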
Big Data Architect & Developer, Cognizant Technology Solutions
Oct 2013 to Present
British Gas customer and utility data was stored in multiple databases and periodically refreshed into a single database, Microsoft SQL Server, where reporting and analysis were carried out. Since the deployment of the big data platform (Hortonworks), all customer and utility data has been moved to HDFS. British Gas took the strategic decision to decommission the Microsoft SQL Server and build a reporting framework to identify and monitor incorrect data requiring fixes.

Tools used: Hadoop (HDFS, Hive, MapReduce, Sqoop, Spark), shell scripting, Python and R.

Role and Responsibilities
• As technical architect, responsible for designing the framework using big data technologies.
• Designed the data model for the framework in Visio, using a snowflake schema (fact and dimension tables).
• Implemented the data model with Hive scripts; all tables were created in HDFS.
• Wrote Hive queries to identify incorrect data.
• Built a framework, developed in Hive, that populates the data model so incorrect data is monitored in a structured manner: how many issues have been fixed since the last run, how many are new instances and how many remain unfixed.
• Created a scheduler framework in Python that reads configuration files and outputs a shell script to execute the framework (a sketch follows this list).
• Created visualizations (trend graphs and pie charts) in R by connecting to Hive.
• Used Sqoop to transform data for fixes and export it to the relational database.
• Used MapReduce programs for exception-log analysis and for ad hoc files received from the business.
• Enhancements to the existing framework are being developed using Spark.
• Involved in project plan creation (governance discussions and management forums with the customer).
• Managed and resolved queries, escalations, conflicts and issues across onsite and offshore teams.
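The scheduler could be sketched roughly as below: a small Python script that reads an INI-style configuration and emits a shell runner for the framework’s Hive steps. All file names, section names and keys (framework.ini, job:*, hql_file) are hypothetical illustrations, not the actual implementation.

```python
#!/usr/bin/env python
# Minimal sketch of the scheduler idea: read a job configuration and
# emit a shell script that runs the framework's Hive steps in order.
# File names, section names and keys are hypothetical examples.
import configparser

def generate_runner(config_path="framework.ini",
                    script_path="run_framework.sh"):
    cfg = configparser.ConfigParser()
    cfg.read(config_path)

    lines = ["#!/bin/bash", "set -e  # stop at the first failing step"]
    # Each [job:*] section contributes one hive invocation to the script.
    for section in cfg.sections():
        if section.startswith("job:"):
            hql_file = cfg[section]["hql_file"]
            lines.append('hive -f "%s"' % hql_file)

    with open(script_path, "w") as fh:
        fh.write("\n".join(lines) + "\n")

if __name__ == "__main__":
    generate_runner()
```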