• Solid experience developing Big Data, IoT, NLP, and machine learning APIs and solutions, including large-scale data lake systems.
• Experienced in Data Science and Business Intelligence roles across industries.
• Hands-on development experience in the Hadoop and Spark ecosystem.
• Thorough understanding of big data technologies such as MapReduce (M/R), Pig, Hive, HBase, Sqoop, HAWQ, and Spark.
• Implemented real-time analytics and big data processing for the Internet of Things (healthcare wearables) with low-latency requirements using Spring XD and Apache Storm.
• Implemented a machine learning framework for text and predictive analytics using tools such as R, Python scikit-learn, and Spark MLlib.
• Worked on RESTful API development and integration using Python libraries and MuleSoft.
• Knowledge of industry best practices and a deep understanding of how data is extracted, transformed, scrubbed, and loaded in large data storage environments. Experience working with large data sets and developing scalable algorithms.
• Experience developing wrapper shell scripts; familiar with code review, deployment, propagation, and maintenance processes, and with governance tools such as Ranger auditing, Kerberos authentication, and Apache Knox.
• Able to communicate analytical results in a way that is meaningful to business stakeholders and to provide actionable insights. Have executed various projects from proof of concept to client production environments.
Big Data & Analytics Developer, Deloitte
Jul 2013 to Present
Currently working in the Big Data & Analytics team; previously worked on Data Integration initiatives. Part of multiple initiatives covering end-to-end requirement analysis, technical design, coding, and production release of data warehousing, integration, big data, and analytics projects. Some projects I have worked on:
1. Real-time big data analytics of healthcare wearables and other IoT devices for a major healthcare provider, on the Pivotal Big Data stack.
2. Social media and web data analytics for a major US health insurance firm using the Cloudera Hadoop distribution. NLP using Python NLTK, SciPy, and other text mining libraries.
3. End-to-end data lake implementation for a European insurance provider using Informatica BDE and Hortonworks Data Platform, including App SME and change management.
4. Healthcare analytics in Teradata and Informatica for a major health plan provider using the BTEQ, TPT, and MLOAD utilities.
5. Other machine learning and predictive analytics work, such as retail intelligence and patient adherence prediction.
Data Analyst, Cognizant Technology Services
Nov 2011 to Jul 2013
1. Worked with multiple data warehousing technologies, such as Informatica PowerCenter ETL and OLAP/OLTP systems, on a data integration project for a leading healthcare device manufacturer.
2. Worked as an ETL production engineer with the Oracle PL/SQL AFX framework.
B.Tech, Computer Science, BPUT
Dec 2006 to Dec 2010