Founder at Nube Technologies
Jun 2010 - Present
Nube's product Reifier is an AI-based fuzzy matching engine built on Apache Spark and Hadoop. It finds similar entities in business data for data quality, data governance, Master Data Management (MDM), Customer Relationship Management (CRM), data warehousing, cross-selling, lead management, 360 view of data, ETL, fraud analytics such as AML, and security and compliance such as KYC. Reifier identifies near-duplicate data and links structured and semi-structured records in CRM, product catalogs and other sources for a consolidated 360 view. It collates existing client, customer, vendor or product lists held in different formats and systems, many of which are near duplicates. Each record typically has multiple fields, some of which may be absent in some systems, partially or poorly populated in others, and not matching exactly. Reifier's advanced proprietary machine learning and big data algorithms discover these linked records and near duplicates. See www.nubetech.co for more information.
Nube also provides niche consulting in big data adoption, strategy, data analytics, NLP and machine learning. Some past projects:
1. Crux Reporting for HBase https://github.com/sonalgoyal/crux
2. HIHO for Hadoop ETL https://github.com/sonalgoyal/hiho
3. Cascading job flows for semi-structured data and creation of a data warehouse using Hadoop.
4. Design consulting for a petabyte-scale user enrollment, authentication and reporting system.
5. Design of a cloud-based email archival and indexing system.
6. MapReduce for network analysis.
7. Data deduplication and similarity ranking using MapReduce.
8. Creation of custom AMIs for EC2 with Hadoop and Hive.
9. Implementation of UDFs for Apache Hive and deployment on Amazon Elastic Compute Cloud (EC2).
10. Advertising Campaign Monitoring and Performance
Hadoop, Hive, HBase, Cassandra, NoSQL, Cloud Computing, AWS consultant at Self Employed
Apr 2006 - May 2010
Technical Lead at BabyPackets
Jun 2003 - Mar 2006
Worked on the Voice VPN product, which provides specialized VoIP networks and advanced user preferences.
Sr Associate Technology at Sapient
Apr 2001 - Apr 2003
Worked on customer projects involving EAI, content management systems and web-based custom software. Involved in architecture and design, implementation, performance tuning, team management, testing infrastructure and troubleshooting.
Analyst at Etrade
Mar 2000 - Mar 2001
Took part in the rollout of the ETrade Sweden and ETrade Hong Kong websites and the related backend.
Analyst at Webtek Software (A Dresdner Kleinwort Wasserstein subsidiary)
Dec 1997 - Dec 1999
Worked on the year-end and monthly reporting solution, GAAP adjustments, and import and export of data for the banking division.
Bachelor's Degree, Chemical Engineering, 8.17 at Indian Institute of Technology, Delhi
Dec 1993 - Dec 1997