Data Lake Analytics: Analyzing millions of NYC Taxi Trips
Most data today tends to be big data. Companies of all sizes are searching for platforms and tools to manage their rapidly growing data sets. However, despite the number of platforms available, bringing heterogeneous data into a homogeneous data lake environment is one of the most daunting aspects of any big data implementation.
Join Flavio and Arjuna as they discuss real-world case studies that illustrate how to manage disparate data with ease and efficiency. They will walk through how to profile, transform, aggregate and analyze New York City taxi data to extract conclusions such as traffic volume by day, hour, and week for a given area, the number of trips to and from an airport, cash versus credit transactions, and much more, all using the open-source (and completely free) HPCC Systems platform built by LexisNexis, one of the largest data aggregation companies in the world. LexisNexis Risk Solutions built HPCC Systems on the premise that big data is a solution and not a problem, and that harvesting large volumes of data from thousands of sources supports a learning-based approach to handling data.
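To give a feel for the kind of aggregation described above, here is a minimal sketch in plain Python rather than on the HPCC Systems platform itself. The sample records, the field names (`pickup`, `payment`) and the helper functions are all hypothetical and are not taken from the actual NYC taxi dataset or from the presentation.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample trips; field names and values are assumptions
# made for illustration, not the real dataset schema.
trips = [
    {"pickup": "2015-01-01 08:15:00", "payment": "credit"},
    {"pickup": "2015-01-01 08:40:00", "payment": "cash"},
    {"pickup": "2015-01-01 17:05:00", "payment": "credit"},
    {"pickup": "2015-01-02 08:20:00", "payment": "credit"},
]

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def parse_pickup(record):
    """Parse the pickup timestamp string of one trip record."""
    return datetime.strptime(record["pickup"], TIME_FORMAT)


def trips_by_day(records):
    """Count trips per calendar day (ISO date string)."""
    return Counter(parse_pickup(r).date().isoformat() for r in records)


def trips_by_hour(records):
    """Count trips per pickup hour of day (0-23)."""
    return Counter(parse_pickup(r).hour for r in records)


def trips_by_payment(records):
    """Count trips per payment type, e.g. cash versus credit."""
    return Counter(r["payment"] for r in records)


if __name__ == "__main__":
    print(trips_by_day(trips))
    print(trips_by_hour(trips))
    print(trips_by_payment(trips))
```

At the scale of millions of trips, the same group-and-count pattern would run on a distributed platform rather than in a single in-memory list; this sketch only shows the shape of the questions being asked.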
In this fun-filled presentation, you will learn how to:
Dr. Flavio Villanustre leads HPCC Systems, and is also VP of Technology for LexisNexis Risk Solutions. In this position, he is responsible for information security, overall platform strategy and new product development. Dr. Villanustre is also involved in a number of projects involving big data integration, analytics and business intelligence. Previously, Dr. Villanustre was Director of Infrastructure for Seisint.
Prior to 2001, Flavio served in a variety of roles at different companies, spanning infrastructure, information security and information technology.
In addition, Dr. Villanustre has been involved with the open source community for over 15 years through multiple initiatives. Some of these include founding the first Linux User Group in Buenos Aires (BALUG) in 1994, releasing several pieces of software under different open source licenses, and evangelizing open source to different audiences through conferences, training and education.
Dr. Villanustre was a neurosurgeon prior to his technology career.
Arjuna Chala is Senior Director of Special Projects for the HPCC Systems platform at LexisNexis Risk Solutions. With almost 20 years of experience in software design, Arjuna leads the development of next-generation big data capabilities, including creating tools around exploratory data analysis, data streaming and business intelligence. Arjuna strives to understand new technologies and bring innovative applications and design to the HPCC Systems platform.
Dedicated to development excellence, Arjuna served as a key member of the team that brought the HPCC Systems platform to the open source community. In his work with HPCC Systems community leaders and system integrator partners, Arjuna's efforts have contributed to the spread of HPCC Systems technology into enterprises domestically as well as in the international markets of China, Brazil, Europe and India. Arjuna has a BS in Computer Science from RVCE, Bangalore University.