Hazy: Big Data

Date Submitted: 04/03/2014 10:21 PM

Hazy: Making It Easier to Build and Maintain Big-Data Analytics

By Arun Kumar, Feng Niu, and Christopher Ré

Paper Presentation

Team 4

Bonilha, Eduardo

Gosavi, Amruta

Lawton, Kirke

Sulsberger, Joshua

While existing data management systems assume that data has rigid, precise semantics, a growing share of data contains imprecision or inconsistency.

The proliferation of ever-evolving algorithms to gain insight into data can often be daunting. The developer has to keep up with the state of the art and expend significant effort experimenting with different algorithms.

State-of-the-art approaches to these challenges combine rich databases with statistical analysis and machine learning software.

The complexity of such systems makes building them very challenging, even for PhD-level computer scientists. To address this, these systems need to be turned into commodities that can be easily applied to different domains.

Hazy project hypothesis: the next breakthrough in data analysis will be the ability to rapidly combine, deploy, and maintain existing algorithms.

Hazy identifies two broad categories of common patterns (abstractions): programming abstractions and infrastructure abstractions.

GeoDeepDive is an example of a trained system.

Programming Abstractions

Programming abstractions decouple the developer’s application-specific modeling from the algorithms used. This lets a developer try many different algorithms on the same data set without additional effort. As the algorithms improve, applications that use them improve automatically. (SQL is a familiar example: once a developer has written a query, the same SQL statements run faster as databases become more efficient, without the developer having to do anything.)
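The decoupling described above can be illustrated with a minimal Python sketch. This is not Hazy's API: the names `SOLVERS`, `build_classifier`, `perceptron`, `nearest_centroid`, and the `TRAINING` data are all hypothetical, chosen only to show application code that stays the same while the algorithm behind it is swapped.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical types for this sketch (not part of Hazy).
Example = Tuple[Sequence[float], int]          # (feature vector, label in {-1, +1})
Classifier = Callable[[Sequence[float]], int]  # maps a feature vector to a label
Solver = Callable[[List[Example]], Classifier]


def perceptron(examples: List[Example], epochs: int = 20) -> Classifier:
    """Learn a linear separator with the classic perceptron update."""
    w = [0.0] * len(examples[0][0])
    b = 0.0
    for _ in range(epochs):
        for x, y in examples:
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * score <= 0:  # misclassified: move the separator toward x
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
    return lambda x: 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1


def nearest_centroid(examples: List[Example]) -> Classifier:
    """Predict the label whose class mean is closest to the input."""
    centroids: Dict[int, List[float]] = {}
    for label in {y for _, y in examples}:
        rows = [x for x, y in examples if y == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def predict(x: Sequence[float]) -> int:
        def sq_dist(label: int) -> float:
            return sum((a - c) ** 2 for a, c in zip(x, centroids[label]))
        return min(centroids, key=sq_dist)

    return predict


# Registry of interchangeable algorithms; adding or improving an entry
# requires no change to the application code below.
SOLVERS: Dict[str, Solver] = {
    "perceptron": perceptron,
    "nearest_centroid": nearest_centroid,
}


def build_classifier(examples: List[Example], solver: str = "perceptron") -> Classifier:
    """Application-facing entry point: callers never touch solver internals."""
    return SOLVERS[solver](examples)


# Made-up toy data, for illustration only.
TRAINING: List[Example] = [
    ((2.0, 1.0), 1), ((3.0, 2.5), 1),
    ((-1.0, -2.0), -1), ((-2.5, -1.0), -1),
]

if __name__ == "__main__":
    for name in SOLVERS:
        clf = build_classifier(TRAINING, solver=name)
        print(name, clf((4.0, 3.0)), clf((-3.0, -3.0)))
```

The application only ever calls `build_classifier`; which algorithm runs is a one-word change, much as a SQL query does not change when the database engine underneath it does.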

The core of the Hazy programming abstraction is:

* Relational data model

* Probabilistic logic programming

* Debugging

The relational model provides the advantages of a mature data platform. For example, one can easily load data to and...