A great deal of Big Data software is available today. One tool you should definitely know about is the H2O AI machine learning platform.
With this open-source application you can implement algorithms from the fields of statistics, data mining and machine learning. H2O is a distributed, in-memory engine that can run on top of Hadoop, which lets it handle data sets that are too large for many single-machine analysis tools. Your machine learning methods can thus be run as parallelized methods.
Software Stack
You can program your algorithms in R, Python and Java, three of the most widely used languages for mathematical programming. H2O provides a REST interface that exchanges JSON and can be used from Python, R and Excel. Additionally, you can access H2O directly from Hadoop and Apache Spark. This makes integration into your data science workflow much easier. You get approximate results while the algorithms are still running. A graphical, browser-based UI helps you analyze the processes and optimize them in a targeted way.
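To make the REST interface more concrete, here is a minimal sketch of how a client could address an H2O node over HTTP. It assumes a single node on the local machine on port 54321 (the usual default) and version 3 of the REST API, where `/3/Cloud` reports cluster status and `/3/Frames/{frame_id}` describes one frame. The helper names and the frame ID `my frame.hex` are made up for illustration. The script only builds the requests; `fetch_json` would need a running H2O node.

```python
import json
from urllib.parse import quote, urljoin
from urllib.request import Request, urlopen

# Assumption: one H2O node on this machine, listening on the default port.
BASE_URL = "http://localhost:54321/"


def build_request(path: str) -> Request:
    """Build (but do not send) a GET request for an H2O REST endpoint."""
    return Request(
        urljoin(BASE_URL, path),
        headers={"Accept": "application/json"},
        method="GET",
    )


def cluster_status_request() -> Request:
    # /3/Cloud reports the cluster status in version 3 of the REST API.
    return build_request("3/Cloud")


def frame_request(frame_id: str) -> Request:
    # Frame IDs can contain characters that must be percent-encoded in a URL.
    return build_request("3/Frames/" + quote(frame_id, safe=""))


def fetch_json(request: Request, timeout: float = 5.0) -> dict:
    """Send the request and decode the JSON reply (needs a running H2O node)."""
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Only the URLs are printed here, so no cluster is needed to run this script.
print(cluster_status_request().full_url)
print(frame_request("my frame.hex").full_url)
```

In practice you would rarely write these requests by hand, because the R and Python client packages issue them for you.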
How Clients Interact with H2O AI
You can interact with H2O via clients using various interfaces. It is important to know that the data is usually not held in the client's memory. It resides in the H2O cluster, and you only get a pointer to the data when you make a request.
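The pointer idea can be illustrated with a small toy model. This is not H2O code: `ToyCluster` and `FrameHandle` are hypothetical classes that only mimic the division of labor, with the cluster owning the rows and the client holding nothing but a key.

```python
class ToyCluster:
    """Hypothetical stand-in for an H2O cluster: it owns the actual rows."""

    def __init__(self):
        self._store = {}

    def put(self, key, rows):
        # The rows stay inside the cluster; the caller only gets a handle back.
        self._store[key] = [tuple(row) for row in rows]
        return FrameHandle(self, key)

    def row_count(self, key):
        return len(self._store[key])

    def head(self, key, n):
        return self._store[key][:n]


class FrameHandle:
    """What the client holds: a key that points at data in the cluster."""

    def __init__(self, cluster, key):
        self._cluster = cluster
        self.key = key

    def nrows(self):
        # Every question about the data is forwarded to the cluster.
        return self._cluster.row_count(self.key)

    def head(self, n=2):
        # Only the requested slice travels back to the client.
        return self._cluster.head(self.key, n)


cluster = ToyCluster()
handle = cluster.put("readings.hex", [(1, "a"), (2, "b"), (3, "c")])

print(handle.key)
print(handle.nrows())
print(handle.head(2))
```

The point of the design is that the client stays lightweight: even for a very large data set, only small answers such as a row count or the first few rows travel back over the wire.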
H2O Frame
The basic unit of data storage accessible to you is the H2O Frame. It is a two-dimensional, resizable and potentially heterogeneous data structure: a table with labeled axes whose columns can hold different types.
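As a rough illustration of what "two-dimensional, resizable and potentially heterogeneous" means, the following sketch implements a tiny column-oriented table in plain Python. `ToyFrame` is a hypothetical class invented for this example and says nothing about how H2O stores frames internally; the column names and values are placeholders.

```python
class ToyFrame:
    """Hypothetical column-oriented table; not the H2O implementation."""

    def __init__(self, columns):
        lengths = {len(values) for values in columns.values()}
        if len(lengths) > 1:
            raise ValueError("all columns must have the same number of rows")
        self._columns = {name: list(values) for name, values in columns.items()}

    @property
    def shape(self):
        # (rows, columns): the two dimensions of the table.
        nrows = len(next(iter(self._columns.values()), []))
        return (nrows, len(self._columns))

    @property
    def names(self):
        # Column labels form one of the labeled axes.
        return list(self._columns)

    def types(self):
        # Heterogeneous: each column may hold a different type.
        return {
            name: (type(values[0]).__name__ if values else "empty")
            for name, values in self._columns.items()
        }

    def add_column(self, name, values):
        # Resizable: columns can be added after the frame was created.
        values = list(values)
        if self._columns and len(values) != self.shape[0]:
            raise ValueError("new column must match the existing row count")
        self._columns[name] = values

    def row(self, index):
        return {name: values[index] for name, values in self._columns.items()}


frame = ToyFrame({"sensor": ["a", "b"], "reading": [0.5, 1.5]})
frame.add_column("valid", [True, False])

print(frame.shape)
print(frame.names)
print(frame.types())
print(frame.row(0))
```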
H2O Cluster
Your H2O cluster consists of one or more nodes. Each node is a JVM process, and each process consists of three layers: language, algorithms and core.
H2O Machine Learning Components
Language Layer
The R evaluation layer is subordinate to the R REST client front-end, which drives it. In the Scala layer, by contrast, you can write native programs and algorithms, which you can then use with H2O machine learning.
Algorithms Layer
This layer contains the algorithms that H2O provides. Data import, statistical methods and machine learning all run here.
Core Layer
This layer handles resource management. Both memory and CPU processing capacity are managed here.