
Q1: What is #Spark?

Thulasiram Valleru
Just like Hadoop, it's a platform for distributed computing.

jameskobielus
Spark is a distributed in-memory analytics tool.

IBM Analytics
You can post your comments here - starting with the first question.

jameskobielus
Essentially, Spark is a next generation cluster-computing solution, runtime processing environment, and development framework for in-memory advanced analytics.

jameskobielus
Apache Spark's core design feature is the ability to support iterative, distributed, parallelized algorithmic program execution entirely in memory, without need to write out result sets after each pass through the data.
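
To make the point about iterative, in-memory execution concrete, here is a minimal plain-Python sketch. It does not use Spark; `CountingSource`, `refine`, `iterate_from_storage`, and `iterate_in_memory` are hypothetical names invented for illustration, and the sketch models only the repeated reads from storage, not the writing of intermediate result sets.

```python
# Hypothetical illustration only: none of these names are Spark APIs.

class CountingSource:
    """Stands in for slow storage and counts how many full scans are made."""

    def __init__(self, rows):
        self._rows = list(rows)
        self.scans = 0

    def read_all(self):
        self.scans += 1
        return list(self._rows)


def refine(estimate, data):
    """One iteration: move the estimate halfway toward the mean of the data."""
    mean = sum(data) / len(data)
    return estimate + 0.5 * (mean - estimate)


def iterate_from_storage(source, passes):
    """Every pass goes back to storage for its input."""
    estimate = 0.0
    for _ in range(passes):
        estimate = refine(estimate, source.read_all())
    return estimate


def iterate_in_memory(source, passes):
    """Read once, keep the data in memory, and iterate over the cached copy."""
    cached = source.read_all()
    estimate = 0.0
    for _ in range(passes):
        estimate = refine(estimate, cached)
    return estimate
```

Both functions compute the same estimate; the difference is that the cached version scans the source once rather than once per pass, which is the saving the message above describes.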

Joel Horwitz
Spark to me is a development framework for creating Intelligent Applications.

jameskobielus
This capability makes Apache Spark well-suited for the growing range of real-time applications, such as Internet of Things applications, where much or most of the data analysis will be performed on cached, live data rather than stored, historical data.

Joel Horwitz
@andbflo_denny Spark is lightweight and fast as heck.

Himanshu Mehra
Apache Spark is an in-memory distributed computing engine designed specifically for machine learning.

jameskobielus
@andbflo_denny Advantages of Spark are speed, simplicity, versatility, ability to work with your HDFS data, etc.

Pam Denny
@JSHorwitz love fast!

jameskobielus
Spark's performance advantages come from parallelizing models across distributed in-memory clusters.

Kimberly Madia
An engine built for simplicity and speed, with connections to any data source and in-memory processing. It enables collaboration and the ability to work with all data while abstracting away technical challenges.

Avadhoot
@jameskobielus Is Spark an extension of MapReduce 2?

Joel Horwitz
@avi_patwardhan It's similar, but not the same. Here's a good overview: http://www.quora.com...

Thulasiram Valleru
@JSHorwitz Even in the HDFS architecture, from what I've heard, results are computed in memory and then stored on disk.

jameskobielus
@avi_patwardhan Spark is not an extension to MapReduce; instead, it complements MapReduce through a separate SQL and runtime engine geared for distributed in-memory parallelized real-time computations across clusters.

IBM Analytics
A light-weight compute engine for data science offering end to end support for data scientists, developers and data engineers.

Kimberly Madia
Spark is among the fastest-growing open source projects; it's exciting to see community innovation really taking off.

Thulasiram Valleru
@jameskobielus It complements MapReduce, as in Hadoop, but the difference is in-memory computing.

Kimberly Madia
@andbflo_denny Spark is used across all industries to move from dashboards and alerts to meaningful and timely action. Use cases include machine learning, iterative analytics and Internet of Things applications.

jameskobielus
@elesinOlalekan Not a matter of #Spark over #Hadoop. It's more a matter of Spark leveraging and extending Hadoop to address a broader range of use cases: in-memory, streaming, graph analytics, etc.

Andrew C. Oliver
@jameskobielus Spark is a DAG, and you can consider MapReduce to be a special case / subset of a DAG. The problem with MapReduce is that each step is linear; Spark runs non-dependent operations in parallel.
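
The DAG idea in the message above can be sketched in plain Python. This is not Spark's scheduler; `run_dag`, `EXAMPLE_TASKS`, and `EXAMPLE_DEPS` are hypothetical names used only for illustration. Tasks whose dependencies are already satisfied are submitted together, so operations that do not depend on each other land in the same wave.

```python
from concurrent.futures import ThreadPoolExecutor


def run_dag(tasks, deps):
    """Run a small task graph wave by wave.

    tasks: name -> function taking a dict of its dependencies' results
    deps:  name -> list of names that must finish first
    Returns (results, waves), where each wave lists tasks submitted together.
    """
    results = {}
    waves = []
    remaining = set(tasks)
    with ThreadPoolExecutor() as pool:
        while remaining:
            # A task is ready once every one of its dependencies has a result.
            ready = sorted(
                name for name in remaining
                if all(dep in results for dep in deps.get(name, []))
            )
            if not ready:
                raise ValueError("cycle or missing dependency in task graph")
            # Submit the whole wave before waiting on any single task.
            futures = {
                name: pool.submit(
                    tasks[name], {dep: results[dep] for dep in deps.get(name, [])}
                )
                for name in ready
            }
            for name, future in futures.items():
                results[name] = future.result()
            waves.append(ready)
            remaining -= set(ready)
    return results, waves


# Hypothetical example graph: "evens" and "odds" both need "load",
# but not each other, so they can run in the same wave.
EXAMPLE_TASKS = {
    "load": lambda inputs: [1, 2, 3, 4],
    "evens": lambda inputs: [x for x in inputs["load"] if x % 2 == 0],
    "odds": lambda inputs: [x for x in inputs["load"] if x % 2 == 1],
    "join": lambda inputs: sum(inputs["evens"]) - sum(inputs["odds"]),
}
EXAMPLE_DEPS = {"evens": ["load"], "odds": ["load"], "join": ["evens", "odds"]}
```

Calling `run_dag(EXAMPLE_TASKS, EXAMPLE_DEPS)` groups the work into three waves, with `evens` and `odds` sharing the middle one. A strictly linear pipeline would run those two steps one after the other.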

jameskobielus
Forward-looking organizations see Spark as a platform to complement their investments in advanced analytics, machine learning platforms, and big-data platforms such as Hadoop.

jameskobielus
Currently in version 1.3.1, Spark is a layered distributed-computing framework that can leverage much of the Hadoop storage environment, including HDFS.

Thulasiram Valleru
How Hadoop + Spark drastically improve Analytics

Andrew C. Oliver
@elesinOlalekan It isn't either/or. You will in all probability use Spark with HDFS and a lot of Hadoop-related tooling, just instead of Map-Reduce. So it replaces pieces of Hadoop but not the whole thing.

Kimberly Madia
Spark complements data management and data discovery solutions with agile data science and application development. A key enabler of collaboration and innovation.

IBM Analytics
Time to move to Question #2 - look out!

Himanshu Mehra
How does IBM see Spark's future?

IBM Analytics
We would recommend looking at the latest post from @ibmbigdata

Kimberly Madia
@andbflo_denny Yes! CenterPoint Energy is doing this today: https://ibm.biz/BdXq... They are able to resolve issues electronically, with no need to deploy a truck and crew.

Kimberly Madia
@andbflo_denny Excellent question, lots of possibilities. Car makers talk about automatically playing music to suit your driving: if it's fast, with lots of stops and starts, the driver needs classical music to calm down.

Kimberly Madia
Ideas for Spark apps in automotive: more profitable aftermarket products based on driving preferences, and more interactive, safer driving experiences that respond to approaching dangers.

Monica Fox
@andbflo_denny ^amen - from a fellow bostonian :)

Kimberly Madia
@andbflo_denny :) I don't have the nerve to drive in Boston! I read a few other cool ideas from the Consumer Electronics Show, like automatically changing your alarm clock depending on traffic and adjusting airbag deployment based on the weight of the driver.