hw2013

HadoopWorld BigData NYC
Crowd coverage of BigDataNYC at Strataconf and Hadoop World with thought leaders
John Furrier
3 pieces of a data platform: 1) infrastructure like Hadoop (the data OS) 2) business intelligence as a new way to access data & insight in real time, not separate 3) a combination of smart data, domain knowledge & data science in the platform
Jacques Nadeau
Of those three, where is the biggest gap? I'd argue #2 has yet to grow to the strength of either 1 or 3.
John Furrier the gap is #3: building an analytics app layer on top of the data infra to build applications for the segment of 1 - data engines are needed
Jeff Kelly
need to leverage #Hadoop as a storage and COMPUTATIONAL platform to get its full value #bigdatanyc
John Furrier advanced analytics apps are the most relevant right now from our research..
Jeff Kelly
agree - domain knowledge and the science very difficult combo to find
(((Ellen Friedman)))
I like the way NFS access to MapR platform makes #2 & #3 easier to do. Ex: Storm runs on MapR now w/o Kafka layer; Solr easy to use too #hw2013
John Furrier Solr is very important and enterprises need a cohesive platform with security..love how the platforms are getting mature..
(((Ellen Friedman)))
Bay Area buzz real for big-data, IT etc. But cool global community for some #opensource projects like @ApacheMahout too. Shared innovation
John Furrier
Mahout is getting lots of positive attention.. what is the latest update there?
(((Ellen Friedman))) Mahout v0.8 new from July: streamlined, fast new k-means clustering algorithm, strong math library. And BIG interest in practical recommendation: Mahout + Solr
Crowd Captain
Apache is the case study of how open source should work..according to experts.. Storify use case very interesting here at #strataconf
John Furrier
Data scientist market is not just geeks; the business analyst position is changing. Today an estimated 200k data scientists in the world vs. >2M business analysts (quote on @theCUBE)
Jeff Frick
Have to empower LOB folks with a bit of Data Scientist POV, but LOB population is MUCH larger, and the impact of empowering them, HUGE!
Crowd Captain
domain expertise is very important but machine learning is the scale point for data science bc machines emulate people..that is where this scales according to the experts in the crowd
(((Ellen Friedman)))
Best practices for large scale apps & machine learning benefit from the ability to connect legacy code & new innovation (d3, node.js, Storm, Solr) to Hadoop cluster. That's why I'm strong on access via NFS #hw2013 #mapr #analytics
John Furrier
nice I agree.. functionality is what customers care about but no lock in..
(((Ellen Friedman))) Absolutely. No vendor lock in even w MapR - it's Hadoop API compatible, HBase API compatible. Gives customers open choices now & later #hw2013 #bigdata
Jeff Frick
Biggest surprise from the Keynotes?
John Furrier
Very surprised by the clear articulation of who is competing with who..liked @mapr and @mikeolson presentations
John Furrier
keynotes had mix of commercial pimping and technical conversations..Intel was all nuts and bolts not pushing Intel but tech..same with Facebook
Jeff Frick Any "Ah Ha" moments?
John Furrier
@cloudera message of Data Hub is very smart of them and clear positioning against Pivotal and IBM .. they want to be more than a #hadoop distro
Crowd Captain
Topic in the crowd at #strataconf - continuous availability, that is, zero downtime; if a datacenter goes out, what happens? Netflix has taught us to watch this.. who has what? what is the solution?
John Furrier
this is a key area of importance for any large enterprise and mainly financial markets #bigdataNYC issue not being discussed much at #strataconf #important for #cloud
John Furrier
active replication is a key part and @hortonworks and Apache need to keep the innovation coming to fill the whitespaces.. Hadoop 2 is huge here
(((Ellen Friedman)))
Security is a key issue for Hadoop. News: MapR has engineered new native authentication as they integrated security into their Hadoop distro. Details at http://bit.ly/1au8QcT #strataconf #hw2013
John Furrier
What is going on with Pivotal? Many want to know.. Partnership with Hortonworks and Cloudera and yet competitor; Cloudera clearly moving to bigger mkt than just distro
Dave Vellante
Abhi Mehta on #theCUBE says Hadoop must be more than just a cheap storage platform - value is about storing, processing and analyzing w/in a domain so humans can be more productive
John Furrier
what's great is developers are jamming hard; businesses are adopting; startups are getting funded; VCs investing in value plays #bigdataNYC @theCUBE
(((Ellen Friedman)))
Huge fan of Apache, esp new projects like @ApacheDrill with its 1st milestone release & Apache Storm, mature project entering Apache & building community #hw2013
John Furrier
Some are saying Spark is generating more buzz than Storm.. Drill is very interesting
Jeff Kelly
Storm is getting a ton of buzz, huge interest in streaming analytics - making decisions and executing actions in real time
John Furrier
the advancements are driving new apps and functionality