BigDataNYC

Analytics for All:
Putting the Power of Analytics in the Hands of Business Users
   11 years ago
#BigDataNYC Data-Driven Enterprise: What are the Requirements, Challenges, & Approach to transform into a Data-Driven Enterprise
   11 years ago
#BigDataNYC Hadoop for the Enterprise: Is Hadoop Enterprise Ready?
Ben Werther
Does BI have a role against all the types of data that are natural for Hadoop? i.e. how useful is a pie-chart against clickstream, ad impressions or sensor events?
Jeff Kelly
not very ;) need different visualizations for time-series analysis to follow patterns of events
Ben Werther I think you're underestimating what is possible here.
John Furrier
Hadoop is limited from folks we talk to but a big part in certain use cases. small improvement in bus operations yields big value in revenue & business outcomes; faster dynamic BI is the answer; what google did for web pages someone need to do
AnalystOne
Very thought provoking question. Seems like BI is evolving so the answer to this might evolve too. Do you think?
John Furrier
What google did for web pages someone needs to do for BI and DW enabling non IT people to be "the next Billy Beane" #moneyball for enterprises
Dan Hushon
interesting BI vs. BA & the evolvement of #Storm I personally think that link analytics becomes increasingly relevant
Ben Werther
I'm predicting a big topic at Hadoop World will be the difference between BI and Big Data Analytics. It will be a stark contrast.
In Sik Rhee
in-memory no-sql + analytics is becoming more interesting to BI than Hadoop
Ben Werther That is the key. It is the difference between pretty pictures on the surface VS finding deep facts about the business.
Dan Hushon agree, but does that speak to the deficiency of current state BI?
John Furrier totally agree new sw will emerge fast in this area - big oppty imo
Ben Werther I assumed you meant that in-memory + analytics is more important THAN BI to Hadoop.
In Sik Rhee ... it's a natural middle ground. hadoop is not as well suited for data exploration as it is for operationalizing insights after they've been discovered.
Dan Hushon
I think that we are going to see a ton of link/graph building on hadoop to aid in analytic process... question is one of correlation vs. causality at end of day
Jeff Frick Seems to always boil down to correlation vs causality. Some argue that an overwhelming correlation has the same actionable weight, almost a proxy for causality. Seems dangerous.
Dan Hushon don't know that I'm there jeff, if you're trying to remediate better to treat the cause than the symptom
Jeff Kelly
What are the data governance and security implications for self-service analytics? Should business users be let loose on any data source they like? Where does IT fit in this equation?
AnalystOne
Wow more great questions. Very thought provoking. Need more than Twitter to deal with this one...
Suzanne
Data Governance, Data Architecture are extremely important. IT has to provide security and governance for #BigData Analytics to succeed
John Furrier
IT drives this imho bc cloud opens up a can of worms wrt security bc it differs from on-prem vs cloud.. very important that SLAs on the front end meet the cloud security SLA
AnalystOne
What if an enterprise user generates conclusion based on new self service BI and mis-uses info? There are endless scenarios where that could happen. And I can't think of an auditing system that would stop that.
Jeff Kelly if decisions being made are strategic "big picture" decisions, users should have to justify their decisions to others - self-serve analytics is a tool, not the final arbiter of decisions
Dan Hushon
why isn't there a whitelist / blacklist registry
Dan Hushon
and I also believe that tools like #chorus may provide for peer review of analytics through visible analytic process
Jeff Kelly great point - analytics should be a collaborative process
Suzanne
#SelfServiceAnalytics does not equal chaos. It too, can be governed
Dan Hushon
it's funny that we talk about governance, and yet don't recognize the importance of peer review in science ... proofs versus theorems
AnalystOne
Question: is the age of Big Data infrastructure on decline, to be replaced by rise of end user tools? #DataAnalytics
Jeff Frick
Is a data pool, or data ocean, additive, or subtractive to Big Data Infrastructure as you've defined it here?
John Furrier Thank God you didn't say "data lake" - i hate that term :-)
John Furrier
no brainer on the rise.. the market for big data apps hasn't come to life yet bc the killer app is visualization which is being paced by analytics which is waiting on infrastructure #DWBI #ConvergedInfrastructure retooling
AnalystOne Love that reply but thought infrastructure was coming along very well. No?
Seabourne
Big Data is only getting bigger, and more disparate :)
John Furrier both volume and diversity from folks that I talk to.. at @sap @sapphirenow 4 yrs ago we talked about fast data..now #industrialinternet #iot highlights machine to machine data small and fast but very important
Jeff Kelly
the underlying #BigData infrastructure is the enabler of effective and game-changing #analytics - can't have one without the other IMO
John Furrier
also AWS is showing the model for integrating stacks for rapid developer productivity and their next step will be big data apps to be followed by @rackspace @pivotal @ibm @hp etc
John Furrier
Topic: Data Science? What is the role and who are the people and skills needed?
Crowd Captain
Data science should be everyone in the organization according to Florian Zettelmeyer who spoke at #GE event last week
Jeff Frick I heard Florian talking about learning some basic data science methods and vocab, POV, but aren't we trying to democratize data, make it actionable, in the hands of the people?
Crowd Captain
@kelloggschool posted https://twitter.com/KelloggSchool/status/388364388266037250
KelloggSchool
How can you derive tangible business value from #BigData? Florian Zettelmeyer explains [VIDEO]: http://t.co/7RWC0POhwF #IndustrialInternet
7 days ago
Crowd Captain @gesoftware https://twitter.com/GEsoftware/status/387965277800906752
GEsoftware
Most important skills of analytics R not technical; they R thinking skills. -Prof. Florian Zettelmeyer #IndustrialInternet @KelloggSchool
8 days ago
Sylvie Otten (Sollod
IMHO Data scientists combine technical know-how & curiosity to analyze massive amts of data to deliver insights/answers to help solve real business problems.
Sylvie Otten (Sollod
They’re part technologist, scientist, researcher, business analyst, mathematician, statistician, economist and engineer!
Jeff Frick Difficult job description for the new hire
Sylvie Otten (Sollod @JeffFrick ;-) They just need to possess the skills - not necessarily the job description. ;-) I think many folks today already have many of these skills - doesn't mean you have to be expert in all of them - just passionate about them.
AnalystOne
I always loved the @josh_wills data scientist definition: Data Scientist (n.): Person who is better at statistics than any software engineer and better at software engineering than any statistician.
Jeff Kelly
role should be to identify insights that can be productionized and rolled out to business users
Jeff Frick
Will real BI finally be delivered to business users using the latest generation of tools and solutions? Who is leading this charge?
Jeff Frick
Or should i say the "Promise of BI" and DSS going way back.
John Furrier
companies like Platfora are redefining BI and most CIOs want to throw away the old data warehouses and BI techniques and use more agile ways to get data out in the hands of users
AnalystOne Agree regarding firms like @Platfora, they have powerful backend but their analyst-facing vis for #analytics is key
Jeff Kelly
well it won't be the old style of BI apps and platforms - too rigid, slow and not user friendly
Lara Shackelford agree; we see our #FactBased customers moving away from BI to #BigDataAnalytics.
John Furrier
@kdnuggets wrote: https://twitter.com/kdnuggets/status/390903174376488961
kdnuggets
Text Analytics is vital: 68% of Analytics Professionals use both Unstructured & Structured data to get insights http://t.co/Ihkq4rsZsa
an hour ago
Dan Hushon
The other key issue imo. is turning data sets from tribal phenomena to enterprise assets... registries, meta information, lineage, provenance, and the like? thoughts?
Dan Hushon
how often are you calling people for data? and then not knowing how it's been transformed [tampered] and whether it's fit for purpose
Jeff Frick Can you expand on this?
Dan Hushon how many people would reuse a piece of binary code?... for me a derived data set that has been massaged for one purpose may not fit your purpose
John Furrier
I agree that we will see the rise of what I call "graph software" which will create the "next Google" bc search is the big problem..clutter is a discovery and navigation problem to "data"
Jeff Frick
Are enterprises viewing the challenge this way? Really changing the construct.
Jeff Frick
And the elevation of the second order info, the meta data. interesting
Suzanne
There are echelons of #DataConsumers. #DataScientist is not the same as the #DataEnthusiast, and they are both more sophisticated than the common Excel user, but all could need #BigData
Jeff Frick
Wow! There are a number of blog posts in this tweet.
John Furrier
I agree and I would add (bc of CrowdChat) we are also CrowdConsumers - both are new categories of a very important & relevant trend #cc #venturecapital dudes
Dan Hushon
we all do it with @linkedin but don't realize it when we see that certain people are 3 degrees away and then try and figure what the link is and why!
Jeff Frick
Are the LOB folks we're talking about helping in their day jobs with data the #DataConsumers you're referring to? And again, are we empowering them more effectively?
AnalystOne
Question: what can enterprises do to steer the open source world to improve security in big data infrastructure capabilities? Anything? Do we just give up?
Seabourne
Tools that can drop inside of an organization's firewall are useful (flexible hosting)
Jeff Kelly
open source world is actually making good strides here - Apache Accumulo fine-grain security capabilities and more recently Sentry from Cloudera - not as fine-grained but more role-based security
Suzanne
I think the point of OpenSource is to avoid the scrutiny to some degree. IT Governance has to play a part with sensitive data. NSA & WikiLeaks being examples.
John Furrier
I think by communicating with their checkbook, bc open source dudes want to make money too.. @cloudera is targeting that with CDH
AnalystOne I like this answer about communicating with checkbook.
Sylvie Otten (Sollod
Security is a if not THE top issue - and the one that companies must address. Still have a long way to go - but I agree strides are being made. I finally got on this CrowdChat today thanks to working through our own IT Security experts!;-)
AnalystOne Thanks for comments, I see progress, but not sure about strides, yet.
Suzanne
#BigData represents a wealth of security challenges. HDFS hosts a wide variety of formats that can't be dealt w/ the same way. #BIAnalytics layer should limit access as well
Suzanne
#BI defines a framework for decision validation. #DataAnalytics represents interactive process of understanding decision process. 3G of #BI with #Tableau
Jeff Frick
Interactive, seems like a key point, might have a partial hypothesis, but those change as new patterns emerge, check a different path
Jeff Kelly Analytics is fundamentally an iterative process, must be interactive to ask questions, get an answer, ask another question
Crowd Captain
the crowd all agree that Tableau really shines the light on the value of getting at data fast and in many different ways..kills any argument for old model of BI and DW
Jeff Kelly
good distinction - with traditional BI, you model the underlying data to answer predefined questions - interactive analytics allows for iterative investigation of data