datafriction

Modern Infrastructure Mgmt
Accelerating Productivity Through Machine Learning
jameskobielus
Is your infrastructure data becoming more of a strategic asset or overhead burden? (one response only)


Jason Johnson PMP
http://www.via-cc.at...

Randy Arseneau
Have to strike the balance - level investment in tooling and resources with derived (and derivable) value.
Matt Cauthorn
@dorkninja Agreed, 100% - hence the #datagravity subject. Can hit diminishing returns if not handled strategically
Randy Arseneau
Winners will be the ones most able to commoditize pattern data quickly and repeatably.
Matt Cauthorn
One thought here - infrastructure data *is* a strategic asset. It's up to orgs to extract its value...
Matt Cauthorn
2/ and that can become a non-trivial task
jameskobielus
@dorkninja What does it mean to "commoditize pattern data"? Does this have something to do with building and training ML models that can be reused in new IT infra mgt apps and possibly even resold in online marketplaces?
jameskobielus
@dorkninja Right. And balance the investment in IT tooling/infra against investment in IT management human resources, including developers of IT mgt machine learning assets.
Jason Johnson PMP
Question #3 coming up.
Chris Selland
The key is to not treat data in a stovepiped way but be able to manage and integrate all forms of data the SAME way
George Gilbert
with that approach you can create a data lake for IT which can be further refined into machine learning models of how specific domains work
Randy Arseneau
Agreed, although challenging from an integration and orchestration perspective sometimes.
Chris Selland
you can create a #datalake for the entire org - not just IT. Which is how it should be
George Gilbert
@dorkninja exactly - that's the trade-off
Colin Walker
Agreed, assuming you can actually make use of that data lake in a meaningful way. So often people get huge cesspools of unrefined data that isn't useful, doesn't have appropriate ML features, and that they can garner no insight from. #SadMLFails
Matt Cauthorn
here too I'd say that different data sources warrant different treatment, though ultimately they can be unified
Jim Shocrylas
see cesspools being created and access to data through same old stovepiped approach
Chris Selland
@mcauth that's the reality today but also the goal
Colin Walker
+1. Hurts my data loving soul, but it's a song all too common. Companies know they "need" big data, but have no idea how to implement/refine/use/benefit. So they end up with a non useful solution, are mired in it for years, and fall behind.
Chris Selland
@Jshoc I've heard about data lakes that became swamps but first time I've heard cesspool
Randy Arseneau
@colin_walker Yup. And that inhibits future innovation and the appetite to experiment.
jameskobielus
Nice to have you on the chat.
Colin Walker
Precisely. Suddenly the message is "We tried that big data thing. It's bogus. Why should we try it again?" when new, innovative, powerful approaches surface. Same issue "cloud" and many new techs have faced, frankly.
George Gilbert
@colin_walker some vendors build in the data/ML smarts for a particular domain so their application collects and organizes only the data they need in the data lake
Chris Selland
@jameskobielus happy to be here - see you in NYC in a few weeks?
jameskobielus
What's the "SAME way" that all data should be managed and integrated? A data lake? Data warehouse? Data refinery?
Colin Walker
Truth. Unfortunately I think the biggest value of ML won't come from silo driven lakes. Correlation is *money* when it comes to trends/patterns.
Jim Shocrylas
akin to dashboard exhaust
jameskobielus
@ggilbert41 Specific IT management domains? Such as incident response? etc.
Chris Selland
@jameskobielus data lake with governance - I like Steve Smith's piece https://www.eckerson...
Chris Selland
@colin_walker that's right and you often won't know what's correlated up front
jameskobielus
I'll be at Strata at the end of the month. Let's talk. You guys doing the Cube?
Peter Evans
Data lakes only become cesspools if they do not have a good ILM process in place from the outset
Colin Walker
Precisely. Process/analyze/correlate the best you can, and store for further analysis? Yes please. #DataNirvana
Matt Cauthorn
@colin_walker It's all about getting to the data with minimal friction, max velocity
Chris Selland
Great let's do that - @2thebeach coordinating our Cube participation
George Gilbert
@jameskobielus i think incident response is more horizontal. i was thinking about intrusion detection or management of an application's full stack such as SAP, etc
Neil Raden
@colin_walker We're seeing the emergence of lineage, provenance, governance and security capabilities for data lakes, but what is still lacking is a clear vision of the desired outcome.
Colin Walker
@NeilRaden Oof, that hits home hard. 100% agree. People don't know what they want to know, yet. They're looking for not only the answers, but the questions they're supposed to be asking. Makes it tough to set up appropriate ILM, practices, etc.
Colin Walker
@mcauth Exactly! How fast can I get to insights? How little time can I waste digging through what I don't need, or care about?
Randy Arseneau
@colin_walker There are some emerging autodiscovery and pattern sniffing techniques that can help here I think. Akin to pharma saving all clinical trial and vector data forever, in case a future superbug appears.
Chris Selland
@EvansBI yes but - you don't always know what you're looking for so it needs to be flexible - historically ILM hasn't been
Matt Cauthorn
@NeilRaden ...and perhaps a commitment from the organization to tap into the potential. It'll provide returns if the commitment is there.
Peter Evans
@NeilRaden problem, I believe, is confusion from vendors about what outcome is best for the data you have ingested - specific use cases should be designed by industry to enable governance to work and be compliant with regulatory rules (GDPR, etc.)
Colin Walker
@dorkninja Agreed. Lots of options coming that use ML to ask the questions of the data without humans having to know what to ask. Definitely the way forward, IMO. #OneMansOpinion. Part of why we use this technique in our tech. #DontKnowWhatYouDontKnow
Randy Arseneau
@colin_walker Spent a little time with Jeff Jonas while at IBM - he would vehemently agree!
Peter Evans
ILM based on regulatory compliance and industry methodology can be tuned to be both flexible and encompassing - problem with Big Data technologies is that they mostly did not start out that way, so it's difficult to enact ILM correctly
Matt Cauthorn
@dorkninja patterns for sure...and relevant features extracted from the operational data. This is key to the ML equation, to put it mildly
jameskobielus
@mcauth What's the role of machine learning in unifying the treatment of different data sources while also enabling differentiated task-specific IT management insights?
Matt Cauthorn
@jameskobielus Increasingly large. Putting data sources aside for a moment, the growth alone warrants machine assist. Human analyst error rates are high vs. machines for a huge swath of ops intel.
Chris Selland
@EvansBI yup that's the challenge but also the opportunity - we're working on it at @unifisoftware
tomgolway
@jameskobielus ML has a substantial role in demand forecasting/management
Colin Walker
@mcauth Exactly. Forget sources or silos. The sheer growth and volume of inbound to be processed means ML is either currently or soon 100% required, full stop. Otherwise what's your option? Call center sized buildings full of data analysts? #Pass
Peter Evans
Funny that so do we at @solixbigdata :-)
jameskobielus
@colin_walker Industrial-grade feature discovery on that infrastructure data can call out the predictors from unstructured data. Perhaps cluster analysis algorithms.
Colin Walker
@jameskobielus Precisely. Proper Discovery -> ML -> Analysis -> Reporting chains are the way of the future. #NotKickBoxing
Chris Selland
@EvansBI we should probably chat offline!