GetGraph

Do you need graph analytics?
Graph: The missing link in big data analytics
Cray Inc.
How does graph fit in with other big data analytics tools such as Hadoop and Spark?
Eric Dull
Graph supports interactive analysis that powers discovery analytics, aligning with Spark’s batch and streaming engines. Graph provides an index into Hadoop data stores, enabling analysts to identify needles in needlestacks and validate the findings.
James Maltby
There are quite a few good graph analysis tools in Hadoop and Spark...
Amy Hodler
@eg_dull @Graph_Guy So it's not really an either or choice.
James Maltby
But I usually think of H/S as being good for pre- and post-processing
Eric Dull
@amyhodler Not at all. They are extremely complementary and support different parts of a production workflow
Mallinath Sengupta
Compared to other big data analytics tools, Graph is better suited when:
- The problem is network-centric, or we need answers like 'Yes/No/Maybe' in response to a question like 'Who are my FATCA customers?'
James Maltby
ETL on the front end and post-processing with a statistical package like R
Eric Dull
Map/Reduce can prep a graph, graph discovers some knowledge, and Streaming / CEP handles detecting / acting on that knowledge in 'production'
Eric Dull
@eg_dull one example of a workflow containing all three types of analytic engines
Mallinath Sengupta
Used in a complementary way, they can make a winning combination
Eric Dull
@amyhodler Computer network analysis. IDS / IPS systems handle actioning the discoveries found through graph analysis enabled by map/reduce data preparation
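The three-stage workflow described in this thread (batch preparation, graph discovery, streaming action) can be sketched in a few lines. This is only a toy illustration: the log format, the function names, and the fan-out rule used as the "discovery" are all assumptions made for the example, not anything the participants described.

```python
from collections import defaultdict

# Hypothetical raw connection logs: "source -> destination".
raw_logs = [
    "10.0.0.5 -> 10.0.0.9",
    "10.0.0.5 -> 10.0.0.10",
    "10.0.0.5 -> 10.0.0.11",
    "10.0.0.7 -> 10.0.0.9",
]

def prepare_edges(lines):
    """Batch step (stand-in for map/reduce): parse raw lines into edges."""
    return [tuple(part.strip() for part in line.split("->")) for line in lines]

def discover_scanners(edges, min_fanout=3):
    """Graph step: flag sources that contact many distinct destinations."""
    targets = defaultdict(set)
    for src, dst in edges:
        targets[src].add(dst)
    return {src for src, dsts in targets.items() if len(dsts) >= min_fanout}

def act_on_stream(events, watchlist):
    """Streaming step (stand-in for IDS/IPS or CEP): alert on watched sources."""
    for src, dst in events:
        if src in watchlist:
            yield f"ALERT {src} -> {dst}"

edges = prepare_edges(raw_logs)
watchlist = discover_scanners(edges)
live_events = [("10.0.0.5", "10.0.0.20"), ("10.0.0.8", "10.0.0.9")]
alerts = list(act_on_stream(live_events, watchlist))
print(alerts)
```

In a real deployment each stage would run on a different engine; the point here is only the hand-off of data between the stages.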
Mallinath Sengupta
While a conventional DB returns definitive values, a Graph DB combined with knowledge models can return ‘partially true responses’
Mallinath Sengupta
This can be used to provide suggestions in case of Money Laundering investigations
Cray Inc.
When is graph the best analysis choice?
Eric Dull
Graph is the best choice to analyze human and human-generated activities, like computer networks, road networks, power grids, and social networks. Graph is also the best choice for merging and identifying correlations between different types of data
James Maltby
If your data is naturally irregular, graph would be the best choice. Think of the web of relationships between genes in the human genome
Amy Hodler
When you're monitoring for emergent patterns. (The unknown unknowns.)
Eric Dull
The patterns in these activity graphs lend themselves to graph databases
James Maltby
Relational databases are a mature technology and can be very efficient for suitable tasks. If the task fits a relational style, that would be the best choice.
Mallinath Sengupta
Graph is the better choice when a hypothesis calls for deeper data analysis to prove or disprove it. An example is a financial crime investigation
Eric Dull
@mallinath0603 Indeed. The performance needed to get answers back to people to prove hypotheses is very important for us
James Maltby
The set of Facebook friends is a natural graph: think of a one-billion by one-billion table with only 0.000035% nonzero entries
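A quick back-of-the-envelope check of that sparsity figure, taking the one-billion user count and the 0.000035% density from the message above at face value:

```python
# Sanity check of the quoted sparsity, using only the numbers in the message.
users = 1_000_000_000
total_cells = users * users            # one-billion by one-billion table
nonzero_fraction = 0.000035 / 100      # 0.000035 percent as a fraction
nonzero_cells = total_cells * nonzero_fraction
avg_friends = nonzero_cells / users    # nonzero entries per row

print(f"{nonzero_cells:.2e} nonzero cells, about {avg_friends:.0f} friends per user")
```

That works out to roughly 350 friends per user, so almost all of the table would be empty cells.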
Amy Hodler
Is graph used for complexity system modeling? (Dynamical systems.)
Eric Dull
@Graph_Guy To get a little in the weeds, those graphs lend themselves to adjacency lists, where the connections are very sparse
Eric Dull
@eg_dull rather than more fully-connected graphs where matrices play a much bigger role
James Maltby
The internal representation of the graph is really a sparse matrix
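The adjacency-list view and the sparse-matrix view are closely related. The sketch below, using a made-up four-node edge list, builds both; the compressed sparse row (CSR) layout shown is one common sparse-matrix format and is an assumption here, since the thread does not say which format any particular product uses.

```python
from collections import defaultdict

# Hypothetical directed edges over nodes 0..3.
edges = [(0, 1), (0, 2), (1, 2), (3, 0)]
num_nodes = 4

def to_adjacency_list(edges):
    """Map each source node to the list of nodes it points at."""
    adj = defaultdict(list)
    for src, dst in edges:
        adj[src].append(dst)
    return dict(adj)

def to_csr(edges, num_nodes):
    """Return (row_ptr, col_idx) for the 0/1 adjacency matrix in CSR form.

    Row i's nonzero columns are col_idx[row_ptr[i]:row_ptr[i + 1]].
    """
    adj = to_adjacency_list(edges)
    row_ptr, col_idx = [0], []
    for node in range(num_nodes):
        col_idx.extend(sorted(adj.get(node, [])))
        row_ptr.append(len(col_idx))
    return row_ptr, col_idx

adjacency = to_adjacency_list(edges)
row_ptr, col_idx = to_csr(edges, num_nodes)
print(adjacency)
print(row_ptr, col_idx)
```

The CSR arrays are just the adjacency lists laid end to end with an index of where each row starts, so storage grows with the number of edges rather than with the square of the number of nodes.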
Eric Dull
@amyhodler Very much so. Bayesian Belief nets and power grid analysis are two examples of which I am aware
James Maltby
@amyhodler Semantic databases are logical, not really floating point
Mallinath Sengupta
It is also suitable for cases where subject matter experts need to make discoveries by comparing complex patterns
James Maltby
However, mathematical graph algorithms are core to dynamical system analysis
Cray Inc.
How is graph database analysis different from relational database analysis?
James Maltby
Different languages, for example SQL vs. SPARQL...
Tara Wilson
Relational DBs are great for performing the same operation on a large amount of data, while graph DB analytics are better for highly connected / complex data
Eric Dull
Graph databases provide an efficient way to represent connections between entities, correlate information to the entities, and handle missing or incomplete data. They also enable analysis of those connections that is difficult or impossible otherwise
James Maltby
But it requires a very different way of thinking about your data: you need a different mental model
James Maltby
Graph databases can also scale well
Eric Dull
@Graph_Guy Indeed. Good news is that graph is a native representation for a number of data types
Eric Dull
@Graph_Guy Makes the data-thinking easier
Mallinath Sengupta
Graph DBs scale more naturally to large data sets because they do not typically require extensive join operations. Because they depend less on a rigid schema, they are better suited to ad hoc and changing data environments
Manogna Bhargav
@eg_dull @Graph_Guy Does this mean that graph databases can handle problems that relational databases are incapable of (such as friend-of-friend problems where the insight is the relationship between data points), whereas the converse is not true?
Eric Dull
Jason Ready at GA Tech did some very interesting graph work that showed scale (billions). Relational database tables go larger, but the analyses start to bog down badly as they get complex
Eric Dull
Great point. Every graph "hop" tends to lead to another table lookup, and in cluster environments, performance suffers when you go off the local node
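To make the "hop" idea concrete, here is a minimal sketch of a friend-of-friend lookup over an in-memory adjacency list. The names and the `within_hops` helper are hypothetical. In a relational edge table each additional hop would typically be another self-join, whereas here it is one more pass over stored neighbor lists.

```python
def within_hops(adj, start, max_hops):
    """Return every node reachable from `start` in at most `max_hops` hops."""
    seen = {start}
    frontier = {start}
    for _ in range(max_hops):
        # One hop: expand the frontier by following stored neighbor lists.
        frontier = {
            nbr
            for node in frontier
            for nbr in adj.get(node, ())
            if nbr not in seen
        }
        seen |= frontier
    return seen - {start}

# Hypothetical undirected friendship graph.
friends = {
    "ann": ["bob", "cy"],
    "bob": ["ann", "dee"],
    "cy": ["ann"],
    "dee": ["bob", "eve"],
    "eve": ["dee"],
}

direct = within_hops(friends, "ann", 1)
friends_of_friends = within_hops(friends, "ann", 2) - direct
print(sorted(friends_of_friends))
```

Raising `max_hops` extends the same loop to longer chains without changing the query's shape.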
Eric Dull
Betweenness centrality (identifying the key nodes holding a graph together) is a great example of one of those types of problems
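For readers who have not met betweenness centrality, the sketch below computes it with Brandes' algorithm on a small made-up graph. It assumes an unweighted, undirected graph stored as a dictionary in which every node appears as a key, and it returns unnormalized scores; it is an illustration, not the implementation used by any product mentioned in this discussion.

```python
from collections import deque

def betweenness_centrality(adj):
    """Brandes' algorithm for an unweighted, undirected graph.

    `adj` maps every node to a list of its neighbors.
    Returns unnormalized betweenness scores.
    """
    score = {v: 0.0 for v in adj}
    for s in adj:
        # BFS from s, counting shortest paths and recording predecessors.
        order = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}   # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating dependencies.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each undirected pair was counted once from each endpoint.
    return {v: c / 2 for v, c in score.items()}

# Hypothetical graph: a chain a-b-c with two extra leaves d and e on c.
graph = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d", "e"],
    "d": ["c"],
    "e": ["c"],
}
scores = betweenness_centrality(graph)
print(max(scores, key=scores.get))
```

Node `c` scores highest because every shortest path between the two sides of the graph runs through it, which is the "holding the graph together" property described above.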
Mallinath Sengupta
GraphDB is better suited where you want insight into things you may not know to look for. In other words, it leads to discoveries that we may not be aware of
James Maltby
It is very useful to follow long chains of relationships without having to keep the schema in your head!
James Maltby
Humans can typically only keep chains of 2-5, while a graph pattern can be 10, 20, 30...
Ted Slater
The uniformity of graph knowledge representation (e.g. subject-predicate-object triples) makes it much easier to integrate data than in relational environments, enriching analytics downstream.
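A small sketch of why uniform triples make integration easy: two unrelated sources can be merged with a plain set union and then queried with the same pattern matcher. The identifiers, predicates, and the `match` helper are all invented for illustration; a real system would use an RDF store and SPARQL rather than Python tuples.

```python
def match(triples, s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]

# Two hypothetical sources, each already expressed as
# (subject, predicate, object) triples.
accounts = {("acct:42", "ownedBy", "person:ann")}
network = {
    ("person:ann", "loggedInFrom", "ip:10.0.0.7"),
    ("person:bob", "loggedInFrom", "ip:10.0.0.8"),
}

# Integration is a set union: no schema mapping or join tables needed.
merged = accounts | network

# Follow two predicates across what used to be separate sources.
owners = [o for _, _, o in match(merged, s="acct:42", p="ownedBy")]
ips = [o for owner in owners
       for _, _, o in match(merged, s=owner, p="loggedInFrom")]
print(ips)
```

Because both sources share the shared identifier `person:ann`, the query crosses from account data into network data without any extra mapping step.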
Cray Inc.
Where is graph in use today?
Mallinath Sengupta
We are using Graph in our AI-enabled solutions for Financial Regulatory Compliance
James Maltby
Graph analytics is very popular in the life sciences, because nature does not tend to form tables. Natural data tends to be very irregular
Eric Dull
Deloitte is using graph analysis underlying a number of service offerings, including Cyber analytics, Supply Chain analysis, and social network analysis. Here is a link to more detail on one of our cyber analytic offerings: http://goo.gl/rVfZdn
James Maltby
RDF and SPARQL were originally invented to index and make sense of the World Wide Web
James Maltby
There has been some recent interesting work in using graphs to analyze the spread of epidemics
Mallinath Sengupta
@Nextangles We also use Graph to aid Financial Crime Investigation, including AML/KYC, and Risk solutions like Liquidity Management
Eric Dull
@eg_dull in which we are using graph analytics to do some very interesting things
James Maltby
Graph is actually used for Oilfield logistics in the North Sea off Norway: https://www.epim.no/...
Ted Slater
@eg_dull Absolutely, cybersecurity is a big and growing application for graph technology.
David W. Mizell
Cybersecurity has a characteristic that is natural for graph analysis, in the sense that you may have this big database of network events, any single one of which looks innocuous, but when you pick out a pattern of related events, it's significant.
Mallinath Sengupta
To learn more about how Graph is used in AML investigations, please go to https://nextangles.c...
Eric Dull
@tedslater Yes. We have seen great successes in intrusion analysis and merging network activity, threat information, and contextual information
Ted Slater
I've used graphs in @OpenBEL for simulation in networks of "causal" relationships between biological molecules and processes. BEL can be converted to RDF and back again with free software.