ThirdWave

Next Wave Data Management
Learn how to gain visibility, insight & control over data by automating workflows in a closed loop.
#ThirdWave: Next Wave Data Management. Learn how to gain visibility, insight & control over data on premise as well as in the hybrid cloud.
Storage Alchemist
Q5. What does it mean to ‘tackle’ the ‘copy data’ deluge? How do you manage it? How many people does it take?
Peter Eicher
Start by making better use of primary storage. New arrays are much more efficient. You can actually use snaps now without killing performance.
Dave Vellante
it starts with gaining visibility on the copies at your shop
John Furrier
first step is to review the process and practices in place
Storage Alchemist
@Peter_Eicher would you leverage your copies on the primary storage box? Won't that kill performance?
storageswiss
Not the Jacksonville Jaguars defense. They can't stop anything. Not points, and probably not the data deluge.
Sathya Sankaran
You can't manage what you can't measure. First step is to observe and measure the problem!
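As a hedged illustration of that measure-first step, here is a minimal Python sketch that walks a directory tree, hashes file contents, and counts byte-identical copies. The root path, size cutoff, and output format are assumptions for the example, not anything from the chat.

# Sketch: estimate how many byte-identical file copies live under a path.
# The root path and size cutoff are illustrative assumptions.
import hashlib
import os
from collections import defaultdict

def sha256_of(path, chunk_size=1 << 20):
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def count_copies(root, min_size=1 << 20):
    copies = defaultdict(list)          # digest -> list of paths
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.path.getsize(path) >= min_size:
                    copies[sha256_of(path)].append(path)
            except OSError:
                continue                # skip unreadable files
    duplicates = {d: p for d, p in copies.items() if len(p) > 1}
    wasted = sum(os.path.getsize(p[0]) * (len(p) - 1) for p in duplicates.values())
    return len(duplicates), wasted

if __name__ == "__main__":
    groups, wasted_bytes = count_copies("/data")   # hypothetical mount point
    print(f"{groups} duplicate groups, ~{wasted_bytes / 1e9:.1f} GB redundant")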
Peter Eicher
Some vendors like @HDS now let you take snaps off clones. The base clone is 100% copy, but the snaps are now on separate spindles so no production hit. Good idea, but not cheap.
Peter Eicher
@storageswiss Lol! Somebody had a bad fantasy football week!
storageswiss
I think step number one is a full copy on a secondary (less expensive) device. Then trigger snapshots and what have you from there.
ttessks
I wonder how many IT centers really know how many copies they even have or the size of their problem. Seems like we would need to start there.
Peter Eicher
@storageswiss Yes, snap-and-replicate is a great solution. Shift the load to an alternate array. NetApp banging that drum for years.
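For a concrete picture of snap-and-replicate, here is a rough sketch that drives ZFS-style commands from Python: snapshot the primary, ship it to a secondary system, and take downstream snapshots there instead of on production. The dataset names, host name, and the choice of ZFS are assumptions for illustration; array-based replication would use the vendor's own tooling.

# Sketch of snap-and-replicate: snapshot primary, ship the snapshot to a
# secondary system, then take further snapshots there instead of on production.
# Dataset names, hosts, and the use of ZFS are illustrative assumptions.
import subprocess

PRIMARY_DS = "tank/prod/db"            # hypothetical production dataset
SECONDARY = "backup-host"              # hypothetical secondary system
SECONDARY_DS = "vault/copies/db"       # hypothetical dataset on the secondary

def run(cmd):
    print("+", cmd)
    subprocess.run(cmd, shell=True, check=True)

def snap_and_replicate(label):
    snap = f"{PRIMARY_DS}@{label}"
    run(f"zfs snapshot {snap}")        # near-instant, low impact on primary
    # Full send the first time; later runs could use incremental 'zfs send -i'.
    run(f"zfs send {snap} | ssh {SECONDARY} zfs receive -F {SECONDARY_DS}")
    # Downstream copies (test/dev, analytics) come off the secondary, not prod.
    run(f"ssh {SECONDARY} zfs snapshot {SECONDARY_DS}@{label}-testdev")

if __name__ == "__main__":
    snap_and_replicate("nightly-copy")   # example label only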
storageswiss
@ttessks In my experience IT guys know it is a problem. You are correct that they may not know the scope of the problem, but they will admit it is a problem, and they are looking for a solution.
Ira Goodman
@storageswiss You are so right. The IT guys are always the ones running around trying to keep this problem in check, and they need help minimizing a problem they live with day by day.
Frank Weitz
@storageswiss From my experience in Switzerland, the IT guys really know... but it's almost the same: they need the approval to buy a solution.
Dave Vellante
Good white paper on the copy data problem (sponsored by @actifio) by @baldydubois http://www.actifio.c...
Storage Alchemist
Q4. If the problem is too much data, don't data reduction technologies like #compression and #dedupe solve that?
Peter Eicher
They help for certain use cases, not others.
Peter Eicher
Running data mining off a deduped backup store with SATA drives? See you next month when the job is finished.
Sathya Sankaran
It is the difference between reducing garbage and compacting garbage!
storageswiss
My opinion is that dedupe is an overrated technology. I like it, but it should not be all things to all people.
storageswiss
Dedupe essentially rewards bad behavior; you wouldn't do that with your kids and you shouldn't do it with your data.
Peter Eicher
@storageswiss Yup. You keep shoveling the same garbage into the box. But the shoveling itself takes a toll. (Too much metaphor)
Storage Alchemist
@storageswiss agree - need to have undeduped copies to run tests and analytics
Storage Alchemist
@Peter_Eicher and you end up breaking the shovel
Dave Vellante
data-deduplication was a 1-time hit - it created a baseline and now it's off to the data races
Jay Livens
You need to understand the data that you are storing and why and use the right technology.
Storage Alchemist
@JLivens Good point, Jay. Question is, how do you manage it all? #copydata #ThirdWave
Sathya Sankaran
Deduplication is not free, and it doesn't solve the underlying problem... it gets you some space and time to get your act together.
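To make the "space but not a solution" point concrete, here is a small sketch that estimates a fixed-block dedupe ratio for a set of files by hashing blocks. The file names and block size are assumptions; the point is that stored blocks shrink while the number of logical copies to manage stays the same.

# Sketch: rough fixed-block dedupe ratio for a set of files.
# It shows what dedupe buys (fewer stored blocks) and what it doesn't
# (the logical copies still exist and still have to be managed).
# File list and block size are illustrative assumptions.
import hashlib

def dedupe_ratio(paths, block_size=4096):
    logical = 0
    unique_blocks = set()
    for path in paths:
        with open(path, "rb") as f:
            for block in iter(lambda: f.read(block_size), b""):
                logical += len(block)
                unique_blocks.add(hashlib.sha1(block).digest())
    physical = len(unique_blocks) * block_size
    return logical / physical if physical else 1.0

if __name__ == "__main__":
    files = ["copy1.vmdk", "copy2.vmdk", "backup.vmdk"]   # hypothetical copies
    print(f"approx dedupe ratio: {dedupe_ratio(files):.1f}:1")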
Jay Livens
In my view the challenge comes back to metadata. You need to understand what you have and why. Without that knowledge, it is difficult to make intelligent choices.
Storage Alchemist
@JLivens You get the prize, man... It is all about the metadata. Question is, when is that data set too big? What is the best way to manage it? Is it a #datacatalog?
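As one hedged reading of what a #datacatalog could mean at its simplest: a table of metadata about every copy, queryable by purpose, age, and size. The schema and values below are an illustrative sketch in SQLite, not any vendor's catalog.

# Sketch: the smallest possible #datacatalog - one row of metadata per copy.
# Schema and fields are illustrative assumptions, not a product design.
import sqlite3

conn = sqlite3.connect("copy_catalog.db")    # hypothetical catalog store
conn.execute("""
    CREATE TABLE IF NOT EXISTS copies (
        source      TEXT,    -- production dataset the copy came from
        location    TEXT,    -- array / host / cloud bucket holding the copy
        purpose     TEXT,    -- backup, dr, test/dev, analytics, ...
        created_at  TEXT,    -- when the copy was taken
        expires_at  TEXT,    -- when it can be reclaimed
        size_gb     REAL
    )""")
conn.execute(
    "INSERT INTO copies VALUES (?, ?, ?, ?, ?, ?)",
    ("prod/db", "secondary-array-1", "test/dev",
     "2014-01-15T02:00", "2014-02-15T02:00", 750.0),
)
conn.commit()

# The payoff: questions like "what is holding my capacity?" become one query.
for row in conn.execute(
        "SELECT purpose, ROUND(SUM(size_gb), 1) FROM copies GROUP BY purpose"):
    print(row)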
Stuart Miniman
from @otherscottlowe - 10 things to know about data dedupe http://wikibon.org/w...
Jay Livens
Doesn't this problem closely mirror the archive problem of moving inactive and even duplicate data to less expensive storage?
Sathya Sankaran
@JLivens I think the problem definition includes avoiding that inactive and duplicate data as well
Jay Livens
Right, and so I wonder if we should differentiate between active and inactive duplicate data, primarily because the SLA is different between the two.
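One hedged way to act on that active/inactive split: tag each copy with a tier based on how recently it was accessed, and hang a different SLA (and storage tier) on each. The 30-day threshold and paths below are arbitrary assumptions.

# Sketch: split copies into "active" and "inactive" tiers by last access,
# so different SLAs (and cheaper storage) can be applied to each.
# The 30-day threshold is an arbitrary assumption.
import os
import time

INACTIVE_AFTER_DAYS = 30

def classify(paths):
    now = time.time()
    tiers = {"active": [], "inactive": []}
    for path in paths:
        idle_days = (now - os.path.getatime(path)) / 86400
        tier = "inactive" if idle_days > INACTIVE_AFTER_DAYS else "active"
        tiers[tier].append(path)
    return tiers

if __name__ == "__main__":
    sample = ["/copies/db_clone_jan", "/copies/db_clone_today"]  # hypothetical
    print(classify(sample))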
Storage Alchemist
What are the other copy data use cases besides #backup, #archive, #dr, #test/dev, #analytics?
Dave Vellante
data mart/data warehouse
ttessks
Remote sites tend to have copies for their own use.
Peter Eicher
@ttessks Yes, geographic distribution can multiply all of the above.
Dave Vellante
#testdev and #devops practitioners are copy crazed maniacs
ttessks
@Peter_Eicher that's what she said
Peter Eicher
@dvellante Also take into account version control. I just tweaked this, "Save." I just changed a line, "save." Let's see how this works, "Save."
Storage Alchemist
@Peter_Eicher good point - do you think it is helpful to control costs?
Sathya Sankaran
#VMSprawl... Imagine that when VMs traverse different hypervisors and hybrid, private, and public clouds!