Real Life Issues With Big Data In The Enterprise – The Issues With Data Completeness

Posted on: 03-7-2011 in Big Data, Business Intelligence, CIO
This entry is part 3 of 3 in the series Challenges Of Dealing With Big Data

Completeness can mean a lot of different things. In this case I am going to define a piece of data as complete if its description contains all of the available information about the item in question, and if that description is captured as a true, context-neutral representation of the item. In my experience, incomplete data is the single biggest cause of data issues in the enterprise. In fact, if completeness were handled well, the issues listed in the first article of this series would be much less likely to occur.

Here is an example from our recent past, related to the global financial crisis. Consider a bank that gives Paul Michaud a mortgage, along with about 10,000 other people. The bank then bundles the mortgages together and sells them as a Mortgage Backed Security (MBS). An MBS is basically a bond whose interest and principal get paid by the people paying their mortgages. The MBS then gets sold to a bunch of other banks, who hold it in their portfolios.

The problem is that the banks that buy the MBS bonds don’t know that Paul Michaud’s mortgage is in the pool of 10,000 that is responsible for paying their bonds. Often, neither does the bank that sold the MBS in the first place, but that’s another issue. Anyhow, the bank that bought the bond wants to assess the risk in its MBS portfolio. To do this it needs to project cash flows from its investments under different risk scenarios, and to do that well it would like to model the behavior of the individuals who hold the mortgages underlying its MBS. The problem is that it has no idea who holds the mortgages, how much they earn, what their debt load is, etc. At best it has some general summary statistics about the pool of 10,000. Worse yet, even if it were given all of the detailed data, none of its systems would be able to store it correctly anyhow.

At the end of the day, the firm will run a valuation and risk assessment on the data using a supercomputing cluster of 10,000+ servers and generate hundreds to thousands of reports based on it. Unfortunately, all of this fancy analysis is undermined because the data was not complete or accurate at the time of capture. The analysis is at best an approximation, from which the firm may draw incorrect conclusions, as we observed a few years ago.
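To make the gap between summary statistics and loan-level data concrete, here is a toy sketch in Python. Everything in it is an assumption made up for illustration: the `Loan` fields, the rule that a borrower defaults once the mortgage payment exceeds 40% of stressed income, and the four-loan pool. It is not how any real bank values an MBS. It only shows that the same stress scenario can give different answers depending on whether you see the individual borrowers or only the pool averages.

```python
from dataclasses import dataclass


@dataclass
class Loan:
    annual_payment: float    # hypothetical: yearly mortgage payment
    borrower_income: float   # hypothetical: borrower's yearly income


def defaults_under_stress(loan, income_shock, max_payment_ratio=0.4):
    """Toy rule (an assumption): the borrower defaults when the payment
    exceeds max_payment_ratio of income after the income shock."""
    stressed_income = loan.borrower_income * (1.0 - income_shock)
    return loan.annual_payment > max_payment_ratio * stressed_income


def cash_flow_loan_level(loans, income_shock):
    """Project one year of cash flow by testing every loan individually."""
    return sum(
        loan.annual_payment
        for loan in loans
        if not defaults_under_stress(loan, income_shock)
    )


def cash_flow_from_summary(loans, income_shock):
    """Project the same cash flow when only pool averages are known:
    the whole pool is treated as copies of one 'average' borrower."""
    n = len(loans)
    average = Loan(
        annual_payment=sum(loan.annual_payment for loan in loans) / n,
        borrower_income=sum(loan.borrower_income for loan in loans) / n,
    )
    if defaults_under_stress(average, income_shock):
        return 0.0
    return average.annual_payment * n


# A made-up pool of four mortgages with identical payments.
pool = [
    Loan(12_000.0, 100_000.0),
    Loan(12_000.0, 60_000.0),
    Loan(12_000.0, 35_000.0),
    Loan(12_000.0, 32_000.0),
]

if __name__ == "__main__":
    print(cash_flow_loan_level(pool, income_shock=0.1))    # 36000.0
    print(cash_flow_from_summary(pool, income_shock=0.1))  # 48000.0
```

With a 10% income shock, the loan-level view sees the lowest-income borrower default, while the average borrower looks perfectly healthy, so the summary view overstates the projected cash flow. The numbers are invented, but the direction of the error is the point.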

In fact, it is my opinion (and I told this to people in the federal government and to other top advisors to the world’s financial establishment) that this is one of the primary causes of the global financial crisis a few years ago. While everyone was harping on the need to force these banks to disclose more and more information to the government and other regulators, it would have been a garbage in, garbage out (GIGO) exercise, because the banks don’t fully know what they have or the risks they face, largely because of the limitations created by incomplete data. While there were definitely lots of issues inside the banks that contributed to the crisis, I genuinely believe the banks try their best to assess value and risk in their portfolios, and that they disclose what they believe those values and risks to be to the government and regulators. The issue is that it’s a truly hard problem to solve, and if they can’t fix the quality of their data then they will always be at risk of their internal analysis being wrong.

So the bottom line here is that you need your systems to be able to capture, store, maintain and retrieve a complete representation of all of your data. At the very least, the data model inside your systems should be capable of capturing complete data even if you are unable to populate it with complete data at this time. Who knows, a year from now you may be able to fill in the missing values, but if the system wasn’t designed to hold them, you are going to be in trouble.
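As a minimal sketch of that idea, assuming a Python data model, the record below reserves a place for every attribute we would like to know about a mortgage, even the ones we cannot populate today. The class name, the field names and the `completeness` helper are all hypothetical, chosen only to illustrate the point.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class MortgageRecord:
    # Known at capture time in this made-up example.
    loan_id: str
    principal: float
    annual_rate: float
    # The model can hold these, but they may be unknown today (None).
    borrower_income: Optional[float] = None
    borrower_total_debt: Optional[float] = None
    property_postcode: Optional[str] = None


def missing_fields(record: MortgageRecord) -> List[str]:
    """Names of the attributes that have not been populated yet."""
    return [f.name for f in fields(record) if getattr(record, f.name) is None]


def completeness(records: List[MortgageRecord]) -> float:
    """Fraction of all attribute slots that currently hold a value."""
    total = sum(len(fields(r)) for r in records)
    if total == 0:
        return 1.0
    missing = sum(len(missing_fields(r)) for r in records)
    return 1.0 - missing / total


if __name__ == "__main__":
    record = MortgageRecord("L-001", 250_000.0, 0.055)
    print(missing_fields(record))   # the three borrower/property fields
    print(completeness([record]))   # 0.5

    # A year later the missing value arrives and there is a place to put it.
    record.borrower_income = 80_000.0
    print(missing_fields(record))
```

Leaving a value as `None` keeps "unknown" distinct from "zero" or "not applicable", and it lets you measure how incomplete your data is instead of silently losing the gap.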

Remember, at the end of the day, virtually every computer system on the planet has as its primary responsibility the need to capture, store, maintain, retrieve and process data. So if it doesn’t do that primary data job right, then what’s the point of building it in the first place?

As always you can reach me through Twitter, LinkedIn, by using the contact links in the author box or here through the website.

Paul Michaud

Paul Michaud is a co-founder and CEO of Nebility, an enterprise solutions company. Paul has been designing and building some of the world’s largest, most scalable and highest performing applications for over 25 years. Immediately prior to Nebility, Paul was Global Executive IT Architect for Financial Services at IBM.