Tag Archives: Unstructured Data
I have two kids. In school. They generate a remarkable amount of paper: math worksheets, permission slips, book reports (now called reading responses) and newsletters from the school. That’s a lot of paper. Each kind arrives in a different form and serves a different purpose: the math worksheets tell me how my child is doing in math, the permission slips tell me when my kids will be leaving school property, and the book reports tell me what kind of books my child is interested in reading. I need to put the math worksheet information into a storage space so I can figure out how to prop up my kid, if needed, on the basic geometry constructs. The dates covered by the permission slips need to go into the calendar. The book reports can be used at the library to choose the next book.
We are facing a similar problem (albeit on a MUCH larger scale) in the insurance market. We are getting data from clinicians. Many of you are developing and deploying mobile applications to help patients manage their care, locate providers and improve their health. You may capture licensing data to help pharmaceutical companies identify patients for inclusion in clinical trials. You have advanced analytics systems for fraud detection and for checking the accuracy and consistency of claims. Possibly you are at the point of near real-time claim authorization.
The amount of data generated in our world is expected to increase significantly in the coming years. There are an estimated 500 petabytes of data in the healthcare realm, which is predicted to grow by a factor of 50 to 25,000 petabytes by 2020. Healthcare payers already store and analyze some of this data. However, in order to capture, integrate and interrogate large information sets, the scope of payer information will have to increase significantly to include provider data, social data, government data, pharmaceutical and medical product manufacturer data, and information aggregator data.
Right now, you probably depend on a traditional data warehouse model and structured data analytics to access some of your data. This has worked adequately up to now, but with the amount of data that will be generated in the future, you need the processing capability to load and query multi-terabyte datasets in a timely fashion. You also need the ability to manage both semi-structured and unstructured data.
Fortunately, a set of emerging technologies (called “Big Data”) may provide the technical foundation of a solution. Big Data usually includes data sets with sizes beyond the ability of commonly used software tools to capture, curate, manage and process data within a tolerable amount of time. While some existing technology may prove inadequate to future tasks, many of the information management methods of the past will prove to be as valuable as ever. Assembling successful Big Data solutions will require a fusion of new technology and old-school disciplines:
Which of these technologies do you have? Which of them can integrate with both on-premises AND cloud-based solutions? For which of them does your organization have knowledgeable staff who can use their capabilities to take advantage of Big Data?
Gartner’s official definition of Information Governance is “…the specification of decision rights and an accountability framework to encourage desirable behavior in the valuation, creation, storage, use, archival and deletion of information. It includes the processes, roles, standards, and metrics that ensure the effective and efficient use of information in enabling a business to achieve its goals.” It therefore looks to address important considerations that key stakeholders within an enterprise face.
A CIO of a large European bank once asked me – “How long do we need to keep information?”
Keeping Information Governance relevant
This bank had to govern, index, search, and provide content to auditors to show it was managing data appropriately under Dodd-Frank regulation. In the past, this information was retrieved from a database or email. Now, however, the bank was required to produce voice recordings of phone conversations with customers, show the relevant incoming Reuters feeds, and document all appropriate IMs and social media interactions between employees.
All these were systems the business had never considered before. These environments continued to capture and create data, and with it complex challenges. They are islands of information that seemingly have nothing to do with each other, yet they affect how the bank governs itself and how it saves any of the records associated with trading or financial information.
Coping with the sheer growth is one issue; what to keep and what to delete is another. There is also the issue of what to do with all the data once you have it. The data is potentially a gold mine for the business, but most businesses just store it and forget about it.
Legislation, in tandem, is becoming more rigorous and there are potentially thousands of pieces of regulation relevant to multinational companies. Businesses operating in the EU, in particular, are affected by increasing regulation. There are a number of different regulations, including Solvency II, Dodd-Frank, HIPAA, Gramm-Leach-Bliley Act (GLBA), Basel III and new tax laws. In addition, companies face the expansion of state-regulated privacy initiatives and new rules relating to disaster recovery, transportation security, value chain transparency, consumer privacy, money laundering, and information security.
Regardless, an enterprise should consider the following 3 core elements before developing and implementing a policy framework.
Whatever your size or type of business, there are several key processes you must undertake in order to create an effective information governance program. As a Business Transformation Architect, I can see 3 foundation stones of an effective Information Governance Program:
Assess Your Business Maturity
Understanding the full scope of requirements on your business is a heavy task. Assess whether your business is mature enough to embrace information governance. Many businesses in EMEA do not have an information governance team already in place, but instead have key stakeholders with responsibility for information assets spread across their legal, security, and IT teams.
Undertake a Regulatory Compliance Review
Understanding the legal obligations of your business is critical in shaping an information governance program. Every business is subject to numerous compliance regimes managed by multiple regulatory agencies, which can differ across markets. Many compliance requirements depend on the number of employees and/or turnover reaching certain limits. For example, certain records may need to be stored for 6 years in Poland, yet the same records may need to be stored for only 3 years in France.
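To make the retention example concrete, here is a minimal sketch of how per-country retention periods could be looked up when deciding whether a record may be deleted. The names `RETENTION_YEARS`, `retention_end` and `may_delete` are hypothetical, and the periods are only the illustrative figures from the example above; real retention periods depend on the record type and the specific regulation.

```python
from datetime import date

# Illustrative retention periods in years, keyed by country code.
# These mirror the example in the text and are NOT legal guidance.
RETENTION_YEARS = {"PL": 6, "FR": 3}


def retention_end(created: date, country: str) -> date:
    """Return the first date on which the record is no longer under retention."""
    years = RETENTION_YEARS[country]  # KeyError if the country has no policy
    try:
        return created.replace(year=created.year + years)
    except ValueError:
        # Record created on 29 February and the target year is not a leap year.
        return created.replace(year=created.year + years, day=28)


def may_delete(created: date, country: str, today: date) -> bool:
    """True once the retention period for this country has elapsed."""
    return today >= retention_end(created, country)


if __name__ == "__main__":
    created = date(2012, 3, 1)
    today = date(2015, 3, 1)
    for country in ("PL", "FR"):
        print(country, retention_end(created, country), may_delete(created, country, today))
```

The same record can be deletable in one market and still under retention in another, which is why the compliance review has to come before any deletion policy is automated.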
Establish an Information Governance Team
It is important that a core team be assigned responsibility for the implementation and success of the information governance program. This steering group and a nominated information governance lead can then drive forward operational and practical issues, including agreeing and developing a work program, developing policy and strategy, and planning communication and awareness.
If you haven’t updated your B2B integration capabilities in the past five years, are you at risk of being left behind? This is the age of superior customer experience and rapid time-to-value, so speedy customer on-boarding and support for specialized integration services can mean the difference between winning and losing business. A health check starts with asking some simple questions: (more…)
Big Data, Big Problems: Leveraging Informatica 9.5 to Build an Effective Data Governance Strategy to Meet the Big Data Challenge
By: Chris Cingrani, Informatica DQ & MDM Practice Lead, Data Management Practice at SSG Ltd., www.ssglimited.com
Big data is something that I am continually asked about by clients, as the subject continues to gain significant press. When discussing this topic, I often address it from the angle that bigger data volumes will result in bigger data problems. Although this seems like a logical premise, what it really means to an organization, and how to plan accordingly, is often overlooked. Rather than solve the problem in this blog post, I want to focus on two key considerations from a data governance standpoint, as well as discuss why SSG sees Informatica 9.5 as a core component of a sound data governance strategy that can ensure an organization’s business decision-making success. (more…)
By Nancy Atkinson, Senior Analyst, Aite Group
Karen Hsu of Informatica organized a TweetJam (#INFAtj) recently on business-to-business (B2B) payments, SEPA, and integration. In conversation with Chris Skinner of Balatro Ltd., I stayed (mostly) within the 140-character message limitations of Twitter while the hour flew by. (more…)
Hello and welcome to my first blog on Perspectives. I’m Krish Krishnan, and you may have seen me before. I have a channel on BeyeNETWORK on Data Warehouse Architectures and Appliances.
My Perspectives blogs will focus on data warehousing as a practice. In the coming months, I will be publishing posts on architecture and integration challenges and sharing some implementation tips on data warehousing. I welcome your feedback and hope to make this one of your go-to websites for information exchange and sharing insights.
Now I want to cover the State of the Data Warehouse …
Business needs today mandate that data be available to users at the right time so they can make effective decisions. This is the promise that the data warehouse was built on. But in the real world, the data warehouse has morphed into a “big” truth repository, and the business value derived from it depends on perspective. What the current data warehouse lacks is a flexible architecture from a data management and integration perspective. (more…)