Category Archives: Data Integration
Original article can be found at scmagazine.com.
On Jan. 13 the White House announced President Barack Obama’s proposal for new data privacy legislation, the Personal Data Notification and Protection Act. Many states have laws today that require corporations and government agencies to notify consumers in the event of a breach – but it is not enough. This new proposal aims to improve cybersecurity standards nationwide with the following tactics:
Enable cyber-security information sharing between private and public sectors.
Government agencies and corporations with a vested interest in protecting our information assets need a streamlined way to communicate and share threat information. This component of the proposed legislation incentivizes organizations that participate in knowledge-sharing with targeted liability protection, as long as they are responsible in how they share, manage and retain private data.
Modernize the tools law enforcement has to combat cybercrime.
Existing laws, such as the Computer Fraud and Abuse Act, need to be updated to incorporate the latest cyber-crime classifications while giving prosecutors the ability to target insiders with privileged access to sensitive and private data. The proposal also specifically calls for prosecution of those who sell private data nationally and internationally.
Standardize breach notification policies nationwide.
Many states have some sort of policy that requires notification of customers that their data has been compromised. Three leading examples include California's breach notification law, Florida's Information Protection Act (FIPA) and Massachusetts' Standards for the Protection of Personal Information of Residents of the Commonwealth. New Mexico, Alabama and South Dakota have no data breach notification legislation. Enforcing standardization and simplifying the requirement for companies to notify customers and employees when a breach occurs will ensure consistent protection no matter where you live or transact.
Invest in increasing cyber-security skill sets.
For a number of years, security professionals have reported an ever-increasing skills gap in the cybersecurity profession. In fact, in a recent Ponemon Institute report, 57 percent of respondents said a data breach incident could have been avoided if the organization had more skilled personnel with data security responsibilities. Increasingly, colleges and universities are adding cybersecurity curriculum and degrees to meet the demand. In support of this need, the proposed legislation mentions that the Department of Energy will provide $25 million in educational grants to Historically Black Colleges and Universities (HBCU) and two national labs to support a cybersecurity education consortium.
This proposal is clearly comprehensive, but it also raises the critical question: How can organizations prepare themselves for this privacy legislation?
The International Association of Privacy Professionals conducted a study of Federal Trade Commission (FTC) enforcement actions. From the report, organizations can infer best practices implied by FTC enforcement and ensure these are covered by their organization’s security architecture, policies and practices:
- Perform assessments to identify reasonably foreseeable risks to the security, integrity, and confidentiality of personal information collected and stored on the network, online or in paper files.
- Limited access policies curb unnecessary security risks and minimize the number and type of network access points that an information security team must monitor for potential violations.
- Limit employee access to (and copying of) personal information, based on employee’s role.
- Implement and monitor compliance with policies and procedures for rendering information unreadable or otherwise secure in the course of disposal. Securely disposed information must not practicably be read or reconstructed.
- Restrict third party access to personal information based on business need, for example, by restricting access based on IP address, granting temporary access privileges, or similar procedures.
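As a rough illustration of the access-limitation practices above, here is a minimal sketch of restricting what personal information each role can see. The roles, fields, and policy are hypothetical examples, not values from any FTC action:

```python
# Minimal sketch of role-based access to personal information.
# Roles, fields, and policy here are hypothetical, not FTC-mandated values.
ROLE_POLICY = {
    "support_agent": {"name", "email"},            # day-to-day customer contact
    "billing_clerk": {"name", "billing_address"},  # needs billing info only
    "analyst": set(),                              # works with de-identified data
}

def visible_fields(role: str, record: dict) -> dict:
    """Return only the fields this role is permitted to see."""
    allowed = ROLE_POLICY.get(role, set())
    return {k: v for k, v in record.items() if k in allowed}

record = {"name": "Pat Doe", "email": "pat@example.com",
          "billing_address": "1 Main St", "ssn": "000-00-0000"}

print(visible_fields("support_agent", record))
```

The point is simply that access decisions are made against an explicit policy, so the security team has one place to audit rather than many ad hoc access paths.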
The Personal Data Notification and Protection Act fills a void at the national level; most states have privacy laws, with California pioneering the movement with SB 1386. However, enforcement at the state AG level has been uneven at best and absent at worst.
In preparing for this national legislation, organizations need to heed the policies derived from the FTC's enforcement practices. They can also track the progress of this legislation and look for agencies such as the National Institute of Standards and Technology to issue guidance. Furthermore, organizations can encourage employees to take advantage of cybersecurity internship programs at nearby colleges and universities to avoid critical skills shortages.
With online security a clear priority for President Obama’s administration, it’s essential for organizations and consumers to understand upcoming legislation and learn the benefits/risks of sharing data. We’re looking forward to celebrating safeguarding data and enabling trust on Data Privacy Day, held annually on January 28, and hope that these tips will make 2015 your safest year yet.
Lately I have been thinking a lot about what is real and what is just marketing fluff with the Internet of Things (IoT). From all the stories written and the people I talk to, it seems I am not alone. One day there is news of what is at best a communications company receiving $100M+ in funding, and the next there is what amounts to a re-skinned mobile app claiming to be the real IoT.
This is the first part in a series of posts where I am going to define a framework for identifying real IoT solutions and the value that they provide. In addition I will provide actual examples of companies and solutions that fit this solution definition framework.
My main issue with the entire IoT universe is that a lot of the focus is on things that do not exist or that have been around a long time and have just been re-branded. Neither of these does justice to what is genuinely interesting about IoT: using distributed data and events to deliver totally new or dramatically better solutions (think 10x or more) compared to what exists today. We are talking revolutionary, not evolutionary.
From my point of view real IoT solutions need to address one or more of the following solution areas and I will be using these and additional criteria to build out the framework.
- Personal productivity
- Business productivity
- Business critical
- Life critical
Have another point of view? Feel free to share. My next post will focus on the segment of personal productivity.
I have a teenaged daughter. She is interested in her appearance, but not a risk taker. As a result, she waits until a fashion trend has been established by her peers before adopting the new look. This works out well for me because I’m not on the bleeding edge of the vagaries of fashion spending on a trend that may fizzle out before it becomes more widely adopted. This works out well for my daughter because she is able to blend in with her peers. She has developed her own adoption model for fashion.
Healthcare analytics can be considered similar to fashion. It can be confusing and even overwhelming without a systematic framework to guide an approach or priorities. Fortunately, a group of very smart people have been working on this and have created the Healthcare Analytics Adoption Model. The Healthcare Analytics Adoption Model is a framework to measure the adoption and meaningful use of data warehouses and analytics in healthcare – similar to the HIMSS Analytics EMRAM model. It should be considered a guide to classifying groups of analytics capabilities and a methodology for health organizations to adopt analytics.
The Healthcare Analytics Adoption Model proposes that there are three phases of data analysis:
- data collection – systems that are designed specifically for supporting transaction-based workflows and data collection. The adoption of the electronic medical record (EMR) is a great example of this phase.
- data sharing – The need for sharing data among members of the workflow team – similar to the capabilities of a health information exchange.
- data analytics – organizations realize that they can start to analyze this collected and shared data through investigating the patterns in the aggregated data.
Once your organization has moved into Phase Three – you are ready to start work on the Healthcare Analytics Adoption Model, which comprises eight levels:
Each level of adoption includes progressive expansion of analytic capabilities in four dimensions:
- New data sources – Data content expands as new sources of data are added to your organization
- Complexity – analytic algorithms and data binding become progressively more complex
- Data literacy – this increases among employees, leading to an increasing ability to exploit data as an asset
- Data timeliness – timeliness of data content increases (data latency decreases) leading to a reduction in decision cycles and mean time to improvement.
We’ll spend some time in future weeks talking through the different levels. In the meantime – where do you see your organization?
Let’s face it, building a Data Governance program is no overnight task. As one CDO puts it: ”data governance is a marathon, not a sprint”. Why? Because data governance is a complex business function that encompasses technology, people and process, all of which have to work together effectively to ensure the success of the initiative. Because of the scope of the program, Data Governance often calls for participants from different business units within an organization, and it can be disruptive at first.
Why bother, then, given that data governance is complex, disruptive, and can introduce additional cost to a company? Well, the drivers for data governance vary across organizations. Let's take a close look at some of the motivations behind a data governance program.
For companies in heavily regulated industries, establishing a formal data governance program is a mandate. When a company is not compliant, consequences can be severe. Penalties could include hefty fines, brand damage, loss in revenue, and even potential jail time for the person held accountable for the noncompliance. To meet ongoing regulatory requirements and adhere to data security policies and standards, companies need to rely on clean, connected and trusted data to enable transparency and auditability in their reporting, satisfy mandatory requirements and answer critical questions from auditors. Without a dedicated data governance program in place, the compliance initiative can become an ongoing nightmare for companies in regulated industries.
A data governance program can also be established to support a customer centricity initiative. To make effective cross-sells and up-sells to your customers and grow your business, you need clear visibility into customer purchasing behaviors across multiple shopping channels and touch points. Customers' shopping behaviors and attributes are captured in the data; therefore, to gain a thorough understanding of your customers and boost your sales, a holistic Data Governance program is essential.
Other reasons for companies to start a data governance program include improving efficiency and reducing operational cost, supporting better analytics and driving more innovation. As long as it's a business-critical area, data is at the core of the process, and the business case is loud and clear, there is a compelling reason for launching a data governance program.
Now that we have identified the drivers for data governance, how do we start? This rather loaded question really gets into the details of the implementation. A few critical elements come into play, including: identifying and establishing task forces such as the steering committee, data governance team and business sponsors; defining roles and responsibilities for the stakeholders involved in the program; and defining metrics for tracking results. And soon you will find that, on top of everything, communication, communication and more communication is probably the most important tactic of all for driving the initial success of the program.
A rule of thumb? Start small, take one step at a time and focus on producing something tangible.
Sounds easy, right? Well, let’s hear what the real-world practitioners have to say. Join us at this Informatica webinar to hear Michael Wodzinski, Director of Information Architecture, Lisa Bemis, Director of Master Data, Fabian Torres, Director of Project Management from Houghton Mifflin Harcourt, global leader in publishing, as well as David Lyle, VP of product strategy from Informatica to discuss how to implement a successful data governance practice that brings business impact to an enterprise organization.
If you are currently kicking the tires on setting up a data governance practice in your organization, I'd like to invite you to visit a member-only website dedicated to Data Governance: http://governyourdata.com/. This site currently has over 1,000 members and is designed to foster open communication on everything data governance. There you will find conversations on best practices, methodologies, frameworks, tools and metrics. I would also encourage you to take a data governance maturity assessment to see where you currently stand on the data governance maturity curve, and compare the result against industry benchmarks. More than 200 members have taken the assessment to gain a better understanding of their current data governance program, so why not give it a shot?
Data Governance is a journey, likely a never-ending one. We wish you the best of luck on this effort and a joyful ride! We'd love to hear your stories.
2014 was a pivotal year for Informatica as our investments in Hadoop and efforts to innovate in big data gathered momentum and became a core part of Informatica's business. Our Hadoop-related big data revenue growth was in the ballpark of leading Hadoop startups – more than doubling over 2013.
In 2014, Informatica reached about 100 enterprise customers of our big data products, with an increasing number going into production with Informatica together with Hadoop and other big data technologies. Informatica's big data Hadoop customers include companies in financial services, insurance, telecommunications, technology, energy, life sciences, healthcare and business services. These innovative companies are leveraging Informatica to accelerate their time to production and drive greater value from their big data investments.
These customers are in production or implementing a wide range of use cases leveraging Informatica's data pipeline capabilities to better put the scale, efficiency and flexibility of Hadoop to work. Many Hadoop customers start by optimizing their data warehouse environments by moving data storage, profiling, integration and cleansing to Hadoop in order to free up capacity in their traditional analytics data warehousing systems. Customers that are further along in their big data journeys have expanded to use Informatica on Hadoop for exploratory analytics of new data types, 360 degree customer analytics, fraud detection, predictive maintenance, and analysis of massive amounts of Internet of Things machine data for optimization of energy exploration, manufacturing processes, network data, security and other large scale systems initiatives.
2014 was not just a year of market momentum for Informatica, but also one of new product development innovations. We shipped enhanced functionality for entity matching and relationship building at Hadoop scale (a key part of Master Data Management), end-to-end data lineage through Hadoop, as well as high performance real-time streaming of data into Hadoop. We also launched connectors to NoSQL and analytics databases including Datastax Cassandra, MongoDB and Amazon Redshift. Informatica advanced our capabilities to curate great data for self-serve analytics with a connector to output Tableau’s data format and launched our self-service data preparation solution, Informatica Rev.
Customers can now quickly try out Informatica on Hadoop by downloading the free trials for the Big Data Edition and Vibe Data Stream that we launched in 2014. Now that Informatica supports all five of the leading Hadoop distributions, customers can build their data pipelines on Informatica with confidence that no matter how the underlying Hadoop technologies evolve, their Informatica mappings will run. Informatica provides highly scalable data processing engines that run natively in Hadoop and leverage the best of open source innovations such as YARN, MapReduce, and more. Abstracting data pipeline mappings from the underlying Hadoop technologies combined with visual tools enabling team collaboration empowers large organizations to put Hadoop into production with confidence.
As we look ahead into 2015, we have ambitious plans to continue to expand and evolve our product capabilities with enhanced productivity to help customers rapidly get more value from their data in Hadoop. Stay tuned for announcements throughout the year.
Try some of Informatica’s products for Hadoop on the Informatica Marketplace here.
Valentine’s Day is such a strange holiday. It always seems to bring up more questions than answers. And the internet always seems to have a quiz to find out the answer! There’s the “Does he have a crush on you too – 10 simple ways to find out” quiz. There’s the “What special gift should I get her this Valentine’s Day?” quiz. And the ever popular “Why am I still single on Valentine’s Day?” quiz.
Well Marketers, it’s your lucky Valentine’s Day! We have a quiz for you too! It’s about your relationship with data. Where do you stand? Are you ready to take the next step?
Question 1: Do you connect – I mean, really connect – with your data?
□ (A) Not really. We just can’t seem to get it together and really connect.
□ (B) Sometimes. We connect on some levels, but there are big gaps.
□ (C) Most of the time. We usually connect, but we miss out on some things.
□ (D) We are a perfect match! We connect about everything, no matter where, no matter when.
Translation: Data ready marketers have access to the best possible data, no matter what form it is in, no matter what system it is in. They are able to make decisions based on everything the entire organization “knows” about their customer/partner/product – with a complete 360 degree view. And they are also able to connect to and integrate with data outside the bounds of their organization to achieve the sought-after 720 degree view. They can integrate and react to social media comments, trends, and feedback – in real time – and match them with existing records whenever possible. And they can quickly and easily bring together any third party data sources they may need.
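To make "matching a social mention with an existing record" concrete, here is a minimal sketch of the idea. The field names and matching rules are illustrative only, not any vendor's actual matching logic:

```python
# Illustrative sketch: match an inbound social comment to an existing CRM
# record by normalized email, falling back to normalized name.
# The CRM schema and matching rules here are hypothetical examples.
def normalize(s: str) -> str:
    """Lowercase and strip non-alphanumerics so trivial variations still match."""
    return "".join(ch for ch in s.lower() if ch.isalnum())

crm = [
    {"id": 1, "name": "Pat Doe", "email": "pat@example.com"},
    {"id": 2, "name": "Sam Roe", "email": "sam@example.com"},
]

def match_record(mention: dict):
    # Prefer the strongest identifier (email), then fall back to name.
    for rec in crm:
        if mention.get("email") and normalize(mention["email"]) == normalize(rec["email"]):
            return rec
    for rec in crm:
        if normalize(mention.get("name", "")) == normalize(rec["name"]):
            return rec
    return None

print(match_record({"name": "pat doe"})["id"])  # matches record 1 by name
```

Real identity resolution uses far richer fuzzy matching and survivorship rules; the sketch just shows why normalization and a fallback hierarchy matter.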
Question 2: How good looking & clean is your data?
□ (A) Yikes, not very. But it’s what’s on the inside that counts right?
□ (B) It’s ok. We’ve both let ourselves go a bit.
□ (C) It’s pretty cute. Not supermodel hot, but definitely girl or boy next door cute.
□ (D) My data is HOT! It’s perfect in every way!
Translation: Marketers need data that is reliable and clean. According to a recent Experian study, American companies believe that 25% of their data is inaccurate, and the rest of the world isn't much more confident. 90% of respondents said they suffer from common data errors, and 78% have problems with the quality of the data they gather from disparate channels. Making marketing decisions based upon data that is inaccurate leads to poor decisions. And what's worse, many marketers have no idea how good or bad their data is, so they have no idea what impact it is having on their marketing programs and analysis. The data ready marketer understands this and has a top tier data quality solution in place to make sure their data is in the best shape possible.
Question 3: Do you feel safe when you’re with your data?
□ (A) No, my data is pretty scary. 911 is on speed dial.
□ (B) I’m not sure actually. I think so?
□ (C) My data is mostly safe, but it’s got a little “bad boy” or “bad girl” streak.
□ (D) I protect my data, and it protects me back. We keep each other safe and secure.
Translation: Marketers need to be able to trust the quality of their data, but they also need to trust the security of their data. Is it protected or is it susceptible to theft and nefarious attacks like the ones that have been all over the news lately? Nothing keeps a CMO and their PR team up at night like worrying they are going to be the next brand on the cover of a magazine for losing millions of personal customer records. But beyond a high profile data breach, marketers need to be concerned over data privacy. Are you treating customer data in the way that is expected and demanded? Are you using protected data in your marketing practices that you really shouldn’t be? Are you marketing to people on excluded lists?
Question 4: Is your data adventurous and well-traveled, or is it more of a “home-body”?
□ (A) My data is all over the place and it’s impossible to find.
□ (B) My data is all in one place. I know we’re missing out on fun and exciting options, but it’s just easier this way.
□ (C) My data is in a few places and I keep fairly good tabs on it. We can find each other when we need to, but it takes some effort.
□ (D) My data is everywhere, but I have complete faith that I can get ahold of any source I might need, when and where I need it.
Translation: Marketing data is everywhere. Your marketing data warehouse, your CRM system, your marketing automation system. It’s throughout your organization in finance, customer support, and sales systems. It’s in third party systems like social media and data aggregators. That means it’s in the cloud, it’s on premise, and everywhere in between. Marketers need to be able to get to and integrate data no matter where it “lives”.
Question 5: Does your data take forever to get ready when it’s time to go do something together?
□ (A) It takes forever to prepare my data for each new outing. It’s definitely not “ready to go”.
□ (B) My data takes its time to get ready, but it’s worth the wait… usually!
□ (C) My data is fairly quick to get ready, but it does take a little time and effort.
□ (D) My data is always ready to go, whenever we need to go somewhere or do something.
Translation: One of the reasons many marketers end up in marketing is because it is fast paced and every day is different. Nothing is the same from day-to-day, so you need to be ready to act at a moment’s notice, and change course on a dime. Data ready marketers have a foundation of great data that they can point at any given problem, at any given time, without a lot of work to prepare it. If it is taking you weeks or even days to pull data together to analyze something new or test out a new hunch, it’s too late – your competitors have already done it!
Question 6: Can you believe the stories your data is telling you?
□ (A) My data is wrong a lot. It stretches the truth a lot, and I cannot rely on it.
□ (B) I really don’t know. I question these stories – dare I say excuses – but haven’t been able to prove it one way or the other.
□ (C) I believe what my data says most of the time. It rarely lets me down.
□ (D) My data is very trustworthy. I believe it implicitly because we’ve earned each other’s trust.
Translation: If your data is dirty, inaccurate, and/or incomplete, it is essentially “lying” to you. And if you cannot get to all of the data sources you need, your data is telling you “white lies”! All of the work you’re putting into analysis and optimization is based on questionable data, and is giving you questionable results. Data ready marketers understand this and ensure their data is clean, safe, and connected at all times.
Question 7: Does your data help you around the house with your daily chores?
□ (A) My data just sits around on the couch watching TV.
□ (B) When I nag my data will help out occasionally.
□ (C) My data is pretty good about helping out. It doesn’t take initiative, but it helps out whenever I ask.
□ (D) My data is amazing. It helps out whenever it can, however it can, even without being asked.
Translation: Your marketing data can do so much. It should enable you to be “customer ready” – helping you to understand everything there is to know about your customers so you can design amazing personalized campaigns that speak directly to them. It should enable you to be “decision ready” – powering your analytics capabilities with great data so you can make great decisions and optimize your processes. But it should also enable you to be “showcase ready” – giving you the proof points to demonstrate marketing’s actual impact on the bottom line.
Now for the fun part… It’s time to rate your data relationship status
If you answered mostly (A): You have a rocky relationship with your data. You may need some data counseling!
If you answered mostly (B): It’s time to decide if you want this data relationship to work. There’s hope, but you’ve got some work to do.
If you answered mostly (C): You and your data are at the beginning of a beautiful love affair. Keep working at it because you’re getting close!
If you answered mostly (D): Congratulations, you have a strong data marriage that is based on clean, safe, and connected data. You are making great business decisions because you are a data ready marketer!
Do You Love Your Data?
No matter what your data relationship status, we’d love to hear from you. Please take our survey about your use of data and technology. The results are coming out soon so don’t miss your chance to be a part. https://www.surveymonkey.com/s/DataMktg
Also, follow me on twitter – The Data Ready Marketer – for some of the latest & greatest news and insights on the world of data ready marketing. And stay tuned because we have several new Data Ready Marketing pieces coming out soon – InfoGraphics, eBooks, SlideShares, and more!
First off, let me get one thing off my chest. If you don’t pay close attention to your data throughout the application consolidation or migration process, you are almost guaranteed delays and budget overruns. Data consolidation and migration is at least 30%-40% of the application go-live effort. We have learned this by helping customers deliver over 1,500 projects of this type. What’s worse, if you are not super meticulous about your data, you can be assured of encountering unhappy business stakeholders at the end of this treacherous journey. The users of your new application expect all their business-critical data to be there at the end of the road. All the bells and whistles in your new application will matter little if the data falls apart. Imagine, if you will, students’ transcripts gone missing, or your frequent-flyer balance 100,000 miles short! Need I say more? Now, you may already be guessing where I am going with this. That’s right, we are talking about the myths and realities related to your data! Let’s explore a few of these.
Myth #1: All my data is there.
Reality #1: It may be there… But can you get it? If you want to find, access and move out all the data from your legacy systems, you must have a good set of connectivity tools to easily and automatically find, access and extract the data from your source systems. You don’t want to hand-code this for each source. Ouch!
Myth #2: I can just move my data from point A to point B.
Reality #2: You can try that approach if you want. However, you might not be happy with the results. The reality is that there can be significant gaps and format mismatches between the data in your legacy system and the data required by your new application. Additionally, you will likely need to assemble data from disparate systems. You need sophisticated tools to profile, assemble and transform your legacy data so that it is purpose-fit for your new application.
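As a minimal sketch of what that profiling step can look like, the snippet below flags fields the target application requires that are missing or empty in the legacy data. The schema and field names are hypothetical:

```python
# Sketch of a simple pre-migration profiling pass: for each field the target
# application requires, count how many legacy rows lack a usable value.
# The required fields and sample data are hypothetical examples.
REQUIRED_TARGET_FIELDS = ("customer_id", "email", "country")

def profile_gaps(legacy_rows):
    """Return a per-field count of rows missing a usable value."""
    gaps = {f: 0 for f in REQUIRED_TARGET_FIELDS}
    for row in legacy_rows:
        for field in REQUIRED_TARGET_FIELDS:
            value = row.get(field)
            if value is None or str(value).strip() == "":
                gaps[field] += 1
    return gaps

legacy = [
    {"customer_id": "C1", "email": "a@example.com", "country": "US"},
    {"customer_id": "C2", "email": "", "country": None},
]
print(profile_gaps(legacy))  # {'customer_id': 0, 'email': 1, 'country': 1}
```

Real profiling tools also examine formats, value distributions and cross-system consistency; the sketch just shows why you profile before you move anything.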
Myth #3: All my data is clean.
Reality #3: It’s not. And here is a tip: better to profile, scrub and cleanse your data before you migrate it. You don’t want to put a shiny new application on top of questionable data. In other words, let’s get a fresh start on the data in your new application!
Myth #4: All my data will move over as expected
Reality #4: It will not. Any time you move and transform large sets of data, there is room for logical or operational errors and surprises. The best way to avoid this is to automatically validate that your data has moved over as intended.
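One simple way to sketch that automated validation is to compare row counts and an order-independent content fingerprint between source and target extracts. This is purely illustrative; real migration validation tooling does far more:

```python
import hashlib

# Sketch of automated post-migration validation: compare row counts and an
# order-independent content checksum between source and target extracts.
# The sample tables are hypothetical.
def table_fingerprint(rows):
    """Return (row count, XOR of per-row SHA-256 hashes); order-independent."""
    digest = 0
    for row in rows:
        canonical = "|".join(f"{k}={row[k]}" for k in sorted(row))
        digest ^= int(hashlib.sha256(canonical.encode()).hexdigest(), 16)
    return len(rows), digest

source = [{"id": 1, "amt": "10.00"}, {"id": 2, "amt": "7.50"}]
target = [{"id": 2, "amt": "7.50"}, {"id": 1, "amt": "10.00"}]  # same data, reordered

assert table_fingerprint(source) == table_fingerprint(target)
print("migration validated")
```

Because the per-row hashes are combined with XOR, the check passes even when the target loads rows in a different order, but fails if any row is dropped or altered.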
Myth #5: It’s a one-time effort.
Reality #5: ‘Load and explode’ is a formula for disaster. Our proven methodology recommends that you first prototype your migration path and identify a small subset of the data to move over. Then test it, tweak your model, try it again and gradually expand. More importantly, your application architecture should not be a one-time effort. It is a work in progress and really an ongoing journey. Regardless of where you are on this journey, we recommend paying close attention to managing your application’s data foundation.
As you can see, there is a multitude of data issues that can plague an application consolidation or migration project and lead to its doom. These potential challenges are not always recognized and understood early on. This perception gap is a root cause of project failure. This is why we are excited to host Philip Russom of TDWI in our upcoming webinar to discuss data management best practices and methodologies for application consolidation and migration. If you are undertaking any IT modernization or rationalization project, such as consolidating applications or migrating legacy applications to the cloud or to an ‘on-prem’ application such as SAP, this webinar is a must-see.
So what’s your reality going to be like? Will your project run like a dream or will it escalate into a scary nightmare? Here’s hoping for the former. And also hoping you can join us for this upcoming webinar to learn more:
Webinar with TDWI:
Successful Application Consolidation & Migration: Data Management Best Practices.
Date: Tuesday March 10, 10 am PT / 1 pm ET
Don’t miss out, Register Today!
1) Gartner report titled “Best Practices Mitigate Data Migration Risks and Challenges” published on December 9, 2014
2) Harvard Business Review: ‘Why your IT project may be riskier than you think’.
Healthcare and data have the makings of an epic love affair, but like most relationships, it’s not all roses. Data is playing a powerful role in finding a cure for cancer, informing cost reduction, targeting preventative treatments and engaging healthcare consumers in their own care. The downside? Data is needy. It requires investments in connectedness, cleanliness and safety to maximize its potential.
- Data is ubiquitous…connect it.
4400 times the amount of information held at the Library of Congress – that’s how much data Kaiser Permanente alone has generated from its electronic medical record. Kaiser successfully makes every piece of information about each patient available to clinicians, including patient health history, diagnosis by other providers, lab results and prescriptions. As a result, Kaiser has seen marked improvements in outcomes: 26% reduction in office visits per member and a 57% reduction in medication errors.
Ongoing value, however, requires continuous investment in data. Investments in data integration and data quality ensure that information from the EMR is integrated with other sources (think claims, social, billing, supply chain) so that clinicians and decision makers have access in the format they need. Without this, self-service intelligence can be inhibited by duplicate data, poor quality data or application silos.
- Data is popular…ensure it is clean.
Healthcare leaders can finally rely on electronic data to make strategic decisions. Here is a CHRISTUS Health anecdote you might relate to: in a weekly meeting, each executive reviews a strategic dashboard; these dashboards drive strategic decision making about CPOE (computerized physician order entry) adoption, emergency room wait times and price per procedure. Powered by enterprise information management, these dashboards paint a reliable and consistent view across the system’s 60 hospitals. Prior to the implementation of an enterprise data platform, each executive relied on their own set of data.
In the pre-data investment era, seemingly common data elements from different sources did not mean the same thing. For example, “Admit Date” in one report reflected the emergency department admission date whereas “Admit Date” in another report referred to the inpatient admission date.
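One common remedy for that kind of semantic collision is to map each source’s ambiguous label to an explicit canonical field name at integration time. The sketch below is hypothetical – the system and field names are invented – but it shows the idea.

```python
# Hypothetical sketch: two invented systems both report an "Admit Date",
# but they mean different things, so each is renamed to an unambiguous
# canonical field before the data is combined.

FIELD_MAP = {
    "ed_system":        {"Admit Date": "ed_admit_date"},
    "inpatient_system": {"Admit Date": "inpatient_admit_date"},
}

def to_canonical(source, record):
    """Rename source-specific fields to unambiguous canonical names."""
    mapping = FIELD_MAP.get(source, {})
    return {mapping.get(k, k): v for k, v in record.items()}

ed_rec = to_canonical("ed_system", {"Admit Date": "2015-03-01", "mrn": "42"})
ip_rec = to_canonical("inpatient_system", {"Admit Date": "2015-03-02", "mrn": "42"})
```

Once both records carry distinct names, a report can never silently mix the two meanings of "Admit Date".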
- Sharing data is necessary…make it safe.
To cure cancer, reduce costs and engage patients, care providers need access to data, and not just the data they generate; it has to be shared to coordinate care through transitions and across settings, i.e., home care, long-term care and behavioral health. Fortunately, consumers and clinicians agree on this: PwC reports that 56% of consumers and 30% of physicians are comfortable with data sharing for care coordination. Further progress is demonstrated by healthcare organizations willingly adopting cloud-based applications – as of 2013, 40% of healthcare organizations were already storing protected health information (PHI) in the cloud.
Increased data access, however, carries risk and can leave health data exposed. The threat of a breach or hack is multiplied by the presence (in many cases necessary) of PHI on employee laptops and by the growing number of providers with access to PHI. The Ponemon Institute, a security research firm, estimates that data breaches cost the industry $5.6 billion each year. Investments in data-centric security are necessary to assuage fear, protect personal health data and make secure data sharing a reality.
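One data-centric security pattern is to pseudonymize direct identifiers before a record ever leaves its system of origin, so shared copies carry tokens instead of raw PHI. The sketch below is a simplified assumption, not a production design: the field list and salt handling are invented, and a real deployment would use keyed tokenization or format-preserving encryption with managed keys.

```python
# Simplified sketch of field-level pseudonymization: identifier fields are
# replaced with stable tokens; clinical values pass through untouched.
import hashlib

PHI_FIELDS = {"name", "ssn", "address"}  # invented field list for the demo

def pseudonymize(record, salt="demo-salt"):
    out = {}
    for field, value in record.items():
        if field in PHI_FIELDS:
            digest = hashlib.sha256((salt + str(value)).encode()).hexdigest()
            out[field] = digest[:12]  # short stable token, not the raw value
        else:
            out[field] = value
    return out

shared = pseudonymize({"name": "Ana Diaz", "ssn": "123-45-6789", "lab_result": 5.4})
```

Because the token is deterministic for a given salt, records about the same patient still join across shared datasets, while a stolen laptop holding the shared copy exposes no raw identifiers.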
Early improvements in patient outcomes indicate that the relationship between data and healthcare is a valuable investment. The International Institute of Analytics supports this, reporting that although analytics and data maturity across healthcare lags other industries, the opportunity to positively impact clinical and operational outcomes is significant.
I have two kids. In school. They generate a remarkable amount of paper. From math worksheets, permission slips, book reports (now called reading responses) to newsletters from the school. That’s a lot of paper. All of it is presented in different forms with different results – the math worksheets tell me how my child is doing in math, the permission slips tell me when my kids will be leaving school property and the book reports tell me what kind of books my child is interested in reading. I need to put the math worksheet information into a storage space so I can figure out how to prop up my kid if needed on the basic geometry constructs. The dates that permission slips are covering need to go into the calendar. The book reports can be used at the library to choose the next book.
We face a similar problem (albeit on a MUCH larger scale) in the insurance market. We are getting data from clinicians. Many of you are developing and deploying mobile applications to help patients manage their care, locate providers and improve their health. You may capture licensing data to help pharmaceutical companies identify patients for inclusion in clinical trials. You have advanced analytics systems for fraud detection and for checking the accuracy and consistency of claims. Possibly you are at the point of near real-time claim authorization.
The amount of data generated in our world is expected to increase dramatically in the coming years. There are an estimated 500 petabytes of data in the healthcare realm, a figure predicted to grow by a factor of 50 to 25,000 petabytes by 2020. Healthcare payers already store and analyze some of this data. However, to capture, integrate and interrogate large information sets, the scope of payer information will have to increase significantly to include provider data, social data, government data, data from pharmaceutical and medical product manufacturers, and information aggregator data.
Right now, you probably depend on a traditional data warehouse model and structured data analytics to access some of your data. This has worked adequately up to now, but with the amount of data that will be generated in the future, you will need the processing capability to load and query multi-terabyte datasets in a timely fashion, and the ability to manage both semi-structured and unstructured data.
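The difference between a fixed warehouse schema and semi-structured data can be shown in a few lines. In this invented example, claim records arrive as JSON lines with varying fields, and the reader tolerates missing or extra fields instead of requiring one rigid table layout (the schema-on-read idea behind many Big Data tools).

```python
# Minimal illustration of schema-on-read: records from different feeds
# carry different fields, and the reader handles the variation itself.
# The records and field names are invented for the example.
import json

raw_feed = [
    '{"claim_id": "C1", "amount": 120.0, "provider": "P9"}',
    '{"claim_id": "C2", "amount": 80.5}',                      # no provider
    '{"claim_id": "C3", "amount": 40.0, "notes": "resubmit"}', # extra field
]

def total_by_provider(lines):
    totals = {}
    for line in lines:
        rec = json.loads(line)
        provider = rec.get("provider", "unknown")  # tolerate a missing field
        totals[provider] = totals.get(provider, 0.0) + rec["amount"]
    return totals

totals = total_by_provider(raw_feed)
```

A traditional load step would have rejected records C2 and C3 or forced a schema change; here the variation is absorbed at read time.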
Fortunately, a set of emerging technologies (collectively called “Big Data”) may provide the technical foundation of a solution. Big Data usually refers to data sets with sizes beyond the ability of commonly used software tools to capture, curate, manage and process within a tolerable amount of time. While some existing technology may prove inadequate to future tasks, many of the information management methods of the past will prove to be as valuable as ever. Assembling successful Big Data solutions will require a fusion of new technology and old-school disciplines.
Which of these technologies do you have? Which of them can integrate with both on-premise and cloud-based solutions? And for which of them does your organization have knowledgeable people who can use the capabilities to take advantage of Big Data?
Many of the trends we are seeing in enterprise integration today are driven by the adoption of cloud technologies, from IaaS and PaaS to SaaS. I was just reading this story about a recent survey on cloud adoption and thought that a lot of it sounds very similar to things we have seen before in enterprise IT.
Why discuss this? What can we learn? A couple of competing quotes come to mind.
Those who cannot remember the past are condemned to repeat it. – George Santayana
We are doomed to repeat the past no matter what. – Kurt Vonnegut
While every enterprise has its own complexities, there are several past technology adoption patterns that can be used to frame the discussion, compare today’s issues, and guide how a company designs and deploys its current enterprise cloud architecture. Flexibility in design should be a key goal, in addition to satisfying current business and technical requirements. So, what are the big patterns of the last 25 years that have shaped the cloud integration discussion?
1. 90s: Migration and replacement at the solution or application level. A big trend of the 90s was replacing older homegrown systems or mainframe-based solutions with new packaged software. SAP really started a lot of this with ERP, and then we saw the rise of additional solutions for CRM, SCM, HRM, etc.
This kept a lot of people who do data integration very busy. From my point of view, this era was very focused on replacing technologies, which put the emphasis on data migration. There were some scenarios where solutions were integrated and left in place, but these tended to be systems that required transactional integrity and heavy messaging, or back-office solutions. For the classic front-office solutions, enterprises in large numbers did rip-and-replace migrations to new systems.
2. 00s: Embrace and extend existing solutions with web applications. The rise of the Internet browser, combined with a popular and powerful standard programming language in Java, shaped and drove enterprise integration in this period. In addition, owing to the mistakes and issues IT groups suffered in the 90s, there was a very strong drive to extend existing investments rather than rip and replace. IT and the business were trying to figure out how to add new solutions to what they had in place. A lot of enterprise integration, service bus, and what we consider classic application development and deployment solutions came to market and were put in place.
3. 00s: Adoption of new web-application-based packaged solutions. A big part of this trend was driven by .NET and Java becoming more or less the de facto languages of enterprise IT. Software vendors not on these platforms were for the most part forced to re-platform or lose customers. New software vendors in many ways had an advantage, because enterprises were already looking at large data migrations to upgrade the solutions they had in place. In either case, IT shops were looking to be either a .NET shop or a Java shop, and it caused a lot of churn.
4. 00s: First-generation cloud applications and platforms. The first wave of cloud applications and platforms was driven by individual projects and specific company needs: Salesforce.com was used just for sales management before it became a platform, Amazon was used just as a run-time to develop and deploy applications before it became a full-scale platform, and the list keeps growing as every vendor wants to be the cloud platform of choice. The integration needs were originally light because so many enterprises treated the cloud as an experiment, or as a one-off for a specific set of users. This has changed a lot in the last 10 years, as many companies repeated their on-premise data-silo problems in the cloud while their usage grew from one cloud app to 2, 5, 10+, etc. In fact, if you strip away where a solution happens to be deployed (on premise or cloud), the reality is that an enterprise with a poorly planned on-premise architecture and solution portfolio probably has a just-as-poorly planned cloud architecture and portfolio. Adding them together just leads to disjointed solutions that are hard to integrate, hard to maintain and hard to evolve – in other words, the opposite of the flexibility goal.
5. 10s: Consolidation of technology and the battle of the cloud platforms. It appears we are just getting started on the next great market consolidation, and every enterprise IT group is going to need to decide its own criteria for balancing current and future investments. Today we have Salesforce, Amazon, Google, Apple, SAP and a few others. In 10 years, some of these will either not exist as they do today or be marginalized. No one can say which ones for sure, and this is why flexibility should be prioritized in any architecture for cloud adoption.
For me, the main takeaways from the past 25 years of technology adoption trends, for anyone who thinks about enterprise and data integration, are the following:
a) It all starts and ends with data. Yes, applications, process and people are important, but it’s about the data.
b) Coarse-grained and loosely coupled approaches to integration are the most flexible (e.g., avoid point-to-point at all costs).
c) Design with the knowledge of what data is critical and what data might or should be accessible or movable.
d) Identify data and applications that might have to stay where they are no matter what (e.g., the mainframe is never dying).
e) Make sure your integration and application groups have access to, or include, someone who understands security. While a lot of integration developers think they understand security, it’s usually after the fact that you find out they really do not.
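Point (b) can be sketched in a few lines with a toy publish/subscribe broker: producers and consumers know only a topic name, not each other, so adding a consumer requires no change to existing producers. This is an invented illustration, not any specific product’s API.

```python
# Toy pub/sub broker illustrating loose coupling: the publisher never
# references its consumers, unlike a point-to-point call.

class Broker:
    def __init__(self):
        self.subscribers = {}  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self.subscribers.setdefault(topic, []).append(callback)

    def publish(self, topic, message):
        for callback in self.subscribers.get(topic, []):
            callback(message)

broker = Broker()
received = []

# Two independent consumers; the producer is unaware of both.
broker.subscribe("orders", lambda msg: received.append(("billing", msg)))
broker.subscribe("orders", lambda msg: received.append(("warehouse", msg)))
broker.publish("orders", {"id": 7})
```

With point-to-point integration, wiring a new warehouse system in would have meant changing the order producer; here it is one new `subscribe` call, which is exactly the flexibility argument above.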
So, it is possible to shape your cloud adoption and architecture future by at least understanding how past technology and solution adoption has shaped the present. For me, the important things to remember are that it is all about the data, and that flexibility should be prioritized as a technology requirement at least at the same level as features and functions. Good luck.