Tag Archives: Data Services
This is a continuation from Part 1 of the Blog which you can read here.
Now, if you are in IT, reading about how Informatica Rev enables the everyday business users in your company to participate in the Data Democracy might feel like treachery. You are likely thinking that Informatica is letting the bull loose in your own fine china shop. You likely feel, first of all, that Informatica is supporting the systemic bypass of all the data governance that IT has worked hard to put in place and then second of all, that Informatica is supporting the alienation of the very IT people that have approved of and invested in Informatica for decades.
While I can understand this thought process, I am here to proudly inform you that your thoughts could not be further from the truth! In fact, in the not too distant future, Informatica is in a very strong position to create a unique technology solution to ensure you can better govern all the data in your enterprise and do it in a way that will allow you to proactively deliver the right data to the business, yes, before the masses of everyday business users have started to knock your door down to even ask for it. Informatica’s unique solution will ensure the IT and Business divide that has existed in your company for decades actually becomes a match made in heaven. And you in IT get the credit for leading this transformation of your company to a Data Democracy. Listen to this webinar to hear Justin Glatz, Executive Director of Information Technology at Conde Nast, speak about how he will be leading Conde Nast’s transformation to Data Democracy.
“How?” you might ask. Well, first let’s face it, today you do not have any visibility into how the business is procuring and using most data, and therefore you are not governing most of it. Without a change in your tooling, your ability to gain this visibility is diminishing greatly, especially since the business does not have to come to you to procure and use their cloud based applications. By having all of your everyday business users use Informatica Rev, you, for the first time will have the potential to gain a truly complete picture of how data is being used in your company. Even the data they do not come to you to procure.
In the not too distant future, you will gain this visibility through an IT companion application to Informatica Rev. You will then gain the ability to easily operationalize your business user’s exact transformation logic or Recipe as we call it in Informatica Rev, into your existing repositories be they your enterprise data warehouse, datamart or master data management repository for example. And by-the-way you are likely already using Informatica PowerCenter or Informatica Cloud or Informatica MDM to manage these repositories anyway so you already have the needed infrastructure we will be integrating Informatica Rev with. And if you are not using Informatica for managing these repositories, the draw of becoming proactive with your business and leading the transformation of your company to a Data Democracy will be enough to make you want to go get Informatica.
Just as these Professionals have found success by participating in the Data Democracy, with Informatica Rev you finally can do so, too. You can try Informatica Rev for free by clicking here.
Informatica Cloud Data Preparation has launched! Those are the words we aspired to hear, the words that served as our rallying cry, when all we had was an idea coupled with a ton of talent, passion and drive. Well, today, we launch Informatica Rev, as business users refer to it. (Check out the press release here.)
As we launch today, we now have over 3,500 individual users across over 800 logos. These users are everyday business users who just want to improve the speed and quality of their business decisions. And by doing so, they help their corporation find success in the marketplace. And in turn they also find success in their own careers. You can hear more from Customers talking about their experience using Informatica Rev during our December 16 Future of Work webinar.
These users are people who, previously, were locked out of the exclusive Data Club because they did not have the time to be Excel jocks or did not know how to code. But now, these are people who have found success by turning their backs on this Club and aggressively participating in the Data Democracy.
And they are able to finally participate in the “Data Democracy” because of Informatica Rev. You can try Informatica Rev for free by clicking here.
These people play every conceivable role in the information economy. They are marketing managers, marketing operations leads, tradeshow managers, salespeople, sales operations leads, accounting analysts, recruiting leads, and benefits managers, to mention a few. They work for companies ranging from large enterprises to small and mid-size companies and even sole proprietorships. They are even IT leads who might have more technical knowledge than their business counterparts, but are increasingly getting barraged by requests from their business-side counterparts, and are just looking to be more productive with these requests. Let’s take a peek into how Informatica Rev allows them to participate in the Data Democracy, and changes their lives for the better.
Before Informatica Rev, a marketing analyst was simply unable to respond to rapid changes in competitor prices because by the time the competitor pricing data was assembled by the people or tools they relied on, the competitor prices had changed. This led to lost revenue opportunities for the company. I almost don’t need to state that this end result is not an insignificant repercussion of the inability to respond at the rapid pace of business.
Let’s explore what a marketing analyst does today. When a file with competitor prices is received by the analyst, the initial questions they ask are “Which of my SKUs is each competitive price for?” and “Do the prices vary by geography?” To answer these questions, they use Excel VLOOKUPs and some complex macros. By the time the Excel work is done, if they know what a VLOOKUP is, the competitor data is old. Therefore, at some point, there is no reason to continue this analysis, and they simply accept the inability to capture this revenue.
With Informatica Rev, a Marketing Analyst can use Intelligent Guidance to understand the competitor data file to determine its completeness and then with Smart Combine easily combine the competitor data with their own. This is with no code, formal training, and in a few minutes all by themselves. And with Tableau as their BI tool, they can then use the Export to TDE capability to seamlessly export to Tableau to analyze trends in price changes to decide on their strategy. Voila!
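For the curious, here is roughly what that lookup-and-combine step amounts to, written as a small Python sketch. This is not how Informatica Rev works internally; it only illustrates the kind of matching a VLOOKUP performs. The column names (`sku`, `competitor_sku`, `region`, `price`) and the sample rows are hypothetical.

```python
# Hypothetical data: column names and values are illustrative assumptions.
our_prices = [
    {"sku": "A100", "region": "East", "price": 19.99},
    {"sku": "A100", "region": "West", "price": 21.49},
    {"sku": "B200", "region": "East", "price": 5.25},
]
competitor_prices = [
    {"competitor_sku": "A100", "region": "East", "price": 18.49},
    {"competitor_sku": "A100", "region": "West", "price": 22.00},
    {"competitor_sku": "C300", "region": "East", "price": 7.00},
]


def combine_prices(ours, theirs):
    """Match each competitor price to our SKU and region (the VLOOKUP step)."""
    index = {(row["sku"], row["region"]): row["price"] for row in ours}
    matched, unmatched = [], []
    for row in theirs:
        key = (row["competitor_sku"], row["region"])
        if key in index:
            matched.append({
                "sku": key[0],
                "region": key[1],
                "our_price": index[key],
                "competitor_price": row["price"],
                # A positive gap means the competitor is cheaper than we are.
                "gap": round(index[key] - row["price"], 2),
            })
        else:
            # Competitor rows with no matching SKU/region need a human look.
            unmatched.append(row)
    return matched, unmatched


matched, unmatched = combine_prices(our_prices, competitor_prices)
```

Keying the lookup on both SKU and region answers the two questions at once: which SKU each competitor price belongs to, and whether the gap varies by geography.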
Before Informatica Rev, a tradeshow manager used to spend an inordinate amount of time trying to validate leads so that they could then load them into a Marketing Automation System. After a tradeshow, time is of the essence and leads need to be processed rapidly, otherwise they will decay, and fewer opportunities will result for the company. Again, I almost don’t need to state that this end result is not an insignificant repercussion of the inability to respond at the rapid pace of business. But the Tradeshow Manager finds themselves using Excel VLOOKUPs and other creative but time-consuming ways to validate the lead information. They simply want to know, “Which leads have missing titles or phone numbers?” and “What is the correct phone number?” and “How many are new leads?” and “How many are in accounts closing this quarter?”
All of these are questions that can be answered, but they take a lot of time in Excel, and even after all that Excel work, the final lead list was still error prone, causing missed sales opportunities. With Informatica Rev, a Tradeshow Manager can answer these questions rapidly with no code, no formal training, and in a few minutes all by themselves. With the Intelligent Guidance capability they can easily surface where the missing data lies. With Fast Combine they can access their opportunity information in Salesforce and be guided through the process of combining tradeshow and Salesforce data to correctly replace the missing data. Again, Voila!
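To make the lead questions concrete, here is a minimal Python sketch of the same checks: which leads are missing a title or phone number, what the correct value is, and which leads are new. It is an illustration only, not Informatica Rev's logic. The field names, the sample leads, and the `crm_contacts` dictionary standing in for CRM data are all assumptions.

```python
# Hypothetical records: field names and values are illustrative assumptions.
tradeshow_leads = [
    {"email": "ana@example.com", "title": "CFO", "phone": ""},
    {"email": "raj@example.com", "title": "", "phone": "555-0101"},
    {"email": "lee@example.com", "title": "VP Sales", "phone": "555-0102"},
]
# Stand-in for contact data already held in the CRM, keyed by e-mail.
crm_contacts = {
    "ana@example.com": {"title": "CFO", "phone": "555-0100"},
    "raj@example.com": {"title": "Director of IT", "phone": "555-0101"},
}

REQUIRED_FIELDS = ("title", "phone")


def validate_leads(leads, crm):
    """Fill missing fields from the CRM and flag leads the CRM has never seen."""
    cleaned, new_leads, still_incomplete = [], [], []
    for lead in leads:
        known = crm.get(lead["email"])
        fixed = dict(lead)  # copy so the original list is left untouched
        if known is None:
            new_leads.append(lead["email"])
        else:
            for field in REQUIRED_FIELDS:
                if not fixed[field]:
                    fixed[field] = known.get(field, "")
        if any(not fixed[field] for field in REQUIRED_FIELDS):
            still_incomplete.append(lead["email"])
        cleaned.append(fixed)
    return cleaned, new_leads, still_incomplete


cleaned, new_leads, still_incomplete = validate_leads(tradeshow_leads, crm_contacts)
```

Anything left in `still_incomplete` is a lead that neither the badge scan nor the CRM could complete, which is the short list worth a manual follow-up.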
Before Informatica Rev, an Accounting Analyst spent inordinate amounts of time processing trade partner data, every month, reconciling it with the trade partner’s receivables to determine if they had been paid the correct amount by their trade partner. Not only was this process time consuming, it was error prone, and after all of the effort, they actually left millions in earned revenue unreceived. And again, I almost don’t need to state that this end result is not an insignificant repercussion of the inability to respond at the rapid pace of business and to effectively manage operational costs within the analyst’s company. So, let’s take a look at what the Accounting Analyst does today. Every trade partner sends large files with different structures of purchase data in them. The Accounting Analyst initially asks, “What data is in them?”, “For what time period?”, “How many transactions?”, “From which products?”, and “Which of our actual products does their name for our product tie to?”
Then, after they get these answers, they need to combine it with the payments data they received from the trade partner in order to answer the question, “Have we been paid the right amount and if not what is the difference?” All of these questions are ones that can be answered, but they used to take a lot of time with Excel VLOOKUPs and complex macros. And often, the reconciliation was performed incorrectly, leaving receivables, well, un-received. With Informatica Rev, an Accounting Analyst can benefit from Intelligent Guidance, where they are led through the process of rapidly answering their questions about the trade partner files with a few simple clicks. Furthermore, Informatica Rev’s Smart Combine capability suggests how to combine receivables data with trade partner data. So there you have it, now they know if the correct amount has been paid. And the best part is that they were able to answer these questions rapidly with no code, no formal training, and in a few minutes all by themselves. Now, this process has to be done every month. Using Recipes, every step the Accounting Analyst took last month is recorded, so they do not have to repeat it this month. Just re-import the new trade partner data and you are reconciled. And again, Voila!
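The idea of recording steps once and replaying them every month can be sketched in a few lines of Python. This is only an illustration of the concept, not the Recipe format Informatica Rev uses. The product mapping, the file layouts, and the function names are hypothetical, and amounts are kept in cents to avoid rounding surprises.

```python
from collections import defaultdict

# Hypothetical mapping from a partner's product names to our own SKUs.
PRODUCT_MAP = {"Widget Std": "W-1", "Widget Pro": "W-2"}


def map_products(rows):
    """Translate the partner's product names into our SKUs."""
    return [{**row, "sku": PRODUCT_MAP[row["product"]]} for row in rows]


def total_owed(rows):
    """Sum what the partner owes per SKU, in cents to avoid float drift."""
    owed = defaultdict(int)
    for row in rows:
        owed[row["sku"]] += row["units"] * row["unit_price_cents"]
    return dict(owed)


# The "recipe": an ordered list of steps that can be replayed each month.
RECIPE = [map_products, total_owed]


def run_recipe(purchases, payments_cents):
    """Replay the recorded steps, then compare owed amounts with payments."""
    data = purchases
    for step in RECIPE:
        data = step(data)
    return {sku: owed - payments_cents.get(sku, 0) for sku, owed in data.items()}


# Hypothetical monthly files from one trade partner.
purchases = [
    {"product": "Widget Std", "units": 10, "unit_price_cents": 500},
    {"product": "Widget Pro", "units": 4, "unit_price_cents": 1250},
    {"product": "Widget Std", "units": 5, "unit_price_cents": 500},
]
payments_cents = {"W-1": 7500, "W-2": 4000}

shortfall = run_recipe(purchases, payments_cents)
```

Next month, only `purchases` and `payments_cents` change; the steps in `RECIPE` stay the same, which is the point of recording them.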
One more thing for you, the everyday business user. In the future, you will be able to send this Recipe to IT. This capability will allow you to communicate your exact data requirement to IT, just as you created it, with no misinterpretation on anyone’s behalf. IT can then rapidly institutionalize your logic exactly as you defined it, into the enterprise data warehouse, datamart or some other repository of your or your IT department’s liking. Perhaps this means the end to those requirements gathering sessions?
More importantly, I feel this means that you just got your exact requirement added into a central repository in a matter of minutes. And you did not need to make a case to be part of an enterprise project either. This capability is a necessary part for you to participate in the Data Democracy and maintain your rapid pace of business. This is a piece that Informatica is uniquely positioned to solve for you as your IT department likely already has Informatica.
Just as these Professionals have found success by participating in the Data Democracy, with Informatica Rev you finally can do so, too.
Please look for Part 2 of this Blog, tomorrow, where I will discuss how Informatica Rev elegantly bridges the IT and Business divide, empowering IT to lead the charge into Data Democracy. But in the meantime check out Informatica Rev for yourself and let me know what you think.
Data warehousing systems remain the de facto standard for high performance reporting and business intelligence, and there is no sign that will change soon. But Hadoop now offers an opportunity to lower costs by transferring infrequently used data and data preparation workloads off of the data warehouse and process entirely new sources of data coming from the explosion of industrial and personal devices. This is motivating interest in new concepts like the “data lake” as adjunct environments to traditional data warehousing systems.
Now, let’s be real. Between the evolutionary opportunity of preparing data more cost effectively and the revolutionary opportunity of analyzing new sources of data, the latter just sounds cooler. This revolutionary opportunity is what has spurred the growth of new roles like data scientists and new tools for self-service visualization. In the revolutionary world of pervasive analytics, data scientists have the ability to use Hadoop as a low cost and transient sandbox for data. Data scientists can perform exploratory data analysis by quickly dumping data from a variety of sources into a schema-on-read platform and by iterating dumps as new data comes in. SQL-on-Hadoop technologies like Cloudera Impala, Hortonworks Stinger, Apache Drill, and Pivotal HAWQ enable agile and iterative SQL-like queries on datasets, while new analysis tools like Tableau enable self-service visualization. We are merely in the early phases of the revolutionary opportunity of big data.
But while the revolutionary opportunity is exciting, there’s an equally compelling opportunity for enterprises to modernize their existing data environment. Enterprises cannot rely on an iterative dump methodology for managing operational data pipelines. Unmanaged “data swamps” are simply impractical for business operations. For an operational data pipeline, the Hadoop environment must be a clean, consistent, and compliant system of record for serving analytical systems. Loading enterprise data into Hadoop instead of a relational data warehouse does not eliminate the need to prepare it.
Now I have a secret to share with you: nearly every enterprise adopting Hadoop today to modernize their data environment has processes, standards, tools, and people dedicated to data profiling, data cleansing, data refinement, data enrichment, and data validation. In the world of enterprise big data, schemas and metadata still matter.
I’ll share some examples with you. I attended a customer panel at Strata + Hadoop World in October. One of the participants was the analytics program lead at a large software company whose team was responsible for data preparation. He described how they ingest data from heterogeneous data sources by mandating a standardized schema for everything that lands in the Hadoop data lake. Once the data lands, his team profiles, cleans, refines, enriches, and validates the data so that business analysts have access to high quality information. Another data executive described how inbound data teams are required to convert data into Avro before storing the data in the data lake. (Avro is an emerging data format alongside other formats like ORC, Parquet, and JSON.) One data engineer from one of the largest consumer internet companies in the world described the schema review committee that had been set up to govern changes to their data schemas. The final participant was an enterprise architect from one of the world’s largest telecom providers who described how their data schema was critical for maintaining compliance with privacy requirements, since data had to be masked before it could be made available to analysts.
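Mandating a standardized schema at the point of landing can be pictured with a short Python sketch. It is a generic illustration, not the panelists' actual pipeline and not an Avro implementation. The schema fields, the `check_record` and `land` helpers, and the sample records are all assumptions.

```python
# Hypothetical standardized schema: field names and types are assumptions.
LANDING_SCHEMA = {
    "event_id": str,
    "source_system": str,
    "event_time": str,   # ISO 8601 timestamp kept as text in this sketch
    "amount": float,
}


def check_record(record, schema=LANDING_SCHEMA):
    """Return a list of schema violations; an empty list means the record conforms."""
    problems = []
    for field, expected in schema.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected):
            problems.append(
                f"{field}: expected {expected.__name__}, "
                f"got {type(record[field]).__name__}"
            )
    for field in record:
        if field not in schema:
            problems.append(f"unexpected field: {field}")
    return problems


def land(records):
    """Split incoming records into those accepted into the lake and those rejected."""
    accepted, rejected = [], []
    for record in records:
        problems = check_record(record)
        if problems:
            rejected.append((record, problems))
        else:
            accepted.append(record)
    return accepted, rejected


incoming = [
    {"event_id": "e1", "source_system": "crm",
     "event_time": "2014-10-16T09:00:00Z", "amount": 12.5},
    {"event_id": "e2", "source_system": "web", "amount": "12.5"},
]
accepted, rejected = land(incoming)
```

Rejecting at the door, with the reasons attached, is what keeps a lake from turning into a swamp: bad records are visible and fixable rather than silently mixed in.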
Let me be clear – these companies are not just bringing in CRM and ERP data into Hadoop. These organizations are ingesting patient sensor data, log files, event data, clickstream data, and in every case, data preparation was the first task at hand.
I recently talked to a large financial services customer who proposed a unique architecture for their Hadoop deployment. They wanted to empower line of business users to be creative in discovering revolutionary opportunities while also evolving their existing data environment. They decided to allow lines of business to set up sandbox data lakes on local Hadoop clusters for use by small teams of data scientists. Then, once a subset of data was profiled, cleansed, refined, enriched, and validated, it would be loaded into a larger Hadoop cluster functioning as an enterprise information lake. Unlike the sandbox data lakes, the enterprise information lake was clean, consistent, and compliant. Data stewards of the enterprise information lake could govern metadata and ensure data lineage tracking from source systems to sandbox to enterprise information lakes to destination systems. Enterprise information lakes balance the quality of a data warehouse with the cost-effective scalability of Hadoop.
Building enterprise information lakes out of data lakes is simple and fast with tools that can port data pipeline mappings from traditional architectures to Hadoop. With visual development interfaces and native execution on Hadoop, enterprises can accelerate their adoption of Hadoop for operational data pipelines.
No one described the opportunity of enterprise information lakes better at Strata + Hadoop World than a data executive from a large healthcare provider who said, “While big data is exciting, equally exciting is complete data…we are data rich and information poor today.” Schemas and metadata still matter more than ever, and with the help of leading data integration and preparation tools like Informatica, enterprises have a path to unleashing information riches. To learn more, check out this Big Data Workbook
Insurance companies serve as a fantastic example of big data technology use since data is such a pervasive asset in the business. From a cost savings and risk mitigation standpoint, insurance companies use data to assess the probable maximum loss of catastrophic events as well as detect the potential for fraudulent claims. From a revenue growth standpoint, insurance companies use data to intelligently price new insurance offerings and deploy cross-sell offers to customers to maximize their lifetime value.
New data sources are enabling insurance companies to mitigate risk and grow revenues even more effectively. Location-based data from mobile devices and sensors are being used inside insured properties to proactively detect exposure to catastrophic events and deploy preventive maintenance. For example, automobile insurance providers are increasingly offering usage-based driving programs, whereby insured individuals install a mobile sensor inside their car to relay the quality of their driving back to their insurance provider in exchange for lower premiums. Even healthcare insurance providers are starting to analyze the data collected by wearable fitness bands and smart watches to monitor insured individuals and inform them of personalized ways to be healthier. Devices can also be deployed in environments that trigger adverse events, such as sensors that monitor earthquake and weather patterns, to help mitigate the costs of potential events. Claims are increasingly submitted with supporting information in a variety of formats like text files, spreadsheets, and PDFs that can be mined for insights as well. And with the growth of online insurance sales, web log and clickstream data is more important than ever to help drive online revenue.
Beyond the benefits of using new data sources to assess risk and grow revenues, big data technologies are enabling insurance companies to fundamentally rethink the basis of their analytical architecture. In the past, probable maximum loss modeling could only be performed on statistically aggregated datasets. But with big data technologies, insurance companies have the opportunity to analyze data at the level of an insured individual or a unique insurance claim. This increased depth of analysis has the potential to radically improve the quality and accuracy of risk models and market predictions.
Informatica is helping insurance companies accelerate the benefits of big data technologies. With multiple styles of ingestion available, Informatica enables insurance companies to leverage nearly any source of data. Informatica Big Data Edition provides comprehensive data transformations for ETL and data quality, so that insurance companies can profile, parse, integrate, cleanse, and refine data using a simple user-friendly visual development environment. With built-in data lineage tracking and support for data masking, Informatica helps insurance companies ensure regulatory compliance across all data.
To try out the Big Data Edition, download a free trial today in the Informatica Marketplace and get started with big data today!
According to the article, in Hamilton County Ohio, it’s not unusual to see kids from the same neighborhoods coming to the hospital for asthma attacks. Thus, researchers wanted to know if it was fact or mistaken perception that an unusually high number of children in the same neighborhood were experiencing asthma attacks. The next step was to review existing data to determine the extent of the issues, and perhaps how to solve the problem altogether.
“The researchers studied 4,355 children between the ages of 1 and 16 who visited the emergency department or were hospitalized for asthma at Cincinnati Children’s between January 2009 and December 2012. They tracked those kids for 12 months to see if they returned to the ED or were readmitted for asthma.”
Not only were the researchers able to determine a sound correlation between the two data sets, but they were able to advance the research to predict which kids were at high-risk based upon where they live. Thus, some of the cause and the effects have been determined.
This came about when researchers began thinking out of the box, when it comes to dealing with traditional and non-traditional medical data. They integrated housing and census data, in this case, with that of the data from the diagnosis and treatment of the patients. These are data sets unlikely to find their way to each other, but together they have a meaning that is much more valuable than if they just stayed in their respective silos.
“Non-traditional medical data integration has begun to take place in some medical collaborative environments already. The New York-Presbyterian Regional Health Collaborative created a medical village, which ‘goes beyond the established patient-centered medical home mode.’ It not only connects an academic medical center with a large ambulatory network, medical homes, and other providers with each other, but community resources such as school-based clinics and specialty-care centers (the ones that are a part of NYP’s network).”
The fact of the matter is that data is the key to understanding what the heck is going on when cells of sick people begin to emerge. While researchers and doctors can treat the individual patients there is not a good understanding of the larger issues that may be at play. In this case, poor air quality in poor neighborhoods. Thus, they understand what problem needs to be corrected.
The universal sharing of data is really the larger solution here, but one that won’t be approached without a common understanding of the value, and funding. As we pass laws around the administration of health care, as well as how data is to be handled, perhaps it’s time we look at what the data actually means. This requires a massive deployment of data integration technology, and the fundamental push to share data with a central data repository, as well as with health care providers.
Or in other words: Did the agency model kill data quality? When you watch the TV series “Homeland”, you quickly realize the interdependence between field operatives and the command center. This is a classic agency model. One arm gathers, filters and synthesizes information and prepares a plan but the guys on the ground use this intel to guide their sometimes ad hoc next moves.
The last few months I worked a lot – and I mean A LOT – with a variety of mid-sized life insurers (<$1B annual revenue) around fixing their legacy-inherited data quality problems. Their IT departments, functioning like Operations Command Centers (intel acquisition, filter and synthesize), were inundated with requests to fix up and serve a coherent, true, enriched central view of a participant (the target) and all his policies and related entities from and to all relevant business lines (planning) to achieve their respective missions (service, retain, upsell, mitigate risk): employee benefits, broker/dealer, retirement services, etc.
The captive and often independent agents (execution), however, often run with little useful data into an operation (sales cycle) as the Ops Center is short on timely and complete information. Imagine Carrie directing her strike team to just wing it based on their experience and dated intel from a raid a few months ago without real-time drone video feeds. Would she be saying, “Guys, it’s always been your neck, you’ll figure it out”? I think not.
This became apparent when talking to the actuaries, claims operations, marketing, sales, agency operations, audit, finance, strategic planning, underwriting and customer service. Common denominators appeared quickly:
- Every insurer saw the need to become customer instead of policy centric. That’s the good news.
- Every insurer knew their data was majorly sub-standard in terms of quality and richness.
- Every insurer agreed that they are not using existing data capture tools (web portals for agents and policy holders, CRM applications, policy and claims mgmt systems) to their true potential.
- New books-of-business were generally managed as separate entities from a commercial as well as IT systems perspective, even if they were not net-new products, like trust products. Cross-selling as such, even if desired, becomes a major infrastructure hurdle.
- As in every industry, the knee-jerk reaction was to throw the IT folks at data quality problems and make it a back-office function. Pyrrhic victory.
- Upsell scenarios, if at all strategically targeted, are squarely in the hands of the independent agents. The insurer will, at most, support customer insights around lapse avoidance or 401k roll-off indicators for retiring or terminated plan participants. This may be derived from a plan sponsor (employer) census file, which may have incorrect address information.
- Prospect and participant e-mail addresses are either not captured (process enforcement or system capability) or not validated (domain, e-mail verification), so the vast majority of customer correspondence, like scenarios, statements, privacy notices and policy changes, travels via snail mail (and this typically per policy). Overall, only 15-50% of contacts have an “unverified” e-mail address today and of these less than a third subscribed to exclusive electronic statement delivery.
- Postal address information is still not 99% correct, complete or current, resulting in high returned mail ($120,000-$750,000 every quarter), priority mail upgrades, statement reprints, manual change capture and shredding cost as well as the occasional USPS fine.
- Data quality, as unbelievable as it sounds, took a back-burner to implementing a new customer data warehouse, a new claims system, a new risk data mart, etc. They all just get filled with the same old, bad data as business users were – and I quote –“used to the quality problem already”.
- Premium underpricing runs at 2-4% annually, foregoing millions in additional revenue, due to lack of a full risk profile.
- Customer cost-of-acquisition (CAC) is unknown or incorrect as there is no direct, realistic tracking of agency campaign/education dollars spent against new policies written.
- Agency historic production and projections are unclear as a dynamic enforcement of hierarchies is not available, resulting in orphaned policies generating excess tax burdens. Often this is the case when agents move to states where they are not licensed, or when they have passed away or retired.
What does a cutting-edge insurer look like instead? Ask Carrie Mathison and Saul Berenson. They already have a risk and customer EDW as well as a modern (cloud based?) CRM and claims mgmt system. They have considered, as part of their original implementation or upgrade, capabilities required to fix the initial seed data into their analytics platforms. Now, they are looking into pushing them back into operational systems like CRM and avoiding bad source system entries from the get-go.
They are also beyond just using data to avoid throwing more bodies in every department at “flavor-of-the-month” clean-up projects, e.g. yet another state unclaimed property matching exercise, total annual premium revenue written in state X for tax review purposes by the state tax authority.
So what causes this drastic segmentation of leading versus laggard life insurers? In my humble opinion, it is the lack of a strategic refocusing of what the insurer can do for an agent by touching the prospects and customers directly. Direct interaction (even limited) improves branding, shortens the sales cycle and rates based on improved insights through better data quality.
Agents (and insurers) need to understand that the wealth of data (demographic, interactions, transactions) corporate already possesses via native and inherited (via M&A) sources can be a powerful competitive differentiator. Imagine if they start tapping into external sources beyond the standard credit bureaus and consumer databases; dare I say social media?
Competing based on immediate instead of long term needs (in insurance: life time earnings potential replacement), price (fees) and commission cannot be the sole answer.
If you use production data in test and development environments or are looking for alternative approaches, register for the first webinar in a three part series on data security gaps and remediation. On December 9th, Adrian Lane, Security Analyst at Securosis, will join me to discuss security for test environments.
This webinar will focus on how data centric security can be used to shore up vulnerabilities in one of the key focus areas, test and development environments. It’s common practice for non-production database environments to be created by making copies of production data. This potentially exposes sensitive and confidential production data to developers, testers, and contractors alike. Commonly, 6-10 copies of production databases are created for each application environment and they are regularly provisioned to support development, testing and training efforts. Since security controls deployed for the source database are not replicated in the test environments, this is a glaring hole in data security and a target for external or internal exploits.
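One alternative to copying production data as-is is to mask sensitive columns while the copy is made. The Python sketch below shows the general idea using a keyed hash from the standard library. It is a simplified illustration, not how Informatica's masking products work. The key, the choice of sensitive fields, and the sample rows are hypothetical, and real masking tools typically also preserve data formats, which this sketch does not.

```python
import hashlib
import hmac

# Assumption: in practice this key would come from a secrets store, not source code.
MASKING_KEY = b"replace-with-a-managed-secret"

# Hypothetical choice of which columns count as sensitive.
SENSITIVE_FIELDS = ("name", "ssn", "email")


def mask_value(value, key=MASKING_KEY):
    """Replace a value with a keyed, deterministic token (same input, same token)."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return "tok_" + digest[:12]


def mask_row(row):
    """Return a copy of the row that is safer to load into a test database."""
    return {
        field: mask_value(value) if field in SENSITIVE_FIELDS else value
        for field, value in row.items()
    }


# Hypothetical production rows.
production_rows = [
    {"customer_id": 1, "name": "Ana Diaz", "ssn": "123-45-6789",
     "email": "ana@example.com", "plan": "gold"},
    {"customer_id": 2, "name": "Raj Patel", "ssn": "987-65-4321",
     "email": "raj@example.com", "plan": "basic"},
]
test_rows = [mask_row(row) for row in production_rows]
```

Because the token is deterministic, the same customer masks to the same value in every table, so joins in the test environment still line up even though the real values are gone.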
In this webinar, we will cover:
- Key trends in enterprise data security
- Vulnerabilities in non-production application environments (test and development)
- Alternatives to consider when protecting test and development environments
- Priorities for enterprises in reducing attack surface for their organization
- Compliance and internal audit cost reduction
- Data masking and synthetic data use cases
- Informatica Secure Testing capabilities
Register for the webinar today at http://infa.media/1pohKov. If you cannot attend the live event, be sure to watch the webinar on-demand.
From this analysis in “What’s Reasonable Security? A Moving Target,” IAPP extrapolated the best practices from the FTC’s enforcement actions.
While the white paper and article indicate that “reasonable security” is a moving target, they do provide recommendations that will help organizations assess and baseline their current data security efforts. Of particular interest is the focus on data centric security, from overall enterprise assessment to the careful control of access by employees and 3rd parties. Here are some of the recommendations derived from the FTC’s enforcement actions that call for Data Centric Security:
- Perform assessments to identify reasonably foreseeable risks to the security, integrity, and confidentiality of personal information collected and stored on the network, online or in paper files.
- Limited access policies curb unnecessary security risks and minimize the number and type of network access points that an information security team must monitor for potential violations.
- Limit employee access to (and copying of) personal information, based on employee’s role.
- Implement and monitor compliance with policies and procedures for rendering information unreadable or otherwise secure in the course of disposal. Securely disposed information must not practicably be read or reconstructed.
- Restrict third party access to personal information based on business need, for example, by restricting access based on IP address, granting temporary access privileges, or similar procedures.
How does Data Centric Security help organizations achieve this inferred baseline?
- Data Security Intelligence (Secure@Source, coming Q2 2015) provides the ability to “…identify reasonably foreseeable risks.”
- Data Masking (Dynamic and Persistent Data Masking) provides the controls to limit access of information to employees and 3rd parties.
- Data Archiving provides the means for the secure disposal of information.
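To make the masking control concrete, here is a minimal sketch of persistent masking applied to a copy of production rows before they reach a test environment. It is an illustration only, not how Informatica's products work: the key `MASKING_KEY`, the helpers `mask_value` and `mask_record`, and the sample rows are all hypothetical. It uses a keyed hash so the same input always maps to the same pseudonym, which keeps joins between masked tables consistent.

```python
import hashlib
import hmac

# Hypothetical masking key (an assumption for this sketch); in practice it
# would be stored outside the test environment.
MASKING_KEY = b"example-masking-key"


def mask_value(value: str, prefix: str = "user") -> str:
    """Deterministically replace a sensitive value with a pseudonym."""
    digest = hmac.new(MASKING_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()
    # Keep only part of the digest so the pseudonym stays short and readable.
    return f"{prefix}_{digest[:12]}"


def mask_record(record: dict, sensitive_fields: set) -> dict:
    """Return a copy of the record with the sensitive fields masked."""
    return {
        key: mask_value(str(val), prefix=key) if key in sensitive_fields else val
        for key, val in record.items()
    }


# Made-up rows standing in for a production table.
production_rows = [
    {"id": 1, "name": "Alice Smith", "email": "alice@example.com", "plan": "gold"},
    {"id": 2, "name": "Bob Jones", "email": "bob@example.com", "plan": "basic"},
]

# The masked copy is what developers, testers and contractors would see.
test_rows = [mask_record(row, {"name", "email"}) for row in production_rows]

for row in test_rows:
    print(row)
```

Non-sensitive fields such as `id` and `plan` pass through unchanged, so the test data keeps its shape while names and email addresses no longer appear in the copy.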
Other data centric security controls would include encryption for data at rest/motion and tokenization for securing payment card data. All of the controls help organizations secure their data, whether a threat originates internally or externally. And based on the never ending news of data breaches and attacks this year, it is a matter of when, not if, your organization will be significantly breached.
For 2015, “Reasonable Security” will require ongoing analysis of sensitive data and the deployment of reciprocal data centric security controls to ensure that organizations keep pace with this “Moving Target.”
Part 1 of this blog touched on the differences between PIM and Product MDM. Since both play a role in ensuring the availability of high quality product data, it is easy to see the temptation to extend the scope of either product to play a more complete part. However, there are risks involved in customising software. PIM and MDM are not exceptions, and any customisations will carry some risk.
In the specific case of looking to extend the role of PIM, the problems start if you just look at the data and think: “oh, this is just a few more product attributes to add”. This will not give you a clear picture of the effort or risk associated with customisations. A complete picture requires looking beyond the attributes as data fields, and considering them in context: which processes and people (roles) are supported by these attributes?
Recently we were asked to assess the risk of PIM customisation for a customer. The situation was that data to be included in PIM was currently housed in separate, home grown and aging legacy systems. One school of thought was to move all the data, and their management tasks, into PIM and retire the three systems. That is, extending the role of PIM beyond a marketing application and into a Product MDM role. In this case, we found three main risks of customising PIM for this purpose. Here they are in more detail:
1. Decrease speed of PIM deployment
- Inclusion of the functionality (not just the data) will require customisations in PIM, not just additional attributes in the data model.
- Logic customisations are required for data validity checks, and some value calculations.
- Additional screens, workflows, integrations and UI customisations will be required for non-marketing roles
- PIM will become the source for some data, which is used in critical operational systems (e.g. SAP). Reference checks & data validation cannot be taken lightly due to risks of poor data elsewhere.
- Bottom line: A non-standard deployment will drive up implementation cost, time and risk.
2. Reduce marketing agility
- In the case concerned, whilst the additional data was important to marketing, it is primarily supported by non-marketing users and processes, including Product Development, Sales and Manufacturing.
- These systems are key systems in their workflow in terms of creating and distributing technical details of new products to other systems, e.g. SAP for production
- If the systems are retired and replaced with PIM, these non-marketing users will need to be equal partners in PIM:
  - Require access and customised roles
  - Influence over configuration
  - Equal vote in feature/function prioritisation
- Bottom Line: Marketing will no longer completely own the PIM system, and may have to sacrifice new functionality to prioritise supporting other roles.
3. Risk of marketing abandoning the hybrid tool in the mid-term
- An investment in PIM is usually an investment by Marketing to help them rapidly adapt to a dynamic external market.
- System agility (point 2) is key to rapid adaption, as is the ability to take advantage of new features within any packaged application.
- As more customisations are made, the cost of upgrades can become prohibitive, driven by the cost to upgrade customisations.
- Cost often driven by consulting fees to change what could be poorly documented code.
- Risk of falling behind on upgrades, and hence sacrificing access to the newest PIM functionality
- If upgrades are more expensive than new tools, PIM will be abandoned by Marketing, and they will invest in a new tool.
- Bottom line: In a worst case scenario, a customised PIM solution could be left supporting non-marketing functionality with Marketing investing in a new tool.
The first response to the last bullet point is normally “no they wouldn’t”. Unfortunately this is a pattern both I and some of my colleagues have seen in the area of marketing & eCommerce applications. The problem is that these areas are so fast moving that nobody can afford to fall behind in terms of new functionality. If upgrades are large projects which need lengthy approval and implementation cycles, marketing is unlikely to wait. It is far easier to start again with a smaller budget under their direct control. (Which is where PIM should be in the first place.)
- Making PIM look and behave like Product MDM could have some undesirable consequences – both in the short term (current deployment) and in the longer term (application abandonment).
- The choice between customising PIM and enhancing your landscape with Product MDM should not be made on data attributes alone.
- Your business and data processes should guide you in terms of risk assessment for customisation of your PIM solution.
Bottom Line: If the risks seem too large, then consider enhancing your IT landscape with Product MDM. Trading PIM cost & risk for measurable business value delivered by MDM will make a very attractive business case.
This is a guest author post by Philip Howard, Research Director, Bloor Research.
I recently posted a blog about an interview style webcast I was doing with Informatica on the uses and costs associated with data integration tools.
I’m not sure that the poet John Donne was right when he said that it was strange, let alone fatal. Somewhat surprisingly, I have had a significant amount of feedback following this webinar. I say “surprisingly” because the truth is that I very rarely get direct feedback. Most of it, I assume, goes to the vendor. So, when a number of people commented to me that the research we conducted was both unique and valuable, it was a bit of a thrill. (Yes, I know, I’m easily pleased).
There were a number of questions that arose as a result of our discussions. Probably the most interesting was whether moving data into Hadoop (or some other NoSQL database) should be treated as a separate use case. We certainly didn’t include it as such in our original research. In hindsight, I’m not sure that the answer I gave at the time was fully correct. I acknowledged that you certainly need some different functionality to integrate with a Hadoop environment and that some vendors have more comprehensive capabilities than others when it comes to Hadoop, and the same also applies (but with different suppliers) when it comes to integrating with, say, MongoDB or Cassandra or graph databases. However, as I pointed out in my previous blog, functionality is ephemeral. And, just because a particular capability isn’t supported today, doesn’t mean it won’t be supported tomorrow. So that doesn’t really affect use cases.
However, where I was inadequate in my reply was that I only referenced Hadoop as a platform for data warehousing, stating that moving data into Hadoop was not essentially different from moving it into Oracle Exadata or Teradata or HP Vertica. And that’s true. What I forgot was the use of Hadoop as an archiving platform. As it happens we didn’t have an archiving use case in our survey either. Why not? Because archiving is essentially a form of data migration – you have some information lifecycle management and access and security issues that are relevant to archiving once it is in place but that is after the fact: the process of discovering and moving the data is exactly the same as with data migration. So: my bad.
Aside from that little caveat, I quite enjoyed the whole event. Somebody or other (there’s always one!) didn’t quite get how quantifying the number of end points in a data integration scenario was a surrogate measure for complexity (something we took into account) and so I had to explain that. Of course, it’s not perfect as a metric but the only alternative is to ask eye-of-the-beholder type questions, which aren’t very satisfactory.
Anyway, if you want to listen to the whole thing you can find it HERE: