Category Archives: Enterprise Data Management

Which Method of Controls Should You Use to Protect Sensitive Data in Databases and Enterprise Applications? Part II

Sensitive Data

Protecting Sensitive Data

To determine what is the appropriate sensitive data protection method to use, you should first answer the following questions regarding the requirements:

  • Do you need to protect data at rest (in storage), during transmission, and/or when accessed?
  • Do some privileged users still need the ability to view the original sensitive data or does sensitive data need to be obfuscated at all levels?
  • What is the granularity of controls that you need?
    • Datafile level
    • Table level
    • Row level
    • Field / column level
    • Cell level
    • Do you need to be able to control viewing vs. modification of sensitive data?
    • Do you need to maintain the original characteristics / format of the data (e.g. for testing, demo, development purposes)?
    • Is response time latency / performance of high importance for the application?  This can be the case for mission critical production applications that need to maintain response times in the order of seconds or sub-seconds.

In order to help you determine which method of control is appropriate for your requirements, the following table provides a comparison of the different methods and their characteristics.

data

A combination of protection method may be appropriate based on your requirements.  For example, to protect data in non-production environments, you may want to use persistent data masking to ensure that no one has access to the original production data, since they don’t need to.  This is especially true if your development and testing is outsourced to third parties.  In addition, persistent data masking allows you to maintain the original characteristics of the data to ensure test data quality.

In production environments, you may want to use a combination of encryption and dynamic data masking.  This is the case if you would like to ensure that all data at rest is protected against unauthorized users, yet you need to protect sensitive fields only for certain sets of authorized or privileged users, but the rest of your users should be able to view the data in the clear.

The best method or combination of methods will depend on each scenario and set of requirements for your environment and organization.  As with any technology and solution, there is no one size fits all.

FacebookTwitterLinkedInEmailPrintShare
Posted in Data Integration, Data masking, Data Security, Data Services, Enterprise Data Management | Tagged , , , | Leave a comment

Which Method of Controls Should You Use to Protect Sensitive Data in Databases and Enterprise Applications? Part I

Sensitive Data

Protecting Sensitive Data

I’m often asked to share my thoughts about protecting sensitive data. The questions that typically come up include:

  • Which types of data should be protected?
  • Which data should be classified as “sensitive?”
  • Where is this sensitive data located?
  • Which groups of users should have access to this data?

Because these questions come up frequently, it seems ideal to share a few guidelines on this topic.

When protecting the confidentiality and integrity of data, the first level of defense is Authentication and access control. However, data with higher levels of sensitivity or confidentiality may require additional levels of protection, beyond regular authentication and authorization methods.

There are a number of control methods for securing sensitive data available in the market today, including:

  • Encryption
  • Persistent (Static) Data Masking
  • Dynamic Data Masking
  • Tokenization
  • Retention management and purging

Encryption is a cryptographic method of encoding data.  There are generally, two methods of encryption:  symmetric (using single secret key) and asymmetric (using public and private keys).  Although there are methods of deciphering encrypted information without possessing the key, a good encryption algorithm makes it very difficult to decode the encrypted data without knowledge of the key.  Key management is usually a key concern with this method of control.  Encryption is ideal for mass protection of data (e.g. an entire data file, table, partition, etc.) against unauthorized users.

Persistent or static data masking obfuscates data at rest in storage.  There is usually no way to retrieve the original data – the data is permanently masked.  There are multiple techniques for masking data, including: shuffling, substitution, aging, encryption, domain-specific masking (e.g. email address, IP address, credit card, etc.), dictionary lookup, randomization, etc.  Depending on the technique, there may be ways to perform reverse masking  - this should be used sparingly.  Persistent masking is ideal for cases where all users should not see the original sensitive data (e.g. for test / development environments) and field level data protection is required.

Dynamic data masking de-identifies data when it is accessed.  The original data is still stored in the database.  Dynamic data masking (DDM) acts as a proxy between the application and database and rewrites the user / application request against the database depending on whether the user has the privilege to view the data or not.  If the requested data is not sensitive or the user is a privileged user who has the permission to access the sensitive data, then the DDM proxy passes the request to the database without modification, and the result set is returned to the user in the clear.  If the data is sensitive and the user does not have the privilege to view the data, then the DDM proxy rewrites the request to include a masking function and passes the request to the database to execute.  The result is returned to the user with the sensitive data masked.  Dynamic data masking is ideal for protecting sensitive fields in production systems where application changes are difficult or disruptive to implement and performance / response time is of high importance.

Tokenization substitutes a sensitive data element with a non-sensitive data element or token.  The first generation tokenization system requires a token server and a database to store the original sensitive data.  The mapping from the clear text to the token makes it very difficult to reverse the token back to the original data without the token system.  The existence of a token server and database storing the original sensitive data renders the token server and mapping database as a potential point of security vulnerability, bottleneck for scalability, and single point of failure. Next generation tokenization systems have addressed these weaknesses.  However, tokenization does require changes to the application layer to tokenize and detokenize when the sensitive data is accessed.  Tokenization can be used in production systems to protect sensitive data at rest in the database store, when changes to the application layer can be made relatively easily to perform the tokenization / detokenization operations.

Retention management and purging is more of a data management method to ensure that data is retained only as long as necessary.  The best method of reducing data privacy risk is to eliminate the sensitive data.  Therefore, appropriate retention, archiving, and purging policies should be applied to reduce the privacy and legal risks of holding on to sensitive data for too long.  Retention management and purging is a data management best practices that should always be put to use.

FacebookTwitterLinkedInEmailPrintShare
Posted in Data Integration, Data masking, Data Security, Data Services, Enterprise Data Management | Tagged , , , | Leave a comment

Scalable Enterprise Analytics: Informatica PowerCenter Data Quality and Oracle Exadata

In 2012, Forbes published an article predicting an upcoming problem.

The Need for Scalable Enterprise Analytics

Specifically, increased exploration in Big Data opportunities would place pressure on the typical corporate infrastructure. The generic hardware used to run most tech industry enterprise applications was not designed to handle real-time data processing. As a result, the explosion of mobile usages, and the proliferation of social networks, was increasing the strain on the system. Most companies now faced real-time processing requirements beyond what the traditional model was designed to handle.

In the past two years, the volume of data and speed of data growth has grown significantly. As a result, the problem has become more severe. It is now clear that these challenges can’t be overcome by simply doubling or tripling their IT spending on infrastructure sprawl. Today, enterprises seek consolidated solutions that offer scalability, performance and ease of administration. The present need is for scalable enterprise analytics.

A Clear Solution Is Available

Informatica PowerCenter and Data Quality is the market leading data integration and data quality platform. This platform has now been certified by Oracle as an optimal solution for both the Oracle Exadata Database Machine and the Oracle SuperCluster.

As the high-speed on-ramp for data into Oracle Exadata, PowerCenter and Data Quality deliver up-to five times faster performance on data load, query, profiling and cleansing tasks. Informatica’s data integration customers can now easily reuse data integration code, skills and resources to access and transform any data from any data source and load it into Exadata, with the highest throughput and scalability.

Customers adopting Oracle Exadata for high-volume, high-speed analytics can now be confident with Informatica PowerCenter and Data Quality. With these products, they can ingest, cleanse and transform all types of data into Exadata with the highest performance and scale required to maximize the value of their Exadata investment.

Proving the Value of Scalable Enterprise Analytics

In order to demonstrate the efficacy of their partnership, the two companies worked together on a Proof Of Value (POV) project. The goal is to prove that using PowerCenter with Exadata would improve both performance and scalability. The project involved PowerCenter and Data Quality 9.6.1 and x4-2 Exadata Machine. Oracle 11g was considered for both standard Oracle and Exadata versions.

The first test conducted a 1TB load test to Exadata and standard Oracle in a typical PowerCenter use case. The second test consisted of querying 1TB profiling warehouse database in Data Quality use case scenario. Performance data was collected for both tests. The scalability factor was also captured. A variant of the TPCH dataset was used to generate the test data. The results were significantly higher than prior Exabyte 1TB test. In particular:

  • The data query tests achieved 5x performance.
  • The data load tests achieved a 3x-5x speed increase.
  • Linear scalability was achieved with read/write tests on Exadata.

What Business Benefits Could You Expect?

Informatica PowerCenter and Data Quality, along-with Oracle Exadata, now provide the best-of-breed combination of software and hardware, optimized to deliver the highest possible total system performance. These comprehensive tools drive agile reporting and analytics, while empowering IT organizations to meet SLAs and quality goals like never before.

  1. Extend Oracle Exadata’s access to even more business critical data sources. Utilize optimized out-of-the-box Informatica connectivity to easily access hundreds of data sources, including all the major databases, on-premise and cloud applications, mainframe, social data and Hadoop.
  2. Get more data, more quickly into Oracle Exadata. Move higher volumes of trusted data quickly into Exadata to support timely reporting with up-to-date information (i.e. up to 5x performance improvement compared to Oracle database).
  3.  Centralize management and improve insight into large scale data warehouses. Deliver the necessary insights to stakeholders with intuitive data lineage and a collaborative business glossary. Contribute to high quality business analytics, in a timely manner across the enterprise.
  4. Instantly re-direct workloads and resources to Oracle Exadata without compromising performance. Leverage existing code and programming skills to execute high-performance data integration directly on Exadata by performing push down optimization.
  5. Roll-out data integration projects faster and more cost-effectively. Customers can now leverage thousands of Informatica certified developers to execute existing data integration and quality transformations directly on Oracle Exadata, without any additional coding.
  6. Efficiently scale-up and scale-out. Customers can now maximize performance and lower the costs of data integration and quality operations of any scale by performing Informatica workload and push down optimization on Oracle Exadata.
  7.  Save significant costs involved in administration and expansion. Customers can now easily and economically manage large-scale analytics data warehousing environments with a single point of administration and control, and consolidate a multitude of servers on one rack.
  8.  Reduce risk. Customers can now leverage Informatica’s data integration and quality platform to overcome the typical performance and scalability limitations seen in databases and data storage systems. This will help reduce quality-of-service risks as data volumes rise.

Conclusion

Oracle Exadata is a well-engineered system that offers customers out-of-box scalability and performance on demand.  Informatica PowerCenter and Data Quality are optimized to run on Exadata, offering customers business benefits that speed up data integration and data quality tasks like never before.  Informatica’s certified, optimized, and purpose-built solutions for Oracle can help you enable more timely and trustworthy reporting.  You can now benefit from Informatica’s optimized solutions for Oracle Exadata to make better business decisions by unlocking the full potential of the most current and complete enterprise data available. As shown in our test results, you can attain up to 5x performance by scaling Exadata. Informatica Data Quality customers can perform profiling 1TB datasets, which is unheard before. We urge you to deploy the combined solution to solve your data integration and quality problems today while achieving high speed business analytics in these days of big data exploration and Internet Of Things.

Note:

Listen to what Ash Kulkarni, SVP, at OOW14 has to say on how @InformaticaCORP PowerCenter and Data Quality certified by Oracle as optimized for Exadata can deliver up-to five times faster performance improvement on data load, query, profiling, cleansing and mastering tasks, for Exadata.

FacebookTwitterLinkedInEmailPrintShare
Posted in Data Integration, Data Integration Platform, Data Quality, Data Services, Data Warehousing, Enterprise Data Management, PowerCenter, Vibe | Tagged | Leave a comment

Do We Really Need Another Information Framework?

Do We Really Need Another Information Framework?

The EIM Consortium is a group of nine companies that formed this year with the mission to:

Promote the adoption of Enterprise Information Management as a business function by establishing an open industry reference architecture in order to protect and optimize the business value derived from data assets.”

That sounds nice, but we do really need another framework for EIM or Data Governance? Yes we do, and here’s why. (more…)

FacebookTwitterLinkedInEmailPrintShare
Posted in CIO, Data Governance, Data Integration, Enterprise Data Management, Governance, Risk and Compliance, Integration Competency Centers, Uncategorized | Tagged , , | Leave a comment

8 Information Management Challenges for UDI Compliance

“My team spends far too much time pulling together medical device data that’s scattered across different systems and reconciling it in spreadsheets to create compliance reports.” This quotation from a regulatory affairs leader at a medical device manufacturer highlights the impact of poorly managed medical device data on compliance reporting, such as the reports needed for the FDA’s Universal Device Identification (UDI) regulation. In fact, an overreliance on manual, time-consuming processes brings an increased risk of human error in UDI compliance reports.

frustrated_man_computer

Is your compliance team manually reconciling data for UDI compliance reports?

If you are an information management leader working for a medical device manufacturer, and your compliance team needs quick and easy access to medical device data for UDI compliance reporting, I have five questions for you:

1) How many Class III and Class II devices do you have?
2) How many systems or reporting data stores contain data about these medical devices?
3) How much time do employees spend manually fixing data errors before the data can be used for reporting?
4) How do you plan to manage medical device data so the compliance team can quickly and easily produce accurate reports for UDI Compliance?
5) How do you plan to help the compliance team manage the multi-step submission process?

Watch this on-demand webinar "3 EIM Best Practices for UDI Compliance"

Watch this on-demand webinar “3 EIM Best Practices for UDI Compliance”

For some helpful advice from data management experts, watch this on-demand webinar “3 Enterprise Information Management (EIM) Best Practices for UDI Compliance.”

The deadline to submit the first UDI compliance report to the FDA for Class III devices is September 24, 2014. But, the medical device data needed to produce the report is typically scattered among different internal systems, such as Enterprise Resource Planning (ERP) e.g. SAP and JD Edwards, Product Lifecycle Management (PLM), Manufacturing Execution Systems (MES) and external 3rd party device identifiers.

The traditional approach for dealing with poorly managed data is the compliance team burns the midnight oil to bring together and then manually reconcile all the medical device data in a spreadsheet. And, they have to do this each and every time a compliance report is due. The good news is your compliance team doesn’t have to.

Many medical device manufacturers are are leveraging their existing data governance programs, supported by a combination of data integration, data quality and master data management (MDM) technology to eliminate the need for manual data reconciliation. They are centralizing their medical device data management, so they have a single source of trusted medical device data for UDI compliance reporting as well as other compliance and revenue generating initiatives.

Get UDI data management advice from data experts Kelle O'Neal, Managing Partner at First San Francisco Partners and Bryan Balding, MDM Specialist at Informatica
Get UDI data management advice from data experts Kelle O’Neal, Managing Partner at First San Francisco Partners and Bryan Balding, MDM Specialist at Informatica

During this this on-demand webinar, Kelle O’Neal, Managing Partner at First San Francisco Partners, covers the eight information management challenges for UDI compliance as well as best practices for medical device data management.

Bryan Balding, MDM Solution Specialist at Informatica, shows you how to apply these best practices with the Informatica UDI Compliance Solution.

You’ll learn how to automate the process of capturing, managing and sharing medical device data to make it quicker and easier to create the reports needed for UDI compliance on ongoing basis.

 

 

20 Questions & Answers about Complying with the FDA Requirement for Unique Device Identification (UDI)

20 Questions & Answers about Complying with the FDA Requirement
for Unique Device Identification (UDI)

Also, we just published a joint whitepaper with First San Francisco Partners, Information Management FAQ for UDI: 20 Questions & Answers about Complying with the FDA Requirement for Unique Device Identification (UDI). Get answers to questions such as:

What is needed to support an EIM strategy for UDI compliance?
What role does data governance play in UDI compliance?
What are the components of a successful data governance program?
Why should I centralize my business-critical medical device data?
What does the architecture of a UDI compliance solution look like?

I invite you to download the UDI compliance FAQ now and share your feedback in the comments section below.

FacebookTwitterLinkedInEmailPrintShare
Posted in Data Governance, Data Integration, Data Quality, Enterprise Data Management, Life Sciences, Manufacturing, Master Data Management, Vertical | Tagged , , , , , , , , , , , , , , | Leave a comment

Building a Data Foundation for Execution

Building a Data Foundation for Execution

Building a Data Foundation

I have been re-reading Enterprise Architecture as Strategy from the MIT Center for Information Systems Research (CISR).*  One concept that they talk about that jumped out at me was the idea of a “Foundation for Execution.”  Everybody is working to drive new business initiatives, to digitize their businesses, and to thrive in an era of increased technology disruption and competition.  The ideas around a Foundation for Execution in the book are a highly practical and useful framework to deal with these problems.

This got me thinking: What is the biggest bottleneck in the delivery of business value today?  I know I look at things from a data perspective, but data is the biggest bottleneck.  Consider this prediction from Gartner:

“Gartner predicts organizations will spend one-third more on app integration in 2016 than they did in 2013. What’s more, by 2018, more than half the cost of implementing new large systems will be spent on integration. “

When we talk about application integration, we’re talking about moving data, synchronizing data, cleansing, data, transforming data, testing data.  The question for architects and senior management is this: Do you have the Data Foundation for Execution you need to drive the business results you require to compete?  The answer, unfortunately, for most companies is; No.

All too often data management is an add-on to larger application-based projects.  The result is unconnected and non-interoperable islands of data across the organization.  That simply is not going to work in the coming competitive environment.  Here are a couple of quick examples:

  • Many companies are looking to compete on their use of analytics.  That requires collecting, managing, and analyzing data from multiple internal and external sources.
  • Many companies are focusing on a better customer experience to drive their business. This again requires data from many internal sources, plus social, mobile and location-based data to be effective.

When I talk to architects about the business risks of not having a shared data architecture, and common tools and practices for enterprise data management, they “get” the problem.  So why aren’t they addressing it?  The issue is that they find that they are only funded to do the project they are working on and are dealing with very demanding timeframe requirements.  They have no funding or mandate to solve the larger enterprise data management problem, which is getting more complex and brittle with each new un-connected project or initiative that is added to the pile.

Studies such as “The Data Directive” by The Economist show that organizations that actively manage their data are more successful. But, if that is the desired future state, how do you get there?

Changing an organization to look at data as the fuel that drives strategy takes hard work and leadership. It also takes a strong enterprise data architecture vision and strategy.  For fresh thinking on the subject of building a data foundation for execution, see “Think Data-First to Drive Business Value” from Informatica.

* By the way, Informatica is proud to announce that we are now a sponsor of the MIT Center for Information Systems Research.

FacebookTwitterLinkedInEmailPrintShare
Posted in Architects, Business Impact / Benefits, CIO, Data Governance, Data Integration, Data Synchronization, Enterprise Data Management | Tagged , , , , , , | Leave a comment

The CFO Viewpoint upon Data

According to the Financial Executives Institute, CFOs say their second highest priority this year is to harness business intelligence and big data. Their highest priority is to improve cash flow and working capital efficiency and effectiveness. This means CFOs highest two priorities are centered around data. At roughly the same time, KPMG has found in their survey of CFOs that 91% want to improve the quality of their financial and performance insight obtained from the data that they produce. Even more amazing 51% of CFO admitted that “collecting, storing, and retrieving financial and performance data at their company is primarily accomplished through a manual and/or spreadsheet-based exercise”. From our interviews of CFOs, we believe this number is much higher.

automationDigitization increasing but automation is limited for the finance department

Your question at this point—if you are not a CFO—should be how can this be the case? After all strategy consultants like Booz and Company, actively measure the degree of digitization and automation taking place in businesses by industry and these numbers year after year have shown a strong upward bias. How can the finance organization be digitized for data collection but still largely manual in its processes for putting together the figures that management and the market needs?
 
 
TrustCFOs do not trust their data

In our interviews of CFOs, one CFO answered this question bluntly by saying “If the systems suck, then you cannot trust the numbers when you get them.” And this reality truly limits CFOs in how they respond to their top priorities. Things like management of the P&L, Expense Management, Compliance, and Regulatory all are impacted by the CFOs data problem. Instead of doing a better job at these issues, CFOs and their teams remain largely focused on “getting the numbers right”. And even worse, the answering of business questions like how much revenue is this customer providing or how profitable this customer is, involves manual pulls of data today from more than one system. And yes, similar data issues exist in financial services organizations which close the books nightly.

ProblemCFOs share openly their data problem

The CFOs, that I have talked to, admit without hesitation that data is a big issue for them. These CFOs say that they worry about data from the source and the ability to do meaningful financial or managerial analysis. They say they need to rely on data in order to report but as important they need it to help drive synergies across businesses. This matters because CFOs say they want to move from being just “bean counters” to being participants in the strategy of their enterprises.

To succeed, CFOs say that they need timely, accurate data. However, they are the first to discuss how disparate systems get in their way. CFOs believe that making their lives easier starts with the systems that support them. What they believe is needed is real integration and consolidation of data. One CFO said what is needed this way, “we need the integration of the right systems to provide the right information so we can manage and make decisions at the right time”. CFOs clearly want to know that the accounting systems are working and reliable. At the same time, CFOs want, for example, a holistic view of customer. When asked why this isn’t a marketing activity, they say this is business issue that CFOs need to help manage. “We want to understand the customer across business units.  It is a finance objective because finance is responsible for business metrics and there are gaps in business metrics around customer. How much cross sell opportunities is the business as a whole pursuing?”

Chief Profitability Officers?                                               

Jonathan Brynes at the MIT Sloan School confirms this viewpoint is becoming a larger trend when he suggests that CFOs need to take on the function of “Chief Profitability Officers”. With this hat, CFOs, in his view, need to determine which product lines, customers, segments, and channels are the most and the least profitable. Once again, this requires that CFOs tackle their data problem to have relevant, holistic information.

CIOs remain responsible for data delivery

Data DeliveryCFOs believe that CIOs remain responsible for how data is delivered. CFOs, say that they need to lead in creating validated data and reports. Clearly, if data delivery remains a manual process, then the CFO will be severely limited in their ability to adequately support their new and strategic charter. Yet CFOs when asked if they see data as a competitive advantage say that “every CFO would view data done well as a competitive advantage”. Some CFOs even suggest that data is the last competitive advantage. This fits really well with the view of Davenport in “Competing on Analytics”. The question is how soon will CIOs and CFOs work together to get the finance organization out of its mess of manually massaging and consolidating financial and business data.

Related links

Solution Brief: The Intelligent Data Platform

Related Blogs

How CFOs can change the conversation with their CIO?

New type of CFO represents a potent CIO ally

Competing on Analytics

The Business Case for Better Data Connectivity

Is Big Data Destined To Become Small And Vertical?

What is big data and why should your business care?

Twitter: @MylesSuer

 

 

FacebookTwitterLinkedInEmailPrintShare
Posted in Business/IT Collaboration, Enterprise Data Management, Governance, Risk and Compliance | Tagged , , , , , | 2 Comments