Tag Archives: data security
Informatica announced Secure@Source last week, unveiling the industry’s first data security intelligence offering. According to The Ponemon Institute, ‘Not Knowing Where Sensitive and Confidential Data Reside’ has been the number one thing keeping security professionals up at night for two years in a row, so the timing seems right for a capability such as Data Security Intelligence that gives line of sight to obscured threats.
Neuralytix conducted market research, entitled The Future State of Data Security Intelligence, where they define data security intelligence (DSI) as a framework for understanding the risk of sensitive or confidential data and recommending the optimal set of controls to mitigate that risk. DSI is comprised of technology that provides the definition, classification, discovery, and assessment phases of a data-centric security approach. They state:
By deploying data security intelligence in combination with data security controls, enterprises can gain active insight into where risks exist and proactively set controls to mitigate the impact in the event of a data breach.
The Enterprise Strategy Group further commented in the report, “Data‐centric Security: A New Information Security Perimeter”, authored by industry expert, Jon Oltsik:
To address modern threats and IT mobility, CISOs must adopt two new security perimeters around identity attributes and data-centric security. In this regard, sensitive data must be continuously monitored for situational awareness and risk management.
This launch precedes the security industry’s equivalent of the NFL’s Super Bowl – RSA Conference, where the world talks security. Informatica will be there, debuting its first Data Security Intelligence offering, Secure@Source. The team should be so proud – this is by far one of the coolest products I have had the opportunity to be a part of. Here is a brief blurb on what Secure@Source is and does:
Secure@Source discovers, analyzes and visualizes data relationships, proliferation and sensitivity that details data risks and vulnerabilities to focus data protection and monitoring to secure data from external breaches and insider abuse. Secure@Source leverages proven data integration and quality capabilities to provide integrated views of data, independent of platform, from legacy, cloud, big data and mobile environments.
Secure@Source provides granular detail on what data has value, where the data resides, how it traverses the enterprise and how it should be protected. Informatica leverages market leading technology for data discovery and profiling, protection and retirement, and innovative analysis and visualizations for monitoring data security in real-time.
At a conference where the world talks security, I’m looking forward to engaging in conversations with you about getting smarter about Data Security Intelligence and eliminating blind spots. See you at the venue from April 20-24, South Hall, Booth No. 2626.
Original article is posted at techcrunch.com
It’s probably no surprise to the security professional community that once again, identity theft is among the IRS’s Dirty Dozen tax scams. Criminals use stolen Social Security numbers and other personally identifiable information to file tax claims illegally, deposit the tax refunds to rechargeable debit cards, and vanish before the average citizen gets around to filing.
The IRS publishes its “Dirty Dozen” list to alert filers to the worst tax scams, and identity theft has topped that list continually since 2011. In 2012, the IRS implemented a preventive measure to catch fraud prior to actually issuing refunds, and issued more than 2,400 enforcement actions against identity thieves. With an aggressive campaign to fight identity theft, the IRS saved over $1.4 billion in 2011 and over $63 billion since October 2014.
That’s great progress. But consider that of the 117 million taxpayers who filed electronically in 2014, 80 million received an average of $2,851 by direct deposit into their bank accounts, which amounts to more than $229 billion changing hands electronically. The pessimist in me has to believe that cyber criminals are already plotting how to nab more Social Security numbers and e-filing logins to tap into that big pot of gold.
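As a quick sanity check on the scale involved, the two rounded figures quoted above can simply be multiplied. This is only a back-of-the-envelope sketch: with these rounded inputs the product comes to about $228 billion, in the same range as the total cited, and the small gap presumably reflects rounding in the published numbers.

```python
# Figures as quoted in the post (both are rounded in the original).
direct_deposit_filers = 80_000_000   # e-filers who received a direct deposit
average_refund_usd = 2_851           # average refund per filer, in dollars

# Total value of refunds moved electronically, using the rounded inputs.
total_usd = direct_deposit_filers * average_refund_usd
print(f"Approximate total refunded electronically: ${total_usd / 1e9:.1f} billion")
```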
So where are criminals getting the data to begin with? Any organization that has employees and a human resources department collects and possibly stores Social Security numbers, birthdays, addresses and income either on-premises or in a cloud HR application. This information is everything a criminal would need to fraudulently file taxes. Any time a common business process is digitally transformed, or moved to the cloud, the potential risk of exposure increases.
As the healthcare industry transforms to electronic health records and patient records, another abundant source of Social Security numbers and personally identifiable information increases the surface area of opportunity. When you look at the abundance of Social Security numbers stolen in major data breaches, such as the case with Anthem, you start to connect the dots.
One of my favorite dynamic infographics comes from the website Information is Beautiful, entitled ‘World’s Biggest Data Breaches.’ When you filter the data based on number of records versus sensitivity, the size of the bubbles indicates the severity. Even though the sensitivity score appears to be somewhat arbitrary, it does provide one way to assess the severity based on the type of information that was breached:
|Type of information breached|Sensitivity score|
|---|---|
|Just email address/online information|1|
|Credit card information|300|
|Email password/health records|4000|
|Full bank account details|50000|
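To make the scoring concrete, here is a minimal sketch that ranks breaches by the sensitivity scores listed above. The breach names are invented purely for illustration, and the ranking function is an assumption about how one might use the scores, not the infographic’s actual method.

```python
# Sensitivity scores as listed in the table above.
SENSITIVITY = {
    "email/online information": 1,
    "credit card information": 300,
    "email password/health records": 4000,
    "full bank account details": 50000,
}

# Hypothetical breaches, invented for illustration only.
breaches = [
    ("Retailer A", "credit card information"),
    ("Forum B", "email/online information"),
    ("Insurer C", "email password/health records"),
]

def rank_by_sensitivity(items):
    """Return breaches sorted from most to least sensitive data type."""
    return sorted(items, key=lambda b: SENSITIVITY[b[1]], reverse=True)

for name, data_type in rank_by_sensitivity(breaches):
    print(f"{name}: {data_type} (score {SENSITIVITY[data_type]})")
```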
What would be an interesting addition is how many records were sold on the black market that resulted in tax or insurance fraud.
Cyber-security expert Brian Krebs, who was personally impacted by a criminal tax return filing last year, says we will likely see “more phony tax refund claims than last year.” With credentials for TurboTax and H&R Block marketed on black market websites for about 4 cents per identity, it is hard to disagree.
The Ponemon Institute published a survey last year entitled The State of Data Centric Security. One finding that stands out: when security professionals were asked what keeps them up at night, more than 50 percent said “not knowing where sensitive and confidential data reside.” As we enter full swing into tax season, what should security professionals be thinking about?
Data Security Intelligence promises to be the next big thing that provides a more automated and data-centric view into sensitive data discovery, classification and risk assessment. If you don’t know where the data is or its risk, how can you protect it? Maybe with a little more insight, we can at least reduce the surface area of exposed sensitive data.
What does it take to be an effective Chief Information Security Officer (CISO) in today’s era of massive data breaches? Besides skin as thick as armor and proven experience in security, an effective CISO needs the following qualities:
- A strong grasp of their security program’s capabilities and of their adversaries
- The business acumen to frame security challenges as business opportunities
- An ability to effectively partner and communicate with stakeholders outside of the IT department
- An insatiable appetite to make data-driven decisions and to take smart risks
In order to be successful, a CISO needs data-driven insights. The business needs this too. Informatica recently launched the industry’s first Data Security Intelligence solution, Secure@Source. At the launch event, we shared how CISOs can leverage new insights, gathered and presented by Secure@Source. These insights better equip their security and compliance teams to defend against misconfigurations, cyber-attacks and malicious insider threats.
Data-driven organizations are more profitable, more efficient, and more competitive. An effective CISO ensures the business has the data it needs without introducing undue risk. In my RSA Conference Security Leadership Development session I will share several other characteristics of effective CISOs.
Despite best efforts at threat modeling and security automation, security controls will never be perfect. Modern businesses require data agility, as attack surface areas and risks change quickly. As business users spread data beyond the firewall, ensuring that sensitive and confidential data is safe from exposure or a breach becomes an enormous task.
Data at rest isn’t valuable if the business can’t use it in a timely manner. Encrypted data may be safe from theft, but needs to be decrypted at some point to be useful for those using the data for predictive analytics. Data’s relative risk of breach goes up as the number of connections, applications, and accounts that have access to the data also increases.
If you have two databases, each with the same millions of sensitive records in them, the system with more applications linked to it and privileged administrative accounts managing it is the one you should be focusing your security investments on. But you need a way to measure and manage your risk with accurate, timely intel.
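The comparison above can be sketched as a toy scoring function. This is an illustration only: the field names, sample values and the simple multiplication are assumptions, not how Secure@Source or any other product actually calculates risk.

```python
from dataclasses import dataclass

@dataclass
class DataStore:
    name: str
    sensitive_records: int
    linked_applications: int
    privileged_accounts: int

def exposure_score(store: DataStore) -> int:
    """Toy score: sensitive records scaled by the number of access paths."""
    access_paths = store.linked_applications + store.privileged_accounts
    return store.sensitive_records * access_paths

# Two hypothetical databases holding the same number of sensitive records.
stores = [
    DataStore("crm_db", 5_000_000, linked_applications=12, privileged_accounts=8),
    DataStore("archive_db", 5_000_000, linked_applications=2, privileged_accounts=1),
]

# The store with more applications and admin accounts ranks as riskier.
riskiest = max(stores, key=exposure_score)
print(riskiest.name)
```

A real assessment would weigh many more factors (data type, location, existing controls), but even this crude score shows why two stores with identical contents do not deserve identical security investment.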
As Informatica’s CISO, my responsibility is to ensure that our brand is protected, that our customers, stakeholders, and employees trust Informatica — that we are trustworthy custodians of our customers’ most important data assets.
In order to do that, I need to have conviction about where our sensitive assets are, what threats and risks are relevant to them, and have a plan to keep them compliant and safe no matter where the data travels.
Modern security guidance such as the SANS Critical Security Controls and the NIST Cybersecurity Framework starts with “know your assets”: building an inventory and identifying what’s most critical to your business. Next, they advise you to form a strategy to monitor, protect, and re-assess relevant risks as the business evolves. In the age of Agile development and security automation, continuous monitoring is replacing batch-mode assessments. Businesses move too fast to measure risk annually or once a quarter.
As Informatica has shifted to a cloud-first enterprise, and as our marketing organization makes data-driven decisions for their customer experience initiatives, my teams ensure we are making data available to those who need it while adhering to international data privacy laws. This task has become more challenging as the volume of data increases, is shared between targets, and as requirements become more stringent. Informatica’s Data Security Intelligence solution, Secure@Source, was designed to help manage these activities while making it easier to collaborate with other stakeholders.
The role of the CISO has transformed over time into that of a trusted advisor to the business, which relies on the CISO’s guidance to help it take smart risks. The CISO provides a lens in business discussions that focuses on technical threats, regulatory constraints, and business risks while ensuring that the business earns and maintains trust with customers. In order to be an effective CISO, it all comes down to the data.
Security professionals are in dire need of a solution that provides visibility into where sensitive and confidential data resides, as well as visibility into the data’s risk. This knowledge would allow those responsible to take an effective, proactive approach to combating cybercrime. By focusing on the data, Informatica and our customers, partners and market ecosystem are collaborating to make data-centric security with Data Security Intelligence the next line of defense.
Security technologies that focus on securing the network and perimeter require additional safeguards when sensitive and confidential data traverse beyond these protective controls. Data proliferates to cloud-based applications and mobile devices. Application security and identity access management tools may lack visibility and granular control when data is replicated to Big Data and advanced analytics platforms.
Informatica is filling this need with its data-centric security portfolio, which now includes Secure@Source. Informatica Secure@Source is the industry’s first data security intelligence solution that delivers insight into where sensitive and confidential data reside, as well as the data’s risk profile.
Join us at our online launch event on April 8th where we will showcase Secure@Source and share reactions from an amazing panel including:
- Security Industry leader Anil Chakravarthy, CPO and EVP Informatica and myself, Amit Walia, GM and SVP Informatica
- Luminaries Larry Ponemon, Founder Ponemon Institute and Jeff Northrop, CTO IAPP
- CISOs Bill Burns, Informatica and Arnold Federbaum, former CISO and cybersecurity professor, NYU
- Enterprise Security Architect, Linda Hewlett, Santander Holdings USA.
The opportunity for Data Security Intelligence is extensive. In a recently published report, Neuralytix defined Data-Centric Security as “an approach to security that focuses on the data itself; to cover the gaps of traditional network, host and application security solutions.” A critical element for successful data security is collecting intelligence required to prioritize where to focus security controls and efforts that mitigate risk. This is precisely what Informatica Secure@Source was designed to achieve.
Having emerged from a predominantly manual practice, the data security intelligence software market is expected to reach $800M by 2018 with a CAGR of 27.8%. We are excited about this opportunity! As a leader in data management software, we are uniquely qualified to take an active role in shaping this emerging market category.
Informatica Secure@Source addresses the need to get smarter about where our sensitive and private data reside and who is accessing it, to prioritize which controls to implement, and to work harmoniously with existing security architectures, policies and procedures. Our customers are asking us for data security intelligence, and the industry deserves it. With more than 60% of security professionals stating their biggest challenge is not knowing where their sensitive and confidential data reside, the need for Data Security Intelligence has never been greater.
Neuralytix says “data security is about protecting individual data objects that traverse across networks, in and out of a public or private cloud, from source applications to targets such as partner systems, to back office SaaS applications to data warehouses and analytics platforms”. We couldn’t agree more. We believe that the best way to incorporate a data-centric security approach is to begin with data security intelligence.
JOIN US at the online launch event on April 8th for the security industry’s most exciting new Data Security Intelligence solution, Informatica Secure@Source.
 “The State of Data Centric Security,” Ponemon Institute, sponsored by Informatica, June 2014
With the European Medicines Agency (EMA) date for compliance to IDMP (Identification of Medicinal Products) looming, Q1 2015 has seen a significant increase in IDMP activity. Both Informatica & HighPoint Solution’s IDMP Round Table in January, and a February Marcus Evans conference in Berlin provided excellent forums for sharing progress, thoughts and strategies. Additional confidential conversations with pharmaceutical companies show an increase in the number of approved and active projects, although some are still seeking full funding. The following paragraphs sum up the activity and trends that I have witnessed in the first three months of the year.
I’ll start with my favourite quote, which is from Dr. Jörg Stüben of Boehringer Ingelheim, who asked:
“Isn’t part of compliance being in control of your data?”
I like it because to me it strikes just the right balance between stating the obvious and questioning the way the majority of pharmaceutical companies approach compliance: as a report that has to be created and submitted. If a company is in control of its data, regulatory compliance would be easier and come at a lower cost. More importantly, the company itself would benefit from easy access to high quality data.
Dr. Stüben’s question was raised during his excellent presentation at the Marcus Evans conference. Not only did he question the status quo, he also proposed an alternative route to IDMP compliance: let Boehringer benefit from its investment in IDMP compliance. His approach can be summarised as follows:
- Embrace a holistic approach to being in control of data, i.e. adopt data governance practices.
- This is not about just compliance. Include optional attributes that will deliver value to the organisation if correctly managed.
- Get started by creating simple, clear work packages.
Although Dr Stüben did not outline his technical solution, it would include data quality tools and a product data hub.
At the same conference, Stefan Fischer Rivera & Stefan Brügger of Bayer and Guido Claes from Janssen Pharmaceuticals both came out strongly in favour of using a Master Data Management (MDM) approach to achieving compliance. Both companies have MDM technology and processes within their organisations, and realise the value an MDM approach can bring to achieving compliance in terms of data management and governance. Hearing Mr Claes express how well Informatica’s MDM and Data Quality solutions support his existing substance data management program made his presentation even more enjoyable to me.
Whilst the exact approaches of Bayer and Janssen differed, there were some common themes:
- Consider both the short term (compliance) and the long term (data governance) in the strategy
- Centralised MDM is ideal, but a federated approach is practical for July 2016
- High quality data should be available to a wide audience outside of IDMP compliance
The first and third bullet points map very closely to Dr. Stüben’s key points, and in fact show a clear trend in 2015:
IDMP Compliance is an opportunity to invest in your data management solutions and processes for the benefit of the entire organisation.
Although the EMA was not represented at the conference, Andrew Marr presented their approach to IDMP, and master data in general. The EMA is undergoing a system re-organisation to focus on managing Substance, Product, Organisation and Reference data centrally, rather than within each regulation or program as it is today. MDM will play a key role in managing this data, setting a high standard of data control and management for regulatory purposes. It appears that the EMA is also using IDMP to introduce better data management practice.
Depending on the size of the company, and the skills & tools available, other non-MDM approaches have been presented or discussed during the first part of 2015. These include using XML and SharePoint to manage product data. However, I share a primary concern with others in the industry about this approach: how well can you manage and control change using these tools? Some pharmaceutical companies have openly stated that data contributors often spend more time looking for data than doing their own jobs. An XML/SharePoint approach will do little to ease this burden, but an MDM approach will.
Despite the other approaches and solutions being discussed, there is another clear trend in Q1 2015:
MDM is becoming a favoured approach for IDMP compliance due to its strong governance, centralised attribute-level data management and ability to track changes.
Interestingly, the opportunity to invest in data management, and the rise of MDM as a favoured approach, have been backed up by research from Gens Associates. Messrs Gens and Brolund found a rapid increase in investment during 2014 in what they term Information Architecture, in which MDM plays a key role. IDMP is seen as a major driver for this investment. They go on to state that investment in master data management programs will allow a much easier and more cost-effective approach to data exchange (internally and externally), resulting in substantial benefits. Unfortunately they do not elaborate on these benefits, but I have placed a summary of the benefits of using MDM for IDMP compliance here.
In terms of active projects, the common compliance activities I have seen in the first quarter of 2015 are as follows:
- Most companies are in the discovery phase: identifying the effort for compliance
- Some are starting to make technology choices, and have submitted RFPs/RFQs
- Those furthest along in technology already have MDM programs or initiatives underway
- Despite getting a start, some are still lacking enough funding for achieving compliance
- Output from the discovery phase will in some cases be used to request full funding
- A significant number of projects have a goal to implement better data management practice throughout the company. IDMP will be the first release.
A final trend I have noticed in 2015 is regarding the magnitude of the compliance task ahead:
Those who have made the most progress are those who are most concerned about achieving compliance on time.
The implication is that the companies who are starting late do not yet realise the magnitude of the task ahead. It is not yet too late to comply and achieve long term benefits through better data management, despite there being only 15 months before the initial EMA deadline. Informatica has customers who have implemented MDM within 6 months. 15 months is achievable provided the project (or program) gets the focus and resources required.
IDMP compliance is a common challenge to all those in the pharmaceutical industry. Learning from others will help avoid common mistakes and provide tips on important topics. For example, how to secure funding and support from senior management is a common concern among those tasked with compliance. In order to encourage learning and networking, Informatica and HighPoint Solutions will be hosting our third IDMP roundtable in London on May 13th. Please do join us to share your experiences, and learn from the experiences of others.
Data Governance, the art of being Regulation Ready, is about a lot of things, but one thing is clear: it’s NOT just about the technology. You ever been in one of those meetings, probably more than a few, where committees and virtual teams discuss the latest corporate initiatives? You know, those meetings where you want to dip your face in lava and run into the ocean? Because at the end of the meeting, everyone goes back to their day jobs and nothing changes.
Now comes a new law or regulation from the governing body du jour. There are common threads to each and every regulation related to data. Laws like HIPAA even had entire sections dedicated to the types of filing cabinets required in the office to protect healthcare data. And the same is true of regulations like BCBS 239, CCAR reporting and Solvency II. The laws ask: what are you reporting, how did you get that data, where has it been, what does this data mean and who has touched it? Virtually all of the regulations dealing with data have those elements.
So it behooves an organization to be Regulation Ready. This means those committees and virtual teams need to be driving cultural and process change. It’s not just about the technology; it’s as much about people and processes. Every role in the organization, from the developer to the business executive, should embed the concepts of data governance in their daily work. From the time a developer or architect builds a new system, they need to document and define everything and every piece of data. It reminds me of days writing code and remembering to comment each code block. And the business executive likewise is sharing business rules and definitions from the top so they can be integrated into the systems that eventually have to report on them.
Finally, the processes that support a data governance program are augmented by the technology. It may seem sufficient to document systems in spreadsheets and documents, but those become more and more error prone and, in the end, are not reliable in an audit.
Informatica is the market leader in data management infrastructure to be Regulation Ready. This means everything, from data movement and quality to definitions and security. Because at the end of the day, once you have the people culturally integrated, and the processes supporting the data workload, a centralized, high performance and feature rich technology needs to be in place to complete the trifecta. Informatica is pleased to offer the industry this leading technology as part of a comprehensive data governance foundation.
Informatica will be sharing this vision at the upcoming Annual FIMA 2015 Conference in Boston from March 30 to April 1. Come and visit Informatica at FIMA 2015 in Booth #3.
The problem many banks encounter today is that they have vast sums of investment tied up in old ways of doing things. Historically, customers chose a bank and remained ’loyal’ throughout their lifetime…now competition is rife and loyalty is becoming a thing of the past. In order to stay ahead of the competition, and to gain and keep customers, banks need to understand the ever-evolving market, disrupt norms and continue to delight customers. The tradition of staying with one bank due to family convention or for ease has now been replaced with a more informed customer who understands the variety of choice at their fingertips.
Challenger Banks don’t build on ideas of tradition and legacy and see how they can make adjustments to them. They embrace change. Longer-established banks can’t afford to do nothing, and assume their size and stature will attract customers.
Here’s some useful information
Accenture’s recent report, The Bank of Things, succinctly explains what ‘Customer 3.0’ is all about. The connected customer isn’t necessarily younger. It’s everybody. Banks can get to know their customers better by making better use of information. It all depends on using intelligent data rather than all data. Interrogating the wrong data can be time-consuming, costly and results in little actionable information.
When an organisation sets out with the intention of knowing its customers, then it can calibrate its data according to where the gold nuggets – the real business insights – come from. What do people do most? Where do they go most? Now that they’re using branches and phone banking less and less – what do they look for in a mobile app?
Customer 3.0 wants to know what the bank can offer them all the time, on the move, on their own device. They want offers designed for their lifestyle. Correctly deciphered data can drive the level of customer segmentation that empowers such marketing initiatives. This means an organisation has to have the ability and the agility to move with its customers. It’s a journey that never ends – technology will never have a cut-off point, just as customer expectations will never stop evolving.
It’s time for banks to re-shape banking
Informatica have been working with major retail banks globally to redefine banking excellence and realign operations to deliver it. We always start by asking our customers the revealing question “Have you looked at the art of the possible to future-proof your business over the next five to ten years and beyond?” This is where the discussion begins to explore really interesting notions about unlocking potential. No bank can afford to ignore them.
Original article can be found here, scmagazine.com
On Jan. 13 the White House announced President Barack Obama’s proposal for new data privacy legislation, the Personal Data Notification and Protection Act. Many states have laws today that require corporations and government agencies to notify consumers in the event of a breach – but it is not enough. This new proposal aims to improve cybersecurity standards nationwide with the following tactics:
Enable cyber-security information sharing between private and public sectors.
Government agencies and corporations with a vested interest in protecting our information assets need a streamlined way to communicate and share threat information. This component of the proposed legislation incents organizations that participate in knowledge-sharing with targeted liability protection, as long as they are responsible for how they share, manage and retain privacy data.
Modernize the tools law enforcement has to combat cybercrime.
Existing laws, such as the Computer Fraud and Abuse Act, need to be updated to incorporate the latest cyber-crime classifications while giving prosecutors the ability to target insiders with privileged access to sensitive and privacy data. The proposal also specifically calls out pursuing prosecution when selling privacy data nationally and internationally.
Standardize breach notification policies nationwide.
Many states have some sort of policy that requires notification of customers that their data has been compromised. Three leading examples include California, Florida’s Information Protection Act (FIPA) and the Massachusetts Standards for the Protection of Personal Information of Residents of the Commonwealth. New Mexico, Alabama and South Dakota have no data breach protection legislation. Enforcing standardization and simplifying the requirement for companies to notify customers and employees when a breach occurs will ensure consistent protection no matter where you live or transact.
Invest in increasing cyber-security skill sets.
For a number of years, security professionals have reported an ever-increasing skills gap in the cybersecurity profession. In fact, in a recent Ponemon Institute report, 57 percent of respondents said a data breach incident could have been avoided if the organization had more skilled personnel with data security responsibilities. Increasingly, colleges and universities are adding cybersecurity curriculum and degrees to meet the demand. In support of this need, the proposed legislation mentions that the Department of Energy will provide $25 million in educational grants to Historically Black Colleges and Universities (HBCU) and two national labs to support a cybersecurity education consortium.
This proposal is clearly comprehensive, but it also raises the critical question: How can organizations prepare themselves for this privacy legislation?
The International Association of Privacy Professionals conducted a study of Federal Trade Commission (FTC) enforcement actions. From the report, organizations can infer best practices implied by FTC enforcement and ensure these are covered by their organization’s security architecture, policies and practices:
- Perform assessments to identify reasonably foreseeable risks to the security, integrity, and confidentiality of personal information collected and stored on the network, online or in paper files.
- Limited access policies curb unnecessary security risks and minimize the number and type of network access points that an information security team must monitor for potential violations.
- Limit employee access to (and copying of) personal information, based on employee’s role.
- Implement and monitor compliance with policies and procedures for rendering information unreadable or otherwise secure in the course of disposal. Securely disposed information must not practicably be read or reconstructed.
- Restrict third party access to personal information based on business need, for example, by restricting access based on IP address, granting temporary access privileges, or similar procedures.
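The last practice in the list, restricting third-party access by IP address and granting only temporary privileges, can be sketched in a few lines. This is a minimal illustration under stated assumptions: the network range is a reserved documentation range standing in for a partner network, and the function name, dates and eight-hour window are hypothetical.

```python
import ipaddress
from datetime import datetime, timedelta, timezone

# Hypothetical partner network (203.0.113.0/24 is a reserved documentation range).
ALLOWED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]

def is_allowed(ip: str, expires_at: datetime, now: datetime) -> bool:
    """Allow access only from an approved network and before the grant expires."""
    addr = ipaddress.ip_address(ip)
    in_network = any(addr in net for net in ALLOWED_NETWORKS)
    return in_network and now < expires_at

now = datetime(2015, 1, 28, 12, 0, tzinfo=timezone.utc)
grant_expiry = now + timedelta(hours=8)  # temporary access privilege

print(is_allowed("203.0.113.42", grant_expiry, now))                       # approved network, grant valid
print(is_allowed("198.51.100.7", grant_expiry, now))                       # outside the approved network
print(is_allowed("203.0.113.42", grant_expiry, now + timedelta(hours=9)))  # grant has expired
```

In practice such checks usually live in a firewall, VPN or identity provider rather than in application code; the sketch only shows the logic the guidance implies.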
The Personal Data Notification and Protection Act fills a void at the national level; most states have privacy laws, with California pioneering the movement with SB 1386. However, enforcement at the state AG level has been uneven at best and absent at worst.
In preparing for this national legislation, organizations need to heed the policies derived from the FTC’s enforcement practices. They can also track the progress of this legislation and look for agencies such as the National Institute of Standards and Technology to issue guidance. Furthermore, organizations can encourage employees to take advantage of cybersecurity internship programs at nearby colleges and universities to avoid critical skills shortages.
With online security a clear priority for President Obama’s administration, it’s essential for organizations and consumers to understand upcoming legislation and learn the benefits and risks of sharing data. We’re looking forward to celebrating the safeguarding of data and the enabling of trust on Data Privacy Day, held annually on January 28, and hope that these tips will make 2015 your safest year yet.
Informatica users leveraging HDP are now able to see a complete end-to-end visual data lineage map of everything done through the Informatica platform. In this blog post, Scott Hedrick, director of Big Data Partnerships at Informatica, tells us more about end-to-end visual data lineage.
Hadoop adoption continues to accelerate within mainstream enterprise IT and, as always, organizations need the ability to govern their end-to-end data pipelines for compliance and visibility purposes. Working with Hortonworks, Informatica has extended the metadata management capabilities in Informatica Big Data Governance Edition to include data lineage visibility of data movement, transformation and cleansing beyond traditional systems to cover Apache Hadoop.
Informatica users are now able to see a complete end-to-end visual data lineage map of everything done through Informatica. This includes sources outside Hortonworks Data Platform (HDP) being loaded into HDP; all data integration, parsing and data quality transformations running on Hortonworks; and the loading of curated data sets onto data warehouses, analytics tools and operational systems outside Hadoop.
Regulated industries such as banking, insurance and healthcare are required to have detailed histories of data management for audit purposes. Without tools to provide data lineage, compliance with regulations and gathering the required information for audits can prove challenging.
With Informatica, data scientists and analysts can now visualize data lineage and a detailed history of data transformations, providing unprecedented transparency into their data analysis. They can be more confident in their findings based on this visibility into the origins and quality of the data they are working with to create valuable insights for their organizations. Web-based access to visual data lineage for analysts also facilitates team collaboration on challenging and evolving data analytics and operational system projects.
The Informatica and Hortonworks partnership brings together leading enterprise data governance tools with open source Hadoop leadership to extend governance to this new platform. Deploying Informatica for data integration, parsing, data quality and data lineage on Hortonworks reduces risk to deployment schedules.
A demo of Informatica’s end-to-end metadata management capabilities on Hadoop and beyond is available here:
- A free trial of Informatica Big Data Edition in the Hortonworks Sandbox is available here.
Data proliferation has traditionally been measured by the number of copies of data residing on different media. For example, if data residing on an enterprise storage device was backed up to tape, proliferation was measured by the number of tapes on which the same piece of data resided. Now that backups are no longer restricted to the data center and data is no longer constrained by the originating application, this definition is due for an update.
I propose that data proliferation should instead be measured by the number of users who have access to or can view the data, and that data proliferation is a primary factor in measuring the risk of a data breach. My argument here is that as sensitive, confidential or private data proliferates beyond the original copy, its surface area increases and its risk of a data breach increases proportionally.
Using the original definition of data proliferation and an example of data storage shown below, data proliferation would include production, production copies used for disaster recovery purposes and all physical backup copies. But as you can see, data is also copied to test environments for development purposes. When factoring in the number of privileged users with access to those copies, you have a different view of proliferation and potential risk.
In the example, there are potentially thousands of copies of sensitive data but only a small number of users who are authorized to access the data.
In the case of test and development, this image highlights a potentially high area of risk because the number of users who could see the sensitive data is high.
Similarly, in online advertising, the measure of how many people see an ad is its number of impressions. If an ad was seen by 100 online users, it would have 100 impressions.
When you apply that same principle to data security, you could say that data proliferation is a calculation of the number of copies of a data element multiplied by the potential number of users who could physically view the data, or in other words, ‘impressions’. In this second image below, rather than considering the total number of copies, what if we measured risk based on the total number of impressions?
In this case, the measure of risk is independent of the physical media the data reside on. You could take this a few steps further and add a factor based on security controls in place to prevent unauthorized access.
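To make the arithmetic concrete, here is a minimal Python sketch of the impressions idea: copies multiplied by potential viewers, with an optional factor for security controls. The class and field names, the 0-to-1 `control_factor` scale, and the sample figures are all assumptions made for illustration; they are not taken from the images or from any product.

```python
from dataclasses import dataclass


@dataclass
class DataStore:
    """One environment holding copies of a sensitive data element (hypothetical model)."""
    name: str
    copies: int                  # copies of the data element in this environment
    potential_viewers: int       # users who could view the data there
    control_factor: float = 1.0  # assumed multiplier: 1.0 = no controls, lower = stronger controls


def impressions(store: DataStore) -> float:
    """Copies multiplied by potential viewers, scaled by the assumed control factor."""
    return store.copies * store.potential_viewers * store.control_factor


def total_impressions(stores) -> float:
    """Sum of impressions across all environments."""
    return sum(impressions(s) for s in stores)


# Illustrative figures only: many backup copies with few viewers,
# versus a handful of test copies with many viewers.
environments = [
    DataStore("production", copies=1, potential_viewers=5),
    DataStore("backup tapes", copies=1000, potential_viewers=2),
    DataStore("test and development", copies=10, potential_viewers=200),
]

for env in environments:
    print(f"{env.name}: {impressions(env):.0f} impressions")
```

With these made-up numbers, ten test copies carry as many impressions as a thousand backup tapes, which is the point of measuring by viewers rather than by media. Setting `control_factor` below 1.0 for an environment with masking or access controls in place would lower its share accordingly.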