Category Archives: Governance, Risk and Compliance
What is In-Database Archiving in Oracle 12c and Why You Still Need a Database Archiving Solution to Complement It (Part 1)
What is the new In-Database Archiving in the latest Oracle 12c release?
On June 25, 2013, Oracle introduced a new feature called In-Database Archiving with its new release, Oracle Database 12c. In Oracle’s words: “In-Database Archiving enables you to archive rows within a table by marking them as inactive. These inactive rows are in the database and can be optimized using compression, but are not visible to an application. The data in these rows is available for compliance purposes if needed by setting a session parameter. With In-Database Archiving you can store more data for a longer period of time within a single database, without compromising application performance. Archived data can be compressed to help improve backup performance, and updates to archived data can be deferred during application upgrades to improve the performance of upgrades.”
This is an Oracle specific feature and does not apply to other databases.
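To make the mechanism concrete, here is a minimal sketch that emulates the idea in Python with SQLite rather than Oracle. In Oracle 12c the feature is enabled with the `ROW ARCHIVAL` clause on a table, rows are flagged through the hidden `ORA_ARCHIVE_STATE` column, and visibility is switched with `ALTER SESSION SET ROW ARCHIVAL VISIBILITY`. The `orders` table, the `archive_state` column and the `visible_orders` helper below are stand-ins invented for illustration, not Oracle objects.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " placed_on TEXT NOT NULL,"
    " archive_state TEXT NOT NULL DEFAULT '0')"  # stand-in for ORA_ARCHIVE_STATE
)
conn.executemany(
    "INSERT INTO orders (id, placed_on) VALUES (?, ?)",
    [(1, "2009-03-01"), (2, "2011-07-15"), (3, "2013-06-25")],
)

# Archive old rows by marking them inactive; nothing is deleted or moved.
conn.execute("UPDATE orders SET archive_state = '1' WHERE placed_on < '2012-01-01'")


def visible_orders(connection, show_archived=False):
    """Return order ids, hiding archived rows unless explicitly requested.

    show_archived plays the role of the session-level visibility switch.
    """
    sql = "SELECT id FROM orders"
    if not show_archived:
        sql += " WHERE archive_state = '0'"
    return [row[0] for row in connection.execute(sql + " ORDER BY id")]


active_ids = visible_orders(conn)                   # what the application sees
all_ids = visible_orders(conn, show_archived=True)  # what a compliance query sees
print(active_ids, all_ids)
```

The point of the sketch is that archived rows stay in the same table: the application's default view simply filters them out, and a compliance user opts in to seeing them.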
Many software vendors, analysts and journalists are overusing the term “Data Governance” in today’s complex business and IT environments. However, it has become one of the primary goals and drivers for data-related IT projects whilst at the same time being one of the most difficult to define, measure and quantify. What real meaning can we give to the concept of Data Governance? What are its importance, impact and meaning for the enterprise?
To try to restore some meaning and context to Data Governance, let’s go back to the semantics through an analogy that everyone can understand and that is insightful down to the smallest detail.
Welcome to Data Land… If data are its citizens, the governance of such a country would aim to ensure that these data co-exist peacefully, stay healthy, enrich themselves, do not live on top of each other, do not destroy each other in case of conflict and, most importantly, work together every year to improve the GDP of Data Land. This means creating value through the use and action of everyone. Of course, bioethics laws would prevent the cloning or duplication of its inhabitants… Data governance can then be defined as a framework intended to ensure the efficient management of data in the enterprise. Putting data under governance prevents its chaotic generation and use.
In Data Land, governance implies:
A territory to govern
The scope of influence of governance must be clearly defined, and the borders of the country clearly delimited.
What type of data are we talking about? The question of the perimeter is not a trivial one, and its impact on the projects and tools to be implemented is significant. Master data, so critical that it commands a particular investment for its management, forms a first consistent set and its governance leads to MDM projects.
What about transaction data or social interaction data? Increasingly popular and full of intelligence for the enterprise, they do not fit into the normal reference data bucket, but can benefit from data quality initiatives, with their own specific concerns (volume and volatility, for instance). Also, Data Land is not free from globalization. Though it is important to establish borders for security reasons, “common market” initiatives with neighboring countries (partners, data vendors, data pools) are increasing and aim at surpassing the scope of the traditional enterprise, in favor of the “exterprise”.
Any Data Land (for instance the master data one) must have a leader, a sponsor who conveys a vision and ensures the alignment of all the members of the government who, as in real life, may be tempted to handle the data they govern in an autonomous or selfish way. This executive sponsorship is an important success factor for data governance projects. Its absence, the source of the famous government hitches, leads to systematic failure. The governor is often an executive (CIO, COO, CEO) with enough power and respect to impose a choice in case of deadlock.
A government and its supervisory body
Every country needs a team that defines the details of the strategy and the laws to put in place to ensure that it functions correctly. Data Land requires nothing less. The organizational changes and the setup of dedicated enterprise-wide teams are among the most visible side effects of data governance projects. The Data Governance Council is tasked with defining the rules governing the data, i.e. the law. The Data Stewards monitor compliance with the law and, where it is not being followed, take corrective action. For the initiative to succeed, they should in theory, just like national governments, be independent of particular interests and business lobbies. They do, however, need an intimate knowledge of the data and its use in the enterprise processes. This is why they often come from “civil society”, meaning they were previously members of the business teams and now have a mission that goes beyond their previous assignment, for the greater good.
Laws and institutional processes
The first objective of this government is to establish the governance scheme: the set of rules that define best practices for creating, using, modifying and removing data. These laws are of several types. Some establish property titles (data owners), easement rights (data consumers) and security rules (data custodians). Others define the boundaries and restrictions or, more positively, the data standards. The data stewards enforce these rules and control the data. As in civil society, efficient management of the data involves orderly empowerment of the actors (prevention) as well as systematic control (repression). Enforcement of the law and its corrective aspect may be supported by processes orchestrating multiple users, according to the scheme defined by the Governance Council.
So what about IT tools? They are the infrastructures of Data Land. Vehicles, road signs, and even if it is less fun, speed cameras. They are here to facilitate the application of the governance scheme, to give tools to the government, to enforce order and the respect of the law. In any circumstances, they can help with the definition of the scheme. Data governance is an initiative taken by the enterprise for the enterprise, independently of any IT solution which will have to adapt (if sufficiently flexible).
As with any country-based government, data governance has an ambition to manage the enterprise data landscape with perfect efficiency.
Ambitious? Surely.
Critical? Definitely.
Let’s then ensure that the way to this ideal will deliver value by itself. This is what the relevance of IT tools should be judged against.
Special thanks to David Jordan for translating the original article from French to English.
If your goal is to implement a world-class Integration Competency Center (ICC) or Center of Excellence (COE), the best people you could find to make up the team already work for you. Even if you don’t currently have technical superstars on your team, you can still have a leading-edge, world-class ICC that will “wow” your internal customers every time. You don’t need a world-class team to have a world-class competency center… you need a world-class management system.
They say people are resistant to change. I disagree. People are resistant to uncertainty. Once people are certain that a change is to their benefit, they will change so fast it will make your head spin. It would be a mistake, however, to underestimate the challenges of changing an organization from one where integration is a collaboration between two project silos to one where integration is a sustainable strategy with a common infrastructure based on strict standards and shared by everyone.
Earlier this week I met with security leaders at some of the largest organizations in the San Francisco Bay Area. They highlighted disturbing trends. In addition to the increased incidence of breaches, they see increases in:
- The number of customers who want to perform security audits of their company
- The number of RFPs that require information about data security
- Litigation over data security breaches, including class action lawsuits, which is driving concerns more than regulatory fines are
So much attention has been placed on defending the perimeter that many organizations feel they are in an arms race. Part of the problem is that it’s not clear how effective the firewalls are. While firewalls may be a part of the solution, organizations are increasingly looking at how to make their applications bulletproof and centralize controls. One of the high-risk areas is systems where people have more access than they need.
For example, many organizations have created copies of production environments for test, development and training purposes. As a result, this data can be completely exposed, and the confidential aspects are at risk of being leaked intentionally or unintentionally. I spoke to a customer a couple of weeks ago who had tried to change the email addresses in their test database, but they missed a few. As a result, during a test run, they sent emails to their customers, who called back and asked what was going on. That was when we started talking to them about a masking solution that would permanently mask the data in these environments. In this way they would have the best data to test with and all sensitive details obliterated.
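As a rough illustration of what permanent masking of email addresses could look like, here is a minimal Python sketch. It is not any vendor's masking product: the field names, the salt and the `mask_email` / `mask_rows` helpers are assumptions made for the example. Each address is replaced with a deterministic value under the reserved `.invalid` top-level domain, so a test run can never reach a real inbox, and every listed email field is processed so that none is missed.

```python
import hashlib


def mask_email(email: str, salt: str = "test-env-salt") -> str:
    """Replace a real address with a deterministic, undeliverable one."""
    normalized = email.strip().lower()
    digest = hashlib.sha256((salt + normalized).encode("utf-8")).hexdigest()
    # The .invalid top-level domain is reserved, so mail cannot be delivered.
    return f"user_{digest[:12]}@example.invalid"


def mask_rows(rows, email_fields=("email", "billing_email")):
    """Return copies of the rows with every listed email field masked."""
    masked = []
    for row in rows:
        clean = dict(row)  # copy, so the input rows are left untouched
        for field in email_fields:
            if clean.get(field):
                clean[field] = mask_email(clean[field])
        masked.append(clean)
    return masked


# Hypothetical rows copied from production into a test environment.
production_copy = [
    {"id": 1, "email": "Alice@Corp.com", "billing_email": "ap@corp.com"},
    {"id": 2, "email": "alice@corp.com", "billing_email": None},
]
test_rows = mask_rows(production_copy)
```

Because the masking is deterministic, the same customer maps to the same masked address everywhere, which keeps joins and duplicate checks in the test data meaningful.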
Another high-risk area involves certain users, for example cloud administrators, who have access to all data in the clear. As a result, the administrators can see account numbers and social security numbers that they don’t need in order to do their jobs. Here, masking these values would still let them see the information they need to do their jobs, while preventing the breach of the other confidential data.
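One way to picture this kind of masking for privileged users is a view layer that hides sensitive fields depending on who is asking. The sketch below is only an assumption-laden illustration: the role names, the field names and the `view_record` helper are hypothetical and do not describe a specific product.

```python
SENSITIVE_FIELDS = {"account_number", "ssn"}  # hypothetical field names
UNMASKED_ROLES = {"compliance_officer"}       # hypothetical role names


def mask_value(value: str, keep_last: int = 4) -> str:
    """Hide all but the last few characters of a sensitive value."""
    if len(value) <= keep_last:
        return "*" * len(value)
    return "*" * (len(value) - keep_last) + value[-keep_last:]


def view_record(record: dict, role: str) -> dict:
    """Return the record as the given role is allowed to see it."""
    if role in UNMASKED_ROLES:
        return dict(record)
    return {
        key: mask_value(str(value)) if key in SENSITIVE_FIELDS else value
        for key, value in record.items()
    }


record = {"host": "db-eu-01", "account_number": "9876543210", "ssn": "123-45-6789"}
admin_view = view_record(record, "cloud_admin")
```

In this sketch the administrator still sees operational fields such as `host`, while account numbers and social security numbers are reduced to their last four characters.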
Going back to the concerns the security leaders had, how do you prove to your customers that you have data security, especially if it’s difficult to prove the effectiveness of a firewall? This is where reports on what data was masked, and what it was masked to, come in. Yes, you can pay for cyberinsurance to cover your losses when you have a breach. But wouldn’t it be better to prevent the breaches in the first place and show how you’ve done it? Try looking at the problem from the inside out.
Dodd-Frank received most of the media attention after the great financial crisis, but during that period, in March 2010, the U.S. government also signed into law the Foreign Account Tax Compliance Act (FATCA). It will require Foreign Financial Institutions (FFIs) to report, for tax reporting and withholding purposes, the names of U.S. persons and owners of companies who hold accounts at foreign banks.
The law was set to go into effect on January 1, 2013. However, on October 24, 2012, the U.S. Internal Revenue Service (IRS) announced a one-year extension to January 1, 2014 to give FFIs more time to implement procedures for meeting the FATCA reporting requirements. Banks that elect not to comply or fail to meet these deadlines will be tagged as a ‘non-participating FFI’ and subject to a 30% withholding tax on all U.S.-sourced income paid to them by a U.S. financial institution. Ouch!!
The reasons for FATCA are fairly straightforward. The IRS wants to collect its share of tax revenue from individuals who have financial accounts and assets in overseas banks. According to industry studies, it is estimated that of the seven million U.S. citizens and green card holders who live or work outside the U.S., fewer than seven percent file tax returns. Officially, the intention of FATCA is not to raise additional tax revenue but to trace its missing, non-compliant taxpayers and return them to the U.S. tax system. Once FATCA goes into effect, the IRS expects it will collect an additional $8.7 billion in tax revenue.
Satisfying FATCA reporting requirements will require banks to identify:
- Any customer who may have an existing U.S. tax status.
- Customers who hold a U.S. citizenship or green card.
- Country of birth and residency.
- U.S.-based addresses associated with accounts – incoming and outgoing payments.
- Customers who have recurring payments to the U.S. including electronic transfers and recipient banks located in the U.S.
- Customers who have payments coming from the U.S. to banks abroad.
- Customers with high balances across retail banking, wealth management, asset management, investment banking and commercial banking business lines.
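The identification criteria above can be thought of as a set of flags evaluated per customer. The following Python sketch shows one possible shape for such a check. It is not the legal FATCA test: the `Customer` fields, the flag names and the `us_indicators` function are assumptions chosen for illustration, and a real implementation would follow the IRS rules and the bank's own data model.

```python
from dataclasses import dataclass, field


@dataclass
class Customer:
    """Hypothetical, simplified customer record consolidated across systems."""
    name: str
    citizenships: set = field(default_factory=set)
    green_card: bool = False
    country_of_birth: str = ""
    address_countries: set = field(default_factory=set)
    recurring_payment_countries: set = field(default_factory=set)


def us_indicators(customer: Customer) -> list:
    """Return the reasons, if any, why a customer needs FATCA review."""
    flags = []
    if "US" in customer.citizenships or customer.green_card:
        flags.append("citizenship_or_green_card")
    if customer.country_of_birth == "US":
        flags.append("us_birth")
    if "US" in customer.address_countries:
        flags.append("us_address")
    if "US" in customer.recurring_payment_countries:
        flags.append("recurring_us_payments")
    return flags


flagged = us_indicators(
    Customer("Taro Yamada", citizenships={"JP"}, country_of_birth="US",
             recurring_payment_countries={"US"})
)
not_flagged = us_indicators(
    Customer("Anna Schmidt", citizenships={"DE"}, country_of_birth="DE",
             address_countries={"DE"})
)
```

Returning the list of reasons, rather than a bare yes or no, makes it easier to explain to a reviewer why a given account was selected.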
Although these requirements sound simple enough, there are many data challenges to overcome including:
- Access to account information from core banking systems, customer management and relationship systems, payment systems, databases and desktops across multiple lines of business which can range into the hundreds, if not thousands of individual data sources.
- Data in varying formats and structures, including unstructured documents such as scanned images, PDFs, etc.
- Data quality errors including:
  - Incomplete records: Data that is missing or unusable from the source system or file yet required for FATCA identification.
  - Non-conforming record types: Data that is available in a non-standard format that does not integrate with data from other systems.
  - Inconsistent values: Data values that give conflicting information or have different definitions with similar values.
  - Inaccuracy: Data that is incorrect or out of date.
  - Duplicates: Data records or attributes are repeated.
  - Lack of Integrity: Data that is missing or not referenced in any system.
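A few of these error types can be detected with simple profiling rules. The Python sketch below counts incomplete, non-conforming and duplicate records in a small sample. The required fields, the two-letter country code rule and the duplicate key (name plus date of birth) are assumptions made for the example, not FATCA requirements.

```python
import re
from collections import Counter

REQUIRED_FIELDS = ("customer_id", "name", "country_of_birth")  # assumed
COUNTRY_CODE = re.compile(r"[A-Z]{2}")  # assumed standard: two-letter codes


def profile(records):
    """Count simple data quality issues across a list of record dicts."""
    issues = Counter()
    seen = set()
    for rec in records:
        # Incomplete: a required field is missing or empty.
        if any(not rec.get(name) for name in REQUIRED_FIELDS):
            issues["incomplete"] += 1
        # Non-conforming: country present but not in the expected format.
        country = rec.get("country_of_birth")
        if country and not COUNTRY_CODE.fullmatch(country):
            issues["non_conforming"] += 1
        # Duplicate: same normalized name and date of birth seen before.
        key = ((rec.get("name") or "").strip().lower(), rec.get("date_of_birth"))
        if key in seen:
            issues["duplicate"] += 1
        seen.add(key)
    return dict(issues)


sample = [
    {"customer_id": "C1", "name": "Mary Anne Potter",
     "date_of_birth": "1970-01-01", "country_of_birth": "US"},
    {"customer_id": "C2", "name": "mary anne potter ",
     "date_of_birth": "1970-01-01", "country_of_birth": "United States"},
    {"customer_id": "C3", "name": "",
     "date_of_birth": None, "country_of_birth": "GB"},
]
report = profile(sample)
```

Inconsistent or inaccurate values generally cannot be caught by rules like these alone, since they require comparing records against another trusted source.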
Most modern core banking systems have built-in data validation checks to ensure that the right values are entered. Unfortunately, many banks continue to operate 20-30 year-old systems, many of which were custom built and lack upstream validation capabilities. In many cases, these data errors arise when combining ‘like’ data and information from multiple systems. Given the number of data sources and the volume of data that banks deal with, it will be important for FFIs to have capable technology to profile FATCA source data quickly and accurately, identifying errors at the source as well as errors that occur as data is being combined and transformed for reporting purposes.
Another data quality challenge facing FFIs will be to identify unique account holders while dealing with the following data anomalies:
- Deciphering names across different languages (e.g. 山田太郎 vs. Taro Yamada)
- Use of nicknames (e.g. John, Jonathan, Johnny)
- Concatenation (e.g. Mary Anne vs. Maryanne)
- Prefix / Suffix (e.g. MacDonald vs. McDonald)
- Spelling error (e.g. Potter vs. Porter)
- Typographical error (e.g. Beth vs. Beht)
- Transcription error (e.g. Hannah vs. Hamah)
- Localization (e.g. Stanislav Milosovich vs. Stan Milo)
- Phonetic variations (e.g. Edinburgh vs. Edinborough)
- Transliteration (e.g. Kang vs. Kwang)
Performing these intricate data validation and matching processes requires technology that is purpose-built for the job: identity matching and resolution technology that applies proven probabilistic, deterministic and fuzzy matching algorithms to data in any language, can process large data sets in a timely manner, and is designed to be used by business analysts rather than IT developers. Most importantly, it must be able to deliver the end results into the bank’s FATCA reporting systems and applications, where the business needs them most.
As I stated earlier, FATCA impacts both U.S. and non-U.S. banks and is as important to U.S. tax collectors as it is to the health of the global financial and economic markets. Even with the extended deadlines, those who lack capable data quality management processes, policies, standards and enabling technologies to deal with these data quality issues must act now or face the penalties defined by Uncle Sam.
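To give a feel for how deterministic and fuzzy matching can work together, here is a small Python sketch using only the standard library. It is a toy, not a production identity resolution engine: the nickname table, the 0.85 threshold and the helper names are assumptions, probabilistic scoring is not shown, and cross-language cases such as 山田太郎 vs. Taro Yamada would need transliteration that this sketch does not attempt.

```python
import difflib
import unicodedata

# Hypothetical nickname table; a real one would be far larger.
NICKNAMES = {"johnny": "john", "jonathan": "john", "stan": "stanislav"}


def normalize(name: str) -> str:
    """Deterministic step: strip accents, lowercase, expand nicknames."""
    text = unicodedata.normalize("NFKD", name)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    tokens = text.lower().replace("-", " ").split()
    return " ".join(NICKNAMES.get(token, token) for token in tokens)


def similarity(a: str, b: str) -> float:
    """Fuzzy step: similarity ratio between 0.0 and 1.0."""
    # Spaces are removed so that "Mary Anne" and "Maryanne" line up.
    left = normalize(a).replace(" ", "")
    right = normalize(b).replace(" ", "")
    return difflib.SequenceMatcher(None, left, right).ratio()


def is_probable_match(a: str, b: str, threshold: float = 0.85) -> bool:
    """Flag two names as a probable match for human or downstream review."""
    return similarity(a, b) >= threshold


pairs = [("Mary Anne", "Maryanne"), ("Johnny Potter", "John Potter"),
         ("MacDonald", "McDonald"), ("Taro Yamada", "Anna Schmidt")]
results = {pair: is_probable_match(*pair) for pair in pairs}
```

The threshold is a trade-off: set it too low and distinct people such as Potter and Porter are merged, set it too high and genuine variants are missed, which is why real systems usually combine several signals (address, date of birth, identifiers) rather than names alone.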
Data integrity is closely linked to the concept of trust which, in the world of human interactions, is based on a tight coupling between words and actions (do what you say and say what you do). In the IT world, this translates into first having a clear definition of data as well as how it is treated in the context of various business processes. If we have a clear definition of data, including policies such as access, privacy, change controls, etc. (the words), and if we have systems that consistently enforce the definition (the actions) then we have high trust and high data integrity. We know exactly what to expect, and the data always exactly matches our expectations.