Tag Archives: Data Integration

Integrating Structured Data Into the E-Discovery Process

This blog post initially appeared on Exterro and is reblogged here with their consent.

As data volumes increase and become more complex, having an integrated e-discovery environment, where systems and data sources automatically sync information and exchange data with e-discovery applications, has become critical for organizations. This is true of unstructured and semi-structured data sources, such as email servers and content management systems, as well as structured data sources, like databases and data archives. The topic of systems integration will be discussed on Exterro’s E-Discovery Masters series webcast, “Optimizing E-Discovery in a Multi-Vendor Environment.” The webcast is CLE-accredited and will air on Thursday, September 4 at 1pm ET/10am PT. Learn more and register here.

I recently interviewed Jim FitzGerald, Sr. Director for Exterro, and Josh Alpern, VP ILM Domain Experts for Informatica, about the important and often overlooked role structured data plays during the course of e-discovery.

Q: E-Discovery demands are often discussed in the context of unstructured data, like email. What are some of the complications that arise when a matter involves structured data?

Jim: A lot of e-discovery practitioners are comfortable with unstructured data sources like email, file shares, or documents in SharePoint, but freeze up when they have to deal with structured data. They are unfamiliar with the technology and terminology of databases, extracts, report generation, and archives. They’re unsure about the best ways to preserve or collect from these sources. If the application is an old one, this fear often gets translated into a mandate to keep everything just as it is, which results in mothballed applications that just sit there in case the data might be needed down the road. Beyond the costs, there’s also the issue that IT staff turnover makes it increasingly hard to generate the reports Legal and Compliance need from these old systems.

Josh: Until now, e-discovery has largely been applied to unstructured data and email for two main reasons: 1) a large portion of relevant data resides in these types of data stores, and 2) these are the data formats that everyone is most familiar with and can relate to most easily. We all use email, and we all use files and documents. So it’s easy for people to look at an email or a document and understand that everything is self-contained in that one “thing.” But structured data is different, although not necessarily any less relevant. For example, someone might understand conceptually what a “purchase order” is, but not realize that in a financial application a purchase order consists of data that is spread across 50 different database tables. Unlike with an email or a PDF document, there might not be an easy way to simply produce a purchase order for legal discovery without understanding how those 50 database tables are related to each other. Furthermore, to use email as a comparison, everyone understands what an email “thread” is. It’s easy to ask for all the emails in a single thread, and usually it’s relatively easy to identify all of those emails: they all have the same subject line. But in structured data the situation can be much more complicated. If someone asks to see every financial document related to a single purchase order, you would have to understand all of the connections between the many database tables that comprise those related documents and how they relate back to the requested purchase order. Solutions that are focused on email and unstructured data have no means to do this.
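
To make that point concrete, here’s a toy sketch (plain Python and SQLite, with invented table and column names; a real ERP spreads a purchase order across far more tables than this) of why “producing a purchase order” means knowing how the underlying tables join together:

```python
import sqlite3

# Hypothetical, simplified schema: a real financial application spreads a
# purchase order across many more tables than the three shown here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE purchase_orders (po_id INTEGER PRIMARY KEY, vendor_id INTEGER, created_date TEXT);
    CREATE TABLE vendors         (vendor_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE po_lines        (line_id INTEGER PRIMARY KEY, po_id INTEGER, item TEXT, amount REAL);

    INSERT INTO vendors  VALUES (7, 'Acme Industrial');
    INSERT INTO purchase_orders VALUES (1001, 7, '2014-06-30');
    INSERT INTO po_lines VALUES (1, 1001, 'Turbine bearing', 1250.00);
    INSERT INTO po_lines VALUES (2, 1001, 'Gasket kit', 89.50);
""")

# Reconstructing "the purchase order" for production requires knowing how the
# tables relate -- knowledge that lives in the application, not in any one row.
rows = conn.execute("""
    SELECT po.po_id, po.created_date, v.name, l.item, l.amount
    FROM purchase_orders po
    JOIN vendors  v ON v.vendor_id = po.vendor_id
    JOIN po_lines l ON l.po_id     = po.po_id
    WHERE po.po_id = 1001
""").fetchall()

for row in rows:
    print(row)
```

That join logic is exactly the application knowledge that, as Jim notes above, walks out the door with IT staff turnover.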

Q: What types of matters tend to implicate structured data and are they becoming more or less common?

Jim: The ones I hear about most often are product liability cases where they need to look back at warranty claims or drug trial data, employment disputes around pay history and practices, or financial cases where they need to look at pricing or trading patterns.

Josh: The ones that Jim mentioned are certainly prevalent. In addition, I would add that all kinds of financial data are now governed by retention policies, largely because of the same concerns that arise from potential legal situations: at some point, someone may ask for it. Anything related to consumer packaged goods, vehicle parts (planes, boats, cars, trucks, etc.), and industrial and durable goods, which tend to have very long lifecycles, is increasingly subject to these types of inquiries.

Q: Simply accessing legacy data to determine its relevance to a matter can present significant challenges. What are some methods by which organizations can streamline the process?

Jim: If you are keeping mothballed applications and databases around purely for reporting purposes, these are prime targets to migrate to a structured data archive. Cost savings on licenses, CPU, and storage can run to 65% per year, with the added benefits that it’s much easier to enforce a retention policy on this data and roll it off when it expires, and that compliance reporting is easier to do with modern tools.
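
As a rough illustration of the retention point (a hedged sketch with invented record structures, not any particular archive product), rolling data off becomes a simple, auditable rule once archived records carry a retention date:

```python
from datetime import date

# Hypothetical archived records; a real structured data archive would carry
# retention metadata like this on millions of rows, assigned by policy.
archive = [
    {"record_id": 1, "retain_until": date(2013, 12, 31)},
    {"record_id": 2, "retain_until": date(2020, 12, 31)},
]

def roll_off_expired(records, as_of):
    """Split records into those still under retention and those eligible to be purged."""
    kept    = [r for r in records if r["retain_until"] >= as_of]
    expired = [r for r in records if r["retain_until"] < as_of]
    return kept, expired

kept, expired = roll_off_expired(archive, as_of=date(2014, 9, 4))
print(f"retained: {len(kept)}, eligible for roll-off: {len(expired)}")
```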

Josh: One huge challenge that comes from these legacy applications stems from the fact that there are typically a lot of them. That means that when a discovery request arises, someone – or more likely multiple people – have to go to each one of those applications one by one to search for and retrieve relevant data. Not only is that time consuming and cumbersome, but it also assumes that there are people with the skill sets and application knowledge necessary to interact with all of those different applications. In any given company, that might not be a problem *today*, shortly after the applications have been decommissioned, because all the people that used the applications when they were live are still around. But will that still be the case 5, 7, 10 or 20 years from now? Probably not. Retiring all of these legacy applications into a “platform neutral” format is a much more sustainable, not to mention cost effective, approach.

Q: How can e-discovery preservation and collection technologies be leveraged to help organizations identify and “lock down” structured data?

Jim: Integrating e-discovery — legal holds and collections — with your structured data archive can make it a lot easier to coordinate preservation and collection activities across the two systems. This reduces the chances of stranded holds — data under preservation that could have been released — and reduces the ambiguity about what needs to happen to the data to support the needs of legal and compliance teams.
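
Here’s a hedged sketch of what that coordination amounts to in practice (generic Python, not either vendor’s API): legal holds suspend retention-based disposal in the archive, and releasing the hold when a matter closes prevents the stranded holds Jim describes:

```python
from datetime import date

# Hypothetical archive records and legal-hold flags.
records = {
    101: {"retain_until": date(2013, 6, 30), "holds": set()},
    102: {"retain_until": date(2013, 6, 30), "holds": set()},
}

def apply_hold(matter_id, record_ids):
    """Preservation: flag records so retention-based disposal skips them."""
    for rid in record_ids:
        records[rid]["holds"].add(matter_id)

def release_hold(matter_id):
    """Release when the matter closes, so no hold is left stranded."""
    for rec in records.values():
        rec["holds"].discard(matter_id)

def disposable(today):
    """Only records past retention AND free of holds may be rolled off."""
    return [rid for rid, rec in records.items()
            if rec["retain_until"] < today and not rec["holds"]]

apply_hold("MATTER-2014-042", [101])
print(disposable(date(2014, 9, 1)))   # [102] -- record 101 stays preserved
release_hold("MATTER-2014-042")
print(disposable(date(2014, 9, 1)))   # [101, 102]
```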

Josh: Just as there are solutions for “locking down” unstructured and semi-structured (email) data, there are solutions for locking down structured data. The first and perhaps most important step is recognizing that the solutions for unstructured and semi-structured data are simply incapable of handling structured data. Without something that is purpose-built for structured data, your discovery preservation and collection process is going to ignore this entire category of data. The good news is that some of the solutions that are purpose-built for structured data have built-in integrations with the leading e-discovery platforms.

You can hear more from Informatica’s Josh Alpern and Exterro’s Jim FitzGerald by attending Exterro’s CLE-accredited webcast, “Optimizing E-Discovery in a Multi-Vendor Environment,” airing on Thursday, September 4. Learn more and register here.

Is the Internet of Things relevant for the government?

Get connected. Be connected. Make connections. Find connections. The Internet of Things (IoT) is all about connecting people, processes, data and, as the name suggests, things. The recent social media frenzy surrounding the ALS Ice Bucket Challenge has certainly reminded everyone of the power of social media, the Internet and a willingness to answer a challenge. Fueled by personal and professional connections, the craze has transformed fundraising for at least one charity. Similarly, IoT could be transformational to the business of the public sector, should government step up to the challenge.

Government is struggling with the concept and reality of how IoT really relates to the business of government, and perhaps rightfully so. For commercial enterprises, IoT is far more tangible and simply more fun. Gaming, televisions, watches, Google glasses, smartphones and tablets are all about delivering over-the-top, new and exciting consumer experiences. Industry is delivering transformational innovations, which are connecting people to places, data and other people at a record pace.

It’s time to accept the challenge. Government agencies need to keep pace with their commercial counterparts and harness the power of the Internet of Things. The end game is not to deliver new, faster, smaller, cooler electronics; the end game is to create solutions that let devices connected to the Internet interact and share data, regardless of location, manufacturer or format, and that make or find connections that may previously have been undetectable. For some, this concept is as foreign or scary as pouring ice water over their heads. For others, the new opportunity to transform policy, service delivery, leadership, legislation and regulation is fueling a transformation in government. And it starts with one connection.

One way to start could be linking previously siloed systems together or creating a golden record of all citizen interactions through a Master Data Management (MDM) initiative. It could start with a big data and analytics project to determine and mitigate risk factors in education, or with linking sensor data across multiple networks to increase intelligence about potential hacking or breaches. Agencies could stop waste, fraud and abuse before it happens by linking critical payment, procurement and geospatial data together in real time.
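
As a toy illustration of the golden-record idea (invented field names and a deliberately naive survivorship rule, not a production MDM implementation), the essence is matching records for the same citizen across siloed systems and merging the surviving attributes:

```python
# Hypothetical citizen records from two siloed agency systems.
dmv_record      = {"ssn": "123-45-6789", "name": "Jane Q. Public", "address": "12 Oak St", "phone": None}
benefits_record = {"ssn": "123-45-6789", "name": "Jane Public",    "address": None,        "phone": "555-0100"}

def build_golden_record(records, match_key="ssn"):
    """Merge records that share a match key, keeping the first non-empty value per field."""
    assert len({r[match_key] for r in records}) == 1, "records must match on the key"
    golden = {}
    for rec in records:
        for field, value in rec.items():
            if value and not golden.get(field):
                golden[field] = value
    return golden

print(build_golden_record([dmv_record, benefits_record]))
# {'ssn': '123-45-6789', 'name': 'Jane Q. Public', 'address': '12 Oak St', 'phone': '555-0100'}
```

Real MDM tools apply much richer matching (fuzzy names, addresses) and configurable survivorship rules; the point is simply that the golden record is assembled from, rather than stored in, any single system.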

This is the Internet of Things for government. This is the challenge. This is transformation.

This article was originally published on www.federaltimes.com.

Malcolm Gladwell, Big Data and What’s to be Done About Too Much Information

Malcolm Gladwell wrote an article in The New Yorker magazine in January, 2007 entitled “Open Secrets.” In the article, he pointed out that a national-security expert had famously made a distinction between puzzles and mysteries.

Osama bin Laden’s whereabouts were, for many years, a puzzle. We couldn’t find him because we didn’t have enough information. The key to the puzzle, it was assumed, would eventually come from someone close to bin Laden, and until we could find that source, bin Laden would remain at large. In fact, that’s precisely what happened. Al-Qaida’s No. 3 leader, Khalid Sheikh Mohammed, gave authorities the nickname of one of bin Laden’s couriers, who then became the linchpin of the CIA’s efforts to locate bin Laden.

By contrast, the problem of what would happen in Iraq after the toppling of Saddam Hussein was a mystery. It wasn’t a question that had a simple, factual answer. Mysteries require judgments and the assessment of uncertainty, and the hard part is not that we have too little information but that we have too much.

This was written before “Big Data” was a household term, and it raises the very interesting question of whether organizations and corporations that are, by anyone’s standards, totally deluged with data are facing puzzles or mysteries. Consider the amount of data that a company like Western Union deals with.

Western Union is a 160-year-old company. Having built scale in the money transfer business, the company is in the process of evolving its business model by enabling the expansion of digital products, growth of web and mobile channels, and a more personalized online customer experience. Sounds good – but get this: the company processes more than 29 transactions per second on average. That’s 242 million consumer-to-consumer transactions and 459 million business payments in a year. Nearly a billion transactions – a billion! As my six-year-old might say, that number is big enough “to go to the moon and back.” Layer on top of that the fact that the company operates in 200+ countries and territories, and conducts business in 120+ currencies. Senior Director and Head of Engineering Abhishek Banerjee has said, “The data is speaking to us. We just need to react to it.” That implies a puzzle, not a mystery – but only if data scientists are able to conduct statistical modeling and predictive analysis, systematically noting trends in sending and receiving behaviors. Check out what Banerjee and Western Union CTO Sanjay Saraf have to say about it here.

Or consider General Electric’s aggressive and pioneering move into what’s dubbed as the industrial internet. In a white paper entitled “The Case for an Industrial Big Data Platform: Laying the Groundwork for the New Industrial Age,” GE reveals some of the staggering statistics related to the industrial equipment that it manufactures and supports (services comprise 75% of GE’s bottom line):

  • A modern wind turbine contains approximately 50 sensors and control loops which collect data every 40 milliseconds.
  • A farm controller then receives more than 30 signals from each turbine at 160-millisecond intervals.
  • At every one-second interval, the farm monitoring software processes 200 raw sensor data points with various associated properties for each turbine.

Phew! I’m no electricity operations expert, and you probably aren’t either. And most of us will get no further than simply wrapping our heads around the simple fact that GE turbines are collecting a LOT of data. But what the paper goes on to say should grab your attention in a big way: “The key to success for this wind farm lies in the ability to collect and deliver the right data, at the right velocity, and in the right quantities to a wide set of well-orchestrated analytics.” And the paper goes on to recommend that anyone involved in the Industrial Internet revolution strongly consider its talent requirements, with the suggestion that Chief Data Officers and/or Data Scientists may be the next critical hires.
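
Some back-of-the-envelope arithmetic on the figures above makes “a LOT of data” concrete. This sketch simply multiplies the stated rates; the farm size and bytes-per-reading are invented for illustration and make no claim about GE’s actual systems:

```python
# Figures quoted from the GE white paper above.
sensors_per_turbine = 50
sensor_interval_s   = 0.040    # data collected every 40 milliseconds
signals_per_turbine = 30
signal_interval_s   = 0.160    # farm controller receives signals every 160 ms

readings_per_turbine_per_s = sensors_per_turbine / sensor_interval_s   # 1,250 per second
signals_per_turbine_per_s  = signals_per_turbine / signal_interval_s   # 187.5 per second

# Hypothetical farm size and record size, purely to illustrate scale.
turbines_in_farm  = 100
bytes_per_reading = 16

daily_bytes = readings_per_turbine_per_s * turbines_in_farm * bytes_per_reading * 86_400
print(f"{readings_per_turbine_per_s:,.0f} sensor readings per turbine per second")
print(f"~{daily_bytes / 1e9:.0f} GB per day for a {turbines_in_farm}-turbine farm (illustrative)")
```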

Which brings us back to Malcolm Gladwell. In the aforementioned article, Gladwell goes on to pull apart the Enron debacle, and argues that it was a prime example of the perils of too much information. “If you sat through the trial of (former CEO) Jeffrey Skilling, you’d think that the Enron scandal was a puzzle. The company, the prosecution said, conducted shady side deals that no one quite understood. Senior executives withheld critical information from investors…We were not told enough—the classic puzzle premise—was the central assumption of the Enron prosecution.” But in fact, that was not true. Enron employed complicated – but perfectly legal – accounting techniques used by companies that engage in complicated financial trading. Many journalists and professors have gone back and looked at the firm’s regulatory filings, and have come to the conclusion that, while complex and difficult to identify, all of the company’s shenanigans were right there in plain view. Enron cannot be blamed for covering up the existence of its side deals. It didn’t; it disclosed them. As Gladwell summarizes:

“Puzzles are ‘transmitter-dependent’; they turn on what we are told. Mysteries are ‘receiver dependent’; they turn on the skills of the listener.”

Wind turbines, jet engines and other machinery sensors generate unprecedented amounts of data

I would argue that this extremely complex, fast-moving and seismic shift that we call Big Data will favor those who have developed the ability to attune, to listen and make sense of the data. Winners in this new world will recognize what looks like an overwhelming and intractable mystery, break that mystery down into small and manageable chunks, and demystify the landscape to uncover the important nuggets of truth and significance.

Informatica Cloud Summer ’14 Release Breaks Down Barriers with Unified Data Integration and Application Integration for Real Time and Bulk Patterns

This past week, Informatica Cloud marked an important milestone with the Summer 2014 release of the Informatica Cloud platform. This was the 20th Cloud release, and I am extremely proud of what our team has accomplished.

“SDL’s vision is to help our customers use data insights to create meaningful experiences, regardless of where or how the engagement occurs. It’s multilingual, multichannel and on a global scale. Being able to deliver the right information at the right time to the right customer with Informatica Cloud Summer 2014 is critical to our business and will continue to set us apart from our competition.”

– Paul Harris, Global Business Applications Director, SDL Pic

When I joined Informatica Cloud, I knew that it had the broadest cloud integration portfolio in the marketplace: leading data integration and analytic capabilities for bulk integration, comprehensive cloud master data management and test data management, and over a hundred connectors for cloud apps, enterprise systems and legacy data sources, all delivered in a self-service design with point-and-click wizards for citizen integrators, without the need for complex and costly manual custom coding.

But I also learned that our broad portfolio conceals another structural advantage: because of Informatica Cloud’s unique, unified platform architecture, it has the ability to surface application (or real time) integration capabilities alongside its data integration capabilities, with shared metadata across real time and batch workflows.

With the Summer 2014 release, we’ve brought our application integration capabilities to the forefront. We now provide the most complete cloud app integration capability in the marketplace. With a design environment that’s meant not just for developers but also for line-of-business IT, app admins can now build real time process workflows that cut across on-premise and cloud systems and include built-in human workflows. And with the capability to translate these process workflows instantly into mobile apps for iPhone and Android devices, we’re not just setting ourselves apart but also giving customers the unique capabilities they need for their increasingly mobile employees.

Informatica Cloud Summer Release Webinar Replay

“Schneider’s strategic initiative to improve front-office performance relied on recording and measuring sales person engagement in real time on any mobile device or desktop. The enhanced real time cloud application integration features of Informatica Cloud Summer 2014 makes it all possible and was key to the success of a highly visible and transformative initiative.”

– Mark Nardella, Global Sales Process Director, Schneider Electric SE

With this release, we’re also giving customers the ability to create workflows around data sharing that mix and match batch and real time integration patterns. This is really important, because unlike in the past, when you had to choose between batch and real time, in today’s world of on-premise, cloud-based, transactional and social data you have to deal with both real time interactions and the processing of large volumes of data. For example, consider a typical scenario these days at a high-end retail store. Using a clienteling iPad app, the sales rep looks up bulk purchase history and inventory availability data in SAP, confirms availability and delivery date, and then processes the customer’s order via real time integration with NetSuite. And if you ask any customer, having a single workflow to unify all of that for instant and actionable insights is a huge advantage.
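
Here’s a schematic sketch of that mixed pattern (plain Python with invented function names standing in for the real connectors, not Informatica Cloud’s actual workflow designer): a lookup against bulk-extracted purchase history, followed by real-time calls for availability and order creation:

```python
# Hypothetical stand-ins for the two integration patterns in the scenario:
# purchase history arrives as a periodic bulk extract, while availability
# checks and order creation happen through real-time calls.

BULK_PURCHASE_HISTORY = {  # refreshed nightly from an ERP extract (batch pattern)
    "CUST-42": [{"sku": "BAG-001", "qty": 3}, {"sku": "SCARF-09", "qty": 1}],
}

def check_inventory(sku):
    """Pretend real-time availability lookup."""
    return {"sku": sku, "available": True, "delivery_days": 2}

def create_order(customer_id, sku, qty):
    """Pretend real-time order call to the order-management system."""
    return {"order_id": "ORD-1001", "customer": customer_id, "sku": sku, "qty": qty}

def clienteling_workflow(customer_id, sku, qty):
    history = BULK_PURCHASE_HISTORY.get(customer_id, [])   # batch-sourced data
    availability = check_inventory(sku)                    # real-time call
    if not availability["available"]:
        return {"status": "backordered", "history": history}
    order = create_order(customer_id, sku, qty)            # real-time call
    return {"status": "ordered", "order": order, "history": history}

print(clienteling_workflow("CUST-42", "BAG-001", 1))
```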

“Our industry demands absolute efficiency, speed and trust when dealing with financial information, and the new cloud application integration feature in the latest release of Informatica Cloud will help us service our customers more effectively by delivering the data they require in a timely fashion. Keeping call-times to a minimum and improving customer satisfaction in real time.”

– Kimberly Jansen, Director CRM, Misys PLC

We’ve also included some exciting new Vibe Integration packages or VIPs. VIPs deliver pre-built business process mappings between front-office and back-office applications. The Summer 2014 release includes new bidirectional VIPs for Siebel to Salesforce and SAP to Salesforce that make it easier for customers to connect their Salesforce with these mission-critical business applications.

And last, but certainly not least, the release includes a critical upgrade to our API Framework that provides the Informatica Cloud iPaaS with end-to-end support for connectivity to any company’s internal or external APIs. With the newly available API creation, definition and consumption patterns, developers or citizen integrators can now easily expose integrations as APIs, and users can consume them via integration workflows or apps without the need for any additional custom code.
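
To make the idea of exposing an integration as an API concrete, here’s a minimal, generic sketch using only the Python standard library; it is not Informatica’s API Framework, and the endpoint and payload names are invented:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def run_integration(payload):
    """Stand-in for an integration task; a real platform would invoke a mapping or process here."""
    return {"status": "ok", "rows_processed": len(payload.get("rows", []))}

class IntegrationAPI(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/api/v1/sync-orders":            # invented endpoint name
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        body = json.dumps(run_integration(payload)).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("localhost", 8080), IntegrationAPI).serve_forever()
```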

The features and capabilities released this summer are available to all existing Informatica Cloud customers, and everyone else through our free 30-day trial offer.

Moving to the Cloud: 3 Data Integration Facts That Every Enterprise Should Understand

According to a survey conducted by Dimensional Research and commissioned by Host Analytics, “CIOs continue to grow more and more bullish about cloud solutions, with a whopping 92% saying that cloud provides business benefits, according to a recent survey. Nonetheless, IT execs remain concerned over how to avoid SaaS-based data silos.”

Since the survey was published, many enterprises have, indeed, leveraged the cloud to host business data in both IaaS and SaaS incarnations. Overall, there seem to be two types of enterprises: First are the enterprises that get the value of data integration. They leverage the value of cloud-based systems, and do not create additional data silos. Second are the enterprises that build cloud-based data silos without a sound data integration strategy, and thus take a few steps backward in terms of effectively leveraging enterprise data.

There are facts about data integration that most in enterprise IT don’t yet understand, and the use of cloud-based resources actually makes things worse.  The shame of it all is that, with a bit of work and some investment, the value should come back to the enterprises 10 to 20 times over.  Let’s consider the facts.

Fact 1: Implement new systems, such as those being stood up on public cloud platforms, and any data integration investment comes back 10 to 20 fold. When building a data integration strategy and investing in data integration technology, the focus is typically too much on cost and not enough on the benefit.

Many in enterprise IT point out that their problem domain is unique, and thus their circumstances need special consideration.  While I always perform domain-specific calculations, the patterns of value typically remain the same.  You should determine the metrics that are right for your enterprise, but the positive values will be fairly consistent, with some varying degrees.

Fact 2: It’s not just about data moving from place to place; it’s also about the proper management of data. This includes a central understanding of data semantics (metadata), and a place to manage a “single version of the truth” when dealing with the massive amounts of distributed data that enterprises typically manage, data that is now also distributed within public clouds.

Most of those who manage enterprise data, cloud or no cloud, have no common mechanism to deal with the meaning of the data, or even the physical location of the data. While data integration is about moving data from place to place to support core business processes, it should come with a way to manage the data as well. This means understanding, protecting, governing, and leveraging the enterprise data, both locally and within public cloud providers.
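
A toy sketch of what that central understanding can look like at its simplest (invented catalog entries; real metadata management is far richer): a catalog that records what each dataset means, where it physically lives, on-premises or in a public cloud, and which copy is the system of record:

```python
# Hypothetical catalog entries: semantics plus physical location for each dataset.
catalog = {
    "customer": [
        {"system": "CRM (on-premise)",   "location": "crm_db.customers",      "is_master": True},
        {"system": "SaaS marketing app", "location": "s3://exports/contacts", "is_master": False},
    ],
    "order": [
        {"system": "ERP (on-premise)",   "location": "erp_db.sales_orders",   "is_master": True},
    ],
}

def master_source(entity):
    """Return the system of record -- the 'single version of the truth' -- for an entity."""
    for entry in catalog.get(entity, []):
        if entry["is_master"]:
            return entry
    return None

print(master_source("customer"))
```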

Fact 3: Some data belongs in the cloud, and some data belongs in the enterprise. Some in enterprise IT have pushed back on cloud computing, stating that data outside the firewall is a bad idea due to security, performance, legal issues…you name it. Others try to move all data to the cloud. The point of value is somewhere in between.

The fact of the matter is that the public cloud is not the right fit for all data. Enterprise IT must carefully consider the tradeoffs between cloud-based and in-house, including performance, security, compliance, and so on. Finding the best location for the data is the same problem we’ve dealt with for years; now we have cloud computing as an option. Work from your requirements to the target platform, and you’ll find what I’ve found: Cloud is a fit some of the time, but not all of the time.

In a Data First World, Knowledge Really Is Power!

I have two quick questions for you. First, can you name the top three factors that will increase your sales or boost your profit? And second, are you sure about that?

That second question is a killer because most people — no matter if they’re in marketing, sales or manufacturing — rely on incomplete, inaccurate or just plain wrong information. Regardless of industry, we’ve been fixated on historic transactions because that’s what our systems are designed to provide us.

“Moneyball: The Art of Winning an Unfair Game” gives a great example of what I mean. The book (not the movie) describes Billy Beane hiring MBAs to map out the factors that would win a baseball game. They discovered something completely unexpected: That getting more batters on base would tire out pitchers. It didn’t matter if batters had multi-base hits, and it didn’t even matter if they walked. What mattered was forcing pitchers to throw ball after ball as they faced an unrelenting string of batters. Beane stopped looking at RBIs, ERAs and even home runs, and started hiring batters who consistently reached first base. To me, the book illustrates that the most useful knowledge isn’t always what we’ve been programmed to depend on or what is delivered to us via one app or another.

For years, people across industries have turned to ERP, CRM and web analytics systems to forecast sales and acquire new customers. By their nature, such systems are transactional, forcing us to rely on history as the best predictor of the future. Sure it might be helpful for retailers to identify last year’s biggest customers, but that doesn’t tell them whose blogs, posts or Tweets influenced additional sales. Isn’t it time for all businesses, regardless of industry, to adopt a different point of view — one that we at Informatica call “Data-First”? Instead of relying solely on transactions, a data-first POV shines a light on interactions. It’s like having a high knowledge IQ about relationships and connections that matter.

A data-first POV changes everything. With it, companies can unleash the killer app, the killer sales organization and the killer marketing campaign. Imagine, for example, if a sales person meeting a new customer knew that person’s concerns, interests and business connections ahead of time? Couldn’t that knowledge — gleaned from Tweets, blogs, LinkedIn connections, online posts and transactional data — provide a window into the problems the prospect wants to solve?

That’s the premise of two startups I know about, and it illustrates how a data-first POV can fuel innovation for developers and their customers. Today, we’re awash in data-fueled things that are somehow attached to the Internet. Our cars, phones, thermostats and even our wristbands are generating and gleaning data in new and exciting ways. That’s knowledge begging to be put to good use. The winners will be the ones who figure out that knowledge truly is power, and wield that power to their advantage.

3 Ways to Sell Data Integration Internally

So, you need to grab some budget for a data integration project, but no one understands what data integration is or what business problems it solves, and it’s difficult to explain without a whiteboard and a lot of time. I’ve been there.

I’ve “sold” data integration as a concept for the last 20 years.  Let me tell you, it’s challenging to define the benefits to those who don’t work with this technology every day.  That said, most of the complaints I hear about enterprise IT are around the lack of data integration, and thus the inefficiencies that go along with that lack, such as re-keying data, data quality issues, lack of automation across systems, and so forth.

Considering that most of you will sell data integration to your peers and leadership, I’ve come up with 3 proven ways to sell data integration internally.

First, focus on the business problems. Use real-world examples from your own business. It’s not tough to find any number of cases where the data was just not there to make core operational decisions that could have avoided some huge mistakes that proved costly to the company. Or, more likely, there are things like ineffective inventory management that has no way to determine when orders need to be placed. Or, there’s the go-to standard: no single definition of what a “customer” or a “sale” is amongst the systems that support the business. That one is like back pain; everyone has it at some point.

Second, define the business case in practical terms with examples. Once you define the business problems that exist due to the lack of a sound data integration strategy and technology, it’s time to put money behind those problems. Those in IT have a tendency to either greatly overstate or greatly understate the amount of money that’s being wasted, and thus could be saved, by using data integration approaches and technology. So, provide practical numbers that you can back up with existing data.

Finally, focus on a phased approach to implementing your data integration solution. The “Big Bang Theory” is a great way to explain the beginning of the universe, but it’s not the way you want to define the rollout of your data integration technology. Define a workable plan that moves from one small grouping of systems and databases to another, over time, and with a reasonable amount of resources and technology. You do this to remove risk from the effort, as well as to manage costs and ensure that you can dial lessons learned back into the effort. I would rather roll out data integration within an enterprise using small teams and more problem domains than attempt to do everything within a few years.

The reality is that data integration is no longer optional for enterprises these days.  It’s required for so many reasons, from data sharing, information visibility, compliance, security, automation…the list goes on and on.  IT needs to take point on this effort.  Selling data integration internally is the first and most important step.  Go get ‘em.

Scary Times For Data Security

These are scary times we live in when it comes to data security. And the times are even scarier for today’s retailers, government agencies, financial institutions, and healthcare organizations. The internet has become a battlefield. Criminals are looking to steal trade secrets and personal data for financial gain. Terrorists seek to steal data for political gain. Both are after your personally identifiable information: your name, account numbers, social security number, date of birth, IDs and passwords.

How are they accomplishing this? A new generation of hackers has learned to reverse engineer popular software programs (e.g., Windows, Outlook, Java) in order to find so-called “holes.” Once those holes are exploited, the hackers develop “bugs” that infiltrate computer systems, search for sensitive data and return it to the bad guys. These bugs are then sold on the black market to the highest bidder. When successful, these hackers can wreak havoc across the globe.

I recently read a Time Magazine article titled “World War Zero: How Hackers Fight to Steal Your Secrets.” The article discussed a new generation of software companies made up of former hackers. These firms help other software companies by identifying potential security holes, before they can be used in malicious exploits.

This constant battle between good (data and software security firms) and bad (smart, young programmers looking to make a quick/big buck) is happening every day. Unfortunately, average consumers (you and I) are the innocent victims of this crazy and costly war. As a consumer in today’s digital and data-centric age, I worry when I see headlines of ongoing data breaches, from the Targets of the world to my local bank down the street. I wonder not “if” but “when” I will become the next victim. According to the Ponemon Institute, the average cost of a data breach to a company was $3.5 million, 15 percent more than it cost the year before.

As a 20-year software industry veteran, I’ve worked with many firms across the global financial services industry. As a result, my concerns about data security exceed those of the average consumer. Here are the reasons why:

  1. Everything is Digital: I remember the days when ATMs were introduced, eliminating the need to wait in long teller lines. Nowadays, most of what we do with our financial institutions is digital and online, whether on our mobile devices or desktop browsers. As such, every interaction and transaction creates sensitive data that gets disbursed across tens, hundreds, sometimes thousands of databases and systems in these firms.
  2. The Big Data Phenomenon: I’m not talking about sexy next-generation analytic applications that promise to provide the best answer to run your business. What I am talking about is the volume of data that is being generated and collected from the countless computer systems (on-premise and in the cloud) that run today’s global financial services industry.
  3. Increased Use of Off-Shore and On-Shore Development: Firms increasingly outsource technology projects to off-shore and on-shore development partners to offset their operational and technology costs, which means sensitive data is handled by more teams, in more places, with every new technology initiative.

Now here is the hard part. Given these trends and heightened threats, do the companies I do business with know where the data they need to protect resides? How do they actually protect sensitive data when using it to support new IT projects, whether in-house or with off-shore development partners? You’d be amazed what the truth is.

According to the recent Ponemon Institute study “State of Data Centric Security” that surveyed 1,587 Global IT and IT security practitioners in 16 countries:

  • Only 16 percent of the respondents believe they know where all sensitive structured data is located and a very small percentage (7 percent) know where unstructured data resides.
  • Fifty-seven percent of respondents say not knowing where the organization’s sensitive or confidential data is located keeps them up at night.
  • Only 19 percent say their organizations use centralized access control management and entitlements and 14 percent use file system and access audits.

Even worse, there is a gap between the share of respondents who see this as a serious threat and the share who believe addressing it is a high priority in their organizations: seventy-nine percent agree that not knowing where sensitive or confidential data resides is a significant security risk, but a much smaller percentage (51 percent) believes that securing and/or protecting that data is a high priority in their organizations.

I don’t know about you, but this is alarming and worrisome to me. I am ready to reach out to my banker and my local retailer, let them know about my concerns, and make sure they communicate those concerns to the top of their organizations. In today’s globally and socially connected world, news travels fast, and given how hard it is to build trusted customer relationships, one would think every business from the local mall to Wall St. should be asking whether they are doing what they need to do to identify and protect their number one digital asset: their data.
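
As a footnote on the question of how sensitive data can be protected when it is handed to development and test teams, masking is one common approach. Here’s a hedged, generic sketch (not any vendor’s masking product; the field names are invented, and real tools add salting and format-preserving techniques):

```python
import hashlib

SENSITIVE_FIELDS = {"ssn", "account_number", "date_of_birth"}

def mask_value(value):
    """Replace a sensitive value with a deterministic token: the same input always
    yields the same token, so masked tables still join correctly."""
    return "MASKED-" + hashlib.sha256(value.encode()).hexdigest()[:10]

def mask_record(record):
    """Copy a record, masking sensitive fields so dev/test copies carry no raw PII."""
    return {k: (mask_value(str(v)) if k in SENSITIVE_FIELDS else v)
            for k, v in record.items()}

customer = {"name": "J. Smith", "ssn": "123-45-6789",
            "account_number": "9876543210", "balance": 1204.55}
print(mask_record(customer))
```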

The New Gartner Magic Quadrant for Data Integration Tools is Here!

Gartner Magic Quadrant for Data Integration Tools July 2014

While the rest of you may not get that excited about the latest and greatest Gartner Magic Quadrant report, we sure get excited about it around here. And with good reason. Once again, for the 8th year in a row if I am not mistaken, Informatica is in the Leaders quadrant of the Gartner Magic Quadrant for Data Integration Tools, and we are positioned highest in ability to execute and furthest in completeness of vision within the Leaders quadrant. So I have to say, I am pretty excited about that, because we believe it speaks to our vision and execution among vendors. And that’s the fact, Jack.

So if you still don’t get why we are so excited, I will just quote Navin R Johnson (from the movie: The Jerk) who, upon seeing his name published in the phone book stated, “This is the kind of spontaneous publicity – your name in print – that makes people. I’m in print! Things are going to start happening to me now.” (If you want to see Navin’s reaction click here)

Well, things are already happening for Informatica. And more importantly, for our customers who are using our market leading data integration platform to accelerate their mission critical data projects whether they are on premise, in the cloud or even on Hadoop. But don’t take my word for it. Download and read the latest Gartner Magic Quadrant for Data Integration Tools report for yourself, and find out why we are once again in the leadership quadrant.

Now I am going to take the rest of the day off, to celebrate!

Disclaimer – Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings. Gartner research publications consist of the opinions of Gartner’s research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.

The Swiss Army Knife of Data Integration

Back in 1884, a man had a revolutionary idea; he envisioned a compact knife that was lightweight and would combine the functions of many stand-alone tools into a single tool. This idea became what the world has known for over a century as the Swiss Army Knife.

This creative thinking to solve a problem came from a request from the Swiss Army to build a soldier’s knife. In the end, the solution was all about getting the right tool for the right job in the right place. In many cases soldiers didn’t need industrial-strength tools; all they really needed was a compact and lightweight tool to get the job at hand done quickly.

Putting this into perspective with today’s world of data integration, using enterprise-class data integration tools for the smaller data integration project is overkill and typically out of reach for the smaller organization. However, these smaller data integration projects are just as important as the larger enterprise projects, and they are often the innovation behind a new way of business thinking. The traditional hand-coding approach to the smaller data integration project is not scalable, not repeatable and prone to human error; what’s needed is a compact, flexible and powerful off-the-shelf tool.

Thankfully, over a century after the world embraced the Swiss Army Knife, someone at Informatica was paying attention to revolutionary ideas. If you’ve not yet heard the news about the Informatica platform, a version called PowerCenter Express has been released, and it is free of charge, so you can use it to handle an assortment of what I’d characterize as high-complexity, low-volume data integration challenges and experience a subset of the Informatica platform for yourself. I’d emphasize that PowerCenter Express doesn’t replace the need for Informatica’s enterprise-grade products, but it is ideal for rapid prototyping, profiling data, and developing quick proofs of concept.

PowerCenter Express provides a glimpse of the evolving Informatica platform by integrating four Informatica products into a single, compact tool. There are no database dependencies, and the product installs in just under 10 minutes. Much to my own surprise, I use PowerCenter Express quite often in going about the various aspects of my job with Informatica. I have it installed on my laptop so it travels with me wherever I go. It starts up quickly, so it’s ideal for getting a little work done on an airplane.

For example, recently I wanted to explore building some rules for an upcoming proof of concept on a plane ride home, so I could claw back some personal time for my weekend. I used PowerCenter Express to profile some data and create a mapping. And this mapping wasn’t something I needed to throw away and recreate in an enterprise version after my flight landed. Vibe, Informatica’s build-once/run-anywhere metadata-driven architecture, allows me to export a mapping I create in PowerCenter Express to one of the enterprise versions of Informatica’s products such as PowerCenter, DataQuality or Informatica Cloud.
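
For readers who haven’t seen data profiling before, here’s a minimal illustration of the kind of quick check I’m describing, written in plain Python rather than PowerCenter Express, with an invented sample file:

```python
import csv
import io

# Hypothetical extract standing in for a real source file.
sample = io.StringIO(
    "customer_id,email,country\n"
    "1,a@example.com,US\n"
    "2,,US\n"
    "3,c@example.com,\n"
)

def profile(reader):
    """Per-column row count, null count, and distinct-value count -- the basics of a data profile."""
    stats = {}
    for row in reader:
        for col, val in row.items():
            s = stats.setdefault(col, {"rows": 0, "nulls": 0, "distinct": set()})
            s["rows"] += 1
            if val:
                s["distinct"].add(val)
            else:
                s["nulls"] += 1
    return {c: {"rows": s["rows"], "nulls": s["nulls"], "distinct": len(s["distinct"])}
            for c, s in stats.items()}

print(profile(csv.DictReader(sample)))
```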

As I alluded to earlier in this article, being a free offering I honestly didn’t expect too much from PowerCenter Express when I first started exploring it. However, due to my own positive experiences, I now like to think of PowerCenter Express as the Swiss Army Knife of Data Integration.

To start claiming back some of your personal time, get started with the free version of PowerCenter Express, found on the Informatica Marketplace at:  https://community.informatica.com/solutions/pcexpress

Business Use Case for PowerCenter Express
