Tag Archives: Data Transformation
The thing that resonates today, in the odd context of big data, is that we may all need to look in the mirror, hold a thumb drive full of information in our hands, and concede once and for all: It’s not the data… it’s us.
Many organizations have a hard time making something useful from the ever-expanding universe of big-data, but the problem doesn’t lie with the data: It’s a people problem.
The contention is that big-data is falling short of the hype because people are:
- too unwilling to create cultures that value standardized, efficient, and repeatable information, and
- too complex to be reduced to “thin data” created from digital traces.
Evan Stubbs describes poor data quality as the data analyst’s single greatest problem.
About the only satisfying thing about having bad data is the schadenfreude that goes along with it. There’s cold solace in knowing that regardless of how poor your data is, everyone else’s is equally as bad. The thing is poor quality data doesn’t just appear from the ether. It’s created. Leave the dirty dishes for long enough and you’ll end up with cockroaches and cholera. Ignore data quality and eventually you’ll have black holes of untrustworthy information. Here’s the hard truth: we’re the reason bad data exists.
I will tell you that most data teams make “large efforts” to scrub their data. Those “infrequent” big cleanups, however, only treat the symptom, not the cause – and ultimately lead to inefficiency, cost, and even more frustration.
It’s intuitive and natural to think that data quality is a technological problem. It’s not; it’s a cultural problem. The real answer is that you need to create a culture that values standardized, efficient, and repeatable information.
If you do that, then you’ll be able to create data that is re-usable, efficient, and high quality. Rather than trying to manage a shanty of half-baked source tables, effective teams put the effort into designing, maintaining, and documenting their data. Instead of being a one-off activity, it becomes part of business as usual, something that’s simply part of daily life.
However, even if that data is the best it can possibly be, is it even capable of delivering on the big-data promise of greater insights about things like the habits, needs, and desires of customers?
Despite the enormous growth of data and the success of a few companies like Amazon and Netflix, “the reality is that deeper insights for most organizations remain elusive,” write Mikkel Rasmussen and Christian Madsbjerg in a Bloomberg Businessweek blog post that argues “big-data gets people wrong.”
Big-data delivers thin data. In the social sciences, we distinguish between two types of human behavior data. The first – thin data – is from digital traces: He wears a size 8, has blue eyes, and drinks pinot noir. The second – rich data – delivers an understanding of how people actually experience the world: He could smell the grass after the rain, he looked at her in that special way, and the new running shoes made him look faster. Big-data focuses solely on correlation, paying no attention to causality. What good is thin “information” when there is no insight into what your consumers actually think and feel?
Accenture reported only 20 percent of the companies it profiled had found a proven causal link between “what they measure and the outcomes they are intending to drive.”
Now, I contend the keys to transforming big-data into strategic value are critical thinking skills.
Where do we get such skills? People, it seems, are both the problem and the solution. Are we failing on two fronts: failing to create the right data-driven cultures, and failing to interpret the data we collect?
By now, the business benefits of effectively leveraging big data have become well known. Enhanced analytical capabilities, greater understanding of customers, and the ability to predict trends before they happen are just some of the advantages. But big data doesn’t just appear and present itself. It needs to be made tangible to the business. All too often, executives are intimidated by the concept of big data, thinking the only way to work with it is to have an advanced degree in statistics.
There are ways to make big data more than an abstract concept that can only be loved by data scientists. Four of these ways were recently covered in a report by David Stodder, director of business intelligence research for TDWI, as part of TDWI’s special report on What Works in Big Data.
Experiment with real-time analytics
The time is ripe for experimentation with real-time, interactive analytics technologies, Stodder says. The next major step in the movement toward big data is enabling real-time or near-real-time delivery of information. Delivering real-time data through BI systems has been a challenge for years, with only limited success, Stodder says. The good news is that the Hadoop framework, originally built for batch processing, now supports interactive querying and streaming applications, he reports. This opens the way for real-time processing of big data.
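The micro-batch model behind many streaming engines can be illustrated with a minimal, library-free sketch: events arrive as a stream, and results are computed per small batch so they are available while data is still flowing. The event shape and batch size here are illustrative, not tied to any particular Hadoop tool.

```python
from collections import Counter

def micro_batches(events, batch_size=3):
    """Group a stream of events into fixed-size micro-batches,
    mimicking the micro-batch model used by streaming engines."""
    batch = []
    for event in events:
        batch.append(event)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # flush any trailing partial batch
        yield batch

def process_stream(events):
    """Count events per type within each micro-batch, so partial
    results are ready before the whole stream has arrived."""
    results = []
    for batch in micro_batches(events):
        results.append(dict(Counter(e["type"] for e in batch)))
    return results

stream = [{"type": t} for t in
          ["click", "view", "click", "view", "view", "click", "click"]]
print(process_stream(stream))
# → [{'click': 2, 'view': 1}, {'view': 2, 'click': 1}, {'click': 1}]
```

A production system would replace the in-memory list with a message source such as a Kafka topic, but the batching-and-aggregating loop is the same idea.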
Design for self-service
Interest in self-service access to analytical data continues to grow. “Increasing users’ self-reliance and reducing their dependence on IT are broadly shared goals,” Stodder says. “Nontechnical users—those not well versed in writing queries or navigating data schemas—are requesting to do more on their own.” There is an impressive array of self-service tools and platforms now appearing on the market. “Many tools automate steps for underlying data access and integration, enabling users to do more source selection and transformation on their own, including for data from Hadoop files,” he says. “In addition, new tools are hitting the market that put greater emphasis on exploratory analytics over traditional BI reporting; these are aimed at the needs of users who want to access raw big data files, perform ad-hoc requests routinely, and invoke transformations after data extraction and loading (that is, ELT) rather than before.”
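The ELT pattern Stodder mentions – invoking transformations after extraction and loading rather than before – can be sketched with SQLite standing in for the target store. Table and column names here are illustrative:

```python
import sqlite3

# Minimal ELT sketch: raw data is loaded into the target first,
# then transformed there with SQL (names are illustrative).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_sales (region TEXT, amount TEXT)")
conn.executemany(
    "INSERT INTO raw_sales VALUES (?, ?)",
    [("north", "100.5"), ("south", "200.0"), ("north", "49.5")],
)

# Transform after loading: cast the raw text to numbers and aggregate.
conn.execute("""
    CREATE TABLE sales_by_region AS
    SELECT region, SUM(CAST(amount AS REAL)) AS total
    FROM raw_sales
    GROUP BY region
""")
for row in conn.execute(
        "SELECT region, total FROM sales_by_region ORDER BY region"):
    print(row)
# → ('north', 150.0)
# → ('south', 200.0)
```

Because the raw table is loaded untouched, analysts can re-run or change the transformation step on their own without going back to the source system – the self-reliance the quote describes.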
Embrace data visualization
Nothing gets a point across faster than having data points visually displayed – decision-makers can draw inferences within seconds. “Data visualization has been an important component of BI and analytics for a long time, but it takes on added significance in the era of big data,” Stodder says. “As expressions of meaning, visualizations are becoming a critical way for users to collaborate on data; users can share visualizations linked to text annotations as well as other types of content, such as pictures, audio files, and maps to put together comprehensive, shared views.”
Unify views of data
Users are working with many different data types these days, and are looking to bring this information into a single view – “rather than having to move from one interface to another to view data in disparate silos,” says Stodder. Unstructured data – graphics and video files – can also provide a fuller context to reports, he adds.
One up-and-coming use case in capital markets that we are excited about is front-office real-time risk analytics on streaming market data: decreasing risk by informing traders, in real time, about potential changes to trading strategies based on the most up-to-date data possible.
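As a hedged sketch of that use case, the loop below tracks running exposure over a stream of trade ticks and flags any tick that pushes it past a limit. The field names, the exposure formula, and the limit are all illustrative assumptions, not a real risk model:

```python
def monitor_exposure(ticks, limit):
    """Track running position exposure over a stream of trade ticks
    and flag any tick that pushes exposure past the limit.
    Field names and the limit are illustrative."""
    exposure = 0.0
    alerts = []
    for tick in ticks:
        # Signed quantity times price approximates the position change.
        exposure += tick["qty"] * tick["price"]
        if abs(exposure) > limit:
            alerts.append((tick["symbol"], round(exposure, 2)))
    return alerts

ticks = [
    {"symbol": "ABC", "qty": 100, "price": 10.0},
    {"symbol": "ABC", "qty": 500, "price": 10.2},
    {"symbol": "ABC", "qty": -200, "price": 10.1},
]
print(monitor_exposure(ticks, limit=5000))
# → [('ABC', 6100.0)]
```

The point is the shape of the computation: each tick updates state and can trigger an alert immediately, rather than waiting for an end-of-day batch.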
Why have B2B solutions been defined as outside the firewall? Increasingly, we are seeing organisations that have evolved, or are evolving, into amorphous structures that are hard to define as either single or multiple entities. Organisations that were traditionally compartmentalized into a specific industry, offering a specific service, now frequently focus on their core abilities and organizational strengths and apply them to serve other departments, subsidiaries, or even organisations in entirely separate industries. Areas of expertise such as payments processing for financial services companies, or bill processing by telecommunications operators, are seen as revenue generators, not corporate overheads. We also see supposedly key processes, such as call centers and networks, being outsourced. Organisations are diversifying in the hope that owning the customer or consumer will be enough to resell associated products or services that they have branded or can acquire from other departments or suppliers. It is a long time since organisations have had to own all their product and service creation and delivery functions, however successful and capable those functions are; but they do have to maximize the revenue and benefits they receive from them.
All in all, the hairball of inter-relationships within and beyond the organisation is becoming more and more convoluted. The traditional supply chain and its management, typically seen as the lifeblood of industries such as manufacturing and retail, has increasingly been adopted by industries and sectors as diverse as financial services, telecommunications, and the public sector, which rely on partner organisations for key parts of their product and service creation, delivery, and support.
But this evolution throws up significant issues as well as benefits.
A major issue is the management and control of access to data, and security compliance. Visible security management, access control, and auditability are prerequisites of any customer data integration solution, yet data and access from partners and from within an organisation are frequently treated as separate processes.
The ability to respond swiftly to changing business and market requirements means not only managing new partnerships and data flows, but also recognising that the organisation or department providing your core services last week may not be the same one next week.
This all means that traditional B2B data flows can now be rethought. The benefits of B2B solutions – partner on-boarding processes and management, data format transformations, and managed file transfer – are just as relevant within an organisation and between its departments as when connecting external partner organisations.
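A typical data format transformation in such a flow is converting a partner’s delimited feed into the structure an internal application expects. A minimal sketch, assuming a hypothetical CSV order feed and a JSON-consuming target:

```python
import csv
import io
import json

# Hypothetical partner feed: CSV text as it might arrive via
# managed file transfer (field names are illustrative).
partner_csv = "order_id,amount\n1001,250.00\n1002,99.95\n"

def csv_to_json(text):
    """Transform a CSV feed into the JSON an internal
    application consumes, casting amounts to numbers."""
    rows = list(csv.DictReader(io.StringIO(text)))
    for row in rows:
        row["amount"] = float(row["amount"])
    return json.dumps(rows)

print(csv_to_json(partner_csv))
```

The same transformation applies unchanged whether the feed comes from an external partner or from another department inside the firewall – which is the point of the paragraph above.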
The ability to link and manage data-publishing organisations, systems, and applications together with the applications within your organisation or department that consume them is just as relevant inside the firewall as outside it.
And if you can integrate external organisations’ data with internal departments’ data, then you are well on the road to solving the problems of business change, data security, and regulatory compliance – and maximising the value of your most important asset: data.
Remote Data Collection and Transformation – with Ultra Messaging Cache Option and B2B Data Transformation
Sometimes when I drive past an electronic tollway collection sensor, I wonder about the amount of data it must generate. I’m no expert on such technology, but at a minimum, the RFID sensor has to read the chip in your car and log the date and time plus your RFID info, and then a camera takes a picture to catch any potential violators. Now multiply that data by the hundreds of thousands of cars that drive such roads every day, and by the number of sensors they pass, and I’m quite sure the total exceeds several million messages per day.
In the evolution of Billing ‘thinking’ for Telcos we’ve seen everything from ‘All you can eat’ offers to ‘Another coin in the slot’. But this perennial business-process black hole can prove to be an area that adds to a Telco’s armoury in retaining its corporate customers and keeping them happy. Not only that: following lessons learnt by the Financial Services community, it can provide early warnings of customer, partner, and service exposure – significant benefits to any organisation’s Revenue Assurance efforts.
The Integrated Customer Service Hub has evolved to give customers, frequently high-value corporate organisations, online access first to Billing information, then expanding to encompass other operational data such as new service orders and provisioning data, trouble tickets, and service usage data. Increasingly, customers want to be more in control of their services, so the hub has evolved further to allow customer self-service, letting them place orders and receive information in the format that works for them, not just for their telecommunications service supplier.