Category Archives: Application ILM
Columnar Deduplication and Column Tokenization: Improving Database Performance, Security and Interoperability
For some time now, a number of commercially available relational database management systems have implemented a technique known as columnar deduplication. In today’s blog post, I discuss the nature and benefits of this technique, which I will refer to as column tokenization for reasons that will become evident.
Column tokenization is a process in which a unique identifier (called a Token ID) is assigned to each unique value in a column and is then used to represent that value wherever it appears in the column. Using this approach, data size reductions of up to 50% can be achieved, depending on the number of unique values in the column (that is, on the column’s cardinality). Some RDBMSs use this technique simply as a way of compressing data: the column tokenization process is integrated into the buffer and I/O subsystems, and when a query is executed, each row must be materialized and the Token IDs replaced by their corresponding values. At Informatica, column tokenization is the core of the technology behind the File Archive Service (FAS), part of the Information Lifecycle Management product family: the tokenized structure is used directly during query execution, with row materialization occurring only when the final result set is returned. We also apply specialized compression algorithms to achieve further size reduction, typically on the order of 95%.
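To make the idea concrete, here is a minimal sketch of column tokenization in Python. The class and method names (TokenizedColumn, filter_equals, materialize) are hypothetical, invented for this illustration; they are not Informatica’s API or any vendor’s implementation. The sketch shows the two properties discussed above: a low-cardinality column shrinks because each distinct value is stored only once, and an equality predicate can be evaluated directly on the Token IDs, with row materialization deferred until the final result set is returned.

```python
from typing import Any


class TokenizedColumn:
    """Minimal sketch of column tokenization (also known as dictionary
    encoding): each unique value is stored once and assigned an integer
    Token ID, and the column itself holds only those small integers."""

    def __init__(self) -> None:
        self.value_to_id: dict[Any, int] = {}  # unique value -> Token ID
        self.id_to_value: list[Any] = []       # Token ID -> unique value
        self.token_ids: list[int] = []         # the column, stored as Token IDs

    def append(self, value: Any) -> None:
        token_id = self.value_to_id.get(value)
        if token_id is None:                   # first occurrence of this value
            token_id = len(self.id_to_value)
            self.value_to_id[value] = token_id
            self.id_to_value.append(value)
        self.token_ids.append(token_id)

    def filter_equals(self, value: Any) -> list[int]:
        """Evaluate an equality predicate on the tokenized form: one
        dictionary lookup, then integer comparisons only."""
        token_id = self.value_to_id.get(value)
        if token_id is None:
            return []
        return [row for row, tid in enumerate(self.token_ids) if tid == token_id]

    def materialize(self, rows: list[int]) -> list[Any]:
        """Swap Token IDs back for values, done only for the final result set."""
        return [self.id_to_value[self.token_ids[row]] for row in rows]


# A low-cardinality column: five rows, but only two distinct values stored.
col = TokenizedColumn()
for city in ["Denver", "Austin", "Denver", "Denver", "Austin"]:
    col.append(city)

matching_rows = col.filter_equals("Denver")   # [0, 2, 3]
print(col.materialize(matching_rows))         # ['Denver', 'Denver', 'Denver']
```

Note that the filter never touches the original string values: the predicate is translated into a single Token ID lookup, and only the rows that survive it are materialized. That deferral of materialization is the difference between using tokenization purely as storage compression and using the tokenized structure during query execution itself.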
I was at an IT conference a few years ago. The speaker was talking about application testing. At the beginning of his talk, he asked the audience:
“Please raise your hand if you flew here from out of town.”
Most of the audience raised their hands. The speaker then said:
“OK, now if you knew that the airplane you flew on had been tested the same way your company tests its applications, would you have still flown on that plane?”
After some uneasy chuckling, every hand went down. Not a great affirmation of the state of application testing in most IT shops. (more…)
I’ve been approached by a number of customers who are looking to archive data from their Salesforce application. There are two primary drivers I have heard cited:
- The need to manage the retention of Salesforce data and easily find and access it for legal eDiscovery
- Storage cost reduction for data that’s no longer active
The OAUG hosted its annual convention, Collaborate13, this week in Denver, Colorado. The week started out with beautiful spring weather and turned quickly into frigid temperatures with a snow flurry bonus. The rapid change in weather didn’t stop 4,000 attendees from elevating their application knowledge in the Mile High City. One topic that drew a particularly strong audience, from our perspective, was the evolution of database archiving. (more…)
Businesses retain information in an enterprise data archive either for compliance (to adhere to data retention regulations) or because business users are afraid to let go of data they are used to having access to. Many IT organizations have told us they retain data in archives because they are looking to cut infrastructure costs and do not have retention requirements clearly articulated by the business. As a result, enterprise data archiving has morphed into serving multiple purposes for IT: it eliminates the costs associated with maintaining aging data in production applications and allows business users to access the information on demand, all while adhering to whatever retention policies, if any, are known or defined. (more…)
Proactive performance management of systems and applications has always been an elusive goal for many organizations. We have enough fires to fight and issues to deal with in our day-to-day work that searching for performance problems ranks somewhere just below defragging our hard drives. At Informatica we talk with companies every day that try to manage their application performance proactively but lack the tools or process to make it happen. In this blog we will talk about the five keys to making proactive performance management a reality. (more…)
The digitization of everything is creating a data explosion near you. Whether data is accumulating in the data center, in the cloud, on your laptop or on your mobile device, too much of something isn’t always a good thing. In a recent webinar co-hosted by Informatica and Symantec, we polled our listeners to find out how the data explosion was impacting them. We also asked what types of unstructured and structured data are growing the fastest. Check out what they said. (more…)
According to analysts, teams spend the majority of the application development lifecycle in development and testing and the least amount of time in quality management and documentation. This is probably not very shocking to anyone in QA or on a testing team. But how much time is actually spent on test data management? In a recent webinar, more than half of the listeners polled said they spend between 30% and 40% of their effort on ‘data-related tasks.’ (more…)
Informatica recently hosted a webinar on Enterprise Data Archiving Best Practices with guest speakers Tony Baer of Ovum and Murali Rathnam of Symantec IT. With over 600 registrations, I would say that enterprise data archiving is not just hot, it is white hot. At least for Informatica. With Big Data entering the data center, organizations are looking for ways to make room, either in the budget or in the data center itself. Archiving is a proven approach that achieves both. Given the complexities and interconnections of enterprise applications, Enterprise Data Archive solutions based on market-leading technologies such as Informatica Data Archive can deliver on the value proposition while meeting tough requirements. (more…)
It’s official: “big data” is here to stay, and the solutions, concepts, hardware and services to support these massive implementations are going to continue to grow at a rapid pace. However, every organization has its own definition of big data and of the role it plays. One area where we are seeing a lot of activity is “big transaction data” for OLTP and relational databases, because relational databases often lack the management capabilities to scale transactional applications into the high-terabyte and petabyte range. In this post we will explore some ways that your existing OLTP system can scale without crushing IT and your budget in the process. (more…)