Data Science should change how your business is run
The importance of data science is becoming more and more clear. Marc Benioff says, “I think for every company, the revolution in data science will fundamentally change how we run our business.” He adds, “There’s just a huge amount more data than ever before, our greatest challenge is making sense of that data.” He goes on to say that “we need a new generation of executives who understand how to manage and lead through data. And we also need a new generation of employees who are able to help us organize and structure our business around data.” Benioff then says, “when I look at the next set of technologies that we have to build at Salesforce, it is all data science based technology.” Ram Charan, in his Fortune Magazine article, says that “to thrive, companies, and the execs who run them, must transform into math machines” (The Algorithmic CEO, Fortune Magazine, March 2015, page 45).
With such powerful endorsements for data science, the question you may be asking is when you should hire a data scientist or two. There is more than one answer. I liken data science to any business research: you need to do your homework up front for the data scientists you hire to be effective.
Create a situation analysis before you start
You need to start by defining your problem: are you losing sales, taking too long to manufacture something, or less profitable than you would like to be? The list goes on. Next, you should create a situation analysis. You want to arm your data scientists with as much information as possible to define what you want them to solve or change. Make sure that you are as concrete as possible here. Data scientists struggle when the business people they work with are vague. It is also important that you indicate what kinds of business changes will be considered if the model and data deliver this result or that result.
Next, you need to catalog the data you already have that is relevant to the business problem. Without relevant data, there is little that the data scientist can do to help you. With relevant data sources in hand, you need to define the range of actions that you can take once a model has been created.
Be realistic about what is required
With these things in hand, it may be time to hire some data scientists. As you start your process, you need to be realistic about the difficulty of getting a top-flight data scientist. Many of my customers have complained about the difficulty of competing with Google and tech startups. Just as important, “there is a huge variance in the quality and ability of data scientists” (Data Science for Business, Foster Provost, O’Reilly, page 321). Once you have hired someone, you need to keep in mind that effective data science requires collaboration between the business and the data scientists. Please also know that data scientists struggle when business people don’t appreciate the effort needed to get an appropriate training data set or to set up model evaluation procedures.
Make sure internal or external data scientists give you an effective proposal
Once your data scientists are in place, you should realize that a data scientist worth their salt will create a proposal back to you. As we have said, it is important that you know what kinds of things will happen if the model and data deliver this result or that result. Data scientists, in turn, will be able to narrow things down to a dollar impact.
Their proposal should start by sharing their understanding of the business and the data that is available. What business problems are they trying to solve? Next, the data scientists may define things like whether supervised or unsupervised learning will be used. They should then openly discuss what effort will be involved in data preparation, including the values for the target variable (the variable whose values will be predicted). After that, they should describe their modeling approach: whether more than one model will be evaluated, how the models will be compared, and how the final model will be selected. Finally, they should discuss how the model will be evaluated and deployed. Are there evaluation and setup metrics? Data scientists can also dedicate time and resources in their proposal to determining which drivers are real and which are merely expected.
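To make the model comparison step of such a proposal concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the tiny churn data set is made up, and the two candidate models (a majority-class baseline and a simple inactivity threshold) are hypothetical stand-ins for whatever models a real proposal would name. It shows one common pattern, not the only one: fit each candidate on training data, score it on held-out data, and select the best scorer.

```python
# Hypothetical labelled records: (months_since_last_purchase, churned).
# The target variable is `churned` (1 = customer left, 0 = customer stayed).
TRAIN = [(1, 0), (2, 0), (3, 0), (4, 0), (5, 1), (8, 1), (9, 1), (2, 0)]
HOLDOUT = [(1, 0), (3, 0), (6, 1), (7, 1), (4, 1), (2, 0)]


def accuracy(model, rows):
    """Share of rows where the model's prediction matches the target."""
    hits = sum(1 for months, label in rows if model(months) == label)
    return hits / len(rows)


def fit_baseline(rows):
    """Candidate 1: always predict the most common target value."""
    positives = sum(label for _, label in rows)
    majority = 1 if positives * 2 > len(rows) else 0
    return lambda months: majority


def fit_threshold(rows):
    """Candidate 2: predict churn at or above the inactivity cut-off
    that gives the best accuracy on the training rows."""
    best_cut, best_acc = None, -1.0
    for cut in sorted({months for months, _ in rows}):
        acc = accuracy(lambda months, c=cut: 1 if months >= c else 0, rows)
        if acc > best_acc:
            best_cut, best_acc = cut, acc
    return lambda months: 1 if months >= best_cut else 0


def compare(candidates, train, holdout):
    """Fit every candidate on `train`, score it on `holdout`,
    and return the scores together with the best-scoring name."""
    scores = {name: accuracy(fit(train), holdout)
              for name, fit in candidates.items()}
    winner = max(scores, key=scores.get)
    return scores, winner


CANDIDATES = {"baseline": fit_baseline, "threshold": fit_threshold}
SCORES, WINNER = compare(CANDIDATES, TRAIN, HOLDOUT)

if __name__ == "__main__":
    for name, score in SCORES.items():
        print(f"{name}: holdout accuracy = {score:.2f}")
    print("selected model:", WINNER)
```

Including a trivial baseline is a deliberate choice: it gives business readers a reference point for judging whether a more elaborate model is worth its cost. Accuracy is used here only because it is easy to explain; a real proposal would pick a metric tied to the business problem.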
To make all this work, it can be a good idea for data scientists to talk in their proposal about likelihood, because business people who have not been through a quantitative MBA often do not understand or remember statistics. It is also important that data scientists, before they begin, ask business people the “so what” questions if the situation analysis is inadequate.
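One way to talk about likelihood in terms a business audience can act on is expected value: weight the dollar value of an outcome by how likely it is, then subtract the cost of acting. The sketch below is illustrative only. The function names, the likelihoods, and the dollar figures are all hypothetical assumptions, not numbers from any real campaign.

```python
def expected_value(p_response, value_if_response, cost_of_action):
    """Expected dollar value of acting on one customer.

    p_response:        model's estimated likelihood (0..1) of a response
    value_if_response: dollars gained if the customer responds (assumed)
    cost_of_action:    dollars spent whether or not the customer responds
    """
    if not 0.0 <= p_response <= 1.0:
        raise ValueError("p_response must be a probability between 0 and 1")
    return p_response * value_if_response - cost_of_action


def customers_worth_targeting(scored_customers, value_if_response, cost_of_action):
    """Keep only the customers whose expected value is positive."""
    return [
        name
        for name, p in scored_customers
        if expected_value(p, value_if_response, cost_of_action) > 0
    ]


# Hypothetical model output: (customer, likelihood of responding to an offer).
SCORED = [("A", 0.02), ("B", 0.10), ("C", 0.30)]

if __name__ == "__main__":
    for name, p in SCORED:
        ev = expected_value(p, value_if_response=100.0, cost_of_action=5.0)
        print(f"customer {name}: likelihood {p:.0%}, expected value ${ev:.2f}")
    print("target:", customers_worth_targeting(SCORED, 100.0, 5.0))
```

Framed this way, a 10% likelihood stops being an abstract statistic: under these assumed figures it means the offer is worth making to customer B but not to customer A.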
Leading an internal analytics team
In some cases, analytical teams will be built internally. Where this occurs, it is really important that the analytic leader has good people skills. They also need to be able to set the expectation that people will make decisions from data and analysis. This includes having the ability to push back when someone comes to them with a recommendation based on gut feel.
The leader needs to hire smart analysts. To keep them, they need a stimulating and supportive work environment. Tom Davenport says analysts are motivated by interesting and challenging work that allows them to utilize their highly specialized skills. As with millennials, money is nice for analysts, but they are motivated more by exciting work and the opportunity to grow and stretch their skills. Please know that data scientists want to spend time refining analytical models rather than doing simple analyses and report generation. Most importantly, they want to do important work that makes a meaningful contribution. To do this, they want to feel supported and valued but have autonomy at work. This includes the freedom to organize their work. At the same time, analysts like to work together, and they like to be surrounded by other smart and capable colleagues. Make sure to treat your data scientists as a strategic resource. This means you need development plans, career plans, and performance management processes.
As we have discussed, make sure to do your homework before contracting or hiring data scientists. Once you have done your homework, if you are an analytic leader, make sure that you create a stimulating environment. Additionally, prove the value of analytics by signing up for results that demonstrate data modeling efficacy. To do this, look for business problems where analytics will make a big difference. And finally, if you need an analytics leader to emulate, look no further than Brian Cornell, the new CEO of Target.
Myles on Twitter: @MylesSuer
Enterprises use Hadoop in data-science applications that improve operational efficiency, grow revenues or reduce risk. Many of these data-intensive applications use Hadoop for log analysis, data mining, machine learning or image processing.
Commercial, open source or internally developed data-science applications have to tackle a lot of semi-structured, unstructured or raw data. They benefit from Hadoop’s combination of storage and processing in each data node, spread across a cluster of cost-effective commodity hardware. Hadoop’s lack of a fixed schema works particularly well for answering ad-hoc queries and exploratory “what if” scenarios.
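The idea of querying raw data without a fixed schema can be sketched in a few lines of plain Python. This is not Hadoop code and uses no Hadoop API; it is a single-machine illustration of the map and reduce pattern, with made-up log lines and hypothetical function names. Each record is interpreted only when it is read, so records with different fields, or lines that cannot be parsed at all, do not block an ad-hoc question such as "what is the purchase revenue per user?"

```python
import json
from collections import defaultdict

# Hypothetical semi-structured log lines: records do not share one schema.
RAW_LOGS = [
    '{"user": "u1", "action": "view", "page": "/home"}',
    '{"user": "u2", "action": "purchase", "amount": 40.0}',
    '{"user": "u1", "action": "purchase", "amount": 15.5, "coupon": "SPRING"}',
    'not valid json at all',
    '{"user": "u3", "action": "view"}',
]


def map_purchases(line):
    """Map step: interpret one raw line at read time and emit
    (user, amount) pairs for purchase records only."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError:
        return []  # skip raw lines that cannot be parsed
    if not isinstance(record, dict):
        return []
    if record.get("action") == "purchase" and "amount" in record:
        return [(record.get("user", "unknown"), float(record["amount"]))]
    return []


def reduce_totals(pairs):
    """Reduce step: sum the emitted amounts per user."""
    totals = defaultdict(float)
    for user, amount in pairs:
        totals[user] += amount
    return dict(totals)


def revenue_per_user(lines):
    """Run the map step over every line, then reduce the emitted pairs."""
    pairs = [pair for line in lines for pair in map_purchases(line)]
    return reduce_totals(pairs)


if __name__ == "__main__":
    print(revenue_per_user(RAW_LOGS))
```

In a real cluster the map step would run in parallel on the nodes that hold the data, and the reduce step would combine their outputs; the sketch keeps both on one machine so the logic is easy to follow.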