Data scientist’s toolbox

Some months ago I understood something I will never forget: the work of a data scientist resembles the work of a carpenter, and I believe the comparison holds. When a carpenter needs to solve a problem, he opens his toolbox and uses a specific tool for each task. Like the carpenter, the data scientist needs to have in his data analysis and data mining toolbox all the techniques and tools necessary for extracting useful information, because it is impossible to face a data science problem with only statistical techniques, or with only five or six data mining methods. “The more tools in the box, the better the solution.”

5 Comments

  1. In operations improvement, tools (e.g., six sigma, lean, metrics) too often become an end in themselves (i.e., now that I have a hammer, everything looks like a nail). How do data scientists avoid this dysfunction?

    • Hello
      In operations improvement, data scientists should be careful when solving problems with data. I think there is no obvious mechanism to prevent the dysfunction you raise; it depends on the data scientist’s experience and expertise in using the “tools”. It is true that some tools become useless at certain stages of the process, so it is important to remember, and I always warn, that no problem resembles another, even when the data are the same. Sometimes the time factor makes the difference, and the same tool that you use now will become useless tomorrow.
      Thank you very much for commenting.
      Best regards

  2. guycuthbert says:

    More tools ≠ better solution, sorry.

    Most carpenters will carry a reasonable selection, but work with a favoured few; adding another power drill, circular saw or router doesn’t result in a better result. Knowing how to use the tools at hand, and their specialities, subtleties and limitations, is more useful than having a large collection.

    Yes, it’s important to include a selection, but only one or two from each of the major areas:
    * Data munging
    * Discovery / DQ assessment
    * Visualisation
    * Statistical profiling
    etc.

    Don’t be seduced by the idea that more tools = better tradesmen. It’s just not true.

    • Hello,
      I agree with your comment, but it does not contradict what I said. One of the major problems for data scientists is precisely not being sure which data analysis or data mining method is best to use once they have the data in hand. Of course, trying to apply too many “tools” to the same problem could be a disaster. When I say that the more tools the carpenter has in his box, the better the solution, it follows logically that the carpenter should know how to use each tool; otherwise he would not be a good carpenter. Combining some tools, taking a few from each area according to the problem, is beneficial when facing a data mining or data analysis problem. The more tools, and the better you know how to use them, the better the solution.
      Thank you very much for commenting.
      Best regards
