Why domain knowledge is so important in practice

Melissa Anthony
4 min read · Jul 30, 2020


Data can be fascinating. It can be a lot of fun to jump into an Excel file or a database and start putting together filters to find insights. But what is data? Data is a collection of behaviors attributed to people and their practices. One big aspect of human civilization, one that has allowed us as a species to dominate the world, is language.

Language is the cornerstone of domain knowledge.

I was a GIS supervisor for a medium-sized electric co-op for 11 years. I started as a technician, fresh out of a GIS master's program, and was thrown into converting an AutoCAD-based mapping system into an ArcMap file structure. From there, I helped create the database from the ground up. Looking back at those early years, the thing I struggled with the most was domain knowledge.

I was brand new to electricity, and I got my education while I was building a relational database to map all the features of an electrical grid system. During this time, I followed field crews around to see in person the things we were abstracting onto maps. I also sat with electrical engineers to see how they used our maps to design new construction. Later, I sat with system improvement engineers to understand how they used map grid lines to improve reliability across the entire network.

Electricity explained.

A pattern emerged among the groups. Let me illustrate this with an example. On an electrical grid system, there is a device called a fuse. It is very similar to the fuse you have to change in your car when a headlight stops working. As long as it is one whole piece, electricity can pass through from source to destination. A fuse is designed to break when there is a surge of energy going from destination to source. Why does it do this? For a multitude of reasons, but the most important one is safety. On an electric grid system, a surge of energy from a customer’s home back onto the line, perhaps coming from a generator, can be deadly for crews working on that line. Or, if a piece of equipment on the line fails, a fuse can detect the surge of energy demanded by an out-of-control device such as a transformer.

When I rode with the line crews, they referred to that fuse as a cut-out. That is because, from their perspective, cut-out describes what a fuse does. A tripped fuse “cuts out the electrical current from the grid downstream to the device”. This naming convention was a quick way for me to understand how to talk to the linemen. Even if I didn’t know the name of a device, I could describe its function in a way that mattered to crews. Part of keeping a database current is getting updates about late-night equipment failures, so if I needed clarification about a service crew update, I could talk to them in this functional way.

It was different when I sat with electrical engineers, who talked about fuses in another way. Engineers saw the function of the fuse from a project perspective. Field crews saw it as a generic part; when one went out, they grabbed whatever was in stock. The engineers were responsible for inventory lists, so they saw fuses from a part number and model point of view, which was much more precise than what the field crews cared about. These projects were most likely going to contract companies, so the model number also functioned as a common vocabulary across companies. Lastly, the precision came from having to take into account the entire electrical grid system and possible future development that could introduce more problems on the grid. So when I talked to engineers, I needed to describe a fuse from this more precise model perspective.
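One way to picture this in a database is a single device record that answers to both vocabularies. The following is only a minimal sketch under stated assumptions: the `DeviceType` class, the helper functions, and the model numbers `FX-100` and `FX-200` are all made up for illustration and do not come from the system described here.

```python
from dataclasses import dataclass, field


@dataclass
class DeviceType:
    """One grid device, plus the names different groups use for it (hypothetical schema)."""
    canonical_name: str
    crew_terms: set[str] = field(default_factory=set)     # functional names used in the field
    model_numbers: set[str] = field(default_factory=set)  # precise names used by engineers


def build_lookup(device_types):
    """Map every known name, lowercased, to its device record."""
    lookup = {}
    for device in device_types:
        names = {device.canonical_name, *device.crew_terms, *device.model_numbers}
        for name in names:
            lookup[name.lower()] = device
    return lookup


def resolve(term, lookup):
    """Return the device a term refers to, or None if the term is unknown."""
    return lookup.get(term.strip().lower())


# Invented example data: the model numbers are placeholders, not real parts.
fuse = DeviceType(
    canonical_name="fuse",
    crew_terms={"cut-out", "cutout"},
    model_numbers={"FX-100", "FX-200"},
)
lookup = build_lookup([fuse])
```

With a table like this, a late-night crew note that mentions a "cut-out" and an engineer's packet that lists a model number both resolve to the same fuse record, so neither group has to learn the other's vocabulary to keep the data current.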

What does this have to do with the data? Well, all databases are designed to be used. If we don’t design our databases so that non-data people can use them, then they won’t use them. This is why clean data starts with domain knowledge. I mentioned that keeping a database up to date means collecting data from users. If users don’t understand what is being collected, they won’t be able to provide that information. Any work in data is largely a translation from behavior to numbers.

Not only does domain knowledge lead to a cleaner dataset, since the needed information is understood at the point of collection, but it also leads to buy-in for data products. Field crews and engineers don’t know how to say, “I need a heat map of where the greatest customer demand is for project A”. But by understanding what each group needed, I was able to provide a cleaner map. Once they saw the value of data in supporting their tasks and projects, they started asking for more products. This led to a tracking system for changing out meters to smart meters, i.e. where the field crews should go each day to change out meters. It also led to an integrated design system for engineers, built on the mapping system, in which each component carried a specific model number. This allowed them to see the whole system, place model numbers while designing, and spit out an inventory list to add to their project packets.
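The inventory-list idea comes down to counting model numbers across a design. Here is a minimal sketch, assuming a design is just a list of (location, model number) pairs; the function name, the location labels, and the model numbers are hypothetical and are not taken from the actual design system.

```python
from collections import Counter


def inventory_list(placed_components):
    """Count how many of each model number a design uses.

    placed_components: iterable of (location, model_number) pairs.
    Returns a list of (model_number, quantity) pairs sorted by model number.
    """
    counts = Counter(model for _location, model in placed_components)
    return sorted(counts.items())


# Invented example design: two fuses of one model and one transformer.
design = [
    ("pole-1", "FX-100"),
    ("pole-2", "FX-100"),
    ("pole-2", "TX-25"),
]

for model, quantity in inventory_list(design):
    print(f"{model}: {quantity}")
```

The point of the design choice is that the engineer only places parts on the map in the terms they already use, and the list for the project packet falls out of the data.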

Insights must be useful.

As a data person, you may not have time to learn all the intricacies of a domain. It took me 11 years to learn the electric grid system, and I was still learning something new every day. But there are certainly subject matter experts for every aspect of the data you are looking at. My advice is to find those experts and gain a basic understanding of the language they speak. This will give you most of what you need to get not just insight but, more importantly, useful insight.
