Data lineage
data.world is a cloud-native data catalog built on a knowledge graph. Our platform gives organizations more transparency into their data stack and helps them simplify data discovery and access. The tools within the platform are created with both technical and business-minded folks in mind, providing easy-to-use features for data discovery, governance, enrichment, and much more.
The main use cases we wanted to address with this feature were to:
Help users build trust in their data
Allow data engineers to conduct simple risk analysis
From the outset of the project, we faced significant pressure from the rest of the organization because these data lineage capabilities would add an immense amount of strategic value to our product. We also had to introduce a new interaction model in the platform that was completely different from our usual interactions on a webpage. In August 2022, near the end of this work stream, we were able to unveil our new feature at the Gartner Data and Analytics Summit to a wide variety of interested and excited prospects.
Company
data.world
In collaboration with
CTO (for initial proof of concept), Engineering, Product
Biggest challenges
Balancing technical complexity with good UX
What was exciting
Introducing a completely new type of interaction to the platform
THE PROBLEM
Data consumers needed an easier way to understand the provenance of their data, and data engineers needed a faster risk analysis workflow.
Initially, this project was presented as a business problem, not a user pain point. The responsibility fell on me to utilize the resources we had access to and dive deeper to uncover actual user problems. It was a challenge to understand this technical space and sift through the marketing jargon, but luckily I was able to work closely with customer-facing teams to surface the right information. We went through many user interviews and brainstorming sessions to land on the simple problem statement mentioned above.
THE PROCESS
Developed a proof-of-concept and followed up with quick iterations
Our users for the proof-of-concept were the sales engineering team. I worked closely with our CTO to assemble the first version, which we used to demonstrate our underlying technical capabilities and simple UX. The challenge of this initial version was using what was available to us in the Cytoscape library to display the most relevant information in the simplest visual format. Once we had this first version available to test, we started narrowing down our use cases and figuring out the vision for this feature.
Getting creative with exploratory research: We used our external-facing teams as a proxy for initial feedback and insights
Since we were in a competitive landscape, we couldn’t do any exploratory user research with our customers and users. As an alternative, I conducted interviews with our solutions engineering team to learn how lineage could deliver value to our customers, and how they thought it fit within our suite of solutions. I also held a workshop with our sales and product teams to hear everyone’s thoughts on the actual problems they were aware of and the value we wanted to deliver.
In the end, we boiled it down to two main problems to address in the MVP: (1) making it easier to understand data provenance and (2) simplifying risk analysis work. At this stage, our definition of success was getting it in the hands of our customers. If we started getting feedback on bugs or improvements, that would signal to us that users were actually utilizing the feature and that there was an appetite for more.
Utilized quick sketches to break down the feature, collaborate on micro-interactions, and speed up delivery
This was a new feature with a broad set of problems, so I started off with high-level sketches to give the broader team an idea of what we were trying to build and what the user experience could look like. As with many complex concepts, it was easier to discuss ideas and nuanced problems through visuals. We were able to point to these sketches to discuss different phases of work, priorities, and interactions to build. This quick and easy form of communication was highly beneficial since I was (unfortunately) designing as engineers were implementing.
The initial research and discussion about requirements took the most time. Once it came down to high-fidelity designs, it was just a matter of iteration and collaboration with engineers to discuss feasibility and trade-offs. As new complexities arose, I worked closely with our engineers to deliver an experience that fit the scope of our MVP and still helped users address their problems.
THE SOLUTION
Explorer lineage: A well-integrated, user-friendly column-level lineage feature that gives users visibility into the context of their data.
Imagine a data analyst needs to look for a data report. They go to data.world and find one that looks like what they need for their analysis. However, they want to be sure that the data funneling in is accurate, so they scroll down to the lineage section to see what is powering it.
They can see that it's not as simple as they thought and the upstream data pipeline is rather complex. They can easily open up the full-screen visualization and see the string of transformations and processes the data went through to get to the report.
(It’s hard to zoom in on Squarespace, but you can look at the images up close here. The link includes the images below as well.)
In the full-screen view, they can see all the details in each node, such as the status and data owner. Accuracy and evidence are important for this report, so they also click “Download as CSV” to get a tabular format to attach to their analysis.
Data engineers who need to conduct impact/risk analysis would use this set of features as well. They would want to see the downstream effect of their changes to the data pipeline. The visualization would make it easy to understand at a glance, but the tabular export would give them an actual report of who to notify and work with when those downstream changes hit.
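To make the impact analysis idea concrete, here is a minimal Python sketch of how a downstream report could be assembled. It is only an illustration under my own assumptions, not how data.world implements lineage: the resource names, the owner emails, and the functions downstream_impact and impact_report_csv are all hypothetical. It walks a small lineage graph breadth-first from the resource being changed and writes each affected resource, its distance downstream, and its owner to CSV.

```python
import csv
import io
from collections import deque

# Hypothetical lineage edges: each key feeds the resources listed as its value.
DOWNSTREAM = {
    "raw.orders": ["staging.orders"],
    "staging.orders": ["marts.revenue", "marts.churn"],
    "marts.revenue": ["report.quarterly_revenue"],
    "marts.churn": [],
    "report.quarterly_revenue": [],
}

# Hypothetical owner metadata, standing in for the data owner shown on each node.
OWNERS = {
    "marts.revenue": "finance-data@example.com",
    "marts.churn": "growth-data@example.com",
    "report.quarterly_revenue": "analytics@example.com",
}


def downstream_impact(start, edges):
    """Return (resource, hops) for everything reachable downstream of start."""
    seen = {start}
    queue = deque([(start, 0)])
    impacted = []
    while queue:
        node, hops = queue.popleft()
        for child in edges.get(node, []):
            if child not in seen:  # guard against revisiting shared descendants
                seen.add(child)
                impacted.append((child, hops + 1))
                queue.append((child, hops + 1))
    return impacted


def impact_report_csv(start, edges, owners):
    """Build a CSV string listing each impacted resource and who to notify."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["resource", "hops_downstream", "owner"])
    for resource, hops in downstream_impact(start, edges):
        writer.writerow([resource, hops, owners.get(resource, "unknown")])
    return buffer.getvalue()


if __name__ == "__main__":
    print(impact_report_csv("staging.orders", DOWNSTREAM, OWNERS))
```

In this sketch, the breadth-first order means the closest dependencies appear first in the export, which roughly mirrors reading the visualization from left to right.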
THE IMPACT
“The look and feel of it is a thousand times better than [our competitor]… I was really impressed.” -Customer
This work has directly helped our organization meet table-stakes requirements for a number of deals. It has also enabled us to take our biggest step into the data governance space, and having a user-friendly lineage feature helps visually communicate the power of our backend technology. We have been slowly rolling this feature out to our beta users and introducing it to other customers. On the user experience front, we’ve gotten very positive feedback from users who have also evaluated our competitors!
This project has given me the opportunity to lead the design of a feature from scratch and directly contribute to an urgent strategic initiative. During the time frame of this project, we also introduced a new team of Product Managers to the organization, and our team has been able to deliver in a timely manner despite all the process changes.