JW Player Tackles Data Discovery With CastorDoc
Bridging the Gap Between Data and Decision-Making: How JW Player Used CastorDoc to Democratize Data Access Across the Organization
One of the most pressing issues that emerged was data discovery, particularly for non-technical people who found it difficult to navigate the data warehouse
This customer story was contributed by Emily Hopper, Senior Data Scientist at JW Player. JW Player provides video hosting and streaming, advertising services, and analytics for a billion monthly unique viewers in 200 countries.
Introduction
As a Senior Data Scientist at JW Player, I work alongside a team of data professionals, including 5 data engineers, 3 customer-facing analysts, 3 data scientists, and 3 experts in product analytics, who focus on providing insights for internal data requests.
At JW Player, data plays a crucial role in helping us understand how our customers interact with our product and how we can improve it. By closely tracking how our clients use our products, we're able to provide more effective customer support and make more informed recommendations. This allows us to better address any issues and ensure that our clients get the most out of their investment.
In 2022, we partnered with Castor to help us improve the data discovery experience across the organization.
Castor has been instrumental in helping JW Player with two essential business aspects: data discovery and team productivity. Our aim is to make data available to all members of the company, regardless of their technical background. We envision a scenario where every employee can use Castor to harness the full potential of our data.
I - The Need For a Central Repository
“One of the most pressing issues that emerged was data discovery, particularly for non-technical people who found it difficult to navigate the data warehouse.” Emily Hopper, Sr Data Scientist, JW Player
When I joined JW Player in 2021, the company held a summit where all employees involved with data gathered to discuss data issues and ways we could enhance our data strategy. During the summit, colleagues highlighted the pain of not having a central repository for documentation in the company.
To address this, we conducted interviews with stakeholders including customer support, account managers, support engineers, data engineers, product managers, marketing, and anyone who might benefit from a data catalog. One of the most pressing issues that emerged was data discovery, particularly for non-technical people who found it difficult to navigate the data warehouse.
As a new employee, I personally struggled with this problem. Even after six weeks of onboarding, I had no idea how to find new information other than asking, and different people often provided different answers to my questions.
Given requests like "I need this information split out by account key," I would have to coordinate with colleagues across several teams to find all the right tables, so that I could join backend segmentation data with performance monitoring data and customer-support segments.
We were primarily looking for lineage tools, so that data producers would understand what downstream processes could be impacted by any changes made upstream.
We thought lineage would also be useful for building a single source of truth. People in the organization use different names and terms for the same things, which can be hard to follow. For example, metric X may be known by two different names across two different teams. We thought that if we could easily trace the source of metrics, it would be easier to know when metrics were the same (or different).
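To make the lineage idea concrete, here is a minimal sketch in Python of the question a data producer would ask: "if I change this table, what sits downstream of it?" This is an illustration only, not how Castor computes lineage. The `LINEAGE` mapping, the table and dashboard names, and the `downstream_of` helper are all hypothetical.

```python
from collections import deque

# Hypothetical lineage map: each asset points to the assets built directly from it.
LINEAGE = {
    "raw.plays": ["staging.plays_clean"],
    "staging.plays_clean": ["analytics.account_performance", "analytics.ad_revenue"],
    "analytics.account_performance": ["dashboard.support_overview"],
    "analytics.ad_revenue": [],
    "dashboard.support_overview": [],
}


def downstream_of(asset, lineage):
    """Return every asset reachable downstream of `asset`, in breadth-first order."""
    seen = set()
    order = []
    queue = deque(lineage.get(asset, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue  # already visited via another path
        seen.add(node)
        order.append(node)
        queue.extend(lineage.get(node, []))
    return order


if __name__ == "__main__":
    # List everything that could be affected by a change to the cleaned plays table.
    for affected in downstream_of("staging.plays_clean", LINEAGE):
        print(affected)
```

Walking the same graph in the opposite direction, from a metric back to its source tables, is one way to check whether two differently named metrics are in fact the same.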
We were also looking for a central repository for documentation, since our company's documentation was dispersed across various platforms like GitHub markdown pages, Google Docs, Confluence, custom knowledge websites, and Slack. The scattered documentation made it difficult to find the answers we needed when we had questions.
II - Assessing Data Cataloging Solutions
“While some of the other tools may have had similar capabilities, they were not as user-friendly as Castor.” Emily Hopper, Sr Data Scientist, JW Player
To find the most suitable data catalog solution, we followed a simple method. We began by compiling a comprehensive list of features that we deemed necessary, based on insights gathered from surveys and discussions.
Next, we evaluated a range of tools against these criteria, including open source options. In total, we assessed five different solutions to ensure we made an informed decision.
We then shortlisted two potential options and conducted a trial where participants dedicated some time to test each platform and provide feedback via a survey. Castor did the best job at ticking all the boxes for what we wanted.
One factor that differentiated Castor from the other solutions we evaluated was its user interface. Not only was it visually appealing, but it was also user-friendly, which aligned with our goal of making the tool accessible to business users. Specifically, we were seeking a solution that was approachable and less intimidating, and Castor offered just this.
We ultimately chose Castor because of its strong, easy-to-use lineage feature, which helped us gain a better understanding of our data flow. For example, I found the "Show SQL Source" button to be very practical and easy to locate, setting Castor apart from other tools we evaluated. While some of the other tools may have had similar capabilities, they were not as user-friendly as Castor.
Stakeholders don’t have the time to watch lengthy training videos or go through extensive training sessions, so we needed the tool to be easy to use from the start.
The implementation of Castor was both rapid and straightforward, and the tool itself is highly secure and never accesses our data. With other vendors, we had some security concerns.
III - Results & Roadmap
Castor has been a game-changer for us at JW Player. By providing a centralized repository for our documentation and data catalog needs, Castor has been able to fill a critical gap in our organization.
We're now able to easily find the data we need, without having to ask the same questions over and over again. As a result, the number of recurring data questions from team members has drastically decreased, making a big difference in our day-to-day operations.
With Castor, once a question is asked and answered, it becomes accessible to everyone, preventing the need for individuals to ask the same question multiple times. The result is a more efficient use of time and resources, allowing our teams to focus on other critical tasks. This has also created a stronger connection between our engineers and business teams.
Recently, we acquired another company and had to pull all their data from a different Snowflake account. Fortunately, the process went very smoothly thanks to Castor. The integration of a whole new company’s data is a complex task, but Castor made it significantly easier for us to manage.
We know Castor is successful because now the most common complaints we hear are relatively minor, like "the tool doesn't automatically tell me who to reach out to if we haven't filled in the table owner manually." These minor quibbles are actually a good sign because they show that Castor has successfully solved the bigger data problems we used to face.
At JW Player, we aim to empower non-technical team members to take ownership of their data needs, and we look forward to Castor continuing to help us achieve this goal in the future.
Our next objective is to standardize the way we define metrics across the company, as we currently lack a uniform method for doing so. Castor can serve as a centralized repository for defining and managing metrics, making it easier for everyone to have a shared understanding of important data concepts.