How AI Redefines Self-Service Analytics

Enhancing Data Accessibility: Leveraging AI for Knowledge Sharing

Introduction

Self-service has been a goal of many companies for quite a while now. After heavy investments in data, organizations want to reach a position where business teams can make data-driven decisions independently.

Without effective self-service, companies are forced to hire larger data teams to meet the growing demand for data-driven decisions. However, data professionals are expensive, and scaling the data team at a 1:1 ratio with business teams is unsustainable. This makes self-service the only sustainable path to company-wide, data-centric decision-making.

The conventional approach to data self-service focuses on lowering the technical barriers for non-technical people to perform complex analyses. As vendors incorporate AI into their products, this is the direction they have taken: leverage AI to empower people to “do it themselves” through code assistants, natural-language-to-SQL translators, and the like.

Although we have developed such features ourselves, we want to challenge the idea that self-service means DIY. Our view is that true self-service should leverage LLMs to share expert knowledge throughout the organization. By focusing on distributing pre-existing, high-quality analyses and building trust in data, rather than enabling everyone to create analyses from scratch, companies can achieve more effective and sustainable data self-service.

I - What is Self-Service in the First Place?

I like the definition of self-service and the food analogy proposed by the Holistics team:

“Self-service means allowing customers to achieve something without requiring assistance from a service provider, while they used to require it in the past.”

This concept can be illustrated using food service models: meal kit delivery services and buffets. Both are self-service, but they offer vastly different experiences.

Are you more of a Buffet or Meal Kit person? Image courtesy of CastorDoc

Have you ever tried HelloFresh?

I did, and it involved cooking for two hours every night, filling my house with ice packs and plastic containers, and finding tiny packs of salt, curry, and soy sauce all over my apartment long after canceling the subscription.

While it's technically self-service, it wasn’t a great experience.

While meal kits provide ingredients and recipes, users still spend considerable time and effort preparing meals. This often leads to frustration: long cooking times, excess packaging, and lingering ingredients.

Contrast this with a buffet: an array of dishes prepared by expert chefs, ready for immediate consumption. Users simply choose what they want, perhaps adding minor adjustments to suit their taste.

In data analytics, we've been focusing on providing "meal kits": SQL translators, no-code dashboard creators, and data preparation assistants. While these lower technical barriers, they often leave users feeling overwhelmed and unsure of their results. And just as some of my HelloFresh nights ended with an UberEats order, this 'self-serve' experience usually ends with an email to an analyst, defeating the whole purpose of the initiative.

Benn Stancil, who has written extensively about self-service, emphasizes the importance of the experience:

"First, features don't matter, or at least not nearly as much as we often think they do. There are no "must haves" in self-serve experience. The experience is what matters, not the functionality. As long as people are comfortable with that experience and trust the results it produces, we can call it self-serve"

Applying this to data:

  • The "Meal Kit" approach equips users to conduct their own analyses from scratch.
  • The "Buffet" approach offers curated, high-quality analyses from data experts, which users can easily access and slightly modify as needed.

At CastorDoc, we believe the "buffet" model is more effective. It provides a smoother experience: minimal time investment, reduced frustration, and importantly, no need to email an analyst for help. This approach leverages existing expertise while truly empowering users across the organization.

II - AI-Enabled Access to Expert Insights: Redefining Data Self-Service

Most of the time, self-service consists of delivering the right piece of information to the right person at the right time. Yet there is an asymmetry between what vendors want to build and what business users actually consider self-service: while many tools focus on enabling complex DIY analysis, most users simply want easy access to consistent, reliable reporting.

The consistent reporting and analyses already exist, created by talented data teams. The challenge is to deliver these insights to data consumers seamlessly and without friction. This is where AI makes the biggest impact: not in helping people meal-kit their own analyses, but in letting them access a buffet of high-quality, trusted analyses provided by data experts, at the exact moment they need them.

AI bridges the gap between users and the data buffet by addressing two main obstacles:

  1. Translating Business Needs into Data Language: Users often struggle to articulate their business needs in data terms.
  2. Trusting the Analysis: Users may doubt the analysis quality, questioning who conducted it and the data's reliability.

LLMs help by transforming business questions into clear data requests, browsing existing analyses to find the most relevant, high-quality insights, and assisting users in refining these analyses to meet their specific needs. The process is illustrated in the graph below.

Leveraging LLMs for data discovery - Image courtesy of CastorDoc

The process illustrated above unfolds as follows:

  1. Starting Point - User Query: when a business user asks a question, like "What were our sales in California last quarter?", the system initiates two parallel processes.
  2. Process One - Query Optimization (Prompting): This process rephrases the user's question into a format that the AI can process more effectively. For example, "What were our sales in California last quarter?" might be reformatted to "Retrieve quarterly sales data for California region, most recent completed quarter." This optimization helps the AI understand precisely what information to retrieve.
  3. Process Two - Query Vectorization (Embedding): Simultaneously, the system converts the question into a numerical representation (vector). This vector serves as a unique identifier for the query, allowing it to be compared mathematically with other data in the system.
  4. Data Preparation: Prior to any user queries, the system has already processed all existing reports and analyses created by data experts. These are converted into vector format and stored in a Vector Database, making them easily searchable.
  5. Relevance Matching (Similarity Algorithm): The system compares the vector of the user's query with all the vectors in the database. This process identifies the most relevant existing analyses and reports that relate to the user's question.
  6. Answer Generation (LLM Processing): The AI (Large Language Model) then processes two key inputs: a) The optimized version of the user's question b) The relevant reports and analyses identified by the vector matching

Using these inputs, it generates a response tailored to the user's specific query. The user receives an answer based on existing, expert-created analyses, but customized to their specific question. This approach allows users to access and slightly modify pre-existing, high-quality analyses without needing to create them from scratch.
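The retrieval steps above (vectorization, indexing, and similarity matching) can be sketched in a few lines of Python. This is a minimal illustration only: the bag-of-words "embedding", the report names, and the `retrieve` helper are all hypothetical stand-ins for a real embedding model and vector database.

```python
from collections import Counter
import math

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a production system would use a
    # learned embedding model and store the vectors in a vector database.
    return Counter(text.lower().split())

def cosine_similarity(a: Counter, b: Counter) -> float:
    # Standard cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

# Step 4 (data preparation): pre-index the expert-created analyses.
reports = {
    "q_sales_by_region": "quarterly sales data by region",
    "churn_dashboard": "monthly customer churn rate dashboard",
}
index = {name: embed(text) for name, text in reports.items()}

def retrieve(question: str, top_k: int = 1) -> list[str]:
    # Steps 3 and 5: vectorize the user's query, then rank the stored
    # analyses by similarity and return the closest matches.
    q_vec = embed(question)
    ranked = sorted(index,
                    key=lambda name: cosine_similarity(q_vec, index[name]),
                    reverse=True)
    return ranked[:top_k]
```

For example, a question about quarterly sales would match the pre-indexed sales report rather than the churn dashboard, because their term vectors overlap most.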

This method addresses two significant challenges:

  1. It automatically translates business queries into data-specific language.
  2. It provides reliable answers by leveraging pre-existing, expert-created analyses.
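The final generation step amounts to combining the two inputs, the optimized question and the retrieved expert analyses, into one grounded prompt for the LLM. The `build_prompt` helper and its wording below are hypothetical, and the actual model call is omitted; any chat-completion API would fit here.

```python
def build_prompt(optimized_query: str, retrieved_reports: list[str]) -> str:
    # Step 6: assemble (a) the optimized question and (b) the expert
    # analyses found by similarity matching into a single prompt that
    # instructs the model to answer only from the retrieved material.
    context = "\n".join(f"- {report}" for report in retrieved_reports)
    return (
        "Answer using only the expert analyses below.\n"
        f"Analyses:\n{context}\n"
        f"Question: {optimized_query}\n"
    )
```

Constraining the model to the retrieved analyses is what keeps the answer anchored to expert-created work rather than to a from-scratch computation.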

At CastorDoc, this is the approach we have chosen to provide users with a true AI-powered self-service experience. If you would like to hear more about how we can make this happen in your organization, reach out to the team to start a conversation.
