Podcast Episode on:

How To Best Implement a Data Catalog

Follow Lights On Data Show on:

Data catalogs are here to stay and are a great asset to data-driven organizations. But how do you best implement one?

In this episode, Rupal Sumaria joins us to explore how to implement a data catalog. Rupal is the Head of Data Governance at Penguin Random House in the UK, which successfully implemented a data catalog recently.

In this session, Rupal walks us through the steps to follow when implementing a catalog, the must-have features, and how to import data sets. She also dives deep into the pitfalls to avoid, the best practices to follow, and some of the best lessons she has learned.

You will want to hear this episode if you are interested in:

  • [00:16] About Rupal Sumaria
  • [01:43] What a data catalog means and its purpose
  • [02:50] What drove Penguin Random House to implement a data catalog
  • [04:27] Why we need a data catalog
  • [05:30] The impact of having the right team in place and evaluating all the data catalog providers
  • [06:40] The must-haves in a data catalog
  • [07:16] Choosing a vendor and having a good user interface
  • [07:51] What made the data catalog a big success for Penguin Random House
  • [08:48] Double checking and verifying automation steps
  • [09:15] Focusing on the main content while importing data sets and ensuring it’s curated
  • [13:11] Pitfalls in implementing a data catalog
  • [15:30] How her business intelligence knowledge impacted her journey in the data catalog
  • [17:11] What happens when new data sets get changed
  • [18:25] Types of metadata being surfaced in the data catalog
  • [19:44] Penguin’s top consumers
  • [21:44] Handling sensitive data sets
  • [23:00] Best lessons learned from using a data catalog
  • [25:39] Success metrics in the data catalog

Notable Quotes

  • A good provider will do lots of demos for you and not just yourself as a data governor or expert but bringing your users into that journey.
  • We don’t want to spend a lot of time faffing around with the toolset. That's not what a good provider is. They are actually there to help you, not make your life harder.
  • If you try to boil the ocean and try to import everything, you’re not getting any value out of it. Focus on the main data first.
  • The average information professional works with data, consumes data, and uses 80% of their time in the week searching for what they need to work with.

"Users expect instant information, so if your data catalog does not provide that, it’s time for reevaluation."

- Rupal Sumaria


Tags

data governance




*Voted as #1 Most Helpful Data Video Channel of 2020 by the audience of DataLiteracy.com

**Voted Top 3 Data Podcasts by Data Community Content Creators Awards of 2021