DataVault: Enterprise Data Governance & Compliance Platform

Automated data governance platform that achieved 99.2% regulatory compliance and reduced audit preparation from 6 weeks to 3 days

Author
Advenno Team, Enterprise Data & Compliance Engineering Lead
March 13, 2026
Client
Meridian Financial Group
Industry
Financial Services
Duration
12 months
Completed
Feb 2026
Location
Chicago, Illinois, United States

An enterprise data governance platform that automatically discovers, classifies, and monitors sensitive data across 340+ systems, enforcing regulatory compliance policies and generating audit-ready reports — achieving 99.2% compliance and reducing audit prep from 6 weeks to 3 days.

The Challenge

Meridian Financial Group's data governance crisis was the predictable result of a decade of growth-through-acquisition without technology consolidation. Each of the seven acquired firms brought its own databases, file structures, naming conventions, and security practices — creating a 340-system landscape that no single person fully understood. The data discovery problem alone was staggering: when the Chief Data Officer was hired in 2024, her first task was to answer a seemingly simple question — "Where is all of our customer PII?" — and after three months of manual investigation, her team could account for only 68% of regulated data with confidence. The classification inconsistency was equally problematic: a single customer's Social Security number might exist in 12 different systems with 5 different classification labels, 3 different retention policies, and no linkage between them. Regulatory examinations had become an existential threat. The company failed OCC and state insurance examinations in consecutive years, drawing $1.4 million in combined fines and a consent order that mandated implementation of a comprehensive data governance program within 18 months — with the implicit threat of license revocation if the company failed to comply. The manual compliance process was unsustainable: it consumed 14,000 staff hours annually across legal, IT, and compliance departments, yet still produced incomplete and inconsistent results. With European market expansion plans requiring GDPR compliance on top of existing CCPA, GLBA, and SOX obligations, Meridian's board authorized an aggressive technology investment to solve the governance crisis permanently.

  • 340+ data systems with no centralized catalog — the CDO's team could locate only 68% of regulated customer data after 3 months of manual investigation
  • Inconsistent data classification with the same PII carrying different labels, retention policies, and access controls across systems
  • Two consecutive failed regulatory audits resulting in $1.4M in fines and a consent order mandating a governance program
  • 14,000 annual staff hours consumed by manual compliance documentation, evidence gathering, and audit preparation
  • 6-week audit preparation cycle that still produced incomplete results and required last-minute scrambling
  • European market expansion requiring GDPR compliance on top of existing CCPA, GLBA, and SOX regulatory obligations

Our Solution

DataVault was engineered as the single source of truth for Meridian's entire data landscape. The discovery engine combines agentless network scanning, API connectors, and read-only database connectors to catalog data assets across all 340+ systems without impacting production performance. For each data source, the engine maps schemas, samples content, identifies relationships, and constructs a data lineage graph showing how data flows between systems.

The ML classification engine, trained on 4.2 million labeled data samples spanning financial services data types, analyzes content patterns, column names, data formats, and contextual clues to automatically classify every field and document. The system identifies 42 regulated data categories, including SSNs, account numbers, health information, financial statements, and trade secrets, with 97.8% accuracy.

The policy engine translates regulatory text into executable rules: CCPA right-to-deletion requests automatically identify every instance of a customer's data across all 340+ systems; GLBA safeguards are verified through continuous access control monitoring; SOX financial data retention is enforced with automated archival and hold capabilities. The compliance dashboard provides a real-time governance score broken down by regulation, department, and system, with automated evidence package generation that compiles the documentation auditors need into downloadable, indexed reports.

  • Agentless discovery engine cataloging 340+ data sources including databases, file shares, SaaS applications, and legacy mainframes
  • ML classification engine identifying 42 regulated data categories with 97.8% accuracy across 4.2 billion data fields
  • Automated data lineage mapping showing how sensitive data flows between systems, applications, and departments
  • Policy engine translating CCPA, GLBA, SOX, and GDPR requirements into continuously enforced executable rules
  • Real-time compliance dashboard with governance scores by regulation, department, and system
  • Automated audit evidence package generation reducing preparation from 6 weeks to 3 days
  • Privacy request automation handling CCPA deletion and access requests across all connected systems in under 4 hours
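
To make the lineage idea concrete, here is a minimal sketch of how a lineage graph can answer "where does data from this system end up?". The system names and the `FLOWS` mapping are invented for illustration and do not reflect Meridian's actual topology or DataVault's internal data model.

```python
from collections import deque

# Hypothetical data flows: source system -> systems that receive data from it.
FLOWS = {
    "crm": ["billing", "marketing_warehouse"],
    "billing": ["general_ledger"],
    "marketing_warehouse": ["email_platform"],
    "general_ledger": [],
    "email_platform": [],
}


def downstream_systems(flows, start):
    """Return every system reachable from `start`, in breadth-first order."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for target in flows.get(current, []):
            if target not in seen:
                seen.add(target)
                order.append(target)
                queue.append(target)
    return order


# Systems that would need review if "crm" holds regulated data.
print(downstream_systems(FLOWS, "crm"))
```

A traversal like this is what lets a single classification on a source system propagate into an impact list for every system it feeds.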

Our Approach

1

Data Landscape Assessment

Conducted a 5-week assessment of Meridian's entire data ecosystem, inventorying all 340+ systems, mapping network connections, cataloging data owners, and interviewing compliance, legal, and IT stakeholders. We discovered 47 previously unknown data stores containing regulated information, including 3 legacy systems that former acquisitions had never decommissioned.

2

Discovery Engine & Connector Development

Built the agentless discovery engine with connectors for 28 distinct data source types — SQL databases, NoSQL stores, cloud storage (S3, Azure Blob), SaaS APIs (Salesforce, ServiceNow), email archives (Exchange, Google Workspace), file shares, and two legacy IBM mainframe systems requiring custom EBCDIC data extraction.
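
As an illustration of the EBCDIC step, the sketch below decodes a fixed-width record using Python's built-in `cp037` codec (one common EBCDIC code page). The record layout, field names, and sample values are assumptions made for the example; the case study does not describe the real mainframe layouts, and production records often include packed-decimal fields that this sketch does not handle.

```python
# Hypothetical fixed-width layout: (field name, start offset, length in bytes).
LAYOUT = [("account_id", 0, 8), ("last_name", 8, 12), ("state", 20, 2)]


def decode_record(raw, layout=LAYOUT, codec="cp037"):
    """Decode one fixed-width EBCDIC record into a dict of stripped strings."""
    # cp037 is a single-byte code page, so character offsets equal byte offsets.
    text = raw.decode(codec)
    return {name: text[start:start + length].strip() for name, start, length in layout}


# Build a sample record by encoding text, standing in for bytes read from a batch export.
sample = ("00123456" + "DOE".ljust(12) + "IL").encode("cp037")
record = decode_record(sample)
print(record)
```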

3

Classification Model Training

Trained the ML classification engine on 4.2 million labeled data samples from financial services contexts, covering 42 regulated data categories. We validated accuracy with Meridian's compliance team by running the classifier against 50,000 manually classified records and achieving 97.8% agreement.
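
The agreement figure can be read as simple label matching: 97.8% agreement on 50,000 records corresponds to 48,900 matching labels. A minimal sketch of that calculation follows; the function name, the label values, and the five-record sample are illustrative assumptions, not the validation tooling used on the project.

```python
from collections import Counter


def agreement_rate(predicted, reference):
    """Fraction of records where the classifier label equals the manual label."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        raise ValueError("at least one record is required")
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(reference)


def disagreements(predicted, reference):
    """Count (manual label, classifier label) pairs that differ."""
    return Counter((r, p) for p, r in zip(predicted, reference) if p != r)


# Tiny illustrative sample; the real validation set had 50,000 records.
predicted = ["SSN", "ACCOUNT_NUMBER", "SSN", "HEALTH", "NONE"]
reference = ["SSN", "ACCOUNT_NUMBER", "NONE", "HEALTH", "NONE"]

print(agreement_rate(predicted, reference))
print(disagreements(predicted, reference))
```

Breaking disagreements down by label pair shows which categories the classifier confuses, which is usually more actionable than the headline rate alone.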

4

Policy Engine & Compliance Framework

Mapped CCPA, GLBA, SOX, and GDPR regulatory requirements into 186 executable policy rules covering data classification, retention, access control, encryption, and deletion. Each rule links to specific regulatory citations and generates compliance evidence when enforced.
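
One way to picture an "executable policy rule" is a small record that pairs a regulatory citation with a check function and emits an evidence entry each time it runs. The sketch below is an assumption about shape only: the `PolicyRule` class, the rule ID, and the asset fields are hypothetical, and the citation is given at the level of the GLBA Safeguards Rule as a whole rather than a specific clause.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable


@dataclass(frozen=True)
class PolicyRule:
    rule_id: str
    regulation: str
    citation: str
    description: str
    check: Callable[[dict], bool]  # returns True when the asset complies


def evaluate(rule, asset):
    """Run one rule against one asset and return an evidence record."""
    return {
        "rule_id": rule.rule_id,
        "regulation": rule.regulation,
        "citation": rule.citation,
        "asset": asset["name"],
        "compliant": rule.check(asset),
        "evaluated_at": datetime.now(timezone.utc).isoformat(),
    }


# Hypothetical rule: SSN fields must be encrypted at rest.
encrypt_ssn = PolicyRule(
    rule_id="GLBA-ENC-001",
    regulation="GLBA",
    citation="16 CFR Part 314",
    description="Fields classified as SSN must be encrypted at rest.",
    check=lambda a: a["classification"] != "SSN" or a["encrypted_at_rest"],
)

assets = [
    {"name": "crm.customers.ssn", "classification": "SSN", "encrypted_at_rest": True},
    {"name": "legacy.clients.tax_id", "classification": "SSN", "encrypted_at_rest": False},
]

results = [evaluate(encrypt_ssn, asset) for asset in assets]
for result in results:
    print(result["asset"], result["compliant"])
```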

5

Phased Deployment & Consent Order Resolution

Deployed DataVault in three phases aligned with the regulatory consent order timeline: Phase 1 covered the 80 highest-risk systems (8 weeks), Phase 2 extended to 200 systems (6 weeks), and Phase 3 completed full coverage of all 340+ systems (4 weeks). The consent order was satisfied 3 months ahead of the regulatory deadline.

The Results

DataVault resolved Meridian's data governance crisis and transformed compliance from a reactive burden into a continuous, automated process. The discovery engine cataloged 340+ data systems and identified 47 previously unknown data stores containing regulated information — including 3 legacy systems from acquisitions that should have been decommissioned years earlier. The ML classification engine processed 4.2 billion data fields across all systems, achieving 97.8% accuracy in identifying and tagging 42 categories of regulated data. The regulatory compliance score rose from an estimated 61% to 99.2% across all applicable frameworks, with the remaining 0.8% representing legacy system limitations with documented remediation plans. Audit preparation — previously a 6-week all-hands scramble — was reduced to 3 days of automated evidence package generation followed by targeted review. The platform handles CCPA privacy requests (deletion and access) across all 340 systems in under 4 hours, compared to the previous 11-day manual process. Annual compliance staff hours dropped from 14,000 to 3,200, freeing the legal and compliance team to focus on strategic risk management rather than evidence gathering. Most importantly, the consent order was satisfied 3 months ahead of the regulatory deadline, and Meridian passed its next OCC examination with zero findings — the first clean exam in four years. The European expansion proceeded on schedule with GDPR compliance pre-built into the governance framework.

99.2%
Compliance Score
95%
Audit Prep Reduction
$3.2M
Fines Prevented
340+
Systems Cataloged
10.8K
Staff Hours Saved

Return on Investment

$3.2M estimated
Regulatory Fines Prevented
10,800 hours/year saved
Compliance Staff Efficiency
3 months ahead of deadline
Consent Order Resolution

Technologies Used

Python
Go
React
PostgreSQL
Elasticsearch
Apache Kafka
Redis
AWS
Apache Spark
Terraform
Docker
Kubernetes

Integrations

Salesforce
ServiceNow
Microsoft 365
Google Workspace
Snowflake
AWS S3
Azure Blob Storage
IBM Mainframe (custom)
SAP
Workday
Okta
Splunk

DataVault didn't just check a regulatory box — it fundamentally changed our relationship with data. For the first time in Meridian's history, we know where every piece of sensitive data lives, who can access it, and whether it's being handled correctly. Passing our OCC exam with zero findings after two consecutive failures was a defining moment for this organization.

Catherine Park - Chief Data Officer, Meridian Financial Group

Summary

Advenno built DataVault, an enterprise data governance platform for Meridian Financial Group, a $14 billion financial services firm operating across wealth management, insurance, and lending. The platform automatically discovers and catalogs data across 340+ systems, classifies 42 regulated data categories with 97.8% ML accuracy, enforces CCPA/GLBA/SOX/GDPR policies, and generates audit-ready compliance reports. Regulatory compliance rose from 61% to 99.2%. Audit preparation dropped from 6 weeks to 3 days. The platform prevented an estimated $3.2M in regulatory fines, saved 10,800 compliance staff hours annually, and satisfied a consent order 3 months ahead of deadline.

Key Takeaways

  • Automated discovery cataloged 340+ data systems and found 47 previously unknown stores containing regulated data
  • ML classification engine identified 42 regulated data categories with 97.8% accuracy across 4.2 billion fields
  • Regulatory compliance rose from 61% to 99.2%, enabling Meridian to pass its next OCC exam with zero findings
  • Audit preparation dropped from 6 weeks to 3 days through automated evidence package generation
  • CCPA privacy requests now completed across all systems in under 4 hours versus the previous 11-day manual process

Frequently Asked Questions

How does DataVault discover data across so many different systems?
DataVault's discovery engine uses three complementary approaches depending on the data source type. For databases (SQL and NoSQL), agentless connectors query system catalogs to map schemas, then sample content using read-only queries optimized to minimize performance impact, typically consuming less than 0.1% of system resources. For file-based systems (network shares, cloud storage, email archives), the engine uses API-level scanning that reads file metadata and samples content without requiring local agent installation. For legacy mainframe systems (Meridian had two IBM systems still in production), we developed custom connectors that extract EBCDIC-encoded data through batch job interfaces. Discovery runs on configurable schedules: critical systems are scanned continuously, while lower-risk sources are scanned weekly. Each scan updates the central data catalog with new assets, schema changes, and classification adjustments. During Meridian's initial full discovery, the engine identified 47 data stores that the CDO's manual investigation had missed, including 3 legacy databases from acquisitions that contained active customer PII but had no assigned data owner.
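The catalog-then-sample pattern for databases can be sketched with Python's built-in `sqlite3` module. This is only an analogy: SQLite's `sqlite_master` table and `PRAGMA table_info` stand in for the system catalogs of the production databases, and the `customers` table and its rows are invented for the example.

```python
import sqlite3


def discover_schema(conn, sample_rows=3):
    """Map every table's columns from the catalog, then sample a few rows."""
    catalog = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info returns (cid, name, type, notnull, dflt_value, pk).
        columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
        # LIMIT keeps the sampling query small, echoing the low-impact goal.
        sample = conn.execute(f'SELECT * FROM "{table}" LIMIT ?', (sample_rows,)).fetchall()
        catalog[table] = {"columns": columns, "sample": sample}
    return catalog


# In-memory database standing in for a production source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, email TEXT, ssn TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "a@example.com", "000-00-0001"), (2, "b@example.com", "000-00-0002")],
)

catalog = discover_schema(conn)
print(catalog["customers"]["columns"])
```

Against a real source, the same two steps would run under a read-only role so that discovery cannot modify data.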
How accurate is the ML classification, and what happens when the model is uncertain?
The classification engine achieves 97.8% accuracy across 42 regulated data categories, validated against 50,000 manually classified records. The model assigns a confidence score to every classification, and the handling at each confidence threshold is configurable. At Meridian, fields classified with above 90% confidence (representing 91% of all classifications) are automatically tagged and policy-enforced. Fields with 70-90% confidence are tagged with a review-recommended flag that appears in the data steward's queue. Fields below 70% confidence are flagged for mandatory human review before any policy is applied. This tiered approach means the system automates the vast majority of classifications while ensuring human oversight for ambiguous cases. The model also improves continuously: every human review decision feeds back into the training pipeline, and Meridian's classification accuracy improved from 97.8% to 98.3% over the first six months as the model learned from domain-specific corrections.
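The tiering described above reduces to a small routing function. In this sketch the function name and tier labels are hypothetical, and the treatment of scores that land exactly on 90% or 70% is an assumption, since the text only says "above 90%", "70-90%", and "below 70%".

```python
def route_classification(confidence, auto_threshold=0.90, review_threshold=0.70):
    """Map a confidence score in [0, 1] to a handling tier."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if confidence > auto_threshold:
        return "auto_tag"  # tagged and policy-enforced automatically
    if confidence >= review_threshold:
        return "review_recommended"  # tagged, queued for the data steward
    return "mandatory_review"  # no policy applied until a human decides


for score in (0.97, 0.90, 0.82, 0.55):
    print(score, route_classification(score))
```

Keeping the thresholds as parameters mirrors the configurable behavior the answer describes.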
How does DataVault prepare evidence for an audit?
DataVault continuously collects the evidence that auditors need rather than assembling it retroactively. Every policy enforcement action (access grants, classification changes, retention executions, encryption verifications) is logged with timestamps, responsible parties, and regulatory citations. When an audit is scheduled, the compliance team selects the applicable regulation (OCC examination, SOX audit, CCPA assessment, etc.) and time period. DataVault then automatically compiles an evidence package containing: a complete data inventory with classification status, policy compliance metrics by system and data type, access control logs showing who accessed regulated data and under what authorization, retention compliance records showing that expired data was properly disposed of, incident response logs for any policy violations detected and remediated, and trend graphs showing compliance posture over time. The package is generated as an indexed PDF with hyperlinked sections and supporting data exports. What previously required 6 weeks of manual gathering across departments now takes 3 days: 1 day for automated generation, 1 day for compliance team review, and 1 day for executive sign-off.
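Because enforcement actions are logged as they happen, assembling a package is largely a filter-and-group operation over that log. The sketch below assumes a simple list of event dictionaries; the field names, event types, and sample entries are illustrative, and PDF rendering is out of scope.

```python
from collections import defaultdict
from datetime import date

# Hypothetical enforcement log entries; field names are illustrative.
EVENTS = [
    {"date": date(2025, 3, 2), "regulation": "SOX", "type": "retention_execution",
     "system": "general_ledger", "actor": "policy-engine"},
    {"date": date(2025, 1, 15), "regulation": "SOX", "type": "access_grant",
     "system": "general_ledger", "actor": "j.smith"},
    {"date": date(2025, 2, 10), "regulation": "CCPA", "type": "classification_change",
     "system": "crm_db", "actor": "policy-engine"},
    {"date": date(2024, 11, 30), "regulation": "SOX", "type": "access_grant",
     "system": "billing", "actor": "a.lee"},
]


def compile_evidence(events, regulation, start, end):
    """Select events for one regulation and period, grouped into indexed sections."""
    sections = defaultdict(list)
    for event in events:
        if event["regulation"] == regulation and start <= event["date"] <= end:
            sections[event["type"]].append(event)
    names = sorted(sections)
    return {
        "regulation": regulation,
        "period": (start, end),
        "index": names,  # table of contents for the package
        "sections": {n: sorted(sections[n], key=lambda e: e["date"]) for n in names},
    }


package = compile_evidence(EVENTS, "SOX", date(2025, 1, 1), date(2025, 3, 31))
print(package["index"])
```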
How does DataVault handle CCPA deletion requests?
When a CCPA deletion request is received, DataVault's privacy request automation module executes a four-step process. First, it queries the data catalog to identify every system containing the requestor's data, using name, email, SSN, and other identifiers to find records across all 340+ connected systems, typically completing the search in under 15 minutes. Second, it generates a deletion plan showing which systems contain what data, the deletion method appropriate for each system type, and any legal hold or regulatory retention exceptions that must be honored. Third, after compliance team approval (typically same-day), it executes deletions through system-specific connectors (SQL DELETE statements for databases, API calls for SaaS systems, secure file removal for document stores), logging every action with cryptographic proof of deletion. Fourth, it generates a completion certificate documenting what was deleted, what was retained under legal exception with cited justification, and confirmation timestamps. The entire process completes in under 4 hours from initiation to completion certificate, compared to Meridian's previous 11-day manual process that required coordination across 8 departments and still could not guarantee complete coverage.
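Steps two and three can be sketched as a plan builder that honors legal holds plus a tamper-evident action log. The case study does not say how "cryptographic proof of deletion" is implemented; the SHA-256 hash chain below is one plausible interpretation, and the system names, `METHODS` table, and log format are all assumptions.

```python
import hashlib
import json

# Hypothetical catalog hits for one requestor.
CATALOG_HITS = [
    {"system": "crm_db", "kind": "sql", "legal_hold": False},
    {"system": "support_saas", "kind": "saas_api", "legal_hold": False},
    {"system": "loan_archive", "kind": "file_store", "legal_hold": True},
]
METHODS = {"sql": "SQL DELETE", "saas_api": "API call", "file_store": "secure file removal"}
GENESIS = "0" * 64  # placeholder hash that anchors the start of the chain


def build_plan(hits):
    """Split catalog hits into deletions and documented retentions."""
    plan = {"delete": [], "retain": []}
    for hit in hits:
        if hit["legal_hold"]:
            plan["retain"].append({"system": hit["system"], "reason": "legal hold"})
        else:
            plan["delete"].append({"system": hit["system"], "method": METHODS[hit["kind"]]})
    return plan


def _digest(prev_hash, entry):
    payload = json.dumps(entry, sort_keys=True)
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_log(log, entry):
    """Append an entry whose hash also covers the previous entry's hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"entry": entry, "prev_hash": prev_hash, "hash": _digest(prev_hash, entry)})


def verify_log(log):
    """Recompute the chain; any edited or reordered entry breaks verification."""
    prev_hash = GENESIS
    for record in log:
        if record["prev_hash"] != prev_hash or record["hash"] != _digest(prev_hash, record["entry"]):
            return False
        prev_hash = record["hash"]
    return True


plan = build_plan(CATALOG_HITS)
log = []
for step in plan["delete"]:
    append_log(log, {"action": "deleted", **step})
for step in plan["retain"]:
    append_log(log, {"action": "retained", **step})

print(plan)
print(verify_log(log))
```

A chained log of this kind gives the completion certificate something verifiable to reference: the final hash commits to every recorded action and its order.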

Key Terms

Data Governance
The framework of policies, processes, and technologies that ensure an organization's data assets are formally managed, consistently classified, properly secured, and compliant with applicable regulations throughout their lifecycle.
Data Lineage
A map showing the complete lifecycle of data — its origin, every system it flows through, transformations applied, and where it ultimately resides — essential for understanding data dependencies, impact analysis, and regulatory compliance.
Data Classification
The process of categorizing data assets by sensitivity level and regulatory status — such as PII, financial records, health information, or trade secrets — to determine appropriate handling, access controls, retention policies, and security measures.

Facts & Statistics

99.2%
regulatory compliance score across CCPA, GLBA, SOX, and GDPR frameworks
95%
reduction in audit preparation time, from 6 weeks to 3 days of automated evidence generation
$3.2M
estimated regulatory fines prevented through continuous compliance monitoring and enforcement
97.8%
ML classification accuracy across 42 regulated data categories and 4.2 billion data fields
10,800
annual staff hours saved by automating compliance documentation and evidence gathering
4 hours
time to complete CCPA deletion requests across 340+ systems, down from 11 days manually

Take Control of Your Enterprise Data Governance

Discover how automated discovery, ML classification, and continuous compliance monitoring can transform your data governance — just like we did for Meridian Financial Group.
