MOLT Verify

On this page Carat arrow pointing down
Note:

This feature is in preview and subject to change. To share feedback and/or issues, contact Support.

MOLT Verify checks for data discrepancies between a source database and CockroachDB during a database migration.

The tool performs the following verifications to ensure data integrity during a migration:

  • Table Verification: Check that the structure of tables between the source database and the target database are the same.
  • Column Definition Verification: Check that the column names, data types, constraints, nullability, and other attributes between the source database and the target database are the same.
  • Row Value Verification: Check that the actual data in the tables is the same between the source database and the target database.

For a demo of MOLT Verify, watch the following video:

Supported databases

The following source databases are supported:

  • PostgreSQL 12-16
  • MySQL 5.7, 8.0 and later
  • Oracle Database 19c (Enterprise Edition) and 21c (Express Edition)

Installation

To install MOLT, download the binary that matches your architecture and source database:

Operating System Architecture PostgreSQL/MySQL Oracle
Windows AMD 64-bit Download N/A
ARM 64-bit Download N/A
Linux AMD 64-bit Download Download
ARM 64-bit Download N/A
Mac AMD 64-bit Download Download
ARM 64-bit Download Download

The download package includes the following:

  • molt binary.
  • replicator binary.
  • Grafana dashboard JSON files for MOLT Fetch (grafana_dashboard.json) and Replicator (replicator_grafana_dashboard.json) metrics. Each bundled dashboard is compatible with its corresponding binary version.
    • Oracle downloads also include the Oracle-specific Replicator dashboard (replicator_oracle_grafana_dashboard.json).
Tip:

For ease of use, keep both molt and replicator in your current working directory.

To display the current version of each binary, run molt --version and replicator --version.

Note:

molt is bundled with the latest replicator version available at the time of the MOLT release. This means that the MOLT download always contains the latest released version of MOLT Replicator. To verify that the molt and replicator versions match, run molt --version and replicator --version.

For previous binaries, refer to the MOLT version manifest. For release details, refer to the MOLT changelog.

Docker images

MOLT Fetch

Docker multi-platform images containing both the AMD and ARM molt and replicator binaries are available. To pull the latest image for PostgreSQL and MySQL:

icon/buttons/copy
docker pull cockroachdb/molt

To pull a specific version (for example, 1.1.3):

icon/buttons/copy
docker pull cockroachdb/molt:1.1.3

To pull the latest image for Oracle (note that only linux/amd64 is supported):

icon/buttons/copy
docker pull cockroachdb/molt:oracle-latest

MOLT Replicator

Docker images for MOLT Replicator are also available as a standalone binary:

icon/buttons/copy
docker pull cockroachdb/replicator

To pull a specific version (for example, v1.1.1):

icon/buttons/copy
docker pull cockroachdb/replicator:v1.1.1

Setup

Complete the following items before using MOLT Verify:

  • The SQL user running MOLT Verify must have the SELECT privilege on both the source and target CockroachDB tables.

  • Percent-encode the connection strings for the source database and CockroachDB. This ensures that the MOLT tools can parse special characters in your password.

    • Given a password a$52&, pass it to the molt escape-password command with single quotes:

      icon/buttons/copy
      molt escape-password --password 'a$52&'
      
      Substitute the following encoded password in your original connection url string:
      a%2452%26
      
    • Use the encoded password in your connection string. For example:

      postgres://postgres:a%2452%26@localhost:5432/molt
      

Flags

Flag Description
--source (Required) Connection string for the source database.
--target (Required) Connection string for the target database.
--concurrency Number of threads to process at a time when reading the tables.
Default: 16
For faster verification, set this flag to a higher value.
--filter-path Path to a JSON file that defines filter rules to verify only a subset of data in specified tables. Refer to Verify a subset of data.
--log-file Write messages to the specified log filename. If no filename is provided, messages write to verify-{datetime}.log. If "stdout" is provided, messages write to stdout.
--metrics-listen-addr Address of the metrics endpoint, which has the path {address}/metrics.

Default: '127.0.0.1:3030'
--row-batch-size Number of rows to get from a table at a time.
Default: 20000
--schema-filter Verify schemas that match a specified regular expression.

Default: '.*'
--table-filter Verify tables that match a specified regular expression.

Default: '.*'
--transformations-file Path to a JSON file that defines transformation rules applied during comparison to verify data that was transformed during fetch. Use the same transformation file from molt fetch. Refer to Verify transformed data.

Usage

molt verify takes two SQL connection strings as --source and --target arguments.

To compare a PostgreSQL database with a CockroachDB database:

icon/buttons/copy
molt verify \
  --source 'postgresql://{username}:{password}@{host}:{port}/{database}' \
  --target 'postgresql://{username}:{password}@{host}:{port}/{database}?sslmode=verify-full'

To compare a MySQL database with a CockroachDB database:

icon/buttons/copy
molt verify \
  --source 'mysql://{username}:{password}@{protocol}({host}:{port})/{database}' \
  --target 'postgresql://{username}:{password}@{host}:{port}/{database}?sslmode=verify-full'

Use the optional flags to customize the verification results.

When verification completes, the output displays a summary message like the following:

{"level":"info","type":"summary","table_schema":"public","table_name":"common_table","num_truth_rows":6,"num_success":3,"num_conditional_success":0,"num_missing":2,"num_mismatch":1,"num_extraneous":2,"num_live_retry":0,"num_column_mismatch":0,"message":"finished row verification on public.common_table (shard 1/1)"}
  • num_missing is the number of rows that are missing on the target database. You can add any missing data to the target database and run molt verify again.
  • num_mismatch is the number of rows with mismatched values on the target database.
  • num_extraneous is the number of extraneous tables on the target database.
  • num_column_mismatch is the number of columns with mismatched types on the target database, preventing molt verify from comparing the column's rows. For example, if your source table uses an auto-incrementing ID, MOLT Verify will identify a mismatch with CockroachDB's UUID type. In such cases, you might fix the mismatch by creating a composite type on CockroachDB that uses the auto-incrementing ID.
  • num_success is the number of rows that matched.
  • num_conditional_success is the number of rows that matched while having a column mismatch due to a type difference. This value indicates that all other columns that could be compared have matched successfully. You should manually review the warnings and errors in the output to determine whether the column mismatches can be ignored.

Verify a subset of data

You can write filter rules to have molt verify compare only a subset of rows in specified tables. This allows you to verify specific data ranges or conditions without processing entire tables.

Filter rules apply WHERE clauses to specified tables during verification. Columns referenced in filter expressions must be indexed.

Note:

Only PostgreSQL and MySQL sources are supported for selective data verification.

Step 1. Create a filter rules file

Create a JSON file that defines the filter rules. The following example defines filter rules on two tables, public.filtertbl and public.filtertbl2:

{
  "filters": [
    {
      "resource_specifier": {
        "schema": "public",
        "table": "filtertbl"
      },
      "expr": "x < 10"
    },
    {
      "resource_specifier": {
        "schema": "public",
        "table": "filtertbl2"
      },
      "source_expr": "id BETWEEN 5 AND 15",
      "target_expr": "15 > id > 5"
    }
  ]
}
  • resource_specifier: Identifies which schemas and tables to filter. Schema and table names are case-insensitive.
    • schema: Schema name containing the table.
    • table: Table name to apply the filter to.
  • expr: SQL expression that applies to both source and target databases. The expression must be valid for both database dialects.
  • source_expr and target_expr: SQL expressions that apply to the source and target databases, respectively. These must be defined together, and cannot be used with expr.

Step 2. Run molt verify with the filter file

Use the --filter-path flag to specify the filter rules file:

icon/buttons/copy
molt verify \
  --source 'postgres://user:password@localhost/molt' \
  --target 'postgres://root@localhost:26257/molt?sslmode=disable' \
  --filter-path='./filter-rules.json'

When verification completes, the output displays a summary showing the number of rows verified in each filtered table:

{"level":"info","message":"starting verify on public.filtertbl, shard 1/1"}
{"level":"info","type":"summary","table_schema":"public","table_name":"filtertbl","num_truth_rows":5,"num_success":5,"num_conditional_success":0,"num_missing":0,"num_mismatch":0,"num_extraneous":0,"num_column_mismatch":0,"message":"finished row verification on public.filtertbl (shard 1/1)"}

Verify transformed data

If you applied transformations during molt fetch, you can apply the same transformations with MOLT Verify to match source data with the transformed target data.

Note:

Only table and schema renames are supported.

Step 1. Create a transformation file

Create a JSON file that defines the transformation rules. MOLT Verify applies these transformations during comparison only and does not modify the source database.

The following example assumes that MOLT Fetch renamed table t to t2 and schema public to public2. The same transformation rule is applied during verification:

{
  "transforms": [
    {
      "id": 1,
      "resource_specifier": {
        "schema": "public",
        "table": "t"
      },
      "table_rename_opts": {
        "value": "t2"
      },
      "schema_rename_opts": {
        "value": "public2"
      }
    }
  ]
}
  • resource_specifier: Identifies which schemas and tables to transform. Schema and table names are case-insensitive.
    • schema: Schema name containing the table.
    • table: Table name to transform.
  • table_rename_opts: Rename the table on the target database.
    • value: The target table name to compare against.
  • schema_rename_opts: Rename the schema on the target database.
    • value: The target schema name to compare against.

Step 2. Run molt verify with the transformation file

Use the --transformations-file flag to specify the transformation file:

icon/buttons/copy
molt verify \
  --source 'postgres://user:password@localhost/molt' \
  --target 'postgres://root@localhost:26257/molt?sslmode=disable' \
  --transformations-file 'transformation-rules.json'

When verification completes, the output displays a summary:

{"level":"info","message":"starting verify on public.t, shard 1/1"}
{"level":"info","type":"summary","table_schema":"public","table_name":"t","num_truth_rows":10,"num_success":10,"num_conditional_success":0,"num_missing":0,"num_mismatch":0,"num_extraneous":0,"num_column_mismatch":0,"message":"finished row verification on public.t (shard 1/1)"}

Docker usage

Performance

MOLT Fetch, Verify, and Replicator are likely to run more slowly in a Docker container than on a local machine. To improve performance, increase the memory or compute resources, or both, on your Docker container.

Local connection strings

When testing locally, specify the host as follows:

  • For macOS, use host.docker.internal. For example:

    --source 'postgres://postgres:postgres@host.docker.internal:5432/migration_db?sslmode=disable'
    --target "postgres://root@host.docker.internal:26257/defaultdb?sslmode=disable"
    
  • For Linux and Windows, use 172.17.0.1. For example:

    --source 'postgres://postgres:postgres@172.17.0.1:5432/migration_db?sslmode=disable'
    --target "postgres://root@172.17.0.1:26257/defaultdb?sslmode=disable"
    

Known limitations

  • MOLT Verify compares 20,000 rows at a time by default, and row values can change between batches, potentially resulting in temporary inconsistencies in data. To configure the row batch size, use the --row_batch_size flag.
  • MOLT Verify only supports comparing one MySQL database to a whole CockroachDB schema (which is assumed to be public).
  • MOLT Verify checks for collation mismatches on primary key columns. This may cause validation to fail when a STRING is used as a primary key and the source and target databases are using different collations.
  • MOLT Verify might give an error in case of schema changes on either the source or target database.
  • Geospatial types cannot yet be compared.
  • Only PostgreSQL and MySQL sources are supported for verifying a subset of data.
  • Only table and schema renames are supported when verifying transformed data.

See also

×