Introduction
RIOT-X is a command-line utility to get data in and out of Redis. It supports Redis Cloud and Redis Software and includes the following features:
- Files (CSV, JSON, XML, Parquet)
- Databases
- Data Generators
  - Redis Data Generator: Data Structures → Redis
  - Faker Data Generator: Faker → Redis
- Replication
  - Redis → Redis
RIOT-X is supported by Redis, Inc. To report bugs, request features, or receive assistance, please file an issue or contact your Redis account team.
Install
RIOT-X can be installed on Linux, macOS, and Windows platforms and can be used as a standalone tool that connects remotely to a Redis database. It is not required to run locally on a Redis server.
Homebrew (macOS & Linux)
brew install redis/tap/riotx
Scoop (Windows)
scoop bucket add redis https://github.com/redis/scoop.git
scoop install riotx
Manual Installation (All Platforms)
Download the pre-compiled binary from RIOT-X Releases, uncompress it, and copy it to the desired location.
Docker
You can run RIOT-X as a Docker image:
docker run riotx/riotx [OPTIONS] [COMMAND]
Concepts
RIOT-X is essentially an ETL tool where data is extracted from the source system, transformed (see Processing), and loaded into the target system.
Redis URI
RIOT-X follows the Redis URI specification, which supports standalone, sentinel and cluster Redis deployments with plain, SSL, TLS and unix domain socket connections.
You can use the host:port shorthand for redis://host:port.
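The shorthand simply omits the redis:// scheme. A minimal sketch with illustrative values:

```shell
# Sketch: host:port is shorthand for redis://host:port.
short="localhost:6379"
full="redis://${short}"
echo "$full"
```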
- Redis Standalone
  redis://[[username:]password@]host[:port][/database][?[timeout=timeout[d|h|m|s|ms|us|ns]][&clientName=clientName][&libraryName=libraryName][&libraryVersion=libraryVersion]]
- Redis Standalone (SSL)
  rediss://[[username:]password@]host[:port][/database][?[timeout=timeout[d|h|m|s|ms|us|ns]][&clientName=clientName][&libraryName=libraryName][&libraryVersion=libraryVersion]]
- Redis Sentinel
  redis-sentinel://[[username:]password@]host1[:port1][,host2[:port2]][,hostN[:portN]][/database][?[timeout=timeout[d|h|m|s|ms|us|ns]][&sentinelMasterId=sentinelMasterId][&clientName=clientName][&libraryName=libraryName][&libraryVersion=libraryVersion]]
You can provide the database, password and timeouts within the Redis URI. For example redis://localhost:6379/1 selects database 1 .
The supported timeout units are:

Unit | Duration |
---|---|
d | Days |
h | Hours |
m | Minutes |
s | Seconds |
ms | Milliseconds |
us | Microseconds |
ns | Nanoseconds |
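A sketch (hypothetical host and values) of a URI combining database selection and a timeout:

```shell
# Sketch: build a standalone Redis URI selecting database 1
# with a 5-second command timeout. Values are illustrative.
host="localhost"
port=6379
database=1
timeout="5s"
uri="redis://${host}:${port}/${database}?timeout=${timeout}"
echo "$uri"
```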
Batching
Processing in RIOT-X is done in batches: a fixed number of records is read from the source, processed, and written to the target.
The default batch size is 50, which means that an execution step reads 50 items at a time from the source, processes them, and finally writes them to the target.
If the source/target is Redis, reading/writing of a batch is done in a single command pipeline to minimize the number of roundtrips to the server.
You can change the batch size (and hence pipeline size) using the --batch option.
The optimal batch size in terms of throughput depends on many factors like record size and command types (see Redis Pipeline Tuning for details).
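As a rough sketch of the arithmetic involved (numbers are illustrative), the number of pipeline round-trips for a run is the item count divided by the batch size, rounded up:

```shell
# Sketch: round-trips = ceil(items / batch), one pipeline per batch.
items=1000
batch=50   # RIOT-X default; change with --batch
roundtrips=$(( (items + batch - 1) / batch ))
echo "$roundtrips round-trips for $items items"
```

Larger batches mean fewer round-trips but bigger pipelines, which is why throughput depends on record size and command types.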
Multi-threading
By default processing happens in a single thread, but it is possible to parallelize processing by using multiple threads. In that configuration, each chunk of items is read, processed, and written in a separate thread of execution. This is different from partitioning where items would be read by multiple readers. Here, only one reader is being accessed from multiple threads.
To set the number of threads, use the --threads option.
riotx db-import "SELECT * FROM orders" --jdbc-url "jdbc:postgresql://host:port/database" --jdbc-user appuser --jdbc-pass passwd --threads 3 hset --keyspace order --key order_id
Importing
When importing data into Redis (file-import, db-import, faker) the following options allow for field-level processing and filtering.
Processing
Processors allow you to create/update/delete fields using the Spring Expression Language (SpEL).
- --proc field1="'foo'" : Generate a field named field1 containing the string foo
- --proc temp="(temp-32)*5/9" : Convert from Fahrenheit to Celsius
- --proc name='remove("first").concat(remove("last"))' : Concatenate first and last fields and delete them
- --proc field2=null : Delete field2

Input fields are accessed by name (e.g. field3=field1+field2).
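The Fahrenheit-to-Celsius processor above is plain arithmetic; a sketch of the same computation (integer math, illustrative input):

```shell
# Sketch of the arithmetic behind --proc temp="(temp-32)*5/9".
temp_f=212
temp_c=$(( (temp_f - 32) * 5 / 9 ))
echo "$temp_c"
```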
Processors have access to the following context variables and functions:
- date : Date parsing and formatting object. Instance of Java SimpleDateFormat.
- number : Number parsing and formatting object. Instance of Java DecimalFormat.
- faker : Faker object.
- redis : Redis commands object. Instance of Lettuce RedisCommands. The replicate command exposes 2 command objects named source and target.
- geo : Convenience function that takes a longitude and a latitude to produce a RediSearch geo-location string in the form longitude,latitude (e.g. location=#geo(lon,lat))
riot file-import --proc epoch="#date.parse(mydate).getTime()" location="#geo(lon,lat)" name="#redis.hget('person1','lastName')" ...
riotx file-import http://storage.googleapis.com/jrx/beers.csv --header --proc fakeid="#faker.numerify('########')" hset --keyspace beer --key fakeid
You can register your own variables using --var.
riotx file-import http://storage.googleapis.com/jrx/lacity.csv --var rnd="new java.util.Random()" --proc randomInt="#rnd.nextInt(100)" --header hset --keyspace event --key Id
Filtering
Filters allow you to exclude records that don’t match a SpEL boolean expression.
For example this filter will only keep records where the value field is a series of digits:
riot file-import --filter "value matches '\\d+'" ...
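The regex semantics can be sketched outside RIOT-X (grep -E in place of SpEL's matches, which requires a full match, hence the anchors):

```shell
# Sketch: keep records whose value is digits only, drop the rest.
keep_if_digits() {
  if echo "$1" | grep -Eq '^[0-9]+$'; then echo keep; else echo drop; fi
}
keep_if_digits 12345
keep_if_digits 12a45
```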
Exporting
When exporting data from Redis, the following options allow for filtering.
Key Filtering
Key filtering can be done through multiple options in RIOT-X:
- --key-pattern : Glob-style pattern used for scan and keyspace notification registration.
- --key-type : Type of keys to consider for scan and keyspace notification registration.
- --key-include & --key-exclude : Glob-style pattern(s) to further filter keys on the client (RIOT-X) side, i.e. after they are received through scan or keyspace notifications.
- --mem-limit : Ignore keys whose memory usage exceeds the given limit. For example --mem-limit 10mb skips keys over 10 MB in size.
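Client-side glob filtering of the kind done by --key-include/--key-exclude can be sketched with shell pattern matching (the pattern and keys are illustrative):

```shell
# Sketch: include keys matching the glob beer:*, exclude everything else.
match_key() {
  case "$1" in
    beer:*) echo included ;;
    *)      echo excluded ;;
  esac
}
match_key beer:321
match_key person:42
```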
Usage
You can launch RIOT-X with the following command:
riotx
This will show usage help, which you can also get by running:
riotx --help
--help is available on any command:
riotx COMMAND --help
Data Generation
RIOT-X includes two commands for data generation:
Data Structure Generator
The gen command generates Redis data structures as well as JSON and TimeSeries.
riot gen [OPTIONS]
riotx gen --type string hash json timeseries
Faker Generator
The faker command generates data using Datafaker.
riot faker [OPTIONS] EXPRESSION... [REDIS COMMAND...]
where EXPRESSION is a Faker expression field in the form field="expression".
To show the full usage, run:
riot faker --help
You must specify at least one Redis command as a target.
Redis connection options apply to the root command (riot) and not to subcommands.
Keys
Keys are constructed from input records by concatenating the keyspace prefix and key fields.
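The construction above can be sketched as follows (keyspace and key values are illustrative; the colon separator matches the keys shown in the examples):

```shell
# Sketch: key = <keyspace>:<key field value>, e.g. person:1234567890.
keyspace="person"
id="1234567890"
key="${keyspace}:${id}"
echo "$key"
```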
riotx faker id="numerify '##########'" firstName="name.first_name" lastName="name.last_name" address="address.full_address" hset --keyspace person --key id
riotx faker name="GameOfThrones.character" --count 1000 sadd --keyspace got:characters --member name
Data Providers
Faker offers many data providers. Most providers don’t take any arguments and can be called directly:
riot faker firstName="name.first_name"
Some providers take parameters:
riot faker lease="number.digits '2'"
Here are a few sample Faker expressions:
- regexify '(a|b){2,3}'
- regexify '\\.\\*\\?\\+'
- bothify '????','false'
- name.first_name
- name.last_name
- number.number_between '1','10'
Refer to Datafaker Providers for a list of providers and their corresponding documentation.
Databases
RIOT-X includes two commands for interaction with relational databases:
Drivers
RIOT-X relies on JDBC to interact with databases. It includes JDBC drivers for the most common database systems:
- Oracle : jdbc:oracle:thin:@myhost:1521:orcl
- SQL Server : jdbc:sqlserver://[serverName[\instanceName][:portNumber]][;property=value[;property=value]]
- MySQL : jdbc:mysql://[host]:[port][/database][?properties]
- Postgres : jdbc:postgresql://host:port/database
- Snowflake : jdbc:snowflake://<account_identifier>.snowflakecomputing.com/?<connection_params>
- SQLite : jdbc:sqlite:path_to_sqlite_file
- Db2 : jdbc:db2://host:port/database
For non-included databases, place the JDBC driver jar under the lib directory of the RIOT-X installation.
Database Import
The db-import command imports data from a relational database into Redis.
Ensure RIOT-X has the relevant JDBC driver for your database. See the Drivers section for more details.
riot db-import --jdbc-url <jdbc url> -u <Redis URI> SQL [REDIS COMMAND...]
To show the full usage, run:
riot db-import --help
You must specify at least one Redis command as a target.
Redis connection options apply to the root command (riot) and not to subcommands.
The keys that will be written are constructed from input records by concatenating the keyspace prefix and key fields.
riotx db-import "SELECT * FROM orders" --jdbc-url "jdbc:postgresql://host:port/database" --jdbc-user appuser --jdbc-pass passwd hset --keyspace order --key order_id
riotx db-import "SELECT * FROM orders" --jdbc-url "jdbc:postgresql://host:port/database" --jdbc-user appuser --jdbc-pass passwd set --keyspace order --key order_id
This will produce Redis strings that look like this:
{
"order_id": 10248,
"customer_id": "VINET",
"employee_id": 5,
"order_date": "1996-07-04",
"required_date": "1996-08-01",
"shipped_date": "1996-07-16",
"ship_via": 3,
"freight": 32.38,
"ship_name": "Vins et alcools Chevalier",
"ship_address": "59 rue de l'Abbaye",
"ship_city": "Reims",
"ship_postal_code": "51100",
"ship_country": "France"
}
Snowflake Import
The snowflake-import command uses a Snowflake STREAM object to track changes (CDC) to a table and read them into a Redis data structure like hash or json. The Snowflake STREAM is created and managed by RIOT-X. The user credentials you provide must have the ability to create a stream in the database and schema specified by the fully qualified object name.
- SAMPLE_DATABASE.SAMPLE_SCHEMA.DATA_TABLE_changestream will be created or replaced. For security, this can be created in a different schema than the table you are importing from by specifying --cdc-schema.
- riotx:offset:SAMPLE_DATABASE.SAMPLE_SCHEMA.DATA_TABLE_changestream : this key is stored in the destination Redis database and is used to track the stream offset. If RIOT-X fails in the middle of copying data from the stream, it will resume copying from this offset when restarted. Removing this offset key from Redis will result in RIOT-X recreating the stream at time "NOW". If the --snapshot-mode INITIAL option is specified, the stream will also include the initial table data plus changes going forward. If you do not want initial table data to be included, specify --snapshot-mode NEVER.
- snowflake-import currently works on tables and materialized views.
The basic usage is:
riotx snowflake-import [TABLE] [OPTIONS] [REDIS COMMAND...]
The minimal recommended permissions for a Snowflake role and user to run this command are:
CREATE OR REPLACE ROLE riotx_cdc
COMMENT = 'minimum cdc role for riotx';
-- replace compute_wh with the name of the warehouse you want to use
GRANT USAGE, OPERATE ON WAREHOUSE compute_wh TO ROLE riotx_cdc;
-- replace tb_101.raw_pos_cdc with the name of a database and schema for RIOT to create the stream in
CREATE OR REPLACE SCHEMA tb_101.raw_pos_cdc;
GRANT USAGE ON SCHEMA tb_101.raw_pos_cdc TO ROLE riotx_cdc;
-- replace tb_101 with the name of the database RIOT needs to read out of
GRANT USAGE ON DATABASE tb_101 TO ROLE riotx_cdc;
-- replace tb_101.raw_pos with the name of the schema RIOT needs to read out of
GRANT USAGE ON SCHEMA tb_101.raw_pos TO ROLE riotx_cdc;
-- replace with the name of the table(s) you want to read from
GRANT SELECT ON TABLE tb_101.raw_pos.incremental_order_header TO ROLE riotx_cdc;
GRANT REFERENCE_USAGE ON TABLE tb_101.raw_pos.incremental_order_header TO ROLE riotx_cdc;
ALTER TABLE tb_101.raw_pos.INCREMENTAL_ORDER_HEADER SET CHANGE_TRACKING = TRUE;
GRANT SELECT ON FUTURE TABLES IN SCHEMA tb_101.raw_pos_cdc TO ROLE riotx_cdc;
GRANT CREATE TABLE ON SCHEMA tb_101.raw_pos_cdc TO ROLE riotx_cdc;
GRANT CREATE STREAM ON SCHEMA tb_101.raw_pos_cdc TO ROLE riotx_cdc;
GRANT SELECT ON FUTURE STREAMS IN SCHEMA tb_101.raw_pos_cdc TO ROLE riotx_cdc;
CREATE OR REPLACE USER riotx_cdc
DEFAULT_ROLE = 'riotx_cdc'
DEFAULT_WAREHOUSE = 'compute_wh'
PASSWORD = '{{PASSWORD}}';
GRANT ROLE riotx_cdc TO USER riotx_cdc;
For the full usage, run:
riotx snowflake-import --help
This command uses the example database, schema, and table names from the minimal role setup above.
# --snapshot-mode INITIAL : include initial table data
# --repeat 10s            : sleep 10s after each CDC import, then repeat
# --key order_id          : column name to use as the key
riotx snowflake-import \
  tb_101.raw_pos.incremental_order_header \
  --snapshot-mode INITIAL \
  --role riotx_cdc \
  --warehouse compute_wh \
  --cdc-schema raw_pos_cdc \
  --jdbc-url "jdbc:snowflake://abcdefg.abc12345.snowflakecomputing.com" \
  --jdbc-user databaseuser \
  --jdbc-pass databasepassword \
  --repeat 10s \
  hset \
  --keyspace orderheader \
  --key order_id
The command above imports CDC data from the Snowflake table tb_101.raw_pos.incremental_order_header
into Redis hashes in the keyspace orderheader
.
If you only need to do a one-time import of data from Snowflake you can use the db-import command. This command reads all of the rows returned by your SQL query and writes them to Redis. For more information see the db-import command.
riotx db-import \
"SELECT * FROM SAMPLE_DATABASE.SAMPLE_SCHEMA.DATA_TABLE" \
--jdbc-url "jdbc:snowflake://abcdefg.abc12345.snowflakecomputing.com" \
--jdbc-driver net.snowflake.client.jdbc.SnowflakeDriver \
--jdbc-user databaseuser \
--jdbc-pass databasepassword \
hset \
--keyspace datatable \
--key data_id # column name to use as id
This command performs a one-time import from Snowflake using the db-import
command.
Database Export
Use the db-export command to read from a Redis database and write to a SQL database.
Ensure RIOT-X has the relevant JDBC driver for your database. See the Drivers section for more details.
The general usage is:
riot db-export --jdbc-url <jdbc url> SQL
To show the full usage, run:
riot db-export --help
riotx db-export "INSERT INTO mytable (id, field1, field2) VALUES (CAST(:id AS SMALLINT), :field1, :field2)" --jdbc-url "jdbc:postgresql://host:port/database" --jdbc-user appuser --jdbc-pass passwd --key-pattern "gen:*" --key-regex "gen:(?<id>.*)"
Files
RIOT-X includes two commands to work with files in various formats:
- file-import : Import data from files
- file-export : Export Redis data structures to files
File Import
The file-import command reads data from files and writes it to Redis.
The basic usage for file imports is:
riot file-import [OPTIONS] FILE... [REDIS COMMAND...]
To show the full usage, run:
riot file-import --help
RIOT-X will try to determine the file type from its extension (e.g. .csv or .json), but you can specify it with the --type option.
Gzipped files are supported and the extension before .gz is used (e.g. myfile.json.gz → json).
- /path/file.csv
- /path/file-*.csv
- /path/file.json
- http://data.com/file.csv
- http://data.com/file.json.gz

Use - to read from standard input.
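The extension-based type detection described above can be sketched as: strip a trailing .gz, then take the remaining extension (a simplification; file names without an extension are not handled here):

```shell
# Sketch: derive file type from name; myfile.json.gz -> json.
infer_type() {
  name="${1%.gz}"       # drop a trailing .gz if present
  echo "${name##*.}"    # remaining extension is the type
}
infer_type myfile.json.gz
infer_type data.csv
```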
Amazon S3 and Google Cloud Storage buckets are supported.
riotx file-import s3://riotx/beers.json --s3-region us-west-1 hset --keyspace beer --key id
riotx file-import gs://riotx/beers.json hset --keyspace beer --key id
Data Structures
If no REDIS COMMAND is specified, it is assumed that the input file(s) contain Redis data structures serialized as JSON or XML. See the File Export section to learn about the expected format and how to generate such files.
riotx file-import /tmp/redis.json
Redis Commands
When one or more `REDIS COMMAND`s are specified, these commands are called for each input record.
Redis client options apply to the root command (riot) and not to subcommands.
Redis command keys are constructed from input records by concatenating the keyspace prefix and key fields.

This command imports my.json into hashes with keys of the form blah:<id>:
riot file-import my.json hset --keyspace blah --key id
riotx file-import http://storage.googleapis.com/jrx/es_test-index.json json.set --keyspace elastic --key _id
This command imports into hashes with keys blah:<id> and sets a TTL on each:
riot file-import my.json hset --keyspace blah --key id expire --keyspace blah --key id
This command imports into hashes with keys blah:<id>, sets a TTL, and adds each id to a set named myset:
riot file-import my.json hset --keyspace blah --key id expire --keyspace blah --key id sadd --keyspace myset --member id
Delimited (CSV)
The default delimiter character is comma (,). It can be changed with the --delimiter option.
If the file has a header, use the --header option to automatically extract field names. Otherwise specify the field names using the --fields option.
Let’s consider this CSV file:
row | abv | ibu | id | name | style | brewery | ounces |
---|---|---|---|---|---|---|---|
1 | 0.079 | 45 | 321 | Fireside Chat (2010) | Winter Warmer | 368 | 12.0 |
2 | 0.068 | 65 | 173 | Back in Black | American Black Ale | 368 | 12.0 |
3 | 0.083 | 35 | 11 | Monk’s Blood | Belgian Dark Ale | 368 | 12.0 |
The following command imports this CSV into Redis as hashes using beer as the key prefix and id as the primary key.
riotx file-import http://storage.googleapis.com/jrx/beers.csv --header hset --keyspace beer --key id
This creates hashes with keys beer:321, beer:173, …
This command imports a CSV file into a geo set named airportgeo with airport IDs as members:
riotx file-import http://storage.googleapis.com/jrx/airports.csv --header --skip-limit 3 geoadd --keyspace airportgeo --member AirportID --lon Longitude --lat Latitude
Fixed-Length (Fixed-Width)
Fixed-length files can be imported by specifying the width of each field using the --ranges option.
riotx file-import http://storage.googleapis.com/jrx/accounts.fw --type fw --ranges 1 9 25 41 53 67 83 --header hset --keyspace account --key Account
JSON
The expected format for JSON files is:
[
{
"...": "..."
},
{
"...": "..."
}
]
riotx file-import /tmp/redis.json
JSON records are trees with potentially nested values that need to be flattened when the target is a Redis hash for example.
To that end, RIOT-X uses a field naming convention to flatten JSON objects and arrays:
Nested object fields are flattened using dot notation (e.g. parent.child) and array elements using bracketed indices (e.g. field[0]).
XML
Here is a sample XML file that can be imported by RIOT-X:
<?xml version="1.0" encoding="UTF-8"?>
<records>
<trade>
<isin>XYZ0001</isin>
<quantity>5</quantity>
<price>11.39</price>
<customer>Customer1</customer>
</trade>
<trade>
<isin>XYZ0002</isin>
<quantity>2</quantity>
<price>72.99</price>
<customer>Customer2c</customer>
</trade>
<trade>
<isin>XYZ0003</isin>
<quantity>9</quantity>
<price>99.99</price>
<customer>Customer3</customer>
</trade>
</records>
riotx file-import http://storage.googleapis.com/jrx/trades.xml hset --keyspace trade --key id
Parquet
RIOT-X supports Parquet files.
riotx file-import s3://riotx/userdata1.parquet --s3-region us-west-1 hset --keyspace user --key id
File Export
The file-export command reads data from a Redis database and writes it to a JSON or XML file, optionally gzip-compressed.
The general usage is:
riot file-export [OPTIONS] FILE
To show the full usage, run:
riot file-export --help
JSON
riotx file-export /tmp/redis.json
[
{
"key": "string:615",
"ttl": -1,
"value": "value:615",
"type": "STRING"
},
{
"key": "hash:511",
"ttl": -1,
"value": {
"field1": "value511",
"field2": "value511"
},
"type": "HASH"
},
{
"key": "list:1",
"ttl": -1,
"value": [
"member:991",
"member:981"
],
"type": "LIST"
},
{
"key": "set:2",
"ttl": -1,
"value": [
"member:2",
"member:3"
],
"type": "SET"
},
{
"key": "zset:0",
"ttl": -1,
"value": [
{
"value": "member:1",
"score": 1.0
}
],
"type": "ZSET"
},
{
"key": "stream:0",
"ttl": -1,
"value": [
{
"stream": "stream:0",
"id": "1602190921109-0",
"body": {
"field1": "value0",
"field2": "value0"
}
}
],
"type": "STREAM"
}
]
riotx file-export /tmp/beers.json.gz --key-pattern beer:*
Memcached Replication
The memcached-replicate command reads data from a source Memcached database and writes to a target Memcached database.
riotx memcached-replicate SOURCE TARGET [OPTIONS]
For the full usage, run:
riotx memcached-replicate --help
riotx memcached-replicate mydb.cache.amazonaws.com:11211 mydb-12211.redis.com:12211 --source-tls
Redis Import
The redis-import command reads data from a Redis database and writes it to another Redis database.
The basic usage is:
riotx redis-import [OPTIONS] [REDIS COMMAND...]
For the full usage, run:
riotx redis-import --help
riotx redis-import --target-uri redis://localhost:6380 --key-pattern 'hash:*' --key-regex 'hash:(?<id>.+)' json.set --keyspace doc --key id --remove
Replication
The replicate command reads data from a source Redis database and writes to a target Redis database.
The replication mechanism is as follows:
- Identify source keys to be replicated using scan and/or keyspace notifications, depending on the replication mode.
- Read data associated with each key using dump or type-specific commands.
- Write each key to the target using restore or type-specific commands.
The basic usage is:
riot replicate [OPTIONS] SOURCE TARGET
where SOURCE and TARGET are Redis URIs.
For the full usage, run:
riot replicate --help
To replicate a Redis logical database other than the default (0), specify the database in the source Redis URI.
For example riot replicate redis://source:6379/1 redis://target:6379 replicates database 1.
Replication Mode
Replication starts with identifying keys to be replicated from the source Redis database.
The --mode option allows you to specify how RIOT-X identifies keys to be replicated:

- iterate over keys with a key scan (--mode scan)
- use keys received by a keyspace notification subscriber (--mode liveonly)
- or both (--mode live)
Scan
This key reader scans for keys using the Redis SCAN command:
SCAN cursor [MATCH pattern] [COUNT count] [TYPE type]

- MATCH pattern : configured with the --key-pattern option
- TYPE type : configured with the --key-type option
- COUNT count : configured with the --scan-count option
INFO: In cluster mode keys are scanned in parallel across cluster nodes.
The status bar shows progress with a percentage of keys that have been replicated. The total number of keys is estimated when the replication process starts and it can change by the time it is finished, for example if keys are deleted or added during replication.
riotx replicate redis://source redis://target
Live
The key notification reader listens for key changes using keyspace notifications.
Make sure the source database has keyspace notifications enabled using:
- redis.conf : notify-keyspace-events = KEA
- CONFIG SET notify-keyspace-events KEA
For more details see Redis Keyspace Notifications.
riotx replicate --mode live redis://source redis://target
The live replication mechanism does not guarantee data consistency. Redis sends keyspace notifications over pub/sub, which does not provide guaranteed delivery. It is possible that RIOT-X misses some notifications, for example in case of network failures. Also, depending on the type, size, and rate of change of data structures on the source, it is possible that RIOT-X cannot keep up with the change stream. For example if a big set is repeatedly updated, RIOT-X needs to read the whole set on each update and transfer it over to the target database. With a big-enough set, RIOT-X could fall behind and the internal queue could fill up, leading to updates being dropped. For those potentially problematic migrations it is recommended to perform some preliminary sizing using Redis statistics and the stats command.
Replication Types
RIOT-X offers two different mechanisms for reading and writing keys:
- Dump & restore (default)
- Data structure replication (--struct)
Dump & Restore
The default replication mechanism is Dump & Restore:
- Scan for keys in the source Redis database. If live replication is enabled the reader also subscribes to keyspace notifications to generate a continuous stream of keys.
- Reader threads iterate over the keys to read corresponding values (DUMP) and TTLs.
- Reader threads enqueue key/value/TTL tuples into the reader queue, from which the writer dequeues key/value/TTL tuples and writes them to the target Redis database by calling RESTORE and EXPIRE.
Data Structure Replication
There are situations where Dump & Restore cannot be used, for example:
- The target Redis database does not support the RESTORE command (Redis Enterprise CRDB)
- Incompatible DUMP formats between source and target (Redis 7.0)

In those cases you can use another replication strategy that is data structure-specific: each key is introspected to determine its type, and the corresponding read/write commands are used.
Type | Read | Write |
---|---|---|
Hash | HGETALL | HSET |
JSON | JSON.GET | JSON.SET |
List | LRANGE | RPUSH |
Set | SMEMBERS | SADD |
Sorted Set | ZRANGE | ZADD |
Stream | XRANGE | XADD |
String | GET | SET |
TimeSeries | TS.RANGE | TS.ADD |
This replication strategy is more intensive in terms of CPU, memory, and network for all the machines involved (source Redis, target Redis, and RIOT-X machines). Adjust the number of threads, batch and queue sizes accordingly.
riotx replicate --struct redis://source redis://target
riotx replicate --struct --mode live redis://source redis://target
Compare
Once replication is complete, RIOT-X performs a verification step by reading keys in the source database and comparing them against the target database.
The verification step happens automatically after the scan is complete (snapshot replication), or for live replication when keyspace notifications have become idle.
Verification can also be run on demand using the compare command:
riot compare SOURCE TARGET [OPTIONS]
The output looks like this:
Verification failed (type: 225,062, missing: 485,450)
- missing : Number of keys in source but not in target.
- type : Number of keys with mismatched types (e.g. hash vs string).
- value : Number of keys with mismatched values.
- ttl : Number of keys with mismatched TTL, i.e. difference greater than the tolerance (can be specified with --ttl-tolerance).
There are two comparison modes available through --compare (--quick for the compare command):

- Quick (default) : Compare key types and TTLs.
- Full : Compare key types, TTLs, and values.
To show which keys differ, use the --show-diffs option.
Performance
Performance tuning is an art, but RIOT-X offers some options to identify potential bottlenecks.
In addition to the --batch and --threads options, the --dry-run option disables writing to the target Redis database so that you can tune the reader in isolation.
Add that option to your existing replicate command line to compare replication speeds with and without writing to the target Redis database.
Stats
The stats command analyzes the Redis database and displays keyspace statistics as well as keys that could be problematic during a live replication.
The basic usage is:
riotx stats [OPTIONS]
For the full usage, run:
riotx stats --help
- --mem <size> : Memory usage threshold above which a key is considered big.
- --rate <size> : Write bandwidth above which a key is considered problematic.
riotx stats --mem 3mb --rate 10mb
Stream
Stream Import
The stream-import command reads data from a stream and writes it to Redis.
The basic usage is:
riotx stream-import STREAM...
For the full usage, run:
riotx stream-import --help
riotx stream-import stream:beers --idle-timeout 1s hset --keyspace beer --key id
Stream Export
The stream-export command enables Redis CDC to a Redis stream.
riotx stream-export SOURCE TARGET [OPTIONS]
For the full usage, run:
riotx stream-export --help
riotx stream-export redis://localhost:6379 redis://localhost:6380 --mode live
redis-cli -p 6380 xread COUNT 3 STREAMS stream:export 0-0
1) 1) "stream:export"
2) 1) 1) "1718645537588-0"
2) 1) "key"
2) "order:4"
3) "time"
4) "1718645537000"
5) "type"
6) "hash"
7) "ttl"
8) "-1"
9) "mem"
10) "136"
11) "value"
12) "{\"order_date\":\"2024-06-13 22:19:35.143797\",\"order_id\":\"4\"}"
Cookbook
Here are various recipes using RIOT-X.
Observability
RIOT-X exposes several metrics over a Prometheus endpoint that can be useful for troubleshooting and performance tuning.
Getting Started
The riotx-dist repository includes a Docker Compose configuration that sets up Prometheus and Grafana.
git clone https://github.com/redis-field-engineering/riotx-dist.git
cd riotx-dist
docker compose up
Prometheus is configured to scrape the host every second.
You can access the Grafana dashboard at localhost:3000.
Now start RIOT-X with the following command:
riotx replicate ... --metrics
This will enable the Prometheus metrics exporter endpoint and will populate the Grafana dashboard.
Configuration
Use the --metrics* options to enable and configure metrics:

- --metrics : Enable metrics
- --metrics-jvm : Enable JVM and system metrics
- --metrics-redis : Enable command latency metrics. See https://github.com/redis/lettuce/wiki/Command-Latency-Metrics#micrometer
- --metrics-name=<name> : Application name tag that will be applied to all metrics
- --metrics-port=<int> : Port that the Prometheus HTTP server should listen on (default: 8080)
- --metrics-prop=<k=v> : Additional properties to pass to the Prometheus client. See https://prometheus.github.io/client_java/config/config/
Metrics
Below you can find a list of all metrics declared by RIOT-X.
Replication Metrics
Name | Type | Description |
---|---|---|
|
Counter |
Number of bytes replicated (needs memory usage with |
|
Summary |
Replication end-to-end latency |
|
Summary |
Replication read latency |
|
Timer |
Batch writing duration |
|
Timer |
Item processing duration |
|
Timer |
Item reading duration |
|
Timer |
Active jobs |
|
Counter |
Job launch count |
|
Gauge |
Gauge reflecting the remaining capacity of the queue |
|
Gauge |
Gauge reflecting the size (depth) of the queue |
|
Counter |
Number of keys scanned |
|
Timer |
Operation execution duration |
|
Gauge |
Gauge reflecting the chunk size of the reader |
|
Gauge |
Gauge reflecting the remaining capacity of the queue |
|
Gauge |
Gauge reflecting the size (depth) of the queue |
JVM Metrics
Use the --metrics-jvm option to enable the following additional metrics:
Name | Type | Description |
---|---|---|
jvm.buffer.count | Gauge | An estimate of the number of buffers in the pool |
jvm.buffer.memory.used | Gauge | An estimate of the memory that the Java virtual machine is using for this buffer pool |
jvm.buffer.total.capacity | Gauge | An estimate of the total capacity of the buffers in this pool |
jvm.gc.concurrent.phase.time | Timer | Time spent in concurrent phase |
jvm.gc.live.data.size | Gauge | Size of long-lived heap memory pool after reclamation |
jvm.gc.max.data.size | Gauge | Max size of long-lived heap memory pool |
jvm.gc.memory.allocated | Gauge | Incremented for an increase in the size of the (young) heap memory pool after one GC to before the next |
jvm.gc.memory.promoted | Counter | Count of positive increases in the size of the old generation memory pool before GC to after GC |
jvm.gc.pause | Timer | Time spent in GC pause |
jvm.memory.committed | Gauge | The amount of memory in bytes that is committed for the Java virtual machine to use |
jvm.memory.max | Gauge | The maximum amount of memory in bytes that can be used for memory management |
jvm.memory.used | Gauge | The amount of used memory |
jvm.threads.daemon | Gauge | The current number of live daemon threads |
jvm.threads.live | Gauge | The current number of live threads including both daemon and non-daemon threads |
jvm.threads.peak | Gauge | The peak live thread count since the Java virtual machine started or peak was reset |
jvm.threads.started | Counter | The total number of application threads started in the JVM |
jvm.threads.states | Gauge | The current number of threads |
process.cpu.time | Counter | The "cpu time" used by the Java Virtual Machine process |
process.cpu.usage | Gauge | The "recent cpu usage" for the Java Virtual Machine process |
process.start.time | Gauge | Start time of the process since the unix epoch |
process.uptime | Gauge | The uptime of the Java virtual machine |
system.cpu.count | Gauge | The number of processors available to the Java virtual machine |
system.cpu.usage | Gauge | The "recent cpu usage" of the system the application is running in |
system.load.average.1m | Gauge | The sum of the number of runnable entities queued to available processors and the number of runnable entities running on the available processors averaged over a period of time |
Changelog
You can use RIOT-X to stream change data from a Redis database.
riot file-export --mode live
With no file argument, the change records are written to standard output as JSON lines:
{"key":"gen:1","type":"string","time":1718050552000,"ttl":-1,"memoryUsage":300003376}
{"key":"gen:3","type":"string","time":1718050552000,"ttl":-1,"memoryUsage":300003376}
{"key":"gen:6","type":"string","time":1718050552000,"ttl":-1,"memoryUsage":300003376}
...
To write the change stream to a file instead, pass a file argument:
riot file-export export.json --mode live
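Each change record exported in live mode is a single JSON object per line, so downstream consumers can process the stream with any JSON-lines reader. A minimal Python sketch using the sample records above:

```python
import json

# Sample change records as emitted by file-export in live mode
stream = """\
{"key":"gen:1","type":"string","time":1718050552000,"ttl":-1,"memoryUsage":300003376}
{"key":"gen:3","type":"string","time":1718050552000,"ttl":-1,"memoryUsage":300003376}"""

for line in stream.splitlines():
    record = json.loads(line)
    # time is a millisecond epoch timestamp; a ttl of -1 means the key has no expiry
    print(record["key"], record["type"], record["memoryUsage"])
```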
ElastiCache Migration
This recipe contains step-by-step instructions to migrate an ElastiCache (EC) database to Redis Cloud or Redis Software.
The following scenarios are covered:
- One-time (snapshot) migration
- Online (live) migration
It is recommended to read the Replication section to familiarize yourself with its usage and architecture.
Setup
Prerequisites
For this recipe you will need the following resources:
- AWS ElastiCache: the Primary Endpoint for a single-master deployment or the Configuration Endpoint for a clustered EC deployment. Refer to this link to learn more.
- An Amazon EC2 instance to run RIOT-X
Keyspace Notifications
For a live migration you need to enable keyspace notifications on your ElastiCache instance (see AWS Knowledge Center).
Migration Host
To run the migration tool we will need an EC2 instance.
You can either create a new EC2 instance or leverage an existing one. In the example below we first create an instance on AWS. The most common scenario is to access an ElastiCache cluster from an Amazon EC2 instance in the same Amazon Virtual Private Cloud (Amazon VPC). This setup uses Ubuntu 16.04 LTS, but you can choose any Ubuntu or Debian distribution.
SSH to this EC2 instance from your laptop:
ssh -i "<private key file>" <AWS EC2 instance>
Install redis-cli on this new instance by running:
sudo apt update
sudo apt install -y redis-tools
Use redis-cli to check connectivity with the ElastiCache database:
redis-cli -h <ec primary endpoint> -p 6379
Ensure that the above command allows you to connect to the remote ElastiCache database successfully.
Installing RIOT-X
Let’s install RIOT-X on the EC2 instance we set up previously. For this we’ll follow the steps in Manual Installation.
Performing Migration
We are now all set to begin the migration process. The options you will use depend on your source and target databases, as well as the replication mode (snapshot or live).
Live ElastiCache Single Master → Redis
riot replicate source:port target:port --mode live
If ElastiCache is configured with an AUTH token, you need to pass it as the password in the source Redis URI, for example redis://:<auth-token>@source:port.
ElastiCache Cluster → Redis
riot replicate source:port target:port --source-cluster
--source-cluster is an important parameter used ONLY for ElastiCache when cluster mode is enabled.
Note that in all scenarios the source database is specified first and the target database second, after the replicate command.
ElastiCache Single Master → Redis (with specific database index)
riot replicate redis://source:port/db target:port
Important Considerations
- It is recommended to test the migration in UAT before production use.
- Once the migration is complete, ensure that application traffic is redirected to the Redis endpoint successfully.
- It is recommended to perform the migration during low-traffic hours to reduce the chance of data loss.
Connectivity Test
The ping command can be used to test connectivity to a Redis database.
riot ping [OPTIONS]
To show the full usage, run:
riot ping --help
The command prints statistics like these:
riot ping -h localhost --unit microseconds
[min=491, max=14811, percentiles={99.9=14811, 90.0=1376, 95.0=2179, 99.0=14811, 50.0=741}]
[min=417, max=1286, percentiles={99.9=1286, 90.0=880, 95.0=1097, 99.0=1286, 50.0=606}]
[min=382, max=2244, percentiles={99.9=2244, 90.0=811, 95.0=1036, 99.0=2244, 50.0=518}]
...
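Each line of output summarizes the latency percentiles for a batch of PING round-trips. For reference, the same kind of summary can be computed from raw latency samples; a small Python sketch (the sample values below are made up):

```python
import statistics

# Hypothetical PING round-trip latencies in microseconds
samples = [491, 520, 610, 655, 700, 741, 830, 1376, 2179, 14811]

# statistics.quantiles with n=100 returns the 1st..99th percentile cut points
cuts = statistics.quantiles(samples, n=100)
summary = {
    "min": min(samples),
    "max": max(samples),
    "50.0": cuts[49],  # median
    "90.0": cuts[89],
    "99.0": cuts[98],
}
print(summary)
```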
Best Practices
This section contains best practices and recipes for various RIOT-X use cases.
Replication Performance Tuning
The replicate command reads from a source Redis database and writes to a target Redis database.
Replication Bottleneck
To optimize throughput it is necessary to understand the two main possible scenarios:
- Slow Producer
-
In this scenario the reader does not read from the source as fast as the writer can write to the target. The writer is starved, so look for ways to speed up the reader.
- Slow Consumer
-
In this scenario the writer cannot keep up with the reader, so look into optimizing writes.
There are two ways to identify which scenario applies:
- No-op writer
-
With the --dry-run option the replication process uses a no-op writer instead of a Redis writer. If throughput with --dry-run is similar to throughput without it, the writer is not the bottleneck; follow the steps below to improve reader throughput.
- Reader queue utilization
-
Using the Grafana dashboard you can monitor reader queue depth. Low queue utilization means the writer can keep up with the reader. Queue utilization close to 100% means writes are slower than reads.
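Queue utilization can be derived from the two reader queue gauges (size and remaining capacity). A minimal Python sketch of the calculation, with made-up gauge values:

```python
def queue_utilization(queue_size: int, remaining_capacity: int) -> float:
    """Fraction of the reader queue currently in use."""
    total = queue_size + remaining_capacity
    return queue_size / total if total else 0.0

# Example: 9500 items queued, 500 slots free -> the writer is falling behind
print(f"{queue_utilization(9500, 500):.0%}")  # prints 95%
```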
Reader
To improve reader performance tweak the options below until you reach optimal throughput.
--read-threads
-
How many value reader threads to use in parallel (default: 1).
--read-batch
-
Number of values each reader thread should read in a single pipelined call (default: 50).
--read-queue
-
Capacity of the reader queue (default: 10000). When the queue is full the threads wait for space to become available. Increase this value if you have peaky traffic on the source database causing fluctuating reader throughput.
--source-pool
-
Number of Redis connections to the source database (default: 8). Keep in sync with the number of read threads to have a dedicated connection per thread.
Writer
To improve writer performance you can tweak the following options:
--batch
-
Number of items written in a single network round-trip to the Redis server (i.e. number of commands in the pipeline).
--threads
-
How many write operations can be performed concurrently (default: 1).
--target-pool
-
Number of Redis connections to the target database (default: 8). Keep in sync with the number of threads to have a dedicated connection per thread.
System Requirements
Operating System
RIOT-X works on all major operating systems but has been tested at scale on Linux X86 64-bit platforms.
CPU
CPU usage by RIOT-X varies greatly depending on the specific replication settings and data structures at play.
You can monitor CPU usage with the supplied Grafana dashboard (process_cpu_usage
metric).
Memory
Memory requirements for RIOT-X itself are very light. Being JVM-based, RIOT-X derives its default initial heap size from available system memory and the operating system.
If you have very intensive replication requirements you will need to increase the JVM heap size.
To estimate the worst-case memory requirement you can use the formula keySize * queueSize, where:
keySize
-
Average key size as reported by the MEMORY USAGE command
queueSize
-
Redis reader queue capacity configured with the --read-queue option
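As a worked example of the formula (the values below are illustrative assumptions, not measurements): with an average key size of 1 KiB and the default reader queue capacity of 10000, the worst case is roughly 10 MB held in the queue.

```python
# Worst-case memory held in the reader queue: keySize * queueSize
key_size_bytes = 1024   # assumed average size reported by MEMORY USAGE
queue_size = 10_000     # default --read-queue capacity
worst_case_bytes = key_size_bytes * queue_size
print(f"{worst_case_bytes / 1024**2:.1f} MiB")  # prints 9.8 MiB
```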
Conversely, if you need to minimize the memory used by RIOT-X you can lower the reader queue size (possibly at the expense of reader throughput).
Network
RIOT-X replication is essentially a network bridge between the source and target Redis databases, so the underlying network is crucial to overall throughput; a 10 Gigabit network is the minimum recommended. Network latency will also impact the performance of replication (and other RIOT-X uses). Make sure the host running RIOT-X has minimal latency to both the source and target databases. You can test latency using the ping command.
FAQ
- Logs are cut off or missing
This could be due to concurrency issues in the terminal when refreshing the progress bar and displaying logs. Try running with the job option --progress log.
- Unknown options: '--keyspace', '--key'
You must specify one or more Redis commands with import commands (file-import, faker, db-import).
- ERR DUMP payload version or checksum are wrong
The Redis 7 DUMP format is not backwards compatible with previous versions. To replicate between different Redis versions, use Type-Based Replication.
- ERR Unsupported Type 0
The target database is most likely CRDB, in which case you need to use type-based replication (--struct option).
- Process gets stuck during replication and eventually times out
This could be due to big keys clogging the replication pipes. In these cases it might be hard to catch the offending key(s). Try running the same command with --info and --progress log so that all errors are reported. Check the database with redis-cli --bigkeys and/or use reader options to filter these keys out.
- NOAUTH Authentication required
This issue occurs when you fail to supply the --pass <password> parameter.
- ERR The ID argument cannot be a complete ID because xadd-id-uniqueness-mode is strict
This usually happens in Active/Active (CRDB) setups where stream message IDs cannot be copied over to the target database. Use the --no-stream-id option to disable ID propagation.
- ERR Error running script… This Redis command is not allowed from scripts
This can happen with Active/Active (CRDB) databases because the MEMORY USAGE command is not allowed to run from a Lua script. Use the --mem-limit -1 option to disable memory usage.
- java.lang.OutOfMemoryError: Java heap space
The RIOT JVM ran out of memory. If you are running db-import this could be due to a large result set being loaded upfront. Use the --fetch option to set a fixed fetch size (e.g. --fetch 1000). Otherwise increase the max JVM heap size (export JAVA_OPTS="-Xmx8g") or reduce RIOT memory usage by lowering threads, batch, read-batch and read-queue.