GridDB operation tools reference

Revision: 4.3.2-9

1 Introduction

1.1 Purpose and structure of this manual

This manual describes the operating tools of GridDB.

It is written for system designers responsible for building GridDB systems and for system administrators responsible for their operation and management.

The contents of each chapter are as follows:

2 Service

2.1 Preparing to use the service

The procedure to install and use the GridDB service is as follows.

  1. Install GridDB server package and client package.
  2. Configure the respective GridDB nodes that constitute a GridDB cluster.
  3. Configure the start configuration file.

See the "GridDB database administrator guide" (GridDB_AdministratorsGuide.html) for the procedure to install GridDB and configure a GridDB node.

The table below shows the kinds of files used by the GridDB service.

Type Meaning
Service script Script file executed automatically during OS startup.
It is installed as /etc/init.d/gridstore by the GridDB server package and is registered with the system as the GridDB service.
PID file File containing only the process ID (PID) of the gsserver process. It is created as $GS_HOME/conf/gridstore.pid when the gsserver process is started.
Start configuration file File containing the parameters that can be set for the service.
It is installed as /etc/sysconfig/gridstore/gridstore.conf by the GridDB server package.

2.2 Parameter setting

The parameters below control the operation of the GridDB service.

Property Default Note
GS_USER admin GridDB user name
GS_PASSWORD admin GS_USER password
CLUSTER_NAME INPUT_YOUR_CLUSTER_NAME_HERE Cluster name to join
MIN_NODE_NUM 1 Number of nodes constituting a cluster

To change the parameters, edit the start configuration file ( /etc/sysconfig/gridstore/gridstore.conf ).
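For example, a start configuration file for a three-node cluster might look as follows (the cluster name "myCluster" and the values shown are illustrative):

GS_USER=admin
GS_PASSWORD=admin
CLUSTER_NAME=myCluster
MIN_NODE_NUM=3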

When the server package is updated or uninstalled, the start configuration file is neither overwritten nor removed.

[Notes]

2.3 Log

See the boot log (/var/log/boot.log) and the operating command logs ($GS_HOME/log) for details of the service log.

2.4 Command

GridDB service commands are shown below.

[Notes]

2.4.1 start

Action:

# service gridstore start

[Notes]

2.4.2 stop

Action:

# service gridstore stop

[Notes]

2.4.3 status

Action:

# service gridstore status

2.4.4 restart

Action:
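Following the same pattern as the start and stop commands above, the restart operation is invoked as:

# service gridstore restart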

2.4.5 condrestart

Action:
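By the usual service-script convention (restart only if the service is already running), condrestart is invoked as:

# service gridstore condrestart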

2.5 Error message list

Service error messages are as shown below.

Code Message Meaning
F00003 Json load error Reading of definition file failed.
F01001 Stop service timed out Stop node process timed out.
F01002 Startnode error An error occurred in the node startup process.
F01003 Startnode timed out Start node process timed out.
F01004 Joincluster error An error occurred in the join cluster process.
F01005 Joincluster timed out Join cluster process timed out.
F01006 Leavecluster error An error occurred in the leave cluster process.
F02001 Command execution error An error occurred in the command execution.
F02002 Command execution timed out Command execution timed out.

[Memo]

3 Operating commands

3.1 Command list

The following commands are available in GridDB.

Type Functions Command RPM package
(1) Start/stop node start node gs_startnode server
stop node gs_stopnode client
(2) User management Registration of administrator user gs_adduser server
Deletion of administrator user gs_deluser server
Change the password of an administrator user gs_passwd server
(3) Cluster management Joining a cluster configuration gs_joincluster client
Leaving a cluster configuration gs_leavecluster client
Stopping a cluster gs_stopcluster client
Getting cluster configuration data gs_config client
Getting node status gs_stat client
Adding a node to a cluster gs_appendcluster client
Manual failover of a cluster gs_failovercluster client
Getting partition data gs_partition client
Increasing the no. of nodes of the cluster gs_increasecluster client
Set up autonomous data redistribution of a cluster gs_loadbalance client
Set up data redistribution goal of a cluster gs_goalconf client
Controlling the checkpoint of the node gs_checkpoint server
(4) Log data Displaying recent event logs gs_logs client
Displaying and changing the event log output level gs_logconf client
(5) Backup/restoration backup execution gs_backup server
Check backup data gs_backuplist server
Backup/restoration gs_restore server
(6) Import/export Import gs_import client
Export gs_export client
(7) Maintenance Displaying and changing parameters gs_paramconf client

[Memo]

 

3.2 Common functions of operating commands

[Command option]

The options below are common options that can be used in all commands.

Options Note
-h|--help Display the command help.
--version Display the version of the operating command.

[Example]

The options below are common options that can be used in some of the commands.

Options Note
-s <Server>[:<Port no.>] | -p <Port no.> The host name or address of the server, and the port number, i.e., the connection port no. of the operating command.
"localhost (127.0.0.1):10040" is used by default.
-u <User name>/<Password> Specify the authentication user and password.
-w|--wait [<No. of sec>] Wait for the process to end.
There is no time limit if the time is not set or if it is set to 0.
-a|--address-type <Address type> Specify the service type of the port and address to display.
system: connection address of the operating command
cluster: reception address used for cluster administration
transaction: reception address for transaction processing
sync: reception address used for synchronization processing
--no-proxy If specified, the proxy will not be used.

[Memo]

[Termination status]

The end status of the command is shown below.

[Log file]

Log file of the command will be saved as ${GS_LOG}/<command name>.log.

[Example] If the GS_LOG value is "/var/lib/gridstore/log" (the default) and the "gs_startnode" command is executed, the following log file is created: /var/lib/gridstore/log/gs_startnode.log

3.3 Points to note

[Before using an operating command]

[To compose a cluster]

A cluster is composed of one or more nodes: one node acts as the master, and the rest are followers.

In a cluster configuration, two numbers are important: the number of nodes already participating in the cluster and the number of nodes constituting the cluster. The former is the actual number of nodes that have joined the cluster; the latter is the number of nodes that can join the cluster, as specified in the gs_joincluster command.

The number of nodes already participating in a cluster and the number of nodes constituting a cluster can be checked by executing a gs_stat command on the master node, with the values being /cluster/activeCount and /cluster/designatedCount respectively.

The main procedure to create/change a cluster configuration is shown below for reference purposes. See the following sections for details of each command.

3.4 Starting/stopping a node

3.4.1 Starting a node

Execute the GridDB start node command on the machine executing the node. This command needs to be executed for each GridDB node.

[Memo]
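For reference, a typical invocation is sketched below (the -w option waits for the node to finish starting; check the options against your version):

$ gs_startnode -u admin/admin -w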

 

3.4.2 Stopping a node

The following command is used to stop the GridDB node. To stop a node, the GridDB cluster management process needs to be stopped first.

[Memo]
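A sketch of a typical stop invocation, assuming the cluster management process has already been stopped with gs_stopcluster:

$ gs_stopnode -u admin/admin -w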

 

3.5 User management

User management is used to register and delete GridDB administrator users and to change their passwords.

The default user below exists after installation.

[Notes]

 

3.5.1 Registration of administrator user

[Memo]

[Example]
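A sketch, where the user name admin2 is illustrative:

$ gs_adduser admin2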

 

3.5.2 Deletion of administrator user

[Memo]

[Example]
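A sketch, with the illustrative user name admin2:

$ gs_deluser admin2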

 

3.5.3 Update password

[Memo]

[Example]
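A sketch, with the illustrative user name admin2 (the new password is prompted for):

$ gs_passwd admin2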

 

3.6 Cluster management

3.6.1 Joining a cluster configuration

When composing a GridDB cluster, the nodes need to be attached (joined) to the cluster.

[Memo]

[Example] Compose a 3-node cluster with the cluster name "example_three_nodes_cluster" using node A - C
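A sketch of the command to run on each of nodes A to C (the -n option gives the number of nodes constituting the cluster; verify the options against your version):

$ gs_joincluster -c example_three_nodes_cluster -n 3 -u admin/admin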

3.6.2 Leaving a cluster configuration

The following command is used to detach a node from a cluster.  

[Memo]

[Example]
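A sketch, run against the node to be detached:

$ gs_leavecluster -u admin/admin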

 

3.6.3 Stopping a cluster

The following command is used to stop a cluster.  

[Memo]

[Example]
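A minimal sketch:

$ gs_stopcluster -u admin/admin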

 

3.6.4 Getting cluster configuration data

The following command is used to get the cluster configuration data (data on list of nodes joined to a cluster).  

[Memo]

[Example]
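A minimal sketch:

$ gs_config -u admin/admin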

 

3.6.5 Getting node status

The following command gets the cluster data (cluster configuration data and internal data), or backup progress status.

[Memo]

[Example]
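A minimal sketch; the node status is returned as JSON:

$ gs_stat -u admin/admin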

 

3.6.6 Adding a node to a cluster

Add a new node to a cluster in operation.  

[Memo]

[Example]

 

3.6.7 Manual failover of a cluster

The following command is used to execute GridDB cluster failover.

[Memo]

[Example]

 

3.6.8 Getting partition data

The following command is used to display the partition data of a GridDB node.

[Memo]

[Example]
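A minimal sketch:

$ gs_partition -u admin/admin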

 

3.6.9 Increasing the no. of nodes of the cluster

Increase the no. of nodes of the GridDB cluster.

[Memo]

[Example]

3.6.10 Set up autonomous data redistribution of a cluster

Enable/disable autonomous data redistribution of a GridDB cluster, or display the current setting. Disabling autonomous data redistribution, for example while stopping nodes and rejoining them to the cluster during a rolling upgrade, eliminates redundant redistribution processing and reduces the operational load.

[Memo]

[Example]

Confirm the settings of autonomous data redistribution on all nodes in a cluster.
$ gs_loadbalance -s 192.168.33.29:10040  -u admin/admin --cluster
192.168.33.29 ACTIVE
192.168.33.30 ACTIVE
192.168.33.31 ACTIVE

Disable the setting of the node, "192.168.33.31".
$ gs_loadbalance -s 192.168.33.31:10040  -u admin/admin --off

 

3.6.11 Set up data redistribution goal of a cluster

Enable/disable autonomous setting of the data redistribution goal of a GridDB cluster, display the current goal, or set the goal manually. These commands are used during rolling upgrades to detach a node safely from the cluster.

[Example]


Confirm the settings of the data redistribution goal on all nodes in a cluster.
$ gs_goalconf -s 192.168.33.29:10040  -u admin/admin --cluster
192.168.33.29 ACTIVE
192.168.33.30 ACTIVE
192.168.33.31 ACTIVE

Disable the setting of the node, "192.168.33.31".
$ gs_goalconf -s 192.168.33.31:10040  -u admin/admin --off

Set up the data redistribution goal to leave the node of "192.168.33.31" for all the nodes in a cluster.
$ gs_goalconf -u admin/admin --cluster --leaveNode 192.168.33.31
Switching 43 owners to backup on 192.168.33.31:10040 ...
Setting goal requests have been sent. Sync operations will be started when loadbalancer is active.

 

3.6.12 Controlling the checkpoint

Enable/disable the periodic checkpoint of a GridDB node, or execute manual checkpoint.

[Memo]

[Example]

Disable the periodic checkpoint 
$ gs_checkpoint -u admin/admin --off

Perform the manual checkpoint and wait to complete. 
$ gs_checkpoint -u admin/admin --manual -w
...
The manual checkpoint has been completed.

Re-enable the periodic checkpoint 
$ gs_checkpoint -u admin/admin --on

 

3.7 Log data

3.7.1 Displaying recent event logs

The following command is used to get the most recent GridDB event log.  

[Memo]

[Example]
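A minimal sketch:

$ gs_logs -u admin/admin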

 

3.7.2 Displaying and changing the event log output level

The following command is used to display or change the event log output level. If no argument is specified, the list of current settings is displayed.

[Memo]

[Example]
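A sketch; with no argument the current output levels are listed:

$ gs_logconf -u admin/admin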

 

3.8 Backup/restoration

3.8.1 Backup

The following command is used to get GridDB backup data on a per-node basis while continuing services.

A backup of the entire cluster can be carried out while continuing services by backing up all the nodes constituting the cluster in sequence.

<mode option>

Backup

[Memo]

[Example]
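A sketch, where the backup name 20240101 is illustrative:

$ gs_backup -u admin/admin 20240101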

3.8.2 Checking backup data

The following is used to get a list of the backup data in the backup directory set up in the node definition file (gs_node.json).  

[Memo]

[Example]
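A minimal sketch:

$ gs_backuplist -u admin/admin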

 

3.8.3 Restoration

The following command is used to restore a GridDB backup file.

[Memo]

[Example]

[Example]
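A sketch, with the illustrative backup name 20240101 (gs_restore is executed while the node is stopped):

$ gs_restore 20240101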

3.9 Maintenance

3.9.1 Displaying and changing parameters

The following command is used to display or change the node parameters.

[Memo]

[Example]
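A sketch, where the parameter name storeMemoryLimit is illustrative (check the option names against your version):

$ gs_paramconf -u admin/admin --show storeMemoryLimit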

 

4 Cluster operation control command interpreter (gs_sh)

4.1 Overview

The cluster operation control command interpreter (hereinafter referred to as gs_sh) is a command line interface tool for managing GridDB cluster operations and data operations.

The following can be carried out by gs_sh.

4.2 Using gs_sh

4.2.1 Preliminary preparations

Carry out the following preparations before using gs_sh.

4.2.2 gs_sh start-up

There are two types of start modes in gs_sh.

[Memo]
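For reference, the two modes are typically invoked as follows (the script file name is illustrative):

$ gs_sh            // interactive mode
$ gs_sh test.gsh   // batch mode, executing the specified script file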

 

4.3 Definition of a GridDB cluster

The definition below is required in advance when executing a GridDB cluster operation control or data operation.

An explanation of node variables, cluster variables, and how to define user data is given below. Also explained are the definition of arbitrary variables, the display of variable definitions, and how to save variable definitions to, and load them from, a script file.

 

4.3.1 Definition of node variable

Define the IP address and port no. of a GridDB node in the node variable.

[Memo]
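A sketch of the sub-command, where the variable name, address, and port are illustrative:

gs> setnode node1 192.168.1.10 10040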

  

4.3.2 Definition of cluster variable

Define the GridDB cluster configuration in the cluster variable.

[Memo]

 

In addition, node variables can be added or deleted for a defined cluster variable.

[Memo]

 

4.3.3 Defining the SQL connection destination of a cluster

Define the SQL connection destination in the GridDB cluster configuration. This is set up only when using the GridDB NewSQL interface.

[Memo]

 

4.3.4 Definition of a user

Define the user and password to access the GridDB cluster.

[Memo]

  

4.3.5 Definition of arbitrary variables

Define an arbitrary variable.

[Memo]

 

4.3.6 Displaying the variable definition

Display the detailed definition of the specified variable.

[Memo]

 

4.3.7 Saving a variable definition in a script file

Save the variable definition details in the script file.

[Memo]

 

4.3.8 Executing a script file

Read and execute a script file.

[Memo]

4.4 GridDB cluster operation controls

The following operations can be executed by the administrator user only as functions to manage GridDB cluster operations.

4.4.1 Node status

This section explains the status of a GridDB node and GridDB cluster.

A cluster is composed of 1 or more nodes. A node status represents the status of the node itself e.g. start or stop etc. A cluster status represents the acceptance status of data operations from a client. A cluster status is determined according to the status of the node group constituting the cluster.

An example of the change in the node status and cluster status due to a gs_sh sub-command operation is shown below. A cluster is composed of 4 nodes. When the nodes constituting the cluster are started (startnode), the node status changes to "Start". When the cluster is started after starting the nodes (startcluster), each node status changes to "Join", and the cluster status also changes to "In Operation".

Status example

A detailed explanation of the node status and cluster status is given below.

Node status

Node status changes to "Stop", "Start" or "Join" depending on whether the node has been started, stopped, joined to a cluster, or detached from it. If a node has joined a cluster, there are two node statuses depending on the status of the joined cluster.

Node status
Status Status name Note
Join SERVICING Node is joined to the cluster, and the status of the joined cluster is "In Operation"
WAIT Node is joined to the cluster, and the status of the joined cluster is "Halted"
Start STARTED Node is started but has not joined a cluster
STARTING Starting node
Stop STOP Stopped node
STOPPING Stopping node

 

Cluster status

GridDB cluster status changes to "Stop", "Halted" or "In Operation" depending on the operation start/stop status of the GridDB cluster or the join/leave operation of the GridDB node. Data operations from the client can be accepted only when the GridDB cluster status is "In Operation".

Cluster status
Status Status name Note
In Operation SERVICE_STABLE All nodes defined in the cluster configuration have joined the cluster
SERVICE_UNSTABLE More than half the nodes defined in the cluster configuration have joined the cluster
Halted WAIT Half or more of the nodes defined in the cluster configuration have left the cluster
INIT_WAIT 1 or more of the nodes defined in the cluster configuration have left the cluster (when the cluster is operated for the first time, the status will not change to "In Operation" unless all nodes have joined the cluster)
Stop STOP All nodes defined in the cluster configuration have left the cluster

The GridDB cluster status will change from "Stop" to "In Operation" when all nodes constituting the GridDB cluster are allowed to join the cluster. In addition, the GridDB cluster status will change to "Halted" when half or more of the nodes have left the cluster, and to "Stop" when all the nodes have left the cluster.

Join and leave operations (which affect the cluster status) can be applied in a batch to all the nodes in the cluster, or to an individual node.

Operation When the operating targets are all the nodes When the operating target is a single node
Join startcluster : Batch entry of a group of nodes that are already operating but have not joined the cluster yet. joincluster : Entry by a node that is in operation but has not joined the cluster yet.
Leave stopcluster : Batch detachment of a group of nodes joined to a cluster. leavecluster : Detachment of a node joined to a cluster.

[Memo]

Details of the various operating methods are explained below.

4.4.2 Starting a node

Start the specified node.

[Memo]

4.4.3 Stopping a node

Stop the specified node.

In addition, the specified node can be forced to stop as well.

[Memo]

4.4.4 Batch entry of nodes in a cluster

This section explains how to add nodes to a cluster in a batch. When a group of operating but unattached nodes is added to the cluster, the cluster status changes to "In Operation".

[Memo]

4.4.5 Batch detachment of nodes from a cluster

To stop a GridDB cluster, simply make the attached nodes leave the cluster using the stopcluster command.

[Memo]

4.4.6 Node entry in a cluster

Rejoin a node to the cluster after it has temporarily left, either through the leavecluster sub-command or due to a failure.

[Memo]

4.4.7 Detaching a node from a cluster

Detach the specified node from the cluster. Also force the specified active node to be detached from the cluster.

[Memo]

 

4.4.8 Adding a node to a cluster

Add an undefined node to a pre-defined cluster.

[Memo]

4.4.9 Displaying cluster status data

Display the status of an active GridDB cluster, and each node constituting the cluster.

[Memo]

 

4.4.10 Displaying configuration data

Display the cluster configuration data.

[Memo]

 

4.4.11 Displaying node status

Display the node configuration data.

[Memo]

 

4.4.12 Displaying event log

Displays the log of the specified node.

 

The output level of a log can be displayed and changed.

[Memo]

 

4.4.13 Displaying SQL processing under execution

Display the SQL processing under execution. This function can be executed in the GridDB Advanced Edition only.

[Memo]

  

4.4.14 Displaying executing event

Display the event list executed by the thread in each node in a cluster.

[Memo]

 

4.4.15 Displaying connection

Display the list of connections.

[Memo]

 

4.4.16 SQL cancellation

This function can be executed in the GridDB Advanced Edition only.

Cancel the SQL processing in progress.

[Memo]

 

4.5 Data operation in a database

To execute a data operation, there is a need to connect to the cluster subject to the operation. Data in the database configured during the connection ("public" when the database name is omitted) will be subject to the operation.

4.5.1 Connecting to a cluster

Establish connection to a GridDB cluster to execute a data operation.

[Memo]
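A sketch, assuming a cluster variable $cluster1 has already been defined; on success, the prompt changes to show the current database name:

gs> connect $cluster1
gs[public]>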

 

4.5.2 Search (TQL)

Execute a search and retain the search results.

[Memo]

 

4.5.3 SQL command execution

Execute an SQL command and retain the search result. This function can be executed in the GridDB Advanced Edition only.

 

Sub-command name 'sql' can be omitted when the first word of the SQL statement is one of the following.

[Memo]

 

4.5.4 Getting search results

The following command gets the inquiry results and presents them in different formats. There are 3 ways to output the results as listed below.

(A) Display the results obtained in a standard output.

(B) Save the results obtained in a file in the CSV format.

(C) Results obtained will not be output.

Example:

//execute a search
gs[public]> tql c001 select *;
5 results. 

//Get first result and display
gs[public]> get 1
name,status,count
mie,true,2
The 1 result has been acquired. 

//Get second and third results and save them in a file
gs[public]> getcsv /var/lib/gridstore/test2.csv 2
The 2 results had been acquired. 

//Get fourth result
gs[public]> getnoprint 1
The 1 result has been acquired. 

//Get fifth result and display
gs[public]> get 1
name,status,count
akita,true,45
The 1 result has been acquired.

[Memo]

 

4.5.5 Getting the execution plan

Execute the specified TQL command and display its execution plan; the search itself is not executed.

In addition, by actually executing the specified TQL command, measured values such as the number of processed rows can be displayed together with the execution plan.

[Memo]

 

4.5.6 Discarding search results

Close the TQL and discard the saved search results.

Close the query and discard the saved search results.

Example:

//Discard search results
gs[public]> tqlclose

gs[public]> queryclose

[Memo]

 

4.5.7 Disconnecting from a cluster

Disconnect user from a GridDB cluster.

[Memo]

 

4.5.8 Hit count setting

Set whether to execute a count query when running an SQL search.

[Memo]

 

4.6 Database management

This section explains the available sub-commands for database management. Connect to the cluster with the connect sub-command before performing database management.

4.6.1 Creating a database

Create a database with the specified name.

[Memo]

 

4.6.2 Deleting a database

Delete the specified database.

[Memo]

 

4.6.3 Displaying current database

Display the current database name.

 

4.6.4 Database list

List the databases with access right information.

[Memo]

 

4.6.5 Granting access rights

Grant the database access rights to user.

[Memo]

4.6.6 Revoking access rights

Revoke access rights to the database.

[Memo]

4.7 User management

This section explains the available sub-commands that can be used to perform user management. Connect to the cluster first prior to performing user management (sub-command connect).

4.7.1 Creating a general user

Create a general user (username and password).

[Memo]

 

4.7.2 Deleting a general user

Delete the specified general user.

[Memo]

 

4.7.3 Update password

Update the user password.

[Memo]

 

4.7.4 Listing general users

List the general user data.

[Memo]

 

4.8 Container management

This section explains the available sub-commands that can be used when performing container operations. Connect to the cluster first before performing container management (sub-command connect). The container in the connected database will be subject to the operation.

4.8.1 Creating a container

Create a container.

 

Simplified version

Specify the container name and column data (column name and type) to create the container. For timeseries containers only, the compression type can also be specified.

Detailed version

Specify the container definition data in the json file to create a container.

 

4.8.2 Deleting container

Delete a container

[Memo]

 

4.8.3 Displaying a container data

Display the container data.

[Memo]

 

4.8.4 Displaying a table data

Display the table data. This is the table-oriented equivalent of the showcontainer sub-command.

 

4.8.5 Creating an index

Create an index in the column of a specified container.

[Memo]

 

4.8.6 Creating a composite index

Create a composite index on the column of a specified container.

[Memo]

 

4.8.7 Deleting an index

Delete the index in the column of a specified container.

[Memo]

 

4.8.8 Deleting a composite index

Delete the composite index on the column of a specified container.

[Memo]

 

4.8.9 Deleting a trigger

Delete the trigger of a specified container.

 

4.8.10 Displaying trigger data

Display the trigger data of a specified container.

[Memo]

 

4.9 Execution plan

This section explains the sub-commands that display an SQL execution plan.

4.9.1 Getting an SQL analysis result (global plan)

Display an SQL analysis result (global plan) in text format or in JSON format. This function can be executed in the GridDB Advanced Edition only.

4.9.1.1 Text format

[Memo]

4.9.1.2 JSON format

[Memo]

4.9.2 Getting detailed information about an SQL analysis result

Display the detailed information of an SQL analysis result in JSON format. This function can be executed in the GridDB Advanced Edition only.

[Memo]

4.10 Other operations

This section explains the sub-commands for other operations.

4.10.1 Echo back setting

Display the executed sub-command in the standard output.

[Memo]

 

4.10.2 Displaying a message

Display the definition details of the specified character string or variable.

[Memo]

 

4.10.3 Sleep

Set the time for the sleeping function.

[Memo]

 

4.10.4 Executing external commands

Execute an external command.

[Memo]

 

4.10.5 Terminating gs_sh

Terminate gs_sh.

In addition, if an error occurs in the sub-command, the setting can be configured to end gs_sh.

[Memo]

 

4.10.6 Help

Display a description of the sub-command.

[Memo]

 

4.10.7 Version

Display the version of gs_sh.

[Memo]

4.10.8 Setting the Time zone

Set the time zone.

[Memo]

 

4.11 Options and sub-commands specifications

4.11.1 Option

[Memo]

 

4.11.2 Sub-command list

5 Integrated operation control GUI (gs_admin)

5.1 Overview

The integrated operation control GUI (hereinafter described as gs_admin) is a Web application that integrates GridDB cluster operation functions.

The following operations can be carried out using gs_admin.

5.1.1 gs_admin configuration

gs_admin needs to be installed either on a machine on which the nodes constituting the cluster have been started, or on a machine in the same subnet to which multicast delivery is possible.

5.2 Setting up gs_admin

gs_admin is a Web application that runs on Tomcat.

To use gs_admin, Tomcat and Java have to be installed beforehand. The compatible versions are as follows.

The GridDB versions supported by gs_admin Ver.4.3 are as follows.

The procedure to use gs_admin is as follows.

  1. Configure the respective GridDB nodes that constitute a GridDB cluster.
  2. Install and configure gs_admin.
  3. Access the gs_admin application URI with a browser, and log in as a gs_admin user.

See the "GridDB Quick Start Guide" (GridDB_QuickStartGuide.html) for the procedure to configure a GridDB node.

The procedure to install and configure gs_admin is as follows.

  1. Installation of GridDB client package
  2. Deploy gs_admin.war in Tomcat
  3. gs_admin user settings
  4. gs_admin.properties file settings
  5. Node repository settings
  6. adminHome rights setting

5.2.1 Installation of GridDB client package

Install the GridDB client package (griddb-xx-client-X.X.X-linux.x86_64.rpm).

Log into a machine installed with the Web application as a root user, and install the package using the command below.

# rpm -Uvh griddb-xx-client-X.X.X-linux.x86_64.rpm

(*) xx indicates the GridDB edition (se, ae, ve).
(*) X.X.X indicates the GridDB version.

When a client package is installed, a directory named admin is created in the GridDB home directory ( /var/lib/gridstore ). This directory ( /var/lib/gridstore/admin ) is referred to as adminHome hereinafter.

gs_admin configuration data and data used by gs_admin are installed in adminHome. As there are functions in gs_admin to operate adminHome files, the appropriate rights need to be set. Rights settings will be described later.

The configuration under adminHome is as follows.

capture/                                                # snapshot storage directory (*)
        [Nodeaddress]_[portno.]/YYYYMMDDHHMMSS.json     # snapshotfile(*)
conf/                                                   # configuration file directory
     gs_admin.properties                                # Static parameter file to be configured initially
     gs_admin.settings                                  # dynamic parameter file to configure display-related settings
     password                                           # gs_admin user definition file
     repository.json                                    # node repository file
log/                                                    # log file directory of gs_admin (*)
    gs_admin-YYYYMMDD.log                               # log file (*)
tree/                                                   # structural file directory of container tree (*)
     foldertree--[cluster name]-[user name].json                  # folder tree file (*)

Files and directories marked with a (*) are created automatically by gs_admin.

[Notes]

5.2.2 Deployment in Tomcat

gs_admin is a Web application that runs on Tomcat. To use gs_admin, there is a need to deploy the gs_admin war file in Tomcat. Tomcat settings are omitted in this section.

The deployment procedure is as follows.

Deploy the war file included in the GridDB client package (griddb-xx-client-X.X.X-linux.x86_64.rpm) in Tomcat.

When a client package is installed, war file is installed under the following directory.

Copy gs_admin.war to the webapps directory under the Tomcat installation directory.

$ cp /usr/griddb/web/gs_admin.war [Tomcat installation directory]/webapps

5.2.3 gs_admin user settings

When using gs_admin, perform authentication as a gs_admin user.

Administrator users of GridDB clusters under management need to be set up as gs_admin users.

The gs_admin user definition file is found in /var/lib/gridstore/admin/conf/password

This file will not be created when a client package is installed.

The easiest way to create this file is to copy the user definition file of a node in the cluster you want to manage ( /var/lib/gridstore/conf/password ) to the gs_admin user definition file ( /var/lib/gridstore/admin/conf/password ). In this case, all administrator users listed in the copied user definition file become gs_admin users.

[Memo]

5.2.4 gs_admin.properties file settings

The configuration file is found in /var/lib/gridstore/admin/conf/gs_admin.properties . Set together with the GridDB cluster configuration as a gsadm user.

Reload the Web application if the property file has been overwritten.

gs_admin.properties contains the following settings.

Property Default Note
adminUser admin Set the gs_admin administrator user. Multiple user names can be set by separating the names with commas. This function can be used by a gs_admin administrator user.
- cluster operation function
- Repository management function
ospassword - Set the password of the node gsadm user (OS user). The following functions can be used when the password is set.
- Node start operation (start) in the cluster operation functions
- OS information display screen
timeZone - Set timeZone as a property for cluster connection.
The set value is used as the time zone of the TIMESTAMP type column value on the TQL screen and SQL screen. If not specified, the time zone will be UTC.
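Based only on the properties listed above, a gs_admin.properties that sets all three might look like the following sketch (the password and time zone values are illustrative):

```
adminUser=admin
ospassword=yourGsadmOsPassword
timeZone=Asia/Tokyo
```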

[Memo]

5.2.5 Node repository settings

The node repository file ( /var/lib/gridstore/admin/conf/repository.json ) centrally manages cluster configuration data and node data. It is used to specify the cluster under management and by the cluster operation functions. Configure it as the gsadm user to match the GridDB cluster configuration.

The default file contents are as follows.

{
    "header" : {
        "lastModified" : "",
        "version" : "2.7.0"
    },
    "clusters" : [
        {
            "name" : "INPUT_YOUR_CLUSTER_NAME_HERE",
            "address" : "239.0.0.1",
            "port" : 31999,
            "jdbcAddress" : "239.0.0.1",
            "jdbcPort" : 41999
        }
    ],
    "nodes" : [
        {
            "address" : "192.168.1.10",
            "port" : 10040,
            "sshPort" : 22,
            "clusterName" : "INPUT_YOUR_CLUSTER_NAME_HERE"
        }
    ]
}

To configure the node repository, either edit the file directly or use the repository management screen; the repository management screen is recommended. When configuring with the repository management screen, see the functions of the repository management screen and "Commencing management of a cluster in operation".

Use of the operation control command or command interpreter (gs_sh) is recommended when performing cluster configuration for the first time.

5.2.6 adminHome rights settings

gs_admin automatically creates files and directories under adminHome, so the Tomcat execution user requires read and write rights to adminHome. Change the owner of the files and directories under adminHome to the Tomcat execution user (tomcat by default) beforehand.

Change the owner as a root user.

# chown -R tomcat:tomcat /var/lib/gridstore/admin

[Memo]

[Notes]

5.3 Login and login destination screen

5.3.1 Login screen

Access the application URI below to access gs_admin.

http://[Tomcat operating machine address]:8080/gs_admin

The login screen appears when you access the gs_admin application URI.

Login screen
Login screen

In the login screen, you can choose between two login destinations: cluster or repository manager. For the former, select the cluster you would like to manage from the drop-down list; once logged in, you will be taken to the integrated operation control screen.

For the latter, you will be taken to the repository management screen.

When logging in, enter your gs_admin user name and password in the box next to "user" and "password" respectively, and press the Login button.

[Memo]

5.3.2 Integrated operation control screen

The integrated operation control screen is shown below.

Integrated operation control screen
Integrated operation control screen

The integrated operation control screen is made up of the following elements.

Element Abbreviation Location Functions
Tree view Tree Left Display, select a list of operating targets
Data display and input section View Right Data display and data input subject to operation
Menu area Top Log out
Message area Bottom

Tree function

In Tree, a cluster or container can be selected as the main operation target by switching tabs at the top.

Tab Tree name Main functions
ClusterTree Cluster tree Display a list of the clusters and nodes, select the operating targets
ContainerTree Container tree Display a list of the databases, search for containers, select operating targets

View function

In View, the tab displayed at the top of View differs for each operating target selected in Tree. The function can be switched by selecting the tab at the top.

See the items of each tree and screen for details.

5.3.3 Repository management screen

This function can be used by a gs_admin administrator user only.

Select repository manager in the login screen and login as a gs_admin administrator user to arrive at the repository management screen.

The repository management screen is shown below.

Repository management screen
Repository management screen

The following functions are available in the repository management screen.

The specifications of the input column are as follows.

Cluster

Node

5.4 Cluster tree-related functions

5.4.1 Cluster tree

Summary

In a cluster tree, the nodes constituting the cluster under management, i.e., the repository nodes whose clusterName is the cluster under management, are displayed in a tree format.

Cluster tree
Cluster tree

An * will appear at the beginning of a node which has not been registered in the repository.

A description of the icons shown in a cluster tree is given below.

Icon Note
Cluster
Master node
Follower node
Started node
Stopped node
Status unconfirmed node
Message

Context menu

When an element of the tree is right-clicked, a context menu appears according to the element clicked (cluster or node). Data update and element operations can then be performed by selecting an item from the menu.

The menus and functions for the respective selected elements are as follows.

Selection element Menu Functions
Cluster refresh Get list of nodes in a tree again
Node refresh Display the latest node information in View

Operating target and view tab

When an element in the tree is left-clicked, functions appear in the View according to the element clicked (cluster or node). The function can be switched by selecting the tab at the top of the View.

Selection element Tab Screen name Functions
Cluster Dashboard Dashboard screen The dashboard screen contains a variety of information related to the entire cluster such as memory usage, cluster health, log information, etc.
Status Cluster status screen Display configuration data and information of cluster under management.
Monitor OS data display screen Display OS data of a machine with operating nodes.
Configuration Cluster operation screen The cluster operation screen consists of a table listing the running nodes, as well as node start and stop functions.
Node System System data screen Display system data of the node.
Container Container list screen The container list screen displays container information such as container names and the database to which each container belongs.
Performance Performance data screen Display performance data of the node as a graph.
Snapshot Snapshot screen The snapshot screen shows the node's performance at a point in time. The values can be compared with the values measured earlier.
Log Log screen The log screen contains the event log information of a node and the corresponding setting of its output level.

[Memo]

5.4.2 Dashboard screen

Summary

The dashboard screen contains a variety of information related to the entire cluster such as memory usage, cluster health, log information, etc.

Method of use

Type of tree Operating target Tab
Cluster tree Cluster Dashboard

Screen

Dashboard screen
Dashboard screen

Functions

The following functions are available in the dashboard screen.

5.4.3 Cluster status screen

Summary

Display configuration data and information of cluster under management.

Method of use

Type of tree Operating target Tab
Cluster tree Cluster Status

Screen

Cluster status screen
Cluster status screen

Functions

The cluster status screen is comprised of the following components.

5.4.4 OS data display screen

Summary

The OS data display screen is comprised of two components, Resource Information and OS Performance of the current cluster. The GridDB performance analysis, and the CPU and Network load status are displayed by pie charts and line graphs respectively.

Method of use

Type of tree Operating target Tab
Cluster tree Cluster Monitor

Screen

OS data display screen
OS data display screen

Functions

The OS data display screen is comprised of the following components.

[Memo]

5.4.5 Cluster operation screen

This function can be used by the gs_admin administrator only.

Summary

The cluster operation screen consists of a table listing the running nodes, as well as node start and stop functions.

Method of use

Type of tree Operating target Tab
Cluster tree Cluster Configuration

Screen

Cluster operation screen
Cluster operation screen

Functions

The following functions are available in the cluster operation screen.

[Memo]

5.4.6 System data screen

Summary

Display system data of the node.

Method of use

Type of tree Operating target Tab
Cluster tree Node System

Screen

System data screen
System data screen

Functions

The following functions are available in the system data screen.

5.4.7 Container list screen

Summary

The container list screen displays container information such as container names and the database to which each container belongs.

Method of use

Type of tree Operating target Tab
Cluster tree Node Container

Screen

Container list screen
Container list screen

Functions

The following functions are available in the container list screen.

[Memo]

5.4.8 Performance data screen

Summary

Display performance data of the node as a graph.

Method of use

Type of tree Operating target Tab
Cluster tree Node Performance

Screen

Performance data screen
Performance data screen

Functions

The following functions are available in the performance data screen.

5.4.9 Snapshot screen

Summary

The snapshot screen shows the node's performance at a point in time. The values can be compared with the values measured earlier.

Method of use

Type of tree Operating target Tab
Cluster tree Node Snapshot

Screen

Snapshot screen
Snapshot screen

Functions

The following functions are available in the snapshot screen.

5.4.10 Log screen

Summary

The log screen contains the event log information of a node and the corresponding setting of its output level.

Method of use

Type of tree Operating target Tab
Cluster tree Node Log

Screen

Log screen
Log screen

Functions

The following functions are available in the log screen.

[Notes]

5.5 Container tree-related functions

5.5.1 Container tree

Summary

In a container tree, the databases and containers which exist in a cluster under management are displayed in a tree format.

The cluster under management is displayed at the top of the tree (the number in parentheses refers to the total number of databases in the cluster).

Container tree
Container tree

A description of the icons shown in a container tree is given below.

Icon Note
Cluster
Database
Database (does not exist)
Container (collection)
Container (timeseries container)
Partitioned table (container)
Search folder
Temporary work folder
Message

Functions

The following functions are available in a container tree.

After login, the ClusterTree tab and node list are displayed automatically. Upon switching to the ContainerTree tab, the saved tree structure of the container tree, if any, will be restored automatically. However, search folders will not be searched again automatically.

The following operations cannot be carried out in a container tree.

Context menu

When an element of the tree is right-clicked, a context menu appears according to the element clicked. Data update and element operations can then be performed by selecting an item from the menu.

The menus and functions for the respective selected elements are as follows.

Selection element Menu Functions
Cluster refresh Read the tree structure of the tree again and automatically detect the database
Database refresh Check the database existence and search for containers again
Container refresh Display the latest container information in View
drop Deletion of container (with confirmation dialog)
Search folder refresh Search for container again
remove Deletion of the search folder
Temporary work folder remove Deletion of a temporary work folder

[Memo]

Operating target and view tab

When an element in the tree is left-clicked, functions appear in the View according to the element clicked. The function can be switched by selecting the tab at the top of the View.

Selection element Tab Screen name Function overview
Cluster Database Database management screen A database can be created or deleted, and access rights can be assigned or revoked.
User User management screen In the user management window, addition and deletion of general users, as well as modification of passwords, can be performed.
SQL SQL screen The results of a SQL command executed on the database can be displayed.
Database Create Container creation screen A container can be created in a database.
SQL SQL screen The results of a SQL command executed on the database can be displayed.
Container Details Container details screen The container details screen contains column and index configuration data of a container.
Index Index setting screen Index setting window allows an index to be created or deleted for each column of a container.
Trigger Trigger setting screen A container trigger can be created, edited or deleted.
TQL TQL screen Execute a TQL (query language) on a container and display the results.
Partition Details Container details screen Column, index and table partitioning data of a container will be displayed.

5.5.2 Database management screen

Summary

A database can be created or deleted, and access rights can be assigned or revoked.

Method of use

Type of tree Operating target Tab
Container tree Cluster Database

Screen

Database management screen
Database management screen

Functions

The following functions are available in the database management screen.

5.5.3 User management screen

Summary

In the user management window, addition and deletion of general users, as well as modification of passwords, can be performed.

Method of use

Type of tree Operating target Tab
Container tree Cluster User

Screen

User management screen
User management screen

Functions

5.5.4 SQL screen

This function can be used in the GridDB Advanced Edition only.

Summary

The results of a SQL command executed on the database can be displayed.

Method of use

Type of tree Operating target Tab
Container tree Cluster SQL
Container tree Database SQL

Screen

SQL screen
SQL screen

Functions

The following functions are available in the SQL screen.

[Memo]

5.5.5 Container creation screen

Summary

A container can be created in a database.

Method of use

Type of tree Operating target Tab
Container tree Database Create

Screen

Container creation screen (collection)
Container creation screen (collection)
Container creation screen (timeseries container)
Container creation screen (timeseries container)

Functions

The following functions are available in the container creation screen.

[Memo]

5.5.6 Container details screen

Summary

The container details screen contains column and index configuration data of a container.

Method of use

Type of tree Operating target Tab
Container tree Container Details

Screen

Container details screen
Container details screen

Functions

The following functions are available in the container details screen.

5.5.7 Index setting screen

Summary

An index can be created or deleted for each column of a container.

Method of use

Type of tree Operating target Tab
Container tree Container Index

Screen

Index setting screen
Index setting screen

Functions

The following functions are available in the index setting screen.

[Memo]

5.5.8 Trigger setting screen

Summary

A container trigger can be created, edited or deleted.

Method of use

Type of tree Operating target Tab
Container tree Container Trigger

Screen

Trigger setting screen
Trigger setting screen

Functions

The following functions are available in the trigger setting screen.

[Memo]

5.5.9 TQL screen

Summary

Execute a TQL (query language) on a container and display the results.

Method of use

Type of tree Operating target Tab
Container tree Container TQL

Screen

TQL screen
TQL screen

Functions

The following functions are available in the TQL screen.

[Memo]

5.6 How to use gs_admin

This section provides a guide on how to use various functions accessible by gs_admin.

5.6.1 Commencing management of a cluster in operation

To manage the current active cluster in gs_admin, use the repository management function and follow the procedure below.

  1. Select the repository manager in the login screen and login as a gs_admin administrator user.

  2. Click the Sync button, enter the following data of any cluster in operation, and then click Sync to synchronize the data.

    • Specify /system/serviceAddress of the node definition file (gs_node.json) as the IP address.
    • Specify /system/servicePort of the node definition file (gs_node.json) as the port.
  3. Data of a cluster in operation will be reflected in the cluster list and node list.

  4. Click the Save button to save repository data.

  5. Click the Logout button to return to the login screen.

  6. Select the name of the cluster in operation from the list of clusters on the login screen.

  7. Log in as a gs_admin administrator user or a normal user to commence the operating functions.

5.6.2 Managing multiple clusters

When managing multiple clusters as a single gs_admin user, take note of the gs_admin user settings.

gs_admin users are managed in a single file; therefore, if an administrator managing multiple clusters uses a different password for each cluster, that administrator cannot be registered as a gs_admin user.

Configure the settings appropriately according to the number of administrators in charge of the clusters.

The procedure to register a new gs_admin user is shown below.

  1. Use the gs_adduser command to add an administrator user to a single node among the clusters that you want to manage as a new user.

    Example: If the new user name/password is gs#newuser/newuser

  $ su - gsadm
  $ gs_adduser gs#newuser -p newuser
  $ cat /var/lib/gridstore/conf/password
  admin,8c6976e5b5410415bde908bd4dee15dfb167a9c873fc4bb8a81f6f2ab448a918
  gs#newuser,9c9064c59f1ffa2e174ee754d2979be80dd30db552ec03e7e327e9b1a4bd594e
  system,6ee4a469cd4e91053847f5d3fcb61dbcc91e8f0ef10be7748da4c4a1ba382d17
  2. Distribute the above-mentioned user definition file to all the other nodes of the cluster that you want to manage.

  3. Restart all nodes to reconstitute the cluster.

  4. Add the user name and password added above to the gs_admin user definition file as a Tomcat execution user.

    Example: If the new user name/password is gs#newuser/newuser

  $ echo gs#newuser,9c9064c59f1ffa2e174ee754d2979be80dd30db552ec03e7e327e9b1a4bd594e >> /var/lib/gridstore/admin/conf/password

5.7 Gathering of error data

gs_admin error data and other logs are output to the adminHome log directory.

The default output level is info.

These logs are collected when a gs_admin problem occurs, or when there is a request from the support desk, etc.

The log output level can be set in the /webapps/gs_admin/WEB-INF/classes/logback.xml under the Tomcat home directory ( /usr/local/tomcat by default).
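As a sketch, raising the output level from info to debug would change the level attribute in logback.xml; standard logback syntax is shown below, though the logger and appender names in gs_admin's actual configuration file may differ:

```
<root level="debug">
    <appender-ref ref="FILE" />
</root>
```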

5.8 Error list

Error type Error no. Message Treatment method
Internal Server Error E00104 Cluster is not servicing. Cluster under management is not operating. Use the configuration tab and other operation tools to operate the cluster, refresh the clusters from the cluster tree, or login again.
Internal Server Error E00105 D10135: Failed to check a node status. Nodes from Ver.1.5 or lower may have been registered in the nodes registered in the repository. Check the version of each node.
Internal Server Error Failed to create <File path>. File creation failed. Check if there is any directory which does not exist in the displayed path, or any directory for which access rights of Tomcat user have not been assigned.
Internal Server Error E0030C [Code:******] <Error message> Error message of GridDB node.
See "GridDB Errorcode" and check the countermeasure with the corresponding code.
Bad Request E00300 Container "Container name" already exists. Container name is duplicated. Specify another container name to create a container.
Bad Request E00303 Container "Container name" not found. Specified container does not exist. Right click the ContainerTree cluster, select refresh and search for the container again.
Bad Request [Code:******] <Error message> Error message of GridDB node.
See "GridDB Errorcode" and check the countermeasure with the corresponding code.
Input Error <Field name> is required. The input field has been left blank. Enter a value in the <Field name> input field.
Input Error <Field name> is invalid. An invalid value has been entered in the <Field name> input field. See "GridDB Operation Tools Reference" and input a possible types value.

6 Export/import tools

The GridDB export/import tools provide save and recovery functions at the database and container level, for recovering a database from local damage or for migrating a database.

In addition, there is also a function to link up with RDB, and RDB data can also be collected and registered in GridDB.

6.1 Installed directories and files

The export tool saves the container and row data of a GridDB cluster in the files below. A specific container can also be exported by specifying its name.

The import tool reads the container data files and the export execution data file, and recovers the container and row data in GridDB. Specific container data can also be imported.

Export/import configuration
Export/import configuration

6.1.1 Container data files

Container data files are composed of metadata files and row data files.

A metadata file is a file in the json format which contains the container type and schema, the index set up, and the trigger data.

There are 2 types of row data file, one of which is the CSV data file in which container data is stored in the CSV format, and the other is the binary data file in which data is stored in a zip format.

See Format of a container data file for details of the contents described in each file.

In addition, there are 2 types of container data file as shown below depending on the number of containers to be listed.

Hereinafter, container data files of these configurations will be referred to as single container data files and multi-container data files.

Container data file
Container data file

When a large number of containers is exported as single container data files, management becomes troublesome as a large number of metadata files and row data files are created. On the other hand, even if a large number of containers is exported as a multi-container data file, only one metadata file and one row data file are output.

Therefore, it is recommended that these 2 configurations be used differently depending on the application.

A single container data file is used in the following cases.

A multi-container data file is used in the following cases.

6.1.2 Export execution data file

Data such as the export date and time, the number of containers, container name etc. is saved in the export execution data file. This file is required to directly recover exported data in a GridDB cluster.

[Memo]

6.2 Configuration of export/import execution environment

The following settings are required to execute an export/import command.

6.2.1 RPM package installation

To execute the export/import commands, the client package containing the export/import functions and Java library package need to be installed.

[Example]

# rpm -Uvh griddb-xx-client-X.X.X-linux.x86_64.rpm
Preparing...                ########################################### [100%]
User and group has already been registered correctly.
GridDB uses existing user and group.
   1:griddb-xx-client          ########################################### [100%]

# rpm -Uvh griddb-xx-java_lib-X.X.X-linux.x86_64.rpm
Preparing...                ########################################### [100%]
   1:griddb-xx-java_lib        ########################################### [100%]

6.2.2 Property file settings

Set the property file as the gsadm user in accordance with the GridDB cluster configuration. The property file is /var/lib/gridstore/expimp/conf/gs_expimp.properties .

The property file contains the following settings.

Property  Required Default value Note
mode Required MULTICAST Specify the type of connection method. If the method is not specified, the multicast method is used.
MULTICAST: multicast method
FIXED_LIST: fixed list method
PROVIDER: provider method
hostAddress Essential if mode=MULTICAST 239.0.0.1 Specify the /transaction/notificationAddress in the GridDB cluster definition file (gs_cluster.json). Multicast address used by the export/import tool to access a cluster.
hostPort Essential if mode=MULTICAST 31999 Specify the /transaction/notificationPort in the GridDB cluster definition file (gs_cluster.json). Port of multicast address used by the export/import tool to access a cluster.
jdbcAddress Essential if mode=MULTICAST on AE 239.0.0.1 Specify /sql/notificationAddress in the GridDB cluster definition file (gs_cluster.json) when using the multicast method.
jdbcPort Essential if mode=MULTICAST on AE 41999 Specify /sql/notificationPort in the GridDB cluster definition file (gs_cluster.json) when using the multicast method.
notificationMember Essential if mode=FIXED_LIST Specify /cluster/notificationMember/transaction of the cluster definition file (gs_cluster.json) when using the fixed list method to connect. Connect address and port with a ":" in the description. For multiple nodes, link them up using commas.
Example: 192.168.0.100:10001,192.168.0.101:10001
jdbcNotificationMember Essential if mode=FIXED_LIST on AE Specify sql/address and sql/port under the /cluster/notificationMember of the cluster definition file (gs_cluster.json) when using the fixed list method to connect. Connect address and port with a ":" in the description. For multiple nodes, link them up using commas.
Example: 192.168.0.100:20001,192.168.0.101:20001
notificationProvider.url Essential if mode=PROVIDER Specify /cluster/notificationProvide/url of the cluster definition file (gs_cluster.json) when using the provider method to connect.
restAddress 127.0.0.1 Specify /system/listenerAddress of the GridDB node definition file (gs_node.json). Parameter for future expansion.
restPort 10040 Specify /system/listenerPort of the GridDB node definition file (gs_node.json). Parameter for future expansion.
clusterName Required INPUT_YOUR_CLUSTER_NAME_HERE Specify the cluster name of GridDB which is used in the command "gs_joincluster".
logPath /var/lib/gridstore/log Specify the directory to output the error data and other logs when using the export/import tools. Log is output in gs_expimp-YYYYMMDD.log under the directory.
commitCount 1000 Specify the number of rows as a unit to register data when registering container data with the import tool. When the numerical value becomes larger, the buffer for data processing gets larger too. If the row size is small, raise the numerical value, and if the row size is large, lower the numerical value. The parameter affects the registration performance for data import.
transactionTimeout 2147483647 Specify the time allowed from the start until the end of a transaction. When registering or acquiring a large volume of data, a large numerical value matching the data volume needs to be set. A maximum value has been specified for processing a large volume of data by default. (Unit: second)
failoverTimeout 10 Specify the failover time to repeat retry starting from the time a node failure is detected. This is also used in the timeout of the initial connection to the cluster subject to import/export. Increase the value when performing a process such as registering/acquiring a large volume of data in/from a container. (Unit: second)
jdbcLoginTimeout 10 Specify the time of initial connection timeout for JDBC. (Unit: second)
rdb.driver Essential for RDB linkage Parameter for RDB linkage. Specify the path of the JDBC driver.
rdb.kind Essential for RDB linkage oracle Parameter for RDB linkage. Specify the type of RDB "oracle".
rdb.host Essential for RDB linkage Parameter for RDB linkage. Specify the host name (address) used to access RDB.
rdb.port Essential for RDB linkage Parameter for RDB linkage. Specify the port no. used to access RDB.
rdb.database Essential for RDB linkage Parameter for RDB linkage. Specify the applicable database name.
rdb.url Essential for RDB linkage Parameter for RDB linkage. Specify the connection character string when accessing the RDB. Specify a set of the host, port and database or the url in the RDB connection destination.
rdb.user Essential for RDB linkage Parameter for RDB linkage. Specify the user to access the target database.
rdb.password Essential for RDB linkage Parameter for RDB linkage. Specify the password of the user to access the target database.
load.input.threadNum 1 Parameter for RDB linkage. Specify the number of processing threads to collect from RDB. (1-64)
load.output.threadNum 1 Parameter for RDB linkage. Specify the number of processing threads to register in GridDB. (1-16)
storeBlockSize 64KB Specify the block size specified in a GridDB cluster. The upper limit of the string data and binary data that can be registered in GridDB differs depending on the block size.
maxJobBufferSize 512 Specify the buffer size (in MB) to hold collection and registration data.
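Drawing only on the properties listed above, a minimal gs_expimp.properties for the default multicast connection might look like the following sketch (the cluster name is illustrative):

```
mode=MULTICAST
hostAddress=239.0.0.1
hostPort=31999
clusterName=yourClusterName
logPath=/var/lib/gridstore/log
commitCount=1000
```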

6.3 Export function

The options that can be specified when using the export function are explained here, based on usage examples of the export function.

6.3.1 Specifying process target

6.3.1.1 How to specify a container

There are 3 ways to specify containers in a GridDB cluster: specifying all the containers in the cluster, specifying a database, and specifying containers individually.

(1) Specify all containers

(2) Specify the database

(3) Specify container individually
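As a sketch, the three forms correspond to the --all, --db, and -c options that appear in the examples later in this chapter (database and container names are illustrative):

```
$ gs_export --all -u admin/admin
$ gs_export --db public -u admin/admin
$ gs_export -c c001 c002 -u admin/admin
```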

6.3.1.2 How to specify a row

By specifying a search query, only the rows matching the query are exported from a container. For any container for which no search query is specified, all stored rows are exported.

Specify search query

[Example] Execution example

$ gs_export -c c001 c002 -u admin/admin --filterfile filter1.txt
  
$ gs_export --all -u admin/admin --filterfile filter2.txt

[Example] Description of definition file

^cont_month     :select * where time > 100
^cont_minutes_.*:select * where flag = 0
cont_year2014   :select * where timestamp > TIMESTAMP('2014-05-21T08:00:00.000Z')

[Memo]

6.3.1.3 How to specify user access rights

Information on GridDB cluster users and their access rights can also be exported. Use the following command when migrating all data in the cluster.

[Example]

$ gs_export --all -u admin/admin --acl

[Memo]

6.3.1.4 How to specify a view (GridDB Advanced Edition only)

A view of a GridDB cluster can also be exported as well as the container.

Specify the --all option or the --db option to export the views of the target database.

$ gs_export --db public -u admin/admin
Export Start.
Directory : /tmp/export
     :
Number of target container:5 ( Success:5 Failure:0 )

The number of target  views : 15
Export Completed.

6.3.2 Specifying the output format of a row data file

A CSV data file or binary data file can be specified as the output format of a row data file.

[Example]

$ gs_export -c c001 c002 -u admin/admin --binary
  
$ gs_export --all -u admin/admin --binary 500       //Export Completed.

6.3.3 Specifying the output configuration of container data file

Either a single container data file, which creates a container data file per container, or a multi-container data file, which outputs all containers to a single container data file, can be specified.

[Example]

$ gs_export -c c001 c002 -u admin/admin --out test
  
$ gs_export --all -u admin/admin --out           //file is created with the date

6.3.4 Specifying the output destination

The directory of the container data file can be specified as the output destination. If the specified directory does not exist, it will be created. If no directory is specified, data will be output to the current directory at the time the command is executed. Use the -d option to specify the output destination.

[Example]

$ gs_export --all -u admin/admin --out test -d /tmp

[Memo]

6.3.5 Specifying the number of parallel executions

The export tool can acquire data by accessing a cluster in parallel. If the command is executed in parallel on a cluster composed of multiple nodes, data can be acquired at high speed because each node is accessed in parallel.

[Memo]

[Example]

$ gs_export --all -u admin/admin --binary --out --parallel 4

6.3.6 Test execution function

Before exporting a container, the user can assess whether the export can be carried out correctly.

[Example]

$ gs_export -u admin/admin --all --test
Export Start.
[TEST Mode]
Directory : /var/lib/gridstore/export
The number of target containers  : 5

Name                                      PartitionId Row
------------------------------------------------------------------
public.container_2                                 15          10
public.container_3                                 25          20
public.container_0                                 35          10
public.container_1                                 53          10
public.container_4                                 58          20

Number of target container:5 ( Success:5 Failure:0 )
The number of target views : 15
Export Completed.

6.3.7 Error continuation specification

Export processing can be continued even if a row data acquisition error occurs due to a lock conflict with another application.

[Example]

$ gs_export --all -u admin/admin --force

[Memo]

6.3.8 Other functions

Detailed settings in the operating display

[Example]

$ gs_export --containerregex "^c0" -u admin/admin --verbose
Export Start.
Directory : /data/exp
Number of target container : 4

public.c003 : 1
public.c002 : 1
public.c001 : 1
public.c010 : 1
The row data has been acquired. : time=[5080]

Number of target container:4 ( Success:4 Failure:0 )
Export Completed.

Suppressed settings in the operating display

[Example]

$ gs_export -c c002 c001 -u admin/admin --silent

6.4 Import function

Import the container data file or RDB data into the GridDB cluster.

6.4.1 Types of data source for import

The input data sources used by the import tool are as follows.

[Memo]

6.4.2 Importing from a container data file

Data in the format exported by the export function can be imported into a GridDB cluster.

6.4.2.1 Specifying a process target

The data to be imported from the container data files needs to be specified.

6.4.2.1.1 How to specify a container

There are three ways to specify a container: by specifying all the containers in the container data file, by specifying a database, or by specifying containers individually.

(1) Specify all containers

(2) Specify the database

(3) Specify container individually

[Points to note]

[Memo]

6.4.2.1.2 How to specify user access rights

If data is exported by specifying the --acl option in the export function, data on the user and access rights can also be imported. Use the following command when migrating all data in the cluster.

[Example]

$ gs_import --all --acl -u admin/admin

[Memo]

6.4.2.1.3 How to specify a view (GridDB Advanced Edition only)

If views were exported using the export function, they can also be imported together with the container data.

Specify the --all option or --db option to import the views of the target database.

[Memo]

6.4.2.2 Specifying a container data file

Specify the container data file. If this is not specified, the file in the current directory will be processed.

[Example]

//Specify all containers from the current directory
$ gs_import --all -u admin/admin

//Specify multiple databases from a specific directory
$ gs_import --db db002 db001 -u admin/admin  -d /data/expdata

//Specify multiple containers from a specific directory
$ gs_import -c c002 c001 -u admin/admin  -d /data/expdata

[Memo]

6.4.2.3 Getting a container list

The container data can be checked before importing.

[Example]

$ gs_import --list
Container List in local export file
DB            Name              Type            FileName
public        container_2       COLLECTION      container_2.csv
public        container_0       TIME_SERIES     container_0.csv
public        container_1       COLLECTION      container_1.csv
userDB        container_1_db    TIME_SERIES     userDB.container_1_db.csv
userDB        container_2_db    TIME_SERIES     userDB.container_2_db.csv
userDB        container_0_db    COLLECTION      userDB.container_0_db.csv

6.4.3 Importing from RDB

The following section explains how to import RDB (Oracle) data to a GridDB cluster.

Summary

Importing from RDB is done by connecting to the Oracle database, collecting data from the specified table with a SQL command, and registering the data in a GridDB container.

Importing from RDB

Data can be imported from RDB with the command below.

$ gs_import -u admin/admin --srcfile <resource definition file>

Specify the association between the Oracle table and GridDB container (mapping) in the resource definition file.

The following four settings can be specified in JSON format in the resource definition file. A resource definition file is created for each RDB collection source.

 

Specifying the connection information of the RDB collection source/GridDB recovery destination

Configure the RDB connection data serving as the collection source (address, port no., etc.), JDBC driver data, and GridDB recovery destination connection data.

| Item | File to be specified |
|------|----------------------|
| Path of the JDBC driver | Property file |
| RDB connection data of the collection source | Property file or resource definition file |
| GridDB connection data of the recovery destination | Property file or resource definition file |

 [Memo]

 

Specifying the RDB collection target table

Specify the processing data to be imported from the Oracle database.

[Memo]

 

Specifying a container subject to GridDB registration

Specify a GridDB container at the registration destination, using the following information.

All registration destination data can be omitted; in that case, the association is made by automatic mapping conversion.

However, if the processing target is specified by a SQL command, the container name must also be specified.

 

Specifying mapping data

Perform data association between Oracle and GridDB.

<Conversion method>

There are two ways: "auto conversion" and "user definition conversion."

 

<Table association>

An Oracle table can be associated with a GridDB container.

[Memo]

 

<Column association>

 An Oracle column can be associated with a GridDB column on a 1-to-1 basis.

[Memo]

 

Resource definition file settings

Resource definition files are written in JSON format. Specify the RDB (Oracle) data to be connected to and the container data of the recovery destination.

The settings required to connect to RDB (Oracle) and import data from it are as follows.

| Property | Note |
|----------|------|
| /inputSource | |
|   /type | Specify "rdb" when using an RDB link. |
|   /server | Can be omitted. The RDB connection destination in the property file is used by default. *1 |
|     /kind | Specify the type of RDB: "oracle". |
|     /host | Specify the address used to access the RDB. |
|     /port | Specify the port of the address used to access the RDB. |
|     /database | Specify the database name (SID). |
|     /url | Specify the connection string used to access the RDB. (Specify host, port, and database, or url.) |
|     /user | Specify the user to access the database. |
|     /password | Specify the user password to access the database. |
| /outputSource | Can be omitted. The GridDB connection destination in the property file is used by default. |
|   /type | Specify "gridstore" when registering in GridDB. |
|   /server | |
|     /host | Specify the address used to access GridDB. |
|     /port | Specify the port of the address used to access GridDB. |
|     /clusterName | |
|     /user | Specify the user to access the database. |
|     /password | Specify the user password to access the database. |
| /targetList | The following can be specified repeatedly in an array. |
|   /rdb | Specify the RDB collection targets. Either "table" or "sql" is required. |
|     /table | Specify the table name. |
|     /sql | Specify a SQL command. |
|     /select | Specify the columns when the table name is specified. |
|     /where | Filter the rows by conditions when the table name is specified. |
|     /orderby | Sort by the specified columns when the table name is specified. |
|     /partitionTable | Specify "true" when accessing partition tables in parallel. |
|   /gridstore | Specify the GridDB container at the registration destination. |
|     /database | Specify the database name. Registered in the public database "public" by default. |
|     /name | Specify the container name. The container name may be omitted if the RDB collection target specifies a table name; the table name then becomes the container name. The container name is required when a SQL command is specified. |
|     /type | Specify the container type (COLLECTION/TIME_SERIES). The container type is COLLECTION by default. |
|     /rowKeyAssigned | Specify whether there is a row key (true/false). *2 |
|     /dataAffinity | Specify the data affinity name. |
|     /indexSet | Specify the index. *3 |
|     /triggerInfoSet | Specify a trigger. *3 |
|     /compressionInfoSet | The compression method (NO/SS) can be specified for timeseries containers only. *3 |
|   /mapping | Can be omitted; the following can be specified repeatedly. |
|     /column | The following can be specified repeatedly. |
|       /rdb | Specify the RDB column name. |
|       /gridstore | Specify the GridDB column name. |
|       /type | Specify the GridDB column type. |
|     /containerSplit | Specify the container split method. |
|       /type | Specify column value split "columnValue" or record number split "dataNumber". |
|       /column | For column value split, specify the column to split by. |
|       /number | For record number split, specify the number of records per split. |

[Memo]

An example of a resource definition file is shown below. Connection data shall be specified in the property file.
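As a minimal sketch of such a file (the table name SENSOR_LOG and container name sensor_log are hypothetical, and the connection data is left to the property file, so /inputSource/server and /outputSource are omitted), a resource definition of this shape can be assembled and dumped as JSON:

```python
import json

# Hypothetical mapping: Oracle table SENSOR_LOG -> GridDB timeseries
# container sensor_log. Only the settings described in the property
# table above are used.
resource_def = {
    "inputSource": {
        "type": "rdb",          # an RDB link is used
    },
    "targetList": [
        {
            "rdb": {
                "table": "SENSOR_LOG",   # collection target table
                "where": "STATUS = 1",   # filter condition
            },
            "gridstore": {
                "database": "public",
                "name": "sensor_log",    # registration destination container
                "type": "TIME_SERIES",
                "rowKeyAssigned": True,
            },
        }
    ],
}

print(json.dumps(resource_def, indent=2))
```

Because the collection target specifies a table name rather than a SQL command, the container name could also have been omitted here.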

 

Partition table

For Oracle partition tables, each partition (or sub-partition, in the case of composite partitioning) can be accessed in parallel. Using this, the data of a partition table can be acquired at high speed.

When processing a partition table in parallel, set the "partitionTable" item to true in the collection target settings of the resource definition file.

[Memo]

 

Concurrency

The import process can be executed at a higher speed by making access to RDB and GridDB parallel.

When performing parallel processing, specify the --parallel option in the command line.

Collection from RDB and registration in GridDB will each be executed with the degree of parallelism specified in --parallel. If the degree of parallelism is not specified, the number of nodes in the GridDB cluster automatically becomes the degree of parallelism.

[Example]

load.input.threadNum=64
load.output.threadNum=16

<Example>

| Command line | Property file | No. of collection threads | No. of registration threads |
|---|---|---|---|
| gs_import | | 1 | 1 |
| gs_import --parallel 3 | | 3 | 3 |
| gs_import --parallel 3 | input.threadNum=16 | 16 | 3 |
| gs_import --parallel | Not specified | No. of GridDB nodes | No. of GridDB nodes |
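The resolution order in the table above can be sketched as follows (the function and its defaults are hypothetical; property-file values take precedence over the --parallel value, and --parallel without a number falls back to the number of nodes):

```python
def effective_threads(parallel=None, input_threads=None,
                      output_threads=None, node_count=1):
    """Resolve (collection, registration) thread counts.

    parallel: value of --parallel; None = option absent, "auto" = option
              given without a number (degree of parallelism = node count).
    input_threads/output_threads: load.input.threadNum /
              load.output.threadNum from the property file, if set.
    """
    if parallel is None:
        base_in = base_out = 1
    elif parallel == "auto":
        base_in = base_out = node_count
    else:
        base_in = base_out = parallel
    # property-file settings override the --parallel value
    collected = input_threads if input_threads is not None else base_in
    registered = output_threads if output_threads is not None else base_out
    return collected, registered

# The rows of the example table above:
assert effective_threads() == (1, 1)
assert effective_threads(parallel=3) == (3, 3)
assert effective_threads(parallel=3, input_threads=16) == (16, 3)
assert effective_threads(parallel="auto", node_count=5) == (5, 5)
```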

Preliminary checks and test run

The following items are checked prior to the collection and registration processing. Preliminary checks on description errors in the resource definition file and on the conformity of the specified data are carried out. If an error occurs in these checks, the processing of the tool stops. The process cannot be continued even if the --force option is specified.

[Preliminary check items]

Perform a test run if you want to carry out the preliminary checks only and verify the operation. Although communication between Oracle and GridDB takes place during a test run, no data is registered in GridDB.

To perform a test run, specify the --test option together with the --srcfile option.

[Example]

$ gs_import -u admin/admin --srcfile partition_table.json --test
Start import
[TEST Mode]
Import test execution terminated.  No. of SQL subject to processing: 1920
Import terminated

If an error occurs in the preliminary checks, the following display will appear.

[Example]

$ gs_import -u admin/admin --srcfile not_found_column.json --test
Start import
[TEST Mode]
D00C0C: A non-existent column has been specified in the mapping definition. : sql=[SELECT * FROM mytable], column=[NOT_FOUND_COLUMN]

SmartEDA/DDS linkage

Data can be registered from Oracle to GridDB by linking with the data collection/event processing platform SmartEDA and its data collection server (DDS). The import tool collects data from Oracle and sends it to the DDS via HTTP; the DDS then registers the received data in GridDB. This allows data to be imported even when the import tool and Oracle are on a subnet that does not allow multicast communication with the GridDB cluster.

[Memo]

6.4.4 Data registration option

When importing, unless a specific option is specified, an error will occur if a container that you are trying to register already exists in the GridDB cluster. Data can be added or replaced by specifying one of the following options. During data registration, the number of containers registered successfully and the number of containers which failed to be registered are shown.

[Example]

$ gs_import -c c002 c001 -u admin/admin  --append
Import initiated (Append Mode)
Import completed
Success:2 Failure:0
 
$ gs_import -c c002 c001 -u admin/admin  --replace
Import initiated (Replace Mode)
Import completed
Success:2 Failure:0
 
$ gs_import --all  -u admin/admin  -d /data/expdata   --replace

6.4.5 Error continuation specification

The import process can be continued even if a registration error were to occur in a specific row data due to a user editing error in the container data file.

[Example]

$ gs_import --all -u admin/admin -d /data/expdata --force

[Memo]

6.4.6 Other functions

Detailed settings in the operating display

6.5 Command/option specifications

6.5.1 Export command

[Memo]

6.5.2 Import command

[Memo]

6.6 Format of container data file

The respective file formats to configure container data files are shown below.

6.6.1 Metadata file

The metadata file stores the container data in the JSON format. The container data to be stored is shown below.

| Item | Note |
|------|------|
| Container name | Name of the container. |
| Container type | Refers to a collection or time series container. |
| Schema data | Data of the group of columns constituting a row. Specifies the column name, data type, and column constraints. |
| Compression configuration data | Compression type configured for time series data: thinning compression with error, thinning compression without error, or no compression. |
| Index setting data | Index type data set in a container: whether indexes are set and the type of index, e.g. hash index, spatial index, tree index. |
| Trigger (event notification) data | Notification is triggered via the JMS or REST interface when a container is updated (PUT/DELETE). |
| Row key setting data | A row key is set up when a collection container is used. For time series containers, either no row key is set, or the default value, if set, is valid. |
| Table partitioning data | Specifies table partitioning data. |

The tags and data items of the metadata in JSON format are shown below. The tags that are essential when the user creates a new file are also indicated (tag setting conditions).

| Field | Item | Note | Setting conditions |
|---|---|---|---|
| **Common parameters** | | | |
| database | Database name | Database name | Arbitrary, "public" by default |
| container | Container name | Container name | Required |
| containerType | Container type | Specify either COLLECTION or TIME_SERIES. | Required |
| containerFileType | Container data file type | Specify either csv or binary. | Required |
| containerFile | Container data file name | File name | Arbitrary |
| dataAffinity | Data affinity name | Specify the data affinity name. | Arbitrary |
| partitionNo | Partition | A null string indicates no specification. | Arbitrary, output during export. Not used even if it is specified when importing. |
| columnSet | Column data set (schema data) | The column data needs to match when adding data to an existing container. | Required |
|   columnName | Column name | | Required |
|   type | JSON data type | Specify one of: BOOLEAN/ STRING/ BYTE/ SHORT/ INTEGER/ LONG/ FLOAT/ DOUBLE/ TIMESTAMP/ GEOMETRY/ BLOB/ BOOLEAN[]/ STRING[]/ BYTE[]/ SHORT[]/ INTEGER[]/ LONG[]/ FLOAT[]/ DOUBLE[]/ TIMESTAMP[]. | Required |
|   notNull | NOT NULL constraint | true/false | Arbitrary, "false" by default |
| rowKeyAssigned | Row key setting (*1) | Specify either true or false. Specifying rowKeySet as well causes an error. | Arbitrary, "false" by default |
| rowKeySet | Row key column names | Specify the row key column names in array format. The row key needs to match when adding data to an existing container. | Arbitrary (*2) |
| indexSet | Index data set | Can be set for each column. A non-existent column name is either ignored or causes an error to be output. | Arbitrary |
|   columnNames | Column names | Specify the column names in array format. | Arbitrary (essential when indexSet is specified) |
|   type | Index type | Specify one of: HASH (STRING/ BOOLEAN/ BYTE/ SHORT/ INTEGER/ LONG/ FLOAT/ DOUBLE/ TIMESTAMP), SPATIAL (GEOMETRY), TREE (STRING/ BOOLEAN/ BYTE/ SHORT/ INTEGER/ LONG/ FLOAT/ DOUBLE/ TIMESTAMP). | Arbitrary (essential when indexSet is specified) |
|   indexName | Index name | Index name | Arbitrary; not specified by default or when null is specified |
| triggerInfoSet | Trigger settings | | Arbitrary |
|   eventName | Trigger name | Trigger name | Arbitrary (essential when triggerInfoSet is specified) |
|   notificationType | Notification method | Specify either JMS or REST. | Arbitrary (essential when triggerInfoSet is specified) |
|   targetEvents | Event to be monitored | Specify either PUT or DELETE. | Arbitrary (essential when triggerInfoSet is specified) |
|   targetColumnNames | Column names | Columns subject to notification; multiple columns can be specified using the "," (comma) separator. An error occurs if a non-existent column name is specified. | Arbitrary |
|   notificationURI | Destination URI of the notification | | Arbitrary (essential when triggerInfoSet is specified) |
|   JmsDestinationType | Type of destination | Specify either topic or queue. | Valid only when notificationType is JMS |
|   JmsDestinationName | Name of destination | | Essential when notificationType is JMS |
|   JmsUser | User name | | Essential when notificationType is JMS |
|   JmsPassword | Password | | Essential when notificationType is JMS |
| **Table partitioning data** | | | |
| tablePartitionInfo | Table partitioning data | For interval-hash partitioning, specify the following group of items for both Interval and Hash as an array, in that order. | Arbitrary |
|   type | Table partitioning type | Specify either HASH or INTERVAL. | Essential if tablePartitionInfo is specified |
|   column | Partitioning key | Column types that can be specified: any type if type=HASH; BYTE, SHORT, INTEGER, LONG, or TIMESTAMP if type=INTERVAL. | Essential if tablePartitionInfo is specified |
|   divisionCount | Number of hash partitions | (Effective only if type=HASH) Specify the number of hash partitions. | Essential if type=HASH |
|   intervalValue | Interval value | (Effective only if type=INTERVAL) Specify the interval value. | Essential if type=INTERVAL |
|   intervalUnit | Interval unit | (Effective only if type=INTERVAL) DAY only. | Essential if type=INTERVAL and column=TIMESTAMP |
| **Interval or interval-hash partitioning only parameters** | | | |
| expirationType | Type of expiry release function | Specify "partition" when specifying partition expiry release. | Arbitrary |
| expirationTime | Length of expiration | Integer value | Essential if expirationType is specified |
| expirationTimeUnit | Elapsed time unit of expiration | Specify one of: DAY/ HOUR/ MINUTE/ SECOND/ MILLISECOND. | Essential if expirationType is specified |
| **TIME_SERIES only parameters** | | | |
| timeSeriesProperties | Compression data setting | Can only be specified when containerType is TIME_SERIES. | Arbitrary |
|   compressionMethod | Compression method | Specify NO, SS, or HI. | Arbitrary |
|   compressionWindowSize | Maximum window size of a row | Integer value | Arbitrary |
|   compressionWindowSizeUnit | Window size time unit | Specify one of: DAY/ HOUR/ MINUTE/ SECOND/ MILLISECOND. | Arbitrary |
|   expirationDivisionCount | Division count of row expiration | Integer value | Arbitrary |
|   rowExpirationElapsedTime | Elapsed time of row expiration | Integer value | Arbitrary |
|   rowExpirationTimeUnit | Elapsed time unit of row expiration | Specify one of: DAY/ HOUR/ MINUTE/ SECOND/ MILLISECOND. | Arbitrary |
| compressionInfoSet | Settings for each column | Can only be specified if compressionMethod is HI. | Arbitrary |
|   columnName | Column name | | Arbitrary |
|   compressionType | Compression type | Specify either RELATIVE or ABSOLUTE (RELATIVE indicates a relative value, ABSOLUTE an absolute value). | Arbitrary |
|   width | Thinning compression parameter with absolute error | Floating-point number | Arbitrary; essential if compression is specified. An error occurs when rate and span are also specified. |
|   rate | Thinning compression parameter with relative error | Floating-point number | Arbitrary; can only be specified when compressionMethod is HI (ignored or an error occurs for SS/NO). An error occurs when width is also specified. |
|   span | Thinning compression parameter with relative error | Floating-point number | Arbitrary; can only be specified when compressionMethod is HI (ignored or an error occurs for SS/NO). An error occurs when width is also specified. |
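As an illustration of the common tags above, a minimal metadata file for a hypothetical collection "c001" can be assembled and checked against the Required tags like this:

```python
import json

# Hypothetical collection "c001" with two columns; the tags marked
# "Required" in the table above are filled in, plus a row key.
metadata = {
    "database": "public",
    "container": "c001",
    "containerType": "COLLECTION",
    "containerFileType": "csv",
    "containerFile": "public.c001.csv",
    "columnSet": [
        {"columnName": "id", "type": "INTEGER", "notNull": True},
        {"columnName": "name", "type": "STRING"},
    ],
    "rowKeySet": ["id"],
}

# Tags the table marks as Required for a newly created metadata file.
REQUIRED = ("container", "containerType", "containerFileType", "columnSet")
assert all(tag in metadata for tag in REQUIRED)

print(json.dumps(metadata, indent=2))
```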

[Memo]

[Notes]

[Example1] Example of a collection in a single container data file (public.c001_properties.json)

[Example 2] Example of a collection and timeseries container in a multi-container data file (public.container01_properties.json)

[Example 3] Example of a description for table partitioning

[Memo]

6.6.2 Row data file (binary data file)

A binary row data file is in zip format and can be created only by gs_export. It is not human-readable and cannot be edited.

6.6.3 Row data file (CSV data file)

A CSV row data file describes, in its container data file data section, references to the metadata file that defines the rows.

[Memo]

<CSV data file format>

1. Header section (1st - 2nd row)

Header section contains data output during export. Header data is not required during import.

[Example]

"#2017-10-01T17:34:36.520+0900 GridDB V4.0.00"
"#User:admin "

2. Container data file data section (3rd and subsequent rows)

Describe the references to the metadata file.

3. Row data section (after the container data section)

Describes the row data.

[Memo]

4. Comments section

The comment section can be described anywhere in the CSV data file except the header section.

[Memo]

<File name format>

The name of the CSV data file output by the export tool is as follows.

[Example] A CSV data file for Example 1, referencing the metadata file

"#2017-10-01T11:19:03.437+0900  GridDB V4.0.00"
"#User:admin"
"%","public.c001_properties.json"
"$","public.c001"
"1","Tokyo"
"2","Kanagawa"
"3","Osaka"

 

When the data types below are included in some of the rows of the CSV data file, prepare an external object file as a separate file from the CSV data file. List the reference to the external object file in the target column of the CSV file as below: "@data type": (file name)

When an external object file is exported, the external object file name is created in accordance with the following rules during export.

For import, any file name can be used for the external object file. List the file name, together with its data type, in the relevant column of the CSV data file.

[Example] Naming example of an external object file

//When a collection (colb) having a BYTE array in the 3rd column is exported
 
Oct 4  12:51 2017 public.colb.csv
Oct 4  12:51 2017 public.colb_0_3.byte_array
Oct 4  12:51 2017 public.colb_1_3.byte_array
Oct 4  12:51 2017 public.colb_2_3.byte_array
Oct 4  12:51 2017 public.colb_3_3.byte_array
Oct 4  12:51 2017 public.colb_4_3.byte_array
Oct 4  12:51 2017 public.colb_properties.json

  

[Example] Description of an external object file in a single container data file is shown below.

7 Long term archive tool

7.1 How to use

The long term archive function is executed by a tool, gs_archive. Users can run gs_archive on a GridDB client machine in accordance with the following format.

gs_archive <command> <option>...

The command is called "archive command." Users can specify the following options.

A few sample scripts and programs are provided as examples of utilizing archive files. See Sample scripts and programs for details. See also Output files for the file organization and File formats for the file formats.

7.1.1 Environment construction

7.1.1.1 Install

Java must be installed to use gs_archive. The supported versions are as follows.

Users can install the long term archive tool by installing the GridDB client package. The following are included in the package.

The archive tool accepts users' requests in command form and requests processing from the archive engines on the GridDB nodes as necessary.

The machines on which to install differ depending on the commands to be used.

7.1.1.2 Installed directories and files

The directories created for the long term archive by the GridDB client package are as follows.

/var/lib/gridstore/          # GridDB client's home directory
    archive/                 # Long term archive's root directory
        conf/                # Configuration files are placed here
        data/                # Archive files are created here
        tmp/                 # Used by gs_archive as a working directory
    log/                     # gs_archive and gsserver_archive write logs here

The files are placed in the following directories.

7.1.2 Initial setting

Carry out the following procedure before using the long term archive tool.

7.1.2.1 Setting for the expiry release container

  1. Determine which data are targets of the long term archive in data design
  2. Determine which containers are to store the target data when designing containers and tables on GridDB
  3. Examine the expiry release settings on each target container in doing physical design on GridDB

[Memo]

7.1.2.2 Disable automatic erasing

To use the long term archive function, before starting the GridDB server you need to set the mode in which cold data are accumulated in GridDB without being deleted automatically. Set /dataStore/autoExpire to false in gs_node.json. This setting is required on all nodes of the GridDB cluster.

[Memo]

7.1.2.3 Long term archive tool setting

The archive tool is configured with a property-format setting file (/var/lib/gridstore/archive/conf/gs_archive.properties). There are required items and optional items.

See Setting file for the details of the setting items.

7.1.3 Store command

7.1.3.1 Overview of the processing

When a user executes the store command, the following processes are executed on each node of the GridDB cluster and on the client machine from which the command is invoked.

  1. Processing on each node of the GridDB cluster
  2. Processing on the client machine executing the store command
System architecture of long term archive

7.1.3.2 How to use the store command

The store command saves cold data in archive files and deletes it from GridDB. If the configuration of the GridDB cluster changes during execution, the command does not terminate normally. To avoid this situation, run the store command by the following steps.

  1. Disable automatic reconfiguration:
  2. Run the "store" command
  3. Enable automatic reconfiguration:

See the setting of automatic cluster reconfiguration for the details of gs_loadbalance.

[Memo]

[Example]

The store command is executed in two steps: saving cold data in files on each GridDB node, and collecting the files onto the client machine (see Overview of the processing). Users can run the command in parallel using the "--parallel" option when the GridDB cluster has multiple nodes. The following example runs the store command with two parallel executions.

# An example of parallel execution
% gs_archive store -u admin/admin --parallel 2

See the store command options for details of the store command specification.

7.1.3.3 Output files

The following two types of files are created after the store command execution.

The following two types of files are included in archive files in pairs.

A range of cold data that became erasable at the same time is saved in one row data file.

Archive file and catalog file

One or more archive files are created by the store command execution.

Only one catalog file is created when the store command is executed. Relations between files (row data files, metadata files, and archive files) and ranges of cold data are described in the catalog file. See catalog file for the details of a catalog file.

Archive files and a catalog file are created under the output directory of the store command, called the archive directory (the default is /var/lib/gridstore/archive/data). Each execution of the store command creates a child directory that represents the execution date (UTC time), and the output files are placed under that child directory. Note that the execution date is based not on the client machine's time but on the latest time among the GridDB nodes. Cold data which have become erasable by that time are the targets of archiving.

The format of the date (UTC time) used in directory and file names for archiving is as follows.

The name templates of the archive file and the catalog file are as follows.

An example of the above names is as follows.

[An example of output files]

# An example of archive files and a catalog file (The command was executed at 4am on August 1, 2018 (UTC time))
/var/lib/gridstore/archive/data/   # Archive directory
    20180801040000000Z/
        20180801040000000Z_0001.tar    # Archive file
        20180801040000000Z_0002.tar    # Archive file
        20180801040000000Z_0003.tar    # Archive file
        20180801040000000Z_0004.tar    # Archive file
        20180801040000000Z_0005.tar    # Archive file
        20180801040000000Z_catalog.json  # Catalog file
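The date-based names above (UTC time formatted as yyyyMMddHHmmssSSS followed by "Z", with a zero-padded sequence number for archive files) can be reproduced with a sketch like this (the function names are hypothetical):

```python
from datetime import datetime, timezone

def archive_dir_name(ts):
    """Archive directory name: UTC time as yyyyMMddHHmmssSSS plus 'Z'."""
    u = ts.astimezone(timezone.utc)
    return u.strftime("%Y%m%d%H%M%S") + f"{u.microsecond // 1000:03d}Z"

def archive_file_name(ts, seq):
    """Archive file name: <directory name>_<4-digit sequence number>.tar."""
    return f"{archive_dir_name(ts)}_{seq:04d}.tar"

def catalog_file_name(ts):
    """Catalog file name: <directory name>_catalog.json."""
    return f"{archive_dir_name(ts)}_catalog.json"

# 4am on August 1, 2018 (UTC), matching the example above.
ts = datetime(2018, 8, 1, 4, 0, 0, tzinfo=timezone.utc)
print(archive_dir_name(ts))       # 20180801040000000Z
print(archive_file_name(ts, 1))   # 20180801040000000Z_0001.tar
print(catalog_file_name(ts))      # 20180801040000000Z_catalog.json
```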

The row data file paths and metadata file paths in an archive file are as follows.

An example of the row data file and a metadata file in the archive file is as follows.

[Example]

7.1.4 print command

The print command shows a list of the cold data saved in the archive directory. The command does not require a connection to a GridDB cluster, because it shows the result of scanning the catalog file in the archive directory. Users must run the print command on the same machine on which they executed the store command.

The displayed items are as follows.

| Item name | Displayed information |
|---|---|
| container | `<database name>.<container name>` |
| startTime | Start time of the cold data |
| endTime | End time of the cold data |
| archiveFile | Name of the archive file including the following files |
| metaFile | Path to the file holding the metadata of the cold data |
| rowFile | Path to the file holding the row data of the cold data |

An example of executing the print command is as follows.

# An example of print command execution
% gs_archive print
container=public.ExpRow_Time_NoPart_-./=
startTime=2017-07-08T15:00:00.000Z
endTime=2017-08-07T14:59:59.999Z
archiveFile=/var/lib/gridstore/archive/data/20180801040000000Z/20180801040000000Z_0001.tar
metaFile=20180801040000000Z/00068/20180603145959999Z/00068_0000000000000002_20180603145959999Z.json
rowFile=20180801040000000Z/00068/20180603145959999Z/00068_0000000000000002_20180603145959999Z.avro
#---------------------------
container=public.ExpRow_Time_NoPart_0001
startTime=2017-07-08T15:00:00.000Z
endTime=2017-08-07T14:59:59.999Z
archiveFile=/var/lib/gridstore/archive/data/20180801040000000Z/20180801040000000Z_0001.tar
metaFile=20180801040000000Z/00073/20180603145959999Z/00073_0000000000000000_20180603145959999Z.json
rowFile=20180801040000000Z/00073/20180603145959999Z/00073_0000000000000000_20180603145959999Z.avro
#---------------------------
...

Users can specify a database name, a container name, and a data range as conditions to filter the list. The data range is represented by date type values stored in the row key or the partitioning key of the container/table. The following example of executing the print command displays the row data files and metadata files holding the cold data of the "ExpRow_Time_NoPart_0001" container in the "public" database, together with the paths of the archive files that include those files.

# An example of showing only information for the container "ExpRow_Time_NoPart_0001" of "public" database.
% gs_archive print --db public --container ExpRow_Time_NoPart_0001
container=public.ExpRow_Time_NoPart_0001
startTime=2017-07-08T15:00:00.000Z
endTime=2017-08-07T14:59:59.999Z
archiveFile=/var/lib/gridstore/archive/data/20180801040016000Z/20180801040016000Z_0001.tar
metaFile=20180801040016000Z/00073/20180603145959999Z/00073_0000000000000000_20180603145959999Z.json
rowFile=20180801040016000Z/00073/20180603145959999Z/00073_0000000000000000_20180603145959999Z.avro
#---------------------------
container=public.ExpRow_Time_NoPart_0001
startTime=2017-08-07T15:00:00.000Z
endTime=2017-09-06T14:59:59.999Z
archiveFile=/var/lib/gridstore/archive/data/20180801040016000Z/20180801040016000Z_0001.tar
metaFile=20180801040016000Z/00073/20180703145959999Z/00073_0000000000000000_20180703145959999Z.json
rowFile=20180801040016000Z/00073/20180703145959999Z/00073_0000000000000000_20180703145959999Z.avro
#---------------------------

The result of the print command can be stored in a tab-delimited CSV file at the same time it is displayed on the terminal.

# An example of storing the result of print command in a csv file
% gs_archive print --db public --container ExpRow_Time_NoPart_0001 --output ExpRow_Time_NoPart_0001.csv
...
% cat ExpRow_Time_NoPart_0001.csv
public.ExpRow_Time_NoPart_0001  2017-07-08T15:00:00.000Z        2017-08-07T14:59:59.999Z        /var/lib/gridstore/archive/data/20180801040000000Z/20180801040000000Z_0001.tar 20180801040000000Z/00073/20180603145959999Z/00073_0000000000000000_20180603145959999Z.json     20180801040016000Z/00073/20180603145959999Z/00073_0000000000000000_20180603145959999Z.avro
public.ExpRow_Time_NoPart_0001  2017-08-07T15:00:00.000Z        2017-09-06T14:59:59.999Z        /var/lib/gridstore/archive/data/20180801040000000Z/20180801040000000Z_0001.tar 20180801040000000Z/00073/20180703145959999Z/00073_0000000000000000_20180703145959999Z.json     20180801040016000Z/00073/20180703145959999Z/00073_0000000000000000_20180703145959999Z.avro
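The CSV output can be processed with any tab-aware reader. A minimal sketch using Python's standard `csv` module (the `read_print_csv` helper and the column-name list are assumptions based on the print items shown above, not part of the tool):

```python
import csv
import tempfile

# Column order matches the print command items listed above
FIELDS = ["container", "startTime", "endTime", "archiveFile", "metaFile", "rowFile"]

def read_print_csv(path):
    """Read the tab-delimited CSV produced by `gs_archive print --output`."""
    with open(path, newline="") as f:
        return [dict(zip(FIELDS, row)) for row in csv.reader(f, delimiter="\t")]

# Demo with a hand-made file standing in for real gs_archive output
sample = ("public.ExpRow_Time_NoPart_0001\t2017-07-08T15:00:00.000Z\t"
          "2017-08-07T14:59:59.999Z\t/var/lib/gridstore/archive/data/"
          "20180801040000000Z/20180801040000000Z_0001.tar\tmeta.json\trow.avro\n")
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as f:
    f.write(sample)
    csv_path = f.name

rows = read_print_csv(csv_path)
print(rows[0]["container"])
```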

See the print command options for details of the print command usage.

See File formats for details of the data types in a metadata file and a row data file.

7.1.5 Sample scripts and programs

Sample scripts and programs for utilizing the archive files located by the print command are provided. They are stored in zip files under /usr/griddb/docs.

7.2 Command/option specifications

[Memo]

7.3 Configuration file

Users can specify the following settings in /var/lib/gridstore/archive/conf/gs_archive.properties.

| Property | Required | Default value | Note |
|----------|----------|---------------|------|
| mode | Required | MULTICAST | Specify the connection method for GridDB.<br>MULTICAST: multicast method<br>FIXED_LIST: fixed list method<br>PROVIDER: provider method |
| clusterName | Required | | Specify the cluster name of GridDB that is used in the gs_joincluster command. |
| hostAddress | Required if mode=MULTICAST | 239.0.0.1 | Specify the value of /transaction/notificationAddress in gs_cluster.json. |
| hostPort | Required if mode=MULTICAST | 31999 | Specify the value of /transaction/notificationPort in gs_cluster.json. |
| notificationMember | Required if mode=FIXED_LIST | | Specify the list of the cluster nodes' addresses and ports when using the fixed list method. |
| notificationProvider.url | Required if mode=PROVIDER | | Specify the value of /cluster/notificationProvider/url in gs_cluster.json. |
| jdbcAddress | Required for GridDB AE if mode=MULTICAST | 239.0.0.1 | Specify the value of /sql/notificationAddress in gs_cluster.json. |
| jdbcPort | Required for GridDB AE if mode=MULTICAST | 41999 | Specify the value of /sql/notificationPort in gs_cluster.json. |
| sqlMember | Required for GridDB AE if mode=FIXED_LIST | | Specify the list of the cluster nodes' JDBC addresses and ports when using the fixed list method. |
| ospassword | Required | | Specify the password of the operating system user "gsadm". |
| sshPort | | 22 | Specify the SSH port to use. |
| archiveDataPath | | archive/data | Specify the path of the directory in which the archive files are placed. |
| archiveTempPath | | archive/tmp | Specify the path of the working directory of the archive tool. |
| tarMaxFileSize | | 100 | Specify the maximum total size (MB) of the row data files in an archive file (1MB-1024MB). |
| tarMaxFileCount | | 10000 | Specify the maximum number of row data files in an archive file (10-10000). |
| storeMemoryLimit | | 1024MB | Specify the upper limit (with unit) of the store memory of the GridDB archive engine. |
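For reference, a minimal configuration for the multicast connection method might look like the following. All values here are placeholders chosen from the defaults in the table above, not recommendations; in particular the cluster name and password must match your own environment.

```properties
# Connection method and cluster identification
mode=MULTICAST
clusterName=myCluster

# Multicast settings (must match gs_cluster.json)
hostAddress=239.0.0.1
hostPort=31999

# Password of the OS user "gsadm" and the SSH port
ospassword=changeme
sshPort=22

# Archive output and working directories
archiveDataPath=archive/data
archiveTempPath=archive/tmp
```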

7.4 File formats

7.4.1 Row data file

A row data file is written in the Apache Avro format. The file consists of a header part and data blocks.

When building application programs to work with external systems, acquire the schema information from the header and read the row data based on that schema in those programs.
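In practice an Avro library would normally be used to read these files; as an illustration of what "acquire the schema from the header" involves, the following sketch decodes the metadata map of an Avro object container file by hand. The `read_avro_schema` helper is hypothetical, and it assumes the metadata map is written in a single block with a positive entry count, which is what standard Avro writers emit.

```python
import json
import tempfile

def _read_long(buf, pos):
    """Decode an Avro zig-zag varint 'long' starting at buf[pos]."""
    shift = result = 0
    while True:
        b = buf[pos]
        pos += 1
        result |= (b & 0x7F) << shift
        if not (b & 0x80):
            break
        shift += 7
    return (result >> 1) ^ -(result & 1), pos

def read_avro_schema(path):
    """Extract the writer schema from an Avro object container file header."""
    with open(path, "rb") as f:
        buf = f.read()
    if buf[:4] != b"Obj\x01":
        raise ValueError("not an Avro object container file")
    meta, pos = {}, 4
    count, pos = _read_long(buf, pos)      # map block: entry count, then pairs
    while count > 0:
        for _ in range(count):
            n, pos = _read_long(buf, pos)
            key = buf[pos:pos + n].decode("utf-8")
            pos += n
            n, pos = _read_long(buf, pos)
            meta[key] = buf[pos:pos + n]
            pos += n
        count, pos = _read_long(buf, pos)  # 0 terminates the map
    return json.loads(meta["avro.schema"])

# Demo: a hand-crafted header with schema "string" (not a real GridDB file)
header = (b"Obj\x01" + b"\x02" + b"\x16avro.schema"
          + b"\x10" + b'"string"' + b"\x00" + b"\x00" * 16)
with tempfile.NamedTemporaryFile("wb", delete=False) as f:
    f.write(header)
    avro_path = f.name

print(read_avro_schema(avro_path))
```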

[Memo]

 

7.4.2 Metadata file

A metadata file holds the information required to import the corresponding row data file. Its content follows the format of the metadata files used by gs_export / gs_import; parameters common to both metadata file types use the same format.

The following items are written as information that is not restored. If it is required to restore this information, move the description to the appropriate position in the file.

Information used by the archive engine is also written in this file.

| Field | Item | Note |
|-------|------|------|
| schemaInformation | Auxiliary information | |
| &nbsp;&nbsp;expirationType | Type of expiry release function | PARTITION / ROW (omissible when ROW is specified) |
| &nbsp;&nbsp;expirationTime | Length of expiration | Integer (written when PARTITION is specified) |
| &nbsp;&nbsp;expirationTimeUnit | Unit of the above length | DAY / HOUR / MINUTE / SECOND / MILLISECOND (written when PARTITION is specified) |
| &nbsp;&nbsp;triggerInfoSet | Trigger settings | Written only when any information is set |
| &nbsp;&nbsp;&nbsp;&nbsp;eventName | Trigger name | |
| &nbsp;&nbsp;&nbsp;&nbsp;notificationType | Notification method | JMS / REST |
| &nbsp;&nbsp;&nbsp;&nbsp;targetEvents | Event to be monitored | PUT / DELETE |
| &nbsp;&nbsp;&nbsp;&nbsp;targetColumnNames | Column name | |
| &nbsp;&nbsp;&nbsp;&nbsp;notificationURI | Destination URI of notification | |
| &nbsp;&nbsp;&nbsp;&nbsp;JmsDestinationType | Type of destination | topic / queue |
| &nbsp;&nbsp;&nbsp;&nbsp;JmsDestinationName | Name of destination | |
| &nbsp;&nbsp;&nbsp;&nbsp;JmsUser | <User name> | |
| &nbsp;&nbsp;&nbsp;&nbsp;JmsPassword | <Password> | |
| &nbsp;&nbsp;timeSeriesProperties | Information about a time series container | Written only for a time series container |
| &nbsp;&nbsp;&nbsp;&nbsp;compressionMethod | Compression method | NO / SS / HI |
| &nbsp;&nbsp;&nbsp;&nbsp;compressionWindowSize | Maximum window size of a row | Integer |
| &nbsp;&nbsp;&nbsp;&nbsp;compressionWindowSizeUnit | Unit of the above window size | DAY / HOUR / MINUTE / SECOND / MILLISECOND |
| &nbsp;&nbsp;&nbsp;&nbsp;expirationDivisionCount | Division count of row expiration | Integer |
| &nbsp;&nbsp;&nbsp;&nbsp;rowExpirationElapsedTime | Elapsed time of row expiration | Integer |
| &nbsp;&nbsp;&nbsp;&nbsp;rowExpirationTimeUnit | Elapsed time unit of row expiration | DAY / HOUR / MINUTE / SECOND / MILLISECOND |
| &nbsp;&nbsp;compressionInfoSet | Compression setting for each column | Written only when any information is set |
| &nbsp;&nbsp;&nbsp;&nbsp;columnName | Column name | |
| &nbsp;&nbsp;&nbsp;&nbsp;compressionType | Compression type | RELATIVE / ABSOLUTE |
| &nbsp;&nbsp;&nbsp;&nbsp;width | Thinning and compression parameter (absolute error) | Floating point number |
| &nbsp;&nbsp;&nbsp;&nbsp;rate | Thinning and compression parameter (relative error) | Floating point number |
| &nbsp;&nbsp;&nbsp;&nbsp;span | Thinning and compression parameter (relative error) | Floating point number |
| archiveInfo | Detailed information of expired data | |
| &nbsp;&nbsp;nodeAddr | Address of the node where the data was acquired | string |
| &nbsp;&nbsp;nodePort | REST port of the above node | Integer |
| &nbsp;&nbsp;databaseId | Database ID | string |
| &nbsp;&nbsp;containerId | Container ID | string |
| &nbsp;&nbsp;dataPartitionId | Data partition ID | Integer |
| &nbsp;&nbsp;rowIndexOID | Information used by the GridDB archive engine | string |
| &nbsp;&nbsp;mvccIndexOID | Information used by the GridDB archive engine | string |
| &nbsp;&nbsp;initSchemaStatus | Information used by the GridDB archive engine | string |
| &nbsp;&nbsp;schemaVersion | Schema version | Integer |
| &nbsp;&nbsp;expirationType | Type of expiry release function | PARTITION / ROW |
| &nbsp;&nbsp;startTime | Start time of the data range | string |
| &nbsp;&nbsp;endTime | End time of the data range | string |
| &nbsp;&nbsp;expiredTime | The time when the data expired | string |
| &nbsp;&nbsp;erasableTime | The time when the data became erasable | string |
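Since the metadata file is JSON, the archive information can be read with any JSON parser. A minimal sketch (the `cold_data_range` helper is hypothetical, and placing `archiveInfo` at the top level of the document is an assumption based on the field table above; adjust to the actual layout of your files):

```python
import json
import tempfile

def cold_data_range(meta_path):
    """Return (startTime, endTime) from the archiveInfo section of a metadata file.

    ASSUMPTION: 'archiveInfo' sits at the top level of the JSON document.
    """
    with open(meta_path) as f:
        info = json.load(f)["archiveInfo"]
    return info["startTime"], info["endTime"]

# Demo with a hand-made metadata fragment (not a real GridDB file)
doc = {"archiveInfo": {"startTime": "2017-07-08T15:00:00.000Z",
                       "endTime": "2017-08-07T14:59:59.999Z",
                       "expirationType": "ROW"}}
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    json.dump(doc, f)
    meta_path = f.name

print(cold_data_range(meta_path))
```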

7.4.3 Catalog file

Information about the row data files and metadata files included in the archive files is written in this file.

| Field | Item | Note |
|-------|------|------|
| rootDirectory | Directory where the archive files are placed | The directory specified in the store command when saving the archive files. The default is /var/lib/gridstore/archive/data; it can be changed by the "archiveDataPath" property in the configuration file. |
| files | List of cold data | |
| &nbsp;&nbsp;database | <Database name> | |
| &nbsp;&nbsp;container | <Container name> | |
| &nbsp;&nbsp;startTime | Start time of the cold data | The combination of the start time (startTime) and end time (endTime) of the key column values set for the expiry release function represents the data range that expired at the same time and is stored in one row data file. It shows only the range; a value exactly at the start time or the end time does not always exist. |
| &nbsp;&nbsp;endTime | End time of the cold data | |
| &nbsp;&nbsp;archiveFile | Archive file name | The name of the tar-format archive file containing the row data file of the data range and the corresponding metadata file. |
| &nbsp;&nbsp;subdirectory | Relative directory path in the archive file | The relative directory path in the above file where the row data file and the metadata file are placed. |
| &nbsp;&nbsp;metaFile | Metadata file name | The name of the metadata file. |
| &nbsp;&nbsp;rowDataFiles | Row data file names | The names of the row data files. Written as an array allowing multiple names for future expansion, but only one name is written in this version. |
| startTime | Start time of all data in this file | The earliest start time (startTime) in this file. |
| endTime | End time of all data in this file | The latest end time (endTime) in this file. |
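Since the catalog file is also JSON, a listing similar to the print command can be produced from it directly. A minimal sketch (the `list_cold_data` helper is hypothetical; the field names follow the table above, but the exact JSON nesting and whether `archiveFile` is relative to `rootDirectory` are assumptions):

```python
import json
import tempfile

def list_cold_data(catalog_path):
    """Yield print-command-like lines for each cold data entry of a catalog file."""
    with open(catalog_path) as f:
        catalog = json.load(f)
    for e in catalog["files"]:
        yield f"container={e['database']}.{e['container']}"
        yield f"startTime={e['startTime']}"
        yield f"endTime={e['endTime']}"
        yield f"archiveFile={catalog['rootDirectory']}/{e['archiveFile']}"

# Demo with a hand-made catalog fragment (not a real GridDB file)
doc = {"rootDirectory": "/var/lib/gridstore/archive/data",
       "files": [{"database": "public",
                  "container": "ExpRow_Time_NoPart_0001",
                  "startTime": "2017-07-08T15:00:00.000Z",
                  "endTime": "2017-08-07T14:59:59.999Z",
                  "archiveFile": "20180801040000000Z/20180801040000000Z_0001.tar"}]}
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    json.dump(doc, f)
    catalog_path = f.name

lines = list(list_cold_data(catalog_path))
print("\n".join(lines))
```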