Carol Connect (2C)

Carol Connect (2C) is the tool that gets data from a database and sends it to Carol.

Currently, these are the supported databases:

  • SQL Server (including SQL Server on Azure)
  • Oracle
  • PostgreSQL
  • MySQL
  • Informix
  • Progress OpenEdge

Requirements (server)

To install and run Carol Connect properly, the following requirements must be met:

  • An environment (Windows or Linux) with at least 100 GB of disk space, 4 GB of RAM, and internet access without a proxy. If any table holds large records such as images, 8 GB of RAM is strongly recommended.
  • A dedicated environment, to avoid competing for resources with other systems.
  • Access to *.carol.ai should be allowed on the firewall (if applicable).
  • A user with administrative rights on the database (permission to install triggers and create tables). The administrative rights can be removed after the installation and initialization of the environment.

Network Requirements

Carol Connect communicates with Carol (cloud). To allow this communication, the following requirements should be observed:

  • If the network has a firewall or proxy, Carol's URL should be added as an exception to ensure reliable performance.
  • Carol Connect communicates with Carol using the domain *.carol.ai.
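
One way to confirm that the firewall or proxy exception is in place is to try opening a connection from the Carol Connect server to the tenant's host. The sketch below is only an illustration and is not part of Carol Connect: the helper names are hypothetical, and the use of port 443 (HTTPS) is an assumption.

```python
import socket

CAROL_DOMAIN = "carol.ai"  # domain named in the network requirements


def carol_host(tenant: str) -> str:
    """Build the host name for a tenant, e.g. 'bematech' -> 'bematech.carol.ai'."""
    tenant = tenant.strip().lower()
    if not tenant or "." in tenant or "/" in tenant:
        raise ValueError(f"unexpected tenant name: {tenant!r}")
    return f"{tenant}.{CAROL_DOMAIN}"


def can_reach(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    Port 443 is an assumption (HTTPS); adjust it if your environment differs.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts.
        return False
```

For example, can_reach(carol_host("bematech")) returns True when the connection can be opened and False when it is blocked or times out.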

Installing Carol Connect

If you are starting a project with Carol and you need Carol Connect to integrate your data, you can download the most recent version from our GitHub repository: https://github.com/totvslabs/2c/releases.

Carol Connect needs no installation: just unzip the ZIP file and start Carol Connect with the file "run-2c.bat" (Windows) or "run-2c.sh" (Linux).

Configuring Carol Connect as a Service

Windows

Files needed:

  • carol2cservice.exe
  • serviceinstall.bat
  • serviceremove.bat

Instructions for registering 2C as a Windows service:

  1. Run serviceinstall.bat.
  2. Start the service normally.

Note: You must run the Windows command prompt as an administrator.

Linux

Files needed:

  • carolconnector.service

Instructions for registering 2C as a Linux service:

  1. Edit carolconnector.service and set the following entries:
    -- User
    -- WorkingDirectory
    -- ExecStart
  2. Run the following command (assuming /opt/2c is the install dir):
    -- sudo systemctl enable /opt/2c/carolconnector.service
  3. Start the service by running the following command:
    -- sudo systemctl start carolconnector

Logging in to Carol Connect

Access the URL http://localhost:8880 and enter your Carol credentials.

Some details:

  • Tenant/Domain: the domain you use to access your Carol environment. For example, if your Carol URL is "bematech.carol.ai", then your tenant/domain is: bematech
  • Username: the same username you use to authenticate in Carol. This username will be used to integrate data to Carol. You can change it later.
  • Password: the password associated with the username above.

Creating a connection to your Database

The button "+ Add new database" allows you to create a new database connection. Choose one of the supported databases to create the connection.

Next, enter the database parameters. These parameters are saved locally so that data can be integrated to Carol automatically.

This is a sample of the parameters for a SQL Server connection:

After the connection is created, the interface shows some important information:

  • Last batch sent: last time Carol Connect sent data to Carol.
  • Transmission: number of records integrated to Carol since the last Carol Connect restart.
  • Last minute rate: number of records per second over the last minute.
  • Carol connector: connector in Carol that is receiving this data.
  • Mean connection wait time: time, in milliseconds, that Carol Connect waits to get a connection and send data.
  • Mean connection usage time: time, in milliseconds, that Carol Connect uses the connection to send data to Carol.

If you need to change the database connection parameters, you can do so by clicking the dots at the top right.

Click "Configure Entities" to add tables to be integrated to Carol.

Carol Connect shows all entities available through the database connection configured:

Clicking on the table allows you to see details related to the table:

Some details:

  • Integration: it shows the integration status.
  • Data processing: it shows if the data processing is running or not.
  • Primary key information: it shows the fields that belong to the primary key. If this entity does not have a primary key defined in the database, you should select the field(s) that uniquely identify a record inside the entity (this is mandatory to integrate data to Carol).
  • Condition for initialization: you can define a condition to filter out some data. For example, you can initialize this entity sending only records where creation_date > '2017-12-10'. In this sample, creation_date is a field that belongs to the selected entity. This condition is applied only when initializing the entity. Records inserted or updated afterwards are integrated to Carol without being checked against this condition.
  • Sync using timestamp field: each time the cron expression in the configuration file fires, 2C checks for records inserted or changed after the last timestamp returned and sends those records to Carol. When selecting this option, you must choose the field that holds the insertion or modification timestamp.
  • Sync using Resync: each time the cron expression in the configuration file fires, 2C compares the local set of primary keys with the set of crosswalks stored in Carol. It sends the records that are missing in Carol and sends deletes for the records that no longer exist locally.
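
The two sync strategies above can be pictured with a small sketch. It is only an illustration of the idea, not 2C's implementation: the function names, the row layout and the `updated_at` field are hypothetical, and the `resend_last` flag is one reading of the syncByTimestampResendLastDate option.

```python
from datetime import datetime


def records_to_sync(rows, timestamp_field, last_timestamp, resend_last=False):
    """Sync by timestamp: select rows inserted or changed after the last timestamp.

    With resend_last=True, rows stamped exactly at last_timestamp are selected
    again (an assumption, useful when the field holds a date without a time).
    """
    if resend_last:
        return [r for r in rows if r[timestamp_field] >= last_timestamp]
    return [r for r in rows if r[timestamp_field] > last_timestamp]


def resync_plan(local_keys, carol_crosswalks):
    """Resync: compare local primary keys with the crosswalks stored in Carol."""
    local, remote = set(local_keys), set(carol_crosswalks)
    return {
        "send": sorted(local - remote),    # present locally, missing in Carol
        "delete": sorted(remote - local),  # present in Carol, gone locally
    }


# Hypothetical sample data.
rows = [
    {"id": 1, "updated_at": datetime(2017, 12, 9)},
    {"id": 2, "updated_at": datetime(2017, 12, 10)},
    {"id": 3, "updated_at": datetime(2017, 12, 11)},
]
last = datetime(2017, 12, 10)

changed = records_to_sync(rows, "updated_at", last)
changed_with_resend = records_to_sync(rows, "updated_at", last, resend_last=True)
plan = resync_plan(local_keys=[1, 2, 3], carol_crosswalks=[2, 3, 4])
```

Here `changed` holds only record 3, `changed_with_resend` holds records 2 and 3, and `plan` asks to send key 1 and delete key 4.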

After you click "Enable", Carol Connect creates the trigger that captures all new and updated records. After creating the trigger, Carol Connect initializes the Carol Connect queue (carol_3c_queue) to send all data that satisfies the configured conditions.

After enabling the integration of the entity, you can enable other entities. The next section describes the integration monitor available through the "Database Manager" menu.

Data Anonymization

2C can anonymize data before sending it to Carol. To use this feature, select the type of anonymization before enabling the entity synchronization.
The types of anonymization are:

  • None. The field is not anonymized.
  • Base Round. The field data is rounded using a base number, e.g. 16800 using a base round of 1000 becomes 17000. Define the base number in the option field.
  • Date. Changes date values to 1 (if the part is day, month or year) or 0 (if the part is hour, minute or second). Put the letters referring to the date parts in the option field, e.g. 12/15/2015 11:10:35 using the mdhMs option becomes 01/01/2015 00:00:00.
    • m Month
    • d Day of month
    • y Year
    • h Hour
    • M Minute
    • s Second
  • Email Mask. Applies to email address data, e.g. an address at somedomain.com becomes ***@somedomain.com.
  • Hash. A hash function is applied to the field data. CRC32C is used for numeric fields and SHA3 for string fields.
  • Mask. A mask is applied to the field data. The character # means that the character at that position is not masked, e.g. card number 123-456-789-012 using the mask XXX-XXX-XXX-### becomes XXX-XXX-XXX-012. Define the mask in the option field.
  • Suppression. The field is not sent to Carol; it is removed from the data payload.
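
The examples given for Base Round, Date, Email Mask and Mask can be reproduced with a few lines of Python. This is a minimal sketch of the described behavior, not 2C's actual code: the function names are hypothetical, the "halves round up" rule is an assumption, and the address `someone@somedomain.com` is a made-up sample. Hash is left out because CRC32C is not available in the Python standard library.

```python
from datetime import datetime

# Option letters from the list above, mapped to (datetime field, reset value).
_DATE_RESET = {
    "m": ("month", 1),
    "d": ("day", 1),
    "y": ("year", 1),
    "h": ("hour", 0),
    "M": ("minute", 0),
    "s": ("second", 0),
}


def base_round(value: int, base: int) -> int:
    """Round value to the nearest multiple of base (halves round up: an assumption)."""
    return (value + base // 2) // base * base


def date_anonymize(value: datetime, options: str) -> datetime:
    """Reset every date part named in options to 1 (date parts) or 0 (time parts)."""
    fields = {}
    for letter in options:
        name, reset = _DATE_RESET[letter]
        fields[name] = reset
    return value.replace(**fields)


def email_mask(address: str) -> str:
    """Replace everything before the '@' with '***'."""
    _, _, domain = address.partition("@")
    return f"***@{domain}"


def mask(value: str, pattern: str) -> str:
    """Keep the original character where the pattern has '#', else use the pattern character."""
    return "".join(v if p == "#" else p for v, p in zip(value, pattern))


rounded = base_round(16800, 1000)
reset_date = date_anonymize(datetime(2015, 12, 15, 11, 10, 35), "mdhMs")
masked_email = email_mask("someone@somedomain.com")
masked_card = mask("123-456-789-012", "XXX-XXX-XXX-###")
```

With the inputs above, `rounded` is 17000, `reset_date` is 2015-01-01 00:00:00, `masked_email` is ***@somedomain.com and `masked_card` is XXX-XXX-XXX-012, matching the examples in the list.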

Carol Connect Monitor

To check the status of the data integration, you can access the "Database Manager" menu and review the information described previously.

2C Configuration

The 2C configuration is located in the app.config.yml file.

  • syncThreads: Number of job executors running in parallel.
  • poolSize: Maximum number of connections to the database.
  • ignoreTriggers: Set to true if 2C isn't allowed to create triggers on the database.
  • enableReSync: Set to true to enable the ReSync strategy.
  • enableLogin: Set to true if 2C must ask for a login each time its API is called.
  • syncByTimestampResendLastDate: Set to true if, during a Sync by Timestamp, 2C has to resend the records from the last timestamp. Useful when the timestamp field holds only a date, without a time.
  • resendAllRecordsOnResync: Set to true if 2C needs to resend all records each time ReSync is executed.
  • imageAxisPixelsLimit: If greater than zero, 2C will validate and resize images that have an axis larger than the given limit.
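
To make the imageAxisPixelsLimit rule concrete, the sketch below computes the dimensions an image would have after being fitted to a limit. It only illustrates the idea: the function name is hypothetical, and keeping the aspect ratio while scaling the longest axis down to the limit is an assumption about how the resize is done.

```python
from typing import Tuple


def fit_within_limit(width: int, height: int, limit: int) -> Tuple[int, int]:
    """Return the (width, height) an image would have after applying the limit.

    A limit of zero or less disables the check, and images already within the
    limit are left unchanged. The aspect ratio is preserved (an assumption).
    """
    if limit <= 0 or max(width, height) <= limit:
        return width, height
    scale = limit / max(width, height)
    return max(1, round(width * scale)), max(1, round(height * scale))


resized = fit_within_limit(4000, 3000, 2000)   # longest axis exceeds the limit
untouched = fit_within_limit(800, 600, 2000)   # already within the limit
disabled = fit_within_limit(4000, 3000, 0)     # limit of zero: no resizing
```

In this sketch `resized` is (2000, 1500), while `untouched` and `disabled` keep their original dimensions.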

2C Cron configuration

2C has five cron expressions:

  • processing: Defines when 2C checks the queue table to send data to Carol.
  • resync: Defines when 2C starts one step of the ReSync of a table. A full ReSync has five steps: check local data, check staging data, check golden records, check rejected records, and compare local with remote data.
  • syncbyresync: Defines when 2C starts a ReSync for the tables that are synchronized by ReSync.
  • syncbytimestamp: Defines when 2C starts a Sync by Timestamp for the tables that are synchronized this way.
  • initialload: Defines when 2C checks for and starts initial loads for enabled tables.

SQL Server Requirements

These are the requirements that the database must meet to work properly with Carol Connect:

Progress OpenEdge Requirements

These are the requirements that the database must meet to work properly with Carol Connect:

  • OpenEdge 10.3 or higher.
  • Grant the 2C user table definition privileges.
  • On its first run, 2C creates the queue table in the database. OpenEdge only allows this operation when no other user is connected.
