Create your first Intake and send data to it

Before you create an Intake, gather the following information about your Dremio instance:

  • Dremio catalog URI: The API endpoint for your Dremio Iceberg catalog. Usually: https://dremio-<your-dremio-instance-name>-catalog.data-platform.stackit.run/iceberg/
  • Dremio warehouse: The name of the Dremio Iceberg catalog warehouse to connect. Usually: catalog-s3
  • Dremio token endpoint URI: The URL to request Dremio authentication tokens. Usually: https://dremio-<your-dremio-instance-name>.data-platform.stackit.run/oauth/token
  • Dremio personal access token (PAT): Your secret key for authentication. See how to get one in this guide
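The usual endpoint patterns above can be captured in a small shell helper so the values are ready to paste into the Portal later. This is a minimal sketch: the function name `dremio_endpoints` and the instance name `my-instance` are placeholders, and the output is only correct if your instance follows the "usually" patterns listed above.

```shell
# Print the usual Dremio endpoints for a given instance name.
# Assumption: the instance follows the default naming patterns listed above.
dremio_endpoints() {
  instance="$1"
  printf 'catalog_uri=https://dremio-%s-catalog.data-platform.stackit.run/iceberg/\n' "$instance"
  printf 'warehouse=catalog-s3\n'
  printf 'token_endpoint=https://dremio-%s.data-platform.stackit.run/oauth/token\n' "$instance"
}

# Example call with a placeholder instance name.
dremio_endpoints "my-instance"
```

The personal access token is deliberately left out of the helper so that it never ends up in shell history or script files.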

The Intake Runner is the engine for your data ingestion. You must create a runner before you can create an Intake.

  1. Open your project in the STACKIT Portal.
  2. Navigate to Data & AI > Intake.
  3. In the sidebar, click on Intake Runners.
  4. In the topbar, click on Create Intake Runners.
  5. Fill in the required fields:
    • Name: A human-readable name for your runner.
    • Max message size: The maximum size, in kibibytes (KiB), for a single message.
    • Max message rate (per h): The maximum number of messages the runner can process per hour.
  6. Click on Order feed-based.

After ordering, the Portal takes you to the Intake Runners list, where you can check the latest status of your runner.

Here you should see your newly created runner in state Reconciling. Give it a few minutes to become Active.

During reconciliation, you can already look at the details of your runner by clicking on its name. In particular, you can check the Endpoint URI to which you will send messages for your Intakes.

An Intake is the data pipe that connects the ingested data stream to your target Dremio Iceberg table.

  1. Open your project in the STACKIT Portal.
  2. Navigate to Data & AI > Intake.
  3. In the sidebar, click on Intakes.
  4. In the topbar, click on Create Intake.
  5. Fill in the required fields:
    • Name: A human-readable name for your Intake.
    • Intake Runner: Select the Intake Runner you created in the previous step.
    • Iceberg catalog endpoint: The Dremio catalog URL you obtained for your Dremio instance.
    • Iceberg warehouse: The Dremio warehouse identifier you obtained for your Dremio instance.
    • Table namespace: Leave the setting at intake.
    • Table name: Leave the setting at “Generate name”.
    • Table Partitioning: Leave the setting at “No partitioning”.
    • Dremio Token Endpoint: The Dremio token endpoint URL you obtained for your Dremio instance.
    • Dremio Personal Access Token: Your PAT for authentication.
  6. Click on Create Intake.

After you click Create Intake, the Portal takes you to the Intakes list, where you can check the latest status of a specific Intake.

Here you should see your newly created Intake in state Reconciling. Give it a few minutes to become Active.

During reconciliation, you can already look at the details of your Intake by clicking on its name. In particular, you can check the name of the topic to which you will send messages for this Intake.

To send data to your Intake, you must create an Intake user with the intake type and provide a secure password (12 characters minimum, with at least one upper case, lower case, and special character, as well as at least one number). Consult Create and manage Intake users to get more details.
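If you choose your own password, the policy above can be checked locally before you enter it in the Portal. The following is a minimal sketch, not the Portal's actual validation: `check_intake_password` is a hypothetical helper, and it assumes that "special character" means any character that is neither a letter nor a digit.

```shell
# Return 0 if the password meets the policy described above, 1 otherwise.
# Assumption: a "special character" is anything that is not a letter or digit.
check_intake_password() {
  pw="$1"
  [ "${#pw}" -ge 12 ] || return 1                        # 12 characters minimum
  case "$pw" in *[[:upper:]]*) ;; *) return 1 ;; esac    # at least one upper case
  case "$pw" in *[[:lower:]]*) ;; *) return 1 ;; esac    # at least one lower case
  case "$pw" in *[[:digit:]]*) ;; *) return 1 ;; esac    # at least one number
  case "$pw" in *[![:alnum:]]*) ;; *) return 1 ;; esac   # at least one special character
  return 0
}

# Example with a placeholder password; do not reuse it.
if check_intake_password 'Example-Passw0rd!'; then
  echo "password meets the policy"
else
  echo "password is too weak"
fi
```

The Portal remains the authority on what it accepts; the check only catches obvious violations early.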

  1. Open your project in the STACKIT Portal.
  2. Navigate to Data & AI > Intake.
  3. In the sidebar, click on Intakes.
  4. Click on the Intake you created in the previous step.
  5. In the sidebar, click on Users.
  6. In the topbar, click on Create User.
  7. Fill in the required fields:
    • Name: The name for your Intake user.
    • Role: Leave it at “Intake Writer”.
    • Password: A secure password (12 characters minimum, with at least one upper case, lower case, and special character, as well as at least one number). You can also click on Generate new password to have a secure password created for you. In this case, make sure to copy it and store it securely.
  8. Click on Create.

The response includes the Login name of the created user. Record it for later use with your Kafka client. The response also includes Configurations that you can copy for Java or librdkafka clients.

You can look at the details of the created user:

  1. In the STACKIT Portal, navigate to Data & AI > Intake.
  2. Click on the Intake you created in the previous step.
  3. In the sidebar, click on Users.
  4. Click on the user you just created to see its details, including the Login name.

To inspect messages that fail processing, create a user with read-only access to the Dead Letter Queue (DLQ) by setting the type to dead-letter.

  1. Open your project in the STACKIT Portal.
  2. Navigate to Data & AI > Intake.
  3. In the sidebar, click on Intakes.
  4. Click on the Intake you created in the previous step.
  5. In the sidebar, click on Users.
  6. In the topbar, click on Create User.
  7. Fill in the required fields:
    • Name: The name for your Dead Letter Intake user.
    • Role: Set it to “Dead Letter Reader”.
    • Password: A secure password (12 characters minimum, with at least one upper case, lower case, and special character, as well as at least one number). You can also click on Generate new password to have a secure password created for you. In this case, make sure to copy it and store it securely.
  8. Click on Create.

The response includes the Login name of the created user. Record it for later use with your Kafka client. The response also includes Configurations that you can copy for Java or librdkafka clients.

To list all Intake users for an Intake in your project:

  1. In the STACKIT Portal, navigate to Data & AI > Intake.
  2. Click on the Intake you created in the previous step.
  3. In the sidebar, click on Users.

Here you should see both the Intake user and the Dead Letter user you created in the previous steps.

Send data to an Iceberg table via your Intake

After completing the setup, you can verify that everything works by sending a test message to your Intake topic using a Kafka client like kcat.

echo '{"message": "hello world from kcat!", "id": 123}' | \
kcat -b $INTAKE_RUNNER_ID.intake.eu01.onstackit.cloud:9094 \
-t intake-"$INTAKE_ID" \
-P \
-X security.protocol=SASL_SSL \
-X sasl.mechanisms=SCRAM-SHA-512 \
-X sasl.username=intake-user-"$INTAKE_USER_ID" \
-X sasl.password='<YOUR_SECURE_PASSWORD_FOR_INTAKE_USER>'

After sending the message, query your target table in the Dremio UI to confirm that the record has been successfully ingested.
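If you send test messages repeatedly, you may prefer to keep the SASL settings in a configuration file instead of repeating them on the command line. The sketch below writes the same four properties used in the producer command to a file that kcat can load with its `-F` option. The helper name `write_kcat_config` and the file name `intake.conf` are assumptions for illustration.

```shell
# Write the SASL settings from the producer command to a kcat config file.
# Hypothetical helper; arguments: <file> <intake-user-id> <password>.
write_kcat_config() {
  conf_file="$1"
  (
    umask 077   # a newly created file is readable by the owner only
    cat > "$conf_file" <<EOF
security.protocol=SASL_SSL
sasl.mechanisms=SCRAM-SHA-512
sasl.username=intake-user-$2
sasl.password=$3
EOF
  )
}

# Usage (not executed here):
#   write_kcat_config intake.conf "$INTAKE_USER_ID" '<YOUR_SECURE_PASSWORD_FOR_INTAKE_USER>'
#   echo '{"message": "hello again from kcat!", "id": 124}' | \
#     kcat -F intake.conf \
#       -b "$INTAKE_RUNNER_ID.intake.eu01.onstackit.cloud:9094" \
#       -t "intake-$INTAKE_ID" -P
```

The file contains the password in plain text, so store it outside version control and delete it when you no longer need it. If the file already exists, `umask` does not tighten its permissions.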

You can also read from the Dead-Letter topic:

kcat -b $INTAKE_RUNNER_ID.intake.eu01.onstackit.cloud:9094 \
-t deadletter-intake-"$INTAKE_ID" \
-C \
-X security.protocol=SASL_SSL \
-X sasl.mechanisms=SCRAM-SHA-512 \
-X sasl.username=intake-user-"$INTAKE_USER_DLQ_ID" \
-X sasl.password='<YOUR_SECURE_PASSWORD_FOR_DEADLETTER_USER>'
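The consumer command above keeps running and waits for new messages. When you only want to inspect what is already in the Dead-Letter topic, a wrapper like the following may be convenient. It is a sketch under assumptions: `read_dead_letters` and the variable `DLQ_PASSWORD` are hypothetical names, and the kcat options `-o beginning` (start at the earliest available offset) and `-e` (exit after the last message) are added to the original command.

```shell
# Read the Dead-Letter topic from the earliest available offset, then exit.
# Hypothetical wrapper; expects INTAKE_RUNNER_ID, INTAKE_ID, INTAKE_USER_DLQ_ID
# and DLQ_PASSWORD (an assumed variable name) to be set in the environment.
read_dead_letters() {
  kcat -b "$INTAKE_RUNNER_ID.intake.eu01.onstackit.cloud:9094" \
    -t "deadletter-intake-$INTAKE_ID" \
    -C -o beginning -e \
    -X security.protocol=SASL_SSL \
    -X sasl.mechanisms=SCRAM-SHA-512 \
    -X sasl.username="intake-user-$INTAKE_USER_DLQ_ID" \
    -X sasl.password="$DLQ_PASSWORD"
}

# Usage (not executed here):
#   DLQ_PASSWORD='<YOUR_SECURE_PASSWORD_FOR_DEADLETTER_USER>' read_dead_letters
```

How far back the topic can be read depends on its retention settings, which this page does not describe.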

You can delete the Intake Runner with the following steps. This also deletes its Intakes and Intake users.

  1. Open your project in the STACKIT Portal.
  2. Navigate to Data & AI > Intake.
  3. In the sidebar, click on Intake Runners.
  4. Click on the three dots to the right of the Intake Runner you created.
  5. Select Delete.
  6. Confirm the deletion by entering the Intake Runner name.
  7. Press Delete.

Congratulations, you created your first Intake pipeline and sent data to it. From here you can dig deeper with the How-Tos.