Troubleshooting Undeliverable Messages in STACKIT Intake

If messages are not appearing in your Dremio Iceberg table, there are several possible causes. The troubleshooting steps below help you identify and resolve the problem.

  • Messages in the Dead Letter Queue (DLQ): Your messages may have been moved to the DLQ because of a processing error. To diagnose this, follow the steps in “Diagnosing Messages in the Dead Letter Queue (DLQ)” below.
  • Flushing Delay: Data is periodically flushed from Intake’s buffer to the Dremio table. If your messages are not immediately visible, it might be due to this flushing delay.
    • Solution: Wait a few minutes (typically up to 5 minutes, possibly longer in high-load scenarios) for the data to be written and become visible in Dremio. The messages are safely buffered during this time.
  • Authentication Failure (Expired PAT): If the Personal Access Token (PAT) used by your Intake has expired, the Intake will no longer be able to write data to Dremio. The Intake’s status will likely reflect this with a failed state or an error message.
    • Solution: Create a new PAT for your technical Dremio user and update the Intake with the new token. To do this, follow the instructions in the section below.

How to Refresh a Personal Access Token (PAT)

If an expired or invalid PAT is the cause of your ingestion failure, you must generate a new token and update your Intake.

  1. Generate a New PAT in Dremio

    Using the Dremio configuration guide, log in to Dremio as your dedicated technical user (e.g., intake_write_user).

    1. Go to Account Settings and select the Personal Access Tokens tab.
    2. Click Generate Token and give it a new name and expiration time.
  2. Update the Intake with the New PAT

    Use a PUT request to update the personalAccessToken field of your Intake.

    stackit curl -X PUT \
      "https://intake.api.eu01.stackit.cloud/v1beta/projects/$PROJECT_ID/regions/eu01/intakes/$INTAKE_ID" \
      -H "Content-Type: application/json" \
      --data '{
        "catalog": {
          "uri": "'"$DREMIO_CATALOG_URI"'",
          "warehouse": "'"$DREMIO_WAREHOUSE"'",
          "auth": {
            "type": "dremio",
            "dremio": {
              "tokenEndpoint": "'"$DREMIO_TOKEN_ENDPOINT"'",
              "personalAccessToken": "<YOUR_NEW_PAT>"
            }
          }
        }
      }'

    This will update your Intake’s authentication credentials, allowing it to reconnect and resume writing data to Dremio.
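
Because the PUT payload above splices shell variables directly into the JSON body, an unset variable would silently send an empty string for that field. A minimal sketch of a guard that could run before the request is shown below; require_vars is a hypothetical helper name for illustration, not part of the STACKIT CLI.

```shell
# Hypothetical helper (not part of the STACKIT CLI): returns non-zero and
# names every variable in the argument list that is unset or empty.
require_vars() {
  missing=0
  for name in "$@"; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Possible usage before the PUT request:
#   require_vars PROJECT_ID INTAKE_ID DREMIO_CATALOG_URI \
#     DREMIO_WAREHOUSE DREMIO_TOKEN_ENDPOINT || exit 1
```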

Diagnosing Messages in the Dead Letter Queue (DLQ)

If the issue is not a simple flushing delay or an expired PAT, your messages may have been moved to the DLQ. The following steps will guide you through diagnosing and resolving these issues.

  1. Check the Intake Status for Errors

    First, check the status of your Intake, which gives a quick summary of any ongoing issues.

    Use the following command to get the details of your Intake:

    stackit curl -X GET \
      "https://intake.api.eu01.stackit.cloud/v1beta/projects/$PROJECT_ID/regions/eu01/intakes/$INTAKE_ID"

    Look for two key fields in the JSON response:

    • undeliveredMessageCount: If this number is greater than zero, it indicates that messages have been sent to the DLQ.
    • failure_message: This field provides the last error message that occurred, which can often point directly to the root cause of the problem (e.g., “JSON parsing failed”).
  2. Read Messages from the Dead Letter Queue (DLQ)

    To inspect the problematic messages in detail, you need to read from the DLQ topic. You should already have created an Intake User of type deadletter for this purpose, as described in the getting started guide.

    Use a Kafka client like kcat to consume messages from the DLQ topic. Replace the placeholder values with your own:

    kcat -b "$INTAKE_RUNNER_ID.intake.eu01.onstackit.cloud:9094" \
      -t deadletter-intake-"$INTAKE_ID" \
      -C \
      -X security.protocol=SASL_SSL \
      -X sasl.mechanisms=SCRAM-SHA-512 \
      -X sasl.username=intake-user-"$INTAKE_USER_DLQ_ID" \
      -X sasl.password='<YOUR_SECURE_PASSWORD_FOR_DEADLETTER_USER>'

    This command will print the content of the messages in the DLQ to your terminal, allowing you to examine them for issues.
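
The status check in step 1 can be reduced to a one-line summary. The sketch below assumes jq is installed and that the GET response body is printed to standard output. The function name intake_status_summary is hypothetical. Because the exact nesting of the two fields in the response is not specified here, the filter searches the whole document instead of assuming a fixed path.

```shell
# Hypothetical helper: reads an Intake JSON document on stdin and prints
# the first undeliveredMessageCount and failure_message found anywhere in it.
# Assumption: field nesting is unknown, so the filter recurses with "..".
intake_status_summary() {
  jq -r '
    ([.. | .undeliveredMessageCount? // empty] | first // "n/a") as $count
    | ([.. | .failure_message? // empty] | first // "none") as $msg
    | "undelivered=\($count) failure=\($msg)"
  '
}

# Possible usage:
#   stackit curl -X GET \
#     "https://intake.api.eu01.stackit.cloud/v1beta/projects/$PROJECT_ID/regions/eu01/intakes/$INTAKE_ID" \
#     | intake_status_summary
```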

Based on the information from the failure_message and the messages you’ve read from the DLQ, here are some common reasons for undeliverable messages and how to fix them:

  • Malformed JSON: The most frequent cause of undeliverable messages is a payload that is not valid JSON. Ensure your application is sending well-formed JSON objects.
    • Solution: Correct the JSON payload format in your data producer.
  • Schema Mismatch or Evolution Problems: The data in your message might not conform to the schema of the target Iceberg table. This can happen if a field has a different data type than what was initially inferred or explicitly defined.
    • Solution: Check the failure_message for details on the schema conflict. If the schema of your data has changed, you may need to update your Dremio table to accommodate the new schema.
  • Unsupported Data Types: Some data types might not be supported or correctly parsed by Intake.
    • Solution: Reformat your data to use supported types (e.g., ensure timestamps are in a recognizable string format).
  • Dremio Permissions: The Dremio Personal Access Token (PAT) used by the Intake might lack the necessary permissions to write to the target table or namespace.
    • Solution: Verify that the PAT has the required USAGE, CREATE FOLDER, COMMIT, and CREATE TABLE permissions as outlined in the Dremio configuration guide.
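
To check for the malformed JSON case listed above, the DLQ output can be filtered for payloads that do not parse. This is a rough sketch under two assumptions: jq is installed, and each DLQ message is printed as the original payload on a single line. The function name find_malformed_json is hypothetical.

```shell
# Hypothetical helper: reads one message per line on stdin and prints the
# line number and content of every line that does not parse as a JSON object.
find_malformed_json() {
  n=0
  while IFS= read -r line || [ -n "$line" ]; do
    n=$((n + 1))
    # jq -e exits non-zero on a parse error or when the value is not an object.
    if ! printf '%s' "$line" | jq -e 'type == "object"' >/dev/null 2>&1; then
      printf 'line %d: %s\n' "$n" "$line"
    fi
  done
}

# Possible usage: append "-o beginning -e -q" to the kcat command from step 2
# so that it reads from the start of the topic and exits at the end, then pipe
# its output into find_malformed_json.
```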

If this troubleshooting guide did not help you resolve your problem, reach out to the Helpdesk.