Troubleshooting Undeliverable Messages in STACKIT Intake
If messages are not appearing in your Dremio Iceberg table, there are several possible causes. The steps below help you identify and resolve the problem.
Common Reasons for Undelivered Messages
- Messages in the Dead Letter Queue (DLQ): Your messages may have been moved to the DLQ because of a processing error. To diagnose this, follow the steps under "Diagnosing Messages in the Dead Letter Queue (DLQ)".
- Flushing Delay: Intake periodically flushes data from its buffer to the Dremio table, so new messages may not be visible immediately.
  - Solution: Wait a few minutes (up to 5 minutes, or longer under high load) for the data to be written and become visible in Dremio. The messages are safely buffered during this time.
- Authentication Failure (Expired PAT): If the Personal Access Token (PAT) used by your Intake has expired, the Intake can no longer write data to Dremio. The Intake’s status will likely show a failed state or an error message.
  - Solution: Create a new PAT for your technical Dremio user and update the Intake with the new token, as described under "How to Refresh a Personal Access Token (PAT)".
How to Refresh a Personal Access Token (PAT)
If an expired or invalid PAT is the cause of your ingestion failure, you must generate a new token and update your Intake.
- Generate a New PAT in Dremio
  Using the Dremio configuration guide, log in to Dremio as your dedicated technical user (e.g., intake_write_user).
  - Go to Account Settings and select the Personal Access Tokens tab.
  - Click Generate Token and give it a new name and expiration time.
- Update the Intake with the New PAT
  Use a PUT request to update the personalAccessToken field of your Intake.

      stackit curl -X PUT \
        "https://intake.api.eu01.stackit.cloud/v1beta/projects/$PROJECT_ID/regions/eu01/intakes/$INTAKE_ID" \
        -H "Content-Type: application/json" \
        --data '{
          "catalog": {
            "uri": "'"$DREMIO_CATALOG_URI"'",
            "warehouse": "'"$DREMIO_WAREHOUSE"'",
            "auth": {
              "type": "dremio",
              "dremio": {
                "tokenEndpoint": "'"$DREMIO_TOKEN_ENDPOINT"'",
                "personalAccessToken": "<YOUR_NEW_PAT>"
              }
            }
          }
        }'

  This updates your Intake’s authentication credentials, allowing it to reconnect and resume writing data to Dremio.
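The quote splicing in the request body above is easy to get wrong when editing by hand. As a minimal sketch, the body can be built by a small helper and inspected before it is sent. The function name build_pat_payload and the NEW_PAT variable are hypothetical, and the helper assumes that none of the values contain double quotes or backslashes, since printf does not escape JSON.

```shell
# Sketch: build the Intake update body in one place.
# Hypothetical helper, not part of the STACKIT CLI.
# Assumption: values contain no double quotes or backslashes (no JSON escaping is done).
# Arguments: $1 catalog URI, $2 warehouse, $3 token endpoint, $4 new PAT
build_pat_payload() {
  printf '{"catalog":{"uri":"%s","warehouse":"%s","auth":{"type":"dremio","dremio":{"tokenEndpoint":"%s","personalAccessToken":"%s"}}}}' \
    "$1" "$2" "$3" "$4"
}

# Example (commented out so that sourcing this file sends nothing):
# payload=$(build_pat_payload "$DREMIO_CATALOG_URI" "$DREMIO_WAREHOUSE" "$DREMIO_TOKEN_ENDPOINT" "$NEW_PAT")
# printf '%s\n' "$payload"   # inspect before sending
```

The result could then be passed to the same PUT request as `--data "$payload"`. Keeping the token in a variable such as NEW_PAT also avoids pasting it into the command line itself.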
Diagnosing Messages in the Dead Letter Queue (DLQ)
If the issue is not a simple flushing delay or an expired PAT, your messages may have been moved to the DLQ. The following steps will guide you through diagnosing and resolving these issues.
- Check the Intake Status for Errors
  Start by checking the status of your Intake, which gives a quick summary of any ongoing issues.
  Use the following command to get the details of your Intake:

      stackit curl -X GET \
        "https://intake.api.eu01.stackit.cloud/v1beta/projects/$PROJECT_ID/regions/eu01/intakes/$INTAKE_ID"

  Look for two key fields in the JSON response:
  - undeliveredMessageCount: If this number is greater than zero, messages have been sent to the DLQ.
  - failure_message: The last error message that occurred, which can often point directly to the root cause of the problem (e.g., “JSON parsing failed”).
- Read Messages from the Dead Letter Queue (DLQ)
  To inspect the problematic messages in detail, read from the DLQ topic. This requires a deadletter type Intake User, which you should already have created by following the getting started guide.
  Use a Kafka client such as kcat to consume messages from the DLQ topic. Replace the placeholder values with your own:

      kcat -b $INTAKE_RUNNER_ID.intake.eu01.onstackit.cloud:9094 \
        -t deadletter-intake-"$INTAKE_ID" \
        -C \
        -X security.protocol=SASL_SSL \
        -X sasl.mechanisms=SCRAM-SHA-512 \
        -X sasl.username=intake-user-"$INTAKE_USER_DLQ_ID" \
        -X sasl.password='<YOUR_SECURE_PASSWORD_FOR_DEADLETTER_USER>'

  This command prints the content of the messages in the DLQ to your terminal, allowing you to examine them for issues.
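The status check from the first step can also be scripted. The following is a minimal sketch rather than an official client: it assumes the response body is available as JSON text on standard input, and it matches the two keys wherever they occur because the exact nesting of the response is not assumed here. The function names are made up for illustration.

```shell
# Sketch: extract the two status fields from an Intake response on stdin.
# Assumptions: undeliveredMessageCount is a plain JSON number, and
# failure_message is a string without escaped double quotes.

# Print the value of undeliveredMessageCount, or nothing if the key is absent.
undelivered_count() {
  grep -o '"undeliveredMessageCount"[[:space:]]*:[[:space:]]*[0-9][0-9]*' \
    | head -n 1 \
    | grep -o '[0-9][0-9]*$'
}

# Print the text of failure_message, or nothing if the key is absent.
failure_message() {
  grep -o '"failure_message"[[:space:]]*:[[:space:]]*"[^"]*"' \
    | head -n 1 \
    | sed 's/^"failure_message"[[:space:]]*:[[:space:]]*"\(.*\)"$/\1/'
}

# Succeed (exit status 0) when the response reports at least one undelivered message.
has_undelivered() {
  count=$(undelivered_count)
  [ -n "$count" ] && [ "$count" -gt 0 ]
}
```

With the response saved to a file, say response.json (a hypothetical name), `has_undelivered < response.json && failure_message < response.json` would print the last error only when messages have been sent to the DLQ.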
Common Causes and Solutions
Based on the failure_message and the messages you have read from the DLQ, here are some common reasons for undeliverable messages and how to fix them:
- Malformed JSON: The most frequent cause of undeliverable messages is a payload that is not valid JSON. Ensure your application is sending well-formed JSON objects.
  - Solution: Correct the JSON payload format in your data producer.
- Schema Mismatch or Evolution Problems: The data in your message might not conform to the schema of the target Iceberg table. This can happen if a field has a different data type than the one initially inferred or explicitly defined.
  - Solution: Check the failure_message for details on the schema conflict. If the schema of your data has changed, you may need to update your Dremio table to accommodate the new schema.
- Unsupported Data Types: Some data types might not be supported or correctly parsed by Intake.
  - Solution: Reformat your data to use supported types (e.g., ensure timestamps are in a recognizable string format).
- Dremio Permissions: The Dremio Personal Access Token (PAT) used by the Intake might lack the necessary permissions to write to the target table or namespace.
  - Solution: Verify that the PAT has the required USAGE, CREATE FOLDER, COMMIT, and CREATE TABLE permissions as outlined in the Dremio configuration guide.
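To see which DLQ messages fall under the malformed JSON cause, the payloads printed by kcat can be checked mechanically. The following is a rough sketch, assuming jq is installed and that each message occupies exactly one line of output; the function name classify_dlq_payloads is made up for illustration. It does not detect schema, data type, or permission problems.

```shell
# Sketch: classify DLQ payloads read one per line from stdin.
# Assumptions: jq is installed; one message per line.
# Prints "<line number> <verdict>" for every input line.
classify_dlq_payloads() {
  n=0
  while IFS= read -r line || [ -n "$line" ]; do
    n=$((n + 1))
    status=0
    # jq -e exits 0 when the filter yields true, 1 when it yields false,
    # and with another non-zero status when the input cannot be parsed.
    printf '%s\n' "$line" | jq -e 'type == "object"' >/dev/null 2>&1 || status=$?
    case $status in
      0) echo "$n ok" ;;
      1) echo "$n not-an-object" ;;
      *) echo "$n malformed" ;;
    esac
  done
}
```

One way to use it is to pipe the kcat command from the DLQ step into the function, adding kcat's `-o beginning -e` options so that reading starts at the oldest message and stops at the end of the topic.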
Contact STACKIT support to get help
If this troubleshooting guide did not help you resolve your problem, reach out to the Helpdesk.