Saturday, 3 August 2024

Trino Redpanda

 

Trino Deployment on Windows Instance and Integration with Redpanda

This document provides a step-by-step guide for deploying Trino on a Windows instance, integrating it with Redpanda using SASL/SCRAM-SHA-256 authentication, and querying Protobuf-encoded Kafka topics. It also explains how other users can access Trino's SQL endpoint.


1. Prerequisites

Java Installation

  • Required Version: Trino 411 requires a 64-bit Java 17 runtime. Newer Trino releases require newer Java versions, so check the requirements of the release you download.

  • Download Java:

  • Install and Configure Java:

    1. Set the JAVA_HOME environment variable to the Java installation directory (e.g., C:\Program Files\Java\jdk-17).
    2. Add the bin directory of the Java installation to the system's PATH (e.g., C:\Program Files\Java\jdk-17\bin).
  • Verify Installation:

    java -version
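
The version check can also be scripted. The following is a minimal sketch, not part of Trino: the function names are hypothetical, and the minimum version is left as a parameter because it depends on the Trino release you downloaded.

```python
import re
import subprocess


def parse_java_major(version_output: str) -> int:
    """Extract the major Java version from the text printed by `java -version`."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        raise ValueError("could not find a version string")
    major = int(match.group(1))
    # Java 8 and earlier report themselves as 1.x (for example "1.8.0_392").
    if major == 1 and match.group(2) is not None:
        major = int(match.group(2))
    return major


def java_is_new_enough(required_major: int) -> bool:
    """Run `java -version` from PATH and compare it with the required major version."""
    # `java -version` writes its output to stderr, not stdout.
    result = subprocess.run(
        ["java", "-version"], capture_output=True, text=True, check=True
    )
    return parse_java_major(result.stderr) >= required_major
```

Calling java_is_new_enough with the major version your Trino release requires returns False when the java found on PATH is too old, which usually means PATH or JAVA_HOME still points at an earlier installation.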
    

Download Trino

  1. Visit the official Trino Downloads Page.
  2. Download the latest stable version (e.g., trino-server-411.tar.gz).
  3. Extract the archive to a directory, such as C:\trino.

Redpanda Setup

  • Ensure Redpanda is running and configured with:
    • Authentication: SASL/SCRAM-SHA-256 with username/password.
    • Message format: Protobuf-encoded topics.

Protobuf Schema

  • Prepare the .proto file that defines the message structure for your Kafka topic.

Example message.proto:

syntax = "proto3";

import "google/protobuf/timestamp.proto";

message YourMessage {
    string field1 = 1;
    int32 field2 = 2;
    google.protobuf.Timestamp field3 = 3;
}

Save the file in a directory accessible to Trino, such as C:\trino\schemas.


2. Deploy Trino

Configure Trino

  1. Create the configuration directory C:\trino\etc inside the Trino installation directory (the archive does not include it) and add the following files:
  • Node Properties (node.properties)
node.environment=production
node.id=trino-node-1
node.data-dir=C:/trino/data
  • JVM Config (jvm.config)
-Xmx16G
-Xms16G
  • Config Properties (config.properties)
coordinator=true
node-scheduler.include-coordinator=true
http-server.http.port=8080
query.max-memory=5GB
query.max-memory-per-node=1GB
discovery.uri=http://localhost:8080
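  2. Start Trino: Open a command prompt, navigate to the Trino installation directory, and start the server. The archive ships a bin/launcher script rather than a launcher.bat, and Trino officially targets Linux, so on Windows the launcher generally has to run in a Linux-compatible environment such as WSL:
    bin/launcher run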
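
One detail of the configuration files above is easy to miss on Windows: Java .properties files treat the backslash as an escape character, so a path such as C:\trino\data is not read literally. The sketch below approximates that behavior in Python to show the effect. It is a simplification (it ignores \uXXXX escapes and line continuations), and java_properties_value is a hypothetical helper, not a Trino or Java API.

```python
def java_properties_value(raw: str) -> str:
    """Approximate how a Java .properties loader unescapes a value (simplified)."""
    escapes = {"t": "\t", "n": "\n", "r": "\r", "f": "\f"}
    out = []
    i = 0
    while i < len(raw):
        ch = raw[i]
        if ch == "\\" and i + 1 < len(raw):
            nxt = raw[i + 1]
            # Known escapes become control characters; for any other
            # character the backslash is silently dropped.
            out.append(escapes.get(nxt, nxt))
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

Passing the raw text C:\trino\data through this function yields a string containing a tab character and no backslashes, while C:/trino/data and C:\\trino\\data both survive as usable paths. Forward slashes are the simpler choice.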
    

3. Configure Redpanda Connector

Kafka Connector Setup

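Create a file named kafka.properties in C:\trino\etc\catalog. The Kafka connector's kafka.security-protocol property accepts only PLAINTEXT and SSL, so the SASL/SCRAM settings go into a separate Kafka client properties file referenced through kafka.config.resources:

connector.name=kafka
kafka.nodes=localhost:9092
kafka.config.resources=C:/trino/etc/kafka-client.properties

Then create the client file (named kafka-client.properties here as an example) with the SASL settings. Keep the JAAS entry on a single line, because a line break in a .properties file starts a new property:

security.protocol=SASL_PLAINTEXT
sasl.mechanism=SCRAM-SHA-256
sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required username="your-username" password="your-password";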
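
The JAAS entry has to end up on one logical line with the username and password inside it. A small sketch that renders the standard Kafka client SASL properties is shown here. The function names are hypothetical, and the sketch deliberately rejects credentials containing quotes or backslashes instead of guessing at the extra escaping they would need.

```python
def scram_jaas_config(username: str, password: str) -> str:
    """Build a single-line JAAS entry for the Kafka SCRAM login module."""
    for value in (username, password):
        if '"' in value or "\\" in value:
            raise ValueError(
                "quotes and backslashes need extra escaping; not handled in this sketch"
            )
    return (
        "org.apache.kafka.common.security.scram.ScramLoginModule required "
        f'username="{username}" password="{password}";'
    )


def kafka_client_properties(
    username: str, password: str, protocol: str = "SASL_PLAINTEXT"
) -> str:
    """Render Kafka client properties for SCRAM-SHA-256 as properties-file text."""
    lines = [
        f"security.protocol={protocol}",
        "sasl.mechanism=SCRAM-SHA-256",
        f"sasl.jaas.config={scram_jaas_config(username, password)}",
    ]
    return "\n".join(lines) + "\n"
```

Writing the returned text to a file keeps the credentials out of hand-edited configuration and guarantees the JAAS entry is not split across lines.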

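Define a Table in Trino

The Kafka connector does not support CREATE SCHEMA or CREATE TABLE; each topic is mapped to a table through a JSON table description file. Add the table name and the description directory to kafka.properties:

kafka.table-names=my_topic
kafka.table-description-dir=C:/trino/etc/kafka

Then create C:\trino\etc\kafka\my_topic.json, pointing dataSchema at the .proto file:

{
    "tableName": "my_topic",
    "schemaName": "default",
    "topicName": "your_topic_name",
    "message": {
        "dataFormat": "protobuf",
        "dataSchema": "C:/trino/schemas/message.proto",
        "fields": [
            {"name": "field1", "type": "VARCHAR", "mapping": "field1"},
            {"name": "field2", "type": "INTEGER", "mapping": "field2"},
            {"name": "field3", "type": "TIMESTAMP", "mapping": "field3"}
        ]
    }
}

Replace placeholders with actual values:

  • your_topic_name: Kafka/Redpanda topic name.
  • dataSchema: path to the .proto file that defines the message type (YourMessage in the example above).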
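
The Trino Kafka connector documentation describes JSON table description files as the way to map a topic's fields to columns. As a hedged sketch, such a file can be generated from a list of Protobuf field names and types. The type mapping below covers only a few scalar types plus Timestamp, and the helper names are hypothetical.

```python
import json

# Protobuf-to-Trino type names for a few common types; extend as needed.
PROTO_TO_TRINO = {
    "string": "VARCHAR",
    "int32": "INTEGER",
    "int64": "BIGINT",
    "bool": "BOOLEAN",
    "double": "DOUBLE",
    "google.protobuf.Timestamp": "TIMESTAMP",
}


def table_description(table, topic, proto_path, proto_fields, schema="default"):
    """Build a Kafka connector table description for a Protobuf topic."""
    fields = [
        {"name": name, "type": PROTO_TO_TRINO[proto_type], "mapping": name}
        for name, proto_type in proto_fields
    ]
    return {
        "tableName": table,
        "schemaName": schema,
        "topicName": topic,
        "message": {
            "dataFormat": "protobuf",
            "dataSchema": proto_path,
            "fields": fields,
        },
    }


def table_description_json(*args, **kwargs) -> str:
    """Serialize the table description as indented JSON text."""
    return json.dumps(table_description(*args, **kwargs), indent=4)
```

For the example message, the field list would be [("field1", "string"), ("field2", "int32"), ("field3", "google.protobuf.Timestamp")]. An unsupported Protobuf type raises a KeyError rather than producing a guessed mapping.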

4. Accessing Trino's SQL Endpoint

Endpoint Information

  • URL: http://<trino-host>:8080

Accessing via Trino CLI

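  1. Launch the Trino CLI. It is distributed as a separate executable JAR (e.g., trino-cli-411-executable.jar) on the Trino downloads page; on Windows, run it with Java:

    java -jar trino-cli-411-executable.jar --server http://localhost:8080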
    
  2. Run SQL queries:

    SHOW TABLES FROM kafka.default;
    SELECT * FROM kafka.default.my_topic LIMIT 10;
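
The same endpoint can be reached without the CLI. Trino's client protocol accepts a POST of the SQL text to /v1/statement with an X-Trino-User header and returns JSON containing a nextUri link to follow until the query finishes. The sketch below uses only the Python standard library. It assumes an unauthenticated HTTP endpoint as configured in this guide, and the function names are hypothetical.

```python
import json
import urllib.request


def build_statement_request(server: str, sql: str, user: str = "trino"):
    """Create the initial POST request that submits a SQL statement."""
    return urllib.request.Request(
        url=f"{server.rstrip('/')}/v1/statement",
        data=sql.encode("utf-8"),
        headers={"X-Trino-User": user},
        method="POST",
    )


def run_query(server: str, sql: str, user: str = "trino") -> list:
    """Submit a query and collect all result rows by following nextUri links."""
    rows = []
    request = build_statement_request(server, sql, user)
    while request is not None:
        with urllib.request.urlopen(request) as response:
            payload = json.load(response)
        if "error" in payload:
            raise RuntimeError(payload["error"].get("message", "query failed"))
        rows.extend(payload.get("data", []))
        next_uri = payload.get("nextUri")
        # Subsequent pages are fetched with GET until nextUri disappears.
        request = (
            urllib.request.Request(next_uri, headers={"X-Trino-User": user})
            if next_uri
            else None
        )
    return rows
```

With the server running, run_query("http://localhost:8080", "SELECT 1") returns the result rows as a list of lists. For regular use, a maintained client library or the JDBC driver is the better choice, since this sketch has no retry or authentication handling.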
    

Accessing via Third-Party Tools

Users can connect to Trino using third-party SQL clients like Aqua Data Studio, DBeaver, or Tableau.

Steps:

  1. Configure a new connection in the client tool:

    • Host: <trino-host>
    • Port: 8080
    • Authentication: None (or as configured for Trino).
  2. Test the connection and execute queries.


5. Troubleshooting

Common Issues

  1. Java or Trino Startup Errors:

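    • Ensure the Java version required by your Trino release (Java 17 for Trino 411) is installed and JAVA_HOME is configured.
    • When started with the launcher's run command, Trino logs to the console; when started in the background, logs are written under the data directory (e.g., C:\trino\data\var\log).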
  2. Kafka Connector Errors:

    • Verify credentials and Kafka connector configuration in kafka.properties.
  3. Protobuf Schema Parsing Errors:

    • Confirm the .proto file matches the topic structure.
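
Before digging into connector settings, it helps to rule out plain connectivity problems. The following sketch checks whether a TCP port accepts connections; it says nothing about authentication or protocol, and is_port_open is a hypothetical helper.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Checking localhost on port 9092 (the Redpanda broker address used above) and on port 8080 (the Trino HTTP port) from the Trino host quickly separates network or firewall issues from configuration errors.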

This guide covers deploying Trino, integrating it with Redpanda, and providing SQL access to other users.
