This Week in Fluvio #20
Welcome to This Week in Fluvio, our weekly newsletter for development updates to Fluvio open source. Fluvio is a distributed, programmable streaming platform written in Rust.
This week we wanted to walkthrough the process of developing a connector.
We’ll approach connector development in this order:
- Build and run your client locally
- Package your Client into Docker Image
- Load your Connector image into Kubernetes Cluster
- Create a new Connector in Fluvio
You’ll need the following tools installed
- Python 3.x
- pip
- python-fluvio
- python-requests
- Docker
- Kubernetes distros (k3d or minikube)
Let’s start with building a simple client.
This client is written in Python, but we have client libraries for Rust, Javascript, Java and Go.
#!/usr/bin/env python3
#
# get-cat-facts.py
# An example Python-based Fluvio connector
from fluvio import Fluvio
import requests
import time
WAIT_SECONDS = 10
CAT_FACTS_API = 'https://catfact.ninja/fact'
CAT_FACTS_TOPIC = 'cat-facts-random'
if __name__ == '__main__':
# Before entering event loop
# Connect to cluster and create a producer before we enter loop
fluvio = Fluvio.connect()
producer = fluvio.topic_producer(CAT_FACTS_TOPIC)
# Event loop
while True:
# Get random cat fact
catfact = requests.get(CAT_FACTS_API)
# Save fact
producer.send_string(catfact.text)
# Print fact to container logs
print(catfact.text)
# Be polite and control the rate we send requests to external API
time.sleep(WAIT_SECONDS)
Before we run this code, we need to create the fluvio topic that our client produces data to
$ fluvio topic create cat-facts-random
Install fluvio package:
$ pip install fluvio requests
Running the Python code prints out a new cat fact every 10 seconds
$ python3 ./get-cat-facts.py
{"fact":"Cats bury their feces to cover their trails from predators.","length":59}
{"fact":"Cats step with both left legs, then both right legs when they walk or run.","length":74}
And we verify that these records have made it to the Fluvio cluster by consuming from the topic.
$ fluvio consume cat-facts-random -B
Consuming records from the beginning of topic 'cat-facts-random'
{"fact":"Cats bury their feces to cover their trails from predators.","length":59}
{"fact":"Cats step with both left legs, then both right legs when they walk or run.","length":74}
Now that we have verfied that our client works locally, we need to package it for Kubernetes. We do that by defining our data connector runtime environment with a Dockerfile.
We use the python
base image to keep things simple.
Then we create a new user fluvio
with a home directory (in /home/fluvio
).
This is required for all connectors. The Fluvio cluster shares information with the fluvio
user on startup.
# Dockerfile
FROM python
# Copy our python script into the connector image
COPY get-cat-facts.py /usr/local/sbin/get-cat-facts.py
RUN chmod +x /usr/local/sbin/get-cat-facts.py
# This is required to connect to a cluster
# Connectors run as the `fluvio` user
ENV USER=fluvio
RUN useradd --create-home "$USER"
USER $USER
# Install dependencies
RUN pip install fluvio requests
# Start script on start
ENTRYPOINT get-cat-facts.py
You can build the Docker image with this command.
$ docker build -t infinyon/fluvio-connect-cat-facts .
The image should have been created
$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
infinyon/fluvio-connect-cat-facts latest 08ced64017f0 5 seconds ago 936MB
...
Start a the container with this docker
command
$ docker run -it --rm -v $HOME/.fluvio:/home/fluvio/.fluvio --network host infinyon/fluvio-connect-cat-facts
{"fact":"In the 1750s, Europeans introduced cats into the Americas to control pests.","length":75}
...
<CTRL>-C
You can check out docker run --help
if you want a description of what these do, so I’ll describe why you want to use them instead.
Docker option | Why you want it |
---|---|
-it |
Without both -i and -t , you can’t use ctrl+c to exit your container |
--rm |
Prevents accumulating Docker-related mess on your host while you are testing your connector |
-v $HOME/.fluvio:/home/fluvio/.fluvio |
Share existing Fluvio config - This is effectively what fluvio connector create does |
--network host |
The default docker network can’t reach the host’s Fluvio cluster |
By default, connectors will attempt to pull images from the internet. However, development images need to be testable locally before making them available publicly. So we pre-load the connector images.
Let’s create a cluster called fluvio
$ k3d cluster create fluvio
And import the image (use the name of your cluster)
$ k3d image import infinyon/fluvio-connect-cat-facts --cluster fluvio
The image should have been created
$ docker exec k3d-fluvio-server-0 sh -c "ctr image list -q"
docker.io/infinyon/fluvio-connect-cat-facts:latest
...
Load image to minikube
:
$ minikube image load infinyon/fluvio-connect-cat-facts
Last step for testing our connector is verifying that it runs in the Fluvio cluster. We will create the config file and run the CLI command
Create a connector configuration file example-connector.yaml
:
# example-connector.yaml
version: dev
name: cat-facts-connector
type: cat-facts
topic: cat-facts-random
direction: source
Connector config option | Description |
---|---|
version |
This value must be dev for local development. |
name |
A unique name for this connector. It will be displayed in fluvio connector list |
type |
The value of this name will be used for tagging image before loading into Kubernetes. Connector image names follow the pattern: infinyon/fluvio-connect-{type} |
topic |
The name of the Fluvio topic where the connector will publish the data records. The Fluvio topic will be automatically created if the Fluvio topic does not exist. |
direction |
The metadata that defines the direction of data flow (source or sink ).This is a source connector. |
Lastly, create the connector
$ fluvio connector create --config example-connector.yaml
$ fluvio connector list
NAME STATUS
cat-facts-connector Running
We can look at the container logs, and verify the topic has our records.
$ fluvio connector logs cat-facts-connector
{"fact":"Cats eat grass to aid their digestion and to help them get rid of any fur in their stomachs.","length":92}
{"fact":"When a cat drinks, its tongue - which has tiny barbs on it - scoops the liquid up backwards.","length":92}
{"fact":"Cats and kittens should be acquired in pairs whenever possible as cat families interact best in pairs.","length":102}
And again, to verify we check the contents of the topic. We see the last 3 rows match.
$ fluvio consume cat-facts-random -B
Consuming records from the beginning of topic 'cat-facts-random'
{"fact":"Cats bury their feces to cover their trails from predators.","length":59}
{"fact":"Cats step with both left legs, then both right legs when they walk or run.","length":74}
{"fact":"Cats can jump up to 7 times their tail length.","length":46}
{"fact":"A cat can jump 5 times as high as it is tall.","length":45}
{"fact":"Cats eat grass to aid their digestion and to help them get rid of any fur in their stomachs.","length":92}
{"fact":"When a cat drinks, its tongue - which has tiny barbs on it - scoops the liquid up backwards.","length":92}
{"fact":"Cats and kittens should be acquired in pairs whenever possible as cat families interact best in pairs.","length":102}
Get in touch with us on Github Discussions or join our Discord channel and come say hello!
For the full list of changes this week, be sure to check out our CHANGELOG.
Until next week!