These instructions will show you how to use Flink’s SQL Client with the taxi data used in these exercises.


1. Edit sql-client-config.yaml

The root directory of the flink-training-exercises repository contains a file named sql-client-config.yaml. This file contains hard-coded paths to the taxi ride and taxi fare datasets. Edit the file so that these two entries point to your own copies of the datasets:

  path: "/Users/david/stuff/flink-training/trainingData/nycTaxiRides.gz"
  path: "/Users/david/stuff/flink-training/trainingData/nycTaxiFares.gz"
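
If you would rather not edit the file by hand, the two entries can be rewritten with sed. The following is only a sketch: the function name set_dataset_dir is hypothetical, and it assumes the entries look exactly like the two path lines shown above and that the target directory contains no "#" or "&" characters.

```shell
# Hypothetical helper: point every nycTaxiRides.gz / nycTaxiFares.gz path
# in a config file at the directory given as the second argument.
set_dataset_dir() {
  config_file="$1"
  data_dir="$2"
  # -i.bak edits the file in place and keeps a backup copy next to it.
  sed -i.bak -E \
    "s#path: \".*/(nycTaxi(Rides|Fares)\.gz)\"#path: \"${data_dir}/\1\"#" \
    "$config_file"
}
```

For example, running set_dataset_dir sql-client-config.yaml "$HOME/trainingData" from the repository root would rewrite both entries and leave the original file in sql-client-config.yaml.bak.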

2. Start a local Flink cluster

Download the Flink binaries and start a local Flink cluster, following the Flink setup instructions. Leave the cluster running.

3. Start the SQL client

$ cd /to/your/clone/of/flink-training-exercises
$ /wherever/you/put/flink/bin/sql-client.sh embedded --jar target/flink-training-exercises-2.5.3.jar -e sql-client-config.yaml

Windows users, please note that you will need some way to run bash scripts, such as the Windows Subsystem for Linux.
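
Before launching the client, it can help to confirm that the jar and the configuration file named in the command are where the command expects them. This is a minimal sketch: the helper name require_files is hypothetical, and the two file names are simply the ones used in the command above.

```shell
# Hypothetical pre-flight check: report which of the given files are missing.
# Returns 0 if every file exists, 1 otherwise.
require_files() {
  missing=0
  for f in "$@"; do
    if [ ! -f "$f" ]; then
      echo "missing: $f"
      missing=1
    fi
  done
  return "$missing"
}

# Run from the root of your flink-training-exercises clone.
require_files target/flink-training-exercises-2.5.3.jar sql-client-config.yaml \
  || echo "Check the jar and config paths before starting the SQL client."
```

If either file is reported as missing, check that you are in the root of your clone and that the jar version in the command matches the one in your target directory.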

4. Verify that it works

You can list all available tables using the SHOW TABLES command. It lists table sources and sinks as well as views.

Flink SQL> SHOW TABLES;

You can get information about the schema of TaxiRides using the DESCRIBE statement.

Flink SQL> DESCRIBE TaxiRides;
 |-- rideId: Long
 |-- taxiId: Long
 |-- driverId: Long
 |-- isStart: Boolean
 |-- startLon: Float
 |-- startLat: Float
 |-- endLon: Float
 |-- endLat: Float
 |-- passengerCnt: Short
 |-- eventTime: TimeIndicatorTypeInfo(rowtime)

In order to explore the data of the TaxiRides table, execute a simple query:

SELECT * FROM TaxiRides;

The CLI client will enter the result visualization mode and display the results:

rideId                    taxiId                  driverId
    58                2013000058                2013000058
    38                2013000038                2013000038
    48                2013000048                2013000048
    24                2013000024                2013000024
    12                2013000012                2013000012