• Reading & Writing Hive Tables
    • Reading From Hive
    • Writing To Hive
      • Limitations

    Reading & Writing Hive Tables

    Using the HiveCatalog and Flink's connector to Hive, Flink can read and write Hive data as an alternative to Hive's batch engine. Be sure to follow the instructions to include the correct dependencies in your application.
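    Before the SQL client can query Hive, the catalog has to be registered. A minimal sketch of a catalog entry in the SQL client's YAML configuration file; the catalog name and the configuration path shown here (myhive, /opt/hive-conf) are placeholders for your environment:

```yaml
catalogs:
  - name: myhive
    type: hive
    # Directory containing hive-site.xml -- adjust for your installation.
    hive-conf-dir: /opt/hive-conf
```

    With this entry in place, the catalog appears under the name myhive when the SQL client starts, as shown in the session below.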


    Reading From Hive

    Assume Hive contains a single table in its default database, named mytable, that contains several rows.

    hive> show databases;
    OK
    default
    Time taken: 0.841 seconds, Fetched: 1 row(s)
    hive> show tables;
    OK
    Time taken: 0.087 seconds
    hive> CREATE TABLE mytable(name string, value double);
    OK
    Time taken: 0.127 seconds
    hive> SELECT * FROM mytable;
    OK
    Tom 4.72
    John 8.0
    Tom 24.2
    Bob 3.14
    Bob 4.72
    Tom 34.9
    Mary 4.79
    Tiff 2.72
    Bill 4.33
    Mary 77.7
    Time taken: 0.097 seconds, Fetched: 10 row(s)

    With the data ready, you can connect to your existing Hive installation and begin querying.

    Flink SQL> show catalogs;
    myhive
    default_catalog
    # ------ Set the current catalog to be 'myhive' catalog if you haven't set it in the yaml file ------
    Flink SQL> use catalog myhive;
    # ------ See all registered databases in catalog 'myhive' ------
    Flink SQL> show databases;
    default
    # ------ See the previously registered table 'mytable' ------
    Flink SQL> show tables;
    mytable
    # ------ The table schema that Flink sees is the same that we created in Hive, two columns - name as string and value as double ------
    Flink SQL> describe mytable;
    root
    |-- name: name
    |-- type: STRING
    |-- name: value
    |-- type: DOUBLE
    Flink SQL> SELECT * FROM mytable;
    name value
    __________ __________
    Tom 4.72
    John 8.0
    Tom 24.2
    Bob 3.14
    Bob 4.72
    Tom 34.9
    Mary 4.79
    Tiff 2.72
    Bill 4.33
    Mary 77.7

    Writing To Hive

    Similarly, data can be written into Hive using an INSERT INTO statement.

    Flink SQL> INSERT INTO mytable (name, value) VALUES ('Tom', 4.72);
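    Besides literal VALUES, the result of a query can also be written into the table. A sketch, reusing the mytable schema from above; the filter predicate is an illustrative assumption, not part of the original example:

```sql
-- Copy every row whose value exceeds 10.0 back into the same table.
-- Table and column names follow the example above; the predicate is illustrative.
INSERT INTO mytable
SELECT name, value
FROM mytable
WHERE value > 10.0;
```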

    Limitations

    The following are the major limitations of the Hive connector; we're actively working to close these gaps.

    • INSERT OVERWRITE is not supported.
    • Inserting into partitioned tables is not supported.
    • ACID tables are not supported.
    • Bucketed tables are not supported.
    • Some data types are not supported. See the limitations for details.
    • Only a limited number of table storage formats have been tested, namely text, SequenceFile, ORC, and Parquet.
    • Views are not supported.