跳转至

PostgreSQL

一、概述

AILab-PDBC (Python DataBase Connectivity)是数智教育发展(山东)有限公司 AI Lab 100 团队开发的高效、灵活的数据接口(API)

ailab100.pdbc.sql类用于用于关系型数据库管理系统(RDBMS)的读写,支持MySQL 、微软SQL SERVERSQLiteORACLEMariaDBPostgreSQL等主流结构化数据库。

本章介绍如何使用ailab100.pdbc.sql.ORACLE 连接 PostgreSQL 数据库进行读写和下载,以及数据库类型和Pandas类型之间的转换。

PostgreSQL是以加州大学伯克利分校计算机系开发的对象关系型数据库管理系统(RDBMS)。

二、API 接口说明

(一) DBConnector 基类

(二) PostgreSQL 类

Protocols

  • binary: Postgres Binary COPY protocol, recommend to use in general since fast data parsing speed.
  • csv: Postgres CSV COPY protocol, recommend to use when network is slow (csv usually results in smaller size than binary).
  • cursor: Conventional wire protocol (slowest one), recommend to use only when binary and csv is not supported by the source (e.g. Redshift).

Postgres Connection

Adding `sslmode=require` to connection uri parameter force SSL connection. Example: `postgresql://username:password@host:port/db?sslmode=require`. `sslmode=disable` to disable SSL connection.

To connect to redshift, replace `postgresql://` with `redshift://`.
import connectorx as cx

conn = 'postgres://username:password@server:port/database'  # connection token
query = "SELECT * FROM table"  # query string
cx.read_sql(conn,query)  # read data from Postgres

Postgres-Pandas Type Mapping

Postgres Type Pandas Type Comment
BOOL bool, boolean(nullable)
INT2 int64, Int64(nullable)
INT4 int64, Int64(nullable)
INT8 int64, Int64(nullable)
FLOAT4 float64
FLOAT8 float64
NUMERIC float64 cannot support precision larger than 28
TEXT object
BPCHAR object
VARCHAR object
CHAR object
BYTEA object
DATE datetime64[ns]
TIME object
TIMESTAMP datetime64[ns]
TIMESTAMPZ datetime64[ns]
UUID object
JSON object
JSONB object
ENUM object need to convert enum column to text manually (::text) when using csv and cursor protocol
ltree object binary protocol supported only after Postgres version 13
lquery object binary protocol supported only after Postgres version 13
ltxtquery object binary protocol supported only after Postgres version 13
INT2[] object list of i64
INT4[] object list of i64
INT8[] object list of i64
FLOAT4[] object list of f64
FLOAT8[] object list of f64
NUMERIC[] object list of f64

Performance (db.m6g.4xlarge RDS)

  • Time chart, lower is better.

time chart

  • Memory consumption chart, lower is better.

memory chart

In conclusion, ConnectorX uses 3x less memory and 13x less time compared with Pandas.