Python function to query SQLite files stored on S3

Overview

sqlite-s3-query CircleCI Test Coverage

Python function to query a SQLite file stored on S3. It uses multiple HTTP range requests per query to avoid downloading the entire file, and so is suitable for large databases.

All the HTTP requests for a query request the same version of the database object in S3, so queries should complete succesfully even if the database is replaced concurrently by another S3 client. Versioning must be enabled on the S3 bucket.

Operations that write to the database are not supported.

Inspired by phiresky's sql.js-httpvfs, and dacort's Stack Overflow answer.

Installation

sqlite-s3-query depends on APSW, which is not available on PyPI, but can be installed directly from GitHub.

pip install sqlite_s3_query
pip install https://github.com/rogerbinns/apsw/releases/download/3.36.0-r1/apsw-3.36.0-r1.zip --global-option=fetch --global-option=--version --global-option=3.36.0 --global-option=--all --global-option=build --global-option=--enable-all-extensions

Usage

from sqlite_s3_query import sqlite_s3_query

results_iter = sqlite_s3_query(
    'SELECT * FROM my_table WHERE my_column = ?', params=('my-value',),
    url='https://my-bucket.s3.eu-west-2.amazonaws.com/my-db.sqlite',
)

for row in results_iter:
    print(row)

If in your project you use multiple queries to the same file, functools.partial can be used to make an interface with less duplication.

from functools import partial
from sqlite_s3_query import sqlite_s3_query

query_my_db = partial(sqlite_s3_query,
    url='https://my-bucket.s3.eu-west-2.amazonaws.com/my-db.sqlite',
)

for row in query_my_db('SELECT * FROM my_table WHERE my_col = ?', params=('my-value',)):
    print(row)

for row in query_my_db('SELECT * FROM my_table_2 WHERE my_col = ?', params=('my-value',)):
    print(row)

The AWS region and the credentials are taken from environment variables, but this can be changed using the get_credentials parameter. Below shows the default implementation of this that can be overriden.

import os
from functools import partial
from sqlite_s3_query import sqlite_s3_query

query_my_db = partial(sqlite_s3_query
    url='https://my-bucket.s3.eu-west-2.amazonaws.com/my-db.sqlite',
    get_credentials=lambda: (
        os.environ['AWS_DEFAULT_REGION'],
        os.environ['AWS_ACCESS_KEY_ID'],
        os.environ['AWS_SECRET_ACCESS_KEY'],
        os.environ.get('AWS_SESSION_TOKEN'),  # Only needed for temporary credentials
    ),
)

for row in query_my_db('SELECT * FROM my_table_2 WHERE my_col = ?', params=('my-value',)):
    print(row)
Owner
Michal Charemza
Michal Charemza
Turn SELECT queries returned by a query into links to execute them

datasette-query-links Turn SELECT queries returned by a query into links to execute them Installation Install this plugin in the same environment as D

Simon Willison 5 Apr 27, 2022
Enfilade: Tool to Detect Infections in MongoDB Instances

Enfilade: Tool to Detect Infections in MongoDB Instances

Aditya K Sood 7 Feb 21, 2022
Simple json type database for python3

What it is? Simple json type database for python3! What about speed? The speed is great! All data is stored in RAM until saved. How to install? pip in

3 Feb 11, 2022
Given a metadata file with relevant schema, an SQL Engine can be run for a subset of SQL queries.

Mini-SQL-Engine Given a metadata file with relevant schema, an SQL Engine can be run for a subset of SQL queries. The query engine supports Project, A

Prashant Raj 1 Dec 03, 2021
Python function to extract all the rows from a SQLite database file while iterating over its bytes, such as while downloading it

Python function to extract all the rows from a SQLite database file while iterating over its bytes, such as while downloading it

Department for International Trade 16 Nov 09, 2022
A Python wrapper API for operating and working with the Neo4j Graph Data Science (GDS) library

gdsclient This repo hosts the sources for gdsclient, a Python wrapper API for operating and working with the Neo4j Graph Data Science (GDS) library. g

Neo Technology 101 Jan 05, 2023
Postgres full text search options (tsearch, trigram) examples

postgres-full-text-search Postgres full text search options (tsearch, trigram) examples. Create DB CREATE DATABASE ftdb; To feed db with an example

Jarosław Orzeł 97 Dec 30, 2022
Migrate data from SQL to NoSQL easily

Migrate data from SQL to NoSQL easily Installation 💯 pip install sql2nosql --upgrade Dependencies 📢 For the package to work, it first needs "clients

Facundo Padilla 43 Mar 26, 2022
Python object-oriented database

ZODB, a Python object-oriented database ZODB provides an object-oriented database for Python that provides a high-degree of transparency. ZODB runs on

Zope 574 Dec 31, 2022
Simpledb-py: Simple JSON database

Simpledb-py: Simple JSON database

тейлс 2 Feb 09, 2022
A NoSQL database made in python.

CookieDB A NoSQL database made in python.

cookie 1 Nov 30, 2022
Code for a db backend that relies on bash tools (grep, cat, echo, etc)

Simple-nosql-db is a python backend for a database that relies on unix tools such as cat, echo and grep. Funny enough I got the idea from this discuss

Sebastian Alonso 10 Aug 13, 2019
A simple GUI that interacts with a database to keep track of a collection of US coins.

CoinCollectorGUI A simple gui designed to interact with a database. The goal of the database is to make keeping track of collected coins simple. The G

Builder212 1 Nov 09, 2021
PathfinderMonsterDatabase - A database of all monsters in Pathfinder 1e, created by parsing aonprd.com

PathfinderMonsterDatabase A database of all monsters in Pathfinder 1e, created by parsing aonprd.com Setup Run the following line to install all requi

Yoni Lerner 11 Jun 12, 2022
A Simple , ☁️ Lightweight , 💪 Efficent JSON based database for 🐍 Python.

A Simple, Lightweight, Efficent JSON based DataBase for Python The current stable version is v1.6.1 pip install pysondb==1.6.1 Support the project her

PysonDB 282 Jan 07, 2023
Monty, Mongo tinified. MongoDB implemented in Python !

Monty, Mongo tinified. MongoDB implemented in Python ! Was inspired by TinyDB and it's extension TinyMongo

David Lai 523 Jan 02, 2023
Лабораторные работы по Postgresql за 5 семестр

Практикум по Postgresql ERD для заданий 2.x: ERD для заданий 3.x: Их делал вот тут Ниже есть 2 инструкции — по установке postgresql на manjaro и по пе

Danila 10 Oct 31, 2022
TelegramDB - A library which uses your telegram account as a database for your projects

TelegramDB A library which uses your telegram account as a database for your projects. Basic Usage from pyrogram import Client from telegram import Te

Kaizoku 79 Nov 22, 2022
ClutterDB - Extremely simple JSON database made for infrequent changes which behaves like a dict

extremely simple JSON database made for infrequent changes which behaves like a dict this was made for ClutterBot

Clutter Development 1 Jan 12, 2022
TinyDB is a lightweight document oriented database optimized for your happiness :)

Quick Links Example Code Supported Python Versions Documentation Changelog Extensions Contributing Introduction TinyDB is a lightweight document orien

Markus Siemens 5.6k Dec 30, 2022