Python reader for Linked Data in HDF5 files

Last update: May 17, 2022

Overview

`h5ld`: HDF5 Linked Data

Linked Data are becoming more popular for user-created metadata in HDF5 files. This Python package provides readers for HDF5-based formats that carry such metadata. The entire linked data content is read in one operation and made available as an rdflib graph object.

Currently supported:

Allotrope Data Format (ADF)

Installation

pip install git+https://github.com/HDFGroup/h5ld@{LABEL}

where {LABEL} is either master or a tag label.

Requirements:

Python >= 3.7
h5py >= 3.3.0
rdflib >= 5.0.0

License

This software is open source. See this file for details.

Quick Start

This package can be used either as a command-line tool or programmatically. On the command line, the package dumps the linked data of an input HDF5 file into any of several popular RDF formats supported by the rdflib package. For example:

python -m h5ld -f json-ld -o output.json INPUT.h5

will dump the input file's RDF data to a file output.json in the JSON-LD format. Omitting the output file prints the same content to standard output so it can be ingested by another command-line tool. A full description of the options is available from:

python -m h5ld --help
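
When the command-line tool is driven from another Python program, the invocation shown above can be assembled as an argument list. The following is a minimal sketch, not part of the package: the helper name `h5ld_command` is hypothetical, and it relies only on the `-f` and `-o` options that appear in the example command.

```python
import sys


def h5ld_command(input_file, rdf_format="json-ld", output_file=None):
    """Build the argument list for `python -m h5ld` (hypothetical helper)."""
    cmd = [sys.executable, "-m", "h5ld", "-f", rdf_format]
    if output_file is not None:
        # With -o the RDF data goes to a file instead of standard output.
        cmd += ["-o", output_file]
    cmd.append(input_file)
    return cmd


cmd = h5ld_command("INPUT.h5", output_file="output.json")
print(" ".join(cmd[1:]))
```

Assuming h5ld is installed for the same interpreter, the list could be passed to `subprocess.run(cmd, capture_output=True, text=True, check=True)`; when no output file is given, the dumped RDF would then be available as the `stdout` attribute of the result.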

There is also a programmatic interface for integration into Python applications. Each h5ld reader provides the following methods and attributes:

File format name.

print(f"Input file format is: {reader.name}")

Short name (usually an acronym) of the file format.

print(f"File format acronym: {reader.short_name}")

Check if the reader is the right choice for the input file.

with h5py.File("input.h5", mode="r") as f:
    if reader.verify_format(f):
        # Do something...
        pass
    else:
        print("Sorry but not the right h5ld reader.")

Check if there is linked data content in the input HDF5 file. Optionally, print an appropriate description of the data.
```
with h5py.File("input.h5", mode="r") as f:
    reader.check_ld(f, report=True)
```

Read linked data and export it to a destination in the requested RDF format.

with h5py.File("input.h5", mode="r") as f:
    reader(f).dump_ld("output.json", format="json-ld")

Read linked data and return either an rdflib.Graph or rdflib.ConjunctiveGraph object.

with h5py.File("input.h5", mode="r") as f:
    graph = reader(f).get_ld()

A Python dictionary with the reader's namespace prefixes and their IRIs.

with h5py.File("input.h5", mode="r") as f:
    rdr = reader(f)
    namespaces = rdr.namespaces
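
A prefix-to-IRI dictionary of this kind is handy for switching between compact names and full IRIs. The following sketch assumes the dictionary values behave like plain strings; the sample prefixes and the helper names `expand` and `compact` are hypothetical and not part of h5ld.

```python
# Hypothetical stand-in for rdr.namespaces: prefix -> IRI mapping.
namespaces = {
    "ex": "http://example.org/vocab#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}


def expand(curie, namespaces):
    """Expand 'prefix:local' to a full IRI; unknown prefixes are left as-is."""
    prefix, sep, local = curie.partition(":")
    if sep and prefix in namespaces:
        return str(namespaces[prefix]) + local
    return curie


def compact(iri, namespaces):
    """Shorten a full IRI to 'prefix:local' using the longest matching namespace."""
    matches = [(p, str(ns)) for p, ns in namespaces.items() if iri.startswith(str(ns))]
    if not matches:
        return iri
    prefix, ns = max(matches, key=lambda m: len(m[1]))
    return f"{prefix}:{iri[len(ns):]}"


print(expand("rdfs:label", namespaces))
print(compact("http://example.org/vocab#Sample", namespaces))
```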

Owner

The HDF Group
