A common, beautiful interface to tabular data, no matter the format

Overview

rows

Join the chat at https://gitter.im/turicas/rows. License: LGPLv3.

No matter which format your tabular data is in, rows will import it, automatically detect types and give you high-level Python objects, so you can start working with the data instead of trying to parse it. It is also locale- and unicode-aware. :)

Want to learn more? Read the documentation (or build and browse the docs locally by running make docs-serve after installing requirements-development.txt).

Installation

The easiest way to get your hands dirty is to install rows using pip.

PyPI

pip install rows

For other ways to install, refer to the Installation section of the documentation.

Contribution start guide

The preferred way to start contributing to the project is to create a virtualenv (you can do so using virtualenv, virtualenvwrapper, pyenv or whatever tool you'd like).

Create the virtualenv:

mkvirtualenv rows

Install all plugins' dependencies:

pip install --editable .[all]

Install development dependencies:

pip install -r requirements-development.txt
Comments
  • OverflowError

    After installing the dependencies required for the socios-brasil package, I get the error below when trying to unpack as instructed:

    Traceback (most recent call last):
      File "extract_dump.py", line 27, in <module>
        import rows
      File "C:\Users\milcent\AppData\Local\Continuum\Anaconda3\lib\site-packages\rows\__init__.py", line 22, in <module>
        import rows.plugins as plugins
      File "C:\Users\milcent\AppData\Local\Continuum\Anaconda3\lib\site-packages\rows\plugins\__init__.py", line 20, in <module>
        from . import plugin_csv as csv # NOQA
      File "C:\Users\milcent\AppData\Local\Continuum\Anaconda3\lib\site-packages\rows\plugins\plugin_csv.py", line 34, in <module>
        unicodecsv.field_size_limit(sys.maxsize)
    OverflowError: Python int too large to convert to C long

    Running on Windows 7, Anaconda 64-bit, Python 3.6. Thanks, Marcel Milcent

    opened by milcent 13
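
The traceback in the report above ends at a field_size_limit(sys.maxsize) call. On platforms where a C long is 32 bits wide (64-bit Windows is one), sys.maxsize does not fit, which matches the reported OverflowError. A commonly used workaround is to shrink the limit until the call succeeds. The sketch below uses the standard csv module rather than unicodecsv, the helper name set_max_field_size_limit is hypothetical, and this is not necessarily how rows itself resolved the issue.

```python
import csv
import sys


def set_max_field_size_limit():
    """Set the largest CSV field size limit the platform accepts and return it."""
    limit = sys.maxsize
    while True:
        try:
            csv.field_size_limit(limit)
            return limit
        except OverflowError:
            # The value does not fit in a C long on this platform; try a smaller one.
            limit //= 2


limit = set_max_field_size_limit()
```

On platforms where sys.maxsize already fits, the loop succeeds on the first attempt and nothing changes.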
  • PDF Plugin

    Create an algorithm to automatically extract tables from PDFs (available in text format). We could use pdftables, but its code is not up to date, does not work with Python 3, etc.

    enhancement plugin 
    opened by turicas 7
  • Convert PDF to TXT

    Good morning, I am trying to convert a scanned PDF file to text (the PDF contains tables). I managed to install the rows library and the dependencies rows[pdf] and rows[cli]. When I try to run the following in the command prompt: rows pdf-to-text teste.pdf result.txt, I get an error (shown in an attached image).

    Any idea what the problem might be?

    opened by Danielydsm 6
  • Autodetect delimiter in CSV files

    Currently the import_from_csv method has a 'delimiter' parameter that defaults to ',', but sometimes we don't know what the delimiter is and need it to be autodetected. This is especially useful for CSV files generated by MS Excel that use ';' as the delimiter.

    A quick and dirty way to make this work is to count how many times ',', ';' and tab are used in the file and assume the most frequent one is the delimiter.

    enhancement help wanted plugin 
    opened by jeanferri 6
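
The counting approach proposed in the request above can be sketched in a few lines. This is only an illustration of the "quick and dirty" idea, not the implementation rows adopted. The function name guess_delimiter and the sample data are made up for the example.

```python
import csv


def guess_delimiter(sample, candidates=(",", ";", "\t")):
    """Return the candidate delimiter that occurs most often in the sample text."""
    counts = {candidate: sample.count(candidate) for candidate in candidates}
    # On a tie, max() keeps the first candidate in the tuple (',' by default).
    return max(candidates, key=lambda candidate: counts[candidate])


sample = "name;age;city\nAna;31;Rio de Janeiro\nBruno;27;Recife\n"
delimiter = guess_delimiter(sample)

# Parse the sample with the guessed delimiter.
parsed = list(csv.reader(sample.splitlines(), delimiter=delimiter))
```

Plain counting ignores quoting, so a file with many commas inside quoted text fields can fool it. The standard library's csv.Sniffer class is an alternative worth comparing against.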
  • OverflowError: Python int too large to convert to C long

    Good morning!

    I am learning Python, so this may be a very simple error to fix, but I still have no idea what can be done:

    When I try to import rows, the message in the title appears.

    duplicate 
    opened by tbmpereira 5
  • Text plugin is not working on `rows convert`

    The file cha-de-bebe.txt is not being read correctly on the command line (try rows print cha-de-bebe.txt or rows convert cha-de-bebe.txt cha-de-bebe.csv) -- but it was generated correctly using rows print http://some-url/ > cha-de-bebe.txt.

    @jsbueno could you please help check it? I think this bug started after your PR #270.

    bug 
    opened by turicas 5
  • locale.Error: unsupported locale setting

    ======================================================================
    ERROR: test_DecimalField (tests.tests_fields.FieldsTestCase)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/home/brain/git/fedora/python-rows/rows-0.3.0/tests/tests_fields.py", line 203, in test_DecimalField
        with rows.locale_context(locale_name):
      File "/usr/lib64/python3.5/contextlib.py", line 59, in __enter__
        return next(self.gen)
      File "/home/brain/git/fedora/python-rows/rows-0.3.0/rows/localization.py", line 23, in locale_context
        locale.setlocale(category, name)
      File "/usr/lib64/python3.5/locale.py", line 594, in setlocale
        return _setlocale(category, locale)
    locale.Error: unsupported locale setting
    
    ======================================================================
    ERROR: test_FloatField (tests.tests_fields.FieldsTestCase)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/home/brain/git/fedora/python-rows/rows-0.3.0/tests/tests_fields.py", line 171, in test_FloatField
        with rows.locale_context(locale_name):
      File "/usr/lib64/python3.5/contextlib.py", line 59, in __enter__
        return next(self.gen)
      File "/home/brain/git/fedora/python-rows/rows-0.3.0/rows/localization.py", line 23, in locale_context
        locale.setlocale(category, name)
      File "/usr/lib64/python3.5/locale.py", line 594, in setlocale
        return _setlocale(category, locale)
    locale.Error: unsupported locale setting
    
    ======================================================================
    ERROR: test_IntegerField (tests.tests_fields.FieldsTestCase)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/home/brain/git/fedora/python-rows/rows-0.3.0/tests/tests_fields.py", line 144, in test_IntegerField
        with rows.locale_context(locale_name):
      File "/usr/lib64/python3.5/contextlib.py", line 59, in __enter__
        return next(self.gen)
      File "/home/brain/git/fedora/python-rows/rows-0.3.0/rows/localization.py", line 23, in locale_context
        locale.setlocale(category, name)
      File "/usr/lib64/python3.5/locale.py", line 594, in setlocale
        return _setlocale(category, locale)
    locale.Error: unsupported locale setting
    
    ======================================================================
    ERROR: test_PercentField (tests.tests_fields.FieldsTestCase)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/home/brain/git/fedora/python-rows/rows-0.3.0/tests/tests_fields.py", line 250, in test_PercentField
        with rows.locale_context(locale_name):
      File "/usr/lib64/python3.5/contextlib.py", line 59, in __enter__
        return next(self.gen)
      File "/home/brain/git/fedora/python-rows/rows-0.3.0/rows/localization.py", line 23, in locale_context
        locale.setlocale(category, name)
      File "/usr/lib64/python3.5/locale.py", line 594, in setlocale
        return _setlocale(category, locale)
    locale.Error: unsupported locale setting
    
    ======================================================================
    ERROR: test_locale_context (tests.tests_localization.LocalizationTestCase)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/home/brain/git/fedora/python-rows/rows-0.3.0/tests/tests_localization.py", line 41, in test_locale_context
        with locale_context(name):
      File "/usr/lib64/python3.5/contextlib.py", line 59, in __enter__
        return next(self.gen)
      File "/home/brain/git/fedora/python-rows/rows-0.3.0/rows/localization.py", line 23, in locale_context
        locale.setlocale(category, name)
      File "/usr/lib64/python3.5/locale.py", line 594, in setlocale
        return _setlocale(category, locale)
    locale.Error: unsupported locale setting
    
    opened by ignatenkobrain 5
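
Every failure in the report above comes from locale.setlocale being asked for a locale the build machine does not provide. One way a test suite can cope is to probe for the locale first and skip the test when it is missing. The helper below is a minimal sketch of that idea. Its name, locale_is_available, is hypothetical and it is not part of rows.

```python
import locale


def locale_is_available(name, category=locale.LC_ALL):
    """Return True if the system can switch to the named locale."""
    previous = locale.setlocale(category)  # query the current setting
    try:
        locale.setlocale(category, name)
    except locale.Error:
        return False
    else:
        return True
    finally:
        # Restore whatever was active before the probe.
        locale.setlocale(category, previous)
```

A test could then be decorated with unittest.skipUnless(locale_is_available(locale_name), "locale not installed"), where locale_name is whichever locale the test needs.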
  • Porting rows to Python3

    This is a work in progress.

    I managed to make all tests pass on Python 3, but 3 tests are broken on Python 2 because of something in the type identification system that I have not found yet.

    This PR is just to share it with you. Maybe your familiarity with the code can help fixing the tests.

    []'s!

    opened by henriquebastos 5
  • UserWarning: Call to deprecated function or class get_active_sheet

    Hi, when I build the package for Debian, the debhelper tools run pybuild, which shows these warnings [1]. I am using the latest source: git20151115.837b41.

    Is something wrong here, or does anyone else have the same problem? Thanks.

    [1] pybuild --test --test-nose -i python{version} -p 2.7 --dir .

    I: pybuild base:184: cd /pkgs/pkg-rows/rows-0.1.1+git20151115.837b41/.pybuild/pythonX.Y_2.7/build; python2.7 -m nose tests
    ...................................................................................................../usr/lib/python2.7/dist-packages/openpyxl/workbook/workbook.py:102: UserWarning: Call to deprecated function or class get_active_sheet (Use the .active property).
      def get_active_sheet(self):
    /usr/lib/python2.7/dist-packages/openpyxl/workbook/workbook.py:102: UserWarning: Call to deprecated function or class get_active_sheet (Use the .active property).
      def get_active_sheet(self):
    ./usr/lib/python2.7/dist-packages/openpyxl/workbook/workbook.py:102: UserWarning: Call to deprecated function or class get_active_sheet (Use the .active property).
      def get_active_sheet(self):
    ./usr/lib/python2.7/dist-packages/openpyxl/workbook/workbook.py:102: UserWarning: Call to deprecated function or class get_active_sheet (Use the .active property).
      def get_active_sheet(self):
    ..........................

    Ran 129 tests in 1.936s

    OK

    opened by kretcheu 5
  • Add sphinx documentation

    Hello dear reviewer,

    I basically did three things:

    • Add sphinx to requirements-development.txt
    • Create basic documentation, based on the README, with a few improvements I've made.
    • Move some basic project information (intro and architecture) to the __init__.py of the rows module

    I think the Sphinx docs can also be used as a website, and maybe they can be hosted on GitHub Pages.

    []'s I hope this will be useful! :)

    opened by raphapassini 5
  • Could not find import_from_pdf function

    I need to import data from pdf and found this example: https://gist.github.com/turicas/6b9ca83dcd531a6cd4fd87ced2a28c70

    But I was unable to run it, since the import_from_pdf is not available to me.

    I have already run the command: pip install rows[all]

    Is pdf format no longer supported?

    opened by marcellalves 4
  • New release on pypi

    I started using the "rows" lib today, and I lost several hours of work because of a bug with empty cells in ODS input. Here is my story.

    I was learning the "rows" lib with an ODS file when I came across a strange behavior. Of course, I thought it was because I wasn't using the lib properly, so I tried all possible options, searched the Internet, etc. After several hours, I eventually tried the same code with an equivalent XLSX file and found that the behavior was different! So I realized that I had found a bug on my first day of using the rows lib!

    I decided that I should report the bug, and I took the time to write a script to illustrate my bug report. I was using rows 0.4.1 from PyPI but, before creating the bug report on GitHub, I thought I should check whether the bug was still present in the "develop" branch... and my script shows that the bug is fixed in the "develop" branch!

    Release 0.4.1 is dated Feb 14, 2019: almost 4 years old! There have been 210 commits since 0.4.1; among these 210 commits, I counted about 45 fixes. While going through the commit messages that mention a fix, I found the commit that fixes my bug: issue #320, fixed on March 27, 2019 in this commit: https://github.com/turicas/rows/commit/c569f9415f2c76b2f6e9afbe1d748946e759711f

    So, in December 2022, some users are wasting hours because of a bug that was found and fixed 3.5 years ago :-( No comment!

    So, please, push a new release to PyPI!

    opened by alexis-via 2
  • Replace unicodecsv by standard csv module

    unicodecsv has not been maintained for a while now [1]. It was preferred over the standard csv module because of its unicode support. Now that the Python 3 csv module [2] supports unicode, let's use it.

    For more context, we hit issues while rebuilding unicodecsv during the Fedora Python 3.11 mass rebuild [3][4].

    [1] https://github.com/jdunck/python-unicodecsv
    [2] https://docs.python.org/3/library/csv.html
    [3] https://copr.fedorainfracloud.org/coprs/g/python/python3.11/package/python-unicodecsv/
    [4] https://bugzilla.redhat.com/show_bug.cgi?id=2021938

    opened by jcapiitao 1
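
To illustrate the point made in the request above, here is a small sketch showing the Python 3 standard csv module reading and writing non-ASCII text with no extra library. The sample data and variable names are invented for the example, and it says nothing about how the rows CSV plugin is actually structured.

```python
import csv
import io

data = "nome,cidade\nÁlvaro,São Paulo\nJoão,Florianópolis\n"

# In Python 3, csv works on text (str); the file object handles the encoding.
reader = csv.DictReader(io.StringIO(data))
records = list(reader)

# Write the same records back out, using "\n" as the line terminator.
output = io.StringIO()
writer = csv.DictWriter(output, fieldnames=["nome", "cidade"], lineterminator="\n")
writer.writeheader()
writer.writerows(records)
```

With real files, the equivalent is open(path, newline="", encoding="utf-8"), as recommended by the csv module documentation.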
  • NameError: name 'obj' is not defined

    This error occurred when I tried to use the closest_same_column method in rows.plugins.pdf (see the attached image).

    Apparently the code here is missing the part where we fetch the object that has the value passed as a parameter so we can work with it (and apparently the same happens with the other method, closest_same_line).

    opened by dehatanes 0
  • Python 3.10: cannot import name 'Iterator' from 'collections'

    File "/data/data/com.termux/files/usr/lib/python3.10/site-packages/rows/plugins/utils.py", line 20, in <module> 
    from collections import Iterator, OrderedDict            
    ImportError: cannot import name 'Iterator' from 'collections'
    

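    Maybe this will fix it:

    # Iterator lives in collections.abc on Python 3; fall back for Python 2.
    try:
        from collections.abc import Iterator
    except ImportError:
        from collections import Iterator
    # The original line also imported OrderedDict, which stays in collections.
    from collections import OrderedDict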
    
    opened by fagci 0
  • [pgimport] Option to not store values as NULL

    NULL values can be confusing when analyzing data, and there will be some cases where we prefer to store empty values as empty strings instead of NULL. The function pgimport (and its CLI equivalent) should have an option to deal with this scenario.

    enhancement cli plugin utils 
    opened by turicas 0
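
The behaviour requested above could look roughly like the sketch below, which swaps None for an empty string in a row before it is written. Everything here is hypothetical, including the function name prepare_row and the null_as_empty flag. It is not pgimport's actual interface, only an illustration of the option being asked for.

```python
def prepare_row(row, null_as_empty=False):
    """Return a copy of the row, optionally replacing None with an empty string."""
    if not null_as_empty:
        return dict(row)
    return {key: ("" if value is None else value) for key, value in row.items()}


row = {"name": "Ana", "nickname": None}
kept = prepare_row(row)
converted = prepare_row(row, null_as_empty=True)
```

A real option would probably need to apply only to text columns, since an empty string is not a valid value for numeric or date columns.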
Releases (v0.4.1)
Owner
Álvaro Justen
Free/libre software hacker, hypnotist, remote worker, teacher, coffee lover/roaster