I was looking for a way to traverse CSV files as relational database tables.
I did some research, but nothing I found fit my requirements in their entirety. I found several partially suitable options, namely:
- termsql - which takes stdin or a file and allows some SQL on it - but sets up only one "table"
- csv2sqlite - which is very promising, given it allows for potentially more SQL goodness than termsql - but still only one "table"
- this ULSE question - which describes how one would implement set operations with standard Unix text-processing commands - promising and a possible starting point
It is possible, and quite straightforward, to traverse a single CSV/text file and perform some database-like operations on it (column sums, averages, min, max, subsets, etc.), but not to work across two files with some connection between them. It is also possible to import the files into a temporary database for querying, and I have done this, although it is not as practical as I would like.
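To illustrate what I mean by the temp-DB approach, here is a minimal sketch using Python's sqlite3 with an in-memory database. The file names (tbl1.csv, tbl2.csv), the column names, and the sample rows are placeholders I made up for the example; in practice these would be your real CSV extracts, and all columns come in as TEXT.

```python
import csv
import sqlite3

# Stand-in data files; in practice these would be real CSV extracts.
with open("tbl1.csv", "w", newline="") as f:
    csv.writer(f).writerows([["id", "name"], ["1", "alice"], ["2", "bob"]])
with open("tbl2.csv", "w", newline="") as f:
    csv.writer(f).writerows([["id", "dept"], ["1", "eng"], ["3", "ops"]])

def load_csv(conn, path, table):
    """Create a table for the file and bulk-insert its rows.

    Assumes the first row is a header; every column is stored as TEXT.
    """
    with open(path, newline="") as f:
        rows = csv.reader(f)
        header = next(rows)
        cols = ", ".join(f'"{c}"' for c in header)
        conn.execute(f'CREATE TABLE "{table}" ({cols})')
        marks = ", ".join("?" for _ in header)
        conn.executemany(f'INSERT INTO "{table}" VALUES ({marks})', rows)

conn = sqlite3.connect(":memory:")  # throwaway DB, nothing persists
load_csv(conn, "tbl1.csv", "tbl1")
load_csv(conn, "tbl2.csv", "tbl2")

# The quick-and-dirty join I am after:
result = conn.execute(
    "SELECT tbl1.id, name, dept FROM tbl1 JOIN tbl2 ON tbl1.id = tbl2.id"
).fetchall()
print(result)  # [('1', 'alice', 'eng')]
```

It works, but it is exactly this kind of per-use boilerplate (create tables, guess column types, clean up afterwards) that I would like a ready-made tool to hide.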
TL;DR - I basically would like a convenient way to do quick and dirty SQL joins on CSV files. I am not looking for a full-fledged text-based RDBMS, just a nicer way to do some analysis on CSV extracts from an RDBMS.
example:
sqlthingy -i tbl1.csv tbl2.csv -o 'select 1,2,3 from tbl1, tbl2 where tbl1.1 = tbl2.1'
This seems like an interesting enough problem to devote some time to, but I'd like to know if something like it already exists.