cloudfs_fdw now supports .xls (Excel 97-2003), .xlsx, and .ods (OpenDocument Format) spreadsheets via pandas, xlrd, and odfpy. This requires pandas >= 1.0.1, which runs only on Python 3, so Multicorn must be compiled against Python 3.
Since pandas provides sorting and filtering capabilities, cloudfs_fdw tries to push down SQL qualifiers and sort keys when they can be translated into pandas notation.
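To make the push-down idea concrete, here is a minimal sketch of how simple qualifiers and sort keys could be turned into pandas notation. It is not taken from cloudfs_fdw: the `Qual` and `SortKey` tuples, the operator table, and both function names are assumptions made up for illustration. Anything that cannot be translated is handed back so it can be evaluated elsewhere.

```python
# Hypothetical sketch, not cloudfs_fdw's actual implementation.
from typing import Any, Dict, List, NamedTuple, Tuple


class Qual(NamedTuple):
    """A simple WHERE condition of the form: column <operator> constant."""
    column: str
    operator: str
    value: Any


class SortKey(NamedTuple):
    """One ORDER BY item."""
    column: str
    descending: bool = False


# SQL comparison operators with a direct equivalent in a pandas query string.
SQL_TO_PANDAS = {"=": "==", "<>": "!=", "<": "<", "<=": "<=", ">": ">", ">=": ">="}


def quals_to_query(quals: List[Qual]) -> Tuple[str, List[Qual]]:
    """Return a pandas query expression plus the quals that were not translated."""
    parts: List[str] = []
    leftover: List[Qual] = []
    for qual in quals:
        op = SQL_TO_PANDAS.get(qual.operator)
        simple_value = (isinstance(qual.value, (int, float, str))
                        and not isinstance(qual.value, bool))
        if op is None or not simple_value or "`" in qual.column:
            # No safe translation (LIKE, arrays, NULL tests, ...): do not push down.
            leftover.append(qual)
            continue
        # Backticks quote the column name; repr() renders the constant as a literal.
        parts.append(f"(`{qual.column}` {op} {qual.value!r})")
    return " and ".join(parts), leftover


def sort_keys_to_kwargs(keys: List[SortKey]) -> Dict[str, list]:
    """Build keyword arguments for DataFrame.sort_values(by=..., ascending=...)."""
    return {
        "by": [key.column for key in keys],
        "ascending": [not key.descending for key in keys],
    }
```

With a DataFrame `df` at hand, the result would be applied roughly as `df.query(expr)` (only when `expr` is non-empty) followed by `df.sort_values(**sort_keys_to_kwargs(keys))`.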
Take a look and have fun.
Showing posts with label cloud. Show all posts
Friday, February 14, 2020
Wednesday, September 25, 2019
cloudfs_fdw
Since I needed a Foreign Data Wrapper for files stored on S3, and the ones I found did things like loading the whole file into memory before sending the first rows, I wrote my own, using Multicorn.
Along the way, I discovered libraries like smart-open and ijson that make it possible to stream various file formats from various filesystems, and so this escalated a bit, into cloudfs_fdw.
It currently supports CSV and JSON files from S3, HTTP/HTTPS sources, and local or network filesystems, but since smart-open supports more than that (e.g. HDFS, SSH), it can certainly be extended if needed.
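The difference between streaming and loading a whole file first can be sketched in a few lines. This is an illustration only, not cloudfs_fdw's code: `stream_csv_rows` and its `opener` parameter are hypothetical names. The built-in `open` is the default so the sketch runs without extra packages; passing `smart_open.open` instead should, assuming that library is installed, allow the same loop to read from URIs such as S3 or HTTP.

```python
# Hypothetical sketch, not cloudfs_fdw's actual implementation.
import csv
from typing import Callable, Dict, Iterator


def stream_csv_rows(uri: str, opener: Callable = open) -> Iterator[Dict[str, str]]:
    """Yield CSV rows one at a time instead of reading the whole file first."""
    with opener(uri, "r") as handle:
        # csv.DictReader pulls lines from the handle lazily, so memory use
        # depends on the size of a row, not on the size of the file.
        for row in csv.DictReader(handle):
            yield dict(row)
```

Because the function is a generator, the first row is available as soon as the first lines have been read, which is the behaviour the wrapper needs in order to send rows to PostgreSQL early.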
For now, have fun.
Tuesday, August 25, 2015
The 'other' cloud - parasitic storage as a service
Since some NoSQL products by default ship with no security at all, this was no real surprise after MongoDB, but the magnitude is astounding.
How about using this for something useful?
1. Scan the Internet for known products/servers that allow unconditional read/write access
2. Write storage adapters
3. Invent a mechanism to store data encrypted and with some redundancy, in case someone gets a wake-up call
4. Invent a mechanism to rebalance storage if servers become unavailable or new ones are added to the list of storage nodes
5. Build a service around 1, 2, 3, and 4
But of course, this is purely fictional. The bad guys don't have good ideas and the good guys won't do it. There is no illegal data stored in your session cache.
Just keep ignoring the fine manuals and carry on. Nobody needs database administrators, everybody knows that...