Showing posts with label cloud. Show all posts

Friday, February 14, 2020

Excel and ODF support for cloudfs_fdw

cloudfs_fdw now supports .xls (Excel 97-2003), .xlsx, and .ods (Open Document Format) spreadsheets via pandas, xlrd, and odfpy. It requires pandas >= 1.0.1, so Multicorn must be compiled against Python 3.

Since pandas provides sorting and filtering capabilities, cloudfs_fdw tries to push down SQL qualifiers and sort keys when they can be translated into pandas notation.
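To give an idea of what such a translation might look like, here is a minimal sketch. It is not the actual cloudfs_fdw code: `Qual` and `SortKey` are hypothetical stand-ins for the qualifier and sort key objects a wrapper receives, and the operator table is an assumption. Anything that cannot be translated is simply skipped and left for PostgreSQL to check.

```python
from collections import namedtuple

# Hypothetical stand-ins for the qualifier and sort key objects handed to a wrapper.
Qual = namedtuple("Qual", "field_name operator value")
SortKey = namedtuple("SortKey", "attname is_reversed")

# SQL operators assumed to have a direct pandas equivalent.
PANDAS_OPS = {"=": "==", "<>": "!=", "<": "<", "<=": "<=", ">": ">", ">=": ">="}


def quals_to_query(quals):
    """Return (query_string, skipped_quals) for use with DataFrame.query()."""
    parts, skipped = [], []
    for q in quals:
        op = PANDAS_OPS.get(q.operator)
        # Only simple scalar comparisons are translated in this sketch.
        if op is None or not isinstance(q.value, (int, float, str)):
            skipped.append(q)
            continue
        # Backticks let query() handle column names that are not valid identifiers.
        parts.append(f"`{q.field_name}` {op} {q.value!r}")
    return " and ".join(parts), skipped


def apply_pushdown(df, quals, sortkeys):
    """Filter and sort a pandas DataFrame according to the translatable parts."""
    query, _ = quals_to_query(quals)
    if query:
        df = df.query(query)
    if sortkeys:
        df = df.sort_values(
            by=[k.attname for k in sortkeys],
            ascending=[not k.is_reversed for k in sortkeys],
        )
    return df
```

With quals for `price > 10` and `name LIKE 'a%'`, only the first ends up in the query string; the LIKE stays with PostgreSQL, which re-checks the rows it gets back anyway.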

Take a look and have fun.

Wednesday, September 25, 2019

cloudfs_fdw

Since I needed a Foreign Data Wrapper for files stored on S3, and the ones I found did things like loading the whole file into memory before sending the first rows, I wrote my own, using Multicorn.

Along the way, I discovered libraries like smart-open and ijson that allow streaming various file formats from various filesystems, and so this escalated a bit, into cloudfs_fdw.

It currently supports CSV and JSON files from S3, HTTP/HTTPS sources, and local or network filesystems, but since smart-open supports more than that (e.g. HDFS, SSH), it can certainly be extended if needed.
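The streaming idea, as opposed to loading the whole file first, can be sketched roughly like this. Again, this is an illustration and not the cloudfs_fdw source: `stream_csv_rows` and `open_source` are made-up helper names, and the column list is assumed to come from the foreign table definition.

```python
import csv
import io


def stream_csv_rows(fileobj, columns):
    """Yield one dict per CSV line, reading lazily from any text file object."""
    for row in csv.reader(fileobj):
        # Each row is handed on as soon as it has been parsed.
        yield dict(zip(columns, row))


def open_source(uri):
    """Open a local path or remote URI (e.g. s3://, https://) as a text stream."""
    # smart_open is a third-party package; imported here so the rest runs without it.
    from smart_open import open as smart_open_file
    return smart_open_file(uri, "r")


# Works with anything file-like; an in-memory buffer stands in for a remote file here.
rows = stream_csv_rows(io.StringIO("1,alpha\n2,beta\n"), ["id", "name"])
first = next(rows)
```

Because the generator only pulls lines as they are requested, the first row is available before the rest of the file has been read.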

For now, have fun.

Tuesday, August 25, 2015

The 'other' cloud - parasitic storage as a service

Since some NoSQL products by default ship with no security at all, this was no real surprise after MongoDB, but the magnitude is astounding.

How about using this for something useful?
  1. Scan the Internet for known products/servers that allow unconditional read/write access
  2. Write storage adapters
  3. Invent a mechanism to store data encrypted and with some redundancy, in case someone gets a wake-up call
  4. Invent a mechanism to rebalance storage if servers become unavailable or new ones are added to the list of storage nodes
  5. Build a service around 1, 2, 3, and 4
There it is, the 'other' cloud, at almost no cost except bandwidth...

But of course, this is purely fictional. The bad guys don't have good ideas and the good guys won't do it. There is no illegal data stored in your session cache.

Just keep ignoring the fine manuals and carry on. Nobody needs database administrators, everybody knows that...