{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "2WGeukhYh9fd" }, "source": [ "# TFDS CLI\n", "\n", "TFDS CLI is a command-line tool that provides various commands to easily work with TensorFlow Datasets." ] }, { "cell_type": "markdown", "metadata": { "id": "r-42ZFIIrgbF" }, "source": [ "Copyright 2020 The TensorFlow Datasets Authors, Licensed under the Apache License, Version 2.0" ] }, { "cell_type": "markdown", "metadata": { "id": "grQeV-PZroqn" }, "source": [ "\u003ctable class=\"tfo-notebook-buttons\" align=\"left\"\u003e\n", " \u003ctd\u003e\n", " \u003ca target=\"_blank\" href=\"https://www.tensorflow.org/datasets/cli\"\u003e\u003cimg src=\"https://www.tensorflow.org/images/tf_logo_32px.png\" /\u003eView on TensorFlow.org\u003c/a\u003e\n", " \u003c/td\u003e\n", " \u003ctd\u003e\n", " \u003ca target=\"_blank\" href=\"https://colab.research.google.com/github/tensorflow/datasets/blob/master/docs/cli.ipynb\"\u003e\u003cimg src=\"https://www.tensorflow.org/images/colab_logo_32px.png\" /\u003eRun in Google Colab\u003c/a\u003e\n", " \u003c/td\u003e\n", " \u003ctd\u003e\n", " \u003ca target=\"_blank\" href=\"https://github.com/tensorflow/datasets/blob/master/docs/cli.ipynb\"\u003e\u003cimg src=\"https://www.tensorflow.org/images/GitHub-Mark-32px.png\" /\u003eView source on GitHub\u003c/a\u003e\n", " \u003c/td\u003e\n", " \u003ctd\u003e\n", " \u003ca href=\"https://storage.googleapis.com/tensorflow_docs/datasets/docs/cli.ipynb\"\u003e\u003cimg src=\"https://www.tensorflow.org/images/download_logo_32px.png\" /\u003eDownload notebook\u003c/a\u003e\n", " \u003c/td\u003e\n", "\u003c/table\u003e" ] }, { "cell_type": "markdown", "metadata": { "id": "kGrmMPUhXfUs" }, "source": [ "##### Disable TF logs on import\n" ] }, 
{ "cell_type": "code", "execution_count": null, "metadata": { "id": "vJLAsn1c0Hxu" }, "outputs": [], "source": [ "%%capture\n", "# Disable logs on TF import\n", "%env TF_CPP_MIN_LOG_LEVEL=1" ] }, { "cell_type": "markdown", "metadata": { "id": "uo-yMd3Zrm_K" }, "source": [ "## Installation\n", "\n", "The CLI tool is installed with `tensorflow-datasets` (or `tfds-nightly`)." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "jeV8_FpsiDwH" }, "outputs": [], "source": [ "!pip install -q tfds-nightly apache-beam\n", "!tfds --version" ] }, { "cell_type": "markdown", "metadata": { "id": "hdZiDNR1ijRH" }, "source": [ "For the list of all CLI commands:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "CCJPO_Akij0U" }, "outputs": [], "source": [ "!tfds --help" ] }, { "cell_type": "markdown", "metadata": { "id": "fJrFRBDKj0sO" }, "source": [ "## `tfds new`: Implementing a new Dataset\n", "\n", "This command will help you kickstart writing your new Python dataset by creating\n", "a `\u003cdataset_name\u003e/` directory containing default implementation files.\n", "\n", "Usage:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "c0Bm7yFCk91Q" }, "outputs": [], "source": [ "!tfds new my_dataset" ] }, { "cell_type": "markdown", "metadata": { "id": "OZaDtK0elimF" }, "source": [ "`tfds new my_dataset` will create:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "CwSPLFRfli8I" }, "outputs": [], "source": [ "!ls -1 my_dataset/" ] }, { "cell_type": "markdown", "metadata": { "id": "rUsoDbi0moyK" }, "source": [ "An optional flag `--data_format` can be used to generate format-specific dataset builders (e.g., `conll`). 
If no data format is given, the command generates a template for a standard `tfds.core.GeneratorBasedBuilder`.\n", "Refer to the [documentation](https://www.tensorflow.org/datasets/format_specific_dataset_builders) for details on the available format-specific dataset builders.\n", "\n", "See our [writing dataset guide](https://www.tensorflow.org/datasets/add_dataset)\n", "for more info.\n", "\n", "Available options:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "pAWCw-fDkwky" }, "outputs": [], "source": [ "!tfds new --help" ] }, { "cell_type": "markdown", "metadata": { "id": "x7996uhD1-GP" }, "source": [ "## `tfds build`: Download and prepare a dataset\n", "\n", "Use `tfds build \u003cmy_dataset\u003e` to generate a new dataset. `\u003cmy_dataset\u003e` can be:\n", "\n", "* A path to a `dataset/` folder or `dataset.py` file (empty for the current directory):\n", " * `tfds build datasets/my_dataset/`\n", " * `cd datasets/my_dataset/ \u0026\u0026 tfds build`\n", " * `cd datasets/my_dataset/ \u0026\u0026 tfds build my_dataset`\n", " * `cd datasets/my_dataset/ \u0026\u0026 tfds build my_dataset.py`\n", "\n", "* A registered dataset:\n", "\n", " * `tfds build mnist`\n", " * `tfds build my_dataset --imports my_project.datasets`\n", "\n", "Note: `tfds build` has useful flags to help with prototyping and debugging. See the `Debug \u0026 tests:` section of the available options below.\n", "\n", "Available options:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "IGAl6dw62KNO" }, "outputs": [], "source": [ "!tfds build --help" ] } ], "metadata": { "colab": { "collapsed_sections": [ "kGrmMPUhXfUs" ], "name": "cli.ipynb", "toc_visible": true }, "kernelspec": { "display_name": "Python 3", "name": "python3" } }, "nbformat": 4, "nbformat_minor": 0 }