{ "cells": [ { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "# Loading and saving\n", "\n", "This notebook shows how to load data for use in the pipeline and how to save the results of a fit. The pipeline comes with the `iwutil` package, which contains utilities for loading and saving data." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import iwutil\n", "import numpy as np\n", "import pandas as pd\n", "import shutil" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Generate data\n", "\n", "Let's save some dummy data to CSV and JSON files using the `iwutil.save` module, which contains functions for saving data in a variety of formats. If the filename contains a \"/\", the data is saved in that subdirectory, which is created if it does not already exist.\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Generate some dummy data\n", "n_samples = 100\n", "x = np.linspace(0, 10, n_samples)\n", "y = 2 * x + np.random.normal(0, 1, n_samples)\n", "z = x**2 + np.random.normal(0, 2, n_samples)\n", "\n", "# Create a DataFrame and save to CSV in the tmp directory\n", "df = pd.DataFrame({\"x\": x, \"y\": y, \"z\": z})\n", "iwutil.save.csv(df, \"tmp/data.csv\")\n", "\n", "# Create a dictionary and save to JSON in the tmp directory\n", "metadata = {\"A\": [1, 2, 3], \"B\": [4, 5, 6], \"C\": [7, 8, 9]}\n", "iwutil.save.json(metadata, \"tmp/metadata.json\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Load the data\n", "\n", "We can load the data into a pandas DataFrame using the `iwutil.read_df` function. This function automatically detects the file format from the filename extension (e.g. 
`.csv`, `.json`, `.parquet`).\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df = iwutil.read_df(\"tmp/data.csv\")\n", "print(df.head())" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "metadata = iwutil.read_df(\"tmp/metadata.json\")\n", "print(metadata)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, we can delete the temporary directory and all of its contents." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "shutil.rmtree(\"tmp\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3" }, "vscode": { "interpreter": { "hash": "938a56cf8cc78a970178b6cd91dbc2084cfe03b4ddf365fda3eb6d44738b4092" } } }, "nbformat": 4, "nbformat_minor": 2 }