{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Fitting across multiple dimensions\n", "\n", "Suppose you have to fit a single model to multiple data points across some dimension, or even multiple dimensions. The accessor can handle this with ease.\n", "\n", "To demonstrate, let's extend our previous example of fitting a 1D Gaussian peak on a linear background to 2D, where each row contains a Gaussian peak with a different center." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import xarray as xr\n", "import matplotlib.pyplot as plt\n", "import lmfit\n", "import xarray_lmfit" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Define coordinates\n", "x = np.linspace(-5.0, 5.0, 100)\n", "y = np.arange(3)\n", "\n", "# Center of the peaks along y\n", "center = np.array([-2.0, 0.0, 2.0])[:, np.newaxis]\n", "\n", "# Gaussian peak on a linear background\n", "z = -0.1 * x + 2 + 3 * np.exp(-((x - center) ** 2) / (2 * 1**2))\n", "\n", "# Add some noise with fixed seed for reproducibility\n", "rng = np.random.default_rng(5)\n", "zerr = np.full_like(z, 0.1)\n", "z = rng.normal(z, zerr)\n", "\n", "# Construct DataArray\n", "darr = xr.DataArray(z, dims=[\"y\", \"x\"], coords={\"y\": y, \"x\": x})\n", "darr.plot()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "{meth}`xarray.DataArray.xlm.modelfit` will automatically broadcast the model parameters across the non-fitting dimensions, allowing you to fit all rows in one go." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "model = lmfit.models.GaussianModel() + lmfit.models.LinearModel()\n", "\n", "params = {\"center\": 0.0, \"slope\": -0.1}\n", "\n", "result_ds = darr.xlm.modelfit(coords=\"x\", model=model, params=params)\n", "result_ds" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that {meth}`xarray.DataArray.xlm.modelfit` also allows `params` to be provided as a dictionary, structured like the keyword arguments to {meth}`lmfit.model.Model.make_params` or {func}`lmfit.parameter.create_params`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Providing initial guesses\n", "\n", "What if you want to provide different initial guesses for each row? Using the powerful broadcasting capabilities of xarray, you can provide initial guesses and bounds for the fitting parameters as {class}`xarray.DataArray`s.\n", "\n", "For instance, if we want to provide different initial guesses for the peak positions along `y`, we can do so by passing a dictionary of DataArrays to the `params` argument." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "model = lmfit.models.GaussianModel() + lmfit.models.LinearModel()\n", "\n", "params = {\n", " \"center\": xr.DataArray([-2, 0, 2], coords=[darr.y]),\n", " \"slope\": -0.1,\n", "}\n", "\n", "result_ds = darr.xlm.modelfit(coords=\"x\", model=model, params=params)\n", "result_ds" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's overlay the fitted peak positions on the data." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "result_ds.modelfit_data.plot()\n", "result_center = result_ds.sel(param=\"center\")\n", "\n", "plt.plot(result_center.modelfit_coefficients, result_center.y, \"o-\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The same can be done with *all* parameter attributes that can be passed to {func}`lmfit.parameter.create_params` (e.g., `vary`, `min`, `max`, etc.). For example:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "model = lmfit.models.GaussianModel() + lmfit.models.LinearModel()\n", "\n", "params = {\n", " \"center\": {\n", " \"value\": xr.DataArray([-2, 0, 2], coords=[darr.y]),\n", " \"min\": -5.0,\n", " \"max\": xr.DataArray([0, 2, 5], coords=[darr.y]),\n", " },\n", " \"slope\": -0.1,\n", "}\n", "\n", "result_ds = darr.xlm.modelfit(coords=\"x\", model=model, params=params)\n", "result_ds" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Parallelization\n", "\n", "The accessors are tightly integrated with `xarray`, so passing a dask array will\n", "parallelize the fitting process. See [Parallel Computing with Dask](https://docs.xarray.dev/en/stable/user-guide/dask.html) for background on how xarray and dask work together.\n", "\n", "Assuming you have `dask` installed with a `dask` scheduler set up, you can fit large datasets in parallel with ease.\n", "\n", "For example, recall the previous example where we created a DataArray with 3 Gaussian peaks, each with a different center, except this time we will create not just 3, but 300 peaks." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Define coordinates\n", "x = np.linspace(-5.0, 5.0, 100)\n", "y = np.arange(300)\n", "\n", "# Center of the peaks along y\n", "center = np.linspace(-2.0, 2.0, 300)[:, np.newaxis]\n", "\n", "# Gaussian peak on a linear background\n", "z = -0.1 * x + 2 + 3 * np.exp(-((x - center) ** 2) / (2 * 1**2))\n", "\n", "# Construct DataArray\n", "darr = xr.DataArray(z, dims=[\"y\", \"x\"], coords={\"y\": y, \"x\": x})\n", "darr.plot()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, let's try chunking the data and converting it to a dask array before fitting:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "darr = darr.chunk({\"y\": 50})\n", "darr" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When {meth}`xarray.DataArray.xlm.modelfit` is called on a dask array, the fitting is not performed immediately. Instead, a dask graph is created that represents the computation to be performed:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "model = lmfit.models.GaussianModel() + lmfit.models.LinearModel()\n", "\n", "params = {\n", " \"center\": xr.DataArray(np.linspace(-2.0, 2.0, 300), coords=[darr.y]),\n", " \"slope\": -0.1,\n", "}\n", "\n", "result = darr.xlm.modelfit(coords=\"x\", model=model, params=params)\n", "result" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can call `.compute()` on the entire Dataset, or on individual variables to perform the fitting in parallel." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "result.compute()" ] } ], "metadata": { "kernelspec": { "display_name": ".venv", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.13.1" } }, "nbformat": 4, "nbformat_minor": 4 }