{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Introduction to the toynn_2023 tool box" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "

Table of contents\n",
    "\n",
    "1. The class ToyPb\n",
    "2. The class nD_data\n",
    "3. The class ToyNN\n",
    "4. Methods for basic operations on lists of weights\n",
    "5. Methods for optimization
\n", " " ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "Ie13pHt3gyh8" }, "source": [ "### Standard libraries and the three classes of the tool box toynn_2022 (ToyPb, nD_data, ToyNN)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from toynn_2023 import *\n", "# performs the following:\n", "# import numpy as np\n", "# from numpy import random as nprd\n", "# from matplotlib import pyplot as plt\n", "# from matplotlib import cm as cm\n", "# from copy import deepcopy as dcp" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "top\n", "          \n", "          \n", " \n", "1.\n", "          \n", "          \n", " \n", "2.\n", "          \n", "          \n", " \n", "3.\n", "          \n", "          \n", " \n", "4.\n", "          \n", "          \n", " \n", "5.\n", "          \n", "          \n", " \n", "bot." ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "Woacru-MVEgA" }, "source": [ "# 1. The class ToyPb " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Attributes of an object in the class ToyPb" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "An object in the class ToyPb contains some information about a classification problem for points in a rectangle.\n", "\n", "If _pb_ is in this class:
\n", "($*$) _pb.name_ is a chain.
\n", "($*$) _pb.bounds_ is a tuple of floats $(x_0^-,x_0^+,x_1^-,x_1^+)$ which defines the rectangle.
\n", "($*$) _pb.f_ is an implementation of a numerical function $f(x_0,x_1)$.
The classification problem is the following. Given $x=(x_0,x_1)\\in[x_0^-,x_0^+]\\times[x_1^-,x_1^+]$, determine whether $x$ belongs to $\\Omega$, where \n", "$$\n", "\\Omega:=\\left\\{x \\in[x_0^-,x_0^+]\\times[x_1^-,x_1^+]: f(x)<0\\right\\}.\n", "$$\n", "\n", "There are two other attributes.
\n", "($*$) _pb.loss_ is an implementation of a numerical function $\\ell$.
\n", "($*$) *pb.loss_prime* is an implementation of the derivative $\\ell'$ of $\\ell$.
\n", "The _``loss function''_ $\\ell:\\mathbb{R}\\to\\mathbb{R}$ is used to estimate the error of predictions.
\n", "Given a prediction $\\hat y\\in\\mathbb{R}$ and the correct classification:\n", "$$\n", "y= \\begin{cases}-1&\\text{if }x\\not\\in\\Omega,\\\\\n", "\\,\\,1&\\text{if }x\\in\\Omega,\n", "\\end{cases}\n", "$$\n", "the error (or cost) is measured by $\\ell(\\hat y y)$. The function $\\ell$ should be nondecreasing with $\\ell\\ge0$ and $\\ell(t)$ close to $0$ for large positive values of $t$. The ideal loss function would be \n", "$$\n", "\\ell_{ideal}(t)= \\begin{cases}\\,\\ 0&\\text{if }t>0,\\\\\n", "+\\infty&\\text{if }t\\le0.\n", "\\end{cases}\n", "$$\n", "This would give a zero cost to predictions $y$ with the correct sign and an infinite cost to the others.
\n", "However, to apply the gradient-descent methods we pick a smooth decreasing function for $\\ell$.
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Below, is an example of creation and manipulation of an obect in the class *ToyPb*.
\n", "__Remark:__ The method *show_border()* displays the boundary of the region $\\Omega$." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": false }, "outputs": [], "source": [ "pb = ToyPb(name = \"disk\", bounds = (-1,1), loss_name = \"softplus\")\n", "\n", "\n", "print(f\"pb.name={pb.name}, pb.bounds={pb.bounds}\")\n", "\n", "\n", "pb.show_border()\n", "plt.title(f\"boundary of $\\Omega$\",fontsize=20)\n", "plt.show()\n", "\n", "loss, loss_prime = pb.loss, pb.loss_prime\n", "t=np.linspace(-3,3,300)\n", "\n", "\n", "plt.figure(figsize=(12,5))\n", "plt.subplot(121)\n", "plt.plot(t,loss(t),'b',label=r\"$\\ell$\")\n", "plt.legend(fontsize=20)\n", "\n", "plt.subplot(122)\n", "plt.plot(t,loss_prime(t),'r',label=r\"$\\ell'$\")\n", "plt.legend(fontsize=20)\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can try with _name_ = \"square\", \"sin\" or \"ring\" and with *loss_name* = \"demanding\"." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## The \\_\\_init\\_\\_ method of the class ToyPb" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Parameters of the \\_\\_init\\_\\_ method**\n", "\n", "\\*\\*kwargs :
\n", "     \n", "_f_ : numerical function $(x_0,x_1)\\in\\mathbb{R}^2\\mapsto f(x_0,x_1)\\in\\mathbb{R}$ (optional if _name_ is given)
\n", "     \n", "_name_ : string (optional if _f_ is given)
\n", "     \n", "*bounds*=(-1,1) : a tuple of 2 or 4 floats
\n", "     \n", "*loss*, *loss_prime* : numerical functions: $\\mathbb{R}\\to\\mathbb{R}$ (optional)
\n", "     \n", "*loss_name*: string (optional)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Behaviour of the \\_\\_init\\_\\_ method.**\n", "\n", "The possible values for _name_ are \"sin\", \"affine\", \"disk\", \"square\", \"ring\".\n", "\n", "If _bounds_ is the tuple ($x_-$,$x_+$) with length 2, then *pb.bounds* receives the value ($x_-$,$x_+$,$x_-$,$x_+$).
\n", "If _bounds_ is a tuple with length 4, *pb.bounds* receives the value _bounds_. \n", "\n", "The possible values for *loss_name* are \"softplus\" and \"demanding\".
\n", "For \"softplus\", \n", "$$\n", "\\ell(t)=\\ln\\left(1 + e^{-t}\\right).\n", "$$\n", "For \"demanding\", \n", "$$\n", "\\ell(t)=\\sqrt{(t - 1)^2+1/10} - t + 1\n", "$$\n", "If the parameters *loss_name* and at least one of the parameters *loss* or *loss_prime* is not specified, then the deffect value for the loss function is:\n", "$$\n", "\\ell(t)=\\sqrt{t^2+1/10} - t.\n", "$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 2. The class nD_data " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "top\n", "          \n", "          \n", "1.\n", "          \n", "          \n", "2.\n", "          \n", "          \n", "3.\n", "          \n", "          \n", "4.\n", "          \n", "          \n", "5.\n", "          \n", "          \n", "bot." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "An object of the class *nD_data* essentially contains:
\n", "($*$) a set of points of the plane: $x^i=(x^i_0,x^i_1)$ for $i=0,\\dots,n-1$,
\n", "($*$) a set of labels $y^i\\in\\{-1,1\\}$ for $i=0,\\dots,n-1$ corresponding to the exact classification of the points $(x^i_0,x^i_1)$ with respect to a problem *pb* in the class _ToyPb_.,
\n", "($*$) possibly a set of predictions $y^i_{pred}\\in\\mathbb{R}$ for $i=0,\\dots,n-1$." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If _data_ is in this class:
\n", "($*$) _data.n_ is an integer (the size of the sets).
\n", "($*$) _data.X_ is a numpy array of size $n\\times2$. With the above notation, *data.X*[i,0]$=x^i_0$, *data.X*[i,1]$=x^i_1$.
\n", "($*$) _data.Y_ is a numpy array of length $n$. With the above notation, *data.Y*[i]$=y^i$.
\n", "($*$) *data.Ypred* is also a numpy array of length $n$ and *data.Y*[i]$=y_{pred}^i$." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__Remarks:__
\n", "(a) *data.Ypred* is created only if *init_pred=True*. In this case it is initialized as a zero numpy array.
\n", "(b) For computing *data.Y* it is necessary to specify an object *pb* in the class *ToyPb*. The numpy array *data.Y* is then created according to the rule:\n", "$$\n", "\\textit{data.Y}\\text{[i]}:=\n", "\\begin{cases}\n", "-1&\\text{if }\\textit{pb.f}(\\textit{data.X}\\text{[i]})\\geq0,\\\\\n", "\\phantom{-}1&\\text{if }\\textit{pb.f}(\\textit{data.X}\\text{[i]})<0.\n", "\\end{cases}\n", "$$\n", "(c) The method *show_class()* displays the classification." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "pb = ToyPb(name = \"disk\", bounds = (-1,1), loss_name = \"softplus\")\n", "\n", "ndata = 1000\n", "data = nD_data(n = ndata, pb = pb, init_pred=True)\n", "print(f\"data.n={data.n}\")\n", "print(f\"data.X.shape={data.X.shape}\")\n", "print(f\"data.Y.shape={data.Y.shape}\")\n", "print(f\"data.Ypred.shape={data.Ypred.shape}\")\n", "\n", "\n", "data.show_class()\n", "\n", "pb.show_border('k--')\n", "\n", "plt.legend(loc=1,fontsize=12)\n", "title1=\"Values of data.Y[i] on the points with\"\n", "title2=\" coordinates (data.X[i,0],data.X[i,1])\"\n", "plt.title(title1 + title2, fontsize=15)\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## The \\_\\_init\\_\\_ method of the class nD_data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Parameters of the \\_\\_init\\_\\_ method**\n", "\n", "\\*\\*kwargs :
\n", "     \n", "($*$) _n_ : integer (>0)
\n", "     \n", "($*$) _X_ : numpy array of shape *n*$\\times$*2*
\n", "     \n", "($*$) _Y_ : numpy array of length _n_
\n", "     \n", "($*$) _f_ : numerical function $(x_0,x_1)\\in\\mathbb{R}^2\\mapsto f(x_0,x_1)\\in\\mathbb{R}$ (optional)
\n", "     \n", "($*$) _pb_ : object of type ToyPb
\n", "     \n", "($*$) *bounds*=(-1,1) : a tuple of 2 or 4 floats
\n", "     \n", "($*$) *init_pred*=None : boolean" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Behaviour of the \\_\\_init\\_\\_ method.**\n", "\n", "If *bounds*$=(x_0^-,x_0^+)$ then the attribute _bounds_ receives $(x_0^-,x_0^+,x_0^-,x_0^+)$.
\n", "If *bounds*$=(x_0^-,x_0^+,x_1^-,x_1^+)$ then the attribute _bounds_ receives _bounds_.
\n", "In the sequel, we denote _bounds_$=(x_0^-,x_0^+,x_1^-,x_1^+)$.\n", "\n", "If _X_ and _Y_ are given they are sent to the corresponding attributes of the object. \n", "\n", "If _X_ and _Y_ are not given, the \\_\\_init\\_\\_ method creates two atributes _X_ and _Y_.
\n", "_X_ is a numpy array of size _n_$\\times$_2_. The coefficients of _X_ are picked randomly *X*[i,0] is picked in $[x_0^-,x_0^+]$ and *X*[i,1] in $[x_1^-,x_1^+]$.
\n", "_Y_ is a numpy array of length _n_ which is defined with the rule.\n", "$$\n", "\\textit{Y}\\text{[i]}:=\n", "\\begin{cases}\n", "-1&\\text{if }\\textit{g}(\\textit{X}\\text{[i,0]},\\textit{X}\\text{[i,1]})\\geq0,\\\\\n", "\\ 1&\\text{if }\\textit{g}(\\textit{X}\\text{[i,0]},\\textit{X}\\text{[i,1]})<0,\n", "\\end{cases}\n", "$$\n", "where _g_=_f_ if _f_ is specified and _g_=_pb.f_ if not.\n", "\n", "\n", "If *init_pred=True* then an attribute *Ypred* is created which receives a zero numpy array of length _n_ (an array with the same shape as _Y_ and with zero entries)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 3. The class toyNN " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "top\n", "          \n", "          \n", " \n", "1.\n", "          \n", "          \n", " \n", "2.\n", "          \n", "          \n", " \n", "3.\n", "          \n", "          \n", " \n", "4.\n", "          \n", "          \n", " \n", "5.\n", "          \n", "          \n", " \n", "bot." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "An object of the class _toyNN_ contains the characteristics of a neural network (number of hidden layers, number of nodes in each layer, activation function) __but__ not the coefficients of a specific neural network with this shape. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The number of layers and nodes in a neural network is described as a tuple\n", "$$\n", "\\text{CardNodes}=(a_0,a_1,\\dots,a_{N-1},a_N),\n", "$$\n", "where $a_n$ is the number of nodes in the $n^{\\text{th}}$ layer .
\n", "There are $N-1$ hidden layers.
\n", "The neural networks of interest for the classification problems of part __1__ have two input nodes and one output node. Hence \n", "$$\n", "a_0=2\\qquad\\text{ and }\\qquad a_N=1.\n", "$$\n", "(In the optimization process, the two input nodes will be fed with the coordinates $(x_0,x_1)$ of the points to classify. The output will be a real number that we want to be positive for input points in $\\Omega$ and negative in the other cases.)
\n", "The neural networks are also characterized by an activation function $\\chi$." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The atributes of an object *nn* in this class are the following.
\n", "($*$) _nn.N_ is an integer. The number of hidden layers is *nn.N*$-1$.
\n", "($*$) _nn.card_ is a tuple of integers which contains the number of nodes in each layer.
\n", "($*$) _nn.Nparam_ is the number of free coefficients of a neural network of type *nn*. Denoting $N=$_nn.N_ and $(a_0,a_1,\\dots,a_{N-1},a_N)=$*nn.card*, we have \n", "$$\n", "\\textit{nn.Nparam}=\\sum_{n=0}^{N-1}a_na_{n+1} +\\sum_{n=1}^N a_n.\n", "$$\n", "($*$) *nn.coef_bounds* is a 4-tuple of floats. It may be used when the coefficients of the neural network (weights and biasses) are picked randomly in the method *nn.create_rand*.
\n", "($*$) *nn.chi* is an implementation of the activation function $\\chi$.
\n", "($*$) *nn.chi_prime* is an implementation of the derivative $\\chi'$ of $\\chi$.
\n", "($*$) *nn.xx*, *nn.yy* and *nn.zz* are three 2D numpy arrays used in the graphic representations of the neural networks' outputs (in the method *nn.show_pred*).
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Below, we create a typical object in the class *ToyNN*. " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "CardNodes = (2, 4, 6, 5, 1)\n", "nn = ToyNN(card = CardNodes, coef_bounds=(-1,1,-1,1), chi=\"tanh\", grid=(-1,1,41))\n", "\n", "print(f\"nn.N={nn.N}\")\n", "print(f\"nn.card={nn.card}\")\n", "print(f\"nn.coef_bounds={nn.coef_bounds}\")\n", "print(f\"nn.Nparam={nn.Nparam}\")\n", "\n", "chi, chi_prime = nn.chi, nn.chi_prime\n", "t=np.linspace(-3,3,100)\n", "\n", "plt.figure(figsize=(12,5))\n", "plt.subplot(121)\n", "plt.plot(t,chi(t),'b',label=r\"$\\chi$\")\n", "plt.legend(fontsize=20)\n", "plt.subplot(122)\n", "plt.plot(t,chi_prime(t),'r',label=r\"$\\chi'$\")\n", "plt.legend(fontsize=20)\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "With the method *nn.create_rand()* we can build lists *A*$=$[*W,Bias*] where *W* and *Bias* are both lists of _N_ numpy arrays. The coefficients in these arrays are the parameters of a neural network. _W_ contains the weights of the edges and _Bias_ the weights of the nodes.
\n", "More precisely, for $n=0,\\dots,N-1$ denoting $a_n$ the number of nodes in the $n^{\\text{th}}$ layer:
\n", "      \n", "($*$) *W*[n][i,j]$=:w^{n}_{i,j}$ is the weight of the edge from the $i^{\\text{th}}$ node of layer $n$ to the $j^{\\text{th}}$ node of layer $n+1$.
\n", "      \n", "($*$) *Bias*[n][i]$=:b^{n}_{i}$ is the weight on the $i^{\\text{th}}$ node of layer $n+1$." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__How is computed the ouput $h$*(X,A)* provided by a neural network given an input *X*$=:(x_0,x_1)$ ?__\n", "\n", "Let us number the nodes of layer $n$ as $q_j^n$ for $j=0,\\cdots,a_n-1$.
\n", "We define for each node $q_j^n$ of the layers $n\\in\\{0,\\dots,N-1\\}$ an output value $O^n_j$ and for each node $q_j^n$ of the layers $n\\in\\{1,\\dots,N\\}$ an input value $I^n_j$. These quantities are defined as follows.
\n", "      \n", "The layer 0 has two nodes $q^0_0$ and $q^0_1$. We set (recall that *X*$=(x_0,x_1)$), \n", "$$\n", "(q^0_0,q^0_1)\\quad \\longleftarrow\\quad (x_0,x_1).\n", "$$\n", "Then for $n=0,\\dots,N-2$,
\n", "      we set for $j=0,\\dots,a_{n+1}-1$,
\n", "$$\n", "\\begin{array}{rl}\n", "I_j^{n+1}&\\longleftarrow\\ \\displaystyle\\sum_{i=0}^{a_n-1}w^n_{i,j} O_i^n + b_j^{n+1},\\\\\n", "O_j^{n+1}&\\longleftarrow\\ \\chi(I_j^{n+1}).\n", "\\end{array}\n", "$$\n", "The input value $I^N_0$ associated to the unique node of the last layer is given by\n", "$$\n", "I_0^N\\longleftarrow\\ \\sum_{i=0}^{a_{N-1}-1}w^{N-1}_{i,0} O_i^{N-1} + b_0^N.\n", "$$\n", "The output of the neural network with coefficients $A=$[*W,Bias*] for the input data $x=(x_0,x_1)$ is then defined as\n", "$$\n", "h(x,A):=I_0^N.\n", "$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__Example__: below we define an object _nn_ in the class _ToyNN_ and use it to build a list *A*=[*W,Bias*] which contains the weights of a neural network.
\n", "These weights are chosen randomly and uniformly in $[w_-,w_+]$ for the $w^n_{i,j}$'s and in $[b_-,b_+]$ for the $b^n_i$'s where $(w_-,w_+,b_-,b_+)=$*nn.coef_bounds*." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "CardNodes = (2, 3, 4, 2, 1)\n", "nn = ToyNN(card = CardNodes, coef_bounds=(-1,1,-1,1), chi=\"tanh\", grid=(-1,1,41))\n", "A=nn.create_rand()\n", "for n in range(nn.N):\n", " print(f\"W[{n}]={A[0][n]}\\n\")\n", "for n in range(nn.N) : \n", " print(f\"Bias[{n}]={A[1][n]}\\n\")\n", " \n", "nn.show(A)\n", "text1=\" The width of the edges is proportional to the absolute values\"\n", "text2=\" of the corresponding weights.\\n The color depends on their signs:\"\n", "text3=\" red if negative, green if positive.\\n The nodes are colored\"\n", "text4=\" according to the sign of the corresponding biasses\"\n", "text5=\" with the same convention.\"\n", "print(text1 + text2 + text3 + text4 + text5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 4. Methods for basic operations on lists of weights " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "top\n", "          \n", "          \n", "1.\n", "          \n", "          \n", "2.\n", "          \n", "          \n", "3.\n", "          \n", "          \n", "4.\n", "          \n", "          \n", "5.\n", "          \n", "          \n", "bot." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let us test the basic operations available in the library.
\n", "We start by creating an object _pb_ of type *ToyPb*, an object _nn_ of type *ToyNN* and then two lists of random weights _A_, _B_.
\n", "In the sequel such objects are called _coef-lists_. For shortness the weights stored in _A_ are denoted $A_i$." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "pb = ToyPb(name = \"square\", bounds = (-1,1), loss_name = \"softplus\")\n", "\n", "CardNodes = (2, 3, 4, 1)\n", "nn = ToyNN(card = CardNodes, coef_bounds=(-1,1,-1,1), chi=\"tanh\", grid=(-1,1,41))\n", "A = nn.create_rand()\n", "B = nn.create_rand()\n", "\n", "print(\"A:\")\n", "nn.show(A)\n", "print(\"B:\")\n", "nn.show(B)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We add _A_ and 2.5 times _B_ and put the result in a new coef-list _C_, that is\n", "$$\n", "\\textit{C}\\ \\leftarrow\\ \\textit{A}+2.5\\times\\textit{B}.\n", "$$" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "C=nn.add(A,B,c=2.5)\n", "print(\"A + 2.5 x B:\")\n", "nn.show(C)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can also put tHe result in _A_." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "nn.add(A,B,c=2.5,output=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "After this _C_ and _A_ should be equal. Let us check this." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "D=nn.add(A,C,c=-1)\n", "print(\"coefficients of D=A - C\")\n", "print(D[0])\n", "print(D[1])\n", "nn.show(D)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It can be also usefull to be create a zero coef-list. This is done by:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "E=nn.create_zero() \n", "print(\"a zero coef-list:\")\n", "nn.show(E)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To make a (true, deep) copy of a coef_list, we do:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "D=nn.copy(A)\n", "print(\"A:\")\n", "nn.show(A)\n", "print(\"D (copy of A):\")\n", "nn.show(D)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We change _A_ and check that _D_ has not been modified" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "A=nn.create_rand()\n", "print(\"A after modification:\")\n", "nn.show(A)\n", "print(\"D:\")\n", "nn.show(D)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Other methods for operations on coef_lists:
\n", "      \n", "($*$) if _c_ is a scalar and _A_ is a coef_list, *nn.scal_mult(A,c)* returns the coef list with weights *c*$\\times A_i$.
\n", "      \n", "($*$) if _A_ and _B_ are coef_lists *nn.dot(A,B)* returns the dot products of the two vectors containing all the coefficients of _A_ and _B_, that is \n", "$$\n", "\\sum_i A_i B_i.\n", "$$
\n", "      \n", "($*$) if _A_ is a coef_list, *nn.square(A)* returns a coef-list with the same structure as _A_ and with weights ${A_i}^{\\!2}$.
\n", "      \n", "($*$) if _A_ is a coef-list and _f_ is a numerical function (compatible with numpy) then *nn.maps(f,A)* returns the coef-list with weights $f(A_i)$.
\n", "      \n", "($*$) if _A_ and _B_ are coef-lists and *f* is a numerical function of two variables, then *nn.maps2(A,B)* returns the coef-list with weights $f(A_i,B_i)$.\n", "\n", "In the methods *square*, *maps* and *maps2*, it is possible to precise the parameter *output*$=$False. In this case the result is not returned but put in _A_. \n", "\n", "In the method *maps* (respectively *maps2*), the function _f_ may depend on an additional parameter, precised by *param*$=p$. In this case the computed weigths are $f(A_i,p)$ (resp. $f(A_i,B_i,p)$). See the examples below. " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "A=nn.create_rand()\n", "B=nn.create_rand()\n", "\n", "print(\"Test of nn.scal_mult:\")\n", "fact=3\n", "C=nn.scal_mult(A,fact)\n", "D=nn.add(A,C,c=-1/fact)\n", "print(\"A-(1/3)*(3*A)=\\n\",D)\n", "\n", "print(\"\\nTest of nn.dot:\")\n", "print(f\"nn.dot(A,B)={nn.dot(A,B):1.5e}\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(\"Test of nn.square:\")\n", "A2=nn.square(A)\n", "A4=nn.square(A2)\n", "print(\"A\")\n", "nn.show(A)\n", "print(\"A^2\")\n", "nn.show(A2)\n", "print(\"A^4\")\n", "nn.show(A4)\n", "\n", "print(\"\\nTest of nn.maps:\")\n", "f = lambda x:x**2\n", "fA=nn.maps(f,A)\n", "D=nn.add(A2,fA,-1)\n", "print(\"A^2-f(A) with f(x)=x^2\")\n", "nn.show(D)\n", "\n", "\n", "print(\"\\nTest of nn.maps2:\")\n", "f = lambda x,y:x*y\n", "fAA=nn.maps2(f,A,A)\n", "D=nn.add(A2,fAA,-1)\n", "print(\"A^2-f(A,A) with f(x,y)=x*y\")\n", "nn.show(D)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(\"Test nn.maps with additional parameters:\")\n", "f = lambda x, p : np.sin(p[0]*x)+np.sin(p[1]*x)\n", "p=(np.pi/2,np.pi/6)\n", "fAp=nn.maps(f,A,param=p)\n", "print(\"sin(π/2 A) + sin(π/6 A):\")\n", "nn.show(fAp)\n", "\n", "print(\"\\nTest nn.maps2:\")\n", "f = lambda x,y,p: np.exp(p[0]*x + p[1]*y)\n", "p=(.5,-1.5)\n", "fABp=nn.maps2(f,A,B,param=p)\n", "print(\"exp(1/2 A - 3/2 B):\")\n", "nn.show(fABp)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "top\n", "          \n", "          \n", "1.\n", "          \n", "          \n", "2.\n", "          \n", "          \n", "3.\n", "          \n", "          \n", "4.\n", "          \n", "          \n", "5.\n", "          \n", "          \n", "bot." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 5. Methods for optimization " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this part, we present the following methods associated with an object _nn_ of type _ToyNN_. They take as arguments a coef-list _A_ and depending on the method: a numpy array _x_ with lenth 2 and/or a float _y_, or an object _data_ of type *nD_data* and/or an object _pb_ of type _ToyPb_.
\n", "      \n", "($*$) The method _nn.output_ computes the output $h(x,A)$ produced by a neural network with weights *A*$=A_i$ for a given input $x=(x_0,x_1)$.
\n", "      \n", "($*$) the method _nn.descent_ computes the opposite gradient of the function\n", "$$\n", "A\\mapsto \\ell\\left(h(x,A)\\times y\\right)\n", "$$\n", "where $A$, $x$ are as above, $y$ is a tag associated to $x$ and $\\ell$ is a loss function associated with some object _pb_ of type _ToyPb_.
\n", "      \n", "($*$) The method _nn.prediction_ computes the outputs of _A_ at the points of a data set _data_ of type *nD_data* and put the result in the array _data.Ypred_.
\n", "      \n", "($*$) The method *data.show_class* with the argument *pred*=True displays this predicted classification.
\n", "      \n", "($*$) The method *show_pred* computes the outputs _nn.zz_ predicted by a coef-list _A_ on a grid (*nn.xx*,*nn.yy*) and displays the result as a heat map.
\n", "      \n", "($*$) The method *nn.total_loss* computes the mean loss \n", "$$\n", "\\dfrac1{n_d}\\sum_{j=0}^{n_d-1} \\ell\\left(h(X_j,A)\\times y_j\\right),\n", "$$\n", "where the $X_j$'s and $y_j$'s are the points and tags in a data set _data_ of type *nD_data*. Namely, $n_d=$*data.n*, $X_j=$*data.X*[j], $y_j=$*data.Y*[j].
\n", "      \n", "($*$) The method *nn.total_loss_and_prediction* combines the methods *nn.total_loss* and *nn.prediction*." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In general, the user does not need to call _nn.ouput_." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "CardNodes = (2, 3, 4, 2, 1)\n", "nn = ToyNN(card = CardNodes, coef_bounds=(-1,1,-1,1), chi=\"tanh\", grid=(-1,1,41))\n", "A=nn.create_rand()\n", "\n", "x=np.array([0.5,-0.3])\n", "o=nn.output(A,x)\n", "print(f\"x={x}\")\n", "print(f\"output(A,x)={o:1.5f}\")\n", "\n", "x=np.array([-0.75,0.25])\n", "o=nn.output(A,x)\n", "print(f\"x={x}\")\n", "print(f\"output(A,x)={o:1.5f}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The method nn_descent is the heart of gradient descent algorithms. It computes the opposite gradient with respect to the coefficienst $A_i$ of _A_ of the mapping\n", "$$\n", "F_{x,y}:A\\mapsto \\ell\\left(h(x,A)\\times y\\right).\n", "$$\n", "It returns a coef-list _dA_ with coefficients \n", "$$\n", "(dA)_i =-\\dfrac{\\partial F_{x,y}}{\\partial A_i}(A).\n", "$$\n", "It takes as arguments: a coef-list _A_, a np.array $x$ with length 2, a float _y_ and an object _pb_ of type _ToyPb_ (the loss function $\\ell$ is then *pb.loss*)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "pb = ToyPb(name = \"sin\", bounds = (-1,1), loss_name = \"softplus\")\n", "\n", "CardNodes = (2, 3, 4, 2, 1)\n", "nn = ToyNN(card = CardNodes, coef_bounds=(-1,1,-1,1), chi=\"tanh\", grid=(-1,1,41))\n", "\n", "A=nn.create_rand()\n", "\n", "x=np.array([-0.75,0.25])\n", "y=1\n", "\n", "dA=nn.descent(A,x,y,pb=pb)\n", "\n", "print(f\"x={x}, y={y}\")\n", "print(f\"dA=-Gradient Fxy(A)\")\n", "nn.show(dA)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "There are optional arguments _alpha_ and _B_.
\n", "If the float _alpha_ is specified, the weights of the returned coef-list are \n", "$$\n", "(dA)_i =-\\alpha\\dfrac{\\partial F}{\\partial A_i}(A).\n", "$$\n", "If a coef-list _B_ is specified, the result is not returned but added to _B_. This is handy when using a mini-batch method." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x=np.array([-0.75,0.25])\n", "y=1\n", "\n", "print(\"Test of the parameter alpha.\")\n", "dA=nn.descent(A,x,y,pb=pb)\n", "dA_one_half=nn.descent(A,x,y,alpha=1/2, pb=pb)\n", "\n", "D=nn.add(dA,dA_one_half,-2)\n", "print(\"dA(alpha=1) - 2xdA(alpha=1/2)\")\n", "nn.show(D)\n", "\n", "print(\"Test of the parameter B.\")\n", "\n", "DA=nn.create_zero()\n", "\n", "x=np.array([-0.75,0.25])\n", "y=1\n", "nn.descent(A,x,y, B=DA, pb=pb)\n", "print(\"DA after one contribution\")\n", "nn.show(DA)\n", "\n", "x=np.array([0.5,-0.2])\n", "y=-1\n", "print(\"DA after two contributions\")\n", "nn.descent(A,x,y, B=DA, pb=pb)\n", "nn.show(DA)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The method _nn.prediction_ uses the method _nn.output()_ to compute the predictions of the neural network on the point of a data set _data_ of type *nD_data* and store the result in _data.Ypred_ \n", "\n", "The method *data.show_class(pred=True)* displays these predictions.\n", "\n", "The method *nn.show_pred* compute the predictions of the neural network on a grid and displays these predictions as a heat map." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "pb = ToyPb(name = \"ring\", bounds = (-1,1), loss_name = \"softplus\")\n", "\n", "data = nD_data(n=500, pb=pb)\n", "\n", "CardNodes = (2, 3, 4, 2, 1)\n", "nn = ToyNN(card = CardNodes, coef_bounds=(-1,1,-1,1), chi=\"tanh\", grid=(-1,1,41))\n", "A=nn.create_rand()\n", "\n", "\n", "data.show_class()\n", "pb.show_border('k--')\n", "plt.axis('off')\n", "plt.title(\"Correct answer\", fontsize=15)\n", "plt.show()\n", "\n", "\n", "nn.prediction(A, data)\n", "\n", "data.show_class(pred=True)\n", "nn.show_pred(A)\n", "pb.show_border('k--')\n", "plt.title(\"predictions of a random A\", fontsize=15)\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To assess the performance of a coef-list _A_ for a given problem _pb_ on a given data set *data*, we use the method *total_loss*. It returns,\n", "$$\n", "\\dfrac1{n_d}\\sum_{i=0}^{n_d-1} \\ell\\left(h(X_i,A)\\times y_i\\right),\n", "$$\n", "where $n_d=$*data.n*, the $X_i$'s and $y_i$'s are the points and tags in _data.X_ and _data.Y_ and $\\ell$ is the function _pb.loss_.
" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "pb = ToyPb(name = \"ring\", bounds = (-1,1), loss_name = \"softplus\")\n", "\n", "data = nD_data(n=500, pb=pb)\n", "\n", "CardNodes = (2, 3, 4, 2, 1)\n", "nn = ToyNN(card = CardNodes, coef_bounds=(-1,1,-1,1), chi=\"tanh\", grid=(-1,1,41))\n", "A=nn.create_rand()\n", "\n", "error = nn.total_loss(A,data,pb=pb)\n", "print(error)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The method *nn.total_loss_and_prediction* combines the effects of *nn.total_loss* and *nn.prediction*." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "error2 = nn.total_loss_and_prediction(A,data,pb=pb)\n", "print(f\"error ={error},\\nerror2={error2}\")\n", "\n", "data.show_class(pred=True)\n", "nn.show_pred(A)\n", "pb.show_border('k--')\n", "plt.title(\"predictions of A\", fontsize=15)\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "top\n", "          \n", "          \n", "1.\n", "          \n", "          \n", "2.\n", "          \n", "          \n", "3.\n", "          \n", "          \n", "4.\n", "          \n", "          \n", "5.\n", "          \n", "          \n", "bot.\n", " " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "anaconda-cloud": {}, "colab": { "collapsed_sections": [ "GafO0zXoJ6Cx", "5l_mvC1OJ6Da", "ZzS5-IzwaKn3", "89AjhkJ2aKoB" ], "name": "ToyNN_class.ipynb", "provenance": [], "toc_visible": true }, "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.16" } }, "nbformat": 4, "nbformat_minor": 1 }