{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "21aade2b",
   "metadata": {},
   "source": [
    "# Accelerate Python with Taichi\n",
    "\n",
    "Python has become the most popular language in many rapidly evolving sectors, such as deep learning and data science. Yet its readability comes at the cost of performance. Of course, we all complain about program performance from time to time, and Python should certainly not take all the blame. Still, it's fair to say that Python's nature as an interpreted language does not help, especially in computation-intensive scenarios (e.g., when there are multiple nested for loops).\n",
    "\n",
    "This notebook is adapted from the [blog post](https://docs.taichi-lang.org/blog/accelerate-python-code-100x) by Yuanming Hu, the creator of [Taichi](https://taichi-lang.cn/). One of the most notable advantages `Taichi` delivers is speeding up Python code.\n",
    "\n",
    "To install `Taichi`, activate your environment and run the following command:\n",
    "```Prompt\n",
    "pip install taichi\n",
    "```\n",
    "\n",
    "To use `Taichi`, import the package with the following command:\n",
    "```Python\n",
    "import taichi as ti\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "878c1f23",
   "metadata": {},
   "source": [
    "## Count the number of primes\n",
    "\n",
    "Large-scale or nested for loops in Python often lead to **poor** runtime performance. The following demo counts the primes within a specified range and involves nested for loops. By decorating the functions with Taichi's decorators, and optionally switching to one of Taichi's GPU backends, you can get a significant boost in overall performance."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "013eae15",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "78498\n",
      "Execution time: 2.1988 seconds\n"
     ]
    }
   ],
   "source": [
    "\"\"\"Count the prime numbers in the range [2, n).\"\"\"\n",
    "import time\n",
    "\n",
    "# Checks if an integer n >= 2 is a prime number\n",
    "def is_prime(n: int) -> bool:\n",
    "    result = True\n",
    "\n",
    "    # Traverses the range between 2 and sqrt(n)\n",
    "    # - Returns False if n can be divided by one of them;\n",
    "    # - otherwise, returns True\n",
    "    for k in range(2, int(n ** 0.5) + 1):\n",
    "        if n % k == 0:\n",
    "            result = False\n",
    "            break\n",
    "\n",
    "    return result\n",
    "\n",
    "# Traverses the half-open range [2, n), i.e. 2 to n - 1\n",
    "# Counts the primes according to the return value of is_prime()\n",
    "def count_primes(n: int) -> int:\n",
    "    count = 0\n",
    "    for k in range(2, n):\n",
    "        if is_prime(k):\n",
    "            count += 1\n",
    "\n",
    "    return count\n",
    "\n",
    "t_start = time.perf_counter()\n",
    "print(count_primes(1000000))\n",
    "t_end = time.perf_counter()\n",
    "print(f\"Execution time: {t_end - t_start:.4f} seconds\")"
   ]
  },
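  {
   "cell_type": "markdown",
   "id": "5e1c7a90",
   "metadata": {},
   "source": [
    "As a sanity check on the count printed above, the same number can be obtained with a sieve of Eratosthenes. The following is a minimal sketch that uses only the standard library. `sieve_count` is a hypothetical helper name, not part of the original notebook or of Taichi, and it is assumed to follow the same convention as `count_primes()`, i.e. counting the primes in the half-open range $[2, n)$.\n",
    "\n",
    "```python\n",
    "# sieve_count is a hypothetical helper, used here only to cross-check count_primes()\n",
    "def sieve_count(n: int) -> int:\n",
    "    # Counts the primes in the half-open range [2, n)\n",
    "    if n < 3:\n",
    "        return 0\n",
    "\n",
    "    is_composite = bytearray(n)   # is_composite[k] == 1 marks k as composite\n",
    "    count = 0\n",
    "    for k in range(2, n):\n",
    "        if not is_composite[k]:\n",
    "            count += 1\n",
    "            # Marks the multiples of k, starting at k*k, as composite\n",
    "            for m in range(k * k, n, k):\n",
    "                is_composite[m] = 1\n",
    "\n",
    "    return count\n",
    "\n",
    "print(sieve_count(1_000_000))\n",
    "```\n",
    "\n",
    "If both implementations are correct, this prints the same count as `count_primes(1000000)`. The sieve trades memory (one byte per integer) for far fewer divisions, so it is a different algorithm rather than a faster version of the nested-loop demo."
   ]
  },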
  {
   "cell_type": "markdown",
   "id": "b841ce77",
   "metadata": {},
   "source": [
    "Now, let's change the code a bit: import `Taichi` into your Python code and initialize it with the CPU backend."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "2162cb5d",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[Taichi] version 1.7.3, llvm 15.0.1, commit 5ec301be, win, python 3.12.9\n",
      "[Taichi] Starting on arch=x64\n"
     ]
    }
   ],
   "source": [
    "import taichi as ti\n",
    "ti.init(arch=ti.cpu)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c462f853",
   "metadata": {},
   "source": [
    "Decorate `is_prime()` with `@ti.func` and `count_primes()` with `@ti.kernel`.\n",
    "\n",
    "```{note}\n",
    "Taichi's compiler compiles the Python code decorated with `@ti.kernel` and `@ti.func` to run on different devices, such as CPUs and GPUs, for high-performance computation.\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "270058f2",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "78498\n",
      "Execution time: 0.1199 seconds\n"
     ]
    }
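   ],
   "source": [
    "# A Taichi function: compiled by Taichi and callable only from kernels or other Taichi functions\n",
    "@ti.func\n",
    "def is_prime(n: int):\n",
    "    result = True\n",
    "    for k in range(2, int(n ** 0.5) + 1):\n",
    "        if n % k == 0:\n",
    "            result = False\n",
    "            break\n",
    "\n",
    "    return result\n",
    "\n",
    "# A Taichi kernel: the entry point called from Python; its outermost for loop is parallelized automatically\n",
    "@ti.kernel\n",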
    "def count_primes(n: int) -> int:\n",
    "    count = 0\n",
    "    for k in range(2, n):\n",
    "        if is_prime(k):\n",
    "            count += 1\n",
    "\n",
    "    return count\n",
    "\n",
    "t_start = time.perf_counter()\n",
    "print(count_primes(1000000))\n",
    "t_end = time.perf_counter()\n",
    "print(f\"Execution time: {t_end - t_start:.4f} seconds\")"
   ]
  },
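  {
   "cell_type": "markdown",
   "id": "9b42d6e1",
   "metadata": {},
   "source": [
    "The two timings recorded above can be turned into a speed-up factor with a one-line division. Taichi compiles a kernel just in time when it is first called, so a single timed call, as in the cell above, most likely includes compilation overhead. The sketch below therefore also shows one way to time a function after a warm-up call. `best_time` is a hypothetical helper written for this notebook, not a Taichi API, and the two timing values are simply copied from the outputs recorded in this notebook.\n",
    "\n",
    "```python\n",
    "import time\n",
    "\n",
    "# best_time is a hypothetical helper, not part of Taichi or the original notebook\n",
    "def best_time(fn, *args, warmup=1, repeats=3):\n",
    "    # Untimed warm-up calls absorb one-off costs such as JIT compilation\n",
    "    for _ in range(warmup):\n",
    "        fn(*args)\n",
    "\n",
    "    # Keeps the shortest of several timed calls\n",
    "    best = float('inf')\n",
    "    for _ in range(repeats):\n",
    "        t0 = time.perf_counter()\n",
    "        fn(*args)\n",
    "        best = min(best, time.perf_counter() - t0)\n",
    "\n",
    "    return best\n",
    "\n",
    "# Speed-up implied by the two timings recorded in this notebook\n",
    "t_python = 2.1988   # seconds, pure Python run\n",
    "t_taichi = 0.1199   # seconds, Taichi CPU run (first call of the kernel)\n",
    "speedup = t_python / t_taichi\n",
    "print(f'Speed-up: {speedup:.1f}x')\n",
    "```\n",
    "\n",
    "With the Taichi version of `count_primes()` defined, a call such as `best_time(count_primes, 1000000)` would give a timing that excludes the first-call overhead. The numbers will differ from machine to machine."
   ]
  },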
  {
   "cell_type": "markdown",
   "id": "ba029c76",
   "metadata": {},
   "source": [
    "```{admonition} Exercise\n",
    "1. Increase the argument of `count_primes()` tenfold to 10,000,000 and rerun both versions. What is the speed-up?\n",
    "2. Change Taichi's backend from CPU to GPU and rerun the code. What is the speed-up?\n",
    "```"
   ]
  },
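  {
   "cell_type": "markdown",
   "id": "07562b96",
   "metadata": {},
   "source": [
    "## 2D Diffusion\n",
    "\n",
    "The second example solves heat conduction in a rectangular plate with a finite volume method. The plate receives a heat flux at its west boundary, exchanges heat by convection along its southern edge, and is held at a constant temperature at its northern edge. The solution is iterated until the summed absolute temperature change between two successive sweeps drops below a tolerance."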
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e536f17f",
   "metadata": {},
   "source": [
    "Import the required libraries and initialize Taichi on the CPU backend with 64-bit floating point as the default precision."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "cdb9bdc3",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[Taichi] Starting on arch=x64\n"
     ]
    }
   ],
   "source": [
    "# Import the required libraries\n",
    "import time\n",
    "import taichi as ti\n",
    "ti.init(arch=ti.cpu,\n",
    "        default_fp=ti.f64)"
   ]
  },
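  {
   "cell_type": "markdown",
   "id": "f161b210",
   "metadata": {},
   "source": [
    "Isolate the code responsible for heavy computation (the loops over the grid) and enclose it in a Taichi kernel. The kernel refers to `nx`, `ny`, `T`, `Told`, and the physical parameters as global variables, so they must be defined before the kernel is first called."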
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "3184be5f",
   "metadata": {},
   "outputs": [],
   "source": [
    "@ti.kernel\n",
    "def fvm_iteration() -> ti.f64:\n",
    "    # copy the current temperature field to the placeholder field\n",
    "    for i, j in ti.ndrange(nx, ny):\n",
    "        Told[i, j] = T[i, j]\n",
    "\n",
    "    # loop over the grid points\n",
    "    for i, j in ti.ndrange(nx, ny):\n",
    "        # left-bottom corner\n",
    "        if i == 0 and j == 0:\n",
    "            T[i, j] = ((k*area/dx)*Told[i+1, j] + ((k*area/dx))*Told[i, j+1] + q*area + area*Tinf/(1/h + dx/(2*k))) / (2*k*area/dx + area/(1/h + dx/(2*k)))\n",
    "        # right-bottom corner\n",
    "        elif i == nx-1 and j == 0:\n",
    "            T[i, j] = ((k*area/dx)*Told[i-1, j] + (k*area/dx)*Told[i, j+1] + area/(1/h + dx/(2*k))*Tinf) / (2*k*area/dx + area/(1/h + dx/(2*k)))\n",
    "        # left-top corner\n",
    "        elif i == 0 and j == ny-1:\n",
    "            T[i, j] = ((k*area/dx)*Told[i+1, j] + (k*area/dx)*Told[i, j-1] + (q*area + 2*k*area/dx*Tn)) / (4*k*area/dx)\n",
    "        # right-top corner\n",
    "        elif i == nx-1 and j == ny-1:\n",
    "            T[i, j] = ((k*area/dx)*Told[i-1, j] + (k*area/dx)*Told[i, j-1] + (2*k*area/dx*Tn)) / (4*k*area/dx)\n",
    "        # left boundary\n",
    "        elif i == 0:\n",
    "            T[i, j] = ((k*area/dx)*Told[i+1, j] + (k*area/dx)*Told[i, j-1] + (k*area/dx)*Told[i, j+1] + q*area) / (3*k*area/dx)\n",
    "        # right boundary\n",
    "        elif i == nx-1:\n",
    "            T[i, j] = ((k*area/dx)*Told[i-1, j] + (k*area/dx)*Told[i, j-1] + (k*area/dx)*Told[i, j+1]) / (3*k*area/dx)\n",
    "        # bottom boundary\n",
    "        elif j == 0:\n",
    "            T[i, j] = ((k*area/dx)*Told[i-1, j] + (k*area/dx)*Told[i+1, j] + (k*area/dx)*Told[i, j+1] + (area*Tinf/(1/h + dx/(2*k)))) / (3*k*area/dx + area/(1/h + dx/(2*k)))\n",
    "        # top boundary\n",
    "        elif j == ny-1:\n",
    "            T[i, j] = ((k*area/dx)*Told[i-1, j] + (k*area/dx)*Told[i+1, j] + (k*area/dx)*Told[i, j-1] + (2*k*area/dx*Tn)) / (5*k*area/dx)\n",
    "        # internal nodes\n",
    "        else:\n",
    "            T[i, j] = 0.25 * (Told[i-1, j] + Told[i+1, j] + Told[i, j-1] + Told[i, j+1])\n",
    "    \n",
    "    # calculate the temperature difference\n",
    "    Tdiff = 0.0\n",
    "    for i, j in ti.ndrange(nx, ny):\n",
    "        Tdiff += ti.abs(T[i, j] - Told[i, j])\n",
    "\n",
    "    return Tdiff"
   ]
  },
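  {
   "cell_type": "markdown",
   "id": "c70e3f58",
   "metadata": {},
   "source": [
    "The kernel above follows a copy, update, and measure pattern: copy `T` to `Told`, recompute every node from the old values, then sum the absolute changes. The plain-Python sketch below shows the same pattern on a deliberately simplified problem. It is an illustration only: `jacobi_sweep` and `solve_plate` are hypothetical names, and all four edges are assumed to be held at fixed temperatures, unlike the flux and convection boundaries handled in `fvm_iteration()`.\n",
    "\n",
    "```python\n",
    "# jacobi_sweep and solve_plate are hypothetical names for this simplified sketch.\n",
    "# Assumption: all edge nodes keep their initial values (fixed-temperature boundaries).\n",
    "def jacobi_sweep(T):\n",
    "    # Copies the current field so that every update reads values from the previous sweep\n",
    "    T_old = [row[:] for row in T]\n",
    "    diff = 0.0\n",
    "    for i in range(1, len(T) - 1):\n",
    "        for j in range(1, len(T[0]) - 1):\n",
    "            # Interior node: average of the four neighbours\n",
    "            T[i][j] = 0.25 * (T_old[i-1][j] + T_old[i+1][j] + T_old[i][j-1] + T_old[i][j+1])\n",
    "            diff += abs(T[i][j] - T_old[i][j])\n",
    "\n",
    "    return diff\n",
    "\n",
    "def solve_plate(T, tol=1e-3, max_iter=10_000):\n",
    "    # Repeats sweeps until the summed absolute change no longer exceeds tol\n",
    "    for count in range(1, max_iter + 1):\n",
    "        if jacobi_sweep(T) <= tol:\n",
    "            return count\n",
    "\n",
    "    return max_iter\n",
    "\n",
    "# 5 x 5 grid: edges fixed at 100, interior initialized to 0\n",
    "n = 5\n",
    "T = [[100.0 if i in (0, n - 1) or j in (0, n - 1) else 0.0 for j in range(n)] for i in range(n)]\n",
    "iterations = solve_plate(T)\n",
    "print(iterations, round(T[n // 2][n // 2], 2))\n",
    "```\n",
    "\n",
    "With every edge at the same temperature, the interior should relax towards that temperature, which makes the result easy to check by eye. The nested Python loops here are exactly the kind of code that the Taichi kernel replaces."
   ]
  },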
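  {
   "cell_type": "markdown",
   "id": "fbbd3997",
   "metadata": {},
   "source": [
    "The main routine declares the parameters, allocates the Taichi fields, and calls the kernel repeatedly until the temperature difference drops below the tolerance."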
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "5d3882e5",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Iteration 10: Tdiff = 71.3782\n",
      "Iteration 20: Tdiff = 39.7494\n",
      "Iteration 30: Tdiff = 22.2378\n",
      "Iteration 40: Tdiff = 12.4416\n",
      "Iteration 50: Tdiff = 6.9609\n",
      "Iteration 60: Tdiff = 3.8945\n",
      "Iteration 70: Tdiff = 2.1789\n",
      "Iteration 80: Tdiff = 1.2190\n",
      "Iteration 90: Tdiff = 0.6820\n",
      "Iteration 100: Tdiff = 0.3816\n",
      "Iteration 110: Tdiff = 0.2135\n",
      "Iteration 120: Tdiff = 0.1194\n",
      "Iteration 130: Tdiff = 0.0668\n",
      "Iteration 140: Tdiff = 0.0374\n",
      "Iteration 150: Tdiff = 0.0209\n",
      "Iteration 160: Tdiff = 0.0117\n",
      "Iteration 170: Tdiff = 0.0065\n",
      "Iteration 180: Tdiff = 0.0037\n",
      "Iteration 190: Tdiff = 0.0020\n",
      "Iteration 200: Tdiff = 0.0011\n",
      "******************************************\n",
      "Final temperature difference: 0.0010\n",
      "Number of iterations: 203\n",
      "Elapsed time: 0.107 seconds\n",
      "The temperature at the plate center is 193.1574 degree Celsius.\n"
     ]
    }
   ],
   "source": [
    "# Parameter declarations\n",
    "lx = 0.3                                # length of the plate\n",
    "ly = 0.4                                # height of the plate\n",
    "nx = 3                                  # number of grid points in x-direction\n",
    "ny = round(ly/lx*nx)                    # number of grid points in y-direction\n",
    "dx = lx/nx                              # grid spacing in x-direction\n",
    "dy = ly/ny                              # grid spacing in y-direction\n",
    "thickness = 0.01                        # plate thickness\n",
    "area = thickness*dx                     # flux area\n",
    "\n",
    "k = 1000                                # coefficient for heat conduction\n",
    "q = 500000                              # heat flux at the west boundary\n",
    "Tinf = 200                              # ambient temperature in the south\n",
    "h = 253.165                             # convective heat transfer coefficient at the southern edge\n",
    "Tn = 100                                # constant temperature at the northern edge\n",
    "\n",
    "# Set initial condition (Note the order of nx and ny)\n",
    "T = ti.field(dtype=ti.f64, shape=(nx, ny))    # a Taichi field with all elements equal to zero\n",
    "\n",
    "# Finite volume calculations\n",
    "Told = ti.field(dtype=ti.f64, shape=(nx, ny)) # placeholder field to advance the solution\n",
    "Tdiff = 1                               # temperature difference for convergence\n",
    "cnt = 0                                 # counter for the number of iterations\n",
    "t_start = time.perf_counter()           # start time for the simulation\n",
    "\n",
    "while Tdiff > 1e-3:                     # loop until the difference is less than 1e-3\n",
    "    cnt += 1                            # increment the counter\n",
    "    Tdiff = fvm_iteration()             # calculate the temperature difference\n",
    "\n",
    "    if cnt % 10 == 0:                   # print every 10 iterations\n",
    "        print('Iteration {}: Tdiff = {:.4f}'.format(cnt, Tdiff))\n",
    "\n",
    "# Stop the timer and print the iteration results\n",
    "t_end = time.perf_counter()\n",
    "print('******************************************')\n",
    "print('Final temperature difference: {:.4f}'.format(Tdiff))\n",
    "print('Number of iterations: {}'.format(cnt))\n",
    "print('Elapsed time: {:.3f} seconds'.format(t_end - t_start))\n",
    "print('The temperature at the plate center is {:.4f} degree Celsius.'.format(0.5*(T[nx//2, ny//2] + T[nx//2, ny//2-1])))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6c83d577",
   "metadata": {},
   "source": [
    "## Exercise\n",
    "\n",
    "1. Explore the [Taichi website](https://taichi-lang.cn/)."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "cfd2025",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.9"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}