{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# MolStandardize"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This notebook demonstrates how to use the __rdMolStandardize__ module within RDKit. Its structure and capabilities remain largely the same as those of MolVS (https://molvs.readthedocs.io/en/latest/), with the added ability to pass user-supplied lists to some of the standardization tools."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [],
   "source": [
    "import rdkit\n",
    "from rdkit import Chem\n",
    "from rdkit.Chem.Draw import IPythonConsole\n",
    "from rdkit.Chem.MolStandardize import rdMolStandardize"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The __rdMolStandardize__ module contains the following convenience functions for quickly performing common standardization tasks:\n",
    "-  rdMolStandardize.StandardizeSmiles()\n",
    "-  rdMolStandardize.ValidateSmiles()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Other functions within the module:\n",
    "-  rdMolStandardize.Cleanup()\n",
    "-  rdMolStandardize.ChargeParent()\n",
    "-  rdMolStandardize.FragmentParent()\n",
    "\n",
    "__rdMolStandardize.Cleanup()__ is equivalent to the __molvs.standardize.Standardizer().standardize()__ function in MolVS."
   ]
  },
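  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a minimal sketch of the two parent functions listed above: the input SMILES below is an illustrative assumption and does not come from the original notebook. __FragmentParent()__ is expected to keep the largest fragment of the molecule, and __ChargeParent()__ to return the uncharged form of that fragment where possible. No outputs are shown because the cell has not been executed here."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from rdkit import Chem\n",
    "from rdkit.Chem.MolStandardize import rdMolStandardize\n",
    "\n",
    "# Illustrative input (an assumption): sodium benzoate written as two charged fragments\n",
    "salt = Chem.MolFromSmiles(\"[Na+].O=C([O-])c1ccccc1\")\n",
    "\n",
    "# Fragment parent: keeps the largest fragment, so the counter-ion should be dropped\n",
    "fragment_parent = rdMolStandardize.FragmentParent(salt)\n",
    "\n",
    "# Charge parent: largest fragment, then neutralized where possible\n",
    "charge_parent = rdMolStandardize.ChargeParent(salt)\n",
    "\n",
    "print(Chem.MolToSmiles(fragment_parent))\n",
    "print(Chem.MolToSmiles(charge_parent))"
   ]
  },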
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The __rdMolStandardize__ module also contains the following classes, which let you build a custom standardization process:\n",
    "-  rdMolStandardize.CleanupParameters\n",
    "-  rdMolStandardize.Normalizer\n",
    "-  rdMolStandardize.MetalDisconnector\n",
    "-  rdMolStandardize.FragmentParent\n",
    "-  rdMolStandardize.FragmentRemover\n",
    "-  rdMolStandardize.Reionizer\n",
    "-  rdMolStandardize.Uncharger\n",
    "-  rdMolStandardize.RDKitValidation\n",
    "-  rdMolStandardize.MolVSValidation\n",
    "-  rdMolStandardize.AllowedAtomsValidation\n",
    "-  rdMolStandardize.DisallowedAtomsValidation\n",
    "\n",
    "TODO\n",
    "-  rdMolStandardize.tautomer "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## rdMolStandardize.StandardizeSmiles()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The __rdMolStandardize.StandardizeSmiles()__ convenience function applies sensible default settings to help you get started."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'O=C([O-])c1ccc(C[S](=O)=O)cc1.[Na+]'"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sm = \"[Na]OC(=O)c1ccc(C[S+2]([O-])([O-]))cc1\"\n",
    "rdMolStandardize.StandardizeSmiles(sm)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The __rdMolStandardize.StandardizeSmiles()__ function performs the following steps."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'O=C([O-])c1ccc(C[S](=O)=O)cc1.[Na+]'"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "mol = Chem.MolFromSmiles(sm)\n",
    "Chem.SanitizeMol(mol)\n",
    "mol = Chem.RemoveHs(mol)\n",
    "mol = rdMolStandardize.MetalDisconnector().Disconnect(mol)\n",
    "mol = rdMolStandardize.Normalize(mol)\n",
    "mol = rdMolStandardize.Reionize(mol)\n",
    "Chem.AssignStereochemistry(mol, force=True, cleanIt=True)\n",
    "Chem.MolToSmiles(mol)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## rdMolStandardize.ValidateSmiles()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The __rdMolStandardize.ValidateSmiles()__ function is a convenience function that quickly validates a single SMILES string. It uses the __rdMolStandardize.MolVSValidation()__ class with its default settings, which run all validations."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "RDKit INFO: [15:43:30] INFO: [FragmentValidation] 1,2-dichloroethane is present\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "['INFO: [FragmentValidation] 1,2-dichloroethane is present']"
      ]
     },
     "execution_count": 33,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "rdMolStandardize.ValidateSmiles(\"ClCCCl.c1ccccc1O\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## rdMolStandardize validation methods"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "There are four ways to validate molecules, each provided by its own class:\n",
    "-  rdMolStandardize.RDKitValidation\n",
    "-  rdMolStandardize.MolVSValidation\n",
    "-  rdMolStandardize.AllowedAtomsValidation\n",
    "-  rdMolStandardize.DisallowedAtomsValidation\n",
    "\n",
    "All the classes have a __validate()__ method to perform the validation."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The __rdMolStandardize.RDKitValidation__ class validates the valency of every atom in the input molecule (much like RDKit automatically does)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "RDKit ERROR: [15:41:32] Explicit valence for atom # 1 O, 3, is greater than permitted\n",
      "RDKit INFO: [15:41:32] INFO: [ValenceValidation] Explicit valence for atom # 1 O, 3, is greater than permitted\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "['INFO: [ValenceValidation] Explicit valence for atom # 1 O, 3, is greater than permitted']"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "vm = rdMolStandardize.RDKitValidation()\n",
    "mol = Chem.MolFromSmiles(\"CO(C)C\", sanitize=False)\n",
    "vm.validate(mol)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The __rdMolStandardize.MolVSValidation__ class performs validations similar to those of standalone MolVS (https://molvs.readthedocs.io/en/latest/api.html#molvs-validate).\n",
    "\n",
    "The default is to run all validations:\n",
    "-  NoAtomValidation\n",
    "-  FragmentValidation\n",
    "-  NeutralValidation\n",
    "-  IsotopeValidation\n",
    "\n",
    "Using the __rdMolStandardize.MolVSValidation__ class rather than the convenience function __rdMolStandardize.ValidateSmiles()__ provides more flexibility when working with multiple molecules or when a custom validation list is required."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "RDKit INFO: [15:41:32] INFO: [FragmentValidation] water/hydroxide is present\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "['INFO: [FragmentValidation] water/hydroxide is present']"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "vm = rdMolStandardize.MolVSValidation()\n",
    "mol = Chem.MolFromSmiles(\"COc1cccc(C=N[N-]C(N)=O)c1[O-].O.O.O.O=[U+2]=O\")\n",
    "vm.validate(mol)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Alternatively, you can specify the subset of MolVS validations you want when initializing the class."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[]"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "validations = [rdMolStandardize.NoAtomValidation()]\n",
    "vm = rdMolStandardize.MolVSValidation(validations)\n",
    "mol = Chem.MolFromSmiles(\"COc1cccc(C=N[N-]C(N)=O)c1[O-].O.O.O.O=[U+2]=O\")\n",
    "vm.validate(mol)"
   ]
  },
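  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The flexibility for working with multiple molecules mentioned earlier can be sketched as follows: one validator object is built once and reused for each molecule. The list of SMILES strings is an illustrative assumption. No outputs are shown because the cell has not been executed here."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from rdkit import Chem\n",
    "from rdkit.Chem.MolStandardize import rdMolStandardize\n",
    "\n",
    "# Build the validator once and reuse it for every molecule\n",
    "vm = rdMolStandardize.MolVSValidation()\n",
    "\n",
    "# Illustrative inputs (an assumption); any list of SMILES strings would do\n",
    "smiles_list = [\"ClCCCl.c1ccccc1O\", \"CC(=O)CF\"]\n",
    "\n",
    "for smi in smiles_list:\n",
    "    mol = Chem.MolFromSmiles(smi)\n",
    "    messages = vm.validate(mol)\n",
    "    # Print the SMILES together with whatever messages the validator returned\n",
    "    print(smi, list(messages))"
   ]
  },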
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The __rdMolStandardize.AllowedAtomsValidation__ class takes a list of allowed atoms; any atom in the molecule that is not on the list is reported in the returned messages.\n",
    "\n",
    "The __rdMolStandardize.DisallowedAtomsValidation__ class takes a list of disallowed atoms; a molecule is deemed acceptable as long as it contains none of them."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "RDKit INFO: [15:41:32] INFO: [AllowedAtomsValidation] Atom F is not in allowedAtoms list\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "['INFO: [AllowedAtomsValidation] Atom F is not in allowedAtoms list']"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from rdkit.Chem.rdchem import Atom\n",
    "atomic_no = [6, 7, 8]\n",
    "allowed_atoms = [Atom(i) for i in atomic_no]\n",
    "vm = rdMolStandardize.AllowedAtomsValidation(allowed_atoms)\n",
    "mol = Chem.MolFromSmiles(\"CC(=O)CF\")\n",
    "vm.validate(mol)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "RDKit INFO: [15:41:32] INFO: [DisallowedAtomsValidation] Atom F is in disallowedAtoms list\n"
     ]
    }
   ],
   "source": [
    "atomic_no = [9, 17, 35]\n",
    "disallowed_atoms = [Atom(i) for i in atomic_no]\n",
    "vm = rdMolStandardize.DisallowedAtomsValidation(disallowed_atoms)\n",
    "mol = Chem.MolFromSmiles(\"CC(=O)CF\")\n",
    "msg = vm.validate(mol)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## rdMolStandardize.Cleanup()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Whilst __rdMolStandardize.StandardizeSmiles()__ provides a quick and easy way to get a standardized version of a SMILES string, it is inefficient when dealing with multiple molecules and doesn't allow customization of the standardization process.\n",
    "\n",
    "The __Cleanup__ function provides flexibility to specify custom cleanup files and parameters."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAcIAAACWCAYAAABNcIgQAAAgD0lEQVR4nO3deVjU5f7/8Se74nJQTAVnGEA0STkuuKUllmlqm6VWopZpxxI1t2OlldvxuOWCucFP8Rw9muvplKVmnvxipkEgaFipJAKDAy6JKeAgy+f3xxxIYlBQmM8M835cV5czc98Mrym73nN/PvfioCiKghBCCGGnHNUOIIQQQqhJCqEQQgi7JoVQCCGEXZNCKIQQwq5JIRRCCGHXpBAKIYSwa1IIhRBC2DUphEIIIeyaFEIhhBB2TQqhEEIIuyaFUAghhF2TQiiEEMKuSSEUQghh16QQCiGEsGtSCIUQQtg1KYRCCCHsmhRCIYQQdk0KoRBCCLsmhVAIIYRdk0IohBDCrkkhFEIIYdekEAohhLBrUgiFEELYNSmEQggh7JoUQiGEEHZNCqEQQgi7JoVQCCGEXXNWO4AQorz8/Hz0ej0ZGRmkp6eTnp5ORkYGHTp0YPjw4TRo0EDtiELUGg6KoihqhxDCnhQXF5OVlUVaWprZYqfX68nKygLA3d0dnU6HRqNBq9USGxvLn//8Zz7++GOVP4UQtYcUQiGq2c2bN8nMzCQlJYWUlBQMBkOZ53q9noKCAlxcXGjSpAne3t74+/vj5eVV+rjkuZeXFw4ODqXv/csvvxAcHMzy5csZM2aMip9SiNpDCqEQVWA0GjEYDOWKW8nz5ORkrl+/DkCjRo3MFreS5zqdDicnpypn2LlzJ6+++ioxMTG0b9++uj+iEHZHCqEQZnz33XccPnwYvV5Peno6er0evV7P1atXAfDw8ECj0aDT6dBqtWg0Gnx8fPDx8UGj0aDRaHBzc6uxfGPGjCEmJoa4uDjc3d1r7PcIYQ+kEArxB59//jkffPABnp6eaLXaMvfotFotPj4+qk9WMRqNdOvWjS5durBhwwZVswhh66QQCvEHffr0ISgoiPDwcLWj3NGPP/5I165diYyMZMSIEWrHEcJmSSEU4janTp2iffv2/Pzzz7Ru3VrtOHe1fv16pk6dSlxcHG3atFE7jhA2SQqhELd5/fXXuXTpEnv27FE7CgBbtmxBr9czY8aMCvuMHDmSkydPEhsbS926dS2YTojaQXaWqe22bYMOHaBOHWjWDMaOhf9N+BBlXb16lW3btjFp0iS1o5TSarXMnj2br7/+usI+69at49atW7zzzjsWTCZE7SGFsDaLjISwMHj3Xbh4EaKjQa+HJ56A/Hy101mdiIgI/Pz8ePzxx9WOUiokJIQZM2YwfPjw0kX2f1S/fn127tzJhg0b+OSTTyycUAjbJ5dGayujEby9YelSGD3699fz8qBlS5g71zQ6FAAUFhbi5+fHrFmz+Mtf/mK2z969e+nTpw916tSxaLbi4mL69euHi4sLe/fuxdHR/PfXjz76iNmzZ5OQkICfn59FMwphy2REWFvFx0N2NgwdWvZ1d3d49lk4eFCdXFZq9+7d5ObmEhoaarb9p59+4tlnnyU1NdWywQBHR0e2bt1KYmIiS5curbDfW2+9xWOPPcbLL7/MrVu3LJhQCNsmhbC2unIF6tUDc+vdvLxM7aLUypUreeONN6hXr16F7QMGDFBtZmazZs34xz/+wQcffMDRo0cr7Ldx40YuXbrEBx98YMF0Qtg2KYS1VZMmkJsLN26Ub8vMNLULAOLj44mPjycsLMxse3Z2Nlu3blV9Es2AAQOYNGkSw4YN49dffzXbx8PDgx07dhAeHs7nn39u4YRC2CYphLVVcDB4eMCuXWVfz8uDPXugTx9VYlmj5cuXM3jwYLRardn2yMhINBoNTzzxhIWTlbdgwQK0Wi2jRo2iotv7Xbt2Zd68eYwZM4YLFy5YOKEQtkcmy9Rma9fCe+9BRAT0728aCU6ZAgYDxMaallTYOYPBgJ+fH9HR0Tz88MPl2gsLC/H39+e9997jjTfeUCFheXq9ng4dOjBnzhwmTpxo
to+iKDz33HP89ttvHDp06J429xbCXsiIsDYLC4PVq2HBAmjaFB59FFq0gEOHyhdBvV6djCpbu3YtQUFBZosgwCeffEJOTo5VbWGm1WrZtGkTb7/9NgkJCWb7ODg4EBUVxS+//MLf/vY3CycUwrbIiFBAWhq0aQMHDkCvXmqnsZj8/Hx0Oh3Lli1j+PDhZvv07NmTRx55hMWLF1s43d1NmDCBL7/8koSEBBo2bGi2z+HDh+nbty/79u2ziku7QlgjGREK0Olg2jQIDbWr2aRbt27F0dGRoX9cYvI/CQkJxMbG8uabb1o4WeUsW7aMP/3pTxWuewTTgvyZM2cyYsSIChfkC2HvZEQoTIqKoF8/cHODvXvhtlPRa6sOHTowePDgCpcajBw5EqPRyK4/TjiyIiUn1i9btozXX3/dbJ+SBfnOzs7s27evwgX5Qtgr+T9CmDg5waZNEBcHy5ernabGRUdHc/r0acZWsLvOpUuX2LVrl+pLJu4mICCA9evXM3HiRE6ePGm2T8mC/JMnT/Lhhx9aOKEQ1k8KoT3JzITXXoObN823azSmYjhzJnz3nWWzWdjKlSsJDQ2lWbNmZtvXrFlD27ZteeSRRyycrOpefPFFQkNDCQ0NJS8vz2yfkgX5s2bNuuOCfCHskVwatSe5udCli2lCTERExf2mTYPduyExERo3tlw+C0lNTSUgIIDY2FiCg4PLtZdMovnwww8ZOXKkCgmrrrIn1r/99tts376dxMREPD09LZhQCOslhdDenDoFXbvChg2myTHmFBRASIhp0+7duy2bzwKmTp1KQkIC0dHRZtv/+c9/8s4775CWlmbxDbbvR2VOrC8sLCQkJITGjRuzZ88eHOzgXrAQdyOXRu1Nu3ame4BvvAFnz5rv4+IC27fD//2faVF+LZKTk8PGjRvveO9vzZo1hIWF2VQRBGjbti3h4eGMGzeO06dPm+3j7OzM9u3b+e6771i1apWFEwphnWREaK+GD4czZ+DoUdNMUXN274aRI+HYMejY0bL5asjq1atZunQp586dM7vbyjfffEPfvn1JS0ujefPmKiS8f5U5sf6LL75g6NChHD16lE6dOlk44d0VFBRw4cIF9Ho9QUFBeHh4qB1J1GJSCO3VjRvQuTMMHAgrVlTcb9w4+O9/4fhxqGDRtq1QFIWHHnqI119/nWnTppntM3jwYOrXr8+mTZssnK765OTk0LlzZ/r27XvHUd/EiRPZv3//HRfk1wRFUcjKykKv15ORkYFeryctLa30cXp6OllZWRQXF+Pm5oZGo+HTTz+lXbt2Fsso7IsUQnsWHw+PPGK6DDpokPk+RiP06AGtW5v62bB9+/YxdOhQ9Ho9jc1MAkpLSyMgIIBjx47RpUsXFRJWnx9++IHu3buzZcsWXnjhBbN98vPz6dGjBwEBAezYsaPafvfNmzfJzMwkJSUFg8FQ+rjkeVpaGrm5uQA0atQIf39//P398fLywtvbu8xjX19fxo4dy5EjR4iPj6eBuWPFhLhPUgjt3fLlMH8+JCSAr6/5PqdPQ5cuFK9aheOoUZZMV62efPJJAgICWLNmjdn26dOnExMTw5EjRyycrGZU5sT6yizIv11+fj4XLlwoU+Buf3zu3DmuXbsGmIqcueJWUvh8fHxwdna+6+80Go10796d4OBgoqKiqvTvQIjKkEJo7xQFnn8esrLgyBHTRBkzsrdv56kZM4jat4/AwEALh7x/Z86coW3btiQlJZnNn5eXh1arJTIykiFDhqiQsGa88MILXLhwgSNHjuDq6mq2z86dO3n11VeJiYnBx8enzOjtj8UuNTWV4uJi6tSpY7bAlTxu3bp1tY7eSmbERkRE2MySFmE7pBAKyM42TYYZPhz+/vcKu40aNYr4+Hi+//573N3dLRjw/r355pukp6ezb98+s+1r165l8eLFnDt3rlKjFFtx7do1OnbsyIsvvnjHjcMHDRpEfHw8Fy5cKL0vp9Fo8PHxwcfHB41G
g1arRafTodFoVJm8EhUVxaRJk4iPj6dNmzYW//2iFlOEUBRF+e47RXFzU5QDByrskpOTowQGBipjx461YLD7d/XqVaVevXrKl19+aba9uLhYCQwMVJYsWWLhZJZx7NgxxdXVVUlKSqqwT5cuXZSJEycqmZmZFkxWdSNGjFCCgoKUvLw8taOIWkQKofjd/PmK0rSpoly4UGGXpKQkpW7dusqWLVssGOz+LF68WGndurVSVFRktn3//v2Ku7u7cuXKFQsns5wzZ85U2PbNN98orq6uisFgsGCie3Pjxg2lTZs2yvjx49WOImoRuTQqfldcjPLUU/xXq6VPRESFpxREREQwffp04uPjefDBBy0csmqKiopo1aoV06dPZ9y4cWb7DBgwAF9fX9atW2fhdNZhyJAhuLu7s3nzZrWjVEpSUhLdunUjKiqKYcOGqR1H1AZqV2JhXS5mZSnNmzdX5s2bd8d+oaGhSnBwsGI0Gi2U7N7s2rVL8fDwUG7cuGG2/ezZs4qjo6Pyww8/WDiZdUhLS1OcnZ2V77//Xu0oVbJq1SrFw8NDSUlJUTuKqAVkRCjKiY6Opl+/fnz55Zc8/vjjZvvk5OQQHBzMwIEDWXGnBfkqe/TRR+nevXuFxw+NHz+eX375hQMHDlg4mXV4++23OXbsGN9++63aUaps8ODB6PV6vv322wpnxApRKWpXYmGd3nvvPaVFixbK5cuXK+wTFxenuLm5Kf/5z38sF6wKEhISFCcnpwpHDdnZ2Ur9+vWVvXv3WjiZdcjNzVU8PT2VnTt3qh3lnmRnZyt+fn7KX//6V7WjCBsnm24Ls+bOncuDDz7IK6+8glLBRYPOnTuzcOFCRo8eTWpqqmUDVkJ4eDiDBg2qcDF5VFQUXl5e9O/f38LJrMPmzZupU6cOgyraVcjKeXh4sGPHDlatWsVnn32mdhxhw6QQCrOcnJzYtGkTcXFxLFu2rMJ+kydPplevXrz88ssUFBRYMGHFsrOziY6OZufOnXc8ZSIxMZGJEydWOCmotlu7di0TJ07EpYJNFGxBly5dWLFiER4ea7h1K0PtOMJGyT1CcUf79+9n0KBBREdH8/DDD5vtk52dTceOHQkNDWXBggU1mufmzZukp6eXbticlpZWbvPmkn0sW7RoQXR0NAEBARW+n6Iodnkm31dffcWgQYPQ6/W14IBehXPnBlFYmE3r1odwcKg9GyIIy5BCKO5q2rRp7N69m8TERLObVQPExsYSEhLCnj176Nev3z3/ruzs7Grb4mvcuHEkJSURGxtrc2cL1rSnnnoKrVZLRESE2lGqRWHhVX7+uSOenq/i7T1P7TjCxkghFHdVUFBASEgI3t7e7L7DifWrV68mMDCQPn36mG3Pzs42expByfP09HQKCwtxdXXF09OzzAbN5ord3ZQcR/Tkk0+ycuXKe/78tU1ycjJt2rThxIkTBAUFqR2n2uTkfMPZs08QELCXhg37qh1H2BAphKJS0tPT6dixI/PmzWP8+PHl2o1GIwaDocKjd9LT08nJyQF+P3rnj6cRlDzX6XRmD829F8ePH6dnz55s27aN559/vlre09ZNmDCBs2fP8tVXX6kdpdplZs7j8uW1BAYm4uLipXYcYSOkEIpK+/e//82IESPKnGoeGxvL+PHjOX78OACenp5otdoyGzRrNBp0Oh1arRZvb2+LT84IDw9n7ty5JCYm4lvRUVN24vr162i1Wj7++GOeeuoptePUgGKSk59EUYpo1eogDg7V84VK1G5SCEWVhIWFcfDgQY4fP07Dhg15+OGHeeihh5g+fTo6nY66deuqHbEcRVF44YUXMBgMdzyOyB4sX76cdevWcebMmVo7W7ag4CI//9yBpk3fonnzGWrHETZACqGokpJDUkeOHEnv3r3p3r07KSkpaLVataPdUXZ2Np06deKll15i0aJFasdRRXFxMa1atWLy5MlMnDhR7Tg16saNQyQnD6B166+pX/8RteMIKyeFUFTZ
lStX8PT0ZMSIERQVFbF9+3a1I1VKbGwsvXr1Yvfu3TzzzDNqx7G4Tz/9lFdeeYWMjAwaNmyodpwad+HCDK5e3UJg4AmcnW19iYioSVIIxT0xGAz4+fndcX2hNVq4cCErVqzgxIkTlZp5Wps89thjdOjQwar3hq1OilLI2bO9cXLyICDgc8D+1ouKyqmdNwlEjVu7di1BQUE2VQQB3nnnHTp16kRoaChFRUVqx7GYU6dOceTIESZMmKB2FItxcHDGz28bubkxXLpkQ8tntm2DDh2gTh1o1gzGjoWrV9VOVatJIRRVlp+fz4YNG5gyZYraUarM0dGRf/3rXyQnJzN//ny141jM8uXLefrpp2nZsqXaUSzK1VWLr+8mMjLeITc3Ru04dxcZCWFh8O67cPEiREeDXg9PPAH5+Wqnq7Xk0qioso0bN/L++++TmppqszMwDx8+TN++fdm/f3+FGwDUFpcvX8bHx4d9+/bx2GOPqR1HFXr9JJydm+Dl9YHaUSpmNIK3NyxdCqNH//56Xh60bAlz55pGh6LayYhQVNlHH33EuHHjbLYIAoSEhDBjxgyGDx9OVlaW2nFqVGRkJAEBAfTu3VvtKKrRasOtuwgCxMdDdjYMHVr2dXd3ePZZOHhQnVx2QAqhqJLo6GhOnz7N2FrwzXT27Nm0bduW1157jeLiYrXj1IiCggIiIyOZPHmyXW4u/jvr+uwFBRcp+jkRbj8Q+soVqFcPGjQo/wNeXqZ2USOkEIoqWblyJcOGDaNZs2ZqR7lvjo6ObN68mfj4eJYuXap2nBqxa9cujEYjoaGhakexGrm5MZw504vExAacONGIM2d6cuPG4Sq9h6LcIj19AqdOtSQx0Z2ffmrPtWt7Kvn74/jhh+YULJwMb74JJXenmjSB3Fy4caP8D2VmmtojIsDBwfSPHY/wq50FDwEWNi41NVVxcnJS4uLi1I5Srfbt26e4uroq3377rdpRql23bt2UmTNnqh3DahQV3VROnPBUMjMXKoWFvykFBb8q167tVW7cMP/f/vLl/6dcvBhe7vXCwutKevoEJTc3TikouKhcuhShHD/uphiNyXfNUFxsVPLzU5Xi4ltlG/LyFMXDQ1Giosq+npurKM2bK8q6dZX+nKJqZLKMqLRp06YRHx/P4cNV+/ZsC6ZPn86OHTtITEysBefzmcTExNCrVy9SUlLQaDRqx7EKRuNpfvwxkI4db+LoePejubKyFlFcnIO3991nGJ861ZoWLebTqNGL9x5w7Vp47z3TyK9/f9NIcMoUMBggNta0pEJUO7k0KiolJyeHjRs33vHEd1u2cOFCtFoto0aNorZ8N1y5ciVDhgyRIngbV1dfXFxakJo6iuvXD1BUlF0t71tQkMWtW6nUrVv2WKuiomwuXVrN9euVnOgSFgarV8OCBdC0KTz6KLRoAYcOSRGsQTIiFJUSFRHB3xYt4ty5c9V2RJK10ev1dOjQgTlz5tj8XpwGgwFfX1+OHDlCt27d1I5jVW7dSiMrawnXr3/FrVupNGzYFx+fSFxdTfvlnjs3mJwc01WP4uKbgIKjozsA7u6dadXqyzLvpyj5JCf3p06dtvj4rC7TdubMI+TkHAWgVav9NGzYv4Y/nbgXUgjF3SkKRe3akTphAi3HjVM7TY364osvGDp0aJmjptRSWFhIZmYm6enppKenk5GRgV6vJy0tDb1ez/z58xk4cKDZn505cyYHDx4kLi7OwqltS0GBgdTU1wBo1co0g7Ow8ArFxUYALl9eS3FxLs2aTQfA0dENZ+cHSn9eUQpISRmCg4Mbfn7byh37dPJkEwoLfwVMSziaNq2dV1RsnbPaAYQN2L8fp9RUWr70ktpJatzTTz/NmDFjeOmll0qPmqopFy9eLC1u6enp6PV69Ho9GRkZpKWlkZmZSVFREW5ubqXnOvr4+BAUFMSAAQNo06aN2fe9efMm69ev56OPPqqx7LWFi4s3TZq8jl7/Vulrzs5NSh87OTXE
wcERV9fyl5cVpZCUlJdRFAV//61mzz5s0WIRFy68S506bWjceGTNfAhx32REKO7uySchIADWrFE7iUXk5+fTo0cPAgIC2LFjxz29h9FoxGAwkJKSgsFgIDMzs8zjs2fPcuN/0+QbNWqEv78/Xl5eeHt74+/vX+a5Tqer0uXoDRs2MHv2bM6fP2/Tmx7UhPz88/z66yYaN34ZV1dfCgoySEt7AyenBrRs+Wm5/hVNllGUIs6fH05h4SVatvwPDg5uADg4uMhhwDZIRoTizs6eha+/hvBwtZNYjJubGzt27CA4OJioqCjGjBlTYd+8vDxWrFhRZiSn1+v57bffAPD09Cwdyfn4+NCzZ0+0Wi06nQ6tVou3tzcuLi7Vmn/VqlWEhYVJETTDyakht26lkpzcn4KCTJydG9OwYV80mqqtIy0o0JOdbfqSdOKER+nrWu0KmjadXI2JhSXIiFDc2bhxkJoK+/erncTiduzYwahRo4iJiaF9+/Zm++Tn5/PMM8+g1WrRarX4+Pig1WrRaDTodDrc3d0tmvnQoUMMHDiQtLS0WrHpgRCWIIVQVCw7G7Ra2L3btKbJDo0ZM4aYmBji4uIsXtTuxXPPPccDDzzAhg0b1I5ilS5d+oji4jyaN39X7SjCisg6QlGx9etNa5j69VM7iWrWrFmDs7Mzb7311t07q+z8+fPs3bvX5pd+1BRFKeTixaU4OTVSO4qwMnKPUJhXVGTa3WL6dHC03+9LderU4eOPP6Zr16707t2bESNGqJrHaDSWzjQtWUqRkZFBRkYGZ86cYciQIRVexrV31659QlFRDp6e6v43FNZHLo0K83bvhr/8xXQoaP36aqdR3fr165k6dSpxcXEVLlu4X0VFRWRlZZVOuMnIyCi3hvDixYsA1KtXD51Oh0ajKb0/qdPpGDFiBM7O8v3WnDNnelK//qO0aLFI7SjCykghFOY9+ih07w4ffqh2EqsxcuRITp48SWxsLHXr1q3yz2dnZ5tdSpGSkkJKSgp6vZ6CggJcXFxo0qRJ6VIKc8sqvLy87PxYparJyzvO6dPdaNfuF1xdfdWOI6yMFEJRXkICdO0Kycng56d2GquRk5ND586d6devX7nF6ndbN5icnMz169cB8+sGb39c1XWD4u7Onx+BotzC33+n2lGEFZJCKMp75RXIyzNdHhVlJCQk0KNHDwYMGEBhYWHpjjDZ2abNmxs1alS6dKJkGUXJZcuS3WFkfZ9lFRRkkpTkS+vWh6hfv6facYQVkkIoyrp0CXQ6+Oor0+VRUc6JEydYs2YN3t7epffnStYQ1pf7qVbHYJjFb7/tJTDwuNpRhJWSQijKmjMH9uwxXR4VwsYpSj5JSTpatPgQT0/Z61OYJ9PLRFlt20LnzmqnEKJaXL36MYqi0KjRULWjCCsmI0IhRK3188+d8fB4Fi+vWWpHEVbMfldKC9i2DTp0MJ183awZjB0LV6+qnUqIalEc+y1N9zWgSZM31I4irJwUQnsVGQlhYfDuu3DxIkRHmxbPP/EE5OernU6I++a4eDmeSTpcXGTzcXFncmnUHhmN4O0NS5fC6NG/v56XBy1bwty5ptGhELYqLc30dzkmRu55i7uSEaE9io83nSwx9A8TCNzd4dln4eBBdXIJUV1WrYIePaQIikqRQmhPbt40/XnlCtSrBw0alO/j5WVqF8JW5eXBP/4BkyapnUTYCFk+UQPatWtHeno658+fx9PTE4Dt27ezdOlS4uPja+aXGo1gMEBKiunPzMyyj5OToXdv+OwzaNIEcnPhxo3yxTAz09QuhK365z9NG8U/95zaSYSNkEJYQ1xcXFi8eDFLliy57/e6detW6VE7t59MoNfrmVW/Pl0OHDBd6gRo1Mh0mK6Pj+nPHj1AozHtFlOyb2hwMHh4wK5d5e8R7tkDs2ffd2YhVKEosHo1TJgAcgqHqCT5m1JDpkyZwpIlS5g6dSrNmzcv1z5r1iw2bdrE1atXCQgIIDw8nJCQ
EABWr15NdHR06TE8WVlZKIpC3bp18fHxKd3DMjg4GJfWrWHUKFPR0+lMlzzvpm5d+PvfYdo0U//+/U0jwSlToGlT0/sJYYsOHDBNlLn9C54QdyGFsIa0adOG559/ngULFpQ7qQAgMDCQ77//niZNmrBx40ZefPFFUlNTqVu3Loqi0LJlS0JCQsps3vzAAw9UX8CwMPjTn2DBAtMm2w0bmi4lbdliWlcohC1audL09/l/tySEqAxZPlED2rVrx5w5c+jUqRPt27fnp59+4ujRo3e8R+jr68tnn30mp4sLca+SkyEwEJKSTH8KUUkya7QG+fv7Exoayrx588q1bd68mS5duqDVavH19cVgMHBFZmsKce/Cw00bQkgRFFUkl0Zr2Pvvv09gYCC+vr6lryUnJzNp0iS++eYbgoKCAFPRlMG5EPdIUeDHH2H6dLWTCBskhbCGabVaRo8ezfLly/H736zN69ev4+7uTqtWrQD4/PPPOX/+vJoxhbBtDg6mbQKFuAdyadQCZs6cidFoLH0eHBzMsGHD6NixIwMGDODYsWO0a9dOxYRCCGG/ZLKMEMK2bNsGixfD6dOmmc/PPQeLFkHjxmonEzZKRoRCCNshp6aIGiAjQiGEbZBTU0QNkRGhEMI2yKkpooZIIRRC2AY5NUXUECmEQgjrFBFhWhbh4GA6OeX2U1P+SE5NEfdBCqEQwjq9+aZpobyimCbF3H5qyu1KTk3p00eNlKIWkAX1QgjbIKemiBois0aFELZl61ZYssS0jrDk1JTFi+XECXHPpBAKIYSwa3KPUAghhF2TQiiEEMKuSSEUQghh16QQCiGEsGtSCIUQQtg1KYRCCCHsmhRCIYQQdk0KoRBCCLsmhVAIIYRdk0IohBDCrkkhFEIIYdekEAohhLBrUgiFEELYNSmEQggh7JoUQiGEEHZNCqEQQgi7JoVQCCGEXZNCKIQQwq5JIRRCCGHXpBAKIYSwa1IIhRBC2DUphEIIIeyaFEIhhBB27f8DYDH1yoYpV5oAAAAASUVORK5CYII=\n",
      "text/plain": [
       "<rdkit.Chem.rdchem.Mol at 0x7f318c0946c0>"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "mol = Chem.MolFromSmiles(\"[Na]OC(=O)c1ccc(C[S+2]([O-])([O-]))cc1\")\n",
    "mol"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAcIAAACWCAYAAABNcIgQAAAg9UlEQVR4nO3de1RU5eI+8AcGRQYFvMRFUDElJVHDGUy5RIk3zLtiXk43O8eKjp3M06rT5VRWai7tnNKvl7J1TCO1EUVUEC+ogaA4hJoJImYaFxXlzjAyDPv3xwT9kBk1ndl7mHk+a7Faa97tzDPp4pn9zt7v6yAIggAiIiI75Sh1ACIiIimxCImIyK6xCImIyK6xCImIyK6xCImIyK6xCImIyK6xCImIyK6xCImIyK6xCImIyK6xCImIyK6xCImIyK6xCImIyK6xCImIyK6xCIkkEBQUBDc3N9y4caP5sS1btkCpVEqYisg+sQiJJNKuXTt8+umn9/08X331FT7//HMzJCKyTyxCIoksWLAAa9euxZUrV4yO//vf/0avXr3QqVMnBAcH48iRI0aPu3HjBkpLSy0ZlcimsQiJJNK/f39MmTIFixcvNjoeGBiIrKwsVFRUIDY2FjNmzEBdXZ3IKYlsnwN3qCcSX1BQED744AMMGTIEgwcPxtmzZ3H06FEsX74carXa6J/x9/fHzp07MXjwYEybNq35DLGurg6CIEAulwMAlEol9u7dK9p7IWrrnKQOQGTPHnzwQcyePRuLFi1CVFRUi7GNGzdi5cqVuHLlCmQyGYqLi3H9+nUAwLp166DVagEAq1evRm1tLd544w0AgLOzs7hvgqiNYxESSezdd99FYGAg/P39mx87f/48/vGPf+CHH37AwIEDARhKs2kCp1u3bs3Hurm5wdHREX5+fqLmJrIV/I6QSGI9evTA3Llz8dlnnzU/VlVVBblcjoCAAADArl27cPHiRakiEtk0FiGRFXj77bebpzoBQKFQYNasWQgODkZ0dDQyMjIQFBQkYUIi28WLZYiIyK7xjJCIiOwai5CIiOwai5CIiOwai5CIiOwai5CIiOwai5CIiOwai5CIiOwai5CIiOwai5CIiOwai5CIiOwai5CIiOwai5CIiOwai5CIiOwai5CIiOwai5CIiOwai5CIyJTNm4FHHgE6dAC8vIB584CyMqlTkZmxCImIjFm3DoiNBd56C7h6FTh8GPjtN2DkSODmTanTkRmxCMny+Kma2hqtFvjXv4AVK4CZMwF3dyAwEIiPB0pKgG++kTohmRGLkCyLn6qpLVKrgfJyICam5eNyOTBxIrB/vzS5yCJYhGQ5/FRNbdX164CrK9CpU+sxHx/DONkMFiFZDj9VU1vVrRtQWwtUV7ceKykxjJPNYBGS5fBTNbVVCgXg4QGoVC0f12iAxEQgKkqSWGQZLEKyHH6qprbKxQX45BNg4UJg61agshLIywOmTQM8PYHnnpM6IZkRi5Ash5+qqS2LjQVWrQIWLzaUX0QE4OsLpKYaroAmm+EgCIIgdQiyYatXA++8A6xdC4wdazgTXLAAKC4Gjh/nLxSyLtXVhul8R54j2BMWIVleXBywbJlhasnNDZg0Cfj0U6BrV6mTEbU0f76hDDdskDoJiYhFSOa3dy8waBDQvbvUSYjuXmMj4OcHLF8OzJ4tdRoSEc//ybx0OmDOHODECamTEP05R44YbvcZP974eHIy8PHH4mYiUThJHYBsTEqKoQzHjDE+rtMB7dqJm4nobqhUwLhxhul7Y77+GnjgAXEzkSh4RkjmpVIBkyebvghm8mTg88/FTER0Z3o9sH1768UfmtTWGqb8TY1Tm8YiJPOprwd27TL9y6K8HDhwABg2TNxcRHdy+LDhIhlT06KJiYYVkR57TNRYJA4WIZlPSorhk/WoUcbH4+MNu08MHSpuLqI7aZoW7djR9HhMDODEb5NsEYuQzOdO06IqlWHxbQcHUWMR3ZZeD+zYYXomo7qa06I2jkVI5nHzpmH6yNQvi+vXDSty8JcJWZtDh4CaGuDJJ42PJyYadk6JiBA3F4mG5/lkHnv3Gs70TE2Lbt8O9OgBKJXi5qJWqqqq
kJOTg0uXLmHkyJHobu/3e6pUhhJ0dTU9Pn06IJOJm4tEwyIk82iaFnV2Nj0+YwanRUVWU1ODkydPQq1WQ61WIzs7G/n5+XB2dkZAQAAWLFiA9PR0BAYGSh1VGg0NQEKCYU1RY6qrDd99p6SIGovExSKk+6fVGq4W3bzZ+Pj164ar8pYuFTWWvdHpdMjPz0d2dnbzz4kTJyAIAgICAqBQKPDSSy9BoVAgJCQE7du3x9y5czF69GhkZmbCz89P6rcgvtRUw7TouHHGxxMSDAvHh4WJmYpExiXW6P7t2AHMnQtcvQq0b996fN06w9qiFy7wjNBMjJWeWq2GXq/HQw89BIVC0fyjVCrRwcQFTDqdDhMmTEBxcTHS0tLg7u4u8juR2N/+Zjjr27LF+PjEiUDv3rz31caxCOm+Vez6J+T5MrRf+KnxA0aOBEJCgCVLxA1mIxoaGnDu3LkWpZednY2GhoZWpadQKODi4vKnnr+6uhqRkZHo3LkzkpOT0d7Yhxlb1NBgWA93zRrDPoO3qqoy3O6zfz8QHi5+PhINi5DuS2OjFqdPe+HBB7fCzW1s6wNKSw2/bI4fB4YMET9gG2Os9H788UfU19ejX79+LQpvyJAhkMvlZnndkpISDB8+HOHh4di0aRMc7ODMvfrAAXSaNs2wNZix/4/ffAP8619AYSG3ZbJx/I6Q7ktl5R4AMnTqNMLoeMPRJDj17cMSNEKv1yMvL69F6eXk5ECj0cDHxwfh4eGIiYnB0qVLERwcDFdTVzWagY+PD5KTkxEeHo4PP/wQH3zwgcVey1os2LIFLlOmYKWpDxMqFfDUUyxBO8AzQrovv/wyEzJZJ/Tq9ZXR8fz8KHTsMBzde3LV/uLiYhw9ehTp6enIzs7GyZMnUVtbCx8fnxZnemFhYejSpYskGdPS0jB69GisWLECsbGxkmQQg06ng7e3N9avX48pU6a0PqCiwjAteugQEBoqej4SF88I6Z41NmpQWbkHffrEGx3X6a6ipuYI/PxWiJzMuly+fBkvvfQSkpOT0adPHyiVSkyePBkfffQRFAoF3EztdmABdXV1+Pnnn6E0cT9nREQENm7ciDlz5sDX1xeTJk0SLZuY9u/fD51Oh7FjjUznA9i1Zw+cxo5FNNfFtQssQrpnlZVJcHR0NjktWlGxDc7OfSCXPyJuMCuzYcMGXLt2DWVlZejcubOkWeLj4xEbG4vDhw9jiInp6piYGFy+fBmzZ8/GwYMHMcwGy0ClUmHixIkmLyxa+9136NevH6I5LWoX+LdM96y8XAUPj6lwcDD+eaq8XIXOnWeInMr6qFQqzJkzR/ISBIC//OUvmDt3LsaOHYuCggKTxy1cuBDz5s3DxIkTb3tcW6TT6ZCYmIgYE8v9VVRU4MCBAybHyfawCOmeNE2Ldu5s/JeFTncFNTXpJsftRV5eHn7++WdMM3Z5vkQ+++wzPPbYY4iOjkZpaanJ41asWIHIyMg7HtfW7Nu3DzqdDqNHjzY6vn37dnh5ednkmTAZxyKke1JZuRuOji7o1OkJo+Pl5So4O/eFi8sgkZNZly1btiA0NBQ9e/aUOkozR0dHfPvtt/D29saECROg0WhMHrdp0yZ4e3tj/PjxJo9ra1QqFSZNmmRyWlSlUmHGjBl2cQsJGbAI6Z4Ypj2n3WFa9CmRU1kflUpllVNsHTp0QEJCAiorKzFr1izo9XqTx+3cuRNVVVWYOXOmyePaivr6+ttOi5aXlyM1NdUq/87IcliE9KcZpkWT7zAtetTup0XPnDmD3NxcTJ061eh4bm4u0tLSRE71h65duyIpKQlZWVmYP3++yeO6dOmC5ORknDhxAn//+99FTGh+KSkp0Ov1JqdF4+Pj4eXlhaHcPNqu8KpR+tMqKhLh6ChHx46RRsfLy79Hhw4PwcUlSORk1kWlUiE8PBw9evQwOv7FF1+gtLQUERLuc9e7d2/s3r0bjz/+OPr27YvXX3/d6HH+/v7Ys2cPIiMj0bdv
XyxcuFDkpOahUqkwefJkk2uvqlQqzJw5k9Oi9kYg+pO02nyhvHyHyfG8vHChqOh90fJYq4cfflj44osvjI41NDQInp6ewpYtW0ROZVxSUpLQrl07YdOmTXd13MaNG0VKZj5arVZwd3cXdu3aZXS8tLRUcHJyErKyskRORlJjEZJZ1dcXCmq1o6DRnJE6iqROnz4tODo6CoWFhUbHDxw4IMjlcqG6ulrkZKatX79eaN++vXDgwAGzHGdtEhISBA8PD0Gr1RodX7dundC7d2+hsbFR5GQkNU6NklndvHkR7u5j4OIyQOookmqaFvX19TU5/uSTT6Jjx44iJzPthRdewOXLlzFt2jSkpaVh4MCBJo/77bffMHXqVKSlpWHQoLZxZXDTtKizic2jebWoHZO6iantqanJFPLyIoQff+wo5OR4CHl5oUJV1WGpY1mVwMBAYeXKlUbHmqZFv//+e5FT3VljY6Pw/PPPC35+fsJvv/1222NffvllwdfXV7h8+bJI6e5dXV2d4ObmJuzZs8foeNO0qFqtFjkZWQNeNUomCUID6upOQ6crbn6ssVGLgoLxcHcfh0GDihAUdAHe3u+YvI3CHp06dQrnzp0zebVoamoqampqMM7UrugScnBwwLp16zBgwACMGzcOFRUVJo9duXIlFArFHY+zBidOnED79u0xcuRIo+Px8fHo2bOnyWXnyLZx9wkCAAiCHlptHjQaNTSabNTWqlFXdxKNjTfRq9dadOv2NwCAVpuHn38ORHBwHRwdjV95Z+/effddHD16FIcOHTI6Pm/ePFRWVmLr1q0iJ7t7TZv1enh4YO/evSY369VoNBgxYgTkcjmSk5NNTjtaA41GAycnJ5w6dQp1dXUIDg5Gp06dAAAjR45ESEgIlnDzaLvEIrRTOl3x74WXDY0mGzU1R6HXl6NdOx/I5QrI5Qq4uirg6hoKJ6euzX+usVGLM2f6omPHcHTr9jxcXYdCJmu9hqZeX44bN+LQoUM/uLmNEvOtSS4wMBDz5883uo1RQ0MDunfvjtWrV2P69OkSpLt7JSUlCA0NRWhoKL799luT352VlpYiNDQUISEhiIuLs5rv2HQ6Hc6cOQO1Wg21Wo3s7GycPn0agiAgKCgIzs7OSE1NRW1tLbp3747jx4/zjNBOsQhtnSAABQVAdjagVqNcUY5Lgdug11fB2bk35HIl5HIlXF2VkMuHQCbzuONT1tdfwpUry1BVtQ/19b/CzW0UevZch/bt/7hf7ty5cNTUHAUABAQkG9+93gbl5ORAqVSisLAQPj4+rcZTUlIwdepUXLt2zaIb7ZpLbm4uwsPD8corr2DRokUmjysoKEBoaChefPFFfPTRRyImNDC2yfGPP/6I+vp69OvXr8V+j0OGDEFDQwMiIiLQo0cPfPPNN0hISMALL7wgem6yDixCW3PxIqBWG36ysw0/FRVAz56AUombz43AzcgAyOVKODnd/+avOl0xfv31eQBAQEBK8+OnTnVDQ8MNAECPHv+Fp+c/7vu12oK3334bx48fx8GDB42O//Wvf0VtbS02b94scrJ717RZ7/Lly/HKK6+YPO748eMYMWIEli1bdtvj7pex0svJyYFGo2m1yXF4eLjJXT+KioowfPhwjB07Fl9++aXF8lIbINllOnT/iosFIT5eEN5+WxBGjxaELl0EARAEX19BmDRJEBYtEoSkJEG4ds2iMcrKvhdOnfJu8Vhp6VfCyZNdhby8MEGnu2HR17cmffv2FdasWWN0rL6+XujatasQHx8vcqr7t3PnTqF9+/ZCQkLCXR23Y8cOs712UVGRkJiYKLz//vvC+PHjhc6dOwsABB8fH2H8+PHC+++/LyQmJgo3bvz5f2c//fST4OHhISxevNhseantYRG2ZR9/LAgeHoIQFiYIr74qCN9/byhHC9JqfxGKit4X6upyBb2+TtBqzwvnzo0QCgomWfR12wK1Wi3IZDLh6tWrRseTkpKEjh07ChqNRuRk
5rF69WpBLpcLmZmZdzzOxcVFyMjI+NOvcWvpdenSxWjplZaW3uvbaOXQoUOCs7OzsGHDBrM9J7UtnBq1Zps3A59+CuTlAe7uwKRJwNKlQJffpzQbGgAncW9baGi4gcLChaiuPgydrgROTl3g5jYKfn7L4eTkKWoWa/PWW28hOzsb+/fvNzo+d+5c3Lx5E3FxcSInM5/XX38dmzZtQkZGBgICAkwet3DhQmzcuPG2xxUXF7eY3jx27BiuX7/eanrz0UcfhaenZf9tbd68Gc8//zx27dqFUaPs6+Iu4neE1mvdOuCtt4A1a4DoaKC4GHj9deDqVSAzE7DwZeq3XlXa0HAN/fsft+hrtnUBAQF44403MG/evFZjOp0O3t7e+PrrrzF58mTxw5mJIAh4+umncezYMWRkZJgsKEEQ8MwzzyAzMxMZGRloaGhoUXpZWVm4du1aq9IbOnQovLy8RH5XBp988gmWLVuGI0eO4JFHHpEkA0mDRWiNtFqge3dg+XJg7tw/HtdogD59gA8/BIz8sr1X9fWFLe4f1GjUaGi4/vutFEq4uiogl4fA3f3ubgDX6yshk7mbLV9boFarMWzYMBQXFxsthz179mDWrFm4evWqyQ1h24r6+npER0ejpqYGqampJq9+1Wq1iIiIwPXr1/Hrr7/C29sbSqUSCoUCSqUSSqUS3t7eIqe/vfnz52P79u3IzMy0qs2UybJYhNYoPR2IiACqqoDfb/ht9uKLQFkZoFLd23OXlKAxLwdX+mU1l59OdwVOTp6/30KhaC6/du2Mr5N5OxpNDs6fH4WHHjoEFxfja1XaojfffBM5OTnYt2+f0fHnnnsOer0emzZtEjmZZVRWViIiIgK9evVCQkICZDKZ0eOeeuop1NfXY9WqVSbXXbUmer0e06dPR35+PtLT001ecUq2hetiWaPr1wFX19YlCAA+PkB+/t09z7VrzfcPNt9OUVQER28vaA4Hw6XjEHTr9gLkcgXatzfPp1+5PBhduszC+fNj0b9/ptme19rFx8fjzTffNDrWtCv6hg0bxA1lQe7u7khMTERYWBgOHjxodKNbjUaDpKQkbNu2rU2UIADIZDJ89913iIqKwpQpU5CSkmLVq+WQebAIrVG3bkBtLVBd3boMS0oM47eqrAR++umPewezs4HcXMOfHzgQUCiAadMM/334YfS14Ooffn7/RX19Ic6fj0b//ulGV56xJVlZWbh06ZLJ7/7utCt6W+Xv74+zZ8/C3d34NHhSUhKcnZ0xYsQIkZPdHxcXl+aSf/bZZ/Hdd9/B0ZHLMtsy/u1aI4UC8PBoPf2p0QCJiUBU1B+PabWG7w09PIAnnwQSEgBPT+C99wxnjhUVhqnWzz8HnnkGGDAAsPASWA4OMvTu/R1kMndcuDAFgnDToq8nNZVKhaioKDzwwAMmx2+3K3pbZqoEAcP7njp1Ktq1aydiIvPo1q0bdu/ejV/z8lAqwUo5JDKp7tugO/i//zPcI7hliyBUVAhCbq4gjB0rCIMGCUJdXctjt28XhLw8QdDrpclqgk5XKpw585Bw4cJTgiBYVzZzaWxsFPz9/YX169cbHb/Trui2qra2VnB1dRX27dsndZT7k5kpCHK5IKxaJXUSsiCeEVqr2Fhg1Spg8WLDGV5EBODrC6SmAreeWUyZAvTrB1jZ9I2TUzf07ZuE6upDKCp6W+o4FnH8+HEUFRXddlrUwcHB7u5N2717N1xcXPDEE09IHeX+DBsGbN1quHVp+3ap05CF8KpRsrja2hPIz38Cvr6L4en5qtRxzGrhwoXIzc1FUlKS0fGnn34aMpnMpi6UuRsxMTHo2rUr1q5dK3UU8/jyS+C114D9+4GwMKnTkJlZ1ykE2SRX1xA8+OAWFBa+gYoK2/lULQgC4uPjERMTY3T85s2b2LVrl8lxW6XRaJCcnGxb73vePODVVw2rO507J3UaMjMWIYnC3X08evZciYsX/4Kamgyp45hFZmYmiouLMWnS
JKPjycnJdjktumvXLri4uCAyMlLqKOa1ZInhgrToaMMKT2QzWIQkmm7d5sHT81VcuDARWm3b/1StUqkwevRodOlifDsrlUqFKVOmmNzd3VapVCpMnz4dTiKvg2txDg7A+vVA376GQqypkToRmQmLkETl67sEbm5jUFDwJHS6a1LHuWeCIGD79u0mp/+0Wi12795tW9ODd6G2thZ79+613ffdrh2wbZthwfunnjL8l9o8FiGJzAH+/v9DZeVjGDNmKmpra6UOdE8yMjJQUlKCiRMnGh1PSkqCTCZD1P9/z6cdSExMhFwux2OPPSZ1FMtxcwOSkoAzZ4CXX5Y6DZkBi5BE5+DQHgMH/hdlZTWYMWMGGtrgp2qVSoUxY8aYXIuy6WZye5wWjYmJsb1p0Vt1724ow/h44OOPpU5D94lFSJJwc3NDUlISzpw5g5fb2KfqxsZGbNu27bbTf3K5HLNmzRIxlfSqq6tte1r0VgMGADt2AJ98AtjZ7TG2xsY/tpE16969O5KSkhAeHg5/f3+88847Uke6rUuXLkGtVmPfvn0oLy/HhAkTTB779ddfi5jMOiQmJsLd3R0RERFSRxFPZKThApq5c4HBg4HgYKkT0T1gEZKkBgwYgB07diA6Ohq+vr547rnnpI4EACgsLIRarUZ2djbUajXUanXz7unBwcFwdXXFRx99hM8++0zqqFaj6WpRU1sy2aw5cwy7wgweLHUSukcsQpLc448/jv/973949tln0b17d9F3aSgvL0d2djbS09Obi+/KlSvw8PDAgAEDoFAoMGvWLCgUCgwYMAAAkJ2djccffxw9e/bEa6+9Jmpea1RdXY2UlBSkpKRIHUUaxnbY2LwZ+PRTIC8PcHc33Iy/dClg4nYbkg6LkKzCzJkzcfHiRUyfPh1HjhxBsIWmmCoqKnDmzBlkZ2c3/zRtJRQUFASFQoGYmBgoFAo8/PDDcDCxU4dCocDWrVsxZcoU+Pn5Yfr06RbJ21YkJCTAw8MDYVx+zGDdOuCtt4A1aww34BcXG9YrHTkSyMwEuMehVeFao2RVXn31VWzbtg2ZmZno1avXfT1XZWUlfvrppxall5ubi06dOmHgwIFQKBTNP7crvdtZv349XnnlFSQnJ7e5fffMaeLEiejduzc+//xzqaNIT6s1XFW6fLnhu8MmGo1hy7QPPzQs2UZWg0VIVkWv1yMmJgbnzp1Denq6ydsTblVVVYXTp0+3Kr2OHTti0KBBLUovMDDQrButvvfee1i5ciXS0tIwcOBAsz1vW1FVVQUvLy/s378f4eHhUseRXnq6YbeYqqrWG2u/+CJQVtZ6r1GSFKdGyarIZDLExcVh5MiRmDx5MlJSUlptaFtdXY1Tp061KL28vDzI5XIMHjwYCoUCb775pkVKz5hFixahqKgI48aNQ2ZmJvz8/Cz6etYmISEBnTt3RmhoqNRRrMP164Cra+sSBAwX1eTni5+JbotnhGSVbty4gbCwMAQFBWH+/PnIyclpvnrz/Pnz6NChAx555BEoFAoolUoolUr069dPsisWdTodJkyYgOLiYqSlpd1253ZbM2HCBPTt2xf/+c9/pI5iHe7mjDAq6o9VaSIjgcOHRY9Jf2ARktUqKCjAiy++iLS0NAQEBLSY3hw6dKjVrdpSXV2NyMhIdO7cGcnJyVaXzxIqKirg7e2N1NRUnhE2qaszfEe4YoXx7wjffx946SXp8lErLEKyanq9Ho2NjWjXrp3UUe5KSUkJhg8fjvDwcGzatOmeLsBpSzZs2ID33nsPly5dsvgUdJuyejXwzjvA2rXA2LFASQmwYIHh6tHjx4FbpvtJWvyXS1ZNJpO1mRIEAB8fHyQnJyM5ORkffvih1HEsrmltUZbgLWJjgVWrgMWLAU9Pw1Spry+QmsoStEI8IySygLS0NIwePRorVqxAbGys1HEsoqKiAl5eXjh8+DCGDx8udRyie8aPcUQWEBERgY0bN+K1117Dzp07pY5jEdu3
b4eXlxeGDRsmdRSi+8LbJ4gsJCYmBpcvX8bs2bNx8OBBmysMlUqFGTNm2Pz3oGT7ODVKZGELFixAXFwcjh49ioCAAKnj3LWysjJ0MbEuZnl5Oby9vfHDDz/g0UcfFTkZkXnxjJDIwlasWIHCwkJER0cjIyMDnp6eUkdqpbi4uMUCBVlZWejatSvOnj1r9Pj4+Hh4eXlh6NChIiclMj+eERKJQKvVYtSoUaivr8ehQ4cgl8sly3Jr6RnbbUOhUCA8PBwPPvig0ecYM2YMBg8ejGXLlomcnsj8WIREIikrK0NYWBgCAgKwY8cOUVbBuXW3jaNHj+KXX35psdtG00/TFlN30rQvY0ZGBkJCQiz8Dogsj0VIJKJff/0Vw4cPx+TJk7FmzRqzPrex3TbOnj0LNzc3s+22AQBffvklli5digsXLvBCGbIJ/I6QSET+/v7Yt28fIiIi0KdPH/zzn/+8p+e5m902mhYev5/SM4ZXi5Kt4RkhkQRSU1Mxbtw4fPXVV3j66adve2xdXR2OHTsGtVrd/J3ehQsX0KlTJwQHB0OpVDYvPh4QEGDRgmqaFj127BgUCoXFXodITCxCIonExcXhhRdewJ49exAVFWXyuNzcXAwePBgBAQEIDw9HWFgYFAoF+vfvL9puG6WlpVCr1fjmm29w4sQJFBQU8IyQbAanRokkMmfOHJw/fx5Tp05FWloaBg0aZPS4/v37o66uTrTSKysra3H2qVarcfnyZXh4eGDIkCFISUlhCZJN4RkhkcRiY2ORmJiIzMxM9OjRQ9TXNrbJcW5uLlxdXZs3OW76EWOTYyIpsAiJJKbX6zF16lT88ssvSEtLg4eHh0Vep6amBidPnmxRenl5eXBxcWne5JilR/aIRUhkBTQaDaKiotChQwfs3bsXzs7O9/V8tbW1yMnJaVV6MpmsTWxyTCQmFiGRlSgtLUVoaChCQkIQFxd319/D6XQ65Ofntyi9EydOQBCEVqUXEhJy3yVLZGtYhERWpKCgAKGhoZg3bx4+/vjjVuPGSk+tVkOv1+Ohhx5qUXpKpRIduAks0R2xCImsTFZWFp544gksWbIEUVFRLUovOzsbDQ0NrUpPoVDAxcVF6uhEbRKLkMgKxcfHY8mSJTh58iT69evXovCGDBki6aLdRLaGRUhkpaqrqyGTyVh6RBbGIiQiIrvGG4WIiMiusQiJiMiusQiJiMiusQiJiMiusQiJiMiusQiJiMiusQiJiMiusQiJiMiusQiJiMiusQiJiMiusQiJiMiusQiJiMiusQiJiMiusQiJiMiusQiJiMiusQiJiMiu/T/r75o4ZX8/EAAAAABJRU5ErkJggg==\n",
      "text/plain": [
       "<rdkit.Chem.rdchem.Mol at 0x7f3184051990>"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "smol = rdMolStandardize.Cleanup(mol)\n",
    "smol"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The __rdMolStandardize.Cleanup()__ function accepts an optional __rdMolStandardize.CleanupParameters__ object, in which you can specify your own standardization reference files.\n",
    "The member variables that you can change are:\n",
    "-  CleanupParameters.acidbaseFile\n",
    "-  CleanupParameters.normalizationsFile\n",
    "-  CleanupParameters.fragmentFile\n",
    "-  CleanupParameters.maxRestarts\n",
    "-  CleanupParameters.preferOrganic"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Default acidbaseFile: /data/dipper/leung/gsoc/rdkit/Data/MolStandardize/acid_base_pairs.txt\n",
      "Default normalizationsFile: /data/dipper/leung/gsoc/rdkit/Data/MolStandardize/normalizations.txt\n",
      "Default fragmentFile: /data/dipper/leung/gsoc/rdkit/Data/MolStandardize/fragmentPatterns.txt\n",
      "Default maxRestarts: 200\n",
      "Default preferOrganic: False\n"
     ]
    }
   ],
   "source": [
    "params = rdMolStandardize.CleanupParameters()\n",
    "\n",
    "print(\"Default acidbaseFile: %s\" % params.acidbaseFile)\n",
    "print(\"Default normalizationsFile: %s\" % params.normalizationsFile)\n",
    "print(\"Default fragmentFile: %s\" % params.fragmentFile)\n",
    "print(\"Default maxRestarts: %s\" % params.maxRestarts)\n",
    "print(\"Default preferOrganic: %s\" % params.preferOrganic)"
   ]
  },
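  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a minimal sketch, a modified __CleanupParameters__ object can be passed to __rdMolStandardize.Cleanup()__ as its second argument. The __maxRestarts__ value and the input SMILES below are arbitrary choices made for illustration, not recommended settings."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from rdkit import Chem\n",
    "from rdkit.Chem.MolStandardize import rdMolStandardize\n",
    "\n",
    "# Start from the defaults and override a single setting (illustrative value).\n",
    "custom_params = rdMolStandardize.CleanupParameters()\n",
    "custom_params.maxRestarts = 100\n",
    "\n",
    "# Pass the parameters object as the second argument to Cleanup().\n",
    "example = Chem.MolFromSmiles(\"[Na]OC(=O)c1ccccc1\")\n",
    "cleaned = rdMolStandardize.Cleanup(example, custom_params)\n",
    "Chem.MolToSmiles(cleaned)"
   ]
  },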
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## rdMolStandardize.Normalizer"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The __rdMolStandardize.Normalizer__ class applies a series of normalization transforms to correct functional groups and recombine charges."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAcIAAACWCAYAAABNcIgQAAANkklEQVR4nO3df3BV5Z3H8Xd+QogECCEhP1ABFbIBq9LRdsRxRqWSWqlri7qdUVl3hZFZhQh1xboaZtQC/kK3w5aO0tGtQ6mjtGbrDOtqtbOOFUW6CjEYRWguCb9/hdyEXHLv/nEHEAl4CTe5JOf9mrnDzHmec873n8yH55zzPE9aLBaLIUlSQKWnugBJklLJIJQkBZpBKEkKNINQkhRoBqEkKdAMQklSoBmEkqRAMwglSYFmEEqSAs0glCQFmkEoSQo0g1CSFGgGoSQp0AxCSVKgGYSSpEAzCCVJgWYQSpICzSCUJAWaQShJCjSDUJIUaAahJCnQDEJJUqAZhJKkQDMIJUmBZhBKkgLNIJQkBZpBKEkKNINQkhRoBqEkKdAMQklSoBmEkqRAMwglSYFmEEqSAs0glCQFmkEoSQo0g1CSFGgGoQQwbhz85jfHHvvtb2Hs2GOPLV8OF10E/ftDURFMnw67d/dYmZKSzyCUErV0KcycCfffD9u2wdtvQ0MDXHMNHDyY6uokdZFBKCWirQ3mzYMnn4RbboFBg6C8HF55BZqa4IUXUl2hpC4yCKVEfPgh7NkDU6cee3zAAJgyBd54IzV1STptBqF02F13QUHB0d/06Ufbdu6E3FwYOPD484qL4+2SeiWDUDrsscfgr389+luw4GhbQQG0tEBz8/HnNTXF2yX1SgahdNiQIVBWdvSXn3+0bcIEGDwYXn752HPCYXjtNbj66h4tVVLyZKa6AKlXyMmBRx+FOXPij0gnT46PBKuqoLAQpk1LdYWSusgRoZSomTPhF7+IP0ItLIQrroDSUnjrrfi8Qkm9UlosFoulughJklLFEaF0uqLR+E9Sr2QQSqfr2mth2bJUVyGpiwxC6XTdcAP87Gewb1+qK5HUBb4jlE5XRwdccgl873vw+OOprkbSKXJEKJ2ujAx4+ml49lnYsCHV1Zwad9OQDEIpKa66Cq67DubOTXUliXM3DQnw0aiUPBs3QkUFrFwZn3B/Jmtrg5ISeOIJuOOOo8fDYRg9GubPP3atVakPc0QoJcuoUTB7Ntx7L0Qiqa7m5NxNQzrCIJSS6fDXo0uWpLqSo3btOv6Yu2lIRxiEUjKddVZ8TdL581MfJjt3xreWGjny+A9g3E1DOsIglJLt9tvh/PPhoYdSc/9Dh+BXv4LycvjgA3j99WN30gB305C+wiCUki0tDZ5dzO6Rn9La+nHP3vvNN+Hii6G6Oj4yXb0aJk48vt9Xd9NYsSL+OLeuDn70I3fTUOAYhFJ3uOy77PtxGQ0Ns3vmfvX1cNNN8P3vx6dy1NXFv/pMP8mfuLtpSIDTJ6RuE4lsYd26MYwc+SKDB9/YPTc5cCA+BWLBApg0CRYvjk9/kJQwR4RSN8nKKmX48H8lFJpLNNqW3ItHo/Dcc/HQe/VV+OMfoabGEJS6wCCUulFR0U+JxWJs3/5U0q7Z0vI+dXUT6fjdMnj4Yfjoo8Q/bvnzn+OPQyUdYRBK3Sg9vT9lZYtoavo5kUjjaV2rvT3Epk23sWHDFeTkjif2+h/i7/kyM7/55M2b4eab48unbdp0WnVIfY1BKHWzIUOmkpv7bbZsmdel86PRMFu3LmT9+nLa2xspL1/DOecsJTNz2DefHA7DwoUwblx8Yv2aNfF3ipKO8GMZqQe0tv4fn376bcaM+V9ycy9L+Lx9+2r429/uJi0ti9LSxxgyZOo3n3RYTQ3cfTdkZcW/DP36cmqSAINQ6jGbN88gHF5DeflqvulhTDj8EQ0NswiH11JUNJfi4nmkpfVL7EYffQSzZsHatfHdMObNg34JnisFkI9GpR5SWvoIBw9+wa5dL52wTyTSxObNM6iru4x+/UYybtznlJRUJxaCTU0wYwZcdll8WbUvvohPrDcEpZNK4C27pGTIzBxGScnDtLXVnrBPKDSH9vYQ
Y8e+z4ABlyR03VjsIM1rniPvqnnxTXbffx8uSexcST4alc4o0egB0tNzgbSE+u/d+3tCobnEYoeo2PIU6df+fXyJN0kJMwilXqitrY5Q6F6am9+hsPBuiosfJD39rFSXJfVKPhqVUqSl5S+EQvcRDq8lLS2TnJy/o6TkMQYOvPKE53R07KGxsZodO5aQlzeZior1ZGef23NFS32QQSilQDTaxuef/4Ciormcd95/EYsdoqXlL6Sldf4nGYsdYteuZWzZ8iDZ2SO44IK3OOusK3q4aqlv8tGolAJtbXWsX1/OxRe3kp5+8p0empvfoqFhNpFII8XF/8awYf9CWlpGD1Uq9X1On5BSIDv7XLKyStm0aRr796+io2PPcX1isQhffHED9fWVDBpUybhxGyksnGUISknmiFBKkfb2zWzduoj9+/+b9vZN5OVN4uyzl5KdPeJIn+3bn2HQoOvo1++8FFYq9W0GoXQGiEQa2bTpHwE4//xVKa5GChYfjUo9IBo9QDi85oTtWVklFBT8M62tH/dgVZLAIJS6WYxdu/6TdevGEArNPXL04MEvaWyspq2tjmi0jYMHP2fHjl+e0oLckpLD6RNSNwmHP6ShYTatresZPvx+iopmH2nLyMijvX0T9fWTiUSayMzMJy9vEmVlR7dI2r//f2hufpPS0p+noHopOHxHKCVZJNJEY2M1u3YtIz//HygtfZysrKJTvk44vJa6uksZM+ZdcnMv7YZKJYFBKCVNLNbOjh3/QWPjQwwYcDFlZYsZMOCi07rm5s130tq6nrFj3yXR9UclnRqDUEqCfftqaGiYTSzWTknJowwdeiuJBlcsFiEtLavTtkOHtrNu3QWcffYS8vN/ksSKJR3mxzLSaWhrq6O+vpKNG28hP/9WKio+Y+jQ2ziVEKytvYj9+9/otD0zs5Di4gcJhe4jGm1JYuWSDjMIpS44dGg3DQ2zqK29kIyMgVRU1FJSUk16es4pXSctLYvBg39AQ8MsYrFIp30KC+8hIyOXrVsXJaN0SV9jEEqnIBKJsHbt86xbdx4tLe8xZsw7jBr1O7Kzz+nyNYcPf5COjr3s2PHLTtvT0rIpK3uCbdsep719U5fvI6lzviOUErRq1SqqqqrIyelg1aoHKCi4lWT9X3LnzmWEQnMZN+4zMjMLOu1TX19JRkYeo0atSMo9JcU5IpS+QX19PTfddBNTpkxh0qRJ/OlPH1BQcDvJ/PMpKJhG//7n09hYfcI+I0Y8xd69K2lufidp95VkEEon1NLSQnV1NePHj6e1tZXa2lqeeeYZ8vLyuuFu6YwYsZidO5fS2vpJpz369y9n2LC7CIVmE4t1dEMNUjAZhNLXRKNRXnzxRUaPHs2KFStYuXIlNTU1jB49ulvvm5v7XYYMmUpDw+wT9ikpqab/5nQ6Xl/erbVIQWIQSl+xevVqLr/8cqqqqpg3bx6ffPIJlZWVPXb/0tJFtLS8z969v++0PSNjCCM3/hOZd8yBfft6rC6pLzMIJSAUCnHbbbcxceJELrzwQurq6pg1axaZmT27HG92dhnDh/+UUGgO0Whb551mzIDhw+GRR3q0NqmvMggVaOFwmIULF1JeXk5jYyNr1qxh6dKlDBs2LGU1FRXdRywWZfv2xZ13yMiAp5+GZ5+FDRt6tDapL3L6hAJr3bp1VFZWkpOTw5NPPsn111+f6pKO2N/0Cv1n/zvZi5dDcXHnnW68ESIRqKnp2eKkPsYgVGC1trby/PPPM336dLKzs1NdzvGuvBJGjYJf/7rz9o0boaICXn0VevA9ptTXGITSmWrtWrj0Unj33fi/nXngAVi5Ej7+GLI6X7hb0skZhNKZ7M47Yf36eBimdbKQ94EDMGYM3HcfzJrV8/VJfYBBKJ3Jtm+HCy6AJUvgJyfYhumFF6CqCj77DAo6X55N0okZhNKZ7oknYPHi+BeiubnHt8di8J3vwIQJ8cCUdEoMQulM194O48fDLbfA/Pmd93nvPbj2
WvjySxg6tGfrk3o5g1DqDWpq4OabobYWzj238z67d0N+fo+WJfUFBqHUW1RWQl4erHAbJimZDEKpt/j0U/jWt+CNN+JzDCUlhUusSb1FeTnccw980vk2TZK6xhGh1BcsXw4LF0JdHQwaBD/8ISxY4DtDKQGOCKXebulSmDkT7r8ftm2Dt9+Ghga45ho4eDDV1UlnPEeEUm/W1gYlJfG5hnfccfR4OAyjR8enW0yfnrr6pF7AEaHUm334IezZA1OnHnt8wACYMiX+YY2kkzIIpd5s5874ajMDBx7fVlwcb5d0Ugah1JsVFEBLCzQ3H9/W1OTao1ICDEKpN5swAQYPhpdfPvZ4OAyvvQZXX52SsqTeJDPVBUg6DTk58OijMGdO/BHp5MnxkWBVFRQWwrRpqa5QOuP51ajUF7z0EixaFJ9HmJcXn0e4cKELcEsJMAglSYHmO0JJUqAZhJKkQDMIJUmBZhBKkgLNIJQkBZpBKEkKNINQkhRoBqEkKdAMQklSoBmEkqRAMwglSYFmEEqSAs0glCQFmkEoSQo0g1CSFGgGoSQp0AxCSVKgGYSSpEAzCCVJgWYQSpICzSCUJAWaQShJCjSDUJIUaAahJCnQDEJJUqAZhJKkQDMIJUmBZhBKkgLNIJQkBZpBKEkKNINQkhRoBqEkKdAMQklSoBmEkqRAMwglSYFmEEqSAu3/AcPCF/eG1MSNAAAAAElFTkSuQmCC\n",
      "text/plain": [
       "<rdkit.Chem.rdchem.Mol at 0x7f318c0945d0>"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "n = rdMolStandardize.Normalizer()\n",
    "n.normalize(Chem.MolFromSmiles(\"C[S+2]([O-])([O-])O\"))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## rdMolStandardize.MetalDisconnector"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The __rdMolStandardize.MetalDisconnector__ class disconnects metal atoms that are defined as covalently bonded to non-metals."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAcIAAACWCAYAAABNcIgQAAANRklEQVR4nO3dW2yUZR7H8V+h5dDagilYYemCFCZTaAsDVQ66FYNuV4xsDA5LFQfSC9hg7EhcAjEqo0lJu0ugo0gorpswuuF0gVqMRqFaAxKgIKRLObRAWMCC0AKSHqDQZy9mYbeCUiqdYfp8P0lv3tP8e/Wd9533nYkyxhgBAGCpLuEeAACAcCKEAACrEUIAgNUIIQDAaoQQAGA1QggAsBohBABYjRACAKxGCAEAViOEAACrEUIAgNUIIQDAaoQQAGA1QggAsBohBABYjRACAKxGCAEAViOEAACrEUIAgNUIIQDAaoQQAGA1QggAsBohBABYjRACAKxGCAEAViOEAACrEUIAgNUIIQDAaoQQ6GBpaWlKSEhQbW3t9WVr1qxRZmZmGKcCcA0hBEIgJiZGhYWFv/o47733nvx+/x2YCMA1hBAIgblz52rFihU6derUTde/8cYbGjhwoOLj4+VyuVRWVnbT7Wpra3XmzJmOHBWwDiEEQsDpdOqZZ57RokWLbro+NTVVO3bs0Pnz5zVnzhxNnTpVjY2NIZ4SsFOUMcaEewigM0tLS5PP59OoUaM0YsQIVVZWauvWrVq8eLHKy8tvus+gQYP08ccfa8SIEZoyZcr1M8TGxkYZYxQbGytJyszM1Oeffx6y/wXojKLDPQBgi8GDB+u5557TW2+9pYkTJ7ZaFwgE9M477+jUqVPq2rWrvv/+e509e1aSVFxcrKamJknS8uXLVV9fr3nz5kmSunfvHtp/AuiECCEQQq+99ppSU1M1aNCg68uqqqrk9Xr1zTffKD09XVIwmtcu1vTp0+f6tgkJCerSpYsGDBgQ0rmBzozPCIEQSk5OVm5urpYsWXJ92Y8//qjY2FgNHTpUklRSUqKjR4+Ga0TAOoQQCLFXX331+qVOSRo9erRycnLkcrn05JNP6ttvv1VaWloYJwTsws0yAACrcUYIALAaIQQAWI0QAgCsRggBAFYjhAAAqxFCAIDVCCEAwGqEEABgNUIIALAaIQQAWI0QAgCsRggBAFYjhAAAqxFCAIDVCCEAwGqEEABgNUIIoP1Wr5ZGjpR69JCSkqRZs6S6unBPBdwWQgigfYqLpTlzpAULpNOnpa+/lo4flx5/XLp0KdzTAW0WZYwx4R4CQIRpapL695cWL5Zyc/+3vKFBSkmR3nwzeHYIRADOCIF2OnbsmEpLS3XmzJlwj9Lxfvp+ubxcOndOcrtbL4+NlSZPlr78MnSzAb8SIQRuU319vV5//XWlpqZq3rx5cjgcKioqUnNzc7hHu/NOnpQ8nuDlz/939qwUFyfFx9+4T79+wfVAhCCEQBsZY7R+/XoNHz5ca9eu1apVq1ReXq6VK1dq6dKlSktL08aNG8M95p3R2CgVFkpOZzCG06e3Xt+nj1RfL128eOO+NTXB9UCEIIRAG+zatUtZWVnKzc3VzJkzVVFRIbfbraioKLndbu3fv185OTlyu9164okntG/fvnCP3H4lJdKwYdLf/y794x/S5s1SenrrbUaPlnr3ltavb728oUH65BNp4sSQjQv8WoQQ+AU1NTWaPXu2xo4dqwceeEDV1dXy+Xzq3r17q+1iY2Pl8/l06NAh9evXTy6XS16vV+fPnw/P4O2xe7eUlSXl5EgzZkgVFTd+BnhNz55Sfr70yivS2rXShQvSgQPSlCnSffdJM2eGdHTg1yCEwE00NzfL7/fL6XRq79692rJliwKBgJKSkn5xv+TkZAUCAX3xxRcqKytTSkqK/H6/rl69GqLJ26G2VvJ6pTFjpEGDpOpqyecLPhv4S+bMkZYtkxYtCsbvd7+TfvMbqbT01vsCdxEenwB+oqSkRHPnztWlS5eUn5+vF154QVFRUbd9nJaWFn344YeaN2+eEhMTtXTpUmVnZ3fAxO3U3CwtXy4tXCg5HJLfL40bF+6pgJDj
jBD4rwMHDmjSpEmaNm2ann32We3fv18ej6ddEZSkLl26yOPx6ODBg5o8ebImT56sp59+WkePHr3Dk7fDpk3Bb4T529+kt9+Wtm9vewRLSoJnkUAnQQhhvbq6Onm9XqWnp6tr166qrKxUQUGB7rnnnjty/N69e6ugoEAVFRUyxsjpdMrr9erize647GD/rqyUfv/74LN+brd06FDw8Yi2xL6yUsrOlv70p2A4gU6CEMJaV65c0cqVK+V0OrVlyxZ99dVXKikp0cCBAzvk9RwOhzZu3KhPP/1UmzdvltPp1MqVK9XS0tIhr/f/6urqlJeXpyEjR6omNTV4Y4vPF3wA/lbOnQs+R+hySd26BYM4aVKHzwyEjAEstGnTJpOenm769OljioqKzJUrV0L6+pcvXzZFRUWmV69e5sEHHzRbt27tkNe5evWqWbVqlenbt69xuVymrKzsdnY2ZtUqY/r2NcblMuZ29gUiCCGEVaqqqozb7TYxMTEmLy/PXLhwIazznD171uTl5Zno6GjjdrvNsWPH7tixS0tLTUZGhklMTLz92JeWGpORYUxiojFFRcaE+I0CEEpcGoUV6uvr5fP5lJaWpsbGRu3fv19+v18JCQlhnSsxMVF+v187duzQ6dOnNWzYMPl8PjU1NbX7mMePH5fH41F2drYmTJigw4cPy+v1qmvXrm3ZOfiZYXa2NGGCdPhw8NGKtuwLRChCiE7NGKNAIKAhQ4ZozZo12rBhg0pKSpSSkhLu0VpxuVwqKyvT6tWrFQgE5HA4FAgEbusY12LvcDhUU1OjPXv2yO/3q1evXm3ZOfiZocMR/Iq0PXuCj1O0ZV8g0oX7lBToKNu3bzdjx4419957rykqKjLNzc3hHqlNGhoaTEFBgYmPjzePPfaY2bt37y9u39LSYtatW2eSk5ONw+EwGzdubPuLtbQYs26dMcnJxjgcxtzOvkAnwRkhOp2TJ0/K4/HokUceUUZGhg4ePCiv16vo6Ohwj9YmPXv21Pz583XgwAENGDBAo0aNksfj0Q8//HDDtjt37tTDDz+sWbNm6cUXX1RFRYWeeuqpNr3Otm3b5J82TZo9W/rLX6R//Utq475AZ0II0Wk0NjaqsLBQTqdTJ0+e1K5du1RcXKy+ffuGe7R26d+/vwKBgLZt26bq6mo5nU4VFhbq8uXLkqRAIKDx48dr5MiRqqqq0vz589WtW7dbHvfEiROaPn26srKydDAxUZerqqS8PCkmpqP/JeCuxFesoVMoKSlRXl6eoqOjtWjRIrl/7suiI5QxRh988IHmz5+vXr16acmSJRo3bpxOnDih9J/+MsTPaGxs1Ntvv638/HxlZmaqqKhIGRkZHTw5cPfjjBARbffu3crKylJOTo5mzJhx/eeROpuoqCh5PB5VV1dr2rRpmjJliqZOndq2O0EVfKMwfPhwvfvuu1q2bJlKS0uJIPBfhBAR66WXXtKYMWM0dOjQ6z+P1KOT/+pBXFycfD6fKioqFBcXJ5fLpYKCgp/d/rvvvtOjjz6qnJwceTweHTp0SB6PJ4QTA3c/QoiIlZmZqW3btun999/X/fffH+5xQmrIkCH66KOPtGHDBhUXF+vIkSOt1tfW1srr9eqhhx5SUlKSKisrrXijALQHnxECEayhoUFxcXHat2+fhg0bpubmZi1fvlwLFy6Uw+FQUVGRxo8fH+4xgbtaZNxPDuCWPvvsM7388suqr6/XsmXL9Pzzz7f7J6QAmxBCoJPYuXOnpk6dqgULFiguLi7c4wARg0ujQAT76aVRALePm2UAAFYjhAAAqxFCAIDVCCEAwGqEEABgNUIIALAaIQQAWI0QAgCsRggBAFYjhAAAqxFCAIDV+NJtIIL1jIpS0+DB6tKF97RAexFCIIJFGaPuR45ILS3hHgWIWLyNBABYjRACAKxGCAEAViOEAACrEUIAgNUIIQDAaoQQAGA1QggAsBohBABYjRACAKxGCAEAViOEAACrEUIAgNUIIQDA
alHGGBPuIQC0kzHSkSPSb38rxcSEexogIhFCAIDVuDQKRLLVq6WRI6UePaSkJGnWLKmuLtxTARGFEAKRqrhYmjNHWrBAOn1a+vpr6fhx6fHHpUuXwj0dEDG4NApEoqYmqX9/afFiKTf3f8sbGqSUFOnNN4NnhwBuiTNCIBKVl0vnzklud+vlsbHS5MnSl1+GZy4gAhFCIBKdPSvFxUnx8Teu69cvuB5AmxBCIBL16SPV10sXL964rqYmuH7FCikqKvg3YULIRwQiBSEEItHo0VLv3tL69a2XNzRIn3wiTZwo/fnPwecMjQneSAPgpqLDPQCAdujZU8rPl155JXiJ9A9/CJ4Jzp0r3XefNHNmuCcEIgZ3jQKR7J//lP76V+nAASkhQfrjH6XCQikxMdyTARGDEAIArMZnhAAAqxFCAIDVCCEAwGqEEABgNUIIALAaIQQAWI0QAgCsRggBAFYjhAAAqxFCAIDVCCEAwGqEEABgNUIIALAaIQQAWI0QAgCsRggBAFYjhAAAqxFCAIDVCCEAwGqEEABgNUIIALAaIQQAWI0QAgCsRggBAFYjhAAAqxFCAIDVCCEAwGqEEABgtf8AsGcIwKk06qoAAAAASUVORK5CYII=\n",
      "text/plain": [
       "<rdkit.Chem.rdchem.Mol at 0x7f3184051e90>"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "md = rdMolStandardize.MetalDisconnector()\n",
    "md.Disconnect(Chem.MolFromSmiles(\"CCC(=O)O[Na]\"))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## rdMolStandardize.LargestFragmentChooser and rdMolStandardize.FragmentRemover"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "-  The __rdMolStandardize.LargestFragmentChooser__ class returns the largest fragment of a molecule."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAcIAAACWCAYAAABNcIgQAAAPXUlEQVR4nO3da2xUdf7H8U//lNJS7pRyEWmtKCigQIuFReOyVIKxFmOsMaFlcdWikLQaYuo+MGVD3IDZ6piFB82uWYoQCEQpZcULZkULirFcU1yUixCCZcq9paUXOt99MOH2R662c6bze78SnszMmfny6J3f+Z1zGmVmJgAAHPV/Xg8AAICXCCEAwGmEEADgNEIIAHAaIQQAOI0QAgCcRggBAE4jhAAApxFCAIDTCCEAwGmEEADgNEIIAHAaIQQAOI0QAgCcRggBAE4jhAAApxFCAIDTCCEAwGmEEADgNEIIAHAaIQQAOI0QAgCcRggBAE4jhAAApxFCAIDTCCEAwGmEEADgNEIIAHAaIQQAOI0QAgCcRggBAE4jhAAApxFCAIDTCCEAwGmEEADgNEIIAHAaIQQAOI0QAgCcRggBAE4jhADCz4oV0ujRUmys1L+/lJcnnTx56f2RI6Vly648ZuVKafjwkI6JyEAIAYSXkhJp9mzpjTckv1/auFE6fFjKyJCamryeDhGIEAIIH42N0p//LBUXS889J/XsKd13n/Thh1J1tVRa6vWEiECEEED4qKyUTp2SsrOvfL1rVykrS9qwwZu5ENEIIYDwcfy4FB8vde9+9XsDBwbfv+CVV6SEhEv/8vJCNyciCiEEED4SEqT6eqmu7ur3qquD71/w179KO3Zc+rdgQYiGRKQhhADCR2qq1KuXtHr1la83NEjl5dLkyZde691bGjz40r8+fUI6KiJHtNcDAMBFcXHSW29Jc+cGT5FOnRpcCb72mpSYKM2c6fWEiECsCAGEl9mzpUWLgqc+ExOlRx6R7rhD+s9/gvcVAm0syszM6yEAAPAKK0IAgNMIIYDwU10tTZr061ePAm2MEAIIP0eOSBUVwRvpgXZGCAGEH79f6ttX6tTJ60ngAEIIIPz4/cG/OgGEACEEEH5qagghQoYQAgg/rAgRQoQQQPjx+4M30wMhQAgBhB9OjSKECCGA8MOpUYQQIQQQfgghQogQAggvra3SiRPsESJkCCGA8HLsmBQIsCJEyBBCAOHF75eioqR+/byeBI4ghADCS01N8K/Ud+ni9SRwBCEEEF64UAYhRggBhBdupkeIEUIAYcVXW6sFw4d7PQYcQggBhJUdhw7pSEyM12PAIYQQQFjx+/1K5NQoQogQwlOnT59Wc3Oz12MgjPj9fvXnYhmEECFEyAUCAa1du1YTJ05Uenq6xo0bp6qqKq/HQpioqakhhAgpQoiQaWlp0dKlSzVq1Cjl5OQoLS1NZWVlGj9+vMaNG6eFCxcqEAh4PSY8ZGY6duwYIURIRZmZeT0EItvZs2f1/vvvq7i4WI2NjZo9e7by8/PVp0+fi5/56KOPNGvWLD344INasmSJBg8e7OHE8MrJkyfVt29f7d+/XykpKV6PA0ewIkS7OXbsmObNm6ekpCS99957mjt3rg4ePKh58+ZdEUFJevrpp1VVVaXY2FiNHDlSy5cv92hqeMnv90sSK0KEFCFEm/v5559VUFCg5ORklZWV6d1339VPP/2kgoICde3a9ZrH9e/fX+vWrdPbb7+tvLw8Pfvsszp16lQIJ4fX/H6/4uPjFR8f7/UocAghRJvZuXOnZsyYoXvvvVdbt27VypUrtX37ds2YMUPR0dE39R1RUVHKy8tTZWWl9u/frzFjxuirr75q58m91dzcrD179ujEiRNej+I5rhiFFwghfrNNmzbpySefVGpqqk6dOqVNmzZdfC0qKuq2vvO+++7Tli1bNHPmTGVkZKigoCDibrOoq6tTcXGxUlJSNG3aND3wwAP6/PPPvR7LU1wxCi8QQtyWQCCgdevWafz48crIyFDv3r21e/durVu3Tunp6W3yG507d9a8efP09ddf6+OPP1ZaWpp27drVJt/tpQt7p8nJyVq8eLFef/11bd26Vfn5
+crKytKsWbPU0NDg9ZieYEUILxBC3JKmpiYtXbpU999/v3JycpSenq4DBw5o6dKlGjZsWLv85oQJE7Rt2zZNmDBB6enpHfY2i8v3TteuXXvF3mm3bt1UWFioiooKbdy4UWlpadq2bZvXI4fchafKtLS0RPwpcYQRA27CmTNnzOfz2aBBg2zAgAFWVFRkp06dCvkcq1atsj59+thjjz1mR44cCfnv344dO3ZYbm6uRUdH28SJE628vNwCgcA1P9/Q0GD5+fnWuXNnKyoqsvPnz4dwWm9lZWXZm2++aV9++aXFxMTYnDlzrKGhweuxEOEIIa7r6NGjVlRUZL169bKhQ4eaz+ezc+fOeTpTdXW1Pf7445aQkGBr1qzxdJbrqaiosMzMTOvUqZNlZmbad999d0vHf/rppzZw4ECbMGGC7du3r52mDC/p6em2aNEiMzP7/vvvbdiwYTZ8+HCrrKz0eDJEMkKIX7V3717Lz8+32NhYS01NtdLS0rBamQQCAfP5fNalSxfLzc21uro6r0cyM7PW1lYrLy+39PT0i7Pt2bPntr/P7/dbVlaW9ejRw0pKStpw0vDj9/tt4MCBtmrVqouvubw6RugQQlyhsrLy4mm8jIwMKy8v93qk66qqqrLRo0dbcnKyVVRUeDZHY2OjlZaW2rBhw6xHjx6Wn5/fpqduS0tLrVu3bpadnW0nTpxos+8NB/v27bNXXnnF4uLiLCsryxobG6/6zGeffWaDBg1yanWM0CGEsEAgYBs2bLDMzEzr3Lmz5ebmWlVVlddj3bRz585ZYWGhxcTEWGFhoTU3N4fst0O5d7p//36bOHGiDRgwwD755JN2+Y1QutW905qaGps2bZoTq2OEFiF0WHNzs5WWltrIkSOtW7dulp+fb4cOHfJ6rNv2xRdf2ODBg23cuHH2448/tutvVVdXe7J32tLSYkVFRRYdHW35+fm/unoKd/9/73TLli3m9/uttrb2po6/sDp+5pln7Pjx4+08LVxACB3U0NBg7777rt15552WmJho8+fPj5jTbadPn7bp06dbXFyc+Xy+664wbke47J1u2bLFhg4daiNGjLDt27eH/Pdv1eV7pzExMVftnc6ZM8fuuusu27Rp001934EDBy6ujtevX99eY8MRhNBB9fX19tBDD9nixYsj9tL0VatWWe/evW3q1Kn2yy+//Obvu3zvNDMz0zZs2NAGU/42tbW1lpeXZ7GxsbZgwQJrbW31eqSrNDU1WWlpqQ0fPvy6e6fNzc1XrHSbmppu+N0tLS22YMECi4mJsby8PKuvr2+P/wIcQAgRsQ4ePGiPPvqoJSYm3tZFPx1l7/TDDz+0hIQEmzx5sh0+fNjrccwsGGmfz2d33HGH9e/f/6b3Tjdv3mwpKSmWlpZ201fbfvPNN3b33Xdb8XPPme3a9Rsnh4sIISJaa2ur+Xy+i6fjzp49e8NjLuydjhgxosPsnR49etSeeOIJ69mzpy1btszTOX7r3unp06ctNzf3lk5v19bWWlNenlmXLmbFxWZhuDpG+CKEcMKuXbts1KhRlpKSYps3b/7Vz9TV1ZnP57MhQ4ZYv379rKioqENdjBEIBKykpMS6du1q2dnZdvLkyZD99r59+y7unY4dO7ZN9k4vPEVoypQpN38rykcfmSUkmP3hD2ZhsjpG+COEcMblt1lcfnN2TU2NFRUVWd++fS05Odl8Pl+H3m/64YcfbOzYsZaUlGQbN25s19/aunWr5ebmWqdOnS7eAtGWDh06ZJMmTbJ+/fpZWVnZzR109KhZZqZZz55mH3zQpvMgMhFCOGf9+vU2YMAAS09Pt5kzZ1pcXJyNGTPGVqxYETFPLrmdi09uxYVbIDp37mzZ2dnt+gi023qKUCBgVlJiFh9vlp1tFsLVMTqeKDMzrx/8DYTasWPHNH/+fB09elQvvviipkyZ4vVI7eLbb79VTk6O4uPjtXz5co0aNeq2v6ulpUVlZWVauHCh9uzZoxdeeEFz
587VkCFD2nDia9u9e7emT5+u2tpaLV26VA8//PCND/rvf6WcHOn4cam0VPr979t9TnRAXpcYQPs6c+bMxdssbufeyrNnz17cO01ISPB07/TcuXNWUFBgMTEx9mNxsVlLy40PamkxKyoyi442y883a+PVMTo+VoSAI1avXq2XX35ZqampWrJkiQYNGnTdzx8/flyLFi3SokWL1L17d7366qt66aWX1LVr1xBNfG27v/xSI3JypCFDpA8+kIYOvfFB334r5eZKcXHS8uXSAw+0/6DoEPjDvIAjsrOztWPHDp0/f16jR49WWVnZr37u4MGDKigoUFJSktasWaN33nlHe/fuVUFBQVhEUJJGTJok7d4tpaQEg/beezc+aMIEads26Xe/k9LTpYULpQ74B57RDrxekgIIrWtdfLJz585begh22Fi1yqx3b7OpU82qq2/umJUrzSZONPP4b2siPHBqFHDUzp07lZOTo/r6eiUlJamiokJPPfWUCgsLNW7cOK/HuzWHDkkzZkh79kj//Kf05JNeT4QOhBACDmtsbFRJSYn8fr+ef/553XPPPV6PdPsCAenvf5cKC6U//lF65x0pPt7rqdABEEIAkaWyMnjLRFSUtGyZlJrq9UQIc1wsAyCypKVJ27dLU6YEL5CZN09qbb3251eskEaPlmJjpf79pbw86eTJUE2LMEAIAUSeuLjglaT//rf0j39IDz8s7d9/9edKSqTZs6U33pD8fmnjRunwYSkjQ2pqCvnY8AanRgFEtpoa6aWXpD59pH/969LrjY3SoEHS3/4m/elPl15vaJDuvlv6y1+Cq0NEPEIIwA2NjcHTnxds2iQ98ohUWyt1737lZ2fNCp4eXb06tDPCE5waBeCGyyMoBZ8/Gh9/dQQlaeDA4PtwAiEE4KaEBKm+Xqqru/q96urg+3ACIQTgptRUqVevq09/NjRI5eXS5MmejIXQi/Z6AADwRFyc9NZb0ty5wVOkU6cGV4KvvSYlJkozZ3o9IUKEi2UAuG35cuntt4OPZ+vRQ5o2LfhA7r59vZ4MIUIIAQBOY48QAOA0QggAcBohBAA4jRACAJxGCAEATiOEAACnEUIAgNMIIQDAaYQQAOA0QggAcBohBAA4jRACAJxGCAEATiOEAACnEUIAgNMIIQDAaYQQAOA0QggAcBohBAA4jRACAJxGCAEATiOEAACnEUIAgNMIIQDAaYQQAOA0QggAcBohBAA4jRACAJxGCAEATiOEAACnEUIAgNMIIQDAaYQQAOA0QggAcBohBAA4jRACAJxGCAEATiOEAACnEUIAgNP+B3CmTafC6asLAAAAAElFTkSuQmCC\n",
      "text/plain": [
       "<rdkit.Chem.rdchem.Mol at 0x7f31840517b0>"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "lfc = rdMolStandardize.LargestFragmentChooser()\n",
    "lfc.choose(Chem.MolFromSmiles(\"O=C(O)CCC.O=C(O)CCCC.O=C(O)CCCCC.O=C(O)CCCC\"))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The __rdMolStandardize.FragmentParent()__ function is similar to __rdMolStandardize.LargestFragmentChooser__, but first performs __rdMolStandardize.Cleanup()__."
   ]
  },
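  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A short sketch of the convenience function, assuming the default parameters are acceptable. The sodium benzoate SMILES is only an illustrative input."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from rdkit import Chem\n",
    "from rdkit.Chem.MolStandardize import rdMolStandardize\n",
    "\n",
    "# FragmentParent() runs Cleanup() first, then keeps the largest fragment.\n",
    "salt = Chem.MolFromSmiles(\"[Na+].O=C([O-])c1ccccc1\")\n",
    "parent = rdMolStandardize.FragmentParent(salt)\n",
    "Chem.MolToSmiles(parent)"
   ]
  },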
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "-  The __rdMolStandardize.FragmentRemover__ class filters out fragments that match a list of fragment patterns, such as common counterions and solvents."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAcIAAACWCAYAAABNcIgQAAAGkklEQVR4nO3dTYiVhR7H8Z8vA4UYFlYElb3qDA7lYmgIgnaDQygFbiqbRRBywe4oZVikSU1hEIIUTdYiGHpDVyluFMQ2USHNQrAzBS6awgKlqKjEPOcuzoULF2732p3xmfp/Ppt54cx5fgwcvjznOTNnXqfT6QQAiprf9AAAaJIQAlCaEAJQmhACUJoQAlCaEAJQmhACUJoQAlCaEAJQmhACUJoQAlCaEAJQmhACUJoQAlCaEAJQmhACUJoQAlCaEAJQmhACUJoQAlCaEAJQmhACUJoQAlCaEAJQmhACUJoQAlCaEAJQmhACUJoQAlCaEAJQmhACUJoQAlCaEAJQmhACUJoQAlCaEAJQmhACUJoQAlCaEAJQmhACUJoQAlCaEAJQmhACUJoQAlCaEAJQmhBCg8bHxzM+Pt70DChtYdMDoLLJycmmJ0B5zggBKE0IAShNCAEoTQgBKE0IAShNCAEoTQgBKE0IAShNCAEoTQgBKE0IAShNCAEoTQgBKE0IASjN2zBBg5Yu/XvTE6A8IYQGnT7d3/QEKM9TowCUJoQAlCaEAJQmhACUJoQAlCaEAJQmhACUJoQAlCaEAJQmhACUJoQAlCaEAJQmhACUJoQAlDav0+l0mh4BVbVa3Y+9vc3ugMqEEIDSPDUKDenvTy67LDlz5l/fe++9ZGCguU1QkRBCg3p6khdfbHoF1CaE0KDNm5PXXku++abpJVCXEEKDenuT++5LXnih6SVQlxBCw555JnnzzWR6uuklUJMQQsNuuil54IHk2WebXgI1LWx6AJA8/XTS15fccEPTS6AeZ4Qwy9rt7lOfW7b859tcd13y8MPJrl3dr7ds6f5Mu31xNkJlQgiz6MMPk8HB5LHHkmXLfv+2Tz2V/Ppr9/Nly7o/MzjYvQ9g9vjPMjALvv46efLJ5J13utf/XnopueqqC7uP775LduxIXn01Wb06efllT53CbHBGCDPol1+6fyDf15d89VXy6afJxMSFRzBJLr882b07OX48OX8+Wbky2bo1+emnmd8NlTkjhBly4EAyOpr89lsyNpaMjMz8/W/enJw9mzz/fPLQQ8m8eTN7DKjIGSH8nyYnk7vvTu6/vxu/zz+f+QgmyZo1yWefJY8/njz6aHLnncnHH8/8caAaIYQ/6MyZ7hngHXckV1+dnDjRvaZ3ySWzd8yenu4xW63k9tuTu+7qRvfbb2fvmPBXJ4Rwgc6d6167u/nm7is6P/gg2bs3uf76i7fhmmuSPXuSjz5KTp5MbrmlG+GzZy/eBvircI0QLsDBgwfzxhtf5tixv2XnzuTBB5u/TtfpJG+/3X0hzcDAaB55ZCj33HNPs6PgT8QZIfwPWq1WhoeHs27duvT3f52pqXbWr28+gkl3w/r1ydRUO/39i7Nu3boMDw+n1Wo1PQ3+FIQQfsf333+frVu3ZtWqVVm4cGFOnDiRsbGxLFo09x46ixbNz9jYWL744otceeWVue2227Jhw4acPn266Wkwp829RzPMAe12OxMTE1mxYkX279+f999/PwcOHMiNN97Y9LT/6tprr83ExEQOHz6cTz75JCtWrMju3btz/vz5pqfBnOQaIfybo0ePZtOmTZmens727duzcePGLFiwoOlZf0i73c5bb72VJ554IldccUV27dqV1atXNz0L5hRnhPBP09PTGRkZydDQUAYHBzM1NZXR0dE/bQSTZP78+RkZGUmr1cratWtz7733Zs2aNTl58mTT02DOEELK+/nnn7Njx44sX748p06dyuTkZPbs2ZOlS5c2PW3GLFmyJDt37szx48dz6aWXpq+vL6Ojo/nhhx+angaNE0LK6nQ62bdvX/r6+vLuu+9m3759OXz4cFauXNn0tFlz6623Zu/evTl48GCOHDmS3t7evP7662l7vycKc42Qso4dO5ahoaFs27Yt
GzduTE9PT9OTLqpz587llVdeyXPPPZdDhw5lYGCg6UnQCCGktB9//DGLFy9uekaj/A6oTggBKM01QgBKE0IAShNCAEoTQgBKE0IAShNCAEoTQgBKE0IAShNCAEoTQgBKE0IAShNCAEoTQgBKE0IAShNCAEoTQgBKE0IAShNCAEoTQgBKE0IAShNCAEoTQgBKE0IAShNCAEoTQgBKE0IAShNCAEoTQgBKE0IAShNCAEoTQgBKE0IAShNCAEoTQgBKE0IAShNCAEoTQgBKE0IAShNCAEoTQgBKE0IAShNCAEoTQgBKE0IASvsHwHdaImkAjPUAAAAASUVORK5CYII=\n",
      "text/plain": [
       "<rdkit.Chem.rdchem.Mol at 0x7f3184051d00>"
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "fr = rdMolStandardize.FragmentRemover()\n",
    "fr.remove(Chem.MolFromSmiles(\"CN(C)C.Cl.Cl.Br\"))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## rdMolStandardize.Reionizer and rdMolStandardize.Uncharger"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "-  __rdMolStandardize.Reionizer__ ensures that the strongest acid groups ionize first in partially ionized molecules.\n",
    "-  __rdMolStandardize.Uncharger__ attempts to neutralize charges by adding and/or removing hydrogens where possible.\n",
    "\n",
    "\n",
    "The __rdMolStandardize.ChargeParent()__ function returns the uncharged version of the fragment parent. It takes the fragment parent, then neutralizes it (__Uncharger__) and reionizes it (__Reionizer__)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAcIAAACWCAYAAABNcIgQAAATxUlEQVR4nO3de1BV5f4G8MeMFK/HI6lYY57Qvd2hhhcQZwJsTsrNy9GycMJGxcQwJXEkpfF4wQ4qIIq38AaapJlH8lIzpZiTCJZgSMhli5UpdFDDAiIS2Ov3x/5xzvG4tqLCetfmfT4z/cP7/vHUnDnPWnut9X7bKIqigIiISFKPiA5AREQkEouQiIikxiIkIiKpsQiJiEhqLEIiIpIai5CIiKTGIiQiIqmxCImISGosQiIikhqLkIiIpMYiJCIiqbEIiYhIaixCIiKSGouQiIikxiIkIiKpsQiJiEhqLEIiIpIai5CIiKTGIiQiIqmxCImISGosQiIikhqLkIiIpMYiJCIiqbEIiYhIaixCIiKSGouQiIikxiIkIiKpsQiJiEhqLEIiIpIai5CIiKTGIiQiIqmxCImISGosQiIikhqLkIiIpMYiJCIiqbEIiYhIaixCItKHvXsBNzegfXugZ09g1iygokJ0KpIAi5CIxEtKAsLCgEWLgPJy4ORJ4MoV4IUXgD/+EJ2OWrk2iqIookMQkcRqa4HevYG4OGDGjP/8vaYGcHEBli+33h0StRDeERKRWNnZwM2bwOTJt/+9Qwdg/Hjg2DExuUgaj4oOQPQwCgsLER8fjx9//FF0lDsEBgZi1qxZcHR0FB1F327cADp2BDp3vnPN2Rkwm7XPRFJhEZLdqqurw8SJE2EymTBq1CjRcW6jKArWrl2LyspKLFmyRHQcfXNyAn77DaiqurMMf/rJuv7ee8Abb1j/5uNjfYZI1Ez4jJDsVnx8PBISElBUVIROnTqJjnOHtLQ0vPrqqygsLMRTTz0lOo5+/f679RlhfLz6M8KlS4HZs8Xlo1aPRUh26fr16zAYDNiwYQOCg4NFx7HJ19cXTk5OSE1NFR1F3zZvBt55x3rn5+dnvROcPx8oKwO++sr6SQVRC2ERkl0KDQ1FXl4eMjMz0aZNG9FxbCooKICbmxvS09Ph5eUlOo6+paYCa9YARUVAly7AhAnA6tVA9+6ik1ErxyIku5Obmwt3d3dkZGRgxIgRouPc05w5c3DmzBmcPXsWjzzCF7WJ9IZFSHZn1KhR6Nu3L1JSUkRHaZKKigoYDAbExsZi+vTpouPoR10d4OAgOgURi5Dsy/79+xESEoLi4mL07t1bdU9WVhZE/M+6Z8+ecHFxUV1LTEzEP/7xDxQXF6Nr164aJ9OpN98EHB2B2FjRSUhyLEKyG7W1tTCZTJg1axYWL15sc5+joyMaGho0TGY1ffp0JCUlqa7V19djyJAhCAwMxKpVqzROpkMFBdZzRdPTAT47JcFYhGQ3oqOjkZycjIKCArS3w7cI09PTERAQgPz8fPTv3190HLF8fa3fB/JtWtIBFiHZhdLSUhiNRuzevRuTJk0SHeeBjR8/Hm3btkVaWproKOKkpQGvvgoUFgL8vpJ0gEVIdiE4OBhlZWU4ceKE6CgP5dKlS3B1dcWhQ4fg6+srOo72bt0CBg4EgoOBv/9ddBoiACxCsgNZWVnw9vZGTk4OBg8eLDrOQ4uMjMQnn3yC3NxcOMj21mRMDLBli/VbwQ4dRKchAsAiJJ1TFAWenp4YNmwYNm/eLDpOs6iqqoLRaMQ777yDOXPmiI6jnfJywGAAtm0DXn5ZdBqif2MRkq6lpKQgIiICZrMZTk5OouM0m+3btyMyMhIXL15Ed1lOTpk2DSgpAU6dAnR8GhDJh0VIulVdXQ2j0YjIyEiEh4eLjtOsLBYLPD094enpicTERNFxWt65c8CIEUBWFjB8uOg0RLdhEZJuLV68GIcOHcL58+db5bO0zMxM+Pj44Ny5cxg0aJDoOC1HUQBvb8BkArZuFZ2G6A4sQtKl7777Dq6urkhLS4Ofn5/oOC1mypQpuHbtGtLT00VHaTl79gBhYdYBu716iU5DdAcWIenSxIkTUV9f
jyNHjoiO0qKuXr2KAQMG4IMPPsD48eNFx2l+NTXWO8F584AFC0SnIVLFIiTdOXHiBPz9/ZGXlwej0Sg6TotbtmwZ9uzZgwsXLqBdu3ai4zSvJUuAffuA/Hygtf27UavBmTCkKw0NDZg/fz7mzZsnRQkCwNtvv436+nqsX79edJRmdfnyZXx95gyUdetYgqRrvCMkXdm0aRNWrFgBs9ks1ZSGvXv3IjQ0FMXFxXB2dhYdp1lMnjwZv/zyC44dOyY6CtFdsQhJN27evAmDwYCYmBjMnDlTdBzN+fj4oF+/ftixY4foKA/t9OnTeP7553Hu3DkMHDhQdByiu2IRkm7MmzcPGRkZOHv2LNq2bSs6jua++eYbeHh4IDMzE+7u7qLjPDCLxQIPDw94eXkhISFBdByie2IRki4UFhbi2WefxfHjx+Ht7S06jjAhISEoLCzE6dOn0cZOT19JSkpCVFQUzGazPKfmkF1jEZIu+Pv7o2vXrti3b5/oKEJdu3YNBoMBW7ZswZQpU0THuW+VlZUwGo1YunQpZs+eLToOUZOwCEm4w4cPIygoCAUFBejbt6/oOMKtWbMGGzZsQFFRETp27Cg6zn1ZsGABPvvsM+Tm5uLRRx8VHYeoSViEJNStW7cwaNAgTJkyBcuWLRMdRxfs9b9JSUkJBg4ciCNHjmD06NGi4xA1GYuQhIqNjUViYqJd3v20pMa75MLCQjxlJ1PcAwMD4ejoiAMHDoiOQnRfWIQkjL0/D2tpfn5+6NatG/bu3Ss6yj0dP34cY8eORX5+Pvr16yc6DtF9YRGSMDNnzkRBQYFdvyHZkuzlTdr6+nq4ublhwoQJePfdd0XHIbpvLEISovGbudOnT8PDw0N0HN2yh28rExISsHr1apjNZjg4OMDR0VF0JKL7wiIkIVrTKSotSe+n7VRUVKB///5ISEiAi4sLgoKCkJ+fL9XxeGT/WISkuVOnTiEgIAAXL15EL5X5dIqiID4+HuPHj4fBYBCQUFs7duyAq6srPD09Vdc3bNiAmJgYBAYG6u4n5EuXLqG6uhpnzpyBxWLB0KFDMWbMGMTGxoqORtRknD5ButOmTRtkZmZigQTz665evYrw8HBcu3ZNdJQH4uPjg0OHDqFNmzZo27YtEhISkJiYCLPZLDoaUZPxjpCE8PHxgYuLC3bu3Km63jih/uDBg/D399c4nXaCgoJw48YNHD9+XHW98afRVatWISQkRON06i5fvgxHR0f06NFDdX3ixIloaGjA4cOHNU5G9GBYhCREU16WiYqKQlpaGvLy8uDg4KBxwpaXmZkJHx8fnDt3DoMGDVLdM3fuXJw+fRrZ2dl45BF9/IAzZswYPPnkk/e8iElLS4Ofn5/G6YjuH4uQhHn99ddx4cIFm59PVFdXw2g0IjIyEuHh4QISthyLxYIRI0Zg5MiRSExMVN1TUFAANzc33X0+0ZSLmMWLF+PQoUM4f/58q7yIoVZGIRKkvLxc6dq1q5KammpzT3JystKtWzfl+vXrGiZredu2bVO6deum3Lhxw+YeX19fZcqUKRqmarqQkBBl5MiRisViUV2vqqpSevfuraxbt07jZET3j0VIQsXGxipPPPGEUl1drbpusVgUDw8P5Y033tA4WcuprKxUnJ2dlY0bN9rc8/HHHyuOjo7KDz/8oGGyppP5IoZaH/40SkI1HjAdFBSE5cuXq+45c+YMvLy8kJ2djWeffVbjhM1v4cKF+PTTT3H+/HnVCQ32cuh2bGws1q9fj+LiYtVzYhVFgaenJ4YNG4bNmzcLSEjUNCxCEu7IkSN45ZVX7jqGaerUqSgtLcWJEye0DdfMLl26BFdXVxw+fBhjxoxR3bN69Wps3LhR9weR389FTE5ODgYPHqxxQqKmYRGSLvj7+6NLly748MMPVddLS0sxYMAA7Nq1C5MmTdI4XfMZN24cHBwccPDgQdX18vJyGI1GvPfeewgKCtI43f1rykVMcHAwysrK7P4ihlovFiHp
QuMB08eOHYOPj4/qnpUrV2Lnzp0oKChA+/btNU748NLT0xEQEID8/Hz0799fdc+MGTNQXFyMjIwM3Z0iY4u/vz+6du2Kffv2qa6XlpbCaDRi9+7ddn0RQ60Xi5B0Izw8HF9++SWys7NVD5iura2FyWTC66+/jqioKAEJH1x9fT2GDBmCsWPHIiYmRnVP42cJmZmZcHd31zjhg2vKRUx0dDSSk5Pt9iKGWjlBL+kQ3aGiokJxcnJStm7danPP/v37lU6dOimlpaUaJnt469atU3r27Kn8+uuvqusWi0Xx8vJSQkJCNE7WPObNm6e4ubkp9fX1qus1NTXKdD8/5V8JCdoGI2oCFiHpyqZNm5QePXooN2/etLln1KhRymuvvaZdqIf0888/K927d1eSk5Nt7klNTVU6d+6slJWVaResGTVexGzbts32pv37FaVTJ0Wxs4sYav340yjpSkNDA4YOHYrRo0cjLi5Odc/58+cxfPhwZGRkYMSIERonvH9hYWH46quvcPbsWdVj0n7//XeYTCbMmTMHCxcuFJCweWzatAkrVqyA2Wy2PYZp1Cigb18gJUXDZER3xyIk3fniiy/g5+eHvLw8GI1G1T2hoaHIycnB119/rZszONU0HpN24sQJPPfcc6p7li5ditTUVFy4cAHt2rXTOGHzabyIuesYptxcwN0dyMgA7OAihuTAIiRdmjRpEm7duoWjR4+qrl+/fh0GgwGJiYmYOnWqxumaztfXF48//jj27Nmjun716lUYjUbs27cP48aN0zhd8/v65Ek8Nncu3P75T8DWLMnQUCAvD8jMBOzkzVhq3ViEpEtNGcO0du1axMbGori4GF26dNE44b0dPHgQwcHBKCoqQp8+fVT33GsMk12aOBGorweOHFFfv37dWpKJiYCOL2JIHixC0q17jWGqq6vDoEGD8NJLL2HlypUCEtp269YtDBw4EFOnTsWSJUtU9zRlDJNd+u47wNUVSEsDbI1hWrvW+k9REdCpk7b5iP6Hfh+ukPSioqJQWVmJTZs2qa47ODhgw4YNiIuLQ0lJicbp7i4uLg61tbVYsGCB6rrFYkF4eDjCwsJaVwkCwNNPA2+9BUREAHV16nvmzrUW4OrVmkYjUsM7QtK1Xbt24a233oLZbMbjjz+uuicgIAAdOnTAgQMHNE6nrry8HAaDAdu3b8fkyZNV92zbtg2LFi2C2WxG9+7dNU6ogepqwGgEIiMBW7MkP/kEeOkloKAA+MtftM1H9F9YhKRryv9PMBg6dCi2bNmiuqekpASurq44evQoRo8erXHCO02bNg0lJSU4deqU6jFpVVVVMBgMWLJkCcLCwgQk1EhKivWu0GwGnJzU9wQEAB07Ah99pGk0ov/GIiTda8oYpoiICHz++efIzc1VHW2klZycHHh6eiIrKwvDhw9X3XOvMUythqIAnp7AsGGArTFMRUXA4MHWu0MdXMSQnFiEZBfuNYapsrISRqMRS5cuxezZszVOZ6UoCry9vfHMM88gKSlJdU9TxjC1KllZgLc3kJNjLTw18+cDx45ZvzFszRcGpFssQrILjWOYUlJS8OKLL6ruSUpKQlRUlLDnbu+//z7efPNNFBcXo1evXqp77jWGqVUKDgbKygBbY5h++cX6OUV0tPUbQyKNsQjJbtxrDJPFYoGHhwe8vLyQkJCgabaamhqYTCaEh4cjIiJCdc/x48cRGBh41zFMrVJpqfXFmd27AVtjmJKSrN8V5ufzI3vSHIuQ7EZTxjCdPHkSfn5++P777+Hs7KxZtrVr1yIpKQnffvstHnvssTvWmzKGqVWLjgaSk61viKqNYWposL5pauuMUqIWxCIku/LRRx/9e3ht7969VfeYzWYYbB3v1ULq6upw5coVPP3006rr69evR0xMDMxmsy5PwWlxtbWAyQTMmgUsXiw6DdFtWIRkd55//nn06dMHu3btEh2lSSoqKmAwGBAXF4dp06aJjiPO/v1ASAhQXAzYuIgh
EoFFSHancQzTF198YXOig57cawyTVP72NyAsDJDhjVmyGyxCskuzZ89Gdna27scwXbhwAUOGDLnrGCbp7d1rPWqtqMj6jHDCBGDVKuDPfxadjCSh3/8HIbqL6OhoXLp0CampqaKj3NX8+fPx8ssvswRtSUqy3iEuWgSUlwMnTwJXrgAvvAD88YfodCQJ3hGS3WoNY5ikVltrfVYYFwfMmPGfv9fUAC4uwPLl1pdriFoYi5DsVuMYJpPJBHd3d9FxbqMoCrZu3YqZM2faHMMkvYwMwMsLqKwEOne+fS00FKio4BmkpAmeZ0R2y8HBAWlpaYiPj8fJkydFx7lDREQEZvGOxrYbN6wHbv9vCQKAs7P1sG4iDbAIya6ZTCZs375ddAx6EE5OwG+/AVVVd5bhTz/ZnlhB1Mz4sgwRiTFsGPCnP93582dNDXD4MPDXvwqJRfLhHSERieHoCLz7LrBggfUnUj8/653g/PlAjx6AzIcPkKb4sgwRiZWaCqxZY/2OsEsX63eEq1cDAiaIkJxYhEREJDU+IyQiIqmxCImISGosQiIikhqLkIiIpMYiJCIiqbEIiYhIaixCIiKSGouQiIikxiIkIiKpsQiJiEhqLEIiIpIai5CIiKTGIiQiIqmxCImISGosQiIikhqLkIiIpMYiJCIiqbEIiYhIaixCIiKSGouQiIikxiIkIiKpsQiJiEhqLEIiIpIai5CIiKTGIiQiIqmxCImISGosQiIikhqLkIiIpMYiJCIiqbEIiYhIaixCIiKSGouQiIikxiIkIiKpsQiJiEhqLEIiIpIai5CIiKT2f1hAYBW74cWxAAAAAElFTkSuQmCC\n",
      "text/plain": [
       "<rdkit.Chem.rdchem.Mol at 0x7f318405f3f0>"
      ]
     },
     "execution_count": 26,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "mol = Chem.MolFromSmiles(\"[Na+].O=C([O-])c1ccccc1\")\n",
    "\n",
    "lfc = rdMolStandardize.LargestFragmentChooser()\n",
    "mol = lfc.choose(mol)\n",
    "mol"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAcIAAACWCAYAAABNcIgQAAAT90lEQVR4nO3de1BV5f4G8MeMFG9kmoo15cnc2x3eFcSZAJuTyqXsYFk4YqPiLUxRHEkojxYaGiiKt1ATvJCkJonVTKnoJIIlGBpy2WJlih7UsICIuOz1+2P/yONxbUWF9a7N+3xm/APet5mnxulZa++13m8LRVEUEBERSeoB0QGIiIhEYhESEZHUWIRERCQ1FiEREUmNRUhERFJjERIRkdRYhEREJDUWIRERSY1FSEREUmMREhGR1FiEREQkNRYhERFJjUVIRERSYxESEZHUWIRERCQ1FiEREUmNRUhERFJjERIRkdRYhEREJDUWIRERSY1FSEREUmMREhGR1FiEREQkNRYhERFJjUVIRERSYxESEZHUWIRERCQ1FiEREUmNRUhERFJjERIRkdRYhEREJDUWIRERSY1FSEREUmMREhGR1FiEREQkNRYhERFJjUVIROLt3AkMGAC0bg107QpMmwaUlt5Y79MH2LHj5n8mORno3VvTmNQ8sQiJSKz4eCA4GFiwACgpAY4cAS5cAJ5/HvjrL9HpSAIsQiISp6oKCA8HVqwAAgIAJyfAZAI+/RS4fBnYulV0QpIAi5CIxMnKAq5fB8aOvfn3bdoAo0cDBw6IyUVSeVB0AKL7kZ+fjxUrVuCXX34RHeUWfn5+mDZtGhwdHUVH0a9r14C2bYH27W9dc3YGzOYbP7/xBjBnzo2fq6uB7t2bPCI1fyxCsls1NTXw9/eHyWTC8OHDRce5iaIoWLlyJcrKyrBw4ULRcfSrc2fgjz+A8vJby/DyZet6vfffB/z9b/ycmgrExWmTk5o1FiHZrbi4OFRUVGD79u1o166d6Di3eOaZZzB+/Hi8/vrrePLJJ0XH0afBg4GHHwZ27wYmT77x+8pKa9EtWnTjdx07Ao8/fuPnRx7RLCY1b/yOkOzS1atXsWTJEixbtkyXJQgA/v7+8PDwQEREhOgo+uXoCCxdCsybB3zyCfD770BBAfDyy0CXLsDEiaITkgRYhGSX3nnnHfTu3Rvjx48XHeW2YmNjsXv3bhw9elR0FP0KDgbWrrV+9NmlC+DhATz2GJCWZn2vkKiJtVAURREdguhu5OTkwNXVFenp6Rg6dKjoOHc0c+ZMHD9+HCdOnMADD/Dak0hvWIRkd4YPH44ePXogMTFRdJQGKS0thcFgQHR0NCZNmiQ6jn7U1AAODqJTELEIyb7s2rULQUFBKCwsRHcbj85nZmZCxF/rrl27omfPnqprcXFxeP/991FYWAgnJyeNk+nUm29avyOMjhadhCTHIiS7UVVVBZPJhGnTpiE8PNzmPkdHR9TV1WmYzGrSpEmIj49XXautrcXAgQPh5+eHZcuWaZxMh/LyrGeLHjpk/U6QSCAWIdmNyMhIJCQkIC8vD63t8CGKQ4cOwdfXF7m5uejVq5foOGKNGmV9RzApSXQSIhYh2Yfi4mIYjUZs27YNY8aMER3nno0ePRotW7ZESkqK6CjipKQA48cD+fkA368kHWARkl0IDAzEpUuXkJaWJjrKfTl37hxcXFywb98+jBo1SnQc7VVXW0cqBQYC//636DREAFiEZAcyMzPh6emJ7Oxs9OvXT3Sc+xYWFoYvvvgCOTk5cJDtqcmoKGDDButL823aiE5DBIBFSDqnKArc3d0xePBgrF+/XnScRlFeXg6j0Yi3334bM2fOFB1HOyUlgMEAbNoEvPqq6DREf2MRkq4lJiYiNDQUZrMZnf/7AGY7t3nzZoSFheHs2bPo1KmT6DjamDgRKCoCjh4FWrQQnYbobyxC0q2KigoYjUaEhYUhJCREdJxGZbFY4O7uDnd3d8TJMEHh5Elg6FAgMxMYMkR0GqKbsAhJt8LDw7Fv3z6cOnWqWX6XlpGRAS8vL5w8eRJ9+/YVHafpKArg6WmdPL9xo+g0RLdgEZIu/fjjj3BxcUFK
Sgq8vb1Fx2ky48aNw5UrV3Do0CHRUZrOjh3Wg7XNZqBbN9FpiG7BIiRd8vf3R21tLfbv3y86SpO6ePEievfujY8//hijR48WHafxVVZa7wRnz7aOWiLSIRYh6U5aWhp8fHxw+vRpGI1G0XGa3OLFi7Fjxw6cOXMGrVq1Eh2ncS1cCCQnA7m5QHP7d6NmgzNhSFfq6uowd+5czJ49W4oSBIC33noLtbW1WL16tegojer8+fP47vhxKKtWsQRJ13hHSLqybt06vPfeezCbzVJNadi5cyemT5+OwsJCODs7i47TKMaOHYvffvsNBw4cEB2F6LZYhKQb169fh8FgQFRUFKZMmSI6jua8vLzw9NNP46OPPhId5b4dO3YMzz33HE6ePIk+ffqIjkN0WyxC0o3Zs2cjPT0dJ06cQMuWLUXH0dz3338PNzc3ZGRkwNXVVXSce2axWODm5gYPDw/ExsaKjkN0RyxC0oX8/Hz0798fBw8ehKenp+g4wgQFBSE/Px/Hjh1DCzs9fSU+Ph4REREwm83ynJpDdo1FSLrg4+MDJycnJCcni44i1JUrV2AwGLBhwwaMGzdOdJy7VlZWBqPRiEWLFmHGjBmi4xA1CIuQhEtNTUVAQADy8vLQo0cP0XGE++CDD7BmzRoUFBSgbdu2ouPclXnz5uGrr75CTk4OHnzwQdFxiBqERUhCVVdXo2/fvhg3bhwWL14sOo4u2Ot/k6KiIvTp0wf79+/HiBEjRMchajAWIQkVHR2NuLg4u7z7aUr1d8n5+fl40k6muPv5+cHR0RF79uwRHYXorrAISRh7/z6sqXl7e6Njx47YuXOn6Ch3dPDgQbzwwgvIzc3F008/LToO0V1hEZIwU6ZMQV5enl0/IdmU7OVJ2traWgwYMAAvvfQSli5dKjoO0V1jEZIQ9e/MHTt2DG5ubqLj6JY9vFsZGxuL5cuXw2w2w8HBAY6OjqIjEd0VFiEJ0ZxOUWlKej9tp7S0FL169UJsbCx69uyJgIAA5ObmSnU8Htk/FiFp7ujRo/D19cXZs2fRTWU+naIoWLFiBUaPHg2DwSAgobY++ugjuLi4wN3dXXV9zZo1iIqKgp+fn+4+Qj537hwqKipw/PhxWCwWDBo0CCNHjkR0dLToaEQNxukTpDstWrRARkYG5kkwv+7ixYsICQnBlStXREe5J15eXti3bx9atGiBli1bIjY2FnFxcTCbzaKjETUY7whJCC8vL/Ts2RNbtmxRXa+fUL937174+PhonE47AQEBuHbtGg4ePKi6Xv/R6LJlyxAUFKRxOnXnz5+Ho6MjunTporru7++Puro6pKamapyM6N6wCEmIhjwsExERgZSUFJw+fRoODg4aJ2x6GRkZ8PLywsmTJ9G3b1/VPbNmzcKxY8eQlZWFBx7Qxwc4I0eOxOOPP37Hi5iUlBR4e3trnI7o7rEISZipU6fizJkzNl+fqKiogNFoRFhYGEJCQgQkbDoWiwVDhw7FsGHDEBcXp7onLy8PAwYM0N3rEw25iAkPD8e+fftw6tSpZnkRQ82MQiRISUmJ4uTkpCQlJdnck5CQoHTs2FG5evWqhsma3qZNm5SOHTsq165ds7ln1KhRyrhx4zRM1XBBQUHKsGHDFIvForpeXl6udO/eXVm1apXGyYjuHouQhIqOjlYee+wxpaKiQnXdYrEobm5uyhtvvKFxsqZTVlamODs7K2vXrrW557PPPlMcHR2Vn3/+WcNkDSfzRQw1P/xolISqP2A6ICAA7777ruqe48ePw8PDA1lZWejfv7/GCRvf/Pnz8eWXX+LUqVOqExrs5dDt6OhorF69GoWFharnxCqKAnd3dwwePBjr168XkJCoYViEJNz+/fvx2muv3XYM04QJE1BcXIy0tDRtwzWyc+fOwcXFBampqRg5cqTqnuXLl2Pt2rW6P4j8bi5isrOz0a9fP40TEjUMi5B0wcfHBx06dMAnn3yiul5cXIzevXtj69atGDNmjMbpGs+L
L74IBwcH7N27V3W9pKQERqMRH374IQICAjROd/cachETGBiIS5cu2f1FDDVfLELShfoDpg8cOAAvLy/VPUuWLMGWLVuQl5eH1q1ba5zw/h06dAi+vr7Izc1Fr169VPdMnjwZhYWFSE9P190pMrb4+PjAyckJycnJquvFxcUwGo3Ytm2bXV/EUPPFIiTdCAkJwTfffIOsrCzVA6arqqpgMpkwdepURERECEh472prazFw4EC88MILiIqKUt1T/1pCRkYGXF1dNU547xpyERMZGYmEhAS7vYihZk7QQzpEtygtLVU6d+6sbNy40eaeXbt2Ke3atVOKi4s1THb/Vq1apXTt2lX5/fffVdctFovi4eGhBAUFaZysccyePVsZMGCAUltbq7peWVmpTPL2Vv4TG6ttMKIGYBGSrqxbt07p0qWLcv36dZt7hg8frrz++uvahbpPv/76q9KpUyclISHB5p6kpCSlffv2yqVLl7QL1ojqL2I2bdpke9OuXYrSrp2i2NlFDDV//GiUdKWurg6DBg3CiBEjEBMTo7rn1KlTGDJkCNLT0zF06FCNE9694OBgfPvttzhx4oTqMWl//vknTCYTZs6cifnz5wtI2DjWrVuH9957D2az2fYYpuHDgR49gMREDZMR3R6LkHTn8OHD8Pb2xunTp2E0GlX3TJ8+HdnZ2fjuu+90cwanmvpj0tLS0vDss8+q7lm0aBGSkpJw5swZtGrVSuOEjaf+Iua2Y5hycgBXVyA9HbCDixiSA4uQdGnMmDGorq7G559/rrp+9epVGAwGxMXFYcKECRqna7hRo0bh0UcfxY4dO1TXL168CKPRiOTkZLz44osap2t83x05godmzcKATz8FbM2SnD4dOH0ayMgA7OTJWGreWISkSw0Zw7Ry5UpER0ejsLAQHTp00Djhne3duxeBgYEoKCjAE088obrnTmOY7JK/P1BbC+zfr75+9aq1JOPiAB1fxJA8WISkW3caw1RTU4O+ffvilVdewZIlSwQktK26uhp9+vTBhAkTsHDhQtU9DRnDZJd+/BFwcQFSUgBbY5hWrrT+KSgA2rXTNh/R/9DvlyskvYiICJSVlWHdunWq6w4ODlizZg1iYmJQVFSkcbrbi4mJQVVVFebNm6e6brFYEBISguDg4OZVggDw1FPAnDlAaChQU6O+Z9YsawEuX65pNCI1vCMkXdu6dSvmzJkDs9mMRx99VHWPr68v2rRpgz179micTl1JSQkMBgM2b96MsWPHqu7ZtGkTFixYALPZjE6dOmmcUAMVFYDRCISFAbZmSX7xBfDKK0BeHvCPf2ibj+i/sAhJ15T/n2AwaNAgbNiwQXVPUVERXFxc8Pnnn2PEiBEaJ7zVxIkTUVRUhKNHj6oek1ZeXg6DwYCFCxciODhYQEKNJCZa7wrNZqBzZ/U9vr5A27bA7t2aRiP6byxC0r2GjGEKDQ3F119/jZycHNXRRlrJzs6Gu7s7MjMzMWTIENU9dxrD1GwoCuDuDgweDNgaw1RQAPTrZ7071MFFDMmJRUh24U5jmMrKymA0GrFo0SLMmDFD43RWiqLA09MTzzzzDOLj41X3NGQMU7OSmQl4egLZ2dbCUzN3LnDggPUdw+Z8YUC6xSIku1A/hikxMREvv/yy6p74+HhEREQI+95t+/btePPNN1FYWIhu3bqp7rnTGKZmKTAQuHQJsDWG6bffrK9TREZa3zEk0hiLkOzGncYwWSwWuLm5wcPDA7GxsZpmq6yshMlkQkhICEJDQ1X3HDx4EH5+frcdw9QsFRdbH5zZtg2wNYYpPt76XmFuLl+yJ82xCMluNGQM05EjR+Dt7Y2ffvoJzs7OmmVbuXIl4uPj8cMPP+Chhx66Zb0hY5iatchIICHB+oSo2himujrrk6a2ziglakIsQrIru3fv/nt4bffu3VX3mM1mGGwd79VEampqcOHCBTz11FOq66tXr0ZUVBTMZrMuT8FpclVVgMkETJsGhIeLTkN0
ExYh2Z3nnnsOTzzxBLZu3So6SoOUlpbCYDAgJiYGEydOFB1HnF27gKAgoLAQsHERQyQCi5DsTv0YpsOHD9uc6KAndxrDJJV//QsIDgZkeGKW7AaLkOzSjBkzkJWVpfsxTGfOnMHAgQNvO4ZJejt3Wo9aKyiwfkf40kvAsmXAI4+ITkaS0O//QYhuIzIyEufOnUNSUpLoKLc1d+5cvPrqqyxBW+LjrXeICxYAJSXAkSPAhQvA888Df/0lOh1JgneEZLeawxgmqVVVWb8rjIkBJk++8fvKSqBnT+Ddd60P1xA1MRYh2a36MUwmkwmurq6i49xEURRs3LgRU6ZMsTmGSXrp6YCHB1BWBrRvf/Pa9OlAaSnPICVN8DwjslsODg5ISUnBihUrcOTIEdFxbhEaGoppvKOx7do164Hb/1uCAODsbD2sm0gDLEKyayaTCZs3bxYdg+5F587AH38A5eW3luHly7YnVhA1Mj4sQ0RiDB4MPPzwrR9/VlYCqanAP/8pJBbJh3eERCSGoyOwdCkwb571I1Jvb+ud4Ny5QJcugMyHD5Cm+LAMEYmVlAR88IH1PcIOHazvES5fDgiYIEJyYhESEZHU+B0hERFJjUVIRERSYxESEZHUWIRERCQ1FiEREUmNRUhERFJjERIRkdRYhEREJDUWIRERSY1FSEREUmMREhGR1FiEREQkNRYhERFJjUVIRERSYxESEZHUWIRERCQ1FiEREUmNRUhERFJjERIRkdRYhEREJDUWIRERSY1FSEREUmMREhGR1FiEREQkNRYhERFJjUVIRERSYxESEZHUWIRERCQ1FiEREUmNRUhERFJjERIRkdRYhEREJDUWIRERSY1FSEREUmMREhGR1FiEREQktf8DaDZRD5zGf9wAAAAASUVORK5CYII=\n",
      "text/plain": [
       "<rdkit.Chem.rdchem.Mol at 0x7f318405f530>"
      ]
     },
     "execution_count": 27,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Uncharger neutralizes ionized acids and bases where possible\n",
    "u = rdMolStandardize.Uncharger()\n",
    "mol = u.uncharge(mol)\n",
    "mol"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAcIAAACWCAYAAABNcIgQAAAT90lEQVR4nO3de1BV5f4G8MeMFG9kmoo15cnc2x3eFcSZAJuTyqXsYFk4YqPiLUxRHEkojxYaGiiKt1ATvJCkJonVTKnoJIIlGBpy2WJlih7UsICIuOz1+2P/yONxbUWF9a7N+3xm/APet5mnxulZa++13m8LRVEUEBERSeoB0QGIiIhEYhESEZHUWIRERCQ1FiEREUmNRUhERFJjERIRkdRYhEREJDUWIRERSY1FSEREUmMREhGR1FiEREQkNRYhERFJjUVIRERSYxESEZHUWIRERCQ1FiEREUmNRUhERFJjERIRkdRYhEREJDUWIRERSY1FSEREUmMREhGR1FiEREQkNRYhERFJjUVIRERSYxESEZHUWIRERCQ1FiEREUmNRUhERFJjERIRkdRYhEREJDUWIRERSY1FSEREUmMREhGR1FiEREQkNRYhERFJjUVIROLt3AkMGAC0bg107QpMmwaUlt5Y79MH2LHj5n8mORno3VvTmNQ8sQiJSKz4eCA4GFiwACgpAY4cAS5cAJ5/HvjrL9HpSAIsQiISp6oKCA8HVqwAAgIAJyfAZAI+/RS4fBnYulV0QpIAi5CIxMnKAq5fB8aOvfn3bdoAo0cDBw6IyUVSeVB0AKL7kZ+fjxUrVuCXX34RHeUWfn5+mDZtGhwdHUVH0a9r14C2bYH27W9dc3YGzOYbP7/xBjBnzo2fq6uB7t2bPCI1fyxCsls1NTXw9/eHyWTC8OHDRce5iaIoWLlyJcrKyrBw4ULRcfSrc2fgjz+A8vJby/DyZet6vfffB/z9b/ycmgrExWmTk5o1FiHZrbi4OFRUVGD79u1o166d6Di3eOaZZzB+/Hi8/vrrePLJJ0XH0afBg4GHHwZ27wYmT77x+8pKa9EtWnTjdx07Ao8/fuPnRx7RLCY1b/yOkOzS1atXsWTJEixbtkyXJQgA/v7+8PDwQEREhOgo+uXoCCxdCsybB3zyCfD770BBAfDyy0CXLsDEiaITkgRYhGSX3nnnHfTu3Rvjx48XHeW2YmNjsXv3bhw9elR0FP0KDgbWrrV+9NmlC+DhATz2GJCWZn2vkKiJtVAURREdguhu5OTkwNXVFenp6Rg6dKjoOHc0c+ZMHD9+HCdOnMADD/Dak0hvWIRkd4YPH44ePXogMTFRdJQGKS0thcFgQHR0NCZNmiQ6jn7U1AAODqJTELEIyb7s2rULQUFBKCwsRHcbj85nZmZCxF/rrl27omfPnqprcXFxeP/991FYWAgnJyeNk+nUm29avyOMjhadhCTHIiS7UVVVBZPJhGnTpiE8PNzmPkdHR9TV1WmYzGrSpEmIj49XXautrcXAgQPh5+eHZcuWaZxMh/LyrGeLHjpk/U6QSCAWIdmNyMhIJCQkIC8vD63t8CGKQ4cOwdfXF7m5uejVq5foOGKNGmV9RzApSXQSIhYh2Yfi4mIYjUZs27YNY8aMER3nno0ePRotW7ZESkqK6CjipKQA48cD+fkA368kHWARkl0IDAzEpUuXkJaWJjrKfTl37hxcXFywb98+jBo1SnQc7VVXW0cqBQYC//636DREAFiEZAcyMzPh6emJ7Oxs9OvXT3Sc+xYWFoYvvvgCOTk5cJDtqcmoKGDDButL823aiE5DBIBFSDqnKArc3d0xePBgrF+/XnScRlFeXg6j0Yi3334bM2fOFB1HOyUlgMEAbNoEvPqq6DREf2MRkq4lJiYiNDQUZrMZnf/7AGY7t3nzZoSFheHs2bPo1KmT6DjamDgRKCoCjh4FWrQQnYbobyxC0q2KigoYjUaEhYUhJCREdJxGZbFY4O7uDnd3d8TJMEHh5Elg6FAgMxMYMkR0GqKbsAhJt8LDw7Fv3z6cOnWqWX6XlpGRAS8vL5w8eRJ9+/YVHafpKArg6WmdPL9xo+g0RLdgEZIu/fjjj3BxcUFK
Sgq8vb1Fx2ky48aNw5UrV3Do0CHRUZrOjh3Wg7XNZqBbN9FpiG7BIiRd8vf3R21tLfbv3y86SpO6ePEievfujY8//hijR48WHafxVVZa7wRnz7aOWiLSIRYh6U5aWhp8fHxw+vRpGI1G0XGa3OLFi7Fjxw6cOXMGrVq1Eh2ncS1cCCQnA7m5QHP7d6NmgzNhSFfq6uowd+5czJ49W4oSBIC33noLtbW1WL16tegojer8+fP47vhxKKtWsQRJ13hHSLqybt06vPfeezCbzVJNadi5cyemT5+OwsJCODs7i47TKMaOHYvffvsNBw4cEB2F6LZYhKQb169fh8FgQFRUFKZMmSI6jua8vLzw9NNP46OPPhId5b4dO3YMzz33HE6ePIk+ffqIjkN0WyxC0o3Zs2cjPT0dJ06cQMuWLUXH0dz3338PNzc3ZGRkwNXVVXSce2axWODm5gYPDw/ExsaKjkN0RyxC0oX8/Hz0798fBw8ehKenp+g4wgQFBSE/Px/Hjh1DCzs9fSU+Ph4REREwm83ynJpDdo1FSLrg4+MDJycnJCcni44i1JUrV2AwGLBhwwaMGzdOdJy7VlZWBqPRiEWLFmHGjBmi4xA1CIuQhEtNTUVAQADy8vLQo0cP0XGE++CDD7BmzRoUFBSgbdu2ouPclXnz5uGrr75CTk4OHnzwQdFxiBqERUhCVVdXo2/fvhg3bhwWL14sOo4u2Ot/k6KiIvTp0wf79+/HiBEjRMchajAWIQkVHR2NuLg4u7z7aUr1d8n5+fl40k6muPv5+cHR0RF79uwRHYXorrAISRh7/z6sqXl7e6Njx47YuXOn6Ch3dPDgQbzwwgvIzc3F008/LToO0V1hEZIwU6ZMQV5enl0/IdmU7OVJ2traWgwYMAAvvfQSli5dKjoO0V1jEZIQ9e/MHTt2DG5ubqLj6JY9vFsZGxuL5cuXw2w2w8HBAY6OjqIjEd0VFiEJ0ZxOUWlKej9tp7S0FL169UJsbCx69uyJgIAA5ObmSnU8Htk/FiFp7ujRo/D19cXZs2fRTWU+naIoWLFiBUaPHg2DwSAgobY++ugjuLi4wN3dXXV9zZo1iIqKgp+fn+4+Qj537hwqKipw/PhxWCwWDBo0CCNHjkR0dLToaEQNxukTpDstWrRARkYG5kkwv+7ixYsICQnBlStXREe5J15eXti3bx9atGiBli1bIjY2FnFxcTCbzaKjETUY7whJCC8vL/Ts2RNbtmxRXa+fUL937174+PhonE47AQEBuHbtGg4ePKi6Xv/R6LJlyxAUFKRxOnXnz5+Ho6MjunTporru7++Puro6pKamapyM6N6wCEmIhjwsExERgZSUFJw+fRoODg4aJ2x6GRkZ8PLywsmTJ9G3b1/VPbNmzcKxY8eQlZWFBx7Qxwc4I0eOxOOPP37Hi5iUlBR4e3trnI7o7rEISZipU6fizJkzNl+fqKiogNFoRFhYGEJCQgQkbDoWiwVDhw7FsGHDEBcXp7onLy8PAwYM0N3rEw25iAkPD8e+fftw6tSpZnkRQ82MQiRISUmJ4uTkpCQlJdnck5CQoHTs2FG5evWqhsma3qZNm5SOHTsq165ds7ln1KhRyrhx4zRM1XBBQUHKsGHDFIvForpeXl6udO/eXVm1apXGyYjuHouQhIqOjlYee+wxpaKiQnXdYrEobm5uyhtvvKFxsqZTVlamODs7K2vXrrW557PPPlMcHR2Vn3/+WcNkDSfzRQw1P/xolISqP2A6ICAA7777ruqe48ePw8PDA1lZWejfv7/GCRvf/Pnz8eWXX+LUqVOqExrs5dDt6OhorF69GoWFharnxCqKAnd3dwwePBjr168XkJCoYViEJNz+/fvx2muv3XYM04QJE1BcXIy0tDRtwzWyc+fOwcXFBampqRg5cqTqnuXLl2Pt2rW6P4j8bi5isrOz0a9fP40TEjUMi5B0wcfHBx06dMAnn3yiul5cXIzevXtj69atGDNmjMbpGs+L
L74IBwcH7N27V3W9pKQERqMRH374IQICAjROd/cachETGBiIS5cu2f1FDDVfLELShfoDpg8cOAAvLy/VPUuWLMGWLVuQl5eH1q1ba5zw/h06dAi+vr7Izc1Fr169VPdMnjwZhYWFSE9P190pMrb4+PjAyckJycnJquvFxcUwGo3Ytm2bXV/EUPPFIiTdCAkJwTfffIOsrCzVA6arqqpgMpkwdepURERECEh472prazFw4EC88MILiIqKUt1T/1pCRkYGXF1dNU547xpyERMZGYmEhAS7vYihZk7QQzpEtygtLVU6d+6sbNy40eaeXbt2Ke3atVOKi4s1THb/Vq1apXTt2lX5/fffVdctFovi4eGhBAUFaZysccyePVsZMGCAUltbq7peWVmpTPL2Vv4TG6ttMKIGYBGSrqxbt07p0qWLcv36dZt7hg8frrz++uvahbpPv/76q9KpUyclISHB5p6kpCSlffv2yqVLl7QL1ojqL2I2bdpke9OuXYrSrp2i2NlFDDV//GiUdKWurg6DBg3CiBEjEBMTo7rn1KlTGDJkCNLT0zF06FCNE9694OBgfPvttzhx4oTqMWl//vknTCYTZs6cifnz5wtI2DjWrVuH9957D2az2fYYpuHDgR49gMREDZMR3R6LkHTn8OHD8Pb2xunTp2E0GlX3TJ8+HdnZ2fjuu+90cwanmvpj0tLS0vDss8+q7lm0aBGSkpJw5swZtGrVSuOEjaf+Iua2Y5hycgBXVyA9HbCDixiSA4uQdGnMmDGorq7G559/rrp+9epVGAwGxMXFYcKECRqna7hRo0bh0UcfxY4dO1TXL168CKPRiOTkZLz44osap2t83x05godmzcKATz8FbM2SnD4dOH0ayMgA7OTJWGreWISkSw0Zw7Ry5UpER0ejsLAQHTp00Djhne3duxeBgYEoKCjAE088obrnTmOY7JK/P1BbC+zfr75+9aq1JOPiAB1fxJA8WISkW3caw1RTU4O+ffvilVdewZIlSwQktK26uhp9+vTBhAkTsHDhQtU9DRnDZJd+/BFwcQFSUgBbY5hWrrT+KSgA2rXTNh/R/9DvlyskvYiICJSVlWHdunWq6w4ODlizZg1iYmJQVFSkcbrbi4mJQVVVFebNm6e6brFYEBISguDg4OZVggDw1FPAnDlAaChQU6O+Z9YsawEuX65pNCI1vCMkXdu6dSvmzJkDs9mMRx99VHWPr68v2rRpgz179micTl1JSQkMBgM2b96MsWPHqu7ZtGkTFixYALPZjE6dOmmcUAMVFYDRCISFAbZmSX7xBfDKK0BeHvCPf2ibj+i/sAhJ15T/n2AwaNAgbNiwQXVPUVERXFxc8Pnnn2PEiBEaJ7zVxIkTUVRUhKNHj6oek1ZeXg6DwYCFCxciODhYQEKNJCZa7wrNZqBzZ/U9vr5A27bA7t2aRiP6byxC0r2GjGEKDQ3F119/jZycHNXRRlrJzs6Gu7s7MjMzMWTIENU9dxrD1GwoCuDuDgweDNgaw1RQAPTrZ7071MFFDMmJRUh24U5jmMrKymA0GrFo0SLMmDFD43RWiqLA09MTzzzzDOLj41X3NGQMU7OSmQl4egLZ2dbCUzN3LnDggPUdw+Z8YUC6xSIku1A/hikxMREvv/yy6p74+HhEREQI+95t+/btePPNN1FYWIhu3bqp7rnTGKZmKTAQuHQJsDWG6bffrK9TREZa3zEk0hiLkOzGncYwWSwWuLm5wcPDA7GxsZpmq6yshMlkQkhICEJDQ1X3HDx4EH5+frcdw9QsFRdbH5zZtg2wNYYpPt76XmFuLl+yJ82xCMluNGQM05EjR+Dt7Y2ffvoJzs7OmmVbuXIl4uPj8cMPP+Chhx66Zb0hY5iatchIICHB+oSo2himujrrk6a2ziglakIsQrIru3fv/nt4bffu3VX3mM1mGGwd79VEampqcOHCBTz11FOq66tXr0ZUVBTMZrMuT8FpclVVgMkETJsGhIeLTkN0
ExYh2Z3nnnsOTzzxBLZu3So6SoOUlpbCYDAgJiYGEydOFB1HnF27gKAgoLAQsHERQyQCi5DsTv0YpsOHD9uc6KAndxrDJJV//QsIDgZkeGKW7AaLkOzSjBkzkJWVpfsxTGfOnMHAgQNvO4ZJejt3Wo9aKyiwfkf40kvAsmXAI4+ITkaS0O//QYhuIzIyEufOnUNSUpLoKLc1d+5cvPrqqyxBW+LjrXeICxYAJSXAkSPAhQvA888Df/0lOh1JgneEZLeawxgmqVVVWb8rjIkBJk++8fvKSqBnT+Ddd60P1xA1MRYh2a36MUwmkwmurq6i49xEURRs3LgRU6ZMsTmGSXrp6YCHB1BWBrRvf/Pa9OlAaSnPICVN8DwjslsODg5ISUnBihUrcOTIEdFxbhEaGoppvKOx7do164Hb/1uCAODsbD2sm0gDLEKyayaTCZs3bxYdg+5F587AH38A5eW3luHly7YnVhA1Mj4sQ0RiDB4MPPzwrR9/VlYCqanAP/8pJBbJh3eERCSGoyOwdCkwb571I1Jvb+ud4Ny5QJcugMyHD5Cm+LAMEYmVlAR88IH1PcIOHazvES5fDgiYIEJyYhESEZHU+B0hERFJjUVIRERSYxESEZHUWIRERCQ1FiEREUmNRUhERFJjERIRkdRYhEREJDUWIRERSY1FSEREUmMREhGR1FiEREQkNRYhERFJjUVIRERSYxESEZHUWIRERCQ1FiEREUmNRUhERFJjERIRkdRYhEREJDUWIRERSY1FSEREUmMREhGR1FiEREQkNRYhERFJjUVIRERSYxESEZHUWIRERCQ1FiEREUmNRUhERFJjERIRkdRYhEREJDUWIRERSY1FSEREUmMREhGR1FiEREQktf8DaDZRD5zGf9wAAAAASUVORK5CYII=\n",
      "text/plain": [
       "<rdkit.Chem.rdchem.Mol at 0x7f318405f170>"
      ]
     },
     "execution_count": 28,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Reionizer ensures the strongest acid groups ionize first in partially ionized molecules\n",
    "r = rdMolStandardize.Reionizer()\n",
    "mol = r.reionize(mol)\n",
    "mol"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAcIAAACWCAYAAABNcIgQAAAT90lEQVR4nO3de1BV5f4G8MeMFG9kmoo15cnc2x3eFcSZAJuTyqXsYFk4YqPiLUxRHEkojxYaGiiKt1ATvJCkJonVTKnoJIIlGBpy2WJlih7UsICIuOz1+2P/yONxbUWF9a7N+3xm/APet5mnxulZa++13m8LRVEUEBERSeoB0QGIiIhEYhESEZHUWIRERCQ1FiEREUmNRUhERFJjERIRkdRYhEREJDUWIRERSY1FSEREUmMREhGR1FiEREQkNRYhERFJjUVIRERSYxESEZHUWIRERCQ1FiEREUmNRUhERFJjERIRkdRYhEREJDUWIRERSY1FSEREUmMREhGR1FiEREQkNRYhERFJjUVIRERSYxESEZHUWIRERCQ1FiEREUmNRUhERFJjERIRkdRYhEREJDUWIRERSY1FSEREUmMREhGR1FiEREQkNRYhERFJjUVIROLt3AkMGAC0bg107QpMmwaUlt5Y79MH2LHj5n8mORno3VvTmNQ8sQiJSKz4eCA4GFiwACgpAY4cAS5cAJ5/HvjrL9HpSAIsQiISp6oKCA8HVqwAAgIAJyfAZAI+/RS4fBnYulV0QpIAi5CIxMnKAq5fB8aOvfn3bdoAo0cDBw6IyUVSeVB0AKL7kZ+fjxUrVuCXX34RHeUWfn5+mDZtGhwdHUVH0a9r14C2bYH27W9dc3YGzOYbP7/xBjBnzo2fq6uB7t2bPCI1fyxCsls1NTXw9/eHyWTC8OHDRce5iaIoWLlyJcrKyrBw4ULRcfSrc2fgjz+A8vJby/DyZet6vfffB/z9b/ycmgrExWmTk5o1FiHZrbi4OFRUVGD79u1o166d6Di3eOaZZzB+/Hi8/vrrePLJJ0XH0afBg4GHHwZ27wYmT77x+8pKa9EtWnTjdx07Ao8/fuPnRx7RLCY1b/yOkOzS1atXsWTJEixbtkyXJQgA/v7+8PDwQEREhOgo+uXoCCxdCsybB3zyCfD770BBAfDyy0CXLsDEiaITkgRYhGSX3nnnHfTu3Rvjx48XHeW2YmNjsXv3bhw9elR0FP0KDgbWrrV+9NmlC+DhATz2GJCWZn2vkKiJtVAURREdguhu5OTkwNXVFenp6Rg6dKjoOHc0c+ZMHD9+HCdOnMADD/Dak0hvWIRkd4YPH44ePXogMTFRdJQGKS0thcFgQHR0NCZNmiQ6jn7U1AAODqJTELEIyb7s2rULQUFBKCwsRHcbj85nZmZCxF/rrl27omfPnqprcXFxeP/991FYWAgnJyeNk+nUm29avyOMjhadhCTHIiS7UVVVBZPJhGnTpiE8PNzmPkdHR9TV1WmYzGrSpEmIj49XXautrcXAgQPh5+eHZcuWaZxMh/LyrGeLHjpk/U6QSCAWIdmNyMhIJCQkIC8vD63t8CGKQ4cOwdfXF7m5uejVq5foOGKNGmV9RzApSXQSIhYh2Yfi4mIYjUZs27YNY8aMER3nno0ePRotW7ZESkqK6CjipKQA48cD+fkA368kHWARkl0IDAzEpUuXkJaWJjrKfTl37hxcXFywb98+jBo1SnQc7VVXW0cqBQYC//636DREAFiEZAcyMzPh6emJ7Oxs9OvXT3Sc+xYWFoYvvvgCOTk5cJDtqcmoKGDDButL823aiE5DBIBFSDqnKArc3d0xePBgrF+/XnScRlFeXg6j0Yi3334bM2fOFB1HOyUlgMEAbNoEvPqq6DREf2MRkq4lJiYiNDQUZrMZnf/7AGY7t3nzZoSFheHs2bPo1KmT6DjamDgRKCoCjh4FWrQQnYbobyxC0q2KigoYjUaEhYUhJCREdJxGZbFY4O7uDnd3d8TJMEHh5Elg6FAgMxMYMkR0GqKbsAhJt8LDw7Fv3z6cOnWqWX6XlpGRAS8vL5w8eRJ9+/YVHafpKArg6WmdPL9xo+g0RLdgEZIu/fjjj3BxcUFK
Sgq8vb1Fx2ky48aNw5UrV3Do0CHRUZrOjh3Wg7XNZqBbN9FpiG7BIiRd8vf3R21tLfbv3y86SpO6ePEievfujY8//hijR48WHafxVVZa7wRnz7aOWiLSIRYh6U5aWhp8fHxw+vRpGI1G0XGa3OLFi7Fjxw6cOXMGrVq1Eh2ncS1cCCQnA7m5QHP7d6NmgzNhSFfq6uowd+5czJ49W4oSBIC33noLtbW1WL16tegojer8+fP47vhxKKtWsQRJ13hHSLqybt06vPfeezCbzVJNadi5cyemT5+OwsJCODs7i47TKMaOHYvffvsNBw4cEB2F6LZYhKQb169fh8FgQFRUFKZMmSI6jua8vLzw9NNP46OPPhId5b4dO3YMzz33HE6ePIk+ffqIjkN0WyxC0o3Zs2cjPT0dJ06cQMuWLUXH0dz3338PNzc3ZGRkwNXVVXSce2axWODm5gYPDw/ExsaKjkN0RyxC0oX8/Hz0798fBw8ehKenp+g4wgQFBSE/Px/Hjh1DCzs9fSU+Ph4REREwm83ynJpDdo1FSLrg4+MDJycnJCcni44i1JUrV2AwGLBhwwaMGzdOdJy7VlZWBqPRiEWLFmHGjBmi4xA1CIuQhEtNTUVAQADy8vLQo0cP0XGE++CDD7BmzRoUFBSgbdu2ouPclXnz5uGrr75CTk4OHnzwQdFxiBqERUhCVVdXo2/fvhg3bhwWL14sOo4u2Ot/k6KiIvTp0wf79+/HiBEjRMchajAWIQkVHR2NuLg4u7z7aUr1d8n5+fl40k6muPv5+cHR0RF79uwRHYXorrAISRh7/z6sqXl7e6Njx47YuXOn6Ch3dPDgQbzwwgvIzc3F008/LToO0V1hEZIwU6ZMQV5enl0/IdmU7OVJ2traWgwYMAAvvfQSli5dKjoO0V1jEZIQ9e/MHTt2DG5ubqLj6JY9vFsZGxuL5cuXw2w2w8HBAY6OjqIjEd0VFiEJ0ZxOUWlKej9tp7S0FL169UJsbCx69uyJgIAA5ObmSnU8Htk/FiFp7ujRo/D19cXZs2fRTWU+naIoWLFiBUaPHg2DwSAgobY++ugjuLi4wN3dXXV9zZo1iIqKgp+fn+4+Qj537hwqKipw/PhxWCwWDBo0CCNHjkR0dLToaEQNxukTpDstWrRARkYG5kkwv+7ixYsICQnBlStXREe5J15eXti3bx9atGiBli1bIjY2FnFxcTCbzaKjETUY7whJCC8vL/Ts2RNbtmxRXa+fUL937174+PhonE47AQEBuHbtGg4ePKi6Xv/R6LJlyxAUFKRxOnXnz5+Ho6MjunTporru7++Puro6pKamapyM6N6wCEmIhjwsExERgZSUFJw+fRoODg4aJ2x6GRkZ8PLywsmTJ9G3b1/VPbNmzcKxY8eQlZWFBx7Qxwc4I0eOxOOPP37Hi5iUlBR4e3trnI7o7rEISZipU6fizJkzNl+fqKiogNFoRFhYGEJCQgQkbDoWiwVDhw7FsGHDEBcXp7onLy8PAwYM0N3rEw25iAkPD8e+fftw6tSpZnkRQ82MQiRISUmJ4uTkpCQlJdnck5CQoHTs2FG5evWqhsma3qZNm5SOHTsq165ds7ln1KhRyrhx4zRM1XBBQUHKsGHDFIvForpeXl6udO/eXVm1apXGyYjuHouQhIqOjlYee+wxpaKiQnXdYrEobm5uyhtvvKFxsqZTVlamODs7K2vXrrW557PPPlMcHR2Vn3/+WcNkDSfzRQw1P/xolISqP2A6ICAA7777ruqe48ePw8PDA1lZWejfv7/GCRvf/Pnz8eWXX+LUqVOqExrs5dDt6OhorF69GoWFharnxCqKAnd3dwwePBjr168XkJCoYViEJNz+/fvx2muv3XYM04QJE1BcXIy0tDRtwzWyc+fOwcXFBampqRg5cqTqnuXLl2Pt2rW6P4j8bi5isrOz0a9fP40TEjUMi5B0wcfHBx06dMAnn3yiul5cXIzevXtj69atGDNmjMbpGs+L
L74IBwcH7N27V3W9pKQERqMRH374IQICAjROd/cachETGBiIS5cu2f1FDDVfLELShfoDpg8cOAAvLy/VPUuWLMGWLVuQl5eH1q1ba5zw/h06dAi+vr7Izc1Fr169VPdMnjwZhYWFSE9P190pMrb4+PjAyckJycnJquvFxcUwGo3Ytm2bXV/EUPPFIiTdCAkJwTfffIOsrCzVA6arqqpgMpkwdepURERECEh472prazFw4EC88MILiIqKUt1T/1pCRkYGXF1dNU547xpyERMZGYmEhAS7vYihZk7QQzpEtygtLVU6d+6sbNy40eaeXbt2Ke3atVOKi4s1THb/Vq1apXTt2lX5/fffVdctFovi4eGhBAUFaZysccyePVsZMGCAUltbq7peWVmpTPL2Vv4TG6ttMKIGYBGSrqxbt07p0qWLcv36dZt7hg8frrz++uvahbpPv/76q9KpUyclISHB5p6kpCSlffv2yqVLl7QL1ojqL2I2bdpke9OuXYrSrp2i2NlFDDV//GiUdKWurg6DBg3CiBEjEBMTo7rn1KlTGDJkCNLT0zF06FCNE9694OBgfPvttzhx4oTqMWl//vknTCYTZs6cifnz5wtI2DjWrVuH9957D2az2fYYpuHDgR49gMREDZMR3R6LkHTn8OHD8Pb2xunTp2E0GlX3TJ8+HdnZ2fjuu+90cwanmvpj0tLS0vDss8+q7lm0aBGSkpJw5swZtGrVSuOEjaf+Iua2Y5hycgBXVyA9HbCDixiSA4uQdGnMmDGorq7G559/rrp+9epVGAwGxMXFYcKECRqna7hRo0bh0UcfxY4dO1TXL168CKPRiOTkZLz44osap2t83x05godmzcKATz8FbM2SnD4dOH0ayMgA7OTJWGreWISkSw0Zw7Ry5UpER0ejsLAQHTp00Djhne3duxeBgYEoKCjAE088obrnTmOY7JK/P1BbC+zfr75+9aq1JOPiAB1fxJA8WISkW3caw1RTU4O+ffvilVdewZIlSwQktK26uhp9+vTBhAkTsHDhQtU9DRnDZJd+/BFwcQFSUgBbY5hWrrT+KSgA2rXTNh/R/9DvlyskvYiICJSVlWHdunWq6w4ODlizZg1iYmJQVFSkcbrbi4mJQVVVFebNm6e6brFYEBISguDg4OZVggDw1FPAnDlAaChQU6O+Z9YsawEuX65pNCI1vCMkXdu6dSvmzJkDs9mMRx99VHWPr68v2rRpgz179micTl1JSQkMBgM2b96MsWPHqu7ZtGkTFixYALPZjE6dOmmcUAMVFYDRCISFAbZmSX7xBfDKK0BeHvCPf2ibj+i/sAhJ15T/n2AwaNAgbNiwQXVPUVERXFxc8Pnnn2PEiBEaJ7zVxIkTUVRUhKNHj6oek1ZeXg6DwYCFCxciODhYQEKNJCZa7wrNZqBzZ/U9vr5A27bA7t2aRiP6byxC0r2GjGEKDQ3F119/jZycHNXRRlrJzs6Gu7s7MjMzMWTIENU9dxrD1GwoCuDuDgweDNgaw1RQAPTrZ7071MFFDMmJRUh24U5jmMrKymA0GrFo0SLMmDFD43RWiqLA09MTzzzzDOLj41X3NGQMU7OSmQl4egLZ2dbCUzN3LnDggPUdw+Z8YUC6xSIku1A/hikxMREvv/yy6p74+HhEREQI+95t+/btePPNN1FYWIhu3bqp7rnTGKZmKTAQuHQJsDWG6bffrK9TREZa3zEk0hiLkOzGncYwWSwWuLm5wcPDA7GxsZpmq6yshMlkQkhICEJDQ1X3HDx4EH5+frcdw9QsFRdbH5zZtg2wNYYpPt76XmFuLl+yJ82xCMluNGQM05EjR+Dt7Y2ffvoJzs7OmmVbuXIl4uPj8cMPP+Chhx66Zb0hY5iatchIICHB+oSo2himujrrk6a2ziglakIsQrIru3fv/nt4bffu3VX3mM1mGGwd79VEampqcOHCBTz11FOq66tXr0ZUVBTMZrMuT8FpclVVgMkETJsGhIeLTkN0
ExYh2Z3nnnsOTzzxBLZu3So6SoOUlpbCYDAgJiYGEydOFB1HnF27gKAgoLAQsHERQyQCi5DsTv0YpsOHD9uc6KAndxrDJJV//QsIDgZkeGKW7AaLkOzSjBkzkJWVpfsxTGfOnMHAgQNvO4ZJejt3Wo9aKyiwfkf40kvAsmXAI4+ITkaS0O//QYhuIzIyEufOnUNSUpLoKLc1d+5cvPrqqyxBW+LjrXeICxYAJSXAkSPAhQvA888Df/0lOh1JgneEZLeawxgmqVVVWb8rjIkBJk++8fvKSqBnT+Ddd60P1xA1MRYh2a36MUwmkwmurq6i49xEURRs3LgRU6ZMsTmGSXrp6YCHB1BWBrRvf/Pa9OlAaSnPICVN8DwjslsODg5ISUnBihUrcOTIEdFxbhEaGoppvKOx7do164Hb/1uCAODsbD2sm0gDLEKyayaTCZs3bxYdg+5F587AH38A5eW3luHly7YnVhA1Mj4sQ0RiDB4MPPzwrR9/VlYCqanAP/8pJBbJh3eERCSGoyOwdCkwb571I1Jvb+ud4Ny5QJcugMyHD5Cm+LAMEYmVlAR88IH1PcIOHazvES5fDgiYIEJyYhESEZHU+B0hERFJjUVIRERSYxESEZHUWIRERCQ1FiEREUmNRUhERFJjERIRkdRYhEREJDUWIRERSY1FSEREUmMREhGR1FiEREQkNRYhERFJjUVIRERSYxESEZHUWIRERCQ1FiEREUmNRUhERFJjERIRkdRYhEREJDUWIRERSY1FSEREUmMREhGR1FiEREQkNRYhERFJjUVIRERSYxESEZHUWIRERCQ1FiEREUmNRUhERFJjERIRkdRYhEREJDUWIRERSY1FSEREUmMREhGR1FiEREQktf8DaDZRD5zGf9wAAAAASUVORK5CYII=\n",
      "text/plain": [
       "<rdkit.Chem.rdchem.Mol at 0x7f318405f670>"
      ]
     },
     "execution_count": 29,
     "metadata": {},
     "output_type": "execute_result"
    }
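   ],
   "source": [
    "# ChargeParent returns the uncharged version of the fragment parent (here, sodium benzoate as input)\n",
    "mol = Chem.MolFromSmiles(\"[Na+].O=C([O-])c1ccccc1\")\n",
    "rdMolStandardize.ChargeParent(mol)"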
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# TODO"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##  MolStandardize.tautomer"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "TODO\n",
    "\n",
    "-  __Tautomer_enumeration__ generates all possible tautomers by applying a series of transform rules. It also removes stereochemistry from double bonds that are single bonds in at least one tautomer.\n",
    "-  __Tautomer_canonicalization__ enumerates all possible tautomers using the transform rules, then uses a scoring system to select the canonical tautomer."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {},
   "outputs": [],
   "source": [
    "from rdkit.Chem.MolStandardize.standardize import enumerate_tautomers_smiles, canonicalize_tautomer_smiles"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'C=C(O)C(C)C', 'CC(=O)C(C)C', 'CC(C)=C(C)O'}"
      ]
     },
     "execution_count": 31,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Returns the set of SMILES strings for all enumerated tautomers\n",
    "enumerate_tautomers_smiles(\"OC(C)=C(C)C\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'CC(=O)C(C)C'"
      ]
     },
     "execution_count": 32,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Returns the SMILES string of the canonical tautomer\n",
    "canonicalize_tautomer_smiles(\"OC(C)=C(C)C\")"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
