\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "-XovA71E3XFM"
},
"source": [
"# Titanic PCA"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "7pW1UhJT8ePk"
},
"source": [
"As an example of how to work with both categorical and numerical data, we will perform survival predicition for the passengers of the HMS Titanic.\n"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "bvj3Wids8ePm",
"outputId": "3c075657-ff1a-424c-9757-3eb6ef1c2b18"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Index(['PassengerId', 'Survived', 'Pclass', 'Name', 'Sex', 'Age', 'SibSp',\n",
" 'Parch', 'Ticket', 'Fare', 'Cabin', 'Embarked'],\n",
" dtype='object') Index(['PassengerId', 'Pclass', 'Name', 'Sex', 'Age', 'SibSp', 'Parch',\n",
" 'Ticket', 'Fare', 'Cabin', 'Embarked'],\n",
" dtype='object')\n"
]
}
],
"source": [
"import os\n",
"import pandas as pd\n",
"train = pd.read_csv('https://raw.githubusercontent.com/rpi-techfundamentals/spring2019-materials/master/input/train.csv')\n",
"test = pd.read_csv('https://raw.githubusercontent.com/rpi-techfundamentals/spring2019-materials/master/input/test.csv')\n",
"\n",
"print(train.columns, test.columns)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "0xqjk2-P8ePp"
},
"source": [
"Here is a broad description of the keys and what they mean:\n",
"\n",
"```\n",
"pclass Passenger Class\n",
" (1 = 1st; 2 = 2nd; 3 = 3rd)\n",
"survival Survival\n",
" (0 = No; 1 = Yes)\n",
"name Name\n",
"sex Sex\n",
"age Age\n",
"sibsp Number of Siblings/Spouses Aboard\n",
"parch Number of Parents/Children Aboard\n",
"ticket Ticket Number\n",
"fare Passenger Fare\n",
"cabin Cabin\n",
"embarked Port of Embarkation\n",
" (C = Cherbourg; Q = Queenstown; S = Southampton)\n",
"boat Lifeboat\n",
"body Body Identification Number\n",
"home.dest Home/Destination\n",
"```\n",
"\n",
"In general, it looks like `name`, `sex`, `cabin`, `embarked`, `boat`, `body`, and `homedest` may be candidates for categorical features, while the rest appear to be numerical features. We can also look at the first couple of rows in the dataset to get a better understanding:"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 548
},
"id": "bqmMR9G78ePr",
"outputId": "b8b2ab48-ac65-48a9-f7ff-4575b1d63f8d"
},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
\n",
" \n",
"
\n",
"
\n",
"
PassengerId
\n",
"
Survived
\n",
"
Pclass
\n",
"
Name
\n",
"
Sex
\n",
"
Age
\n",
"
SibSp
\n",
"
Parch
\n",
"
Ticket
\n",
"
Fare
\n",
"
Cabin
\n",
"
Embarked
\n",
"
\n",
" \n",
" \n",
"
\n",
"
0
\n",
"
1
\n",
"
0
\n",
"
3
\n",
"
Braund, Mr. Owen Harris
\n",
"
male
\n",
"
22.0
\n",
"
1
\n",
"
0
\n",
"
A/5 21171
\n",
"
7.2500
\n",
"
NaN
\n",
"
S
\n",
"
\n",
"
\n",
"
1
\n",
"
2
\n",
"
1
\n",
"
1
\n",
"
Cumings, Mrs. John Bradley (Florence Briggs Th...
\n",
"
female
\n",
"
38.0
\n",
"
1
\n",
"
0
\n",
"
PC 17599
\n",
"
71.2833
\n",
"
C85
\n",
"
C
\n",
"
\n",
"
\n",
"
2
\n",
"
3
\n",
"
1
\n",
"
3
\n",
"
Heikkinen, Miss. Laina
\n",
"
female
\n",
"
26.0
\n",
"
0
\n",
"
0
\n",
"
STON/O2. 3101282
\n",
"
7.9250
\n",
"
NaN
\n",
"
S
\n",
"
\n",
"
\n",
"
3
\n",
"
4
\n",
"
1
\n",
"
1
\n",
"
Futrelle, Mrs. Jacques Heath (Lily May Peel)
\n",
"
female
\n",
"
35.0
\n",
"
1
\n",
"
0
\n",
"
113803
\n",
"
53.1000
\n",
"
C123
\n",
"
S
\n",
"
\n",
"
\n",
"
4
\n",
"
5
\n",
"
0
\n",
"
3
\n",
"
Allen, Mr. William Henry
\n",
"
male
\n",
"
35.0
\n",
"
0
\n",
"
0
\n",
"
373450
\n",
"
8.0500
\n",
"
NaN
\n",
"
S
\n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" PassengerId Survived Pclass \\\n",
"0 1 0 3 \n",
"1 2 1 1 \n",
"2 3 1 3 \n",
"3 4 1 1 \n",
"4 5 0 3 \n",
"\n",
" Name Sex Age SibSp \\\n",
"0 Braund, Mr. Owen Harris male 22.0 1 \n",
"1 Cumings, Mrs. John Bradley (Florence Briggs Th... female 38.0 1 \n",
"2 Heikkinen, Miss. Laina female 26.0 0 \n",
"3 Futrelle, Mrs. Jacques Heath (Lily May Peel) female 35.0 1 \n",
"4 Allen, Mr. William Henry male 35.0 0 \n",
"\n",
" Parch Ticket Fare Cabin Embarked \n",
"0 0 A/5 21171 7.2500 NaN S \n",
"1 0 PC 17599 71.2833 C85 C \n",
"2 0 STON/O2. 3101282 7.9250 NaN S \n",
"3 0 113803 53.1000 C123 S \n",
"4 0 373450 8.0500 NaN S "
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"train.head()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "54WY6zD78ePv"
},
"source": [
"### Preprocessing function\n",
"\n",
"We want to create a preprocessing function that can address transformation of our train and test set. "
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "FKX26KU34Ti6",
"outputId": "a9bd50ee-68c8-4ffb-f077-77c700aecbed"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Total missing values before processing: 179\n",
"Total missing values after processing: 0\n",
"Total missing values before processing: 87\n",
"Total missing values after processing: 0\n"
]
}
],
"source": [
"from sklearn.impute import SimpleImputer\n",
"import numpy as np\n",
"\n",
"cat_features = ['Pclass', 'Sex', 'Embarked']\n",
"num_features = [ 'Age', 'SibSp', 'Parch', 'Fare' ]\n",
"\n",
"\n",
"def preprocess(df, num_features, cat_features, dv):\n",
" features = cat_features + num_features\n",
" if dv in df.columns:\n",
" y = df[dv]\n",
" else:\n",
" y=None \n",
" #Address missing variables\n",
" print(\"Total missing values before processing:\", df[features].isna().sum().sum() )\n",
" \n",
" imp_mode = SimpleImputer(missing_values=np.nan, strategy='most_frequent')\n",
" df[cat_features]=imp_mode.fit_transform(df[cat_features] )\n",
" imp_mean = SimpleImputer(missing_values=np.nan, strategy='mean')\n",
" df[num_features]=imp_mean.fit_transform(df[num_features])\n",
" print(\"Total missing values after processing:\", df[features].isna().sum().sum() )\n",
" \n",
" X = pd.get_dummies(df[features], columns=cat_features, drop_first=True)\n",
" return y,X\n",
"\n",
"y, X = preprocess(train, num_features, cat_features, 'Survived')\n",
"test_y, test_X = preprocess(test, num_features, cat_features, 'Survived')"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "yIEMMxHGwEXG"
},
"source": [
"# PCA Analysis\n",
"\n",
"See [Documentation](https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.PCA.html). \n",
"\n",
"You can incorporate PCA based on number of components or the variance explained. "
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"id": "pRaYU2YCvyNw"
},
"outputs": [],
"source": [
"from sklearn.decomposition import PCA\n",
"pca = PCA(n_components=5)\n",
"pca.fit(X)\n",
"X2=pca.transform(X)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "Q_NMT2q9v5e2",
"outputId": "ddfa29b0-f907-4fc8-ca84-a30731610608"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[2.47107661e+03 1.67651481e+02 1.25165106e+00 4.73653673e-01\n",
" 3.18808533e-01]\n"
]
}
],
"source": [
"#This indicates the amount of variance explained by each of the principal components.\n",
"print(pca.explained_variance_)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"id": "6P5EwsE5wROv"
},
"outputs": [],
"source": [
"from sklearn.decomposition import PCA\n",
"pca2 = PCA(n_components=0.97)\n",
"pca2.fit(X)\n",
"X3=pca2.transform(X)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "o7jTEHLBzIzw",
"outputId": "b29c6774-34c8-4a44-f03f-21824a1d319a"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[2471.07660618 167.65148116]\n"
]
}
],
"source": [
"print(pca2.explained_variance_)"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "YUBZDEqxv8s1",
"outputId": "7d115fee-2c83-4b9e-cac0-fd3e8f31e6d5"
},
"outputs": [
{
"data": {
"text/plain": [
"array([[1.00000000e+00, 1.90521771e-16],\n",
" [1.90521771e-16, 1.00000000e+00]])"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"cov_data = np.corrcoef(X3.T)\n",
"cov_data"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Elbow Plot and Kaisers Rule Cutoff\n",
"\n",
"[Here](https://docs.displayr.com/wiki/Kaiser_Rule) is a link to documentation of Kaisers Rule. \n"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Data passed Bartlett’s test for sphericity.\n",
"Performing PCA using rotation: quartimax factors: 4 and standardization: False\n"
]
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAfgAAAFNCAYAAADsL325AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuNCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8QVMy6AAAACXBIWXMAAAsTAAALEwEAmpwYAABCD0lEQVR4nO3debxN9f7H8dfnHGSIEkrIECIi1clU16FUpKLhGlJd/dwrDbf5NtfVoLr3Nk/iNqiulEID0WyoEIpUkkhIQipjxs/vj+8+bGdyHGefdc4+7+fjsR5n77XW3vuzjuGzv9/1/X6+5u6IiIhIckmJOgAREREpeErwIiIiSUgJXkREJAkpwYuIiCQhJXgREZEkpAQvIiKShJTgRSRXZtbHzD4qwPc708yWmNk6MzuqoN5XRHalBC9SyMzseDP7xMx+N7PVZvaxmR0bcUwDzGxLLOn+FouvTT7eZ4KZ/XU3p90HXObu+7r75/mLOMtn/hGLfZWZjTKzg+OOtzSzt2LXtdrMPjWzCzO9Rz0z225mT+xtPCJFhRK8SCEys0rAGOBR4ACgJnA7sGkP36dUwUfHy+6+L1AN+AgYZWaWgM+pA3yVnxeaWWoOhy6LxX4YsD/wYOz8NsAHwESgAVAFuBjonOn1FwC/Aj3NbJ/8xCZS1CjBixSuwwDcfbi7b3P3je7+jrt/kXGCmf3NzOaa2Voz+9rMjo7tX2Rm15vZF8B6MytlZq1jre3fzGy2mbWPe5/9zOxpM/vJzH40s7tySZA7uPsW4DmgOiEh7sLM2prZ9FgPxHQzaxvbPxD4E/BYrDX9WKbX7WNm64BUYLaZLYjtPzzWCv/NzL4yszPiXjPUzAbFWuDrgQ67iX01MBI4IrbrP8Bz7v4vd1/lwUx3757ppRcAtwBbgNN39zsSKQ6U4EUK17fANjN7zsw6m1nl+INm9mdgACHhVALOAH6JO6UX0IXQSj0IGAvcRegNuBYYaWbVYuc+B2wltFyPAk4Gdtd9TqwF2wdY6u6rMh07IPaZjxCS/wPAWDOr4u43A5PZ2f1+Wfxr3X1TrJUNcKS71zez0sCbwDvAgcDfgWFm1ijupecCA4GKhJ6F3GKvCpwNfG5m5YE2wKu7ec2fgFrAS8AIwu9epNhTghcpRO6+BjgecOC/wEoze8PMDoqd8lfg3+4+Pdba/M7df4h7i0fcfYm7bwTOA95y97fcfbu7vwvMAE6NvV9n4Ep3X+/uKwjd1j1zCa+7mf0GLAGOAbplc04XYL67v+DuW919OPAN+W/1tgb2Be51983u/gHhFkavuHNed/ePY9f4Rw7v80gs9tnAT8DVQGXC/3E/7SaGvwDj3P1X4EWgs5kdmM/rESkylOBFCpm7z3X3Pu5ei9CVXAN4KHb4EGBBLi9fEve4DvDnWNf2b7EEdzxwcOxYaeCnuGODCa3knIxw9/3d/UB3P8HdZ2ZzTg3gh0z7fiCMJciPGsASd9+ey/stYfcuj8Ve0917u/tKwj317YTfR7bMrBzwZ2AYgLtPARYTeg1EijUleJEIufs3wFB23jNeAtTP7SVxj5cAL8QSW8ZWwd3vjR3bBFSNO1bJ3ZvuZcjLCF8e4tUGfswmvry+3yFmFv9/Ufz75ec9w4vcNwBTCF32OTmTcCvkCTNbbmbLCV8u1E0vxZ4SvEghMrPGZnaNmdWKPT+E0B09NXbKU8C1ZnaMBQ3MLHNCzfA/4HQzO8XMUs2srJm1N7Na7v4T4b72/WZWycxSzKy+maXv5SW8BRxmZufGBvn1AJoQutUBfgYO3YP3mwasB64zs9KxQYKnE+6HF4TrgD5m9g8zqwJgZkeaWcb7/wV4BmgGtIhtxwEtzKxZAcUgEgkleJHCtRZoBUyLjQqfCnwJXAPg7q8QBpS9GDv3NcIAuizcfQnQFbgJWElotf+Dnf+uLwDKAF8TuqtfJZfu6rxw91+A02Lx/kJIoKfFDcZ7GDjHzH41s0fy8H6bCQMJOwOrgCeAC2I9G3vN3T8BTohtC81sNTAEeMvMagInAg+5+/K4bSYwnpD8RYotc89X75eIiIgUYWrBi4iIJCEleBERkSSkBC8iIpKElOBFRESSkBK8iIhIEkrEilSRqVq1qtetWzfqMERERArFzJkzV7l7teyOJVWCr1u3LjNmzIg6DBERkUJhZplLR++gLnoREZEkpAQvIiKShJKqi15EpCTbtm0ba9asYevWrVGHIglQqlQpKlWqRGpqat7OT3A8IiJSSNasWcM+++zD/vvvj5lFHY4UIHdn48aNrFmzhsqVK+fpNeqiFxFJElu3bqVcuXJK7knIzChXrtwe9c4owYuIJJHCSO6LFi2iWrVqtG/fnvbt23PLLbcAcNFFFyX8s+P16dOHL7/8ssDeb9asWRx11FFcd911BfaeABMmTOCQQw4hPT2dE088kVWrwuKLo0eP5vjjj6ddu3Z06dKFH3/8EQg9MeXKlWPKlCm7vM+e/tmqi15ERPZYeno6r7766i77Bg8eHFE0BWPcuHHcdtttnHnmmTv2Zay4urdfnHr06MF9993H/fffz+DBgzn77LO59957ee+996hYsSKLFi1i8+bNALzxxhtceOGFvPLKK7Rp0ybfn6kWfHaGDYO6dSElJfwcNizqiERECs6mTfDll+FnAUpLSwPg888/Jy0tjTPOOIOuXbsyYcIE3J2///3vdOjQgZNOOomlS5cCcPjhh9O7d2+OOuooXnjhBTZv3sxxxx234z179erFggUL+M9//sMJJ5zAMcccw7vvvrvL506YMIFrr70WgG+++YY+ffoAMH78eP70pz/Rtm1bhg8fDsAtt9xCmzZtaNeuHVOnTt3xHl9//TWDBw/mtttu4+mnn6ZPnz5cfPHFdOzYkVWrVnHuueeSnp7OqaeeyurVq1m0aBFt2rThnHPOoUmTJowaNYqzzjqL5s2bM3fu3Bx/R0cccQRLly5lxIgR9O/fn4oVKwKhjku9evUAeP311xk4cCBffPEFe7OkuxJ8ZsOGQb9+8MMP4B5+9uunJC8ixYPZ7reyZaFZs/Azt/NyMXHixB1d9I899tgux2655RZefPFFXn/9dX799VcAxo4dS+XKlfnwww+59957uffeewFYvnw5gwYNYvLkyTzxxBOUKVOGxo0bM2fOHDZu3Mjy5cupX78+l156KR988AFvv/02d999925/Ddu3b+eOO+7g/fff56OPPuLJJ59k27ZtvP3220yePJlJkybRsmXLHec3adKEPn36cM8999C3b18gfGF5//33mTBhArVr12bixIn06NGDRx99FIBff/2Vl19+mUcffZSBAwfy6quvcuedd/LCCy/kGNfkyZNp1KgRP/30EzVq1MhyfO3atbg7lStX5rjjjuPTTz/d7bXmRF30md18M2zYsOu+DRvC/t69o4lJRKSIya6LPsOKFSs47LDDADjmmGOA0EIePXo0kyZNwt055JBDADj00EOpVKkSsLM7vEePHrz88ssceeSRnHbaaQAMGzaM559/npSUFJYvX77L58V3n2e8x6pVq5g/fz4nn3zyjucrV67krrvu4qKLLqJUqVLcfvvtVK9ePcdrPPbYYwFYsGDBjsetWrXivffeA6Bp06akpqZSs2ZNjjjiCFJSUqhZs+aOLzXxXn75ZaZPn06dOnW48cYbeeCBB3bcc4/3xhtv8O2339KpUyfWr1/Phg0baNWqVY4x5kYt+MwWL855/150lYiIFAr33Lc//oBGjaB8+fDzjz9yPjefDjroIObPn4+789lnnwHQuHFjunfvzoQJE5g4cSLPPvsskP297RNOOIEPP/yQV155he7duwNw33338eGHH2b7paJy5cosWbIEgJkzZwJQtWpVDj/8cN59910mTJjArFmzqF69Ou3atePpp58mPT2dIUOG5HodKSkhRTZo0IDp06cDMG3aNBo2bJgl9uy+ZMTr0aMHEydO5Pnnn6dChQp0796dwYMHs3btWgAWL17MokWLGDlyJB9++CHjx49n8uTJO35/+aEWfGa1a4du+czc4aij4JJL4NxzYd99Cz82EZG9tc8+MHs2zJ8PDRuG5/mQ0UUP0KJFCx566KEdx+644w569epF9erV2XfffSldujSnn346H3zwAR06dADgvPPO29EVnlmpUqVo1qwZ8+bN29HS79ChA3/6059o1arVjhZ/hmbNmvHHH3/QsWNHGjRoAITkfPPNN9OxY0dSUlKoVq0aI0aMoFu3bmzcuJFNmzbx1FNP5elau3XrxqhRo2jXrh0VKlRg2LBhrFmzZk9+XVk0atSI66+/nk6dOpGamsp+++3HY489xrJly6hSpcqO8+rXr8/06dN39CDsCdubG/hFTVpamu/1YjMZ9+Dju+lLlYJy5SD2TYtKleCCC+Dii6FJk737PBGRArJy5UqqVct2YbFCtWXLFkqXLs327ds54YQTGD58OAcffHDUYSWFzH/GZjbT3dOyO1dd9Jn17g1DhkCdOmGQSZ06MHQorFwZkv9xx8GaNfDYY9C0KXToAK+8Alu2RB25iEiRMG3aNNLT02nVqhUdO3ZUco+IWvD58cUXMGgQvPACrF8f9h18MPztb2GrVSvxMYiIZFJUWvCSOGrBJ1rz5iHBL1sWWvJNmsBPP8Edd4R582efDe+9p0F5IiISGSX4vVGpElx6aSgYMWECdO8euvVHjYKTToLGjeGhhyCbKRMiIiKJpARfEMwgPR1efjlMp7vzztBN/+23cNVVULMm/PWvsBfTHURERPaEEnxBO/hguOUW+P57GD06tOQ3boSnn4ZjjoHWreH558PcUxGRYmjRokWcc845AMybN4+0tLQdpWczu/LKK9m4cWNCYshY8KZ169Y75r9nZ8CAAYwZMyZP7zt58mTat29Pu3btOPHEE3NdzGZ38+jjF69J1EI2uVGCT5RSpaBbN3jnHZg3L7Tk998fpk2Dv/wltOr/8Q9YsCDqSEVE8mXp0qWcd955vPjii9TKYXDxQw89RLly5fb4vbdv377bc9LT05kwYQIPPPDAjtK3e+OXX37hkksu4aWXXmLSpEmMGDEi11rwu0vwGYvX/Pvf/97lcWFRgi8Mhx0GDzwAP/64syW/ejXcdx80aACdO8Obb8K2bVFHKiKSJ6tXr+acc87hySef3FGW9tprr6V9+/a0bNmSWbNmAdC+fXvWrVvHa6+9xrHHHkv79u0ZNGgQAEOHDt2xGMwHH3yw4/xrrrmGzp07M2XKFFq2bEl6ejq33XZbjrH89ttvOxJxxoI3AK1bt85y7t133016ejrt2rVjzpw5uxwbO3YsZ5111o7ytVWqVKFZs2YMHTp0R7398ePHM2DAAAYNGsS8efNo3749EydO5P3336d169a0atWKoUOH7rJ4zeOPP77LQjaFRZXsClP58vB//xe26dPhiSfgpZdg/Piw1a4N/ftD375w4IFRRysixVWswtwuuncPlTg3bIBTT816vE+fsK1aBeecEwYO5+Kzzz6jXbt2O2rNQ6hgV758eb744gv+9a9/MSxuka6RI0fy7LPPcsQRR7B9+3ZWrVrF8OHDmTRpEhs3buT000/nhBNOAODUU0/l/vvv59Zbb+W2227jtNNOy7ZFP3HiRFq1asWCBQt2fEHIzZw5c5g3bx4TJ05k+fLlXHzxxYwePXrH8ZwWgMnOxRdfzNNPP82E2O+pVatWjB07lkqVKtG6dWs+/vhj+vTpQ1paGqeddhorV67c8biwqAUflWOPhWefhaVLQ0u+fv0wQO+mm8IAvXPPhY8+0lQ7ESmSOnbsSJ06dRgwYMCOfffffz/HH388l112GcuWLdvl/FtvvZXHH3+cCy64gE8//ZSFCxfy9ddf06FDB0499dRdFpDJKMt66aWX8u6773LBBRcwfvz4LDGkp6czbdo0br31VqZNm5bleObu9blz5/LJJ5/Qvn17evbsmaXcbI0aNbJdAGZ3deYh3FKoWrUqZcqU4bDDDsty/VFQCz5qVarANdeEe/Tvvhta9WPGwPDhYWvWLHzr7t0bYusGi4jkKrfWd/nyuR+vWnW3rfcMDz30EN26deOFF17g1FNPZcyYMUydOpU5c+Zw+eWX73LuIYccwqBBg/jxxx85//zzGTFiBM2bN2fMmDGYGVviqoFmLPKy33778fDDD7N582aOOeYYTs2u5wG45JJLaNu2LRdeeCGpqak7Evf8+fN3Oa9x48akp6fvqEG/JVMF0i5dupCens4ll1xC9erVWb16NcuWLaNy5cp8/fXXALsM5otP/CkpKaxatYpKlSrx7bff5rknIJGU4IuKlBQ45ZSwLV4cyuX+978wZ06oeX/ddTvr3zdtGnW0IiKkpqYyfPhwOnbsSO3atTnooIPo0KEDbdu2zXLugAEDmDJlCuvWrePaa6+latWq9OzZk/T0dFJTU2nWrBmPPPLILq8ZPHgwo0aNYv369fTp0yfHOEqXLk3Hjh0ZOXIkl112Ge3ataNp06ZZkmzz5s1p2LAh6enppKSkcNJJJ3HTTTftOH7AAQfwxBNP0KNHDwDKlCnDww8/TMeOHbnvvvvo1KkT1apVo379+kBYMObss8/mH//4B3fffTddunQB4PLLL8/XwMKCplK1RdnmzaFozhNPwOTJO/enp4dEf+aZUKZMdPGJSJGiUrXJT6Vqk0WZMtCzJ0yaFOrfX3xxWKZ24sSwv3ZtuPVWiK2DLCIikkEJvrho1iy05JctCz+bNoWff4a77gr17888M9zDz8PcURERSX5K8MVNxYqhJT9nTmjZ9+wJqanw2mtw8smh/v2DD+6sfz9sWPgCkJISfsZNWxGR5LNly5Zci7NI8eTubNmyZZeBfbuje/DJYPnyUEBn8OCd3fVly0LLlvDpp7uWxS1fPgzg6907mlhFJGH++OMPNmzYwDYVzUpKqamplC9fnrJly+7Yl9s9+IQleDN7BjgNWOHuR2Rz/B9ARpYpBRwOVHP31Wa2CFgLbAO25hR8ZiU2wWfYuhXGjg1L2b79ds7n1akDixYVWlgiIpIYUSX4dsA64PnsEnymc08HrnL3E2LPFwFp7r5qTz6zxCf4ePPnhxK52THTvXoRkSQQySh6d58ErM7j6b2A4YmKpURq2DC01LNTpYoq5ImIJLnIB9mZWXmgEzAybrcD75jZTDPrt5vX9zOzGWY2Y+XKlYkMtfgZODDcc89s1Sro0gV++KHwYxIRkUIReYIHTgc+dvf41v5x7n400Bm4NNbdny13H+Luae6epgIPmfTuHQbU1akTuuVr14a//jUsWztuXJhq98gjWsVORCQJFYUE35NM3fPuviz2cwUwGmgZQVzJoXfvMKBu+/bQYv/vf2HuXPjzn2H9erjiCjjuOPjyy6gjFRGRAhRpgjez/YB04PW4fRXMrGLGY+BkQNmnIFWvDiNGhLnzNWrAtGlw9NFw222waVPU0YmISAFIWII3s+HAFKCRmS01s75m1t/M+seddibwjruvj9t3EPCRmc0GPgXGunvWdQJl73XtCl9/HQrnbNkCd94JLVqEZWpFRKRYU6EbCT76KNyfnzcvPL/4Yrj3XqhUKdq4REQkR1psRnbv+ONh1qyweE2pUqFYTpMm8MYbUUcmIiL5oAQvO5UtC3fcAZ99Bq1awY8/hm787t1DOVwRESk2lOAlq2bN4OOP4eGHoUIFeOUVOPxweOYZFcgRESkmlOAle6mpcPnl8NVX0Lkz/PYb9O0LHTvCd99FHZ2IiOyGErzkrk6dsIDNsGFQtSp88EFo4f/732FxGxERKZKU4GX3zODcc0OBnPPPD8vPXn99WI72s8+ijk5ERLKhBC95V7UqPP88jB8fWvaffx6S/HXXwYYNUUcnIiJxlOBlz51ySihte9VVYdDdf/4Tuu3ffz/qyEREJEYJXvJn333hgQdgypSQ3BcuDAPw/u//YHVeVwkWEZFEUYKXvdOyJcycGZam3WcfePbZMKVuxAhNqRMRiZASvOy90qXhpptg9mxo1w5WrIAePUKRnKVLo45ORKREUoKXgtOoEXz4IQweHGrYv/lmKHf7xBNhuVoRESk0SvBSsFJSoF+/sEpdt26wdi1cemlo2c+dG3V0IiIlhhK8JEbNmjB6NIwcGdaf//jjsBTtnXfC5s1RRycikvSU4CWxzjortOb/9reQ2G+7DY4+GqZOjToyEZGkpgQviVe5MgwZEsrcNmgQ6tu3bQtXXBG68EVEpMApwUvh6dABvvgCbrgh3Kt/5BE44gh4662oIxMRSTpK8FK4ypWDe+6BGTPgmGNg8WLo0iXUul+5MuroRESShhK8RKNFi3Af/r77QtIfPjwUyHnhBRXIEREpAErwEp1SpeCaa0Jd+5NOgl9+gQsugE6d4Pvvo45ORKRYU4KX6B16KLz9NgwdGgbkvfNOuDf/4IOwbVvU0YmIFEtK8FI0mMFf/hKK4fTsGZafvfpqaNMmDMwTEZE9ogQvRctBB4X78W++CbVqwfTpYTBe165Qu3YYfV+3LgwbFnWkIiJFmhK8FE2nnRYK5Fx2GWzdCm+8AUuWhAF4P/wQyuEqyYuI5EgJXoquihXh0UdDqz6zDRvg5psLPyYRkWJCCV6KvhUrst+/eHHhxiEiUowowUvRV7t29vtr1izcOEREipGEJXgze8bMVpjZlzkcb29mv5vZrNh2W9yxTmY2z8y+M7MbEhWjFBMDB0L58ln377+/ptGJiOQgkS34oUCn3Zwz2d1bxLY7AMwsFXgc6Aw0AXqZWZMExilFXe/eYbGaOnXCdLqaNWHffUOBHN2HFxHJVsISvLtPAlbn46Utge/cfaG7bwZeAroWaHBS/PTuDYsWwfbtsHRpGFWfmgr/+heMGBF1dCIiRU7U9+DbmNlsMxtnZk1j+2oCS+LOWRrbly0z62dmM8xsxkotVlJydOgQ6tgDXHghzJkTbTwiIkVMlAn+M6COux8JPAq8Fttv2Zyb4+oj7j7E3dPcPa1atWoFH6UUXVdcAeedF6bMdesGq/PTYSQikpwiS/Duvsbd18UevwWUNrOqhBb7IXGn1gKWRRCiFHVmMHgwHHUULFwYlpzVoDsRESDCBG9m1c3MYo9bxmL5BZgONDSzemZWBugJvBFVnFLElS8Po0dDlSphwZpbb406IhGRIiGR0+SGA1OARma21Mz6mll/M+sfO+Uc4Eszmw08AvT0YCtwGfA2MBcY4e5fJSpOSQJ16oSBdqmpcM898OqrUUckIhI5c8/x9naxk5aW5jNmzIg6DInKgw+GFegqVICpU8OSsyIiSczMZrp7WnbHoh5FL1JwrrwyTKdbvz4Muvv116gjEhGJjBK8JA+zUBCnRQtYsECD7kSkRFOCl+QSP+hu/Hi47bbdv0ZEJAkpwUvyqVsXXn4ZUlLg7rth5MioIxIRKXRK8JKcTjwR/vOf8Pgvf4GvNBFDREoWJXhJXlddFe7Da9CdiJRASvCSvMzgv/+FI4+E774LI+w16E5ESggleEluGYPuDjgAxo2Df/4z6ohERAqFErwkv3r1dg66GzgQRo2KOiIRkYRTgpeSoWPHsHY8hEF3X38dbTwiIgmmBC8lxzXXQM+esG5dGHT3229RRyQikjBK8FJymMFTT0Hz5jB/flhLfvv2qKMSEUkIJXgpWSpUgNdeC4Puxo6FAQOijkhEJCGU4KXkqVcPXnopDLq7884wyl5EJMkowUvJdNJJcO+94fEFF8DcudHGIyJSwJTgpeS69lro0WPnoLvff486IhGRAqMELyWXGTz9dBh09+23GnQnIklFCV5KtgoVwj34ypVhzBi4/faoIxIRKRBK8CKHHrpz0N0dd8Drr0cdkYjIXlOCFwE4+WS4557w+Pzz4Ztvoo1HRGQvKcGLZPjHP6B7d1i7Frp21aA7ESnWlOBFMpjBM89As2Zh0N3552vQnYgUW0rwIvHiB929+Wa4Jy8iUgwpwYtkVr8+DB8eWvS33w5vvBF1RCIie0wJXiQ7p5wCd98dHp93ngbdiUixowQvkpPrr4dzzgmD7rp1gzVroo5IRCTPlOBFcmIGzz4LRxwB8+Zp0J2IFCsJS/Bm9oyZrTCzL3M43tvMvohtn5jZkXHHFpnZHDObZWYzEhWjyG7tu29YXnb//cO9+LvuijoiEZE8SWQLfijQKZfj3wPp7t4cuBMYkul4B3dv4e5pCYpPJG/iB939859hdL2ISBGXsATv7pOA1bkc/8Tdf409nQrUSlQsInutUycYODA8Pu+80GUvIlKEFZV78H2BcXHPHXjHzGaaWb/cXmhm/cxshpnNWLlyZUKDlBLuhhvCoLs1azToTkSKvMgTvJl1ICT46+N2H+fuRwOdgUvNrF1Or3f3Ie6e5u5p1apVS3C0UqJlDLpr2jRMm/vLXzToTkSKrEgTvJk1B54Curr7Lxn73X1Z7OcKYDTQMpoIRTKJH3T32ms7u+1FRIqYyBK8mdUGRgHnu/u3cfsrmFnFjMfAyUC2I/FFItGgAbz44s5Bd2PGRB2RiEgWeUrwZnaQmT1tZuNiz5uYWd/dvGY4MAVoZGZLzayvmfU3s/6xU24DqgBPZJoOdxDwkZnNBj4Fxrr7+Hxcm0jidO4cpsy5Q+/eYXEaEZEixNx99yeFxP4scLO7H2lmpYDP3b1ZogPcE2lpaT5jhqbNSyFxD4PuRo2Cww+HadOgYsWooxKREsTMZuY0nTyvXfRV3X0EsB3A3bcC2wooPpHiyQyGDoUmTWDuXA26E5EiJa8Jfr2ZVSFMX8PMWgO/JywqkeKiYsUw2G6//cIys/fcE3VEIiJA3hP81cAbQH0z+xh4Hvh7wqISKU4aNoRhw0KL/tZbYezYqCMSEclbgnf3z4B0oC1wEdDU3b9IZGAixUqXLnDHHTsH3c2fH3VEIlLClcrLSWZ2QaZdR5sZ7v58AmISKZ5uugk++yx01XfrBlOnatCdiEQmr130x8ZtfwIGAGckKCaR4iklBZ57Loyo//pr6NMntOhFRCKQ1y76v8dtfwOOAsokNjSRYih+0N2oURp0JyKRyW8luw1Aw4IMRCRpHHbYzkF3t9wCb70VdUQiUgLltZLdm2b2RmwbA8wDXk9saCLFWJcucPvtoYv+3HPhu++ijkhESpg8DbID7ot7vBX4wd2XJiAekeRx881h0N1rr4VBd1OmaNCdiBSaPCV4d5+Y6EBEkk7GoLtWreCrr+DCC+GVV0LXvYhIguXaRW9ma81sTTbbWjNbU1hBihRblSqFFnylSjByJNx7b9QRiUgJkWuCd/eK7l4pm62iu1cqrCBFirVGjeB//wuPb74ZxmtxRBFJvD0aRW9mB5pZ7YwtUUGJJJ3TT9856O6ss6BmzdCFX7duGHEvIlLA8jqK/gwzmw98D0wEFgHjEhiXSPK55RY4+mjYuBGWLQvJ/ocfoF8/JXkRKXB5bcHfCbQGvnX3esCJwMcJi0okGaWkwMqVWfdv2BC67kVEClBeE/wWd/8FSDGzFHf/EGiRuLBEktTSHGaXLl5cuHGISNLL6zz438xsX2ASMMzMVhDmw4vInqhdO3TLZ1azZuHHIiJJLa8t+K6E8rRXAeOBBcDpiQpKJGkNHAjly2fdv2ULLFxY+PGISNLKa4LvB9Rw963u/py7PxLrsheRPdG7NwwZAnXqhII3NWuGVv3PP0PbtqHynYhIAchrgq8EvG1mk83sUjM7KJFBiSS13r1h0SLYvj3ck58zBzp2DEk+PR3eeSfqCEUkCeR1udjb3b0pcClQA5hoZu8lNDKRkqJSJRg7NiT+devCQjUvvBB1VCJSzO3pcrErgOXAL8CBBR+OSAlVpgw8/zxcdx1s3QoXXAD/+leYKy8ikg95LXRzsZlNAN4HqgJ/c/fmiQxMpMRJSQlJ/eGHw/35G26AK66AbduijkxEiqG8TpOrA1zp7rMSGIuIAFx+OVSvDuefD48+Cj/9FLrsy5aNOjIRKUbyulzsDWaWamY14l/j7qrOIZII3bvDgQeGdeRffRVWrAir0lWuHHVkIlJM5LWL/jLgZ+BdYGxsG5PAuESkfXuYPDlMpZs0Cf70J1iyJOqoRKSYyOsguyuBRu7e1N2bxbZc78Gb2TNmtsLMvszhuJnZI2b2nZl9YWZHxx3rZGbzYsduyPPViCSbZs1gyhRo0gS++gratIEvs/0nJSKyi7wm+CXA73v43kOBTrkc7ww0jG39gEEAZpYKPB473gToZWZN9vCzRZLHIYeElvzxx8OPP4aW/KRJUUclIkVcXhP8QmCCmd1oZldnbLm9wN0nAatzOaUr8LwHU4H9zexgoCXwnbsvdPfNwEuxc0VKrgMOgHffDWvJ//YbnHRSuDcvIpKDvI6iXxzbysS2glCT0DOQYWlsX3b7WxXQZ+6Z9u2z7uveHS65JCzxeeqpWY/36RO2VavgnHOyHr/4YujRI9xLPf/8rMevuQZOPx3mzYOLLsp6/JZbQtWzWbPgyiuzHr/77lDy9JNP4Kabsh5/6CFo0QLeew/uuivr8cGDoVEjePNNuP/+rMdfeCG0KF9+GQYNynr81VehalUYOjRsmb31VqjF/sQTMGJE1uMTJoSf990HYzIN8yhXDsaNC4/vvBPef3/X41WqwMiR4fGNN4au7Xi1asH//hceX3ll+B3GO+ywUEYWwhrt33676/EWLcLvD+C887KuDNemDdxzT3h89tnwS6ZqzieeCLfeGh537hzWhY932mlw7bXhcU5/90aMCH//hgyBP/8ZGjTYuVCN/u7p7x4k7u+e/t8rmL97l1yS9XiC5HUU/e0AZlbB3dcX0Gdbdh+Vy/7s38SsH6GLn9q1axdMZCJFVWoqPPBAaM1//z189x1s2gSHHhp1ZCJSxJjnoVKWmbUBngb2dffaZnYkcJG75/pVxMzqAmPc/Yhsjg0GJrj78NjzeUB7oC4wwN1Pie2/EcDd79ldnGlpaT5jxozdXo9IUnjuOfjrX0Plu/PPh6eeChXxRKTEMLOZ7p6W3bG83oN/CDiFUKIWd58NtNvLuN4ALoiNpm8N/O7uPwHTgYZmVs/MygA9Y+eKSLy//CV0KVaoELoQTz8d1q6NOioRKSLyXIve3TNPwM21fqaZDQemAI3MbKmZ9TWz/mbWP3bKW4TBe98B/wUuiX3OVuAy4G1gLjDC3b/Ka5wiJUqnTuHe8YEHhlXo2reH5cujjkpEioC8DrJbYmZtAY+1qi8nJN8cuXuv3Rx3wup02R17i/AFQER2Jy0tDC7q1CmsJ9+2Lbz9NjRsGHVkIhKhvLbg+xOScU3CqPYW5JCcRSQC9evDxx/DsceGwXdt28Knn0YdlYhEKK/rwa9y997ufpC7H+ju57n7L7t/pYgUmgMPhA8+CNOgVq2CDh3COvMiUiLlqYvezB7JZvfvwAx3f71gQxKRfNt3X3j99TCX+NlnoWvXMMe3b9+oIxORQpbXLvqyhG75+bGtOXAA0NfMHkpIZCKSP6VLw9NPh+Ig27aFqXR33AF5mBIrIskjr4PsGgAnxEa4Y2aDgHeAk4A5CYpNRPLLLFRcq1kTLr0U/vnPUMf+8cehVF7/2YtIcZbXFnxNoELc8wpADXffBmwq8KhEpGD07x9KqJYtG0qhnn12KDcqIkkvrwn+38AsM3vWzIYCnwP3mVkF4L1EBSciBaBbt1CDu3JleOONUNM7c61yEUk6eSpVCxC30psBn7r7skQGlh8qVSuSi7lzw1z5xYvDwhrjx0PdulFHJSJ7Id+las2scezn0cDBhFXeFgPVY/tEpLg4/PCwylnz5mHVrjZtsq5qJiJJY3ejba4B/gZks34eDpxQ4BGJSOLUqAGTJsGZZ8KHH0K7djB6dFhOVESSSp676IsDddGL5NGmTWGxmpdfDtPqnnsOeuVaXVpEiqC96aK/Lu7xnzMdu7tgwhORQrfPPvDii3DVVbBlC5x7LtyfXUediBRXuxtF3zPu8Y2ZjnUq4FhEpDClpMADD+xM7NdeC1dfDdu3RxuXiBSI3SV4y+Fxds9FpDi6+urQmi9dGh58MLTmN6m8hUhxt7sE7zk8zu65iBRXvXqFaXMVK4b78p07w++/Rx2ViOyF3SX4I81sjZmtBZrHHmc8b1YI8YlIYTnhBJg8GQ4+eOcI+2VFrtyFiORRrgne3VPdvZK7V3T3UrHHGc9LF1aQIlJIjjwSPvkkFML54oswV37u3KijEpF8yGupWhEpKerWhY8/Dsl98WI47rjwXESKFSV4EcmqSpVQv/6MM+DXX0P9+tdeizoqEdkDSvAikr3y5cNKdBddBH/8EVaiGzQo6qhEJI+U4EUkZ6VKhaR+xx1hfvwll8Att0ASVcAUSVZK8CKSOzO49VZ46ilITYWBA6Fv31ABT0SKLCV4Ecmbvn3h9ddD1/2zz0LXrrBuXdRRiUgOlOBFJO+6dAlz5KtWhXHjwtKzhxwSyt7WrQvDhkUdoYjEKMGLyJ5p2TJMm6taFb7/HpYuDffkf/gB+vVTkhcpIna3HryISFaHHQZly2bdv2FDWIZ26FA49NCw1a+/8/H++xd2pCIllhK8iOTPjz9mv3/btjCHPjsHHJA18Wf8rFUrDOITkQKR0ARvZp2Ah4FU4Cl3vzfT8X8AveNiORyo5u6rzWwRsBbYBmzNaUF7EYlI7dqhWz6zGjXgv/+FhQvDtmDBzserV4dtxoysrytdOtzHz67lf+ihYSEcEckz8wTNZzWzVOBb4CRgKTAd6OXuX+dw/unAVe5+Quz5IiDN3Vfl9TPT0tJ8Rnb/cYhIwRs2LNxz37Bh577y5WHIEOjdO+v57vDzz1mTfsbjn37K/fOqVcua+DMe16gRBvqJlDBmNjOnBnAiW/Atge/cfWEsiJeArkC2CR7oBQxPYDwiUpAykvjNN4ea9bVrhzny2SV3CPPpq1cPW9u2WY9v2BAG7WXX8l+4EFauDNu0aVlfu88+UK9e9t3/9eqFLx67M2xY3q9FpBhIZAv+HKCTu/819vx8oJW7X5bNueUJrfwG7r46tu974FfCuvOD3X3I7j5TLXiRJLV9e2jhZ9fyX7gQVqzI/fXVq2ff8j/00HDsxRf3rDdCpIiIqgVv2ezL6dvE6cDHGck95jh3X2ZmBwLvmtk37j4py4eY9QP6AdSuXXtvYxaRoiglBWrWDFu7dlmPr127s/Wf+UvAokWwfHnYslsVr1w52Lo1a2W+DRtCi14JXoqpRCb4pcAhcc9rActyOLcnmbrn3X1Z7OcKMxtN6PLPkuBjLfshEFrwex+2iBQ7FSuGojvNm2c9tm1bmKufXct/wYIw6C8nixcnLmaRBEtkgp8ONDSzesCPhCR+buaTzGw/IB04L25fBSDF3dfGHp8M3JHAWEUkWaWmQp06YevQIevx336DJk2yH+RXvnzoGahXL+FhihS0hA07dfetwGXA28BcYIS7f2Vm/c2sf9ypZwLvuPv6uH0HAR+Z2WzgU2Csu49PVKwiUoLtvz/85z/ZD8Rbvz4U9enfH5YsKfTQRPZGwgbZRUGD7EQk3zKPor/iCpg1C/73vzDIr0yZkOhvvDEMzBMpAnIbZKcELyKSm2++gQED4OWXw/Ny5eCyy+C660I9fpEI5ZbgVRlCRCQ3jRvDSy/B7NnQrRts3Bi69OvVg1tvDffwRYogJXgRkbxo3hxGj4bp06FzZ1i3Du66K5TXveuuMFVPpAhRghcR2RNpafDWW2FO/QknwO+/h5Z8vXqhZR9fLEckQkrwIiL50bYtvP8+fPABHHcc/PJLuC9/6KHwyCPwxx9RRyglnBK8iMje6NABJk+GceNC6/7nn8MI/IYNYfBg2Lw56gilhFKCFxHZW2bQqRN8+im8/nq4X790aZhW16gRDB0ayuGKFCIleBGRgmIGZ5wBn38eptU1bhxq4V94ITRtCsOHhzn1IoVACV5EpKClpED37vDll/DCC2H1um+/hXPPDa37UaMgiWqQSNGkBC8ikiipqXDeeTB3Ljz1VKiQ99VXcPbZcMwxMHasEr0kjBK8iEiilS4NffuGVvzjj8PBB4du/NNOgzZt4N13leilwCnBi4gUln32gUsuCcvUPvAAVKsG06bBySdD+/YwKcuK2CL5pgQvIlLYypWDq64Ka9Lfcw9UrhySe3p6SPbTpkUdoSQBJXgRkajsuy/ccENYc37AAKhUKXTXt24Np58euvFF8kkJXkQkavvtB//8Z0j0N94Y1qYfMwaOPhrOOScMzBPZQ0rwIiJFxQEHwN13h0R/9dVQtiyMHAnNmkHv3mGQnkgeKcGLiBQ1Bx4I998fBuNdeimUKgUvvghNmsD//V8oniOyG0rwIiJFVY0a8NhjMH8+/PWvYd+zz8Jhh8HFF4dyuCI5UIIXESnq6tSB//4XvvkGzj8ftm2DJ5+EBg3gyith+fKoI5QiSAleRKS4aNAAnn8+lMDt3h02bYKHHw5L1F5/fViyViRGCV5EpLg5/PCwmM2sWdC1K2zcCP/+N9SrB2eeCYccEurh160Lw4ZFHa1ERAleRKS4OvJIeO21sExtp06wdm14vnRpKH37ww/Qr5+SfAmlBC8iUtwdeyyMGwcHHZT12IYNcM01qnVfAinBi4gkixUrst//889hit3DD8OvvxZuTBIZJXgRkWRRu3b2+1NSwgj8K68MU+/69IGpU9WqT3JK8CIiyWLgwFDmNl758mHu/MiRYSGbP/6A554Ly9QedVSYbrd2bTTxSkIpwYuIJIvevWHIkDBv3iz8HDIELrgAzjoL3n47FM257jqoWhVmzw4Fc2rUgP79w6h8SRrmCeyiMbNOwMNAKvCUu9+b6Xh74HXg+9iuUe5+R15em520tDSfMWNGgcUvIpK0Nm2CUaNCCz5+HfpWrUKy7949a2+AFDlmNtPd07I7lrAWvJmlAo8DnYEmQC8za5LNqZPdvUVsu2MPXysiIvmxzz7QqxdMnBhWq7v88rCq3bRpcOGFULMmXHEFfP111JFKPiWyi74l8J27L3T3zcBLQNdCeK2IiOyJjBH2y5bBM8+EVvxvv8Ejj0DTppCeDsOHh1a/FBuJTPA1gSVxz5fG9mXWxsxmm9k4M2u6h68VEZGCUr58aL1PnQqffQYXXQQVKoQu/HPPhVq1QkncBQuijlTyIJEJ3rLZl/mG/2dAHXc/EngUeG0PXhtONOtnZjPMbMbKlSvzG6uIiMTLGGG/bBkMGhSq5q1aFUriNmgAp5wCo0fDli1RRyo5SGSCXwocEve8FrAs/gR3X+Pu62KP3wJKm1nVvLw27j2GuHuau6dVq1atIOMXEZFKlcKgu88/hylTwhz6smXhnXfCyPy6deG222DJkt29kxSyRCb46UBDM6tnZmWAnsAb8SeYWXUzs9jjlrF4fsnLa0VEpBCZQevWYU79smXw0EPQuHF4fOedIdGfcQa89VZYzlYil7AE7+5bgcuAt4G5wAh3/8rM+ptZ/9hp5wBfmtls4BGgpwfZvjZRsYqIyB6oXHnnCPsJE6BnT0hNhTffhC5doH59uPturVMfsYTOgy9smgcvIhKRFStC637wYPg+VtqkVKmwfG3//tChQ+gFkAIVyTx4EREpQQ48MIyw/+47GD8+JHZ3eOUVOPHE0J1///3wyy9RR1piKMGLiEjBSUkJI+xHjQrr0d9+e5he9+23cO21oYDO+efDxx9rsZsEU4IXEZHEqFkzjLD//nt4/XXo3Bk2b4b//Q+OPx6aN4fHH4fff4860qSkBC8iIolVqtTOEfYLFsCNN4Yu/S+/hMsuC4vd/O1vMHNm1JEmFSV4EREpPPXqhRH2S5bAyy/DCSfAhg3w1FOQlgbHHhser18Pw4aF6XcpKeHnsGFRR1+saBS9iIhEa968sKzts8/Cr7+GfWXLwtatYctQvnw4r3fvaOIsgjSKXkREiq5GjcII+x9/hOefh7Zt4Y8/dk3uEFr6N98cTYzFkBK8iIgUDeXK7Rxhn9Oc+cWLCzemYkwJXkREip7atbPfn5oa6uDLbinBi4hI0TNwYLjnHs8sdNufckooj/vTT9HEVkwowYuISNHTu3cYUFenTkjsderA0KFhudry5cMI/MaNwzx6LW6TLY2iFxGR4uWHH+Dvfw+L20CYXjd4MBx9dLRxRUCj6EVEJHnUqRMq440eHcrgzpgR5s9fcQWsWRN1dEWGEryIiBQ/ZtCtG8ydC9dcE54/8ggcfji8+qrq3KMELyIixdm++8J994Uyt61bw7Jl8Oc/h3XpFy6MOrpIKcGLiEjxd+SRYf78k0/C/vvDuHHQtCncc09Y4KYEUoIXEZHkkJICF10E33wTRuH/8QfcdBO0aAGTJkUdXaFTghcRkeRy0EFhSdr33oOGDcN9+vR0uPBCWLUq6ugKjRK8iIgkpxNPhC++gNtvh332CfPoGzWCZ56B7dujji7hlOBFRCR5lS0Lt90Gc+ZAx46wejX07Rta9F9+GXV0CaUELyIiya9hw1DD/sUXQxf+Rx/BUUfBDTeEteeTkBK8iIiUDGbQq1cYhHfJJaHE7b/+FUbbjx0bdXQFTgleRERKlv33DzXsp0wJI+x/+AFOOw3OPhuWLo06ugKjBC8iIiVTq1YwfTo8+GAomDNqVKiE9+CDYdW6Yk4JXkRESq5SpeDKK8NUurPPhnXr4OqrQ237adOijm6vKMGLiIjUqhVq2I8ZA3XrwqxZ0KZNuFf/228RB5c/SvAiIiIZunSBr74Ko+tTU2HQoLDu/IsvFrsFbBKa4M2sk5nNM7PvzOyGbI73NrMvYtsnZnZk3LFFZjbHzGaZmRZ5FxGRwlG+fKhhP2sWHH88/PxzKH178skwf37U0eVZwhK8maUCjwOdgSZALzNrkum074F0d28O3AkMyXS8g7u3yGkxexERkYRp2hQmToSnn4YDDgilb5s1C5XxNm2KOrrdSmQLviXwnbsvdPfNwEtA1/gT3P0Td/819nQqUCuB8YiIiOyZlBT4v/+DefNCLftNm2DAAGjeHN5/P+rocpXIBF8TWBL3fGlsX076AuPinjvwjpnNNLN+CYhPREQkb6pWDTXsJ04MU+m+/TaUvu3dO3ThF0GJTPCWzb5sRyiYWQdCgr8+bvdx7n40oYv/UjNrl8Nr+5nZDDObsXLlyr2NWUREJGft2oV783ffHercv/hiWMDmySeL3AI2iUzwS4FD4p7XApZlPsnMmgNPAV3d/ZeM/e6+LPZzBTCa0OWfhbsPcfc0d0+rVq1aAYYvIiKSjTJl4MYbw2j7zp3h99/h4ouhbVuYPTvq6HZIZIKfDjQ0s3pmVgboCbwRf4KZ1QZGAee7+7dx+yuYWcWMx8DJQHIv+yMiIsXLoYeGGvavvgo1aoTCOMccA9dcEwrmRCxhCd7dtwKXAW8Dc4ER7v6VmfU3s/6x024DqgBPZJoOdxDwkZnNBj4Fxrr7+ETFKiIiki9moQLe3LlwxRVhrvwDD4T79KNHRzp33ryYTdzPTVpams+YoSnzIiISkc8+g4sugoxcdNpp8NhjYXnam2+GxYuhdm0YODAM0NtLZjYzp6nkSvAiIiIFadu2MOjupptgzRooXTrs37Jl5znly8OQIXud5HNL8CpVKyIiUpBSU+HSS8O68z17hsQen9wBNmwILfoEUoIXERFJhIMPhuHDw3367CxenNCPV4IXERFJpNq192x/AVGCFxERSaSBA8M993jly4f9CaQELyIikki9e4cBdXXqhO76OnUKZIDd7pRK6LuLiIhISOYJTuiZqQUvIiKShJTgRUREkpASvIiISBJSghcREUlCSvAiIiJJSAleREQkCSnBi4iIJCEleBERkSSUVMvFmtlK4IcCfMuqwKoCfL8oJcu1JMt1gK6lqEqWa0mW6wBdS27quHu17A4kVYIvaGY2I6d1doubZLmWZLkO0LUUVclyLclyHaBryS910YuIiCQhJXgREZEkpASfuyFRB1CAkuVakuU6QNdSVCXLtSTLdYCuJV90D15ERCQJqQUvIiKShJTgs2Fmz5jZCjP7MupY9oaZHWJmH5rZXDP7ysyuiDqm/DKzsmb2qZnNjl3L7VHHtDfMLNXMPjezMVHHsjfMbJGZzTGzWWY2I+p49oaZ7W9mr5rZN7F/M22ijik/zKxR7M8jY1tjZldGHVd+mNlVsX/vX5rZcDMrG3VM+WVmV8Su46vC+vNQF302zKwdsA543t2PiDqe/DKzg4GD3f0zM6sIzAS6ufvXEYe2x8zMgAruvs7MSgMfAVe4+9SIQ8sXM7saSAMquftpUceTX2a2CEhz92I/R9nMngMmu/tTZlYGKO/uv0Uc1l4xs1TgR6CVuxdkjZCEM7OahH/nTdx9o5mNAN5y96HRRrbnzOwI4CWgJbAZGA9c7O7zE/m5asFnw90nAaujjmNvuftP7v5Z7PFaYC5QM9qo8seDdbGnpWNbsfx2ama1gC7AU1HHIoGZVQLaAU8DuPvm4p7cY04EFhS35B6nFFDOzEoB5YFlEceTX4cDU919g7tvBSYCZyb6Q5XgSwgzqwscBUyLOJR8i3VrzwJWAO+6e3G9loeA64DtEcdREBx4x8xmmlm/qIPZC4cCK4FnY7dOnjKzClEHVQB6AsOjDiI/3P1H4D5gMfAT8Lu7vxNtVPn2JdDOzKqYWXngVOCQRH+oEnwJYGb7AiOBK919TdTx5Je7b3P3FkAtoGWs26tYMbPTgBXuPjPqWArIce5+NNAZuDR2e6s4KgUcDQxy96OA9cAN0Ya0d2K3Gc4AXok6lvwws8pAV6AeUAOoYGbnRRtV/rj7XOBfwLuE7vnZwNZEf64SfJKL3a8eCQxz91FRx1MQYl2nE4BO0UaSL8cBZ8TuXb8EnGBm/4s2pPxz92WxnyuA0YR7jMXRUmBpXK/Qq4SEX5x1Bj5z95+jDiSfOgLfu/tKd98CjALaRhxTvrn70+5+tLu3I9wCTuj9d1CCT2qxgWlPA3Pd/YGo49kbZlbNzPaPPS5H+Mf/TaRB5YO73+jutdy9LqH79AN3L5atEjOrEBu8Saw7+2RCV2Sx4+7LgSVm1ii260Sg2A1GzaQXxbR7PmYx0NrMysf+LzuRMI6oWDKzA2M/awNnUQh/NqUS/QHFkZkNB9oDVc1sKfBPd3862qjy5TjgfGBO7N41wE3u/lZ0IeXbwcBzsVHBKcAIdy/WU8ySwEHA6PB/L6WAF919fLQh7ZW/A8NiXdsLgQsjjiffYvd5TwIuijqW/HL3aWb2KvAZoTv7c4p3RbuRZlYF2AJc6u6/JvoDNU1OREQkCamLXkREJAkpwYuIiCQhJXgREZEkpAQvIiKShJTgRUREkpASvEhEzGxbbLWvL83sldjUpuzO+ySf759mZo/sRXzrcthf3cxeMrMFZva1mb1lZofl93OKAjNrb2bFtoiKSHaU4EWis9HdW8RWLNwM9I8/GJvzj7vnK/G4+wx3v3zvw9wlJiNUrJvg7vXdvQlwE2FOfHHWnmJcJU0kO0rwIkXDZKBBrCX5oZm9CMyBnS3p2LEJcWuWD4slXMzsWDP7xMxmm9mnZlYxdv6Y2PEBZvaCmX1gZvPN7G+x/fua2ftm9pmFdd277ibODsAWd38yY4e7z3L3yRb8J9YjMcfMesTFPdHMRpjZt2Z2r5n1jsU5x8zqx84bamZPmtnk2HmnxfaXNbNnY+d+bmYdYvv7mNkoMxsfu6Z/Z8RkZieb2ZTYdb0SW48hY/362+Out7GFhZj6A1fFelT+tJd/liJFgirZiUTMwlKYnQmLUECo536Eu3+fzelHAU0Jy2Z+DBxnZp8CLwM93H26haVPN2bz2uZAa6AC8LmZjSWszHemu68xs6rAVDN7w3OugHUEkNNCOWcBLYAjgarAdDObFDt2JGHJzNWEKnFPuXtLM7uCUEHuyth5dYF0oD7woZk1AC4FcPdmZtaYsHpdxi2BFrHfySZgnpk9Grv2W4CO7r7ezK4HrgbuiL1mlbsfbWaXANe6+1/N7Elgnbvfl8O1iRQ7SvAi0SkXV0J4MmHdgLbApzkkd2LHlgLEXlsX+B34yd2nA2SsGBhr3Md73d03AhvN7EPCF4mxwN0WVoHbDtQkdLcvz8f1HA8Md/dtwM9mNhE4FlgDTHf3n2JxLQAylv2cQ+gVyDDC3bcD881sIdA49r6Pxq7tGzP7AchI8O+7+++x9/0aqAPsDzQBPo79DsoAU+I+I2PRpZmELyUiSUkJXiQ6G2PL3+4QS0jrc3nNprjH2wj/ho2wLvvuZD7Hgd5ANeAYd99iYZW7srm8x1fAOTkcy/KNIk583Nvjnm9n1/+Hsosxr+8b//t419177eY1GeeLJCXdgxcp/r4BapjZsQCx++/ZJa6usfvZVQiDyqYD+xHWp98Su7ddZzef9QGwT8Y9/NjnHWtm6cAkoIeZpZpZNaAd8OkeXsufzSwldl/+UGBe7H17xz7rMKB2bH9OphJuXTSIvaZ8Hkb5rwUq7mGsIkWaErxIMefum4EewKNmNht4l+xb4Z8SuuSnAnfG1nIfBqSZ2QxCEs11Cd7YvfkzgZMsTJP7ChhAGBMwGvgCmE34InBdbBnWPTEPmAiMA/q7+x/AE0Cqmc0hjDXo4+6bcnoDd18J9AGGm9kXsettvJvPfRM4U4PsJJloNTmREsDMBlDEB5GZ2VBgjLu/GnUsIslALXgREZEkpBa8iIhIElILXkREJAkpwYuIiCQhJXgREZEkpAQvIiKShJTgRUREkpASvIiISBL6f3Ncp/u2BmsWAAAAAElFTkSuQmCC\n",
"text/plain": [
"
"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"from factor_analyzer import FactorAnalyzer\n",
"from factor_analyzer.factor_analyzer import calculate_bartlett_sphericity\n",
"from sklearn.decomposition import PCA\n",
"import matplotlib.pyplot as plt\n",
"import matplotlib\n",
"def scree_plot(eigvals):\n",
" fig = plt.figure(figsize=(8,5))\n",
" sing_vals = np.arange(len(eigvals)) + 1\n",
" plt.plot(sing_vals, eigvals, 'ro-', linewidth=2)\n",
" #####horizontal line\n",
" horiz_line_data = np.array([1 for i in range(len(sing_vals))])\n",
" plt.plot(sing_vals, horiz_line_data, 'r--')\n",
" plt.title('Scree Plot for PCA')\n",
" plt.xlabel('Principal Component')\n",
" plt.ylabel('Eigenvalue')\n",
" #I don't like the default legend so I typically make mine like below, e.g.\n",
" #with smaller fonts and a bit transparent so I do not cover up data, and make\n",
" #it moveable by the viewer in case upper-right is a bad place for it\n",
" leg = plt.legend(['Eigenvalues from PCA', 'Kaisers Rule Cutoff'], loc='best', borderpad=0.3,\n",
" shadow=False, prop=matplotlib.font_manager.FontProperties(size='small'),\n",
" markerscale=0.4)\n",
" leg.get_frame().set_alpha(0.4)\n",
"\n",
" #plt.savefig(os.path.join(save_dir / (name +'.jpg')))\n",
" return plt\n",
"\n",
"def pca_workflow(X, factors=-1, standardize=False, rotation='quartimax'):\n",
" \"\"\"\n",
" This will perform factor analysis, calculating the number of factors.\n",
" Printing scree plots, etc.\n",
" \"\"\"\n",
"\n",
" chi_square_value,p_value=calculate_bartlett_sphericity(X)\n",
"\n",
" if round(p_value,2)<=0.05:\n",
" print(\"Data passed Bartlett’s test for sphericity.\")\n",
" else:\n",
" print(\"Data failed Bartlett’s test for sphericity, use PCA with caution.\")\n",
" \n",
" #This is used to calculate\n",
" if factors ==-1:\n",
" fa = FactorAnalyzer(n_factors=X.shape[1], rotation=None, method='ml')\n",
" fa.fit_transform(X)\n",
" # Check Eigenvalues\n",
" ev, v = fa.get_eigenvalues()\n",
" #set the number of factors as where Eigenvalue > 1.0\n",
" factors = np.sum(ev>1.0)\n",
" print (\"Performing PCA using rotation:\", rotation, \" factors: \", factors, \"and standardization: \", standardize)\n",
" loading_cols=['F'+str(x+1) for x in range(factors)]\n",
" plot=scree_plot(ev)\n",
"\n",
" if standardize:\n",
" X = StandardScaler().fit_transform(X)\n",
"\n",
" fa = FactorAnalyzer(n_factors=factors, method='principal', rotation=rotation)\n",
" fa.fit(X)\n",
"\n",
" #Change it back to a dataframe.\n",
" results=pd.DataFrame(fa.transform(X),columns=loading_cols)\n",
" \n",
" return results\n",
"\n",
"X4= pca_workflow(X)"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"