onur baltacı
- Jul 6, 2023
- 2 min read

Insurance Charge Prediction - Web App

Updated: Jul 8, 2023

Hello everyone. In this blog post we are going to build a dynamic machine learning web application that predicts insurance charges from a customer's attributes. I recorded two YouTube videos on this dataset, one on exploratory data analysis and one on machine learning, and I am leaving their links below.

Data Analysis Video -> Link

Machine Learning Video -> Link

In the machine learning video, the random forest regressor was the best-performing model, so I am not going to train other models to find the best one. I am only going to use the code needed to train the model and save it as a pkl file. We are going to start by installing two packages: streamlit for the app and localtunnel for exposing it over a public URL.

!pip install streamlit
!npm install localtunnel

And now, I am going to load the dataset and perform necessary preprocessing operations.

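import pandas as pd
import numpy as np
from sklearn import preprocessing, model_selection
import joblib

labelencoder = preprocessing.LabelEncoder()
scaler = preprocessing.StandardScaler()

# Load the data, drop duplicate rows and keep only the columns we need.
data = pd.read_csv("insurance.csv")
data = data.drop_duplicates()
# .copy() avoids pandas' SettingWithCopyWarning when columns are re-assigned below.
data = data[["age","sex","bmi","children","smoker","charges"]].copy()

# LabelEncoder assigns codes in sorted label order. Assuming the labels are
# "female"/"male" and "no"/"yes", that gives female=0, male=1, no=0, yes=1.
data["sex"] = labelencoder.fit_transform(data["sex"])
data["smoker"] = labelencoder.fit_transform(data["smoker"])

X = data[["age","sex","bmi","children","smoker"]]
y = data["charges"]

# Hold out 30% of the rows for testing.
X_train, X_test, y_train, y_test = model_selection.train_test_split(X, y, test_size=0.3)
scaled_X_train = scaler.fit_transform(X_train)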
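The integer codes produced by the label-encoding step matter later, because the app has to show the right text for 0 and 1. LabelEncoder keeps its classes in sorted order, and the plain-Python sketch below mimics that ordering. The helper name label_codes and the sample values are assumptions for illustration; the real columns are assumed to hold "female"/"male" and "yes"/"no".

```python
def label_codes(values):
    """Map each distinct label to an integer code, in sorted label order."""
    classes = sorted(set(values))
    return {label: code for code, label in enumerate(classes)}


# Hypothetical sample values standing in for the dataset's columns.
sex_codes = label_codes(["female", "male", "male", "female"])
smoker_codes = label_codes(["yes", "no", "no"])

print(sex_codes)     # {'female': 0, 'male': 1}
print(smoker_codes)  # {'no': 0, 'yes': 1}
```

So under these assumptions 0 means female and non-smoker, and 1 means male and smoker.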

So far we have imported the necessary libraries and preprocessed the data. Now we are going to train a random forest regressor, because it was the best-performing model in the machine learning video. We are going to use 128 for the number of estimators (n_estimators) and 3 for the maximum number of features considered at each split (max_features).

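from sklearn.ensemble import RandomForestRegressor
rfr_model = RandomForestRegressor(max_features = 3, n_estimators = 128)
# Fit on the unscaled features: the app passes raw input values to the model,
# and tree-based models do not need feature scaling.
rfr_model.fit(X_train, y_train)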

Now we are going to save our model to disk using joblib's dump() function. The file is written in pickle format under the name rfr_model, which is the name the app loads later.

import joblib
joblib.dump(rfr_model,"rfr_model")
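The save-and-load round trip can be checked in isolation. The sketch below is not the training code from this post: it fits a small random forest on synthetic data (the arrays, the file name demo_model and all variable names are made up for illustration), saves it with joblib.dump, loads it back with joblib.load, and confirms that both objects give the same predictions.

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in for the five insurance features and the target.
rng = np.random.default_rng(0)
X_demo = rng.random((200, 5))
y_demo = X_demo @ np.array([1.0, 2.0, 3.0, 4.0, 5.0])

demo_model = RandomForestRegressor(n_estimators=10, max_features=3, random_state=0)
demo_model.fit(X_demo, y_demo)

# Write the model to a temporary directory and read it back.
with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = str(Path(tmp_dir) / "demo_model")
    joblib.dump(demo_model, model_path)
    restored_model = joblib.load(model_path)

# The restored model should reproduce the original predictions.
same_predictions = np.allclose(
    demo_model.predict(X_demo), restored_model.predict(X_demo)
)
print(same_predictions)
```

Loading only works when scikit-learn is installed in the environment that reads the file, ideally in the same version that wrote it.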

And now we are going to write a requirements.txt file that lists the packages our app uses and their versions. scikit-learn is listed as well, because joblib needs it to load the saved model; it is left unpinned here, so pin it to the version you trained with.

%%writefile requirements.txt
pandas == 1.5.3
numpy == 1.22.4
joblib == 1.2.0
streamlit == 1.24.0
scikit-learn
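If you are not sure which versions to pin, one option is to read them from the environment where the model was trained. The sketch below uses the standard library's importlib.metadata; the helper name pin_requirements is hypothetical, and scikit-learn is included in the list because the saved model is a scikit-learn object.

```python
from importlib import metadata


def pin_requirements(package_names):
    """Return 'name==version' lines for the packages installed here."""
    lines = []
    for name in package_names:
        try:
            lines.append(f"{name}=={metadata.version(name)}")
        except metadata.PackageNotFoundError:
            # Skip anything that is not installed in this environment.
            continue
    return lines


pinned = pin_requirements(["pandas", "numpy", "joblib", "streamlit", "scikit-learn"])
print("\n".join(pinned))
```

The printed lines can be pasted into requirements.txt as they are.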

And now we are going to write our Streamlit application, app.py. We start by loading our model, then write a function that packs the user input into a DataFrame, a function that makes the prediction, and a main function that defines the app's UI and reads the input from the sidebar.

%%writefile app.py
import streamlit as st
import pandas as pd
import numpy as np
import joblib

model = joblib.load("/content/rfr_model")

def preprocess_input(age, sex, bmi, children, smoker):
    data = pd.DataFrame({
        'age': [age],
        'sex': [sex],
        'bmi': [bmi],
        'children': [children],
        'smoker': [smoker]
    })
    return data

def predict_insurance_charge(data):
    prediction = model.predict(data)
    return prediction

def main():
    st.title("Insurance Charge Estimation")
    st.sidebar.title("User Input")

    age = st.sidebar.slider("Age", 20, 100, step=1, value=30)
    # LabelEncoder sorts labels alphabetically, so 0 is "female" and 1 is "male".
    sex = st.sidebar.selectbox("Sex", [0, 1], format_func=lambda x: "Female" if x == 0 else "Male")
    bmi = st.sidebar.slider("BMI", 10.0, 40.0, step=0.1, value=20.0)
    children = st.sidebar.slider("Number of Children", 0, 10, step=1, value=0)
    smoker = st.sidebar.selectbox("Smoker", [0, 1], format_func=lambda x: "No" if x == 0 else "Yes")

    input_data = preprocess_input(age, sex, bmi, children, smoker)

    prediction = predict_insurance_charge(input_data)

    st.subheader("Estimated Insurance Charge:")
    result_placeholder = st.empty()
    result_placeholder.write(prediction[0])


if __name__ == "__main__":
    main()
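The app writes prediction[0] as a raw float. A small helper like the one below could make the number easier to read. format_charge is a hypothetical name and not part of the original app; the sketch leaves out any currency symbol, since the currency of the charges is not stated here.

```python
def format_charge(value):
    """Format a predicted charge with thousands separators and two decimals."""
    return f"{float(value):,.2f}"


print(format_charge(12345.678))  # 12,345.68
```

In the app this would be called as result_placeholder.write(format_charge(prediction[0])).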

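And now we are going to run the app and expose it through localtunnel. The curl command at the end prints the machine's public IP address, which localtunnel's landing page asks for as the tunnel password.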

!streamlit run app.py &>/content/logs.txt & npx localtunnel --port 8501 & curl ipv4.icanhazip.com

Here is what the application we created looks like.

That was all for this blog post, thanks for reading. Have a great day!
