# Sales Prediction Model in Power BI

Leveraging the Python scripting option in Power BI is a powerful way to combine complex machine learning models with the interactivity of a dashboard.

For the Python model, we will use the scikit-learn library to create a linear regression model. The data is split into a training set and a testing set so the model has data to learn from, and then we run the model on the full dataset.

We can derive the coefficients and rebuild the linear regression equation using What-If parameters in Power BI.
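
As a rough illustration of what the rebuilt equation computes, the sketch below multiplies each channel's spend by its coefficient and adds the intercept. The function name `predict_sales` and all of the numbers are placeholders invented for this example, not values fitted to the dataset. In Power BI, the three spend inputs would come from What-If parameters.

```python
def predict_sales(tv, radio, newspaper, intercept, coef_tv, coef_radio, coef_newspaper):
    """Evaluate the linear regression equation for one combination of ad spend."""
    return intercept + coef_tv * tv + coef_radio * radio + coef_newspaper * newspaper


# Placeholder inputs and coefficients, for illustration only (not fitted values).
example = predict_sales(
    tv=100.0, radio=20.0, newspaper=10.0,
    intercept=3.0, coef_tv=0.5, coef_radio=0.25, coef_newspaper=0.0,
)
print(example)
```

Changing any one input while holding the others fixed shifts the prediction by that channel's coefficient times the change, which is the behavior a What-If slider exposes.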

| TV | radio | newspaper | sales |
|---|---|---|---|
| 23.01 | 37.8 | 69.2 | 27183 |
| 4.45 | 39.3 | 45.1 | 12792 |

This is a sample of the data set that is going to be used.

In the data above, sales is the target we want to predict, and the 3 channels (TV, radio, newspaper) are the predictors. The model estimates one coefficient for each channel.
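
A minimal sketch of this setup with scikit-learn is shown here. Because the advertising file itself is not included in this post, the example generates a synthetic stand-in with the same column names; the relationship used to generate `sales`, the row count, and the `predicted_sales` column name are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the advertising data (NOT the real dataset).
rng = np.random.default_rng(0)
ads = pd.DataFrame({
    "TV": rng.uniform(0, 300, 200),
    "radio": rng.uniform(0, 50, 200),
    "newspaper": rng.uniform(0, 100, 200),
})
# Made-up relationship plus noise, so the model has something to recover.
ads["sales"] = 3.0 + 0.05 * ads["TV"] + 0.2 * ads["radio"] + rng.normal(0, 1, 200)

features = ["TV", "radio", "newspaper"]
X = ads[features]
y = ads["sales"]

# Hold out a testing set, then fit on the training set only.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = LinearRegression()
model.fit(X_train, y_train)

# Run the fitted model on the full dataset and keep the predictions.
ads["predicted_sales"] = model.predict(X)

# The intercept and per-channel coefficients define the regression equation.
coefficients = dict(zip(features, model.coef_))
print("intercept:", model.intercept_)
print("coefficients:", coefficients)
print("R^2 on the testing set:", model.score(X_test, y_test))
```

The printed intercept and coefficients are the numbers you would carry into the dashboard to rebuild the equation.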

When building your code, it's best to use an IDE, which will give you the ability to debug the Python script. Spyder is a good lightweight IDE that comes with Anaconda.

**Get the dataset:** Advertisement Dataset

**This is the code:**

    # Load the dependencies
    import pandas as pd
    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.linear_model import LogisticRegression
    from sklearn.preprocessing import LabelEncoder, StandardScaler

    dataset = pd.read_csv('HR_comma_sep.csv')

    # Let's change categories to numbers
    le = LabelEncoder()
    dataset['Departments'] = le.fit_transform(dataset['Departments'])
    dataset['salary'] = le.fit_transform(dataset['salary'])

    # Preprocess your data
    y = dataset['left']
    features = ['satisfaction_level', 'last_evaluation', 'number_project',
                'average_montly_hours', 'time_spend_company', 'Work_accident',
                'promotion_last_5years', 'Departments', 'salary']
    X = dataset[features]

    # Let's scale the data
    s = StandardScaler()
    X = s.fit_transform(X)

    # Split the dataset and train the model
    X_train, X_test, y_train, y_test = train_test_split(X, y)

    # Let the model predict results on the full dataset
    log = LogisticRegression()
    log.fit(X_train, y_train)
    y_pred = log.predict(X)
    y_prob = log.predict_proba(X)[:, 1]

    # Let's add the columns back to the dataframe
    dataset['predictions'] = y_pred
    dataset['probabilities'] = y_prob
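
The script above splits the data but never scores the model on the held-out rows. A minimal sketch of that check follows. It uses synthetic data from `make_classification` as a stand-in for the HR file (an assumption, since the CSV is not included here) and, as a variation on the script, fits the scaler on the training rows only so the testing rows stay unseen.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in with 9 features, matching the feature count in the script.
X, y = make_classification(n_samples=500, n_features=9, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Fit the scaler on the training rows, then apply it to both sets.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

log = LogisticRegression()
log.fit(X_train_scaled, y_train)

# Accuracy on rows the model did not see during training.
test_accuracy = accuracy_score(y_test, log.predict(X_test_scaled))
print(f"Held-out accuracy: {test_accuracy:.3f}")
```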

Please review the video and the code above. Feel free to ask questions in the comment section below.

Can you please attach the sample dataset in CSV format?

Thanks, Gaelim Holland. Just a small finding: the code you mentioned in the blog is for the HR dataset. Where can I find that dataset? Thanks.

Sorry, you should find the dataset on this page for the churn model: https://www.absentdata.com/power-bi/python-machine-learning-in-power-bi/

Great work! Can you share the pbix file?