Author: pbadanapally1
Time Series Transformation and Stationarity Analysis
Original Time Series: Plotted the original ‘Total Gross Earnings’ series over time and checked it for discernible trends or patterns.
Box-Cox Transformation: Applied the Box-Cox transformation to stabilize the variance, then plotted the transformed series.
Augmented Dickey-Fuller Test (Box-Cox Transformed): Ran the ADF test on the Box-Cox transformed series. ADF Statistic: -4.073 (approx.) p-value: Read More…
XGBoost
The goal is to predict ‘TOTAL_GROSS’ earnings using an XGBoost regression model.
Data Preparation: Features: [‘RETRO’, ‘DETAIL’, ‘OVERTIME’, ‘INJURED’, ‘QUINN_EDUCATION’, ‘REGULAR’]; Target: ‘TOTAL_GROSS’.
Top Departments Selection: The top 10 departments by employee count are selected for analysis.
Train-Test Split: The data is split into training and testing sets (80% training, 20% testing).
XGBoost Regression Model: Read More…
Visualizing the Trends in different earnings components over the years
A line chart visualizes the trends in the different earnings components over the years. The code uses the pd.pivot_table function to aggregate the data, selecting the columns ‘RETRO’, ‘OTHER’, ‘OVERTIME’, ‘INJURED’, ‘DETAIL’, and ‘QUINN_EDUCATION’. The resulting pivot table is then plotted as a line chart in which each line represents one earnings component. Read More…
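The pivot-then-plot approach might look like the sketch below. The data is synthetic, and both the `year` index and the `sum` aggregation are assumptions, since the excerpt does not say how the pivot table is keyed or aggregated.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

components = ["RETRO", "OTHER", "OVERTIME", "INJURED", "DETAIL", "QUINN_EDUCATION"]

# Synthetic earnings records (made-up values) spread over six years.
rng = np.random.default_rng(1)
df = pd.DataFrame(rng.uniform(0, 10_000, size=(300, len(components))), columns=components)
df["year"] = rng.choice(np.arange(2015, 2021), size=300)

# One row per year, one column per earnings component (assumed: yearly sums).
trend = pd.pivot_table(df, index="year", values=components, aggfunc="sum")

# Each column of the pivot table becomes one line on the chart.
ax = trend.plot(kind="line", marker="o", figsize=(10, 5))
ax.set_xlabel("Year")
ax.set_ylabel("Total earnings")
ax.set_title("Earnings components over the years")
plt.tight_layout()
```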
Visualizing the trend of total gross earnings over the years for the Boston Police Department.
Data Selection: The code filters the DataFrame (final_df) to include only records associated with the ‘Boston Police Department’.
Data Aggregation: The total gross earnings for the Boston Police Department are then aggregated for each year.
Plotting: A line plot is created to visualize the trend in total gross earnings over the years. The x-axis represents Read More…
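The filter, aggregate, and plot steps can be sketched on a tiny made-up `final_df`. The column names `DEPARTMENT_NAME` and `year` are assumptions, as is the use of a sum for the yearly aggregation.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Tiny made-up stand-in for final_df; column names are assumptions.
final_df = pd.DataFrame({
    "DEPARTMENT_NAME": [
        "Boston Police Department", "Boston Fire Department",
        "Boston Police Department", "Boston Police Department",
        "Boston Fire Department", "Boston Police Department",
    ],
    "year": [2019, 2019, 2019, 2020, 2020, 2021],
    "TOTAL_GROSS": [100_000.0, 90_000.0, 120_000.0, 110_000.0, 95_000.0, 130_000.0],
})

# Data selection: keep only Boston Police Department records.
bpd = final_df[final_df["DEPARTMENT_NAME"] == "Boston Police Department"]

# Data aggregation: total gross earnings per year.
bpd_by_year = bpd.groupby("year")["TOTAL_GROSS"].sum()

# Plotting: one line, years on the x-axis, totals on the y-axis.
fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(bpd_by_year.index, bpd_by_year.values, marker="o")
ax.set_xlabel("Year")
ax.set_ylabel("Total gross earnings")
ax.set_title("Boston Police Department: total gross earnings by year")
fig.tight_layout()
```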
Payroll expenditure and the number of employees over different years
This post creates bar plots in subplots to visualize payroll expenditure and the number of employees over the years. Let’s break down the code and its implications:
Data Aggregation: The data is first grouped by the ‘year’ column. For each year, the code calculates the total regular earnings (‘REGULAR’), total gross earnings (‘TOTAL_GROSS’), and Read More…
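A small sketch of the grouping and subplot layout, using made-up numbers. The excerpt is cut off before the third aggregate, so the employee count here (rows per year, assuming one row per employee) is a guess based on the post title, and the side-by-side layout is likewise an assumption.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Made-up payroll rows; one row per employee per year is an assumption.
df = pd.DataFrame({
    "year": [2019, 2019, 2020, 2020, 2020],
    "REGULAR": [50_000.0, 60_000.0, 55_000.0, 65_000.0, 70_000.0],
    "TOTAL_GROSS": [70_000.0, 85_000.0, 75_000.0, 90_000.0, 95_000.0],
})

# Group by year and compute the yearly totals plus a row count.
yearly = df.groupby("year").agg(
    total_regular=("REGULAR", "sum"),
    total_gross=("TOTAL_GROSS", "sum"),
    employees=("TOTAL_GROSS", "count"),
)

# Left panel: payroll expenditure; right panel: number of employees.
fig, (ax_pay, ax_emp) = plt.subplots(1, 2, figsize=(12, 4))
yearly[["total_regular", "total_gross"]].plot(kind="bar", ax=ax_pay)
ax_pay.set_title("Payroll expenditure by year")
yearly["employees"].plot(kind="bar", ax=ax_emp)
ax_emp.set_title("Number of employees by year")
fig.tight_layout()
```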
Visualize Earnings Distribution
The frequency distribution of total earnings shows how often different total-earnings values occur in the dataset. The plot indicates that the distribution is right-skewed (positively skewed). In a right-skewed distribution, the tail is on the right side, and there are relatively few individuals with very Read More…
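Right skew can be checked numerically as well as visually. The sketch below uses a synthetic log-normal sample as a stand-in for total earnings (made-up values, not the post's data) and relies on two common indicators: a positive sample skewness and a mean that exceeds the median.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic, right-skewed stand-in for total earnings (made-up values).
rng = np.random.default_rng(7)
total_gross = pd.Series(rng.lognormal(mean=11.0, sigma=0.5, size=2_000), name="TOTAL_GROSS")

# Positive skewness and mean > median both point to a long right tail.
skewness = total_gross.skew()
mean, median = total_gross.mean(), total_gross.median()
print(f"skewness={skewness:.2f}  mean={mean:,.0f}  median={median:,.0f}")

# Histogram of earnings: frequency on the y-axis, earnings on the x-axis.
fig, ax = plt.subplots(figsize=(8, 4))
ax.hist(total_gross, bins=50)
ax.set_xlabel("Total earnings")
ax.set_ylabel("Frequency")
ax.set_title("Distribution of total earnings")
fig.tight_layout()
```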
Project 2 Final Report
Project 1 Updated version
Discovering Top 5 departments with Highest Total Earnings
The bar chart shows the top 5 departments with the highest total earnings, ordered from highest to lowest. Each bar represents a department, and its height corresponds to that department’s total earnings. The x-axis shows the department names, and the y-axis shows total earnings. So we can conclude that Read More…
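One way to produce such a ranking is to sum earnings per department and keep the five largest totals. The data below is made up, and the column name `DEPARTMENT_NAME` is an assumption.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Made-up earnings rows for seven hypothetical departments.
df = pd.DataFrame({
    "DEPARTMENT_NAME": ["Dept A", "Dept A", "Dept B", "Dept C", "Dept C",
                        "Dept D", "Dept E", "Dept F", "Dept G"],
    "TOTAL_GROSS": [300.0, 200.0, 450.0, 100.0, 150.0, 700.0, 50.0, 600.0, 320.0],
})

# Sum earnings per department, then keep the five largest totals.
# nlargest returns them already sorted from highest to lowest.
totals = df.groupby("DEPARTMENT_NAME")["TOTAL_GROSS"].sum()
top5 = totals.nlargest(5)

# One bar per department: names on the x-axis, totals on the y-axis.
ax = top5.plot(kind="bar", figsize=(8, 4))
ax.set_xlabel("Department")
ax.set_ylabel("Total earnings")
ax.set_title("Top 5 departments by total earnings")
plt.tight_layout()
```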