How to Ensure Your Airbnb Rental Gets Rented?

This post attempts to predict successful Airbnb listings

Brian Strand
4 min read · Mar 24, 2021

The vacation rental industry has fierce competition. This post will use a dataset of nearly 4,000 listings in the Seattle area to predict successful listings.

Questions of interest are as follows. Do Airbnb renters give bad reviews to listings that demand high fees? Do Airbnb renters rent cheaper units more than expensive units? And finally, how can Airbnb owners keep their listings occupied?

The Dataset

The dataset provided by Airbnb comprises 3,818 listings with 92 attributes (potential features for prediction). We first compute a correlation matrix to see whether any of the continuous variables in the dataset have interesting correlations, keeping in mind that correlation does not imply causation.

As the correlation matrix below shows, several pairs of variables are notable for their lack of correlation.

There is no correlation between the price of the listing and the review scores. This suggests positive reviews are independent of how expensive the house or condo is (including cleaning fees, security fees, etc.). So people don’t give bad reviews just because they pay higher fees. There is also no correlation between room availability and listing price, suggesting cheaper units do not get booked more than expensive ones.

As expected, we see strong correlation between some variables. For example, price is strongly correlated with the number of beds available as well as the number of occupants a unit will accommodate.
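The correlation step described above can be sketched roughly as follows. This is a minimal illustration, not the original analysis: the column names (`price`, `beds`, `accommodates`, `review_scores_rating`) and every value in the toy table are assumptions made up for the example.

```python
import pandas as pd

# Toy stand-in for the listings table; all values are invented for illustration.
listings = pd.DataFrame({
    "price": [80, 120, 150, 200, 260, 310],
    "beds": [1, 1, 2, 3, 4, 5],
    "accommodates": [2, 2, 4, 6, 8, 10],
    "review_scores_rating": [95, 88, 97, 90, 94, 89],
})

# Pearson correlation between every pair of numeric columns.
corr = listings.select_dtypes("number").corr()
print(corr.round(2))
```

Values near 1 or -1 indicate a strong linear relationship, and values near 0 indicate little or none.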

The Prediction

As shown below, the average listing has an occupancy rate of 44% over the next 30-day window. The median, however, is only 33%, so the data are right-skewed: many owners are dealing with occupancy rates well below the average.

Descriptive Statistics of All Listings’ Occupancy Rate
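As a rough sketch of how a mean above the median signals right skew, assume the occupancy rate is derived as one minus the share of the next 30 days that a listing is still available. That derivation and the values below are assumptions for illustration, not taken from the dataset.

```python
from statistics import mean, median

# Hypothetical "days available in the next 30" per listing (invented values).
available_days = [0, 0, 5, 12, 20, 24, 27, 30, 30, 30]

# Assumed definition: occupancy rate = share of the window that is booked.
occupancy = [1 - d / 30 for d in available_days]

avg, med = mean(occupancy), median(occupancy)
print(f"mean: {avg:.0%}, median: {med:.0%}")

# A mean above the median is the usual sign of a right-skewed distribution.
print("right-skewed" if avg > med else "not right-skewed")
```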

To predict occupancy rate, let’s first look at the response variable’s distribution. The distribution plot clearly shows that some owners are very successful, with listings that are 100% booked, while others are completely vacant.

Percentage of Time the Listing is Booked

After accounting for multicollinearity, we split the data into training and test sets. Applying a linear regression model to predict occupancy rates gives a poor fit (an R-squared of 0.27): the model explains only 27% of the variability in occupancy rate.

The occupancy rate has a very non-normal distribution: listings are frequently either 0% occupied or 100% occupied.
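To make the R-squared figure concrete, here is a minimal sketch of a train/test split, an ordinary least squares fit, and R-squared computed by hand on the held-out rows. The data are synthetic and the 25% test share is an assumption. It uses plain NumPy rather than whatever library the original analysis used.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data: 3 made-up features and a target clipped to [0, 1].
n = 400
X = rng.normal(size=(n, 3))
y = np.clip(0.5 + 0.2 * X[:, 0] + rng.normal(scale=0.3, size=n), 0.0, 1.0)

# Shuffle the row indices, then hold out 25% of the rows as a test set.
idx = rng.permutation(n)
test_idx, train_idx = idx[: n // 4], idx[n // 4 :]

def add_intercept(A):
    """Prepend a column of ones so the fit includes an intercept term."""
    return np.column_stack([np.ones(len(A)), A])

# Ordinary least squares fit on the training rows only.
coef, *_ = np.linalg.lstsq(add_intercept(X[train_idx]), y[train_idx], rcond=None)

# R-squared on the held-out rows: 1 - (residual SS / total SS).
pred = add_intercept(X[test_idx]) @ coef
ss_res = np.sum((y[test_idx] - pred) ** 2)
ss_tot = np.sum((y[test_idx] - y[test_idx].mean()) ** 2)
r2 = 1 - ss_res / ss_tot
print(f"test R-squared: {r2:.2f}")
```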

We can modify our “Occupancy Rate” variable as follows: any listing with an occupancy rate less than 50% is set to 0, and any listing with an occupancy rate greater than 50% is set to 1. We will then try to predict failure (0) or success (1) using a logistic regression model instead, since success seems to be binary.

We are now predicting whether an owner is “Successful” or “Not Successful” at renting their unit.
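The binarization and logistic regression step might look roughly like the sketch below. The data are synthetic, the feature set is invented, and treating a listing at exactly 50% as 0 is an assumption, since the rule above does not say which side it falls on.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in data (invented): two features and an occupancy rate in [0, 1].
X = rng.normal(size=(400, 2))
occupancy = np.clip(0.5 + 0.25 * X[:, 0] + rng.normal(scale=0.3, size=400), 0, 1)

# Binarize: 1 = "Successful" (more than 50% booked), 0 = "Not Successful".
# Assumption: a listing at exactly 50% is treated as 0 here.
successful = (occupancy > 0.5).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, successful, test_size=0.25, random_state=0
)

# Fit on the training rows, then report plain accuracy on the held-out rows.
model = LogisticRegression().fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
print(f"test accuracy: {accuracy:.2f}")
```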

The confusion matrix (below left) shows we can predict successful landlords with 54% accuracy and unsuccessful landlords with 67% accuracy. The distributions of predictions (below middle) indicate there is some randomness in the model’s ability to predict success.
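One way to read per-class figures like these off a confusion matrix is to divide each diagonal count by its row total. The sketch below assumes that interpretation (the share of each true class predicted correctly) and uses made-up labels, not the model's actual predictions.

```python
# Made-up true and predicted labels (1 = Successful, 0 = Not Successful).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 1, 1]

# counts[t][p] = number of listings with true label t that were predicted as p.
counts = [[0, 0], [0, 0]]
for t, p in zip(y_true, y_pred):
    counts[t][p] += 1

# Per-class accuracy: share of each true class that was predicted correctly.
per_class = {label: counts[label][label] / sum(counts[label]) for label in (0, 1)}

print(counts)
print({label: f"{share:.0%}" for label, share in per_class.items()})
```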

However, since the model does a better job of predicting “Not Successful” listings, we can infer some things for our landlords to avoid. The variables with negative values displayed below are the strongest predictors of a “Not Successful” (0) listing.

The Magnolia neighborhood, Bed & Breakfast listings, and Lake City are the top three variables that may lead to your Airbnb not being rented.

On the positive side, if you live in the Cascade area and are considering renting your place out on Airbnb, then you have a good opportunity for a steady stream of income!
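A ranking like this can be obtained by sorting the fitted logistic regression coefficients. The sketch below is only an illustration under stated assumptions: the neighborhood labels `A`, `B`, and `C` and the outcomes are invented, and one-hot encoding via `pd.get_dummies` is one reasonable choice, not necessarily what the original analysis did.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Toy data (invented): "A" is mostly unsuccessful, "C" is mostly successful.
df = pd.DataFrame({
    "neighborhood": ["A"] * 6 + ["B"] * 6 + ["C"] * 6,
    "successful": [0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0],
})

# One-hot encode the categorical column so each level gets its own coefficient.
X = pd.get_dummies(df["neighborhood"], prefix="neighborhood", dtype=float)
model = LogisticRegression().fit(X, df["successful"])

# Sort ascending: the most negative coefficients push hardest toward class 0.
coefs = pd.Series(model.coef_[0], index=X.columns).sort_values()
print(coefs)
```

Coefficients are only comparable like this when the features are on similar scales, which holds for one-hot indicators.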
