A look at the data used to power the cat adoption prediction tool.
Last year, while my family and I were in the process of adopting a cat, I thought it would be fun to write a Python script to help us keep track of available kittens. The script scanned the local animal shelter's website and sent me a message whenever new kittens were listed for adoption. What started as a simple project to aid our search quickly turned into a data collection endeavor. Over time, I amassed a significant amount of data on cat adoptions, which sparked the idea for this project. The following delves into the details of how I used this data and built a statistical model to predict how long it would take for a cat to be adopted.
The prediction tool is based on data from the no-kill Loudoun County animal shelter. Initially, the Python script was set to scan the website every 15 minutes. When a new cat was added, the script recorded the date and time, along with the cat's attributes such as weight, gender, age, breed, coat type, and color. It also tracked cats that were previously recorded but were no longer listed, marking them as adopted and recording the removal date and time. This raw data underwent cleaning and preprocessing before being used to develop the prediction model. I collected information on about 650 cats from May 2023 to May 2024.
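The tracking logic described above (note new cats, and mark missing ones as adopted) can be sketched roughly as follows. This is an illustrative guess rather than the actual script: the function names, the record layout, and the cat IDs are all hypothetical, and fetching and parsing the shelter's website is left out entirely.

```python
from datetime import datetime


def update_records(records, listed_ids, now):
    """Compare the current listing with what has been seen so far.

    records maps a cat ID to {"listed_at": datetime, "adopted_at": datetime or None}.
    listed_ids is the set of cat IDs on the page right now (hypothetical input;
    fetching and parsing the shelter page is not shown).
    """
    # Cats seen for the first time: record when they were listed.
    for cat_id in listed_ids:
        if cat_id not in records:
            records[cat_id] = {"listed_at": now, "adopted_at": None}
    # Cats recorded earlier but no longer listed: assume they were adopted.
    for cat_id, rec in records.items():
        if cat_id not in listed_ids and rec["adopted_at"] is None:
            rec["adopted_at"] = now
    return records


def days_to_adoption(rec):
    """Days between listing and removal, or None if the cat is still listed."""
    if rec["adopted_at"] is None:
        return None
    return (rec["adopted_at"] - rec["listed_at"]).total_seconds() / 86400


# Two made-up scans, three days apart.
records = {}
update_records(records, {"A1", "A2"}, datetime(2023, 5, 1, 9, 0))
update_records(records, {"A2", "A3"}, datetime(2023, 5, 4, 9, 0))
print(days_to_adoption(records["A1"]))  # A1 disappeared between the scans
```

A real scan loop would call something like update_records every 15 minutes. Note that this sketch treats any removal as an adoption and does not handle a cat being relisted.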
From the data I collected, there were 356 female cats and 293 male cats.
The cats varied widely in weight. On average, they weighed approximately 7.23 pounds, with weights ranging from 1 pound to 21 pounds (a 14-year-old cat named Elvis). The median weight was 7 pounds, close to the mean.
The box plot depicting weight by gender confirms typical expectations: male cats tend to be larger than females. Notably, the weight distribution among males exhibits greater variability compared to females. Box plots summarize dataset distributions by displaying key metrics like median, quartiles, and outliers, offering insights into the spread and central tendencies of variables.
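For readers who want to see what a box plot actually summarizes, here is a minimal sketch that computes the quartiles and flags outliers using the common 1.5 × IQR rule. The weights are made up for illustration and are not the shelter data, and the choice of the "inclusive" quantile method is an assumption; plotting libraries may compute quartiles slightly differently.

```python
from statistics import quantiles


def box_plot_summary(values):
    """Return the numbers a box plot draws: quartiles, IQR, and outliers."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    # Points beyond 1.5 * IQR from the box are conventionally drawn as outliers.
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    outliers = [v for v in values if v < low_fence or v > high_fence]
    return {"q1": q1, "median": median, "q3": q3, "iqr": iqr, "outliers": outliers}


# Hypothetical weights in pounds, with one very heavy cat.
weights_lb = [5, 6, 6, 7, 7, 7, 8, 8, 9, 21]
summary = box_plot_summary(weights_lb)
print(summary)
```

Running the same function separately on male and female weights would give the two boxes that a weight-by-gender plot compares.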
The scatter plot of weight by number of days to adoption provides insight into the relationship between a cat's weight and the time it spends in the shelter before adoption. It shows a noticeable positive correlation, indicating that heavier cats generally take longer to find adoptive homes. This trend is worth considering for adopters and shelters alike, since it highlights how weight influences the adoption process. Factors such as perceived care needs, space requirements, and health considerations may play a role in adopters' decisions and lengthen the adoption timeline for heavier cats. This becomes more noticeable in the statistical model detailed below.
The data reveals interesting insights into the ages of cats available for adoption. The mean age of the cats is approximately 2 years and 1 month (25.7 months), while the median age is 1 year (12 months). The youngest kittens in the dataset are just 2 months old. On the other end of the spectrum, the oldest cat was an 18-year-old gray tabby named Sienna. The age distribution is heavily right-skewed: most of the cats are clustered at the young end, with a long tail of older cats pulling the mean well above the median. In other words, most cats up for adoption were kittens.
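The gap between the mean and the median is a quick way to see which side the long tail of a distribution is on. The sketch below, using made-up ages rather than the shelter data, compares the two and also computes a simple moment-based skewness coefficient. A positive value means the tail is on the high (older) side.

```python
from statistics import mean, median


def skewness(values):
    """Moment-based skewness: third central moment over variance ** 1.5."""
    m = mean(values)
    n = len(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical ages in months: mostly kittens, plus a few much older cats.
ages_months = [2, 2, 3, 4, 6, 12, 12, 24, 36, 60, 216]

print("mean:", mean(ages_months))
print("median:", median(ages_months))
print("skewness:", skewness(ages_months))
```

With a few very old cats in the list, the mean lands well above the median and the skewness comes out positive.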
The box plot depicting age by gender had a large number of outliers, so I've only shown part of the box plot. Despite this, the plot suggests that female cats tend to live longer than male cats, and that male cats are put up for adoption earlier than female cats.
The scatter plot above, depicting age against the number of days to adoption, offers insight into the relationship between a cat's age and the length of its shelter stay before adoption. It reveals a clear positive correlation, indicating that older cats tend to spend more time awaiting adoption. This trend matters for both adopters and shelters, as it underscores the influence of age on the adoption process. Factors such as perceived care needs, adaptability to new environments, and potential health considerations likely shape adopters' decisions and lengthen the adoption timeline for older cats. These insights are explored further in the statistical model discussed below.
After exploring various models to predict adoption times for shelter cats, I ultimately chose a random forest regression model due to its robustness and versatility in handling complex datasets. Random forests excel at capturing non-linear relationships and interactions between variables, which is crucial given the diverse and multifaceted nature of the data. Moreover, their ability to handle datasets with numerous features, with less risk of overfitting than a single decision tree, makes them well suited to this application. The ensemble nature of random forests, which combine many decision trees, also provides reliable predictions and helps mitigate the impact of outliers in the data. These factors, alongside their interpretability and ease of implementation, made the random forest regression model the best choice for predicting adoption times.
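To give a feel for the ensemble idea, here is a deliberately tiny stand-in written with only the standard library. It is not the model used for this project. Each "tree" is a single split (a stump) trained on a bootstrap sample with a random subset of features, and the forest averages the trees' predictions. Real random forests grow much deeper trees and resample features at every split. The data and all function names are made up for illustration.

```python
import random
from statistics import mean


def fit_stump(rows, targets, feature_indices):
    """Find the one (feature, threshold) split with the lowest squared error."""
    best = None  # (error, feature, threshold, left_mean, right_mean)
    for f in feature_indices:
        for threshold in sorted({r[f] for r in rows}):
            left = [t for r, t in zip(rows, targets) if r[f] <= threshold]
            right = [t for r, t in zip(rows, targets) if r[f] > threshold]
            if not left or not right:
                continue
            lm, rm = mean(left), mean(right)
            err = sum((t - lm) ** 2 for t in left) + sum((t - rm) ** 2 for t in right)
            if best is None or err < best[0]:
                best = (err, f, threshold, lm, rm)
    if best is None:  # no usable split: always predict the sample mean
        m = mean(targets)
        return (0, float("inf"), m, m)
    return best[1:]


def fit_forest(rows, targets, n_trees=25, seed=0):
    """Train n_trees stumps, each on a bootstrap sample and random features."""
    rng = random.Random(seed)
    n_features = len(rows[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample with replacement
        feats = rng.sample(range(n_features), max(1, n_features // 2))
        forest.append(fit_stump([rows[i] for i in idx], [targets[i] for i in idx], feats))
    return forest


def predict(forest, row):
    """Average the predictions of every stump in the forest."""
    return mean(lm if row[f] <= thr else rm for f, thr, lm, rm in forest)


# Hypothetical cats: (age in months, weight in pounds) -> days to adoption.
rows = [(2, 2), (3, 3), (4, 4), (6, 5), (12, 8), (24, 9), (60, 11), (96, 12), (120, 13), (168, 14)]
days = [3, 4, 5, 6, 10, 15, 30, 40, 50, 60]

forest = fit_forest(rows, days)
print("kitten:", predict(forest, (2, 2)))
print("senior:", predict(forest, (168, 14)))
```

In practice, a library implementation such as scikit-learn's RandomForestRegressor would be the more likely choice. The point here is only that averaging many trees trained on resampled data smooths out the quirks of any single tree.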
The chart below shows the correlation between each feature in the statistical model and the number of days to adoption. A negative correlation indicates that the feature is associated with fewer days in the shelter before adoption; a positive correlation indicates more days.
As indicated in the previous sections, age and weight have a large positive impact on the number of days a cat will stay in the shelter before adoption, as does whether or not the cat is a "barn cat," meaning feral. Another interesting insight from this chart is the minor impact of gender. For this model, female was encoded as 0 and male as 1, so the small positive correlation on gender means male cats tend to stay in the shelter slightly longer than female cats.
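As a rough illustration of how such correlations can be computed, the sketch below encodes gender as 0 for female and 1 for male, as described above, and calculates the Pearson correlation between each feature and the days to adoption. The six cats, the field names, and the pearson helper are made up for the example; the real chart was built from the full dataset.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical records, not the shelter data.
cats = [
    {"age_months": 2, "weight_lb": 2.0, "gender": "female", "days": 3},
    {"age_months": 3, "weight_lb": 3.0, "gender": "male", "days": 5},
    {"age_months": 12, "weight_lb": 7.0, "gender": "female", "days": 9},
    {"age_months": 36, "weight_lb": 10.0, "gender": "male", "days": 20},
    {"age_months": 96, "weight_lb": 12.0, "gender": "female", "days": 35},
    {"age_months": 168, "weight_lb": 14.0, "gender": "male", "days": 60},
]

days = [c["days"] for c in cats]
features = {
    "age_months": [c["age_months"] for c in cats],
    "weight_lb": [c["weight_lb"] for c in cats],
    # Encode gender numerically: 0 = female, 1 = male.
    "gender": [1 if c["gender"] == "male" else 0 for c in cats],
}

correlations = {name: pearson(values, days) for name, values in features.items()}
for name, r in correlations.items():
    print(f"{name}: {r:+.2f}")
```

With this encoding, a positive value for gender means the cats coded 1 (males) stayed longer on average in the sample, which is how the sign in the chart should be read.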
This project began as an opportunity for me to learn Python to better aid in my family's quest to adopt a cat but evolved into a comprehensive exploration of adoption dynamics at the Loudoun County animal shelter. Through the process of data collection and analysis, I think I've identified significant correlations and trends that shed light on the factors influencing a cat's journey from shelter to adoption. The findings underscore the impact of age and weight on adoption timelines, revealing that, unfortunately, older and heavier cats generally face longer waits for adoption. These insights could not only provide valuable guidance for shelter operations and adoption strategies but also highlight the nuanced considerations adopters weigh when choosing their feline companions.
The application of a random forest regression model proved instrumental in predicting adoption times accurately, leveraging its ability to discern complex patterns in adoption data. Beyond age and weight, the model identified subtle influences such as gender and feral status ("barn cat") that contribute to adoption durations, as did other less obvious traits, like the cat being the color brown. I hope this project has been as intriguing for you to read about as it was fulfilling for me to undertake. And if you haven't yet, check out your local shelter and adopt an animal!
© 2025 AlsoJon