Hot or Not: The AI Edition – Deep Learning Decodes Attractiveness

In our increasingly digital world, the intersection of technology and human perception continues to grow more intertwined. This project explores one of the more intriguing aspects of this intersection: the assessment of human attractiveness through the lens of artificial intelligence. While beauty is often said to be in the eye of the beholder, there are patterns in what people tend to find appealing, particularly upon first impression. By harnessing the power of deep learning, we aim to decode these patterns and develop a model that can predict the attractiveness of human faces with a high degree of accuracy.

Using the CelebA dataset, which provides over 200,000 celebrity images annotated with 40 binary attributes, this study trains a convolutional neural network (CNN) to predict facial attractiveness as labeled in the dataset. This approach not only challenges the model to reproduce human aesthetic judgments but also offers a window into how those judgments, and the biases behind them, end up encoded in data.

The project is as much about understanding the capabilities and limitations of AI in social perception as it is about technical achievement. Through this exploration, we aim to shed light on how deep learning can be applied to subjective human experiences, potentially transforming industries such as digital marketing, online dating, and social media.





Setting Up the Environment

Integrating Google Colab and Google Drive

I began my project in Google Colab, a powerful cloud-based environment that supports free GPU usage—a boon for deep learning tasks. To seamlessly access my dataset, I integrated Google Drive with Colab:

from google.colab import drive

drive.mount('/content/drive')

This code mounts the Google Drive to the Colab notebook, enabling direct file manipulation and access within my workspace.


Extracting the Data

The dataset, compressed into a zip file and stored on Google Drive, was extracted directly into the Colab environment for accessibility and efficiency:

!unzip '/content/drive/MyDrive/Colab Notebooks/Project1/' > /dev/null

Using Unix commands within Colab facilitates straightforward file operations, setting the stage for data preprocessing.
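
For readers who would rather stay in Python, the same extraction step can be sketched with the standard-library zipfile module. This is only an illustration: the helper name extract_archive and any paths passed to it are placeholders, not the ones used in this project.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, dest_dir):
    """Extract every member of zip_path into dest_dir and return the member count."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)  # create the target folder if needed
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        return len(archive.namelist())
```

Returning the member count gives a quick sanity check that the archive was not empty or only partially uploaded.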


Data Preparation


Loading and Preprocessing Data

The backbone of my model's training is TensorFlow, a versatile framework that simplifies data manipulation and model construction. Before building the input pipeline, I loaded the attribute labels with pandas:

import pandas as pd

attributes = pd.read_csv("list_attr_celeba.csv")

I focused on the "Attractive" attribute, converting its annotations (stored as 1 and -1 in CelebA's attribute file) into binary labels, 1 for attractive and 0 for not, to frame the problem as binary classification.
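
The conversion might look roughly like the sketch below. It assumes the CSV has an image_id column holding the file names (as in the commonly distributed CSV version of CelebA) and sorts by it, since a label list passed to Keras has to follow the alphanumeric order of the image files. The helper name and the tiny demo frame are made up for illustration.

```python
import pandas as pd


def to_binary_labels(attributes, column="Attractive", id_column="image_id"):
    """Map 1 / -1 annotations to 1 / 0, ordered by file name."""
    ordered = attributes.sort_values(id_column)
    return (ordered[column] == 1).astype(int).tolist()


# Tiny made-up frame standing in for list_attr_celeba.csv
demo = pd.DataFrame({
    "image_id": ["000002.jpg", "000001.jpg", "000003.jpg"],
    "Attractive": [-1, 1, 1],
})
demo_labels = to_binary_labels(demo)
```

Applied to the real attributes frame, the returned list would play the role of int_labels in the dataset-creation step.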


Creating TensorFlow Dataset

To optimize model training, I processed images and labels into a TensorFlow dataset, resizing images to 64x64 pixels and batching them to enhance computational efficiency:

from tensorflow.keras.utils import image_dataset_from_directory

ds = image_dataset_from_directory(
    "img_align_celeba",
    labels=int_labels,      # one 0/1 label per image, in sorted file-name order
    image_size=(64, 64),    # resize every image to 64x64
    batch_size=BATCH_SIZE,
)

Sample of Image Dataset



Model Training

Model Architecture

Building the Convolutional Neural Network (CNN)

My model architecture is a sequential array of layers tailored to extract, analyze, and classify features from facial images:

from tensorflow.keras import layers, models
from tensorflow.keras.regularizers import l2

model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(64, 64, 3)),
    layers.MaxPooling2D(2, 2),
    # Additional layers...
    layers.Flatten(),
    layers.Dense(512, activation='relu', kernel_regularizer=l2(0.001)),
    layers.Dense(1, activation='sigmoid'),  # probability of the "attractive" class
])

Each convolutional layer detects different visual features, pooling layers reduce spatial dimensionality, and the dense layers make the final decision from the features extracted. L2 regularization on the dense layer is meant to curb overfitting, and the sigmoid output yields a probability suited to binary classification.
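
To make the dimensionality reduction concrete, here is a small worked calculation for the first convolution and pooling pair shown above. It assumes Keras's default 'valid' (unpadded) behavior for both layers; the helper functions are illustrative and not part of any library.

```python
def conv2d_output(size, kernel, stride=1):
    """Spatial output size of a 'valid' (unpadded) convolution."""
    return (size - kernel) // stride + 1


def maxpool_output(size, pool, stride=None):
    """Spatial output size of 'valid' max pooling (stride defaults to the pool size)."""
    stride = stride or pool
    return (size - pool) // stride + 1


def conv2d_params(in_channels, filters, kernel):
    """Trainable parameters: a kernel x kernel x in_channels weight block plus one bias per filter."""
    return (kernel * kernel * in_channels + 1) * filters


after_conv = conv2d_output(64, 3)             # 64x64 input -> 62x62
after_pool = maxpool_output(after_conv, 2)    # 62x62 -> 31x31
first_layer_params = conv2d_params(3, 32, 3)  # 896 weights and biases
```

So a 64x64x3 image leaves the first block as a 31x31x32 feature map, roughly a quarter of the spatial area, while the convolution itself needs only 896 parameters.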


Training the Model

Configuring and Executing Training

I configured the model to minimize binary crossentropy loss, indicative of how well the model distinguishes between classes:

from tensorflow.keras.optimizers import Adam

model.compile(
    optimizer=Adam(learning_rate=0.001),
    loss='binary_crossentropy',
    metrics=['accuracy'],
)
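
For intuition about what this loss measures, the following plain-Python sketch computes mean binary crossentropy by hand. It is a teaching aid rather than the Keras implementation, and the clipping constant eps is a choice made here to avoid taking log(0).

```python
import math


def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy over paired 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


# Same two labels, predicted confidently right and confidently wrong
confident_right = binary_crossentropy([1, 0], [0.9, 0.1])
confident_wrong = binary_crossentropy([1, 0], [0.1, 0.9])
```

Confident correct predictions give a loss near 0.105, while the confidently wrong ones give about 2.30, which is why minimizing this quantity pushes the sigmoid output toward the true label.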

Training over 30 epochs with both training and validation datasets allowed me to monitor and tune the model’s performance iteratively:

# train_ds is an assumed name for the training split; the original call was garbled
history = model.fit(train_ds, epochs=30, validation_data=val_ds)

Visualizing Results

Post-training, I assessed model performance through accuracy and loss plots, identifying epochs where the model achieved optimal balance between learning and generalization:
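
A minimal plotting sketch along these lines is shown below. It assumes matplotlib is available and that the dictionary passed in has the keys Keras records when compiled with metrics=['accuracy'] (accuracy, val_accuracy, loss, val_loss); the function name plot_history is illustrative.

```python
import matplotlib.pyplot as plt


def plot_history(history_dict):
    """Plot training vs. validation accuracy and loss side by side; return the figure."""
    fig, (ax_acc, ax_loss) = plt.subplots(1, 2, figsize=(10, 4))
    for ax, metric in ((ax_acc, "accuracy"), (ax_loss, "loss")):
        epochs = range(1, len(history_dict[metric]) + 1)  # 1-based epoch numbers
        ax.plot(epochs, history_dict[metric], label="training")
        ax.plot(epochs, history_dict["val_" + metric], label="validation")
        ax.set_xlabel("epoch")
        ax.set_ylabel(metric)
        ax.legend()
    fig.tight_layout()
    return fig
```

In a notebook, calling plot_history(history.history) after training would draw both panels; the point where the validation curve flattens while the training curve keeps improving is the one to look for.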




Training Results

After about 7 epochs the model reaches the point of diminishing returns, with validation accuracy peaking at about 80%.

The graph shows that training accuracy, by contrast, continued to rise to a maximum of about 92%. The widening gap between the two curves suggests the model was overfitting in the later epochs.
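
The same reading can be taken from the recorded history programmatically. The sketch below finds the epoch with the best validation accuracy and the train/validation gap at the end of training; the numbers in demo are made up for illustration and are not this project's training log.

```python
def summarize_history(history_dict):
    """Return (best_epoch, best_val_accuracy, final_train_val_gap); epochs are 1-based."""
    val_acc = history_dict["val_accuracy"]
    train_acc = history_dict["accuracy"]
    best_index = max(range(len(val_acc)), key=val_acc.__getitem__)
    gap = train_acc[-1] - val_acc[-1]
    return best_index + 1, val_acc[best_index], gap


# Made-up numbers for illustration only
demo = {
    "accuracy":     [0.70, 0.76, 0.80, 0.85, 0.90],
    "val_accuracy": [0.72, 0.77, 0.80, 0.79, 0.78],
}
best_epoch, best_val, gap = summarize_history(demo)
```

A large final gap together with an early best epoch is the usual cue to stop training sooner, for example with Keras's EarlyStopping callback monitoring validation loss or accuracy.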

Training Results graph


Testing and Results

Testing the Model

Applying the Model to New Data

Finally, I tested the model’s real-world applicability by predicting attractiveness scores for new images, showing how to preprocess and predict:

import numpy as np
import tensorflow as tf

image_pre = tf.keras.preprocessing.image.load_img(image_path, target_size=(64, 64))
image = tf.keras.preprocessing.image.img_to_array(image_pre)
image = np.expand_dims(image, axis=0)  # add a batch dimension: (1, 64, 64, 3)
score = model.predict(image)[0][0]     # sigmoid output between 0 and 1
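
Two details of this step are easy to miss: the model expects a batch, so a single image needs an extra leading dimension, and the raw sigmoid output is a number between 0 and 1 that still has to be turned into a percentage or a class. The sketch below illustrates both with a blank stand-in image; the helper describe_score and the 0.5 threshold are assumptions for illustration.

```python
import numpy as np

# Stand-in for a loaded 64x64 RGB image; real pixels would come from img_to_array
image = np.zeros((64, 64, 3), dtype="float32")
batch = np.expand_dims(image, axis=0)  # shape becomes (1, 64, 64, 3)


def describe_score(score, threshold=0.5):
    """Format a sigmoid output as a percentage and a 0/1 class at the given threshold."""
    label = int(score >= threshold)
    return f"{score * 100:.2f}% -> class {label}"


summary = describe_score(0.2)
```

A score of 0.2 therefore reads as 20% and falls in class 0 under a 0.5 threshold.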



The model produced some striking results when I ran a few sample images. The common pattern in my tests was that images of white people received higher attractiveness scores than images of Black people. For example, the model assigned a score of 20% to the woman in Image3 (a white woman, smiling) and an extremely low 0.07% to the woman in Image4 (a Black woman, not smiling).

Is the model wrong? No. It is doing exactly what it was trained to do, and the likely cause is bias in the training dataset. The labels came from a human annotator who marked each image as attractive or not. One explanation is that the annotator genuinely rated white faces as more attractive. Another is a lack of representation in the training dataset, which would leave the model struggling to classify under-represented faces correctly. A third is "noise" in the test images, such as the black banner in Image3 or the grey "Getty Images" banner in Image4. Any of these factors could have produced these "unexpected" results.

This raises an important issue in the use of AI technologies: bias like this needs to be identified and controlled early on, especially now that AI tools are increasingly popular and accessible.
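
One simple way to catch this kind of skew early is to compare average predicted scores across groups on a held-out set. The sketch below is a minimal illustration: the group tags are hypothetical (CelebA does not ship such annotations, so they would have to come from a separately curated evaluation set), and the scores are invented.

```python
from collections import defaultdict


def mean_score_by_group(scores, groups):
    """Average predicted score for each group tag."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for score, group in zip(scores, groups):
        totals[group] += score
        counts[group] += 1
    return {group: totals[group] / counts[group] for group in totals}


# Hypothetical scores and group tags, invented for illustration
demo_scores = [0.8, 0.6, 0.2, 0.4]
demo_groups = ["A", "A", "B", "B"]
by_group = mean_score_by_group(demo_scores, demo_groups)
score_gap = max(by_group.values()) - min(by_group.values())
```

A large gap between groups does not by itself say where the bias comes from, but it flags that the labels, the data balance, or the test images deserve a closer look before the model is used anywhere.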

Here is an article expanding upon this issue: 




In the quest to blend artificial intelligence with everyday human judgments, this project dives into a fascinating yet subjective realm: assessing human attractiveness. Leveraging the expansive CelebA dataset, which offers over 200,000 celebrity images annotated with various attributes, this study employs deep learning techniques to predict facial attractiveness, a trait often assessed instantly by humans.

Project Dates
January 2024 - March 2024
Project Type
Console Application
Project Category
