Published in Vol 3 (2024)

Augmenting Telepostpartum Care With Vision-Based Detection of Breastfeeding-Related Conditions: Algorithm Development and Validation


Original Paper

1Department of Electrical and Computer Engineering, University of California, San Diego, La Jolla, CA, United States

2Department of Computer Science and Engineering, University of California, San Diego, La Jolla, CA, United States

3Division of Extended Studies, University of California, San Diego, La Jolla, CA, United States

Corresponding Author:

Jessica De Souza, MSc

Department of Electrical and Computer Engineering

University of California, San Diego

9500 Gilman Drive

La Jolla, CA, 92093

United States

Phone: 1 (858) 534 7013


Background: Breastfeeding benefits both the mother and infant and is a topic of attention in public health. After childbirth, untreated medical conditions or lack of support lead many mothers to discontinue breastfeeding. For instance, nipple damage and mastitis affect 80% and 20% of US mothers, respectively. Lactation consultants (LCs) help mothers with breastfeeding, providing in-person, remote, and hybrid lactation support. LCs guide, encourage, and find ways for mothers to have a better experience breastfeeding. Current telehealth services help mothers seek LCs for breastfeeding support, where images help them identify and address many issues. Due to the disproportional ratio of LCs and mothers in need, these professionals are often overloaded and burned out.

Objective: This study aims to investigate the effectiveness of 5 distinct convolutional neural networks in detecting healthy lactating breasts and 6 breastfeeding-related issues by only using red, green, and blue images. Our goal was to assess the applicability of this algorithm as an auxiliary resource for LCs to identify painful breast conditions quickly, better manage their patients through triage, respond promptly to patient needs, and enhance the overall experience and care for breastfeeding mothers.

Methods: We evaluated the potential for 5 classification models to detect breastfeeding-related conditions using 1078 breast and nipple images gathered from web-based and physical educational resources. We used the convolutional neural networks Resnet50, Visual Geometry Group model with 16 layers (VGG16), InceptionV3, EfficientNetV2, and DenseNet169 to classify the images across 7 classes: healthy, abscess, mastitis, nipple blebs, dermatosis, engorgement, and nipple damage by improper feeding or misuse of breast pumps. We also evaluated the models’ ability to distinguish between healthy and unhealthy images. We present an analysis of the classification challenges, identifying image traits that may confound the detection model.

Results: The best model achieves an average area under the receiver operating characteristic curve of 0.93 for all conditions after data augmentation for multiclass classification. For binary classification, we achieved, with the best model, an average area under the curve of 0.96 for all conditions after data augmentation. Several factors contributed to the misclassification of images, including similar visual features in the conditions that precede other conditions (such as the mastitis spectrum disorder), partially covered breasts or nipples, and images depicting multiple conditions in the same breast.

Conclusions: This vision-based automated detection technique offers an opportunity to enhance postpartum care for mothers and can potentially help alleviate the workload of LCs by expediting decision-making processes.

JMIR AI 2024;3:e54798




The benefits of breastfeeding for both the mother and baby, such as lower gastrointestinal infections in the child, more rapid maternal weight normalization after birth, and prolonged amenorrhea for the mother, are just a few examples of why physicians recommend breastfeeding for at least 6 months [1-5]. Breastfeeding rates are on the rise in the United States, with 83.2% of newborn infants being breastfed in 2019, thanks to increased education and promotion of its benefits [6]. Despite the compelling evidence, many families struggle to continue breastfeeding. Although 95% of mothers initiate breastfeeding, the continuation rate drops to <41% and <19% for exclusive breastfeeding at 3 and 6 months, respectively [7]. Parents who breastfeed may face issues, such as low milk supply, fatigue, medical problems, difficulties with feeding techniques or pain, and lack of social support [8-10].

Lactation consultant (LC) professionals specialize in breastfeeding, milk supply, breast and nipple issues, breast milk management, and prenatal education. LCs ensure a mother’s smooth and painless transition into breastfeeding and increase the possibility of continued breastfeeding through 6 months or longer [11,12]. The availability of international board-certified LCs (IBCLCs) globally is limited. In 2021, there were 3.6 million births in the United States and only 18,500 LCs with IBCLC certification, a rate of 194 babies per LC a year. In low- and middle-income countries such as Brazil, for instance, there were 2.6 million births in the same year but only 154 certified LCs, resulting in a rate of 16,883 babies per LC per year. The high demand for LCs, coupled with geographic and financial barriers, underscores the need for better tools to improve access to specialized lactation services, especially in less urbanized areas where such resources are scarce, leading to decreased breastfeeding support [13-20].

Another issue is professional availability itself, as LCs often combine their practice with midwife nursing, splitting their time between prenatal visits, attending births, lactation consultations, and managing their patients, which can lead to professional exhaustion, burnout, and emotional stress [21-23]. Moreover, the predominantly independent practice of LCs outside the United States, without the support of clinics with sophisticated patient management and triage systems, further complicates their time management and patient organization [22,24].

Supporting LCs Through Tele-Lactation Services

Tele-lactation services facilitate text, audio, and video communication. This enables LCs to consult with patients from any location, reduces travel time, helps balance their workload, increases their availability to receive new patients, and provides quicker responses to their patients [20]. Complementing tele-lactation services, patient triaging using information systems allows LCs to prioritize in-person visits for severe cases requiring physical assessment, while less critical cases can be handled remotely [25,26]. Prior research suggests that LCs would benefit from time-saving tools that deliver patient information efficiently and shorten prolonged interactions, helping to alleviate the burden on professionals with heavy patient loads [22,27]. As LCs often follow up with their patients for weeks after birth to ensure positive breastfeeding outcomes, an easy-to-access system to monitor patient progress is essential for effective patient triage, facilitating consultation scheduling, holding remote consultations, or providing reassurance. However, the remote consultation systems currently available to LCs lack patient triaging tools and are not time efficient, indicating an area in need of development.

Our work proposes a novel method for the identification of breastfeeding-related conditions using convolutional neural networks (CNNs). We evaluated a self-curated data set containing 7 different breastfeeding conditions on 5 distinct CNN models. The assessment of breast conditions is vital as pain and discomfort experienced during breastfeeding is a major barrier faced by parents who want to continue breastfeeding their child. About 80% of mothers are estimated to experience nipple pain and fissures, while 20% are estimated to experience mastitis [28,29]. Our pipeline incorporates automatic detection of visually discernible painful breastfeeding-related conditions, such as nipple cracks and fissures related to poor latching and positioning; skin conditions, such as dermatitis, eczema, thrush, or herpes; and risk of mastitis spectrum issues, such as engorgement, abscess, and nipple blebs. The CNN model is used for automatic detection of breast conditions, which can benefit the triaging of remote lactation patients for faster and more efficient patient response based on their conditions.

Our work evaluated 5 distinct CNN models’ ability to differentiate between healthy and various unhealthy breast conditions (including breast abscesses, dermatoses, engorgement, mastitis, nipple blebs, and nipple damage) by performing both multiclass and binary evaluations on 1078 breast images. We evaluated the model’s performance using the data set with and without data augmentation techniques. The data were divided into training, validation, and testing sets, using k-fold cross-validation for robustness. Performance evaluation on the best model includes an average area under the curve (AUC) of 0.93 for all conditions after data augmentation and precise detection of healthy breasts (precision of 84.4%) and unhealthy breasts (average precision of 66%, SD 12.8%) for 6 conditions. For binary classification, we achieved, with the best model, an average AUC of 0.96 for all conditions after data augmentation and precise detection of healthy breasts (precision of 93.8%) and unhealthy breasts (precision of 83.5%). The breast images have been curated from perinatal education resources such as images and video recordings under various lighting, environments, and image-taking conditions, where we examined potential issues around how the images are taken and their impacts on performance. Finally, we provide insights into future designs of user interfaces and guidance needed for the proper application of the system.

Related Work

Lactating Care Pipeline: In-Person, Remote, and Hybrid

Health care providers introduce breastfeeding options to expectant mothers, including educational materials in print or web-based, during prenatal care. The initiation of breastfeeding after delivery is timed according to the type of birth. Many hospitals worldwide follow the United Nations Children’s Fund and World Health Organization baby-friendly initiative, prioritizing maternal and infant health and supporting mothers facing challenges [30,31]. After a child’s birth, families often seek breastfeeding support from LCs, who typically offer hands-on consultations from birth until support is no longer required [18]. They conduct visual and physical evaluations of both mother and baby, assessing the baby’s internal mouth structure, breast and nipple anatomy, and milk supply and ensuring proper attachment or repositioning of the baby to prevent nipple fissures. LCs may also introduce laser therapy as a treatment option for damaged nipples from breast pump misuse or issues with baby attachment [8]. The immersive approach of LCs is crucial for providing personalized and effective lactation support to mothers and infants.

Remote Lactation Care

The widespread adoption of smartphone communication apps, particularly WhatsApp (Meta Platforms, Inc), has transformed public health facilities, including family clinics in limited-income countries, offering various patient services such as appointment scheduling, health guidance, and vaccine campaign notifications [32-34]. WhatsApp has become a popular communication tool between LCs and patients, facilitating breastfeeding education and family support during the neonatal period [35,36]. During the COVID-19 pandemic, LCs transitioned to telehealth consultations using established smartphone apps such as WhatsApp, Instagram (Meta Platforms, Inc), and Facebook (Meta Platforms, Inc). LCs adapted their approach to maintain quality care despite resource limitations in remote consultations [37,38]. Similar to other practices requiring physical evaluation, LCs reimagined their methods when shifting from in-person to remote consultations, using communication and social media apps to reach and educate parents while having broader visibility in their community [37,39].

Remote lactation care presents challenges, including limited visibility during video calls, communication difficulties, and technical issues [18,40,41]. Despite challenges, remote care offers benefits, reducing the mother’s sense of isolation, enabling faster feedback, and promoting effective communication and patient engagement for improved independent learning [17,18,22]. These benefits positively impact mothers’ intentions in exclusive breastfeeding for up to 6 months and reduce the risk of breastfeeding cessation at 3 months by 25% [42].

Hybrid Lactation Care

Previous research showed that fully remote consultations work well for cases where geographic distance, transportation issues, or patient disease prevent in-person meetings between patients and providers. LCs often conduct remote consultations from their workplaces, including personal offices, clinics, or hospitals, especially when they are also midwives with on-call responsibilities [37]. They provide consultations for patients before birth, after birth, and in emergency cases where the mother is facing breastfeeding challenges [22]. Depending on the nature of the consultation, in-person or remote visits are chosen to meet the patient’s specific needs. In summary, remote care complements in-person care, being a valuable resource for mothers seeking guidance, reassurance, and confidence, particularly in the absence of a supportive home environment [38].

LCs, especially those who are also midwives, have limited time availability due to demanding schedules and receiving numerous remote messages from patients daily, some requiring higher priority attention [22,43]. Manually sorting through patient messages to determine priority can be time consuming and inconvenient for mothers with urgent needs. Our work proposes a computer vision–based system to triage breast conditions, facilitating telehealth and assisting LCs in identifying patients who require immediate responses in remote settings.

Issues Associated With Breastfeeding

Breastfeeding pain is one of the reasons associated with breastfeeding cessation, which can be caused by issues such as poor attachment of the baby onto the breast, physical conditions of the mother or baby, misuse of breast pumps, oversupply of breast milk, and even environmental conditions [44]. These issues, if left untreated in the first few days after birth, can persist for weeks and pose a threat to breastfeeding continuity beyond 6 months. Some conditions can be fully mitigated when the mother receives orientation and education on the topic. In contrast, other conditions can be alleviated and managed for a better experience for the mother in the case of physical conditions, including nipple physiology, baby tongue-tie, jaw clenching, and excessive milk supply [28,45].

This study concentrates on conditions leading to breastfeeding pain and potential interruption. The first condition is the mastitis spectrum disorder, which about 20% of breastfeeding mothers may experience. This disorder starts with the overproduction of milk and breast engorgement, which can obstruct milk passage in the form of galactoceles and nipple blebs. When not properly treated, a milk bleb or galactocele can evolve into phlegmon or into bacterial or inflammatory mastitis, which may require medication and, if it becomes an abscess, a medical procedure to drain the inflammation fluids from the breast [46,47]. Conditions associated with mastitis are painful and include symptoms such as redness in the breast, influenza-like symptoms, hardened skin surface in the location of the milk blockage, formation of blisters in the nipple, and even blood in the milk [29,48].

The second condition is nipple damage caused by improper latching and positioning of the infant, excessive pressure from breast pumping devices, infant tongue-tie or palate abnormality, the infant’s arrhythmic milk expression, and even infant biting or jaw clenching [9,44]. Given these causes, 80% of mothers are expected to face some level of nipple issues during breastfeeding, which, if not treated, may cause an average of 35% of these mothers to cease breastfeeding before 1 month [28,45]. Nipple damage is painful and may be visible or invisible. When visible, it can present features at the skin surface, such as fissures, cracks, pus, blood, scarring, or crusting. Some skin dermatoses, such as thrush, herpes, eczema, and psoriasis, are also responsible for discomfort and pain during breastfeeding. These conditions can be caused by friction, weather and temperature changes, and the use of medications or ingredients that make the skin prone to these disorders. Dermatoses can present on both the breast and nipple and can have visible features such as scarring, crusting formations, redness, and thickened skin regions [44]. Our research incorporates breast and nipple images from the following disorders: breast abscess, dermatoses, breast engorgement, inflammatory and bacterial mastitis, nipple blebs, and nipple damage.

Current Research Supporting Lactating Mothers

Extensive literature has highlighted the efficacy of deep learning in assessing breast images, helping detect malignant and benign breast tumors for both lactating and nonlactating women [49-54]. This has helped improve the precision of breast ultrasound and mammogram examinations, involving the use of medical imaging previously taken in medical facilities to enhance the evaluation of breast-related illnesses and allow better accuracy in diagnosis for medical personnel [53]. However, these studies relied on images gathered from specialized equipment found only in health care facilities. They did not extend their evaluation to external body images, focusing primarily on aiding health care practitioners in diagnosis. Our work diverges from previous contributions by primarily focusing on using external breast images gathered from personal devices, such as smartphones or cameras from lactating patients, to identify breastfeeding-related conditions in the early stages and evaluate the necessity of further examination and medical intervention.

In the context of breastfeeding disorders, there is a lack of research regarding using deep learning algorithms to evaluate real breast images and identify abnormalities such as mastitis, nipple fissures, dermatoses, and abscesses. To illustrate, literature addressing the early prediction of mastitis mainly originates from agricultural studies, in which the risk of mastitis is constantly assessed to prevent a reduction in animal milk production, which significantly impacts the dairy industry [55,56]. This shows a need for research to adapt these technologies for detecting and preventing breastfeeding disorders in humans. Our study is crucial in settings where access to medical professionals and LCs is limited, as it can help prevent breastfeeding cessation, promote maternal-infant bonding, and improve the overall health and well-being of mothers and infants.

In this section, we detail the data set collection process, including inclusion and exclusion criteria, data sources, and the characteristics of the images. The section also discusses the artificial intelligence (AI) algorithms used in the study, including the models and their training and validation process, and performance metrics used during evaluation.

Ethical Considerations

This study was approved by the University of California, San Diego Institutional Review Board (801,904). We did not incorporate any personally identifiable data from the participants into this research.

Data Set Collection


This study used a breast image data set (refer to Textbox 1 and Table 1), a compilation of physical and digital images specifically curated to train and validate our deep learning model’s ability to distinguish between healthy and unhealthy lactating breasts. The data set includes images categorized according to their respective conditions: healthy lactating breast; nipple injuries due to various causes; nipple blebs due to plugged ducts; breast or nipple with signs of dermatoses; and breasts with engorgement, mastitis, or abscess.

Textbox 1. Data set description.


  • Data set size
    • 393.7 MB (each image: minimum 0.015, average 0.360, and maximum 3.575 MB)
  • Dimensions (pixels)
    • Width (minimum 68, average 606, and maximum 2448)
    • Height (minimum 68, average 607, and maximum 2448)
  • Number of images
    • 1078
  • Number of classes
    • 7
  • Number of unique subjects
    • 586
  • Number of images per class
    • Abscess: 115
    • Dermatoses: 123
    • Engorgement: 63
    • Mastitis: 180
    • Nipple bleb: 82
    • Nipple damage: 197
    • Healthy: 318
  • Visual features per class
    • Abscess: swelling and redness, area with palpable fluid collection, and pus
    • Dermatoses: rash, discoloration, flaky skin, uneven skin tone, crusting, and redness
    • Engorgement: swelling, redness, skin stretched and shiny, and enlarged nipple
    • Mastitis: red patches on breast or nipple, swelling, and pus or blood discharge
    • Nipple bleb: small white or yellow bumps on nipple or areola, similar to a blister
    • Nipple damage: nipple swelling, redness, peeling or flaking skin, bleeding, and shape differences
    • Healthy: regular breast and nipple color, may have visible veins
  • Number of images per source
    • Physical: 178 (eg, books, magazines, and articles)
    • Physician websites: 366
    • YouTube: 65 (eg, educational channels on women’s health)
    • Other: 469 (eg, received by lactation consultants; international board-certified lactation consultant’s Instagram, Google Images, and Flickr; support groups mediated by lactation consultants on social media; and other educational websites)
Table 1. Number of images per skin tone per class (FSTa [57]).
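Class | FST I | FST II | FST III | FST IV | FST V | FST VI | Not classified (footnote b)
Nipple bleb | 9 | 16 | 18 | 8 | 6 | 3 | 22
Nipple damage | 40 | 59 | 22 | 15 | 11 | 5 | 45
Total per FST | 203 | 312 | 269 | 106 | 67 | 44 | 77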

aFST: Fitzpatrick skin type.

bNot classified due to the absence of breast tissue around the nipple in the image.

Data Inclusion and Exclusion Criteria

To be included in the data set, images had to meet the following criteria: (1) the image must be in red, green, and blue (RGB) format, either as PNG or JPEG; (2) it must visually show at least 1 of the 7 conditions; (3) the breast or nipple should be visible; (4) the image should be hosted by a trustworthy source (ie, medical professionals such as physicians, midwife nurses, and IBCLCs) and must carry a word or description identifying its condition among the 7 classes, which was used as its label; and (5) the visual condition present in the image and the label describing the condition should match. Images were excluded from the data set if (1) the breast or nipple was from a nonlactating female patient; (2) the condition described on the label and the visual features of the image did not match; (3) the breast or nipple was not visible in the image; or (4) the image did not have any label describing it. A board-certified nurse practitioner (ie, Certified Nurse Practitioner, Advanced Registered Nurse Practitioner, or IBCLC) with >15 years of experience performed a final review of the data set to ensure that images and labels had no discrepancies.

Data Source

We collected images from diverse sources such as breastfeeding-related books, articles, web-based blogs for mothers and physicians, YouTube videos from educative organizations, and social media platforms (eg, Instagram, Facebook, and Twitter) of certified health care providers who would have educative resources for mothers. To ensure diversity in geographic and racial representation, we conducted image searches using multiple languages (eg, English, Portuguese, Spanish, French, and Chinese) and used search engines adjusted for other countries.

The images were obtained from a diverse group of female patients with several skin colors and breast and nipple sizes, with unstandardized image sizes, orientations, backgrounds, and light sources. In total, the data set consisted of 1078 images, with 318 images of healthy breasts, 115 images of breast abscesses, 123 images of dermatoses, 63 images of breast engorgement, 180 images of mastitis, 82 images of nipple blebs, and 197 images of nipple damage. As shown in Figure 1 and Table 1, a healthy lactating breast presented a uniform color, was free of redness, and had no signs of discharge. Nipples were expected to exhibit a variety of shapes, including flat, protruded, or inverted, and to vary in size. In engorgement, images showed breast and nipple swelling, skin stretched and shiny, and some light redness due to high milk production. For nipple blebs or nipple damage, signs of laceration, blood, blisters, and redness were expected. Mastitis showed swelling, redness, and discharge of pus or blood in the nipple. Abscess shared similarities with mastitis but involved worsened redness and pus in the infected region and may display signs of rupture. Finally, dermatosis images contained signs of skin rash, breast or nipple uneven skin tone, and crusting.

Figure 1. Example images from the testing set that were correctly classified and show features of each breastfeeding-related condition: (A) abscess, (B) dermatoses, (C) engorgement, (D) mastitis, (E) nipple bleb, (F) nipple damage, and (G) healthy.

AI Algorithms

We examined the performance of 5 CNNs commonly used in computer vision problems: Visual Geometry Group model with 16 layers (VGG16) [58], Resnet50 [59], InceptionV3 [60], EfficientNetV2 [61], and DenseNet169 [62]. All models were built with the PyTorch library for image classification. Every layer was frozen except the last, which was replaced with a fully connected layer sized to the number of classes: 2 for binary classification and 7 for the multiclass task. All models were trained for 100 epochs using the AdamW optimizer with a learning rate of 3e-4, weight decay of 0.1, and batch size of 20. We chose 100 epochs because accuracy had converged by that point and no longer increased or decreased. For the loss functions, we applied Binary Cross-Entropy with Logits Loss for the binary classification task and Cross-Entropy Loss for the multiclass task, both with class weights that adjust for class imbalance by penalizing misclassifications in less represented classes proportionally more. The models were evaluated using stratified k-fold cross-validation with 10 folds. To ensure the robustness of the cross-validation process, we reset any learned parameters by initializing the models from scratch at the beginning of each fold. Instead of training the models directly on the full images, we performed feature extraction to optimize the training process (detailed in the Feature Extraction section). We compared the performance of the 5 models on the same data and kept the hyperparameters the same: learning rate, weight decay, batch size, and number of epochs.
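The class weighting described above can be illustrated with a minimal sketch. The exact weighting formula is not stated in this paper, so the inverse-frequency scheme below (weight = N / (K × n_c)) is an assumption, and it is applied to the whole-data-set counts from Textbox 1 rather than to per-fold training counts. The function names are hypothetical, and the loss is written in plain Python rather than with the PyTorch loss classes used in the study.

```python
import math

# Images per class, from Textbox 1 (whole data set, not a training fold).
CLASS_COUNTS = {
    "abscess": 115,
    "dermatoses": 123,
    "engorgement": 63,
    "mastitis": 180,
    "nipple_bleb": 82,
    "nipple_damage": 197,
    "healthy": 318,
}


def inverse_frequency_weights(counts):
    """Assumed scheme: weight_c = N / (K * n_c), so rarer classes weigh more."""
    total = sum(counts.values())
    k = len(counts)
    return {name: total / (k * n) for name, n in counts.items()}


def weighted_cross_entropy(logits, target, weights):
    """Class-weighted cross-entropy for a single sample.

    logits: dict mapping class name -> raw score; target: true class name.
    """
    peak = max(logits.values())  # subtract the max for numerical stability
    log_sum = peak + math.log(sum(math.exp(v - peak) for v in logits.values()))
    log_prob = logits[target] - log_sum
    return -weights[target] * log_prob


weights = inverse_frequency_weights(CLASS_COUNTS)
```

Under this assumed scheme, a misclassified engorgement image (63 samples) contributes roughly 5 times the loss of a misclassified healthy image (318 samples), which is the kind of proportional penalty the text describes.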

Data Set Preprocessing

Before using the images as inputs for the deep learning models, the images were manually cropped to ensure they were deidentified and had no irrelevant content, such as unrelated body areas, clothes, jewelry, identifiable tattoos, or backgrounds, enhancing the model’s accuracy and performance. The images were cropped in a 1:1 ratio to prevent image flattening or warping during resizing and loss of important features. Most images have breast and nipple tissue concentrated in the center of the image, thereby focusing the model’s evaluation on the most relevant areas. Our image preprocessing guidelines followed similar works in dermatology for AI disease detection and telehealth applications [63-65], which aim to objectively show the area of interest for optimized detection and reduce risks of poorly triaged images.
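A small calculation shows why the 1:1 crop prevents warping. The sketch below is illustrative only: the study cropped images manually, whereas the hypothetical `square_crop_box` helper simply takes the largest centered square. The second helper compares the horizontal and vertical scale factors applied when resizing to 224×224 pixels, which are equal only for a square input.

```python
def square_crop_box(width, height):
    """Return (left, top, right, bottom) of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return left, top, left + side, top + side


def resize_scale_factors(width, height, target=224):
    """Horizontal and vertical scale factors when resizing to target x target."""
    return target / width, target / height


# A hypothetical 800x600 landscape photo.
box = square_crop_box(800, 600)
uncropped_scales = resize_scale_factors(800, 600)
cropped_scales = resize_scale_factors(box[2] - box[0], box[3] - box[1])
```

Without the crop, the two scale factors differ, so the image is squeezed along one axis; after the crop, they match and the proportions of the breast and nipple are preserved.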

After cropping the images in a 1:1 ratio and before entering the deep learning pipeline, we applied some standard transformations in the data, starting with image resizing. In this paper, we trained, validated, and tested our data set using 5 different models. Notably, 4 of the chosen models (VGG16, Resnet50, EfficientNetV2, and DenseNet169) specified the input images to be resized to 224×224 pixels, and the InceptionV3 model required input images to be resized to 299×299 pixels. Therefore, we proceeded with the image resizing according to each model’s requirements. The last transformation step incorporates normalization of the images, a procedure where the pixel intensity values are standardized across the data set. To help the models generalize better for our data set, we calculated the mean and SD of all images in the data set to use in the normalization process instead of using the ImageNet data set pretrained parameters, inspired by the previous work involving skin disease classification [66].
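The data set-specific normalization can be sketched as follows. This is a plain-Python illustration under stated assumptions, not the study's code: images are represented as lists of (r, g, b) tuples already scaled to [0, 1], the population SD is used, and the helper names are hypothetical.

```python
import math


def channel_stats(images):
    """Per-channel mean and SD over every pixel of every image.

    images: list of images; each image is a list of (r, g, b) tuples in [0, 1].
    """
    sums = [0.0, 0.0, 0.0]
    sq_sums = [0.0, 0.0, 0.0]
    n = 0
    for image in images:
        for pixel in image:
            for c in range(3):
                sums[c] += pixel[c]
                sq_sums[c] += pixel[c] ** 2
            n += 1
    means = [s / n for s in sums]
    # Population SD: sqrt(E[x^2] - E[x]^2), clamped at 0 against rounding.
    stds = [math.sqrt(max(sq / n - m * m, 0.0)) for sq, m in zip(sq_sums, means)]
    return means, stds


def normalize(image, means, stds):
    """Standardize each channel: (value - mean) / SD."""
    return [
        tuple((pixel[c] - means[c]) / stds[c] for c in range(3))
        for pixel in image
    ]


# Two tiny hypothetical "images" of two pixels each.
images = [
    [(0.2, 0.4, 0.6), (0.4, 0.6, 0.8)],
    [(0.6, 0.2, 0.4), (0.8, 0.4, 0.2)],
]
means, stds = channel_stats(images)
normalized = [normalize(img, means, stds) for img in images]
```

After this step, each channel has zero mean and unit SD across the data set, which is the property that the ImageNet statistics would only approximate for images of this kind.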

Data Set Augmentation

In the process of curating the data set, we recognized that the number of images per class was constrained, given the complexity of gathering images and variability in the clinical features of each class. We implemented data augmentation techniques to mitigate these limitations, reduce the risk of overfitting, and enrich the data set. These techniques artificially expanded the data set by generating realistic transformations of the existing images. We implemented the following 6 data augmentations that were previously used in data sets involving skin lesions [63,67]: center zoom, random rotation, brightness, shear, vertical flip, and horizontal flip. Samples of augmentation are shown in Figure 2. Before data augmentation, our data set consisted of 1078 images. After the augmentation, the data set consisted of 6468 images. The detailed number of samples before and after augmentation is shown in Table 2.
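Three of the augmentations named above are simple enough to sketch in plain Python. The example below is an illustration under stated assumptions: it operates on a tiny grayscale grid rather than an RGB photo, the brightness factor of 1.5 is hypothetical, and center zoom, rotation, and shear are omitted because they require interpolation.

```python
def horizontal_flip(image):
    """Mirror left-right: reverse each row."""
    return [row[::-1] for row in image]


def vertical_flip(image):
    """Mirror top-bottom: reverse the row order."""
    return image[::-1]


def adjust_brightness(image, factor):
    """Scale intensities by `factor`, clipping to the 0-255 range."""
    return [[min(255, max(0, round(v * factor))) for v in row] for row in image]


# A hypothetical 2x3 grayscale image (rows of pixel intensities).
image = [
    [10, 20, 30],
    [40, 50, 60],
]

augmented = [
    image,
    horizontal_flip(image),
    vertical_flip(image),
    adjust_brightness(image, 1.5),
]
```

Each transformation keeps the label of the source image, which is why augmentation multiplies the number of samples per class without changing the class proportions.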

We evaluated our data set before and after data augmentation. In the original data set, 1000 images were allocated for training and validation, split using stratified k-fold cross-validation [68] with 10 folds. In this process, 90% (900/1000) of the data are used for training and 10% (100/1000) for validation within each fold, as described in Figure 3. The stratified k-fold maintains the proportion of images in each class in both training and validation splits, making sure each fold is representative of the overall data set. The remaining 78 images were completely excluded from these folds and reserved exclusively for final testing to assess the model’s performance on unseen data. After augmenting the original data set, we expanded it to 6000 images for training and validation. Similarly, we increased our test set to 468 images to maintain consistency with the expanded training data, ensuring the model’s evaluation on unseen examples remains robust.
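The stratified splitting described above can be sketched in plain Python. This is a simplified illustration, not the implementation used in the study: the `stratified_k_fold` function, the seed, and the two-class label list are hypothetical, and each class is dealt round-robin starting at the first fold, so fold sizes can differ by a few samples when class sizes are not multiples of k.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=10, seed=0):
    """Yield (train_indices, val_indices) for each of k stratified folds."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class round-robin so every fold gets a near-equal share.
        for position, index in enumerate(indices):
            folds[position % k].append(index)

    for i in range(k):
        val = sorted(folds[i])
        train = sorted(idx for j, fold in enumerate(folds) if j != i for idx in fold)
        yield train, val


# Hypothetical labels: 1000 images, 700 of class "a" and 300 of class "b".
labels = ["a"] * 700 + ["b"] * 300
splits = list(stratified_k_fold(labels, k=10))
```

With 1000 samples and 10 folds, every split holds 900 training and 100 validation indices, and each validation fold keeps the 70:30 class ratio of the full list.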

Figure 2. Samples of augmented data: (A) original, (B) brightness, (C) center zoom, (D) horizontal flip, (E) rotation, (F) shear, and (G) vertical flip.
Table 2. Detailed number of samples in the data set.
Data set and classes | Train samples, n | Test samples, n | Train samples (augmented), n | Test samples (augmented), n
7-class data set
Nipple bleb | 75 | 7 | 450 | 42
Nipple damage | 188 | 9 | 1128 | 54

Binary data set



aUnhealthy class combines the classes abscess, dermatoses, mastitis, nipple bleb, and nipple damage, while the healthy class combines healthy and engorgement, all from the 7-class data set.

Figure 3. Graphical diagram of stratified k-fold cross-validation on a 7-class data set.

Feature Extraction

We performed feature extraction using 5 models pretrained on the ImageNet data set. This process helped to reduce the number of computational resources necessary for processing the data set by transforming images into numerical features, without losing relevant information. The models were set to evaluation mode, in which the feature maps are extracted from the final convolutional layers. These maps were then processed through adaptive pooling and flattened into 1D arrays. The extracted features were saved and used as input for the model classifiers.
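The pooling and flattening step can be illustrated with a minimal sketch. It assumes the adaptive pooling reduces each feature map to a single value (an output size of 1×1), which the source does not state; the function name is hypothetical. If the pipeline was built with PyTorch, the equivalent would be `torch.nn.AdaptiveAvgPool2d(1)` followed by a flatten.

```python
def adaptive_avg_pool_flatten(feature_maps):
    """Average each channel's H x W map to one value, giving a 1D feature vector.

    feature_maps is a list of channels, each a list of rows of floats.
    """
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_maps
    ]

# Two toy 2x2 feature maps (invented values) become a 2-element vector.
features = adaptive_avg_pool_flatten([[[1, 2], [3, 4]], [[0, 0], [0, 8]]])
print(features)  # [2.5, 2.0]
```

The resulting vector has one entry per channel regardless of the input image size, which is what makes the extracted features usable as fixed-length classifier inputs.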

Training and Evaluation

As previously mentioned in the AI Algorithms section, a total of 5 CNNs were trained on the data set. We proposed 4 tasks in this study, which evaluate the CNNs on the following data sets: (1) multiclass not augmented, (2) multiclass augmented, (3) binary not augmented, and (4) binary augmented. As described in Table 2, we performed 2 additional evaluations with a binary model to assess the models’ capacity to differentiate between healthy and unhealthy images. The unhealthy class consolidates 5 of the previous conditions: abscess, dermatoses, mastitis, nipple bleb, and nipple damage. The healthy class consolidates the original healthy and engorgement conditions. For this binary evaluation, we included engorgement images in the healthy class because engorgement is not inherently indicative of disease and often resolves without medical intervention. Furthermore, engorgement shares visual characteristics with healthy breasts and might not be distinguishable from them at an early, nonproblematic stage. All models underwent k-fold cross-validation, where we collected performance metrics from each fold and computed their average. We assessed the models’ performance for the multiclass and binary data sets using the same metrics: accuracy, precision, recall, F1-score, and the receiver operating characteristic AUC (ROC-AUC).
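The consolidation of the 7 classes into the binary data set can be written as a simple label mapping. The sketch below uses the class names from the text and the augmented test set sizes reported per class in the results; the function and variable names are illustrative only.

```python
UNHEALTHY = {"abscess", "dermatoses", "mastitis", "nipple bleb", "nipple damage"}
HEALTHY = {"healthy", "engorgement"}

def to_binary(label):
    """Map a 7-class label to the binary healthy/unhealthy label."""
    if label in UNHEALTHY:
        return "unhealthy"
    if label in HEALTHY:
        return "healthy"
    raise ValueError(f"unknown class: {label}")

# Augmented test set sizes per class, as reported in the results.
test_support = {
    "abscess": 42, "dermatoses": 48, "engorgement": 48, "mastitis": 54,
    "nipple bleb": 42, "nipple damage": 54, "healthy": 180,
}

binary_support = {"healthy": 0, "unhealthy": 0}
for label, count in test_support.items():
    binary_support[to_binary(label)] += count

print(binary_support)  # {'healthy': 228, 'unhealthy': 240}
```

The totals agree with the 240 unhealthy and 228 healthy test images used in the binary evaluation.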


We collected 1078 unique breast images from the web and physical resources, 1000 images as part of the training and validation set, and 78 images as part of the testing set. The augmented data set has 6000 images for training and validation and 468 images for testing. In the Multiclass Image Detection Evaluation section, we show evaluation results from the multiclass and binary data sets, which we evaluated before and after data augmentation. There was no hyperparameter tuning between each fold, and all models had the same optimizer, learning rate, weight decay, and batch size.

Multiclass Image Detection Evaluation

We evaluated 5 CNNs on their ability to distinguish between healthy breasts and 6 breastfeeding-related issues. Table 3 presents the aggregated evaluation metrics for each model, sorted by test accuracy. The precision, recall, F1-score, and overall ROC-AUC are reported as weighted averages to account for the class imbalance within the data sets, ensuring that each class contributes to the final metric in proportion to its prevalence. For each fold in the cross-validation, a separate test set was used to evaluate the model, and the metrics presented are the mean of these evaluations. The best-performing model was Resnet50, which achieved the best test accuracy, followed by VGG16 and EfficientNetV2 by a small margin. With a similar weighted average setting, in a one-versus-rest fashion, the models achieved an overall ROC-AUC of 0.934 for VGG16, 0.929 for Resnet50, 0.912 for InceptionV3, 0.908 for Densenet169, and 0.872 for EfficientNetV2. The detailed ROC-AUC per class for each model is shown in Figure 4.
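Support-weighted averaging can be sketched from a confusion matrix as follows. The function name and the 3-class matrix are invented for illustration and are not results from this study.

```python
def weighted_metrics(confusion):
    """Return support-weighted precision, recall, and F1-score.

    confusion[i][j] is the number of class-i samples predicted as class j.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    precision = recall = f1 = 0.0
    for c in range(n_classes):
        tp = confusion[c][c]
        support = sum(confusion[c])                      # true samples of class c
        predicted = sum(confusion[r][c] for r in range(n_classes))
        p = tp / predicted if predicted else 0.0
        r = tp / support if support else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        weight = support / total                         # class prevalence
        precision += weight * p
        recall += weight * r
        f1 += weight * f
    return precision, recall, f1

# Hypothetical 3-class confusion matrix with an imbalanced third class.
toy = [[8, 1, 1], [2, 6, 2], [0, 0, 30]]
precision, recall, f1 = weighted_metrics(toy)
accuracy = sum(toy[c][c] for c in range(3)) / 50
```

Under this weighting, recall reduces algebraically to overall accuracy, which is consistent with the identical accuracy and recall columns in Table 5.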

When applying data augmentation to the multiclass model, we provided a wider variety of images to help the model better generalize from the training data while not altering the original class distribution. In Figure 5 and Table 4, we show the results across the CNNs after data augmentation, where most of the models showed improved metrics, with Resnet50 being the leading model. The models achieved a ROC-AUC of 0.934 for Resnet50, 0.912 for VGG16, 0.909 for Densenet169, 0.898 for InceptionV3, and 0.893 for EfficientNetV2.
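The weighted one-versus-rest ROC-AUC reported for the multiclass models can be sketched as below. This is an illustrative implementation under stated assumptions: classes are encoded as integer indices, each per-class AUC is computed as the probability that a positive sample outranks a negative one, and the toy probabilities are invented.

```python
def binary_auc(scores, is_positive):
    """AUC as the chance that a positive outranks a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, is_positive) if y]
    neg = [s for s, y in zip(scores, is_positive) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def weighted_ovr_auc(probabilities, labels):
    """One-versus-rest AUC per class, averaged with class prevalence as weight.

    probabilities[i][c] is the predicted probability of class c for sample i.
    """
    n = len(labels)
    total = 0.0
    for c in sorted(set(labels)):
        scores = [row[c] for row in probabilities]
        is_pos = [y == c for y in labels]
        total += (sum(is_pos) / n) * binary_auc(scores, is_pos)
    return total

# Invented example: 4 samples, 3 classes.
labels = [0, 0, 1, 2]
probabilities = [
    [0.7, 0.2, 0.1],
    [0.2, 0.7, 0.1],
    [0.3, 0.6, 0.1],
    [0.1, 0.2, 0.7],
]
auc = weighted_ovr_auc(probabilities, labels)
```

The pairwise formulation is quadratic in the number of samples, which is acceptable for a test set of a few hundred images; library implementations use a sort-based method instead.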

To examine the best model, Resnet50 trained on the augmented data set, we looked more closely at its per-class metrics. Table 5 shows the results for 10-fold cross-validation, in which the model performed consistently across the iterations. Figure 6 presents the aggregated confusion matrix for the Resnet50 model, in which we consolidated the predictions across all 10 iterations applied to the augmented data set. We achieved this aggregation by taking the median predicted class for each instance over the multiple folds, yielding a single prediction that represents the consensus of the model’s behavior across the test set.
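The median-based aggregation can be sketched as follows. The source does not specify how classes were encoded or how ties between an even number of folds were resolved, so this sketch assumes integer class indices and uses `statistics.median_low`, which always returns one of the observed predictions. The function names and toy predictions are illustrative.

```python
from statistics import median_low

def consensus_predictions(fold_predictions):
    """Combine per-fold predictions into one prediction per test image.

    fold_predictions[f][i] is the class index that the model from fold f
    predicted for test image i.
    """
    per_image = zip(*fold_predictions)
    return [median_low(votes) for votes in per_image]

def confusion_matrix(true_labels, predicted, n_classes):
    """Count how often each true class (row) received each prediction (column)."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(true_labels, predicted):
        matrix[t][p] += 1
    return matrix

# Invented example: 3 folds, 3 test images.
folds = [[0, 1, 2], [0, 2, 2], [1, 1, 2]]
consensus = consensus_predictions(folds)
matrix = confusion_matrix([0, 1, 2], consensus, n_classes=3)
```

Because a median depends on the ordering of the class indices, the consensus can change if the classes are numbered differently.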

Out of the 468 images used in the testing set, the model correctly classified 341 images. The images correctly classified by category are as follows: abscess (24/42; accuracy=57%), dermatoses (43/48; accuracy=90%), engorgement (25/48; accuracy=52%), mastitis (26/54; accuracy=48%), nipple bleb (30/42; accuracy=71%), nipple damage (41/54; accuracy=76%), and healthy (152/180; accuracy=84%). The remaining misclassifications occurred mostly among visually similar conditions and conditions that can precede one another. Table 6 summarizes the selected model’s performance per class on the augmented test set. The model had difficulty with abscesses, which were misclassified as dermatoses and mastitis in 12% (5/42) and 19% (8/42) of the images, respectively. Breast engorgement was misclassified as mastitis and healthy breasts in 15% (7/48) and 33% (16/48) of the images, respectively. Mastitis was misclassified as abscess (12/54, 22%), nipple damage (9/54, 17%), and healthy breasts (6/54, 11%). About 21% (9/42) of the nipple bleb images were misclassified as nipple damage.
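The per-class and overall figures above follow directly from the reported counts, as this short check shows. The variable names are illustrative; the counts are those stated in the paragraph.

```python
# (correctly classified, total) per class on the augmented test set.
counts = {
    "abscess": (24, 42),
    "dermatoses": (43, 48),
    "engorgement": (25, 48),
    "mastitis": (26, 54),
    "nipple bleb": (30, 42),
    "nipple damage": (41, 54),
    "healthy": (152, 180),
}

per_class_accuracy = {name: c / n for name, (c, n) in counts.items()}
total_correct = sum(c for c, _ in counts.values())
total_images = sum(n for _, n in counts.values())
overall_accuracy = total_correct / total_images

for name, acc in per_class_accuracy.items():
    print(f"{name}: {acc:.0%}")
print(f"overall: {total_correct}/{total_images} = {overall_accuracy:.1%}")
```

The per-class values are each the share of a class's own images that were recognized, so they correspond to per-class recall; the overall value is 341/468, or about 72.9%.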

Table 3. Average evaluation metrics for the trained models on the not augmented data set (sorted based on performance).
Data set and model | Training accuracy | Validation accuracy | Test set metrics

7-class data set






aItalicized items represent the best metric.

bVGG16: Visual Geometry Group model with 16 layers.

Figure 4. Performance of the 5 convolutional neural networks on the 7-class data set: (A) Resnet50, (B) Visual Geometry Group model with 16 layers (VGG16), (C) EfficientNetV2, (D) InceptionV3, and (E) DenseNet169. AUC: area under the curve.
Figure 5. Performance of the 5 convolutional neural networks on the 7-class augmented data set: (A) Resnet50, (B) InceptionV3, (C) EfficientNetV2, (D) Visual Geometry Group model with 16 layers, (E) DenseNet169. AUC: area under the curve.
Table 4. Average evaluation metrics for the trained models on the augmented data set (sorted based on performance).
Data set and model | Training accuracy | Validation accuracy | Test set metrics

7-class augmented data set






aItalicized items represent the best metric.

bVGG16: Visual Geometry Group model with 16 layers.

Table 5. Results of 10-fold cross-validation for the augmented data set on Resnet50.
10-fold iterations | Accuracy | Precision | Recall | F1-score
Iteration 1 | 0.699 | 0.705 | 0.699 | 0.699
Iteration 2 | 0.714 | 0.715 | 0.714 | 0.712
Iteration 3 | 0.709 | 0.713 | 0.709 | 0.709
Iteration 4 | 0.729 | 0.730 | 0.729 | 0.727
Iteration 5 | 0.718 | 0.719 | 0.718 | 0.716
Iteration 6 | 0.733 | 0.734 | 0.733 | 0.730
Iteration 7 | 0.720 | 0.722 | 0.720 | 0.718
Iteration 8 | 0.707 | 0.711 | 0.707 | 0.706
Iteration 9 | 0.707 | 0.707 | 0.707 | 0.705
Iteration 10 | 0.720 | 0.715 | 0.720 | 0.713
Figure 6. Aggregated confusion matrix for the Resnet50 model for the augmented data set with example images from the augmented data set that were correctly and incorrectly classified across all folds.
Table 6. Summary of the detection results per class: accuracy, precision, recall, F1-score, and support (ie, number of samples per class) using the Resnet50 architecture.
Class | Accuracy | Precision | Recall | F1-score | Support
Nipple bleb | 0.714 | 0.857 | 0.714 | 0.779 | 54
Nipple damage | 0.759 | 0.683 | 0.759 | 0.719 | 54

Binary Image Detection Evaluation

To improve the accuracy of our clinical predictions and reduce the chances of incorrect results, we simplified our data set of 7 categories to just 2: healthy and unhealthy. The unhealthy category now includes 5 conditions: abscess, dermatoses, mastitis, nipple bleb, and nipple damage. The healthy category now includes the original healthy conditions and engorgement. Engorgement shares many visual similarities with healthy breast conditions, which made it difficult for the multiclass models to identify engorgement accurately. As presented previously, 33% (16/48) of the images of engorgement were classified as healthy. Table 7 presents the aggregated evaluation metrics for 5 models sorted based on the test accuracy.

The accuracy is reported as a balanced score to address class imbalance, ensuring that each class contributes equally to the final metric. Precision, recall, and F1-score are reported for the positive class, with the positive class label specified. For each fold in the cross-validation, we used a separate test set to evaluate the model, and the reported metrics are the average of these evaluations. The best-performing model was VGG16, which achieved the best test accuracy, followed by Resnet50 and InceptionV3. The models achieved an overall ROC-AUC of 0.977 for VGG16, 0.966 for Resnet50, 0.935 for InceptionV3, 0.921 for EfficientNetV2, and 0.910 for Densenet169. The detailed ROC-AUC for the not augmented and augmented data sets is shown in Figures 7A and 7B, respectively.
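Balanced accuracy and the positive-class metrics can be computed from a 2×2 confusion matrix, as in the sketch below. The counts are the aggregated Resnet50 result reported in this section for the augmented binary test set (228 of 240 unhealthy and 183 of 228 healthy images correctly classified). Treating unhealthy as the positive class is an assumption made for this illustration, and the function name is hypothetical.

```python
def binary_report(tp, fn, fp, tn):
    """Positive-class precision, recall, F1-score, and balanced accuracy."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)            # sensitivity of the positive class
    specificity = tn / (tn + fp)       # recall of the negative class
    f1 = 2 * precision * recall / (precision + recall)
    balanced_accuracy = (recall + specificity) / 2
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
        "balanced_accuracy": balanced_accuracy,
    }

# Unhealthy as positive: 228 detected, 12 missed; 45 healthy images flagged.
report = binary_report(tp=228, fn=12, fp=45, tn=183)
for name, value in report.items():
    print(f"{name}: {value:.3f}")
```

Balanced accuracy is the mean of the two per-class recalls, so a model that favors the larger class cannot score well on it by ignoring the smaller one.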

When applying data augmentation to the binary model, we provided a wider variety of images to help the model better generalize from the training data while not altering the original class distribution. In Table 8, we show the results across the CNNs after data augmentation, where most of the models showed improved metrics, with Resnet50 being the leading model. The models achieved a ROC-AUC of 0.962 for Resnet50, 0.956 for VGG16, 0.931 for EfficientNetV2, 0.929 for InceptionV3, and 0.915 for Densenet169.

To examine the best model, Resnet50 trained on the augmented data set, we looked more closely at its per-class metrics. Table 9 shows the results for 10-fold cross-validation, in which the model performed consistently across the iterations. Figure 8 presents the aggregated confusion matrix for the Resnet50 model, in which we consolidated the predictions across all 10 folds applied to the augmented data set. This aggregation was achieved by taking the median predicted class for each instance over the multiple folds, yielding a single prediction that represents the consensus of the model’s behavior across the test set.

Out of the 468 images used in the testing set, the model could correctly classify 411 images. The total images correctly classified by category are as follows: unhealthy (228/240; accuracy=95%, precision=83.5%, recall=95%, and F1-score=89%) and healthy (183/228; accuracy=80.3%, precision=94%, recall=80%, and F1-score=86.5%). The remaining images that were incorrectly classified presented redness (ie, for engorgement cases misclassified as unhealthy; 26/228), and incomplete images (ie, too close or nipple and breast not fully visible; 12/228).

Discussion

The issues that caused model misclassification included (1) wrong positioning of the breast in the image, (2) common visual features in the images between the classes, (3) a lack of variety among the images of specific cases in the data set, and (4) presence of an extraneous object in the frame. Figure 1 presents the correct prediction from the 7 classes.

Table 7. Average evaluation metrics for the trained models on the not augmented binary data set (sorted based on test accuracy).
Data set and model | Training accuracy | Validation accuracy | Test set metrics

Binary data set






aVGG16: Visual Geometry Group model with 16 layers.

bItalicized items represent the best metric.

Figure 7. Model performance on the binary data set: (A) without augmentation and (B) with augmentation. AUC: area under the curve; VGG16: Visual Geometry Group model with 16 layers.
Table 8. Average evaluation metrics for the trained models on the augmented binary data set (sorted based on performance).
Data set and model | Training accuracy | Validation accuracy | Test set metrics

Binary augmented data set






aItalicized items represent the best metric.

bVGG16: Visual Geometry Group model with 16 layers.

Table 9. Results of 10-fold cross-validation for the augmented binary data set on Resnet50.
Iteration of 10-fold | Accuracy | Precision | Recall | F1-score
Iteration 1 | 0.769 | 0.948 | 0.557 | 0.702
Iteration 2 | 0.761 | 0.960 | 0.531 | 0.684
Iteration 3 | 0.791 | 0.951 | 0.601 | 0.737
Iteration 4 | 0.782 | 0.970 | 0.570 | 0.718
Iteration 5 | 0.782 | 0.932 | 0.596 | 0.727
Iteration 6 | 0.778 | 0.943 | 0.579 | 0.717
Iteration 7 | 0.793 | 0.928 | 0.623 | 0.745
Iteration 8 | 0.767 | 0.961 | 0.544 | 0.695
Iteration 9 | 0.778 | 0.963 | 0.566 | 0.713
Iteration 10 | 0.771 | 0.969 | 0.548 | 0.700
Figure 8. Aggregated confusion matrix for the Resnet50 model for the augmented data set with example images from the augmented data set that were correctly and incorrectly classified across all folds.

Image Quality

When examining misclassification results, we found many image quality issues that likely contributed to the model’s diminished performance. In the example images from the testing set, Figures 9A-9C demonstrate good image samples that allow a complete evaluation of the breast’s condition and, therefore, can be used for the model’s evaluation. These images show the nipple fully or almost entirely, at a distance that allows diagnosis, and do not show the person’s surroundings or extraneous objects that the model might misinterpret. In Figures 9D and 9E, the main issue is that the nipple or breast is absent or only partially visible, making it difficult for the model to associate the images with a breast; even if there are signs of mastitis or engorgement in both images, the image is incomplete. In Figures 9F and 9G, the presence of hands or fingers, nail polish, and areas partially occluded by extraneous objects also affects the model’s interpretation, especially because we did not train the model with such extra components.

Other issues noted in the preprocessing phase affected the training and validation loss and caused false positive and false negative detections. For example, an image showing both breasts instead of one affects prediction accuracy, especially when one breast has a different condition from the other. The model did not have a large variety of images showing both breasts. Therefore, the training and test metrics improved once we separated the breasts into different images. In addition, we encountered classification problems with extracted images that show some background components, such as clothes surrounding the breast, breast pumps, or segments of the baby’s face or hands. These issues were corrected by cropping the image to the area of interest. If an object was too similar, such as a hand or a baby, we manually applied blur filters to the area and removed saturation so that only the breast was recognizable. Images with low resolution also affect the model’s performance, especially if they were originally smaller than the size determined by the data augmentation algorithm and were stretched later. Some misclassified images of this kind had their size manually corrected, after which the model classified them properly.

Figure 9. Example images from the testing set. (A), (B), and (C) High-quality images, with a full view of the breast and nipple. (D) Image in which the full breast does not appear, making it hard to classify which condition it belongs to. (E) Although the condition is clear and the full breast is visible, the nipple is pixelated in the photo, altering the original features in a way the model was not trained on. (F) and (G) Partially occluded breasts; the presence of nail polish in the color of the wound also impacts the model’s performance in those cases. The examples of low-quality data provide details about how to improve data acquisition for future development.

Visual Similarities Between Conditions

Conditions that share features and can be confused in diagnosis are mastitis, engorgement, and healthy breasts. Mastitis shows redness throughout the entire breast, with little variation in skin tone, and makes breasts appear fuller. Some of these features are commonly found in breast engorgement. However, engorged breasts show fewer signs of intensified redness, sometimes no redness at all, but may have visible veins and stretched nipples, making them visually similar to healthy ones. Due to the limited availability of images of breast engorgement for a separate class and the fact that engorgement is not necessarily an issue but can become mastitis when not alleviated, the model classified some engorged breasts as mastitis. When we included engorgement in the healthy class for the binary classification, some images were still misclassified as unhealthy, showing that transitional conditions should be followed more closely.

This highlights the need for (1) increasing the engorgement data set; (2) working closely with LCs to investigate the need to categorize conditions that can be a problem but indicate false positive cases of more serious issues; and (3) exploring the possibility of using these conditions that have higher errors as a base for following patient condition progression, where there is a transition between conditions for improving or worsening a patient’s situation.

Lack of Variety of Images Belonging to Specific Cases in the Data Set

In Figure 10A, the engorged breast has an inverted nipple whose center appears lighter, and the image was misclassified as a nipple bleb. Another source of misclassification is conditions that occur together, as in Figure 10B, which shows a breast abscess concentrated behind the nipple together with signs of nipple damage. This example was one of the very few occurrences of simultaneous conditions in the data set and reflects the reality that LCs have patients with similar cases, which calls for systems that (1) recognize multiple conditions or (2) select the most severe one for patient prioritization. Figure 10C is a case of granulomatous mastitis that was classified as nipple damage due to the presence of nipple scarring, highlighting how rarely such a specific case occurs in the data set.

In addition, Figures 10C and 10D show breasts in the conditions of engorgement and nipple damage, respectively. For Figure 10D, due to the proximity and nature of the nipple damage with a blood blister, the reflection on the dot suggests a nipple bleb, which led to the misclassification. These misclassified images with distinct features can also be difficult for humans to classify, mainly because some of these conditions rarely occur. Given the nature of the images and the lack of publicly available images covering the variety of cases across different skin tones and breast and nipple sizes, we believe that working with more images of rare disorders and providing more data augmentation alternatives can improve the model’s classification significantly. In addition, Figure 10D highlights the issue with image angle and proximity: the picture was taken too close to the breast, which increases the chance of misclassification.

Figure 10. Images incorrectly classified due to data set variety limitations: (A) an engorged breast with an inverted nipple classified as nipple bleb, (B) breast with an abscess but also has nipple damage, (C) breast with granulomatous mastitis classified as nipple damage, and (D) nipple damage classified as nipple bleb.


Our findings emphasize the need for improvement in several areas. As demonstrated in our evaluation, naturalistic images captured by users have several image quality issues that can impede the classification system from functioning properly. Thus, future systems must implement a user interface that guides parents in taking pictures to submit to the AI triaging system. This interface should provide basic guidelines: frame the breast such that no occlusion is present; do not use a finger to point out parts of interest; and ensure the camera captures the entire breast so that the nipple, areola, and breast tissue are all visible. Previous works explore the importance of implementing guidelines for image assessment of external diseases, such as in dermatology disease assessments, and its benefits for better professional evaluation and higher accuracy in diagnosing conditions [64,65,69]. Guidelines may be implemented as a set of easy instructions, and more advanced systems could provide immediate image quality feedback.

Moreover, our system only uses RGB images to triage breastfeeding-related conditions, not incorporating patient input regarding pain onset, location, symptoms, and pain levels. These are critical data for diagnosing with higher accuracy and providing more effective feedback to patients experiencing breastfeeding-related pain [70]. Furthermore, automating patient responses [71-73] and using large language models [74] can help categorize issues based on their problem description and image inputs, streamlining the care process and ensuring prompt patient attention.

Finally, the most significant limitation of this work is the lack of a properly balanced data set, which prevented the models from achieving close-to-perfect performance scores. Despite these limitations, we addressed imbalance issues and showed that it is possible to obtain satisfactory results in detecting and differentiating the conditions we tested.

Applications and Future Work

This study showcases the potential for high-accuracy breastfeeding-related condition detection to manage postpartum challenges better. In addition, we demonstrate the feasibility of implementing patient support and condition triaging for smartphone-based apps by using deep learning RGB image recognition. The model can be integrated into a telehealth pipeline for postpartum lactation care, helping LCs classify and organize patients based on the severity of their condition or the level of certainty regarding their health concerns. In addition, the system can help track patient disease progression and aid newly qualified LCs by providing faster decision-making support.

The evaluation will serve as a baseline for performing a co-design study with mothers and LCs to evaluate the system requirements regarding data gathering and privacy concerns about sharing sensitive data. Understanding the benefits of such a system and recognizing its challenges is essential for building effective tools that will meet patients’ and health care providers’ needs. Furthermore, a comprehensive approach is needed to determine the threshold for flagging a patient as unhealthy in the AI-mediated lactation care system, combining quantitative measures (eg, image detection and pain assessment) with clinical expertise. These improvements will allow this work to support applications for (1) patient self-assessment tools that give actionable feedback on breastfeeding pain, (2) reliably identifying cases that require immediate attention and flagging them for LCs, and (3) enabling timely interventions and improved patient outcomes in lactation care. Future work could envision a fully developed hybrid remote consultation system where patients answer questions for the assessment stage, and images are shared between the patient and provider to visualize the severity of the issue before care is provided. Integrating visual information and pain assessment in remote consultations enhances the diagnostic process, enables LCs to deliver tailored care promptly [75], and can help reduce burnout among these professionals.


This study demonstrates the feasibility of AI-mediated detection of breast conditions for lactating women. We took the first step in this domain by using RGB breast images to triage healthy from unhealthy breasts in mastitis spectrum disease conditions such as nipple blebs, engorgement, abscess, and mastitis; nipple damage caused by poor breastfeeding techniques, breast pumps, and other conditions; and dermatoses caused by a variety of conditions. We implemented 5 distinct CNN models to classify images from 2 different data sets, identifying 7 breast conditions and distinguishing between healthy and unhealthy conditions. The evaluation of the models based on our data set demonstrated the feasibility of using CNNs to classify and intervene with patients who seek remote guidance and management of their symptoms. Although the models performed well, there is significant room for improvement, which can be achieved by increasing the variety of images and conditions in the data set and implementing best practices for image posing. The feasibility of this work is the initial step toward building tele-lactation services with better data for LCs. We hope our work will inspire future exploration to apply technologies to help lactation support research that can reach more people globally and investigate ideas beyond laboratory settings. This will allow a more comprehensive understanding of breast health for postpartum mothers and empower them to take proactive steps in maintaining their well-being.


The authors thank the Google Health Equity Research Initiative that supported this research through their program to advance health equity research and improve health outcomes for groups disproportionately impacted by health disparities.

Data Availability

The data sets generated and analyzed during this study are not publicly available due to confidentiality reasons but are available from the corresponding author on reasonable request.

Authors' Contributions

JDS conceptualized the research question, acquired the data, analyzed the data, wrote the manuscript, and takes responsibility for the integrity of the data and the accuracy of the data analyses. JME provided guidance and assisted with the cross-validation and data augmentation strategies. KC provided guidance during the study design and material support and data consistency. EJW and VKV provided guidance, data analysis, and technical support during the study. All authors contributed to drafting the paper and its critical revision for important intellectual content.

Conflicts of Interest

None declared.

  1. Kramer MS, Kakuma R. Optimal duration of exclusive breastfeeding. Cochrane Database Syst Rev. Aug 15, 2012;2012(8):CD003517. [FREE Full text] [CrossRef] [Medline]
  2. Duijts L, Ramadhani MK, Moll HA. Breastfeeding protects against infectious diseases during infancy in industrialized countries. A systematic review. Matern Child Nutr. Jul 2009;5(3):199-210. [FREE Full text] [CrossRef] [Medline]
  3. Kramer MS, Guo T, Platt RW, Sevkovskaya Z, Dzikovich I, Collet JP, et al. Infant growth and health outcomes associated with 3 compared with 6 mo of exclusive breastfeeding. Am J Clin Nutr. Aug 2003;78(2):291-295. [CrossRef] [Medline]
  4. Kent G. Child feeding and human rights. Int Breastfeed J. Dec 18, 2006;1:27. [FREE Full text] [CrossRef] [Medline]
  5. Dinour LM. Speaking out on "breastfeeding" terminology: recommendations for gender-inclusive language in research and reporting. Breastfeed Med. Oct 01, 2019;14(8):523-532. [FREE Full text] [CrossRef] [Medline]
  6. Breastfeeding report card. Centers for Disease Control and Prevention. 2022. URL: ta/reportcard.htm [accessed 2023-11-13]
  7. Breastfeeding. United Nations International Children's Emergency Fund. URL: feeding/ [accessed 2023-11-13]
  8. Coca KP, Marcacine KO, Gamba MA, Corrêa L, Aranha AC, Abrão AC. Efficacy of low-level laser therapy in relieving nipple pain in breastfeeding women: a triple-blind, randomized, controlled trial. Pain Manag Nurs. Aug 2016;17(4):281-289. [FREE Full text] [CrossRef] [Medline]
  9. Brown CR, Dodds L, Legge A, Bryanton J, Semenic S. Factors influencing the reasons why mothers stop breastfeeding. Can J Public Health. May 09, 2014;105(3):e179-e185. [FREE Full text] [CrossRef] [Medline]
  10. Friesen CA, Hormuth LJ, Petersen D, Babbitt T. Using videoconferencing technology to provide breastfeeding support to low-income women: connecting hospital-based lactation consultants with clients receiving care at a community health center. J Hum Lact. Nov 2015;31(4):595-599. [CrossRef] [Medline]
  11. Chaves AF, Vitoriano LN, Borges FL, Alves Melo RD, de Oliveira MG, Chagas Costa Lima AC. Percepção das mulheres que receberam consultoria em amamentação. Enfermagem em Foco. 2019;10(5). [CrossRef]
  12. Patel S, Patel S. The effectiveness of lactation consultants and lactation counselors on breastfeeding outcomes. J Hum Lact. Aug 2016;32(3):530-541. [CrossRef] [Medline]
  13. Current statistics on worldwide IBCLCs. International Board of Lactation Consultant Examiners. URL: [accessed 2023-11-13]
  14. Hamilton BE, Martin JA, Osterman MJ. Births: provisional data for 2021. Centers for Disease Control and Prevention. May 2022. URL: [accessed 2024-06-02]
  15. Registros. Portal da Transparência. URL: [accessed 2023-11-13]
  16. DeLeo A, Geraghty S. iMidwife: midwifery students' use of smartphone technology as a mediated educational tool in clinical environments. Contemp Nurse. Dec 18, 2018;54(4-5):522-531. [FREE Full text] [CrossRef] [Medline]
  17. Tripp N, Hainey K, Liu A, Poulton A, Peek M, Kim J, et al. An emerging model of maternity care: smartphone, midwife, doctor? Women Birth. Mar 2014;27(1):64-67. [CrossRef] [Medline]
  18. Feinstein J, Slora EJ, Bernstein HH. Telehealth can promote breastfeeding during the COVID-19 pandemic. NEJM Catal Innov Care Deliv. 2021;2(2):1-11. [CrossRef]
  19. Haase B, Brennan E, Wagner CL. Effectiveness of the IBCLC: have we made an impact on the care of breastfeeding families over the past decade? J Hum Lact. Aug 17, 2019;35(3):441-452.
  20. Ray KN, Demirci JR, Uscher-Pines L, Bogen DL. Geographic access to international board-certified lactation consultants in Pennsylvania. J Hum Lact. Feb 03, 2019;35(1):90-99.
  21. Hoddinott P, Britten J, Pill R. Why do interventions work in some places and not others: a breastfeeding support group trial. Soc Sci Med. Mar 2010;70(5):769-778.
  22. de Souza J, Calsinski C, Chamberlain K, Cibrian F, Wang EJ. Investigating interactive methods in remote chestfeeding support for lactation consulting professionals in Brazil. Frontiers in Digital Health. Apr 02, 2023;5:1-16.
  23. Donovan H, Welch A, Williamson M. Reported levels of exhaustion by the graduate nurse midwife and their perceived potential for unsafe practice: a phenomenological study of Australian double degree nurse midwives. Workplace Health Saf. Feb 2021;69(2):73-80.
  24. Fraser HS, Blaya J. Implementing medical information systems in developing countries, what works and what doesn't. AMIA Annu Symp Proc. Nov 13, 2010;2010:232-236.
  25. Busch DW, Logan K, Wilkinson A. Clinical practice breastfeeding recommendations for primary care: applying a tri-core breastfeeding conceptual model. J Pediatr Health Care. 2014;28(6):486-496.
  26. Kern-Goldberger AR, Srinivas SK. Obstetrical telehealth and virtual care practices during the COVID-19 pandemic. Clin Obstet Gynecol. Mar 01, 2022;65(1):148-160.
  27. Burns E, Fenwick J, Sheehan A, Schmied V. Mining for liquid gold: midwifery language and practices associated with early breastfeeding support. Matern Child Nutr. Jan 09, 2013;9(1):57-73.
  28. Niazi A, Rahimi VB, Soheili-Far S, Askari N, Rahmanian-Devin P, Sanei-Far Z, et al. A systematic review on prevention and treatment of nipple pain and fissure: are they curable? J Pharmacopuncture. Sep 30, 2018;21(3):139-150.
  29. Mitoulas LR, Davanzo R. Breast pumps and mastitis in breastfeeding women: clarifying the relationship. Front Pediatr. 2022;10:856353.
  30. Gomez-Pomar E, Blubaugh R. The Baby Friendly Hospital Initiative and the ten steps for successful breastfeeding. A critical review of the literature. J Perinatol. Jun 7, 2018;38(6):623-632.
  31. VanDevanter N, Gennaro S, Budin W, Calalang-Javiera H, Nguyen M. Evaluating implementation of a baby friendly hospital initiative. MCN Am J Matern Child Nurs. 2014;39(4):231-237.
  32. Most popular messaging apps worldwide 2023. Similarweb. URL: /worldwide-messaging-apps/ [accessed 2023-06-22]
  33. Coelho LS. Telefarmácia na atenção primária à saúde: relato de experiência sobre a implementação e prática em um centro de Saúde de florianópolis. Universidade Federal de Santa Catarina. Sep 23, 2021. [accessed 2024-06-02]
  34. Weaver NS, Roy A, Martinez S, Gomanie NN, Mehta K. How WhatsApp is transforming healthcare services and empowering health workers in low-and middle-income countries. In: Proceedings of the IEEE Global Humanitarian Technology Conference (GHTC). 2022. Presented at: GHTC 2022; September 8-11, 2022; Santa Clara, CA.
  35. Trude AC, Martins RC, Martins-Silva T, Blumenberg C, Carpena MX, Del-Ponte B, et al. A WhatsApp-based intervention to improve maternal social support and maternal-child health in southern Brazil: the text-message intervention to enhance social support (TIES) feasibility study. Inquiry. Oct 08, 2021;58:469580211048701.
  36. de Araujo JC, de Sousa Lima T, dos Santos JA, dos Santos Costa E. Use of WhatsApp app as a tool to education and health promotion of pregnant women during prenatal care. Anais do I Congresso Norte Nordeste de Tecnologias em Saúde. 2018. [accessed 2024-06-02]
  37. Lima AC, Chaves AF, Oliveira MG, Lima SA, Machado MM, Oriá MO. Consultoria em amamentação durante a pandemia COVID-19: relato de experiência. Esc Anna Nery. 2020;24(spe):e20200350.
  38. Gavine A, Marshall J, Buchanan P, Cameron J, Leger A, Ross S, et al. Remote provision of breastfeeding support and education: systematic review and meta-analysis. Matern Child Nutr. Apr 2022;18(2):e13296.
  39. Nóbrega V, Melo R, Diniz A, Vilar R. As redes sociais de apoio para o Aleitamento Materno: uma pesquisa-ação. Saúde Debate. 2019;43(121):429-440.
  40. Hinman R, Lawford B, Bennell K. Harnessing technology to deliver care by physical therapists for people with persistent joint pain: telephone and video-conferencing service models. J Appl Biobehav Res. Oct 30, 2018;24(2):e12150.
  41. Candido NL, Marcolino AM, Santana JM, Silva JR, Silva ML. Remote physical therapy during COVID-19 pandemic: guidelines in the Brazilian context. Fisioterapia em Movimento. Mar 2022;35(4):e35202.
  42. Giglia R, Cox K, Zhao Y, Binns CW. Exclusive breastfeeding increased by an internet intervention. Breastfeed Med. 2015;10(1):20-25.
  43. Krishnamurti T, Simhan HN, Borrero S. Competing demands in postpartum care: a national survey of U.S. providers' priorities and practice. BMC Health Serv Res. Apr 06, 2020;20(1):284.
  44. Berens P, Eglash A, Malloy M, Steube AM. ABM clinical protocol #26: persistent pain with breastfeeding. Breastfeed Med. Mar 2016;11(2):46-53.
  45. Douglas P. Re-thinking lactation-related nipple pain and damage. Womens Health (Lond). 2022;18:17455057221087865.
  46. Mitchell KB, Johnson HM, Rodríguez JM, Eglash A, Scherzinger C, Zakarija-Grkovic I, et al. Academy of breastfeeding medicine clinical protocol #36: the mastitis spectrum, revised 2022. Breastfeed Med. May 2022;17(5):360-376.
  47. Pevzner M, Dahan A. Mastitis while breastfeeding: prevention, the importance of proper treatment, and potential complications. J Clin Med. Jul 22, 2020;9(8):2328.
  48. Nakamura M, Asaka Y, Ogawara T, Yorozu Y. Nipple skin trauma in breastfeeding women during postpartum week one. Breastfeed Med. Sep 2018;13(7):479-484.
  49. Aldhyani TH, Nair R, Alzain E, Alkahtani H, Koundal D. Deep learning model for the detection of real time breast cancer images using improved dilation-based method. Diagnostics (Basel). Oct 16, 2022;12(10):2505.
  50. Yoon JH, Kim EK. Deep learning-based artificial intelligence for mammography. Korean J Radiol. Aug 2021;22(8):1225-1239.
  51. Kim SY, Choi Y, Kim EK, Han BK, Yoon JH, Choi JS, et al. Deep learning-based computer-aided diagnosis in screening breast ultrasound to reduce false-positive diagnoses. Sci Rep. Jan 11, 2021;11(1):395.
  52. Calisto FM, Nunes N, Nascimento JC. BreastScreening: on the use of multi-modality in medical imaging diagnosis. In: Proceedings of the International Conference on Advanced Visual Interfaces. 2020. Presented at: AVI '20; September 28-October 2, 2020; Salerno, Italy.
  53. Zhou Y, Feng BJ, Yue WW, Liu Y, Xu ZF, Xing W, et al. Differentiating non-lactating mastitis and malignant breast tumors by deep-learning based AI automatic classification system: a preliminary study. Front Oncol. Sep 15, 2022;12:997306.
  54. Shen Y, Shamout FE, Oliver JR, Witowski J, Kannan K, Park J, et al. Artificial intelligence system reduces false-positive findings in the interpretation of breast ultrasound exams. Nat Commun. Sep 24, 2021;12(1):5645.
  55. Abdul Ghafoor N, Sitkowska B. MasPA: a machine learning application to predict risk of mastitis in cattle from AMS sensor data. AgriEngineering. Aug 04, 2021;3(3):575-584.
  56. Fadul-Pacheco L, Delgado H, Cabrera VE. Exploring machine learning algorithms for early prediction of clinical mastitis. Int Dairy J. Aug 2021;119:105051.
  57. Fitzpatrick TB. The validity and practicality of sun-reactive skin types I through VI. Arch Dermatol. Jun 1988;124(6):869-871.
  58. Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition. arXiv. Preprint posted online September 4, 2014.
  59. He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016. Presented at: CVPR 2016; June 27-30, 2016; Las Vegas, NV.
  60. Szegedy C, Vanhoucke V, Ioffe S, Shlens J, Wojna Z. Rethinking the inception architecture for computer vision. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016. Presented at: CVPR 2016; June 27-30, 2016; Las Vegas, NV.
  61. Tan M, Le QV. EfficientNetV2: smaller models and faster training. arXiv. Preprint posted online April 1, 2021.
  62. Huang G, Liu Z, van der Maaten L, Weinberger KQ. Densely connected convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017. Presented at: CVPR 2017; July 21-26, 2017; Honolulu, HI.
  63. Rafay A, Hussain W. EfficientSkinDis: an EfficientNet-based classification model for a large manually curated dataset of 31 skin diseases. Biomed Signal Process Control. Aug 2023;85:104869.
  64. Vodrahalli K, Daneshjou R, Novoa RA, Chiou A, Ko JM, Zou J. TrueImage: a machine learning algorithm to improve the quality of telehealth photos. Pac Symp Biocomput. 2021;26:220-231.
  65. Finnane A, Curiel-Lewandrowski C, Wimberley G, Caffery L, Katragadda C, Halpern A, et al. Proposed technical guidelines for the acquisition of clinical images of skin-related conditions. JAMA Dermatol. May 01, 2017;153(5):453-457.
  66. Jain S, Singhania U, Tripathy B, Nasr EA, Aboudaif MK, Kamrani AK. Deep learning-based transfer learning for classification of skin cancer. Sensors (Basel). Dec 06, 2021;21(23):8142.
  67. Perez F, Vasconcelos C, Avila S, Valle E. Data augmentation for skin lesion analysis. In: Proceedings of the Third International Skin Imaging Collaboration Workshop. 2018. Presented at: ISIC 2018; September 16 and 20, 2018; Granada, Spain.
  68. Kohavi R. A study of cross-validation and bootstrap for accuracy estimation and model selection. In: Proceedings of the 14th International Joint Conference on Artificial Intelligence - Volume 2. 1995. Presented at: IJCAI'95; August 20-25, 1995; Montreal, QC.
  69. Ukharov AO, Shlivko IL, Klemenova IA, Garanina OE, Uskova KE, Mironycheva AM, et al. Skin cancer risk self-assessment using AI as a mass screening tool. Inform Med Unlocked. 2023;38:101223.
  70. Lucas R, McGrath J. Clinical assessment and management of breastfeeding pain. Topics Pain Manag. Oct 2016;32(3):1-11.
  71. Yadav D, Malik P, Dabas K, Singh P. Feedpal: understanding opportunities for chatbots in breastfeeding education of women in India. Proc ACM Hum Comput Interact. Nov 07, 2019;3(CSCW):1-30.
  72. Gupta V, Arora N, Jain Y, Mokashi S, Panda C. Assessment on adoption behavior of first-time mothers on the usage of chatbots for breastfeeding consultation. J Mahatma Gandhi Univ Med Sci Technol. Aug 2021;6(2):64-68.
  73. Bennett V. Could artificial intelligence assist mothers with breastfeeding? Br J Midwifery. Apr 02, 2018;26(4):212-213.
  74. Thirunavukarasu AJ, Ting DS, Elangovan K, Gutierrez L, Tan TF, Ting DS. Large language models in medicine. Nat Med. Aug 2023;29(8):1930-1940.
  75. de Souza J, Chamberlain K, Gupta S, Gao Y, Alshurafa N, Wang EJ. Opportunities in designing HCI tools for lactation consulting professionals. In: Proceedings of the Extended Abstracts of the 2022 CHI Conference on Human Factors in Computing Systems. 2022. Presented at: CHI EA '22; April 29-May 5, 2022; New Orleans, LA.

AI: artificial intelligence
AUC: area under the curve
CNN: convolutional neural network
IBCLC: international board-certified lactation consultant
LC: lactation consultant
RGB: red, green, and blue
ROC-AUC: receiver operating characteristic area under the curve
VGG16: Visual Geometry Group model with 16 layers

Edited by K El Emam, B Malin; submitted 22.11.23; peer-reviewed by Z Li, L Juwara, J Li; comments to author 07.02.24; revised version received 20.04.24; accepted 09.05.24; published 24.06.24.


©Jessica De Souza, Varun Kumar Viswanath, Jessica Maria Echterhoff, Kristina Chamberlain, Edward Jay Wang. Originally published in JMIR AI, 24.06.2024.

This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR AI, is properly cited. The complete bibliographic information, a link to the original publication, as well as this copyright and license information must be included.